# Roadmap: GPS-Denied Onboard Navigation System — Stage 2

**Stage:** 2 (independent iteration)
**Created:** 2026-05-10
**Branch:** `stage2` (HEAD = stage1; v1.0 archived under `.planning/archive/v1.0/`)
**Granularity:** standard
**Total phases:** 6
**Total requirements mapped:** 52 / 52 (100% coverage)

---

## Overview

Stage 2 is a **self-contained iteration** with its own phase numbering (1–6). It is NOT a continuation of Stage 1's seven phases — those are archived under `.planning/archive/v1.0/` and treated as MVP starting capital (the working ESKF + cuVSLAM/ORB VO + GPR + MAVLink + 195 passing tests).

The Stage 2 mission: refactor the inherited MVP into a hexagonal/ports-and-adapters architecture, re-implement (not merge) selected concept-level ideas from the parallel `try02` branch, formalize acceptance criteria with testable numerics, and add the Azaion 10.05.2026 real-flight integration fixture — all without regressing any of the 195 stage1 tests.

Phases are derived from the ten Stage 2 requirement categories (ARCH, AC, SAFE, VERIFY, FDR, VPR, MAVOUT, FIXTURE, TEST, OBS) and ordered so each phase stabilizes the Protocol surfaces and test infrastructure that the next phase depends on.

## Phase Dependency Order

```
Phase 1 (ARCH — hexagonal refactor + composition root; Protocols stabilized)
↓
Phase 2 (AC + TEST taxonomy + structlog spine — measurement scaffolding)
↓
Phase 3 (SAFE state machine + VERIFY anchor gates — authoritative source labels)
↓
Phase 4 (Conditional Multi-Scale VPR + FDR — uses SAFE triggers, FDR for audit)
↓
Phase 5 (MAVOUT — source-aware GPS_INPUT + spoofing + blackout — needs SAFE labels)
↓
Phase 6 (FIXTURE — Azaion replay + CLI + per-env Docker — exercises everything e2e)
```

## Phases

- [ ] **Phase 1: Hexagonal Refactor & Composition Root** — Reorganize stage1 MVP into `components/` hexagonal layout with Protocol-typed DI composition root; no regressions.
- [ ] **Phase 2: Acceptance Criteria + Test Taxonomy + Observability Spine** — Formal AC document with numeric thresholds, `tests/{unit,integration,blackbox,sitl,e2e}/` taxonomy, structlog correlation_id spine.
- [ ] **Phase 3: Safety Anchor State Machine & Geometry-Gated Verifier** — Authoritative `source_label` ownership + accept/reject gates for satellite anchors before they reach ESKF.
- [ ] **Phase 4: Conditional Multi-Scale VPR + Flight Data Recorder** — Trigger-driven DINOv2 forward, multi-scale FAISS chunks, append-only event log with bounded storage.
- [ ] **Phase 5: MAVLink Source-Aware Output & Spoofing/Blackout Handling** — Source labels + anchor age in GPS_INPUT, spoofing-promotion <3s, visual-blackout dead-reckoning ≤400ms, ODOMETRY scaffold behind feature flag.
- [ ] **Phase 6: Real-Flight Fixture (Azaion 10.05.2026) + CLI + Per-Env Docker** — End-to-end integration test on real flight data, `gps_denied` typer CLI, split Jetson/x86 Dockerfiles.

## Phase Details

### Phase 1: Hexagonal Refactor & Composition Root

**Goal**: Stage1 MVP reorganized into hexagonal/ports-and-adapters layout with explicit DI composition root; all 195 stage1 tests still pass.
**Depends on**: Nothing (first phase; consumes stage1 archived code as input).
**Requirements**: ARCH-01, ARCH-02, ARCH-03, ARCH-04, ARCH-05, ARCH-06, ARCH-07
**Success Criteria** (what must be TRUE):
1. Every swappable component (vio, satellite_matcher, gpr, anchor_verifier, safety_state, flight_recorder, mavlink_io, coordinate_transforms) lives under `src/gps_denied/components/<name>/` with its own `protocol.py` + concrete impls + (where needed) `native/` bridge.
2. Hot-path types (`FrameState`, `IMUSample`, `PositionEstimate`, `VOEstimate`, `SatelliteAnchor`) are `@dataclass(slots=True, frozen=True)` and Pydantic no longer touches the per-frame path.
3. Calling `build_pipeline(env=...)` from `pipeline/composition.py` with `"x86_dev"`, `"jetson"`, `"ci"`, or `"sitl"` returns a fully wired `Pipeline` with environment-correct adapters, and no concrete adapter imports leak into pipeline orchestration.
4. Per-environment YAML configs (`config/{jetson,x86_dev,ci,sitl}.yaml`) load via `pydantic-settings` into a typed `RuntimeConfig` that drives composition.
5. `pytest` runs all 195 stage1 tests (+ 8 SITL skipped) green and accuracy benchmarks show no regression vs the archived stage1 baseline.
**Plans**: TBD

### Phase 2: Acceptance Criteria + Test Taxonomy + Observability Spine

**Goal**: Project gains a formal, testable acceptance-criteria contract, a structured test taxonomy, and a structured-logging spine — the measurement scaffolding every later phase needs to prove its claims.
**Depends on**: Phase 1 (Protocol surfaces and components/ layout must exist before tests/AC can reference them).
**Requirements**: AC-01, AC-02, AC-03, AC-04, AC-05, AC-06, TEST-01, TEST-02, TEST-03, OBS-01
**Success Criteria** (what must be TRUE):
1. `_docs/00_problem/acceptance_criteria.md` lists every AC-1.x…AC-NEW-x with numeric threshold + validation method + linked test ID(s); no AC entry is unbound.
2. `tests/` is reorganized into `unit/`, `integration/`, `blackbox/`, `sitl/`, and `e2e/`; every existing test is reclassified; and `pytest -m <marker>` (with `<marker>` one of `unit`, `integration`, `blackbox`, `sitl`, `e2e`) selects the right subset for CI.
3. Running `scripts/gen_ac_traceability.py` produces `.planning/AC-TRACEABILITY.md` linking every AC ID → test ID(s) → component(s); CI fails if any AC is orphaned.
4. Position-accuracy, failure-mode, and real-time-performance ACs are wired to `tests/integration/accuracy/`, `tests/blackbox/failure_modes/`, and a benchmark harness that emits CI-tracked metrics.
5. Pipeline emits structured JSON via `structlog` with `correlation_id` (frame_id) on every per-frame log line, and Pydantic logging schemas guard the boundary records.
**Plans**: TBD
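
The orphan check in criterion 3 could look roughly like the sketch below. It is not the planned `scripts/gen_ac_traceability.py`: the function names, the in-memory input format, and the table layout are assumptions, and the test ID in the usage example is made up for illustration.

```python
# Hypothetical input shape: AC ID -> list of (test ID, component name) links.
Link = tuple[str, str]


def find_orphaned_acs(ac_ids: list[str], links: dict[str, list[Link]]) -> list[str]:
    """Return the AC IDs that have no linked test, in declaration order."""
    return [ac for ac in ac_ids if not links.get(ac)]


def render_traceability(ac_ids: list[str], links: dict[str, list[Link]]) -> str:
    """Render an AC -> test ID(s) -> component(s) Markdown table."""
    rows = [
        "| AC | Test ID(s) | Component(s) |",
        "|----|------------|--------------|",
    ]
    for ac in ac_ids:
        bound = links.get(ac, [])
        tests = ", ".join(test_id for test_id, _ in bound) or "ORPHANED"
        comps = ", ".join(sorted({comp for _, comp in bound})) or "-"
        rows.append(f"| {ac} | {tests} | {comps} |")
    return "\n".join(rows)


def ci_exit_code(ac_ids: list[str], links: dict[str, list[Link]]) -> int:
    """Non-zero when any AC is orphaned, so a CI job can fail on it."""
    orphans = find_orphaned_acs(ac_ids, links)
    for ac in orphans:
        print(f"orphaned AC: {ac}")
    return 1 if orphans else 0
```

For example, with `ac_ids = ["AC-1.1", "AC-NEW-3"]` and a single link for `AC-1.1`, `find_orphaned_acs` returns `["AC-NEW-3"]` and `ci_exit_code` returns `1`.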

### Phase 3: Safety Anchor State Machine & Geometry-Gated Verifier

**Goal**: A separate safety layer — not the ESKF — owns the authoritative `source_label`, enforces monotonic covariance growth in non-anchored modes, and only accepts satellite anchors that pass formal geometric gates.
**Depends on**: Phase 2 (needs AC document + test taxonomy + structlog so state-machine behavior is testable and observable).
**Requirements**: SAFE-01, SAFE-02, SAFE-03, SAFE-04, SAFE-05, SAFE-06, VERIFY-01, VERIFY-02, VERIFY-03, VERIFY-04, VERIFY-05
**Success Criteria** (what must be TRUE):
1. Every emitted `PositionEstimate` carries one of `satellite_anchored / vo_extrapolated / dead_reckoned` set by `SafetyAnchorStateMachine`, plus an `anchor_age_ms` field that increases until the next accepted anchor.
2. Property-based tests prove covariance never decreases without an accepted anchor, and a unit-test matrix exercises all 9 declared state transitions.
3. `GeometryGatedAnchorVerifier` accepts/rejects each candidate using configurable gates (min inliers, max mean reprojection error, max homography condition number, freshness window) and emits a machine-readable rejection reason on every reject.
4. Tile-write eligibility (`can_persist_tile`) is exposed by the state machine and is `false` whenever the system is in `dead_reckoned`, so the tile cache cannot be poisoned during blind flight.
5. The state machine never sees raw VPR top-K candidates — `AnchorVerifier` is the only path that can hand it an accepted anchor — and benchmark mode lets matcher profiles be compared offline on a fixed frame.
**Plans**: TBD
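
A toy model of the safety layer described in criteria 1, 2, and 4, under several stated assumptions: a scalar `sigma_m` stands in for the full covariance, the initial state is `dead_reckoned`, `can_persist_tile` is taken to be true in the other two states (the roadmap only fixes it to false in `dead_reckoned`), and the 9 declared transitions are not modeled. Method names are hypothetical.

```python
from enum import Enum


class SourceLabel(str, Enum):
    SATELLITE_ANCHORED = "satellite_anchored"
    VO_EXTRAPOLATED = "vo_extrapolated"
    DEAD_RECKONED = "dead_reckoned"


class SafetyAnchorStateMachine:
    """Owns the source label, anchor age, and a scalar uncertainty proxy."""

    def __init__(self, sigma_m: float = 5.0) -> None:
        self.label = SourceLabel.DEAD_RECKONED  # assumed initial state
        self.sigma_m = sigma_m
        self.anchor_age_ms = 0

    def on_frame(self, dt_ms: int, vo_ok: bool, growth_m: float) -> None:
        """No accepted anchor this frame: age and uncertainty may only grow."""
        self.anchor_age_ms += dt_ms
        self.sigma_m += max(0.0, growth_m)  # negative growth is clamped away
        self.label = (
            SourceLabel.VO_EXTRAPOLATED if vo_ok else SourceLabel.DEAD_RECKONED
        )

    def on_accepted_anchor(self, anchor_sigma_m: float) -> None:
        """Only a verifier-accepted anchor may shrink the uncertainty."""
        self.anchor_age_ms = 0
        self.sigma_m = min(self.sigma_m, anchor_sigma_m)
        self.label = SourceLabel.SATELLITE_ANCHORED

    @property
    def can_persist_tile(self) -> bool:
        return self.label is not SourceLabel.DEAD_RECKONED
```

The monotonicity property is easy to state against this model: for any sequence of `on_frame` calls, `sigma_m` is non-decreasing, which is the kind of invariant a property-based test would exercise.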

### Phase 4: Conditional Multi-Scale VPR + Flight Data Recorder

**Goal**: DINOv2 retrieval runs only when re-localization is actually needed; chunks are decoupled from storage tiles with multi-scale coverage; every state transition / anchor decision / MAVLink emission is captured in an append-only flight recorder with bounded storage and explicit health states.
**Depends on**: Phase 3 (VPR triggers and FDR events ride on SAFE state-transitions and VERIFY accept/reject decisions).
**Requirements**: VPR-01, VPR-02, VPR-03, VPR-04, VPR-05, FDR-01, FDR-02, FDR-03, FDR-04, FDR-05, FDR-06
**Success Criteria** (what must be TRUE):
1. In steady state the pipeline ranks chunks by IMU+VO geometric prior and skips the DINOv2 forward; DINOv2 runs only on declared re-loc triggers (cold start, sharp turn, σ_xy > 50m, VO failure ≥2 frames, disconnected segment).
2. VPR chunks cover the operating area with 600–800m ground footprint and 40–50% overlap so any frame footprint falls fully inside ≥1 chunk; FAISS holds both fine-scale (z=20) and coarse-scale (z=17/18) descriptor sets.
3. Top-K is dynamic — K=5 stable, K=20 active-conflict, K=50 expanding-window — and the integration uses the existing `chunk_manager.py` / `gpr.py` API surface without breaking stage1 GPR contracts.
4. `FlightRecorder` writes append-only JSONL segments to `data/fdr/{flight_id}/segment-NNNN.jsonl`, enforces configurable segment + total storage byte limits, and exposes `health ∈ {ok, degraded, critical}`.
5. State transitions, anchor accept/reject decisions, MAVLink sends, and pipeline errors are all recorded as FDR events; AC-NEW-3 forensic thumbnails fire at ≤0.1Hz on tile-generation failures within the FDR size budget.
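
The recorder behavior in criteria 4 and 5 might be sketched as below. The segment naming and the three health values come from the roadmap; the constructor arguments, the drop-when-full policy, and the health thresholds (degraded at 80% of the total budget, critical once any event has been dropped) are assumptions made for the example.

```python
import json
from pathlib import Path


class FlightRecorder:
    """Append-only JSONL recorder with per-segment and total byte limits."""

    def __init__(self, root: Path, flight_id: str,
                 segment_limit: int, total_limit: int) -> None:
        self.dir = Path(root) / flight_id
        self.dir.mkdir(parents=True, exist_ok=True)
        self.segment_limit = segment_limit
        self.total_limit = total_limit
        self.segment_index = 0
        self.segment_bytes = 0
        self.total_bytes = 0
        self.dropped = 0

    @property
    def health(self) -> str:
        if self.dropped:  # assumed policy: any lost event is critical
            return "critical"
        used = self.total_bytes / self.total_limit
        return "degraded" if used >= 0.8 else "ok"

    def record(self, kind: str, **fields) -> bool:
        """Append one event; return False if the total budget is exhausted."""
        event = {"kind": kind, **fields}
        line = (json.dumps(event, sort_keys=True) + "\n").encode("utf-8")
        if self.total_bytes + len(line) > self.total_limit:
            self.dropped += 1
            return False
        # Rotate only when the current segment already holds data.
        if self.segment_bytes and self.segment_bytes + len(line) > self.segment_limit:
            self.segment_index += 1
            self.segment_bytes = 0
        path = self.dir / f"segment-{self.segment_index:04d}.jsonl"
        with path.open("ab") as fh:  # append mode: earlier bytes are never rewritten
            fh.write(line)
        self.segment_bytes += len(line)
        self.total_bytes += len(line)
        return True
```

A caller would record events such as `rec.record("state_transition", to="dead_reckoned")` and poll `rec.health` to decide whether to raise an operator warning.
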
**Plans**: TBD

### Phase 5: MAVLink Source-Aware Output & Spoofing/Blackout Handling

**Goal**: The MAVLink output the flight controller actually sees carries source provenance and reacts correctly to GPS spoofing and visual blackout, with the dual-channel ODOMETRY path scaffolded but disabled.
**Depends on**: Phase 4 (needs SAFE source labels, FDR audit channel, and VPR triggers to drive blackout/promotion semantics).
**Requirements**: MAVOUT-01, MAVOUT-02, MAVOUT-03, MAVOUT-04
**Success Criteria** (what must be TRUE):
1. Every `GPS_INPUT` message carries `source_label`, `anchor_age_ms`, and `covariance_semimajor_m` propagated from the corresponding `PositionEstimate` (`covariance_semimajor_m` is mapped into `horiz_accuracy`; label and age go out in a custom STATUSTEXT).
2. When real-GPS health rolling average drops below threshold, the system promotes its own estimate to FC primary in <3s and emits a `STATUSTEXT` on every promotion/demotion.
3. When the camera produces no usable signal, the pipeline switches to `dead_reckoned` in ≤1 processed frame OR ≤400ms and emits `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT at 1–2Hz until imagery returns.
4. The `ODOMETRY` emitter exists in code but is disabled by `config.mavlink.odometry_enabled=false` in stage 2, and an integration test asserts ODOMETRY is intentionally absent on the wire.
**Plans**: TBD

### Phase 6: Real-Flight Fixture (Azaion 10.05.2026) + CLI + Per-Env Docker

**Goal**: The whole stack is exercised end-to-end against real flight data, an operator-facing CLI replays flights and runs AC benchmarks, and per-environment Docker images close the deployment loop.
**Depends on**: Phase 5 (final phase — exercises ARCH + AC + SAFE + VERIFY + VPR + FDR + MAVOUT against the Azaion fixture).
**Requirements**: FIXTURE-01, FIXTURE-02, FIXTURE-03, FIXTURE-04, FIXTURE-05, FIXTURE-06, FIXTURE-07, OBS-02, OBS-03
**Success Criteria** (what must be TRUE):
1. `tests/integration/azaion_flight/` runs against `Data/Azaion/10.05.2026/` (tlog + cropped EO video + MAVLink CSV) and is documented in `_docs/00_problem/fixtures.md` with ground-truth references and known limitations.
2. `scripts/prep_azaion_fixture.py` produces HUD-stripped EO frames at 0.7 fps, an IMU/GPS/ATTITUDE CSV from the tlog, and a timestamp-aligned manifest.
3. MAVLink replay decodes every `GLOBAL_POSITION_INT` / `RAW_IMU` / `ATTITUDE` message without error; ESKF replay on the real IMU samples produces no NaN/Inf and shows bounded covariance growth; ORB-SLAM3 VO smoke test achieves ≥30% frame registration on the cropped EO frames.
4. The GPS-denial simulation masks `GPS_RAW_INT` for t∈[180s, 280s] and the pipeline correctly switches to `vo_extrapolated` and back to `satellite_anchored` when GPS returns.
5. `gps_denied` typer CLI exposes `replay --tlog ... --video ...`, `benchmark --scenario ...`, and `bench-ac AC-1.1`; `Dockerfile.x86_dev` and `Dockerfile.jetson` (multi-stage with TRT engine prebuild step) build green and run the replay end-to-end on their respective platforms.
**Plans**: TBD
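
The GPS-denial simulation in criterion 4 amounts to filtering one message type out of the replay stream for a closed time window. A minimal sketch, assuming replay messages are available as `(t_s, msg_type, payload)` tuples with `t_s` measured in seconds from the start of the replay; that tuple shape and the `mask_gps` name are hypothetical, not the fixture's actual format.

```python
from typing import Any, Iterable, Iterator

DENIAL_WINDOW_S = (180.0, 280.0)  # closed interval, per the criterion
MASKED_TYPE = "GPS_RAW_INT"

Message = tuple[float, str, Any]  # (time in s, MAVLink message name, payload)


def mask_gps(
    messages: Iterable[Message],
    window: tuple[float, float] = DENIAL_WINDOW_S,
    masked_type: str = MASKED_TYPE,
) -> Iterator[Message]:
    """Yield the stream unchanged except for masked messages in the window."""
    start, end = window
    for t_s, msg_type, payload in messages:
        if msg_type == masked_type and start <= t_s <= end:
            continue  # simulate denial: this message never reaches the pipeline
        yield t_s, msg_type, payload
```

Other message types such as `RAW_IMU` pass through untouched, so the ESKF keeps propagating while the GPS input disappears and later returns.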

## Progress

| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Hexagonal Refactor & Composition Root | 0/0 | Not started | - |
| 2. Acceptance Criteria + Test Taxonomy + Observability Spine | 0/0 | Not started | - |
| 3. Safety Anchor State Machine & Geometry-Gated Verifier | 0/0 | Not started | - |
| 4. Conditional Multi-Scale VPR + Flight Data Recorder | 0/0 | Not started | - |
| 5. MAVLink Source-Aware Output & Spoofing/Blackout Handling | 0/0 | Not started | - |
| 6. Real-Flight Fixture (Azaion 10.05.2026) + CLI + Per-Env Docker | 0/0 | Not started | - |

## Coverage Summary

| Category | Count | Phase |
|----------|-------|-------|
| ARCH | 7 | Phase 1 |
| AC | 6 | Phase 2 |
| TEST | 3 | Phase 2 |
| OBS-01 | 1 | Phase 2 |
| SAFE | 6 | Phase 3 |
| VERIFY | 5 | Phase 3 |
| VPR | 5 | Phase 4 |
| FDR | 6 | Phase 4 |
| MAVOUT | 4 | Phase 5 |
| FIXTURE | 7 | Phase 6 |
| OBS-02, OBS-03 | 2 | Phase 6 |
| **Total** | **52** | **6 phases** |

100% of Stage 2 requirements mapped; no orphans; no duplicates.