# Validation Log

> Mode A Phase 2 — engine Step 7 (Use-Case Validation / Sanity Check). Validates the recommended primary stack from `04_reasoning_chain.md` against a typical UAV mission scenario, surfaces counterexamples where they exist, runs the engine's review checklist, and lists conclusions that need revision.
>
> Backing artifacts: source registry [`01_source_registry/00_summary.md`](01_source_registry/00_summary.md) (#1–#121); fact cards [`02_fact_cards/00_summary.md`](02_fact_cards/00_summary.md) (#1–#101); component fit matrix [`06_component_fit_matrix/00_summary.md`](06_component_fit_matrix/00_summary.md); cross-component gates [`06_component_fit_matrix/99_cross_component_gates.md`](06_component_fit_matrix/99_cross_component_gates.md); comparison framework [`03_comparison_framework.md`](03_comparison_framework.md); reasoning chain [`04_reasoning_chain.md`](04_reasoning_chain.md).

---

## Validation Scenarios

The recommended primary stack must hold up across the full envelope of normal-flight and edge-case scenarios called out in the Project Constraint Matrix. This log walks through five representative scenarios: one nominal cruise, two edge cases, and two adversarial cases.

### Scenario 1 — Nominal cruise (steady-state visual anchoring)

A fixed-wing UAV at 1 km AGL cruises at 60 km/h over rolling-steppe agricultural terrain east of Dnipro. GPS is jammed. The nav camera produces 3 frames/s (~333 ms cadence). The FC delivers 100-200 Hz IMU + attitude over MAVLink. The pipeline then runs as follows:

- C2 (MixVPR per recommended primary on the BSD/permissive track) retrieves K=3-5 candidate satellite tiles per frame.
- C3 (DISK+LightGlue + adaptive depth per D-C3-3 mitigation) registers the UAV frame against the best candidate.
- C4 (OpenCV `cv::solvePnPRansac` wrapped in GTSAM `Marginals` per D-C4-2 = (b)) emits a 6-DoF pose + 6×6 covariance.
- C5 (GTSAM iSAM2 per D-C5-5 = (c)) fuses that pose with C1 (OKVIS2 frame-to-frame VIO) + IMU.
- C8 (pymavlink → MAVLink `GPS_INPUT` for ArduPilot Plane / MSP2_SENSOR_GPS for iNav) emits WGS84 + per-FC `horiz_accuracy`/`hPosAccuracy` at 5 Hz per D-C8-5.

### Scenario 2 — Sharp turn with <5% inter-frame overlap (AC-3.2)

The UAV banks ±20° to enter a search pattern, and two consecutive frames share <5% overlap. C1 frame-to-frame VIO loses tracking; C5 propagates dead-reckoned via IMU + last-good-anchor. C2/C3 next-frame retrieval recovers a valid satellite anchor within 1-2 frames per AC-3.2 ("recovery via satellite-reference re-localization"). This stays within the AC-3.4 budget (≥3 consecutive frames AND ≥2 s without a position before requesting operator re-loc).

### Scenario 3 — Stale tile in active-conflict sector (AC-NEW-6)

The cache contains a tile from 8 months ago for a sector flagged as active-conflict. The AC-8.2 freshness threshold is <6 mo for active-conflict sectors. The C6 manifest carries `capture_date` per the restrictions.md mandate. The retrieval path must reject the tile (or downgrade its label to non-`satellite_anchored`) per AC-NEW-6.

### Scenario 4 — Cache file corruption (AC-NEW-7 cache-poisoning safety)

Pre-flight, a malicious actor swaps `/var/lib/onboard/cache/faiss/v_2048_M32.index` with a tampered file containing crafted descriptors that point to the wrong tiles for given UAV-frame queries. At takeoff, `faiss.read_index` would silently load this file (Source #114 explicit warning: "no internal integrity check, expects validated input").

### Scenario 5 — GPS spoofing + visual blackout (AC-3.5 + AC-NEW-2 + AC-NEW-8)

The UAV enters a cloud bank (visual blackout) while the FC simultaneously reports a GPS signal-quality anomaly indicating spoofing. C1 + C2 + C3 + C4 all fail (no usable visual input). C5 must:

- propagate from the last trusted state via IMU only;
- label every estimate `{dead_reckoned}`;
- degrade MAVLink fix-quality to "2D fix or worse" when the 95% covariance semi-major axis is >100 m;
- escalate to "no fix" when that axis is >500 m or the blackout lasts >30 s.

C8 must NOT promote spoofed real-GPS back into the estimator unless FC GPS health has been stable + non-spoofed for ≥10 s AND a visual/satellite consistency check has succeeded. AC-NEW-2 requires a spoofing-promotion latency of <3 s p95 from spoof onset to the companion estimate becoming the primary FC source.

---

## Expected behavior under recommended primary stack

### Scenario 1 — Nominal cruise

With **MixVPR + DISK+LightGlue + OpenCV+GTSAM-Marginals + GTSAM iSAM2 + pymavlink/MSP2** as the recommended primary stack:

- C2 MixVPR query ~10-20 ms FP16 + ~5-10 ms INT8 per frame; K=3-5 retrieval list returned.
- C3 DISK+LightGlue FP16 (per D-C7-6 matchers→FP16-only per-family precision policy) ~30-60 ms per pair × K=3-5 pairs = 90-300 ms (within AC-4.1 400 ms p95 if K=3 + adaptive depth applied per D-C3-3).
- C4 `cv::solvePnPRansac` ~5-15 ms inlier filter + GTSAM `Marginals` recovery ~30-90 ms (Plan-phase Jetson MVE confirms).
- C5 GTSAM iSAM2 with D-C5-5 = (c) PriorFactorPose3-only + IncrementalFixedLagSmoother K=10-20 keyframes per D-C5-3 ~2-5 ms per update.
- C8 pymavlink GPS_INPUT or MSP2_SENSOR_GPS encode + send ~1-5 ms.
- Total end-to-end: ~140-420 ms p95. Within AC-4.1 budget at K=3 + adaptive depth.
- Memory: ~1.5-2.5 GB peak. Well within AC-4.2 8 GB budget.
- AC-NEW-4 satisfied NATIVELY via GTSAM `Marginals.marginalCovariance` per D-C8-8 per-FC unit conversion.
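The per-stage ranges above can be cross-checked with a quick sum. The sketch below is a naive serial sum of the quoted ranges, not a p95 estimate; it does not model adaptive depth or any overlap between stages, and the dictionary layout and function name are illustrative assumptions.

```python
# Per-stage latency ranges in ms, copied from the list above.
# C3 is quoted per pair, so it is scaled by K separately.
FIXED_STAGES_MS = {
    "C2 MixVPR (FP16 + INT8)": (10 + 5, 20 + 10),
    "C4 solvePnPRansac + Marginals": (5 + 30, 15 + 90),
    "C5 iSAM2 update": (2, 5),
    "C8 encode + send": (1, 5),
}
C3_PER_PAIR_MS = (30, 60)
AC_4_1_BUDGET_MS = 400


def serial_budget(k: int) -> tuple[int, int]:
    """Naive serial (low, high) sum of all stage ranges for K retrieval pairs."""
    low = sum(lo for lo, _ in FIXED_STAGES_MS.values()) + k * C3_PER_PAIR_MS[0]
    high = sum(hi for _, hi in FIXED_STAGES_MS.values()) + k * C3_PER_PAIR_MS[1]
    return low, high


for k in (3, 5, 10):
    low, high = serial_budget(k)
    verdict = "within" if high <= AC_4_1_BUDGET_MS else "over"
    print(f"K={k}: {low}-{high} ms ({verdict} budget at the upper bound)")
```

Under this naive sum, K=3 gives 143-325 ms and stays under the 400 ms budget, while K=5 (203-445 ms) and K=10 (353-745 ms) exceed it at the upper bound. That is consistent with the K=3 + adaptive depth condition stated above.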

### Scenario 2 — Sharp turn

C1 VIO loses frame-to-frame tracking on the consecutive frames with <5% overlap, per AC-3.2 ("Sharp-turn frames may fail frame-to-frame registration"). C5 ESKF/iSAM2 propagates from last-good-anchor via IMU per the D-C5-2 long-cruise-observability strategy (covariance-growth alert if covariance > threshold); the IMU bias-stationarity prior (D-C5-2 = (a) accept + monitor) keeps drift bounded. The next 1-2 frames trigger C2+C3 satellite-anchor re-localization per the AC-3.2 recovery clause. This stays within the AC-3.4 budget if recovery completes within 3 frames + 2 s. Per AC-3.3, the system handles ≥3 disconnected segments per flight via satellite-reference re-localization as a core capability.
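The AC-3.4 condition (≥3 consecutive frames AND ≥2 s without a position) can be sketched as a small monitor. This is a minimal illustration, not the project's implementation: the `RelocMonitor` class and its fields are hypothetical, and it assumes the 2 s clock runs from the last frame that produced a position (or from the first frame seen, if none has).

```python
from dataclasses import dataclass
from typing import Optional

MIN_LOST_FRAMES = 3      # AC-3.4: >=3 consecutive frames without a position
MIN_LOST_SECONDS = 2.0   # AC-3.4: AND >=2 s without a position


@dataclass
class RelocMonitor:
    """Hypothetical helper that decides when to request operator re-loc."""
    lost_frames: int = 0
    last_position_t: Optional[float] = None

    def update(self, t: float, has_position: bool) -> bool:
        """Feed one frame at time t (s); True means request operator re-loc."""
        if self.last_position_t is None:
            # Assumption: with no position yet, the clock starts at the first frame.
            self.last_position_t = t
        if has_position:
            self.last_position_t = t
            self.lost_frames = 0
            return False
        self.lost_frames += 1
        return (self.lost_frames >= MIN_LOST_FRAMES
                and t - self.last_position_t >= MIN_LOST_SECONDS)
```

With this logic, a sharp turn that drops two frames and then re-anchors never raises the request, because both the frame count and the elapsed time reset on recovery.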

### Scenario 3 — Stale tile

The C6 cache entry carries `capture_date` per the restrictions.md tile manifest schema mandate. The retrieval path must check `capture_date` against the AC-8.2 threshold (<6 mo active-conflict, <12 mo stable rear). If the tile is stale, downgrade the label to non-`satellite_anchored` per AC-NEW-6 ("verify stale-tile match never produces `satellite_anchored`"). Sector classification (active-conflict vs stable rear) is deferred to Plan-phase per the C10 scope restructure 2026-05-08.
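A minimal sketch of the freshness check described above, under stated assumptions: the sector class is passed in as an input (its source is deferred to Plan-phase), tile age is counted in whole calendar months, and `stale_tile` is a hypothetical stand-in for whatever non-`satellite_anchored` label the project settles on.

```python
from datetime import date

# AC-8.2 freshness thresholds in months (tile age must be strictly below).
MAX_AGE_MONTHS = {"active_conflict": 6, "stable_rear": 12}


def months_between(capture: date, today: date) -> int:
    """Whole calendar months elapsed from capture to today."""
    months = (today.year - capture.year) * 12 + (today.month - capture.month)
    if today.day < capture.day:
        months -= 1  # the current month is not yet complete
    return months


def anchor_label(capture_date: date, sector_class: str, today: date) -> str:
    """Label a tile match; a stale tile never yields 'satellite_anchored'."""
    age = months_between(capture_date, today)
    if age < MAX_AGE_MONTHS[sector_class]:
        return "satellite_anchored"
    return "stale_tile"  # hypothetical non-anchored label
```

For the Scenario 3 case, an 8-month-old tile is downgraded in an active-conflict sector, while the same tile would still qualify in a stable-rear sector.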

### Scenario 4 — Cache file corruption

D-C10-3 content-hash verification gate at takeoff load: compute `SHA-256(faiss_index_file)`, compare it against the manifest-recorded hash, and on mismatch reject the load, emit `STATUSTEXT` to the FC, and refuse takeoff. The one-time check is read-bound: per the Source #115 size formula the index is ~430 MB (2048-D halfvec × 100K tiles), which takes roughly 0.9 s to read from a SATA SSD at ~500 MB/s. This directly satisfies AC-NEW-7 at the descriptor-cache load layer.
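The hash gate could look roughly like the sketch below. It uses only the Python standard library; the function names are illustrative, and the manifest lookup, the `STATUSTEXT` emission and the takeoff refusal are assumed to live in the caller and are not shown.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so a large index is never held in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_index(path: Path, expected_sha256: str) -> bool:
    """True only if the on-disk file matches the manifest-recorded hash."""
    return hmac.compare_digest(sha256_file(path), expected_sha256.lower())


# Only when verify_index(...) returns True would the caller go on to
# faiss.read_index(str(path)); on False it rejects the load instead.
```

Hashing before the `faiss.read_index` call matters because, per Source #114, the loader itself performs no integrity check.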

### Scenario 5 — GPS spoofing + visual blackout

C1+C2+C3+C4 all fail; C5 propagates dead-reckoned via IMU only.

- Per AC-3.5: switch label to `{dead_reckoned}` within ≤1 processed frame OR ≤400 ms; reject spoofed GPS as estimator input.
- Per AC-NEW-8: continue emitting external-position MAVLink frames from IMU-only propagation for ≤30 s after the last trusted anchor; label every estimate `{dead_reckoned}`; degrade MAVLink fix-quality to "2D fix or worse" when the 95% covariance semi-major axis is >100 m; escalate to "no fix" + `VISUAL_BLACKOUT_FAILSAFE` STATUSTEXT when that axis is >500 m OR the blackout lasts >30 s.
- Per C8 D-C8-2 = (b), the companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` switch-ownership pattern: the companion publishes to source-set 2, auto-switches the FC, and switches back to set 1 when the companion is unavailable.
- AC-NEW-2 spoofing-promotion latency <3 s p95 is satisfied via the companion-driven switch (no GCS round-trip required).
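The AC-NEW-8 escalation thresholds can be sketched as a pure function of the horizontal covariance and the blackout duration. This is an illustration under assumptions: the "95% covariance semi-major axis" is taken to be the semi-major axis of the 95% error ellipse of the 2×2 horizontal position covariance (chi-square scaling with 2 degrees of freedom), and the returned labels are hypothetical names, not MAVLink fix-type constants.

```python
import math

# 95% quantile of the chi-square distribution with 2 dof: -2 ln(0.05) ~ 5.991.
CHI2_2DOF_95 = -2.0 * math.log(0.05)


def semi_major_95(pxx: float, pxy: float, pyy: float) -> float:
    """Semi-major axis (m) of the 95% ellipse of a 2x2 covariance (m^2)."""
    # Largest eigenvalue of [[pxx, pxy], [pxy, pyy]] in closed form.
    lam_max = 0.5 * (pxx + pyy) + math.hypot(0.5 * (pxx - pyy), pxy)
    return math.sqrt(CHI2_2DOF_95 * lam_max)


def fix_quality(pxx: float, pxy: float, pyy: float, blackout_s: float) -> str:
    """Map covariance growth and blackout time to a fix-quality label."""
    axis = semi_major_95(pxx, pxy, pyy)
    if axis > 500.0 or blackout_s > 30.0:
        return "no_fix"        # would also raise VISUAL_BLACKOUT_FAILSAFE
    if axis > 100.0:
        return "2d_or_worse"
    return "nominal"           # hypothetical label for the undegraded case
```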

---

## Actual validation results

| Scenario | Recommended primary stack behavior | Outcome |
|---|---|---|
| 1 — Nominal cruise | Total end-to-end 140-420 ms p95; memory 1.5-2.5 GB peak; AC-NEW-4 NATIVELY satisfied | ✅ **PASS** with K=3 + adaptive depth applied (Plan-phase Jetson MVE confirms exact tail) |
| 2 — Sharp turn AC-3.2 | C5 dead-reckon + C2/C3 re-localize within 1-2 frames; AC-3.3 ≥3 disconnected segments handled | ✅ **PASS** per design |
| 3 — Stale tile AC-NEW-6 | C6 manifest `capture_date` check; downgrade label to non-`satellite_anchored` if stale | ✅ **PASS** at architectural level; sector-classification heuristic deferred to Plan-phase |
| 4 — Cache poisoning AC-NEW-7 | D-C10-3 SHA-256 content-hash gate at takeoff; D-C10-2 atomic-write covers truncation | ✅ **PASS** for descriptor-cache + TensorRT engine path; Suite Sat Service multi-flight ingest voting OUT OF onboard scope (per AC-NEW-7 external-dependency note) |
| 5 — GPS spoofing + visual blackout | C5 dead-reckon, C8 companion-driven source-set switch, AC-NEW-8 escalation thresholds enforced | ✅ **PASS** per AC-3.5 + AC-NEW-2 + AC-NEW-8 + D-C8-2 + D-C8-8 |

---

## Counterexamples

### Counterexample CE-1 — K=10 retrieval pairs in Scenario 1 violates AC-4.1

If C3 naively applies K=10 retrieval pairs per frame (canonical default per LightGlue paper §5.4 evaluation methodology) without the D-C3-3 mitigation, the C3 matching cost at DISK+LightGlue is ~30-60 ms × 10 = 300-600 ms standard / 150-300 ms adaptive, so the end-to-end total **exceeds the AC-4.1 400 ms p95 budget without K reduction**. The mitigation pathways documented in the D-C3-3 Choose block are:

- reduce K from 10 to 3-5;
- reduce keypoints from 1024 to 512;
- accept the TIGHT margin and validate at Jetson MVE;
- parallelize across multiple Jetson GPU streams;
- elevate ONNX Runtime + TensorRT EP + adaptive depth.

**Address**: this counterexample is already known and gated as D-C3-3; the recommendation is K=3 + adaptive depth, which satisfies the AC-4.1 budget at the cost of ~5-10% Recall@K loss vs K=10.

### Counterexample CE-2 — D-C5-5 = (a) per-correspondence factor density violates AC-4.1

If C5 GTSAM iSAM2 is configured with D-C5-5 = (a), the highest-fidelity per-correspondence `GenericProjectionFactorCal3DS2` option (1000+ factors per keyframe at K=10 image pairs × 100 inliers per pair), per-update latency is ~50-150 ms on the Jetson Orin Nano Super CPU. Combined with C3 ~150-300 ms + C4 ~30-90 ms + C2 ~15-30 ms + C8 ~1-5 ms, the total is ~246-575 ms, whose upper end exceeds the AC-4.1 400 ms p95 budget.

**Address**: this counterexample is already known and gated as D-C5-5; the recommendation is D-C5-5 = (c), `PriorFactorPose3` only, with the C4 GTSAM Marginals satellite-anchor 6×6 covariance. This couples C4 Fact #54 D-C4-2 = (b) with C5 Fact #89 architectural integration via the shared GTSAM substrate, at ~2-5 ms per update on the Jetson Orin Nano Super CPU. It is the CLEANEST cross-component coupling.

### Counterexample CE-3 — Pure ESKF (Manual ESKF without GTSAM iSAM2) loses AC-4.5 look-back

If C5 = Manual ESKF only (no GTSAM iSAM2 alongside it), AC-4.5 ("System may refine prior estimates and emit corrections") cannot be satisfied: the recursive forward-time-only Kalman update has no look-back facility per the Solà §6 reference recipe. AC-4.5 is a "may", not a "must", but in the context of the project's spoofing-aware AC-NEW-8 dead-reckoning failsafe, the look-back capability is operationally valuable for retroactively correcting blackout-period estimates once a trusted anchor is recovered.

**Address**: this counterexample is partially mitigated by recommending the **hybrid** Manual ESKF + GTSAM iSAM2 path per the C5 batch 1 closure (Fact #88 + Fact #89 dual-candidate verdict). Manual ESKF is the mandatory simple baseline (always-running fallback if GTSAM iSAM2 fails to converge); GTSAM iSAM2 is the primary path with NATIVE AC-4.5 look-back. Final lock at Plan-phase per D-C5-3 + D-C5-5.

### Counterexample CE-4 — Cand 3 UBX impersonation for iNav (AC-NEW-7 forgery posture)

If the C8 iNav path = Cand 3 UBX impersonation via pyubx2 NAV-PVT (instead of the recommended primary Cand 2 MSP2_SENSOR_GPS), the project takes on an unambiguous forgery posture: the companion impersonates a u-blox receiver. AC-NEW-7 ("no covert GPS spoofing without consent") requires an explicit FDR audit trail per D-C8-7 = (a). The user chose Cand 2 (MSP2_SENSOR_GPS) as primary for iNav to avoid this posture entirely; Cand 3 remains a documented secondary path with the audit-trail mitigation in case of hard incompatibility.

**Address**: not a counterexample to the recommended primary stack; it documents why the user-locked Cand 2 = primary verdict was the right architectural choice.

### Counterexample CE-5 — Sector classification heuristic NOT YET pinned

The AC-8.2 freshness threshold (<6 mo active-conflict, <12 mo stable rear) requires a sector classification source. The `00_question_decomposition.md` C10 scope restructure 2026-05-08 deferred the sector classification heuristic to Plan-phase. **At research close, the project does not have a pinned source for "is this sector active-conflict or stable rear?"** Whether that source is an operator-marked geofence, Suite Service metadata, or something else remains open.

**Address**: deferred to Plan-phase per user choice C (`c10_scope=C`, cross-coupling minimal). Surfaces as a Plan-phase BLOCKING gate. Not a research-layer gap.

---

## Review Checklist

- [x] Draft conclusions consistent with Step 3 fact cards (cross-references across `02_fact_cards/Cx_*.md` files; every Fact # cited in `04_reasoning_chain.md` exists in the corresponding fact-card file).
- [x] No important dimensions missed — twelve dimensions (eight Decision Support + four project-mandatory) cover the AC + restrictions surface comprehensively per the Decomposition Completeness Probe checklist in `references/comparison-frameworks.md`.
- [x] No over-extrapolation — every L3 inferential cell is labeled ⚠️ Medium or ⚠️ Medium-High and tied to a Plan-phase Jetson MVE confirmation gate.
- [x] Conclusions are actionable/verifiable — every recommendation maps to a specific D-Cx-y decision in `99_cross_component_gates.md` with named owner + resolution path.
- [x] Every selected component/tool/pattern matches the Project Constraint Matrix — verified per row in `06_component_fit_matrix/Cx_*.md` Restrictions × Candidate-Modes sub-matrix sections.
- [x] Mismatches marked as disqualifiers instead of hidden as generic "limitations" — canonical SP+LightGlue (Magic Leap noncommercial) is the canonical example, called out explicitly as HARD DISQUALIFIER in D-C3-1.

### Issue found

- **One issue, partially resolved**: AC-8.2 sector-classification source is not pinned at research close (CE-5). Deferred to Plan-phase per `00_question_decomposition.md` C10 scope restructure user choice. Acknowledged as a Plan-phase BLOCKING gate, not a research-layer gap.

---

## Conclusions Requiring Revision

None at this stage. All five validation scenarios PASS under the recommended primary stack with documented mitigation paths for the three counterexamples (CE-1 K=10 → D-C3-3; CE-2 D-C5-5 = (a) → D-C5-5 = (c); CE-3 pure ESKF → ESKF+iSAM2 hybrid). CE-4 (UBX impersonation) is not a counterexample to the recommended stack but a documentation of why the user-locked Cand 2 verdict was correct. CE-5 (sector classification) is a Plan-phase deferred gate, not a research-layer revision.

---

## Sanity check on Step 7.5 Component Applicability Gate

Per `04_engine-analysis.md` Step 7.5.3: a candidate may not be `Selected` while any sub-matrix cell is ❌ or ❓.

**Component Fit Matrix scan** ([`06_component_fit_matrix/`](06_component_fit_matrix/)):

- C1: lead candidates Selected with documented MVE evidence; no open ❌ or ❓ on sub-matrix.
- C2: 5/5 mandatory pre-screen Selected with MVE evidence; conditional pre-screen extensions (AnyLoc/BoQ/DINOv2-VLAD) gated as `Experimental only` per D-C2-5 ViT export prerequisite — correctly NOT marked Selected.
- C3: lead candidates Selected with MVE evidence; canonical SP+LightGlue marked `Rejected` per D-C3-1 hard disqualifier.
- C4: 3 candidates with verdicts; OpenGV `Selected with runtime gate` is valid per the Step 7.5.3 carve-out for runtime-quality gates (D-C4-3 + D-C4-4 are research-layer gates that are closed at the documentary level; license-clearance-counsel-review remains as a Plan-phase routine task, not a runtime-quality gate).
- C5: 2 candidates Selected per closure verdict.
- C6: Cand 1 Selected; Cand 2 Deferred secondary per comparative-improvement verdict.
- C7: 3 candidates Selected per per-family roles.
- C8: 3 candidates Selected per per-FC + per-fallback roles.
- C10: 2 sub-areas Selected per cross-coupling-minimal scope.

**Result**: zero ❌, zero ❓ across all Selected candidates. **Step 7.5 Component Applicability Gate PASSES**. Solution draft (Step 8) may proceed without further blocking gates.
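The Step 7.5.3 rule lends itself to a mechanical check. The sketch below is illustrative only: the row layout (`name`, `verdict`, `cells`) is an assumed representation of a sub-matrix, the example rows are hypothetical and not taken from the real component fit matrix, and the `Selected with runtime gate` carve-out is not modeled.

```python
BLOCKING_MARKS = {"❌", "❓"}


def gate_violations(candidates: list[dict]) -> list[str]:
    """Names of candidates marked Selected despite a blocking sub-matrix cell."""
    return [
        c["name"]
        for c in candidates
        if c["verdict"] == "Selected" and BLOCKING_MARKS & set(c["cells"])
    ]


# Hypothetical rows for illustration only.
example = [
    {"name": "cand_a", "verdict": "Selected", "cells": ["✅", "✅", "⚠️"]},
    {"name": "cand_b", "verdict": "Rejected", "cells": ["❌", "✅"]},
    {"name": "cand_c", "verdict": "Selected", "cells": ["✅", "❓"]},
]
print(gate_violations(example))  # ['cand_c']
```

An empty result corresponds to the gate passing; any returned name would block the Step 8 solution draft under the rule above.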