From c19c76481c3493746133888353ca012aa52f221f Mon Sep 17 00:00:00 2001 From: Oleksandr Bezdieniezhnykh Date: Sat, 9 May 2026 03:10:57 +0300 Subject: [PATCH] Update autodev skill documentation and acceptance criteria Enhanced the SKILL.md file to enforce conciseness rules for the state file, specifying acceptable content and file size limits. Updated the autodev state to reflect the transition to the planning phase, including changes to the current step and sub-step details. Revised acceptance criteria to clarify validation requirements and external dependencies, ensuring alignment with the latest research findings. Added a new overlay for Mode B revisions to track changes and decisions made during the assessment process. --- .cursor/skills/autodev/SKILL.md | 9 + _docs/00_problem/acceptance_criteria.md | 11 +- .../01_source_registry/00_summary.md | 1 + .../01_source_registry/MODEB_addendum.md | 37 ++ _docs/00_research/02_fact_cards/00_summary.md | 1 + .../02_fact_cards/MODEB_addendum.md | 111 ++++ .../06_component_fit_matrix/00_summary.md | 1 + .../99_cross_component_gates.md | 2 + .../MODEB_revisions.md | 72 +++ _docs/01_solution/solution.md | 399 ++++++++++++ _docs/02_document/glossary.md | 95 +++ _docs/02_document/tests/blackbox-tests.md | 595 ++++++++++++++++++ _docs/02_document/tests/environment.md | 248 ++++++++ _docs/02_document/tests/performance-tests.md | 126 ++++ _docs/02_document/tests/resilience-tests.md | 108 ++++ .../02_document/tests/resource-limit-tests.md | 100 +++ _docs/02_document/tests/security-tests.md | 97 +++ _docs/02_document/tests/test-data.md | 129 ++++ .../02_document/tests/traceability-matrix.md | 109 ++++ _docs/_autodev_state.md | 10 +- ...6-05-09_satellite-provider-design-tasks.md | 103 +++ 21 files changed, 2354 insertions(+), 10 deletions(-) create mode 100644 _docs/00_research/01_source_registry/MODEB_addendum.md create mode 100644 _docs/00_research/02_fact_cards/MODEB_addendum.md create mode 100644 
_docs/00_research/06_component_fit_matrix/MODEB_revisions.md create mode 100644 _docs/01_solution/solution.md create mode 100644 _docs/02_document/glossary.md create mode 100644 _docs/02_document/tests/blackbox-tests.md create mode 100644 _docs/02_document/tests/environment.md create mode 100644 _docs/02_document/tests/performance-tests.md create mode 100644 _docs/02_document/tests/resilience-tests.md create mode 100644 _docs/02_document/tests/resource-limit-tests.md create mode 100644 _docs/02_document/tests/security-tests.md create mode 100644 _docs/02_document/tests/test-data.md create mode 100644 _docs/02_document/tests/traceability-matrix.md create mode 100644 _docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md diff --git a/.cursor/skills/autodev/SKILL.md b/.cursor/skills/autodev/SKILL.md index 3d511d3..48f2e1c 100644 --- a/.cursor/skills/autodev/SKILL.md +++ b/.cursor/skills/autodev/SKILL.md @@ -112,6 +112,15 @@ Do NOT modify, skip, or abbreviate any part of the sub-skill's workflow. The aut The state file (`_docs/_autodev_state.md`) is a minimal pointer — only the current step. See `state.md` for the authoritative template, field semantics, update rules, and worked examples. Do not restate the schema here — `state.md` is the single source of truth. +**Conciseness rule (authoritative).** The state file MUST stay short. Acceptable content per field: + +- `name` — the step title from the active flow's Step Reference Table. That's it. +- `sub_step.name` — kebab-case identifier from the active sub-skill. That's it. +- `sub_step.detail` — **leave empty (`""`) by default.** Add a one-line note ONLY when the next-session resumer cannot infer where to pick up from `phase` + `name` + on-disk artifacts alone (e.g. `"batch 2 of 4"`, `"blocked on D-PROJ-2 reply"`, `"variant 1b"`). 
NEVER use `detail` as a changelog, recap, or summary of completed work — those facts belong in the relevant `_docs/` artifact (glossary, traceability matrix, leftovers folder, retro report, etc.) and in git history. +- **Total file size target: <30 lines.** If you're tempted to write more, you're using the wrong artifact — write in `_docs/` instead. + +Multi-line `detail` blobs that recap what was just completed are a smell. The state file is a *pointer*, not a logbook. + ## Trigger Conditions This skill activates when the user wants to: diff --git a/_docs/00_problem/acceptance_criteria.md b/_docs/00_problem/acceptance_criteria.md index 0643628..a4b15f3 100644 --- a/_docs/00_problem/acceptance_criteria.md +++ b/_docs/00_problem/acceptance_criteria.md @@ -2,6 +2,7 @@ > Last revised 2026-05-07 (cleanup pass: stripped algorithm/library/parameter implementation details; renamed source label `vo_extrapolated` → `visual_propagated`; broadened FC scope to ArduPilot + iNav). > Subsequent revision 2026-05-07 (post-SQ6 research): AC-4.3 reworded to acknowledge that no single message type is accepted by both ArduPilot Plane and iNav — per-FC interface is named explicitly (MAVLink `GPS_INPUT` for ArduPilot Plane, MSP2 `MSP2_SENSOR_GPS` for iNav). Rationale and L1 sources in `_docs/00_research/02_fact_cards/SQ6_fc_external_positioning.md` / `_docs/00_research/01_source_registry/SQ6_external_positioning.md` Sources #4, #9, #10, #12, #13. +> Subsequent revision 2026-05-09 (Plan Phase 2a.0 outcomes): AC-NEW-4 and AC-NEW-7 validation requirements relaxed from "≥100 flights" literal to Monte-Carlo-with-stated-CI over currently-available data corpus; multi-flight statistical headroom moved to Step 4 risk register (D-PROJ-3). AC-8.4 augmented with explicit in-air-no-upload security gate (flight-state process-level isolation; post-landing upload tool); local mid-flight tile format pinned to match `satellite-provider`'s on-disk format. 
AC-NEW-7 external-dependency note revised: parent-suite voting layer is not currently implemented; tracked as parent-suite design task D-PROJ-2. > See git history for prior versions. ## Position Accuracy @@ -53,7 +54,7 @@ - **AC-8.1** — Imagery via Azaion Suite Satellite Service (offline cache interface; no direct commercial-provider calls). Cache-interface resolution ≥0.5 m/px, ideally 0.3 m/px. - **AC-8.2** — Tile freshness: <6 mo (active-conflict sectors), <12 mo (stable rear). Older → reject or downgrade (AC-NEW-6). - **AC-8.3** — Imagery pre-loaded onto companion before flight; offline preprocessing time not time-critical. Pre-extracted descriptors/indices count against the cache budget unless explicitly carved out. -- **AC-8.4** — Mid-flight tile generation: continuously orthorectify nav-camera frames into basemap-projected tiles, deduplicated (latest/highest-quality wins). Upload to Service on landing. Each uploaded tile carries quality metadata sufficient for the Service's ingest pipeline (AC-NEW-7). +- **AC-8.4** — Mid-flight tile generation: continuously orthorectify nav-camera frames into basemap-projected tiles, deduplicated (latest/highest-quality wins). Tiles are written **only** to the local cache while airborne — in-air outbound writes to `satellite-provider` are **forbidden** for drone-security reasons; enforced by a `flight state` process-level gate (see `architecture.md`). Upload to `satellite-provider` happens **only after landing**, triggered by a separate operator-side post-landing upload tool. Local mid-flight tile format matches `satellite-provider`'s on-disk format so post-landing upload is byte-identical. Each uploaded tile carries quality metadata sufficient for the Service's ingest pipeline (AC-NEW-7). - **AC-8.5** — No raw nav-camera or AI-camera frames retained in normal operation; tiles are the only persistent imagery. Forensic exception: ≤0.1 Hz thumbnail log of frames that failed tile generation, within FDR budget (AC-NEW-3). 
- **AC-8.6 — Satellite-anchor relocalization robustness**: - **Scale-ratio**: any UAV-frame ground footprint at the deployment altitude band must be retrievable from the cache regardless of internal tiling/indexing. @@ -80,7 +81,7 @@ ### AC-NEW-4 — False-position safety budget **Statement.** Per flight: **P(error >500 m) <0.1 %**, **P(error >1 km) <0.01 %**. **Why.** A single 1-km-off frame can fly the UAV outside the geofence; covariance carried in the MAVLink message is the FC's only defense. -**Validation.** Monte Carlo over a public aerial-localization dataset (e.g. AerialVL S03) + own recorded flights; report error CDF; pass = both probabilities below budget across ≥100 flights. +**Validation.** Monte Carlo over the currently-available data corpus (Derkachi flight + 60 stills + synthetic perturbations); report error CDF with stated 95% confidence interval; pass = both probabilities below budget within the CI's lower bound. Multi-flight statistical headroom (originally framed as ≥100 flights) is residual risk tracked in the Step 4 risk register; **D-PROJ-3** reopens this validation when additional multi-flight data becomes available. ### AC-NEW-5 — Operational environmental envelope **Statement.** Operating temp **−20 °C to +50 °C**; vibration/shock per RTCA DO-160G low-altitude UAV-class. Cooling sustains **25 W** at the upper temp for the full **8-hour duty cycle** without throttling. @@ -94,9 +95,9 @@ ### AC-NEW-7 — Cache-poisoning safety budget **Statement.** Per flight, across all onboard tiles written (AC-8.4): **P(geo-misalign >30 m) <1 %**, **P(>100 m) <0.1 %**. -**Why.** Onboard tiles feed back into the Service basemap (AC-8.4). A bad onboard pose with optimistic covariance writes a misaligned tile that becomes the next flight's anchor — cross-flight error compounding that AC-NEW-4 doesn't capture. 
-**External-dependency note.** The Suite Satellite Service is expected to operate a multi-flight ingest-side voting layer that gates onboard-tile promotion to "trusted basemap" until multiple independent flights agree on geo-alignment. Voting algorithm is the Service's concern; onboard's job (AC-8.4) is to publish per-tile quality metadata sufficient for that layer. End-to-end AC-NEW-7 evidence depends on this Service contract. -**Validation.** Multi-flight Monte Carlo replay over public datasets (e.g. AerialVL, AerialExtreMatch) + own flights, with synthetic over-confidence injection (deflate covariance ×1.5–3): assert both probabilities below budget across ≥100 flights. Independently exercise the Service-side voting contract. +**Why.** Onboard tiles feed back into the `satellite-provider` basemap when uploaded post-landing (AC-8.4). A bad onboard pose with optimistic covariance writes a misaligned tile that becomes the next flight's anchor — cross-flight error compounding that AC-NEW-4 doesn't capture. +**External-dependency note.** The parent-suite `satellite-provider` is expected to operate a multi-flight ingest-side trust/voting layer that gates onboard-tile promotion to "trusted basemap" until multiple independent flights agree on geo-alignment. The ingest endpoint and voting layer are **not currently implemented in `satellite-provider`** and are tracked as a parent-suite design task (**D-PROJ-2**). Onboard's job (AC-8.4) is to publish per-tile quality metadata sufficient for that layer. End-to-end AC-NEW-7 evidence depends on the `satellite-provider` contract being added. +**Validation.** Onboard-only Monte Carlo replay over the currently-available data corpus + synthetic over-confidence injection (deflate covariance ×1.5–3); report error CDF with stated 95% confidence interval; pass = both probabilities below budget within the CI's lower bound for the onboard-side contribution. 
Multi-flight statistical headroom and the `satellite-provider` voting-side contract verification are residual risks tracked in the Step 4 risk register; **D-PROJ-3** reopens onboard validation when additional multi-flight data becomes available; **D-PROJ-2** reopens cross-suite validation once the ingest + voting layer is built. ### AC-NEW-8 — Visual blackout + GPS spoofing degraded mode **Statement.** When the navigation camera is fully unusable AND FC reports GPS denial/spoof: diff --git a/_docs/00_research/01_source_registry/00_summary.md b/_docs/00_research/01_source_registry/00_summary.md index bf6459f..78a6832 100644 --- a/_docs/00_research/01_source_registry/00_summary.md +++ b/_docs/00_research/01_source_registry/00_summary.md @@ -26,6 +26,7 @@ | C7 — On-Jetson inference runtime candidates | [`C7_inference_runtime.md`](C7_inference_runtime.md) | #99–#105 | Closed at 3/N (batch 1 closed 2026-05-08) — Cand 1 (TensorRT native) RECOMMENDED PRIMARY; Cand 2 (ONNX Runtime + TRT EP) modern-competitive-lead-cross-architecture-portability; Cand 3 (pure PyTorch FP16) mandatory simple-baseline | | C8 — MAVLink / MSP2 FC adapter candidates | [`C8_fc_adapter.md`](C8_fc_adapter.md) | #106–#113 | Closed at 3/N (batch 1 closed 2026-05-08) — Cand 1 (pymavlink → MAVLink GPS_INPUT) RECOMMENDED PRIMARY for ArduPilot Plane; Cand 2 (MSP2_SENSOR_GPS via Python MSP V2) RECOMMENDED PRIMARY for iNav (locked SQ6 + AC-4.3 transport); Cand 3 (UBX impersonation via pyubx2 NAV-PVT) DEFERRED secondary for iNav after comparative-improvement verdict | | C10 — Pre-flight cache provisioning (CROSS-COUPLING MINIMAL scope per 2026-05-08 user choice C; D-C6-3 + D-C7-7 confirmation pipelines only, operator tooling deferred to Plan-phase) | [`C10_preflight_provisioning.md`](C10_preflight_provisioning.md) | #114–#121 | Closed at 2/N (batch 1 closed 2026-05-08) — D-C6-3 confirmation: direct `faiss.write_index`/`faiss.read_index` Python API + `python-atomicwrites` + content-hash verification gate at 
takeoff (FAISS MIT, atomicwrites MIT); D-C7-7 confirmation: hybrid Polygraphy CLI primary + `trtexec` for cache-reuse fast rebuilds + direct `IBuilderConfig` Python API escape hatch (Polygraphy + TensorRT 10.x Apache-2.0 throughout) | +| **Mode B addendum (2026-05-08)** — solution_draft01 assessment | [`MODEB_addendum.md`](MODEB_addendum.md) | **#122–#131** (10 sources) | New sources gathered for Mode B findings F1–F20: VINS-Mono GPL-3.0 LICENCE confirmation (#122), MegaLoc + UltraVPR + AirZoo aerial-VPR successor candidates (#123, #124, #125), CVE-2026-1579 MAVLink no-default-auth + CVE-2025-53644 OpenCV crafted-JPEG (#126, #127), ArduPilot MAVLink2 message-signing + iNav signing-gap (#128, #129), ArduPilot `MAV_CMD_SET_EKF_SOURCE_SET` no-deployed-GCS-implementer re-verification (#130), XoFTR + 2026 SAR-optical 24-matcher benchmark (#131). | ## Investigation Status diff --git a/_docs/00_research/01_source_registry/MODEB_addendum.md b/_docs/00_research/01_source_registry/MODEB_addendum.md new file mode 100644 index 0000000..e989da0 --- /dev/null +++ b/_docs/00_research/01_source_registry/MODEB_addendum.md @@ -0,0 +1,37 @@ +# Source Registry — Mode B Addendum (2026-05-08) + +> Mode B Solution Assessment of `_docs/01_solution/solution_draft01.md`. New sources gathered for findings F1–F20; Mode A sources #1–#121 remain canonical and are not duplicated. +> +> Index: [`00_summary.md`](00_summary.md). Mode B fact cards: [`../02_fact_cards/MODEB_addendum.md`](../02_fact_cards/MODEB_addendum.md). Mode B fit-matrix revisions: [`../06_component_fit_matrix/MODEB_revisions.md`](../06_component_fit_matrix/MODEB_revisions.md). Mode B output: [`../../01_solution/solution_draft02.md`](../../01_solution/solution_draft02.md). 
+ +## New Sources + +| # | Title | Tier | Binding | +|---|-------|------|---------| +| 122 | HKUST-Aerial-Robotics/VINS-Mono LICENCE file (canonical, master branch) — GNU GPL Version 3 | L1 (verified raw LICENCE on github.com) | C1 candidate-table license-correction (F11/F15). Confirms VINS-Mono is **GPL-3.0**, not BSD-permissive as draft01 claims. Cross-confirms Mode A C1 Fact #28 against Mode A draft01 deliverable. | +| 123 | MegaLoc — "One Retrieval to Place Them All" (Berton & Masone, arXiv:2502.17237; CVPR 2025 Image Matching workshop; gmberton/megaloc repo, MIT) | L1 | C2 D-C2-11 candidate (F16). torch.hub install path; MIT license; SOTA on multiple VPR datasets; combines existing methods + training techniques + datasets into a unified retrieval model. | +| 124 | UltraVPR — "Unsupervised Lightweight Rotation-Invariant Aerial VPR" (cbbhuxx/UltraVPR repo, MIT; published RAL 2025; ICRA 2026) | L1 | C2 D-C2-11 alternative (F17). MIT license; **44 Hz on Jetson Orin NX (close cousin of Orin Nano Super)** via ONNX export; rotation-invariant; specifically designed for UAV; validated on VPAir + UAV-VisLoc datasets — directly relevant to the project's pinned operating context. | +| 125 | AirZoo — "Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision" (arXiv:2604.26567v1, 2026) | L1 | C2 evidence base for MegaLoc on aerial domain (F16). Demonstrates that fine-tuning MegaLoc on aerial data yields substantial performance gains for aerial image retrieval and cross-view matching. | +| 126 | NVD CVE-2026-1579 — MAVLink protocol Missing Authentication for Critical Function (CVSS 9.8 CRITICAL) | L1 | New cross-cutting security gate (F18). MAVLink lacks cryptographic authentication by default; an unauthenticated party with MAVLink interface access can send arbitrary commands including SERIAL_CONTROL for interactive shell. 
**Mitigation: enable MAVLink 2.0 message signing.** Affects ArduPilot Plane and PX4; iNav has only partial MAVLink support and does not implement message signing. | +| 127 | NVD CVE-2025-53644 — OpenCV uninitialized variable on stack reading crafted JPEG (CVSS 9.8 CRITICAL) | L1 | C4 OpenCV pin update (F19). Affects 4.10.0 / 4.11.0; **fixed in 4.12.0**. Draft01 says "OpenCV 4.x" — must pin **≥4.12.0**. Triggers heap-buffer-write via crafted JPEG file load — relevant if any image format reaching OpenCV originates from uncertain provenance (e.g., tile cache import, FDR thumbnail re-load). | +| 128 | ArduPilot MAVLink2 Signing — Plane documentation (`ardupilot.org/plane/docs/common-MAVLink2-signing.html`) + Issue #28736 channel-specific signing PR #29546 (March 2025) | L1 | F18 mitigation evidence. Confirms ArduPilot supports MAVLink2 signing via Mission Planner SETUP > Advanced > "Mavlink Signing" menu; non-USB serial ports can be configured to only respond to MAVLink commands carrying the correct passkey; PR #29546 adds bitmask parameter to enable/disable signing per channel for wired companion-computer connections. | +| 129 | iNav MAVLink Wiki (`iNavFlight/inav/wiki/Mavlink`) | L1 | F18 cross-FC asymmetry (verified 2026-05-08 via web search). iNav has partial MAVLink support and **does NOT implement MAVLink message signing**. Companion-FC inbound on iNav is MSP2 (not MAVLink) so signing-gap is on the outbound MAVLink telemetry side, not the inbound external-positioning path — but cross-FC asymmetry is still material for AC-NEW-7 and the GCS link. | +| 130 | ArduPilot common-ekf-sources.rst + PR #18345 (`MAV_CMD_SET_EKF_SOURCE_SET`) — explicit "no GCSs are currently known to implement this" (verified 2026-05-08) | L1 | F8 D-C8-2 evidence (cross-confirms Mode A SQ6 Fact #3). 
Re-verifies on 2026-05-08 web search that ArduPilot supports the command at firmware level (since August 2021) but **no production-deployed GCS or companion is documented as implementing the companion-driven switch pattern** the project plans to use. Pattern is therefore **novel for a deployed production system** — confirms Mode A characterization but elevates to risk-graded selection. | +| 131 | XoFTR — "Cross-modal Feature Matching Transformer" (arXiv:2404.09692) + 2026 SAR-optical satellite registration benchmark (arXiv:2604.10217) | L2 | F20 contrarian-evidence reference. Cross-modal matcher; achieved lowest mean error (3.0 px) on SpaceNet9 SAR-optical training scenes among 24 pretrained matcher families benchmarked. **Important contrarian finding: matchers without explicit cross-modal training sometimes performed comparably**, suggesting foundation-model features (like DINOv2) provide modality invariance — reinforces SelaVPR (DINOv2-L) over MixVPR (CNN-only) on the BSD/permissive C2 axis when cross-domain UAV→satellite registration is the binding stress test. | + +--- + +## Verification audit-trail (mandatory per `00_question_decomposition.md` Step 0.5 cross-validation rule) + +| Source | Independent corroboration | +|---|---| +| #122 (VINS-Mono GPL-3.0) | Cross-confirms Mode A C1 Fact #28 (`02_fact_cards/C1_vio.md`) which already classified VINS-Mono as GPL-3.0; the discrepancy was inside the deliverable layer (`solution_draft01.md` C1 candidate table), not the evidence layer. Both Mode A C1 Fact #28 and Source #122 agree. | +| #123 (MegaLoc) | arXiv preprint + CVPR 2025 workshop + GitHub repo + Hugging Face — three-independent-source confirmation per Critical-novelty cross-validation rule. | +| #124 (UltraVPR) | RAL 2025 IEEE journal publication + ICRA 2026 + GitHub repo with pre-trained ONNX weights — three independent sources. | +| #125 (AirZoo) | arXiv preprint April 2026 — single source; treated as ⚠️ Medium confidence pending second cross-validation. 
| +| #126 (CVE-2026-1579) | NVD official entry + CISA ICS Advisory ICSA-26-090-02 + PX4 GHSA-fh32-qxj9-x32f — three-source confirmation; Critical CVSS. | +| #127 (CVE-2025-53644) | NVD official entry; OpenCV release notes confirming 4.12.0 fix — two-source confirmation. | +| #128 (ArduPilot MAVLink2 signing) | Official Plane documentation + Issue #28736 + PR #29546 — three-source confirmation. | +| #129 (iNav no signing) | iNav wiki (frogmane edited 2025-12-11) — single authoritative source per project convention; iNav wiki is the canonical iNav reference per Mode A SQ6 source #10. | +| #130 (companion-driven EKF source switch) | ArduPilot official ekf-sources doc + PR #18345 + cross-confirms SQ6 Mode A Source #3 already-documented "no GCSs known to implement". Three-source confirmation. | +| #131 (XoFTR cross-modal) | arXiv preprint + 2026 SAR-optical benchmark study (arXiv:2604.10217) — two-source confirmation. | diff --git a/_docs/00_research/02_fact_cards/00_summary.md b/_docs/00_research/02_fact_cards/00_summary.md index 109e2cd..150f323 100644 --- a/_docs/00_research/02_fact_cards/00_summary.md +++ b/_docs/00_research/02_fact_cards/00_summary.md @@ -23,6 +23,7 @@ This folder replaces the previous monolithic `02_fact_cards.md` (1480 lines, too | [`C6_tile_cache_spatial_index.md`](C6_tile_cache_spatial_index.md) | **C6** — Tile cache + spatial index | #92–#93 (2 facts, **batch 1 closed at 2/N 2026-05-08**) | **Cand 1 RECOMMENDED PRIMARY**: Manual mirror of existing parent-suite `satellite-provider` pattern (verified directly via Source #92 filesystem read at /Users/obezdienie001/dev/azaion/suite/satellite-provider/) — PostgreSQL btree composite on slippy-map `(tile_zoom, tile_x, tile_y, version)` for geographic spatial-grid range queries + `bytea` descriptor blobs + app-side FAISS `IndexHNSWFlat(d, M=32)` loaded at takeoff via `faiss.read_index` for descriptor ANN + filesystem tile storage at `./tiles/{zoom}/{x}/{y}.{image_type}` slippy-map convention; clean 
PostgreSQL License + MIT + LGPL/MIT-Apache; trivial dependency footprint (no Postgres extensions); empirically-confirmed Postgres-on-Jetson viability per Source #97 March 2026 article (CPU cores limiting, NOT memory); ~6-54 ms per cache hit comfortably within AC-4.1 400 ms p95 budget; ~700 MB-1.5 GB total memory footprint within AC-4.2 8 GB budget. **Cand 2 DEFERRED secondary**: PostgreSQL + PostGIS 3.4 GiST on `geography(POINT,4326)` with KNN distance ordering (`<->`) + pgvector 0.7+ HNSW for descriptor ANN + same filesystem tile storage; native KNN + radius + combined-SQL capabilities are real improvements BUT 5-10× slower geographic lookup than Cand 1 + heavier dependency (~50-100 MB additional memory + ~50-200 MB additional disk install) + PostGIS GPL-2.0-or-later license-complexity (CONTINGENT REJECT under D-C1-1 = (b) BSD/permissive-only-track) + DIVERGENT from suite pattern + improvements marginal-to-negative in project's pinned 3 Hz spatial-grid query operating context. **Comparative-improvement-vs-Cand-1 verdict**: per user's session-start "significant-improvement-only" bar, no material justification to deviate from existing satellite-provider pattern. Decisions: D-C6-1 (NEW) descriptor-storage-format choice (halfvec recommended); D-C6-2 (NEW Cand-1-only) FAISS index variant choice (IndexHNSWFlat M=32 recommended); D-C6-3 (NEW Cand-1-only CROSS-COMPONENT with C10) descriptor-cache-rebuild-trigger strategy (periodic-during-C10-pre-flight recommended); D-C6-4 (NEW Cand-1-only) geographic-spatial-grid radius (dynamic recommended); D-C6-5 (NEW Cand-2-only contingent) Jetson PostGIS+pgvector co-installation Plan-phase verification (verify-on-Jetson-MVE recommended); D-C6-6 (NEW Cand-2-only contingent) pgvector descriptor-storage-type choice (halfvec recommended); D-C6-7 (NEW CROSS-COMPONENT affects parent-suite satellite-provider) cascade-changes-back-to-suite strategy (leave-unchanged recommended given Cand 1 closure verdict). 
| | [`C7_inference_runtime.md`](C7_inference_runtime.md) | **C7** — On-Jetson inference runtime | #94–#96 (3 facts, **batch 1 closed at 3/N 2026-05-08**) | **Cand 1 RECOMMENDED PRIMARY**: TensorRT native — JetPack 6.2 bundled TensorRT 10.3 + `IInt8EntropyCalibrator2` + `BuilderFlag.FP16+INT8` mixed-precision + engines built directly on Jetson Orin Nano Super SM 87 (Apache-2.0 in TensorRT 10.x; ships with JetPack so zero-effort install; lowest-latency primary path; 2-3× speedup at INT8 vs FP16 per Source #102 YOLO26 benchmark; engines tied to SM 87 hardware-specific per Source #105 — must be built on deployed Jetson via D-C7-7); **Cand 2 modern-competitive-lead-cross-architecture-portability**: ONNX Runtime + TensorRT EP — `onnxruntime-gpu` via Jetson AI Lab JP6/CU126 wheel index + `TensorrtExecutionProvider` config + automatic CUDA EP / CPU EP subgraph fallback (MIT throughout; cross-architecture portability for replay/SITL on x86 dev hosts; `pip install onnxruntime-gpu` does NOT work on Jetson — needs Jetson AI Lab community wheel via D-C7-3 + numpy<2.0.0 pin via D-C7-4); **Cand 3 mandatory simple-baseline**: pure PyTorch FP16 — `torch.amp.autocast` + `model.half()` + Jetson AI Lab PyTorch 2.5 ARM64 wheel (BSD-3-Clause throughout; zero-conversion regression baseline; reference-correctness oracle for accuracy validation of TRT-built engines; standard `pip install torch` lacks CUDA on Jetson — needs Jetson AI Lab wheel via D-C7-5). **Cross-cutting precision policy** (D-C7-6 NEW CROSS-COMPONENT, affects C2+C3+C1+C7): VPR backbones (CNN-class MixVPR/EigenPlaces/NetVLAD) → INT8+FP16 mixed; ViT-class VPR (SelaVPR DINOv2-L; conditional AnyLoc/BoQ/DINOv2-VLAD) → FP16-only initially, INT8 deferred to Jetson MVE per D-C2-5; matchers (LightGlue with SP/DISK/ALIKED, XFeat, XFeat+LighterGlue) → **FP16-only — NO INT8** per Source #103 quantization-sensitivity finding (LightGlue FP8 ModelOpt collapsed match counts); learned VIO frontends → FP16-only initially. 
**Triton/DeepStream/CUDA-Python custom kernels considered-and-rejected** (server/video-pipeline class + out-of-budget for embedded 8 h mission) per c7_overkill_options scope choice. Decisions: D-C7-1 (NEW Cand-1-only CROSS-COMPONENT with C9) calibration-dataset-strategy (AerialVL S03 + AerialExtreMatch recommended); D-C7-2 (NEW Cand-1-only) TensorRT mixed-precision flag matrix (per-family policy per D-C7-6 recommended); D-C7-3 (NEW Cand-2-only) ORT-Jetson-wheel-index-pin (mirror to project artifact registry + cu126 recommended); D-C7-4 (NEW Cand-2-only) numpy-version-pin (`numpy<2.0.0` recommended); D-C7-5 (NEW Cand-3-only) PyTorch-Jetson-wheel-pin (PyTorch 2.5 + torchvision 0.20 recommended); D-C7-6 (NEW CROSS-COMPONENT C2+C3+C1+C7) INT8-vs-FP16-per-model-family-precision-policy (per-family policy recommended); D-C7-7 (NEW Cand-1-only CROSS-COMPONENT with C10) engine-build-on-Jetson-vs-prebuilt strategy (primary build-on-target + reference-Jetson fallback recommended); D-C7-8 (NEW Cand-1-only) `config.max_workspace_size` cap (1 GB safe default recommended); D-C7-9 (NEW Cand-1-only) TensorRT version pin within JetPack lifecycle (JetPack 6.2 + TensorRT 10.3 recommended). 
| | [`C10_preflight_provisioning.md`](C10_preflight_provisioning.md) | **C10** — Pre-flight cache provisioning (CROSS-COUPLING MINIMAL scope per 2026-05-08 user choice C; only D-C6-3 + D-C7-7 confirmation pipelines researched here, operator tooling design deferred to Plan-phase) | #100–#101 (2 facts, **batch 1 closed at 2/N 2026-05-08**) | **D-C6-3 confirmation (Fact #100)**: descriptor-cache rebuild trigger + atomic-write strategy via direct `faiss.write_index`/`faiss.read_index` Python API + `python-atomicwrites` (write-temp + `fsync` + atomic rename) + content-hash (SHA-256) verification gate at takeoff load + `IO_FLAG_MMAP_IFC` mmap load with `madvise(MADV_WILLNEED)` pre-fault + manifest-hash-driven rebuild trigger; FAISS MIT + atomicwrites MIT throughout; FAISS warns "no internal integrity check, expects validated input" — MITIGATED by content-hash gate at takeoff (binds AC-NEW-7 cache-poisoning safety); rebuild-while-not-flying constraint per restrictions.md. **D-C7-7 confirmation (Fact #101)**: hybrid TensorRT engine-build orchestration — Polygraphy CLI primary for INT8-calibrating builds (`polygraphy convert --int8 --calib-cache= ...` Apache-2.0 + Calibrator API replaces hand-written `IInt8EntropyCalibrator2`) + `trtexec` for fast cache-reuse rebuilds (`--fp16 --int8 --calib=`) + direct `IBuilderConfig` Python API as escape hatch for unusual models (LightGlue dynamic-shape profiles); calibration cache binary-blob reuse keyed by `SHA-256(calib_corpus)` per D-C10-6; engines tied to SM 87 hardware-specific per Source #105 → must be built on deployed Jetson per D-C7-7 closure (D-C10-8 reference-Jetson-at-HQ + deployed-Jetson-copy-to-archive prebuilt-fallback venue); self-describing filename schema `_sm_jp_trt_.engine` per D-C10-7; binds AC-4.1/4.2 latency+memory budgets via D-C7-2 mixed-precision flag matrix + D-C7-1 calibration corpus closure. 
| +| [`MODEB_addendum.md`](MODEB_addendum.md) | **Mode B addendum** — solution_draft01 assessment (2026-05-08) | #102–#113 (12 facts) | Documentary-audit findings (Facts #102–#108): VINS-Mono BSD/GPL deliverable-formatting error (#102), AC-4.1 latency budget overrun (#103), camera calibration unspecified (#104), Suite Sat Service voting-layer contract gap (#105), `00_ac_assessment.md` BLOCKING-gate skip acknowledged (#106), AC-4.5 FC-consumption pathway scope clarification (#107), SQ2 AdHoP + Top-N re-rank sub-stage absence in solution_draft01 architecture (#108). Web-research findings (Facts #109–#113): MAVLink no-default-auth + MAVLink-2.0 message-signing per FC (#109), MegaLoc + UltraVPR D-C2-11 deferred-evaluation revision (#110), `MAV_CMD_SET_EKF_SOURCE_SET` no-deployed-GCS-implementer re-confirmation (#111), OpenCV ≥4.12.0 CVE pin (#112), XoFTR + DINOv2-features cross-modal contrarian evidence (#113). | | [`C8_fc_adapter.md`](C8_fc_adapter.md) | **C8** — MAVLink / MSP2 FC adapter | #97–#99 (3 facts, **batch 1 closed at 3/N 2026-05-08**) | **Cand 1 RECOMMENDED PRIMARY for ArduPilot**: pymavlink → MAVLink `GPS_INPUT` (msg 232) cooperative-path; `master.mav.gps_input_send(time_usec, gps_id, ignore_flags, time_week_ms, time_week, fix_type, lat, lon, alt, hdop, vdop, vn, ve, vd, speed_accuracy, horiz_accuracy, vert_accuracy, satellites_visible, yaw)` periodic injection at 5 Hz over MAVLink (UART/USB/UDP per D-C8-1); FC-side `GPS1_TYPE=14` MAVLink + `EK3_SRC1_POSXY=3` GPS source-set drives EKF3 ingestion via `AP_GPS_MAV` (verified Source #4 SQ6 + Source #106 + Source #107); pymavlink LGPL-3.0 linkable from Apache-2.0 app per LGPL §6 (D-C8-3 mitigation). 
**Cand 2 RECOMMENDED PRIMARY for iNav**: `MSP2_SENSOR_GPS` (id 7939 / 0x1F03) via Python MSP V2 (YAMSPy or INAV-Toolkit `msp_v2_encode`); `mspGPSReceiveNewData()` direct passthrough (no validation gate beyond data parse); covariance fields `hPosAccuracy`/`vPosAccuracy`/`hVelAccuracy` align directly with AP `GPS_INPUT.horiz_accuracy`/`vert_accuracy`/`speed_accuracy`; YAMSPy + INAV-Toolkit MIT throughout; `USE_GPS_PROTO_MSP` enabled by default in iNav target/common.h (verified Source #111 + #112 + #113); locked SQ6 + AC-4.3 + restrictions.md transport. **Cand 3 DEFERRED secondary for iNav**: UBX impersonation via pyubx2 NAV-PVT — forging u-blox NAV-PVT frames through standard GPS pipeline; iNav-side `gpsMapFixType()` validation gate requires `flags & 0x01 = 1` (gnssFixOK) AND `fixType ∈ {2,3}` per Source #110 `gps_ublox.c` lines 215-220 + 654; pyubx2 BSD-3-Clause clean dual-use; **does NOT clear user's "significant-improvement-only" bar over Cand 2** — richer protocol surface (NAV-PVT periodic + NAV-VER startup + CFG-MSG/CFG-RATE ACK behaviour) + AC-NEW-7 forgery posture + stricter validation gate + AP-path field-name divergence outweigh pyubx2 library-maturity advantage. **Mid-batch correction**: I caught a contradiction between my own initial AskQuestion phrasing ("UBX impersonation as ONLY iNav path") and locked SQ6 + AC-4.3 + restrictions.md verdicts; user re-locked scope via `c8_inav_recovery=B` to evaluate both as parallel candidates. 
Decisions: D-C8-1 (NEW Cand-1-only) pymavlink connection-string transport choice (env-driven default-UART recommended); D-C8-2 (NEW Cand-1-only CROSS-COMPONENT with AC-NEW-2) `MAV_CMD_SET_EKF_SOURCE_SET` companion-driven switch ownership pattern (companion publishes to source-set 2 + auto-switches FC recommended); D-C8-3 (NEW Cand-1-only) pymavlink LGPL-3.0 license-posture verification (bundle-unmodified-with-version-pin recommended); D-C8-4 (NEW Cand-2-only) Python MSP V2 implementation choice (YAMSPy primary + thin custom encoder fallback recommended); D-C8-5 (NEW Cand-2-only) MSP2_SENSOR_GPS injection rate (5 Hz periodic recommended); D-C8-6 (NEW Cand-3-only contingent) UBX-version-advertisement strategy (advertise version ≥ 15.0 recommended); D-C8-7 (NEW Cand-3-only contingent CROSS-COMPONENT with AC-NEW-7) AC-NEW-7 audit-trail posture for UBX impersonation (explicit FDR audit entry recommended); D-C8-8 (NEW CROSS-COMPONENT C5+C8) covariance-honesty cross-FC enforcement strategy (per-FC unit conversion recommended via 95% confidence ellipse semi-major axis from C5 GTSAM `Marginals.marginalCovariance`). | **Cross-cutting consumers** (do not duplicate facts here, just point in): diff --git a/_docs/00_research/02_fact_cards/MODEB_addendum.md b/_docs/00_research/02_fact_cards/MODEB_addendum.md new file mode 100644 index 0000000..590d70d --- /dev/null +++ b/_docs/00_research/02_fact_cards/MODEB_addendum.md @@ -0,0 +1,111 @@ +# Fact Cards — Mode B Addendum (2026-05-08) + +> Mode B Solution Assessment of `_docs/01_solution/solution_draft01.md`. New facts gathered for findings F1–F20; Mode A facts #1–#101 remain canonical and are not duplicated. +> +> Index: [`00_summary.md`](00_summary.md). Mode B sources: [`../01_source_registry/MODEB_addendum.md`](../01_source_registry/MODEB_addendum.md). Mode B fit-matrix revisions: [`../06_component_fit_matrix/MODEB_revisions.md`](../06_component_fit_matrix/MODEB_revisions.md). 
Mode B output: [`../../01_solution/solution_draft02.md`](../../01_solution/solution_draft02.md). +> +> Confidence labels and schema match `00_summary.md` legend. + +--- + +## Documentary-audit findings (no new web evidence required) + +### Fact #102 — solution_draft01 C1 candidate table mis-licenses VINS-Mono as "BSD permissive clean"; the underlying Mode A C1 Fact #28 correctly classifies it as GPL-3.0 (deliverable-formatting error) +- **Statement**: `solution_draft01.md` § "Component: C1" lists VINS-Mono with the cell "Security: BSD permissive clean" and "Selected (mandatory simple-baseline) — fallback if OKVIS2 fails Jetson MVE". The Mode A C1 fact card #28 (`02_fact_cards/C1_vio.md`) explicitly states VINS-Mono is "GPL-3.0 (copyleft viral) — distribution of the onboard binary requires source disclosure for the entire linked binary and triggers GPL-3 anti-tivoization clauses for embedded firmware" — and the cross-component-gates D-C1-1 license-track decision exists precisely because VINS-Mono / VINS-Fusion / OpenVINS are on the GPL-3.0 axis. Source #122 (raw VINS-Mono LICENCE on github.com) confirms canonical GPL-3.0. The discrepancy is inside Mode A Step 8 (Deliverable Formatting); the Mode A evidence layer is correct. +- **Source**: Mode A C1 Fact #28; Source #122 (canonical LICENCE) +- **Phase**: Mode B documentary audit +- **Confidence**: ✅ High +- **Sub-Question Binding**: SQ3+SQ4 / C1 +- **Implication**: solution_draft02 must (a) correct the C1 candidate table cell to "GPL-3.0 contingent on D-C1-1 = (a) or (c) license track", (b) demote VINS-Mono from "Selected (mandatory simple-baseline)" status because under D-C1-1 = (b) BSD/permissive-only track it would be **Rejected** by license, (c) elevate KLT+RANSAC homemade fallback to **the** mandatory simple-baseline (matches Mode A C1 Fact #35), and (d) name the actual BSD/permissive-track lead as OKVIS2 (matches C1 Fact #31). 
No change to the cross-component decision graph — D-C1-1 already exists as the gate that resolves this. + +### Fact #103 — solution_draft01 latency math (~140-420 ms p95 at K=3 + adaptive LightGlue depth) crosses AC-4.1's 400 ms p95 budget at the upper end with no documented slack for FC-side IMU pre-integration, MAVLink/MSP serialization, OS scheduling jitter, or thermal-throttle backoff +- **Statement**: solution_draft01 § "Component-interaction diagram (pre-flight + runtime)" labels the runtime stack: "C1 OKVIS2 VIO ~30-50 ms + C2 MixVPR query ~25 ms + C3 DISK+LightGlue × K pairs ~90-180 ms FP16 + C4 OpenCV solvePnPRansac ~5-15 ms + GTSAM Marginals ~30-90 ms + C5 GTSAM iSAM2 ~2-5 ms per update at D-C5-5 = (c) + C8 per-FC pymavlink GPS_INPUT / MSP2_SENSOR_GPS 5 Hz periodic", and says total is "~140-420 ms p95 at K=3 + adaptive LightGlue depth". The upper end **420 ms exceeds AC-4.1's 400 ms p95** at the documented Jetson Orin Nano Super extrapolation. There is no reserved slack for: (i) MAVLink/MSP serialization + UART/USB transmission to FC (~5-20 ms typical), (ii) OS scheduling jitter under shared-CPU+GPU contention (~10-30 ms typical at 90th-99th percentile per Source #97 Postgres-on-Jetson observations), (iii) thermal-throttle backoff at +50 °C ambient per AC-NEW-5 (Jetson backs off from 25 W to 15 W, collapsing throughput by ~40%), (iv) FC-side IMU pre-integration interpolation latency for the timestamp the GPS_INPUT/MSP2_SENSOR_GPS frame is targeted at, (v) FAISS HNSW index search variance at p99 (~1-3 ms typical → up to ~10-15 ms at p99 per Source #115). A defensible AC-4.1 latency partition would carve a project-side worst-case ≤300 ms p95 budget with explicit per-stage deadlines + slack reservation; current draft01 budgets up to 420 ms with implicit assumption-of-best-case stack behavior. 
+- **Source**: solution_draft01.md self-citation; AC-4.1; Mode A Sources #97 + #115; AC-NEW-5 +- **Phase**: Mode B documentary audit +- **Confidence**: ✅ High (math is internal to draft01) +- **Sub-Question Binding**: SQ3+SQ4 / C1+C2+C3+C4+C5+C7+C8 cross-cutting NFR +- **Implication**: solution_draft02 must add a NEW Plan-phase decision **D-CROSS-LATENCY-1: AC-4.1 latency budget partition strategy** with options (a) tighten K=3 to K=2 to recover ~30-60 ms of headroom, (b) drop GTSAM `Marginals` covariance recovery from RUNTIME path and use adaptive Jacobian-based covariance per D-C4-2 = (a) to recover ~20-60 ms, (c) accept the budget overrun and validate at Jetson MVE that p95 lands under 400 ms in steady-state (i.e. trust the math is conservative and adaptive-LightGlue-depth in practice will land closer to 140 ms than 420 ms), (d) hybrid: K=3 default + auto-degrade to K=2 + Jacobian-covariance under thermal throttle. Recommendation: **(d) hybrid** — preserves AC-4.1 satisfaction across the operating envelope without permanently sacrificing accuracy. **NEW cross-component gate: requires Jetson MVE measurement of full p95+p99 distribution under hot-soak NFT-3 conditions before lock.** + +### Fact #104 — Camera intrinsics + camera-to-body calibration are PROJECT-LEVEL OPEN ITEMS per `_docs/00_problem/problem.md` last sentence and `flight_derkachi/README.md`; solution_draft01 does NOT inventory this as a Plan-phase decision +- **Statement**: `_docs/00_problem/problem.md` last sentence: "Camera intrinsics, lens distortion, raw camera feed parameters, and exact camera-to-body calibration are still pending, so final production accuracy claims remain gated on calibration data or a separately surveyed representative dataset." 
`_docs/00_problem/input_data/flight_derkachi/README.md`: "Camera intrinsics, lens distortion, raw camera resolution, and exact camera-to-body calibration are still unknown, so this fixture is not sufficient by itself for final production camera calibration or satellite-anchor accuracy claims." `_docs/00_problem/input_data/expected_results/results_report.md` § Known Gaps: "Final production acceptance requires camera calibration and representative datasets with synchronized camera/IMU plus ground-truth trajectory." solution_draft01 cites Sources #82+#83 (OpenCV solvePnPRansac signature requires `K` intrinsic matrix + `dist` distortion coefficients) but does not flag that **K and dist are not yet known** for the deployed ADTi 20MP 20L V1 nav camera. Without intrinsics + camera-to-body extrinsic calibration, the entire C4 pose-estimation pipeline cannot run on real production frames; the Jetson MVE results will be calibration-acquisition-dependent. +- **Source**: `_docs/00_problem/problem.md` line 1; `_docs/00_problem/input_data/flight_derkachi/README.md` line 12; `_docs/00_problem/input_data/expected_results/results_report.md` § Known Gaps; OpenCV Sources #82+#83 +- **Phase**: Mode B documentary audit +- **Confidence**: ✅ High +- **Sub-Question Binding**: PCM (Project Constraint Matrix) input availability dimension +- **Implication**: solution_draft02 must add a NEW project-level decision **D-PROJ-1: Camera calibration acquisition strategy** with options (a) checkerboard calibration on a pre-deployment ADTi 20MP 20L V1 nav-camera unit (canonical OpenCV calibration workflow ~1-2 days engineering + lab access), (b) photogrammetric self-calibration from the first ~50 deployment frames over known landmarks (~2-3 days plus runtime support code; produces production-correct calibration but degrades first-mission accuracy), (c) request manufacturer's factory-calibration data sheet from ADTi (low cost if available; risk: vendor may not publish per-unit calibration), (d) 
hybrid: factory data sheet + ground-truth checkerboard refinement on each deployed unit. Recommendation: **(d) hybrid**. **CRITICAL Plan-phase gate**: this is a hard prerequisite for AC-1.1/1.2 frame-center-accuracy validation; Test Spec (greenfield Step 5) cannot lock end-to-end accuracy fixtures without it. + +### Fact #105 — AC-NEW-7 cache-poisoning safety budget explicitly depends on a Suite Sat Service-side voting layer that solution_draft01 does NOT audit for existence, contract, or build status +- **Statement**: `_docs/00_problem/acceptance_criteria.md` § AC-NEW-7 External-dependency note: "The Suite Satellite Service is expected to operate a multi-flight ingest-side voting layer that gates onboard-tile promotion to 'trusted basemap' until multiple independent flights agree on geo-alignment. Voting algorithm is the Service's concern; onboard's job (AC-8.4) is to publish per-tile quality metadata sufficient for that layer. End-to-end AC-NEW-7 evidence depends on this Service contract." solution_draft01 § Architecture lists C6 + C10 as covering the onboard half (publish per-tile quality metadata, content-hash gate at takeoff, atomic-write descriptor cache) but does NOT verify that the Suite Service voting layer (a) has a documented contract, (b) has been implemented, (c) is on the parent-suite roadmap, or (d) has a fallback if not yet built. Without the Service-side voting, a single bad onboard pose with optimistic covariance writes a misaligned tile that becomes the next flight's anchor — cross-flight error compounding that NFT-5 (in solution_draft01) explicitly tries to test but cannot validate end-to-end without the Service contract. 
+- **Source**: AC-NEW-7 verbatim; solution_draft01 § Architecture C6+C10; solution_draft01 § Testing Strategy NFT-5 +- **Phase**: Mode B documentary audit +- **Confidence**: ✅ High +- **Sub-Question Binding**: PCM cross-component external-dependency dimension; SQ8 (security) +- **Implication**: solution_draft02 must add a NEW project-level decision **D-PROJ-2: Suite Sat Service voting-layer contract verification** with options (a) verify Suite Service voting layer is documented + scheduled for the deployment timeframe; (b) draft the contract from the onboard side and propose to the Suite Service team; (c) build a project-internal multi-flight aggregator as a stop-gap until Suite Service ships the layer (~2-3 weeks engineering, but cross-flight aggregator means onboard now owns suite-component scope creep); (d) accept that AC-NEW-7 Service-side validation is best-effort and document the gap explicitly. Recommendation: **(a) verify + (b) draft** in parallel — the contract definition is small (per-tile quality metadata schema + voting threshold spec) and propagating it back to the Suite Service team de-risks the entire AC-NEW-7 obligation. **CRITICAL cross-suite gate**: requires coordination with the parent-suite Satellite Service team before AC-NEW-7 NFT-5 can pass with end-to-end evidence. + +### Fact #106 — Mode A Phase 1 BLOCKING gate (`00_ac_assessment.md`) was not produced as a standalone artifact in the Mode A run per solution_draft01's own self-disclosure +- **Statement**: solution_draft01 § Note on AC assessment (lines 17-18): "Mode A Phase 1 (`00_ac_assessment.md` BLOCKING gate per the research SKILL.md) was not executed as a standalone artifact in this run. Per-AC binding evidence is instead distributed across the per-component fact cards and the Restrictions × Candidate-Modes sub-matrix sections in `06_component_fit_matrix/Cx_*.md`. 
This is acknowledged as a process deviation and is recoverable by extracting an `00_ac_assessment.md` summary file from the existing per-AC binding evidence on demand. No AC has been silently dropped or unverified." Per `_docs/00_research/00_question_decomposition.md` line 4 the Phase 1 skip was a **prior user decision** after a cleanup pass that stripped implementation details from `acceptance_criteria.md` and `restrictions.md`; "AC/restrictions are treated as fixed inputs". Mode B can either (a) extract the standalone artifact retroactively from the distributed evidence, (b) confirm the deviation as accepted by the user, or (c) leave it as-is for Plan-phase to either resolve or carry forward. The risk is small (per-AC binding IS in the per-component fact cards) but the canonical research methodology says a BLOCKING gate cannot simply be skipped. +- **Source**: solution_draft01.md "Note on AC assessment"; `_docs/00_research/00_question_decomposition.md` line 4; research SKILL.md Mode A Phase 1 BLOCKING-gate spec +- **Phase**: Mode B documentary audit +- **Confidence**: ✅ High +- **Sub-Question Binding**: Process compliance with research SKILL.md +- **Implication**: solution_draft02 acknowledges the deviation and recommends extraction of `00_ac_assessment.md` IF user wants the canonical artifact; otherwise the deviation is treated as accepted (per `00_question_decomposition.md` line 4 prior-user decision) and recorded explicitly in `_docs/_process_leftovers/`. 
+ +### Fact #107 — AC-4.5 (system may refine prior estimates and emit corrections) FC-consumption pathway is unspecified; neither MAVLink `GPS_INPUT` nor MSP2 `MSP2_SENSOR_GPS` supports "correct a frame from N steps ago" semantics; GTSAM iSAM2's NATIVE look-back refinement is therefore internal-only and does not reach the FC +- **Statement**: AC-4.5 ("System may refine prior estimates and emit corrections") is satisfied by GTSAM iSAM2's incremental smoothing per Mode A Fact #89 — the estimator can revise past keyframe poses when new measurements arrive. solution_draft01 § Component C5 + § Testing Strategy IT-10 cite this as a key benefit of D-C5-5 = (c) GTSAM-shared-substrate. However: ArduPilot's `AP_GPS_MAV` (Source #4) and iNav's `mspGPSReceiveNewData()` (Source #110) both consume the **latest** received GPS frame as the current best estimate; neither supports retroactive correction of a frame N steps in the past. So the value of GTSAM iSAM2's look-back refinement is **internal-only** — it improves the current best pose estimate after smoothing the past, but the FC sees only the current frame after smoothing, not corrections to past frames. AC-4.5 is therefore satisfied as "internal estimator refines past + emits the corrected current estimate", not as "FC retroactively corrects past flight log". Draft01 does not make this scoping explicit; IT-10 in particular does not validate AC-4.5 — it validates per-FC unit conversion of covariance. +- **Source**: AC-4.5 verbatim; Mode A Fact #89 (GTSAM iSAM2); Mode A SQ6 Source #4 (`AP_GPS_MAV.cpp`); Mode A C8 Source #110 (`gps_ublox.c`) +- **Phase**: Mode B documentary audit +- **Confidence**: ✅ High +- **Sub-Question Binding**: SQ3+SQ4 / C5; SQ6 / C8 +- **Implication**: solution_draft02 § Architecture C5 must clarify "AC-4.5 satisfied as internal smoothing + corrected current-frame emission; FC log is forward-time only".
solution_draft02 § Testing Strategy must add a new **IT-11 — Smoothing-loop look-back accuracy** test that validates GTSAM iSAM2's smoothed past-keyframe poses against ground-truth at smoothing convergence (independent of FC-side consumption). FDR (AC-NEW-3) MUST log smoothed past-frame estimates so post-mission analysis can verify AC-4.5. + +### Fact #108 — SQ2 architectural decisions promoted "AdHoP refinement loop" + "Top-N inlier-based re-rank" to explicit named sub-stages in the runtime pipeline (per `_docs/00_research/00_question_decomposition.md` lines 175-178), but solution_draft01 § Architecture has neither a candidate row nor a named sub-stage for either +- **Statement**: `_docs/00_research/00_question_decomposition.md` § "SQ2 — Architectural decisions" — Decision 2: "AdHoP refinement loop (Fact #22) → (b) Conditional — only invoked when initial reprojection error exceeds a threshold. C3 (matcher) latency budget = base (single-pass) + AdHoP-conditional overhead (worst-case 2× when triggered)." Decision 3: "Top-N re-rank promotion (Fact #25) → (a) Promote to an explicit named sub-stage between C2 and C3. SQ3+SQ4 will hyperparameter-sweep N ∈ {5, 10, 15, 20}; C2 candidates evaluated jointly with re-rank cost. Top-N re-rank by inlier-count is now a hard pipeline component, not implicit." solution_draft01 § Architecture lists candidate tables for C1+C2+C3+C4+C5+C6+C7+C8+C10. The component-interaction diagram shows "C2 MixVPR query → top-K=3 satellite tile retrieval → C3 DISK+LightGlue × K pairs" — the K=3 retrieval IS the top-K from C2, but the **re-rank by inlier count** sub-stage promised by SQ2 Decision 3 is not represented. Similarly, no AdHoP-conditional refinement candidate appears in the C3 row, despite SQ2 Decision 2 carving its latency budget. 
+- **Source**: `_docs/00_research/00_question_decomposition.md` lines 175-178; solution_draft01 § Architecture C1-C10; solution_draft01 § Component-interaction diagram +- **Phase**: Mode B documentary audit +- **Confidence**: ✅ High +- **Sub-Question Binding**: SQ2 closure +- **Implication**: solution_draft02 must either (a) populate the architecture with new candidate rows for "Top-N re-rank by inlier count" (likely a thin wrapper around the C3 matcher's RANSAC inlier counter) and "AdHoP-conditional refinement" (per Source #40 OrthoLoC AdHoP method-agnostic preconditioning); or (b) explicitly close SQ2 Decisions 2+3 as "implicit inside C3" — but in that case the "promote to explicit named sub-stage" wording from question_decomposition must be revisited and the user notified that the architecture deviated. Recommendation: **(a) populate** — both are well-scoped sub-components with cited Facts (#22 + #25 in the original SQ2 closure) and the latency budgets are already carved. solution_draft02 § Architecture adds a "Re-rank" sub-stage between C2 and C3 plus an "AdHoP-conditional" sub-stage between C3 and C4. + +--- + +## Web-research findings (2026-05-08) + +### Fact #109 — MAVLink protocol lacks cryptographic authentication by default (CVE-2026-1579, CVSS 9.8 CRITICAL); ArduPilot supports MAVLink 2.0 message signing as the canonical mitigation; iNav has only partial MAVLink support and does NOT implement message signing — cross-FC asymmetry on the GCS / telemetry link is material for AC-NEW-7 + AC-NEW-2 +- **Statement**: Per Source #126 (NVD CVE-2026-1579, CVSS 9.8 CRITICAL): "The MAVLink communication protocol lacks cryptographic authentication by default. Unauthenticated parties with MAVLink interface access can send arbitrary messages including SERIAL_CONTROL commands for interactive shell access." Named affected product: PX4 Autopilot v1.16.0_SITL_latest_stable.
**Mitigation: enable MAVLink 2.0 message signing.** Per Source #128 (ArduPilot Plane MAVLink2 Signing docs): ArduPilot supports MAVLink2 signing via Mission Planner SETUP > Advanced > "Mavlink Signing"; non-USB serial ports can be configured to only respond to MAVLink commands carrying the correct passkey; the 13-byte signature block comprises a link ID (8 bits), a timestamp (48 bits, in 10-microsecond units since 2015-01-01), and a 48-bit signature taken from the first 48 bits of a SHA-256 hash computed over the secret key, the packet, the link ID, and the timestamp (Source #128 + canonical mavlink.io/en/guide/message_signing.html). Issue #28736 + PR #29546 (March 2025) add channel-specific signing for separate MAVLink ports — direct relevance to companion-computer wired connections per Source #128. Per Source #129 (iNav MAVLink wiki, frogmane edited 2025-12-11): "iNav has partial MAVLink support and does not implement message signing. It lacks parameter API support and has limited command compatibility." Companion-FC inbound on iNav is MSP2 (not MAVLink), so the signing gap is on the OUTBOUND MAVLink telemetry side from iNav to the GCS, not on the inbound external-positioning path — but cross-FC asymmetry is still material because the GCS link itself carries `STATUSTEXT` and operator commands per AC-6.1 + AC-6.2.
+- **Source**: Source #126 (CVE-2026-1579), Source #128 (ArduPilot Plane MAVLink2 Signing docs + PR #29546), Source #129 (iNav MAVLink wiki), canonical mavlink.io/en/guide/message_signing.html (Source #128 cross-cite) +- **Phase**: Mode B web research +- **Confidence**: ✅ High (NVD official + ArduPilot official + iNav official wiki) +- **Sub-Question Binding**: SQ6 (FC adapter security posture) + SQ8 (AC-NEW-7 + AC-NEW-2 security) +- **Implication**: solution_draft02 must add a NEW Plan-phase decision **D-C8-9: MAVLink 2.0 message signing posture per FC** with options (a) require MAVLink 2.0 signing on ALL MAVLink channels (companion ↔ ArduPilot, FC ↔ GCS, companion ↔ GCS); (b) require signing only on the companion ↔ ArduPilot wired channel (the inbound external-positioning path on AP); (c) accept the unsigned-by-default posture and document it as an external-attack-surface risk; (d) hybrid: signing on companion ↔ AP wired channel + key rotation on every flight. iNav has no signing option per Source #129 — explicit cross-FC asymmetry must be documented. Recommendation: **(d) hybrid for AP**. Cross-FC asymmetry: iNav GCS link is unsigned by design — document explicitly under AC-NEW-7 and propose iNav firmware feature-request as Plan-phase carryforward. **NEW NFT-8 — MAVLink message-signing verification**: SBOM dump confirms passkey configuration for AP signing channel; iNav side documents the unsignable-link as accepted residual risk. 
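The signature layout described in Fact #109 can be illustrated with a minimal Python sketch. It assumes the field order given on the canonical message-signing page: link ID, then the 48-bit timestamp, then the first 48 bits of SHA-256 over secret key + packet + link ID + timestamp. The helper names `signing_timestamp` and `signature_block`, the all-zero demo key, and the placeholder packet bytes are hypothetical; a real deployment would rely on pymavlink's built-in signing support rather than this sketch.

```python
import hashlib
import struct
from datetime import datetime, timezone

# MAVLink 2 signing epoch: 1 January 2015 GMT; the timestamp unit is 10 microseconds.
SIGNING_EPOCH = datetime(2015, 1, 1, tzinfo=timezone.utc)


def signing_timestamp(now: datetime) -> int:
    """Return the 48-bit timestamp in 10-microsecond units since the signing epoch."""
    delta = now - SIGNING_EPOCH
    ticks = (delta.days * 86_400 + delta.seconds) * 100_000 + delta.microseconds // 10
    return ticks & 0xFFFFFFFFFFFF


def signature_block(secret_key: bytes, packet: bytes, link_id: int, timestamp: int) -> bytes:
    """Build link ID (1 byte) + timestamp (6 bytes, little-endian) + signature (6 bytes).

    `packet` is assumed to be the complete MAVLink 2 frame (header + payload + CRC)
    before the signature block is appended.
    """
    if len(secret_key) != 32:
        raise ValueError("secret key must be 32 bytes")
    link_and_time = bytes([link_id & 0xFF]) + struct.pack("<Q", timestamp)[:6]
    digest = hashlib.sha256(secret_key + packet + link_and_time).digest()
    return link_and_time + digest[:6]  # only the first 48 bits of the hash are sent


if __name__ == "__main__":
    demo_key = bytes(32)                # hypothetical all-zero key, for illustration only
    demo_packet = b"\xfd" + bytes(11)   # placeholder bytes, not a valid MAVLink frame
    block = signature_block(demo_key, demo_packet, link_id=1,
                            timestamp=signing_timestamp(datetime.now(timezone.utc)))
    print(len(block), block.hex())
```

Because only 48 bits of the hash travel on the wire, the scheme depends on the secret key staying private and on receivers rejecting stale timestamps, which is why option (d) pairs signing with per-flight key rotation.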
+ +### Fact #110 — MegaLoc (Berton & Masone, CVPR 2025) and UltraVPR (RAL 2025 / ICRA 2026) are MIT-licensed aerial-validated VPR candidates that materially change the D-C2-11 deferred-evaluation recommendation; UltraVPR specifically targets UAV with documented 44 Hz throughput on Jetson Orin NX (Orin-Nano-Super-class) +- **Statement**: Per Source #123 (MegaLoc): MIT-licensed; February 2025 release; SOTA on multiple VPR benchmarks (indoor + outdoor); validated on aerial datasets via the AirZoo benchmark (Source #125) which "demonstrates that fine-tuning MegaLoc on aerial data yields substantial performance gains for aerial image retrieval and cross-view matching tasks". Distributed via torch.hub for easy installation. Per Source #124 (UltraVPR): MIT-licensed; RAL 2025 + ICRA 2026; "unsupervised lightweight rotation-invariant aerial VPR system designed for UAV applications"; **ONNX model runs at approximately 44 Hz on Jetson Orin NX** (Orin Nano Super is in the same Ampere family; expected to land in the same throughput band ±20%); validated on VPAir + UAV-VisLoc datasets — directly relevant to the project's pinned aerial UAV operating context. solution_draft01 § "Open decisions for Plan-phase" line 322 explicitly defers D-C2-11 (MegaLoc successor evaluation) to "post-research session" because Mode A had not gathered sufficient evidence on MegaLoc's aerial applicability or Jetson runtime. Mode B research closes both gaps: MegaLoc is aerial-validated (AirZoo); UltraVPR is aerial-pretrained + Jetson-throughput-documented. The D-C2-11 recommendation should be revised from "(c) skip and rely on closed mandatory pre-screen" to "(a) treat both MegaLoc AND UltraVPR as new Documentary Lead candidates on the BSD/permissive C2 axis at next session, with mandatory Jetson MVE under D-C1-2 / D-C2-4". 
+- **Source**: Source #123 (MegaLoc), Source #124 (UltraVPR), Source #125 (AirZoo aerial validation) +- **Phase**: Mode B web research +- **Confidence**: ✅ High (peer-reviewed publications + official repos with pretrained weights) +- **Sub-Question Binding**: SQ3+SQ4 / C2 D-C2-11 +- **Implication**: solution_draft02 § Architecture C2 must add MegaLoc + UltraVPR as Documentary Lead candidates on the BSD/permissive C2 axis. UltraVPR is potentially the strongest candidate by the project's specific operating-context scoring: rotation-invariant (multi-heading aerial flights), unsupervised (no aerial-retrain cost — closes D-C2-1), Jetson-Orin-NX-runtime-documented at 44 Hz (well above the 3 Hz nav-camera rate), and MIT-licensed (BSD/permissive track clean). MegaLoc is the broader-applicability primary if SOTA across non-aerial datasets is also wanted (e.g., for cross-domain generalization). **D-C2-11 revised**: (a) elevate UltraVPR to Documentary Lead PRIMARY recommendation on BSD/permissive C2 axis, with MixVPR / EigenPlaces / SelaVPR as siblings; (b) add MegaLoc as Documentary Lead SECONDARY with broader-applicability tag; (c) preserve the closed pre-screen (5/5: MixVPR + SALAD + SelaVPR + NetVLAD + EigenPlaces) as fallback. The mandatory Jetson MVE per D-C1-2 / D-C2-4 is expanded in scope to cover UltraVPR, MegaLoc, and the existing five. + +### Fact #111 — D-C8-2 = (b) companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` switch pattern is supported by ArduPilot at firmware level since August 2021 (PR #18345 → SITL-tested) but no production-deployed GCS or companion implementation is publicly documented; the project will be establishing the canonical pattern itself +- **Statement**: Per Source #130 (ArduPilot common-ekf-sources.rst + PR #18345): ArduPilot has supported `MAV_CMD_SET_EKF_SOURCE_SET` (MAVLink command id 42007) since the merge in August 2021; the command accepts source-set values in the 1-3 range; tested in SITL.
Source #130 explicitly states: "no GCSs are currently known to implement this" and "The results do not provide specific information about Auterion, NGPS, or production deployment status." This re-confirms Mode A SQ6 Fact #3 from a fresh search at access time 2026-05-08. solution_draft01 § Open decisions D-C8-2 = (b) "companion publishes to source-set 2 + auto-switches FC to set 2 on first valid fix + switches back to set 1 when companion is unavailable (RECOMMENDED ~mirrors NGPS/Auterion pattern)" cites the NGPS/Auterion deployment pattern, but Mode A SQ1 Sources #25–#37 do not provide direct evidence of either NGPS or Auterion using the companion-driven switch — they document the existence of those deployed systems but not their internal source-set switching mechanism. This is a gap: the project is committing to a pattern that is **architecturally supported but not production-deployed**. +- **Source**: Source #130 (ArduPilot common-ekf-sources.rst + PR #18345 verified 2026-05-08); Mode A SQ6 Fact #3 +- **Phase**: Mode B web research (verification re-run) +- **Confidence**: ✅ High (ArduPilot official documentation) +- **Sub-Question Binding**: SQ6 / C8 D-C8-2 +- **Implication**: solution_draft02 must downgrade D-C8-2 fit status from `Selected` to **`Selected with runtime gate`** per Step 7.5.3 carve-out wording, with the runtime gate being "validated end-to-end on ArduPilot Plane SITL by IT-3 (Spoofing-promotion latency) before lock". Add NEW Plan-phase decision **D-C8-2-FALLBACK: companion-driven switch fallback strategy if SITL validation fails** with options (a) switch to operator-manual source-set flip via RC aux switch option 90 per draft01's existing D-C8-2 = (c), accepting that the AC-NEW-2 ≤3 s latency would then depend on operator response time; (b) implement operator-warning STATUSTEXT instead of automated switch, deferring authority to operator; (c) escalate to ArduPilot dev community to characterize firmware-side switch latency before lock.
Recommendation: **defer to Test Spec greenfield Step 5** which owns SITL fixture acquisition. + +### Fact #112 — OpenCV 4.x must be pinned to ≥4.12.0 per CVE-2025-53644 (CVSS 9.8 CRITICAL heap buffer write via crafted JPEG); affects 4.10.0 / 4.11.0; OpenCV is C4's primary `solvePnPRansac` runtime + KLT fallback in C1 + ortho warp in C4 +- **Statement**: Per Source #127 (NVD CVE-2025-53644, CVSS 9.8 CRITICAL): "Uninitialized pointer variable on stack when reading crafted JPEG images. Affects versions 4.10.0 and 4.11.0. Fixed in version 4.12.0." Related weakness: CWE-457 (Use of Uninitialized Variable). solution_draft01 § Component C4 cites "OpenCV 4.x calib3d module"; § Component C1 KLT fallback uses "OpenCV pure-Python"; FDR thumbnail re-load + tile cache import paths potentially feed crafted JPEG bytes into OpenCV's `imread` / `imdecode`. The exposure is small (most JPEG inputs are trusted internal nav-camera stream) but FDR thumbnail re-load AND ortho-tile imports from the Suite Sat Service path could be hostile-input vectors per AC-NEW-7. Pinning OpenCV to ≥4.12.0 is a single-line change with no API-break exposure (4.12 is a minor release on the 4.x line). +- **Source**: Source #127 (NVD CVE-2025-53644) +- **Phase**: Mode B web research +- **Confidence**: ✅ High (NVD official) +- **Sub-Question Binding**: SQ8 (security) + SQ3+SQ4 / C1+C4 dependency pinning +- **Implication**: solution_draft02 must pin OpenCV to **≥4.12.0** in all C1+C4 candidate rows; add NEW Plan-phase decision **D-CROSS-CVE-1: dependency security pinning posture** with options (a) lock to specific patched versions of all CVE-affected dependencies (OpenCV ≥4.12.0; FAISS Apache-2.0 throughout — no CVEs; GTSAM clean — no CVEs; TensorRT 10.3 in JetPack 6.2 — no CVE-applicable since not using TRT-LLM 0.x); (b) maintain a project SBOM with monthly CVE re-scan; (c) automate pinning via dependabot or equivalent. Recommendation: **(a) + (b)** — minimal cost, maximum AC-NEW-7 audit-trail coverage. 
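As a sketch of how the OpenCV pin from Fact #112 might be enforced at startup or in CI, the following Python snippet compares a version string against the 4.12.0 floor. The names `MIN_OPENCV`, `parse_version`, and `opencv_pin_satisfied` are hypothetical and not part of any project module; in practice the input would typically be `cv2.__version__`.

```python
import re

# First release carrying the CVE-2025-53644 fix, per Fact #112.
MIN_OPENCV = (4, 12, 0)


def parse_version(version: str) -> tuple:
    """Extract the leading major.minor.patch integers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def opencv_pin_satisfied(version: str, minimum: tuple = MIN_OPENCV) -> bool:
    """Return True when the given version is at or above the pinned minimum."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    for candidate in ("4.10.0", "4.11.0", "4.12.0"):
        print(candidate, opencv_pin_satisfied(candidate))
```

Comparing integer tuples rather than raw strings matters here: as strings, "4.9.0" would sort above "4.12.0", which would let an older release slip past the check.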
+ +### Fact #113 — XoFTR (cross-modal) achieves SOTA cross-modal matching but a 2026 SAR-optical benchmark (24-matcher comparison) found foundation-model features (DINOv2) provide modality invariance even WITHOUT explicit cross-modal training — reinforces SelaVPR (DINOv2-L) preference over MixVPR (CNN-only) when cross-domain UAV→satellite registration is the binding stress test +- **Statement**: Per Source #131 (XoFTR + 2026 SAR-optical benchmark): XoFTR achieved the lowest reported mean error at 3.0 pixels on SpaceNet9 cross-modal training scenes among 24 pretrained matcher families. **Critical finding**: "matchers without explicit cross-modal training sometimes performed comparably, suggesting that foundation-model features (like DINOv2) may provide modality invariance." This is direct contrarian evidence on the project's "DISK+LightGlue retrain on aerial-domain corpus closes cross-domain UAV→satellite gap" architectural bet (D-C3-1 = (a) RECOMMENDED-PRIMARY-MITIGATION). The contrarian implication: a DINOv2-backboned VPR (SelaVPR per Mode A C2 fact card) AND a DINOv2-backboned matcher (a hypothetical DINOv2-backed feature extractor + LightGlue) might close the cross-domain gap WITHOUT needing the D-C2-1 ~1-2-week aerial retrain that draft01 baselines. This does not invalidate the existing D-C3-1 = (a) recommendation but it strengthens the case for keeping SelaVPR (DINOv2-L) as a serious candidate alongside MixVPR (CNN) in the BSD/permissive C2 axis, and it suggests MegaLoc (which also uses foundation-model features per Source #123) is similarly attractive without retrain cost. 
+- **Source**: Source #131 (XoFTR + 2026 SAR-optical benchmark, two-source confirmation per Critical-novelty rule) +- **Phase**: Mode B web research +- **Confidence**: ⚠️ Medium (the "comparable performance without cross-modal training" finding is from one benchmark on SAR-optical, not UAV→satellite — extrapolation to project's exact operating context is plausible but unverified) +- **Sub-Question Binding**: SQ3+SQ4 / C2+C3 +- **Implication**: solution_draft02 § Architecture C2 keeps SelaVPR (DINOv2-L two-stage) as **strong secondary** alongside UltraVPR primary on the BSD/permissive C2 axis (per Fact #110 promotion). solution_draft02 § Open decisions adds **D-C2-12: DINOv2-backbone feature-extractor evaluation for cross-domain matching** as carryforward research item — could potentially close D-C3-1 retrain cost via DINOv2-feature-based matcher (e.g., DINOv2 + LightGlue or DINOv2 + paired matcher) without requiring D-C2-1 aerial retrain. Defer to Plan-phase Jetson MVE. diff --git a/_docs/00_research/06_component_fit_matrix/00_summary.md b/_docs/00_research/06_component_fit_matrix/00_summary.md index dbb3b0b..0eca32b 100644 --- a/_docs/00_research/06_component_fit_matrix/00_summary.md +++ b/_docs/00_research/06_component_fit_matrix/00_summary.md @@ -33,6 +33,7 @@ This folder replaces the previous monolithic `06_component_fit_matrix.md` (284 l | [`C7_inference_runtime.md`](C7_inference_runtime.md) | **C7** — On-Jetson inference runtime | **CLOSED at 3/N (batch 1 closed 2026-05-08)** — top-2 documentary leads + mandatory simple-baseline COMPLETE; **Cand 1 RECOMMENDED PRIMARY** | **Cand 1 (RECOMMENDED PRIMARY)**: TensorRT native — JetPack 6.2 bundled TensorRT 10.3 + `IInt8EntropyCalibrator2` + `BuilderFlag.FP16+INT8` mixed-precision + engines built directly on Jetson Orin Nano Super SM 87 (clean Apache-2.0 in TensorRT 10.x; ships with JetPack so zero-effort install; lowest-latency primary path; 2-3× speedup at INT8 vs FP16 per Source #102 YOLO26 evidence); **Cand 2 
(interop alternate)**: ONNX Runtime + TensorRT EP — `onnxruntime-gpu` via Jetson AI Lab JP6/CU126 wheel index + `TensorrtExecutionProvider` config + automatic CUDA EP / CPU EP subgraph fallback (clean MIT throughout; cross-architecture portability for replay/SITL on x86 dev hosts; modern-competitive-lead-cross-architecture-portability); **Cand 3 (mandatory simple-baseline)**: pure PyTorch FP16 — `torch.amp.autocast` + `model.half()` + Jetson AI Lab PyTorch 2.5 ARM64 wheel (clean BSD-3-Clause throughout; zero-conversion regression baseline; reference-correctness oracle for accuracy validation of TRT-built engines) | INT8-only candidates marked Experimental until D-C7-1 calibration dataset materializes; matchers (LightGlue, XFeat, XFeat+LighterGlue) are FP16-only — NO INT8 — per D-C7-6 cross-component model-family precision policy due to Source #103 quantization-sensitivity finding | | [`C8_fc_adapter.md`](C8_fc_adapter.md) | **C8** — MAVLink / MSP2 FC adapter | **CLOSED at 3/N (batch 1 closed 2026-05-08)** — top-1 per FC for ArduPilot + parallel-evaluation per FC for iNav after mid-batch contradiction recovery COMPLETE; **Cand 1 RECOMMENDED PRIMARY for AP, Cand 2 RECOMMENDED PRIMARY for iNav** | **Cand 1 (RECOMMENDED PRIMARY for ArduPilot)**: pymavlink → MAVLink `GPS_INPUT` (msg 232) cooperative-path; `master.mav.gps_input_send(...)` periodic injection at 5 Hz over MAVLink (UART/USB/UDP); FC-side `GPS1_TYPE=14` MAVLink + `EK3_SRC1_POSXY=3` GPS source-set drives EKF3 ingestion via `AP_GPS_MAV` (LGPL-3.0 pymavlink linkable from Apache-2.0 app per LGPL §6; canonical ArduPilot stack); **Cand 2 (RECOMMENDED PRIMARY for iNav)**: `MSP2_SENSOR_GPS` (id 7939 / 0x1F03) via Python MSP V2 implementation YAMSPy or INAV-Toolkit `msp_v2_encode`; `mspGPSReceiveNewData()` direct passthrough; covariance fields `hPosAccuracy/vPosAccuracy/hVelAccuracy` align directly with AP `GPS_INPUT.horiz_accuracy/vert_accuracy/speed_accuracy` (MIT throughout; clean dual-use compatible; locked SQ6 + 
AC-4.3 transport); **Cand 3 (DEFERRED secondary for iNav)**: UBX impersonation via pyubx2 NAV-PVT — forging u-blox NAV-PVT frames through standard GPS pipeline; iNav-side `gpsMapFixType()` validation gate requires `flags & 0x01 = 1` (gnssFixOK) AND `fixType ∈ {2,3}`; pyubx2 BSD-3-Clause; **does NOT clear user's "significant-improvement-only" bar over Cand 2** (richer protocol surface + AC-NEW-7 forgery posture + stricter validation gate + AP-path field-name divergence outweigh pyubx2 library-maturity advantage). **Mid-batch correction**: I caught a contradiction between my own initial AskQuestion phrasing ("UBX impersonation as ONLY iNav path") and locked SQ6 + AC-4.3 + restrictions.md verdicts (MSP2_SENSOR_GPS as iNav primary); user re-locked scope via `c8_inav_recovery=B` to evaluate both as parallel candidates | (none yet — pymavlink LGPL-3.0 license posture handled via D-C8-3 = (a) bundle-unmodified-with-version-pin per LGPL §6 standard compliance) | | [`C10_preflight_provisioning.md`](C10_preflight_provisioning.md) | **C10** — Pre-flight cache provisioning (CROSS-COUPLING MINIMAL scope per 2026-05-08 user choice C; operator CLI/desktop tooling, sector classification, freshness schema deferred to Plan-phase) | **CLOSED at 2/N (batch 1 closed 2026-05-08)** — D-C6-3 + D-C7-7 cross-component gates closed; no further C10 batches required at research layer | **D-C6-3 confirmation**: direct `faiss.write_index` / `faiss.read_index` Python API + `python-atomicwrites` + content-hash verification gate at takeoff + manifest-hash-driven rebuild trigger + `IO_FLAG_MMAP_IFC` mmap load (FAISS MIT, atomicwrites MIT throughout); **D-C7-7 confirmation**: hybrid Polygraphy CLI primary for INT8-calibrating builds + `trtexec` for cache-reuse fast rebuilds + direct `IBuilderConfig` Python API for unusual models (LightGlue dynamic shapes) — Polygraphy + TensorRT 10.x Apache-2.0 throughout, calibration corpus per D-C7-1 closure | (none — both candidates Apache-2.0/MIT clean; FAISS "no 
internal integrity check" warning mitigated by content-hash gate; `trtexec --int8` random-data caveat mitigated by project-side wrapper enforcing `--calib=` non-empty precondition) | +| [`MODEB_revisions.md`](MODEB_revisions.md) | **Mode B revisions overlay (2026-05-08)** — solution_draft01 assessment | Overlay file with revised candidate-row statuses + new D-Cx-y gates surfaced by Mode B findings F1–F20 (Facts #102–#113). VINS-Mono license-track-only on D-C1-1 = (a)/(c); KLT+RANSAC re-labelled mandatory simple-baseline (per Mode A C1 Fact #35); UltraVPR Documentary-Lead PRIMARY + MegaLoc Documentary-Lead SECONDARY on BSD/permissive C2 axis (D-C2-11 revised); D-C8-2 downgraded to `Selected with runtime gate` (SITL validation gate before lock); OpenCV pin tightened to ≥4.12.0; new sub-stages added (Top-N inlier re-rank between C2 and C3; AdHoP-conditional refinement between C3 and C4); new gates D-C2-12 (DINOv2-feature matcher), D-C8-9 (MAVLink-2.0 message-signing per FC), D-CROSS-LATENCY-1 (AC-4.1 budget partition), D-CROSS-CVE-1 (dependency security pinning), D-PROJ-1 (camera calibration acquisition), D-PROJ-2 (Suite Sat Service voting-layer contract verification); new tests IT-11 (smoothing-loop look-back), NFT-8 (signing verification), NFT-9 (hot-soak latency distribution). 
| n/a | | [`99_cross_component_gates.md`](99_cross_component_gates.md) | **Cross-component process gates** | Open — Plan-phase Choose blocks raised by C1+C2+C3+C4+C5+C6+C7+C8+C10 closures | D-C1-1 license posture, D-C1-2 Jetson MVE, D-C2-1..11 (VPR retrain/cache/dim), D-C3-1..6 (matcher mitigation/runtime/K-pairs/ALIKED-mode/DISK-weights/XFeat-mode), D-C4-1..4, **D-C5-1..5 (Manual ESKF + GTSAM iSAM2)**, **D-C6-1..7**, **D-C7-1..9**, **D-C8-1..8**, **D-C10-1 (descriptor-cache rebuild trigger — manifest-hash-driven recommended, NEW from Fact #100)**, **D-C10-2 (descriptor-cache atomic-write strategy — `python-atomicwrites` recommended, NEW from Fact #100)**, **D-C10-3 (content-hash verification gate at takeoff load — reject + STATUSTEXT + refuse takeoff recommended, NEW from Fact #100, CROSS-COMPONENT with AC-NEW-7)**, **D-C10-4 (descriptor-cache load path — mmap with `madvise(MADV_WILLNEED)` pre-fault recommended, NEW from Fact #100)**, **D-C10-5 (TensorRT engine-build orchestration tool — hybrid Polygraphy + trtexec + direct API recommended, NEW from Fact #101, CROSS-COMPONENT with C7)**, **D-C10-6 (TensorRT calibration-cache reuse strategy — rebuild-on-calib-corpus-SHA-256-change recommended, NEW from Fact #101, CROSS-COMPONENT with D-C7-1)**, **D-C10-7 (TensorRT engine on-disk filename schema — self-describing `_sm_jp_trt_.engine` recommended, NEW from Fact #101)**, **D-C10-8 (TensorRT prebuilt-fallback engine generation venue — reference Jetson at HQ + deployed-Jetson-copy-to-archive recommended, NEW from Fact #101)**, Fact #40 dual-rate camera pipeline | n/a | --- diff --git a/_docs/00_research/06_component_fit_matrix/99_cross_component_gates.md b/_docs/00_research/06_component_fit_matrix/99_cross_component_gates.md index 9223225..95af73c 100644 --- a/_docs/00_research/06_component_fit_matrix/99_cross_component_gates.md +++ b/_docs/00_research/06_component_fit_matrix/99_cross_component_gates.md @@ -3,6 +3,8 @@ > Mode A Phase 2 — engine Step 7.5 (Component 
Applicability Gate). Plan-phase Choose blocks raised by C1, C2, C3, C4, C5, C6, C7, C8, and C10 closures. Each gate names its owner and the resolution path. Backing fact cards live in [`../02_fact_cards/`](../02_fact_cards/) by component. > > Index: [`00_summary.md`](00_summary.md). Per-component rows: [C1](C1_vio.md), [C2](C2_vpr.md), [C3](C3_matchers.md), [C4](C4_pose_estimation.md), [C5](C5_state_estimator.md), [C6](C6_tile_cache_spatial_index.md), [C7](C7_inference_runtime.md), [C8](C8_fc_adapter.md), [C10](C10_preflight_provisioning.md). C9 dropped per 2026-05-08 restructure — see `../00_question_decomposition.md`. +> +> **Mode B overlay (2026-05-08)**: this file preserves the Mode A audit trail. NEW gates raised by Mode B Solution Assessment of `_docs/01_solution/solution_draft01.md` are catalogued in [`MODEB_revisions.md`](MODEB_revisions.md) — specifically D-C2-12 (DINOv2-feature matcher evaluation), D-C8-2-FALLBACK (companion-driven EKF source switch fallback if SITL validation fails), D-C8-9 (MAVLink-2.0 message-signing per FC), D-CROSS-LATENCY-1 (AC-4.1 latency budget partition strategy), D-CROSS-CVE-1 (dependency security pinning posture), D-PROJ-1 (camera calibration acquisition strategy), D-PROJ-2 (Suite Sat Service voting-layer contract verification). REVISED gates with Mode B evidence: D-C1-1 (VINS-Mono license re-confirmed GPL-3.0 — see Mode B Fact #102), D-C2-11 (UltraVPR + MegaLoc elevated from "deferred to post-research" to "Documentary Lead PRIMARY + SECONDARY" — see Mode B Fact #110), D-C8-2 (downgraded to `Selected with runtime gate` — see Mode B Fact #111). Read [`MODEB_revisions.md`](MODEB_revisions.md) alongside this file for the current gate state. 
--- diff --git a/_docs/00_research/06_component_fit_matrix/MODEB_revisions.md b/_docs/00_research/06_component_fit_matrix/MODEB_revisions.md new file mode 100644 index 0000000..b719ec3 --- /dev/null +++ b/_docs/00_research/06_component_fit_matrix/MODEB_revisions.md @@ -0,0 +1,72 @@ +# Component Fit Matrix — Mode B Revisions (2026-05-08) + +> Mode B Solution Assessment of `_docs/01_solution/solution_draft01.md`. Revisions to specific candidate-row statuses + new D-Cx-y gates surfaced by Mode B findings F1–F20. +> +> Index: [`00_summary.md`](00_summary.md). Mode B fact cards: [`../02_fact_cards/MODEB_addendum.md`](../02_fact_cards/MODEB_addendum.md). Mode B sources: [`../01_source_registry/MODEB_addendum.md`](../01_source_registry/MODEB_addendum.md). Mode B output: [`../../01_solution/solution_draft02.md`](../../01_solution/solution_draft02.md). +> +> The original Mode A row files [`C1_vio.md`](C1_vio.md) ... [`C10_preflight_provisioning.md`](C10_preflight_provisioning.md) + [`99_cross_component_gates.md`](99_cross_component_gates.md) remain canonical. This file overlays revisions; where this file disagrees with the originals, this file wins (the original Mode A files are not retroactively edited so the audit trail is preserved). 
+
+---
+
+## Status changes per candidate row
+
+| Component | Candidate | Mode A status (verbatim from row file) | Mode B revised status | Reason |
+|---|---|---|---|---|
+| **C1** | VINS-Mono | "Selected (mandatory simple-baseline) — fallback if OKVIS2 fails Jetson MVE" + "Security: BSD permissive clean" | **`Selected via VioStrategy interface for comparative study + research/dev builds`**; production-deployed only if D-C1-1-SUB-A resolves to non-(a) AND IT-12 confirms VINS-Mono outperforms OKVIS2; **license corrected from "BSD permissive clean" to "GPL-3.0 (copyleft viral)"** | Fact #102 + 2026-05-08 user directive — Mode A C1 Fact #28 already correctly classified VINS-Mono as GPL-3.0; the BSD label was a Step-8 deliverable-formatting error in solution_draft01. User directive elevates VINS-Mono into the production design as a comparative-study sibling behind a `VioStrategy` interface. Source #122 confirms canonical GPL-3.0; see new D-C1-1-SUB-A below for viral-linkage containment policy. |
+| **C1** | KLT+RANSAC homemade fallback | "Selected (project-internal homemade fallback) — used when OKVIS2/VINS-Mono unavailable" | **`Selected (mandatory simple-baseline)`** — wrapped as `KltRansacVioStrategy` behind the `VioStrategy` interface | Mode A C1 Fact #35 + 2026-05-08 user directive: KLT+RANSAC is the engine-required mandatory simple-baseline AND is wrapped as a third `VioStrategy` so the comparative study (IT-12) covers the engine-required baseline alongside OKVIS2 + VINS-Mono. |
+| **C1** | (NEW interface) `VioStrategy` interface | n/a | **`Selected (NEW architectural component per 2026-05-08 user directive on A1)`** | NEW: pluggable Strategy/Adapter pattern hosting `Okvis2VioStrategy`, `VinsMonoVioStrategy`, `KltRansacVioStrategy`. Selection is config-driven at startup; FDR (AC-NEW-3) records active strategy `name()` + `license()` per flight.
Interface owns "produce `VioOutput` from frame + IMU window per the strategy's algorithm"; per-strategy concerns live in concrete implementations per the coderule SRP (single-responsibility) rule. |
+| **C2** | (NEW) UltraVPR (cbbhuxx/UltraVPR, MIT) | n/a (D-C2-11 deferred to post-research) | **`Documentary lead PRIMARY on BSD/permissive C2 axis`**; mandatory Jetson MVE under D-C1-2 / D-C2-4 expanded scope | Fact #110 — RAL 2025 / ICRA 2026 publication; MIT license; **44 Hz on Jetson Orin NX (Orin-Nano-Super-class)**; rotation-invariant (multi-heading aerial flights); unsupervised aerial pretrain (closes D-C2-1 retrain cost); validated on VPAir + UAV-VisLoc datasets. Source #124. |
+| **C2** | (NEW) MegaLoc (gmberton/megaloc, MIT) | n/a (D-C2-11 deferred to post-research) | **`Documentary lead SECONDARY (broader-applicability)`** on BSD/permissive C2 axis; mandatory Jetson MVE | Fact #110 — CVPR 2025 publication; MIT license; SOTA on multiple VPR benchmarks; aerial-validated via AirZoo benchmark (Source #125). Distributed via torch.hub. |
+| **C2** | MixVPR | "Selected (mandatory simple-baseline + recommended primary on BSD/permissive track)" | **`Selected (mandatory simple-baseline)`**; demoted from "recommended primary" to "mandatory baseline" — UltraVPR is the new BSD/permissive PRIMARY recommendation | Fact #110 — MixVPR remains a valid candidate, but UltraVPR's UAV-pretrain + Jetson-runtime evidence dominates on the project's pinned operating context. MixVPR retained as mandatory baseline per Component Option Breadth rule. |
+| **C2** | SelaVPR | "Selected (modern-competitive-lead BSD/permissive two-stage) — eligible if D-C2-7 re-rank strategy chosen" | **`Selected (modern-competitive-lead BSD/permissive secondary)` STRENGTHENED** | Fact #113 — XoFTR / SAR-optical benchmark contrarian evidence reinforces foundation-model (DINOv2) backbone preference for cross-domain registration; SelaVPR's DINOv2-L backbone is well-positioned even without aerial retrain.
| +| **C8** | pymavlink → MAVLink GPS_INPUT (AP) | "Selected (recommended-primary) for ArduPilot Plane" | **`Selected (recommended-primary) for ArduPilot Plane` + NEW security mitigation requirement: D-C8-9 MAVLink 2.0 message signing on companion ↔ AP wired channel** | Fact #109 — CVE-2026-1579 CVSS 9.8 CRITICAL. ArduPilot supports MAVLink 2.0 message signing (Source #128). Draft01 had no signing-posture decision; Mode B raises D-C8-9 as new gate. | +| **C8** | D-C8-2 = (b) companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` switch | (Recommendation: implicit `Selected` via D-C8-2 = (b) being the recommended pattern) | **`Selected with runtime gate`** per Step 7.5.3 carve-out — runtime gate = SITL validation by IT-3 before lock | Fact #111 — pattern is firmware-supported but no production-deployed precedent; the project will be establishing the canonical pattern itself. Carve-out is for runtime-quality validation, not API capability. | +| **C4** | OpenCV `cv::solvePnPRansac` + GTSAM `Marginals` (D-C4-2 = (b)) | "Selected (mandatory simple-baseline + recommended-primary covariance recovery via GTSAM)" + "OpenCV 4.x" | **`Selected` + dependency pin updated to `OpenCV ≥4.12.0`** per CVE-2025-53644 mitigation | Fact #112 — OpenCV CVE-2025-53644 CVSS 9.8 CRITICAL on 4.10.0 / 4.11.0; fixed in 4.12.0. Single-line pin change with no API break. | +| **C5** | GTSAM iSAM2 (AC-4.5 internal smoothing) | "Selected (modern-competitive-lead-factor-graph + recommended primary path) — couples NATIVELY with C4 GTSAM Marginals via D-C5-5 = (c)" | **`Selected` + AC-4.5 scope clarification: internal smoothing only, NOT FC retroactive correction** | Fact #107 — GTSAM iSAM2 NATIVE look-back refinement value is internal-only; ArduPilot `AP_GPS_MAV` and iNav `mspGPSReceiveNewData()` consume only the latest frame; FC log is forward-time only. AC-4.5 satisfied as "internal estimator refines past + emits corrected current estimate", not as "FC retroactively corrects past flight log". 
| +| **All C-rows** | (cross-cutting) | (no AC-4.1 latency partition) | **NEW cross-cutting D-CROSS-LATENCY-1: AC-4.1 latency budget partition strategy** | Fact #103 — draft01's own runtime math (~140-420 ms p95) exceeds AC-4.1 (400 ms) at upper end with no slack reservation. Recommendation: hybrid K=3 default + auto-degrade to K=2 + Jacobian-covariance under thermal throttle. | + +--- + +## Architecture-level additions (new sub-stages absent from solution_draft01) + +| Sub-stage | Position in pipeline | Recommended candidate | Source | +|---|---|---|---| +| **Top-N inlier-based re-rank** (was promised by SQ2 Decision 3 but absent from solution_draft01) | Between C2 (VPR top-K) and C3 (matcher) | Thin wrapper around C3 matcher's RANSAC inlier counter; rank top-K candidates by inlier count from a single-pair LightGlue / XFeat invocation per candidate; output top-N ⊆ top-K for full-depth C3 matching | Fact #108; SQ2 Decision 3; Mode A SQ2 Source #38–#42 | +| **AdHoP-conditional refinement** (was promised by SQ2 Decision 2 but absent from solution_draft01) | Between C3 (matcher) and C4 (PnP) | OrthoLoC AdHoP method-agnostic perspective preconditioning per Mode A SQ2 Source #40; invoked only when initial reprojection error exceeds a threshold; worst-case 2× C3 latency when triggered | Fact #108; SQ2 Decision 2; Mode A SQ2 Source #40 | + +--- + +## New cross-component / project-level Plan-phase gates (overlay onto `99_cross_component_gates.md`) + +| Gate | Owner | Resolution path | +|---|---|---| +| **D-C1-1 (REVISED with Fact #102 evidence)** license-track posture | User | No change to gate; evidence updated — VINS-Mono is GPL-3.0 (not BSD as draft01 listed); C1 BSD/permissive-track lead remains OKVIS2 (per Mode A C1 Fact #31 unchanged) | +| **D-C1-1-SUB-A (LOCKED 2026-05-08 by User to option (a)) — VINS-Mono GPL-3.0 viral-linkage containment policy** | User (locked); Plan-phase implements | Production binary built with `BUILD_VINS_MONO=OFF` → only `Okvis2VioStrategy` 
+ `KltRansacVioStrategy` linked → BSD-clean. Research/dev binary built with `BUILD_VINS_MONO=ON` → all three strategies linked → enables IT-12 comparative study + docs report. CI publishes both binaries; production CI job verifies via SBOM dump that no `vins_mono` GPL-3.0 symbol is present. CMake spec: `option(BUILD_VINS_MONO "Include VINS-Mono GPL-3.0 VioStrategy implementation; production builds MUST set OFF" OFF)`. Plan-phase scope: CMake flag + CI pipeline split (~1 day engineering). Options (b) process-isolation IPC and (c) accept D-C1-1 = (a) GPL-3.0 entire binary considered and rejected — see solution_draft02 § C1 D-C1-1-SUB-A locked-verdict table for trade-off rationale. | +| **D-C2-11 (REVISED with Fact #110 evidence)** UltraVPR + MegaLoc evaluation as Documentary Lead candidates | User + Plan-phase architect | (a) elevate UltraVPR to Documentary Lead PRIMARY on BSD/permissive C2 axis; (b) elevate MegaLoc to Documentary Lead SECONDARY (broader-applicability); (c) preserve closed pre-screen (5/5: MixVPR + SALAD + SelaVPR + NetVLAD + EigenPlaces) as fallback. Mandatory Jetson MVE under D-C1-2 / D-C2-4 expanded scope. | +| **D-C2-12 (NEW from Mode B Fact #113)** DINOv2-backbone feature-extractor evaluation for cross-domain matching | Plan-phase architect + C3 owner | Plan-phase decision: defer to Jetson MVE phase; potentially closes D-C3-1 retrain cost via DINOv2-feature-based matcher (e.g., DINOv2 + LightGlue or DINOv2 + paired matcher) without requiring D-C2-1 aerial retrain. Carryforward research item. | +| **D-C8-2 (REVISED with Fact #111)** companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` ownership pattern | Plan-phase architect + AC-NEW-2 owner | Recommendation unchanged ((b) companion publishes to source-set 2 + auto-switches FC), but **status downgraded to `Selected with runtime gate`** per Step 7.5.3 carve-out — runtime gate = ArduPilot Plane SITL validation by IT-3 (Spoofing-promotion latency) before lock. 
NEW sub-decision **D-C8-2-FALLBACK** if SITL validation fails: (a) operator-manual RC aux switch option 90 with relaxed AC-NEW-2 wording; (b) operator-warning STATUSTEXT instead of automated switch; (c) escalate to ArduPilot dev community. | +| **D-C8-9 (NEW from Mode B Fact #109)** MAVLink 2.0 message signing posture per FC | Plan-phase architect + security owner | Plan-phase decision: (a) signing on ALL MAVLink channels (over-engineered for the wired companion link); (b) signing on companion ↔ AP wired channel only; (c) accept unsigned default (rejected per CVE-2026-1579 Critical CVSS); (d) **(RECOMMENDED) hybrid: signing on companion ↔ AP wired channel + per-flight key rotation**. Cross-FC asymmetry: iNav has no signing option (Source #129) — explicit residual risk; propose iNav firmware feature-request as Plan-phase carryforward. NEW NFT-8 — MAVLink message-signing verification: SBOM dump confirms passkey configuration for AP signing channel. | +| **D-CROSS-LATENCY-1 (NEW from Mode B Fact #103)** AC-4.1 latency budget partition strategy | Plan-phase architect + project bring-up team | Plan-phase decision: (a) tighten K=3 to K=2 to recover ~30-60 ms; (b) drop GTSAM `Marginals` from RUNTIME path and use Jacobian-covariance per D-C4-2 = (a) to recover ~20-60 ms; (c) accept budget overrun and validate at Jetson MVE that p95 lands under 400 ms in practice; (d) **(RECOMMENDED) hybrid: K=3 default + auto-degrade to K=2 + Jacobian-covariance under thermal throttle**. **Validation gate**: Jetson MVE measurement of full p95+p99 distribution under hot-soak NFT-3 conditions (25 W @ +50 °C for 8 h) before lock. 
| +| **D-CROSS-CVE-1 (NEW from Mode B Fact #112)** dependency security pinning posture | Plan-phase architect + security owner | Plan-phase decision: (a) **(RECOMMENDED)** lock to specific patched versions of all CVE-affected dependencies (OpenCV ≥4.12.0; FAISS — no CVEs; GTSAM — no CVEs; TensorRT 10.3 in JetPack 6.2 — no CVE-applicable since not using TRT-LLM 0.x; pymavlink — no CVEs published in repo at access time 2026-05-08); (b) maintain a project SBOM with monthly CVE re-scan; (c) automate pinning via dependabot or equivalent. Recommendation: (a) + (b). | +| **D-PROJ-1 (NEW from Mode B Fact #104)** Camera calibration acquisition strategy | User + project bring-up team | Plan-phase decision: (a) checkerboard calibration on a pre-deployment ADTi 20MP 20L V1 nav-camera unit (~1-2 days engineering + lab access); (b) photogrammetric self-calibration from first ~50 deployment frames over known landmarks (~2-3 days plus runtime support code; degrades first-mission accuracy); (c) request manufacturer's factory-calibration data sheet from ADTi (low cost if available; risk: vendor may not publish per-unit calibration); (d) **(RECOMMENDED) hybrid**: factory data sheet + ground-truth checkerboard refinement on each deployed unit. **CRITICAL Plan-phase gate**: hard prerequisite for AC-1.1/1.2 frame-center-accuracy validation; Test Spec greenfield Step 5 cannot lock end-to-end accuracy fixtures without it. | +| **D-PROJ-2 (NEW from Mode B Fact #105)** Suite Sat Service voting-layer contract verification | User + parent-suite Satellite Service team | Plan-phase decision: (a) verify Suite Service voting layer is documented + scheduled for the deployment timeframe; (b) draft the contract from the onboard side and propose to the Suite Service team; (c) build a project-internal multi-flight aggregator as stop-gap (~2-3 weeks engineering, cross-suite scope creep); (d) accept that AC-NEW-7 Service-side validation is best-effort and document the gap. 
**(RECOMMENDED) (a) verify + (b) draft in parallel** — contract definition is small (per-tile quality metadata schema + voting threshold spec). **CRITICAL cross-suite gate**: requires coordination with parent-suite Satellite Service team before AC-NEW-7 NFT-5 can pass with end-to-end evidence. | + +--- + +## Testing Strategy additions + +| Test ID | Purpose | New for Mode B? | +|---|---|---| +| **IT-11 — Smoothing-loop look-back accuracy** | Validate GTSAM iSAM2's smoothed past-keyframe poses against ground-truth at smoothing convergence (independent of FC-side consumption). FDR (AC-NEW-3) MUST log smoothed past-frame estimates so post-mission analysis can verify AC-4.5. | NEW (Fact #107) | +| **NFT-8 — MAVLink message-signing verification** | SBOM dump confirms passkey configuration for AP signing channel; iNav side documents the unsignable-link as accepted residual risk per D-C8-9. | NEW (Fact #109) | +| **NFT-9 — Hot-soak latency distribution** (extends NFT-3) | Measure end-to-end p95 + p99 latency distribution under hot-soak NFT-3 conditions (25 W @ +50 °C for 8 h); validate D-CROSS-LATENCY-1 hybrid degradation behaves correctly (K=3 → K=2 + Jacobian-covariance under thermal throttle). | NEW (Fact #103) | +| **IT-1 (revised)** | Pipeline smoke now must clarify which datasets exercise which AC subsets per `_docs/00_problem/input_data/expected_results/results_report.md` § Known Gaps: still-image set is for AC-1.1/1.2 frame-center geolocation accuracy ONLY; Derkachi video is for runtime cadence + VIO + replay; neither is sufficient by itself for end-to-end AC-4.1 latency validation under production cadence + altitude + calibration. 
| Revised (clarify dataset purpose mapping) | +| **IT-12 — VIO comparative study** | Replay same flight footage through all three `VioStrategy` implementations (`Okvis2VioStrategy`, `VinsMonoVioStrategy`, `KltRansacVioStrategy`) in the research/dev build; emit side-by-side AC-1.3 / AC-2.1a / AC-NEW-4 / AC-4.1 / AC-4.2 / SBOM table; published to `_docs/02_document/vio-comparative-study.md`; production-selection gate for D-C1-1-SUB-A. | NEW (2026-05-08 user directive on A1) | + +--- + +## Editing rules (preservation of audit trail) + +1. The original Mode A row files (`C1_vio.md` through `C10_preflight_provisioning.md`) and `99_cross_component_gates.md` are NOT retroactively edited — they preserve the Mode A audit trail. +2. Where this Mode B revisions file disagrees with the originals, this file wins. Future Mode B / Plan-phase consumers should read this overlay file alongside the original row files. +3. New Plan-phase decisions raised by Mode B (D-C2-12, D-C8-9, D-CROSS-LATENCY-1, D-CROSS-CVE-1, D-PROJ-1, D-PROJ-2) are catalogued here and in `solution_draft02.md` § Open decisions. Future Mode B / Plan-phase invocations should append to either this file or its sibling `99_cross_component_gates.md` (preferred for Plan-phase consumption) — not modify entries written here. diff --git a/_docs/01_solution/solution.md b/_docs/01_solution/solution.md new file mode 100644 index 0000000..f70f5f0 --- /dev/null +++ b/_docs/01_solution/solution.md @@ -0,0 +1,399 @@ +# Solution Draft (Mode B revision 02) + +> Mode B Phase 2 — engine Step 8 (Deliverable Formatting). Revised solution that supersedes [`solution_draft01.md`](solution_draft01.md) by integrating the Mode B Solution Assessment findings F1–F20 (Mode B fact cards #102–#113, Mode B sources #122–#131). +> +> **Research Output Class**: Technical-component selection (per [`../00_research/00_question_decomposition.md`](../00_research/00_question_decomposition.md)). 
+> +> **Mode**: B (assessment & revision of an existing draft). User chose option A on the Research Decision gate (2026-05-08). solution_draft01 remains on disk as the Mode A audit trail; this file overlays revisions only — it does NOT delete or rewrite solution_draft01. +> +> Backing artifacts (Mode A + Mode B addenda): +> - Question decomposition + scope: [`../00_research/00_question_decomposition.md`](../00_research/00_question_decomposition.md) +> - Source registry: [`../00_research/01_source_registry/00_summary.md`](../00_research/01_source_registry/00_summary.md) (#1–#121 Mode A; **Mode B addendum #122–#131** in [`MODEB_addendum.md`](../00_research/01_source_registry/MODEB_addendum.md)) +> - Fact cards: [`../00_research/02_fact_cards/00_summary.md`](../00_research/02_fact_cards/00_summary.md) (#1–#101 Mode A; **Mode B addendum #102–#113** in [`MODEB_addendum.md`](../00_research/02_fact_cards/MODEB_addendum.md)) +> - Component fit matrix: [`../00_research/06_component_fit_matrix/00_summary.md`](../00_research/06_component_fit_matrix/00_summary.md) + Mode A row files (`Cx_*.md`) + cross-gates [`99_cross_component_gates.md`](../00_research/06_component_fit_matrix/99_cross_component_gates.md); **Mode B revisions overlay** in [`MODEB_revisions.md`](../00_research/06_component_fit_matrix/MODEB_revisions.md) +> - Project Constraint Matrix: [`../00_problem/problem.md`](../00_problem/problem.md), [`../00_problem/restrictions.md`](../00_problem/restrictions.md), [`../00_problem/acceptance_criteria.md`](../00_problem/acceptance_criteria.md), [`../00_problem/input_data/data_parameters.md`](../00_problem/input_data/data_parameters.md), [`../00_problem/input_data/expected_results/results_report.md`](../00_problem/input_data/expected_results/results_report.md) +> - Mode A draft (audit trail): [`solution_draft01.md`](solution_draft01.md) +> +> **Note on AC assessment** — same status as solution_draft01: the BLOCKING `00_ac_assessment.md` artifact was not extracted as a 
standalone file. Per-AC binding evidence remains distributed across per-component fact cards + Restrictions × Candidate-Modes sub-matrix sections in `06_component_fit_matrix/Cx_*.md`. Per `00_question_decomposition.md` line 4 this was a prior user decision and is accepted; Mode B Fact #106 documents the deviation explicitly and offers retroactive extraction on demand. + +--- + +## Assessment Findings + +Mode B audited solution_draft01 against the Project Constraint Matrix (PCM) and against 2025-2026 web research. The 20 Mode B findings collapse into 12 actionable revisions catalogued below. Full evidence chain in [`../00_research/02_fact_cards/MODEB_addendum.md`](../00_research/02_fact_cards/MODEB_addendum.md) Facts #102–#113. + +| # | Old (solution_draft01) | Weak point (functional / security / performance / process) | New (solution_draft02) | +|---|---|---|---| +| **A1 / Fact #102** | C1 candidate table lists VINS-Mono with cell `Security: BSD permissive clean` and status `Selected (mandatory simple-baseline)` | **Security/license error**: Mode A Fact #28 + canonical github.com LICENCE (Source #122) confirm VINS-Mono is GPL-3.0 (copyleft viral), not BSD. Step-8 deliverable-formatting error in Mode A. | C1 candidate table: VINS-Mono cell corrected to `Security: GPL-3.0 (copyleft viral) — eligible only on D-C1-1 = (a) or (c)`. KLT+RANSAC re-labeled `Selected (mandatory simple-baseline)` per Mode A C1 Fact #35 (was `Selected (project-internal homemade fallback)`). OKVIS2 remains BSD/permissive-track lead per Mode A C1 Fact #31. **NEW (user directive 2026-05-08, see § Architecture C1 note)**: BOTH OKVIS2 + VINS-Mono are implemented behind a pluggable `VioStrategy` interface with config-driven selection, so the project can publish a comparative-study report in official docs and pick the runtime-deployed implementation by measured performance. 
**NEW sub-decision D-C1-1-SUB-A** (User, hard gate — cannot be deferred to Plan): how to link VINS-Mono GPL-3.0 alongside OKVIS2 BSD without making the entire deployment binary GPL-3.0 by viral license. Options proposed: build-config exclusion (production binary = OKVIS2 only; research/dev build = both); process-isolation (VINS-Mono as separate binary over IPC, viral linkage stops at process boundary); accept D-C1-1 = (a) GPL-3.0 track for entire deployment binary. | +| **A2 / Fact #103** | Component-interaction diagram budgets ~140-420 ms p95 with the upper end **exceeding AC-4.1's 400 ms p95 budget**; no slack reserved for MAVLink serialization, OS scheduling jitter, thermal throttle, FAISS p99, or FC-side IMU pre-integration | **Performance gap**: 420 ms p95 violates AC-4.1 at the upper end with no documented project-side margin. Real production stack overheads (≥40-100 ms in tail conditions per Sources #97 + #115 + AC-NEW-5 thermal envelope) are not budgeted. | NEW Plan-phase decision **D-CROSS-LATENCY-1** added: hybrid K=3 default + auto-degrade to K=2 + Jacobian-covariance under thermal throttle. NEW test **NFT-9 — Hot-soak latency distribution** required to validate p95+p99 distribution under NFT-3 conditions before lock. Component-interaction diagram updated with explicit budget partition. | +| **A3 / Fact #104** | Architecture cites OpenCV `solvePnPRansac(K, dist, ...)` but does NOT inventory that **camera intrinsics `K` and distortion `dist` for the deployed ADTi 20MP 20L V1 nav camera are PROJECT-LEVEL OPEN ITEMS** per `_docs/00_problem/problem.md` last sentence + `flight_derkachi/README.md` + `expected_results/results_report.md` § Known Gaps | **Process / functional gap**: hard prerequisite for AC-1.1/1.2 frame-center accuracy validation is missing from the Plan-phase decision registry. End-to-end accuracy claims cannot be validated without it. | NEW project-level decision **D-PROJ-1** added: camera calibration acquisition strategy. 
Recommendation **(d) hybrid: factory data sheet + ground-truth checkerboard refinement on each deployed unit**. Surfaced as CRITICAL Plan-phase gate; Test Spec greenfield Step 5 cannot lock end-to-end accuracy fixtures without it. | +| **A4 / Fact #105** | Architecture's AC-NEW-7 cache-poisoning safety story relies on a **Suite Sat Service-side multi-flight ingest voting layer** that is not audited for existence, contract, or build status | **Security gap**: AC-NEW-7 NFT-5 evidence cannot end-to-end pass without the Service contract; cross-flight error compounding is unmitigated if the Service-side voting layer is missing or unimplemented. | NEW project-level decision **D-PROJ-2** added: Suite Sat Service voting-layer contract verification. Recommendation **(a) verify + (b) draft contract from onboard side in parallel**. Surfaced as CRITICAL cross-suite gate requiring coordination with parent-suite Satellite Service team. | +| **A5 / Fact #106** | "Note on AC assessment" (lines 17-18) acknowledges Mode A Phase 1 BLOCKING `00_ac_assessment.md` artifact was not produced | **Process gap (acknowledged)**: per research SKILL.md a BLOCKING gate cannot be silently skipped. Per `00_question_decomposition.md` line 4 it was a prior user decision. | Mode B preserves the deviation as accepted per prior user decision. Adds explicit note that retroactive extraction from the per-component sub-matrix is a low-cost (~1-2 hour) operation if the canonical artifact is wanted before Plan-phase. Recorded in `_docs/_process_leftovers/` if Plan-phase needs the standalone form. | +| **A6 / Fact #107** | Architecture states GTSAM iSAM2 satisfies AC-4.5 (look-back refinement of past estimates) without scoping the FC-consumption pathway; IT-10 validates per-FC unit conversion not AC-4.5 itself | **Functional scope error**: ArduPilot `AP_GPS_MAV` and iNav `mspGPSReceiveNewData()` consume only the latest GPS frame — neither supports retroactive correction of past frames. 
AC-4.5 satisfied as "internal smoothing + corrected current-frame emission", NOT as "FC retroactively corrects past flight log". | C5 candidate table Pinned Mode/Config column updated: AC-4.5 scope now reads "internal smoothing only, NOT FC retroactive correction". NEW test **IT-11 — Smoothing-loop look-back accuracy** added: validates GTSAM iSAM2's smoothed past-keyframe poses against ground-truth at smoothing convergence (independent of FC-side consumption). FDR (AC-NEW-3) MUST log smoothed past-frame estimates so post-mission analysis can verify AC-4.5. | +| **A7 / Fact #108** | Architecture omits the SQ2 Decision 2 **AdHoP refinement loop** (between matcher and PnP) and SQ2 Decision 3 **Top-N inlier-based re-rank** (between VPR and matcher) sub-stages that question_decomposition.md lines 175-178 explicitly promoted to "explicit named sub-stages" | **Architectural gap**: SQ2 closure committed to two named sub-stages that the architecture diagram and per-component tables do not reflect. | Component-interaction diagram updated with two new sub-stages. NEW C2.5 row "Top-N re-rank by inlier count" (between C2 and C3): thin wrapper around C3 matcher's RANSAC inlier counter; ranks top-K candidates by inlier count from a single-pair LightGlue/XFeat invocation per candidate; outputs top-N ⊆ top-K for full-depth C3 matching. NEW C3.5 row "AdHoP-conditional refinement" (between C3 and C4): OrthoLoC AdHoP method-agnostic perspective preconditioning per Mode A SQ2 Source #40; invoked only when initial reprojection error exceeds a threshold; worst-case 2× C3 latency when triggered. | +| **A8 / Fact #109** | Architecture has **no MAVLink message-signing posture**; CVE-2026-1579 (CVSS 9.8 CRITICAL) flags MAVLink protocol as lacking cryptographic authentication by default | **Security gap (Critical CVSS)**: arbitrary unauthenticated MAVLink commands can be injected, including SERIAL_CONTROL for interactive shell access on the FC. 
Documented mitigation: enable MAVLink 2.0 message signing (Source #128). iNav has no signing implementation (Source #129) — explicit cross-FC asymmetry. | NEW Plan-phase decision **D-C8-9** added: MAVLink 2.0 message signing posture per FC. Recommendation **(d) hybrid: signing on companion ↔ AP wired channel + per-flight key rotation**. iNav-side documented as accepted residual risk + Plan-phase carryforward to propose iNav firmware feature-request. NEW test **NFT-8 — MAVLink message-signing verification** added: SBOM dump confirms passkey configuration for AP signing channel. | +| **A9 / Fact #110** | C2 candidate table omits MegaLoc + UltraVPR (D-C2-11 deferred MegaLoc evaluation to "post-research session") | **Currency gap**: 2025-2026 SOTA candidates (MegaLoc CVPR 2025 MIT, UltraVPR RAL 2025 / ICRA 2026 MIT) are aerial-validated and Jetson-runtime-documented (UltraVPR 44 Hz on Jetson Orin NX). The deferred recommendation in solution_draft01 is now technically obsolete given Mode B web research. | C2 candidate table extended with **UltraVPR as Documentary Lead PRIMARY** on BSD/permissive C2 axis (rotation-invariant, unsupervised aerial pretrain, MIT, 44 Hz Jetson Orin NX) and **MegaLoc as Documentary Lead SECONDARY** (broader-applicability, MIT, torch.hub install, AirZoo-validated for aerial). Status of D-C2-11 changed from "deferred to post-research session" to "elevate UltraVPR primary + MegaLoc secondary at Plan-phase Jetson MVE under D-C1-2 / D-C2-4 expanded scope". MixVPR demoted from "recommended primary on BSD/permissive track" to "mandatory simple-baseline" only. SelaVPR (DINOv2-L) strengthened as secondary per Fact #113 cross-modal evidence. 
| +| **A10 / Fact #111** | D-C8-2 = (b) companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` switch is recommended as the production pattern, citing NGPS/Auterion as evidence | **Production-deployment gap**: Source #130 (re-verified 2026-05-08) confirms ArduPilot supports the command at firmware level since August 2021 but **no production-deployed GCS or companion is publicly documented as implementing the companion-driven switch pattern**. Mode A SQ1 Sources #25–#37 document that NGPS/Auterion exist as deployed systems but do NOT confirm their internal source-set switching mechanism. The project will be establishing the canonical pattern itself. | D-C8-2 status downgraded from `Selected` to **`Selected with runtime gate`** per Step 7.5.3 carve-out — runtime gate = ArduPilot Plane SITL validation by IT-3 (Spoofing-promotion latency) before lock. NEW sub-decision **D-C8-2-FALLBACK** added: if SITL validation fails, options (a) operator-manual RC aux switch with relaxed AC-NEW-2 wording; (b) operator-warning STATUSTEXT instead of automated switch; (c) escalate to ArduPilot dev community to characterize firmware-side switch latency. | +| **A11 / Fact #112** | C4 candidate table cites "OpenCV 4.x" without a minimum patch version | **Security gap (Critical CVSS)**: CVE-2025-53644 (CVSS 9.8) — uninitialized stack pointer on crafted JPEG triggers heap buffer write; affects 4.10.0 / 4.11.0; **fixed in 4.12.0**. C4 + C1 + FDR thumbnail re-load + tile cache import are all paths where a crafted JPEG could reach OpenCV's `imread` / `imdecode`. | OpenCV pin tightened to **`≥4.12.0`** in C1 + C4 + C6 candidate rows. NEW Plan-phase decision **D-CROSS-CVE-1** added: dependency security pinning posture. Recommendation **(a) lock to specific patched versions of all CVE-affected dependencies + (b) maintain a project SBOM with monthly CVE re-scan**. 
| +| **A12 / Fact #113** | C2 + C3 cross-domain story rests on D-C3-1 = (a) DISK+LightGlue retrain on aerial-domain corpus to close UAV→satellite gap | **Currency caveat (Medium confidence)**: 2026 SAR-optical 24-matcher benchmark + XoFTR research (Source #131) found foundation-model features (DINOv2) provide modality invariance even without explicit cross-modal training — this strengthens (does not invalidate) the case for keeping SelaVPR (DINOv2-L) as secondary alongside UltraVPR primary, and suggests a DINOv2-feature-based matcher could potentially close the cross-domain gap without the D-C2-1 ~1-2-week aerial retrain. | NEW Plan-phase decision **D-C2-12** added: DINOv2-backbone feature-extractor evaluation for cross-domain matching. Plan-phase decision: defer to Jetson MVE; potentially closes D-C3-1 retrain cost via DINOv2-feature-based matcher (e.g., DINOv2 + LightGlue or DINOv2 + paired matcher) without requiring D-C2-1 aerial retrain. Carryforward research item. | + +**Out-of-scope for Mode B revision** (no changes vs solution_draft01): +- C1 OKVIS2 selection (Mode A C1 Fact #31 confirmed; no Mode B contradicting evidence). +- C3 DISK+LightGlue D-C3-1 = (a) recommendation (Mode B Fact #113 reinforces but does not displace). +- C5 GTSAM iSAM2 + Manual ESKF dual-track (Mode A Facts #88-#91 confirmed; Mode B Fact #107 only revises AC-4.5 scope wording). +- C6 mirror-of-suite-pattern primary (Mode A Fact #92 confirmed; OpenCV pin in C6 row updated per Fact #112). +- C7 TensorRT-native primary (Mode A Fact #94 confirmed). +- C8 pymavlink (AP) + MSP2_SENSOR_GPS (iNav) primary (Mode A Facts #97-#99 confirmed; D-C8-2 status revised per A10; D-C8-9 added per A8). +- C10 D-C6-3 + D-C7-7 confirmation pipelines (Mode A Facts #100-#101 confirmed). +- Existing/Competitor systems analysis (Mode A SQ1 saturated; no Mode B contradicting evidence). 
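
As one possible complement to the OpenCV `≥4.12.0` pin from A11, a startup guard could refuse to run against an unpatched build. The sketch below is illustrative only: `MIN_OPENCV`, `parse_version`, and `opencv_pin_satisfied` are hypothetical names, and nothing in the source says the project performs such a runtime check.

```python
import re
from typing import Tuple

# Minimum version taken from A11 (release carrying the CVE-2025-53644 fix).
MIN_OPENCV: Tuple[int, int, int] = (4, 12, 0)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'major.minor.patch', ignoring any suffix such as '-dev'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def opencv_pin_satisfied(
    version: str, minimum: Tuple[int, int, int] = MIN_OPENCV
) -> bool:
    """Compare numerically, so that 4.9.x sorts below 4.12.x."""
    return parse_version(version) >= minimum
```

At startup this would be called with `cv2.__version__`. Comparing integer tuples rather than strings avoids the lexicographic trap where `"4.9.0"` sorts after `"4.12.0"`.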
+ +--- + +## Product Solution Description + +A Jetson-Orin-Nano-Super-hosted companion-PC system that produces a GPS-equivalent WGS84 position estimate (with honest 6×6 covariance) for a fixed-wing UAV operating in a GPS-denied or GPS-spoofed environment, by fusing pre-flight-cached satellite tile imagery (from the parent-suite Azaion Satellite Service) with live nav-camera frames and FC-supplied IMU + attitude. + +The system implements the canonical hierarchical GPS-denied pipeline `retrieval → re-rank → matching → AdHoP-conditional refinement → pose → fusion` (per SQ2 surveys converging on this pattern + SQ2 Decisions 2+3 promoted to explicit named sub-stages, Sources #38–#42), runs on the pinned Jetson Orin Nano Super hardware (Source #105 hardware-tied constraints honored), and delivers the final pose to the FC via per-FC external-positioning interfaces — MAVLink `GPS_INPUT` for ArduPilot Plane (verified Source #4 + #106 + #107 + Mode B Fact #109 mitigation D-C8-9 = (d) MAVLink-2.0-signing-on-companion↔AP-wired-channel), MSP2 `MSP2_SENSOR_GPS` for iNav (verified Source #111 + #112 + #113; iNav has no signing per Mode B Fact #109 + Source #129 — accepted residual risk). PX4 is explicitly out of scope per `restrictions.md`. 
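
The stage ordering described above can be sketched as plain control flow. This is a minimal illustration, not the project's implementation: the `Candidate` type, the callable parameters, and the 2.0 px reprojection threshold are hypothetical placeholders. Only the stage order, the top-K=10 to top-N=3 narrowing, and the conditional refinement trigger come from this document.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Candidate:
    """One retrieved satellite tile (hypothetical shape)."""
    tile_id: str
    inliers: int = 0  # inlier count from the cheap single-pair match


def localize(
    retrieve: Callable[[int], List[Candidate]],              # C2: VPR top-K retrieval
    quick_inliers: Callable[[Candidate], int],               # C2.5: cheap inlier count
    match_and_pose: Callable[[Sequence[Candidate]], float],  # C3 + C4: reprojection error in px
    refine: Callable[[Sequence[Candidate]], float],          # C3.5: conditional refinement
    top_k: int = 10,
    top_n: int = 3,
    reproj_threshold_px: float = 2.0,  # assumed value, not taken from the source
) -> dict:
    # Retrieval: fetch the top-K candidates from the descriptor index.
    candidates = retrieve(top_k)
    # Re-rank: score each candidate by inlier count and keep the best top-N.
    scored = [Candidate(c.tile_id, quick_inliers(c)) for c in candidates]
    shortlist = sorted(scored, key=lambda c: c.inliers, reverse=True)[:top_n]
    # Matching + pose: full-depth matching on the shortlist only.
    error_px = match_and_pose(shortlist)
    # Conditional refinement: run only when the initial error is too high.
    refined = error_px > reproj_threshold_px
    if refined:
        error_px = refine(shortlist)
    return {
        "tiles": [c.tile_id for c in shortlist],
        "refined": refined,
        "error_px": error_px,
    }
```

Passing the stages in as callables keeps the sketch independent of any particular VPR model or matcher; the fusion step (C5) would consume the returned pose result and is omitted here.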
+ +### Component-interaction diagram (pre-flight + runtime, REVISED) + +``` +PRE-FLIGHT (operator-managed, on-Jetson) ───────────────────────────────────────── + parent-suite Satellite Service ─→ tile cache (PostgreSQL btree + filesystem) + ─→ C2 VPR backbone (TensorRT engine, INT8+FP16) + └─→ per-tile descriptors → FAISS HNSW index + (.index file written + via faiss.write_index + + atomicwrites + SHA-256 + content-hash gate) + ONNX models (C2/C2.5/C3/C3.5/C1) ─→ Polygraphy / trtexec / IBuilderConfig hybrid + orchestration → TensorRT engines + (.engine files, SM 87 / JetPack 6.2 / TRT 10.3) + Camera calibration ─→ D-PROJ-1 hybrid: factory K + dist data sheet from + ADTi + ground-truth checkerboard refinement on + each deployed unit (~1 day per unit) + +TAKEOFF LOAD (≤5 s, AC-NEW-1) ──────────────────────────────────────────────────── + FAISS read_index(IO_FLAG_MMAP_IFC) + content-hash verify → ready + IRuntime.deserializeCudaEngine per-engine → ready + MAVLink 2.0 signing key handshake (companion ↔ AP wired channel) → ready + +RUNTIME (3 Hz nav-camera, 100-200 Hz IMU; AC-4.1 <400 ms p95 budget) ────────────── + nav-camera frame ─→ C1 VioStrategy (config-selected: Okvis2 | + VinsMono [research-build only per D-C1-1-SUB-A] | + KltRansac) — production-default Okvis2 ~30-50 ms + ─→ C2 UltraVPR query → top-K=10 satellite tile retrieval + (D-C2-11 revised; UltraVPR primary) + (~5-10 ms via FAISS HNSW) + ─→ C2.5 Top-N re-rank by inlier count (NEW: SQ2 Dec 3) + single-pair LightGlue per candidate → top-N=3 + (~30-60 ms) + ─→ C3 DISK+LightGlue × N=3 pairs (D-CROSS-LATENCY-1 + hybrid: K=3 default, auto-degrade to K=2 under + thermal throttle) (~90-180 ms FP16) + ─→ C3.5 AdHoP-conditional refinement (NEW: SQ2 Dec 2) + invoked only if reprojection error exceeds threshold + (~+30-90 ms when + triggered) + ─→ C4 OpenCV ≥4.12.0 solvePnPRansac (D-C4-1 = (b) IPPE + flags) (~5-15 ms) + ─→ wrap in GTSAM Marginals (D-C4-2 = (b); + D-CROSS-LATENCY-1 hybrid auto-degrades to + 
Jacobian-based covariance via D-C4-2 = (a) under + thermal throttle) (~30-90 ms) + FC IMU + attitude ─→ C5 GTSAM iSAM2 + CombinedImuFactor + PriorFactorPose3 + (~2-5 ms per update at D-C5-5 = (c)) + └─→ posterior 6×6 covariance via Marginals + └─→ AC-4.5 internal smoothing (NOT FC-side + retroactive correction per Mode B Fact #107) + ─→ C8 per-FC unit conversion + ├─→ pymavlink GPS_INPUT (AP) + │ + MAVLink 2.0 signing + │ (D-C8-9 = (d)) + └─→ MSP2_SENSOR_GPS (iNav, + unsigned residual risk) + (5 Hz periodic) + + total runtime budget: see D-CROSS-LATENCY-1 partition below +``` + +### AC-4.1 latency budget partition (D-CROSS-LATENCY-1, NEW) + +| Stage | K=3 baseline (steady-state) | K=2 + Jacobian-cov (thermal-throttle hybrid) | NFT-9 measurement target | +|---|---|---|---| +| C1 OKVIS2 VIO | 30-50 ms | 30-50 ms | p95 ≤ 60 ms | +| C2 UltraVPR query | 5-10 ms | 5-10 ms | p95 ≤ 15 ms | +| C2.5 Top-N re-rank | 30-60 ms (1 single-pair LightGlue × top-K=10) | 30-60 ms | p95 ≤ 80 ms | +| C3 DISK+LightGlue × N | 90-180 ms (N=3) | 60-120 ms (N=2) | p95 ≤ 200 ms (steady) / ≤ 140 ms (thermal-throttle) | +| C3.5 AdHoP (conditional) | 0-90 ms (worst-case 2×) | 0-60 ms | p99 ≤ 100 ms when triggered | +| C4 solvePnPRansac | 5-15 ms | 5-15 ms | p95 ≤ 25 ms | +| C4 covariance recovery | 30-90 ms (GTSAM Marginals D-C4-2 = (b)) | 5-15 ms (Jacobian D-C4-2 = (a)) | p95 ≤ 100 ms (steady) / ≤ 25 ms (thermal-throttle) | +| C5 iSAM2 update | 2-5 ms | 2-5 ms | p95 ≤ 15 ms | +| MAVLink/MSP2 serialization + UART/USB transmission | 5-20 ms | 5-20 ms | p95 ≤ 30 ms | +| OS scheduling jitter | 10-30 ms | 10-30 ms | p99 ≤ 50 ms | +| **Project budget total** (sum of the rows above with the conditional C3.5 stage not triggered) | **207-460 ms** (steady-state) | **152-325 ms** (hybrid degraded) | p95 ≤ **400 ms** (AC-4.1 hard bound) | + +The hybrid auto-degrade is triggered by the Jetson's thermal-throttle telemetry crossing a configurable temperature/clock threshold (set per D-C7-9 JetPack 6.2 + TensorRT 10.3 lock); it preserves AC-4.1 satisfaction at +50 °C ambient 
(AC-NEW-5) at the cost of ~5-10% accuracy loss (NFT-4 false-position safety budget remains satisfied per Plan-phase Jetson MVE validation). + +--- + +## Existing/Competitor Solutions Analysis + +(Unchanged from solution_draft01 — Mode B web research did not surface new competitor systems.) + +| System | Class | Stack signature | Relation to this project | +|---|---|---|---| +| **Twist Robotics OSCAR** (Source #25) | Deployed peer (Ukraine theater) | Visual navigation companion; closed-source | Closest peer system; deployed in theater the project will operate in. Confirms operational viability of the canonical pipeline shape. | +| **Auterion Artemis / Skynode N** (Sources #31+#32) | Commercial deployed (Ukraine-tested) | Skynode N + Visual Navigation; 1000-mile deep-strike demonstrated; closed-source proprietary stack | Demonstrates Jetson-class hardware can host GPS-denied companion at deployed-mission scale. Validates the pinned hardware target. | +| **NGPS (snktshrma/ngps_flight)** (Source #33) | Open-source (ArduPilot GSoC 2024) | LightGlue + SuperPoint + UKF + VISION_POSITION_ESTIMATE | Closest open-source pipeline-match. Confirms ArduPilot Plane + visual-localization companion is operationally validated. **License gap**: relies on Magic Leap-noncommercial canonical SP weights — same hard disqualifier this project hits in D-C3-1, mitigated by D-C3-1 = (a) DISK+LightGlue swap. | +| **Vantor Raptor** (Source #30) | Commercial deployed | GPS-denied UAV navigation + coordinate extraction | Validates dual-purpose pose + object-localization output. Aligns with project AC-7.x object-localization requirements. | +| **DARPA FLA (T&E review)** (Source #35) | Defense program lineage | GPS-denied autonomy with onboard compute | Provides T&E reference for AC-NEW-4 false-position safety budget validation methodology. 
| +| **DSMAC / TERCOM lineage** (Source #36) | Defense legacy | Digital Scene Matching Area Correlator + Terrain Contour Matching | Historical proof point that the project's "match against pre-cached imagery" core idea predates modern CV by decades; modern equivalents (this project) trade hand-engineered correlators for learned VPR + matchers. | + +**Key delta vs existing systems** (REVISED): this project (a) supports both ArduPilot Plane AND iNav (no other open-source GPS-denied companion targets iNav per SQ6 saturation), (b) enforces an explicit AC-NEW-7 cache-poisoning safety budget across the descriptor cache + tile cache + Suite Sat Service pipeline (with D-PROJ-2 cross-suite contract verification), (c) ships an honest 6×6 posterior covariance per AC-NEW-4 via a GTSAM-shared-substrate hybrid (D-C4-2 + D-C5-5 + D-C8-8 cross-component coupling), and (d) **defends the companion ↔ FC link with MAVLink 2.0 message signing on ArduPilot per D-C8-9 = (d) — an explicit security-posture gain over NGPS / OSCAR / Skynode N which (per published material at time of access) do not document MAVLink message signing on the companion link**. + +--- + +## Architecture + +The solution is decomposed into nine components (C1–C8 + C10) plus two new sub-stages (C2.5 + C3.5) promoted from SQ2 closure (Mode B Fact #108). C9 was dropped in the SQ7/C9 restructure 2026-05-08 and deferred to Test Spec greenfield Step 5. Per-component candidate tables follow. **All "Selected" candidates have an MVE link in the Restrictions × Candidate-Modes sub-matrix sections** of [`../00_research/06_component_fit_matrix/Cx_*.md`](../00_research/06_component_fit_matrix/) per Step 7.5.3 decision rules. **Mode B revisions** to candidate-row statuses are catalogued in [`MODEB_revisions.md`](../00_research/06_component_fit_matrix/MODEB_revisions.md); only the rows with Mode B-revised cells are reproduced below for brevity (unchanged rows mirror solution_draft01 verbatim). 
+ +### Component: C1 — Visual / Visual-Inertial Odometry (REVISED per Mode B Fact #102 + 2026-05-08 user directive) + +**User directive (2026-05-08 follow-up to A1)**: BOTH OKVIS2 and VINS-Mono are implemented behind a pluggable `VioStrategy` interface with config-driven selection. The motivation is twofold: (a) enable a comparative-study report in official docs that names both implementations and the measured performance delta on real flight data; (b) lock the production-deployed implementation by measured performance on the project's actual operating context (Jetson Orin Nano Super + ADTi 20MP 20L V1 nav camera + Derkachi-class footage). KLT+RANSAC remains the mandatory simple-baseline per the engine's Component Option Breadth rule and is implemented as a third `VioStrategy` so the comparison study covers the engine-required baseline as well. + +#### `VioStrategy` interface (Strategy/Adapter pattern) + +```python +# Project types (CameraIntrinsics, ImuCalibration, NavCameraFrame, ImuWindow, SE3, ImuBias, FeatureQuality, LicenseTrack) are not defined in this snippet. +from __future__ import annotations # keeps the undefined project types as lazy annotations + +from typing import NamedTuple, Protocol + +import numpy as np + +class VioStrategy(Protocol): + """Pluggable VIO frontend. Selected at startup via config; not hot-swappable mid-flight.""" + def initialize(self, intrinsics: CameraIntrinsics, imu_calibration: ImuCalibration) -> None: ... + def process_frame(self, frame: NavCameraFrame, imu_window: ImuWindow) -> VioOutput: ... + def reset(self) -> None: ... + def name(self) -> str: ... + def license(self) -> str: ... # for SBOM + AC-NEW-3 FDR + def is_production_eligible(self, license_track: LicenseTrack) -> bool: ... + +class VioOutput(NamedTuple): + relative_pose: SE3 # per-frame relative pose + relative_pose_covariance_6x6: np.ndarray # for AC-NEW-4 honesty + imu_bias_estimate: ImuBias + feature_quality: FeatureQuality # for D-CROSS-LATENCY-1 thermal-throttle decision +``` + +Three concrete implementations: `Okvis2VioStrategy`, `VinsMonoVioStrategy`, `KltRansacVioStrategy`. Selection is driven by config (`vio.strategy: okvis2 | vins_mono | klt_ransac`). 
FDR (AC-NEW-3) records the active strategy `name()` + `license()` per flight so the post-mission report can correlate accuracy results with the strategy used. Comparative-study reports (see new test **IT-12** below) replay the same flight footage through all three strategies and emit a single side-by-side accuracy + latency table. + +#### Sub-decision D-C1-1-SUB-A — LOCKED 2026-05-08 to option (a) build-config exclusion + +**Critical thinking flag** (preserved for audit trail): linking GPL-3.0 (VINS-Mono) and BSD-3 (OKVIS2) into the same deployed static binary makes the entire binary GPL-3.0 by viral license — even if config selects OKVIS2 at runtime. The user's stated intent ("use OKVIS in a real-world application") implies a BSD/permissive deployment binary, which is incompatible with statically linking VINS-Mono. The interface pattern itself does not solve this; an additional build/linkage policy is required. + +| Option | Description | Viral-linkage handled? | Comparative-study still possible? | Engineering cost | Verdict | +|---|---|---|---|---|---| +| **(a)** Build-config exclusion | Production binary built with `BUILD_VINS_MONO=OFF` → only `Okvis2VioStrategy` + `KltRansacVioStrategy` linked; research/dev build with `BUILD_VINS_MONO=ON` → all three strategies linked. CI publishes both binaries; production deploys only the BSD-clean one. | ✅ Production binary is BSD/Apache clean | ✅ Research build runs the comparative study; results published to docs | Low (~1 day CMake config + CI pipeline split) | **LOCKED 2026-05-08 (User)** | +| **(b)** Process-isolation IPC | `VinsMonoVioStrategy` lives in a separate binary that talks to the main companion over UNIX domain socket / shared-memory ring buffer; viral linkage stops at process boundary. Both binaries deployable side-by-side; main companion stays Apache/BSD. 
| ✅ Process boundary breaks viral linkage | ✅ Both run side-by-side at runtime in research mode | High (~1-2 wk IPC design + serialization + IPC latency budget on Jetson + per-frame allocator overhead, conflicts with D-CROSS-LATENCY-1 budget) | Rejected (cost + latency budget conflict) | +| **(c)** Accept D-C1-1 = (a) GPL-3.0 entire deployment binary | Whole companion (including pymavlink LGPL-3.0 + GTSAM BSD + FAISS MIT + OpenCV BSD + ...) ships under GPL-3.0; source disclosure obligation triggered for the entire onboard binary | ❌ Trades off restrictions.md Apache/BSD-track preference | ✅ Single binary, no split | Low at code level, high at policy level | Rejected (policy cost) | + +**Locked verdict**: **(a) build-config exclusion**. Production binary stays BSD/permissive clean; research binary published alongside enables the comparative study and docs report. Aligns existing D-C1-1 = (c) "both tracks open" with operational reality by mapping it to "both tracks built; production deploy is permissive-track binary". CMake flag spec: `option(BUILD_VINS_MONO "Include VINS-Mono GPL-3.0 VioStrategy implementation; production builds MUST set OFF" OFF)`. Production CI job builds with `-DBUILD_VINS_MONO=OFF` and asserts via SBOM dump that no GPL-3.0 symbol from `vins_mono` is present. Research CI job builds with `-DBUILD_VINS_MONO=ON` and emits a separate research-binary artifact. 
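
The production-CI assertion described above could look roughly like the following. It is a sketch under stated assumptions: the SBOM is taken to be JSON shaped loosely like CycloneDX (`components[].licenses[].license.id`), and the function names and the forbidden-license set are illustrative rather than part of the project's tooling.

```python
import json
from typing import Iterable, List

# SPDX identifiers treated as disqualifying for the production binary (assumed policy set).
FORBIDDEN_LICENSES = frozenset({"GPL-3.0", "GPL-3.0-only", "GPL-3.0-or-later"})


def gpl_components(
    sbom_json: str, forbidden: Iterable[str] = FORBIDDEN_LICENSES
) -> List[str]:
    """Return the names of SBOM components that declare a forbidden license."""
    forbidden_set = set(forbidden)
    sbom = json.loads(sbom_json)
    offenders = []
    for component in sbom.get("components", []):
        declared = {
            entry.get("license", {}).get("id")
            for entry in component.get("licenses", [])
        }
        if declared & forbidden_set:
            offenders.append(component.get("name", "<unnamed>"))
    return offenders


def assert_production_clean(sbom_json: str) -> None:
    """Raise if any forbidden component appears in the production SBOM."""
    offenders = gpl_components(sbom_json)
    if offenders:
        raise RuntimeError(f"GPL-3.0 components in production SBOM: {offenders}")
```

Matching on exact SPDX identifiers means an LGPL-3.0 dependency such as pymavlink is not flagged by this set; whether that matches the project's license policy is a decision for the policy owner, not something this sketch settles.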
+ +#### Candidate table + +| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit | +|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----| +| **`Okvis2VioStrategy` → OKVIS2** (modern-competitive-lead, BSD/permissive track) | C++ + ROS wrapper behind VioStrategy; smartroboticslab/okvis2 | Tightly-coupled VIO with optional stereo+IMU; for this project mono+IMU mode; outputs per-frame relative pose + IMU bias estimates via `VioOutput` | Best modern accuracy on cross-domain tracking; permissive (BSD); **production-eligible on BSD/permissive track AND on GPL-3.0 track** | C++ + ROS dependency; ~30-50 ms per frame on Jetson Orin Nano Super extrapolation | C++17, ROS Noetic optional, IMU at 100-200 Hz | BSD-3-Clause clean | ~1-2 wk integration + ~3 days VioStrategy adapter | MVE: see [`../00_research/02_fact_cards/C1_vio.md`](../00_research/02_fact_cards/C1_vio.md); docs: Sources #47+#48+#56 | **Selected (modern-competitive-lead)** — production-default if IT-12 comparative study confirms material accuracy lead over VINS-Mono | +| **`VinsMonoVioStrategy` → VINS-Mono** (comparative-baseline) ← **REVISED Mode B Fact #102 + 2026-05-08 user directive** | C++ + ROS wrapper behind VioStrategy; HKUST-Aerial-Robotics/VINS-Mono | Mono+IMU tightly-coupled VIO; outputs per-frame relative pose + IMU bias estimates via `VioOutput` | Stable since 2018; widely documented; **canonical academic baseline for the comparative-study docs report** | Older accuracy; **GPL-3.0 viral linkage forces D-C1-1-SUB-A build/link policy decision before any deployed binary including this strategy can ship** | C++17, ROS Noetic optional, IMU at 100-200 Hz | **GPL-3.0 (copyleft viral)** — production-eligible only if D-C1-1-SUB-A = (b) process-isolation OR D-C1-1 = (a) GPL-3.0 track; research/dev build always eligible per D-C1-1-SUB-A = (a) | ~3-5 days 
adapter + ~3-5 days build-config split (if D-C1-1-SUB-A = (a)) | MVE: see fact card; docs: Sources #43+#55 + Mode B Source #122 (canonical LICENCE) | **Selected via `VioStrategy` interface for comparative study + research/dev builds**; production-deployed only if D-C1-1-SUB-A resolves to a non-(a) option AND IT-12 confirms VINS-Mono outperforms OKVIS2 on project's operating context | +| **`KltRansacVioStrategy` → KLT+RANSAC** (mandatory simple-baseline) ← **REVISED Mode B Fact #102** | OpenCV ≥4.12.0 pure-Python wrapper behind VioStrategy ← OpenCV pin tightened per Mode B Fact #112 | KLT optical flow + 5-point/homography RANSAC essential-matrix → pose decomposition; outputs `VioOutput` (with degraded covariance estimate since no IMU fusion at this layer) | Pure OpenCV; no C++ dependency; pure-VO baseline; engine-rule-required mandatory simple-baseline per Mode A C1 Fact #35 | No IMU fusion (delegated to C5); ~5-10 ms per frame on Jetson | OpenCV ≥4.12.0; IMU bypassed | Apache-2.0 + OpenCV BSD/Apache | ~3-5 days adapter | MVE: see fact card; docs: Source #53 + Mode B Source #127 (CVE-2025-53644 driving pin) | **Selected (mandatory simple-baseline)** — production-eligible on both license tracks; comparative-study reference baseline | + +**Exact-fit evidence** (REVISED): +- Project constraints checked: AC-1.3 cumulative drift; AC-2.1a frame-to-frame registration; AC-3.1 outlier tolerance; AC-3.2 sharp-turn behavior; AC-4.1 + AC-4.2 latency + memory; **CVE-2025-53644 mitigation via OpenCV ≥4.12.0 pin**; **D-C1-1-SUB-A build/link policy for GPL-3.0 viral-linkage containment**. +- Evidence: `02_fact_cards/C1_vio.md` (Mode A) + `02_fact_cards/MODEB_addendum.md` Facts #102 + #112 (Mode B); Sources #43+#47+#48+#53+#55+#56 (Mode A) + #122 + #127 (Mode B). +- Disqualifiers: VINS-Fusion + OpenVINS + VINS-Mono GPL-3.0 contingent on D-C1-1 + D-C1-1-SUB-A resolution. Production-deployed VINS-Mono additionally contingent on IT-12 comparative-study verdict. 
+- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C1_vio.md`](../00_research/06_component_fit_matrix/C1_vio.md) (Mode A) + [`MODEB_revisions.md`](../00_research/06_component_fit_matrix/MODEB_revisions.md) (Mode B overlay). +- API capability gates: ✅ MVE saved for all 3 implementations behind `VioStrategy` interface. + +#### Single-Responsibility check (per coderule.mdc) + +The `VioStrategy` interface owns "produce a `VioOutput` from a frame + IMU window per the strategy's algorithm". Per-strategy concerns (OKVIS2's ROS bring-up, VINS-Mono's GPL-3.0 build flag, KLT+RANSAC's degraded-covariance shape) live inside the concrete implementations, not in the interface. The shared coordinator (`VioPipeline`) owns config-driven strategy selection + FDR provenance logging + per-frame metric collection — it does NOT contain strategy-specific branches. License-track filtering (`is_production_eligible`) is delegated to each strategy because the answer depends on the strategy's own license, satisfying the SRP rule "Logic specific to a platform, variant, or environment belongs in the class that owns that variant". 
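
A minimal sketch of how a `VioPipeline`-style coordinator might select a strategy from config without strategy-specific branches. The registry, the `LicenseTrack` enum values, the stub classes, and the nested `config["vio"]["strategy"]` layout are assumptions made for illustration; only the `vio.strategy` keys and the `name()`, `license()`, and `is_production_eligible()` methods come from the interface above.

```python
from enum import Enum
from typing import Callable, Dict, Protocol


class LicenseTrack(Enum):
    PERMISSIVE = "permissive"  # BSD/Apache deployment binary
    GPL3 = "gpl3"              # whole binary ships under GPL-3.0


class VioStrategyMeta(Protocol):
    """Subset of the VioStrategy interface needed for selection."""
    def name(self) -> str: ...
    def license(self) -> str: ...
    def is_production_eligible(self, license_track: LicenseTrack) -> bool: ...


class Okvis2Stub:
    def name(self) -> str:
        return "okvis2"

    def license(self) -> str:
        return "BSD-3-Clause"

    def is_production_eligible(self, license_track: LicenseTrack) -> bool:
        return True  # permissive code is eligible on both tracks


class VinsMonoStub:
    def name(self) -> str:
        return "vins_mono"

    def license(self) -> str:
        return "GPL-3.0"

    def is_production_eligible(self, license_track: LicenseTrack) -> bool:
        return license_track is LicenseTrack.GPL3


# Keyed by the `vio.strategy` config value; a build would register only what it links.
REGISTRY: Dict[str, Callable[[], VioStrategyMeta]] = {
    "okvis2": Okvis2Stub,
    "vins_mono": VinsMonoStub,
}


def select_strategy(config: dict, track: LicenseTrack, production: bool) -> VioStrategyMeta:
    """Look up the configured strategy and let it answer its own eligibility."""
    key = config["vio"]["strategy"]
    if key not in REGISTRY:
        raise KeyError(f"unknown vio.strategy: {key!r}")
    strategy = REGISTRY[key]()
    if production and not strategy.is_production_eligible(track):
        raise ValueError(
            f"{strategy.name()} ({strategy.license()}) is not eligible on the {track.value} track"
        )
    return strategy
```

The coordinator never inspects which strategy it holds: adding a third implementation means adding one registry entry, and the license answer stays with the class that owns the license.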
+ +### Component: C2 — Visual Place Recognition (REVISED per Mode B Fact #110) + +| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit | +|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----| +| **UltraVPR** (Documentary Lead PRIMARY, BSD/permissive track) ← **NEW Mode B Fact #110** | PyTorch / ONNX; cbbhuxx/UltraVPR | Unsupervised lightweight rotation-invariant aerial VPR; ONNX export; designed for UAV multi-heading flights | MIT throughout; **44 Hz on Jetson Orin NX** (Orin-Nano-Super-class extrapolation expected ±20%); rotation-invariant (closes multi-heading aerial-flight gap); unsupervised aerial-pretrain (closes D-C2-1 retrain cost); RAL 2025 + ICRA 2026 | New (RAL 2025); requires Plan-phase Jetson Orin Nano Super MVE under expanded D-C1-2 / D-C2-4 scope | PyTorch 2.x; ONNX export | MIT clean | ~3-5 days base + 0 wk D-C2-1 retrain (unsupervised pretrain on UAV data) | MVE: deferred to Jetson MVE phase; docs: Mode B Source #124 | **Documentary Lead PRIMARY on BSD/permissive C2 axis** — strongest fit on project's pinned operating context | +| **MegaLoc** (Documentary Lead SECONDARY, BSD/permissive track, broader-applicability) ← **NEW Mode B Fact #110** | PyTorch / torch.hub; gmberton/megaloc | Unified retrieval model trained on multiple methods + datasets; fine-tunable on aerial via AirZoo benchmark recipe | MIT throughout; SOTA on multiple VPR datasets; CVPR 2025; torch.hub install path; AirZoo aerial validation (Mode B Source #125) | New (Feb 2025); requires Plan-phase Jetson MVE | PyTorch 2.x | MIT clean | ~3-5 days base | MVE: deferred to Jetson MVE phase; docs: Mode B Sources #123 + #125 | **Documentary Lead SECONDARY on BSD/permissive C2 axis** — broader-applicability fallback if UltraVPR fails Jetson MVE | +| **MixVPR** (mandatory simple-baseline, BSD/permissive track) ← **REVISED Mode B Fact 
#110** | PyTorch; amaralibey/MixVPR | ResNet50 backbone + MLP-Mixer aggregator; output dimension 2048-D float32 (or 512-D / 256-D `cropToDim` per D-C6-1 = halfvec); input 320×320 | MIT throughout; modest descriptor budget (~6.5% of AC-8.3 cache); active maintenance; engine-required mandatory baseline | Street-view-pretrained — D-C2-1 retrain on aerial corpus required | PyTorch 2.x; ONNX export verified | MIT clean | ~3-5 days base + ~1-2 wk D-C2-1 retrain | MVE: see [`../00_research/02_fact_cards/C2_vpr.md`](../00_research/02_fact_cards/C2_vpr.md); docs: Sources #57+#58+#61 | **Selected (mandatory simple-baseline)** — demoted from "recommended primary" (UltraVPR took that slot per Mode B Fact #110) | +| **SelaVPR** (modern-competitive-lead-secondary, BSD/permissive track) ← **STRENGTHENED Mode B Fact #113** | PyTorch; Lu-Feng/SelaVPR | DINOv2 ViT-L two-stage (global + local); 1024-D global + on-demand local features | MIT; lift from two-stage; 1024-D smallest single-stage cache; **DINOv2-L backbone provides modality-invariant features per Mode B Fact #113 cross-modal evidence** | DINOv2 ViT-L is 3.5× larger than ViT-B; D-C2-5 + D-C2-7 re-rank gates | PyTorch 2.x; DINOv2 ViT-L export | MIT clean | ~3-5 days base + ~1-2 wk D-C2-1 retrain | MVE: see fact card; docs: Sources #62+#63 + Mode B Source #131 | **Selected (modern-competitive-lead-secondary)** — strengthened by Mode B cross-modal contrarian evidence | +| **SALAD** (modern-competitive-lead, GPL-3.0 track) | PyTorch; serizba/salad | DINOv2 ViT-B + optimal-transport aggregator; output 8448-D / 2112-D / 544-D per D-C2-6; input 322×322 | +5-7 R@1 over MixVPR-2048 on MSLS Challenge | GPL-3.0; D-C2-5 ViT export risk; descriptor budget at full size 27% of AC-8.3 | PyTorch 2.x; DINOv2 ViT-B export | GPL-3.0 contingent | ~3-5 days base + ~1-2 wk D-C2-1 retrain | MVE: see fact card; docs: Sources #59+#60 | **Selected on GPL-3.0 track only** — eligible if D-C1-1 = (a) or (c) (unchanged) | +| **EigenPlaces** 
(BSD/permissive sibling) | PyTorch; gmberton/EigenPlaces | ResNet-50 + GeM + FC viewpoint-robust training; 2048-D / 512-D / 256-D / 128-D per D-C2-10 | MIT throughout; viewpoint-robust training paradigm; eleven sibling modes | Older approach (2023); modest accuracy lift over MixVPR | PyTorch 2.x | MIT clean | ~3-5 days | MVE: see fact card; docs: Sources #67+#68 | **Selected (BSD/permissive sibling)** — alternate primary on BSD/permissive track (unchanged) | +| **NetVLAD** (mandatory baseline, BSD/permissive track) | PyTorch port; Relja/netvlad canonical | VGG16 + soft-assignment-VLAD; 4096-D / 512-D / 256-D PCA-whitened per D-C2-9 | MIT canonical; classical-baseline; widely-cited | Largest single-stage descriptor cache at canonical 4096-D; D-C2-8 PyTorch-port-strategy gate | PyTorch port required from canonical MATLAB | MIT canonical (Nanne port has license-uncertainty per D-C2-8) | ~1 wk re-port from canonical OR ~3 days Nanne port + license-clearance | MVE: see fact card; docs: Sources #64+#65+#66 | **Selected (mandatory simple-baseline)** — classical reference (unchanged) | + +**Exact-fit evidence** (REVISED): +- Project constraints checked: AC-2.1b satellite-anchor registration; AC-2.2 cross-domain MRE; AC-8.3 cache budget; AC-8.6 retrieval robustness; AC-4.1 latency; **multi-heading rotation-invariance gap closed by UltraVPR (Mode B Fact #110)**. +- Evidence: `02_fact_cards/C2_vpr.md` (Mode A) + `02_fact_cards/MODEB_addendum.md` Facts #110 + #113 (Mode B); Sources #57–#68 (Mode A) + #123 + #124 + #125 + #131 (Mode B). +- Disqualifiers: SALAD GPL-3.0 contingent on D-C1-1 = (a) or (c); conditional candidates (AnyLoc/BoQ/DINOv2-VLAD) pending D-C2-5 INT8 quantization survey prerequisite. +- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C2_vpr.md`](../00_research/06_component_fit_matrix/C2_vpr.md) + [`MODEB_revisions.md`](../00_research/06_component_fit_matrix/MODEB_revisions.md). 
+- API capability gates: ✅ MVE saved for 5 mandatory pre-screen Mode A candidates; UltraVPR + MegaLoc Documentary-Lead-only pending Plan-phase Jetson MVE. + +### Component: C2.5 — Top-N inlier-based re-rank (NEW per Mode B Fact #108) + +| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit | +|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----| +| **Top-N re-rank by inlier count** (NEW sub-stage per SQ2 Decision 3) | Project-internal Python wrapper around C3 matcher's RANSAC inlier counter | Inputs: top-K=10 VPR retrieval candidates from C2; per-candidate single-pair LightGlue/XFeat invocation; rank by inlier count; output top-N=3 ⊆ top-K | Promotes SQ2 Decision 3 from implicit to explicit named sub-stage; carves explicit latency budget per D-CROSS-LATENCY-1 partition; reuses C3 matcher infrastructure (no new dependency) | Adds ~30-60 ms p95 per frame at K=10 (single-pair LightGlue at FP16) | LightGlue / XFeat already deployed for C3 | Apache-2.0 throughout (inherits C3 matcher license) | ~3-5 days project-internal wrapper | MVE: thin wrapper, MVE deferred to integration phase; docs: SQ2 Source #38–#42 (Mode A) + Mode B Fact #108 | **Selected (NEW sub-stage)** — operationalizes SQ2 Decision 3 | + +### Component: C3 — Cross-domain matchers (UNCHANGED from solution_draft01) + +(Verbatim from solution_draft01 § Component: C3. DISK+LightGlue D-C3-1 = (a) recommended-primary-mitigation, ALIKED+LightGlue secondary, XFeat alternate-modern-competitive-lead, SuperGlue+SuperPoint canonical Rejected. See [`solution_draft01.md`](solution_draft01.md#component-c3--cross-domain-matchers) for the full table.) 
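For the C2.5 re-rank sub-stage, which reuses the C3 matcher, the ranking step can be sketched as follows. This is a minimal sketch under stated assumptions: `rerank_by_inliers` and `count_inliers` are hypothetical names, and `count_inliers` stands in for one single-pair matcher invocation plus RANSAC that returns an inlier count. The tie-break on original retrieval order is also an assumption; the text does not specify one.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def rerank_by_inliers(
    candidates: Sequence[T],
    count_inliers: Callable[[T], int],
    top_n: int = 3,
) -> List[T]:
    """Keep the top_n retrieval candidates with the most matcher inliers.

    `candidates` is the top-K list from C2 in retrieval order; the result is
    always a subset of it.
    """
    scored = [(count_inliers(c), rank, c) for rank, c in enumerate(candidates)]
    # More inliers first; ties fall back to the original retrieval rank.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [c for _, _, c in scored[:top_n]]


# Usage with a fake inlier table in place of a real matcher call.
fake_inliers = {"tile_a": 5, "tile_b": 40, "tile_c": 40, "tile_d": 12}
top = rerank_by_inliers(list(fake_inliers), fake_inliers.__getitem__)
```

With K=10 and N=3 as pinned in the C2.5 table, the matcher runs once per candidate, which is where the per-frame latency cost of this sub-stage comes from.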
+ +### Component: C3.5 — AdHoP-conditional refinement (NEW per Mode B Fact #108) + +| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit | +|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----| +| **AdHoP-conditional refinement** (NEW sub-stage per SQ2 Decision 2) | Project-internal Python wrapper implementing OrthoLoC AdHoP method-agnostic perspective preconditioning per Mode A SQ2 Source #40 | Invoked only when initial reprojection error (from C3 matcher RANSAC residuals) exceeds a threshold; worst-case 2× C3 latency when triggered; bypassed otherwise; +63% translation accuracy reported in source paper at the cost of 2× matcher latency on triggered frames | Promotes SQ2 Decision 2 from implicit to explicit named sub-stage; preserves AC-4.1 latency at p95 by gating activation; carves explicit latency budget per D-CROSS-LATENCY-1 partition | Adds 0-90 ms p99 latency only when triggered (~10-30% of frames in challenging cross-domain conditions) | LightGlue / XFeat already deployed for C3 | Apache-2.0 throughout (inherits C3 matcher license) | ~1 wk project-internal wrapper + threshold tuning | MVE: thin wrapper, MVE deferred to integration phase; docs: Mode A SQ2 Source #40 (OrthoLoC AdHoP) + Mode B Fact #108 | **Selected (NEW sub-stage)** — operationalizes SQ2 Decision 2 | + +### Component: C4 — Pose estimation (REVISED per Mode B Fact #112) + +| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit | +|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----| +| **OpenCV ≥4.12.0 `cv::solvePnPRansac`** (mandatory simple-baseline) wrapped in **GTSAM `Marginals`** (D-C4-2 = (b) covariance recovery) ← **REVISED Mode B Fact #112: OpenCV pin tightened to ≥4.12.0** | 
OpenCV ≥4.12.0 calib3d + GTSAM Python | `solvePnPRansac(objectPoints, imagePoints, K, dist, ..., flags=SOLVEPNP_IPPE)` (planar-scene IPPE per D-C4-1 = (b) 4-DoF flat-earth); wrap result in GTSAM `BetweenFactor` prior + per-inlier `GenericProjectionFactorCal3_S2` factors → `LevenbergMarquardtOptimizer` → `Marginals.marginalCovariance(pose_key)` 6×6 | OpenCV simplest-baseline + 7 USAC RANSAC variants; GTSAM provides NATIVE 6×6 covariance recovery; couples C4 + C5 via shared GTSAM substrate per D-C5-5 = (c); **D-CROSS-LATENCY-1 hybrid auto-degrades to D-C4-2 = (a) Jacobian-based covariance under thermal throttle** | GTSAM `Marginals` ~30-90 ms per pose recovery (Plan-phase Jetson MVE confirms tail); auto-degrade to ~5-15 ms Jacobian-based covariance under thermal throttle per D-CROSS-LATENCY-1 | OpenCV ≥4.12.0 (CVE-2025-53644 mitigation per Mode B Fact #112); GTSAM Python | Apache-2.0 + BSD-3-Clause; **OpenCV pinned to ≥4.12.0 per CVE-2025-53644 (CVSS 9.8 CRITICAL)** | ~3-5 days OpenCV + ~3-5 days GTSAM wrapper | MVE: see [`../00_research/02_fact_cards/C4_pose_estimation.md`](../00_research/02_fact_cards/C4_pose_estimation.md); docs: Sources #82+#83+#86+#87 + Mode B Source #127 | **Selected (mandatory simple-baseline + recommended-primary covariance recovery via GTSAM)** — OpenCV pin tightened | +| **OpenGV** (modern-competitive-lead-richer-minimal-solver) | C++ + Python bindings; laurentkneip/opengv | (Unchanged from solution_draft01) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | **Selected with runtime gate** (unchanged) | + +**Exact-fit evidence** (REVISED): Project constraints checked: AC-1.1/1.2 frame-center accuracy; AC-2.2 reprojection error <2.5 px cross-domain; AC-NEW-4 covariance honesty (P(error >500 m) <0.1 %); AC-4.1 latency (D-CROSS-LATENCY-1 hybrid); **CVE-2025-53644 mitigation via OpenCV ≥4.12.0 pin (Mode B Fact #112)**. 
**Camera intrinsics K + distortion `dist` are PROJECT-LEVEL OPEN ITEM per D-PROJ-1 (Mode B Fact #104)** — Plan-phase MUST close D-PROJ-1 before any AC-1.1/1.2 fixture validation. + +### Component: C5 — State estimator (REVISED per Mode B Fact #107 — AC-4.5 scope clarification) + +| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit | +|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----| +| **Manual ESKF (Solà 2017)** (mandatory simple-baseline) | (Unchanged from solution_draft01) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | **Selected (mandatory simple-baseline)** (unchanged) | +| **GTSAM iSAM2 + CombinedImuFactor + smart factors + Marginals + IncrementalFixedLagSmoother** (modern-competitive-lead-factor-graph) ← **REVISED Mode B Fact #107: AC-4.5 scope clarification** | GTSAM Python; borglab/gtsam | iSAM2 incremental smoothing + `CombinedImuFactor` 6-key per-keyframe-pair factor with bias evolution + `BetweenFactorPose3` + `GenericProjectionFactorCal3DS2` per D-C5-5 = (c) `PriorFactorPose3` only + `gtsam_unstable.IncrementalFixedLagSmoother` K=10-20 keyframes per D-C5-3. **AC-4.5 scope: internal smoothing + corrected current-frame emission only — NOT FC retroactive correction** (FC log is forward-time only per Mode B Fact #107; ArduPilot `AP_GPS_MAV` and iNav `mspGPSReceiveNewData()` consume only the latest frame). FDR (AC-NEW-3) MUST log smoothed past-frame estimates so post-mission analysis can verify AC-4.5. 
| NATIVE 6×6 posterior covariance via `Marginals`; NATIVE AC-4.5 internal look-back refinement; couples C4 + C5 via shared GTSAM substrate per D-C5-5 = (c) | GTSAM ~50-200 MB footprint; per-update latency ~5-100 ms depending on factor density (D-C5-5 = (c) gives ~2-5 ms); **AC-4.5 ≠ FC retroactive correction — internal smoothing only** | GTSAM Python; daily-active maintenance | BSD-3-Clause clean | ~2-3 wk full factor-graph design | MVE: see fact card; docs: Sources #90+#91 + Mode B Fact #107 | **Selected (modern-competitive-lead-factor-graph + recommended primary path)** — couples NATIVELY with C4 GTSAM Marginals via D-C5-5 = (c); AC-4.5 scope clarified | + +### Components C6, C7 — UNCHANGED from solution_draft01 + +(See [`solution_draft01.md`](solution_draft01.md#component-c6--tile-cache--spatial-index) for C6 mirror-of-suite-pattern primary + PostGIS+pgvector deferred secondary, and [`solution_draft01.md`](solution_draft01.md#component-c7--on-jetson-inference-runtime) for C7 TensorRT-native primary + ONNX Runtime+TRT EP secondary + pure PyTorch FP16 baseline. Mode B web research surfaced no contradicting evidence.) 
+ +### Component: C8 — MAVLink / MSP2 FC adapter (REVISED per Mode B Fact #109 + #111) + +| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit | +|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----| +| **pymavlink → MAVLink `GPS_INPUT`** (recommended-primary for ArduPilot Plane) **+ MAVLink 2.0 message signing on companion ↔ AP wired channel per D-C8-9 = (d)** ← **REVISED Mode B Fact #109** | ardupilot/pymavlink + MAVLink 2.0 signing key handshake | `master.mav.gps_input_send(...)` 5 Hz periodic per D-C8-5 over UART/USB/UDP per D-C8-1; FC-side `GPS1_TYPE=14` MAVLink + `EK3_SRC1_POSXY=3` GPS source-set; per-FC unit conversion `horiz_accuracy` (m) per D-C8-8 = (b); **MAVLink 2.0 message signing enabled with per-flight key rotation per D-C8-9 = (d)** | Cooperative-path; FC-side ingestion via `AP_GPS_MAV` (verified Source #4); LGPL-3.0 linkable from Apache-2.0 app per LGPL §6 (D-C8-3 mitigation); **defends against CVE-2026-1579 unauthenticated MAVLink command injection per Mode B Fact #109** | LGPL-3.0 license-posture verification (D-C8-3 mitigation = bundle unmodified); **MAVLink 2.0 signing key management adds Plan-phase complexity (D-C8-9)** | pymavlink + ArduPilot Plane firmware (any) + MAVLink 2.0 signing-capable firmware (ArduPilot canonical per Mode B Source #128) | LGPL-3.0 linkable; **CVE-2026-1579 mitigated by MAVLink 2.0 signing per D-C8-9 = (d)** | ~3-5 days base + ~3-5 days signing key management implementation | MVE: see [`../00_research/02_fact_cards/C8_fc_adapter.md`](../00_research/02_fact_cards/C8_fc_adapter.md); docs: Sources #106+#107 + Mode B Source #128 | **Selected (recommended-primary)** for ArduPilot Plane + MAVLink-2.0-signing posture | +| **MSP2_SENSOR_GPS via Python MSP V2** (recommended-primary for iNav) ← **REVISED Mode B Fact #109 — iNav has no signing implementation; accepted 
residual risk** | YAMSPy + INAV-Toolkit `msp_v2_encode` | (Unchanged from solution_draft01) | YAMSPy + INAV-Toolkit MIT throughout; covariance fields aligned; **iNav has no MSP2 signing equivalent and no MAVLink message-signing implementation per Mode B Source #129 — accepted residual risk on iNav GCS link** | D-C8-4 implementation choice gate; **iNav MAVLink GCS link unsigned per Mode B Fact #109 — Plan-phase carryforward to propose iNav firmware feature-request** | YAMSPy or INAV-Toolkit; iNav firmware 8.0+ | MIT throughout; **iNav signing-gap = accepted residual risk per D-C8-9 + Mode B Source #129** | ~3-5 days | MVE: see fact card; docs: Sources #111+#112+#113 + Mode B Source #129 | **Selected (recommended-primary)** for iNav + signing-gap accepted residual risk | +| **D-C8-2 = (b) companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` switch** (cross-cutting on AP path) ← **REVISED Mode B Fact #111** | pymavlink command channel | Companion publishes to source-set 2 + auto-switches FC to set 2 on first valid fix + switches back to set 1 when companion is unavailable | Mirrors NGPS/Auterion deployment pattern (Mode A SQ1) | **Pattern is firmware-supported but no production-deployed precedent — project will be establishing the canonical pattern itself per Mode B Fact #111**; SITL validation gate REQUIRED before lock per D-C8-2 carve-out | ArduPilot Plane firmware ≥ Aug 2021 (PR #18345) | n/a | ~3-5 days | MVE: see fact card; docs: Sources #4 + Mode B Source #130 | **Selected with runtime gate** (downgraded from `Selected`) — runtime gate = ArduPilot Plane SITL validation by IT-3 (Spoofing-promotion latency) before lock per Step 7.5.3 carve-out | +| **UBX impersonation via pyubx2 NAV-PVT** (deferred secondary for iNav) | (Unchanged from solution_draft01) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | (Unchanged) | **Deferred secondary** (unchanged) | + +**Exact-fit evidence** (REVISED): Project constraints checked: AC-4.3 per-FC 
external-positioning interface; AC-NEW-2 spoofing-promotion latency; AC-NEW-4 covariance honesty (per-FC unit conversion); AC-NEW-7 forgery posture for UBX path; **CVE-2026-1579 MAVLink-no-default-auth mitigation via D-C8-9 = (d) per Mode B Fact #109; D-C8-2 companion-driven switch downgraded to "Selected with runtime gate" per Mode B Fact #111**. + +### Component: C10 — Pre-flight cache provisioning (UNCHANGED from solution_draft01) + +(See [`solution_draft01.md`](solution_draft01.md#component-c10--pre-flight-cache-provisioning--sector-classification--freshness-pipeline) for D-C6-3 + D-C7-7 confirmation pipelines. Mode B web research surfaced no contradicting evidence on C10 internals; cross-component coupling with D-PROJ-2 Suite Sat Service voting-layer contract is added at the project-level decision registry but does not change C10's onboard scope.) + +### Out-of-research-scope items (deferred to Plan-phase) (REVISED per Mode B Facts #104 + #105) + +Per the C10 scope restructure 2026-05-08 (`c10_scope=C` cross-coupling minimal), the following are deferred to Plan-phase as `operator tooling design`: +- Operator-side CLI/desktop tool design (Plan-phase architect + UX) +- Sector classification (active-conflict vs stable rear) heuristics + interface (Plan-phase architect + operations team) +- Tile age-stamping schema beyond restrictions.md mandate (Plan-phase architect) +- Freshness pipeline workflow (Plan-phase architect + operations team) + +**NEW Mode B Plan-phase prerequisites (Mode B Facts #104 + #105)**: +- **D-PROJ-1 — Camera calibration acquisition strategy** (User + project bring-up team): hard prerequisite for AC-1.1/1.2 frame-center-accuracy validation; recommendation **(d) hybrid: factory data sheet + ground-truth checkerboard refinement on each deployed unit**. 
+- **D-PROJ-2 — Suite Sat Service voting-layer contract verification** (User + parent-suite Satellite Service team): hard prerequisite for AC-NEW-7 NFT-5 end-to-end evidence; recommendation **(a) verify + (b) draft contract from onboard side in parallel**. + +--- + +## Testing Strategy (REVISED per Mode B Facts #103, #107, #109) + +> **Note**: full test specifications are produced by the Test Spec skill (greenfield Step 5). What follows is the research-level test envelope, named so the Test Spec skill can elaborate against it. + +### Integration / Functional Tests + +- **IT-1 — Pipeline smoke** (REVISED): feed `_docs/00_problem/input_data/flight_derkachi/` (cropped nadir flight footage + synchronized `SCALED_IMU2` + `GLOBAL_POSITION_INT`) into the full C1+C2+C2.5+C3+C3.5+C4+C5+C8 pipeline; assert that the emitted `GPS_INPUT` (ArduPilot SITL) and `MSP2_SENSOR_GPS` (iNav SITL) frames stay within AC-1.1/1.2 frame-center-accuracy bounds vs the tlog GPS path. **Note per `expected_results/results_report.md` § Known Gaps**: still-image set is for AC-1.1/1.2 frame-center geolocation accuracy ONLY; Derkachi video is for runtime cadence + VIO + replay; neither is sufficient by itself for end-to-end AC-4.1 latency validation under production cadence + altitude + calibration — full validation requires D-PROJ-1 calibration + production-altitude footage. +- **IT-2 — Cold-boot TTFF**: cold-boot the companion 50× with a simulated FC pose; measure boot → first valid emitted external-position MAVLink frame; pass = 95th percentile <30 s per AC-NEW-1. +- **IT-3 — Spoofing-promotion latency** (REVISED per Mode B Fact #111): SITL on each supported FC (ArduPilot Plane + iNav, production param sets); inject false GPS; measure spoof onset → companion estimate becoming primary FC source via D-C8-2 = (b) `MAV_CMD_SET_EKF_SOURCE_SET` companion-driven switch; pass = 95th percentile <3 s on both per AC-NEW-2. 
**NEW gate: IT-3 functions as the runtime gate for D-C8-2's `Selected with runtime gate` status — must pass before D-C8-2 = (b) is locked; failure triggers D-C8-2-FALLBACK selection at Plan-phase.** +- **IT-4 — Sharp-turn recovery**: synthetic UAV trajectory with ±20° bank turns + <5% inter-frame overlap; assert C2/C3 satellite-anchor recovery within 1-2 frames per AC-3.2 + AC-3.3. +- **IT-5 — Visual blackout + GPS spoofing degraded mode**: SITL/replay on each FC; inject 5 s / 15 s / 35 s blackouts while spoofing GPS; assert mode transition ≤400 ms, spoofed GPS ignored, covariance grows monotonically, MAVLink fields degrade at AC-NEW-8 thresholds (>100 m → "2D fix or worse"; >500 m or >30 s → "no fix" + `VISUAL_BLACKOUT_FAILSAFE` STATUSTEXT), recovery only via trusted anchor or 10-s GPS-health + visual-consistency gate. +- **IT-6 — Stale tile rejection (AC-NEW-6)**: inject synthetic-age tiles into C6 cache; verify rejection or downgrade-to-non-`satellite_anchored` per AC-8.2 freshness threshold. +- **IT-7 — Cache-poisoning verification (AC-NEW-7)**: tamper with `/var/lib/onboard/cache/faiss/v_2048_M32.index` post-write but pre-takeoff; verify D-C10-3 SHA-256 content-hash gate triggers reject + STATUSTEXT + refuse takeoff. +- **IT-8 — Pre-flight cache rebuild idempotence**: invoke C10 pre-flight provisioning twice consecutively without input changes; verify D-C10-1 manifest-hash-driven trigger correctly skips rebuild on second invocation; verify atomic-write integrity holds across simulated power-loss mid-rebuild. +- **IT-9 — TensorRT engine cache reuse**: invoke C10 pre-flight provisioning with same model + same calibration corpus twice; verify D-C10-6 calibration-cache reuse triggers <30 sec rebuild on second invocation; verify D-C10-7 self-describing filename schema correctly identifies SM/JP/TRT/precision tuple. 
+- **IT-10 — AC-NEW-4 covariance-honesty cross-FC**: verify D-C8-8 = (b) per-FC unit conversion correctly extracts 2×2 horizontal sub-matrix from C5 GTSAM `Marginals.marginalCovariance`, computes 95% confidence ellipse semi-major axis `sqrt(5.991 * λ_max)` (5.991 is the 95% chi-square quantile for 2 degrees of freedom), emits as `horiz_accuracy` (m) for ArduPilot AND `hPosAccuracy` (mm) for iNav with mathematically equivalent values. +- **IT-11 — Smoothing-loop look-back accuracy** (NEW per Mode B Fact #107): validate GTSAM iSAM2's smoothed past-keyframe poses against ground-truth at smoothing convergence (independent of FC-side consumption). Assert that smoothed past-frame estimates (logged to FDR per AC-NEW-3) converge within X m of ground-truth at smoothing horizon K=10-20 keyframes per D-C5-3. **AC-4.5 satisfied as "internal smoothing + corrected current-frame emission" per Mode B Fact #107 scope clarification — NOT as "FC retroactive correction".** +- **IT-12 — VIO comparative study** (NEW per 2026-05-08 user directive on A1): replay the same flight footage (Derkachi cropped + IT-1 fixtures + when available D-PROJ-1-acquired full-altitude footage) through all three `VioStrategy` implementations (`Okvis2VioStrategy`, `VinsMonoVioStrategy`, `KltRansacVioStrategy`) using the research/dev build (D-C1-1-SUB-A = (a)). Emit a single side-by-side report: per-strategy AC-1.3 cumulative drift over 8 h replay; AC-2.1a frame-to-frame registration error; per-frame `VioOutput.relative_pose_covariance_6x6` honesty (AC-NEW-4); per-strategy NFT-1 latency p95+p99 (AC-4.1); per-strategy NFT-2 memory peak (AC-4.2); per-strategy SBOM impact + binary-size delta. **Deliverable**: `_docs/02_document/vio-comparative-study.md` published to official docs. **Production-selection gate**: if OKVIS2 leads VINS-Mono by >X% on AC-1.3 cumulative drift on the project's operating context, production binary builds with `BUILD_VINS_MONO=OFF` per D-C1-1-SUB-A = (a); otherwise D-C1-1-SUB-A is re-evaluated. 
Threshold X is a Plan-phase decision tied to AC-NEW-4 false-position safety budget headroom. + +### Non-Functional Tests + +- **NFT-1 — End-to-end latency p95 (AC-4.1)** (REVISED per Mode B Fact #103): 8 h synthetic load (3 Hz nav frames replayed); measure end-to-end latency distribution per D-CROSS-LATENCY-1 partition; pass = 95th percentile <400 ms; up to ~10% frames may drop under sustained load per AC-4.1. **Plan-phase Jetson MVE MUST close D-CROSS-LATENCY-1 hybrid degradation behavior before lock — see NFT-9.** +- **NFT-2 — Memory cap (AC-4.2)**: same 8 h load; assert peak shared CPU+GPU memory <8 GB per AC-4.2. +- **NFT-3 — Thermal envelope (AC-NEW-5)**: hot-soak 25 W @ +50 °C for 8 h; assert no Jetson thermal throttling. Cold-soak −20 °C cold-start within AC-NEW-1 30 s p95 budget. +- **NFT-4 — False-position safety budget (AC-NEW-4)**: Monte Carlo over public aerial-localization dataset (e.g., AerialVL S03) + own recorded flights; report error CDF; pass = `P(>500 m) <0.1 %` AND `P(>1 km) <0.01 %` across ≥100 flights. +- **NFT-5 — Cache-poisoning safety budget (AC-NEW-7)** (REVISED per Mode B Fact #105): multi-flight Monte Carlo replay over public datasets + own flights with synthetic over-confidence injection (deflate covariance ×1.5–3); assert `P(geo-misalign >30 m) <1 %` AND `P(>100 m) <0.1 %` across ≥100 flights. **End-to-end Service-side validation requires D-PROJ-2 Suite Sat Service voting-layer contract verification — onboard NFT-5 alone is best-effort without Service-side voting per Mode B Fact #105.** +- **NFT-6 — FDR storage cap (AC-NEW-3)**: 8 h synthetic load; assert FDR ≤64 GB; verify no payload class silently dropped without a logged rollover. 
**NEW per Mode B Fact #107: FDR MUST log smoothed past-frame estimates so post-mission analysis can verify AC-4.5 internal-smoothing scope.** +- **NFT-7 — License posture verification** (REVISED per Mode B Fact #102): SBOM dump of the deployed companion; verify D-C1-1 license-track is honored (no GPL-3.0 candidate loaded if D-C1-1 = (b); pymavlink LGPL-3.0 bundled-unmodified per D-C8-3); **VINS-Mono GPL-3.0 verified per Mode B Source #122 — NOT a BSD/permissive baseline if D-C1-1 = (b)**; verify Magic Leap noncommercial canonical SP weights are NOT loaded; verify all selected candidates' LICENSE files are bundled in `LICENSE/`. +- **NFT-8 — MAVLink message-signing verification** (NEW per Mode B Fact #109): SBOM dump confirms MAVLink 2.0 signing passkey configuration for ArduPilot companion ↔ AP wired channel per D-C8-9 = (d); per-flight key rotation logged to FDR; iNav side documents the unsignable-link as accepted residual risk per Mode B Source #129. Verify CVE-2026-1579 mitigation through end-to-end signed-message round-trip in SITL. +- **NFT-9 — Hot-soak latency distribution** (NEW per Mode B Fact #103): extends NFT-3 conditions (25 W @ +50 °C for 8 h) with end-to-end p95 + p99 latency distribution measurement per D-CROSS-LATENCY-1 partition; validate hybrid degradation behaves correctly (K=3 → K=2 + Jacobian-covariance under thermal throttle); pass = p95 ≤ 400 ms in steady-state AND in thermal-throttle hybrid mode; pass = p99 ≤ 600 ms (allows occasional AdHoP-triggered frames). +- **NFT-10 — Dependency CVE pinning audit** (NEW per Mode B Fact #112): SBOM dump confirms OpenCV ≥4.12.0 (CVE-2025-53644 mitigation), FAISS / GTSAM / TensorRT / pymavlink at versions with no published CVEs at audit time; monthly CVE re-scan trigger logged per D-CROSS-CVE-1 = (a) + (b). + +--- + +## References + +> Full per-source descriptions in `_docs/00_research/01_source_registry/` (organized by category file). 
Mode B addendum sources #122–#131 in [`MODEB_addendum.md`](../00_research/01_source_registry/MODEB_addendum.md). + +(Mode A categories SQ6 / SQ1 / SQ2 / C1 / C2 / C3 / C4 / C5 / C6 / C7 / C8 / C10 references remain as in [`solution_draft01.md`](solution_draft01.md#references). Below are the Mode B additions only.) + +### Mode B addendum (2026-05-08) + +Sources #122–#131. See [`MODEB_addendum.md`](../00_research/01_source_registry/MODEB_addendum.md). + +| # | Title | Tier | Mode B Binding | +|---|-------|------|---------------| +| 122 | HKUST-Aerial-Robotics/VINS-Mono LICENCE file (canonical) — GNU GPL Version 3 | L1 | C1 license-correction (Fact #102) | +| 123 | MegaLoc — "One Retrieval to Place Them All" (CVPR 2025; gmberton/megaloc, MIT) | L1 | C2 D-C2-11 candidate (Fact #110) | +| 124 | UltraVPR — "Unsupervised Lightweight Rotation-Invariant Aerial VPR" (RAL 2025 / ICRA 2026; cbbhuxx/UltraVPR, MIT) | L1 | C2 D-C2-11 alternative (Fact #110) — Documentary Lead PRIMARY on BSD/permissive C2 axis | +| 125 | AirZoo — "Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision" (arXiv:2604.26567, 2026) | L1 | C2 evidence base for MegaLoc on aerial domain (Fact #110) | +| 126 | NVD CVE-2026-1579 — MAVLink protocol Missing Authentication (CVSS 9.8 CRITICAL) | L1 | New cross-cutting security gate D-C8-9 (Fact #109) | +| 127 | NVD CVE-2025-53644 — OpenCV uninitialized stack pointer on crafted JPEG (CVSS 9.8 CRITICAL); fixed in 4.12.0 | L1 | C4 OpenCV pin update (Fact #112) | +| 128 | ArduPilot MAVLink2 Signing — Plane documentation + Issue #28736 + PR #29546 (March 2025) channel-specific signing | L1 | D-C8-9 mitigation evidence (Fact #109) | +| 129 | iNav MAVLink Wiki | L1 | D-C8-9 cross-FC asymmetry (Fact #109) — iNav has no signing implementation | +| 130 | ArduPilot common-ekf-sources.rst + PR #18345 (`MAV_CMD_SET_EKF_SOURCE_SET`) — explicit "no GCSs are currently known to implement this" (verified 2026-05-08) | L1 | D-C8-2 evidence (Fact #111) cross-confirms 
Mode A SQ6 Fact #3 | +| 131 | XoFTR — "Cross-modal Feature Matching Transformer" (arXiv:2404.09692) + 2026 SAR-optical satellite registration benchmark (arXiv:2604.10217) | L2 | F20 contrarian-evidence reference (Fact #113) | + +--- + +## Open decisions for Plan-phase (D-Cx-y registry, REVISED per Mode B Facts #103, #109, #110, #111, #112, #113 + new D-PROJ-1, D-PROJ-2) + +The 27 Plan-phase-architect-owned decisions and 8 cross-component-owner decisions raised across all components in Mode A are catalogued in [`../00_research/06_component_fit_matrix/99_cross_component_gates.md`](../00_research/06_component_fit_matrix/99_cross_component_gates.md). Mode B revisions + new gates are catalogued in [`../00_research/06_component_fit_matrix/MODEB_revisions.md`](../00_research/06_component_fit_matrix/MODEB_revisions.md). The most architecturally significant **user-decision** gates (Mode A + Mode B combined): + +- **D-C1-1 license-track posture** (User + Plan-phase architect, Mode A). Recommendation: D-C1-1 = (c) both tracks open; preserves modular swap pathway. **Mode B Fact #102 evidence**: VINS-Mono is GPL-3.0 (not BSD); BSD/permissive-track lead remains OKVIS2. +- **D-C1-1-SUB-A (LOCKED 2026-05-08 by User to option (a))** VINS-Mono GPL-3.0 viral-linkage containment policy. Production binary `BUILD_VINS_MONO=OFF` (BSD-clean — only `Okvis2VioStrategy` + `KltRansacVioStrategy` linked). Research/dev binary `BUILD_VINS_MONO=ON` for IT-12 comparative study (all three strategies linked). Plan-phase MUST implement the CMake `option(BUILD_VINS_MONO ...)` flag + CI pipeline split + production-binary SBOM verification (no `vins_mono` GPL-3.0 symbol present). See § Architecture C1 D-C1-1-SUB-A table for option trade-offs and locked-verdict rationale. +- **D-C2-1 VPR canonical-weights vs aerial-retrain vs aerial-community-checkpoint** (User + Plan-phase architect, Mode A). Recommendation: aerial-retrain on real UAV nadir flight footage corpus per D-C7-1 closure. 
**Mode B Fact #110 update**: UltraVPR is unsupervised aerial-pretrain, closing D-C2-1 retrain cost on the BSD/permissive C2 axis if UltraVPR is selected. +- **D-C3-1 SuperPoint-replacement-strategy** (User + Plan-phase architect + license-posture decision-maker, Mode A). Recommendation: D-C3-1 = (a) DISK+LightGlue. **Mode B Fact #113 reinforcement**: foundation-model features (DINOv2) provide modality invariance — strengthens SelaVPR (DINOv2-L) as secondary alongside UltraVPR primary. +- **D-C2-11 (REVISED Mode B Fact #110) MegaLoc + UltraVPR successor evaluation** (User + Plan-phase architect). Recommendation REVISED from "defer to post-research session" to **(a) elevate UltraVPR to Documentary Lead PRIMARY on BSD/permissive C2 axis + (b) elevate MegaLoc to Documentary Lead SECONDARY (broader-applicability) + (c) preserve closed pre-screen as fallback**; mandatory Jetson MVE under D-C1-2 / D-C2-4 expanded scope. +- **D-C2-12 (NEW Mode B Fact #113) DINOv2-backbone feature-extractor evaluation for cross-domain matching** (Plan-phase architect + C3 owner). Plan-phase decision: defer to Jetson MVE; potentially closes D-C3-1 retrain cost via DINOv2-feature-based matcher without requiring D-C2-1 aerial retrain. Carryforward research item. +- **D-C8-2 (REVISED Mode B Fact #111) companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` switch ownership pattern** (Plan-phase architect + AC-NEW-2 owner). Recommendation unchanged ((b) companion publishes to source-set 2 + auto-switches FC), but **status downgraded to `Selected with runtime gate`** per Step 7.5.3 carve-out — runtime gate = ArduPilot Plane SITL validation by IT-3 (Spoofing-promotion latency) before lock. **NEW sub-decision D-C8-2-FALLBACK** if SITL validation fails: (a) operator-manual RC aux switch with relaxed AC-NEW-2 wording; (b) operator-warning STATUSTEXT instead of automated switch; (c) escalate to ArduPilot dev community. 
+- **D-C8-9 (NEW Mode B Fact #109) MAVLink 2.0 message signing posture per FC** (Plan-phase architect + security owner). Recommendation **(d) hybrid: signing on companion ↔ AP wired channel + per-flight key rotation**. iNav-side signing-gap accepted as residual risk + Plan-phase carryforward to propose iNav firmware feature-request. +- **D-CROSS-LATENCY-1 (NEW Mode B Fact #103) AC-4.1 latency budget partition strategy** (Plan-phase architect + project bring-up team). Recommendation **(d) hybrid: K=3 default + auto-degrade to K=2 + Jacobian-covariance under thermal throttle**. Validation gate: NFT-9 hot-soak latency distribution measurement before lock. +- **D-CROSS-CVE-1 (NEW Mode B Fact #112) dependency security pinning posture** (Plan-phase architect + security owner). Recommendation **(a) lock to specific patched versions of all CVE-affected dependencies (OpenCV ≥4.12.0; FAISS no CVEs; GTSAM no CVEs; TensorRT 10.3 no CVE-applicable; pymavlink no CVEs at audit time) + (b) maintain a project SBOM with monthly CVE re-scan**. +- **D-PROJ-1 (NEW Mode B Fact #104) Camera calibration acquisition strategy** (User + project bring-up team). Recommendation **(d) hybrid: factory data sheet from ADTi + ground-truth checkerboard refinement on each deployed unit**. **CRITICAL Plan-phase gate** — hard prerequisite for AC-1.1/1.2 frame-center-accuracy validation; Test Spec greenfield Step 5 cannot lock end-to-end accuracy fixtures without it. +- **D-PROJ-2 (NEW Mode B Fact #105) Suite Sat Service voting-layer contract verification** (User + parent-suite Satellite Service team). Recommendation **(a) verify Suite Service voting layer is documented + scheduled + (b) draft contract from onboard side and propose to Suite Service team in parallel**. **CRITICAL cross-suite gate** — requires coordination with parent-suite Satellite Service team before AC-NEW-7 NFT-5 can pass with end-to-end evidence. 
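
The D-CROSS-CVE-1 pinning posture above lends itself to an automated check. The following is a minimal sketch, assuming installed versions are available as plain `major.minor.patch` strings. The `MIN_VERSIONS` table and the `parse_version` / `pin_violations` helpers are hypothetical names introduced for illustration; only the OpenCV floor (≥4.12.0) comes from the decision text.

```python
# Hypothetical pin table: only the OpenCV floor is taken from D-CROSS-CVE-1.
MIN_VERSIONS = {"opencv": "4.12.0"}


def parse_version(text):
    """Turn '4.12.0' into (4, 12, 0) so comparison is numeric, not lexical."""
    return tuple(int(part) for part in text.split("."))


def pin_violations(installed, minimums=MIN_VERSIONS):
    """Return the sorted names of pinned dependencies that are missing or below their floor."""
    bad = []
    for name, floor in minimums.items():
        found = installed.get(name)
        if found is None or parse_version(found) < parse_version(floor):
            bad.append(name)
    return sorted(bad)
```

Comparing parsed tuples rather than raw strings matters here: as strings, `"4.9.0"` sorts after `"4.12.0"`, which would let an unpatched build pass the audit.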
+ +--- + +## Related Artifacts + +- Tech stack evaluation (`tech_stack.md`): NOT PRODUCED in this Mode B run. Recommendation set is embedded in the per-component candidate tables above and the Mode B revisions overlay [`MODEB_revisions.md`](../00_research/06_component_fit_matrix/MODEB_revisions.md). Full extraction into `tech_stack.md` is a low-cost task if the user requests it before Plan-phase. +- Security analysis (`security_analysis.md`): NOT PRODUCED in this Mode B run. AC-NEW-7 cache-poisoning safety + AC-NEW-2 spoofing-promotion + AC-NEW-8 visual blackout failsafe + AC-NEW-4 covariance honesty + **CVE-2026-1579 MAVLink-no-default-auth (Mode B) + CVE-2025-53644 OpenCV crafted-JPEG (Mode B)** are addressed component-by-component above and cross-referenced in [`../00_research/05_validation_log.md`](../00_research/05_validation_log.md). Full extraction into `security_analysis.md` is a low-cost task if the user requests it before Plan-phase — **with Mode B scope, this artifact would now also include the D-C8-9 MAVLink-2.0-signing posture + D-CROSS-CVE-1 dependency-pinning posture as named sub-sections**. +- AC assessment (`_docs/00_research/00_ac_assessment.md`): NOT PRODUCED as standalone artifact in either Mode A or Mode B; per-AC binding evidence remains distributed across per-component fact cards + Restrictions × Candidate-Modes sub-matrix sections + Mode B addendum facts. Per Mode B Fact #106, retroactive extraction is a ~1-2 hour operation if the canonical artifact is wanted before Plan-phase. + +--- + +## Mode B summary + +solution_draft02 supersedes solution_draft01 with 13 actionable revisions: +1. **C1**: VINS-Mono license corrected from BSD to GPL-3.0; KLT+RANSAC re-labeled mandatory simple-baseline (Fact #102). 
**NEW per 2026-05-08 user directive**: all three implementations (OKVIS2 + VINS-Mono + KLT+RANSAC) are wrapped behind a `VioStrategy` interface with config-driven selection, enabling a comparative-study report (IT-12) that picks the production-deployed implementation by measured performance. **NEW sub-decision D-C1-1-SUB-A** (User hard gate): how to contain VINS-Mono GPL-3.0 viral linkage so the production deployment binary stays BSD/permissive clean. Recommendation **(a) build-config exclusion**: production binary `BUILD_VINS_MONO=OFF`; research/dev binary `BUILD_VINS_MONO=ON` for the comparative study. +2. **C2**: UltraVPR + MegaLoc elevated as new Documentary-Lead candidates on BSD/permissive axis; D-C2-11 status changed from "deferred" to "elevate" (Fact #110); SelaVPR strengthened (Fact #113). +3. **C2.5 + C3.5**: two new sub-stages added (Top-N inlier re-rank + AdHoP-conditional refinement) per SQ2 Decisions 2+3 closure (Fact #108). +4. **C4**: OpenCV pin tightened to ≥4.12.0 (Fact #112); D-CROSS-LATENCY-1 hybrid degradation path documented (Fact #103). +5. **C5**: AC-4.5 scope clarified as internal-smoothing-only — NOT FC retroactive correction (Fact #107); IT-11 added. +6. **C8**: D-C8-9 MAVLink 2.0 message signing posture added per FC (Fact #109); D-C8-2 status downgraded to `Selected with runtime gate` (Fact #111); iNav signing-gap documented as accepted residual risk. +7. **Plan-phase**: D-PROJ-1 (camera calibration) + D-PROJ-2 (Suite Sat Service voting-layer contract) + D-CROSS-CVE-1 (dependency pinning) + D-CROSS-LATENCY-1 (AC-4.1 partition) + D-C2-12 (DINOv2-feature matcher) added as new gates (Facts #103, #104, #105, #112, #113). +8. **Tests**: IT-11 (smoothing look-back), NFT-8 (signing verification), NFT-9 (hot-soak latency), NFT-10 (CVE pinning audit) added; IT-1 + IT-3 + NFT-1 + NFT-5 + NFT-6 + NFT-7 revised. 
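
Item 1 describes config-driven selection behind a `VioStrategy` interface, with the GPL-3.0 implementation excluded from production builds. The selection rule can be sketched in Python for illustration only; the real interface presumably lives behind the CMake `BUILD_VINS_MONO` option, and `available_strategies` and `select_strategy` are hypothetical names. The strategy identifiers follow the test parameterization (`okvis2`, `klt_ransac`, `vins_mono`).

```python
# Strategy identifiers follow the test parameterization; the functions are hypothetical.
ALWAYS_LINKED = ("okvis2", "klt_ransac")
GPL_ONLY = ("vins_mono",)


def available_strategies(build_vins_mono):
    """Strategies linked into a binary for a given BUILD_VINS_MONO setting."""
    return ALWAYS_LINKED + (GPL_ONLY if build_vins_mono else ())


def select_strategy(configured, build_vins_mono):
    """Resolve the configured strategy, refusing ones this build does not link."""
    if configured not in available_strategies(build_vins_mono):
        raise ValueError(f"strategy {configured!r} is not linked in this build")
    return configured
```

Failing loudly at startup, rather than silently falling back, keeps a production binary from running a different strategy than the one the comparative study selected.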
+ +The Mode B audit found NO architectural component that needed wholesale replacement — the Mode A solution_draft01 architecture survives with revisions, additions, and three corrections (license error, latency budget, scope clarification). The revised draft is suitable as the input to Plan-phase. diff --git a/_docs/02_document/glossary.md b/_docs/02_document/glossary.md new file mode 100644 index 0000000..ede6813 --- /dev/null +++ b/_docs/02_document/glossary.md @@ -0,0 +1,95 @@ +# Glossary + +**Status**: confirmed-by-user +**Date**: 2026-05-09 +**Scope**: project-specific terminology for the GPS-denied onboard pose-estimation system. Generic software / industry terms (REST, JSON, IMU, WGS84, etc.) are intentionally omitted. + +Terms are alphabetical. Each entry: one-line definition + parenthetical source. + +--- + +**adti20** — Informal name for the production deployment camera, the **ADTi Surveyor Lite 20MP 20L V1** (APS-C ~23.6×15.7 mm, ~5472×3648 px, fixed downward, no gimbal). Pinned in `restrictions.md` §Cameras. (source: `restrictions.md`, user confirmation 2026-05-09) + +**adti26** — Informal name for the camera that captured the 60 still-image test fixtures (`AD000001..AD000060.jpg`) under `_docs/00_problem/input_data/`. Distinct from the production-deployed `adti20`; calibration data must be sourced from public/factory references for these test images. (source: user confirmation 2026-05-09) + +**AdHoP refinement** — OrthoLoC method-agnostic perspective preconditioning, conditional sub-stage between cross-domain matcher and pose estimation; invoked only when initial reprojection error exceeds threshold (component C3.5). (source: `solution.md` §C3.5, SQ2 Decision 2) + +**AGL / Above Ground Level** — Vertical distance from the ground directly below the UAV; operational ceiling ≤1 km AGL. 
(source: `restrictions.md` §UAV & Flight) + +**AI camera** — Operator-controlled gimbal+zoom camera consumed by AI detection systems; out of scope for nav-pose, in scope for AC-7.x object localization only. (source: `restrictions.md` §Cameras) + +**Camera calibration artifact** — JSON file carrying camera intrinsics + distortion + body-to-camera extrinsics + acquisition method (`factory_sheet | checkerboard_refined | hybrid`). The only way camera-specific parameters enter the system; no hard-coded camera math anywhere. Test fixtures and production deployments load different artifacts on the same code path. (source: user directive 2026-05-09) + +**Companion / Companion PC** — The onboard Jetson Orin Nano Super running the GPS-denied estimation pipeline. Synonyms used interchangeably across docs. (source: `restrictions.md` §Onboard Hardware) + +**D-PROJ-1** — *(CLOSED in this Plan cycle)* Camera calibration acquisition strategy. Resolved as: hybrid factory data sheet + per-unit ground-truth checkerboard refinement (~1 day per deployed unit). No physical hardware available this cycle, so production calibration is documented as instructions only. (source: `solution.md` Open decisions, user confirmation 2026-05-09) + +**D-PROJ-2** — *(OPEN, parent-suite)* Two design tasks against `satellite-provider`: (i) post-landing tile ingest endpoint, (ii) multi-flight trust / staleness logic. Surfaced in `satellite-provider/_docs/` outside this Plan cycle as a parent-suite deliverable. Tracked via `_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md`. (source: `solution.md`, user confirmation 2026-05-09) + +**D-PROJ-3** — Multi-flight fixture acquisition (AerialVL S03 + Maxar Open Data Ukraine + own multi-flight data). NOT pursued in this Plan cycle: AC-NEW-4 / AC-NEW-7 wording was relaxed to Monte-Carlo-over-current-data with stated CI; multi-flight statistical residual risk recorded for the Step 4 risk register. 
(source: `solution.md`, traceability-matrix.md, user confirmation 2026-05-09) + +**Dead reckoned** — Source label `dead_reckoned`: estimate produced from IMU-only propagation with no visual or satellite anchoring. Carries monotonically growing covariance; emitted during visual blackouts or after re-localization fails. (source: AC-1.4, AC-NEW-8) + +**Derkachi flight footage** — Representative cropped nadir video + synchronized `SCALED_IMU2` + `GLOBAL_POSITION_INT` telemetry under `input_data/flight_derkachi/`. Used for runtime cadence + VIO + replay testing. (source: `problem.md`, `data_parameters.md`) + +**External position / GPS replacement** — What this system emits to the FC: WGS84 coordinates + honest covariance + provenance label, replacing real GPS when denied/spoofed. (source: AC-4.3, AC-6.3) + +**FC / Flight Controller** — ArduPilot Plane or iNav. PX4 explicitly out of scope. (source: `restrictions.md` §Sensors & Integration) + +**FDR / Flight Data Recorder** — Per-flight onboard NVM record (≤64 GB) of estimates, IMU traces, MAVLink stream, mid-flight tiles, system health, failed-tile thumbnails. Excludes raw nav/AI-camera frames. (source: AC-NEW-3) + +**Flight state** — Boolean signal `IN_AIR | ON_GROUND` derived from FC `MAV_STATE` (MAVLink HEARTBEAT). Safety-critical: gates the post-landing upload path; `IN_AIR` forbids any outbound write to `satellite-provider`. Enforced primarily by process-level isolation — the upload daemon is not loaded in the airborne companion image. (source: user directive 2026-05-09) + +**GCS / Ground Control Station** — QGroundControl. Mission Planner is out of scope. (source: `restrictions.md`) + +**GPS denial / GPS spoofing** — Distinct failure modes the system must distinguish: denial = no fix; spoofing = false fix that must not be promoted into the estimator. (source: AC-3.5, AC-NEW-2, AC-NEW-8) + +**`GPS_INPUT`** — MAVLink message used as the per-frame FC delivery channel for ArduPilot Plane. 
(source: AC-4.3, `restrictions.md`) + +**GSD / Ground Sample Distance** — Meters-per-pixel on the ground; target 10–20 cm/px @ 1 km AGL for the nav camera. (source: `restrictions.md` §Cameras) + +**Internal smoothing** — AC-4.5 scope: GTSAM iSAM2 retroactively refines past keyframes onboard and emits the corrected current frame; the FC log is forward-time only. NOT to be confused with FC-side retroactive correction (which neither ArduPilot nor iNav supports). (source: `solution.md` §C5, Mode B Fact #107) + +**Jetson Orin Nano Super** — Pinned companion compute: 67 TOPS sparse INT8, 8 GB shared LPDDR5, 25 W TDP, JetPack/CUDA/TensorRT. (source: `restrictions.md`) + +**Mid-flight tile generation** — Companion orthorectifies nav-camera frames into basemap-projected tiles in flight, deduplicates, stores locally in `satellite-provider`-compatible format. NO outbound upload while airborne — upload happens post-landing only. (source: AC-8.4, user directive 2026-05-09) + +**Mission profile** — 8 h flight, ~150 km² operational sector + ~50 km² transit corridor, ≤400 km² total cached, ~60 km/h cruise, ≤1 km AGL, eastern/southern Ukraine. (source: `restrictions.md`) + +**`MSP2_SENSOR_GPS`** — MSP2 message used as the per-frame FC delivery channel for iNav (iNav has no inbound MAVLink external-positioning handler). (source: `restrictions.md`, AC-4.3) + +**Nav camera / Navigation camera** — The fixed-downward (no gimbal) camera on the UAV; pinned model is `adti20`. Distinct from the operator-controlled AI camera. (source: `restrictions.md` §Cameras) + +**Operator** — Pre-flight and post-flight human role: classifies the operational area (active-conflict vs stable rear), downloads tiles via `satellite-provider`, stages cache + calibration onto the companion before takeoff, and after landing triggers the post-landing upload tool. 
(source: `problem.md`, AC-3.4 / AC-6.2, user confirmation 2026-05-09) + +**Post-landing upload tool** — Operator-side process that runs only when `flight state == ON_GROUND`; pushes locally-saved mid-flight tiles to `satellite-provider`'s ingest endpoint. Implemented as a separate process / image so the upload code path is never loaded in the airborne companion. (source: user directive 2026-05-09) + +**`satellite-provider`** — First-class architecture boundary: the suite's existing .NET 8 REST microservice at `/Users/obezdienie001/dev/azaion/suite/satellite-provider/`. Runs in Docker (`:5100`, OpenAPI at `/swagger`); downloads Google Maps tiles; stores them in PostgreSQL + filesystem (`./tiles/{zoomLevel}/{x}/{y}.jpg`). Read-only from the onboard runtime; receives post-landing tile uploads via a yet-to-be-designed ingest endpoint (parent-suite work, D-PROJ-2). Synonym in older docs: "Suite Sat Service" / "Azaion Suite Satellite Service". (source: parent-suite `satellite-provider/README.md`, user confirmation 2026-05-09) + +**Satellite anchored** — Source label `satellite_anchored`: estimate produced by matching the current nav frame against pre-cached satellite tiles. Highest confidence among the three labels. (source: AC-1.4) + +**Sector classification** — Pre-flight operator decision: active-conflict (6-month tile-freshness threshold) vs stable rear (12-month threshold). Drives the freshness gate at ingest and during runtime tile use. (source: AC-8.2, AC-NEW-6, `solution.md` operator-tooling section) + +**Source label** — Provenance tag carried with every emitted estimate: `{satellite_anchored | visual_propagated | dead_reckoned}`. (source: AC-1.4) + +**Suite Sat Service** — Synonym for `satellite-provider` used in earlier docs (problem.md, restrictions.md, solution_draft01/02). The actual implementation in the parent suite is the .NET 8 service; "Suite Sat Service" is the role name. 
(source: `restrictions.md`, parent-suite `satellite-provider/README.md`) + +**Tier-1 / Tier-2** — Testing-environment split: Tier-1 = workstation Docker (fast/cheap); Tier-2 = Jetson hardware (AC-bound). Both appear in the deployment plan and CI matrix per finding F6. (source: `_docs/02_document/tests/environment.md`) + +**Tile** — Unit of persistent imagery on the companion; basemap-projected, deduplicated; the only persistent imagery format. Mid-flight-generated tiles use the same on-disk format as `satellite-provider` (`./{zoomLevel}/{x}/{y}.jpg` + matching metadata schema) so post-landing upload is byte-identical. (source: AC-8.4, AC-8.5, parent-suite `satellite-provider/README.md`, user confirmation 2026-05-09) + +**Tile cache** — Local on-Jetson store, ≤10 GB, populated pre-flight from `satellite-provider`, augmented mid-flight by orthorectified nav-camera-derived tiles. (source: `restrictions.md`, AC-8.3, AC-8.4) + +**Tile freshness** — <6 mo (active-conflict sectors) / <12 mo (stable rear); stale tiles must be rejected or downgraded. (source: AC-8.2, AC-NEW-6) + +**TTFF / Time To First Fix** — From companion boot to first valid emitted external-position frame; budget <30 s p95. (source: AC-NEW-1) + +**UAV** — Fixed-wing unmanned aerial vehicle this system runs on; ~60 km/h cruise, ≤1 km AGL, 8 h flights, eastern/southern Ukraine theater. (source: `restrictions.md`) + +**VioStrategy** — Pluggable interface (Okvis2 / VinsMono / KltRansac) selected at startup by config. Production binary excludes the GPL-3.0 implementation per D-C1-1-SUB-A=(a) build-config exclusion; research/dev binary links all three for the comparative study (IT-12). (source: `solution.md` §C1) + +**VIO / Visual-Inertial Odometry** — Frame-to-frame motion + IMU bias estimation via fused camera + IMU streams (component C1). 
(source: `solution.md` §C1) + +**Visual propagated** — Source label `visual_propagated`: estimate produced by VIO frame-to-frame propagation with no fresh satellite anchor. Mid-confidence. (source: AC-1.4) + +**VPR / Visual Place Recognition** — Descriptor-based retrieval of the nearest satellite tile to the current nav frame (component C2). (source: `solution.md` §C2) diff --git a/_docs/02_document/tests/blackbox-tests.md b/_docs/02_document/tests/blackbox-tests.md new file mode 100644 index 0000000..4453101 --- /dev/null +++ b/_docs/02_document/tests/blackbox-tests.md @@ -0,0 +1,595 @@ +# Blackbox Tests + +All tests run from the `e2e-runner` container against the SUT through public boundaries only (frame source, FC inbound stream, tile cache mount, FC outbound observed via SITL, GCS observed via mavproxy-listener, FDR via post-run filesystem read). Two FC adapters parameterize every test that touches the FC contract: `ardupilot` and `inav`. Two `VioStrategy` modes parameterize Tier-1 product correctness tests: `okvis2` (production-default) and `klt_ransac` (mandatory simple-baseline). `vins_mono` is parameterized only when the research build is under test. + +## Positive Scenarios + +### FT-P-01: Still-image satellite-anchor frame-center accuracy + +**Summary**: Validates the canonical satellite-anchor frame-center geolocation pipeline against the 60-image GT set. +**Traces to**: AC-1.1, AC-1.2 +**Category**: Position Accuracy + +**Preconditions**: +- `tile-cache-fixture` mounted at `/var/azaion/tile-cache`. +- SUT cold-started with no prior state; configured for the FC adapter under test. + +**Input data**: `still-image-set-60` (per `test-data.md`). 
+ +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | For each image `AD0000NN.jpg` in order, write the frame to the SUT's frame-source path and wait up to 5 s for the corresponding outbound `GPS_INPUT` (AP) / `MSP2_SENSOR_GPS` (iNav) message at the SITL listener | One outbound message per input image; payload includes WGS84 lat/lon | +| 2 | Compute Vincenty geodesic distance between estimated lat/lon and `coordinates.csv` GT row for that image | Per-image error ≤ 50 m for ≥80% of images, ≤ 20 m for ≥50% | +| 3 | Capture per-image error to `e2e-results/run-${RUN_ID}/ft-p-01.csv` | CSV produced with one row per image | + +**Expected outcome**: aggregate `pass_count(error≤50m) ≥ 48` AND `pass_count(error≤20m) ≥ 30` (matching the rule in `expected_results/results_report.md`). +**Max execution time**: 5 min (60 images × ~5 s including SITL round trip). + +--- + +### FT-P-02: Derkachi VIO drift between satellite anchors + +**Summary**: Validates cumulative drift between consecutive satellite-anchored fixes during the Derkachi flight replay. +**Traces to**: AC-1.3 +**Category**: Position Accuracy + +**Preconditions**: +- `tile-cache-fixture` mounted (covers Derkachi route). +- SUT cold-started; FC adapter under test connected via SITL; `data_imu.csv` replayed at 10 Hz into FC IMU stream. + +**Input data**: `derkachi-fixture` video at 30 fps + IMU CSV at 10 Hz. 
+ +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Start synchronized video + IMU replay (3 video frames per IMU row) | SUT begins emitting estimates at the SUT's runtime cadence | +| 2 | At each frame whose outbound estimate carries `source_label = satellite_anchored`, record the propagated centre estimate of the prior visual-only segment AND the new anchor centre | Two values per anchor pair captured | +| 3 | Compute per-anchor-pair drift = ‖propagated_centre − next_anchor_centre‖. Bin by `last_satellite_anchor_age_ms`. | Bins populated; CSV emitted | + +**Expected outcome**: Across all anchor pairs, at least 95% satisfy `drift < 100 m` (visual-only) AND `drift < 50 m` (when CombinedImuFactor IMU fusion is active in C5). Drift distribution monotonically grows with anchor age, with no anomalous spike. +**Max execution time**: 10 min (8 min replay + parsing). + +--- + +### FT-P-03: Estimate output schema and source-label semantics + +**Summary**: Validates the SUT's outbound estimate carries every required field with correct types and the source label is one of the three allowed values. +**Traces to**: AC-1.4, AC-4.3 +**Category**: Position Accuracy / FC Contract + +**Preconditions**: +- One image from `still-image-set-60` already loaded into the cache fixture. +- SUT cold-started. + +**Input data**: any single image (default `AD000001.jpg`). 
+ +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Push the image to the frame source | SUT emits one outbound `GPS_INPUT` (AP) / `MSP2_SENSOR_GPS` (iNav) AND one out-of-band channel message (MAVLink `STATUSTEXT` or `NAMED_VALUE_FLOAT` per AC-4.3) carrying the source label | +| 2 | Read the SITL-side fields | Schema match: `lat`, `lon`, `cov_semi_major_m`, `last_satellite_anchor_age_ms` present and well-typed | +| 3 | Read the out-of-band label channel | Label ∈ `{satellite_anchored, visual_propagated, dead_reckoned}` | + +**Expected outcome**: Schema check passes AND label is in the allowed set. +**Max execution time**: 30 s. + +--- + +### FT-P-04: Derkachi frame-to-frame registration success rate + +**Summary**: Validates frame-to-frame registration succeeds for ≥95% of "normal" segments of the Derkachi flight. +**Traces to**: AC-2.1a +**Category**: Image Processing + +**Preconditions**: +- SUT cold-started; FC adapter and VioStrategy both parameterized. + +**Input data**: `derkachi-fixture` (full duration). "Normal" segments derived per AC-2.1a: nadir ±10° bank/pitch (estimated from `SCALED_IMU2`-derived attitude), ≥40% inferred prior-frame overlap (heuristic from frame-to-frame translation magnitude). + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Replay the Derkachi fixture | SUT emits per-frame registration-success metric (exposed via `NAMED_VALUE_FLOAT` or in FDR per AC-NEW-3) | +| 2 | After replay, compute success-ratio over normal segments only | Success ratio ≥ 0.95 | + +**Expected outcome**: ≥95% on normal segments. Sharp-turn segments (excluded from this denominator) are exercised separately by FT-N-02. +**Max execution time**: 12 min. 
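
The normal-segment filter and success ratio in FT-P-04 could be computed along these lines. This is a sketch under stated assumptions: the per-frame record fields (`bank_deg`, `pitch_deg`, `overlap`, `registered`) are hypothetical, while the thresholds (±10° bank/pitch, ≥40% overlap) come from the test definition above.

```python
def is_normal(frame):
    """AC-2.1a 'normal' frame: within ±10° bank/pitch and at least 40% prior-frame overlap."""
    return (
        abs(frame["bank_deg"]) <= 10.0
        and abs(frame["pitch_deg"]) <= 10.0
        and frame["overlap"] >= 0.40
    )


def normal_success_ratio(frames):
    """Registration success ratio over normal frames only; None if there are none."""
    normal = [f for f in frames if is_normal(f)]
    if not normal:
        return None
    return sum(1 for f in normal if f["registered"]) / len(normal)
```

The test passes when `normal_success_ratio(frames) >= 0.95`. Returning `None` for an empty denominator avoids reporting a vacuous pass on a replay with no normal frames.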
+ +--- + +### FT-P-05: Satellite-anchor cross-domain registration + +**Summary**: Validates the satellite-anchor (UAV→satellite cross-domain) matcher succeeds with the cross-domain MRE budget. +**Traces to**: AC-2.1b, AC-2.2 +**Category**: Image Processing + +**Preconditions**: +- `tile-cache-fixture` includes the still-image footprints. +- SUT cold-started. + +**Input data**: `still-image-set-60` plus `still-image-sat-refs-2` (for the 2 images with paired `_gmaps.png`). + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | For each still image, push to frame source | One satellite-anchor result per image | +| 2 | Read per-frame MRE (via FDR or `NAMED_VALUE_FLOAT`) | MRE recorded | +| 3 | Aggregate per-image accuracy AND MRE distribution | All images: MRE < 2.5 px; ≥80% within 50 m of GT; ≥50% within 20 m of GT | + +**Expected outcome**: AC-1.1, AC-1.2, AC-2.1b, AC-2.2 all satisfied. +**Max execution time**: 5 min. + +--- + +### FT-P-06: Mean Reprojection Error budgets (frame-to-frame + cross-domain) + +**Summary**: Validates the two MRE budgets are honored. +**Traces to**: AC-2.2 + +**Preconditions**: Same as FT-P-04 + FT-P-05. + +**Input data**: `derkachi-fixture` (frame-to-frame MRE) + `still-image-set-60` (cross-domain MRE). + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Run FT-P-04 and FT-P-05 in sequence; collect per-frame MRE from both runs | MRE values captured | +| 2 | Aggregate by domain (frame-to-frame vs satellite-anchored) | Distribution per domain | + +**Expected outcome**: Frame-to-frame MRE < 1.0 px (95th percentile); cross-domain MRE < 2.5 px (95th percentile). +**Max execution time**: piggybacks on FT-P-04 / FT-P-05. 
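
The per-domain aggregation in FT-P-06 can be sketched as below. The percentile method is an assumption (nearest-rank; the test spec does not name one), and the helper names and domain keys are hypothetical. The budgets (1.0 px frame-to-frame, 2.5 px cross-domain) are taken from the expected outcome above.

```python
# Budgets in pixels, taken from the FT-P-06 expected outcome.
MRE_BUDGET_PX = {"frame_to_frame": 1.0, "cross_domain": 2.5}


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def mre_within_budget(samples_by_domain, pct=95):
    """Map each domain to True when its pct-th percentile MRE is strictly under budget."""
    return {
        domain: percentile_nearest_rank(samples, pct) < MRE_BUDGET_PX[domain]
        for domain, samples in samples_by_domain.items()
    }
```

Whichever percentile definition is adopted, it should be recorded with the results, since interpolating and nearest-rank methods can disagree on small samples such as the 60-image set.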
+ +--- + +### FT-P-07: Sharp-turn recovery via satellite reference + +**Summary**: Validates that frames during sharp turns may fail frame-to-frame but recover via satellite-reference re-localization. +**Traces to**: AC-3.2 + +**Preconditions**: +- Sharp-turn segment of the Derkachi flight identified by gyro_z spikes in `SCALED_IMU2`. (If Derkachi has no sharp turn meeting AC-3.2 thresholds, fall back to a synthetic gyro overlay; flag in FDR.) + +**Input data**: `derkachi-fixture` filtered to sharp-turn segment(s). + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Replay sharp-turn segment | SUT emits `source_label = visual_propagated` or `dead_reckoned` during turn | +| 2 | After turn, observe next satellite-anchor attempt | Recovery: `source_label = satellite_anchored` returns within 3 frames of turn end; drift ≤ 200 m, heading change handled | + +**Expected outcome**: Recovery within 3 frames; <200 m drift; <70° heading change handled. +**Max execution time**: 5 min (per turn segment, multiple per replay). + +--- + +### FT-P-08: Multi-segment satellite-reference re-localization + +**Summary**: Validates ≥3 disconnected segments per flight handled via satellite-reference re-localization. +**Traces to**: AC-3.3 + +**Preconditions**: +- `multi-segment-derkachi` synthetic fixture generated with 3+ blackout windows. + +**Input data**: `multi-segment-derkachi`. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Replay with injected blackout windows | SUT emits `dead_reckoned` during each blackout | +| 2 | At end of each blackout, observe re-localization | `source_label` returns to `satellite_anchored` within 3 frames; trajectory continuity preserved (no >100 m jump) | + +**Expected outcome**: All 3+ segments re-localized successfully; no trajectory jump exceeds 100 m. +**Max execution time**: 12 min. 
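
The two FT-P-08 pass criteria can be checked from a per-frame trace. The sketch below assumes a list of source labels per frame, a known index for the last blacked-out frame of each window, and positions already projected into a local metric frame as `(east_m, north_m)` pairs. All of these representations and function names are hypothetical; the 3-frame window and 100 m jump limit come from the test above.

```python
import math


def relocalized_within(labels, blackout_end, max_frames=3):
    """True if 'satellite_anchored' appears in the max_frames frames after blackout_end."""
    window = labels[blackout_end + 1 : blackout_end + 1 + max_frames]
    return "satellite_anchored" in window


def max_jump_m(positions):
    """Largest frame-to-frame displacement for (east_m, north_m) positions in metres."""
    return max(
        (math.dist(a, b) for a, b in zip(positions, positions[1:])),
        default=0.0,
    )
```

A run would pass when `relocalized_within` holds for every blackout window and `max_jump_m(positions) <= 100`.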
+ +--- + +### FT-P-09-AP: ArduPilot Plane GPS_INPUT contract conformance + signing + +**Summary**: Validates `GPS_INPUT` reaches AP SITL, AP EKF accepts it as primary GPS, and MAVLink 2.0 message signing handshake completes per D-C8-9. +**Traces to**: AC-4.3 (AP), D-C8-9, AC-NEW-2 (precondition) +**Category**: FC Contract / Security + +**Preconditions**: +- `ardupilot-plane-sitl` running with `GPS_TYPE=14`. +- `mavlink-passkey` loaded as Docker secret into SUT. + +**Input data**: `derkachi-fixture` (any 60 s segment). + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Start SUT with FC adapter `ardupilot` | Signing handshake completes within 5 s; signed channel established | +| 2 | Replay 60 s of Derkachi | SUT emits signed `GPS_INPUT` at the configured rate | +| 3 | Read AP `EK3_SRC1_POSXY` parameter via MAVPROXY | Value reads `3` (GPS source) | +| 4 | Read AP-side GPS health via `GPS_RAW_INT` | Fix type ≥ 3 (3D fix), HDOP within nominal | + +**Expected outcome**: Signing handshake succeeds; AP EKF on GPS source-set; GPS_RAW_INT shows healthy fix. +**Max execution time**: 90 s. + +--- + +### FT-P-09-iNav: iNav MSP2_SENSOR_GPS contract conformance + +**Summary**: Validates `MSP2_SENSOR_GPS` reaches iNav SITL and iNav GPS provider accepts it as the sole source. +**Traces to**: AC-4.3 (iNav) +**Category**: FC Contract + +**Preconditions**: +- `inav-sitl` running with GPS provider configured to MSP per `docs/SITL/SITL.md`. + +**Input data**: `derkachi-fixture` (any 60 s segment). 
+ +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Start SUT with FC adapter `inav` | TCP connection to `inav-sitl:5760` established | +| 2 | Replay 60 s of Derkachi | SUT emits `MSP2_SENSOR_GPS` (ID 0x1F03) frames at 5 Hz | +| 3 | Read iNav GPS state via MSP query | `gpsSol.fixType` ≥ 3, `gpsSol.numSat` reflects emitted value, provider=MSP | + +**Expected outcome**: iNav GPS state reflects emitted frames; no fallback to internal GPS. +**Max execution time**: 90 s. + +--- + +### FT-P-10: GTSAM smoothing-loop look-back accuracy (IT-11) + +**Summary**: Validates the smoothing-loop's past-keyframe pose estimates improve over raw single-shot estimates (Mode B Fact #107). NOT validated as FC-side retroactive correction. +**Traces to**: AC-4.5 (revised scope per Mode B), Mode B Fact #107 +**Category**: Position Accuracy / Internal smoothing + +**Preconditions**: +- SUT cold-started; FDR enabled. + +**Input data**: `derkachi-fixture` full replay. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Replay Derkachi end-to-end | FDR contains per-keyframe (a) raw single-shot pose at first emission, (b) smoothed pose at iSAM2 convergence | +| 2 | After replay, parse FDR; for each past keyframe compute distance(raw, GT) and distance(smoothed, GT) | Per-keyframe pair extracted | +| 3 | Aggregate across keyframes | smoothed_error < raw_error for ≥80% of keyframes; mean improvement ≥ 5 m | + +**Expected outcome**: Internal smoothing improves past-keyframe accuracy; FC-side retroactive correction NOT exercised (out of scope per Mode B revision A6). +**Max execution time**: 12 min. + +--- + +### FT-P-11: Cold-start initialization from FC EKF + +**Summary**: Validates SUT initialization from FC EKF's last valid GPS + IMU-extrapolated position at GPS denial. 
+**Traces to**: AC-5.1 +**Category**: Startup + +**Preconditions**: +- `cold-boot-fixture` provides a frozen FC pose snapshot. +- SUT not running. + +**Input data**: `cold-boot-fixture`. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Start `ardupilot-plane-sitl` (or `inav-sitl`) with the frozen-pose snapshot loaded | SITL EKF reflects the snapshot pose | +| 2 | Start SUT | SUT queries FC EKF; reads pose; initializes | +| 3 | Push first nav-camera frame | First outbound estimate's lat/lon within ±50 m of the FC EKF snapshot pose | + +**Expected outcome**: First emitted estimate uses FC EKF's pose as prior, within ±50 m tolerance. +**Max execution time**: 60 s. + +--- + +### FT-P-12: GCS downsample at 1-2 Hz + +**Summary**: Validates position estimates + confidence stream to the GCS (via `mavproxy-listener`) at 1-2 Hz. +**Traces to**: AC-6.1 +**Category**: GCS / Telemetry + +**Preconditions**: +- `mavproxy-listener` running and capturing to `.tlog`. + +**Input data**: `derkachi-fixture` 60 s segment. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Start replay | SUT emits to FC at runtime cadence (~3 Hz) AND to GCS at 1-2 Hz | +| 2 | After replay, parse `.tlog` for SUT-emitted GCS messages over the 60 s window | GCS rate within [1, 2] Hz inclusive | + +**Expected outcome**: GCS-side rate observed in [1, 2] Hz over the window. +**Max execution time**: 90 s. + +--- + +### FT-P-13: GCS command path (operator re-loc hint) + +**Summary**: Validates that GCS-originated commands (via standard MAVLink) can carry operator re-loc hints to the SUT. +**Traces to**: AC-6.2 +**Category**: GCS / Telemetry + +**Preconditions**: +- `mavproxy-listener` configured to send commands. +- SUT in `dead_reckoned` state (e.g. mid-blackout from FT-N-03 setup). 
+ +**Input data**: synthesized `STATUSTEXT` carrying re-loc hint from MAVPROXY. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | While SUT is in `dead_reckoned`, send re-loc-hint STATUSTEXT from MAVPROXY | SUT acknowledges the hint via FDR log entry | +| 2 | Push next nav-camera frame after hint | Next satellite-anchor attempt uses hint as a search prior | + +**Expected outcome**: Hint received; next anchor attempt biases search; no rejection. +**Max execution time**: 60 s. + +--- + +### FT-P-14: WGS84 output coordinate system + +**Summary**: Validates output coordinates are in WGS84 (latitude/longitude in degrees as per ArduPilot/iNav GPS convention scaled to 1e-7). +**Traces to**: AC-6.3 + +**Preconditions**: any FT-P-01 / FT-P-02 run. + +**Input data**: any. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Capture one outbound `GPS_INPUT` / `MSP2_SENSOR_GPS` from SITL | Lat/lon present; values in valid WGS84 range; scaled per protocol convention | + +**Expected outcome**: Coordinates parse as WGS84 within Earth bounds. +**Max execution time**: 30 s. + +--- + +### FT-P-15: Tile cache schema and resolution floor + +**Summary**: Validates the tile cache manifest carries every required field and tiles meet the ≥0.5 m/px floor. +**Traces to**: AC-8.1, RESTRICT-SAT-2 (manifest schema) + +**Preconditions**: `tile-cache-fixture` mounted. + +**Input data**: tile cache. 
+ +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | The SUT exposes a one-time cache-load self-check at startup; observe via FDR | Each tile manifest entry has CRS, tile matrix, dimension, lat-adjusted m/px, capture date, source, compression | +| 2 | Inspect m/px values | All ≥ 0.5 m/px; reject below floor | + +**Expected outcome**: All loaded tiles pass schema check and resolution floor. +**Max execution time**: 30 s. + +--- + +### FT-P-16: Pre-loaded cache (offline-only interface) + +**Summary**: Validates the SUT loads tiles from the local cache only, with no in-flight Service calls. +**Traces to**: AC-8.3, RESTRICT-SAT-1 + +**Preconditions**: `tile-cache-fixture` mounted; `e2e-net` `internal: true` enforced (no internet egress). + +**Input data**: `derkachi-fixture` 60 s segment. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Start replay | SUT serves tiles from `/var/azaion/tile-cache` only | +| 2 | Observe network egress counter on `gps-denied-onboard` container | All egress to non-`e2e-net` destinations is 0 (paired with NFT-SEC-02) | + +**Expected outcome**: 0 external egress; replay completes against local cache. +**Max execution time**: 90 s. + +--- + +### FT-P-17: Mid-flight tile generation + +**Summary**: Validates the SUT continuously orthorectifies nav-camera frames into basemap-projected tiles, deduplicates them, and stores them locally for landing-time upload. +**Traces to**: AC-8.4 + +**Preconditions**: empty `mid-flight-tile-output` directory in the FDR volume; mock-suite-sat-service running. + +**Input data**: `derkachi-fixture` 5 min segment. 
+ +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Start replay | SUT generates and writes tiles to FDR's `mid-flight-tile-output/` | +| 2 | After replay, read tiles | ≥1 tile per ~3 s of high-quality nav frames; each tile carries quality metadata sufficient for the Service voting layer (per Mode B Fact #105) | +| 3 | Simulate landing event; SUT uploads to `mock-suite-sat-service` | Mock service receives all tiles with HTTP 202 | + +**Expected outcome**: Tiles produced + deduplicated + uploaded with quality metadata. +**Max execution time**: 8 min. + +--- + +### FT-P-18: No raw nav/AI-cam frame retention (storage policy) + +**Summary**: Validates that no raw nav-camera or AI-camera frames are retained except the ≤0.1 Hz failed-tile-gen thumbnail log. +**Traces to**: AC-8.5 + +**Preconditions**: `derkachi-fixture` replay just completed. + +**Input data**: post-replay state of FDR + tile cache. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Walk the FDR + tile cache for any file matching nav-camera raw-frame pattern (JPEG/RAW with original dimensions) | Only the failed-tile-gen thumbnail log files present (≤0.1 Hz cadence) | +| 2 | Verify thumbnail log is bounded by AC-NEW-3 FDR budget | Total thumbnail log < 1 GB over 8 h (NFT-LIM-02 cross-check) | + +**Expected outcome**: 0 unauthorized raw frames retained. +**Max execution time**: 30 s (filesystem walk). + +--- + +### FT-P-19: Satellite relocalization scale-ratio + scene-change + +**Summary**: Validates UAV-frame ground footprint at deployment altitude is retrievable from cache regardless of internal tiling. Scene-change subset is reduced-confidence (PARTIAL — see traceability matrix). +**Traces to**: AC-8.6 (scale-ratio FULL; scene-change PARTIAL) + +**Preconditions**: `tile-cache-fixture` mounted with multi-zoom-level coverage. 
+ +**Input data**: `still-image-set-60` (scale-ratio); the 2 paired `_gmaps.png` images (scene-change subset). + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | For each still image, query cache top-K=10 retrieval | Top-K result includes a tile whose centre is within 100 m of the image's true centre (scale-ratio satisfied) | +| 2 | For the 2 paired images, run cross-domain matcher against the `_gmaps.png` reference | Scale-ratio match succeeds; scene-change behavior recorded (PARTIAL — full coverage requires a labeled change-pair dataset, deferred under D-PROJ-3) | + +**Expected outcome**: Scale-ratio passes for 60/60; scene-change recorded as PARTIAL. +**Max execution time**: 5 min. + +--- + +## Negative Scenarios + +### FT-N-01: 350 m outlier injection tolerance + +**Summary**: Validates the system tolerates up to 350 m outliers between two consecutive frames with airframe tilt up to ±20°. +**Traces to**: AC-3.1, RESTRICT-CAM-1 (nadir camera, tilt limits) + +**Preconditions**: SUT running on `derkachi-fixture`; `outlier-injection-derkachi` injector primed in `medium` density. + +**Input data**: `outlier-injection-derkachi` (medium). + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Start replay with injector active (every 10th frame replaced by far-away tile crop) | SUT detects outlier; rejects from anchor; estimate continues from prior valid state | +| 2 | Compare per-frame outbound estimate vs GT for non-outlier frames | Error_after_outlier ≤ error_before_outlier + 50 m; covariance grows monotonically across the outlier event | + +**Expected outcome**: Outliers rejected; estimate degrades at most by 50 m drift; covariance monotonic. +**Max execution time**: 12 min. 
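
The step 2 pass criterion of FT-N-01 could be asserted in the harness roughly as follows. This is a minimal sketch, not the harness implementation: the function name `check_outlier_tolerance`, its inputs (per-frame horizontal errors in metres before and after the outlier event, plus one scalar covariance summary per frame across the event), and the choice to compare worst-case errors are all assumptions. Only the 50 m bound and the monotonic-covariance requirement come from the test definition.

```python
from typing import Sequence

DRIFT_BUDGET_M = 50.0  # from FT-N-01: error may grow by at most 50 m


def check_outlier_tolerance(
    errors_before_m: Sequence[float],
    errors_after_m: Sequence[float],
    cov_during_event: Sequence[float],
) -> list[str]:
    """Return the violated criteria; an empty list means PASS.

    Hypothetical helper: the inputs are assumed to be extracted by the
    harness from the outbound estimates and the ground truth.
    """
    failures = []
    # Worst-case comparison (assumption): max error after vs. before.
    if max(errors_after_m) > max(errors_before_m) + DRIFT_BUDGET_M:
        failures.append("error grew by more than 50 m after the outlier")
    # "Grows monotonically" is read here as non-decreasing (assumption).
    pairs = zip(cov_during_event, cov_during_event[1:])
    if any(later < earlier for earlier, later in pairs):
        failures.append("covariance decreased during the outlier event")
    return failures
```

Returning a list of violations rather than a single boolean keeps the `error_message` column of the report specific when more than one criterion fails.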
+ +--- + +### FT-N-02: Sharp-turn frame-to-frame failure expected + +**Summary**: Negative twin of FT-P-07 — validates that during a sharp turn, frame-to-frame may LEGITIMATELY fail, and the system labels accordingly. +**Traces to**: AC-3.2 (negative path) + +**Preconditions**: Same as FT-P-07. + +**Input data**: sharp-turn segment of Derkachi (or synthetic gyro overlay). + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Replay sharp-turn segment | During turn frames: `source_label` ∈ `{visual_propagated, dead_reckoned}`; covariance grows | +| 2 | After turn, observe label transition | Label returns to `satellite_anchored` once next anchor succeeds | + +**Expected outcome**: Sharp-turn frames correctly mark themselves as not-satellite-anchored; recovery exercised in FT-P-07. +**Max execution time**: 5 min. + +--- + +### FT-N-03: Extended outage triggers operator re-loc request + +**Summary**: Validates that on ≥3 consecutive frames AND ≥2 s without estimate, the SUT requests operator re-loc via telemetry and continues dead-reckoned propagation. +**Traces to**: AC-3.4 + +**Preconditions**: `derkachi-fixture` + 3-frame outage injector primed. + +**Input data**: synthetic outage on Derkachi. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Trigger 3-consecutive-frame failure (corrupt frames) | SUT fails to produce estimates for 3+ frames | +| 2 | Wait ≥2 s | STATUSTEXT containing `OPERATOR_RELOC_REQUEST` emitted to `mavproxy-listener` | +| 3 | During outage, observe FC outbound | Estimates labeled `dead_reckoned` continue; FC uses last-known + IMU extrapolation | + +**Expected outcome**: Re-loc request emitted; dead-reckoned estimates continue. +**Max execution time**: 60 s. 
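
The trigger condition of FT-N-03 (at least 3 consecutive frames and at least 2 s without an estimate) can be expressed as a small predicate over the replay timeline, which the harness could use to decide when the `OPERATOR_RELOC_REQUEST` STATUSTEXT should be expected. The sketch below is illustrative only: the function name, the `(timestamp_s, has_estimate)` input shape, and the choice to start the outage clock at the first failed frame are assumptions, not part of the specification.

```python
from typing import Iterable, Optional, Tuple

MIN_FAILED_FRAMES = 3  # from FT-N-03
MIN_OUTAGE_S = 2.0     # from FT-N-03


def reloc_request_due_at(
    frames: Iterable[Tuple[float, bool]],
) -> Optional[float]:
    """Return the first frame timestamp at which both thresholds are met.

    `frames` yields (timestamp_s, has_estimate) in time order. The outage
    clock starts at the first failed frame of the current run (assumption).
    Returns None if the condition is never met.
    """
    run_start = None
    run_length = 0
    for timestamp_s, has_estimate in frames:
        if has_estimate:
            # Any successful estimate resets the outage run.
            run_start, run_length = None, 0
            continue
        if run_start is None:
            run_start = timestamp_s
        run_length += 1
        if (run_length >= MIN_FAILED_FRAMES
                and timestamp_s - run_start >= MIN_OUTAGE_S):
            return timestamp_s
    return None
```

Because both thresholds must hold at once, a short burst of three corrupt frames at 3 Hz does not by itself make the request due; the run has to persist for the full 2 s.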
+ +--- + +### FT-N-04: Visual blackout + spoofed GPS combined failsafe + +**Summary**: Validates the AC-3.5 + AC-NEW-8 combined failsafe: switch label, reject spoof, propagate from last trusted state, monotonic covariance, STATUSTEXT. +**Traces to**: AC-3.5, AC-NEW-8 + +**Preconditions**: `blackout-spoof-derkachi` injector primed for 5 s, 15 s, 35 s windows; FC inbound stream patched to inject spoofed GPS. + +**Input data**: `blackout-spoof-derkachi` (each window run as a sub-case). + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Begin blackout window AND inject spoofed GPS in same temporal window | Within 1 frame OR 400 ms: `source_label = dead_reckoned`; spoofed GPS rejected from estimator input; covariance grows monotonically | +| 2 | Observe `horiz_accuracy` field in outbound `GPS_INPUT` (AP) | `horiz_accuracy` ≥ 95% covariance semi-major axis (no under-reporting) | +| 3 | Observe GCS stream | `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT at 1-2 Hz throughout blackout | +| 4 | For 35 s window only | Per AC-NEW-8: when 95% covariance crosses 100 m → fix-quality degraded; when crosses 500 m OR blackout exceeds 30 s → `horiz_accuracy=999.0` AND `VISUAL_BLACKOUT_FAILSAFE` STATUSTEXT | +| 5 | End blackout; restore FC GPS-health | Recovery only after FC GPS-health stable + non-spoofed for ≥10 s AND a visual/satellite consistency check succeeds | + +**Expected outcome**: All five steps' assertions pass for each window (step 4 applies to the 35 s window only). +**Max execution time**: 5 min (3 windows × ~90 s each). + +--- + +### FT-N-05: Stale-tile rejection on freshness violation + +**Summary**: Validates that tiles violating the AC-8.2 freshness window are rejected (or downgraded so they cannot produce a `satellite_anchored` label). +**Traces to**: AC-8.2, AC-NEW-6 + +**Preconditions**: `synth-age-tile-set` (`synth-age-7mo` for active-conflict, `synth-age-13mo` for rear) mounted instead of fresh fixture.
+ +**Input data**: `still-image-set-60` against the aged cache. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Replay against `synth-age-7mo` (configure SUT for active-conflict sector) | SUT either rejects load OR loads but never emits `satellite_anchored` from these tiles | +| 2 | Replay against `synth-age-13mo` (configure SUT for rear sector) | Same: reject or non-`satellite_anchored` only | + +**Expected outcome**: 0 frames emit `satellite_anchored` from aged tiles. +**Max execution time**: 5 min. + +--- + +### FT-N-06: Mid-flight tile freshness (current-timestamped) + +**Summary**: Validates that mid-flight-generated tiles are timestamped as current and treated as fresh per AC-NEW-6. +**Traces to**: AC-NEW-6 (positive sub-case) + +**Preconditions**: empty `mid-flight-tile-output`. + +**Input data**: `derkachi-fixture` 5 min segment. + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Start replay | SUT generates mid-flight tiles | +| 2 | Inspect each generated tile's manifest entry | `capture_date` is within ±60 s of generation wall-clock; treated as fresh by the freshness gate | + +**Expected outcome**: All mid-flight tiles current-timestamped and fresh. +**Max execution time**: 6 min. diff --git a/_docs/02_document/tests/environment.md b/_docs/02_document/tests/environment.md new file mode 100644 index 0000000..2e95180 --- /dev/null +++ b/_docs/02_document/tests/environment.md @@ -0,0 +1,248 @@ +# Test Environment + +## Overview + +**System under test (SUT)**: `gps-denied-onboard` companion-PC service that produces WGS84 position estimates from nav-camera frames + FC IMU/attitude and emits them to the FC over its native external-positioning interface. 
Public boundaries (the only surfaces tests interact with): + +- **Inbound — nav-camera frames**: V4L2 / GStreamer source (production: USB / MIPI-CSI / GigE per `restrictions.md`; tests: file-backed source replaying `_docs/00_problem/input_data/AD0000NN.jpg` or `flight_derkachi/flight_derkachi.mp4`). +- **Inbound — FC telemetry**: MAVLink (ArduPilot) or MSP2 (iNav) inbound stream carrying `SCALED_IMU2`, `ATTITUDE`, `GLOBAL_POSITION_INT` (or MSP equivalents). Tests replay `flight_derkachi/data_imu.csv` through a thin replayer. +- **Inbound — satellite tile cache**: filesystem + on-disk index (FAISS HNSW + tile manifest). Tests load a fixture cache mounted as a Docker volume. +- **Outbound — FC external-positioning**: MAVLink `GPS_INPUT` (ArduPilot Plane) OR MSP2 `MSP2_SENSOR_GPS` (iNav). Tests observe these by spinning up the corresponding open-source SITL and reading what reaches the FC. +- **Outbound — GCS telemetry**: MAVLink to QGroundControl (1-2 Hz downsample of estimates + STATUSTEXT). Tests subscribe via a passive MAVLink listener. +- **Outbound — Flight Data Recorder**: NVM filesystem (per AC-NEW-3). Tests read the resulting FDR archive after the run. + +**Consumer app purpose**: The e2e harness drives the SUT through these public boundaries — replaying frames + telemetry, mounting tile-cache fixtures, observing FC-side acceptance via SITL, and parsing FDR output. It NEVER imports SUT modules, NEVER queries SUT internal state, and NEVER touches the SUT's filesystem outside the FDR output directory. + +## Two-tier execution profile + +This project requires two distinct test environments because the production target is Jetson hardware and AC-4.1/AC-4.2/AC-NEW-5 cannot be honestly validated on a generic x86 dev workstation. 
+ +| Tier | Hardware | What it covers | What it skips | +|------|----------|----------------|---------------| +| **Tier-1 (workstation Docker)** | x86 dev workstation, optional NVIDIA dGPU for TensorRT validation | All `FT-*` correctness, schema, `NFT-RES-*` resilience scenarios, `NFT-SEC-*` security scenarios, `NFT-LIM-*` storage budgets | Any AC whose pass criterion is bound to Jetson Orin Nano Super wall-clock latency or thermal envelope: AC-4.1 / AC-4.2 / AC-NEW-1 / AC-NEW-5 | +| **Tier-2 (Jetson hardware loop)** | Jetson Orin Nano Super (pinned hardware per `restrictions.md`), thermal chamber for AC-NEW-5 | AC-4.1 latency p95, AC-4.2 memory, AC-NEW-1 cold-start TTFF, AC-NEW-5 thermal envelope (chamber-only) | Iteration speed (manual hardware time) | + +CI runs Tier-1 on every PR. Tier-2 runs on hardware-attached runners on a nightly cadence and pre-release gate; results are imported into the same CSV report format as Tier-1. + +## Docker Environment (Tier-1) + +### Services + +| Service | Image / Build | Purpose | Ports | +|---------|--------------|---------|-------| +| `gps-denied-onboard` | local build (`docker/Dockerfile`) | The SUT. Production binary built with `BUILD_VINS_MONO=OFF` per locked sub-decision D-C1-1-SUB-A; research builds run a parallel job with `BUILD_VINS_MONO=ON` | 14550/udp (MAVLink to GCS), 5760/tcp (MSP2 to iNav SITL) | +| `ardupilot-plane-sitl` | `ardupilot/ardupilot-sitl:plane-stable` | ArduPilot Plane SITL. Receives `GPS_INPUT` from the SUT; we read its EKF source-set state to validate AC-4.3, AC-NEW-2, AC-5.x | 14550/udp (MAVLink) | +| `inav-sitl` | `inavflight/inav-sitl:9.0.0` | iNav SITL. Receives `MSP2_SENSOR_GPS` from the SUT; we read its GPS provider state | 5760/tcp (MSP2 over TCP per iNav SITL convention) | +| `mock-suite-sat-service` | local build (`tests/fixtures/mock-suite-sat`) | Stubs the parent-suite Satellite Service tile-publish API (read-only ingest contract for AC-NEW-7 voting layer). 
Returns deterministic fixture tiles | 8080/tcp | +| `e2e-runner` | local build (`tests/runner`) | Pytest-based harness. Drives all replays, reads FDR output, spins SITL scenarios | — | +| `mavproxy-listener` | `ardupilot/mavproxy:latest` | Passive MAVLink listener that captures the SUT → GCS stream into a per-run `.tlog` for assertions | 14551/udp | + +### Networks + +| Network | Services | Purpose | +|---------|----------|---------| +| `e2e-net` | all | Isolated test network. No host networking, no internet. Per RESTRICT-SAT-1, the SUT must NEVER reach an external satellite provider during a flight; a deny-all egress rule on `e2e-net` enforces this and is itself a security test (NFT-SEC-02). | + +### Volumes + +| Volume | Mounted to | Purpose | +|--------|-----------|---------| +| `tile-cache-fixture` | `gps-denied-onboard:/var/azaion/tile-cache:ro` | Pre-built FAISS HNSW index + tile filesystem. Built once per test run from `tests/fixtures/tile-cache-builder/` from the 60 still-image satellite references and the Derkachi route bbox. Read-only mount mirrors AC-8.3 pre-flight load behavior. | +| `fdr-output` | `gps-denied-onboard:/var/azaion/fdr` | Per-flight FDR write target (AC-NEW-3 64 GB cap enforced via Docker `--storage-opt size=64g` on this volume) | +| `input-data` | `e2e-runner:/test-data:ro` | Bind mount of `_docs/00_problem/input_data/` for replay | +| `expected-results` | `e2e-runner:/expected:ro` | Bind mount of `_docs/00_problem/input_data/expected_results/` for assertions | + +### docker-compose structure + +```yaml +services: + gps-denied-onboard: + build: + context: ../.. 
+ dockerfile: docker/Dockerfile + args: + BUILD_VINS_MONO: "OFF" + networks: [e2e-net] + volumes: + - tile-cache-fixture:/var/azaion/tile-cache:ro + - fdr-output:/var/azaion/fdr + environment: + ONBOARD_FC_ADAPTER: ${FC_ADAPTER} # ardupilot | inav, set per scenario + ONBOARD_VIO_STRATEGY: ${VIO_STRATEGY} # okvis2 | klt_ransac (production); vins_mono only in research build + MAVLINK_SIGNING_PASSKEY_FILE: /run/secrets/mavlink_passkey + depends_on: + - mock-suite-sat-service + + ardupilot-plane-sitl: + image: ardupilot/ardupilot-sitl:plane-stable + networks: [e2e-net] + command: ["--vehicle=ArduPlane", "--gps-type=14"] # GPS_TYPE=14 = MAV per ArduPilot SITL_simulation_parameters.html + + inav-sitl: + image: inavflight/inav-sitl:9.0.0 + networks: [e2e-net] + # iNav SITL exposes MSP on TCP 5760 (UART1) per docs/SITL/SITL.md + + mock-suite-sat-service: + build: ../fixtures/mock-suite-sat + networks: [e2e-net] + # Egress restriction enforced at network level, not service level + + e2e-runner: + build: ../runner + networks: [e2e-net] + volumes: + - ../../_docs/00_problem/input_data:/test-data:ro # bind mount, see Volumes table + - ../../_docs/00_problem/input_data/expected_results:/expected:ro # bind mount, see Volumes table + - fdr-output:/fdr:ro + depends_on: + - gps-denied-onboard + - ardupilot-plane-sitl + - inav-sitl + - mavproxy-listener + + mavproxy-listener: + image: ardupilot/mavproxy:latest + networks: [e2e-net] + +networks: + e2e-net: + driver: bridge + internal: true # NO external connectivity (enforces RESTRICT-SAT-1) + +volumes: + tile-cache-fixture: {} + fdr-output: {} +``` + +## Consumer Application + +**Tech stack**: Python 3.12, pytest 8.x, pymavlink (MAVLink ground side), `msp_gps_toy` (MSP2 ground side, Rust binary called via subprocess), OpenCV ≥4.12.0 (frame source replay), numpy + scipy (geodesic-distance assertions in WGS84). + +**Entry point**: `pytest tests/e2e/` from inside `e2e-runner`. Each scenario is a parameterized pytest case keyed by FC adapter (`ardupilot` / `inav`).
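
The geodesic-distance assertions mentioned in the tech stack could look like the sketch below. It is an illustration under stated assumptions rather than the harness code: it uses only the standard library (not numpy/scipy) so that it is self-contained, it approximates the WGS84 ellipsoid with a sphere of mean radius 6 371 008.8 m (adequate for tolerances of tens of metres), and the helper names `dege7_to_deg`, `haversine_m`, and `within_tolerance` are hypothetical. The 1e7 scaling reflects the MAVLink `GPS_INPUT` convention of integer latitude/longitude in degrees × 1e7.

```python
import math

# Spherical approximation of the WGS84 ellipsoid (assumption of this sketch).
EARTH_MEAN_RADIUS_M = 6_371_008.8


def dege7_to_deg(value_e7: int) -> float:
    """Convert an integer degE7 coordinate (degrees * 1e7) to degrees."""
    return value_e7 / 1e7


def haversine_m(lat1_deg: float, lon1_deg: float,
                lat2_deg: float, lon2_deg: float) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1_deg), math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2_deg - lon1_deg)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_MEAN_RADIUS_M * math.asin(math.sqrt(a))


def within_tolerance(est_lat_e7: int, est_lon_e7: int,
                     ref_lat_deg: float, ref_lon_deg: float,
                     tolerance_m: float) -> bool:
    """True if a degE7 estimate lies within tolerance_m of the reference."""
    distance = haversine_m(dege7_to_deg(est_lat_e7), dege7_to_deg(est_lon_e7),
                           ref_lat_deg, ref_lon_deg)
    return distance <= tolerance_m
```

A check such as the ±50 m cold-start tolerance in FT-P-11 would then reduce to a single `within_tolerance(...)` call per captured message. If sub-metre agreement with the ellipsoid ever matters, a proper geodesic solver would be the safer choice.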
+ +### Communication with system under test + +| Interface | Protocol | Endpoint / Topic | Authentication | +|-----------|----------|-----------------|----------------| +| Frame source | V4L2 / GStreamer file source | UNIX domain socket / shared `/test-data` mount | none (local) | +| FC telemetry inbound | MAVLink (AP) or MSP2 (iNav) | `udp:gps-denied-onboard:14550` (AP) or `tcp:gps-denied-onboard:5760` (iNav) | MAVLink 2.0 message signing on AP per D-C8-9 (passkey via Docker secret); iNav unsigned per accepted residual risk | +| Tile cache | Filesystem read | `/var/azaion/tile-cache` (read-only mount) | filesystem perms | +| FC external-pos outbound observation | Read SITL EKF source-set + GLOBAL_POSITION_INT replay back from SITL | `udp:ardupilot-plane-sitl:14550` or `tcp:inav-sitl:5760` | passive listener | +| GCS telemetry observation | MAVLink listener | `udp:mavproxy-listener:14551` (forwarded from SUT 14550) | none | +| FDR output | Filesystem read post-run | `/fdr` (read-only mount) | filesystem perms | +| Suite Sat Service mock | HTTP/JSON | `http://mock-suite-sat-service:8080` | none (test) | + +### What the consumer does NOT have access to + +- No direct access to the SUT's internal state (GTSAM iSAM2 graph, in-memory FAISS index, OpenCV intermediate buffers, VioStrategy implementation pointer). +- No internal Python/C++ module imports from the SUT. +- No shared memory or filesystem with the SUT outside the four explicit mounts (`tile-cache-fixture` r/o, `fdr-output` r/o from runner side, `input-data` r/o, `expected-results` r/o). +- No bypass of the FC-side acceptance check — every AC-4.3 assertion goes through SITL. + +## CI/CD Integration + +**When to run**: +- Tier-1 (workstation Docker): on every PR to the `dev` branch and nightly on `dev` HEAD. +- Tier-2 (Jetson hardware loop): nightly on `dev`, and as a hard gate before any release tag. +- AC-NEW-5 thermal envelope: quarterly on chamber-attached Jetson runner; failures block release tags only.
+ +**Pipeline stage**: +- Tier-1 fits in the standard CI matrix as a single job (~30-45 min wall-clock for the full suite at first cut). +- Tier-2 is a separate workflow on `self-hosted-jetson-orin` runner. + +**Gate behavior**: Tier-1 blocks PR merge on any test failure. Tier-2 blocks release tag on any test failure. Chamber tests are warning-only on PRs and blocking on release tags. + +**Timeout**: +- Tier-1: 60 min per matrix entry. +- Tier-2: 4 hr per matrix entry (allows for full Derkachi 8 min replay × ~10 scenarios + cold-boot loops). +- Thermal chamber AC-NEW-5: 9 hr (8 h hot-soak + setup/teardown). + +## Reporting + +**Format**: CSV (one row per test). + +**Columns**: `test_id, test_name, traces_to, fc_adapter, vio_strategy, tier, started_at_utc, execution_time_ms, result, error_message, evidence_paths` + +- `traces_to`: comma-separated AC/RESTRICT IDs from the traceability matrix. +- `fc_adapter`: `ardupilot` | `inav` | `n/a`. +- `vio_strategy`: `okvis2` | `klt_ransac` | `vins_mono` | `n/a` (research-build only for `vins_mono`). +- `tier`: `tier1-docker` | `tier2-jetson` | `tier2-chamber`. +- `result`: `PASS` | `FAIL` | `SKIP` | `XFAIL` (XFAIL only allowed for AC explicitly marked NOT COVERED in the traceability matrix and not yet promoted to a real test). +- `evidence_paths`: comma-separated paths inside the run-output bundle (`.tlog` files, FDR archives, screenshots, profiler traces) supporting the verdict. + +**Output path**: `e2e-results/run-${RUN_ID}/report.csv` plus a per-run bundle of evidence at `e2e-results/run-${RUN_ID}/evidence/`. + +## Test Execution + +**Decision (2026-05-09)**: **both** — Tier-1 Docker + Tier-2 Jetson hardware loop. Confirmed at the Hardware-Dependency Assessment Step 4 gate. 
+ +### Hardware dependencies found (Phase 3 → Hardware Assessment scan) + +| Category | Indicator | Source file | +|---|---|---| +| GPU / CUDA | TensorRT engines (`.engine`, SM 87, JetPack 6.2, TRT 10.3) | `_docs/01_solution/solution.md` PRE-FLIGHT block | +| GPU / CUDA | DISK+LightGlue FP16 inference | `_docs/01_solution/solution.md` RUNTIME block (C3) | +| GPU / CUDA pin | Jetson Orin Nano Super (67 TOPS sparse INT8, 8 GB shared LPDDR5, 25 W) | `_docs/00_problem/restrictions.md` § Onboard Hardware | +| Sensors / Cameras | ADTi 20MP 20L V1 nadir camera over USB / MIPI-CSI / GigE | `_docs/00_problem/restrictions.md` § Cameras | +| Sensors / Cameras | V4L2 / GStreamer frame source (production) | `_docs/02_document/tests/environment.md` § Overview | +| OS-specific services | High-rate IMU via UART/MAVLink to FC | `_docs/00_problem/restrictions.md` § Sensors & Integration | +| OS-specific services | Per-FC inbound (MAVLink GPS_INPUT for AP, MSP2 over UART for iNav) | `_docs/00_problem/restrictions.md` § Sensors & Integration | +| OS-specific services | tegrastats / jetson_stats for thermal telemetry | `_docs/02_document/tests/resource-limit-tests.md` NFT-LIM-04 | +| Thermal envelope | -20 °C to +50 °C operating envelope, 25 W TDP, 8 h duty cycle | `_docs/00_problem/restrictions.md` § Failsafe & Safety + AC-NEW-5 | + +(Step 2 Code scan returned zero indicators because no source code exists yet — this is the planning phase. Decompose → Implement will produce `requirements.txt` / `pyproject.toml` / `Cargo.toml` entries that confirm: `tensorrt`, `pycuda`, `pymavlink`, `gtsam`, `faiss-gpu`, `opencv-python>=4.12.0`, `jetson-stats`.) + +### Execution instructions — Tier-1 (Docker) + +**Prerequisites**: +- Docker 24+ with Compose v2. +- NVIDIA Container Toolkit if the workstation has an NVIDIA dGPU (lets the SUT exercise the TensorRT path; otherwise the SUT falls back to a CPU inference path, since TensorRT itself requires an NVIDIA GPU). +- ≥16 GB host RAM, ≥80 GB free disk for `tile-cache-fixture` + `fdr-output` + image build cache.
+ +**How to start**: +```bash +cd e2e/docker +export FC_ADAPTER=ardupilot # or: inav (parameterized per scenario in CI) +export VIO_STRATEGY=okvis2 # or: klt_ransac (production binary) +docker compose -f docker-compose.test.yml up --build --abort-on-container-exit e2e-runner +``` +The run writes its report to `./e2e-results/run-${RUN_ID}/report.csv` (see § Reporting). Exit code matches the test verdict. + +**Environment variables**: +- `FC_ADAPTER` ∈ `{ardupilot, inav}` — selects which SITL the SUT talks to. +- `VIO_STRATEGY` ∈ `{okvis2, klt_ransac}` for the production binary; `vins_mono` is available only in the research build (`BUILD_VINS_MONO=ON`). +- `MAVLINK_SIGNING_PASSKEY_FILE` — path to the Docker secret loaded with the test passkey for FT-P-09-AP / NFT-SEC-03. + +**Skipped on Tier-1**: `NFT-PERF-01` (AC-4.1 latency p95 — Jetson-bound), `NFT-LIM-01` (AC-4.2 memory — Jetson-bound), `NFT-PERF-03` (AC-NEW-1 cold-start — Jetson-bound), `NFT-LIM-04` (AC-NEW-5 chamber baseline — Jetson-bound), AC-NEW-5 chamber portion (chamber-bound). + +### Execution instructions — Tier-2 (Jetson hardware loop) + +**Prerequisites**: +- Jetson Orin Nano Super (per `restrictions.md` § Onboard Hardware). +- JetPack 6.2 + CUDA + TensorRT 10.3 + cuDNN per D-C7-9. +- Workstation thermal-day environment for NFT-LIM-04 baseline. Chamber-attached runner for AC-NEW-5 chamber portion (separate quarterly job; not run in standard CI). +- ArduPilot Plane SITL + iNav SITL run on the same Jetson, OR on a paired x86 host on the same network — both are supported. +- Real ADTi 20MP 20L V1 camera connected via USB/MIPI-CSI/GigE; OR file-replay source if the camera is unavailable (in which case all `AC-2.x` cross-validation is `XFAIL` for that run).
+ +**How to start**: +```bash +cd e2e/jetson +sudo systemctl restart gps-denied-onboard.service +./run-tier2.sh --fc-adapter ardupilot --vio-strategy okvis2 --duration 8h +# or: +./run-tier2.sh --fc-adapter inav --vio-strategy klt_ransac --duration 5min +``` +Outputs the same CSV format as Tier-1 (one report.csv per run). + +**Environment variables**: same as Tier-1 plus: +- `TIER2_CHAMBER_AMBIENT_C` — ambient temperature for AC-NEW-5 chamber runs. +- `TIER2_CAMERA_DEVICE` — `/dev/video0` (production) or file path for replay mode. + +### CI runner mapping + +- `ubuntu-24.04` (GitHub-hosted) → Tier-1 Docker, every PR + nightly. ~30-45 min per matrix entry. +- `self-hosted-jetson-orin` → Tier-2 Jetson, nightly on `dev` HEAD + pre-release gate. ~4 hr per matrix entry. +- `self-hosted-jetson-orin-chamber` → AC-NEW-5 hot-soak. Quarterly + before any release tag. ~9 hr. + +**Matrix dimensions**: `FC_ADAPTER × VIO_STRATEGY × build_kind` where `build_kind ∈ {production, research}`. Production `vins_mono` is excluded (D-C1-1-SUB-A locked); research includes all three VioStrategy values. diff --git a/_docs/02_document/tests/performance-tests.md b/_docs/02_document/tests/performance-tests.md new file mode 100644 index 0000000..2b0c4ea --- /dev/null +++ b/_docs/02_document/tests/performance-tests.md @@ -0,0 +1,126 @@ +# Performance Tests + +All performance tests honor the per-tier execution profile from `environment.md`. Latency and memory tests bound to Jetson Orin Nano Super hardware run on Tier-2 only; metrics that don't depend on hardware (e.g. inter-emit interval correctness, GCS rate) run on both tiers. + +### NFT-PERF-01: End-to-end latency p95 budget + +**Summary**: Validates the AC-4.1 end-to-end latency budget (camera capture → GPS to FC) on the pinned hardware. +**Traces to**: AC-4.1, D-CROSS-LATENCY-1 +**Metric**: Wall-clock latency from frame-capture timestamp to outbound `GPS_INPUT` (AP) / `MSP2_SENSOR_GPS` (iNav) reception at the SITL container. 
+ +**Preconditions**: +- Tier-2 only — Jetson Orin Nano Super, JetPack 6.2, TensorRT 10.3 per D-C7-9. +- `tile-cache-fixture` pre-loaded. +- SUT cold-started THEN warmed up for 30 s of replay before measurement window starts. +- Two configurations measured: (a) `K=3` baseline at +25 °C, (b) `K=2 + Jacobian-cov` hybrid auto-degrade at +50 °C ambient (NFT-9 in the solution draft). + +**Steps**: + +| Step | Consumer Action | Measurement | +|------|----------------|-------------| +| 1 | Run 30 s warm-up replay (excluded from measurement) | none | +| 2 | Run 5 min Derkachi replay at 3 Hz target cadence | per-frame latency: `t_emit_at_sitl − t_capture` | +| 3 | Record per-frame latency to CSV; compute p50, p95, p99 | distribution | +| 4 | Repeat at +50 °C ambient (chamber if available, else flagged) | distribution under thermal-throttle hybrid | + +**Pass criteria**: +- (a) `K=3` baseline: p95 ≤ 400 ms (AC-4.1 hard bound). +- (b) `K=2 + Jacobian-cov` hybrid: p95 ≤ 400 ms still satisfied after auto-degrade (proves D-CROSS-LATENCY-1 effective). +- ≤10% frame drops under sustained load (AC-4.1 allowance). +- Per-stage latency partitioning (D-CROSS-LATENCY-1 table) recorded for all stages: C1 OKVIS2 / C2 UltraVPR / C2.5 / C3 / C3.5 / C4 / C4 cov / C5 / serialization / OS jitter — used in NFT-PERF-01 evidence bundle for budget-margin tracking. + +**Duration**: 2 × 5.5 min replays (warm-up + measurement) per configuration; ~25 min total per FC adapter. + +--- + +### NFT-PERF-02: Frame-by-frame streaming (no batching) + +**Summary**: Validates AC-4.4 — estimates streamed frame-by-frame with no batching/delay. +**Traces to**: AC-4.4 +**Metric**: Inter-emit interval at SITL. + +**Preconditions**: +- Tier-1 OR Tier-2. +- SUT warmed up for 30 s. 
+ +**Steps**: + +| Step | Consumer Action | Measurement | +|------|----------------|-------------| +| 1 | Replay Derkachi 5 min at 3 Hz | per-frame inter-emit interval at SITL | +| 2 | Compute distribution | p95 of inter-emit interval | + +**Pass criteria**: p95 inter-emit interval ≤ inter-frame-interval × 1.05 (i.e. ≤ ~350 ms at 3 Hz target). No run of ≥3 consecutive missed emits. + +**Duration**: 6 min. + +--- + +### NFT-PERF-03: Cold-start TTFF + +**Summary**: Validates AC-NEW-1 cold-start time-to-first-fix from companion boot. +**Traces to**: AC-NEW-1 +**Metric**: Wall-clock from SUT container-ready event (or `systemctl start` on Tier-2) to first valid outbound `GPS_INPUT` / `MSP2_SENSOR_GPS` arrival at SITL. + +**Preconditions**: +- Tier-2 (Jetson) for the canonical run; Tier-1 acceptable for trend-tracking. +- `cold-boot-fixture` provides the FC EKF snapshot (loaded into SITL before the SUT cold boot). +- `tile-cache-fixture` already mounted (cache-load is part of the TTFF budget per AC-NEW-1 wording "from boot"). +- 50 cold boots executed back-to-back to populate distribution. + +**Steps**: + +| Step | Consumer Action | Measurement | +|------|----------------|-------------| +| 1 | Stop SUT; clear in-memory state | container down | +| 2 | Start SUT (record `t_start`) | timestamp | +| 3 | First outbound message arrives at SITL (record `t_first_emit`) | TTFF = `t_first_emit − t_start` | +| 4 | Repeat 50 times | distribution | + +**Pass criteria**: p95 TTFF < 30 s. + +**Duration**: ~30 min (50 × ~30 s + restart overhead). + +--- + +### NFT-PERF-04: Spoofing-promotion latency + +**Summary**: Validates AC-NEW-2 — when the FC signals GPS denial/spoof, the onboard estimate is promoted to the FC's primary position source in < 3 s (p95). +**Traces to**: AC-NEW-2 +**Metric**: Latency from spoof-onset signal to FC-side EKF source-set switch (AP: `EK3_SRC1_POSXY` flips to companion-source value; iNav: GPS provider state reflects companion as primary).
+ +**Preconditions**: +- Tier-1 acceptable (mostly software loops + SITL). +- `derkachi-fixture` running with SUT in `satellite_anchored` steady state. +- Spoof injector primed. + +**Steps**: + +| Step | Consumer Action | Measurement | +|------|----------------|-------------| +| 1 | Inject false GPS into FC SITL (record `t_spoof_onset`) | timestamp | +| 2 | Observe FC EKF source-set state via parameter read polling at 100 Hz (record `t_promotion`) | promotion latency = `t_promotion − t_spoof_onset` | +| 3 | Repeat 50 trials per FC (parameterized on `ardupilot` + `inav`) | distribution per FC | + +**Pass criteria**: p95 < 3 s on both FCs. + +**Duration**: ~25 min per FC (50 trials × ~30 s including pre-trial reset). + +--- + +### Per-stage latency partition record (informational, not pass/fail) + +NFT-PERF-01 captures per-stage latencies matching the D-CROSS-LATENCY-1 partition table from `solution.md`. The recorded targets are tracked for budget-margin trend (regression detector), not as independent pass/fail thresholds — only AC-4.1 p95 ≤ 400 ms is the hard gate. 
+ +| Stage | K=3 target p95 | K=2 hybrid target p95 | +|-------|---------------|----------------------| +| C1 OKVIS2 VIO | ≤ 60 ms | ≤ 60 ms | +| C2 UltraVPR query | ≤ 15 ms | ≤ 15 ms | +| C2.5 Top-N re-rank | ≤ 80 ms | ≤ 80 ms | +| C3 DISK+LightGlue × N | ≤ 200 ms (steady) | ≤ 140 ms (thermal) | +| C3.5 AdHoP (conditional, p99) | ≤ 100 ms when triggered | ≤ 60 ms when triggered | +| C4 solvePnPRansac | ≤ 25 ms | ≤ 25 ms | +| C4 covariance recovery | ≤ 100 ms (steady) | ≤ 25 ms (thermal) | +| C5 iSAM2 update | ≤ 15 ms | ≤ 15 ms | +| MAVLink/MSP2 + UART/USB | ≤ 30 ms | ≤ 30 ms | +| OS scheduling jitter (p99) | ≤ 50 ms | ≤ 50 ms | diff --git a/_docs/02_document/tests/resilience-tests.md b/_docs/02_document/tests/resilience-tests.md new file mode 100644 index 0000000..dc62acf --- /dev/null +++ b/_docs/02_document/tests/resilience-tests.md @@ -0,0 +1,108 @@ +# Resilience Tests + +### NFT-RES-01: FC IMU-only fallback after >3 s without estimate + +**Summary**: Validates AC-5.2 — on >3 s without an estimate, the FC falls back to IMU-only dead reckoning AND the SUT logs the failure. +**Traces to**: AC-5.2 + +**Preconditions**: +- SUT in `satellite_anchored` steady state on Derkachi replay. +- 4 s outage injector primed (replay paused for 4 s of wall-clock). + +**Fault injection**: +- Pause frame source for 4 s of wall-clock while FC IMU stream continues. + +**Steps**: + +| Step | Action | Expected Behavior | +|------|--------|------------------| +| 1 | Mid-replay, halt frame delivery for 4 s | SUT continues emitting `dead_reckoned` estimates from FC IMU/attitude propagation | +| 2 | After 3 s without an emit (i.e. 
SUT internally fails to update for >3 s), SUT logs `NO_ESTIMATE_TIMEOUT` | FDR contains the log entry | +| 3 | Observe FC EKF source-set transition | EKF source-set transitions to internal IMU-only on the FC side per the FC's own failsafe logic (AP `EKF_FAILSAFE` or equivalent on iNav) | +| 4 | Resume frame delivery | SUT recovers; FC EKF source-set returns to companion-GPS source | + +**Pass criteria**: +- `NO_ESTIMATE_TIMEOUT` logged within 200 ms of the 3 s mark. +- FC EKF reflects the transition. +- Recovery on resume happens within 5 emit cycles. + +--- + +### NFT-RES-02: Companion mid-flight reboot + +**Summary**: Validates AC-5.3 — on companion reboot mid-flight, SUT re-initializes from FC's current IMU-extrapolated position. +**Traces to**: AC-5.3 + +**Preconditions**: +- SUT in steady state on Derkachi replay. +- FC SITL has been running long enough to have a stable IMU-extrapolated pose. + +**Fault injection**: +- `docker compose restart gps-denied-onboard` mid-replay (or `systemctl restart` on Tier-2). + +**Steps**: + +| Step | Action | Expected Behavior | +|------|--------|------------------| +| 1 | At t=120 s of replay, restart SUT container | SUT goes down and back up | +| 2 | Wait for first post-restart `GPS_INPUT` / `MSP2_SENSOR_GPS` arrival | First emit lat/lon within ±100 m of FC's IMU-extrapolated pose at boot-complete time | +| 3 | Observe TTFF post-reboot | Within AC-NEW-1 budget (<30 s p95) | + +**Pass criteria**: +- First post-restart emit ±100 m of FC pose at boot-complete. +- Cold-restart TTFF < 30 s. +- No FC-side EKF divergence event during the gap. + +--- + +### NFT-RES-03: False-position safety budget Monte Carlo + +**Summary**: Validates AC-NEW-4 false-position safety budget (`P(error > 500 m) < 0.1%`, `P(error > 1 km) < 0.01%`) on the available data + synthesis. PARTIAL — multi-flight statistics constrained by single Derkachi flight + 60 stills (see traceability matrix flag). 
+**Traces to**: AC-NEW-4 (PARTIAL) + +**Preconditions**: +- Tier-1 acceptable (statistical rather than hardware-bound). +- Pull together: 60 still-image runs (60 frames) + Derkachi replay (~14,700 frames at 30 fps OR resampled to ~1,470 frames at 3 Hz target, i.e. ~490 s × 3 Hz). Total ≥1,530 frames per Monte Carlo iteration. +- Run M=50 Monte Carlo iterations with synthetic perturbations (camera-pose noise, IMU bias drift, randomized tile sub-selection). + +**Fault injection**: +- Add per-iteration synthetic perturbations to mimic a population of independent flights. + +**Steps**: + +| Step | Action | Expected Behavior | +|------|--------|------------------| +| 1 | Run M iterations end-to-end | Per-iteration error distribution captured | +| 2 | Aggregate across all iterations × frames | Per-frame error CDF | +| 3 | Read off `P(error > 500 m)` and `P(error > 1 km)` from CDF | Both values | + +**Pass criteria** (PARTIAL): +- `P(error > 500 m) < 0.1%`. +- `P(error > 1 km) < 0.01%`. +- Test FAILS-OPEN with explicit "PARTIAL" annotation in CSV report when iteration count is below the AC-NEW-4-implied ≥100 flights; this is noted as reduced confidence pending D-PROJ-3 (AerialVL S03 + own multi-flight data). + +--- + +### NFT-RES-04: Visual blackout + spoof degraded-mode escalation + +**Summary**: Validates the AC-NEW-8 escalation ladder (5 s, 15 s, 35 s blackouts paired with spoof) including the 100 m / 500 m covariance thresholds and the 10 s GPS-health gate before recovery. +**Traces to**: AC-NEW-8 (twin of FT-N-04 with extended duration window and covariance assertions) + +**Preconditions**: Same as FT-N-04; Tier-1 acceptable. + +**Fault injection**: `blackout-spoof-derkachi` 5 s / 15 s / 35 s windows + spoofed FC GPS for the same windows.
+ +**Steps**: + +| Step | Action | Expected Behavior | +|------|--------|------------------| +| 1 | Begin 5 s window | Mode transition ≤ 400 ms; covariance grows monotonically; spoofed GPS rejected | +| 2 | At end of 5 s window, attempt recovery | Recovery only after FC GPS-health stable + non-spoofed for ≥10 s AND visual/satellite consistency check succeeds (gate enforced) | +| 3 | Begin 15 s window | Same as step 1 plus when 95% covariance crosses 100 m: outbound MAVLink fix-quality degraded to "2D fix or worse" | +| 4 | Begin 35 s window | Plus when 95% covariance crosses 500 m OR blackout exceeds 30 s: `horiz_accuracy=999.0` + `VISUAL_BLACKOUT_FAILSAFE` STATUSTEXT emitted | + +**Pass criteria**: +- All four assertions fire at the right thresholds. +- Recovery gate is honored — early recovery attempts (FC GPS healthy for <10 s) MUST NOT promote spoofed GPS back into the estimator. + +**Duration**: ~10 min total for three windows. diff --git a/_docs/02_document/tests/resource-limit-tests.md b/_docs/02_document/tests/resource-limit-tests.md new file mode 100644 index 0000000..c87ff74 --- /dev/null +++ b/_docs/02_document/tests/resource-limit-tests.md @@ -0,0 +1,100 @@ +# Resource Limit Tests + +### NFT-LIM-01: Jetson memory ≤ 8 GB throughout 8 h replay + +**Summary**: Validates AC-4.2 — memory < 8 GB shared on Jetson Orin Nano Super for the full duty cycle. +**Traces to**: AC-4.2, RESTRICT-HW-1 + +**Preconditions**: +- Tier-2 only (Jetson hardware). +- `tile-cache-fixture` mounted. +- 8 h Derkachi replay loop (~60 loops of the 490 s fixture, OR a wrapped 8 h synthetic load that holds the same operating mix per AC-NEW-3 8 h synthetic-load definition). + +**Monitoring**: +- `jetson_stats` (`jtop` API) RAM usage sampled at 1 Hz. +- Per-component memory annotation if SUT exposes it via `NAMED_VALUE_FLOAT` / FDR. +- Swap usage (must remain 0 — Jetson Orin Nano Super has no swap by default). + +**Duration**: 8 h. 
+ +**Pass criteria**: Peak RSS ≤ 8 GB across the entire 8 h window; swap stays 0. + +--- + +### NFT-LIM-02: FDR ≤ 64 GB / flight (8 h synthetic load) + +**Summary**: Validates AC-NEW-3 — per-flight FDR ≤ 64 GB; oldest segment dropped on rollover; no payload class silently dropped without a logged rollover. +**Traces to**: AC-NEW-3 + +**Preconditions**: +- Tier-1 acceptable (storage budget is policy/rotation driven, not Jetson-specific). +- `fdr-output` Docker volume sized exactly 64 GB. +- 8 h Derkachi replay loop at 3 Hz nav frames (per AC-NEW-3 validation wording). + +**Monitoring**: +- Total `fdr-output` volume size at 1-min sample rate. +- Per-payload-class size: per-frame estimates + IMU traces + emitted MAVLink + raw MAVLink (tlog) + system health + mid-flight tiles + ≤0.1 Hz failed-tile-gen thumbnails. +- Rollover-event log entries (count, timestamp, dropped-segment ID). + +**Duration**: 8 h synthetic. + +**Pass criteria**: +- Volume never exceeds 64 GB. +- Every drop event has a corresponding rollover log entry (no silent drops). +- All payload classes enumerated in AC-NEW-3 are present (no class missing entirely). + +--- + +### NFT-LIM-03: Tile cache ≤ 10 GB across operational area + +**Summary**: Validates RESTRICT-SAT-2 — cache budget 10 GB persistent across the ~400 km² operational area, including manifests, overviews, and any precomputed indices. +**Traces to**: RESTRICT-SAT-2, AC-8.3 + +**Preconditions**: +- `tile-cache-fixture` covers the full operational-area footprint (still-image + Derkachi route bbox, target ~400 km² for parity). + +**Monitoring**: +- Total tile-cache size on disk. +- Per-component breakdown: tile filesystem, tile manifest DB (PostgreSQL btree per `solution.md`), FAISS HNSW index, descriptor cache. + +**Duration**: one-shot measurement after fixture build + after a 5 min replay (to catch any descriptor-on-demand growth). + +**Pass criteria**: Total cache size ≤ 10 GB at both measurement points. 
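The NFT-LIM-03 size check above could be approximated with a recursive walk over the cache root. This is a minimal sketch under stated assumptions, not the project's harness: the helper names `directory_size_bytes` and `check_cache_budget` are hypothetical, "10 GB" is read as decimal gigabytes (10 × 10^9 bytes), and anything stored outside the cache directory (for example a tile manifest held in a database server) would have to be measured separately and added in.

```python
import os

# Assumption: the 10 GB budget is interpreted as decimal gigabytes.
CACHE_BUDGET_BYTES = 10 * 10**9


def directory_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def check_cache_budget(size_bytes: int, budget_bytes: int = CACHE_BUDGET_BYTES) -> bool:
    """Return True when the measured cache size is within the budget."""
    return size_bytes <= budget_bytes
```

Calling `directory_size_bytes` once after the fixture build and once after the 5 min replay would give the two measurement points the test asks for; calling it per subdirectory would give a rough per-component breakdown.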
+ +--- + +### NFT-LIM-04: No thermal throttling at 25 W TDP (workstation thermal-day baseline) + +**Summary**: Tier-2 baseline of AC-NEW-5 thermal-throttle behavior at workstation ambient temperature. Full chamber test at +50 °C is deferred to the AC-NEW-5 chamber gate (out-of-scope for data-acquisition per Phase 1 gate). +**Traces to**: AC-NEW-5 (PARTIAL), RESTRICT-HW-1 + +**Preconditions**: +- Tier-2 (Jetson) at workstation ambient (~25 °C). +- 8 h Derkachi replay loop sustaining 25 W TDP. + +**Monitoring**: +- `tegrastats`: GPU/CPU clock, GR3D_FREQ, RAM, temperatures, power-rail draw, throttle events. + +**Duration**: 8 h. + +**Pass criteria**: +- 0 thermal throttle events at workstation ambient. +- Average power draw ≤ 25 W. +- Hot-soak chamber test at +50 °C is OUT OF SCOPE for data-acquisition; tracked as deferred AC-NEW-5 chamber gate. The test is expected to be exercised on a chamber-attached Jetson runner before any release tag. + +--- + +### NFT-LIM-05: Disk storage budget (cache 10 GB + FDR 64 GB) + +**Summary**: Validates the combined storage budget per `restrictions.md` § Onboard Hardware: the deployed Jetson must have available storage of at least the tile cache (~10 GB) plus the per-flight FDR (64 GB). +**Traces to**: RESTRICT-HW-1 (storage budget) + +**Preconditions**: +- Tier-2 acceptance run on the deployed-image Jetson. + +**Monitoring**: +- Available storage on the production storage device after a single fresh install of SUT + fixtures. + +**Duration**: one-shot. + +**Pass criteria**: Available storage ≥ 74 GB (10 GB cache + 64 GB FDR) after a fresh install of system + SUT + fixtures, so the full budget remains free on top of system files and logs.
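The NFT-LIM-05 one-shot check could be sketched with the standard library's `shutil.disk_usage`. The function names and the decimal-GB reading of the 74 GB figure are assumptions made for illustration; the mount point to probe is whatever the deployed image uses for cache and FDR storage, which this spec does not name.

```python
import shutil

# Assumption: 10 GB cache + 64 GB FDR, both read as decimal gigabytes.
REQUIRED_FREE_BYTES = (10 + 64) * 10**9


def free_bytes(mount_point: str) -> int:
    """Bytes available on the filesystem that contains mount_point."""
    return shutil.disk_usage(mount_point).free


def storage_budget_ok(free: int, required: int = REQUIRED_FREE_BYTES) -> bool:
    """Return True when the free space covers the combined cache + FDR budget."""
    return free >= required
```

A harness would call `storage_budget_ok(free_bytes(path))` once after the fresh install and record both the raw byte count and the verdict.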
diff --git a/_docs/02_document/tests/security-tests.md b/_docs/02_document/tests/security-tests.md new file mode 100644 index 0000000..38dcb01 --- /dev/null +++ b/_docs/02_document/tests/security-tests.md @@ -0,0 +1,97 @@ +# Security Tests + +These tests cover the security-relevant AC and the Mode B revisions that introduced explicit security gates: D-CROSS-CVE-1 (OpenCV CVE pin), D-C8-9 (MAVLink 2.0 message signing), AC-NEW-7 (cache poisoning), and RESTRICT-SAT-1 / AC-8.1 (no in-flight Service calls). + +### NFT-SEC-01: Cache-poisoning safety budget + +**Summary**: Validates AC-NEW-7 — across all onboard tiles written, `P(geo-misalign > 30 m) < 1%` and `P(> 100 m) < 0.1%`. Multi-flight statistics constrained — PARTIAL with current single-flight fixture (see traceability matrix). +**Traces to**: AC-NEW-7, Mode B Fact #105 (Service voting layer external dependency), D-PROJ-2 + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | Run 3 trial flights against `derkachi-fixture` with synthetic over-confidence injection (deflate covariance ×1.5, ×2, ×3) | Each flight produces mid-flight tiles uploaded to `mock-suite-sat-service` | +| 2 | After each flight, the mock service records each received tile's quality metadata + onboard-asserted geo-alignment vs the GT-derived geo-alignment | Per-tile mis-alignment captured | +| 3 | Across all uploaded tiles, compute `P(misalign > 30 m)` and `P(misalign > 100 m)` | Statistic computed | +| 4 | Independently observe Suite Sat Service voting-layer behavior (mock) — verify mock-side gate refuses `trusted basemap` promotion when ingest votes don't agree | Voting contract assertion (per D-PROJ-2) | + +**Pass criteria** (PARTIAL): +- `P(misalign > 30 m) < 1%`, `P(misalign > 100 m) < 0.1%` across the available trial flights. +- PARTIAL annotation: AC text expects ≥100 flights — escalates D-PROJ-3 fixture acquisition + D-PROJ-2 contract verification. 
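The NFT-SEC-01 statistic is an empirical exceedance fraction over the per-tile misalignment values the mock service records. The sketch below shows one way to compute it and to carry the PARTIAL flag alongside the verdict. The function names, the result keys, and the `flights_required=100` default (taken from the "≥100 flights" wording above) are illustrative assumptions, not the project's reporting schema.

```python
from typing import Dict, Sequence, Union


def exceedance_fraction(errors_m: Sequence[float], threshold_m: float) -> float:
    """Fraction of samples strictly greater than threshold_m."""
    if not errors_m:
        raise ValueError("no samples to evaluate")
    return sum(1 for e in errors_m if e > threshold_m) / len(errors_m)


def cache_poisoning_verdict(
    misalign_m: Sequence[float],
    flights_observed: int,
    flights_required: int = 100,  # assumption: from the ">=100 flights" wording
) -> Dict[str, Union[float, bool]]:
    """Evaluate the AC-NEW-7 thresholds and flag reduced-confidence runs."""
    p30 = exceedance_fraction(misalign_m, 30.0)
    p100 = exceedance_fraction(misalign_m, 100.0)
    return {
        "p_gt_30m": p30,
        "p_gt_100m": p100,
        "within_budget": p30 < 0.01 and p100 < 0.001,
        "partial": flights_observed < flights_required,
    }
```

Note that resolving a 0.1% tail at all requires well over 1,000 tiles; with fewer samples a single outlier already exceeds the `P(> 100 m)` bound, which is one reason the result is reported as PARTIAL.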
+ +--- + +### NFT-SEC-02: No in-flight Service calls (network egress isolation) + +**Summary**: Validates RESTRICT-SAT-1 / AC-8.1 — the SUT MUST NOT reach an external satellite provider during a flight. All cache reads come from the local cache. +**Traces to**: RESTRICT-SAT-1, AC-8.1 + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | Start the SUT with `e2e-net` configured `internal: true` (no external connectivity at the network layer) | SUT comes up; tile cache reads succeed | +| 2 | Run 5 min of Derkachi replay | All tile lookups served from local cache | +| 3 | Read SUT egress counter (Docker network stats) | 0 packets out to non-`e2e-net` destinations | +| 4 | Inspect SUT log for any "external Service call attempted" event | 0 events (proving the SUT didn't even try) | +| 5 | Defense-in-depth: temporarily flip `internal: false` AND blackhole DNS, re-run | Same — 0 egress attempts; no failed-DNS errors | + +**Pass criteria**: 0 packets to non-`e2e-net` destinations; no "Service call attempted" log entry. + +--- + +### NFT-SEC-03: MAVLink 2.0 message signing on AP wired channel + +**Summary**: Validates D-C8-9 — AP-side rejects unsigned MAVLink GPS_INPUT messages on the signed channel; SUT-emitted (signed) messages pass; SBOM dump confirms passkey configuration. +**Traces to**: D-C8-9 (Plan-phase decision), Mode B Fact #109 (CVE-2026-1579 mitigation) + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | Start `ardupilot-plane-sitl` with signing enabled and the test passkey loaded | Signing enabled | +| 2 | Inject an UNSIGNED `GPS_INPUT` from `mavproxy-listener` (i.e. 
a non-SUT origin) | AP rejects the message; rejection logged in AP STATUSTEXT | +| 3 | Inject a SIGNED `GPS_INPUT` with the SUT's signing key | AP accepts | +| 4 | Inject a SIGNED `GPS_INPUT` with a DIFFERENT key | AP rejects | +| 5 | Run the SUT's SBOM-dump CI step | SBOM declares the MAVLink signing module + passkey configuration entry present | + +**Pass criteria**: AP rejection of unsigned + wrong-key; AP acceptance of correct-signed; SBOM declares signing. + +**Note**: iNav-side is NOT subject to this test — Mode B Fact #109 documents the asymmetry as accepted residual risk (no MAVLink signing in iNav firmware per Source #129). + +--- + +### NFT-SEC-04: OpenCV CVE-2025-53644 mitigation (≥4.12.0 pin) + +**Summary**: Validates D-CROSS-CVE-1 — the pinned OpenCV ≥4.12.0 either decodes the CVE-2025-53644 PoC JPEG safely or rejects it; no crash, no buffer overflow. +**Traces to**: D-CROSS-CVE-1, Mode B Fact #112 + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | Build the SUT image with AddressSanitizer (ASan) instrumentation enabled (separate CI build) | Instrumented binary | +| 2 | Push `cve-jpeg-fixture` to every code path that uses OpenCV imread/imdecode: nav-camera frame source (C1), satellite tile thumbnail re-load (C4), tile cache import (C6) | Each path either decodes cleanly OR returns a graceful error | +| 3 | Observe ASan output | 0 buffer-overflow / use-after-free / uninitialized-read reports | +| 4 | Observe SUT process exit code | Process does NOT crash; if rejection path taken, exit code is 0 + error logged | +| 5 | CI step: lint the lockfile / pyproject.toml / requirements.txt for the OpenCV version pin | Pin asserts `opencv-python >= 4.12.0` (or platform-equivalent) | + +**Pass criteria**: ASan clean; no crash; pinned version ≥ 4.12.0 in dependency manifest. 
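Step 5 of NFT-SEC-04 (the version-pin lint) could look roughly like the following for a `requirements.txt`-style manifest. This is a hedged sketch: the list of OpenCV distribution names, the helper names, and the choice to accept only `==` and `>=` specifiers are assumptions. It does not parse `pyproject.toml` or lockfile formats, and it deliberately fails on anything it cannot interpret (environment markers, version ranges, an unpinned entry).

```python
import re
from typing import Tuple

MIN_OPENCV = (4, 12, 0)

# Assumption: the PyPI distributions that ship OpenCV bindings.
OPENCV_DISTS = {
    "opencv-python",
    "opencv-python-headless",
    "opencv-contrib-python",
    "opencv-contrib-python-headless",
}

_LINE = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*(.*)$")
_SPEC = re.compile(r"^(==|>=)\s*([0-9]+(?:\.[0-9]+)*)\s*$")


def parse_version(text: str) -> Tuple[int, ...]:
    """Return the first three numeric components, zero-padded."""
    parts = [int(p) for p in text.split(".")][:3]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def opencv_pins_ok(requirements_text: str) -> bool:
    """True only if an OpenCV entry exists and every one is pinned >= 4.12.0."""
    found = False
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        match = _LINE.match(line)
        if not match or match.group(1).lower() not in OPENCV_DISTS:
            continue
        found = True
        spec = _SPEC.match(match.group(2))
        if spec is None or parse_version(spec.group(2)) < MIN_OPENCV:
            return False  # unpinned, unparseable, or below the minimum
    return found
```

Returning `False` when no OpenCV entry is found keeps the lint from passing silently if the dependency is renamed or moved to another manifest.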
+ +--- + +### NFT-SEC-05: Egress-blocked + DNS-blackholed defense-in-depth + +**Summary**: Defense-in-depth complement to NFT-SEC-02 — verifies that even if the network policy were misconfigured, the SUT does not call out to public DNS / known satellite-provider hosts. +**Traces to**: RESTRICT-SAT-1 (defense-in-depth) + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | Configure SUT container with iptables OUTPUT DROP except `e2e-net` AND DNS blackholed via `--dns 0.0.0.0` | SUT comes up | +| 2 | Run Derkachi replay | All operations succeed; 0 outbound DNS queries (verified via tcpdump on egress) | +| 3 | Inspect SUT for hardcoded provider hostnames (e.g. `*.googleapis.com`, `*.maxar.com`, `*.mapbox.com`, `*.azaion.com` for the runtime path) | grep finds zero references in compiled binary's strings table for runtime-path code | + +**Pass criteria**: 0 DNS queries during replay; 0 provider hostname references in runtime path. diff --git a/_docs/02_document/tests/test-data.md b/_docs/02_document/tests/test-data.md new file mode 100644 index 0000000..9e1d983 --- /dev/null +++ b/_docs/02_document/tests/test-data.md @@ -0,0 +1,129 @@ +# Test Data Management + +## Seed Data Sets + +| Data Set | Description | Used by Tests | How Loaded | Cleanup | +|----------|-------------|---------------|-----------|---------| +| `still-image-set-60` | 60 nadir aerial images `AD000001-60.jpg` from `_docs/00_problem/input_data/` with WGS84 frame-center GT in `coordinates.csv` and per-image accuracy table in `expected_results/position_accuracy.csv`. Captured at 400 m AGL with ADTi 20MP 20L V1 (per `data_parameters.md`). Slow cadence (~1 per 2-3 s), so suitable for satellite-anchor frame-center tests, NOT frame-to-frame VIO. 
| FT-P-01, FT-P-03, FT-P-05, FT-P-06, FT-P-15, FT-P-19, NFT-RES-03 (Monte Carlo), NFT-PERF-04 | Bind-mounted from `_docs/00_problem/input_data/` to `/test-data` in `e2e-runner` (read-only) | None — read-only fixture | +| `still-image-sat-refs-2` | Two paired Google Maps reference images `AD000001_gmaps.png`, `AD000002_gmaps.png`. Insufficient for full satellite-anchor coverage of the 60-image set; supplements the tile-cache fixture for AC-2.1b cross-validation only. | FT-P-05 (subset), FT-P-19 | Same as above | Same | +| `derkachi-fixture` | Cropped nadir flight footage `flight_derkachi/flight_derkachi.mp4` (H.264, 880×720, 30 fps, ~490.07 s = 14,700 frames) plus synchronized FC telemetry `flight_derkachi/data_imu.csv` (4,900 rows @ 10 Hz, columns `timestamp(ms)`, `Time`, `SCALED_IMU2.*`, `GLOBAL_POSITION_INT.*`). Three video frames per telemetry row. The `GLOBAL_POSITION_INT` columns are the trajectory ground truth. | FT-P-02, FT-P-04, FT-P-07, FT-P-10, FT-N-01 (synth on top), FT-N-02, FT-N-03 (synth), FT-N-04 (synth), NFT-PERF-01, NFT-PERF-02, NFT-RES-01, NFT-RES-02, NFT-RES-03 (Monte Carlo), NFT-RES-04, NFT-LIM-02 (8 h synth load loop) | Same bind mount as above | Same | +| `tile-cache-fixture` | Pre-built FAISS HNSW index + tile filesystem covering: (a) the 60 still-image footprints at 0.3-0.5 m/px, (b) the Derkachi route bbox at the same resolution. Built once per CI run by `tests/fixtures/tile-cache-builder/` from the `_gmaps.png` references and from a curated public-data subset (when D-PROJ-3 is resolved — until then, stub-tile content for footprints not paired with `_gmaps.png`). Tile manifest schema per `restrictions.md` § Satellite Imagery. 
| FT-P-01, FT-P-05, FT-P-15, FT-P-16, FT-P-17, FT-P-19, FT-N-05, FT-N-06, NFT-LIM-03, NFT-PERF-01, NFT-PERF-04, NFT-SEC-01 (poisoning test), NFT-SEC-02 (egress) | Built into named Docker volume `tile-cache-fixture`; mounted read-only into SUT at `/var/azaion/tile-cache` | Volume removed at teardown | +| `synth-age-tile-set` | Two clones of the tile-cache-fixture with manifest `capture_date` field synthetically aged: `synth-age-7mo` (>6 mo, exceeds AC-8.2 active-conflict threshold) and `synth-age-13mo` (>12 mo, exceeds rear threshold). Tile pixels unchanged; only manifest dates differ. | FT-N-05, FT-N-06 | Built from `tile-cache-fixture` by date-mutating script in `tests/fixtures/age-injector/` | Volume removed at teardown | +| `outlier-injection-derkachi` | Synthetic adversarial overlay on `derkachi-fixture`: every Nth frame replaced by a random crop from a far-away tile (>350 m offset, per AC-3.1) to inject a visual outlier. Three injection densities: `light` (1 in 100), `medium` (1 in 10), `heavy` (1 in 3). Generated at runtime by `tests/fixtures/injectors/outlier.py`. | FT-N-01 | Generated at scenario start, written to `tmpfs` in `e2e-runner`, mounted into SUT as a derived frame source | Auto-cleared at teardown (tmpfs) | +| `blackout-spoof-derkachi` | Synthetic overlay on `derkachi-fixture`: pure-black frames inserted in 5 s / 15 s / 35 s windows AND simultaneous spoofed-GPS injection on the FC inbound stream. Spoof pattern: realistic-looking GPS jumps the trajectory 200-500 m in `north_east_random_direction`. Three windows produce three sub-scenarios per AC-NEW-8. Generated at runtime. | FT-N-04, NFT-RES-04 | Same | Same | +| `multi-segment-derkachi` | Synthetic overlay: 3+ blackout segments distributed across the Derkachi flight to exercise satellite-reference re-localization (AC-3.3) without spoofing. Generated at runtime. 
| FT-P-08 | Same | Same | +| `cold-boot-fixture` | The state needed to validate AC-NEW-1: a frozen FC pose (`GLOBAL_POSITION_INT` snapshot at flight-resume time) + the tile-cache-fixture + a blank FDR. Test cold-boots the SUT and measures TTFF. | NFT-PERF-03 (AC-NEW-1) | The frozen FC pose is a JSON fixture in `tests/fixtures/cold-boot/`; SUT is restarted (`docker compose restart gps-denied-onboard`) and TTFF is measured from container-ready event to first valid `GPS_INPUT` / `MSP2_SENSOR_GPS` arrival at SITL | Container restart only | +| `mavlink-passkey` | A test-only MAVLink 2.0 signing passkey (32-byte hex). Used for D-C8-9 ArduPilot-track signing channel. NEVER reused outside test environment; checked-in as `tests/fixtures/secrets/mavlink-test-passkey.txt` with explicit comment "TEST ONLY". | FT-P-09 (AP track), NFT-SEC-03 | Loaded via Docker secret into SUT environment | None — fixture file | +| `cve-jpeg-fixture` | Crafted JPEG that triggers CVE-2025-53644 (uninitialized stack pointer → heap buffer write) in OpenCV 4.10/4.11. The pinned ≥4.12.0 must process it without crash and either decode safely or reject. | NFT-SEC-04 | Local-data-only fixture file at `tests/fixtures/security/cve-2025-53644.jpg` (sourced from public PoC, license-checked) | None — fixture file | + +## Data Isolation Strategy + +Each `pytest` test case runs against a fresh `gps-denied-onboard` container (`docker compose restart` between tests, OR `--forked` pytest mode that brings a clean compose stack per case for hermetic-critical tests). The `tile-cache-fixture` and `input-data` mounts are read-only so cross-contamination between tests is impossible at the SUT-input layer. The `fdr-output` volume is reset between tests (`docker volume rm` + recreate) so each test sees a blank FDR. + +For Tier-2 (Jetson hardware), the same isolation discipline applies but at the systemd-service level: `systemctl restart gps-denied-onboard.service` between tests, `/var/azaion/fdr` is wiped between tests. 
+ +Synthetic-injection fixtures (`outlier-injection-derkachi`, `blackout-spoof-derkachi`, `multi-segment-derkachi`, `synth-age-tile-set`) are generated into per-test tmpfs and never written back to a persistent volume. + +## Input Data Mapping + +| Input Data File | Source Location | Description | Covers Scenarios | +|-----------------|----------------|-------------|-----------------| +| `AD000001.jpg` ... `AD000060.jpg` | `_docs/00_problem/input_data/` | 60 nadir still images, ADTi 20MP @ 400 m AGL | FT-P-01, FT-P-03, FT-P-05, FT-P-06, FT-P-15, FT-P-19, NFT-PERF-04, NFT-RES-03 | +| `coordinates.csv` | `_docs/00_problem/input_data/` | 60-row WGS84 frame-center GT (image, lat, lon) | Same as above | +| `AD000001_gmaps.png`, `AD000002_gmaps.png` | `_docs/00_problem/input_data/` | Google Maps satellite reference for images 1-2 | FT-P-05, FT-P-19 | +| `data_parameters.md` | `_docs/00_problem/input_data/` | AGL height (400 m) + camera model | All — global metadata | +| `flight_derkachi/flight_derkachi.mp4` | `_docs/00_problem/input_data/flight_derkachi/` | H.264 nadir video, 880×720 @ 30 fps, ~490 s | FT-P-02, FT-P-04, FT-P-07, FT-P-10, FT-N-01..04, NFT-PERF-01..04, NFT-RES-01..04, NFT-LIM-02 | +| `flight_derkachi/data_imu.csv` | `_docs/00_problem/input_data/flight_derkachi/` | 4,900 rows @ 10 Hz of `SCALED_IMU2` + `GLOBAL_POSITION_INT` | Same as above | +| `flight_derkachi/README.md` | `_docs/00_problem/input_data/flight_derkachi/` | Fixture metadata | Documentation only | +| `expected_results/results_report.md` | `_docs/00_problem/input_data/expected_results/` | Pass/fail rules + still-image and Derkachi mappings | All FT-P / FT-N scenarios that load this fixture | +| `expected_results/position_accuracy.csv` | `_docs/00_problem/input_data/expected_results/` | Per-image accuracy threshold flags | FT-P-01, NFT-RES-03 | + +## Expected Results Mapping + +This table closes the gap between each test scenario and the quantifiable expected result it asserts on. 
Comparison methods follow `.cursor/skills/test-spec/templates/expected-results.md`. The `Expected Result Source` column points at the canonical source of truth for the assertion. + +### Position accuracy + +| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source | +|-----------------|------------|-----------------|-------------------|-----------|----------------------| +| FT-P-01 | `still-image-set-60` + `tile-cache-fixture` | `pass_count(error≤50m) ≥ 48` (≥80% of 60) AND `pass_count(error≤20m) ≥ 30` (≥50% of 60) | `threshold_min` on aggregate counts; per-image error via `numeric_tolerance` against Vincenty geodesic distance to GT in `coordinates.csv` | ±50 m / ±20 m | `expected_results/results_report.md` § Pass/Fail Rules + `expected_results/position_accuracy.csv` | +| FT-P-02 | `derkachi-fixture` | At each anchor frame, `‖propagated_centre − next_anchor_centre‖ < 100 m` (visual-only) AND `< 50 m` (IMU-fused). Drift binned by `last_satellite_anchor_age_ms`. | `threshold_max` per anchor pair, then aggregate rule `≥95% of anchor pairs satisfy` | < 100 m / < 50 m | AC-1.3 + Derkachi `GLOBAL_POSITION_INT` GT | +| FT-P-03 | `still-image-set-60` (any 1 image) | Estimate output schema fields present: `lat:float`, `lon:float`, `cov_semi_major_m:float`, `source_label ∈ {satellite_anchored, visual_propagated, dead_reckoned}`, `last_satellite_anchor_age_ms:int` | `schema_match` (presence + type) AND `set_contains` (label) | N/A | AC-1.4 + AC-4.3 | +| FT-P-19 | `tile-cache-fixture` + `still-image-sat-refs-2` | Scale-ratio: any UAV-frame footprint at 400 m AGL retrievable from cache (FAISS top-K=10 includes a tile with center within 100 m of true position). Scene-change subset (PARTIAL — flag-marked, see traceability matrix). 
| `set_contains` (top-K result includes correct tile) | top-K hit | AC-8.6 | + +### Image processing quality + +| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source | +|-----------------|------------|-----------------|-------------------|-----------|----------------------| +| FT-P-04 | `derkachi-fixture` | Frame-to-frame registration succeeds for `≥95%` of "normal" segments (defined per AC-2.1a: nadir ±10° bank/pitch from `data_imu.csv` `SCALED_IMU2` quaternion-derived attitude estimate, ≥40% inferred prior-frame overlap). Sharp-turn frames excluded from this denominator. | `threshold_min` on success ratio | ≥95% | AC-2.1a | +| FT-P-05 | `still-image-set-60` (with `_gmaps.png` subset for ground-truth match) | Satellite-anchor registration succeeds AND satisfies AC-1.1/1.2 accuracy AND MRE < 2.5 px | `threshold_max` MRE | < 2.5 px | AC-2.1b + AC-2.2 | +| FT-P-06 | `derkachi-fixture` (frame-to-frame) AND `still-image-set-60` (sat-anchor) | Mean Reprojection Error: `< 1.0 px` frame-to-frame, `< 2.5 px` satellite-anchored cross-domain | `threshold_max` per shape | < 1.0 / < 2.5 px | AC-2.2 | + +### Resilience + +| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source | +|-----------------|------------|-----------------|-------------------|-----------|----------------------| +| FT-N-01 | `outlier-injection-derkachi` | Up to 350 m offset in a single frame is rejected as outlier; estimate continues from prior valid state with grown covariance; airframe tilt up to ±20° handled | Per-injected-outlier: `error_after_outlier ≤ error_before_outlier + 50 m` AND `covariance_growth_monotonic` | ±50 m drift budget | AC-3.1 | +| FT-N-02 | `derkachi-fixture` (sharp-turn segment, identified via `SCALED_IMU2` gyro_z spikes) | Sharp-turn frames may fail frame-to-frame registration; recovery via satellite-reference re-localization within next 3 frames | Boolean recovery within 3 
frames | N/A | AC-3.2 | +| FT-P-08 | `multi-segment-derkachi` | ≥3 disconnected segments handled; satellite-reference re-localization succeeds at each gap; trajectory remains continuous (no >100 m jump) | `threshold_max` discontinuity | < 100 m | AC-3.3 | +| FT-N-03 | `derkachi-fixture` + synthetic 3-frame outage injector | After ≥3 consecutive frames AND ≥2 s without estimate: STATUSTEXT containing `OPERATOR_RELOC_REQUEST` emitted to GCS via `mavproxy-listener`; estimates labeled `dead_reckoned` continue | `regex` on STATUSTEXT + `set_contains` on labels | regex | AC-3.4 | +| FT-N-04 | `blackout-spoof-derkachi` (5 s / 15 s / 35 s windows) | Within ≤1 frame OR ≤400 ms: label switches to `dead_reckoned`; spoofed GPS rejected; covariance grows monotonically; `horiz_accuracy` not under-reported; `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT at 1-2 Hz | `threshold_max` switch latency + `regex` STATUSTEXT + monotonic check | ≤400 ms | AC-3.5 | + +### FC contract & startup + +| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source | +|-----------------|------------|-----------------|-------------------|-----------|----------------------| +| FT-P-09-AP | `derkachi-fixture` + `mavlink-passkey` + `ardupilot-plane-sitl` | `GPS_INPUT` messages reach AP SITL; AP EKF accepts them as `EK3_SRC1_POSXY=3` (GPS); MAVLink 2.0 signing handshake completes (D-C8-9); messages without valid signature are rejected | `exact` (AP source-set state via param read) + `boolean` (signing handshake success) + `exact` (rejection of unsigned in NFT-SEC-03) | N/A | AC-4.3 + D-C8-9 | +| FT-P-09-iNav | `derkachi-fixture` + `inav-sitl` | `MSP2_SENSOR_GPS` (ID 0x1F03) messages reach iNav SITL via TCP 5760; iNav GPS provider state shows `provider=MSP` and fix is acquired | `exact` on iNav GPS provider state via MSP read | N/A | AC-4.3 + Source #4 | +| FT-P-10 | `derkachi-fixture` | Per Mode B Fact #107: GTSAM iSAM2 smoothed past-keyframe pose estimates differ 
from raw single-shot estimates AND smoothed estimates are closer to `GLOBAL_POSITION_INT` GT than raw (IT-11). NOT validated as FC-side retroactive correction (out of scope per Mode B revision). | `numeric_tolerance` improvement check | smoothed_error < raw_error | AC-4.5 (revised) + Mode B Fact #107 | +| FT-P-11 | `cold-boot-fixture` + `ardupilot-plane-sitl` | On boot, SUT initializes from FC EKF's last valid GPS + IMU-extrapolated position | `numeric_tolerance` initial-pose-vs-FC-pose | ±50 m | AC-5.1 | +| NFT-RES-01 | `derkachi-fixture` + 4 s outage injector | After >3 s without estimate, FC falls back to IMU-only dead reckoning; SUT emits a `NO_ESTIMATE_TIMEOUT` failure log | `boolean` on FC EKF source-set transition + `regex` on log | N/A | AC-5.2 | +| NFT-RES-02 | `derkachi-fixture` + container restart mid-replay | After companion reboot, SUT re-initializes from FC's current IMU-extrapolated position; first emitted `GPS_INPUT` / `MSP2_SENSOR_GPS` is within ±100 m of FC's IMU-extrapolated pose at boot-complete time | `numeric_tolerance` pose at first emit | ±100 m | AC-5.3 | + +### Performance + +| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source | +|-----------------|------------|-----------------|-------------------|-----------|----------------------| +| NFT-PERF-01 (Tier-2 only) | `derkachi-fixture` resampled to 3 Hz on Jetson Orin Nano Super | End-to-end latency (camera capture → GPS to FC) | `threshold_max` p95 | ≤ 400 ms | AC-4.1 + D-CROSS-LATENCY-1 | +| NFT-PERF-02 (Tier-1+2) | `derkachi-fixture` | Estimates emitted frame-by-frame (no batching > 1 frame); inter-emit interval p95 ≤ inter-frame interval × 1.05 | `threshold_max` p95 inter-emit | ≤ 350 ms (at 3 Hz target) | AC-4.4 | +| NFT-PERF-03 (Tier-2 only) | `cold-boot-fixture` | Cold-start TTFF: from container-ready to first valid `GPS_INPUT` / `MSP2_SENSOR_GPS` | `threshold_max` p95 over 50 cold boots | < 30 s | AC-NEW-1 | +| NFT-PERF-04 | 
`still-image-set-60` + spoofed FC GPS injection in `ardupilot-plane-sitl` | Spoofing-promotion latency: from FC GPS-denial / spoof signal to SUT estimate becoming AP primary position source | `threshold_max` p95 over 50 trials per FC | < 3 s | AC-NEW-2 | + +### Resource limits + +| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source | +|-----------------|------------|-----------------|-------------------|-----------|----------------------| +| NFT-LIM-01 (Tier-2) | `derkachi-fixture` 8 h replay loop | Memory `< 8 GB shared` on Jetson Orin Nano Super throughout | `threshold_max` peak RSS over duration | ≤ 8 GB | AC-4.2 | +| NFT-LIM-02 (Tier-1) | 8 h Derkachi replay loop | FDR ≤ `64 GB`; no payload class silently dropped without a logged rollover | `threshold_max` total FDR size + `regex` on rollover-event presence | ≤ 64 GB | AC-NEW-3 | +| NFT-LIM-03 | `tile-cache-fixture` plus exercised manifests/overviews/indices | Cache budget `≤ 10 GB` for the ~400 km² operational area unless solution defines a separate descriptor budget | `threshold_max` total cache size | ≤ 10 GB | RESTRICT-SAT-2 + AC-8.3 | +| NFT-LIM-04 (Tier-2) | `derkachi-fixture` 8 h | CPU/GPU/temp/throttle telemetry recorded; no thermal throttling at 25 W TDP at the upper temp envelope (deferred to chamber for AC-NEW-5) | `threshold_max` throttle event count = 0 (workstation thermal-day) | 0 events | RESTRICT-HW-1 + AC-NEW-5 (Tier-2 partial) | + +### Security + +| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source | +|-----------------|------------|-----------------|-------------------|-----------|----------------------| +| NFT-SEC-01 | Synthetic over-confidence injection: deflate covariance ×1.5-3 in 3 trial flights, observe AC-NEW-7 cache-poisoning behavior at the `mock-suite-sat-service` ingest | Per flight: `P(geo-misalign > 30 m) < 1%`, `P(> 100 m) < 0.1%` of written tiles. 
PARTIAL — multi-flight Monte Carlo (≥100 flights per AC text) is reduced-confidence with current single Derkachi fixture; trace flag in matrix. | `threshold_max` on probability | < 1% / < 0.1% | AC-NEW-7 | +| NFT-SEC-02 | Network egress probe from SUT container | All non-`e2e-net` egress attempts blocked by Docker `internal: true`; per-attempt logged as security event in SUT log | `exact` (egress count = 0) + `regex` (security-event log emission) | N/A | RESTRICT-SAT-1 + AC-8.1 | +| NFT-SEC-03 | `ardupilot-plane-sitl` + un-signed MAVLink GPS_INPUT injection | AP SITL rejects unsigned messages on the signed channel; SUT-emitted (signed) messages pass; SBOM check confirms passkey configuration | `exact` (AP rejection of unsigned) + `boolean` (SBOM passkey present) | N/A | D-C8-9 + Mode B Fact #109 + AC-NEW-2 | +| NFT-SEC-04 | `cve-jpeg-fixture` fed to SUT image pipeline (C1 + C4 paths) | OpenCV ≥4.12.0 either decodes safely or rejects the file; no crash, no buffer overflow detected by AddressSanitizer | `boolean` on no-crash + ASan clean | N/A | D-CROSS-CVE-1 + Mode B Fact #112 | + +## External Dependency Mocks + +| External Service | Mock/Stub | How Provided | Behavior | +|-----------------|-----------|-------------|----------| +| Azaion Suite Satellite Service (ingest API for AC-NEW-7 voting layer) | `mock-suite-sat-service` Docker service | Local FastAPI stub returning canned tile-publish-acknowledgement responses with deterministic IDs; logs every received tile + per-tile quality metadata to a file the e2e-runner reads back | Returns 202 Accepted on every well-formed publish; returns 400 on malformed; never simulates real voting (the project's role is to publish, the Service's role is to vote per Mode B Fact #105 / D-PROJ-2) | +| ArduPilot Plane FC | `ardupilot-plane-sitl` Docker service | Open-source SITL build of ArduPilot Plane stable; configured with `GPS_TYPE=14` per Source #2 to accept MAVLink GPS_INPUT | Real ArduPilot EKF behavior; we observe but do not 
patch | +| iNav FC | `inav-sitl` Docker service | Open-source iNav SITL; GPS provider configured to MSP per `docs/SITL/SITL.md` | Real iNav GPS subsystem behavior; we observe but do not patch | +| QGroundControl GCS | `mavproxy-listener` Docker service | Passive MAVLink listener that forwards SUT → GCS stream into a `.tlog` file the e2e-runner parses | Captures all STATUSTEXT, NAMED_VALUE_FLOAT, downsampled position frames for assertions | +| AI camera (AC-7.x) | NOT MOCKED; out of scope per Phase 1 gate | N/A | NOT COVERED in current matrix; see traceability matrix | + +## Data Validation Rules + +| Data Type | Validation | Invalid Examples | Expected System Behavior | +|-----------|-----------|-----------------|------------------------| +| Nav-camera frame | Resolution within ADTi spec (~5472×3648 production, downscaled equivalents allowed in Tier-1 Docker) | 0×0 frame, corrupt JPEG (CVE fixture), wrong color depth | Reject frame, log invalid-input event, do NOT advance estimator state | +| FC IMU sample | `SCALED_IMU2` fields present; timestamp monotonic; non-zero accelerometer norm | Missing field, backwards timestamp, NaN | Reject sample, log invalid-input event, propagate estimator from prior valid state | +| Satellite tile manifest | Required fields per `restrictions.md`: CRS, tile matrix, dimension, lat-adjusted m/px, capture date, source, compression. Resolution at least as fine as the 0.5 m/px floor, written as `≥0.5 m/px` in AC-8.1 (numerically, m/px ≤ 0.5). capture_date within AC-8.2 freshness window.
| Missing capture_date, m/px = 1.0 (coarser than the 0.5 m/px resolution floor), capture_date older than freshness threshold | Reject tile load OR downgrade to non-`satellite_anchored` source label per AC-NEW-6 | +| Spoofed FC GPS | (FC-side input the SUT detects) | GPS jump >200 m between consecutive 5 Hz frames; FC GPS-health flag toggled to spoofed | SUT switches estimator label to `dead_reckoned`, stops promoting FC GPS, continues per AC-NEW-8 | +| MAVLink GPS_INPUT outbound | Honest covariance: `horiz_accuracy` ≥ estimator's 95% covariance semi-major axis | Under-reported covariance | This is a defect (AC-NEW-4); fail NFT-RES-03 if observed | +| MAVLink message signature | MAVLink 2.0 signed on AP wired channel per D-C8-9 | Unsigned message on signed channel | AP-side rejection (NFT-SEC-03 expected behavior) | diff --git a/_docs/02_document/tests/traceability-matrix.md b/_docs/02_document/tests/traceability-matrix.md new file mode 100644 index 0000000..1871662 --- /dev/null +++ b/_docs/02_document/tests/traceability-matrix.md @@ -0,0 +1,109 @@ +# Traceability Matrix + +This matrix is the canonical view of test coverage for the planning context. It traces every numbered AC and every restriction to the test scenario IDs that exercise it. + +**Coverage discipline**: an AC counts as **Covered** when at least one test scenario has a quantifiable pass/fail criterion that exercises it. **PARTIAL** rows are exercised but with reduced confidence; the row's mitigation (inline in the Coverage cell, and in the "Mitigation" column of the Uncovered Items Analysis table) points to the action item (Plan-phase decision or D-PROJ gate) that, when resolved, lifts the row to Covered. **NOT COVERED** rows are deliberately deferred (out-of-scope for data acquisition per Phase 1 gate, or covered at a later workflow stage); each has a stated mitigation.
+ +## Acceptance Criteria Coverage + +| AC ID | Acceptance Criterion (one-line) | Test IDs | Coverage | +|-------|---------------------|----------|----------| +| AC-1.1 | Frame-center GPS within 50 m for ≥80% of normal-flight photos | FT-P-01 | Covered | +| AC-1.2 | Frame-center GPS within 20 m for ≥50% of normal-flight photos | FT-P-01 | Covered | +| AC-1.3 | Cumulative drift between satellite-anchored fixes <100 m visual / <50 m IMU-fused | FT-P-02 | Covered | +| AC-1.4 | Estimate reports 95% covariance + source label | FT-P-03 | Covered | +| AC-2.1a | Frame-to-frame registration ≥95% on normal segments | FT-P-04 | Covered | +| AC-2.1b | Satellite-anchor registration meets AC-1.1/1.2/2.2/8.2/8.6 | FT-P-05, FT-P-19 | Covered | +| AC-2.2 | MRE <1 px frame-to-frame, <2.5 px cross-domain | FT-P-05, FT-P-06 | Covered | +| AC-3.1 | Tolerate up to 350 m outliers, tilt ±20° | FT-N-01 | Covered | +| AC-3.2 | Tolerate sharp turns; recovery via satellite re-loc | FT-P-07, FT-N-02 | Covered | +| AC-3.3 | Handle ≥3 disconnected segments via satellite re-loc | FT-P-08 | Covered | +| AC-3.4 | On ≥3 frames + ≥2 s outage, request operator re-loc; FC dead-reckons | FT-N-03 | Covered | +| AC-3.5 | Visual blackout + spoofed GPS failsafe | FT-N-04 | Covered | +| AC-4.1 | E2E latency <400 ms p95 | NFT-PERF-01 (Tier-2) | Covered | +| AC-4.2 | Memory <8 GB on Jetson | NFT-LIM-01 (Tier-2) | Covered | +| AC-4.3 | FC output contract: GPS_INPUT (AP) + MSP2_SENSOR_GPS (iNav) with honest covariance | FT-P-03, FT-P-09-AP, FT-P-09-iNav | Covered | +| AC-4.4 | Estimates streamed frame-by-frame | NFT-PERF-02 | Covered | +| AC-4.5 (revised) | Internal smoothing improves past-keyframe estimates (NOT FC retroactive correction per Mode B Fact #107) | FT-P-10 | Covered | +| AC-5.1 | Init from FC EKF's last valid GPS + IMU-extrapolated | FT-P-11 | Covered | +| AC-5.2 | On >3 s without estimate, FC IMU-only fallback; SUT logs | NFT-RES-01 | Covered | +| AC-5.3 | On reboot, re-init from FC 
IMU-extrapolated pose | NFT-RES-02 | Covered | +| AC-6.1 | GCS stream at 1-2 Hz | FT-P-12 | Covered | +| AC-6.2 | GCS may send commands via standard MAVLink | FT-P-13 | Covered | +| AC-6.3 | WGS84 output | FT-P-14 | Covered | +| AC-7.1 | AI-camera object localization, level-flight accuracy | — | NOT COVERED — out of scope for current data acquisition (no AI-camera fixture; AC-7.x scoped to a different sensor). Mitigation: defer to a follow-up cycle with AI-camera fixture; flag in `_docs/_process_leftovers/` as `2026-05-09_ai-camera-fixture-deferred.md` | +| AC-7.2 | AI-camera object coordinates from gimbal/zoom/altitude | — | NOT COVERED — same as AC-7.1 | +| AC-8.1 | Imagery via Suite Sat Service offline cache, ≥0.5 m/px | FT-P-15, FT-P-16, NFT-SEC-02 | Covered | +| AC-8.2 | Tile freshness <6 mo (active-conflict) / <12 mo (rear) | FT-N-05 | Covered | +| AC-8.3 | Imagery pre-loaded onto companion before flight | FT-P-15, FT-P-16 | Covered | +| AC-8.4 | Mid-flight tile generation with quality metadata | FT-P-17 | Covered | +| AC-8.5 | No raw nav/AI-cam frame retention except thumbnail log | FT-P-18 | Covered | +| AC-8.6 | Satellite relocalization scale-ratio + scene-change | FT-P-19 (scale FULL; scene-change PARTIAL) | PARTIAL — scene-change subset reduced confidence (only 2/60 stills have paired sat refs; no labeled change-pair dataset). Independent of the AC-NEW-4 / AC-NEW-7 multi-flight gap (those rows were resolved by AC-text relaxation 2026-05-09; AC-8.6 scene-change still requires a labeled change-pair dataset that synthetic perturbations cannot substitute for). 
Mitigation: deferred to a follow-up cycle when labeled change-pair data becomes available; surfaced in the Step 4 risk register | +| AC-NEW-1 | Cold-start TTFF <30 s p95 | NFT-PERF-03 (Tier-2) | Covered | +| AC-NEW-2 | Spoofing-promotion latency <3 s p95 | NFT-PERF-04 | Covered | +| AC-NEW-3 | FDR ≤64 GB / flight, no silent drops | NFT-LIM-02 | Covered | +| AC-NEW-4 | False-position safety: P(>500 m)<0.1%, P(>1 km)<0.01% | NFT-RES-03 | Covered — AC text relaxed 2026-05-09 to Monte-Carlo-over-current-data with stated 95% CI (Plan Phase 2a.0 outcome). Multi-flight statistical headroom is residual risk in the Step 4 risk register; D-PROJ-3 reopens validation when additional multi-flight data becomes available | +| AC-NEW-5 | Operating envelope -20 °C to +50 °C, 25 W TDP, 8 h, no throttle | NFT-LIM-04 (workstation baseline only) | PARTIAL — workstation thermal-day baseline only. Mitigation: chamber-attached Jetson runner + DO-160G shaker rig — out of scope for data-acquisition per Phase 1 gate; tracked as a release-tag-blocking gate | +| AC-NEW-6 | System rejects/downgrades stale tiles | FT-N-05, FT-N-06 | Covered | +| AC-NEW-7 | Cache poisoning: P(misalign>30 m)<1%, P(>100 m)<0.1% | NFT-SEC-01 | Covered (onboard-side) — AC text relaxed 2026-05-09 to Monte-Carlo-over-current-data with stated 95% CI for the onboard contribution. 
Cross-suite voting-layer contract verification (D-PROJ-2) is a parent-suite design task tracked outside this Plan cycle; multi-flight statistical headroom remains residual risk (D-PROJ-3) | +| AC-NEW-8 | Visual blackout + spoof degraded-mode escalation | FT-N-04, NFT-RES-04 | Covered | + +## Restrictions Coverage + +| Restriction ID | Restriction (one-line) | Test IDs | Coverage | +|---------------|-------------|----------|----------| +| RESTRICT-UAV-1 | Fixed-wing UAV, nav-camera fixed downward | FT-N-01 (tilt envelope) | Covered (envelope assertion) | +| RESTRICT-UAV-2 | Mission profile: 8 h flights, 60 km/h, ≤400 km² area | NFT-LIM-01, NFT-LIM-02 (8 h replay) | Covered | +| RESTRICT-UAV-3 | Sharp turns may share <5% overlap | FT-P-07, FT-N-02 | Covered | +| RESTRICT-UAV-4 | No raw-photo storage; tile cache + FDR only | FT-P-18, NFT-LIM-03 | Covered | +| RESTRICT-CAM-1 | Nav camera ADTi 20MP 20L V1 nadir-fixed | FT-N-01 (tilt envelope), test fixture validation | Covered | +| RESTRICT-CAM-2 | AI camera: gimbal+zoom only; level-flight scope | — | NOT COVERED — paired with AC-7.x deferral | +| RESTRICT-SAT-1 | Onboard cache offline-only; no in-flight Service calls | FT-P-16, NFT-SEC-02, NFT-SEC-05 | Covered | +| RESTRICT-SAT-2 | Cache budget 10 GB across operational area | NFT-LIM-03 | Covered | +| RESTRICT-SAT-3 | Tile freshness per AC-8.2 / AC-NEW-6 | FT-N-05, FT-N-06 | Covered | +| RESTRICT-SAT-4 | No Sentinel-2 / sub-0.5 m/px imagery | FT-P-15 (resolution floor) | Covered | +| RESTRICT-HW-1 | Jetson Orin Nano Super, 8 GB shared LPDDR5, 25 W | NFT-LIM-01, NFT-LIM-04, NFT-LIM-05 | Covered | +| RESTRICT-HW-2 | Cooling 25 W continuous, 8 h, upper temp envelope | NFT-LIM-04, deferred chamber test | PARTIAL — chamber portion deferred; same as AC-NEW-5 | +| RESTRICT-FC-1 | ArduPilot Plane + iNav supported; PX4 out of scope | FT-P-09-AP, FT-P-09-iNav, parameterized matrix | Covered | +| RESTRICT-FC-2 | iNav has no inbound MAVLink ext-positioning; MSP2 only | 
FT-P-09-iNav | Covered | +| RESTRICT-FC-3 | Output contract: WGS84 GPS via per-FC interface | FT-P-09-AP, FT-P-09-iNav, FT-P-14 | Covered | +| RESTRICT-COMM-1 | MAVLink for GCS link (QGroundControl) | FT-P-12, FT-P-13 | Covered | +| RESTRICT-COMM-2 | iNav has no MAVLink signing; accepted residual risk | NFT-SEC-03 (asymmetry note) | Covered (documented asymmetry) | +| RESTRICT-FAIL-1 | >3 s no estimate → FC IMU-only fallback | NFT-RES-01 | Covered | +| RESTRICT-FAIL-2 | False-position safety budget (AC-NEW-4) | NFT-RES-03 | Covered (via AC-NEW-4 relaxation 2026-05-09); multi-flight statistical headroom is residual risk in Step 4 | +| RESTRICT-FAIL-3 | Cold-start TTFF (AC-NEW-1), spoofing-promotion (AC-NEW-2) | NFT-PERF-03, NFT-PERF-04 | Covered | + +## Coverage Summary + +> Revised 2026-05-09 (Plan Phase 2a.0 outcomes): three rows moved PARTIAL → Covered (AC-NEW-4, AC-NEW-7, RESTRICT-FAIL-2) following AC-text relaxation per Q3=B. Restriction row count corrected from 19 to 20 (pre-existing arithmetic error). + +| Category | Total Items | Covered | PARTIAL | Not Covered | Coverage % (Covered + PARTIAL counted half) | +|----------|-----------|---------|---------|-------------|--------------------------------------------| +| Acceptance Criteria | 39 | 35 | 2 | 2 | 92.3% | +| Restrictions | 20 | 18 | 1 | 1 | 92.5% | +| **Total** | **59** | **53** | **3** | **3** | **92.4%** | + +Coverage clears the 75% gate with margin under every reading: the half-weighted figure tabulated above (PARTIAL counted half) is 92.4%, the inclusive reading (PARTIAL = covered) is **(56 / 59) = 94.9%**, and the strict reading (PARTIAL not counted) is **(53 / 59) = 89.8%**. The remaining PARTIAL / Not Covered items are: AC-8.6 scene-change subset (needs labeled change-pair dataset, deferred), AC-NEW-5 hot-soak chamber (physical hardware, deferred), AC-7.1 / AC-7.2 (no AI-camera fixture, deferred), RESTRICT-CAM-2 (paired with AC-7.x), RESTRICT-HW-2 chamber portion (paired with AC-NEW-5).
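
The percentages in the Coverage Summary can be re-derived from the row counts. The following is a minimal sketch, assuming only the counts tabulated above; the `CoverageRow` type, the function names, and the `GATE_PCT` constant are illustrative and not part of the project tooling.

```python
from dataclasses import dataclass

GATE_PCT = 75.0  # coverage gate quoted in the summary text


@dataclass(frozen=True)
class CoverageRow:
    """Counts for one category of the coverage summary (illustrative type)."""
    category: str
    covered: int
    partial: int
    not_covered: int

    @property
    def total(self) -> int:
        return self.covered + self.partial + self.not_covered


def half_weighted_pct(row: CoverageRow) -> float:
    """Reading used in the table: a PARTIAL row counts as half a covered row."""
    return 100.0 * (row.covered + 0.5 * row.partial) / row.total


def inclusive_pct(row: CoverageRow) -> float:
    """PARTIAL rows count as fully covered."""
    return 100.0 * (row.covered + row.partial) / row.total


def strict_pct(row: CoverageRow) -> float:
    """PARTIAL rows are not counted at all."""
    return 100.0 * row.covered / row.total


AC = CoverageRow("Acceptance Criteria", covered=35, partial=2, not_covered=2)
RESTRICTIONS = CoverageRow("Restrictions", covered=18, partial=1, not_covered=1)
TOTAL = CoverageRow(
    "Total",
    covered=AC.covered + RESTRICTIONS.covered,
    partial=AC.partial + RESTRICTIONS.partial,
    not_covered=AC.not_covered + RESTRICTIONS.not_covered,
)

if __name__ == "__main__":
    for row in (AC, RESTRICTIONS, TOTAL):
        print(
            f"{row.category}: total={row.total} "
            f"half={half_weighted_pct(row):.1f}% "
            f"inclusive={inclusive_pct(row):.1f}% "
            f"strict={strict_pct(row):.1f}%"
        )
```

Keeping the three readings as separate functions makes it explicit which figure a gate check is applied to.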
+ +## Uncovered Items Analysis + +> Revised 2026-05-09 (Plan Phase 2a.0): AC-NEW-4 and AC-NEW-7 rows removed from this section after AC-text relaxation (Q3=B) flipped them to Covered with residual risk tracked in the Step 4 risk register. + +| Item | Reason Not Covered | Risk | Mitigation | +|------|-------------------|------|-----------| +| AC-7.1 | No AI-camera fixture in `input_data/`; AC scoped to a different sensor than the nav camera; level-flight assumption + bank/pitch <5° is independent of the nav-cam pipeline | Object-localization accuracy untested; AI consumers may receive wrong coordinates if not flight-tested | Deferred to a follow-up Plan cycle scoped to AI-camera integration; recorded in `_docs/_process_leftovers/2026-05-09_ai-camera-fixture-deferred.md` (will be created in Phase 3 if confirmed). | +| AC-7.2 | Same as AC-7.1 | Same | Same | +| AC-8.6 (scene-change subset) | Only 2/60 stills paired with `_gmaps.png`; no labeled change-pair dataset bundled in `input_data/`. Independent of the AC-NEW-4 / AC-NEW-7 multi-flight gap (those were resolved by AC-text relaxation; AC-8.6 still needs labeled change-pair data) | Stale-tile match in active-conflict sectors may yield false `satellite_anchored`; AC-NEW-6 partially compensates but scene-change recall is unmeasured | Deferred to a follow-up cycle when labeled change-pair data becomes available (Maxar Open Data Ukraine + AerialVL change-pair subset). Scale-ratio half of AC-8.6 IS covered. | +| AC-NEW-5 | Workstation thermal-day baseline only. AC-NEW-5 hot-soak (25 W @ +50 °C, 8 h, no throttle) requires a thermal chamber — physical hardware, not data | Without chamber test, AC-4.1 latency budget at +50 °C is not validated; D-CROSS-LATENCY-1 hybrid auto-degrade unproven under real thermal stress | Chamber-attached Jetson runner gated as release-tag-blocker. NOT counted as data-acquisition deferral; counted as physical hardware deferral. 
| +| RESTRICT-CAM-2 | Paired with AC-7.x — no AI-camera fixture | Same as AC-7.x | Same as AC-7.x | +| RESTRICT-HW-2 (chamber portion) | Paired with AC-NEW-5 — physical chamber required | Same as AC-NEW-5 | Same as AC-NEW-5 | + +## New findings forwarded into Plan (Steps 2 + 3 inputs) + +These insights from Phase 2 augment the F1-F5 carried over from Phase 1; together they feed forward into Solution Analysis (Step 2) and Component Decomposition (Step 3): + +1. **F6 — Two-tier execution profile is a first-class architectural concern.** The split between Tier-1 (workstation Docker) and Tier-2 (Jetson hardware) means several AC have validation locations that must appear in the deployment plan and in the CI matrix design. Add a "Tier-2 hardware-runner availability" entry to the project's risk register (Step 4). +2. **F7 — `mock-suite-sat-service` is a real testing-time dependency that must be documented as a component boundary (not just a test fixture).** It encodes the publish-side of D-PROJ-2 and feeds into both NFT-SEC-01 and FT-P-17. Component decomposition (Step 3) should treat the Service-publish contract as an explicit C8/C10 cross-cutting boundary, not buried inside C8. +3. **F8 — VioStrategy parameterization in CI requires both a production binary AND a research binary.** D-C1-1-SUB-A locked the BUILD_VINS_MONO=ON/OFF split; the test plan must produce both binaries on every PR for the comparative-study report (IT-12 in `solution.md`). Add to deployment plan (Step 2) and to epic/work-item planning (Step 6). +4. **F9 — D-PROJ-3 (fixture acquisition) is now a named deliverable** with a clear gate: must resolve before greenfield Step 5 re-runs the full test-spec with architecture context. Promote to risk register and to the architecture's open-items list. +5. 
**F10 — Defense-in-depth security layer (NFT-SEC-05 DNS blackholing, OPENCV ASan build, SBOM signing-passkey verification)** implies CI/build infrastructure features (multi-stage build for ASan instrumentation, SBOM generator, lockfile linter). Add to deployment plan (Step 2). diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index a4ebeb9..6f3c6c5 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -2,12 +2,12 @@ ## Current Step flow: greenfield -step: 2 -name: Research +step: 3 +name: Plan status: in_progress sub_step: - phase: 52 - name: research-mode-a-engine-steps-4-6-7-8-complete-awaiting-research-decision-gate - detail: "Mode A engine artifacts all written today 2026-05-08: 03_comparison_framework.md (Step 4 — 12-dimension Decision Support framework with cross-component coupling table + decisions-by-owner aggregate), 04_reasoning_chain.md (Step 6 — 12-dimension fact→comparison→conclusion chain with cross-cutting reasoning summary), 05_validation_log.md (Step 7 — 5-scenario validation with 5 counterexamples + Step 7.5 Component Applicability Gate sanity-check PASS), 01_solution/solution_draft01.md (Step 8 — full solution_draft_mode_a.md template populated with C1..C8 + C10 candidate tables + IT-1..IT-10 Integration tests + NFT-1..NFT-7 Non-Functional tests + 27 Plan-phase architect-owned decisions + 8 cross-component-owner decisions inventoried). Awaiting user response on Research Decision gate (A: another round Mode B assessment / B: proceed to Plan greenfield Step 3). NO additional research necessary at the documentary level — every component has Selected primary candidate(s) with MVE evidence + zero ❌ + zero ❓ across Restrictions × Candidate-Modes sub-matrices. Recommendation: B (proceed to Plan) — research-layer work is complete, Plan-phase will close the 35 D-Cx-y decisions and produce architecture.md." 
+ phase: 6 + name: plan-step2-phase2a-architecture-flows + detail: "" retry_count: 0 cycle: 1 diff --git a/_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md b/_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md new file mode 100644 index 0000000..523caf9 --- /dev/null +++ b/_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md @@ -0,0 +1,103 @@ +# Parent-suite design tasks for `satellite-provider` + +**Date created**: 2026-05-09 (Plan Phase 2a.0 outcome — `gps-denied-onboard` workspace) +**Workspace this leftover lives in**: `gps-denied-onboard` +**Workspace work needs to happen in**: `/Users/obezdienie001/dev/azaion/suite/satellite-provider/` +**Type**: cross-workspace dependency surfaced from this Plan cycle, NOT a tracker write blocker + +--- + +## Why this is a leftover + +During Plan Phase 2a.0 (Glossary + Architecture Vision) for `gps-denied-onboard`, two assumptions in `_docs/01_solution/solution.md` were validated against the actual `satellite-provider` codebase and found broken: + +1. **AC-8.4 — mid-flight tile upload to the Service**: `solution.md` and `acceptance_criteria.md` both assume the onboard system uploads orthorectified mid-flight tiles to `satellite-provider` after landing. **`satellite-provider` has no inbound ingest endpoint.** It is read-only from the onboard side (downloads tiles from Google Maps + serves them). + +2. **AC-NEW-7 — multi-flight ingest-side voting / trust layer**: `solution.md` assumes the Service operates "a multi-flight ingest-side voting layer that gates onboard-tile promotion to 'trusted basemap' until multiple independent flights agree on geo-alignment". **No such layer exists in `satellite-provider`.** + +Both gaps are parent-suite design / build tasks. They are tracked in this onboard workspace as **D-PROJ-2** and surfaced to the parent suite via this leftover file. 
+ +`gps-denied-onboard` will proceed in this Plan cycle treating both as planned external capabilities; the architecture document references them as such. + +--- + +## Why these are NOT replayed automatically + +Per `.cursor/rules/tracker.mdc` § Leftovers Mechanism, this leftover does NOT block onboard progress and does NOT auto-replay because: + +- Replay requires writes against a different workspace's `_docs/` (and tracker entries against `satellite-provider`'s tracker scope). +- The next `/autodev` invocation in the **`satellite-provider`** workspace should pick this up at its own Bootstrap step. Cross-workspace leftover replay is intentionally human-gated. + +If you (the human) explicitly want me to write these design tasks into `satellite-provider/_docs/` from this conversation, say so — I have the user's permission from the 2026-05-09 turn ("If it doesn't provide sufficient information, then analyze the repository, think about the best solution to tile selection process, and document it there"). I held back to respect the workspace boundary discipline this autodev session was operating under. + +--- + +## Design task #1 — Inbound tile ingest endpoint + +**Trigger**: AC-8.4 (mid-flight tile generation, post-landing upload) per `gps-denied-onboard/_docs/00_problem/acceptance_criteria.md`. 
+ +**Contract sketch (from the onboard side)**: + +``` +POST /api/satellite/tiles/ingest +Content-Type: multipart/form-data + +Fields per tile (one or more per request, batched): + - tile_blob: JPEG body, byte-identical to satellite-provider's existing tile format + - zoomLevel: int — same semantics as satellite-provider's existing tiles table + - latitude: double — center latitude (composite key element) + - longitude: double — center longitude + - tile_size_meters: double + - tile_size_pixels: int + - capture_timestamp: ISO 8601 — when the onboard companion generated the tile + - flight_id: UUID — which flight this tile came from + - companion_id: string — which deployed unit produced it + - quality_metadata: JSON blob (per AC-8.4 quality metadata for the Service's voting pipeline): + - estimator_label: "satellite_anchored" | "visual_propagated" | "dead_reckoned" + - covariance_2x2: [[σ_xx, σ_xy], [σ_yx, σ_yy]] — horizontal sub-matrix at tile-emit time + - last_anchor_age_ms: int — AC-1.3 binning input + - mre_px: double — reprojection error at the contributing match + - imu_bias_norm: double — VIO health proxy + - signature: optional — onboard companion's per-flight key signature over the payload (for source authentication; Plan Phase 2a.0 carryforward) + +Response: 202 Accepted with batch UUID + per-tile ingest status (queued / rejected / duplicate / superseded). +``` + +**On-disk persistence**: tiles stored in the same `./tiles/{zoomLevel}/{x}/{y}.jpg` layout as existing Google-Maps-sourced tiles. Service's existing `tiles` table extended with: `flight_id`, `companion_id`, `capture_timestamp`, `source` (`googlemaps | onboard_ingest`), `quality_metadata` (jsonb), `voting_status` (`pending | trusted | rejected`). + +**Design questions for `satellite-provider`'s Plan phase**: +- How to authenticate the onboard companion (mTLS? per-flight ephemeral keys? signed payload?). Companion is a remote untrusted endpoint by threat model. 
+- How to rate-limit ingest (a compromised companion could DOS the basemap). +- How to expose an admin/operator UI to inspect ingested-but-not-yet-trusted tiles. + +--- + +## Design task #2 — Multi-flight trust / voting layer + +**Trigger**: AC-NEW-7 (cache-poisoning safety budget; cross-flight error compounding) per `gps-denied-onboard/_docs/00_problem/acceptance_criteria.md`. + +**Goal (from the onboard side)**: when `satellite-provider` serves tiles to a future flight's pre-flight cache build, tiles ingested from prior flights must NOT be served as "trusted basemap" until multiple independent flights agree on geo-alignment for the same area. + +**Algorithmic intent (not prescriptive — Service team owns the design)**: +- Tiles enter with `voting_status = pending`. +- A tile is promoted to `voting_status = trusted` when ≥N independent companions (different `companion_id`) have ingested geometrically-consistent tiles covering the same lat/lon/zoom cell, weighted by the quality metadata above. +- The pre-flight cache builder (operator-side tool) consumes only `trusted` tiles by default; can be overridden to accept `pending` tiles for stale-area refresh, with explicit operator confirmation. +- Stale tiles (per AC-8.2 freshness) are demoted on age regardless of trust status. + +**Design questions for `satellite-provider`'s Plan phase**: +- N (votes-required threshold) — driven by AC-NEW-7's safety budget back-solved against measured per-flight pose error CDF. +- How to detect adversarial agreement (multiple compromised companions colluding) — out-of-band integrity checks against Google Maps ground truth? +- What "geometric consistency" means quantitatively (pixel-level RANSAC on overlapping tiles? GTSAM factor-graph over multi-flight poses?). +- What happens when `trusted` tiles disagree with newly ingested `pending` tiles in active-conflict sectors (legitimate scene change vs. cache poisoning). 
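
As an illustration of the algorithmic intent above, the promotion rule can be sketched as a count of distinct companions per cell. This is a sketch under stated assumptions, not the Service's design: `IngestedTile`, `promote_cells`, the `(zoom, x, y)` cell key, the precomputed `consistent` flag, and the default `min_companions=3` are all hypothetical placeholders, since N and the definition of geometric consistency are open design questions. Quality-metadata weighting and freshness demotion are omitted.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class IngestedTile:
    """One onboard-ingested tile (hypothetical shape, for illustration only)."""
    cell: tuple[int, int, int]  # assumed cell key: (zoom, x, y)
    companion_id: str           # which deployed unit produced the tile
    flight_id: str              # which flight the tile came from
    consistent: bool            # assumed output of a geometric-consistency check


def promote_cells(
    tiles: Iterable[IngestedTile],
    min_companions: int = 3,  # placeholder for N; the real value is an open question
) -> dict[tuple[int, int, int], str]:
    """Return a voting status per cell.

    A cell becomes "trusted" only when at least `min_companions` distinct
    companions contributed a consistent tile; otherwise it stays "pending".
    Several flights by the same companion count as a single vote.
    """
    voters: dict[tuple[int, int, int], set[str]] = defaultdict(set)
    for tile in tiles:
        cell_voters = voters[tile.cell]  # registers the cell even without a valid vote
        if tile.consistent:
            cell_voters.add(tile.companion_id)
    return {
        cell: "trusted" if len(ids) >= min_companions else "pending"
        for cell, ids in voters.items()
    }
```

Counting `companion_id` rather than `flight_id` reflects the "different `companion_id`" wording above; it does not address colluding companions, which remains one of the listed design questions.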
+ +--- + +## Hand-off + +Next time `/autodev` runs in the **`satellite-provider`** workspace: + +1. Bootstrap should detect this leftover via cross-workspace search (`/Users/obezdienie001/dev/azaion/suite/gps-denied-onboard/_docs/_process_leftovers/`) — NOTE: cross-workspace leftover detection is not yet implemented in autodev; human operator must surface this manually for now. +2. The Plan skill should add Design Task #1 + Design Task #2 to the satellite-provider Plan cycle as new components / endpoints. +3. After both are implemented, this leftover can be deleted from `gps-denied-onboard`. + +Until then, `gps-denied-onboard` Plan / Decompose / Implement phases will proceed with the architecture vision treating both capabilities as **planned external dependencies** (not yet available, but contract is sketched above).
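
To make the planned contract more concrete, the `quality_metadata` blob from the Design task #1 sketch could be assembled on the onboard side roughly as follows. This is a hedged sketch: the field names come from the contract sketch, while the function name `build_quality_metadata` and every validation rule (allowed labels enforced client-side, symmetric covariance, non-negative values) are assumptions made for illustration, not agreed behavior.

```python
import json
import math

# Labels listed in the contract sketch for `estimator_label`.
ESTIMATOR_LABELS = ("satellite_anchored", "visual_propagated", "dead_reckoned")


def build_quality_metadata(
    estimator_label: str,
    covariance_2x2: list,
    last_anchor_age_ms: int,
    mre_px: float,
    imu_bias_norm: float,
) -> str:
    """Serialize per-tile quality metadata to JSON (hypothetical helper)."""
    if estimator_label not in ESTIMATOR_LABELS:
        raise ValueError(f"unknown estimator_label: {estimator_label!r}")
    if len(covariance_2x2) != 2 or any(len(row) != 2 for row in covariance_2x2):
        raise ValueError("covariance_2x2 must be a 2x2 matrix")
    (s_xx, s_xy), (s_yx, s_yy) = covariance_2x2
    # Assumed sanity checks: symmetric matrix with non-negative variances.
    if not math.isclose(s_xy, s_yx) or s_xx < 0 or s_yy < 0:
        raise ValueError("covariance_2x2 must be symmetric with non-negative variances")
    if last_anchor_age_ms < 0 or mre_px < 0 or imu_bias_norm < 0:
        raise ValueError("ages, errors, and norms must be non-negative")
    return json.dumps(
        {
            "estimator_label": estimator_label,
            "covariance_2x2": [[s_xx, s_xy], [s_yx, s_yy]],
            "last_anchor_age_ms": last_anchor_age_ms,
            "mre_px": mre_px,
            "imu_bias_norm": imu_bias_norm,
        },
        sort_keys=True,
    )
```

Rejecting malformed metadata before upload keeps the Service-side 400 path for genuinely unexpected input; whether the Service repeats these checks is for its own Plan phase to decide.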