Update autodev skill documentation and acceptance criteria

Enhanced the SKILL.md file to enforce conciseness rules for the state file, specifying acceptable content and file size limits. Updated the autodev state to reflect the transition to the planning phase, including changes to the current step and sub-step details. Revised acceptance criteria to clarify validation requirements and external dependencies, ensuring alignment with the latest research findings. Added a new overlay for Mode B revisions to track changes and decisions made during the assessment process.
Oleksandr Bezdieniezhnykh
2026-05-09 03:10:57 +03:00
parent 846670a5c5
commit c19c76481c
21 changed files with 2354 additions and 10 deletions
@@ -23,6 +23,7 @@ This folder replaces the previous monolithic `02_fact_cards.md` (1480 lines, too
| [`C6_tile_cache_spatial_index.md`](C6_tile_cache_spatial_index.md) | **C6** — Tile cache + spatial index | #92–#93 (2 facts, **batch 1 closed at 2/N 2026-05-08**) | **Cand 1 RECOMMENDED PRIMARY**: Manual mirror of existing parent-suite `satellite-provider` pattern (verified directly via Source #92 filesystem read at /Users/obezdienie001/dev/azaion/suite/satellite-provider/) — PostgreSQL btree composite on slippy-map `(tile_zoom, tile_x, tile_y, version)` for geographic spatial-grid range queries + `bytea` descriptor blobs + app-side FAISS `IndexHNSWFlat(d, M=32)` loaded at takeoff via `faiss.read_index` for descriptor ANN + filesystem tile storage at `./tiles/{zoom}/{x}/{y}.{image_type}` slippy-map convention; clean PostgreSQL License + MIT + LGPL/MIT-Apache; trivial dependency footprint (no Postgres extensions); empirically-confirmed Postgres-on-Jetson viability per Source #97 March 2026 article (CPU cores limiting, NOT memory); ~6-54 ms per cache hit comfortably within AC-4.1 400 ms p95 budget; ~700 MB-1.5 GB total memory footprint within AC-4.2 8 GB budget. **Cand 2 DEFERRED secondary**: PostgreSQL + PostGIS 3.4 GiST on `geography(POINT,4326)` with KNN distance ordering (`<->`) + pgvector 0.7+ HNSW for descriptor ANN + same filesystem tile storage; native KNN + radius + combined-SQL capabilities are real improvements BUT 5-10× slower geographic lookup than Cand 1 + heavier dependency (~50-100 MB additional memory + ~50-200 MB additional disk install) + PostGIS GPL-2.0-or-later license-complexity (CONTINGENT REJECT under D-C1-1 = (b) BSD/permissive-only-track) + DIVERGENT from suite pattern + improvements marginal-to-negative in project's pinned 3 Hz spatial-grid query operating context. **Comparative-improvement-vs-Cand-1 verdict**: per user's session-start "significant-improvement-only" bar, no material justification to deviate from existing satellite-provider pattern. 
Decisions: D-C6-1 (NEW) descriptor-storage-format choice (halfvec recommended); D-C6-2 (NEW Cand-1-only) FAISS index variant choice (IndexHNSWFlat M=32 recommended); D-C6-3 (NEW Cand-1-only CROSS-COMPONENT with C10) descriptor-cache-rebuild-trigger strategy (periodic-during-C10-pre-flight recommended); D-C6-4 (NEW Cand-1-only) geographic-spatial-grid radius (dynamic recommended); D-C6-5 (NEW Cand-2-only contingent) Jetson PostGIS+pgvector co-installation Plan-phase verification (verify-on-Jetson-MVE recommended); D-C6-6 (NEW Cand-2-only contingent) pgvector descriptor-storage-type choice (halfvec recommended); D-C6-7 (NEW CROSS-COMPONENT affects parent-suite satellite-provider) cascade-changes-back-to-suite strategy (leave-unchanged recommended given Cand 1 closure verdict). |
| [`C7_inference_runtime.md`](C7_inference_runtime.md) | **C7** — On-Jetson inference runtime | #94–#96 (3 facts, **batch 1 closed at 3/N 2026-05-08**) | **Cand 1 RECOMMENDED PRIMARY**: TensorRT native — JetPack 6.2 bundled TensorRT 10.3 + `IInt8EntropyCalibrator2` + `BuilderFlag.FP16+INT8` mixed-precision + engines built directly on Jetson Orin Nano Super SM 87 (Apache-2.0 in TensorRT 10.x; ships with JetPack so zero-effort install; lowest-latency primary path; 2-3× speedup at INT8 vs FP16 per Source #102 YOLO26 benchmark; engines tied to SM 87 hardware-specific per Source #105 — must be built on deployed Jetson via D-C7-7); **Cand 2 modern-competitive-lead-cross-architecture-portability**: ONNX Runtime + TensorRT EP — `onnxruntime-gpu` via Jetson AI Lab JP6/CU126 wheel index + `TensorrtExecutionProvider` config + automatic CUDA EP / CPU EP subgraph fallback (MIT throughout; cross-architecture portability for replay/SITL on x86 dev hosts; `pip install onnxruntime-gpu` does NOT work on Jetson — needs Jetson AI Lab community wheel via D-C7-3 + numpy<2.0.0 pin via D-C7-4); **Cand 3 mandatory simple-baseline**: pure PyTorch FP16 — `torch.amp.autocast` + `model.half()` + Jetson AI Lab PyTorch 2.5 ARM64 wheel (BSD-3-Clause throughout; zero-conversion regression baseline; reference-correctness oracle for accuracy validation of TRT-built engines; standard `pip install torch` lacks CUDA on Jetson — needs Jetson AI Lab wheel via D-C7-5). **Cross-cutting precision policy** (D-C7-6 NEW CROSS-COMPONENT, affects C2+C3+C1+C7): VPR backbones (CNN-class MixVPR/EigenPlaces/NetVLAD) → INT8+FP16 mixed; ViT-class VPR (SelaVPR DINOv2-L; conditional AnyLoc/BoQ/DINOv2-VLAD) → FP16-only initially, INT8 deferred to Jetson MVE per D-C2-5; matchers (LightGlue with SP/DISK/ALIKED, XFeat, XFeat+LighterGlue) → **FP16-only — NO INT8** per Source #103 quantization-sensitivity finding (LightGlue FP8 ModelOpt collapsed match counts); learned VIO frontends → FP16-only initially. 
**Triton/DeepStream/CUDA-Python custom kernels considered-and-rejected** (server/video-pipeline class + out-of-budget for embedded 8 h mission) per c7_overkill_options scope choice. Decisions: D-C7-1 (NEW Cand-1-only CROSS-COMPONENT with C9) calibration-dataset-strategy (AerialVL S03 + AerialExtreMatch recommended); D-C7-2 (NEW Cand-1-only) TensorRT mixed-precision flag matrix (per-family policy per D-C7-6 recommended); D-C7-3 (NEW Cand-2-only) ORT-Jetson-wheel-index-pin (mirror to project artifact registry + cu126 recommended); D-C7-4 (NEW Cand-2-only) numpy-version-pin (`numpy<2.0.0` recommended); D-C7-5 (NEW Cand-3-only) PyTorch-Jetson-wheel-pin (PyTorch 2.5 + torchvision 0.20 recommended); D-C7-6 (NEW CROSS-COMPONENT C2+C3+C1+C7) INT8-vs-FP16-per-model-family-precision-policy (per-family policy recommended); D-C7-7 (NEW Cand-1-only CROSS-COMPONENT with C10) engine-build-on-Jetson-vs-prebuilt strategy (primary build-on-target + reference-Jetson fallback recommended); D-C7-8 (NEW Cand-1-only) `config.max_workspace_size` cap (1 GB safe default recommended); D-C7-9 (NEW Cand-1-only) TensorRT version pin within JetPack lifecycle (JetPack 6.2 + TensorRT 10.3 recommended). |
| [`C10_preflight_provisioning.md`](C10_preflight_provisioning.md) | **C10** — Pre-flight cache provisioning (CROSS-COUPLING MINIMAL scope per 2026-05-08 user choice C; only D-C6-3 + D-C7-7 confirmation pipelines researched here, operator tooling design deferred to Plan-phase) | #100–#101 (2 facts, **batch 1 closed at 2/N 2026-05-08**) | **D-C6-3 confirmation (Fact #100)**: descriptor-cache rebuild trigger + atomic-write strategy via direct `faiss.write_index`/`faiss.read_index` Python API + `python-atomicwrites` (write-temp + `fsync` + atomic rename) + content-hash (SHA-256) verification gate at takeoff load + `IO_FLAG_MMAP_IFC` mmap load with `madvise(MADV_WILLNEED)` pre-fault + manifest-hash-driven rebuild trigger; FAISS MIT + atomicwrites MIT throughout; FAISS warns "no internal integrity check, expects validated input" — MITIGATED by content-hash gate at takeoff (binds AC-NEW-7 cache-poisoning safety); rebuild-while-not-flying constraint per restrictions.md. **D-C7-7 confirmation (Fact #101)**: hybrid TensorRT engine-build orchestration — Polygraphy CLI primary for INT8-calibrating builds (`polygraphy convert --int8 --calib-cache=<path> ...` Apache-2.0 + Calibrator API replaces hand-written `IInt8EntropyCalibrator2`) + `trtexec` for fast cache-reuse rebuilds (`--fp16 --int8 --calib=<existing_cache>`) + direct `IBuilderConfig` Python API as escape hatch for unusual models (LightGlue dynamic-shape profiles); calibration cache binary-blob reuse keyed by `SHA-256(calib_corpus)` per D-C10-6; engines tied to SM 87 hardware-specific per Source #105 → must be built on deployed Jetson per D-C7-7 closure (D-C10-8 reference-Jetson-at-HQ + deployed-Jetson-copy-to-archive prebuilt-fallback venue); self-describing filename schema `<model>_sm<SM>_jp<JP>_trt<TRT>_<precision>.engine` per D-C10-7; binds AC-4.1/4.2 latency+memory budgets via D-C7-2 mixed-precision flag matrix + D-C7-1 calibration corpus closure. |
| [`MODEB_addendum.md`](MODEB_addendum.md) | **Mode B addendum** — solution_draft01 assessment (2026-05-08) | #102–#113 (12 facts) | Documentary-audit findings (Facts #102–#108): VINS-Mono BSD/GPL deliverable-formatting error (#102), AC-4.1 latency budget overrun (#103), camera calibration unspecified (#104), Suite Sat Service voting-layer contract gap (#105), `00_ac_assessment.md` BLOCKING-gate skip acknowledged (#106), AC-4.5 FC-consumption pathway scope clarification (#107), SQ2 AdHoP + Top-N re-rank sub-stage absence in solution_draft01 architecture (#108). Web-research findings (Facts #109–#113): MAVLink no-default-auth + MAVLink-2.0 message-signing per FC (#109), MegaLoc + UltraVPR D-C2-11 deferred-evaluation revision (#110), `MAV_CMD_SET_EKF_SOURCE_SET` no-deployed-GCS-implementer re-confirmation (#111), OpenCV ≥4.12.0 CVE pin (#112), XoFTR + DINOv2-features cross-modal contrarian evidence (#113). |
| [`C8_fc_adapter.md`](C8_fc_adapter.md) | **C8** — MAVLink / MSP2 FC adapter | #97–#99 (3 facts, **batch 1 closed at 3/N 2026-05-08**) | **Cand 1 RECOMMENDED PRIMARY for ArduPilot**: pymavlink → MAVLink `GPS_INPUT` (msg 232) cooperative-path; `master.mav.gps_input_send(time_usec, gps_id, ignore_flags, time_week_ms, time_week, fix_type, lat, lon, alt, hdop, vdop, vn, ve, vd, speed_accuracy, horiz_accuracy, vert_accuracy, satellites_visible, yaw)` periodic injection at 5 Hz over MAVLink (UART/USB/UDP per D-C8-1); FC-side `GPS1_TYPE=14` MAVLink + `EK3_SRC1_POSXY=3` GPS source-set drives EKF3 ingestion via `AP_GPS_MAV` (verified Source #4 SQ6 + Source #106 + Source #107); pymavlink LGPL-3.0 linkable from Apache-2.0 app per LGPL §6 (D-C8-3 mitigation). **Cand 2 RECOMMENDED PRIMARY for iNav**: `MSP2_SENSOR_GPS` (id 7939 / 0x1F03) via Python MSP V2 (YAMSPy or INAV-Toolkit `msp_v2_encode`); `mspGPSReceiveNewData()` direct passthrough (no validation gate beyond data parse); covariance fields `hPosAccuracy`/`vPosAccuracy`/`hVelAccuracy` align directly with AP `GPS_INPUT.horiz_accuracy`/`vert_accuracy`/`speed_accuracy`; YAMSPy + INAV-Toolkit MIT throughout; `USE_GPS_PROTO_MSP` enabled by default in iNav target/common.h (verified Source #111 + #112 + #113); locked SQ6 + AC-4.3 + restrictions.md transport. **Cand 3 DEFERRED secondary for iNav**: UBX impersonation via pyubx2 NAV-PVT — forging u-blox NAV-PVT frames through standard GPS pipeline; iNav-side `gpsMapFixType()` validation gate requires `flags & 0x01 = 1` (gnssFixOK) AND `fixType ∈ {2,3}` per Source #110 `gps_ublox.c` lines 215-220 + 654; pyubx2 BSD-3-Clause clean dual-use; **does NOT clear user's "significant-improvement-only" bar over Cand 2** — richer protocol surface (NAV-PVT periodic + NAV-VER startup + CFG-MSG/CFG-RATE ACK behaviour) + AC-NEW-7 forgery posture + stricter validation gate + AP-path field-name divergence outweigh pyubx2 library-maturity advantage. 
**Mid-batch correction**: I caught a contradiction between my own initial AskQuestion phrasing ("UBX impersonation as ONLY iNav path") and locked SQ6 + AC-4.3 + restrictions.md verdicts; user re-locked scope via `c8_inav_recovery=B` to evaluate both as parallel candidates. Decisions: D-C8-1 (NEW Cand-1-only) pymavlink connection-string transport choice (env-driven default-UART recommended); D-C8-2 (NEW Cand-1-only CROSS-COMPONENT with AC-NEW-2) `MAV_CMD_SET_EKF_SOURCE_SET` companion-driven switch ownership pattern (companion publishes to source-set 2 + auto-switches FC recommended); D-C8-3 (NEW Cand-1-only) pymavlink LGPL-3.0 license-posture verification (bundle-unmodified-with-version-pin recommended); D-C8-4 (NEW Cand-2-only) Python MSP V2 implementation choice (YAMSPy primary + thin custom encoder fallback recommended); D-C8-5 (NEW Cand-2-only) MSP2_SENSOR_GPS injection rate (5 Hz periodic recommended); D-C8-6 (NEW Cand-3-only contingent) UBX-version-advertisement strategy (advertise version ≥ 15.0 recommended); D-C8-7 (NEW Cand-3-only contingent CROSS-COMPONENT with AC-NEW-7) AC-NEW-7 audit-trail posture for UBX impersonation (explicit FDR audit entry recommended); D-C8-8 (NEW CROSS-COMPONENT C5+C8) covariance-honesty cross-FC enforcement strategy (per-FC unit conversion recommended via 95% confidence ellipse semi-major axis from C5 GTSAM `Marginals.marginalCovariance`). |
**Cross-cutting consumers** (do not duplicate facts here, just point in):
@@ -0,0 +1,111 @@
# Fact Cards — Mode B Addendum (2026-05-08)
> Mode B Solution Assessment of `_docs/01_solution/solution_draft01.md`. New facts gathered for findings F1–F20; Mode A facts #1–#101 remain canonical and are not duplicated.
>
> Index: [`00_summary.md`](00_summary.md). Mode B sources: [`../01_source_registry/MODEB_addendum.md`](../01_source_registry/MODEB_addendum.md). Mode B fit-matrix revisions: [`../06_component_fit_matrix/MODEB_revisions.md`](../06_component_fit_matrix/MODEB_revisions.md). Mode B output: [`../../01_solution/solution_draft02.md`](../../01_solution/solution_draft02.md).
>
> Confidence labels and schema match `00_summary.md` legend.
---
## Documentary-audit findings (no new web evidence required)
### Fact #102 — solution_draft01 C1 candidate table mis-licenses VINS-Mono as "BSD permissive clean"; the underlying Mode A C1 Fact #28 correctly classifies it as GPL-3.0 (deliverable-formatting error)
- **Statement**: `solution_draft01.md` § "Component: C1" lists VINS-Mono with the cell "Security: BSD permissive clean" and "Selected (mandatory simple-baseline) — fallback if OKVIS2 fails Jetson MVE". The Mode A C1 fact card #28 (`02_fact_cards/C1_vio.md`) explicitly states VINS-Mono is "GPL-3.0 (copyleft viral) — distribution of the onboard binary requires source disclosure for the entire linked binary and triggers GPL-3 anti-tivoization clauses for embedded firmware" — and the cross-component-gates D-C1-1 license-track decision exists precisely because VINS-Mono / VINS-Fusion / OpenVINS are on the GPL-3.0 axis. Source #122 (raw VINS-Mono LICENCE on github.com) confirms canonical GPL-3.0. The discrepancy is inside Mode A Step 8 (Deliverable Formatting); the Mode A evidence layer is correct.
- **Source**: Mode A C1 Fact #28; Source #122 (canonical LICENCE)
- **Phase**: Mode B documentary audit
- **Confidence**: ✅ High
- **Sub-Question Binding**: SQ3+SQ4 / C1
- **Implication**: solution_draft02 must (a) correct the C1 candidate table cell to "GPL-3.0 contingent on D-C1-1 = (a) or (c) license track", (b) demote VINS-Mono from "Selected (mandatory simple-baseline)" status because under D-C1-1 = (b) BSD/permissive-only track it would be **Rejected** by license, (c) elevate KLT+RANSAC homemade fallback to **the** mandatory simple-baseline (matches Mode A C1 Fact #35), and (d) name the actual BSD/permissive-track lead as OKVIS2 (matches C1 Fact #31). No change to the cross-component decision graph — D-C1-1 already exists as the gate that resolves this.
### Fact #103 — solution_draft01 latency math (~140-420 ms p95 at K=3 + adaptive LightGlue depth) crosses AC-4.1's 400 ms p95 budget at the upper end with no documented slack for FC-side IMU pre-integration, MAVLink/MSP serialization, OS scheduling jitter, or thermal-throttle backoff
- **Statement**: solution_draft01 § "Component-interaction diagram (pre-flight + runtime)" labels the runtime stack: "C1 OKVIS2 VIO ~30-50 ms + C2 MixVPR query ~25 ms + C3 DISK+LightGlue × K pairs ~90-180 ms FP16 + C4 OpenCV solvePnPRansac ~5-15 ms + GTSAM Marginals ~30-90 ms + C5 GTSAM iSAM2 ~2-5 ms per update at D-C5-5 = (c) + C8 per-FC pymavlink GPS_INPUT / MSP2_SENSOR_GPS 5 Hz periodic", and says total is "~140-420 ms p95 at K=3 + adaptive LightGlue depth". The upper end **420 ms exceeds AC-4.1's 400 ms p95** at the documented Jetson Orin Nano Super extrapolation. There is no reserved slack for: (i) MAVLink/MSP serialization + UART/USB transmission to FC (~5-20 ms typical), (ii) OS scheduling jitter under shared-CPU+GPU contention (~10-30 ms typical at 90th-99th percentile per Source #97 Postgres-on-Jetson observations), (iii) thermal-throttle backoff at +50 °C ambient per AC-NEW-5 (Jetson backs off from 25 W to 15 W, collapsing throughput by ~40%), (iv) FC-side IMU pre-integration interpolation latency for the timestamp the GPS_INPUT/MSP2_SENSOR_GPS frame is targeted at, (v) FAISS HNSW index search variance at p99 (~1-3 ms typical → up to ~10-15 ms at p99 per Source #115). A defensible AC-4.1 latency partition would carve a project-side worst-case ≤300 ms p95 budget with explicit per-stage deadlines + slack reservation; current draft01 budgets up to 420 ms with implicit assumption-of-best-case stack behavior.
- **Source**: solution_draft01.md self-citation; AC-4.1; Mode A Sources #97 + #115; AC-NEW-5
- **Phase**: Mode B documentary audit
- **Confidence**: ✅ High (math is internal to draft01)
- **Sub-Question Binding**: SQ3+SQ4 / C1+C2+C3+C4+C5+C7+C8 cross-cutting NFR
- **Implication**: solution_draft02 must add a NEW Plan-phase decision **D-CROSS-LATENCY-1: AC-4.1 latency budget partition strategy** with options (a) tighten K=3 to K=2 to recover ~30-60 ms of headroom, (b) drop GTSAM `Marginals` covariance recovery from RUNTIME path and use adaptive Jacobian-based covariance per D-C4-2 = (a) to recover ~20-60 ms, (c) accept the budget overrun and validate at Jetson MVE that p95 lands under 400 ms in steady-state (i.e. trust the math is conservative and adaptive-LightGlue-depth in practice will land closer to 140 ms than 420 ms), (d) hybrid: K=3 default + auto-degrade to K=2 + Jacobian-covariance under thermal throttle. Recommendation: **(d) hybrid** — preserves AC-4.1 satisfaction across the operating envelope without permanently sacrificing accuracy. **NEW cross-component gate: requires Jetson MVE measurement of full p95+p99 distribution under hot-soak NFT-3 conditions before lock.**
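
The budget arithmetic behind Fact #103 can be sketched as below. This is a rough illustration only: it takes the figures quoted in the Statement (420 ms draft01 upper end, 400 ms AC-4.1 budget, the proposed ≤300 ms project-side budget) and adds the worst-case ends of the three slack items that are quantified in milliseconds. Items (iii) and (iv) carry no millisecond figure in the fact and are left out. Summing worst cases linearly is a deliberately conservative simplification, not a proper p95 composition, and the names (`headroom_ms`, `SLACK_WORST_CASE_MS`) are hypothetical.

```python
# Figures quoted in Fact #103, all in milliseconds.
AC_4_1_P95_BUDGET_MS = 400
DRAFT01_UPPER_MS = 420            # upper end of draft01's "~140-420 ms p95"
PROPOSED_PROJECT_BUDGET_MS = 300  # the "<=300 ms p95" partition suggested in the fact

# Worst-case ends of the slack items that Fact #103 quantifies in ms.
# Thermal-throttle backoff (iii) and FC-side interpolation (iv) are not
# given in ms in the fact, so they are intentionally absent here.
SLACK_WORST_CASE_MS = {
    "mavlink_msp_serialization": 20,  # item (i), ~5-20 ms
    "os_scheduling_jitter": 30,       # item (ii), ~10-30 ms
    "faiss_hnsw_p99_variance": 15,    # item (v), up to ~10-15 ms
}


def headroom_ms(pipeline_ms, slack=SLACK_WORST_CASE_MS, budget=AC_4_1_P95_BUDGET_MS):
    """Remaining budget after the pipeline time plus all slack items (linear sum)."""
    return budget - (pipeline_ms + sum(slack.values()))


print(headroom_ms(DRAFT01_UPPER_MS))            # negative: budget exceeded
print(headroom_ms(PROPOSED_PROJECT_BUDGET_MS))  # positive: slack fits
```

Under these assumptions the draft01 upper end leaves no room for any of the listed slack items, while a 300 ms project-side budget still absorbs all three quantified ones.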
### Fact #104 — Camera intrinsics + camera-to-body calibration are PROJECT-LEVEL OPEN ITEMS per `_docs/00_problem/problem.md` last sentence and `flight_derkachi/README.md`; solution_draft01 does NOT inventory this as a Plan-phase decision
- **Statement**: `_docs/00_problem/problem.md` last sentence: "Camera intrinsics, lens distortion, raw camera feed parameters, and exact camera-to-body calibration are still pending, so final production accuracy claims remain gated on calibration data or a separately surveyed representative dataset." `_docs/00_problem/input_data/flight_derkachi/README.md`: "Camera intrinsics, lens distortion, raw camera resolution, and exact camera-to-body calibration are still unknown, so this fixture is not sufficient by itself for final production camera calibration or satellite-anchor accuracy claims." `_docs/00_problem/input_data/expected_results/results_report.md` § Known Gaps: "Final production acceptance requires camera calibration and representative datasets with synchronized camera/IMU plus ground-truth trajectory." solution_draft01 cites Sources #82+#83 (OpenCV solvePnPRansac signature requires `K` intrinsic matrix + `dist` distortion coefficients) but does not flag that **K and dist are not yet known** for the deployed ADTi 20MP 20L V1 nav camera. Without intrinsics + camera-to-body extrinsic calibration, the entire C4 pose-estimation pipeline cannot run on real production frames; the Jetson MVE results will be calibration-acquisition-dependent.
- **Source**: `_docs/00_problem/problem.md` line 1; `_docs/00_problem/input_data/flight_derkachi/README.md` line 12; `_docs/00_problem/input_data/expected_results/results_report.md` § Known Gaps; OpenCV Sources #82+#83
- **Phase**: Mode B documentary audit
- **Confidence**: ✅ High
- **Sub-Question Binding**: PCM (Project Constraint Matrix) input availability dimension
- **Implication**: solution_draft02 must add a NEW project-level decision **D-PROJ-1: Camera calibration acquisition strategy** with options (a) checkerboard calibration on a pre-deployment ADTi 20MP 20L V1 nav-camera unit (canonical OpenCV calibration workflow ~1-2 days engineering + lab access), (b) photogrammetric self-calibration from the first ~50 deployment frames over known landmarks (~2-3 days plus runtime support code; produces production-correct calibration but degrades first-mission accuracy), (c) request manufacturer's factory-calibration data sheet from ADTi (low cost if available; risk: vendor may not publish per-unit calibration), (d) hybrid: factory data sheet + ground-truth checkerboard refinement on each deployed unit. Recommendation: **(d) hybrid**. **CRITICAL Plan-phase gate**: this is a hard prerequisite for AC-1.1/1.2 frame-center-accuracy validation; Test Spec (greenfield Step 5) cannot lock end-to-end accuracy fixtures without it.
### Fact #105 — AC-NEW-7 cache-poisoning safety budget explicitly depends on a Suite Sat Service-side voting layer that solution_draft01 does NOT audit for existence, contract, or build status
- **Statement**: `_docs/00_problem/acceptance_criteria.md` § AC-NEW-7 External-dependency note: "The Suite Satellite Service is expected to operate a multi-flight ingest-side voting layer that gates onboard-tile promotion to 'trusted basemap' until multiple independent flights agree on geo-alignment. Voting algorithm is the Service's concern; onboard's job (AC-8.4) is to publish per-tile quality metadata sufficient for that layer. End-to-end AC-NEW-7 evidence depends on this Service contract." solution_draft01 § Architecture lists C6 + C10 as covering the onboard half (publish per-tile quality metadata, content-hash gate at takeoff, atomic-write descriptor cache) but does NOT verify that the Suite Service voting layer (a) has a documented contract, (b) has been implemented, (c) is on the parent-suite roadmap, or (d) has a fallback if not yet built. Without the Service-side voting, a single bad onboard pose with optimistic covariance writes a misaligned tile that becomes the next flight's anchor — cross-flight error compounding that NFT-5 (in solution_draft01) explicitly tries to test but cannot validate end-to-end without the Service contract.
- **Source**: AC-NEW-7 verbatim; solution_draft01 § Architecture C6+C10; solution_draft01 § Testing Strategy NFT-5
- **Phase**: Mode B documentary audit
- **Confidence**: ✅ High
- **Sub-Question Binding**: PCM cross-component external-dependency dimension; SQ8 (security)
- **Implication**: solution_draft02 must add a NEW project-level decision **D-PROJ-2: Suite Sat Service voting-layer contract verification** with options (a) verify Suite Service voting layer is documented + scheduled for the deployment timeframe; (b) draft the contract from the onboard side and propose to the Suite Service team; (c) build a project-internal multi-flight aggregator as a stop-gap until Suite Service ships the layer (~2-3 weeks engineering, but cross-flight aggregator means onboard now owns suite-component scope creep); (d) accept that AC-NEW-7 Service-side validation is best-effort and document the gap explicitly. Recommendation: **(a) verify + (b) draft** in parallel — the contract definition is small (per-tile quality metadata schema + voting threshold spec) and propagating it back to the Suite Service team de-risks the entire AC-NEW-7 obligation. **CRITICAL cross-suite gate**: requires coordination with the parent-suite Satellite Service team before AC-NEW-7 NFT-5 can pass with end-to-end evidence.
### Fact #106 — Mode A Phase 1 BLOCKING gate (`00_ac_assessment.md`) was not produced as a standalone artifact in the Mode A run per solution_draft01's own self-disclosure
- **Statement**: solution_draft01 § Note on AC assessment (lines 17-18): "Mode A Phase 1 (`00_ac_assessment.md` BLOCKING gate per the research SKILL.md) was not executed as a standalone artifact in this run. Per-AC binding evidence is instead distributed across the per-component fact cards and the Restrictions × Candidate-Modes sub-matrix sections in `06_component_fit_matrix/Cx_*.md`. This is acknowledged as a process deviation and is recoverable by extracting an `00_ac_assessment.md` summary file from the existing per-AC binding evidence on demand. No AC has been silently dropped or unverified." Per `_docs/00_research/00_question_decomposition.md` line 4 the Phase 1 skip was a **prior user decision** after a cleanup pass that stripped implementation details from `acceptance_criteria.md` and `restrictions.md`; "AC/restrictions are treated as fixed inputs". Mode B can either (a) extract the standalone artifact retroactively from the distributed evidence, (b) confirm the deviation as accepted by the user, or (c) leave it as-is for Plan-phase to either resolve or carry forward. The risk is small (per-AC binding IS in the per-component fact cards) but the canonical research methodology says a BLOCKING gate cannot simply be skipped.
- **Source**: solution_draft01.md "Note on AC assessment"; `_docs/00_research/00_question_decomposition.md` line 4; research SKILL.md Mode A Phase 1 BLOCKING-gate spec
- **Phase**: Mode B documentary audit
- **Confidence**: ✅ High
- **Sub-Question Binding**: Process compliance with research SKILL.md
- **Implication**: solution_draft02 acknowledges the deviation and recommends extraction of `00_ac_assessment.md` IF user wants the canonical artifact; otherwise the deviation is treated as accepted (per `00_question_decomposition.md` line 4 prior-user decision) and recorded explicitly in `_docs/_process_leftovers/`.
### Fact #107 — AC-4.5 (system may refine prior estimates and emit corrections) FC-consumption pathway is unspecified; neither MAVLink `GPS_INPUT` nor MSP2 `MSP2_SENSOR_GPS` support "correct prior frame N+ago" semantics; GTSAM iSAM2's NATIVE look-back refinement is therefore internal-only and does not reach the FC
- **Statement**: AC-4.5 ("System may refine prior estimates and emit corrections") is satisfied by GTSAM iSAM2's incremental smoothing per Mode A Fact #89 — the estimator can revise past keyframe poses when new measurements arrive. solution_draft01 § Component C5 + § Testing Strategy IT-10 cite this as a key benefit of D-C5-5 = (c) GTSAM-shared-substrate. However: ArduPilot's `AP_GPS_MAV` (Source #4) and iNav's `mspGPSReceiveNewData()` (Source #110) both consume the **latest** received GPS frame as the current best estimate; neither supports retroactive correction of a frame N steps in the past. So GTSAM iSAM2's look-back refinement value is **internal-only** — it improves the current best pose estimate after smoothing the past, but the FC sees only the current frame after smoothing, not corrections to past frames. AC-4.5 is therefore satisfied as "internal estimator refines past + emits the corrected current estimate", not as "FC retroactively corrects past flight log". Draft01 does not make this scoping explicit; IT-10 in particular does not validate AC-4.5 — it validates per-FC unit conversion of covariance.
- **Source**: AC-4.5 verbatim; Mode A Fact #89 (GTSAM iSAM2); Mode A SQ6 Source #4 (`AP_GPS_MAV.cpp`); Mode A C8 Source #110 (`gps_ublox.c`)
- **Phase**: Mode B documentary audit
- **Confidence**: ✅ High
- **Sub-Question Binding**: SQ3+SQ4 / C5; SQ6 / C8
- **Implication**: solution_draft02 § Architecture C5 must clarify "AC-4.5 satisfied as internal smoothing + corrected current-frame emission; FC log is forward-time only". solution_draft02 § Testing Strategy must add a new **IT-11 — Smoothing-loop look-back accuracy** test that validates GTSAM iSAM2's smoothed past-keyframe poses against ground-truth at smoothing convergence (independent of FC-side consumption). FDR (AC-NEW-3) MUST log smoothed past-frame estimates so post-mission analysis can verify AC-4.5.
### Fact #108 — SQ2 architectural decisions promoted "AdHoP refinement loop" + "Top-N inlier-based re-rank" to explicit named sub-stages in the runtime pipeline (per `_docs/00_research/00_question_decomposition.md` lines 175-178), but solution_draft01 § Architecture has neither a candidate row nor a named sub-stage for either
- **Statement**: `_docs/00_research/00_question_decomposition.md` § "SQ2 — Architectural decisions" — Decision 2: "AdHoP refinement loop (Fact #22) → (b) Conditional — only invoked when initial reprojection error exceeds a threshold. C3 (matcher) latency budget = base (single-pass) + AdHoP-conditional overhead (worst-case 2× when triggered)." Decision 3: "Top-N re-rank promotion (Fact #25) → (a) Promote to an explicit named sub-stage between C2 and C3. SQ3+SQ4 will hyperparameter-sweep N ∈ {5, 10, 15, 20}; C2 candidates evaluated jointly with re-rank cost. Top-N re-rank by inlier-count is now a hard pipeline component, not implicit." solution_draft01 § Architecture lists candidate tables for C1+C2+C3+C4+C5+C6+C7+C8+C10. The component-interaction diagram shows "C2 MixVPR query → top-K=3 satellite tile retrieval → C3 DISK+LightGlue × K pairs" — the K=3 retrieval IS the top-K from C2, but the **re-rank by inlier count** sub-stage promised by SQ2 Decision 3 is not represented. Similarly, no AdHoP-conditional refinement candidate appears in the C3 row, despite SQ2 Decision 2 carving its latency budget.
- **Source**: `_docs/00_research/00_question_decomposition.md` lines 175-178; solution_draft01 § Architecture C1-C10; solution_draft01 § Component-interaction diagram
- **Phase**: Mode B documentary audit
- **Confidence**: ✅ High
- **Sub-Question Binding**: SQ2 closure
- **Implication**: solution_draft02 must either (a) populate the architecture with new candidate rows for "Top-N re-rank by inlier count" (likely a thin wrapper around the C3 matcher's RANSAC inlier counter) and "AdHoP-conditional refinement" (per Source #40 OrthoLoC AdHoP method-agnostic preconditioning); or (b) explicitly close SQ2 Decisions 2+3 as "implicit inside C3" — but in that case the "promote to explicit named sub-stage" wording from question_decomposition must be revisited and the user notified that the architecture deviated. Recommendation: **(a) populate** — both are well-scoped sub-components with cited Sources (#22+#25 in the original SQ2 closure) and the latency budgets are already carved. solution_draft02 § Architecture adds a "Re-rank" sub-stage between C2 and C3 plus an "AdHoP-conditional" sub-stage between C3 and C4.
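
As a hedged illustration of the "Re-rank" sub-stage described in Fact #108, the sketch below re-orders the top-N retrieval candidates from C2 by matcher inlier count and keeps the best K for C3/C4. Everything here is hypothetical: `rerank_by_inliers`, the `count_inliers` callback (standing in for the C3 matcher's RANSAC inlier counter), and the defaults (`top_n=10` is one point from the N ∈ {5, 10, 15, 20} sweep, `keep_k=3` mirrors K=3). It is not the project's implementation.

```python
from typing import Callable, List, Tuple


def rerank_by_inliers(
    retrieved: List[str],
    count_inliers: Callable[[str], int],
    top_n: int = 10,
    keep_k: int = 3,
) -> List[Tuple[str, int]]:
    """Re-rank the first top_n retrieved tile ids by inlier count, keep the best keep_k.

    `retrieved` is assumed to be ordered best-first by the VPR stage.
    """
    shortlist = retrieved[:top_n]
    scored = [(tile_id, count_inliers(tile_id)) for tile_id in shortlist]
    # list.sort is stable even with reverse=True, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:keep_k]


# Usage with made-up inlier counts in place of a real matcher.
fake_inliers = {"a": 12, "b": 85, "c": 40, "d": 85}
print(rerank_by_inliers(["a", "b", "c", "d"], fake_inliers.__getitem__, top_n=4))
```

Note that the callback runs the matcher once per shortlisted candidate, which is why the fact ties the choice of N to the C3 latency budget.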
---
## Web-research findings (2026-05-08)
### Fact #109 — MAVLink protocol lacks cryptographic authentication by default (CVE-2026-1579, CVSS 9.8 CRITICAL); ArduPilot supports MAVLink 2.0 message signing as the canonical mitigation; iNav has only partial MAVLink support and does NOT implement message signing — cross-FC asymmetry on the GCS / telemetry link is material for AC-NEW-7 + AC-NEW-2
- **Statement**: Per Source #126 (NVD CVE-2026-1579, CVSS 9.8 CRITICAL): "The MAVLink communication protocol lacks cryptographic authentication by default. Unauthenticated parties with MAVLink interface access can send arbitrary messages including SERIAL_CONTROL commands for interactive shell access." Affected named: PX4 Autopilot v1.16.0_SITL_latest_stable. **Mitigation: enable MAVLink 2.0 message signing.** Per Source #128 (ArduPilot Plane MAVLink2 Signing docs): ArduPilot supports MAVLink2 signing via Mission Planner SETUP > Advanced > "Mavlink Signing"; non-USB serial ports can be configured to only respond to MAVLink commands carrying the correct passkey; a 13-byte signature includes link ID (8 bits), timestamp (48 bits in 10-microsecond units since 2015-01-01), and 48-bit SHA-256 hash signature based on packet + timestamp + secret key (Source #128 + canonical mavlink.io/en/guide/message_signing.html). Issue #28736 + PR #29546 (March 2025) add channel-specific signing for separate MAVLink ports — direct relevance to companion-computer wired connections per Source #128. Per Source #129 (iNav MAVLink wiki, frogmane edited 2025-12-11): "iNav has partial MAVLink support and does not implement message signing. It lacks parameter API support and has limited command compatibility." Companion-FC inbound on iNav is MSP2 (not MAVLink) so the signing-gap is on the OUTBOUND MAVLink telemetry side from iNav to the GCS, not on the inbound external-positioning path — but cross-FC asymmetry is still material because the GCS link itself carries `STATUSTEXT` and operator commands per AC-6.1 + AC-6.2.
- **Source**: Source #126 (CVE-2026-1579), Source #128 (ArduPilot Plane MAVLink2 Signing docs + PR #29546), Source #129 (iNav MAVLink wiki), canonical mavlink.io/en/guide/message_signing.html (Source #128 cross-cite)
- **Phase**: Mode B web research
- **Confidence**: ✅ High (NVD official + ArduPilot official + iNav official wiki)
- **Sub-Question Binding**: SQ6 (FC adapter security posture) + SQ8 (AC-NEW-7 + AC-NEW-2 security)
- **Implication**: solution_draft02 must add a NEW Plan-phase decision **D-C8-9: MAVLink 2.0 message signing posture per FC** with options (a) require MAVLink 2.0 signing on ALL MAVLink channels (companion ↔ ArduPilot, FC ↔ GCS, companion ↔ GCS); (b) require signing only on the companion ↔ ArduPilot wired channel (the inbound external-positioning path on AP); (c) accept the unsigned-by-default posture and document it as an external-attack-surface risk; (d) hybrid: signing on the companion ↔ AP wired channel + key rotation on every flight. Recommendation: **(d) hybrid for AP**. Cross-FC asymmetry: iNav has no signing option per Source #129, so the iNav GCS link is unsigned; document this explicitly under AC-NEW-7 and propose an iNav firmware feature request as a Plan-phase carryforward. **NEW NFT-8 (MAVLink message-signing verification)**: SBOM dump confirms passkey configuration for the AP signing channel; the iNav side documents the unsignable link as accepted residual risk.
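
The 13-byte signature layout described in the Statement can be sketched in a few lines. This is a minimal illustration, assuming the hash input order given in the mavlink.io signing guide (secret key, then the serialised packet, then link ID and timestamp) and little-endian timestamp packing; the function names `signing_timestamp` and `build_signature` are hypothetical and are not part of any MAVLink library.

```python
import hashlib
from datetime import datetime, timezone

# Epoch for MAVLink 2 signing timestamps (1 January 2015, UTC).
SIGNING_EPOCH = datetime(2015, 1, 1, tzinfo=timezone.utc)


def signing_timestamp(now: datetime) -> int:
    """Return the time elapsed since 2015-01-01 in 10-microsecond units."""
    delta = now - SIGNING_EPOCH
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    return micros // 10


def build_signature(secret_key: bytes, frame: bytes, link_id: int, timestamp: int) -> bytes:
    """Build the 13-byte block: link ID (1) + timestamp (6) + truncated hash (6).

    `frame` is the already-serialised packet (header + payload + CRC).
    """
    if len(secret_key) != 32:
        raise ValueError("MAVLink 2 signing uses a 32-byte secret key")
    link = bytes([link_id & 0xFF])             # 8-bit link ID
    stamp = timestamp.to_bytes(6, "little")    # 48-bit timestamp
    digest = hashlib.sha256(secret_key + frame + link + stamp).digest()
    return link + stamp + digest[:6]           # keep only the first 48 bits
```

A receiver recomputes the same truncated hash with its copy of the key and rejects the packet on mismatch. In practice a library such as pymavlink would handle this, so the sketch is only meant to make the field sizes in NFT-8 concrete.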
### Fact #110 — MegaLoc (Berton & Masone, CVPR 2025) and UltraVPR (RAL 2025 / ICRA 2026) are MIT-licensed aerial-validated VPR candidates that materially change the D-C2-11 deferred-evaluation recommendation; UltraVPR specifically targets UAV with documented 44 Hz throughput on Jetson Orin NX (Orin-Nano-Super-class)
- **Statement**: Per Source #123 (MegaLoc): MIT-licensed; February 2025 release; SOTA on multiple VPR benchmarks (indoor + outdoor); validated on aerial datasets via the AirZoo benchmark (Source #125) which "demonstrates that fine-tuning MegaLoc on aerial data yields substantial performance gains for aerial image retrieval and cross-view matching tasks". Distributed via torch.hub for easy installation. Per Source #124 (UltraVPR): MIT-licensed; RAL 2025 + ICRA 2026; "unsupervised lightweight rotation-invariant aerial VPR system designed for UAV applications"; **ONNX model runs at approximately 44 Hz on Jetson Orin NX** (Orin Nano Super is in the same Ampere family; expected to land in the same throughput band ±20%); validated on VPAir + UAV-VisLoc datasets — directly relevant to the project's pinned aerial UAV operating context. solution_draft01 § "Open decisions for Plan-phase" line 322 explicitly defers D-C2-11 (MegaLoc successor evaluation) to "post-research session" because Mode A had not gathered sufficient evidence on MegaLoc's aerial applicability or Jetson runtime. Mode B research closes both gaps: MegaLoc is aerial-validated (AirZoo); UltraVPR is aerial-pretrained + Jetson-throughput-documented. The D-C2-11 recommendation should be revised from "(c) skip and rely on closed mandatory pre-screen" to "(a) treat both MegaLoc AND UltraVPR as new Documentary Lead candidates on the BSD/permissive C2 axis at next session, with mandatory Jetson MVE under D-C1-2 / D-C2-4".
- **Source**: Source #123 (MegaLoc), Source #124 (UltraVPR), Source #125 (AirZoo aerial validation)
- **Phase**: Mode B web research
- **Confidence**: ✅ High (peer-reviewed publications + official repos with pretrained weights)
- **Sub-Question Binding**: SQ3+SQ4 / C2 D-C2-11
- **Implication**: solution_draft02 § Architecture C2 must add MegaLoc + UltraVPR as Documentary Lead candidates on the BSD/permissive C2 axis. UltraVPR is potentially the strongest candidate under the project's operating-context scoring: rotation-invariant (multi-heading aerial flights), unsupervised (no aerial-retrain cost, which closes D-C2-1), runtime-documented on Jetson Orin NX at 44 Hz (about 14.7× the 3 Hz nav-camera rate), and MIT-licensed (BSD/permissive track clean). MegaLoc is the broader-applicability primary if SOTA across non-aerial datasets is also wanted (e.g., for cross-domain generalization). **D-C2-11 revised**: (a) elevate UltraVPR to Documentary Lead PRIMARY recommendation on the BSD/permissive C2 axis, with MixVPR / EigenPlaces / SelaVPR as siblings; (b) add MegaLoc as Documentary Lead SECONDARY with a broader-applicability tag; (c) preserve the closed pre-screen (5/5: MixVPR + SALAD + SelaVPR + NetVLAD + EigenPlaces) as fallback. The mandatory Jetson MVE per D-C1-2 / D-C2-4 expands in scope to cover UltraVPR, MegaLoc, and the existing five.
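
The throughput headroom behind the UltraVPR recommendation is simple arithmetic, sketched below. The 44 Hz figure and the ±20% transfer band to Orin Nano Super come from this fact card; the calculation assumes the VPR model has the accelerator to itself, which will not hold once the matcher and other components share it, so the numbers are an upper bound rather than a measured budget. All names are illustrative.

```python
NAV_CAMERA_HZ = 3.0          # nav-camera rate from the fact card
ULTRAVPR_ORIN_NX_HZ = 44.0   # documented ONNX throughput on Jetson Orin NX
TRANSFER_BAND = 0.20         # the card's expected +/-20% band for Orin Nano Super


def headroom(throughput_hz: float, camera_hz: float = NAV_CAMERA_HZ) -> float:
    """How many times faster than the camera rate the model can run."""
    return throughput_hz / camera_hz


def frame_budget_share(throughput_hz: float, camera_hz: float = NAV_CAMERA_HZ) -> float:
    """Fraction of one camera frame period used by a single inference."""
    return camera_hz / throughput_hz


low_hz = ULTRAVPR_ORIN_NX_HZ * (1 - TRANSFER_BAND)
high_hz = ULTRAVPR_ORIN_NX_HZ * (1 + TRANSFER_BAND)

for label, hz in (("low", low_hz), ("nominal", ULTRAVPR_ORIN_NX_HZ), ("high", high_hz)):
    print(f"{label}: {hz:.1f} Hz, headroom {headroom(hz):.1f}x, "
          f"frame share {frame_budget_share(hz):.1%}")
```

Across the assumed band the headroom ranges from roughly 11.7× to 17.6×, so even the pessimistic end leaves a single VPR inference using under a tenth of the 3 Hz frame period. The Jetson MVE remains the authoritative measurement.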
### Fact #111 — D-C8-2 = (b) companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` switch pattern is supported by ArduPilot at firmware level since August 2021 (PR #18345 → SITL-tested) but no production-deployed GCS or companion implementation is publicly documented; the project will be establishing the canonical pattern itself
- **Statement**: Per Source #130 (ArduPilot common-ekf-sources.rst + PR #18345): ArduPilot has supported `MAV_CMD_SET_EKF_SOURCE_SET` (MAVLink command id 42007) since its merge in August 2021; the command accepts source-set values in the range 1-3 and was tested in SITL. Source #130 explicitly states: "no GCSs are currently known to implement this" and "The results do not provide specific information about Auterion, NGPS, or production deployment status." This re-confirms Mode A SQ6 Fact #3 from a fresh search at access time 2026-05-08. solution_draft01 § Open decisions D-C8-2 = (b) "companion publishes to source-set 2 + auto-switches FC to set 2 on first valid fix + switches back to set 1 when companion is unavailable (RECOMMENDED ~mirrors NGPS/Auterion pattern)" cites the NGPS/Auterion deployment pattern, but Mode A SQ1 Sources #25#37 do not provide direct evidence of either NGPS or Auterion using the companion-driven switch: they document the existence of those deployed systems but not their internal source-set switching mechanism. This is a gap: the project is committing to a pattern that is **architecturally supported but not production-deployed**.
- **Source**: Source #130 (ArduPilot common-ekf-sources.rst + PR #18345 verified 2026-05-08); Mode A SQ6 Fact #3
- **Phase**: Mode B web research (verification re-run)
- **Confidence**: ✅ High (ArduPilot official documentation)
- **Sub-Question Binding**: SQ6 / C8 D-C8-2
- **Implication**: solution_draft02 must downgrade D-C8-2 fit status from `Selected` to **`Selected with runtime gate`** per Step 7.5.3 carve-out wording, with the runtime gate being "validated end-to-end on ArduPilot Plane SITL by IT-3 (Spoofing-promotion latency) before lock". Add NEW Plan-phase decision **D-C8-2-FALLBACK: companion-driven switch fallback strategy if SITL validation fails** with options (a) switch to operator-manual source-set flip via RC aux switch option 90 per draft01's existing D-C8-2 = (c), accepting that the AC-NEW-2 ≤3 s latency would then depend on operator response time; (b) implement an operator-warning STATUSTEXT instead of the automated switch, deferring authority to the operator; (c) escalate to the ArduPilot dev community to characterize firmware-side switch latency before lock. Recommendation: **defer to Test Spec greenfield Step 5**, which owns SITL fixture acquisition.
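
The D-C8-2 = (b) switching rule (promote to set 2 on a valid fix, fall back to set 1 when the companion is unavailable) can be sketched as a small state holder that emits a command only when the target set changes. This is a hedged illustration, not the project's implementation: the class and function names, the `companion_available` health flag, and the default target system/component IDs of 1 are all assumptions, and nothing here has been validated on SITL.

```python
from typing import Optional, Tuple

MAV_CMD_SET_EKF_SOURCE_SET = 42007  # command id cited from Source #130
FC_DEFAULT_SOURCE_SET = 1           # set the FC falls back to
COMPANION_SOURCE_SET = 2            # set the companion publishes to, per D-C8-2 = (b)

# Positional arguments for a COMMAND_LONG message:
# (target_system, target_component, command, confirmation, param1..param7)
CommandLongArgs = Tuple[int, int, int, int, float, float, float, float, float, float, float]


def build_switch_command(source_set: int, target_system: int = 1,
                         target_component: int = 1) -> CommandLongArgs:
    """Build COMMAND_LONG arguments that select one EKF source set (1-3)."""
    if source_set not in (1, 2, 3):
        raise ValueError("source set must be 1, 2 or 3")
    return (target_system, target_component, MAV_CMD_SET_EKF_SOURCE_SET, 0,
            float(source_set), 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)


class SourceSetSwitch:
    """Tracks the active source set and emits a command only on a change."""

    def __init__(self) -> None:
        self.active = FC_DEFAULT_SOURCE_SET

    def update(self, companion_available: bool, fix_valid: bool) -> Optional[CommandLongArgs]:
        if not companion_available:
            target = FC_DEFAULT_SOURCE_SET   # switch back to set 1
        elif fix_valid:
            target = COMPANION_SOURCE_SET    # promote on a valid fix
        else:
            target = self.active             # no valid fix: hold the current set
        if target == self.active:
            return None
        self.active = target
        return build_switch_command(target)
```

With pymavlink, the returned tuple could plausibly be passed straight to `command_long_send(*args)` on an open connection. Measuring the time from that call to the FC acknowledging the new source set is what the IT-3 runtime gate would need to capture.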
### Fact #112 — OpenCV 4.x must be pinned to ≥4.12.0 per CVE-2025-53644 (CVSS 9.8 CRITICAL heap buffer write via crafted JPEG); affects 4.10.0 / 4.11.0; OpenCV is C4's primary `solvePnPRansac` runtime + KLT fallback in C1 + ortho warp in C4
- **Statement**: Per Source #127 (NVD CVE-2025-53644, CVSS 9.8 CRITICAL): "Uninitialized pointer variable on stack when reading crafted JPEG images. Affects versions 4.10.0 and 4.11.0. Fixed in version 4.12.0." Related weakness: CWE-457 (Use of Uninitialized Variable). solution_draft01 § Component C4 cites "OpenCV 4.x calib3d module"; § Component C1 KLT fallback uses "OpenCV pure-Python"; the FDR thumbnail re-load + tile cache import paths potentially feed crafted JPEG bytes into OpenCV's `imread` / `imdecode`. The exposure is small (most JPEG inputs come from the trusted internal nav-camera stream), but the FDR thumbnail re-load AND ortho-tile imports from the Suite Sat Service path could be hostile-input vectors per AC-NEW-7. Pinning OpenCV to ≥4.12.0 is a single-line change with no API-break exposure (4.12 is a minor release on the 4.x line).
- **Source**: Source #127 (NVD CVE-2025-53644)
- **Phase**: Mode B web research
- **Confidence**: ✅ High (NVD official)
- **Sub-Question Binding**: SQ8 (security) + SQ3+SQ4 / C1+C4 dependency pinning
- **Implication**: solution_draft02 must pin OpenCV to **≥4.12.0** in all C1+C4 candidate rows; add NEW Plan-phase decision **D-CROSS-CVE-1: dependency security pinning posture** with options (a) lock to specific patched versions of all CVE-affected dependencies (OpenCV ≥4.12.0; FAISS (MIT-licensed): no CVEs; GTSAM: no CVEs; TensorRT 10.3 in JetPack 6.2: no applicable CVE because TRT-LLM 0.x is not used); (b) maintain a project SBOM with monthly CVE re-scan; (c) automate pinning via dependabot or equivalent. Recommendation: **(a) + (b)**, for minimal cost and maximum AC-NEW-7 audit-trail coverage.
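
A startup or CI guard for the OpenCV pin could look like the sketch below. It takes the version as a string so it can be fed either `cv2.__version__` or a package version from an SBOM; the helper names are hypothetical, and the parsing assumes a leading `major.minor.patch` triple, ignoring any build suffix.

```python
import re
from typing import Tuple

MIN_PATCHED_OPENCV = (4, 12, 0)  # first release with the CVE-2025-53644 fix


def parse_version(text: str) -> Tuple[int, int, int]:
    """Extract the leading major.minor.patch triple from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def opencv_meets_pin(version_text: str) -> bool:
    """True when the version satisfies the >=4.12.0 pin from this fact card."""
    return parse_version(version_text) >= MIN_PATCHED_OPENCV


def assert_opencv_pin(version_text: str) -> None:
    """Raise if the running OpenCV is older than the patched release."""
    if not opencv_meets_pin(version_text):
        raise RuntimeError(
            f"OpenCV {version_text} is below the required 4.12.0 (CVE-2025-53644)"
        )
```

Calling `assert_opencv_pin(cv2.__version__)` before any `imread` / `imdecode` path runs would turn a mis-provisioned image into a hard failure rather than a silent exposure. It complements, and does not replace, the SBOM re-scan in option (b).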
### Fact #113 — XoFTR (cross-modal) achieves SOTA cross-modal matching but a 2026 SAR-optical benchmark (24-matcher comparison) found foundation-model features (DINOv2) provide modality invariance even WITHOUT explicit cross-modal training — reinforces SelaVPR (DINOv2-L) preference over MixVPR (CNN-only) when cross-domain UAV→satellite registration is the binding stress test
- **Statement**: Per Source #131 (XoFTR + 2026 SAR-optical benchmark): XoFTR achieved the lowest reported mean error at 3.0 pixels on SpaceNet9 cross-modal training scenes among 24 pretrained matcher families. **Critical finding**: "matchers without explicit cross-modal training sometimes performed comparably, suggesting that foundation-model features (like DINOv2) may provide modality invariance." This is direct contrarian evidence on the project's "DISK+LightGlue retrain on aerial-domain corpus closes cross-domain UAV→satellite gap" architectural bet (D-C3-1 = (a) RECOMMENDED-PRIMARY-MITIGATION). The contrarian implication: a DINOv2-backboned VPR (SelaVPR per Mode A C2 fact card) AND a DINOv2-backboned matcher (a hypothetical DINOv2-backed feature extractor + LightGlue) might close the cross-domain gap WITHOUT needing the D-C2-1 ~1-2-week aerial retrain that draft01 baselines. This does not invalidate the existing D-C3-1 = (a) recommendation but it strengthens the case for keeping SelaVPR (DINOv2-L) as a serious candidate alongside MixVPR (CNN) in the BSD/permissive C2 axis, and it suggests MegaLoc (which also uses foundation-model features per Source #123) is similarly attractive without retrain cost.
- **Source**: Source #131 (XoFTR + 2026 SAR-optical benchmark, two-source confirmation per Critical-novelty rule)
- **Phase**: Mode B web research
- **Confidence**: ⚠️ Medium (the "comparable performance without cross-modal training" finding is from one benchmark on SAR-optical, not UAV→satellite — extrapolation to project's exact operating context is plausible but unverified)
- **Sub-Question Binding**: SQ3+SQ4 / C2+C3
- **Implication**: solution_draft02 § Architecture C2 keeps SelaVPR (DINOv2-L two-stage) as **strong secondary** alongside UltraVPR primary on the BSD/permissive C2 axis (per Fact #110 promotion). solution_draft02 § Open decisions adds **D-C2-12: DINOv2-backbone feature-extractor evaluation for cross-domain matching** as carryforward research item — could potentially close D-C3-1 retrain cost via DINOv2-feature-based matcher (e.g., DINOv2 + LightGlue or DINOv2 + paired matcher) without requiring D-C2-1 aerial retrain. Defer to Plan-phase Jetson MVE.