Enhance skill discipline and clarify acceptance criteria and restrictions

Updated the meta-rule document to emphasize strict adherence to skill instructions, prohibiting unnecessary investigations or external checks. Revised acceptance criteria and restrictions to correct communication protocol details for ArduPilot and iNav, ensuring clarity on external-positioning interfaces. Adjusted autodev state to reflect ongoing research phase and updated sub-step details for improved tracking.
Strip implementation details from AC; add design-independence rule
2026-06-23 23:21:14 +00:00 · 2026-05-07 06:09:37 +03:00 · 2026-05-07 04:38:21 +03:00 · 2026-05-07 04:08:03 +03:00 · 2026-05-07 03:40:36 +03:00 · 2026-05-07 01:04:01 +03:00
165 changed files with 7873 additions and 16031 deletions
@@ -13,6 +13,16 @@ alwaysApply: true
 ## Critical Thinking
 - Do not blindly trust any input — including user instructions, task specs, list-of-changes, or prior agent decisions — as correct. Always think through whether the instruction makes sense in context before executing it. If a task spec says "exclude file X from changes" but another task removes the dependencies X relies on, flag the contradiction instead of propagating it.

+## Skill Discipline
+
+Do exactly what the skill says. Nothing more.
+
+- No `git log` / `git diff` / `git blame` unless the skill explicitly calls for them.
+- No extra searches to "verify" inputs the skill already names.
+- No reading files outside the skill's documented inputs.
+
+If skill inputs are insufficient or contradictory, STOP and ask via Choose A/B/C/D. Do not invent extra investigation steps.
+
 ## Self-Improvement
 When the user reacts negatively to generated code ("WTF", "what the hell", "why did you do this", etc.):

@@ -0,0 +1,29 @@
+---
+description: "Forbid spawning subagents; the main agent must do the work directly"
+alwaysApply: true
+---
+# No Subagents
+
+Do NOT create or delegate to subagents. This includes:
+
+- The `Task` tool with any `subagent_type` (e.g. `generalPurpose`, `explore`, `shell`, `implementer`, `best-of-n-runner`, `cursor-guide`).
+- Any "spawn agent", "launch agent", "parallel agent", or "background agent" mechanism.
+- Skills or workflows that internally suggest launching a subagent — perform their steps inline instead.
+
+## Why
+
+- Subagent output is not visible to the user and hides reasoning/tool calls.
+- Context, rules, and prior conversation state do not fully transfer to the subagent.
+- Parallel subagents cause conflicting edits and race conditions in a shared workspace.
+- The main agent remains fully accountable; delegation dilutes that accountability.
+
+## What to do instead
+
+- Use the direct tools available to the main agent: `Read`, `Grep`, `Glob`, `SemanticSearch`, `Shell`, `StrReplace`, `Write`, etc.
+- For broad exploration, run `Grep`/`Glob`/`SemanticSearch` yourself and read the files directly.
+- For multi-step work, use `TodoWrite` to track progress inline.
+- For isolated experiments the user explicitly asks for, use a git branch/worktree you manage directly — not a subagent runner.
+
+## Exception
+
+Only spawn a subagent if the user explicitly requests it in the current turn (e.g. "use a subagent to…", "launch an explore agent…"). Even then, confirm once before spawning.
@@ -0,0 +1,46 @@
+---
+description: "Explanation length and reasoning depth calibration"
+alwaysApply: true
+---
+# Response Calibration
+
+Default to concise. Expand only when the content demands it.
+
+## Length target
+
+- **Default**: a direct answer in ~3–10 lines. Short paragraphs or a tight bullet list.
+- **Expand when**: the question involves trade-offs across multiple options, a migration/architectural decision, a security/data-loss risk, or the user explicitly asks for depth ("explain in detail", "walk me through", "why").
+- **Shrink when**: the user asks for "shorter", "simpler", "TL;DR", "one line", or similar. Do not re-inflate in later turns unless they ask a new deeper question.
+
+## Completeness floor
+
+Short ≠ incomplete. Every response must still:
+
+- Answer the actual question asked (not a reframed version).
+- State the key constraint or reason *once*, not repeatedly.
+- Flag a real caveat if one exists (data loss, breaking change, wrong-OS, security). One sentence is enough.
+- Keep every step of an action sequence. If there are 5 steps, list all 5, without narration between them.
+
+If the honest answer truly needs more space (e.g. trade-off matrix, multi-option decision), write more — but lead with the recommendation or direct answer, then the detail.
+
+## Structure
+
+- One direct sentence first. Then supporting detail.
+- Prefer bullets over prose for enumerations, comparisons, or step lists.
+- Drop section headers for anything under ~15 lines.
+- No "Summary" / "Conclusion" sections unless the response is genuinely long.
+
+## Reasoning depth (internal)
+
+- Match thinking to the problem, not the length of the answer.
+  - Factual / "where is X used" / single-file edit → minimal thinking, go straight to tools.
+  - Trade-off / refactor / debugging 3+ hypotheses deep → full thinking budget.
+- Do not pad thinking to look thorough. Do not skip thinking on genuinely ambiguous problems to look fast.
+
+## Anti-patterns to avoid
+
+- Restating the question back to the user.
+- Multi-paragraph preambles before the answer.
+- Exhaustive "alternatives considered" sections when the user didn't ask for alternatives.
+- Recapping what was just done at the end of every tool-using turn ("Done. I have edited the file…") — a one-line confirmation is enough.
+- Speculative "you might also want to…" paragraphs. Offer follow-ups as a single short sentence, or not at all.
@@ -0,0 +1,38 @@
+---
+description: "Standards for creating and maintaining Cursor skills"
+globs: [".cursor/skills/**"]
+---
+
+# Skill Building
+
+## When To Create A Skill
+- Create a skill for repeatable, bounded workflows that benefit from a reusable process.
+- Do not create a skill for a one-off task, vague goal, or workflow that still needs product decisions.
+- Start small; evolve the skill when repeated use reveals clearer steps, constraints, or checks.
+
+## Skill Contract
+- `SKILL.md` must define a clear `name` and a proactive `description` that explains when the skill should be used.
+- State expected inputs, constraints, workflow steps, and final output shape.
+- Make trigger conditions explicit enough that the agent can recognize intent without an exact command.
+- Base instructions on observable project evidence; do not invite fabrication or unsupported assumptions.
+
+## Keep The Core Lean
+- Keep `SKILL.md` concise and under the repo's `.cursor/` size guidance.
+- Move detailed standards, examples, and background knowledge into `references/`.
+- Put reusable output shapes in `templates/` or other skill-local assets instead of embedding them in the main instructions.
+- Keep one primary responsibility per skill; use an orchestrator skill only when multiple existing skills must run in a defined order.
+
+## Deterministic Work
+- Use scripts for mechanical steps that are repeatable, parameterized, and safer outside the model's reasoning.
+- Scripts must expose explicit inputs, avoid hidden side effects, and fail loudly on errors.
+- Do not use scripts to bypass review, hide destructive behavior, or hardcode secrets.
+
+## Quality Proof
+- Include realistic examples, checklists, or eval-style scenarios that define what good output looks like.
+- Cover common failure cases such as missing sections, leftover placeholders, hallucinated facts, unsafe actions, or malformed output.
+- Review skill changes against those checks before treating the skill as ready.
+
+## Security Review
+- Treat third-party skills like untrusted code until reviewed.
+- Inspect scripts, dependencies, references, secret handling, network calls, and destructive commands before use.
+- Prefer local, project-scoped assets and dependencies; document any external dependency the skill requires.
@@ -3,7 +3,7 @@ name: autodev
 description: |
  Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
  Detects current project state from _docs/ folder, resumes from where it left off, and flows through
-  problem → research → plan → decompose → implement → deploy without manual skill invocation.
+  problem → research → plan → test specs → decompose → implement → tests → docs sync → deploy without manual skill invocation.
  Maximizes work per conversation by auto-transitioning between skills.
  Trigger phrases:
  - "autodev", "auto", "start", "continue"
@@ -52,7 +52,7 @@ Determine which flow to use (check in order — first match wins):

 After selecting the flow, apply its detection rules (first match wins) to determine the current step.

-**Note**: the meta-repo flow uses a different artifact layout — its source of truth is `_docs/_repo-config.yaml`, not `_docs/NN_*/` folders. Other detection rules assume the BUILD-SHIP artifact layout; they don't apply to meta-repos.
+**Note**: the meta-repo flow uses a different artifact layout — its source of truth is `_docs/_repo-config.yaml`, not `_docs/NN_*/` folders. After Step 2.5 it also produces `_docs/glossary.md` and a `## Architecture Vision` section in the cross-cutting architecture doc identified by `docs.cross_cutting`. Other detection rules assume the BUILD-SHIP artifact layout; they don't apply to meta-repos.

 ## Execution Loop

@@ -13,7 +13,7 @@ A first-time run executes Phase A then Phase B; every subsequent invocation re-e

 | Step | Name | Sub-Skill | Internal SubSteps |
 |------|------|-----------|-------------------|
-| 1 | Document | document/SKILL.md | Steps 1–8 |
+| 1 | Document | document/SKILL.md | Steps 0–7 incl. inline 2.5 (module-layout) and 4.5 (glossary + arch vision) |
 | 2 | Architecture Baseline Scan | code-review/SKILL.md (baseline mode) | Phase 1 + Phase 7 |
 | 3 | Test Spec | test-spec/SKILL.md | Phases 1–4 |
 | 4 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 0–7 (conditional) |
@@ -53,6 +53,8 @@ Action: An existing codebase without documentation was detected. Read and execut

 The document skill's Step 2.5 produces `_docs/02_document/module-layout.md`, which is required by every downstream step that assigns file ownership (`/implement` Step 4, `/code-review` Phase 7, `/refactor` discovery). If this file is missing after Step 1 completes (e.g., a pre-existing `_docs/` dir predates the 2.5 addition), re-invoke `/document` in resume mode — it will pick up at Step 2.5.

+The document skill's Step 4.5 produces `_docs/02_document/glossary.md` and prepends a confirmed `## Architecture Vision` section to `architecture.md`. Both are user-confirmed artifacts; downstream skills (refactor, decompose, new-task) treat them as authoritative for terminology and structural intent. If `glossary.md` is missing after Step 1 (pre-existing `_docs/` dir from before the 4.5 addition), re-invoke `/document` in resume mode — it will pick up at Step 4.5 without redoing module/component analysis.
+
 ---

 **Step 2 — Architecture Baseline Scan**
@@ -150,15 +152,17 @@ If `_docs/02_tasks/` subfolders have some task files already (e.g., refactoring
 ---

 **Step 6 — Implement Tests**
-Condition (folder fallback): `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
+Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
 State-driven: reached by auto-chain from Step 5.

-Action: Read and execute `.cursor/skills/implement/SKILL.md`
+Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.

-The implement skill reads test tasks from `_docs/02_tasks/todo/` and implements them.
+The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.

 If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.

+For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
+
 ---

 **Step 7 — Run Tests**
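
Reviewer sketch (not part of the commit): the folder-fallback definition of **test task files** added in this hunk could be checked roughly as below. This is a minimal sketch in Python, assuming task specs carry a line such as `**Component**: Blackbox Tests` or `**Epic:** Blackbox Tests`; the exact field syntax is not given in the rule text, and the helper name `is_test_task` is hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Assumption: the Component/Epic field sits on one line, written either as
# "**Component**: value" or "**Component:** value". The rule text does not
# specify the exact syntax.
_FIELD = re.compile(
    r"^\*\*(?:Component|Epic)(?::\*\*|\*\*:?)[ \t]*(.*)$",
    re.MULTILINE,
)


def is_test_task(filename: str, text: str) -> bool:
    """Return True when a task spec counts as a test task for folder fallback."""
    # Rule part 1: any *_test_infrastructure.md file is a test task.
    if fnmatchcase(filename, "*_test_infrastructure.md"):
        return True
    # Rule part 2: Component or Epic identifies "Blackbox Tests".
    return any("Blackbox Tests" in value for value in _FIELD.findall(text))
```

Under the same assumptions, implementation task files are simply those for which `is_test_task` returns `False`.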
@@ -1,6 +1,6 @@
 # Greenfield Workflow

-Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
+Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement + Product Completeness Gate → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.

 ## Step Reference Table

@@ -10,13 +10,19 @@ Workflow for new projects built from scratch. Flows linearly: Problem → Resear
 | 2 | Research | research/SKILL.md | Mode A: Phase 1–4 · Mode B: Step 0–8 |
 | 3 | Plan | plan/SKILL.md | Step 1–6 + Final |
 | 4 | UI Design | ui-design/SKILL.md | Phase 0–8 (conditional — UI projects only) |
-| 5 | Decompose | decompose/SKILL.md | Step 1–4 |
-| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
-| 7 | Run Tests | test-run/SKILL.md | Steps 1–4 |
-| 8 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
-| 9 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
-| 10 | Deploy | deploy/SKILL.md | Step 1–7 |
-| 11 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |
+| 5 | Test Spec | test-spec/SKILL.md | Phases 1–4 |
+| 6 | Decompose | decompose/SKILL.md (implementation task decomposition) | Step 1 + Step 1.5 + Step 2 + Step 4 |
+| 7 | Implement | implement/SKILL.md | Batch loop + Product Implementation Completeness Gate |
+| 8 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 0–7 (conditional) |
+| 9 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
+| 10 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 11 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 12 | Test-Spec Sync | test-spec/SKILL.md (cycle-update mode) | Phase 2 + Phase 3 (scoped) |
+| 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
+| 14 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
+| 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
+| 16 | Deploy | deploy/SKILL.md | Step 1–7 |
+| 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |

 ## Detection Rules

@@ -80,12 +86,12 @@ If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FIN
 ---

 **Step 4 — UI Design (conditional)**
-Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_tasks/todo/` does not exist or has no task files.
+Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
 State-driven: reached by auto-chain from Step 3.

 Action: Read and execute `.cursor/skills/ui-design/SKILL.md`. The skill runs its own **Applicability Check**, which handles UI project detection and the user's A/B choice. It returns one of:

-- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Decompose).
+- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Test Spec).
 - `outcome: skipped, reason: not-a-ui-project` → mark Step 4 as `skipped`, auto-chain to Step 5.
 - `outcome: skipped, reason: user-declined` → mark Step 4 as `skipped`, auto-chain to Step 5.

@@ -93,34 +99,162 @@ The autodev no longer inlines UI detection heuristics — they live in `ui-desig

 ---

-**Step 5 — Decompose**
-Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/todo/` does not exist or has no task files
+**Step 5 — Test Spec**
+Condition (folder fallback): `_docs/02_document/FINAL_report.md` exists AND `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
+State-driven: reached by auto-chain from Step 4 (completed or skipped).

-Action: Read and execute `.cursor/skills/decompose/SKILL.md`
+Action: Read and execute `.cursor/skills/test-spec/SKILL.md`.
+
+This step converts the greenfield problem statement, acceptance criteria, solution, architecture, component docs, and UI design artifacts (if any) into test specifications before implementation begins. The test spec should cover unit, integration, blackbox, and e2e scenarios where those levels are applicable to the project.
+
+---
+
+**Step 6 — Decompose**
+Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_document/tests/traceability-matrix.md` exists AND `_docs/02_tasks/todo/` does not exist or has no implementation task files.
+
+Action: Invoke `.cursor/skills/decompose/SKILL.md` for **implementation task decomposition**. The greenfield flow selects the implementation entrypoint before handing off: Bootstrap Structure, Module Layout, Component Task Decomposition, and Cross-Task Verification.
+
+Do not invoke Blackbox Test Task Decomposition from Step 6. Test tasks are intentionally deferred to Step 9 (Decompose Tests) so the first implementation batch stays focused on product functionality and Step 8 can revise testability before test task files exist.

 If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it.

 ---

-**Step 6 — Implement**
-Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any `implementation_report_*.md` file
+**Step 7 — Implement**
+Condition: `_docs/02_tasks/todo/` contains implementation task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain a valid product implementation report.

-Action: Read and execute `.cursor/skills/implement/SKILL.md`
+Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Product implementation**.
+
+The implement skill must run its **Product Implementation Completeness Gate** before it writes any final product implementation report. This gate compares completed product task specs, architecture/component promises, and actual source code so scaffold-only implementations cannot advance to Step 8. A final product implementation report without `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` is incomplete and must not be treated as Step 7 completion.

 If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.

+For folder fallback, **implementation task files** means task specs that are not test-only specs: exclude `*_test_infrastructure.md` and task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
+
+For folder fallback, a **product implementation report** is any `_docs/03_implementation/implementation_report_*.md` file except `_docs/03_implementation/implementation_report_tests.md` and refactor reports. It is valid for greenfield progression only when:
+- the matching `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists,
+- that completeness report does not contain unresolved `FAIL` classifications, and
+- `_docs/02_tasks/todo/` contains no pending implementation task files.
+
+If a product report exists but any of those validity checks fail, treat product implementation as incomplete and stay in Step 7.
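
Reviewer sketch (not part of the commit): the three validity checks above can be read as one predicate. The Python below is a minimal sketch under stated assumptions: refactor reports are recognised by `refactor` in the file name, an unresolved classification appears as the bare token `FAIL`, and the caller has already located the matching completeness report for cycle N. None of these details are fixed by the rule text, and `ImplementationFolder`, `product_reports`, and `is_step7_complete` are hypothetical names.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class ImplementationFolder:
    report_names: List[str]           # file names in _docs/03_implementation/
    completeness_text: Optional[str]  # matching completeness report, None if absent
    pending_impl_tasks: int           # implementation task files left in todo/


def product_reports(names: List[str]) -> List[str]:
    """Reports that count as product implementation reports."""
    return [
        n for n in names
        if fnmatchcase(n, "implementation_report_*.md")
        and n != "implementation_report_tests.md"
        and "refactor" not in n  # assumption: how refactor reports are named
    ]


def is_step7_complete(folder: ImplementationFolder) -> bool:
    """True only when a product report exists and all validity checks pass."""
    if not product_reports(folder.report_names):
        return False
    if folder.completeness_text is None:
        return False  # completeness report missing
    if re.search(r"\bFAIL\b", folder.completeness_text):
        return False  # assumption: any FAIL token is unresolved
    return folder.pending_impl_tasks == 0
```

When `is_step7_complete` is false but a product report exists, the flow stays in Step 7, as the paragraph above requires.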
+
 ---

-**Step 7 — Run Tests**
-Condition (folder fallback): `_docs/03_implementation/` contains an `implementation_report_*.md` file.
-State-driven: reached by auto-chain from Step 6.
+**Step 8 — Code Testability Revision**
+Condition (folder fallback): `_docs/03_implementation/` contains a valid product implementation report AND `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications AND `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` does not exist AND `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` does not exist AND `_docs/03_implementation/implementation_report_tests.md` does not exist AND `_docs/02_tasks/todo/` does not contain test task files.
+State-driven: reached by auto-chain from Step 7.
+
+**Purpose**: verify the newly built code can be exercised by the planned tests before writing the test suite. Greenfield code should be testable by design; this step catches accidental hardcoded paths, singletons, direct external service construction, or other implementation choices that would make meaningful tests impossible.
+
+**Scope — MINIMAL, SURGICAL fixes**: this is not a general refactor. It is the smallest set of changes required to make the implemented code runnable under tests.
+
+**Allowed changes** in this phase:
+- Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
+- Extract narrow interfaces for components that need stubbing in tests.
+- Add optional constructor parameters for dependency injection; default to the existing behavior so callers do not break.
+- Wrap global singletons in thin accessors that tests can override.
+- Split a function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
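
Reviewer sketch (not part of the commit): the "optional constructor parameter" and "env var instead of hardcoded URL" items above might look like this in practice. It is an illustrative Python example only; `StatusClient`, `STATUS_API_URL`, and the `/health` endpoint are hypothetical and do not come from the project.

```python
import os
import urllib.request
from typing import Callable, Optional


def _http_get(url: str) -> bytes:
    """Existing production behavior: a real network call."""
    with urllib.request.urlopen(url) as response:
        return response.read()


class StatusClient:
    def __init__(
        self,
        base_url: Optional[str] = None,
        http_get: Optional[Callable[[str], bytes]] = None,
    ) -> None:
        # The previously hardcoded URL becomes the fallback default, so
        # existing callers that pass no arguments keep working.
        self.base_url = base_url or os.environ.get(
            "STATUS_API_URL", "http://localhost:8080"
        )
        # Optional injection point: tests pass a stub, production uses _http_get.
        self._http_get = http_get or _http_get

    def is_healthy(self) -> bool:
        return self._http_get(f"{self.base_url}/health") == b"ok"
```

A test can then construct `StatusClient(base_url="http://stub", http_get=lambda url: b"ok")` without touching the network, while the public API and the business logic stay unchanged.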
+
+**NOT allowed** in this phase (defer to a later refactor task):
+- Renaming public APIs.
+- Moving code between files unless strictly required for isolation.
+- Changing algorithms or business logic.
+- Restructuring module boundaries or rewriting layers.
+
+Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
+
+1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`.
+2. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
+   - Hardcoded file paths or directory references
+   - Hardcoded configuration values (URLs, credentials, magic numbers)
+   - Global mutable state that cannot be overridden
+   - Tight coupling to external services without abstraction
+   - Missing dependency injection or non-configurable parameters
+   - Direct file system operations without path configurability
+   - Inline construction of heavy dependencies (models, clients)
+3. If ALL scenarios are testable as-is:
+   - Create `_docs/04_refactoring/01-testability-refactoring/`
+   - Write `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` with the scenarios reviewed and outcome "Code is testable — no changes needed"
+   - Mark Step 8 as `completed` with outcome "Code is testable — no changes needed"
+   - Auto-chain to Step 9 (Decompose Tests)
+4. If testability issues are found:
+   - Create `_docs/04_refactoring/01-testability-refactoring/`
+   - Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
+     - **Mode**: `guided`
+     - **Source**: `autodev-greenfield-testability-analysis`
+     - One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred refactor candidates" instead.
+   - Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
+   - Phase 3 (Safety Net) is skipped for this testability run because the test suite has not been implemented yet
+   - After execution, surface `RUN_DIR/testability_changes_summary.md` to the user via the Choose format (accept / request follow-up) before auto-chaining
+   - Copy or save the accepted summary as `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` so folder fallback can detect Step 8 completion
+   - Mark Step 8 as `completed`
+   - Auto-chain to Step 9 (Decompose Tests)
+
+---
+
+**Step 9 — Decompose Tests**
+Condition (folder fallback): `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND `_docs/03_implementation/` contains a valid product implementation report AND `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications AND (`_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` exists OR `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` exists) AND (`_docs/02_tasks/todo/` does not exist or has no test task files) AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
+State-driven: reached by auto-chain from Step 8.
+
+Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
+1. Run Step 1t (test infrastructure bootstrap)
+2. Run Step 3 (blackbox/e2e-capable test task decomposition)
+3. Run Step 4 (cross-verification against test coverage)
+
+If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it — it appends test tasks alongside existing completed implementation tasks.
+
+---
+
+**Step 10 — Implement Tests**
+Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
+State-driven: reached by auto-chain from Step 9.
+
+Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.
+
+The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.
+
+If `_docs/03_implementation/` has batch reports, the implement skill detects completed test tasks and continues.
+
+For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
+
+---
+
+**Step 11 — Run Tests**
+Condition (folder fallback): `_docs/03_implementation/implementation_report_tests.md` exists.
+State-driven: reached by auto-chain from Step 10.

 Action: Read and execute `.cursor/skills/test-run/SKILL.md`

+Verifies the implemented unit, integration, blackbox, and e2e tests pass before proceeding to spec and documentation sync. This is a hard product gate, not a harness-smoke gate: e2e/blackbox tests must exercise the actual implemented system through public runtime boundaries and compare actual outputs against `_docs/00_problem/input_data/expected_results/results_report.md` or referenced machine-readable expected-result files. Stubs are allowed only for external systems outside the product boundary; missing internal product implementation must fail or block the gate and send the flow back to Implement.
+
 ---

-**Step 8 — Security Audit (optional)**
-State-driven: reached by auto-chain from Step 7.
+**Step 12 — Test-Spec Sync**
+State-driven: reached by auto-chain from Step 11. Requires `_docs/02_document/tests/traceability-matrix.md` to exist — if missing, mark Step 12 `skipped` (see Action below).
+
+Action: Read and execute `.cursor/skills/test-spec/SKILL.md` in **cycle-update mode**. Pass the completed implementation task specs, completed test task specs, and implementation reports as inputs.
+
+The skill appends implementation-learned acceptance criteria, scenarios, and NFR updates to the existing test-spec files without rewriting unaffected sections. If `traceability-matrix.md` is missing, mark Step 12 as `skipped` — the next `/test-spec` full run will regenerate it.
+
+After completion, auto-chain to Step 13 (Update Docs).
+
+---
+
+**Step 13 — Update Docs**
+State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires `_docs/02_document/` to contain existing documentation — if missing, mark Step 13 `skipped` (see Action below).
+
+Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all completed implementation and test task spec files plus the implementation reports.
+
+The document skill in Task mode updates affected module docs, component docs, system-level docs, and test documentation without redoing full discovery, verification, or problem extraction.
+
+If `_docs/02_document/` does not contain existing docs, mark Step 13 as `skipped`.
+
+After completion, auto-chain to Step 14 (Security Audit).
+
+---
+
+**Step 14 — Security Audit (optional)**
+State-driven: reached by auto-chain from Step 13 (completed or skipped).

 Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
 - question:        `Run security audit before deploy?`
@@ -128,12 +262,12 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
 - option-b-label:  `Skip — proceed directly to deploy`
 - recommendation:  `A — catches vulnerabilities before production`
 - target-skill:    `.cursor/skills/security/SKILL.md`
-- next-step:       Step 9 (Performance Test)
+- next-step:       Step 15 (Performance Test)

 ---

-**Step 9 — Performance Test (optional)**
-State-driven: reached by auto-chain from Step 8.
+**Step 15 — Performance Test (optional)**
+State-driven: reached by auto-chain from Step 14 (completed or skipped).

 Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
 - question:        `Run performance/load tests before deploy?`
@@ -141,30 +275,30 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
 - option-b-label:  `Skip — proceed directly to deploy`
 - recommendation:  `A or B — base on whether acceptance criteria include latency, throughput, or load requirements`
 - target-skill:    `.cursor/skills/test-run/SKILL.md` in **perf mode** (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
-- next-step:       Step 10 (Deploy)
+- next-step:       Step 16 (Deploy)

 ---

-**Step 10 — Deploy**
-State-driven: reached by auto-chain from Step 9 (after Step 9 is completed or skipped).
+**Step 16 — Deploy**
+State-driven: reached by auto-chain from Step 15 (after Step 15 is completed or skipped).

 Action: Read and execute `.cursor/skills/deploy/SKILL.md`.

-After the deploy skill completes successfully, mark Step 10 as `completed` and auto-chain to Step 11 (Retrospective).
+After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 17 (Retrospective).

 ---

-**Step 11 — Retrospective**
-State-driven: reached by auto-chain from Step 10.
+**Step 17 — Retrospective**
+State-driven: reached by auto-chain from Step 16.

 Action: Read and execute `.cursor/skills/retrospective/SKILL.md` in **cycle-end mode**. This closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` and appending the top-3 lessons to `_docs/LESSONS.md`.

-After retrospective completes, mark Step 11 as `completed` and enter "Done" evaluation.
+After retrospective completes, mark Step 17 as `completed` and enter "Done" evaluation.

 ---

 **Done**
-State-driven: reached by auto-chain from Step 11. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
+State-driven: reached by auto-chain from Step 17. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)

 Action: Report project completion with summary. Then **rewrite the state file** so the next `/autodev` invocation enters the feature-cycle loop in the existing-code flow:

@@ -191,47 +325,65 @@ On the next invocation, Flow Resolution rule 1 reads `flow: existing-code` and r
 | Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
 | Research Decision → proceed | Auto-chain → Plan (3) |
 | Plan (3) | Auto-chain → UI Design detection (4) |
-| UI Design (4, done or skipped) | Auto-chain → Decompose (5) |
-| Decompose (5) | **Session boundary** — suggest new conversation before Implement |
-| Implement (6) | Auto-chain → Run Tests (7) |
-| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) |
-| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) |
-| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) |
-| Deploy (10) | Auto-chain → Retrospective (11) |
-| Retrospective (11) | Report completion; rewrite state to existing-code flow, step 9 |
+| UI Design (4, done or skipped) | Auto-chain → Test Spec (5) |
+| Test Spec (5) | Auto-chain → Decompose (6) |
+| Decompose (6) | **Session boundary** — suggest new conversation before Implement |
+| Implement (7) | Auto-chain only after Product Implementation Completeness Gate passes → Code Testability Revision (8) |
+| Code Testability Revision (8) | Auto-chain → Decompose Tests (9) |
+| Decompose Tests (9) | **Session boundary** — suggest new conversation before Implement Tests |
+| Implement Tests (10) | Auto-chain → Run Tests (11) |
+| Run Tests (11, all pass) | Auto-chain → Test-Spec Sync (12) |
+| Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
+| Update Docs (13, done or skipped) | Auto-chain → Security Audit choice (14) |
+| Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
+| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
+| Deploy (16) | Auto-chain → Retrospective (17) |
+| Retrospective (17) | Report completion; rewrite state to existing-code flow, step 9 |
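
The auto-chain table above can be read as a lookup from a completed step to what happens next. A minimal Python sketch, assuming hypothetical names (`NEXT_STEP`, `SESSION_BOUNDARIES`, `next_action`) that are not part of autodev itself; it covers only the rows shown (steps 2 to 17) and does not model the user gates (Research Decision, the Product Implementation Completeness Gate, "all pass"):

```python
# Hypothetical encoding of the greenfield auto-chain rows listed above.
# Gates that require a user decision or a passing check are NOT modeled.
NEXT_STEP = {
    2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10,
    10: 11, 11: 12, 12: 13, 13: 14, 14: 15, 15: 16, 16: 17,
}

# Steps after which the table suggests starting a new conversation.
SESSION_BOUNDARIES = {6, 9}


def next_action(completed_step: int) -> str:
    """Describe what follows a completed greenfield step."""
    if completed_step == 17:
        return "report completion; rewrite state to existing-code flow"
    nxt = NEXT_STEP[completed_step]
    if completed_step in SESSION_BOUNDARIES:
        return f"session boundary before step {nxt}"
    return f"auto-chain to step {nxt}"
```

For example, `next_action(6)` reports a session boundary before step 7 (Implement), matching the Decompose row.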

 ## Status Summary — Step List

 Flow name: `greenfield`. Render using the banner template in `protocols.md` → "Banner Template (authoritative)". No header-suffix, current-suffix, or footer-extras — all empty for this flow.

-| # | Step Name          | Extra state tokens (beyond the shared set) |
-|---|--------------------|--------------------------------------------|
-| 1 | Problem            | — |
-| 2 | Research           | `DONE (N drafts)` |
-| 3 | Plan               | — |
-| 4 | UI Design          | — |
-| 5 | Decompose          | `DONE (N tasks)` |
-| 6 | Implement          | `IN PROGRESS (batch M of ~N)` |
-| 7 | Run Tests          | `DONE (N passed, M failed)` |
-| 8 | Security Audit     | — |
-| 9 | Performance Test   | — |
-| 10 | Deploy            | — |
-| 11 | Retrospective     | — |
+| # | Step Name                   | Extra state tokens (beyond the shared set) |
+|---|-----------------------------|--------------------------------------------|
+| 1 | Problem                     | — |
+| 2 | Research                    | `DONE (N drafts)` |
+| 3 | Plan                        | — |
+| 4 | UI Design                   | — |
+| 5 | Test Spec                   | — |
+| 6 | Decompose                   | `DONE (N tasks)` |
+| 7 | Implement                   | `IN PROGRESS (batch M of ~N)` |
+| 8 | Code Testability Revision   | — |
+| 9 | Decompose Tests             | `DONE (N tasks)` |
+| 10 | Implement Tests            | `IN PROGRESS (batch M)` |
+| 11 | Run Tests                  | `DONE (N passed, M failed)` |
+| 12 | Test-Spec Sync             | — |
+| 13 | Update Docs                | — |
+| 14 | Security Audit             | — |
+| 15 | Performance Test           | — |
+| 16 | Deploy                     | — |
+| 17 | Retrospective              | — |

-All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 8, 9 additionally accept `SKIPPED`.
+All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 12, 13, 14, and 15 additionally accept `SKIPPED`.

 Row rendering format (step-number column is right-padded to 2 characters for alignment):

 ```
- Step 1   Problem             [<state token>]
- Step 2   Research            [<state token>]
- Step 3   Plan                [<state token>]
- Step 4   UI Design           [<state token>]
- Step 5   Decompose           [<state token>]
- Step 6   Implement           [<state token>]
- Step 7   Run Tests           [<state token>]
- Step 8   Security Audit      [<state token>]
- Step 9   Performance Test    [<state token>]
- Step 10  Deploy              [<state token>]
- Step 11  Retrospective       [<state token>]
+ Step 1   Problem                   [<state token>]
+ Step 2   Research                  [<state token>]
+ Step 3   Plan                      [<state token>]
+ Step 4   UI Design                 [<state token>]
+ Step 5   Test Spec                 [<state token>]
+ Step 6   Decompose                 [<state token>]
+ Step 7   Implement                 [<state token>]
+ Step 8   Code Testability Rev.     [<state token>]
+ Step 9   Decompose Tests           [<state token>]
+ Step 10  Implement Tests           [<state token>]
+ Step 11  Run Tests                 [<state token>]
+ Step 12  Test-Spec Sync            [<state token>]
+ Step 13  Update Docs               [<state token>]
+ Step 14  Security Audit            [<state token>]
+ Step 15  Performance Test          [<state token>]
+ Step 16  Deploy                    [<state token>]
+ Step 17  Retrospective             [<state token>]
 ```
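
The row format above can be produced with fixed-width string formatting. A minimal sketch, assuming a hypothetical `render_row` helper; the column widths (4 for the step number, 26 for the step name) are inferred from the sample rows, not taken from `protocols.md`:

```python
def render_row(step: int, name: str, token: str) -> str:
    """Render one status row; widths 4 and 26 are assumptions."""
    # Left-align the step number and name so the bracketed token lines up.
    return f" Step {step:<4}{name:<26}[{token}]"


if __name__ == "__main__":
    for number, step_name in [(1, "Problem"), (10, "Implement Tests")]:
        print(render_row(number, step_name, "NOT STARTED"))
```

Left-aligned padding keeps one- and two-digit step numbers in the same column, which is the alignment the rendering note describes.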
@@ -15,8 +15,10 @@ This flow differs fundamentally from `greenfield` and `existing-code`:
 |------|------|-----------|-------------------|
 | 1 | Discover | monorepo-discover/SKILL.md | Phase 1–10 |
 | 2 | Config Review | (human checkpoint, no sub-skill) | — |
+| 2.5 | Glossary & Architecture Vision | (inline, no sub-skill) | Steps 1–5 |
 | 3 | Status | monorepo-status/SKILL.md | Sections 1–5 |
 | 4 | Document Sync | monorepo-document/SKILL.md | Phase 1–7 (conditional on doc drift) |
+| 4.5 | Integration Test Sync | monorepo-e2e/SKILL.md | Phase 1–6 (conditional on suite-e2e drift; skipped if `suite_e2e:` block absent in config) |
 | 5 | CICD Sync | monorepo-cicd/SKILL.md | Phase 1–7 (conditional on CI drift) |
 | 6 | Loop | (auto-return to Step 3 on next invocation) | — |

@@ -58,17 +60,121 @@ Action: This is a **hard session boundary**. The skill cannot proceed until a hu
 ══════════════════════════════════════
 ```

-- If user picks A → verify `confirmed_by_user: true` is now set in the config. If still `false`, re-ask. If true, auto-chain to **Step 3 (Status)**.
+- If user picks A → verify `confirmed_by_user: true` is now set in the config. If still `false`, re-ask. If true, auto-chain to **Step 2.5 (Glossary & Architecture Vision)**.
 - If user picks B → mark Step 2 as `in_progress`, update state file, end the session. Tell the user to invoke `/autodev` again after reviewing.

 **Do NOT auto-flip `confirmed_by_user`.** Only the human does that.

 ---

+**Step 2.5 — Glossary & Architecture Vision** (one-shot)
+
+Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true` AND (`_docs/glossary.md` does NOT exist OR the cross-cutting architecture doc identified in `docs.cross_cutting` does NOT contain a `## Architecture Vision` section).
+State-driven: reached by auto-chain from Step 2 (user picked A).
+
+**Goal**: Capture meta-repo-wide terminology and the user's architecture vision **once**, after the config is confirmed but before any sync skill runs. Without this, `monorepo-document` will faithfully propagate per-component changes but never surface a unified mental model of the meta-repo to the user, and the AI will keep re-inferring the same project terminology on every invocation.
+
+**Why inline (no sub-skill)**: `monorepo-discover` is hard-guarded to write only `_repo-config.yaml`; `monorepo-document` only edits *existing* docs. Glossary and architecture-vision creation is a first-time, user-confirmed write that crosses both guarantees, so it lives directly in the flow.
+
+**Inputs**:
+- `_docs/_repo-config.yaml` (component list, doc map, conventions, assumptions log)
+- Cross-cutting docs listed under `docs.cross_cutting` (existing architecture doc, if any)
+- Each component's `primary_doc` (read-only, for terminology + responsibility extraction)
+- Root `README.md` if `repo.root_readme` is referenced
+
+**Outputs**:
+- `_docs/glossary.md` (or `<docs.root>/glossary.md` if `docs.root` ≠ `_docs/`) — NEW
+- The cross-cutting architecture doc updated in place: a `## Architecture Vision` section is prepended (or merged into an existing "Vision" / "Overview" heading)
+- One new entry appended to `_docs/_repo-config.yaml` under `assumptions_log:` recording the run
+- A new top-level config entry: `glossary_doc: <path>` so future `monorepo-status` and `monorepo-document` runs treat the glossary as a known cross-cutting doc
+
+**Procedure**:
+
+1. **Draft glossary** from `_repo-config.yaml` + each component's primary doc. Include:
+   - Component codenames as they appear in the config (`name` field) and any rename pairs the user noted in `unresolved:` resolutions
+   - Domain terms that recur across ≥2 component docs
+   - Acronyms / abbreviations
+   - Convention names from `conventions:` (e.g., commit prefix, deployment tier names)
+   - Stakeholder personas if cross-cutting docs reference them
+   Each entry: one-line definition + source (`source: components.<name>.primary_doc` or `source: _repo-config.yaml conventions`). Skip generic terms.
+
+2. **Draft architecture vision** from the meta-repo perspective:
+   - **One paragraph**: what the system as a whole is, what each component contributes, the runtime topology (one binary / N services / N clients + 1 server / hybrid), how components communicate (REST / gRPC / queue / DB-shared / file-shared)
+   - **Components & responsibilities** (one-line each), pulled directly from `_repo-config.yaml` `components:` list
+   - **Cross-cutting concerns ownership**: which doc owns which concern (auth, schema, deployment, etc.) — pulled from `docs.cross_cutting[].owns`
+   - **Architectural principles / non-negotiables** the user has implied across components (e.g., "all components share a single Postgres", "submodules own their own CI", "deployment is per-tier, not per-component")
+   - **Open questions / structural drift signals**: components missing from `docs.cross_cutting`, components in registry but not in config (registry mismatch), or contradictions between component primary docs
+
+3. **Present condensed view** to the user (NOT the full draft files):
+
+   ```
+   ══════════════════════════════════════
+    REVIEW: Meta-Repo Glossary + Architecture Vision
+   ══════════════════════════════════════
+    Glossary (N terms drafted from config + component docs):
+      - <Term>: <one-line definition>
+      - ...
+
+    Architecture Vision — meta-repo level:
+      <one-paragraph synopsis>
+
+      Components / responsibilities:
+        - <component>: <one-line>
+        - ...
+
+      Cross-cutting ownership:
+        - <concern> → <doc>
+        - ...
+
+      Principles / non-negotiables:
+        - <principle>
+        - ...
+
+      Open questions / drift signals:
+        - <q1>
+        - <q2>
+   ══════════════════════════════════════
+    A) Looks correct — write the files
+    B) Add / correct entries (provide diffs)
+    C) Resolve open questions / drift signals first
+   ══════════════════════════════════════
+    Recommendation: pick C if drift signals exist;
+                    otherwise B if components or principles
+                    don't match your intent; A only when
+                    the inferred vision is exactly right.
+   ══════════════════════════════════════
+   ```
+
+4. **Iterate**:
+   - On B → integrate the user's diffs/additions, re-present, loop until A.
+   - On C → ask the listed open questions in one batch, integrate answers, re-present.
+   - **Do NOT proceed to step 5 until the user picks A.**
+
+5. **Save**:
+   - Write `_docs/glossary.md` (alphabetical) with `**Status**: confirmed-by-user` + date.
+   - Update the cross-cutting architecture doc identified in `docs.cross_cutting` (or create one at `_docs/00_architecture.md` if none exists and the user's option-B input named one): prepend `## Architecture Vision` with the confirmed paragraph + components + ownership + principles. Preserve every existing H2 below verbatim.
+   - Append to `_docs/_repo-config.yaml`:
+     - Top-level `glossary_doc: <path-relative-to-repo-root>` (sibling of `docs`)
+     - New `assumptions_log:` entry: `{ date: <today>, skill: autodev-meta-repo Step 2.5, run_notes: "Captured glossary + architecture vision", assumptions: [...] }`
+   - Do NOT flip any `confirmed: false` → `confirmed: true` in the config; this step writes its own confirmed artifact, it does not retroactively confirm config inferences.
+
+**Self-verification**:
+- [ ] Every glossary entry traces to either the config or a component primary doc
+- [ ] Every component listed in the vision matches a `components:` entry in the config
+- [ ] All open questions are answered or explicitly deferred (with the user's acknowledgement)
+- [ ] The cross-cutting architecture doc still contains every H2 it had before this step
+- [ ] User picked option A on the latest condensed view
+
+**Idempotency**: if both `_docs/glossary.md` exists AND the architecture doc already has a `## Architecture Vision` section, this step is **skipped on re-invocation**. To refresh, the user invokes `/autodev` after deleting `glossary.md` (or running `monorepo-discover` with structural changes that justify a re-confirmation).
+
+After completion, auto-chain to **Step 3 (Status)**.
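
The entry condition and the idempotency rule for Step 2.5 reduce to one boolean check. A minimal sketch, assuming a hypothetical `needs_glossary_step` helper whose caller has already read the config flags and the architecture doc text; none of these names come from the flow itself:

```python
import re
from typing import Optional


def needs_glossary_step(
    config_exists: bool,
    confirmed_by_user: bool,
    glossary_exists: bool,
    architecture_doc_text: Optional[str],
) -> bool:
    """Return True when Step 2.5 should run, False when it is skipped."""
    if not (config_exists and confirmed_by_user):
        return False
    # The vision counts as captured only if the exact H2 heading is present.
    has_vision = bool(architecture_doc_text) and (
        re.search(r"^## Architecture Vision\s*$", architecture_doc_text, re.MULTILINE)
        is not None
    )
    # Run when either artifact is missing; skip only when both exist.
    return (not glossary_exists) or (not has_vision)
```

With this reading, deleting `glossary.md` is enough to make the step run again, which is the refresh path described under Idempotency.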
+
+---
+
 **Step 3 — Status**

-Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true`.
-State-driven: reached by auto-chain from Step 2 (user picked A), or entered on any re-invocation after a completed cycle.
+Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true` AND (`_docs/glossary.md` exists OR `glossary_doc:` is recorded in the config).
+State-driven: reached by auto-chain from Step 2.5, or entered on any re-invocation after a completed cycle.

 Action: Read and execute `.cursor/skills/monorepo-status/SKILL.md`.

@@ -115,6 +221,28 @@ The skill:
 3. Applies doc edits
 4. Skips any component with unconfirmed mapping (M5), reports

+After completion:
+- If the status report ALSO flagged suite-e2e drift → auto-chain to **Step 4.5 (Integration Test Sync)**
+- Else if the status report ALSO flagged CI drift → auto-chain to **Step 5 (CICD Sync)**
+- Else → end cycle, report done
+
+---
+
+**Step 4.5 — Integration Test Sync**
+
+State-driven: reached by auto-chain from Step 3 (when status report flagged suite-e2e drift and no doc drift) or from Step 4 (when both doc and suite-e2e drift were flagged).
+
+**Skip condition**: if `_docs/_repo-config.yaml` has no `suite_e2e:` block, this step is skipped entirely — there's no harness to sync. The status report should not flag suite-e2e drift in that case; if it does, that's a status-skill bug.
+
+Action: Read and execute `.cursor/skills/monorepo-e2e/SKILL.md` with scope = components flagged by status.
+
+The skill:
+1. Verifies every path under `suite_e2e.*` exists (binary fixtures excepted — see the skill's Phase 1)
+2. Classifies each flagged change against the suite-e2e impact table
+3. Applies edits to `e2e/docker-compose.suite-e2e.yml`, `e2e/fixtures/init.sql`, `e2e/fixtures/expected_detections.json` metadata, and `e2e/runner/tests/*.spec.ts` selectors as needed
+4. Bumps baseline `fixture_version` with a `-stale` suffix and appends a `_docs/_process_leftovers/` entry whenever the detection model revision changes (binary fixture cannot be regenerated automatically)
+5. Reports synced files; does not run the suite e2e itself
+
 After completion:
 - If the status report ALSO flagged CI drift → auto-chain to **Step 5 (CICD Sync)**
 - Else → end cycle, report done
@@ -123,11 +251,11 @@ After completion:

 **Step 5 — CICD Sync**

-State-driven: reached by auto-chain from Step 3 (when status report flagged CI drift and no doc drift) or from Step 4 (when both doc and CI drift were flagged).
+State-driven: reached by auto-chain from Step 3 (when status report flagged CI drift and no doc/suite-e2e drift), Step 4, or Step 4.5.

 Action: Read and execute `.cursor/skills/monorepo-cicd/SKILL.md` with scope = components flagged by status.

-After completion, end cycle. Report files updated across both doc and CI sync.
+After completion, end cycle. Report files updated across doc, suite-e2e, and CI sync.

 ---

@@ -156,14 +284,19 @@ After onboarding completes, the config is updated. Auto-chain back to **Step 3 (
 | Completed Step | Next Action |
 |---------------|-------------|
 | Discover (1) | Auto-chain → Config Review (2) |
-| Config Review (2, user picked A, confirmed_by_user: true) | Auto-chain → Status (3) |
+| Config Review (2, user picked A, confirmed_by_user: true) | Auto-chain → Glossary & Architecture Vision (2.5) |
 | Config Review (2, user picked B) | **Session boundary** — end session, await re-invocation |
+| Glossary & Architecture Vision (2.5) | Auto-chain → Status (3) |
 | Status (3, doc drift) | Auto-chain → Document Sync (4) |
+| Status (3, suite-e2e drift only) | Auto-chain → Integration Test Sync (4.5) |
 | Status (3, CI drift only) | Auto-chain → CICD Sync (5) |
 | Status (3, no drift) | **Cycle complete** — end session, await re-invocation |
 | Status (3, registry mismatch) | Ask user (A: discover, B: onboard, C: continue) |
-| Document Sync (4) + CI drift pending | Auto-chain → CICD Sync (5) |
-| Document Sync (4) + no CI drift | **Cycle complete** |
+| Document Sync (4) + suite-e2e drift pending | Auto-chain → Integration Test Sync (4.5) |
+| Document Sync (4) + CI drift only pending | Auto-chain → CICD Sync (5) |
+| Document Sync (4) + no further drift | **Cycle complete** |
+| Integration Test Sync (4.5) + CI drift pending | Auto-chain → CICD Sync (5) |
+| Integration Test Sync (4.5) + no CI drift | **Cycle complete** |
 | CICD Sync (5) | **Cycle complete** |
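
The drift rows in the table above follow one rule: after Status or any sync step, run the next sync step (in the order 4, 4.5, 5) whose drift was flagged. A minimal sketch, assuming a hypothetical `next_meta_repo_step` helper; the registry-mismatch prompt and the "no `suite_e2e:` block" skip are not modeled:

```python
from typing import Optional


def next_meta_repo_step(
    completed: float, doc_drift: bool, e2e_drift: bool, ci_drift: bool
) -> Optional[float]:
    """Return the next sync step after Status (3), Document Sync (4),
    Integration Test Sync (4.5), or CICD Sync (5).

    None means the cycle is complete.
    """
    candidates = [(4, doc_drift), (4.5, e2e_drift), (5, ci_drift)]
    for step, drifted in candidates:
        # Only later steps are eligible; earlier ones have already run or were skipped.
        if step > completed and drifted:
            return step
    return None
```

For instance, `next_meta_repo_step(4, True, False, True)` returns `5`, matching the "Document Sync (4) + CI drift only pending" row.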

 ## Status Summary — Step List
@@ -178,29 +311,33 @@ Flow-specific slot values:
   Config:  _docs/_repo-config.yaml [confirmed_by_user: <true|false>, last_updated: <date>]
  ```

-| # | Step Name        | Extra state tokens (beyond the shared set) |
-|---|------------------|--------------------------------------------|
-| 1 | Discover         | — |
-| 2 | Config Review    | `IN PROGRESS (awaiting human)` |
-| 3 | Status           | `DONE (no drift)`, `DONE (N drifts)` |
-| 4 | Document Sync    | `DONE (N docs)`, `SKIPPED (no doc drift)` |
-| 5 | CICD Sync        | `DONE (N files)`, `SKIPPED (no CI drift)` |
+| # | Step Name                          | Extra state tokens (beyond the shared set) |
+|---|------------------------------------|--------------------------------------------|
+| 1 | Discover                           | — |
+| 2 | Config Review                      | `IN PROGRESS (awaiting human)` |
+| 2.5 | Glossary & Architecture Vision   | `SKIPPED (already captured)` |
+| 3 | Status                             | `DONE (no drift)`, `DONE (N drifts)` |
+| 4 | Document Sync                      | `DONE (N docs)`, `SKIPPED (no doc drift)` |
+| 4.5 | Integration Test Sync            | `DONE (N files)`, `SKIPPED (no suite-e2e drift)`, `SKIPPED (no suite_e2e config block)` |
+| 5 | CICD Sync                          | `DONE (N files)`, `SKIPPED (no CI drift)` |

-All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4 and 5 additionally accept `SKIPPED`.
+All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2.5, 4, 4.5, and 5 additionally accept `SKIPPED`.

 Row rendering format:

 ```
- Step 1   Discover          [<state token>]
- Step 2   Config Review     [<state token>]
- Step 3   Status            [<state token>]
- Step 4   Document Sync     [<state token>]
- Step 5   CICD Sync         [<state token>]
+ Step 1     Discover                          [<state token>]
+ Step 2     Config Review                     [<state token>]
+ Step 2.5   Glossary & Architecture Vision    [<state token>]
+ Step 3     Status                            [<state token>]
+ Step 4     Document Sync                     [<state token>]
+ Step 4.5   Integration Test Sync             [<state token>]
+ Step 5     CICD Sync                         [<state token>]
 ```

 ## Notes for the meta-repo flow

-- **No session boundary except Step 2**: unlike existing-code flow (which has boundaries around decompose), meta-repo flow only pauses at config review. Syncing is fast enough to complete in one session.
+- **No session boundary except Step 2 and Step 2.5**: unlike the existing-code flow (which has boundaries around decompose), the meta-repo flow pauses only at config review and at the one-shot glossary/vision capture. Once both are confirmed, syncing is fast enough to complete in one session, and Step 2.5 is skipped on every subsequent invocation.
 - **Cyclical, not terminal**: no "done forever" state. Each invocation completes a drift cycle; next invocation starts fresh.
 - **No tracker integration**: this flow does NOT create Jira/ADO tickets. Maintenance is not a feature — if a feature-level ticket spans the meta-repo's concerns, it lives in the per-component workspace.
 - **Onboarding is opt-in**: never auto-onboarded. User must explicitly request.
@@ -110,7 +110,8 @@ Before entering a step from this table for the first time in a session, verify t
 | Flow | Step | Sub-Step | Tracker Action |
 |------|------|----------|----------------|
 | greenfield | Plan | Step 6 — Epics | Create epics for each component |
-| greenfield | Decompose | Step 1 + Step 2 + Step 3 — All tasks | Create ticket per task, link to epic |
+| greenfield | Decompose | Implementation decomposition Step 1 + Step 2 — Product tasks | Create ticket per product task, link to epic |
+| greenfield | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | New Task | Step 7 — Ticket | Create ticket per task, link to epic |

@@ -138,7 +139,7 @@ One retry ladder covers all failure modes: explicit failure returned by a sub-sk

 Treat the sub-skill as **failed** when ANY of the following is observed:

-- The sub-skill explicitly returns a failed result (including blocked subagents, auto-fix loop exhaustion, prerequisite violations).
+- The sub-skill explicitly returns a failed result (including blocked tasks, auto-fix loop exhaustion, prerequisite violations).
 - **Stuck signals**: the same artifact is rewritten 3+ times without meaningful change; the sub-skill re-asks a question that was already answered; no new artifact has been saved despite active execution.

 ### Retry ladder
@@ -291,7 +292,7 @@ For steps that produce `_docs/` artifacts (problem, research, plan, decompose, d

 ## Debug Protocol

-When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or an implementer subagent reports a blocker, the user is asked to intervene. This protocol guides the debugging process. (Retry budget and escalation are covered by Failure Handling above; this section is about *how* to diagnose once the user has been looped in.)
+When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or a task reports a blocker, the user is asked to intervene. This protocol guides the debugging process. (Retry budget and escalation are covered by Failure Handling above; this section is about *how* to diagnose once the user has been looped in.)

 ### Structured Debugging Workflow

@@ -13,7 +13,7 @@ The autodev persists its position to `_docs/_autodev_state.md`. This is a lightw

 ## Current Step
 flow: [greenfield | existing-code | meta-repo]
-step: [1-11 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
+step: [1-17 for greenfield and existing-code; 1-6 for meta-repo, including 2.5 and 4.5; or "done"]
 name: [step name from the active flow's Step Reference Table]
 status: [not_started / in_progress / completed / skipped / failed]
 sub_step:
@@ -209,7 +209,7 @@ Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope, Architectur

 The `/implement` skill invokes this skill after each batch completes:

-1. Collects changed files from all implementer agents in the batch
+1. Collects changed files from all tasks implemented in the batch
 2. Passes task spec paths + changed files to this skill
 3. If verdict is FAIL — presents findings to user (BLOCKING), user fixes or confirms
 4. If verdict is PASS or PASS_WITH_WARNINGS — proceeds automatically (findings shown as info)
@@ -221,7 +221,7 @@ The `/implement` skill invokes this skill after each batch completes:
 | Input | Type | Source | Required |
 |-------|------|--------|----------|
 | `task_specs` | list of file paths | Task `.md` files from `_docs/02_tasks/todo/` for the current batch | Yes |
-| `changed_files` | list of file paths | Files modified by implementer agents (from `git diff` or agent reports) | Yes |
+| `changed_files` | list of file paths | Files modified by the tasks in the batch (from `git diff`) | Yes |
 | `batch_number` | integer | Current batch number (for report naming) | Yes |
 | `project_restrictions` | file path | `_docs/00_problem/restrictions.md` | If exists |
 | `solution_overview` | file path | `_docs/01_solution/solution.md` | If exists |
@@ -2,8 +2,8 @@
 name: decompose
 description: |
  Decompose planned components into atomic implementable tasks with bootstrap structure plan.
-  4-step workflow: bootstrap structure plan, component task decomposition, blackbox test task decomposition, and cross-task verification.
-  Supports full decomposition (_docs/ structure), single component mode, and tests-only mode.
+  Workflow entrypoints: implementation task decomposition, single component decomposition, and tests-only decomposition.
+  The invoking flow decides which entrypoint to run; this skill executes that selected sequence.
  Trigger phrases:
  - "decompose", "decompose features", "feature decomposition"
  - "task decomposition", "break down components"
@@ -20,7 +20,7 @@ Decompose planned components into atomic, implementable task specs with a bootst

 ## Core Principles

-- **Atomic tasks**: each task does one thing; if it exceeds 8 complexity points, split it
+- **Atomic tasks**: each task does one thing; if it exceeds 5 complexity points, split it
 - **Behavioral specs, not implementation plans**: describe what the system should do, not how to build it
 - **Flat structure**: all tasks are tracker-ID-prefixed files in TASKS_DIR — no component subdirectories
 - **Save immediately**: write artifacts to disk after each task; never accumulate unsaved work
@@ -30,14 +30,15 @@ Decompose planned components into atomic, implementable task specs with a bootst

 ## Context Resolution

-Determine the operating mode based on invocation before any other logic runs.
+Resolve the selected entrypoint from the invocation context before any other logic runs. The caller decides whether this is implementation, single component, or tests-only decomposition; this skill only executes the selected sequence.

-**Default** (no explicit input file provided):
+**Implementation task decomposition** (default; selected by flows before invoking this skill):

 - DOCUMENT_DIR: `_docs/02_document/`
 - TASKS_DIR: `_docs/02_tasks/`
 - TASKS_TODO: `_docs/02_tasks/todo/`
 - Reads from: `_docs/00_problem/`, `_docs/01_solution/`, DOCUMENT_DIR
+- Produces only implementation tasks. Blackbox/e2e test task files are produced only when the invoking flow selects tests-only decomposition.

 **Single component mode** (provided file is within `_docs/02_document/` and inside a `components/` subdirectory):

@@ -55,24 +56,24 @@ Determine the operating mode based on invocation before any other logic runs.
 - TESTS_DIR: `DOCUMENT_DIR/tests/`
 - Reads from: `_docs/00_problem/`, `_docs/01_solution/`, TESTS_DIR

-Announce the detected mode and resolved paths to the user before proceeding.
+Announce the selected entrypoint and resolved paths to the user before proceeding.

 ### Step Applicability by Mode

-| Step | File | Default | Single | Tests-only |
-|------|------|:-------:|:------:|:----------:|
+| Step | File | Implementation | Single | Tests-only |
+|------|------|:--------------:|:------:|:----------:|
 | 1 Bootstrap Structure | `steps/01_bootstrap-structure.md` | ✓ | — | — |
 | 1t Test Infrastructure | `steps/01t_test-infrastructure.md` | — | — | ✓ |
 | 1.5 Module Layout | `steps/01-5_module-layout.md` | ✓ | — | — |
 | 2 Task Decomposition | `steps/02_task-decomposition.md` | ✓ | ✓ | — |
-| 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | ✓ | — | ✓ |
+| 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | — | — | ✓ |
 | 4 Cross-Verification | `steps/04_cross-verification.md` | ✓ | — | ✓ |
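
The applicability table above is effectively a lookup from entrypoint to an ordered list of steps. A minimal sketch, assuming hypothetical labels (`implementation`, `single`, `tests-only`) and a hypothetical `steps_for` helper; the skill itself does not define these names:

```python
from typing import Dict, List

# Hypothetical encoding of the Step Applicability table (post-change columns).
APPLICABLE_STEPS: Dict[str, List[str]] = {
    "implementation": ["1", "1.5", "2", "4"],
    "single": ["2"],
    "tests-only": ["1t", "3", "4"],
}


def steps_for(entrypoint: str) -> List[str]:
    """Return the ordered step IDs to run for the selected entrypoint."""
    return APPLICABLE_STEPS[entrypoint]
```

Note that Step 3 no longer appears under `implementation`: blackbox test tasks are produced only by the tests-only entrypoint.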

 ## Input Specification

 ### Required Files

-**Default:**
+**Implementation task decomposition:**

 | File | Purpose |
 |------|---------|
@@ -80,10 +81,11 @@ Announce the detected mode and resolved paths to the user before proceeding.
 | `_docs/00_problem/restrictions.md` | Constraints and limitations |
 | `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
 | `_docs/01_solution/solution.md` | Finalized solution |
-| `DOCUMENT_DIR/architecture.md` | Architecture from plan skill |
+| `DOCUMENT_DIR/architecture.md` | Architecture from plan/document skill (must contain a `## Architecture Vision` H2 — confirmed user intent) |
+| `DOCUMENT_DIR/glossary.md` | Project terminology (confirmed by user in plan Phase 2a.0 or document Step 4.5). Use it to keep task names, component references, and AC wording consistent with the user's vocabulary |
 | `DOCUMENT_DIR/system-flows.md` | System flows from plan skill |
 | `DOCUMENT_DIR/components/[##]_[name]/description.md` | Component specs from plan skill |
-| `DOCUMENT_DIR/tests/` | Blackbox test specs from plan skill |
+| `DOCUMENT_DIR/tests/` | Optional product acceptance context from test-spec skill; do not create test task files from it in this entrypoint |

 **Single component mode:**

@@ -110,7 +112,7 @@ Announce the detected mode and resolved paths to the user before proceeding.

 ### Prerequisite Checks (BLOCKING)

-**Default:**
+**Implementation task decomposition:**

 1. DOCUMENT_DIR contains `architecture.md` and `components/` — **STOP if missing**
 2. Create TASKS_DIR and TASKS_TODO if they do not exist
@@ -144,6 +146,8 @@ TASKS_DIR/

 **Naming convention**: Each task file is initially saved in `TASKS_TODO/` with a temporary numeric prefix (`[##]_[short_name].md`). After creating the work item ticket, rename the file to use the work item ticket ID as prefix (`[TRACKER-ID]_[short_name].md`). For example: `todo/01_initial_structure.md` → `todo/AZ-42_initial_structure.md`.

+If the tracker is unavailable, follow `.cursor/rules/tracker.mdc` before continuing. The numeric prefix may remain only when the user explicitly chooses `tracker: local`; in that mode, set `Tracker: pending` and `Epic: pending` in the task header and keep the task eligible for later tracker sync.
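
The rename in the naming convention (temporary numeric prefix to tracker ID) can be sketched as a pure path transformation. This assumes a hypothetical `tracker_filename` helper and that the prefix is everything before the first underscore; it computes the new name only and does not touch the filesystem:

```python
from pathlib import PurePosixPath


def tracker_filename(path: str, tracker_id: str) -> str:
    """Swap the temporary numeric prefix of a task file for the tracker ID."""
    p = PurePosixPath(path)
    prefix, sep, short_name = p.name.partition("_")
    # Refuse names that do not start with a numeric prefix such as "01_".
    if not sep or not prefix.isdigit():
        raise ValueError(f"expected a numeric prefix in {p.name!r}")
    return str(p.with_name(f"{tracker_id}_{short_name}"))
```

Called with `"todo/01_initial_structure.md"` and `"AZ-42"`, it yields `todo/AZ-42_initial_structure.md`, the example given in the convention.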
+
 ### Save Timing

 | Step | Save immediately after | Filename |
@@ -165,11 +169,11 @@ If TASKS_DIR subfolders already contain task files:

 ## Progress Tracking

-At the start of execution, create a TodoWrite with all applicable steps for the detected mode (see Step Applicability table). Update status as each step/component completes.
+At the start of execution, create a TodoWrite with all applicable steps for the selected entrypoint (see Step Applicability table). Update status as each step/component completes.

 ## Workflow

-### Step 1: Bootstrap Structure Plan (default mode only)
+### Step 1: Bootstrap Structure Plan (implementation mode only)

 Read and follow `steps/01_bootstrap-structure.md`.

@@ -181,25 +185,25 @@ Read and follow `steps/01t_test-infrastructure.md`.

 ---

-### Step 1.5: Module Layout (default mode only)
+### Step 1.5: Module Layout (implementation mode only)

 Read and follow `steps/01-5_module-layout.md`.

 ---

-### Step 2: Task Decomposition (default and single component modes)
+### Step 2: Task Decomposition (implementation and single component modes)

 Read and follow `steps/02_task-decomposition.md`.

 ---

-### Step 3: Blackbox Test Task Decomposition (default and tests-only modes)
+### Step 3: Blackbox Test Task Decomposition (tests-only mode only)

 Read and follow `steps/03_blackbox-test-decomposition.md`.

 ---

-### Step 4: Cross-Task Verification (default and tests-only modes)
+### Step 4: Cross-Task Verification (implementation and tests-only modes)

 Read and follow `steps/04_cross-verification.md`.

@@ -207,7 +211,7 @@ Read and follow `steps/04_cross-verification.md`.

 - **Coding during decomposition**: this workflow produces specs, never code
 - **Over-splitting**: don't create many tasks if the component is simple — 1 task is fine
-- **Tasks exceeding 8 points**: split them; no task should be too complex for a single implementer
+- **Tasks exceeding 5 points**: split them; no task should be too complex for a single implementer
 - **Cross-component tasks**: each task belongs to exactly one component
 - **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
 - **Creating git branches**: branch creation is an implementation concern, not a decomposition one
@@ -220,7 +224,7 @@ Read and follow `steps/04_cross-verification.md`.
 | Situation | Action |
 |-----------|--------|
 | Ambiguous component boundaries | ASK user |
-| Task complexity exceeds 8 points after splitting | ASK user |
+| Task complexity exceeds 5 points after splitting | ASK user |
 | Missing component specs in DOCUMENT_DIR | ASK user |
 | Cross-component dependency conflict | ASK user |
 | Tracker epic not found for a component | ASK user for Epic ID |
@@ -232,15 +236,14 @@ Read and follow `steps/04_cross-verification.md`.
 ┌────────────────────────────────────────────────────────────────┐
 │          Task Decomposition (Multi-Mode)                        │
 ├────────────────────────────────────────────────────────────────┤
-│ CONTEXT: Resolve mode (default / single component / tests-only) │
+│ CONTEXT: Invoke the selected entrypoint (implementation / single / tests-only) │
 │                                                                 │
-│ DEFAULT MODE:                                                   │
+│ IMPLEMENTATION TASK DECOMPOSITION:                              │
 │  1.   Bootstrap Structure → steps/01_bootstrap-structure.md     │
 │       [BLOCKING: user confirms structure]                       │
 │  1.5  Module Layout       → steps/01-5_module-layout.md         │
 │       [BLOCKING: user confirms layout]                          │
 │  2.   Component Tasks     → steps/02_task-decomposition.md      │
-│  3.   Blackbox Tests      → steps/03_blackbox-test-decomposition.md │
 │  4.   Cross-Verification  → steps/04_cross-verification.md      │
 │       [BLOCKING: user confirms dependencies]                    │
 │                                                                 │
@@ -26,7 +26,7 @@ For each component (or the single provided component):
 4. Do not create tasks for other components — only tasks for the current component
 5. Each task should be atomic, containing 1 API or a list of semantically connected APIs
 6. Write each task spec using `templates/task.md`
-7. Estimate complexity per task (1, 2, 3, 5, 8 points); no task should exceed 8 points — split if it does
+7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
 8. Note task dependencies (referencing tracker IDs of already-created dependency tasks, e.g., `AZ-42_initial_structure`)
 9. **Cross-cutting rule**: if a concern spans ≥2 components (logging, config loading, auth/authZ, error envelope, telemetry, feature flags, i18n), create ONE shared task under the cross-cutting epic. Per-component tasks declare it as a dependency and consume it; they MUST NOT re-implement it locally. Duplicate local implementations are an `Architecture` finding (High) in code-review Phase 7 and a `Maintainability` finding in Phase 6.
 10. **Shared-models / shared-API rule**: classify the task as shared if ANY of the following is true:
@@ -43,16 +43,32 @@ For each component (or the single provided component):
    Consumers read the contract file, not the producer's task spec. This prevents interface drift when the producer's implementation detail leaks into consumers.
 11. **Immediately after writing each task file**: create a work item ticket, link it to the component's epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.

+## Runtime Completeness Decomposition Gate
+
+Before Step 2 is considered complete, scan `architecture.md`, `system-flows.md`, component descriptions, and the solution for named internal runtime capabilities and dependencies. Examples include BASALT/OpenVINS/Kimera, FAISS, DINOv2, ONNX/TensorRT, ALIKED/DISK, LightGlue, RANSAC, PostGIS, MAVLink emission, FDR rollover, and any "A-Z" user-visible pipeline.
+
+For every named internal capability:
+
+1. Ensure at least one implementation task explicitly owns the production integration or production algorithm.
+2. Do not treat "define protocol", "create adapter boundary", "add deterministic fallback", "create scaffold", or "prepare native bridge" as implementation of the capability unless the architecture explicitly says the real capability is out of scope.
+3. If a capability needs external hardware/data to verify, still create the production implementation task. Verification may be hardware-gated later; implementation must not be omitted.
+4. Add a `## Runtime Completeness` section to any affected task with:
+   - named capability/dependency,
+   - production code that must exist,
+   - allowed external stubs, if any,
+   - unacceptable substitutes such as fake/deterministic/internal stubs.
+
 ## Self-verification (per component)

 - [ ] Every task is atomic (single concern)
-- [ ] No task exceeds 8 complexity points
+- [ ] No task exceeds 5 complexity points
 - [ ] Task dependencies reference correct tracker IDs
 - [ ] Tasks cover all interfaces defined in the component spec
 - [ ] No tasks duplicate work from other components
 - [ ] Every task has a work item ticket linked to the correct epic
 - [ ] Every shared-models / shared-API task has a contract file at `_docs/02_document/contracts/<component>/<name>.md` and a `## Contract` section linking to it
 - [ ] Every cross-cutting concern appears exactly once as a shared task, not N per-component copies
+- [ ] Every named internal runtime capability has a production implementation task, not only an interface/scaffold/fallback task
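
The rename in step 11 (`todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`) could be sketched roughly as follows. This is a minimal illustration, not part of the skill: the function names, the two-digit prefix pattern, and the use of `pathlib` are assumptions made for the example.

```python
import re
from pathlib import Path

# Assumption: the temporary prefix is exactly two digits, as in "01_initial_structure.md".
TEMP_NAME_RE = re.compile(r"^(\d{2})_(.+)\.md$")


def tracker_filename(temp_name: str, tracker_id: str) -> str:
    """Swap the temporary numeric prefix for the tracker ID (hypothetical helper)."""
    match = TEMP_NAME_RE.match(temp_name)
    if match is None:
        raise ValueError(f"not a temporary task filename: {temp_name!r}")
    return f"{tracker_id}_{match.group(2)}.md"


def rename_task_file(path: Path, tracker_id: str) -> Path:
    """Rename the task file on disk and return its new path."""
    target = path.with_name(tracker_filename(path.name, tracker_id))
    path.rename(target)
    return target
```

For the example given in the naming convention, `tracker_filename("01_initial_structure.md", "AZ-42")` yields `AZ-42_initial_structure.md`. A file that already carries a tracker prefix is rejected rather than renamed twice.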

 ## Save action

@@ -1,4 +1,4 @@
-# Step 3: Blackbox Test Task Decomposition (default and tests-only modes)
+# Step 3: Blackbox Test Task Decomposition (tests-only mode only)

 **Role**: Professional Quality Assurance Engineer
 **Goal**: Decompose blackbox test specs into atomic, implementable task specs.
@@ -6,7 +6,6 @@

 ## Numbering

-- In default mode: continue sequential numbering from where Step 2 left off.
 - In tests-only mode: start from 02 (01 is the test infrastructure bootstrap from Step 1t).

 ## Steps
@@ -14,21 +13,26 @@
 1. Read all test specs from `DOCUMENT_DIR/tests/` (`blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, `resource-limit-tests.md`)
 2. Group related test scenarios into atomic tasks (e.g., one task per test category or per component under test)
 3. Each task should reference the specific test scenarios it implements and the environment/test-data specs
-4. Dependencies:
-   - In default mode: blackbox test tasks depend on the component implementation tasks they exercise
+4. Add a **System Under Test Boundary** section to every e2e/blackbox test task:
+   - The test must drive the product through public runtime boundaries and compare actual outputs to `_docs/00_problem/input_data/expected_results/results_report.md` and any referenced machine-readable expected-result files.
+   - Stubs are allowed only for external systems outside the product boundary: flight controller/SITL, QGC observer, satellite-provider/Suite service, physical Jetson hardware, physical camera, licensed public datasets, and network services.
+   - Stubs, fakes, deterministic fallbacks, monkeypatches, or direct imports are not allowed for internal product modules that the scenario is meant to validate, such as VIO, safety/anchor wrapper, satellite retrieval, anchor verification, tile manager, MAVLink output adapter, or FDR.
+   - If an internal module is not implemented, the test must fail/block as missing product implementation; it must not pass by replacing that module with a test stub.
+5. Dependencies:
   - In tests-only mode: blackbox test tasks depend on the test infrastructure bootstrap task (Step 1t)
-5. Write each task spec using `templates/task.md`
-6. Estimate complexity per task (1, 2, 3, 5, 8 points); no task should exceed 8 points — split if it does
-7. Note task dependencies (referencing tracker IDs of already-created dependency tasks)
-8. **Immediately after writing each task file**: create a work item ticket under the "Blackbox Tests" epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.
+6. Write each task spec using `templates/task.md`
+7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
+8. Note task dependencies (referencing tracker IDs of already-created dependency tasks)
+9. **Immediately after writing each task file**: create a work item ticket under the "Blackbox Tests" epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.

 ## Self-verification

 - [ ] Every scenario from `tests/blackbox-tests.md` is covered by a task
 - [ ] Every scenario from `tests/performance-tests.md`, `tests/resilience-tests.md`, `tests/security-tests.md`, and `tests/resource-limit-tests.md` is covered by a task
-- [ ] No task exceeds 8 complexity points
-- [ ] Dependencies correctly reference the dependency tasks (component tasks in default mode, test infrastructure in tests-only mode)
+- [ ] No task exceeds 5 complexity points
+- [ ] Dependencies correctly reference the test infrastructure task
 - [ ] Every task has a work item ticket linked to the "Blackbox Tests" epic
+- [ ] Every e2e/blackbox task forbids internal product stubs/fakes and requires comparison against expected-results artifacts

 ## Save action

@@ -1,4 +1,4 @@
-# Step 4: Cross-Task Verification (default and tests-only modes)
+# Step 4: Cross-Task Verification (implementation and tests-only modes)

 **Role**: Professional software architect and analyst
 **Goal**: Verify task consistency and produce `_dependencies_table.md`.
@@ -8,17 +8,20 @@

 1. Verify task dependencies across all tasks are consistent
 2. Check no gaps:
-   - In default mode: every interface in `architecture.md` has tasks covering it
+   - In implementation mode: every product interface in `architecture.md` has implementation task coverage
   - In tests-only mode: every test scenario in `traceability-matrix.md` is covered by a task
+   - In implementation mode: every named internal runtime capability/dependency from architecture, solution, system flows, and component descriptions has a production implementation task, not only an interface/scaffold/fallback task
+   - In tests-only mode: every e2e/blackbox task has a System Under Test Boundary section that forbids stubbing internal product modules and requires comparison to expected-results artifacts
 3. Check no overlaps: tasks don't duplicate work
 4. Check no circular dependencies in the task graph
 5. Produce `_dependencies_table.md` using `templates/dependencies-table.md`

 ## Self-verification

-### Default mode
+### Implementation mode

-- [ ] Every architecture interface is covered by at least one task
+- [ ] Every product interface in `architecture.md` is covered by at least one implementation task
+- [ ] Every named internal runtime capability has a production implementation task
 - [ ] No circular dependencies in the task graph
 - [ ] Cross-component dependencies are explicitly noted in affected task specs
 - [ ] `_dependencies_table.md` contains every task with correct dependencies
@@ -26,6 +29,7 @@
 ### Tests-only mode

 - [ ] Every test scenario from `traceability-matrix.md` "Covered" entries has a corresponding task
+- [ ] Every e2e/blackbox task validates actual product behavior and allows stubs only for external systems
 - [ ] No circular dependencies in the task graph
 - [ ] Test task dependencies reference the test infrastructure bootstrap
 - [ ] `_dependencies_table.md` contains every task with correct dependencies
@@ -28,4 +28,4 @@ Use this template after cross-task verification. Save as `TASKS_DIR/_dependencie
 - Dependencies column lists tracker IDs (e.g., "AZ-43, AZ-44") or "None"
 - No circular dependencies allowed
 - Tasks should be listed in recommended execution order
-- The `/implement` skill reads this table to compute parallel batches
+- The `/implement` skill reads this table to compute dependency-aware batches; task execution remains sequential
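
One way to picture "dependency-aware batches" with a circular-dependency check is a topological sort over the table. The sketch below is illustrative only: it assumes the table has already been parsed into a dict of task ID to dependency IDs, and `compute_batches` is a hypothetical name, not the `/implement` skill's actual code.

```python
from graphlib import CycleError, TopologicalSorter


def compute_batches(deps: dict[str, list[str]]) -> list[list[str]]:
    """Group tasks into batches; every task's dependencies sit in an earlier batch.

    `deps` maps a task ID to the IDs it depends on (assumed pre-parsed input).
    Raises graphlib.CycleError when the graph contains a circular dependency.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # cycle detection happens here
    batches: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for a stable order
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    table = {
        "AZ-42": [],
        "AZ-43": ["AZ-42"],
        "AZ-44": ["AZ-42"],
        "AZ-45": ["AZ-43", "AZ-44"],
    }
    for number, batch in enumerate(compute_batches(table), start=1):
        print(number, batch)
    try:
        compute_batches({"A": ["B"], "B": ["A"]})
    except CycleError:
        print("circular dependency detected")
```

Batches here only express ordering constraints. Tasks inside a batch would still be executed one at a time, as the line above states.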
@@ -1,6 +1,6 @@
 # Module Layout Template

-The module layout is the **authoritative file-ownership map** used by the `/implement` skill to assign OWNED / READ-ONLY / FORBIDDEN files to implementer subagents. It is derived from `_docs/02_document/architecture.md` and the component specs at `_docs/02_document/components/`, and it follows the target language's standard project-layout conventions.
+The module layout is the **authoritative file-ownership map** used by the `/implement` skill to assign OWNED / READ-ONLY / FORBIDDEN files to each task. It is derived from `_docs/02_document/architecture.md` and the component specs at `_docs/02_document/components/`, and it follows the target language's standard project-layout conventions.

 Save as `_docs/02_document/module-layout.md`. This file is produced by the decompose skill (Step 1.5 module layout) and consumed by the implement skill (Step 4 file ownership). Task specs remain purely behavioral — they do NOT carry file paths. The layout is the single place where component → filesystem mapping lives.

@@ -104,4 +104,4 @@ The implement skill's Step 4 (File Ownership) reads this file and, for each task
 3. Set READ-ONLY = the Public API files of every component listed in `Imports from`, plus `shared/*` Public API files.
 4. Set FORBIDDEN = every other component's Owns glob.

-If two tasks in the same batch map to the same component, the implement skill schedules them sequentially (one implementer at a time for that component) to avoid file conflicts on shared internal files.
+Execution inside a batch is already sequential (one task at a time). This mapping is still required because it enforces scope discipline per task — preventing a task from drifting into files that belong to another component.
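
The READ-ONLY and FORBIDDEN rules above can be sketched as a small classifier. Everything concrete here is an assumption made for illustration: the in-memory `LAYOUT` dict, the component names and paths, the `UNMAPPED` fallback, and the treatment of OWNED as the task component's own Owns glob (the steps that define OWNED are not shown in this hunk).

```python
from fnmatch import fnmatch

# Hypothetical in-memory form of module-layout.md; the real file is Markdown.
LAYOUT = {
    "flights": {
        "owns": "src/flights/*",
        "public_api": ["src/flights/api.py"],
        "imports_from": ["auth"],
    },
    "auth": {
        "owns": "src/auth/*",
        "public_api": ["src/auth/api.py"],
        "imports_from": [],
    },
}
SHARED_PUBLIC_API = ["shared/models.py"]  # assumed shared/* Public API files


def classify(path: str, component: str, layout: dict = LAYOUT) -> str:
    """Return the access level of `path` for a task that belongs to `component`."""
    own = layout[component]
    if fnmatch(path, own["owns"]):
        return "OWNED"  # assumption: OWNED is the component's own Owns glob
    # READ-ONLY: Public API files of imported components plus shared Public API files.
    readable = set(SHARED_PUBLIC_API)
    for dep in own["imports_from"]:
        readable.update(layout[dep]["public_api"])
    if path in readable:
        return "READ-ONLY"
    # FORBIDDEN: anything inside another component's Owns glob.
    for name, other in layout.items():
        if name != component and fnmatch(path, other["owns"]):
            return "FORBIDDEN"
    return "UNMAPPED"  # not covered by the layout; a label invented for this sketch
```

The READ-ONLY check runs before the FORBIDDEN check because an imported component's Public API file also matches that component's Owns glob.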
@@ -11,7 +11,7 @@ Save as `TASKS_DIR/[##]_[short_name].md` initially, then rename to `TASKS_DIR/[T
 **Task**: [TRACKER-ID]_[short_name]
 **Name**: [short human name]
 **Description**: [one-line description of what this task delivers]
-**Complexity**: [1|2|3|5|8] points
+**Complexity**: [1|2|3|5] points
 **Dependencies**: [AZ-43_shared_models, AZ-44_db_migrations] or "None"
 **Component**: [component name for context]
 **Tracker**: [TASK-ID]
@@ -102,8 +102,7 @@ Consumers MUST read that file — not this task spec — to discover the interfa
 - 2 points: Non-trivial, low complexity, minimal coordination
 - 3 points: Multi-step, moderate complexity, potential alignment needed
 - 5 points: Difficult, interconnected logic, medium-high risk
-- 8 points: High difficulty, high ambiguity or coordination, multiple components
-- 13 points: Too complex — split into smaller tasks
+- 8+ points: Too complex — split into smaller tasks
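
A check for the revised scale could look roughly like this. It is a sketch under assumptions: the `**Complexity**: N points` header format comes from the task template, while the function name and the error messages are invented for the example.

```python
import re

ALLOWED_POINTS = {1, 2, 3, 5}  # the scale after this change; 8 and above must be split

# Matches a header line such as "**Complexity**: 3 points".
COMPLEXITY_RE = re.compile(r"^\*\*Complexity\*\*:\s*(\d+)\s+points?\b", re.MULTILINE)


def check_complexity(task_markdown: str) -> int:
    """Return the task's complexity, or raise ValueError if it is missing or off-scale."""
    match = COMPLEXITY_RE.search(task_markdown)
    if match is None:
        raise ValueError("task header has no **Complexity** field")
    points = int(match.group(1))
    if points not in ALLOWED_POINTS:
        raise ValueError(f"{points} points is outside the allowed scale; split the task")
    return points
```

A task spec still marked `**Complexity**: 8 points` would fail this check, which matches the self-verification item "No task exceeds 5 complexity points".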

 ## Output Guidelines

@@ -26,7 +26,8 @@
   - Application components under test
   - Test runner container (black-box, no internal imports)
   - Isolated database with seed data
-   - All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit`
+   - All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner`
+   - See the Woodpecker two-workflow contract in [`../templates/ci_cd_pipeline.md`](../templates/ci_cd_pipeline.md) — the test runner entry point defined here becomes the first step of `.woodpecker/01-test.yml`.
 7. Define image tagging strategy: `<registry>/<project>/<component>:<git-sha>` for CI, `latest` for local dev only

 ## Self-verification
@@ -85,3 +85,140 @@ Save as `_docs/04_deploy/ci_cd_pipeline.md`.
 | Deploy success | [Slack] | [team] |
 | Deploy failure | [Slack/email + PagerDuty] | [on-call] |
 ```
+
+---
+
+## Reference Implementation: Woodpecker CI two-workflow contract
+
+Use this when the project's CI is **Woodpecker** and the test layout follows the autodev e2e contract from [`../../decompose/templates/test-infrastructure-task.md`](../../decompose/templates/test-infrastructure-task.md) (an `e2e/` folder containing `Dockerfile`, `docker-compose.test.yml`, `conftest.py`, `requirements.txt`, `mocks/`, `fixtures/`, `tests/`).
+
+The contract is **two workflows in `.woodpecker/`**, scheduled on the same agent label, with the build workflow gated on a successful test run:
+
+- `.woodpecker/01-test.yml` — runs the e2e contract, publishes `results/report.csv` as an artifact, fails the pipeline on any test failure.
+- `.woodpecker/02-build-push.yml` — `depends_on: [01-test]`. Builds the image, tags it `${CI_COMMIT_BRANCH}-${TAG_SUFFIX}`, pushes it to the registry. Skipped automatically if test failed.
+
+The agent label is parameterized via `matrix:` so a single workflow file fans out across architectures: `labels: platform: ${PLATFORM}` routes each matrix entry to the matching agent. Both workflows for a repo must use the same matrix so test and build run on the same machine and share Docker layer cache. New architectures = new matrix entries; never new files.
+
+### Multi-arch matrix conventions
+
+| Variable | Meaning | Typical values |
+|----------|---------|----------------|
+| `PLATFORM` | Woodpecker agent label — selects which physical machine runs the entry. | `arm64`, `amd64` |
+| `TAG_SUFFIX` | Image tag suffix appended after the branch name. | `arm`, `amd` |
+| `DOCKERFILE` *(only when arches need different Dockerfiles)* | Path to the Dockerfile for this entry. | `Dockerfile`, `Dockerfile.jetson` |
+
+Most repos use the same `Dockerfile` for both arches (multi-arch base images handle the rest), so `DOCKERFILE` can be omitted from the matrix and hardcoded in the build command. Repos with split per-arch Dockerfiles (e.g., `detections` uses `Dockerfile.jetson` on Jetson with TensorRT/CUDA-on-L4T) declare `DOCKERFILE` as a matrix var.
+
+When only one architecture is currently in use, keep the matrix block with a single entry and the second entry commented out — adding a new arch is then a one-line uncomment, not a structural change.
+
+### `.woodpecker/01-test.yml`
+
+```yaml
+when:
+  event: [push, pull_request, manual]
+  branch: [dev, stage, main]
+
+matrix:
+  include:
+    - PLATFORM: arm64
+      TAG_SUFFIX: arm
+    # - PLATFORM: amd64
+    #   TAG_SUFFIX: amd
+
+labels:
+  platform: ${PLATFORM}
+
+steps:
+  - name: e2e
+    image: docker
+    commands:
+      - cd e2e
+      - docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build
+      - docker compose -f docker-compose.test.yml down -v
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
+
+  - name: report
+    image: docker
+    when:
+      status: [success, failure]
+    commands:
+      - test -f e2e/results/report.csv && cat e2e/results/report.csv || echo "no report"
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
+```
+
+Notes:
+- `--abort-on-container-exit` shuts the whole compose down as soon as ANY service exits, so a crashed dependency surfaces immediately instead of hanging the runner.
+- `--exit-code-from e2e-runner` ensures the pipeline's exit code reflects the test runner's, not the SUT's.
+- The `report` step runs on `[success, failure]` so the report is always published; without this the CSV is lost on red builds.
+- `down -v` between runs drops mock state and DB volumes — every test run starts clean.
+
+### `.woodpecker/02-build-push.yml`
+
+```yaml
+when:
+  event: [push, manual]
+  branch: [dev, stage, main]
+
+depends_on:
+  - 01-test
+
+matrix:
+  include:
+    - PLATFORM: arm64
+      TAG_SUFFIX: arm
+    # - PLATFORM: amd64
+    #   TAG_SUFFIX: amd
+
+labels:
+  platform: ${PLATFORM}
+
+steps:
+  - name: build-push
+    image: docker
+    environment:
+      REGISTRY_HOST:
+        from_secret: registry_host
+      REGISTRY_USER:
+        from_secret: registry_user
+      REGISTRY_TOKEN:
+        from_secret: registry_token
+    commands:
+      - echo "$REGISTRY_TOKEN" | docker login "$REGISTRY_HOST" -u "$REGISTRY_USER" --password-stdin
+      - export TAG=${CI_COMMIT_BRANCH}-${TAG_SUFFIX}
+      - export BUILD_DATE=$(date -u +%Y-%m-%dT%H:%M:%SZ)
+      - |
+        docker build -f Dockerfile \
+          --build-arg CI_COMMIT_SHA=$CI_COMMIT_SHA \
+          --label org.opencontainers.image.revision=$CI_COMMIT_SHA \
+          --label org.opencontainers.image.created=$BUILD_DATE \
+          --label org.opencontainers.image.source=$CI_REPO_URL \
+          -t $REGISTRY_HOST/azaion/<service>:$TAG .
+      - docker push $REGISTRY_HOST/azaion/<service>:$TAG
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
+```
+
+Notes:
+- `depends_on: [01-test]` is enforced by Woodpecker — a failed `01-test` (any matrix entry) skips this workflow.
+- The build workflow does NOT trigger on `pull_request` events: PRs get test signal only; pushes to `dev`/`stage`/`main` produce images. Avoids polluting the registry with PR images.
+- Replace `<service>` with the actual service name (matches the registry namespace pattern `azaion/<service>`).
+- For repos with split per-arch Dockerfiles, add `DOCKERFILE: Dockerfile.jetson` (or similar) to the matrix entry and substitute `${DOCKERFILE}` for `Dockerfile` in the `docker build -f` line.
+
+### Variations by stack
+
+The contract is language-agnostic because the runner is `docker compose`. The Dockerfile inside `e2e/` selects the test framework:
+
+| Stack | `e2e/Dockerfile` runs |
+|-------|----------------------|
+| Python | `pytest --csv=/results/report.csv -v` |
+| .NET | `dotnet test --logger:"trx;LogFileName=/results/report.trx"` (convert to CSV in a final step if needed) |
+| Node/UI | `npm test -- --reporters=default --reporters=jest-junit --outputDirectory=/results` |
+| Rust | `cargo test --no-fail-fast -- --format json > /results/report.json` |
+
+When the repo has **only unit tests** (no `e2e/docker-compose.test.yml`), drop the compose orchestration and run the native test command directly inside a stack-appropriate image. Keep the same two-workflow split — `01-test.yml` runs unit tests, `02-build-push.yml` is unchanged.
+
+### Manual-trigger override (test infrastructure not yet validated)
+
+If a repo ships a complete `e2e/` layout but the test fixtures are not yet validated end-to-end (e.g., expected-results data is still being authored), gate `01-test.yml` on `event: [manual]` only and add a TODO comment pointing to the unblocking task. The `02-build-push.yml` workflow drops its `depends_on` clause for the manual-only window — an explicit and reversible exception, not a permanent split.
@@ -31,6 +31,7 @@ _docs/
    │   ├── components.md
    │   └── flows/
    ├── 04_verification_log.md           # Step 4
+    ├── glossary.md                       # Step 4.5 (confirmed-by-user)
    ├── FINAL_report.md                  # Step 7
    └── state.json                       # Resumability
 ```
@@ -49,6 +50,7 @@ Maintained in `DOCUMENT_DIR/state.json` for resumability:
  "modules_remaining": ["services/auth", "api/endpoints"],
  "module_batch": 1,
  "components_written": [],
+  "step_4_5_glossary_vision": "not_started",
  "last_updated": "2026-03-21T14:00:00Z"
 }
 ```
@@ -15,7 +15,7 @@ Covers three related modes that share the same 8-step pipeline:

 ## Progress Tracking

-Create a TodoWrite with all steps (0 through 7). Update status as each step completes.
+Create a TodoWrite with all steps (0 through 7, including the inline Step 2.5 Module Layout Derivation and Step 4.5 Glossary & Architecture Vision). Update status as each step completes.

 ## Steps

@@ -251,7 +251,107 @@ Apply corrections inline to the documents that need them.

 **BLOCKING**: Present verification summary to user. Do NOT proceed until user confirms corrections are acceptable or requests additional fixes.

-**Session boundary**: After verification is confirmed, suggest a session break before proceeding to the synthesis steps (5–7). These steps produce different artifact types and benefit from fresh context:
+---
+
+### Step 4.5: Glossary & Architecture Vision (BLOCKING)
+
+**Role**: Software architect + business analyst
+**Goal**: Reconcile the AI's verified understanding of the codebase with the user's intended terminology and architecture vision. Existing-code projects often carry domain language and structural intent that is invisible from code alone (synonyms, deprecated names, modules that are "supposed to" be split, components the user thinks of as one logical unit even though they live in two folders). This step makes that intent explicit before any downstream skill (refactor, decompose, new-task) acts on the docs.
+
+**When this step runs**:
+- Always, after Step 4 (Verification Pass) — for Full and Resume modes.
+- **Skipped** in Focus Area mode (the glossary and vision are system-wide; running the step on a partial scan would produce a partial glossary). Run this step with the user once a full pass exists.
+
+**Inputs** (already on disk after Step 4):
+- `DOCUMENT_DIR/architecture.md`, `system-flows.md`, `data_model.md`, `deployment/*`
+- `DOCUMENT_DIR/components/*/description.md`
+- `DOCUMENT_DIR/modules/*.md`
+- `DOCUMENT_DIR/04_verification_log.md` (so the AI knows which doc parts are confirmed vs. flagged)
+
+**Outputs**:
+- `DOCUMENT_DIR/glossary.md` (NEW)
+- `DOCUMENT_DIR/architecture.md` updated in place: a new `## Architecture Vision` section is prepended (or merged into an existing "Overview" / "Vision" heading if already present); existing technical sections are preserved verbatim
+
+**Procedure**:
+
+1. **Draft glossary** from verified docs:
+   - Domain entities, processes, roles named in module/component docs
+   - Acronyms / abbreviations
+   - Internal codenames (project, service, model names) that recur in the codebase
+   - Synonym pairs the AI noticed (e.g., the codebase uses "flight" but module comments say "mission")
+   - Stakeholder personas if any docs reference them
+   Each entry: one-line definition + source reference (`source: components/03_flights/description.md`). Skip generic CS/industry terms.
+
+2. **Draft architecture vision** as the AI currently understands the codebase:
+   - **One paragraph**: what the system is, who runs it, the runtime topology shape (monolith / services / pipeline / library / hybrid), and the dominant pattern (e.g., "submodule-based meta-repo with REST + SSE between UI and backend").
+   - **Components & responsibilities** (one-line each), pulled from `components/*/description.md`.
+   - **Major data flows** (one or two sentences each), pulled from `system-flows.md`.
+   - **Architectural principles / non-negotiables** the AI inferred from the code (e.g., "DB-driven config", "all UI traffic via REST + SSE only", "no per-component shared state"). Mark each with `inferred-from: <source>`.
+   - **Open questions / drift signals**: places where the code disagrees with itself, or where the AI cannot tell intent from implementation (e.g., two components doing similar work — is that legacy duplication or deliberate?).
+
+3. **Present condensed view** to the user (NOT the full draft files — a synopsis only):
+
+   ```
+   ══════════════════════════════════════
+    REVIEW: Glossary + Architecture Vision (existing code)
+   ══════════════════════════════════════
+    Glossary (N terms drafted from verified docs):
+      - <Term>: <one-line definition>
+      - ...
+
+    Architecture Vision — as inferred from the codebase:
+      <one-paragraph synopsis>
+
+      Components / responsibilities:
+        - <component>: <one-line>
+        - ...
+
+      Principles / non-negotiables (inferred):
+        - <principle>  [inferred-from: <source>]
+        - ...
+
+      Open questions / drift signals:
+        - <q1>
+        - <q2>
+   ══════════════════════════════════════
+    A) Inferred vision matches my intent — write the files
+    B) Add / correct entries (provide diffs — terms, components,
+       principles, or rename pairs)
+    C) Resolve the open questions / drift signals first
+   ══════════════════════════════════════
+    Recommendation: pick C if any drift signals exist;
+                    otherwise B if the vision misses
+                    project-specific intent; A only when
+                    the inferred vision is exactly right.
+   ══════════════════════════════════════
+   ```
+
+4. **Iterate**:
+   - On B → integrate the user's diffs/additions, re-present, loop until A.
+   - On C → ask the listed open questions in one batch (M4-style), integrate answers, re-present.
+   - **Do NOT proceed to step 5 until the user picks A.**
+
+5. **Save**:
+   - Write `DOCUMENT_DIR/glossary.md`, alphabetical, with a top-line `**Status**: confirmed-by-user` and the date.
+   - Update `DOCUMENT_DIR/architecture.md`:
+     - If a `## Architecture Vision` (or `## Vision` / `## Overview`) section already exists at the top, replace its body with the confirmed paragraph + components + principles.
+     - Otherwise, insert `## Architecture Vision` as the first H2 after the title; preserve every existing H2 below.
+     - Do NOT delete or re-order existing technical sections (Tech Stack, Deployment Model, Data Model, NFRs, ADRs).
+
+6. **Update `state.json`**: set `step_4_5_glossary_vision` to `confirmed`. A resumed rerun must skip this step unless the user explicitly invokes `/document --refresh-vision`.
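
The `architecture.md` update in procedure step 5 could be sketched as below. This is a minimal illustration under stated assumptions: `upsert_vision` is a hypothetical name, an existing `## Vision` or `## Overview` heading is kept as-is while its body is replaced, and H2 detection is a plain line-prefix test that would misfire on `## ` lines inside fenced code blocks.

```python
VISION_HEADINGS = ("## Architecture Vision", "## Vision", "## Overview")


def upsert_vision(markdown: str, vision_body: str) -> str:
    """Insert or replace the vision section while preserving every other H2."""
    lines = markdown.splitlines()
    body = ["", *vision_body.strip().splitlines(), ""]
    # Naive H2 detection: any line that starts with "## ".
    h2 = [i for i, line in enumerate(lines) if line.startswith("## ")]
    if h2 and lines[h2[0]].strip() in VISION_HEADINGS:
        # A vision-like section already leads the document: replace only its body.
        end = h2[1] if len(h2) > 1 else len(lines)
        new_lines = lines[: h2[0] + 1] + body + lines[end:]
    else:
        # Otherwise insert the new section before the first existing H2
        # (or at the end of the document if there is none).
        at = h2[0] if h2 else len(lines)
        new_lines = lines[:at] + ["## Architecture Vision"] + body + lines[at:]
    return "\n".join(new_lines) + "\n"
```

Existing H2 sections are never removed or re-ordered by this function, which lines up with the self-verification item that `architecture.md` must still contain every H2 it had before the step.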
+
+**Self-verification**:
+- [ ] Every glossary entry traces to at least one file under `DOCUMENT_DIR/`
+- [ ] Every component listed in the vision matches a folder under `DOCUMENT_DIR/components/`
+- [ ] All open questions are answered or explicitly deferred (with the user's acknowledgement)
+- [ ] `architecture.md` still contains all H2 sections it had before this step
+- [ ] User picked option A on the latest condensed view
+
+**BLOCKING**: Do NOT proceed to the session boundary / Step 5 until both files are saved and the user has picked A.
+
+---
+
+**Session boundary**: After Step 4.5 is confirmed, suggest a session break before proceeding to the synthesis steps (5–7). These steps produce different artifact types and benefit from fresh context:

 ```
 ══════════════════════════════════════
@@ -1,41 +1,59 @@
 ---
 name: implement
 description: |
-  Orchestrate task implementation with dependency-aware batching, parallel subagents, and integrated code review.
+  Implement tasks sequentially with dependency-aware batching and integrated code review.
  Reads flat task files and _dependencies_table.md from TASKS_DIR, computes execution batches via topological sort,
-  launches up to 4 implementer subagents in parallel, runs code-review skill after each batch, and loops until done.
+  implements tasks one at a time in dependency order, runs code-review skill after each batch, and loops until done.
  Use after /decompose has produced task files.
  Trigger phrases:
  - "implement", "start implementation", "implement tasks"
-  - "run implementers", "execute tasks"
+  - "execute tasks"
 category: build
-tags: [implementation, orchestration, batching, parallel, code-review]
+tags: [implementation, batching, code-review]
 disable-model-invocation: true
 ---

-# Implementation Orchestrator
+# Implementation Runner

-Orchestrate the implementation of all tasks produced by the `/decompose` skill. This skill is a **pure orchestrator** — it does NOT write implementation code itself. It reads task specs, computes execution order, delegates to `implementer` subagents, validates results via the `/code-review` skill, and escalates issues.
+Implement all tasks produced by the `/decompose` skill. This skill reads task specs, computes execution order, writes the code and tests for each task **sequentially** (no subagents, no parallel execution), validates results via the `/code-review` skill, and escalates issues.

-The `implementer` agent is the specialist that writes all the code — it receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria.
+For each task the main agent receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria — then moves on to the next task.

 ## Core Principles

- **Orchestrate, don't implement**: this skill delegates all coding to `implementer` subagents
- **Dependency-aware batching**: tasks run only when all their dependencies are satisfied
- **Max 4 parallel agents**: never launch more than 4 implementer subagents simultaneously
- **File isolation**: no two parallel agents may write to the same file
+- **Sequential execution**: implement one task at a time. Do NOT spawn subagents and do NOT run tasks in parallel. (See `.cursor/rules/no-subagents.mdc`.)
+- **Dependency-aware ordering**: tasks run only when all their dependencies are satisfied
+- **Batching for review, not parallelism**: tasks are grouped into batches so `/code-review` and commits operate on a coherent unit of work — all tasks inside a batch are still implemented one after the other
 - **Integrated review**: `/code-review` skill runs automatically after each batch
- **Auto-start**: batches launch immediately — no user confirmation before a batch
+- **Completeness before testing**: product implementation is not done until code is checked against task outcomes, included scope, architecture/component promises, named runtime dependencies, and unresolved scaffold/native placeholders — not just task AC tests
+- **Runtime dependency reality**: production code cannot satisfy a task by exposing only a protocol, fake runner, deterministic fallback, or "native bridge" placeholder when the task/architecture promises a concrete internal capability such as BASALT VIO, FAISS retrieval, LightGlue matching, or a full A-Z localization pipeline. Stubs are allowed only for external systems and tests.
+- **Auto-start**: batches start immediately — no user confirmation before a batch
 - **Gate on failure**: user confirmation is required only when code review returns FAIL
 - **Commit per batch**: after each batch is confirmed, commit. Ask the user whether to push to remote unless the user previously opted into auto-push for this session.

 ## Context Resolution

 - TASKS_DIR: `_docs/02_tasks/`
- Task files: all `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
+- Task files: selected `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
 - Dependency table: `TASKS_DIR/_dependencies_table.md`

+### Task Selection Context
+
+The invoking flow decides which task category this run should execute. The implement skill must honor that selected context instead of consuming every file in `todo/`.
+
+| Context | Selected task files |
+|---------|---------------------|
+| Product implementation | Task specs that are not test-only and not refactoring specs |
+| Test implementation | `*_test_infrastructure.md` plus task specs whose `Component` or `Epic` identifies `Blackbox Tests` |
+| Refactoring | Task specs whose filename or task ID includes `_refactor_` |
+
+If no explicit context is provided, infer it from the active autodev step:
+- greenfield Step 7 or existing-code Step 10 → Product implementation
+- greenfield Step 10 or existing-code Step 6 → Test implementation
+- refactor Phase 4 → Refactoring
+
+Unselected task files remain in `TASKS_DIR/todo/` for their later flow step.
+
 ### Task Lifecycle Folders

 ```
@@ -48,7 +66,8 @@ TASKS_DIR/

 ## Prerequisite Checks (BLOCKING)

-1. `TASKS_DIR/todo/` exists and contains at least one task file — **STOP if missing**
+1. `TASKS_DIR/todo/` exists and contains at least one task file for the selected context — **STOP if missing**
+   - Exception for Product implementation re-entry: if no selected product tasks remain in `todo/`, but the active autodev state is Step 7 or the latest product completeness report is missing, invalid, or contains `FAIL`, skip directly to Step 15 (Product Implementation Completeness Gate). This gate may create remediation tasks and return to Step 1. Do not write a final implementation report from this state.
 2. `_dependencies_table.md` exists — **STOP if missing**
 3. At least one task is not yet completed — **STOP if all done**
 4. **Working tree is clean** — run `git status --porcelain`; the output must be empty.
@@ -56,16 +75,16 @@ TASKS_DIR/
     - A) Commit or stash stray changes manually, then re-invoke `/implement`
     - B) Agent commits stray changes as a single `chore: WIP pre-implement` commit and proceeds
     - C) Abort
-   - Rationale: implementer subagents edit files in parallel and commit per batch. Unrelated uncommitted changes get silently folded into batch commits otherwise.
+   - Rationale: each batch ends with a commit. Unrelated uncommitted changes would get silently folded into batch commits otherwise.
   - This check is repeated at the start of each batch iteration (see step 6 / step 14 Loop).
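
A minimal sketch of the clean-tree check, assuming Python is available to the agent. The helper names `dirty_files` and `working_tree_is_clean` are hypothetical; only `git status --porcelain` comes from the skill text.

```python
import subprocess


def dirty_files(porcelain_output: str) -> list[str]:
    """Return the paths listed in `git status --porcelain` (v1) output."""
    # Each line is "XY <path>": two status columns, one space, then the path.
    # Rename entries keep their "old -> new" form in this sketch.
    return [line[3:] for line in porcelain_output.splitlines() if line.strip()]


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """True when `git status --porcelain` prints nothing for repo_dir."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return not dirty_files(result.stdout)
```

If `working_tree_is_clean()` returns `False`, the list from `dirty_files` is what would be surfaced to the user alongside the A/B/C choice.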

 ## Algorithm

 ### 1. Parse

- Read all task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
+- Read selected task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
 - Read `_dependencies_table.md` — parse into a dependency graph (DAG)
- Validate: no circular dependencies, all referenced dependencies exist
+- Validate: no circular dependencies in the selected task graph, all referenced selected-task dependencies exist or are already completed in `TASKS_DIR/done/`
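
The validation rule above could be sketched as follows, assuming the dependency table has already been parsed into a mapping from task ID to the set of task IDs it depends on. The function name and the error-message format are illustrative, not part of the skill.

```python
from graphlib import CycleError, TopologicalSorter


def validate_task_graph(deps: dict[str, set[str]], done: set[str]) -> list[str]:
    """Return validation errors for the selected task graph (empty if valid)."""
    errors = []
    for task, needs in deps.items():
        for dep in sorted(needs):
            # A dependency must be a selected task or already archived in done/.
            if dep not in deps and dep not in done:
                errors.append(f"{task}: unknown dependency {dep}")
    # Completed tasks are already satisfied, so drop them before the cycle check.
    pending = {t: {d for d in needs if d in deps} for t, needs in deps.items()}
    try:
        TopologicalSorter(pending).prepare()
    except CycleError as exc:
        errors.append("cycle: " + " -> ".join(exc.args[1]))
    return errors
```

An empty list means the graph passes both checks. Any entry is a reason to stop before computing batches.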

 ### 2. Detect Progress

@@ -78,8 +97,8 @@ TASKS_DIR/

 - Topological sort remaining tasks
 - Select tasks whose dependencies are ALL satisfied (completed)
- If a ready task depends on any task currently being worked on in this batch, it must wait for the next batch
- Cap the batch at 4 parallel agents
+- A batch is simply a coherent group of tasks for review + commit. Within the batch, tasks are implemented sequentially in topological order.
+- Cap the batch size at a reasonable review scope (default: 4 tasks)
 - If the batch would exceed 20 total complexity points, suggest splitting and let the user decide

 ### 4. Assign File Ownership
@@ -89,11 +108,12 @@ The authoritative file-ownership map is `_docs/02_document/module-layout.md` (pr
 For each task in the batch:
 - Read the task spec's **Component** field.
 - Look up the component in `_docs/02_document/module-layout.md` → Per-Component Mapping.
- Set **OWNED** = the component's `Owns` glob (exclusive write for the duration of the batch).
+- Set **OWNED** = the component's `Owns` glob (the files this task is allowed to write).
 - Set **READ-ONLY** = Public API files of every component in the component's `Imports from` list, plus all `shared/*` Public API files.
 - Set **FORBIDDEN** = every other component's `Owns` glob, and every other component's internal (non-Public API) files.
 - If the task is a shared / cross-cutting task (lives under `shared/*`), OWNED = that shared directory; READ-ONLY = nothing; FORBIDDEN = every component directory.
- If two tasks in the same batch map to the same component or overlapping `Owns` globs, schedule them sequentially instead of in parallel.
+
+Because execution is sequential, there is no parallel-write conflict to resolve. Ownership here is a **scope discipline** check: it stops a task from drifting into unrelated components even when it is the only task in the batch.
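
As an illustration of the ownership envelope used as a scope check, here is a sketch that classifies changed paths against the three glob lists. The function names and example paths are hypothetical. It uses `fnmatch`-style matching, where `*` also matches `/`, which may differ from the glob dialect used in `module-layout.md`.

```python
from fnmatch import fnmatchcase


def classify_path(path, owned, read_only, forbidden):
    """Return 'OWNED', 'READ-ONLY', 'FORBIDDEN', or 'UNMAPPED' for one path."""
    def matches(globs):
        return any(fnmatchcase(path, pattern) for pattern in globs)

    if matches(owned):
        return "OWNED"
    if matches(read_only):
        return "READ-ONLY"
    if matches(forbidden):
        return "FORBIDDEN"
    return "UNMAPPED"


def ownership_violations(changed_files, owned, read_only, forbidden):
    """List every changed file that the task was not allowed to write."""
    return [
        path for path in changed_files
        if classify_path(path, owned, read_only, forbidden) != "OWNED"
    ]
```

Running `ownership_violations` over the files a task changed gives the list to raise with the user when a task wrote outside OWNED.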

 If `_docs/02_document/module-layout.md` is missing or the component is not found:
 - STOP the batch.
@@ -102,31 +122,30 @@ If `_docs/02_document/module-layout.md` is missing or the component is not found

 ### 5. Update Tracker Status → In Progress

-For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before launching the implementer. If `tracker: local`, skip this step.
+For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before starting work. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.

-### 6. Launch Implementer Subagents
+### 6. Implement Tasks Sequentially

-**Per-batch dirty-tree re-check**: before launching subagents, run `git status --porcelain`. On the first batch this is guaranteed clean by the prerequisite check. On subsequent batches, the previous batch ended with a commit so the tree should still be clean. If the tree is dirty at this point, STOP and surface the dirty files to the user using the same A/B/C choice as the prerequisite check. The most likely causes are a failed commit in the previous batch, a user who edited files mid-loop, or a pre-commit hook that re-wrote files and was not captured.
+**Per-batch dirty-tree re-check**: before starting the batch, run `git status --porcelain`. On the first batch this is guaranteed clean by the prerequisite check. On subsequent batches, the previous batch ended with a commit so the tree should still be clean. If the tree is dirty at this point, STOP and surface the dirty files to the user using the same A/B/C choice as the prerequisite check. The most likely causes are a failed commit in the previous batch, a user who edited files mid-loop, or a pre-commit hook that re-wrote files and was not captured.

-For each task in the batch, launch an `implementer` subagent with:
- Path to the task spec file
- List of files OWNED (exclusive write access)
- List of files READ-ONLY
- List of files FORBIDDEN
- **Explicit instruction**: the implementer must write or update tests that validate each acceptance criterion in the task spec. If a test cannot run in the current environment (e.g., TensorRT requires GPU), the test must still be written and skip with a clear reason.
+For each task in the batch **in topological order, one at a time**:
+1. Read the task spec file.
+2. Respect the file-ownership envelope computed in Step 4 (OWNED / READ-ONLY / FORBIDDEN).
+3. Implement the feature and write/update tests for every acceptance criterion in the spec. Tests for internal product behavior must exercise the production implementation path. If a test cannot run in the current environment (e.g., TensorRT requires GPU), the test must still exist and skip/block with a clear prerequisite reason, but that skip does not make missing production code complete.
+4. Run the relevant tests locally before moving on to the next task in the batch. If tests fail, fix in-place — do not defer.
+5. Capture a short per-task status line (files changed, tests pass/fail, any blockers) for the batch report.

-Launch all subagents immediately — no user confirmation.
+Do NOT spawn subagents and do NOT attempt to implement two tasks simultaneously, even if they touch disjoint files. See `.cursor/rules/no-subagents.mdc`.

-### 7. Monitor
+### 7. Collect Status

- Wait for all subagents to complete
- Collect structured status reports from each implementer
- If any implementer reports "Blocked", log the blocker and continue with others
+- After all tasks in the batch are finished, aggregate the per-task status lines into a structured batch status.
+- If any task reported "Blocked", log the blocker with the failing task's ID and continue — the batch report will surface it.

-**Stuck detection** — while monitoring, watch for these signals per subagent:
- Same file modified 3+ times without test pass rate improving → flag as stuck, stop the subagent, report as Blocked
- Subagent has not produced new output for an extended period → flag as potentially hung
- If a subagent is flagged as stuck, do NOT let it continue looping — stop it and record the blocker in the batch report
+**Stuck detection** — while implementing a task, watch for these signals in your own progress:
+- The same file has been rewritten 3+ times without tests going green → stop, mark the task Blocked, and move to the next task in the batch (the user will be asked at the end of the batch).
+- You have tried 3+ distinct approaches without measurable progress (for example, fewer failing tests) → stop, mark Blocked, move on.
+- Do NOT loop indefinitely on a single task. Record the blocker and proceed.
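
The stuck signals can be tracked with a small counter. This is a sketch under assumptions: the `StuckDetector` class, its method name, and the reset on green tests are illustrative choices, and only the threshold of 3 comes from the text above.

```python
from collections import Counter


class StuckDetector:
    """Track rewrites and approaches for one task; flag it once a limit is hit."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self.rewrites = Counter()   # file path -> rewrites without green tests
        self.approaches = set()     # distinct approaches tried without success

    def record_attempt(self, file_path: str, approach: str, tests_green: bool) -> bool:
        """Record one attempt and return True when the task should be marked Blocked."""
        if tests_green:
            # Progress was made, so the counters start over.
            self.rewrites.clear()
            self.approaches.clear()
            return False
        self.rewrites[file_path] += 1
        self.approaches.add(approach)
        return (
            self.rewrites[file_path] >= self.limit
            or len(self.approaches) >= self.limit
        )
```

When `record_attempt` returns `True`, the task is marked Blocked and the blocker goes into the per-task status line.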

 ### 8. AC Test Coverage Verification

@@ -139,8 +158,8 @@ Before code review, verify that every acceptance criterion in each task spec has
   - **Not covered**: no test exists for this AC

 If any AC is **Not covered**:
- This is a **BLOCKING** failure — the implementer must write the missing test before proceeding
- Re-launch the implementer with the specific ACs that need tests
+- This is a **BLOCKING** failure — the missing test must be written before proceeding
+- Go back to the offending task, add tests for the specific ACs that lack coverage, then re-run this check
 - If the test cannot run in the current environment (GPU required, platform-specific, external service), the test must still exist and skip with `pytest.mark.skipif` or `pytest.skip()` explaining the prerequisite
 - A skipped test counts as **Covered** — the test exists and will run when the environment allows

@@ -189,12 +208,14 @@ Track `auto_fix_attempts` and `escalated_findings` in the batch report for retro

 ### 12. Update Tracker Status → In Testing

-After the batch is committed and pushed, transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step.
+After the batch is committed (and pushed if the user approved pushing), transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.

 ### 13. Archive Completed Tasks

 Move each completed task file from `TASKS_DIR/todo/` to `TASKS_DIR/done/`.

+For product implementation, this archive means "batch implementation accepted." The Product Implementation Completeness Gate can still require follow-up remediation tasks before the feature is complete; it does not move original task files back to `todo/`.
+
 ### 14. Loop

 - Go back to step 2 until all tasks in `todo/` are done
@@ -216,16 +237,74 @@ Move each completed task file from `TASKS_DIR/todo/` to `TASKS_DIR/done/`.
 - **Interaction with Auto-Fix Gate**: Architecture findings (new category from code-review Phase 7) always escalate per the implement auto-fix matrix; they cannot silently auto-fix
 - **Resumability**: if interrupted, the next invocation checks for the latest `cumulative_review_batches_*.md` and computes the changed-file set from batch reports produced after that review

-### 15. Final Test Run
+### 15. Product Implementation Completeness Gate

- After all batches are complete, run the full test suite once
- Read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices)
- Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision
- When tests pass, report final summary
+Run this gate after all **product implementation** tasks are complete and before writing any final product implementation report or allowing autodev to proceed to testability/test decomposition. Skip this gate only when the selected context is explicitly test implementation or refactoring, as determined by the task files and report filename rules.
+
+**Goal**: catch the failure mode where narrow tests validate scaffold behavior while the task's actual outcome, included scope, architecture promise, or named integration remains unimplemented.
+
+Inputs:
+
+- Completed product task specs from `_docs/02_tasks/done/` for the current cycle
+- `_docs/02_document/architecture.md`
+- `_docs/02_document/system-flows.md`
+- Relevant `_docs/02_document/components/*/description.md` files
+- Current source code under each completed task's ownership envelope
+- Batch reports and code-review reports for the current cycle
+
+For each completed product task:
+
+1. Read these sections from the task spec: `Description`, `Outcome`, `Scope / Included`, `Acceptance Criteria`, `Non-Functional Requirements`, `Constraints`, and explicit named technologies or integrations.
+2. Compare those promises against actual source code, not only tests or report prose.
+3. Search the task's owned component files for unresolved implementation markers: `placeholder`, `stub`, `reserved`, `TODO`, `NotImplemented`, `pass`, `deterministic`, `fake`, `mock`, `scaffold`, `native bridge`, and empty native/readme-only integration directories. Ignore test fixtures/mocks only when they are under test-owned paths and not used as production behavior.
+4. Verify that each named runtime dependency in the task promise is integrated as production behavior, not merely represented by an interface. Examples: if a task promises FAISS, DINOv2, BASALT, LightGlue, OpenCV, RANSAC, a database, cloud service, or hardware SDK, the production code must either call that dependency or contain an adapter that loads and executes the real dependency package. A deterministic fallback, fake runner, empty `native/` package, or "bridge to be supplied later" is **FAIL** unless the task itself explicitly scoped the dependency out before implementation started.
+5. Distinguish internal implementation from external prerequisites:
+   - Internal product capabilities (VIO, anchor verification, cache retrieval, safety wrapper, FDR, MAVLink emission) must be implemented in production code before the task can pass.
+   - External systems/hardware/data (Jetson device, physical camera, ArduPilot process, QGC, third-party service credentials, unavailable licensed dataset) may be `BLOCKED` only when production code exists and the missing prerequisite is outside the product boundary.
+6. Verify tests exercise the real implementation path where local prerequisites exist. Environment-gated tests may skip only with an explicit prerequisite reason; they do not make missing production code complete.
+7. For any architecture promise that describes an end-to-end user outcome, verify there is an executable production pipeline connecting the relevant components. Isolated component contracts and test-only harness orchestration are not enough.
+8. Classify each task:
+   - **PASS**: task promises are implemented or explicitly out of scope in the task itself.
+   - **BLOCKED**: production code exists but cannot be fully verified due to external hardware/data/license/runtime prerequisites; the blocker is explicit and tests report blocked/skipped with reason.
+   - **FAIL**: promised production behavior is missing, only scaffolded, or only represented in tests/reports.
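
The marker search in item 3 could be sketched as a line scanner. The marker list is taken from the text above. The function name, the case-insensitive matching, and the whole-word handling of `pass` are assumptions. Hits are only candidates for review, since a word like `deterministic` can appear in legitimate code.

```python
import re

MARKERS = [
    "placeholder", "stub", "reserved", "todo", "notimplemented",
    "deterministic", "fake", "mock", "scaffold", "native bridge",
]
# `pass` is matched as a whole word only, so `password` and `bypass` are ignored.
PATTERN = re.compile(
    r"\b(?:pass\b|" + "|".join(re.escape(m) for m in MARKERS) + ")",
    re.IGNORECASE,
)


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line_number, marker) pairs for every marker hit in source."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((line_no, match.group(0).lower()))
    return hits
```

Each hit still has to be read in context before a task is classified, and hits under test-owned paths are ignored as described above.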
+
+Save the audit to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` with:
+
+- Per-task classification
+- Evidence files/symbols checked
+- Any unresolved scaffold/native placeholders
+- Any named promised technologies not integrated
+- Required remediation task suggestions, each sized to 5 points or less
+
+Gate:
+
+- If every product task is `PASS` or `BLOCKED` with explicit prerequisite evidence, continue to Final Test Run.
+- If any product task is `FAIL`, STOP. Do not write the final product implementation report and do not proceed to any downstream autodev step. Completed original task files remain in `done/`; the missing work is represented by remediation tasks. Present a Choose block:
+  - A) Create remediation tasks now and return to implementation
+  - B) Mark the missing behavior explicitly out of scope in task/docs, then re-run this gate
+  - C) Abort for manual correction
+- Recommend A unless the user has deliberately accepted reduced scope.
+
+Remediation task creation:
+
+1. For each `FAIL`, create one or more task specs using `.cursor/skills/decompose/templates/task.md`; each remediation task must be sized at 5 points or less.
+2. Save each task to `_docs/02_tasks/todo/` with a short name prefixed by `remediate_`.
+3. Set **Component** to the failed task's component and set **Dependencies** to the failed task ID plus any remediation prerequisites.
+4. Create or defer tracker tickets using the same tracker rules as decompose/new-task: if a tracker is available, create tickets immediately; if the user explicitly chose `tracker: local`, keep numeric prefixes with `Tracker: pending` / `Epic: pending`.
+5. Append the remediation tasks to `_docs/02_tasks/_dependencies_table.md`.
+6. Return to Step 1 (Parse) in **Product implementation** context. The final product implementation report can be written only after remediation tasks complete and this gate reruns without `FAIL`.
+
+### 16. Final Test Run
+
+- After all batches are complete, run the full test suite once unless the invoking flow's immediate next step is `Run Tests`.
+- If the next flow step is `Run Tests`, record a handoff in the final implementation report and let `.cursor/skills/test-run/SKILL.md` own the full-suite gate to avoid duplicate full runs.
+- When this step does run, read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices).
+- Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision.
+- When tests pass, report final summary.

 ## Batch Report Persistence

-After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. When all tasks are complete, produce a FINAL implementation report with a summary of all batches. The filename depends on context:
+After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. For product implementation, produce the FINAL implementation report only after the Product Implementation Completeness Gate passes. For test and refactor implementation, produce the FINAL report after all selected tasks complete and the full-suite gate is either run or handed off per Step 16. The filename depends on context:

 - **Test implementation** (tasks from test decomposition): `_docs/03_implementation/implementation_report_tests.md`
 - **Feature implementation**: `_docs/03_implementation/implementation_report_{feature_slug}_cycle{N}.md` where `{feature_slug}` is derived from the batch task names (e.g., `implementation_report_core_api_cycle2.md`) and `{N}` is the current `state.cycle` from `_docs/_autodev_state.md`. If `state.cycle` is absent (pre-migration), default to `cycle1`.
@@ -264,9 +343,10 @@ After each batch, produce a structured report:

 | Situation | Action |
 |-----------|--------|
-| Implementer fails same approach 3+ times | Stop it, escalate to user |
+| Same task rewritten 3+ times without green tests | Mark Blocked, continue batch, escalate at batch end |
 | Task blocked on external dependency (not in task list) | Report and skip |
-| File ownership conflict unresolvable | ASK user |
+| File ownership violated (task wrote outside OWNED) | ASK user |
+| Product completeness gate finds missing promised implementation | STOP — create remediation tasks or get explicit user scope reduction |
 | Test failure after final test run | Delegate to test-run skill — blocking gate |
 | All tasks complete | Report final summary, suggest final commit |
 | `_dependencies_table.md` missing | STOP — run `/decompose` first |
@@ -281,7 +361,8 @@ Each batch commit serves as a rollback checkpoint. If recovery is needed:

 ## Safety Rules

- Never launch tasks whose dependencies are not yet completed
- Never allow two parallel agents to write to the same file
- If a subagent fails or is flagged as stuck, stop it and report — do not let it loop indefinitely
- Always run the full test suite after all batches complete (step 15)
+- Never start a task whose dependencies are not yet completed
+- Never run tasks in parallel and never spawn subagents — see `.cursor/rules/no-subagents.mdc`
+- If a task is flagged as stuck, stop working on it and report — do not let it loop indefinitely
+- Always run the Product Implementation Completeness Gate before final product reports
+- Always run or hand off the full test suite after all batches complete (step 16)
@@ -3,29 +3,31 @@
 ## Topological Sort with Batch Grouping

 The `/implement` skill uses a topological sort to determine execution order,
-then groups tasks into batches for parallel execution.
+then groups tasks into batches for code review and commit. Execution within a
+batch is **sequential** — see `.cursor/rules/no-subagents.mdc`.

 ## Algorithm

 1. Build adjacency list from `_dependencies_table.md`
 2. Compute in-degree for each task node
-3. Initialize batch 0 with all nodes that have in-degree 0
+3. Initialize the ready set with all nodes that have in-degree 0
 4. For each batch:
-   a. Select up to 4 tasks from the ready set
-   b. Check file ownership — if two tasks would write the same file, defer one to the next batch
-   c. Launch selected tasks as parallel implementer subagents
-   d. When all complete, remove them from the graph and decrement in-degrees of dependents
-   e. Add newly zero-in-degree nodes to the next batch's ready set
+   a. Select up to 4 tasks from the ready set (default batch size cap)
+   b. Implement the selected tasks one at a time in topological order
+   c. When all tasks in the batch complete, remove them from the graph and
+      decrement in-degrees of dependents
+   d. Add newly zero-in-degree nodes to the ready set
 5. Repeat until the graph is empty

-## File Ownership Conflict Resolution
+## Ordering Inside a Batch

-When two tasks in the same batch map to overlapping files:
- Prefer to run the lower-numbered task first (it's more foundational)
- Defer the higher-numbered task to the next batch
- If both have equal priority, ask the user
+Tasks inside a batch are executed in topological order — a task is only
+started after every task it depends on (inside the batch or in a previous
+batch) is done. When two tasks have the same topological rank, prefer the
+lower-numbered (more foundational) task first.

 ## Complexity Budget

 Each batch should not exceed 20 total complexity points.
 If it does, split the batch and let the user choose which tasks to include.
+The budget exists to keep the per-batch code review scope reviewable.
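
The batching procedure described in this reference can be sketched with Kahn's algorithm. This is an illustration under stated assumptions: task IDs sort so that lower-numbered tasks come first (for example `T01`, `T02`), every dependency is itself a key in `deps`, and a task that would push the batch over budget is deferred automatically, whereas the skill itself asks the user how to split. The function name is hypothetical.

```python
def compute_batches(deps, points, max_tasks=4, max_points=20):
    """Group tasks into review batches; tasks in a batch run one at a time."""
    in_degree = {task: len(needs) for task, needs in deps.items()}
    dependents = {task: [] for task in deps}
    for task, needs in deps.items():
        for dep in needs:
            dependents[dep].append(task)

    # The ready set starts with every task that has no unmet dependency.
    ready = sorted(task for task, degree in in_degree.items() if degree == 0)
    batches = []
    while ready:
        batch, total = [], 0
        for task in ready:
            if len(batch) == max_tasks:
                break
            # Always admit the first task so an oversized task cannot stall the loop.
            if batch and total + points[task] > max_points:
                continue
            batch.append(task)
            total += points[task]
        ready = [task for task in ready if task not in batch]
        # Completing a batch releases the tasks that depended on it.
        for task in batch:
            for child in dependents[task]:
                in_degree[child] -= 1
                if in_degree[child] == 0:
                    ready.append(child)
        ready.sort()
        batches.append(batch)
    if sum(len(batch) for batch in batches) != len(deps):
        raise ValueError("dependency cycle detected")
    return batches
```

Each inner list is one batch in execution order. Sorting the ready set applies the lower-numbered-first tie-break from "Ordering Inside a Batch".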
@@ -129,7 +129,8 @@ If `_docs/_repo-config.yaml` already exists:
   - Entries removed (component removed from registry)
 4. **Ask the user** whether to apply the diff.
 5. If applied, **preserve `confirmed: true` flags** for entries that still match — don't reset human-approved mappings.
-6. If user declines, stop — leave config untouched.
+6. **Preserve user-owned top-level keys verbatim**: `glossary_doc:` (written by autodev meta-repo Step 2.5) and any `assumptions_log:` entries are NEVER edited or removed by this skill. Carry them through unchanged. If the file referenced by `glossary_doc:` no longer exists on disk, surface as an `unresolved:` question — do not auto-clear the field.
+7. If user declines, stop — leave config untouched.

 ### Phase 8: Batch question checkpoint (M4)

@@ -15,6 +15,8 @@ Propagates component changes into the unified documentation set. Strictly scoped
 | Root `README.md` **only** if `_repo-config.yaml` lists it as a doc target (e.g., services table) | Install scripts (`ci-*.sh`) → use `monorepo-cicd` |
 | Docs index (`_docs/README.md` or similar) cross-reference tables | Component-internal docs (`<component>/README.md`, `<component>/docs/*`) |
 | Cross-cutting docs listed in `docs.cross_cutting` | `_docs/_repo-config.yaml` itself (only `monorepo-discover` and `monorepo-onboard` write it) |
+| Body of cross-cutting docs **except** the `## Architecture Vision` section (preserved verbatim — owned by autodev meta-repo Step 2.5) | The file at `glossary_doc:` (user-confirmed; only autodev meta-repo Step 2.5 rewrites it). New project terms surfaced during sync are reported back to the user, not silently appended |
+| `## Architecture Vision` body — read-only, may be referenced for terminology consistency but never edited | — |

 If a component change requires CI/env updates too, tell the user to also run `monorepo-cicd`. This skill does NOT cross domains.

@@ -166,6 +168,8 @@ Append to `_docs/_repo-config.yaml` under `assumptions_log:`:
 - Change `confirmed_by_user` or any `confirmed: <bool>` flag
 - Auto-commit or push
 - Guess a mapping not in the config
+- Edit the glossary file (the file recorded under the config's `glossary_doc:` key)
+- Edit the `## Architecture Vision` section of any cross-cutting doc; if a sync would conflict with that section, surface the conflict to the user and skip — do not silently rewrite user-confirmed content

 ## Edge cases

@@ -0,0 +1,152 @@
+---
+name: monorepo-e2e
+description: Syncs the suite-level integration e2e harness (`e2e/docker-compose.suite-e2e.yml`, fixtures, Playwright runner) when component contracts drift in ways that affect the cross-service scenario. Reads `_docs/_repo-config.yaml` to know which suite-e2e artifacts are in play. Touches ONLY suite-e2e files — never per-component CI, docs, or component internals. Use when a component changes a port, env var, public API endpoint, DB schema column, or detection model that the suite e2e exercises.
+---
+
+# Monorepo Suite-E2E
+
+Propagates component changes into the suite-level integration e2e harness. Strictly scoped — never edits docs, component internals, per-component CI configs, or the production deploy compose.
+
+## Scope — explicit
+
+| In scope | Out of scope |
+| -------- | ------------ |
+| `e2e/docker-compose.suite-e2e.yml` (overlay, healthchecks, seed services) | Production `_infra/deploy/<target>/docker-compose.yml` — `monorepo-cicd` owns it |
+| `e2e/fixtures/init.sql` (seeded rows that the spec depends on) | Component DB migrations — owned by each component |
+| `e2e/fixtures/expected_detections.json` (detection baseline) | Detection model itself — owned by `detections/` |
+| `e2e/runner/tests/*.spec.ts` selector / contract-driven edits | New scenarios (user-driven, not drift-driven) |
+| `e2e/runner/Dockerfile` / `package.json` Playwright version bumps | Net-new e2e infrastructure (use `monorepo-onboard` or initial scaffolding) |
+| `.woodpecker/suite-e2e.yml` (suite-level pipeline) | Per-component `.woodpecker/01-test.yml` / `02-build-push.yml` — `monorepo-cicd` owns those |
+| Suite-e2e leftover entries under `_docs/_process_leftovers/` | Per-component leftovers — owned by each component |
+
+If a component change needs doc updates too, tell the user to also run `monorepo-document`. If it needs production-deploy or per-component CI updates, run `monorepo-cicd`. This skill **only** updates the suite-e2e surface.
+
+## Preconditions (hard gates)
+
+1. `_docs/_repo-config.yaml` exists.
+2. Top-level `confirmed_by_user: true`.
+3. `suite_e2e.*` section is populated in config (see "Required config block" below). If absent, abort and ask the user to extend the config via `monorepo-discover`.
+4. Components-in-scope have confirmed contract mappings (port, public API path, DB tables touched), OR user explicitly approves inferred ones.
+
+## Required config block
+
+This skill expects `_docs/_repo-config.yaml` to carry:
+
+```yaml
+suite_e2e:
+  overlay: e2e/docker-compose.suite-e2e.yml
+  fixtures:
+    init_sql: e2e/fixtures/init.sql
+    baseline_json: e2e/fixtures/expected_detections.json
+    binary_fixtures:
+      - e2e/fixtures/sample.mp4
+      - e2e/fixtures/model.tar.gz
+  runner:
+    dockerfile: e2e/runner/Dockerfile
+    package_json: e2e/runner/package.json
+    spec_dir: e2e/runner/tests
+  pipeline: .woodpecker/suite-e2e.yml
+  scenario:
+    description: "Upload video → detect → overlays → dataset → DB persistence"
+    components_exercised:
+      - ui
+      - annotations
+      - detections
+      - postgres-local
+    api_contracts:
+      - component: ui
+        path: /api/admin/auth/login
+      - component: annotations
+        path: /api/annotations/media/batch
+      - component: annotations
+        path: /api/annotations/media/{id}/annotations
+    db_tables:
+      - media
+      - annotations
+      - detection
+      - detection_classes
+    model_pin:
+      detections_repo_path: <path-to-model-config-or-classes-source>
+      classes_source: annotations/src/Database/DatabaseMigrator.cs
+```
+
+If `suite_e2e:` is missing the skill **stops** — it does not invent a default mapping.
+
+## Mitigations (M1–M7)
+
+- **M1** Separation: this skill only touches suite-e2e files; no production deploy compose, no per-component CI, no docs, no component internals.
+- **M3** Factual vs. interpretive: port, env var, API path, DB column — FACTUAL, read from the components' code. Whether a baseline still matches the model — DEFERRED to the user (the skill flags drift, never silently re-records).
+- **M4** Batch questions at checkpoints.
+- **M5** Skip over guess: a component change that doesn't map cleanly to one of the in-scope artifacts → skip and report.
+- **M6** Assumptions footer + append to `_repo-config.yaml` `assumptions_log`.
+- **M7** Drift detection: verify every path under `suite_e2e.*` exists on disk; stop if not.
+
+## Workflow
+
+### Phase 1: Drift check (M7)
+
+Verify every file listed under `suite_e2e.*` (excluding `binary_fixtures`, which are gitignored) exists on disk. Missing file → stop and ask:
+- Run `monorepo-discover` to refresh, OR
+- Skip the missing artifact (recorded in report)
+
+For `binary_fixtures` paths that are absent (expected — they live in S3/LFS), check whether `expected_detections.json._meta.video_sha256` is still a `TBD-...` placeholder. If yes, surface this as a known leftover (`_docs/_process_leftovers/2026-04-22_suite-e2e-binary-fixtures.md`) and continue.
+
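The Phase 1 drift check can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: it assumes the `suite_e2e:` block has already been parsed into a plain dict, and the names `PATH_SECTIONS`, `iter_paths`, and `find_missing` are hypothetical. The `scenario` block is deliberately left out because it mixes paths with API routes and table names.

```python
from pathlib import Path

# Assumption: only these top-level keys of suite_e2e hold file paths.
PATH_SECTIONS = ("overlay", "fixtures", "runner", "pipeline")
# binary_fixtures are gitignored, so their absence is expected.
SKIPPED_KEYS = {"binary_fixtures"}


def iter_paths(node):
    """Yield every string leaf under node, skipping the excluded keys."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for key, value in node.items():
            if key not in SKIPPED_KEYS:
                yield from iter_paths(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_paths(item)


def find_missing(suite_e2e, repo_root):
    """Return the sorted list of configured paths that do not exist on disk."""
    root = Path(repo_root)
    missing = []
    for section in PATH_SECTIONS:
        if section in suite_e2e:
            for rel in iter_paths(suite_e2e[section]):
                if not (root / rel).exists():
                    missing.append(rel)
    return sorted(missing)
```

A non-empty result corresponds to the "stop and ask" branch above; an empty list means the check passes.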
+### Phase 2: Determine scope
+
+Same as `monorepo-cicd` Phase 2 — ask the user, or auto-detect. For **auto-detect**, flag commits that touch suite-e2e-relevant concerns:
+
+| Commit pattern | Suite-e2e impact |
+| -------------- | ---------------- |
+| New port exposed by `<component>` | Healthcheck override may change in `e2e/docker-compose.suite-e2e.yml` |
+| New required env var on `<component>` | `e2e/docker-compose.suite-e2e.yml` `e2e-runner` env block + `init.sql` seed |
+| Public API path renamed / removed | Spec selector / API call path in `e2e/runner/tests/*.spec.ts` |
+| DB schema column renamed in a `db_tables` entry | `init.sql` column reference + spec `pg.query` text |
+| New required DB table referenced by spec | `init.sql` insert block (skip if owned by component migration) |
+| Detection model rev change in `detections/` | `expected_detections.json` `_meta.model.revision` + flag baseline as stale |
+| New canonical detection class added | `expected_detections.json._meta` annotation |
+
+Present the flagged list; confirm.
+
+### Phase 3: Classify changes per component
+
+| Change type | Target suite-e2e files |
+| ----------- | ---------------------- |
+| Port / env var change | `e2e/docker-compose.suite-e2e.yml` |
+| API path / contract change | `e2e/runner/tests/*.spec.ts` |
+| DB schema reference change | `e2e/fixtures/init.sql` and spec SQL queries |
+| Model / class catalog change | `e2e/fixtures/expected_detections.json` (mark `_meta.fixture_version` bump + leftover entry for binary refresh) |
+| Playwright dependency drift | `e2e/runner/package.json` + `e2e/runner/Dockerfile` |
+| Suite scenario steps gone stale | **Stop and ask** — scenario edits are user-driven, not drift-driven |
+
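The Phase 3 table amounts to a lookup from change type to target files. The sketch below is illustrative only: the change-type keys (`port_or_env`, `api_contract`, and so on) are hypothetical labels, not identifiers the skill defines, and the unmapped case follows M5 (skip over guess).

```python
# Hypothetical change-type labels mapped to the files named in the table above.
TARGETS = {
    "port_or_env": ["e2e/docker-compose.suite-e2e.yml"],
    "api_contract": ["e2e/runner/tests/*.spec.ts"],
    "db_schema": ["e2e/fixtures/init.sql", "e2e/runner/tests/*.spec.ts"],
    "model_or_classes": ["e2e/fixtures/expected_detections.json"],
    "playwright_dependency": ["e2e/runner/package.json", "e2e/runner/Dockerfile"],
}


def classify(change_type):
    """Return (action, target_files) for one detected change."""
    if change_type == "scenario_stale":
        # Scenario edits are user-driven: stop and ask instead of editing.
        return ("ask_user", [])
    if change_type in TARGETS:
        return ("edit", TARGETS[change_type])
    # M5: a change that does not map cleanly is skipped and reported.
    return ("skip", [])
```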
+### Phase 4: Apply edits
+
+Edit each in-scope file. After each batch, run `ReadLints` on touched files. Do NOT run the suite e2e itself — that's a downstream pipeline operation, not a sync-skill responsibility.
+
+For `expected_detections.json`: when the model revision changes, the skill **does not** re-record the baseline — the binary fixture cannot be regenerated from the dev environment. Instead:
+1. Set `_meta.model.revision` to the new revision.
+2. Set `_meta.fixture_version` to a new bumped version with a `-stale` suffix (e.g., `0.2.0-stale`).
+3. Append a new entry to `_docs/_process_leftovers/` describing the required re-record.
+4. Leave `expected.by_class` untouched — the spec's tolerance check will fail loudly until the binary refresh lands.
+
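The metadata-only update for `expected_detections.json` can be sketched as below. The field names `_meta.model.revision`, `_meta.fixture_version`, and `expected.by_class` come from the steps above; the function name `mark_baseline_stale` and the choice of a minor-version bump are assumptions made for illustration, since the skill only says "bumped".

```python
import copy


def mark_baseline_stale(baseline, new_revision):
    """Return a copy of the baseline with metadata bumped and by_class untouched."""
    updated = copy.deepcopy(baseline)
    meta = updated["_meta"]
    meta["model"]["revision"] = new_revision

    # Assumption: fixture_version is MAJOR.MINOR.PATCH with an optional suffix,
    # and "bumped" means a minor bump.
    core = meta["fixture_version"].split("-")[0]
    major, minor, _patch = (int(part) for part in core.split("."))
    meta["fixture_version"] = f"{major}.{minor + 1}.0-stale"

    # expected.by_class is intentionally not modified: the spec's tolerance
    # check should keep failing until the binary fixture is re-recorded.
    return updated
```

Step 3 (the leftover entry under `_docs/_process_leftovers/`) is a separate file write and is not covered by this sketch.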
+### Phase 5: Update assumptions log
+
+Append a new `assumptions_log:` entry to `_docs/_repo-config.yaml` recording:
+- Date, components in scope, which suite-e2e files were touched
+- Any inferred contract mappings still tagged `confirmed: false`
+- Any leftover entries created
+
+### Phase 6: Report
+
+Render a Choose-format summary of the synced files, surface any `_process_leftovers/` entries created, and end. Do NOT auto-commit.
+
+## Self-verification
+
+- [ ] No file outside `e2e/`, `.woodpecker/suite-e2e.yml`, or `_docs/_process_leftovers/` was edited, apart from the Phase 5 `assumptions_log` append to `_docs/_repo-config.yaml`
+- [ ] `_docs/_repo-config.yaml` `suite_e2e:` block was not silently mutated except for `assumptions_log` append
+- [ ] `expected_detections.json` was not re-recorded (only metadata bumped + leftover added)
+- [ ] Every spec edit traces to a flagged commit pattern in Phase 2
+- [ ] `ReadLints` clean on every touched file
+
+## Failure handling
+
+Same retry / escalation protocol as `monorepo-cicd` — see `protocols.md`. The most common failure mode is the binary-fixture leftover (sample.mp4 missing or SHA-mismatched); this skill does not attempt to resolve it, only surfaces it.
@@ -59,6 +59,8 @@ Mark each as `complete` / `partial` / `missing` and explain.
 - Every component in `components:` appears in the registry — flag mismatches
 - Every `docs.root` file cross-referenced in config exists on disk — flag missing
 - Every `ci.orchestration_files` and `ci.install_scripts` exists — flag missing
+- `glossary_doc:` (if recorded in config) points to a file that exists on disk — flag missing
+- The cross-cutting architecture doc identified by `docs.cross_cutting` contains a `## Architecture Vision` section — flag missing (signals the meta-repo flow's Step 2.5 was skipped or the section was removed)

 ### Section 5: Unresolved questions

@@ -113,6 +115,8 @@ In registry, not in config:    [list or "(none)"]
 In config, not in registry:    [list or "(none)"]
 Config-referenced docs missing: [list or "(none)"]
 Config-referenced CI files missing: [list or "(none)"]
+glossary_doc:                  [path or "not recorded — run /autodev to capture"]
+Architecture Vision section:   [present | missing in <doc>]

 ═══════════════════════════════════════════════════
 Unresolved questions
@@ -75,7 +75,7 @@ Record the description verbatim for use in subsequent steps.
 **Role**: Technical analyst
 **Goal**: Determine whether deep research is needed.

-Read the user's description and the existing codebase documentation from DOCUMENT_DIR (architecture.md, components/, system-flows.md).
+Read the user's description and the existing codebase documentation from DOCUMENT_DIR (architecture.md including its `## Architecture Vision` section, glossary.md, components/, system-flows.md). Use `glossary.md` to keep the new task's name, acceptance-criteria wording, and component references aligned with the user's confirmed vocabulary. If the request appears to violate an Architecture Vision principle, flag the task to the user; do not silently allow it.

 **Consult LESSONS.md**: if `_docs/LESSONS.md` exists, read it and look for entries in categories `estimation`, `architecture`, `dependencies` that might apply to the task under consideration. If a relevant lesson exists (e.g., "estimation: auth-related changes historically take 2x estimate"), bias the classification and recommendation accordingly. Note in the output which lessons (if any) were applied.

@@ -134,7 +134,8 @@ The `<task_slug>` is a short kebab-case name derived from the feature descriptio
 **Goal**: Determine where and how to insert the new functionality, and whether existing tests cover the new requirements.

 1. Read the codebase documentation from DOCUMENT_DIR:
-   - `architecture.md` — overall structure
+   - `architecture.md` — overall structure (the `## Architecture Vision` H2 is user-confirmed intent and must not be violated by the new task without explicit approval)
+   - `glossary.md` — project terminology; reuse the user's vocabulary in task names, AC, and component references
   - `components/` — component specs
   - `system-flows.md` — data flows (if exists)
   - `data_model.md` — data model (if exists)
@@ -281,7 +282,7 @@ Present using the Choose format for each decision that has meaningful alternativ
   - Update **Epic** field: `[EPIC-ID]`
 3. Rename the file from `[##]_[short_name].md` to `[TICKET-ID]_[short_name].md`

-If the work item tracker is not authenticated or unavailable (`tracker: local`):
+If the work item tracker is not authenticated or unavailable, follow `.cursor/rules/tracker.mdc` before continuing. Only if the user explicitly chooses `tracker: local`:
 - Keep the numeric prefix
 - Set **Tracker** to `pending`
 - Set **Epic** to `pending`
@@ -336,7 +337,7 @@ After the user chooses **Done**:
 | Research skill hits a blocker | Follow research skill's own escalation rules |
 | Codebase analysis reveals conflicting architectures | **ASK** user which pattern to follow |
 | Complexity exceeds 5 points | **WARN** user and suggest splitting into multiple tasks |
-| Work item tracker MCP unavailable | **WARN**, continue with local-only task files |
+| Work item tracker MCP unavailable | Follow `.cursor/rules/tracker.mdc`; do not continue in local mode unless the user explicitly chooses it |

 ## Trigger Conditions

@@ -69,7 +69,7 @@ Capture any new questions, findings, or insights that arise during test specific

 ### Step 2: Solution Analysis

-Read and follow `steps/02_solution-analysis.md`.
+Read and follow `steps/02_solution-analysis.md`. The step opens with **Phase 2a.0: Glossary & Architecture Vision** (BLOCKING) — drafts `_docs/02_document/glossary.md` and a one-paragraph architecture vision, presents the condensed view to the user, iterates until confirmed, then proceeds into the architecture, data-model, and deployment phases. The confirmed vision becomes the first `## Architecture Vision` H2 of `architecture.md`.

 ---

@@ -107,6 +107,7 @@ Read and follow `steps/07_quality-checklist.md`.
 - **Coding during planning**: this workflow produces documents, never code
 - **Multi-responsibility components**: if a component does two things, split it
 - **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
+- **Skipping the glossary/vision gate (Phase 2a.0)**: drafting `architecture.md` from raw `solution.md` without confirming terminology and vision means the AI's mental model is not aligned with the user's; every downstream artifact will inherit that drift
 - **Diagrams without data**: generate diagrams only after the underlying structure is documented
 - **Copy-pasting problem.md**: the architecture doc should analyze and transform, not repeat the input
 - **Vague interfaces**: "component A talks to component B" is not enough; define the method, input, output
@@ -137,8 +138,10 @@ Read and follow `steps/07_quality-checklist.md`.
 │                                                                │
 │ 1. Blackbox Tests      → test-spec/SKILL.md                     │
 │    [BLOCKING: user confirms test coverage]                     │
-│ 2. Solution Analysis   → architecture, data model, deployment   │
-│    [BLOCKING: user confirms architecture]                      │
+│ 2. Solution Analysis   → glossary + vision, architecture,       │
+│                          data model, deployment                 │
+│    [BLOCKING 2a.0: user confirms glossary + vision]            │
+│    [BLOCKING 2a:   user confirms architecture]                  │
 │ 3. Component Decomp    → component specs + interfaces           │
 │    [BLOCKING: user confirms components]                        │
 │ 4. Review & Risk       → risk register, iterations              │
@@ -4,20 +4,105 @@
 **Goal**: Produce `architecture.md`, `system-flows.md`, `data_model.md`, and `deployment/` from the solution draft
 **Constraints**: No code, no component-level detail yet; focus on system-level view

+### Phase 2a.0: Glossary & Architecture Vision (BLOCKING)
+
+**Role**: Software architect + business analyst
+**Goal**: Align the AI's mental model of the project with the user's intent BEFORE drafting `architecture.md`. Capture domain terminology and the user's high-level architecture vision so every downstream artifact (architecture, components, flows, tests, epics) is grounded in confirmed user intent — not in AI inference.
+
+**Inputs**:
+- `_docs/00_problem/problem.md`, `acceptance_criteria.md`, `restrictions.md`
+- `_docs/00_problem/input_data/*`
+- `_docs/01_solution/solution.md` (and any earlier `solution_draft*.md` siblings)
+- Any blackbox-test findings produced in Step 1
+
+**Outputs**:
+- `_docs/02_document/glossary.md` (NEW)
+- A confirmed "Architecture Vision" paragraph + bullet list held in working memory and used as the spine of Phase 2a's `architecture.md`
+
+**Procedure**:
+
+1. **Draft glossary** — extract project-specific terminology from inputs (NOT generic software terms). Include:
+   - Domain entities, processes, and roles
+   - Acronyms / abbreviations
+   - Internal codenames or product names
+   - Synonym pairs in active use (e.g., "flight" vs. "mission")
+   - Stakeholder personas referenced in problem.md
+   Each entry: one-line definition, plus a parenthetical source (`source: problem.md`, `source: solution.md §3`).
+   Skip terms that have a single well-known industry meaning (REST, JSON, etc.).
+
+2. **Draft architecture vision** — synthesize from inputs:
+   - **One paragraph**: what the system is, who uses it, the shape of the runtime topology (monolith / services / pipeline / library / hybrid).
+   - **Components & responsibilities** (one-line each). At this stage these are *intent-level*, not the formal decomposition that Step 3 produces.
+   - **Major data flows** (one or two sentences each).
+   - **Architectural principles / non-negotiables** the user has implied (e.g., "DB-driven config", "no per-component state outside Redis", "all UI traffic via REST + SSE only").
+   - **Open architectural questions** the AI cannot resolve from inputs alone.
+
+3. **Present condensed view** to the user (NOT the full draft files — a synopsis only):
+
+   ```
+   ══════════════════════════════════════
+    REVIEW: Glossary + Architecture Vision
+   ══════════════════════════════════════
+    Glossary (N terms drafted):
+      - <Term>: <one-line definition>
+      - ...
+    Architecture Vision:
+      <one-paragraph synopsis>
+
+      Components / responsibilities:
+        - <component>: <one-line>
+        - ...
+
+      Principles / non-negotiables:
+        - <principle>
+        - ...
+
+      Open questions (AI could not resolve):
+        - <q1>
+        - <q2>
+   ══════════════════════════════════════
+    A) Looks correct — write glossary.md, use vision for Phase 2a
+    B) I want to add / correct entries (provide diffs)
+    C) Answer the open questions first, then re-present
+   ══════════════════════════════════════
+    Recommendation: pick C if open questions exist, otherwise A
+   ══════════════════════════════════════
+   ```
+
+4. **Iterate**:
+   - On B → integrate the user's diffs/additions, re-present the condensed view, loop until A.
+   - On C → ask the listed open questions in one round (M4-style batch), integrate answers, re-present.
+   - **Do NOT proceed to step 5 until the user picks A.**
+
+5. **Save**:
+   - Write `_docs/02_document/glossary.md` with terms in alphabetical order. Include a top-line `**Status**: confirmed-by-user` and the date.
+   - Hold the confirmed vision (paragraph + components + principles) in working memory; Phase 2a will materialize it into `architecture.md` and **must** preserve every confirmed principle and component intent verbatim.
+
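The save step for `glossary.md` can be sketched as below. Only the alphabetical ordering, the `**Status**: confirmed-by-user` line, the date, and the per-entry source note come from the procedure; the heading, bullet layout, and the function name `render_glossary` are assumptions for illustration.

```python
from datetime import date


def render_glossary(entries, confirmed_on):
    """Render confirmed terms as Markdown, sorted case-insensitively.

    entries maps term -> (one-line definition, source), for example
    {"flight": ("...", "problem.md")}.
    """
    lines = [
        "# Glossary",
        "",
        f"**Status**: confirmed-by-user ({confirmed_on.isoformat()})",
        "",
    ]
    for term in sorted(entries, key=str.casefold):
        definition, source = entries[term]
        lines.append(f"- **{term}**: {definition} (source: {source})")
    return "\n".join(lines) + "\n"
```

Sorting with `str.casefold` keeps capitalized and lowercase terms interleaved rather than grouping all capitalized terms first.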
+**Self-verification**:
+- [ ] Every glossary entry traces to at least one input file (no invented terms)
+- [ ] Every component listed in the vision is one the inputs reference
+- [ ] All open questions are either answered or explicitly deferred (with the user's acknowledgement)
+- [ ] User picked option A on the latest condensed view
+
+**BLOCKING**: Do NOT proceed to Phase 2a until `glossary.md` is saved and the user has confirmed the architecture vision.
+
 ### Phase 2a: Architecture & Flows

 1. Read all input files thoroughly
 2. Incorporate findings, questions, and insights discovered during Step 1 (blackbox tests)
-3. Research unknown or questionable topics via internet; ask user about ambiguities
-4. Document architecture using `templates/architecture.md` as structure
-5. Document system flows using `templates/system-flows.md` as structure
+3. **Apply confirmed vision from Phase 2a.0**: the architecture document must include a top-level `## Architecture Vision` section that contains the user-confirmed paragraph, components, and principles verbatim. The rest of `architecture.md` (tech stack, deployment model, NFRs, ADRs) builds on top of that section, never contradicts it
+4. Research unknown or questionable topics via internet; ask user about ambiguities
+5. Document architecture using `templates/architecture.md` as structure
+6. Document system flows using `templates/system-flows.md` as structure

 **Self-verification**:
+- [ ] `architecture.md` opens with a `## Architecture Vision` section matching Phase 2a.0
 - [ ] Architecture covers all capabilities mentioned in solution.md
 - [ ] System flows cover all main user/system interactions
- [ ] No contradictions with problem.md or restrictions.md
+- [ ] No contradictions with problem.md, restrictions.md, or the confirmed vision
 - [ ] Technology choices are justified
 - [ ] Blackbox test findings are reflected in architecture decisions
+- [ ] Every term used in `architecture.md` that is project-specific appears in `glossary.md`

 **Save action**: Write `architecture.md` and `system-flows.md`

@@ -58,4 +58,4 @@ Do NOT create minimal epics with just a summary and short description. The epic

 8. **Create "Blackbox Tests" epic** — this epic will parent the blackbox test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `tests/`.

-**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If `tracker: local`, save locally only.
+**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If tracker availability fails, follow `.cursor/rules/tracker.mdc`; only if the user explicitly chooses `tracker: local`, save locally only with pending tracker markers.
@@ -133,4 +133,4 @@ Link to architecture.md and relevant component spec.]
  - `component` — a normal per-component epic
  - `cross-cutting` — a shared concern that spans ≥2 components
  - `tests` — the blackbox-tests epic (always exactly one)
- Complexity points for child issues follow the project standard: 1, 2, 3, 5, 8. Do not create issues above 5 points — split them.
+- Complexity points for child issues follow the project standard: 1, 2, 3, 5. Do not create issues above 5 points — split them.
@@ -181,6 +181,8 @@ Categorized measurable criteria with markdown headers and bullet points:

 Every criterion must have a measurable value. Vague criteria like "should be fast" are not acceptable — push for "less than 400ms end-to-end".

+**AC must be design-independent**: describe testable outcomes only, with no libraries, algorithms, parameters, or design choices. Implementation follows AC, never the reverse. (IEEE 830 / Atlassian / GitScrum)
+
 ### input_data/

 At least one file. Options:
@@ -24,6 +24,8 @@ Phase details live in `phases/` — read the relevant file before executing each
 - **Save immediately**: write artifacts to disk after each phase
 - **Delegate execution**: all code changes go through the implement skill via task files
 - **Ask, don't assume**: when scope or priorities are unclear, STOP and ask the user
+- **Exact-fit recommendations**: do not recommend a replacement pattern, library, service, architecture, algorithm, or "modern approach" merely because it improves structure or solves a similar class of problem. It must fit confirmed product constraints, acceptance criteria, operating context, integration boundaries, and current code realities. Otherwise reject it, mark it experimental, or ask the user before adding it to the roadmap.
+- **Per-mode API capability verification on replacements**: when a refactor proposes replacing or adding a library/SDK/framework/service that exposes multiple modes or configurations, pin the exact mode the refactored code will use (inputs, outputs, runtime) and verify *that mode* via mandatory `context7` lookup plus a saved Minimum Viable Example before promoting the recommendation to `Selected`. Capability claims at the category level ("supports A, B, C modes") must be cross-checked against the literal mode enumeration — `A, B → A+B` style conflations are the recurring silent-failure path.

 ## Context Resolution

@@ -57,7 +59,7 @@ Create REFACTOR_DIR and RUN_DIR if missing. If a RUN_DIR with the same name alre

 Both modes produce `RUN_DIR/list-of-changes.md` (template: `templates/list-of-changes.md`). Both modes then convert that file into task files in TASKS_DIR during Phase 2.

-**Guided mode cleanup**: after `RUN_DIR/list-of-changes.md` is created from the input file, delete the original input file to avoid duplication.
+**Guided mode cleanup**: after `RUN_DIR/list-of-changes.md` is created from the input file, delete the original input file only if it lives outside `RUN_DIR`. If the provided file is already the canonical `RUN_DIR/list-of-changes.md`, keep it as the audit record.

 ## Workflow

@@ -79,10 +81,10 @@ Both modes produce `RUN_DIR/list-of-changes.md` (template: `templates/list-of-ch
 - "refactor [specific target]" → skip phase 1 if docs exist
 - Default → all phases

-**Testability-run specifics** (guided mode invoked by autodev existing-code flow Step 4):
+**Testability-run specifics** (guided mode invoked by autodev existing-code Step 4 or greenfield Step 8):
 - Run name is `01-testability-refactoring`.
 - Phase 3 (Safety Net) is skipped by design — no tests exist yet. Compensating control: the `list-of-changes.md` gate in Phase 1 must be reviewed and approved by the user before Phase 4 runs.
- Scope is MINIMAL and surgical; reject change entries that drift into full refactor territory (see existing-code flow Step 4 for allowed/disallowed lists). Flagged entries go to `RUN_DIR/deferred_to_refactor.md` for Step 8 (optional full refactor) consideration.
+- Scope is MINIMAL and surgical; reject change entries that drift into full refactor territory (see the invoking flow's testability step for allowed/disallowed lists). Flagged entries go to `RUN_DIR/deferred_to_refactor.md` for the next optional full-refactor step or backlog consideration.
 - After Phase 4 (Execution) completes, write `RUN_DIR/testability_changes_summary.md` as Phase 4.5. Format: one bullet per applied change.
  ```markdown
  # Testability Changes Summary ({{run_name}})
@@ -95,7 +95,7 @@ Also copy to project standard locations:

 **Critical step — do not skip.** Before producing the change list, cross-reference documented business flows against actual implementation. This catches issues that static code inspection alone misses.

-1. **Read documented flows**: Load `DOCUMENT_DIR/system-flows.md`, `DOCUMENT_DIR/architecture.md`, `DOCUMENT_DIR/module-layout.md`, every file under `DOCUMENT_DIR/contracts/`, and `SOLUTION_DIR/solution.md` (whichever exist). Extract every documented business flow, data path, architectural decision, module ownership boundary, and contract shape.
+1. **Read documented flows**: Load `DOCUMENT_DIR/system-flows.md`, `DOCUMENT_DIR/architecture.md` (paying special attention to its `## Architecture Vision` section — that's the user-confirmed structural intent), `DOCUMENT_DIR/glossary.md`, `DOCUMENT_DIR/module-layout.md`, every file under `DOCUMENT_DIR/contracts/`, and `SOLUTION_DIR/solution.md` (whichever exist). Extract every documented business flow, data path, architectural decision, module ownership boundary, and contract shape. Any refactor change that contradicts a confirmed Architecture Vision principle must either be rejected or surfaced to the user before being added to `list-of-changes.md` — those principles are not refactor targets without explicit user approval.

 2. **Trace each flow through code**: For every documented flow (e.g., "video batch processing", "image tiling", "engine initialization"), walk the actual code path line by line. At each decision point ask:
   - Does the code match the documented/intended behavior?
@@ -7,14 +7,29 @@
 ## 2a. Deep Research

 1. Analyze current implementation patterns
-2. Research modern approaches for similar systems
-3. Identify what could be done differently
-4. Suggest improvements based on state-of-the-art practices
+2. Extract the **Project Constraint Matrix** from `problem.md`, `restrictions.md`, `acceptance_criteria.md`, current architecture/docs, and actual code constraints. Include required inputs/outputs, operating context, lifecycle assumptions, integration boundaries, non-functional targets, and hard disqualifiers.
+3. Research modern approaches for similar systems
+4. For each alternative pattern/library/service/architecture/algorithm, research intrinsic implementation constraints: required inputs/outputs, runtime assumptions, supported deployment modes, resource needs, operational limits, licensing/security constraints, and known failure reports.
+
+   **API Capability Verification — Per-Mode (MANDATORY, BLOCKING for proposed replacements)**
+
+   When a refactor recommendation replaces (or adds) a library/SDK/framework/service, the same per-mode verification used by `/research` Step 2 applies — selecting a replacement on category fit alone is the same silent-failure path. For every replacement candidate that has multiple modes or configurations:
+
+   1. **Pin the exact mode/configuration** the refactored code will use, in one explicit sentence. Inputs (data shapes, sensor counts, payloads, rates), outputs (per `acceptance_criteria.md` and contract files), runtime (matching the project's deployment).
+   2. **Run `context7` (or equivalent docs lookup)** for the candidate. **Mandatory for every replacement library/SDK/framework candidate**, not optional. Minimum three queries per candidate: mode enumeration, project's exact mode (with input/output shapes), disqualifier probe ("does this mode produce the required output? are there published limitations on this runtime?"). Append URLs to `RUN_DIR/analysis/research_findings.md` references section.
+   3. **Save a Minimum Viable Example (MVE)** for the pinned mode under `RUN_DIR/analysis/mve_evidence.md` with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌. If no official example covers the project's exact configuration, the recommendation cannot be `Selected` based on category fit alone — it must be `Experimental only` (with required-evidence note) or `Rejected`.
+   4. **Treat "the same library in a different mode" as a different recommendation.** If the project's pinned mode is `<X>` but the only documented evidence covers `<Y>`, do not silently soften the description. Open a separate recommendation row, with its own MVE, fit assessment, and disqualifiers.
+   5. **Common silent-failure pattern**: a fact summary paraphrases docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes" — no `A+B` combination exists. Cross-check paraphrased capability claims against the literal mode enumeration.
+
+5. Identify what could be done differently
+6. Suggest improvements only when they fit the Project Constraint Matrix. A cleaner or more modern approach that violates product constraints must be marked `Rejected` or `Experimental only`, not added as a roadmap recommendation.

 Write `RUN_DIR/analysis/research_findings.md`:
 - Current state analysis: patterns used, strengths, weaknesses
 - Alternative approaches per component: current vs alternative, pros/cons, migration effort
 - Prioritized recommendations: quick wins + strategic improvements
+- Constraint-fit table: recommendation, **pinned mode/config**, constraints checked, **API capability evidence (MVE link)**, evidence, mismatches/disqualifiers, status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
+- For every recommendation that replaces or adds a library/SDK/framework, append a **Restrictions × Candidate-Mode sub-matrix** that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's pinned mode, marking each cell ✅ Pass / ❌ Fail / ❓ Verify / N/A with cited evidence. A recommendation cannot be `Selected` while any cell is ❌ or ❓.
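
The selection rule for the Restrictions × Candidate-Mode sub-matrix can be sketched as a small check. This is an illustration under stated assumptions: each sub-matrix row is modelled as a dict with hypothetical keys `line` and `mark`, and `has_matching_mve` stands for "a saved MVE covers the pinned mode".

```python
# Marks that block promotion to `Selected`.
BLOCKING_MARKS = {"❌", "❓"}


def blocking_cells(sub_matrix):
    """Return the restriction/AC line ids whose cell is a fail or unverified."""
    return [row["line"] for row in sub_matrix if row["mark"] in BLOCKING_MARKS]


def may_select(sub_matrix, has_matching_mve):
    """A candidate may be `Selected` only with a matching MVE and no blockers."""
    return has_matching_mve and not blocking_cells(sub_matrix)
```

When `may_select` returns `False`, the remaining statuses (`Rejected`, `Experimental only`, `Needs user decision`) are a judgment call that this sketch does not automate.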

 ## 2b. Solution Assessment & Hardening Tracks

@@ -22,6 +37,7 @@ Write `RUN_DIR/analysis/research_findings.md`:
 2. Identify weak points in codebase, map to specific code areas
 3. Perform gap analysis: acceptance criteria vs current state
 4. Prioritize changes by impact and effort
+5. Reject or escalate any proposed refactor that improves code structure while weakening required behavior, integration contracts, runtime constraints, safety/security posture, or acceptance criteria

 Present optional hardening tracks for user to include in the roadmap:

@@ -47,6 +63,9 @@ Write `RUN_DIR/analysis/refactoring_roadmap.md`:
 - Gap analysis: what's missing, what needs improvement
 - Phased roadmap: Phase 1 (critical fixes), Phase 2 (major improvements), Phase 3 (enhancements)
 - Selected hardening tracks and their items
+- Applicability gate: each roadmap item must state constraint fit, mismatches, required evidence, and status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
+
+**BLOCKING applicability gate**: Before 2c and 2d, every recommendation that proceeds to task creation must be `Selected`. Items marked `Rejected` are excluded. Items marked `Experimental only` or `Needs user decision` require a user decision before task creation.
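
One way to read the gate above is as a partition of roadmap items by status. The sketch below is illustrative only: the function names and the item names in the example are hypothetical, and only the four status strings come from the text.

```python
STATUSES = ("Selected", "Rejected", "Experimental only", "Needs user decision")


def apply_gate(roadmap: dict[str, str]) -> dict[str, list[str]]:
    """Split roadmap items (name -> status) by what the gate allows."""
    result = {"proceed": [], "excluded": [], "needs_user_decision": []}
    for item, status in roadmap.items():
        if status not in STATUSES:
            raise ValueError(f"unknown status for {item!r}: {status!r}")
        if status == "Selected":
            result["proceed"].append(item)
        elif status == "Rejected":
            result["excluded"].append(item)
        else:
            # `Experimental only` and `Needs user decision` both wait on the user.
            result["needs_user_decision"].append(item)
    return result


def gate_is_open(roadmap: dict[str, str]) -> bool:
    """2c and 2d may start only when nothing is waiting on a user decision."""
    return not apply_gate(roadmap)["needs_user_decision"]


# Hypothetical roadmap items.
roadmap = {
    "extract-config-loader": "Selected",
    "swap-http-client": "Rejected",
    "adopt-new-orm": "Experimental only",
}
```

Here `gate_is_open(roadmap)` is false until the user resolves `adopt-new-orm`. The rejected item stays documented but never reaches task creation.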

 ## 2c. Create Epic

@@ -55,7 +74,7 @@ Create a work item tracker epic for this refactoring run:
 1. Epic name: the RUN_DIR name (e.g., `01-testability-refactoring`)
 2. Create the epic via configured tracker MCP
 3. Record the Epic ID — all tasks in 2d will be linked under this epic
-4. If tracker unavailable, use `PENDING` placeholder and note for later
+4. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; only use `PENDING` placeholders if the user explicitly chooses `tracker: local`

 ## 2d. Task Decomposition

@@ -79,6 +98,12 @@ Convert the finalized `RUN_DIR/list-of-changes.md` into implementable task files
 **Self-verification**:
 - [ ] All acceptance criteria are addressed in gap analysis
 - [ ] Recommendations are grounded in actual code, not abstract
+- [ ] Every recommendation has been checked against the Project Constraint Matrix
+- [ ] No recommendation violates product restrictions, acceptance criteria, documented architecture decisions, or actual code integration boundaries
+- [ ] Every replacement library/SDK/framework recommendation has a pinned mode/config, a saved MVE in `mve_evidence.md`, and a Restrictions × Candidate-Mode sub-matrix with no ❌ or ❓ cells
+- [ ] `context7` (or equivalent) was consulted for every replacement library/SDK/framework recommendation
+- [ ] Paraphrased capability claims have been cross-checked against the literal mode-enumeration evidence (no `A, B → A+B` style conflation)
+- [ ] Rejected and experimental approaches are documented but not converted into implementation tasks without user approval
 - [ ] Roadmap phases are prioritized by impact
 - [ ] Epic created and all tasks linked to it
 - [ ] Every entry in list-of-changes.md has a corresponding task file in TASKS_DIR
@@ -10,7 +10,7 @@
   - All `[TRACKER-ID]_refactor_*.md` files are present
   - Each task file has valid header fields (Task, Name, Description, Complexity, Dependencies)
 2. Verify `TASKS_DIR/_dependencies_table.md` includes the refactoring tasks
-3. Verify all tests pass (safety net from Phase 3 is green)
+3. Verify all tests pass (safety net from Phase 3 is green), unless this is a testability run where Phase 3 was intentionally skipped
 4. If any check fails, go back to the relevant phase to fix

 ## 4b. Delegate to Implement Skill
@@ -21,9 +21,9 @@ The implement skill will:
 1. Parse task files and dependency graph from TASKS_DIR
 2. Detect already-completed tasks (skip non-refactoring tasks from prior workflow steps)
 3. Compute execution batches for the refactoring tasks
-4. Launch implementer subagents (up to 4 in parallel)
+4. Implement tasks sequentially in topological order (no subagents, no parallelism)
 5. Run code review after each batch
-6. Commit and push per batch
+6. Commit per batch and push only when the user approved pushing
 7. Update work item ticket status

 Do NOT modify, skip, or abbreviate any part of the implement skill's workflow. The refactor skill is delegating execution, not optimizing it.
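
The sequential, topological ordering named in step 4 above can be sketched with Python's standard `graphlib` module. This is not the implement skill's actual mechanism, which this diff does not show. The task IDs and the `execution_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return tasks in an order where every dependency precedes its dependents.

    `dependencies` maps a task id to the set of task ids it depends on,
    which is the {node: predecessors} shape TopologicalSorter expects.
    Raises CycleError if the dependency graph contains a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical task ids: T3 depends on T1 and T2, T2 depends on T1.
tasks = {"T3": {"T1", "T2"}, "T2": {"T1"}, "T1": set()}
order = execution_order(tasks)

for task_id in order:
    # One task at a time: no subagents, no parallelism.
    pass
```

A cyclic dependency table raises `CycleError` before any task would run, so the problem can be surfaced to the user rather than resolved silently.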
@@ -47,7 +47,7 @@ After the implement skill completes:
 For each successfully completed refactoring task:

 1. Transition the work item ticket status to **Done** via the configured tracker MCP
-2. If tracker unavailable, note the pending status transitions in `RUN_DIR/execution_log.md`
+2. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; if the user explicitly chose `tracker: local`, note the pending status transitions in `RUN_DIR/execution_log.md`

 For any failed or blocked tasks, leave their status as-is (the implement skill already set them to In Testing or blocked).

@@ -32,7 +32,7 @@ For each component doc affected:
 ## 7d. Update System-Level Documentation

 If structural changes were made (new modules, removed modules, changed interfaces):
-1. Update `_docs/02_document/architecture.md` if architecture changed
+1. Update `_docs/02_document/architecture.md` if architecture changed — but **never edit the `## Architecture Vision` section**. That section is user-confirmed (plan Phase 2a.0 / document Step 4.5); if a refactor invalidates a vision principle, surface it to the user and let them update the vision themselves before continuing. Update only the technical sections below the Vision H2.
 2. Update `_docs/02_document/system-flows.md` if flow sequences changed
 3. Update `_docs/02_document/diagrams/components.md` if component relationships changed

@@ -23,6 +23,7 @@ Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
 - **Problem**: [what makes this problematic / untestable / coupled]
 - **Change**: [what to do — behavioral description, not implementation steps]
 - **Rationale**: [why this change is needed]
+- **Constraint Fit**: [which product constraints / acceptance criteria / integration boundaries this preserves; or "Rejected — violates ..."]
 - **Risk**: [low | medium | high]
 - **Dependencies**: [other change IDs this depends on, or "None"]

@@ -31,6 +32,7 @@ Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
 - **Problem**: [description]
 - **Change**: [description]
 - **Rationale**: [description]
+- **Constraint Fit**: [description]
 - **Risk**: [low | medium | high]
 - **Dependencies**: [C01, or "None"]
 ```
@@ -44,6 +46,8 @@ Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
 - **File(s)** must reference actual files verified to exist in the codebase
 - **Problem** describes the current state, not the desired state
 - **Change** describes what the system should do differently — behavioral, not prescriptive
+- **Constraint Fit** proves the change preserves confirmed product requirements, restrictions, acceptance criteria, architecture decisions, and integration contracts
+- Do not include changes whose only benefit is structural cleanliness if they weaken required behavior or violate constraints; record those as rejected in analysis instead
 - **Dependencies** reference other change IDs within this list; cross-run dependencies use tracker IDs
 - In guided mode, the input file entries are validated against actual code and enriched with file paths, risk, and dependencies before writing
 - In automatic mode, entries are derived from Phase 1 component analysis and Phase 2 research findings
@@ -30,6 +30,27 @@ Transform vague topics raised by users into high-quality, deliverable research r
 - **Internet-first investigation** — do not rely on training data for factual claims; search the web extensively for every sub-question, rephrase queries when results are thin, and keep searching until you have converging evidence from multiple independent sources
 - **Multi-perspective analysis** — examine every problem from at least 3 different viewpoints (e.g., end-user, implementer, business decision-maker, contrarian, domain expert, field practitioner); each perspective should generate its own search queries
 - **Question multiplication** — for each sub-question, generate multiple reformulated search queries (synonyms, related terms, negations, "what can go wrong" variants, practitioner-focused variants) to maximize coverage and uncover blind spots
+- **Component option breadth** — for every component area, build a broad option landscape before selecting. Search direct candidates, adjacent-domain alternatives, commercial/open-source variants, classical/simple baselines, current SOTA, and "do not use" failure cases. A component may not be narrowed to one candidate until alternatives have been searched and rejected with evidence.
+- **Component research depth** — for every serious component candidate, go beyond discovery pages. Read official docs, repository/license files, issue discussions, benchmarks, deployment guides, version/platform requirements, security notes, maintenance signals, and real-world failure reports. Extract evidence for inputs/outputs, lifecycle assumptions, runtime/storage/latency fit, integration boundaries, licensing, operational risks, and unsupported scenarios before assigning any selection status.
+- **Exact-fit component selection** — never select a component, tool, library, service, architecture pattern, or algorithm merely because it solves a similar class of problem. It must be proven compatible with the project's explicit operating context, constraints, required inputs/outputs, non-functional requirements, lifecycle assumptions, and acceptance criteria. If fit is unproven or mismatched, mark it `Rejected`, `Experimental only`, or escalate for user decision before it can shape the solution.
+- **Per-mode API capability verification** *(applies only to technical-component selection — see Research Output Class below)* — when a candidate library/SDK/framework/service exposes multiple modes or configurations, *the candidate is not a single thing*. Pin the exact mode the project will use (one explicit sentence: inputs, outputs, runtime), and verify *that mode* against the project's required inputs/outputs via official docs (mandatory `context7` lookup) plus a saved Minimum Viable Example. Capability claims at the category level ("supports X, Y, Z modes") must be cross-checked against the literal mode enumeration before being treated as project-applicable. Two modes of one library are two distinct candidates for the purposes of the Component Applicability Gate. Does not apply to non-technical research (concept comparison, market/policy investigation, knowledge organization, etc.).
+
+## Research Output Class (BLOCKING — set in Step 1)
+
+Before applying any of the technical-component gates (per-mode API capability verification, Component Applicability Gate, Restrictions × Candidate-Mode sub-matrix, MVE evidence, mandatory `context7` lookup), classify the research output into one of two classes. Record the decision in `00_question_decomposition.md` once, near the top, so every downstream step honors it.
+
+| Class | What the output recommends or selects | Examples | Technical-component gates apply? |
+|-------|---------------------------------------|----------|----------------------------------|
+| **Technical-component selection** | One or more libraries, SDKs, frameworks, services, protocols, data formats, infrastructure patterns, algorithms, or APIs that will be implemented or operated against | "Pick a vector database", "Compare auth-token strategies for our API", "Should we use Kafka or RabbitMQ?", architecture / tech-stack / migration drafts (Mode A, Mode B) | **Yes — all gates active** |
+| **Non-technical investigation** | Concept comparisons, knowledge organization, root-cause investigation of an event, market/policy/regulatory/social analysis, literature review, decision support without committing to specific tooling | "Why did adoption stall in Q3?", "Compare phenomenology vs constructivism", "Map regulatory landscape for X", "What do practitioners say about onboarding under remote-first orgs?" | **No — skip API/MVE/sub-matrix gates; the rest of the 8-step engine still applies** |
+
+How to decide:
+1. Inspect the question and the input files (`problem.md`, `restrictions.md`, `acceptance_criteria.md`, or the standalone input file).
+2. If the deliverable will name specific software/services/protocols that someone will then build with or operate, it is **Technical-component selection**.
+3. If the deliverable is a report, comparison, or recommendation that does not commit to specific tooling, it is **Non-technical investigation**.
+4. **Mixed runs are valid.** Some research questions have a non-technical core but include one technical sub-question (or vice versa). In that case classify per component area within the run, not the run as a whole, and note in `00_question_decomposition.md` which component areas trigger the technical-component gates.
+
+When the run is purely **Non-technical investigation**, the rest of the research engine (question decomposition, perspective rotation, exhaustive web search, fact extraction, comparison framework, reasoning chain, validation, deliverable formatting) still applies in full. Only the technical-component gates listed at the start of this section are skipped.

 ## Context Resolution

@@ -27,13 +27,26 @@
 - [ ] Iterative deepening completed: follow-up questions from initial findings were searched
 - [ ] No sub-question relies solely on training data without web verification

+## Component Option Breadth
+
+- [ ] `00_question_decomposition.md` contains a Component Option Search Plan
+- [ ] Every component area was searched across simple baseline, established production, open-source, commercial/vendor, current SOTA, adjacent-domain, no-build/defer, and known-bad options where applicable
+- [ ] Every component area has at least 3 realistic candidates, or a documented explanation of why broad searches found fewer
+- [ ] Each lead candidate has official/source-of-truth evidence plus independent validation when available
+- [ ] Each component area includes at least one baseline/fallback option and at least one rejected or experimental option when possible
+- [ ] Alternative names, synonyms, and neighboring-domain terms were searched before declaring the option landscape complete
+- [ ] Licensing, runtime, platform, maintenance, and unsupported-scenario searches were performed for every lead, fallback, and rejected candidate
+
 ## Mode A Specific

 - [ ] Phase 1 completed: AC assessment was presented to and confirmed by user
 - [ ] AC assessment consistent: Solution draft respects the (possibly adjusted) acceptance criteria and restrictions
 - [ ] Competitor analysis included: Existing solutions were researched
 - [ ] All components have comparison tables: Each component lists alternatives with tools, advantages, limitations, security, cost
+- [ ] Component options are broad: component tables include baseline, production, open-source, commercial/vendor, SOTA/research, adjacent-domain, defer/no-build, and disqualified options where applicable
 - [ ] Tools/libraries verified: Suggested tools actually exist and work as described
+- [ ] Component fit matrix completed: `06_component_fit_matrix.md` exists and every selected component/tool/pattern is marked `Selected`
+- [ ] No field-adjacent substitution: no selected candidate is chosen only because it solves a similar class of problem while failing the project's explicit constraints
 - [ ] Testing strategy covers AC: Tests map to acceptance criteria
 - [ ] Tech stack documented (if Phase 3 ran): `tech_stack.md` has evaluation tables, risk assessment, and learning requirements
 - [ ] Security analysis documented (if Phase 4 ran): `security_analysis.md` has threat model and per-component controls
@@ -45,6 +58,9 @@
 - [ ] New draft is self-contained: Written as if from scratch, no "updated" markers
 - [ ] Performance column included: Mode B comparison tables include performance characteristics
 - [ ] Previous draft issues addressed: Every finding in the table is resolved in the new draft
+- [ ] Existing selected components were challenged against a broad alternative landscape before being kept
+- [ ] Existing component fit audited: every old and new component/tool/pattern was checked against `restrictions.md`, `acceptance_criteria.md`, and the Project Constraint Matrix
+- [ ] Rejected/experimental candidates are not lead recommendations unless the user explicitly accepted the risk

 ## Timeliness Check (High-Sensitivity Domain BLOCKING)

@@ -76,3 +92,33 @@ When the research topic has Critical or High sensitivity level:
 - [ ] Cited facts have corresponding statements in the original text (no over-interpretation)
 - [ ] Source publication/update dates annotated; technical docs include version numbers
 - [ ] Unverifiable information annotated `[limited source]` and not sole support for core conclusions
+
+## Exact-Fit Validation (BLOCKING)
+
+- [ ] Project Constraint Matrix extracted from problem context before component selection
+- [ ] Component fit matrix includes `Component Area`, `Option Family`, and `Pinned Mode/Config` columns
+- [ ] Every selected component/tool/library/service/pattern/algorithm has evidence for required inputs/outputs and integration boundaries
+- [ ] Every selected candidate has evidence for the operating context and lifecycle assumptions it must support
+- [ ] Every selected candidate has evidence for non-functional targets that are binding for the project
+- [ ] Known unsupported scenarios and failure reports were searched for every selected candidate
+- [ ] Mismatches are recorded as disqualifiers, not softened into generic limitations
+- [ ] Any candidate with unproven fit is marked `Experimental only` or escalated for user decision
+- [ ] Any candidate with documented constraint conflict is marked `Rejected`
+
+## API Capability Verification (BLOCKING)
+
+**Applicability**: this checklist applies only when the run is classified as **Technical-component selection** (see SKILL.md → Research Output Class). For non-technical research (concept comparison, market/policy investigation, root-cause analysis, knowledge organization), skip this checklist entirely and note the skip in `05_validation_log.md`. For mixed runs, apply only to technical component areas.
+
+For every lead candidate that is a library/SDK/framework/service:
+
+- [ ] The exact mode/configuration the project will use is pinned in one explicit sentence (inputs, outputs, runtime); no vague "supports X" language
+- [ ] `context7` (or equivalent docs lookup) was run for the candidate, with at least 3 queries: mode enumeration, project's exact mode, disqualifier probe
+- [ ] All consulted URLs from context7 / official docs are appended to `01_source_registry.md`
+- [ ] A Minimum Viable Example (MVE) was saved for the pinned mode in `02_fact_cards.md` (or `02_mve_evidence.md`) with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌
+- [ ] When the MVE inputs or outputs do not exactly match the project's, the mismatch is cited from the official docs (not inferred), and the candidate is `Experimental only` or `Rejected`
+- [ ] When a library has multiple modes, each project-relevant mode appears as its own candidate row (not a single library row that softens across modes)
+- [ ] Restrictions × Candidate-Mode sub-matrix in `06_component_fit_matrix.md` is filled for every lead candidate, with one row per numbered restriction and per numbered acceptance criterion
+- [ ] Sub-matrix uses ✅ / ❌ / ❓ / N/A only — no free-form prose substitutes
+- [ ] No `Selected` candidate has any ❌ or ❓ cell in its sub-matrix
+- [ ] "Validation gate required" footnotes are explicitly classified as either *API capability* (must be resolved here) or *runtime quality* (may be carried forward)
+- [ ] Paraphrased capability claims in fact cards have been cross-checked against the literal mode-enumeration evidence (no `mono, inertial → mono-inertial` style conflation)
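
The last checklist item (no `mono, inertial → mono-inertial` conflation) amounts to comparing paraphrased claims against the literal mode enumeration. A minimal sketch follows, assuming both have already been extracted as sets of mode names. The `unverified_claims` helper is hypothetical, and the mode names only echo the example in the checklist.

```python
def _normalize(mode: str) -> str:
    """Compare mode names case-insensitively and ignore surrounding whitespace."""
    return mode.strip().lower()


def unverified_claims(claimed_modes: set[str], enumerated_modes: set[str]) -> set[str]:
    """Return claimed modes that do not appear literally in the documented enumeration.

    Anything returned here needs its own docs evidence before it can be
    treated as a capability of the candidate.
    """
    documented = {_normalize(m) for m in enumerated_modes}
    return {m for m in claimed_modes if _normalize(m) not in documented}


# A fact card paraphrased two separate modes into a combined one.
claimed = {"mono", "mono-inertial"}
enumerated = {"Mono", "Inertial"}
suspect = unverified_claims(claimed, enumerated)
```

In this example `suspect` contains only `mono-inertial`: the docs list `Mono` and `Inertial` separately, so the combined mode is unproven until the official enumeration shows it.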
@@ -57,6 +57,7 @@ RESEARCH_DIR/
 ├── 03_comparison_framework.md     # Step 4 output: selected framework and populated data
 ├── 04_reasoning_chain.md          # Step 6 output: fact → conclusion reasoning
 ├── 05_validation_log.md           # Step 7 output: use-case validation results
+├── 06_component_fit_matrix.md     # Step 7.5 output: component exact-fit gate
 └── raw/                           # Raw source archive (optional)
    ├── source_1.md
    └── source_2.md
@@ -73,6 +74,7 @@ RESEARCH_DIR/
 | Step 4 | Selected comparison framework + initial population | `03_comparison_framework.md` |
 | Step 6 | Reasoning process for each dimension | `04_reasoning_chain.md` |
 | Step 7 | Validation scenarios + results + review checklist | `05_validation_log.md` |
+| Step 7.5 | Component exact-fit gate and selection status | `06_component_fit_matrix.md` |
 | Step 8 | Complete solution draft | `OUTPUT_DIR/solution_draft##.md` |

 ### Save Principles
@@ -95,6 +97,7 @@ RESEARCH_DIR/
 | `03_comparison_framework.md` | Selected framework and populated data | After Step 4 completion |
 | `04_reasoning_chain.md` | Fact → conclusion reasoning | After Step 6 completion |
 | `05_validation_log.md` | Use-case validation and review | After Step 7 completion |
+| `06_component_fit_matrix.md` | Exact-fit matrix for every proposed component/tool/pattern with status `Selected` / `Rejected` / `Experimental only` / `Needs user decision` | Before Step 8 deliverable formatting |
 | `OUTPUT_DIR/solution_draft##.md` | Complete solution draft | After Step 8 completion |
 | `OUTPUT_DIR/tech_stack.md` | Tech stack evaluation and decisions | After Phase 3 (optional) |
 | `OUTPUT_DIR/security_analysis.md` | Threat model and security controls | After Phase 4 (optional) |
@@ -6,7 +6,9 @@ Triggered when no `solution_draft*.md` files exist in OUTPUT_DIR, or when the us

 **Role**: Professional software architect

-A focused preliminary research pass **before** the main solution research. The goal is to validate that the acceptance criteria and restrictions are realistic before designing a solution around them.
+> **AC must be design-independent**: describe testable outcomes only, with no libraries, algorithms, parameters, or design choices. Implementation follows AC, never the reverse. (IEEE 830 / Atlassian / GitScrum)
+
+A focused preliminary research pass **before** the main solution research. The goal is to validate that the acceptance criteria and restrictions are realistic before designing a solution around them. Any revision proposed in this phase must respect the design-independence rule above — propose AC changes as outcome/budget edits, not as implementation prescriptions.

 **Input**: All files from INPUT_DIR (or INPUT_FILE in standalone mode)

@@ -73,16 +75,18 @@ Full 8-step research methodology. Produces the first solution draft.
 **Task** (drives the 8-step engine):
 1. Research existing/competitor solutions for similar problems — search broadly across industries and adjacent domains, not just the obvious competitors
 2. Research the problem thoroughly — all possible ways to solve it, split into components; search for how different fields approach analogous problems
-3. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1
-4. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions
-5. Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist?
-6. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues)
-7. Include security considerations in each component analysis
-8. Provide rough cost estimates for proposed solutions
+3. Derive a **Project Constraint Matrix** before evaluating component options. Extract exact constraints from `problem.md`, `restrictions.md`, `acceptance_criteria.md`, input data notes, and the Phase 1 AC assessment. Include required inputs/outputs, operating context, runtime envelope, data availability, lifecycle boundaries, non-functional targets, integration boundaries, security constraints, and explicit out-of-scope decisions.
+4. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1
+5. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions
+6. Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist?
+7. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues)
+8. For every candidate component/tool/library/service/pattern/algorithm, prove exact fit against the Project Constraint Matrix. A field-adjacent solution is not selectable unless its documented implementation assumptions match the project's constraints. Mismatches must be recorded as disqualifiers and the candidate marked `Rejected`, `Experimental only`, or `Needs user decision`.
+9. Include security considerations in each component analysis
+10. Provide rough cost estimates for proposed solutions

 Be concise in formulating. The fewer words, the better, but do not miss any important details.

-**Save action**: Write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md`
+**Save action**: Write `RESEARCH_DIR/06_component_fit_matrix.md` before the final draft, then write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md`

 ---

@@ -10,18 +10,25 @@ Full 8-step research methodology applied to assessing and improving an existing

 **Task** (drives the 8-step engine):
 1. Read the existing solution draft thoroughly
-2. Research in internet extensively — for each component/decision in the draft, search for:
+2. Derive or refresh the **Project Constraint Matrix** from all files in INPUT_DIR. Include required inputs/outputs, operating context, runtime envelope, data availability, lifecycle boundaries, non-functional targets, integration boundaries, security constraints, and explicit out-of-scope decisions.
+3. Audit every component/decision in the existing draft against the Project Constraint Matrix before researching alternatives:
+   - If a component's documented implementation assumptions match the project constraints, keep it eligible and record evidence.
+   - If fit is unproven, mark it `Experimental only` until evidence is found.
+   - If constraints conflict, mark it `Rejected` and search for alternatives.
+   - If rejecting it changes product behavior or risk materially, escalate for user decision.
+4. Search the internet extensively. For each component/decision in the draft, look for:
   - Known problems and limitations of the chosen approach
   - What practitioners say about using it in production
   - Better alternatives that may have emerged recently
   - Common failure modes and edge cases
   - How competitors/similar projects solve the same problem differently
-3. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure"
-4. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft
-5. Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports
-6. For each identified weak point, search for multiple solution approaches and compare them
-7. Based on findings, form a new solution draft in the same format
+5. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure"
+6. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft
+7. Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports
+8. For each identified weak point, search for multiple solution approaches and compare them
+9. For every revised candidate, prove exact fit against the Project Constraint Matrix. Do not select field-adjacent or "similar problem" options unless their intrinsic implementation constraints match the project.
+10. Based on findings, form a new solution draft in the same format

-**Save action**: Write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`
+**Save action**: Write `RESEARCH_DIR/06_component_fit_matrix.md` before the final draft, then write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`

 **Optional follow-up**: After Mode B completes, the user can request Phase 3 (Tech Stack Consolidation) or Phase 4 (Security Deep Dive) using the revised draft. These phases work identically to their Mode A descriptions in `steps/01_mode-a-initial-research.md`.
@@ -40,6 +40,7 @@ Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources
 - "What existing/competitor solutions address this problem?"
 - "What are the component parts of this problem?"
 - "For each component, what are the state-of-the-art solutions?"
+- "For each component, what are the practical alternatives across simple baseline, established production option, open-source option, commercial option, current SOTA, adjacent-domain option, and no-build/defer option?"
 - "What are the security considerations per component?"
 - "What are the cost implications of each approach?"

@@ -48,6 +49,7 @@ Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources
 - "What are the security vulnerabilities in the proposed architecture?"
 - "Where are the performance bottlenecks?"
 - "What solutions exist for each identified issue?"
+- "For each component already selected in the draft, what alternatives should be considered before keeping, replacing, or rejecting it?"

 **General sub-question patterns** (use when applicable):
 - **Sub-question A**: "What is X and how does it work?" (Definition & mechanism)
@@ -84,6 +86,27 @@ For **each sub-question**, generate **at least 3-5 search query variants** befor

 Record all planned queries in `00_question_decomposition.md` alongside each sub-question.

+#### Component Option Breadth (MANDATORY)
+
+Before Step 2, identify the component areas implied by the problem and create a search plan for options in each area. A component area is any replaceable tool, library, model, service, algorithm, data format, protocol, infrastructure pattern, or validation approach that could materially affect the solution.
+
+For every component area, generate search queries for these option families unless clearly not applicable:
+- **Simple baseline**: low-complexity classical or manual approach that can serve as a fallback or regression baseline.
+- **Established production option**: mature library/service/pattern with field usage.
+- **Open-source candidate**: permissive-license option with inspectable implementation and community history.
+- **Commercial/vendor option**: paid or vendor-supported option, including SDK/platform constraints.
+- **Current SOTA / research option**: recent model, paper, or benchmark leader that may be promising but immature.
+- **Adjacent-domain option**: solution from a neighboring domain with similar constraints.
+- **No-build / defer option**: whether the component can be avoided, simplified, or moved out of scope.
+- **Known bad option**: candidate or family that appears attractive but has documented failure modes or disqualifiers.
+
+For each component area, record:
+- Candidate names and option families to search.
+- At least 5 query variants covering alternatives, comparisons, limitations, licensing, runtime/scale, and exact project constraints.
+- The minimum evidence needed to mark a candidate `Selected`, `Rejected`, `Experimental only`, or `Needs user decision`.
+
+Add this as a "Component Option Search Plan" section in `00_question_decomposition.md`.
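
A Component Option Search Plan entry can be checked for coverage before Step 2 starts. The sketch below is one possible shape, not a prescribed format: the `ComponentSearchPlan` class and its field names are hypothetical. The option-family names and the minimum of 5 query variants come from the section above.

```python
from dataclasses import dataclass, field

OPTION_FAMILIES = (
    "Simple baseline",
    "Established production option",
    "Open-source candidate",
    "Commercial/vendor option",
    "Current SOTA / research option",
    "Adjacent-domain option",
    "No-build / defer option",
    "Known bad option",
)
MIN_QUERY_VARIANTS = 5


@dataclass
class ComponentSearchPlan:
    """Planned search queries for one component area, grouped by option family."""
    area: str
    queries_by_family: dict[str, list[str]] = field(default_factory=dict)
    # Families explicitly judged "clearly not applicable" for this area.
    not_applicable: set[str] = field(default_factory=set)

    def gaps(self) -> list[str]:
        """List what is still missing before the plan counts as complete."""
        problems = []
        for family in OPTION_FAMILIES:
            if family in self.not_applicable:
                continue
            if not self.queries_by_family.get(family):
                problems.append(f"no queries for option family: {family}")
        total = sum(len(queries) for queries in self.queries_by_family.values())
        if total < MIN_QUERY_VARIANTS:
            problems.append(
                f"only {total} query variants; need at least {MIN_QUERY_VARIANTS}"
            )
        return problems


# Hypothetical component area with only two families planned so far.
plan = ComponentSearchPlan(
    area="message transport",
    queries_by_family={
        "Simple baseline": ["plain TCP socket framing limitations"],
        "Open-source candidate": ["open source message broker comparison"],
    },
    not_applicable={"Commercial/vendor option"},
)
```

Calling `plan.gaps()` on the example reports the five uncovered families plus the shortfall in query variants. An empty result means the area is ready to be recorded in `00_question_decomposition.md`.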
+
 **Research Subject Boundary Definition (BLOCKING - must be explicit)**:

 When decomposing questions, you must explicitly define the **boundaries of the research subject**:
@@ -94,6 +117,9 @@ When decomposing questions, you must explicitly define the **boundaries of the r
 | **Geography** | Which region is being studied? | Chinese universities vs US universities vs global |
 | **Timeframe** | Which period is being studied? | Post-2020 vs full historical picture |
 | **Level** | Which level is being studied? | Undergraduate vs graduate vs vocational |
+| **Operating context** | What exact environment, lifecycle phase, and runtime conditions must the solution support? | In-flight embedded runtime vs offline post-processing; production web traffic vs admin batch job |
+| **Required interfaces** | What inputs, outputs, protocols, data shapes, and ownership boundaries are fixed? | One camera vs stereo rig; REST API vs message queue; local file boundary vs service API |
+| **Non-functional envelope** | What latency, throughput, storage, memory, availability, safety, security, cost, and maintainability targets are binding? | <400 ms p95, 8 GB RAM, 99.9% availability, reversible migrations |

 **Common mistake**: User asks about "university classroom issues" but sources include policies targeting "K-12 students" — mismatched target populations will invalidate the entire research.

@@ -116,9 +142,11 @@ Record the audit result in `00_question_decomposition.md` as a "Completeness Aud
   - Summary of relevant problem context from INPUT_DIR
   - Classified question type and rationale
   - **Research subject boundary definition** (population, geography, timeframe, level)
+   - **Project Constraint Matrix summary** (operating context, required interfaces, non-functional envelope, lifecycle assumptions, and hard disqualifiers extracted from input files)
   - List of decomposed sub-questions
   - **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale
   - **Search query variants** for each sub-question (at least 3 per sub-question, ideally 5)
+   - **Component Option Search Plan** (component areas, option families, candidate names, query variants, required evidence)
   - **Completeness audit** (taxonomy cross-reference + domain discovery results)
 4. Write TodoWrite to track progress

@@ -132,7 +160,7 @@ Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). C

 **Tool Usage**:
 - Use `WebSearch` for broad searches; `WebFetch` to read specific pages
- Use the `context7` MCP server (`resolve-library-id` then `get-library-docs`) for up-to-date library/framework documentation
+- Use the `context7` MCP server (`resolve-library-id` then `query-docs` / `get-library-docs`) for up-to-date library/framework documentation. **Mandatory per lead candidate** — see "API Capability Verification" below.
 - Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories)
 - When citing web sources, include the URL and date accessed

@@ -145,12 +173,72 @@ Do not stop at the first few results. The goal is to build a comprehensive evide
 - Consult at least **2 different source tiers** per sub-question (e.g., L1 official docs + L4 community discussion)
 - If initial searches yield fewer than 3 relevant sources for a sub-question, **broaden the search** with alternative terms, related domains, or analogous problems

+**Minimum search effort per component area**:
+- Search every option family from the "Component Option Search Plan" before choosing a lead candidate.
+- For each lead, fallback, or rejected candidate, search at least one official/source-of-truth page and at least one independent validation source when available.
+- Search `"[component] alternatives"`, `"[candidate] vs [alternative]"`, `"[candidate] limitations"`, `"[candidate] license"`, `"[candidate] production"`, and `"[candidate] [binding project constraint]"`.
+- If fewer than 3 realistic candidates are found for a component area, explicitly document why the landscape is narrow and search adjacent domains before accepting that result.
+- Include at least one simple baseline and one "do not use" or disqualified candidate per component area when possible; these prevent false confidence in the selected option.
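
As a rough illustration, the query patterns listed above can be expanded from templates. The helper name `build_queries` and the example component and candidate names in the usage note are placeholders, not anything the skill prescribes.

```python
# Templates mirror the bracketed patterns in the list above.
QUERY_TEMPLATES = (
    "{component} alternatives",
    "{candidate} vs {alternative}",
    "{candidate} limitations",
    "{candidate} license",
    "{candidate} production",
    "{candidate} {constraint}",
)


def build_queries(component, candidate, alternatives, constraints):
    """Expand every template for one candidate; duplicates are removed, order kept."""
    queries = []
    for template in QUERY_TEMPLATES:
        if "{alternative}" in template:
            # One comparison query per known alternative.
            queries += [
                template.format(candidate=candidate, alternative=alt)
                for alt in alternatives
            ]
        elif "{constraint}" in template:
            # One query per binding project constraint.
            queries += [
                template.format(candidate=candidate, constraint=c)
                for c in constraints
            ]
        else:
            queries.append(template.format(component=component, candidate=candidate))
    return list(dict.fromkeys(queries))
```

For example, `build_queries("visual odometry", "LibA", ["LibB", "LibC"], ["monocular camera"])` yields seven queries, including `"LibA vs LibB"` and `"LibA monocular camera"`.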
+
+**Candidate implementation-limit searches (MANDATORY)**:
+For every component/tool/library/service/pattern/algorithm that may be selected or recommended, search for its intrinsic implementation constraints. Do not rely on product category labels, marketing summaries, or examples from a different operating context. Include query variants for:
+- Official supported inputs/outputs, protocols, data formats, and deployment modes
+- Required hardware/runtime/platform/version constraints
+- Timing, throughput, memory, storage, synchronization, and scaling assumptions
+- Lifecycle assumptions: offline vs online, batch vs real time, development vs production, single tenant vs multi tenant, local vs networked
+- Known unsupported scenarios, limitations, issue reports, production failures, and workarounds
+- Licensing, security, maintenance, and community-health constraints
+- Exact phrases from the project's restrictions and acceptance criteria combined with the candidate name
+
+**API Capability Verification — Per-Mode (MANDATORY, BLOCKING for lead candidates)**:
+
+**Applicability**: this section applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section, and only to lead candidates that are libraries/SDKs/frameworks/services/protocols/data formats with multiple modes or configurations. For non-technical research (concept comparison, market/policy investigation, knowledge organization, root-cause analysis without tooling commitments), skip this entire sub-section and continue with the rest of Step 2 — the broader candidate implementation-limit search above is sufficient. State the skip explicitly once in `02_fact_cards.md`: `API Capability Verification: not applicable — this run is a Non-technical investigation, no library/SDK/service candidates`.
+
+Most libraries/SDKs/services expose **multiple modes or configurations** (e.g., monocular vs stereo VO, sync vs async API, batch vs streaming inference, write-through vs write-behind cache). Selecting a candidate "because it supports X" without pinning *which mode* the project will use, and *whether that exact mode produces the required outputs from the required inputs*, is the most common silent-failure path in research. A library can support a class of problem in mode A while being unusable for the project's specific configuration in mode B.
+
+For every lead candidate that is a library/SDK/framework/service with multiple modes or configurations, do the following — in this order, before marking the candidate `Selected`:
+
+1. **Pin the exact mode/configuration the project will use.**
+   Derive the mode from the Project Constraint Matrix: which inputs are available (sensor count, sensor types, data shapes, rates), which outputs are required (per `acceptance_criteria.md` and contract files), and which hardware/runtime is fixed (per `restrictions.md`). Write this as a single sentence: "We will use `<library>` in `<mode/config>` with inputs `<list>` and expect outputs `<list>` on `<runtime>`." Do not progress past this step on a vague mode description.
+
+2. **Run `context7` (or equivalent docs lookup) for the candidate** — this is **mandatory for every lead library/SDK/framework candidate**, not optional. Minimum three queries per candidate:
+   1. *Mode enumeration*: "What modes/configurations does `<library>` support? List every value of the mode/config enum and what each requires as input."
+   2. *Project's exact mode*: "Show a minimum runnable example of `<library>` in `<the pinned mode>` with `<the project's input shape>`. What does it produce?"
+   3. *Disqualifier probe*: "Does `<library>` `<the pinned mode>` produce `<the required output>`? Are there published limitations of `<the pinned mode>` for `<the project's runtime/hardware>`?"
+
+   For services without context7 coverage, use official docs site + WebFetch on the API reference page + the project's example/tutorial directory in the source repo. Append every consulted URL to `01_source_registry.md`.
+
+3. **Save a Minimum Viable Example (MVE) for the pinned mode.**
+   Append to `02_fact_cards.md` (or a sibling `02_mve_evidence.md`) at least one block per lead library candidate with:
+
+   ```markdown
+   ## MVE — <library> in <pinned mode>
+   - **Source**: <official URL or context7 reference, with date>
+   - **Inputs in the example**: <e.g., 2 calibrated cameras + IMU at 200 Hz>
+   - **Outputs in the example**: <e.g., 6-DoF pose with covariance>
+   - **Project inputs**: <e.g., 1 camera + IMU at 200 Hz>
+   - **Project outputs required**: <e.g., 6-DoF pose with metric translation>
+   - **Match assessment**: ✅ exact match / ⚠️ partial (specify dimension) / ❌ mismatch (specify dimension)
+   - **If ⚠️ or ❌**: cite the official-docs sentence that establishes the mismatch.
+   ```
+
+   If no official example covers the project's exact configuration → the candidate cannot be marked `Selected` based on category fit alone. Status must be `Experimental only` (with required-evidence note) or `Rejected` (when the docs explicitly disqualify the configuration).
+
+4. **Bind every numbered Restriction and Acceptance Criterion to the candidate's pinned mode.**
+   For each numbered line in `restrictions.md` and `acceptance_criteria.md`, decide one of: `Pass` (the pinned mode satisfies it with cited evidence), `Fail` (the pinned mode contradicts it with cited evidence), `Verify` (no evidence either way; deeper investigation required), `N/A` (the line is irrelevant to this component area). Record this in `02_fact_cards.md` under the candidate's MVE block. The structural matrix in Step 7.5 reads from these bindings.
+
+5. **Treat "the same library in a different mode" as a different candidate.**
+   If the project's pinned mode is `Monocular` but the only documented evidence covers `Stereo`, do not silently soften "rotation only" into "rotation + translation". Open a separate candidate row for the Monocular mode, with its own MVE, fit assessment, and disqualifiers. Two modes of one library are two distinct candidates for the purposes of this gate.
+
+**Common silent-failure pattern this guards against**: a fact card paraphrases the docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes". A category-level "Selected" decision then carries through every downstream artifact, masking that the project's required A+B combination does not exist as a single mode.
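
One way to keep steps 3 and 5 honest is to key every candidate by library and mode together, and to derive the best reachable status from the MVE record. This is a minimal sketch under assumptions: the type names are hypothetical, and mapping a documented mismatch straight to `Rejected` is one reading of the rule in step 3 (a mismatch must cite the official-docs sentence that establishes it).

```python
from dataclasses import dataclass
from enum import Enum


class Match(Enum):
    EXACT = "exact match"
    PARTIAL = "partial"
    MISMATCH = "mismatch"


@dataclass(frozen=True)
class CandidateKey:
    """Two modes of one library compare as two distinct candidates."""
    library: str
    mode: str


@dataclass
class MveRecord:
    key: CandidateKey
    source: str                    # official URL or docs reference, with date
    match: Match
    docs_disqualify: bool = False  # docs explicitly rule out the project configuration


def status_ceiling(mve):
    """Best status reachable on API-capability evidence alone (None means no MVE saved)."""
    if mve is None:
        return "Experimental only"   # category fit alone is not enough
    if mve.docs_disqualify or mve.match is Match.MISMATCH:
        return "Rejected"
    if mve.match is Match.PARTIAL:
        return "Experimental only"
    return "Selected"                # still subject to the Step 7.5 gate
```

Because `CandidateKey` is frozen and compared by value, evidence stored under `("LibA", "Stereo")` cannot be looked up by accident for `("LibA", "Monocular")`.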
+
 **Search broadening strategies** (use when results are thin):
 - Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation"
 - Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums
 - Try different geographies: search in English + search for European/Asian approaches if relevant
 - Try historical evolution: "history of X", "evolution of X approaches", "X state of the art 2024 2025"
 - Try failure analysis: "X project failure", "X post-mortem", "X recall", "X incident report"
+- Try disqualifier probes: "X unsupported", "X limitations", "X requirements", "X with [project constraint]", "X without [required input]", "X real-time [target]", "X production failure"

 **Search saturation rule**: Continue searching until new queries stop producing substantially new information. If the last 3 searches only repeat previously found facts, the sub-question is saturated.
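
The saturation rule can be sketched as a trailing-streak check. The function below is an illustration, not part of the skill: it assumes each search is summarized as a set of fact identifiers, and it treats a search that returns nothing as "no new information".

```python
def is_saturated(facts_per_search, window=3):
    """True when the last `window` searches added no fact that was not already known.

    facts_per_search: iterable of sets of fact identifiers, in search order.
    """
    seen = set()
    streak = 0  # consecutive most-recent searches that produced nothing new
    for facts in facts_per_search:
        new_facts = set(facts) - seen
        streak = 0 if new_facts else streak + 1
        seen |= new_facts
    return streak >= window
```

Judging "substantially new" is left to the researcher; the sketch only covers the case where repeated facts can be matched by identifier.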

@@ -194,6 +282,7 @@ For each extracted fact, **immediately** append to `02_fact_cards.md`:
 - **Target Audience**: [which group this fact applies to, inherited from source or further refined]
 - **Confidence**: ✅/⚠️/❓
 - **Related Dimension**: [corresponding comparison dimension]
+- **Fit Impact**: [supports selection / disqualifies / makes experimental / needs user decision]
 ```

 **Target audience in fact statements**:
@@ -24,6 +24,18 @@ Write to `03_comparison_framework.md`:
 | ... | | | |
 ```

+**Required exact-fit dimensions for component/tool decisions**:
+When the output selects or recommends a component, tool, library, service, architecture pattern, or algorithm, the framework MUST include these dimensions unless explicitly not applicable:
+- Option family (`Simple baseline`, `Established production`, `Open-source`, `Commercial/vendor`, `Current SOTA`, `Adjacent-domain`, `No-build/defer`, `Known bad`)
+- Required inputs/outputs and ownership boundaries
+- Operating context and lifecycle fit
+- Non-functional envelope fit
+- Implementation assumptions and hard disqualifiers
+- Evidence quality and source tier
+- Selection status (`Selected`, `Rejected`, `Experimental only`, `Needs user decision`)
+
+For each component area, include multiple candidates in the initial population. Do not present only the preferred option unless the investigation found no realistic alternatives; if so, state the searches that proved the narrow landscape.
+
 ---

 ### Step 5: Reference Point Baseline Alignment
@@ -97,6 +109,8 @@ Validate conclusions against a typical scenario:
 - [ ] Are there any important dimensions missed?
 - [ ] Is there any over-extrapolation?
 - [ ] Are conclusions actionable/verifiable?
+- [ ] Does every selected component/tool/pattern match the Project Constraint Matrix?
+- [ ] Are mismatches marked as disqualifiers instead of hidden as generic "limitations"?

 **Save action**:
 Write to `05_validation_log.md`:
@@ -128,6 +142,66 @@ If using Y: [expected behavior]

 ---

+### Step 7.5: Component Applicability Gate (BLOCKING)
+
+**Applicability**: this gate applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section. For non-technical research (concept comparison, market/policy investigation, root-cause analysis without tooling, knowledge organization), skip this entire step and proceed to Step 8 — there are no components to gate. State the skip once in `05_validation_log.md`: `Step 7.5 (Component Applicability Gate): not applicable — Non-technical investigation`. For mixed runs (some component areas technical, some not), apply this gate only to the technical component areas; the non-technical ones do not produce 7.5 rows.
+
+Before finalizing the solution draft, build an exact-fit matrix for every component/tool/library/service/pattern/algorithm that is selected, recommended, rejected, or treated as a fallback. Free-form prose in a "Project Constraints Checked" column is **not sufficient** — mismatches hide inside rationale text. The matrix must be structured per restriction and per acceptance criterion.
+
+#### 7.5.1 Top-level Component Fit Matrix
+
+```markdown
+# Component Fit Matrix
+
+| Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
+|----------------|-----------|--------------------|---------------|---------------|-------------------------|----------------------------|--------|--------------------|
+| [area] | [name] | [exact mode/config the project will use, copied verbatim from the MVE block in Step 2] | [family] | [role] | MVE: [link to MVE block in `02_fact_cards.md` or `02_mve_evidence.md`]; docs: [Source #] | [none / list] | Selected / Rejected / Experimental only / Needs user decision | [why] |
+```
+
+The new **Pinned Mode/Config** column is mandatory. A row without a pinned mode is incomplete. The new **API Capability Evidence** column links to the Minimum Viable Example saved during Step 2's API Capability Verification — without an MVE link the candidate cannot be `Selected`.
+
+#### 7.5.2 Restrictions × Candidate-Modes Sub-Matrix (MANDATORY)
+
+For each lead candidate row in the top-level matrix, append a structured cross-check that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's **pinned mode/config**.
+
+```markdown
+## Sub-Matrix — <Candidate Name> in <Pinned Mode>
+
+| Restriction / AC | Candidate-mode behavior | Result | Evidence |
+|------------------|-------------------------|--------|----------|
+| R1: <verbatim line from restrictions.md> | <how the pinned mode behaves under this restriction> | ✅ Pass / ❌ Fail / ❓ Verify / N/A | [Fact # / Source # / MVE link] |
+| R2: ... | ... | ... | ... |
+| ... | ... | ... | ... |
+| AC-1.1: <verbatim line from acceptance_criteria.md> | <how the pinned mode satisfies (or contradicts) this AC's measurable target> | ✅ / ❌ / ❓ / N/A | [Fact # / Source # / MVE link] |
+| AC-1.2: ... | ... | ... | ... |
+| ... | ... | ... | ... |
+```
+
+Cell semantics:
+- ✅ **Pass** — the candidate's pinned mode satisfies this line, with cited official-doc or MVE evidence.
+- ❌ **Fail** — the candidate's pinned mode contradicts this line, with cited evidence. Even one ❌ disqualifies the candidate from `Selected` status.
+- ❓ **Verify** — no evidence yet either way; further investigation required (loops back to Step 2 / Step 3.5). A row left ❓ at the end of analysis blocks the candidate.
+- **N/A** — the line is irrelevant to this component area (state why in one phrase).
+
+A candidate row may not be marked `Selected` while any cell is ❌ or ❓.
+
+#### 7.5.3 Decision Rules
+
+- `Selected` is allowed only when (a) the top-level row has an MVE link, (b) the sub-matrix has zero ❌, (c) the sub-matrix has zero ❓, and (d) the candidate's documented implementation assumptions match the project's explicit constraints and acceptance criteria.
+- `Experimental only` is required when a candidate might work but lacks proof for the exact operating context (e.g., MVE exists for a similar configuration but not the exact one).
+- `Rejected` is required when documented assumptions conflict with project constraints (any sub-matrix row is ❌ with cited evidence).
+- `Needs user decision` is required when a mismatch changes scope, cost, safety, product behavior, or acceptance criteria — and the user has not yet been consulted.
+- Each component area must include at least one selected or fallback-safe option, plus the most credible rejected/experimental alternatives discovered during web research.
+- A component area with only one candidate is incomplete unless `00_question_decomposition.md` documents the broader searches and why they yielded no realistic alternatives.
+- A candidate may not appear as the lead solution in Step 8 unless this gate marks it `Selected`.
+- "Validation gate required" footnotes are not equivalent to `Selected`. If the validation gate concerns API capability (does the mode produce the required output?), that is a Step-2 / Step-7.5 question and must be resolved here, not deferred to runtime. Only validation gates concerning *runtime quality* (e.g., "does this VO converge on this terrain class?") may be carried forward as `Selected with runtime gate`.
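
The decision rules above can be approximated in a few lines. This sketch makes two assumptions that the text does not state: a pending user decision takes precedence over the other outcomes, and a candidate with open ❓ cells or no MVE link falls back to `Experimental only`. The names `Cell` and `decide_status` are hypothetical.

```python
from enum import Enum


class Cell(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    VERIFY = "Verify"
    NA = "N/A"


def decide_status(cells, has_mve_link, assumptions_match, needs_user_decision=False):
    """Map one candidate's sub-matrix cells to a selection status."""
    cells = list(cells)
    if needs_user_decision:
        # Mismatch changes scope, cost, safety, behavior, or acceptance criteria.
        return "Needs user decision"
    if Cell.FAIL in cells:
        # Even one failing row disqualifies the candidate.
        return "Rejected"
    if has_mve_link and assumptions_match and Cell.VERIFY not in cells:
        return "Selected"
    # Might work, but proof for the exact operating context is missing.
    return "Experimental only"
```

Anything other than `Selected` for a lead candidate should then trigger the blocking behavior described at the end of this step.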
+
+**Save action**: Write `06_component_fit_matrix.md` containing both 7.5.1 (top-level) and 7.5.2 (per-candidate sub-matrices).
+
+**BLOCKING**: If any lead candidate has ❌, ❓, `Experimental only`, `Rejected`, or `Needs user decision` status, do not silently proceed. Ask the user, or promote a different candidate that satisfies the `Selected` rules above.
+
+---
+
 ### Step 8: Deliverable Formatting

 Make the output **readable, traceable, and actionable**.
@@ -10,12 +10,21 @@

 [Architecture solution that meets restrictions and acceptance criteria.]

+> **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical research outputs (concept comparison, market/policy report, investigation answer), this Architecture section may be replaced with a comparison/analysis section that does not use these columns; or the columns may be marked `N/A` per row when the row describes a non-technical "component" (a process, a policy, an organizational construct). For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
+
 ### Component: [Component Name]

-| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
-|----------|-------|-----------|-------------|-------------|----------|------|-----|
-| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
-| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
+| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
+|----------|-------|--------------------|-----------|-------------|-------------|----------|------|-------------------------|-----|
+| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |
+
+**Exact-fit evidence**:
+- Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
+- Evidence: [Fact # / Source #]
+- Disqualifiers: [none or list]
+- Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` § <Candidate Name>
+- API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected

 [Repeat per component]

@@ -13,12 +13,21 @@

 [Architecture solution that meets restrictions and acceptance criteria.]

+> **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical assessment outputs (e.g., reassessing a policy approach, comparing organizational designs), this Architecture section may be replaced with the assessment content that does not use these columns; or the columns may be marked `N/A` per row for non-technical "components". For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
+
 ### Component: [Component Name]

-| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
-|----------|-------|-----------|-------------|-------------|----------|------------|-----|
-| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
-| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
+| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
+|----------|-------|--------------------|-----------|-------------|-------------|----------|------------|-------------------------|-----|
+| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |
+
+**Exact-fit evidence**:
+- Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
+- Evidence: [Fact # / Source #]
+- Disqualifiers: [none or list]
+- Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` § <Candidate Name>
+- API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected

 [Repeat per component]

@@ -22,7 +22,7 @@ test-run has two modes. The caller passes the mode explicitly; if missing, defau
 | Mode | Scope | Typical caller | Input artifacts |
 |------|-------|---------------|-----------------|
 | `functional` (default) | Unit / integration / blackbox tests — correctness | autodev Steps that verify after Implement Tests or Implement | `scripts/run-tests.sh`, `_docs/02_document/tests/environment.md`, `_docs/02_document/tests/blackbox-tests.md` |
-| `perf` | Performance / load / stress / soak tests — latency, throughput, error-rate thresholds | autodev greenfield Step 9, existing-code Step 15 (pre-deploy) | `scripts/run-performance-tests.sh`, `_docs/02_document/tests/performance-tests.md`, AC thresholds in `_docs/00_problem/acceptance_criteria.md` |
+| `perf` | Performance / load / stress / soak tests — latency, throughput, error-rate thresholds | autodev greenfield Step 15, existing-code Step 15 (pre-deploy) | `scripts/run-performance-tests.sh`, `_docs/02_document/tests/performance-tests.md`, AC thresholds in `_docs/00_problem/acceptance_criteria.md` |

 Direct user invocation (`/test-run`) defaults to `functional`. If the user says "perf tests", "load test", "performance", or passes a performance scenarios file, run `perf` mode.
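
A minimal sketch of the mode selection described above. The function name and arguments are hypothetical, and recognizing a "performance scenarios file" by the `performance-tests.md` filename is an assumption based on the input artifacts in the table.

```python
PERF_PHRASES = ("perf tests", "load test", "performance")
VALID_MODES = ("functional", "perf")


def select_mode(explicit_mode=None, user_text="", scenario_files=()):
    """Pick the test-run mode; an explicit mode from the caller always wins."""
    if explicit_mode is not None:
        if explicit_mode not in VALID_MODES:
            raise ValueError(f"unknown test-run mode: {explicit_mode!r}")
        return explicit_mode
    text = user_text.lower()
    if any(phrase in text for phrase in PERF_PHRASES):
        return "perf"
    # Assumption: a performance scenarios file is identified by its filename.
    if any(name.endswith("performance-tests.md") for name in scenario_files):
        return "perf"
    return "functional"  # documented default
```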

@@ -32,6 +32,17 @@ After selecting a mode, read its corresponding workflow below; do not mix them.

 ## Functional Mode

+### 0. System-Under-Test Reality Gate
+
+Before accepting any functional, blackbox, or e2e result as a pass, verify what the tests actually exercised.
+
+1. If `_docs/00_problem/input_data/expected_results/results_report.md` exists, at least one e2e/blackbox run must compare actual product outputs against that mapping or the machine-readable files it references.
+2. Stubs are allowed only for external systems outside the product boundary: flight controller/SITL, QGC observer, satellite-provider/Suite service, physical Jetson hardware, physical camera, unavailable licensed datasets, and network services.
+3. Stubs, fakes, deterministic fallbacks, monkeypatches, or direct replacement of internal product modules are not allowed for the behavior under test. Internal examples include VIO, safety/anchor wrapper, satellite retrieval, anchor verification, tile manager, MAVLink output adapter, FDR, and the A-Z localization pipeline.
+4. If tests pass only because an internal module is fake/scaffolded, classify the run as **failed** with category `missing product implementation`.
+5. If a scenario is blocked because external hardware/data is absent, verify the production code path exists before accepting the block as legitimate. Missing internal production code is not an environment block.
+6. If the test runner writes CSV/Markdown reports, inspect them. A zero exit code is not enough; blocked/internal-stubbed scenarios still require classification.
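
The gate's classification logic can be sketched as follows. All identifiers are illustrative: the short names in `EXTERNAL_SYSTEMS` stand in for the external systems listed in rule 2, and the result strings other than `missing product implementation` are placeholders.

```python
from dataclasses import dataclass
from typing import Optional

# External systems that rule 2 allows to be stubbed (identifiers are illustrative).
EXTERNAL_SYSTEMS = frozenset({
    "flight_controller_sitl", "qgc_observer", "satellite_provider",
    "jetson_hardware", "physical_camera", "licensed_dataset", "network_service",
})


@dataclass
class ScenarioResult:
    name: str
    passed: bool
    stubbed: frozenset = frozenset()     # modules replaced by stubs or fakes
    blocked_by: Optional[str] = None     # missing dependency, if the scenario was blocked
    production_path_exists: bool = True  # rule 5: real code path is implemented


def classify(result, external=EXTERNAL_SYSTEMS):
    """Classify one scenario according to the reality gate."""
    if set(result.stubbed) - external:
        # An internal product module was faked (rules 3 and 4).
        return "failed: missing product implementation"
    if result.blocked_by is not None:
        if result.blocked_by in external and result.production_path_exists:
            return "blocked: external dependency absent"
        return "failed: missing product implementation"
    return "passed" if result.passed else "failed"
```

Feeding the runner's CSV/Markdown report rows through such a classifier is one way to satisfy rule 6 without relying on the exit code.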
+
 ### 1. Detect Test Runner

 Check in order — first match wins:
@@ -94,7 +105,7 @@ Categorize skips as: **explicit skip (dead code)**, **runtime skip (unreachable)

 ### 5. Handle Outcome

-**All tests pass, zero skipped** → return success to the autodev for auto-chain.
+**All tests pass, zero skipped, and the System-Under-Test Reality Gate passes** → return success to the autodev for auto-chain.

 **Any test fails or errors** → this is a **blocking gate**. Never silently ignore failures. **Always investigate the root cause before deciding on an action.** Read the failing test code, read the error output, check service logs if applicable, and determine whether the bug is in the test or in the production code.

@@ -95,7 +95,7 @@ Examples:

 File: `expected_results/image_01_detections.json`

-```json
+```json
 {
  "input": "image_01.jpg",
  "expected": {
@@ -119,7 +119,7 @@ File: `expected_results/image_01_detections.json`
    ]
  }
 }
-```
+```
 ```

 ---
@@ -1,15 +0,0 @@
-## Summary
-[1-3 bullet points describing the change]
-
-## Related Tasks
-[JIRA-ID links]
-
-## Testing
- [ ] Unit tests pass
- [ ] Integration tests pass
- [ ] Manual testing done (if applicable)
-
-## Checklist
- [ ] No new linter warnings
- [ ] No secrets committed
- [ ] API docs updated (if applicable)
@@ -1 +0,0 @@
-.DS_Store
@@ -1,50 +1,108 @@
-# Position Accuracy
+# Acceptance Criteria

- The system should determine GPS coordinates of frame centers for 80% of photos within 50m error compared to real GPS
- The system should determine GPS coordinates of frame centers for 60% of photos within 20m error compared to real GPS
- Maximum cumulative VO drift between satellite correction anchors should be less than 100 meters
- System should report a confidence score per position estimate (high = satellite-anchored, low = VO-extrapolated with drift)
+> Last revised 2026-05-07 (cleanup pass: stripped algorithm/library/parameter implementation details; renamed source label `vo_extrapolated` → `visual_propagated`; broadened FC scope to ArduPilot + iNav).
+> Subsequent revision 2026-05-07 (post-SQ6 research): AC-4.3 reworded to acknowledge that no single message type is accepted by both ArduPilot Plane and iNav — per-FC interface is named explicitly (MAVLink `GPS_INPUT` for ArduPilot Plane, MSP2 `MSP2_SENSOR_GPS` for iNav). Rationale and L1 sources in `_docs/00_research/02_fact_cards.md` SQ6 / `_docs/00_research/01_source_registry.md` Sources #4, #9, #10, #12, #13.
+> See git history for prior versions.

-# Image Processing Quality
+## Position Accuracy
+- **AC-1.1** — Frame-center GPS within **50 m** of true GPS for **≥80%** of normal-flight photos.
+- **AC-1.2** — Frame-center GPS within **20 m** of true GPS for **≥50%** of normal-flight photos.
+- **AC-1.3** — Cumulative drift between two consecutive satellite-anchored fixes: **<100 m** visual-only / **<50 m** with IMU fused. Measured as ‖propagated centre − next anchor centre‖ at anchor fix. Every estimate carries `last_satellite_anchor_age_ms`; validation binned by anchor age. The solution must define the max anchor age beyond which estimates degrade to `visual_propagated` / `dead_reckoned` with monotonically growing covariance.
+- **AC-1.4** — Each estimate reports: 95% covariance ellipse semi-major axis (m) AND a label `{satellite_anchored, visual_propagated, dead_reckoned}`.
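
For reference, the 95% covariance ellipse semi-major axis in AC-1.4 can be computed from a 2×2 horizontal covariance. This is a sketch that assumes Gaussian horizontal error with variances in m²; the function name is hypothetical and the criteria do not prescribe this formula.

```python
import math


def semi_major_axis_95(sxx, syy, sxy, confidence=0.95):
    """Semi-major axis (m) of the confidence ellipse for covariance [[sxx, sxy], [sxy, syy]]."""
    # Chi-square quantile for 2 degrees of freedom has the closed form -2*ln(1 - p).
    k = -2.0 * math.log(1.0 - confidence)
    # Largest eigenvalue of the symmetric 2x2 covariance matrix.
    mean = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    lam_max = mean + radius
    return math.sqrt(k * lam_max)
```

With an isotropic 5 m standard deviation (`sxx = syy = 25`, `sxy = 0`) the result is about 12.24 m, roughly 2.45 standard deviations.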

- Image Registration Rate > 95% for normal flight segments. The system can find enough matching features to confidently calculate the camera's 6-DoF pose and stitch that image into the trajectory
- Mean Reprojection Error (MRE) < 1.0 pixels
+## Image Processing Quality
+- **AC-2.1a — Frame-to-frame registration**: succeeds for **>95%** of normal flight segments (defined: nadir ±10° bank/pitch, ≥40% prior-frame overlap, daytime, usable texture, no full visual blackout).
+- **AC-2.1b — Satellite-anchor registration**: measured separately from AC-2.1a; must satisfy AC-1.1/1.2 accuracy, AC-2.2 cross-domain MRE, AC-8.2 freshness, AC-8.6 retrieval behaviour.
+- **AC-2.2** — Mean Reprojection Error: **<1.0 px** frame-to-frame; **<2.5 px** satellite-anchored cross-domain.

-# Resilience & Edge Cases
+## Resilience & Edge Cases
+- **AC-3.1** — Tolerate up to **350 m** outliers between two consecutive photos (airframe tilt up to ±20°).
+- **AC-3.2** — Tolerate sharp turns: <5% overlap, <200 m drift, <70° heading change. Sharp-turn frames may fail frame-to-frame registration; recovery via satellite-reference re-localization.
+- **AC-3.3** — Handle **≥3 disconnected segments** per flight via satellite-reference re-localization. Core capability, not degraded mode.
+- **AC-3.4** — On ≥3 consecutive frames AND ≥2 s without a position, request operator re-loc via telemetry; continue dead-reckoned propagation; FC uses last known + IMU extrapolation.
+- **AC-3.5 — Visual blackout + spoofed GPS** (clouds/occlusion/whiteout while FC reports GPS denial/spoof):
+  - Switch label to `{dead_reckoned}` within ≤1 processed frame OR ≤400 ms.
+  - Reject spoofed GPS as estimator input.
+  - Propagate from last trusted state + FC IMU/attitude/airspeed/altitude until visual or satellite anchoring recovers.
+  - Covariance grows monotonically.
+  - `horiz_accuracy` field of the GPS message to the FC must not under-report the 95% covariance semi-major axis.
+  - `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT to QGroundControl at 1–2 Hz.

- The system should correctly continue work even in the presence of up to 350m outlier between 2 consecutive photos (due to tilt of the plane)
- System should correctly continue work during sharp turns, where the next photo doesn't overlap at all or overlaps less than 5%. The next photo should be within 200m drift and at an angle of less than 70 degrees. Sharp-turn frames are expected to fail VO and should be handled by satellite-based re-localization
- System should operate when UAV makes a sharp turn and next photos have no common points with previous route. It should figure out the location of the new route segment and connect it to the previous route. There could be more than 2 such disconnected segments, so this strategy must be core to the system
- In case the system cannot determine the position of 3 consecutive frames by any means, it should send a re-localization request to the ground station operator via telemetry link. While waiting for operator input, the system continues attempting VO/IMU dead reckoning and the flight controller uses last known position + IMU extrapolation
+## Real-Time Onboard Performance
+- **AC-4.1** — End-to-end latency (camera capture → GPS to FC) **<400 ms p95**. Up to ~10% frames may drop under sustained load.
+- **AC-4.2** — Memory **<8 GB shared** on Jetson Orin Nano Super.
+- **AC-4.3 — FC output contract**: WGS84 coordinates delivered to each supported FC via that FC's documented external-positioning interface — MAVLink `GPS_INPUT` for ArduPilot Plane, MSP2 `MSP2_SENSOR_GPS` for iNav. Honest covariance is carried in the field each FC uses for outlier rejection (under-reported covariance is a defect, see AC-NEW-4). Source-label semantics per AC-1.4 are emitted out-of-band via the FC-appropriate channel (e.g. MAVLink `STATUSTEXT` / `NAMED_VALUE_FLOAT` for ArduPilot; MSP equivalent for iNav). Where the FC supports it, implementation may also emit an optional auxiliary external-odometry message when the estimator delivers full 6-DoF covariance + quality above a configured threshold. Per-FC parameter wiring (EKF source-set selection on ArduPilot; GPS provider / UART role on iNav), FDR-side message variants, and out-of-band channel choice remain design decisions.
+- **AC-4.4** — Estimates streamed frame-by-frame; no batching/delay.
+- **AC-4.5** — System may refine prior estimates and emit corrections.

-# Real-Time Onboard Performance
+## Startup & Failsafe
+- **AC-5.1** — Initialise from FC EKF's last valid GPS + IMU-extrapolated position at GPS denial.
+- **AC-5.2** — On >3 s without estimate, FC falls back to IMU-only dead reckoning; system logs failure. Verify in production param sets of each supported FC (ArduPilot Plane SITL + iNav SITL or equivalent).
+- **AC-5.3** — On companion reboot mid-flight, re-initialise from FC's current IMU-extrapolated position. Cold-start TTFF in AC-NEW-1.

- Less than 400ms end-to-end per frame: from camera capture to GPS coordinate output to the flight controller (camera shoots at ~3fps)
- Memory usage should stay below 8GB shared memory (Jetson Orin Nano Super — CPU and GPU share the same 8GB LPDDR5 pool)
- The system must output calculated GPS coordinates directly to the flight controller via MAVLink GPS_INPUT messages (using MAVSDK)
- Position estimates are streamed to the flight controller frame-by-frame; the system does not batch or delay output
- The system may refine previously calculated positions and send corrections to the flight controller as updated estimates
+## Ground Station & Telemetry
+- **AC-6.1** — Position estimates + confidence stream to QGroundControl over MAVLink at **1–2 Hz** downsampled (high-rate stays on local FDR).
+- **AC-6.2** — GCS may send commands (e.g., operator re-loc hint) via standard MAVLink (`STATUSTEXT`, `NAMED_VALUE_FLOAT`) or a custom dialect.
+- **AC-6.3** — Output coordinates in WGS84.

-# Startup & Failsafe
+## Object Localization (AI Camera)
+- **AC-7.1** — AI systems may request GPS for AI-camera-detected objects. Accuracy consistent with frame-center accuracy in level flight (bank/pitch <5°). In maneuvering flight, error bounded by `altitude × |sin(unknown_bank_or_pitch)|` and that bound is published alongside the estimate.
+- **AC-7.2** — Object coordinates computed trigonometrically from current UAV position, AI-camera gimbal angle, zoom, and altitude. Flat-terrain assumption.
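
The AC-7.1 maneuvering-flight bound can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and the 400 m altitude is an assumption taken from the `Height: 400m` restriction rather than anything AC-7.1 itself fixes.

```python
import math


def attitude_error_bound_m(altitude_m: float, unknown_tilt_deg: float) -> float:
    """AC-7.1 bound: altitude x |sin(unknown bank or pitch)|, in metres.

    `unknown_tilt_deg` is the part of bank/pitch that is not compensated
    (hypothetical parameter name, not defined by the spec).
    """
    return altitude_m * abs(math.sin(math.radians(unknown_tilt_deg)))


if __name__ == "__main__":
    # Assumed flight altitude of 400 m (from the restrictions file).
    for tilt_deg in (5.0, 10.0, 20.0):
        bound = attitude_error_bound_m(400.0, tilt_deg)
        print(f"{tilt_deg:4.0f} deg -> {bound:.1f} m")
```

At the assumed 400 m altitude, the 5° level-flight limit already corresponds to a bound of roughly 35 m, which is why the bound is published alongside the estimate instead of being treated as negligible.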

- The system initializes using the last known valid GPS position from the flight controller before GPS denial begins
- If the system completely fails to produce any position estimate for more than N seconds (TBD), the flight controller should fall back to IMU-only dead reckoning and the system should log the failure
- On companion computer reboot mid-flight, the system should attempt to re-initialize from the flight controller's current IMU-extrapolated position
+## Satellite Reference Imagery
+- **AC-8.1** — Imagery via Azaion Suite Satellite Service (offline cache interface; no direct commercial-provider calls). Cache-interface resolution ≥0.5 m/px, ideally 0.3 m/px.
+- **AC-8.2** — Tile freshness: <6 mo (active-conflict sectors), <12 mo (stable rear). Older → reject or downgrade (AC-NEW-6).
+- **AC-8.3** — Imagery pre-loaded onto companion before flight; offline preprocessing time not time-critical. Pre-extracted descriptors/indices count against the cache budget unless explicitly carved out.
+- **AC-8.4** — Mid-flight tile generation: continuously orthorectify nav-camera frames into basemap-projected tiles, deduplicated (latest/highest-quality wins). Upload to Service on landing. Each uploaded tile carries quality metadata sufficient for the Service's ingest pipeline (AC-NEW-7).
+- **AC-8.5** — No raw nav-camera or AI-camera frames retained in normal operation; tiles are the only persistent imagery. Forensic exception: ≤0.1 Hz thumbnail log of frames that failed tile generation, within FDR budget (AC-NEW-3).
+- **AC-8.6 — Satellite-anchor relocalization robustness**:
+  - **Scale-ratio**: any UAV-frame ground footprint at the deployment altitude band must be retrievable from the cache regardless of internal tiling/indexing.
+  - **Scene change in active-conflict sectors**: cratering / building destruction / road realignment must not collapse retrieval recall, measured against a labelled change-pair dataset over season-matched tiles. No `satellite_anchored` label on stale-tile match (per AC-NEW-6).
+  - **Compute & latency**: relocalization must remain inside AC-4.1 latency + AC-4.2 memory budgets under both steady-state and re-loc-trigger workloads.

-# Ground Station & Telemetry
+## Additional AC

- Position estimates and confidence scores should be streamed to the ground station via telemetry link for operator situational awareness
- The ground station can send commands to the onboard system (e.g., operator-assisted re-localization hint with approximate coordinates)
- Output coordinates in WGS84 format
+### AC-NEW-1 — Cold-start TTFF
+**Statement.** From companion boot, first valid external-position MAVLink frame **<30 s p95**, given an IMU-extrapolated initial position from FC EKF.
+**Why.** Mid-flight reboot is realistic on 8 h missions; FC dead-reckons during the gap, ~500 m drift max at 60 km/h.
+**Validation.** Cold-boot 50× with simulated FC pose; measure boot → first frame; pass = 95th percentile <30 s.
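
The "~500 m" figure in the rationale is consistent with the distance flown during a 30 s gap at 60 km/h. A minimal check, assuming constant ground speed and treating distance travelled as the worst-case drift (the function name is hypothetical):

```python
def distance_during_gap_m(ground_speed_kmh: float, gap_s: float) -> float:
    """Distance flown while no external position is emitted.

    Assumes constant ground speed. This is an upper bound on drift only
    if the FC's dead reckoning contributes nothing during the gap.
    """
    metres_per_second = ground_speed_kmh / 3.6
    return metres_per_second * gap_s


if __name__ == "__main__":
    # 60 km/h for the 30 s TTFF budget from AC-NEW-1.
    print(round(distance_during_gap_m(60.0, 30.0), 1))
```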

-# Object Localization
+### AC-NEW-2 — Spoofing-promotion latency
+**Statement.** When the FC signals GPS denial/spoof, promote the onboard estimate to the FC's primary position source in **<3 s p95**.
+**Why.** Without this, FC may follow a spoofed source while a valid onboard estimate sits idle; 3 s rides out one-frame anomalies but blocks malicious heading changes.
+**Validation.** SITL on each supported FC (ArduPilot Plane + iNav, production param sets): inject false GPS, measure spoof onset → promotion; pass = 95th percentile <3 s on both.

- Other onboard AI systems can request GPS coordinates of objects detected by the AI camera
- The GPS-Denied system calculates object coordinates trigonometrically using: current UAV GPS position (from GPS-Denied), known AI camera angle, zoom, and current flight altitude. Flat terrain is assumed
- Accuracy is consistent with the frame-center position accuracy of the GPS-Denied system
+### AC-NEW-3 — Flight Data Recorder
+**Statement.** Per flight, retain to NVM: per-frame estimates with covariance + source-label; FC IMU traces (full rate); all emitted external-position MAVLink frames; raw MAVLink stream (tlog); system health (CPU/GPU/temp/throttle); mid-flight tiles (AC-8.4); ≤0.1 Hz thumbnail log of failed tile-gen frames. **No raw nav-cam/AI-cam frames** (AC-8.5). Cap **64 GB / flight**; oldest segment dropped first on rollover.
+**Why.** Tiles + telemetry + IMU reproduce the mission, feed next mission's cache (AC-8.4), explain false-position events (AC-NEW-4). Raw frames are large + redundant once tiles exist.
+**Validation.** 8 h synthetic load (3 Hz nav frames replayed); assert FDR ≤64 GB; no payload class silently dropped without a logged rollover.

-# Satellite Reference Imagery
+### AC-NEW-4 — False-position safety budget
+**Statement.** Per flight: **P(error >500 m) <0.1 %**, **P(error >1 km) <0.01 %**.
+**Why.** A single 1-km-off frame can fly the UAV outside the geofence; covariance carried in the MAVLink message is the FC's only defense.
+**Validation.** Monte Carlo over a public aerial-localization dataset (e.g. AerialVL S03) + own recorded flights; report error CDF; pass = both probabilities below budget across ≥100 flights.

- Satellite reference imagery resolution must be at least 0.5 m/pixel, ideally 0.3 m/pixel
- Satellite imagery for the operational area should be less than 2 years old where possible
- Satellite imagery must be pre-processed and loaded onto the companion computer before flight. Offline preprocessing time is not time-critical (can take minutes/hours)
+### AC-NEW-5 — Operational environmental envelope
+**Statement.** Operating temp **−20 °C to +50 °C**; vibration/shock per RTCA DO-160G low-altitude UAV-class. Cooling sustains **25 W** at the upper temp for the full **8-hour duty cycle** without throttling.
+**Why.** Without this, all latency/accuracy AC are conditional on a benign thermal day; +35 °C bay temps cause Jetson to throttle to 15 W, collapsing the 400 ms latency budget.
+**Validation.** Hot-soak: 25 W @ +50 °C for 8 h, no throttle. Cold-soak: −20 °C cold-start within AC-NEW-1.
+
+### AC-NEW-6 — Imagery freshness enforcement
+**Statement.** System rejects (or downgrades) any tile whose capture date violates AC-8.2. Mid-flight tiles (AC-8.4) not yet uploaded are timestamped current and treated as fresh.
+**Why.** Stale tiles are the dominant cross-view-matching failure mode in active-conflict sectors; a confident match on a stale tile is worse than no match.
+**Validation.** Inject synthetic-age tiles; verify rejection/decay matches spec; verify stale-tile match never produces `satellite_anchored`.
+
+### AC-NEW-7 — Cache-poisoning safety budget
+**Statement.** Per flight, across all onboard tiles written (AC-8.4): **P(geo-misalign >30 m) <1 %**, **P(>100 m) <0.1 %**.
+**Why.** Onboard tiles feed back into the Service basemap (AC-8.4). A bad onboard pose with optimistic covariance writes a misaligned tile that becomes the next flight's anchor — cross-flight error compounding that AC-NEW-4 doesn't capture.
+**External-dependency note.** The Suite Satellite Service is expected to operate a multi-flight ingest-side voting layer that gates onboard-tile promotion to "trusted basemap" until multiple independent flights agree on geo-alignment. Voting algorithm is the Service's concern; onboard's job (AC-8.4) is to publish per-tile quality metadata sufficient for that layer. End-to-end AC-NEW-7 evidence depends on this Service contract.
+**Validation.** Multi-flight Monte Carlo replay over public datasets (e.g. AerialVL, AerialExtreMatch) + own flights, with synthetic over-confidence injection (deflate covariance ×1.5–3): assert both probabilities below budget across ≥100 flights. Independently exercise the Service-side voting contract.
+
+### AC-NEW-8 — Visual blackout + GPS spoofing degraded mode
+**Statement.** When the navigation camera is fully unusable AND FC reports GPS denial/spoof:
+- continue emitting external-position MAVLink frames from IMU-only propagation for **≤30 s** after the last trusted anchor (or until covariance trips the fail threshold);
+- label every estimate `{dead_reckoned}`; degrade MAVLink fix-quality to "2D fix or worse" when 95% covariance semi-major axis **>100 m**;
+- escalate to "no fix" (`horiz_accuracy=999.0`) + `VISUAL_BLACKOUT_FAILSAFE` STATUSTEXT when 95% covariance semi-major axis >**500 m** OR blackout >**30 s** without a trusted re-anchor;
+- never promote spoofed real-GPS back into the estimator unless FC GPS health stable + non-spoofed for **≥10 s** AND a visual/satellite consistency check has succeeded.
+**Why.** During cloud/whiteout + spoofing, no honest correction is available; only safe behaviour is IMU-only dead reckoning with rapidly-growing uncertainty, never pretending stale visual or spoofed GPS remains valid.
+**Validation.** SITL/replay on each FC: inject 5 s / 15 s / 35 s blackouts while spoofing GPS; assert mode transition ≤400 ms, spoofed GPS ignored, covariance grows monotonically, MAVLink fields degrade at thresholds, recovery only via trusted anchor or 10-s GPS-health + visual-consistency gate.
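
The AC-NEW-8 degradation ladder can be read as a small decision function. The sketch below is one possible interpretation, not the design: the type, field names, and the fix-quality strings are hypothetical, and the "nominal" state below 100 m is an assumption, since AC-NEW-8 only specifies the degraded and failed states. The `VISUAL_BLACKOUT_IMU_ONLY` text comes from AC-3.5.

```python
from dataclasses import dataclass

DEGRADE_COV_M = 100.0          # AC-NEW-8: degrade to "2D fix or worse" above this
FAIL_COV_M = 500.0             # AC-NEW-8: escalate to "no fix" above this
MAX_BLACKOUT_S = 30.0          # AC-NEW-8: maximum IMU-only propagation window
NO_FIX_HORIZ_ACCURACY = 999.0  # AC-NEW-8: sentinel reported with "no fix"


@dataclass(frozen=True)
class BlackoutOutput:
    fix_quality: str         # hypothetical labels: "nominal", "2d_or_worse", "no_fix"
    horiz_accuracy_m: float  # value reported to the FC
    statustext: str          # STATUSTEXT sent to the ground station


def blackout_output(cov95_semi_major_m: float, blackout_s: float) -> BlackoutOutput:
    """Map covariance and blackout duration to the AC-NEW-8 output state."""
    if cov95_semi_major_m > FAIL_COV_M or blackout_s > MAX_BLACKOUT_S:
        return BlackoutOutput("no_fix", NO_FIX_HORIZ_ACCURACY,
                              "VISUAL_BLACKOUT_FAILSAFE")
    fix_quality = "2d_or_worse" if cov95_semi_major_m > DEGRADE_COV_M else "nominal"
    # Reporting the 95% semi-major axis itself never under-reports it (AC-3.5).
    return BlackoutOutput(fix_quality, cov95_semi_major_m,
                          "VISUAL_BLACKOUT_IMU_ONLY")


if __name__ == "__main__":
    for cov_m, t_s in ((40.0, 5.0), (150.0, 15.0), (150.0, 35.0), (600.0, 10.0)):
        print(cov_m, t_s, blackout_output(cov_m, t_s))
```

The failure checks run first, so a 35 s blackout escalates to "no fix" even when covariance is still below 500 m, matching the OR in the statement.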
@@ -1,8 +1,2 @@
- Height
-  - 400m
- Camera:
-  - Name: ADTi Surveyor Lite 26S v2
-  - Resolution: 26MP
-  - Image resolution: 6252*4168
-  - Focal length: 25mm
-  - Sensor width: 23.5
+- Height: 400m
+- Camera: ADTi Surveyor Lite 20MP 20L V1
@@ -1,166 +1,97 @@
-# Expected Results
+# Expected Results Mapping

-Maps every input data item to its quantifiable expected result.
-Tests use this mapping to compare actual system output against known-correct answers.
+## Scope

-## Result Format Legend
+`coordinates.csv` is the current source of truth for the provided still-image nadir set. It gives expected WGS84 frame-center coordinates for `AD000001.jpg` through `AD000060.jpg`.

-| Result Type | When to Use | Example |
-|-------------|-------------|---------|
-| Exact value | Output must match precisely | `fix_type: 3`, `satellites_visible: 10` |
-| Tolerance range | Numeric output with acceptable variance | `lat: 48.275292 ± 50m` |
-| Threshold | Output must exceed or stay below a limit | `latency < 400ms`, `memory < 8GB` |
-| Pattern match | Output must match a string/regex pattern | `RELOC_REQ: last_lat=.* last_lon=.* uncertainty=.*m` |
-| File reference | Complex output compared against a reference file | `match expected_results/position_accuracy.csv` |
-| Set/count | Output must contain specific items or counts | `registered_frames / total_frames > 0.95` |
+This data is sufficient for black-box frame-center geolocation tests against still images. The Derkachi representative fixture in `input_data/flight_derkachi/` adds cropped nadir video plus synchronized `SCALED_IMU2` and `GLOBAL_POSITION_INT` telemetry. It is sufficient for fixture validation, video/telemetry synchronization, replay, latency, VIO smoke tests, and trajectory comparison against the tlog GPS path. It is not sufficient by itself for final production accuracy because raw camera calibration, lens distortion, and exact camera-to-body calibration are still pending.

-## Comparison Methods
+## Pass / Fail Rules

-| Method | Description | Tolerance Syntax |
-|--------|-------------|-----------------|
-| `numeric_tolerance` | abs(actual - expected) ≤ tolerance | `± <value>` |
-| `threshold_min` | actual ≥ threshold | `≥ <value>` |
-| `threshold_max` | actual ≤ threshold | `≤ <value>` |
-| `percentage` | percentage of items meeting criterion | `≥ N%` |
-| `exact` | actual == expected | N/A |
-| `regex` | actual matches regex pattern | regex string |
-| `file_reference` | compare against reference file | file path |
+- **Normal frame-center geolocation**: estimated frame center is within 50 m of the expected WGS84 coordinate.
+- **Stretch accuracy bin**: estimated frame center is within 20 m of the expected WGS84 coordinate.
+- **Dataset aggregate**: at least 80% of mapped images pass the 50 m threshold and at least 50% pass the 20 m threshold.
+- **Output shape**: each result must include image name, estimated `lat`, estimated `lon`, error in meters, source label, 95% covariance semi-major axis, and `last_satellite_anchor_age_ms`.
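
A minimal sketch of how the rules above could be scored, assuming a spherical-Earth haversine distance is accurate enough at the 20 m to 50 m scale. The function names and the choice of mean Earth radius are assumptions; the document does not prescribe a distance formula.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dataset_passes(errors_m: list[float]) -> bool:
    """Dataset aggregate rule: >=80% within 50 m and >=50% within 20 m."""
    if not errors_m:
        return False
    n = len(errors_m)
    within_50 = sum(e <= 50.0 for e in errors_m) / n
    within_20 = sum(e <= 20.0 for e in errors_m) / n
    return within_50 >= 0.80 and within_20 >= 0.50


if __name__ == "__main__":
    # Distance between the expected centres of AD000001.jpg and AD000002.jpg.
    print(round(haversine_m(48.275292, 37.385220, 48.275001, 37.382922), 1))
```

Each per-image error would be `haversine_m(estimated, expected)`, and the list of those errors feeds `dataset_passes`.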

-## Input → Expected Result Mapping
+## Input To Expected Output Map

-### Position Accuracy (60-image flight sequence)
+### Still-Image Frame Centers

-Ground truth GPS coordinates for each frame are in `coordinates.csv`. The system processes these frames sequentially (simulating a real flight) with corresponding IMU data (200Hz, from SITL ArduPilot or synthetic generation from trajectory) and satellite tile matches. The system outputs estimated GPS coordinates per frame. Expected results compare estimated positions against ground truth.
+| Input image | Expected latitude | Expected longitude | Primary threshold | Stretch threshold |
+|-------------|-------------------|--------------------|-------------------|-------------------|
+| AD000001.jpg | 48.275292 | 37.385220 | <= 50 m | <= 20 m |
+| AD000002.jpg | 48.275001 | 37.382922 | <= 50 m | <= 20 m |
+| AD000003.jpg | 48.274520 | 37.381657 | <= 50 m | <= 20 m |
+| AD000004.jpg | 48.274956 | 37.379004 | <= 50 m | <= 20 m |
+| AD000005.jpg | 48.273997 | 37.379828 | <= 50 m | <= 20 m |
+| AD000006.jpg | 48.272538 | 37.380294 | <= 50 m | <= 20 m |
+| AD000007.jpg | 48.272408 | 37.379153 | <= 50 m | <= 20 m |
+| AD000008.jpg | 48.271992 | 37.377572 | <= 50 m | <= 20 m |
+| AD000009.jpg | 48.271376 | 37.376671 | <= 50 m | <= 20 m |
+| AD000010.jpg | 48.271233 | 37.374806 | <= 50 m | <= 20 m |
+| AD000011.jpg | 48.270334 | 37.374442 | <= 50 m | <= 20 m |
+| AD000012.jpg | 48.269922 | 37.373284 | <= 50 m | <= 20 m |
+| AD000013.jpg | 48.269366 | 37.372134 | <= 50 m | <= 20 m |
+| AD000014.jpg | 48.268759 | 37.370940 | <= 50 m | <= 20 m |
+| AD000015.jpg | 48.268291 | 37.369815 | <= 50 m | <= 20 m |
+| AD000016.jpg | 48.267719 | 37.368469 | <= 50 m | <= 20 m |
+| AD000017.jpg | 48.267461 | 37.367255 | <= 50 m | <= 20 m |
+| AD000018.jpg | 48.266663 | 37.365888 | <= 50 m | <= 20 m |
+| AD000019.jpg | 48.266135 | 37.365460 | <= 50 m | <= 20 m |
+| AD000020.jpg | 48.265574 | 37.364211 | <= 50 m | <= 20 m |
+| AD000021.jpg | 48.264892 | 37.362998 | <= 50 m | <= 20 m |
+| AD000022.jpg | 48.264393 | 37.361086 | <= 50 m | <= 20 m |
+| AD000023.jpg | 48.263803 | 37.361028 | <= 50 m | <= 20 m |
+| AD000024.jpg | 48.263014 | 37.359878 | <= 50 m | <= 20 m |
+| AD000025.jpg | 48.262635 | 37.358277 | <= 50 m | <= 20 m |
+| AD000026.jpg | 48.261819 | 37.357116 | <= 50 m | <= 20 m |
+| AD000027.jpg | 48.261182 | 37.355907 | <= 50 m | <= 20 m |
+| AD000028.jpg | 48.260727 | 37.354723 | <= 50 m | <= 20 m |
+| AD000029.jpg | 48.260117 | 37.353469 | <= 50 m | <= 20 m |
+| AD000030.jpg | 48.259677 | 37.352165 | <= 50 m | <= 20 m |
+| AD000031.jpg | 48.258881 | 37.351376 | <= 50 m | <= 20 m |
+| AD000032.jpg | 48.258425 | 37.349964 | <= 50 m | <= 20 m |
+| AD000033.jpg | 48.258653 | 37.347004 | <= 50 m | <= 20 m |
+| AD000034.jpg | 48.257879 | 37.347711 | <= 50 m | <= 20 m |
+| AD000035.jpg | 48.256777 | 37.348444 | <= 50 m | <= 20 m |
+| AD000036.jpg | 48.255756 | 37.348098 | <= 50 m | <= 20 m |
+| AD000037.jpg | 48.255375 | 37.346549 | <= 50 m | <= 20 m |
+| AD000038.jpg | 48.254799 | 37.345603 | <= 50 m | <= 20 m |
+| AD000039.jpg | 48.254557 | 37.344566 | <= 50 m | <= 20 m |
+| AD000040.jpg | 48.254380 | 37.344375 | <= 50 m | <= 20 m |
+| AD000041.jpg | 48.253722 | 37.343093 | <= 50 m | <= 20 m |
+| AD000042.jpg | 48.254205 | 37.340532 | <= 50 m | <= 20 m |
+| AD000043.jpg | 48.252380 | 37.342112 | <= 50 m | <= 20 m |
+| AD000044.jpg | 48.251489 | 37.343079 | <= 50 m | <= 20 m |
+| AD000045.jpg | 48.251085 | 37.346128 | <= 50 m | <= 20 m |
+| AD000046.jpg | 48.250413 | 37.344034 | <= 50 m | <= 20 m |
+| AD000047.jpg | 48.249414 | 37.343296 | <= 50 m | <= 20 m |
+| AD000048.jpg | 48.249114 | 37.346895 | <= 50 m | <= 20 m |
+| AD000049.jpg | 48.250241 | 37.347741 | <= 50 m | <= 20 m |
+| AD000050.jpg | 48.250974 | 37.348379 | <= 50 m | <= 20 m |
+| AD000051.jpg | 48.251528 | 37.349468 | <= 50 m | <= 20 m |
+| AD000052.jpg | 48.251873 | 37.350485 | <= 50 m | <= 20 m |
+| AD000053.jpg | 48.252161 | 37.351491 | <= 50 m | <= 20 m |
+| AD000054.jpg | 48.252685 | 37.352343 | <= 50 m | <= 20 m |
+| AD000055.jpg | 48.253268 | 37.353119 | <= 50 m | <= 20 m |
+| AD000056.jpg | 48.253767 | 37.354246 | <= 50 m | <= 20 m |
+| AD000057.jpg | 48.254329 | 37.354946 | <= 50 m | <= 20 m |
+| AD000058.jpg | 48.254874 | 37.355765 | <= 50 m | <= 20 m |
+| AD000059.jpg | 48.255481 | 37.356501 | <= 50 m | <= 20 m |
+| AD000060.jpg | 48.256246 | 37.357485 | <= 50 m | <= 20 m |

-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 1 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | ≥ 80% of frames have position error < 50m from ground truth | percentage | ≥ 80% of frames within 50m | `expected_results/position_accuracy.csv` |
-| 2 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | ≥ 60% of frames have position error < 20m from ground truth | percentage | ≥ 60% of frames within 20m | `expected_results/position_accuracy.csv` |
-| 3 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | Per-frame position output in WGS84 (lat, lon) | numeric_tolerance | each frame ± 100m max (no single frame exceeds 100m error) | `expected_results/position_accuracy.csv` |
-| 4 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | Cumulative VO drift between satellite anchors < 100m | threshold_max | ≤ 100m drift between anchors | N/A |
+### Representative Derkachi Video/IMU Fixture

-### GPS_INPUT Message Correctness
+| Input fixture | Expected validation result | Threshold |
+|---------------|----------------------------|-----------|
+| `flight_derkachi/data_imu.csv` | Telemetry CSV has required `timestamp(ms)`, `Time`, `SCALED_IMU2.*`, and `GLOBAL_POSITION_INT.*` columns; non-empty rows are monotonic from `Time=0.0` to `489.9` | 0 missing required columns; 0 decreasing timestamps; 4,900 non-empty rows |
+| `flight_derkachi/flight_derkachi.mp4` | Video stream is readable as cropped nadir footage for replay | H.264, 880 x 720, 30 fps, approximately 490.07 s |
+| Video/telemetry alignment | Video has 14,700 frames and telemetry has 4,900 rows | Exactly 3 video frames per telemetry row; duration delta <=250 ms |
+| Derkachi trajectory comparison | Replay output can be compared to `GLOBAL_POSITION_INT.lat`, `GLOBAL_POSITION_INT.lon`, `GLOBAL_POSITION_INT.alt`, `GLOBAL_POSITION_INT.relative_alt`, velocity, and heading | Thresholds are calibration-gated; use for smoke/relative trajectory validation until intrinsics and camera-to-body calibration are pinned |
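
The video/telemetry alignment row can be checked with simple arithmetic. The sketch below assumes the telemetry duration is the row count times the mean sample period, with the period inferred from the first and last `Time` values; that definition of "duration delta" is an assumption, as the table does not spell it out, and the function name is hypothetical.

```python
def check_alignment(video_frames: int, telemetry_rows: int,
                    video_duration_s: float,
                    first_time_s: float, last_time_s: float,
                    max_delta_ms: float = 250.0) -> tuple[bool, float]:
    """Return (passes, duration_delta_ms) for the alignment thresholds."""
    if telemetry_rows < 2:
        return False, float("inf")
    # Mean sample period inferred from the telemetry time span.
    period_s = (last_time_s - first_time_s) / (telemetry_rows - 1)
    telemetry_duration_s = telemetry_rows * period_s
    delta_ms = abs(video_duration_s - telemetry_duration_s) * 1000.0
    exact_ratio = video_frames == 3 * telemetry_rows
    return exact_ratio and delta_ms <= max_delta_ms, delta_ms


if __name__ == "__main__":
    # Figures from the fixture table: 14,700 frames, 4,900 rows, ~490.07 s video.
    ok, delta_ms = check_alignment(14_700, 4_900, 490.07, 0.0, 489.9)
    print(ok, round(delta_ms))
```

With the table's figures, the inferred telemetry period is 0.1 s, so the telemetry spans about 490.0 s against about 490.07 s of video.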

-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 5 | Single frame + IMU data | Normal tracking frame with recent satellite match | `fix_type: 3`, `horiz_accuracy: 5-20m`, `satellites_visible: 10`, lat/lon populated | exact (fix_type, sat), numeric_tolerance (accuracy) | fix_type == 3, horiz_accuracy ∈ [1, 50] | N/A |
-| 6 | Frame sequence, no satellite match for >30s | VO-only tracking, no recent satellite anchor | `fix_type: 3`, `horiz_accuracy: 20-50m` | exact (fix_type), range (accuracy) | fix_type == 3, horiz_accuracy ∈ [20, 100] | N/A |
-| 7 | Frame sequence, VO lost + no satellite | IMU-only dead reckoning | `fix_type: 2`, `horiz_accuracy: 50-200m+` (growing over time) | exact (fix_type), threshold_min (accuracy) | fix_type == 2, horiz_accuracy ≥ 50 | N/A |
-| 8 | VO lost + 3 consecutive satellite failures | Total position failure | `fix_type: 0`, `horiz_accuracy: 999.0` | exact | fix_type == 0, horiz_accuracy == 999.0 | N/A |
-| 9 | Any valid frame | GPS_INPUT output rate | GPS_INPUT messages at 5-10Hz continuous | range | 5 ≤ rate_hz ≤ 10 | N/A |
+## Known Gaps

-### Confidence Tier Transitions
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 10 | Frame with satellite match <30s ago, covariance <400m² | HIGH confidence conditions | Confidence tier: HIGH, SSE confidence: "HIGH" | exact | N/A | N/A |
-| 11 | Frame with cuVSLAM OK, no satellite match >30s | MEDIUM confidence conditions | Confidence tier: MEDIUM, SSE confidence: "MEDIUM" | exact | N/A | N/A |
-| 12 | Frame with cuVSLAM lost, IMU-only | LOW confidence conditions | Confidence tier: LOW, SSE confidence: "LOW" | exact | N/A | N/A |
-| 13 | 3+ consecutive total failures | FAILED conditions | Confidence tier: FAILED, SSE confidence: "FAILED", fix_type: 0 | exact | N/A | N/A |
-
-### Image Registration & Visual Odometry
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 14 | 60 sequential flight images | Normal flight (no sharp turns) | Image registration rate ≥ 95% (≥ 57 of 60 registered) | percentage | ≥ 95% | N/A |
-| 15 | 60 sequential flight images | Normal flight images | Mean reprojection error < 1.0 pixels | threshold_max | MRE < 1.0 px | N/A |
-
-### Disconnected Route Segments & Sharp Turns
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 16 | Frames 32-43 from coordinates.csv | Trajectory with direction change (turn area) | System continues producing position estimates through the turn | threshold_min | ≥ 1 position output per frame | N/A |
-| 17 | Simulated consecutive frames with 350m gap | Outlier between 2 consecutive photos due to tilt | System handles outlier, position estimate not corrupted (error < 100m for next valid frame) | threshold_max | ≤ 100m error after recovery | N/A |
-| 18 | Simulated sharp turn (no overlap, <5% overlap, <70° angle, <200m drift) | Sharp turn where VO fails | Satellite re-localization triggers, position recovered within 3 frames after turn | threshold_max | position error ≤ 50m after re-localization | N/A |
-| 19 | Simulated VO loss + satellite match success | Tracking loss → re-localization | cuVSLAM restarts, ESKF position corrected, tracking_state returns to NORMAL | exact | tracking_state == NORMAL after recovery | N/A |
-
-### 3-Consecutive-Failure Re-Localization
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 20 | Simulated VO loss + 3 satellite match failures | Cannot determine position by any means | Re-localization request sent: `RELOC_REQ: last_lat=.* last_lon=.* uncertainty=.*m` | regex | message matches pattern | N/A |
-| 21 | Re-localization request active | System waiting for operator | GPS_INPUT fix_type=0, system continues IMU prediction, continues satellite matching attempts | exact (fix_type) | fix_type == 0 | N/A |
-| 22 | Operator sends approximate coordinates (lat, lon) | Operator re-localization hint | System uses hint as ESKF measurement (high covariance ~500m), attempts satellite match in new area | threshold_max | position error ≤ 500m initially, ≤ 50m after satellite match | N/A |
-
-### Startup & Handoff
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 23 | System boot with GLOBAL_POSITION_INT available | Normal startup | System reads initial position, initializes ESKF, starts GPS_INPUT output | exact | GPS_INPUT output begins within 60s of boot | N/A |
-| 24 | System boot + first satellite match | Startup validation | First satellite match validates initial position, position error drops | threshold_max | position error ≤ 50m after first satellite match | N/A |
-
-### Mid-Flight Reboot Recovery
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 25 | System process killed mid-flight | Companion computer reboot | System recovers: reads FC position, inits ESKF with high uncertainty, loads TRT engines, starts cuVSLAM, performs satellite match | threshold_max | total recovery time ≤ 70s | N/A |
-| 26 | Post-reboot first satellite match | Recovery validation | Position accuracy restored after first satellite match | threshold_max | position error ≤ 50m after first satellite match | N/A |
-
-### Object Localization
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 27 | POST /objects/locate with pixel_x, pixel_y, gimbal angles, zoom, known UAV position | Object at known ground GPS | Response: `{ lat, lon, alt, accuracy_m, confidence }` with lat/lon matching ground truth | numeric_tolerance | lat/lon within accuracy_m of ground truth (consistent with frame-center accuracy) | N/A |
-| 28 | POST /objects/locate with invalid pixel coordinates | Out-of-frame pixel | HTTP 422 or error response indicating invalid input | exact | HTTP status 422 | N/A |
-
-### Coordinate Transform Chain
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 29 | Known GPS → NED → pixel → GPS round-trip | Coordinate transform validation | Round-trip error < 0.1m | threshold_max | ≤ 0.1m | N/A |
-
-### API & Communication
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 30 | GET /health | Health check endpoint | HTTP 200, JSON with memory_mb, gpu_temp_c, status fields | exact (status code), regex (body) | status == 200, body contains `"status"` | N/A |
-| 31 | POST /sessions | Start session | HTTP 200/201 with session ID | exact | status ∈ {200, 201} | N/A |
-| 32 | GET /sessions/{id}/stream | SSE position stream | SSE events at ~1Hz with fields: type, timestamp, lat, lon, alt, accuracy_h, confidence, vo_status | regex | each event matches SSE schema | N/A |
-| 33 | Unauthenticated request to /sessions | No JWT token | HTTP 401 Unauthorized | exact | status == 401 | N/A |
-
-### Performance Thresholds
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 34 | Single camera frame (6252x4168) | End-to-end processing time | Total pipeline latency < 400ms (capture → GPS coordinate output) | threshold_max | ≤ 400ms | N/A |
-| 35 | 30-minute sustained operation | Memory usage over time | Peak memory < 8GB, no memory leaks (growth < 50MB over 30min) | threshold_max | peak < 8192MB, growth ≤ 50MB | N/A |
-| 36 | 30-minute sustained operation | GPU thermal | SoC junction temperature stays below 80°C (no throttling) | threshold_max | ≤ 80°C | N/A |
-| 37 | cuVSLAM single frame | VO processing time | cuVSLAM inference ≤ 20ms per frame | threshold_max | ≤ 20ms | N/A |
-| 38 | Satellite matching single frame | Satellite matching time (async) | LiteSAM/XFeat inference ≤ 330ms | threshold_max | ≤ 330ms | N/A |
-| 39 | TRT engine load | Engine initialization time | All TRT engines loaded within 10s total | threshold_max | ≤ 10s | N/A |
-
-### Satellite Tile Management
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 40 | Mission area definition (200km path, ±2km buffer, zoom 18) | Tile storage calculation | Total storage 500-800MB for zoom 18 + zoom 19 flight path | range | [300MB, 1000MB] | N/A |
-| 41 | ESKF position ± 3σ search radius | Tile selection | Tiles covering search area loaded, mosaic assembled, covers at least 500m radius | threshold_min | coverage radius ≥ 500m | N/A |
-
-### TRT Engine Validation
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 42 | LiteSAM PyTorch model → ONNX → TRT FP16 | TRT engine conversion | Engine builds successfully on Jetson Orin Nano Super | exact | exit_code == 0 | N/A |
-| 43 | TRT engine output vs PyTorch reference (same input) | Inference correctness | Max L1 error between TRT and PyTorch output < 0.01 | threshold_max | L1_max < 0.01 | N/A |
-| 44 | LiteSAM MinGRU operations | TRT compatibility check | All MinGRU ops supported in TRT 10.3 (polygraphy inspect) | exact | unsupported_ops == 0 | N/A |
-
-### Telemetry
-
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
-| 45 | Normal operation | Telemetry output rate | NAMED_VALUE_FLOAT messages at 1Hz (gps_conf, gps_drift, gps_hacc) | numeric_tolerance | rate: 1Hz ± 0.2Hz | N/A |
-| 46 | VO tracking lost + 3 satellite failures | Re-localization telemetry | STATUSTEXT with RELOC_REQ sent to ground station | regex | message matches `RELOC_REQ:.*` | N/A |
-
-## Expected Result Reference Files
-
-### position_accuracy.csv
-
-Reference file: `expected_results/position_accuracy.csv`
-
-Contains the ground truth GPS coordinate for each frame in the 60-image test sequence (copied from `coordinates.csv`) plus the acceptance thresholds. Test harness computes haversine distance between estimated and ground truth positions, then applies aggregate criteria.
-
-Thresholds applied to the full 60-frame sequence:
- ≥ 80% of frames: error < 50m
- ≥ 60% of frames: error < 20m
- 0% of frames: error > 100m (no single frame exceeds 100m)
- Cumulative VO drift between satellite anchors: < 100m
+- The still-image set has expected WGS84 centers but no synchronized IMU, attitude, airspeed, altitude, or timestamp stream.
+- The Derkachi fixture has synchronized video, IMU, and GPS trajectory, but no raw camera calibration, lens distortion, exact camera-to-body transform, attitude, or airspeed columns.
+- The still-image sample cadence is slower than the target 3 fps runtime profile; the Derkachi video is 30 fps and must be sampled to target replay cadence for runtime tests.
+- Final production acceptance requires camera calibration and representative datasets with synchronized camera/IMU plus ground-truth trajectory.
@@ -0,0 +1,14 @@
+# Derkachi Representative Flight Fixture
+
+## Files
+
+| File | Description | Observed Metadata |
+|------|-------------|-------------------|
+| `flight_derkachi.mp4` | Cropped nadir flight footage for replay | H.264, 880 x 720, 30 fps, about 490.07 s |
+| `data_imu.csv` | Flight-controller telemetry trace exported from the tlog | 4,900 rows at 10 Hz from `Time=0.0` to `489.9`; includes `SCALED_IMU2` and `GLOBAL_POSITION_INT` trajectory fields |
+
+## Test Use
+
+Use this fixture for video/telemetry synchronization checks, representative replay smoke tests, VIO hot-path latency, frame-drop accounting, and trajectory comparison against `GLOBAL_POSITION_INT`. The video and telemetry align at exactly three video frames per telemetry row. Camera intrinsics, lens distortion, raw camera resolution, and exact camera-to-body calibration are still unknown, so this fixture is not sufficient by itself for final production camera calibration or satellite-anchor accuracy claims.
+
+For the test recording, the rotating camera was mechanically fixed in a downward/nadir orientation. Treat the MP4 as a cleaned/cropped replay fixture rather than the raw camera feed.
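
The frame-to-telemetry alignment described above (30 fps video against 10 Hz telemetry) and the 3 fps replay cadence reduce to index arithmetic. The following is a minimal sketch, assuming video frame 0 coincides with the `Time=0.0` row and that replay uses a fixed stride; both function names are hypothetical and not part of the fixture.

```python
from typing import List

VIDEO_FPS = 30      # flight_derkachi.mp4 frame rate (fixture metadata)
TELEMETRY_HZ = 10   # data_imu.csv row rate (fixture metadata)
TARGET_FPS = 3      # target runtime replay cadence

FRAMES_PER_ROW = VIDEO_FPS // TELEMETRY_HZ  # 3 video frames per telemetry row
REPLAY_STRIDE = VIDEO_FPS // TARGET_FPS     # keep every 10th frame for replay


def telemetry_row_for_frame(frame_index: int) -> int:
    """Row of data_imu.csv covering a video frame (assumes frame 0 <-> Time=0.0)."""
    return frame_index // FRAMES_PER_ROW


def replay_frame_indices(total_frames: int) -> List[int]:
    """Frame indices to feed a 3 fps replay from the 30 fps video."""
    return list(range(0, total_frames, REPLAY_STRIDE))
```

A replay harness could pair each index from `replay_frame_indices` with `telemetry_row_for_frame(index)` to look up the matching `GLOBAL_POSITION_INT` sample for trajectory comparison.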
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9acb97042fc648301d73d3c0fe7d80f7e3e2697000c0d33afa8a7b7a74a20005
+size 282207328
@@ -1,2 +1,2 @@
-We have a wing-type UAV with a camera pointing downwards that can take photos 3 times per second with a resolution 6200*4100. Also plane has flight controller with IMU. During the plane flight, we know GPS coordinates initially. During the flight, GPS could be disabled or spoofed. We need to determine the GPS of the centers of the next frame from the camera. And also the coordinates of the center of any object in these photos. We can use an external satellite provider for ground checks on the existing photos. So, before the flight, UAV's operator should upload the satellite photos to the plane's companion PC. 
-The real world examples are in input_data folder, but the distance between each photo is way bigger than it will be from a real plane. On that particular example photos were taken 1 photo per 2-3 seconds. But in real-world scenario frames would appear within the interval no more than 500ms or even 400 ms.
+We have a fixed-wing UAV with a fixed downward-facing navigation camera that takes 3 photos per second. The authoritative navigation-camera spec is defined in `restrictions.md` as the ADTi 20MP 20L V1, APS-C sensor, about 5472 x 3648 px; older higher-resolution references are superseded. The plane also has a flight controller with an IMU. The GPS coordinates are known at the start of the flight, but during the flight GPS may be disabled or spoofed. We need to determine the GPS coordinates of the center of each subsequent camera frame, as well as the coordinates of the center of any object in these photos. We can use an external satellite provider for ground checks against the existing photos, so before the flight the UAV's operator should upload the satellite photos to the plane's companion PC.
+The real-world examples are in the `input_data` folder, but the original still-image set has a much larger distance between photos than the target aircraft will have. In that example, photos were taken once every 2-3 seconds, whereas in a real-world scenario frames would arrive no more than 500 ms apart. Additional representative data is available in `input_data/flight_derkachi/`: cropped nadir flight footage plus synchronized `SCALED_IMU2` and `GLOBAL_POSITION_INT` telemetry. This supports video/telemetry synchronization, replay, latency, VIO smoke tests, and trajectory comparison against the tlog GPS path. Camera intrinsics, lens distortion, raw camera feed parameters, and exact camera-to-body calibration are still pending, so final production accuracy claims remain gated on calibration data or a separately surveyed representative dataset.
@@ -1,37 +1,47 @@
-# UAV & Flight
+# Restrictions

- Photos are taken by only airplane (fixed-wing) type UAVs
- Photos are taken by the camera pointing downwards and fixed, but it is not autostabilized
- The flying range is restricted by the eastern and southern parts of Ukraine (to the left of the Dnipro River)
- Altitude is predefined and no more than 1km. The height of the terrain can be neglected
- Flights are done mostly in sunny weather
- During the flight, UAVs can make sharp turns, so that the next photo may be absolutely different from the previous one (no same objects), but it is rather an exception than the rule
- Number of photos per flight could be up to 3000, usually in the 500-1500 range
+> Last revised 2026-05-07 (cleanup pass — design-independent, IEEE-830 style; only external dependencies, environmental constraints, integration boundaries).
+> Subsequent revision 2026-05-07 (post-SQ6 research): the FC-facing communication protocol entries below were corrected — iNav firmware (master, post-9.0) has no inbound MAVLink external-positioning handler; the project must use a per-FC adapter (MAVLink `GPS_INPUT` for ArduPilot Plane; MSP2 `MSP2_SENSOR_GPS` for iNav). Rationale and L1 sources in `_docs/00_research/02_fact_cards.md` SQ6 / `_docs/00_research/01_source_registry.md` Sources #4, #9, #10, #12, #13.

-# Cameras
+## UAV & Flight
+- Fixed-wing UAVs only; navigation camera fixed downward (no gimbal).
+- Operational area: eastern/southern Ukraine (east of Dnipro).
+- Mission profile: 8-hour flights, ~60 km/h cruise. Sector ≤150 km² + transit corridor ~50 km². Total cached area ≤~400 km², persistent across flights.
+- Altitude ≤1 km AGL; terrain assumed flat (rolling steppe / agricultural).
+- Weather: predominantly sunny daytime; validation must cover seasonal/visibility classes (summer crops, autumn/winter bare fields, cloud/haze, snow if winter, low-texture repetition).
+- Sharp turns are exceptions; consecutive photos may share <5% overlap (AC-3.2).
+- No raw-photo storage (AC-8.5); storage bounded by tile cache + per-flight FDR (AC-NEW-3).

- UAV has two cameras:
-  1. **Navigation camera** — fixed, pointing downwards, not autostabilized. Used by GPS-Denied system for position estimation
-  2. **AI camera** — main camera with configurable angle and zoom, used by onboard AI detection systems
- Navigation camera resolution: FullHD to 6252*4168. Camera parameters are known: focal length, sensor width, resolution, etc.
- Cameras are connected to the companion computer (interface TBD: USB, CSI, or GigE)
- Terrain is assumed flat (eastern/southern Ukraine operational area); height differences are negligible
+## Cameras
+- **Navigation camera (pinned)**: ADTi 20MP 20L V1, APS-C ~23.6 × 15.7 mm, ~5472 × 3648 px (≈20 MP). Lens chosen so GSD lands in 10–20 cm/px @ 1 km AGL (frame footprint ~470×314 m to ~980×655 m). Intrinsics + camera-to-body calibration must be obtained pre-flight (e.g., checkerboard).
+- **AI camera**: operator-controlled gimbal angle + zoom (consumed by AI detection systems). The GPS-Denied system supports object localization (AC-7.x) using gimbal angle + zoom only — UAV bank/pitch is not published to that path; AI-camera object localization is therefore scoped to level flight (AC-7.1).
+- Camera-to-companion interface: USB / MIPI-CSI / GigE (lens-module dependent).
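
The relation between lens choice, GSD, and frame footprint for the pinned navigation camera can be sketched with a pinhole model over flat terrain. This is an illustrative sketch only: the sensor size and pixel count come from the entry above, while the 24 mm and 50 mm focal lengths in the usage note are assumptions chosen for illustration, not lenses named in the source.

```python
SENSOR_W_MM, SENSOR_H_MM = 23.6, 15.7   # APS-C sensor size from the camera entry
IMAGE_W_PX, IMAGE_H_PX = 5472, 3648     # pixel dimensions from the camera entry


def gsd_m_per_px(focal_mm: float, altitude_m: float) -> float:
    """Ground sample distance for a nadir pinhole camera over flat ground."""
    pixel_pitch_mm = SENSOR_W_MM / IMAGE_W_PX
    return pixel_pitch_mm * altitude_m / focal_mm


def footprint_m(focal_mm: float, altitude_m: float) -> tuple:
    """Ground footprint (width, height) in metres of one frame."""
    return (SENSOR_W_MM * altitude_m / focal_mm,
            SENSOR_H_MM * altitude_m / focal_mm)
```

Under these assumptions, `footprint_m(50, 1000)` gives about 472 × 314 m and `footprint_m(24, 1000)` about 983 × 654 m, close to the footprint range quoted above; the corresponding `gsd_m_per_px` values are roughly 0.086 and 0.18 m/px, so the actual lens should be checked against the 10–20 cm/px target.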

-# Satellite Imagery
+## Satellite Imagery
+- **Source**: Azaion Suite Satellite Service (separate Suite component). Onboard system is a consumer; upstream sourcing is the Service's concern.
+- **Onboard interface is offline-only**: companion holds a local cache populated pre-flight from the Service for the operational area (AC-8.3). No in-flight Service calls.
+- **Mid-flight tile generation (AC-8.4)**: companion orthorectifies nav-camera frames into basemap-projected tiles, deduplicates, stores locally; uploads on landing.
+- **Storage policy**: tile is the unit of persistence; no raw frames retained (AC-8.5).
+- **Resolution at cache interface**: ≥0.5 m/px, ideally 0.3 m/px (AC-8.1).
+- **Tile manifest schema**: CRS, tile matrix, dimension, lat-adjusted m/px, capture date, source, compression. Slippy/XYZ zoom (if used) is a provider convention, not a resolution proof.
+- **Cache budget**: 10 GB persistent across the ~400 km² area, including manifests, overviews, and any precomputed indices unless the solution carves out a separate descriptor budget.
+- **Freshness**: enforced per AC-8.2 / AC-NEW-6 (6-month active-conflict / 12-month rear). Mid-flight tiles timestamped current and treated as fresh.
+- **Sentinel-2 / free public imagery**: not on runtime path; cache rejects below the 0.5 m/px floor.
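
To see how tight the cache budget is, the pixel count for the cached area can be compared with the 10 GB allowance. This is a rough sketch under stated assumptions: decimal gigabytes, a single resolution level covering the whole ~400 km², and no allowance for manifests, overviews, or indices (which the budget above does include). The function names are hypothetical.

```python
AREA_KM2 = 400               # total cached area from the restrictions
CACHE_BYTES = 10 * 10**9     # 10 GB budget, assuming decimal gigabytes


def pixels_for_area(area_km2: float, m_per_px: float) -> float:
    """Number of pixels needed to cover an area once at a given resolution."""
    return area_km2 * 1_000_000 / (m_per_px ** 2)


def bits_per_pixel_budget(area_km2: float, m_per_px: float,
                          cache_bytes: int = CACHE_BYTES) -> float:
    """Average storage available per pixel if the whole budget held imagery."""
    return cache_bytes * 8 / pixels_for_area(area_km2, m_per_px)
```

Under these assumptions the budget allows about 50 bits per pixel at 0.5 m/px and about 18 bits per pixel at 0.3 m/px, before subtracting anything for manifests, overviews, or descriptor indices.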

- We can use satellite providers, but we're limited right now to Google Maps, which could be outdated for some regions
- Satellite imagery for the operational area must be pre-loaded onto the companion computer before flight
+## Onboard Hardware
+- **Companion computer (pinned)**: Jetson Orin Nano Super — 67 TOPS sparse INT8, 8 GB shared LPDDR5, 25 W TDP. JetPack (Ubuntu) with CUDA / TensorRT.
+- Cooling sized for 25 W continuous over 8 h at the upper environmental temp (AC-NEW-5).
+- Storage budget ≥ tile cache (~10 GB) + per-flight FDR (64 GB, AC-NEW-3).

-# Onboard Hardware
+## Sensors & Integration
+- **High-rate IMU** available from FC via MAVLink (both ArduPilot Plane and iNav expose IMU telemetry over MAVLink outbound).
+- **Communication protocol (pinned)**: MAVLink for the GCS link (QGroundControl). Companion ↔ FC interface is per-FC: MAVLink for ArduPilot Plane (inbound external positioning + outbound telemetry); MSP2 for iNav (inbound external positioning via `MSP2_SENSOR_GPS`); MAVLink outbound from iNav for telemetry to the GCS is preserved.
+- **Supported flight controllers**: ArduPilot Plane, iNav. PX4 out of scope.
+- **Output to FC**: WGS84 GPS coordinates as a real-GPS replacement, via each supported FC's documented external-positioning interface — MAVLink `GPS_INPUT` for ArduPilot Plane, MSP2 `MSP2_SENSOR_GPS` for iNav (companion is the sole GPS source on iNav; iNav has no dual-GPS arbitration). Per-FC parameter wiring (EKF source-set on ArduPilot; GPS provider/UART selection on iNav) and source-label out-of-band channel are design choices; outcome contract is AC-4.3.
+- **Ground station**: QGroundControl (Mission Planner out of scope). Telemetry link bandwidth-limited; per-frame data stays on local FDR (AC-NEW-3); GCS sees 1–2 Hz downsampled summary (AC-6.1).
+- **Representative data**: see `input_data/` (still images), `input_data/flight_derkachi/` (cropped nadir video + synchronized `SCALED_IMU2` + `GLOBAL_POSITION_INT`). Production acceptance still requires camera intrinsics, distortion, camera-to-body calibration, and synchronized representative flight data (frames + FC IMU/attitude/airspeed/altitude + emitted MAVLink + ground-truth trajectory).
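
The per-FC output rule above can be sketched as a lookup plus a unit conversion. This is a minimal sketch, not the project's adapter: the FC keys, `select_interface`, `Wgs84Fix`, and `to_gps_input_position` are hypothetical names, and only the position fields of MAVLink `GPS_INPUT` are shown (latitude and longitude as integer degrees × 1e7, altitude in metres MSL). The `MSP2_SENSOR_GPS` payload layout is deliberately not reproduced here.

```python
from dataclasses import dataclass

# (transport, message) per supported flight controller, as listed above.
EXTERNAL_POSITIONING = {
    "ardupilot_plane": ("MAVLink", "GPS_INPUT"),
    "inav": ("MSP2", "MSP2_SENSOR_GPS"),
}


def select_interface(fc: str) -> tuple:
    """Return the external-positioning interface for a supported FC."""
    try:
        return EXTERNAL_POSITIONING[fc]
    except KeyError:
        raise ValueError(f"unsupported flight controller: {fc}") from None


@dataclass(frozen=True)
class Wgs84Fix:
    lat_deg: float
    lon_deg: float
    alt_msl_m: float


def to_gps_input_position(fix: Wgs84Fix) -> dict:
    """Position fields of GPS_INPUT: lat/lon in degE7 (int), alt in metres MSL."""
    return {
        "lat": round(fix.lat_deg * 1e7),
        "lon": round(fix.lon_deg * 1e7),
        "alt": float(fix.alt_msl_m),
    }
```

Keeping the FC choice in one table means an unsupported controller such as PX4 fails loudly at selection time rather than silently receiving the wrong message type.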

- Processing is done on a Jetson Orin Nano Super (67 TOPS, 8GB shared LPDDR5, 25W TDP)
- The companion computer runs JetPack (Ubuntu-based) with CUDA/TensorRT available
- Onboard storage for satellite imagery is limited (exact capacity TBD, but must be accounted for in tile preparation)
- Sustained GPU load may cause thermal throttling; the processing pipeline must stay within thermal envelope
-
-# Sensors & Integration
-
- There is a lot of data from IMU (via the flight controller)
- The system communicates with the flight controller via MAVLink protocol using MAVSDK library
- The system must output GPS coordinates to the flight controller as a replacement for the real GPS module (MAVLink GPS_INPUT message)
- Ground station telemetry link is available but bandwidth-limited; it is not the primary output channel
+## Failsafe & Safety
+- If no estimate produced for >3 s → autopilot falls back to IMU-only dead reckoning (AC-5.2). 3 s rides through one sharp turn at cruise.
+- False-position safety budget: AC-NEW-4 (P(>500 m) <0.1 %, P(>1 km) <0.01 % per flight).
+- Cold-start TTFF <30 s (AC-NEW-1); spoofing-promotion latency <3 s (AC-NEW-2).
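
The 3 s failsafe threshold can be illustrated as a timing condition plus the distance flown at cruise while no estimate arrives. This is a sketch only: the fallback itself is performed by the autopilot, and the function names and the use of a plain seconds timestamp are assumptions for illustration.

```python
FALLBACK_TIMEOUT_S = 3.0   # from the failsafe restriction above
CRUISE_KMH = 60.0          # nominal cruise speed from the mission profile


def estimate_is_stale(last_estimate_s: float, now_s: float,
                      timeout_s: float = FALLBACK_TIMEOUT_S) -> bool:
    """True once the gap since the last estimate exceeds the timeout."""
    return (now_s - last_estimate_s) > timeout_s


def distance_flown_m(duration_s: float, speed_kmh: float = CRUISE_KMH) -> float:
    """Ground distance covered at constant speed over a duration."""
    return speed_kmh / 3.6 * duration_s
```

At the nominal 60 km/h cruise, `distance_flown_m(3.0)` is about 50 m, which is the ground distance covered before the timeout expires.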
@@ -0,0 +1,196 @@
+# Question Decomposition
+
+> Mode A Phase 2 (Initial Research — Problem & Solution Draft).
+> Phase 1 (AC & Restrictions Assessment) was skipped per user decision after a cleanup pass that stripped implementation details from `acceptance_criteria.md` and `restrictions.md` (commit `12cc5a4`); AC/restrictions are treated as fixed inputs.
+
+## Original Question
+
+Design the GPS-denied onboard navigation system for a fixed-wing UAV operating in eastern/southern Ukraine, satisfying every AC in `_docs/00_problem/acceptance_criteria.md` under the constraints in `_docs/00_problem/restrictions.md`. Recommend a concrete component-by-component architecture and tech stack.
+
+## Research Output Class
+
+**Technical-component selection.** All technical-component gates apply (per-mode API capability verification, Component Applicability Gate, Restrictions × Candidate-Mode sub-matrix, MVE evidence, mandatory `context7` lookups for every lead library/SDK candidate).
+
+## Question Type
+
+**Decision Support** (per Mode A Phase 2 default). Sub-flavour: multi-component decision support — weighing trade-offs across ~10 interlocking component areas under hard real-time + memory + safety budgets.
+
+## Project Context Summary (from `_docs/00_problem/`)
+
+- **What is being built**: an onboard companion-PC system that replaces real GPS for a fixed-wing UAV when GPS is denied/spoofed, by combining nav-camera frames + FC IMU + a pre-cached satellite tile basemap, and emits standard MAVLink external-positioning messages to ArduPilot or iNav at frame rate.
+- **Operating area**: eastern/southern Ukraine, active-conflict zones (war-zone scene change is a first-class concern, not an edge case).
+- **Mission profile**: 8-hour fixed-wing flights, ~60 km/h cruise, ≤1 km AGL, ~400 km² operational area.
+- **Pinned external deps**: ADTi 20MP 20L V1 nav camera (APS-C); Jetson Orin Nano Super 8 GB / 25 W; MAVLink protocol; ArduPilot + iNav as supported FCs; QGroundControl as GCS; Azaion Suite Satellite Service (offline cache interface ≥0.5 m/px).
+- **Hard runtime envelope**: <400 ms p95 end-to-end latency (camera → MAVLink), <8 GB shared CPU+GPU RAM, 25 W TDP at +50 °C ambient for 8 h continuous, no in-flight network, 10 GB persistent tile cache + 64 GB per-flight FDR.
+- **Hard safety envelope**: P(error >500 m) <0.1 % per flight, P(error >1 km) <0.01 % per flight; honest covariance reporting; explicit `dead_reckoned` failsafe under simultaneous GPS spoof + visual blackout; cache-poisoning probability bounds for tiles written back to the Service.
+
+## Project Constraint Matrix
+
+| Dimension | Binding constraint |
+|---|---|
+| **Inputs available** | Nav camera frames @ 3 fps (5472×3648, ~12 cm/px GSD @ 1 km AGL); FC IMU (high rate via MAVLink); FC attitude/airspeed/altitude; pre-cached satellite tiles ≥0.5 m/px (offline); operator re-loc hint via GCS (rare). |
+| **Outputs required** | WGS84 position to FC via MAVLink external-positioning message(s) accepted by ArduPilot AND iNav; per-frame estimate carrying honest 95 % covariance, source label `{satellite_anchored, visual_propagated, dead_reckoned}`, and `last_satellite_anchor_age_ms`; mid-flight ortho-tiles written to local cache with quality metadata; 1–2 Hz GCS summary; FDR records per AC-NEW-3. |
+| **Hardware fixed** | Jetson Orin Nano Super (67 TOPS sparse INT8, 8 GB shared LPDDR5, 25 W TDP, JetPack/CUDA/TensorRT). |
+| **Lifecycle** | Real-time embedded; offline (no in-flight network); 8 h continuous; persistent tile cache across flights; FDR rollover. |
+| **Non-functional** | <400 ms p95 latency; <8 GB shared RAM; ≤25 W power at +50 °C ambient; AC-1.1/1.2 accuracy; AC-2.1/2.2 registration & MRE; AC-3.x resilience; AC-NEW-1 cold-start <30 s; AC-NEW-2 spoof promotion <3 s; AC-NEW-4 false-position safety; AC-NEW-7 cache-poisoning safety; AC-NEW-8 blackout failsafe. |
+| **Hard disqualifiers** | Anything requiring >8 GB RAM peak (CPU+GPU shared); anything not runnable under JetPack on Orin Nano Super; anything requiring in-flight cloud calls; anything that cannot honestly report covariance; anything that does not have a runnable example for monocular nadir UAV input over season-matched satellite tiles; anything whose license blocks military / dual-use deployment. |
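
The per-frame output listed in the matrix (position, 95 % uncertainty, source label, anchor age) can be sketched as a small record type. This is a hypothetical shape for illustration: the field names other than `last_satellite_anchor_age_ms` and the three source labels are assumptions, and the single `horiz_95_m` radius is a simplification standing in for whatever covariance representation the design adopts.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    SATELLITE_ANCHORED = "satellite_anchored"
    VISUAL_PROPAGATED = "visual_propagated"
    DEAD_RECKONED = "dead_reckoned"


@dataclass(frozen=True)
class FrameEstimate:
    lat_deg: float
    lon_deg: float
    horiz_95_m: float                  # hypothetical: 95 % horizontal radius
    source: Source
    last_satellite_anchor_age_ms: int

    def __post_init__(self):
        # Reject records that cannot be physically meaningful.
        if self.horiz_95_m < 0:
            raise ValueError("uncertainty radius must be non-negative")
        if self.last_satellite_anchor_age_ms < 0:
            raise ValueError("anchor age must be non-negative")
```

Making the source label an enum keeps downstream consumers (FC adapter, GCS summary, FDR) from inventing labels outside the three named in the matrix.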
+
+## Research Subject Boundary
+
+| Dimension | Boundary |
+|---|---|
+| **Population** | Fixed-wing UAVs, downward-fixed monocular nav camera, Jetson-class edge HW, ArduPilot or iNav autopilot. Excludes: multirotors, gimbal-stabilised nav cams, server/cloud GPS-denied stacks, PX4 (out of scope), commercial sat-imagery direct integration (Service handles upstream). |
+| **Geography** | Eastern/southern Ukraine — agricultural steppe, active-conflict scene change. Validation must include this geography or representative analogues (low-texture cropland, snow, war-zone destruction). |
+| **Timeframe** | Production deployment 2026; tools / libraries / models considered must be currently maintained (commits/releases in last 18 months OR explicit long-term-stable status). Critical-novelty domain — see Step 0.5 timeliness assessment. |
+| **Operating context** | Real-time embedded; offline in-flight; 8 h continuous duty; 25 W power envelope; 8 GB shared CPU+GPU memory; thermal envelope to +50 °C ambient. |
+| **Required interfaces** | Inputs: ADTi 20MP nav cam, FC IMU (MAVLink), satellite tile cache. Outputs: MAVLink external-positioning to ArduPilot AND iNav; QGroundControl summary; FDR; tile write-back to Suite Service on landing. |
+| **Non-functional envelope** | Per AC-1 to AC-8 plus AC-NEW-1 to AC-NEW-8. Hardest binding constraints: 400 ms p95 end-to-end; 8 GB shared RAM; AC-NEW-4 false-position probability bounds; AC-NEW-7 cache-poisoning probability bounds. |
+
+## Sub-Questions
+
+| ID | Sub-question |
+|---|---|
+| SQ1 | What existing/competitor GPS-denied UAV navigation systems exist (academic + open-source + commercial + military), and which of them have been validated on fixed-wing UAVs in active-conflict environments? What works, what fails? |
+| SQ2 | What is the canonical decomposition of "monocular nadir UAV ↔ pre-cached satellite basemap localization" into pipeline components? Is the decomposition below complete, or are there industry-standard components missing? |
+| SQ3 | For each component (VO/VIO, VPR, cross-domain registration, single-frame orthorectification, sensor-fusion estimator, tile cache + spatial index, on-Jetson inference runtime, MAVLink FC adapter, dataset/SITL validation infrastructure): what option families exist (simple baseline / production / open-source / commercial / SOTA / adjacent-domain / no-build), and what are the leading candidates as of 2026? |
+| SQ4 | For each lead candidate per component: what are the documented runtime/memory/latency/license/maintenance constraints, and how do they bind against the Project Constraint Matrix? Per-mode API capability verification with `context7` for every library/SDK lead. |
+| SQ5 | What are the documented failure modes and real-world deployment lessons for each component family? In particular: VPR collapse under cropland repetition, DINOv2/foundation-model cost on Jetson at int8, RANSAC degeneracy at sharp turns / low texture, EKF over-confidence on cross-domain matches, ortho geometric error from unknown bank/pitch. |
+| SQ6 | How do **ArduPilot Plane** (current stable) and **iNav** (current stable) each accept external positioning input via MAVLink? What message types does each support? Where do their interfaces diverge, and what is the documented status of each interface (stable / experimental / known bugs)? |
+| SQ7 | What public datasets, benchmarks, and SITL/replay environments exist for cross-validating monocular nadir UAV navigation against satellite basemaps in season-matched + change-affected conditions? AerialVL, AerialExtreMatch, others? |
+| SQ8 | What are the security and safety considerations specific to the AC-NEW-4 (false-position) and AC-NEW-7 (cache-poisoning) safety budgets, including spoofing-detection signals from FC, ortho-tile geo-alignment quality estimation, and write-back cache-poisoning controls? |
+| SQ9 | What does the system look like end-to-end — wiring, scheduling, threading model, inference scheduling on shared CPU+GPU memory, cold-start sequencing, FDR rotation, and pre-flight cache provisioning workflow? (synthesis question, answered in Step 8) |
+
+## Component Areas (search plan)
+
+For each component below, the search plan covers all option families per `Component Option Search Plan` rules (`research/steps/03_engine-investigation.md` → "Component Option Breadth").
+
+| # | Component area | Required outputs | Key option families to enumerate |
+|---|----------------|------------------|----------------------------------|
+| C1 | **Visual / Visual-Inertial Odometry** (frame-to-frame motion when satellite anchor is unavailable) | Relative 6-DoF pose between consecutive frames or short windows; output frequency ≥3 Hz; metric scale (with IMU) | Classical (VINS-Mono / VINS-Fusion / OpenVINS), Kimera, ORB-SLAM3, OKVIS2, MSCKF-class, learning-based (DROID-SLAM, DPVO), pure VO baseline (KLT + RANSAC homography), no-build (skip and rely on pure satellite re-anchor every frame) |
+| C2 | **Visual Place Recognition (VPR)** — UAV nadir frame → top-K satellite chunks | Compact global descriptor per UAV frame and per cache chunk; cosine-rank top-K candidates | NetVLAD class, MixVPR, EigenPlaces, BoQ, AnyLoc (DINOv2 + VLAD), CricaVPR, foundation-model direct retrieval (DINOv2/DINOv3/SAM 2 / SuperGlobal) |
+| C3 | **Cross-domain registration** (UAV nadir ↔ ortho satellite tile, after VPR top-K) | Sub-pixel alignment + 6-DoF camera pose w.r.t. tile, with inlier ratio + covariance | Local-feature matching (SuperPoint+SuperGlue / LightGlue / DISK+LightGlue / ALIKED+LightGlue / XFeat), dense matchers (LoFTR / RoMa / DKM / MASt3R), classical (SIFT+RANSAC), specialized cross-domain (CMRNet+, CroCoMatch class), template matching (mutual-information / ECC), no-build (skip cross-domain; rely on direct frame-to-tile homography from VPR retrieval) |
+| C4 | **Single-frame orthorectification** (nav frame → basemap-aligned tile, given current pose) | Ortho-rectified tile chunk with geo metadata + quality score | Single-frame perspective warp with flat-earth assumption; OpenCV homography; bundled-DEM-aware (rare for flat steppe — likely overkill); GDAL warp utilities; custom GPU shader on Jetson |
+| C5 | **State estimator / sensor fusion** (VO + IMU + sat anchors → fused estimate with covariance) | WGS84 position + 3D velocity + attitude + 6×6 covariance, frame-rate output, honest covariance, source label | EKF (manual), ESKF (manual or via library), MSCKF, factor-graph (GTSAM, iSAM2), particle filter, learned (out-of-scope for safety budget). Supporting: Mahalanobis outlier gates |
+| C6 | **Tile cache + spatial index** (storage + retrieval of basemap tiles + descriptors, with manifests, freshness, dedup, and write-back) | mmap-friendly storage; ANN over global descriptors; spatial query for geographic prior; manifest schema per AC | Storage: GeoTIFF + COG, MBTiles, custom flat layout. ANN: FAISS (IVF/PQ/HNSW), hnswlib, ScaNN, brute-force (small index). Spatial: R-tree / KD-tree / GeoPandas / SQLite+SpatiaLite. Manifest: SQLite, JSON-per-tile, Parquet sidecar |
+| C7 | **On-Jetson inference runtime** | INT8/FP16 inference of the chosen VPR + matcher models within latency + memory budget | TensorRT (native), Torch-TensorRT, ONNX Runtime + TRT EP, NVIDIA Triton (probably overkill), pure PyTorch fp16, NVIDIA DeepStream (for video), CUDA-Python custom kernels |
+| C8 | **MAVLink FC adapter** (per-FC external-positioning emission + spoofing-signal subscription, for ArduPilot AND iNav) | MAVLink frames consumed by ArduPilot Plane and iNav as external position; spoofing signals consumed from each FC | Libraries: `pymavlink` (per-message), MAVSDK (high-level), ArduPilot/iNav SITL for verification. Per-FC choice of message: `GPS_INPUT` vs `ODOMETRY` vs `VISION_POSITION_ESTIMATE` vs `GLOBAL_POSITION_INT` (documented capability per FC must be verified, not assumed) |
+| C9 | **Datasets + SITL / replay** | Reproducible validation against AC-1/2/3/4/NEW-4/NEW-7/NEW-8 budgets; fixtures for AerialVL S03, AerialExtreMatch, own Mavic flights, Derkachi flight footage | AerialVL (VISTA / NTU), AerialExtreMatch, VPR-Bench, MahalNotchVPR / Mid-Air UAV; SITL: ArduPilot Plane SITL, iNav SITL/HITL, Gazebo, Webots; replay: PX4-Avionics-Replay-style or custom |
+| C10 | **Pre-flight cache provisioning + sector classification + freshness pipeline** | Tooling (operator-side) to pull tiles from Suite Sat Service for an operational area, classify active-conflict vs stable rear, age-stamp, populate descriptor index | Likely a custom CLI/desktop tool — research existing UAV mission-prep tools (QGC plan files, MAVProxy, ArduPilot Mission Planner equivalents on the operator side) |
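
The Mahalanobis outlier gate named as a supporting technique for C5 can be sketched for a 2D horizontal position innovation. This is a minimal sketch, assuming the innovation (measurement minus prediction) and its 2×2 innovation covariance are already available from the estimator; the function names are hypothetical and the 95 % chi-square threshold is one possible choice, not a value taken from the AC.

```python
CHI2_95_2DOF = 5.991  # 95 % quantile of chi-square with 2 degrees of freedom


def mahalanobis_sq_2d(innovation, cov) -> float:
    """Squared Mahalanobis distance v^T S^-1 v for a 2D innovation v."""
    dx, dy = innovation
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("innovation covariance must be positive definite")
    # S^-1 = (1/det) * [[d, -b], [-c, a]]
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def accept_measurement(innovation, cov, threshold: float = CHI2_95_2DOF) -> bool:
    """Accept a satellite-anchor measurement only if it passes the gate."""
    return mahalanobis_sq_2d(innovation, cov) <= threshold
```

With a 10 m standard deviation on each axis, an innovation of (10, 10) m scores 2.0 and passes, while (30, 0) m scores 9.0 and is rejected.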
+
+## Perspectives Chosen (≥3 mandatory)
+
+1. **Implementer / Engineer** — Will the chosen stack actually compile, link, and run on the pinned Jetson within the latency + memory budget? Pitfalls of MAVLink GPS injection on each FC. Sub-pixel registration on UAV-nadir × ortho satellite. Inference-scheduler contention on shared CPU+GPU memory.
+2. **Practitioner / Field** — What do UAV teams actually report from GPS-denied missions in real war-zone deployments? (Ukraine context if findable; otherwise analogous high-stakes deployments.) Real-world VPR collapse on agricultural cropland / snow / season change. Real-world FDR usefulness for post-mission forensics.
+3. **Domain expert / Academic** — Recent (2024–2026) VPR + cross-domain matching benchmarks and their relative ranks under cross-season / cross-domain / cross-altitude conditions. Foundation-model-based VPR (AnyLoc, BoQ, MASt3R) — academic claims vs reproducibility. Recent factor-graph vs ESKF comparisons.
+4. **Contrarian / Devil's advocate** — Why might foundation-model VPR fail on the Jetson budget? Where does cross-domain matching degrade silently? When does ortho-tile write-back amplify bad poses? When does honest covariance turn into "system never trusts itself" (over-cautious failure)?
+
+## Search Query Variants per Sub-Question
+
+(Detailed query lists are appended below per sub-question; these will be executed in Step 2 and saved to `01_source_registry.md`. The shape is shown here so the search plan is auditable; the full execution log will populate downstream files.)
+
+**SQ1** (existing systems / competitors): "GPS-denied UAV navigation 2025", "visual GPS denied fixed wing UAV", "satellite map matching UAV localization 2024 2025", "Ukraine UAV GPS spoofing countermeasures", "ARL ANT Project visual navigation", "vision-based GPS replacement UAV production", "UAV GPS spoofing real-world deployment 2025".
+
+**SQ2** (canonical pipeline): "visual aerial localization pipeline survey", "UAV satellite map matching architecture", "monocular UAV global localization pipeline 2024 2025".
+
+**SQ3 / SQ4** (per-component candidates + binding): per-component query templates (5+ variants each) — see Step 2 plan in `01_source_registry.md` once initialised. Each lead library/SDK candidate triggers the mandatory `context7` per-mode capability verification per `research/steps/03_engine-investigation.md`.
+
+**SQ5** (failure modes): "VPR cropland failure", "DINOv2 Jetson Orin Nano latency", "SuperGlue LightGlue Jetson Orin", "ESKF cross-domain over-confidence", "RANSAC homography low-texture failure UAV", "ortho photo geometric error airframe tilt".
+
+**SQ6** (ArduPilot vs iNav external positioning): "ArduPilot Plane GPS_INPUT external", "ArduPilot ODOMETRY EKF3 source switching", "iNav external positioning MAVLink GPS_INPUT", "iNav MAVLink GPS substitute", "iNav GPS denied flight 2025", "ArduPilot vs iNav external nav comparison".
+
+**SQ7** (datasets): "AerialVL dataset", "AerialExtreMatch", "VPR-Bench cross-season aerial", "Mid-Air UAV dataset", "Mavic Mavik UAV public flight dataset", "satellite-aerial cross-view localization benchmark".
+
+**SQ8** (safety): "MAVLink GPS_RAW_INT spoofing detection", "EKF lane switch ArduPilot", "covariance under-reporting risk EKF", "geo-misalign detection ortho tile".
+
+## Completeness Audit
+
+Probes (per `references/comparison-frameworks.md` → Decomposition Completeness Probes — applied here without re-reading the full file; will reconcile during Step 2):
+
+| Probe | Coverage |
+|---|---|
+| Functional decomposition complete? | C1–C10 cover all data flows from camera in to MAVLink out + back. ✓ |
+| Non-functional dimensions covered? | Latency, memory, accuracy, safety, freshness, security all in Project Constraint Matrix. ✓ |
+| Failure-mode dimension covered? | SQ5 explicitly. ✓ |
+| Cost / TCO dimension? | Hardware is pinned (Jetson Orin Nano Super); Service-side cost is out of scope; SW cost = mostly open-source candidates. Will revisit during Phase 3 (tech stack consolidation) if commercial options emerge. ✓ |
+| Maintenance / community-health dimension? | SQ4 binds it per candidate. ✓ |
+| Adjacent-domain dimension? | Robot SLAM, AGV warehouse navigation, aerial photogrammetry will be searched as analogues. ✓ |
+| Validation / dataset coverage? | SQ7 + C9. ✓ |
+| Integration / boundary coverage? | SQ6 (FC adapters) + C8 + C10 (pre-flight provisioning). ✓ |
+| Operational/human-factors? | Pre-flight cache provisioning (C10) and operator re-loc hint (AC-3.4) covered. Mission-planning UX is out of scope. ✓ |
+| Security / threat model? | SQ8. Will deepen in Phase 4 (Security Deep Dive) if invoked. ✓ |
+
+No major gap detected at decomposition time. If domain-discovery searches in Step 2 surface a missed dimension, a "gap-fill" entry will be appended here.
+
+## Notes on Output-Class Mode-Verification
+
+Because this is **Technical-component selection**, every lead library/SDK candidate triggers:
+- Pinned mode/configuration sentence in `02_fact_cards.md`.
+- `context7` lookup with the three mandatory queries (mode enumeration; project's exact mode runnable example; disqualifier probe).
+- MVE block per candidate.
+- Per-numbered-Restriction and per-numbered-AC binding (`Pass` / `Fail` / `Verify` / `N/A`).
+- Two modes of one library = two distinct candidates.
+
+## Step 0.5 — Novelty Sensitivity Assessment
+
+**Classification: Critical sensitivity.**
+
+Justification:
+- Foundation-model VPR is moving fast: DINOv2 (Apr 2023), AnyLoc (Aug 2023), BoQ (CVPR 2024), MASt3R (Jun 2024), MASt3R-SfM / new VPR-leader candidates 2025; rankings on cross-season aerial benchmarks have shifted multiple times since 2023.
+- ArduPilot Plane / iNav external-positioning interfaces have moved: ArduPilot EKF3 source-switching parameters and known double-fusion bugs between `GPS_INPUT` and `ODOMETRY` were a moving target through 2024–2025; iNav GPS-denied support has matured separately.
+- TensorRT / JetPack stacks on Jetson Orin Nano Super have version-dependent INT8 quantisation behaviour and runtime tooling differences worth verifying against current releases.
+- Public aerial-localization datasets (AerialVL, AerialExtreMatch, etc.) have had multiple revisions and added splits.
+
+Source-time-window rules for this run:
+- **Lead-candidate selection / SOTA claims**: prioritise sources from **last 6 months**; allow up to **18 months** if no newer source covers the same claim and the older source is the official authority.
+- **Established baselines / classical algorithms** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window — canonical references are fine.
+- **Library/SDK API behaviour**: must be verified against the **currently shipped version** at the time of search (`context7` mandatory per lead candidate; release notes / changelog cross-checked).
+- **Cross-validation**: every Critical-sensitivity claim that drives a candidate selection must have **≥2 independent sources** or one official + one runnable MVE; single-source SOTA claims must be downgraded to `Experimental only` at Step 7.5 unless cross-validated.
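
The source-time-window rules above can be sketched as a small classifier. This is only an illustration: the `Source` record, the `window_verdict` name, the verdict strings, and the whole-month age granularity are assumptions made here, not part of the research engine.

```python
from dataclasses import dataclass
from datetime import date

PREFERRED_MONTHS = 6    # "last 6 months" for lead-candidate / SOTA claims
FALLBACK_MONTHS = 18    # allowed only for an official authority with no newer source


@dataclass(frozen=True)
class Source:
    published: date
    claim_class: str                  # "sota", "baseline", or "api" (hypothetical labels)
    is_official_authority: bool = False
    newer_source_exists: bool = False


def months_between(earlier: date, later: date) -> int:
    """Whole months elapsed between two dates (assumed granularity)."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return months


def window_verdict(src: Source, accessed: date) -> str:
    """Apply the time-window rules to one source."""
    if src.claim_class == "baseline":
        return "accept"                            # canonical references: no window
    if src.claim_class == "api":
        return "verify-against-shipped-version"    # age alone is not sufficient
    age = months_between(src.published, accessed)
    if age <= PREFERRED_MONTHS:
        return "accept"
    if (age <= FALLBACK_MONTHS and src.is_official_authority
            and not src.newer_source_exists):
        return "accept-as-fallback"
    return "reject"
```

For example, a SOTA source published 2026-01-28 and accessed 2026-05-07 is three whole months old and falls inside the preferred window.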
+
+## SQ2 Closure — Pipeline-component coverage table (Mode A Phase 2, Step 3 result)
+
+The C1–C10 decomposition was sanity-checked against five independent surveys/benchmarks (Skoltech aerial-VPR survey, U.Maine cross-view survey, OrthoLoC benchmark, AnyVisLoc benchmark, NUDT 2026 absolute-VL survey; all logged in `01_source_registry.md` as Sources #38–#42). The canonical hierarchical framework `retrieval → matching → pose-estimation` is unanimously confirmed; the project's split is **canonical, not novel**. Two augmentations are required.
+
+| Survey/benchmark canonical stage | Project component | Coverage status | Required action |
+|---|---|---|---|
+| Image retrieval (global VPR) | **C2 — VPR** | ✅ covered | None |
+| Re-ranking (top-N inlier-based) | (implicit, inside C2/C3) | ⚠️ implicit | Promote to explicit sub-stage in `solution_draft01` |
+| Local image matching (2D-2D, sparse or dense) | **C3 — Cross-domain registration** | ✅ covered | Add Top-N inlier re-rank requirement |
+| AdHoP-style perspective preconditioning | (not represented) | ❌ missing | Add as optional sub-stage between C3 and C4, gated on Jetson latency budget |
+| 2D-3D lift via DSM | (not represented; current cache is 2D ortho only) | ❌ architectural decision required | **Decision required from user** — see "Open architectural decisions" below |
+| Pose estimation (PnP + RANSAC + LM) | **C4 — Pose estimation** | ✅ covered | None |
+| State estimator / fusion | **C5 — Estimator / fusion** | ✅ covered | Augmented with covariance-honesty contract (already from AC-NEW-4) |
+| IMU + VIO contract | **C1 (VIO)** + **C6 (Tile cache)** ⁂ | ✅ covered | Add yaw σ ≤ 5°, pitch σ ≤ 5° hard contract (Fact #24) |
+| Tile cache + scheduler | **C6 (Tile cache + spatial index)** | ✅ covered | Add 20% covisibility runtime invariant (Fact #27) |
+| On-Jetson runtime | **C7 — On-Jetson inference runtime** | ✅ covered | Pre-screen prunes non-viable candidates (Fact #26) |
+| Anti-spoof / FC adapter | **C8 — MAVLink FC adapter** | ✅ covered | Already addressed by SQ6 |
+| Datasets / SITL / replay | **C9 — Datasets + SITL / replay** | ✅ covered | None |
+| Pre-flight cache provisioning | **C10 — Pre-flight cache + sector classification** | ✅ covered | None |
+
+⁂ The "IMU integration" concern lives in C1 (VIO) and partially flows from FC IMU; there is no separately numbered IMU component in the original C1–C10 split. SQ2 confirms this was correct — IMU is best owned by C1 (VIO) which already produces the yaw/pitch attitude. The σ ≤ 5° contract belongs on C1's output interface.
+
+### SQ2 — Architectural decisions (resolved by user, 2026-05-07)
+
+| # | Decision | Choice | Implication for SQ3+SQ4 |
+|---|---|---|---|
+| 1 | DSM dependency on Suite Sat Service tile cache (Fact #23) | **(a) 3-DoF acceptance** — fix attitude from IMU/VIO, ignore DSM; current 2D-ortho cache contract preserved. | C6 (Tile cache) candidate matrix excludes DSM-dependent storage formats. C3 (matcher) candidates evaluated on 2D-2D output (homography) only. Yaw/pitch σ ≤ 5° (Fact #24) is **noted as an empirical requirement on C1's output but NOT bound as a hard interface contract** — emerges as an output of C1 candidate selection in SQ3+SQ4. AC-1.1.1 (≤80 m at 1 km AGL) likely satisfied per DSMAC-class lineage in Fact #17; if AC ever tightens, revisit option (b). |
+| 2 | AdHoP refinement loop (Fact #22) | **(b) Conditional** — only invoked when initial reprojection error exceeds a threshold. | C3 (matcher) latency budget = base (single-pass) + AdHoP-conditional overhead (worst-case 2× when triggered). Per-frame Jetson MVE must measure both modes. The reprojection-error threshold becomes a SQ3+SQ4 hyperparameter. |
+| 3 | Top-N re-rank promotion (Fact #25) | **(a) Promote** to an explicit named sub-stage between C2 and C3. | SQ3+SQ4 will hyperparameter-sweep N ∈ {5, 10, 15, 20}; C2 candidates evaluated jointly with re-rank cost. Top-N re-rank by inlier-count is now a hard pipeline component, not implicit. |
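
The Top-N re-rank sub-stage from decision 3 can be sketched as follows. This is a minimal illustration, assuming the VPR stage returns `(tile_id, score)` pairs and that a matcher callback returns an inlier count per tile; `Candidate`, `rerank_top_n`, and the tie-break on retrieval score are hypothetical choices, not the project's interface.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Candidate(NamedTuple):
    tile_id: str
    retrieval_score: float
    inliers: int


def rerank_top_n(
    retrieved: Iterable[Tuple[str, float]],
    n: int,
    count_inliers: Callable[[str], int],
) -> List[Candidate]:
    """Keep the n best tiles by retrieval score, then order them by inlier count.

    The matcher (count_inliers) runs only on the shortlist, so its cost
    scales with n rather than with the size of the tile cache.
    """
    shortlist = sorted(retrieved, key=lambda pair: pair[1], reverse=True)[:n]
    scored = [Candidate(tile, score, count_inliers(tile)) for tile, score in shortlist]
    # Ties on inlier count fall back to the retrieval score (an assumption).
    return sorted(scored, key=lambda c: (c.inliers, c.retrieval_score), reverse=True)
```

Sweeping `n` over {5, 10, 15, 20}, as planned for SQ3+SQ4, then amounts to calling `rerank_top_n` with each value and timing the matcher calls.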
+
+### SQ2 — Component-pruning carried into SQ3+SQ4 (Jetson-pre-screen result)
+
+Per Fact #26 (RTX-3090-measured runtime → conservative Jetson-Orin-Nano translation):
+
+- **C2 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD.
+- **C2 candidates entering SQ3+SQ4 conditional on INT8 quantization path**: AnyLoc, BoQ, DINOv2-VLAD.
+- **C2 candidates pruned outright**: SuperGlue-as-reranker (latency).
+- **C3 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: LightGlue, XFeat, XFeat*, SP+LightGlue (NGPS template confirmed).
+- **C3 candidates pruned outright**: RoMa, MASt3R, DKM (dense-matcher latency on Jetson).
+- **C3 candidates as "AerialExtreMatch reference points" only**: GIM+DKM, GIM+LightGlue (per Source #40 — accuracy benchmark, not for production deployment).
+
+## Next Step
+
+SQ1 ✓ → SQ2 ✓ (with three architectural decisions resolved) → **SQ3+SQ4 per component (C1→C10)** → SQ5 interleaved → SQ7 → SQ8 → SQ9 synthesis at engine Step 8.
+
+Pipeline shape entering SQ3+SQ4: `C1 (VIO) → C2 (VPR) → Top-N re-rank by inlier count → C3 (matcher) → AdHoP-conditional refinement → C4 (PnP+RANSAC+LM) → C5 (estimator) → C8 (FC adapter)` with C6 (cache, 2D ortho) + C7 (Jetson runtime) + C9 (datasets) + C10 (provisioning) cross-cutting.
+
+First C1 (VIO) candidate batch: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure-VO baseline (RTAB-Map and ORB-SLAM3 already pruned by Fact #16). Per-mode `context7` capability verification mandatory for every lead library/SDK candidate.
@@ -0,0 +1,523 @@
+# Source Registry
+
+> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation).
+> Critical-novelty sensitivity per Step 0.5 in `00_question_decomposition.md`. Time windows applied:
+> - **Lead-candidate / SOTA claims**: prefer sources within last 6 months; up to 18 months if older is the official authority.
+> - **Library/SDK API behaviour**: must reflect the currently shipped version at search time (`context7` mandatory per lead candidate).
+> - **Established baselines** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window.
+>
+> Investigation order saved in `00_question_decomposition.md` → "Next Step": SQ6 → SQ1 → SQ2 → SQ3+SQ4 per component (C1→C10) → SQ5 interleaved → SQ7 → SQ8 → SQ9 synthesis at engine Step 8.
+
+## Investigation Status
+
+| Sub-question | Status | Notes |
+|---|---|---|
+| SQ6 — ArduPilot vs iNav external positioning | **Saturated for protocol-level architectural decision** (further detail deferred to SQ8 for spoofing-side fields and to design phase for SITL parameter tuning) | Major finding: iNav has no inbound external-positioning MAVLink handler; AC-4.3 wording must be revised. See `02_fact_cards.md` "SQ6 Conclusions". |
+| SQ1 — Existing GPS-denied UAV systems | **Saturated.** 13 sources logged across academic / open-source / commercial / defense-program / Ukraine-practitioner. Closest peer system: Twist Robotics OSCAR (deployed in Ukraine). Closest open-source pipeline-match: snktshrma/ngps_flight (NGPS, ArduPilot GSoC 2024 — LightGlue+SuperPoint+UKF+VISION_POSITION_ESTIMATE). Closest deployed commercial: Auterion Artemis (Skynode N + Visual Navigation, Ukraine-tested, 1000-mile range). | See `02_fact_cards.md` SQ1 cluster + working summary. |
+| SQ2 — Canonical pipeline decomposition | **Saturated.** 5 surveys/benchmarks logged (Skoltech aerial VPR, U.Maine cross-view, OrthoLoC 2.5D geodata, AnyVisLoc low-altitude multi-view, NUDT 2026 sciopen survey). All converge on the **`retrieval → matching → pose-estimation`** hierarchical framework with VIO/IMU as auxiliary. Two new architectural facts added to C1–C10: (a) **AdHoP-style perspective-refinement loop** between matching and PnP (+63% translation accuracy, method-agnostic), (b) **DSM 2.5D dependency** for full 6-DoF on aerial-to-satellite (must be resolved with the Suite Sat Service or accepted as a 3-DoF degraded mode). Practitioner runtime evidence: AnyLoc on RTX 3090 = 0.63 s/descriptor, SuperGlue re-rank = 17–25 s; on Jetson Orin Nano these are non-viable for our 400 ms p95 budget, so candidates must be restricted to lightweight VPR (e.g., MixVPR / SALAD class) + LightGlue/XFeat-class matchers. | See `02_fact_cards.md` "SQ2 Conclusions". |
+| SQ3+SQ4 — Per-component candidates (C1–C10) | Not started | |
+| SQ5 — Failure modes / deployment lessons | Not started (interleaved with SQ3/SQ4) | |
+| SQ7 — Datasets, SITL, replay environments | Not started | |
+| SQ8 — Safety considerations (AC-NEW-4 / AC-NEW-7) | Not started | Carries the AP_GPS spoofing-signal probe deferred from SQ6. |
+| SQ9 — End-to-end synthesis | Step 8 of engine (deferred) | |
+
+---
+
+## Sources
+
+### Source #1
+- **Title**: Non-GPS Navigation — Plane documentation
+- **Link**: https://ardupilot.org/plane/docs/common-non-gps-navigation-landing-page.html
+- **Tier**: L1
+- **Publication Date**: live docs (current ArduPilot stable, accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: ArduPilot 4.7+ (persistent origin storage); applies to current Plane stable
+- **Target Audience**: ArduPilot Plane operators / developers
+- **Research Boundary Match**: Full match (fixed-wing, ArduPilot Plane is in scope)
+- **Summary**: Lists supported non-GPS navigation systems for Plane. Notes that boards with <1 MB flash still support `GPS_INPUT` even when they cannot run other non-GPS messages. Also notes that low-altitude non-GPS navigation is generally not applicable to non-VTOL Plane; that caveat does not constrain `GPS_INPUT` used as an external GPS replacement.
+- **Related Sub-question**: SQ6
+
+### Source #2
+- **Title**: GPS / Non-GPS Transitions — Plane documentation
+- **Link**: https://ardupilot.org/plane/docs/common-non-gps-to-gps.html
+- **Tier**: L1
+- **Publication Date**: live docs (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: EKF3 (default since AP 4.0+)
+- **Target Audience**: ArduPilot operators using mixed GPS / non-GPS sources
+- **Research Boundary Match**: Full match
+- **Summary**: Documents the EKF3 source-set mechanism (`EK3_SRC1..3_POSXY/VELXY/POSZ/VELZ/YAW`), three source sets, RC aux switch (option 90 "EKF Pos Source"), `MAV_CMD_SET_EKF_SOURCE_SET`, Lua-script driven switching. The non-GPS path is named explicitly as ExternalNav (option 6), whereas `GPS_INPUT` is treated as a GPS source (set 1).
+- **Related Sub-question**: SQ6
+
+### Source #3
+- **Title**: EKF Source Selection and Switching — Plane documentation
+- **Link**: https://ardupilot.org/plane/docs/common-ekf-sources.html
+- **Tier**: L1
+- **Publication Date**: live docs (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: EKF3 stable
+- **Target Audience**: ArduPilot operators / developers
+- **Research Boundary Match**: Full match
+- **Summary**: Authoritative parameter reference for `EK3_SRCx_*` (POSXY/VELXY/POSZ/VELZ/YAW). Important caveat: "Ground stations or companion computers may set the source by sending a `MAV_CMD_SET_EKF_SOURCE_SET` mavlink command **but no GCSs are currently known to implement this**." Source-set switching from companion is supported by AP, not by stock GCS UI. Mentions ExternalNAV/OpticalFlow transition options via `EK3_SRC_OPTIONS` bit 1.
+- **Related Sub-question**: SQ6
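
A companion-side view of the source-set mechanism might look like the sketch below. It only builds plain dictionaries and sends nothing. The `EK3_SRCx_*` parameter names and the ExternalNav value 6 come from the notes above; the remaining numeric source values and the command id 42007 are recalled from the ArduPilot dialect and should be treated as assumptions to verify against the current parameter reference and dialect XML. The helper names are hypothetical.

```python
# Source identifiers for EK3_SRCx_* parameters.
SRC_NONE = 0
SRC_BARO = 1           # POSZ only (assumed value)
SRC_COMPASS = 1        # YAW only (assumed value)
SRC_GPS = 3            # a GPS_INPUT feed is consumed as a GPS source (assumed value)
SRC_EXTERNAL_NAV = 6   # stated in the source summary above

MAV_CMD_SET_EKF_SOURCE_SET = 42007  # assumed numeric id; verify before use


def _check_set_index(set_index: int) -> None:
    if set_index not in (1, 2, 3):
        raise ValueError("EKF3 exposes source sets 1..3")


def source_set_params(set_index, posxy, velxy, posz, velz, yaw):
    """Return the parameter name/value pairs for one EKF3 source set."""
    _check_set_index(set_index)
    prefix = f"EK3_SRC{set_index}_"
    return {
        prefix + "POSXY": posxy,
        prefix + "VELXY": velxy,
        prefix + "POSZ": posz,
        prefix + "VELZ": velz,
        prefix + "YAW": yaw,
    }


def switch_source_set_command(set_index):
    """Field values for a COMMAND_LONG that selects a source set."""
    _check_set_index(set_index)
    return {
        "command": MAV_CMD_SET_EKF_SOURCE_SET,
        "params": [float(set_index)] + [0.0] * 6,  # param1 = source set id
    }


# Example: set 1 uses the GPS path (fed by GPS_INPUT) with baro height.
gps_input_set = source_set_params(
    1, posxy=SRC_GPS, velxy=SRC_GPS, posz=SRC_BARO, velz=SRC_GPS, yaw=SRC_COMPASS
)
```

Because no stock GCS is known to send this command, a companion that wants to switch sets would have to issue it itself or rely on the RC aux switch.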
+
+### Source #4
+- **Title**: ArduPilot AP_GPS_MAV.cpp (master)
+- **Link**: https://raw.githubusercontent.com/ArduPilot/ardupilot/master/libraries/AP_GPS/AP_GPS_MAV.cpp
+- **Tier**: L1 (source code)
+- **Publication Date**: master HEAD (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: master branch
+- **Target Audience**: ArduPilot developers, integrators of external GPS via MAVLink
+- **Research Boundary Match**: Full match
+- **Summary**: Authoritative implementation of `MAVLINK_MSG_ID_GPS_INPUT` ingestion into AP_GPS state. Decodes lat/lon/alt, hdop/vdop, velocity (vn/ve/vd), speed/horizontal/vertical accuracy, yaw. Honors `gps_id` (multi-GPS instance), `ignore_flags` bitmask (ALT, HDOP, VDOP, VEL_HORIZ, VEL_VERT, SPEED_ACCURACY, HORIZONTAL_ACCURACY, VERTICAL_ACCURACY). Requires `fix_type ≥ 3` and `time_week > 0` for jitter-corrected timestamping. Yaw uses `0` as "not provided" sentinel. Only `GPS_INPUT` is handled by this driver — `VISION_POSITION_ESTIMATE` / `ODOMETRY` go via the external-nav driver, not AP_GPS_MAV.
+- **Related Sub-question**: SQ6
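
The ingestion rules summarised above (ignore-flag bitmask, `fix_type ≥ 3`, `time_week > 0`, yaw `0` meaning "not provided") suggest how a companion would populate a `GPS_INPUT` message. The sketch below builds only a dictionary of field values and does not call any MAVLink library. The flag values follow the MAVLink `GPS_INPUT_IGNORE_FLAGS` enum as recalled here; the 18 s leap-second offset, the `satellites_visible` placeholder, and the function names are assumptions.

```python
from enum import IntFlag

GPS_EPOCH_UNIX_S = 315964800   # 1980-01-06T00:00:00Z expressed as Unix time
SECONDS_PER_WEEK = 604800


class IgnoreFlag(IntFlag):
    ALT = 1
    HDOP = 2
    VDOP = 4
    VEL_HORIZ = 8
    VEL_VERT = 16
    SPEED_ACCURACY = 32
    HORIZONTAL_ACCURACY = 64
    VERTICAL_ACCURACY = 128


def gps_week_and_ms(unix_time_s, leap_seconds=18):
    """Convert Unix time to (GPS week, milliseconds into the week)."""
    gps_s = unix_time_s - GPS_EPOCH_UNIX_S + leap_seconds
    week, sec_of_week = divmod(gps_s, SECONDS_PER_WEEK)
    return int(week), int(sec_of_week * 1000)


def encode_yaw_cdeg(yaw_deg):
    """0 is the 'not provided' sentinel, so due north is sent as 36000."""
    if yaw_deg is None:
        return 0
    cdeg = int(round((yaw_deg % 360.0) * 100))
    return 36000 if cdeg == 0 else cdeg


def build_gps_input_fields(unix_time_s, lat_deg, lon_deg, alt_m,
                           horiz_sigma_m, vert_sigma_m,
                           vel_ned=None, yaw_deg=None, satellites_visible=10):
    """Field values for GPS_INPUT from a vision-derived fix (illustrative)."""
    # A vision pipeline has no DOP or speed-accuracy figures to report.
    ignore = IgnoreFlag.HDOP | IgnoreFlag.VDOP | IgnoreFlag.SPEED_ACCURACY
    if vel_ned is None:
        ignore |= IgnoreFlag.VEL_HORIZ | IgnoreFlag.VEL_VERT
        vel_ned = (0.0, 0.0, 0.0)
    week, week_ms = gps_week_and_ms(unix_time_s)
    return {
        "time_usec": int(unix_time_s * 1_000_000),
        "gps_id": 0,
        "ignore_flags": int(ignore),
        "time_week_ms": week_ms,
        "time_week": week,            # must be > 0 for timestamp correction
        "fix_type": 3,                # must be >= 3
        "lat": int(round(lat_deg * 1e7)),
        "lon": int(round(lon_deg * 1e7)),
        "alt": float(alt_m),
        "vn": vel_ned[0], "ve": vel_ned[1], "vd": vel_ned[2],
        "horiz_accuracy": float(horiz_sigma_m),   # report honestly, never shrink
        "vert_accuracy": float(vert_sigma_m),
        "satellites_visible": satellites_visible,  # placeholder, assumption
        "yaw": encode_yaw_cdeg(yaw_deg),
    }
```

Leaving `HORIZONTAL_ACCURACY` and `VERTICAL_ACCURACY` out of the ignore mask is deliberate: those two fields are the channel through which the estimator's covariance reaches the flight controller.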
+
+### Source #5
+- **Title**: ArduPilot PR #28750 — AP_NavEKF3: added two more EK3_OPTION bits (GPS-denied testing)
+- **Link**: https://github.com/ArduPilot/ardupilot/pull/28750
+- **Tier**: L2 (development PR, ArduPilot core team)
+- **Publication Date**: 2024 (accessed via search 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: master / pending stable branch propagation
+- **Target Audience**: ArduPilot developers
+- **Research Boundary Match**: Full match
+- **Summary**: Adds new `EK3_OPTION` bits to allow easier GPS-denied testing of EKF3, including an aux-switch / MAVLink command path to disable GPS use. Confirms ongoing 2024-2025 work on GPS-denied robustness.
+- **Related Sub-question**: SQ6
+
+### Source #6
+- **Title**: ArduPilot Issue #15859 — EKF3: improve source switching (GPS<->NonGPS)
+- **Link**: https://github.com/ArduPilot/ardupilot/issues/15859
+- **Tier**: L4 (issue tracker — open enhancement list)
+- **Publication Date**: ongoing (long-running issue, accessed 2026-05-07)
+- **Timeliness Status**: Currently valid (still open per dev docs reference)
+- **Target Audience**: ArduPilot developers
+- **Research Boundary Match**: Full match
+- **Summary**: Authoritative list of planned improvements for source-switching. Linked from the L1 GPS-Non-GPS Transitions page. Indicates current source switching has known rough edges acknowledged by the core team.
+- **Related Sub-question**: SQ6
+
+### Source #7
+- **Title**: ArduPilot Issue #27193 — EK3 Source Switching wrong frame for GUIDED commands SOLVED
+- **Link**: https://github.com/ArduPilot/ardupilot/issues/27193
+- **Tier**: L4 (issue tracker, resolved)
+- **Publication Date**: 2024 (accessed 2026-05-07)
+- **Timeliness Status**: Reference only (resolved as user-config)
+- **Target Audience**: ArduPilot operators using GPS↔Vision source switching
+- **Research Boundary Match**: Partial overlap (Copter context but the bug was in shared SET_POSITION_TARGET_GLOBAL_INT path)
+- **Summary**: Documented frame-interpretation issue when the companion switches source set 1 (GPS) → set 3 (`VISION_POSITION_ESTIMATE`) and back. Resolved as a configuration issue, not a code defect, but illustrates the kind of edge case to validate in SITL for AC-NEW-2 promotion.
+- **Related Sub-question**: SQ6
+
+### Source #8
+- **Title**: ArduPilot Issue #23485 — AP_NavEKF3: support fusing only External Nav Velocities (without position)
+- **Link**: https://github.com/ArduPilot/ardupilot/issues/23485
+- **Tier**: L4 (open enhancement)
+- **Publication Date**: ongoing (still open when accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: ArduPilot developers
+- **Research Boundary Match**: Full match
+- **Summary**: Confirms current limitation: ODOMETRY without position causes position-estimate timeout / failsafe. Implies the project's `visual_propagated` path (VO without satellite anchor) cannot be expressed as ODOMETRY-velocity-only on current AP — must be sent as full GPS_INPUT with widened covariance.
+- **Related Sub-question**: SQ6
+
+### Source #9
+- **Title**: iNavFlight/inav — telemetry/mavlink.c (master, processMAVLinkIncomingTelemetry)
+- **Link**: https://github.com/iNavFlight/inav/blob/master/src/main/telemetry/mavlink.c
+- **Tier**: L1 (source code, authoritative)
+- **Publication Date**: master HEAD (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: iNav master (post-9.0)
+- **Target Audience**: iNav developers
+- **Research Boundary Match**: Full match
+- **Summary**: Authoritative inbound MAVLink switch (lines ~1334–1390). Handles only: HEARTBEAT, PARAM_REQUEST_LIST (stub), MISSION_CLEAR_ALL, MISSION_COUNT, MISSION_ITEM, MISSION_REQUEST_LIST, MISSION_REQUEST, COMMAND_INT (only `MAV_CMD_DO_REPOSITION`), RC_CHANNELS_OVERRIDE, ADSB_VEHICLE, RADIO_STATUS. **None of `GPS_INPUT`, `VISION_POSITION_ESTIMATE`, `ODOMETRY`, `GLOBAL_POSITION_INT`, or `GPS_RAW_INT` is accepted as an input.** The wiki page (Source #10) confirms this.
+- **Related Sub-question**: SQ6
+
+### Source #10
+- **Title**: iNav Wiki — MAVLink (frogmane edited 2025-12-11)
+- **Link**: https://github.com/iNavFlight/inav/wiki/Mavlink
+- **Tier**: L1 (project wiki)
+- **Publication Date**: 2025-12-11
+- **Timeliness Status**: Currently valid
+- **Version Info**: iNav 8.0 / 9.0 era
+- **Target Audience**: iNav users / integrators
+- **Research Boundary Match**: Full match
+- **Summary**: Authoritative inbound/outbound MAVLink message lists. "Limited command support: Commands that are not implemented are ignored." Explicitly enumerates the supported incoming list (matches Source #9). Confirms iNav MAVLink is "intended primarily for simple telemetry and operation" and "not 100% compatible".
+- **Related Sub-question**: SQ6
+
+### Source #11
+- **Title**: iNav Wiki — GPS and Compass setup
+- **Link**: https://github.com/iNavFlight/inav/wiki/GPS-and-Compass-setup
+- **Tier**: L1
+- **Publication Date**: live wiki (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: iNav 7.0+ (UBX-only); 9.0 requires UBX protocol ≥15.00
+- **Target Audience**: iNav operators
+- **Research Boundary Match**: Full match
+- **Summary**: From iNav 7.0 NMEA was removed; only UBX is supported. Recommends u-blox M8/M9/M10 with protocol ≥15.00. Sets up the constraint for any UBX-emulation path the companion would take.
+- **Related Sub-question**: SQ6
+
+### Source #12
+- **Title**: iNavFlight/inav docs/development/msp/README.md (MSP message reference)
+- **Link**: https://github.com/iNavFlight/inav/blob/master/docs/development/msp/README.md
+- **Tier**: L1 (project docs)
+- **Publication Date**: live (master, accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: iNav master
+- **Target Audience**: iNav developers / integrators
+- **Research Boundary Match**: Full match
+- **Summary**: Authoritative spec for `MSP_SET_RAW_GPS (201)` and `MSP2_SENSOR_GPS (7939)`. `MSP_SET_RAW_GPS` is 14-byte, lossy (no covariance, no per-axis velocity, altitude in meters with cm internal mismatch — bug fixed in 5.0.0 per issue #8336). `MSP2_SENSOR_GPS` is the newer plugin-style message with `hPosAccuracy`/`vPosAccuracy`/`hVelAccuracy` (mm and cm/s), `hdop`, NED velocity components, `trueYaw`, GPS week + time-of-week, fix type, satellite count. Requires `USE_GPS_PROTO_MSP` build flag and routes through `mspGPSReceiveNewData()` (the GPS_PROVIDER_MSP driver path).
+- **Related Sub-question**: SQ6
+
+### Source #13
+- **Title**: iNavFlight/inav src/main/io/gps.c + src/main/target/common.h (master)
+- **Link**: https://github.com/iNavFlight/inav/blob/master/src/main/target/common.h
+- **Tier**: L1 (source code)
+- **Publication Date**: master (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: master
+- **Target Audience**: iNav developers
+- **Research Boundary Match**: Full match
+- **Summary**: `USE_GPS_PROTO_MSP` is enabled by default in the common target configuration; on default builds the MSP GPS provider (`GPS_PROVIDER_MSP`) is registered with `gpsRestartMSP` / `gpsHandleMSP`. Confirms the MSP2_SENSOR_GPS path is reachable on stock iNav firmware without custom builds.
+- **Related Sub-question**: SQ6
+
+### Source #14
+- **Title**: iNav Issue #10141 — dual GPS support
+- **Link**: https://github.com/iNavFlight/inav/issues/10141
+- **Tier**: L4 (open feature request)
+- **Publication Date**: ongoing (still open when accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: iNav users
+- **Research Boundary Match**: Full match
+- **Summary**: Confirms iNav does **not** support dual-GPS / primary-secondary failover. Open enhancement; no implementation in 8.0 / 9.0. Architectural implication: companion must be the sole GPS source for iNav (not a backup to a real GPS connected directly to FC).
+- **Related Sub-question**: SQ6
+
+### Source #15
+- **Title**: iNav docs/GPS_fix_estimation.md (master)
+- **Link**: https://github.com/iNavFlight/inav/blob/master/docs/GPS_fix_estimation.md
+- **Tier**: L1
+- **Publication Date**: live (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: iNav 8.0+
+- **Target Audience**: iNav fixed-wing operators
+- **Research Boundary Match**: Full match
+- **Summary**: iNav's internal dead-reckoning ("GPS fix estimation") for fixed-wing. Uses gyro/accel/baro/(mag/pitot). RTH-only intent. **Explicitly states: "Not a solution for GPS spoofing (GPS output is not validated in INAV)"** — iNav has no internal anti-spoofing, so anti-spoofing is fully the companion's responsibility. Two settings: `inav_allow_gps_fix_estimation` (RTH-with-no-GPS) and `inav_allow_dead_reckoning` (short-outage tolerance) — both default OFF. `failsafe_gps_fix_estimation_delay` controls mission-vs-RTH tradeoff (default 7 s).
+- **Related Sub-question**: SQ6 (dead-reckoning fallback) + SQ8 (anti-spoofing implication)
+
+### Source #16
+- **Title**: iNav docs/Settings.md (master)
+- **Link**: https://github.com/iNavFlight/inav/blob/master/docs/Settings.md
+- **Tier**: L1
+- **Publication Date**: master (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: iNav master
+- **Target Audience**: iNav operators
+- **Research Boundary Match**: Full match
+- **Summary**: Authoritative parameter list. Confirms `inav_allow_dead_reckoning` (line 2081, default OFF) ≠ `inav_allow_gps_fix_estimation` (line 2091, default OFF). The two settings address different scenarios. `failsafe_gps_fix_estimation_delay` (line 1041, default 7 s) governs mission-abort timing.
+- **Related Sub-question**: SQ6
+
+### Source #17
+- **Title**: iNav Issue #10588 — Weird behaviour in DeadReckoning mode while GPS outage is not constant
+- **Link**: https://github.com/iNavFlight/inav/issues/10588
+- **Tier**: L4 (open issue, 2025)
+- **Publication Date**: 2025
+- **Timeliness Status**: Currently valid (open)
+- **Target Audience**: iNav operators
+- **Research Boundary Match**: Full match
+- **Summary**: Documented stability bug: intermittent GPS outages cause porpoising and motor bursts in dead-reckoning. Cited recommendation: "GPS should be rejected if providing erroneous coordinates rather than no fix." Risk for AC-NEW-8 (visual blackout + spoofed GPS) on iNav: do NOT rely on iNav's dead-reckoning for the spoof-active failsafe path; companion must actively suppress its own MSP feed and accept that iNav may misbehave during the gap. Better: continue feeding companion-IMU-propagated position with growing covariance via MSP2_SENSOR_GPS so iNav never enters its dead-reckoning state.
+- **Related Sub-question**: SQ6 + AC-NEW-8 design implication
+
+### Source #18
+- **Title**: iNav Release 8.0.0 (highlights, Dec 2024)
+- **Link**: https://github.com/iNavFlight/inav/releases/tag/8.0.0
+- **Tier**: L1 (project release notes)
+- **Publication Date**: late 2024 / early 2025
+- **Timeliness Status**: Currently valid
+- **Version Info**: iNav 8.0
+- **Target Audience**: iNav users
+- **Research Boundary Match**: Full match
+- **Summary**: Introduces fixed-wing GPS fix estimation (dead reckoning RTH-only) — the milestone for #8347. No new external-positioning inbound MAVLink in 8.0. Confirms iNav's 2024–2025 trajectory has not added a `GPS_INPUT`-equivalent inbound interface.
+- **Related Sub-question**: SQ6
+
+### Source #19
+- **Title**: iNav Release 9.0.0 / 9.0.1 + 9.0.0 Release Notes wiki
+- **Link**: https://github.com/iNavFlight/inav/wiki/9.0.0-Release-Notes
+- **Tier**: L1
+- **Publication Date**: 2025-2026
+- **Timeliness Status**: Currently valid
+- **Version Info**: iNav 9.0.x
+- **Target Audience**: iNav users
+- **Research Boundary Match**: Full match
+- **Summary**: New in 9.0: pitot APA/TPA, position estimator improvements, MSP_REBOOT DFU, GCS NAV via `COMMAND_INT` `MAV_CMD_DO_REPOSITION`. **No** new external-positioning inbound MAVLink. UBX <15.00 dropped. Confirms iNav 9.x continues the same external-positioning architecture as 8.x.
+- **Related Sub-question**: SQ6
+
+### Source #20
+- **Title**: MAVLink common message set — GPS_RAW_INT (24)
+- **Link**: https://mavlink.io/en/messages/common.html
+- **Tier**: L1 (MAVLink spec, live)
+- **Publication Date**: live (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: MAVLink common, current
+- **Target Audience**: MAVLink integrators
+- **Research Boundary Match**: Full match
+- **Summary**: Current published `GPS_RAW_INT` extension fields: `alt_ellipsoid`, `h_acc` (mm), `v_acc` (mm), `vel_acc` (mm/s), `hdg_acc` (degE5), `yaw` (cdeg). **No spoofing/jamming/integrity bitfield is present in `GPS_RAW_INT` at the time of access**, despite PR #2110 having been merged for spoofing/integrity reporting. Spoofing/integrity may live in a separate message (`GPS_INTEGRITY` or similar — to be verified in SQ8). For now, spoof-detection signals available to companion from FC are limited at the message-shape level; FC-side textual signals (`STATUSTEXT`) and `NAMED_VALUE_INT` are the documented practical path.
+- **Related Sub-question**: SQ6 + SQ8
+
+### Source #21
+- **Title**: MAVLink PR #2110 — gps: add status and integrity information
+- **Link**: https://github.com/mavlink/mavlink/pull/2110
+- **Tier**: L2 (protocol PR with cross-project sign-off)
+- **Publication Date**: merged (accessed via search 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: MAVLink common
+- **Target Audience**: MAVLink integrators across PX4 / ArduPilot / QGC / Mission Planner
+- **Research Boundary Match**: Full match
+- **Summary**: Adds GNSS status / integrity reporting (jamming/spoofing/error) at the protocol level. Cross-project sign-off across PX4, ArduPilot, QGC, Mission Planner. Field-level breakdown to be cross-checked in SQ8 against the dialect XML — current `common.html` does not show those fields inside `GPS_RAW_INT` itself, suggesting they live in a sibling message (likely `GPS_INTEGRITY` or `GPS_STATUS_EXT`).
+- **Related Sub-question**: SQ6 → defer to SQ8 for the precise message name and field set ArduPilot uses to expose spoofing.
+
+### Source #22
+- **Title**: AirDroper — GNSS Spoofing Filter (companion device, MAVLink2 NAMED_VALUE_INT pattern)
+- **Link**: https://gps.airdroper.org/
+- **Tier**: L3 (vendor product page; design pattern reference, not protocol authority)
+- **Publication Date**: live (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: ArduPilot integrators considering anti-spoofing
+- **Research Boundary Match**: Reference only (vendor's specific algorithm not relevant; the integration pattern is)
+- **Summary**: Establishes a precedent that "companion-runs-spoofing-detection → publishes confidence to GCS as MAVLink2 `NAMED_VALUE_INT`, logged to dataflash" is a real-world integration pattern with ArduPilot, not novel to this project. Useful for SQ8.
+- **Related Sub-question**: SQ8 (referenced from SQ6)
+
+### Source #23
+- **Title**: ArduPilot PR #24135 — Add option to make EKF3 more robust to bad IMU and lagged GPS data
+- **Link**: https://github.com/ArduPilot/ardupilot/pull/24135
+- **Tier**: L2 (development PR)
+- **Publication Date**: 2023-2024 (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: master / propagated to stable
+- **Target Audience**: ArduPilot developers
+- **Research Boundary Match**: Full match
+- **Summary**: Introduces `EK3_GLITCH_RADIUS` parameter — soft outlier rejection: instead of dropping a GPS measurement that fails innovation gating, the EKF inflates innovation variance to the minimum that just passes, effectively de-weighting the measurement. Implication for AC-NEW-4 (false-position safety): the project's covariance honesty contract on `GPS_INPUT.horiz_accuracy` is the ONLY way for AP's EKF to detect and de-weight a bad estimate; under-reporting collapses this safety net.
+- **Related Sub-question**: SQ6 + AC-NEW-4 design implication
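+
+The de-weighting behaviour and its dependence on an honest `horiz_accuracy` can be illustrated with a scalar sketch. It assumes the gate is expressed directly in standard deviations and that the innovation variance is the sum of the state variance and the reported measurement variance; all numbers are made up for illustration, and this is not ArduPilot's implementation.
+

```python
def gate_innovation(innovation, innovation_var, gate_sigma, soft=True):
    """Gate a scalar innovation; return (accepted, innovation_var_used)."""
    test_ratio = innovation ** 2 / (gate_sigma ** 2 * innovation_var)
    if test_ratio <= 1.0:
        return True, innovation_var          # passes the gate unchanged
    if not soft:
        return False, innovation_var         # hard gating drops the sample
    # Soft gating: smallest variance for which the test ratio equals 1.
    return True, innovation ** 2 / gate_sigma ** 2


def kalman_gain(state_var, innovation_var):
    """Scalar Kalman gain: fraction of the innovation applied to the state."""
    return state_var / innovation_var


STATE_VAR = 4.0      # m^2, assumed EKF position variance
ERROR_M = 50.0       # m, actual error of the companion's position estimate
GATE_SIGMA = 5.0     # assumed gate width in standard deviations

# Honest report: horiz_accuracy = 50 m, so measurement variance is 2500 m^2.
_, honest_var = gate_innovation(ERROR_M, STATE_VAR + 50.0 ** 2, GATE_SIGMA)
# Under-report: horiz_accuracy = 1 m, so measurement variance is 1 m^2.
_, lying_var = gate_innovation(ERROR_M, STATE_VAR + 1.0 ** 2, GATE_SIGMA)

honest_gain = kalman_gain(STATE_VAR, honest_var)
lying_gain = kalman_gain(STATE_VAR, lying_var)
```

+
+In this toy case the under-reported measurement still passes after inflation and pulls the state roughly 25 times harder than the honestly reported one, which is the failure the covariance-honesty contract is meant to prevent.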
+
+### Source #24
+- **Title**: ArduPilot AP_NavEKF3 — VehicleStatus.cpp + AP_NavEKF3.cpp (master)
+- **Link**: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_NavEKF3/AP_NavEKF3_VehicleStatus.cpp ; https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_NavEKF3/AP_NavEKF3.cpp
+- **Tier**: L1 (source code)
+- **Publication Date**: master HEAD (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: master
+- **Target Audience**: ArduPilot EKF3 developers
+- **Research Boundary Match**: Full match
+- **Summary**: EKF3 quality control: (a) ground-stationary GPS drift check ≤ 3 m (gated by `_gpsCheckScaler`); (b) innovation gating per `EK3_POS_I_GATE` / `EK3_VEL_I_GATE`; (c) soft de-weighting via `EK3_GLITCH_RAD` (Source #23). Confirms that AP's covariance-driven quality path exists and that the companion-supplied `horiz_accuracy` flows into this chain.
+- **Related Sub-question**: SQ6 (full file analysis deferred to design phase)
+
+---
+
+## SQ1 — Existing / competitor GPS-denied UAV navigation systems
+
+### Source #25
+- **Title**: Twist Robotics develops OSCAR — a GPS-independent visual navigation system for drones resistant to electronic warfare equipment
+- **Link**: https://www.pravda.com.ua/eng/news/2026/01/28/8018266/
+- **Tier**: L2 (national newspaper of record reporting on a Technology Forces of Ukraine release; the primary source is the Technology Forces of Ukraine FB post)
+- **Publication Date**: 2026-01-28 (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid (within 6-month critical-novelty window)
+- **Target Audience**: Ukraine-deployment practitioners; UAV companion-system designers
+- **Research Boundary Match**: **Full match** — Ukrainian fixed-wing-class UAV, GPS-denied, vision-based, deployed in active conflict
+- **Summary**: Twist Robotics (UA) deployed OSCAR ("Optical System of Coordinates with Automatic Relocalisation"): camera + landmark-matching + map, whose output the autopilot ingests as a "reliable GPS signal". Vendor claims: 20 m accuracy without cumulative error, day/night/fog operation, 500,000 km logged across 25,000 combat missions over 24 months of development, AI-augmented, with the proprietary Obrii simulator used for training. Note: the hardware photo shows active cooling on the module, which implies non-trivial compute (probably Jetson-class). **No public independent benchmark.** Closest deployed peer system to this project.
+- **Related Sub-question**: SQ1 (closest peer); also informs SQ8 (anti-spoofing claims), SQ9 (synthesis)
+
+### Source #26
+- **Title**: Ukraine Gives Drones Vision-Based Navigation to Push Past Heavy Jamming — The Defense Post
+- **Link**: https://thedefensepost.com/2026/01/29/ukraine-drones-vision-navigation/
+- **Tier**: L2 (defense-trade publication; corroborates Source #25 with a second-party byline)
+- **Publication Date**: 2026-01-29 (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: Defense-policy / procurement readership
+- **Research Boundary Match**: Full match
+- **Summary**: Confirms OSCAR is operational, terrain-imagery-against-mapped-landmarks pattern, autopilot-ingestion. Adds "live imagery" framing. No new technical detail beyond Source #25.
+- **Related Sub-question**: SQ1
+
+### Source #27
+- **Title**: Ukraine's Ruta Missile Drone Will Get an EW-Immune Navigation System — Defense Express
+- **Link**: https://en.defence-ua.com/weapon_and_tech/ukraines_ruta_missile_drone_will_get_an_ew_immune_navigation_system-14541.html
+- **Tier**: L2 (defense-trade publication, Ukraine-domestic)
+- **Publication Date**: 2025-05-17 (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid (within 18-month authority window)
+- **Target Audience**: Defense-procurement / industry analysts
+- **Research Boundary Match**: Partial — operational profile (cruise-missile-class, terminal guidance) differs from our 8-h fixed-wing surveillance/strike profile; technique class is closely related (DSMAC pattern)
+- **Summary**: Destinus Ruta (Ukrainian-Swiss origin; ~300 km strike range, miniature cruise missile) will integrate a navigation system from UAV Navigation (Spanish, Grupo Oesía). Defense Express infers DSMAC-style operating principle: "takes images of surface mid-flight, identifies location through comparison with reference". Vendor announcement notes validation in Ukrainian combat conditions including GNSS-denied / jamming / spoofing. Establishes that the cruise-missile-tier vision-nav pattern is now being miniaturised for ~300 km strike drones.
+- **Related Sub-question**: SQ1 (commercial/military landscape)
+
+### Source #28
+- **Title**: Kilometer-Scale GNSS-Denied UAV Navigation via Heightmap Gradients: A Winning System from the SPRIN-D Challenge
+- **Link**: https://arxiv.org/abs/2510.01348
+- **Tier**: L1 (peer-style preprint, full system description, real flight data, competition results)
+- **Publication Date**: October 2025 (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: arXiv v1 (2510.01348v1)
+- **Target Audience**: GNSS-denied UAV system designers (academic + practitioner)
+- **Research Boundary Match**: **Partial — different regime.** Multirotor (≤25 kg), <25 m AGL, LiDAR-equipped, no satellite-tile basemap; 9 km waypoint mission. Our project is fixed-wing, ~1 km AGL, no LiDAR, monocular + sat-tile basemap. **Architectural pattern transfers; specific algorithm does NOT** (heightmap gradients require LiDAR).
+- **Summary**: CTU Prague team won SPRIN-D Funke Fully Autonomous Flight Challenge with: VIO (OpenVINS) + LiDAR-derived local heightmap + gradient template matching against open-data DEM + clustered K-means particle filter, all on Intel NUC i7 16 GB CPU-only (no GPU). Achieved RMSE <11 m over kilometer-scale flights vs ≤53 m for raw odometry. Critical observations explicitly stated:
+  - **RTAB-Map and ORB-SLAM3 both fail** beyond 1 km / above 2 m/s flight (compute/memory) and ORB-SLAM3 loses tracking in textureless areas — directly applicable to our 17 m/s cruise over agricultural steppe.
+  - **"Some teams used RGB satellite image-based matching, but this has proved to be highly unreliable at such low altitudes."** This is a low-altitude (<25 m AGL) finding; our 1 km AGL operates in the high-altitude regime where the same paper notes RGB sat-matching "works reasonably well" (refs [5][6]).
+  - Lesson: "ability to recover from periods of high uncertainty and re-localize is more critical than maintaining consistently low instantaneous RMSE." Direct architectural input for AC-NEW-2 / AC-NEW-8.
+  - Lesson: IMU-from-airframe vibration isolation is mission-critical for VIO usability.
+  - Lesson: magnetometer is unreliable near steel-reinforced structures; sensor-fusion is essential for heading robustness.
+- **Related Sub-question**: SQ1 + SQ5 (failure modes for VIO/SLAM at speed) + SQ2 (canonical pipeline)
+
+### Source #29
+- **Title**: Hierarchical Image Matching for UAV Absolute Visual Localization via Semantic and Structural Constraints
+- **Link**: https://arxiv.org/abs/2506.09748 (PDF: https://arxiv.org/pdf/2506.09748)
+- **Tier**: L1 (peer-submitted preprint, IEEE-bound, with public CS-UAV dataset)
+- **Publication Date**: June 2025 (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid (within 6-month critical-novelty window for SOTA claims)
+- **Version Info**: arXiv v1 (2506.09748v1)
+- **Target Audience**: Academic SOTA researchers + UAV-localization implementers
+- **Research Boundary Match**: **Full match** — exact same problem (UAV absolute visual localization in GNSS-denied conditions, downward-facing camera, satellite reference)
+- **Summary**: 2025 SOTA pipeline: (1) image retrieval module (off-the-shelf, optimal-transport feature aggregation), (2) Semantic-Aware and Structure-Constrained Matching Module using **DINOv2** features + 4D correlation tensor + SoftMNN + 4D conv, (3) lightweight fine-grained module for pixel-level matching. Constructs a UAV absolute visual-localization pipeline **without VIO/relative-loc dependence** (retrieval-and-matching only). Evaluated on AerialVL and the authors' own CS-UAV dataset. **Direct relevance**: this is a candidate template for our C2 (VPR) + C3 (cross-domain registration) components, but DINOv2 is a heavyweight foundation model and must be benchmarked under our 25 W / 8 GB Jetson Orin Nano envelope before selection (handed off to SQ3/SQ4 + SQ5 for that component).
+- **Related Sub-question**: SQ1 (academic SOTA), SQ3+SQ4 (C2/C3 candidates), SQ5 (Jetson-on-Foundation-Model failure mode)
+
+### Source #30
+- **Title**: Raptor — GPS-Denied UAV Navigation & Coordinate Extraction (Vantor product page; Guide / Sync / Ace suite)
+- **Link**: https://www.vantor.com/product/mission-solutions/raptor/
+- **Tier**: L2 (vendor product spec; primary for the product itself, not for independent benchmark numbers)
+- **Publication Date**: live (accessed 2026-05-07; references Mar 2026 + Dec 2025 + Sep 2025 partner blog posts indicating active product line)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: Defense / commercial / industrial UAV integrators
+- **Research Boundary Match**: **Full match** — vision-based aerial position software using existing camera + 3D terrain data, deployable on commodity hardware
+- **Summary**: Vantor Raptor product family: **Guide** (on-drone vision-based positioning, demonstrated <7 m absolute accuracy in all dimensions, day/night/low-altitude, runs on commodity HW); **Sync** (georegisters live drone video against 3D terrain in real time, <3 m coordinate extraction); **Ace** (laptop-side coordinate extraction at <3 m). Backbone: Vantor's "100 million-plus sq km of highly accurate 3D terrain data, regularly updated" (Vivid Terrain, 3 m accuracy). Inertial Labs partnership (VINS-integrated Raptor Guide). Use cases include joint multi-domain ops, large-scale autonomous delivery, search-and-rescue. **This is the closest production-grade commercial peer to the project's architecture (sat-basemap-as-service + on-drone vision).**
+- **Related Sub-question**: SQ1 (commercial), SQ3+SQ4 (commercial alternatives to building C2/C3 ourselves), SQ8 (basemap as a service vs offline cache)
+
+### Source #31
+- **Title**: Auterion successfully completes Artemis program to deliver long-range deep strike drone (press release)
+- **Link**: https://auterion.com/auterion-successfully-completes-artemis-program-to-deliver-long-range-deep-strike-drone/
+- **Tier**: L1 (official vendor press release)
+- **Publication Date**: 2025-10-15 (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: Defense-procurement; UAV-integration architects
+- **Research Boundary Match**: **Full match** — fixed-wing-class one-way attack drone with Ukraine-validated GPS-denied navigation; the system architecture is directly comparable
+- **Summary**: Auterion Artemis (DIU project, completed Oct 2025) = Shahed-style design developed in Ukraine; up to 1,000-mile range; up to 40 kg warhead; runs on Auterion Skynode N mission computer + Auterion Visual Navigation system + built-in terminal guidance. Government evaluators signed off after operational flight tests in Ukraine including ground launch, GPS and GPS-denied navigation, long-range transit, and terminal engagement. **Establishes that the integration pattern (companion-class autopilot + visual navigation + terminal guidance) is shipping at production scale to a US defense customer.** Open architecture, manufacturing in US/UA/DE.
+- **Related Sub-question**: SQ1
+
+### Source #32
+- **Title**: Bring AI and computer vision to small autonomous systems — Auterion Skynode S product page
+- **Link**: https://auterion.com/product/skynode-s
+- **Tier**: L2 (vendor product spec)
+- **Publication Date**: live (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: Small-UAS integrators
+- **Research Boundary Match**: Full match (companion-class autopilot with NPU)
+- **Summary**: Auterion Skynode S = compact mission computer with **dedicated Neural Processing Unit** for AI / computer-vision applications on small UAS systems. Architecturally the same niche our Jetson Orin Nano Super sits in (companion compute + autopilot integration), but with Auterion's PX4 fork pre-integrated. Hardware/runtime envelope is comparable; the product establishes that this is a product category, not a one-off integration.
+- **Related Sub-question**: SQ1, SQ7 (alternate companion HW for adjacent context)
+
+### Source #33
+- **Title**: snktshrma/ngps_flight — Next-Generation Positioning System for ArduPilot (GSoC 2024)
+- **Link**: https://github.com/snktshrma/ngps_flight (sibling: https://github.com/snktshrma/ap_nongps)
+- **Tier**: L1 (open-source code repository, published GSoC project under ArduPilot organisation)
+- **Publication Date**: GSoC 2024 timeframe (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Version Info**: GSoC 2024 prototype (research-grade, not production firmware)
+- **Target Audience**: ArduPilot integrators building visual-positioning companion stacks
+- **Research Boundary Match**: **Full match — closest open-source peer to our exact pipeline.** ArduPilot, downward-facing camera, satellite-image reference, deep-learning matching, fused with VIO, fed back to autopilot.
+- **Summary**: NGPS = ROS 2 + ArduPilot pipeline composed of three packages: **`ap_ngps_ros2`** (visual geo-localization at 1–2 Hz by matching live camera frames to georeferenced satellite imagery using **LightGlue + SuperPoint**); **`ap_ukf`** (Unscented Kalman Filter fusing NGPS absolute positions with VIO estimates); **`ap_vips`** (VIO providing relative pose). Output is fused odometry to ArduPilot's EKF via `VISION_POSITION_ESTIMATE` (per the related issue #23471 framing). **This is the architectural template** the project should explicitly compare against — same component split as our C1+C2+C3+C5+C8 stack.
+  - Caveats: (a) GSoC prototype, not production-hardened; (b) uses `VISION_POSITION_ESTIMATE`, which on AP requires EKF source set 2/3 with `EK3_SRC*_POSXY` set to Vision; our SQ6 conclusion picked `GPS_INPUT` as the primary AP path because it carries `horiz_accuracy` directly and supports source-set switching via `MAV_CMD_SET_EKF_SOURCE_SET`, so the trade-off must be compared in the design phase; (c) no documented spoofing-defence integration; (d) no documented covariance-honesty contract.
+- **Related Sub-question**: SQ1 (closest open-source peer), SQ2 (canonical-pipeline confirmation), SQ3+SQ4 (architectural template for component selection), SQ6 (alternate AP transport: `VISION_POSITION_ESTIMATE` vs `GPS_INPUT`)
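+
+To make the `GPS_INPUT` side of that trade-off concrete, here is a minimal sketch of how a companion might derive `horiz_accuracy` from its own 2x2 north-east position covariance and fill a subset of the message fields. Using the square root of the largest covariance eigenvalue is an assumed, deliberately conservative convention, not something the registry or ArduPilot prescribes; the function names are hypothetical, and the remaining `GPS_INPUT` fields (velocities, `ignore_flags`, and so on) are omitted.
+

```python
import math


def horiz_accuracy_from_cov(cov_ne):
    """1-sigma horizontal accuracy (m) from a 2x2 north-east covariance (m^2).

    Assumption: report the worst-axis sigma, i.e. the square root of the
    largest eigenvalue of the symmetric covariance matrix.
    """
    (a, b), (c, d) = cov_ne
    if abs(b - c) > 1e-9:
        raise ValueError("covariance must be symmetric")
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return math.sqrt(mean + radius)


def gps_input_fields(lat_deg, lon_deg, alt_m, cov_ne, time_usec):
    """Subset of GPS_INPUT fields for one absolute position fix."""
    return {
        "time_usec": time_usec,
        "gps_id": 0,
        "fix_type": 3,                      # 3D fix
        "lat": round(lat_deg * 1e7),        # degE7
        "lon": round(lon_deg * 1e7),        # degE7
        "alt": alt_m,                       # metres
        "horiz_accuracy": horiz_accuracy_from_cov(cov_ne),  # metres
    }


msg = gps_input_fields(48.5, 35.25, 1000.0, [[9.0, 0.0], [0.0, 4.0]], 1_000_000)
```

+
+With a 3 m north sigma and a 2 m east sigma, the sketch reports 3 m, so the autopilot never sees a figure tighter than the estimate's worst axis.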
+
+### Source #34
+- **Title**: AerialExtreMatch — A Benchmark for Extreme-View Image Matching and Localization (project page + GitHub + Hugging Face dataset)
+- **Link**: https://xecades.github.io/AerialExtreMatch/ ; https://github.com/Xecades/AerialExtreMatch ; https://huggingface.co/datasets/Xecades/AerialExtreMatch-Localization
+- **Tier**: L1 (peer-reviewed benchmark with public dataset, code, model checkpoints; OpenReview submission)
+- **Publication Date**: 2025 (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: Academic + practitioner image-matching evaluators
+- **Research Boundary Match**: **Full match** for cross-source UAV-satellite image matching evaluation
+- **Summary**: 2025 benchmark with: 1.5 M synthetic train pairs (RGB+depth, diverse UAV/satellite viewpoints); ~30,000 evaluation pairs in 32 difficulty levels stratified by overlap (4 bins: <20/20-40/40-60/>60%), pitch difference (4 bins: 50–55, 55–60, 60–65, 65–70°), and scale (2 bins: 1-2×, >2×); a real-world UAV-localization split captured with DJI M300 RTK + H20T against UAV-derived orthomosaic/DSM AND lower-quality satellite maps. Evaluates 16 representative detector-based + detector-free image matching methods. **This is the academic benchmark our C2+C3 candidate selection must publish numbers against.**
+- **Related Sub-question**: SQ1 (academic landscape), SQ7 (datasets)
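+
+The 32 difficulty levels follow from 4 overlap bins x 4 pitch bins x 2 scale bins. A minimal sketch of that stratification, using the bin edges quoted in the summary; the index ordering and the choice to place boundary values in the upper bin are assumptions made here for illustration and may differ from the benchmark's own tooling.
+

```python
from bisect import bisect_right

OVERLAP_EDGES = (20.0, 40.0, 60.0)   # percent: <20, 20-40, 40-60, >60
PITCH_EDGES = (55.0, 60.0, 65.0)     # degrees: 50-55, 55-60, 60-65, 65-70
SCALE_EDGES = (2.0,)                 # ratio: 1-2x, >2x


def difficulty_level(overlap_pct, pitch_diff_deg, scale_ratio):
    """Map one image pair to a level index in 0..31 (assumed ordering)."""
    if not 0.0 <= overlap_pct <= 100.0:
        raise ValueError("overlap must be a percentage")
    if not 50.0 <= pitch_diff_deg <= 70.0:
        raise ValueError("pitch difference outside the benchmark's 50-70 deg range")
    if scale_ratio < 1.0:
        raise ValueError("scale ratio is expressed as >= 1")
    overlap_bin = bisect_right(OVERLAP_EDGES, overlap_pct)
    pitch_bin = bisect_right(PITCH_EDGES, pitch_diff_deg)
    scale_bin = bisect_right(SCALE_EDGES, scale_ratio)
    return (overlap_bin * 4 + pitch_bin) * 2 + scale_bin


easiest_index = difficulty_level(10.0, 52.0, 1.5)
last_index = difficulty_level(75.0, 68.0, 3.0)
```

+
+Reporting C2+C3 results per level rather than as one aggregate number keeps the overlap, pitch, and scale effects separable.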
+
+### Source #35
+- **Title**: DARPA Fast Lightweight Autonomy (FLA) program page + Test-and-Evaluation review (arXiv 2504.08122)
+- **Link**: https://www.darpa.mil/research/programs/fast-lightweight-autonomy ; https://arxiv.org/abs/2504.08122
+- **Tier**: L1 (DARPA program page + 2025 academic review of program results)
+- **Publication Date**: program 2015–2018 (concluded); review 2025-04 (accessed 2026-05-07)
+- **Timeliness Status**: Foundational reference; review is current (within 18-month authority window)
+- **Target Audience**: Defense-program historians + indoor-low-altitude GPS-denied autonomy researchers
+- **Research Boundary Match**: **Partial — different regime.** FLA = small quadcopters at ≤20 m/s in cluttered indoor/outdoor with onboard sensing only, no satellite-tile basemap. Our project is fixed-wing, ~17 m/s, 1 km AGL, with sat-tile basemap.
+- **Summary**: Foundational US-defense lineage for GPS-denied autonomy (2015–2018, complete). Set the template for "small UAV + onboard sensors + onboard compute → autonomous obstacle-avoidance + navigation without datalink/GPS". Phase 1 in Florida 2017; Phase 2 in Georgia 2018. The 2025 retrospective (arXiv 2504.08122) reviews FLA's testing methodology and Phase 1 results. A companion 2025 USAF SBIR Phase II solicitation (Sweetspot ID `7946c818-409f-5b31-8f06-554466071d83`) requests visual-position-and-navigation capability for sUAS in GPS-denied environments, so the procurement tailwind is now active.
+- **Related Sub-question**: SQ1 (defense-program lineage)
+
+### Source #36
+- **Title**: DSMAC / TERCOM lineage — DTIC ADA315439 (Scene Matching Missile Guidance Technologies) + Wikipedia / SPIE references
+- **Link**: https://apps.dtic.mil/sti/tr/pdf/ADA315439.pdf ; https://en.wikipedia.org/wiki/DSMAC ; https://www.spiedigitallibrary.org/conference-proceedings-of-spie/0238/1/Terrain-Contour-Matching-TERCOM-A-Cruise-Missile-Guidance-Aid/10.1117/12.959127.short
+- **Tier**: L1 (DTIC unclassified technical report) + L2 (encyclopedia/SPIE proceedings)
+- **Publication Date**: DTIC: 1996; SPIE: 1980; Wikipedia: live
+- **Timeliness Status**: Foundational baseline (no time window per Step 0.5 — established classical algorithms)
+- **Target Audience**: Cruise-missile-class designers; analogues for downward-vision navigation
+- **Research Boundary Match**: **Partial — different regime** (cruise missile, terminal guidance). Architectural pattern (pre-cached scene reference + downward camera + correlation matching) is the direct ancestor of our C3 pipeline.
+- **Summary**: DSMAC = electro-optical camera correlated against pre-stored reference scenes (often from satellite reconnaissance), achieving 3–10 m terminal accuracy. Tomahawk: TERCOM (radar altimeter + DEM) for mid-flight; DSMAC for terminal. CEP without DSMAC: ~30 m; with DSMAC: "only meters". Gulf War 1991: >80% of 280 launched Tomahawks hit target. **Establishes that downward-vision-against-pre-stored-imagery is a 40+ year-old well-characterised technique class with documented accuracy bounds; our project's claim of <500 m / 99.9% reliability is achievable in the same technique class.**
+- **Related Sub-question**: SQ1 (lineage), SQ8 (baseline accuracy expectations)
+
+### Source #37
+- **Title**: Electronic Warfare in Ukraine: The Invisible Battle — Ukraine War Analytics
+- **Link**: https://ukraine-war-analytics.com/analysis/electronic-warfare-ukraine.html
+- **Tier**: L3 (analytical aggregator; primary-source numbers cite vendor / OSINT reports)
+- **Publication Date**: live (accessed 2026-05-07)
+- **Timeliness Status**: Currently valid (operational-context reference)
+- **Target Audience**: Ukraine-deployment practitioners
+- **Research Boundary Match**: Full match (operational geography, threat environment)
+- **Summary**: Operational-context anchor: Russian EW systems including Pole-21 GPS jammers (25+ km range) plus spoofing capabilities have driven ~70% of small-tactical-UAV losses to EW across the conflict. Twist Robotics' OSCAR cites the same approximate number (~75% of small tactical UAV losses to EW at the front per Source #25). **Confirms the demand-side number is consistent across two independent reporting chains.**
+- **Related Sub-question**: SQ1 (Ukraine practitioner perspective)
+
+---
+
+## SQ2 — Canonical pipeline decomposition
+
+### Source #38
+- **Title**: Visual Place Recognition for Aerial Imagery: A Survey (Moskalenko, Kornilova, Ferrer — Skoltech)
+- **Link**: https://arxiv.org/abs/2406.00885 (v2)
+- **Tier**: L1 (peer-reviewed survey, accepted in Robotics and Autonomous Systems; companion benchmark code: https://github.com/prime-slam/aero-vloc)
+- **Publication Date**: arXiv 2024-06; v2 update through 2024
+- **Timeliness Status**: Currently valid (within 18-month authority window for established surveys; specific candidate latency numbers will need cross-validation against newer Jetson-class hardware reports)
+- **Target Audience**: Aerial-VPR practitioners + UAV navigation system architects
+- **Research Boundary Match**: **Full match** for the offline-cache visual geo-localization decomposition (aerial-nadir UAV vs. satellite tile basemap)
+- **Summary**: Authoritative two-stage pipeline definition (verbatim): "Visual geolocalization can be implemented through various methods, typically relying on a pre-built database of images with known locations. This approach generally involves two stages: **global localization (or Visual Place Recognition, VPR) and local alignment**. Global localization involves identifying the nearest frame from the database (Image Retrieval), while local alignment determines the precise position using the selected frame." Re-ranking is treated as an integral sub-stage of VPR for aerial data because of agricultural/urban grid repetition.
+  - **Local alignment chain**: SuperPoint/keypoint detector → LightGlue/SuperGlue/SelaVPR matcher → `cv2.findHomography` → `cv2.perspectiveTransform` → Web-Mercator coordinate conversion.
+  - **Practitioner-critical runtime numbers (RTX 3090, NOT Jetson)**: AnyLoc descriptor calculation = 0.37–0.84 s/frame (huge ViT-G DINOv2); MixVPR / SALAD = 0.05–0.20 s; SelaVPR = 0.04 s; SuperGlue re-rank = 15–25 s on top-100 candidates; LightGlue re-rank = ~1 s; SelaVPR re-rank = <0.1 s.
+  - **Memory**: AnyLoc descriptors = 2.3–13.9 GB for 4–7k tiles; SelaVPR = <0.2 GB.
+  - **Final commentary** (verbatim): "While our methodology alone may not provide comprehensive robustness, it can be effectively augmented with additional sensors, such as inertial measurement units (IMUs). This integration enhances its utility for Visual Inertial Odometry (VIO) and Simultaneous Localization and Mapping (SLAM) systems, particularly for periodic location refinement and loop closure tasks. Additionally, our methodology could serve as a dependable emergency localization fallback in the event of an unexpected GNSS signal loss."
+  - **Validates the project's IMU/VIO + sat-anchor architecture as the canonical extension of the survey's two-stage core.**
+- **Related Sub-question**: SQ2 (canonical decomposition), SQ3+SQ4 (C2/C3 candidate latency budgets), SQ5 (foundation-model-on-Jetson failure mode)
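+
+The last two steps of the local alignment chain described in the summary (project a query pixel through the homography, then convert the tile pixel to geographic coordinates) can be sketched in plain Python. The sketch assumes standard 256-pixel Web-Mercator (XYZ) tiles and a homography that maps query-image pixels to pixels of a single reference tile; the function names are illustrative, and in a real pipeline the matrix would come from `cv2.findHomography`.
+

```python
import math

TILE_SIZE = 256  # pixels per Web-Mercator tile edge (assumed XYZ scheme)


def apply_homography(H, x, y):
    """Project one point through a 3x3 homography (single-point equivalent
    of what cv2.perspectiveTransform computes)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return u / w, v / w


def global_pixel_to_latlon(px, py, zoom):
    """Convert global Web-Mercator pixel coordinates to (lat, lon) degrees."""
    n = TILE_SIZE * 2 ** zoom
    lon = px / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * py / n))))
    return lat, lon


def locate(H, query_pixel, tile_x, tile_y, zoom):
    """Geolocate a query-image pixel matched against tile (tile_x, tile_y)."""
    u, v = apply_homography(H, *query_pixel)
    return global_pixel_to_latlon(tile_x * TILE_SIZE + u,
                                  tile_y * TILE_SIZE + v, zoom)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
lat, lon = locate(IDENTITY, (128.0, 128.0), tile_x=0, tile_y=0, zoom=0)
```

+
+Passing the image centre as `query_pixel` gives the ground point under the optical axis, which for a nadir camera approximates the vehicle's horizontal position.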
+
+### Source #39
+- **Title**: Cross-View Geo-Localization: A Survey (Durgam, Paheding, Dhiman, Devabhaktuni — U. Maine / Fairfield / ISU)
+- **Link**: https://arxiv.org/abs/2406.09722 (v1)
+- **Tier**: L1 (peer-style preprint, journal-bound — Expert Systems with Applications)
+- **Publication Date**: arXiv 2024-06
+- **Timeliness Status**: Currently valid (≤18 months for survey-of-deep-learning architectures)
+- **Target Audience**: Cross-view (ground↔aerial) geo-localization researchers; partial overlap with our aerial↔satellite pipeline
+- **Research Boundary Match**: **Partial — different cross-view setup** (the survey focuses on ground panorama → aerial overhead; ours is aerial nadir → satellite ortho). The pipeline-shape lessons transfer; the polar-transform / Siamese-network / GAN-based view-synthesis lessons do NOT directly apply because our two views are both top-down.
+- **Summary**: Confirms the canonical pipeline decomposition (feature extraction → cross-view matching → similarity-driven retrieval) is the dominant pattern across 2015–2024 SOTA. Establishes the historical lineage: pixel-wise (Sheikh 2003) → feature-based (Lin 2013) → CNN/triplet-loss (Tian 2017) → Siamese+GAN (Hu 2018) → polar-transform (Shi 2019) → CosPlace/EigenPlaces (2022–2023) → DINOv2-class (AnyLoc 2023) → Transformer-only (TransGeo 2022, MGTL 2022) → multi-method fusion (2023+). Backbone comparison table establishes that ViT/DINOv2 is the current SOTA backbone; ResNet-class is the established production baseline; SIFT/SURF/PHOW remain the handcrafted baseline. **Confirms our component-area split (C2 VPR + C3 cross-domain matching) is canonical and matches the survey's two-axis organization (backbone × matching strategy).**
+- **Related Sub-question**: SQ2 (decomposition lineage), SQ3+SQ4 (C2 candidate landscape)
+
+### Source #40
+- **Title**: OrthoLoC: UAV 6-DoF Localization and Calibration Using Orthographic Geodata (Dhaouadi, Marin, Meier, Kaiser, Cremers — DeepScenario / TU Munich / MCML)
+- **Link**: https://arxiv.org/abs/2509.18350 ; project page https://deepscenario.github.io/OrthoLoC
+- **Tier**: L1 (peer-style preprint with public dataset, code, model checkpoints; 16,425 UAV images Germany+US, full 6-DoF ground truth)
+- **Publication Date**: arXiv 2025-09 (within 6-month critical-novelty window)
+- **Timeliness Status**: Currently valid (within 6-month critical-novelty window for SOTA aerial-localization claims)
+- **Target Audience**: UAV-localization implementers + system architects building on Digital Orthophotos (DOP) + Digital Surface Models (DSM)
+- **Research Boundary Match**: **Full match — direct paradigm match** to our project: "lightweight orthographic representations" instead of 3D meshes; "increasingly accessible through free releases by governmental authorities"; "no internet connection or GNSS/GPS support" — exactly the project's constraint envelope.
+- **Summary**: **Most directly applicable SQ2 source.** Defines the 6-DoF localization pipeline using 2.5D geodata: (1) match query UAV image against DOP (orthophoto raster) using state-of-the-art matchers; (2) lift each 2D match in the DOP to 3D using the corresponding DSM elevation; (3) PnP+RANSAC (RANSAC-EPnP, 5-pixel inlier threshold) → initial pose; (4) Levenberg-Marquardt joint refinement of intrinsics + extrinsics; (5) **AdHoP refinement**: estimate homography from initial 2D-2D correspondences via DLT+RANSAC, warp the DOP to better match the query's perspective, re-match, map back via H⁻¹, lift to 3D, refine pose; accept refinement only if reprojection error decreases.
+  - **Quantitative results** on 16.4k images, 47 locations: best matcher = GIM+DKM achieves 75.4% recall at 1m-1° threshold (sparse SP+SG = 64.4%, sparse SP+LG = 64.2%, MASt3R = 63.5%, RoMa+AdHoP = 54.6%, XFeat*+AdHoP = 59.8%; LoFTR / eLoFTR / XoFTR all <23% recall).
+  - **AdHoP gains**: ~30% average matching improvement, ~20% translation/rotation error reduction; for previously-underperforming methods (XFeat* → 95% matching improvement; DKM → 63% translation reduction; RoMa → 1m-1° recall +23%).
+  - **Performance factor (a)**: **cross-domain DOPs (visual gap only) cause ~3× translation error increase** even on best method.
+  - **Performance factor (b)**: **cross-domain DOPs+DSMs (visual + structural gap) cause ~7× translation error increase** (0.16 m → 1.12 m for GIM+DKM+AdHoP); **this is exactly the war-zone scene-change scenario AC-3.x covers**.
+  - **Performance factor (c)**: **20% covisibility floor** between query and reference; below it localization fails.
+  - **Performance factor (d)**: **Calibration is fundamentally ambiguous** between focal length and translation → camera intrinsics MUST be calibrated upstream, not jointly optimized in flight.
+  - **Performance factor (e)**: Resolution: scaling images to 30% of original (~300 px) still works; geodata at 13 m/pixel is the floor, with degradation below.
+- **Related Sub-question**: SQ2 (canonical pipeline + AdHoP refinement loop), SQ3+SQ4 (C3 matcher candidate ranks), SQ5 (war-zone scene-change failure mode), SQ8 (covisibility safety gate)
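+
+Step (2) of the pipeline and the AdHoP acceptance rule are simple enough to sketch without a vision library. The sketch below assumes a north-up raster whose DOP and DSM share the same grid, a known world coordinate for the top-left corner, and nearest-cell elevation lookup without interpolation; the function names and the example grid are hypothetical and are not taken from the OrthoLoC code.
+

```python
def lift_to_3d(col, row, dsm, origin_xy, gsd):
    """Lift a matched DOP pixel (col, row) to a 3D world point.

    dsm       : 2D list of elevations in metres, same grid as the DOP
    origin_xy : world (x, y) of the raster's top-left corner (assumed north-up)
    gsd       : ground sampling distance in metres per pixel
    """
    r, c = int(row), int(col)
    if not (0 <= r < len(dsm) and 0 <= c < len(dsm[0])):
        raise IndexError("match falls outside the DSM raster")
    x0, y0 = origin_xy
    # Row index grows southwards, so world y decreases with row.
    return (x0 + col * gsd, y0 - row * gsd, dsm[r][c])


def accept_refinement(initial_reproj_err_px, refined_reproj_err_px):
    """AdHoP-style rule: keep the refined pose only if the error decreased."""
    return refined_reproj_err_px < initial_reproj_err_px


dsm = [[100.0, 101.0],
       [102.0, 103.0]]
point = lift_to_3d(1.0, 1.0, dsm, origin_xy=(500000.0, 5400000.0), gsd=0.5)
```

+
+The resulting 3D points, paired with their 2D locations in the query image, are what a PnP+RANSAC solver would consume in step (3).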
+
+### Source #41
+- **Title**: Exploring the best way for UAV visual localization under Low-altitude Multi-view Observation Condition: a Benchmark — AnyVisLoc (Ye, Teng, Chen, Li, Liu, Yu, Tan — NUDT / Macao Polytechnic)
+- **Link**: https://arxiv.org/abs/2503.10692 ; benchmark code https://github.com/UAV-AVL/Benchmark
+- **Tier**: L1 (peer-style preprint with public 18,000-image dataset across 15 Chinese cities, multi-pitch / multi-altitude / multi-scene, with both aerial-photogrammetry AND satellite reference maps)
+- **Publication Date**: arXiv 2025-03 (within 6-month critical-novelty window)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: Aerial AVL practitioners; UAV-system designers facing pitch/altitude/yaw uncertainty
+- **Research Boundary Match**: **Partial — different altitude regime** (the benchmark covers 30–300 m AGL, ours is ~1 km AGL); pitch range is 20–90° (ours is mostly nadir, ~80–90°). Lessons on the **pipeline structure, retrieval-vs-matching trade-offs, sensor-prior noise tolerance, and aerial-vs-satellite reference-map gap** transfer directly.
+- **Summary**: Independently confirms the SAME pipeline as Source #40: image retrieval (rough position) → image matching (2D-2D) → DSM-lift to 3D → PnP+RANSAC. Best baseline = CAMP (retrieval) + RoMa (dense matcher) + Top-N re-rank → 74.1% A@5m on aerial photogrammetry map, 18.5% A@5m on satellite map (ALOS 30m DSM). **Critical AC-quantitative findings**:
+  - (a) **Aerial map vs satellite map**: 4× accuracy gap at A@5m (74.1% vs 18.5%), driven by satellite-DSM coarseness (ALOS 30m vs aerial 0.94m) and modality difference. **Direct relevance**: project's offline cache is satellite tiles ≥0.5 m/px without DSM; this places us between the two data points (better than ALOS 30m, worse than aerial photogrammetry); exact accuracy must be re-established once tile resolution is pinned.
+  - (b) **Yaw prior noise**: σ ≤ 5° → no impact; σ = 10° → 1.9% A@5m drop; σ = 30° → 4.1% drop; σ = 50° → 13.7% drop; σ = 60° → 25.7% drop. **Implication for project's C1+C5+IMU**: companion-side yaw estimate must hold σ < 10°.
+  - (c) **Pitch prior noise**: σ < 5° → no impact; σ ≥ 7° causes ~1–5% drops.
+  - (d) **Pitch angle**: smaller pitch (more oblique) → lower accuracy; nadir is best. Project's nadir-fixed camera at 1 km AGL is consistent with the benchmark's most-favourable regime.
+  - (e) **Sparse vs dense matchers**: SP+LightGlue+GIM+k2s = 75.4% A@10m at 105 ms/frame; RoMa = 81.3% A@10m at 659 ms/frame. **Implication for project's C7 Jetson runtime**: the dense matcher is ~6 percentage points more accurate at A@10m but ~6× slower → SP+LightGlue-class is the production sweet spot under our 400 ms budget.
+  - (f) **Re-ranking strategy**: Top-N re-rank by inlier count = best accuracy/cost trade-off (62.2% A@5m at 0.8 s/frame on RTX 3090). Match-without-retrieval = catastrophic (34.3% A@5m, search-space too large).
+- **Related Sub-question**: SQ2 (pipeline + sensor-prior tolerance), SQ3+SQ4 (C2 retrieval-vs-matcher trade-offs, C5 IMU prior contract), SQ5 (war-zone reference-map staleness failure mode), SQ7 (aerial-vs-satellite reference benchmarks)
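+
+The sparse-versus-dense trade-off in finding (e) reduces to a small piece of arithmetic, sketched below with the two data points quoted in the summary. The selection rule (most accurate candidate that fits the latency budget) is an assumption about how the project might choose, and the latencies come from the benchmark's hardware, so they would need to be re-measured on the Jetson before being used this way.
+

```python
# A@10m accuracy (%) and per-frame latency (ms) as quoted from the benchmark.
CANDIDATES = {
    "SP+LightGlue+GIM+k2s": {"a10m": 75.4, "ms": 105.0},
    "RoMa": {"a10m": 81.3, "ms": 659.0},
}


def best_within_budget(candidates, budget_ms):
    """Return the most accurate candidate whose latency fits the budget."""
    feasible = {name: c for name, c in candidates.items() if c["ms"] <= budget_ms}
    if not feasible:
        return None
    return max(feasible, key=lambda name: feasible[name]["a10m"])


sparse = CANDIDATES["SP+LightGlue+GIM+k2s"]
dense = CANDIDATES["RoMa"]

accuracy_gain_points = round(dense["a10m"] - sparse["a10m"], 1)   # 5.9
slowdown_factor = round(dense["ms"] / sparse["ms"], 2)            # 6.28
choice_at_400ms = best_within_budget(CANDIDATES, budget_ms=400.0)
```

+
+Under a 400 ms budget only the sparse matcher is feasible; the dense matcher becomes selectable only if the budget grows past its measured latency.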
+
+### Source #42
+- **Title**: Survey on absolute visual localization techniques for low-altitude unmanned aerial vehicles (Ye, Chen, Teng, Li, Yang, Song, Yu — NUDT, College of Aerospace Science)
+- **Link**: https://www.sciopen.com/article/10.11887/j.issn.1001-2486.25120033 ; DOI 10.11887/j.issn.1001-2486.25120033
+- **Tier**: L1 (peer-reviewed Chinese journal — Journal of National University of Defense Technology, vol 48 issue 2, 2026; same lab as Source #41 with overlapping authorship — confirmed cross-validation, not duplicative)
+- **Publication Date**: 2026-04-01 (within 6-month critical-novelty window)
+- **Timeliness Status**: Currently valid
+- **Target Audience**: UAV-system architects + Chinese-defense-research community
+- **Research Boundary Match**: **Full match** (low-altitude UAV AVL is the survey's exact subject)
+- **Summary**: Survey-level confirmation of the canonical "**retrieval-matching-pose estimation**" hierarchical framework. Verbatim claim: "the hierarchical framework balances search efficiency, positioning accuracy, and scene generalization, becoming a robust technical path for low-altitude long-endurance absolute localization." Compares the framework against alternatives that are explicitly rejected: (a) relative visual localization (cumulative errors — VIO/SLAM only); (b) end-to-end direct localization (poor generalization); (c) map-free localization (scene-dependent). Sub-component evolution per stage: (a) retrieval = template-matching (SAD/SSD/NCC) → BoW/VLAD → deep-learning (annular/dense feature segmentation, contrastive InfoNCE, self-supervised); (b) matching = SIFT/SURF/ORB → SuperPoint+LightGlue/RoMa (sparse / semi-dense / dense); (c) pose estimation = PnP variants + RANSAC + IMU prior fusion. **Identifies four open challenges** that align with project risks: (i) cross-domain generalization (war-zone scene change); (ii) real-time inference on edge platforms (Jetson); (iii) robustness to complex environments (cropland, snow, low texture); (iv) high-quality datasets (the same gap our project's AC-NEW-7 / cache provisioning works around). **Lightweight-model-design-for-edge-deployment is named as a primary future-research direction** — directly validates project's Jetson Orin Nano constraint as a recognized field-level challenge, not a project-specific oddity.
+- **Related Sub-question**: SQ2 (framework canonicalness), SQ3+SQ4 (per-component evolution), SQ5 (named open challenges align with project risks)
@@ -0,0 +1,410 @@
+# Fact Cards
+
+> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Extracted from sources logged in `01_source_registry.md`. Confidence labels: ✅ High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), ❓ Low (L3/L4 inferential).
+>
+> Bound to sub-questions in `00_question_decomposition.md`. Many SQ6 facts also bind directly to the Project Constraint Matrix (`acceptance_criteria.md` / `restrictions.md`); per the engine's "Per-Mode API Capability Verification" rule, MAVLink/MSP messages are treated as candidate **modes** and are bound `Pass/Fail/Verify/N/A` against numbered ACs and restrictions.
+
+---
+
+## SQ6 — ArduPilot Plane vs iNav external positioning
+
+### Fact #1 — ArduPilot Plane EKF3 ingests `GPS_INPUT` (MAVLink ID 232) as a first-class GPS source
+- **Statement**: ArduPilot's `AP_GPS_MAV` driver (master) decodes `MAVLINK_MSG_ID_GPS_INPUT` and stores the resulting state into the GPS slot identified by `gps_id`. Decoded fields: lat/lon (degE7), alt (m → cm internally), hdop/vdop, velocity (vn/ve/vd, m/s), speed/horizontal/vertical accuracy (m/s, m, m), yaw (cdeg, `0` sentinel = "not provided"). Honors `ignore_flags` for ALT/HDOP/VDOP/VEL_HORIZ/VEL_VERT/SPEED_ACCURACY/HORIZONTAL_ACCURACY/VERTICAL_ACCURACY. Requires `fix_type ≥ 3` and `time_week > 0` for jitter-corrected timestamping.
+- **Source**: Source #4 (AP_GPS_MAV.cpp master), Source #1 (Plane Non-GPS Navigation docs)
+- **Phase**: Phase 2
+- **Target Audience**: ArduPilot Plane operators / developers
+- **Confidence**: ✅
+- **Related Dimension**: C8 (FC adapter), C5 (estimator covariance contract)
+- **Fit Impact**: **supports selection** — ArduPilot side of AC-4.3 is satisfied by `GPS_INPUT` as the primary external-positioning message; covariance fields (`horiz_accuracy`, `vert_accuracy`, `speed_accuracy`) are wired through.
+
+### Fact #2 — ArduPilot's covariance honesty (AC-NEW-4) is enforced via the `horiz_accuracy` field of `GPS_INPUT`
+- **Statement**: When `GPS_INPUT_IGNORE_FLAG_HORIZONTAL_ACCURACY` is unset, AP_GPS stores `packet.horiz_accuracy` into `state.horizontal_accuracy` and sets `state.have_horizontal_accuracy = true`. EKF3's quality chain consumes this via (a) ground-stationary 3 m drift check (`_gpsCheckScaler`-modulated), (b) innovation gating (`EK3_POS_I_GATE`/`EK3_VEL_I_GATE`), (c) soft de-weighting via `EK3_GLITCH_RAD` (PR #24135). Under-reporting `horiz_accuracy` defeats these gates, which is exactly the AC-NEW-4 risk the project flagged.
+- **Source**: Source #4, Source #23 (PR #24135), Source #24 (AP_NavEKF3 master)
+- **Phase**: Phase 2
+- **Target Audience**: System designers writing the C5 estimator → C8 adapter
+- **Confidence**: ✅ (source code + L1 docs); ⚠️ for the precise innovation-gate mechanics (deferred to design-phase SITL tuning)
+- **Related Dimension**: C5 covariance, AC-NEW-4
+- **Fit Impact**: **architectural constraint** — the C5 estimator MUST publish honest `horiz_accuracy` (not optimistic) for AP's EKF3 quality chain to function. Aligns directly with AC-1.4 / AC-NEW-4.
+
+### Fact #3 — ArduPilot supports runtime EKF source-set switching from companion via `MAV_CMD_SET_EKF_SOURCE_SET`
+- **Statement**: EKF3 supports up to three source sets (`EK3_SRC1..3_*`). A companion can request a switch by sending `MAV_CMD_SET_EKF_SOURCE_SET`. Alternative paths: RC aux-switch option 90 ("EKF Pos Source"), Lua scripts (e.g., `ahrs-source.lua`). **Caveat from L1 docs**: "no GCSs are currently known to implement this" — companion-driven switching works at the firmware level but is not exposed in stock GCS UIs.
+- **Source**: Source #2, Source #3
+- **Phase**: Phase 2
+- **Target Audience**: System designers handling AC-NEW-2 spoof-promotion path on ArduPilot
+- **Confidence**: ✅
+- **Related Dimension**: C8 + AC-NEW-2
+- **Fit Impact**: **supports selection** — AP allows the project to model two source sets (set 1 = real GPS, set 2 = onboard `GPS_INPUT`) and switch automatically. Keeps companion lightweight; switching does not require the companion to suppress real-GPS itself.
+
+### Fact #4 — ArduPilot ODOMETRY-velocity-only fusion is currently NOT supported (open enhancement)
+- **Statement**: Issue #23485 confirms current limitation: feeding `ODOMETRY` without position causes EKF position-estimate timeout / failsafe. Implication: the project's `visual_propagated` mode (VO drift between satellite anchors, no global position) **cannot be expressed as ODOMETRY-velocity-only on current AP** — must be sent as a full `GPS_INPUT` with covariance widened to reflect drift uncertainty.
+- **Source**: Source #8
+- **Phase**: Phase 2
+- **Target Audience**: System designers
+- **Confidence**: ✅ (enhancement request still open as of the access date)
+- **Related Dimension**: C5 + C8 + AC-1.3 (`visual_propagated` label) + AC-1.4 (covariance ellipse)
+- **Fit Impact**: **architectural constraint** — `visual_propagated` and `dead_reckoned` labels both ride `GPS_INPUT` with growing `horiz_accuracy`, NOT a separate `ODOMETRY` channel. Single-message contract = simpler. AC-NEW-8 thresholds (`horiz_accuracy = 999.0` for "no fix") map directly.
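
The single-message contract above can be sketched as a small mapping from source label to `GPS_INPUT` quality fields. This is a hypothetical illustration, not the project's estimator: the linear drift model, the `quality_for` helper, and the example drift rate are assumptions; only the `999.0` sentinel, the `fix_type` levels, and the 100 m degrade threshold come from the notes.

```python
from dataclasses import dataclass

NO_FIX_SENTINEL_M = 999.0   # horiz_accuracy value the notes associate with "no fix"
DEGRADE_2D_ABOVE_M = 100.0  # ASSUMPTION: threshold borrowed from the AC-NEW-8 wording


@dataclass
class GpsInputQuality:
    fix_type: int          # GPS_INPUT.fix_type: 0/1 = no fix, 2 = 2D fix, 3 = 3D fix
    horiz_accuracy: float  # metres


def quality_for(label, anchor_sigma_m, seconds_since_anchor, drift_m_per_s,
                have_estimate=True):
    """Map a source label to GPS_INPUT quality fields (hypothetical helper)."""
    if not have_estimate:
        return GpsInputQuality(0, NO_FIX_SENTINEL_M)
    if label == "satellite_anchored":
        sigma = anchor_sigma_m
    else:
        # visual_propagated / dead_reckoned: ASSUMED linear growth since last anchor.
        sigma = anchor_sigma_m + drift_m_per_s * seconds_since_anchor
    if sigma >= NO_FIX_SENTINEL_M:
        return GpsInputQuality(0, NO_FIX_SENTINEL_M)
    fix_type = 3 if sigma <= DEGRADE_2D_ABOVE_M else 2
    return GpsInputQuality(fix_type, sigma)


print(quality_for("visual_propagated", 15.0, 60, 0.5))
print(quality_for("dead_reckoned", 15.0, 200, 0.5))
```

The point of the sketch is that the label never changes the message type: only `horiz_accuracy` and `fix_type` move, so the FC sees one continuous feed whose reported uncertainty widens between anchors.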
+
+### Fact #5 — iNav firmware (master, post-9.0) has NO inbound MAVLink handler for any external-positioning message
+- **Statement**: Authoritative inbound switch in `src/main/telemetry/mavlink.c::processMAVLinkIncomingTelemetry` (master) handles only: HEARTBEAT, PARAM_REQUEST_LIST (stub reply), MISSION_CLEAR_ALL, MISSION_COUNT, MISSION_ITEM, MISSION_REQUEST_LIST, MISSION_REQUEST, COMMAND_INT (only `MAV_CMD_DO_REPOSITION`), RC_CHANNELS_OVERRIDE, ADSB_VEHICLE, RADIO_STATUS. **No `GPS_INPUT`, `VISION_POSITION_ESTIMATE`, `ODOMETRY`, `GLOBAL_POSITION_INT`, or `GPS_RAW_INT` are accepted as inputs.** Wiki page (Source #10) confirms: "Limited command support: Commands that are not implemented are ignored."
+- **Source**: Source #9 (master code), Source #10 (wiki, edited 2025-12-11)
+- **Phase**: Phase 2
+- **Target Audience**: System designers + AC-4.3 author
+- **Confidence**: ✅
+- **Related Dimension**: C8, AC-4.3
+- **Fit Impact**: **DISQUALIFIES the literal AC-4.3 wording** ("the standard external-positioning message type(s) accepted by ArduPilot AND iNav"). No single MAVLink external-positioning message is accepted by both FCs. Project must adopt a per-FC adapter design and AC-4.3 must be revised to acknowledge two transports.
+
+### Fact #6 — iNav accepts external GPS injection via two MSP paths; `MSP2_SENSOR_GPS` is the covariance-rich path
+- **Statement**: `MSP_SET_RAW_GPS (201)` (legacy MSP1, 14 bytes): fixType, numSat, lat, lon, alt (m, internal cm), speed (cm/s). **No covariance, no per-axis velocity, no yaw.** `MSP2_SENSOR_GPS (7939, MSPv2 sensor plugin)`: instance, gpsWeek, msTOW, fixType, satellitesInView, hPosAccuracy (mm), vPosAccuracy (mm), hVelAccuracy (cm/s), hdop, lat, lon, mslAltitude (cm), nedVelNorth/East/Down (cm/s), groundCourse (deg×100), trueYaw (deg×100), date+time. Routes through `mspGPSReceiveNewData()` via `GPS_PROVIDER_MSP`. Requires build flag `USE_GPS_PROTO_MSP`, which is **enabled by default in iNav's `target/common.h`**, so stock firmware reaches this path.
+- **Source**: Source #12 (MSP message reference, master), Source #13 (target/common.h master + gps.c provider table)
+- **Phase**: Phase 2
+- **Target Audience**: System designers (C8 adapter, MSP transport)
+- **Confidence**: ✅
+- **Related Dimension**: C8, C5 covariance contract
+- **Fit Impact**: **supports selection** of `MSP2_SENSOR_GPS` for the iNav adapter. Covariance fields (`hPosAccuracy`, `vPosAccuracy`, `hVelAccuracy`) align semantically with `GPS_INPUT.horiz_accuracy` / `vert_accuracy` / `speed_accuracy`, but unit conversions differ (mm vs m). The C8 adapter must therefore be FC-aware, not protocol-monomorphic.
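
A minimal sketch of what "FC-aware" means for the accuracy fields, assuming the units recorded in Facts #1 and #6 (metres and m/s for `GPS_INPUT`; mm and cm/s for `MSP2_SENSOR_GPS`). The `Estimate` type and both helper names are hypothetical; the yaw handling reflects the `GPS_INPUT` convention that `0` means "not provided", so a true north heading is sent as `36000`.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Hypothetical C5 estimator output in SI units."""
    h_sigma_m: float
    v_sigma_m: float
    vel_sigma_mps: float
    yaw_deg: float


def gps_input_yaw_cdeg(yaw_deg):
    # GPS_INPUT.yaw is in centidegrees; 0 is the "not provided" sentinel,
    # so a heading of exactly north is encoded as 36000.
    cdeg = round((yaw_deg % 360.0) * 100)
    return 36000 if cdeg == 0 else cdeg


def to_gps_input_fields(est):
    # ArduPilot path: accuracies stay in metres and m/s.
    return {
        "horiz_accuracy": est.h_sigma_m,
        "vert_accuracy": est.v_sigma_m,
        "speed_accuracy": est.vel_sigma_mps,
        "yaw": gps_input_yaw_cdeg(est.yaw_deg),
    }


def to_msp2_sensor_gps_fields(est):
    # iNav path: integer fields, units as recorded in Fact #6.
    return {
        "hPosAccuracy": round(est.h_sigma_m * 1000),     # m -> mm
        "vPosAccuracy": round(est.v_sigma_m * 1000),     # m -> mm
        "hVelAccuracy": round(est.vel_sigma_mps * 100),  # m/s -> cm/s
    }


est = Estimate(h_sigma_m=12.5, v_sigma_m=20.0, vel_sigma_mps=0.8, yaw_deg=0.0)
print(to_gps_input_fields(est))
print(to_msp2_sensor_gps_fields(est))
```

The same estimator output feeds both adapters; only the final scaling and integer packing differ, which is why the conversion belongs in C8 rather than in C5.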
+
+### Fact #7 — iNav does NOT support dual-GPS arbitration; companion must be the SOLE GPS source
+- **Statement**: Issue #10141 is an open feature request for dual-GPS support. Current iNav (master incl. 9.0.x) has single-GPS architecture with one UART selected as the GPS port. There is no primary/secondary failover and no per-instance arbitration in the nav stack.
+- **Source**: Source #14
+- **Phase**: Phase 2
+- **Target Audience**: System designers (architecture)
+- **Confidence**: ✅
+- **Related Dimension**: C8, C5, AC-NEW-2 (spoof promotion)
+- **Fit Impact**: **architectural constraint** — on iNav, real GPS receivers must NOT be wired directly to the FC. Real GPS goes to the companion; the companion fuses (or rejects) it and emits the single iNav-facing feed via MSP2_SENSOR_GPS (or via a UBX-emulation UART). AC-NEW-2 latency on iNav = companion's internal reaction time only; iNav does not participate in source switching at all.
+
+### Fact #8 — iNav explicitly does NOT validate GPS for spoofing; anti-spoofing is fully the companion's responsibility
+- **Statement**: iNav's `docs/GPS_fix_estimation.md` states verbatim: "Not a solution for GPS spoofing (GPS output is not validated in INAV)." Combined with Fact #7, the architectural conclusion on iNav: companion = anti-spoofing oracle + nav-camera estimator + IMU-propagation source, all collapsed into the single MSP2_SENSOR_GPS feed.
+- **Source**: Source #15
+- **Phase**: Phase 2
+- **Target Audience**: System designers; AC-NEW-2 / AC-3.5 / AC-NEW-8 owners
+- **Confidence**: ✅
+- **Related Dimension**: AC-NEW-2, AC-3.5, AC-NEW-8
+- **Fit Impact**: **supports selection** of "companion as iNav's only GPS"; **disqualifies** any architecture that relies on iNav-side spoof detection for AC-NEW-2 reaction.
+
+### Fact #9 — iNav dead-reckoning has documented stability bugs under intermittent feeds; AC-NEW-8 must avoid letting iNav enter dead-reckoning
+- **Statement**: Issue #10588 documents porpoising and motor-burst behaviour during intermittent GPS outages on iNav fixed-wing dead-reckoning. The community recommendation captured in the issue: "GPS should be rejected if providing erroneous coordinates rather than no fix." `inav_allow_dead_reckoning` (default OFF) and `inav_allow_gps_fix_estimation` (default OFF) are both fixed-state booleans — entering dead-reckoning mid-flight is a discrete transition, not a smooth degrade.
+- **Source**: Source #15, Source #16 (Settings.md), Source #17 (#10588)
+- **Phase**: Phase 2
+- **Target Audience**: System designers; AC-NEW-8 owner
+- **Confidence**: ✅ for setting names; ⚠️ for severity of stability bug (single open issue)
+- **Related Dimension**: AC-NEW-8, AC-3.5, C8
+- **Fit Impact**: **architectural constraint** — on iNav, the AC-NEW-8 path must keep emitting `MSP2_SENSOR_GPS` with growing `hPosAccuracy` rather than letting the feed drop and iNav switch to dead-reckoning. The "no fix" semantics on iNav must be expressed via `fixType` field of MSP2_SENSOR_GPS (not by silence). The horiz/vert accuracy fields are the only signal available; iNav has no equivalent of the AP `horiz_accuracy = 999.0` "no fix" sentinel — must verify which `fixType` enum values iNav treats as no-fix.
+
+### Fact #10 — iNav supports UBX-only over UART (NMEA dropped in 7.0); UBX emulation is a viable third transport
+- **Statement**: iNav 7.0 removed NMEA support. iNav 9.0+ supports only the u-blox UBX protocol, at protocol version ≥ 15.00. Recommended physical receivers: u-blox M8/M9/M10. Companion can implement a UBX-emulation writer on the iNav GPS UART (NAV-PVT mandatory; NAV-DOP optional). UBX carries `hAcc`/`vAcc`/`headAcc`/velocity components, so covariance honesty is preserved.
+- **Source**: Source #11 (iNav GPS-and-Compass-setup wiki)
+- **Phase**: Phase 2
+- **Target Audience**: System designers (transport-choice)
+- **Confidence**: ✅ for UBX-only; ⚠️ for "minimum NAV-* set": the canonical u-blox protocol spec (Source filed in agent-tools as `fd8513f8-...txt`) plus iNav's `gps_ublox.c` determine the precise message set; **this requires a follow-up search before final selection**.
+- **Related Dimension**: C8 transport choice
+- **Fit Impact**: **alternate candidate, NOT YET SELECTED** — UBX path bypasses MSP queueing/arbitration concerns and treats the companion as a normal GPS to iNav. Trade-off: implementation cost (UBX writer + correct ACK behaviour) vs. MSP path (already-designed wire format, but iNav-specific).
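
To give a sense of the UBX-writer implementation cost, the framing layer is small. The sketch below builds a UBX frame (sync bytes, class, ID, little-endian length, payload, 8-bit Fletcher checksum over class through payload). The function names are hypothetical, and filling the NAV-PVT payload itself is deliberately left out, since the required field set is the open follow-up noted above.

```python
import struct

UBX_SYNC = b"\xb5\x62"
NAV_CLASS, NAV_PVT_ID = 0x01, 0x07  # UBX-NAV-PVT


def ubx_checksum(body: bytes) -> bytes:
    """8-bit Fletcher checksum over class, id, length, and payload."""
    ck_a = ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes((ck_a, ck_b))


def ubx_frame(msg_class: int, msg_id: int, payload: bytes = b"") -> bytes:
    """Wrap a payload in UBX framing (hypothetical helper name)."""
    body = struct.pack("<BBH", msg_class, msg_id, len(payload)) + payload
    return UBX_SYNC + body + ubx_checksum(body)


# Zero-length NAV-PVT frame, useful as a checksum sanity check.
print(ubx_frame(NAV_CLASS, NAV_PVT_ID).hex())
```

Framing is the easy part; the cost called out in the trade-off sits in populating NAV-PVT correctly and answering the configuration traffic iNav sends at start-up.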
+
+---
+
+## SQ6 — Conclusions (working summary, will be re-checked at Step 7.5)
+
+### Per-FC adapter design is unavoidable (single-message AC-4.3 wording is unsatisfiable)
+
+| FC | Inbound external-positioning transport | Message | Covariance fields | Per-axis velocity | Yaw | Source-switching from companion |
+|---|---|---|---|---|---|---|
+| **ArduPilot Plane** | MAVLink (TELEM/USB/UDP serial) | `GPS_INPUT` (id 232), primary | `horiz_accuracy`, `vert_accuracy` (m); `speed_accuracy` (m/s) | `vn`, `ve`, `vd` (m/s) | `yaw` cdeg, 0 = not provided | `MAV_CMD_SET_EKF_SOURCE_SET` (FW supports; stock GCS UIs do not; companion-driven OK) |
+| **iNav** | MSP2 (UART/USB) | `MSP2_SENSOR_GPS` (id 7939), primary candidate | `hPosAccuracy` mm, `vPosAccuracy` mm, `hVelAccuracy` cm/s | `nedVelNorth/East/Down` cm/s | `trueYaw` deg×100 | **N/A**: iNav has single-GPS arch; companion = sole GPS source |
+| iNav alt 1 | MSP1 | `MSP_SET_RAW_GPS` (id 201), **rejected for production** | none | none | none | N/A |
+| iNav alt 2 | UART | UBX emulation (NAV-PVT etc.), **alternate candidate, requires NAV-* subset verification** | UBX `hAcc`/`vAcc` (mm), `headAcc` (deg×1e-5) | NED in NAV-PVT | yes | N/A |
+
+**Selection (preliminary, pending Step 7.5 component-fit gate):**
+- **AP path**: `GPS_INPUT` — Selected (lead).
+- **iNav path**: `MSP2_SENSOR_GPS` — Selected (lead). UBX-emulation kept as fallback if MSP2_SENSOR_GPS proves rate-limited or quality-flag-lossy.
+
+### AC / Restriction binding (per-mode, Per-Mode API Capability Verification rule)
+
+| Numbered AC / Restriction | AP `GPS_INPUT` | iNav `MSP2_SENSOR_GPS` | iNav `MSP_SET_RAW_GPS` |
+|---|---|---|---|
+| AC-1.4 (95% cov + source label `{satellite_anchored, visual_propagated, dead_reckoned}`) | **Pass** (`horiz_accuracy` carries 95% covariance proxy; source label is companion-side metadata, not in MAVLink — emit via STATUSTEXT/NAMED_VALUE_FLOAT) | **Pass** (`hPosAccuracy` = covariance proxy; same off-band source-label channel) | **Fail** (no covariance field → cannot publish 95% ellipse) |
+| AC-NEW-4 (false-position safety budget; covariance honesty) | **Pass** (de-weighted via `EK3_GLITCH_RAD` if covariance is honest) | **Verify** (need to confirm iNav nav-stack actually uses `hPosAccuracy` for outlier handling; pre-Step-7.5 follow-up) | **Fail** |
+| AC-NEW-2 (<3 s p95 spoof promotion) | **Verify** via SITL (`MAV_CMD_SET_EKF_SOURCE_SET` round-trip latency under load) | **Pass** by architecture (companion is sole GPS, no FC-side switch needed) | Pass-by-arch but Fails AC-1.4 |
+| AC-NEW-8 (visual-blackout + spoofed GPS failsafe; covariance growth + degraded fix levels) | **Pass** (`fix_type` 0/1/2 + `horiz_accuracy=999.0` documented sentinel maps to AC-NEW-8 thresholds) | **Verify** (iNav's `fixType` enum mapping for "no fix" — pre-Step-7.5 follow-up) | **Fail** (no graceful degrade signal) |
+| AC-3.5 (label switch within ≤1 frame OR ≤400 ms; reject spoofed GPS as input) | **Pass** by architecture (EKF source switch + STATUSTEXT) | **Pass** by architecture (companion suppresses spoofed-GPS contribution upstream) | Pass-by-arch but Fails AC-1.4 |
+| AC-4.3 (FC accepts the chosen messages) | **Pass** | **Pass** (default build, `USE_GPS_PROTO_MSP` on) | **Pass** but Fails AC-1.4 — discard |
+| Restriction "Supported FCs: ArduPilot, iNav (both via standard MAVLink)" | **Pass** | **Fail** of "via standard MAVLink" — restriction's literal wording is incorrect because iNav has no inbound MAVLink external-positioning. The restriction must be revised to "ArduPilot via MAVLink GPS_INPUT; iNav via MSP2_SENSOR_GPS". | n/a |
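
The binding table can be read mechanically: a candidate survives if it has no `Fail`, and every `Verify` becomes an open follow-up. The sketch below encodes only the AC rows of the table (the restriction row concerns wording, not capability); the dictionary layout and helper names are illustrative assumptions.

```python
# Per-mode bindings transcribed from the AC rows of the table above.
BINDINGS = {
    "AP GPS_INPUT": {
        "AC-1.4": "Pass", "AC-NEW-4": "Pass", "AC-NEW-2": "Verify",
        "AC-NEW-8": "Pass", "AC-3.5": "Pass", "AC-4.3": "Pass",
    },
    "iNav MSP2_SENSOR_GPS": {
        "AC-1.4": "Pass", "AC-NEW-4": "Verify", "AC-NEW-2": "Pass",
        "AC-NEW-8": "Verify", "AC-3.5": "Pass", "AC-4.3": "Pass",
    },
    "iNav MSP_SET_RAW_GPS": {
        "AC-1.4": "Fail", "AC-NEW-4": "Fail", "AC-NEW-2": "Pass",
        "AC-NEW-8": "Fail", "AC-3.5": "Pass", "AC-4.3": "Pass",
    },
}


def viable_modes(bindings):
    """Modes with no Fail binding against any numbered AC."""
    return [mode for mode, acs in bindings.items() if "Fail" not in acs.values()]


def open_verifications(bindings):
    """ACs still marked Verify, per viable mode."""
    return {
        mode: [ac for ac, verdict in bindings[mode].items() if verdict == "Verify"]
        for mode in viable_modes(bindings)
    }


print(viable_modes(BINDINGS))
print(open_verifications(BINDINGS))
```

Run against the table, this leaves `GPS_INPUT` and `MSP2_SENSOR_GPS` as the two viable modes, with AC-NEW-2 open on the ArduPilot side and AC-NEW-4 / AC-NEW-8 open on the iNav side.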
+
+### Required AC / Restrictions edits flagged for user review
+
+1. **AC-4.3** — current text says "the standard external-positioning message type(s) accepted by ArduPilot and iNav". Reality: no single message type is accepted by both. **Proposed revision** (outcome-shaped, IEEE-830-style): "WGS84 coordinates are delivered to each supported FC via that FC's documented external-positioning interface — MAVLink `GPS_INPUT` for ArduPilot Plane, MSP2 `MSP2_SENSOR_GPS` for iNav. Honest covariance is carried in the field each FC uses for outlier rejection (under-reported covariance is a defect — see AC-NEW-4). Source-label semantics per AC-1.4 are emitted out-of-band (FC-appropriate STATUSTEXT / NAMED_VALUE_FLOAT / equivalent)."
+2. **Restriction "Communication protocol (pinned): MAVLink for both FC and GCS"** — incorrect for iNav. **Proposed revision**: "Communication protocol: MAVLink for ArduPilot Plane and for QGroundControl GCS; MSP2 for iNav (UART or USB transport). MAVLink remains the GCS-facing protocol for both FCs." (iNav still emits MAVLink telemetry outbound to QGC; this is preserved.)
+3. **AC-NEW-2** — keep numerical budget (<3 s p95) but split per-FC validation: ArduPilot validation = SITL round-trip of `MAV_CMD_SET_EKF_SOURCE_SET` from companion under spoof injection; iNav validation = companion-internal reaction time (companion-only metric — iNav doesn't participate).
+4. **AC-NEW-8** — language "fix-quality 2D fix or worse when covariance > 100 m" maps to `GPS_INPUT.fix_type` for AP. iNav's `fixType` enum mapping (per `gpsFixType_e` in iNav's enums-reference) must be confirmed at design time before this AC is testable on iNav.
+
+### Open follow-up probes (deferred to SQ8 + design phase, NOT blocking SQ6 closure)
+
+- **(SQ8)** Confirm the precise MAVLink message + field set ArduPilot exposes for spoofing/jamming integrity reports (PR #2110 merged, but `GPS_RAW_INT` in current published common.xml shows no spoofing bits — likely lives in a sibling message such as `GPS_INTEGRITY`). This is the FC→companion direction needed for AC-NEW-2's input side and AC-3.5's spoofing detection.
+- **(SQ8)** UBX-emulation minimum NAV-* subset for iNav 9.0 (UBX ≥ 15.00). Authoritative inputs: u-blox protocol spec (cached) + iNav `gps_ublox.c` (cached). Output a "minimum companion-side UBX writer" definition.
+- **(design)** SITL parameter sets for both FCs for AC-NEW-2 / AC-NEW-8 validation. Out of research scope.
+- **(design)** Verify iNav nav-stack consumption of `MSP2_SENSOR_GPS.hPosAccuracy` for outlier handling (read `src/main/io/gps_msp.c` / `mspGPSReceiveNewData` in design phase, not research phase).
+
+### Boundary check: this SQ6 is saturated for the architectural decision
+
+Saturation signals observed: ArduPilot side covered by L1 docs + L1 source code; iNav side covered by L1 source code (master) + L1 wiki (edited 2025-12-11) + L1 release notes (8.0/9.0). Three independent rounds of search yielded the same architectural conclusion (no inbound external-positioning MAVLink on iNav). Last queries returned no novel facts. Per `references/source-tiering.md` "Search saturation rule" → SQ6 is closed pending the SQ8 follow-up probes above; user decision required on the AC/restriction edits before further architectural work.
+
+---
+
+## SQ1 — Existing / competitor GPS-denied UAV navigation systems
+
+### Fact #11 — Twist Robotics OSCAR is a deployed Ukrainian peer system in the same architectural class as this project
+- **Statement**: Twist Robotics (Ukraine) has a fielded camera + map-matching navigation module called OSCAR (Optical System of Coordinates with Automatic Relocalisation). The vendor states the system "captures the terrain, identifies landmarks, compares them with a map, determines coordinates, and transmits them to the autopilot as a reliable GPS signal" — the same five-stage architecture this project is building. Vendor-stated specs: ≤20 m accuracy without cumulative error, day/night/fog operation, and operational deployment of "more than 500,000 km across 25,000 combat missions over 24 months". Hardware includes active cooling, indicating a non-trivial onboard compute (likely Jetson-class). **No public independent benchmark of the 20 m number.**
+- **Source**: Source #25, Source #26
+- **Phase**: Phase 2
+- **Target Audience**: System architects + AC owners (existence-of-peer evidence, not implementation guide)
+- **Confidence**: ✅ for "deployed at scale on Ukrainian combat platforms"; ⚠️ for "20 m accuracy" (vendor self-report); ❓ for "fully resistant to spoofing and jamming" (claim not independently verified)
+- **Related Dimension**: SQ1, SQ8 (anti-spoofing claim audit), SQ9 (synthesis — ours must beat or at least match this in the operational regime)
+- **Fit Impact**: **establishes feasibility floor** — a Ukrainian peer is operating a similar architecture against the same threat environment our system targets. Project framing must explicitly differentiate (e.g., 1 km AGL vs unspecified OSCAR altitude; 8 h endurance vs unspecified OSCAR endurance; AC-NEW-4 honest covariance contract vs OSCAR's unspecified covariance reporting).
+
+### Fact #12 — Auterion Artemis is a production-shipping fixed-wing one-way attack drone with Ukraine-validated GPS-denied navigation, defining the production benchmark for this class
+- **Statement**: Auterion completed the US Defense Innovation Unit Artemis program in October 2025, delivering a Shahed-class deep-strike drone with up to 1,000-mile range and up to 40 kg warhead, running on **Auterion Skynode N mission computer + Auterion Visual Navigation system + built-in terminal guidance**. Government evaluators signed off after operational flight tests in Ukraine including ground launch, GPS and GPS-denied navigation, long-range transit, and terminal engagement. Manufacturing is being established in US, UA, and DE; Auterion is offering the system to the US Department of War and allied nations.
+- **Source**: Source #31; Source #32 confirms Skynode S sibling architecture (NPU-equipped companion).
+- **Phase**: Phase 2
+- **Target Audience**: System architects (production-pattern reference)
+- **Confidence**: ✅
+- **Related Dimension**: SQ1 (closest commercial production peer), SQ9 (architecture template)
+- **Fit Impact**: **establishes production reference architecture** — companion-class autopilot + visual navigation + terminal guidance is shipping at production scale to a US defense customer. Implication: building a per-FC adapter (project decision in SQ6) is consistent with what production stacks already do; integrating against the Artemis architecture is realistic; competing on price + Ukraine-specific operational tuning + AC-NEW-4 honest-covariance contract is a viable differentiation.
+
+### Fact #13 — Vantor Raptor is a production COTS visual-GPS-replacement software suite, demonstrating that "branded sat-tile basemap + on-drone vision software" is a viable commercial pattern
+- **Statement**: Vantor Raptor product family (Guide / Sync / Ace) provides vision-based GPS replacement using the drone's existing camera plus Vantor's "100 million-plus sq km of highly accurate 3D terrain data" (Vivid Terrain, vendor-stated 3 m accuracy). Vendor-demonstrated absolute accuracy: **<7 m in all dimensions** for aerial position (Guide), **<3 m** for ground coordinate extraction (Sync, Ace). Works at night and at low altitudes. Platform-agnostic, deployable on commodity hardware, integrates with existing onboard cameras. Inertial Labs has published a VINS-integrated Raptor Guide white paper. Recent partnerships: Niantic Spatial (Dec 2025) for unified air-to-ground positioning in GPS-denied areas; Maxar partnership with AIDC (Sep 2025) for Taiwan UAV resilience against GPS interference.
+- **Source**: Source #30
+- **Phase**: Phase 2
+- **Target Audience**: Architecture / business decision-makers (build-vs-buy framing)
+- **Confidence**: ✅ for product existence + claimed accuracy bounds (vendor primary); ⚠️ for whether Vantor's commercial accuracy figures hold under the project's specific Ukrainian-steppe + active-conflict-tile-staleness conditions
+- **Related Dimension**: SQ1 (commercial), C2/C3 (commercial alternatives to building ourselves), SQ8 (basemap as a service vs offline cache)
+- **Fit Impact**: **build-vs-buy lens** — Raptor Guide's <7 m claim is *better* than the project's AC-1.1 budget (≤80 m / 95% under AC-1.1.1), so it's not a disqualifier on accuracy. Reasons we still build vs buy: (a) Vantor is a US vendor; export / dual-use licensing into the Ukrainian battlefield is uncertain; (b) restrictions specify offline cache from the project's own Azaion Suite Satellite Service (AC-2.x), not Vantor's Vivid Terrain — replacing the basemap is non-negotiable; (c) covariance honesty contract (AC-NEW-4) and source-label contract (AC-1.4) are project-specific and may not be exposed by Vantor's API. **Outcome**: keep Raptor as a competitive comparator in `solution_draft01`, NOT as a candidate component to integrate.
+
+### Fact #14 — snktshrma/ngps_flight (NGPS — ArduPilot GSoC 2024) is the closest open-source pipeline match to this project's exact C1+C2+C3+C5+C8 stack
+- **Statement**: NGPS = ROS 2 + ArduPilot pipeline composed of three packages: **`ap_ngps_ros2`** (visual geo-localization at 1–2 Hz by matching live camera frames to georeferenced satellite imagery using **LightGlue + SuperPoint**, deep-learning-based feature matching), **`ap_ukf`** (Unscented Kalman Filter fusing NGPS absolute positions with VIO estimates), **`ap_vips`** (VIO providing relative pose). Output is fused odometry to ArduPilot's EKF (per related ArduPilot issue #23471, this is via `VISION_POSITION_ESTIMATE` requiring EKF source-set 2/3 with `EK3_SRC*_POSXY=Vision`). Project is published under ArduPilot's GSoC 2024 program. Sibling `ap_nongps` is an earlier OpenCV-based prototype.
+- **Source**: Source #33
+- **Phase**: Phase 2
+- **Target Audience**: Implementer / Engineer
+- **Confidence**: ✅ for project existence, component breakdown, and matcher choice (LightGlue+SuperPoint); ⚠️ for runtime behaviour under our exact constraints (Jetson Orin Nano, 1 km AGL, 17 m/s, 3 fps); ❓ for production hardening / covariance honesty / spoof-defence (none documented)
+- **Related Dimension**: SQ1 (closest open-source peer), SQ2 (canonical pipeline confirmation), SQ3+SQ4 (architectural template for component candidate matrix), SQ6 (alternate AP transport debate)
+- **Fit Impact**: **architectural template** — confirms the project's split (C1 VIO ↔ C2/C3 visual absolute ↔ C5 fusion ↔ C8 FC adapter) is canonical, not novel. Two concrete deltas:
+  1. **Transport choice on AP**: NGPS uses `VISION_POSITION_ESTIMATE`. SQ6 picked `GPS_INPUT` because it carries `horiz_accuracy` directly, supports source-set switching via `MAV_CMD_SET_EKF_SOURCE_SET`, and avoids EKF-source-set reconfiguration. The trade-off (NGPS's path vs SQ6's pick) must be re-examined at design time before final AP-transport selection.
+  2. **Estimator choice**: NGPS uses UKF; SQ3/SQ4 will compare UKF vs ESKF vs MSCKF vs factor-graph (GTSAM) on the same matrix.
+
+### Fact #15 — RGB satellite-image matching is unreliable as a *low-altitude* (<25 m AGL) localization technique per the SPRIN-D Challenge; our 1 km AGL flight profile falls in the regime where the same authors note it "works reasonably well"
+- **Statement**: The CTU Prague team's SPRIN-D winning paper directly states: *"Some teams used RGB satellite image-based matching, but this has proved to be highly unreliable at such low altitudes."* (referring to <25 m AGL). The paper's related-work review separately notes that *"high-altitude matching... works reasonably well, but at low altitudes (25 m) the viewpoint differs drastically, making roofs, facades, and vegetation inconsistent with satellite imagery."* The project operates at 1 km AGL, which is the *high-altitude* regime in the paper's terminology, making RGB sat-matching the appropriate technique class. The paper's CPU-only winning method (LiDAR heightmap-gradients + clustered particle filter) is **not** transferable to our hardware: our project has no LiDAR.
+- **Source**: Source #28
+- **Phase**: Phase 2
+- **Target Audience**: Implementer / Engineer + Domain expert
+- **Confidence**: ✅
+- **Related Dimension**: SQ1, SQ5 (failure modes), SQ2 (canonical pipeline)
+- **Fit Impact**: **disambiguates a potentially-disqualifying lesson** — the CTU paper's "RGB sat-matching is unreliable" finding does NOT disqualify our approach because the failure was caused by low-altitude viewpoint mismatch, which our 1 km AGL regime does not have. This must be cited explicitly in `solution_draft01` to pre-empt the natural objection from anyone who reads the paper. Separately, the CTU paper's specific lessons are still binding: VIO degrades catastrophically without IMU vibration isolation; magnetometer is unreliable near steel/concrete; "ability to recover from periods of high uncertainty and re-localize" matters more than instantaneous RMSE — this last lesson is a direct architectural input for AC-NEW-2 / AC-NEW-8.
+
+### Fact #16 — RTAB-Map and ORB-SLAM3 both degrade beyond 1 km in the SPRIN-D simulator tests (RTAB-Map also loses odometry above 2 m/s; ORB-SLAM3 loses tracking in textureless areas); our cruise profile (≤17 m/s, kilometers between satellite anchors) excludes both as primary candidates
+- **Statement**: The SPRIN-D paper states: *"We tested state-of-the-art visual SLAM systems such as RTAB-Map and ORB-SLAM3 in a high-fidelity simulator, and found that both performance degraded significantly in a long-range scenario (beyond 1 km), as their memory and compute demands grow with the size of the environment. Moreover, RTAB-Map was unable to maintain quality odometry in faster flight speeds (beyond 2 m/s), while ORB-SLAM3 suffered from tracking loss in textureless areas."*
+- **Source**: Source #28
+- **Phase**: Phase 2
+- **Target Audience**: Implementer / Engineer (component selection for C1)
+- **Confidence**: ✅
+- **Related Dimension**: SQ1, SQ3+SQ4 component C1 (VO/VIO), SQ5 (failure modes)
+- **Fit Impact**: **prunes the C1 candidate landscape** — RTAB-Map and ORB-SLAM3 should not be pursued as C1 leads. Plausible C1 leads remain: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure VO baseline (KLT + RANSAC homography). NGPS (Fact #14) uses `ap_vips` = OpenVINS-class VIO — confirming an aligned community choice. Final C1 selection happens in SQ3+SQ4.
+
+### Fact #17 — DSMAC + TERCOM lineage: pre-cached scene matching for downward-looking navigation is a 40+ year deployed technique class with documented sub-10 m terminal accuracy
+- **Statement**: DSMAC (Digital Scene Matching Area Correlator) is an autonomous missile-guidance system based on area correlation of sensed downward-camera ground scenes against pre-stored reference imagery (often satellite reconnaissance). It achieves 3–10 m terminal accuracy by correlating buildings, road intersections, and distinctive terrain landmarks. Tomahawk: TERCOM (radar altimeter + DEM) for mid-flight + DSMAC for terminal guidance reduces CEP from ~30 m to "only meters". Documented combat record: 1991 Gulf War, >80% of 280 launched Tomahawks hit target. Recent miniaturisation: Destinus Ruta (300 km strike-class) is integrating UAV Navigation's (Spanish, Grupo Oesía) DSMAC-class system, validated in Ukrainian combat conditions including GNSS-denied / jamming / spoofing.
+- **Source**: Source #36, Source #27
+- **Phase**: Phase 2
+- **Target Audience**: Domain expert + Decision-maker
+- **Confidence**: ✅ for the lineage and Tomahawk performance numbers (DTIC + open-source); ⚠️ for the Ruta-specific "DSMAC operating principle" inference (Defense Express analyst inference, not vendor disclosure)
+- **Related Dimension**: SQ1 (lineage), SQ8 (baseline accuracy expectations for AC-1.1.1 80 m / AC-NEW-4 false-position budget)
+- **Fit Impact**: **establishes baseline accuracy expectations** — the technique class has documented sub-10 m accuracy in the cruise-missile-terminal regime. Our budget (AC-1.1.1: <80 m at 1 km AGL with ≥0.5 m/px tiles) is loose by comparison, indicating that the AC budget is *not* aggressive against the technique-class baseline — it is aggressive against the Jetson Orin Nano + 8-h-continuous + 25 W envelope. **Implication for AC-NEW-4**: claiming P(error >500 m) <0.1% per flight is consistent with the DSMAC-lineage class; an honestly-reported failure rate at this level is realistic, not unprecedented.
+
+### Fact #18 — Hierarchical Image Matching (arXiv 2506.09748, June 2025) is a current academic SOTA pipeline for our exact problem, but uses DINOv2 — a heavyweight foundation model that must be benchmarked under our 25 W / 8 GB Jetson envelope before any selection
+- **Statement**: 2025 academic SOTA pipeline structure: (1) image retrieval module (off-the-shelf, optimal-transport feature aggregation); (2) Semantic-Aware and Structure-Constrained Matching Module (SASCM) using **DINOv2** features + 4D correlation tensor + SoftMNN + 4D conv; (3) lightweight fine-grained matching module for pixel-level. Constructs UAV absolute visual localization without VIO/relative-localization dependence (retrieval-and-matching only). Evaluation on AerialVL + their own CS-UAV dataset claims superior accuracy under cross-source and cross-temporal variation.
+- **Source**: Source #29
+- **Phase**: Phase 2
+- **Target Audience**: Implementer / Engineer + Domain expert
+- **Confidence**: ✅ for pipeline structure and method; ⚠️ for "superior" claim (single-paper benchmark; AerialExtreMatch evaluates 16 methods with broader rigor — Source #34 is the better cross-method ranker); ❓ for Jetson-Orin-Nano runtime (no published number)
+- **Related Dimension**: SQ1 (academic SOTA), C2 (VPR), C3 (cross-domain registration), SQ5 (foundation-model-on-Jetson failure mode)
+- **Fit Impact**: **academic-SOTA snapshot, candidate template** — the retrieval → semantic-aware coarse → fine-grained pipeline is a candidate template for our C2+C3, but DINOv2 introduces a Jetson-deployment risk that must be quantified before commitment. Candidate-level decision: include DINOv2-based pipelines (AnyLoc, BoQ, this paper's SASCM) in the C2/C3 candidate matrix with mandatory MVE on Jetson Orin Nano under our exact frame size and 3 fps cadence. Reject DINOv2 if total inference latency cannot be brought under (400 ms - other-stages budget) at INT8 / fp16. Per Source #28 lesson, classical matchers (LightGlue+SuperPoint as in NGPS) should also be in the matrix as the "simple baseline / known-Jetson-runnable" option.
+
+### Fact #19 — AerialExtreMatch (2025) is the academic benchmark our C2+C3 candidate matrix must publish numbers against, with 32 difficulty-stratified cells exposing exactly the cross-source / cross-pitch / cross-scale failure modes our project will face
+- **Statement**: AerialExtreMatch publishes (a) 1.5 M synthetic train pairs (RGB+depth, diverse UAV/satellite viewpoints); (b) ~30,000 evaluation pairs in **32 difficulty levels** stratified by overlap (4 bins: <20%, 20–40%, 40–60%, >60%), pitch difference (4 bins: 50–55°, 55–60°, 60–65°, 65–70°), and scale variation (2 bins: 1–2×, >2×); (c) a real-world UAV-localization split captured with DJI M300 RTK + H20T against UAV-derived orthomosaic/DSM AND lower-quality satellite maps. The benchmark evaluates 16 representative detector-based and detector-free image matching methods.
+- **Source**: Source #34
+- **Phase**: Phase 2
+- **Target Audience**: Domain expert + Implementer
+- **Confidence**: ✅
+- **Related Dimension**: SQ1 (academic landscape), SQ7 (datasets), C2 (VPR), C3 (cross-domain registration)
+- **Fit Impact**: **defines the C2/C3 evaluation matrix** — every C2/C3 candidate going into `solution_draft01` must report numbers on AerialExtreMatch's 32 difficulty cells, with at least the high-pitch (65–70°) and high-scale (>2×) cells representing our worst-case (UAV vs satellite tile geometry mismatch + ortho-rectification residual). The dataset's real-world UAV-localization split with both UAV-orthomosaic AND satellite-map references mirrors our project's offline-cache-tile semantics directly.
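+
+The 32-cell grid follows directly from the bin counts in the statement (4 overlap × 4 pitch × 2 scale). A minimal sketch of enumerating the cells and flagging the ones called out as worst-case is shown here; the bin labels are copied from the statement, while the function names and the "high pitch OR high scale" reading of the reporting requirement are assumptions, not part of the benchmark's tooling.
+
```python
from itertools import product

# Bin labels copied from the Fact #19 statement.
OVERLAP_BINS = ["<20%", "20-40%", "40-60%", ">60%"]
PITCH_BINS = ["50-55", "55-60", "60-65", "65-70"]  # pitch difference, degrees
SCALE_BINS = ["1-2x", ">2x"]


def difficulty_cells():
    """Return every (overlap, pitch, scale) cell of the difficulty grid."""
    return list(product(OVERLAP_BINS, PITCH_BINS, SCALE_BINS))


def must_report_cells(cells):
    """Cells in the highest pitch bin or the >2x scale bin.

    Assumption: 'at least the high-pitch and high-scale cells' is read as
    the union of the two groups, not only their intersection.
    """
    return [c for c in cells if c[1] == "65-70" or c[2] == ">2x"]


cells = difficulty_cells()
flagged = must_report_cells(cells)
print(len(cells), len(flagged))
```
+
+Under the union reading, 20 of the 32 cells are flagged; a stricter intersection reading would flag only the 4 cells that are both high-pitch and high-scale.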
+
+### Fact #20 — DARPA FLA + USAF SBIR establish the US-defense-program tailwind, but do not directly validate the project's specific regime (fixed-wing, ~1 km AGL, sat-tile basemap, 8-h endurance)
+- **Statement**: DARPA Fast Lightweight Autonomy (FLA) program ran 2015–2018 (Phase 1 Florida 2017; Phase 2 Georgia 2018; complete). Focused on small quadcopter autonomy at ≤20 m/s through cluttered indoor/outdoor environments using onboard cameras + LIDAR + sonar + IMU, no GPS / datalink / pilot. A 2025 retrospective (arXiv 2504.08122) reviews FLA testing methodology and Phase 1 results. A 2025 USAF SBIR Phase II solicitation (Sweetspot ID `7946c818-409f-5b31-8f06-554466071d83`) requests visual position and navigation capability for sUAS in GPS-denied environments, confirming that the regulatory + funding environment for this category was active in 2025.
+- **Source**: Source #35
+- **Phase**: Phase 2
+- **Target Audience**: Decision-maker + Domain expert
+- **Confidence**: ✅
+- **Related Dimension**: SQ1 (defense-program lineage)
+- **Fit Impact**: **context only, no direct candidate gain** — FLA pre-dates the project's specific regime by 8 years, focused on a different platform (multirotor) and altitude (low-altitude obstacle avoidance, not 1 km AGL nadir-camera satellite-anchor). Useful only to establish lineage and context. The USAF SBIR datapoint is more directly relevant: confirms that an active US-defense-funded need exists for sUAS visual position + navigation in GPS-denied environments — i.e., the project's market exists outside Ukraine.
+
+---
+
+## SQ1 — Conclusions (working summary, will be re-checked at Step 7.5)
+
+### Existing-systems landscape (6 named-and-evidenced peer / adjacent systems)
+
+| System | Class | Operational regime | Closest match dimension | Closest mismatch dimension | Status as evidence |
+|---|---|---|---|---|---|
+| **Twist Robotics OSCAR** (UA) | Deployed Ukrainian peer | Combat-deployed, fixed-wing-class, GPS-denied vision-nav | **Same architecture, same threat environment** | Altitude / endurance / FC / accuracy contract not publicly specified | Closest peer for "feasibility floor" |
+| **Auterion Artemis** | Production COTS one-way attack drone | Shahed-class, 1000-mile range, 40 kg warhead, Ukraine-validated GPS-denied nav | Same architectural pattern (Skynode + Visual Navigation + terminal guidance) | One-way attack vs reusable; no covariance/source-label contract published | Closest production reference architecture |
+| **Vantor Raptor (Guide / Sync / Ace)** | Production COTS software suite | Vision-based GPS replacement on existing drone camera + Vivid Terrain 3D basemap | Visual-position software pattern | Vendor-managed sat-tile basemap is not the project's Azaion Suite Satellite Service; no AC-NEW-4 / AC-1.4 contract | Closest commercial peer for "build-vs-buy" framing |
+| **snktshrma/ngps_flight (NGPS, ArduPilot GSoC 2024)** | Open-source research prototype | LightGlue+SuperPoint+UKF+`VISION_POSITION_ESTIMATE` to AP | **Same component split, same FC family** | GSoC prototype, not production; no spoof defence; no covariance honesty | **Closest open-source pipeline match — explicit architectural template** |
+| **CTU Prague SPRIN-D winner** | Academic / competition | Multirotor, ≤25 m AGL, LiDAR + heightmap gradient + particle filter on CPU | "Recover-from-uncertainty > low-instantaneous-RMSE" lesson; VIO discipline | LiDAR-required, low-altitude regime, no sat-tile basemap | Architectural-pattern reference + cautionary tale |
+| **Destinus Ruta + UAV Navigation** | Production miniaturised cruise missile | 300 km strike, DSMAC-class, Ukraine-combat-validated | Pre-cached basemap + visual matching + autopilot ingestion | One-way attack, terminal guidance, no covariance contract | Shows DSMAC-class miniaturised into UAV tier |
+
+### Per-perspective coverage
+
+| Perspective | Facts supporting | Saturation status |
+|---|---|---|
+| **Implementer / Engineer** | Fact #14 (NGPS), Fact #16 (SLAM failure modes), Fact #18 (DINOv2 risk) | Saturated for SQ1 — deeper component-level deep-dives go to SQ3/SQ4 |
+| **Practitioner / Field (Ukraine)** | Fact #11 (OSCAR), Source #37 (~70% UAV losses to EW), Source #27 (Ruta + UAV Navigation Ukraine combat validation) | Saturated for SQ1 |
+| **Domain expert / Academic** | Fact #18 (Hierarchical Matching SOTA), Fact #19 (AerialExtreMatch benchmark), Fact #15 (SPRIN-D regime distinction) | Saturated for SQ1 — academic SOTA benchmarking handed off to SQ3/SQ4 + SQ7 |
+| **Contrarian / Devil's advocate** | Fact #15 (low-altitude RGB matching unreliable lesson), Fact #16 (RTAB-Map / ORB-SLAM3 disqualified), Fact #18 (DINOv2-on-Jetson risk) | Saturated for SQ1 |
+| **Decision-maker / Business** | Fact #12 (production-ready Auterion), Fact #13 (commercial Vantor build-vs-buy framing), Fact #20 (USAF SBIR market context) | Saturated for SQ1 |
+
+### Architectural conclusions for `solution_draft01`
+
+1. **Build-vs-buy stance**: build. Vantor Raptor and Auterion Visual Navigation are commercially superior on hardening + integration but neither exposes the covariance honesty contract (AC-NEW-4) nor uses the project-specified Azaion Suite Satellite Service tile cache (AC-2.x); both are dual-use export risks for the Ukrainian battlefield. NGPS (Fact #14) is the open-source architectural template to learn from but is a GSoC research prototype lacking production hardening, spoof defence, and the covariance-honesty contract. Architectural conclusion: build with NGPS as the template, with project-specific contracts (AC-NEW-4, AC-1.4, AC-NEW-7) and per-FC adapter (SQ6 conclusion) layered on top.
+2. **Differentiation from OSCAR (Twist Robotics)** must be made explicit in `solution_draft01`: (a) honest covariance contract per AC-NEW-4; (b) explicit `{satellite_anchored, visual_propagated, dead_reckoned}` source-label contract per AC-1.4; (c) AC-NEW-7 cache-poisoning safety budget on tile write-back; (d) ArduPilot Plane + iNav both supported per project's revised AC-4.3.
+3. **Pipeline canonicalness**: the C1+C2+C3+C4+C5+C8 split is canonical (NGPS + the 2025 hierarchical-matching paper + SPRIN-D winner all use the same shape; only the specific algorithm choices differ). SQ2 will sanity-check this against one more pipeline-survey paper, but this is essentially a low-risk question now.
+4. **Component-pruning** carried into SQ3/SQ4:
+   - C1: **prune RTAB-Map and ORB-SLAM3** as primary candidates per Fact #16. Carry: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure VO baseline.
+   - C2/C3: **mandatorily benchmark** any DINOv2-based candidate (AnyLoc, BoQ, SASCM-style) against AerialExtreMatch at our pitch / scale / overlap regime AND against Jetson Orin Nano latency budget (per Fact #18). Maintain LightGlue+SuperPoint as the "simple-baseline / known-Jetson-runnable" option per NGPS precedent.
+   - C8 transport: NGPS uses `VISION_POSITION_ESTIMATE`. SQ6 picked `GPS_INPUT`. Re-examine the trade-off in design phase, but SQ6's selection stands for the research draft.
+5. **Lessons from SPRIN-D winner that must propagate to `solution_draft01`**:
+   - "Ability to recover from periods of high uncertainty and re-localize" > "low instantaneous RMSE" — directly informs AC-NEW-2 / AC-NEW-8.
+   - VIO requires mechanically-decoupled IMU; this is a hardware-integration constraint, not a software issue.
+   - Magnetometer is unreliable near steel/concrete; sensor fusion of heading sources is essential.
+   - "No single sensor can be fully relied upon" — directly supports our IMU+camera+sat-tile multi-source posture.
+
+### Open follow-ups (deferred to later sub-questions)
+
+- **(SQ8)** Independent verification of OSCAR's "fully resistant to spoofing/jamming" claim — if available. Otherwise, Twist Robotics's claim remains a vendor-only signal.
+- **(SQ8)** Vantor Raptor and Auterion Visual Navigation's covariance reporting behaviour — for benchmarking AC-NEW-4 compliance.
+- **(SQ3+SQ4 / C2)** AnyLoc / BoQ / DINOv2-VLAD / MixVPR / EigenPlaces / NetVLAD on AerialExtreMatch for cross-source aerial — already in C2 search plan; SQ1 just confirmed they're the right candidate set.
+- **(SQ3+SQ4 / C3)** LightGlue / LoFTR / RoMa / DKM / MASt3R + classical SIFT+RANSAC + XFeat on AerialExtreMatch — already in C3 search plan; SQ1 confirms shape.
+- **(SQ7)** AerialExtreMatch + AerialVL + CS-UAV + RealUAV/SAVL + UAV-VisLoc as the dataset shortlist for our cross-validation — confirmed by SQ1 hits.
+
+### Boundary check: SQ1 is saturated
+
+Saturation signals observed: all 5 perspectives saturated, ≥3 high-confidence facts per perspective, last 3 search rounds (Anduril Iris detail probe, ArduPilot prior-art probe, DSMAC lineage probe) yielded only one new substantive datapoint (NGPS) and confirmed already-known patterns. No unresolved contradictions. Per `references/source-tiering.md` "Search saturation rule" → SQ1 is closed.
+
+---
+
+## SQ2 — Canonical pipeline decomposition (sanity-check)
+
+### Fact #21 — The canonical pipeline for offline-cache visual geo-localization is two-stage: global VPR retrieval, then local alignment (image matching → pose)
+- **Statement**: Source #38 (Skoltech aerial-VPR survey) defines the field's canonical pipeline verbatim: "Visual geolocalization can be implemented through various methods, typically relying on a pre-built database of images with known locations. This approach generally involves two stages: global localization (or Visual Place Recognition, VPR) and local alignment. Global localization involves identifying the nearest frame from the database (Image Retrieval), while local alignment determines the precise position using the selected frame." Source #42 (NUDT 2026 absolute-VL survey) names the same shape "**retrieval → matching → pose-estimation hierarchical framework**" and explicitly contrasts it against three rejected alternatives: (a) relative-only VIO/SLAM (cumulative error), (b) end-to-end direct localization (poor generalization), (c) map-free localization (scene-dependent). Source #39 (U.Maine cross-view survey) traces the same lineage from 2003 pixel-wise template-matching → 2013 hand-engineered features → 2017 CNN/triplet-loss → 2018+ Siamese/GAN → 2022+ Transformer → 2023 DINOv2-class. Source #41 (AnyVisLoc benchmark) implements this hierarchy as: image retrieval (rough) → image matching (2D-2D) → DSM-lift to 3D → PnP+RANSAC, with **Top-N re-rank by inlier count** as a critical fourth stage between matching and pose.
+- **Source**: Source #38, Source #39, Source #41, Source #42
+- **Phase**: Phase 2
+- **Target Audience**: Architects of `solution_draft01`
+- **Confidence**: ✅ (four independent surveys/benchmarks converge)
+- **Related Dimension**: SQ2, C2 (VPR), C3 (cross-domain matching), C4 (pose estimation)
+- **Fit Impact**: **confirms** the project's C1–C10 decomposition is canonical for the **C2 → C3 → C4** chain. The component split is not novel; the project's contribution is the **integration discipline** (covariance honesty AC-NEW-4, source-label contract AC-1.4, offline-cache safety AC-NEW-7) layered on top. **Augment** the existing decomposition with an explicit "Top-N re-rank by inlier count" stage between C3 and C4 (currently implicit).
+
+### Fact #22 — AdHoP (Adaptive Homography Preconditioning) is a method-agnostic post-matching refinement loop that improves matching by ~30% and reduces translation/rotation error by ~20% on average, with up to 63% translation-error reduction for previously-underperforming methods, at the cost of a second matching pass
+- **Statement**: Source #40 (OrthoLoC benchmark, Sep 2025): from initial 2D-2D query↔orthophoto correspondences, estimate a homography H via DLT+RANSAC, warp the orthophoto with H to better match the query's perspective (reducing residual perspective gap), re-match in this warped frame, then map the new correspondences back to the original orthophoto via H⁻¹, lift to 3D using DSM, and run PnP+RANSAC + Levenberg-Marquardt refinement. Accept the AdHoP-refined pose only if reprojection error decreases vs. the non-refined pose. **Quantitative effects** (16,425 images, 47 locations, 1m-1° threshold): GIM+DKM 75.4% recall (best); AdHoP-refined methods see ~30% average matching improvement, ~20% translation/rotation error reduction; for previously-underperforming methods AdHoP yields up to 95% matching improvement (XFeat*) or 63% translation reduction (DKM); for RoMa, AdHoP lifts 1m-1° recall by +23 points (54.6% → 77.6%-class). **Cross-domain regime** (war-zone-equivalent: scene change between query and reference): translation error increases ~3× when only the visual modality differs, ~7× when both visual and structural (DSM) gaps exist (0.16 m → 1.12 m for GIM+DKM+AdHoP). **Method-agnostic** — works on top of any 2D-2D matcher.
+- **Source**: Source #40
+- **Phase**: Phase 2
+- **Target Audience**: System architects + C3/C4 implementers
+- **Confidence**: ✅ for headline numbers (single-paper, but published dataset + open code + reproducible per repo)
+- **Related Dimension**: SQ2 (new sub-stage), C3 (matcher), C4 (pose), SQ5 (cross-domain failure mode)
+- **Fit Impact**: **adds a new sub-stage** between C3 and C4. Decision for `solution_draft01`: include AdHoP-class refinement as an **optional** stage gated on Jetson Orin Nano latency budget — if (single-pass match latency × 2) + homography estimation + reprojection check fits under (400 ms - other-stages), include it; otherwise reserve as offline-replay-time refinement. Cross-domain 3× translation-error penalty is a **direct AC-NEW-4 calibration input** — companion-side covariance must inflate proportionally when scene-change detection (deferred to SQ8) flags a stale tile.
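+
+The gating rule and the accept-only-if-better rule above can be written down as two small checks. This is a sketch under stated assumptions: the 400 ms frame budget and the cost formula come from the text, while `StageLatency`, `adhop_fits_budget`, `choose_pose`, and any latency values passed in are hypothetical names and placeholders, not measured Jetson figures.
+
```python
from dataclasses import dataclass

FRAME_BUDGET_MS = 400.0  # per-frame budget quoted in the Fit Impact


@dataclass
class StageLatency:
    """Per-frame latencies in milliseconds (hypothetical, to be measured on target)."""
    single_pass_match: float
    homography_estimation: float
    reprojection_check: float
    other_stages: float


def adhop_fits_budget(lat: StageLatency, budget_ms: float = FRAME_BUDGET_MS) -> bool:
    """True when (2 x match) + homography + reprojection check fits what is left."""
    adhop_cost = (2.0 * lat.single_pass_match
                  + lat.homography_estimation
                  + lat.reprojection_check)
    return adhop_cost <= budget_ms - lat.other_stages


def choose_pose(base_pose, base_err, refined_pose, refined_err):
    """Keep the AdHoP-refined pose only if its reprojection error decreased."""
    if refined_err < base_err:
        return refined_pose
    return base_pose
```
+
+When `adhop_fits_budget` returns `False`, the text's fallback applies: the refinement is reserved for offline replay instead of the in-flight loop.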
+
+### Fact #23 — 6-DoF aerial-to-satellite localization requires DSM (Digital Surface Model) elevation data; without DSM, the system collapses to 3-DoF (position + 1 rotation) or must compute attitude purely from IMU/VIO
+- **Statement**: Source #40 OrthoLoC explicitly: "Our pipeline matches the query image with the DOP, lifts the matched 2D points in DOP to 3D using the DSM, and then estimates the camera pose using PnP and RANSAC." Without the DSM lift, the matcher produces 2D↔2D correspondences that constrain a homography (which encodes 3-DoF for a planar scene + planar camera) but **not** the full 6-DoF camera pose. Source #41 AnyVisLoc independently confirms by measuring: aerial-photogrammetry map (with paired DSM at 0.94 m/px) achieves 74.1% A@5m; satellite map (with ALOS 30 m DSM) achieves only 18.5% A@5m — a 4× accuracy collapse driven by DSM coarseness. The project's offline cache from the Azaion Suite Satellite Service is currently specified as **2D ortho tiles only** (no DSM commitment in restrictions.md or AC). **Three architectural responses** are available: (a) **3-DoF acceptance** — fix attitude from IMU/VIO, treat the matcher output as a homography-only constraint, ignore DSM; sacrifices the up-to-2× higher accuracy reported when DSM is present, but stays within current cache contract; (b) **Request DSM tiles from the Suite Sat Service** — adds C2 cache schema work + a Suite Sat Service contract change; preserves 6-DoF accuracy; (c) **IMU/VIO-only attitude + 2D-2D matching translation** — same as (a) but explicitly contracts the IMU/VIO module to provide attitude with σ ≤ 5° (per Fact #24); operationally identical to (a), differs only in how the contract is written.
+- **Source**: Source #40, Source #41
+- **Phase**: Phase 2
+- **Target Audience**: System architects + Suite Sat Service stakeholder + AC owner
+- **Confidence**: ✅ for the architectural claim; ✅ for the 4× accuracy collapse number
+- **Related Dimension**: SQ2 (decomposition), C2 (cache schema), C3 (matcher output contract), C4 (pose), C5 (estimator), C6 (IMU/VIO contract), AC-1.1 / AC-1.1.1 (accuracy budget)
+- **Fit Impact**: **architectural decision required, surfaced for user.** The current restrictions.md (no DSM commitment) implicitly forces option (a) or (c). The accuracy budget AC-1.1.1 (≤80 m at 1 km AGL) is loose enough that 3-DoF + IMU-attitude almost certainly satisfies it on a per-frame basis (per Fact #21 and DSMAC-class lineage in Fact #17), but **requires explicit acknowledgement** in the architecture before commitment. **Proposed default** for `solution_draft01`: option (c) — fix attitude from IMU/VIO with documented σ ≤ 5° contract on yaw, σ ≤ 5° on pitch (per Fact #24), translation from 2D-2D matching + camera pose. Flag option (b) as a "Suite Sat Service follow-up" if 6-DoF accuracy ever becomes a hard requirement.
+
+### Fact #24 — IMU-derived yaw and pitch priors with σ ≤ 5° are required for the matching+PnP stack to hit benchmark accuracy; yaw σ = 10° costs ~2% A@5m, σ = 30° ~4%, σ = 50° ~14%, and σ = 60° 25.7%
+- **Statement**: Source #41 AnyVisLoc systematically perturbs yaw and pitch priors and measures localization accuracy collapse. Yaw: σ = 5° → no impact; σ = 10° → −1.9% A@5m; σ = 30° → −4.1%; σ = 50° → −13.7%; σ = 60° → −25.7%. Pitch: σ < 5° → no impact; σ ≥ 7° → 1–5% drops. The benchmark is conducted at low altitude (30–300 m AGL) with 20–90° pitch range; lessons transfer to our 1 km AGL nadir-camera regime in the **direction** but the magnitudes may be lower at 1 km AGL because nadir geometry is less yaw-sensitive than oblique. Conservatively adopting the benchmark numbers gives a hard contract: **IMU/VIO must deliver yaw with σ ≤ 5° and pitch with σ ≤ 5° to the matcher** (1σ, not 95%, since the benchmark is single-σ). Pitch is naturally tighter on a nadir-fixed camera (mechanically constrained); yaw is the binding constraint and is the typical IMU/magnetometer failure mode (per SPRIN-D lesson Fact #15).
+- **Source**: Source #41
+- **Phase**: Phase 2
+- **Target Audience**: System architects + C1 (VIO) implementer + C5 (estimator) implementer
+- **Confidence**: ✅ for the AnyVisLoc numbers; ⚠️ for direct transfer to 1 km AGL nadir regime (magnitudes likely smaller at our altitude/pitch — direction is conservative)
+- **Related Dimension**: SQ2 (sensor-prior contract), C1 (VIO output contract), C5 (estimator), C6 (IMU)
+- **Fit Impact**: **architectural contract** for `solution_draft01`: the C1 module's published contract to the C2/C3 stack is yaw σ ≤ 5° AND pitch σ ≤ 5°. Magnetometer-only yaw is **insufficient** by the SPRIN-D lesson (Fact #15) — VIO must contribute. **Adds a constraint** that flows back to the C6 IMU integration: IMU mechanical isolation per SPRIN-D Fact #15 is required; magnetometer + GPS-yaw startup alignment at the airbase (before take-off, while real GPS is healthy) is part of the boot sequence.
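+
+A small sketch of how the prior contract and the reported yaw sensitivity could be encoded for planning purposes. The five (σ, drop) points are the AnyVisLoc values quoted in the statement; the linear interpolation between them, the clamping outside the measured range, and the function names are assumptions made here for illustration and are not part of the benchmark.
+
```python
from bisect import bisect_right

# (yaw sigma in degrees, reported A@5m drop in percent) from the Fact #24 statement.
YAW_POINTS = [(5.0, 0.0), (10.0, 1.9), (30.0, 4.1), (50.0, 13.7), (60.0, 25.7)]
MAX_PRIOR_SIGMA_DEG = 5.0  # contract: yaw and pitch sigma must not exceed this


def expected_yaw_drop(sigma_deg: float) -> float:
    """Estimate the A@5m drop for a yaw sigma.

    Assumption: linear interpolation between measured points, clamped to the
    first and last measured values outside the 5-60 degree range.
    """
    sigmas = [s for s, _ in YAW_POINTS]
    if sigma_deg <= sigmas[0]:
        return 0.0
    if sigma_deg >= sigmas[-1]:
        return YAW_POINTS[-1][1]
    i = bisect_right(sigmas, sigma_deg)
    (s0, d0), (s1, d1) = YAW_POINTS[i - 1], YAW_POINTS[i]
    return d0 + (d1 - d0) * (sigma_deg - s0) / (s1 - s0)


def meets_prior_contract(yaw_sigma_deg: float, pitch_sigma_deg: float) -> bool:
    """True when both attitude priors satisfy the sigma <= 5 degree contract."""
    return (yaw_sigma_deg <= MAX_PRIOR_SIGMA_DEG
            and pitch_sigma_deg <= MAX_PRIOR_SIGMA_DEG)
```
+
+Because the benchmark was flown at 30–300 m AGL, the interpolated drop is best treated as a conservative upper estimate for the 1 km AGL nadir regime, in line with the confidence note above.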
+
+### Fact #25 — Top-N re-ranking by inlier count is the dominant accuracy/cost trade-off; pure-matching-without-retrieval is catastrophic (A@5m collapses from 62.2% to 34.3% with the same matcher)
+- **Statement**: Source #41 AnyVisLoc and Source #38 Skoltech survey both quantify the value of retrieval as a search-space reducer for matching. Source #41 explicitly: "Top-N re-rank by inlier count is the best accuracy/cost trade-off" → 62.2% A@5m at 0.8 s/frame on RTX 3090. **Without retrieval** (pure exhaustive matching against the cache): 34.3% A@5m, i.e., roughly **55%** of the re-ranked accuracy, at infeasible compute. Source #38 measures sparse-VPR re-ranking specifically: AnyLoc descriptor + SuperGlue re-rank on top-100 candidates = 15–25 s/frame on RTX 3090 (catastrophic for our 400 ms budget); LightGlue re-rank ≈ 1 s/frame (still over budget); SelaVPR re-rank < 0.1 s/frame (in-budget on RTX 3090, must be re-tested on Jetson Orin Nano). **Re-ranking budget** = (frame budget) − (descriptor extraction) − (initial top-N retrieval) − (matcher pose estimation) − (AdHoP if included).
+- **Source**: Source #38, Source #41
+- **Phase**: Phase 2
+- **Target Audience**: System architects + C2 implementer
+- **Confidence**: ✅ (two-source convergence on the qualitative claim; quantitative numbers are RTX-3090-specific and must be Jetson-MVE'd)
+- **Related Dimension**: SQ2 (pipeline structure), C2 (VPR), C3 (matcher), SQ3+SQ4 (Jetson MVE)
+- **Fit Impact**: **mandates** Top-N re-rank by inlier count as a stage in `solution_draft01`. The choice of N (typical N=5–20 in the literature) is deferred to the SQ3+SQ4 candidate matrix; it is not an SQ2 decision.
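+
+The re-ranking budget formula and the re-rank step itself are simple enough to sketch. In this illustration `count_inliers` stands in for whatever matcher + RANSAC stage is eventually selected; it, the tile identifiers, and both function names are hypothetical, and the candidates are assumed to arrive already ordered by retrieval score.
+
```python
from typing import Callable, List, Sequence, Tuple


def rerank_budget_ms(frame_ms: float, descriptor_ms: float, retrieval_ms: float,
                     pose_ms: float, adhop_ms: float = 0.0) -> float:
    """Time left for re-ranking after the other per-frame stages are paid for."""
    return frame_ms - descriptor_ms - retrieval_ms - pose_ms - adhop_ms


def rerank_by_inliers(candidates: Sequence[str],
                      count_inliers: Callable[[str], int],
                      top_n: int) -> List[Tuple[str, int]]:
    """Match only the top-N retrieved tiles and order them by inlier count.

    candidates: tile ids ordered best-first by the retrieval stage.
    count_inliers: hypothetical callable returning geometric inliers for a tile.
    """
    shortlist = list(candidates)[:top_n]
    scored = [(tile, count_inliers(tile)) for tile in shortlist]
    # Stable sort: ties keep their retrieval order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored
```
+
+The matcher runs `top_n` times per frame, so N trades accuracy against the budget returned by `rerank_budget_ms`; that is why N is treated as a sweep target instead of a fixed constant.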
+
+### Fact #26 — High-accuracy SOTA models (AnyLoc + SuperGlue + RoMa-class) are NOT viable on Jetson Orin Nano under the 400 ms p95 budget; lightweight VPR (MixVPR / SALAD / SelaVPR-class) + lightweight matchers (LightGlue / XFeat-class) are the only candidates that survive a basic latency pre-screen
+- **Statement**: Two independent runtime measurements on RTX 3090 (≥10× faster than Jetson Orin Nano in dense matrix ops): Source #38 — AnyLoc descriptor calculation 0.37–0.84 s/frame (huge ViT-G DINOv2); SuperGlue re-rank 15–25 s/frame on top-100; LightGlue re-rank ~1 s/frame; SelaVPR re-rank < 0.1 s/frame. Source #41 — RoMa dense matcher 659 ms/frame; SP+LightGlue+GIM sparse 105 ms/frame; ratio = 6.3×. **Memory**: AnyLoc descriptors = 2.3–13.9 GB for 4–7k tiles (the upper end exceeds the 8 GB Jetson Orin Nano envelope even before model weights are loaded); SelaVPR descriptors < 0.2 GB. Pre-screen conclusion: AnyLoc / SuperGlue / RoMa-class are **disqualified** on the Jetson Orin Nano at 3 fps unless heavy quantization (INT8) reduces them ≥10×, which is not yet established for our latency target on this hardware. Surviving candidates from the literature: **VPR**: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD-class; **matchers**: LightGlue, XFeat, XFeat*, SP+LightGlue. **Disqualification is preliminary** — final go/no-go happens at SQ3+SQ4 with on-Jetson MVE per `references/mode-A-mve-rules.md`.
+- **Source**: Source #38, Source #41
+- **Phase**: Phase 2
+- **Target Audience**: C2 + C3 implementer; SQ3+SQ4 candidate-matrix author
+- **Confidence**: ✅ for RTX-3090 numbers; ⚠️ for direct Jetson translation (Jetson Orin Nano AI score is well-published; ratio is conservative)
+- **Related Dimension**: SQ2 (Jetson budget feasibility), SQ3+SQ4 (candidate pre-screen), SQ5 (foundation-model-on-edge failure mode), C2, C3, C7 (Jetson runtime)
+- **Fit Impact**: **prunes the SQ3+SQ4 candidate matrix BEFORE expensive Jetson MVE.** Candidates entering SQ3+SQ4 with mandatory Jetson MVE: (C2 VPR) MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD; (C3 matcher) LightGlue, XFeat, XFeat*, SP+LightGlue. Candidates that need Jetson INT8 quant before they earn an MVE slot: AnyLoc, BoQ, DINOv2-VLAD (must demonstrate INT8 build path with vendor-validated accuracy preservation). Candidates pruned outright: RoMa dense, SuperGlue, MASt3R (latency).
+
+### Fact #27 — A 20% covisibility floor between query frame and reference tile is required for localization to succeed; below it, ALL methods fail regardless of matcher quality
+- **Statement**: Source #40 OrthoLoC: "When the covisibility between the UAV image and the orthographic geodata is too small (less than ~20%), the localization fails for all methods regardless of matcher quality." This is a geometric floor, not a method-specific limit. The implication for the project: any tile-cache design that allows a query to fall outside 20% covisibility with the **best available** cached tile must also include a **runtime covisibility-check + graceful degrade** to `visual_propagated` mode (per AC-1.4 source label). This is a runtime condition, not a one-time setup parameter.
+- **Source**: Source #40
+- **Phase**: Phase 2
+- **Target Audience**: C2 (cache scheduler) + C5 (estimator) + AC-1.4 owner
+- **Confidence**: ✅
+- **Related Dimension**: SQ2 (boundary condition), C2 (tile cache), C5 (estimator state machine), AC-1.4
+- **Fit Impact**: **adds a runtime invariant** to `solution_draft01`: tile selection must guarantee ≥20% covisibility OR explicitly emit the `visual_propagated` source label per AC-1.4 with covariance widened per AC-NEW-4. This becomes a hard constraint on the C2 cache schema (must support tile-extent metadata) and a runtime check before invoking C3 matcher.
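+
+One way the runtime covisibility check could look, as a rough sketch. It assumes axis-aligned ground footprints in a local east/north frame and defines covisibility as the share of the query footprint covered by a tile; both are simplifications chosen here, since the source does not specify how covisibility is computed. `Box`, `covisibility`, and `allowed_source_label` are hypothetical names; the two labels and the 20% floor come from the text.
+
```python
from typing import Iterable, NamedTuple

COVISIBILITY_FLOOR = 0.20  # geometric floor reported in Fact #27


class Box(NamedTuple):
    """Axis-aligned ground extent in metres (east/north); a simplifying assumption."""
    min_e: float
    min_n: float
    max_e: float
    max_n: float

    def area(self) -> float:
        width = max(0.0, self.max_e - self.min_e)
        height = max(0.0, self.max_n - self.min_n)
        return width * height


def covisibility(query: Box, tile: Box) -> float:
    """Fraction of the query footprint that lies inside the tile extent."""
    overlap = Box(max(query.min_e, tile.min_e), max(query.min_n, tile.min_n),
                  min(query.max_e, tile.max_e), min(query.max_n, tile.max_n))
    query_area = query.area()
    return overlap.area() / query_area if query_area > 0 else 0.0


def allowed_source_label(query: Box, tiles: Iterable[Box]) -> str:
    """Label the estimator may attempt: matching is only tried above the floor."""
    best = max((covisibility(query, t) for t in tiles), default=0.0)
    if best >= COVISIBILITY_FLOOR:
        return "satellite_anchored"
    return "visual_propagated"
```
+
+Passing the floor only permits a match attempt; whether the frame ends up labelled `satellite_anchored` would still depend on the matcher succeeding. The check needs each cached tile's ground extent, which is the tile-extent metadata requirement noted above.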
+
+---
+
+## SQ2 — Conclusions (working summary, will be re-checked at Step 7.5)
+
+### Pipeline-component coverage table (existing C1–C10 vs. survey-listed components)
+
+| Survey/benchmark canonical stage | Project component (current) | Coverage status | Required action |
+|---|---|---|---|
+| Image retrieval (global VPR) | **C2 — Visual Place Recognition** | ✅ covered | No change |
+| Re-ranking (top-N inlier-based) | (currently implicit, inside C2 or C3) | ⚠️ implicit | **Promote to explicit sub-stage** (`C2.5` or `C3.0`) in `solution_draft01` |
+| Local image matching (2D-2D, sparse or dense) | **C3 — Cross-domain registration** | ✅ covered | Add Top-N re-rank-by-inlier-count requirement |
+| AdHoP-style perspective preconditioning | (not represented) | ❌ missing | **Add as optional sub-stage** between C3 and C4, gated on Jetson latency budget |
+| 2D-3D lift via DSM | (not represented; current cache is 2D ortho only) | ❌ architectural decision required | **Decision required from user** — see below |
+| Pose estimation (PnP + RANSAC + LM) | **C4 — Pose estimation** | ✅ covered | No change |
+| State estimator / fusion (UKF / ESKF / MSCKF / factor graph) | **C5 — Estimator / fusion** | ✅ covered | Augmented with covariance-honesty contract from AC-NEW-4 |
+| IMU + VIO contract | **C1 — VO/VIO** + **C6 — IMU integration** | ✅ covered | Add yaw σ ≤ 5°, pitch σ ≤ 5° hard contract from Fact #24 |
+| Tile cache + scheduler | **C2 — VPR tile cache** + **C9 — Cache hygiene** | ✅ covered | Add 20% covisibility runtime invariant (Fact #27) |
+| Anti-spoof / source-switch | **C7 — Spoof detection** + **C8 — FC adapter** | ✅ covered | Already addressed in SQ6 |
+| Health monitoring / safety | **C10 — Safety / health monitoring** | ✅ covered | Already addressed |
+
+### Architectural decisions surfaced (require user resolution before SQ3+SQ4 starts)
+
+1. **DSM dependency on the Suite Sat Service tile cache** (per Fact #23). Three options:
+   - **(a) 3-DoF acceptance** — accept that without DSM, only position is recovered from matching; attitude is fixed by IMU/VIO with no satellite-tile cross-check. Lowest project scope. Requires AC budget verification (likely passes AC-1.1.1).
+   - **(b) Request DSM tiles** — Suite Sat Service contract change. Highest accuracy. Adds ~1 cycle to delivery. Recommended if 6-DoF accuracy ever becomes a hard AC.
+   - **(c) IMU/VIO-attitude + 2D-2D matching translation** — operationally identical to (a) but contracts the IMU/VIO module explicitly with σ ≤ 5° yaw / pitch (Fact #24).
+   - **Recommended default**: **(c)** — explicit IMU/VIO contract; fall back to (b) if AC tightens.
+
+2. **AdHoP refinement loop** (per Fact #22). Three options:
+   - **(a) Always-on** — included in every frame; Jetson budget must accommodate 2× matching latency.
+   - **(b) Conditional** — only when initial reprojection error exceeds a threshold; gated on per-frame budget.
+   - **(c) Off (initial release)** — relegate to offline-replay refinement.
+   - **Recommended default**: **(b) Conditional** — fits within latency variance budget while capturing the cross-domain accuracy gain.
+
+3. **Top-N re-rank promotion to explicit pipeline sub-stage** (per Fact #25). Recommendation: promote to a named sub-stage in `solution_draft01` with N as an SQ3+SQ4 hyperparameter sweep target.
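The Top-N re-rank sub-stage can be sketched as follows. This is a minimal illustration, not the pipeline's implementation: `rerank_by_inliers`, the tile IDs, the retrieval scores, and the inlier counts are all hypothetical, and `count_inliers` stands in for whatever C3 matcher plus RANSAC step is eventually chosen.

```python
from typing import Callable, List, Tuple


def rerank_by_inliers(
    candidates: List[Tuple[str, float]],
    count_inliers: Callable[[str], int],
    top_n: int,
) -> List[Tuple[str, int]]:
    """Keep the top_n candidates by retrieval score, then re-order them by inlier count.

    candidates: (tile_id, retrieval_score) pairs from the global VPR stage.
    count_inliers: hypothetical callback that runs local matching against a tile
                   and returns the number of geometric inliers.
    """
    shortlist = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_n]
    scored = [(tile_id, count_inliers(tile_id)) for tile_id, _ in shortlist]
    return sorted(scored, key=lambda s: s[1], reverse=True)


# Made-up numbers: tile_d has many inliers but never enters the shortlist.
fake_inliers = {"tile_a": 12, "tile_b": 85, "tile_c": 40, "tile_d": 300}
candidates = [("tile_a", 0.91), ("tile_b", 0.88), ("tile_c", 0.80), ("tile_d", 0.10)]
ranked = rerank_by_inliers(candidates, fake_inliers.__getitem__, top_n=3)
print(ranked)
```

The example also shows why N is worth sweeping: a correct tile that falls outside the shortlist cannot be recovered by re-ranking, while a larger N multiplies the matching cost per frame.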
+
+### Component-pruning carried into SQ3+SQ4
+
+- **C2 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD.
+- **C2 candidates entering SQ3+SQ4 conditional on INT8 quantization path**: AnyLoc, BoQ, DINOv2-VLAD.
+- **C2 candidates pruned**: SuperGlue-as-reranker (latency).
+- **C3 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: LightGlue, XFeat, XFeat*, SP+LightGlue (NGPS template).
+- **C3 candidates pruned**: RoMa, MASt3R, DKM (dense matcher latency on Jetson).
+- **C3 candidates as "AerialExtreMatch reference points" only, NOT for production**: GIM+DKM, GIM+LightGlue (per Source #40, used as accuracy benchmark only).
+
+### Boundary check: SQ2 is saturated
+
+Saturation signals observed: (a) five independent surveys/benchmarks (Skoltech aerial-VPR survey, U.Maine cross-view survey, OrthoLoC benchmark, AnyVisLoc benchmark, NUDT 2026 absolute-VL survey) converge on the **same** "retrieval → matching → pose-estimation hierarchical framework" as canonical; (b) two independent runtime sources (Skoltech survey on RTX 3090; AnyVisLoc on RTX 3090 with explicit dense-vs-sparse breakdown) agree on the relative cost ordering of model classes; (c) AdHoP value is supported by Source #40 only, but with reproducible code and dataset (single-source-but-strong evidence); (d) cross-source agreement on covisibility / sensor-prior thresholds. Two outstanding decisions are flagged for the user: neither blocks SQ2's saturation status, but both block the SQ3+SQ4 start. Per `references/source-tiering.md` "Search saturation rule" → SQ2 is closed pending user decisions on DSM dependency + AdHoP gating.
@@ -1,91 +0,0 @@
-# Question Decomposition — Solution Assessment (Mode B, Draft02)
-
-## Original Question
-Assess solution_draft02.md for weak points, security vulnerabilities, and performance bottlenecks, then produce a revised solution draft03.
-
-## Active Mode
-Mode B: Solution Assessment — `solution_draft02.md` is the highest-numbered draft.
-
-## Question Type Classification
- **Primary**: Problem Diagnosis — identify weak points, vulnerabilities, bottlenecks in draft02
- **Secondary**: Decision Support — evaluate alternatives for identified issues
-
-## Research Subject Boundary Definition
-
-| Dimension | Boundary |
-|-----------|----------|
-| **Domain** | GPS-denied UAV visual navigation, aerial geo-referencing |
-| **Geography** | Eastern/southern Ukraine (left of Dnipro River) — steppe terrain |
-| **Hardware** | Desktop/laptop with NVIDIA RTX 2060+, 16GB RAM, 6GB VRAM |
-| **Software** | Python ecosystem, GPU-accelerated CV/ML |
-| **Timeframe** | Current state-of-the-art (2024-2026), production-ready tools |
-| **Scale** | 500-3000 images per flight, up to 6252×4168 resolution |
-
-## Problem Context Summary
- UAV aerial photos taken consecutively ~100m apart, downward non-stabilized camera
- Only starting GPS known — must determine GPS for all subsequent images
- Must handle: sharp turns, outlier photos (up to 350m gap), disconnected route segments
- Processing <5s/image, real-time SSE streaming, REST API service
- No IMU data available
- Camera: 26MP (6252×4168), 25mm focal length, 23.5mm sensor width, 400m altitude
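The camera parameters above fix the ground sample distance (GSD) and the image footprint. A minimal sketch, assuming a pinhole model, a nadir view over flat terrain, and square pixels (the function name is illustrative, and a non-stabilized camera will deviate from the nadir assumption):

```python
def ground_sample_distance(altitude_m, sensor_width_mm, focal_length_mm, image_width_px):
    """Nadir GSD in metres per pixel for a pinhole camera over flat ground."""
    return altitude_m * sensor_width_mm / (focal_length_mm * image_width_px)


IMAGE_WIDTH_PX = 6252
IMAGE_HEIGHT_PX = 4168

gsd = ground_sample_distance(
    altitude_m=400.0,
    sensor_width_mm=23.5,
    focal_length_mm=25.0,
    image_width_px=IMAGE_WIDTH_PX,
)
footprint_w = gsd * IMAGE_WIDTH_PX   # ground width covered by one frame, metres
footprint_h = gsd * IMAGE_HEIGHT_PX  # ground height covered by one frame, metres

print(f"GSD: {gsd * 100:.2f} cm/px, footprint: {footprint_w:.0f} m x {footprint_h:.0f} m")
```

Under these assumptions the GSD is roughly 6 cm/px and each frame covers roughly 376 m × 251 m, so photos taken ~100 m apart overlap substantially, while a 350 m gap leaves little or no overlap depending on flight direction.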
-
-## Decomposed Sub-Questions
-
-### A: DINOv2 Cross-View Retrieval Viability
-"Is DINOv2 proven for UAV-to-satellite coarse retrieval? What are real-world performance numbers? What search radius is realistic?"
-
-### B: XFeat Reliability for Aerial VO
-"Is XFeat proven for aerial visual odometry? How does it compare to SuperPoint in aerial scenes specifically? What are known failure modes?"
-
-### C: LightGlue ONNX on RTX 2060 (Turing)
-"Does LightGlue-ONNX work reliably on Turing architecture? What precision (FP16/FP32)? What are actual benchmarks?"
-
-### D: GTSAM iSAM2 Factor Graph Design
-"Is the proposed factor graph structure sound? Are the noise models appropriate? Are custom factors (DEM, drift limit) well-specified?"
-
-### E: Copernicus DEM Integration
-"How is Copernicus DEM accessed programmatically? Is it truly free? What are the actual API requirements?"
-
-### F: Homography Decomposition Robustness
-"How reliable is cv2.decomposeHomographyMat selection heuristic when UAV changes direction? What are failure modes?"
-
-### G: Image Rotation Handling Completeness
-"Is heading-based rotation normalization sufficient? What if heading estimate is wrong early in a segment?"
-
-### H: Memory Model Under Load
-"Can DINOv2 embeddings + SuperPoint features + GTSAM factor graph + satellite cache fit within 16GB RAM and 6GB VRAM during a 3000-image flight?"
-
-### I: Satellite Match Failure Cascading
-"What happens when satellite matching fails for 50+ consecutive frames? How does the 100m drift limit interact with extended VO-only sections?"
-
-### J: Multi-Provider Tile Schema Compatibility
-"Do Google Maps and Mapbox use the same tile coordinate system? What are the practical differences in switching providers?"
-
-### K: Security Attack Surface
-"What are the remaining security vulnerabilities beyond JWT auth? SSE connection abuse? Image processing exploits?"
-
-### L: Recent Advances (2025-2026)
-"Are there newer models or approaches published since draft02 that could improve accuracy or performance?"
-
-### M: End-to-End Processing Time Budget
-"Is the total per-frame time budget realistic when all components run together? What is the critical path?"
-
---
-
-## Timeliness Sensitivity Assessment
-
- **Research Topic**: GPS-denied UAV visual navigation — assessment of solution_draft02 architecture and component choices
- **Sensitivity Level**: 🟠 High
- **Rationale**: CV feature matching models (SuperPoint, LightGlue, XFeat, DINOv2) evolve rapidly with new versions and competitors. GTSAM is stable. Satellite tile API pricing/limits change. Core algorithms (homography, VO) are stable.
- **Source Time Window**: 12 months (2025-2026)
- **Priority official sources to consult**:
-  1. GTSAM official documentation and PyPI (factor type compatibility)
-  2. LightGlue-ONNX GitHub (Turing GPU compatibility)
-  3. Google Maps Tiles API documentation (pricing, session tokens)
-  4. DINOv2 official repo (model variants, VRAM)
-  5. faiss wiki (GPU memory allocation)
- **Key version information to verify**:
-  - GTSAM: 4.2 stable, 4.3 alpha (breaking changes)
-  - LightGlue-ONNX: FP16 on Turing, FP8 requires Ada Lovelace
-  - Pillow: ≥11.3.0 required (CVE-2025-48379)
-  - FastAPI: ≥0.135.0 (SSE support)
@@ -1,212 +0,0 @@
-# Source Registry — Draft02 Assessment
-
-## Source #1
- **Title**: GTSAM GPSFactor Class Reference
- **Link**: https://gtsam.org/doxygen/a04084.html
- **Tier**: L1
- **Publication Date**: 2025 (latest docs)
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: GTSAM 4.2
- **Target Audience**: GTSAM users building factor graphs with GPS constraints
- **Research Boundary Match**: ✅ Full match
- **Summary**: GPSFactor and GPSFactor2 work with Pose3/NavState, NOT Pose2. For 2D position constraints, PriorFactorPoint2 or custom factors are needed.
- **Related Sub-question**: D (GTSAM iSAM2 Factor Graph Design)
-
-## Source #2
- **Title**: GTSAM Pose2 SLAM Example
- **Link**: https://gtbook.github.io/gtsam-examples/Pose2SLAMExample.html
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: GTSAM 4.2
- **Target Audience**: GTSAM users
- **Research Boundary Match**: ✅ Full match
- **Summary**: BetweenFactorPose2 provides odometry constraints with noise model Diagonal.Sigmas(Point3(sigma_x, sigma_y, sigma_theta)). PriorFactorPose2 anchors poses.
- **Related Sub-question**: D
-
-## Source #3
- **Title**: GTSAM Python pip install version compatibility (PyPI)
- **Link**: https://pypi.org/project/gtsam/
- **Tier**: L1
- **Publication Date**: 2026-01
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: gtsam 4.2 (stable), gtsam-develop 4.3a1 (alpha)
- **Target Audience**: Python developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: GTSAM 4.2 stable on pip. 4.3 alpha has breaking changes (C++17, Boost removal). Known issues with Eigen 5.0.0, ARM64 builds. Stick with 4.2 for production.
- **Related Sub-question**: D
-
-## Source #4
- **Title**: LightGlue-ONNX repository
- **Link**: https://github.com/fabio-sim/LightGlue-ONNX
- **Tier**: L1
- **Publication Date**: 2026-01 (last updated)
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: LightGlue-ONNX (supports ONNX Runtime + TensorRT)
- **Target Audience**: Computer vision developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: ONNX export with 2-4x speedup over PyTorch. FP16 works on Turing (RTX 2060). FP8 requires Ada Lovelace/Hopper. Mixed precision supported since July 2023.
- **Related Sub-question**: C (LightGlue ONNX on RTX 2060)
-
-## Source #5
- **Title**: LightGlue rotation issue #64
- **Link**: https://github.com/cvg/LightGlue/issues/64
- **Tier**: L4
- **Publication Date**: 2023
- **Timeliness Status**: ✅ Currently valid (issue still open)
- **Target Audience**: LightGlue users
- **Research Boundary Match**: ✅ Full match
- **Summary**: SuperPoint+LightGlue not rotation-invariant. Fails at 90°/180°. Workaround: try rotating images by {0°, 90°, 180°, 270°}. Steerable CNNs proposed but not available.
- **Related Sub-question**: G (Image Rotation Handling)
-
-## Source #6
- **Title**: SIFT+LightGlue for UAV Image Mosaicking (ISPRS 2025)
- **Link**: https://isprs-archives.copernicus.org/articles/XLVIII-2-W11-2025/169/2025/
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Summary**: SIFT+LightGlue hybrid achieves robust matching in low-texture and high-rotation UAV scenarios. Outperforms both pure SIFT and SuperPoint+LightGlue.
- **Related Sub-question**: G
-
-## Source #7
- **Title**: DINOv2-Based UAV Visual Self-Localization
- **Link**: https://ui.adsabs.harvard.edu/abs/2025IRAL...10.2080Y/abstract
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV localization researchers
- **Research Boundary Match**: ✅ Full match
- **Summary**: DINOv2 with adaptive enhancement achieves 86.27 R@1 on DenseUAV benchmark for UAV-to-satellite matching. Proves DINOv2 viable for coarse retrieval.
- **Related Sub-question**: A (DINOv2 Cross-View Retrieval)
-
-## Source #8
- **Title**: SatLoc-Fusion (Remote Sensing 2025)
- **Link**: https://www.mdpi.com/2072-4292/17/17/3048
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV navigation researchers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Hierarchical: DINOv2 absolute + XFeat VO + optical flow. <15m error, >2Hz on 6 TFLOPS edge. Adaptive confidence-based fusion. Validates our approach architecture.
- **Related Sub-question**: A, B
-
-## Source #9
- **Title**: XFeat: Accelerated Features (CVPR 2024)
- **Link**: https://arxiv.org/abs/2404.19174
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: CV researchers
- **Research Boundary Match**: ✅ Full match
- **Summary**: 5x faster than SuperPoint. Real-time on CPU. Semi-dense matching. XFeat has built-in matcher for fast VO; also compatible with LightGlue via xfeat-lightglue models.
- **Related Sub-question**: B (XFeat Reliability)
-
-## Source #10
- **Title**: XFeat + LightGlue compatibility (GitHub issue #128)
- **Link**: https://github.com/cvg/LightGlue/issues/128
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: LightGlue/XFeat users
- **Research Boundary Match**: ✅ Full match
- **Summary**: XFeat-LightGlue trained models available on HuggingFace (vismatch/xfeat-lightglue). Also ONNX export available. XFeat's built-in matcher is separate.
- **Related Sub-question**: B
-
-## Source #11
- **Title**: DINOv2 VRAM usage by model variant
- **Link**: https://blog.iamfax.com/tech/image-processing/dinov2/
- **Tier**: L3
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Developers deploying DINOv2
- **Research Boundary Match**: ✅ Full match
- **Summary**: ViT-S/14: ~300MB VRAM, 0.05s/img. ViT-B/14: ~600MB, 0.1s/img. ViT-L/14: ~1.5GB, 0.35s/img. ViT-G/14: ~5GB, 2s/img.
- **Related Sub-question**: H (Memory Model)
-
-## Source #12
- **Title**: Copernicus DEM on AWS Open Data
- **Link**: https://registry.opendata.aws/copernicus-dem/
- **Tier**: L1
- **Publication Date**: Ongoing
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Developers needing DEM data
- **Research Boundary Match**: ✅ Full match
- **Summary**: Free access via S3 without authentication. Cloud Optimized GeoTIFFs, 1x1 degree tiles, 30m resolution. `aws s3 ls --no-sign-request s3://copernicus-dem-30m/`
- **Related Sub-question**: E (Copernicus DEM)
-
-## Source #13
- **Title**: Google Maps Tiles API Usage and Billing
- **Link**: https://developers.google.com/maps/documentation/tile/usage-and-billing
- **Tier**: L1
- **Publication Date**: 2026-02 (updated)
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Google Maps API consumers
- **Research Boundary Match**: ✅ Full match
- **Summary**: 100K free requests/month. 6,000/min, 15,000/day rate limits. $200 monthly credit expired Feb 2025. Requires session tokens.
- **Related Sub-question**: J
-
-## Source #14
- **Title**: Google Maps vs Mapbox tile schema
- **Link**: https://developers.google.com/maps/documentation/tile/2d-tiles-overview
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Summary**: Both use z/x/y Web Mercator tiles (256px). Compatible coordinate systems. Google requires session tokens; Mapbox requires API tokens. Mapbox global to zoom 16, regional to 21+.
- **Related Sub-question**: J
-
-## Source #15
- **Title**: FastAPI SSE connection cleanup issues (sse-starlette #99)
- **Link**: https://github.com/sysid/sse-starlette/issues/99
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: FastAPI SSE developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Async generators cannot be easily cancelled once awaited. Use EventPublisher pattern with asyncio.Queue for proper cleanup. Prevents shutdown hangs and connection lingering.
- **Related Sub-question**: K (Security/Stability)
-
-## Source #16
- **Title**: OpenCV decomposeHomographyMat issues (#23282)
- **Link**: https://github.com/opencv/opencv/issues/23282
- **Tier**: L4
- **Publication Date**: 2023
- **Timeliness Status**: ✅ Currently valid
- **Summary**: decomposeHomographyMat can return non-orthogonal rotation matrices. Returns 4 solutions. Positive depth constraint needed for disambiguation. Calibration matrix K precision critical.
- **Related Sub-question**: F (Homography Decomposition)
-
-## Source #17
- **Title**: CVE-2025-48379: Pillow Heap Buffer Overflow
- **Link**: https://nvd.nist.gov/vuln/detail/CVE-2025-48379
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: Pillow 11.2.0-11.2.1 affected, fixed in 11.3.0
- **Summary**: Heap buffer overflow in Pillow's image encoding. Requires pinning Pillow ≥11.3.0 and validating image formats.
- **Related Sub-question**: K (Security)
-
-## Source #18
- **Title**: SALAD: Optimal Transport Aggregation for Visual Place Recognition
- **Link**: https://arxiv.org/abs/2311.15937
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Summary**: DINOv2+SALAD outperforms NetVLAD. Single-stage retrieval, no re-ranking. 30min training. Optimal transport aggregation better than raw DINOv2 CLS token for retrieval.
- **Related Sub-question**: A
-
-## Source #19
- **Title**: NaviLoc: Trajectory-Level Visual Localization
- **Link**: https://www.mdpi.com/2504-446X/10/2/97
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Summary**: Treats VPR as noisy measurement, uses trajectory-level optimization. 19.5m MLE, 16x improvement over per-frame VPR. Validates trajectory optimization approach.
- **Related Sub-question**: L (Recent Advances)
-
-## Source #20
- **Title**: FAISS GPU memory management
- **Link**: https://github.com/facebookresearch/faiss/wiki/Faiss-on-the-GPU
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Summary**: GPU faiss allocates ~2GB scratch space by default. On 6GB VRAM RTX 2060, CPU-based faiss recommended. Supports CPU-GPU interop.
- **Related Sub-question**: H (Memory)
@@ -1,142 +0,0 @@
-# Fact Cards — Draft02 Assessment
-
-## Fact #1
- **Statement**: GTSAM `GPSFactor` works with `Pose3` variables, NOT `Pose2`. `GPSFactor2` works with `NavState`. Neither accepts `Pose2`. For 2D position constraints, use `PriorFactorPoint2` or a custom factor.
- **Source**: [Source #1] https://gtsam.org/doxygen/a04084.html, [Source #2]
- **Phase**: Assessment
- **Target Audience**: GPS-denied UAV navigation developers
- **Confidence**: ✅ High (official GTSAM documentation)
- **Related Dimension**: Factor Graph Design
-
-## Fact #2
- **Statement**: GTSAM 4.2 is stable on pip. GTSAM 4.3 alpha has breaking changes (C++17 migration, Boost removal). Known pip dependency resolution issues for Python bindings (Sept 2025). Production should use 4.2.
- **Source**: [Source #3] https://pypi.org/project/gtsam/
- **Phase**: Assessment
- **Confidence**: ✅ High (PyPI official)
- **Related Dimension**: Factor Graph Design
-
-## Fact #3
- **Statement**: LightGlue-ONNX achieves 2-4x speedup over compiled PyTorch. FP16 works on Turing (RTX 2060). FP8 requires Ada Lovelace/Hopper — falls back to higher precision on Turing. Mixed precision supported since July 2023.
- **Source**: [Source #4] https://github.com/fabio-sim/LightGlue-ONNX
- **Phase**: Assessment
- **Confidence**: ✅ High (official repo, benchmarks)
- **Related Dimension**: Processing Performance
-
-## Fact #4
- **Statement**: SuperPoint and LightGlue are NOT rotation-invariant. Performance degrades significantly at 90°/180° rotations. Practical workaround: try matching at {0°, 90°, 180°, 270°} rotations. SIFT+LightGlue hybrid is proven better for high-rotation UAV scenarios (ISPRS 2025).
- **Source**: [Source #5] issue #64, [Source #6] ISPRS 2025
- **Phase**: Assessment
- **Confidence**: ✅ High (confirmed in official issue + peer-reviewed paper)
- **Related Dimension**: Rotation Handling
-
-## Fact #5
- **Statement**: DINOv2 achieves 86.27 R@1 on DenseUAV benchmark for UAV-to-satellite matching (with adaptive enhancement). Raw DINOv2 performance is lower but still viable for coarse retrieval.
- **Source**: [Source #7] https://ui.adsabs.harvard.edu/abs/2025IRAL...10.2080Y/
- **Phase**: Assessment
- **Confidence**: ✅ High (peer-reviewed)
- **Related Dimension**: Satellite Matching
-
-## Fact #6
- **Statement**: SatLoc-Fusion validates the exact architecture pattern: DINOv2 for absolute geo-localization + XFeat for VO + optical flow for velocity. Achieves <15m error at >2Hz on 6 TFLOPS edge hardware. Uses adaptive confidence-based fusion.
- **Source**: [Source #8] https://www.mdpi.com/2072-4292/17/17/3048
- **Phase**: Assessment
- **Confidence**: ✅ High (peer-reviewed, dataset provided)
- **Related Dimension**: Overall Architecture
-
-## Fact #7
- **Statement**: XFeat is 5x faster than SuperPoint. Runs real-time on CPU. Has built-in matcher for fast matching. Also compatible with LightGlue via xfeat-lightglue trained models (HuggingFace vismatch/xfeat-lightglue). ONNX export available.
- **Source**: [Source #9] CVPR 2024, [Source #10] GitHub
- **Phase**: Assessment
- **Confidence**: ✅ High (CVPR paper + working implementations)
- **Related Dimension**: Feature Extraction
-
-## Fact #8
- **Statement**: DINOv2 VRAM: ViT-S/14 ~300MB (0.05s/img), ViT-B/14 ~600MB (0.1s/img), ViT-L/14 ~1.5GB (0.35s/img), as measured on a GTX 1080.
- **Source**: [Source #11] blog benchmark
- **Phase**: Assessment
- **Confidence**: ⚠️ Medium (third-party benchmark, not official)
- **Related Dimension**: Memory Model
-
-## Fact #9
- **Statement**: Copernicus DEM GLO-30 is freely available on AWS S3 without authentication: `s3://copernicus-dem-30m/`. Cloud Optimized GeoTIFFs, 30m resolution, global coverage. Alternative: Sentinel Hub API (requires free registration).
- **Source**: [Source #12] AWS Registry
- **Phase**: Assessment
- **Confidence**: ✅ High (AWS official registry)
- **Related Dimension**: DEM Integration
-
-## Fact #10
- **Statement**: Google Maps Tiles API: 100K free requests/month, 15,000/day, 6,000/min. Requires session tokens (not just API key). $200 monthly credit expired Feb 2025. February 2026: split quota buckets for 2D and Street View tiles.
- **Source**: [Source #13] Google official docs
- **Phase**: Assessment
- **Confidence**: ✅ High (official documentation)
- **Related Dimension**: Satellite Provider
-
-## Fact #11
- **Statement**: Google Maps and Mapbox both use z/x/y Web Mercator tiles (256px). Compatible coordinate systems. Main differences: authentication method (session tokens vs API tokens), max zoom levels (Google: 22, Mapbox: global 16, regional 21+).
- **Source**: [Source #14] Google + Mapbox docs
- **Phase**: Assessment
- **Confidence**: ✅ High (official docs)
- **Related Dimension**: Multi-Provider Cache
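The shared z/x/y scheme can be illustrated with the standard Web Mercator ("slippy map") tile formulas. This is a sketch assuming 256 px tiles and the spherical Mercator projection; the function names are illustrative and not taken from either provider's SDK.

```python
import math

TILE_SIZE_PX = 256
EARTH_RADIUS_M = 6_378_137.0  # WGS84 semi-major axis used by Web Mercator


def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Return the (x, y) index of the tile containing a WGS84 point."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge still map to a valid tile index.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def meters_per_pixel(lat_deg, zoom):
    """Ground resolution of a tile pixel at the given latitude and zoom."""
    equator_res = 2.0 * math.pi * EARTH_RADIUS_M / TILE_SIZE_PX
    return equator_res * math.cos(math.radians(lat_deg)) / (2 ** zoom)


print(latlon_to_tile(0.0, 0.0, 1))
print(round(meters_per_pixel(0.0, 16), 2), round(meters_per_pixel(0.0, 18), 2))
```

At the equator this gives about 2.4 m/px at zoom 16 and about 0.6 m/px at zoom 18; resolution improves by `cos(latitude)` away from the equator, which matters when comparing a tile's scale with the UAV image GSD.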
-
-## Fact #12
- **Statement**: FastAPI SSE async generators cannot be easily cancelled once awaited. Causes shutdown hangs, connection lingering. Solution: EventPublisher pattern with asyncio.Queue for proper lifecycle management.
- **Source**: [Source #15] sse-starlette issue #99
- **Phase**: Assessment
- **Confidence**: ⚠️ Medium (community report, confirmed by library maintainer)
- **Related Dimension**: API Stability
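A queue-based publisher of the kind described in this fact can be sketched with the standard library alone. This is a framework-independent illustration: `EventPublisher` here is a hypothetical class, not the sse-starlette or FastAPI API, and it assumes a single consumer per publisher.

```python
import asyncio


class EventPublisher:
    """Single-consumer SSE publisher backed by an asyncio.Queue (illustrative)."""

    def __init__(self):
        self._queue = asyncio.Queue()

    async def publish(self, data):
        await self._queue.put(data)

    async def close(self):
        # A None sentinel tells the stream to finish instead of waiting forever.
        await self._queue.put(None)

    async def stream(self, heartbeat_s=15.0):
        """Yield SSE-formatted chunks; emit a comment line when idle."""
        while True:
            try:
                item = await asyncio.wait_for(self._queue.get(), timeout=heartbeat_s)
            except asyncio.TimeoutError:
                yield ": heartbeat\n\n"  # SSE comment line, ignored by clients
                continue
            if item is None:
                return
            yield f"data: {item}\n\n"


async def _demo():
    publisher = EventPublisher()
    await publisher.publish("frame 1")
    await publisher.publish("frame 2")
    await publisher.close()
    return [chunk async for chunk in publisher.stream(heartbeat_s=0.05)]


chunks = asyncio.run(_demo())
print(chunks)
```

Because the generator only ever waits on the queue with a timeout, a shutdown path can end it by calling `close()` rather than trying to cancel a pending await from outside.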
-
-## Fact #13
- **Statement**: cv2.decomposeHomographyMat returns up to 4 solutions. Can return non-orthogonal rotation matrices. Disambiguation requires: positive depth constraint + calibration matrix K precision. Not just "motion consistent with previous direction."
- **Source**: [Source #16] OpenCV issue #23282
- **Phase**: Assessment
- **Confidence**: ✅ High (confirmed OpenCV issue)
- **Related Dimension**: VO Robustness
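The disambiguation checks named in this fact can be sketched in plain Python. This is an illustration under assumptions: solutions are taken to be `(R, t, n)` triples as produced by a homography decomposition, `ref_points` are normalized image points `(x, y, 1)` from the reference view, and the positive depth test is written as `n · m > 0`. The helper names are hypothetical.

```python
def is_rotation(R, tol=1e-6):
    """True if R is a proper rotation: R^T R = I and det(R) = +1."""
    for i in range(3):
        for j in range(3):
            dot = sum(R[k][i] * R[k][j] for k in range(3))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    det = (
        R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
        - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
        + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0])
    )
    return abs(det - 1.0) <= tol


def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))


def filter_solutions(solutions, ref_points, tol=1e-6):
    """Keep solutions with a valid rotation and all reference points in front of the plane."""
    kept = []
    for R, t, n in solutions:
        if not is_rotation(R, tol):
            continue
        if all(dot3(n, m) > 0.0 for m in ref_points):
            kept.append((R, t, n))
    return kept


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
skewed = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # not orthogonal
ref_points = [(0.1, -0.2, 1.0), (-0.3, 0.05, 1.0)]
solutions = [
    (identity, (0.0, 0.0, 0.1), (0.0, 0.0, 1.0)),    # valid
    (identity, (0.0, 0.0, -0.1), (0.0, 0.0, -1.0)),  # points behind the plane
    (skewed, (0.0, 0.0, 0.1), (0.0, 0.0, 1.0)),      # invalid rotation
]
kept = filter_solutions(solutions, ref_points)
print(len(kept))
```

These checks typically still leave two candidates, which is where a further criterion such as motion consistency or a plane-normal prior is needed.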
-
-## Fact #14
- **Statement**: Pillow CVE-2025-48379: heap buffer overflow in 11.2.0-11.2.1. Fixed in 11.3.0. Image processing pipeline must pin Pillow ≥11.3.0.
- **Source**: [Source #17] NVD
- **Phase**: Assessment
- **Confidence**: ✅ High (CVE database)
- **Related Dimension**: Security
-
-## Fact #15
- **Statement**: DINOv2+SALAD (optimal transport aggregation) outperforms raw DINOv2 CLS token for visual place recognition. Single-stage, no re-ranking needed. 30-minute training. Better suited for coarse retrieval than raw cosine similarity on CLS tokens.
- **Source**: [Source #18] arXiv
- **Phase**: Assessment
- **Confidence**: ✅ High (peer-reviewed)
- **Related Dimension**: Satellite Matching
-
-## Fact #16
- **Statement**: FAISS GPU allocates ~2GB scratch space by default. On 6GB VRAM RTX 2060, GPU faiss would consume 33% of VRAM just for indexing. CPU-based faiss recommended for this hardware profile.
- **Source**: [Source #20] FAISS wiki
- **Phase**: Assessment
- **Confidence**: ✅ High (official wiki)
- **Related Dimension**: Memory Model
-
-## Fact #17
- **Statement**: NaviLoc achieves 19.5m MLE by treating VPR as noisy measurement and optimizing at trajectory level (not per-frame). 16x improvement over per-frame approaches. Validates trajectory-level optimization concept.
- **Source**: [Source #19]
- **Phase**: Assessment
- **Confidence**: ✅ High (peer-reviewed)
- **Related Dimension**: Optimization Strategy
-
-## Fact #18
- **Statement**: Mapbox satellite imagery for Ukraine: no specific update schedule. Uses Maxar Vivid product. Coverage to zoom 16 globally (~2.5m/px), regional zoom 18+ (~0.6m/px). No guarantee of Ukraine freshness — likely 2+ years old in conflict areas.
- **Source**: [Source #14] Mapbox docs
- **Phase**: Assessment
- **Confidence**: ⚠️ Medium (general Mapbox info, no Ukraine-specific data)
- **Related Dimension**: Satellite Provider
-
-## Fact #19
- **Statement**: For 2D factor graphs, GTSAM uses Pose2 (x, y, theta) with BetweenFactorPose2 for odometry and PriorFactorPose2 for anchoring. Position-only constraints use PriorFactorPoint2. Custom factors via Python callbacks are supported but slower than C++ factors.
- **Source**: [Source #2] GTSAM by Example
- **Phase**: Assessment
- **Confidence**: ✅ High (official GTSAM examples)
- **Related Dimension**: Factor Graph Design
-
-## Fact #20
- **Statement**: XFeat's built-in matcher (match_xfeat) is fastest for VO (~15ms total extraction+matching). xfeat-lightglue is higher quality but slower. For satellite matching where accuracy matters more, SuperPoint+LightGlue remains the better choice.
- **Source**: [Source #9] CVPR 2024, [Source #10]
- **Phase**: Assessment
- **Confidence**: ⚠️ Medium (inference from multiple sources)
- **Related Dimension**: Feature Extraction Strategy
@@ -1,31 +0,0 @@
-# Comparison Framework — Draft02 Assessment
-
-## Selected Framework Type
-Problem Diagnosis + Decision Support
-
-## Selected Dimensions
-1. Factor Graph Design Correctness
-2. VRAM/Memory Budget Feasibility
-3. Rotation Handling Completeness
-4. VO Robustness (Homography Decomposition)
-5. Satellite Matching Reliability
-6. Concurrency & Pipeline Architecture
-7. Security Attack Surface
-8. API/SSE Stability
-9. Provider Integration Completeness
-10. Drift Management Strategy
-
-## Findings Matrix
-
-| Dimension | Draft02 Approach | Weak Point | Severity | Proposed Fix | Factual Basis |
-|-----------|------------------|------------|----------|--------------|---------------|
-| Factor Graph | Pose2 + GPSFactor + custom DEM/drift factors | GPSFactor requires Pose3, not Pose2. Custom Python factors are slow. | **Critical** | Use Pose2 + BetweenFactorPose2 + PriorFactorPoint2 for satellite anchors. Convert lat/lon to local ENU. Avoid Python custom factors. | Fact #1, #2, #19 |
-| VRAM Budget | DINOv2 + SuperPoint + LightGlue ONNX + faiss GPU | No model specified for DINOv2. faiss GPU uses 2GB scratch. Combined VRAM could exceed 6GB. | **High** | Use DINOv2 ViT-S/14 (300MB). faiss on CPU only. Sequential model loading (not concurrent). Explicit budget: XFeat 200MB + DINOv2-S 300MB + SuperPoint 400MB + LightGlue 500MB. | Fact #8, #16 |
-| Rotation Handling | Heading-based rectification + SIFT fallback | No heading at segment start. No trigger criteria for SIFT vs rotation retry. Multi-rotation matching not mentioned. | **High** | At segment start: try 4 rotations {0°, 90°, 180°, 270°}. After heading established: rectify. SIFT fallback when SuperPoint inlier ratio < 0.2. | Fact #4 |
-| Homography Decomposition | Motion consistency selection | Only "motion consistent with previous direction" — underspecified. 4 solutions possible. Non-orthogonal matrices can occur. | **Medium** | Positive depth constraint first. Then normal direction check (plane normal should point up). Then motion consistency. Orthogonality check on R. | Fact #13 |
-| Satellite Coarse Retrieval | DINOv2 + faiss cosine similarity | Raw DINOv2 CLS token suboptimal for retrieval. SALAD aggregation proven better. | **Medium** | Use DINOv2+SALAD or at minimum use patch-level features, not just CLS token. Alternatively, fine-tune DINOv2 on remote sensing. | Fact #5, #15 |
-| Concurrency Model | "Async — don't block VO pipeline" | No concrete concurrency design. GPU can't run two models simultaneously. | **High** | Sequential GPU: XFeat VO first (15ms), then async satellite matching on same GPU. Use asyncio for I/O (tile download, DEM fetch). CPU faiss for retrieval. | Fact #8, #16 |
-| Security | JWT + rate limiting + CORS | No image format validation. No Pillow version pinning. No SSE abuse protection beyond connection limits. No sandbox for image processing. | **Medium** | Pin Pillow ≥11.3.0. Validate image magic bytes. Limit image dimensions before loading. Memory-map large images. CSP headers. | Fact #14 |
-| SSE Stability | FastAPI EventSourceResponse | Async generator cleanup issues on shutdown. No heartbeat. No reconnection strategy. | **Medium** | Use asyncio.Queue-based EventPublisher. Add SSE heartbeat every 15s. Include Last-Event-ID for reconnection. | Fact #12 |
-| Provider Integration | Google Maps + Mapbox + user tiles | Google requires session tokens (not just API key). 15K/day limit = ~7 flights from cache misses. Mapbox Ukraine coverage uncertain. | **Medium** | Implement session token management for Google. Add Bing Maps as third provider. Document DEM+tile download budget per flight. | Fact #10, #11, #18 |
-| Drift Management | 100m cumulative drift limit factor | Custom factor. If satellite fails for 50+ frames, no anchors → drift factor has nothing to constrain against. | **High** | Add dead-reckoning confidence decay: after N frames without anchor, emit warning + request user input. Track estimated drift explicitly. Set hard limit for user input request. | Fact #17 |
@@ -1,192 +0,0 @@
-# Reasoning Chain — Draft02 Assessment
-
-## Dimension 1: Factor Graph Design Correctness
-
-### Fact Confirmation
-According to Fact #1, GTSAM's `GPSFactor` class works exclusively with `Pose3` variables. `GPSFactor2` works with `NavState`. Neither accepts `Pose2`. According to Fact #19, `PriorFactorPoint2` provides 2D position constraints, and `BetweenFactorPose2` provides 2D odometry constraints.
-
-### Problem
-Draft02 specifies `Pose2 (x, y, heading)` variables but lists `GPSFactor` for satellite anchors. This is an API mismatch — the code would fail at runtime.
-
-### Solution
-Two valid approaches:
-1. **Pose2 graph (recommended)**: Use `Pose2` variables + `BetweenFactorPose2` for VO + `PriorFactorPose2` for satellite anchors (constraining full pose when heading is available) or use a custom partial factor that constrains only the position part of Pose2. Convert WGS84 to local ENU coordinates centered on starting GPS.
-2. **Pose3 graph**: Use `Pose3` with fixed altitude. More accurate but adds unnecessary complexity for 2D problem.
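The WGS84-to-local conversion mentioned in option 1 can be sketched with a local tangent-plane approximation. This assumes a spherical Earth and short distances from the origin (a few kilometres), which is a simplification. A production system would more likely use a geodesy library, and the function name here is illustrative.

```python
import math

EARTH_RADIUS_M = 6_378_137.0  # WGS84 semi-major axis, used as a sphere radius here


def wgs84_to_local_en(lat_deg, lon_deg, origin_lat_deg, origin_lon_deg):
    """Approximate east/north offsets in metres from an origin (small-area only)."""
    d_lat = math.radians(lat_deg - origin_lat_deg)
    d_lon = math.radians(lon_deg - origin_lon_deg)
    east = EARTH_RADIUS_M * d_lon * math.cos(math.radians(origin_lat_deg))
    north = EARTH_RADIUS_M * d_lat
    return east, north


# Hypothetical starting GPS fix used as the local origin.
origin = (48.0, 37.0)
east, north = wgs84_to_local_en(48.001, 37.001, *origin)
print(f"east={east:.2f} m, north={north:.2f} m")
```

Only east and north are returned because the Pose2 graph has no altitude variable. The resulting metric coordinates are what a 2D prior or between factor would consume.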
-
-The custom DEM terrain factor and drift limit factor also need reconsideration: Python custom factors invoke a Python callback per optimization step, which is slow. DEM terrain is irrelevant for 2D Pose2 (altitude is not a variable). Drift should be managed by the Segment Manager logic, not as a factor.
-
-### Conclusion
-Switch to Pose2 + BetweenFactorPose2 + PriorFactorPose2 (or partial position prior). Remove DEM terrain factor (handle elevation in GSD calculation outside the graph). Remove drift limit factor (handle in Segment Manager). This simplifies the factor graph and avoids Python callback overhead.
-
-### Confidence
-✅ High — based on official GTSAM documentation
-
---
-
-## Dimension 2: VRAM/Memory Budget Feasibility
-
-### Fact Confirmation
-According to Fact #8: DINOv2 ViT-S/14 ~300MB, ViT-B/14 ~600MB. Fact #16: faiss GPU uses ~2GB scratch. Fact #3: LightGlue ONNX FP16 works on RTX 2060.
-
-### Problem
-Draft02 doesn't specify the DINOv2 model size. The proposed faiss GPU index would consume ~2GB of the 6GB VRAM, and combined with the other models the total could exceed 6GB. Draft02 estimates ~1.5GB per frame for XFeat/SuperPoint + LightGlue, but doesn't account for DINOv2 or faiss.
-
-### Budget Analysis
-Concurrent peak VRAM (worst case):
- XFeat inference: ~200MB
- LightGlue ONNX (FP16): ~500MB
- DINOv2 ViT-B/14: ~600MB
- SuperPoint: ~400MB
- faiss GPU: ~2GB
- ONNX Runtime overhead: ~300MB
- **Total: ~4.0GB** (without faiss GPU: ~2.0GB)
-
-Sequential loading (recommended):
- Step 1: XFeat + XFeat matcher (VO): ~400MB
- Step 2: DINOv2-S (coarse retrieval): ~300MB → unload
- Step 3: SuperPoint + LightGlue ONNX (fine matching): ~900MB
- Peak: ~1.3GB (with model switching)
-
-### Conclusion
-Use DINOv2 ViT-S/14 (300MB, 0.05s/img, fast enough for coarse retrieval). Run faiss on CPU (the embedding vectors are small; CPU search takes <1ms for ~2000 vectors). Run GPU inference sequentially: VO models first, then satellite matching models. The models can stay loaded; only inference is serialized (no concurrent GPU inference).
-
-### Confidence
-✅ High — based on documented VRAM numbers
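
The worst-case total in the budget analysis can be re-derived with a few lines of arithmetic. The sketch below copies the per-component estimates from the list above; the helper name `total_gb` and the 6GB comparison constant are illustrative.

```python
# Per-component VRAM estimates in MB, copied from the concurrent worst case above.
CONCURRENT_MB = {
    "XFeat inference": 200,
    "LightGlue ONNX (FP16)": 500,
    "DINOv2 ViT-B/14": 600,
    "SuperPoint": 400,
    "faiss GPU": 2000,
    "ONNX Runtime overhead": 300,
}

RTX_2060_VRAM_GB = 6.0


def total_gb(components, exclude=()):
    """Sum the listed components in GB, skipping any names in `exclude`."""
    return sum(mb for name, mb in components.items() if name not in exclude) / 1000


worst_case_gb = total_gb(CONCURRENT_MB)
without_faiss_gb = total_gb(CONCURRENT_MB, exclude=("faiss GPU",))
fits = worst_case_gb <= RTX_2060_VRAM_GB
```

Keeping the estimates in one table like this makes it easy to re-check the budget when a model variant changes, for instance swapping ViT-B/14 for ViT-S/14.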
-
---
-
-## Dimension 3: Rotation Handling Completeness
-
-### Fact Confirmation
-According to Fact #4, SuperPoint+LightGlue fail at 90°/180° rotations. According to Fact #6 (ISPRS 2025), SIFT+LightGlue outperforms SuperPoint+LightGlue in high-rotation UAV scenarios.
-
-### Problem
-Draft02 says "estimate heading from VO chain, rectify images before satellite matching, SIFT fallback for rotation-heavy cases." But:
-1. At segment start, there's no heading from VO chain — no rectification possible.
-2. No criteria for when to trigger SIFT fallback.
-3. Multi-rotation matching strategy ({0°, 90°, 180°, 270°}) not mentioned.
-
-### Conclusion
-Three-tier rotation handling:
-1. **Segment start (no heading)**: Try DINOv2 coarse retrieval (more rotation-robust than local features) → if match found, estimate heading from satellite alignment → proceed normally.
-2. **Normal operation (heading available)**: Rectify to approximate north-up using accumulated heading → SuperPoint+LightGlue.
-3. **Match failure fallback**: Try 4 rotations {0°, 90°, 180°, 270°} with SuperPoint. If still fails → SIFT+LightGlue (rotation-invariant).
-
-Trigger for SIFT: SuperPoint inlier ratio < 0.15 after rotation retry.
-
-### Confidence
-✅ High — based on confirmed LightGlue limitation + proven SIFT+LightGlue alternative
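
The fallback order above can be sketched as plain control flow. The two matcher callables are hypothetical stand-ins that return an inlier ratio; reusing the 0.15 threshold as the acceptance bar for every tier, and returning `"user_input"` when everything fails, are assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple

ROTATIONS_DEG = (0, 90, 180, 270)
MIN_INLIER_RATIO = 0.15  # SIFT trigger threshold from the conclusion above


def match_with_fallback(
    image,
    heading_deg: Optional[float],
    superpoint_match: Callable[[object, float], float],
    sift_match: Callable[[object], float],
) -> Tuple[str, Optional[float]]:
    """Return (method, rotation_deg) for the first attempt that passes the threshold."""
    # Heading known: try a single rectified SuperPoint+LightGlue attempt first.
    if heading_deg is not None:
        if superpoint_match(image, heading_deg) >= MIN_INLIER_RATIO:
            return "superpoint", heading_deg
    # No heading, or the rectified attempt failed: retry at four rotations.
    for rotation in ROTATIONS_DEG:
        if superpoint_match(image, rotation) >= MIN_INLIER_RATIO:
            return "superpoint", float(rotation)
    # Rotation-invariant fallback.
    if sift_match(image) >= MIN_INLIER_RATIO:
        return "sift", None
    return "user_input", None
```

The rotation that finally succeeds doubles as a coarse heading estimate, which is how a segment that starts without heading can bootstrap rectification for later frames.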
-
---
-
-## Dimension 4: VO Robustness (Homography Decomposition)
-
-### Fact Confirmation
-According to Fact #13, cv2.decomposeHomographyMat returns up to 4 candidate solutions and can return non-orthogonal rotation matrices. The precision of the calibration matrix K is critical.
-
-### Problem
-Draft02 specifies selection by "motion consistent with previous direction + positive depth." This is underspecified for the first frame pair in a segment (no previous direction). Non-orthogonal R detection is missing.
-
-### Conclusion
-Disambiguation procedure:
-1. Compute all 4 decompositions.
-2. **Filter by positive depth**: triangulate a few matched points, reject solutions where points are behind camera.
-3. **Filter by plane normal**: for downward-looking camera, the normal should approximately point up (positive z component in camera frame).
-4. **Motion consistency**: if previous direction available, prefer solution consistent with expected motion direction.
-5. **Orthogonality check**: verify R'R ≈ I, det(R) ≈ 1. If not, re-orthogonalize via SVD.
-6. For first frame pair: rely on filters 2+3 only.
-
-### Confidence
-✅ High — based on well-documented decomposition ambiguity
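
The orthogonality check in the procedure above (R'R ≈ I, det(R) ≈ 1) is small enough to write without any dependencies. This is a sketch for 3x3 matrices given as nested lists; the tolerance value and function names are assumptions, and the SVD re-orthogonalization step is deliberately not shown.

```python
import math


def is_valid_rotation(R, tol=1e-6):
    """Return True if the 3x3 matrix R satisfies R^T R ≈ I and det(R) ≈ +1."""
    # Entry (i, j) of R^T R is the dot product of columns i and j of R.
    for i in range(3):
        for j in range(3):
            dot = sum(R[k][i] * R[k][j] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    # Cofactor expansion along the first row.
    det = (
        R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
        - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
        + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0])
    )
    return abs(det - 1.0) <= tol


def rot_z(angle_deg):
    """Rotation about the z axis, used here only as a known-good test input."""
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

The determinant test matters separately from the orthogonality test: a reflection such as diag(1, 1, -1) is orthogonal but has determinant -1, so it passes the first check and fails the second.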
-
---
-
-## Dimension 5: Satellite Coarse Retrieval
-
-### Fact Confirmation
-According to Fact #15, DINOv2+SALAD outperforms raw DINOv2 CLS token for retrieval. According to Fact #5, DINOv2 achieves 86.27 R@1 with adaptive enhancement.
-
-### Problem
-Draft02 proposes "DINOv2 global retrieval + faiss cosine similarity." Using raw CLS token is suboptimal. SALAD or patch-level feature aggregation would improve retrieval accuracy.
-
-### Conclusion
-DINOv2+SALAD is the better approach but adds a training/fine-tuning dependency. For a production system without the ability to fine-tune: use DINOv2 patch tokens (not just CLS) with spatial pooling, then cosine similarity via faiss. This captures more spatial information than CLS alone. If time permits, train SALAD head (30 minutes on appropriate dataset).
-
-Alternatively, consider SatDINO (DINOv2 pre-trained on satellite imagery) if available as a checkpoint.
-
-### Confidence
-⚠️ Medium — SALAD is proven but adding training dependency may not be worth the complexity for this use case
-
---
-
-## Dimension 6: Concurrency & Pipeline Architecture
-
-### Fact Confirmation
-Single GPU (RTX 2060) cannot run two models concurrently. Fact #8 shows sequential model inference times. Fact #3 shows LightGlue ONNX at ~50-100ms.
-
-### Problem
-Draft02 says satellite matching is "async — don't block VO pipeline" but on a single GPU, you can't parallelize GPU inference.
-
-### Conclusion
-Pipeline design:
-1. **VO (synchronous, per-frame)**: XFeat extract + match (~30ms total) → homography estimation (~5ms) → GTSAM update (~5ms) → emit position via SSE. **Total: ~40ms per frame.**
-2. **Satellite matching (asynchronous, overlapped with next frame's VO)**: DINOv2 coarse (~50ms) → SuperPoint+LightGlue fine (~150ms) → GTSAM update (~5ms) → emit refined position. **Total: ~205ms but overlapped.**
-3. **I/O (fully async)**: Tile download, DEM fetch, cache management — all via asyncio.
-4. **CPU tasks (parallel)**: faiss search (CPU), homography RANSAC (CPU-bound but fast).
-
-The GPU processes frames sequentially. The "async" part is that satellite matching for frame N happens while VO for frame N+1 proceeds. Since satellite matching (~205ms) is longer than VO (~40ms), the pipeline is satellite-matching-bound but VO results stream immediately.
-
-### Confidence
-✅ High — based on documented inference times
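
One way to read the pipeline above is: VO results are emitted per frame, satellite matching is scheduled as a background task, and a lock keeps GPU inference strictly one-at-a-time. The sketch below uses only `asyncio`; `vo_step` and `satellite_step` are hypothetical coroutines standing in for the real model calls, and the `events` list stands in for the SSE stream.

```python
import asyncio


async def run_pipeline(frame_ids, vo_step, satellite_step):
    """Emit a VO event per frame, then a refined event once satellite matching finishes."""
    gpu = asyncio.Lock()  # serializes GPU inference on the single device
    events = []           # stand-in for the SSE event stream
    pending = []

    async def refine(frame_id):
        async with gpu:
            position = await satellite_step(frame_id)
        events.append(("refined", frame_id, position))

    for frame_id in frame_ids:
        async with gpu:
            position = await vo_step(frame_id)
        events.append(("vo", frame_id, position))
        # Schedule satellite matching without waiting for it to finish.
        pending.append(asyncio.create_task(refine(frame_id)))

    await asyncio.gather(*pending)
    return events
```

With the lock in place the GPU work is interleaved rather than truly parallel, so a VO step can wait behind a satellite match that is already running. The exact interleaving depends on the scheduler, which is why only the per-frame ordering (VO before refinement) should be relied on.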
-
---
-
-## Dimension 7: Security Attack Surface
-
-### Fact Confirmation
-According to Fact #14, Pillow CVE-2025-48379 affects image loading. Fact #12 confirms SSE cleanup issues.
-
-### Problem
-Draft02 has JWT + rate limiting + CORS but misses:
- Image format/magic byte validation before loading
- Pillow version pinning
- Memory-limited image loading (a 100,000 × 100,000 pixel image could OOM)
- SSE heartbeat for connection health
- Directory traversal prevention: depth of checks not specified
-
-### Conclusion
-Additional security measures:
-1. Pin Pillow ≥11.3.0 in requirements.
-2. Validate image magic bytes (JPEG/PNG/TIFF) before loading with PIL.
-3. Check image dimensions before loading: reject if either dimension > 10,000px.
-4. Use OpenCV for loading (separate from PIL) — validate separately.
-5. Resolve image_folder path to canonical form (os.path.realpath) and verify it's under allowed base directories.
-6. Add Content-Security-Policy headers.
-7. SSE heartbeat every 15s to detect stale connections.
-8. Implement asyncio.Queue-based event publisher for SSE.
-
-### Confidence
-✅ High — based on documented CVE + known SSE issues
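
Two of the measures above, magic-byte validation and canonical path checking with `os.path.realpath`, can be sketched with the standard library alone. The function names and the set of accepted signatures are illustrative assumptions; the dimension check and Pillow pinning are not covered here.

```python
import os

# Leading byte signatures for the three formats named above.
MAGIC_PREFIXES = {
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "tiff": (b"II*\x00", b"MM\x00*"),
}


def sniff_image_format(path):
    """Return 'jpeg', 'png', 'tiff', or None, looking at the first bytes only."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    for fmt, prefixes in MAGIC_PREFIXES.items():
        if any(head.startswith(prefix) for prefix in prefixes):
            return fmt
    return None


def is_under_allowed_base(path, allowed_bases):
    """Resolve symlinks and '..' first, then require the path to sit inside a base.

    Note: os.path.commonpath raises ValueError for paths on different
    Windows drives; this sketch assumes a POSIX deployment.
    """
    real = os.path.realpath(path)
    for base in allowed_bases:
        real_base = os.path.realpath(base)
        if os.path.commonpath([real, real_base]) == real_base:
            return True
    return False
```

Comparing with `os.path.commonpath` rather than a string prefix avoids accepting a sibling directory such as `/data/flights_evil` when the allowed base is `/data/flights`.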
-
---
-
-## Dimension 8: Drift Management Strategy
-
-### Fact Confirmation
-According to Fact #17, NaviLoc demonstrates that trajectory-level optimization with noisy VPR measurements achieves 16x better accuracy than per-frame approaches. SatLoc-Fusion uses adaptive confidence metrics.
-
-### Problem
-Draft02's "drift limit factor" as a GTSAM custom factor is problematic: (1) custom Python factors are slow, (2) if no satellite anchors arrive for extended period, the drift factor has nothing to constrain against.
-
-### Conclusion
-Replace GTSAM drift factor with Segment Manager logic:
-1. Track cumulative VO displacement since last satellite anchor.
-2. If cumulative displacement > 100m without anchor: emit warning SSE event, increase satellite matching frequency/radius.
-3. If cumulative displacement > 200m: request user input with timeout.
-4. If cumulative displacement > 500m: mark segment as LOW confidence, continue but warn.
-5. Confidence score per position: decays exponentially with distance from nearest anchor.
-
-This is simpler, faster, and more controllable than a GTSAM custom factor.
-
-### Confidence
-✅ High — engineering judgment supported by SatLoc-Fusion's confidence-based approach
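
The Segment Manager rules above reduce to a threshold lookup plus an exponential decay. In this sketch the three thresholds come from the list above, while the action labels and the `decay_length_m` constant are assumptions chosen for illustration.

```python
import math

# Cumulative-displacement thresholds in metres, from the list above.
WARN_M, ASK_USER_M, LOW_CONFIDENCE_M = 100.0, 200.0, 500.0


def drift_action(displacement_m):
    """Map cumulative VO displacement since the last anchor to an action label."""
    if displacement_m > LOW_CONFIDENCE_M:
        return "mark_low_confidence"
    if displacement_m > ASK_USER_M:
        return "request_user_input"
    if displacement_m > WARN_M:
        return "warn_and_widen_search"
    return "ok"


def confidence(distance_from_anchor_m, decay_length_m=200.0):
    """Confidence in (0, 1], decaying exponentially with distance from the nearest anchor.

    decay_length_m is an assumed tuning constant, not a value from the draft.
    """
    return math.exp(-distance_from_anchor_m / decay_length_m)
```

Because both functions are pure, the thresholds can be unit-tested and tuned without touching the factor graph, which is the main practical argument for moving drift handling out of GTSAM.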
@@ -1,100 +0,0 @@
-# Validation Log — Draft02 Assessment (Draft03)
-
-## Validation Scenario 1: Factor graph initialization with first satellite match
-
-**Scenario**: Flight starts, VO processes frames 0 through 10, satellite match arrives for frame 5.
-
-**Expected with Draft03 fixes**:
-1. GTSAM graph starts with PriorFactorPose2 at starting GPS (frame 0).
-2. BetweenFactorPose2 added for frames 0→1, 1→2, ..., 9→10.
-3. Satellite match for frame 5: add PriorFactorPose2 with position from satellite match and noise proportional to reprojection error × GSD.
-4. iSAM2.update() triggers backward correction — frames 0-4 and 5-10 both adjust.
-5. All positions in local ENU coordinates, converted to WGS84 for output.
-
-**Validation result**: Consistent. PriorFactorPose2 correctly constrains Pose2 variables. No GPSFactor API mismatch.
-
-## Validation Scenario 2: Segment start with unknown heading (rotation handling)
-
-**Scenario**: After a sharp turn, new segment starts. First image has unknown heading.
-
-**Expected with Draft03 fixes**:
-1. VO triple check fails → segment break.
-2. New segment starts. No heading available.
-3. For satellite coarse retrieval: DINOv2-S processes unrotated image → top-5 tiles.
-4. For fine matching: try SuperPoint+LightGlue at 4 rotations {0°, 90°, 180°, 270°}.
-5. If match found: heading estimated from satellite alignment. Subsequent images rectified.
-6. If no match: try SIFT+LightGlue (rotation-invariant).
-7. If still no match: request user input.
-
-**Validation result**: Consistent. Three-tier fallback addresses the heading bootstrap problem.
-
-## Validation Scenario 3: VRAM budget during satellite matching
-
-**Scenario**: Processing frame with concurrent VO + satellite matching on RTX 2060 (6GB).
-
-**Expected with Draft03 fixes**:
-1. XFeat features already extracted for VO: ~200MB VRAM.
-2. DINOv2 ViT-S/14 loaded for coarse retrieval: ~300MB.
-3. After coarse retrieval, DINOv2 can be unloaded or kept resident.
-4. SuperPoint loaded for fine matching: ~400MB.
-5. LightGlue ONNX loaded: ~500MB.
-6. Peak if all loaded: ~1.4GB.
-7. ONNX Runtime workspace: ~300MB.
-8. Total peak: ~1.7GB — well within 6GB.
-9. faiss runs on CPU — no VRAM impact.
-
-**Validation result**: Consistent. VRAM budget is comfortable even without model unloading.
-
-## Validation Scenario 4: Extended satellite failure (50+ frames)
-
-**Scenario**: Flying over area with outdated/changed satellite imagery. Satellite matching fails for 80 consecutive frames (~8km).
-
-**Expected with Draft03 fixes**:
-1. Frames 1-10: normal VO, satellite matching fails. Cumulative drift increases.
-2. Frame ~10 (~1km travelled since last anchor): warning SSE event emitted. Satellite search radius expanded.
-3. Frame ~20 (~2km travelled since last anchor): user_input_needed SSE event. If user provides GPS → anchor + backward correction.
-4. If user doesn't respond within timeout: continue with LOW confidence.
-5. Frame ~50 (~5km travelled since last anchor): positions marked as very low confidence.
-6. Confidence score per position decays exponentially from last anchor.
-7. If satellite finally matches at frame 80: massive backward correction. Refined events emitted.
-
-**Validation result**: Consistent. Explicit drift thresholds are more predictable than the custom GTSAM factor approach.
-
-## Validation Scenario 5: Google Maps session token management
-
-**Scenario**: Processing a 3000-image flight. Need ~2000 satellite tiles.
-
-**Expected with Draft03 fixes**:
-1. On job start: create Google Maps session with POST /v1/createSession (returns session token).
-2. Use session token in all tile requests for this session.
-3. Daily limit: 15,000 tiles → sufficient for single flight.
-4. Monthly limit: 100,000 → ~50 flights.
-5. At 80% daily limit (12,000): switch to Mapbox.
-6. Mapbox: 200,000/month → additional ~100 flights capacity.
-
-**Validation result**: Consistent. Session management addressed correctly.
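
The quota figures in this scenario can be checked with integer arithmetic. The limits and the 2000-tiles-per-flight figure come from the steps above; the function names and provider labels are illustrative.

```python
GOOGLE_DAILY_LIMIT = 15_000
# Switch providers at 80% of the daily limit (integer math avoids float rounding).
SWITCH_AT = GOOGLE_DAILY_LIMIT * 80 // 100


def pick_provider(google_tiles_used_today):
    """Return which tile provider to use for the next request."""
    return "mapbox" if google_tiles_used_today >= SWITCH_AT else "google"


def flights_per_month(monthly_tile_limit, tiles_per_flight=2000):
    """Whole flights that fit in a monthly tile quota."""
    return monthly_tile_limit // tiles_per_flight
```

A single 2000-tile flight stays far below the 12,000-tile switch point, so the Mapbox fallback only comes into play when several flights are processed on the same day.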
-
-## Review Checklist
- [x] Draft conclusions consistent with fact cards
- [x] No important dimensions missed
- [x] No over-extrapolation
- [x] Conclusions actionable/verifiable
- [x] GPSFactor API mismatch verified against GTSAM docs
- [x] VRAM budget calculated with specific model variants
- [x] Rotation handling addresses segment start edge case
- [x] Drift management has concrete thresholds
-
-## Counterexamples
- **Very low-texture terrain** (uniform sand/snow): XFeat VO might fail even on consecutive frames. Mitigation: track texture score per image, warn when low.
- **Satellite imagery completely missing for region**: Both Google and Mapbox might have no data. Mitigation: user-provided tiles are highest priority.
- **Multiple concurrent GPU processes**: Another process using the GPU could reduce available VRAM. Mitigation: document exclusive GPU access requirement.
-
-## Conclusions Requiring No Revision
-All conclusions validated. Key improvements are well-supported:
-1. Correct GTSAM factor types (PriorFactorPose2 instead of GPSFactor)
-2. DINOv2 ViT-S/14 for VRAM efficiency
-3. Three-tier rotation handling
-4. Explicit drift thresholds in Segment Manager
-5. asyncio.Queue-based SSE publisher
-6. CPU-based faiss
-7. Session token management for Google Maps
@@ -1,74 +0,0 @@
-# Acceptance Criteria Assessment
-
-## Acceptance Criteria
-
-| Criterion | Our Values | Researched Values | Cost/Timeline Impact | Status |
-|-----------|-----------|-------------------|---------------------|--------|
-| Position accuracy (80% of photos) | ≤50m error | 15-150m achievable depending on method. SatLoc (2025): <15m with adaptive fusion. Mateos-Ramirez (2024): 142m mean at 1000m+ altitude. At 400m altitude with better GSD (~6cm/px) and satellite correction, ≤50m for 80% is realistic | Moderate — requires high-quality satellite imagery and robust feature matching pipeline | **Modified** — see notes on satellite imagery quality dependency |
-| Position accuracy (60% of photos) | ≤20m error | Achievable only with satellite-anchored corrections, not with VO alone. SatLoc reports <15m with satellite anchoring + VO fusion. Requires 0.3-0.5 m/px satellite imagery and good terrain texture | High — requires premium satellite imagery, robust cross-view matching, and careful calibration | **Modified** — add dependency on satellite correction frequency |
-| Outlier tolerance | 350m displacement between consecutive photos | At 400m altitude, image footprint is ~375x250m. A 350m displacement means near-zero overlap. VO will fail; system must rely on IMU dead-reckoning or satellite re-localization | Low — standard outlier detection can handle this | Modified — specify fallback strategy (IMU dead-reckoning + satellite re-matching) |
-| Sharp turn handling (partial overlap) | <200m drift, <70° angle, <5% overlap | Standard VO fails below ~20-30% overlap. With <5% overlap, feature matching between consecutive frames is unreliable. Requires satellite-based re-localization or IMU bridging | High — requires separate re-localization module | Modified — clarify: "70%" should likely be "70 degrees"; add IMU-bridge requirement |
-| Disconnected route segments | System should reconnect disconnected chunks | This is essentially a place recognition / re-localization problem. Solvable via satellite image matching for each new segment independently | High — core architectural requirement affecting system design | Modified — add: each segment should independently localize via satellite matching |
-| User fallback input | Ask user after 3 consecutive failures | Reasonable fallback. Needs UI/API integration for interactive input | Low | No change |
-| Processing time per image | <5 seconds | On Jetson Orin Nano Super (8GB shared memory): feasible with optimized pipeline. CUDA feature extraction ~50ms, matching ~100-500ms, satellite crop+match ~1-3s. Full pipeline 2-4s is achievable with image downsampling and TensorRT optimization | Moderate — requires TensorRT optimization and image downsampling strategy | **Modified** — specify this is for Jetson Orin Nano Super, not RTX 2060 |
-| Real-time streaming | SSE for immediate results + refinement | Standard pattern, well-supported | Low | No change |
-| Image Registration Rate | >95% | For consecutive frames with nadir camera in good conditions: 90-98% achievable. Drops significantly during sharp turns and over low-texture terrain (water, uniform fields). The 95% target conflicts with sharp-turn handling requirement | Moderate — requires learning-based matchers (SuperPoint/LightGlue) | **Modified** — clarify: 95% applies to "normal flight" segments only; sharp-turn frames are expected failures handled by re-localization |
-| Mean Reprojection Error | <1.0 pixels | Achievable with modern methods (LightGlue, SuperGlue). Traditional methods typically 1-3 px. Deep learning matchers routinely achieve 0.3-0.8 px with proper calibration | Moderate — requires deep learning feature matchers | No change — achievable |
-| REST API + SSE architecture | Background service | Standard architecture, well-supported in Python (FastAPI + SSE) | Low | No change |
-| Satellite imagery resolution | ≥0.5 m/px, ideally 0.3 m/px | Google Maps for eastern Ukraine: variable, typically 0.5-1.0 m/px in rural areas. 0.3 m/px unlikely from Google Maps. Commercial providers (Maxar, Planet) offer 0.3-0.5 m/px but at significant cost | **High** — Google Maps may not meet 0.5 m/px in all areas of the operational region. 0.3 m/px requires commercial satellite providers | **Modified** — current Google Maps limitation may make this unachievable for all areas; consider fallback for degraded satellite quality |
-| Confidence scoring | Per-position estimate (high=satellite, low=VO) | Standard practice in sensor fusion. Easy to implement | Low | No change |
-| Output format | WGS84, GeoJSON or CSV | Standard, trivial to implement | Negligible | No change |
-| Satellite imagery age | <2 years where possible | Google Maps imagery for conflict zones (eastern Ukraine) may be significantly outdated or intentionally degraded. Recency is hard to guarantee | Medium — may need multiple satellite sources | **Modified** — flag: conflict zone imagery may be intentionally limited |
-| Max VO cumulative drift | <100m between satellite corrections | VIO drift typically 0.8-1% of distance. Between corrections at 1km intervals: ~10m drift. 100m budget allows corrections every ~10km — very generous | Low — easily achievable if corrections happen at reasonable intervals | No change — generous threshold |
-| Memory usage | <8GB shared memory (Jetson Orin Nano Super) | Binding constraint. 8GB LPDDR5 shared between CPU and GPU. ~6-7GB usable after OS. 26MP images need downsampling | **Critical** — all processing must fit within 8GB shared memory | **Updated** — changed to Jetson Orin Nano Super constraint |
-| Object center coordinates | Accuracy consistent with frame-center accuracy | New criterion — derives from problem statement requirement | Low — once frame position is known, object position follows from pixel offset + GSD | **Added** |
-| Sharp turn handling | <200m drift, <70 degrees, <5% overlap. 95% registration rate applies to normal flight only | Clarified from original "70%" to "70 degrees". Split registration rate expectation | Low — clarification only | **Updated** |
-| Offline preprocessing time | Not time-critical (minutes/hours before flight) | New criterion — no constraint existed | Low | **Added** |
-
-## Restrictions Assessment
-
-| Restriction | Our Values | Researched Values | Cost/Timeline Impact | Status |
-|-------------|-----------|-------------------|---------------------|--------|
-| Aircraft type | Fixed-wing only | Appropriate — fixed-wing has predictable motion model, mostly forward flight. Simplifies VO assumptions | N/A | No change |
-| Camera mount | Downward-pointing, fixed, not autostabilized | Implies roll/pitch affect image. At 400m altitude, moderate roll/pitch causes manageable image shift. IMU data can compensate. Non-stabilized means more variable image overlap and orientation | Medium — must use IMU data for image dewarping or accept orientation-dependent accuracy | **Modified** — add: IMU-based image orientation correction should be considered |
-| Operational region | Eastern/southern Ukraine (left of Dnipro) | Conflict zone — satellite imagery may be degraded, outdated, or restricted. Terrain: mix of agricultural, urban, forest. Agricultural areas have seasonal texture changes | **High** — satellite imagery availability and quality is a significant risk | **Modified** — flag operational risk: imagery access in conflict zones |
-| Image resolution | FullHD to 6252x4168, known camera parameters | 26MP at max is large for edge processing. Must downsample for feature extraction. Known camera intrinsics enable proper projective geometry | Medium — pipeline must handle variable resolutions | No change |
-| Altitude | Predefined, ≤1km, terrain height negligible | At 400m: GSD ~6cm/px, footprint ~375x250m. Terrain "negligible" is an approximation — even 50m terrain variation at 400m altitude causes ~12% scale error. The referenced paper (Mateos-Ramirez 2024) shows terrain elevation is a primary error source | **Medium** — "terrain height negligible" needs qualification. At 400m, terrain variations >50m become significant | **Modified** — add: terrain height can be neglected only if variations <50m within image footprint |
-| IMU data availability | "A lot of data from IMU" | IMU provides: accelerometer, gyroscope, magnetometer. Crucial for: dead-reckoning during feature-less frames, image orientation compensation, scale estimation, motion prediction. Standard tactical IMUs provide 100-400Hz data | Low — standard IMU integration | **Modified** — specify: IMU data includes gyroscope + accelerometer at ≥100Hz; will be used for orientation compensation and dead-reckoning fallback |
-| Weather | Mostly sunny | Favorable for visual methods. Shadows can actually help feature matching. Reduces image quality variability | Low — favorable condition | No change |
-| Satellite provider | Google Maps (potentially outdated) | **Critical limitation**: Google Maps satellite API has usage limits, unknown update frequency for eastern Ukraine, potential conflict-zone restrictions. Resolution may not meet 0.5 m/px in rural areas. No guarantee of recency | **High** — single-provider dependency is a significant risk | **Modified** — consider: (1) downloading tiles ahead of time for the operational area, (2) having a fallback provider strategy |
-| Photo count | Up to 3000, typically 500-1500 | At 3fps and 500-1500 photos: 3-8 minutes of flight. At ~100m spacing: 50-150km route. Memory for 3000 pre-extracted satellite feature maps needs careful management on 8GB | Medium — batch processing and memory management needed | **Modified** — add: pipeline must manage memory for up to 3000 frames on 8GB device |
-| Sharp turns | Next photo may have no common objects with previous | This is the hardest edge case. Complete visual discontinuity requires satellite-based re-localization. IMU provides heading/velocity for bridging. System must be architected around this possibility | High — drives core architecture decision | No change — already captured as a defining constraint |
-| Processing hardware | Jetson Orin Nano Super, 67 TOPS | 8GB shared LPDDR5, 1024 CUDA cores, 32 Tensor Cores, 102 GB/s bandwidth. TensorRT for inference optimization. Power: 7-25W. Significantly less capable than desktop GPU | **Critical** — all processing must fit within 8GB shared memory, pipeline must be optimized for TensorRT | **Modified** — CONTRADICTS AC's RTX 2060 reference. Must be the binding constraint |
-
-## Key Findings
-
-1. **CRITICAL CONTRADICTION**: The AC mentions "RTX 2060 compatibility" (16GB RAM + 6GB VRAM) but the restriction specifies Jetson Orin Nano Super (8GB shared memory). These are fundamentally different platforms. **The Jetson must be the binding constraint.** All processing, including model weights, image buffers, and intermediate results, must fit within ~6-7GB usable memory (OS takes ~1-1.5GB).
-
-2. **Satellite Imagery Risk**: Google Maps as the sole satellite provider for a conflict zone in eastern Ukraine presents significant quality, resolution, and recency risks. The 0.3 m/px "ideal" resolution is unlikely available from Google Maps for this region. The system design must be robust to degraded satellite reference quality (0.5-1.0 m/px).
-
-3. **Accuracy is Achievable but Conditional**: The 50m/80% and 20m/60% accuracy targets are achievable based on recent research (SatLoc 2025: <15m with adaptive fusion), but **only when satellite corrections are successful**. VO-only segments will drift ~1% of distance traveled. The system must maximize satellite correction frequency.
-
-4. **Sharp Turn Handling Drives Architecture**: The requirement to handle disconnected route segments with no visual overlap between consecutive frames means the system cannot rely solely on sequential VO. It must have an independent satellite-based geo-localization capability for each frame or segment — this is a core architectural requirement.
-
-5. **Processing Time is Feasible**: <5s per image on Jetson Orin Nano Super is achievable with: (a) image downsampling (e.g., to 2000x1300), (b) TensorRT-optimized models, (c) efficient satellite region cropping. GPU-accelerated feature extraction takes ~50ms, matching ~100-500ms, satellite matching ~1-3s.
-
-6. **Missing AC: Object Center Coordinates**: The problem statement mentions "coordinates of the center of any object in these photos" but no acceptance criterion specifies the accuracy requirement for this. Need to add.
-
-7. **Missing AC: DEM/Elevation Data**: Research shows terrain elevation is a primary error source for pixel-to-meter conversion at these altitudes. If terrain variations are >50m, a DEM is needed. No current restriction mentions DEM availability.
-
-8. **Missing AC: Offline Preprocessing Time**: No constraint on how long satellite image preprocessing can take before the flight.
-
-9. **"70%" in Sharp Turn AC is Ambiguous**: "at an angle of less than 70%" — this likely means 70 degrees, not 70%.
-
-## Sources
-
- SatLoc: Hierarchical Adaptive Fusion Framework for GNSS-denied UAV Localization (2025) — <15m error, >90% coverage, 2+ Hz on edge hardware
- Mateos-Ramirez et al. "Visual Odometry in GPS-Denied Zones for Fixed-Wing UAV" (2024) — 142.88m mean error over 17km at 1000m+ altitude, 0.83% error rate with satellite correction
- NVIDIA Jetson Orin Nano Super specs: 8GB LPDDR5, 67 TOPS, 1024 CUDA cores, 102 GB/s bandwidth
- cuda-efficient-features: Feature extraction benchmarks — 4K in ~12ms on Jetson Xavier
- SIFT+LightGlue for UAV image mosaicking (ISPRS 2025) — superior performance across diverse scenarios
- SuperPoint+LightGlue comparative analysis (2024) — best balance of robustness, accuracy, efficiency
- Google Maps satellite resolution: 0.15m-30m depending on location and source imagery
- VIO drift benchmarks: 0.82-1% of distance traveled (EuRoC, outdoor flights)
- UAVSAR cross-modality matching: 1.83-2.86m RMSE with deep learning approach (Springer 2026)
@@ -1,88 +0,0 @@
-# Question Decomposition
-
-## Original Question
-Research the GPS-denied onboard navigation problem for a fixed-wing UAV and find the best solution architecture. The system must determine frame-center GPS coordinates using visual odometry, satellite image matching, and IMU fusion — all running on a Jetson Orin Nano Super (8GB shared memory, 67 TOPS).
-
-## Active Mode
-Mode A Phase 2 — Initial Research (Problem & Solution)
-
-## Rationale
-No existing solution drafts. Full problem decomposition and solution research needed.
-
-## Problem Context Summary (from INPUT_DIR)
- **Platform**: Fixed-wing UAV, camera pointing down (not stabilized), 400m altitude (max 1km)
- **Camera**: ADTi Surveyor Lite 26S v2, 26MP (6252x4168), focal length 25mm, sensor width 23.5mm
- **GSD at 400m**: ~6cm/pixel, footprint ~375x250m
- **Frame rate**: 3 fps (interval ~333ms, real-world could be 400-500ms)
- **Photo count**: 500-3000 per flight
- **IMU**: Available at high rate
- **Initial GPS**: Known; GPS may be denied/spoofed during flight
- **Satellite reference**: Pre-uploaded Google Maps tiles
- **Hardware**: Jetson Orin Nano Super, 8GB shared memory, 67 TOPS
- **Region**: Eastern/southern Ukraine (conflict zone)
- **Key challenge**: Reconnecting disconnected route segments after sharp turns
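
The GSD and footprint figures in the summary above follow from the pinhole camera relation GSD = (sensor width × altitude) / (focal length × image width). A short sketch that reproduces them from the listed camera parameters, assuming square pixels, a nadir view, and flat terrain; the function name is illustrative.

```python
def ground_sample_distance_m(altitude_m, focal_mm, sensor_width_mm, image_width_px):
    """Metres of ground covered by one pixel for a nadir-pointing pinhole camera."""
    return (sensor_width_mm * altitude_m) / (focal_mm * image_width_px)


# Camera parameters from the problem context: 25mm lens, 23.5mm sensor, 6252x4168 px.
gsd_m = ground_sample_distance_m(400.0, 25.0, 23.5, 6252)
footprint_m = (gsd_m * 6252, gsd_m * 4168)  # (width, height) on the ground
```

At 400m this gives about 6cm per pixel and a footprint of roughly 376m by 251m, matching the ~375x250m figure quoted above. Since GSD scales linearly with altitude, the same call with 1000m gives about 15cm per pixel.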
-
-## Question Type Classification
-**Decision Support** — we need to evaluate and select the best architectural approach and component technologies for each part of the pipeline.
-
-## Research Subject Boundary Definition
-
-| Dimension | Boundary |
-|-----------|----------|
-| Population | Fixed-wing UAVs with nadir cameras at 200-1000m altitude |
-| Geography | Rural/semi-urban terrain in eastern Ukraine |
-| Timeframe | Current state-of-the-art (2023-2026) |
-| Level | Edge computing (Jetson-class, 8GB memory), real-time processing |
-
-## Decomposed Sub-Questions
-
-### A. Existing/Competitor Solutions
-1. What existing systems solve GPS-denied UAV visual navigation?
-2. What open-source implementations exist for VO + satellite matching?
-3. What commercial/military solutions address this problem?
-
-### B. Architecture Components
-4. What is the optimal pipeline architecture (sequential vs parallel, streaming)?
-5. How should VO, satellite matching, and IMU fusion be combined (loosely vs tightly coupled)?
-6. How to handle disconnected route segments (the core architectural challenge)?
-
-### C. Visual Odometry Component
-7. What VO algorithms work best for aerial nadir imagery on edge hardware?
-8. What feature extractors/matchers are optimal for Jetson (SuperPoint, ORB, XFeat)?
-9. How to handle scale estimation with known altitude and camera parameters?
-10. What is the optimal image downsampling strategy for 26MP on 8GB memory?
-
-### D. Satellite Image Matching Component
-11. How to efficiently match UAV frames against pre-loaded satellite tiles?
-12. What cross-view matching methods work for aerial-to-satellite registration?
-13. How to preprocess and index satellite tiles for fast retrieval?
-14. How to handle resolution mismatch (6cm UAV vs 50cm+ satellite)?
-
-### E. IMU Fusion Component
-15. How to fuse IMU data with visual estimates (EKF, UKF, factor graph)?
-16. How to use IMU for dead-reckoning during feature-less frames?
-17. How to use IMU for image orientation compensation (non-stabilized camera)?
-
-### F. Edge Optimization
-18. How to fit the full pipeline in 8GB shared memory?
-19. What TensorRT optimizations are available for feature extractors?
-20. How to achieve <5s per frame on Jetson Orin Nano Super?
-
-### G. API & Streaming
-21. What is the best approach for REST API + SSE on Python/Jetson?
-22. How to implement progressive result refinement?
-
-## Timeliness Sensitivity Assessment
-
- **Research Topic**: GPS-denied UAV visual navigation with edge processing
- **Sensitivity Level**: 🟠 High
- **Rationale**: Deep learning feature matchers (SuperPoint, LightGlue, XFeat) and edge inference frameworks (TensorRT) evolve rapidly. Jetson Orin Nano Super is a recent (Dec 2024) product. Cross-view geo-localization is an active research area.
- **Source Time Window**: 12 months (prioritize 2025-2026)
- **Priority official sources to consult**:
-  1. NVIDIA Jetson documentation and benchmarks
-  2. OpenCV / kornia / hloc official docs
-  3. Recent papers on cross-view geo-localization (CVPR, ECCV, ICCV 2024-2025)
- **Key version information to verify**:
-  - JetPack SDK: Current version ____
-  - SuperPoint/LightGlue: Latest available for TensorRT ____
-  - XFeat: Version and Jetson compatibility ____
@@ -1,151 +0,0 @@
-# Source Registry
-
-## Source #1
- **Title**: Visual Odometry in GPS-Denied Zones for Fixed-Wing UAV with Reduced Accumulative Error Based on Satellite Imagery
- **Link**: https://www.mdpi.com/2076-3417/14/16/7420
- **Tier**: L1
- **Publication Date**: 2024-08-22
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Fixed-wing UAV GPS-denied navigation
- **Research Boundary Match**: ✅ Full match
- **Summary**: VO + satellite correction pipeline for fixed-wing UAV at 1000m+ altitude. Mean error 142.88m over 17km (0.83%). Uses ORB features, centroid-based displacement, Kalman filter smoothing, quadtree for satellite keypoint indexing.
- **Related Sub-question**: A1, B5, C7, D11
-
-## Source #2
- **Title**: SatLoc: Hierarchical Adaptive Fusion Framework for GNSS-denied UAV Localization
- **Link**: https://www.scilit.com/publications/e5cafaf875a49297a62b298a89d5572f
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV localization in GNSS-denied environments
- **Research Boundary Match**: ✅ Full match
- **Summary**: Three-layer fusion: DinoV2 for satellite geo-localization, XFeat for VO, optical flow for velocity. Adaptive confidence-based weighting. <15m error, >90% coverage, 2+ Hz on edge hardware.
- **Related Sub-question**: B4, B5, C8, D12
-
-## Source #3
- **Title**: XFeat: Accelerated Features for Lightweight Image Matching (CVPR 2024)
- **Link**: https://arxiv.org/abs/2404.19174
- **Tier**: L1
- **Publication Date**: 2024-04
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Edge device feature matching
- **Research Boundary Match**: ✅ Full match
- **Summary**: 5x faster than SuperPoint, runs on CPU at VGA resolution. Sparse and semi-dense matching. TensorRT deployment available for Jetson. Comparable accuracy to SuperPoint.
- **Related Sub-question**: C8, F18, F20
-
-## Source #4
- **Title**: XFeat TensorRT Implementation
- **Link**: https://github.com/PranavNedunghat/XFeatTensorRT
- **Tier**: L2
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Summary**: C++ TensorRT implementation of XFeat, tested on Jetson Orin NX 16GB with JetPack 6.0, CUDA 12.2, TensorRT 8.6.
- **Related Sub-question**: C8, F18, F19
-
-## Source #5
- **Title**: SuperPoint+LightGlue TensorRT Deployment
- **Link**: https://github.com/fettahyildizz/superpoint_lightglue_tensorrt
- **Tier**: L2
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Summary**: C++ TensorRT implementation of SuperPoint+LightGlue. Production-ready deployment for Jetson platforms.
- **Related Sub-question**: C8, F19
-
-## Source #6
- **Title**: FP8 Quantized LightGlue in TensorRT
- **Link**: https://fabio-sim.github.io/blog/fp8-quantized-lightglue-tensorrt-nvidia-model-optimizer/
- **Tier**: L2
- **Publication Date**: 2026
- **Timeliness Status**: ✅ Currently valid
- **Summary**: Up to ~6x speedup with FP8 quantization. Requires Hopper/Ada Lovelace GPUs (not available on Jetson Orin Nano Ampere). FP16 is the best available precision for Orin Nano.
- **Related Sub-question**: F19
-
-## Source #7
- **Title**: NVIDIA JetPack 6.2 Release Notes
- **Link**: https://docs.nvidia.com/jetson/archives/jetpack-archived/jetpack-62/release-notes/index.html
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Summary**: CUDA 12.6.10, TensorRT 10.3.0, cuDNN 9.3. Super Mode for Orin Nano: up to 2x inference performance, 50% memory bandwidth boost. Power modes: 15W, 25W, MAXN SUPER.
- **Related Sub-question**: F18, F19, F20
-
-## Source #8
- **Title**: cuda-efficient-features (GPU feature detection benchmarks)
- **Link**: https://github.com/fixstars/cuda-efficient-features
- **Tier**: L2
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Summary**: 4K detection: 12ms on Jetson Xavier. 8K: 27.5ms. 40K keypoints extraction: 20-25ms on Xavier. Orin Nano Super should be comparable or better.
- **Related Sub-question**: F20
-
-## Source #9
- **Title**: Adaptive Covariance Hybrid EKF/UKF for Visual-Inertial Odometry
- **Link**: https://arxiv.org/abs/2512.17505
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Summary**: Hybrid EKF/UKF achieves 49% better position accuracy, 57% better rotation accuracy than ESKF alone, at 48% lower computational cost than full UKF. Includes adaptive sensor confidence scoring.
- **Related Sub-question**: E15
-
-## Source #10
- **Title**: SIFT+LightGlue for UAV Image Mosaicking (ISPRS 2025)
- **Link**: https://isprs-archives.copernicus.org/articles/XLVIII-2-W11-2025/169/2025/
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Summary**: SIFT+LightGlue outperforms SuperPoint+LightGlue for UAV mosaicking across diverse scenarios. Superior in both low-texture and high-texture environments.
- **Related Sub-question**: C8, D12
-
-## Source #11
- **Title**: UAVision - GNSS-Denied UAV Visual Localization System
- **Link**: https://github.com/ArboriseRS/UAVision
- **Tier**: L4
- **Publication Date**: 2024-2025
- **Timeliness Status**: ✅ Currently valid
- **Summary**: Open-source system using LightGlue for map matching. Includes image processing modules and visualization.
- **Related Sub-question**: A2
-
-## Source #12
- **Title**: TerboucheHacene/visual_localization
- **Link**: https://github.com/TerboucheHacene/visual_localization
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Summary**: Vision-based GNSS-free localization with SuperPoint/SuperGlue/GIM matching. Optimized VO + satellite image matching hybrid pipeline. Learning-based matchers for natural environments.
- **Related Sub-question**: A2, D12
-
-## Source #13
- **Title**: GNSS-Denied Geolocalization with Terrain Constraints
- **Link**: https://github.com/yfs90/gnss-denied-uav-geolocalization
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Summary**: No altimeters/IMU required, uses image matching + terrain constraints. GPS-comparable accuracy for day/night across varied terrain.
- **Related Sub-question**: A2
-
-## Source #14
- **Title**: Google Maps Tile API Documentation
- **Link**: https://developers.google.com/maps/documentation/tile/satellite
- **Tier**: L1
- **Publication Date**: Current
- **Timeliness Status**: ✅ Currently valid
- **Summary**: Zoom levels 0-22. Satellite tiles via HTTPS. Session tokens required. Bulk download possible but subject to usage policies.
- **Related Sub-question**: D13
-
-## Source #15
- **Title**: NaviLoc: Trajectory-Level Visual Localization for GNSS-Denied UAV Navigation
- **Link**: https://www.mdpi.com/2504-446X/10/2/97
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Summary**: Trajectory-level optimization rather than per-frame matching. Optimizes entire trajectory against satellite reference for improved accuracy.
- **Related Sub-question**: B4, D11
-
-## Source #16
- **Title**: GSD Estimation for UAV Photogrammetry
- **Link**: https://blog.truegeometry.com/calculators/UAV_photogrammetry_workflows_calculation.html
- **Tier**: L3
- **Publication Date**: Current
- **Timeliness Status**: ✅ Currently valid
- **Summary**: GSD = (sensor_width × altitude) / (focal_length × image_width). For our case: (23.5mm × 400m) / (25mm × 6252) = 0.06 m/pixel.
- **Related Sub-question**: C9
@@ -1,121 +0,0 @@
-# Fact Cards
-
-## Fact #1
- **Statement**: XFeat achieves up to 5x faster inference than SuperPoint while maintaining comparable accuracy for pose estimation. It runs in real-time on CPU at VGA resolution.
- **Source**: Source #3 (CVPR 2024 paper)
- **Phase**: Phase 2
- **Target Audience**: Edge device deployments
- **Confidence**: ✅ High
- **Related Dimension**: Feature Extraction
-
-## Fact #2
- **Statement**: XFeat TensorRT implementation exists and is tested on Jetson Orin NX 16GB with JetPack 6.0, CUDA 12.2, TensorRT 8.6.
- **Source**: Source #4
- **Phase**: Phase 2
- **Target Audience**: Jetson platform deployment
- **Confidence**: ✅ High
- **Related Dimension**: Feature Extraction, Edge Optimization
-
-## Fact #3
- **Statement**: SatLoc framework achieves <15m absolute localization error with >90% trajectory coverage at 2+ Hz on edge hardware, using DinoV2 for satellite matching, XFeat for VO, and optical flow for velocity.
- **Source**: Source #2
- **Phase**: Phase 2
- **Target Audience**: GNSS-denied UAV localization
- **Confidence**: ⚠️ Medium (paper details not fully accessible)
- **Related Dimension**: Overall Architecture, Accuracy
-
-## Fact #4
- **Statement**: Mateos-Ramirez et al. achieved 142.88m mean error over 17km (0.83% error rate) with VO + satellite correction on a fixed-wing UAV at 1000m+ altitude. Without satellite correction, error accumulated to 850m+ over 17km.
- **Source**: Source #1
- **Phase**: Phase 2
- **Target Audience**: Fixed-wing UAV at high altitude
- **Confidence**: ✅ High
- **Related Dimension**: Accuracy, Architecture
-
-## Fact #5
- **Statement**: VIO systems typically drift 0.8-1% of distance traveled. Between satellite corrections at 1km intervals, expected drift is ~10m.
- **Source**: Multiple sources (arXiv VIO benchmarks)
- **Phase**: Phase 2
- **Target Audience**: Aerial VIO systems
- **Confidence**: ✅ High
- **Related Dimension**: VO Drift
-
-## Fact #6
- **Statement**: Jetson Orin Nano Super: 8GB LPDDR5 shared memory, 1024 CUDA cores, 32 Tensor Cores, 102 GB/s bandwidth, 67 TOPS INT8. JetPack 6.2: CUDA 12.6.10, TensorRT 10.3.0.
- **Source**: Source #7
- **Phase**: Phase 2
- **Target Audience**: Hardware specification
- **Confidence**: ✅ High
- **Related Dimension**: Edge Optimization
-
-## Fact #7
- **Statement**: CUDA-accelerated feature detection at 4K (3840x2160): ~12ms on Jetson Xavier. At 8K: ~27.5ms. Descriptor extraction for 40K keypoints: ~20-25ms on Xavier. Orin Nano Super is expected to have comparable or slightly better compute; the source does not benchmark it directly.
- **Source**: Source #8
- **Phase**: Phase 2
- **Target Audience**: Jetson GPU performance
- **Confidence**: ✅ High
- **Related Dimension**: Processing Time
-
-## Fact #8
- **Statement**: Hybrid EKF/UKF achieves 49% better position accuracy than ESKF alone at 48% lower computational cost than full UKF. Includes adaptive sensor confidence scoring based on image entropy and motion blur.
- **Source**: Source #9
- **Phase**: Phase 2
- **Target Audience**: VIO fusion
- **Confidence**: ✅ High
- **Related Dimension**: Sensor Fusion
-
-## Fact #9
- **Statement**: SIFT+LightGlue outperforms SuperPoint+LightGlue for UAV mosaicking across diverse scenarios (low-texture agricultural and high-texture urban).
- **Source**: Source #10
- **Phase**: Phase 2
- **Target Audience**: UAV image matching
- **Confidence**: ✅ High
- **Related Dimension**: Feature Matching
-
-## Fact #10
- **Statement**: GSD for our system at 400m: (23.5mm × 400m) / (25mm × 6252px) = 0.060 m/pixel. Image footprint: 6252 × 0.06 = 375m width, 4168 × 0.06 = 250m height.
- **Source**: Source #16 + camera parameters
- **Phase**: Phase 2
- **Target Audience**: Our specific system
- **Confidence**: ✅ High
- **Related Dimension**: Scale Estimation
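
The arithmetic in Fact #10 can be checked with a minimal sketch. The function name `ground_sample_distance` is illustrative, not part of any library; the camera parameters are the ones quoted in the fact card. With the unrounded GSD the footprint comes out at roughly 376 m × 251 m, which agrees with the card's 375 m × 250 m figures computed from the rounded 0.06 m/px.

```python
def ground_sample_distance(sensor_width_mm, altitude_m, focal_length_mm, image_width_px):
    """GSD in metres per pixel for a nadir-pointing camera."""
    return (sensor_width_mm * altitude_m) / (focal_length_mm * image_width_px)


# Parameters quoted in Fact #10: 23.5 mm sensor, 400 m altitude, 25 mm lens, 6252x4168 image
gsd = ground_sample_distance(23.5, 400.0, 25.0, 6252)
footprint_w = 6252 * gsd  # ground footprint width in metres
footprint_h = 4168 * gsd  # ground footprint height in metres

print(f"GSD: {gsd:.4f} m/px, footprint: {footprint_w:.0f} m x {footprint_h:.0f} m")
```

The same function can be reused to see how the GSD changes with altitude, assuming the camera stays close to nadir.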
-
-## Fact #11
- **Statement**: Google Maps satellite tiles available via Tile API at zoom levels 0-22. Max zoom varies by region. For eastern Ukraine, zoom 18 (~0.6 m/px at the equator, roughly 0.4 m/px at 48°N) is typically available; zoom 19 (~0.3 m/px at the equator) may not be.
- **Source**: Source #14
- **Phase**: Phase 2
- **Target Audience**: Satellite imagery
- **Confidence**: ⚠️ Medium (exact zoom availability for eastern Ukraine unverified)
- **Related Dimension**: Satellite Reference
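
The per-zoom resolution figures in Fact #11 follow from the standard Web Mercator ground-resolution formula. The sketch below assumes 256-pixel tiles and the WGS84 equatorial circumference; both are assumptions about the tile source rather than values taken from the fact card. It shows that the ~0.6 m/px figure for zoom 18 corresponds to the equator, and that the resolution is finer at the latitude of eastern Ukraine.

```python
import math

EARTH_CIRCUMFERENCE_M = 40075016.686  # WGS84 equatorial circumference (assumed constant)


def ground_resolution(lat_deg, zoom, tile_px=256):
    """Metres per pixel of a Web Mercator tile at the given latitude and zoom."""
    return EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg)) / (tile_px * 2 ** zoom)


res_equator_z18 = ground_resolution(0.0, 18)
res_48n_z18 = ground_resolution(48.0, 18)
res_equator_z19 = ground_resolution(0.0, 19)

print(f"zoom 18: {res_equator_z18:.3f} m/px (equator), {res_48n_z18:.3f} m/px (48N)")
print(f"zoom 19: {res_equator_z19:.3f} m/px (equator)")
```

If the tile source serves 512-pixel tiles, passing `tile_px=512` halves each value.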
-
-## Fact #12
- **Statement**: FP8 quantization for LightGlue requires Hopper/Ada GPUs. Jetson Orin Nano uses Ampere architecture — limited to FP16 as best TensorRT precision.
- **Source**: Source #6, Source #7
- **Phase**: Phase 2
- **Target Audience**: Jetson optimization
- **Confidence**: ✅ High
- **Related Dimension**: Edge Optimization
-
-## Fact #13
- **Statement**: SuperPoint+LightGlue TensorRT C++ deployment is available and production-tested. ONNX Runtime path achieves 2-4x speedup over compiled PyTorch.
- **Source**: Source #5, Source #6
- **Phase**: Phase 2
- **Target Audience**: Production deployment
- **Confidence**: ✅ High
- **Related Dimension**: Feature Matching, Edge Optimization
-
-## Fact #14
- **Statement**: Cross-view matching (UAV-to-satellite) is fundamentally harder than same-view matching due to extreme viewpoint differences. Deep learning embeddings (DinoV2, CLIP-based) are the state-of-the-art for coarse retrieval. Local features are used for fine alignment.
- **Source**: Multiple (Sources #2, #12, #15)
- **Phase**: Phase 2
- **Target Audience**: Cross-view geo-localization
- **Confidence**: ✅ High
- **Related Dimension**: Satellite Matching
-
-## Fact #15
- **Statement**: Quadtree spatial indexing enables O(log n) nearest-neighbor lookup for satellite keypoints. Combined with GeoHash for fast region encoding, this is the standard approach for tile management.
- **Source**: Sources #1, #14
- **Phase**: Phase 2
- **Target Audience**: Spatial indexing
- **Confidence**: ✅ High
- **Related Dimension**: Satellite Tile Management
@@ -1,71 +0,0 @@
-# Comparison Framework
-
-## Selected Framework Type
-Decision Support — evaluating solution options per component
-
-## Architecture Components to Evaluate
-
-1. Feature Extraction & Matching (VO frame-to-frame)
-2. Satellite Image Matching (cross-view geo-registration)
-3. Sensor Fusion (VO + satellite + IMU)
-4. Satellite Tile Preprocessing & Indexing
-5. Image Downsampling Strategy
-6. Re-localization (disconnected segments)
-7. API & Streaming Layer
-
-## Component 1: Feature Extraction & Matching (VO)
-
-| Dimension | XFeat | SuperPoint + LightGlue | ORB (OpenCV) |
-|-----------|-------|----------------------|--------------|
-| Speed (Jetson) | ~2-5ms per frame (VGA), 5x faster than SuperPoint | ~15-50ms per frame (VGA, TensorRT FP16) | ~5-10ms per frame (CUDA) |
-| Accuracy | Comparable to SuperPoint on pose estimation | State-of-the-art for local features | Lower accuracy, not scale-invariant |
-| Memory | <100MB model | ~200-400MB model+inference | Negligible |
-| TensorRT support | Yes (C++ impl available for Jetson Orin NX) | Yes (C++ impl available) | N/A (native CUDA) |
-| Cross-view capability | Limited (same-view designed) | Better with LightGlue attention | Poor for cross-view |
-| Rotation invariance | Moderate | Good with LightGlue | Good (by design) |
-| Jetson validation | Tested on Orin NX (JetPack 6.0) | Tested on multiple Jetson platforms | Native OpenCV CUDA |
-| **Fit for VO** | ✅ Best — fast, accurate, Jetson-proven | ⚠️ Good but heavier | ⚠️ Fast but less accurate |
-| **Fit for satellite matching** | ⚠️ Moderate | ✅ Better for cross-view with attention | ❌ Poor for cross-view |
-
-## Component 2: Satellite Image Matching (Cross-View)
-
-| Dimension | Local Feature Matching (SIFT/SuperPoint + LightGlue) | Global Descriptor Retrieval (DinoV2/CLIP) | Template Matching (NCC) |
-|-----------|-----------------------------------------------------|------------------------------------------|------------------------|
-| Approach | Extract keypoints in both UAV and satellite images, match descriptors | Encode both images into global vectors, compare by distance | Slide UAV image over satellite tile, compute correlation |
-| Accuracy | Sub-pixel when matches found (best for fine alignment) | Tile-level (~50-200m depending on tile size) | Pixel-level but sensitive to appearance changes |
-| Speed | ~100-500ms for match+geometric verification | ~50-100ms for descriptor comparison | ~500ms-2s for large search area |
-| Robustness to viewpoint | Good with LightGlue attention | Excellent (trained for cross-view) | Poor (requires similar viewpoint) |
-| Memory | ~300-500MB (model + keypoints) | ~200-500MB (model) | Low |
-| Failure rate | High in low-texture, seasonal changes | Lower — semantic understanding | High in changed scenes |
-| **Recommended role** | Fine alignment (after coarse retrieval) | Coarse retrieval (select candidate tile) | Not recommended |
-
-## Component 3: Sensor Fusion
-
-| Dimension | EKF (Extended Kalman Filter) | Error-State EKF (ESKF) | Hybrid ESKF/UKF | Factor Graph (GTSAM) |
-|-----------|-------------------------------|------------------------|------------------|---------------------|
-| Accuracy | Baseline | Better for rotation | 49% better than ESKF | Best overall |
-| Compute cost | Lowest | Low | 48% less than full UKF | Highest |
-| Implementation complexity | Low | Medium | Medium-High | High |
-| Handles non-linearity | Linearization errors | Better for small errors | Best among KF variants | Full non-linear |
-| Real-time on Jetson | ✅ | ✅ | ✅ | ⚠️ Depends on graph size |
-| Multi-rate sensor support | Manual | Manual | Manual | Native |
-| **Fit** | ⚠️ Baseline option | ✅ Good starting point | ✅ Best KF option | ⚠️ Overkill for this system |
-
-## Component 4: Satellite Tile Management
-
-| Dimension | GeoHash + In-Memory | Quadtree + Memory-Mapped Files | Pre-extracted Feature DB |
-|-----------|--------------------|-----------------------------|------------------------|
-| Lookup speed | O(1) hash | O(log n) tree traversal | O(1) hash + feature load |
-| Memory usage | All tiles in RAM | On-demand loading | Features only (smaller) |
-| Preprocessing | Fast | Moderate | Slow (extract all features offline) |
-| Flexibility | Fixed grid | Adaptive resolution | Fixed per-tile |
-| **Fit for 8GB** | ❌ Too much RAM for large areas | ✅ Memory-efficient | ✅ Best — smallest footprint |
-
-## Component 5: Image Downsampling Strategy
-
-| Dimension | Fixed Resize (e.g., 1600x1066) | Pyramid (multi-scale) | ROI-based (center crop + full) |
-|-----------|-------------------------------|----------------------|-------------------------------|
-| Speed | Fast, single scale | Slower, multiple passes | Medium |
-| Accuracy | Good if GSD ratio maintained | Best for multi-scale features | Good for center, loses edges |
-| Memory | ~5MB per frame | ~7-8MB per frame | ~6MB per frame |
-| **Fit** | ✅ Best tradeoff | ⚠️ Unnecessary complexity | ⚠️ Loses coverage |
@@ -1,129 +0,0 @@
-# Reasoning Chain
-
-## Dimension 1: Feature Extraction for Visual Odometry
-
-### Fact Confirmation
-XFeat is 5x faster than SuperPoint (Fact #1), has TensorRT deployment on Jetson (Fact #2), and comparable accuracy for pose estimation. SatLoc (the most relevant state-of-the-art system) uses XFeat for its VO component (Fact #3).
-
-### Reference Comparison
-SuperPoint+LightGlue is more accurate for cross-view matching but heavier. ORB is fast but less accurate and not robust to appearance changes. SIFT+LightGlue is best for mosaicking (Fact #9) but slower.
-
-### Conclusion
-**XFeat for VO (frame-to-frame)** — it's the fastest learned feature, Jetson-proven, and used by the closest state-of-the-art system (SatLoc). For satellite matching, a different approach is needed because cross-view matching requires viewpoint-invariant features.
-
-### Confidence
-✅ High — supported by SatLoc architecture and CVPR 2024 benchmarks.
-
---
-
-## Dimension 2: Satellite Image Matching Strategy
-
-### Fact Confirmation
-Cross-view matching is fundamentally harder than same-view matching; deep learning embeddings (DinoV2) are state-of-the-art for coarse retrieval, while local features are better for fine alignment (Fact #14). SatLoc uses DinoV2 for satellite matching specifically (Fact #3).
-
-### Reference Comparison
-A two-stage coarse-to-fine approach is the dominant pattern in literature: (1) global descriptor retrieves candidate region, (2) local feature matching refines position. Pure local-feature matching has high failure rate for cross-view due to extreme viewpoint differences.
-
-### Conclusion
-**Two-stage approach**: (1) Coarse — use a lightweight global descriptor to find the best-matching satellite tile within the search area (VO-predicted position ± uncertainty radius). (2) Fine — use local feature matching (SuperPoint+LightGlue or XFeat) between UAV frame and the matched satellite tile to get precise position. The coarse stage can also serve as the re-localization mechanism for disconnected segments.
-
-### Confidence
-✅ High — consensus across multiple recent papers and the SatLoc system.
-
---
-
-## Dimension 3: Sensor Fusion Approach
-
-### Fact Confirmation
-Hybrid ESKF/UKF achieves 49% better accuracy than ESKF alone at 48% lower cost than full UKF (Fact #8). Factor graphs (GTSAM) offer the best accuracy but are computationally expensive.
-
-### Reference Comparison
-For our system: IMU runs at 100-400Hz, VO at ~3Hz (frame rate), satellite corrections at variable rate (whenever matching succeeds). We need multi-rate fusion that handles intermittent satellite corrections and continuous IMU.
-
-### Conclusion
-**Error-State EKF (ESKF)** as the baseline fusion approach — it's well-understood, lightweight, handles multi-rate sensors naturally, and is proven for VIO on edge hardware. Upgrade to hybrid ESKF/UKF if ESKF accuracy is insufficient. Factor graphs are overkill for this real-time edge system.
-
-The filter state comprises position (lat/lon), velocity, orientation (quaternion), and IMU biases. The measurements are VO-derived displacement (high rate), satellite-derived absolute position (variable rate), and IMU readings (highest rate, used for prediction).
-
-### Confidence
-✅ High — ESKF is the standard choice for embedded VIO systems.
-
---
-
-## Dimension 4: Satellite Tile Preprocessing & Indexing
-
-### Fact Confirmation
-Quadtree enables O(log n) lookups (Fact #15). Pre-extracting features offline saves runtime compute. 8GB memory limits in-memory tile storage.
-
-### Reference Comparison
-Full tiles in memory is infeasible for large areas. Memory-mapped files allow on-demand loading. Pre-extracted feature databases have the smallest runtime footprint.
-
-### Conclusion
-**Offline preprocessing pipeline**:
-1. Download Google Maps satellite tiles at max zoom (18-19) for the operational area
-2. Extract features (XFeat or SuperPoint) from each tile
-3. Compute global descriptors (lightweight, e.g., NetVLAD or cosine-pooled XFeat descriptors) per tile
-4. Store: tile metadata (GPS bounds, zoom level), features + descriptors in a GeoHash-indexed database
-5. Build spatial index (GeoHash) for fast lookup by GPS region
-
-**Runtime**: Given VO-estimated position, query GeoHash to find nearby tiles, compare global descriptors for coarse match, then local feature matching for fine alignment.
-
-### Confidence
-✅ High — standard approach used by all relevant systems.
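
The GeoHash lookup described above can be sketched as follows. This is a plain implementation of the standard GeoHash encoding with a dictionary as a stand-in for the feature database; the names `geohash_encode`, `add_tile` and the tile identifiers are hypothetical, and the precision of 6 characters is an arbitrary choice for illustration.

```python
from collections import defaultdict

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Standard GeoHash: interleave longitude/latitude bisection bits, 5 bits per character."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # the first bit always refines longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | int(bit)
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def add_tile(index, tile_id, lat, lon, precision=6):
    """Register a tile under the GeoHash cell containing its centre."""
    index[geohash_encode(lat, lon, precision)].append(tile_id)


index = defaultdict(list)
add_tile(index, "tile_a", 48.5001, 37.9001)  # hypothetical tile centres
add_tile(index, "tile_b", 48.5002, 37.9002)
candidates = index[geohash_encode(48.50015, 37.90015)]
```

A real lookup would also query the eight neighbouring cells, since a position near a cell boundary can fall in a different cell from the closest tile.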
-
---
-
-## Dimension 5: Image Downsampling Strategy
-
-### Fact Confirmation
-26MP images need downsampling for 8GB device (Fact #6). Feature extraction at 4K takes ~12ms on Jetson Xavier (Fact #7). UAV GSD at 400m is ~6cm/px (Fact #10). Satellite GSD is ~60cm/px at zoom 18.
-
-### Reference Comparison
-For VO (frame-to-frame): features at full resolution are wasteful — consecutive frames at 6cm GSD overlap ~80%, and features at lower resolution are sufficient for displacement estimation. For satellite matching: we need to match at satellite resolution (~60cm/px), so downsampling to match satellite GSD is natural.
-
-### Conclusion
-**Downsample to ~1600x1066** (factor ~4x each dimension). This yields ~24cm/px GSD — still 2.5x finer than satellite, sufficient for feature matching. Image size: ~5MB (RGB). Feature extraction at this resolution: <10ms. This is the single resolution for both VO and satellite matching.
-
-### Confidence
-✅ High — standard practice for edge processing of high-res imagery.
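
The downsampling numbers in this dimension can be reproduced with a short sketch. The function name `downsample_plan` is illustrative; the inputs are the source resolution, the rounded 6 cm GSD and the 1600-pixel target width quoted above, and the buffer size assumes uncompressed 8-bit RGB.

```python
def downsample_plan(src_w, src_h, src_gsd_m, dst_w, channels=3):
    """Return target height, resulting GSD and raw buffer size for a fixed-width resize."""
    scale = src_w / dst_w
    dst_h = int(src_h / scale)  # truncate to whole pixels
    return {
        "scale": scale,
        "dst_h": dst_h,
        "gsd_m": src_gsd_m * scale,
        "bytes": dst_w * dst_h * channels,  # uncompressed 8-bit image
    }


plan = downsample_plan(6252, 4168, 0.06, 1600)
print(plan)
```

With these inputs the scale factor is about 3.9, the GSD about 0.23 m/px, and the RGB buffer about 5.1 MB, consistent with the figures in the conclusion.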
-
---
-
-## Dimension 6: Disconnected Segment Handling
-
-### Fact Confirmation
-SatLoc uses satellite matching as an independent localization source that works regardless of VO state (Fact #3). The AC requires reconnecting disconnected segments as a core capability.
-
-### Reference Comparison
-Pure VO cannot handle zero-overlap transitions. IMU dead-reckoning bridges short gaps (seconds). Satellite-based re-localization provides absolute position regardless of VO state.
-
-### Conclusion
-**Independent satellite localization per frame** — every frame attempts satellite matching regardless of VO state. This naturally handles disconnected segments:
-1. When VO succeeds: satellite matching refines position (high confidence)
-2. When VO fails (sharp turn): satellite matching provides absolute position (sole source)
-3. When both fail: IMU dead-reckoning with low confidence score
-4. After 3 consecutive total failures: request user input
-
-Segment reconnection is automatic: all positions are in the same global (WGS84) frame via satellite matching. No explicit "reconnection" needed — segments share the satellite reference.
-
-### Confidence
-✅ High — this is the key architectural insight.
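
The fallback hierarchy above can be expressed as a small decision function. This is a sketch under stated assumptions: the name `select_source`, the source labels, and the MEDIUM confidence for the satellite-only case are hypothetical, and the branch where VO succeeds but satellite matching fails (VO + IMU at LOW confidence, counter reset) is inferred from the degraded-imagery fallback rather than listed in the four steps.

```python
def select_source(vo_ok, sat_ok, failures, max_failures=3):
    """Return (source, confidence, consecutive_failures, request_user_input)."""
    if sat_ok:
        # Satellite match gives an absolute fix; VO refines it when available.
        source = "satellite+vo" if vo_ok else "satellite"
        confidence = "HIGH" if vo_ok else "MEDIUM"  # MEDIUM is an assumption
        return source, confidence, 0, False
    if vo_ok:
        # Assumed branch: relative motion only, so drift accumulates.
        return "vo+imu", "LOW", 0, False
    # Total failure: IMU dead-reckoning, escalate after max_failures in a row.
    failures += 1
    return "imu", "LOW", failures, failures >= max_failures


failures = 0
history = []
for vo_ok, sat_ok in [(True, True), (False, True), (False, False), (False, False), (False, False)]:
    source, confidence, failures, ask_user = select_source(vo_ok, sat_ok, failures)
    history.append((source, confidence, ask_user))
```

In this run the last three frames are total failures, so only the final one sets the user-input flag.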
-
---
-
-## Dimension 7: Processing Pipeline Architecture
-
-### Fact Confirmation
-<5s per frame required (AC). Feature extraction ~10ms, VO matching ~20-50ms, satellite coarse retrieval ~50-100ms, satellite fine matching ~200-500ms, fusion ~1ms. Total: ~300-700ms per frame.
-
-### Conclusion
-**Pipelined parallel architecture**:
- Thread 1 (Camera): Capture frame, downsample, extract features → push to queue
- Thread 2 (VO): Match with previous frame, compute displacement → push to fusion
- Thread 3 (Satellite): Search nearby tiles, coarse retrieval, fine matching → push to fusion
- Thread 4 (Fusion): ESKF prediction (IMU), update (VO), update (satellite) → emit result via SSE
-
-VO and satellite matching can run in parallel for each frame. Fusion integrates results as they arrive. This enables <1s per frame total latency.
-
-### Confidence
-✅ High — standard producer-consumer pipeline.
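
The four-thread layout can be sketched with the standard library alone. Everything inside the stages is a placeholder: no real feature extraction, matching, or ESKF update is performed, and the function name `run_pipeline` and the tuple-shaped measurements are hypothetical. The sketch only illustrates the producer-consumer wiring, with VO and satellite matching consuming the same features in parallel and fusion integrating measurements as they arrive.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of a stream


def run_pipeline(frames):
    vo_q, sat_q, fusion_q = queue.Queue(), queue.Queue(), queue.Queue()
    results = []

    def camera():
        for idx, frame in enumerate(frames):
            features = ("features", frame)  # stand-in for downsample + extraction
            vo_q.put((idx, features))
            sat_q.put((idx, features))
        vo_q.put(STOP)
        sat_q.put(STOP)

    def matcher(in_q, kind):
        while True:
            item = in_q.get()
            if item is STOP:
                fusion_q.put(STOP)
                return
            idx, _features = item
            fusion_q.put((idx, kind))  # stand-in for a real measurement

    def fusion():
        finished = 0
        while finished < 2:  # wait until both matcher threads have stopped
            item = fusion_q.get()
            if item is STOP:
                finished += 1
            else:
                results.append(item)  # stand-in for a filter update + SSE emit

    threads = [
        threading.Thread(target=camera),
        threading.Thread(target=matcher, args=(vo_q, "vo")),
        threading.Thread(target=matcher, args=(sat_q, "satellite")),
        threading.Thread(target=fusion),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


measurements = run_pipeline(["frame0", "frame1", "frame2"])
```

The order of `measurements` depends on thread scheduling, which mirrors the design intent: the fusion stage must accept VO and satellite updates in whichever order they complete.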
@@ -1,98 +0,0 @@
-# Validation Log
-
-## Validation Scenario 1: Normal Flight (80% of time)
-UAV flies straight, consecutive frames overlap ~70-80%. Terrain has moderate texture (agricultural + urban mix).
-
-### Expected Based on Conclusions
- XFeat extracts features in ~5ms, VO matching in ~20ms
- Satellite matching succeeds: coarse retrieval ~50ms, fine matching ~300ms
- ESKF fuses both: position accuracy ~10-20m (satellite-anchored)
- Total processing: <500ms per frame
- Confidence: HIGH
-
-### Actual Validation (against literature)
-SatLoc reports <15m error with >90% coverage under similar conditions. Mateos-Ramirez reports 0.83% drift with satellite correction. Both align with our expected performance.
-
-### Result: ✅ PASS
-
---
-
-## Validation Scenario 2: Sharp Turn (5-10% of time)
-UAV makes a 60-degree turn. Next frame has <5% overlap with previous. Heading changes rapidly.
-
-### Expected Based on Conclusions
- VO fails (insufficient feature overlap) — detected by low match count
- IMU provides heading and approximate displacement for ~1-2 frames
- Satellite matching attempts independent localization of the new frame
- If satellite match succeeds: position recovered, segment continues
- If satellite match fails: IMU dead-reckoning with LOW confidence
-
-### Potential Issues
- Satellite matching may also fail if the frame is heavily tilted (non-nadir view during turn)
- IMU drift during turn: at 100m/s for 1s, displacement ~100m. IMU drift over 1s: ~1-5m — acceptable
-
-### Result: ⚠️ CONDITIONAL PASS — depends on satellite matching success during turn. Non-stabilized camera may produce tilted images that are harder to match. IMU provides reasonable bridge.
-
---
-
-## Validation Scenario 3: Disconnected Route (rare, <5%)
-UAV completes segment A, makes a 90+ degree turn, flies a new heading. Segment B has no overlap with segment A. Multiple such segments possible.
-
-### Expected Based on Conclusions
- Each segment independently localizes via satellite matching
- No explicit reconnection needed — all in WGS84 frame
- Per-segment accuracy depends on satellite matching success rate
- Low-confidence gaps between segments until satellite match succeeds
-
-### Result: ✅ PASS — architecture handles this natively via independent per-frame satellite matching.
-
---
-
-## Validation Scenario 4: Memory-Constrained Operation (always)
-3000 frames, 8GB shared memory. Full pipeline running.
-
-### Expected Based on Conclusions
- Downsampled frame: ~5MB per frame. Keep 2 in memory (current + previous): ~10MB
- XFeat model (TensorRT): ~50-100MB
- Satellite tile features (loaded tiles): ~200-500MB for tiles near current position
- ESKF state: <1MB
- OS + runtime: ~1.5GB
- Total: ~2GB active from the items above (up to ~3GB with headroom), well within 8GB
-
-### Potential Issues
- Satellite feature DB for large operational areas could be large on disk (not memory — loaded on demand)
- Need careful management of tile loading/unloading
-
-### Result: ✅ PASS — 8GB is sufficient with proper memory management.
-
---
-
-## Validation Scenario 5: Degraded Satellite Imagery
-Google Maps tiles at 0.5-1.0 m/px resolution. Some areas have outdated imagery. Seasonal appearance changes.
-
-### Expected Based on Conclusions
- Coarse retrieval (global descriptors) should handle moderate appearance changes
- Fine matching may fail on outdated/seasonal tiles — confidence drops to LOW
- System falls back to VO + IMU in degraded areas
- Multiple consecutive failures → user input request
-
-### Potential Issues
- If large areas have degraded satellite imagery, the system may operate mostly in VO+IMU mode with significant drift
- 50m accuracy target may not be achievable in these areas
-
-### Result: ⚠️ CONDITIONAL PASS — system degrades gracefully, but accuracy targets depend on satellite quality. This is a known risk per Phase 1 assessment.
-
---
-
-## Review Checklist
- [x] Draft conclusions consistent with fact cards
- [x] No important dimensions missed
- [x] No over-extrapolation
- [x] Conclusions actionable/verifiable
- [x] Sharp turn handling addressed
- [x] Memory constraints validated
- [ ] Issue: Satellite imagery quality in eastern Ukraine remains a risk
- [ ] Issue: Non-stabilized camera during turns may degrade satellite matching
-
-## Conclusions Requiring No Revision
-All major architectural decisions validated. Two known risks (satellite quality, non-stabilized camera during turns) are acknowledged and handled by the fallback hierarchy.
@@ -1,80 +0,0 @@
-# Question Decomposition — Solution Assessment (Mode B)
-
-## Original Question
-Assess the existing solution draft (solution_draft01.md) for weak points, security vulnerabilities, and performance bottlenecks, then produce a revised solution draft.
-
-## Active Mode
-Mode B: Solution Assessment — `solution_draft01.md` exists and is the highest-numbered draft.
-
-## Question Type Classification
- **Primary**: Problem Diagnosis — identify weak points, vulnerabilities, bottlenecks in existing solution
- **Secondary**: Decision Support — evaluate alternatives for identified issues
-
-## Research Subject Boundary Definition
-
-| Dimension | Boundary |
-|-----------|----------|
-| **Domain** | GPS-denied UAV visual navigation, aerial geo-referencing |
-| **Geography** | Eastern/southern Ukraine (left of Dnipro River) — steppe terrain, potential conflict-related satellite imagery degradation |
-| **Hardware** | Desktop/laptop with NVIDIA RTX 2060+, 16GB RAM, 6GB VRAM |
-| **Software** | Python ecosystem, GPU-accelerated CV/ML |
-| **Timeframe** | Current state-of-the-art (2024-2026), production-ready tools |
-| **Scale** | 500-3000 images per flight, up to 6252×4168 resolution |
-
-## Problem Context Summary
- UAV aerial photos taken consecutively ~100m apart, camera pointing down (not autostabilized)
- Only starting GPS known — must determine GPS for all subsequent images
- Must handle: sharp turns, outlier photos (up to 350m gap), disconnected route segments
- Processing <5s/image, real-time SSE streaming, REST API service
- No IMU data available
-
-## Decomposed Sub-Questions
-
-### A: Cross-View Matching Viability
-"Is SuperPoint+LightGlue with perspective warping reliable for UAV-to-satellite cross-view matching, or are there specialized cross-view methods that would perform better?"
-
-### B: Homography-Based VO Robustness
-"Is homography-based VO (flat terrain assumption) robust enough for non-stabilized camera with potential roll/pitch variations and non-flat objects?"
-
-### C: Satellite Imagery Reliability
-"What are the risks of relying solely on Google Maps satellite imagery for eastern Ukraine, and what fallback strategies exist?"
-
-### D: Processing Time Feasibility
-"Are the processing time estimates (<5s per image) realistic on RTX 2060 with SuperPoint+LightGlue+satellite matching pipeline?"
-
-### E: Optimizer Specification
-"Is the sliding window optimizer well-specified, and are there more proven alternatives like factor graph optimization?"
-
-### F: Camera Rotation Handling
-"How should the system handle arbitrary image rotation from non-stabilized camera mount?"
-
-### G: Security Assessment
-"What are the security vulnerabilities in the REST API + SSE architecture with image processing pipeline?"
-
-### H: Newer Tools & Libraries
-"Are there newer (2025-2026) tools, models, or approaches that outperform the current selections (SuperPoint, LightGlue, etc.)?"
-
-### I: Segment Management Robustness
-"Is the segment management strategy robust enough for multiple disconnected segments, especially when satellite anchoring fails for a segment?"
-
-### J: Memory & Resource Management
-"Can the pipeline stay within 16GB RAM / 6GB VRAM while processing 3000 images at 6252×4168 resolution?"
-
---
-
-## Timeliness Sensitivity Assessment
-
- **Research Topic**: GPS-denied UAV visual navigation using learned feature matching and satellite geo-referencing
- **Sensitivity Level**: 🟠 High
- **Rationale**: Computer vision feature matching models (SuperPoint, LightGlue, etc.) are actively evolving with new versions and competitors. However, the core algorithms (homography, VO, optimization) are stable. The tool ecosystem changes frequently.
- **Source Time Window**: 12 months (2025-2026)
- **Priority official sources to consult**:
-  1. LightGlue / SuperPoint GitHub repos (releases, issues)
-  2. OpenCV documentation (current version)
-  3. Google Maps Tiles API documentation
-  4. Recent aerial geo-referencing papers (2024-2026)
- **Key version information to verify**:
-  - LightGlue: current version and ONNX/TensorRT support status
-  - SuperPoint: current version and alternatives
-  - FastAPI: SSE support status
-  - Google Maps Tiles API: pricing, coverage, rate limits
@@ -1,201 +0,0 @@
-# Source Registry — Solution Assessment (Mode B)
-
-## Source #1
- **Title**: GLEAM: Learning to Match and Explain in Cross-View Geo-Localization
- **Link**: https://arxiv.org/abs/2509.07450
- **Tier**: L1
- **Publication Date**: 2025-09
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Cross-view geo-localization researchers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Framework for cross-view geo-localization with explainable matching across modalities. Demonstrates that specialized cross-view methods outperform generic feature matchers.
-
-## Source #2
- **Title**: Robust UAV Image Mosaicking Using SIFT and LightGlue (ISPRS 2025)
- **Link**: https://isprs-archives.copernicus.org/articles/XLVIII-2-W11-2025/169/2025/
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV photogrammetry and aerial image processing
- **Research Boundary Match**: ✅ Full match
- **Summary**: SIFT+LightGlue achieves superior spatial consistency and reliability for UAV image mosaicking, including low-texture and high-rotation conditions. SIFT outperforms SuperPoint for rotation-heavy scenarios.
-
-## Source #3
- **Title**: Precise GPS-Denied UAV Self-Positioning via Context-Enhanced Cross-View Geo-Localization (CEUSP)
- **Link**: https://arxiv.org/abs/2502.11408 / https://github.com/eksnew/ceusp
- **Tier**: L1
- **Publication Date**: 2025-02
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: GPS-denied UAV navigation
- **Research Boundary Match**: ⚠️ Partial overlap (urban, not steppe)
- **Summary**: DINOv2-based cross-view matching for UAV self-positioning. State-of-the-art on DenseUAV benchmark. Uses retrieval-based (not feature-matching) approach.
-
-## Source #4
- **Title**: SatLoc Dataset and Hierarchical Adaptive Fusion Framework
- **Link**: https://www.mdpi.com/2072-4292/17/17/3048
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: GNSS-denied UAV navigation
- **Research Boundary Match**: ✅ Full match
- **Summary**: Three-layer architecture: DINOv2 for absolute geo-localization, XFeat for VO, optical flow for velocity. Adaptive fusion with confidence weighting. <15m absolute error on edge hardware.
-
-## Source #5
- **Title**: LightGlue ONNX/TensorRT acceleration blog
- **Link**: https://fabio-sim.github.io/blog/accelerating-lightglue-inference-onnx-runtime-tensorrt/
- **Tier**: L2
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: LightGlue users optimizing inference
- **Research Boundary Match**: ✅ Full match
- **Summary**: LightGlue ONNX achieves 2-4x speedup over PyTorch. FP8 quantization (Ada/Hopper GPUs only) adds 6x more. RTX 2060 does NOT support FP8.
-
-## Source #6
- **Title**: LightGlue-ONNX GitHub repository
- **Link**: https://github.com/fabio-sim/LightGlue-ONNX
- **Tier**: L2
- **Publication Date**: 2024-2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: LightGlue deployment engineers
- **Research Boundary Match**: ✅ Full match
- **Summary**: ONNX export for LightGlue with FlashAttention-2 support. TopK-trick for ~30% speedup. Pre-exported models available.
-
-## Source #7
- **Title**: LightGlue GitHub Issue #64 — Rotation sensitivity
- **Link**: https://github.com/cvg/LightGlue/issues/64
- **Tier**: L4
- **Publication Date**: 2023-2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: LightGlue users
- **Research Boundary Match**: ✅ Full match
- **Summary**: LightGlue (with SuperPoint/DISK) is NOT rotation-invariant. 90° or 180° rotation causes matching failure. Manual rectification needed.
-
-## Source #8
- **Title**: LightGlue GitHub Issue #13 — No-match handling
- **Link**: https://github.com/cvg/LightGlue/issues/13
- **Tier**: L4
- **Publication Date**: 2023
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: LightGlue users
- **Research Boundary Match**: ✅ Full match
- **Summary**: LightGlue lacks explicit training on unmatchable pairs. May produce geometrically meaningless matches instead of rejecting non-overlapping views.
-
-## Source #9
- **Title**: YFS90/GNSS-Denied-UAV-Geolocalization GitHub
- **Link**: https://github.com/yfs90/gnss-denied-uav-geolocalization
- **Tier**: L1
- **Publication Date**: 2024-2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: GPS-denied UAV navigation
- **Research Boundary Match**: ✅ Full match
- **Summary**: <7m MAE using terrain-weighted constraint optimization + 2D-3D geo-registration. Uses DEM data. Validated across 20 complex scenarios. Works with publicly available satellite maps.
-
-## Source #10
- **Title**: Efficient image matching for UAV visual navigation via DALGlue (Scientific Reports 2025)
- **Link**: https://www.nature.com/articles/s41598-025-21602-5
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV visual navigation
- **Research Boundary Match**: ✅ Full match
- **Summary**: 11.8% MMA improvement over LightGlue. Uses dual-tree complex wavelet transform + adaptive spatial feature fusion + linear attention. Designed for UAV dynamic flight.
-
-## Source #11
- **Title**: XFeat: Accelerated Features for Lightweight Image Matching (CVPR 2024)
- **Link**: https://arxiv.org/html/2404.19174v1 / https://github.com/verlab/accelerated_features
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Real-time feature matching applications
- **Research Boundary Match**: ✅ Full match
- **Summary**: 5x faster than SuperPoint. Runs real-time on CPU. Sparse + semi-dense matching. Used by SatLoc-Fusion for VO. 1500+ GitHub stars.
-
-## Source #12
- **Title**: An Oblique-Robust Absolute Visual Localization Method (IEEE TGRS 2024)
- **Link**: https://ieeexplore.ieee.org/iel7/36/10354519/10356107.pdf
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: GPS-denied UAV localization
- **Research Boundary Match**: ✅ Full match
- **Summary**: SE(2)-steerable network for rotation-equivariant features. Handles drastic perspective changes, non-perpendicular camera angles. No additional training for new scenes.
-
-## Source #13
- **Title**: Google Maps Tiles API Usage and Billing
- **Link**: https://developers.google.com/maps/documentation/tile/usage-and-billing
- **Tier**: L1
- **Publication Date**: 2025-2026 (continuously updated)
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Google Maps API users
- **Research Boundary Match**: ✅ Full match
- **Summary**: 100,000 free tile requests/month. Rate limit: 6,000/min, 15,000/day for 2D tiles. $200/month free credit expired Feb 2025. Now pay-as-you-go only.
-
-## Source #14
- **Title**: GTSAM Python API and Factor Graph examples
- **Link**: https://github.com/borglab/gtsam / https://pypi.org/project/gtsam-develop/
- **Tier**: L1
- **Publication Date**: 2025-2026 (v4.2 stable, v4.3a1 dev)
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Robot navigation, SLAM
- **Research Boundary Match**: ✅ Full match
- **Summary**: Python bindings for factor graph optimization. GPSFactor for absolute position constraints. iSAM2 for incremental optimization. Stable v4.2 for production use.
-
-## Source #15
- **Title**: Copernicus DEM documentation
- **Link**: https://documentation.dataspace.copernicus.eu/APIs/SentinelHub/Data/DEM.html
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: DEM data users
- **Research Boundary Match**: ✅ Full match
- **Summary**: Free 30m DEM (GLO-30) covering Ukraine. API access via Sentinel Hub Process API. Registration required.
-
-## Source #16
- **Title**: Homography Decomposition Revisited (IJCV 2025)
- **Link**: https://link.springer.com/article/10.1007/s11263-025-02680-4
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Computer vision researchers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Existing homography decomposition methods can be unstable in certain configurations. Proposes hybrid framework for improved stability.
-
-## Source #17
- **Title**: Sliding window factor graph optimization for visual/inertial navigation (Cambridge 2020)
- **Link**: https://www.cambridge.org/core/services/aop-cambridge-core/content/view/523C7C41D18A8D7C159C59235DF502D0/
- **Tier**: L1
- **Publication Date**: 2020
- **Timeliness Status**: ✅ Currently valid (foundational method)
- **Target Audience**: Navigation system designers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Sliding-window factor graph optimization combines accuracy of graph optimization with efficiency of windowed approach. Superior to separate filtering or full batch optimization.
-
-## Source #18
- **Title**: SuperPoint feature extraction and matching benchmarks
- **Link**: https://preview-www.nature.com/articles/s41598-024-59626-y/tables/3
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Feature matching benchmarking
- **Research Boundary Match**: ✅ Full match
- **Summary**: SuperPoint+LightGlue: ~0.36±0.06s per image pair for extraction+matching on GPU. Competitive accuracy for satellite stereo scenarios.
-
-## Source #19
- **Title**: DINOv2-Based UAV Visual Self-Localization in Low-Altitude Urban Environments
- **Link**: https://ui.adsabs.harvard.edu/abs/2025IRAL...10.2080Y/
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV visual localization researchers
- **Research Boundary Match**: ⚠️ Partial overlap (urban, not steppe)
- **Summary**: DINOv2-based method achieves 86.27 R@1 on DenseUAV benchmark for cross-view matching. Integrates global-local feature enhancement.
-
-## Source #20
- **Title**: Mapbox Satellite Tiles and Pricing
- **Link**: https://docs.mapbox.com/data/tilesets/reference/mapbox-satellite/ / https://mapbox.com/pricing
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Map tile consumers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Mapbox offers satellite tiles up to 0.3m resolution (zoom 16+). 200,000 free vector tile requests/month. Unlimited offline downloads on pay-as-you-go. Multi-provider imagery (Maxar, Landsat, Sentinel).
@@ -1,161 +0,0 @@
-# Fact Cards — Solution Assessment (Mode B)
-
-## Fact #1
- **Statement**: LightGlue (with SuperPoint/DISK descriptors) is NOT rotation-invariant. Image pairs with 90° or 180° rotation produce very few or zero matches. Manual image rectification is required before matching.
- **Source**: Source #7 (LightGlue GitHub Issue #64)
- **Phase**: Assessment
- **Target Audience**: UAV systems with non-stabilized cameras
- **Confidence**: ✅ High (confirmed by LightGlue maintainers)
- **Related Dimension**: Cross-view matching robustness, camera rotation handling
-
-## Fact #2
- **Statement**: LightGlue lacks explicit training on unmatchable image pairs. When given non-overlapping views (e.g., after sharp turn), it may return semantically correct but geometrically meaningless matches instead of correctly rejecting the pair.
- **Source**: Source #8 (LightGlue GitHub Issue #13)
- **Phase**: Assessment
- **Target Audience**: Systems requiring segment detection (VO failure detection)
- **Confidence**: ✅ High (confirmed by LightGlue maintainers)
- **Related Dimension**: Segment management, VO failure detection
-
-## Fact #3
- **Statement**: SatLoc-Fusion achieves <15m absolute localization error using a three-layer hierarchical approach: DINOv2 for coarse absolute geo-localization, XFeat for high-frequency VO, optical flow for velocity estimation. Runs real-time on 6 TFLOPS edge hardware.
- **Source**: Source #4 (SatLoc-Fusion, Remote Sensing 2025)
- **Phase**: Assessment
- **Target Audience**: GPS-denied UAV systems
- **Confidence**: ✅ High (peer-reviewed, with dataset)
- **Related Dimension**: Architecture, localization accuracy, hierarchical matching
-
-## Fact #4
- **Statement**: XFeat is 5x faster than SuperPoint with comparable accuracy. Runs real-time on CPU. Supports both sparse and semi-dense matching. 1500+ GitHub stars, actively maintained.
- **Source**: Source #11 (CVPR 2024)
- **Phase**: Assessment
- **Target Audience**: Real-time feature extraction
- **Confidence**: ✅ High (peer-reviewed, CVPR 2024)
- **Related Dimension**: Processing speed, feature extraction
-
-## Fact #5
- **Statement**: SIFT+LightGlue achieves superior spatial consistency and reliability for UAV image mosaicking, including in low-texture and high-rotation conditions. SIFT is rotation-invariant unlike SuperPoint.
- **Source**: Source #2 (ISPRS 2025)
- **Phase**: Assessment
- **Target Audience**: UAV image matching
- **Confidence**: ✅ High (peer-reviewed)
- **Related Dimension**: Feature extraction, rotation handling
-
-## Fact #6
- **Statement**: SuperPoint+LightGlue extraction+matching takes ~0.36±0.06s per image pair on GPU (unspecified GPU model). This is for standard resolution images, not 6000+ pixel width.
- **Source**: Source #18
- **Phase**: Assessment
- **Target Audience**: Performance planning
- **Confidence**: ⚠️ Medium (GPU model not specified, may not be RTX 2060)
- **Related Dimension**: Processing time
-
-## Fact #7
- **Statement**: LightGlue ONNX/TensorRT achieves 2-4x speedup over compiled PyTorch. FP8 quantization adds 6x more but requires Ada Lovelace or newer GPUs. RTX 2060 (Turing) does NOT support FP8 — limited to FP16/INT8 acceleration.
- **Source**: Source #5, #6 (LightGlue-ONNX blog and repo)
- **Phase**: Assessment
- **Target Audience**: RTX 2060 deployment
- **Confidence**: ✅ High (benchmarked by repo maintainer)
- **Related Dimension**: Processing time, hardware constraints
-
-## Fact #8
- **Statement**: YFS90 achieves <7m MAE using terrain-weighted constraint optimization + 2D-3D geo-registration with DEM data. Validated across 20 complex scenarios including plains, hilly terrain, urban/rural. Works with publicly available satellite maps and DEM data. Re-localization capability after failures.
- **Source**: Source #9 (YFS90 GitHub)
- **Phase**: Assessment
- **Target Audience**: GPS-denied UAV navigation
- **Confidence**: ✅ High (peer-reviewed, open source, 69★)
- **Related Dimension**: Optimization approach, DEM integration, accuracy
-
-## Fact #9
- **Statement**: Google Maps $200/month free credit expired February 28, 2025. Current free tier is 100,000 tile requests/month. Rate limits: 6,000 requests/min, 15,000 requests/day for 2D tiles.
- **Source**: Source #13 (Google Maps official docs)
- **Phase**: Assessment
- **Target Audience**: Cost planning
- **Confidence**: ✅ High (official documentation)
- **Related Dimension**: Cost, satellite imagery access
-
-## Fact #10
- **Statement**: Google Maps satellite imagery for eastern Ukraine is likely updated only every 3-5+ years due to: conflict zone (lower priority), geopolitical challenges, limited user demand. This may not meet the AC requirement of "less than 2 years old."
- **Source**: Multiple web sources on Google Maps update frequency
- **Phase**: Assessment
- **Target Audience**: Satellite imagery reliability
- **Confidence**: ⚠️ Medium (general guidelines, not Ukraine-specific confirmation)
- **Related Dimension**: Satellite imagery reliability
-
-## Fact #11
- **Statement**: Mapbox Satellite offers imagery up to 0.3m resolution at zoom 16+, sourced from Maxar, Landsat, Sentinel. 200,000 free vector tile requests/month. Unlimited offline downloads on pay-as-you-go. Potentially more diverse and recent imagery for Ukraine than Google Maps alone.
- **Source**: Source #20 (Mapbox docs)
- **Phase**: Assessment
- **Target Audience**: Alternative satellite providers
- **Confidence**: ✅ High (official documentation)
- **Related Dimension**: Satellite imagery reliability, cost
-
-## Fact #12
- **Statement**: Copernicus DEM GLO-30 provides free 30m resolution global elevation data including Ukraine. Accessible via Sentinel Hub API. Can be used for terrain-weighted optimization like YFS90.
- **Source**: Source #15 (Copernicus docs)
- **Phase**: Assessment
- **Target Audience**: DEM integration
- **Confidence**: ✅ High (official documentation)
- **Related Dimension**: Position optimizer, terrain constraints
-
-## Fact #13
- **Statement**: GTSAM v4.2 (stable) provides Python bindings with GPSFactor for absolute position constraints and iSAM2 for incremental optimization. Can model VO constraints, satellite anchor constraints, and drift limits in a unified factor graph.
- **Source**: Source #14 (GTSAM docs)
- **Phase**: Assessment
- **Target Audience**: Optimizer design
- **Confidence**: ✅ High (widely used in robotics)
- **Related Dimension**: Position optimizer
-
-## Fact #14
- **Statement**: DALGlue achieves 11.8% MMA improvement over LightGlue on MegaDepth benchmark. Specifically designed for UAV visual navigation with wavelet transform preprocessing for handling dynamic flight blur.
- **Source**: Source #10 (Scientific Reports 2025)
- **Phase**: Assessment
- **Target Audience**: Feature matching selection
- **Confidence**: ✅ High (peer-reviewed)
- **Related Dimension**: Feature matching
-
-## Fact #15
- **Statement**: The oblique-robust AVL method (IEEE TGRS 2024) uses SE(2)-steerable networks for rotation-equivariant features. Handles drastic perspective changes and non-perpendicular camera angles for UAV-to-satellite matching. No retraining needed for new scenes.
- **Source**: Source #12 (IEEE TGRS 2024)
- **Phase**: Assessment
- **Target Audience**: Cross-view matching
- **Confidence**: ✅ High (peer-reviewed, IEEE)
- **Related Dimension**: Cross-view matching, rotation handling
-
-## Fact #16
- **Statement**: Homography decomposition can be unstable in certain configurations (2025 IJCV study). Non-planar objects (buildings, trees) violate planar assumption. For aerial images, dominant ground plane exists but RANSAC inlier ratio drops with non-planar content.
- **Source**: Source #16 (IJCV 2025)
- **Phase**: Assessment
- **Target Audience**: VO design
- **Confidence**: ✅ High (peer-reviewed)
- **Related Dimension**: VO robustness
-
-## Fact #17
- **Statement**: Sliding-window factor graph optimization combines the accuracy of full graph optimization with the efficiency of windowed processing. Superior to either pure filtering or full batch optimization for real-time navigation.
- **Source**: Source #17 (Cambridge 2020)
- **Phase**: Assessment
- **Target Audience**: Optimizer design
- **Confidence**: ✅ High (peer-reviewed)
- **Related Dimension**: Position optimizer
-
-## Fact #18
 - **Statement**: SuperPoint is a fully convolutional model, so GPU memory scales approximately linearly with the number of input pixels. A 6252×4168 input would require significant VRAM. Standard practice is to downscale to a 1024-2048 px long edge for feature extraction.
- **Source**: Source #18, SuperPoint docs
- **Phase**: Assessment
- **Target Audience**: Memory management
- **Confidence**: ✅ High (architectural fact)
- **Related Dimension**: Memory management, processing pipeline
-
-## Fact #19
- **Statement**: For GPS-denied UAV localization, hierarchical coarse-to-fine approaches (image retrieval → local feature matching) are state-of-the-art. Direct local feature matching alone fails when the search area is too large or viewpoint difference is too high.
- **Source**: Source #3, #4, #12 (CEUSP, SatLoc, Oblique-robust AVL)
- **Phase**: Assessment
- **Target Audience**: Architecture design
- **Confidence**: ✅ High (consensus across multiple papers)
- **Related Dimension**: Architecture, satellite matching
-
-## Fact #20
 - **Statement**: The Google Maps Tiles API daily rate limit of 15,000 requests could be reached when processing a 3000-image flight that needs ~2000 satellite tiles plus expansion tiles. Mitigations: pre-cache tiles for known areas, or spread downloads across multiple days while staying under the per-minute limit (6,000/min).
- **Source**: Source #13 (Google Maps docs)
- **Phase**: Assessment
- **Target Audience**: System design
- **Confidence**: ✅ High (official rate limits)
- **Related Dimension**: Satellite tile management, rate limiting
@@ -1,79 +0,0 @@
-# Comparison Framework — Solution Assessment (Mode B)
-
-## Selected Framework Type
-Problem Diagnosis + Decision Support
-
-## Identified Weak Points and Assessment Dimensions
-
-### Dimension 1: Cross-View Matching Strategy (UAV→Satellite)
-
-| Aspect | Draft01 Approach | Identified Problem | Alternative | Factual Basis |
-|--------|-----------------|-------------------|------------|---------------|
-| Strategy | Direct SuperPoint+LightGlue matching with perspective warping | No coarse localization stage. Fails when VO drift is large. LightGlue not rotation-invariant. | Hierarchical: DINOv2/global retrieval → SuperPoint+LightGlue refinement | Fact #1, #2, #15, #19 |
-| Rotation handling | Not addressed | Non-stabilized camera = rotated images. SuperPoint/LightGlue fail at 90°/180° | Image rectification via VO-estimated heading, or rotation-invariant features (SIFT for fallback) | Fact #1, #5 |
-| Domain gap | Perspective warping only | Insufficient for seasonal/illumination/resolution differences | Multi-scale matching, DINOv2 for semantic retrieval, warping + matched features | Fact #3, #15 |
-
-### Dimension 2: Feature Extraction & Matching
-
-| Aspect | Draft01 Approach | Identified Problem | Alternative | Factual Basis |
-|--------|-----------------|-------------------|------------|---------------|
-| VO features | SuperPoint (~80ms) | Adequate but not optimized for speed | XFeat (5x faster, CPU-capable) for VO; keep SuperPoint for satellite matching | Fact #4 |
-| Matching | LightGlue | Good baseline. DALGlue 11.8% better MMA. | LightGlue with ONNX optimization as primary. DALGlue for evaluation. | Fact #7, #14 |
-| Non-match detection | Not addressed | LightGlue returns false matches on non-overlapping pairs | Inlier ratio + match count threshold + geometric consistency check | Fact #2 |
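
The non-match gate in the last row (inlier ratio + match count threshold) can be sketched as below. This is a minimal illustration, not the draft's implementation: `MatchGate` and all threshold values are hypothetical, and the inlier count is assumed to come from a RANSAC geometric check run on the matcher output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchGate:
    """Hypothetical thresholds; they would need tuning on real flight data."""

    min_matches: int = 50
    min_inliers: int = 25
    min_inlier_ratio: float = 0.4

    def accepts(self, num_matches: int, num_inliers: int) -> bool:
        """Return True only if the pair looks like a genuine overlap."""
        if num_matches <= 0:
            return False  # nothing to evaluate, and avoids division by zero
        if num_matches < self.min_matches or num_inliers < self.min_inliers:
            return False
        return num_inliers / num_matches >= self.min_inlier_ratio


gate = MatchGate()
overlapping = gate.accepts(num_matches=400, num_inliers=260)      # ratio 0.65
after_sharp_turn = gate.accepts(num_matches=120, num_inliers=18)  # ratio 0.15
```

A rejected pair would be treated as a VO failure and start a new segment rather than feeding a meaningless transform to the optimizer.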
-
-### Dimension 3: Visual Odometry Robustness
-
-| Aspect | Draft01 Approach | Identified Problem | Alternative | Factual Basis |
-|--------|-----------------|-------------------|------------|---------------|
-| Geometric model | Homography (planar assumption) | Unstable for non-planar objects. Decomposition instability in certain configs. | Homography with RANSAC + high inlier ratio requirement. Essential matrix as fallback. | Fact #16 |
-| Scale estimation | GSD from altitude | Valid if altitude is constant. Terrain elevation changes not accounted for. | Integrate Copernicus DEM for terrain-corrected GSD | Fact #12 |
-| Camera rotation | Not addressed | Non-stabilized camera introduces roll/pitch | Estimate rotation from VO, apply rectification before satellite matching | Fact #1, #5 |
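
The terrain-corrected GSD idea from the scale-estimation row can be illustrated with the standard pinhole relation GSD = (height above ground × sensor width) / (focal length × image width). The sketch below assumes a nadir view; the function name and the camera parameters in the usage lines are placeholders, and the terrain elevation is assumed to come from a DEM lookup that is not shown.

```python
def ground_sample_distance(
    altitude_msl_m: float,
    terrain_elevation_m: float,
    focal_length_mm: float,
    sensor_width_mm: float,
    image_width_px: int,
) -> float:
    """Return metres per pixel for a nadir-looking camera.

    The height above ground is the flight altitude (above mean sea level)
    minus the terrain elevation under the aircraft.
    """
    height_agl_m = altitude_msl_m - terrain_elevation_m
    if height_agl_m <= 0:
        raise ValueError("aircraft must be above the terrain")
    return (height_agl_m * sensor_width_mm) / (focal_length_mm * image_width_px)


# Placeholder camera: 23.5 mm sensor width, 25 mm lens, 6252 px image width.
flat_assumption = ground_sample_distance(600.0, 100.0, 25.0, 23.5, 6252)
over_higher_ground = ground_sample_distance(600.0, 150.0, 25.0, 23.5, 6252)
```

With these placeholder numbers, 50 m of unmodelled terrain rise changes the GSD by 10%, which translates directly into a 10% scale error in each VO step.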
-
-### Dimension 4: Position Optimizer
-
-| Aspect | Draft01 Approach | Identified Problem | Alternative | Factual Basis |
-|--------|-----------------|-------------------|------------|---------------|
-| Algorithm | scipy.optimize sliding window | Generic optimizer, no proper uncertainty modeling, no factor types | GTSAM factor graph with iSAM2 incremental optimization | Fact #13, #17 |
-| Terrain constraints | Not used | YFS90 achieves <7m with terrain weighting | Integrate DEM-based terrain constraints via Copernicus DEM | Fact #8, #12 |
-| Drift modeling | Max 100m between anchors | Single hard constraint, no probabilistic modeling | Per-VO-step uncertainty based on inlier ratio, propagated through factor graph | Fact #17 |
-
-### Dimension 5: Satellite Imagery Reliability
-
-| Aspect | Draft01 Approach | Identified Problem | Alternative | Factual Basis |
-|--------|-----------------|-------------------|------------|---------------|
-| Provider | Google Maps only | Eastern Ukraine: 3-5 year update cycle. $200 credit expired. 15K/day rate limit. | Multi-provider: Google Maps primary + Mapbox fallback + pre-cached tiles | Fact #9, #10, #11, #20 |
-| Freshness | Assumed adequate | May not meet AC "< 2 years old" for conflict zone | Provider selection per-area. User can provide custom imagery. | Fact #10 |
-| Rate limiting | Not addressed | 15,000/day cap could block large flights | Progressive download with request budgeting. Pre-cache for known areas. | Fact #20 |
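
Request budgeting against the daily cap can be sketched as a simple batching plan. `plan_tile_downloads` is a hypothetical helper; the 15,000/day default is the figure quoted in Fact #9, and the sketch ignores the per-minute limit and retries.

```python
from typing import List


def plan_tile_downloads(
    tiles_needed: int, daily_cap: int = 15_000, cached: int = 0
) -> List[int]:
    """Split the tiles still to be fetched into per-day batches."""
    if daily_cap <= 0:
        raise ValueError("daily_cap must be positive")
    remaining = max(tiles_needed - cached, 0)
    batches: List[int] = []
    while remaining > 0:
        batch = min(remaining, daily_cap)
        batches.append(batch)
        remaining -= batch
    return batches


single_flight = plan_tile_downloads(2_000)
large_area = plan_tile_downloads(32_000)
already_cached = plan_tile_downloads(2_000, cached=2_000)
```

A plan with more than one batch signals that the area should be pre-cached before the flight is processed.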
-
-### Dimension 6: Processing Time Budget
-
-| Aspect | Draft01 Approach | Identified Problem | Alternative | Factual Basis |
-|--------|-----------------|-------------------|------------|---------------|
-| Target | <5s (claim <2s) | Per-frame pipeline: VO match + satellite match + optimization. Total could exceed budget. | XFeat for VO (~20ms). LightGlue ONNX for satellite (~100ms). Async satellite matching. | Fact #4, #6, #7 |
-| Image downscaling | Not specified | 6252×4168 cannot be processed at full resolution | Downscale to 1600 long edge for features. Keep full resolution for GSD calculation. | Fact #18 |
-| Parallelism | Not specified | Sequential pipeline wastes GPU idle time | Async: extract features while satellite tile downloads. Pipeline overlap. | — |
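
The downscaling step in the table can be made concrete as follows. This is a sketch under the assumption that keypoints are detected on the downscaled image and mapped back to full-resolution pixel coordinates by dividing by the returned scale; `downscaled_size` is a hypothetical helper name.

```python
from typing import Tuple


def downscaled_size(
    width: int, height: int, long_edge: int = 1600
) -> Tuple[int, int, float]:
    """Return (new_width, new_height, scale), preserving aspect ratio.

    Images already within the limit are returned unchanged with scale 1.0.
    """
    scale = long_edge / max(width, height)
    if scale >= 1.0:
        return width, height, 1.0
    return round(width * scale), round(height * scale), scale


new_w, new_h, scale = downscaled_size(6252, 4168)
```

Keeping `scale` alongside the features lets the GSD calculation keep using full-resolution pixel units.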
-
-### Dimension 7: Memory Management
-
-| Aspect | Draft01 Approach | Identified Problem | Alternative | Factual Basis |
-|--------|-----------------|-------------------|------------|---------------|
-| Image loading | Not specified | 6252×4168 × 3ch = 78MB per raw image. 3000 images = 234GB. | Stream images one at a time. Keep only current + previous features in memory. | Fact #18 |
-| VRAM budget | Not specified | SuperPoint on full resolution could exceed 6GB VRAM | Downscale images. Batch size 1. Clear GPU cache between frames. | Fact #18 |
-| Feature storage | Not specified | 3000 images × features = significant RAM | Store only features needed for sliding window. Disk-backed for older frames. | — |
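
The image-loading figures in the first row follow from simple arithmetic, assuming decoded 8-bit images with three channels:

```python
WIDTH, HEIGHT, CHANNELS = 6252, 4168, 3
IMAGES_PER_FLIGHT = 3000
RAM_BUDGET_BYTES = 16 * 10**9  # 16 GB, the stated hardware limit

bytes_per_image = WIDTH * HEIGHT * CHANNELS            # one byte per channel
total_bytes = bytes_per_image * IMAGES_PER_FLIGHT

# How many decoded full-resolution images would fit in RAM at once,
# ignoring everything else the process needs.
images_that_fit = RAM_BUDGET_BYTES // bytes_per_image

print(f"{bytes_per_image / 1e6:.1f} MB per image")
print(f"{total_bytes / 1e9:.1f} GB for the whole flight")
print(f"at most {images_that_fit} decoded images fit in 16 GB")
```

Since only a couple of hundred decoded frames could ever be resident, streaming one image at a time is a requirement, not an optimization.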
-
-### Dimension 8: Security
-
-| Aspect | Draft01 Approach | Identified Problem | Alternative | Factual Basis |
-|--------|-----------------|-------------------|------------|---------------|
-| Authentication | API key mentioned | No implementation details. API key in query params = insecure. | JWT tokens for session auth. Short-lived tokens for SSE connections. | SSE security research |
-| Path traversal | Mentioned in testing | image_folder parameter could be exploited | Whitelist base directories. Validate path doesn't escape allowed root. | — |
-| DoS protection | Not addressed | Large image uploads, SSE connection exhaustion | Max file size limits. Connection pool limits. Request rate limiting. | — |
-| API key storage | env var mentioned | Adequate baseline | .env file + secrets manager in production. Never log API keys. | — |
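
The path-traversal mitigation (validate that the path does not escape the allowed root) can be sketched with `pathlib`. `resolve_image_folder` and the `/srv/flights` root are hypothetical names used only for illustration.

```python
from pathlib import Path


def resolve_image_folder(user_path: str, allowed_root: Path) -> Path:
    """Resolve a user-supplied folder and reject anything outside the root.

    Resolving first collapses ".." components and follows symlinks, so the
    containment check runs on the real location.
    """
    root = allowed_root.resolve()
    # Joining with an absolute user_path discards the root, which the
    # containment check below then rejects.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes allowed root: {user_path!r}")
    return candidate


ALLOWED_ROOT = Path("/srv/flights")
inside = resolve_image_folder("flight_001/images", ALLOWED_ROOT)
```

The same check would apply to any other request field that ends up as a filesystem path.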
-
-### Dimension 9: Segment Management
-
-| Aspect | Draft01 Approach | Identified Problem | Alternative | Factual Basis |
-|--------|-----------------|-------------------|------------|---------------|
-| Re-connection | Via satellite anchoring only | If satellite matching fails, segment stays floating | Attempt cross-segment matching when new anchors arrive. DEM-based constraint stitching. | Fact #8 |
-| Multi-segment handling | Described conceptually | No detail on how >2 segments are managed | Explicit segment graph with pending connections. Priority queue for unresolved segments. | — |
-| User input fallback | POST /jobs/{id}/anchor | Good design. Needs timeout/escalation for when user doesn't respond. | Add configurable timeout before continuing with VO-only estimate. | — |
@@ -1,145 +0,0 @@
-# Reasoning Chain — Solution Assessment (Mode B)
-
-## Dimension 1: Cross-View Matching Strategy
-
-### Fact Confirmation
-According to Fact #1, LightGlue is not rotation-invariant and fails on strongly rotated images (90° or 180°). According to Fact #2, it may return false matches on non-overlapping pairs instead of rejecting them. According to Fact #19, state-of-the-art GPS-denied localization uses hierarchical coarse-to-fine approaches. SatLoc-Fusion (Fact #3) achieves <15m absolute error with DINOv2 + XFeat + optical flow.
-
-### Reference Comparison
-Draft01 uses direct SuperPoint+LightGlue matching with perspective warping. This is a single-stage approach: it assumes the VO-estimated position is close enough to fetch the right satellite tile, then matches directly. However: (a) when VO drift accumulates between satellite anchors, the estimated position may be far enough off to fetch the wrong tile; (b) the domain gap between UAV images (often oblique, because the camera is not stabilized) and nadir satellite imagery is significant; (c) rotation from the non-stabilized camera is not handled.
-
-State-of-the-art approaches add a coarse localization stage (DINOv2 image retrieval over a wider area) before fine matching. This makes satellite matching robust to larger VO drift.
-
-### Conclusion
-**Replace single-stage with two-stage satellite matching**: (1) DINOv2-based coarse retrieval over a search area (e.g., 500m radius around VO estimate) to find the best-matching satellite tile, (2) SuperPoint+LightGlue for precise alignment on the selected tile. Add image rotation normalization before matching. This is the most critical improvement.
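
The coarse stage of the two-stage approach reduces to ranking candidate tiles by embedding similarity. The sketch below uses plain Python lists and toy three-dimensional vectors; in practice the embeddings are assumed to come from a global descriptor model such as DINOv2, and only the top candidates would be passed on to SuperPoint+LightGlue. All names here are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_tiles(
    query_embedding: Sequence[float],
    tile_embeddings: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (tile_id, similarity) pairs, best first."""
    scored = [
        (tile_id, cosine_similarity(query_embedding, embedding))
        for tile_id, embedding in tile_embeddings.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings for tiles inside the search radius around the VO estimate.
tiles = {
    "tile_a": [0.9, 0.1, 0.0],
    "tile_b": [0.1, 0.9, 0.2],
    "tile_c": [0.7, 0.3, 0.1],
}
candidates = rank_tiles([1.0, 0.2, 0.0], tiles, top_k=2)
```

Tile embeddings can be computed once when tiles are cached, so the per-frame cost of the coarse stage is one embedding of the UAV image plus this ranking.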
-
-### Confidence
-✅ High — multiple independent sources confirm hierarchical approach superiority.
-
---
-
-## Dimension 2: Feature Extraction & Matching
-
-### Fact Confirmation
-According to Fact #4, XFeat is 5x faster than SuperPoint with comparable accuracy and is used in SatLoc-Fusion for real-time VO. According to Fact #5, SIFT+LightGlue is more robust in high-rotation conditions. According to Fact #14, DALGlue improves MMA by 11.8% over LightGlue on the MegaDepth benchmark and is designed for UAV visual navigation.
-
-### Reference Comparison
-Draft01 uses SuperPoint for all feature extraction (both VO and satellite matching). This is simpler (unified pipeline) but suboptimal: VO needs speed (processed every frame), while satellite matching needs accuracy (processed periodically).
-
-### Conclusion
-**Dual-extractor strategy**: XFeat for VO (fast, adequate accuracy for frame-to-frame), SuperPoint for satellite matching (higher accuracy needed for cross-view). LightGlue with ONNX/TensorRT optimization as matcher. SIFT as fallback for rotation-heavy scenarios. DALGlue is promising but too new for production — monitor.
-
-### Confidence
-✅ High — XFeat benchmarks are from CVPR 2024, well-established.
-
---
-
-## Dimension 3: Visual Odometry Robustness
-
-### Fact Confirmation
-According to Fact #16, homography decomposition can be unstable and non-planar objects degrade results. According to Fact #12, Copernicus DEM provides free 30m elevation data for terrain-corrected GSD.
-
-### Reference Comparison
-Draft01's homography-based VO is valid for flat terrain but doesn't account for: (a) terrain elevation changes affecting the GSD calculation, (b) non-planar objects in the scene, (c) camera roll/pitch from the non-stabilized mount. The terrain in eastern Ukraine is mostly steppe but includes settlements, forests, and infrastructure.
-
-### Conclusion
-**Keep homography VO as primary** (valid for dominant ground plane), but: (1) add RANSAC inlier ratio check — if below threshold, fall back to essential matrix estimation; (2) integrate Copernicus DEM for terrain-corrected altitude in GSD calculation; (3) estimate and track camera rotation (roll/pitch/yaw) from consecutive VO estimates and use it for image rectification before satellite matching.
-
-### Confidence
-✅ High — homography with RANSAC and fallback is well-established.
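The inlier-ratio fallback from point (1) of the conclusion could be sketched roughly as follows. The function name `choose_vo_model`, the 0.5 ratio threshold, and the 30-match minimum are illustrative assumptions, not values from the draft; the inlier mask is assumed to come from whichever RANSAC homography estimator the pipeline uses.

```python
from typing import Sequence


def choose_vo_model(inlier_mask: Sequence[bool],
                    min_inlier_ratio: float = 0.5,
                    min_matches: int = 30) -> str:
    """Decide which motion model to trust for one frame pair.

    Returns "homography", "essential_matrix" or "vo_failure".
    Both thresholds are placeholder assumptions to be tuned on real data.
    """
    total = len(inlier_mask)
    if total < min_matches:
        # Too few matches to estimate any model reliably.
        return "vo_failure"
    ratio = sum(1 for ok in inlier_mask if ok) / total
    if ratio >= min_inlier_ratio:
        # The dominant ground plane explains most matches.
        return "homography"
    # Planar model fits poorly: fall back to essential matrix estimation.
    return "essential_matrix"
```

Keeping the decision separate from the estimator makes the thresholds easy to tune against recorded flights.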
-
---
-
-## Dimension 4: Position Optimizer
-
-### Fact Confirmation
-According to Fact #13, GTSAM provides Python bindings with GPSFactor and iSAM2 incremental optimization. According to Fact #17, sliding-window factor graph optimization is superior to either pure filtering or full batch optimization. According to Fact #8, YFS90 achieves <7m MAE with terrain-weighted constraints + DEM.
-
-### Reference Comparison
-Draft01 proposes scipy.optimize with a custom sliding window. While functional, this is reinventing the wheel — GTSAM's iSAM2 already implements incremental smoothing with proper uncertainty propagation. GTSAM's factor graph naturally supports: BetweenFactor for VO constraints (with uncertainty), GPSFactor for satellite anchors, custom factors for terrain constraints, drift limit constraints.
-
-### Conclusion
-**Replace scipy.optimize with GTSAM iSAM2 factor graph**. Use BetweenFactor for VO relative motion, GPSFactor for satellite anchors (with uncertainty based on match quality), and a custom terrain factor using Copernicus DEM. This provides: proper uncertainty propagation, incremental updates (fits SSE streaming), backwards smoothing when new anchors arrive.
-
-### Confidence
-✅ High — GTSAM is production-proven, stable v4.2 available via pip.
-
---
-
-## Dimension 5: Satellite Imagery Reliability
-
-### Fact Confirmation
-According to Fact #9, the Google Maps $200/month free credit expired in Feb 2025; the current free tier is 100K tiles/month. According to Fact #10, eastern Ukraine imagery may be 3-5+ years old. According to Fact #20, the 15,000/day rate limit could be hit on large flights. According to Fact #11, Mapbox offers alternative satellite tiles at comparable resolution.
-
-### Reference Comparison
-Draft01 relies solely on Google Maps. Single-provider dependency creates multiple risk points: outdated imagery, rate limits, cost, API changes.
-
-### Conclusion
-**Multi-provider satellite tile manager**: Google Maps as primary, Mapbox as secondary, user-provided tiles as override. Implement: provider fallback when matching confidence is low, request budgeting to stay within rate limits, tile freshness metadata logging, pre-caching mode for known operational areas.
-
-### Confidence
-✅ High — multi-provider is standard practice for production systems.
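A minimal sketch of the provider fallback order described in the conclusion, assuming each provider is wrapped in a callable that returns tile bytes or `None`. The `TileFetcher` type and `fetch_tile` function are hypothetical names; the sketch covers only availability fallback, not the confidence-driven fallback, request budgeting, or freshness logging.

```python
from typing import Callable, Optional, Sequence, Tuple

# A provider takes (zoom, x, y) and returns tile bytes, or None if unavailable.
TileFetcher = Callable[[int, int, int], Optional[bytes]]


def fetch_tile(providers: Sequence[Tuple[str, TileFetcher]],
               zoom: int, x: int, y: int) -> Tuple[Optional[str], Optional[bytes]]:
    """Try providers in priority order and return (provider_name, tile)."""
    for name, fetch in providers:
        try:
            tile = fetch(zoom, x, y)
        except Exception:
            # Treat any provider error as "unavailable" and move on.
            tile = None
        if tile is not None:
            return name, tile
    # No provider could serve the tile.
    return None, None
```

User-provided tiles would simply be registered as the first entry in `providers`, ahead of Google Maps and Mapbox.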
-
---
-
-## Dimension 6: Processing Time Budget
-
-### Fact Confirmation
-According to Fact #6, SuperPoint+LightGlue takes ~0.36s per pair on GPU. According to Fact #7, ONNX optimization adds 2-4x speedup (on RTX 2060, limited to FP16). According to Fact #4, XFeat is 5x faster than SuperPoint for VO.
-
-### Reference Comparison
-Draft01's per-frame pipeline: (1) feature extraction, (2) VO matching, (3) satellite tile fetch, (4) satellite matching, (5) optimization, (6) SSE emit. Total estimated without optimization: ~1-2s for VO + ~0.5-1s for satellite + overhead = 2-4s. With ONNX optimization for matching and XFeat for VO, this drops to ~0.5-1.5s.
-
-### Conclusion
-**Budget is achievable with optimizations**: XFeat for VO (~20ms extraction + ~50ms matching), SuperPoint + LightGlue ONNX for satellite matching (~100ms extraction + ~100ms matching), async satellite tile download (overlapped with VO), GTSAM incremental update (~10ms). Total: ~0.5-1s per frame including overhead. Satellite matching can run asynchronously, since not every frame needs a satellite match. Downscaling images to a 1600px long edge is essential.
-
-### Confidence
-⚠️ Medium — depends on actual RTX 2060 benchmarks, which are extrapolated from general numbers.
-
---
-
-## Dimension 7: Memory Management
-
-### Fact Confirmation
-According to Fact #18, SuperPoint is fully-convolutional and VRAM scales with resolution. 6252×4168 images would require significant VRAM and RAM.
-
-### Reference Comparison
-Draft01 doesn't specify memory management. With 3000 images at max resolution, naive processing would exceed 16GB RAM.
-
-### Conclusion
-**Strict memory management**: (1) Downscale all images to a maximum long edge of 1600px before feature extraction; (2) stream images one at a time, keeping only the current and previous frame features in GPU memory; (3) store features for the sliding window in CPU RAM and move older features to disk; (4) limit the satellite tile cache to 500MB in RAM, with overflow to disk; (5) use batch size 1 for all GPU operations; (6) call `torch.cuda.empty_cache()` explicitly between frames if VRAM pressure is detected.
-
-### Confidence
-✅ High — standard memory management patterns.
-
---
-
-## Dimension 8: Security
-
-### Fact Confirmation
-JWT tokens are recommended for SSE endpoint security. API keys in query parameters are insecure (persist in logs, browser history).
-
-### Reference Comparison
-Draft01 mentions API key auth but no implementation details. SSE connections need proper authentication and resource limits.
-
-### Conclusion
-**Security improvements**: (1) JWT-based authentication for all endpoints; (2) short-lived tokens for SSE connections; (3) image folder whitelist (not just path traversal prevention — explicit whitelist of allowed base directories); (4) max concurrent SSE connections per client; (5) request rate limiting; (6) max image size validation; (7) all API keys in environment variables, never logged.
-
-### Confidence
-✅ High — standard security practices.
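The image folder whitelist from point (3) could be checked along these lines. This is a sketch using only the standard library; `is_allowed_folder` and the example paths are hypothetical, and the real service would load the allowed base directories from its configuration.

```python
from pathlib import Path
from typing import Iterable


def is_allowed_folder(requested: str, allowed_bases: Iterable[str]) -> bool:
    """Return True only if `requested` resolves inside one of the allowed bases."""
    target = Path(requested).resolve()
    for base in allowed_bases:
        base_path = Path(base).resolve()
        # After resolve(), ".." segments and symlinks are already collapsed,
        # so a plain ancestor check is enough.
        if target == base_path or base_path in target.parents:
            return True
    return False
```

Comparing resolved path components rather than string prefixes avoids accepting a sibling such as `/data/flights_evil` when only `/data/flights` is allowed.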
-
---
-
-## Dimension 9: Segment Management
-
-### Fact Confirmation
-According to Fact #8, YFS90 has re-localization capability after positioning failures. According to Fact #2, LightGlue may return false matches on non-overlapping pairs.
-
-### Reference Comparison
-Draft01's segment management relies on satellite matching to anchor each segment independently. If satellite matching fails, the segment stays "floating." No mechanism for cross-segment matching or delayed resolution.
-
-### Conclusion
-**Enhanced segment management**: (1) Explicit VO failure detection using match count + inlier ratio + geometric consistency (not just match count); (2) when a new segment gets satellite-anchored, attempt to connect to nearby floating segments using satellite-based position proximity; (3) DEM-based constraint: position must be consistent with terrain elevation; (4) configurable timeout for user input request — if no response within N frames, continue with best estimate and flag.
-
-### Confidence
-⚠️ Medium — cross-segment connection is logical but needs careful implementation to avoid false connections.
@@ -1,93 +0,0 @@
-# Validation Log — Solution Assessment (Mode B)
-
-## Validation Scenario 1: Normal flight over steppe with gradual turns
-
-**Scenario**: 1000-image flight over flat agricultural steppe. FullHD resolution. Starting GPS known. Gradual turns every 200 frames. Satellite imagery 2 years old.
-
-**Expected with Draft02 improvements**:
-1. XFeat VO processes frames at ~70ms each → well under 5s budget
-2. DINOv2 coarse retrieval finds correct satellite area despite 50-100m VO drift
-3. SuperPoint+LightGlue ONNX refines position to ~10-20m accuracy
-4. GTSAM iSAM2 smooths trajectory, reduces drift between anchors
-5. At gradual turns, VO continues working (overlap >30%)
-6. Processing stays under 1GB VRAM with 1600px downscale
-
-**Actual validation result**: Consistent with expectations. This is the "happy path" — both draft01 and draft02 would work. Draft02 advantage: faster processing, better optimizer.
-
-## Validation Scenario 2: Sharp turn with no overlap
-
-**Scenario**: After 500 normal frames, UAV makes a 90° sharp turn. Next 3 images have zero overlap with previous route. Then normal flight continues.
-
-**Expected with Draft02 improvements**:
-1. VO detects failure: match count drops below threshold → segment break
-2. LightGlue false-match protection: geometric consistency check rejects bad matches
-3. New segment starts. DINOv2 coarse retrieval searches wider area for satellite match
-4. If satellite match succeeds: new segment anchored, connected to previous via shared coordinate frame
-5. If satellite match fails: segment marked floating, user input requested (with timeout)
-6. After turn, if UAV returns near previous route, cross-segment connection attempted
-
-**Draft01 comparison**: Draft01 would also detect VO failure and create new segment, but lacks coarse retrieval → satellite matching depends entirely on VO estimate which may be wrong after turn. Higher risk of satellite match failure.
-
-## Validation Scenario 3: High-resolution images (6252×4168)
-
-**Scenario**: 500 images at full 6252×4168 resolution. RTX 2060 (6GB VRAM).
-
-**Expected with Draft02 improvements**:
-1. Images downscaled to 1600×1066 for feature extraction
-2. Full resolution preserved for GSD calculation only
-3. Per-frame VRAM: ~1.5GB for XFeat/SuperPoint + LightGlue
-4. RAM per frame: ~78MB raw + ~5MB features → manageable with streaming
-5. Total peak RAM: sliding window (50 frames × 5MB features) + satellite cache (500MB) + overhead ≈ 1.5GB pipeline
-6. Well within 16GB RAM budget
-
-**Actual validation result**: Consistent. Downscaling strategy is essential and was missing from draft01.
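The figures in this scenario can be reproduced with a few lines of arithmetic. The helper names below are illustrative, and the sketch assumes uncompressed 8-bit RGB frames and floor rounding when downscaling.

```python
def downscale_size(width: int, height: int, long_edge: int = 1600) -> tuple:
    """Scale (width, height) so the longer side equals `long_edge`, never upscaling."""
    longest = max(width, height)
    if longest <= long_edge:
        return width, height
    # Integer floor division keeps the result reproducible.
    return width * long_edge // longest, height * long_edge // longest


def raw_rgb_megabytes(width: int, height: int) -> float:
    """Size of an uncompressed 8-bit RGB frame in MB (10^6 bytes)."""
    return width * height * 3 / 1e6


print(downscale_size(6252, 4168))               # (1600, 1066)
print(round(raw_rgb_megabytes(6252, 4168), 1))  # 78.2
```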
-
-## Validation Scenario 4: Outdated satellite imagery
-
-**Scenario**: Flight over area where Google Maps imagery is 4 years old. Significant changes: new buildings, removed forests, changed roads.
-
-**Expected with Draft02 improvements**:
-1. DINOv2 coarse retrieval: partial success (terrain structure still recognizable)
-2. SuperPoint+LightGlue fine matching: lower match count on changed areas
-3. Confidence score drops for affected frames → flagged in output
-4. Multi-provider fallback: try Mapbox tiles if Google matches are poor
-5. System falls back to VO-only for sections with no good satellite match
-6. User can provide custom satellite imagery for specific areas
-
-**Draft01 comparison**: Draft01 would also fail on changed areas but has no alternative provider and no coarse retrieval to help.
-
-## Validation Scenario 5: 3000-image flight hitting API rate limits
-
-**Scenario**: First flight in a new area. No cached tiles. 3000 images need ~2000 satellite tiles.
-
-**Expected with Draft02 improvements**:
-1. Initial download: 300 tiles around starting GPS (within rate limits)
-2. Progressive download as route extends: 5-20 tiles per frame
-3. Daily limit (15,000): sufficient for a single flight (~2,000 tiles) but tight if several flights run on the same day
-4. Request budgeting: prioritize tiles around current position, defer expansion
-5. Per-minute limit (6,000): no issue
-6. Monthly limit (100,000): covers ~50 flights at 2000 tiles each
-7. Mapbox fallback if Google budget exhausted
-
-**Draft01 comparison**: Draft01 assumed $200 free credit (expired). Rate limit analysis was incorrect.
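A small sketch of the request budgeting idea, using the limits quoted in this scenario. `TileBudget` and `flights_covered` are hypothetical names, and the sketch ignores cached tiles and quota resets.

```python
class TileBudget:
    """Tracks how many tile requests remain in one quota window."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def request(self, wanted: int) -> int:
        """Grant up to `wanted` tiles; the caller defers whatever is not granted."""
        granted = max(0, min(wanted, self.limit - self.used))
        self.used += granted
        return granted


def flights_covered(limit: int, tiles_per_flight: int) -> int:
    """Whole flights that fit into a quota, ignoring cached tiles."""
    return limit // tiles_per_flight


print(flights_covered(100_000, 2_000))  # monthly quota: 50 flights
print(flights_covered(15_000, 2_000))   # daily quota: 7 flights
```

Tiles around the current position would be requested first; any shortfall returned by `request` marks the tiles to defer or route to the fallback provider.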
-
-## Review Checklist
- [x] Draft conclusions consistent with fact cards
- [x] No important dimensions missed
- [x] No over-extrapolation
- [x] Conclusions actionable/verifiable
- [x] All scenarios plausible for the operational context
-
-## Counterexamples
- **Night flight**: Not addressed (out of scope — restriction says "mostly sunny weather")
- **Very low altitude (<100m)**: Satellite matching would have poor GSD match — not addressed but within restrictions (altitude ≤1km)
- **Urban area with tall buildings**: Homography VO degradation — mitigated by essential matrix fallback but not fully addressed
-
-## Conclusions Requiring No Revision
-All conclusions validated against scenarios. Key improvements are well-supported:
-1. Hierarchical satellite matching (coarse + fine)
-2. GTSAM factor graph optimization
-3. Multi-provider satellite tiles
-4. XFeat for VO speed
-5. Image downscaling for memory
-6. Proper security (JWT, rate limiting)
@@ -1,56 +0,0 @@
-# Question Decomposition
-
-## Original Question
-Assess current solution draft. Additionally:
-1. Try SuperPoint + LightGlue for visual odometry
-2. Can LiteSAM be SO SLOW because of big images? If we reduce size to 1280p, would that work faster?
-
-## Active Mode
-Mode B: Solution Assessment — `solution_draft01.md` exists in OUTPUT_DIR.
-
-## Question Type
-Problem Diagnosis + Decision Support
-
-## Research Subject Boundary
- **Population**: GPS-denied UAV navigation systems on edge hardware
- **Geography**: Eastern Ukraine conflict zone
- **Timeframe**: Current (2025-2026), using latest available tools
- **Level**: Jetson Orin Nano Super (8GB, 67 TOPS) — edge deployment
-
-## Decomposed Sub-Questions
-
-### Q1: SuperPoint + LightGlue for Visual Odometry
- What is SP+LG inference speed on Jetson-class hardware?
- How does it compare to cuVSLAM (116fps on Orin Nano)?
- Is SP+LG suitable for frame-to-frame VO at 3fps?
- What is SP+LG accuracy vs cuVSLAM for VO?
-
-### Q2: LiteSAM Speed vs Image Resolution
- What resolution was LiteSAM benchmarked at? (1184px on AGX Orin)
- How does LiteSAM speed scale with resolution?
- What would 1280px achieve on Orin Nano Super vs AGX Orin?
- Is the bottleneck image size or compute power gap?
-
-### Q3: General Weak Points in solution_draft01
- Are there functional weak points?
- Are there performance bottlenecks?
- Are there security gaps?
-
-### Q4: SP+LG for Satellite Matching (alternative to LiteSAM/XFeat)
- How does SP+LG perform on cross-view satellite-aerial matching?
- What does the LiteSAM paper say about SP+LG accuracy?
-
-## Timeliness Sensitivity Assessment
- **Research Topic**: Edge-deployed visual odometry and satellite-aerial matching
- **Sensitivity Level**: 🟠 High
- **Rationale**: cuVSLAM v15.0.0 released March 2026; LiteSAM published October 2025; LightGlue TensorRT optimizations actively evolving
- **Source Time Window**: 12 months
- **Priority official sources**:
-  1. LiteSAM paper (MDPI Remote Sensing, October 2025)
-  2. cuVSLAM / PyCuVSLAM v15.0.0 (March 2026)
-  3. LightGlue-ONNX / TensorRT benchmarks (2024-2026)
-  4. Intermodalics cuVSLAM benchmark (2025)
- **Key version information**:
-  - cuVSLAM: v15.0.0 (March 2026)
-  - LightGlue: ICCV 2023, TensorRT via fabio-sim/LightGlue-ONNX
-  - LiteSAM: Published October 2025, code at boyagesmile/LiteSAM
@@ -1,121 +0,0 @@
-# Source Registry
-
-## Source #1
- **Title**: LiteSAM: Lightweight and Robust Feature Matching for Satellite and Aerial Imagery
- **Link**: https://www.mdpi.com/2072-4292/17/19/3349
- **Tier**: L1
- **Publication Date**: 2025-10-01
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: LiteSAM v1.0; benchmarked on Jetson AGX Orin (JetPack 5.x era)
- **Target Audience**: UAV visual localization researchers and edge deployers
- **Research Boundary Match**: ✅ Full match
- **Summary**: LiteSAM (opt) achieves 497.49ms on Jetson AGX Orin at 1184px input. 6.31M params. RMSE@30 = 17.86m on UAV-VisLoc. Paper directly compares with SP+LG, stating "SP+LG achieves the fastest inference speed but at the expense of accuracy." Section 4.9 shows resolution vs speed tradeoff on RTX 3090Ti.
- **Related Sub-question**: Q2 (LiteSAM speed), Q4 (SP+LG for satellite matching)
-
-## Source #2
- **Title**: cuVSLAM: CUDA accelerated visual odometry and mapping
- **Link**: https://arxiv.org/abs/2506.04359
- **Tier**: L1
- **Publication Date**: 2025-06 (paper), v15.0.0 released 2026-03-10
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: cuVSLAM v15.0.0 / PyCuVSLAM v15.0.0
- **Target Audience**: Robotics/UAV visual odometry on NVIDIA Jetson
- **Research Boundary Match**: ✅ Full match
- **Summary**: CUDA-accelerated VO+SLAM, supports mono+IMU. 116fps on Jetson Orin Nano 8GB at 720p. <1% trajectory error on KITTI. <5cm on EuRoC.
- **Related Sub-question**: Q1 (SP+LG vs cuVSLAM)
-
-## Source #3
- **Title**: Intermodalics — NVIDIA Isaac ROS In-Depth: cuVSLAM and the DP3.1 Release
- **Link**: https://www.intermodalics.ai/blog/nvidia-isaac-ros-in-depth-cuvslam-and-the-dp3-1-release
- **Tier**: L2
- **Publication Date**: 2025 (DP3.1 release)
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: cuVSLAM v11 (DP3.1), benchmark data applicable to later versions
- **Target Audience**: Robotics developers using Isaac ROS
- **Research Boundary Match**: ✅ Full match
- **Summary**: 116fps on Orin Nano 8GB, 232fps on AGX Orin, 386fps on RTX 4060 Ti. Outperforms ORB-SLAM2 on KITTI.
- **Related Sub-question**: Q1
-
-## Source #4
- **Title**: Accelerating LightGlue Inference with ONNX Runtime and TensorRT
- **Link**: https://fabio-sim.github.io/blog/accelerating-lightglue-inference-onnx-runtime-tensorrt/
- **Tier**: L2
- **Publication Date**: 2024-07-17
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: torch 2.4.0, TensorRT 10.2.0, RTX 4080 benchmarks
- **Target Audience**: ML engineers deploying LightGlue
- **Research Boundary Match**: ⚠️ Partial (desktop GPU, not Jetson)
- **Summary**: TensorRT achieves 2-4x speedup over compiled PyTorch for SuperPoint+LightGlue. Full pipeline benchmarks on RTX 4080. TensorRT has 3840 keypoint limit. No Jetson-specific benchmarks provided.
- **Related Sub-question**: Q1
-
-## Source #5
- **Title**: LightGlue-with-FlashAttentionV2-TensorRT (Jetson Orin NX 8GB)
- **Link**: https://github.com/qdLMF/LightGlue-with-FlashAttentionV2-TensorRT
- **Tier**: L4
- **Publication Date**: 2025-02
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: TensorRT 8.5.2, Jetson Orin NX 8GB
- **Target Audience**: Edge ML deployers
- **Research Boundary Match**: ✅ Full match (similar hardware)
- **Summary**: CUTLASS-based FlashAttention V2 TensorRT plugin for LightGlue, tested on Jetson Orin NX 8GB. No published latency numbers, but confirms LightGlue TensorRT deployment on Orin-class hardware is feasible.
- **Related Sub-question**: Q1
-
-## Source #6
- **Title**: vo_lightglue — Visual Odometry with LightGlue
- **Link**: https://github.com/himadrir/vo_lightglue
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: N/A
- **Target Audience**: VO researchers
- **Research Boundary Match**: ⚠️ Partial (desktop, KITTI dataset)
- **Summary**: SP+LG achieves 10fps on KITTI dataset (desktop GPU). Odometric error ~1% vs 3.5-4.1% for FLANN-based matching. Much slower than cuVSLAM.
- **Related Sub-question**: Q1
-
-## Source #7
- **Title**: ForestVO: Enhancing Visual Odometry in Forest Environments through ForestGlue
- **Link**: https://arxiv.org/html/2504.01261v1
- **Tier**: L1
- **Publication Date**: 2025-04
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: N/A
- **Target Audience**: VO researchers
- **Research Boundary Match**: ⚠️ Partial (forest environment, not nadir UAV)
- **Summary**: SP+LG VO pipeline achieves 1.09m avg relative pose error, KITTI score 2.33%. Uses 512 keypoints (reduced from 2048) to cut compute. Outperforms DSO by 40%.
- **Related Sub-question**: Q1
-
-## Source #8
- **Title**: SuperPoint-SuperGlue-TensorRT (C++ deployment)
- **Link**: https://github.com/yuefanhao/SuperPoint-SuperGlue-TensorRT
- **Tier**: L4
- **Publication Date**: 2023-2024
- **Timeliness Status**: ⚠️ Needs verification (SuperGlue, not LightGlue)
- **Version Info**: TensorRT 8.x
- **Target Audience**: Edge deployers
- **Research Boundary Match**: ⚠️ Partial
- **Summary**: SuperPoint TensorRT extraction ~40ms on Jetson for 200 keypoints. C++ implementation.
- **Related Sub-question**: Q1
-
-## Source #9
- **Title**: Comparative Analysis of Advanced Feature Matching Algorithms in HSR Satellite Stereo
- **Link**: https://arxiv.org/abs/2405.06246
- **Tier**: L1
- **Publication Date**: 2024-05
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: N/A
- **Target Audience**: Remote sensing researchers
- **Research Boundary Match**: ⚠️ Partial (satellite stereo, not UAV-satellite cross-view)
- **Summary**: SP+LG shows "overall superior performance in balancing robustness, accuracy, distribution, and efficiency" for satellite stereo matching. But this is same-view satellite-satellite, not cross-view UAV-satellite.
- **Related Sub-question**: Q4
-
-## Source #10
- **Title**: PyCuVSLAM with reComputer (Seeed Studio)
- **Link**: https://wiki.seeedstudio.com/pycuvslam_recomputer_robotics/
- **Tier**: L3
- **Publication Date**: 2026
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: PyCuVSLAM v15.0.0, JetPack 6.2
- **Target Audience**: Robotics developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Tutorial for deploying PyCuVSLAM on Jetson Orin NX. Confirms mono+IMU mode, pip install from aarch64 wheel, EuRoC dataset examples.
- **Related Sub-question**: Q1
@@ -1,122 +0,0 @@
-# Fact Cards
-
-## Fact #1
- **Statement**: cuVSLAM achieves 116fps on Jetson Orin Nano 8GB at 720p resolution (~8.6ms/frame). 232fps on AGX Orin. 386fps on RTX 4060 Ti.
- **Source**: [Source #3] Intermodalics benchmark
- **Phase**: Assessment
- **Confidence**: ✅ High
- **Related Dimension**: VO speed comparison
-
-## Fact #2
- **Statement**: SuperPoint+LightGlue VO achieves ~10fps on KITTI dataset on desktop GPU (~100ms/frame). With 274 keypoints on RTX 2080Ti, LightGlue matching alone takes 33.9ms.
- **Source**: vo_lightglue, LG issue #36
- **Confidence**: ⚠️ Medium (desktop GPU, not Jetson)
- **Related Dimension**: VO speed comparison
-
-## Fact #3
- **Statement**: SuperPoint feature extraction takes ~40ms on Jetson (TensorRT, 200 keypoints).
- **Source**: SuperPoint-SuperGlue-TensorRT
- **Confidence**: ⚠️ Medium (older Jetson)
- **Related Dimension**: VO speed comparison
-
-## Fact #4
- **Statement**: LightGlue TensorRT with FlashAttention V2 has been deployed on Jetson Orin NX 8GB. No published latency numbers.
- **Source**: qdLMF/LightGlue-with-FlashAttentionV2-TensorRT
- **Confidence**: ⚠️ Medium
- **Related Dimension**: VO speed comparison
-
-## Fact #5
- **Statement**: LiteSAM (opt) inference: 61.98ms on RTX 3090, 497.49ms on Jetson AGX Orin at 1184px input. 6.31M params.
- **Source**: LiteSAM paper, abstract + Section 4.10
- **Confidence**: ✅ High
- **Related Dimension**: Satellite matcher speed
-
-## Fact #6
- **Statement**: Jetson AGX Orin has 275 TOPS INT8, 2048 CUDA cores. Orin Nano Super has 67 TOPS INT8, 1024 CUDA cores. AGX Orin is ~3-4x more powerful.
- **Source**: NVIDIA official specs
- **Confidence**: ✅ High
- **Related Dimension**: Hardware scaling
-
-## Fact #7
- **Statement**: LiteSAM processes at 1/8 scale internally. Coarse matching is O(N²) where N = (H/8 × W/8). For 1184px: ~21,904 tokens. For 1280px: ~25,600. For 480px: ~3,600.
- **Source**: LiteSAM paper, Sections 3.1-3.3
- **Confidence**: ✅ High
- **Related Dimension**: LiteSAM speed vs resolution
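The token counts in Fact #7 follow directly from the 1/8 internal scale. The sketch below assumes a square input whose side is divisible by the stride; the function names are illustrative, and the `exponent` parameter covers the linear-to-quadratic range of how cost may grow with token count.

```python
def coarse_tokens(side_px: int, stride: int = 8) -> int:
    """Number of coarse-level tokens for a square input of `side_px` pixels."""
    return (side_px // stride) ** 2


def cost_ratio(side_a: int, side_b: int, exponent: float) -> float:
    """Relative cost of side_b vs side_a if cost grows as tokens**exponent."""
    return (coarse_tokens(side_b) / coarse_tokens(side_a)) ** exponent


for side in (1184, 1280, 640, 480):
    print(side, coarse_tokens(side))

print(round(cost_ratio(1184, 1280, 1), 2))  # linear in tokens
print(round(cost_ratio(1184, 1280, 2), 2))  # quadratic in tokens
```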
-
-## Fact #8
- **Statement**: LiteSAM paper Figure 1 states: "SP+LG achieves the fastest inference speed but at the expense of accuracy" vs LiteSAM on satellite-aerial benchmarks.
- **Source**: LiteSAM paper
- **Confidence**: ✅ High
- **Related Dimension**: SP+LG vs LiteSAM
-
-## Fact #9
- **Statement**: LiteSAM achieves RMSE@30 = 17.86m on UAV-VisLoc. SP+LG is worse on same benchmark.
- **Source**: LiteSAM paper
- **Confidence**: ✅ High
- **Related Dimension**: Satellite matcher accuracy
-
-## Fact #10
- **Statement**: cuVSLAM uses Shi-Tomasi corners ("Good Features to Track") for keypoint detection, divided into NxM grid patches. Uses Lucas-Kanade optical flow for tracking. When tracked keypoints fall below threshold, creates new keyframe.
- **Source**: cuVSLAM paper (arXiv:2506.04359), Section 2.1
- **Confidence**: ✅ High
- **Related Dimension**: cuVSLAM on difficult terrain
-
-## Fact #11
- **Statement**: cuVSLAM automatically switches to IMU when visual tracking fails (dark lighting, long solid surfaces). IMU integrator provides ~1 second of acceptable tracking. After IMU, constant-velocity integrator provides ~0.5 seconds more.
- **Source**: Isaac ROS cuVSLAM docs
- **Confidence**: ✅ High
- **Related Dimension**: cuVSLAM on difficult terrain
-
-## Fact #12
- **Statement**: cuVSLAM does NOT guarantee correct pose recovery after losing track. External algorithms required for global re-localization after tracking loss. Cannot fuse GNSS, wheel odometry, or LiDAR.
- **Source**: Intermodalics blog
- **Confidence**: ✅ High
- **Related Dimension**: cuVSLAM on difficult terrain
-
-## Fact #13
- **Statement**: cuVSLAM benchmarked on KITTI (mostly urban/suburban driving) and EuRoC (indoor drone). Neither benchmark includes nadir agricultural terrain, flat fields, or uniform vegetation. No published results for these conditions.
- **Source**: cuVSLAM paper Section 3
- **Confidence**: ✅ High
- **Related Dimension**: cuVSLAM on difficult terrain
-
-## Fact #14
- **Statement**: cuVSLAM multi-stereo mode "significantly improves accuracy and robustness on challenging sequences compared to single stereo cameras" and is designed for featureless surfaces (narrow corridors, elevators). However, our system uses a monocular camera only.
- **Source**: cuVSLAM paper Section 2.2.2
- **Confidence**: ✅ High
- **Related Dimension**: cuVSLAM on difficult terrain
-
-## Fact #15
- **Statement**: PFED achieves 97.15% Recall@1 on University-1652 at 251.5 FPS on AGX Orin with only 4.45G FLOPs. But this is image RETRIEVAL (which satellite tile matches), NOT pixel-level correspondence matching.
- **Source**: PFED paper (arXiv:2510.22582)
- **Confidence**: ✅ High
- **Related Dimension**: Satellite matching alternatives
-
-## Fact #16
- **Statement**: EfficientLoFTR is ~2.5x faster than LoFTR with higher accuracy. Semi-dense matcher, 15.05M params. Has TensorRT adaptation (LoFTR_TRT). Performs well on weak-texture areas where traditional methods fail. Designed for aerial imagery.
- **Source**: EfficientLoFTR paper (CVPR 2024), HuggingFace docs
- **Confidence**: ✅ High
- **Related Dimension**: Satellite matching alternatives
-
-## Fact #17
- **Statement**: Hierarchical AVL system (2025) uses two-stage approach: DINOv2 for coarse retrieval + SuperPoint for fine matching. 64.5-95% success rate on real-world drone trajectories. Includes IMU-based prior correction and sliding-window map updates.
- **Source**: MDPI Remote Sensing 2025
- **Confidence**: ✅ High
- **Related Dimension**: Satellite matching alternatives
-
-## Fact #18
- **Statement**: STHN uses deep homography estimation for UAV geo-localization: directly estimates homography transform (no feature detection/matching/RANSAC). Achieves 4.24m MACE at 50m range. Designed for thermal but architecture is modality-agnostic.
- **Source**: STHN paper (IEEE RA-L 2024)
- **Confidence**: ✅ High
- **Related Dimension**: Satellite matching alternatives
-
-## Fact #19
- **Statement**: For our nadir UAV → satellite matching, the cross-view gap is SMALL compared to typical cross-view problems (ground-to-satellite). Both views are approximately top-down. Main challenges: season/lighting, resolution mismatch, temporal changes. This means general-purpose matchers may work better than expected.
- **Source**: Analytical observation
- **Confidence**: ⚠️ Medium
- **Related Dimension**: Satellite matching alternatives
-
-## Fact #20
- **Statement**: The LiteSAM paper benchmarked EfficientLoFTR (opt) on satellite-aerial matching: it runs 19.8% slower than LiteSAM (opt) on AGX Orin and has 2.4x more parameters (15.05M vs 6.31M). EfficientLoFTR achieves competitive accuracy. LiteSAM paper Table 3/4 provides the direct comparison.
- **Source**: LiteSAM paper, Section 4.5
- **Confidence**: ✅ High
- **Related Dimension**: EfficientLoFTR vs LiteSAM
@@ -1,45 +0,0 @@
-# Comparison Framework
-
-## Selected Framework Type
-Decision Support + Problem Diagnosis
-
-## Selected Dimensions
-1. Inference speed on Orin Nano Super
-2. Accuracy for the target task
-3. Cross-view robustness (satellite-aerial gap)
-4. Implementation complexity / ecosystem maturity
-5. Memory footprint
-6. TensorRT optimization readiness
-
-## Comparison 1: Visual Odometry — cuVSLAM vs SuperPoint+LightGlue
-
-| Dimension | cuVSLAM v15.0.0 | SuperPoint + LightGlue (TRT) | Factual Basis |
-|-----------|-----------------|-------------------------------|---------------|
-| Speed on Orin Nano | ~8.6ms/frame (116fps @ 720p) | Est. ~130-280ms/frame (SP ~50-80ms + LG ~80-200ms) | Fact #1, #2, #3 |
-| VO accuracy (KITTI) | <1% trajectory error | ~1% odometric error (desktop) | Fact #1, #2 |
-| VO accuracy (EuRoC) | <5cm position error | Not benchmarked | Fact #1 |
-| IMU integration | Native mono+IMU mode, auto-fallback | None; custom IMU fusion must be added | Fact #11 |
-| Loop closure | Built-in | Not available | Fact #1 |
-| TensorRT ready | Native CUDA kernels (no TensorRT step needed) | Requires ONNX export + TRT build | Fact #4 |
-| Memory | ~200-300MB | SP ~50MB + LG ~50-100MB = ~100-150MB | Fact #1 |
-| Implementation | pip install aarch64 wheel | Custom pipeline: SP export + LG export + matching + pose estimation | Fact #1, #4 |
-| Maturity on Jetson | NVIDIA-maintained, production-ready | Community TRT plugins, limited Jetson benchmarks | Fact #4, #5 |
-
-## Comparison 2: LiteSAM Speed at Different Resolutions
-
-| Dimension | 1184px (paper default) | 1280px (user proposal) | 640px | 480px | Factual Basis |
-|-----------|------------------------|------------------------|-------|-------|---------------|
-| Tokens at 1/8 scale | ~21,904 | ~25,600 | ~6,400 | ~3,600 | Fact #7 |
-| AGX Orin time | 497ms | Est. ~580ms (1.17x tokens) | Est. ~150ms | Est. ~90ms | Fact #5, #7 |
-| Orin Nano Super time (est.) | ~1.5-2.0s | ~1.7-2.3s | ~450-600ms | ~270-360ms | Fact #5, #6 |
-| Accuracy (RMSE@30) | 17.86m | Similar (slightly less) | Degraded | Significantly degraded | Fact #9 |
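The estimated timings in this table can be approximated by scaling the 497ms AGX Orin measurement with the token count and then applying the ~3-4x hardware gap. This is a rough sketch that assumes latency is linear in the number of coarse tokens; the constants come from Fact #5 and Fact #6, and the function name is illustrative. Results land slightly below the rounded table values at 640px and 480px.

```python
BASE_SIDE_PX = 1184
BASE_AGX_MS = 497.0         # LiteSAM (opt) on AGX Orin at 1184px (Fact #5)
NANO_SLOWDOWN = (3.0, 4.0)  # AGX Orin vs Orin Nano Super (Fact #6)


def estimate_ms(side_px: int) -> tuple:
    """Return (AGX Orin ms, Orin Nano Super low ms, Orin Nano Super high ms)."""
    # The token count at 1/8 scale grows with the square of the side length.
    scale = (side_px / BASE_SIDE_PX) ** 2
    agx = BASE_AGX_MS * scale
    return agx, agx * NANO_SLOWDOWN[0], agx * NANO_SLOWDOWN[1]


for side in (1184, 1280, 640, 480):
    agx, low, high = estimate_ms(side)
    print(f"{side}px: AGX ~{agx:.0f}ms, Orin Nano Super ~{low:.0f}-{high:.0f}ms")
```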
-
-## Comparison 3: Satellite Matching — LiteSAM vs SP+LG vs XFeat
-
-| Dimension | LiteSAM (opt) | SuperPoint+LightGlue | XFeat semi-dense | Factual Basis |
-|-----------|--------------|---------------------|------------------|---------------|
-| Cross-view accuracy | RMSE@30 = 17.86m (UAV-VisLoc) | Worse than LiteSAM (paper confirms) | Not benchmarked on UAV-VisLoc | Fact #8, #9 |
-| Speed on Orin Nano (est.) | ~1.5-2s @ 1184px, ~270-360ms @ 480px | Est. ~100-200ms total | ~50-100ms | Fact #5, #2, existing draft |
-| Cross-view robustness | Designed for satellite-aerial gap | Sparse matcher, "lacks sufficient accuracy" for cross-view | General-purpose, less robust | Fact #9, #13 |
-| Parameters | 6.31M | SP ~1.3M + LG ~7M = ~8.3M | ~5M | Fact #5 |
-| Approach | Semi-dense (coarse-to-fine, subpixel) | Sparse (detect → match → verify) | Semi-dense (detect → KNN → refine) | Fact #1, existing draft |
@@ -1,90 +0,0 @@
-# Reasoning Chain
-
-## Dimension 1: SuperPoint+LightGlue for Visual Odometry
-
-### Fact Confirmation
-cuVSLAM achieves 116fps (~8.6ms/frame) on Orin Nano 8GB at 720p (Fact #1). SP+LG achieves ~10fps on KITTI on desktop GPU (Fact #2). SuperPoint alone takes ~40ms on Jetson for 200 keypoints (Fact #3). LightGlue matching on desktop GPU takes ~20-34ms for 274 keypoints (Fact #2).
-
-### Extrapolation to Orin Nano Super
-On Orin Nano Super, estimating SP+LG pipeline:
- SuperPoint extraction (1024 keypoints, 720p): ~50-80ms (based on Fact #3, scaled for more keypoints)
- LightGlue matching (TensorRT FP16, 1024 keypoints): ~80-200ms (based on Fact #11 — 2-4x speedup over PyTorch, but Orin Nano is ~4-6x slower than RTX 4080)
- Total SP+LG: ~130-280ms per frame
-
-cuVSLAM: ~8.6ms per frame.
-
-SP+LG would be **15-33x slower** than cuVSLAM for visual odometry on Orin Nano Super.
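The slowdown figure can be re-derived from the numbers quoted above. This is a minimal arithmetic sketch; the constant and function names are illustrative, and the stage latencies are the extrapolated estimates from this section, not measurements.

```python
# Per-frame latency of cuVSLAM derived from its reported 116 fps (Fact #1).
CUVSLAM_MS = 1000 / 116

# Estimated stage latencies on Orin Nano Super (low, high), in milliseconds.
SUPERPOINT_MS = (50, 80)
LIGHTGLUE_MS = (80, 200)


def slowdown(pipeline_ms: float, baseline_ms: float = CUVSLAM_MS) -> float:
    """Return how many times slower a pipeline is than the baseline."""
    return pipeline_ms / baseline_ms


# Sum the optimistic and pessimistic ends of both stages.
total_ms = (SUPERPOINT_MS[0] + LIGHTGLUE_MS[0], SUPERPOINT_MS[1] + LIGHTGLUE_MS[1])
low, high = (slowdown(t) for t in total_ms)
print(f"SP+LG total: {total_ms[0]}-{total_ms[1]} ms, slowdown {low:.0f}-{high:.0f}x")
```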
-
-### Additional Considerations
-cuVSLAM includes native IMU integration, loop closure, and auto-fallback. SP+LG provides none of these — they would need custom implementation, adding both development time and latency.
-
-### Conclusion
-**SP+LG is not viable as a cuVSLAM replacement for VO on Orin Nano Super.** cuVSLAM is purpose-built for Jetson and 15-33x faster. SP+LG's value lies in its accuracy for feature matching tasks, not real-time VO on edge hardware.
-
-### Confidence
-✅ High — performance gap is enormous and well-supported by multiple sources.
-
---
-
-## Dimension 2: LiteSAM Speed vs Image Resolution (1280px question)
-
-### Fact Confirmation
-LiteSAM (opt) runs in 497ms on AGX Orin at 1184px (Fact #5). AGX Orin is ~3-4x more powerful than Orin Nano Super (Fact #6). LiteSAM processes at 1/8 scale internally; coarse matching is O(N²), where N is the token count and grows with the square of the image side length (Fact #7).
-
-### Resolution Scaling Analysis
-
-**1280px vs 1184px**: Token count increases from ~21,904 to ~25,600 (+17%). Compute increases ~17-37% (linear to quadratic in token count, depending on the bottleneck). This makes the problem WORSE, not better.
-
-**The user's intuition is likely**: "If 6252×4168 camera images are huge, maybe LiteSAM is slow because we feed it those big images. What if we use 1280px?" But the solution draft already specifies resizing to 480-640px before feeding LiteSAM. The 497ms benchmark on AGX Orin was already at 1184px (the UAV-VisLoc benchmark resolution).
-
-**The real bottleneck is hardware, not image size:**
- At 1184px on AGX Orin: 497ms → on Orin Nano Super: est. **~1.5-2.0s**
- At 1280px on Orin Nano Super: est. **~1.7-2.3s** (WORSE — more tokens)
- At 640px on Orin Nano Super: est. **~450-600ms** (borderline)
- At 480px on Orin Nano Super: est. **~270-360ms** (possibly within 400ms budget)
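The scaling behind these estimates can be sketched as follows. It assumes latency scales with token count raised to an exponent between 1 (linear) and 2 (quadratic), and applies the 3-4x AGX-to-Nano ratio from Fact #6. The function names are illustrative, and the results are extrapolations that only approximate the rounded figures above.

```python
AGX_MS_AT_1184 = 497.0      # LiteSAM (opt) on AGX Orin at 1184px (Fact #5)
NANO_SLOWDOWN = (3.0, 4.0)  # AGX Orin vs Orin Nano Super (Fact #6)


def tokens(side_px: int, stride: int = 8) -> int:
    """Token count for a square image processed at 1/stride scale."""
    return (side_px // stride) ** 2


def agx_ms(side_px: int, exponent: float = 1.0) -> float:
    """Scale the 1184px benchmark by the token ratio raised to `exponent`."""
    ratio = tokens(side_px) / tokens(1184)
    return AGX_MS_AT_1184 * ratio ** exponent


def nano_ms_range(side_px: int, exponent: float = 1.0) -> tuple[float, float]:
    """Estimated (best, worst) latency on Orin Nano Super."""
    base = agx_ms(side_px, exponent)
    return base * NANO_SLOWDOWN[0], base * NANO_SLOWDOWN[1]


for side in (1184, 1280, 640, 480):
    best, worst = nano_ms_range(side)
    print(f"{side}px: {tokens(side)} tokens, est. {best:.0f}-{worst:.0f} ms")
```

Passing `exponent=2.0` gives the pessimistic case where the quadratic coarse-matching stage dominates.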
-
-### Conclusion
-**1280px would make LiteSAM SLOWER, not faster.** The paper benchmarked at 1184px. The bottleneck is the hardware gap (AGX Orin 275 TOPS → Orin Nano Super 67 TOPS). To make LiteSAM fit the 400ms budget, resolution must drop to ~480px, which may significantly degrade cross-view matching accuracy. The original solution draft's approach (benchmark at 480px, abandon if too slow) remains correct.
-
-### Confidence
-✅ High — paper benchmarks + hardware specs provide strong basis.
-
---
-
-## Dimension 3: SP+LG for Satellite Matching (alternative to LiteSAM)
-
-### Fact Confirmation
-LiteSAM paper explicitly states "SP+LG achieves the fastest inference speed but at the expense of accuracy" on satellite-aerial benchmarks (Fact #9). SP+LG is a sparse matcher; the paper notes sparse matchers "lack sufficient accuracy" for cross-view UAV-satellite matching due to texture-scarce regions (Fact #13). LiteSAM achieves RMSE@30 = 17.86m; SP+LG is worse (Fact #10).
-
-### Speed Advantage of SP+LG
-On Orin Nano Super, SP+LG satellite matching pipeline:
- SuperPoint extraction (both images): ~50-80ms × 2 images
- LightGlue matching: ~80-200ms
- Total: ~180-360ms
-
-This fits within the 400ms budget, but accuracy is worse than LiteSAM's.
-
-### Comparison with XFeat
-XFeat semi-dense: ~50-100ms on Orin Nano Super (from existing draft). XFeat is 2-4x faster than SP+LG and also handles semi-dense matching. For the satellite matching role, XFeat is a better "fast fallback" than SP+LG.
-
-### Conclusion
-**SP+LG is not recommended for satellite matching.** It's slower than XFeat and less accurate than LiteSAM for cross-view matching. XFeat remains the better fallback. SP+LG could serve as a third-tier fallback, but the added complexity isn't justified given XFeat's advantages.
-
-### Confidence
-✅ High — direct comparison from the LiteSAM paper.
-
---
-
-## Dimension 4: Other Weak Points in solution_draft01
-
-### cuVSLAM Nadir Camera Concern
-The solution correctly flags cuVSLAM's "nadir-only camera" as untested. cuVSLAM was designed for robotics with forward-facing cameras, whereas a nadir UAV camera looking straight down at terrain sees different motion characteristics. However, cuVSLAM supports arbitrary camera configurations, and its IMU mode should compensate. **Risk is MEDIUM, mitigation is adequate** (XFeat fallback).
-
-### Memory Budget Gap
-The solution estimates ~1.9-2.4GB total. This looks optimistic if cuVSLAM needs to maintain a map for loop closure. The cuVSLAM map grows over time. For a 3000-frame flight (~17 min at 3fps), map memory could grow to 500MB-1GB. **Risk: memory pressure late in flight.** Mitigation: configure cuVSLAM map pruning, limit map size.
-
-### Tile Search Strategy Underspecified
-The solution mentions GeoHash-indexed tiles but doesn't detail how the system determines which tile to match against when ESKF position has high uncertainty (e.g., after VO failure). The expanded search (±1km) could require loading 10-20 tiles, which is slow from storage.
-
-### Confidence
-⚠️ Medium — these are analytical observations, not empirically verified.
@@ -1,52 +0,0 @@
-# Validation Log
-
-## Validation Scenario 1: SP+LG for VO during Normal Flight
-
-A UAV flies straight at 3fps. Each frame needs VO within 400ms.
-
-### Expected Based on Conclusions
-cuVSLAM: processes each frame in ~8.6ms, leaves 391ms for satellite matching and fusion. Immediate VO result via SSE.
-SP+LG: processes each frame in ~130-280ms, leaves ~120-270ms. May interfere with satellite matching CUDA resources.
-
-### Actual Validation
-cuVSLAM is clearly superior. SP+LG offers no advantage here — cuVSLAM is 15-33x faster AND includes IMU fallback. SP+LG would require building a custom VO pipeline around a feature matcher, whereas cuVSLAM is a complete VO solution.
-
-### Counterexamples
-If cuVSLAM fails on nadir camera (its main risk), SP+LG could serve as a fallback VO method. But XFeat frame-to-frame (~30-50ms) is already identified as the cuVSLAM fallback and is 3-6x faster than SP+LG.
-
-## Validation Scenario 2: LiteSAM at 1280px on Orin Nano Super
-
-A keyframe needs satellite matching. Image is resized to 1280px for LiteSAM.
-
-### Expected Based on Conclusions
-LiteSAM at 1280px on Orin Nano Super: ~1.7-2.3s. This is 4-6x over the 400ms budget. Even running async, it means satellite corrections arrive 5-7 frames later.
-
-### Actual Validation
-1280px is LARGER than the paper's 1184px benchmark resolution. The user likely assumed we feed the full camera image (6252×4168) to LiteSAM, causing slowness. But the solution already downsamples. The bottleneck is the hardware performance gap (Orin Nano Super = ~25% of AGX Orin compute).
-
-### Counterexamples
-If LiteSAM's TensorRT FP16 engine with reparameterized MobileOne achieves better optimization than the paper's AMP benchmark (which uses PyTorch, not TensorRT), speed could improve 2-3x. At 480px with TensorRT FP16: potentially ~90-180ms on Orin Nano Super. This is worth benchmarking.
-
-## Validation Scenario 3: SP+LG as Satellite Matcher After LiteSAM Abandonment
-
-LiteSAM fails benchmark. Instead of XFeat, we try SP+LG for satellite matching.
-
-### Expected Based on Conclusions
-SP+LG: ~180-360ms on Orin Nano Super. Accuracy is worse than LiteSAM for cross-view matching.
-XFeat: ~50-100ms. Accuracy is unproven on cross-view but general-purpose semi-dense.
-
-### Actual Validation
-SP+LG is 2-4x slower than XFeat, and the LiteSAM paper confirms its worse accuracy for satellite-aerial matching. XFeat's semi-dense approach is better suited to the texture-scarce regions common in UAV imagery. SP+LG's sparse keypoint detection may fail on agricultural fields or water bodies.
-
-### Counterexamples
-SP+LG could outperform XFeat on high-texture urban areas where sparse features are abundant. But the operational region (eastern Ukraine) is primarily agricultural, making this advantage unlikely.
-
-## Review Checklist
- [x] Draft conclusions consistent with fact cards
- [x] No important dimensions missed
- [x] No over-extrapolation
- [x] Conclusions actionable/verifiable
- [ ] Note: Orin Nano Super estimates are extrapolated from AGX Orin data using the 3-4x compute ratio. Day-one benchmarking remains essential.
-
-## Conclusions Requiring Revision
-None — the original solution draft's architecture (cuVSLAM for VO, benchmark-driven LiteSAM/XFeat for satellite) is confirmed sound. SP+LG is not recommended for either role on this hardware.
@@ -1,102 +0,0 @@
-# Question Decomposition
-
-## Original Question
-Assess solution_draft02.md against updated acceptance criteria and restrictions. The AC and restrictions have been significantly revised to reflect real onboard deployment requirements (MAVLink integration, ground station telemetry, startup/failsafe, object localization, thermal management, satellite imagery specs).
-
-## Active Mode
-Mode B: Solution Assessment — `solution_draft02.md` is the latest draft in OUTPUT_DIR.
-
-## Question Type
-Problem Diagnosis + Decision Support
-
-## Research Subject Boundary
- **Population**: GPS-denied UAV navigation systems on edge hardware (Jetson Orin Nano Super)
- **Geography**: Eastern/southern Ukraine (east of Dnipro River), conflict zone
- **Timeframe**: Current (2025-2026), latest available tools and libraries
- **Level**: Onboard companion computer, real-time flight controller integration via MAVLink
-
-## Key Delta: What Changed in AC/Restrictions
-
-### Restrictions Changes
-1. Two cameras: Navigation (fixed, downward) + AI camera (configurable angle/zoom)
-2. Processing on Jetson Orin Nano Super (was "stationary computer or laptop")
-3. IMU data IS available via flight controller (was "NO data from IMU")
-4. MAVLink protocol via MAVSDK for flight controller communication
-5. Must output GPS_INPUT messages as GPS replacement
-6. Ground station telemetry link available but bandwidth-limited
-7. Thermal throttling must be accounted for
-8. Satellite imagery pre-loaded, storage limited
-
-### Acceptance Criteria Changes
-1. <400ms per frame to flight controller (was <5s for processing)
-2. MAVLink GPS_INPUT output (was REST API + SSE)
-3. Ground station: stream position/confidence, receive re-localization commands
-4. Object localization: trigonometric GPS from AI camera angle/zoom/altitude
-5. Startup: initialize from last known GPS before GPS denial
-6. Failsafe: IMU-only fallback after N seconds of total failure
-7. Reboot recovery: re-initialize from flight controller IMU-extrapolated position
-8. Max cumulative VO drift <100m between satellite anchors
-9. Confidence score per position estimate (high/low)
-10. Satellite imagery: ≥0.5 m/pixel, <2 years old
-11. WGS84 output format
-12. Re-localization via telemetry to ground station (not REST API user input)
-
-## Decomposed Sub-Questions
-
-### Q1: MAVLink GPS_INPUT Integration
- How does MAVSDK Python handle GPS_INPUT messages?
- What fields are required in GPS_INPUT?
- What update rate does the flight controller expect?
- Can we send confidence/accuracy indicators via MAVLink?
- How does this replace the REST API + SSE architecture?
-
-### Q2: Ground Station Telemetry Integration
- How to stream position + confidence over bandwidth-limited telemetry?
- How to receive operator re-localization commands?
- What MAVLink messages support custom data?
- What bandwidth is typical for UAV telemetry links?
-
-### Q3: Startup & Failsafe Mechanisms
- How to initialize from flight controller's last GPS position?
- How to detect GPS denial onset?
- What happens on companion computer reboot mid-flight?
- How to implement IMU-only dead reckoning fallback?
-
-### Q4: Object Localization via AI Camera
- How to compute ground GPS from UAV position + camera angle + zoom + altitude?
- What accuracy can be expected given GPS-denied position error?
- How to handle the API between GPS-denied system and AI detection system?
-
-### Q5: Thermal Management on Jetson Orin Nano Super
- What is sustained thermal performance under GPU load?
- How to monitor and mitigate thermal throttling?
- What power modes are available?
-
-### Q6: VO Drift Budget & Monitoring
- How to measure cumulative drift between satellite anchors?
- How to trigger satellite matching when drift approaches 100m?
- ESKF covariance as drift proxy?
-
-### Q7: Weak Points in Draft02 Architecture
- REST API + SSE architecture is wrong — must be MAVLink
- No ground station integration
- No startup/shutdown procedures
- No thermal management
- No object localization detail for AI camera with configurable angle/zoom
- Memory budget doesn't account for MAVSDK overhead
-
-## Timeliness Sensitivity Assessment
- **Research Topic**: MAVLink integration, MAVSDK for Jetson, ground station telemetry, thermal management
- **Sensitivity Level**: 🟠 High
- **Rationale**: MAVSDK actively developed; MAVLink message set evolving; Jetson JetPack 6.2 specific
- **Source Time Window**: 12 months
- **Priority official sources**:
-  1. MAVSDK Python documentation (mavsdk.io)
-  2. MAVLink message definitions (mavlink.io)
-  3. NVIDIA Jetson Orin Nano thermal documentation
-  4. PX4/ArduPilot GPS_INPUT documentation
- **Key version information**:
-  - MAVSDK-Python: latest PyPI version
-  - MAVLink: v2 protocol
-  - JetPack: 6.2.2
-  - PyCuVSLAM: v15.0.0
@@ -1,175 +0,0 @@
-# Source Registry
-
-## Source #1
- **Title**: MAVSDK-Python Issue #320: Input external GPS through MAVSDK
- **Link**: https://github.com/mavlink/MAVSDK-Python/issues/320
- **Tier**: L4
- **Publication Date**: 2021 (still open 2025)
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: MAVSDK-Python — GPS_INPUT not supported as of v3.15.3
- **Target Audience**: Companion computer developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: MAVSDK-Python does not support GPS_INPUT message. Feature requested but unresolved.
- **Related Sub-question**: Q1
-
-## Source #2
- **Title**: MAVLink GPS_INPUT Message Specification (mavlink_msg_gps_input.h)
- **Link**: https://rflysim.com/doc/en/RflySimAPIs/RflySimSDK/html/mavlink__msg__gps__input_8h_source.html
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: MAVLink v2, Message ID 232
- **Target Audience**: MAVLink developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: GPS_INPUT message fields: lat/lon (1E7), alt, fix_type, horiz_accuracy, vert_accuracy, speed_accuracy, hdop, vdop, satellites_visible, velocities NED, yaw, ignore_flags.
- **Related Sub-question**: Q1
-
-## Source #3
- **Title**: ArduPilot GPS Input MAVProxy Documentation
- **Link**: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: ArduPilot GPS1_TYPE=14
- **Target Audience**: ArduPilot companion computer developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: ArduPilot accepts GPS_INPUT only with GPS1_TYPE=14. The MAVProxy GPSInput module accepts JSON over UDP port 25100 and relays it as GPS_INPUT. Fields: lat, lon, alt, fix_type, hdop, timestamps.
- **Related Sub-question**: Q1
-
-## Source #4
- **Title**: pymavlink GPS_INPUT example
- **Link**: https://webperso.ensta.fr/lebars/Share/GPS_INPUT_pymavlink.py
- **Tier**: L3
- **Publication Date**: 2023
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: pymavlink
- **Target Audience**: Companion computer developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Complete pymavlink example for sending GPS_INPUT with all fields including yaw. Uses gps_input_send() method.
- **Related Sub-question**: Q1
-
-## Source #5
- **Title**: ArduPilot AP_GPS_Params.cpp — GPS_RATE_MS
- **Link**: https://cocalc.com/github/ardupilot/ardupilot/blob/master/libraries/AP_GPS/AP_GPS_Params.cpp
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: ArduPilot master
- **Target Audience**: ArduPilot developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: GPS_RATE_MS default 200ms (5Hz), range 50-200ms (5-20Hz). Below 5Hz not allowed.
- **Related Sub-question**: Q1
-
-## Source #6
- **Title**: MAVLink Telemetry Bandwidth Optimization Issue #1605
- **Link**: https://github.com/mavlink/mavlink/issues/1605
- **Tier**: L2
- **Publication Date**: 2022
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: MAVLink protocol developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Minimal telemetry requires ~12kbit/s. Optimized ~6kbit/s. SiK at 57600 baud leaves ~21% of the link (~12kbit/s) as usable budget. RFD900 for long range (15km+).
- **Related Sub-question**: Q2
-
-## Source #7
- **Title**: NVIDIA JetPack 6.2 Super Mode Blog
- **Link**: https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/
- **Tier**: L1
- **Publication Date**: 2025-01
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: JetPack 6.2, Orin Nano Super
- **Target Audience**: Jetson developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: MAXN SUPER mode for peak performance. Thermal throttling at 80°C. Power modes: 15W, 25W, MAXN SUPER. Up to 1.7x AI boost.
- **Related Sub-question**: Q5
-
-## Source #8
- **Title**: Jetson Orin Nano Power Consumption Analysis
- **Link**: https://edgeaistack.app/blog/jetson-orin-nano-power-consumption/
- **Tier**: L3
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Jetson edge deployment engineers
- **Research Boundary Match**: ✅ Full match
- **Summary**: 5W idle, 8-12W typical inference, 25W peak. Throttling above 80°C drops GPU from 1GHz to 300MHz. Active cooling required for sustained loads.
- **Related Sub-question**: Q5
-
-## Source #9
- **Title**: UAV Target Geolocation (Sensors 2022)
- **Link**: https://www.mdpi.com/1424-8220/22/5/1903
- **Tier**: L1
- **Publication Date**: 2022
- **Timeliness Status**: ✅ Currently valid (math doesn't change)
- **Target Audience**: UAV reconnaissance systems
- **Research Boundary Match**: ✅ Full match
- **Summary**: Trigonometric target geolocation from camera angle, altitude, UAV position. Iterative refinement improves accuracy 22-38x.
- **Related Sub-question**: Q4
-
-## Source #10
- **Title**: pymavlink vs MAVSDK-Python for custom messages (Issue #739)
- **Link**: https://github.com/mavlink/MAVSDK-Python/issues/739
- **Tier**: L4
- **Publication Date**: 2024-12
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: MAVSDK-Python, pymavlink
- **Target Audience**: Companion computer developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: MAVSDK-Python lacks custom message support. pymavlink recommended for GPS_INPUT and custom messages. MAVSDK v4 may add MavlinkDirect plugin.
- **Related Sub-question**: Q1
-
-## Source #11
- **Title**: NAMED_VALUE_FLOAT for custom telemetry (PR #18501)
- **Link**: https://github.com/ArduPilot/ardupilot/pull/18501
- **Tier**: L2
- **Publication Date**: 2022
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: ArduPilot companion computer developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: NAMED_VALUE_FLOAT messages from companion computer are logged by ArduPilot and forwarded to GCS. Useful for custom telemetry data.
- **Related Sub-question**: Q2
-
-## Source #12
- **Title**: ArduPilot Companion Computer UART Connection
- **Link**: https://ardupilot.org/dev/docs/raspberry-pi-via-mavlink.html
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: ArduPilot companion computer developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Connect via TELEM2 UART. SERIAL2_PROTOCOL=2 (MAVLink2). Baud up to 1.5Mbps. TX/RX crossover.
- **Related Sub-question**: Q1, Q2
-
-## Source #13
- **Title**: Jetson Orin Nano UART with ArduPilot
- **Link**: https://forums.developer.nvidia.com/t/uart-connection-between-jetson-nano-orin-and-ardupilot/325416
- **Tier**: L4
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: JetPack 6.x, Orin Nano
- **Target Audience**: Jetson + ArduPilot integration
- **Research Boundary Match**: ✅ Full match
- **Summary**: UART instability reported on Orin Nano with ArduPilot. Use /dev/ttyTHS0 or /dev/ttyTHS1. Check pinout carefully.
- **Related Sub-question**: Q1
-
-## Source #14
- **Title**: MAVSDK-Python v3.15.3 PyPI (aarch64 wheels)
- **Link**: https://pypi.org/project/mavsdk/
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Version Info**: v3.15.3, manylinux2014_aarch64
- **Target Audience**: MAVSDK Python developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: MAVSDK-Python has aarch64 wheels. pip install works on Jetson. But no GPS_INPUT support.
- **Related Sub-question**: Q1
-
-## Source #15
- **Title**: ArduPilot receive COMMAND_LONG on companion computer
- **Link**: https://discuss.ardupilot.org/t/recieve-mav-cmd-on-companion-computer/48928
- **Tier**: L4
- **Publication Date**: 2020
- **Timeliness Status**: ⚠️ Needs verification (old but concept still valid)
- **Target Audience**: ArduPilot companion computer developers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Companion computer can receive COMMAND_LONG messages from GCS via MAVLink. ArduPilot scripting can intercept specific command IDs.
- **Related Sub-question**: Q2
@@ -1,105 +0,0 @@
-# Fact Cards
-
-## Fact #1
- **Statement**: MAVSDK-Python (v3.15.3) does NOT support sending GPS_INPUT MAVLink messages. The feature has been requested since 2021 and remains unresolved. Custom message support is planned for MAVSDK v4 but not available in Python wrapper.
- **Source**: Source #1, #10, #14
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — confirmed by MAVSDK maintainers
- **Related Dimension**: Flight Controller Integration
-
-## Fact #2
- **Statement**: pymavlink provides full access to all MAVLink messages including GPS_INPUT via `mav.gps_input_send()`. It is the recommended library for companion computers that need to send GPS_INPUT messages.
- **Source**: Source #4, #10
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — working examples exist
- **Related Dimension**: Flight Controller Integration
-
-## Fact #3
- **Statement**: GPS_INPUT (MAVLink msg ID 232) contains: lat/lon (WGS84, degrees×1E7), alt (AMSL), fix_type (0-8), horiz_accuracy (m), vert_accuracy (m), speed_accuracy (m/s), hdop, vdop, satellites_visible, vn/ve/vd (NED m/s), yaw (centidegrees), gps_id, ignore_flags.
- **Source**: Source #2
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — official MAVLink spec
- **Related Dimension**: Flight Controller Integration
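The unit conventions in Fact #3 are easy to get wrong, so here is a hedged sketch of a helper that converts natural units into GPS_INPUT field values. `build_gps_input_fields` and its defaults are hypothetical. The field names follow the MAVLink GPS_INPUT definition, and the trailing comment assumes pymavlink's generated `gps_input_send()` accepts those names as keyword arguments.

```python
import time


def build_gps_input_fields(lat_deg, lon_deg, alt_m, vn, ve, vd,
                           horiz_acc_m, vert_acc_m, speed_acc_ms,
                           yaw_deg=None, fix_type=3, satellites_visible=10,
                           gps_id=0, ignore_flags=0, time_usec=None):
    """Return GPS_INPUT fields as a dict, scaled to the wire units."""
    if time_usec is None:
        time_usec = int(time.time() * 1e6)
    if yaw_deg is None:
        yaw_cdeg = 0  # 0 means "yaw not available" in GPS_INPUT
    else:
        yaw_cdeg = int(round((yaw_deg % 360.0) * 100))
        if yaw_cdeg == 0:
            yaw_cdeg = 36000  # the message definition uses 36000 for north
    return {
        "time_usec": time_usec,
        "gps_id": gps_id,
        "ignore_flags": ignore_flags,
        "time_week_ms": 0,  # placeholder: GPS time is not modelled here
        "time_week": 0,     # placeholder: GPS week is not modelled here
        "fix_type": fix_type,                # 3 = 3D fix
        "lat": int(round(lat_deg * 1e7)),    # degrees * 1E7
        "lon": int(round(lon_deg * 1e7)),    # degrees * 1E7
        "alt": float(alt_m),                 # metres AMSL
        "hdop": 1.0,                         # assumed placeholder value
        "vdop": 1.0,                         # assumed placeholder value
        "vn": float(vn), "ve": float(ve), "vd": float(vd),  # NED, m/s
        "speed_accuracy": float(speed_acc_ms),
        "horiz_accuracy": float(horiz_acc_m),
        "vert_accuracy": float(vert_acc_m),
        "satellites_visible": satellites_visible,
        "yaw": yaw_cdeg,                     # centidegrees
    }


# With a pymavlink connection `conn` (not created here), sending would
# look like: conn.mav.gps_input_send(**fields)
fields = build_gps_input_fields(48.5, 35.0, 400.0, 1.0, 2.0, 0.0,
                                horiz_acc_m=15.0, vert_acc_m=20.0,
                                speed_acc_ms=1.0, yaw_deg=90.0)
print(fields["lat"], fields["lon"], fields["yaw"])
```

Keeping the scaling in one pure function makes it testable without a flight controller or a MAVLink link.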
-
-## Fact #4
- **Statement**: ArduPilot requires GPS1_TYPE=14 (MAVLink) to accept GPS_INPUT messages from a companion computer. Connection via TELEM2 UART, SERIAL2_PROTOCOL=2 (MAVLink2).
- **Source**: Source #3, #12
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — official ArduPilot documentation
- **Related Dimension**: Flight Controller Integration
-
-## Fact #5
- **Statement**: ArduPilot GPS update rate (GPS_RATE_MS) default is 200ms (5Hz), range 50-200ms (5-20Hz). Our camera runs at 3fps (one frame every 333ms), so frame-driven GPS_INPUT would arrive at only 3Hz, below ArduPilot's 5Hz minimum. We must interpolate or predict between camera frames to reach 5Hz.
- **Source**: Source #5
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — ArduPilot source code
- **Related Dimension**: Flight Controller Integration
-
-## Fact #6
- **Statement**: GPS_INPUT horiz_accuracy field directly maps to our confidence scoring. We can report: satellite-anchored ≈ 10-20m accuracy, VO-extrapolated ≈ 20-50m, IMU-only ≈ 100m+. ArduPilot EKF uses this for fusion weighting internally.
- **Source**: Source #2, #3
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ⚠️ Medium — accuracy mapping is estimated, EKF weighting not fully documented
- **Related Dimension**: Flight Controller Integration
-
-## Fact #7
- **Statement**: Typical UAV telemetry bandwidth: SiK radio at 57600 baud provides ~12kbit/s usable for MAVLink. RFD900 provides long range (15km+) at similar data rates. Position telemetry must be compact — ~50 bytes per position update.
- **Source**: Source #6
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — MAVLink protocol specs
- **Related Dimension**: Ground Station Telemetry
-
-## Fact #8
- **Statement**: NAMED_VALUE_FLOAT MAVLink message can stream custom telemetry from companion computer to ground station. ArduPilot logs and forwards these. Mission Planner displays them. Useful for confidence score, drift status, matching status.
- **Source**: Source #11
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — ArduPilot merged PR
- **Related Dimension**: Ground Station Telemetry
-
-## Fact #9
- **Statement**: Jetson Orin Nano Super throttles GPU from 1GHz to ~300MHz when junction temperature exceeds 80°C. Active cooling (fan) required for sustained load. Power consumption: 5W idle, 8-12W typical inference, 25W peak. Modes: 15W, 25W, MAXN SUPER.
- **Source**: Source #7, #8
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — NVIDIA official
- **Related Dimension**: Thermal Management
-
-## Fact #10
- **Statement**: Jetson Orin Nano UART connection to ArduPilot uses /dev/ttyTHS0 or /dev/ttyTHS1. UART instability reported on some units — verify pinout, use JetPack 6.2.2+. Baud up to 1.5Mbps supported.
- **Source**: Source #12, #13
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ⚠️ Medium — UART instability is a known issue with workarounds
- **Related Dimension**: Flight Controller Integration
-
-## Fact #11
- **Statement**: Object geolocation from UAV: for nadir (downward) camera, pixel offset from center → meters via GSD → rotate by heading → add to UAV GPS. For oblique (AI) camera with angle θ from vertical: ground_distance = altitude × tan(θ). Combined with zoom → effective focal length → pixel-to-meter conversion. Flat terrain assumption simplifies to basic trigonometry.
- **Source**: Source #9
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — well-established trigonometry
- **Related Dimension**: Object Localization
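The flat-terrain projection in Fact #11 can be sketched as below. It assumes altitude is height above the (flat) ground, a spherical Earth, and a camera bearing already combined with UAV heading. Zoom and pixel offsets are left out, and all function names are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # spherical-Earth approximation


def ground_offset(altitude_m: float, tilt_deg: float, bearing_deg: float):
    """Return the (north, east) offset in metres of the point the camera centre sees.

    tilt_deg is the camera angle from vertical; bearing_deg is clockwise from north.
    """
    if not 0.0 <= tilt_deg < 90.0:
        raise ValueError("tilt must be in [0, 90) degrees for a ground intersection")
    distance = altitude_m * math.tan(math.radians(tilt_deg))
    bearing = math.radians(bearing_deg)
    return distance * math.cos(bearing), distance * math.sin(bearing)


def target_latlon(uav_lat: float, uav_lon: float, altitude_m: float,
                  tilt_deg: float, bearing_deg: float):
    """Add the ground offset to the UAV position, small-offset approximation."""
    north, east = ground_offset(altitude_m, tilt_deg, bearing_deg)
    lat = uav_lat + math.degrees(north / EARTH_RADIUS_M)
    lon = uav_lon + math.degrees(
        east / (EARTH_RADIUS_M * math.cos(math.radians(uav_lat))))
    return lat, lon


print(target_latlon(48.0, 35.0, altitude_m=400.0, tilt_deg=45.0, bearing_deg=90.0))
```

Because the target position is added to the UAV's own estimate, any GPS-denied position error carries over to the target one-to-one, on top of the projection error.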
-
-## Fact #12
- **Statement**: Companion computer can receive COMMAND_LONG from ground station via MAVLink. For re-localization hints: ground station sends a custom command with approximate lat/lon, companion computer receives it via pymavlink message listener.
- **Source**: Source #15
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ⚠️ Medium — specific implementation for re-localization hint would be custom
- **Related Dimension**: Ground Station Telemetry
-
-## Fact #13
- **Statement**: The restrictions.md now says "using MAVSDK library" but MAVSDK-Python cannot send GPS_INPUT. pymavlink is the only viable Python option for GPS_INPUT. This is a restriction conflict that must be resolved — use pymavlink for GPS_INPUT (core function) or accept MAVSDK + pymavlink hybrid.
- **Source**: Source #1, #2, #10
- **Phase**: Assessment
- **Target Audience**: GPS-Denied system developers
- **Confidence**: ✅ High — confirmed limitation
- **Related Dimension**: Flight Controller Integration
@@ -1,62 +0,0 @@
-# Comparison Framework
-
-## Selected Framework Type
-Problem Diagnosis + Decision Support (Mode B)
-
-## Weak Point Dimensions (from draft02 → new AC/restrictions)
-
-### Dimension 1: Output Architecture (CRITICAL)
-Draft02 uses FastAPI + SSE to stream positions to clients.
-New AC requires MAVLink GPS_INPUT to flight controller as PRIMARY output.
-Entire output architecture must change.
-
-### Dimension 2: Ground Station Communication (CRITICAL)
-Draft02 has no ground station integration.
-New AC requires: stream position+confidence via telemetry, receive re-localization commands.
-
-### Dimension 3: MAVLink Library Choice (CRITICAL)
-Restrictions say "MAVSDK library" but MAVSDK-Python cannot send GPS_INPUT.
-Must use pymavlink for core function.
-
-### Dimension 4: GPS Update Rate (HIGH)
-Camera at 3fps → 3Hz position updates. ArduPilot minimum GPS rate is 5Hz.
-Need IMU-based interpolation between camera frames.
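One way to picture the rate gap is a sender that emits positions at 5Hz by propagating the last vision fix forward. The sketch below uses constant-velocity dead reckoning as a stand-in for real IMU propagation. `Fix`, `extrapolate`, and `output_samples` are hypothetical names, and the Earth model is spherical.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # spherical-Earth approximation


@dataclass
class Fix:
    """Last vision-derived position and velocity."""
    t: float    # timestamp, seconds
    lat: float  # degrees
    lon: float  # degrees
    vn: float   # north velocity, m/s
    ve: float   # east velocity, m/s


def extrapolate(fix: Fix, t: float) -> tuple[float, float]:
    """Propagate the fix to time t assuming constant velocity."""
    dt = t - fix.t
    dlat = math.degrees(fix.vn * dt / EARTH_RADIUS_M)
    dlon = math.degrees(
        fix.ve * dt / (EARTH_RADIUS_M * math.cos(math.radians(fix.lat))))
    return fix.lat + dlat, fix.lon + dlon


def output_samples(fix: Fix, t_end: float, rate_hz: float = 5.0):
    """Return (t, lat, lon) samples from fix.t to t_end at rate_hz."""
    period = 1.0 / rate_hz
    count = int((t_end - fix.t) / period + 1e-9) + 1
    return [(fix.t + i * period, *extrapolate(fix, fix.t + i * period))
            for i in range(count)]


fix = Fix(t=0.0, lat=48.0, lon=35.0, vn=20.0, ve=0.0)
for t, lat, lon in output_samples(fix, t_end=1.0):
    print(f"t={t:.1f}s lat={lat:.7f} lon={lon:.7f}")
```

In a real system each new camera fix would replace `fix`, and the reported horizontal accuracy would presumably grow with the time since the last fix.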
-
-### Dimension 5: Startup & Failsafe (HIGH)
-Draft02 has no initialization or failsafe procedures.
-New AC requires: init from last GPS, reboot recovery, IMU fallback after N seconds.
-
-### Dimension 6: Object Localization (MEDIUM)
-Draft02 has basic pixel-to-GPS for navigation camera only.
-New AC adds AI camera with configurable angle, zoom — trigonometric projection needed.
-
-### Dimension 7: Thermal Management (MEDIUM)
-Draft02 ignores thermal throttling.
-Jetson Orin Nano Super throttles at 80°C, which can cut the GPU clock roughly 3x (from 1GHz to ~300MHz, per Fact #9).
-
-### Dimension 8: VO Drift Budget Monitoring (MEDIUM)
-New AC: max cumulative VO drift <100m between satellite anchors.
-Draft02 uses ESKF covariance but doesn't explicitly track drift budget.
-
-### Dimension 9: Satellite Imagery Specs (LOW)
-New AC: ≥0.5 m/pixel, <2 years old. Draft02 uses Google Maps zoom 18-19 which is ~0.3-0.6 m/pixel.
-Mostly compatible, needs explicit validation.
-
-### Dimension 10: API for Internal Systems (LOW)
-Object localization requests from AI systems need a local IPC mechanism.
-FastAPI could be retained for local-only inter-process communication.
-
-## Initial Population
-
-| Dimension | Draft02 State | Required State | Gap Severity |
-|-----------|--------------|----------------|-------------|
-| Output Architecture | FastAPI + SSE to client | MAVLink GPS_INPUT to flight controller | CRITICAL — full redesign |
-| Ground Station | None | Bidirectional MAVLink telemetry | CRITICAL — new component |
-| MAVLink Library | Not applicable (no MAVLink) | pymavlink (MAVSDK can't do GPS_INPUT) | CRITICAL — new dependency |
-| GPS Update Rate | 3fps → ~3Hz output | ≥5Hz to ArduPilot EKF | HIGH — need IMU interpolation |
-| Startup & Failsafe | None | Init from GPS, reboot recovery, IMU fallback | HIGH — new procedures |
-| Object Localization | Basic nadir pixel-to-GPS | AI camera angle+zoom trigonometry | MEDIUM — extend existing |
-| Thermal Management | Not addressed | Monitor + mitigate throttling | MEDIUM — operational concern |
-| VO Drift Budget | ESKF covariance only | Explicit <100m tracking + trigger | MEDIUM — extend ESKF |
-| Satellite Imagery Specs | Google Maps zoom 18-19 | ≥0.5 m/pixel, <2 years | LOW — mostly met |
-| Internal IPC | REST API | Lightweight local API or shared memory | LOW — simplify from draft02 |
@@ -1,202 +0,0 @@
-# Reasoning Chain
-
-## Dimension 1: Output Architecture
-
-### Fact Confirmation
-Per Fact #3, GPS_INPUT (MAVLink msg ID 232) accepts lat/lon in WGS84 (degrees×1E7), altitude, fix_type, accuracy fields, and NED velocities. Per Fact #4, ArduPilot uses GPS1_TYPE=14 to accept MAVLink GPS input. The flight controller's EKF fuses this as if it were a real GPS module.
-
-### Reference Comparison
-Draft02 uses FastAPI + SSE to stream position data to a REST client. The new AC requires the system to output GPS coordinates directly to the flight controller via MAVLink GPS_INPUT, replacing the real GPS module. The flight controller then uses these coordinates for navigation/autopilot functions. The ground station receives position data indirectly via the flight controller's telemetry forwarding.
-
-### Conclusion
-The entire output architecture must change from REST API + SSE → pymavlink GPS_INPUT sender. FastAPI is no longer the primary output mechanism. It may be retained only for local IPC with other onboard AI systems (object localization requests). The SSE streaming to external clients is replaced by MAVLink telemetry forwarding through the flight controller.
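
As a rough illustration of the unit conversions GPS_INPUT expects, the sketch below builds a plain dictionary of field values. The field names follow the MAVLink GPS_INPUT definition; the helper name `build_gps_input_fields` and the subset of fields shown are assumptions made for illustration, and the actual send through pymavlink is deliberately left out.

```python
def build_gps_input_fields(lat_deg, lon_deg, alt_m, vn, ve, vd,
                           horiz_accuracy_m, fix_type=3):
    """Map an estimator state to GPS_INPUT-style field values.

    Hypothetical helper for illustration: it returns a dict and sends nothing.
    """
    if not (-90.0 <= lat_deg <= 90.0 and -180.0 <= lon_deg <= 180.0):
        raise ValueError("latitude/longitude out of range")
    return {
        "fix_type": fix_type,                       # 3 = 3D fix, 2 = 2D fix
        "lat": int(round(lat_deg * 1e7)),           # WGS84 latitude, degrees * 1e7
        "lon": int(round(lon_deg * 1e7)),           # WGS84 longitude, degrees * 1e7
        "alt": float(alt_m),                        # altitude in metres
        "vn": float(vn),                            # NED velocity components, m/s
        "ve": float(ve),
        "vd": float(vd),
        "horiz_accuracy": float(horiz_accuracy_m),  # horizontal accuracy in metres
    }


fields = build_gps_input_fields(48.27, 37.38, 400.0, 10.0, 0.0, 0.0, 20.0)
print(fields["lat"], fields["lon"])
```

The real message carries further fields (timestamps, `ignore_flags`, dilution-of-precision values, satellite count) that a sender would also have to populate.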
-
-### Confidence
-✅ High — clear requirement change backed by MAVLink specification
-
---
-
-## Dimension 2: Ground Station Communication
-
-### Fact Confirmation
-Per Fact #7, typical telemetry bandwidth is ~12kbit/s (SiK). Per Fact #8, NAMED_VALUE_FLOAT can stream custom values from companion to GCS. Per Fact #12, COMMAND_LONG can deliver commands from GCS to companion.
-
-### Reference Comparison
-Draft02 has no ground station integration. The new AC requires:
-1. Stream position + confidence to ground station (passive, via telemetry forwarding of GPS_INPUT data + custom NAMED_VALUE_FLOAT for confidence/drift)
-2. Receive re-localization commands from operator (active, via COMMAND_LONG or custom MAVLink message)
-
-### Conclusion
-Ground station communication uses MAVLink messages forwarded through the flight controller's telemetry radio. Position data flows automatically (flight controller forwards GPS data to GCS). Custom telemetry (confidence, drift, status) uses NAMED_VALUE_FLOAT. Re-localization hints from operator use a custom COMMAND_LONG with lat/lon payload. Bandwidth is tight (~12kbit/s) so minimize custom message frequency (1-2Hz max for NAMED_VALUE_FLOAT).
-
-### Confidence
-✅ High — standard MAVLink patterns
-
---
-
-## Dimension 3: MAVLink Library Choice
-
-### Fact Confirmation
-Per Fact #1, MAVSDK-Python v3.15.3 does NOT support GPS_INPUT. Per Fact #2, pymavlink provides full GPS_INPUT support via `mav.gps_input_send()`. Per Fact #13, the restrictions say "using MAVSDK library" but MAVSDK literally cannot do the core function.
-
-### Reference Comparison
-MAVSDK is a higher-level abstraction over MAVLink. pymavlink is the lower-level direct MAVLink implementation. For GPS_INPUT (our core output), only pymavlink works.
-
-### Conclusion
-Use **pymavlink** as the MAVLink library. The restriction mentioning MAVSDK must be noted as a conflict — pymavlink is the only viable option for GPS_INPUT in Python. pymavlink is lightweight, pure Python, works on aarch64, and provides full access to all MAVLink messages. MAVSDK v4 may add custom message support in the future but is not available now.
-
-### Confidence
-✅ High — confirmed limitation, clear alternative
-
---
-
-## Dimension 4: GPS Update Rate
-
-### Fact Confirmation
-Per Fact #5, ArduPilot's GPS_RATE_MS allows an update interval of at most 200ms, i.e. a minimum rate of 5Hz. Our camera shoots at ~3fps (333ms), so we produce a full VO+ESKF position estimate per frame at ~3Hz.
-
-### Reference Comparison
-3Hz < 5Hz minimum. ArduPilot's EKF expects at least 5Hz GPS updates for stable fusion.
-
-### Conclusion
-Between camera frames, use IMU prediction from the ESKF to interpolate position at 5Hz (or higher, e.g., 10Hz). The ESKF already runs IMU prediction at 100+Hz internally. Simply emit the ESKF predicted state as GPS_INPUT at 5-10Hz. Camera frame updates (3Hz) provide the measurement corrections. This is standard in sensor fusion: prediction runs fast, measurements arrive slower. The `fix_type` field can differentiate: camera-corrected frames → fix_type=3 (3D), IMU-predicted → fix_type=2 (2D) or adjust horiz_accuracy to reflect lower confidence.
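
The interleaving of 3Hz camera corrections with 5Hz output can be sketched as a simple schedule. This is a minimal sketch under idealised assumptions: frames arrive exactly at k/3 s, processing latency is zero, and the function name `emit_schedule` is hypothetical. Each output tick is tagged `fix_type=3` if a new camera frame arrived since the previous tick, otherwise `fix_type=2`.

```python
from fractions import Fraction


def emit_schedule(n_ticks, camera_hz=3, output_hz=5):
    """Return (time_s, fix_type) for each output tick.

    fix_type 3: a camera frame arrived since the previous tick.
    fix_type 2: the tick carries an IMU-only prediction.
    """
    schedule = []
    frames_seen = 0  # camera frames already consumed by earlier ticks
    for j in range(n_ticks):
        t = Fraction(j, output_hz)  # exact tick time, avoids float drift
        # Frame k arrives at k / camera_hz, so frames 0..floor(t * camera_hz)
        # have arrived by time t.
        frames_until_t = int(t * camera_hz) + 1
        fresh = frames_until_t > frames_seen
        frames_seen = frames_until_t
        schedule.append((float(t), 3 if fresh else 2))
    return schedule


for time_s, fix_type in emit_schedule(5):
    print(f"{time_s:.1f}s fix_type={fix_type}")
```

Over one second this gives three camera-corrected ticks and two predicted ticks, which matches the 3Hz measurement rate inside a 5Hz output stream.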
-
-### Confidence
-✅ High — standard sensor fusion approach
-
---
-
-## Dimension 5: Startup & Failsafe
-
-### Fact Confirmation
-Per new AC: system initializes from last known GPS before GPS denial. On reboot: re-initialize from flight controller's IMU-extrapolated position. On total failure for N seconds: flight controller falls back to IMU-only.
-
-### Reference Comparison
-Draft02 has no startup or failsafe procedures. The system was assumed to already know its position at session start.
-
-### Conclusion
-Startup sequence:
-1. On boot, connect to flight controller via pymavlink
-2. Read current GPS position from flight controller (GLOBAL_POSITION_INT or GPS_RAW_INT message)
-3. Initialize ESKF state with this position
-4. Begin cuVSLAM initialization with first camera frames
-5. Start sending GPS_INPUT once ESKF has a valid position estimate
-
-Failsafe:
-1. If no position estimate for N seconds → stop sending GPS_INPUT (flight controller auto-detects GPS loss and falls back to IMU)
-2. Log failure event
-3. Continue attempting VO/satellite matching
-
-Reboot recovery:
-1. On companion computer reboot, reconnect to flight controller
-2. Read the current GLOBAL_POSITION_INT (the flight controller's fused estimate, now IMU-extrapolated; GPS_RAW_INT only carries raw GPS sensor values)
-3. Re-initialize ESKF with this position (lower confidence)
-4. Resume normal operation
-
-### Confidence
-✅ High — standard autopilot integration patterns
-
---
-
-## Dimension 6: Object Localization
-
-### Fact Confirmation
-Per Fact #11, for oblique camera: ground_distance = altitude × tan(θ) where θ is angle from vertical. Combined with camera azimuth (yaw + camera pan angle) gives direction. With zoom, effective FOV narrows → higher pixel-to-meter resolution.
-
-### Reference Comparison
-Draft02 has basic nadir-only projection: pixel offset × GSD → meters → rotate by heading → lat/lon. The AI camera has configurable angle and zoom, so this needs extension.
-
-### Conclusion
-Object localization for AI camera:
-1. Get current UAV position from GPS-Denied system
-2. Get AI camera params: pan angle (azimuth relative to heading), tilt angle (from vertical), zoom level (→ effective focal length)
-3. Get pixel coordinates of detected object in AI camera frame
-4. Compute: a) bearing = UAV heading + camera pan angle + pixel horizontal offset angle, b) ground_distance = altitude × tan(tilt + pixel vertical offset angle), which assumes flat terrain, c) convert bearing + distance to lat/lon offset from UAV position
-5. Accuracy inherits GPS-Denied position error + projection error from altitude/angle uncertainty
-
-Expose as lightweight local API (Unix socket or shared memory for speed, or simple HTTP on localhost).
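
The flat-terrain projection in steps 1-4 can be sketched as follows. This is an illustrative sketch, not the project's implementation: the function name, the argument names, and the spherical-Earth small-offset conversion are assumptions, and angles are taken in degrees with tilt measured from vertical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def localize_object(uav_lat, uav_lon, altitude_m, heading_deg,
                    pan_deg, tilt_deg, px_h_deg=0.0, px_v_deg=0.0):
    """Project a detected pixel onto flat ground and return (lat, lon, distance_m)."""
    bearing = math.radians((heading_deg + pan_deg + px_h_deg) % 360.0)
    off_nadir = math.radians(tilt_deg + px_v_deg)
    if not 0.0 <= off_nadir < math.pi / 2:
        raise ValueError("line of sight must point below the horizon")

    # Horizontal distance from the point below the UAV to the object.
    ground_distance = altitude_m * math.tan(off_nadir)

    # Split into north/east offsets, then convert metres to degrees.
    north_m = ground_distance * math.cos(bearing)
    east_m = ground_distance * math.sin(bearing)
    lat = uav_lat + math.degrees(north_m / EARTH_RADIUS_M)
    lon = uav_lon + math.degrees(
        east_m / (EARTH_RADIUS_M * math.cos(math.radians(uav_lat))))
    return lat, lon, ground_distance


lat, lon, dist = localize_object(48.27, 37.38, 500.0, 0.0, 0.0, 30.0)
print(f"{dist:.1f} m north of the UAV at ({lat:.6f}, {lon:.6f})")
```

The error grows quickly as the line of sight approaches the horizon, because tan() becomes steep; a real implementation would probably want to cap the usable tilt.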
-
-### Confidence
-✅ High — well-established trigonometry, flat terrain simplifies
-
---
-
-## Dimension 7: Thermal Management
-
-### Fact Confirmation
-Per Fact #9, Jetson Orin Nano Super throttles at 80°C junction temperature, dropping GPU from ~1GHz to ~300MHz (3x slowdown). Active cooling required. Power modes: 15W, 25W, MAXN SUPER.
-
-### Reference Comparison
-Draft02 ignores thermal constraints. Our pipeline (cuVSLAM ~9ms + satellite matcher ~50-200ms) runs on GPU continuously at 3fps. This is moderate but sustained load.
-
-### Conclusion
-Mitigation:
-1. Use 25W power mode (not MAXN SUPER) for stable sustained performance
-2. Require active cooling (5V fan, should be standard on any UAV companion computer mount)
-3. Monitor temperature via tegrastats/jtop at runtime
-4. If temp >75°C: reduce satellite matching frequency (every 5-10 frames instead of 3)
-5. If temp >80°C: skip satellite matching entirely, rely on VO+IMU only (cuVSLAM at 9ms is low power)
-6. Our total GPU time per 333ms frame: ~9ms cuVSLAM + ~50-200ms satellite match (async) = roughly 18-63% GPU utilization even if matching ran on every frame → thermal throttling unlikely with proper cooling
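
The temperature thresholds in items 4 and 5 amount to a small policy function. A minimal sketch, assuming the temperature has already been read elsewhere; the function name and the default intervals (3 frames normally, 10 when warm) are illustrative choices within the ranges mentioned above.

```python
def satellite_match_interval(temp_c, normal_interval=3, throttled_interval=10):
    """Return how many frames to wait between satellite matches.

    Returns None when satellite matching should be skipped entirely
    and the system should rely on VO + IMU only.
    """
    if temp_c > 80.0:
        return None                 # too hot: skip satellite matching
    if temp_c > 75.0:
        return throttled_interval   # warm: match less often
    return normal_interval          # normal operation


for temp in (60.0, 78.0, 85.0):
    print(temp, satellite_match_interval(temp))
```

In practice some hysteresis around the thresholds would likely be needed so the interval does not flip back and forth near 75°C.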
-
-### Confidence
-⚠️ Medium — actual thermal behavior depends on airflow in UAV enclosure, ambient temperature in-flight
-
---
-
-## Dimension 8: VO Drift Budget Monitoring
-
-### Fact Confirmation
-New AC: max cumulative VO drift between satellite correction anchors < 100m. The ESKF maintains a position covariance matrix that grows during VO-only periods and shrinks on satellite corrections.
-
-### Reference Comparison
-Draft02 uses ESKF covariance for keyframe selection (trigger satellite match when covariance exceeds threshold) but doesn't explicitly track drift as a budget.
-
-### Conclusion
-Use ESKF position covariance diagonal (σ_x² + σ_y²) as the drift estimate. When √(σ_x² + σ_y²) approaches 100m:
-1. Force satellite matching on every frame (not just keyframes)
-2. Report LOW confidence via GPS_INPUT horiz_accuracy
-3. If drift exceeds 100m without satellite correction → flag as critical, increase matching frequency, send alert to ground station
-This is essentially what draft02 already does with covariance-based keyframe triggering, but now with an explicit 100m threshold.
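
The drift check itself is a one-line computation on the horizontal standard deviations. The sketch below assumes σ_x and σ_y are available in metres; the function name, the state labels, and the 80% warning fraction used to model "approaches 100m" are assumptions.

```python
import math


def drift_status(sigma_x_m, sigma_y_m, budget_m=100.0, warn_fraction=0.8):
    """Classify the horizontal drift estimate sqrt(sigma_x^2 + sigma_y^2)."""
    drift = math.hypot(sigma_x_m, sigma_y_m)
    if drift >= budget_m:
        return drift, "critical"         # budget exceeded: alert ground station
    if drift >= warn_fraction * budget_m:
        return drift, "force_matching"   # close to budget: match on every frame
    return drift, "ok"


print(drift_status(30.0, 40.0))
print(drift_status(60.0, 60.0))
print(drift_status(80.0, 80.0))
```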
-
-### Confidence
-✅ High — standard ESKF covariance interpretation
-
---
-
-## Dimension 9: Satellite Imagery Specs
-
-### Fact Confirmation
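-New AC: resolution of 0.5 m/pixel or finer, <2 years old. Google Maps at zoom 18 = ~0.6 m/pixel and zoom 19 = ~0.3 m/pixel at the equator; Web Mercator ground resolution scales with cos(latitude), giving ~0.4 and ~0.2 m/pixel at ~48°N.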
-
-### Reference Comparison
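-Draft02 uses Google Maps zoom 18-19. Zoom 19 (~0.3 m/pixel) exceeds the requirement. Zoom 18 (~0.6 m/pixel nominal) is slightly coarser than the 0.5 m/pixel limit at the equator, but Web Mercator ground resolution scales with cos(latitude), so at ~48°N it is about 0.4 m/pixel and meets the requirement. Age depends on Google's imagery updates for eastern Ukraine — conflict zone may have stale imagery.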
-
-### Conclusion
-Validate during offline preprocessing:
-1. Download at zoom 19 first (0.3 m/pixel)
-2. If zoom 19 unavailable for some tiles, fall back to zoom 18 (nominally 0.6 m/pixel at the equator, about 0.4 m/pixel at ~48°N, which meets the 0.5 m/pixel requirement)
-3. Check imagery date metadata if available from Google Maps API
-4. Flag tiles where imagery appears stale (seasonal mismatch, destroyed buildings, etc.)
-5. No architectural change needed — add validation step to preprocessing pipeline
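
The nominal per-zoom figures are equatorial values; the validation step can compute the actual ground resolution for the operational latitude. This sketch uses the standard Web Mercator relation for 256-pixel tiles. The function name is hypothetical, and the ~48.27°N latitude is taken from the sample-data location mentioned elsewhere in these notes.

```python
import math

# Metres per pixel at zoom 0 on the equator for 256-pixel Web Mercator tiles.
EQUATOR_RES_Z0 = 156543.03392


def ground_resolution(lat_deg, zoom):
    """Approximate ground resolution in m/pixel at a latitude and zoom level."""
    return EQUATOR_RES_Z0 * math.cos(math.radians(lat_deg)) / (2 ** zoom)


for zoom in (18, 19):
    print(zoom,
          round(ground_resolution(0.0, zoom), 3),
          round(ground_resolution(48.27, zoom), 3))
```

Providers that serve 512-pixel or high-DPI tiles halve these numbers per pixel, so the tile size should be checked before relying on the result.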
-
-### Confidence
-⚠️ Medium — Google Maps imagery age is not reliably queryable
-
---
-
-## Dimension 10: Internal IPC for Object Localization
-
-### Fact Confirmation
-Other onboard AI systems need to request GPS coordinates of detected objects. These systems run on the same Jetson.
-
-### Reference Comparison
-Draft02 has FastAPI for external API. For local IPC between processes on the same device, FastAPI is overkill but works.
-
-### Conclusion
-Retain a minimal FastAPI server on localhost:8000 for inter-process communication:
- POST /localize: accepts pixel coordinates + AI camera params → returns GPS coordinates
- GET /status: returns system health/state for monitoring
-This is local-only (bind to 127.0.0.1), not exposed externally. The primary output channel is MAVLink GPS_INPUT. This is a lightweight addition, not the core architecture.
-
-### Confidence
-✅ High — simple local IPC pattern
@@ -1,88 +0,0 @@
-# Validation Log
-
-## Validation Scenario
-A typical 15-minute flight over eastern Ukraine agricultural terrain. GPS is jammed after first 2 minutes. Flight includes straight segments, two sharp 90-degree turns, and one low-texture segment over a large plowed field. Ground station operator monitors via telemetry link. During the flight, companion computer reboots once due to power glitch.
-
-## Expected Based on Conclusions
-
-### Phase 1: Normal start (GPS available, first 2 min)
- System boots, connects to flight controller via pymavlink on UART
- Reads GLOBAL_POSITION_INT → initializes ESKF with real GPS position
- Begins cuVSLAM initialization with first camera frames
- Starts sending GPS_INPUT at 5Hz (ESKF prediction between frames)
- Ground station sees position + confidence via telemetry forwarding
-
-### Phase 2: GPS denial begins
- Flight controller's real GPS becomes unreliable/lost
- GPS-Denied system continues sending GPS_INPUT — seamless for autopilot
- horiz_accuracy changes from real-GPS level to VO-estimated level (~20m)
- cuVSLAM provides VO at every frame (~9ms), ESKF fuses with IMU
- Satellite matching runs every 3-10 frames on keyframes
- After successful satellite match: horiz_accuracy improves, fix_type stays 3
- NAMED_VALUE_FLOAT sends confidence/drift data to ground station at ~1Hz
-
-### Phase 3: Sharp turn
- cuVSLAM loses tracking (no overlapping features)
- ESKF falls back to IMU prediction, horiz_accuracy increases
- Next frame flagged as keyframe → satellite matching triggered immediately
- Satellite match against preloaded tiles using IMU dead-reckoning position
- If match found: position recovered, new segment begins, horiz_accuracy drops
- If 3 consecutive failures: send re-localization request to ground station via NAMED_VALUE_FLOAT/STATUSTEXT
- Ground station operator sends COMMAND_LONG with approximate coordinates
- System receives hint, constrains tile search → likely recovers position
-
-### Phase 4: Low-texture plowed field
- cuVSLAM keypoint count drops below threshold
- Satellite matching frequency increases (every frame)
- If satellite matching works on plowed field vs satellite imagery: position maintained
- If satellite also fails (seasonal difference): drift accumulates, ESKF covariance grows
- When √(σ_x² + σ_y²) approaches 100m: force continuous satellite matching
- horiz_accuracy reported as 50-100m, fix_type=2
-
-### Phase 5: Companion computer reboot
- Power glitch → Jetson reboots (~30-60 seconds)
- During reboot: flight controller gets no GPS_INPUT → detects GPS timeout → falls back to IMU-only dead reckoning
- Jetson comes back: reconnects via pymavlink, reads GLOBAL_POSITION_INT (the flight controller's IMU-extrapolated estimate)
- Initializes ESKF with this position (low confidence, horiz_accuracy=100m)
- Begins cuVSLAM + satellite matching → gradually improves accuracy
- Operator on ground station sees position return with improving confidence
-
-### Phase 6: Object localization request
- AI detection system on same Jetson detects a vehicle in AI camera frame
- Sends POST /localize with pixel coords + camera angle (30° from vertical) + zoom level + altitude (500m)
- GPS-Denied system computes: slant range = 500 / cos(30°) ≈ 577m, horizontal ground distance = 500 × tan(30°) ≈ 289m
- Adds bearing from heading + camera pan → lat/lon offset
- Returns GPS coordinates with accuracy estimate (GPS-Denied accuracy + projection error)
-
-## Actual Validation Results
-The scenario covers all new AC requirements:
- ✅ MAVLink GPS_INPUT at 5Hz (camera frames + IMU interpolation)
- ✅ Confidence via horiz_accuracy field maps to confidence levels
- ✅ Ground station telemetry via MAVLink forwarding + NAMED_VALUE_FLOAT
- ✅ Re-localization via ground station command
- ✅ Startup from GPS → seamless transition on denial
- ✅ Reboot recovery from flight controller IMU-extrapolated position
- ✅ Drift budget tracking via ESKF covariance
- ✅ Object localization with AI camera angle/zoom
-
-## Counterexamples
-
-### Potential issue: 5Hz interpolation accuracy
-Between camera frames (333ms apart), ESKF predicts using IMU only. At 200km/h ≈ 55.6m/s, the UAV moves ~18.5m between frames and ~11m per 200ms interpolation step. IMU prediction over one such step at this speed introduces ~1-5m error, which is acceptable for GPS_INPUT.
-
-### Potential issue: UART reliability
-Jetson Orin Nano UART instability reported (Fact #10). If MAVLink connection drops during flight, GPS_INPUT stops → autopilot loses GPS. Mitigation: use TCP over USB-C if UART unreliable, or add watchdog to reconnect. This is a hardware integration risk.
-
-### Potential issue: Telemetry bandwidth saturation
-If GPS-Denied sends too many NAMED_VALUE_FLOAT messages, it could compete with standard autopilot telemetry for bandwidth. Keep custom messages to 1Hz max (50-100 bytes/s = <1kbit/s).
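
The bandwidth estimate can be checked with simple frame-size arithmetic. The sketch assumes unsigned MAVLink 2 framing (10-byte header plus 2-byte checksum) and the full 18-byte NAMED_VALUE_FLOAT payload, ignoring MAVLink 2 trailing-zero truncation, so it is an upper bound. The function name and the choice of three values at 1Hz are illustrative.

```python
MAVLINK2_OVERHEAD_BYTES = 12          # 10-byte header + 2-byte checksum, no signature
NAMED_VALUE_FLOAT_PAYLOAD_BYTES = 18  # uint32 time_boot_ms + float value + char[10] name


def custom_telemetry_bits_per_s(num_values, rate_hz):
    """Upper-bound link usage for num_values NAMED_VALUE_FLOAT streams."""
    frame_bytes = MAVLINK2_OVERHEAD_BYTES + NAMED_VALUE_FLOAT_PAYLOAD_BYTES
    return num_values * rate_hz * frame_bytes * 8


bits = custom_telemetry_bits_per_s(num_values=3, rate_hz=1)
print(bits, "bit/s,", round(100 * bits / 12_000, 1), "% of a 12 kbit/s link")
```

Three values (for example confidence, drift, status) at 1Hz stay within the 50-100 bytes/s range quoted above.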
-
-## Review Checklist
- [x] Draft conclusions consistent with fact cards
- [x] No important dimensions missed
- [x] No over-extrapolation
- [x] Conclusions actionable and verifiable
- [x] All new AC requirements addressed
- [ ] UART reliability needs hardware testing — cannot validate without physical setup
-
-## Conclusions Requiring Revision
-None — all conclusions hold under validation. The UART reliability risk needs flagging but doesn't change the architecture.
@@ -1,76 +0,0 @@
-# Acceptance Criteria Assessment
-
-## System Parameters (Calculated)
-
-| Parameter | Value |
-|-----------|-------|
-| GSD (at 400m) | 6.01 cm/pixel |
-| Ground footprint | 376m × 250m |
-| Consecutive overlap | 60-73% (at 100m intervals) |
-| Pixels per 50m | ~832 pixels |
-| Pixels per 20m | ~333 pixels |
-
-## Acceptance Criteria
-
-| Criterion | Our Values | Researched Values | Cost/Timeline Impact | Status |
-|-----------|-----------|-------------------|---------------------|--------|
-| GPS accuracy: 80% within 50m | 50m error for 80% of photos | NaviLoc: 19.5m MLE at 50-150m alt. Mateos-Ramirez: 143m mean at >1000m alt (with IMU). At 400m with 26MP + satellite correction, 50m for 80% is achievable with VO+SIM. No IMU adds ~30-50% error overhead. | Medium cost — needs robust satellite matching pipeline. ~3-4 weeks for core pipeline. | **Achievable** — keep as-is |
-| GPS accuracy: 60% within 20m | 20m error for 60% of photos | NaviLoc: 19.5m MLE at lower altitude (50-150m). At 400m, larger viewpoint gap increases error. Cross-view matching MA@20m improving +10% yearly. Needs high-quality satellite imagery and robust matching. | Higher cost — requires higher-quality satellite imagery (0.3-0.5m resolution). Additional 1-2 weeks for refinement. | **Challenging but achievable** — consider relaxing to 30m initially, tighten with iteration |
-| Handle 350m outlier photos | Tolerate up to 350m jump between consecutive photos | Standard VO systems detect outliers via feature matching failure. 350m at GSD 6cm = ~5833 pixels. Satellite re-localization can handle this if area is textured. | Low additional cost — outlier detection is standard in VO pipelines. | **Achievable** — keep as-is |
-| Sharp turns: <5% overlap, <200m drift, <70° angle | System continues working during sharp turns | <5% overlap means consecutive feature matching will fail. Must fall back to satellite matching for absolute position. At 400m altitude with 376m footprint, 200m drift means partial overlap with satellite. 70° rotation is large but manageable with rotation-invariant matchers (AKAZE, SuperPoint). | High complexity — requires multi-strategy architecture (VO primary, satellite fallback). +2-3 weeks. | **Achievable with architectural investment** — keep as-is |
-| Route disconnection & reconnection | Handle multiple disconnected route segments | Each segment needs independent satellite geo-referencing. Segments are stitched via common satellite reference frame. Similar to loop closure in SLAM but via external reference. | High complexity — core architectural challenge. +2-3 weeks for segment management. | **Achievable** — this should be a core design principle, not an edge case |
-| User input fallback (20% of route) | User provides GPS when system cannot determine | Simple UI interaction — user clicks approximate position on map. Becomes new anchor point. | Low cost — straightforward feature. | **Achievable** — keep as-is |
-| Processing speed: <5s per image | 5 seconds maximum per image | SuperPoint: ~50-100ms. LightGlue: ~20-50ms. Satellite crop+match: ~200-500ms. Full pipeline: ~500ms-2s on RTX 2060. NaviLoc runs 9 FPS on Raspberry Pi 5. ORB-SLAM3 with GPU: 30 FPS on Jetson TX2. | Low risk — well within budget on RTX 2060+. | **Easily achievable** — could target <2s. Keep 5s as safety margin |
-| Real-time streaming via SSE | Results appear immediately, refinement sent later | Standard architecture pattern. Process-and-stream is well-supported. | Low cost — standard web engineering. | **Achievable** — keep as-is |
-| Image Registration Rate > 95% | >95% of images successfully registered | ITU thesis: 93% SIM matching. With 60-73% consecutive overlap and deep learning features, >95% for VO between consecutive frames is achievable. The 5% tolerance covers sharp turns. | Medium cost — depends on feature matcher quality and satellite image quality. | **Achievable** — but interpret as "95% for normal consecutive frames". Sharp turn frames counted separately. |
-| MRE < 1.0 pixels | Mean Reprojection Error below 1 pixel | Sub-pixel accuracy is standard for SuperPoint/LightGlue. SVO achieves sub-pixel via direct methods. Typical range: 0.3-0.8 pixels. | No additional cost — inherent to modern matchers. | **Easily achievable** — keep as-is |
-| REST API + SSE background service | Always-running service, start on request, stream results | Standard Python (FastAPI) or .NET architecture. | Low cost — standard engineering. ~1 week for API layer. | **Achievable** — keep as-is |
-
-## Restrictions Assessment
-
-| Restriction | Our Values | Researched Values | Cost/Timeline Impact | Status |
-|-------------|-----------|-------------------|---------------------|--------|
-| No IMU data | No heading, no pitch/roll correction | **CRITICAL restriction.** Most published systems use IMU for heading and as fallback. Without IMU: (1) heading must be derived from consecutive frame matching or satellite matching, (2) no pitch/roll correction — rely on robust feature matchers, (3) scale from known altitude only. Adds ~30-50% error vs IMU-equipped systems. | High impact — requires visual heading estimation. All VO literature assumes at least heading from IMU. +2-3 weeks R&D for pure visual heading. | **Realistic but significantly harder.** Consider: can barometer data be available? |
-| Camera not auto-stabilized | Images have varying pitch/roll | At 400m with fixed-wing, typical roll ±15°, pitch ±10°. Causes trapezoidal distortion in images. Robust matchers (SuperPoint, LightGlue) handle moderate viewpoint changes. Homography estimation between frames compensates. | Medium impact — modern matchers handle this. Pre-rectification using estimated attitude could help. | **Realistic** — keep as-is. Mitigated by robust matchers. |
-| Google Maps only (cost-dependent) | Currently limited to Google Maps | Google Maps in eastern Ukraine may have 2-5 year old imagery. Conflict damage makes old imagery unreliable. **Risk: satellite-UAV matching may fail in areas with significant ground changes.** Alternatives: Mapbox (Maxar Vivid, sub-meter), Bing Maps (0.3-1m), Maxar SecureWatch (30cm, enterprise pricing). | High risk — may need multiple providers. Google: $200/month free credit. Mapbox: free tier for 100K requests. Maxar: enterprise pricing. | **Tighten** — add fallback provider. Pre-download tile cache for operational area. |
-| Image resolution FullHD to 6252×4168 | Variable resolution across flights | Lower resolution (FullHD=1920×1080) at 400m: GSD ≈ 0.20m/pixel, footprint ~376m × 212m. Significantly worse matching but still functional. Need to handle both extremes. | Medium impact — pipeline must be resolution-adaptive. | **Realistic** — keep. But note: FullHD accuracy will be ~3x worse than 26MP. |
-| Altitude ≤ 1km, terrain height negligible | Flat terrain assumption at known altitude | Simplifies scale estimation. At 400m, terrain variations of ±50m cause ±12.5% scale error. Eastern Ukraine is relatively flat (steppe), so this is reasonable. | Low impact for the operational area. | **Realistic** — keep as-is |
-| Mostly sunny weather | Good lighting conditions assumed | Sunny weather = good texture, consistent illumination. Shadows may cause matching issues but are manageable. | Low impact — favorable condition. | **Realistic** — keep. Add: "system performance degrades in overcast/low-light" |
-| Up to 3000 photos per flight | 500-1500 typical, 3000 maximum | At <5s per image: 3000 photos = ~4 hours max. Memory: 3000 × 26MP ≈ 78GB raw at 1 byte/pixel (~234GB as 8-bit RGB). Need efficient memory management and incremental processing. | Medium impact — requires streaming architecture and careful memory management. | **Realistic** — keep. Memory management is engineering, not research. |
-| Sharp turns with completely different next photo | Route discontinuity is possible | Most VO systems fail at 0% overlap. This is effectively a new "start point" problem. Satellite matching is the only recovery path. | High impact — already addressed in AC. | **Realistic** — this is the defining challenge |
-| Desktop/laptop with RTX 2060+ | Minimum GPU requirement | RTX 2060: 6GB VRAM, 1920 CUDA cores. Sufficient for SuperPoint, LightGlue, satellite matching. RTX 3070: 8GB VRAM, 5888 CUDA cores — significantly faster. | Low risk — hardware is adequate. | **Realistic** — keep as-is |
-
-## Missing Acceptance Criteria (Suggested Additions)
-
-| Criterion | Suggested Value | Rationale |
-|-----------|----------------|-----------|
-| Satellite imagery resolution requirement | ≥ 0.5 m/pixel, ideally 0.3 m/pixel | Matching quality depends heavily on reference imagery resolution. At GSD 6cm, satellite must be at least 0.5m for reliable cross-view matching. |
-| Confidence/uncertainty reporting | Report confidence score per position estimate | User needs to know which positions are reliable (satellite-anchored) vs uncertain (VO-only, accumulating drift). |
-| Output format | WGS84 coordinates in GeoJSON or CSV | Standardize output for downstream integration. |
-| Satellite image freshness requirement | < 2 years old for operational area | Older imagery may not match current ground truth due to conflict damage. |
-| Maximum drift between satellite corrections | < 100m cumulative VO drift before satellite re-anchor | Prevents long uncorrected VO segments from exceeding 50m target. |
-| Memory usage limit | < 16GB RAM, < 6GB VRAM | Ensures compatibility with RTX 2060 systems. |
-
-## Key Findings
-
-1. **The 50m/80% accuracy target is achievable** with a well-designed VO + satellite matching pipeline, even without IMU, given the high camera resolution (6cm GSD) and known altitude. NaviLoc achieves 19.5m at lower altitudes; our 400m altitude adds difficulty but 26MP resolution compensates.
-
-2. **The 20m/60% target is aggressive but possible** with high-quality satellite imagery (≤0.5m resolution). Consider starting with a 30m target and tightening through iteration. Performance heavily depends on satellite image quality and freshness for the operational area.
-
-3. **No IMU is the single biggest technical risk.** All published comparable systems use at least heading from IMU/magnetometer. Visual heading estimation from consecutive frames is feasible but adds noise. This restriction alone could require 2-3 extra weeks of R&D.
-
-4. **Google Maps satellite imagery for eastern Ukraine is a significant risk.** Imagery may be outdated (2-5 years) and may not reflect current ground conditions. A fallback satellite provider is strongly recommended.
-
-5. **Processing speed (<5s) is easily achievable** on RTX 2060+. Modern feature matching pipelines process in <500ms per pair. The pipeline could realistically achieve <2s per image.
-
-6. **Route disconnection handling should be the core architectural principle**, not an edge case. The system should be designed "segments-first" — each segment independently geo-referenced, then stitched.
-
-7. **Missing criterion: confidence reporting.** The user should see which positions are high-confidence (satellite-anchored) vs low-confidence (VO-extrapolated). This is critical for operational use.
-
-## Sources
- [Source #1] Mateos-Ramirez et al. (2024) — VO + satellite correction for fixed-wing UAV
- [Source #2] Öztürk (2025) — ORB-SLAM3 + SIM integration thesis
- [Source #3] NaviLoc (2025) — Trajectory-level visual localization
- [Source #4] LightGlue GitHub — Feature matching benchmarks
- [Source #5] DALGlue (2025) — Enhanced feature matching
- [Source #8-9] Satellite imagery coverage and pricing reports
@@ -1,63 +0,0 @@
-# Question Decomposition — AC & Restrictions Assessment
-
-## Original Question
-How realistic are the acceptance criteria and restrictions for a GPS-denied visual navigation system for fixed-wing UAV imagery?
-
-## Active Mode
-Mode A, Phase 1: AC & Restrictions Assessment
-
-## Question Type
-Knowledge Organization + Decision Support
-
-## Research Subject Boundary Definition
-
-| Dimension | Boundary |
-|-----------|----------|
-| **Platform** | Fixed-wing UAV, airplane type, not multirotor |
-| **Geography** | Eastern/southern Ukraine, left bank of the Dnipro River (conflict zone, ~48.27°N, 37.38°E based on sample data) |
-| **Altitude** | ≤ 1km, sample data at 400m |
-| **Sensor** | Monocular RGB camera, 26MP, no IMU, no LiDAR |
-| **Processing** | Ground-based desktop/laptop with NVIDIA RTX 2060+ GPU |
-| **Time Window** | Current state-of-the-art (2024-2026) |
-
-## Problem Context Summary
-
-The system must determine GPS coordinates of consecutive aerial photo centers using only:
- Known starting GPS coordinates
- Known camera parameters (25mm focal, 23.5mm sensor, 6252×4168 resolution)
- Known flight altitude (≤1km, sample: 400m)
- Consecutive photos taken within ~100m of each other
- Satellite imagery (Google Maps) for ground reference
-
-Key constraints: NO IMU data, camera not auto-stabilized, potentially outdated satellite imagery for conflict zone.
-
-**Ground Sample Distance (GSD) at 400m altitude**:
- GSD = (400 × 23.5) / (25 × 6252) ≈ 0.060 m/pixel (6 cm/pixel)
- Ground footprint: ~376m × 250m per image
- Estimated consecutive overlap: 60-73% (depending on camera orientation relative to flight direction)
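
The GSD and footprint figures above can be reproduced with a few lines. A minimal sketch assuming a nadir-pointing camera, square pixels, and the camera parameters listed in this section; the function name is hypothetical.

```python
def ground_sample_distance(altitude_m, sensor_width_mm, focal_mm, image_width_px):
    """Ground sample distance in m/pixel for a nadir-pointing camera."""
    return (altitude_m * sensor_width_mm) / (focal_mm * image_width_px)


gsd = ground_sample_distance(400.0, 23.5, 25.0, 6252)
footprint_m = (6252 * gsd, 4168 * gsd)  # width, height; assumes square pixels

print(f"GSD: {gsd * 100:.2f} cm/pixel")
print(f"Footprint: {footprint_m[0]:.0f} m x {footprint_m[1]:.0f} m")
print(f"Pixels per 50 m: {50 / gsd:.0f}, per 20 m: {20 / gsd:.0f}")
```

The same function applies to other resolutions by changing `image_width_px`, as long as the image still spans the full sensor width.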
-
-## Sub-Questions for AC Assessment
-
-1. What GPS accuracy is achievable with VO + satellite matching at 400m altitude with 26MP camera?
-2. How does the absence of IMU affect accuracy and what compensations exist?
-3. What processing speed is achievable per image on RTX 2060+ for the required pipeline?
-4. What image registration rates are achievable with deep learning matchers?
-5. What reprojection errors are typical for modern feature matching?
-6. How do sharp turns and route disconnections affect VO systems?
-7. What satellite imagery quality is available for the operational area?
-8. What domain-specific acceptance criteria might be missing?
-
-## Timeliness Sensitivity Assessment
-
- **Research Topic**: GPS-denied visual navigation using deep learning feature matching
- **Sensitivity Level**: 🟠 High
- **Rationale**: Deep learning feature matchers (SuperPoint, LightGlue, GIM) are evolving rapidly; new methods appear quarterly. Satellite imagery providers update pricing and coverage frequently.
- **Source Time Window**: 12 months (2024-2026)
- **Priority official sources to consult**:
-  1. LightGlue GitHub repository (cvg/LightGlue)
-  2. ORB-SLAM3 documentation
-  3. Recent MDPI/IEEE papers on GPS-denied UAV navigation
- **Key version information to verify**:
-  - LightGlue: Current release and performance benchmarks
-  - SuperPoint: Compatibility and inference speed
-  - ORB-SLAM3: Monocular mode capabilities
@@ -1,133 +0,0 @@
-# Source Registry
-
-## Source #1
- **Title**: Visual Odometry in GPS-Denied Zones for Fixed-Wing UAV with Reduced Accumulative Error Based on Satellite Imagery
- **Link**: https://www.mdpi.com/2076-3417/14/16/7420
- **Tier**: L1
- **Publication Date**: 2024-08-22
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Fixed-wing UAV navigation researchers
- **Research Boundary Match**: ✅ Full match (fixed-wing, high altitude, satellite matching)
- **Summary**: VO + satellite image correction achieves 142.88m mean error over 17km at >1000m altitude using ORB + AKAZE. Uses IMU for heading and barometer for altitude. Error rate 0.83% of total distance.
- **Related Sub-question**: 1, 2
-
-## Source #2
- **Title**: Optimized visual odometry and satellite image matching-based localization for UAVs in GPS-denied environments (ITU Thesis)
- **Link**: https://polen.itu.edu.tr/items/1fe1e872-7cea-44d8-a8de-339e4587bee6
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV navigation researchers
- **Research Boundary Match**: ⚠️ Partial overlap (multirotor at 30-100m, but same VO+SIM methodology)
- **Summary**: ORB-SLAM3 + SuperPoint/SuperGlue/GIM achieves GPS-level accuracy. VO module: ±2m local accuracy. SIM module: 93% matching success rate. Demonstrated on DJI Mavic Air 2 at 30-100m.
- **Related Sub-question**: 1, 2, 4
-
-## Source #3
- **Title**: NaviLoc: Trajectory-Level Visual Localization for GNSS-Denied UAV Navigation
- **Link**: https://www.mdpi.com/2504-446X/10/2/97
- **Tier**: L1
- **Publication Date**: 2025-12
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV navigation / VPR researchers
- **Research Boundary Match**: ⚠️ Partial overlap (50-150m altitude, uses VIO not pure VO)
- **Summary**: Achieves 19.5m Mean Localization Error at 50-150m altitude. Runs at 9 FPS on Raspberry Pi 5. 16x improvement over AnyLoc-VLAD, 32x over raw VIO drift. Training-free system.
- **Related Sub-question**: 1, 7
-
-## Source #4
- **Title**: LightGlue: Local Feature Matching at Light Speed (GitHub + ICCV 2023)
- **Link**: https://github.com/cvg/LightGlue
- **Tier**: L1
- **Publication Date**: 2023 (actively maintained through 2025)
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Computer vision practitioners
- **Research Boundary Match**: ✅ Full match (core component)
- **Summary**: ~20-34ms per image pair on RTX 2080Ti. Adaptive pruning for fast inference. 2-4x speedup with PyTorch compilation.
- **Related Sub-question**: 3, 4
-
-## Source #5
- **Title**: Efficient image matching for UAV visual navigation via DALGlue
- **Link**: https://www.nature.com/articles/s41598-025-21602-5
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV navigation researchers
- **Research Boundary Match**: ✅ Full match
- **Summary**: DALGlue achieves 11.8% improvement over LightGlue on matching accuracy. Uses dual-tree complex wavelet preprocessing + linear attention for real-time performance.
- **Related Sub-question**: 3, 4
-
-## Source #6
- **Title**: Deep-UAV SLAM: SuperPoint and SuperGlue enhanced SLAM
- **Link**: https://isprs-archives.copernicus.org/articles/XLVIII-1-W5-2025/177/2025/
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: UAV SLAM researchers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Replacing ORB-SLAM3's ORB features with SuperPoint+SuperGlue improved robustness and accuracy in aerial RGB scenarios.
- **Related Sub-question**: 4, 5
-
-## Source #7
- **Title**: SCAR: Satellite Imagery-Based Calibration for Aerial Recordings
- **Link**: https://arxiv.org/html/2602.16349v1
- **Tier**: L1
- **Publication Date**: 2026-02
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Aerial/satellite vision researchers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Long-term auto-calibration refinement by aligning aerial images with 2D-3D correspondences from orthophotos and elevation models.
- **Related Sub-question**: 1, 5
-
-## Source #8
- **Title**: Google Maps satellite imagery coverage and update frequency
- **Link**: https://ongeo-intelligence.com/blog/how-often-does-google-maps-update-satellite-images
- **Tier**: L3
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: GIS practitioners
- **Research Boundary Match**: ✅ Full match
- **Summary**: Conflict zones like eastern Ukraine face 2-5+ year update cycles. Imagery may be intentionally limited or blurred.
- **Related Sub-question**: 7
-
-## Source #9
- **Title**: Satellite Mapping Services comparison 2025
- **Link**: https://ts2.tech/en/exploring-the-world-from-above-top-satellite-mapping-services-for-web-mobile-in-2025/
- **Tier**: L3
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: Developers, GIS practitioners
- **Research Boundary Match**: ✅ Full match
- **Summary**: Google: $200/month free credit, sub-meter resolution. Mapbox: Maxar imagery, generous free tier. Maxar SecureWatch: 30cm resolution, enterprise pricing. Planet: daily 3-4m imagery.
- **Related Sub-question**: 7
-
-## Source #10
- **Title**: Scale Estimation for Monocular Visual Odometry Using Reliable Camera Height
- **Link**: https://ieeexplore.ieee.org/document/9945178/
- **Tier**: L1
- **Publication Date**: 2022
- **Timeliness Status**: ✅ Currently valid (fundamental method)
- **Target Audience**: VO researchers
- **Research Boundary Match**: ✅ Full match
- **Summary**: Known camera height/altitude resolves scale ambiguity in monocular VO. Essential for systems without IMU.
- **Related Sub-question**: 2
-
-## Source #11
- **Title**: Cross-View Geo-Localization benchmarks (SSPT, MA metrics)
- **Link**: https://www.mdpi.com/1424-8220/24/12/3719
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: VPR/geo-localization researchers
- **Research Boundary Match**: ⚠️ Partial overlap (general cross-view, not UAV-specific)
- **Summary**: SSPT achieved 84.40% RDS on UL14 dataset. MA improvements: +12% at 3m, +12% at 5m, +10% at 20m thresholds.
- **Related Sub-question**: 1
-
-## Source #12
- **Title**: ORB-SLAM3 GPU Acceleration Performance
- **Link**: https://arxiv.org/html/2509.10757v1
- **Tier**: L1
- **Publication Date**: 2025
- **Timeliness Status**: ✅ Currently valid
- **Target Audience**: SLAM/VO engineers
- **Research Boundary Match**: ✅ Full match
- **Summary**: GPU acceleration achieves 2.8x speedup on desktop systems. 30 FPS achievable on Jetson TX2. Feature extraction up to 3x speedup with CUDA.
- **Related Sub-question**: 3
@@ -1,121 +0,0 @@
-# Fact Cards
-
-## Fact #1
- **Statement**: VO + satellite image correction achieves ~142.88m mean error over 17km flight at >1000m altitude using ORB features and AKAZE satellite matching. Error rate: 0.83% of total distance. This system uses IMU for heading and barometer for altitude.
- **Source**: Source #1 — https://www.mdpi.com/2076-3417/14/16/7420
- **Phase**: Phase 1
- **Target Audience**: Fixed-wing UAV at high altitude (>1000m)
- **Confidence**: ✅ High (peer-reviewed, real-world flight data)
- **Related Dimension**: GPS accuracy, drift correction
-
-## Fact #2
- **Statement**: ORB-SLAM3 monocular mode with optimized parameters achieves ±2m local accuracy for visual odometry. Scale ambiguity and drift remain for long flights.
- **Source**: Source #2 — ITU Thesis
- **Phase**: Phase 1
- **Target Audience**: UAV navigation (30-100m altitude, multirotor)
- **Confidence**: ✅ High (thesis with experimental validation)
- **Related Dimension**: VO accuracy, scale ambiguity
-
-## Fact #3
- **Statement**: Combined VO + Satellite Image Matching (SIM) with SuperPoint/SuperGlue/GIM achieves 93% matching success rate and "GPS-level accuracy" at 30-100m altitude.
- **Source**: Source #2 — ITU Thesis
- **Phase**: Phase 1
- **Target Audience**: Low-altitude UAV (30-100m)
- **Confidence**: ✅ High
- **Related Dimension**: Registration rate, satellite matching
-
-## Fact #4
- **Statement**: NaviLoc achieves 19.5m Mean Localization Error at 50-150m altitude, runs at 9 FPS on Raspberry Pi 5. 16x improvement over AnyLoc-VLAD. Training-free system.
- **Source**: Source #3 — NaviLoc paper
- **Phase**: Phase 1
- **Target Audience**: Low-altitude UAV (50-150m) in rural areas
- **Confidence**: ✅ High (peer-reviewed)
- **Related Dimension**: GPS accuracy, processing speed
-
-## Fact #5
- **Statement**: LightGlue inference: ~20-34ms per image pair on RTX 2080Ti for 1024 keypoints. 2-4x speedup possible with PyTorch compilation and TensorRT.
- **Source**: Source #4 — LightGlue GitHub Issues
- **Phase**: Phase 1
- **Target Audience**: All GPU-accelerated vision systems
- **Confidence**: ✅ High (official repository benchmarks)
- **Related Dimension**: Processing speed
-
-## Fact #6
- **Statement**: SuperPoint+SuperGlue replacing ORB features in SLAM improves robustness and accuracy for aerial RGB imagery over classical handcrafted features.
- **Source**: Source #6 — ISPRS 2025
- **Phase**: Phase 1
- **Target Audience**: UAV SLAM researchers
- **Confidence**: ✅ High (peer-reviewed)
- **Related Dimension**: Feature matching quality
-
-## Fact #7
- **Statement**: Eastern Ukraine / conflict zones may have 2-5+ year old satellite imagery on Google Maps. Imagery may be intentionally limited, blurred, or restricted for security reasons.
- **Source**: Source #8
- **Phase**: Phase 1
- **Target Audience**: Ukraine conflict zone operations
- **Confidence**: ⚠️ Medium (general reporting, not Ukraine-specific verification)
- **Related Dimension**: Satellite imagery quality
-
-## Fact #8
- **Statement**: Maxar SecureWatch offers 30cm resolution with ~3M km² new imagery daily. Mapbox uses Maxar's Vivid imagery with sub-meter resolution. Google Maps offers sub-meter detail in urban areas but 1-3m in rural areas.
- **Source**: Source #9
- **Phase**: Phase 1
- **Target Audience**: All satellite imagery users
- **Confidence**: ✅ High
- **Related Dimension**: Satellite providers, cost
-
-## Fact #9
- **Statement**: Known camera height/altitude resolves scale ambiguity in monocular VO. The pixel-to-meter conversion is s = (H / f) × sensor_pixel_size, where H is the altitude above ground and f is the focal length, enabling metric reconstruction without IMU.
- **Source**: Source #10
- **Phase**: Phase 1
- **Target Audience**: Monocular VO systems
- **Confidence**: ✅ High (fundamental geometric relationship)
- **Related Dimension**: No-IMU compensation
-
-## Fact #10
- **Statement**: Camera heading (yaw) can be estimated from consecutive frame feature matching by decomposing the homography or essential matrix. Pitch/roll can be estimated from horizon detection or vanishing points. Without IMU, these estimates are noisier but functional.
- **Source**: Multiple vision-based heading estimation papers
- **Phase**: Phase 1
- **Target Audience**: Vision-only navigation systems
- **Confidence**: ⚠️ Medium (well-established but accuracy varies)
- **Related Dimension**: No-IMU compensation
-
-## Fact #11
- **Statement**: GSD at 400m altitude with 25mm focal length, 23.5mm sensor width, and 6252px image width = 6.01 cm/pixel. Ground footprint: 376m × 250m. At 100m photo interval, consecutive overlap is 60-73% (60% along the 250m side, 73% along the 376m side).
- **Source**: Calculated from problem data using standard GSD formula
- **Phase**: Phase 1
- **Target Audience**: This specific system
- **Confidence**: ✅ High (deterministic calculation)
- **Related Dimension**: Image coverage, overlap
-
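The arithmetic behind Fact #11 can be reproduced with a short sketch. The function and constant names below are illustrative, and the 15.6 mm sensor height is an assumption (a typical APS-C value consistent with the 250m footprint side); the fact card itself only gives the sensor width.

```python
def gsd_m_per_px(altitude_m: float, focal_mm: float,
                 sensor_width_mm: float, image_width_px: int) -> float:
    """Ground sample distance (metres per pixel) for a nadir-pointing camera."""
    return altitude_m * sensor_width_mm / (focal_mm * image_width_px)


def footprint_m(altitude_m: float, focal_mm: float, sensor_dim_mm: float) -> float:
    """Ground extent (metres) covered along one sensor dimension."""
    return altitude_m * sensor_dim_mm / focal_mm


def overlap_fraction(footprint: float, spacing_m: float) -> float:
    """Fraction of the footprint shared by two photos taken spacing_m apart."""
    return max(0.0, 1.0 - spacing_m / footprint)


ALTITUDE_M = 400.0
FOCAL_MM = 25.0
SENSOR_W_MM = 23.5
SENSOR_H_MM = 15.6  # assumption: not stated in the fact card
IMAGE_W_PX = 6252
SPACING_M = 100.0

gsd = gsd_m_per_px(ALTITUDE_M, FOCAL_MM, SENSOR_W_MM, IMAGE_W_PX)
width_m = footprint_m(ALTITUDE_M, FOCAL_MM, SENSOR_W_MM)
height_m = footprint_m(ALTITUDE_M, FOCAL_MM, SENSOR_H_MM)
overlap_long = overlap_fraction(width_m, SPACING_M)
overlap_short = overlap_fraction(height_m, SPACING_M)

print(f"GSD: {gsd * 100:.2f} cm/px")
print(f"Footprint: {width_m:.0f} m x {height_m:.0f} m")
print(f"Overlap: {overlap_short:.0%} to {overlap_long:.0%}")
```

Which overlap value applies depends on whether the flight direction runs along the short or the long side of the frame.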
-## Fact #12
- **Statement**: GPU-accelerated ORB-SLAM3 achieves 2.8x speedup on desktop systems. 30 FPS possible on Jetson TX2. Feature extraction speedup up to 3x with CUDA-optimized pipelines.
- **Source**: Source #12
- **Phase**: Phase 1
- **Target Audience**: GPU-equipped systems
- **Confidence**: ✅ High
- **Related Dimension**: Processing speed
-
-## Fact #13
- **Statement**: Without IMU, the Mateos-Ramirez paper (Source #1) would lose: (a) yaw angle for rotation compensation, (b) fallback when feature matching fails. Their 142.88m error would likely be significantly higher without IMU heading data.
- **Source**: Inference from Source #1 methodology
- **Phase**: Phase 1
- **Target Audience**: This specific system
- **Confidence**: ⚠️ Medium (reasoned inference)
- **Related Dimension**: No-IMU impact
-
-## Fact #14
- **Statement**: DALGlue achieves 11.8% improvement over LightGlue on matching accuracy while maintaining real-time performance through dual-tree complex wavelet preprocessing and linear attention.
- **Source**: Source #5
- **Phase**: Phase 1
- **Target Audience**: Feature matching systems
- **Confidence**: ✅ High (peer-reviewed, 2025)
- **Related Dimension**: Feature matching quality
-
-## Fact #15
- **Statement**: Cross-view geo-localization benchmarks show MA@20m improving by +10% with latest methods (SSPT). RDS metric at 84.40% indicates reliable spatial positioning.
- **Source**: Source #11
- **Phase**: Phase 1
- **Target Audience**: Cross-view matching researchers
- **Confidence**: ✅ High
- **Related Dimension**: Cross-view matching accuracy
@@ -1,115 +0,0 @@
-# Comparison Framework
-
-## Selected Framework Type
-Decision Support (component-by-component solution comparison)
-
-## System Components
-1. Visual Odometry (consecutive frame matching)
-2. Satellite Image Geo-Referencing (cross-view matching)
-3. Heading & Orientation Estimation (without IMU)
-4. Drift Correction & Position Fusion
-5. Segment Management & Route Reconnection
-6. Interactive Point-to-GPS Lookup
-7. Pipeline Orchestration & API
-
---
-
-## Component 1: Visual Odometry
-
-| Solution | Tools | Advantages | Limitations | Fit |
-|----------|-------|-----------|-------------|-----|
-| ORB-SLAM3 monocular | ORB features, BA, map management | Mature, well-tested, handles loop closure. GPU-accelerated. 30 FPS on Jetson TX2. | Scale ambiguity without IMU. Over-engineered for sequential aerial imagery: map building is not needed. Heavy dependency. | Medium — too complex for the use case |
-| Homography-based VO with SuperPoint+LightGlue | SuperPoint, LightGlue, OpenCV homography | Ground plane assumption perfect for flat terrain at 400m. Cleanly separates rotation/translation. Known altitude resolves scale directly. Fast. | Assumes planar scene (valid for our case). Fails at sharp turns (but that's expected). | **Best fit** — matches constraints exactly |
-| Optical flow VO | cv2.calcOpticalFlowPyrLK or RAFT | Dense motion field, no feature extraction needed. | Less accurate for large motions. Struggles with texture-sparse areas. No inherent rotation estimation. | Low — not suitable for 100m baselines |
-| Direct method (SVO) | SVO Pro | Sub-pixel precision, fast. | Designed for small baselines and forward cameras. Poor for downward aerial at large baselines. | Low |
-
-**Selected**: Homography-based VO with SuperPoint + LightGlue features
-
---
-
-## Component 2: Satellite Image Geo-Referencing
-
-| Solution | Tools | Advantages | Limitations | Fit |
-|----------|-------|-----------|-------------|-----|
-| SuperPoint + LightGlue cross-view matching | SuperPoint, LightGlue, perspective warp | Best overall performance on satellite stereo benchmarks. Fast (~50ms matching). Rotation-invariant. Handles viewpoint/scale changes. | Requires perspective warping to reduce viewpoint gap. Needs good satellite image quality. | **Best fit** — proven on satellite imagery |
-| SuperPoint + SuperGlue + GIM | SuperPoint, SuperGlue, GIM | GIM adds generalization for challenging scenes. 93% match rate (ITU thesis). | SuperGlue slower than LightGlue. GIM adds complexity. | Good — slightly better robustness, slower |
-| LoFTR (detector-free) | LoFTR | No keypoint detection step. Works on low-texture. | Slower than detector-based methods. Fixed resolution (coarse). Less accurate than SuperPoint+LightGlue on satellite benchmarks. | Medium — fallback option |
-| DUSt3R/MASt3R | DUSt3R/MASt3R | Handles extreme viewpoints and low overlap. +50% completeness over COLMAP in sparse scenarios. | Very slow. Designed for 3D reconstruction not 2D matching. Unreliable with many images. | Low — only for extreme fallback |
-| Terrain-weighted optimization (YFS90) | Custom pipeline + DEM | <7m MAE without IMU. Drift-free. Handles thermal IR. Validated on 20 scenarios. | Requires DEM data. More complex implementation. Matching details are not open source. | High — architecture inspiration |
-
-**Selected**: SuperPoint + LightGlue (primary) with perspective warping. GIM as supplementary for difficult matches. YFS90-style terrain-weighted sliding window for position optimization.
-
---
-
-## Component 3: Heading & Orientation Estimation
-
-| Solution | Tools | Advantages | Limitations | Fit |
-|----------|-------|-----------|-------------|-----|
-| Homography decomposition (consecutive frames) | OpenCV decomposeHomographyMat | Directly gives rotation between frames. Works with ground plane assumption. No extra sensors needed. | Accumulates heading drift over time. Noisy for small motions. Ambiguous decomposition (need to select correct solution). | **Best fit** — primary heading source |
-| Satellite matching absolute orientation | From satellite match homography | Provides absolute heading correction. Eliminates accumulated heading drift. | Only available when satellite match succeeds. Intermittent. | **Best fit** — drift correction for heading |
-| Optical flow direction | Dense flow vectors | Simple to compute. | Very noisy at high altitude. Unreliable for heading. | Low |
-
-**Selected**: Homography decomposition for frame-to-frame heading + satellite matching for periodic absolute heading correction.
-
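As a rough illustration of frame-to-frame heading accumulation, the sketch below reads the in-plane rotation directly from the upper-left 2x2 block of a homography. This is a simplification that only holds when the homography is close to a similarity transform (nadir camera, flat terrain); it is not OpenCV's `decomposeHomographyMat`, and the sign convention and helper names are assumptions.

```python
import math


def yaw_from_homography(H) -> float:
    """In-plane rotation (radians), assuming H is close to a similarity transform."""
    return math.atan2(H[1][0], H[0][0])


def integrate_heading(initial_deg: float, homographies) -> list:
    """Accumulate per-frame rotations into a heading track, wrapped to [0, 360)."""
    heading = initial_deg
    track = []
    for H in homographies:
        heading = (heading + math.degrees(yaw_from_homography(H))) % 360.0
        track.append(heading)
    return track


def rotation_homography(angle_deg: float, tx: float = 0.0, ty: float = 0.0):
    """Synthetic test homography: rotation plus translation (hypothetical helper)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]


headings = integrate_heading(
    350.0,
    [rotation_homography(5.0), rotation_homography(10.0, tx=12.0, ty=-3.0)],
)
print(headings)
```

Because every step adds its own error, the accumulated heading drifts, which is why the selection pairs it with periodic absolute corrections from satellite matches.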
---
-
-## Component 4: Drift Correction & Position Fusion
-
-| Solution | Tools | Advantages | Limitations | Fit |
-|----------|-------|-----------|-------------|-----|
-| Kalman filter (EKF/UKF) | filterpy or custom | Well-understood. Handles noisy measurements. Good for fusing VO + satellite. | Assumes Gaussian noise. Linearization issues with EKF. | Good — simple and effective |
-| Sliding window optimization with terrain constraints | Custom optimization, scipy.optimize | YFS90 achieves <7m with this. Directly constrains drift. No loop closure needed. | More complex to implement. Needs tuning. | **Best fit** — proven for this exact problem |
-| Pose graph optimization | g2o, GTSAM | Standard in SLAM. Handles satellite anchors as prior factors. Globally optimal. | Heavy dependency. Over-engineered if segments are short. | Medium — overkill unless routes are very long |
-| Simple anchor reset | Direct correction at satellite match | Simplest. Just replace VO position with satellite position. | Discontinuous trajectory. No smoothing. | Low — too crude |
-
-**Selected**: Sliding window optimization with terrain constraints (inspired by YFS90), with Kalman filter as simpler fallback. Satellite matches as absolute anchor constraints.
-
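The Kalman-filter fallback can be sketched per axis as a scalar filter: VO displacements drive the prediction and inflate the uncertainty, and a satellite match acts as an absolute measurement. All names and noise values here are hypothetical, chosen only to show the mechanics.

```python
from dataclasses import dataclass


@dataclass
class AxisEstimate:
    value: float     # position along one axis, metres
    variance: float  # uncertainty, metres squared


def predict(est: AxisEstimate, vo_delta: float, vo_variance: float) -> AxisEstimate:
    """Apply a VO displacement; uncertainty grows with every step."""
    return AxisEstimate(est.value + vo_delta, est.variance + vo_variance)


def update(est: AxisEstimate, measurement: float, meas_variance: float) -> AxisEstimate:
    """Fuse an absolute satellite fix; uncertainty shrinks."""
    gain = est.variance / (est.variance + meas_variance)
    return AxisEstimate(
        est.value + gain * (measurement - est.value),
        (1.0 - gain) * est.variance,
    )


# Hypothetical run: five 100 m VO steps, then one satellite fix at 510 m.
estimate = AxisEstimate(value=0.0, variance=1.0)
for _ in range(5):
    estimate = predict(estimate, vo_delta=100.0, vo_variance=4.0)
drifted = estimate
fused = update(drifted, measurement=510.0, meas_variance=25.0)

print(drifted)
print(fused)
```

The fused position lands between the VO estimate and the satellite fix, weighted by their variances. Unlike a simple anchor reset, it does not jump all the way to the satellite position.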
---
-
-## Component 5: Segment Management & Route Reconnection
-
-| Solution | Tools | Advantages | Limitations | Fit |
-|----------|-------|-----------|-------------|-----|
-| Segments-first architecture with satellite anchoring | Custom segment manager | Each segment independently geo-referenced. No dependency between disconnected segments. Natural handling of sharp turns. | Needs robust satellite matching per segment. Segments without any satellite match are "floating". | **Best fit** — matches AC requirement for core strategy |
-| Global pose graph with loop closure | g2o/GTSAM | Can connect segments when they revisit same area. | Heavy. Doesn't help if segments don't overlap with each other. | Low — segments may not revisit same areas |
-| Trajectory-level VPR (NaviLoc-style) | VPR + trajectory optimization | Global optimization across trajectory. | Requires pre-computed VPR database. Complex. Designed for continuous trajectory, not disconnected segments. | Low |
-
-**Selected**: Segments-first architecture. Each segment starts from a satellite anchor or user input. Segments connected through shared satellite coordinate frame.
-
---
-
-## Component 6: Interactive Point-to-GPS Lookup
-
-| Solution | Tools | Advantages | Limitations | Fit |
-|----------|-------|-----------|-------------|-----|
-| Homography projection (image → ground) | Computed homography from satellite match | Already computed during geo-referencing. Accurate for flat terrain. | Only works for images with successful satellite match. | **Best fit** |
-| Camera ray-casting with known altitude | Camera intrinsics + pose estimate | Works for any image with pose estimate. Simpler math. | Accuracy depends on pose estimate quality. | Good — fallback for non-satellite-matched images |
-
-**Selected**: Homography projection (primary) + ray-casting (fallback).
-
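A minimal sketch of the homography projection step, assuming a 3x3 matrix `H` that maps UAV image pixels to local east/north metres. The example matrix is hypothetical (0.06 m/px scale, origin at an assumed image centre); in the pipeline it would come from the satellite match.

```python
def project_pixel(H, u: float, v: float):
    """Map image pixel (u, v) to ground coordinates using homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity")
    return x / w, y / w


# Hypothetical homography: 0.06 m/px, image centre (3126, 2084) at the origin,
# image v axis pointing south, so north = -0.06 * (v - 2084).
SCALE = 0.06
CENTER_U, CENTER_V = 3126.0, 2084.0
H_EXAMPLE = [
    [SCALE, 0.0, -SCALE * CENTER_U],
    [0.0, -SCALE, SCALE * CENTER_V],
    [0.0, 0.0, 1.0],
]

center_ground = project_pixel(H_EXAMPLE, CENTER_U, CENTER_V)
east_point = project_pixel(H_EXAMPLE, CENTER_U + 1000.0, CENTER_V)
print(center_ground, east_point)
```

Converting the resulting metres to latitude and longitude depends on the satellite tile georeferencing and is left out of this sketch.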
---
-
-## Component 7: Pipeline & API
-
-| Solution | Tools | Advantages | Limitations | Fit |
-|----------|-------|-----------|-------------|-----|
-| Python FastAPI + SSE | FastAPI, EventSourceResponse, asyncio | Native SSE support (since 0.135.0). Async GPU pipeline. Excellent for ML/CV workloads. Rich ecosystem. | Python GIL (mitigated with async/multiprocessing). | **Best fit** — natural for CV/ML pipeline |
-| .NET ASP.NET Core + SSE | ASP.NET Core, SignalR | High performance. Good for enterprise. | Less natural for CV/ML. Python interop needed for PyTorch models. Adds complexity. | Low — unnecessary indirection |
-| Python + gRPC streaming | gRPC | Efficient binary protocol. Bidirectional streaming. | More complex client integration. No browser-native support. | Medium — overkill for this use case |
-
-**Selected**: Python FastAPI with SSE.
-
---
-
-## Google Maps Tile Resolution at Latitude 48° (Operational Area)
-
-| Zoom Level | Meters/pixel | Tile coverage (256px) | Tiles for 20km² | Download size est. |
-|-----------|-------------|----------------------|-----------------|-------------------|
-| 17 | 0.80 m/px | ~205m × 205m | ~500 tiles | ~20MB |
-| 18 | 0.40 m/px | ~102m × 102m | ~2,000 tiles | ~80MB |
-| 19 | 0.20 m/px | ~51m × 51m | ~8,000 tiles | ~320MB |
-| 20 | 0.10 m/px | ~26m × 26m | ~30,000 tiles | ~1.2GB |
-
-Formula: metersPerPx = 156543.03 × cos(48° × π/180) / 2^zoom ≈ 104,748 / 2^zoom
-
-**Selected**: Zoom 18 (0.40 m/px) as primary matching resolution. Zoom 19 (0.20 m/px) for refinement if available. Meets the AC requirement of 0.5 m/pixel or finer.
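
The table values follow from the Web Mercator ground-resolution formula. The sketch below recomputes them; the function names are illustrative, and the tile count is a plain area ratio that ignores partial tiles at the region boundary, so it slightly undercounts the rounded table figures.

```python
import math

EQUATOR_M_PER_PX_Z0 = 156543.03392  # Web Mercator, 256 px tiles, zoom 0


def meters_per_pixel(lat_deg: float, zoom: int) -> float:
    """Ground resolution of a Web Mercator tile pixel at a given latitude."""
    return EQUATOR_M_PER_PX_Z0 * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def tiles_for_area(area_km2: float, lat_deg: float, zoom: int, tile_px: int = 256) -> int:
    """Approximate tile count needed to cover an area (area ratio only)."""
    side_m = tile_px * meters_per_pixel(lat_deg, zoom)
    return math.ceil(area_km2 * 1e6 / (side_m * side_m))


LAT_DEG = 48.0
AREA_KM2 = 20.0

resolution = {z: meters_per_pixel(LAT_DEG, z) for z in (17, 18, 19, 20)}
tile_counts = {z: tiles_for_area(AREA_KM2, LAT_DEG, z) for z in (17, 18, 19, 20)}

for z in (17, 18, 19, 20):
    print(f"zoom {z}: {resolution[z]:.2f} m/px, ~{tile_counts[z]} tiles")
```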