chore: sync .cursor from suite

[AZ-240] Update product implementation and task decomposition processes
- Refined task decomposition steps to ensure implementation tasks are atomic and complexity does not exceed 5 points.
- Enhanced the product implementation process with a completeness gate to verify task outcomes against architecture promises before proceeding to testing.
- Updated dependencies table to reflect new tasks and their relationships, ensuring all test tasks are linked to product remediation tasks.
- Adjusted workflow documentation to clarify entry points for task decomposition and implementation contexts.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-24 20:01:12 +00:00 · 2026-05-05 01:08:48 +03:00 · 2026-05-05 01:02:25 +03:00 · 2026-05-03 19:10:10 +03:00 · 2026-05-03 19:02:13 +03:00 · 2026-05-03 18:49:37 +03:00
256 changed files with 20091 additions and 395 deletions
@@ -0,0 +1,38 @@
 ---
 description: "Standards for creating and maintaining Cursor skills"
 globs: [".cursor/skills/**"]
 ---
 # Skill Building
 ## When To Create A Skill
 - Create a skill for repeatable, bounded workflows that benefit from a reusable process.
 - Do not create a skill for a one-off task, vague goal, or workflow that still needs product decisions.
 - Start small; evolve the skill when repeated use reveals clearer steps, constraints, or checks.
 ## Skill Contract
 - `SKILL.md` must define a clear `name` and a proactive `description` that explains when the skill should be used.
 - State expected inputs, constraints, workflow steps, and final output shape.
 - Make trigger conditions explicit enough that the agent can recognize intent without an exact command.
 - Base instructions on observable project evidence; do not invite fabrication or unsupported assumptions.
 ## Keep The Core Lean
 - Keep `SKILL.md` concise and under the repo's `.cursor/` size guidance.
 - Move detailed standards, examples, and background knowledge into `references/`.
 - Put reusable output shapes in `templates/` or other skill-local assets instead of embedding them in the main instructions.
 - Keep one primary responsibility per skill; use an orchestrator skill only when multiple existing skills must run in a defined order.
 ## Deterministic Work
 - Use scripts for mechanical steps that are repeatable, parameterized, and safer outside the model's reasoning.
 - Scripts must expose explicit inputs, avoid hidden side effects, and fail loudly on errors.
 - Do not use scripts to bypass review, hide destructive behavior, or hardcode secrets.
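A minimal sketch of a skill-local script that follows the rules above: every input is passed explicitly, nothing is written or deleted, and every failure produces a non-zero exit code with a message on stderr. The script's purpose (checking for leftover placeholders), the function names, and the `--marker` flag are hypothetical and not part of any existing skill.

```python
import argparse
import sys
from pathlib import Path


def count_placeholders(text: str, marker: str) -> int:
    """Count how many times the placeholder marker occurs in the text."""
    return text.count(marker)


def main(argv: list[str]) -> int:
    """Return 0 when the file is clean, 1 when placeholders remain, 2 on bad input."""
    parser = argparse.ArgumentParser(description="Fail if a file still contains placeholders.")
    parser.add_argument("path", type=Path, help="file to check (explicit input)")
    parser.add_argument("--marker", default="TODO", help="hypothetical placeholder marker")
    args = parser.parse_args(argv)

    # Fail loudly on a missing input instead of silently treating it as clean.
    if not args.path.is_file():
        print(f"error: {args.path} is not a file", file=sys.stderr)
        return 2

    found = count_placeholders(args.path.read_text(encoding="utf-8"), args.marker)
    if found:
        print(f"error: {found} placeholder(s) left in {args.path}", file=sys.stderr)
        return 1
    return 0
```

An entry point would call `sys.exit(main(sys.argv[1:]))` so the exit code reaches the caller; the script only reads the file it is given, so it has no hidden side effects.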
 ## Quality Proof
 - Include realistic examples, checklists, or eval-style scenarios that define what good output looks like.
 - Cover common failure cases such as missing sections, leftover placeholders, hallucinated facts, unsafe actions, or malformed output.
 - Review skill changes against those checks before treating the skill as ready.
 ## Security Review
 - Treat third-party skills like untrusted code until reviewed.
 - Inspect scripts, dependencies, references, secret handling, network calls, and destructive commands before use.
 - Prefer local, project-scoped assets and dependencies; document any external dependency the skill requires.
@@ -3,7 +3,7 @@ name: autodev
 description: |
  Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
  Detects current project state from _docs/ folder, resumes from where it left off, and flows through
-  problem → research → plan → decompose → implement → deploy without manual skill invocation.
+  problem → research → plan → test specs → decompose → implement → tests → docs sync → deploy without manual skill invocation.
  Maximizes work per conversation by auto-transitioning between skills.
  Trigger phrases:
  - "autodev", "auto", "start", "continue"
@@ -152,15 +152,17 @@ If `_docs/02_tasks/` subfolders have some task files already (e.g., refactoring
 ---
 **Step 6 — Implement Tests**
-Condition (folder fallback): `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
+Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
 State-driven: reached by auto-chain from Step 5.
-Action: Read and execute `.cursor/skills/implement/SKILL.md`
+Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.
-The implement skill reads test tasks from `_docs/02_tasks/todo/` and implements them.
+The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.
 If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
 For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
 ---
 **Step 7 — Run Tests**
@@ -1,6 +1,6 @@
 # Greenfield Workflow
-Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
+Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement + Product Completeness Gate → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
 ## Step Reference Table
@@ -10,13 +10,19 @@ Workflow for new projects built from scratch. Flows linearly: Problem → Resear
 | 2 | Research | research/SKILL.md | Mode A: Phase 1–4 · Mode B: Step 0–8 |
 | 3 | Plan | plan/SKILL.md | Step 1–6 + Final |
 | 4 | UI Design | ui-design/SKILL.md | Phase 0–8 (conditional — UI projects only) |
-| 5 | Decompose | decompose/SKILL.md | Step 1–4 |
+| 5 | Test Spec | test-spec/SKILL.md | Phases 1–4 |
-| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 6 | Decompose | decompose/SKILL.md (implementation task decomposition) | Step 1 + Step 1.5 + Step 2 + Step 4 |
-| 7 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 7 | Implement | implement/SKILL.md | Batch loop + Product Implementation Completeness Gate |
-| 8 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
+| 8 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 0–7 (conditional) |
-| 9 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
+| 9 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
-| 10 | Deploy | deploy/SKILL.md | Step 1–7 |
+| 10 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
-| 11 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |
+| 11 | Run Tests | test-run/SKILL.md | Steps 1–4 |
 | 12 | Test-Spec Sync | test-spec/SKILL.md (cycle-update mode) | Phase 2 + Phase 3 (scoped) |
 | 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
 | 14 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
 | 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
 | 16 | Deploy | deploy/SKILL.md | Step 1–7 |
 | 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |
 ## Detection Rules
@@ -80,12 +86,12 @@ If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FIN
 ---
 **Step 4 — UI Design (conditional)**
-Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_tasks/todo/` does not exist or has no task files.
+Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
 State-driven: reached by auto-chain from Step 3.
 Action: Read and execute `.cursor/skills/ui-design/SKILL.md`. The skill runs its own **Applicability Check**, which handles UI project detection and the user's A/B choice. It returns one of:
-- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Decompose).
+- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Test Spec).
 - `outcome: skipped, reason: not-a-ui-project` → mark Step 4 as `skipped`, auto-chain to Step 5.
 - `outcome: skipped, reason: user-declined` → mark Step 4 as `skipped`, auto-chain to Step 5.
@@ -93,34 +99,162 @@ The autodev no longer inlines UI detection heuristics — they live in `ui-desig
 ---
-**Step 5 — Decompose**
+**Step 5 — Test Spec**
-Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/todo/` does not exist or has no task files
+Condition (folder fallback): `_docs/02_document/FINAL_report.md` exists AND `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
 State-driven: reached by auto-chain from Step 4 (completed or skipped).
-Action: Read and execute `.cursor/skills/decompose/SKILL.md`
+Action: Read and execute `.cursor/skills/test-spec/SKILL.md`.
 This step converts the greenfield problem statement, acceptance criteria, solution, architecture, component docs, and UI design artifacts (if any) into test specifications before implementation begins. The test spec should cover unit, integration, blackbox, and e2e scenarios where those levels are applicable to the project.
 ---
 **Step 6 — Decompose**
 Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_document/tests/traceability-matrix.md` exists AND `_docs/02_tasks/todo/` does not exist or has no implementation task files.
 Action: Invoke `.cursor/skills/decompose/SKILL.md` for **implementation task decomposition**. The greenfield flow selects the implementation entrypoint before handing off: Bootstrap Structure, Module Layout, Component Task Decomposition, and Cross-Task Verification.
 Do not invoke Blackbox Test Task Decomposition from Step 6. Test tasks are intentionally deferred to Step 9 (Decompose Tests) so the first implementation batch stays focused on product functionality and Step 8 can revise testability before test task files exist.
 If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it.
 ---
-**Step 6 — Implement**
+**Step 7 — Implement**
-Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any `implementation_report_*.md` file
+Condition: `_docs/02_tasks/todo/` contains implementation task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain a valid product implementation report.
-Action: Read and execute `.cursor/skills/implement/SKILL.md`
+Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Product implementation**.
 The implement skill must run its **Product Implementation Completeness Gate** before it writes any final product implementation report. This gate compares completed product task specs, architecture/component promises, and actual source code so scaffold-only implementations cannot advance to Step 8. A final product implementation report without `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` is incomplete and must not be treated as Step 7 completion.
 If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.
 For folder fallback, **implementation task files** means task specs that are not test-only specs: exclude `*_test_infrastructure.md` and task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
 For folder fallback, a **product implementation report** is any `_docs/03_implementation/implementation_report_*.md` file except `_docs/03_implementation/implementation_report_tests.md` and refactor reports. It is valid for greenfield progression only when:
 - the matching `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists,
 - that completeness report does not contain unresolved `FAIL` classifications, and
 - `_docs/02_tasks/todo/` contains no pending implementation task files.
 If a product report exists but any of those validity checks fail, treat product implementation as incomplete and stay in Step 7.
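The folder-fallback validity check above could be sketched roughly as follows. This is an illustration under stated assumptions, not the implement skill's actual logic: refactor reports are assumed to carry "refactor" in their filename, the "matching" completeness report is assumed to be the one with the highest cycle number, and any standalone `FAIL` token in that report is treated as unresolved. The function names and the `pending_impl_tasks` parameter are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

COMPLETENESS_RE = re.compile(r"implementation_completeness_cycle(\d+)_report\.md")
TESTS_REPORT = "implementation_report_tests.md"


def product_reports(impl_dir: Path) -> list[Path]:
    """implementation_report_*.md files, minus the tests report and refactor reports."""
    return sorted(
        p for p in impl_dir.glob("implementation_report_*.md")
        # Assumption: refactor reports are recognizable by "refactor" in the name.
        if p.name != TESTS_REPORT and "refactor" not in p.name.lower()
    )


def latest_completeness_report(impl_dir: Path) -> Optional[Path]:
    """Pick the completeness report with the highest cycle number, if any exists."""
    candidates = []
    for p in impl_dir.iterdir():
        match = COMPLETENESS_RE.fullmatch(p.name)
        if match:
            candidates.append((int(match.group(1)), p))
    return max(candidates, key=lambda c: c[0])[1] if candidates else None


def is_valid_for_progression(impl_dir: Path, pending_impl_tasks: int) -> bool:
    """Apply the three validity checks; any failing check means 'stay in Step 7'."""
    if not impl_dir.is_dir() or not product_reports(impl_dir):
        return False
    completeness = latest_completeness_report(impl_dir)
    if completeness is None:
        return False
    # Assumption: any remaining standalone FAIL token counts as unresolved.
    if re.search(r"\bFAIL\b", completeness.read_text(encoding="utf-8")):
        return False
    return pending_impl_tasks == 0
```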
 ---
-**Step 7 — Run Tests**
+**Step 8 — Code Testability Revision**
-Condition (folder fallback): `_docs/03_implementation/` contains an `implementation_report_*.md` file.
+Condition (folder fallback): `_docs/03_implementation/` contains a valid product implementation report, `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications, `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` does not exist, `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` does not exist, `_docs/03_implementation/implementation_report_tests.md` does not exist, and `_docs/02_tasks/todo/` does not contain test task files.
-State-driven: reached by auto-chain from Step 6.
+State-driven: reached by auto-chain from Step 7.
 **Purpose**: verify the newly built code can be exercised by the planned tests before writing the test suite. Greenfield code should be testable by design; this step catches accidental hardcoded paths, singletons, direct external service construction, or other implementation choices that would make meaningful tests impossible.
 **Scope — MINIMAL, SURGICAL fixes**: this is not a general refactor. It is the smallest set of changes required to make the implemented code runnable under tests.
 **Allowed changes** in this phase:
 - Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
 - Extract narrow interfaces for components that need stubbing in tests.
 - Add optional constructor parameters for dependency injection; default to the existing behavior so callers do not break.
 - Wrap global singletons in thin accessors that tests can override.
 - Split a function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
 **NOT allowed** in this phase (defer to a later refactor task):
 - Renaming public APIs.
 - Moving code between files unless strictly required for isolation.
 - Changing algorithms or business logic.
 - Restructuring module boundaries or rewriting layers.
 Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
 1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`.
 2. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
   - Hardcoded file paths or directory references
   - Hardcoded configuration values (URLs, credentials, magic numbers)
   - Global mutable state that cannot be overridden
   - Tight coupling to external services without abstraction
   - Missing dependency injection or non-configurable parameters
   - Direct file system operations without path configurability
   - Inline construction of heavy dependencies (models, clients)
 3. If ALL scenarios are testable as-is:
   - Create `_docs/04_refactoring/01-testability-refactoring/`
   - Write `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` with the scenarios reviewed and outcome "Code is testable — no changes needed"
   - Mark Step 8 as `completed` with outcome "Code is testable — no changes needed"
   - Auto-chain to Step 9 (Decompose Tests)
 4. If testability issues are found:
   - Create `_docs/04_refactoring/01-testability-refactoring/`
   - Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
     - **Mode**: `guided`
     - **Source**: `autodev-greenfield-testability-analysis`
     - One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred refactor candidates" instead.
   - Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
   - Phase 3 (Safety Net) is skipped for this testability run because the test suite has not been implemented yet
   - After execution, surface `RUN_DIR/testability_changes_summary.md` to the user via the Choose format (accept / request follow-up) before auto-chaining
   - Copy or save the accepted summary as `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` so folder fallback can detect Step 8 completion
   - Mark Step 8 as `completed`
   - Auto-chain to Step 9 (Decompose Tests)
 ---
 **Step 9 — Decompose Tests**
 Condition (folder fallback): `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND `_docs/03_implementation/` contains a valid product implementation report AND `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications AND (`_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` exists OR `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` exists) AND (`_docs/02_tasks/todo/` does not exist or has no test task files) AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
 State-driven: reached by auto-chain from Step 8.
 Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
 1. Run Step 1t (test infrastructure bootstrap)
 2. Run Step 3 (blackbox/e2e-capable test task decomposition)
 3. Run Step 4 (cross-verification against test coverage)
 If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it — it appends test tasks alongside existing completed implementation tasks.
 ---
 **Step 10 — Implement Tests**
 Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
 State-driven: reached by auto-chain from Step 9.
 Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.
 The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.
 If `_docs/03_implementation/` has batch reports, the implement skill detects completed test tasks and continues.
 For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
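The test-task classification described above might look like the following sketch. It assumes the `**Component**` and `**Epic**` fields appear on their own lines in the form `**Component**: value`; the actual task spec layout is not shown here, so treat the regular expression and the function name `is_test_task` as hypothetical.

```python
import re

# Assumption: fields are written as "**Component**: value" or "**Epic**: value",
# optionally preceded by a list bullet.
FIELD_RE = re.compile(
    r"^[ \t]*(?:[-*][ \t]+)?\*\*(?:Component|Epic)\*\*[ \t]*:?[ \t]*(.*)$",
    re.MULTILINE,
)


def is_test_task(filename: str, spec_text: str) -> bool:
    """True for test infrastructure specs and specs that belong to Blackbox Tests."""
    if filename.endswith("_test_infrastructure.md"):
        return True
    return any(
        "blackbox tests" in match.group(1).lower()
        for match in FIELD_RE.finditer(spec_text)
    )
```

Implementation task files are then simply the task specs for which `is_test_task` returns `False`.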
 ---
 **Step 11 — Run Tests**
 Condition (folder fallback): `_docs/03_implementation/implementation_report_tests.md` exists.
 State-driven: reached by auto-chain from Step 10.
 Action: Read and execute `.cursor/skills/test-run/SKILL.md`
 Verifies the implemented unit, integration, blackbox, and e2e tests pass before proceeding to spec and documentation sync.
 ---
-**Step 8 — Security Audit (optional)**
+**Step 12 — Test-Spec Sync**
-State-driven: reached by auto-chain from Step 7.
+State-driven: reached by auto-chain from Step 11. Requires `_docs/02_document/tests/traceability-matrix.md` to exist — if missing, mark Step 12 `skipped` (see Action below).
 Action: Read and execute `.cursor/skills/test-spec/SKILL.md` in **cycle-update mode**. Pass the completed implementation task specs, completed test task specs, and implementation reports as inputs.
 The skill appends implementation-learned acceptance criteria, scenarios, and NFR updates to the existing test-spec files without rewriting unaffected sections. If `traceability-matrix.md` is missing, mark Step 12 as `skipped` — the next `/test-spec` full run will regenerate it.
 After completion, auto-chain to Step 13 (Update Docs).
 ---
 **Step 13 — Update Docs**
 State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires `_docs/02_document/` to contain existing documentation — if missing, mark Step 13 `skipped` (see Action below).
 Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all completed implementation and test task spec files plus the implementation reports.
 The document skill in Task mode updates affected module docs, component docs, system-level docs, and test documentation without redoing full discovery, verification, or problem extraction.
 If `_docs/02_document/` does not contain existing docs, mark Step 13 as `skipped`.
 After completion, auto-chain to Step 14 (Security Audit).
 ---
 **Step 14 — Security Audit (optional)**
 State-driven: reached by auto-chain from Step 13 (completed or skipped).
 Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
 - question:        `Run security audit before deploy?`
@@ -128,12 +262,12 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
 - option-b-label:  `Skip — proceed directly to deploy`
 - recommendation:  `A — catches vulnerabilities before production`
 - target-skill:    `.cursor/skills/security/SKILL.md`
-- next-step:       Step 9 (Performance Test)
+- next-step:       Step 15 (Performance Test)
 ---
-**Step 9 — Performance Test (optional)**
+**Step 15 — Performance Test (optional)**
-State-driven: reached by auto-chain from Step 8.
+State-driven: reached by auto-chain from Step 14 (completed or skipped).
 Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
 - question:        `Run performance/load tests before deploy?`
@@ -141,30 +275,30 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
 - option-b-label:  `Skip — proceed directly to deploy`
 - recommendation:  `A or B — base on whether acceptance criteria include latency, throughput, or load requirements`
 - target-skill:    `.cursor/skills/test-run/SKILL.md` in **perf mode** (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
-- next-step:       Step 10 (Deploy)
+- next-step:       Step 16 (Deploy)
 ---
-**Step 10 — Deploy**
+**Step 16 — Deploy**
-State-driven: reached by auto-chain from Step 9 (after Step 9 is completed or skipped).
+State-driven: reached by auto-chain from Step 15 (after Step 15 is completed or skipped).
 Action: Read and execute `.cursor/skills/deploy/SKILL.md`.
-After the deploy skill completes successfully, mark Step 10 as `completed` and auto-chain to Step 11 (Retrospective).
+After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 17 (Retrospective).
 ---
-**Step 11 — Retrospective**
+**Step 17 — Retrospective**
-State-driven: reached by auto-chain from Step 10.
+State-driven: reached by auto-chain from Step 16.
 Action: Read and execute `.cursor/skills/retrospective/SKILL.md` in **cycle-end mode**. This closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` and appending the top-3 lessons to `_docs/LESSONS.md`.
-After retrospective completes, mark Step 11 as `completed` and enter "Done" evaluation.
+After retrospective completes, mark Step 17 as `completed` and enter "Done" evaluation.
 ---
 **Done**
-State-driven: reached by auto-chain from Step 11. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
+State-driven: reached by auto-chain from Step 17. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
 Action: Report project completion with summary. Then **rewrite the state file** so the next `/autodev` invocation enters the feature-cycle loop in the existing-code flow:
@@ -191,47 +325,65 @@ On the next invocation, Flow Resolution rule 1 reads `flow: existing-code` and r
 | Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
 | Research Decision → proceed | Auto-chain → Plan (3) |
 | Plan (3) | Auto-chain → UI Design detection (4) |
-| UI Design (4, done or skipped) | Auto-chain → Decompose (5) |
+| UI Design (4, done or skipped) | Auto-chain → Test Spec (5) |
-| Decompose (5) | **Session boundary** — suggest new conversation before Implement |
+| Test Spec (5) | Auto-chain → Decompose (6) |
-| Implement (6) | Auto-chain → Run Tests (7) |
+| Decompose (6) | **Session boundary** — suggest new conversation before Implement |
-| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) |
+| Implement (7) | Auto-chain only after Product Implementation Completeness Gate passes → Code Testability Revision (8) |
-| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) |
+| Code Testability Revision (8) | Auto-chain → Decompose Tests (9) |
-| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) |
+| Decompose Tests (9) | **Session boundary** — suggest new conversation before Implement Tests |
-| Deploy (10) | Auto-chain → Retrospective (11) |
+| Implement Tests (10) | Auto-chain → Run Tests (11) |
-| Retrospective (11) | Report completion; rewrite state to existing-code flow, step 9 |
+| Run Tests (11, all pass) | Auto-chain → Test-Spec Sync (12) |
 | Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
 | Update Docs (13, done or skipped) | Auto-chain → Security Audit choice (14) |
 | Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
 | Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
 | Deploy (16) | Auto-chain → Retrospective (17) |
 | Retrospective (17) | Report completion; rewrite state to existing-code flow, step 9 |
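The updated auto-chain rows for steps 3 through 17 can be summarized in a small lookup. This is a simplified sketch: it ignores the Research Decision branch, the "all pass" condition on Run Tests, and the completeness gate on Implement, all of which the real flow enforces. The names `next_transition` and `SESSION_BOUNDARY_AFTER` are hypothetical.

```python
from typing import Optional, Tuple

# Steps after which the table suggests a new conversation instead of chaining.
SESSION_BOUNDARY_AFTER = {6, 9}  # Decompose, Decompose Tests
FIRST_STEP = 3   # Plan; earlier rows have special handling not modeled here
LAST_STEP = 17   # Retrospective


def next_transition(step: int) -> Tuple[str, Optional[int]]:
    """Return (kind, next_step) for a completed or skipped greenfield step."""
    if not FIRST_STEP <= step <= LAST_STEP:
        raise ValueError(f"step {step} is outside the modeled range")
    if step == LAST_STEP:
        # Retrospective reports completion and rewrites the state file.
        return ("done", None)
    kind = "session-boundary" if step in SESSION_BOUNDARY_AFTER else "auto-chain"
    return (kind, step + 1)
```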
 ## Status Summary — Step List
 Flow name: `greenfield`. Render using the banner template in `protocols.md` → "Banner Template (authoritative)". No header-suffix, current-suffix, or footer-extras — all empty for this flow.
-| # | Step Name          | Extra state tokens (beyond the shared set) |
+| # | Step Name                   | Extra state tokens (beyond the shared set) |
-|---|--------------------|--------------------------------------------|
+|---|-----------------------------|--------------------------------------------|
-| 1 | Problem            | — |
+| 1 | Problem                     | — |
-| 2 | Research           | `DONE (N drafts)` |
+| 2 | Research                    | `DONE (N drafts)` |
-| 3 | Plan               | — |
+| 3 | Plan                        | — |
-| 4 | UI Design          | — |
+| 4 | UI Design                   | — |
-| 5 | Decompose          | `DONE (N tasks)` |
+| 5 | Test Spec                   | — |
-| 6 | Implement          | `IN PROGRESS (batch M of ~N)` |
+| 6 | Decompose                   | `DONE (N tasks)` |
-| 7 | Run Tests          | `DONE (N passed, M failed)` |
+| 7 | Implement                   | `IN PROGRESS (batch M of ~N)` |
-| 8 | Security Audit     | — |
+| 8 | Code Testability Revision   | — |
-| 9 | Performance Test   | — |
+| 9 | Decompose Tests             | `DONE (N tasks)` |
-| 10 | Deploy            | — |
+| 10 | Implement Tests            | `IN PROGRESS (batch M)` |
-| 11 | Retrospective     | — |
+| 11 | Run Tests                  | `DONE (N passed, M failed)` |
 | 12 | Test-Spec Sync             | — |
 | 13 | Update Docs                | — |
 | 14 | Security Audit             | — |
 | 15 | Performance Test           | — |
 | 16 | Deploy                     | — |
 | 17 | Retrospective              | — |
-All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 8, 9 additionally accept `SKIPPED`.
+All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 12, 13, 14, 15 additionally accept `SKIPPED`.
 Row rendering format (step-number column is right-padded to 2 characters for alignment):
 ```
- Step 1   Problem             [<state token>]
+ Step 1   Problem                   [<state token>]
- Step 2   Research            [<state token>]
+ Step 2   Research                  [<state token>]
- Step 3   Plan                [<state token>]
+ Step 3   Plan                      [<state token>]
- Step 4   UI Design           [<state token>]
+ Step 4   UI Design                 [<state token>]
- Step 5   Decompose           [<state token>]
+ Step 5   Test Spec                 [<state token>]
- Step 6   Implement           [<state token>]
+ Step 6   Decompose                 [<state token>]
- Step 7   Run Tests           [<state token>]
+ Step 7   Implement                 [<state token>]
- Step 8   Security Audit      [<state token>]
+ Step 8   Code Testability Rev.     [<state token>]
- Step 9   Performance Test    [<state token>]
+ Step 9   Decompose Tests           [<state token>]
- Step 10  Deploy              [<state token>]
+ Step 10  Implement Tests           [<state token>]
- Step 11  Retrospective       [<state token>]
+ Step 11  Run Tests                 [<state token>]
 Step 12  Test-Spec Sync            [<state token>]
 Step 13  Update Docs               [<state token>]
 Step 14  Security Audit            [<state token>]
 Step 15  Performance Test          [<state token>]
 Step 16  Deploy                    [<state token>]
 Step 17  Retrospective             [<state token>]
 ```
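A small sketch of how a row in this format could be produced. The name-column width of 26 characters is inferred from the rows shown above, any leading indentation is ignored, and `render_row` is a hypothetical helper; the banner template in `protocols.md` remains authoritative.

```python
NAME_WIDTH = 26  # inferred from the example rows, not taken from protocols.md


def render_row(step: int, name: str, state_token: str) -> str:
    """Render one status row with the step number right-padded to 2 characters."""
    return f"Step {step:<2}  {name:<{NAME_WIDTH}}[{state_token}]"


rows = [
    render_row(7, "Implement", "IN PROGRESS (batch 2 of ~5)"),
    render_row(8, "Code Testability Rev.", "NOT STARTED"),
    render_row(12, "Test-Spec Sync", "SKIPPED"),
]
print("\n".join(rows))
```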
@@ -110,7 +110,8 @@ Before entering a step from this table for the first time in a session, verify t
 | Flow | Step | Sub-Step | Tracker Action |
 |------|------|----------|----------------|
 | greenfield | Plan | Step 6 — Epics | Create epics for each component |
-| greenfield | Decompose | Step 1 + Step 2 + Step 3 — All tasks | Create ticket per task, link to epic |
+| greenfield | Decompose | Implementation decomposition Step 1 + Step 2 — Product tasks | Create ticket per product task, link to epic |
 | greenfield | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | New Task | Step 7 — Ticket | Create ticket per task, link to epic |
@@ -13,7 +13,7 @@ The autodev persists its position to `_docs/_autodev_state.md`. This is a lightw
 ## Current Step
 flow: [greenfield | existing-code | meta-repo]
-step: [1-11 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
+step: [1-17 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
 name: [step name from the active flow's Step Reference Table]
 status: [not_started / in_progress / completed / skipped / failed]
 sub_step:
@@ -2,8 +2,8 @@
 name: decompose
 description: |
  Decompose planned components into atomic implementable tasks with bootstrap structure plan.
-  4-step workflow: bootstrap structure plan, component task decomposition, blackbox test task decomposition, and cross-task verification.
+  Workflow entrypoints: implementation task decomposition, single component decomposition, and tests-only decomposition.
-  Supports full decomposition (_docs/ structure), single component mode, and tests-only mode.
+  The invoking flow decides which entrypoint to run; this skill executes that selected sequence.
  Trigger phrases:
  - "decompose", "decompose features", "feature decomposition"
  - "task decomposition", "break down components"
@@ -20,7 +20,7 @@ Decompose planned components into atomic, implementable task specs with a bootst
 ## Core Principles
- **Atomic tasks**: each task does one thing; if it exceeds 8 complexity points, split it
+- **Atomic tasks**: each task does one thing; if it exceeds 5 complexity points, split it
 - **Behavioral specs, not implementation plans**: describe what the system should do, not how to build it
 - **Flat structure**: all tasks are tracker-ID-prefixed files in TASKS_DIR — no component subdirectories
 - **Save immediately**: write artifacts to disk after each task; never accumulate unsaved work
@@ -30,14 +30,15 @@ Decompose planned components into atomic, implementable task specs with a bootst
 ## Context Resolution
-Determine the operating mode based on invocation before any other logic runs.
+Resolve the selected entrypoint from the invocation context before any other logic runs. The caller decides whether this is implementation, single component, or tests-only decomposition; this skill only executes the selected sequence.
-**Default** (no explicit input file provided):
+**Implementation task decomposition** (default; selected by flows before invoking this skill):
 - DOCUMENT_DIR: `_docs/02_document/`
 - TASKS_DIR: `_docs/02_tasks/`
 - TASKS_TODO: `_docs/02_tasks/todo/`
 - Reads from: `_docs/00_problem/`, `_docs/01_solution/`, DOCUMENT_DIR
 - Produces only implementation tasks. Blackbox/e2e test task files are produced only when the invoking flow selects tests-only decomposition.
 **Single component mode** (provided file is within `_docs/02_document/` and inside a `components/` subdirectory):
@@ -55,24 +56,24 @@ Determine the operating mode based on invocation before any other logic runs.
 - TESTS_DIR: `DOCUMENT_DIR/tests/`
 - Reads from: `_docs/00_problem/`, `_docs/01_solution/`, TESTS_DIR
-Announce the detected mode and resolved paths to the user before proceeding.
+Announce the selected entrypoint and resolved paths to the user before proceeding.
 ### Step Applicability by Mode
-| Step | File | Default | Single | Tests-only |
+| Step | File | Implementation | Single | Tests-only |
-|------|------|:-------:|:------:|:----------:|
+|------|------|:--------------:|:------:|:----------:|
 | 1 Bootstrap Structure | `steps/01_bootstrap-structure.md` | ✓ | — | — |
 | 1t Test Infrastructure | `steps/01t_test-infrastructure.md` | — | — | ✓ |
 | 1.5 Module Layout | `steps/01-5_module-layout.md` | ✓ | — | — |
 | 2 Task Decomposition | `steps/02_task-decomposition.md` | ✓ | ✓ | — |
-| 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | ✓ | — | ✓ |
+| 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | — | — | ✓ |
 | 4 Cross-Verification | `steps/04_cross-verification.md` | ✓ | — | ✓ |
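The applicability table can be read as a lookup from the selected entrypoint to an ordered list of step files. The following is a minimal sketch of that reading, not part of the skill itself; `STEP_FILES` and `steps_for` are hypothetical names, and the short mode keys (`implementation`, `single`, `tests-only`) are assumptions taken from the column headers.

```python
# Hypothetical illustration of the Step Applicability table.
# Each entry: (step label, step file, entrypoints the step applies to).
STEP_FILES = [
    ("1", "steps/01_bootstrap-structure.md", {"implementation"}),
    ("1t", "steps/01t_test-infrastructure.md", {"tests-only"}),
    ("1.5", "steps/01-5_module-layout.md", {"implementation"}),
    ("2", "steps/02_task-decomposition.md", {"implementation", "single"}),
    ("3", "steps/03_blackbox-test-decomposition.md", {"tests-only"}),
    ("4", "steps/04_cross-verification.md", {"implementation", "tests-only"}),
]


def steps_for(entrypoint: str) -> list[str]:
    """Return the step files that apply to the entrypoint, in table order."""
    known = set().union(*(modes for _, _, modes in STEP_FILES))
    if entrypoint not in known:
        raise ValueError(f"unknown entrypoint: {entrypoint!r}")
    return [path for _, path, modes in STEP_FILES if entrypoint in modes]
```

For example, `steps_for("single")` yields only the task decomposition step, which matches the single ✓ in that column.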
 ## Input Specification
 ### Required Files
-**Default:**
+**Implementation task decomposition:**
 | File | Purpose |
 |------|---------|
@@ -84,7 +85,7 @@ Announce the detected mode and resolved paths to the user before proceeding.
 | `DOCUMENT_DIR/glossary.md` | Project terminology (confirmed by user in plan Phase 2a.0 or document Step 4.5). Use it to keep task names, component references, and AC wording consistent with the user's vocabulary |
 | `DOCUMENT_DIR/system-flows.md` | System flows from plan skill |
 | `DOCUMENT_DIR/components/[##]_[name]/description.md` | Component specs from plan skill |
-| `DOCUMENT_DIR/tests/` | Blackbox test specs from plan skill |
+| `DOCUMENT_DIR/tests/` | Optional product acceptance context from test-spec skill; do not create test task files from it in this entrypoint |
 **Single component mode:**
@@ -111,7 +112,7 @@ Announce the detected mode and resolved paths to the user before proceeding.
 ### Prerequisite Checks (BLOCKING)
-**Default:**
+**Implementation task decomposition:**
 1. DOCUMENT_DIR contains `architecture.md` and `components/` — **STOP if missing**
 2. Create TASKS_DIR and TASKS_TODO if they do not exist
@@ -145,6 +146,8 @@ TASKS_DIR/
 **Naming convention**: Each task file is initially saved in `TASKS_TODO/` with a temporary numeric prefix (`[##]_[short_name].md`). After creating the work item ticket, rename the file to use the work item ticket ID as prefix (`[TRACKER-ID]_[short_name].md`). For example: `todo/01_initial_structure.md` → `todo/AZ-42_initial_structure.md`.
 If tracker availability fails, follow `.cursor/rules/tracker.mdc` before continuing. Only when the user explicitly chooses `tracker: local` may the numeric prefix remain; in that mode set `Tracker: pending` and `Epic: pending` in the task header and keep the task eligible for later tracker sync.
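The rename in the naming convention can be sketched as below. This is an illustrative helper under stated assumptions: the function names `tracker_filename` and `rename_task` are hypothetical, and the temporary prefix is assumed to be one or more digits followed by an underscore.

```python
import re
from pathlib import Path


def tracker_filename(temp_name: str, tracker_id: str) -> str:
    """Swap the temporary numeric prefix for the tracker ID.

    Assumes temp_name looks like '[##]_[short_name].md'.
    """
    match = re.fullmatch(r"\d+_(.+\.md)", temp_name)
    if match is None:
        raise ValueError(f"not a numeric-prefixed task file: {temp_name!r}")
    return f"{tracker_id}_{match.group(1)}"


def rename_task(todo_dir: Path, temp_name: str, tracker_id: str) -> Path:
    """Rename the task file inside todo_dir and return its new path."""
    target = todo_dir / tracker_filename(temp_name, tracker_id)
    (todo_dir / temp_name).rename(target)
    return target
```

In `tracker: local` mode the rename would simply not be called, so the numeric prefix stays in place.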
 ### Save Timing
 | Step | Save immediately after | Filename |
@@ -166,11 +169,11 @@ If TASKS_DIR subfolders already contain task files:
 ## Progress Tracking
-At the start of execution, create a TodoWrite with all applicable steps for the detected mode (see Step Applicability table). Update status as each step/component completes.
+At the start of execution, create a TodoWrite with all applicable steps for the selected entrypoint (see Step Applicability table). Update status as each step/component completes.
 ## Workflow
-### Step 1: Bootstrap Structure Plan (default mode only)
+### Step 1: Bootstrap Structure Plan (implementation mode only)
 Read and follow `steps/01_bootstrap-structure.md`.
@@ -182,25 +185,25 @@ Read and follow `steps/01t_test-infrastructure.md`.
 ---
-### Step 1.5: Module Layout (default mode only)
+### Step 1.5: Module Layout (implementation mode only)
 Read and follow `steps/01-5_module-layout.md`.
 ---
-### Step 2: Task Decomposition (default and single component modes)
+### Step 2: Task Decomposition (implementation and single component modes)
 Read and follow `steps/02_task-decomposition.md`.
 ---
-### Step 3: Blackbox Test Task Decomposition (default and tests-only modes)
+### Step 3: Blackbox Test Task Decomposition (tests-only mode only)
 Read and follow `steps/03_blackbox-test-decomposition.md`.
 ---
-### Step 4: Cross-Task Verification (default and tests-only modes)
+### Step 4: Cross-Task Verification (implementation and tests-only modes)
 Read and follow `steps/04_cross-verification.md`.
@@ -208,7 +211,7 @@ Read and follow `steps/04_cross-verification.md`.
 - **Coding during decomposition**: this workflow produces specs, never code
 - **Over-splitting**: don't create many tasks if the component is simple — 1 task is fine
- **Tasks exceeding 8 points**: split them; no task should be too complex for a single implementer
+- **Tasks exceeding 5 points**: split them; no task should be too complex for a single implementer
 - **Cross-component tasks**: each task belongs to exactly one component
 - **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
 - **Creating git branches**: branch creation is an implementation concern, not a decomposition one
@@ -221,7 +224,7 @@ Read and follow `steps/04_cross-verification.md`.
 | Situation | Action |
 |-----------|--------|
 | Ambiguous component boundaries | ASK user |
-| Task complexity exceeds 8 points after splitting | ASK user |
+| Task complexity exceeds 5 points after splitting | ASK user |
 | Missing component specs in DOCUMENT_DIR | ASK user |
 | Cross-component dependency conflict | ASK user |
 | Tracker epic not found for a component | ASK user for Epic ID |
@@ -233,15 +236,14 @@ Read and follow `steps/04_cross-verification.md`.
 ┌────────────────────────────────────────────────────────────────┐
 │          Task Decomposition (Multi-Mode)                        │
 ├────────────────────────────────────────────────────────────────┤
-│ CONTEXT: Resolve mode (default / single component / tests-only) │
+│ CONTEXT: Selected entrypoint (implementation/single/tests-only) │
 │                                                                 │
-│ DEFAULT MODE:                                                   │
+│ IMPLEMENTATION TASK DECOMPOSITION:                              │
 │  1.   Bootstrap Structure → steps/01_bootstrap-structure.md     │
 │       [BLOCKING: user confirms structure]                       │
 │  1.5  Module Layout       → steps/01-5_module-layout.md         │
 │       [BLOCKING: user confirms layout]                          │
 │  2.   Component Tasks     → steps/02_task-decomposition.md      │
 │  3.   Blackbox Tests      → steps/03_blackbox-test-decomposition.md │
 │  4.   Cross-Verification  → steps/04_cross-verification.md      │
 │       [BLOCKING: user confirms dependencies]                    │
 │                                                                 │
@@ -26,7 +26,7 @@ For each component (or the single provided component):
 4. Do not create tasks for other components — only tasks for the current component
 5. Each task should be atomic, containing 1 API or a list of semantically connected APIs
 6. Write each task spec using `templates/task.md`
-7. Estimate complexity per task (1, 2, 3, 5, 8 points); no task should exceed 8 points — split if it does
+7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
 8. Note task dependencies (referencing tracker IDs of already-created dependency tasks, e.g., `AZ-42_initial_structure`)
 9. **Cross-cutting rule**: if a concern spans ≥2 components (logging, config loading, auth/authZ, error envelope, telemetry, feature flags, i18n), create ONE shared task under the cross-cutting epic. Per-component tasks declare it as a dependency and consume it; they MUST NOT re-implement it locally. Duplicate local implementations are an `Architecture` finding (High) in code-review Phase 7 and a `Maintainability` finding in Phase 6.
 10. **Shared-models / shared-API rule**: classify the task as shared if ANY of the following is true:
@@ -46,7 +46,7 @@ For each component (or the single provided component):
 ## Self-verification (per component)
 - [ ] Every task is atomic (single concern)
- [ ] No task exceeds 8 complexity points
+- [ ] No task exceeds 5 complexity points
 - [ ] Task dependencies reference correct tracker IDs
 - [ ] Tasks cover all interfaces defined in the component spec
 - [ ] No tasks duplicate work from other components
@@ -1,4 +1,4 @@
-# Step 3: Blackbox Test Task Decomposition (default and tests-only modes)
+# Step 3: Blackbox Test Task Decomposition (tests-only mode only)
 **Role**: Professional Quality Assurance Engineer
 **Goal**: Decompose blackbox test specs into atomic, implementable task specs.
@@ -6,7 +6,6 @@
 ## Numbering
 - In default mode: continue sequential numbering from where Step 2 left off.
 - In tests-only mode: start from 02 (01 is the test infrastructure bootstrap from Step 1t).
 ## Steps
@@ -15,10 +14,9 @@
 2. Group related test scenarios into atomic tasks (e.g., one task per test category or per component under test)
 3. Each task should reference the specific test scenarios it implements and the environment/test-data specs
 4. Dependencies:
   - In default mode: blackbox test tasks depend on the component implementation tasks they exercise
   - In tests-only mode: blackbox test tasks depend on the test infrastructure bootstrap task (Step 1t)
 5. Write each task spec using `templates/task.md`
-6. Estimate complexity per task (1, 2, 3, 5, 8 points); no task should exceed 8 points — split if it does
+6. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
 7. Note task dependencies (referencing tracker IDs of already-created dependency tasks)
 8. **Immediately after writing each task file**: create a work item ticket under the "Blackbox Tests" epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.
@@ -26,8 +24,8 @@
 - [ ] Every scenario from `tests/blackbox-tests.md` is covered by a task
 - [ ] Every scenario from `tests/performance-tests.md`, `tests/resilience-tests.md`, `tests/security-tests.md`, and `tests/resource-limit-tests.md` is covered by a task
- [ ] No task exceeds 8 complexity points
+- [ ] No task exceeds 5 complexity points
- [ ] Dependencies correctly reference the dependency tasks (component tasks in default mode, test infrastructure in tests-only mode)
+- [ ] Dependencies correctly reference the test infrastructure task
 - [ ] Every task has a work item ticket linked to the "Blackbox Tests" epic
 ## Save action
@@ -1,4 +1,4 @@
-# Step 4: Cross-Task Verification (default and tests-only modes)
+# Step 4: Cross-Task Verification (implementation and tests-only modes)
 **Role**: Professional software architect and analyst
 **Goal**: Verify task consistency and produce `_dependencies_table.md`.
@@ -8,7 +8,7 @@
 1. Verify task dependencies across all tasks are consistent
 2. Check no gaps:
-   - In default mode: every interface in `architecture.md` has tasks covering it
+   - In implementation mode: every product interface in `architecture.md` has implementation task coverage
   - In tests-only mode: every test scenario in `traceability-matrix.md` is covered by a task
 3. Check no overlaps: tasks don't duplicate work
 4. Check no circular dependencies in the task graph
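The circular-dependency check in item 4 could be done with a topological sort. The sketch below uses Kahn's algorithm; it is an assumption about how such a check might look, not the skill's prescribed method. `execution_order` is a hypothetical name, and the input shape (task ID mapped to the IDs it depends on) is assumed to have been parsed from the dependencies table already.

```python
from collections import deque


def execution_order(deps: dict[str, list[str]]) -> list[str]:
    """Return tasks in dependency order, or raise if the graph is invalid.

    deps maps each task ID to the task IDs it depends on.
    """
    missing = {d for ds in deps.values() for d in ds} - deps.keys()
    if missing:
        raise ValueError(f"unknown dependencies: {sorted(missing)}")

    # Count unmet dependencies and record who depends on whom.
    remaining = {task: len(set(ds)) for task, ds in deps.items()}
    dependents: dict[str, list[str]] = {task: [] for task in deps}
    for task, ds in deps.items():
        for dep in set(ds):
            dependents[dep].append(task)

    ready = deque(sorted(t for t, n in remaining.items() if n == 0))
    order: list[str] = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(dependents[task]):
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    # Anything never released is on a cycle or waits on one.
    if len(order) != len(deps):
        stuck = sorted(set(deps) - set(order))
        raise ValueError(f"circular dependency involving or blocking: {stuck}")
    return order
```

The returned list doubles as a candidate execution order for the dependencies table.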
@@ -16,9 +16,9 @@
 ## Self-verification
-### Default mode
+### Implementation mode
- [ ] Every architecture interface is covered by at least one task
+- [ ] Every product interface in `architecture.md` is covered by at least one implementation task
 - [ ] No circular dependencies in the task graph
 - [ ] Cross-component dependencies are explicitly noted in affected task specs
 - [ ] `_dependencies_table.md` contains every task with correct dependencies
@@ -28,4 +28,4 @@ Use this template after cross-task verification. Save as `TASKS_DIR/_dependencie
 - Dependencies column lists tracker IDs (e.g., "AZ-43, AZ-44") or "None"
 - No circular dependencies allowed
 - Tasks should be listed in recommended execution order
- The `/implement` skill reads this table to compute parallel batches
+- The `/implement` skill reads this table to compute dependency-aware batches; task execution remains sequential
@@ -11,7 +11,7 @@ Save as `TASKS_DIR/[##]_[short_name].md` initially, then rename to `TASKS_DIR/[T
 **Task**: [TRACKER-ID]_[short_name]
 **Name**: [short human name]
 **Description**: [one-line description of what this task delivers]
-**Complexity**: [1|2|3|5|8] points
+**Complexity**: [1|2|3|5] points
 **Dependencies**: [AZ-43_shared_models, AZ-44_db_migrations] or "None"
 **Component**: [component name for context]
 **Tracker**: [TASK-ID]
@@ -102,8 +102,7 @@ Consumers MUST read that file — not this task spec — to discover the interfa
 - 2 points: Non-trivial, low complexity, minimal coordination
 - 3 points: Multi-step, moderate complexity, potential alignment needed
 - 5 points: Difficult, interconnected logic, medium-high risk
- 8 points: High difficulty, high ambiguity or coordination, multiple components
+- 8+ points: Too complex — split into smaller tasks
 - 13 points: Too complex — split into smaller tasks
 ## Output Guidelines
@@ -25,6 +25,7 @@ For each task the main agent receives a task spec, analyzes the codebase, implem
 - **Dependency-aware ordering**: tasks run only when all their dependencies are satisfied
 - **Batching for review, not parallelism**: tasks are grouped into batches so `/code-review` and commits operate on a coherent unit of work — all tasks inside a batch are still implemented one after the other
 - **Integrated review**: `/code-review` skill runs automatically after each batch
 - **Completeness before testing**: product implementation is not done until code is checked against task outcomes, included scope, architecture/component promises, and unresolved scaffold/native placeholders — not just task AC tests
 - **Auto-start**: batches start immediately — no user confirmation before a batch
 - **Gate on failure**: user confirmation is required only when code review returns FAIL
 - **Commit per batch**: after each batch is confirmed, commit. Ask the user whether to push to remote unless the user previously opted into auto-push for this session.
@@ -32,9 +33,26 @@ For each task the main agent receives a task spec, analyzes the codebase, implem
 ## Context Resolution
 - TASKS_DIR: `_docs/02_tasks/`
- Task files: all `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
+- Task files: selected `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
 - Dependency table: `TASKS_DIR/_dependencies_table.md`
 ### Task Selection Context
 The invoking flow decides which task category this run should execute. The implement skill must honor that selected context instead of consuming every file in `todo/`.
 | Context | Selected task files |
 |---------|---------------------|
 | Product implementation | Task specs that are not test-only and not refactoring specs |
 | Test implementation | `*_test_infrastructure.md` plus task specs whose `Component` or `Epic` identifies `Blackbox Tests` |
 | Refactoring | Task specs whose filename or task ID includes `_refactor_` |
 If no explicit context is provided, infer it from the active autodev step:
 - greenfield Step 7 or existing-code Step 10 → Product implementation
 - greenfield Step 10 or existing-code Step 6 → Test implementation
 - refactor Phase 4 → Refactoring
 Unselected task files remain in `TASKS_DIR/todo/` for their later flow step.
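The selection table and the step-based inference can be sketched as two small lookups. This is illustrative only: `classify_task`, `STEP_CONTEXT`, and the context labels are hypothetical names, "identifies `Blackbox Tests`" is assumed to mean an exact match on the `Component` or `Epic` field, and checking the refactoring rule before the test rule is an assumption about precedence.

```python
def classify_task(filename: str, component: str = "", epic: str = "") -> str:
    """Map one task file to the context that should execute it."""
    if filename.startswith("_"):
        return "excluded"  # e.g. _dependencies_table.md is not a task
    if "_refactor_" in filename:
        return "refactoring"
    if filename.endswith("_test_infrastructure.md"):
        return "test"
    if "Blackbox Tests" in (component, epic):  # exact match on either field
        return "test"
    return "product"


# Fallback when no explicit context is given: (flow, step or phase) -> context.
STEP_CONTEXT = {
    ("greenfield", 7): "product",
    ("existing-code", 10): "product",
    ("greenfield", 10): "test",
    ("existing-code", 6): "test",
    ("refactor", 4): "refactoring",  # Phase 4 of the refactor skill
}
```

A run in `product` context would then keep only files for which `classify_task` returns `product` and leave the rest in `todo/`.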
 ### Task Lifecycle Folders
 ```
@@ -47,7 +65,7 @@ TASKS_DIR/
 ## Prerequisite Checks (BLOCKING)
-1. `TASKS_DIR/todo/` exists and contains at least one task file — **STOP if missing**
+1. `TASKS_DIR/todo/` exists and contains at least one task file for the selected context — **STOP if missing**
 2. `_dependencies_table.md` exists — **STOP if missing**
 3. At least one task is not yet completed — **STOP if all done**
 4. **Working tree is clean** — run `git status --porcelain`; the output must be empty.
@@ -62,9 +80,9 @@ TASKS_DIR/
 ### 1. Parse
- Read all task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
+- Read selected task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
 - Read `_dependencies_table.md` — parse into a dependency graph (DAG)
- Validate: no circular dependencies, all referenced dependencies exist
+- Validate: no circular dependencies in the selected task graph, all referenced selected-task dependencies exist or are already completed in `TASKS_DIR/done/`
 ### 2. Detect Progress
@@ -102,7 +120,7 @@ If `_docs/02_document/module-layout.md` is missing or the component is not found
 ### 5. Update Tracker Status → In Progress
-For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before starting work. If `tracker: local`, skip this step.
+For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before starting work. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.
 ### 6. Implement Tasks Sequentially
@@ -188,12 +206,14 @@ Track `auto_fix_attempts` and `escalated_findings` in the batch report for retro
 ### 12. Update Tracker Status → In Testing
-After the batch is committed and pushed, transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step.
+After the batch is committed (and pushed if the user approved pushing), transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.
 ### 13. Archive Completed Tasks
 Move each completed task file from `TASKS_DIR/todo/` to `TASKS_DIR/done/`.
 For product implementation, this archive means "batch implementation accepted." The Product Implementation Completeness Gate can still require follow-up remediation tasks before the feature is complete; it does not move original task files back to `todo/`.
 ### 14. Loop
 - Go back to step 2 until all tasks in `todo/` are done
@@ -215,16 +235,70 @@ Move each completed task file from `TASKS_DIR/todo/` to `TASKS_DIR/done/`.
 - **Interaction with Auto-Fix Gate**: Architecture findings (new category from code-review Phase 7) always escalate per the implement auto-fix matrix; they cannot silently auto-fix
 - **Resumability**: if interrupted, the next invocation checks for the latest `cumulative_review_batches_*.md` and computes the changed-file set from batch reports produced after that review
-### 15. Final Test Run
+### 15. Product Implementation Completeness Gate
- After all batches are complete, run the full test suite once
+Run this gate after all **product implementation** tasks are complete and before writing any final product implementation report or allowing autodev to proceed to testability/test decomposition. Skip this gate only when the remaining context is explicitly test implementation or refactoring, as determined by the task files and report filename rules.
- Read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices)
+
- Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision
+**Goal**: catch the failure mode where narrow tests validate scaffold behavior while the task's actual outcome, included scope, architecture promise, or named integration remains unimplemented.
- When tests pass, report final summary
+
 Inputs:
 - Completed product task specs from `_docs/02_tasks/done/` for the current cycle
 - `_docs/02_document/architecture.md`
 - `_docs/02_document/system-flows.md`
 - Relevant `_docs/02_document/components/*/description.md` files
 - Current source code under each completed task's ownership envelope
 - Batch reports and code-review reports for the current cycle
 For each completed product task:
 1. Read these sections from the task spec: `Description`, `Outcome`, `Scope / Included`, `Acceptance Criteria`, `Non-Functional Requirements`, `Constraints`, and explicit named technologies or integrations.
 2. Compare those promises against actual source code, not only tests or report prose.
 3. Search the task's owned component files for unresolved implementation markers: `placeholder`, `stub`, `reserved`, `TODO`, `NotImplemented`, `pass`, `deterministic`, `fake`, `mock`, `scaffold`, `native bridge`, and empty native/readme-only integration directories. Ignore test fixtures/mocks only when they are under test-owned paths and not used as production behavior.
 4. Verify that each named runtime dependency in the task promise is either integrated behind the approved boundary or explicitly documented as a blocked prerequisite in the task/report. Examples: if a task promises FAISS, DINOv2, BASALT, LightGlue, OpenCV, RANSAC, a database, cloud service, or hardware SDK, the production code must contain that integration boundary; a deterministic fallback alone is not complete.
 5. Verify tests exercise the real implementation path where local prerequisites exist. Environment-gated tests may skip only with an explicit prerequisite reason; they do not make missing production code complete.
 6. Classify each task:
   - **PASS**: task promises are implemented or explicitly out of scope in the task itself.
   - **BLOCKED**: production code exists but cannot be fully verified due to external hardware/data/license/runtime prerequisites; the blocker is explicit and tests report blocked/skipped with reason.
   - **FAIL**: promised production behavior is missing, only scaffolded, or only represented in tests/reports.
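The marker search in item 3 can be approximated with a word-boundary scan. The sketch below is a rough aid under stated assumptions: `find_markers` is a hypothetical helper, matching is case-insensitive, and `NotImplemented` is allowed a trailing suffix so that `NotImplementedError` is also caught. Hits are only candidates for review, since words such as `pass` or `deterministic` also appear in finished code.

```python
import re

# Marker words taken from item 3 of the completeness gate.
MARKERS = [
    "placeholder", "stub", "reserved", "TODO", "NotImplemented", "pass",
    "deterministic", "fake", "mock", "scaffold", "native bridge",
]

# Assumption: let NotImplemented carry a suffix (NotImplementedError).
_ALTERNATIVES = [
    re.escape(m) + (r"\w*" if m == "NotImplemented" else "") for m in MARKERS
]
_PATTERN = re.compile(r"\b(?:" + "|".join(_ALTERNATIVES) + r")\b", re.IGNORECASE)


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line number, matched text) for every marker hit in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

The word boundaries keep identifiers like `password` from matching `pass`; a reviewer still has to decide whether each hit sits on a production path.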
 Save the audit to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` with:
 - Per-task classification
 - Evidence files/symbols checked
 - Any unresolved scaffold/native placeholders
 - Any named promised technologies not integrated
 - Required remediation task suggestions, each sized to 5 points or less
 Gate:
 - If every product task is `PASS` or `BLOCKED` with explicit prerequisite evidence, continue to Final Test Run.
 - If any product task is `FAIL`, STOP. Do not write the final product implementation report and do not proceed to any downstream autodev step. Completed original task files remain in `done/`; the missing work is represented by remediation tasks. Present a Choose block:
  - A) Create remediation tasks now and return to implementation
  - B) Mark the missing behavior explicitly out of scope in task/docs, then re-run this gate
  - C) Abort for manual correction
 - Recommend option A unless the user deliberately accepts reduced scope.
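The gate rule reduces to a simple check over the per-task classifications. A minimal sketch, assuming the classifications have already been collected into a mapping from task ID to `PASS`, `BLOCKED`, or `FAIL`; `gate_decision` and its return shape are hypothetical.

```python
def gate_decision(classifications: dict[str, str]) -> tuple[str, list[str]]:
    """Return ("CONTINUE", []) or ("STOP", failed task IDs)."""
    allowed = {"PASS", "BLOCKED", "FAIL"}
    unknown = {v for v in classifications.values() if v not in allowed}
    if unknown:
        raise ValueError(f"unknown classification(s): {sorted(unknown)}")
    failed = sorted(t for t, v in classifications.items() if v == "FAIL")
    # Any FAIL stops the flow; PASS and BLOCKED both allow continuing.
    return ("STOP", failed) if failed else ("CONTINUE", [])
```

The failed IDs returned on `STOP` are the tasks that would need remediation specs or an explicit scope reduction.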
 Remediation task creation:
 1. For each `FAIL`, create one or more task specs using `.cursor/skills/decompose/templates/task.md`; each remediation task must be sized at 5 points or less.
 2. Save each task to `_docs/02_tasks/todo/` with a short name prefixed by `remediate_`.
 3. Set **Component** to the failed task's component and set **Dependencies** to the failed task ID plus any remediation prerequisites.
 4. Create or defer tracker tickets using the same tracker rules as decompose/new-task: if tracker is available, create tickets immediately; if the user explicitly chose `tracker: local`, keep numeric prefixes with `Tracker: pending` / `Epic: pending`.
 5. Append the remediation tasks to `_docs/02_tasks/_dependencies_table.md`.
 6. Return to Step 1 (Parse) in **Product implementation** context. The final product implementation report can be written only after remediation tasks complete and this gate reruns without `FAIL`.
 ### 16. Final Test Run
 - After all batches are complete, run the full test suite once unless the invoking flow's immediate next step is `Run Tests`.
 - If the next flow step is `Run Tests`, record a handoff in the final implementation report and let `.cursor/skills/test-run/SKILL.md` own the full-suite gate to avoid duplicate full runs.
 - When this step does run, read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices).
 - Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision.
 - When tests pass, report final summary.
 ## Batch Report Persistence
-After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. When all tasks are complete, produce a FINAL implementation report with a summary of all batches. The filename depends on context:
+After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. For product implementation, produce the FINAL implementation report only after the Product Implementation Completeness Gate passes. For test and refactor implementation, produce the FINAL report after all selected tasks complete and the full-suite gate is either run or handed off per Step 16. The filename depends on context:
 - **Test implementation** (tasks from test decomposition): `_docs/03_implementation/implementation_report_tests.md`
 - **Feature implementation**: `_docs/03_implementation/implementation_report_{feature_slug}_cycle{N}.md` where `{feature_slug}` is derived from the batch task names (e.g., `implementation_report_core_api_cycle2.md`) and `{N}` is the current `state.cycle` from `_docs/_autodev_state.md`. If `state.cycle` is absent (pre-migration), default to `cycle1`.
@@ -266,6 +340,7 @@ After each batch, produce a structured report:
 | Same task rewritten 3+ times without green tests | Mark Blocked, continue batch, escalate at batch end |
 | Task blocked on external dependency (not in task list) | Report and skip |
 | File ownership violated (task wrote outside OWNED) | ASK user |
 | Product completeness gate finds missing promised implementation | STOP — create remediation tasks or get explicit user scope reduction |
 | Test failure after final test run | Delegate to test-run skill — blocking gate |
 | All tasks complete | Report final summary, suggest final commit |
 | `_dependencies_table.md` missing | STOP — run `/decompose` first |
@@ -283,4 +358,5 @@ Each batch commit serves as a rollback checkpoint. If recovery is needed:
 - Never start a task whose dependencies are not yet completed
 - Never run tasks in parallel and never spawn subagents — see `.cursor/rules/no-subagents.mdc`
 - If a task is flagged as stuck, stop working on it and report — do not let it loop indefinitely
- Always run the full test suite after all batches complete (step 15)
+- Always run the Product Implementation Completeness Gate before final product reports
 - Always run or hand off the full test suite after all batches complete (step 16)
@@ -282,7 +282,7 @@ Present using the Choose format for each decision that has meaningful alternativ
   - Update **Epic** field: `[EPIC-ID]`
 3. Rename the file from `[##]_[short_name].md` to `[TICKET-ID]_[short_name].md`
-If the work item tracker is not authenticated or unavailable (`tracker: local`):
+If the work item tracker is not authenticated or unavailable, follow `.cursor/rules/tracker.mdc` before continuing. Only if the user explicitly chooses `tracker: local`:
 - Keep the numeric prefix
 - Set **Tracker** to `pending`
 - Set **Epic** to `pending`
@@ -337,7 +337,7 @@ After the user chooses **Done**:
 | Research skill hits a blocker | Follow research skill's own escalation rules |
 | Codebase analysis reveals conflicting architectures | **ASK** user which pattern to follow |
 | Complexity exceeds 5 points | **WARN** user and suggest splitting into multiple tasks |
-| Work item tracker MCP unavailable | **WARN**, continue with local-only task files |
+| Work item tracker MCP unavailable | Follow `.cursor/rules/tracker.mdc`; do not continue in local mode unless the user explicitly chooses it |
 ## Trigger Conditions
@@ -58,4 +58,4 @@ Do NOT create minimal epics with just a summary and short description. The epic
 8. **Create "Blackbox Tests" epic** — this epic will parent the blackbox test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `tests/`.
-**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If `tracker: local`, save locally only.
+**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If tracker availability fails, follow `.cursor/rules/tracker.mdc`; only if the user explicitly chooses `tracker: local`, save locally only with pending tracker markers.
@@ -133,4 +133,4 @@ Link to architecture.md and relevant component spec.]
  - `component` — a normal per-component epic
  - `cross-cutting` — a shared concern that spans ≥2 components
  - `tests` — the blackbox-tests epic (always exactly one)
- Complexity points for child issues follow the project standard: 1, 2, 3, 5, 8. Do not create issues above 5 points — split them.
+- Complexity points for child issues follow the project standard: 1, 2, 3, 5. Do not create issues above 5 points — split them.
@@ -25,6 +25,7 @@ Phase details live in `phases/` — read the relevant file before executing each
 - **Delegate execution**: all code changes go through the implement skill via task files
 - **Ask, don't assume**: when scope or priorities are unclear, STOP and ask the user
 - **Exact-fit recommendations**: do not recommend a replacement pattern, library, service, architecture, algorithm, or "modern approach" merely because it improves structure or solves a similar class of problem. It must fit confirmed product constraints, acceptance criteria, operating context, integration boundaries, and current code realities. Otherwise reject it, mark it experimental, or ask the user before adding it to the roadmap.
 - **Per-mode API capability verification on replacements**: when a refactor proposes replacing or adding a library/SDK/framework/service that exposes multiple modes or configurations, pin the exact mode the refactored code will use (inputs, outputs, runtime) and verify *that mode* via mandatory `context7` lookup plus a saved Minimum Viable Example before promoting the recommendation to `Selected`. Capability claims at the category level ("supports A, B, C modes") must be cross-checked against the literal mode enumeration — `A, B → A+B` style conflations are the recurring silent-failure path.
 ## Context Resolution
@@ -58,7 +59,7 @@ Create REFACTOR_DIR and RUN_DIR if missing. If a RUN_DIR with the same name alre
 Both modes produce `RUN_DIR/list-of-changes.md` (template: `templates/list-of-changes.md`). Both modes then convert that file into task files in TASKS_DIR during Phase 2.
-**Guided mode cleanup**: after `RUN_DIR/list-of-changes.md` is created from the input file, delete the original input file to avoid duplication.
+**Guided mode cleanup**: after `RUN_DIR/list-of-changes.md` is created from the input file, delete the original input file only if it lives outside `RUN_DIR`. If the provided file is already the canonical `RUN_DIR/list-of-changes.md`, keep it as the audit record.
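The cleanup rule above can be sketched as a small path check. This is a minimal illustration, not part of the skill: the function name `should_delete_input` and the example paths are hypothetical, and it assumes `RUN_DIR` is available as a filesystem path.

```python
from pathlib import Path


def should_delete_input(input_file: Path, run_dir: Path) -> bool:
    """Return True only when the guided-mode input file lives outside RUN_DIR.

    Hypothetical helper: files inside RUN_DIR, including the canonical
    list-of-changes.md, are kept as the audit record.
    """
    resolved_input = input_file.resolve()
    resolved_run_dir = run_dir.resolve()
    # A file is "inside" RUN_DIR when RUN_DIR is one of its ancestors.
    inside_run_dir = resolved_run_dir in resolved_input.parents
    return not inside_run_dir


run_dir = Path("/tmp/refactor/01-testability-refactoring")
print(should_delete_input(Path("/tmp/inbox/changes.md"), run_dir))
print(should_delete_input(run_dir / "list-of-changes.md", run_dir))
```

Resolving both paths first keeps the check stable when the input is given as a relative path or through a symlink.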
 ## Workflow
@@ -80,10 +81,10 @@ Both modes produce `RUN_DIR/list-of-changes.md` (template: `templates/list-of-ch
 - "refactor [specific target]" → skip phase 1 if docs exist
 - Default → all phases
-**Testability-run specifics** (guided mode invoked by autodev existing-code flow Step 4):
+**Testability-run specifics** (guided mode invoked by autodev existing-code Step 4 or greenfield Step 8):
 - Run name is `01-testability-refactoring`.
 - Phase 3 (Safety Net) is skipped by design — no tests exist yet. Compensating control: the `list-of-changes.md` gate in Phase 1 must be reviewed and approved by the user before Phase 4 runs.
- Scope is MINIMAL and surgical; reject change entries that drift into full refactor territory (see existing-code flow Step 4 for allowed/disallowed lists). Flagged entries go to `RUN_DIR/deferred_to_refactor.md` for Step 8 (optional full refactor) consideration.
+- Scope is MINIMAL and surgical; reject change entries that drift into full refactor territory (see the invoking flow's testability step for allowed/disallowed lists). Flagged entries go to `RUN_DIR/deferred_to_refactor.md` for the next optional full-refactor step or backlog consideration.
 - After Phase 4 (Execution) completes, write `RUN_DIR/testability_changes_summary.md` as Phase 4.5. Format: one bullet per applied change.
  ```markdown
  # Testability Changes Summary ({{run_name}})
@@ -10,6 +10,17 @@
 2. Extract the **Project Constraint Matrix** from `problem.md`, `restrictions.md`, `acceptance_criteria.md`, current architecture/docs, and actual code constraints. Include required inputs/outputs, operating context, lifecycle assumptions, integration boundaries, non-functional targets, and hard disqualifiers.
 3. Research modern approaches for similar systems
 4. For each alternative pattern/library/service/architecture/algorithm, research intrinsic implementation constraints: required inputs/outputs, runtime assumptions, supported deployment modes, resource needs, operational limits, licensing/security constraints, and known failure reports.
   **API Capability Verification — Per-Mode (MANDATORY, BLOCKING for proposed replacements)**
   When a refactor recommendation replaces (or adds) a library/SDK/framework/service, the same per-mode verification used by `/research` Step 2 applies — selecting a replacement on category fit alone is the same silent-failure path. For every replacement candidate that has multiple modes or configurations:
   1. **Pin the exact mode/configuration** the refactored code will use, in one explicit sentence. Inputs (data shapes, sensor counts, payloads, rates), outputs (per `acceptance_criteria.md` and contract files), runtime (matching the project's deployment).
   2. **Run `context7` (or equivalent docs lookup)** for the candidate. **Mandatory for every replacement library/SDK/framework candidate**, not optional. Minimum three queries per candidate: mode enumeration, project's exact mode (with input/output shapes), disqualifier probe ("does this mode produce the required output? are there published limitations on this runtime?"). Append URLs to `RUN_DIR/analysis/research_findings.md` references section.
   3. **Save a Minimum Viable Example (MVE)** for the pinned mode under `RUN_DIR/analysis/mve_evidence.md` with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌. If no official example covers the project's exact configuration, the recommendation cannot be `Selected` based on category fit alone — it must be `Experimental only` (with required-evidence note) or `Rejected`.
   4. **Treat "the same library in a different mode" as a different recommendation.** If the project's pinned mode is `<X>` but the only documented evidence covers `<Y>`, do not silently soften the description. Open a separate recommendation row, with its own MVE, fit assessment, and disqualifiers.
   5. **Common silent-failure pattern**: a fact summary paraphrases docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes" — no `A+B` combination exists. Cross-check paraphrased capability claims against the literal mode enumeration.
 5. Identify what could be done differently
 6. Suggest improvements only when they fit the Project Constraint Matrix. A cleaner or more modern approach that violates product constraints must be marked `Rejected` or `Experimental only`, not added as a roadmap recommendation.
@@ -17,7 +28,8 @@ Write `RUN_DIR/analysis/research_findings.md`:
 - Current state analysis: patterns used, strengths, weaknesses
 - Alternative approaches per component: current vs alternative, pros/cons, migration effort
 - Prioritized recommendations: quick wins + strategic improvements
- Constraint-fit table: recommendation, constraints checked, evidence, mismatches/disqualifiers, status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
+- Constraint-fit table: recommendation, **pinned mode/config**, constraints checked, **API capability evidence (MVE link)**, evidence, mismatches/disqualifiers, status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
 - For every recommendation that replaces or adds a library/SDK/framework, append a **Restrictions × Candidate-Mode sub-matrix** that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's pinned mode, marking each cell ✅ Pass / ❌ Fail / ❓ Verify / N/A with cited evidence. A recommendation cannot be `Selected` while any cell is ❌ or ❓.
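The `Selected` gate on the sub-matrix can be expressed as a short check. The sketch below is illustrative only: the function name, the dictionary shape (numbered line id mapped to a cell value), and the choice to treat an empty sub-matrix as not eligible are assumptions, not something the skill prescribes.

```python
from typing import Dict

ALLOWED_CELLS = {"✅", "❌", "❓", "N/A"}
BLOCKING_CELLS = {"❌", "❓"}


def can_be_selected(sub_matrix: Dict[str, str]) -> bool:
    """Return True when no cell blocks a `Selected` status.

    sub_matrix maps a numbered restriction or acceptance-criterion id
    (hypothetical ids such as "R1", "AC2") to its cell value.
    """
    for line_id, cell in sub_matrix.items():
        if cell not in ALLOWED_CELLS:
            raise ValueError(f"{line_id}: unexpected cell value {cell!r}")
    # Assumption: an empty sub-matrix means the walk was never done,
    # so it cannot support a `Selected` status.
    if not sub_matrix:
        return False
    return not any(cell in BLOCKING_CELLS for cell in sub_matrix.values())


print(can_be_selected({"R1": "✅", "R2": "N/A", "AC1": "✅"}))
print(can_be_selected({"R1": "✅", "AC1": "❓"}))
```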
 ## 2b. Solution Assessment & Hardening Tracks
@@ -62,7 +74,7 @@ Create a work item tracker epic for this refactoring run:
 1. Epic name: the RUN_DIR name (e.g., `01-testability-refactoring`)
 2. Create the epic via configured tracker MCP
 3. Record the Epic ID — all tasks in 2d will be linked under this epic
-4. If tracker unavailable, use `PENDING` placeholder and note for later
+4. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; only use `PENDING` placeholders if the user explicitly chooses `tracker: local`
 ## 2d. Task Decomposition
@@ -88,6 +100,9 @@ Convert the finalized `RUN_DIR/list-of-changes.md` into implementable task files
 - [ ] Recommendations are grounded in actual code, not abstract
 - [ ] Every recommendation has been checked against the Project Constraint Matrix
 - [ ] No recommendation violates product restrictions, acceptance criteria, documented architecture decisions, or actual code integration boundaries
 - [ ] Every replacement library/SDK/framework recommendation has a pinned mode/config, a saved MVE in `mve_evidence.md`, and a Restrictions × Candidate-Mode sub-matrix with no ❌ or ❓ cells
 - [ ] `context7` (or equivalent) was consulted for every replacement library/SDK/framework recommendation
 - [ ] Paraphrased capability claims have been cross-checked against the literal mode-enumeration evidence (no `A, B → A+B` style conflation)
 - [ ] Rejected and experimental approaches are documented but not converted into implementation tasks without user approval
 - [ ] Roadmap phases are prioritized by impact
 - [ ] Epic created and all tasks linked to it
@@ -10,7 +10,7 @@
   - All `[TRACKER-ID]_refactor_*.md` files are present
   - Each task file has valid header fields (Task, Name, Description, Complexity, Dependencies)
 2. Verify `TASKS_DIR/_dependencies_table.md` includes the refactoring tasks
-3. Verify all tests pass (safety net from Phase 3 is green)
+3. Verify all tests pass (safety net from Phase 3 is green), unless this is a testability run where Phase 3 was intentionally skipped
 4. If any check fails, go back to the relevant phase to fix
 ## 4b. Delegate to Implement Skill
@@ -23,7 +23,7 @@ The implement skill will:
 3. Compute execution batches for the refactoring tasks
 4. Implement tasks sequentially in topological order (no subagents, no parallelism)
 5. Run code review after each batch
-6. Commit and push per batch
+6. Commit per batch and push only when the user has approved pushing
 7. Update work item ticket status
 Do NOT modify, skip, or abbreviate any part of the implement skill's workflow. The refactor skill is delegating execution, not optimizing it.
@@ -47,7 +47,7 @@ After the implement skill completes:
 For each successfully completed refactoring task:
 1. Transition the work item ticket status to **Done** via the configured tracker MCP
-2. If tracker unavailable, note the pending status transitions in `RUN_DIR/execution_log.md`
+2. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; if the user explicitly chose `tracker: local`, note the pending status transitions in `RUN_DIR/execution_log.md`
 For any failed or blocked tasks, leave their status as-is (the implement skill already set them to In Testing or blocked).
@@ -30,7 +30,27 @@ Transform vague topics raised by users into high-quality, deliverable research r
 - **Internet-first investigation** — do not rely on training data for factual claims; search the web extensively for every sub-question, rephrase queries when results are thin, and keep searching until you have converging evidence from multiple independent sources
 - **Multi-perspective analysis** — examine every problem from at least 3 different viewpoints (e.g., end-user, implementer, business decision-maker, contrarian, domain expert, field practitioner); each perspective should generate its own search queries
 - **Question multiplication** — for each sub-question, generate multiple reformulated search queries (synonyms, related terms, negations, "what can go wrong" variants, practitioner-focused variants) to maximize coverage and uncover blind spots
 - **Component option breadth** — for every component area, build a broad option landscape before selecting. Search direct candidates, adjacent-domain alternatives, commercial/open-source variants, classical/simple baselines, current SOTA, and "do not use" failure cases. A component may not be narrowed to one candidate until alternatives have been searched and rejected with evidence.
 - **Component research depth** — for every serious component candidate, go beyond discovery pages. Read official docs, repository/license files, issue discussions, benchmarks, deployment guides, version/platform requirements, security notes, maintenance signals, and real-world failure reports. Extract evidence for inputs/outputs, lifecycle assumptions, runtime/storage/latency fit, integration boundaries, licensing, operational risks, and unsupported scenarios before assigning any selection status.
 - **Exact-fit component selection** — never select a component, tool, library, service, architecture pattern, or algorithm merely because it solves a similar class of problem. It must be proven compatible with the project's explicit operating context, constraints, required inputs/outputs, non-functional requirements, lifecycle assumptions, and acceptance criteria. If fit is unproven or mismatched, mark it `Rejected`, `Experimental only`, or escalate for user decision before it can shape the solution.
 - **Per-mode API capability verification** *(applies only to technical-component selection — see Research Output Class below)* — when a candidate library/SDK/framework/service exposes multiple modes or configurations, *the candidate is not a single thing*. Pin the exact mode the project will use (one explicit sentence: inputs, outputs, runtime), and verify *that mode* against the project's required inputs/outputs via official docs (mandatory `context7` lookup) plus a saved Minimum Viable Example. Capability claims at the category level ("supports X, Y, Z modes") must be cross-checked against the literal mode enumeration before being treated as project-applicable. Two modes of one library are two distinct candidates for the purposes of the Component Applicability Gate. Does not apply to non-technical research (concept comparison, market/policy investigation, knowledge organization, etc.).
 ## Research Output Class (BLOCKING — set in Step 1)
 Before applying any of the technical-component gates (per-mode API capability verification, Component Applicability Gate, Restrictions × Candidate-Mode sub-matrix, MVE evidence, mandatory `context7` lookup), classify the research output into one of two classes. Record the decision in `00_question_decomposition.md` once, near the top, so every downstream step honors it.
 | Class | What the output recommends or selects | Examples | Technical-component gates apply? |
 |-------|---------------------------------------|----------|----------------------------------|
 | **Technical-component selection** | One or more libraries, SDKs, frameworks, services, protocols, data formats, infrastructure patterns, algorithms, or APIs that will be implemented or operated against | "Pick a vector database", "Compare auth-token strategies for our API", "Should we use Kafka or RabbitMQ?", architecture / tech-stack / migration drafts (Mode A, Mode B) | **Yes — all gates active** |
 | **Non-technical investigation** | Concept comparisons, knowledge organization, root-cause investigation of an event, market/policy/regulatory/social analysis, literature review, decision support without committing to specific tooling | "Why did adoption stall in Q3?", "Compare phenomenology vs constructivism", "Map regulatory landscape for X", "What do practitioners say about onboarding under remote-first orgs?" | **No — skip API/MVE/sub-matrix gates; the rest of the 8-step engine still applies** |
 How to decide:
 1. Inspect the question and the input files (`problem.md`, `restrictions.md`, `acceptance_criteria.md`, or the standalone input file).
 2. If the deliverable will name specific software/services/protocols that someone will then build with or operate, it is **Technical-component selection**.
 3. If the deliverable is a report, comparison, or recommendation that does not commit to specific tooling, it is **Non-technical investigation**.
 4. **Mixed runs are valid.** Some research questions have a non-technical core but include one technical sub-question (or vice versa). In that case classify per component area within the run, not the run as a whole, and note in `00_question_decomposition.md` which component areas trigger the technical-component gates.
 When the run is purely **Non-technical investigation**, the rest of the research engine — question decomposition, perspective rotation, exhaustive web search, fact extraction, comparison framework, reasoning chain, validation, deliverable formatting — still applies in full. The sections that get skipped are explicitly the technical gates listed in the table above.
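For mixed runs, the per-component-area classification might be recorded along these lines. This is a sketch under stated assumptions: the `OutputClass` enum, the `ComponentArea` dataclass, and the example area names are hypothetical; only the two class names and the gate names come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class OutputClass(Enum):
    TECHNICAL = "Technical-component selection"
    NON_TECHNICAL = "Non-technical investigation"


# Gate names taken from the section above.
TECHNICAL_GATES = [
    "per-mode API capability verification",
    "Component Applicability Gate",
    "Restrictions × Candidate-Mode sub-matrix",
    "MVE evidence",
    "mandatory context7 lookup",
]


@dataclass
class ComponentArea:
    name: str
    output_class: OutputClass


def gates_for(area: ComponentArea) -> List[str]:
    """Technical gates apply only to technical component areas."""
    if area.output_class is OutputClass.TECHNICAL:
        return list(TECHNICAL_GATES)
    return []


# A mixed run: one non-technical core question, one technical sub-question.
areas = [
    ComponentArea("adoption root cause", OutputClass.NON_TECHNICAL),
    ComponentArea("message broker choice", OutputClass.TECHNICAL),
]
for area in areas:
    print(area.name, "->", len(gates_for(area)), "gates")
```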
 ## Context Resolution
@@ -27,12 +27,23 @@
 - [ ] Iterative deepening completed: follow-up questions from initial findings were searched
 - [ ] No sub-question relies solely on training data without web verification
 ## Component Option Breadth
 - [ ] `00_question_decomposition.md` contains a Component Option Search Plan
 - [ ] Every component area was searched across simple baseline, established production, open-source, commercial/vendor, current SOTA, adjacent-domain, no-build/defer, and known-bad options where applicable
 - [ ] Every component area has at least 3 realistic candidates, or a documented explanation of why broad searches found fewer
 - [ ] Each lead candidate has official/source-of-truth evidence plus independent validation when available
 - [ ] Each component area includes at least one baseline/fallback option and at least one rejected or experimental option when possible
 - [ ] Alternative names, synonyms, and neighboring-domain terms were searched before declaring the option landscape complete
 - [ ] Licensing, runtime, platform, maintenance, and unsupported-scenario searches were performed for every lead, fallback, and rejected candidate
 ## Mode A Specific
 - [ ] Phase 1 completed: AC assessment was presented to and confirmed by user
 - [ ] AC assessment consistent: Solution draft respects the (possibly adjusted) acceptance criteria and restrictions
 - [ ] Competitor analysis included: Existing solutions were researched
 - [ ] All components have comparison tables: Each component lists alternatives with tools, advantages, limitations, security, cost
 - [ ] Component options are broad: component tables include baseline, production, open-source, commercial/vendor, SOTA/research, adjacent-domain, defer/no-build, and disqualified options where applicable
 - [ ] Tools/libraries verified: Suggested tools actually exist and work as described
 - [ ] Component fit matrix completed: `06_component_fit_matrix.md` exists and every selected component/tool/pattern is marked `Selected`
 - [ ] No field-adjacent substitution: no selected candidate is chosen only because it solves a similar class of problem while failing the project's explicit constraints
@@ -47,6 +58,7 @@
 - [ ] New draft is self-contained: Written as if from scratch, no "updated" markers
 - [ ] Performance column included: Mode B comparison tables include performance characteristics
 - [ ] Previous draft issues addressed: Every finding in the table is resolved in the new draft
 - [ ] Existing selected components were challenged against a broad alternative landscape before being kept
 - [ ] Existing component fit audited: every old and new component/tool/pattern was checked against `restrictions.md`, `acceptance_criteria.md`, and the Project Constraint Matrix
 - [ ] Rejected/experimental candidates are not lead recommendations unless the user explicitly accepted the risk
@@ -84,6 +96,7 @@ When the research topic has Critical or High sensitivity level:
 ## Exact-Fit Validation (BLOCKING)
 - [ ] Project Constraint Matrix extracted from problem context before component selection
 - [ ] Component fit matrix includes `Component Area`, `Option Family`, and `Pinned Mode/Config` columns
 - [ ] Every selected component/tool/library/service/pattern/algorithm has evidence for required inputs/outputs and integration boundaries
 - [ ] Every selected candidate has evidence for the operating context and lifecycle assumptions it must support
 - [ ] Every selected candidate has evidence for non-functional targets that are binding for the project
@@ -91,3 +104,21 @@ When the research topic has Critical or High sensitivity level:
 - [ ] Mismatches are recorded as disqualifiers, not softened into generic limitations
 - [ ] Any candidate with unproven fit is marked `Experimental only` or escalated for user decision
 - [ ] Any candidate with documented constraint conflict is marked `Rejected`
 ## API Capability Verification (BLOCKING)
 **Applicability**: this checklist applies only when the run is classified as **Technical-component selection** (see SKILL.md → Research Output Class). For non-technical research (concept comparison, market/policy investigation, root-cause analysis, knowledge organization), skip this checklist entirely and note the skip in `05_validation_log.md`. For mixed runs, apply only to technical component areas.
 For every lead candidate that is a library/SDK/framework/service:
 - [ ] The exact mode/configuration the project will use is pinned in one explicit sentence (inputs, outputs, runtime); no vague "supports X" language
 - [ ] `context7` (or equivalent docs lookup) was run for the candidate, with at least 3 queries: mode enumeration, project's exact mode, disqualifier probe
 - [ ] All consulted URLs from context7 / official docs are appended to `01_source_registry.md`
 - [ ] A Minimum Viable Example (MVE) was saved for the pinned mode in `02_fact_cards.md` (or `02_mve_evidence.md`) with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌
 - [ ] When the MVE inputs or outputs do not exactly match the project's, the mismatch is cited from the official docs (not inferred), and the candidate is `Experimental only` or `Rejected`
 - [ ] When a library has multiple modes, each project-relevant mode appears as its own candidate row (not a single library row that softens across modes)
 - [ ] Restrictions × Candidate-Mode sub-matrix in `06_component_fit_matrix.md` is filled for every lead candidate, with one row per numbered restriction and per numbered acceptance criterion
 - [ ] Sub-matrix uses ✅ / ❌ / ❓ / N/A only — no free-form prose substitutes
 - [ ] No `Selected` candidate has any ❌ or ❓ cell in its sub-matrix
 - [ ] "Validation gate required" footnotes are explicitly classified as either *API capability* (must be resolved here) or *runtime quality* (may be carried forward)
 - [ ] Paraphrased capability claims in fact cards have been cross-checked against the literal mode-enumeration evidence (no `mono, inertial → mono-inertial` style conflation)
@@ -40,6 +40,7 @@ Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources
 - "What existing/competitor solutions address this problem?"
 - "What are the component parts of this problem?"
 - "For each component, what are the state-of-the-art solutions?"
 - "For each component, what are the practical alternatives across simple baseline, established production option, open-source option, commercial option, current SOTA, adjacent-domain option, and no-build/defer option?"
 - "What are the security considerations per component?"
 - "What are the cost implications of each approach?"
@@ -48,6 +49,7 @@ Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources
 - "What are the security vulnerabilities in the proposed architecture?"
 - "Where are the performance bottlenecks?"
 - "What solutions exist for each identified issue?"
 - "For each component already selected in the draft, what alternatives should be considered before keeping, replacing, or rejecting it?"
 **General sub-question patterns** (use when applicable):
 - **Sub-question A**: "What is X and how does it work?" (Definition & mechanism)
@@ -84,6 +86,27 @@ For **each sub-question**, generate **at least 3-5 search query variants** befor
 Record all planned queries in `00_question_decomposition.md` alongside each sub-question.
 #### Component Option Breadth (MANDATORY)
 Before Step 2, identify the component areas implied by the problem and create a search plan for options in each area. A component area is any replaceable tool, library, model, service, algorithm, data format, protocol, infrastructure pattern, or validation approach that could materially affect the solution.
 For every component area, generate search queries for these option families unless clearly not applicable:
 - **Simple baseline**: low-complexity classical or manual approach that can serve as a fallback or regression baseline.
 - **Established production option**: mature library/service/pattern with field usage.
 - **Open-source candidate**: permissive-license option with inspectable implementation and community history.
 - **Commercial/vendor option**: paid or vendor-supported option, including SDK/platform constraints.
 - **Current SOTA / research option**: recent model, paper, or benchmark leader that may be promising but immature.
 - **Adjacent-domain option**: solution from a neighboring domain with similar constraints.
 - **No-build / defer option**: whether the component can be avoided, simplified, or moved out of scope.
 - **Known bad option**: candidate or family that appears attractive but has documented failure modes or disqualifiers.
 For each component area, record:
 - Candidate names and option families to search.
 - At least 5 query variants covering alternatives, comparisons, limitations, licensing, runtime/scale, and exact project constraints.
 - The minimum evidence needed to mark a candidate `Selected`, `Rejected`, `Experimental only`, or `Needs user decision`.
 Add this as a "Component Option Search Plan" section in `00_question_decomposition.md`.
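A plan entry for one component area could be checked for gaps roughly as follows. The sketch is hypothetical: the function name `plan_gaps` and its argument shapes are assumptions, and the family labels are shortened from the list above.

```python
from typing import List, Set

# Option families from the list above (labels shortened for illustration).
OPTION_FAMILIES: List[str] = [
    "simple baseline",
    "established production",
    "open-source",
    "commercial/vendor",
    "current SOTA / research",
    "adjacent-domain",
    "no-build / defer",
    "known bad",
]
MIN_QUERY_VARIANTS = 5


def plan_gaps(
    families_planned: Set[str],
    not_applicable: Set[str],
    query_variants: List[str],
) -> List[str]:
    """List what a component area's search plan is still missing."""
    gaps = []
    for family in OPTION_FAMILIES:
        # A family may be skipped only when marked clearly not applicable.
        if family not in families_planned and family not in not_applicable:
            gaps.append(f"missing option family: {family}")
    if len(query_variants) < MIN_QUERY_VARIANTS:
        gaps.append(
            f"only {len(query_variants)} query variants; "
            f"need at least {MIN_QUERY_VARIANTS}"
        )
    return gaps


for gap in plan_gaps({"simple baseline", "open-source"}, {"commercial/vendor"}, ["q1", "q2"]):
    print(gap)
```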
 **Research Subject Boundary Definition (BLOCKING - must be explicit)**:
 When decomposing questions, you must explicitly define the **boundaries of the research subject**:
@@ -123,6 +146,7 @@ Record the audit result in `00_question_decomposition.md` as a "Completeness Aud
   - List of decomposed sub-questions
   - **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale
   - **Search query variants** for each sub-question (at least 3-5 per sub-question)
   - **Component Option Search Plan** (component areas, option families, candidate names, query variants, required evidence)
   - **Completeness audit** (taxonomy cross-reference + domain discovery results)
 4. Use TodoWrite to track progress
@@ -136,7 +160,7 @@ Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). C
 **Tool Usage**:
 - Use `WebSearch` for broad searches; `WebFetch` to read specific pages
- Use the `context7` MCP server (`resolve-library-id` then `get-library-docs`) for up-to-date library/framework documentation
+- Use the `context7` MCP server (`resolve-library-id` then `query-docs` / `get-library-docs`) for up-to-date library/framework documentation. **Mandatory per lead candidate** — see "API Capability Verification" below.
 - Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories)
 - When citing web sources, include the URL and date accessed
@@ -149,6 +173,13 @@ Do not stop at the first few results. The goal is to build a comprehensive evide
 - Consult at least **2 different source tiers** per sub-question (e.g., L1 official docs + L4 community discussion)
 - If initial searches yield fewer than 3 relevant sources for a sub-question, **broaden the search** with alternative terms, related domains, or analogous problems
 **Minimum search effort per component area**:
 - Search every option family from the "Component Option Search Plan" before choosing a lead candidate.
 - For each lead, fallback, or rejected candidate, search at least one official/source-of-truth page and at least one independent validation source when available.
 - Search `"[component] alternatives"`, `"[candidate] vs [alternative]"`, `"[candidate] limitations"`, `"[candidate] license"`, `"[candidate] production"`, and `"[candidate] [binding project constraint]"`.
 - If fewer than 3 realistic candidates are found for a component area, explicitly document why the landscape is narrow and search adjacent domains before accepting that result.
 - Include at least one simple baseline and one "do not use" or disqualified candidate per component area when possible; these prevent false confidence in the selected option.
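The query patterns listed above can be expanded mechanically for each candidate. A minimal sketch, assuming hypothetical names (`QUERY_TEMPLATES`, `build_queries`) and example values chosen only for illustration:

```python
from typing import List

# Templates mirror the quoted search patterns in the list above.
QUERY_TEMPLATES: List[str] = [
    "{component} alternatives",
    "{candidate} vs {alternative}",
    "{candidate} limitations",
    "{candidate} license",
    "{candidate} production",
    "{candidate} {constraint}",
]


def build_queries(component: str, candidate: str, alternative: str, constraint: str) -> List[str]:
    """Fill every template for one candidate in one component area."""
    return [
        template.format(
            component=component,
            candidate=candidate,
            alternative=alternative,
            constraint=constraint,
        )
        for template in QUERY_TEMPLATES
    ]


for query in build_queries("message queue", "Kafka", "RabbitMQ", "on-prem deployment"):
    print(query)
```

In practice the `constraint` argument would be a binding phrase taken from `restrictions.md` or `acceptance_criteria.md`, and the function would be called once per candidate and alternative pair.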
 **Candidate implementation-limit searches (MANDATORY)**:
 For every component/tool/library/service/pattern/algorithm that may be selected or recommended, search for its intrinsic implementation constraints. Do not rely on product category labels, marketing summaries, or examples from a different operating context. Include query variants for:
 - Official supported inputs/outputs, protocols, data formats, and deployment modes
@@ -159,6 +190,48 @@ For every component/tool/library/service/pattern/algorithm that may be selected
 - Licensing, security, maintenance, and community-health constraints
 - Exact phrases from the project's restrictions and acceptance criteria combined with the candidate name
 **API Capability Verification — Per-Mode (MANDATORY, BLOCKING for lead candidates)**:
 **Applicability**: this section applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section, and only to lead candidates that are libraries/SDKs/frameworks/services/protocols/data formats with multiple modes or configurations. For non-technical research (concept comparison, market/policy investigation, knowledge organization, root-cause analysis without tooling commitments), skip this entire sub-section and continue with the rest of Step 2 — the broader candidate implementation-limit search above is sufficient. State the skip explicitly once in `02_fact_cards.md`: `API Capability Verification: not applicable — this run is a Non-technical investigation, no library/SDK/service candidates`.
 Most libraries/SDKs/services expose **multiple modes or configurations** (e.g., monocular vs stereo VO, sync vs async API, batch vs streaming inference, write-through vs write-behind cache). Selecting a candidate "because it supports X" without pinning *which mode* the project will use, and *whether that exact mode produces the required outputs from the required inputs*, is the most common silent-failure path in research. A library can support a class of problem in mode A while being unusable for the project's specific configuration in mode B.
 For every lead candidate that is a library/SDK/framework/service with multiple modes or configurations, do the following — in this order, before marking the candidate `Selected`:
 1. **Pin the exact mode/configuration the project will use.**
   Derived from the Project Constraint Matrix: which inputs are available (sensor count, sensor types, data shapes, rates), which outputs are required (per `acceptance_criteria.md` and contract files), which hardware/runtime is fixed (per `restrictions.md`). Write this as a single sentence: "We will use `<library>` in `<mode/config>` with inputs `<list>` and expect outputs `<list>` on `<runtime>`." Do not progress past this step on a vague mode description.
 2. **Run `context7` (or equivalent docs lookup) for the candidate** — this is **mandatory for every lead library/SDK/framework candidate**, not optional. Minimum three queries per candidate:
   1. *Mode enumeration*: "What modes/configurations does `<library>` support? List every value of the mode/config enum and what each requires as input."
   2. *Project's exact mode*: "Show a minimum runnable example of `<library>` in `<the pinned mode>` with `<the project's input shape>`. What does it produce?"
   3. *Disqualifier probe*: "Does `<library>` `<the pinned mode>` produce `<the required output>`? Are there published limitations of `<the pinned mode>` for `<the project's runtime/hardware>`?"
   For services without context7 coverage, use official docs site + WebFetch on the API reference page + the project's example/tutorial directory in the source repo. Append every consulted URL to `01_source_registry.md`.
 3. **Save a Minimum Viable Example (MVE) for the pinned mode.**
   Append to `02_fact_cards.md` (or a sibling `02_mve_evidence.md`) at least one block per lead library candidate with:
   ```markdown
   ## MVE — <library> in <pinned mode>
   - **Source**: <official URL or context7 reference, with date>
   - **Inputs in the example**: <e.g., 2 calibrated cameras + IMU at 200 Hz>
   - **Outputs in the example**: <e.g., 6-DoF pose with covariance>
   - **Project inputs**: <e.g., 1 camera + IMU at 200 Hz>
   - **Project outputs required**: <e.g., 6-DoF pose with metric translation>
   - **Match assessment**: ✅ exact match / ⚠️ partial (specify dimension) / ❌ mismatch (specify dimension)
   - **If ⚠️ or ❌**: cite the official-docs sentence that establishes the mismatch.
   ```
   If no official example covers the project's exact configuration → the candidate cannot be marked `Selected` based on category fit alone. Status must be `Experimental only` (with required-evidence note) or `Rejected` (when the docs explicitly disqualify the configuration).
 4. **Bind every numbered Restriction and Acceptance Criterion to the candidate's pinned mode.**
   For each numbered line in `restrictions.md` and `acceptance_criteria.md`, decide one of: `Pass` (the pinned mode satisfies it with cited evidence), `Fail` (the pinned mode contradicts it with cited evidence), `Verify` (no evidence either way; deeper investigation required), `N/A` (the line is irrelevant to this component area). Record this in `02_fact_cards.md` under the candidate's MVE block. The structural matrix in Step 7.5 reads from these bindings.
 5. **Treat "the same library in a different mode" as a different candidate.**
   If the project's pinned mode is `Monocular` but the only documented evidence covers `Stereo`, do not silently soften "rotation only" into "rotation + translation". Open a separate candidate row for the Monocular mode, with its own MVE, fit assessment, and disqualifiers. Two modes of one library are two distinct candidates for the purposes of this gate.
 **Common silent-failure pattern this guards against**: a fact card paraphrases the docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes". A category-level "Selected" decision then carries through every downstream artifact, masking that the project's required A+B combination does not exist as a single mode.
 **Search broadening strategies** (use when results are thin):
 - Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation"
 - Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums
@@ -26,6 +26,7 @@ Write to `03_comparison_framework.md`:
 **Required exact-fit dimensions for component/tool decisions**:
 When the output selects or recommends a component, tool, library, service, architecture pattern, or algorithm, the framework MUST include these dimensions unless explicitly not applicable:
 - Option family (`Simple baseline`, `Established production`, `Open-source`, `Commercial/vendor`, `Current SOTA`, `Adjacent-domain`, `No-build/defer`, `Known bad`)
 - Required inputs/outputs and ownership boundaries
 - Operating context and lifecycle fit
 - Non-functional envelope fit
@@ -33,6 +34,8 @@ When the output selects or recommends a component, tool, library, service, archi
 - Evidence quality and source tier
 - Selection status (`Selected`, `Rejected`, `Experimental only`, `Needs user decision`)
 For each component area, include multiple candidates in the initial population. Do not present only the preferred option unless the investigation found no realistic alternatives; if so, state the searches that proved the narrow landscape.
 ---
 ### Step 5: Reference Point Baseline Alignment
@@ -141,26 +144,61 @@ If using Y: [expected behavior]
 ### Step 7.5: Component Applicability Gate (BLOCKING)
-Before finalizing the solution draft, build an exact-fit matrix for every component/tool/library/service/pattern/algorithm that is selected, recommended, rejected, or treated as a fallback.
+**Applicability**: this gate applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section. For non-technical research (concept comparison, market/policy investigation, root-cause analysis without tooling, knowledge organization), skip this entire step and proceed to Step 8 — there are no components to gate. State the skip once in `05_validation_log.md`: `Step 7.5 (Component Applicability Gate): not applicable — Non-technical investigation`. For mixed runs (some component areas technical, some not), apply this gate only to the technical component areas; the non-technical ones do not produce 7.5 rows.
 Before finalizing the solution draft, build an exact-fit matrix for every component/tool/library/service/pattern/algorithm that is selected, recommended, rejected, or treated as a fallback. Free-form prose in a "Project Constraints Checked" column is **not sufficient** — mismatches hide inside rationale text. The matrix must be structured per restriction and per acceptance criterion.
 #### 7.5.1 Top-level Component Fit Matrix
 ```markdown
 # Component Fit Matrix
-| Candidate | Intended Role | Project Constraints Checked | Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
+| Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
-|-----------|---------------|-----------------------------|----------|----------------------------|--------|--------------------|
+|----------------|-----------|--------------------|---------------|---------------|-------------------------|----------------------------|--------|--------------------|
-| [name] | [role] | [constraints] | [Fact # / Source #] | [none / list] | Selected / Rejected / Experimental only / Needs user decision | [why] |
+| [area] | [name] | [exact mode/config the project will use, copied verbatim from the MVE block in Step 2] | [family] | [role] | MVE: [link to MVE block in `02_fact_cards.md` or `02_mve_evidence.md`]; docs: [Source #] | [none / list] | Selected / Rejected / Experimental only / Needs user decision | [why] |
 ```
-Rules:
+The new **Pinned Mode/Config** column is mandatory. A row without a pinned mode is incomplete. The new **API Capability Evidence** column links to the Minimum Viable Example saved during Step 2's API Capability Verification — without an MVE link the candidate cannot be `Selected`.
- `Selected` is allowed only when the candidate's documented implementation assumptions match the project's explicit constraints and acceptance criteria.
+
- `Experimental only` is required when a candidate might work but lacks proof for the exact operating context.
+#### 7.5.2 Restrictions × Candidate-Modes Sub-Matrix (MANDATORY)
- `Rejected` is required when documented assumptions conflict with project constraints.
+
- `Needs user decision` is required when a mismatch changes scope, cost, safety, product behavior, or acceptance criteria.
+For each lead candidate row in the top-level matrix, append a structured cross-check that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's **pinned mode/config**.
 ```markdown
 ## Sub-Matrix — <Candidate Name> in <Pinned Mode>
 | Restriction / AC | Candidate-mode behavior | Result | Evidence |
 |------------------|-------------------------|--------|----------|
 | R1: <verbatim line from restrictions.md> | <how the pinned mode behaves under this restriction> | ✅ Pass / ❌ Fail / ❓ Verify / N/A | [Fact # / Source # / MVE link] |
 | R2: ... | ... | ... | ... |
 | ... | ... | ... | ... |
 | AC-1.1: <verbatim line from acceptance_criteria.md> | <how the pinned mode satisfies (or contradicts) this AC's measurable target> | ✅ / ❌ / ❓ / N/A | [Fact # / Source # / MVE link] |
 | AC-1.2: ... | ... | ... | ... |
 | ... | ... | ... | ... |
 ```
 Cell semantics:
 - ✅ **Pass** — the candidate's pinned mode satisfies this line, with cited official-doc or MVE evidence.
 - ❌ **Fail** — the candidate's pinned mode contradicts this line, with cited evidence. Even one ❌ disqualifies the candidate from `Selected` status.
 - ❓ **Verify** — no evidence yet either way; further investigation required (loops back to Step 2 / Step 3.5). A row left ❓ at the end of analysis blocks the candidate.
 - **N/A** — the line is irrelevant to this component area (state why in one phrase).
 A candidate row may not be marked `Selected` while any cell is ❌ or ❓.
 #### 7.5.3 Decision Rules
 - `Selected` is allowed only when (a) the top-level row has an MVE link, (b) the sub-matrix has zero ❌, (c) the sub-matrix has zero ❓, and (d) the candidate's documented implementation assumptions match the project's explicit constraints and acceptance criteria.
 - `Experimental only` is required when a candidate might work but lacks proof for the exact operating context (e.g., MVE exists for a similar configuration but not the exact one).
 - `Rejected` is required when documented assumptions conflict with project constraints (any sub-matrix row is ❌ with cited evidence).
 - `Needs user decision` is required when a mismatch changes scope, cost, safety, product behavior, or acceptance criteria — and the user has not yet been consulted.
 - Each component area must include at least one selected or fallback-safe option, plus the most credible rejected/experimental alternatives discovered during web research.
 - A component area with only one candidate is incomplete unless `00_question_decomposition.md` documents the broader searches and why they yielded no realistic alternatives.
 - A candidate may not appear as the lead solution in Step 8 unless this gate marks it `Selected`.
 - "Validation gate required" footnotes are not equivalent to `Selected`. If the validation gate concerns API capability (does the mode produce the required output?), that is a Step-2 / Step-7.5 question and must be resolved here, not deferred to runtime. Only validation gates concerning *runtime quality* (e.g., "does this VO converge on this terrain class?") may be carried forward as `Selected with runtime gate`.
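The decision rules above can be sketched as a small function. This is a minimal illustration, not part of the skill: the names `Cell` and `candidate_status` are hypothetical, the precedence between statuses is an assumption, and mapping a leftover ❓ or a missing MVE link to `Experimental only` is one reading of "lacks proof for the exact operating context". Condition (d) is passed in as a manual judgment (`assumptions_match`).

```python
from enum import Enum


class Cell(Enum):
    """Result of one Restriction / AC row in the sub-matrix."""
    PASS = "pass"
    FAIL = "fail"
    VERIFY = "verify"
    NA = "n/a"


def candidate_status(cells, has_mve_link, assumptions_match=True,
                     needs_user_decision=False):
    """Return the gate status for one candidate in its pinned mode.

    Assumed precedence: user decision, then rejection, then missing proof.
    """
    if needs_user_decision:
        # A mismatch changes scope/cost/safety and the user was not consulted.
        return "Needs user decision"
    if Cell.FAIL in cells or not assumptions_match:
        # Any cited contradiction disqualifies the candidate.
        return "Rejected"
    if not has_mve_link or Cell.VERIFY in cells:
        # Might work, but the exact configuration is unproven.
        return "Experimental only"
    return "Selected"
```

Under these assumptions, a candidate with all rows ✅ or N/A and a saved MVE is the only one that reaches `Selected`, which matches conditions (a) through (d).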
-**Save action**: Write `06_component_fit_matrix.md`.
+**Save action**: Write `06_component_fit_matrix.md` containing both 7.5.1 (top-level) and 7.5.2 (per-candidate sub-matrices).
-**BLOCKING**: If any lead candidate is `Experimental only`, `Rejected`, or `Needs user decision`, do not silently proceed. Ask the user or choose a different selected candidate.
+**BLOCKING**: If any lead candidate has ❌, ❓, `Experimental only`, `Rejected`, or `Needs user decision` status, do not silently proceed. Ask the user or choose a different selected candidate.
 ---
@@ -10,17 +10,21 @@
 [Architecture solution that meets restrictions and acceptance criteria.]
 > **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical research outputs (concept comparison, market/policy report, investigation answer), this Architecture section may be replaced with a comparison/analysis section that does not use these columns; or the columns may be marked `N/A` per row when the row describes a non-technical "component" (a process, a policy, an organizational construct). For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
 ### Component: [Component Name]
-| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
+| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
-|----------|-------|-----------|-------------|-------------|----------|------|-----|
+|----------|-------|--------------------|-----------|-------------|-------------|----------|------|-------------------------|-----|
-| [Option 1] | [lib/platform] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
-| [Option 2] | [lib/platform] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |
 **Exact-fit evidence**:
 - Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
 - Evidence: [Fact # / Source #]
 - Disqualifiers: [none or list]
 - Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` § <Candidate Name>
 - API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected
 [Repeat per component]
@@ -13,17 +13,21 @@
 [Architecture solution that meets restrictions and acceptance criteria.]
 > **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical assessment outputs (e.g., reassessing a policy approach, comparing organizational designs), this Architecture section may be replaced with the assessment content that does not use these columns; or the columns may be marked `N/A` per row for non-technical "components". For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
 ### Component: [Component Name]
-| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
+| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
-|----------|-------|-----------|-------------|-------------|----------|------------|-----|
+|----------|-------|--------------------|-----------|-------------|-------------|----------|------------|-------------------------|-----|
-| [Option 1] | [lib/platform] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
-| [Option 2] | [lib/platform] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |
 **Exact-fit evidence**:
 - Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
 - Evidence: [Fact # / Source #]
 - Disqualifiers: [none or list]
 - Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` § <Candidate Name>
 - API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected
 [Repeat per component]
@@ -22,7 +22,7 @@ test-run has two modes. The caller passes the mode explicitly; if missing, defau
 | Mode | Scope | Typical caller | Input artifacts |
 |------|-------|---------------|-----------------|
 | `functional` (default) | Unit / integration / blackbox tests — correctness | autodev Steps that verify after Implement Tests or Implement | `scripts/run-tests.sh`, `_docs/02_document/tests/environment.md`, `_docs/02_document/tests/blackbox-tests.md` |
-| `perf` | Performance / load / stress / soak tests — latency, throughput, error-rate thresholds | autodev greenfield Step 9, existing-code Step 15 (pre-deploy) | `scripts/run-performance-tests.sh`, `_docs/02_document/tests/performance-tests.md`, AC thresholds in `_docs/00_problem/acceptance_criteria.md` |
+| `perf` | Performance / load / stress / soak tests — latency, throughput, error-rate thresholds | autodev greenfield Step 15, existing-code Step 15 (pre-deploy) | `scripts/run-performance-tests.sh`, `_docs/02_document/tests/performance-tests.md`, AC thresholds in `_docs/00_problem/acceptance_criteria.md` |
 Direct user invocation (`/test-run`) defaults to `functional`. If the user says "perf tests", "load test", "performance", or passes a performance scenarios file, run `perf` mode.
@@ -95,7 +95,7 @@ Examples:
 File: `expected_results/image_01_detections.json`
-```json
+```json
 {
  "input": "image_01.jpg",
  "expected": {
@@ -119,7 +119,7 @@ File: `expected_results/image_01_detections.json`
    ]
  }
 }
-```
+```
 ```
 ---
@@ -0,0 +1,27 @@
 .git
 .github
 .cursor
 _docs
 .venv
 __pycache__
 .pytest_cache
 .ruff_cache
 .mypy_cache
 .env
 .env.*
 *.pem
 *.key
 *.secret
 data/input/*
 data/cache/*
 data/fdr/*
 data/test-results/*
 *.tlog
 *.ulg
 *.bag
 *.mcap
 *.cbor
 *.parquet
 *.mp4
 *.mov
 *.avi
@@ -0,0 +1,10 @@
 GPSD_ENV=development
 GPSD_CONFIG_DIR=./config/development
 GPSD_CACHE_DIR=./data/cache
 GPSD_FDR_DIR=./data/fdr
 GPSD_DATABASE_URL=postgresql://gpsd:gpsd@localhost:5432/gpsd
 GPSD_MAVLINK_URL=udp:127.0.0.1:14550
 GPSD_CAMERA_SOURCE=./data/input
 GPSD_SIGNING_KEY_REF=test-key-ref
 GPSD_MAX_FDR_BYTES=104857600
 GPSD_LOG_LEVEL=info
@@ -0,0 +1,43 @@
 name: CI
 on:
  pull_request:
  push:
    branches:
      - dev
 jobs:
  python-quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: Install
        run: |
          python -m pip install --upgrade pip
          python -m pip install -e ".[dev]"
      - name: Format check
        run: python -m black --check src tests
      - name: Lint
        run: python -m ruff check src tests
      - name: Unit tests
        run: python -m pytest tests/unit
  replay-compose-smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate compose files
        run: |
          docker compose -f docker-compose.yml config
          docker compose -f docker-compose.test.yml config
      - name: Collect artifact placeholders
        run: mkdir -p data/test-results e2e/reports
      - uses: actions/upload-artifact@v4
        with:
          name: replay-evidence-placeholders
          path: |
            data/test-results
            e2e/reports
@@ -1 +1,42 @@
 .DS_Store
 .venv/
 __pycache__/
 *.py[cod]
 .pytest_cache/
 .ruff_cache/
 .mypy_cache/
 .coverage
 htmlcov/
 *.egg-info/
 .env
 .env.*
 !.env.example
 *.pem
 *.key
 *.secret
 data/input/*
 data/cache/*
 data/fdr/*
 data/test-results/*
 data/expected/*
 !data/input/.gitkeep
 !data/cache/.gitkeep
 !data/fdr/.gitkeep
 !data/test-results/.gitkeep
 !data/expected/.gitkeep
 *.tlog
 *.ulg
 *.bag
 *.mcap
 *.cbor
 *.parquet
 *.mp4
 *.mov
 *.avi
 *.jpg
 *.jpeg
 *.png
 !_docs/00_problem/input_data/**
@@ -0,0 +1,22 @@
 # GPS-Denied Onboard Runtime
 Scaffold for the Jetson-hosted GPS-denied localization runtime, replay harness, and
 deployment evidence paths.
 The project uses a Python `src/` layout for orchestration code. Native bridge
 placeholders live inside the owning component folders rather than in a shared
 native tree.
 Generated mission data, FDR payloads, cache payloads, and raw frame dumps are kept
 out of git unless they are explicitly curated test fixtures.
 ## Local Development
 ```bash
 python3 -m venv .venv
 source .venv/bin/activate
 python -m pip install -e ".[dev]"
 python -m pytest
 ```
 Local replay infrastructure is described in `docker-compose.yml`; CI and black-box
 test infrastructure are described in `docker-compose.test.yml`.
@@ -1,21 +1,25 @@
 # Acceptance Criteria
-> **Last revised**: 2026-04-26 (post Mode B Solution Assessment + user-driven addendum on VPR granularity & change-robustness + user lock-in of Mode B open items Q1–Q5).
+> **Last revised**: 2026-05-01 (Phase 1 AC/restrictions assessment clarifications).
 > Changes vs. previous version (2026-04-25): AC-1.2 split into hard-floor + stretch; AC-1.4 made quantitative; AC-2.2 split per pipeline stage; AC-3.4 dual-trigger; AC-4.3 autopilot-pinned; AC-5.2 N pinned; AC-7.1 scoped to level flight; AC-8.2 freshness by sector; six new AC added (AC-NEW-1 … AC-NEW-6).
 > Changes 2026-04-26: AC-4.3 extended to dual-channel hybrid (GPS_INPUT primary + ODOMETRY auxiliary); AC-8.6 added (VPR retrieval-unit + change-robustness); AC-NEW-7 added with confirmed numeric thresholds (cache-poisoning safety budget).
 > Changes 2026-04-29: AC-3.5 and AC-NEW-8 added for temporary visual blackout/cloud occlusion during GPS spoofing, including IMU-only degraded navigation, covariance growth, and failover limits.
 > Changes 2026-05-01: AC-1.3 anchor-age reporting clarified; AC-2.1 split so the >95% rate applies to VO registration, not every satellite re-anchor; AC-5.2 and AC-NEW-2 now require ArduPilot Plane SITL trigger verification; AC-8.3 storage accounting and AC-NEW-7 Satellite Service ownership clarified.
 ## Position Accuracy
 - **AC-1.1** — The system shall determine GPS coordinates of frame centers within **50 m** of true GPS for **≥80%** of photos in normal flight segments.
 - **AC-1.2** — The system shall determine GPS coordinates of frame centers within **20 m** of true GPS for **≥50%** of photos in normal flight segments.
- **AC-1.3** — Maximum cumulative VO drift between two consecutive satellite-anchored fixes shall be **<100 m** (VO-only fallback) or **<50 m** (when IMU is fused). Drift is measured as ‖VO-extrapolated centre − next anchor centre‖ at the moment of the anchor fix.
+- **AC-1.3** — Maximum cumulative VO drift between two consecutive satellite-anchored fixes shall be **<100 m** (VO-only fallback) or **<50 m** (when IMU is fused). Drift is measured as ‖VO-extrapolated centre − next anchor centre‖ at the moment of the anchor fix. Every emitted estimate shall include `last_satellite_anchor_age_ms`; validation results shall be binned by anchor age, and the solution draft must define the maximum anchor age after which estimates are treated as degraded (`vo_extrapolated` or `dead_reckoned`) with monotonically growing covariance.
 - **AC-1.4** — The system shall report a **quantitative confidence score** per position estimate, comprising:
  - the 95% covariance ellipse semi-major axis in meters, AND
  - a categorical label `{satellite_anchored, vo_extrapolated, dead_reckoned}`.
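For AC-1.4, one common way to obtain a 95% ellipse semi-major axis from a 2×2 horizontal position covariance is sketched below. It assumes a 2-D Gaussian error model, which the criteria do not state explicitly; the function name `semi_major_95_m` and its arguments are hypothetical.

```python
import math


def semi_major_95_m(var_x, var_y, cov_xy):
    """95% ellipse semi-major axis (m) for a 2x2 covariance given in m^2."""
    # Largest eigenvalue of [[var_x, cov_xy], [cov_xy, var_y]] in closed form.
    mean = 0.5 * (var_x + var_y)
    half_diff = 0.5 * (var_x - var_y)
    lambda_max = mean + math.hypot(half_diff, cov_xy)
    # Chi-square quantile for 2 degrees of freedom: -2 * ln(1 - p), about 5.991.
    chi2_95 = -2.0 * math.log(1.0 - 0.95)
    return math.sqrt(chi2_95 * lambda_max)
```

With this model the semi-major axis is roughly 2.45 times the largest one-sigma axis, so a 2 m one-sigma error along the worst direction corresponds to about 4.9 m.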
 ## Image Processing Quality
- **AC-2.1** — Image registration rate **>95%** for normal flight segments (defined as: nadir flight ±10° bank / pitch, ≥40% overlap with prior frame, daytime, season-matched satellite tile).
+- **AC-2.1** — Image registration rate is split by registration type:
  - **AC-2.1a — VO registration**: frame-to-frame visual registration shall succeed for **>95%** of normal flight segments (defined as: nadir flight ±10° bank / pitch, ≥40% overlap with prior frame, daytime, usable texture, no full visual blackout).
  - **AC-2.1b — Satellite-anchor registration**: cross-domain UAV-photo to satellite/cache registration is measured separately and is not hidden inside AC-2.1a. Satellite anchoring must satisfy AC-1.1 / AC-1.2 position accuracy, AC-2.2 cross-domain MRE, AC-8.2 freshness, and AC-8.6 retrieval behavior on season-matched tiles.
 - **AC-2.2** — Mean Reprojection Error (MRE):
  - **<1.0 px** for VO frame-to-frame homography on overlapping aerial pairs;
  - **<2.5 px** for satellite-anchored cross-domain (UAV photo ↔ ortho satellite tile) registration.
@@ -26,10 +30,11 @@
 - **AC-3.2** — The system shall correctly continue work during sharp turns where the next photo overlaps **<5%** with the previous, drifts **<200 m**, and changes heading **<70°**. Sharp-turn frames are expected to fail VO and shall be handled by satellite-based re-localization (place recognition over the satellite tile cache).
 - **AC-3.3** — The system shall handle **≥3 disconnected segments** per flight, connecting each new segment to the previous trajectory via global descriptor retrieval + RANSAC pose-graph relocalization. This is a core capability, not a degraded mode.
 - **AC-3.4** — When the system cannot determine position for **≥3 consecutive frames AND ≥2 s**, it shall send a re-localization request to the ground station via telemetry. While waiting, it continues VO/IMU dead reckoning and the flight controller uses last known position + IMU extrapolation.
 - **AC-3.5** — During temporary **visual blackout** where the navigation camera provides no usable ground signal (e.g., clouds/occlusion/whiteout) while GPS is denied or spoofed, the system shall switch to `{dead_reckoned}` within **≤1 processed frame OR ≤400 ms**, reject the spoofed GPS as an estimator input, and propagate position solely from the last trusted state + flight-controller IMU/attitude/airspeed/altitude inputs until visual or satellite anchoring recovers. During this mode, covariance shall grow monotonically, `GPS_INPUT.horiz_accuracy` shall not under-report the 95% covariance semi-major axis, and QGroundControl shall receive a `VISUAL_BLACKOUT_IMU_ONLY` status at **1–2 Hz**.
 ## Real-Time Onboard Performance
- **AC-4.1** — End-to-end latency from camera capture to GPS coordinate output to the flight controller shall be **<400 ms p95**. Up to ~10% of frames may be dropped under sustained load (skip-allowed).
+- **AC-4.1** — End-to-end latency from camera capture to GPS coordinate output to the flight controller shall be **<400 ms p95**. Up to ~10% of frames may be dropped under sustained load (skip-allowed). Heavy global VPR / cross-domain re-ranking shall be conditional, not part of the steady-state per-frame path, unless profiling proves the full path stays inside the latency and memory budgets on the target Jetson.
 - **AC-4.2** — Memory usage shall remain below **8 GB** shared on Jetson Orin Nano Super (CPU and GPU share the same 8 GB LPDDR5 pool).
 - **AC-4.3** — The system shall output its position estimate to the flight controller via **two parallel MAVLink channels**, both emitted by **pymavlink** (general telemetry uses MAVSDK):
  - **Primary**: `GPS_INPUT` targeting **ArduPilot** with `GPS1_TYPE=14` (MAVLink GPS substitute). Matches the "replacement for the GPS module" framing of the build.
@@ -43,7 +48,7 @@
 ## Startup & Failsafe
 - **AC-5.1** — The system shall initialise using the last known valid GPS position from the flight controller's EKF, plus IMU-extrapolated position at the moment of GPS denial.
- **AC-5.2** — If the system fails to produce any position estimate for **>3 s**, the flight controller shall fall back to IMU-only dead reckoning and the system shall log the failure.
+- **AC-5.2** — If the system fails to produce any position estimate for **>3 s**, the flight controller shall fall back to IMU-only dead reckoning and the system shall log the failure. Because ArduPilot failsafe timing depends on vehicle type and parameters, this fallback behavior must be verified specifically in ArduPilot Plane SITL with the production parameter set; Copter defaults are reference evidence only.
 - **AC-5.3** — On companion computer reboot mid-flight, the system shall attempt to re-initialise from the flight controller's current IMU-extrapolated position. See AC-NEW-1 for the cold-start time-to-first-fix budget.
 ## Ground Station & Telemetry
@@ -64,7 +69,7 @@
  - **<6 months old** for active-conflict sectors;
  - **<12 months old** for stable rear sectors.
  System shall reject or downgrade-confidence on tiles older than these thresholds (see AC-NEW-6).
- **AC-8.3** — Satellite imagery for the operational area shall be **pre-loaded and pre-processed** onto the companion computer before flight. Offline preprocessing time is not time-critical (minutes/hours). Pre-extracted tile descriptors (e.g., SuperPoint keypoints/descriptors and DINOv2-VLAD global descriptors) are part of the cache.
+- **AC-8.3** — Satellite imagery for the operational area shall be **pre-loaded and pre-processed** onto the companion computer before flight. Offline preprocessing time is not time-critical (minutes/hours). Pre-extracted tile descriptors (e.g., SuperPoint keypoints/descriptors and DINOv2-VLAD global descriptors) are part of the cache and count against the storage budget unless the solution draft explicitly defines a separate descriptor/index budget.
 - **AC-8.4** — **Mid-flight tile generation & write-back**: during flight, the system shall continuously orthorectify navigation-camera frames into tiles aligned with the basemap projection and store them in the local cache, **deduplicated** so each ground sector is stored at most once (latest / highest-quality tile wins). On landing, the companion computer shall upload newly generated tiles back to the Azaion Suite Satellite Service so that the next mission cache contains imagery refreshed by the previous flight.
 - **AC-8.5** — **Storage policy**: the system shall **not** retain raw navigation-camera frames or AI-camera frames as part of normal operation. Tiles are the only persistent imagery artifact. Forensic exception: a low-rate (≤0.1 Hz) thumbnail log of frames that **failed** tile generation may be retained for debugging within the FDR budget (AC-NEW-3).
 - **AC-8.6** — **VPR retrieval unit + change-robustness**:
@@ -91,9 +96,9 @@
 **Why it matters.** Without this gate, the FC may continue to follow a spoofed real-GPS source while our valid estimate sits idle. 3 s is short enough to keep the FC from acting on a malicious heading change but long enough to ride out a single-frame anomaly.
-**Implementation drivers.** Subscribe to `GPS_RAW_INT`, `EKF_STATUS_REPORT`, `SYS_STATUS`. Maintain an internal "real-GPS health" rolling average; switch to "primary" mode (raise our `GPS_INPUT` `fix_type` to 3D and assert) when health drops below threshold for ≥1 s. Emit `STATUSTEXT` to QGC on every promotion / demotion.
+**Implementation drivers.** Subscribe to `GPS_RAW_INT`, `EKF_STATUS_REPORT`, `SYS_STATUS`, and any ArduPilot Plane EKF/GPS status messages available in the production firmware. Maintain an internal "real-GPS health" rolling average; switch to "primary" mode (raise our `GPS_INPUT` `fix_type` to 3D and assert) when the verified Plane-specific health trigger stays below threshold for ≥1 s. Emit `STATUSTEXT` to QGC on every promotion / demotion.
-**Validation.** SITL: simulate spoofing (inject false `GPS_RAW_INT` from a malicious node); measure time from spoof onset to our promotion. Pass = 95% percentile <3 s.
+**Validation.** ArduPilot Plane SITL: simulate spoofing (inject false `GPS_RAW_INT` from a malicious node); verify the exact trigger signals used by the production parameter set; measure time from spoof onset to our promotion. Pass = 95th percentile <3 s.
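The pass criterion (95th-percentile promotion latency under 3 s) could be checked over a batch of SITL runs roughly as follows. This is a sketch under assumptions: the nearest-rank percentile definition is a choice made here, not something the criteria specify, and both function names are hypothetical.

```python
def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile; pct is an integer in 1..100."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rank errors.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def promotion_latency_passes(latencies_s, limit_s=3.0):
    """True when the 95th-percentile spoof-to-promotion latency is below limit_s."""
    return percentile_nearest_rank(latencies_s, 95) < limit_s
```

With 20 runs, the nearest-rank p95 is the 19th smallest latency, so a single slow outlier does not fail the check but two do.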
 ### AC-NEW-3 — Flight Data Recorder
@@ -148,8 +153,23 @@
 **Implementation drivers.**
 - Service-source tiles are immutable within freshness budget (AC-8.2); onboard tiles overwrite only stale or other-onboard tiles.
- The Suite Satellite Service ingest applies a **2-flight voting layer**: an onboard tile gets promoted to "trusted basemap" only after **N≥2 independent flights** confirm consistent geo-alignment within X m of each other. (Active sectors per AC-NEW-6 may use single-flight promotion when σ_xy ≤ 3 m AND OSM-road-overlap ≥ 70 %.)
+- The onboard GPS-Denied system writes tile-quality metadata required by the Suite Satellite Service. The Service-side ingest applies a **2-flight voting layer**: an onboard tile gets promoted to "trusted basemap" only after **N≥2 independent flights** confirm consistent geo-alignment within X m of each other. (Active sectors per AC-NEW-6 may use single-flight promotion when σ_xy ≤ 3 m AND OSM-road-overlap ≥ 70 %.) The voting layer is an external Suite Satellite Service dependency, not implemented inside this onboard build, but its contract is required for AC-NEW-7 to pass end-to-end.
 - The Component-1b parent-pose covariance is a **hard gate** in the local quality score: σ_xy ≤ 5 m for a hard write (`trust_level = candidate`); σ_xy ≤ 3 m for `trust_level = candidate` with full quality; tiles written in the σ_xy ∈ (3, 5] m band are marked `trust_level = soft` in the sidecar.
 - Eligibility check (Component 1b) tightens generation gate from σ_xy ≤ 10 m to σ_xy ≤ 5 m.
 **Validation.** Multi-flight Monte Carlo replay over AerialVL + Mavic + AerialExtreMatch with **synthetic over-confidence injection** (artificially deflate EKF covariance by 1.5×–3×): assert both probabilities below budget across ≥100 simulated flights worth of frames. Independently, Service-side voting layer is exercised in F-T3 to verify candidate tiles are not promoted to trusted basemap before N-flight confirmation.
 ### AC-NEW-8 — Visual blackout + GPS spoofing degraded-mode budget
 **Statement.** When the navigation camera is fully unusable for visual localization and the flight controller simultaneously reports GPS denial/spoofing, the onboard system shall:
 - continue emitting `GPS_INPUT` from IMU-only propagation for **up to 30 s** after the last trusted visual/satellite anchor, unless the estimator covariance exceeds the fail threshold earlier;
 - label every estimate `{dead_reckoned}` and set `fix_type=2` or lower when the 95% covariance semi-major axis exceeds **100 m**;
 - emit `fix_type=0`, `horiz_accuracy=999.0`, and `STATUSTEXT: VISUAL_BLACKOUT_FAILSAFE` when the 95% covariance semi-major axis exceeds **500 m** OR visual blackout exceeds **30 s** without a trusted re-anchor;
 - never promote spoofed real-GPS measurements back into the estimator during blackout unless the FC GPS health has been stable and non-spoofed for **≥10 s** and a visual/satellite consistency check has succeeded.
 **Why it matters.** A cloud/whiteout period removes all visual correction exactly when spoofed GPS cannot be trusted. The only safe behavior is honest IMU-only dead reckoning with rapidly growing uncertainty, not pretending that a stale visual position or spoofed GPS remains valid.
 **Implementation drivers.** Add an image-quality/occlusion classifier before VO/VPR, a blackout state in the ESKF mode machine, covariance floors for IMU-only propagation, strict GPS health gating, and QGC/FDR logging for blackout start, every degraded estimate, and blackout recovery/failsafe.
 **Validation.** SITL/replay: inject a 5 s, 15 s, and 35 s full-camera blackout while spoofing `GPS_RAW_INT`; assert mode transition ≤400 ms, spoofed GPS is ignored, covariance grows monotonically, `GPS_INPUT` fields degrade at the thresholds above, and recovery only occurs after a trusted visual/satellite anchor or the 10 s GPS-health + visual-consistency gate.
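The thresholds in the AC-NEW-8 statement can be read as a small decision function. The sketch below is illustrative only, not the onboard implementation: the names `GpsInputFields` and `blackout_output` are hypothetical, reporting the semi-major axis as `horiz_accuracy` is an assumption, and `fix_type=3` below the 100 m threshold is also an assumption because the statement only constrains behavior above it.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the AC-NEW-8 statement.
DEGRADE_AXIS_M = 100.0   # 95% covariance semi-major axis: fix_type=2 or lower above this
FAILSAFE_AXIS_M = 500.0  # 95% covariance semi-major axis: failsafe above this
MAX_BLACKOUT_S = 30.0    # blackout duration without a trusted re-anchor


@dataclass(frozen=True)
class GpsInputFields:
    """Hypothetical container for the GPS_INPUT fields discussed in AC-NEW-8."""
    fix_type: int
    horiz_accuracy: float
    label: str
    statustext: Optional[str]


def blackout_output(semi_major_95_m: float, blackout_s: float) -> GpsInputFields:
    """Select degraded-mode output during a visual blackout with untrusted GPS."""
    if semi_major_95_m > FAILSAFE_AXIS_M or blackout_s > MAX_BLACKOUT_S:
        return GpsInputFields(0, 999.0, "dead_reckoned", "VISUAL_BLACKOUT_FAILSAFE")
    if semi_major_95_m > DEGRADE_AXIS_M:
        # Assumption: report the semi-major axis as horiz_accuracy.
        return GpsInputFields(2, semi_major_95_m, "dead_reckoned", None)
    # Assumption: the statement does not fix fix_type below 100 m; 3 is used here.
    return GpsInputFields(3, semi_major_95_m, "dead_reckoned", None)
```

A replay test could call `blackout_output` at each estimator step of the 5 s, 15 s, and 35 s blackout cases and compare the returned fields with the emitted `GPS_INPUT` values.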
@@ -1,8 +1,2 @@
- Height
+- Height: 400m
-  - 400m
+- Camera: ADTi Surveyor Lite 20MP 20L V1
 - Camera:
  - Name: ADTi Surveyor Lite 26S v2
  - Resolution: 26MP
  - Image resolution: 6252*4168
  - Focal length: 25mm
  - Sensor width: 23.5
@@ -1,61 +0,0 @@
 frame_index,image,expected_lat,expected_lon,max_error_m,threshold_50m_applies,threshold_20m_applies
 1,AD000001.jpg,48.275292,37.385220,100,yes,yes
 2,AD000002.jpg,48.275001,37.382922,100,yes,yes
 3,AD000003.jpg,48.274520,37.381657,100,yes,yes
 4,AD000004.jpg,48.274956,37.379004,100,yes,yes
 5,AD000005.jpg,48.273997,37.379828,100,yes,yes
 6,AD000006.jpg,48.272538,37.380294,100,yes,yes
 7,AD000007.jpg,48.272408,37.379153,100,yes,yes
 8,AD000008.jpg,48.271992,37.377572,100,yes,yes
 9,AD000009.jpg,48.271376,37.376671,100,yes,yes
 10,AD000010.jpg,48.271233,37.374806,100,yes,yes
 11,AD000011.jpg,48.270334,37.374442,100,yes,yes
 12,AD000012.jpg,48.269922,37.373284,100,yes,yes
 13,AD000013.jpg,48.269366,37.372134,100,yes,yes
 14,AD000014.jpg,48.268759,37.370940,100,yes,yes
 15,AD000015.jpg,48.268291,37.369815,100,yes,yes
 16,AD000016.jpg,48.267719,37.368469,100,yes,yes
 17,AD000017.jpg,48.267461,37.367255,100,yes,yes
 18,AD000018.jpg,48.266663,37.365888,100,yes,yes
 19,AD000019.jpg,48.266135,37.365460,100,yes,yes
 20,AD000020.jpg,48.265574,37.364211,100,yes,yes
 21,AD000021.jpg,48.264892,37.362998,100,yes,yes
 22,AD000022.jpg,48.264393,37.361086,100,yes,yes
 23,AD000023.jpg,48.263803,37.361028,100,yes,yes
 24,AD000024.jpg,48.263014,37.359878,100,yes,yes
 25,AD000025.jpg,48.262635,37.358277,100,yes,yes
 26,AD000026.jpg,48.261819,37.357116,100,yes,yes
 27,AD000027.jpg,48.261182,37.355907,100,yes,yes
 28,AD000028.jpg,48.260727,37.354723,100,yes,yes
 29,AD000029.jpg,48.260117,37.353469,100,yes,yes
 30,AD000030.jpg,48.259677,37.352165,100,yes,yes
 31,AD000031.jpg,48.258881,37.351376,100,yes,yes
 32,AD000032.jpg,48.258425,37.349964,100,yes,yes
 33,AD000033.jpg,48.258653,37.347004,100,yes,yes
 34,AD000034.jpg,48.257879,37.347711,100,yes,yes
 35,AD000035.jpg,48.256777,37.348444,100,yes,yes
 36,AD000036.jpg,48.255756,37.348098,100,yes,yes
 37,AD000037.jpg,48.255375,37.346549,100,yes,yes
 38,AD000038.jpg,48.254799,37.345603,100,yes,yes
 39,AD000039.jpg,48.254557,37.344566,100,yes,yes
 40,AD000040.jpg,48.254380,37.344375,100,yes,yes
 41,AD000041.jpg,48.253722,37.343093,100,yes,yes
 42,AD000042.jpg,48.254205,37.340532,100,yes,yes
 43,AD000043.jpg,48.252380,37.342112,100,yes,yes
 44,AD000044.jpg,48.251489,37.343079,100,yes,yes
 45,AD000045.jpg,48.251085,37.346128,100,yes,yes
 46,AD000046.jpg,48.250413,37.344034,100,yes,yes
 47,AD000047.jpg,48.249414,37.343296,100,yes,yes
 48,AD000048.jpg,48.249114,37.346895,100,yes,yes
 49,AD000049.jpg,48.250241,37.347741,100,yes,yes
 50,AD000050.jpg,48.250974,37.348379,100,yes,yes
 51,AD000051.jpg,48.251528,37.349468,100,yes,yes
 52,AD000052.jpg,48.251873,37.350485,100,yes,yes
 53,AD000053.jpg,48.252161,37.351491,100,yes,yes
 54,AD000054.jpg,48.252685,37.352343,100,yes,yes
 55,AD000055.jpg,48.253268,37.353119,100,yes,yes
 56,AD000056.jpg,48.253767,37.354246,100,yes,yes
 57,AD000057.jpg,48.254329,37.354946,100,yes,yes
 58,AD000058.jpg,48.254874,37.355765,100,yes,yes
 59,AD000059.jpg,48.255481,37.356501,100,yes,yes
 60,AD000060.jpg,48.256246,37.357485,100,yes,yes
@@ -1,166 +1,97 @@
-# Expected Results
+# Expected Results Mapping
-Maps every input data item to its quantifiable expected result.
+## Scope
 Tests use this mapping to compare actual system output against known-correct answers.
-## Result Format Legend
+`coordinates.csv` is the current source of truth for the provided still-image nadir set. It gives expected WGS84 frame-center coordinates for `AD000001.jpg` through `AD000060.jpg`.
-| Result Type | When to Use | Example |
+This data is sufficient for black-box frame-center geolocation tests against still images. The Derkachi representative fixture in `input_data/flight_derkachi/` adds cropped nadir video plus synchronized `SCALED_IMU2` and `GLOBAL_POSITION_INT` telemetry. It is sufficient for fixture validation, video/telemetry synchronization, replay, latency, VIO smoke tests, and trajectory comparison against the tlog GPS path. It is not sufficient by itself for final production accuracy because raw camera calibration, lens distortion, and exact camera-to-body calibration are still pending.
 |-------------|-------------|---------|
 | Exact value | Output must match precisely | `fix_type: 3`, `satellites_visible: 10` |
 | Tolerance range | Numeric output with acceptable variance | `lat: 48.275292 ± 50m` |
 | Threshold | Output must exceed or stay below a limit | `latency < 400ms`, `memory < 8GB` |
 | Pattern match | Output must match a string/regex pattern | `RELOC_REQ: last_lat=.* last_lon=.* uncertainty=.*m` |
 | File reference | Complex output compared against a reference file | `match expected_results/position_accuracy.csv` |
 | Set/count | Output must contain specific items or counts | `registered_frames / total_frames > 0.95` |
-## Comparison Methods
+## Pass / Fail Rules
-| Method | Description | Tolerance Syntax |
+- **Normal frame-center geolocation**: estimated frame center is within 50 m of the expected WGS84 coordinate.
-|--------|-------------|-----------------|
+- **Stretch accuracy bin**: estimated frame center is within 20 m of the expected WGS84 coordinate.
-| `numeric_tolerance` | abs(actual - expected) ≤ tolerance | `± <value>` |
+- **Dataset aggregate**: at least 80% of mapped images pass the 50 m threshold and at least 50% pass the 20 m threshold.
-| `threshold_min` | actual ≥ threshold | `≥ <value>` |
+- **Output shape**: each result must include image name, estimated `lat`, estimated `lon`, error in meters, source label, 95% covariance semi-major axis, and `last_satellite_anchor_age_ms`.
 | `threshold_max` | actual ≤ threshold | `≤ <value>` |
 | `percentage` | percentage of items meeting criterion | `≥ N%` |
 | `exact` | actual == expected | N/A |
 | `regex` | actual matches regex pattern | regex string |
 | `file_reference` | compare against reference file | file path |
-## Input → Expected Result Mapping
+## Input To Expected Output Map
-### Position Accuracy (60-image flight sequence)
+### Still-Image Frame Centers
-Ground truth GPS coordinates for each frame are in `coordinates.csv`. The system processes these frames sequentially (simulating a real flight) with corresponding IMU data (200Hz, from SITL ArduPilot or synthetic generation from trajectory) and satellite tile matches. The system outputs estimated GPS coordinates per frame. Expected results compare estimated positions against ground truth.
+| Input image | Expected latitude | Expected longitude | Primary threshold | Stretch threshold |
 |-------------|-------------------|--------------------|-------------------|-------------------|
 | AD000001.jpg | 48.275292 | 37.385220 | <= 50 m | <= 20 m |
 | AD000002.jpg | 48.275001 | 37.382922 | <= 50 m | <= 20 m |
 | AD000003.jpg | 48.274520 | 37.381657 | <= 50 m | <= 20 m |
 | AD000004.jpg | 48.274956 | 37.379004 | <= 50 m | <= 20 m |
 | AD000005.jpg | 48.273997 | 37.379828 | <= 50 m | <= 20 m |
 | AD000006.jpg | 48.272538 | 37.380294 | <= 50 m | <= 20 m |
 | AD000007.jpg | 48.272408 | 37.379153 | <= 50 m | <= 20 m |
 | AD000008.jpg | 48.271992 | 37.377572 | <= 50 m | <= 20 m |
 | AD000009.jpg | 48.271376 | 37.376671 | <= 50 m | <= 20 m |
 | AD000010.jpg | 48.271233 | 37.374806 | <= 50 m | <= 20 m |
 | AD000011.jpg | 48.270334 | 37.374442 | <= 50 m | <= 20 m |
 | AD000012.jpg | 48.269922 | 37.373284 | <= 50 m | <= 20 m |
 | AD000013.jpg | 48.269366 | 37.372134 | <= 50 m | <= 20 m |
 | AD000014.jpg | 48.268759 | 37.370940 | <= 50 m | <= 20 m |
 | AD000015.jpg | 48.268291 | 37.369815 | <= 50 m | <= 20 m |
 | AD000016.jpg | 48.267719 | 37.368469 | <= 50 m | <= 20 m |
 | AD000017.jpg | 48.267461 | 37.367255 | <= 50 m | <= 20 m |
 | AD000018.jpg | 48.266663 | 37.365888 | <= 50 m | <= 20 m |
 | AD000019.jpg | 48.266135 | 37.365460 | <= 50 m | <= 20 m |
 | AD000020.jpg | 48.265574 | 37.364211 | <= 50 m | <= 20 m |
 | AD000021.jpg | 48.264892 | 37.362998 | <= 50 m | <= 20 m |
 | AD000022.jpg | 48.264393 | 37.361086 | <= 50 m | <= 20 m |
 | AD000023.jpg | 48.263803 | 37.361028 | <= 50 m | <= 20 m |
 | AD000024.jpg | 48.263014 | 37.359878 | <= 50 m | <= 20 m |
 | AD000025.jpg | 48.262635 | 37.358277 | <= 50 m | <= 20 m |
 | AD000026.jpg | 48.261819 | 37.357116 | <= 50 m | <= 20 m |
 | AD000027.jpg | 48.261182 | 37.355907 | <= 50 m | <= 20 m |
 | AD000028.jpg | 48.260727 | 37.354723 | <= 50 m | <= 20 m |
 | AD000029.jpg | 48.260117 | 37.353469 | <= 50 m | <= 20 m |
 | AD000030.jpg | 48.259677 | 37.352165 | <= 50 m | <= 20 m |
 | AD000031.jpg | 48.258881 | 37.351376 | <= 50 m | <= 20 m |
 | AD000032.jpg | 48.258425 | 37.349964 | <= 50 m | <= 20 m |
 | AD000033.jpg | 48.258653 | 37.347004 | <= 50 m | <= 20 m |
 | AD000034.jpg | 48.257879 | 37.347711 | <= 50 m | <= 20 m |
 | AD000035.jpg | 48.256777 | 37.348444 | <= 50 m | <= 20 m |
 | AD000036.jpg | 48.255756 | 37.348098 | <= 50 m | <= 20 m |
 | AD000037.jpg | 48.255375 | 37.346549 | <= 50 m | <= 20 m |
 | AD000038.jpg | 48.254799 | 37.345603 | <= 50 m | <= 20 m |
 | AD000039.jpg | 48.254557 | 37.344566 | <= 50 m | <= 20 m |
 | AD000040.jpg | 48.254380 | 37.344375 | <= 50 m | <= 20 m |
 | AD000041.jpg | 48.253722 | 37.343093 | <= 50 m | <= 20 m |
 | AD000042.jpg | 48.254205 | 37.340532 | <= 50 m | <= 20 m |
 | AD000043.jpg | 48.252380 | 37.342112 | <= 50 m | <= 20 m |
 | AD000044.jpg | 48.251489 | 37.343079 | <= 50 m | <= 20 m |
 | AD000045.jpg | 48.251085 | 37.346128 | <= 50 m | <= 20 m |
 | AD000046.jpg | 48.250413 | 37.344034 | <= 50 m | <= 20 m |
 | AD000047.jpg | 48.249414 | 37.343296 | <= 50 m | <= 20 m |
 | AD000048.jpg | 48.249114 | 37.346895 | <= 50 m | <= 20 m |
 | AD000049.jpg | 48.250241 | 37.347741 | <= 50 m | <= 20 m |
 | AD000050.jpg | 48.250974 | 37.348379 | <= 50 m | <= 20 m |
 | AD000051.jpg | 48.251528 | 37.349468 | <= 50 m | <= 20 m |
 | AD000052.jpg | 48.251873 | 37.350485 | <= 50 m | <= 20 m |
 | AD000053.jpg | 48.252161 | 37.351491 | <= 50 m | <= 20 m |
 | AD000054.jpg | 48.252685 | 37.352343 | <= 50 m | <= 20 m |
 | AD000055.jpg | 48.253268 | 37.353119 | <= 50 m | <= 20 m |
 | AD000056.jpg | 48.253767 | 37.354246 | <= 50 m | <= 20 m |
 | AD000057.jpg | 48.254329 | 37.354946 | <= 50 m | <= 20 m |
 | AD000058.jpg | 48.254874 | 37.355765 | <= 50 m | <= 20 m |
 | AD000059.jpg | 48.255481 | 37.356501 | <= 50 m | <= 20 m |
 | AD000060.jpg | 48.256246 | 37.357485 | <= 50 m | <= 20 m |
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+### Representative Derkachi Video/IMU Fixture
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 1 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | ≥ 80% of frames have position error < 50m from ground truth | percentage | ≥ 80% of frames within 50m | `expected_results/position_accuracy.csv` |
 | 2 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | ≥ 50% of frames have position error < 20m from ground truth (per AC-1.2) | percentage | ≥ 50% of frames within 20m | `expected_results/position_accuracy.csv` |
 | 3 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | Per-frame position output in WGS84 (lat, lon) | numeric_tolerance | each frame ± 100m max (no single frame exceeds 100m error) | `expected_results/position_accuracy.csv` |
 | 4 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | Cumulative VO drift between satellite anchors < 100m | threshold_max | ≤ 100m drift between anchors | N/A |
-### GPS_INPUT Message Correctness
+| Input fixture | Expected validation result | Threshold |
 |---------------|----------------------------|-----------|
 | `flight_derkachi/data_imu.csv` | Telemetry CSV has required `timestamp(ms)`, `Time`, `SCALED_IMU2.*`, and `GLOBAL_POSITION_INT.*` columns; non-empty rows are monotonic from `Time=0.0` to `489.9` | 0 missing required columns; 0 decreasing timestamps; 4,900 nonblank rows |
 | `flight_derkachi/flight_derkachi.mp4` | Video stream is readable as cropped nadir footage for replay | H.264, 880 x 720, 30 fps, approximately 490.07 s |
 | Video/telemetry alignment | Video has 14,700 frames and telemetry has 4,900 rows | Exactly 3 video frames per telemetry row; duration delta <=250 ms |
 | Derkachi trajectory comparison | Replay output can be compared to `GLOBAL_POSITION_INT.lat`, `GLOBAL_POSITION_INT.lon`, `GLOBAL_POSITION_INT.alt`, `GLOBAL_POSITION_INT.relative_alt`, velocity, and heading | Thresholds are calibration-gated; use for smoke/relative trajectory validation until intrinsics and camera-to-body calibration are pinned |
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+## Known Gaps
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 5 | Single frame + IMU data | Normal tracking frame with recent satellite match | `fix_type: 3`, `horiz_accuracy: 5-20m`, `satellites_visible: 10`, lat/lon populated | exact (fix_type, sat), numeric_tolerance (accuracy) | fix_type == 3, horiz_accuracy ∈ [1, 50] | N/A |
 | 6 | Frame sequence, no satellite match for >30s | VO-only tracking, no recent satellite anchor | `fix_type: 3`, `horiz_accuracy: 20-50m` | exact (fix_type), range (accuracy) | fix_type == 3, horiz_accuracy ∈ [20, 100] | N/A |
 | 7 | Frame sequence, VO lost + no satellite | IMU-only dead reckoning | `fix_type: 2`, `horiz_accuracy: 50-200m+` (growing over time) | exact (fix_type), threshold_min (accuracy) | fix_type == 2, horiz_accuracy ≥ 50 | N/A |
 | 8 | VO lost + 3 consecutive satellite failures | Total position failure | `fix_type: 0`, `horiz_accuracy: 999.0` | exact | fix_type == 0, horiz_accuracy == 999.0 | N/A |
 | 9 | Any valid frame | GPS_INPUT output rate | GPS_INPUT messages at 5-10Hz continuous | range | 5 ≤ rate_hz ≤ 10 | N/A |
-### Confidence Tier Transitions
+- The still-image set has expected WGS84 centers but no synchronized IMU, attitude, airspeed, altitude, or timestamp stream.
-
+- The Derkachi fixture has synchronized video, IMU, and GPS trajectory, but no raw camera calibration, lens distortion, exact camera-to-body transform, attitude, or airspeed columns.
-| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+- The still-image sample cadence is slower than the target 3 fps runtime profile; the Derkachi video is 30 fps and must be sampled to target replay cadence for runtime tests.
-|---|-------|-------------------|-----------------|------------|-----------|---------------|
+- Final production acceptance requires camera calibration and representative datasets with synchronized camera/IMU plus ground-truth trajectory.
 | 10 | Frame with satellite match <30s ago, covariance <400m² | HIGH confidence conditions | Confidence tier: HIGH, SSE confidence: "HIGH" | exact | N/A | N/A |
 | 11 | Frame with cuVSLAM OK, no satellite match >30s | MEDIUM confidence conditions | Confidence tier: MEDIUM, SSE confidence: "MEDIUM" | exact | N/A | N/A |
 | 12 | Frame with cuVSLAM lost, IMU-only | LOW confidence conditions | Confidence tier: LOW, SSE confidence: "LOW" | exact | N/A | N/A |
 | 13 | 3+ consecutive total failures | FAILED conditions | Confidence tier: FAILED, SSE confidence: "FAILED", fix_type: 0 | exact | N/A | N/A |
 ### Image Registration & Visual Odometry
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 14 | 60 sequential flight images | Normal flight (no sharp turns) | Image registration rate ≥ 95% (≥ 57 of 60 registered) | percentage | ≥ 95% | N/A |
 | 15 | 60 sequential flight images | Normal flight images | Mean reprojection error < 1.0 pixels | threshold_max | MRE < 1.0 px | N/A |
 ### Disconnected Route Segments & Sharp Turns
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 16 | Frames 32-43 from coordinates.csv | Trajectory with direction change (turn area) | System continues producing position estimates through the turn | threshold_min | ≥ 1 position output per frame | N/A |
 | 17 | Simulated consecutive frames with 350m gap | Outlier between 2 consecutive photos due to tilt | System handles outlier, position estimate not corrupted (error < 100m for next valid frame) | threshold_max | ≤ 100m error after recovery | N/A |
 | 18 | Simulated sharp turn (no overlap, <5% overlap, <70° angle, <200m drift) | Sharp turn where VO fails | Satellite re-localization triggers, position recovered within 3 frames after turn | threshold_max | position error ≤ 50m after re-localization | N/A |
 | 19 | Simulated VO loss + satellite match success | Tracking loss → re-localization | cuVSLAM restarts, Component 5 calibrator emits a satellite-anchored fix, FC EKF3 reconverges, tracking_state returns to NORMAL | exact | tracking_state == NORMAL after recovery | N/A |
 ### 3-Consecutive-Failure Re-Localization
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 20 | Simulated VO loss + 3 satellite match failures | Cannot determine position by any means | Re-localization request sent: `RELOC_REQ: last_lat=.* last_lon=.* uncertainty=.*m` | regex | message matches pattern | N/A |
 | 21 | Re-localization request active | System waiting for operator | GPS_INPUT fix_type=0, system continues IMU prediction, continues satellite matching attempts | exact (fix_type) | fix_type == 0 | N/A |
 | 22 | Operator sends approximate coordinates (lat, lon) | Operator re-localization hint | System uses hint as a high-covariance (~500m) seed for VPR/cross-view re-localization (consumed by Component 5 calibrator), attempts satellite match in new area | threshold_max | position error ≤ 500m initially, ≤ 50m after satellite match | N/A |
 ### Startup & Handoff
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 23 | System boot with GLOBAL_POSITION_INT available | Normal startup | System reads initial position, initializes Component 5 calibrator state, starts GPS_INPUT output (per AC-NEW-1 cold-start TTFF budget) | threshold_max | GPS_INPUT output begins within 30s of boot (95th percentile) | N/A |
 | 24 | System boot + first satellite match | Startup validation | First satellite match validates initial position, position error drops | threshold_max | position error ≤ 50m after first satellite match | N/A |
 ### Mid-Flight Reboot Recovery
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 25 | System process killed mid-flight | Companion computer reboot | System recovers: reads FC IMU-extrapolated position, re-initialises Component 5 calibrator state with high uncertainty, loads TRT engines, starts cuVSLAM, performs satellite match | threshold_max | total recovery time ≤ 30s (matches AC-NEW-1 TTFF) | N/A |
 | 26 | Post-reboot first satellite match | Recovery validation | Position accuracy restored after first satellite match | threshold_max | position error ≤ 50m after first satellite match | N/A |
 ### Object Localization
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 27 | POST /objects/locate with pixel_x, pixel_y, gimbal angles, zoom, known UAV position | Object at known ground GPS | Response: `{ lat, lon, alt, accuracy_m, confidence }` with lat/lon matching ground truth | numeric_tolerance | lat/lon within accuracy_m of ground truth (consistent with frame-center accuracy) | N/A |
 | 28 | POST /objects/locate with invalid pixel coordinates | Out-of-frame pixel | HTTP 422 or error response indicating invalid input | exact | HTTP status 422 | N/A |
 ### Coordinate Transform Chain
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 29 | Known GPS → NED → pixel → GPS round-trip | Coordinate transform validation | Round-trip error < 0.1m | threshold_max | ≤ 0.1m | N/A |
 ### API & Communication
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 30 | GET /health | Health check endpoint | HTTP 200, JSON with memory_mb, gpu_temp_c, status fields | exact (status code), regex (body) | status == 200, body contains `"status"` | N/A |
 | 31 | POST /sessions | Start session | HTTP 200/201 with session ID | exact | status ∈ {200, 201} | N/A |
 | 32 | GET /sessions/{id}/stream | SSE position stream | SSE events at ~1Hz with fields: type, timestamp, lat, lon, alt, accuracy_h, confidence, vo_status | regex | each event matches SSE schema | N/A |
 | 33 | Unauthenticated request to /sessions | No JWT token | HTTP 401 Unauthorized | exact | status == 401 | N/A |
 ### Performance Thresholds
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 34 | Single camera frame (6252x4168) | End-to-end processing time | Total pipeline latency < 400ms (capture → GPS coordinate output) | threshold_max | ≤ 400ms | N/A |
 | 35 | 30-minute sustained operation | Memory usage over time | Peak memory < 8GB, no memory leaks (growth < 50MB over 30min) | threshold_max | peak < 8192MB, growth ≤ 50MB | N/A |
 | 36 | 30-minute sustained operation | GPU thermal | SoC junction temperature stays below 80°C (no throttling) | threshold_max | ≤ 80°C | N/A |
 | 37 | cuVSLAM single frame | VO processing time | cuVSLAM inference ≤ 20ms per frame | threshold_max | ≤ 20ms | N/A |
 | 38 | Satellite matching single frame | Inline cross-view matcher time | SP+LG (TRT FP16/INT8) inline-matcher inference ≤ 200ms / pair on Orin Nano Super @ 25W. (LiteSAM is re-loc-fallback only, ≤ 2s budget — out of inline path.) | threshold_max | ≤ 200ms inline; ≤ 2000ms re-loc fallback | N/A |
 | 39 | TRT engine load | Engine initialization time | All TRT engines loaded within 10s total | threshold_max | ≤ 10s | N/A |
 ### Satellite Tile Management
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 40 | Mission area definition (200km path, ±2km buffer, zoom 18) | Tile storage calculation | Total storage 500-800MB for zoom 18 + zoom 19 flight path | range | [300MB, 1000MB] | N/A |
 | 41 | ESKF position ± 3σ search radius | Tile selection | Tiles covering search area loaded, mosaic assembled, covers at least 500m radius | threshold_min | coverage radius ≥ 500m | N/A |
 ### TRT Engine Validation
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 42 | LiteSAM PyTorch model → ONNX → TRT FP16 | TRT engine conversion | Engine builds successfully on Jetson Orin Nano Super | exact | exit_code == 0 | N/A |
 | 43 | TRT engine output vs PyTorch reference (same input) | Inference correctness | Max L1 error between TRT and PyTorch output < 0.01 | threshold_max | L1_max < 0.01 | N/A |
 | 44 | LiteSAM MinGRU operations | TRT compatibility check | All MinGRU ops supported in TRT 10.3 (polygraphy inspect) | exact | unsupported_ops == 0 | N/A |
 ### Telemetry
 | # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
 |---|-------|-------------------|-----------------|------------|-----------|---------------|
 | 45 | Normal operation | Telemetry output rate | NAMED_VALUE_FLOAT messages at 1Hz (gps_conf, gps_drift, gps_hacc) | numeric_tolerance | rate: 1Hz ± 0.2Hz | N/A |
 | 46 | VO tracking lost + 3 satellite failures | Re-localization telemetry | STATUSTEXT with RELOC_REQ sent to ground station | regex | message matches `RELOC_REQ:.*` | N/A |
 ## Expected Result Reference Files
 ### position_accuracy.csv
 Reference file: `expected_results/position_accuracy.csv`
 Contains the ground truth GPS coordinate for each frame in the 60-image test sequence (copied from `coordinates.csv`) plus the acceptance thresholds. Test harness computes haversine distance between estimated and ground truth positions, then applies aggregate criteria.
 Thresholds applied to the full 60-frame sequence:
 - ≥ 80% of frames: error < 50m
 - ≥ 60% of frames: error < 20m
 - 0% of frames: error > 100m (no single frame exceeds 100m)
 - Cumulative VO drift between satellite anchors: < 100m
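The haversine comparison and the aggregate checks described above could be sketched as follows. This is a minimal illustration under assumptions, not the project's test harness: the function names, the spherical Earth radius, and the result keys are hypothetical, and the pass fractions are required arguments so the caller supplies them from the governing acceptance criteria. The cumulative VO drift check is not covered.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def evaluate(errors_m, *, frac_50m: float, frac_20m: float, max_error_m: float = 100.0):
    """Apply the aggregate criteria to a list of per-frame errors in meters."""
    n = len(errors_m)
    if n == 0:
        raise ValueError("no frames to evaluate")
    within_50 = sum(e < 50.0 for e in errors_m) / n
    within_20 = sum(e < 20.0 for e in errors_m) / n
    return {
        "within_50m": within_50 >= frac_50m,
        "within_20m": within_20 >= frac_20m,
        "no_outlier": max(errors_m) <= max_error_m,
    }
```

Per-frame errors would come from calling `haversine_m` on each estimated position and its row in `coordinates.csv`; the sequence passes only when every value in the returned dictionary is true.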
@@ -0,0 +1,14 @@
 # Derkachi Representative Flight Fixture
 ## Files
 | File | Description | Observed Metadata |
 |------|-------------|-------------------|
 | `flight_derkachi.mp4` | Cropped nadir flight footage for replay | H.264, 880 x 720, 30 fps, about 490.07 s |
 | `data_imu.csv` | Flight-controller telemetry trace exported from the tlog | 4,900 rows at 10 Hz from `Time=0.0` to `489.9`; includes `SCALED_IMU2` and `GLOBAL_POSITION_INT` trajectory fields |
 ## Test Use
 Use this fixture for video/telemetry synchronization checks, representative replay smoke tests, VIO hot-path latency, frame-drop accounting, and trajectory comparison against `GLOBAL_POSITION_INT`. The video and telemetry align at exactly three video frames per telemetry row. Camera intrinsics, lens distortion, raw camera resolution, and exact camera-to-body calibration are still unknown, so this fixture is not sufficient by itself for final production camera calibration or satellite-anchor accuracy claims.
 For the test recording, the rotating camera was mechanically fixed in a downward/nadir orientation. Treat the MP4 as a cleaned/cropped replay fixture rather than the raw camera feed.
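The synchronization claim can be checked with simple arithmetic on the observed metadata. The sketch below is a hypothetical helper, not part of the fixture tooling: `check_alignment` and its result keys are assumed names, and measuring the telemetry span as the last `Time` value is an assumption about how the duration delta is defined. The 250 ms limit is the threshold given for this fixture in the expected-results mapping.

```python
def check_alignment(frame_count: int, telemetry_rows: int,
                    video_duration_s: float, telemetry_span_s: float,
                    max_delta_ms: float = 250.0) -> dict:
    """Check the frames-per-row ratio and the video/telemetry duration delta."""
    if telemetry_rows <= 0:
        raise ValueError("telemetry_rows must be positive")
    frames_per_row, remainder = divmod(frame_count, telemetry_rows)
    delta_ms = abs(video_duration_s - telemetry_span_s) * 1000.0
    return {
        "frames_per_row": frames_per_row,
        "exact_ratio": remainder == 0,
        "delta_ms": delta_ms,
        "duration_ok": delta_ms <= max_delta_ms,
    }


# Observed values for the Derkachi fixture: 14,700 frames, 4,900 rows,
# about 490.07 s of video, telemetry Time running from 0.0 to 489.9.
result = check_alignment(14_700, 4_900, 490.07, 489.9)
```

With these inputs the ratio is exactly three frames per telemetry row and the duration delta is roughly 170 ms.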
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:9acb97042fc648301d73d3c0fe7d80f7e3e2697000c0d33afa8a7b7a74a20005
 size 282207328
@@ -1,2 +1,2 @@
-We have a wing-type UAV with a camera pointing downwards that can take photos 3 times per second with a resolution 6200*4100. Also plane has flight controller with IMU. During the plane flight, we know GPS coordinates initially. During the flight, GPS could be disabled or spoofed. We need to determine the GPS of the centers of the next frame from the camera. And also the coordinates of the center of any object in these photos. We can use an external satellite provider for ground checks on the existing photos. So, before the flight, UAV's operator should upload the satellite photos to the plane's companion PC. 
+We have a wing-type UAV with a fixed downward navigation camera that can take photos 3 times per second. The authoritative navigation-camera spec is defined in `restrictions.md` as the ADTi 20MP 20L V1, APS-C sensor, about 5472 x 3648 px; older higher-resolution references are superseded. The plane also has a flight controller with an IMU. The GPS coordinates are known at the start of the flight; during the flight, GPS may be disabled or spoofed. We need to determine the GPS coordinates of the center of each subsequent camera frame, as well as the coordinates of the center of any object in these photos. We can use an external satellite provider for ground checks against the existing photos, so before the flight the UAV's operator should upload the satellite photos to the plane's companion PC.
-The real world examples are in input_data folder, but the distance between each photo is way bigger than it will be from a real plane. On that particular example photos were taken 1 photo per 2-3 seconds. But in real-world scenario frames would appear within the interval no more than 500ms. We also don't have IMU data for the test. For now we have to search for the public data for that in internet. We've tried to record that with Mavic 3 Pro Mini, but failed, cause of the closed system if DJI.
+The real-world examples are in the `input_data` folder, but the original still-image set has a much larger distance between photos than the target aircraft will have. In that example, photos were taken at a rate of 1 photo per 2-3 seconds, whereas in the real-world scenario frames will arrive no more than 500 ms apart. Additional representative data is available in `input_data/flight_derkachi/`: cropped nadir flight footage plus synchronized `SCALED_IMU2` and `GLOBAL_POSITION_INT` telemetry. This supports video/telemetry synchronization, replay, latency, VIO smoke tests, and trajectory comparison against the tlog GPS path. Camera intrinsics, lens distortion, raw camera feed parameters, and exact camera-to-body calibration are still pending, so final production accuracy claims remain gated on calibration data or a separately surveyed representative dataset.
@@ -1,6 +1,6 @@
 # Restrictions
-> **Last revised**: 2026-04-26 (post Mode B Solution Assessment + user-driven addendum on camera spec & zoom level).
+> **Last revised**: 2026-05-01 (post Phase 1 AC/restrictions assessment clarifications).
 ## UAV & Flight
@@ -12,7 +12,7 @@
  - **Transit corridor**: ~**50 km × 1 km = 50 km²** strip in/out of the sector.
  - **Total operational area: up to ~400 km²** of pre-cached satellite imagery per mission. Cache is **persistent across flights** (not redownloaded each mission). Storage budget **~10 GB** for the satellite tile cache; see AC-NEW-3 for flight-data-recorder budget.
 - Altitude: pre-defined, **≤1 km AGL**. Terrain is assumed flat (operational area is rolling steppe / agricultural land); height differences are negligible.
- Weather: predominantly sunny daytime operations.
+- Weather: predominantly sunny daytime operations. Validation must still cover the seasonal/visibility classes that affect visual matching in the operational area: summer crop/field patterns, autumn/winter bare fields, cloud/smoke/haze, snow if missions can occur in winter, and low-texture agricultural repetition.
 - Sharp turns occur but are the exception, not the rule. Two consecutive photos may share <5% overlap during a turn (see AC-3.2).
 - **No photo-count cap.** The previously stated "up to 3000 photos per flight" was a legacy operator number from a Mavic-class workflow; it is dropped because (a) it is inconsistent with 8 h × 3 fps, and (b) the system does **not store raw photos at all** (see AC-8.5). Storage is bounded by the tile-cache + FDR caps (~10 GB persistent + 64 GB / flight, AC-NEW-3).
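The inconsistency with the legacy cap can be checked with simple arithmetic. The sketch below uses the 8 h endurance, 3 fps frame rate, and 60 km/h cruise speed quoted in these documents; the variable names are illustrative.

```python
FLIGHT_HOURS = 8
FRAMES_PER_SECOND = 3
CRUISE_KMH = 60

# Total frames produced over a full-endurance flight.
frames_per_flight = FLIGHT_HOURS * 3600 * FRAMES_PER_SECOND

# Ground distance travelled between consecutive frames at cruise speed.
metres_between_frames = (CRUISE_KMH * 1000 / 3600) / FRAMES_PER_SECOND

print(frames_per_flight)                # 86400, far above the legacy 3000
print(round(metres_between_frames, 2))  # 5.56
```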
@@ -32,7 +32,7 @@
 - **Mid-flight tile generation (AC-8.4)**: during the mission the companion computer generates fresh tiles from the navigation camera, orthorectified into the basemap projection, deduplicated against the existing cache, and stored locally. On landing, those new tiles are uploaded back to the Suite Satellite Service for ingestion, so the next mission's cache is refreshed by the previous flight.
 - **No raw photo storage** (AC-8.5): the tile is the unit of persistence. Raw nav-camera and AI-camera frames are not retained (except a low-rate failure-thumbnail log for forensics).
 - **Resolution at the cache interface**: 0.5 m/pixel minimum, 0.3 m/pixel ideal (AC-8.1). The architecture is provider-agnostic at the cache boundary; whatever the Suite Satellite Service supplies must meet that bar.
- **Storage tile zoom level**: **slippy-XYZ z=20 (~30 cm/px, 512×512)** — pinned because the matcher (Component 3) needs ≤~4× scale ratio between the UAV frame (~12 cm/px GSD at 1 km AGL with the 20 MP APS-C camera) and the reference; z=20 gives a 2.5× ratio (workable), z=18 gives a 10× ratio (matcher accuracy breaks down). Storage budget at z=20 across the 400 km² operational area = ~2.8 GB cache + ~30 MB DEM + ~16 MB VPR chunk index ≈ ~3 GB total — well inside the 10 GB cache budget. **VPR retrieval unit is decoupled from the storage tile** (see AC-8.6 below): VPR chunks are derived from the z=20 tile cache at ground-footprint scale (~600–800 m chunks with 40–50 % overlap), independent of the storage zoom level.
+- **Storage tile resolution convention**: cache imagery is specified by source pixel size, not by assuming a universal zoom-to-meter mapping. The cache interface accepts **0.5 m/px minimum, 0.3 m/px ideal** imagery, and every tile manifest records CRS, tile matrix convention, tile dimension, latitude-adjusted meters-per-pixel, capture date, source, and compression. If an XYZ/WebMercator tile pyramid is used, its zoom level is documented as a provider convention rather than treated as proof of physical resolution. The matcher (Component 3) needs <=~4x scale ratio between the UAV frame (~12 cm/px GSD at 1 km AGL with the 20 MP APS-C camera) and the reference; 0.3-0.5 m/px reference imagery gives a ~2.5-4.2x ratio. Storage budget across the 400 km² operational area remains capped at **10 GB** for the persistent cache and must be validated against the final provider format/compression. The 10 GB budget includes cache imagery, manifests, overviews, and any precomputed global/local descriptors unless the solution draft explicitly splits a separate descriptor/index budget. **VPR retrieval unit is decoupled from the storage tile** (see AC-8.6 below): VPR chunks are derived from the tile cache at ground-footprint scale (~600-800 m chunks with 40-50 % overlap), independent of the storage tile convention.
 - **Freshness gates** (AC-8.2 / AC-NEW-6) are enforced at runtime: tiles older than 6 months (active-conflict sectors) or 12 months (stable rear sectors) are rejected or down-confidence-weighted. Tiles generated mid-flight are timestamped with the current flight date and treated as fresh.
 - **Free public imagery (Sentinel-2 etc.)** is not on the runtime path. If the Suite Satellite Service ever returns Sentinel-class tiles, the cache rejects them as below the 0.5 m/px floor.
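The resolution bullets above can be sketched as three small checks: the latitude-adjusted pixel size of an XYZ/WebMercator tile, the matcher scale ratio, and the 0.5 m/px cache floor. This is a sketch under stated assumptions: the function names are hypothetical, the 0.12 m/px UAV GSD is the approximate figure quoted above, and the 256 px default tile size is only an example.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis used by WebMercator


def webmercator_m_per_px(lat_deg, zoom, tile_px=256):
    """Ground metres per pixel of an XYZ/WebMercator tile at a latitude."""
    circumference = 2 * math.pi * EARTH_RADIUS_M
    return circumference * math.cos(math.radians(lat_deg)) / (tile_px * 2 ** zoom)


def scale_ratio(reference_m_per_px, uav_gsd_m_per_px=0.12):
    """Ratio between the reference pixel size and the UAV frame GSD."""
    return reference_m_per_px / uav_gsd_m_per_px


def meets_cache_floor(m_per_px, floor_m_per_px=0.5):
    """True when imagery is at least as fine as the 0.5 m/px minimum."""
    return m_per_px <= floor_m_per_px


print(round(scale_ratio(0.3), 2))  # 2.5
print(round(scale_ratio(0.5), 2))  # 4.17
print(meets_cache_floor(10.0))     # False: a 10 m/px Sentinel-2-class tile
```

Because the pixel size depends on latitude and tile dimension, the same zoom level yields different physical resolutions, which is why the manifest records meters-per-pixel explicitly.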
@@ -46,6 +46,7 @@
 ## Sensors & Integration
 - High-rate **IMU** data is available from the flight controller via MAVLink.
 - The original still-image sample does **not** include synchronized IMU or ground-truth pose. The Derkachi representative fixture adds cropped nadir video plus synchronized `SCALED_IMU2` and `GLOBAL_POSITION_INT` telemetry, which is enough for replay, synchronization, latency, VIO smoke tests, and trajectory comparison against the tlog GPS path. Final production acceptance still requires camera intrinsics, lens distortion, exact camera-to-body calibration, and representative synchronized navigation-camera frames, FC IMU/attitude/airspeed/altitude, emitted MAVLink messages, and ground-truth trajectory from a representative flight or replay rig.
 - The system communicates with the flight controller via MAVLink. Telemetry plumbing uses **MAVSDK**; the `GPS_INPUT` injection path is implemented via **pymavlink**, since MAVSDK does not expose a native `GPS_INPUT` API.
 - **Autopilot target: ArduPilot only** (with `GPS1_TYPE=14` for MAVLink GPS injection). PX4 is out of scope for the build; if it ever returns to scope it will use `VISION_POSITION_ESTIMATE`, not `GPS_INPUT`. (See `_docs/00_research/00_ac_assessment.md` Q-1.)
 - The system outputs WGS84 GPS coordinates to the flight controller as a replacement for the real GPS module (MAVLink GPS_INPUT, AC-4.3).
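A hedged sketch of how an estimate might be packed into the `GPS_INPUT` message fields before it is handed to pymavlink. The dictionary keys follow the field names in the MAVLink common message spec; the helper name, the placeholder values for GPS week time, DOP, and satellite count, and the idea of passing the dict to pymavlink's generated send helper are assumptions that should be verified against the installed dialect. The actual send call is not shown.

```python
def build_gps_input_fields(time_usec, lat_deg, lon_deg, alt_m,
                           vn, ve, vd,
                           horiz_accuracy_m, vert_accuracy_m,
                           speed_accuracy_mps, fix_type=3):
    """Pack an estimate into a dict keyed by GPS_INPUT field names."""
    return {
        "time_usec": int(time_usec),
        "gps_id": 0,
        "ignore_flags": 0,                 # bitmask; 0 means no field is ignored
        "time_week_ms": 0,                 # placeholder: GPS time not modelled here
        "time_week": 0,                    # placeholder: GPS week not modelled here
        "fix_type": fix_type,              # 3 = 3D fix
        "lat": int(round(lat_deg * 1e7)),  # WGS84 latitude in degE7
        "lon": int(round(lon_deg * 1e7)),  # WGS84 longitude in degE7
        "alt": float(alt_m),               # metres
        "hdop": 1.0,                       # placeholder value
        "vdop": 1.0,                       # placeholder value
        "vn": float(vn),                   # m/s, north
        "ve": float(ve),                   # m/s, east
        "vd": float(vd),                   # m/s, down
        "speed_accuracy": float(speed_accuracy_mps),
        "horiz_accuracy": float(horiz_accuracy_m),
        "vert_accuracy": float(vert_accuracy_m),
        "satellites_visible": 10,          # placeholder value
    }
```

Keeping the packing step free of any MAVLink dependency makes the unit conversions (degrees to degE7) testable without a flight controller or SITL.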
@@ -0,0 +1,54 @@
 # Acceptance Criteria Assessment
 Accessed: 2026-05-01. Rerun after user-approved clarifications: 2026-05-01.
 ## Research Scope
 - **Output class**: Technical-component selection support.
 - **Novelty sensitivity**: High for VPR, embedded AI, and autopilot integration; source preference is current papers and official docs.
 - **Boundary**: Fixed-wing UAV, nadir navigation camera, ArduPilot Plane, Jetson Orin Nano Super, offline Azaion Suite Satellite Service cache, eastern/southern Ukraine terrain.
 ## Acceptance Criteria
 | Criterion | Current Values | Researched Values / Evidence | Cost / Timeline Impact | Status |
 |-----------|----------------|------------------------------|------------------------|--------|
 | AC-1.1 / AC-1.2 frame-center accuracy | <=50 m for >=80%, <=20 m for >=50% in normal segments | Plausible only with periodic satellite anchoring plus VO/IMU propagation. Aerial VPR papers show the mechanism is viable but sensitive to weather, scale, repetition, and tile overlap. | High validation cost. | Keep, high-risk |
 | AC-1.3 drift | VO-only <100 m, IMU-fused <50 m between anchors, anchor age reported | Updated AC now requires `last_satellite_anchor_age_ms`, binned validation, and degraded covariance after a solution-defined max anchor age. | Medium. | Updated |
 | AC-1.4 confidence | 95% covariance ellipse + source label | `GPS_INPUT` supports accuracy fields; source labels must be carried in telemetry/FDR because `GPS_INPUT` has no semantic label field. | Medium. | Keep |
 | AC-2.1 registration | VO >95%; satellite anchoring measured separately | Split is correct: VO success is not the same as cross-domain satellite anchor success. | Medium-high. | Updated |
 | AC-2.2 reprojection | <1 px VO, <2.5 px satellite anchor | Reasonable image-space gates, with coordinate error still dependent on calibration, orthorectification, and satellite georegistration. | Medium. | Keep |
 | AC-3.x resilience | Outliers, sharp turns, disconnected segments, blackout | Technically feasible only through mode switching: VO failure triggers VPR/relocalization, blackout triggers IMU-only propagation with honest covariance growth. | High test cost. | Keep |
 | AC-4.1 latency | <400 ms p95, <=10% frame drops, heavy VPR conditional | Aerial VPR survey reports some re-ranking paths too slow for steady-state use; solution must keep global VPR off the per-frame hot path. | High optimization cost. | Updated |
 | AC-4.2 memory | <8 GB shared | Feasible if descriptors are compressed/pruned and indices are memory-mapped or loaded selectively. | Medium-high. | Keep |
 | AC-4.3 MAVLink | v1 GPS_INPUT only via pymavlink | ArduPilot docs require `GPS1_TYPE=14`; MAVLink defines required lat/lon, velocity, fix, and accuracy fields. MAVSDK should remain telemetry-oriented. | Medium. | Keep |
 | AC-5.2 failsafe | >3 s no estimate triggers fallback, Plane SITL verified | Copter docs are reference only. Plane-specific production parameters must be verified in SITL. | Medium. | Updated |
 | AC-7 object localization | Level-flight AI-camera object GPS | Realistic under level-flight clause; maneuvering estimates must publish conservative bound. | Medium. | Keep |
 | AC-8.x satellite cache | 0.3-0.5 m/px, freshness, offline descriptors, VPR chunks | Resolution is feasible through commercial/service imagery. Storage must count descriptors unless separately budgeted. | Medium-high. | Updated |
 | AC-NEW-1 / 2 startup and spoofing | <30 s first fix, <3 s promotion | Feasible only with prebuilt engines, warmed indices, and verified Plane GPS-health triggers. | Medium-high. | Keep with SITL gate |
 | AC-NEW-3 FDR | <=64 GB per flight, no raw frames | Feasible with segment files and rollover. | Medium. | Keep |
 | AC-NEW-4 / 7 safety budgets | False-position and cache-poisoning probabilities | Appropriate safety gates, but require Monte Carlo and representative flight/replay data. | High. | Keep |
 | AC-NEW-5 environment | -20 C to +50 C, 25 W for 8 h | NVIDIA confirms 25 W mode; thermal design must prevent throttling. | Medium-high. | Keep |
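For AC-1.4, the 95% covariance ellipse can be derived from the 2x2 horizontal position covariance. The sketch below is illustrative only: the function name and the east/north argument layout are assumptions, and how the result maps onto the `GPS_INPUT` accuracy fields is left to the solution design.

```python
import math

CHI2_95_2DOF = 5.991  # 95% quantile of chi-square with 2 degrees of freedom


def covariance_ellipse_95(var_e, var_n, cov_en):
    """Semi-major and semi-minor axes (m) of the 95% ellipse.

    Inputs are the east variance, north variance, and east-north
    covariance in square metres.
    """
    mean = (var_e + var_n) / 2.0
    diff = (var_e - var_n) / 2.0
    root = math.sqrt(diff * diff + cov_en * cov_en)
    lam_max = mean + root            # larger eigenvalue of the covariance
    lam_min = max(mean - root, 0.0)  # guard against tiny negative round-off
    return (math.sqrt(CHI2_95_2DOF * lam_max),
            math.sqrt(CHI2_95_2DOF * lam_min))
```

For example, `covariance_ellipse_95(25.0, 4.0, 0.0)` describes an estimate with 5 m and 2 m standard deviations along uncorrelated axes.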
 ## Restrictions Assessment
 | Restriction | Current Values | Researched Values / Evidence | Cost / Timeline Impact | Status |
 |-------------|----------------|------------------------------|------------------------|--------|
 | Camera source of truth | `restrictions.md` pins ADTi 20MP ~5472 x 3648 | User confirmed `restrictions.md` is authoritative. Lens/FOV remains a design parameter. | Medium during module selection. | Updated |
 | Fixed nadir camera | No gimbal stabilization | Good for orthorectification; turn/tilt requires attitude compensation and failure detection. | Medium. | Keep |
 | Terrain/weather | Flat steppe/agricultural, seasonal classes included | Repetitive fields and seasonal changes are VPR hazards; validation must include those classes. | High validation cost. | Updated |
 | Satellite Service boundary | Offline consumer of Suite Satellite Service | Strong separation; cache manifest and ingest-voting contract are required. | Medium. | Keep |
 | 10 GB cache | Includes imagery, manifests, overviews, descriptors unless split | Plausible at 0.5 m/px with compression; 0.3 m/px plus descriptors may exceed unless pruned. | Medium. | Updated |
 | Jetson Orin Nano Super | 67 TOPS INT8, 8 GB, 25 W | Official specs support the restriction; thermal throttling remains a risk. | Medium-high. | Keep |
 | Test data gap | Sample imagery lacks IMU/ground truth | Public datasets help prototype, but final acceptance needs synchronized representative data. | High. | Updated |
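The 10 GB cache row can be sanity-checked with a rough pixel-count estimate. This is a sketch, not a storage measurement: the function names are hypothetical, 3 bytes/px corresponds to uncompressed 8-bit RGB, and any compressed bytes-per-pixel figure must come from the final provider format. Overviews, manifests, and descriptors are not included.

```python
def raster_pixels(area_km2, m_per_px):
    """Number of pixels needed to cover an area at a given pixel size."""
    return area_km2 * 1_000_000 / (m_per_px ** 2)


def imagery_gb(area_km2, m_per_px, bytes_per_px):
    """Imagery size in GB (10^9 bytes) for a given bytes-per-pixel figure."""
    return raster_pixels(area_km2, m_per_px) * bytes_per_px / 1e9


AREA_KM2 = 400  # total operational area from restrictions.md

# Uncompressed 8-bit RGB upper bounds at the minimum and ideal resolutions.
print(round(imagery_gb(AREA_KM2, 0.5, 3), 1))  # 4.8
print(round(imagery_gb(AREA_KM2, 0.3, 3), 1))  # 13.3
```

Moving from 0.5 m/px to 0.3 m/px multiplies the pixel count by about 2.8, which is why the finer resolution is the case that needs validation against the budget.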
 ## Key Findings
 1. Use a hybrid estimator: VO/IMU for frame propagation, satellite/VPR anchors for absolute correction, ESKF covariance as the safety gate.
 2. Do not run heavy VPR/re-ranking every frame; invoke it on cold start, VO failure, covariance growth, sharp turns, and disconnected segments.
 3. Avoid GPL libraries in production dependencies unless the project accepts GPL obligations. GPL VIO/SLAM tools should be benchmarks or references, not selected production components.
 4. The cache must be designed as imagery + metadata + descriptor index, not just raster tiles.
 5. ArduPilot Plane SITL and representative camera+IMU data are blocking validation dependencies, but not blockers for solution drafting.
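Finding 2 can be sketched as a simple trigger predicate that keeps global VPR off the per-frame path. The state fields and both thresholds below are hypothetical placeholders; real values would come from profiling and the solution-defined covariance limits.

```python
from dataclasses import dataclass


@dataclass
class NavState:
    has_anchor: bool         # False until the first satellite anchor (cold start)
    vo_ok: bool              # False when visual odometry has failed
    horiz_sigma_m: float     # current horizontal 1-sigma uncertainty
    yaw_rate_dps: float      # yaw rate in degrees per second
    segment_connected: bool  # False when the current frame is disconnected


def should_run_vpr(state, max_sigma_m=50.0, sharp_turn_dps=20.0):
    """Return True when heavy VPR/re-ranking should be invoked."""
    return (
        not state.has_anchor
        or not state.vo_ok
        or state.horiz_sigma_m > max_sigma_m
        or abs(state.yaw_rate_dps) > sharp_turn_dps
        or not state.segment_connected
    )
```

In steady level flight with a healthy VO track and small covariance the predicate stays False, so only the VO/IMU propagation path runs on each frame.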
 ## Sources
 See `_docs/00_research/01_source_registry.md` for the detailed source list.
@@ -0,0 +1,145 @@
 # Question Decomposition
 ## Classification
 - **Original question**: Design a GPS-denied onboard localization system for a fixed-wing UAV using a nadir camera, IMU, preloaded satellite imagery, and ArduPilot `GPS_INPUT`.
 - **Active mode**: Mode A Phase 2, initial solution research.
 - **Research output class**: Technical-component selection.
 - **Question type**: Decision support with knowledge organization.
 - **Timeliness sensitivity**: High for VPR, embedded AI inference, and MAVLink/ArduPilot integration; medium for geometry and filtering fundamentals.
 ## Research Boundary
 | Dimension | Boundary |
 |-----------|----------|
 | Population | Fixed-wing UAV missions; not multirotor hover workflows. |
 | Geography | Eastern/southern Ukraine operational areas on the east (left-bank) side of the Dnipro River. |
 | Timeframe | Current implementation target with 2024-2026 component evidence where possible. |
 | Level | Onboard real-time production system, not offline post-processing. |
 | Operating context | 8 h flight, 60 km/h, <=1 km AGL, 3 fps nav camera, Jetson Orin Nano Super, GPS denied/spoofed. |
 | Required interfaces | Offline Satellite Service cache in; MAVLink `GPS_INPUT`, QGC telemetry, FDR records, and object-coordinate API out. |
 | Non-functional envelope | <400 ms p95, <8 GB shared memory, 10 GB persistent cache target, 64 GB FDR cap, safety covariance and false-position budgets. |
 ## Project Constraint Matrix Summary
 | Constraint Area | Binding Constraint |
 |-----------------|-------------------|
 | Camera | ADTi 20MP 20L V1, APS-C, ~5472 x 3648, fixed nadir, no gimbal stabilization. |
 | Sensors | FC IMU/attitude/airspeed/altitude available over MAVLink; original still-image sample lacks synchronized IMU, while Derkachi replay data now provides synchronized IMU and `GLOBAL_POSITION_INT` trajectory. |
 | Reference imagery | Offline cache only, 0.5 m/px minimum and 0.3 m/px ideal, freshness gates, no in-flight provider fetch. |
 | Runtime | Jetson Orin Nano Super, CUDA/TensorRT available, 25 W thermal envelope. |
 | Autopilot | ArduPilot only, v1 emits `GPS_INPUT` only; ODOMETRY intentionally disabled. |
 | Storage | No raw frame retention; tiles + FDR only. Descriptor/index storage must be budgeted. |
 | Safety | Reject weak anchors, never under-report covariance, fail/degrade honestly in blackout and spoofing. |
 | Hard disqualifiers | Per-frame heavy VPR without profiling, runtime dependence on external network, stale-tile confident anchors, GPL production dependency unless licensing is accepted. |
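The "reject weak anchors" constraint is commonly implemented as a Mahalanobis gate on the anchor residual. The following is a minimal 2D sketch under that assumption; the function names are hypothetical and the default gate of 9.21 (the 99% chi-square quantile for 2 degrees of freedom) is an example value, not a project decision.

```python
def mahalanobis_sq_2d(dx, dy, sxx, syy, sxy):
    """Squared Mahalanobis distance of residual (dx, dy).

    sxx, syy, sxy are the entries of the 2x2 innovation covariance.
    """
    det = sxx * syy - sxy * sxy
    if sxx <= 0 or det <= 0:
        raise ValueError("innovation covariance must be positive definite")
    # Closed-form inverse of [[sxx, sxy], [sxy, syy]] applied to the residual.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def accept_anchor(dx, dy, sxx, syy, sxy, gate=9.21):
    """Accept a satellite anchor only if its residual passes the gate."""
    return mahalanobis_sq_2d(dx, dy, sxx, syy, sxy) <= gate
```

With a 20 m standard deviation on each axis, a 50 m residual passes the gate while a 100 m residual is rejected, so the gate scales with the reported uncertainty instead of using a fixed distance.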
 ## Perspectives
 | Perspective | Focus |
 |-------------|-------|
 | Operator / mission user | Does the system keep the UAV navigable and report honest confidence under spoofing/blackout? |
 | Embedded implementer | Can the pipeline fit <400 ms p95 and <8 GB on Jetson with maintainable interfaces? |
 | Safety reviewer | Are false-position and cache-poisoning paths gated before they can steer the FC or poison future caches? |
 | Field practitioner | Will seasonal agricultural repetition, turns, haze/smoke, and stale imagery break the architecture? |
 | Contrarian | Which attractive libraries or SOTA models fail because of licensing, memory, latency, or input mismatch? |
 ## Sub-Questions And Query Variants
 1. What architecture bounds drift while GPS is denied?
   - fixed-wing UAV GPS-denied satellite image matching visual odometry
   - visual odometry satellite imagery accumulated error fixed wing UAV
   - monocular VIO aerial navigation scale ambiguity satellite anchor
   - GPS spoofed UAV visual inertial navigation covariance failover
 2. Which VO/VIO approach fits one nadir camera + IMU?
   - OpenVINS monocular visual inertial odometry Jetson
   - ORB-SLAM3 monocular inertial Jetson UAV limitations
   - VINS-Fusion fixed wing monocular IMU outdoor aerial
   - homography visual odometry nadir UAV IMU fusion
 3. Which satellite retrieval and matching approach fits offline cache + <400 ms?
   - aerial visual place recognition survey DINOv2 FAISS
   - DINOv2 VLAD aerial VPR embedded memory
   - LightGlue SuperPoint DISK ALIKED TensorRT Jetson
   - cross-view UAV satellite matching failure modes farmland
 4. How should the estimator and safety modes work?
   - ESKF visual inertial GPS denied UAV covariance
   - GPS_INPUT horiz_accuracy covariance external GPS ArduPilot
   - visual blackout IMU dead reckoning UAV covariance growth
   - false position rejection Mahalanobis gate visual localization
 5. What cache format and data contract fit the onboard/Satellite Service boundary?
   - COG PMTiles MBTiles offline raster cache embedded
   - satellite tile descriptor index storage FAISS PMTiles
   - cloud optimized geotiff local update limitations
   - PMTiles read only update PostgreSQL/PostGIS-backed raster cache
 6. How should MAVLink output integrate with ArduPilot Plane?
   - ArduPilot GPS_INPUT GPS1_TYPE 14 Plane SITL
   - pymavlink gps_input_send external GPS example
   - MAVSDK GPS_INPUT support raw MAVLink
   - ArduPilot EKF GPS glitch spoof failsafe Plane parameters
 7. What validation datasets and tests are needed?
   - AerialVL UAV satellite visual localization dataset
   - VPAir aerial visual place recognition dataset
   - EuRoC MAV visual inertial odometry dataset
   - ArduPilot Plane SITL fake GPS spoofing simulation
 ## Component Option Search Plan
 | Component Area | Option Families / Candidates | Evidence Needed |
 |----------------|------------------------------|-----------------|
 | Camera calibration and geometry | OpenCV calibration/homography; custom NumPy geometry; ROS camera pipeline | Official API for intrinsics, distortion, homography, RANSAC; permissive licensing; Jetson compatibility. |
 | VO / VIO propagation | OpenVINS, ORB-SLAM3, VINS-Fusion, custom homography+IMU ESKF | Exact monocular+IMU input fit, output pose/covariance, licensing, runtime, initialization behavior. |
 | VPR global retrieval | DINOv2-VLAD/AnyLoc, MixVPR/SALAD/SelaVPR, classical NetVLAD/BoW | Aerial benchmark evidence, descriptor size, offline index fit, embedded feasibility. |
 | Local cross-domain matching | LightGlue + DISK/ALIKED, SuperPoint+LightGlue, LoFTR/XFeat, SIFT/ORB baseline | Inputs/outputs, match coordinates, license, runtime knobs, TensorRT/Jetson feasibility. |
 | Vector index | FAISS CPU/GPU, PostgreSQL/pgvector metadata-assisted search, Annoy/HNSWLIB | Top-K retrieval, saved index, memory/compression knobs, ARM/Jetson feasibility. |
 | Estimator | Custom ESKF, factor graph, robot_localization | Covariance output, mode labels, Mahalanobis gates, source-specific update control. |
 | Cache/storage | COG, PostgreSQL/PostGIS manifest, PMTiles, MBTiles, raw tile folders | Offline read/update behavior, storage efficiency, metadata/manifest support. |
 | MAVLink integration | pymavlink, MAVSDK, MAVProxy bridge | `GPS_INPUT` support, ArduPilot `GPS1_TYPE=14`, telemetry subscriptions, QGC status. |
 | FDR | PostgreSQL event index, Parquet export, CBOR segment files | Streaming writes, rollover, compact typed records, replayability. |
 ## Completeness Audit
 - **Cost/resources**: covered by Jetson, cache, thermal, and descriptor storage constraints.
 - **Legal/licensing**: covered; GPL VIO/SLAM tools are not selected for production.
 - **Dependencies**: Satellite Service cache contract, ArduPilot Plane SITL, and synchronized validation data are explicit dependencies.
 - **Operating environment**: fixed-wing, altitude, terrain, seasonal/visibility classes, and blackout cases covered.
 - **Failure modes**: VO failure, stale tiles, spoofing, blackout, thermal throttling, false anchors, cache poisoning covered.
 - **Practitioner concerns**: real-time embedded performance and dataset mismatch covered through survey and benchmark sources.
 - **Change over time**: DINOv2/VPR models and Jetson/TensorRT assumptions require version-pinned profiling during implementation.
 ## Mode B Round 2 Addendum — User-Requested Technology Check
 ### Research Output Class
 Technical-component selection. The addendum verifies two implementation choices before autodev proceeds to planning:
 1. Whether OpenVINS should replace the custom OpenCV-based VO/ESKF direction.
 2. Whether DINOv2-VLAD + ALIKED/LightGlue is still the right satellite retrieval and anchor-verification stack.
 ### Boundary Clarification
 "Custom OpenCV" is treated as OpenCV for calibration, undistortion, feature geometry, homography/RANSAC, and MRE measurement, plus a project-owned ESKF/mode machine. It is not treated as a naive OpenCV-only replacement for VIO.
 ### Additional Query Variants Executed
 - OpenVINS GPL-3 license MSCKF visual inertial odometry documentation monocular IMU 2026
 - OpenVINS visual inertial odometry GPS denied UAV MSCKF limitations monocular high altitude nadir camera
 - why not use OpenVINS production GPL ROS dependency visual inertial odometry limitations
 - OpenCV license BSD 3-Clause camera calibration findHomography RANSAC documentation 4.x
 - custom visual odometry OpenCV homography IMU EKF fixed wing UAV satellite imagery GPS denied 2024
 - DINOv2 VLAD AnyLoc visual place recognition aerial satellite retrieval benchmark 2024 2025
 - DINOv2 VLAD limitations visual place recognition storage compute AnyLoc limitations
 - DINOv2 TensorRT Jetson performance issue embedding accuracy visual place recognition
 - ALIKED LightGlue license local feature matching aerial image registration 2024 2025
 - ALIKED LightGlue ONNX TensorRT Jetson performance benchmark local feature matching
 - aerial visual place recognition survey 2024 runtime memory re-ranking SuperGlue LightGlue satellite UAV retrieval
 ### Addendum Conclusion
 OpenVINS is better than a pure custom OpenCV-only VIO implementation, but the production architecture should keep OpenCV as the utility layer and keep the project-owned ESKF/mode machine as the shipped estimator. OpenVINS becomes a mandatory benchmark/reference because it does not own the satellite anchor, spoofing/blackout, source-label, cache-write, and MAVLink semantics required by the acceptance criteria, and GPLv3 remains a production dependency blocker.
 DINOv2-VLAD + CPU-first FAISS + ALIKED/LightGlue remains the preferred anchor stack, with two non-negotiable constraints: retrieval is trigger-based rather than per-frame, and TensorRT/ONNX optimizations are accepted only after descriptor-fidelity and Jetson latency tests.
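One way to express the acceptance gate for an optimized engine is to compare its descriptors with the reference model and check p95 latency. The sketch below is an assumption-laden illustration: the function names and the 0.99 cosine threshold are hypothetical, and the 400 ms limit is borrowed from the end-to-end AC-4.1 budget only as an example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    if not values:
        raise ValueError("p95 needs at least one value")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


def passes_gate(ref_descs, opt_descs, latencies_ms,
                min_cos=0.99, max_p95_ms=400.0):
    """True when every descriptor pair is close and p95 latency is in budget."""
    fidelity_ok = all(
        cosine_similarity(r, o) >= min_cos
        for r, o in zip(ref_descs, opt_descs)
    )
    return fidelity_ok and p95(latencies_ms) <= max_p95_ms
```

A retrieval-level check, such as comparing top-K results between the reference and optimized models on the same queries, would be a stronger fidelity test than per-descriptor cosine similarity alone.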
@@ -0,0 +1,419 @@
 # Source Registry
 ## Source #1
 - **Title**: Visual Odometry in GPS-Denied Zones for Fixed-Wing UAV with Reduced Accumulative Error Based on Satellite Imagery
 - **Link**: https://www.mdpi.com/2076-3417/14/16/7420
 - **Tier**: L1
 - **Publication Date**: 2024
 - **Timeliness Status**: Currently valid
 - **Target Audience**: UAV visual localization researchers/implementers
 - **Research Boundary Match**: Full match
 - **Summary**: Demonstrates fixed-wing high-altitude monocular VO corrected by satellite imagery; highlights scale ambiguity and accumulated drift.
 - **Related Sub-question**: Architecture / drift bounding
 ## Source #2
 - **Title**: Visual place recognition for aerial imagery: A survey
 - **Link**: https://arxiv.org/abs/2406.00885
 - **Tier**: L1
 - **Publication Date**: 2024
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Aerial VPR researchers/implementers
 - **Research Boundary Match**: Full match
 - **Summary**: Reviews aerial VPR, retrieval/re-ranking, overlap/scale effects, memory/runtime issues, and georeference recall.
 - **Related Sub-question**: VPR / validation
 ## Source #3
 - **Title**: OpenVINS documentation
 - **Link**: https://docs.openvins.com/
 - **Tier**: L1
 - **Publication Date**: 2023 latest noted release
 - **Timeliness Status**: Needs verification before implementation
 - **Target Audience**: VIO researchers/implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: OpenVINS is an EKF/MSCKF visual-inertial estimator supporting monocular tracking, calibration, evaluation, and covariance-aware estimation; GPL-3 license.
 - **Related Sub-question**: VO/VIO
 ## Source #4
 - **Title**: ORB-SLAM3 README
 - **Link**: https://raw.githubusercontent.com/UZ-SLAMLab/ORB_SLAM3/master/README.md
 - **Tier**: L1
 - **Publication Date**: 2021 README, still repository source
 - **Timeliness Status**: Needs verification before implementation
 - **Target Audience**: SLAM implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: ORB-SLAM3 supports monocular visual-inertial SLAM and multi-map operation, requires calibration, and is GPLv3.
 - **Related Sub-question**: VO/VIO alternatives
 ## Source #5
 - **Title**: OpenCV 4.x documentation via Context7
 - **Link**: https://docs.opencv.org/4.x/
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Computer vision implementers
 - **Research Boundary Match**: Full match for utility layer
 - **Summary**: Documents camera calibration, undistortion, and `findHomography` with RANSAC for robust geometry.
 - **Related Sub-question**: Calibration / geometry
 ## Source #6
 - **Title**: LightGlue README and Context7 docs
 - **Link**: https://raw.githubusercontent.com/cvg/LightGlue/main/README.md
 - **Tier**: L1
 - **Publication Date**: Current repository, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Feature-matching implementers
 - **Research Boundary Match**: Full match for local matching
 - **Summary**: LightGlue accepts local keypoints/descriptors and returns matched coordinates/scores; supports SuperPoint, DISK, ALIKED, SIFT, adaptive pruning, CUDA, and Apache-2 for code/weights while SuperPoint has restrictive licensing.
 - **Related Sub-question**: Local matching
 ## Source #7
 - **Title**: AnyLoc README
 - **Link**: https://github.com/AnyLoc/AnyLoc
 - **Tier**: L1
 - **Publication Date**: 2023 repository, accessed 2026-05-01
 - **Timeliness Status**: Needs profiling verification
 - **Target Audience**: VPR implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Provides DINOv2 + VLAD API examples and notes substantial storage/compute requirements for full experiments.
 - **Related Sub-question**: VPR descriptors
 ## Source #8
 - **Title**: DINOv2 repository
 - **Link**: https://github.com/facebookresearch/dinov2
 - **Tier**: L1
 - **Publication Date**: 2023 repository, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Vision model implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Meta's DINOv2 implementation and models, Apache-2.0 / CC-BY-4.0 license notices.
 - **Related Sub-question**: VPR descriptors
 ## Source #9
 - **Title**: FAISS documentation and Context7 docs
 - **Link**: https://faiss.ai/index.html
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Vector search implementers
 - **Research Boundary Match**: Full match
 - **Summary**: FAISS supports dense vector search, top-k retrieval, CPU/GPU indexes, product quantization, and save/load APIs; GPU indexes must be converted to CPU before saving.
 - **Related Sub-question**: Descriptor retrieval
 ## Source #10
 - **Title**: MAVSDK documentation via Context7
 - **Link**: https://github.com/mavlink/mavsdk
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: MAVLink application implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: MAVSDK provides telemetry APIs including raw GPS, GPS info, status text, position/velocity, and odometry subscriptions; `GPS_INPUT` emission should use raw MAVLink/pymavlink for this project.
 - **Related Sub-question**: MAVLink integration
 ## Source #11
 - **Title**: ArduPilot MAVProxy GPSInput
 - **Link**: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: ArduPilot integrators
 - **Research Boundary Match**: Full match
 - **Summary**: External GPS input requires `GPS1_TYPE=14` and accepts MAVLink `GPS_INPUT` fields including WGS84 lat/lon, velocity, fix type, and accuracy.
 - **Related Sub-question**: MAVLink output
 ## Source #12
 - **Title**: MAVLink common message spec: GPS_INPUT
 - **Link**: https://mavlink.io/en/messages/common.html#GPS_INPUT
 - **Tier**: L1
 - **Publication Date**: Current spec, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: MAVLink implementers
 - **Research Boundary Match**: Full match
 - **Summary**: Defines `GPS_INPUT` fields, fix type semantics, `horiz_accuracy`, and ignore flags.
 - **Related Sub-question**: MAVLink output / confidence
 ## Source #13
 - **Title**: ArduPilot GPS failsafe and glitch protection
 - **Link**: https://ardupilot.org/copter/docs/gps-failsafe-glitch-protection.html
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Reference only for Plane
 - **Target Audience**: ArduPilot operators
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Documents GPS glitch protection and notes inertial-only position degrades quickly; Copter-specific defaults must not be assumed for Plane.
 - **Related Sub-question**: Failsafe / spoofing
 ## Source #14
 - **Title**: ArduPilot EKF failsafe
 - **Link**: https://ardupilot.org/copter/docs/ekf-inav-failsafe.html
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Reference only for Plane
 - **Target Audience**: ArduPilot operators
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Explains EKF variance failsafe behavior and why spoof/glitch tests must be parameterized.
 - **Related Sub-question**: Failsafe / spoofing
 ## Source #15
 - **Title**: Jetson Orin Nano Super Developer Kit
 - **Link**: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit/
 - **Tier**: L1
 - **Publication Date**: Current page, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Embedded AI implementers
 - **Research Boundary Match**: Full match
 - **Summary**: Confirms 67 INT8 TOPS, 8 GB LPDDR5, 102 GB/s, and 7-25 W power range.
 - **Related Sub-question**: Runtime
 ## Source #16
 - **Title**: NVIDIA JetPack 6.2 Super Mode blog
 - **Link**: https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/
 - **Tier**: L2
 - **Publication Date**: 2024
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Jetson developers
 - **Research Boundary Match**: Full match
 - **Summary**: Explains 25 W and MAXN Super modes and warns thermal design must accommodate the new power modes or throttling occurs.
 - **Related Sub-question**: Runtime / thermal
 ## Source #17
 - **Title**: PMTiles Concepts
 - **Link**: https://docs.protomaps.com/pmtiles/
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Geospatial storage implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: PMTiles is a single-file tiled archive that is efficient for reads, but it is read-only and cannot be updated in place.
 - **Related Sub-question**: Cache storage
 ## Source #18
 - **Title**: GDAL COG driver
 - **Link**: https://gdal.org/en/stable/drivers/raster/cog.html
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Geospatial raster implementers
 - **Research Boundary Match**: Full match
 - **Summary**: Defines COG creation options for tiled, compressed, overview-enabled GeoTIFFs.
 - **Related Sub-question**: Cache storage
 ## Source #19
 - **Title**: AerialVL dataset
 - **Link**: https://github.com/hmf21/AerialVL
 - **Tier**: L1
 - **Publication Date**: 2024
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Aerial visual localization researchers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Public aerial localization benchmark with UAV sequences, reference maps, and geo-referenced evaluation data.
 - **Related Sub-question**: Validation
 ## Source #20
 - **Title**: EuRoC MAV Dataset
 - **Link**: http://projects.asl.ethz.ch/datasets/euroc-mav/
 - **Tier**: L1
 - **Publication Date**: 2016
 - **Timeliness Status**: Stable benchmark
 - **Target Audience**: VIO researchers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Stereo camera + IMU + ground truth benchmark useful for VIO sanity tests but not representative of high-altitude nadir fixed-wing imagery.
 - **Related Sub-question**: Validation
 ## Source #21
 - **Title**: NVIDIA/TensorRT issue: DINOv2 TensorRT performance/precision on Jetson
 - **Link**: https://github.com/NVIDIA/TensorRT/issues/4348
 - **Tier**: L4
 - **Publication Date**: 2024
 - **Timeliness Status**: Needs verification
 - **Target Audience**: Jetson/TensorRT implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Reports limited mixed-precision gains for DINOv2-S on Jetson/RTX, suggesting DINOv2 optimization is not automatically beneficial.
 - **Related Sub-question**: Mode B performance risk
 ## Source #22
 - **Title**: NVIDIA Developer Forum: DINOv2 TensorRT model performance issue
 - **Link**: https://forums.developer.nvidia.com/t/dinov2-tensorrt-model-performance-issue/312251
 - **Tier**: L4
 - **Publication Date**: 2024
 - **Timeliness Status**: Needs verification
 - **Target Audience**: Jetson/TensorRT implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Reports DINOv2 embedding distance changes after TensorRT conversion on Jetson Orin Nano; requires embedding-fidelity validation before relying on TensorRT descriptors.
 - **Related Sub-question**: Mode B performance/quality risk
 ## Source #23
 - **Title**: LightGlue license issue discussions
 - **Link**: https://github.com/cvg/LightGlue/issues/120
 - **Tier**: L4
 - **Publication Date**: 2024
 - **Timeliness Status**: Currently relevant
 - **Target Audience**: Feature-matching implementers
 - **Research Boundary Match**: Full match for licensing
 - **Summary**: Community discussion highlights restrictive SuperPoint licensing inside the LightGlue ecosystem and supports avoiding SuperPoint as default production extractor.
 - **Related Sub-question**: Mode B licensing risk
 ## Source #24
 - **Title**: ArduPilot issue: GPS_INPUT velocity ignore flag pitfall
 - **Link**: https://github.com/ArduPilot/ardupilot/issues/19633
 - **Tier**: L4
 - **Publication Date**: 2021
 - **Timeliness Status**: Needs SITL verification
 - **Target Audience**: ArduPilot integrators
 - **Research Boundary Match**: Full match for GPS_INPUT caution
 - **Summary**: Reports EKF3 may use zero velocity when `GPS_INPUT_IGNORE_FLAG_VEL_HORIZ` is set, so velocity-source parameters must be tested rather than relying only on ignore flags.
 - **Related Sub-question**: Mode B MAVLink pitfall
 ## Source #25
 - **Title**: FAISS install documentation
 - **Link**: https://github.com/facebookresearch/faiss/blob/main/INSTALL.md
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Vector search implementers
 - **Research Boundary Match**: Full match
 - **Summary**: FAISS CPU conda package supports aarch64, while GPU package availability is x86-64 focused; Jetson design should assume CPU FAISS unless a custom build is proven.
 - **Related Sub-question**: Mode B FAISS deployment
 ## Source #26
 - **Title**: GNSS-denied geolocalization of UAVs by visual matching of onboard camera images with orthophotos
 - **Link**: https://ar5iv.labs.arxiv.org/html/2103.14381
 - **Tier**: L1
 - **Publication Date**: 2021
 - **Timeliness Status**: Stable mechanism reference
 - **Target Audience**: UAV visual geolocalization researchers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Demonstrates visual matching with orthophotos and Monte Carlo/local planarity ideas; supports using orthorectified reference maps but does not cover all adversarial visual attacks.
 - **Related Sub-question**: Mode B alternative / security limits
 ## Source #27
 - **Title**: OpenVINS LICENSE
 - **Link**: https://github.com/rpng/open_vins/blob/master/LICENSE
 - **Tier**: L1
 - **Publication Date**: Current repository, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: VIO implementers / product owners
 - **Research Boundary Match**: Full match for licensing
 - **Summary**: OpenVINS is GPLv3-licensed; this is a production dependency constraint, not a technical capability limitation.
 - **Related Sub-question**: Mode B round 2 — OpenVINS vs custom production estimator
 ## Source #28
 - **Title**: OpenVINS documentation and Context7 lookup
 - **Link**: https://docs.openvins.com/index.html
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: VIO implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: OpenVINS is a strong EKF/MSCKF VIO system for monocular camera + IMU reference runs, with calibration and covariance-aware state estimation, but it does not own the project-specific satellite anchor, GPS_INPUT, source-label, spoofing, blackout, and cache-poisoning state machine.
 - **Related Sub-question**: Mode B round 2 — OpenVINS vs custom production estimator
 ## Source #29
 - **Title**: OpenCV 4.x calibration/homography documentation and Context7 lookup
 - **Link**: https://docs.opencv.org/4.x/
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Computer vision implementers
 - **Research Boundary Match**: Full match for geometry utility layer
 - **Summary**: OpenCV 4.x provides calibration, undistortion, homography estimation, RANSAC/USAC robust estimation, and reprojection-error primitives under a permissive license; it is a utility layer rather than a complete GPS-denied estimator.
 - **Related Sub-question**: Mode B round 2 — custom OpenCV boundary
 ## Source #30
 - **Title**: AnyLoc: Towards Universal Visual Place Recognition
 - **Link**: https://arxiv.org/html/2308.00688
 - **Tier**: L1
 - **Publication Date**: 2023; ICRA 2024
 - **Timeliness Status**: Currently valid, profiling required before deployment
 - **Target Audience**: VPR implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: AnyLoc combines DINOv2 features with VLAD aggregation for broad VPR, including aerial data, and supports the selected DINOv2-VLAD retrieval family while leaving runtime/storage tuning as a deployment gate.
 - **Related Sub-question**: Mode B round 2 — satellite retrieval
 ## Source #31
 - **Title**: ALIKED-LightGlue-ONNX and LightGlue ONNX/TensorRT deployment reports
 - **Link**: https://github.com/ikeboo/ALIKED-LightGlue-ONNX
 - **Tier**: L2
 - **Publication Date**: Current repository, accessed 2026-05-01
 - **Timeliness Status**: Promising but needs Jetson verification
 - **Target Audience**: Local feature matching implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: ONNX/optimized variants show a credible deployment path for ALIKED + LightGlue, but public evidence is not enough to assume Jetson Orin Nano p95 latency without project profiling.
 - **Related Sub-question**: Mode B round 2 — local matcher deployability
 ## Source #32
 - **Title**: Visual place recognition for aerial imagery: A survey
 - **Link**: https://arxiv.org/abs/2406.00885
 - **Tier**: L1
 - **Publication Date**: 2024
 - **Timeliness Status**: Currently valid
 - **Target Audience**: Aerial VPR researchers / implementers
 - **Research Boundary Match**: Full match
 - **Summary**: Aerial VPR performance depends materially on tile scale, overlap, weather, repetitive patterns, and re-ranking cost; this supports overlapped VPR chunks, dynamic top-K, and conditional local verification.
 - **Related Sub-question**: Mode B round 2 — satellite retrieval and anchor verification
 ## Source #33
 - **Title**: BASALT repository and documentation
 - **Link**: https://github.com/VladyslavUsenko/basalt
 - **Tier**: L1
 - **Publication Date**: Current repository, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: VIO implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: BASALT provides visual-inertial odometry and mapping, camera/IMU calibration tools, EuRoC/TUM VI support, and a BSD-style production-friendly licensing path.
 - **Related Sub-question**: Mode B round 3 — Kimera vs BASALT vs OpenVINS
 ## Source #34
 - **Title**: HybVIO: Pushing the Limits of Real-time Visual-inertial Odometry
 - **Link**: https://arxiv.org/pdf/2106.11857
 - **Tier**: L1
 - **Publication Date**: 2021
 - **Timeliness Status**: Stable benchmark reference
 - **Target Audience**: VIO researchers / embedded implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Reports EuRoC RMS ATE comparisons, including a BASALT mean of about 0.051 m (online stereo) and a Kimera mean of about 0.12 m, and notes that optimization-based methods often lack direct uncertainty quantification compared with filters.
 - **Related Sub-question**: Mode B round 3 — VIO error and confidence comparison
 ## Source #35
 - **Title**: OpenVINS issue #402 — up-to-date ATE and RTE metrics
 - **Link**: https://github.com/rpng/open_vins/issues/402
 - **Tier**: L4
 - **Publication Date**: 2024
 - **Timeliness Status**: Community benchmark, verify in our replay harness
 - **Target Audience**: VIO implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: Community EuRoC comparison reports BASALT average ATE about 0.072 m with 100% completion, and OpenVINS average ATE about 0.091 m with about 88.55% completion and a divergence on one hard sequence.
 - **Related Sub-question**: Mode B round 3 — BASALT vs OpenVINS error/completion
 ## Source #36
 - **Title**: Kimera-VIO mono-inertial parameter issues
 - **Link**: https://github.com/MIT-SPARK/Kimera-VIO/issues/254
 - **Tier**: L4
 - **Publication Date**: 2024
 - **Timeliness Status**: Relevant implementation caveat
 - **Target Audience**: VIO implementers
 - **Research Boundary Match**: Partial overlap
 - **Summary**: The Kimera-VIO stereo path remains strong, but mono-inertial configurations had documented poor default performance; parameter changes improved one EuRoC mono setup to within about ±0.2 m per axis.
 - **Related Sub-question**: Mode B round 3 — Kimera mono/nadir risk
 ## Source #37
 - **Title**: RaD-VIO and downward-facing VIO literature
 - **Link**: https://arxiv.org/abs/1810.08704
 - **Tier**: L1
 - **Publication Date**: 2018
 - **Timeliness Status**: Stable mechanism reference
 - **Target Audience**: MAV downward-camera VIO researchers
 - **Research Boundary Match**: Full match for nadir-camera caveat
 - **Summary**: Downward-facing monocular VIO has planar-scene and observability challenges; range/altitude and IMU constraints are important when the camera sees mostly ground plane.
 - **Related Sub-question**: Mode B round 3 — nadir support and limitations
 ## Source #38
 - **Title**: OpenVINS covariance documentation and StateHelper APIs
 - **Link**: https://docs.openvins.com/dev-index.html
 - **Tier**: L1
 - **Publication Date**: Current docs, accessed 2026-05-01
 - **Timeliness Status**: Currently valid
 - **Target Audience**: VIO implementers
 - **Research Boundary Match**: Full match for covariance/confidence output
 - **Summary**: OpenVINS maintains EKF covariance and exposes full/marginal covariance helpers, making it the strongest reference for covariance consistency even if GPLv3 blocks default production use.
 - **Related Sub-question**: Mode B round 3 — confidence/covariance support
@@ -0,0 +1,348 @@
 # Fact Cards
 ## Fact #1
 - **Statement**: Fixed-wing high-altitude monocular VO suffers from scale ambiguity and accumulated error; comparing against satellite imagery can reduce accumulated drift.
 - **Source**: Source #1
 - **Phase**: Phase 2
 - **Target Audience**: UAV localization implementers
 - **Confidence**: High
 - **Related Dimension**: Architecture
 - **Fit Impact**: Supports satellite-anchored hybrid estimator.
 ## Fact #2
 - **Statement**: Aerial VPR is sensitive to weather, season, scale variation, repetitive patterns, and map tile construction; overlap and scale level materially affect retrieval quality.
 - **Source**: Source #2
 - **Confidence**: High
 - **Related Dimension**: VPR
 - **Fit Impact**: Supports AC-8.6 VPR chunks with overlap and seasonal validation.
 ## Fact #3
 - **Statement**: Heavy VPR re-ranking can be too slow for steady-state embedded use; survey evidence reports some re-ranking methods taking around 1 s per query and SuperGlue running much slower on the evaluated hardware.
 - **Source**: Source #2
 - **Confidence**: High
 - **Related Dimension**: Runtime
 - **Fit Impact**: Disqualifies per-frame global VPR/re-ranking unless profiled on Jetson.
 ## Fact #4
 - **Statement**: OpenVINS is an EKF/MSCKF visual-inertial estimator with monocular tracking and calibration support, but its code is GPLv3-licensed.
 - **Source**: Source #3
 - **Confidence**: High
 - **Related Dimension**: VO/VIO
 - **Fit Impact**: Reference/benchmark only unless GPL obligations are accepted.
 ## Fact #5
 - **Statement**: ORB-SLAM3 supports monocular visual-inertial SLAM and multi-map operation, but it is GPLv3 and expects careful calibration and a SLAM-style runtime stack.
 - **Source**: Source #4
 - **Confidence**: High
 - **Related Dimension**: VO/VIO
 - **Fit Impact**: Rejected as production dependency; useful benchmark/reference.
 ## Fact #6
 - **Statement**: OpenCV provides camera calibration APIs that output camera matrix and distortion coefficients, and homography estimation APIs including RANSAC.
 - **Source**: Source #5
 - **Confidence**: High
 - **Related Dimension**: Calibration / geometry
 - **Fit Impact**: Selected utility layer for calibration, undistortion, homography, and geometric validation.
 ## Fact #7
 - **Statement**: LightGlue accepts local keypoints/descriptors from extractors such as DISK, ALIKED, SIFT, and SuperPoint, and returns matched keypoint indices, coordinates, and confidence scores.
 - **Source**: Source #6
 - **Confidence**: High
 - **Related Dimension**: Local matching
 - **Fit Impact**: Selected candidate for conditional cross-domain local matching.
 ## Fact #8
 - **Statement**: LightGlue has adaptive depth/width pruning, FlashAttention, mixed precision, and benchmark scripts; runtime must be profiled on Jetson because defaults are optimized for desktop GPUs.
 - **Source**: Source #6
 - **Confidence**: High
 - **Related Dimension**: Runtime
 - **Fit Impact**: Selected with runtime-quality gate.
 ## Fact #9
 - **Statement**: LightGlue code/weights are Apache-2.0, but SuperPoint pretrained weights/inference have restrictive licensing; DISK and ALIKED are safer extractor pairings from a licensing perspective.
 - **Source**: Source #6
 - **Confidence**: High
 - **Related Dimension**: Licensing
 - **Fit Impact**: Select DISK/ALIKED+LightGlue for production candidate; treat SuperPoint as license-gated.
 ## Fact #10
 - **Statement**: AnyLoc provides DINOv2 feature extraction and VLAD aggregation APIs, but its full experiment setup notes large storage/compute requirements.
 - **Source**: Source #7
 - **Confidence**: High
 - **Related Dimension**: VPR descriptors
 - **Fit Impact**: DINOv2-VLAD selected as offline/conditional retrieval candidate, not unconditional per-frame path.
 ## Fact #11
 - **Statement**: DINOv2 official repository provides Meta's DINOv2 implementation and model assets with Apache-2.0 / CC-BY-4.0 license notices.
 - **Source**: Source #8
 - **Confidence**: High
 - **Related Dimension**: VPR descriptors
 - **Fit Impact**: Supports DINOv2 as a permissible descriptor backbone subject to model-license review.
 ## Fact #12
 - **Statement**: FAISS is designed for efficient dense vector similarity search, top-k nearest-neighbor retrieval, speed/accuracy tradeoffs, and indexes too large for simple exhaustive scanning.
 - **Source**: Source #9
 - **Confidence**: High
 - **Related Dimension**: Descriptor retrieval
 - **Fit Impact**: Selected vector index for offline VPR descriptors.
 ## Fact #13
 - **Statement**: FAISS supports saving/loading indexes; GPU indexes must be converted to CPU before saving.
 - **Source**: Source #9
 - **Confidence**: High
 - **Related Dimension**: Cache lifecycle
 - **Fit Impact**: Supports install-time/index-build flow with runtime load.
 ## Fact #14
 - **Statement**: MAVSDK provides telemetry subscriptions for raw GPS, GPS info, status text, odometry, and position/velocity; it does not remove the need for raw MAVLink control over `GPS_INPUT` emission.
 - **Source**: Source #10
 - **Confidence**: High
 - **Related Dimension**: MAVLink integration
 - **Fit Impact**: Select MAVSDK for telemetry, pymavlink/raw MAVLink for `GPS_INPUT`.
 ## Fact #15
 - **Statement**: ArduPilot GPSInput requires `GPS1_TYPE=14` for MAVLink GPS input.
 - **Source**: Source #11
 - **Confidence**: High
 - **Related Dimension**: MAVLink output
 - **Fit Impact**: Confirms production parameter requirement.
 ## Fact #16
 - **Statement**: `GPS_INPUT` carries WGS84 lat/lon, MSL altitude, velocity, `fix_type`, `horiz_accuracy`, `vert_accuracy`, `speed_accuracy`, and ignore flags.
 - **Source**: Source #12
 - **Confidence**: High
 - **Related Dimension**: Output contract
 - **Fit Impact**: Supports mapping estimator covariance to `horiz_accuracy` and failover fix types.
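One way to map estimator covariance to `horiz_accuracy` is sketched below. It is a minimal illustration, not the project's method: it assumes a 2x2 north/east position covariance in m² and takes the standard deviation along the worst-case direction. The function names, the `sigma_scale` parameter, and the `max_usable_m` threshold policy are hypothetical.

```python
import math


def horiz_accuracy_from_cov(p_nn: float, p_ne: float, p_ee: float,
                            sigma_scale: float = 1.0) -> float:
    """Conservative horizontal accuracy in metres from a 2x2 N/E covariance (m^2).

    Uses the largest eigenvalue of [[p_nn, p_ne], [p_ne, p_ee]], i.e. the
    variance along the worst-case horizontal direction.
    """
    mean = 0.5 * (p_nn + p_ee)
    half_diff = 0.5 * (p_nn - p_ee)
    lam_max = mean + math.sqrt(half_diff * half_diff + p_ne * p_ne)
    return sigma_scale * math.sqrt(max(lam_max, 0.0))


def fix_type_for(horiz_acc_m: float, max_usable_m: float) -> int:
    """Hypothetical failover policy: report a 3D fix (3) while the accuracy is
    within the caller-supplied limit, otherwise report no fix (1)."""
    return 3 if horiz_acc_m <= max_usable_m else 1


acc = horiz_accuracy_from_cov(4.0, 0.0, 1.0)
print(acc, fix_type_for(acc, max_usable_m=20.0))
```

The threshold value and any multi-sigma scaling would need to be chosen against Plane SITL behaviour rather than taken from this sketch.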
 ## Fact #17
 - **Statement**: ArduPilot GPS glitch protection and EKF failsafe behavior are parameterized and vehicle-specific; Copter docs are not enough to prove Plane behavior.
 - **Source**: Sources #13, #14
 - **Confidence**: High
 - **Related Dimension**: Failsafe
 - **Fit Impact**: Requires ArduPilot Plane SITL validation.
 ## Fact #18
 - **Statement**: Jetson Orin Nano Super provides 67 INT8 TOPS, 8 GB memory, 102 GB/s bandwidth, and 7-25 W power range.
 - **Source**: Source #15
 - **Confidence**: High
 - **Related Dimension**: Runtime
 - **Fit Impact**: Confirms target platform constraint.
 ## Fact #19
 - **Statement**: NVIDIA warns that the Super power modes require a thermal design that can sustain them; otherwise throttling can reduce performance.
 - **Source**: Source #16
 - **Confidence**: High
 - **Related Dimension**: Thermal
 - **Fit Impact**: Supports AC-NEW-5 hot-soak and throttle logging.
 ## Fact #20
 - **Statement**: PMTiles is efficient for single-file tile reads but is read-only and cannot be updated in place.
 - **Source**: Source #17
 - **Confidence**: High
 - **Related Dimension**: Cache storage
 - **Fit Impact**: Rejected for mutable onboard tile writes; possible export/package format only.
 ## Fact #21
 - **Statement**: COG supports tiled, compressed, overview-enabled GeoTIFFs suitable for efficient raster access and geospatial tooling.
 - **Source**: Source #18
 - **Confidence**: High
 - **Related Dimension**: Cache storage
 - **Fit Impact**: Selected imagery storage unit for immutable service tiles and generated candidate tiles.
 ## Fact #22
 - **Statement**: AerialVL provides aerial visual localization sequences, reference maps, and geo-referenced evaluation data.
 - **Source**: Source #19
 - **Confidence**: Medium
 - **Related Dimension**: Validation
 - **Fit Impact**: Selected validation dataset for VPR/satellite-anchor algorithm development.
 ## Fact #23
 - **Statement**: EuRoC provides synchronized camera/IMU and ground truth for VIO, but it is not representative of high-altitude fixed-wing nadir imagery.
 - **Source**: Source #20
 - **Confidence**: High
 - **Related Dimension**: Validation
 - **Fit Impact**: Use for VIO sanity checks only, not final AC proof.
 ## MVE Evidence
 ### MVE — OpenCV calibration and homography utilities
 - **Source**: Source #5
 - **Pinned mode/config**: Use OpenCV 4.x C++/Python APIs for checkerboard calibration, undistortion, homography estimation with RANSAC, and reprojection-error measurement.
 - **Inputs in example**: Object/image point correspondences, image size, matched keypoints.
 - **Outputs in example**: Camera matrix, distortion coefficients, rotation/translation vectors, homography matrix.
 - **Project inputs**: ADTi nav-camera frames, checkerboard calibration images, matched VO/satellite points.
 - **Project outputs required**: Intrinsics/distortion, homography, inlier mask, MRE.
 - **Match assessment**: Exact match.
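The MRE output can be illustrated without OpenCV. The sketch below assumes a homography `H` and an inlier mask have already been estimated (in OpenCV this would typically come from `cv2.findHomography` with the `cv2.RANSAC` method) and defines MRE as the mean forward-transfer error in pixels over inliers. That definition and the helper names are assumptions; the source does not pin the exact formula.

```python
import math


def project(H, pt):
    """Apply a 3x3 homography (nested lists) to a 2D point."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


def mean_reprojection_error(H, src_pts, dst_pts, inlier_mask=None):
    """Mean pixel distance between H * src and dst, restricted to inliers."""
    errors = []
    for i, (s, d) in enumerate(zip(src_pts, dst_pts)):
        if inlier_mask is not None and not inlier_mask[i]:
            continue  # skip correspondences rejected by robust estimation
        u, v = project(H, s)
        errors.append(math.hypot(u - d[0], v - d[1]))
    if not errors:
        raise ValueError("no inlier correspondences to evaluate")
    return sum(errors) / len(errors)


# Pure translation by (2, 3); the second match is off by one pixel.
H = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]
src = [(0.0, 0.0), (1.0, 1.0)]
dst = [(2.0, 3.0), (3.0, 5.0)]
print(mean_reprojection_error(H, src, dst))
```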
 ### MVE — LightGlue in DISK/ALIKED local-matching mode
 - **Source**: Source #6
 - **Pinned mode/config**: Use DISK+LightGlue or ALIKED+LightGlue on a CUDA/TensorRT-profiled Jetson path; inputs are two normalized images, and outputs are matched keypoint coordinates plus confidence scores.
 - **Inputs in example**: Two images loaded to GPU; local features extracted by DISK/ALIKED/SuperPoint.
 - **Outputs in example**: `matches` shape `(K, 2)`, keypoint coordinates in each image, confidence scores.
 - **Project inputs**: Orthorectified nav frame crop and candidate satellite/VPR chunk.
 - **Project outputs required**: 2D-2D correspondences for RANSAC homography and cross-domain MRE.
 - **Match assessment**: Exact interface match; runtime quality gate remains.
 ### MVE — FAISS top-K VPR retrieval
 - **Source**: Source #9
 - **Pinned mode/config**: Use FAISS CPU index with optional GPU acceleration for top-K nearest neighbor search over precomputed DINOv2/VLAD descriptors, saved/loaded at install/preflight time.
 - **Inputs in example**: Float32 descriptor matrix, query descriptor, `k`.
 - **Outputs in example**: Distance matrix `D` and index matrix `I`.
 - **Project inputs**: Precomputed VPR chunk descriptors, query frame descriptor.
 - **Project outputs required**: Top-K candidate chunk IDs for local matching.
 - **Match assessment**: Exact match.
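The `D`/`I` contract above can be shown with a stdlib stand-in. This is not FAISS: it is an exhaustive squared-L2 search that mimics the shape of what a flat L2 index returns from `search(x, k)`, namely one row of distances and one row of database indices per query. The `chunk_ids` lookup table is a hypothetical mapping from index position to VPR chunk ID.

```python
import heapq


def search_top_k(database, queries, k):
    """Exhaustive top-k search; returns (D, I) with one row per query.

    D holds squared L2 distances in ascending order, I the matching
    database row indices.
    """
    D, I = [], []
    for q in queries:
        scored = [
            (sum((a - b) ** 2 for a, b in zip(q, vec)), idx)
            for idx, vec in enumerate(database)
        ]
        best = heapq.nsmallest(k, scored)
        D.append([dist for dist, _ in best])
        I.append([idx for _, idx in best])
    return D, I


# Hypothetical descriptor rows and chunk IDs (illustration only).
database = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
chunk_ids = ["chunk_000", "chunk_001", "chunk_002", "chunk_003"]

D, I = search_top_k(database, [[0.9, 0.0]], k=2)
candidates = [chunk_ids[i] for i in I[0]]
print(candidates)
```

The candidate list is what would be handed to local matching; real descriptors are float32 matrices of much higher dimension.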
 ### MVE — MAVSDK telemetry + pymavlink GPS_INPUT
 - **Source**: Sources #10, #11, #12
 - **Pinned mode/config**: Use MAVSDK for telemetry subscriptions and pymavlink/raw MAVLink for `GPS_INPUT` emission to ArduPilot with `GPS1_TYPE=14`.
 - **Inputs in example**: Telemetry streams, estimator lat/lon/alt/velocity/covariance.
 - **Outputs in example**: `GPS_INPUT` fields accepted by ArduPilot GPS backend.
 - **Project inputs**: ESKF state and covariance, source label, mode/fix quality.
 - **Project outputs required**: Frame-by-frame WGS84 `GPS_INPUT`, status text, FDR record.
 - **Match assessment**: Exact match for output contract; Plane SITL validation remains.
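A minimal sketch of assembling the `GPS_INPUT` fields follows. It builds a plain dictionary using the field names, units, and ignore-flag bit values from the MAVLink common message set, and leaves the actual send (for example through pymavlink's generated `gps_input_send`) as an integration step not shown here. The function name, the choice to ignore only HDOP/VDOP, and the zero placeholders for GPS week time and satellite count are assumptions that need Plane SITL validation.

```python
# GPS_INPUT_IGNORE_FLAGS bit values from the MAVLink common message set.
IGNORE_FLAG_HDOP = 2
IGNORE_FLAG_VDOP = 4
IGNORE_FLAG_VEL_HORIZ = 8
IGNORE_FLAG_VEL_VERT = 16


def build_gps_input(time_usec, lat_deg, lon_deg, alt_msl_m, vel_ned_mps,
                    horiz_acc_m, vert_acc_m, speed_acc_mps, fix_type,
                    satellites_visible=0, time_week=0, time_week_ms=0):
    """Return GPS_INPUT fields as a dict (hypothetical helper).

    Velocity ignore flags are deliberately left clear: a reported EKF3
    pitfall is that ignored horizontal velocity may be treated as zero,
    so real NED velocity is supplied instead.
    """
    vn, ve, vd = vel_ned_mps
    return {
        "time_usec": int(time_usec),
        "gps_id": 0,
        "ignore_flags": IGNORE_FLAG_HDOP | IGNORE_FLAG_VDOP,
        "time_week_ms": int(time_week_ms),   # placeholder unless a GPS time source exists
        "time_week": int(time_week),         # placeholder unless a GPS time source exists
        "fix_type": int(fix_type),
        "lat": int(round(lat_deg * 1e7)),    # WGS84 latitude, degE7
        "lon": int(round(lon_deg * 1e7)),    # WGS84 longitude, degE7
        "alt": float(alt_msl_m),             # metres above MSL
        "hdop": 0.0,                         # ignored via ignore_flags
        "vdop": 0.0,                         # ignored via ignore_flags
        "vn": float(vn), "ve": float(ve), "vd": float(vd),  # m/s, NED
        "speed_accuracy": float(speed_acc_mps),
        "horiz_accuracy": float(horiz_acc_m),
        "vert_accuracy": float(vert_acc_m),
        "satellites_visible": int(satellites_visible),
    }


payload = build_gps_input(1_000_000, 50.4501, 30.5234, 1200.0,
                          (40.0, 0.5, -0.1), 3.0, 5.0, 0.5, fix_type=3)
print(payload["lat"], payload["lon"], payload["ignore_flags"])
```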
 ## Mode B Findings
 ### Fact #24
 - **Statement**: DINOv2 TensorRT optimization on Jetson may provide limited speedup and can change embedding distances; descriptor fidelity must be tested against the PyTorch/ONNX baseline before selecting a TensorRT descriptor path.
 - **Source**: Sources #21, #22
 - **Phase**: Mode B
 - **Confidence**: Medium
 - **Related Dimension**: VPR runtime / quality
 - **Fit Impact**: Adds embedding-fidelity gate; keeps DINOv2 selected only after profiling.
 ### Fact #25
 - **Statement**: LightGlue's SuperPoint path has documented license concerns; DISK/ALIKED remain the safer production default unless legal review approves SuperPoint.
 - **Source**: Source #23
 - **Phase**: Mode B
 - **Confidence**: High
 - **Related Dimension**: Licensing
 - **Fit Impact**: Confirms draft01 decision to avoid SuperPoint as default.
 ### Fact #26
 - **Statement**: ArduPilot `GPS_INPUT_IGNORE_FLAG_VEL_HORIZ` has a reported EKF3 pitfall where velocity may become zero rather than truly ignored; SITL must validate velocity-source parameters and message fields.
 - **Source**: Source #24
 - **Phase**: Mode B
 - **Confidence**: Medium
 - **Related Dimension**: MAVLink integration
 - **Fit Impact**: Adds a specific MAVLink test and parameter gate.
 ### Fact #27
 - **Statement**: FAISS deployment on Jetson ARM64 should assume CPU FAISS by default; GPU FAISS packages are not the safe default on aarch64.
 - **Source**: Source #25
 - **Phase**: Mode B
 - **Confidence**: Medium
 - **Related Dimension**: Descriptor retrieval runtime
 - **Fit Impact**: Changes FAISS pinned mode from CPU with optional GPU to CPU-first, with custom GPU build only as future optimization.
 ### Fact #28
 - **Statement**: Visual matching with orthophotos is a known GNSS-denied UAV approach, but available sources do not prove robustness against adversarial visual attacks on imagery/cache content.
 - **Source**: Source #26
 - **Phase**: Mode B
 - **Confidence**: Medium
 - **Related Dimension**: Security
 - **Fit Impact**: Adds cache integrity, signed manifests, and consistency checks as required controls.
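The cache-integrity control can be sketched as a manifest of per-tile SHA-256 digests protected by a keyed MAC. This is an illustration under stated assumptions: the stdlib HMAC stands in for whatever signature scheme the project selects (an asymmetric signature would be the more likely production choice), and the manifest layout and function names are hypothetical.

```python
import hashlib
import hmac
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _canonical(entries: dict) -> bytes:
    # Stable serialization so the MAC does not depend on key order.
    return json.dumps(entries, sort_keys=True, separators=(",", ":")).encode()


def sign_manifest(entries: dict, key: bytes) -> dict:
    """entries maps tile_id -> SHA-256 hex digest of the tile bytes."""
    mac = hmac.new(key, _canonical(entries), hashlib.sha256).hexdigest()
    return {"entries": entries, "mac": mac}


def verify_tile(manifest: dict, key: bytes, tile_id: str, tile_bytes: bytes) -> bool:
    """Accept a tile only if the manifest is authentic and the digest matches."""
    expected = hmac.new(key, _canonical(manifest["entries"]), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["mac"]):
        return False  # manifest was altered or signed with another key
    recorded = manifest["entries"].get(tile_id)
    if recorded is None:
        return False  # tile is not listed in the manifest
    return hmac.compare_digest(recorded, sha256_hex(tile_bytes))


key = b"demo-key-not-for-production"
manifest = sign_manifest({"tile_a": sha256_hex(b"tile-bytes")}, key)
print(verify_tile(manifest, key, "tile_a", b"tile-bytes"))
```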
 ### Fact #29
 - **Statement**: COG creation is a write-new-object workflow; the live onboard cache should append/replace tile objects through manifests, not mutate a COG in place.
 - **Source**: Source #18
 - **Phase**: Mode B
 - **Confidence**: High
 - **Related Dimension**: Cache lifecycle
 - **Fit Impact**: Clarifies cache implementation.
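The write-new-object workflow can be sketched as follows, assuming a flat cache directory, a `manifest.json` that maps each tile ID to its current object file, and versioned object names. All of these names and the layout are hypothetical. Each tile version is written once as a new file, then the manifest is swapped atomically with `os.replace`.

```python
import json
import os
import tempfile


def publish_tile(cache_dir: str, tile_id: str, version: int, tile_bytes: bytes) -> str:
    """Write a new immutable tile object, then repoint the manifest at it."""
    obj_name = f"{tile_id}.v{version}.tif"
    obj_path = os.path.join(cache_dir, obj_name)
    # Mode "xb" fails if the object already exists, so nothing is mutated in place.
    with open(obj_path, "xb") as f:
        f.write(tile_bytes)
        f.flush()
        os.fsync(f.fileno())

    manifest_path = os.path.join(cache_dir, "manifest.json")
    manifest = {}
    if os.path.exists(manifest_path):
        with open(manifest_path) as f:
            manifest = json.load(f)
    manifest[tile_id] = obj_name

    # Write the updated manifest to a temp file in the same directory, then
    # replace atomically so readers see either the old or the new manifest.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(manifest, f, sort_keys=True)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, manifest_path)
    return obj_path
```

Older object versions stay on disk in this sketch; a retention or garbage-collection policy would be a separate decision.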
 ### Fact #30
 - **Statement**: OpenVINS is technically stronger than a pure hand-rolled OpenCV-only VIO stack for camera+IMU odometry, but its GPLv3 license and generic VIO lifecycle make it unsuitable as the default production dependency for this product.
 - **Source**: Sources #27, #28
 - **Phase**: Mode B round 2
 - **Confidence**: High
 - **Related Dimension**: VO / VIO selection
 - **Fit Impact**: Use OpenVINS as a mandatory benchmark/reference, not as the shipped estimator dependency unless GPL obligations are explicitly accepted.
 ### Fact #31
 - **Statement**: The selected production estimator is not "custom OpenCV-only"; OpenCV is the geometry utility layer, while the product-owned ESKF/mode machine owns covariance, source labels, GPS spoofing, blackout, tile-write eligibility, and MAVLink semantics.
 - **Source**: Sources #5, #29; AC-1.4, AC-3.5, AC-4.3, AC-NEW-4, AC-NEW-7, AC-NEW-8
 - **Phase**: Mode B round 2
 - **Confidence**: High
 - **Related Dimension**: Estimator ownership
 - **Fit Impact**: Keep custom production estimator, but reject any interpretation that means building a naive OpenCV-only VIO stack.
 ### Fact #32
 - **Statement**: Fixed-wing GPS-denied UAV research supports a hybrid of visual odometry plus satellite/orthophoto matching to reduce accumulated drift, matching the project architecture better than a standalone VIO-only solution.
 - **Source**: Sources #1, #26
 - **Phase**: Mode B round 2
 - **Confidence**: High
 - **Related Dimension**: Architecture
 - **Fit Impact**: Confirms that OpenVINS alone cannot satisfy the absolute-position and re-anchor responsibilities without the satellite anchor path.
 ### Fact #33
 - **Statement**: DINOv2-VLAD/AnyLoc-style retrieval is a strong global candidate generator for aerial VPR, but descriptor size, model size, and environment-specific VLAD/index choices must be budgeted and profiled.
 - **Source**: Sources #7, #30, #32
 - **Phase**: Mode B round 2
 - **Confidence**: High
 - **Related Dimension**: Satellite retrieval
 - **Fit Impact**: Select DINOv2-VLAD for triggered retrieval, not steady-state per-frame execution.
 ### Fact #34
 - **Statement**: Aerial VPR sources emphasize tile/chunk scale, overlap, weather/season changes, repetitive patterns, and re-ranking cost; local matching should be a verification/rerank stage over bounded top-K candidates.
 - **Source**: Source #32
 - **Phase**: Mode B round 2
 - **Confidence**: High
 - **Related Dimension**: Anchor verification
 - **Fit Impact**: Supports VPR chunks with 40-50% overlap, dynamic K, and conditional ALIKED/LightGlue verification.
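The overlap figure translates directly into a chunk stride. The sketch below assumes square chunks measured in pixels over a rectangular reference mosaic; the function names and the rule of snapping a final chunk to the far edge are illustrative choices, not taken from the source.

```python
def chunk_origins(extent_px: int, chunk_px: int, overlap: float) -> list:
    """Start offsets along one axis for chunks of chunk_px with fractional overlap."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if extent_px <= chunk_px:
        return [0]
    stride = max(1, int(round(chunk_px * (1.0 - overlap))))
    last = extent_px - chunk_px
    origins = list(range(0, last + 1, stride))
    if origins[-1] != last:
        origins.append(last)  # snap a final chunk to the edge so no pixels are dropped
    return origins


def chunk_grid(width_px: int, height_px: int, chunk_px: int, overlap: float) -> list:
    """(x, y) origins of all chunks covering the mosaic."""
    xs = chunk_origins(width_px, chunk_px, overlap)
    ys = chunk_origins(height_px, chunk_px, overlap)
    return [(x, y) for y in ys for x in xs]


print(chunk_origins(1000, 400, 0.5))
print(len(chunk_grid(1000, 1000, 400, 0.4)))
```

Higher overlap increases the number of descriptors to store and search, which feeds back into the retrieval storage and latency budget.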
 ### Fact #35
 - **Statement**: ALIKED + LightGlue has an exact local matching interface and a plausible ONNX/TensorRT deployment path, but public evidence does not prove Jetson Orin Nano p95 latency for the project image sizes.
 - **Source**: Sources #6, #31
 - **Phase**: Mode B round 2
 - **Confidence**: Medium
 - **Related Dimension**: Local matching runtime
 - **Fit Impact**: Keep ALIKED/LightGlue selected with runtime gate; benchmark DISK and SIFT/ORB as fallbacks.
 ### Fact #36
 - **Statement**: DINOv2 TensorRT conversion can reduce embedding discrimination and may not provide meaningful speedup on Jetson-class devices; descriptor-fidelity tests must precede any optimized engine acceptance.
 - **Source**: Source #22
 - **Phase**: Mode B round 2
 - **Confidence**: Medium
 - **Related Dimension**: VPR deployment
 - **Fit Impact**: TensorRT is an optimization candidate only after PyTorch/ONNX retrieval-rank equivalence is proven.
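A retrieval-rank equivalence check could look like the sketch below. It assumes the same queries and reference chunks have been embedded by both the baseline and the optimized engine, ranks by cosine similarity, and reports the top-1 agreement rate and mean top-K overlap. The metric choice, function names, and any pass threshold are assumptions rather than project decisions.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k_ids(query, database, k):
    """Database row indices sorted by descending cosine similarity."""
    order = sorted(range(len(database)), key=lambda i: -cosine(query, database[i]))
    return order[:k]


def rank_agreement(base_queries, opt_queries, base_db, opt_db, k):
    """Return (top-1 match rate, mean top-k overlap) between the two engines."""
    top1_hits = 0
    overlap_sum = 0.0
    for bq, oq in zip(base_queries, opt_queries):
        base_top = top_k_ids(bq, base_db, k)
        opt_top = top_k_ids(oq, opt_db, k)
        top1_hits += int(base_top[0] == opt_top[0])
        overlap_sum += len(set(base_top) & set(opt_top)) / k
    n = len(base_queries)
    return top1_hits / n, overlap_sum / n


db = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(rank_agreement([[1.0, 0.1]], [[1.0, 0.1]], db, db, k=2))
```

Comparing ranks rather than raw distances matters here because the reported issue is a change in embedding distances, which only hurts retrieval if it reorders candidates.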
 ### Fact #37
 - **Statement**: BASALT is the best production VIO candidate among BASALT, OpenVINS, and Kimera-VIO because it combines permissive licensing with strong published EuRoC accuracy and completion evidence.
 - **Source**: Sources #33, #34, #35
 - **Phase**: Mode B round 3
 - **Confidence**: Medium
 - **Related Dimension**: VO / VIO selection
 - **Fit Impact**: Select BASALT as the production VIO candidate, pending project replay/profiling.
 ### Fact #38
 - **Statement**: OpenVINS has the clearest EKF covariance story, including full/marginal covariance helpers and NEES-style evaluation support, but remains production-constrained by GPLv3.
 - **Source**: Sources #27, #28, #38
 - **Phase**: Mode B round 3
 - **Confidence**: High
 - **Related Dimension**: Confidence / covariance
 - **Fit Impact**: Keep OpenVINS as covariance/reference baseline and use it to calibrate the BASALT wrapper's reported uncertainty.
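NEES-style consistency checking can be sketched for a 2D horizontal position error. For a consistent estimator the average NEES is expected to be close to the state dimension (2 here). The scalar inflation factor below is one simple calibration heuristic and is an assumption, not the method the source prescribes; the function names are hypothetical.

```python
def nees_2d(err, cov):
    """Normalized estimation error squared: e^T P^-1 e for a 2D error and 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    ex, ey = err
    # P^-1 = (1 / det) * [[d, -b], [-c, a]]
    return (d * ex * ex - (b + c) * ex * ey + a * ey * ey) / det


def mean_nees(errors, covs):
    values = [nees_2d(e, p) for e, p in zip(errors, covs)]
    return sum(values) / len(values)


def covariance_inflation(errors, covs, dim=2):
    """Scalar s such that scaling every covariance by s brings mean NEES to dim."""
    return mean_nees(errors, covs) / dim


# Reported sigma of 1 m per axis, but observed errors are about 2 m per axis.
errors = [(2.0, 2.0), (-2.0, 2.0)]
covs = [((1.0, 0.0), (0.0, 1.0))] * 2
print(mean_nees(errors, covs), covariance_inflation(errors, covs))
```

A mean NEES well above the state dimension indicates overconfident covariance, which is the case that matters most when the value feeds `horiz_accuracy`.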
 ### Fact #39
 - **Statement**: Kimera-VIO is production-friendly from a license standpoint, but it is heavier/stereo-oriented and has documented mono-inertial parameter/performance caveats.
 - **Source**: Sources #34, #36
 - **Phase**: Mode B round 3
 - **Confidence**: Medium
 - **Related Dimension**: VO / VIO fallback
 - **Fit Impact**: Keep Kimera-VIO as a backup candidate, not the first production choice for a single fixed nadir camera.
 ### Fact #40
 - **Statement**: None of BASALT, OpenVINS, or Kimera-VIO provides a special fixed-wing nadir mode; downward-camera support depends on accurate camera-to-IMU extrinsics, altitude/scale constraints, and validation under low-parallax planar terrain.
 - **Source**: Source #37
 - **Phase**: Mode B round 3
 - **Confidence**: High
 - **Related Dimension**: Nadir-camera support
 - **Fit Impact**: The architecture must keep satellite anchors and project-level confidence gates regardless of which VIO library is selected.
 ### Fact #41
 - **Statement**: Published EuRoC-type VIO error rates are useful for ranking libraries but are not acceptance evidence for high-altitude fixed-wing nadir imagery over agricultural terrain.
 - **Source**: Sources #34, #35, #37
 - **Phase**: Mode B round 3
 - **Confidence**: High
 - **Related Dimension**: Validation
 - **Fit Impact**: Require representative replay/flight data before claiming AC-1/AC-2 accuracy.
@@ -0,0 +1,55 @@
 # Comparison Framework
 ## Selected Framework Type
 Decision support with exact-fit component selection.
 ## Selected Dimensions
 1. Required inputs/outputs and ownership boundaries
 2. Operating context and lifecycle fit
 3. Non-functional envelope fit: latency, memory, storage, thermal, safety
 4. Licensing and deployability
 5. Evidence quality and validation burden
 6. Security and safety failure modes
 7. Selection status
 ## Initial Population
 | Component Area | Candidate | Option Family | Inputs / Outputs | Fit Summary | Status |
 |----------------|-----------|---------------|------------------|-------------|--------|
 | Calibration / geometry | OpenCV 4.x | Established production / open-source | image/object points -> intrinsics, distortion, homography, inlier mask | Exact utility fit; permissive production use. | Selected |
 | VO / VIO | BASALT + project-owned safety/anchor wrapper | Open-source production candidate | nav frames + FC IMU/calibration -> relative VIO state; wrapper adds source labels, anchor fusion, calibrated confidence, and MAVLink semantics | Best production candidate after user decision: permissive license, strong benchmark evidence, and lower implementation burden than custom VIO. | Selected |
 | VO / VIO | OpenVINS | Open-source research | monocular camera + IMU -> VIO state/covariance | Best covariance/reference story, but GPL-3 and generic VIO ownership make it a benchmark/reference rather than shipped core. | Reference only |
 | VO / VIO | Kimera-VIO | Open-source production candidate / fallback | mono or stereo camera + IMU -> VIO/SLAM outputs | BSD-friendly, but heavier/stereo-oriented and mono-inertial path has documented parameter caveats. | Backup candidate |
 | VO / VIO | OpenCV geometry + project-owned ESKF | Custom fallback | nav frames + FC IMU/attitude/altitude + satellite anchors -> relative motion, absolute updates, covariance, source labels | Fallback if BASALT fails project data/runtime tests; still needed as safety/anchor wrapper around any VIO library. | Fallback / wrapper |
 | VO / VIO | ORB-SLAM3 | Open-source research | monocular/stereo/RGB-D + optional IMU -> SLAM pose | Useful benchmark; GPLv3, map runtime, and initialization complexity make it a poor production dependency. | Rejected for production |
 | Global retrieval | DINOv2-VLAD / AnyLoc-style descriptors | Current SOTA / research | image/chunk -> descriptor | Strong VPR evidence; trigger-only use due to runtime/memory and TensorRT fidelity risk. | Selected with runtime gate |
 | Vector retrieval | FAISS | Established production / open-source | descriptor matrix + query -> top-K IDs | Exact fit for offline VPR chunk retrieval. | Selected |
 | Local matching | LightGlue + DISK/ALIKED | Current SOTA / open-source | two images -> keypoint correspondences/scores | Exact local-match interface; avoids SuperPoint license issue. | Selected with runtime gate |
 | Local matching | SuperPoint + LightGlue | Current SOTA / known-bad/licensing caveat | two images -> matches | Technically good; licensing requires explicit review. | Needs user decision / fallback only |
 | Cache imagery | COG + manifest + sidecars | Established geospatial format | georeferenced raster + metadata -> efficient local reads | Good immutable tile unit; generated tiles can be written as new COGs. | Selected |
 | Cache packaging | PMTiles | Established web-map archive | tile pyramid -> single archive | Efficient reads, but read-only; not suitable for in-flight mutable writes. | Rejected for mutable cache |
 | Estimator | Custom ESKF mode machine | Custom production | VO/IMU/VPR/GPS-health -> WGS84 state + covariance + label | Needed for source labels, covariance gates, blackout/spoofing behavior. | Selected |
 | MAVLink integration | MAVSDK + pymavlink | Established APIs | telemetry in, `GPS_INPUT` out | MAVSDK handles telemetry; pymavlink handles raw `GPS_INPUT`. | Selected |
 | FDR | PostgreSQL event index + CBOR payload segments with optional Parquet export | Established storage | frame estimates, IMU, MAVLink, health, tiles -> bounded replayable log | Matches project PostgreSQL choice while keeping compact append payloads. | Selected pattern |
 | Validation | AerialVL + EuRoC + Plane SITL + representative flight | Multi-source test strategy | datasets/sim/flight -> AC evidence | Public data is partial; final representative data is mandatory. | Selected |
 ## Round 2 Decision Notes
 - **OpenVINS vs custom OpenCV**: OpenVINS wins if the comparison is against a naive OpenCV-only VIO implementation. The selected design is not that; it is OpenCV geometry plus a product-owned estimator/state machine, with OpenVINS used as a benchmark/reference.
 - **Satellite retrieval**: DINOv2-VLAD remains the best global candidate generator found, but aerial VPR sources require chunk scale/overlap tuning, dynamic top-K, and geometric verification.
 - **Anchor verification**: ALIKED/LightGlue remains the preferred learned local matcher, while SIFT/ORB stays as a regression/fallback baseline and SuperPoint remains license-gated.
 ## Round 3 Decision Notes
 - **User decision**: BASALT is selected as the production VIO candidate.
 - **Confidence/covariance**: OpenVINS remains the covariance/reference baseline because its EKF exposes clearer uncertainty semantics than BASALT/Kimera.
 - **Nadir support**: no compared VIO library has a special fixed-wing nadir mode; the acceptance path is calibration, altitude/scale constraints, satellite anchors, and representative replay validation.
 ## Baseline Alignment
 - "Position estimate" means WGS84 frame center emitted to FC plus a 95% covariance semi-major axis and source label.
 - "Satellite anchored" means a visual match passed VPR retrieval, local matching, RANSAC, freshness, covariance, and Mahalanobis gates.
 - "Normal flight segment" means the AC-2.1a conditions, not turns/blackout/stale imagery.
 - "Selected with runtime gate" means the API capability fits, but final deployment depends on Jetson profiling against AC-4.1 and AC-4.2.
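The "95% covariance semi-major axis" in the first definition can be made concrete with a short sketch. It assumes the estimator exposes a 2x2 horizontal position covariance in metres squared and that the 95% figure refers to the usual two-dimensional error ellipse; the function name and argument layout are illustrative and not taken from the project code.

```python
import math


def semi_major_axis_95(p_ee, p_en, p_nn):
    """95% error-ellipse semi-major axis (m) for a 2x2 east/north covariance (m^2)."""
    # Largest eigenvalue of the symmetric matrix [[p_ee, p_en], [p_en, p_nn]].
    mean = 0.5 * (p_ee + p_nn)
    half_diff = 0.5 * (p_ee - p_nn)
    lam_max = mean + math.hypot(half_diff, p_en)
    # For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - p).
    chi2_95 = -2.0 * math.log(1.0 - 0.95)  # about 5.99
    return math.sqrt(chi2_95 * lam_max)
```

With a 10 m standard deviation on the worst axis, this gives roughly 24.5 m, which is the value that would be compared against accuracy thresholds under these assumptions.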
@@ -0,0 +1,181 @@
 # Reasoning Chain
 ## Dimension 1: Core Architecture
 ### Fact Confirmation
 Fixed-wing monocular VO accumulates drift because scale and terrain assumptions are imperfect (Fact #1). Aerial VPR can provide absolute anchors but has weather, scale, and repetition failure modes (Fact #2).
 ### Reference Comparison
 Pure VO/IMU accumulates drift without absolute corrections, so it cannot hold the accuracy targets across an 8-hour mission. Pure satellite matching cannot run on every frame inside the latency budget and will fail in turns, over stale imagery, and over repetitive fields.
 ### Conclusion
 Select a hybrid architecture: steady-state VO/IMU propagation, conditional satellite/VPR anchoring, and ESKF covariance gates.
 ### Confidence
 High. Supported by Sources #1 and #2.
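One common form for the covariance gate named in the conclusion is a Mahalanobis test on the anchor innovation. The sketch below is a minimal 2D version under stated assumptions: positions are in a local metric frame, the predicted and anchor covariances are 2x2 nested lists, and the 99% threshold is illustrative rather than a project-approved value.

```python
CHI2_99_2DOF = 9.21  # 99% chi-square quantile for 2 dof; illustrative threshold


def mahalanobis_sq_2d(innov, s):
    """Squared Mahalanobis distance of a 2D innovation under 2x2 covariance s."""
    (a, b), (c, d) = s
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    dx, dy = innov
    # Uses the closed-form inverse of a 2x2 matrix.
    return (dx * (d * dx - b * dy) + dy * (a * dy - c * dx)) / det


def accept_anchor(anchor_xy, predicted_xy, p_pred, r_anchor, threshold=CHI2_99_2DOF):
    """Accept a satellite anchor only if its innovation is statistically plausible."""
    innov = (anchor_xy[0] - predicted_xy[0], anchor_xy[1] - predicted_xy[1])
    # Innovation covariance is the sum of predicted and measurement covariance.
    s = [[p_pred[i][j] + r_anchor[i][j] for j in range(2)] for i in range(2)]
    return mahalanobis_sq_2d(innov, s) <= threshold
```

A rejected anchor leaves the state on VO/IMU propagation, so covariance keeps growing instead of the estimate jumping to a wrong match.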
 ## Dimension 2: VO / VIO Dependency
 ### Fact Confirmation
 OpenVINS and ORB-SLAM3 both support visual-inertial estimation patterns, but both are GPL-family production dependency risks (Facts #4 and #5).
 ### Reference Comparison
 The project needs a production system with custom mode labels, covariance propagation, MAVLink-specific output behavior, and no hidden GPL obligation. A full SLAM stack also adds initialization and map-management complexity that the ACs do not require.
 ### Conclusion
 Use a custom VO/IMU propagation and ESKF mode machine for production. Use OpenVINS/ORB-SLAM3 only as benchmarks/reference algorithms.
 ### Confidence
 High for licensing/scope decision; medium for final estimator performance until prototype profiling.
 ## Dimension 3: Satellite Retrieval And Local Matching
 ### Fact Confirmation
 Aerial VPR benefits from scale/overlap-aware chunks and retrieval + local alignment (Fact #2). LightGlue provides keypoint correspondences/scores from local descriptors (Fact #7). FAISS provides top-K retrieval over descriptors (Fact #12).
 ### Reference Comparison
 Global DINOv2/VLAD retrieval alone gives candidate chunks but not enough precision for AC-2.2. Local matching alone over the whole map is too expensive. The two-stage retrieval+alignment structure matches the operational need.
 ### Conclusion
 Use DINOv2-VLAD descriptors with FAISS top-K for conditional candidate retrieval, followed by DISK/ALIKED+LightGlue and OpenCV RANSAC homography for local alignment.
 ### Confidence
 Medium-high. API fit is strong; embedded runtime must be profiled.
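The local-alignment step ends with a homography whose quality has to be measured. The following sketch computes a mean reprojection error in pixels for a given 3x3 homography and matched point pairs, assuming that "MRE" in AC-2.2 denotes mean reprojection error. It uses plain Python so it runs without OpenCV; in the pipeline the homography and inlier pairs would come from the RANSAC stage.

```python
import math


def apply_homography(h, pt):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


def mean_reprojection_error(h, src_pts, dst_pts):
    """Average Euclidean distance between projected source and observed target points."""
    if not src_pts or len(src_pts) != len(dst_pts):
        raise ValueError("need equally sized, non-empty point lists")
    total = 0.0
    for src, dst in zip(src_pts, dst_pts):
        u, v = apply_homography(h, src)
        total += math.hypot(u - dst[0], v - dst[1])
    return total / len(src_pts)
```

Computing the error over inliers only versus all correspondences gives different numbers, so whichever convention is used should be fixed before comparing against AC-2.2.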
 ## Dimension 4: Latency And Memory
 ### Fact Confirmation
 Some aerial VPR re-ranking methods are too slow for the steady-state path (Fact #3). Jetson Orin Nano Super has 8 GB shared memory and 25 W power mode (Fact #18), with thermal throttling risk (Fact #19).
 ### Reference Comparison
 At 3 fps and <400 ms p95, the system can process every frame through lightweight VO/IMU and use heavier VPR only on triggers. Running DINOv2/LightGlue across many candidates per frame would violate the AC unless proven otherwise.
 ### Conclusion
 Make VPR conditional, cap top-K by covariance/sector, run descriptor extraction on downsampled/orthorectified crops, precompute cache descriptors offline, and add performance regression tests.
 ### Confidence
 High for architecture direction; exact model sizes need profiling.
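Capping top-K by covariance can be sketched as a small function that sizes K from the current uncertainty. This is one possible policy, not the project's: it assumes square cache chunks, approximates the search region as a disc with the 95% semi-major axis as radius, and uses hypothetical bounds `k_min` and `k_max` that would have to come from Jetson profiling.

```python
import math


def dynamic_top_k(semi_major_m, chunk_size_m, k_min=3, k_max=20):
    """Pick K so candidates roughly cover the uncertainty disc, within fixed bounds."""
    if semi_major_m < 0 or chunk_size_m <= 0:
        raise ValueError("radius must be non-negative and chunk size positive")
    search_area = math.pi * semi_major_m ** 2
    chunk_area = chunk_size_m ** 2
    k = math.ceil(search_area / chunk_area)
    # Clamp so a confident state still verifies a few candidates and an
    # uncertain state cannot exceed the local-matching latency budget.
    return max(k_min, min(k_max, k))
```

The upper bound is what protects the p95 latency target; the lower bound keeps some redundancy against repetitive terrain.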
 ## Dimension 5: Cache Format
 ### Fact Confirmation
 COG supports tiled compressed rasters and geospatial tooling (Fact #21). PMTiles is read-efficient but read-only (Fact #20).
 ### Reference Comparison
 The onboard system must both read service tiles and write new generated tiles in-flight. A read-only archive is a poor primary mutable store.
 ### Conclusion
 Use COG files plus STAC-like manifests/sidecars for imagery and metadata; use FAISS sidecar indexes for descriptors. PMTiles may be an export/snapshot format, not the live mutable cache.
 ### Confidence
 High.
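The write-new-object idea behind this choice can be illustrated with a minimal in-memory manifest. All class names, field names, and paths here are hypothetical; the sketch only shows that a tile update adds a new immutable object and moves the active pointer, without touching the earlier file.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TileObject:
    """One immutable tile file plus the metadata the manifest tracks."""
    tile_id: str
    version: int
    path: str
    capture_date: str


@dataclass
class Manifest:
    """Maps each tile id to its active object and keeps superseded versions."""
    active: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def publish(self, tile_id, path, capture_date):
        previous = self.active.get(tile_id)
        version = 1 if previous is None else previous.version + 1
        tile = TileObject(tile_id, version, path, capture_date)
        if previous is not None:
            # The old object is retained, never rewritten in place.
            self.history.append(previous)
        self.active[tile_id] = tile
        return tile
```

A real manifest would also need to be written atomically to disk so a power loss cannot leave the active pointer referencing a partially written tile.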
 ## Dimension 6: Autopilot Integration
 ### Fact Confirmation
 ArduPilot MAVLink GPS input requires `GPS1_TYPE=14` (Fact #15). `GPS_INPUT` carries the fields needed for WGS84 position and accuracy (Fact #16). MAVSDK covers telemetry but raw `GPS_INPUT` emission still needs pymavlink/raw MAVLink (Fact #14).
 ### Reference Comparison
 MAVSDK-only output would not satisfy AC-4.3. Raw pymavlink-only telemetry is possible but gives up MAVSDK's high-level subscriptions.
 ### Conclusion
 Use MAVSDK for telemetry subscriptions and pymavlink for `GPS_INPUT` emission. Verify all failsafe/spoofing behavior in ArduPilot Plane SITL.
 ### Confidence
 High for interface; medium for exact Plane failsafe timing until SITL.
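To make the `GPS_INPUT` contract more tangible, the sketch below builds the message fields as a plain dictionary whose keys follow the field names in the MAVLink common dialect. It is an assumption-laden illustration: the choice of ignore flags, the `satellites_visible` placeholder, and the zeroed GPS week fields are not taken from the source and would need to be settled in Plane SITL.

```python
# GPS_INPUT ignore-flag bits as defined in the MAVLink common dialect.
IGNORE_HDOP = 2
IGNORE_VDOP = 4
IGNORE_VEL_HORIZ = 8
IGNORE_VEL_VERT = 16
IGNORE_SPEED_ACCURACY = 32


def gps_input_fields(time_usec, lat_deg, lon_deg, alt_m,
                     horiz_accuracy_m, vert_accuracy_m,
                     fix_type=3, time_week=0, time_week_ms=0,
                     satellites_visible=10):
    """Build GPS_INPUT field values for a position-only estimate."""
    # Assumption: this sketch supplies no velocity or DOP, so those are ignored.
    ignore = (IGNORE_HDOP | IGNORE_VDOP | IGNORE_VEL_HORIZ
              | IGNORE_VEL_VERT | IGNORE_SPEED_ACCURACY)
    return {
        "time_usec": int(time_usec),
        "gps_id": 0,
        "ignore_flags": ignore,
        "time_week_ms": int(time_week_ms),
        "time_week": int(time_week),
        "fix_type": int(fix_type),
        "lat": int(round(lat_deg * 1e7)),  # WGS84 latitude in degE7
        "lon": int(round(lon_deg * 1e7)),  # WGS84 longitude in degE7
        "alt": float(alt_m),
        "hdop": 0.0,
        "vdop": 0.0,
        "vn": 0.0,
        "ve": 0.0,
        "vd": 0.0,
        "speed_accuracy": 0.0,
        "horiz_accuracy": float(horiz_accuracy_m),
        "vert_accuracy": float(vert_accuracy_m),
        "satellites_visible": int(satellites_visible),  # placeholder value
    }
```

The dictionary is kept separate from the transport so the field mapping can be unit-tested without a MAVLink connection; handing it to pymavlink's generated `GPS_INPUT` sender is left to the integration layer.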
 ## Dimension 7: Validation
 ### Fact Confirmation
 AerialVL provides aerial localization/reference-map data (Fact #22). EuRoC provides camera+IMU+ground truth but not fixed-wing nadir imagery (Fact #23). The current sample set lacks IMU/ground truth.
 ### Reference Comparison
 No single public dataset proves every AC. Public datasets can de-risk components, while final acceptance requires representative synchronized flight/replay data.
 ### Conclusion
 Use layered validation: EuRoC for VIO sanity, AerialVL/VPAir-style data for VPR/anchor tests, ArduPilot Plane SITL for MAVLink/failsafe/spoofing, and a final representative flight/replay rig for AC proof.
 ### Confidence
 High.
 ## Dimension 8: OpenVINS vs Project-Owned Estimator
 ### Fact Confirmation
 OpenVINS is a strong monocular+IMU EKF/MSCKF VIO reference with covariance-aware estimation (Facts #4 and #30). OpenCV supplies calibration, undistortion, homography, RANSAC/USAC, and reprojection-error primitives, but it is not a full estimator by itself (Facts #6 and #31).
 ### Reference Comparison
 If the alternative is a naive OpenCV-only VIO stack, OpenVINS is the better technical starting point. The project's actual production choice is different: it needs a project-owned ESKF/mode machine that fuses VO/IMU, accepts/rejects satellite anchors, emits `GPS_INPUT`, labels every estimate, handles spoofing/blackout, and gates cache write-back.
 ### Conclusion
 Keep OpenVINS as a mandatory benchmark/reference implementation, not as the default production dependency. The production estimator remains project-owned, with OpenCV as the geometry utility layer.
 ### Confidence
 High. The technical comparison favors OpenVINS over naive custom VIO, while the product fit favors the project-owned estimator.
 ## Dimension 9: Satellite Retrieval And Anchor Verification
 ### Fact Confirmation
 DINOv2-VLAD/AnyLoc-style retrieval has strong VPR evidence but descriptor/model size and TensorRT fidelity must be validated (Facts #33 and #36). Aerial VPR survey evidence emphasizes tile scale, overlap, season/weather shifts, repetitive patterns, and re-ranking cost (Fact #34). LightGlue supports ALIKED/DISK/SIFT feature matching and returns correspondences/scores suitable for RANSAC verification, but Jetson latency must be profiled (Facts #7, #8, #35).
 ### Reference Comparison
 Classical SIFT/ORB is simpler and cheap but weaker for cross-domain UAV-to-satellite matching. SuperPoint+LightGlue is technically strong but remains license-gated. Pure global retrieval without local verification is unsafe because repetitive farmland and stale imagery can produce plausible but wrong candidates.
 ### Conclusion
 Use DINOv2-VLAD + CPU-first FAISS as the triggered global retriever, then verify bounded top-K candidates with ALIKED/LightGlue + OpenCV RANSAC. Keep SIFT/ORB as a regression baseline and SuperPoint only after legal approval.
 ### Confidence
 High for architecture and interfaces; medium for final runtime until Jetson profiling.
 ## Dimension 10: BASALT vs Kimera-VIO vs OpenVINS
 ### Fact Confirmation
 BASALT has strong published EuRoC evidence and production-friendly licensing (Fact #37). OpenVINS has the clearest EKF covariance API and consistency tooling, but GPLv3 remains a production constraint (Fact #38). Kimera-VIO is BSD-friendly but heavier and has documented mono-inertial caveats (Fact #39). All three require calibrated camera-to-IMU extrinsics; none has a special fixed-wing nadir mode (Fact #40).
 ### Reference Comparison
 For a single fixed downward camera, the selection criterion is not just benchmark ATE. The project needs a VIO core that can run on Jetson, tolerate calibrated nadir geometry, and be wrapped by project-specific satellite-anchor, confidence, MAVLink, and safety logic. OpenVINS is attractive for confidence/covariance but problematic as a shipped component. Kimera is acceptable as a BSD fallback, but mono-inertial risk makes it weaker as the first pick. BASALT provides the best production trade-off if its uncertainty can be calibrated and wrapped.
 ### Conclusion
 Select BASALT as the production VIO candidate. Keep OpenVINS as a reference/covariance baseline and Kimera-VIO as a backup candidate. The project-owned safety/anchor wrapper remains mandatory around BASALT because BASALT alone does not satisfy GPS-denied source labels, satellite anchors, false-position budgets, cache-write gates, or `GPS_INPUT` behavior.
 ### Confidence
 Medium-high. Library ranking is well supported; final acceptance still depends on representative fixed-wing nadir replay data.
@@ -0,0 +1,51 @@
 # Validation Log
 ## Validation Scenario
 An 8-hour fixed-wing mission enters GPS-denied/spoofed mode after takeoff. The onboard system starts from last trusted FC state, processes 3 fps nadir frames, emits `GPS_INPUT`, handles normal flight, sharp turns, short visual blackouts, stale/changed tiles, and post-flight tile write-back.
 ## Expected Based On Conclusions
 - **Normal segment**: VO/IMU propagates every processed frame; satellite anchors refresh state conditionally before covariance grows too large.
 - **Sharp turn / <5% overlap**: VO is expected to fail; relocalization uses FAISS top-K VPR chunks followed by LightGlue/RANSAC.
 - **Visual blackout + spoofing**: estimator switches to `dead_reckoned`, covariance grows monotonically, spoofed GPS is ignored, `GPS_INPUT` degrades honestly.
 - **Stale tile**: anchor is rejected or down-weighted and cannot emit `satellite_anchored`.
 - **Cache write-back**: onboard generated tile is written only when parent-pose covariance passes AC-NEW-7 gates and carries metadata for Satellite Service voting.
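The labelling behavior expected above can be sketched as two small functions. Only `satellite_anchored` and `dead_reckoned` appear in this document; the `vo_propagated` label, the boolean inputs, and the linear variance-growth model are assumptions made for illustration.

```python
def source_label(anchor_accepted, tile_stale, vo_ok):
    """Pick the label for one estimate from three boolean health signals."""
    if anchor_accepted and not tile_stale:
        return "satellite_anchored"
    if vo_ok:
        return "vo_propagated"  # hypothetical label, not from the source
    return "dead_reckoned"


def propagate_variance(var_m2, dt_s, process_noise_m2_per_s):
    """Grow position variance while no correction arrives; never shrinks."""
    if dt_s < 0 or process_noise_m2_per_s < 0:
        raise ValueError("dt and process noise must be non-negative")
    return var_m2 + process_noise_m2_per_s * dt_s
```

Because a stale tile can never yield `satellite_anchored` in this sketch, the stale-tile expectation is enforced by construction rather than by a later check.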
 ## Actual Validation Plan
 | Validation Target | Test Method | Pass Evidence |
 |-------------------|-------------|---------------|
 | VO/VIO propagation | EuRoC and synthetic nadir replay; then representative flight data | Drift vs anchor-age bins; AC-1.3 pass/fail. |
 | Satellite anchor | AerialVL/VPAir-style benchmark plus project sample imagery with satellite cache | AC-1.1/1.2 accuracy, AC-2.2 MRE, georeference recall. |
 | Runtime | Jetson Orin Nano Super profiling under 25 W, hot-soak included | <400 ms p95, <8 GB memory, no thermal throttle. |
 | VPR retrieval | Offline descriptor build and FAISS query benchmark | Top-K recall, query latency, index size within cache budget. |
 | MAVLink output | ArduPilot Plane SITL with `GPS1_TYPE=14` | Valid `GPS_INPUT`, fix-type/accuracy degradation, QGC status. |
 | Spoofing promotion | Plane SITL false GPS injection | Promotion <3 s and spoofed GPS rejected during blackout. |
 | FDR | 8-hour synthetic load | <=64 GB, rollover logged, no silent payload loss. |
 | Cache poisoning | Monte Carlo with over-confident wrong anchors | AC-NEW-7 probabilities below budget; metadata contract emitted. |
 | OpenVINS reference comparison | Replay the same synchronized camera+IMU segments through OpenVINS and the project-owned estimator | OpenVINS establishes a VIO baseline; production estimator must match or beat drift where applicable while preserving source labels and `GPS_INPUT` behavior. |
 | BASALT production VIO candidate | Replay synchronized camera+IMU segments through BASALT, OpenVINS, and Kimera-VIO | BASALT selected if drift, completion rate, latency, and wrapper-calibrated covariance meet project gates. |
 | DINOv2-VLAD fidelity | Compare PyTorch, ONNX, and TensorRT descriptor distances and FAISS rankings | Optimized engines accepted only if rank/top-K behavior stays within tolerance. |
 | ALIKED/LightGlue runtime | Jetson benchmark across K candidates and project image sizes | Candidate accepted for runtime only if relocalization trigger path fits AC-4.1 with bounded frame drops. |
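For the DINOv2-VLAD fidelity row, one simple way to express "rank/top-K behavior stays within tolerance" is the overlap between the top-K IDs returned for the same query under two backends. The metric choice and any acceptance tolerance are assumptions; the plan above does not fix them.

```python
def top_k_overlap(reference_ids, candidate_ids, k):
    """Fraction of the reference top-K that also appears in the candidate top-K."""
    if k <= 0:
        raise ValueError("k must be positive")
    reference = set(reference_ids[:k])
    candidate = set(candidate_ids[:k])
    return len(reference & candidate) / k


def top_1_agrees(reference_ids, candidate_ids):
    """True when both backends rank the same chunk first."""
    return bool(reference_ids) and bool(candidate_ids) and reference_ids[0] == candidate_ids[0]
```

Here `reference_ids` would be the ranked chunk IDs from the PyTorch descriptors and `candidate_ids` those from the ONNX or TensorRT descriptors, averaged over a fixed query set.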
 ## Counterexamples And Risks
 - Large DINOv2 variants or many local-match candidates may violate the Jetson latency/memory envelope.
 - Agricultural fields can be visually repetitive; VPR confidence must not be treated as sufficient without geometric verification.
 - Public datasets do not fully match Ukrainian fixed-wing operational conditions; final evidence requires representative data.
 - GPL VIO/SLAM libraries are not production dependencies unless licensing is explicitly accepted.
 - OpenVINS may outperform the first custom estimator prototype on pure VIO drift; that would trigger estimator improvement, not automatic GPL production adoption.
 - BASALT covariance/confidence is less directly exposed than OpenVINS EKF covariance; the project wrapper must calibrate uncertainty before mapping it to `GPS_INPUT.horiz_accuracy`.
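The calibration named in the last risk could be approached with a NEES-style consistency check on replay data with ground truth. The sketch below assumes 2D position errors and reported 2x2 covariances are available per frame and derives a single scale factor; a real wrapper might need per-mode or time-varying scaling, and the function names are illustrative.

```python
import math


def nees_2d(err, cov):
    """Normalized estimation error squared for a 2D error and 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    ex, ey = err
    return (ex * (d * ex - b * ey) + ey * (a * ey - c * ex)) / det


def covariance_scale(errors, covs, dof=2):
    """Factor by which reported covariance should be multiplied to be consistent."""
    if not errors or len(errors) != len(covs):
        raise ValueError("need equally sized, non-empty sequences")
    mean_nees = sum(nees_2d(e, p) for e, p in zip(errors, covs)) / len(errors)
    # A consistent estimator has mean NEES equal to the state dimension.
    return mean_nees / dof


def calibrated_sigma(reported_sigma_m, scale):
    """Standard deviation after applying the covariance scale factor."""
    return reported_sigma_m * math.sqrt(scale)
```

A scale above 1 means the library is over-confident on that data, which is the case that matters before a value is mapped to `GPS_INPUT.horiz_accuracy`.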
 ## Review Checklist
 - [x] Draft conclusions are consistent with fact cards.
 - [x] No important dimensions missed: architecture, VO, VPR, local matching, cache, estimator, MAVLink, FDR, validation covered.
 - [x] No selected component relies only on field-adjacent fit.
 - [x] Mismatches are recorded as rejected/reference/needs-decision rather than hidden.
 - [x] Step 7.5 Component Applicability Gate applies and is saved in `06_component_fit_matrix.md`.
 ## Conclusions Requiring Revision
 Round 3 applies the user decision to select BASALT as the production VIO candidate. The selected implementation is BASALT VIO plus a project-owned safety/anchor wrapper; OpenVINS remains the covariance/reference baseline, Kimera-VIO remains a backup candidate, and custom OpenCV-only VIO is no longer the primary path. Runtime gates and Plane SITL gates are implementation validation gates, not API capability blockers.
@@ -0,0 +1,107 @@
 # Component Fit Matrix
 ## Top-Level Matrix
 | Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
 |----------------|-----------|--------------------|---------------|---------------|-------------------------|----------------------------|--------|--------------------|
 | Calibration / geometry | OpenCV 4.x | C++/Python calibration, undistortion, RANSAC homography, reprojection-error measurement | Established production / open-source | Camera intrinsics, image normalization, VO/satellite homography verification | MVE: `02_fact_cards.md` OpenCV block; Source #5 | None | Selected | Exact API fit and permissive utility role. |
 | VO / IMU propagation | BASALT + project-owned safety/anchor wrapper | BASALT VIO consumes calibrated nav-camera frames + FC IMU; wrapper fuses satellite anchors, calibrates uncertainty, emits source labels and `GPS_INPUT` semantics | Open-source production candidate | Relative VIO state, completion/error benchmark, wrapped covariance/confidence, degraded modes, `GPS_INPUT` semantics | Sources #33-#35; Facts #37, #40, #41 | BASALT covariance/confidence must be calibrated in wrapper; no special nadir mode | Selected | User-selected best production trade-off: permissive licensing and stronger benchmark/completion evidence than OpenVINS/Kimera, with wrapper covering project-specific safety semantics. |
 | VO / VIO reference | OpenVINS | Monocular camera + IMU EKF/MSCKF reference runs with covariance extraction | Open-source research | Benchmark and covariance reference | Sources #3, #27, #28, #35, #38; Facts #4, #30, #38 | GPL-3 production dependency risk; completion/divergence risk on some sequences; does not own satellite anchor / `GPS_INPUT` / blackout / cache-write state machine | Reference only | Best covariance baseline, but not selected as shipped production dependency. |
 | VO / VIO backup | Kimera-VIO | Mono/stereo camera + IMU VIO/SLAM backup candidate | Open-source production candidate / fallback | Alternative VIO baseline | Sources #34, #36; Fact #39 | Heavier/stereo-oriented; mono-inertial path has documented parameter caveats | Backup candidate | BSD-friendly backup if BASALT fails project replay/runtime gates. |
 | VO / SLAM alternative | ORB-SLAM3 | Monocular-inertial SLAM | Open-source research | Benchmark and failure-mode comparison | Source #4, Fact #5 | GPLv3; heavier SLAM/map lifecycle than required | Rejected | Does not fit licensing/scope for production. |
 | VPR descriptors | DINOv2-VLAD / AnyLoc-style | Precomputed satellite chunk descriptors; conditional query descriptor on relocalization triggers; TensorRT path only after embedding-fidelity check | Current SOTA / research | Global top-K candidate retrieval | Sources #7, #8, #21, #22, #30, #32; Facts #10, #11, #24, #33, #34, #36 | Runtime and embedding-fidelity gates on Jetson; model-size/index-size selection required | Selected with runtime gate | Best evidence for change-robust VPR, but not per-frame and not blindly TensorRT-converted. |
 | Vector retrieval | FAISS | CPU-first aarch64 index; saved/loaded index over float/PQ descriptors; GPU only if custom Jetson build is proven | Established production / open-source | Top-K candidate chunk search | MVE: `02_fact_cards.md` FAISS block; Sources #9, #25 | GPU FAISS not default on Jetson ARM64 | Selected | Exact top-K descriptor retrieval fit with CPU-first deployment. |
 | Local matching | LightGlue + DISK/ALIKED | CUDA/ONNX-profiled DISK or ALIKED feature extraction + LightGlue matching on bounded top-K candidates | Current SOTA / open-source | 2D-2D correspondences for RANSAC and MRE | MVE: `02_fact_cards.md` LightGlue block; Sources #6, #23, #31 | Runtime quality gate; extractor choice must avoid SuperPoint license issue | Selected with runtime gate | Exact input/output fit with deployable licensing path; ALIKED/LightGlue is preferred for anchor verification. |
 | Local matching fallback | SuperPoint + LightGlue | SuperPoint features + LightGlue | Current SOTA / license-gated | Optional benchmark/fallback | Source #6 | SuperPoint restrictive license | Needs user decision | Do not use as default production dependency without legal review. |
 | Cache imagery | COG + manifest/sidecar | Tiled compressed GeoTIFF tile objects with CRS, capture date, source, m/px, freshness, descriptor sidecars; write-new-object lifecycle | Established geospatial format | Immutable service tiles and generated candidate tiles | Source #18, Facts #21, #29 | No in-place mutation; manifest manages active tile version | Selected | Fits geospatial raster access and write-new-tile workflow. |
 | Cache packaging | PMTiles | Read-only tile archive | Established web-map archive | Optional export/snapshot | Source #17, Fact #20 | Cannot update in place | Rejected for live cache | In-flight tile generation needs mutable write-new objects. |
 | MAVLink | MAVSDK + pymavlink | MAVSDK telemetry subscriptions; pymavlink/raw MAVLink `GPS_INPUT` emission to ArduPilot `GPS1_TYPE=14`; velocity source/ignore-flag behavior SITL-tested | Established APIs | FC telemetry, QGC status, GPS substitute output | MVE: `02_fact_cards.md` MAVSDK/pymavlink block; Sources #10-#12, #24 | Plane SITL behavior and velocity-source parameters must be validated | Selected | Exact output contract with known ArduPilot pitfall covered. |
 | Validation | EuRoC + AerialVL/VPAir + Plane SITL + representative flight | Layered validation suite | Test strategy | Prove ACs before production | Sources #19, #20 | Public data not sufficient for final proof | Selected | Covers component de-risking plus final representative proof. |
 ## Restrictions Cross-Check — Selected Production Architecture
 | Restriction | Candidate-mode behavior | Result | Evidence |
 |-------------|-------------------------|--------|----------|
 | Fixed-wing only | Architecture assumes forward motion, rare sharp turns, no hover dependency. | Pass | Problem context; Source #1 |
 | Fixed nadir nav camera | VO/orthorectification uses fixed camera extrinsics; attitude compensation from FC. | Pass | Source #5 |
 | Operational area / flat terrain | Flat terrain assumption supported; repetitive agricultural terrain handled as validation class and confidence risk. | Pass | Source #2 |
 | Weather/season classes | Validation matrix includes seasonal/visibility classes. | Pass | `05_validation_log.md` |
 | Two-camera split | Nav camera drives localization; AI camera object localization uses current GPS-denied state and AI gimbal/zoom. | Pass | AC-7.1/7.2 |
 | Satellite Service offline boundary | Runtime uses local COG/cache + FAISS descriptors only; no in-flight provider fetch. | Pass | Sources #17, #18 |
 | Freshness gates | Tile manifest carries capture date and sector; stale tiles rejected/down-weighted. | Pass | AC-8.2, AC-NEW-6 |
 | Jetson Orin Nano Super | Hot path is lightweight; heavy VPR conditional; runtime profiling gate required. | Pass | Sources #15, #16 |
 | MAVLink / ArduPilot only | MAVSDK telemetry + pymavlink `GPS_INPUT`, `GPS1_TYPE=14`. | Pass | Sources #10-#12 |
 | No raw frame storage | FDR keeps estimates, telemetry, tiles, and low-rate failure thumbnails only. | Pass | AC-8.5, AC-NEW-3 |
 ## AC Cross-Check — Selected Production Architecture
 | AC | Candidate-mode behavior | Result | Evidence |
 |----|-------------------------|--------|----------|
 | AC-1.1 | Satellite anchors and ESKF output target <=50 m for >=80% normal frames. | Pass | Facts #1, #2 |
 | AC-1.2 | Same pipeline targets <=20 m for >=50% normal frames; validated statistically. | Pass | `05_validation_log.md` |
 | AC-1.3 | ESKF tracks anchor age and covariance; VO-only and IMU-fused drift measured between anchors. | Pass | Fact #1 |
 | AC-1.4 | ESKF emits covariance; telemetry/FDR carries source label. | Pass | Fact #16 |
 | AC-2.1a | VO hot path handles normal overlapping frames; failures trigger mode change. | Pass | Facts #1, #6 |
 | AC-2.1b | Satellite anchor success measured separately through VPR + local match + RANSAC. | Pass | Facts #2, #7, #12 |
 | AC-2.2 | OpenCV/LightGlue provide correspondences and homography MRE measurement. | Pass | Facts #6, #7 |
 | AC-3.1 | ESKF innovation gates reject outliers; covariance grows instead of trusting jumps. | Pass | Reasoning chain |
 | AC-3.2 | Sharp turn VO failure triggers VPR relocalization. | Pass | AC design; Source #2 |
 | AC-3.3 | Disconnected segment handled by global retrieval and pose-graph/ESKF re-anchor. | Pass | Source #2 |
 | AC-3.4 | Loss counter and timer trigger GCS relocalization request while dead reckoning continues. | Pass | AC design |
 | AC-3.5 | Image-quality/blackout state switches to IMU-only and rejects spoofed GPS. | Pass | AC design; Facts #16, #17 |
 | AC-4.1 | Heavy VPR is conditional; steady-state path is VO/IMU. Jetson profiling is a runtime quality gate. | Pass | Facts #3, #18 |
 | AC-4.2 | Descriptor/index memory is budgeted; FAISS and cache are precomputed/pruned. | Pass | Facts #12, #13 |
 | AC-4.3 | `GPS_INPUT` emitted by pymavlink; ODOMETRY remains disabled in v1. | Pass | Facts #14-#16 |
 | AC-4.4 | Estimator emits frame-by-frame; no batching required. | Pass | Architecture |
 | AC-4.5 | Corrections are emitted as updated estimates with timestamps and covariance. | Pass | Architecture |
 | AC-5.1 | Startup initializes from last trusted FC state plus IMU propagation. | Pass | Architecture |
 | AC-5.2 | >3 s no-estimate path enters degraded/failsafe behavior; Plane SITL proves FC response. | Pass | Fact #17 |
 | AC-5.3 | Cold restart uses FC state and preloaded cache/index. | Pass | AC-NEW-1 |
 | AC-6.1 | QGC receives downsampled status; FDR keeps high-rate details. | Pass | MAVSDK telemetry + FDR design |
 | AC-6.2 | Command ingress reserved through MAVLink status/named values/custom dialect. | Pass | MAVLink design |
 | AC-6.3 | `GPS_INPUT` lat/lon is WGS84. | Pass | Fact #16 |
 | AC-7.1 | Object localization exposes level-flight accuracy and maneuver bound. | Pass | Geometry design |
 | AC-7.2 | Flat-terrain trig projection from UAV GPS + gimbal/zoom/altitude. | Pass | Geometry design |
 | AC-8.1 | Cache contract requires 0.3-0.5 m/px imagery. | Pass | Restrictions |
 | AC-8.2 | Freshness metadata gates anchors. | Pass | Restrictions |
 | AC-8.3 | Precomputed descriptors and FAISS index are part of cache budget. | Pass | Facts #10, #12, #13 |
 | AC-8.4 | Generated tiles are new COGs with quality metadata for Service ingest. | Pass | Fact #21 |
 | AC-8.5 | Raw frames are not retained. | Pass | FDR design |
 | AC-8.6 | VPR chunks use overlap/multi-scale descriptors and dynamic K. | Pass | Facts #2, #12 |
 | AC-NEW-1 | Engines/indexes built before flight; first fix benchmark validates <30 s. | Pass | Runtime plan |
 | AC-NEW-2 | Plane SITL verifies spoofing trigger and promotion <3 s. | Pass | Fact #17 |
 | AC-NEW-3 | FDR segment rollover validates <=64 GB. | Pass | Validation plan |
 | AC-NEW-4 | Mahalanobis gates and calibrated covariance target false-position budget. | Pass | ESKF design |
 | AC-NEW-5 | Thermal profiling at 25 W validates no throttle. | Pass | Facts #18, #19 |
 | AC-NEW-6 | Tile age rejection/down-weighting built into anchor gate. | Pass | AC design |
 | AC-NEW-7 | Tile writes require parent-pose covariance and sidecar metadata; Satellite Service voting is external contract. | Pass | AC design |
 | AC-NEW-8 | Dead-reckoned blackout mode degrades `GPS_INPUT` fields at covariance thresholds. | Pass | Facts #16, #17 |
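The statistical form of AC-1.1 and AC-1.2 in the table above can be checked with a few lines over per-frame horizontal errors from a replay. The thresholds come from those two rows; the function names and the assumption that errors are already filtered to normal-flight frames are illustrative.

```python
def fraction_within(errors_m, limit_m):
    """Share of frames whose horizontal error is at or below the limit."""
    if not errors_m:
        raise ValueError("need at least one error sample")
    return sum(1 for e in errors_m if e <= limit_m) / len(errors_m)


def accuracy_acs(errors_m):
    """Evaluate the two accuracy criteria on normal-flight frame errors."""
    return {
        "AC-1.1": fraction_within(errors_m, 50.0) >= 0.80,  # <=50 m for >=80%
        "AC-1.2": fraction_within(errors_m, 20.0) >= 0.50,  # <=20 m for >=50%
    }
```

Running this on public datasets only ranks candidates; per the validation notes, a pass on representative flight or replay data is what counts as acceptance evidence.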
 ## Decision Rules Applied
 - No GPL VIO/SLAM library is selected as a production dependency.
 - Runtime gates are classified as runtime-quality validation gates, not API capability blockers.
 - Every selected component has an exact input/output role and a validation path.
 - Any candidate with license uncertainty is marked `Needs user decision` or non-production.
 ## Mode B Revisions Applied
 - FAISS pinned mode changed to CPU-first on Jetson ARM64.
 - DINOv2 TensorRT path requires descriptor-fidelity validation against PyTorch/ONNX.
 - `GPS_INPUT` tests now include velocity-source and ignore-flag behavior.
 - COG cache lifecycle clarified as write-new-object plus manifest versioning, not in-place mutation.
 - Visual/satellite security controls now include signed manifests, cache provenance, stale-tile rejection, and multi-signal consistency checks.
 ## Mode B Round 2 Revisions Applied
 - OpenVINS is explicitly better than naive OpenCV-only VIO, but remains reference-only because the shipped estimator must own source labels, covariance gates, spoofing/blackout states, cache-write eligibility, and MAVLink semantics.
 - The selected VO wording is now "OpenCV geometry + project-owned ESKF" to avoid implying a fragile OpenCV-only odometry implementation.
 - DINOv2-VLAD + CPU-first FAISS + ALIKED/LightGlue remains selected for satellite retrieval and anchor verification, with retrieval limited to relocalization triggers and bounded top-K verification.
 - SIFT/ORB remains a regression/fallback baseline; SuperPoint remains non-production until legal approval.
 ## Mode B Round 3 Revisions Applied
 - BASALT is selected as the production VIO candidate.
 - OpenVINS remains the covariance/reference baseline, not a shipped dependency by default.
 - Kimera-VIO remains a backup VIO candidate: its license is production-friendly, but mono-inertial caveats make it weaker for the single-nadir-camera path.
 - The project-owned safety/anchor wrapper remains mandatory around BASALT for satellite anchor acceptance, source labels, blackout/spoofing modes, cache-write eligibility, calibrated confidence, and MAVLink `GPS_INPUT`.
@@ -0,0 +1,130 @@
 # Solution
 ## Product Solution Description
 Build an onboard GPS-denied localization service that runs on the Jetson companion computer, uses the fixed downward navigation camera and flight-controller inertial telemetry, and emits ArduPilot `GPS_INPUT` estimates with calibrated covariance and source labels.
 The production architecture is a trigger-based hybrid estimator:
 ```text
 Nav camera + FC telemetry
        |
        v
 Image quality + calibration + orthorectification
        |
        +--> Hot path: OpenCV geometry + BASALT VIO --> safety/anchor wrapper --> GPS_INPUT + QGC + FDR
        |
        +--> Reference path: OpenVINS replay benchmark for VIO drift/covariance tests; Kimera backup replay
        |
        +--> Trigger path: DINOv2-VLAD query --> CPU FAISS top-K --> ALIKED/DISK+LightGlue --> OpenCV RANSAC --> safety/anchor wrapper
        |
        +--> Tile path: new COG tile + quality/provenance sidecar --> manifest update --> post-flight Satellite Service sync
 ```
 Heavy local retrieval and local matching are not steady-state per-frame dependencies. They run on cold start, VO failure, sharp turns, disconnected segments, covariance growth, stale-anchor age, or operator-assisted relocalization, using only preloaded cache/index data during flight.
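
The trigger list above can be read as a single predicate over estimator status. The following is a minimal sketch of that idea only; the `EstimatorStatus` fields, the reason strings, and every threshold value are hypothetical placeholders, not values taken from this design.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EstimatorStatus:
    has_initial_fix: bool            # False until the first anchor is accepted (cold start)
    vo_healthy: bool                 # False when VO/VIO tracking has failed
    yaw_rate_deg_s: float            # used to detect sharp turns
    frame_overlap: float             # 0..1 overlap with the previous frame
    horiz_sigma_m: float             # current horizontal 1-sigma uncertainty
    anchor_age_s: Optional[float]    # None when no anchor has been accepted yet
    operator_request: bool = False   # operator-assisted relocalization


# Hypothetical thresholds; real values would come from replay calibration.
MAX_YAW_RATE_DEG_S = 20.0
MIN_FRAME_OVERLAP = 0.2
MAX_HORIZ_SIGMA_M = 50.0
MAX_ANCHOR_AGE_S = 60.0


def relocalization_reasons(s: EstimatorStatus) -> List[str]:
    """Return the triggers that currently justify running the heavy path."""
    reasons = []
    if not s.has_initial_fix:
        reasons.append("cold_start")
    if not s.vo_healthy:
        reasons.append("vo_failure")
    if abs(s.yaw_rate_deg_s) > MAX_YAW_RATE_DEG_S:
        reasons.append("sharp_turn")
    if s.frame_overlap < MIN_FRAME_OVERLAP:
        reasons.append("disconnected_segment")
    if s.horiz_sigma_m > MAX_HORIZ_SIGMA_M:
        reasons.append("covariance_growth")
    if s.anchor_age_s is not None and s.anchor_age_s > MAX_ANCHOR_AGE_S:
        reasons.append("stale_anchor")
    if s.operator_request:
        reasons.append("operator_request")
    return reasons
```

An empty list means the frame stays on the hot path. Returning the reasons rather than a boolean would let the FDR record why each retrieval ran.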
 ## Architecture
 ### Camera Ingest, Calibration, And Geometry
 | Solution | Tools | Pinned Mode/Config | Fit |
 |----------|-------|--------------------|-----|
 | OpenCV geometry utility layer | OpenCV 4.x | Calibration, undistortion, homography, RANSAC/USAC, MRE measurement | Selected. Mature, permissive, exact utility fit; not a full estimator. |
 ### VO / IMU Propagation And Estimator
 | Solution | Tools | Pinned Mode/Config | Fit |
 |----------|-------|--------------------|-----|
 | BASALT + safety/anchor wrapper | BASALT, OpenCV, custom wrapper | BASALT consumes calibrated nav-camera frames + FC IMU; wrapper fuses satellite anchors, calibrates uncertainty, emits source labels and `GPS_INPUT` fields | Selected. Best production VIO candidate found: permissive license, strong benchmark evidence, avoids custom VIO from scratch. |
 | OpenVINS | OpenVINS | Monocular camera + IMU EKF/MSCKF reference runs with covariance extraction | Reference only. Strong VIO and covariance baseline, but GPLv3 and generic VIO ownership make it unsuitable as default shipped dependency. |
 | Kimera-VIO | Kimera-VIO | Mono/stereo camera + IMU VIO/SLAM backup replay | Backup candidate. BSD-friendly but heavier/stereo-oriented; mono-inertial path has documented caveats. |
 | ORB-SLAM3 | ORB-SLAM3 | Monocular-inertial SLAM | Rejected for production. GPLv3 and heavier SLAM/map lifecycle. |
 BASALT does not replace the project-owned safety logic. The wrapper remains responsible for satellite anchor acceptance, confidence calibration, source labels, blackout/spoofing modes, tile-write eligibility, and MAVLink `GPS_INPUT` semantics.
 ### Satellite Service And Anchor Verification
 | Solution | Tools | Pinned Mode/Config | Fit |
 |----------|-------|--------------------|-----|
 | DINOv2-VLAD + CPU FAISS + ALIKED/DISK+LightGlue | DINOv2/AnyLoc-style descriptors, FAISS CPU, LightGlue, OpenCV RANSAC | Offline VPR chunk descriptors; conditional query descriptor; CPU FAISS top-K; learned local match on bounded candidates; TensorRT only after fidelity check | Selected with runtime/fidelity gates. |
 | SuperPoint+LightGlue | SuperPoint, LightGlue | Same matcher with SuperPoint features | License-gated benchmark/fallback only. |
 | Classical SIFT/ORB | OpenCV | Handcrafted features + homography | Regression/fallback baseline. |
 The Satellite Service component imports mission cache/index packages before flight, uploads generated-tile packages after landing, and serves local VPR queries during flight. The VPR index is built over ground-footprint-sized chunks with overlap and a multi-scale descriptor set. VPR is invoked only on relocalization triggers or covariance/anchor-age growth; normal flight uses BASALT VIO plus wrapper propagation. No satellite-provider or Satellite Service network calls are allowed mid-flight.
 ### Tile Manager
 | Solution | Tools | Pinned Mode/Config | Fit |
 |----------|-------|--------------------|-----|
 | COG tile objects + PostgreSQL/PostGIS manifest + signed JSON sidecars | GDAL COG, PostgreSQL/PostGIS, signed JSON sidecars, FAISS index files | Service tiles and generated tiles are write-new COG objects; active version selected by PostGIS-backed manifest | Selected. Fits geospatial raster access, provenance, spatial/freshness queries, and write-new tile lifecycle. |
 | PMTiles | PMTiles | Read-only archive snapshot | Rejected for live cache because in-flight tile generation needs mutable write-new objects. |
 Service-source tiles and generated tiles carry CRS, capture date, source, m/px, freshness, quality score, sidecar hashes, and descriptor references. The Tile Manager also orthorectifies eligible nadir frames into generated COG tiles. Stale tiles are rejected or down-weighted in confidence.
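
As an illustration of the sidecar hash and stale-tile handling described above, here is a small sketch. The `sha256` sidecar key, the choice of SHA-256, the linear down-weighting, and the age limits are all assumptions made for the example; the design only states that sidecar hashes exist and that stale tiles are rejected or down-weighted.

```python
import hashlib
from datetime import date


def tile_matches_sidecar(tile_bytes: bytes, sidecar: dict) -> bool:
    """Compare the tile payload against the hash recorded in its sidecar.

    The "sha256" key is a hypothetical sidecar field name.
    """
    return hashlib.sha256(tile_bytes).hexdigest() == sidecar["sha256"]


def freshness_weight(capture_date: date, today: date,
                     soft_age_days: int = 180, max_age_days: int = 365) -> float:
    """Return 1.0 for fresh tiles, 0.0 for rejected tiles, linear in between.

    Both age limits are placeholder values, not project requirements.
    """
    age_days = (today - capture_date).days
    if age_days <= soft_age_days:
        return 1.0
    if age_days > max_age_days:
        return 0.0
    return (max_age_days - age_days) / (max_age_days - soft_age_days)
```

A weight of 0.0 corresponds to rejection; intermediate weights could scale the anchor measurement confidence before the gate.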
 ### MAVLink Integration
 | Solution | Tools | Pinned Mode/Config | Fit |
 |----------|-------|--------------------|-----|
 | MAVSDK telemetry + pymavlink `GPS_INPUT` | MAVSDK, pymavlink | MAVSDK subscriptions; pymavlink emits `GPS_INPUT`; v1 emits GPS_INPUT only; Plane SITL validates `GPS1_TYPE=14`, velocity source params, ignore flags, fix types, accuracy fields | Selected. Exact output control with good telemetry ergonomics. |
 The system emits per-frame estimates locally and downsampled status to QGroundControl. `GPS_INPUT.horiz_accuracy` must not under-report the calibrated 95% covariance semi-major axis.
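
The `horiz_accuracy` rule can be made concrete with a short calculation. The sketch below assumes a 2x2 horizontal position covariance in m² and a Gaussian error model, so the 95% ellipse scales the largest eigenvalue by the chi-square quantile for two degrees of freedom (about 5.991). The function names and the round-up resolution are illustrative assumptions.

```python
import math

CHI2_2DOF_95 = 5.991  # 95% quantile of the chi-square distribution, 2 degrees of freedom


def semi_major_axis_95(p_ee: float, p_en: float, p_nn: float) -> float:
    """95% confidence-ellipse semi-major axis (m) of a 2x2 covariance [[p_ee, p_en], [p_en, p_nn]]."""
    mean = 0.5 * (p_ee + p_nn)
    half_diff = 0.5 * (p_ee - p_nn)
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    lam_max = mean + math.sqrt(half_diff * half_diff + p_en * p_en)
    return math.sqrt(CHI2_2DOF_95 * lam_max)


def horiz_accuracy_field(p_ee: float, p_en: float, p_nn: float,
                         resolution_m: float = 0.01) -> float:
    """Round up to the reporting resolution so the field never under-reports."""
    axis = semi_major_axis_95(p_ee, p_en, p_nn)
    return math.ceil(axis / resolution_m) * resolution_m
```

Rounding up rather than to nearest is the design choice that keeps the emitted value at or above the calibrated axis.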
 ### Security And Safety Controls
 | Solution | Tools | Pinned Mode/Config | Fit |
 |----------|-------|--------------------|-----|
 | Consistency-gated anchor acceptance | Safety/anchor wrapper, cache manifest verification | Anchor accepted only if freshness, provenance, RANSAC, covariance, Mahalanobis, and temporal consistency pass | Selected. Prevents confident false fixes. |
 | FDR audit trail | PostgreSQL event index + CBOR payload segments + hashes | Logs estimates, inputs, emitted GPS_INPUT, health, tile writes, anchor decisions | Selected. Supports incident analysis, indexed queries, and cache-poisoning audits. |
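
The consistency-gated acceptance row can be sketched as a list of independent checks where any failure rejects the anchor. This is an illustrative outline only: the `AnchorCandidate` fields, the inlier-ratio floor, and the 99% chi-square threshold are assumptions, and the separate covariance gate named in the table is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple

CHI2_2DOF_99 = 9.210      # 99% chi-square quantile, 2 DoF; the confidence level is an assumption
MIN_INLIER_RATIO = 0.3    # hypothetical RANSAC gate


def mahalanobis_sq_2d(residual: Tuple[float, float],
                      s: Tuple[Tuple[float, float], Tuple[float, float]]) -> float:
    """Squared Mahalanobis distance r^T S^-1 r for a 2D residual and 2x2 innovation covariance."""
    rx, ry = residual
    (a, b), (c, d) = s
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    # S^-1 = (1/det) * [[d, -b], [-c, a]]
    return (rx * (d * rx - b * ry) + ry * (-c * rx + a * ry)) / det


@dataclass
class AnchorCandidate:
    tile_fresh: bool
    provenance_ok: bool
    ransac_inlier_ratio: float
    residual_m: Tuple[float, float]
    innovation_cov: Tuple[Tuple[float, float], Tuple[float, float]]
    temporally_consistent: bool


def failed_gates(c: AnchorCandidate) -> List[str]:
    """Return the names of every gate the candidate fails; empty means accept."""
    failures = []
    if not c.tile_fresh:
        failures.append("freshness")
    if not c.provenance_ok:
        failures.append("provenance")
    if c.ransac_inlier_ratio < MIN_INLIER_RATIO:
        failures.append("ransac")
    if mahalanobis_sq_2d(c.residual_m, c.innovation_cov) > CHI2_2DOF_99:
        failures.append("mahalanobis")
    if not c.temporally_consistent:
        failures.append("temporal")
    return failures
```

Collecting all failed gates, instead of stopping at the first, gives the audit trail a complete reason for each rejected anchor.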
 ## Runtime Modes
 | Mode | Trigger | Behavior | `GPS_INPUT` / Telemetry |
 |------|---------|----------|--------------------------|
 | `satellite_anchored` | VPR + local match passes all gates | Wrapper absolute update; tile write eligible only if sigma gate passes | 3D fix, `horiz_accuracy` >= 95% covariance semi-major axis |
 | `vo_extrapolated` | BASALT VIO healthy and anchor age/covariance within bounds | BASALT VIO + wrapper propagation; covariance grows | 3D/2D depending on covariance threshold |
 | `dead_reckoned` | visual blackout or no accepted anchor | IMU-only propagation, monotonic covariance growth | degraded fix type; QGC `VISUAL_BLACKOUT_IMU_ONLY` |
 | failsafe/no-fix | covariance >500 m or blackout >30 s | Stop reporting position as valid | `fix_type=0`, `horiz_accuracy=999.0`, QGC `VISUAL_BLACKOUT_FAILSAFE` |
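
The mode table can be expressed as a small selection function. The sketch below takes the 500 m and 30 s failsafe limits and the failsafe field values from the table; the evaluation order (failsafe first, then anchored, then VO, then dead reckoning) and the argument names are assumptions.

```python
FAILSAFE_UNCERTAINTY_M = 500.0   # from the table: covariance >500 m
FAILSAFE_BLACKOUT_S = 30.0       # from the table: blackout >30 s


def select_mode(anchor_accepted: bool, vio_healthy: bool, anchor_within_bounds: bool,
                blackout_s: float, horiz_uncertainty_m: float) -> str:
    """Pick the runtime mode; the priority order is an assumption of this sketch."""
    if horiz_uncertainty_m > FAILSAFE_UNCERTAINTY_M or blackout_s > FAILSAFE_BLACKOUT_S:
        return "failsafe"
    if anchor_accepted:
        return "satellite_anchored"
    if vio_healthy and anchor_within_bounds:
        return "vo_extrapolated"
    return "dead_reckoned"


def failsafe_gps_input_fields() -> dict:
    """GPS_INPUT field overrides listed in the failsafe row."""
    return {"fix_type": 0, "horiz_accuracy": 999.0}
```

Checking failsafe first means a very large uncertainty cannot be masked by a single accepted anchor in the same cycle; whether that is the desired behavior is a design decision this sketch does not settle.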
 ## Testing Strategy
 ### Integration / Functional Tests
 - BASALT replay: assert AC-2.1a and AC-2.2 VO MRE on overlapping frame pairs, completion rate, latency, and wrapper-calibrated covariance.
 - OpenVINS reference replay: compare VIO drift, failure cases, and covariance against BASALT + wrapper.
 - Kimera-VIO backup replay: keep a second permissive candidate benchmark in case BASALT fails project replay/runtime gates.
 - Satellite anchor replay: assert AC-1.1/1.2, AC-2.2 cross-domain MRE, freshness rejection, and source labels.
 - DINOv2 descriptor fidelity: compare PyTorch/ONNX/TensorRT embeddings and retrieval rankings before accepting optimized engines.
 - FAISS CPU index tests: top-K recall, query latency, index size, save/load behavior on Jetson ARM64.
 - LightGlue extractor matrix: ALIKED vs DISK vs SIFT/ORB vs SuperPoint benchmark; SuperPoint excluded from production unless legal approves.
 - Tile Manager: orthorectify eligible nadir frames into write-new generated tiles, update manifest, verify active version and rollback.
 - `GPS_INPUT` SITL: validate fix type, `horiz_accuracy`, velocity fields, ignore flags, `EK3_SRC1_*` parameters, QGC behavior.
 - Security gates: stale tile, mismatched tile hash, low inlier ratio, impossible velocity jump, and spoofed GPS during blackout.
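
For the descriptor-fidelity item in the list above, one possible check compares a reference embedding with its optimized counterpart and then compares the retrieval rankings each produces. The sketch below uses plain Python lists in place of real DINOv2 outputs; the function names and any pass thresholds are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], database: List[Sequence[float]], k: int) -> List[int]:
    """Indices of the k database vectors most similar to the query."""
    order = sorted(range(len(database)),
                   key=lambda i: cosine(query, database[i]), reverse=True)
    return order[:k]


def fidelity(ref_query: Sequence[float], opt_query: Sequence[float],
             ref_db: List[Sequence[float]], opt_db: List[Sequence[float]],
             k: int) -> Tuple[float, float]:
    """Return (embedding cosine similarity, fraction of shared top-k results)."""
    similarity = cosine(ref_query, opt_query)
    ref_rank = top_k(ref_query, ref_db, k)
    opt_rank = top_k(opt_query, opt_db, k)
    overlap = len(set(ref_rank) & set(opt_rank)) / k
    return similarity, overlap
```

Ranking overlap matters as much as raw similarity here, because small embedding shifts are harmless only if the same candidate tiles still reach local matching.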
 ### Non-Functional Tests
 - Jetson latency and memory: <400 ms p95, <8 GB shared memory, no 25 W thermal throttle.
 - Cache budget: 400 km² imagery + manifests + descriptors fits budget or reports explicit split budget.
 - FDR 8-hour load: <=64 GB, rollover logged, no silent payload loss.
 - Monte Carlo false-position and cache-poisoning tests for AC-NEW-4 and AC-NEW-7.
 - Cold boot: first valid `GPS_INPUT` <30 s p95 across 50 runs.
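
Two of the numeric gates above reduce to simple arithmetic. The sketch below assumes decimal gigabytes for the FDR budget and a nearest-rank definition of p95; neither convention is stated in the plan, and the function names are illustrative.

```python
from typing import Sequence


def max_sustained_write_rate(budget_bytes: float, duration_s: float) -> float:
    """Average bytes per second that exactly fills the budget over the duration."""
    return budget_bytes / duration_s


def p95_nearest_rank(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile: the ceil(0.95 * n)-th smallest sample."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = -(-95 * len(ordered) // 100)  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


# 64 GB over an 8-hour flight, assuming 1 GB = 1e9 bytes.
FDR_RATE_BYTES_S = max_sustained_write_rate(64e9, 8 * 3600)
```

Under these assumptions the FDR may average roughly 2.2 MB/s, and with 50 cold-boot runs the p95 gate is decided by the 48th fastest run, so up to two runs may exceed 30 s without failing the gate.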
 ## References
 Detailed source registry: `_docs/00_research/01_source_registry.md`.
 Key sources:
 - BASALT repository: https://github.com/VladyslavUsenko/basalt
 - HybVIO benchmark paper: https://arxiv.org/pdf/2106.11857
 - OpenVINS docs: https://docs.openvins.com/index.html
 - OpenVINS ATE/RTE comparison: https://github.com/rpng/open_vins/issues/402
 - Kimera mono-inertial caveat: https://github.com/MIT-SPARK/Kimera-VIO/issues/254
 - AnyLoc paper: https://arxiv.org/html/2308.00688
 - Aerial VPR survey: https://arxiv.org/abs/2406.00885
 - ALIKED-LightGlue-ONNX: https://github.com/ikeboo/ALIKED-LightGlue-ONNX
 - DINOv2 TensorRT issue: https://github.com/NVIDIA/TensorRT/issues/4348
 ## Related Artifacts
 - Tech stack evaluation: `_docs/01_solution/tech_stack.md`
 - Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
 - Fact cards: `_docs/00_research/02_fact_cards.md`
@@ -0,0 +1,149 @@
 # Solution Draft
 ## Product Solution Description
 Build an onboard GPS-denied localization service that runs on the companion Jetson, consumes the fixed nadir navigation camera plus flight-controller IMU/attitude/altitude, and emits ArduPilot-compatible `GPS_INPUT` estimates with honest covariance.
 The solution is a hybrid estimator:
 ```text
 Nav camera + FC telemetry
        |
        v
 Image quality + calibration + orthorectification
        |
        +--> Steady state: VO/IMU propagation --> ESKF --> GPS_INPUT + QGC + FDR
        |
        +--> Re-anchor triggers: VPR top-K --> local match/RANSAC --> ESKF anchor update
        |
        +--> Tile generation: ortho tile + quality sidecar --> local cache --> post-flight Satellite Service upload
 ```
 The steady-state path is intentionally lightweight. Heavy global retrieval and cross-domain local matching run only on cold start, VO failure, sharp turns, disconnected segments, covariance growth, or stale-anchor age.
 ## Existing / Competitor Solutions Analysis
 Recent fixed-wing GPS-denied research supports the same high-level mechanism: monocular visual odometry alone accumulates scale/drift error, while satellite-image comparison can periodically correct it. Aerial VPR surveys also show why the implementation cannot be naive: weather, season, scale mismatch, repetitive fields, tile overlap, and re-ranking runtime all matter.
 The selected design borrows the proven structure but rejects an all-in-one SLAM dependency as the product core. OpenVINS and ORB-SLAM3 are useful benchmark/reference implementations, but their GPL-family licensing and broader SLAM lifecycle do not fit a default production dependency for this project.
 ## Architecture
 ### Component: Camera Ingest, Calibration, And Geometry
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
 | OpenCV geometry utility layer | OpenCV 4.x | Camera calibration, undistortion, RANSAC homography, reprojection-error measurement | Mature, local, fast, exact API fit | Not a full estimator | Calibration target, fixed intrinsics/extrinsics, lens/FOV selection | Local-only, no network | Low | MVE: `_docs/00_research/02_fact_cards.md`; Source #5 | Selected |
 **Exact-fit evidence**:
 - Project constraints checked: fixed nadir camera, calibration, homography, MRE gates.
 - Disqualifiers: none.
 - Restrictions × AC matrix: `_docs/00_research/06_component_fit_matrix.md`.
 ### Component: VO / IMU Propagation And Estimator
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
 | Custom VO/IMU ESKF | OpenCV + custom estimator | Frame-to-frame homography/features + FC IMU/attitude/altitude fused in a custom ESKF with source modes | Owns covariance, source labels, degraded modes, ArduPilot output semantics | More implementation work than adopting VIO library | Synchronized frames/IMU, calibration, replay tests | No third-party cloud; deterministic local logs | Medium | Facts #1, #4, #5, #16 | Selected |
 | OpenVINS reference | OpenVINS | Monocular camera + IMU EKF/MSCKF reference runs | Strong VIO reference and evaluation tools | GPL-3 production dependency risk | Dataset/replay adapter | Local only | Low for benchmark | Source #3 | Reference only |
 | ORB-SLAM3 alternative | ORB-SLAM3 | Monocular-inertial SLAM | Mature SLAM benchmark | GPLv3, heavier map lifecycle, initialization complexity | Calibration, vocabulary, runtime tuning | Local only | Medium | Source #4 | Rejected for production |
 **Exact-fit evidence**:
 - Project constraints checked: one camera + IMU, frame-by-frame output, covariance labels, blackout/spoofing modes.
 - Runtime quality gate: validate drift and covariance on EuRoC-style and representative fixed-wing replay.
 ### Component: Satellite Retrieval And Local Anchor
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
 | DINOv2-VLAD + FAISS + LightGlue | DINOv2/AnyLoc-style descriptors, FAISS, DISK/ALIKED+LightGlue, OpenCV RANSAC | Offline precomputed VPR chunk descriptors; conditional query descriptor; top-K FAISS; local matching on candidates | Matches AC-8.6; scalable top-K; local geometry verifies anchors | Needs Jetson profiling and model-size pruning | Fresh satellite cache, descriptors, dynamic K, RANSAC gates | Cache is local; no in-flight provider calls | Medium-high | MVE blocks in `02_fact_cards.md`; Sources #6-#9 | Selected with runtime gate |
 | SuperPoint + LightGlue | SuperPoint, LightGlue | Same local matching with SuperPoint features | Strong technical baseline | SuperPoint license is restrictive | Legal review | Local only | Medium | Source #6 | Needs user decision |
 | Classical SIFT/ORB-only | OpenCV | Handcrafted features and homography | Simple fallback, low compute | Poor cross-domain robustness | Feature-rich scenes | Local only | Low | Source #5 | Fallback / regression baseline |
 **Exact-fit evidence**:
 - Project constraints checked: offline cache, top-K dynamic retrieval, cross-domain local match, <400 ms hot-path constraint.
 - Runtime gate: heavy VPR/re-ranking is trigger-based, not per-frame.
 ### Component: Satellite Cache And Tile Write-Back
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
 | COG + PostgreSQL/PostGIS manifest + descriptor sidecars | GDAL COG, PostgreSQL/PostGIS manifest, FAISS sidecars | Service tiles and generated candidate tiles stored as tiled compressed GeoTIFFs with CRS/date/source/meter-per-pixel metadata | Geospatial standard, supports write-new-tile workflow, descriptor accounting | Needs careful 10 GB budget | STAC-like manifest, freshness gates, descriptor pruning | Local signed manifests recommended | Medium | Source #18 | Selected |
 | PMTiles archive | PMTiles | Single-file read archive | Efficient map reads | Read-only; cannot update in place | Archive rebuild for updates | Local file integrity | Low | Source #17 | Rejected for live mutable cache |
 **Exact-fit evidence**:
 - Project constraints checked: offline-only, no raw photo storage, mid-flight generated tiles, Satellite Service ingest metadata.
 - External dependency: Satellite Service owns the promotion/voting layer for trusted basemap updates.
 ### Component: MAVLink Integration And Telemetry
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
 | MAVSDK telemetry + pymavlink output | MAVSDK, pymavlink | MAVSDK subscribes to telemetry; pymavlink emits `GPS_INPUT` to ArduPilot with `GPS1_TYPE=14` | Exact `GPS_INPUT` field control while keeping high-level telemetry APIs | Plane-specific failsafe/spoof triggers need SITL proof | ArduPilot Plane params, QGC status, FDR tlog | Validate MAVLink source and message rate | Medium | Sources #10-#12 | Selected |
 **Exact-fit evidence**:
 - Project constraints checked: ArduPilot only, v1 `GPS_INPUT` only, WGS84 output, covariance to `horiz_accuracy`.
 - Validation gate: Plane SITL with production parameters.
 ### Component: Flight Data Recorder
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
 | Segmented FDR | PostgreSQL event index + binary/CBOR payloads with Parquet export post-flight | Fixed-size segment files for per-frame estimates, IMU, MAVLink, health, emitted GPS_INPUT, generated-tile metadata | Replayable, bounded, no raw frame retention | Exact format to be selected during implementation | Rollover policy, monotonic timestamps | Integrity hash per segment recommended | Medium | AC-NEW-3 | Selected pattern |
 ## Runtime Modes
 | Mode | Trigger | Behavior | Output Label |
 |------|---------|----------|--------------|
 | Satellite anchored | VPR + local match passes freshness, RANSAC, covariance, and Mahalanobis gates | ESKF absolute update, low covariance, tile generation eligible if sigma gate passes | `satellite_anchored` |
 | VO extrapolated | Last anchor fresh enough, VO healthy, normal overlap | VO/IMU propagation, covariance grows with anchor age | `vo_extrapolated` |
 | Dead reckoned | Visual blackout, VO failure without anchor, spoofing while no visual signal | IMU-only propagation, monotonic covariance growth, degraded fix type thresholds | `dead_reckoned` |
 | Failsafe / no fix | Blackout >30 s or covariance >500 m | `GPS_INPUT.fix_type=0`, `horiz_accuracy=999.0`, QGC status | `dead_reckoned` with failsafe status |
 ## Testing Strategy
 ### Integration / Functional Tests
 - Replay normal overlapping frames and assert AC-2.1a VO registration rate and AC-2.2 VO MRE.
 - Replay satellite-anchor cases and assert AC-1.1/1.2, AC-2.2 cross-domain MRE, freshness gates, and source labels.
 - Inject stale tiles and assert no `satellite_anchored` output.
 - Inject sharp turns and disconnected segments and assert VPR relocalization.
 - Run ArduPilot Plane SITL with `GPS1_TYPE=14` and assert valid `GPS_INPUT` fields, fix type degradation, and QGC statuses.
 - Inject visual blackout + spoofed GPS and assert spoofed GPS is ignored, covariance grows, and thresholds match AC-NEW-8.
 - Request AI-camera object coordinates and assert level-flight projection plus maneuver error bound.
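
The level-flight projection in the last test item can be illustrated with flat-terrain trigonometry. This sketch assumes the gimbal and zoom state have already been resolved into an off-nadir angle and an azimuth from true north, that altitude is above ground level, and that offsets are small enough for a spherical small-angle conversion; all function and parameter names are hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius, used for a small-offset approximation


def ground_offset_m(altitude_agl_m: float, off_nadir_deg: float,
                    azimuth_deg: float) -> Tuple[float, float]:
    """North/east offset (m) of the ground point hit by the line of sight over flat terrain."""
    if not 0.0 <= off_nadir_deg < 90.0:
        raise ValueError("line of sight must intersect the ground")
    ground_range = altitude_agl_m * math.tan(math.radians(off_nadir_deg))
    azimuth = math.radians(azimuth_deg)
    return ground_range * math.cos(azimuth), ground_range * math.sin(azimuth)


def project_to_latlon(lat_deg: float, lon_deg: float, altitude_agl_m: float,
                      off_nadir_deg: float, azimuth_deg: float) -> Tuple[float, float]:
    """Object latitude/longitude from the UAV position under the flat-terrain assumption."""
    north, east = ground_offset_m(altitude_agl_m, off_nadir_deg, azimuth_deg)
    dlat = math.degrees(north / EARTH_RADIUS_M)
    dlon = math.degrees(east / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return lat_deg + dlat, lon_deg + dlon
```

Because the ground range grows with the tangent of the off-nadir angle, attitude errors during maneuvers inflate the position error quickly, which is why a separate maneuver bound is asserted.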
 ### Non-Functional Tests
 - Jetson Orin Nano Super profiling: <400 ms p95, <8 GB shared memory, 25 W no-throttle hot-soak.
 - Cache build test for 400 km²: imagery + manifests + descriptors fit within budget or fail with explicit budget report.
 - 8-hour FDR load test: <=64 GB, rollover logged, no silent data loss.
 - Monte Carlo false-anchor and over-confidence tests for AC-NEW-4 and AC-NEW-7.
 - Cold boot 50x: first valid `GPS_INPUT` <30 s p95.
 ## References
 Detailed source registry: `_docs/00_research/01_source_registry.md`.
 Key sources:
 - Fixed-wing satellite-aided VO: https://www.mdpi.com/2076-3417/14/16/7420
 - Aerial VPR survey: https://arxiv.org/abs/2406.00885
 - OpenVINS docs: https://docs.openvins.com/
 - ORB-SLAM3 README: https://raw.githubusercontent.com/UZ-SLAMLab/ORB_SLAM3/master/README.md
 - LightGlue README: https://raw.githubusercontent.com/cvg/LightGlue/main/README.md
 - FAISS docs: https://faiss.ai/index.html
 - ArduPilot GPSInput: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
 - MAVLink GPS_INPUT: https://mavlink.io/en/messages/common.html#GPS_INPUT
 - Jetson Orin Nano Super: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit/
 - PMTiles: https://docs.protomaps.com/pmtiles/
 - GDAL COG: https://gdal.org/en/stable/drivers/raster/cog.html
 ## Related Artifacts
 - AC assessment: `_docs/00_research/00_ac_assessment.md`
 - Question decomposition: `_docs/00_research/00_question_decomposition.md`
 - Source registry: `_docs/00_research/01_source_registry.md`
 - Fact cards: `_docs/00_research/02_fact_cards.md`
 - Comparison framework: `_docs/00_research/03_comparison_framework.md`
 - Reasoning chain: `_docs/00_research/04_reasoning_chain.md`
 - Validation log: `_docs/00_research/05_validation_log.md`
 - Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
@@ -0,0 +1,125 @@
 # Solution Draft
 ## Assessment Findings
 | Old Component Solution | Weak Point (functional/security/performance) | New Solution |
 |------------------------|----------------------------------------------|-------------|
 | DINOv2-VLAD with possible TensorRT optimization | TensorRT conversion may produce limited speedup and can alter embedding distances on Jetson-class deployments. | Keep DINOv2-VLAD, but require descriptor-fidelity tests against PyTorch/ONNX before TensorRT descriptors are accepted. |
 | FAISS CPU/GPU optional | FAISS GPU is not a safe default on Jetson ARM64/aarch64 packaging. | Pin FAISS as CPU-first on Jetson; use PQ/IVF and top-K caps before considering custom GPU builds. |
 | LightGlue local matcher | SuperPoint path has license risk and community confusion. | Keep DISK/ALIKED+LightGlue as production default; SuperPoint remains license-gated benchmark/fallback only. |
 | COG cache | "COG cache" could be misread as mutable in-place raster updates. | Use write-new COG tile objects plus manifest versioning and sidecars; never mutate COGs in place. |
 | `GPS_INPUT` output | ArduPilot velocity ignore flags have reported EKF3 pitfalls. | SITL must validate velocity source parameters, ignore flags, and whether zero velocity is ever fused accidentally. |
 | Visual/satellite anchoring | Draft did not emphasize adversarial/cache integrity enough. | Add signed cache manifests, tile provenance, freshness gates, anchor consistency checks, and FDR audit trail. |
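
The `GPS_INPUT` velocity finding can be checked on the message side before SITL. The sketch below reproduces the `GPS_INPUT_IGNORE_FLAGS` bit values from the MAVLink common message set as a local enum so it runs without pymavlink; the helper names and the "all-zero velocity must be flagged" rule are assumptions, and the check says nothing about how ArduPilot's EKF3 treats the flags.

```python
from enum import IntFlag


class GpsInputIgnore(IntFlag):
    """Bit values of GPS_INPUT_IGNORE_FLAGS in the MAVLink common message set."""
    ALT = 1
    HDOP = 2
    VDOP = 4
    VEL_HORIZ = 8
    VEL_VERT = 16
    SPEED_ACCURACY = 32
    HORIZONTAL_ACCURACY = 64
    VERTICAL_ACCURACY = 128


def velocity_ignore_flags(have_horiz_velocity: bool, have_vert_velocity: bool,
                          have_speed_accuracy: bool) -> GpsInputIgnore:
    """Set an ignore bit for every velocity-related field the estimator cannot supply."""
    flags = GpsInputIgnore(0)
    if not have_horiz_velocity:
        flags |= GpsInputIgnore.VEL_HORIZ
    if not have_vert_velocity:
        flags |= GpsInputIgnore.VEL_VERT
    if not have_speed_accuracy:
        flags |= GpsInputIgnore.SPEED_ACCURACY
    return flags


def zero_velocity_is_flagged(vn: float, ve: float, vd: float,
                             flags: GpsInputIgnore) -> bool:
    """Heuristic: an all-zero velocity must carry both velocity ignore bits."""
    if vn == 0.0 and ve == 0.0 and vd == 0.0:
        needed = GpsInputIgnore.VEL_HORIZ | GpsInputIgnore.VEL_VERT
        return (flags & needed) == needed
    return True
```

A unit check like this only proves the emitter sets the bits consistently; the SITL run remains the evidence for how the flight controller fuses the message.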
 ## Product Solution Description
 Build an onboard GPS-denied localization service that runs on the Jetson companion computer, uses the fixed downward navigation camera and flight-controller inertial telemetry, and emits ArduPilot `GPS_INPUT` estimates with calibrated covariance and source labels.
 The production architecture is a trigger-based hybrid estimator:
 ```text
 Nav camera + FC telemetry
        |
        v
 Image quality + calibration + orthorectification
        |
        +--> Hot path: VO/IMU propagation --> custom ESKF --> GPS_INPUT + QGC + FDR
        |
        +--> Trigger path: DINOv2-VLAD query --> CPU FAISS top-K --> DISK/ALIKED+LightGlue --> RANSAC --> ESKF anchor
        |
        +--> Tile path: new COG tile + quality/provenance sidecar --> manifest update --> post-flight Satellite Service sync
 ```
 Heavy retrieval and local matching are not steady-state per-frame dependencies. They run on cold start, VO failure, sharp turns, disconnected segments, covariance growth, or stale-anchor age.
 ## Architecture
 ### Component: Camera Ingest, Calibration, And Geometry
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
 | OpenCV geometry utility layer | OpenCV 4.x | Calibration, undistortion, RANSAC homography, MRE measurement | Mature, exact fit, permissive | Not a full estimator | Checkerboard calibration, fixed extrinsics, lens/FOV selection | Local-only | Fast enough for hot-path utility use | MVE in `02_fact_cards.md`; Source #5 | Selected |
 ### Component: VO / IMU Propagation And Estimator
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
 | Custom VO/IMU ESKF | OpenCV + custom estimator | Nadir VO/homography + FC IMU/attitude/altitude fused in ESKF with mode labels | Owns covariance, source labels, blackout/spoofing behavior | More implementation effort | Synchronized frames/IMU, calibration, replay tests | No network dependency | Hot path is lightweight | Facts #1, #16 | Selected |
 | OpenVINS | OpenVINS | Monocular+IMU reference runs | Strong EKF/MSCKF reference | GPL-3 production risk | Replay adapter | Local only | Benchmark only | Source #3 | Reference only |
 | ORB-SLAM3 | ORB-SLAM3 | Monocular-inertial SLAM | Mature benchmark | GPLv3 and heavier SLAM lifecycle | Calibration/vocabulary/runtime tuning | Local only | Riskier on embedded | Source #4 | Rejected for production |
 ### Component: Satellite Retrieval And Anchor Verification
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
 | DINOv2-VLAD + CPU FAISS + DISK/ALIKED+LightGlue | DINOv2/AnyLoc-style descriptors, FAISS CPU, LightGlue, OpenCV RANSAC | Offline VPR chunk descriptors; conditional query descriptor; CPU FAISS top-K; local match on candidates; TensorRT only after fidelity check | Strong retrieval+geometry structure; avoids per-frame map search | Requires profiling and representative data | Descriptor cache, dynamic K, freshness, RANSAC, Mahalanobis gates | Signed manifests, provenance, stale-tile rejection | Trigger path only; top-K capped | MVE blocks in `02_fact_cards.md`; Sources #6-#9, #21-#25 | Selected with runtime/fidelity gates |
 | SuperPoint+LightGlue | SuperPoint, LightGlue | Same matcher with SuperPoint features | Strong technical baseline | SuperPoint license risk | Legal review | Local only | Benchmark only | Sources #6, #23 | Needs user decision |
 | Classical SIFT/ORB | OpenCV | Handcrafted features + homography | Simple and cheap | Weak cross-domain robustness | Feature-rich scenes | Local only | Fast | Source #5 | Regression baseline |
 ### Component: Cache And Tile Lifecycle
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
 | COG tile objects + manifest + sidecars | GDAL COG, manifest DB/JSON, FAISS index files | Service tiles and generated tiles are write-new COG objects; active version selected by manifest | Geospatial standard, supports provenance and quality metadata | Descriptor budget pressure | CRS/date/source/m/px/freshness, sidecar hashes | Signed manifests, tile provenance, hash verification | Efficient local reads | Source #18; Facts #21, #29 | Selected |
 | PMTiles | PMTiles | Read-only archive snapshot | Compact read package | Cannot update in place | Archive rebuild | Hash archive | Good for read-only export | Source #17 | Rejected for live cache |
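
The write-new lifecycle in the table above can be sketched with an in-memory stand-in for the manifest. Tile objects are never overwritten; publishing appends a new object key and moves the active pointer, and rollback moves the pointer back. The `TileManifest` class and its method names are hypothetical, and a real manifest would live in the manifest DB/JSON rather than in process memory.

```python
from typing import Dict, List


class TileManifest:
    """In-memory illustration of write-new tile objects with manifest versioning."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}  # tile id -> object keys, oldest first
        self._active: Dict[str, int] = {}          # tile id -> index of the active version

    def publish(self, tile_id: str, object_key: str) -> None:
        """Register a newly written object and make it the active version."""
        versions = self._versions.setdefault(tile_id, [])
        if object_key in versions:
            raise ValueError("object keys are write-once; write a new object instead")
        versions.append(object_key)
        self._active[tile_id] = len(versions) - 1

    def active(self, tile_id: str) -> str:
        """Object key currently selected for reads."""
        return self._versions[tile_id][self._active[tile_id]]

    def rollback(self, tile_id: str) -> str:
        """Re-activate the previous version without deleting the newer object."""
        if self._active[tile_id] == 0:
            raise ValueError("no earlier version to roll back to")
        self._active[tile_id] -= 1
        return self.active(tile_id)
```

Keeping superseded objects in place is what makes rollback a pointer change rather than a raster rewrite.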
 ### Component: MAVLink Integration
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
 | MAVSDK telemetry + pymavlink `GPS_INPUT` | MAVSDK, pymavlink | MAVSDK subscriptions; pymavlink emits `GPS_INPUT`; Plane SITL validates `GPS1_TYPE=14`, velocity source params, ignore flags, fix types, accuracy fields | Exact output control with good telemetry ergonomics | SITL required to prove Plane behavior | ArduPilot Plane params, QGC, tlog/FDR | Link/source validation, status audit | Light CPU load | Sources #10-#12, #24 | Selected |
 ### Component: Security And Safety Controls
 | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
 |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
 | Consistency-gated anchor acceptance | Custom ESKF gates, cache manifest verification | Anchor accepted only if freshness, provenance, RANSAC, covariance, Mahalanobis, and temporal consistency pass | Prevents confident false fixes | Needs calibrated thresholds | Representative replay and Monte Carlo | Rejects stale/poisoned/low-confidence anchors | Lightweight after candidate generation | Facts #16, #17, #28 | Selected |
 | FDR audit trail | Segmented logs + hashes | Logs estimates, inputs, emitted GPS_INPUT, health, tile writes, anchor decisions | Supports incident analysis and cache-poisoning audits | Schema work | 64 GB rollover | Tamper-evident hashes recommended | Sequential writes | AC-NEW-3 | Selected |
 ## Runtime Modes
 | Mode | Trigger | Behavior | `GPS_INPUT` / Telemetry |
 |------|---------|----------|--------------------------|
 | `satellite_anchored` | VPR + local match passes all gates | ESKF absolute update; tile write eligible only if sigma gate passes | 3D fix, `horiz_accuracy` >= semi-major axis of the 95% covariance ellipse |
 | `vo_extrapolated` | VO healthy and anchor age/covariance within bounds | VO/IMU propagation; covariance grows | 3D/2D depending on covariance threshold |
 | `dead_reckoned` | visual blackout or no accepted anchor | IMU-only propagation, monotonic covariance growth | degraded fix type; QGC `VISUAL_BLACKOUT_IMU_ONLY` |
 | failsafe/no-fix | covariance >500 m or blackout >30 s | Stop reporting the position as valid | `fix_type=0`, `horiz_accuracy=999.0`, QGC `VISUAL_BLACKOUT_FAILSAFE` |
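
The mode table can be read as a small priority-ordered decision function. The sketch below is a minimal illustration under stated assumptions: the `Health` fields, the function name, and the evaluation order (failsafe first, then anchor, then VO) are hypothetical; only the 500 m and 30 s thresholds and the failsafe output values come from the table.

```python
from dataclasses import dataclass

FAILSAFE_SIGMA_M = 500.0     # from the failsafe row: covariance >500 m
FAILSAFE_BLACKOUT_S = 30.0   # from the failsafe row: blackout >30 s


@dataclass
class Health:
    """Hypothetical per-cycle health summary feeding the mode machine."""
    anchor_accepted: bool   # VPR + local match passed all gates
    vo_healthy: bool        # VO healthy, anchor age/covariance within bounds
    horiz_sigma_m: float    # current horizontal uncertainty in metres
    blackout_s: float       # seconds since the last usable frame


def select_mode(h: Health) -> str:
    """Pick a runtime mode; failsafe conditions are checked first."""
    if h.horiz_sigma_m > FAILSAFE_SIGMA_M or h.blackout_s > FAILSAFE_BLACKOUT_S:
        return "failsafe"
    if h.anchor_accepted:
        return "satellite_anchored"
    if h.vo_healthy:
        return "vo_extrapolated"
    return "dead_reckoned"


def failsafe_output() -> dict:
    """GPS_INPUT values the table specifies for the failsafe/no-fix row."""
    return {"fix_type": 0, "horiz_accuracy": 999.0}
```

Checking failsafe first means a freshly accepted anchor cannot mask an already exceeded blackout or uncertainty threshold; whether that ordering matches the intended wrapper behaviour is a design decision for the safety/anchor wrapper.
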
 ## Testing Strategy
 ### Integration / Functional Tests
 - VO replay: assert AC-2.1a and AC-2.2 VO MRE on overlapping frame pairs.
 - Satellite anchor replay: assert AC-1.1/1.2, AC-2.2 cross-domain MRE, freshness rejection, and source labels.
 - DINOv2 descriptor fidelity: compare PyTorch/ONNX/TensorRT embeddings and retrieval rankings before accepting optimized engines.
 - FAISS CPU index tests: top-K recall, query latency, index size, save/load behavior on Jetson ARM64.
 - LightGlue extractor matrix: DISK vs ALIKED vs SuperPoint benchmark; SuperPoint output excluded from production unless license approved.
 - COG cache lifecycle: write-new generated tile, update manifest, verify active version and rollback.
 - `GPS_INPUT` SITL: validate fix type, `horiz_accuracy`, velocity fields, ignore flags, `EK3_SRC1_*` parameters, QGC behavior.
 - Security gates: stale tile, mismatched tile hash, low inlier ratio, impossible velocity jump, and spoofed GPS during blackout.
 ### Non-Functional Tests
 - Jetson latency and memory: <400 ms p95, <8 GB shared memory, no thermal throttling at 25 W.
 - Cache budget: 400 km² imagery + manifests + descriptors fits budget or reports explicit split budget.
 - FDR 8-hour load: <=64 GB, rollover logged, no silent payload loss.
 - Monte Carlo false-position and cache-poisoning tests for AC-NEW-4 and AC-NEW-7.
 - Cold boot: first valid `GPS_INPUT` <30 s p95 across 50 runs.
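
The p95 targets above (latency and cold boot) need an agreed percentile definition before they can be asserted. The following sketch uses the nearest-rank method as one possible choice; the function names are hypothetical and the method itself is an assumption, since the source does not say which percentile estimator the test harness uses.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def cold_boot_passes(first_fix_seconds, limit_s=30.0, pct=95):
    """True if the p95 time to first valid GPS_INPUT is under the limit."""
    return percentile_nearest_rank(first_fix_seconds, pct) < limit_s
```

With 50 runs the nearest-rank p95 is the 48th smallest sample, so at most two runs may exceed 30 s under this definition.
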
 ## References
 Detailed source registry: `_docs/00_research/01_source_registry.md`.
 Key added Mode B sources:
 - DINOv2 TensorRT issue: https://github.com/NVIDIA/TensorRT/issues/4348
 - DINOv2 Jetson forum issue: https://forums.developer.nvidia.com/t/dinov2-tensorrt-model-performance-issue/312251
 - LightGlue license discussion: https://github.com/cvg/LightGlue/issues/120
 - ArduPilot GPS_INPUT velocity issue: https://github.com/ArduPilot/ardupilot/issues/19633
 - FAISS install docs: https://github.com/facebookresearch/faiss/blob/main/INSTALL.md
 - Orthophoto visual geolocalization: https://ar5iv.labs.arxiv.org/html/2103.14381
 ## Related Artifacts
 - Tech stack evaluation: `_docs/01_solution/tech_stack.md`
 - Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
 - Fact cards: `_docs/00_research/02_fact_cards.md`
@@ -0,0 +1,59 @@
 # Tech Stack Evaluation
 ## Requirements Analysis
 | Area | Requirement |
 |------|-------------|
 | Runtime | Jetson Orin Nano Super, Ubuntu/JetPack, CUDA/TensorRT available, 8 GB shared memory, 25 W thermal envelope. |
 | Language | Python is acceptable for orchestration/prototyping; C++/TensorRT paths likely needed for hot vision loops. |
 | Vision | Calibration, undistortion, homography, VPR descriptors, local matching, RANSAC verification. |
 | Estimation | BASALT VIO for relative camera+IMU propagation, wrapped by project-owned safety/anchor estimator logic for covariance calibration, source labels, blackout/spoofing modes, and MAVLink output mapping. |
 | Storage | Offline satellite cache, descriptor index, FDR with no raw frame retention. |
 | Autopilot | ArduPilot Plane, `GPS_INPUT` through pymavlink, MAVSDK for telemetry. |
 ## Technology Evaluation
 | Layer | Selected | Alternatives Considered | Rationale | Risk |
 |-------|----------|-------------------------|-----------|------|
 | OS / GPU stack | JetPack Ubuntu + CUDA + TensorRT | Plain Ubuntu without JetPack | Required for Jetson acceleration and profiling. | Thermal/performance tuning. |
 | Calibration / geometry | OpenCV 4.x | Custom NumPy-only geometry | Mature APIs for calibration, undistortion, homography, RANSAC. | Version pin and calibration quality. |
 | VO / estimator | BASALT + project-owned safety/anchor wrapper | OpenVINS, Kimera-VIO, custom OpenCV/ESKF, ORB-SLAM3, VINS-Fusion | BASALT is the selected production VIO candidate; wrapper owns source labels, covariance gates, degraded modes, satellite-anchor acceptance, and MAVLink semantics. | BASALT confidence/covariance must be calibrated; nadir fixed-wing replay required. |
 | VPR descriptors | DINOv2-VLAD / AnyLoc-style, model-size profiled, TensorRT only after fidelity check | MixVPR, SALAD, NetVLAD, classical BoW | Strong retrieval evidence; good offline descriptor model. | Memory/latency and embedding drift if optimized incorrectly. |
 | Vector search | FAISS CPU-first on Jetson ARM64 | HNSWLIB, PostgreSQL/pgvector metadata-assisted search, brute-force NumPy, custom FAISS GPU build | Mature top-K search, save/load, PQ compression; PostgreSQL stores spatial/mission metadata around descriptor files. | CPU query latency must be profiled; GPU FAISS is not default on aarch64. |
 | Local matching | DISK/ALIKED + LightGlue | SuperPoint+LightGlue, SIFT/ORB, LoFTR | Exact match outputs; Apache/BSD-friendly path; adaptive speed knobs. | Jetson profiling needed. |
 | Raster cache | COG + PostgreSQL/PostGIS manifest + signed JSON sidecars | PMTiles, MBTiles, loose tile folders | COG fits geospatial raster and write-new-tile workflow; PostGIS supports spatial/freshness queries. | 10 GB budget pressure and local DB availability. |
 | MAVLink | MAVSDK telemetry + pymavlink `GPS_INPUT` | MAVSDK-only, MAVProxy bridge | MAVSDK does telemetry well; `GPS_INPUT` needs raw field control. | Plane SITL validation. |
 | FDR | PostgreSQL event index + CBOR/binary payload segments with optional Parquet export | Raw images, plain CSV | Queryable event metadata, bounded payload segments, no raw frame retention. | Schema and local DB availability. |
 | Testing | ArduPilot Plane SITL + AerialVL/VPAir + EuRoC + representative flight replay | Public datasets only | No public dataset covers all ACs. | Representative data collection required. |
 ## Tech Stack Summary
 - **Primary implementation**: Python for orchestration, test harness, cache tooling, and MAVLink integration; C++/TensorRT for hot-path vision if profiling requires it.
 - **Vision utilities**: OpenCV 4.x.
 - **Estimator**: BASALT VIO plus project-owned safety/anchor wrapper and mode machine.
 - **Global retrieval**: DINOv2-VLAD style descriptors with CPU-first FAISS top-K search.
 - **Local matching**: DISK/ALIKED + LightGlue; SuperPoint only after license review.
 - **Cache**: COG imagery, PostgreSQL/PostGIS manifest metadata, signed JSON sidecars, FAISS index files.
 - **Autopilot**: MAVSDK subscriptions plus pymavlink `GPS_INPUT`.
 - **Validation**: public datasets for component de-risking, Plane SITL for integration, representative flight/replay data for acceptance.
 ## Risk Assessment
 | Risk | Impact | Mitigation |
 |------|--------|------------|
 | VPR/local matching exceeds Jetson latency | AC-4.1 failure | Conditional VPR, top-K caps, downsampled descriptors, CPU FAISS profiling, TensorRT only after embedding-fidelity checks. |
 | Descriptor cache exceeds 10 GB | AC-8.3/storage failure | PQ/compression, multi-scale budget report, split descriptor budget if needed. |
 | GPL library accidentally becomes production dependency | Licensing issue | Keep OpenVINS/ORB-SLAM3 as reference only unless legal approves; BASALT is the production VIO candidate. |
 | BASALT covariance under-reports real error | False-position safety budget failure | Calibrate wrapper covariance against OpenVINS covariance, ground truth replay, and satellite anchor residuals. |
 | Plane failsafe differs from Copter docs | Safety behavior mismatch | Production-parameter ArduPilot Plane SITL gate. |
 | `GPS_INPUT` velocity ignore flags behave unexpectedly | EKF drift or false velocity fusion | SITL tests for velocity fields, ignore flags, and `EK3_SRC1_*` source parameters. |
 | Public datasets fail to represent Ukrainian agricultural terrain | False confidence | Require representative synchronized flight/replay data before AC signoff. |
 | Thermal throttling | Latency regression | Hot-soak test and throttle logging per AC-NEW-5. |
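
The `GPS_INPUT` ignore-flag risk is easier to test when field construction is isolated from the link layer. The sketch below builds the message fields as a plain dictionary, using the `GPS_INPUT_IGNORE_FLAGS` bit values from the MAVLink common message set; the helper name, its parameters, and the choice to always ignore HDOP, VDOP, and speed accuracy are assumptions made for illustration. It deliberately does not call pymavlink.

```python
# Bit values of GPS_INPUT_IGNORE_FLAGS (MAVLink common message set).
IGNORE_ALT = 1
IGNORE_HDOP = 2
IGNORE_VDOP = 4
IGNORE_VEL_HORIZ = 8
IGNORE_VEL_VERT = 16
IGNORE_SPEED_ACCURACY = 32
IGNORE_HORIZONTAL_ACCURACY = 64
IGNORE_VERTICAL_ACCURACY = 128


def build_gps_input_fields(time_usec, lat_deg, lon_deg, alt_m, fix_type,
                           horiz_accuracy_m, vert_accuracy_m, velocity_ned=None):
    """Hypothetical helper: map an estimate to GPS_INPUT field values."""
    # HDOP, VDOP and speed accuracy are not modelled in this sketch.
    ignore = IGNORE_HDOP | IGNORE_VDOP | IGNORE_SPEED_ACCURACY
    if velocity_ned is None:
        # No trustworthy velocity: tell the autopilot to ignore those fields.
        ignore |= IGNORE_VEL_HORIZ | IGNORE_VEL_VERT
        vn = ve = vd = 0.0
    else:
        vn, ve, vd = velocity_ned
    return {
        "time_usec": time_usec,
        "gps_id": 0,
        "ignore_flags": ignore,
        "time_week_ms": 0,
        "time_week": 0,
        "fix_type": fix_type,
        "lat": round(lat_deg * 1e7),   # degrees * 1e7, as GPS_INPUT expects
        "lon": round(lon_deg * 1e7),
        "alt": alt_m,
        "hdop": 0.0,
        "vdop": 0.0,
        "vn": vn,
        "ve": ve,
        "vd": vd,
        "speed_accuracy": 0.0,
        "horiz_accuracy": horiz_accuracy_m,
        "vert_accuracy": vert_accuracy_m,
        "satellites_visible": 0,
    }
```

A SITL test could then assert on `ignore_flags` and the velocity fields before the values are handed to the pymavlink sender; the exact sender signature should be checked against the generated dialect in use.
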
 ## Learning / Implementation Requirements
 - Jetson profiling with CUDA/TensorRT and memory instrumentation.
 - ArduPilot Plane SITL, `GPS_INPUT`, GPS spoof/failsafe parameters.
 - Geospatial raster formats: COG, CRS, tile matrices, m/px metadata, manifests.
 - BASALT integration, covariance calibration, and Mahalanobis rejection in the safety/anchor wrapper.
 - Aerial VPR benchmark methodology and georeference recall.
@@ -0,0 +1,148 @@
 # GPS-Denied Onboard Localization — Planning Report
 ## Executive Summary
 The solution planning phase decomposed the GPS-denied onboard localization service into 8 runtime implementation components, 2 cross-cutting foundation epics, a bootstrap epic, and separate e2e/blackbox test epics. The architecture centers on a Jetson-hosted hot path using camera ingest, BASALT VIO, and a project-owned safety/anchor wrapper, with triggered Satellite Service candidate retrieval and ALIKED/DISK-LightGlue anchor verification against an offline PostgreSQL/PostGIS-backed cache.
 Jira epics were created in project `AZ` from AZ-206 through AZ-218. Total estimated effort across epics is approximately 87-141 story points, with large work intentionally decomposed into child tasks of 2, 3, or 5 points where possible.
 ## Problem Statement
 The system must provide reliable onboard WGS84 localization when GPS is denied or spoofed, using a fixed nadir camera, flight-controller telemetry, and an offline satellite cache. It must emit ArduPilot-compatible position estimates, report confidence honestly, degrade safely under blackout, and preserve enough forensic evidence for post-flight analysis without retaining raw frames.
 ## Architecture Overview
 The system is a trigger-based hybrid estimator. Normal flight uses camera ingest, pre-VIO occlusion checks, BASALT VIO, and a safety/anchor wrapper. Relocalization triggers use DINOv2-VLAD, FAISS, ALIKED/DISK-LightGlue, and OpenCV RANSAC against the offline cache. The wrapper is the safety authority for covariance, source labels, degraded modes, tile-write eligibility, and MAVLink output semantics.
 **Technology stack**: Python orchestration, C++/native vision paths where needed, OpenCV 4.x, BASALT, DINOv2-VLAD, FAISS CPU, ALIKED/DISK-LightGlue, PostgreSQL/PostGIS, COG, CBOR FDR segments, MAVSDK + pymavlink.
 **Deployment**: Local onboard Jetson runtime with Docker/replay and Plane SITL for validation; release gates require Jetson hardware, Plane SITL, and representative synchronized replay data.
 ## Component Summary
 | # | Component | Purpose | Dependencies | Epic |
 |---|-----------|---------|--------------|------|
 | 01 | Camera Ingest And Calibration | Ingest frames, validate calibration, detect total occlusion before VIO | Bootstrap, shared geometry/time, config/errors | AZ-209 |
 | 02 | VIO Adapter | Wrap the selected relative VIO backend and emit replaceable state DTOs | Camera, MAVLink telemetry, shared helpers | AZ-213 |
 | 03 | Safety And Anchor Wrapper | Own localization state, covariance, anchors, blackout/failsafe, output semantics | Camera, MAVLink, VIO, anchor verification | AZ-216 |
 | 04 | Satellite Service | Sync mission cache packages before flight, upload generated-tile packages after flight, and retrieve local VPR candidates from cache descriptors and FAISS | Camera, Tile Manager, shared helpers | AZ-214 |
 | 05 | Anchor Verification | Verify retrieved candidates with learned matching and RANSAC | Satellite Service, camera, Tile Manager | AZ-215 |
 | 06 | Tile Manager | Manage COGs, PostGIS manifests, sidecars, freshness, and orthorectified generated tiles | Bootstrap, shared helpers, config/errors | AZ-211 |
 | 07 | MAVLink And GCS Integration | Consume FC telemetry and emit v1 `GPS_INPUT`/QGC status | Bootstrap, config/errors | AZ-210 |
 | 08 | FDR And Observability | Record bounded replayable evidence and status | Bootstrap, config/errors, runtime DTOs | AZ-212 |
 | Test | E2E Test Suite | Separate black-box replay, SITL, Jetson, and release evidence tests; not onboard runtime | All runtime components | AZ-217 |
 **Implementation order**:
 1. Bootstrap and cross-cutting foundations: AZ-206, AZ-207, AZ-208.
 2. Independent adapters/stores: AZ-209, AZ-210, AZ-211, AZ-212.
 3. Estimation/relocalization: AZ-213, AZ-214, AZ-215.
 4. Safety orchestration: AZ-216.
 5. Separate e2e/blackbox test implementation: AZ-217, AZ-218.
 ## System Flows
 | Flow | Description | Key Components |
 |------|-------------|----------------|
 | Pre-flight cache preparation | Validate offline cache, sidecars, descriptors, and indexes | Satellite Service, Tile Manager |
 | Normal frame processing | Route usable frames through BASALT; route total occlusion to IMU-only degraded path | Camera, BASALT, safety, MAVLink, FDR |
 | Satellite relocalization | Retrieve and verify cache candidates, then accept/reject anchors | Safety, Satellite Service, anchor verification, Tile Manager |
 | Visual blackout / spoofing | Propagate IMU-only from last trusted state and fail safe at thresholds | Camera, safety, MAVLink, QGC, FDR |
 | Generated tile lifecycle | Write generated COG candidates only under covariance/quality gates | Safety, Tile Manager, FDR |
 | Post-flight sync and audit | Package generated tiles and FDR evidence | Tile Manager, FDR, Satellite Service |
 | Validation replay | Exercise runtime through public interfaces | Validation harness, all runtime components |
 See `system-flows.md` for full diagrams and details.
 ## Risk Summary
 | Level | Count | Key Risks |
 |-------|-------|-----------|
 | Critical | 0 | None |
 | High | 7 | Camera spec mismatch, BASALT nadir fit, covariance under-reporting, total occlusion false-negative, IMU-only over-trust, Jetson trigger-path performance, PostgreSQL/PostGIS availability |
 | Medium | 5 | Cache poisoning, dataset coverage/licensing, FDR append pressure, GPL/non-commercial leakage, generated tile promotion risk |
 | Low | 0 | None |
 **Iterations completed**: 1  
 **All Critical/High risks mitigated**: Yes. High risks have concrete gates in architecture, component specs, and tests.
 See `risk_mitigations.md` for the full register.
 ## Test Coverage
 | Component | Integration | Performance | Security | Acceptance | AC Coverage |
 |-----------|-------------|-------------|----------|------------|-------------|
 | Camera Ingest And Calibration | 3 | 1 | 1 | 2 | 7 ACs |
 | VIO Adapter | 4 | 1 | 1 | 1 | 8 ACs |
 | Safety And Anchor Wrapper | 7 | 1 | 1 | 3 | 15 ACs |
 | Satellite Service | 4 | 2 | 1 | 1 | 10 ACs |
 | Anchor Verification | 2 | 1 | 2 | 1 | 9 ACs |
 | Tile Manager | 4 | 1 | 3 | 1 | 10 ACs |
 | MAVLink And GCS Integration | 6 | 2 | 1 | 1 | 10 ACs |
 | FDR And Observability | 6 | 1 | 1 | 1 | 11 ACs |
 | E2E Test Suite | 9 | 2 | 1 | 2 | All AC groups |
 **Overall acceptance criteria coverage**: 39 / 39 acceptance criteria covered (100%).  
 **Restrictions coverage**: 10 / 10 restriction groups covered (100%).
 ## Epic Roadmap
 | Order | Epic | Component / Concern | Effort | Dependencies |
 |-------|------|---------------------|--------|--------------|
 | 1 | AZ-206: Bootstrap & Initial Structure | Scaffold | M / 5-8 pts | none |
 | 2 | AZ-207: Cross-Cutting: Shared Geometry And Time Sync | Shared helper | S-M / 3-5 pts | AZ-206 |
 | 3 | AZ-208: Cross-Cutting: Runtime Configuration And Errors | Shared helper | S-M / 3-5 pts | AZ-206 |
 | 4 | AZ-209: Camera Ingest And Calibration | Component 01 | M / 5-8 pts | AZ-206, AZ-207, AZ-208 |
 | 5 | AZ-210: MAVLink And GCS Integration | Component 07 | M / 5-8 pts | AZ-206, AZ-208 |
 | 6 | AZ-211: Tile Manager | Component 06 | L / 8-13 pts | AZ-206, AZ-207, AZ-208 |
 | 7 | AZ-212: FDR And Observability | Component 08 | M / 5-8 pts | AZ-206, AZ-208 |
 | 8 | AZ-213: VIO Adapter | Component 02 | L / 8-13 pts | AZ-209, AZ-210 |
 | 9 | AZ-214: Satellite Service | Component 04 | L / 8-13 pts | AZ-209, AZ-211 |
 | 10 | AZ-215: Anchor Verification | Component 05 | L / 8-13 pts | AZ-214, AZ-209, AZ-211 |
 | 11 | AZ-216: Safety And Anchor Wrapper | Component 03 | XL / 13-21 pts | AZ-209, AZ-210, AZ-213, AZ-215 |
 | 12 | AZ-217: E2E Test Suite | Separate test support | L / 8-13 pts | Component epics |
 | 13 | AZ-218: Blackbox Tests | System tests | L / 8-13 pts | AZ-217, component epics |
 **Total estimated effort**: 87-141 story points.
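
The roadmap ordering and the effort total can be cross-checked mechanically. The sketch below transcribes the table into a list and verifies that every dependency appears earlier in the order, then sums the point ranges. It assumes that "component epics" means AZ-209 through AZ-216; that expansion and the helper names are assumptions, not something the table states.

```python
# Assumption: "component epics" expands to AZ-209 .. AZ-216.
COMPONENT_EPICS = [f"AZ-{n}" for n in range(209, 217)]

# (epic, (low, high) story points, dependencies), in roadmap order.
ROADMAP = [
    ("AZ-206", (5, 8), []),
    ("AZ-207", (3, 5), ["AZ-206"]),
    ("AZ-208", (3, 5), ["AZ-206"]),
    ("AZ-209", (5, 8), ["AZ-206", "AZ-207", "AZ-208"]),
    ("AZ-210", (5, 8), ["AZ-206", "AZ-208"]),
    ("AZ-211", (8, 13), ["AZ-206", "AZ-207", "AZ-208"]),
    ("AZ-212", (5, 8), ["AZ-206", "AZ-208"]),
    ("AZ-213", (8, 13), ["AZ-209", "AZ-210"]),
    ("AZ-214", (8, 13), ["AZ-209", "AZ-211"]),
    ("AZ-215", (8, 13), ["AZ-214", "AZ-209", "AZ-211"]),
    ("AZ-216", (13, 21), ["AZ-209", "AZ-210", "AZ-213", "AZ-215"]),
    ("AZ-217", (8, 13), COMPONENT_EPICS),
    ("AZ-218", (8, 13), ["AZ-217"] + COMPONENT_EPICS),
]


def order_violations(roadmap):
    """Return (epic, dependency) pairs where the dependency is not scheduled earlier."""
    seen, violations = set(), []
    for epic, _, deps in roadmap:
        violations.extend((epic, dep) for dep in deps if dep not in seen)
        seen.add(epic)
    return violations


def total_effort(roadmap):
    """Sum the low and high ends of each epic's story-point range."""
    low = sum(points[0] for _, points, _ in roadmap)
    high = sum(points[1] for _, points, _ in roadmap)
    return low, high
```

Under these assumptions `order_violations(ROADMAP)` is empty and `total_effort(ROADMAP)` returns `(87, 141)`, which agrees with the stated total.
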
 ## Key Decisions Made
 | # | Decision | Rationale | Alternatives Rejected |
 |---|----------|-----------|----------------------|
 | 1 | Use BASALT as production VIO candidate | Permissive license and strong VIO benchmark fit | OpenVINS production dependency, custom VIO from scratch |
 | 2 | Keep safety/anchor wrapper as authority | Product semantics require calibrated covariance, labels, gates, failsafe, MAVLink mapping | Letting BASALT/OpenVINS own output safety |
 | 3 | Use ALIKED/DISK-LightGlue for anchor verification | Strong local correspondences for cross-domain verification | Per-frame learned matcher as primary VIO hot path |
 | 4 | Add pre-VIO total-occlusion gate | Safer and cheaper than feeding fully unusable frames to VIO | Letting BASALT detect all visual failures |
 | 5 | Use PostgreSQL/PostGIS for structured metadata | User confirmed PostgreSQL; PostGIS fits spatial cache/mission metadata | JSON-only or embedded single-file metadata DB |
 | 6 | Use CBOR FDR payload segments with PostgreSQL index | Keeps high-volume append data bounded and queryable | Raw-frame retention, plain CSV, Parquet as runtime primary |
 | 7 | v1 emits `GPS_INPUT` only | Avoid ArduPilot EKF3 double-fusion risk in v1 | Parallel `ODOMETRY` in v1 |
 ## Open Questions
 | # | Question | Impact | Assigned To |
 |---|----------|--------|-------------|
 | 1 | Exact ADTi camera lens, interface, sustained FPS, and temperature spec | Blocks final camera calibration and runtime FPS assumptions | Hardware/product owner |
 | 2 | Final representative synchronized target dataset collection timing | Blocks final acceptance, though public datasets can de-risk | Project/product owner |
 | 3 | Dataset license approval for ALTO/Kagaru/EPFL/VPAir/UZH FPV use | Blocks commercial acceptance evidence for restricted datasets | Legal/product owner |
 | 4 | Local onboard PostgreSQL/PostGIS deployment profile | Blocks implementation details for DB persistence and health checks | Backend/runtime owner |
 ## Artifact Index
 | File | Description |
 |------|-------------|
 | `glossary.md` | Confirmed project glossary |
 | `architecture.md` | System architecture and ADRs |
 | `data_model.md` | System data model and storage strategy |
 | `system-flows.md` | Main runtime and validation flows |
 | `deployment/containerization.md` | Container/replay strategy |
 | `deployment/ci_cd_pipeline.md` | CI/CD and release gates |
 | `deployment/environment_strategy.md` | Environment and dataset strategy |
 | `deployment/observability.md` | Runtime signals, logs, and alerts |
 | `deployment/deployment_procedures.md` | Deployment, rollback, and health checks |
 | `components/*/description.md` | Component specifications |
 | `components/*/tests.md` | Component test specifications |
 | `common-helpers/*.md` | Shared helper specifications |
 | `diagrams/component_overview.md` | Component overview Mermaid diagram |
 | `diagrams/flows/*.md` | Flow-specific Mermaid diagrams |
 | `risk_mitigations.md` | Risk register and mitigations |
 | `epics.md` | Jira epic mapping and dependency roadmap |
 | `FINAL_report.md` | This final planning report |
@@ -0,0 +1,241 @@
 # GPS-Denied Onboard Localization — Architecture
 ## Architecture Vision
 Build a Jetson-hosted onboard localization pipeline for fixed-wing GPS-denied flight. The hot path fuses fixed nadir camera frames and FC telemetry through OpenCV geometry, BASALT VIO, and a project-owned safety/anchor wrapper that emits calibrated `GPS_INPUT` estimates and QGC/FDR status. A triggered satellite-anchor path uses DINOv2-VLAD, CPU FAISS, ALIKED/DISK+LightGlue, and RANSAC against the offline cache; generated tiles are written back only with strict provenance and covariance gates.
 ### Components / Responsibilities
 - Camera ingest/calibration: load frames, apply intrinsics/extrinsics, validate image quality.
 - VIO adapter: produce relative camera+IMU motion from synchronized nav frames and FC IMU.
 - Safety/anchor wrapper: own covariance calibration, source labels, degraded modes, anchor fusion, and `GPS_INPUT`.
 - Satellite Service: sync mission cache packages before flight, upload generated-tile packages after flight, and serve local VPR candidate retrieval from the offline cache.
 - Anchor verification: run local matching/RANSAC and reject unsafe anchors.
 - Tile Manager: manage COGs, manifests, freshness/provenance, orthorectified generated tiles, and local tile metadata.
 - MAVLink/GCS integration: consume FC telemetry and emit `GPS_INPUT`/QGC status.
 - FDR/observability: record replayable mission evidence under storage caps.
 - Validation harness: run still-image, public dataset, SITL, Jetson, and representative replay tests.
 ### Principles / Non-Negotiables
 - No in-flight satellite-provider or Satellite Service calls; runtime uses offline cache only.
 - BASALT is a VIO component, not the safety authority.
 - Confidence must be honest; covariance must grow in degraded modes.
 - Heavy VPR/local matching is trigger-based, not per-frame.
 - Raw nav/AI frames are not retained in normal operation.
 - GPL VIO libraries remain reference-only unless explicitly approved.
 - Plane SITL and Jetson hardware are release gates.
 - Public datasets can de-risk, but representative synchronized flight data is required for final acceptance.
 ## 1. System Context
 **Problem being solved**: During fixed-wing flight, GPS may be denied or spoofed. The onboard system must estimate WGS84 coordinates for navigation-camera frame centers and detected objects, stream `GPS_INPUT` to ArduPilot Plane, report confidence honestly, and maintain safety during VO failure, stale imagery, spoofing, and visual blackout.
 **System boundaries**:
 - In scope: onboard localization runtime, offline cache consumption, BASALT VIO integration, satellite anchor verification, MAVLink output, QGC status, FDR, generated tile metadata, and a separate e2e/black-box test suite.
 - Out of scope: upstream commercial satellite-provider sourcing, Satellite Service ingest implementation, AI mission-camera detection itself, PX4 support, raw-frame retention as a normal operating mode.
 **External systems**:
 | System | Integration Type | Direction | Purpose |
 |--------|------------------|-----------|---------|
 | ArduPilot Plane FC | MAVLink | Inbound/Outbound | FC telemetry in, `GPS_INPUT` and status out |
 | QGroundControl | MAVLink telemetry | Outbound | Downsampled operator status and failsafe messages |
 | Azaion Suite Satellite Service | Offline file/cache sync | Inbound before flight, outbound after landing | Provides mission cache packages and receives generated-tile packages; never called mid-flight |
 | Public/replay datasets | File/rosbag/fixture | Inbound to validation | De-risk BASALT, VPR, and anchor logic |
 ## 2. Technology Stack
 | Layer | Technology | Version / Mode | Rationale |
 |-------|------------|----------------|-----------|
 | OS / GPU stack | JetPack Ubuntu + CUDA/TensorRT/ONNX Runtime | Jetson Orin Nano Super target | Required for production hardware profiling |
 | Runtime language | Python + C++ | Python orchestration; C++ for BASALT/hot vision paths | Fits MAVLink/test tooling and native VIO dependencies |
 | Geometry | OpenCV 4.x | Calibration, undistortion, homography, RANSAC/USAC | Mature utility layer |
 | VIO | BASALT | Production candidate | BSD-friendly, strong benchmark evidence |
 | VIO reference | OpenVINS | Reference/covariance baseline only | Strong EKF covariance story; GPLv3 risk |
 | Backup VIO | Kimera-VIO | Backup candidate | BSD-friendly fallback with mono caveats |
 | Local matching | ALIKED/DISK + LightGlue | Anchor verification and optional VO fallback | Strong learned correspondences; profile before hot-path use |
 | Retrieval | DINOv2-VLAD + CPU FAISS | Triggered VPR only | Robust candidate retrieval under cache/offline constraints |
 | Structured metadata DB | PostgreSQL + PostGIS | Onboard/local deployment | Spatial cache manifests, mission state, generated-tile metadata, and FDR event indexes |
 | Cache imagery | COG + PostgreSQL/PostGIS manifest + signed JSON sidecars | Write-new COG objects | Efficient geospatial rasters with queryable spatial metadata and auditable sidecars |
 | FDR | PostgreSQL event index + CBOR segment payloads, optional Parquet export | Per-flight rollover | Queryable event metadata with compact bounded payload segments |
 | MAVLink | MAVSDK + pymavlink | MAVSDK telemetry, pymavlink `GPS_INPUT` | Exact output control |
 **Key constraints from restrictions.md**:
 - Jetson has 8 GB shared memory and 25 W thermal envelope, so heavy VPR/local matching cannot run every frame.
 - Runtime must be offline with respect to satellite providers, so all imagery and descriptors are preloaded.
 - The camera is fixed nadir; all VO choices must be validated against low-parallax/planar terrain.
 - ADTi public specs conflict with current assumptions on resolution, continuous FPS, and operating temperature; manufacturer specs must be pinned before implementation.
 ## 3. Deployment Model
 **Environments**: Development replay, public-dataset replay, Jetson hardware validation, Plane SITL, representative flight/replay rig.
 **Infrastructure**:
 - Onboard production runtime runs on the Jetson companion computer, not in cloud.
 - Replay/test infrastructure may use Docker for deterministic fixture tests.
 - Release gates require local Jetson hardware and ArduPilot Plane SITL.
 **Environment-specific configuration**:
 | Config | Development | Production |
 |--------|-------------|------------|
 | Satellite cache | Small fixture cache | Preloaded operational-area cache |
 | Descriptor index | Fixture FAISS index | CPU-first FAISS index with PQ/IVF if needed |
 | Secrets/signing | Local test keys | Mission/cache signing keys from Suite process |
 | FDR | Local temp output | Per-flight bounded NVMe storage |
 | MAVLink | SITL/replay | Physical FC telemetry link |
 ## 4. Data Model Overview
 **Core entities**:
 | Entity | Description | Owned By Component |
 |--------|-------------|--------------------|
 | FrameRecord | Navigation-camera frame metadata, total-occlusion status, and processing status | Camera ingest/calibration |
 | TelemetrySample | FC IMU, attitude, airspeed, altitude, GPS health | MAVLink/GCS integration |
 | VioState | Backend-relative pose/velocity/bias output and quality metadata | VIO adapter |
 | PositionEstimate | WGS84 estimate, covariance, source label, fix type, anchor age | Safety/anchor wrapper |
 | VprChunk | Retrieval unit over cache imagery and descriptors | Satellite Service |
 | AnchorCandidate | Retrieved tile/chunk with local-match and RANSAC evidence | Anchor verification |
 | CacheTile | COG tile plus manifest and sidecar metadata | Tile Manager |
 | GeneratedTile | In-flight orthorectified tile with trust/provenance metadata | Tile Manager |
 | FdrSegment | Bounded replayable log segment | FDR/observability |
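
As an illustration of how one of these entities might be typed, the sketch below models `PositionEstimate` as an immutable dataclass. The field names, units, covariance layout, and the `SourceLabel` values are assumptions chosen to mirror the description column and the runtime mode names; the actual DTO is defined in `data_model.md`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class SourceLabel(Enum):
    """Hypothetical source labels, mirroring the runtime mode names."""
    SATELLITE_ANCHORED = "satellite_anchored"
    VO_EXTRAPOLATED = "vo_extrapolated"
    DEAD_RECKONED = "dead_reckoned"
    NO_FIX = "no_fix"


@dataclass(frozen=True)
class PositionEstimate:
    """WGS84 estimate with covariance, source label, fix type, and anchor age."""
    lat_deg: float
    lon_deg: float
    alt_m: float
    # Horizontal covariance in m^2 as ((ee, en), (ne, nn)); layout is an assumption.
    cov_en_m2: Tuple[Tuple[float, float], Tuple[float, float]]
    source: SourceLabel
    fix_type: int
    anchor_age_s: float

    def __post_init__(self):
        # Reject values that can never describe a valid WGS84 estimate.
        if not -90.0 <= self.lat_deg <= 90.0:
            raise ValueError("lat_deg out of range")
        if not -180.0 <= self.lon_deg <= 180.0:
            raise ValueError("lon_deg out of range")
        if self.anchor_age_s < 0.0:
            raise ValueError("anchor_age_s must be non-negative")
```

Freezing the dataclass keeps estimates immutable once emitted, which fits the append-only FDR usage described elsewhere in this document.
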
 **Data flow summary**:
 - Frame quality/total-occlusion gate + telemetry -> BASALT VIO when usable, or IMU-only degraded mode when not -> safety/anchor wrapper -> `GPS_INPUT`, QGC, FDR.
 - Relocalization trigger -> DINOv2-VLAD/FAISS -> ALIKED/DISK+LightGlue/RANSAC -> accepted/rejected anchor.
 - High-confidence pose + frame -> generated tile -> manifest/sidecar -> post-flight Satellite Service sync.
 ## 5. Integration Points
 ### Internal Communication
 | From | To | Protocol | Pattern | Notes |
 |------|----|----------|---------|-------|
 | Camera ingest/calibration | VIO adapter | In-process queue or shared frame bus | Streaming | Timestamp discipline is critical |
 | MAVLink telemetry | VIO adapter | In-process telemetry buffer | Streaming | IMU/attitude/altitude sync |
 | VIO adapter | Safety/anchor wrapper | Typed state messages | Streaming | Wrapper calibrates confidence |
 | Safety/anchor wrapper | Satellite Service | Command | Triggered local request | Uses only preloaded cache/index data during flight |
 | Satellite Service | Anchor verification | Candidate list | Request-response | Dynamic top-K |
 | Anchor verification | Safety/anchor wrapper | Anchor decision | Request-response | Includes MRE/inliers/provenance |
 | Safety/anchor wrapper | MAVLink/GCS integration | Position/status DTO | Streaming | `GPS_INPUT` emitted frame-by-frame |
 | Safety/anchor wrapper | FDR/observability | Append-only events | Streaming | Bounded segments |
 ### External Integrations
 | External System | Protocol | Auth | Failure Mode |
 |-----------------|----------|------|--------------|
 | ArduPilot Plane | MAVLink | Source/system ID allowlist | Degrade/failsafe; never trust spoofed GPS blindly |
 | QGroundControl | MAVLink | FC telemetry path | Downsampled status may be delayed but local FDR remains authoritative |
 | Azaion Suite Satellite Service | Offline package sync | Signed manifests/sidecars | Missing/stale cache causes degraded mode, not mid-flight network fetch |
 | Public datasets | File/rosbag | License constraints | Not final acceptance unless representative and license-compatible |
 ## 6. Non-Functional Requirements
 | Requirement | Target | Measurement | Priority |
 |-------------|--------|-------------|----------|
 | Frame latency | <400 ms p95 | Capture/replay timestamp to emitted estimate | High |
 | Memory | <8 GB shared | Jetson monitoring | High |
 | First fix | <30 s p95 | 50 cold starts | High |
 | Thermal | No throttle at 25 W / +50 °C | 8-hour hot-soak | High |
 | FDR storage | <=64 GB/flight | 8-hour synthetic load | High |
 | Cache storage | ~10 GB persistent budget | Full mission cache accounting | High |
 | False position | P(error >500 m) <0.1%, >1 km <0.01% | Monte Carlo/replay | High |
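
The false-position row can be evaluated from Monte Carlo or replay output as simple exceedance fractions. The sketch below is a minimal version under stated assumptions: the function names are hypothetical, and errors are taken as per-trial horizontal position errors in metres.

```python
def exceedance_rate(errors_m, threshold_m):
    """Fraction of trials whose position error exceeds the threshold."""
    if not errors_m:
        raise ValueError("errors_m must not be empty")
    return sum(1 for e in errors_m if e > threshold_m) / len(errors_m)


def false_position_budget_ok(errors_m):
    """Check P(error > 500 m) < 0.1% and P(error > 1 km) < 0.01%."""
    return (exceedance_rate(errors_m, 500.0) < 0.001
            and exceedance_rate(errors_m, 1000.0) < 0.0001)
```

Because the smallest non-zero rate observable from N trials is 1/N, a run needs more than 10,000 trials before a single >1 km error can still satisfy the 0.01% bound; smaller runs can only pass that bound with zero exceedances.
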
 ## 7. Security Architecture
 **Authentication / trust boundary**:
 - Runtime accepts only local cache files with valid manifest/signature/provenance.
 - MAVLink input is filtered by expected source/system IDs and FC health semantics.
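
The source/system ID filter can be sketched as an allowlist lookup. The allowlist contents and function names below are assumptions for illustration; system ID 1 with component ID 1 is a common autopilot default, but the deployed values must come from configuration. With pymavlink, the IDs would typically be read from a received message via `get_srcSystem()` and `get_srcComponent()`.

```python
# Hypothetical allowlist of (system_id, component_id) pairs.
ALLOWED_SOURCES = frozenset({(1, 1)})


def accept_message(system_id, component_id, allowed=ALLOWED_SOURCES):
    """Accept telemetry only from an explicitly allowed MAVLink source."""
    return (system_id, component_id) in allowed


def filter_messages(messages, allowed=ALLOWED_SOURCES):
    """Split (system_id, component_id, payload) tuples into accepted and rejected."""
    accepted, rejected = [], []
    for system_id, component_id, payload in messages:
        target = accepted if accept_message(system_id, component_id, allowed) else rejected
        target.append((system_id, component_id, payload))
    return accepted, rejected
```

Keeping the rejected list lets the FDR record filtered traffic as audit evidence instead of dropping it silently.
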
 **Data protection**:
 - At rest: FDR and cache sidecars should be integrity protected; mission secrets/signing keys are not stored in code.
 - In transit: no in-flight satellite-provider or Satellite Service network dependency; MAVLink link security depends on FC/GCS deployment.
 **Audit logging**:
 - FDR records estimates, covariance, anchors, rejected anchors, cache validation failures, spoofing/blackout transitions, emitted `GPS_INPUT`, resource health, and tile-write decisions.
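
One way to make the FDR tamper-evident is to chain segment hashes so that altering any earlier segment invalidates every later hash. The sketch below illustrates the idea with SHA-256 from the standard library; the record layout, the all-zero genesis value, and the function names are assumptions, and the project may choose a different integrity scheme.

```python
import hashlib

GENESIS_HASH = "0" * 64  # assumed starting value for the first segment


def chain_hash(prev_hash: str, payload: bytes) -> str:
    """Hash the previous segment hash together with this segment's payload."""
    return hashlib.sha256(bytes.fromhex(prev_hash) + payload).hexdigest()


def append_segment(chain: list, payload: bytes) -> None:
    """Append a payload and its chained hash to the in-memory chain."""
    prev = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({"payload": payload, "hash": chain_hash(prev, payload)})


def verify_chain(chain: list) -> bool:
    """Recompute every hash in order; any edit breaks the chain from that point."""
    prev = GENESIS_HASH
    for entry in chain:
        if chain_hash(prev, entry["payload"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

A plain hash chain detects modification only if the final hash is stored or reported somewhere the attacker cannot rewrite, so it complements rather than replaces signed manifests.
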
 ## 8. Key Architectural Decisions
 ### ADR-001: BASALT As Production VIO Candidate
 **Context**: A naive OpenCV-only VIO implementation is risky, while OpenVINS has GPLv3 production constraints.
 **Decision**: Use BASALT as the production relative VIO candidate and keep OpenVINS as covariance/reference baseline.
 **Alternatives considered**:
 1. OpenVINS as production core — rejected by default because of GPLv3 and generic VIO ownership.
 2. Kimera-VIO — retained as a backup for its BSD license, despite mono-inertial caveats.
 3. Fully custom OpenCV/ESKF — fallback only because implementation burden is high.
 **Consequences**: The safety/anchor wrapper must calibrate confidence around BASALT and prove it on representative data.
 ### ADR-002: ALIKED-LightGlue Role
 **Context**: ALIKED-LightGlue can produce strong local correspondences and can support frame-to-frame homography/pose estimation.
 **Decision**: Use ALIKED/DISK+LightGlue for satellite-anchor verification and evaluate it as an optional VO fallback/keyframe-assist path, not as the default BASALT replacement.
 **Alternatives considered**:
 1. Per-frame ALIKED-LightGlue VO hot path — deferred until Jetson profiling proves latency/memory fit.
 2. SIFT/ORB-only matching — retained as regression baseline, weaker under cross-domain conditions.
 3. SuperPoint+LightGlue — license-gated.
 **Consequences**: Implementation tasks must benchmark ALIKED-LightGlue on frame-to-frame VO and cross-domain anchor workloads separately.
 ### ADR-003: Cache Metadata Format
 **Context**: JSON is simple and auditable, but operational cache queries need spatial indexing, freshness filters, update safety, and integration with the project PostgreSQL database.
 **Decision**: Use PostgreSQL with PostGIS as the primary cache manifest/index database, with signed JSON sidecars for each tile/generated tile for auditability and interchange.
 **Alternatives considered**:
 1. JSON-only manifest — simpler, but weak for query/update scale, spatial search, and consistency.
 2. Embedded single-file metadata DB — efficient for small deployments, but rejected because the project will use PostgreSQL/PostGIS.
 **Consequences**: The Tile Manager owns PostgreSQL migrations, PostGIS indexes, signature checks, generated-tile orthorectification metadata, and sidecar/db consistency.
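The signature and sidecar/db consistency checks named above can be sketched as follows. This is a minimal illustration, assuming sidecars are signed with HMAC-SHA256 over canonical JSON; the source does not specify the signature scheme, and `sign_sidecar`, `verify_sidecar`, `sidecar_matches_db`, and the field names are hypothetical. The key is passed in at runtime, in line with the rule that signing keys are not stored in code.

```python
import hashlib
import hmac
import json


def canonical_bytes(payload: dict) -> bytes:
    """Serialize with sorted keys so the same payload always signs identically."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_sidecar(payload: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature (assumed scheme, not taken from the spec)."""
    return hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()


def verify_sidecar(payload: dict, signature: str, key: bytes) -> bool:
    """Compare in constant time so timing does not leak signature prefixes."""
    return hmac.compare_digest(sign_sidecar(payload, key), signature)


def sidecar_matches_db(sidecar: dict, db_row: dict, fields: tuple) -> bool:
    """Check that fields mirrored into the database agree with the sidecar."""
    return all(sidecar.get(name) == db_row.get(name) for name in fields)
```

A tile would be treated as usable only when both `verify_sidecar` and `sidecar_matches_db` return `True`; which fields are mirrored into PostgreSQL is a design decision left to the Tile Manager.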
 ### ADR-004: FDR Format
 **Context**: The FDR must be compact, bounded, replayable, and exportable for analysis.
 **Decision**: Use PostgreSQL for FDR event indexes and mission-query metadata, with CBOR-backed segment payloads for bounded append-heavy runtime data and optional Parquet export after flight.
 **Alternatives considered**:
 1. Plain CSV — rejected for type safety, size, and complex payloads.
 2. Parquet as primary onboard format — good analytics, but less ideal as the runtime append/rollover path.
 **Consequences**: FDR implementation must define PostgreSQL tables/indexes, CBOR segment schema, rollover behavior, and export tooling.
 ### ADR-006: Total Occlusion Before VIO
 **Context**: BASALT should not receive frames that are completely unusable because of lens cover, cloud/whiteout, decode failure, extreme exposure, or other total visual blackout.
 **Decision**: Camera ingest performs a pre-VIO total-occlusion/blackout check. Total occlusion bypasses BASALT for that frame, sends a `total_occlusion` or `visual_blackout` degradation signal to the safety wrapper, and continues IMU-only propagation from the last trusted state.
 **Alternatives considered**:
 1. Let BASALT detect every visual failure — rejected because total occlusion is cheaper and safer to catch before the VIO hot path.
 2. Drop frames silently — rejected because the wrapper must grow covariance and emit honest degraded output.
 **Consequences**: The camera component must expose `occlusion_status`, and tests must assert mode transition to `dead_reckoned`/failsafe under total blackout.
 ### ADR-005: Public Dataset Strategy
 **Context**: The original still-image sample lacks synchronized IMU and ground-truth trajectory. The Derkachi fixture adds cropped nadir video synchronized with IMU and `GLOBAL_POSITION_INT` trajectory, but camera intrinsics, distortion, and camera-to-body calibration remain pending.
 **Decision**: Prioritize MUN-FRL for synchronized nadir camera + IMU + GNSS/ground truth; use ALTO for aerial localization/VPR and long nadir trajectories; investigate Kagaru/EPFL for fixed-wing/farmland relevance; use EuRoC/UZH FPV only as VIO proxies if license-compatible.
 **Consequences**: Public datasets de-risk components but do not replace representative target flight data for final acceptance.
@@ -0,0 +1,30 @@
 # Geo Geometry Helper
 ## Purpose
 Shared geospatial and camera-geometry utilities used by camera ingest, safety wrapper, Tile Manager, anchor verification, and validation.
 ## Responsibilities
 - WGS84 to local tangent plane conversions.
 - Haversine/ground-distance calculations.
 - Ground sampling distance calculations.
 - Camera footprint projection from intrinsics, extrinsics, altitude, and attitude.
 - Homography and covariance unit conversions for reporting.
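Two of the responsibilities above, ground distance and ground sampling distance, can be sketched as follows. This assumes a spherical Earth for the haversine formula and a nadir-pointing camera over flat terrain for GSD; the function names and parameters are hypothetical and not taken from the helper's actual API.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation


def haversine_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Great-circle distance in metres between two WGS84 lat/lon points."""
    phi1 = math.radians(lat1_deg)
    phi2 = math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2_deg - lon1_deg)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def gsd_m_per_px(altitude_agl_m, focal_length_mm, pixel_pitch_um):
    """Ground sampling distance: altitude * pixel pitch / focal length.

    Valid for a nadir view over flat terrain; tilt and relief are ignored.
    """
    return altitude_agl_m * (pixel_pitch_um * 1e-6) / (focal_length_mm * 1e-3)
```

For example, `gsd_m_per_px(1000, 25, 5)` corresponds to roughly 0.2 m per pixel. Over distances of a few kilometres the spherical approximation error is small compared with the 20 m and 50 m accuracy thresholds, but an ellipsoidal method may be preferable for final reporting.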
 ## Non-Responsibilities
 - No image matching.
 - No state estimation.
 - No MAVLink emission.
 - No cache policy decisions.
 ## Consumers
 | Component | Usage |
 |-----------|-------|
 | Camera ingest/calibration | Footprint and calibration sanity checks |
 | Safety/anchor wrapper | Distance/covariance/unit conversion |
 | Anchor verification | Pixel-to-ground error reporting |
 | Tile Manager | Tile footprint metadata |
 | Validation harness | Error thresholds and reports |
@@ -0,0 +1,29 @@
 # Time Sync Helper
 ## Purpose
 Shared timestamp validation and alignment utilities for frame, IMU, telemetry, FDR, and replay data.
 ## Responsibilities
 - Monotonic timestamp checks.
 - Frame-to-IMU window selection.
 - Clock-domain conversion metadata.
 - Replay ordering validation.
 - Gap and jitter metrics.
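The monotonic check, frame-to-IMU window selection, and gap metric listed above can be sketched as follows. This assumes all timestamps are integer nanoseconds in a single clock domain and that IMU timestamps are already sorted; the function names are hypothetical.

```python
from bisect import bisect_right


def is_strictly_monotonic(timestamps_ns):
    """True when every timestamp is greater than the one before it."""
    return all(later > earlier for earlier, later in zip(timestamps_ns, timestamps_ns[1:]))


def imu_window(imu_timestamps_ns, prev_frame_ns, frame_ns):
    """Return [start, end) indices of IMU samples with prev_frame_ns < t <= frame_ns.

    Requires imu_timestamps_ns to be sorted ascending.
    """
    start = bisect_right(imu_timestamps_ns, prev_frame_ns)
    end = bisect_right(imu_timestamps_ns, frame_ns)
    return start, end


def max_gap_ns(timestamps_ns):
    """Largest interval between consecutive timestamps, or 0 for fewer than two."""
    if len(timestamps_ns) < 2:
        return 0
    return max(later - earlier for earlier, later in zip(timestamps_ns, timestamps_ns[1:]))
```

The helper only reports these values. Whether an empty window or a large gap degrades or rejects a packet stays with the caller, consistent with the non-responsibilities listed for this helper.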
 ## Non-Responsibilities
 - No VIO state estimation.
 - No MAVLink parsing beyond normalized timestamp fields.
 - No recovery policy; callers decide whether to degrade or reject.
 ## Consumers
 | Component | Usage |
 |-----------|-------|
 | Camera ingest/calibration | Frame ordering and timestamp metadata |
 | VIO adapter | IMU/frame synchronization |
 | MAVLink/GCS integration | Telemetry timestamp normalization |
 | FDR/observability | Segment ordering |
 | Validation harness | Fixture validation |
@@ -0,0 +1,115 @@
 # Camera Ingest And Calibration
 ## 1. High-Level Overview
 **Purpose**: Ingest navigation-camera frames, attach timestamps and calibration metadata, undistort/normalize imagery, detect total occlusion/blackout before VIO, and classify image quality before VIO or satellite matching consumes it.
 **Architectural Pattern**: Streaming adapter + validation gate.
 **Upstream dependencies**: Navigation camera, camera calibration files.
 **Downstream consumers**: VIO adapter, Satellite Service, anchor verification, Tile Manager, FDR.
 ## 2. Internal Interfaces
 ### Interface: `FrameProvider`
 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
 | `next_frame` | `FrameRequest` | `FramePacket` | Yes | `FrameUnavailable`, `CalibrationMissing`, `InvalidFrame` |
 | `detect_occlusion` | `FramePacket` | `OcclusionReport` | No | `InvalidFrame` |
 | `classify_quality` | `FramePacket` | `ImageQualityReport` | No | `InvalidFrame` |
 **Input DTOs**:
 ```yaml
 FrameRequest:
  source: enum(live_camera, replay_file)
  timestamp_ns: integer optional
 ```
 **Output DTOs**:
 ```yaml
 FramePacket:
  frame_id: string
  timestamp_ns: integer
  image_ref: path_or_buffer
  camera_calibration_id: string
  altitude_hint_m: number optional
  occlusion: OcclusionReport
  quality: ImageQualityReport
 OcclusionReport:
  status: enum(clear, partial_occlusion, total_occlusion, blackout)
  reason: enum(cloud, lens_cover, whiteout, decode_failure, underexposed, overexposed, unknown) optional
  usable_for_vio: boolean
  usable_for_anchor: boolean
 ImageQualityReport:
  usable_for_vio: boolean
  usable_for_anchor: boolean
  blackout_detected: boolean
  blur_score: number
  texture_score: number
 ```
 ## 3. Data Access Patterns
 | Query | Frequency | Hot Path | Index Needed |
 |-------|-----------|----------|--------------|
 | Load calibration by ID | Startup | Yes | No |
 | Read replay frame | Test/replay | Yes | File order |
 ## 4. Implementation Details
 **State Management**: Maintains current camera calibration version and frame sequence state.
 **Key Dependencies**:
 | Library | Purpose |
 |---------|---------|
 | OpenCV | Decode, undistort, homography support, image metrics |
 | Camera SDK / V4L2 / GigE SDK | Live camera access once interface is selected |
 **Error Handling Strategy**:
 - Camera or decode errors emit degraded quality and FDR events.
 - Total occlusion or blackout sets `usable_for_vio=false`, bypasses BASALT for that frame, and emits a degradation signal to the safety/anchor wrapper.
 - Missing calibration blocks production startup.
 - ADTi public spec mismatch is tracked as a verification blocker until manufacturer spec is pinned.
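The pre-VIO occlusion gate described above could look roughly like the sketch below. It works on a flat sequence of 8-bit grayscale values and uses only mean brightness and spread. The thresholds are illustrative assumptions that would need tuning on representative frames, `detect_occlusion` here is a hypothetical stand-in for the real method, and `partial_occlusion` is deliberately not handled.

```python
from statistics import fmean, pstdev

# Illustrative thresholds only; not taken from the specification.
DARK_MEAN_MAX = 10.0
BRIGHT_MEAN_MIN = 245.0
FLAT_STDDEV_MAX = 3.0


def _report(status, reason):
    """Build a dict shaped like OcclusionReport."""
    usable = status == "clear"
    return {
        "status": status,
        "reason": reason,
        "usable_for_vio": usable,
        "usable_for_anchor": usable,
    }


def detect_occlusion(gray_pixels):
    """Classify a frame from its 8-bit grayscale pixel values."""
    if not gray_pixels:
        # An empty buffer is treated as a decode failure.
        return _report("total_occlusion", "decode_failure")
    mean = fmean(gray_pixels)
    spread = pstdev(gray_pixels)
    if mean <= DARK_MEAN_MAX:
        return _report("blackout", "underexposed")
    if mean >= BRIGHT_MEAN_MIN:
        return _report("total_occlusion", "overexposed")
    if spread <= FLAT_STDDEV_MAX:
        # Uniform mid-gray frame: no texture for VIO or anchor matching.
        return _report("total_occlusion", "unknown")
    return _report("clear", None)
```

Any non-clear result sets `usable_for_vio=false`, which is the condition the caller uses to bypass BASALT and signal the safety/anchor wrapper.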
 ## 5. Extensions and Helpers
 | Helper | Purpose | Used By |
 |--------|---------|---------|
 | `geo_geometry_helper` | Coordinate transforms, GSD, WGS84/local conversions | Camera ingest, safety wrapper, Tile Manager |
 ## 6. Caveats & Edge Cases
 **Known limitations**:
 - Public ADTi pages list 2 fps continuous capture and -10..40 C operating range; project assumptions need manufacturer verification.
 - Live camera interface is TBD.
 - Total occlusion detection must be conservative: false positives cause temporary IMU-only degradation, while false negatives can feed unusable frames into VIO.
 **Performance bottlenecks**:
 - Full-resolution preprocessing must stay inside the <400 ms p95 pipeline budget.
 ## 7. Dependency Graph
 **Must be implemented after**: none.
 **Can be implemented in parallel with**: Tile Manager, MAVLink/GCS integration.
 **Blocks**: VIO adapter, anchor verification, generated tile lifecycle.
 ## 8. Logging Strategy
 | Log Level | When | Example |
 |-----------|------|---------|
 | ERROR | Camera unavailable or calibration missing | `camera_calibration_missing id=...` |
 | WARN | Frame degraded or occluded | `frame_quality_degraded occlusion=total_occlusion blur=... texture=...` |
 | INFO | Calibration loaded | `camera_calibration_loaded version=...` |
 **Log format**: FDR structured event.
 **Log storage**: FDR segment and health log.
@@ -0,0 +1,139 @@
 # Test Specification — Camera Ingest And Calibration
 ## Acceptance Criteria Traceability
 | AC ID | Acceptance Criterion | Test IDs | Coverage |
 |-------|---------------------|----------|----------|
 | AC-2.1a | VO registration uses only normal usable frames | IT-01, AT-01 | Covered |
 | AC-3.5 | Visual blackout switches to degraded mode quickly | IT-02, AT-02 | Covered |
 | AC-4.1 | Capture-to-output latency budget | PT-01 | Covered |
 | AC-4.2 | Jetson memory budget | PT-01 | Covered |
 | AC-8.4 | Mid-flight tile generation input eligibility | IT-03 | Covered |
 | AC-8.5 | No raw frame retention | ST-01 | Covered |
 | AC-NEW-8 | Full blackout/occlusion degraded-mode trigger | IT-02, AT-02 | Covered |
 ## Blackbox Tests
 ### IT-01: Usable Frame Classification
 **Summary**: Verify normal nadir frames are marked usable for VIO and anchor matching.
 **Traces to**: AC-2.1a
 **Input data**: Project still images plus calibration metadata.
 **Expected result**: `FramePacket.quality.usable_for_vio=true`, `usable_for_anchor=true`, and `occlusion.status=clear` for normal daytime textured frames.
 **Max execution time**: 100 ms per frame.
 **Dependencies**: Calibration file, replay fixture.
 ---
 ### IT-02: Total Occlusion Gate Before VIO
 **Summary**: Verify total occlusion is detected before BASALT receives the frame.
 **Traces to**: AC-3.5, AC-NEW-8
 **Input data**: Black/white/covered-lens frames, corrupt frame fixture, low-texture whiteout frames.
 **Expected result**: `occlusion.status` is `total_occlusion` or `blackout`, `usable_for_vio=false`, `usable_for_anchor=false`, and a degradation signal is emitted to the safety wrapper.
 **Max execution time**: 100 ms per frame.
 **Dependencies**: Safety wrapper test double.
 ---
 ### IT-03: Tile Generation Eligibility Metadata
 **Summary**: Verify frame metadata required for generated tiles is available without persisting raw frames.
 **Traces to**: AC-8.4, AC-8.5
 **Input data**: Valid frame, altitude hint, calibration, pose fixture.
 **Expected result**: Frame metadata supports orthorectification handoff; raw frames are not retained after processing, except for the allowed FDR thumbnail exception.
 **Max execution time**: 100 ms per frame.
 **Dependencies**: Tile Manager test double.
 ## Performance Tests
 ### PT-01: Ingest And Quality Latency
 **Summary**: Verify decode, calibration lookup, occlusion detection, and quality classification stay inside the system latency budget.
 **Traces to**: AC-4.1, AC-4.2
 **Load scenario**:
 - Input rate: target camera/replay rate.
 - Duration: 30 minutes replay.
 - Dataset: project still images plus synthetic occlusion frames.
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | Per-frame ingest p95 | <=100 ms | >150 ms |
 | Memory contribution | <=1 GB | >1.5 GB |
 | Dropped frames | <=10% under sustained load | >10% |
 **Resource limits**: Jetson shared memory remains below system 8 GB cap.
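One way to score the per-frame ingest metric in the table above is a nearest-rank p95 over recorded latencies. This is a sketch under assumptions: the nearest-rank percentile definition, the function names, and the `marginal` label for values between the target and the failure threshold are not specified by the source.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate_ingest_latency(latencies_ms, target_ms=100.0, failure_ms=150.0):
    """Return (p95, verdict) against the PT-01 target and failure threshold."""
    p95 = percentile_nearest_rank(latencies_ms, 95)
    if p95 <= target_ms:
        verdict = "pass"
    elif p95 <= failure_ms:
        verdict = "marginal"  # between target and failure threshold (assumed label)
    else:
        verdict = "fail"
    return p95, verdict
```

The same function can be reused for other p95 budgets by passing different `target_ms` and `failure_ms` values.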
 ## Security Tests
 ### ST-01: Raw Frame Retention Check
 **Summary**: Verify normal operation does not persist raw navigation frames.
 **Traces to**: AC-8.5
 **Attack vector**: Sensitive raw imagery remains on disk after processing.
 **Test procedure**:
 1. Run replay with normal and failed tile-generation frames.
 2. Inspect output directories and FDR artifacts.
 **Expected behavior**: Only metadata, generated tiles, and allowed low-rate failed-frame thumbnails are retained.
 **Pass criteria**: No raw full-resolution frames remain after teardown.
 ## Acceptance Tests
 ### AT-01: Normal Frame Enters VIO
 **Summary**: Confirm normal usable frames are passed to BASALT.
 **Traces to**: AC-2.1a
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Feed a calibrated normal frame | Occlusion status is `clear` |
 | 2 | Process quality gate | Frame is emitted to VIO adapter |
 ---
 ### AT-02: Occluded Frame Bypasses VIO
 **Summary**: Confirm full occlusion triggers IMU-only degraded path.
 **Traces to**: AC-3.5, AC-NEW-8
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Feed total-occlusion frame | `usable_for_vio=false` |
 | 2 | Observe downstream routing | BASALT is bypassed and safety wrapper receives degradation signal |
 ## Test Data Management
 | Data Set | Description | Source | Size |
 |----------|-------------|--------|------|
 | `project_60_still_images` | Normal nadir frames | `_docs/00_problem/input_data/` | Project data |
 | `synthetic_occlusion_frames` | Black/white/corrupt/whiteout frames | Generated fixture | Small |
 **Setup procedure**: Load calibration fixture and mount read-only frame data.
 **Teardown procedure**: Remove run-scoped temp directories and verify no raw-frame persistence.
 **Data isolation strategy**: Each run writes to a unique `test-results/<run-id>/` directory.
@@ -0,0 +1,92 @@
 # VIO Adapter
 ## 1. High-Level Overview
 **Purpose**: Wrap the selected relative VIO backend as a replaceable component that consumes calibrated frames and FC IMU data, then emits relative pose/velocity/bias state and tracking quality.
 **Architectural Pattern**: Adapter / anti-corruption layer.
 **Upstream dependencies**: Camera ingest/calibration, MAVLink telemetry stream.
 **Downstream consumers**: Safety/anchor wrapper, FDR, separate e2e test suite.
 ## 2. Internal Interfaces
 ### Interface: `VioAdapter`
 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
 | `initialize` | `VioInitRequest` | `VioInitResult` | No | `CalibrationInvalid`, `DatasetUnsupported` |
 | `process` | `VioInputPacket` | `VioStatePacket` | Yes | `TrackingLost`, `TimestampMismatch` |
 | `health` | none | `VioHealth` | No | none |
 **Input DTOs**:
 ```yaml
 VioInputPacket:
  frame: FramePacket
  imu_samples: list[TelemetrySample]
  attitude_sample: TelemetrySample optional
 ```
 **Output DTOs**:
 ```yaml
 VioStatePacket:
  timestamp_ns: integer
  relative_pose: transform
  velocity: vector3
  bias_estimate: object optional
  tracking_quality: number
  completed: boolean
  covariance_hint: matrix optional
 ```
 ## 3. Data Access Patterns
 No persistent production data ownership. Reads calibration/config at startup and emits state to downstream consumers.
 ## 4. Implementation Details
 **State Management**: Owns selected VIO backend runtime state and resets only through explicit wrapper command.
 **Key Dependencies**:
 | Library | Purpose |
 |---------|---------|
 | BASALT | Current selected relative visual-inertial odometry backend |
 | Eigen/Sophus or backend-native math stack | Pose and transform representation |
 **Error Handling Strategy**:
 - Tracking loss is surfaced to the safety/anchor wrapper, not hidden.
 - Timestamp/camera-IMU sync violations fail the packet and are logged.
 - The adapter never emits WGS84 coordinates; absolute semantics belong to the wrapper.
 ## 5. Caveats & Edge Cases
 **Known limitations**:
 - BASALT has no special fixed-wing nadir mode; validation must prove fit under low-parallax/planar terrain.
 - Backend covariance/confidence output is not the product authority; wrapper calibration is required.
 **Performance bottlenecks**:
 - Native VIO runtime and image resolution can exceed Jetson budget if not tuned.
 ## 6. Dependency Graph
 **Must be implemented after**: Camera ingest/calibration, MAVLink telemetry DTO definitions.
 **Can be implemented in parallel with**: Satellite Service, Tile Manager.
 **Blocks**: Safety/anchor wrapper final integration.
 ## 7. Logging Strategy
 | Log Level | When | Example |
 |-----------|------|---------|
 | ERROR | VIO backend initialization fails | `vio_init_failed reason=...` |
 | WARN | Tracking quality drops | `vio_tracking_degraded quality=...` |
 | INFO | VIO reset/reinitialized | `vio_reset cause=...` |
 **Log format**: FDR structured event.
 **Log storage**: FDR segment.
@@ -0,0 +1,141 @@
 # Test Specification — VIO Adapter
 ## Acceptance Criteria Traceability
 | AC ID | Acceptance Criterion | Test IDs | Coverage |
 |-------|---------------------|----------|----------|
 | AC-1.3 | Drift between anchors and anchor-age reporting input | IT-02, AT-01 | Covered |
 | AC-2.1a | >95% VO registration on normal segments | IT-01, AT-01 | Covered |
 | AC-2.2 | <1.0 px VO MRE | IT-01 | Covered |
 | AC-3.1 | Handles 350 m outliers/tilt | IT-03 | Covered |
 | AC-3.2 | Sharp turns trigger relocalization path | IT-04 | Covered |
 | AC-3.4 | Loss threshold feeds relocalization/dead reckoning | IT-04 | Covered |
 | AC-4.1 | Hot-path latency | PT-01 | Covered |
 | AC-4.2 | Jetson memory budget | PT-01 | Covered |
 | AC-NEW-4 | False-position safety budget | ST-01 | Covered |
 ## Blackbox Tests
 ### IT-01: Public Dataset VIO Replay
 **Summary**: Verify the VIO adapter produces relative motion for synchronized camera/IMU replay.
 **Traces to**: AC-2.1a, AC-2.2
 **Input data**: Derkachi cropped nadir video + `SCALED_IMU2` + `GLOBAL_POSITION_INT`, MUN-FRL preferred slice, or representative synchronized nav-camera + IMU + ground truth.
 **Expected result**: VO registration succeeds for >95% of normal usable frames; frame-to-frame MRE <1.0 px where ground-truth/feature evaluation supports it. Derkachi runs are accepted as calibration-limited until intrinsics, distortion, and camera-to-body transform are pinned.
 **Max execution time**: Dataset-dependent; report per-frame latency.
 **Dependencies**: Camera ingest, MAVLink telemetry/replay, calibration fixtures.
 ---
 ### IT-02: Relative Drift Reporting
 **Summary**: Verify adapter emits state needed for wrapper drift and anchor-age accounting.
 **Traces to**: AC-1.3
 **Input data**: Segment with two known satellite anchors and IMU samples.
 **Expected result**: Adapter emits continuous `VioStatePacket` values with timestamps and quality, enabling wrapper to compare VO extrapolation to next anchor.
 **Max execution time**: Dataset-dependent.
 **Dependencies**: Safety wrapper test harness.
 ---
 ### IT-03: Tilt/Outlier Robustness
 **Summary**: Verify adapter reports degraded tracking without false success under tilt/outlier cases.
 **Traces to**: AC-3.1
 **Input data**: Replay segment with synthetic +/-20 degree tilt and up to 350 m apparent outlier.
 **Expected result**: Adapter either tracks with quality metadata or emits `TrackingLost`; it never hides a failure as high-quality VIO.
 **Max execution time**: 15 minutes per fixture.
 ---
 ### IT-04: Sharp Turn / Loss Signal
 **Summary**: Verify sharp turns and disconnected visual overlap produce wrapper-visible failure signals.
 **Traces to**: AC-3.2, AC-3.4
 **Input data**: <5% overlap sequence with heading change <70 degrees.
 **Expected result**: Adapter emits low tracking quality or `TrackingLost` within the loss window, allowing relocalization trigger.
 **Max execution time**: 10 minutes.
 ## Performance Tests
 ### PT-01: VIO Adapter Runtime Budget
 **Summary**: Verify VIO processing does not consume the full <400 ms system p95 budget.
 **Traces to**: AC-4.1, AC-4.2
 **Load scenario**:
 - Input: Derkachi synchronized replay and public/representative replay.
 - Duration: 30 minutes plus release long-run slice.
 - Target: Jetson Orin Nano Super.
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | Adapter p95 latency | <=250 ms | >300 ms |
 | Memory contribution | <=3 GB | >4 GB |
 | Tracking failure on normal segments | <5% | >=5% |
 **Resource limits**: Total system memory remains below 8 GB.
 ## Security Tests
 ### ST-01: Timestamp Injection Rejection
 **Summary**: Verify malformed or non-monotonic timestamps do not produce trusted VIO state.
 **Traces to**: AC-NEW-4
 **Attack vector**: Replay or telemetry timestamp manipulation.
 **Test procedure**:
 1. Feed non-monotonic frame and IMU timestamps.
 2. Observe adapter output.
 **Expected behavior**: Adapter returns `TimestampMismatch` or low-quality failure; wrapper does not trust the state.
 **Pass criteria**: No high-quality VIO state is emitted from malformed timing.
 ## Acceptance Tests
 ### AT-01: Normal VIO State Contract
 **Summary**: Confirm adapter output contract supports downstream localization.
 **Traces to**: AC-1.3, AC-2.1a
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Initialize with calibrated frame/IMU config | `VioInitResult` succeeds |
 | 2 | Replay normal frames | `VioStatePacket` includes timestamp, relative pose, velocity, tracking quality |
 | 3 | End segment | State stream is continuous enough for wrapper drift accounting |
 ## Test Data Management
 | Data Set | Description | Source | Size |
 |----------|-------------|--------|------|
 | `derkachi_video_telemetry` | Cropped nadir MP4 + synchronized IMU and `GLOBAL_POSITION_INT` trajectory | Project fixture | ~282 MB video + CSV |
 | `public_nadir_vio_candidates` | MUN-FRL/ALTO/Kagaru/EPFL slices | Public pinned fixtures | Dataset-dependent |
 | `representative_sync_replay` | Target camera + FC IMU + calibrated ground truth | Project collection | TBD |
 **Setup procedure**: Pin calibration/extrinsics and mount read-only synchronized replay data.
 **Teardown procedure**: Remove generated result reports and adapter temp state.
 **Data isolation strategy**: One run directory per dataset slice and configuration hash.
@@ -0,0 +1,106 @@
 # Safety And Anchor Wrapper
 ## 1. High-Level Overview
 **Purpose**: Own the authoritative localization state, confidence calibration, source labels, anchor fusion, degraded modes, tile-write gates, and MAVLink output semantics.
 **Architectural Pattern**: Stateful coordinator / safety facade.
 **Upstream dependencies**: VIO adapter, anchor verification, MAVLink telemetry, camera quality reports.
 **Downstream consumers**: MAVLink/GCS integration, FDR, Tile Manager, separate e2e test suite.
 ## 2. Internal Interfaces
 ### Interface: `LocalizationStateMachine`
 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
 | `update_vio` | `VioStatePacket` | `PositionEstimate` | Yes | `StateInconsistent` |
 | `consider_anchor` | `AnchorDecision` | `AnchorAcceptanceResult` | No | `AnchorRejected` |
 | `degrade` | `DegradationSignal` | `PositionEstimate` | No | none |
 | `propagate_imu_only` | `ImuOnlyPropagationRequest` | `PositionEstimate` | Yes | `InsufficientTelemetry` |
 | `tile_write_eligibility` | `FramePacket` | `TileWriteDecision` | No | none |
 **Input DTOs**:
 ```yaml
 AnchorDecision:
  candidate_id: string
  timestamp_ns: integer
  estimated_pose_wgs84: object
  inlier_count: integer
  mre_px: number
  tile_freshness_status: enum
  provenance_status: enum
 DegradationSignal:
  type: enum(total_occlusion, visual_blackout, gps_spoofing, covariance_growth, tracking_lost)
  timestamp_ns: integer
 ImuOnlyPropagationRequest:
  last_trusted_estimate: PositionEstimate
  imu_samples: list[TelemetrySample]
  elapsed_blackout_ms: integer
  reason: enum(total_occlusion, visual_blackout, tracking_lost)
 ```
 **Output DTOs**:
 ```yaml
 PositionEstimate:
  timestamp_ns: integer
  lat_deg: number
  lon_deg: number
  alt_msl_m: number
  covariance_95_semi_major_m: number
  source_label: enum(satellite_anchored, vo_extrapolated, dead_reckoned)
  fix_type: integer
  horiz_accuracy_m: number
  last_satellite_anchor_age_ms: integer
 ```
 ## 3. Data Access Patterns
 No direct tile/image storage ownership. Writes all decisions to FDR via observability component.
 ## 4. Implementation Details
 **State Management**: Owns the authoritative state machine and covariance growth model.
 **Error Handling Strategy**:
 - Reject uncertain anchors by default.
 - Never emit optimistic accuracy when confidence is degraded.
 - On total occlusion or visual blackout, do not call VIO for that frame; propagate from the last trusted state with IMU-only dynamics, set `source_label=dead_reckoned`, and grow covariance monotonically.
 - If covariance or blackout thresholds exceed AC limits, emit no-fix/failsafe semantics.
 - Treat cache freshness and provenance as evidence carried by `AnchorDecision`; do not call the Tile Manager directly during anchor acceptance.
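The blackout rules above can be sketched as a pure function over the last trusted estimate. The 30 s and 500 m limits and the `fix_type=0` / 999.0 no-fix values come from the wrapper test specification; the linear growth rate, the reduced `Estimate` type, and keeping the previous `fix_type` while still inside the limits are assumptions made for illustration. A real covariance growth model would be driven by IMU error characteristics.

```python
from dataclasses import dataclass

MAX_BLACKOUT_MS = 30_000    # failsafe when blackout exceeds 30 s
MAX_COVARIANCE_M = 500.0    # failsafe when covariance exceeds 500 m
NO_FIX_ACCURACY_M = 999.0   # accuracy value reported with no-fix
GROWTH_M_PER_S = 5.0        # hypothetical linear growth rate, not from the spec


@dataclass(frozen=True)
class Estimate:
    """Reduced stand-in for PositionEstimate with only the fields used here."""
    covariance_95_semi_major_m: float
    source_label: str
    fix_type: int
    horiz_accuracy_m: float


def propagate_imu_only(last_trusted: Estimate, elapsed_blackout_ms: int) -> Estimate:
    """Grow covariance with blackout time and switch to no-fix past the limits."""
    grown = last_trusted.covariance_95_semi_major_m + GROWTH_M_PER_S * elapsed_blackout_ms / 1000.0
    if elapsed_blackout_ms > MAX_BLACKOUT_MS or grown > MAX_COVARIANCE_M:
        return Estimate(grown, "dead_reckoned", 0, NO_FIX_ACCURACY_M)
    # Report accuracy equal to the grown covariance so it is never under-reported.
    return Estimate(grown, "dead_reckoned", last_trusted.fix_type, grown)
```

Because the output depends only on the last trusted estimate and the elapsed time, covariance is monotonic in blackout duration by construction, which is the property IT-04 checks.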
 ## 5. Caveats & Edge Cases
 **Known limitations**:
 - Final covariance calibration requires representative synchronized data.
 - The wrapper must stay independent of BASALT internals so VIO can be replaced.
 - IMU-only propagation is an emergency bridge, not a reliable long-duration localization mode; the spec requires failsafe/no-fix once time or covariance thresholds are exceeded.
 **Potential race conditions**:
 - A delayed anchor result must be checked against current timestamp/state before acceptance.
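A minimal sketch of the delayed-anchor check, assuming the wrapper records the timestamp of its last state reset and bounds how old an anchor may be. `anchor_is_current`, `last_reset_ns`, and the 2 s default age limit are hypothetical and not taken from the specification.

```python
def anchor_is_current(anchor_timestamp_ns, state_timestamp_ns, last_reset_ns,
                      max_age_ns=2_000_000_000):
    """Decide whether a late anchor result may still be considered for fusion."""
    if anchor_timestamp_ns < last_reset_ns:
        # Computed against a state that has since been reset or reinitialized.
        return False
    if anchor_timestamp_ns > state_timestamp_ns:
        # Newer than the current state: indicates a clock or ordering problem.
        return False
    return state_timestamp_ns - anchor_timestamp_ns <= max_age_ns
```

An anchor that passes this check still goes through the normal acceptance gates; the check only prevents a stale result from being applied to a state it was never computed against.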
 ## 6. Dependency Graph
 **Must be implemented after**: VIO DTOs, anchor DTOs, MAVLink output contract.
 **Can be implemented in parallel with**: FDR schema after DTOs stabilize.
 **Blocks**: MAVLink production output, tile-write lifecycle, end-to-end validation.
 ## 7. Logging Strategy
 | Log Level | When | Example |
 |-----------|------|---------|
 | ERROR | State invariant violation | `localization_state_inconsistent reason=...` |
 | WARN | Anchor rejected or mode degraded | `anchor_rejected reason=mahalanobis_gate` |
 | INFO | Mode transition | `source_label_changed from=vo_extrapolated to=dead_reckoned reason=total_occlusion` |
 **Log format**: FDR structured event.
 **Log storage**: FDR segment and QGC status for operator-critical events.
@@ -0,0 +1,208 @@
 # Test Specification — Safety And Anchor Wrapper
 ## Acceptance Criteria Traceability
 | AC ID | Acceptance Criterion | Test IDs | Coverage |
 |-------|---------------------|----------|----------|
 | AC-1.1 | >=80% within 50 m | AT-01 | Covered |
 | AC-1.2 | >=50% within 20 m | AT-01 | Covered |
 | AC-1.3 | Drift and anchor age | IT-01 | Covered |
 | AC-1.4 | Quantitative confidence + label | IT-02, AT-02 | Covered |
 | AC-3.4 | Relocalization after no-position window | IT-03 | Covered |
 | AC-3.5 | Blackout to dead reckoned | IT-04, AT-03 | Covered |
 | AC-4.3 | GPS_INPUT semantics source fields | IT-05 | Covered |
 | AC-4.4 | Frame-by-frame streaming | PT-01 | Covered |
 | AC-4.5 | Correction updates | IT-06 | Covered |
 | AC-5.1 | Initialize from FC state | IT-07 | Covered |
 | AC-5.2 | >3 s no-estimate fallback | IT-03 | Covered |
 | AC-5.3 | Reinitialize after reboot | IT-07 | Covered |
 | AC-NEW-2 | Spoofing promotion <3 s | IT-05 | Covered |
 | AC-NEW-4 | False-position safety budget | ST-01, AT-02 | Covered |
 | AC-NEW-8 | IMU-only blackout thresholds | IT-04, AT-03 | Covered |
 ## Blackbox Tests
 ### IT-01: Drift And Anchor Age Accounting
 **Summary**: Verify VO extrapolation drift and anchor age are tracked per estimate.
 **Traces to**: AC-1.3
 **Input data**: VIO state stream with two accepted anchor decisions.
 **Expected result**: Every `PositionEstimate` includes `last_satellite_anchor_age_ms`; drift at next anchor is measured and binned by age.
 **Max execution time**: 5 minutes.
 ---
 ### IT-02: Confidence Output Contract
 **Summary**: Verify every estimate has covariance and source label.
 **Traces to**: AC-1.4
 **Input data**: satellite-anchored, VO-extrapolated, and dead-reckoned state fixtures.
 **Expected result**: Each output contains `covariance_95_semi_major_m`, `source_label`, `fix_type`, and `horiz_accuracy_m`; `horiz_accuracy_m` never under-reports covariance.
 **Max execution time**: 2 minutes.
 ---
 ### IT-03: No-Position Relocalization Trigger
 **Summary**: Verify no position for >=3 frames and >=2 s triggers relocalization/degraded behavior.
 **Traces to**: AC-3.4, AC-5.2
 **Input data**: VIO loss sequence with no accepted anchors.
 **Expected result**: Relocalization request is emitted and wrapper continues dead reckoning until fail threshold.
 **Max execution time**: 5 minutes.
 ---
 ### IT-04: Total Blackout IMU-Only Propagation
 **Summary**: Verify total occlusion produces honest IMU-only estimates and failsafe thresholds.
 **Traces to**: AC-3.5, AC-NEW-8
 **Input data**: Last trusted estimate, IMU/attitude/airspeed/altitude stream, total-occlusion signal for 5 s, 15 s, and 35 s.
 **Expected result**: Wrapper emits `dead_reckoned` within 1 frame or 400 ms, covariance grows monotonically, and no-fix/failsafe is emitted when blackout >30 s or covariance >500 m.
 **Max execution time**: 10 minutes.
 ---
 ### IT-05: Spoofing Promotion And GPS_INPUT Mapping
 **Summary**: Verify wrapper promotes own estimate and maps output fields correctly under GPS spoofing.
 **Traces to**: AC-4.3, AC-NEW-2
 **Input data**: Plane SITL spoofing telemetry and trusted wrapper estimate.
 **Expected result**: Own estimate is promoted in <3 s; v1 emits `GPS_INPUT` only; source labels and accuracy fields are correct.
 **Max execution time**: 10 minutes.
 ---
 ### IT-06: Correction Update
 **Summary**: Verify refined estimates can correct previous positions.
 **Traces to**: AC-4.5
 **Input data**: VO estimate followed by accepted satellite anchor for same segment.
 **Expected result**: Wrapper emits updated estimate/correction with improved source label and covariance.
 **Max execution time**: 5 minutes.
 ---
 ### IT-07: Initialization And Reboot Recovery
 **Summary**: Verify wrapper initializes from FC state and can reinitialize after reboot.
 **Traces to**: AC-5.1, AC-5.3
 **Input data**: FC EKF position, IMU-extrapolated state, simulated companion restart.
 **Expected result**: Wrapper resumes from current FC state and reports degraded confidence until re-anchored.
 **Max execution time**: 5 minutes.
 ## Performance Tests
 ### PT-01: Wrapper Streaming Overhead
 **Summary**: Verify state-machine processing does not delay frame-by-frame output.
 **Traces to**: AC-4.4
 **Load scenario**:
 - Input: 3 Hz estimate stream with anchor and blackout events.
 - Duration: 8-hour synthetic run.
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | Wrapper p95 processing | <=25 ms | >50 ms |
 | Missed frame outputs | 0 except skip-allowed upstream drops | Any silent batch/delay |
 **Resource limits**: Negligible memory growth across run.
 ## Security Tests
 ### ST-01: False Anchor Rejection
 **Summary**: Verify impossible anchors do not become trusted estimates.
 **Traces to**: AC-NEW-4
 **Attack vector**: Anchor decision with plausible inliers but impossible jump or stale provenance.
 **Test procedure**:
 1. Feed an anchor >1 km from predicted state with low covariance.
 2. Feed stale/provenance-failed anchor evidence.
 **Expected behavior**: Anchor is rejected, FDR logs reason, source label remains degraded or VO extrapolated.
 **Pass criteria**: 0 accepted impossible/stale anchors.
 ## Acceptance Tests
 ### AT-01: Position Accuracy Aggregation
 **Summary**: Verify wrapper outputs can be scored against frame-center thresholds.
 **Traces to**: AC-1.1, AC-1.2
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Replay mapped frame-center estimates | Report computes >=80% within 50 m and >=50% within 20 m |
 | 2 | Include source labels | Accuracy is binned by source label |
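
The scoring in AT-01 can be sketched as below. This is only an illustration, assuming each replayed sample is reduced to a `(source_label, error_m)` pair and that "within" is inclusive; the function names, the `overall` bin, and the `vo_extrapolated` label spelling are hypothetical, not part of the specified report format.

```python
from collections import defaultdict


def accuracy_report(samples, thresholds_m=(50.0, 20.0)):
    """Fraction of samples within each threshold, binned by source label.

    samples: iterable of (source_label, error_m) pairs (hypothetical shape).
    An extra "overall" bin aggregates every sample.
    """
    by_label = defaultdict(list)
    for label, error_m in samples:
        by_label[label].append(error_m)
        by_label["overall"].append(error_m)
    return {
        label: {t: sum(e <= t for e in errors) / len(errors) for t in thresholds_m}
        for label, errors in by_label.items()
    }


def meets_targets(overall, targets=((50.0, 0.80), (20.0, 0.50))):
    """Check the >=80% within 50 m and >=50% within 20 m aggregate targets."""
    return all(overall[threshold] >= fraction for threshold, fraction in targets)


samples = [
    ("satellite_anchored", 10.0),
    ("satellite_anchored", 15.0),
    ("vo_extrapolated", 30.0),
    ("vo_extrapolated", 45.0),
    ("dead_reckoned", 80.0),
]
report = accuracy_report(samples)
```

In this toy replay the 50 m target is met but the 20 m target is not, so `meets_targets(report["overall"])` is false.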
 ---
 ### AT-02: Confidence Honesty
 **Summary**: Verify reported confidence is conservative relative to measured error.
 **Traces to**: AC-1.4, AC-NEW-4
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Replay ground-truth trajectory | Measured error is not systematically above reported covariance |
 | 2 | Inject over-confident anchors | Mahalanobis gate rejects them |
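
One way to picture the Mahalanobis gate in step 2 is the 2D sketch below. It assumes the gate compares the anchor against the predicted horizontal position using the sum of both covariances and a 99% chi-square bound for 2 degrees of freedom; the function names and that bound are illustrative assumptions, not the wrapper's confirmed gating rule.

```python
# 99% chi-square quantile for 2 degrees of freedom (-2 * ln(0.01) ~= 9.21).
CHI2_2DOF_99 = 9.21


def mahalanobis_sq_2d(innovation, cov):
    """Squared Mahalanobis distance of a 2D innovation under a 2x2 covariance."""
    dx, dy = innovation
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    # inverse(cov) = 1/det * [[d, -b], [-c, a]]
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def gate_accepts(anchor_xy, predicted_xy, predicted_cov, anchor_cov):
    """Accept the anchor only if its innovation is statistically plausible."""
    innovation = (anchor_xy[0] - predicted_xy[0], anchor_xy[1] - predicted_xy[1])
    combined = [
        [predicted_cov[i][j] + anchor_cov[i][j] for j in range(2)] for i in range(2)
    ]
    return mahalanobis_sq_2d(innovation, combined) <= CHI2_2DOF_99
```

An over-confident anchor (tiny reported covariance) that lands far from the prediction yields a very large distance and is rejected, while a nearby anchor passes.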
 ---
 ### AT-03: Blackout Failsafe
 **Summary**: Verify operator-visible blackout/failsafe behavior.
 **Traces to**: AC-3.5, AC-NEW-8
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Inject total visual blackout | `dead_reckoned` emitted in <=400 ms |
 | 2 | Continue >30 s or covariance >500 m | `fix_type=0`, `horiz_accuracy=999.0`, QGC failsafe status |
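
The two steps above amount to a small threshold rule, sketched here under stated assumptions: covariance is treated as a single horizontal uncertainty in metres, the `no_fix` label is a hypothetical name, and the dead-reckoned branch simply reports that uncertainty as `horiz_accuracy`. The real wrapper state machine is likely richer than this.

```python
BLACKOUT_LIMIT_S = 30.0      # from the table: blackout >30 s
COVARIANCE_LIMIT_M = 500.0   # from the table: covariance >500 m


def blackout_status(blackout_s, covariance_m):
    """Map blackout duration and uncertainty to the emitted status fields."""
    if blackout_s > BLACKOUT_LIMIT_S or covariance_m > COVARIANCE_LIMIT_M:
        # Failsafe values taken from AT-03 step 2; "no_fix" is a hypothetical label.
        return {"source_label": "no_fix", "fix_type": 0, "horiz_accuracy": 999.0}
    return {"source_label": "dead_reckoned", "horiz_accuracy": covariance_m}
```

Both limits are strict, so a blackout of exactly 30 s with 500 m covariance still reports `dead_reckoned`.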
 ## Test Data Management
 | Data Set | Description | Source | Size |
 |----------|-------------|--------|------|
 | `wrapper_state_fixtures` | VIO states, anchors, blackouts, spoofing signals | Generated fixtures | Small |
 | `representative_replay` | Target synchronized replay and ground truth | Project collection | TBD |
 **Setup procedure**: Load deterministic VIO/anchor/telemetry fixtures and run with isolated PostgreSQL schema.
 **Teardown procedure**: Drop run schema and remove FDR segments.
 **Data isolation strategy**: Use per-run mission IDs and database schemas.
@@ -0,0 +1,102 @@
 # Satellite Service
 ## 1. High-Level Overview
 **Purpose**: Own the onboard boundary to the suite Satellite Service: import pre-flight mission cache packages, upload generated-tile packages after flight, and convert query frames into ranked local VPR candidates using preloaded DINOv2-VLAD descriptors and FAISS.
 **Architectural Pattern**: Offline sync gateway + local retrieval index adapter.
 **Upstream dependencies**: Camera ingest/calibration, Tile Manager, safety/anchor wrapper, Azaion Suite Satellite Service before/after flight.
 **Downstream consumers**: Anchor verification, FDR.
 ## 2. Internal Interfaces
 ### Interface: `SatelliteService`
 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
 | `import_mission_cache` | `CacheImportRequest` | `CacheImportResult` | Yes | `SyncUnavailable`, `PackageInvalid` |
 | `upload_generated_tiles` | `GeneratedTileUploadRequest` | `GeneratedTileUploadResult` | Yes | `SyncUnavailable`, `PackageRejected` |
 | `retrieve` | `RetrievalRequest` | `RetrievalResult` | Yes | `IndexUnavailable`, `DescriptorFailed` |
 | `load_index` | `IndexLoadRequest` | `IndexStatus` | No | `ManifestInvalid`, `IndexUnavailable` |
 **Input DTOs**:
 ```yaml
 RetrievalRequest:
  frame: FramePacket
  prior_estimate: PositionEstimate optional
  search_radius_m: number optional
  max_candidates: integer
 ```
 **Output DTOs**:
 ```yaml
 RetrievalResult:
  query_descriptor_id: string
  candidates: list[VprCandidate]
 VprCandidate:
  chunk_id: string
  tile_id: string
  score: number
  footprint: geometry
  freshness_status: enum
 ```
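
For implementers, the DTOs above could be mirrored in Python roughly as follows. This is a sketch only: `FramePacket`, `PositionEstimate`, and the footprint geometry are left as `Any` because their shapes are defined elsewhere, `freshness_status` is a plain string because the enum values are not listed here, and the validation rules are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(frozen=True)
class VprCandidate:
    chunk_id: str
    tile_id: str
    score: float
    footprint: Any          # geometry type is defined elsewhere
    freshness_status: str   # enum values are not specified in this section


@dataclass(frozen=True)
class RetrievalRequest:
    frame: Any                               # FramePacket
    max_candidates: int
    prior_estimate: Optional[Any] = None     # PositionEstimate, optional
    search_radius_m: Optional[float] = None

    def __post_init__(self):
        # Assumed sanity checks; the spec does not state them explicitly.
        if self.max_candidates < 1:
            raise ValueError("max_candidates must be >= 1")
        if self.search_radius_m is not None and self.search_radius_m <= 0:
            raise ValueError("search_radius_m must be positive")


@dataclass(frozen=True)
class RetrievalResult:
    query_descriptor_id: str
    candidates: List[VprCandidate] = field(default_factory=list)
```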
 ## 3. Data Access Patterns
 | Query | Frequency | Hot Path | Index Needed |
 |-------|-----------|----------|--------------|
 | Top-K FAISS search | Triggered only | No (not steady-state) | FAISS index |
 | Import/export package sync | Pre-flight / post-flight only | No (never mid-flight) | Package manifest and sidecar hashes |
 | Load chunk metadata | Per candidate | No | PostgreSQL/PostGIS spatial and chunk indexes |
 ## 4. Implementation Details
 **State Management**: Holds loaded descriptor model and FAISS index handles; tracks pre-flight import and post-flight upload package status.
 **Key Dependencies**:
 | Library | Purpose |
 |---------|---------|
 | DINOv2 / ONNX / TensorRT candidate path | Query descriptor extraction |
 | FAISS CPU | Top-K retrieval |
 | Satellite Service client | Pre-flight cache import and post-flight generated-tile upload |
 **Error Handling Strategy**:
 - If descriptor extraction or index load fails, return no candidates and trigger degraded mode.
 - Optimized engines are allowed only after descriptor-fidelity tests pass.
 - Network/package sync is allowed only before takeoff or after landing, so sync failures are tolerated only in those phases; during flight, the component must never call a satellite provider or suite service.
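
The no-network-in-flight rule could be enforced with a guard around every sync call, sketched below. The `FlightPhase` enum, `SyncForbiddenInFlight` exception, and `guarded_sync` helper are hypothetical names for illustration; how the component actually learns the flight phase is not specified here.

```python
from enum import Enum


class FlightPhase(Enum):
    PRE_FLIGHT = "pre_flight"
    IN_FLIGHT = "in_flight"
    POST_FLIGHT = "post_flight"


class SyncForbiddenInFlight(RuntimeError):
    """Raised when a network/package sync is attempted during flight."""


def guarded_sync(phase, sync_call):
    """Run sync_call only before takeoff or after landing."""
    if phase is FlightPhase.IN_FLIGHT:
        # Refuse before touching the network; callers fall back to degraded mode.
        raise SyncForbiddenInFlight("sync is not allowed mid-flight")
    return sync_call()
```

Raising before the call is made keeps the check testable with the network disabled, which matches the intent of a no-mid-flight-calls test.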
 ## 5. Caveats & Edge Cases
 **Known limitations**:
 - VPR result is only a candidate, never an accepted fix.
 - Cross-domain retrieval can be wrong under seasonal, lighting, or terrain ambiguity.
 - External Satellite Service availability cannot be part of the mid-flight localization safety case.
 **Performance bottlenecks**:
 - Descriptor extraction on Jetson must be trigger-limited and profiled separately from BASALT.
 ## 6. Dependency Graph
 **Must be implemented after**: cache manifest/index schema, camera frame DTOs.
 **Can be implemented in parallel with**: anchor verification.
 **Blocks**: satellite relocalization flow.
 ## 7. Logging Strategy
 | Log Level | When | Example |
 |-----------|------|---------|
 | ERROR | Index unavailable | `faiss_index_unavailable id=...` |
 | WARN | No candidates | `vpr_no_candidates frame_id=...` |
 | INFO | Retrieval invoked | `vpr_query candidates=... latency_ms=...` |
 **Log format**: FDR structured event.
 **Log storage**: FDR segment.
@@ -0,0 +1,172 @@
 # Test Specification — Satellite Service
 ## Acceptance Criteria Traceability
 | AC ID | Acceptance Criterion | Test IDs | Coverage |
 |-------|---------------------|----------|----------|
 | AC-3.2 | Sharp-turn relocalization | IT-02, AT-01 | Covered |
 | AC-3.3 | >=3 disconnected segments | IT-03 | Covered |
 | AC-3.4 | Relocalization request after loss | IT-02 | Covered |
 | AC-4.1 | Trigger path latency | PT-01 | Covered |
 | AC-4.2 | Memory budget | PT-01 | Covered |
 | AC-8.1 | Cache imagery interface | IT-01 | Covered |
 | AC-8.3 | Preloaded/preprocessed cache | IT-01 | Covered |
 | AC-8.6 | VPR chunks, multi-scale, dynamic K | IT-01, IT-04 | Covered |
 | AC-NEW-1 | Cold-start first fix support | PT-02 | Covered |
 | AC-NEW-6 | Freshness-aware retrieval | IT-04, ST-01 | Covered |
 ## Blackbox Tests
 ### IT-01: Index Load And Chunk Coverage
 **Summary**: Verify preloaded VPR chunks and FAISS index cover the operational area.
 **Traces to**: AC-8.1, AC-8.3, AC-8.6
 **Input data**: PostgreSQL/PostGIS cache manifest, VPR chunk metadata, FAISS index.
 **Expected result**: Every test frame footprint falls inside at least one VPR chunk; fine and coarse descriptors are present where required.
 **Max execution time**: 2 minutes per mission cache.
 ---
 ### IT-02: Sharp-Turn Local Retrieval Trigger
 **Summary**: Verify sharp-turn state requests candidates rather than relying on frame-to-frame VO.
 **Traces to**: AC-3.2, AC-3.4
 **Input data**: Wrapper relocalization request with sharp-turn/loss reason.
 **Expected result**: Satellite Service returns bounded top-K candidates from preloaded local indexes based on sector/covariance policy.
 **Max execution time**: 2 seconds per query.
 ---
 ### IT-03: Disconnected Segment Retrieval
 **Summary**: Verify at least three disconnected segments can each retrieve candidate chunks.
 **Traces to**: AC-3.3
 **Input data**: Three disconnected query frames with approximate prior/covariance.
 **Expected result**: Each query returns a candidate set including the ground-truth region when covered by the cache fixture.
 **Max execution time**: Dataset-dependent.
 ---
 ### IT-04: Dynamic K And Freshness Filter
 **Summary**: Verify K varies by sector and covariance, and stale candidates are tagged.
 **Traces to**: AC-8.6, AC-NEW-6
 **Input data**: Stable and active-conflict sector cache fixtures with fresh/stale tiles.
 **Expected result**: K=5 for stable low-covariance, K=20 for active-conflict, K=50 fallback; stale candidates are flagged for rejection/down-confidence.
 **Max execution time**: 2 seconds per query.
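
The K policy in IT-04 can be sketched as a small lookup. The sector strings and the `low_covariance_m` cutoff are hypothetical, and treating every active-conflict query as K=20 regardless of covariance is an assumption; only the three K values come from the expected result.

```python
def select_k(sector, covariance_m, low_covariance_m=50.0):
    """Pick the candidate count K from sector class and position uncertainty."""
    if sector == "stable" and covariance_m is not None and covariance_m <= low_covariance_m:
        return 5    # stable sector with a tight prior
    if sector == "active_conflict":
        return 20   # assumed to apply at any covariance
    return 50       # fallback: unknown sector, missing prior, or wide covariance
```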
 ## Performance Tests
 ### PT-01: Retrieval Query Runtime
 **Summary**: Verify descriptor extraction and FAISS query fit trigger-path budget.
 **Traces to**: AC-4.1, AC-4.2
 **Load scenario**:
 - Query set: 100 representative relocalization frames.
 - Environment: Jetson and replay workstation.
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | Retrieval p95 | <=300 ms trigger path share | >400 ms |
 | Memory contribution | <=2 GB | >3 GB |
 | Candidate count policy | Exact | Any wrong K |
 **Resource limits**: Total system memory remains below 8 GB.
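
The p95 metric can be computed from the 100 per-query latencies as sketched below. The nearest-rank percentile definition and the `marginal` band between target and failure threshold are assumptions; the spec only gives the <=300 ms target and >400 ms failure threshold.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples <= it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def classify_latency(p95_ms, target_ms=300.0, fail_ms=400.0):
    """Compare a measured p95 against the target and failure threshold."""
    if p95_ms <= target_ms:
        return "pass"
    if p95_ms > fail_ms:
        return "fail"
    return "marginal"  # hypothetical label for the band between the two limits
```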
 ---
 ### PT-02: Cold-Start Index Load
 **Summary**: Verify retrieval readiness supports a first fix in <30 s.
 **Traces to**: AC-NEW-1
 **Load scenario**:
 - 50 cold-boot runs.
 - Cache/index mounted locally.
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | Index ready p95 | <=10 s | >15 s |
 | First retrieval p95 | Fits <30 s first-fix budget | Exceeds budget |
 ## Security Tests
 ### ST-01: Stale Candidate Handling
 **Summary**: Verify stale imagery cannot silently appear as a trusted retrieval candidate.
 **Traces to**: AC-NEW-6
 **Attack vector**: Manipulated cache manifest marks stale tile as available.
 **Test procedure**:
 1. Load cache fixture with stale capture dates.
 2. Query against stale region.
 **Expected behavior**: Candidate carries stale status; anchor path cannot accept it as `satellite_anchored`.
 **Pass criteria**: 0 stale candidates without explicit stale/down-confidence metadata.
 ---
 ### ST-02: No Mid-Flight Satellite Service Calls
 **Summary**: Verify relocalization never performs satellite-provider or suite Satellite Service network calls during flight.
 **Traces to**: AC-8.3, R-SAT-01
 **Attack vector**: Runtime attempts to fetch missing cache/index data over the network during relocalization.
 **Test procedure**:
 1. Disable external network access during a replay scenario.
 2. Trigger relocalization against preloaded cache fixtures.
 3. Inspect network call logs and Satellite Service client telemetry.
 **Expected behavior**: Retrieval uses only mounted local cache/index data; missing data produces degraded/no-candidate behavior, not a network fetch.
 **Pass criteria**: 0 mid-flight Satellite Service or satellite-provider calls.
 ## Acceptance Tests
 ### AT-01: Relocalization Candidate Returned
 **Summary**: Verify a relocalization request returns usable candidates for anchor verification.
 **Traces to**: AC-3.2, AC-8.6
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Submit sharp-turn query | Retrieval invokes VPR |
 | 2 | Read output | Candidate list includes chunk IDs, tile IDs, scores, footprints, freshness |
 ## Test Data Management
 | Data Set | Description | Source | Size |
 |----------|-------------|--------|------|
 | `cache_vpr_fixture` | PostGIS manifest, COGs, descriptors, FAISS index | Generated/cache fixture | Mission-dependent |
 | `aerial_vpr_queries` | Aerial query frames with ground-truth regions | ALTO/AerialVL/representative | Dataset-dependent |
 **Setup procedure**: Restore isolated PostgreSQL schema and mount read-only descriptor/index files.
 **Teardown procedure**: Drop schema and remove generated retrieval reports.
 **Data isolation strategy**: Per-run schema and read-only cache fixture volume.
@@ -0,0 +1,93 @@
 # Anchor Verification
 ## 1. High-Level Overview
 **Purpose**: Verify retrieved cache candidates with local feature matching and geometric checks before the safety wrapper considers an absolute anchor.
 **Architectural Pattern**: Validation pipeline.
 **Upstream dependencies**: Satellite Service, camera ingest/calibration, Tile Manager.
 **Downstream consumers**: Safety/anchor wrapper, FDR.
 ## 2. Internal Interfaces
 ### Interface: `AnchorVerifier`
 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
 | `verify` | `AnchorVerificationRequest` | `AnchorDecision` | Yes | `TileUnavailable`, `MatchFailed`, `GeometryFailed` |
 | `benchmark_matcher` | `MatcherBenchmarkRequest` | `MatcherBenchmarkReport` | Yes | `ModelUnavailable` |
 **Input DTOs**:
 ```yaml
 AnchorVerificationRequest:
  frame: FramePacket
  candidates: list[VprCandidate]
  matcher_profile: enum(aliked, disk, sift_orb_baseline)
 ```
 **Output DTOs**:
 ```yaml
 AnchorDecision:
  candidate_id: string
  accepted_by_geometry: boolean
  estimated_pose_wgs84: object optional
  inlier_count: integer
  mre_px: number
  homography: matrix optional
  rejection_reason: string optional
 ```
 ## 3. Data Access Patterns
 | Query | Frequency | Hot Path | Index Needed |
 |-------|-----------|----------|--------------|
 | Read candidate COG footprint/window | Triggered only | No | Tile spatial metadata |
 | Read matcher model | Startup | No | No |
 ## 4. Implementation Details
 **State Management**: Loads matcher/extractor models and tracks benchmark-selected profile.
 **Key Dependencies**:
 | Library | Purpose |
 |---------|---------|
 | ALIKED/DISK + LightGlue | Learned local matching |
 | OpenCV | RANSAC/USAC geometry and error metrics |
 **Error Handling Strategy**:
 - Low inlier count, high MRE, stale tile, or provenance failure returns a rejected decision with reason.
 - SuperPoint can be benchmarked only if legal approval allows its license use.
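
The rejection rules above could be combined into one decision function, sketched here. The 2.5 px MRE bound comes from the cross-domain MRE criterion in the test specification; `min_inliers=30`, the check order, and the reason strings are hypothetical.

```python
MAX_MRE_PX = 2.5  # accepted anchors need MRE strictly below this


def geometry_decision(inlier_count, mre_px, stale=False, provenance_ok=True, min_inliers=30):
    """Return the acceptance flag and rejection reason for one candidate."""
    reason = None
    if not provenance_ok:
        reason = "provenance_failed"
    elif stale:
        reason = "stale_tile"
    elif inlier_count < min_inliers:
        reason = "low_inliers"
    elif mre_px >= MAX_MRE_PX:
        reason = "high_mre"
    return {"accepted_by_geometry": reason is None, "rejection_reason": reason}
```

Returning a reason for every rejection keeps the decision auditable in the FDR log.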
 ## 5. Caveats & Edge Cases
 **Known limitations**:
 - ALIKED-LightGlue is not VIO by itself; it supplies correspondences. A full VIO path still needs state estimation and IMU fusion.
 - Optional frame-to-frame VO fallback must be benchmarked separately from cross-domain anchor verification.
 **Performance bottlenecks**:
 - Learned matching can exceed the Jetson budget if run per-frame; default invocation is trigger-based.
 ## 6. Dependency Graph
 **Must be implemented after**: Satellite Service candidate DTOs, Tile Manager tile access.
 **Can be implemented in parallel with**: VIO adapter.
 **Blocks**: accepted satellite-anchor path.
 ## 7. Logging Strategy
 | Log Level | When | Example |
 |-----------|------|---------|
 | ERROR | Matcher model unavailable | `lightglue_model_unavailable profile=aliked` |
 | WARN | Candidate rejected | `anchor_geometry_rejected mre_px=... inliers=...` |
 | INFO | Anchor verified | `anchor_verified tile_id=... mre_px=...` |
 **Log format**: FDR structured event.
 **Log storage**: FDR segment.
@@ -0,0 +1,124 @@
 # Test Specification — Anchor Verification
 ## Acceptance Criteria Traceability
 | AC ID | Acceptance Criterion | Test IDs | Coverage |
 |-------|---------------------|----------|----------|
 | AC-1.1 | 50 m frame-center accuracy via accepted anchors | AT-01 | Covered |
 | AC-1.2 | 20 m stretch accuracy via accepted anchors | AT-01 | Covered |
 | AC-2.1b | Satellite-anchor registration measured separately | IT-01 | Covered |
 | AC-2.2 | <2.5 px cross-domain MRE | IT-01, AT-01 | Covered |
 | AC-3.1 | Outlier rejection | ST-01 | Covered |
 | AC-4.1 | Trigger-path latency | PT-01 | Covered |
 | AC-4.2 | Memory budget | PT-01 | Covered |
 | AC-NEW-4 | False-position safety budget | ST-01 | Covered |
 | AC-NEW-6 | Stale imagery rejection evidence | IT-02, ST-02 | Covered |
 ## Blackbox Tests
 ### IT-01: Cross-Domain Match And RANSAC
 **Summary**: Verify ALIKED/DISK-LightGlue plus RANSAC produces measurable anchor evidence.
 **Traces to**: AC-2.1b, AC-2.2
 **Input data**: UAV frame, retrieved COG candidate window, ground-truth georegistration.
 **Expected result**: Accepted anchors have MRE <2.5 px, sufficient inliers, and homography/pose evidence.
 **Max execution time**: 2 seconds per candidate set.
 ---
 ### IT-02: Stale Candidate Verification
 **Summary**: Verify stale or provenance-failed candidates are rejected or marked unsafe.
 **Traces to**: AC-NEW-6
 **Input data**: Candidate list with stale tile metadata and valid-looking image content.
 **Expected result**: `AnchorDecision` is rejected or carries stale/provenance failure; safety wrapper cannot accept it as trusted.
 **Max execution time**: 2 seconds per candidate.
 ## Performance Tests
 ### PT-01: Local Matcher Runtime
 **Summary**: Verify learned matching stays bounded on Jetson when invoked.
 **Traces to**: AC-4.1, AC-4.2
 **Load scenario**:
 - Candidate sets: K=5, K=20, K=50.
 - Matcher profiles: ALIKED, DISK, SIFT/ORB baseline.
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | K=5 p95 | <=300 ms | >500 ms |
 | K=20 p95 | Reported and bounded | Unbounded/no report |
 | Memory contribution | <=2 GB | >3 GB |
 **Resource limits**: Total Jetson shared memory remains below 8 GB.
 ## Security Tests
 ### ST-01: False Match / Impossible Jump Rejection
 **Summary**: Verify visually plausible but geographically impossible matches are rejected.
 **Traces to**: AC-3.1, AC-NEW-4
 **Attack vector**: Candidate tile from wrong region with repetitive texture.
 **Test procedure**:
 1. Pair query with wrong-region candidate.
 2. Run matcher and RANSAC.
 3. Pass result to safety wrapper fixture.
 **Expected behavior**: Anchor is rejected by geometry or downstream consistency gates.
 **Pass criteria**: 0 accepted anchors from wrong-region fixtures.
 ---
 ### ST-02: Provenance Failure
 **Summary**: Verify unsigned/hash-failed candidate tiles cannot produce accepted anchors.
 **Traces to**: AC-NEW-6
 **Attack vector**: Tampered COG or sidecar.
 **Test procedure**: Run verification against tampered tile fixture.
 **Expected behavior**: `AnchorDecision` includes provenance failure and is not accepted.
 **Pass criteria**: 0 trusted anchors from tampered tiles.
 ## Acceptance Tests
 ### AT-01: Accepted Anchor Accuracy
 **Summary**: Verify accepted anchors support position accuracy thresholds.
 **Traces to**: AC-1.1, AC-1.2, AC-2.2
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Verify satellite candidate against query | MRE <2.5 px |
 | 2 | Convert accepted anchor to WGS84 evidence | Error supports 50 m / 20 m aggregate thresholds |
 ## Test Data Management
 | Data Set | Description | Source | Size |
 |----------|-------------|--------|------|
 | `anchor_match_fixture` | Query frames, COG windows, expected georegistration | ALTO/AerialVL/project cache | Dataset-dependent |
 | `tampered_tile_fixture` | Hash/signature/stale cases | Generated fixture | Small |
 **Setup procedure**: Load cache fixture and matcher model profile.
 **Teardown procedure**: Remove matcher outputs and reports.
 **Data isolation strategy**: Read-only imagery with per-run output folders.
@@ -0,0 +1,92 @@
 # Tile Manager
 ## 1. High-Level Overview
 **Purpose**: Manage local tiles: service-source COGs, manifests, descriptor metadata, freshness/provenance checks, nadir-image orthorectification into generated tiles, generated tile writes, and post-flight package preparation.
 **Architectural Pattern**: Repository + policy gate.
 **Upstream dependencies**: Satellite Service cache packages, safety/anchor wrapper, camera ingest/calibration.
 **Downstream consumers**: Satellite Service, anchor verification, FDR, post-flight sync.
 ## 2. Internal Interfaces
 ### Interface: `TileManager`
 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
 | `validate_cache` | `CacheValidationRequest` | `CacheValidationReport` | No | `ManifestInvalid`, `SignatureInvalid` |
 | `get_tile_window` | `TileWindowRequest` | `TileWindow` | No | `TileUnavailable`, `TileRejected` |
 | `orthorectify_frame` | `TileGenerationRequest` | `GeneratedTileCandidate` | Yes | `TileWriteRejected`, `FrameNotUsable` |
 | `write_generated_tile` | `GeneratedTileRequest` | `GeneratedTileRecord` | Yes | `TileWriteRejected`, `StorageFull` |
 | `package_sync` | `SyncPackageRequest` | `SyncPackage` | Yes | `PackageFailed` |
 ## 3. Data Access Patterns
 | Query | Frequency | Hot Path | Index Needed |
 |-------|-----------|----------|--------------|
 | Tile by footprint/time/freshness | Per retrieval/anchor | Yes during relocalization | Spatial/time indexes |
 | Descriptor metadata by chunk | Per Satellite Service retrieval | Yes during relocalization | Chunk ID index |
 | Generated tile by mission/sector | Post-flight | No | Mission ID index |
 ### Caching Strategy
 | Data | Cache Type | TTL | Invalidation |
 |------|------------|-----|--------------|
 | Manifest metadata | PostgreSQL/PostGIS query cache / process cache | Mission duration | New mission cache load |
 | Sidecar verification | In-memory result cache | Mission duration | File hash change |
 ### Storage Estimates
 | Table/Collection | Est. Row Count | Row Size | Total Size | Growth Rate |
 |------------------|----------------|----------|------------|-------------|
 | Cache manifest tiles | Mission-dependent | Small metadata | Within ~10 GB package with imagery | Per mission |
 | Generated tiles | Flight-dependent | Metadata + COG payload | FDR/cache budget constrained | Per flight |
 ## 4. Implementation Details
 **State Management**: Owns PostgreSQL/PostGIS manifest connection, sidecar verification state, and generated tile staging area.
 **Key Dependencies**:
 | Library | Purpose |
 |---------|---------|
 | PostgreSQL + PostGIS | Manifest, spatial metadata, freshness queries, and generated-tile metadata |
 | GDAL/rasterio candidate | COG read/write |
 | OpenCV/GDAL geometry utilities | Nadir-frame orthorectification into generated COG tiles |
 | Cryptographic hash/signature library | Sidecar validation |
 **Error Handling Strategy**:
 - Invalid signatures/hashes reject tiles.
 - Storage-full blocks generated tile writes without affecting localization output.
 - Cache validation failure blocks mission cache usage.
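
The hash half of sidecar validation might look like the sketch below, assuming the sidecar carries a hex SHA-256 digest of the tile payload. The sidecar field layout, the helper names, and the choice of SHA-256 are assumptions; signature verification is out of scope for this sketch.

```python
import hashlib
import hmac


def read_chunks(path, chunk_size=1 << 20):
    """Yield a file's bytes in fixed-size chunks so large COGs are not loaded at once."""
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            yield block


def sha256_chunks(chunks):
    """Hex SHA-256 digest over an iterable of byte chunks."""
    digest = hashlib.sha256()
    for block in chunks:
        digest.update(block)
    return digest.hexdigest()


def tile_hash_matches(chunks, expected_sha256_hex):
    """Compare the computed digest with the sidecar value in constant time."""
    return hmac.compare_digest(sha256_chunks(chunks), expected_sha256_hex.lower())
```

A caller would pass `read_chunks(tile_path)` and reject the tile when the function returns `False`.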
 ## 5. Caveats & Edge Cases
 **Known limitations**:
 - JSON-only manifests are avoided for scale and queryability, but signed JSON sidecars remain required for audit/interchange.
 - PostgreSQL/PostGIS must be available locally before flight; runtime cannot depend on a remote DB link.
 **Potential race conditions**:
 - The generated tile write and the PostgreSQL manifest update must be committed atomically (or rolled back together) so that a partial failure cannot leave orphaned metadata that is treated as trusted.
 ## 6. Dependency Graph
 **Must be implemented after**: data model schema decisions.
 **Can be implemented in parallel with**: camera ingest, MAVLink integration.
 **Blocks**: Satellite Service retrieval, anchor verification, generated tile lifecycle.
 ## 7. Logging Strategy
 | Log Level | When | Example |
 |-----------|------|---------|
 | ERROR | Cache package invalid | `cache_manifest_invalid reason=signature` |
 | WARN | Tile rejected | `tile_rejected reason=stale tile_id=...` |
 | INFO | Generated tile staged | `generated_tile_written tile_id=...` |
 **Log format**: FDR structured event.
 **Log storage**: FDR segment plus cache validation report.
@@ -0,0 +1,167 @@
 # Test Specification — Tile Manager
 ## Acceptance Criteria Traceability
 | AC ID | Acceptance Criterion | Test IDs | Coverage |
 |-------|---------------------|----------|----------|
 | AC-4.2 | Memory/storage pressure | PT-01 | Covered |
 | AC-8.1 | Resolution at cache interface | IT-01 | Covered |
 | AC-8.2 | Freshness thresholds | IT-02, ST-01 | Covered |
 | AC-8.3 | Preloaded/preprocessed offline cache | IT-01 | Covered |
 | AC-8.4 | Mid-flight tile generation/write-back | IT-03, AT-01 | Covered |
 | AC-8.5 | Persistent imagery policy | ST-02 | Covered |
 | AC-8.6 | VPR chunk metadata | IT-04 | Covered |
 | AC-NEW-3 | FDR/tile storage cap interaction | PT-01 | Covered |
 | AC-NEW-6 | Imagery freshness enforcement | IT-02, ST-01 | Covered |
 | AC-NEW-7 | Cache-poisoning safety budget | ST-03, AT-01 | Covered |
 ## Blackbox Tests
 ### IT-01: Mission Cache Validation
 **Summary**: Verify preloaded COGs, PostGIS metadata, sidecars, descriptors, and indexes validate before flight.
 **Traces to**: AC-8.1, AC-8.3
 **Input data**: Mission cache package with COGs, signed JSON sidecars, PostGIS manifest seed, FAISS index files.
 **Expected result**: Valid cache passes resolution, hash, signature, descriptor-reference, and spatial coverage checks.
 **Max execution time**: 5 minutes per cache fixture.
 ---
 ### IT-02: Freshness Gate
 **Summary**: Verify active-conflict and stable-rear freshness rules.
 **Traces to**: AC-8.2, AC-NEW-6
 **Input data**: Tiles at fresh, grace, and stale ages for both sector classes.
 **Expected result**: Fresh tiles pass, grace tiles are down-confidence weighted if allowed, stale tiles are rejected and cannot emit `satellite_anchored`.
 **Max execution time**: 2 minutes.
 ---
 ### IT-03: Generated Tile Write
 **Summary**: Verify nadir frames are orthorectified and written as generated tiles only when pose and frame quality gates pass.
 **Traces to**: AC-8.4
 **Input data**: Frame metadata with pose covariance <=3 m, 3-5 m, and >5 m.
 **Expected result**: <=3 m writes full-quality candidate, 3-5 m writes soft candidate, >5 m rejects write.
 **Max execution time**: 2 minutes.
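
The covariance bands in IT-03 map directly onto a write decision, sketched below. The return strings and the `frame_usable` flag are hypothetical names, and rejecting non-finite or negative covariance is an added assumption; note that no branch ever returns a trusted level.

```python
import math


def tile_write_decision(pose_covariance_m, frame_usable=True):
    """Classify a generated-tile write by pose covariance and frame quality."""
    if not frame_usable:
        return "rejected"
    if not math.isfinite(pose_covariance_m) or pose_covariance_m < 0.0:
        return "rejected"  # assumed guard against malformed covariance
    if pose_covariance_m <= 3.0:
        return "full_quality_candidate"
    if pose_covariance_m <= 5.0:
        return "soft_candidate"
    return "rejected"
```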
 ---
 ### IT-04: VPR Chunk Metadata
 **Summary**: Verify chunk metadata supports retrieval rules.
 **Traces to**: AC-8.6
 **Input data**: Operational-area cache manifest.
 **Expected result**: Chunks are 600-800 m equivalent footprint with 40-50% overlap and multi-scale active-sector descriptors.
 **Max execution time**: 2 minutes.
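
The footprint and overlap rule implies a chunk stride, which a metadata check could verify. The sketch below assumes square chunks laid out on a regular grid along each axis; the function names are hypothetical and the real chunking scheme may differ.

```python
import math


def chunk_metadata_ok(footprint_m, overlap_fraction):
    """Check the 600-800 m footprint and 40-50% overlap rule."""
    return 600.0 <= footprint_m <= 800.0 and 0.40 <= overlap_fraction <= 0.50


def chunk_stride_m(footprint_m, overlap_fraction):
    """Distance between neighbouring chunk origins for a given overlap."""
    return footprint_m * (1.0 - overlap_fraction)


def chunks_along_axis(extent_m, footprint_m, overlap_fraction):
    """Number of chunks needed to cover one axis of the operational area."""
    if extent_m <= footprint_m:
        return 1
    stride = chunk_stride_m(footprint_m, overlap_fraction)
    return 1 + math.ceil((extent_m - footprint_m) / stride)
```

For example, 800 m chunks at 50% overlap advance 400 m per step, so a 2000 m extent needs four chunks along that axis.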
 ## Performance Tests
 ### PT-01: Cache And FDR Storage Budget
 **Summary**: Verify cache metadata and generated tile writes stay within storage/memory budgets.
 **Traces to**: AC-4.2, AC-NEW-3
 **Load scenario**:
 - Mission cache: up to operational budget.
 - Generated tiles: 8-hour synthetic flight.
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | Persistent cache | <=10 GB unless split budget approved | >budget without report |
 | FDR + generated artifacts | <=64 GB per flight | >64 GB without rollover |
 | DB query p95 | <=50 ms for indexed tile lookup | >150 ms |
 **Resource limits**: PostgreSQL/PostGIS stays within total system 8 GB memory budget.
 ## Security Tests
 ### ST-01: Signed Manifest Enforcement
 **Summary**: Verify unsigned/tampered/stale manifests are rejected.
 **Traces to**: AC-8.2, AC-NEW-6
 **Attack vector**: Tampered sidecar, bad hash, unsigned manifest.
 **Test procedure**: Load invalid cache variants and run validation.
 **Expected behavior**: Invalid tiles are rejected and logged.
 **Pass criteria**: 0 invalid cache entries become available to retrieval/anchor verification.
 ---
 ### ST-02: Raw Frame Persistence Check
 **Summary**: Verify Tile Manager persists tiles, not raw frames.
 **Traces to**: AC-8.5
 **Attack vector**: Raw frames accidentally stored as generated artifacts.
 **Test procedure**: Run tile generation and inspect cache/FDR outputs.
 **Expected behavior**: Only COG tiles, sidecars, manifests, and allowed failed-frame thumbnails exist.
 **Pass criteria**: No raw full-resolution frames retained.
 ---
 ### ST-03: Cache Poisoning Gate
 **Summary**: Verify misaligned generated tiles cannot become trusted basemap.
 **Traces to**: AC-NEW-7
 **Attack vector**: Over-confident pose writes misaligned generated tile.
 **Test procedure**: Inject deflated covariance and wrong pose during tile write.
 **Expected behavior**: Tile is rejected or marked candidate/soft; never promoted to trusted by onboard component.
 **Pass criteria**: 0 direct trusted basemap promotions onboard.
 ## Acceptance Tests
 ### AT-01: Generated Tile Package For Satellite Service
 **Summary**: Verify post-flight sync package contains valid generated tiles and metadata.
 **Traces to**: AC-8.4, AC-NEW-7
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Orthorectify and write generated candidate tile | COG + sidecar + PostGIS manifest row created |
 | 2 | Package post-flight sync | Manifest delta includes trust level and parent covariance |
 | 3 | Inspect package | No tile is marked trusted basemap by onboard runtime |
 ## Test Data Management
 | Data Set | Description | Source | Size |
 |----------|-------------|--------|------|
 | `cache_integrity_fixtures` | Valid/stale/unsigned/hash-mismatched manifests | Generated fixture | Small |
 | `mission_cache_fixture` | COGs, descriptors, PostGIS seed | Satellite Service stub | Mission-dependent |
 **Setup procedure**: Restore isolated PostgreSQL/PostGIS schema and mount cache fixture read-only except generated-tile staging.
 **Teardown procedure**: Drop schema and delete generated staging volume.
 **Data isolation strategy**: Per-run mission ID, schema, and staging directory.
@@ -0,0 +1,69 @@
 # MAVLink And GCS Integration
 ## 1. High-Level Overview
 **Purpose**: Subscribe to flight-controller telemetry, emit `GPS_INPUT`, and send downsampled QGroundControl status/failsafe messages.
 **Architectural Pattern**: Protocol adapter.
 **Upstream dependencies**: ArduPilot Plane FC, safety/anchor wrapper.
 **Downstream consumers**: VIO adapter, safety/anchor wrapper, QGC, FDR.
 ## 2. Internal Interfaces
 ### Interface: `MavlinkGateway`
 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
 | `subscribe_telemetry` | `TelemetrySubscriptionRequest` | `TelemetrySample` | Yes | `MavlinkDisconnected` |
 | `emit_gps_input` | `PositionEstimate` | `EmitResult` | Yes | `MavlinkDisconnected`, `InvalidGpsInput` |
 | `emit_status` | `GcsStatusMessage` | `EmitResult` | Yes | `MavlinkDisconnected` |
 ## 3. Data Access Patterns
 No persistent data ownership; telemetry and emitted packets are mirrored to FDR.
 ## 4. Implementation Details
 **State Management**: Maintains MAVLink connection status, source/system IDs, and rate limiters for QGC status.
 **Key Dependencies**:
 | Library | Purpose |
 |---------|---------|
 | MAVSDK | Telemetry subscriptions |
 | pymavlink | Exact `GPS_INPUT` field emission |
 **Error Handling Strategy**:
 - Invalid `GPS_INPUT` fields are rejected before emission.
 - Connection loss is surfaced to wrapper/FDR and does not silently drop safety events.
 ## 5. Caveats & Edge Cases
 **Known limitations**:
 - v1 emits `GPS_INPUT` only, not velocity-target navigation commands.
 - Plane parameter configuration must be validated in SITL before hardware use.
 **Performance bottlenecks**:
 - Status text must be rate-limited to avoid telemetry noise.
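
A minimal rate limiter for QGC status text could look like the sketch below. The class name, the 2 Hz default, and the `safety_critical` bypass are assumptions; the bypass reflects the requirement that safety events are not silently dropped, but the actual mechanism is not specified here.

```python
class StatusRateLimiter:
    """Allow at most max_hz routine status messages per second."""

    def __init__(self, max_hz=2.0):
        self._min_interval_s = 1.0 / max_hz
        self._last_emit_s = None

    def allow(self, now_s, safety_critical=False):
        """Return True if a message may be emitted at time now_s (seconds)."""
        due = (
            self._last_emit_s is None
            or now_s - self._last_emit_s >= self._min_interval_s
        )
        if safety_critical or due:
            self._last_emit_s = now_s
            return True
        return False
```

Passing the clock value in explicitly keeps the limiter deterministic under replay.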
 ## 6. Dependency Graph
 **Must be implemented after**: position estimate DTO and MAVLink output contract.
 **Can be implemented in parallel with**: Tile Manager, camera ingest.
 **Blocks**: SITL integration and production FC output.
 ## 7. Logging Strategy
 | Log Level | When | Example |
 |-----------|------|---------|
 | ERROR | MAVLink disconnected | `mavlink_disconnected endpoint=...` |
 | WARN | Invalid output rejected | `gps_input_invalid reason=...` |
 | INFO | FC link established | `mavlink_connected system_id=...` |
 **Log format**: FDR structured event.
 **Log storage**: FDR segment and optional tlog.
@@ -0,0 +1,176 @@
 # Test Specification — MAVLink And GCS Integration
 ## Acceptance Criteria Traceability
 | AC ID | Acceptance Criterion | Test IDs | Coverage |
 |-------|---------------------|----------|----------|
 | AC-4.3 | v1 GPS_INPUT only for ArduPilot Plane | IT-01, AT-01 | Covered |
 | AC-4.4 | Frame-by-frame streaming | PT-01 | Covered |
 | AC-4.5 | Updated estimates/corrections | IT-02 | Covered |
 | AC-5.1 | FC state initialization telemetry | IT-03 | Covered |
 | AC-5.2 | Plane SITL fallback | IT-04 | Covered |
 | AC-6.1 | QGC status 1-2 Hz | IT-05, PT-02 | Covered |
 | AC-6.2 | GCS command ingress | IT-06, ST-01 | Covered |
 | AC-6.3 | WGS84 output | IT-01 | Covered |
 | AC-NEW-2 | Spoofing promotion <3 s | IT-04 | Covered |
 | AC-NEW-8 | Blackout/failsafe status | IT-05 | Covered |
 ## Blackbox Tests
 ### IT-01: GPS_INPUT Field Mapping
 **Summary**: Verify `PositionEstimate` maps to valid MAVLink `GPS_INPUT`.
 **Traces to**: AC-4.3, AC-6.3
 **Input data**: Position estimates across all source labels.
 **Expected result**: v1 emits `GPS_INPUT` only, no `ODOMETRY`; WGS84 lat/lon/alt, fix type, ignore flags, and accuracy fields match contract.
 **Max execution time**: 2 minutes.
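A fixture for this test needs a reference mapping to compare against. The sketch below assumes the `GPS_INPUT` convention of latitude and longitude as integers in degrees × 1e7 and altitude in metres; the function name and the returned dictionary shape are hypothetical, and the remaining `GPS_INPUT` fields (ignore flags, velocity, time) are deliberately left out.

```python
def to_gps_input_fields(lat_deg, lon_deg, alt_msl_m, fix_type, horiz_accuracy_m):
    """Map WGS84 estimate values to a partial GPS_INPUT field dictionary (illustrative)."""
    return {
        "lat": round(lat_deg * 1e7),   # degrees -> degE7 integer
        "lon": round(lon_deg * 1e7),   # degrees -> degE7 integer
        "alt": float(alt_msl_m),       # metres above mean sea level
        "fix_type": int(fix_type),
        "horiz_accuracy": float(horiz_accuracy_m),
    }
```

Rounding rather than truncating keeps the degE7 value within half a unit (roughly a centimetre of latitude) of the floating-point input.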
 ---
 ### IT-02: Correction Emission
 **Summary**: Verify updated estimates can be emitted without batching.
 **Traces to**: AC-4.5
 **Input data**: Original VO estimate followed by anchor-corrected estimate.
 **Expected result**: Both estimates are emitted in order with updated accuracy/source label.
 **Max execution time**: 2 minutes.
 ---
 ### IT-03: FC Telemetry Subscription
 **Summary**: Verify telemetry needed for initialization and VIO is available.
 **Traces to**: AC-5.1
 **Input data**: Plane SITL or MAVLink replay with EKF position, IMU, attitude, airspeed, altitude.
 **Expected result**: Normalized `TelemetrySample` stream includes required fields and timestamps.
 **Max execution time**: 5 minutes.
 ---
 ### IT-04: Spoofing And Fallback In Plane SITL
 **Summary**: Verify spoofing and no-estimate behavior in ArduPilot Plane SITL.
 **Traces to**: AC-5.2, AC-NEW-2
 **Input data**: Plane SITL production parameter set and spoofing trace.
 **Expected result**: Own-estimate promotion occurs in <3 s; fallback/no-estimate behavior matches Plane parameters.
 **Max execution time**: 10 minutes.
 ---
 ### IT-05: QGC Blackout Status
 **Summary**: Verify degraded-mode messages are visible at required rate.
 **Traces to**: AC-6.1, AC-NEW-8
 **Input data**: Safety wrapper emits blackout and failsafe statuses.
 **Expected result**: QGC observer sees `VISUAL_BLACKOUT_IMU_ONLY` at 1-2 Hz and `VISUAL_BLACKOUT_FAILSAFE` at threshold.
 **Max execution time**: 10 minutes.
 ---
 ### IT-06: Operator Relocalization Hint
 **Summary**: Verify GCS command ingress can carry approximate relocalization hints.
 **Traces to**: AC-6.2
 **Input data**: STATUSTEXT/NAMED_VALUE_FLOAT/custom dialect hint fixture.
 **Expected result**: Valid hint is parsed and forwarded to retrieval/safety logic; invalid hint is rejected.
 **Max execution time**: 5 minutes.
 ## Performance Tests
 ### PT-01: Frame-Rate Emission
 **Summary**: Verify output is streamed frame-by-frame and not batched.
 **Traces to**: AC-4.4
 **Load scenario**:
 - Input estimate rate: target frame rate.
 - Duration: 30 minutes.
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | Output delay p95 | <=25 ms after wrapper output | >100 ms |
 | Missing messages | 0 except upstream dropped frames | Any silent drop |
 ---
 ### PT-02: QGC Status Rate Limit
 **Summary**: Verify QGC status is downsampled without losing critical transitions.
 **Traces to**: AC-6.1
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | Status rate | 1-2 Hz while active | <1 Hz or >2 Hz sustained |
 | Critical transition delay | <=1 s | >2 s |
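One way to meet both rows of this table is a limiter that caps routine status at 2 Hz but lets a changed critical status through immediately. This is a sketch, not the component's design: `StatusRateLimiter` and its parameters are hypothetical, and it enforces only the upper rate bound; the 1 Hz lower bound depends on how often the caller offers a status.

```python
class StatusRateLimiter:
    """Cap routine status messages at a maximum rate; pass critical transitions at once."""

    def __init__(self, max_rate_hz: float = 2.0):
        self._min_interval_s = 1.0 / max_rate_hz
        self._last_emit_s = None
        self._last_status = None

    def should_emit(self, status: str, now_s: float, critical: bool = False) -> bool:
        changed = status != self._last_status
        due = (
            self._last_emit_s is None
            or (now_s - self._last_emit_s) >= self._min_interval_s
        )
        # A critical status bypasses the interval only when it is a new transition,
        # so a repeated critical status is still rate-limited.
        if due or (critical and changed):
            self._last_emit_s = now_s
            self._last_status = status
            return True
        return False
```

Passing `now_s` in explicitly, instead of reading a clock inside the class, keeps the limiter deterministic under replay.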
 ## Security Tests
 ### ST-01: MAVLink Source And Command Validation
 **Summary**: Verify unauthorized or malformed MAVLink messages are rejected.
 **Traces to**: AC-6.2
 **Attack vector**: Malicious source sends spoofed command or GPS data.
 **Test procedure**:
 1. Send valid command from allowed source.
 2. Send same command from disallowed source/system ID.
 3. Send malformed values.
 **Expected behavior**: Allowed command is accepted; disallowed/malformed messages are rejected and logged.
 **Pass criteria**: 0 unauthorized commands affect localization state.
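The three procedure steps reduce to an allow-list check followed by a value sanity check. The sketch below is illustrative only: `is_command_accepted` and the `(system_id, component_id)` pair used in the example are hypothetical, and a real implementation would also log each rejection.

```python
import math
from typing import Iterable


def is_command_accepted(
    system_id: int,
    component_id: int,
    values: Iterable[float],
    allowed_sources: frozenset,
) -> bool:
    """Accept only commands from an allowed source whose numeric values are finite."""
    if (system_id, component_id) not in allowed_sources:
        return False
    return all(isinstance(v, (int, float)) and math.isfinite(v) for v in values)
```

Checking the source before the payload means a disallowed sender cannot probe the parser with malformed values.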
 ## Acceptance Tests
 ### AT-01: Plane SITL Output Acceptance
 **Summary**: Verify ArduPilot Plane receives and uses v1 `GPS_INPUT` as configured.
 **Traces to**: AC-4.3
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Start Plane SITL with production params | FC accepts external GPS substitute config |
 | 2 | Emit `GPS_INPUT` estimate | Message is received with expected fields |
 | 3 | Observe wire | `ODOMETRY` is absent in v1 |
 ## Test Data Management
 | Data Set | Description | Source | Size |
 |----------|-------------|--------|------|
 | `sitl_spoofing_scenarios` | GPS loss/spoofing traces | Generated SITL | Small |
 | `mavlink_output_fixtures` | PositionEstimate cases | Generated fixture | Small |
 **Setup procedure**: Start SITL/QGC observer or replay MAVLink log.
 **Teardown procedure**: Stop processes and archive tlogs.
 **Data isolation strategy**: Unique MAVLink ports and run IDs per test.
@@ -0,0 +1,79 @@
 # FDR And Observability
 ## 1. High-Level Overview
 **Purpose**: Record bounded, replayable mission evidence and expose runtime health/status events for analysis and operator awareness.
 **Architectural Pattern**: Append-only event sink + exporter.
 **Upstream dependencies**: All runtime components.
 **Downstream consumers**: Validation harness, post-flight audit tools, QGC status through MAVLink component.
 ## 2. Internal Interfaces
 ### Interface: `FlightRecorder`
 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
 | `append_event` | `FdrEvent` | `AppendResult` | Yes | `RecorderUnavailable`, `StorageFull` |
 | `rollover` | `RolloverRequest` | `FdrSegmentInfo` | No | `RolloverFailed` |
 | `export` | `ExportRequest` | `ExportResult` | Yes | `ExportFailed` |
 ## 3. Data Access Patterns
 | Query | Frequency | Hot Path | Index Needed |
 |-------|-----------|----------|--------------|
 | Append event | High | Yes | Append index only |
 | Export by time/type | Post-flight | No | Time/type index |
 ### Storage Estimates
 | Table/Collection | Est. Row Count | Row Size | Total Size | Growth Rate |
 |------------------|----------------|----------|------------|-------------|
 | FDR events | Flight-dependent | Mixed | <=64 GB per 8 h | Per flight |
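The cap in this table implies a sustained write budget that is useful when sizing event payloads. A small worked calculation, assuming decimal gigabytes (10^9 bytes) and a full 8-hour flight; the constant and function names are hypothetical.

```python
FDR_CAP_BYTES = 64 * 10**9       # assumes decimal gigabytes
FLIGHT_DURATION_S = 8 * 3600     # 8-hour flight in seconds


def sustained_write_budget_bytes_per_s(
    cap_bytes: int = FDR_CAP_BYTES, duration_s: int = FLIGHT_DURATION_S
) -> float:
    """Average write rate that exactly fills the cap over the flight."""
    return cap_bytes / duration_s
```

Under these assumptions the budget is about 2.2 MB/s averaged over the flight; sustained rates above that require rollover or retention to stay within the cap.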
 ## 4. Implementation Details
 **State Management**: Owns active segment, rollover policy, and export state.
 **Key Dependencies**:
 | Library | Purpose |
 |---------|---------|
 | PostgreSQL client | Event metadata, time/type indexes, mission query surface |
 | CBOR writer | Bounded runtime payload segments |
 | Parquet writer | Optional post-flight export |
 **Error Handling Strategy**:
 - Storage-full emits critical status and starts rollover/retention behavior.
 - Append failures are surfaced to the caller and health system.
 ## 5. Caveats & Edge Cases
 **Known limitations**:
 - Raw frames are not retained by default; only metadata, decisions, hashes, and occlusion/blackout status are recorded.
 - PostgreSQL availability is required for indexed FDR metadata; CBOR payload segments preserve bounded append behavior for high-volume data.
 **Performance bottlenecks**:
 - FDR appends must not block hot-path localization.
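A bounded queue with a non-blocking put is one way to keep appends off the hot path while still accounting for every loss. In this sketch the class name, `dropped` counter, and `drain` method are hypothetical, and a thread-safe standard-library queue stands in for whatever async mechanism the implementation actually chooses.

```python
import queue


class NonBlockingFdrQueue:
    """Bounded hand-off between the hot path and a background FDR writer."""

    def __init__(self, max_pending: int = 1024):
        self._pending = queue.Queue(maxsize=max_pending)
        self.dropped = 0  # surfaced to health reporting so loss is never silent

    def append_event(self, event: dict) -> bool:
        """Enqueue without blocking; return False and count the drop if full."""
        try:
            self._pending.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, limit: int = 100) -> list:
        """Called by the writer: take up to `limit` events in arrival order."""
        batch = []
        while len(batch) < limit:
            try:
                batch.append(self._pending.get_nowait())
            except queue.Empty:
                break
        return batch
```

The caller sees a full queue as an explicit `False` plus a counter increment, which matches the rule that append failures are surfaced rather than hidden.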
 ## 6. Dependency Graph
 **Must be implemented after**: event schema and key DTOs.
 **Can be implemented in parallel with**: MAVLink integration.
 **Blocks**: release evidence and most validation reports.
 ## 7. Logging Strategy
 | Log Level | When | Example |
 |-----------|------|---------|
 | ERROR | Recorder unavailable | `fdr_unavailable path=...` |
 | WARN | Rollover occurs | `fdr_rollover segment=...` |
 | INFO | Export complete | `fdr_export_complete format=parquet` |
 **Log format**: FDR event metadata plus local health logs.
 **Log storage**: PostgreSQL FDR event tables plus CBOR segment payloads.
@@ -0,0 +1,166 @@
 # Test Specification — FDR And Observability
 ## Acceptance Criteria Traceability
 | AC ID | Acceptance Criterion | Test IDs | Coverage |
 |-------|---------------------|----------|----------|
 | AC-1.3 | Anchor age/drift evidence | IT-01 | Covered |
 | AC-1.4 | Confidence/source label retained | IT-01 | Covered |
 | AC-4.4 | Per-frame local stream evidence | IT-01, PT-01 | Covered |
 | AC-5.2 | Failure logging | IT-02 | Covered |
 | AC-6.1 | QGC/status evidence | IT-03 | Covered |
 | AC-8.4 | Generated tile audit | IT-04 | Covered |
 | AC-8.5 | No raw frame retention | ST-01 | Covered |
 | AC-NEW-3 | FDR retention and 64 GB cap | PT-01, AT-01 | Covered |
 | AC-NEW-4 | False-position forensics | IT-05 | Covered |
 | AC-NEW-5 | Thermal/throttle logging | IT-06 | Covered |
 | AC-NEW-8 | Blackout/failsafe logging | IT-02, IT-03 | Covered |
 ## Blackbox Tests
 ### IT-01: Per-Estimate Event Capture
 **Summary**: Verify every estimate stores covariance, source label, anchor age, and emitted output metadata.
 **Traces to**: AC-1.3, AC-1.4, AC-4.4
 **Input data**: Position estimate stream with satellite, VO, and dead-reckoned labels.
 **Expected result**: PostgreSQL event index and CBOR payload segments contain all required fields with monotonic timestamps.
 **Max execution time**: 5 minutes.
 ---
 ### IT-02: Failure And Blackout Logging
 **Summary**: Verify no-estimate and blackout transitions are recorded.
 **Traces to**: AC-5.2, AC-NEW-8
 **Input data**: No-estimate gap and total blackout sequence.
 **Expected result**: FDR records the start of the degraded interval, every degraded estimate, the failsafe threshold crossing, and the recovery reason.
 **Max execution time**: 10 minutes.
 ---
 ### IT-03: QGC Status Audit
 **Summary**: Verify operator-visible status has matching FDR evidence.
 **Traces to**: AC-6.1, AC-NEW-8
 **Input data**: QGC status messages from MAVLink component.
 **Expected result**: FDR contains status text, timestamp, and mode context.
 **Max execution time**: 5 minutes.
 ---
 ### IT-04: Generated Tile Audit Trail
 **Summary**: Verify tile-write decisions are recorded with parent covariance and trust level.
 **Traces to**: AC-8.4
 **Input data**: Accepted and rejected generated tile write decisions.
 **Expected result**: FDR includes tile ID, parent covariance, trust level, sidecar hash, and rejection reason where applicable.
 **Max execution time**: 5 minutes.
 ---
 ### IT-05: False-Position Investigation Bundle
 **Summary**: Verify enough evidence exists to investigate a false-position event.
 **Traces to**: AC-NEW-4
 **Input data**: Simulated false anchor rejection and covariance growth sequence.
 **Expected result**: Export includes estimates, anchor decisions, residuals, covariance, and emitted MAVLink fields.
 **Max execution time**: 5 minutes.
 ---
 ### IT-06: Thermal/Throttle Event Capture
 **Summary**: Verify resource health events are recorded.
 **Traces to**: AC-NEW-5
 **Input data**: Synthetic thermal/throttle metric stream.
 **Expected result**: FDR records CPU/GPU/temp/throttle status and QGC warning trigger.
 **Max execution time**: 5 minutes.
 ## Performance Tests
 ### PT-01: 8-Hour FDR Load
 **Summary**: Verify FDR storage and append behavior under full mission load.
 **Traces to**: AC-4.4, AC-NEW-3
 **Load scenario**:
 - Duration: 8 hours synthetic.
 - Inputs: 3 Hz estimates, full-rate IMU, MAVLink tlog, health metrics, tile events.
 | Metric | Target | Failure Threshold |
 |--------|--------|-------------------|
 | Total FDR size | <=64 GB | >64 GB without rollover |
 | Append latency p95 | <=10 ms async enqueue | >25 ms |
 | Silent payload loss | 0 | Any unlogged loss |
 **Resource limits**: FDR must not block hot-path localization.
 ## Security Tests
 ### ST-01: Raw Frame Retention Audit
 **Summary**: Verify FDR does not store raw full-resolution frames.
 **Traces to**: AC-8.5
 **Attack vector**: Debug logging accidentally persists raw camera frames.
 **Test procedure**:
 1. Run normal replay and failed tile-generation replay.
 2. Inspect FDR payloads and output directories.
 **Expected behavior**: Only metadata, hashes, estimates, tiles, and allowed low-rate failed-frame thumbnails are retained.
 **Pass criteria**: No raw nav/AI camera frame payloads in normal FDR.
 ## Acceptance Tests
 ### AT-01: FDR Export
 **Summary**: Verify post-flight export creates usable audit artifacts.
 **Traces to**: AC-NEW-3
 | Step | Action | Expected Result |
 |------|--------|-----------------|
 | 1 | Complete synthetic flight | Segment rollover is logged and cap respected |
 | 2 | Export FDR summary | Markdown and CSV artifacts are produced; Parquet export is optional |
 | 3 | Query PostgreSQL index | Events can be filtered by time/type/mission |
 ## Test Data Management
 | Data Set | Description | Source | Size |
 |----------|-------------|--------|------|
 | `fdr_synthetic_load` | Estimate, IMU, MAVLink, health, tile events | Generated fixture | Large |
 | `incident_fixture` | False-position and blackout evidence | Generated fixture | Small |
 **Setup procedure**: Create isolated PostgreSQL schema and FDR segment directory.
 **Teardown procedure**: Export report, then remove schema and segment directory.
 **Data isolation strategy**: Per-run mission ID, schema, and FDR directory.
@@ -0,0 +1,51 @@
 # Contract: Config Errors Telemetry
 **Component**: shared/config, shared/errors, shared/telemetry
 **Producer task**: AZ-222 — AZ-222_runtime_config_errors_telemetry.md
 **Consumer tasks**: AZ-223, AZ-224, AZ-225, AZ-226, AZ-227, AZ-228, AZ-229, AZ-230, AZ-231, AZ-232
 **Version**: 1.0.0
 **Status**: draft
 **Last Updated**: 2026-05-03
 ## Purpose
 Defines shared runtime configuration, error/result envelope, health, and telemetry metadata behavior consumed by all runtime components.
 ## Shape
 | Contract | Required Behavior |
 |----------|-------------------|
 | Runtime profile | environment-specific settings loaded and validated before use |
 | Error envelope | component, category, message, cause, retryability, severity |
 | Health event | liveness/readiness status, dependency state, timestamp, component |
 | Metrics labels | bounded component/action/status labels suitable for runtime reports |
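The error envelope row can be read as a small immutable record. A minimal sketch, assuming string-typed categories and severities and a boolean retryability flag; the class name, field types, and `to_log_fields` helper are hypothetical and the contract does not prescribe internal classes.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorEnvelope:
    """Illustrative envelope carrying the six fields listed in the contract."""
    component: str
    category: str
    message: str
    severity: str
    retryable: bool
    cause: Optional[str] = None

    def to_log_fields(self) -> dict:
        """Flatten to a plain dictionary suitable for structured logging."""
        return asdict(self)
```

Freezing the dataclass prevents downstream handlers from mutating an envelope after it has been logged.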
 ## Invariants
 - Missing required production settings fail startup or readiness loudly.
 - Errors are returned or logged with component and category; no silent suppression.
 - Secrets are referenced, not serialized into FDR, logs, or metrics.
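The first invariant can be illustrated with a check that raises instead of falling back to defaults. The exception name, function name, and setting keys below are hypothetical; treating an empty string as missing is an assumption of this sketch.

```python
class MissingRequiredSetting(Exception):
    """Raised when a production profile lacks a required setting."""


def validate_profile(settings: dict, required_keys: list) -> dict:
    """Return the settings unchanged, or fail loudly listing every missing key."""
    missing = sorted(k for k in required_keys if settings.get(k) in (None, ""))
    if missing:
        raise MissingRequiredSetting("missing required settings: " + ", ".join(missing))
    return settings
```

Reporting all missing keys at once, rather than the first one found, shortens the fix-and-retry loop during deployment.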
 ## Non-Goals
 - Does not define component-specific business errors.
 - Does not replace FDR payload schemas.
 ## Versioning Rules
 - Removing required config keys or error categories requires a major version bump.
 - Adding optional health fields or metrics labels requires a minor version bump.
 ## Test Cases
 | Case | Input | Expected | Notes |
 |------|-------|----------|-------|
 | missing-required-prod | production profile missing cache dir | readiness/startup failure | Clear error category |
 | secret-value | signing key ref present | only key ref logged | No secret leakage |
 | component-error | component reports dependency failure | structured envelope emitted | FDR-safe |
 ## Change Log
 | Version | Date | Change | Author |
 |---------|------|--------|--------|
 | 1.0.0 | 2026-05-03 | Initial contract | autodev |
@@ -0,0 +1,52 @@
 # Contract: Geometry And Time Sync Helpers
 **Component**: shared/geo_geometry, shared/time_sync
 **Producer task**: AZ-221 — AZ-221_shared_geometry_time_sync.md
 **Consumer tasks**: AZ-223, AZ-225, AZ-226, AZ-228, AZ-230, AZ-231, AZ-232
 **Version**: 1.0.0
 **Status**: draft
 **Last Updated**: 2026-05-03
 ## Purpose
 Defines shared geospatial and timestamp helper behavior used by runtime components to avoid duplicated math and inconsistent frame/IMU alignment.
 ## Shape
 | API Area | Shape | Errors |
 |----------|-------|--------|
 | Coordinate conversion | WGS84/local tangent conversions and distance calculations | invalid CRS, missing origin |
 | Camera footprint | intrinsics/extrinsics/attitude/altitude to footprint and GSD | invalid calibration, missing altitude |
 | Homography metrics | homography/covariance conversions and MRE support | invalid geometry |
 | Time sync | monotonic checks, frame-to-IMU window selection, replay ordering | timestamp mismatch, gap/jitter exceeded |
 ## Invariants
 - Helpers are deterministic for the same calibration, pose, and timestamp inputs.
 - Time helpers report gaps/jitter instead of silently dropping samples.
 - Geometry helpers do not decide safety policy; callers decide degrade/reject behavior.
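Frame-to-IMU window selection with gap reporting might look like the sketch below. It assumes IMU timestamps are already sorted and expressed in nanoseconds; the function name, return shape, and `max_gap_ns` parameter are hypothetical. In line with the invariants, it reports the gap and leaves the degrade or reject decision to the caller.

```python
from bisect import bisect_left, bisect_right


def select_imu_window(imu_timestamps_ns, start_ns, end_ns, max_gap_ns):
    """Return (indices, max_observed_gap_ns, gap_exceeded) for samples in [start_ns, end_ns]."""
    lo = bisect_left(imu_timestamps_ns, start_ns)
    hi = bisect_right(imu_timestamps_ns, end_ns)
    window = imu_timestamps_ns[lo:hi]
    # Gaps between consecutive samples inside the window.
    gaps = [b - a for a, b in zip(window, window[1:])]
    max_gap = max(gaps, default=0)
    return list(range(lo, hi)), max_gap, max_gap > max_gap_ns
```

Because the function depends only on its inputs, the same timestamps always produce the same window, which is what the determinism invariant asks for.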
 ## Non-Goals
 - No VIO state estimation.
 - No MAVLink parsing beyond normalized timestamp fields.
 - No tile freshness or cache policy decisions.
 ## Versioning Rules
 - Breaking changes to units, coordinate frames, or timestamp semantics require a major version bump.
 - New helper outputs may be added as optional fields in minor versions.
 ## Test Cases
 | Case | Input | Expected | Notes |
 |------|-------|----------|-------|
 | valid-wgs84-local | known WGS84 point and origin | round-trip within tolerance | Uses representative coordinates |
 | frame-imu-window | frame timestamp plus IMU samples | correct aligned window | Includes gap metrics |
 | invalid-calibration | missing intrinsics/extrinsics | explicit error | No silent fallback |
 ## Change Log
 | Version | Date | Change | Author |
 |---------|------|--------|--------|
 | 1.0.0 | 2026-05-03 | Initial contract | autodev |
@@ -0,0 +1,56 @@
 # Contract: Runtime Shared Contracts
 **Component**: shared/contracts
 **Producer task**: AZ-220 — AZ-220_shared_runtime_contracts.md
 **Consumer tasks**: AZ-223, AZ-224, AZ-225, AZ-226, AZ-227, AZ-228, AZ-229, AZ-230, AZ-231, AZ-232
 **Version**: 1.0.0
 **Status**: draft
 **Last Updated**: 2026-05-03
 ## Purpose
 Defines the shared runtime DTO/event contract surface that component implementations consume instead of inventing local shapes.
 ## Shape
 | Contract | Required Fields / Methods | Consumers |
 |----------|---------------------------|-----------|
 | `FramePacket` | frame ID, timestamp, image reference, calibration ID, occlusion, quality, normalization hint | camera, VIO, Satellite Service, Anchor Verification, Tile Manager, FDR |
 | `TelemetrySample` | timestamp, IMU, attitude, altitude, airspeed, GPS health | MAVLink, VIO, safety wrapper, FDR |
 | `VioStatePacket` | timestamp, relative pose, velocity, bias, tracking quality, covariance hint | VIO, safety wrapper, FDR |
 | `PositionEstimate` | WGS84 coordinates, covariance, source label, fix type, horizontal accuracy, anchor age | safety wrapper, MAVLink, Tile Manager, FDR |
 | `VprCandidate` | chunk ID, tile ID, score, footprint, freshness status | Satellite Service, Anchor Verification, FDR |
 | `AnchorDecision` | candidate ID, acceptance result, estimated pose, inliers, MRE, rejection reason | Anchor Verification, safety wrapper, FDR |
 | `CacheTileRecord` | tile ID, CRS, meters per pixel, capture date, signature/hash, trust level | Tile Manager, Satellite Service, Anchor Verification |
 | `FdrEvent` | event type, timestamp, component, severity, payload reference, mission/run ID | all runtime components |
 ## Invariants
 - Timestamps are normalized to a shared monotonic nanosecond representation before cross-component use.
 - Confidence fields must not under-report known uncertainty.
 - Raw frame payloads are referenced, not persisted in shared DTOs.
 - Generated tile and anchor records must carry provenance/freshness metadata.
 ## Non-Goals
 - Does not prescribe internal classes or storage implementation.
 - Does not define e2e test runner-only report schemas.
 ## Versioning Rules
 - Removing or renaming a field requires a major version bump.
 - Adding optional telemetry or diagnostic fields requires a minor version bump.
 ## Test Cases
 | Case | Input | Expected | Notes |
 |------|-------|----------|-------|
 | valid-frame | frame with timestamp, calibration, quality | accepted by consumers | Includes normalization hint |
 | invalid-time | non-monotonic timestamp | rejected or marked invalid | Time-sync contract decides details |
 | stale-anchor | anchor decision with stale freshness | rejected/down-confidenced | Safety wrapper must not accept blindly |
 ## Change Log
 | Version | Date | Change | Author |
 |---------|------|--------|--------|
 | 1.0.0 | 2026-05-03 | Initial contract | autodev |
@@ -0,0 +1,148 @@
 # Data Model
 ## Scope
 This model defines system-level runtime, cache, telemetry, and validation data. PostgreSQL with PostGIS is the primary structured store for manifests, spatial metadata, mission state, and FDR event indexes. Large binary payloads remain local files: COG tiles, descriptor/index files, FDR payload segments, and replay fixtures.
 ## Entity Overview
 | Entity | Purpose | Storage / Transport | Owner |
 |--------|---------|---------------------|-------|
 | MissionProfile | Operational area, sector type, route shape, altitude band, cache budget | Mission config file | Tile Manager |
 | CameraCalibration | Intrinsics, distortion, lens, fixed extrinsics, capture settings | Versioned calibration file | Camera ingest/calibration |
 | FrameRecord | Per-frame metadata, timestamp, total-occlusion/blackout state, image quality, processing status | PostgreSQL/FDR event; replay fixture | Camera ingest/calibration |
 | TelemetrySample | FC IMU, attitude, altitude, airspeed, GPS health | MAVLink stream; FDR event | MAVLink/GCS integration |
 | VioState | Backend-relative state, velocity, bias, tracking quality | Internal DTO; FDR event | VIO adapter |
 | PositionEstimate | WGS84 output, covariance, source label, anchor age, fix type | MAVLink DTO; FDR event | Safety/anchor wrapper |
 | VprChunk | Retrieval footprint and descriptor metadata | PostgreSQL/PostGIS manifest + descriptor files | Satellite Service |
 | AnchorCandidate | Top-K retrieval result and local verification metrics | Internal DTO; FDR event | Anchor verification |
 | CacheTile | Service-source or generated COG tile metadata | PostgreSQL/PostGIS manifest + signed JSON sidecar | Tile Manager |
 | GeneratedTile | In-flight tile candidate with trust/provenance metadata | COG + sidecar + FDR event | Tile Manager |
 | FdrSegment | Bounded append-only mission evidence segment | PostgreSQL event index + CBOR segment payloads | FDR/observability |
 | ValidationRun | Replay/test run metadata and outcomes | CSV/Markdown/test artifacts | Validation harness |
 ## Core Entity Attributes
 ### MissionProfile
 | Field | Type | Required | Notes |
 |-------|------|----------|-------|
 | `mission_id` | string | yes | Unique mission/run identifier |
 | `operational_area_polygon` | geometry | yes | Up to ~400 km² |
 | `sector_classification` | enum | yes | `active_conflict` or `stable_rear` |
 | `planned_altitude_agl_m` | number | yes | <=1000 m AGL |
 | `route_type` | enum | yes | `sector`, `transit_corridor`, or `mixed` |
 | `cache_budget_bytes` | integer | yes | Default ~10 GB persistent |
 ### CameraCalibration
 | Field | Type | Required | Notes |
 |-------|------|----------|-------|
 | `camera_model` | string | yes | ADTi 20MP 20L V1 family |
 | `sensor_width_mm` | number | yes | Public spec check currently indicates 23.20 mm |
 | `sensor_height_mm` | number | yes | Public spec check currently indicates 15.40 mm |
 | `image_width_px` | integer | yes | Public spec check currently indicates 5456 px |
 | `image_height_px` | integer | yes | Public spec check currently indicates 3632 px |
 | `pixel_pitch_um` | number | yes | Public spec check currently indicates 4.25 µm |
 | `lens_focal_length_mm` | number | yes | TBD before implementation |
 | `distortion_coefficients` | array | yes | From checkerboard calibration |
 | `body_T_camera` | transform | yes | Fixed camera-to-body extrinsics |
 | `spec_verification_status` | enum | yes | `manufacturer_verified`, `public_page_only`, `operator_supplied` |
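The public-spec values in this table can be cross-checked against each other, since pixel pitch should equal sensor size divided by pixel count. A short sketch using only the numbers listed above; the function name and the 0.02 µm tolerance are assumptions for illustration.

```python
def pixel_pitch_um(sensor_size_mm: float, image_size_px: int) -> float:
    """Pixel pitch in micrometres implied by a sensor dimension and its pixel count."""
    return sensor_size_mm / image_size_px * 1000.0


# Values taken from the CameraCalibration table.
pitch_from_width = pixel_pitch_um(23.20, 5456)
pitch_from_height = pixel_pitch_um(15.40, 3632)
```

Both axes come out within about 0.01 µm of the listed 4.25 µm, so the public figures are mutually consistent; this does not replace manufacturer verification.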
 ### PositionEstimate
 | Field | Type | Required | Notes |
 |-------|------|----------|-------|
 | `timestamp_ns` | integer | yes | Frame-aligned |
 | `lat_deg` | number | yes | WGS84 |
 | `lon_deg` | number | yes | WGS84 |
 | `alt_msl_m` | number | yes | MSL altitude for `GPS_INPUT` |
 | `covariance_95_semi_major_m` | number | yes | Must not be under-reported |
 | `source_label` | enum | yes | `satellite_anchored`, `vo_extrapolated`, `dead_reckoned` |
 | `last_satellite_anchor_age_ms` | integer | yes | Increases monotonically until a new anchor is accepted |
 | `fix_type` | integer | yes | MAVLink fix semantics |
 | `horiz_accuracy_m` | number | yes | Must be >= the value mapped from `covariance_95_semi_major_m` |
 | `quality_flags` | bitset/string array | yes | Anchor, blackout, spoofing, stale tile, etc. |
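Several notes in this table are checkable invariants. The sketch below assumes the simplest possible mapping, where `horiz_accuracy_m` is compared directly against `covariance_95_semi_major_m`; the real mapping is defined by the output contract, and the function name and problem strings are hypothetical.

```python
SOURCE_LABELS = {"satellite_anchored", "vo_extrapolated", "dead_reckoned"}


def estimate_problems(
    source_label: str,
    covariance_95_semi_major_m: float,
    horiz_accuracy_m: float,
    last_satellite_anchor_age_ms: int,
) -> list:
    """Return invariant violations for one estimate; an empty list means none found."""
    problems = []
    if source_label not in SOURCE_LABELS:
        problems.append("unknown_source_label")
    if horiz_accuracy_m < covariance_95_semi_major_m:
        # Reported accuracy tighter than the covariance would under-report uncertainty.
        problems.append("accuracy_under_reported")
    if last_satellite_anchor_age_ms < 0:
        problems.append("negative_anchor_age")
    return problems
```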
 ### FrameRecord
 | Field | Type | Required | Notes |
 |-------|------|----------|-------|
 | `frame_id` | string | yes | Stable frame/run identifier |
 | `timestamp_ns` | integer | yes | Camera clock normalized by time-sync helper |
 | `camera_calibration_id` | string | yes | Links to `CameraCalibration` |
 | `occlusion_status` | enum | yes | `clear`, `partial_occlusion`, `total_occlusion`, `blackout` |
 | `usable_for_vio` | boolean | yes | Must be false for total occlusion/blackout |
 | `usable_for_anchor` | boolean | yes | Must be false for total occlusion/blackout |
 | `blackout_reason` | enum | optional | `cloud`, `lens_cover`, `whiteout`, `decode_failure`, `underexposed`, `overexposed`, `unknown` |
 | `blur_score` | number | yes | Quality metric |
 | `texture_score` | number | yes | Quality metric |
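The coupling between `occlusion_status` and the two usability flags can be expressed as a single consistency check. A minimal sketch; the function and constant names are hypothetical.

```python
BLOCKING_OCCLUSION = {"total_occlusion", "blackout"}


def frame_flags_consistent(
    occlusion_status: str, usable_for_vio: bool, usable_for_anchor: bool
) -> bool:
    """True unless a fully occluded or blacked-out frame is marked usable."""
    if occlusion_status in BLOCKING_OCCLUSION:
        return not usable_for_vio and not usable_for_anchor
    return True
```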
 ### CacheTile
 | Field | Type | Required | Notes |
 |-------|------|----------|-------|
 | `tile_id` | string | yes | Stable ID |
 | `tile_type` | enum | yes | `service_source`, `generated_candidate`, `generated_soft`, `trusted_basemap` |
 | `cog_path` | string | yes | Local path |
 | `crs` | string | yes | Projection metadata |
 | `meters_per_pixel` | number | yes | Must satisfy cache interface floor |
 | `capture_date` | date | yes | Freshness gate |
 | `source` | string | yes | Satellite Service or onboard generation |
 | `sha256` | string | yes | Integrity |
 | `signature_status` | enum | yes | `valid`, `missing`, `invalid` |
 | `parent_pose_covariance_m` | number | generated only | Tile-write gate |
 | `trust_level` | enum | yes | `rejected`, `candidate`, `soft`, `trusted` |
 ## Relationships
 ```mermaid
 erDiagram
    MissionProfile ||--o{ CacheTile : scopes
    MissionProfile ||--o{ VprChunk : indexes
    CameraCalibration ||--o{ FrameRecord : calibrates
    FrameRecord ||--o{ VioState : contributes_to
    TelemetrySample ||--o{ VioState : contributes_to
    VioState ||--o{ PositionEstimate : propagates
    VprChunk ||--o{ AnchorCandidate : retrieved_as
    CacheTile ||--o{ AnchorCandidate : verified_against
    AnchorCandidate ||--o{ PositionEstimate : may_anchor
    PositionEstimate ||--o{ GeneratedTile : gates
    PositionEstimate ||--o{ FdrSegment : recorded_in
 ```
 ## Storage Strategy
 | Data Class | Primary Format | Reason |
 |------------|----------------|--------|
 | Structured mission/cache/FDR metadata | PostgreSQL + PostGIS | Queryable freshness, coverage, spatial footprints, descriptors, tile status, and FDR event indexes |
 | Tile audit sidecar | Signed JSON | Human/audit/service interchange per tile |
 | Imagery tile | COG | Geospatial raster standard |
 | Descriptor index | FAISS CPU index files + metadata | Fast top-K retrieval |
 | FDR runtime payloads | CBOR segment files + PostgreSQL index | Bounded append payloads with queryable event metadata |
 | FDR analysis export | Parquet optional | Post-flight analytics |
 | Test report | CSV + Markdown | CI and human review |
 ## Migration Strategy
 - PostgreSQL schemas use explicit `schema_version` and additive migrations by default.
 - PostGIS geometry columns are used for mission polygons, tile footprints, VPR chunks, and generated-tile extents.
 - FDR segment schema includes `segment_schema_version`; old readers must reject unknown required fields loudly.
 - Sidecars include a `sidecar_version` and hash of the COG payload.
 - Migrations are implemented as deterministic scripts with rollback for metadata-only changes.
 - No database/table/column rename is allowed without explicit approval during implementation.
 ## Seed Data Requirements
 | Environment | Seed Data |
 |-------------|-----------|
 | Development | 60 project images, `coordinates.csv`, small cache fixture, generated SITL traces |
 | Public replay | Pinned MUN-FRL/ALTO/Kagaru/EPFL dataset slices and licenses |
 | Jetson validation | Production-like cache/index, cold-start fixtures, thermal workload |
 | Representative acceptance | Synchronized target nav-camera + FC telemetry + ground truth |
 ## Backward Compatibility
 - Runtime should tolerate older cache sidecars if required fields exist and signatures validate.
 - Generated tile sidecars must include all fields required by Satellite Service ingest; missing fields make the tile ineligible for promotion.
 - FDR readers must support at least the current and previous segment schema version during the project lifecycle.
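The last rule, together with the requirement to reject unknown schemas loudly, can be sketched as a version gate at the top of a reader. This assumes integer schema versions that increase by one per release; the constant value, exception name, and function name are hypothetical.

```python
CURRENT_SEGMENT_SCHEMA_VERSION = 2  # hypothetical value for illustration


class UnsupportedSegmentSchema(Exception):
    """Raised when a segment's schema version is outside the supported window."""


def check_segment_schema(
    segment_schema_version: int, current: int = CURRENT_SEGMENT_SCHEMA_VERSION
) -> int:
    """Accept the current and immediately previous version; reject anything else."""
    supported = {current, current - 1}
    if segment_schema_version not in supported:
        raise UnsupportedSegmentSchema(
            f"segment_schema_version={segment_schema_version} "
            f"not in supported set {sorted(supported)}"
        )
    return segment_schema_version
```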
@@ -0,0 +1,11 @@
 # Deployment Planning Index
 This directory contains the system-level deployment plan produced during Plan Step 2:
 - `containerization.md`
 - `ci_cd_pipeline.md`
 - `environment_strategy.md`
 - `observability.md`
 - `deployment_procedures.md`
 Component-specific implementation tasks are created later during decomposition.
@@ -0,0 +1,54 @@
 # CI/CD Pipeline
 ## Pipeline Stages
 | Stage | Runs On | Gate |
 |-------|---------|------|
 | Format/lint | PR | Block merge |
 | Unit tests | PR | Block merge |
 | Replay black-box smoke | PR | Block merge |
 | Cache/security fixture tests | PR | Block merge |
 | Plane SITL spoof/failsafe tests | Release candidate / nightly | Block release |
 | Public dataset replay | Nightly / release candidate | Block release |
 | Jetson latency/resource tests | Release candidate | Block release |
 | Thermal/FDR endurance | Release candidate / hardware qualification | Block release |
 ## Artifact Outputs
 - Test CSV reports.
 - FDR validation summaries.
 - Cache integrity reports.
 - Dataset replay metrics.
 - SITL tlogs.
 - Jetson profiling traces.
 ## Caching
 - Cache dependency builds by lockfile hash.
 - Cache public dataset slices only in controlled CI storage with license metadata.
 - Do not cache secrets or signing keys.
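Keying the dependency cache on the lockfile hash means the cache is reused only while the lockfile is byte-identical. A minimal sketch using SHA-256; the function name, key prefix, and 16-character truncation are assumptions, and the actual key format depends on the CI system.

```python
import hashlib


def dependency_cache_key(lockfile_bytes: bytes, prefix: str = "deps") -> str:
    """Derive a cache key that changes whenever the lockfile content changes."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()
    return f"{prefix}-{digest[:16]}"
```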
 ## Branch Policy
 - Work occurs on `dev`.
 - Release gates must pass before deploy artifacts are considered production-ready.
 - Any failed safety, spoofing, false-position, or cache-poisoning test blocks release.
 ## Quality Gates
 | Gate | Threshold |
 |------|-----------|
 | Still-image geolocation | >=80% within 50 m and >=50% within 20 m |
 | Hot-path latency | <400 ms p95 |
 | Memory | <8 GB shared |
 | Cold start | <30 s p95 |
 | FDR | <=64 GB / 8-hour flight |
 | Cache storage | <=10 GB unless split budget is approved |
 | False position | AC-NEW-4 thresholds |
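The still-image geolocation gate combines two thresholds over the same error sample. The sketch below assumes per-image horizontal errors in metres and treats "within" as inclusive; the function name is hypothetical, and an empty sample is treated as a failure rather than a pass.

```python
def still_image_gate_passes(errors_m: list) -> bool:
    """Check >=80% of errors within 50 m and >=50% within 20 m."""
    if not errors_m:
        return False  # no evidence is not a pass
    n = len(errors_m)
    within_50 = sum(e <= 50.0 for e in errors_m) / n
    within_20 = sum(e <= 20.0 for e in errors_m) / n
    return within_50 >= 0.80 and within_20 >= 0.50
```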
 ## Open Tasks For Decomposition
 - Define CI runner labels for Docker/replay vs Jetson local hardware.
 - Add dataset-license checks before public dataset jobs.
 - Implement SITL scenario generation and tlog validation job.
 - Implement report collation into a release evidence bundle.
@@ -0,0 +1,46 @@
 # Containerization
 ## Strategy
 The production runtime targets Jetson hardware and may not be fully containerized for all camera/GPU paths. The test and development stack uses containers where practical, with local hardware execution required for release gates.
 ## Runtime Units
 | Unit | Containerized? | Notes |
 |------|----------------|-------|
 | GPS-denied service | Optional on Jetson | Must access camera, CUDA/TensorRT/ONNX, MAVLink, local cache, FDR storage |
 | Replay consumer | Yes | Deterministic black-box test harness |
 | Satellite cache stub | Yes | Local fixture volume for COG/manifest/descriptors |
 | ArduPilot Plane SITL | Yes or local process | Used for MAVLink and failsafe validation |
 | QGC observer/log parser | Yes | Parses MAVLink status/tlogs |
 ## Docker Compose Profiles
 | Profile | Purpose | Services |
 |---------|---------|----------|
 | `replay` | CI/PR deterministic fixture tests | gps-denied-service, replay-consumer, satellite-cache-stub |
 | `sitl` | ArduPilot Plane integration tests | gps-denied-service, ardupilot-plane-sitl, qgc-observer |
 | `jetson-local` | Documentation-only profile for local hardware runs | Host runtime; local scripts/tasks to be created later |
 ## Image Requirements
 - Base images must match JetPack/CUDA compatibility for GPU tests.
 - Replay-only images may use standard Ubuntu/Python/C++ build images.
 - No production image should contain secrets, mission signing keys, or provider credentials.
 - Dataset downloads are not baked into images; they are mounted as versioned fixtures.
 ## Volumes
 | Volume | Purpose |
 |--------|---------|
 | `/data/input` | Test images and public dataset slices |
 | `/cache/satellite` | Offline cache fixture |
 | `/fdr` | Runtime FDR output |
 | `/test-results` | CSV/Markdown reports |
 ## Open Tasks For Decomposition
 - Create Dockerfiles for replay-compatible service and consumer harness.
 - Define Jetson local setup scripts for GPU/camera/MAVLink access.
 - Create compose profiles for replay and SITL.
 - Add license-aware public dataset fixture downloader.
@@ -0,0 +1,68 @@
 # Deployment Procedures
 ## Deployment Targets
 | Target | Purpose |
 |--------|---------|
 | Replay environment | Development and CI fixtures |
 | Plane SITL | MAVLink/failsafe validation |
 | Jetson companion computer | Production runtime and release gating |
 | Representative flight/replay rig | Final acceptance evidence |
 ## Pre-Deployment Checklist
 - Camera lens, resolution, FPS, sensor dimensions, and operating temperature are manufacturer-verified.
 - Camera intrinsics/extrinsics are calibrated and versioned.
 - BASALT, OpenCV, FAISS, LightGlue, DINOv2/ONNX/TensorRT dependencies are pinned.
 - TensorRT/ONNX descriptor-fidelity tests pass before optimized engines are used.
 - Satellite cache manifests and sidecars validate signatures, hashes, freshness, and resolution.
 - Plane SITL validates `GPS_INPUT` behavior with production parameters.
 - Jetson latency, memory, and thermal release gates pass.
 - FDR rollover test passes.
 ## Deployment Steps
 1. Install JetPack-compatible runtime dependencies on the companion computer.
 2. Install/build BASALT and native vision dependencies.
 3. Pre-build any ONNX/TensorRT engines accepted by fidelity tests.
 4. Sync mission cache from Satellite Service before flight.
 5. Validate cache manifest, descriptors, signatures, resolution, and freshness.
 6. Start the onboard service and verify FC telemetry connection.
 7. Run cold-start first-fix check.
 8. Confirm QGroundControl status and FDR segment creation.
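 Step 5 (cache validation) can be sketched as below. This is an illustration under stated assumptions: the manifest layout (`tiles`, `sha256`, `captured_at`, `gsd_m`, `max_gsd_m`) is hypothetical, and HMAC-SHA256 stands in for whatever signature scheme the Satellite Service actually uses.

```python
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def validate_manifest(manifest: dict, signature_hex: str, key: bytes,
                      tiles: dict[str, bytes], max_age: timedelta,
                      now: datetime) -> list[str]:
    """Return validation errors; an empty list means the cache is usable."""
    errors = []
    # Signature over canonical JSON (HMAC is a stand-in for the real scheme).
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature_hex):
        errors.append("manifest signature mismatch")
    for entry in manifest["tiles"]:
        data = tiles.get(entry["path"])
        if data is None:
            errors.append(f"{entry['path']}: missing tile")
            continue
        if hashlib.sha256(data).hexdigest() != entry["sha256"]:
            errors.append(f"{entry['path']}: hash mismatch")
        # Freshness: reject imagery older than the allowed age.
        if now - datetime.fromisoformat(entry["captured_at"]) > max_age:
            errors.append(f"{entry['path']}: stale imagery")
        # Resolution: a larger ground sample distance means coarser imagery.
        if entry["gsd_m"] > manifest["max_gsd_m"]:
            errors.append(f"{entry['path']}: resolution too coarse")
    return errors
```

 Any non-empty result corresponds to the rollback rule: reject the mission cache and resync or rebuild before flight.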
 ## Health Checks
 | Check | Pass Condition |
 |-------|----------------|
 | Camera input | Frames received with expected resolution/rate |
 | FC telemetry | IMU/attitude/altitude/GPS-health stream healthy |
 | Cache | Manifest and descriptor index valid |
 | First fix | Valid `GPS_INPUT` in <30 s p95 during the cold-start test |
 | Resource health | Memory <8 GB, no thermal throttle |
 | QGC status | Status visible at configured downsample rate |
 | FDR | Segment open and writable |
 ## Rollback
 - If a runtime dependency update fails tests, revert to the previous pinned build.
 - If cache manifest validation fails, reject the mission cache and resync/rebuild before flight.
 - If optimized engine fidelity fails, fall back to the PyTorch/ONNX path that passed descriptor tests.
 - If the BASALT candidate fails representative replay gates, evaluate the Kimera backup or custom fallback tasks before production deployment.
 ## Post-Flight Procedure
 1. Stop the onboard service cleanly.
 2. Export FDR summary and integrity hashes.
 3. Package generated tiles with sidecars and manifest delta.
 4. Upload generated tile package to Satellite Service when connectivity is available.
 5. Archive release evidence: tlogs, FDR summary, cache validation report, test results.
 ## Deployment Blockers
 - ADTi camera spec mismatch unresolved for FPS/resolution/lens/temperature.
 - Missing representative synchronized nav-camera + FC telemetry + ground truth for final acceptance.
 - Any false-position safety budget failure.
 - Any cache-poisoning gate failure.
 - Any Plane SITL `GPS_INPUT` failure.
 - Thermal throttling during the 8-hour target workload.
@@ -0,0 +1,49 @@
 # Environment Strategy
 ## Environments
 | Environment | Purpose | Hardware |
 |-------------|---------|----------|
 | Development replay | Fast local iteration with fixtures | Developer workstation |
 | CI replay | Deterministic PR checks | Docker runner |
 | Public dataset replay | Nightly/RC algorithm validation | Docker or GPU runner |
 | Plane SITL | MAVLink/failsafe validation | Docker/local SITL |
 | Jetson hardware validation | Production path latency, memory, GPU, camera, thermal | Jetson Orin Nano Super |
 | Representative flight/replay | Final acceptance evidence | Target-like UAV/FC/camera setup |
 ## Configuration Classes
 | Config | Development | Production |
 |--------|-------------|------------|
 | Satellite cache | Small fixture | Full mission cache |
 | PostgreSQL/PostGIS | Local test DB with fixture manifests | Local onboard DB with signed mission manifests, spatial metadata, and FDR event indexes |
 | Descriptor index | Small FAISS index | Full operational-area index |
 | MAVLink | SITL/replay | Physical FC link |
 | FDR | Temporary directory | Per-flight NVMe directory with rollover |
 | Dataset fixtures | Optional public slices | Not used at runtime |
 ## Secrets And Signing
 - Mission signing keys are never committed.
 - Test keys may be committed only if clearly labeled as non-production.
 - Provider credentials are not used by onboard runtime.
 - Any Satellite Service sync credentials are secrets of the post-flight/deployment environment, not of the onboard runtime.
 ## Dataset Licensing
 Public datasets must be tagged before use:
 | Dataset | Expected Use | License Constraint |
 |---------|--------------|--------------------|
 | MUN-FRL | Preferred public VIO/nadir replay | CC BY 4.0 per current docs |
 | ALTO | Preferred aerial localization/VPR replay | BSD-3 repository; dataset availability must be pinned |
 | Kagaru | Fixed-wing/farmland validation candidate | Verify terms before commercial use |
 | EPFL fixed-wing | Fixed-wing validation candidate | Verify terms before commercial use |
 | VPAir | VPR/localization only | Academic-use restriction likely blocks commercial acceptance |
 | UZH FPV | VIO stress proxy only | Non-commercial license blocks commercial acceptance |
 ## Promotion Rules
 - Results from public datasets can de-risk implementation but cannot replace representative acceptance data.
 - A release candidate cannot be promoted without Jetson hardware validation and Plane SITL.
 - A mission cache cannot be used if manifest/signature/freshness validation fails.
@@ -0,0 +1,61 @@
 # Observability
 ## Goals
 - Explain every emitted position estimate.
 - Detect false-position risk before it reaches the flight controller.
 - Preserve enough evidence to replay incidents without storing raw frames.
 - Surface operator-relevant status to QGroundControl without saturating telemetry.
 ## Runtime Signals
 | Signal | Frequency | Destination | Notes |
 |--------|-----------|-------------|-------|
 | Position estimate | Per processed frame locally | FDR, MAVLink `GPS_INPUT` | GCS receives downsampled status |
 | Source label | Per estimate | FDR, status summary | `satellite_anchored`, `vo_extrapolated`, `dead_reckoned` |
 | Covariance semi-major | Per estimate | FDR, `GPS_INPUT.horiz_accuracy` mapping | Must not under-report |
 | Anchor decision | Per candidate | FDR | Include MRE, inliers, tile provenance, rejection reason |
 | Cache validation | On cache load / tile read | FDR, health log | Signature, freshness, resolution, hash |
 | Blackout/spoofing status | On transition and 1-2 Hz while active | QGC, FDR | Operator status |
 | Total occlusion status | On transition and sampled while active | FDR, QGC if persistent | Indicates VIO is bypassed and IMU-only propagation is active |
 | Resource health | 1 Hz or configurable | FDR, QGC warning on threshold | CPU/GPU/temp/memory/throttle |
 | Tile write decision | Per generated tile | FDR, sidecar | Include parent covariance and trust level |
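 The "must not under-report" rule for the covariance semi-major signal can be illustrated with a short sketch. It assumes a 2x2 horizontal covariance in metres squared, takes the semi-major axis as the square root of the larger eigenvalue, and rounds up to a reporting resolution. The function names and the 0.1 m resolution are illustrative choices, not project decisions.

```python
import math


def covariance_semi_major_m(var_e: float, var_n: float, cov_en: float) -> float:
    """1-sigma semi-major axis (m) of [[var_e, cov_en], [cov_en, var_n]]."""
    mean = 0.5 * (var_e + var_n)
    half_diff = 0.5 * (var_e - var_n)
    # Larger eigenvalue of a symmetric 2x2 matrix in closed form.
    lambda_max = mean + math.hypot(half_diff, cov_en)
    return math.sqrt(lambda_max)


def reported_horiz_accuracy_m(semi_major_m: float, resolution_m: float = 0.1) -> float:
    """Round up to the reporting resolution so the value never shrinks."""
    return math.ceil(semi_major_m / resolution_m) * resolution_m
```

 Rounding up rather than to nearest keeps the reported accuracy conservative: a value of about 1.73 m is reported as 1.8 m, never 1.7 m.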
 ## Logs
 | Log Type | Format | Retention |
 |----------|--------|-----------|
 | FDR events/index | PostgreSQL tables + CBOR payload segments | <=64 GB per flight, rollover |
 | MAVLink raw stream | tlog or equivalent | FDR cap |
 | Health metrics | FDR event stream | FDR cap |
 | Test reports | CSV/Markdown | CI artifact retention |
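 The FDR cap implies an average write-rate budget. A small arithmetic sketch, assuming decimal gigabytes (1 GB = 1e9 bytes) and the 8-hour flight from the quality gates. The helper names are hypothetical.

```python
SECONDS_PER_HOUR = 3600


def fdr_average_rate_bytes_per_s(cap_gb: float = 64.0, flight_hours: float = 8.0) -> float:
    """Average write rate that exactly fills the cap over the flight."""
    return cap_gb * 1e9 / (flight_hours * SECONDS_PER_HOUR)


def projected_fdr_gb(rate_bytes_per_s: float, flight_hours: float = 8.0) -> float:
    """Total FDR volume in GB produced by a sustained write rate."""
    return rate_bytes_per_s * flight_hours * SECONDS_PER_HOUR / 1e9
```

 With these assumptions the budget is roughly 2.2 MB/s averaged over the flight, shared by events, the MAVLink raw stream, and health metrics. A sustained 2.0 MB/s would project to 57.6 GB.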
 ## Alerts And Status Text
 | Condition | Status |
 |-----------|--------|
 | Visual blackout starts | `VISUAL_BLACKOUT_IMU_ONLY` |
 | Total occlusion before VIO | `VISUAL_OCCLUSION_IMU_ONLY` |
 | Blackout failsafe threshold exceeded | `VISUAL_BLACKOUT_FAILSAFE` |
 | Spoofing promotion/demotion | QGC status text with mode and timestamp |
 | Stale cache tile rejected | Warning in FDR; QGC only if mission-impacting |
 | Thermal throttle risk | QGC warning before throttle if possible |
 | No estimate for longer than the configured threshold | Relocalization request / failsafe status |
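 One way to send status on transition and then at a bounded rate while a condition stays active is a small limiter like the one below. It is a sketch only: `StatusRateLimiter` is a hypothetical name, and the length check assumes the 50-character text field of the MAVLink `STATUSTEXT` message.

```python
MAX_STATUSTEXT_LEN = 50  # assumed text capacity of MAVLink STATUSTEXT


def fits_statustext(text: str) -> bool:
    """True when the status string fits in a single STATUSTEXT message."""
    return len(text.encode("ascii")) <= MAX_STATUSTEXT_LEN


class StatusRateLimiter:
    """Send immediately on a status change, then at most `rate_hz` while active."""

    def __init__(self, rate_hz: float = 1.0):
        self.min_interval_s = 1.0 / rate_hz
        self._last_text = None
        self._last_sent_s = None

    def should_send(self, text: str, now_s: float) -> bool:
        changed = text != self._last_text
        due = (self._last_sent_s is None
               or now_s - self._last_sent_s >= self.min_interval_s)
        if changed or due:
            self._last_text = text
            self._last_sent_s = now_s
            return True
        return False
```

 Transitions bypass the interval so the operator sees a new state at once, while repeats of the same state stay within the 1-2 Hz telemetry budget.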
 ## Metrics For Release Evidence
 - Error CDF against ground truth.
 - Anchor-age binned error.
 - Covariance calibration plot.
 - VIO completion rate.
 - Relocalization trigger-to-anchor latency.
 - Cache freshness rejection counts.
 - FDR size over 8 hours.
 - Thermal/throttle timeline.
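 The error CDF and the "within 50 m / within 20 m" fractions used by the geolocation gate can be computed from per-frame horizontal errors. A minimal sketch with hypothetical function names; the sample errors in the usage note are made up for illustration and are not results.

```python
def fraction_within(errors_m: list[float], radius_m: float) -> float:
    """Fraction of position errors at or below `radius_m`."""
    if not errors_m:
        raise ValueError("no errors to evaluate")
    return sum(e <= radius_m for e in errors_m) / len(errors_m)


def error_cdf(errors_m: list[float]) -> list[tuple[float, float]]:
    """Empirical CDF as (error, cumulative fraction) pairs, sorted by error."""
    ordered = sorted(errors_m)
    n = len(ordered)
    return [(e, (i + 1) / n) for i, e in enumerate(ordered)]
```

 For illustrative errors `[5, 12, 18, 25, 40, 60, 8, 15, 30, 45]` metres, `fraction_within(..., 20)` is 0.5 and `fraction_within(..., 50)` is 0.9, which would meet the >=50% and >=80% thresholds exactly and comfortably, respectively.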
 ## Open Tasks For Decomposition
 - Define FDR schema and event names.
 - Define QGC status vocabulary and rate limiting.
 - Define telemetry-to-report export tooling.
 - Define covariance calibration dashboard/report.
@@ -0,0 +1,46 @@
 # Component Overview Diagram
 ```mermaid
 flowchart LR
    camera[01 Camera Ingest And Calibration]
    vio[02 VIO Adapter]
    wrapper[03 Safety And Anchor Wrapper]
    retrieval[04 Satellite Service]
    verify[05 Anchor Verification]
    cache[06 Tile Manager]
    mav[07 MAVLink And GCS Integration]
    fdr[08 FDR And Observability]
    tests[[Separate E2E Test Suite]]
    navCam[[Nav Camera]] --> camera
    fc[[ArduPilot Plane FC]] --> mav
    satSvc[[Azaion Suite Satellite Service]] --> retrieval
    datasets[[Replay/Public Datasets]] --> tests
    camera --> vio
    mav --> vio
    vio --> wrapper
    wrapper --> retrieval
    retrieval --> verify
    cache --> retrieval
    cache --> verify
    verify --> wrapper
    wrapper --> mav
    wrapper --> cache
    camera --> cache
    camera --> fdr
    vio --> fdr
    wrapper --> fdr
    retrieval --> fdr
    verify --> fdr
    cache --> fdr
    mav --> fdr
    tests --> camera
    tests --> mav
    tests --> cache
    mav --> qgc[[QGroundControl]]
    mav --> fc
    retrieval --> satSvc
 ```
@@ -0,0 +1,18 @@
 # Flow: Tile Manager And Generated Tile Lifecycle
 ```mermaid
 flowchart TD
    preflight([Pre-flight Satellite Service sync]) --> validate[06 Tile Manager validates manifest signatures hashes freshness]
    validate --> cacheOk{Cache valid?}
    cacheOk -->|No| block[Block cache usage and report]
    cacheOk -->|Yes| load[04 Satellite Service loads local descriptor metadata and FAISS index]
    load --> flight([Flight runtime])
    flight --> eligibility[03 Tile write eligibility check]
    eligibility --> eligible{Covariance and quality pass?}
    eligible -->|No| noWrite[Do not write generated tile]
    eligible -->|Yes| write[06 Orthorectify frame and write COG + signed JSON sidecar]
    write --> fdr[08 Record tile-write audit]
    fdr --> postflight([Post-flight])
    postflight --> package[06 Package generated tiles + manifest delta]
    package --> sync[[Post-flight Satellite Service upload]]
 ```
@@ -0,0 +1,21 @@
 # Flow: Normal Localization
 ```mermaid
 flowchart TD
    start([Frame + FC telemetry]) --> ingest[01 Camera ingest and quality]
    ingest --> occlusion{Total occlusion or blackout?}
    occlusion -->|Yes| imuOnly[03 IMU-only dead_reckoned propagation]
    occlusion -->|No| frameOk{Frame usable for VIO?}
    frameOk -->|No| degrade[03 Safety wrapper degraded mode]
    frameOk -->|Yes| vio[02 VIO adapter]
    telemetry[07 MAVLink telemetry] --> vio
    vio --> healthy{VIO healthy?}
    healthy -->|Yes| wrap[03 Covariance calibration + source label]
    healthy -->|No| trigger[Trigger relocalization]
    wrap --> emit[07 Emit GPS_INPUT and QGC status]
    wrap --> record[08 Record FDR event]
    emit --> endNode([Position output])
    trigger --> satFlow[[Satellite relocalization flow]]
    imuOnly --> emit
    degrade --> emit
 ```
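 The branching in the flow above can be read as a small routing function. This sketch mirrors only the decision order of the diagram. `Route` and `route_frame` are hypothetical names, and `vio_healthy` is passed as a callable so that VIO is consulted only for usable frames.

```python
from enum import Enum
from typing import Callable


class Route(Enum):
    IMU_ONLY = "03 IMU-only dead_reckoned propagation"
    DEGRADED = "03 Safety wrapper degraded mode"
    VIO_NOMINAL = "03 Covariance calibration + source label"
    RELOCALIZE = "Trigger relocalization"


def route_frame(occluded_or_blackout: bool, frame_usable: bool,
                vio_healthy: Callable[[], bool]) -> Route:
    """Pick the branch of the normal-localization flow for one frame."""
    if occluded_or_blackout:
        return Route.IMU_ONLY      # VIO is bypassed entirely
    if not frame_usable:
        return Route.DEGRADED      # frame rejected before VIO
    # VIO runs only on usable frames; its health decides the last branch.
    return Route.VIO_NOMINAL if vio_healthy() else Route.RELOCALIZE
```

 In the diagram, the IMU-only, degraded, and nominal branches all lead to `GPS_INPUT` emission, while the relocalization branch hands over to the satellite relocalization flow.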
@@ -0,0 +1,21 @@
 # Flow: Satellite Relocalization
 ```mermaid
 flowchart TD
    start([Relocalization trigger]) --> request[03 Build retrieval request]
    request --> retrieve[04 DINOv2-VLAD + FAISS top-K]
    retrieve --> candidates{Candidates found?}
    candidates -->|No| degraded[03 Continue degraded/dead reckoned]
    candidates -->|Yes| verify[05 ALIKED/DISK + LightGlue + RANSAC]
    verify --> geometry{Geometry passes?}
    geometry -->|No| degraded
    geometry -->|Yes| gates[03 Freshness/provenance/Mahalanobis gates]
    gates --> accepted{Anchor accepted?}
    accepted -->|No| degraded
    accepted -->|Yes| update[03 Apply absolute correction]
    update --> emit[07 Emit anchored GPS_INPUT]
    degraded --> emitDegraded[07 Emit degraded GPS_INPUT/status]
    emit --> record[08 Record anchor decision]
    emitDegraded --> record
    record --> endNode([Relocalization result])
 ```
@@ -0,0 +1,136 @@
 # Work Item Epics
 **Tracker**: Jira  
 **Project**: AZAION (`AZ`)  
 **Date**: 2026-05-01  
 **Labels**: `gps-denied-onboard-plan`, `autodev`  
 **Lessons applied**: No `_docs/LESSONS.md` file exists; no prior estimation or dependency lessons were available.
 ## Dependency Order
 | Order | Jira ID | Epic | Type | Depends On | Estimate |
 |-------|---------|------|------|------------|----------|
 | 1 | AZ-206 | Bootstrap & Initial Structure | bootstrap | none | M / 5-8 pts |
 | 2 | AZ-207 | Cross-Cutting: Shared Geometry And Time Sync | cross-cutting | AZ-206 | S-M / 3-5 pts |
 | 3 | AZ-208 | Cross-Cutting: Runtime Configuration And Errors | cross-cutting | AZ-206 | S-M / 3-5 pts |
 | 4 | AZ-209 | Camera Ingest And Calibration | component | AZ-206, AZ-207, AZ-208 | M / 5-8 pts |
 | 5 | AZ-210 | MAVLink And GCS Integration | component | AZ-206, AZ-208 | M / 5-8 pts |
 | 6 | AZ-211 | Tile Manager | component | AZ-206, AZ-207, AZ-208 | L / 8-13 pts |
 | 7 | AZ-212 | FDR And Observability | component | AZ-206, AZ-208 | M-L / 5-8 pts |
 | 8 | AZ-213 | VIO Adapter | component | AZ-206, AZ-207, AZ-208, AZ-209, AZ-210 | L / 8-13 pts |
 | 9 | AZ-214 | Satellite Service | component | AZ-206, AZ-207, AZ-208, AZ-209, AZ-211 | L / 8-13 pts |
 | 10 | AZ-215 | Anchor Verification | component | AZ-206, AZ-207, AZ-208, AZ-209, AZ-211, AZ-214 | L / 8-13 pts |
 | 11 | AZ-216 | Safety And Anchor Wrapper | component | AZ-206, AZ-207, AZ-208, AZ-209, AZ-210, AZ-213, AZ-215 | XL / 13-21 pts |
 | 12 | AZ-217 | E2E Test Suite | test-support | component epics | L / 8-13 pts |
 | 13 | AZ-218 | Blackbox Tests | blackbox-tests | AZ-217, component epics | L / 8-13 pts |
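 The ordering in the table can be checked against its own "Depends On" column. The sketch below uses Python's standard `graphlib` and assumes that "component epics" means AZ-209 through AZ-216; that expansion is an interpretation, not something the table states explicitly.

```python
from graphlib import TopologicalSorter

COMPONENT_EPICS = [f"AZ-{n}" for n in range(209, 217)]  # assumed: AZ-209..AZ-216

DEPENDS_ON = {
    "AZ-206": [],
    "AZ-207": ["AZ-206"],
    "AZ-208": ["AZ-206"],
    "AZ-209": ["AZ-206", "AZ-207", "AZ-208"],
    "AZ-210": ["AZ-206", "AZ-208"],
    "AZ-211": ["AZ-206", "AZ-207", "AZ-208"],
    "AZ-212": ["AZ-206", "AZ-208"],
    "AZ-213": ["AZ-206", "AZ-207", "AZ-208", "AZ-209", "AZ-210"],
    "AZ-214": ["AZ-206", "AZ-207", "AZ-208", "AZ-209", "AZ-211"],
    "AZ-215": ["AZ-206", "AZ-207", "AZ-208", "AZ-209", "AZ-211", "AZ-214"],
    "AZ-216": ["AZ-206", "AZ-207", "AZ-208", "AZ-209", "AZ-210", "AZ-213", "AZ-215"],
    "AZ-217": COMPONENT_EPICS,
    "AZ-218": ["AZ-217", *COMPONENT_EPICS],
}
TABLE_ORDER = [f"AZ-{n}" for n in range(206, 219)]


def order_violations(order, depends_on):
    """Return (epic, dependency) pairs where the dependency is scheduled later."""
    position = {epic: i for i, epic in enumerate(order)}
    return [(epic, dep) for epic in order
            for dep in depends_on[epic] if position[dep] > position[epic]]


# static_order() raises graphlib.CycleError if the dependencies are cyclic.
one_valid_order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

 Under that assumption the table order produces no violations, and the dependency column is acyclic.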
 ## Component Mapping
 | Component / Artifact | Epic |
 |----------------------|------|
 | Project scaffold, shared DTOs, migrations, CI skeleton | AZ-206 |
 | `common-helpers/01_helper_geo_geometry.md` | AZ-207 |
 | `common-helpers/02_helper_time_sync.md` | AZ-207 |
 | Runtime config, error contracts, health checks | AZ-208 |
 | `components/01_camera_ingest_calibration/` | AZ-209 |
 | `components/02_vio_adapter/` | AZ-213 |
 | `components/03_safety_anchor_wrapper/` | AZ-216 |
 | `components/04_satellite_retrieval/` | AZ-214 |
 | `components/05_anchor_verification/` | AZ-215 |
 | `components/06_cache_tile_lifecycle/` | AZ-211 |
 | `components/07_mavlink_gcs_integration/` | AZ-210 |
 | `components/08_fdr_observability/` | AZ-212 |
 | `tests/e2e-test-suite.md`, `tests/blackbox-tests.md`, `tests/environment.md` | AZ-217 |
 | System blackbox/performance/resilience/security/resource tests | AZ-218 |
 ## Epic Relationship Diagram
 ```mermaid
 flowchart TD
    bootstrap[AZ-206 Bootstrap]
    geo[AZ-207 Shared Geometry And Time Sync]
    config[AZ-208 Runtime Configuration And Errors]
    camera[AZ-209 Camera Ingest]
    mavlink[AZ-210 MAVLink And GCS]
    cache[AZ-211 Tile Manager]
    fdr[AZ-212 FDR And Observability]
    vio[AZ-213 VIO Adapter]
    retrieval[AZ-214 Satellite Service]
    anchor[AZ-215 Anchor Verification]
    safety[AZ-216 Safety And Anchor Wrapper]
    validation[AZ-217 E2E Test Suite]
    blackbox[AZ-218 Blackbox Tests]
    bootstrap --> geo
    bootstrap --> config
    bootstrap --> camera
    bootstrap --> mavlink
    bootstrap --> cache
    bootstrap --> fdr
    geo --> camera
    geo --> cache
    geo --> vio
    geo --> retrieval
    geo --> anchor
    geo --> safety
    config --> camera
    config --> mavlink
    config --> cache
    config --> fdr
    config --> vio
    config --> retrieval
    config --> anchor
    config --> safety
    camera --> vio
    mavlink --> vio
    camera --> retrieval
    cache --> retrieval
    retrieval --> anchor
    camera --> anchor
    cache --> anchor
    vio --> safety
    anchor --> safety
    mavlink --> safety
    safety --> cache
    safety --> mavlink
    safety --> fdr
    camera --> fdr
    cache --> fdr
    safety --> validation
    fdr --> validation
    camera --> validation
    mavlink --> validation
    retrieval --> validation
    anchor --> validation
    cache --> validation
    validation --> blackbox
 ```
 ## Cross-Cutting Ownership
 | Concern | Owner Epic | Rule |
 |---------|------------|------|
 | Geospatial math, WGS84/local conversions, GSD, footprints | AZ-207 | Components consume shared helper; no local duplicate implementations |
 | Timestamp validation, IMU/frame alignment, replay ordering | AZ-207 | Components consume shared helper; no local duplicate implementations |
 | Runtime configuration, environment profiles, startup validation | AZ-208 | Components consume shared config loader and health-check contract |
 | Shared error/result envelopes | AZ-208 | Components use shared error categories and do not swallow failures |
 ## Created Jira Epics
 - AZ-206 — Bootstrap & Initial Structure
 - AZ-207 — Cross-Cutting: Shared Geometry And Time Sync
 - AZ-208 — Cross-Cutting: Runtime Configuration And Errors
 - AZ-209 — Camera Ingest And Calibration
 - AZ-210 — MAVLink And GCS Integration
 - AZ-211 — Tile Manager
 - AZ-212 — FDR And Observability
 - AZ-213 — VIO Adapter
 - AZ-214 — Satellite Service
 - AZ-215 — Anchor Verification
 - AZ-216 — Safety And Anchor Wrapper
 - AZ-217 — E2E Test Suite
 - AZ-218 — Blackbox Tests
 ## Tracker Notes
 Jira authentication succeeded. A transient Jira server-side PostgreSQL timeout occurred while creating `Blackbox Tests`; the write was recorded under `_docs/_process_leftovers/` and then successfully retried as AZ-218. The leftover entry was deleted after replay success.
@@ -0,0 +1,96 @@
 # Glossary
 **Status**: confirmed-by-user  
 **Date**: 2026-05-01
 ## Aerial VPR
 Visual place recognition over aerial/nadir imagery; used to retrieve candidate satellite/cache chunks for a UAV frame. (source: `solution.md`)
 ## ADTi 20MP 20L V1
 Selected navigation-camera family. Public product pages list Sony APS-C CMOS, 5456 x 3632 max image size, 23.20 x 15.40 mm sensor, 4.25 µm pixel pitch, Sony E mount, and 2 fps continuous capture; final manufacturer specification remains a verification risk. (source: `restrictions.md`, user-confirmed open-question review)
 ## Anchor
 Accepted absolute visual match to a georeferenced cache tile, used to correct VO/IMU drift. (source: `acceptance_criteria.md`, `solution.md`)
 ## ArduPilot Plane SITL
 Simulation environment used to verify `GPS_INPUT`, spoofing, failsafe, and MAVLink behavior with the production Plane parameter set. (source: `acceptance_criteria.md`, `tests/environment.md`)
 ## BASALT VIO
 Selected production relative visual-inertial odometry candidate. It consumes calibrated navigation-camera frames and IMU data but does not own the product's safety, anchor, cache, or MAVLink semantics. (source: `solution.md`)
 ## Cache Poisoning
 Failure mode where a misaligned onboard-generated tile is written back and later corrupts future satellite-cache anchors. (source: `acceptance_criteria.md`)
 ## COG Tile
 Cloud Optimized GeoTIFF raster object used as the write-new unit for georeferenced service tiles and generated onboard tiles. (source: `solution.md`)
 ## FDR
 Flight Data Recorder. Bounded per-flight recorder for estimates, IMU traces, emitted MAVLink, health telemetry, tile writes, and audit evidence; raw nav/AI frames are not retained. (source: `acceptance_criteria.md`, `solution.md`)
 ## Generated Tile
 In-flight orthorectified tile created from navigation-camera imagery and stored locally for post-flight Satellite Service sync. (source: `acceptance_criteria.md`)
 ## `GPS_INPUT`
 MAVLink message used as the ArduPilot GPS substitute in v1. It carries WGS84 coordinates, fix type, accuracy fields, and ignore flags. (source: `acceptance_criteria.md`, `solution.md`)
 ## Kimera Backup
 BSD-friendly VIO backup candidate if BASALT fails project replay/runtime gates. (source: `solution.md`)
 ## Mahalanobis Gate
 Statistical consistency gate used to reject anchor updates that are inconsistent with the current estimator state and covariance. (source: `solution.md`)
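 As an illustration of such a gate, the squared Mahalanobis distance of a 2D innovation `v` with innovation covariance `S` is `v^T S^-1 v`, compared against a chi-square threshold. The sketch below assumes a 2-degree-of-freedom horizontal innovation and the 99% chi-square value of about 9.21; the threshold actually used by the project is not specified here.

```python
CHI2_2DOF_99 = 9.21  # 99% quantile of chi-square with 2 degrees of freedom


def mahalanobis_sq_2d(innovation, cov) -> float:
    """Squared Mahalanobis distance v^T S^-1 v for a 2D innovation."""
    vx, vy = innovation
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    # S^-1 = 1/det * [[d, -b], [-c, a]]
    return (vx * (d * vx - b * vy) + vy * (-c * vx + a * vy)) / det


def anchor_passes_gate(innovation, cov, threshold: float = CHI2_2DOF_99) -> bool:
    """Accept the anchor only if it is statistically consistent with the state."""
    return mahalanobis_sq_2d(innovation, cov) <= threshold
```

 A 5 m innovation is rejected when the covariance is 1 m² per axis (distance squared 25) but accepted when it is 25 m² per axis (distance squared 1), which is why an under-reported covariance makes the gate reject valid anchors.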
 ## Nadir Camera
 Fixed downward-pointing navigation camera used for localization. It is not gimbal-stabilized. (source: `restrictions.md`)
 ## OpenVINS Reference
 GPLv3 VIO/covariance baseline used for replay comparison and uncertainty calibration; not the default production dependency. (source: `solution.md`)
 ## Public Replay Dataset
 External dataset used to cover validation gaps in the current sample data, especially synchronized nadir camera, IMU, GNSS/ground truth, and reference imagery. MUN-FRL and ALTO are first-choice candidates. (source: `tests/test-data.md`)
 ## QGroundControl
 Supported ground station that receives downsampled status and failsafe messages at 1-2 Hz. (source: `restrictions.md`, `acceptance_criteria.md`)
 ## Representative Replay Rig
 Final acceptance dataset or hardware setup from the target flight profile with synchronized nav camera, FC IMU/attitude/airspeed/altitude, emitted MAVLink, and ground truth. (source: `acceptance_criteria.md`, `tests/test-data.md`)
 ## Safety/Anchor Wrapper
 Project-owned layer around BASALT that calibrates confidence, fuses satellite anchors, owns source labels, handles degraded modes, gates generated tiles, and emits MAVLink outputs. (source: `solution.md`)
 ## Satellite Service
 External Azaion Suite component that prepares the offline satellite cache before flight and ingests generated tiles after landing. The onboard system does not call it in-flight. (source: `restrictions.md`)
 ## Source Label
 Categorical label attached to every estimate: `{satellite_anchored, vo_extrapolated, dead_reckoned}`. (source: `acceptance_criteria.md`)
 ## Tile Freshness
 Runtime age gate for cache tiles: active-conflict sectors require newer imagery than stable rear sectors. (source: `acceptance_criteria.md`)
 ## Visual Blackout
 State where the navigation camera provides no usable localization signal because of clouds, occlusion, whiteout, or similar visual loss. (source: `acceptance_criteria.md`)
 ## VPR Chunk
 Ground-footprint-sized retrieval unit, approximately 600-800 m with overlap, decoupled from the storage tile convention. (source: `acceptance_criteria.md`)
@@ -0,0 +1,243 @@
 # Module Layout
 **Language**: mixed (Python orchestration + C++ native vision bridges)
 **Layout Convention**: src-layout
 **Root**: `src/`
 **Last Updated**: 2026-05-03
 ## Layout Rules
 1. Each product component owns one top-level directory under `src/`.
 2. Shared contracts and cross-cutting helpers live under `src/shared/`.
 3. Native hot-path or third-party bridge code lives inside the owning component folder under `native/`.
 4. Public API surface per component is limited to `__init__.py`, `types.py`, and `interfaces.py` unless a component entry lists another public file.
 5. Tests live under `tests/` by test type and component; implementation tasks must not place tests inside the component tree unless a later test task explicitly changes this layout.
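 Rule 4 lends itself to an automated check. The sketch below is one possible approach, not an existing project tool: it assumes `src/` is the import root (so components import as `vio_adapter.types`), uses the component directory names from the mapping that follows, and treats everything outside those packages, including `shared`, as unrestricted.

```python
import ast

COMPONENTS = {
    "camera_ingest_calibration", "vio_adapter", "safety_anchor_wrapper",
    "satellite_service", "anchor_verification", "tile_manager",
    "mavlink_gcs_integration", "fdr_observability",
}
# "" is the package itself (__init__.py); the others are the public files.
PUBLIC_SUBMODULES = {"", "types", "interfaces"}


def is_allowed_import(importer: str, module: str) -> bool:
    """True if `importer` may import dotted `module` under layout rule 4."""
    target, _, submodule = module.partition(".")
    if target not in COMPONENTS or target == importer:
        return True  # non-component code, or the component's own internals
    return submodule in PUBLIC_SUBMODULES


def find_violations(importer: str, source: str) -> list[str]:
    """List imported modules in `source` that cross a component boundary."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            modules = [node.module]
        else:
            continue
        violations.extend(m for m in modules if not is_allowed_import(importer, m))
    return violations
```

 This sketch inspects module paths only, so a form such as `from vio_adapter import internal` would slip through; a real check would also need to inspect imported names and any extra public files a component entry lists.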
 ## Per-Component Mapping
 ### Component: Camera Ingest And Calibration
 - **Epic**: AZ-209
 - **Directory**: `src/camera_ingest_calibration/`
 - **Technologies**: Python, OpenCV 4.x, camera SDK/V4L2/GigE adapter boundary, calibration files, shared geometry/time helpers
 - **Public API**:
  - `src/camera_ingest_calibration/__init__.py`
  - `src/camera_ingest_calibration/types.py`
  - `src/camera_ingest_calibration/interfaces.py`
 - **Internal (do NOT import from other components)**:
  - `src/camera_ingest_calibration/internal/*`
  - `src/camera_ingest_calibration/_*.py`
 - **Owns (exclusive write during implementation)**: `src/camera_ingest_calibration/**`
 - **Imports from**: shared/contracts, shared/geo_geometry, shared/time_sync, shared/config, shared/errors, shared/telemetry
 - **Consumed by**: VIO Adapter, Satellite Service, Anchor Verification, Tile Manager, FDR And Observability
 ### Component: VIO Adapter
 - **Epic**: AZ-213
 - **Directory**: `src/vio_adapter/`
 - **Native Directory**: `src/vio_adapter/native/`
 - **Technologies**: Python adapter, C++ native bridge, BASALT as current backend, Eigen/Sophus or backend-native math stack, OpenCV 4.x, shared time-sync contracts
 - **Public API**:
  - `src/vio_adapter/__init__.py`
  - `src/vio_adapter/types.py`
  - `src/vio_adapter/interfaces.py`
 - **Internal (do NOT import from other components)**:
  - `src/vio_adapter/internal/*`
  - `src/vio_adapter/_*.py`
  - `src/vio_adapter/native/**`
 - **Owns (exclusive write during implementation)**:
  - `src/vio_adapter/**`
 - **Imports from**: Camera Ingest And Calibration, MAVLink And GCS Integration, shared/contracts, shared/geo_geometry, shared/time_sync, shared/config, shared/errors, shared/telemetry
 - **Consumed by**: Safety And Anchor Wrapper, FDR And Observability
 ### Component: Safety And Anchor Wrapper
 - **Epic**: AZ-216
 - **Directory**: `src/safety_anchor_wrapper/`
 - **Technologies**: Python state machine, OpenCV geometry helpers, covariance/gating logic, shared DTO contracts, MAVLink output DTOs
 - **Public API**:
  - `src/safety_anchor_wrapper/__init__.py`
  - `src/safety_anchor_wrapper/types.py`
  - `src/safety_anchor_wrapper/interfaces.py`
 - **Internal (do NOT import from other components)**:
  - `src/safety_anchor_wrapper/internal/*`
  - `src/safety_anchor_wrapper/_*.py`
 - **Owns (exclusive write during implementation)**: `src/safety_anchor_wrapper/**`
 - **Imports from**: VIO Adapter, Anchor Verification, MAVLink And GCS Integration, Camera Ingest And Calibration, shared/contracts, shared/geo_geometry, shared/time_sync, shared/config, shared/errors, shared/telemetry
 - **Consumed by**: MAVLink And GCS Integration, Tile Manager, FDR And Observability
 ### Component: Satellite Service
 - **Epic**: AZ-214
 - **Directory**: `src/satellite_service/`
 - **Native Directory**: `src/satellite_service/native/`
 - **Technologies**: Python service adapter, DINOv2-VLAD descriptors, ONNX/TensorRT candidate path, CPU FAISS, offline package sync client
 - **Public API**:
  - `src/satellite_service/__init__.py`
  - `src/satellite_service/types.py`
  - `src/satellite_service/interfaces.py`
 - **Internal (do NOT import from other components)**:
  - `src/satellite_service/internal/*`
  - `src/satellite_service/_*.py`
  - `src/satellite_service/native/**`
 - **Owns (exclusive write during implementation)**:
  - `src/satellite_service/**`
 - **Imports from**: Camera Ingest And Calibration, Tile Manager, Safety And Anchor Wrapper, shared/contracts, shared/geo_geometry, shared/time_sync, shared/config, shared/errors, shared/telemetry
 - **Consumed by**: Anchor Verification, FDR And Observability
 - **Network invariant**: external Satellite Service sync is allowed only pre-flight or post-flight; no mid-flight satellite-provider or suite-service calls.
 ### Component: Anchor Verification
 - **Epic**: AZ-215
 - **Directory**: `src/anchor_verification/`
 - **Native Directory**: `src/anchor_verification/native/`
 - **Technologies**: Python validation pipeline, ALIKED/DISK + LightGlue, OpenCV RANSAC/USAC, SIFT/ORB baseline, native feature-matching bridge
 - **Public API**:
  - `src/anchor_verification/__init__.py`
  - `src/anchor_verification/types.py`
  - `src/anchor_verification/interfaces.py`
 - **Internal (do NOT import from other components)**:
  - `src/anchor_verification/internal/*`
  - `src/anchor_verification/_*.py`
  - `src/anchor_verification/native/**`
 - **Owns (exclusive write during implementation)**:
  - `src/anchor_verification/**`
 - **Imports from**: Satellite Service, Camera Ingest And Calibration, Tile Manager, shared/contracts, shared/geo_geometry, shared/time_sync, shared/config, shared/errors, shared/telemetry
 - **Consumed by**: Safety And Anchor Wrapper, FDR And Observability
 ### Component: Tile Manager
 - **Epic**: AZ-211
 - **Directory**: `src/tile_manager/`
 - **Technologies**: Python repository/policy layer, PostgreSQL/PostGIS, GDAL/rasterio COG handling, signed JSON sidecars, OpenCV/GDAL orthorectification, hash/signature validation
 - **Public API**:
  - `src/tile_manager/__init__.py`
  - `src/tile_manager/types.py`
  - `src/tile_manager/interfaces.py`
 - **Internal (do NOT import from other components)**:
  - `src/tile_manager/internal/*`
  - `src/tile_manager/_*.py`
 - **Owns (exclusive write during implementation)**:
  - `src/tile_manager/**`
  - `migrations/postgresql/cache_*.sql`
  - `migrations/seed/cache_*`
 - **Imports from**: Camera Ingest And Calibration, Safety And Anchor Wrapper, shared/contracts, shared/geo_geometry, shared/time_sync, shared/config, shared/errors, shared/telemetry
 - **Consumed by**: Satellite Service, Anchor Verification, FDR And Observability
 ### Component: MAVLink And GCS Integration
 - **Epic**: AZ-210
 - **Directory**: `src/mavlink_gcs_integration/`
 - **Technologies**: Python, MAVSDK telemetry subscriptions, pymavlink `GPS_INPUT` emission, MAVLink/QGC status messages
 - **Public API**:
  - `src/mavlink_gcs_integration/__init__.py`
  - `src/mavlink_gcs_integration/types.py`
  - `src/mavlink_gcs_integration/interfaces.py`
 - **Internal (do NOT import from other components)**:
  - `src/mavlink_gcs_integration/internal/*`
  - `src/mavlink_gcs_integration/_*.py`
 - **Owns (exclusive write during implementation)**: `src/mavlink_gcs_integration/**`
 - **Imports from**: Safety And Anchor Wrapper, shared/contracts, shared/time_sync, shared/config, shared/errors, shared/telemetry
 - **Consumed by**: VIO Adapter, Safety And Anchor Wrapper, FDR And Observability
 ### Component: FDR And Observability
 - **Epic**: AZ-212
 - **Directory**: `src/fdr_observability/`
 - **Technologies**: Python append/export layer, PostgreSQL event index, CBOR segment payloads, optional Parquet export, structured logging/health events
 - **Public API**:
  - `src/fdr_observability/__init__.py`
  - `src/fdr_observability/types.py`
  - `src/fdr_observability/interfaces.py`
 - **Internal (do NOT import from other components)**:
  - `src/fdr_observability/internal/*`
  - `src/fdr_observability/_*.py`
 - **Owns (exclusive write during implementation)**:
  - `src/fdr_observability/**`
  - `migrations/postgresql/fdr_*.sql`
  - `migrations/seed/fdr_*`
 - **Imports from**: shared/contracts, shared/time_sync, shared/config, shared/errors, shared/telemetry
 - **Consumed by**: all runtime components
 ## Shared / Cross-Cutting
 ### shared/contracts
 - **Epic**: AZ-206
 - **Directory**: `src/shared/contracts/`
 - **Technologies**: Python typed DTOs, schema/contract definitions, Markdown API-contract documents
 - **Purpose**: Shared DTOs, protocol shapes, schemas, and public contract exports.
 - **Owned by**: initial structure and shared-contract tasks under AZ-206.
 - **Consumed by**: all components.
 ### shared/geo_geometry
 - **Epic**: AZ-207
 - **Directory**: `src/shared/geo_geometry/`
 - **Technologies**: Python geometry utilities, OpenCV 4.x, WGS84/local-frame math, homography/covariance conversions
 - **Purpose**: WGS84/local conversions, GSD, camera footprint projection, homography/covariance unit conversion, and distance calculations.
 - **Owned by**: shared geometry task under AZ-207.
 - **Consumed by**: Camera Ingest And Calibration, Safety And Anchor Wrapper, Anchor Verification, Tile Manager.
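
As a rough illustration of the GSD and footprint calculations listed in the purpose above, the sketch below uses the standard pinhole relation for a nadir view over flat terrain: GSD = altitude x pixel pitch / focal length. The function names and the sensor values in the usage note are assumptions for illustration, not project parameters.

```python
def ground_sample_distance_m(
    altitude_m: float, pixel_pitch_um: float, focal_length_mm: float
) -> float:
    """Metres of ground covered by one pixel for a nadir camera over flat terrain."""
    if altitude_m <= 0 or pixel_pitch_um <= 0 or focal_length_mm <= 0:
        raise ValueError("altitude, pixel pitch, and focal length must be positive")
    pixel_pitch_m = pixel_pitch_um * 1e-6      # micrometres to metres
    focal_length_m = focal_length_mm * 1e-3    # millimetres to metres
    return altitude_m * pixel_pitch_m / focal_length_m


def footprint_width_m(gsd_m_per_px: float, image_width_px: int) -> float:
    """Across-track ground width covered by one image row."""
    return gsd_m_per_px * image_width_px
```

With hypothetical values of 1000 m altitude, 4 um pixel pitch, and a 40 mm lens, this gives roughly 0.1 m/px; the real numbers depend on the pinned camera and lens datasheet.
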
 ### shared/time_sync
 - **Epic**: AZ-207
 - **Directory**: `src/shared/time_sync/`
 - **Technologies**: Python timestamp utilities, monotonic-clock validation, MAVLink/camera timestamp normalization, replay ordering checks
 - **Purpose**: Monotonic timestamp checks, frame-to-IMU alignment, clock-domain metadata, replay ordering, and gap/jitter metrics.
 - **Owned by**: time-sync task under AZ-207.
 - **Consumed by**: Camera Ingest And Calibration, VIO Adapter, MAVLink And GCS Integration, FDR And Observability.
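
A minimal sketch of the monotonic check and gap/jitter metrics named in the purpose above. The function names, the nanosecond unit, and the choice of population standard deviation as the jitter measure are assumptions, not the module's confirmed API.

```python
import statistics
from typing import Dict, List, Sequence


def timestamp_gaps_ns(timestamps_ns: Sequence[int]) -> List[int]:
    """Return consecutive gaps; raise if the sequence is not strictly increasing."""
    gaps = []
    for previous, current in zip(timestamps_ns, timestamps_ns[1:]):
        if current <= previous:
            raise ValueError(f"non-monotonic timestamp: {current} after {previous}")
        gaps.append(current - previous)
    return gaps


def gap_metrics_ns(timestamps_ns: Sequence[int]) -> Dict[str, float]:
    """Summarize gaps; jitter here is the population standard deviation (assumed)."""
    gaps = timestamp_gaps_ns(timestamps_ns)
    if not gaps:
        raise ValueError("need at least two timestamps")
    return {
        "max_gap_ns": float(max(gaps)),
        "mean_gap_ns": statistics.fmean(gaps),
        "jitter_ns": statistics.pstdev(gaps),
    }
```

Failing fast on a non-monotonic timestamp keeps the check consistent with the fail-fast helpers described under `shared/errors`.
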
 ### shared/config
 - **Epic**: AZ-208
 - **Directory**: `src/shared/config/`
 - **Technologies**: Python configuration loader, environment variables, `.env.example`, startup readiness validation
 - **Purpose**: Runtime profile loading, environment validation, typed settings, and startup readiness inputs.
 - **Owned by**: runtime configuration task under AZ-208.
 - **Consumed by**: all runtime components.
 ### shared/errors
 - **Epic**: AZ-208
 - **Directory**: `src/shared/errors/`
 - **Technologies**: Python exception/result envelope types, shared error categories, fail-fast helpers
 - **Purpose**: Error categories, result envelopes, fail-fast helpers, and non-silent exception contracts.
 - **Owned by**: runtime error contract task under AZ-208.
 - **Consumed by**: all components.
 ### shared/telemetry
 - **Epic**: AZ-208
 - **Directory**: `src/shared/telemetry/`
 - **Technologies**: Python structured logging, metrics labels, health event DTOs, FDR-safe telemetry metadata
 - **Purpose**: Structured logging, metrics labels, health event shapes, and FDR-safe event metadata helpers.
 - **Owned by**: observability/config contract task under AZ-208.
 - **Consumed by**: all components.
 ## Allowed Dependencies (layering)
 Read top-to-bottom; an upper layer may import from a lower layer but never the reverse.
 | Layer | Components | May import from |
 |-------|------------|-----------------|
 | 4. Runtime Output / Coordination | Safety And Anchor Wrapper, MAVLink And GCS Integration, FDR And Observability | 1, 2, 3 public interfaces |
 | 3. Perception / Satellite Anchor | VIO Adapter, Satellite Service, Anchor Verification | 1, 2 public interfaces |
 | 2. Data Ingest / Persistence | Camera Ingest And Calibration, Tile Manager | 1 |
 | 1. Shared / Foundation | shared/contracts, shared/geo_geometry, shared/time_sync, shared/config, shared/errors, shared/telemetry | none |
 Violations of this table are Architecture findings in code-review Phase 7 and are High severity.
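
The table can be encoded as data and checked mechanically. The sketch below is a hypothetical helper, not an existing project tool: layer numbers and the "May import from" sets come from the table, while the function name is an assumption. Same-layer imports are not listed in the table, so the sketch reports them as not allowed; whether specific same-layer imports are permitted is a policy question it does not settle.

```python
# Layer of each component, copied from the layering table.
LAYER_OF = {
    "shared/contracts": 1,
    "shared/geo_geometry": 1,
    "shared/time_sync": 1,
    "shared/config": 1,
    "shared/errors": 1,
    "shared/telemetry": 1,
    "Camera Ingest And Calibration": 2,
    "Tile Manager": 2,
    "VIO Adapter": 3,
    "Satellite Service": 3,
    "Anchor Verification": 3,
    "Safety And Anchor Wrapper": 4,
    "MAVLink And GCS Integration": 4,
    "FDR And Observability": 4,
}

# "May import from" column, keyed by the importing layer.
MAY_IMPORT_FROM = {4: {1, 2, 3}, 3: {1, 2}, 2: {1}, 1: set()}


def is_allowed_import(importer: str, imported: str) -> bool:
    """True when the table lets `importer` depend on `imported`."""
    return LAYER_OF[imported] in MAY_IMPORT_FROM[LAYER_OF[importer]]
```

A check like this could run over the declared "Imports from" lists before code review, so that layering findings surface earlier than Phase 7.
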
 ## Out-of-Product E2E Test Suite
 The e2e replay/SITL/Jetson validation suite is not a product component and must not receive Step 6 product implementation tasks. It owns test-support artifacts under `tests/blackbox/**`, `tests/e2e/**`, `e2e/replay/**`, and `e2e/reports/**`, and it exercises the runtime only through public file, MAVLink, cache, status, and FDR interfaces.
 - **Technologies**: Python, pytest-style runner, Docker/compose, pymavlink/log parser, ArduPilot Plane SITL, QGC observer/log parser, CSV/Markdown reports
 ## Self-Verification
 - Every runtime component under `_docs/02_document/components/` has a mapping entry.
 - Cross-cutting epics AZ-206, AZ-207, and AZ-208 have shared ownership entries.
 - Layering covers all components and keeps shared code at the bottom.
 - Component-owned paths do not overlap; native bridge paths live inside the component that owns them.
 - Paths follow the project `src/` layout already confirmed by `AZ-219_initial_structure`.
@@ -0,0 +1,275 @@
 # Risk Assessment — Architecture Review — Iteration 01
 ## Evaluator Pass Summary
 | Check | Result | Notes |
 |-------|--------|-------|
 | Single Responsibility | Pass | Components each own one primary concern: ingest, VIO, safety, Satellite Service sync/retrieval, verification, Tile Manager storage/generation, MAVLink, FDR, validation |
 | Dumb Code / Smart Data | Pass | Complex behavior is mostly expressed through DTOs, mode labels, covariance fields, manifests, and gates |
 | Interface Consistency | Pass with fix | Safety wrapper no longer directly depends on Tile Manager for anchor acceptance; cache freshness/provenance travels through `AnchorDecision` |
 | Circular Dependencies | Pass with caution | Runtime flow is acyclic at component ownership level; MAVLink remains a bidirectional protocol adapter but owns no localization policy |
 | Missing Interactions | Pass | Pre-VIO occlusion, IMU-only blackout, relocalization, tile writes, FDR, and SITL validation are all represented |
 | Security Considerations | Pass | Signed cache sidecars, source/system ID checks, spoofing rejection, and no in-flight satellite-provider or Satellite Service access are covered |
 | Performance Bottlenecks | Pass | Jetson latency, VPR/local matching, FDR append pressure, PostgreSQL availability, and thermal limits are identified |
 | API Contracts | Pass | Core DTO handoffs are documented: `FramePacket`, `VioStatePacket`, `AnchorDecision`, `PositionEstimate`, `FdrEvent` |
 ## Risk Scoring Matrix
 |  | Low Impact | Medium Impact | High Impact |
 |--|------------|---------------|-------------|
 | **High Probability** | Medium | High | Critical |
 | **Medium Probability** | Low | Medium | High |
 | **Low Probability** | Low | Low | Medium |
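
The matrix reads as a plain lookup from (probability, impact) to score. A minimal sketch, with the function name assumed for illustration:

```python
# Keys are (probability, impact); values are the score from the matrix.
RISK_SCORE = {
    ("High", "Low"): "Medium",
    ("High", "Medium"): "High",
    ("High", "High"): "Critical",
    ("Medium", "Low"): "Low",
    ("Medium", "Medium"): "Medium",
    ("Medium", "High"): "High",
    ("Low", "Low"): "Low",
    ("Low", "Medium"): "Low",
    ("Low", "High"): "Medium",
}


def risk_score(probability: str, impact: str) -> str:
    """Look up the score for a probability/impact pair; raises KeyError if unknown."""
    return RISK_SCORE[(probability, impact)]
```

For example, a Medium-probability, High-impact risk scores High, and a Low-probability, High-impact risk scores Medium.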
 ## Acceptance Criteria by Risk Level
 | Level | Action Required |
 |-------|-----------------|
 | Low | Accepted and monitored |
 | Medium | Mitigation plan required before implementation |
 | High | Mitigation + contingency plan required, reviewed during implementation |
 | Critical | Must be resolved before proceeding to next planning step |
 ## Risk Register
 | ID | Risk | Category | Probability | Impact | Score | Mitigation | Owner | Status |
 |----|------|----------|-------------|--------|-------|------------|-------|--------|
 | R01 | ADTi 20MP 20L V1 public specs conflict with planning assumptions for resolution, FPS, lens, interface, and temperature | Technical / External | Medium | High | High | Pin manufacturer datasheet and exact lens/interface before implementation; make camera calibration/spec task a bootstrap blocker | Camera ingest/calibration | Mitigated by gate |
 | R02 | BASALT may underperform or lose tracking on nadir fixed-wing low-parallax terrain | Technical | Medium | High | High | Public replay with MUN-FRL/ALTO/Kagaru/EPFL where applicable, representative target replay, OpenVINS reference comparison, Kimera backup path | VIO adapter | Mitigated by validation |
 | R03 | BASALT confidence/covariance may under-report real error | Safety | Medium | High | High | Wrapper owns covariance calibration; compare against ground truth, satellite residuals, and OpenVINS reference; never emit optimistic `horiz_accuracy` | Safety/anchor wrapper | Mitigated by wrapper design |
 | R04 | Total occlusion detector may false-negative and feed unusable frames into VIO | Safety / Technical | Medium | High | High | Conservative pre-VIO occlusion gate, FDR status, tests for total blackout, and fallback to IMU-only `dead_reckoned` mode | Camera ingest/calibration | Mitigated by spec/test |
 | R05 | IMU-only blackout propagation could be trusted too long | Safety | Medium | High | High | Monotonic covariance growth, `dead_reckoned` label, `fix_type=0`/`horiz_accuracy=999.0` when blackout exceeds 30 s or covariance exceeds 500 m | Safety/anchor wrapper | Mitigated by AC gate |
 | R06 | DINOv2-VLAD + ALIKED/DISK-LightGlue exceeds Jetson latency/memory budget | Performance | Medium | High | High | Trigger-only execution, CPU FAISS first, top-K caps, model profiling, TensorRT only after fidelity checks | Satellite Service / Anchor verification | Mitigated by profiling gates |
 | R07 | PostgreSQL/PostGIS local DB is unavailable or too heavy for onboard runtime | Technical / Operational | Medium | High | High | Run local onboard PostgreSQL, health-check before flight, keep large payloads in files, fail mission cache validation if DB unavailable | Tile Manager / FDR | Mitigated by deployment gates |
 | R08 | Generated tile cache poisoning corrupts future anchors | Security / Safety | Low | High | Medium | Sigma gate, provenance sidecars, post-flight Satellite Service voting, no direct promotion to trusted basemap | Tile Manager | Mitigated by policy |
 | R09 | Public datasets do not cover final target terrain or commercial license needs | External / Schedule | Medium | Medium | Medium | Use public data for de-risking only; representative synchronized target data remains mandatory for acceptance | Validation harness | Mitigated by acceptance rule |
 | R10 | MAVLink `GPS_INPUT` parameters or Plane behavior differs from assumptions | Integration | Medium | High | High | Plane SITL release gate with production parameters, spoofing/failsafe tests, raw field validation with pymavlink | MAVLink/GCS integration | Mitigated by SITL gate |
 | R11 | FDR appends or PostgreSQL indexing interferes with hot-path latency | Performance | Medium | Medium | Medium | Append asynchronously, use CBOR payload segments for high-volume data, keep PostgreSQL as event index/query surface | FDR/observability | Mitigated by design |
 | R12 | GPL/non-commercial tooling accidentally enters production or acceptance evidence | Legal / Compliance | Low | High | Medium | Keep OpenVINS/ORB-SLAM3 reference-only; license-tag datasets before CI; SuperPoint only after legal approval | Validation harness / Architecture | Mitigated by gates |
 ## Detailed Risk Analysis
 ### R01: Camera Specification Mismatch
 **Description**: Public ADTi pages show 5456 x 3632 stills, 2 fps continuous capture, Sony E mount, and -10 to 40 °C operation. The project needs the exact production lens, camera interface, sustained capture behavior, thermal behavior, and calibration model.
 **Trigger conditions**: Manufacturer documentation or hardware testing contradicts assumed FPS, interface, temperature, or lens characteristics.
 **Affected components**: Camera ingest/calibration, VIO adapter, separate e2e test suite, deployment procedures.
 **Mitigation strategy**:
 1. Make camera specification verification a bootstrap task.
 2. Require manufacturer datasheet or hardware measurement before implementation claims 3 fps or hot-environment operation.
 3. Version calibration data by exact camera/lens/interface.
 **Contingency plan**: Reduce frame rate assumptions, adjust latency tests, or select a different navigation camera/lens/interface.
 **Residual risk after mitigation**: Medium.
 **Documents updated**: `glossary.md`, `architecture.md`, `components/01_camera_ingest_calibration/description.md`, `deployment/deployment_procedures.md`.
 ---
 ### R02: BASALT Nadir Fixed-Wing Fit
 **Description**: BASALT is a strong VIO candidate, but fixed downward cameras over planar terrain can cause low-parallax and texture-degeneracy cases.
 **Trigger conditions**: Public or representative replay shows high drift, frequent tracking loss, or poor initialization.
 **Affected components**: VIO adapter, safety/anchor wrapper, separate e2e test suite.
 **Mitigation strategy**:
 1. Run MUN-FRL first for synchronized nadir camera + IMU + ground truth.
 2. Add ALTO/Kagaru/EPFL slices where available for aerial/fixed-wing realism.
 3. Compare against OpenVINS reference and Kimera backup.
 **Contingency plan**: Keep Kimera backup or build a project-owned fallback estimator around OpenCV + IMU only after replay evidence requires it.
 **Residual risk after mitigation**: Medium.
 **Documents updated**: `architecture.md`, `components/02_vio_adapter/description.md`, `tests/test-data.md`.
 ---
 ### R03: Covariance Under-Reporting
 **Description**: Incorrect confidence is more dangerous than no estimate because the flight controller may trust a false fix.
 **Trigger conditions**: Replay error exceeds reported covariance, or anchors are accepted despite inconsistent residuals.
 **Affected components**: Safety/anchor wrapper, MAVLink/GCS integration, FDR/observability.
 **Mitigation strategy**:
 1. Make wrapper covariance the product authority, not BASALT raw confidence.
 2. Validate calibration against ground truth, satellite residuals, and OpenVINS reference.
 3. Map `horiz_accuracy` so it never under-reports the 95% semi-major covariance axis.
 **Contingency plan**: Degrade to no-fix sooner and require operator relocalization or mission abort behavior.
 **Residual risk after mitigation**: Medium.
 **Documents updated**: `architecture.md`, `components/03_safety_anchor_wrapper/description.md`, `tests/blackbox-tests.md`.
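
One way to read mitigation step 3 is sketched below. It assumes a 2x2 horizontal position covariance in m² and the usual Gaussian confidence ellipse, where the 95% semi-major axis is sqrt(chi-square quantile x largest eigenvalue). The function name and this interpretation of "95% semi-major covariance axis" are assumptions; the wrapper could report this value or anything larger.

```python
import math

# 95% quantile of the chi-square distribution with 2 degrees of freedom.
CHI2_95_2DOF = 5.991


def semi_major_95_m(var_e: float, var_n: float, cov_en: float) -> float:
    """95% semi-major axis (m) of the error ellipse for [[var_e, cov_en], [cov_en, var_n]]."""
    mean = 0.5 * (var_e + var_n)
    radius = math.hypot(0.5 * (var_e - var_n), cov_en)
    lambda_max = mean + radius  # largest eigenvalue of the symmetric 2x2 matrix
    if lambda_max < 0:
        raise ValueError("covariance must be positive semi-definite")
    return math.sqrt(CHI2_95_2DOF * lambda_max)
```

Using the largest eigenvalue rather than a per-axis variance keeps the reported accuracy conservative when the error ellipse is elongated or rotated.
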
 ---
 ### R04: Total Occlusion Detection Failure
 **Description**: If total occlusion is not detected before VIO, BASALT may receive unusable frames and produce misleading state updates.
 **Trigger conditions**: Lens cover, cloud/whiteout, decode failure, underexposure/overexposure, or textureless frame reaches VIO as usable.
 **Affected components**: Camera ingest/calibration, safety/anchor wrapper, VIO adapter.
 **Mitigation strategy**:
 1. Camera ingest exposes `OcclusionReport` and sets `usable_for_vio=false` for total occlusion/blackout.
 2. Total occlusion bypasses BASALT for that frame.
 3. Safety wrapper switches to IMU-only `dead_reckoned` propagation with monotonic covariance growth.
 **Contingency plan**: Tune detector conservatively and accept temporary false-positive IMU-only degradation over false VIO confidence.
 **Residual risk after mitigation**: Medium.
 **Documents updated**: `components/01_camera_ingest_calibration/description.md`, `components/03_safety_anchor_wrapper/description.md`, `system-flows.md`, `diagrams/flows/flow_normal_localization.md`, `tests/resilience-tests.md`.
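
The bypass described in the mitigation steps can be sketched as a small routing decision. `OcclusionReport` and `usable_for_vio` are named in this document; the `total_occlusion` field, the `route_frame` function, and the returned route labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcclusionReport:
    total_occlusion: bool   # hypothetical field: detector saw a fully unusable frame
    usable_for_vio: bool    # named in the mitigation strategy above


def route_frame(report: OcclusionReport) -> str:
    """Decide where a frame goes before it can reach the VIO adapter."""
    if report.total_occlusion or not report.usable_for_vio:
        # Bypass BASALT; the safety wrapper propagates on IMU only.
        return "imu_only_dead_reckoned"
    return "vio"
```

Treating either flag as sufficient to bypass VIO matches the conservative tuning in the contingency plan: a false positive costs a temporary IMU-only degradation, not false VIO confidence.
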
 ---
 ### R05: IMU-Only Mode Over-Trust
 **Description**: IMU-only propagation drifts quickly and must be treated as an emergency bridge, not a long-duration solution.
 **Trigger conditions**: Blackout lasts longer than 30 seconds or covariance exceeds 500 m.
 **Affected components**: Safety/anchor wrapper, MAVLink/GCS integration, FDR/observability.
 **Mitigation strategy**:
 1. Emit `source_label=dead_reckoned` during IMU-only mode.
 2. Grow covariance monotonically.
 3. Emit `fix_type=0`, `horiz_accuracy=999.0`, and `VISUAL_BLACKOUT_FAILSAFE` at thresholds.
 **Contingency plan**: Stop publishing valid fixes and require relocalization/operator action.
 **Residual risk after mitigation**: Low.
 **Documents updated**: `components/03_safety_anchor_wrapper/description.md`, `system-flows.md`, `tests/blackbox-tests.md`, `tests/resilience-tests.md`, `tests/traceability-matrix.md`.
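
A minimal sketch of the threshold gate in the mitigation steps. The 30 s and 500 m limits, `fix_type=0`, `horiz_accuracy=999.0`, and the status strings come from this document; the dataclass, the function name, and the below-threshold behavior (pass through a caller-supplied fix type and report the current covariance as accuracy) are assumptions.

```python
from dataclasses import dataclass

BLACKOUT_LIMIT_S = 30.0
COVARIANCE_LIMIT_M = 500.0
NO_FIX_HORIZ_ACCURACY_M = 999.0


@dataclass(frozen=True)
class ImuOnlyOutput:
    source_label: str
    fix_type: int
    horiz_accuracy: float
    status_text: str


def imu_only_output(
    blackout_s: float, covariance_m: float, degraded_fix_type: int
) -> ImuOnlyOutput:
    """Output fields while the wrapper is propagating on IMU only."""
    if blackout_s > BLACKOUT_LIMIT_S or covariance_m > COVARIANCE_LIMIT_M:
        # Either threshold crossed: stop publishing a valid fix.
        return ImuOnlyOutput(
            "dead_reckoned", 0, NO_FIX_HORIZ_ACCURACY_M, "VISUAL_BLACKOUT_FAILSAFE"
        )
    # Below both thresholds: still labelled dead_reckoned (assumed pass-through values).
    return ImuOnlyOutput(
        "dead_reckoned", degraded_fix_type, covariance_m, "VISUAL_BLACKOUT_IMU_ONLY"
    )
```

Either threshold alone is enough to trigger the no-fix output, which matches the "or" in the trigger conditions.
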
 ---
 ### R06: Trigger Path Performance
 **Description**: DINOv2-VLAD and learned local matching can exceed Jetson latency/memory limits.
 **Trigger conditions**: Relocalization exceeds p95 latency, memory budget, or causes thermal throttling.
 **Affected components**: Satellite Service, anchor verification, separate e2e test suite.
 **Mitigation strategy**:
 1. Keep VPR/local matching trigger-based.
 2. Use CPU FAISS first and bounded top-K.
 3. Accept optimized engines only after descriptor-fidelity tests pass.
 **Contingency plan**: Reduce descriptor resolution/model size, reduce top-K, or fall back to classical features for emergency operation.
 **Residual risk after mitigation**: Medium.
 **Documents updated**: `architecture.md`, `components/04_satellite_retrieval/description.md`, `components/05_anchor_verification/description.md`, `tests/performance-tests.md`.
 ---
 ### R07: Onboard PostgreSQL/PostGIS Availability
 **Description**: PostgreSQL/PostGIS is now the structured metadata store. If local DB availability or resource use is poor, cache/FDR queries may fail.
 **Trigger conditions**: Local DB does not start, DB files corrupt, DB consumes too much memory/I/O, or migrations fail.
 **Affected components**: Tile Manager, FDR/observability, deployment procedures.
 **Mitigation strategy**:
 1. Require local onboard PostgreSQL health check before flight.
 2. Store large imagery/descriptors/CBOR payloads as files, not DB blobs.
 3. Treat DB unavailability as a mission-cache validation blocker.
 **Contingency plan**: Abort mission-cache activation and run only no-cache degraded modes or resync/rebuild DB before flight.
 **Residual risk after mitigation**: Medium.
 **Documents updated**: `data_model.md`, `architecture.md`, `components/06_cache_tile_lifecycle/description.md`, `components/08_fdr_observability/description.md`, `deployment/environment_strategy.md`.
 ---
 ### R08: Cache Poisoning
 **Description**: A bad generated tile could be written back and later used as a trusted anchor.
 **Trigger conditions**: Generated tile is promoted despite high parent covariance, stale source, bad sidecar, or inconsistent overlap voting.
 **Affected components**: Tile Manager, safety/anchor wrapper, Satellite Service integration.
 **Mitigation strategy**:
 1. Require tile-write sigma gates.
 2. Store generated tiles as candidates with signed sidecars.
 3. Promote only through post-flight Satellite Service validation/voting.
 **Contingency plan**: Quarantine generated tiles and invalidate affected cache regions.
 **Residual risk after mitigation**: Low.
 **Documents updated**: `architecture.md`, `components/06_cache_tile_lifecycle/description.md`, `tests/security-tests.md`.
 ---
 ### R09: Dataset Coverage / Licensing
 **Description**: Public datasets may not match target terrain, may lack raw synchronized IMU, or may have non-commercial restrictions.
 **Trigger conditions**: MUN-FRL/ALTO/Kagaru/EPFL slices are unavailable, unrepresentative, or license-incompatible for acceptance.
 **Affected components**: Validation harness, VIO adapter, anchor verification.
 **Mitigation strategy**:
 1. Use public datasets for de-risking only.
 2. License-tag datasets before CI jobs.
 3. Require representative synchronized target data for final acceptance.
 **Contingency plan**: Collect a target replay dataset before final acceptance.
 **Residual risk after mitigation**: Medium.
 **Documents updated**: `tests/test-data.md`, `deployment/environment_strategy.md`, `deployment/ci_cd_pipeline.md`.
 ---
 ### R10: Plane `GPS_INPUT` Integration
 **Description**: ArduPilot Plane EKF and `GPS_INPUT` handling may differ from assumptions, especially around accuracy fields, ignore flags, velocity fields, and spoofing transitions.
 **Trigger conditions**: Plane SITL rejects or mishandles emitted `GPS_INPUT`, or QGC status is insufficient.
 **Affected components**: MAVLink/GCS integration, safety/anchor wrapper, separate e2e test suite.
 **Mitigation strategy**:
 1. Use pymavlink for exact `GPS_INPUT` field control.
 2. Gate release on Plane SITL with production parameters.
 3. Validate spoofing/failsafe and QGC status behavior.
 **Contingency plan**: Adjust parameter guidance/output fields before hardware deployment.
 **Residual risk after mitigation**: Medium.
 **Documents updated**: `components/07_mavlink_gcs_integration/description.md`, `tests/environment.md`, `deployment/ci_cd_pipeline.md`.
 ## Architecture/Component Changes Applied
 | Risk ID | Document Modified | Change Description |
 |---------|-------------------|--------------------|
 | R04 | `components/01_camera_ingest_calibration/description.md` | Added explicit `detect_occlusion`, `OcclusionReport`, and pre-VIO bypass behavior |
 | R04/R05 | `components/03_safety_anchor_wrapper/description.md` | Added `propagate_imu_only`, `total_occlusion`, monotonic covariance behavior, and no direct Tile Manager dependency |
 | R07 | `data_model.md` | Replaced embedded DB references with PostgreSQL/PostGIS structured metadata and CBOR FDR payload segments |
 | R07 | `architecture.md` | Added PostgreSQL/PostGIS ADR and FDR storage decision |
 | R05 | `tests/blackbox-tests.md` / `tests/resilience-tests.md` | Made total occlusion and IMU-only blackout behavior explicit |
 ## Summary
 **Total risks identified**: 12  
 **Critical**: 0 | **High**: 8 | **Medium**: 4 | **Low**: 0  
 **Risks mitigated this iteration**: 12  
 **Risks requiring user decision**: None immediately. Future decisions are tied to exact camera hardware proof, dataset license approval, and representative data collection timing.
@@ -0,0 +1,321 @@
 # GPS-Denied Onboard Localization — System Flows
 ## Flow Inventory
 | # | Flow Name | Trigger | Primary Components | Criticality |
 |---|-----------|---------|--------------------|-------------|
 | F1 | Pre-flight cache preparation | Operator sync before mission | Satellite Service, Tile Manager | High |
 | F2 | Normal frame processing | Navigation frame + FC telemetry | Camera ingest, VIO adapter, safety/anchor wrapper, MAVLink, FDR | High |
 | F3 | Satellite relocalization | Cold start, VO failure, sharp turn, covariance growth, stale anchor | Satellite Service, anchor verification, safety/anchor wrapper | High |
 | F4 | Visual blackout / spoofing degraded mode | Image-quality failure and GPS health failure | Camera ingest, MAVLink telemetry, safety/anchor wrapper, QGC, FDR | Critical |
 | F5 | Generated tile lifecycle | High-confidence pose + usable frame | Camera ingest, safety/anchor wrapper, Tile Manager, FDR | Medium |
 | F6 | Post-flight sync and audit | Landing / operator offload | Tile Manager, Satellite Service, FDR | Medium |
 | F7 | E2E validation replay | Test-suite invocation | Separate e2e test suite, system runtime, public datasets, SITL | High |
 ## Flow Dependencies
 | Flow | Depends On | Shares Data With |
 |------|------------|------------------|
 | F1 | Satellite Service cache export and Tile Manager validation | F2, F3, F5 |
 | F2 | F1 for cache availability; FC telemetry | F3, F4, F5, FDR |
 | F3 | F1 cache/index; F2 state estimate | F2, F5 |
 | F4 | F2 telemetry and quality signals | F2, QGC/FDR |
 | F5 | Accepted state/covariance from F2/F3 | F6 |
 | F6 | F5 generated tiles and FDR | Satellite Service |
 | F7 | Test fixtures and selected execution environment | All flows |
 ---
 ## Flow F1: Pre-Flight Cache Preparation
 ### Description
 Before flight, the Satellite Service imports an offline cache package for the operational area, including COG tiles, manifests, sidecars, VPR chunks, descriptors, and FAISS index files. No Satellite Service or satellite-provider calls are allowed during flight.
 ### Preconditions
 - Operational area and sector freshness classification are known.
 - Cache imagery meets 0.5 m/px minimum and ideally 0.3 m/px.
 - Cache package fits storage budget or has approved split descriptor budget.
 ### Sequence Diagram
 ```mermaid
 sequenceDiagram
    participant Operator
    participant SatelliteService
    participant TileManager
    Operator->>SatelliteService: Request mission cache
    SatelliteService-->>TileManager: COG tiles + manifests + sidecars
    TileManager->>TileManager: Verify signatures, hashes, freshness, resolution
    TileManager-->>SatelliteService: Local cache/index ready
    TileManager-->>Operator: Cache validation report
 ```
 ### Data Flow
 | Step | From | To | Data | Format |
 |------|------|----|------|--------|
 | 1 | Satellite Service | Tile Manager | Tiles and metadata | COG + PostgreSQL/PostGIS manifest + signed JSON sidecars |
 | 2 | Tile Manager | Satellite Service | Descriptor/index readiness | FAISS index + descriptor sidecars |
 | 3 | Tile Manager | Operator/FDR | Validation report | Markdown/CSV/log |
 ### Error Scenarios
 | Error | Where | Detection | Recovery |
 |-------|-------|-----------|----------|
 | Stale tile | Cache validation | Capture date exceeds sector threshold | Reject/down-confidence tile |
 | Hash mismatch | Cache validation | Sidecar hash mismatch | Reject tile and report security event |
 | Cache too large | Cache load | Storage accounting > budget | Require cache rebuild or approved split budget |
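
The hash, freshness, and resolution checks can be sketched as below. SHA-256, the sidecar keys, and the function name are assumptions for illustration; signature verification is deliberately left out because it depends on the key scheme, which this document does not specify. The sketch reads "0.5 m/px minimum" as "GSD no coarser than 0.5 m/px".

```python
import hashlib
from datetime import date
from typing import Dict, List

MAX_GSD_M_PER_PX = 0.5  # coarsest acceptable resolution per the preconditions


def validate_tile(
    payload: bytes, sidecar: Dict, today: date, max_age_days: int
) -> List[str]:
    """Return the list of validation problems; an empty list means the tile passes."""
    problems = []
    # Hash check against the sidecar (hash algorithm is an assumption).
    if hashlib.sha256(payload).hexdigest() != sidecar["sha256"]:
        problems.append("hash_mismatch")
    # Freshness check against the sector threshold supplied by the caller.
    if (today - sidecar["capture_date"]).days > max_age_days:
        problems.append("stale_tile")
    # Resolution check: larger metres-per-pixel means coarser imagery.
    if sidecar["gsd_m_per_px"] > MAX_GSD_M_PER_PX:
        problems.append("resolution_too_coarse")
    return problems
```

Returning every problem instead of stopping at the first one lets the cache validation report list all reasons a tile was rejected.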
 ### Performance Expectations
 | Metric | Target | Notes |
 |--------|--------|-------|
 | Runtime network calls | 0 | No in-flight Satellite Service or provider calls |
 | Cache load | Within its share of the cold-start budget | Exact threshold set during implementation |
 ---
 ## Flow F2: Normal Frame Processing
 ### Description
 During normal flight, the system processes each navigation frame and FC telemetry sample. The camera component first checks for total occlusion/blackout. Usable frames go to the VIO adapter; total-occlusion frames bypass VIO and send the wrapper into IMU-only degraded propagation.
 ### Preconditions
 - Camera calibration/extrinsics are loaded.
 - VIO adapter and wrapper are initialized.
 - FC telemetry stream is healthy.
 ### Sequence Diagram
 ```mermaid
 sequenceDiagram
    participant CameraIngest
    participant FCTelemetry
    participant BasaltAdapter
    participant SafetyWrapper
    participant MavlinkOutput
    participant FDR
    CameraIngest->>CameraIngest: Total occlusion / blackout check
    CameraIngest->>BasaltAdapter: Usable frame + timestamp + calibration
    CameraIngest-->>SafetyWrapper: Degradation signal if total occlusion
    FCTelemetry->>BasaltAdapter: IMU/attitude/altitude
    BasaltAdapter-->>SafetyWrapper: Relative VIO state + quality
    SafetyWrapper->>SafetyWrapper: Calibrate covariance + source label
    SafetyWrapper-->>MavlinkOutput: GPS_INPUT estimate
    SafetyWrapper-->>FDR: Estimate + inputs + health
 ```
 ### Data Flow
 | Step | From | To | Data | Format |
 |------|------|----|------|--------|
 | 1 | Camera ingest | VIO adapter or safety wrapper | Frame metadata, image, occlusion status | Frame DTO / DegradationSignal |
 | 2 | FC telemetry | VIO adapter | IMU/attitude/altitude | MAVLink-derived telemetry DTO |
 | 3 | VIO adapter | Safety wrapper | Relative VIO state | VioState DTO |
 | 4 | Safety wrapper | MAVLink output | WGS84 estimate | `GPS_INPUT` |
 | 5 | Safety wrapper | FDR | Inputs/outputs/audit | FDR segment event |
 ### Error Scenarios
 | Error | Where | Detection | Recovery |
 |-------|-------|-----------|----------|
 | Total occlusion / blackout | Camera ingest | Occlusion status, exposure/texture/decode checks | Bypass VIO, enter IMU-only `dead_reckoned` propagation |
 | Frame unreadable | Camera ingest | Decode/quality failure | Mark visual signal degraded and bypass VIO for that frame |
 | VIO quality low | VIO adapter | Tracking/completion metrics | Trigger relocalization or dead reckoning |
 | Covariance grows | Safety wrapper | Covariance threshold | Degrade fix type/source label |
 ### Performance Expectations
 | Metric | Target | Notes |
 |--------|--------|-------|
 | End-to-end latency | <400 ms p95 | Frame input to emitted estimate |
 | Dropped frames | <=10% sustained | Under load |
 | Memory | <8 GB shared | Jetson limit |
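
The p95 latency target can be checked from recorded per-frame latencies. The sketch below uses the nearest-rank percentile definition, which is an assumption since the document does not name a percentile method; the function names are also hypothetical.

```python
from typing import Sequence

LATENCY_TARGET_MS = 400.0  # end-to-end p95 target from the table above


def p95_ms(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of the recorded latencies."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def meets_latency_target(latencies_ms: Sequence[float]) -> bool:
    """True when the p95 latency is strictly below the 400 ms target."""
    return p95_ms(latencies_ms) < LATENCY_TARGET_MS
```

Integer arithmetic for the rank avoids floating-point rounding at sample counts where 0.95 x n lands on a whole number.
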
 ---
 ## Flow F3: Satellite Relocalization
 ### Description
 When the state becomes uncertain or disconnected, the system retrieves satellite/cache candidates and accepts an anchor only after local verification and safety gates pass.
 ### Preconditions
 - Offline VPR chunks and FAISS index are loaded.
 - Trigger condition is met: cold start, VO failure, sharp turn, disconnected segment, covariance growth, or stale anchor.
 ### Sequence Diagram
 ```mermaid
 sequenceDiagram
    participant SafetyWrapper
    participant SatelliteService
    participant AnchorVerification
    participant TileManager
    participant FDR
    SafetyWrapper->>SatelliteService: Relocalization request
    SatelliteService->>TileManager: Read candidate chunk metadata
    SatelliteService-->>AnchorVerification: Top-K candidates
    AnchorVerification->>AnchorVerification: ALIKED/DISK+LightGlue + RANSAC
    AnchorVerification-->>SafetyWrapper: Accepted/rejected anchor
    SafetyWrapper->>SafetyWrapper: Mahalanobis + freshness + provenance gates
    SafetyWrapper-->>FDR: Anchor decision audit
 ```
 ### Data Flow
 | Step | From | To | Data | Format |
 |------|------|----|------|--------|
 | 1 | Safety wrapper | Satellite Service | Query frame and prior/covariance | Relocalization DTO |
 | 2 | Satellite Service | Anchor verification | Top-K chunks from local cache/index | Candidate list |
 | 3 | Anchor verification | Safety wrapper | MRE, inliers, homography, provenance | AnchorDecision DTO |
 ### Error Scenarios
 | Error | Where | Detection | Recovery |
 |-------|-------|-----------|----------|
 | No good candidate | Retrieval/verification | Low score or failed RANSAC | Continue degraded and request GCS hint after threshold |
 | Stale candidate | Tile Manager | Capture date gate | Reject/down-confidence |
 | Implausible anchor | Safety wrapper | Mahalanobis/impossible velocity gate | Reject and log |
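
The Mahalanobis gate on an anchor can be sketched for a 2D position innovation as below. The 95% chi-square threshold with 2 degrees of freedom is an assumed gate value, since this document does not state the threshold, and the function names are hypothetical.

```python
# 95% quantile of the chi-square distribution with 2 degrees of freedom (assumed gate).
CHI2_95_2DOF = 5.991


def mahalanobis_sq(
    dx_m: float, dy_m: float, var_x: float, var_y: float, cov_xy: float
) -> float:
    """Squared Mahalanobis distance of (dx, dy) under [[var_x, cov_xy], [cov_xy, var_y]]."""
    det = var_x * var_y - cov_xy * cov_xy
    if det <= 0:
        raise ValueError("innovation covariance must be positive definite")
    # r^T S^-1 r with the closed-form inverse of a 2x2 symmetric matrix.
    return (var_y * dx_m * dx_m - 2.0 * cov_xy * dx_m * dy_m + var_x * dy_m * dy_m) / det


def anchor_passes_gate(
    dx_m: float, dy_m: float, var_x: float, var_y: float, cov_xy: float,
    threshold: float = CHI2_95_2DOF,
) -> bool:
    """True when the anchor's offset from the prior is statistically plausible."""
    return mahalanobis_sq(dx_m, dy_m, var_x, var_y, cov_xy) <= threshold
```

Here `dx_m` and `dy_m` are the offsets between the anchor position and the prior estimate, and the covariance is the combined uncertainty of the two. Freshness, provenance, and impossible-velocity checks would run as separate gates.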
 ### Performance Expectations
 | Metric | Target | Notes |
 |--------|--------|-------|
 | Invocation frequency | Trigger-based only | Not per-frame |
 | Cross-domain MRE | <2.5 px for accepted anchors | AC-2.2 |
 ---
 ## Flow F4: Visual Blackout / Spoofing Degraded Mode
 ### Description
 When visual localization is unavailable due to total occlusion/blackout and GPS is denied/spoofed, the wrapper switches to honest IMU-only propagation from the last trusted state and degrades MAVLink output based on covariance/time thresholds.
 ### Preconditions
 - Last trusted state exists.
 - FC telemetry continues.
 ### Sequence Diagram
 ```mermaid
 sequenceDiagram
    participant CameraIngest
    participant FCTelemetry
    participant SafetyWrapper
    participant MavlinkOutput
    participant QGC
    participant FDR
    CameraIngest-->>SafetyWrapper: Total occlusion / visual blackout signal
    FCTelemetry-->>SafetyWrapper: GPS health/spoofing signal
    SafetyWrapper->>SafetyWrapper: IMU-only propagation + monotonic covariance growth
    SafetyWrapper->>SafetyWrapper: Switch source_label to dead_reckoned
    SafetyWrapper-->>MavlinkOutput: Degraded GPS_INPUT
    SafetyWrapper-->>QGC: VISUAL_BLACKOUT_IMU_ONLY / FAILSAFE
    SafetyWrapper-->>FDR: Blackout and spoofing audit events
 ```
 ### Error Scenarios
 | Error | Where | Detection | Recovery |
 |-------|-------|-----------|----------|
 | Blackout >30 s | Safety wrapper | Timer threshold | Emit no-fix/failsafe |
 | Covariance >500 m | Safety wrapper | Covariance threshold | `fix_type=0`, `horiz_accuracy=999.0` |
 | Spoofed GPS recovers | Safety wrapper | FC health + visual consistency gate | Re-enable only after required stable interval and visual/satellite consistency |
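
The "Spoofed GPS recovers" row describes a hold-off: GPS is trusted again only after both signals stay good for a stable interval. A minimal sketch follows; the class name, method signature, and the interval value are assumptions, since this document does not specify the required interval.

```python
from typing import Optional


class GpsRecoveryGate:
    """Re-enable GPS only after a continuous stable interval of good signals."""

    def __init__(self, stable_interval_s: float) -> None:
        self._stable_interval_s = stable_interval_s
        self._stable_since: Optional[float] = None

    def update(self, now_s: float, fc_gps_healthy: bool, visually_consistent: bool) -> bool:
        """Feed one sample; return True once GPS may be re-enabled."""
        if not (fc_gps_healthy and visually_consistent):
            # Any bad sample restarts the stable interval.
            self._stable_since = None
            return False
        if self._stable_since is None:
            self._stable_since = now_s
        return now_s - self._stable_since >= self._stable_interval_s
```

Resetting the timer on any bad sample means a briefly recovering spoofed signal cannot accumulate trust across interruptions.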
 ### Performance Expectations
 | Metric | Target | Notes |
 |--------|--------|-------|
 | Mode transition | <=1 processed frame or <=400 ms | AC-3.5 |
 | QGC status | 1-2 Hz | Downsampled operator awareness |
 ---
 ## Flow F5: Generated Tile Lifecycle
 ### Description
 When pose confidence is strong enough, the system orthorectifies navigation imagery into write-new generated tiles and records quality/provenance sidecars.
 ### Preconditions
 - Parent pose covariance passes tile-write gate.
 - Frame quality supports orthorectification.
 ### Data Flow
 | Step | From | To | Data | Format |
 |------|------|----|------|--------|
 | 1 | Safety wrapper | Tile Manager | Pose/covariance + frame metadata | TileGenerationRequest |
 | 2 | Tile Manager | Local storage | Orthorectified generated COG + sidecar | COG + signed JSON |
 | 3 | Tile Manager | FDR | Tile write event | FDR event |
 ### Error Scenarios
 | Error | Where | Detection | Recovery |
 |-------|-------|-----------|----------|
 | Parent covariance too high | Safety wrapper | Sigma gate | Do not write tile |
 | Duplicate sector | Tile Manager | Spatial deduplication | Keep latest/highest-quality tile |
 | Sidecar write failure | Tile Manager | I/O error | Log and do not mark tile eligible |
 ---
 ## Flow F6: Post-Flight Sync And Audit
 ### Description
 After landing, generated tiles and FDR evidence are exported through Satellite Service sync for ingest and incident analysis.
 ### Data Flow
 | Step | From | To | Data | Format |
 |------|------|----|------|--------|
 | 1 | Tile Manager | Satellite Service | Generated tile package | COG + sidecar + manifest delta |
 | 2 | FDR | Operator/audit tools | Mission replay evidence | Segmented logs + optional Parquet export |
 ### Error Scenarios
 | Error | Where | Detection | Recovery |
 |-------|-------|-----------|----------|
 | Upload unavailable | Post-flight sync | Network/service failure | Retain package for retry |
 | Candidate rejected by Service voting | Satellite Service | Ingest rules | Keep as candidate/soft trust, not trusted basemap |
 ---
 ## Flow F7: Validation Replay
 ### Description
 The separate e2e test suite runs deterministic still-image, public dataset, SITL, Jetson, and representative replay scenarios against public interfaces.
 ### Preconditions
 - Test data and expected results are pinned.
 - Execution mode is selected: Docker/replay, local Jetson hardware, or both.
 ### Data Flow
 | Step | From | To | Data | Format |
 |------|------|----|------|--------|
 | 1 | E2E test suite | Runtime | Images/telemetry/cache fixtures | File/stream/MAVLink |
 | 2 | Runtime | E2E test suite | GPS_INPUT/FDR/status | MAVLink/log files |
 | 3 | E2E test suite | Reports | Pass/fail metrics | CSV/Markdown |
 ### Performance Expectations
 | Metric | Target | Notes |
 |--------|--------|-------|
 | PR smoke | <=15 min | Still-image/cache/SITL subset |
 | Release gate | Hardware-dependent | Jetson and representative replay required |
@@ -0,0 +1,173 @@
 # Blackbox Tests
 ## Positive Scenarios
 ### FT-P-01: Still-Image Frame Center Geolocation
 **Summary**: Validate that the system estimates WGS84 frame centers for the provided 60-image nadir dataset.
 **Traces to**: AC-1.1, AC-1.2, AC-6.3, AC-8.1
 **Category**: Position Accuracy
 **Preconditions**:
 - Offline satellite cache fixture is available for the sample area.
 - Expected results are loaded from `input_data/expected_results/results_report.md`.
 **Input data**: `project_60_still_images`, `expected_frame_centers`
 | Step | Consumer Action | Expected System Response |
 |------|-----------------|--------------------------|
 | 1 | Submit `AD000001.jpg` through `AD000060.jpg` with height/camera metadata | System emits one WGS84 estimate per processed image |
 | 2 | Compare each estimate to the mapped expected coordinate | Per-frame error is reported in meters |
 **Expected outcome**: At least 80% of images are within 50 m and at least 50% are within 20 m.
 **Max execution time**: 15 minutes for the 60-image replay on the local replay environment.
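The expected outcome above could be evaluated roughly as follows. This is a sketch under stated assumptions: the spherical haversine distance stands in for whatever geodesic the project uses, images with no estimate count as misses, and `haversine_m` and `accuracy_summary` are hypothetical helper names.

```python
import math
from typing import Dict, Tuple

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation (assumption)
LatLon = Tuple[float, float]  # (latitude, longitude) in degrees, WGS84


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def accuracy_summary(estimates: Dict[str, LatLon], expected: Dict[str, LatLon]) -> dict:
    """Per-image error plus the 50 m and 20 m pass rates over all expected images."""
    if not expected:
        raise ValueError("expected results must not be empty")
    errors = {name: haversine_m(estimates[name], truth)
              for name, truth in expected.items() if name in estimates}
    total = len(expected)  # images without an estimate count against the rates
    within_50 = sum(e <= 50.0 for e in errors.values()) / total
    within_20 = sum(e <= 20.0 for e in errors.values()) / total
    return {
        "errors_m": errors,
        "within_50_m": within_50,
        "within_20_m": within_20,
        "passed": within_50 >= 0.80 and within_20 >= 0.50,
    }
```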
 ---
 ### FT-P-02: Position Confidence Output Contract
 **Summary**: Validate that every emitted position estimate includes confidence and source-label fields required by the public contract.
 **Traces to**: AC-1.3, AC-1.4, AC-4.4, AC-4.5
 **Category**: Position Confidence
 **Preconditions**:
 - Same fixture setup as FT-P-01.
 **Input data**: `project_60_still_images`, `expected_frame_centers`
 | Step | Consumer Action | Expected System Response |
 |------|-----------------|--------------------------|
 | 1 | Submit the 60-image replay | System emits estimates frame-by-frame, not batched |
 | 2 | Inspect public output fields | Each estimate contains WGS84 coordinate, 95% covariance semi-major axis, source label, and `last_satellite_anchor_age_ms` |
 | 3 | Submit a later correction for a prior frame if available | System emits updated estimate with timestamp and covariance without corrupting newer estimates |
 **Expected outcome**: 100% of emitted estimates include required confidence fields; no emitted `horiz_accuracy`-equivalent value under-reports the 95% covariance semi-major axis.
 **Max execution time**: 15 minutes.
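One way a harness could check the under-reporting condition is sketched below. It assumes the estimate exposes a 2x2 horizontal position covariance in m² and that the 95% ellipse uses the two-degree-of-freedom chi-square quantile; how the runtime actually derives its semi-major axis is not stated here, and `semi_major_95_m` and `under_reports` are hypothetical names.

```python
import math

# 95% quantile of a chi-square distribution with 2 degrees of freedom.
# For 2 DoF the quantile has the closed form -2 * ln(1 - p).
CHI2_95_2DOF = -2.0 * math.log(0.05)  # about 5.991


def semi_major_95_m(var_e: float, var_n: float, cov_en: float) -> float:
    """95% error-ellipse semi-major axis for covariance [[var_e, cov_en], [cov_en, var_n]]."""
    mean = (var_e + var_n) / 2.0
    spread = math.hypot((var_e - var_n) / 2.0, cov_en)
    largest_eigenvalue = mean + spread  # larger eigenvalue of a symmetric 2x2 matrix
    return math.sqrt(CHI2_95_2DOF * largest_eigenvalue)


def under_reports(horiz_accuracy_m: float, var_e: float, var_n: float, cov_en: float) -> bool:
    """True when the reported accuracy is smaller than the 95% semi-major axis."""
    return horiz_accuracy_m < semi_major_95_m(var_e, var_n, cov_en)
```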
 ---
 ### FT-P-03: BASALT VIO Replay With Synchronized Video/Telemetry
 **Summary**: Validate that BASALT + safety/anchor wrapper can process synchronized nadir video, IMU, and trajectory telemetry and produce frame-by-frame estimates with honest confidence.
 **Traces to**: AC-1.3, AC-2.1a, AC-2.2, AC-4.1, AC-4.2
 **Category**: VO / IMU Propagation
 **Preconditions**:
 - Derkachi replay fixture is mounted from `input_data/flight_derkachi/`.
 - `flight_derkachi.mp4` is readable as cropped nadir video: 880 x 720, 30 fps, approximately 490.07 s.
 - `data_imu.csv` contains monotonic 10 Hz `Time`, `timestamp(ms)`, `SCALED_IMU2.*`, and `GLOBAL_POSITION_INT.*` fields for 4,900 rows.
 - Camera intrinsics, lens distortion, and camera-to-body transform are either pinned or the run is marked as calibration-limited.
 - Public synchronized dataset slice remains useful for calibrated final comparison. Strongest candidates: MUN-FRL, ALTO, EPFL fixed-wing, Kagaru; EuRoC/UZH FPV are proxy-only.
 **Input data**: `derkachi_video_telemetry`, `public_nadir_vio_candidates`
 | Step | Consumer Action | Expected System Response |
 |------|-----------------|--------------------------|
 | 1 | Validate Derkachi video/telemetry alignment | Harness accepts the fixture only if MP4 duration and CSV duration differ by <=250 ms and there are exactly 3 video frames per telemetry row |
 | 2 | Replay synchronized video frames and IMU stream | System emits frame-by-frame `vo_extrapolated` or `satellite_anchored` estimates without batching |
 | 3 | Compare output trajectory to `GLOBAL_POSITION_INT` lat/lon/alt/heading | Error, covariance, source label, and anchor age are reported per segment |
 | 4 | Compare calibrated public/representative replay against ground truth when available | BASALT + wrapper does not materially under-report uncertainty relative to error |
 | 5 | Compare against OpenVINS reference replay when available | BASALT + wrapper does not materially under-report uncertainty relative to error |
 **Expected outcome**: Derkachi replay is accepted as a synchronized representative fixture and produces continuous estimates for >95% of normal overlapping frames. Absolute geolocation and covariance pass/fail thresholds are calibration-gated until camera intrinsics, distortion, and camera-to-body transform are pinned. For calibrated datasets, VO homography MRE is <1.0 px where homography validation is applicable.
 **Max execution time**: Dataset-dependent, but replay must report per-frame latency.
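The step 1 alignment gate could look roughly like this. It is a sketch, not the harness: it assumes the CSV duration is the mean sample period times the row count, and it interprets "exactly 3 video frames per telemetry row" as the ratio of the nominal video rate to the telemetry rate. `AlignmentReport` and `check_alignment` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AlignmentReport:
    monotonic: bool
    skew_ms: float
    frames_per_row: float
    ok: bool


def check_alignment(video_duration_s: float, video_fps: float,
                    timestamps_ms: Sequence[float],
                    max_skew_ms: float = 250.0,
                    expected_frames_per_row: float = 3.0) -> AlignmentReport:
    """Compare MP4 duration and frame rate against the telemetry timestamp column."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two telemetry rows")
    monotonic = all(b > a for a, b in zip(timestamps_ms, timestamps_ms[1:]))
    # Mean sample period; the last row is assumed to cover one more full period.
    period_ms = (timestamps_ms[-1] - timestamps_ms[0]) / (len(timestamps_ms) - 1)
    csv_duration_ms = period_ms * len(timestamps_ms)
    skew_ms = abs(video_duration_s * 1000.0 - csv_duration_ms)
    frames_per_row = video_fps * period_ms / 1000.0
    ok = (monotonic
          and skew_ms <= max_skew_ms
          and abs(frames_per_row - expected_frames_per_row) < 1e-6)
    return AlignmentReport(monotonic, skew_ms, frames_per_row, ok)
```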
 ---
 ### FT-P-04: Satellite Service And Anchor Verification
 **Summary**: Validate that relocalization uses global retrieval plus local verification and emits only verified satellite anchors.
 **Traces to**: AC-2.1b, AC-2.2, AC-3.2, AC-3.3, AC-8.6
 **Category**: Satellite Anchor
 **Preconditions**:
 - AerialVL/ALTO/VPAir-style public dataset slice or project satellite-cache fixture is available.
 - VPR chunks and descriptors are precomputed.
 **Input data**: Public aerial localization slice, cache fixture
 | Step | Consumer Action | Expected System Response |
 |------|-----------------|--------------------------|
 | 1 | Trigger cold-start or relocalization query | System searches CPU FAISS top-K chunks |
 | 2 | Present top-K candidates to local verification | System runs ALIKED/DISK+LightGlue and RANSAC |
 | 3 | Inspect emitted anchor decision | Accepted anchors include source label, MRE, inlier count, covariance, and tile provenance |
 **Expected outcome**: Cross-domain satellite-anchor MRE is <2.5 px for accepted anchors; rejected candidates do not produce `satellite_anchored` estimates.
 **Max execution time**: Must be measured as part of performance tests.
 ## Negative Scenarios
 ### FT-N-01: Repetitive Or Low-Texture Imagery
 **Summary**: Validate that visually ambiguous images do not produce confident false satellite anchors.
 **Traces to**: AC-1.4, AC-3.1, AC-NEW-4, AC-8.6
 **Category**: False Position Prevention
 **Input data**: Repetitive agricultural or low-texture frames from project/public data.
 | Step | Consumer Action | Expected System Response |
 |------|-----------------|--------------------------|
 | 1 | Submit ambiguous frame or sequence | System either emits degraded `vo_extrapolated`/`dead_reckoned` output or rejects low-confidence anchor |
 | 2 | Inspect anchor and confidence outputs | No anchor is accepted unless local verification and covariance gates pass |
 **Expected outcome**: 0 confident `satellite_anchored` outputs for candidates that fail local verification, freshness, or Mahalanobis gates.
 **Max execution time**: 15 minutes per fixture.
 ---
 ### FT-N-02: GPS Spoofing During Total Visual Blackout
 **Summary**: Validate that spoofed GPS is not promoted during total camera occlusion/visual blackout and that output degrades honestly before unusable frames reach VIO.
 **Traces to**: AC-3.5, AC-5.2, AC-NEW-2, AC-NEW-8
 **Category**: Spoofing / Blackout
 **Input data**: ArduPilot Plane SITL spoofing trace with camera blackout/total-occlusion frames.
 | Step | Consumer Action | Expected System Response |
 |------|-----------------|--------------------------|
 | 1 | Start normal replay with trusted visual/satellite anchor | System emits normal estimates |
 | 2 | Inject full visual blackout/total occlusion and spoofed `GPS_RAW_INT` | Camera gate sets `usable_for_vio=false`, BASALT is bypassed for occluded frames, and system switches to `dead_reckoned` within 1 processed frame or 400 ms |
 | 3 | Continue blackout beyond thresholds | IMU-only covariance grows monotonically; system degrades fix type and emits failsafe status at specified covariance/time thresholds |
 **Expected outcome**: Spoofed GPS is ignored; total occlusion never feeds BASALT as a usable VIO frame; `fix_type=0`, `horiz_accuracy=999.0`, and `VISUAL_BLACKOUT_FAILSAFE` are emitted when covariance >500 m or blackout >30 s.
 **Max execution time**: 10 minutes per SITL scenario.
 ---
 ### FT-N-03: Invalid Or Stale Satellite Cache
 **Summary**: Validate cache freshness, integrity, and provenance gates.
 **Traces to**: AC-8.2, AC-8.3, AC-NEW-6, AC-NEW-7
 **Category**: Cache Integrity
 **Input data**: `cache_integrity_fixtures`
 | Step | Consumer Action | Expected System Response |
 |------|-----------------|--------------------------|
 | 1 | Replay with stale tile manifest | Tile is rejected or down-weighted in confidence; no stale tile yields a `satellite_anchored` estimate |
 | 2 | Replay with hash-mismatched or unsigned manifest | Cache fixture is rejected and security event is logged |
 | 3 | Replay generated tile with weak parent-pose covariance | Tile is not promoted beyond allowed trust level |
 **Expected outcome**: 0 invalid/stale/cache-poisoning fixtures produce trusted anchors or trusted basemap tiles.
 **Max execution time**: 15 minutes.
@@ -0,0 +1,81 @@
 # E2E Test Suite
 ## Scope
 The e2e test suite is separate test tooling, not part of the onboard runtime. It drives black-box replay, public dataset, SITL, Jetson, and representative validation through public runtime interfaces only.
 ## Purpose
 - Feed navigation frames, telemetry traces, cache manifests, and fault triggers into the system under test.
 - Validate emitted coordinates, confidence fields, MAVLink `GPS_INPUT`, QGC status, FDR, and generated-tile evidence.
 - Produce release evidence without importing runtime internals.
 ## Ownership
 - **Epic**: AZ-217 (E2E Test Suite / test-support work, not product runtime)
 - **Owns**:
  - `tests/blackbox/**`
  - `tests/e2e/**`
  - `e2e/replay/**`
  - `e2e/reports/**`
 - **Does not own**:
  - `src/**`
  - runtime component internals
  - production deployment code
 ## Public Interfaces Under Test
 | Interface | Protocol / Contract |
 |-----------|---------------------|
 | Navigation frames | Ordered image/video replay with timestamps |
 | FC telemetry | MAVLink replay or generated stream |
 | Satellite cache | Local COG + manifest + descriptor fixtures |
 | GPS output | MAVLink `GPS_INPUT` |
 | Operator status | QGC-visible MAVLink status |
 | FDR | Filesystem/database-backed evidence outputs |
 ## Runner Contract
 | Method | Input | Output | Error Types |
 |--------|-------|--------|-------------|
 | `run_scenario` | `ScenarioRequest` | `ScenarioReport` | `FixtureInvalid`, `RuntimeFailed`, `ThresholdFailed` |
 | `validate_fixture` | `FixtureRequest` | `FixtureValidationReport` | `FixtureInvalid` |
 ```yaml
 ScenarioRequest:
  scenario_id: string
  execution_environment: enum(replay, sitl, jetson, representative)
  fixture_paths: list[string]
 ScenarioReport:
  scenario_id: string
  result: enum(pass, fail, blocked)
  metrics: object
  artifacts: list[path]
  failure_reason: string optional
 ```
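For a Python harness, the contract above might map onto dataclasses and enums along these lines. This is only an illustrative sketch: the field names and enum values mirror the YAML, while the class layout, the `ScenarioResult` name, and the exception hierarchy are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List, Optional


class ExecutionEnvironment(str, Enum):
    REPLAY = "replay"
    SITL = "sitl"
    JETSON = "jetson"
    REPRESENTATIVE = "representative"


class ScenarioResult(str, Enum):
    PASS = "pass"
    FAIL = "fail"
    BLOCKED = "blocked"


# Error types named in the Runner Contract table; the base class is an assumption.
class RunnerError(Exception): ...
class FixtureInvalid(RunnerError): ...
class RuntimeFailed(RunnerError): ...
class ThresholdFailed(RunnerError): ...


@dataclass(frozen=True)
class ScenarioRequest:
    scenario_id: str
    execution_environment: ExecutionEnvironment
    fixture_paths: List[str]


@dataclass(frozen=True)
class ScenarioReport:
    scenario_id: str
    result: ScenarioResult
    metrics: Dict[str, Any] = field(default_factory=dict)
    artifacts: List[str] = field(default_factory=list)
    failure_reason: Optional[str] = None
```

Constructing `ExecutionEnvironment("unknown")` raises `ValueError`, so an unsupported environment is rejected when the request is built rather than midway through a run.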
 ## Scenario Coverage
 | Scenario | Purpose | Evidence |
 |----------|---------|----------|
 | Still-image accuracy runner | Verify project still-image replay reports frame-center accuracy | Per-image error, aggregate pass rates, covariance, source label, anchor age |
 | Synchronized VIO replay runner | Verify Derkachi and public/representative synchronized data drive BASALT/wrapper tests | Fixture alignment, trajectory comparison, VIO registration, latency, covariance calibration |
 | Satellite anchor replay runner | Verify VPR and anchor verification scenarios are executable | Retrieval recall, MRE, accepted/rejected anchors, freshness behavior |
 | Outlier/sharp-turn/disconnected runner | Verify relocalization resilience scenarios are executable | Degraded-mode timelines and relocalization outcomes |
 | Blackout and spoofing runner | Verify total blackout plus spoofing through SITL/replay | Mode-switch timing, covariance growth, failsafe thresholds |
 | MAVLink/QGC contract runner | Verify MAVLink output and GCS status assertions | `GPS_INPUT`, WGS84 coordinates, status rate, command ingress |
 | Startup/reboot runner | Verify cold-start and companion reboot scenarios | First valid `GPS_INPUT` p95 and FC-state reinitialization |
 | Object coordinate contract runner | Verify AI-camera object coordinate request at system boundary | Frame-center-consistent coordinate accuracy and projection bound |
 | Tile Manager runner | Verify cache, generated tiles, and storage tests | Cache load, tile write gates, no raw-frame retention, stale rejection, poisoning evidence |
 ## Release Evidence
 The suite assembles CSV, Markdown, MAVLink tlogs, FDR summaries, cache validation reports, and pass/fail metadata into release evidence bundles. Missing public or representative data is reported as `blocked`, not `passed`.
 ## Non-Responsibilities
 - No onboard flight logic.
 - No direct estimator, BASALT, wrapper, or tile-manager imports.
 - No mutation of runtime internal state.
 - No production service APIs.
@@ -0,0 +1,125 @@
 # Test Environment
 ## Overview
 **System under test**: Onboard GPS-denied localization service. Public interfaces are navigation-camera frame input, flight-controller telemetry input, offline satellite-cache input, `GPS_INPUT` MAVLink output, QGroundControl status output, and flight-data-recorder output.
 **Consumer app purpose**: A black-box replay harness that feeds image frames, telemetry traces, cache manifests, and fault triggers into the service, then validates emitted coordinates, confidence fields, telemetry, and logs without importing internal modules.
 ## Execution Environments
 | Environment | Purpose | Required for |
 |-------------|---------|--------------|
 | Local replay workstation | Fast still-image and dataset replay validation | Frame-center geolocation, Satellite Service local retrieval, stale-tile rejection |
 | Jetson Orin Nano Super | Production-like latency, memory, thermal, and TensorRT/ONNX profiling | AC-4.1, AC-4.2, AC-NEW-1, AC-NEW-5 |
 | ArduPilot Plane SITL + QGroundControl | MAVLink `GPS_INPUT`, spoofing, failsafe, and GCS status validation | AC-4.3, AC-5.2, AC-NEW-2, AC-NEW-8 |
 | Representative flight/replay rig | Final acceptance evidence with synchronized nav camera, FC IMU/attitude/airspeed/altitude, MAVLink logs, and ground truth | Final AC signoff |
 ## Docker / Compose Structure
 | Service | Image / Build | Purpose | Ports |
 |---------|---------------|---------|-------|
 | gps-denied-service | Project build image for JetPack-compatible target or replay-compatible host | System under test | MAVLink UDP/TCP and health/status endpoints TBD |
 | replay-consumer | Python replay/test harness | Feeds images, telemetry, cache data, and fault triggers | none |
 | satellite-cache-stub | Local COG/manifest/descriptor fixture volume | Provides offline tile cache and signed/unsigned manifests | none |
 | ardupilot-plane-sitl | ArduPilot Plane SITL image or local process | Validates `GPS_INPUT`, spoofing/failsafe behavior | MAVLink SITL ports |
 | qgc-observer | QGC/tlog-compatible observer or MAVLink log parser | Verifies GCS-visible status output | none |
 ## Networks
 | Network | Services | Purpose |
 |---------|----------|---------|
 | replay-net | gps-denied-service, replay-consumer, satellite-cache-stub | Offline replay and black-box validation |
 | sitl-net | gps-denied-service, ardupilot-plane-sitl, qgc-observer | MAVLink integration and failsafe validation |
 ## Volumes
 | Volume | Mounted to | Purpose |
 |--------|------------|---------|
 | input-data | `/data/input` | `_docs/00_problem/input_data/` and public dataset slices |
 | expected-results | `/data/expected` | `_docs/00_problem/input_data/expected_results/` |
 | derkachi-replay | `/data/input/flight_derkachi` | Cropped nadir MP4 plus synchronized IMU and `GLOBAL_POSITION_INT` trajectory |
 | satellite-cache | `/cache/satellite` | COG tiles, manifests, descriptor index fixtures |
 | fdr-output | `/fdr` | Flight-data-recorder outputs for validation |
 ## Consumer Application
 **Tech stack**: Python replay harness with pytest-style assertions and MAVLink log parsing.
 **Entry point**: `run-blackbox-replay` command to be created during implementation; this planning artifact defines required behavior, not code.
 ### Communication With System Under Test
 | Interface | Protocol | Endpoint / Topic | Authentication |
 |-----------|----------|------------------|----------------|
 | Navigation frames | File/stream replay | Ordered image frames with timestamps | Local fixture access |
 | FC telemetry | MAVLink replay or generated stream | IMU, attitude, airspeed, altitude, GPS health | Local MAVLink link |
 | Satellite cache | Local filesystem contract | COG + manifest + descriptors | Signed manifest validation |
 | GPS output | MAVLink | `GPS_INPUT` to ArduPilot Plane | MAVLink source/system ID allowlist |
 | Status output | MAVLink/QGC | `STATUSTEXT` / status summary | MAVLink source/system ID allowlist |
 | FDR | Filesystem output | Per-flight segmented logs | Local fixture access |
 ### What The Consumer Does Not Access
 - No internal estimator modules.
 - No direct BASALT/OpenVINS/Kimera APIs.
 - No direct mutation of internal state.
 - No bypass of public cache, MAVLink, replay, or FDR interfaces.
 ## CI/CD Integration
 | Suite | When to run | Gate behavior | Timeout |
 |-------|-------------|---------------|---------|
 | Still-image geolocation smoke | Every PR after implementation exists | Block merge | <= 15 min |
 | Public VIO dataset replay | Nightly and before release | Block release | Dataset-dependent |
 | Jetson performance/resource | Before release and after runtime dependency changes | Block release | <= 8 h for endurance/thermal |
 | Plane SITL failsafe/spoofing | Every release candidate | Block release | <= 60 min |
 ## Reporting
 **Format**: CSV report plus Markdown FDR validation summary.
 **Columns**: Test ID, Test Name, Input Dataset, Execution Time (ms), Result, Error Distance (m), Source Label, Covariance 95% Semi-Major (m), `GPS_INPUT.fix_type`, Error Message.
 **Output path**: `./test-results/blackbox-report.csv` and `./test-results/fdr-validation-summary.md`.
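A report writer for the listed columns could be sketched with the standard `csv` module as below. The column names are copied from the list above; `REPORT_COLUMNS` and `render_report` are hypothetical names, and leaving missing fields empty is an assumption about the desired behavior.

```python
import csv
import io
from typing import Dict, Iterable

REPORT_COLUMNS = [
    "Test ID", "Test Name", "Input Dataset", "Execution Time (ms)", "Result",
    "Error Distance (m)", "Source Label", "Covariance 95% Semi-Major (m)",
    "GPS_INPUT.fix_type", "Error Message",
]


def render_report(rows: Iterable[Dict[str, object]]) -> str:
    """Render rows as CSV text; unknown keys raise, missing keys become empty cells."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=REPORT_COLUMNS,
                            restval="", extrasaction="raise")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Using `extrasaction="raise"` makes a misspelled column name fail loudly instead of being dropped from the evidence bundle.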
 ## Test Execution
 **Decision**: Both Docker/replay and local hardware execution.
 **Hardware dependencies found**:
 - Jetson Orin Nano Super with 8 GB shared LPDDR5 and 25 W power mode.
 - CUDA/TensorRT/ONNX acceleration for DINOv2 and local-matcher profiling.
 - Camera ingestion paths over USB, MIPI-CSI, or GigE.
 - ArduPilot Plane SITL and MAVLink `GPS_INPUT` behavior.
 - Thermal, power, FDR, and storage limits that require target-like execution.
 ### Docker / Replay Mode
 Use Docker or local host replay for deterministic, reproducible tests that do not require physical Jetson hardware:
 - Still-image frame-center geolocation.
 - Derkachi synchronized video/telemetry replay, including alignment and VIO smoke checks.
 - Satellite-cache freshness and integrity fixtures.
 - FAISS descriptor/index behavior.
 - Public dataset replay where GPU/hardware timing is not the assertion.
 - Plane SITL tests where SITL and MAVLink behavior are the target.
 Docker/replay mode is suitable for PR checks and nightly validation, but it does not prove Jetson latency, memory, thermal, or camera-driver behavior.
 ### Local Hardware Mode
 Use local Jetson hardware for release gates:
 - BASALT + wrapper latency and memory profiling.
 - DINOv2/ONNX/TensorRT descriptor-fidelity and runtime profiling.
 - ALIKED/DISK + LightGlue runtime profiling.
 - Cold-start time to first valid `GPS_INPUT`.
 - 8-hour thermal and FDR endurance tests.
 - Camera interface validation once the exact module interface is selected.
 ### Gate Policy
 - PR gate: Docker/replay smoke and deterministic fixture tests.
 - Nightly gate: Docker/replay public dataset slices and SITL scenarios.
 - Release gate: local Jetson hardware, Plane SITL, thermal/resource tests, and representative replay data.
@@ -0,0 +1,96 @@
 # Performance Tests
 ### NFT-PERF-01: Per-Frame Latency On Project Still Images
 **Summary**: Validate end-to-end latency for processing project nadir frames through geolocation output.
 **Traces to**: AC-4.1, AC-4.4
 **Metric**: Capture-to-output latency p50/p95/p99 and dropped-frame rate.
 **Preconditions**:
 - Jetson Orin Nano Super or equivalent production target is running in the intended power mode.
 - `project_60_still_images` fixture is available.
 | Step | Consumer Action | Measurement |
 |------|-----------------|-------------|
 | 1 | Replay images at target 3 fps or faster stress rate | Measure latency from input timestamp to emitted estimate |
 | 2 | Record all frame drops | Measure dropped-frame percentage |
 **Pass criteria**: p95 latency <400 ms; dropped frames <=10% under sustained load; no batching delay.
 **Duration**: Minimum 20 minutes or full fixture loop repeated enough times to reach stable measurements.
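The pass criteria above could be computed from recorded per-frame latencies roughly as follows. This sketch uses the nearest-rank percentile definition, which is an assumption (the document does not say which estimator applies), and treats every frame without an emitted estimate as dropped. `percentile` and `latency_gate` are hypothetical names.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_gate(latencies_ms: Sequence[float], frames_sent: int,
                 p95_limit_ms: float = 400.0, max_drop_rate: float = 0.10) -> dict:
    """Summarize p50/p95/p99 latency and dropped-frame rate against the stated limits."""
    drop_rate = (frames_sent - len(latencies_ms)) / frames_sent
    p95 = percentile(latencies_ms, 95)
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": p95,
        "p99_ms": percentile(latencies_ms, 99),
        "drop_rate": drop_rate,
        "passed": p95 < p95_limit_ms and drop_rate <= max_drop_rate,
    }
```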
 ---
 ### NFT-PERF-02: BASALT + Wrapper Replay Latency
 **Summary**: Validate relative VIO hot-path latency using synchronized Derkachi video/telemetry and public or representative camera/IMU data.
 **Traces to**: AC-2.1a, AC-4.1, AC-4.2
 **Metric**: Per-frame VIO latency, completion rate, and memory usage.
 **Preconditions**:
 - Derkachi `flight_derkachi.mp4` and `data_imu.csv` are mounted and pass fixture validation.
 - MUN-FRL/ALTO/EPFL/Kagaru or another representative synchronized dataset slice is pinned for calibrated final comparison.
 - OpenVINS reference replay is available for comparison when the dataset supports it.
 | Step | Consumer Action | Measurement |
 |------|-----------------|-------------|
 | 1 | Replay Derkachi video at target 3 fps and stress rates from the 30 fps source | Measure per-frame processing time, dropped frames, and telemetry alignment |
 | 2 | Replay synchronized camera/IMU stream through BASALT + wrapper | Measure VIO processing time and completion rate |
 | 3 | Compare emitted trajectory against Derkachi `GLOBAL_POSITION_INT` and calibrated dataset ground truth where available | Measure completion rate and error distribution |
 | 4 | Monitor memory | Track CPU/GPU shared memory peak |
 **Pass criteria**: Normal-frame VO registration >95% on calibration-supported segments; p95 processing latency <400 ms for the hot path; memory <8 GB shared; Derkachi replay maintains stable 3-video-frames-per-telemetry-row alignment with <=10% dropped frames under sustained target-rate replay.
 **Duration**: Dataset-dependent; at least one normal segment and one challenging segment.
 ---
 ### NFT-PERF-03: Relocalization Trigger Path Latency
 **Summary**: Validate the heavy DINOv2-VLAD + FAISS + ALIKED/LightGlue path under bounded top-K settings.
 **Traces to**: AC-3.2, AC-3.3, AC-4.1, AC-8.6
 **Metric**: Trigger-to-anchor latency, top-K query time, local verification time, accepted/rejected anchor counts.
 **Preconditions**:
 - Precomputed descriptor index is loaded.
 - Dynamic K settings are configured: K=5 stable, K=20 active-conflict, K=50 fallback.
 | Step | Consumer Action | Measurement |
 |------|-----------------|-------------|
 | 1 | Trigger relocalization from cold start or sharp turn | Measure DINOv2 descriptor time and FAISS query time |
 | 2 | Verify top-K candidates | Measure ALIKED/LightGlue + RANSAC latency |
 | 3 | Emit accepted/rejected decision | Measure total trigger-to-decision latency |
 **Pass criteria**: Heavy path is conditional and never blocks steady-state frame output; every accepted anchor carries MRE <2.5 px and valid covariance.
 **Duration**: 100 relocalization trials across stable and active-conflict sector fixtures.
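The dynamic K precondition can be pictured as a small lookup. In this sketch only the three K values come from the preconditions; when the fallback applies is not specified here, so the rule used (unknown sector class or a failed previous attempt) is purely an assumption, as are the names `TOP_K_BY_SECTOR` and `select_top_k`.

```python
TOP_K_BY_SECTOR = {"stable": 5, "active_conflict": 20}  # values from the preconditions
FALLBACK_TOP_K = 50


def select_top_k(sector_class: str, previous_attempt_failed: bool = False) -> int:
    """Pick the FAISS top-K for a relocalization query.

    Assumption: fall back to K=50 for an unknown sector class or after a
    failed attempt; the real trigger for the fallback is not defined here.
    """
    if previous_attempt_failed:
        return FALLBACK_TOP_K
    return TOP_K_BY_SECTOR.get(sector_class, FALLBACK_TOP_K)
```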
 ---
 ### NFT-PERF-04: Cold Boot Time To First Fix
 **Summary**: Validate companion boot to first valid `GPS_INPUT`.
 **Traces to**: AC-NEW-1
 **Metric**: Time from service start/boot marker to first valid `GPS_INPUT`.
 **Preconditions**:
 - Engines/indexes are built before the run.
 - Cache/index is available locally.
 - FC state handoff is simulated or provided.
 | Step | Consumer Action | Measurement |
 |------|-----------------|-------------|
 | 1 | Start service from cold boot condition | Measure initialization stages |
 | 2 | Wait for first valid output | Measure first valid `GPS_INPUT` timestamp |
 **Pass criteria**: 95th percentile <30 s over 50 runs.
 **Duration**: 50 cold-start trials.
@@ -0,0 +1,85 @@
 # Resilience Tests
 ### NFT-RES-01: Total Visual Blackout With GPS Spoofing
 **Summary**: Validate degraded-mode behavior when the camera feed is totally occluded/blacked out and real GPS is spoofed or denied.
 **Traces to**: AC-3.5, AC-5.2, AC-NEW-8
 **Preconditions**:
 - Plane SITL or replay trace is emitting normal telemetry.
 - System has a recent trusted visual/satellite anchor.
 **Fault injection**:
 - Full camera blackout/total occlusion for 5 s, 15 s, and 35 s while spoofed GPS is present.
 | Step | Action | Expected Behavior |
 |------|--------|-------------------|
 | 1 | Inject total occlusion/blackout and spoofed GPS | Camera gate reports `usable_for_vio=false`, BASALT is bypassed, and system switches to `dead_reckoned` within 1 processed frame or 400 ms |
 | 2 | Continue blackout | IMU-only covariance grows monotonically and spoofed GPS is ignored |
 | 3 | Exceed 30 s or covariance >500 m | System emits no-fix/failsafe fields and QGC `VISUAL_BLACKOUT_FAILSAFE` |
 **Pass criteria**: All pre-VIO occlusion gate, timing, covariance, `fix_type`, `horiz_accuracy`, and status thresholds match AC-NEW-8.
 ---
 ### NFT-RES-02: Sharp Turn And Disconnected Segment Relocalization
 **Summary**: Validate recovery when frame-to-frame overlap drops below the VO threshold.
 **Traces to**: AC-3.2, AC-3.3, AC-3.4, AC-8.6
 **Preconditions**:
 - Public or representative replay contains sharp-turn/disconnected segment cases, or equivalent synthetic sequence is generated from mapped imagery.
 **Fault injection**:
 - Sequence transition with <5% overlap, heading change <70°, and drift <200 m.
 | Step | Action | Expected Behavior |
 |------|--------|-------------------|
 | 1 | Replay normal segment | BASALT + wrapper emits normal `vo_extrapolated` estimates |
 | 2 | Inject sharp-turn/disconnected transition | VO failure is expected; system triggers VPR relocalization |
 | 3 | Continue next segment | System connects segment through verified satellite anchor or reports degraded status |
 **Pass criteria**: Relocalization request is issued when no position is available for >=3 consecutive frames and >=2 s; verified anchor reconnects the segment or output remains degraded with growing covariance.
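The trigger condition in the pass criteria (at least 3 consecutive frames and at least 2 s without a position) could be tracked as sketched below. Measuring the gap from the first missed frame is an assumption, since the reference point is not stated, and `RelocalizationTrigger` is a hypothetical name.

```python
from typing import Optional


class RelocalizationTrigger:
    """Fires once both the frame-count and elapsed-time conditions hold."""

    def __init__(self, min_frames: int = 3, min_gap_s: float = 2.0) -> None:
        self.min_frames = min_frames
        self.min_gap_s = min_gap_s
        self._missed = 0
        self._gap_start_s: Optional[float] = None

    def update(self, timestamp_s: float, has_position: bool) -> bool:
        """Feed one processed frame; return True when relocalization should be requested."""
        if has_position:
            # Any valid position resets the consecutive-miss window.
            self._missed = 0
            self._gap_start_s = None
            return False
        if self._gap_start_s is None:
            self._gap_start_s = timestamp_s  # assumption: gap starts at the first miss
        self._missed += 1
        gap_s = timestamp_s - self._gap_start_s
        return self._missed >= self.min_frames and gap_s >= self.min_gap_s
```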
 ---
 ### NFT-RES-03: Companion Computer Restart Mid-Flight
 **Summary**: Validate reboot recovery from flight-controller state and preloaded cache.
 **Traces to**: AC-5.3, AC-NEW-1
 **Preconditions**:
 - Replay/SITL mission is in progress.
 - FDR has current segment logs.
 **Fault injection**:
 - Kill and restart the GPS-denied service during a GPS-denied segment.
 | Step | Action | Expected Behavior |
 |------|--------|-------------------|
 | 1 | Kill service | FC continues on last known/IMU-extrapolated state |
 | 2 | Restart service | Service reloads cache/index and uses FC state handoff |
 | 3 | Observe first valid output | First valid `GPS_INPUT` is emitted in <30 s |
 **Pass criteria**: No raw frames are required for recovery; first valid fix <30 s p95; failure is logged in FDR.
 ---
 ### NFT-RES-04: Tile Cache Freshness Degradation
 **Summary**: Validate graceful behavior when the only available tile candidates are stale.
 **Traces to**: AC-8.2, AC-NEW-6
 **Fault injection**:
 - Mark cache tiles older than 6 months for active-conflict sector and older than 12 months for stable sector.
 | Step | Action | Expected Behavior |
 |------|--------|-------------------|
 | 1 | Replay frame requiring satellite anchor | Stale tiles are rejected or down-weighted in confidence |
 | 2 | Inspect emitted estimate | No stale tile produces `satellite_anchored` label past hard rejection threshold |
 **Pass criteria**: Freshness decay and hard rejection match AC-NEW-6.
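A fixture generator or assertion helper might classify tile age against the two limits from the fault-injection step as sketched here. It covers only the age cut-offs: whether these are the hard rejection thresholds of AC-NEW-6, and what the freshness decay looks like below them, is not stated in this document. `MAX_AGE_MONTHS` and `is_stale` are hypothetical names, and calendar-month arithmetic is an assumption.

```python
from datetime import date

# Limits taken from the fault-injection step above; keys are illustrative labels.
MAX_AGE_MONTHS = {"active_conflict": 6, "stable": 12}


def is_stale(captured: date, today: date, sector_class: str) -> bool:
    """True when the tile is strictly older than the sector's month limit."""
    limit = MAX_AGE_MONTHS[sector_class]
    # Compare (month index, day) pairs so no invalid calendar date is ever built.
    expiry_key = (captured.year * 12 + captured.month + limit, captured.day)
    today_key = (today.year * 12 + today.month, today.day)
    return today_key > expiry_key
```

A tile captured on 2025-01-15 is therefore still fresh for an active-conflict sector on 2025-07-15 and becomes stale the following day.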
@@ -0,0 +1,85 @@
 # Resource Limit Tests
 ### NFT-RES-LIM-01: Jetson Memory Budget
 **Summary**: Validate that runtime memory stays below the 8 GB shared LPDDR5 limit.
 **Traces to**: AC-4.2, Restrictions Onboard Hardware
 **Preconditions**:
 - Jetson Orin Nano Super in production power/thermal mode.
 - BASALT + wrapper, cache index, FAISS CPU index, and FDR enabled.
 **Monitoring**:
 - CPU/GPU shared memory, process RSS, CUDA allocations, FAISS index memory.
 **Duration**: Minimum 60 minutes steady-state replay plus relocalization triggers.
 **Pass criteria**: Peak memory <8 GB shared; no OOM kill; no silent descriptor/index eviction.
 ---
 ### NFT-RES-LIM-02: Thermal And Power Envelope
 **Summary**: Validate sustained 25 W operation without thermal throttling across the environmental envelope.
 **Traces to**: AC-NEW-5
 **Preconditions**:
 - Jetson cooling solution installed.
 - Hot-soak chamber or production thermal test setup at +50 °C.
 **Monitoring**:
 - Power mode, temperature sensors, throttle flags, CPU/GPU clocks, per-frame latency.
 **Duration**: 8 hours at sustained representative workload.
 **Pass criteria**: No thermal throttle event; p95 latency remains <400 ms; QGC receives thermal warning if any threshold is approached.
 ---
 ### NFT-RES-LIM-03: Satellite Cache Storage Budget
 **Summary**: Validate persistent satellite cache footprint for up to 400 km² operational area.
 **Traces to**: AC-8.3, Restrictions Satellite Imagery
 **Monitoring**:
 - Cache imagery, overviews, manifests, sidecars, FAISS descriptors/indexes.
 **Duration**: Full cache build/load test.
 **Pass criteria**: Persistent cache is <=10 GB unless the implementation explicitly defines and gets approval for a separate descriptor/index budget.
 ---
 ### NFT-RES-LIM-04: Flight Data Recorder Rollover
 **Summary**: Validate FDR storage cap and rollover behavior under an 8-hour synthetic mission.
 **Traces to**: AC-NEW-3, AC-8.5
 **Preconditions**:
 - Synthetic 8-hour load with 3 fps navigation frames, full-rate IMU, emitted `GPS_INPUT`, health telemetry, tile writes, and failure thumbnails.
 **Monitoring**:
 - FDR segment sizes, rollover events, retained payload classes.
 **Duration**: 8 hours.
 **Pass criteria**: FDR remains <=64 GB per flight; rollover is logged; no raw nav/AI frames are retained; no payload class is silently dropped.
 ---
 ### NFT-RES-LIM-05: Cold Start Resource Spike
 **Summary**: Validate that CUDA/TensorRT/ONNX/FAISS initialization does not violate boot or memory budgets.
 **Traces to**: AC-NEW-1, AC-4.2
 **Monitoring**:
 - Initialization time, peak memory, engine/index load time.
 **Duration**: 50 cold-start trials.
 **Pass criteria**: First valid `GPS_INPUT` <30 s p95; peak memory <8 GB; no first-run engine build occurs at runtime.
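
The p95 gate above can be evaluated from the 50 trial measurements with a few lines of Python. This is a minimal sketch, not the project's harness: the nearest-rank percentile definition and the names `p95` and `cold_start_passes` are assumptions, since the spec does not say how the percentile is computed.

```python
def p95(samples):
    """Nearest-rank 95th percentile (assumed definition) of a non-empty sequence."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based nearest rank, ceil(0.95 * n), computed in integer arithmetic
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def cold_start_passes(first_fix_s, peak_mem_gb, limit_s=30.0, limit_gb=8.0):
    """True when p95 time-to-first-GPS_INPUT and peak memory stay under the limits."""
    return p95(first_fix_s) < limit_s and max(peak_mem_gb) < limit_gb
```

With 50 trials the nearest rank is 48, so the gate tolerates at most two trials at or above 30 s.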
@@ -0,0 +1,62 @@
 # Security Tests
 ### NFT-SEC-01: Signed Cache Manifest Enforcement
 **Summary**: Validate that unsigned or tampered cache manifests cannot produce trusted anchors.
 **Traces to**: AC-8.2, AC-8.3, AC-NEW-4, AC-NEW-7
 | Step | Consumer Action | Expected Response |
 |------|-----------------|-------------------|
 | 1 | Provide valid signed manifest | System accepts cache fixture if all freshness and resolution checks pass |
 | 2 | Provide unsigned manifest | System rejects cache fixture and logs security event |
 | 3 | Provide hash-mismatched tile sidecar | System rejects affected tile and emits no trusted anchor from it |
 **Pass criteria**: 0 unsigned or hash-mismatched fixtures produce `satellite_anchored` output or trusted generated tile promotion.
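
Step 3 can be illustrated with a small integrity check. This is a sketch under stated assumptions: the spec does not name the hash algorithm, so SHA-256 is assumed here, and `tile_hash_matches` is a hypothetical helper. Signature verification of the manifest itself is out of scope for this example.

```python
import hashlib
import hmac


def tile_hash_matches(tile_bytes: bytes, sidecar_sha256_hex: str) -> bool:
    """Compare a tile's SHA-256 digest (assumed algorithm) with its sidecar value."""
    actual = hashlib.sha256(tile_bytes).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing
    return hmac.compare_digest(actual, sidecar_sha256_hex.strip().lower())
```

A tile that fails this check would be rejected before any anchor is derived from it.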
 ---
 ### NFT-SEC-02: Cache Poisoning Write Gate
 **Summary**: Validate that generated onboard tiles are not written or promoted when parent-pose covariance is too weak.
 **Traces to**: AC-8.4, AC-NEW-7
 | Step | Consumer Action | Expected Response |
 |------|-----------------|-------------------|
 | 1 | Replay generated tile candidate with parent sigma <=3 m | Tile may be written as candidate with full quality metadata |
 | 2 | Replay candidate with parent sigma in (3 m, 5 m] | Tile is marked lower trust per sidecar policy |
 | 3 | Replay candidate with parent sigma >5 m | Tile is not eligible for write/promotion |
 **Pass criteria**: Tile trust level and write eligibility match AC-NEW-7; no over-threshold tile becomes trusted basemap.
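
The three sigma bands in the table map directly onto a small classifier. The sketch below uses the 3 m and 5 m thresholds from the steps above; the `TileTrust` enum, the function name, and the fail-closed handling of negative or non-finite sigma are assumptions for illustration.

```python
import math
from enum import Enum


class TileTrust(Enum):
    CANDIDATE = "candidate"      # sigma <= 3 m: may be written with full metadata
    LOWER_TRUST = "lower_trust"  # sigma in (3 m, 5 m]: marked lower trust
    INELIGIBLE = "ineligible"    # sigma > 5 m: not eligible for write/promotion


def classify_tile(parent_sigma_m: float) -> TileTrust:
    """Map parent-pose sigma (metres) to a write/promotion trust band."""
    # Assumption: invalid sigma values fail closed rather than raising.
    if not math.isfinite(parent_sigma_m) or parent_sigma_m < 0:
        return TileTrust.INELIGIBLE
    if parent_sigma_m <= 3.0:
        return TileTrust.CANDIDATE
    if parent_sigma_m <= 5.0:
        return TileTrust.LOWER_TRUST
    return TileTrust.INELIGIBLE
```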
 ---
 ### NFT-SEC-03: MAVLink Source And Spoofing Rejection
 **Summary**: Validate that spoofed real-GPS measurements and unauthorized MAVLink sources do not override trusted estimator state.
 **Traces to**: AC-3.5, AC-4.3, AC-NEW-2, AC-NEW-8
 | Step | Consumer Action | Expected Response |
 |------|-----------------|-------------------|
 | 1 | Inject spoofed `GPS_RAW_INT` during normal visual operation | Estimator rejects inconsistent GPS based on FC health and visual/satellite consistency |
 | 2 | Inject spoofed GPS during visual blackout | Spoofed GPS remains excluded until health and visual consistency gates pass |
 | 3 | Inject MAVLink messages from unauthorized source ID | Message is ignored and security/status event is logged |
 **Pass criteria**: No unauthorized or spoofed input causes a confident position estimate; promotion/demotion status is visible to QGC and FDR.
 ---
 ### NFT-SEC-04: No In-Flight Satellite Provider Access
 **Summary**: Validate that the runtime system does not call commercial or Suite satellite services during flight.
 **Traces to**: AC-8.1, AC-8.3, Restrictions Satellite Imagery
 | Step | Consumer Action | Expected Response |
 |------|-----------------|-------------------|
 | 1 | Run replay with network blocked | System continues using local cache |
 | 2 | Run replay requiring missing tile | System reports degraded/relocalization-needed status, not an external fetch |
 **Pass criteria**: 0 outbound satellite-provider or Suite Service calls during runtime; missing cache data produces controlled degraded behavior.
@@ -0,0 +1,100 @@
 # Test Data Management
 ## Seed Data Sets
 | Data Set | Description | Used by Tests | How Loaded | Cleanup |
 |----------|-------------|---------------|------------|---------|
 | `project_60_still_images` | 60 nadir images with WGS84 frame-center coordinates from `coordinates.csv`; height 400 m | FT-P-01, FT-P-02, FT-N-01, NFT-PERF-01 | Mounted from `_docs/00_problem/input_data/` | Read-only |
 | `project_gmaps_reference_subset` | Google Maps reference images available for the first sample frames | FT-P-02, FT-N-01 | Mounted from `_docs/00_problem/input_data/` | Read-only |
 | `expected_frame_centers` | Expected lat/lon and thresholds derived from `coordinates.csv` | FT-P-01, FT-P-02 | `_docs/00_problem/input_data/expected_results/results_report.md` | Read-only |
 | `derkachi_video_telemetry` | Cropped nadir MP4 synchronized with IMU and `GLOBAL_POSITION_INT` trajectory: 880 x 720, 30 fps, ~490.07 s; telemetry 10 Hz, 4,900 rows | FT-P-03, NFT-PERF-02, NFT-RES-02 | Mounted from `_docs/00_problem/input_data/flight_derkachi/` | Read-only |
 | `public_nadir_vio_candidates` | MUN-FRL, ALTO, EPFL fixed-wing, Kagaru, AerialVL/VPAir slices, EuRoC/UZH FPV proxy slices | FT-P-03, FT-P-04, NFT-PERF-02, NFT-RES-02 | Downloaded or mounted by replay harness; exact files pinned during implementation | Reset fixture volume |
 | `sitl_spoofing_scenarios` | Generated ArduPilot Plane SITL GPS loss/spoofing traces | FT-N-02, NFT-RES-01, NFT-SEC-03 | Generated by test harness | Discard generated logs after report |
 | `cache_integrity_fixtures` | Fresh, stale, unsigned, hash-mismatched, and low-resolution cache manifests | FT-N-03, NFT-SEC-01, NFT-SEC-02 | Mounted fixture volume | Read-only |
 ## Public Dataset Coverage Plan
 | Public Data Source | Fit For This Project | Limitations | Planned Use |
 |--------------------|----------------------|-------------|-------------|
 | MUN-FRL | Strong nadir camera + IMU + GNSS/ground truth candidate | Helicopter/hexacopter, not fixed-wing | BASALT/OpenVINS/Kimera replay and covariance calibration |
 | ALTO | Strong nadir aerial imagery with GPS/INS, altimeter, orthophotos | Helicopter/airborne collection, access/details must be pinned | VPR, satellite alignment, VO/geolocalization replay |
 | EPFL fixed-wing micro UAV | Strong fixed-wing relevance with camera/navigation sensors | Availability and exact raw IMU packaging must be verified | Fixed-wing path realism and photogrammetry-style validation |
 | Kagaru airborne vision | Fixed-wing/farmland relevance, downward stereo, INS/GPS | Older dataset; exact sensor compatibility must be verified | Agricultural terrain and fixed-wing motion checks |
 | AerialVL | Strong UAV-to-satellite localization and VPR benchmark | IMU availability is less clear than image/GNSS/reference-map data | Satellite retrieval, anchor verification, visual localization |
 | VPAir | Strong aircraft nadir VPR/localization with GPS-derived poses | Academic-use restriction; raw IMU not confirmed | VPR and cross-view localization only if license allows |
 | EuRoC MAV | Excellent synchronized camera/IMU/ground-truth VIO benchmark | Not fixed-wing nadir, indoor MAV | BASALT/OpenVINS/Kimera baseline sanity tests |
 | UZH FPV | Synchronized camera/IMU/ground-truth high-dynamics benchmark | Not nadir fixed-wing; non-commercial license | Stress VIO robustness only if license allows |
 ## Data Isolation Strategy
 Every replay test uses read-only fixture mounts and writes results to a fresh `test-results/<run-id>/` directory. The system under test may write FDR and generated COG tiles only to run-scoped temporary volumes.
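
One way to get a fresh `test-results/<run-id>/` directory per run is sketched below. The helper name `make_run_dir` and the use of a random UUID as the run ID are assumptions; the spec only requires that each run writes to its own directory.

```python
import uuid
from pathlib import Path


def make_run_dir(root: Path) -> Path:
    """Create and return a new, empty test-results/<run-id>/ directory under root."""
    run_id = uuid.uuid4().hex  # assumed run-id scheme
    run_dir = root / "test-results" / run_id
    # exist_ok=False makes an accidental reuse of a run ID fail loudly
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir
```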
 ## Input Data Mapping
 | Input Data File | Source Location | Description | Covers Scenarios |
 |-----------------|----------------|-------------|------------------|
 | `AD000001.jpg` ... `AD000060.jpg` | `_docs/00_problem/input_data/` | Project still-image set with expected WGS84 centers | FT-P-01, FT-P-02, NFT-PERF-01 |
 | `coordinates.csv` | `_docs/00_problem/input_data/coordinates.csv` | Machine-readable expected frame centers | FT-P-01, FT-P-02 |
 | `data_parameters.md` | `_docs/00_problem/input_data/data_parameters.md` | Height 400 m and camera model | FT-P-01, NFT-PERF-01 |
 | `AD000001_gmaps.png`, `AD000002_gmaps.png` | `_docs/00_problem/input_data/` | Reference map screenshots for sample sanity checks | FT-P-02 |
 | `flight_derkachi/flight_derkachi.mp4` + `flight_derkachi/data_imu.csv` | `_docs/00_problem/input_data/flight_derkachi/` | Cropped nadir video synchronized with IMU and `GLOBAL_POSITION_INT` GPS trajectory | FT-P-03, NFT-PERF-02, NFT-RES-02 |
 | Public dataset slices | External fixture paths pinned during implementation | Synchronized camera/IMU/GNSS/ground truth where available | FT-P-03, FT-P-04, NFT-PERF-02, NFT-RES-02 |
 ## Expected Results Mapping
 | Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source |
 |------------------|------------|-----------------|-------------------|-----------|------------------------|
 | FT-P-01 | `AD000001.jpg` ... `AD000060.jpg` | Output WGS84 frame center per mapped row; >=80% within 50 m, >=50% within 20 m | Haversine distance threshold + aggregate pass rate | 50 m primary, 20 m stretch | `input_data/expected_results/results_report.md` |
 | FT-P-02 | Same 60 images + map references where present | Output includes source label, covariance semi-major axis, and anchor age for every emitted estimate | Required-field validation + geolocation threshold | Required fields present; geolocation thresholds as above | `input_data/expected_results/results_report.md` |
 | FT-P-03 | `derkachi_video_telemetry` plus public synchronized VIO dataset slice when available | BASALT + wrapper emits trajectory with calibrated covariance and no optimistic under-reporting | Compare Derkachi output to `GLOBAL_POSITION_INT` trajectory for smoke/relative validation; compare public/representative calibrated runs to ground truth for final accuracy | Derkachi threshold is calibration-gated; final threshold is dataset-specific and pinned after camera calibration | `data_imu.csv` trajectory plus public dataset ground truth |
 | FT-P-04 | AerialVL/ALTO/VPAir-style aerial localization slice | Satellite retrieval returns candidate chunks and local verification produces accepted/rejected anchors | Georeference error + MRE + source-label checks | AC-1.1/1.2 and AC-2.2 thresholds where dataset supports them | Public dataset ground truth/reference map |
 | FT-N-01 | Low-texture/repetitive frames from sample or public data | System emits degraded confidence or rejects anchor rather than confident false fix | Source label and covariance threshold | No `satellite_anchored` label unless gates pass | Fixture-specific |
 | FT-N-02 | Plane SITL GPS spoof/loss trace | Spoofed GPS rejected; system promotes its own estimate in <3 s when trigger conditions are met | Event timing and MAVLink field checks | <3 s promotion; blackout thresholds from AC-NEW-8 | Generated SITL trace |
 | FT-N-03 | Stale/unsigned/hash-mismatched cache fixtures | Anchors rejected or downgraded; stale tile never emits `satellite_anchored` | Manifest validation + emitted label check | 0 accepted stale/invalid anchors | Cache fixture manifest |
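
The FT-P-01 comparison (haversine distance plus aggregate pass rate) can be sketched as follows. The 50 m / 20 m thresholds and the 80% / 50% rates come from the table; the mean Earth radius, the inclusive `<=` comparison, and the function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption, not from the spec


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def ft_p_01_passes(errors_m):
    """>=80% of frames within 50 m and >=50% within 20 m."""
    if not errors_m:
        return False
    n = len(errors_m)
    within_50 = sum(e <= 50.0 for e in errors_m) / n
    within_20 = sum(e <= 20.0 for e in errors_m) / n
    return within_50 >= 0.80 and within_20 >= 0.50
```

Each error in `errors_m` would be the haversine distance between an emitted frame center and the matching row of `coordinates.csv`.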
 ## External Dependency Mocks
 | External Service | Mock/Stub | How Provided | Behavior |
 |------------------|-----------|--------------|----------|
 | Azaion Suite Satellite Service | Offline cache stub | Local COG/manifest/descriptor fixture | Provides only preloaded tiles; no in-flight network fetch |
 | Flight controller | ArduPilot Plane SITL and MAVLink replay | SITL container/process and recorded/generated tlogs | Emits IMU, attitude, altitude, airspeed, GPS health/spoofing events |
 | QGroundControl | MAVLink observer/log parser | Test-side parser | Verifies downsampled status and `STATUSTEXT` events |
 ## Data Validation Rules
 | Data Type | Validation | Invalid Examples | Expected System Behavior |
 |-----------|------------|------------------|--------------------------|
 | Image frame | Existing file, readable image, expected timestamp/order metadata if sequence replay | Missing image, corrupt image, unsupported resolution | Mark estimate unavailable/degraded, log error, continue if possible |
 | Expected coordinate | Valid WGS84 latitude/longitude | Out-of-range lat/lon, missing row | Reject test fixture before replay |
 | Video/telemetry pair | MP4 duration matches telemetry duration, frame-to-telemetry ratio is stable, timestamps are monotonic | Duration drift >250 ms, missing trajectory columns, non-monotonic timestamps | Reject fixture before replay |
 | IMU trace | Monotonic timestamps, angular rate/accel fields, calibrated units | Non-monotonic timestamps, missing samples | Reject fixture or enter degraded mode, depending on the scenario |
 | GPS trajectory trace | Valid WGS84 lat/lon, altitude, velocity, and heading fields | Out-of-range lat/lon, impossible altitude, missing `GLOBAL_POSITION_INT` columns | Reject trajectory comparison while allowing pure video replay if applicable |
 | Cache tile manifest | CRS, m/px, capture date, source, hashes, signature/provenance | Stale, unsigned, hash mismatch, low resolution | Reject or down-confidence per AC-8.2 and AC-NEW-6 |
 | MAVLink output | Valid `GPS_INPUT` fields and fix type/accuracy semantics | Missing `horiz_accuracy`, impossible fix type | Fail test; output contract violated |
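
Several of the fixture checks above reduce to short predicates. The sketch below covers the WGS84 range check, timestamp monotonicity, and the 250 ms duration-drift limit; the function names are hypothetical, and treating "monotonic" as strictly increasing is an assumption.

```python
def valid_wgs84(lat: float, lon: float) -> bool:
    """True when latitude/longitude are inside the WGS84 ranges (NaN fails)."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def timestamps_monotonic(timestamps) -> bool:
    """True when every timestamp is strictly greater than the previous one."""
    ts = list(timestamps)
    return all(later > earlier for earlier, later in zip(ts, ts[1:]))


def duration_drift_ok(video_s: float, telemetry_s: float, max_drift_s: float = 0.250) -> bool:
    """True when MP4 and telemetry durations differ by no more than 250 ms."""
    return abs(video_s - telemetry_s) <= max_drift_s
```

A replay harness could run these before a test starts and reject the fixture on the first failure.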
 ## Phase 3 Validation Gate Result
 | Test Scenario ID | Shape | Required Input Data | Required Expected Result | Input Provided? | Expected Result Provided? | Validation Decision |
 |------------------|-------|---------------------|--------------------------|-----------------|---------------------------|---------------------|
 | FT-P-01 | Input/output | 60 project images + `coordinates.csv` | WGS84 center per image with 50 m / 20 m thresholds | Yes | Yes | Keep |
 | FT-P-02 | Input/output | 60 project images + output schema expectations | Required confidence/source-label fields and thresholds | Yes | Yes | Keep |
 | FT-P-03 | Input/output | Derkachi synchronized video/IMU/GPS fixture; public or calibrated representative dataset for final accuracy | Derkachi `GLOBAL_POSITION_INT` trajectory for smoke/relative validation; calibrated ground truth for final covariance checks | Yes for Derkachi; public/calibrated dataset still useful for final signoff | Yes for Derkachi GPS trajectory; calibrated camera thresholds pending | Keep with calibration gate |
 | FT-P-04 | Input/output | Public aerial localization or project cache fixture | Georeference, MRE, and source-label checks | Accepted as required external fixture | Accepted as dataset/reference-map ground truth | Keep with acquisition task |
 | FT-N-01 | Behavioral/input-output | Ambiguous low-texture/repetitive frames | 0 confident false anchors | Accepted as project/public fixture | Yes | Keep |
 | FT-N-02 | Behavioral | Generated Plane SITL spoof/blackout trace | Timing and MAVLink field thresholds from AC-NEW-8 | Generated by test harness | Yes | Keep |
 | FT-N-03 | Behavioral/input-output | Cache integrity fixtures | 0 trusted anchors from stale/invalid tiles | Generated fixture | Yes | Keep |
 | NFT-PERF-01 | Input/output | 60 project images | p95 latency and drop-rate thresholds | Yes | Yes | Keep |
 | NFT-PERF-02 | Input/output | Derkachi synchronized video/IMU/GPS fixture; public/representative synchronized camera/IMU dataset | VO registration, latency, memory thresholds | Yes for Derkachi | Yes | Keep with calibration gate |
 | NFT-PERF-03 | Behavioral/input-output | Precomputed descriptor/cache fixture | Trigger-path latency and MRE thresholds | Generated fixture | Yes | Keep |
 | NFT-PERF-04 | Behavioral | Cold-start harness and cache fixture | <30 s p95 over 50 runs | Generated by test harness | Yes | Keep |
 | NFT-RES-* | Behavioral | Fault triggers and generated traces | AC-defined timing/status thresholds | Generated by test harness | Yes | Keep |
 | NFT-SEC-* | Behavioral/input-output | Cache/MAVLink/network fixtures | Rejection/no-fetch/no-promote thresholds | Generated fixture | Yes | Keep |
 | NFT-RES-LIM-* | Behavioral | Jetson/cache/FDR monitoring environment | Numeric resource thresholds | Environment-dependent | Yes | Keep |
 **Coverage after validation**: 49/49 AC and restriction groups remain covered. No tests were removed.
 **Acquisition tasks required downstream**:
 - Pin camera intrinsics, lens distortion, raw camera feed parameters, and camera-to-body mounting transform for the Derkachi fixture or future representative recordings.
 - Pin and download at least one strong synchronized nadir camera + IMU + ground-truth dataset, preferably MUN-FRL or ALTO, with EPFL fixed-wing and Kagaru as fixed-wing/farmland candidates.
 - Pin license-compatible VPR/localization datasets for satellite anchor tests; VPAir and UZH FPV have non-commercial restrictions and must not be used for commercial acceptance unless license terms allow it.
 - Create generated fixtures for Plane SITL spoofing, stale cache manifests, signed/unsigned manifests, FDR load, and thermal/resource monitoring during implementation.
Author	SHA1	Message	Date
Oleksandr Bezdieniezhnykh	e7eaefff8b	chore: sync .cursor from suite	2026-05-05 01:08:48 +03:00
Oleksandr Bezdieniezhnykh	827d4fe644	[AZ-240] Update product implementation and task decomposition processes - Refined task decomposition steps to ensure implementation tasks are atomic and complexity does not exceed 5 points. - Enhanced the product implementation process with a completeness gate to verify task outcomes against architecture promises before proceeding to testing. - Updated dependencies table to reflect new tasks and their relationships, ensuring all test tasks are linked to product remediation tasks. - Adjusted workflow documentation to clarify entry points for task decomposition and implementation contexts. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-05 01:02:25 +03:00
Oleksandr Bezdieniezhnykh	9fb9e4a349	[AZ-232] Add safety anchor state machine Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 19:10:10 +03:00
Oleksandr Bezdieniezhnykh	7819ae7a38	[AZ-231] Add anchor verification gates Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 19:02:13 +03:00
Oleksandr Bezdieniezhnykh	07fb9535a9	[AZ-230] Add local VPR retrieval boundary Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 18:49:37 +03:00
Oleksandr Bezdieniezhnykh	087f4dba27	[AZ-228] [AZ-229] Add VIO and satellite sync boundaries Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 18:31:04 +03:00
Oleksandr Bezdieniezhnykh	2db50bc124	[AZ-226] Add generated tile staging Keep generated tiles auditable and untrusted onboard while preserving covariance, quality, and sidecar metadata for post-flight sync. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 18:10:25 +03:00
Oleksandr Bezdieniezhnykh	e86084da6b	[AZ-223] [AZ-224] [AZ-225] [AZ-227] Add runtime gateways Implement the first runtime component boundaries around the shared contracts so downstream batches can consume typed frame, MAVLink, tile, and FDR behavior with focused tests and batch evidence. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 18:01:13 +03:00
Oleksandr Bezdieniezhnykh	aab11e488e	chore: sync .cursor skills from suite	2026-05-03 17:43:26 +03:00
Oleksandr Bezdieniezhnykh	c3650d979d	[AZ-221] [AZ-222] Add shared runtime helpers Provide deterministic geometry/time-sync helpers and structured config, error, health, and telemetry primitives for downstream runtime components. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 14:01:04 +03:00
Oleksandr Bezdieniezhnykh	5156453224	[AZ-220] Add shared runtime contract models Implement the shared DTO contract surface with validation so runtime components consume one public model set instead of duplicating shapes. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 13:22:50 +03:00
Oleksandr Bezdieniezhnykh	72a9df6b57	[AZ-219] [AZ-228] Generalize VIO component layout Keep VIO package and native bridge paths backend-neutral so BASALT remains an implementation choice rather than a component boundary. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 12:41:54 +03:00
Oleksandr Bezdieniezhnykh	79997e39ac	[AZ-219] Scaffold onboard runtime project Add the initial source, test, infrastructure, CI, configuration, and evidence-path scaffold so dependent implementation tasks have stable package and runtime boundaries. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-03 12:41:54 +03:00
Oleksandr Bezdieniezhnykh	dd9afe2797	Refactor documentation to replace the Validation Harness with a separate E2E Test Suite, updating references throughout various documents. Adjust the autodev state to reflect the transition from the Decompose phase to the Implement phase, and revise the architecture documentation to clarify system boundaries and component relationships. Enhance risk mitigation documentation to specify affected components and update the component overview diagram accordingly.	2026-05-03 12:41:53 +03:00
Oleksandr Bezdieniezhnykh	5bf2dbd85f	Update autodev state documentation to reflect progress in the Decompose phase, changing the current step from 5 to 6. Revise sub-step details to indicate a shift to phase 2, focusing on module layout for the Satellite Service and Tile Manager, and awaiting confirmation before product task decomposition. Additionally, enhance problem documentation to clarify the original still-image sample limitations and introduce the Derkachi representative fixture for improved data validation. Update references to the Tile Manager and Satellite Service throughout the documentation for consistency.	2026-05-03 12:41:52 +03:00
Oleksandr Bezdieniezhnykh	35547e9b65	Update autodev workflow documentation to include new steps for Test Spec and Decompose Tests, enhancing the greenfield process. Revise existing steps to reflect changes in task flow and clarify conditions for implementation. Adjust current state to indicate progress in the Decompose phase.	2026-05-02 05:31:23 +03:00
Oleksandr Bezdieniezhnykh	7e15868d39	Revise acceptance criteria and restrictions documentation to clarify recent updates and specifications. Key changes include enhanced definitions for position accuracy, image processing quality, and operational parameters, as well as updates to camera specifications and validation requirements. This revision aims to improve clarity and ensure alignment with project goals.	2026-05-01 16:24:46 +03:00
Oleksandr Bezdieniezhnykh	3f173c1bb7	Update camera specifications in data_parameters.md and remove outdated expected results files for position accuracy and results report. The camera model has been changed to ADTi Surveyor Lite 20MP 20L V1, and the previous CSV and report files have been deleted to streamline documentation.	2026-05-01 05:00:07 +03:00
Oleksandr Bezdieniezhnykh	13bb25be38	Enhance research and refactor documentation with mandatory API capability verification for technical components. Introduce per-mode verification steps, including pinned mode/config requirements and Minimum Viable Example (MVE) documentation. Update analysis and solution draft templates to reflect new columns for API capability evidence and ensure structured cross-checking against project constraints. This update aims to prevent silent failures in component selection and improve overall research rigor.	2026-04-30 23:02:11 +03:00
Oleksandr Bezdieniezhnykh	3ef26c515e	fresh start v2	2026-04-29 17:07:28 +03:00
Oleksandr Bezdieniezhnykh	af5eb13ecb	update GPS-denied onboard research docs	2026-04-29 17:03:57 +03:00
`@@ -58,4 +58,4 @@ Do NOT create minimal epics with just a summary and short description. The epic`

	8. Create "Blackbox Tests" epic — this epic will parent the blackbox test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `tests/`.	8. Create "Blackbox Tests" epic — this epic will parent the blackbox test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `tests/`.

	Save action: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If `tracker: local`, save locally only.	Save action: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If tracker availability fails, follow `.cursor/rules/tracker.mdc`; only if the user explicitly chooses `tracker: local`, save locally only with pending tracker markers.
`@@ -1,2 +1,2 @@`
	We have a wing-type UAV with a camera pointing downwards that can take photos 3 times per second with a resolution 6200*4100. Also plane has flight controller with IMU. During the plane flight, we know GPS coordinates initially. During the flight, GPS could be disabled or spoofed. We need to determine the GPS of the centers of the next frame from the camera. And also the coordinates of the center of any object in these photos. We can use an external satellite provider for ground checks on the existing photos. So, before the flight, UAV's operator should upload the satellite photos to the plane's companion PC.	We have a wing-type UAV with a fixed downward navigation camera that can take photos 3 times per second. The authoritative navigation-camera spec is defined in `restrictions.md` as the ADTi 20MP 20L V1, APS-C sensor, about 5472 x 3648 px; older higher-resolution references are superseded. Also plane has flight controller with IMU. During the plane flight, we know GPS coordinates initially. During the flight, GPS could be disabled or spoofed. We need to determine the GPS of the centers of the next frame from the camera. And also the coordinates of the center of any object in these photos. We can use an external satellite provider for ground checks on the existing photos. So, before the flight, UAV's operator should upload the satellite photos to the plane's companion PC.
	`The real world examples are in input_data folder, but the distance between each photo is way bigger than it will be from a real plane. On that particular example photos were taken 1 photo per 2-3 seconds. But in real-world scenario frames would appear within the interval no more than 500ms. We also don't have IMU data for the test. For now we have to search for the public data for that in internet. We've tried to record that with Mavic 3 Pro Mini, but failed, cause of the closed system if DJI.`	The real world examples are in input_data folder, but the original still-image set has a much larger distance between photos than the target aircraft will have. On that particular example photos were taken 1 photo per 2-3 seconds. But in real-world scenario frames would appear within the interval no more than 500ms. Additional representative data is available in `input_data/flight_derkachi/`: cropped nadir flight footage plus synchronized `SCALED_IMU2` and `GLOBAL_POSITION_INT` telemetry. This supports video/telemetry synchronization, replay, latency, VIO smoke tests, and trajectory comparison against the tlog GPS path. Camera intrinsics, lens distortion, raw camera feed parameters, and exact camera-to-body calibration are still pending, so final production accuracy claims remain gated on calibration data or a separately surveyed representative dataset.