1 Commits

Author SHA1 Message Date
Oleksandr Bezdieniezhnykh 4e3b814229 Merge branch 'dev' into stage 2026-04-22 01:37:33 +03:00
284 changed files with 2885 additions and 21519 deletions
-29
View File
@@ -1,29 +0,0 @@
---
description: "Forbid spawning subagents; the main agent must do the work directly"
alwaysApply: true
---
# No Subagents
Do NOT create or delegate to subagents. This includes:
- The `Task` tool with any `subagent_type` (e.g. `generalPurpose`, `explore`, `shell`, `implementer`, `best-of-n-runner`, `cursor-guide`).
- Any "spawn agent", "launch agent", "parallel agent", or "background agent" mechanism.
- Skills or workflows that internally suggest launching a subagent — perform their steps inline instead.
## Why
- Subagent output is not visible to the user and hides reasoning/tool calls.
- Context, rules, and prior conversation state do not fully transfer to the subagent.
- Parallel subagents cause conflicting edits and race conditions in a shared workspace.
- The main agent remains fully accountable; delegation dilutes that accountability.
## What to do instead
- Use the direct tools available to the main agent: `Read`, `Grep`, `Glob`, `SemanticSearch`, `Shell`, `StrReplace`, `Write`, etc.
- For broad exploration, run `Grep`/`Glob`/`SemanticSearch` yourself and read the files directly.
- For multi-step work, use `TodoWrite` to track progress inline.
- For isolated experiments the user explicitly asks for, use a git branch/worktree you manage directly — not a subagent runner.
## Exception
Only spawn a subagent if the user explicitly requests it in the current turn (e.g. "use a subagent to…", "launch an explore agent…"). Even then, confirm once before spawning.
-46
View File
@@ -1,46 +0,0 @@
---
description: "Explanation length and reasoning depth calibration"
alwaysApply: true
---
# Response Calibration
Default to concise. Expand only when the content demands it.
## Length target
- **Default**: a direct answer in ~310 lines. Short paragraphs or a tight bullet list.
- **Expand when**: the question involves trade-offs across multiple options, a migration/architectural decision, a security/data-loss risk, or the user explicitly asks for depth ("explain in detail", "walk me through", "why").
- **Shrink when**: the user asks for "shorter", "simpler", "TL;DR", "one line", or similar. Do not re-inflate in later turns unless they ask a new deeper question.
## Completeness floor
Short ≠ incomplete. Every response must still:
- Answer the actual question asked (not a reframed version).
- State the key constraint or reason *once*, not repeatedly.
- Flag a real caveat if one exists (data loss, breaking change, wrong-OS, security). One sentence is enough.
- Not drop a step from an action sequence. If there are 5 steps, list 5 — but without narration between them.
If the honest answer truly needs more space (e.g. trade-off matrix, multi-option decision), write more — but lead with the recommendation or direct answer, then the detail.
## Structure
- One direct sentence first. Then supporting detail.
- Prefer bullets over prose for enumerations, comparisons, or step lists.
- Drop section headers for anything under ~15 lines.
- No "Summary" / "Conclusion" sections unless the response is genuinely long.
## Reasoning depth (internal)
- Match thinking to the problem, not the length of the answer.
- Factual / "where is X used" / single-file edit → minimal thinking, go straight to tools.
- Trade-off / refactor / debugging 3+ hypotheses deep → full thinking budget.
- Do not pad thinking to look thorough. Do not skip thinking on genuinely ambiguous problems to look fast.
## Anti-patterns to avoid
- Restating the question back to the user.
- Multi-paragraph preambles before the answer.
- Exhaustive "alternatives considered" sections when the user didn't ask for alternatives.
- Recapping what was just done at the end of every tool-using turn ("Done. I have edited the file…") — a one-line confirmation is enough.
- Speculative "you might also want to…" paragraphs. Offer follow-ups as a single short sentence, or not at all.
-38
View File
@@ -1,38 +0,0 @@
---
description: "Standards for creating and maintaining Cursor skills"
globs: [".cursor/skills/**"]
---
# Skill Building
## When To Create A Skill
- Create a skill for repeatable, bounded workflows that benefit from a reusable process.
- Do not create a skill for a one-off task, vague goal, or workflow that still needs product decisions.
- Start small; evolve the skill when repeated use reveals clearer steps, constraints, or checks.
## Skill Contract
- `SKILL.md` must define a clear `name` and a proactive `description` that explains when the skill should be used.
- State expected inputs, constraints, workflow steps, and final output shape.
- Make trigger conditions explicit enough that the agent can recognize intent without an exact command.
- Base instructions on observable project evidence; do not invite fabrication or unsupported assumptions.
## Keep The Core Lean
- Keep `SKILL.md` concise and under the repo's `.cursor/` size guidance.
- Move detailed standards, examples, and background knowledge into `references/`.
- Put reusable output shapes in `templates/` or other skill-local assets instead of embedding them in the main instructions.
- Keep one primary responsibility per skill; use an orchestrator skill only when multiple existing skills must run in a defined order.
## Deterministic Work
- Use scripts for mechanical steps that are repeatable, parameterized, and safer outside the model's reasoning.
- Scripts must expose explicit inputs, avoid hidden side effects, and fail loudly on errors.
- Do not use scripts to bypass review, hide destructive behavior, or hardcode secrets.
## Quality Proof
- Include realistic examples, checklists, or eval-style scenarios that define what good output looks like.
- Cover common failure cases such as missing sections, leftover placeholders, hallucinated facts, unsafe actions, or malformed output.
- Review skill changes against those checks before treating the skill as ready.
## Security Review
- Treat third-party skills like untrusted code until reviewed.
- Inspect scripts, dependencies, references, secret handling, network calls, and destructive commands before use.
- Prefer local, project-scoped assets and dependencies; document any external dependency the skill requires.
+2 -2
View File
@@ -3,7 +3,7 @@ name: autodev
description: |
Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
Detects current project state from _docs/ folder, resumes from where it left off, and flows through
problem → research → plan → test specs → decompose → implement → tests → docs sync → deploy without manual skill invocation.
problem → research → plan → decompose → implement → deploy without manual skill invocation.
Maximizes work per conversation by auto-transitioning between skills.
Trigger phrases:
- "autodev", "auto", "start", "continue"
@@ -52,7 +52,7 @@ Determine which flow to use (check in order — first match wins):
After selecting the flow, apply its detection rules (first match wins) to determine the current step.
**Note**: the meta-repo flow uses a different artifact layout — its source of truth is `_docs/_repo-config.yaml`, not `_docs/NN_*/` folders. After Step 2.5 it also produces `_docs/glossary.md` and a `## Architecture Vision` section in the cross-cutting architecture doc identified by `docs.cross_cutting`. Other detection rules assume the BUILD-SHIP artifact layout; they don't apply to meta-repos.
**Note**: the meta-repo flow uses a different artifact layout — its source of truth is `_docs/_repo-config.yaml`, not `_docs/NN_*/` folders. Other detection rules assume the BUILD-SHIP artifact layout; they don't apply to meta-repos.
## Execution Loop
@@ -13,7 +13,7 @@ A first-time run executes Phase A then Phase B; every subsequent invocation re-e
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 1 | Document | document/SKILL.md | Steps 07 incl. inline 2.5 (module-layout) and 4.5 (glossary + arch vision) |
| 1 | Document | document/SKILL.md | Steps 18 |
| 2 | Architecture Baseline Scan | code-review/SKILL.md (baseline mode) | Phase 1 + Phase 7 |
| 3 | Test Spec | test-spec/SKILL.md | Phases 14 |
| 4 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 07 (conditional) |
@@ -53,8 +53,6 @@ Action: An existing codebase without documentation was detected. Read and execut
The document skill's Step 2.5 produces `_docs/02_document/module-layout.md`, which is required by every downstream step that assigns file ownership (`/implement` Step 4, `/code-review` Phase 7, `/refactor` discovery). If this file is missing after Step 1 completes (e.g., a pre-existing `_docs/` dir predates the 2.5 addition), re-invoke `/document` in resume mode — it will pick up at Step 2.5.
The document skill's Step 4.5 produces `_docs/02_document/glossary.md` and prepends a confirmed `## Architecture Vision` section to `architecture.md`. Both are user-confirmed artifacts; downstream skills (refactor, decompose, new-task) treat them as authoritative for terminology and structural intent. If `glossary.md` is missing after Step 1 (pre-existing `_docs/` dir from before the 4.5 addition), re-invoke `/document` in resume mode — it will pick up at Step 4.5 without redoing module/component analysis.
---
**Step 2 — Architecture Baseline Scan**
@@ -152,17 +150,15 @@ If `_docs/02_tasks/` subfolders have some task files already (e.g., refactoring
---
**Step 6 — Implement Tests**
Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
Condition (folder fallback): `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
State-driven: reached by auto-chain from Step 5.
Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.
Action: Read and execute `.cursor/skills/implement/SKILL.md`
The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.
The implement skill reads test tasks from `_docs/02_tasks/todo/` and implements them.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
---
**Step 7 — Run Tests**
+65 -217
View File
@@ -1,6 +1,6 @@
# Greenfield Workflow
Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement + Product Completeness Gate → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
## Step Reference Table
@@ -10,19 +10,13 @@ Workflow for new projects built from scratch. Flows linearly: Problem → Resear
| 2 | Research | research/SKILL.md | Mode A: Phase 14 · Mode B: Step 08 |
| 3 | Plan | plan/SKILL.md | Step 16 + Final |
| 4 | UI Design | ui-design/SKILL.md | Phase 08 (conditional — UI projects only) |
| 5 | Test Spec | test-spec/SKILL.md | Phases 14 |
| 6 | Decompose | decompose/SKILL.md (implementation task decomposition) | Step 1 + Step 1.5 + Step 2 + Step 4 |
| 7 | Implement | implement/SKILL.md | Batch loop + Product Implementation Completeness Gate |
| 8 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 07 (conditional) |
| 9 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
| 10 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 11 | Run Tests | test-run/SKILL.md | Steps 14 |
| 12 | Test-Spec Sync | test-spec/SKILL.md (cycle-update mode) | Phase 2 + Phase 3 (scoped) |
| 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 05 |
| 14 | Security Audit | security/SKILL.md | Phase 15 (optional) |
| 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 15 (optional) |
| 16 | Deploy | deploy/SKILL.md | Step 17 |
| 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 14 |
| 5 | Decompose | decompose/SKILL.md | Step 14 |
| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 7 | Run Tests | test-run/SKILL.md | Steps 14 |
| 8 | Security Audit | security/SKILL.md | Phase 15 (optional) |
| 9 | Performance Test | test-run/SKILL.md (perf mode) | Steps 15 (optional) |
| 10 | Deploy | deploy/SKILL.md | Step 17 |
| 11 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 14 |
## Detection Rules
@@ -86,12 +80,12 @@ If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FIN
---
**Step 4 — UI Design (conditional)**
Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_tasks/todo/` does not exist or has no task files.
State-driven: reached by auto-chain from Step 3.
Action: Read and execute `.cursor/skills/ui-design/SKILL.md`. The skill runs its own **Applicability Check**, which handles UI project detection and the user's A/B choice. It returns one of:
- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Test Spec).
- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Decompose).
- `outcome: skipped, reason: not-a-ui-project` → mark Step 4 as `skipped`, auto-chain to Step 5.
- `outcome: skipped, reason: user-declined` → mark Step 4 as `skipped`, auto-chain to Step 5.
@@ -99,162 +93,34 @@ The autodev no longer inlines UI detection heuristics — they live in `ui-desig
---
**Step 5 — Test Spec**
Condition (folder fallback): `_docs/02_document/FINAL_report.md` exists AND `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
State-driven: reached by auto-chain from Step 4 (completed or skipped).
**Step 5 — Decompose**
Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/todo/` does not exist or has no task files
Action: Read and execute `.cursor/skills/test-spec/SKILL.md`.
This step converts the greenfield problem statement, acceptance criteria, solution, architecture, component docs, and UI design artifacts (if any) into test specifications before implementation begins. The test spec should cover unit, integration, blackbox, and e2e scenarios where those levels are applicable to the project.
---
**Step 6 — Decompose**
Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_document/tests/traceability-matrix.md` exists AND `_docs/02_tasks/todo/` does not exist or has no implementation task files.
Action: Invoke `.cursor/skills/decompose/SKILL.md` for **implementation task decomposition**. The greenfield flow selects the implementation entrypoint before handing off: Bootstrap Structure, Module Layout, Component Task Decomposition, and Cross-Task Verification.
Do not invoke Blackbox Test Task Decomposition from Step 6. Test tasks are intentionally deferred to Step 9 (Decompose Tests) so the first implementation batch stays focused on product functionality and Step 8 can revise testability before test task files exist.
Action: Read and execute `.cursor/skills/decompose/SKILL.md`
If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it.
---
**Step 7 — Implement**
Condition: `_docs/02_tasks/todo/` contains implementation task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain a valid product implementation report.
**Step 6 — Implement**
Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any `implementation_report_*.md` file
Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Product implementation**.
The implement skill must run its **Product Implementation Completeness Gate** before it writes any final product implementation report. This gate compares completed product task specs, architecture/component promises, and actual source code so scaffold-only implementations cannot advance to Step 8. A final product implementation report without `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` is incomplete and must not be treated as Step 7 completion.
Action: Read and execute `.cursor/skills/implement/SKILL.md`
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.
For folder fallback, **implementation task files** means task specs that are not test-only specs: exclude `*_test_infrastructure.md` and task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
For folder fallback, a **product implementation report** is any `_docs/03_implementation/implementation_report_*.md` file except `_docs/03_implementation/implementation_report_tests.md` and refactor reports. It is valid for greenfield progression only when:
- the matching `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists,
- that completeness report does not contain unresolved `FAIL` classifications, and
- `_docs/02_tasks/todo/` contains no pending implementation task files.
If a product report exists but any of those validity checks fail, treat product implementation as incomplete and stay in Step 7.
---
**Step 8Code Testability Revision**
Condition (folder fallback): `_docs/03_implementation/` contains a valid product implementation report, `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications, `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` does not exist, `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` does not exist, `_docs/03_implementation/implementation_report_tests.md` does not exist, and `_docs/02_tasks/todo/` does not contain test task files.
State-driven: reached by auto-chain from Step 7.
**Purpose**: verify the newly built code can be exercised by the planned tests before writing the test suite. Greenfield code should be testable by design; this step catches accidental hardcoded paths, singletons, direct external service construction, or other implementation choices that would make meaningful tests impossible.
**Scope — MINIMAL, SURGICAL fixes**: this is not a general refactor. It is the smallest set of changes required to make the implemented code runnable under tests.
**Allowed changes** in this phase:
- Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
- Extract narrow interfaces for components that need stubbing in tests.
- Add optional constructor parameters for dependency injection; default to the existing behavior so callers do not break.
- Wrap global singletons in thin accessors that tests can override.
- Split a function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
**NOT allowed** in this phase (defer to a later refactor task):
- Renaming public APIs.
- Moving code between files unless strictly required for isolation.
- Changing algorithms or business logic.
- Restructuring module boundaries or rewriting layers.
Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`.
2. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
- Hardcoded file paths or directory references
- Hardcoded configuration values (URLs, credentials, magic numbers)
- Global mutable state that cannot be overridden
- Tight coupling to external services without abstraction
- Missing dependency injection or non-configurable parameters
- Direct file system operations without path configurability
- Inline construction of heavy dependencies (models, clients)
3. If ALL scenarios are testable as-is:
- Create `_docs/04_refactoring/01-testability-refactoring/`
- Write `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` with the scenarios reviewed and outcome "Code is testable — no changes needed"
- Mark Step 8 as `completed` with outcome "Code is testable — no changes needed"
- Auto-chain to Step 9 (Decompose Tests)
4. If testability issues are found:
- Create `_docs/04_refactoring/01-testability-refactoring/`
- Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
- **Mode**: `guided`
- **Source**: `autodev-greenfield-testability-analysis`
- One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred refactor candidates" instead.
- Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
- Phase 3 (Safety Net) is skipped for this testability run because the test suite has not been implemented yet
- After execution, surface `RUN_DIR/testability_changes_summary.md` to the user via the Choose format (accept / request follow-up) before auto-chaining
- Copy or save the accepted summary as `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` so folder fallback can detect Step 8 completion
- Mark Step 8 as `completed`
- Auto-chain to Step 9 (Decompose Tests)
---
**Step 9 — Decompose Tests**
Condition (folder fallback): `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND `_docs/03_implementation/` contains a valid product implementation report AND `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications AND (`_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` exists OR `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` exists) AND (`_docs/02_tasks/todo/` does not exist or has no test task files) AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
State-driven: reached by auto-chain from Step 8.
Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
1. Run Step 1t (test infrastructure bootstrap)
2. Run Step 3 (blackbox/e2e-capable test task decomposition)
3. Run Step 4 (cross-verification against test coverage)
If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it — it appends test tasks alongside existing completed implementation tasks.
---
**Step 10 — Implement Tests**
Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
State-driven: reached by auto-chain from Step 9.
Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.
The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed test tasks and continues.
For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
---
**Step 11 — Run Tests**
Condition (folder fallback): `_docs/03_implementation/implementation_report_tests.md` exists.
State-driven: reached by auto-chain from Step 10.
**Step 7Run Tests**
Condition (folder fallback): `_docs/03_implementation/` contains an `implementation_report_*.md` file.
State-driven: reached by auto-chain from Step 6.
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
Verifies the implemented unit, integration, blackbox, and e2e tests pass before proceeding to spec and documentation sync.
---
**Step 12Test-Spec Sync**
State-driven: reached by auto-chain from Step 11. Requires `_docs/02_document/tests/traceability-matrix.md` to exist — if missing, mark Step 12 `skipped` (see Action below).
Action: Read and execute `.cursor/skills/test-spec/SKILL.md` in **cycle-update mode**. Pass the completed implementation task specs, completed test task specs, and implementation reports as inputs.
The skill appends implementation-learned acceptance criteria, scenarios, and NFR updates to the existing test-spec files without rewriting unaffected sections. If `traceability-matrix.md` is missing, mark Step 12 as `skipped` — the next `/test-spec` full run will regenerate it.
After completion, auto-chain to Step 13 (Update Docs).
---
**Step 13 — Update Docs**
State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires `_docs/02_document/` to contain existing documentation — if missing, mark Step 13 `skipped` (see Action below).
Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all completed implementation and test task spec files plus the implementation reports.
The document skill in Task mode updates affected module docs, component docs, system-level docs, and test documentation without redoing full discovery, verification, or problem extraction.
If `_docs/02_document/` does not contain existing docs, mark Step 13 as `skipped`.
After completion, auto-chain to Step 14 (Security Audit).
---
**Step 14 — Security Audit (optional)**
State-driven: reached by auto-chain from Step 13 (completed or skipped).
**Step 8Security Audit (optional)**
State-driven: reached by auto-chain from Step 7.
Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
- question: `Run security audit before deploy?`
@@ -262,12 +128,12 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
- option-b-label: `Skip — proceed directly to deploy`
- recommendation: `A — catches vulnerabilities before production`
- target-skill: `.cursor/skills/security/SKILL.md`
- next-step: Step 15 (Performance Test)
- next-step: Step 9 (Performance Test)
---
**Step 15 — Performance Test (optional)**
State-driven: reached by auto-chain from Step 14 (completed or skipped).
**Step 9 — Performance Test (optional)**
State-driven: reached by auto-chain from Step 8.
Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
- question: `Run performance/load tests before deploy?`
@@ -275,30 +141,30 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
- option-b-label: `Skip — proceed directly to deploy`
- recommendation: `A or B — base on whether acceptance criteria include latency, throughput, or load requirements`
- target-skill: `.cursor/skills/test-run/SKILL.md` in **perf mode** (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
- next-step: Step 16 (Deploy)
- next-step: Step 10 (Deploy)
---
**Step 16 — Deploy**
State-driven: reached by auto-chain from Step 15 (after Step 15 is completed or skipped).
**Step 10 — Deploy**
State-driven: reached by auto-chain from Step 9 (after Step 9 is completed or skipped).
Action: Read and execute `.cursor/skills/deploy/SKILL.md`.
After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 17 (Retrospective).
After the deploy skill completes successfully, mark Step 10 as `completed` and auto-chain to Step 11 (Retrospective).
---
**Step 17 — Retrospective**
State-driven: reached by auto-chain from Step 16.
**Step 11 — Retrospective**
State-driven: reached by auto-chain from Step 10.
Action: Read and execute `.cursor/skills/retrospective/SKILL.md` in **cycle-end mode**. This closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` and appending the top-3 lessons to `_docs/LESSONS.md`.
After retrospective completes, mark Step 17 as `completed` and enter "Done" evaluation.
After retrospective completes, mark Step 11 as `completed` and enter "Done" evaluation.
---
**Done**
State-driven: reached by auto-chain from Step 17. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
State-driven: reached by auto-chain from Step 11. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
Action: Report project completion with summary. Then **rewrite the state file** so the next `/autodev` invocation enters the feature-cycle loop in the existing-code flow:
@@ -325,65 +191,47 @@ On the next invocation, Flow Resolution rule 1 reads `flow: existing-code` and r
| Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
| Research Decision → proceed | Auto-chain → Plan (3) |
| Plan (3) | Auto-chain → UI Design detection (4) |
| UI Design (4, done or skipped) | Auto-chain → Test Spec (5) |
| Test Spec (5) | Auto-chain → Decompose (6) |
| Decompose (6) | **Session boundary** — suggest new conversation before Implement |
| Implement (7) | Auto-chain only after Product Implementation Completeness Gate passes → Code Testability Revision (8) |
| Code Testability Revision (8) | Auto-chain → Decompose Tests (9) |
| Decompose Tests (9) | **Session boundary** — suggest new conversation before Implement Tests |
| Implement Tests (10) | Auto-chain → Run Tests (11) |
| Run Tests (11, all pass) | Auto-chain → Test-Spec Sync (12) |
| Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
| Update Docs (13, done or skipped) | Auto-chain → Security Audit choice (14) |
| Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
| Deploy (16) | Auto-chain → Retrospective (17) |
| Retrospective (17) | Report completion; rewrite state to existing-code flow, step 9 |
| UI Design (4, done or skipped) | Auto-chain → Decompose (5) |
| Decompose (5) | **Session boundary** — suggest new conversation before Implement |
| Implement (6) | Auto-chain → Run Tests (7) |
| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) |
| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) |
| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) |
| Deploy (10) | Auto-chain → Retrospective (11) |
| Retrospective (11) | Report completion; rewrite state to existing-code flow, step 9 |
## Status Summary — Step List
Flow name: `greenfield`. Render using the banner template in `protocols.md` → "Banner Template (authoritative)". No header-suffix, current-suffix, or footer-extras — all empty for this flow.
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|-----------------------------|--------------------------------------------|
| 1 | Problem | — |
| 2 | Research | `DONE (N drafts)` |
| 3 | Plan | — |
| 4 | UI Design | — |
| 5 | Test Spec | — |
| 6 | Decompose | `DONE (N tasks)` |
| 7 | Implement | `IN PROGRESS (batch M of ~N)` |
| 8 | Code Testability Revision | — |
| 9 | Decompose Tests | `DONE (N tasks)` |
| 10 | Implement Tests | `IN PROGRESS (batch M)` |
| 11 | Run Tests | `DONE (N passed, M failed)` |
| 12 | Test-Spec Sync | — |
| 13 | Update Docs | — |
| 14 | Security Audit | — |
| 15 | Performance Test | — |
| 16 | Deploy | — |
| 17 | Retrospective | — |
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|--------------------|--------------------------------------------|
| 1 | Problem | — |
| 2 | Research | `DONE (N drafts)` |
| 3 | Plan | — |
| 4 | UI Design | — |
| 5 | Decompose | `DONE (N tasks)` |
| 6 | Implement | `IN PROGRESS (batch M of ~N)` |
| 7 | Run Tests | `DONE (N passed, M failed)` |
| 8 | Security Audit | — |
| 9 | Performance Test | — |
| 10 | Deploy | — |
| 11 | Retrospective | — |
All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 12, 13, 14, 15 additionally accept `SKIPPED`.
All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 8, 9 additionally accept `SKIPPED`.
Row rendering format (step-number column is right-padded to 2 characters for alignment):
```
Step 1 Problem [<state token>]
Step 2 Research [<state token>]
Step 3 Plan [<state token>]
Step 4 UI Design [<state token>]
Step 5 Test Spec [<state token>]
Step 6 Decompose [<state token>]
Step 7 Implement [<state token>]
Step 8 Code Testability Rev. [<state token>]
Step 9 Decompose Tests [<state token>]
Step 10 Implement Tests [<state token>]
Step 11 Run Tests [<state token>]
Step 12 Test-Spec Sync [<state token>]
Step 13 Update Docs [<state token>]
Step 14 Security Audit [<state token>]
Step 15 Performance Test [<state token>]
Step 16 Deploy [<state token>]
Step 17 Retrospective [<state token>]
Step 1 Problem [<state token>]
Step 2 Research [<state token>]
Step 3 Plan [<state token>]
Step 4 UI Design [<state token>]
Step 5 Decompose [<state token>]
Step 6 Implement [<state token>]
Step 7 Run Tests [<state token>]
Step 8 Security Audit [<state token>]
Step 9 Performance Test [<state token>]
Step 10 Deploy [<state token>]
Step 11 Retrospective [<state token>]
```
+22 -159
View File
@@ -15,10 +15,8 @@ This flow differs fundamentally from `greenfield` and `existing-code`:
|------|------|-----------|-------------------|
| 1 | Discover | monorepo-discover/SKILL.md | Phase 110 |
| 2 | Config Review | (human checkpoint, no sub-skill) | — |
| 2.5 | Glossary & Architecture Vision | (inline, no sub-skill) | Steps 15 |
| 3 | Status | monorepo-status/SKILL.md | Sections 15 |
| 4 | Document Sync | monorepo-document/SKILL.md | Phase 17 (conditional on doc drift) |
| 4.5 | Integration Test Sync | monorepo-e2e/SKILL.md | Phase 16 (conditional on suite-e2e drift; skipped if `suite_e2e:` block absent in config) |
| 5 | CICD Sync | monorepo-cicd/SKILL.md | Phase 17 (conditional on CI drift) |
| 6 | Loop | (auto-return to Step 3 on next invocation) | — |
@@ -60,121 +58,17 @@ Action: This is a **hard session boundary**. The skill cannot proceed until a hu
══════════════════════════════════════
```
- If user picks A → verify `confirmed_by_user: true` is now set in the config. If still `false`, re-ask. If true, auto-chain to **Step 2.5 (Glossary & Architecture Vision)**.
- If user picks A → verify `confirmed_by_user: true` is now set in the config. If still `false`, re-ask. If true, auto-chain to **Step 3 (Status)**.
- If user picks B → mark Step 2 as `in_progress`, update state file, end the session. Tell the user to invoke `/autodev` again after reviewing.
**Do NOT auto-flip `confirmed_by_user`.** Only the human does that.
---
**Step 2.5 — Glossary & Architecture Vision** (one-shot)
Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true` AND (`_docs/glossary.md` does NOT exist OR the cross-cutting architecture doc identified in `docs.cross_cutting` does NOT contain a `## Architecture Vision` section).
State-driven: reached by auto-chain from Step 2 (user picked A).
**Goal**: Capture meta-repo-wide terminology and the user's architecture vision **once**, after the config is confirmed but before any sync skill runs. Without this, `monorepo-document` will faithfully propagate per-component changes but never surface a unified mental model of the meta-repo to the user, and the AI will keep re-inferring the same project terminology on every invocation.
**Why inline (no sub-skill)**: `monorepo-discover` is hard-guarded to write only `_repo-config.yaml`; `monorepo-document` only edits *existing* docs. Glossary and architecture-vision creation is a first-time, user-confirmed write that crosses both guarantees, so it lives directly in the flow.
**Inputs**:
- `_docs/_repo-config.yaml` (component list, doc map, conventions, assumptions log)
- Cross-cutting docs listed under `docs.cross_cutting` (existing architecture doc, if any)
- Each component's `primary_doc` (read-only, for terminology + responsibility extraction)
- Root `README.md` if `repo.root_readme` is referenced
**Outputs**:
- `_docs/glossary.md` (or `<docs.root>/glossary.md` if `docs.root``_docs/`) — NEW
- The cross-cutting architecture doc updated in place: a `## Architecture Vision` section is prepended (or merged into an existing "Vision" / "Overview" heading)
- One new entry appended to `_docs/_repo-config.yaml` under `assumptions_log:` recording the run
- A new top-level config entry: `glossary_doc: <path>` so future `monorepo-status` and `monorepo-document` runs treat the glossary as a known cross-cutting doc
**Procedure**:
1. **Draft glossary** from `_repo-config.yaml` + each component's primary doc. Include:
- Component codenames as they appear in the config (`name` field) and any rename pairs the user noted in `unresolved:` resolutions
- Domain terms that recur across ≥2 component docs
- Acronyms / abbreviations
- Convention names from `conventions:` (e.g., commit prefix, deployment tier names)
- Stakeholder personas if cross-cutting docs reference them
Each entry: one-line definition + source (`source: components.<name>.primary_doc` or `source: _repo-config.yaml conventions`). Skip generic terms.
2. **Draft architecture vision** from the meta-repo perspective:
- **One paragraph**: what the system as a whole is, what each component contributes, the runtime topology (one binary / N services / N clients + 1 server / hybrid), how components communicate (REST / gRPC / queue / DB-shared / file-shared)
- **Components & responsibilities** (one-line each), pulled directly from `_repo-config.yaml` `components:` list
- **Cross-cutting concerns ownership**: which doc owns which concern (auth, schema, deployment, etc.) — pulled from `docs.cross_cutting[].owns`
- **Architectural principles / non-negotiables** the user has implied across components (e.g., "all components share a single Postgres", "submodules own their own CI", "deployment is per-tier, not per-component")
- **Open questions / structural drift signals**: components missing from `docs.cross_cutting`, components in registry but not in config (registry mismatch), or contradictions between component primary docs
3. **Present condensed view** to the user (NOT the full draft files):
```
══════════════════════════════════════
REVIEW: Meta-Repo Glossary + Architecture Vision
══════════════════════════════════════
Glossary (N terms drafted from config + component docs):
- <Term>: <one-line definition>
- ...
Architecture Vision — meta-repo level:
<one-paragraph synopsis>
Components / responsibilities:
- <component>: <one-line>
- ...
Cross-cutting ownership:
- <concern> → <doc>
- ...
Principles / non-negotiables:
- <principle>
- ...
Open questions / drift signals:
- <q1>
- <q2>
══════════════════════════════════════
A) Looks correct — write the files
B) Add / correct entries (provide diffs)
C) Resolve open questions / drift signals first
══════════════════════════════════════
Recommendation: pick C if drift signals exist;
otherwise B if components or principles
don't match your intent; A only when
the inferred vision is exactly right.
══════════════════════════════════════
```
4. **Iterate**:
- On B → integrate the user's diffs/additions, re-present, loop until A.
- On C → ask the listed open questions in one batch, integrate answers, re-present.
- **Do NOT proceed to step 5 until the user picks A.**
5. **Save**:
- Write `_docs/glossary.md` (alphabetical) with `**Status**: confirmed-by-user` + date.
- Update the cross-cutting architecture doc identified in `docs.cross_cutting` (or create one at `_docs/00_architecture.md` if none exists and the user's option-B input named one): prepend `## Architecture Vision` with the confirmed paragraph + components + ownership + principles. Preserve every existing H2 below verbatim.
- Append to `_docs/_repo-config.yaml`:
- Top-level `glossary_doc: <path-relative-to-repo-root>` (sibling of `docs.root`)
- New `assumptions_log:` entry: `{ date: <today>, skill: autodev-meta-repo Step 2.5, run_notes: "Captured glossary + architecture vision", assumptions: [...] }`
- Do NOT flip any `confirmed: false` → `confirmed: true` in the config; this step writes its own confirmed artifact, it does not retroactively confirm config inferences.
**Self-verification**:
- [ ] Every glossary entry traces to either the config or a component primary doc
- [ ] Every component listed in the vision matches a `components:` entry in the config
- [ ] All open questions are answered or explicitly deferred (with the user's acknowledgement)
- [ ] The cross-cutting architecture doc still contains every H2 it had before this step
- [ ] User picked option A on the latest condensed view
**Idempotency**: if both `_docs/glossary.md` exists AND the architecture doc already has a `## Architecture Vision` section, this step is **skipped on re-invocation**. To refresh, the user invokes `/autodev` after deleting `glossary.md` (or running `monorepo-discover` with structural changes that justify a re-confirmation).
After completion, auto-chain to **Step 3 (Status)**.
---
**Step 3 — Status**
Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true` AND (`_docs/glossary.md` exists OR `glossary_doc:` is recorded in the config).
State-driven: reached by auto-chain from Step 2.5, or entered on any re-invocation after a completed cycle.
Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true`.
State-driven: reached by auto-chain from Step 2 (user picked A), or entered on any re-invocation after a completed cycle.
Action: Read and execute `.cursor/skills/monorepo-status/SKILL.md`.
@@ -221,28 +115,6 @@ The skill:
3. Applies doc edits
4. Skips any component with unconfirmed mapping (M5), reports
After completion:
- If the status report ALSO flagged suite-e2e drift → auto-chain to **Step 4.5 (Integration Test Sync)**
- Else if the status report ALSO flagged CI drift → auto-chain to **Step 5 (CICD Sync)**
- Else → end cycle, report done
---
**Step 4.5 — Integration Test Sync**
State-driven: reached by auto-chain from Step 3 (when status report flagged suite-e2e drift and no doc drift) or from Step 4 (when both doc and suite-e2e drift were flagged).
**Skip condition**: if `_docs/_repo-config.yaml` has no `suite_e2e:` block, this step is skipped entirely — there's no harness to sync. The status report should not flag suite-e2e drift in that case; if it does, that's a status-skill bug.
Action: Read and execute `.cursor/skills/monorepo-e2e/SKILL.md` with scope = components flagged by status.
The skill:
1. Verifies every path under `suite_e2e.*` exists (binary fixtures excepted — see the skill's Phase 1)
2. Classifies each flagged change against the suite-e2e impact table
3. Applies edits to `e2e/docker-compose.suite-e2e.yml`, `e2e/fixtures/init.sql`, `e2e/fixtures/expected_detections.json` metadata, and `e2e/runner/tests/*.spec.ts` selectors as needed
4. Bumps baseline `fixture_version` with a `-stale` suffix and appends a `_docs/_process_leftovers/` entry whenever the detection model revision changes (binary fixture cannot be regenerated automatically)
5. Reports synced files; does not run the suite e2e itself
After completion:
- If the status report ALSO flagged CI drift → auto-chain to **Step 5 (CICD Sync)**
- Else → end cycle, report done
@@ -251,11 +123,11 @@ After completion:
**Step 5 — CICD Sync**
State-driven: reached by auto-chain from Step 3 (when status report flagged CI drift and no doc/suite-e2e drift), Step 4, or Step 4.5.
State-driven: reached by auto-chain from Step 3 (when status report flagged CI drift and no doc drift) or from Step 4 (when both doc and CI drift were flagged).
Action: Read and execute `.cursor/skills/monorepo-cicd/SKILL.md` with scope = components flagged by status.
After completion, end cycle. Report files updated across doc, suite-e2e, and CI sync.
After completion, end cycle. Report files updated across both doc and CI sync.
---
@@ -284,19 +156,14 @@ After onboarding completes, the config is updated. Auto-chain back to **Step 3 (
| Completed Step | Next Action |
|---------------|-------------|
| Discover (1) | Auto-chain → Config Review (2) |
| Config Review (2, user picked A, confirmed_by_user: true) | Auto-chain → Glossary & Architecture Vision (2.5) |
| Config Review (2, user picked A, confirmed_by_user: true) | Auto-chain → Status (3) |
| Config Review (2, user picked B) | **Session boundary** — end session, await re-invocation |
| Glossary & Architecture Vision (2.5) | Auto-chain → Status (3) |
| Status (3, doc drift) | Auto-chain → Document Sync (4) |
| Status (3, suite-e2e drift only) | Auto-chain → Integration Test Sync (4.5) |
| Status (3, CI drift only) | Auto-chain → CICD Sync (5) |
| Status (3, no drift) | **Cycle complete** — end session, await re-invocation |
| Status (3, registry mismatch) | Ask user (A: discover, B: onboard, C: continue) |
| Document Sync (4) + suite-e2e drift pending | Auto-chain → Integration Test Sync (4.5) |
| Document Sync (4) + CI drift only pending | Auto-chain → CICD Sync (5) |
| Document Sync (4) + no further drift | **Cycle complete** |
| Integration Test Sync (4.5) + CI drift pending | Auto-chain → CICD Sync (5) |
| Integration Test Sync (4.5) + no CI drift | **Cycle complete** |
| Document Sync (4) + CI drift pending | Auto-chain → CICD Sync (5) |
| Document Sync (4) + no CI drift | **Cycle complete** |
| CICD Sync (5) | **Cycle complete** |
## Status Summary — Step List
@@ -311,33 +178,29 @@ Flow-specific slot values:
Config: _docs/_repo-config.yaml [confirmed_by_user: <true|false>, last_updated: <date>]
```
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|------------------------------------|--------------------------------------------|
| 1 | Discover | — |
| 2 | Config Review | `IN PROGRESS (awaiting human)` |
| 2.5 | Glossary & Architecture Vision | `SKIPPED (already captured)` |
| 3 | Status | `DONE (no drift)`, `DONE (N drifts)` |
| 4 | Document Sync | `DONE (N docs)`, `SKIPPED (no doc drift)` |
| 4.5 | Integration Test Sync | `DONE (N files)`, `SKIPPED (no suite-e2e drift)`, `SKIPPED (no suite_e2e config block)` |
| 5 | CICD Sync | `DONE (N files)`, `SKIPPED (no CI drift)` |
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|------------------|--------------------------------------------|
| 1 | Discover | — |
| 2 | Config Review | `IN PROGRESS (awaiting human)` |
| 3 | Status | `DONE (no drift)`, `DONE (N drifts)` |
| 4 | Document Sync | `DONE (N docs)`, `SKIPPED (no doc drift)` |
| 5 | CICD Sync | `DONE (N files)`, `SKIPPED (no CI drift)` |
All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2.5, 4, 4.5, and 5 additionally accept `SKIPPED`.
All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4 and 5 additionally accept `SKIPPED`.
Row rendering format:
```
Step 1 Discover [<state token>]
Step 2 Config Review [<state token>]
Step 2.5 Glossary & Architecture Vision [<state token>]
Step 3 Status [<state token>]
Step 4 Document Sync [<state token>]
Step 4.5 Integration Test Sync [<state token>]
Step 5 CICD Sync [<state token>]
Step 1 Discover [<state token>]
Step 2 Config Review [<state token>]
Step 3 Status [<state token>]
Step 4 Document Sync [<state token>]
Step 5 CICD Sync [<state token>]
```
## Notes for the meta-repo flow
- **No session boundary except Step 2 and Step 2.5**: unlike existing-code flow (which has boundaries around decompose), meta-repo flow only pauses at config review and the one-shot glossary/vision capture. Once both are confirmed, syncing is fast enough to complete in one session and Step 2.5 idempotently no-ops on every subsequent invocation.
- **No session boundary except Step 2**: unlike existing-code flow (which has boundaries around decompose), meta-repo flow only pauses at config review. Syncing is fast enough to complete in one session.
- **Cyclical, not terminal**: no "done forever" state. Each invocation completes a drift cycle; next invocation starts fresh.
- **No tracker integration**: this flow does NOT create Jira/ADO tickets. Maintenance is not a feature — if a feature-level ticket spans the meta-repo's concerns, it lives in the per-component workspace.
- **Onboarding is opt-in**: never auto-onboarded. User must explicitly request.
+3 -4
View File
@@ -110,8 +110,7 @@ Before entering a step from this table for the first time in a session, verify t
| Flow | Step | Sub-Step | Tracker Action |
|------|------|----------|----------------|
| greenfield | Plan | Step 6 — Epics | Create epics for each component |
| greenfield | Decompose | Implementation decomposition Step 1 + Step 2Product tasks | Create ticket per product task, link to epic |
| greenfield | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
| greenfield | Decompose | Step 1 + Step 2 + Step 3All tasks | Create ticket per task, link to epic |
| existing-code | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
| existing-code | New Task | Step 7 — Ticket | Create ticket per task, link to epic |
@@ -139,7 +138,7 @@ One retry ladder covers all failure modes: explicit failure returned by a sub-sk
Treat the sub-skill as **failed** when ANY of the following is observed:
- The sub-skill explicitly returns a failed result (including blocked tasks, auto-fix loop exhaustion, prerequisite violations).
- The sub-skill explicitly returns a failed result (including blocked subagents, auto-fix loop exhaustion, prerequisite violations).
- **Stuck signals**: the same artifact is rewritten 3+ times without meaningful change; the sub-skill re-asks a question that was already answered; no new artifact has been saved despite active execution.
### Retry ladder
@@ -292,7 +291,7 @@ For steps that produce `_docs/` artifacts (problem, research, plan, decompose, d
## Debug Protocol
When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or a task reports a blocker, the user is asked to intervene. This protocol guides the debugging process. (Retry budget and escalation are covered by Failure Handling above; this section is about *how* to diagnose once the user has been looped in.)
When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or an implementer subagent reports a blocker, the user is asked to intervene. This protocol guides the debugging process. (Retry budget and escalation are covered by Failure Handling above; this section is about *how* to diagnose once the user has been looped in.)
### Structured Debugging Workflow
+1 -1
View File
@@ -13,7 +13,7 @@ The autodev persists its position to `_docs/_autodev_state.md`. This is a lightw
## Current Step
flow: [greenfield | existing-code | meta-repo]
step: [1-17 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
step: [1-11 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
name: [step name from the active flow's Step Reference Table]
status: [not_started / in_progress / completed / skipped / failed]
sub_step:
+2 -2
View File
@@ -209,7 +209,7 @@ Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope, Architectur
The `/implement` skill invokes this skill after each batch completes:
1. Collects changed files from all tasks implemented in the batch
1. Collects changed files from all implementer agents in the batch
2. Passes task spec paths + changed files to this skill
3. If verdict is FAIL — presents findings to user (BLOCKING), user fixes or confirms
4. If verdict is PASS or PASS_WITH_WARNINGS — proceeds automatically (findings shown as info)
@@ -221,7 +221,7 @@ The `/implement` skill invokes this skill after each batch completes:
| Input | Type | Source | Required |
|-------|------|--------|----------|
| `task_specs` | list of file paths | Task `.md` files from `_docs/02_tasks/todo/` for the current batch | Yes |
| `changed_files` | list of file paths | Files modified by the tasks in the batch (from `git diff`) | Yes |
| `changed_files` | list of file paths | Files modified by implementer agents (from `git diff` or agent reports) | Yes |
| `batch_number` | integer | Current batch number (for report naming) | Yes |
| `project_restrictions` | file path | `_docs/00_problem/restrictions.md` | If exists |
| `solution_overview` | file path | `_docs/01_solution/solution.md` | If exists |
+24 -27
View File
@@ -2,8 +2,8 @@
name: decompose
description: |
Decompose planned components into atomic implementable tasks with bootstrap structure plan.
Workflow entrypoints: implementation task decomposition, single component decomposition, and tests-only decomposition.
The invoking flow decides which entrypoint to run; this skill executes that selected sequence.
4-step workflow: bootstrap structure plan, component task decomposition, blackbox test task decomposition, and cross-task verification.
Supports full decomposition (_docs/ structure), single component mode, and tests-only mode.
Trigger phrases:
- "decompose", "decompose features", "feature decomposition"
- "task decomposition", "break down components"
@@ -20,7 +20,7 @@ Decompose planned components into atomic, implementable task specs with a bootst
## Core Principles
- **Atomic tasks**: each task does one thing; if it exceeds 5 complexity points, split it
- **Atomic tasks**: each task does one thing; if it exceeds 8 complexity points, split it
- **Behavioral specs, not implementation plans**: describe what the system should do, not how to build it
- **Flat structure**: all tasks are tracker-ID-prefixed files in TASKS_DIR — no component subdirectories
- **Save immediately**: write artifacts to disk after each task; never accumulate unsaved work
@@ -30,15 +30,14 @@ Decompose planned components into atomic, implementable task specs with a bootst
## Context Resolution
Resolve the selected entrypoint from the invocation context before any other logic runs. The caller decides whether this is implementation, single component, or tests-only decomposition; this skill only executes the selected sequence.
Determine the operating mode based on invocation before any other logic runs.
**Implementation task decomposition** (default; selected by flows before invoking this skill):
**Default** (no explicit input file provided):
- DOCUMENT_DIR: `_docs/02_document/`
- TASKS_DIR: `_docs/02_tasks/`
- TASKS_TODO: `_docs/02_tasks/todo/`
- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, DOCUMENT_DIR
- Produces only implementation tasks. Blackbox/e2e test task files are produced only when the invoking flow selects tests-only decomposition.
**Single component mode** (provided file is within `_docs/02_document/` and inside a `components/` subdirectory):
@@ -56,24 +55,24 @@ Resolve the selected entrypoint from the invocation context before any other log
- TESTS_DIR: `DOCUMENT_DIR/tests/`
- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, TESTS_DIR
Announce the selected entrypoint and resolved paths to the user before proceeding.
Announce the detected mode and resolved paths to the user before proceeding.
### Step Applicability by Mode
| Step | File | Implementation | Single | Tests-only |
|------|------|:--------------:|:------:|:----------:|
| Step | File | Default | Single | Tests-only |
|------|------|:-------:|:------:|:----------:|
| 1 Bootstrap Structure | `steps/01_bootstrap-structure.md` | ✓ | — | — |
| 1t Test Infrastructure | `steps/01t_test-infrastructure.md` | — | — | ✓ |
| 1.5 Module Layout | `steps/01-5_module-layout.md` | ✓ | — | — |
| 2 Task Decomposition | `steps/02_task-decomposition.md` | ✓ | ✓ | — |
| 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | | — | ✓ |
| 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | | — | ✓ |
| 4 Cross-Verification | `steps/04_cross-verification.md` | ✓ | — | ✓ |
## Input Specification
### Required Files
**Implementation task decomposition:**
**Default:**
| File | Purpose |
|------|---------|
@@ -81,11 +80,10 @@ Announce the selected entrypoint and resolved paths to the user before proceedin
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
| `_docs/01_solution/solution.md` | Finalized solution |
| `DOCUMENT_DIR/architecture.md` | Architecture from plan/document skill (must contain a `## Architecture Vision` H2 — confirmed user intent) |
| `DOCUMENT_DIR/glossary.md` | Project terminology (confirmed by user in plan Phase 2a.0 or document Step 4.5). Use it to keep task names, component references, and AC wording consistent with the user's vocabulary |
| `DOCUMENT_DIR/architecture.md` | Architecture from plan skill |
| `DOCUMENT_DIR/system-flows.md` | System flows from plan skill |
| `DOCUMENT_DIR/components/[##]_[name]/description.md` | Component specs from plan skill |
| `DOCUMENT_DIR/tests/` | Optional product acceptance context from test-spec skill; do not create test task files from it in this entrypoint |
| `DOCUMENT_DIR/tests/` | Blackbox test specs from plan skill |
**Single component mode:**
@@ -112,7 +110,7 @@ Announce the selected entrypoint and resolved paths to the user before proceedin
### Prerequisite Checks (BLOCKING)
**Implementation task decomposition:**
**Default:**
1. DOCUMENT_DIR contains `architecture.md` and `components/`**STOP if missing**
2. Create TASKS_DIR and TASKS_TODO if they do not exist
@@ -146,8 +144,6 @@ TASKS_DIR/
**Naming convention**: Each task file is initially saved in `TASKS_TODO/` with a temporary numeric prefix (`[##]_[short_name].md`). After creating the work item ticket, rename the file to use the work item ticket ID as prefix (`[TRACKER-ID]_[short_name].md`). For example: `todo/01_initial_structure.md``todo/AZ-42_initial_structure.md`.
If tracker availability fails, follow `.cursor/rules/tracker.mdc` before continuing. Only when the user explicitly chooses `tracker: local` may the numeric prefix remain; in that mode set `Tracker: pending` and `Epic: pending` in the task header and keep the task eligible for later tracker sync.
### Save Timing
| Step | Save immediately after | Filename |
@@ -169,11 +165,11 @@ If TASKS_DIR subfolders already contain task files:
## Progress Tracking
At the start of execution, create a TodoWrite with all applicable steps for the selected entrypoint (see Step Applicability table). Update status as each step/component completes.
At the start of execution, create a TodoWrite with all applicable steps for the detected mode (see Step Applicability table). Update status as each step/component completes.
## Workflow
### Step 1: Bootstrap Structure Plan (implementation mode only)
### Step 1: Bootstrap Structure Plan (default mode only)
Read and follow `steps/01_bootstrap-structure.md`.
@@ -185,25 +181,25 @@ Read and follow `steps/01t_test-infrastructure.md`.
---
### Step 1.5: Module Layout (implementation mode only)
### Step 1.5: Module Layout (default mode only)
Read and follow `steps/01-5_module-layout.md`.
---
### Step 2: Task Decomposition (implementation and single component modes)
### Step 2: Task Decomposition (default and single component modes)
Read and follow `steps/02_task-decomposition.md`.
---
### Step 3: Blackbox Test Task Decomposition (tests-only mode only)
### Step 3: Blackbox Test Task Decomposition (default and tests-only modes)
Read and follow `steps/03_blackbox-test-decomposition.md`.
---
### Step 4: Cross-Task Verification (implementation and tests-only modes)
### Step 4: Cross-Task Verification (default and tests-only modes)
Read and follow `steps/04_cross-verification.md`.
@@ -211,7 +207,7 @@ Read and follow `steps/04_cross-verification.md`.
- **Coding during decomposition**: this workflow produces specs, never code
- **Over-splitting**: don't create many tasks if the component is simple — 1 task is fine
- **Tasks exceeding 5 points**: split them; no task should be too complex for a single implementer
- **Tasks exceeding 8 points**: split them; no task should be too complex for a single implementer
- **Cross-component tasks**: each task belongs to exactly one component
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Creating git branches**: branch creation is an implementation concern, not a decomposition one
@@ -224,7 +220,7 @@ Read and follow `steps/04_cross-verification.md`.
| Situation | Action |
|-----------|--------|
| Ambiguous component boundaries | ASK user |
| Task complexity exceeds 5 points after splitting | ASK user |
| Task complexity exceeds 8 points after splitting | ASK user |
| Missing component specs in DOCUMENT_DIR | ASK user |
| Cross-component dependency conflict | ASK user |
| Tracker epic not found for a component | ASK user for Epic ID |
@@ -236,14 +232,15 @@ Read and follow `steps/04_cross-verification.md`.
┌────────────────────────────────────────────────────────────────┐
│ Task Decomposition (Multi-Mode) │
├────────────────────────────────────────────────────────────────┤
│ CONTEXT: Invoke the selected entrypoint (implementation / single / tests-only) │
│ CONTEXT: Resolve mode (default / single component / tests-only) │
│ │
IMPLEMENTATION TASK DECOMPOSITION:
DEFAULT MODE:
│ 1. Bootstrap Structure → steps/01_bootstrap-structure.md │
│ [BLOCKING: user confirms structure] │
│ 1.5 Module Layout → steps/01-5_module-layout.md │
│ [BLOCKING: user confirms layout] │
│ 2. Component Tasks → steps/02_task-decomposition.md │
│ 3. Blackbox Tests → steps/03_blackbox-test-decomposition.md │
│ 4. Cross-Verification → steps/04_cross-verification.md │
│ [BLOCKING: user confirms dependencies] │
│ │
@@ -26,7 +26,7 @@ For each component (or the single provided component):
4. Do not create tasks for other components — only tasks for the current component
5. Each task should be atomic, containing 1 API or a list of semantically connected APIs
6. Write each task spec using `templates/task.md`
7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
7. Estimate complexity per task (1, 2, 3, 5, 8 points); no task should exceed 8 points — split if it does
8. Note task dependencies (referencing tracker IDs of already-created dependency tasks, e.g., `AZ-42_initial_structure`)
9. **Cross-cutting rule**: if a concern spans ≥2 components (logging, config loading, auth/authZ, error envelope, telemetry, feature flags, i18n), create ONE shared task under the cross-cutting epic. Per-component tasks declare it as a dependency and consume it; they MUST NOT re-implement it locally. Duplicate local implementations are an `Architecture` finding (High) in code-review Phase 7 and a `Maintainability` finding in Phase 6.
10. **Shared-models / shared-API rule**: classify the task as shared if ANY of the following is true:
@@ -46,7 +46,7 @@ For each component (or the single provided component):
## Self-verification (per component)
- [ ] Every task is atomic (single concern)
- [ ] No task exceeds 5 complexity points
- [ ] No task exceeds 8 complexity points
- [ ] Task dependencies reference correct tracker IDs
- [ ] Tasks cover all interfaces defined in the component spec
- [ ] No tasks duplicate work from other components
@@ -1,4 +1,4 @@
# Step 3: Blackbox Test Task Decomposition (tests-only mode only)
# Step 3: Blackbox Test Task Decomposition (default and tests-only modes)
**Role**: Professional Quality Assurance Engineer
**Goal**: Decompose blackbox test specs into atomic, implementable task specs.
@@ -6,6 +6,7 @@
## Numbering
- In default mode: continue sequential numbering from where Step 2 left off.
- In tests-only mode: start from 02 (01 is the test infrastructure bootstrap from Step 1t).
## Steps
@@ -14,9 +15,10 @@
2. Group related test scenarios into atomic tasks (e.g., one task per test category or per component under test)
3. Each task should reference the specific test scenarios it implements and the environment/test-data specs
4. Dependencies:
- In default mode: blackbox test tasks depend on the component implementation tasks they exercise
- In tests-only mode: blackbox test tasks depend on the test infrastructure bootstrap task (Step 1t)
5. Write each task spec using `templates/task.md`
6. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
6. Estimate complexity per task (1, 2, 3, 5, 8 points); no task should exceed 8 points — split if it does
7. Note task dependencies (referencing tracker IDs of already-created dependency tasks)
8. **Immediately after writing each task file**: create a work item ticket under the "Blackbox Tests" epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.
@@ -24,8 +26,8 @@
- [ ] Every scenario from `tests/blackbox-tests.md` is covered by a task
- [ ] Every scenario from `tests/performance-tests.md`, `tests/resilience-tests.md`, `tests/security-tests.md`, and `tests/resource-limit-tests.md` is covered by a task
- [ ] No task exceeds 5 complexity points
- [ ] Dependencies correctly reference the test infrastructure task
- [ ] No task exceeds 8 complexity points
- [ ] Dependencies correctly reference the dependency tasks (component tasks in default mode, test infrastructure in tests-only mode)
- [ ] Every task has a work item ticket linked to the "Blackbox Tests" epic
## Save action
@@ -1,4 +1,4 @@
# Step 4: Cross-Task Verification (implementation and tests-only modes)
# Step 4: Cross-Task Verification (default and tests-only modes)
**Role**: Professional software architect and analyst
**Goal**: Verify task consistency and produce `_dependencies_table.md`.
@@ -8,7 +8,7 @@
1. Verify task dependencies across all tasks are consistent
2. Check no gaps:
- In implementation mode: every product interface in `architecture.md` has implementation task coverage
- In default mode: every interface in `architecture.md` has tasks covering it
- In tests-only mode: every test scenario in `traceability-matrix.md` is covered by a task
3. Check no overlaps: tasks don't duplicate work
4. Check no circular dependencies in the task graph
@@ -16,9 +16,9 @@
## Self-verification
### Implementation mode
### Default mode
- [ ] Every product interface in `architecture.md` is covered by at least one implementation task
- [ ] Every architecture interface is covered by at least one task
- [ ] No circular dependencies in the task graph
- [ ] Cross-component dependencies are explicitly noted in affected task specs
- [ ] `_dependencies_table.md` contains every task with correct dependencies
@@ -28,4 +28,4 @@ Use this template after cross-task verification. Save as `TASKS_DIR/_dependencie
- Dependencies column lists tracker IDs (e.g., "AZ-43, AZ-44") or "None"
- No circular dependencies allowed
- Tasks should be listed in recommended execution order
- The `/implement` skill reads this table to compute dependency-aware batches; task execution remains sequential
- The `/implement` skill reads this table to compute parallel batches
@@ -1,6 +1,6 @@
# Module Layout Template
The module layout is the **authoritative file-ownership map** used by the `/implement` skill to assign OWNED / READ-ONLY / FORBIDDEN files to each task. It is derived from `_docs/02_document/architecture.md` and the component specs at `_docs/02_document/components/`, and it follows the target language's standard project-layout conventions.
The module layout is the **authoritative file-ownership map** used by the `/implement` skill to assign OWNED / READ-ONLY / FORBIDDEN files to implementer subagents. It is derived from `_docs/02_document/architecture.md` and the component specs at `_docs/02_document/components/`, and it follows the target language's standard project-layout conventions.
Save as `_docs/02_document/module-layout.md`. This file is produced by the decompose skill (Step 1.5 module layout) and consumed by the implement skill (Step 4 file ownership). Task specs remain purely behavioral — they do NOT carry file paths. The layout is the single place where component → filesystem mapping lives.
@@ -104,4 +104,4 @@ The implement skill's Step 4 (File Ownership) reads this file and, for each task
3. Set READ-ONLY = the Public API files of every component listed in `Imports from`, plus `shared/*` Public API files.
4. Set FORBIDDEN = every other component's Owns glob.
Execution inside a batch is already sequential (one task at a time). This mapping is still required because it enforces scope discipline per task — preventing a task from drifting into files that belong to another component.
If two tasks in the same batch map to the same component, the implement skill schedules them sequentially (one implementer at a time for that component) to avoid file conflicts on shared internal files.
+3 -2
View File
@@ -11,7 +11,7 @@ Save as `TASKS_DIR/[##]_[short_name].md` initially, then rename to `TASKS_DIR/[T
**Task**: [TRACKER-ID]_[short_name]
**Name**: [short human name]
**Description**: [one-line description of what this task delivers]
**Complexity**: [1|2|3|5] points
**Complexity**: [1|2|3|5|8] points
**Dependencies**: [AZ-43_shared_models, AZ-44_db_migrations] or "None"
**Component**: [component name for context]
**Tracker**: [TASK-ID]
@@ -102,7 +102,8 @@ Consumers MUST read that file — not this task spec — to discover the interfa
- 2 points: Non-trivial, low complexity, minimal coordination
- 3 points: Multi-step, moderate complexity, potential alignment needed
- 5 points: Difficult, interconnected logic, medium-high risk
- 8+ points: Too complex — split into smaller tasks
- 8 points: High difficulty, high ambiguity or coordination, multiple components
- 13 points: Too complex — split into smaller tasks
## Output Guidelines
@@ -26,8 +26,7 @@
- Application components under test
- Test runner container (black-box, no internal imports)
- Isolated database with seed data
- All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner`
- See the Woodpecker two-workflow contract in [`../templates/ci_cd_pipeline.md`](../templates/ci_cd_pipeline.md) — the test runner entry point defined here becomes the first step of `.woodpecker/01-test.yml`.
- All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit`
7. Define image tagging strategy: `<registry>/<project>/<component>:<git-sha>` for CI, `latest` for local dev only
## Self-verification
@@ -85,140 +85,3 @@ Save as `_docs/04_deploy/ci_cd_pipeline.md`.
| Deploy success | [Slack] | [team] |
| Deploy failure | [Slack/email + PagerDuty] | [on-call] |
```
---
## Reference Implementation: Woodpecker CI two-workflow contract
Use this when the project's CI is **Woodpecker** and the test layout follows the autodev e2e contract from [`../../decompose/templates/test-infrastructure-task.md`](../../decompose/templates/test-infrastructure-task.md) (an `e2e/` folder containing `Dockerfile`, `docker-compose.test.yml`, `conftest.py`, `requirements.txt`, `mocks/`, `fixtures/`, `tests/`).
The contract is **two workflows in `.woodpecker/`**, scheduled on the same agent label, with the build workflow gated on a successful test run:
- `.woodpecker/01-test.yml` — runs the e2e contract, publishes `results/report.csv` as an artifact, fails the pipeline on any test failure.
- `.woodpecker/02-build-push.yml``depends_on: [01-test]`. Builds the image, tags it `${CI_COMMIT_BRANCH}-${TAG_SUFFIX}`, pushes it to the registry. Skipped automatically if test failed.
The agent label is parameterized via `matrix:` so a single workflow file fans out across architectures: `labels: platform: ${PLATFORM}` routes each matrix entry to the matching agent. Both workflows for a repo must use the same matrix so test and build run on the same machine and share Docker layer cache. New architectures = new matrix entries; never new files.
### Multi-arch matrix conventions
| Variable | Meaning | Typical values |
|----------|---------|----------------|
| `PLATFORM` | Woodpecker agent label — selects which physical machine runs the entry. | `arm64`, `amd64` |
| `TAG_SUFFIX` | Image tag suffix appended after the branch name. | `arm`, `amd` |
| `DOCKERFILE` *(only when arches need different Dockerfiles)* | Path to the Dockerfile for this entry. | `Dockerfile`, `Dockerfile.jetson` |
Most repos use the same `Dockerfile` for both arches (multi-arch base images handle the rest), so `DOCKERFILE` can be omitted from the matrix and hardcoded in the build command. Repos with split per-arch Dockerfiles (e.g., `detections` uses `Dockerfile.jetson` on Jetson with TensorRT/CUDA-on-L4T) declare `DOCKERFILE` as a matrix var.
When only one architecture is currently in use, keep the matrix block with a single entry and the second entry commented out — adding a new arch is then a one-line uncomment, not a structural change.
### `.woodpecker/01-test.yml`
```yaml
when:
event: [push, pull_request, manual]
branch: [dev, stage, main]
matrix:
include:
- PLATFORM: arm64
TAG_SUFFIX: arm
# - PLATFORM: amd64
# TAG_SUFFIX: amd
labels:
platform: ${PLATFORM}
steps:
- name: e2e
image: docker
commands:
- cd e2e
- docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build
- docker compose -f docker-compose.test.yml down -v
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- name: report
image: docker
when:
status: [success, failure]
commands:
- test -f e2e/results/report.csv && cat e2e/results/report.csv || echo "no report"
volumes:
- /var/run/docker.sock:/var/run/docker.sock
```
Notes:
- `--abort-on-container-exit` shuts the whole compose down as soon as ANY service exits, so a crashed dependency surfaces immediately instead of hanging the runner.
- `--exit-code-from e2e-runner` ensures the pipeline's exit code reflects the test runner's, not the SUT's.
- The `report` step runs on `[success, failure]` so the report is always published; without this the CSV is lost on red builds.
- `down -v` between runs drops mock state and DB volumes — every test run starts clean.
### `.woodpecker/02-build-push.yml`
```yaml
when:
event: [push, manual]
branch: [dev, stage, main]
depends_on:
- 01-test
matrix:
include:
- PLATFORM: arm64
TAG_SUFFIX: arm
# - PLATFORM: amd64
# TAG_SUFFIX: amd
labels:
platform: ${PLATFORM}
steps:
- name: build-push
image: docker
environment:
REGISTRY_HOST:
from_secret: registry_host
REGISTRY_USER:
from_secret: registry_user
REGISTRY_TOKEN:
from_secret: registry_token
commands:
- echo "$REGISTRY_TOKEN" | docker login "$REGISTRY_HOST" -u "$REGISTRY_USER" --password-stdin
- export TAG=${CI_COMMIT_BRANCH}-${TAG_SUFFIX}
- export BUILD_DATE=$(date -u +%Y-%m-%dT%H:%M:%SZ)
- |
docker build -f Dockerfile \
--build-arg CI_COMMIT_SHA=$CI_COMMIT_SHA \
--label org.opencontainers.image.revision=$CI_COMMIT_SHA \
--label org.opencontainers.image.created=$BUILD_DATE \
--label org.opencontainers.image.source=$CI_REPO_URL \
-t $REGISTRY_HOST/azaion/<service>:$TAG .
- docker push $REGISTRY_HOST/azaion/<service>:$TAG
volumes:
- /var/run/docker.sock:/var/run/docker.sock
```
Notes:
- `depends_on: [01-test]` is enforced by Woodpecker — a failed `01-test` (any matrix entry) skips this workflow.
- The build workflow does NOT trigger on `pull_request` events: PRs get test signal only; pushes to `dev`/`stage`/`main` produce images. Avoids polluting the registry with PR images.
- Replace `<service>` with the actual service name (matches the registry namespace pattern `azaion/<service>`).
- For repos with split per-arch Dockerfiles, add `DOCKERFILE: Dockerfile.jetson` (or similar) to the matrix entry and substitute `${DOCKERFILE}` for `Dockerfile` in the `docker build -f` line.
### Variations by stack
The contract is language-agnostic because the runner is `docker compose`. The Dockerfile inside `e2e/` selects the test framework:
| Stack | `e2e/Dockerfile` runs |
|-------|----------------------|
| Python | `pytest --csv=/results/report.csv -v` |
| .NET | `dotnet test --logger:"trx;LogFileName=/results/report.trx"` (convert to CSV in a final step if needed) |
| Node/UI | `npm test -- --reporters=default --reporters=jest-junit --outputDirectory=/results` |
| Rust | `cargo test --no-fail-fast -- --format json > /results/report.json` |
When the repo has **only unit tests** (no `e2e/docker-compose.test.yml`), drop the compose orchestration and run the native test command directly inside a stack-appropriate image. Keep the same two-workflow split — `01-test.yml` runs unit tests, `02-build-push.yml` is unchanged.
### Manual-trigger override (test infrastructure not yet validated)
If a repo ships a complete `e2e/` layout but the test fixtures are not yet validated end-to-end (e.g., expected-results data is still being authored), gate `01-test.yml` on `event: [manual]` only and add a TODO comment pointing to the unblocking task. The `02-build-push.yml` workflow drops its `depends_on` clause for the manual-only window — an explicit and reversible exception, not a permanent split.
@@ -31,7 +31,6 @@ _docs/
│ ├── components.md
│ └── flows/
├── 04_verification_log.md # Step 4
├── glossary.md # Step 4.5 (confirmed-by-user)
├── FINAL_report.md # Step 7
└── state.json # Resumability
```
@@ -50,7 +49,6 @@ Maintained in `DOCUMENT_DIR/state.json` for resumability:
"modules_remaining": ["services/auth", "api/endpoints"],
"module_batch": 1,
"components_written": [],
"step_4_5_glossary_vision": "not_started",
"last_updated": "2026-03-21T14:00:00Z"
}
```
+2 -102
View File
@@ -15,7 +15,7 @@ Covers three related modes that share the same 8-step pipeline:
## Progress Tracking
Create a TodoWrite with all steps (0 through 7, including the inline Step 2.5 Module Layout Derivation and Step 4.5 Glossary & Architecture Vision). Update status as each step completes.
Create a TodoWrite with all steps (0 through 7). Update status as each step completes.
## Steps
@@ -251,107 +251,7 @@ Apply corrections inline to the documents that need them.
**BLOCKING**: Present verification summary to user. Do NOT proceed until user confirms corrections are acceptable or requests additional fixes.
---
### Step 4.5: Glossary & Architecture Vision (BLOCKING)
**Role**: Software architect + business analyst
**Goal**: Reconcile the AI's verified understanding of the codebase with the user's intended terminology and architecture vision. Existing-code projects often carry domain language and structural intent that is invisible from code alone (synonyms, deprecated names, modules that are "supposed to" be split, components the user thinks of as one logical unit even though they live in two folders). This step makes that intent explicit before any downstream skill (refactor, decompose, new-task) acts on the docs.
**When this step runs**:
- Always, after Step 4 (Verification Pass) — for Full and Resume modes.
- **Skipped** in Focus Area mode (the glossary/vision is system-wide; running it on a partial scan would produce a partial glossary). Resume the user once a full pass exists.
**Inputs** (already on disk after Step 4):
- `DOCUMENT_DIR/architecture.md`, `system-flows.md`, `data_model.md`, `deployment/*`
- `DOCUMENT_DIR/components/*/description.md`
- `DOCUMENT_DIR/modules/*.md`
- `DOCUMENT_DIR/04_verification_log.md` (so the AI knows which doc parts are confirmed vs. flagged)
**Outputs**:
- `DOCUMENT_DIR/glossary.md` (NEW)
- `DOCUMENT_DIR/architecture.md` updated in place: a new `## Architecture Vision` section is prepended (or merged into an existing "Overview" / "Vision" heading if already present); existing technical sections are preserved verbatim
**Procedure**:
1. **Draft glossary** from verified docs:
- Domain entities, processes, roles named in module/component docs
- Acronyms / abbreviations
- Internal codenames (project, service, model names) that recur in the codebase
- Synonym pairs the AI noticed (e.g., the codebase uses "flight" but module comments say "mission")
- Stakeholder personas if any docs reference them
Each entry: one-line definition + source reference (`source: components/03_flights/description.md`). Skip generic CS/industry terms.
2. **Draft architecture vision** as the AI currently understands the codebase:
- **One paragraph**: what the system is, who runs it, the runtime topology shape (monolith / services / pipeline / library / hybrid), and the dominant pattern (e.g., "submodule-based meta-repo with REST + SSE between UI and backend").
- **Components & responsibilities** (one-line each), pulled from `components/*/description.md`.
- **Major data flows** (one or two sentences each), pulled from `system-flows.md`.
- **Architectural principles / non-negotiables** the AI inferred from the code (e.g., "DB-driven config", "all UI traffic via REST + SSE only", "no per-component shared state"). Mark each with `inferred-from: <source>`.
- **Open questions / drift signals**: places where the code disagrees with itself, or where the AI cannot tell intent from implementation (e.g., two components doing similar work — is that legacy duplication or deliberate?).
3. **Present condensed view** to the user (NOT the full draft files — a synopsis only):
```
══════════════════════════════════════
REVIEW: Glossary + Architecture Vision (existing code)
══════════════════════════════════════
Glossary (N terms drafted from verified docs):
- <Term>: <one-line definition>
- ...
Architecture Vision — as inferred from the codebase:
<one-paragraph synopsis>
Components / responsibilities:
- <component>: <one-line>
- ...
Principles / non-negotiables (inferred):
- <principle> [inferred-from: <source>]
- ...
Open questions / drift signals:
- <q1>
- <q2>
══════════════════════════════════════
A) Inferred vision matches my intent — write the files
B) Add / correct entries (provide diffs — terms, components,
principles, or rename pairs)
C) Resolve the open questions / drift signals first
══════════════════════════════════════
Recommendation: pick C if any drift signals exist;
otherwise B if the vision misses
project-specific intent; A only when
the inferred vision is exactly right.
══════════════════════════════════════
```
4. **Iterate**:
- On B → integrate the user's diffs/additions, re-present, loop until A.
- On C → ask the listed open questions in one batch (M4-style), integrate answers, re-present.
- **Do NOT proceed to step 5 until the user picks A.**
5. **Save**:
- Write `DOCUMENT_DIR/glossary.md`, alphabetical, with a top-line `**Status**: confirmed-by-user` and the date.
- Update `DOCUMENT_DIR/architecture.md`:
- If a `## Architecture Vision` (or `## Vision` / `## Overview`) section already exists at the top, replace its body with the confirmed paragraph + components + principles.
- Otherwise, insert `## Architecture Vision` as the first H2 after the title; preserve every existing H2 below.
- Do NOT delete or re-order existing technical sections (Tech Stack, Deployment Model, Data Model, NFRs, ADRs).
6. **Update `state.json`**: mark `step_4_5_glossary_vision: confirmed`. Resume on rerun must skip this step unless the user explicitly invokes `/document --refresh-vision`.
**Self-verification**:
- [ ] Every glossary entry traces to at least one file under `DOCUMENT_DIR/`
- [ ] Every component listed in the vision matches a folder under `DOCUMENT_DIR/components/`
- [ ] All open questions are answered or explicitly deferred (with the user's acknowledgement)
- [ ] `architecture.md` still contains all H2 sections it had before this step
- [ ] User picked option A on the latest condensed view
**BLOCKING**: Do NOT proceed to the session boundary / Step 5 until both files are saved and the user has picked A.
---
**Session boundary**: After Step 4.5 is confirmed, suggest a session break before proceeding to the synthesis steps (57). These steps produce different artifact types and benefit from fresh context:
**Session boundary**: After verification is confirmed, suggest a session break before proceeding to the synthesis steps (57). These steps produce different artifact types and benefit from fresh context:
```
══════════════════════════════════════
+54 -129
View File
@@ -1,58 +1,41 @@
---
name: implement
description: |
Implement tasks sequentially with dependency-aware batching and integrated code review.
Orchestrate task implementation with dependency-aware batching, parallel subagents, and integrated code review.
Reads flat task files and _dependencies_table.md from TASKS_DIR, computes execution batches via topological sort,
implements tasks one at a time in dependency order, runs code-review skill after each batch, and loops until done.
launches up to 4 implementer subagents in parallel, runs code-review skill after each batch, and loops until done.
Use after /decompose has produced task files.
Trigger phrases:
- "implement", "start implementation", "implement tasks"
- "execute tasks"
- "run implementers", "execute tasks"
category: build
tags: [implementation, batching, code-review]
tags: [implementation, orchestration, batching, parallel, code-review]
disable-model-invocation: true
---
# Implementation Runner
# Implementation Orchestrator
Implement all tasks produced by the `/decompose` skill. This skill reads task specs, computes execution order, writes the code and tests for each task **sequentially** (no subagents, no parallel execution), validates results via the `/code-review` skill, and escalates issues.
Orchestrate the implementation of all tasks produced by the `/decompose` skill. This skill is a **pure orchestrator** — it does NOT write implementation code itself. It reads task specs, computes execution order, delegates to `implementer` subagents, validates results via the `/code-review` skill, and escalates issues.
For each task the main agent receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria — then moves on to the next task.
The `implementer` agent is the specialist that writes all the code — it receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria.
## Core Principles
- **Sequential execution**: implement one task at a time. Do NOT spawn subagents and do NOT run tasks in parallel. (See `.cursor/rules/no-subagents.mdc`.)
- **Dependency-aware ordering**: tasks run only when all their dependencies are satisfied
- **Batching for review, not parallelism**: tasks are grouped into batches so `/code-review` and commits operate on a coherent unit of work — all tasks inside a batch are still implemented one after the other
- **Orchestrate, don't implement**: this skill delegates all coding to `implementer` subagents
- **Dependency-aware batching**: tasks run only when all their dependencies are satisfied
- **Max 4 parallel agents**: never launch more than 4 implementer subagents simultaneously
- **File isolation**: no two parallel agents may write to the same file
- **Integrated review**: `/code-review` skill runs automatically after each batch
- **Completeness before testing**: product implementation is not done until code is checked against task outcomes, included scope, architecture/component promises, and unresolved scaffold/native placeholders — not just task AC tests
- **Auto-start**: batches start immediately — no user confirmation before a batch
- **Auto-start**: batches launch immediately — no user confirmation before a batch
- **Gate on failure**: user confirmation is required only when code review returns FAIL
- **Commit per batch**: after each batch is confirmed, commit. Ask the user whether to push to remote unless the user previously opted into auto-push for this session.
## Context Resolution
- TASKS_DIR: `_docs/02_tasks/`
- Task files: selected `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
- Task files: all `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
- Dependency table: `TASKS_DIR/_dependencies_table.md`
### Task Selection Context
The invoking flow decides which task category this run should execute. The implement skill must honor that selected context instead of consuming every file in `todo/`.
| Context | Selected task files |
|---------|---------------------|
| Product implementation | Task specs that are not test-only and not refactoring specs |
| Test implementation | `*_test_infrastructure.md` plus task specs whose `Component` or `Epic` identifies `Blackbox Tests` |
| Refactoring | Task specs whose filename or task ID includes `_refactor_` |
If no explicit context is provided, infer it from the active autodev step:
- greenfield Step 7 or existing-code Step 10 → Product implementation
- greenfield Step 10 or existing-code Step 6 → Test implementation
- refactor Phase 4 → Refactoring
Unselected task files remain in `TASKS_DIR/todo/` for their later flow step.
### Task Lifecycle Folders
```
@@ -65,7 +48,7 @@ TASKS_DIR/
## Prerequisite Checks (BLOCKING)
1. `TASKS_DIR/todo/` exists and contains at least one task file for the selected context **STOP if missing**
1. `TASKS_DIR/todo/` exists and contains at least one task file — **STOP if missing**
2. `_dependencies_table.md` exists — **STOP if missing**
3. At least one task is not yet completed — **STOP if all done**
4. **Working tree is clean** — run `git status --porcelain`; the output must be empty.
@@ -73,16 +56,16 @@ TASKS_DIR/
- A) Commit or stash stray changes manually, then re-invoke `/implement`
- B) Agent commits stray changes as a single `chore: WIP pre-implement` commit and proceeds
- C) Abort
- Rationale: each batch ends with a commit. Unrelated uncommitted changes would get silently folded into batch commits otherwise.
- Rationale: implementer subagents edit files in parallel and commit per batch. Unrelated uncommitted changes get silently folded into batch commits otherwise.
- This check is repeated at the start of each batch iteration (see step 6 / step 14 Loop).
## Algorithm
### 1. Parse
- Read selected task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
- Read all task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
- Read `_dependencies_table.md` — parse into a dependency graph (DAG)
- Validate: no circular dependencies in the selected task graph, all referenced selected-task dependencies exist or are already completed in `TASKS_DIR/done/`
- Validate: no circular dependencies, all referenced dependencies exist
### 2. Detect Progress
@@ -95,8 +78,8 @@ TASKS_DIR/
- Topological sort remaining tasks
- Select tasks whose dependencies are ALL satisfied (completed)
- A batch is simply a coherent group of tasks for review + commit. Within the batch, tasks are implemented sequentially in topological order.
- Cap the batch size at a reasonable review scope (default: 4 tasks)
- If a ready task depends on any task currently being worked on in this batch, it must wait for the next batch
- Cap the batch at 4 parallel agents
- If the batch would exceed 20 total complexity points, suggest splitting and let the user decide
### 4. Assign File Ownership
@@ -106,12 +89,11 @@ The authoritative file-ownership map is `_docs/02_document/module-layout.md` (pr
For each task in the batch:
- Read the task spec's **Component** field.
- Look up the component in `_docs/02_document/module-layout.md` → Per-Component Mapping.
- Set **OWNED** = the component's `Owns` glob (the files this task is allowed to write).
- Set **OWNED** = the component's `Owns` glob (exclusive write for the duration of the batch).
- Set **READ-ONLY** = Public API files of every component in the component's `Imports from` list, plus all `shared/*` Public API files.
- Set **FORBIDDEN** = every other component's `Owns` glob, and every other component's internal (non-Public API) files.
- If the task is a shared / cross-cutting task (lives under `shared/*`), OWNED = that shared directory; READ-ONLY = nothing; FORBIDDEN = every component directory.
Since execution is sequential, there is no parallel-write conflict to resolve; ownership here is a **scope discipline** check — it stops a task from drifting into unrelated components even when alone.
- If two tasks in the same batch map to the same component or overlapping `Owns` globs, schedule them sequentially instead of in parallel.
If `_docs/02_document/module-layout.md` is missing or the component is not found:
- STOP the batch.
@@ -120,30 +102,31 @@ If `_docs/02_document/module-layout.md` is missing or the component is not found
### 5. Update Tracker Status → In Progress
For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before starting work. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.
For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before launching the implementer. If `tracker: local`, skip this step.
### 6. Implement Tasks Sequentially
### 6. Launch Implementer Subagents
**Per-batch dirty-tree re-check**: before starting the batch, run `git status --porcelain`. On the first batch this is guaranteed clean by the prerequisite check. On subsequent batches, the previous batch ended with a commit so the tree should still be clean. If the tree is dirty at this point, STOP and surface the dirty files to the user using the same A/B/C choice as the prerequisite check. The most likely causes are a failed commit in the previous batch, a user who edited files mid-loop, or a pre-commit hook that re-wrote files and was not captured.
**Per-batch dirty-tree re-check**: before launching subagents, run `git status --porcelain`. On the first batch this is guaranteed clean by the prerequisite check. On subsequent batches, the previous batch ended with a commit so the tree should still be clean. If the tree is dirty at this point, STOP and surface the dirty files to the user using the same A/B/C choice as the prerequisite check. The most likely causes are a failed commit in the previous batch, a user who edited files mid-loop, or a pre-commit hook that re-wrote files and was not captured.
For each task in the batch **in topological order, one at a time**:
1. Read the task spec file.
2. Respect the file-ownership envelope computed in Step 4 (OWNED / READ-ONLY / FORBIDDEN).
3. Implement the feature and write/update tests for every acceptance criterion in the spec. If a test cannot run in the current environment (e.g., TensorRT requires GPU), the test must still be written and skip with a clear reason.
4. Run the relevant tests locally before moving on to the next task in the batch. If tests fail, fix in-place — do not defer.
5. Capture a short per-task status line (files changed, tests pass/fail, any blockers) for the batch report.
For each task in the batch, launch an `implementer` subagent with:
- Path to the task spec file
- List of files OWNED (exclusive write access)
- List of files READ-ONLY
- List of files FORBIDDEN
- **Explicit instruction**: the implementer must write or update tests that validate each acceptance criterion in the task spec. If a test cannot run in the current environment (e.g., TensorRT requires GPU), the test must still be written and skip with a clear reason.
Do NOT spawn subagents and do NOT attempt to implement two tasks simultaneously, even if they touch disjoint files. See `.cursor/rules/no-subagents.mdc`.
Launch all subagents immediately — no user confirmation.
### 7. Collect Status
### 7. Monitor
- After all tasks in the batch are finished, aggregate the per-task status lines into a structured batch status.
- If any task reported "Blocked", log the blocker with the failing task's ID and continue — the batch report will surface it.
- Wait for all subagents to complete
- Collect structured status reports from each implementer
- If any implementer reports "Blocked", log the blocker and continue with others
**Stuck detection** — while implementing a task, watch for these signals in your own progress:
- The same file has been rewritten 3+ times without tests going green → stop, mark the task Blocked, and move to the next task in the batch (the user will be asked at the end of the batch).
- You have tried 3+ distinct approaches without evidence-driven progress → stop, mark Blocked, move on.
- Do NOT loop indefinitely on a single task. Record the blocker and proceed.
**Stuck detection** — while monitoring, watch for these signals per subagent:
- Same file modified 3+ times without test pass rate improving → flag as stuck, stop the subagent, report as Blocked
- Subagent has not produced new output for an extended period → flag as potentially hung
- If a subagent is flagged as stuck, do NOT let it continue looping — stop it and record the blocker in the batch report
### 8. AC Test Coverage Verification
@@ -156,8 +139,8 @@ Before code review, verify that every acceptance criterion in each task spec has
- **Not covered**: no test exists for this AC
If any AC is **Not covered**:
- This is a **BLOCKING** failure — the missing test must be written before proceeding
- Go back to the offending task, add tests for the specific ACs that lack coverage, then re-run this check
- This is a **BLOCKING** failure — the implementer must write the missing test before proceeding
- Re-launch the implementer with the specific ACs that need tests
- If the test cannot run in the current environment (GPU required, platform-specific, external service), the test must still exist and skip with `pytest.mark.skipif` or `pytest.skip()` explaining the prerequisite
- A skipped test counts as **Covered** — the test exists and will run when the environment allows
@@ -206,14 +189,12 @@ Track `auto_fix_attempts` and `escalated_findings` in the batch report for retro
### 12. Update Tracker Status → In Testing
After the batch is committed (and pushed if the user approved pushing), transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.
After the batch is committed and pushed, transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step.
### 13. Archive Completed Tasks
Move each completed task file from `TASKS_DIR/todo/` to `TASKS_DIR/done/`.
For product implementation, this archive means "batch implementation accepted." The Product Implementation Completeness Gate can still require follow-up remediation tasks before the feature is complete; it does not move original task files back to `todo/`.
### 14. Loop
- Go back to step 2 until all tasks in `todo/` are done
@@ -235,70 +216,16 @@ For product implementation, this archive means "batch implementation accepted."
- **Interaction with Auto-Fix Gate**: Architecture findings (new category from code-review Phase 7) always escalate per the implement auto-fix matrix; they cannot silently auto-fix
- **Resumability**: if interrupted, the next invocation checks for the latest `cumulative_review_batches_*.md` and computes the changed-file set from batch reports produced after that review
### 15. Product Implementation Completeness Gate
### 15. Final Test Run
Run this gate after all **product implementation** tasks are complete and before writing any final product implementation report or allowing autodev to proceed to testability/test decomposition. Skip this gate only when the remaining context is explicitly test implementation or refactoring, as determined by the task files and report filename rules.
**Goal**: catch the failure mode where narrow tests validate scaffold behavior while the task's actual outcome, included scope, architecture promise, or named integration remains unimplemented.
Inputs:
- Completed product task specs from `_docs/02_tasks/done/` for the current cycle
- `_docs/02_document/architecture.md`
- `_docs/02_document/system-flows.md`
- Relevant `_docs/02_document/components/*/description.md` files
- Current source code under each completed task's ownership envelope
- Batch reports and code-review reports for the current cycle
For each completed product task:
1. Read these sections from the task spec: `Description`, `Outcome`, `Scope / Included`, `Acceptance Criteria`, `Non-Functional Requirements`, `Constraints`, and explicit named technologies or integrations.
2. Compare those promises against actual source code, not only tests or report prose.
3. Search the task's owned component files for unresolved implementation markers: `placeholder`, `stub`, `reserved`, `TODO`, `NotImplemented`, `pass`, `deterministic`, `fake`, `mock`, `scaffold`, `native bridge`, and empty native/readme-only integration directories. Ignore test fixtures/mocks only when they are under test-owned paths and not used as production behavior.
4. Verify that each named runtime dependency in the task promise is either integrated behind the approved boundary or explicitly documented as a blocked prerequisite in the task/report. Examples: if a task promises FAISS, DINOv2, BASALT, LightGlue, OpenCV, RANSAC, a database, cloud service, or hardware SDK, the production code must contain that integration boundary; a deterministic fallback alone is not complete.
5. Verify tests exercise the real implementation path where local prerequisites exist. Environment-gated tests may skip only with an explicit prerequisite reason; they do not make missing production code complete.
6. Classify each task:
- **PASS**: task promises are implemented or explicitly out of scope in the task itself.
- **BLOCKED**: production code exists but cannot be fully verified due to external hardware/data/license/runtime prerequisites; the blocker is explicit and tests report blocked/skipped with reason.
- **FAIL**: promised production behavior is missing, only scaffolded, or only represented in tests/reports.
Save the audit to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` with:
- Per-task classification
- Evidence files/symbols checked
- Any unresolved scaffold/native placeholders
- Any named promised technologies not integrated
- Required remediation task suggestions, each sized to 5 points or less
Gate:
- If every product task is `PASS` or `BLOCKED` with explicit prerequisite evidence, continue to Final Test Run.
- If any product task is `FAIL`, STOP. Do not write the final product implementation report and do not proceed to any downstream autodev step. Completed original task files remain in `done/`; the missing work is represented by remediation tasks. Present a Choose block:
- A) Create remediation tasks now and return to implementation
- B) Mark the missing behavior explicitly out of scope in task/docs, then re-run this gate
- C) Abort for manual correction
- Recommendation must normally be A unless the user deliberately accepts reduced scope.
Remediation task creation:
1. For each `FAIL`, create one or more task specs using `.cursor/skills/decompose/templates/task.md`; each remediation task must be sized at 5 points or less.
2. Save each task to `_docs/02_tasks/todo/` with a short name prefixed by `remediate_`.
3. Set **Component** to the failed task's component and set **Dependencies** to the failed task ID plus any remediation prerequisites.
4. Create or defer tracker tickets using the same tracker rules as decompose/new-task: if tracker is available, create tickets immediately; if the user explicitly chose `tracker: local`, keep numeric prefixes with `Tracker: pending` / `Epic: pending`.
5. Append the remediation tasks to `_docs/02_tasks/_dependencies_table.md`.
6. Return to Step 1 (Parse) in **Product implementation** context. The final product implementation report can be written only after remediation tasks complete and this gate reruns without `FAIL`.
### 16. Final Test Run
- After all batches are complete, run the full test suite once unless the invoking flow's immediate next step is `Run Tests`.
- If the next flow step is `Run Tests`, record a handoff in the final implementation report and let `.cursor/skills/test-run/SKILL.md` own the full-suite gate to avoid duplicate full runs.
- When this step does run, read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices).
- Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision.
- When tests pass, report final summary.
- After all batches are complete, run the full test suite once
- Read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices)
- Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision
- When tests pass, report final summary
## Batch Report Persistence
After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. For product implementation, produce the FINAL implementation report only after the Product Implementation Completeness Gate passes. For test and refactor implementation, produce the FINAL report after all selected tasks complete and the full-suite gate is either run or handed off per Step 16. The filename depends on context:
After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. When all tasks are complete, produce a FINAL implementation report with a summary of all batches. The filename depends on context:
- **Test implementation** (tasks from test decomposition): `_docs/03_implementation/implementation_report_tests.md`
- **Feature implementation**: `_docs/03_implementation/implementation_report_{feature_slug}_cycle{N}.md` where `{feature_slug}` is derived from the batch task names (e.g., `implementation_report_core_api_cycle2.md`) and `{N}` is the current `state.cycle` from `_docs/_autodev_state.md`. If `state.cycle` is absent (pre-migration), default to `cycle1`.
@@ -337,10 +264,9 @@ After each batch, produce a structured report:
| Situation | Action |
|-----------|--------|
| Same task rewritten 3+ times without green tests | Mark Blocked, continue batch, escalate at batch end |
| Implementer fails same approach 3+ times | Stop it, escalate to user |
| Task blocked on external dependency (not in task list) | Report and skip |
| File ownership violated (task wrote outside OWNED) | ASK user |
| Product completeness gate finds missing promised implementation | STOP — create remediation tasks or get explicit user scope reduction |
| File ownership conflict unresolvable | ASK user |
| Test failure after final test run | Delegate to test-run skill — blocking gate |
| All tasks complete | Report final summary, suggest final commit |
| `_dependencies_table.md` missing | STOP — run `/decompose` first |
@@ -355,8 +281,7 @@ Each batch commit serves as a rollback checkpoint. If recovery is needed:
## Safety Rules
- Never start a task whose dependencies are not yet completed
- Never run tasks in parallel and never spawn subagents — see `.cursor/rules/no-subagents.mdc`
- If a task is flagged as stuck, stop working on it and report — do not let it loop indefinitely
- Always run the Product Implementation Completeness Gate before final product reports
- Always run or hand off the full test suite after all batches complete (step 16)
- Never launch tasks whose dependencies are not yet completed
- Never allow two parallel agents to write to the same file
- If a subagent fails or is flagged as stuck, stop it and report — do not let it loop indefinitely
- Always run the full test suite after all batches complete (step 15)
@@ -3,31 +3,29 @@
## Topological Sort with Batch Grouping
The `/implement` skill uses a topological sort to determine execution order,
then groups tasks into batches for code review and commit. Execution within a
batch is **sequential** — see `.cursor/rules/no-subagents.mdc`.
then groups tasks into batches for parallel execution.
## Algorithm
1. Build adjacency list from `_dependencies_table.md`
2. Compute in-degree for each task node
3. Initialize the ready set with all nodes that have in-degree 0
3. Initialize batch 0 with all nodes that have in-degree 0
4. For each batch:
a. Select up to 4 tasks from the ready set (default batch size cap)
b. Implement the selected tasks one at a time in topological order
c. When all tasks in the batch complete, remove them from the graph and
decrement in-degrees of dependents
d. Add newly zero-in-degree nodes to the ready set
a. Select up to 4 tasks from the ready set
b. Check file ownership — if two tasks would write the same file, defer one to the next batch
c. Launch selected tasks as parallel implementer subagents
d. When all complete, remove them from the graph and decrement in-degrees of dependents
e. Add newly zero-in-degree nodes to the next batch's ready set
5. Repeat until the graph is empty
## Ordering Inside a Batch
## File Ownership Conflict Resolution
Tasks inside a batch are executed in topological order — a task is only
started after every task it depends on (inside the batch or in a previous
batch) is done. When two tasks have the same topological rank, prefer the
lower-numbered (more foundational) task first.
When two tasks in the same batch map to overlapping files:
- Prefer to run the lower-numbered task first (it's more foundational)
- Defer the higher-numbered task to the next batch
- If both have equal priority, ask the user
## Complexity Budget
Each batch should not exceed 20 total complexity points.
If it does, split the batch and let the user choose which tasks to include.
The budget exists to keep the per-batch code review scope reviewable.
+1 -2
View File
@@ -129,8 +129,7 @@ If `_docs/_repo-config.yaml` already exists:
- Entries removed (component removed from registry)
4. **Ask the user** whether to apply the diff.
5. If applied, **preserve `confirmed: true` flags** for entries that still match — don't reset human-approved mappings.
6. **Preserve user-owned top-level keys verbatim**: `glossary_doc:` (written by autodev meta-repo Step 2.5) and any `assumptions_log:` entries are NEVER edited or removed by this skill. Carry them through unchanged. If the file referenced by `glossary_doc:` no longer exists on disk, surface as an `unresolved:` question — do not auto-clear the field.
7. If user declines, stop — leave config untouched.
6. If user declines, stop — leave config untouched.
### Phase 8: Batch question checkpoint (M4)
@@ -15,8 +15,6 @@ Propagates component changes into the unified documentation set. Strictly scoped
| Root `README.md` **only** if `_repo-config.yaml` lists it as a doc target (e.g., services table) | Install scripts (`ci-*.sh`) → use `monorepo-cicd` |
| Docs index (`_docs/README.md` or similar) cross-reference tables | Component-internal docs (`<component>/README.md`, `<component>/docs/*`) |
| Cross-cutting docs listed in `docs.cross_cutting` | `_docs/_repo-config.yaml` itself (only `monorepo-discover` and `monorepo-onboard` write it) |
| Body of cross-cutting docs **except** the `## Architecture Vision` section (preserved verbatim — owned by autodev meta-repo Step 2.5) | The file at `glossary_doc:` (user-confirmed; only autodev meta-repo Step 2.5 rewrites it). New project terms surfaced during sync are reported back to the user, not silently appended |
| `## Architecture Vision` body — read-only, may be referenced for terminology consistency but never edited | — |
If a component change requires CI/env updates too, tell the user to also run `monorepo-cicd`. This skill does NOT cross domains.
@@ -168,8 +166,6 @@ Append to `_docs/_repo-config.yaml` under `assumptions_log:`:
- Change `confirmed_by_user` or any `confirmed: <bool>` flag
- Auto-commit or push
- Guess a mapping not in the config
- Edit `glossary_doc:` (the file recorded under the config's `glossary_doc:` key)
- Edit the `## Architecture Vision` section of any cross-cutting doc; if a sync would conflict with that section, surface the conflict to the user and skip — do not silently rewrite user-confirmed content
## Edge cases
-152
View File
@@ -1,152 +0,0 @@
---
name: monorepo-e2e
description: Syncs the suite-level integration e2e harness (`e2e/docker-compose.suite-e2e.yml`, fixtures, Playwright runner) when component contracts drift in ways that affect the cross-service scenario. Reads `_docs/_repo-config.yaml` to know which suite-e2e artifacts are in play. Touches ONLY suite-e2e files — never per-component CI, docs, or component internals. Use when a component changes a port, env var, public API endpoint, DB schema column, or detection model that the suite e2e exercises.
---
# Monorepo Suite-E2E
Propagates component changes into the suite-level integration e2e harness. Strictly scoped — never edits docs, component internals, per-component CI configs, or the production deploy compose.
## Scope — explicit
| In scope | Out of scope |
| -------- | ------------ |
| `e2e/docker-compose.suite-e2e.yml` (overlay, healthchecks, seed services) | Production `_infra/deploy/<target>/docker-compose.yml``monorepo-cicd` owns it |
| `e2e/fixtures/init.sql` (seeded rows that the spec depends on) | Component DB migrations — owned by each component |
| `e2e/fixtures/expected_detections.json` (detection baseline) | Detection model itself — owned by `detections/` |
| `e2e/runner/tests/*.spec.ts` selector / contract-driven edits | New scenarios (user-driven, not drift-driven) |
| `e2e/runner/Dockerfile` / `package.json` Playwright version bumps | Net-new e2e infrastructure (use `monorepo-onboard` or initial scaffolding) |
| `.woodpecker/suite-e2e.yml` (suite-level pipeline) | Per-component `.woodpecker/01-test.yml` / `02-build-push.yml``monorepo-cicd` owns those |
| Suite-e2e leftover entries under `_docs/_process_leftovers/` | Per-component leftovers — owned by each component |
If a component change needs doc updates too, tell the user to also run `monorepo-document`. If it needs production-deploy or per-component CI updates, run `monorepo-cicd`. This skill **only** updates the suite-e2e surface.
## Preconditions (hard gates)
1. `_docs/_repo-config.yaml` exists.
2. Top-level `confirmed_by_user: true`.
3. `suite_e2e.*` section is populated in config (see "Required config block" below). If absent, abort and ask the user to extend the config via `monorepo-discover`.
4. Components-in-scope have confirmed contract mappings (port, public API path, DB tables touched), OR user explicitly approves inferred ones.
## Required config block
This skill expects `_docs/_repo-config.yaml` to carry:
```yaml
suite_e2e:
overlay: e2e/docker-compose.suite-e2e.yml
fixtures:
init_sql: e2e/fixtures/init.sql
baseline_json: e2e/fixtures/expected_detections.json
binary_fixtures:
- e2e/fixtures/sample.mp4
- e2e/fixtures/model.tar.gz
runner:
dockerfile: e2e/runner/Dockerfile
package_json: e2e/runner/package.json
spec_dir: e2e/runner/tests
pipeline: .woodpecker/suite-e2e.yml
scenario:
description: "Upload video → detect → overlays → dataset → DB persistence"
components_exercised:
- ui
- annotations
- detections
- postgres-local
api_contracts:
- component: ui
path: /api/admin/auth/login
- component: annotations
path: /api/annotations/media/batch
- component: annotations
path: /api/annotations/media/{id}/annotations
db_tables:
- media
- annotations
- detection
- detection_classes
model_pin:
detections_repo_path: <path-to-model-config-or-classes-source>
classes_source: annotations/src/Database/DatabaseMigrator.cs
```
If `suite_e2e:` is missing the skill **stops** — it does not invent a default mapping.
## Mitigations (M1M7)
- **M1** Separation: this skill only touches suite-e2e files; no production deploy compose, no per-component CI, no docs, no component internals.
- **M3** Factual vs. interpretive: port, env var, API path, DB column — FACTUAL, read from the components' code. Whether a baseline still matches the model — DEFERRED to the user (the skill flags drift, never silently re-records).
- **M4** Batch questions at checkpoints.
- **M5** Skip over guess: a component change that doesn't map cleanly to one of the in-scope artifacts → skip and report.
- **M6** Assumptions footer + append to `_repo-config.yaml` `assumptions_log`.
- **M7** Drift detection: verify every path under `suite_e2e.*` exists on disk; stop if not.
## Workflow
### Phase 1: Drift check (M7)
Verify every file listed under `suite_e2e.*` (excluding `binary_fixtures`, which are gitignored) exists on disk. Missing file → stop and ask:
- Run `monorepo-discover` to refresh, OR
- Skip the missing artifact (recorded in report)
For `binary_fixtures` paths that are absent (expected — they live in S3/LFS), check whether `expected_detections.json._meta.video_sha256` is still a `TBD-...` placeholder. If yes, surface this as a known leftover (`_docs/_process_leftovers/2026-04-22_suite-e2e-binary-fixtures.md`) and continue.
### Phase 2: Determine scope
Same as `monorepo-cicd` Phase 2 — ask the user, or auto-detect. For **auto-detect**, flag commits that touch suite-e2e-relevant concerns:
| Commit pattern | Suite-e2e impact |
| -------------- | ---------------- |
| New port exposed by `<component>` | Healthcheck override may change in `e2e/docker-compose.suite-e2e.yml` |
| New required env var on `<component>` | `e2e/docker-compose.suite-e2e.yml` `e2e-runner` env block + `init.sql` seed |
| Public API path renamed / removed | Spec selector / API call path in `e2e/runner/tests/*.spec.ts` |
| DB schema column renamed in a `db_tables` entry | `init.sql` column reference + spec `pg.query` text |
| New required DB table referenced by spec | `init.sql` insert block (skip if owned by component migration) |
| Detection model rev change in `detections/` | `expected_detections.json` `_meta.model.revision` + flag baseline as stale |
| New canonical detection class added | `expected_detections.json._meta` annotation |
Present the flagged list; confirm.
### Phase 3: Classify changes per component
| Change type | Target suite-e2e files |
| ----------- | ---------------------- |
| Port / env var change | `e2e/docker-compose.suite-e2e.yml` |
| API path / contract change | `e2e/runner/tests/*.spec.ts` |
| DB schema reference change | `e2e/fixtures/init.sql` and spec SQL queries |
| Model / class catalog change | `e2e/fixtures/expected_detections.json` (mark `_meta.fixture_version` bump + leftover entry for binary refresh) |
| Playwright dependency drift | `e2e/runner/package.json` + `e2e/runner/Dockerfile` |
| Suite scenario steps gone stale | **Stop and ask** — scenario edits are user-driven, not drift-driven |
### Phase 4: Apply edits
Edit each in-scope file. After each batch, run `ReadLints` on touched files. Do NOT run the suite e2e itself — that's a downstream pipeline operation, not a sync-skill responsibility.
For `expected_detections.json`: when the model revision changes, the skill **does not** re-record the baseline — the binary fixture cannot be regenerated from the dev environment. Instead:
1. Set `_meta.model.revision` to the new revision.
2. Set `_meta.fixture_version` to a new bumped version with a `-stale` suffix (e.g., `0.2.0-stale`).
3. Append a new entry to `_docs/_process_leftovers/` describing the required re-record.
4. Leave `expected.by_class` untouched — the spec's tolerance check will fail loudly until the binary refresh lands.
### Phase 5: Update assumptions log
Append a new `assumptions_log:` entry to `_docs/_repo-config.yaml` recording:
- Date, components in scope, which suite-e2e files were touched
- Any inferred contract mappings still tagged `confirmed: false`
- Any leftover entries created
### Phase 6: Report
Render a Choose-format summary of the synced files, surface any `_process_leftovers/` entries created, and end. Do NOT auto-commit.
## Self-verification
- [ ] No file outside `e2e/`, `.woodpecker/suite-e2e.yml`, or `_docs/_process_leftovers/` was edited
- [ ] `_docs/_repo-config.yaml` `suite_e2e:` block was not silently mutated except for `assumptions_log` append
- [ ] `expected_detections.json` was not re-recorded (only metadata bumped + leftover added)
- [ ] Every spec edit traces to a flagged commit pattern in Phase 2
- [ ] `ReadLints` clean on every touched file
## Failure handling
Same retry / escalation protocol as `monorepo-cicd` — see `protocols.md`. The most common failure mode is the binary-fixture leftover (sample.mp4 missing or SHA-mismatched); this skill does not attempt to resolve it, only surfaces it.
-4
View File
@@ -59,8 +59,6 @@ Mark each as `complete` / `partial` / `missing` and explain.
- Every component in `components:` appears in the registry — flag mismatches
- Every `docs.root` file cross-referenced in config exists on disk — flag missing
- Every `ci.orchestration_files` and `ci.install_scripts` exists — flag missing
- `glossary_doc:` (if recorded in config) points to a file that exists on disk — flag missing
- The cross-cutting architecture doc identified by `docs.cross_cutting` contains a `## Architecture Vision` section — flag missing (signals the meta-repo flow's Step 2.5 was skipped or the section was removed)
### Section 5: Unresolved questions
@@ -115,8 +113,6 @@ In registry, not in config: [list or "(none)"]
In config, not in registry: [list or "(none)"]
Config-referenced docs missing: [list or "(none)"]
Config-referenced CI files missing: [list or "(none)"]
glossary_doc: [path or "not recorded — run /autodev to capture"]
Architecture Vision section: [present | missing in <doc>]
═══════════════════════════════════════════════════
Unresolved questions
+4 -5
View File
@@ -75,7 +75,7 @@ Record the description verbatim for use in subsequent steps.
**Role**: Technical analyst
**Goal**: Determine whether deep research is needed.
Read the user's description and the existing codebase documentation from DOCUMENT_DIR (architecture.md including its `## Architecture Vision` section, glossary.md, components/, system-flows.md). Use `glossary.md` to keep the new task's name, acceptance-criteria wording, and component references aligned with the user's confirmed vocabulary; flag the task to the user if the request appears to violate an Architecture Vision principle, do not silently allow it.
Read the user's description and the existing codebase documentation from DOCUMENT_DIR (architecture.md, components/, system-flows.md).
**Consult LESSONS.md**: if `_docs/LESSONS.md` exists, read it and look for entries in categories `estimation`, `architecture`, `dependencies` that might apply to the task under consideration. If a relevant lesson exists (e.g., "estimation: auth-related changes historically take 2x estimate"), bias the classification and recommendation accordingly. Note in the output which lessons (if any) were applied.
@@ -134,8 +134,7 @@ The `<task_slug>` is a short kebab-case name derived from the feature descriptio
**Goal**: Determine where and how to insert the new functionality, and whether existing tests cover the new requirements.
1. Read the codebase documentation from DOCUMENT_DIR:
- `architecture.md` — overall structure (the `## Architecture Vision` H2 is user-confirmed intent and must not be violated by the new task without explicit approval)
- `glossary.md` — project terminology; reuse the user's vocabulary in task names, AC, and component references
- `architecture.md` — overall structure
- `components/` — component specs
- `system-flows.md` — data flows (if exists)
- `data_model.md` — data model (if exists)
@@ -282,7 +281,7 @@ Present using the Choose format for each decision that has meaningful alternativ
- Update **Epic** field: `[EPIC-ID]`
3. Rename the file from `[##]_[short_name].md` to `[TICKET-ID]_[short_name].md`
If the work item tracker is not authenticated or unavailable, follow `.cursor/rules/tracker.mdc` before continuing. Only if the user explicitly chooses `tracker: local`:
If the work item tracker is not authenticated or unavailable (`tracker: local`):
- Keep the numeric prefix
- Set **Tracker** to `pending`
- Set **Epic** to `pending`
@@ -337,7 +336,7 @@ After the user chooses **Done**:
| Research skill hits a blocker | Follow research skill's own escalation rules |
| Codebase analysis reveals conflicting architectures | **ASK** user which pattern to follow |
| Complexity exceeds 5 points | **WARN** user and suggest splitting into multiple tasks |
| Work item tracker MCP unavailable | Follow `.cursor/rules/tracker.mdc`; do not continue in local mode unless the user explicitly chooses it |
| Work item tracker MCP unavailable | **WARN**, continue with local-only task files |
## Trigger Conditions
+3 -6
View File
@@ -69,7 +69,7 @@ Capture any new questions, findings, or insights that arise during test specific
### Step 2: Solution Analysis
Read and follow `steps/02_solution-analysis.md`. The step opens with **Phase 2a.0: Glossary & Architecture Vision** (BLOCKING) — drafts `_docs/02_document/glossary.md` and a one-paragraph architecture vision, presents the condensed view to the user, iterates until confirmed, then proceeds into the architecture, data-model, and deployment phases. The confirmed vision becomes the first `## Architecture Vision` H2 of `architecture.md`.
Read and follow `steps/02_solution-analysis.md`.
---
@@ -107,7 +107,6 @@ Read and follow `steps/07_quality-checklist.md`.
- **Coding during planning**: this workflow produces documents, never code
- **Multi-responsibility components**: if a component does two things, split it
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Skipping the glossary/vision gate (Phase 2a.0)**: drafting `architecture.md` from raw `solution.md` without confirming terminology and vision means the AI's mental model is not aligned with the user's; every downstream artifact will inherit that drift
- **Diagrams without data**: generate diagrams only after the underlying structure is documented
- **Copy-pasting problem.md**: the architecture doc should analyze and transform, not repeat the input
- **Vague interfaces**: "component A talks to component B" is not enough; define the method, input, output
@@ -138,10 +137,8 @@ Read and follow `steps/07_quality-checklist.md`.
│ │
│ 1. Blackbox Tests → test-spec/SKILL.md │
│ [BLOCKING: user confirms test coverage] │
│ 2. Solution Analysis → glossary + vision, architecture,
data model, deployment
│ [BLOCKING 2a.0: user confirms glossary + vision] │
│ [BLOCKING 2a: user confirms architecture] │
│ 2. Solution Analysis → architecture, data model, deployment
[BLOCKING: user confirms architecture]
│ 3. Component Decomp → component specs + interfaces │
│ [BLOCKING: user confirms components] │
│ 4. Review & Risk → risk register, iterations │
@@ -4,105 +4,20 @@
**Goal**: Produce `architecture.md`, `system-flows.md`, `data_model.md`, and `deployment/` from the solution draft
**Constraints**: No code, no component-level detail yet; focus on system-level view
### Phase 2a.0: Glossary & Architecture Vision (BLOCKING)
**Role**: Software architect + business analyst
**Goal**: Align the AI's mental model of the project with the user's intent BEFORE drafting `architecture.md`. Capture domain terminology and the user's high-level architecture vision so every downstream artifact (architecture, components, flows, tests, epics) is grounded in confirmed user intent — not in AI inference.
**Inputs**:
- `_docs/00_problem/problem.md`, `acceptance_criteria.md`, `restrictions.md`
- `_docs/00_problem/input_data/*`
- `_docs/01_solution/solution.md` (and any earlier `solution_draft*.md` siblings)
- Any blackbox-test findings produced in Step 1
**Outputs**:
- `_docs/02_document/glossary.md` (NEW)
- A confirmed "Architecture Vision" paragraph + bullet list held in working memory and used as the spine of Phase 2a's `architecture.md`
**Procedure**:
1. **Draft glossary** — extract project-specific terminology from inputs (NOT generic software terms). Include:
- Domain entities, processes, and roles
- Acronyms / abbreviations
- Internal codenames or product names
- Synonym pairs in active use (e.g., "flight" vs. "mission")
- Stakeholder personas referenced in problem.md
Each entry: one-line definition, plus a parenthetical source (`source: problem.md`, `source: solution.md §3`).
Skip terms that have a single well-known industry meaning (REST, JSON, etc.).
2. **Draft architecture vision** — synthesize from inputs:
- **One paragraph**: what the system is, who uses it, the shape of the runtime topology (monolith / services / pipeline / library / hybrid).
- **Components & responsibilities** (one-line each). At this stage these are *intent-level*, not the formal decomposition that Step 3 produces.
- **Major data flows** (one or two sentences each).
- **Architectural principles / non-negotiables** the user has implied (e.g., "DB-driven config", "no per-component state outside Redis", "all UI traffic via REST + SSE only").
- **Open architectural questions** the AI cannot resolve from inputs alone.
3. **Present condensed view** to the user (NOT the full draft files — a synopsis only):
```
══════════════════════════════════════
REVIEW: Glossary + Architecture Vision
══════════════════════════════════════
Glossary (N terms drafted):
- <Term>: <one-line definition>
- ...
Architecture Vision:
<one-paragraph synopsis>
Components / responsibilities:
- <component>: <one-line>
- ...
Principles / non-negotiables:
- <principle>
- ...
Open questions (AI could not resolve):
- <q1>
- <q2>
══════════════════════════════════════
A) Looks correct — write glossary.md, use vision for Phase 2a
B) I want to add / correct entries (provide diffs)
C) Answer the open questions first, then re-present
══════════════════════════════════════
Recommendation: pick C if open questions exist, otherwise A
══════════════════════════════════════
```
4. **Iterate**:
- On B → integrate the user's diffs/additions, re-present the condensed view, loop until A.
- On C → ask the listed open questions one round (M4-style batch), integrate answers, re-present.
- **Do NOT proceed to step 5 until the user picks A.**
5. **Save**:
- Write `_docs/02_document/glossary.md` with terms in alphabetical order. Include a top-line `**Status**: confirmed-by-user` and the date.
- Hold the confirmed vision (paragraph + components + principles) in working memory; Phase 2a will materialize it into `architecture.md` and **must** preserve every confirmed principle and component intent verbatim.
**Self-verification**:
- [ ] Every glossary entry traces to at least one input file (no invented terms)
- [ ] Every component listed in the vision is one the inputs reference
- [ ] All open questions are either answered or explicitly deferred (with the user's acknowledgement)
- [ ] User picked option A on the latest condensed view
**BLOCKING**: Do NOT proceed to Phase 2a until `glossary.md` is saved and the user has confirmed the architecture vision.
### Phase 2a: Architecture & Flows
1. Read all input files thoroughly
2. Incorporate findings, questions, and insights discovered during Step 1 (blackbox tests)
3. **Apply confirmed vision from Phase 2a.0**: the architecture document must include a top-level `## Architecture Vision` section that contains the user-confirmed paragraph, components, and principles verbatim. The rest of `architecture.md` (tech stack, deployment model, NFRs, ADRs) builds on top of that section, never contradicts it
4. Research unknown or questionable topics via internet; ask user about ambiguities
5. Document architecture using `templates/architecture.md` as structure
6. Document system flows using `templates/system-flows.md` as structure
3. Research unknown or questionable topics via internet; ask user about ambiguities
4. Document architecture using `templates/architecture.md` as structure
5. Document system flows using `templates/system-flows.md` as structure
**Self-verification**:
- [ ] `architecture.md` opens with a `## Architecture Vision` section matching Phase 2a.0
- [ ] Architecture covers all capabilities mentioned in solution.md
- [ ] System flows cover all main user/system interactions
- [ ] No contradictions with problem.md, restrictions.md, or the confirmed vision
- [ ] No contradictions with problem.md or restrictions.md
- [ ] Technology choices are justified
- [ ] Blackbox test findings are reflected in architecture decisions
- [ ] Every term used in `architecture.md` that is project-specific appears in `glossary.md`
**Save action**: Write `architecture.md` and `system-flows.md`
@@ -58,4 +58,4 @@ Do NOT create minimal epics with just a summary and short description. The epic
8. **Create "Blackbox Tests" epic** — this epic will parent the blackbox test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `tests/`.
**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If tracker availability fails, follow `.cursor/rules/tracker.mdc`; only if the user explicitly chooses `tracker: local`, save locally only with pending tracker markers.
**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If `tracker: local`, save locally only.
+1 -1
View File
@@ -133,4 +133,4 @@ Link to architecture.md and relevant component spec.]
- `component` — a normal per-component epic
- `cross-cutting` — a shared concern that spans ≥2 components
- `tests` — the blackbox-tests epic (always exactly one)
- Complexity points for child issues follow the project standard: 1, 2, 3, 5. Do not create issues above 5 points — split them.
- Complexity points for child issues follow the project standard: 1, 2, 3, 5, 8. Do not create issues above 5 points — split them.
+3 -5
View File
@@ -24,8 +24,6 @@ Phase details live in `phases/` — read the relevant file before executing each
- **Save immediately**: write artifacts to disk after each phase
- **Delegate execution**: all code changes go through the implement skill via task files
- **Ask, don't assume**: when scope or priorities are unclear, STOP and ask the user
- **Exact-fit recommendations**: do not recommend a replacement pattern, library, service, architecture, algorithm, or "modern approach" merely because it improves structure or solves a similar class of problem. It must fit confirmed product constraints, acceptance criteria, operating context, integration boundaries, and current code realities. Otherwise reject it, mark it experimental, or ask the user before adding it to the roadmap.
- **Per-mode API capability verification on replacements**: when a refactor proposes replacing or adding a library/SDK/framework/service that exposes multiple modes or configurations, pin the exact mode the refactored code will use (inputs, outputs, runtime) and verify *that mode* via mandatory `context7` lookup plus a saved Minimum Viable Example before promoting the recommendation to `Selected`. Capability claims at the category level ("supports A, B, C modes") must be cross-checked against the literal mode enumeration — `A, B → A+B` style conflations are the recurring silent-failure path.
## Context Resolution
@@ -59,7 +57,7 @@ Create REFACTOR_DIR and RUN_DIR if missing. If a RUN_DIR with the same name alre
Both modes produce `RUN_DIR/list-of-changes.md` (template: `templates/list-of-changes.md`). Both modes then convert that file into task files in TASKS_DIR during Phase 2.
**Guided mode cleanup**: after `RUN_DIR/list-of-changes.md` is created from the input file, delete the original input file only if it lives outside `RUN_DIR`. If the provided file is already the canonical `RUN_DIR/list-of-changes.md`, keep it as the audit record.
**Guided mode cleanup**: after `RUN_DIR/list-of-changes.md` is created from the input file, delete the original input file to avoid duplication.
## Workflow
@@ -81,10 +79,10 @@ Both modes produce `RUN_DIR/list-of-changes.md` (template: `templates/list-of-ch
- "refactor [specific target]" → skip phase 1 if docs exist
- Default → all phases
**Testability-run specifics** (guided mode invoked by autodev existing-code Step 4 or greenfield Step 8):
**Testability-run specifics** (guided mode invoked by autodev existing-code flow Step 4):
- Run name is `01-testability-refactoring`.
- Phase 3 (Safety Net) is skipped by design — no tests exist yet. Compensating control: the `list-of-changes.md` gate in Phase 1 must be reviewed and approved by the user before Phase 4 runs.
- Scope is MINIMAL and surgical; reject change entries that drift into full refactor territory (see the invoking flow's testability step for allowed/disallowed lists). Flagged entries go to `RUN_DIR/deferred_to_refactor.md` for the next optional full-refactor step or backlog consideration.
- Scope is MINIMAL and surgical; reject change entries that drift into full refactor territory (see existing-code flow Step 4 for allowed/disallowed lists). Flagged entries go to `RUN_DIR/deferred_to_refactor.md` for Step 8 (optional full refactor) consideration.
- After Phase 4 (Execution) completes, write `RUN_DIR/testability_changes_summary.md` as Phase 4.5. Format: one bullet per applied change.
```markdown
# Testability Changes Summary ({{run_name}})
@@ -95,7 +95,7 @@ Also copy to project standard locations:
**Critical step — do not skip.** Before producing the change list, cross-reference documented business flows against actual implementation. This catches issues that static code inspection alone misses.
1. **Read documented flows**: Load `DOCUMENT_DIR/system-flows.md`, `DOCUMENT_DIR/architecture.md` (paying special attention to its `## Architecture Vision` section — that's the user-confirmed structural intent), `DOCUMENT_DIR/glossary.md`, `DOCUMENT_DIR/module-layout.md`, every file under `DOCUMENT_DIR/contracts/`, and `SOLUTION_DIR/solution.md` (whichever exist). Extract every documented business flow, data path, architectural decision, module ownership boundary, and contract shape. Any refactor change that contradicts a confirmed Architecture Vision principle must either be rejected or surfaced to the user before being added to `list-of-changes.md` — those principles are not refactor targets without explicit user approval.
1. **Read documented flows**: Load `DOCUMENT_DIR/system-flows.md`, `DOCUMENT_DIR/architecture.md`, `DOCUMENT_DIR/module-layout.md`, every file under `DOCUMENT_DIR/contracts/`, and `SOLUTION_DIR/solution.md` (whichever exist). Extract every documented business flow, data path, architectural decision, module ownership boundary, and contract shape.
2. **Trace each flow through code**: For every documented flow (e.g., "video batch processing", "image tiling", "engine initialization"), walk the actual code path line by line. At each decision point ask:
- Does the code match the documented/intended behavior?
+4 -29
View File
@@ -7,29 +7,14 @@
## 2a. Deep Research
1. Analyze current implementation patterns
2. Extract the **Project Constraint Matrix** from `problem.md`, `restrictions.md`, `acceptance_criteria.md`, current architecture/docs, and actual code constraints. Include required inputs/outputs, operating context, lifecycle assumptions, integration boundaries, non-functional targets, and hard disqualifiers.
3. Research modern approaches for similar systems
4. For each alternative pattern/library/service/architecture/algorithm, research intrinsic implementation constraints: required inputs/outputs, runtime assumptions, supported deployment modes, resource needs, operational limits, licensing/security constraints, and known failure reports.
**API Capability Verification — Per-Mode (MANDATORY, BLOCKING for proposed replacements)**
When a refactor recommendation replaces (or adds) a library/SDK/framework/service, the same per-mode verification used by `/research` Step 2 applies — selecting a replacement on category fit alone is the same silent-failure path. For every replacement candidate that has multiple modes or configurations:
1. **Pin the exact mode/configuration** the refactored code will use, in one explicit sentence. Inputs (data shapes, sensor counts, payloads, rates), outputs (per `acceptance_criteria.md` and contract files), runtime (matching the project's deployment).
2. **Run `context7` (or equivalent docs lookup)** for the candidate. **Mandatory for every replacement library/SDK/framework candidate**, not optional. Minimum three queries per candidate: mode enumeration, project's exact mode (with input/output shapes), disqualifier probe ("does this mode produce the required output? are there published limitations on this runtime?"). Append URLs to `RUN_DIR/analysis/research_findings.md` references section.
3. **Save a Minimum Viable Example (MVE)** for the pinned mode under `RUN_DIR/analysis/mve_evidence.md` with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌. If no official example covers the project's exact configuration, the recommendation cannot be `Selected` based on category fit alone — it must be `Experimental only` (with required-evidence note) or `Rejected`.
4. **Treat "the same library in a different mode" as a different recommendation.** If the project's pinned mode is `<X>` but the only documented evidence covers `<Y>`, do not silently soften the description. Open a separate recommendation row, with its own MVE, fit assessment, and disqualifiers.
5. **Common silent-failure pattern**: a fact summary paraphrases docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes" — no `A+B` combination exists. Cross-check paraphrased capability claims against the literal mode enumeration.
5. Identify what could be done differently
6. Suggest improvements only when they fit the Project Constraint Matrix. A cleaner or more modern approach that violates product constraints must be marked `Rejected` or `Experimental only`, not added as a roadmap recommendation.
2. Research modern approaches for similar systems
3. Identify what could be done differently
4. Suggest improvements based on state-of-the-art practices
Write `RUN_DIR/analysis/research_findings.md`:
- Current state analysis: patterns used, strengths, weaknesses
- Alternative approaches per component: current vs alternative, pros/cons, migration effort
- Prioritized recommendations: quick wins + strategic improvements
- Constraint-fit table: recommendation, **pinned mode/config**, constraints checked, **API capability evidence (MVE link)**, evidence, mismatches/disqualifiers, status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
- For every recommendation that replaces or adds a library/SDK/framework, append a **Restrictions × Candidate-Mode sub-matrix** that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's pinned mode, marking each cell ✅ Pass / ❌ Fail / ❓ Verify / N/A with cited evidence. A recommendation cannot be `Selected` while any cell is ❌ or ❓.
## 2b. Solution Assessment & Hardening Tracks
@@ -37,7 +22,6 @@ Write `RUN_DIR/analysis/research_findings.md`:
2. Identify weak points in codebase, map to specific code areas
3. Perform gap analysis: acceptance criteria vs current state
4. Prioritize changes by impact and effort
5. Reject or escalate any proposed refactor that improves code structure while weakening required behavior, integration contracts, runtime constraints, safety/security posture, or acceptance criteria
Present optional hardening tracks for user to include in the roadmap:
@@ -63,9 +47,6 @@ Write `RUN_DIR/analysis/refactoring_roadmap.md`:
- Gap analysis: what's missing, what needs improvement
- Phased roadmap: Phase 1 (critical fixes), Phase 2 (major improvements), Phase 3 (enhancements)
- Selected hardening tracks and their items
- Applicability gate: each roadmap item must state constraint fit, mismatches, required evidence, and status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
**BLOCKING applicability gate**: Before 2c and 2d, every recommendation in the roadmap must be `Selected`. Items marked `Rejected` are excluded. Items marked `Experimental only` or `Needs user decision` require a user decision before task creation.
## 2c. Create Epic
@@ -74,7 +55,7 @@ Create a work item tracker epic for this refactoring run:
1. Epic name: the RUN_DIR name (e.g., `01-testability-refactoring`)
2. Create the epic via configured tracker MCP
3. Record the Epic ID — all tasks in 2d will be linked under this epic
4. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; only use `PENDING` placeholders if the user explicitly chooses `tracker: local`
4. If tracker unavailable, use `PENDING` placeholder and note for later
## 2d. Task Decomposition
@@ -98,12 +79,6 @@ Convert the finalized `RUN_DIR/list-of-changes.md` into implementable task files
**Self-verification**:
- [ ] All acceptance criteria are addressed in gap analysis
- [ ] Recommendations are grounded in actual code, not abstract
- [ ] Every recommendation has been checked against the Project Constraint Matrix
- [ ] No recommendation violates product restrictions, acceptance criteria, documented architecture decisions, or actual code integration boundaries
- [ ] Every replacement library/SDK/framework recommendation has a pinned mode/config, a saved MVE in `mve_evidence.md`, and a Restrictions × Candidate-Mode sub-matrix with no ❌ or ❓ cells
- [ ] `context7` (or equivalent) was consulted for every replacement library/SDK/framework recommendation
- [ ] Paraphrased capability claims have been cross-checked against the literal mode-enumeration evidence (no `A, B → A+B` style conflation)
- [ ] Rejected and experimental approaches are documented but not converted into implementation tasks without user approval
- [ ] Roadmap phases are prioritized by impact
- [ ] Epic created and all tasks linked to it
- [ ] Every entry in list-of-changes.md has a corresponding task file in TASKS_DIR
@@ -10,7 +10,7 @@
- All `[TRACKER-ID]_refactor_*.md` files are present
- Each task file has valid header fields (Task, Name, Description, Complexity, Dependencies)
2. Verify `TASKS_DIR/_dependencies_table.md` includes the refactoring tasks
3. Verify all tests pass (safety net from Phase 3 is green), unless this is a testability run where Phase 3 was intentionally skipped
3. Verify all tests pass (safety net from Phase 3 is green)
4. If any check fails, go back to the relevant phase to fix
## 4b. Delegate to Implement Skill
@@ -21,9 +21,9 @@ The implement skill will:
1. Parse task files and dependency graph from TASKS_DIR
2. Detect already-completed tasks (skip non-refactoring tasks from prior workflow steps)
3. Compute execution batches for the refactoring tasks
4. Implement tasks sequentially in topological order (no subagents, no parallelism)
4. Launch implementer subagents (up to 4 in parallel)
5. Run code review after each batch
6. Commit per batch and push only when the user approved pushing
6. Commit and push per batch
7. Update work item ticket status
Do NOT modify, skip, or abbreviate any part of the implement skill's workflow. The refactor skill is delegating execution, not optimizing it.
@@ -47,7 +47,7 @@ After the implement skill completes:
For each successfully completed refactoring task:
1. Transition the work item ticket status to **Done** via the configured tracker MCP
2. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; if the user explicitly chose `tracker: local`, note the pending status transitions in `RUN_DIR/execution_log.md`
2. If tracker unavailable, note the pending status transitions in `RUN_DIR/execution_log.md`
For any failed or blocked tasks, leave their status as-is (the implement skill already set them to In Testing or blocked).
@@ -32,7 +32,7 @@ For each component doc affected:
## 7d. Update System-Level Documentation
If structural changes were made (new modules, removed modules, changed interfaces):
1. Update `_docs/02_document/architecture.md` if architecture changed — but **never edit the `## Architecture Vision` section**. That section is user-confirmed (plan Phase 2a.0 / document Step 4.5); if a refactor invalidates a vision principle, surface it to the user and let them update the vision themselves before continuing. Update only the technical sections below the Vision H2.
1. Update `_docs/02_document/architecture.md` if architecture changed
2. Update `_docs/02_document/system-flows.md` if flow sequences changed
3. Update `_docs/02_document/diagrams/components.md` if component relationships changed
@@ -23,7 +23,6 @@ Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
- **Problem**: [what makes this problematic / untestable / coupled]
- **Change**: [what to do — behavioral description, not implementation steps]
- **Rationale**: [why this change is needed]
- **Constraint Fit**: [which product constraints / acceptance criteria / integration boundaries this preserves; or "Rejected — violates ..."]
- **Risk**: [low | medium | high]
- **Dependencies**: [other change IDs this depends on, or "None"]
@@ -32,7 +31,6 @@ Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
- **Problem**: [description]
- **Change**: [description]
- **Rationale**: [description]
- **Constraint Fit**: [description]
- **Risk**: [low | medium | high]
- **Dependencies**: [C01, or "None"]
```
@@ -46,8 +44,6 @@ Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
- **File(s)** must reference actual files verified to exist in the codebase
- **Problem** describes the current state, not the desired state
- **Change** describes what the system should do differently — behavioral, not prescriptive
- **Constraint Fit** proves the change preserves confirmed product requirements, restrictions, acceptance criteria, architecture decisions, and integration contracts
- Do not include changes whose only benefit is structural cleanliness if they weaken required behavior or violate constraints; record those as rejected in analysis instead
- **Dependencies** reference other change IDs within this list; cross-run dependencies use tracker IDs
- In guided mode, the input file entries are validated against actual code and enriched with file paths, risk, and dependencies before writing
- In automatic mode, entries are derived from Phase 1 component analysis and Phase 2 research findings
-21
View File
@@ -30,27 +30,6 @@ Transform vague topics raised by users into high-quality, deliverable research r
- **Internet-first investigation** — do not rely on training data for factual claims; search the web extensively for every sub-question, rephrase queries when results are thin, and keep searching until you have converging evidence from multiple independent sources
- **Multi-perspective analysis** — examine every problem from at least 3 different viewpoints (e.g., end-user, implementer, business decision-maker, contrarian, domain expert, field practitioner); each perspective should generate its own search queries
- **Question multiplication** — for each sub-question, generate multiple reformulated search queries (synonyms, related terms, negations, "what can go wrong" variants, practitioner-focused variants) to maximize coverage and uncover blind spots
- **Component option breadth** — for every component area, build a broad option landscape before selecting. Search direct candidates, adjacent-domain alternatives, commercial/open-source variants, classical/simple baselines, current SOTA, and "do not use" failure cases. A component may not be narrowed to one candidate until alternatives have been searched and rejected with evidence.
- **Component research depth** — for every serious component candidate, go beyond discovery pages. Read official docs, repository/license files, issue discussions, benchmarks, deployment guides, version/platform requirements, security notes, maintenance signals, and real-world failure reports. Extract evidence for inputs/outputs, lifecycle assumptions, runtime/storage/latency fit, integration boundaries, licensing, operational risks, and unsupported scenarios before assigning any selection status.
- **Exact-fit component selection** — never select a component, tool, library, service, architecture pattern, or algorithm merely because it solves a similar class of problem. It must be proven compatible with the project's explicit operating context, constraints, required inputs/outputs, non-functional requirements, lifecycle assumptions, and acceptance criteria. If fit is unproven or mismatched, mark it `Rejected`, `Experimental only`, or escalate for user decision before it can shape the solution.
- **Per-mode API capability verification** *(applies only to technical-component selection — see Research Output Class below)* — when a candidate library/SDK/framework/service exposes multiple modes or configurations, *the candidate is not a single thing*. Pin the exact mode the project will use (one explicit sentence: inputs, outputs, runtime), and verify *that mode* against the project's required inputs/outputs via official docs (mandatory `context7` lookup) plus a saved Minimum Viable Example. Capability claims at the category level ("supports X, Y, Z modes") must be cross-checked against the literal mode enumeration before being treated as project-applicable. Two modes of one library are two distinct candidates for the purposes of the Component Applicability Gate. Does not apply to non-technical research (concept comparison, market/policy investigation, knowledge organization, etc.).
## Research Output Class (BLOCKING — set in Step 1)
Before applying any of the technical-component gates (per-mode API capability verification, Component Applicability Gate, Restrictions × Candidate-Mode sub-matrix, MVE evidence, mandatory `context7` lookup), classify the research output into one of two classes. Record the decision in `00_question_decomposition.md` once, near the top, so every downstream step honors it.
| Class | What the output recommends or selects | Examples | Technical-component gates apply? |
|-------|---------------------------------------|----------|----------------------------------|
| **Technical-component selection** | One or more libraries, SDKs, frameworks, services, protocols, data formats, infrastructure patterns, algorithms, or APIs that will be implemented or operated against | "Pick a vector database", "Compare auth-token strategies for our API", "Should we use Kafka or RabbitMQ?", architecture / tech-stack / migration drafts (Mode A, Mode B) | **Yes — all gates active** |
| **Non-technical investigation** | Concept comparisons, knowledge organization, root-cause investigation of an event, market/policy/regulatory/social analysis, literature review, decision support without committing to specific tooling | "Why did adoption stall in Q3?", "Compare phenomenology vs constructivism", "Map regulatory landscape for X", "What do practitioners say about onboarding under remote-first orgs?" | **No — skip API/MVE/sub-matrix gates; the rest of the 8-step engine still applies** |
How to decide:
1. Inspect the question and the input files (`problem.md`, `restrictions.md`, `acceptance_criteria.md`, or the standalone input file).
2. If the deliverable will name specific software/services/protocols that someone will then build with or operate, it is **Technical-component selection**.
3. If the deliverable is a report, comparison, or recommendation that does not commit to specific tooling, it is **Non-technical investigation**.
4. **Mixed runs are valid.** Some research questions have a non-technical core but include one technical sub-question (or vice versa). In that case classify per component area within the run, not the run as a whole, and note in `00_question_decomposition.md` which component areas trigger the technical-component gates.
When the run is purely **Non-technical investigation**, the rest of the research engine — question decomposition, perspective rotation, exhaustive web search, fact extraction, comparison framework, reasoning chain, validation, deliverable formatting — still applies in full. The sections that get skipped are explicitly the technical gates listed in the table above.
## Context Resolution
@@ -27,26 +27,13 @@
- [ ] Iterative deepening completed: follow-up questions from initial findings were searched
- [ ] No sub-question relies solely on training data without web verification
## Component Option Breadth
- [ ] `00_question_decomposition.md` contains a Component Option Search Plan
- [ ] Every component area was searched across simple baseline, established production, open-source, commercial/vendor, current SOTA, adjacent-domain, no-build/defer, and known-bad options where applicable
- [ ] Every component area has at least 3 realistic candidates, or a documented explanation of why broad searches found fewer
- [ ] Each lead candidate has official/source-of-truth evidence plus independent validation when available
- [ ] Each component area includes at least one baseline/fallback option and at least one rejected or experimental option when possible
- [ ] Alternative names, synonyms, and neighboring-domain terms were searched before declaring the option landscape complete
- [ ] Licensing, runtime, platform, maintenance, and unsupported-scenario searches were performed for every lead, fallback, and rejected candidate
## Mode A Specific
- [ ] Phase 1 completed: AC assessment was presented to and confirmed by user
- [ ] AC assessment consistent: Solution draft respects the (possibly adjusted) acceptance criteria and restrictions
- [ ] Competitor analysis included: Existing solutions were researched
- [ ] All components have comparison tables: Each component lists alternatives with tools, advantages, limitations, security, cost
- [ ] Component options are broad: component tables include baseline, production, open-source, commercial/vendor, SOTA/research, adjacent-domain, defer/no-build, and disqualified options where applicable
- [ ] Tools/libraries verified: Suggested tools actually exist and work as described
- [ ] Component fit matrix completed: `06_component_fit_matrix.md` exists and every selected component/tool/pattern is marked `Selected`
- [ ] No field-adjacent substitution: no selected candidate is chosen only because it solves a similar class of problem while failing the project's explicit constraints
- [ ] Testing strategy covers AC: Tests map to acceptance criteria
- [ ] Tech stack documented (if Phase 3 ran): `tech_stack.md` has evaluation tables, risk assessment, and learning requirements
- [ ] Security analysis documented (if Phase 4 ran): `security_analysis.md` has threat model and per-component controls
@@ -58,9 +45,6 @@
- [ ] New draft is self-contained: Written as if from scratch, no "updated" markers
- [ ] Performance column included: Mode B comparison tables include performance characteristics
- [ ] Previous draft issues addressed: Every finding in the table is resolved in the new draft
- [ ] Existing selected components were challenged against a broad alternative landscape before being kept
- [ ] Existing component fit audited: every old and new component/tool/pattern was checked against `restrictions.md`, `acceptance_criteria.md`, and the Project Constraint Matrix
- [ ] Rejected/experimental candidates are not lead recommendations unless the user explicitly accepted the risk
## Timeliness Check (High-Sensitivity Domain BLOCKING)
@@ -92,33 +76,3 @@ When the research topic has Critical or High sensitivity level:
- [ ] Cited facts have corresponding statements in the original text (no over-interpretation)
- [ ] Source publication/update dates annotated; technical docs include version numbers
- [ ] Unverifiable information annotated `[limited source]` and not sole support for core conclusions
## Exact-Fit Validation (BLOCKING)
- [ ] Project Constraint Matrix extracted from problem context before component selection
- [ ] Component fit matrix includes `Component Area`, `Option Family`, and `Pinned Mode/Config` columns
- [ ] Every selected component/tool/library/service/pattern/algorithm has evidence for required inputs/outputs and integration boundaries
- [ ] Every selected candidate has evidence for the operating context and lifecycle assumptions it must support
- [ ] Every selected candidate has evidence for non-functional targets that are binding for the project
- [ ] Known unsupported scenarios and failure reports were searched for every selected candidate
- [ ] Mismatches are recorded as disqualifiers, not softened into generic limitations
- [ ] Any candidate with unproven fit is marked `Experimental only` or escalated for user decision
- [ ] Any candidate with documented constraint conflict is marked `Rejected`
## API Capability Verification (BLOCKING)
**Applicability**: this checklist applies only when the run is classified as **Technical-component selection** (see SKILL.md → Research Output Class). For non-technical research (concept comparison, market/policy investigation, root-cause analysis, knowledge organization), skip this checklist entirely and note the skip in `05_validation_log.md`. For mixed runs, apply only to technical component areas.
For every lead candidate that is a library/SDK/framework/service:
- [ ] The exact mode/configuration the project will use is pinned in one explicit sentence (inputs, outputs, runtime); no vague "supports X" language
- [ ] `context7` (or equivalent docs lookup) was run for the candidate, with at least 3 queries: mode enumeration, project's exact mode, disqualifier probe
- [ ] All consulted URLs from context7 / official docs are appended to `01_source_registry.md`
- [ ] A Minimum Viable Example (MVE) was saved for the pinned mode in `02_fact_cards.md` (or `02_mve_evidence.md`) with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌
- [ ] When the MVE inputs or outputs do not exactly match the project's, the mismatch is cited from the official docs (not inferred), and the candidate is `Experimental only` or `Rejected`
- [ ] When a library has multiple modes, each project-relevant mode appears as its own candidate row (not a single library row that softens across modes)
- [ ] Restrictions × Candidate-Modes sub-matrix in `06_component_fit_matrix.md` is filled for every lead candidate, with one row per numbered restriction and per numbered acceptance criterion
- [ ] Sub-matrix uses ✅ / ❌ / ❓ / N/A only — no free-form prose substitutes
- [ ] No `Selected` candidate has any ❌ or ❓ cell in its sub-matrix
- [ ] "Validation gate required" footnotes are explicitly classified as either *API capability* (must be resolved here) or *runtime quality* (may be carried forward)
- [ ] Paraphrased capability claims in fact cards have been cross-checked against the literal mode-enumeration evidence (no `mono, inertial → mono-inertial` style conflation)
@@ -57,7 +57,6 @@ RESEARCH_DIR/
├── 03_comparison_framework.md # Step 4 output: selected framework and populated data
├── 04_reasoning_chain.md # Step 6 output: fact → conclusion reasoning
├── 05_validation_log.md # Step 7 output: use-case validation results
├── 06_component_fit_matrix.md # Step 7.5 output: component exact-fit gate
└── raw/ # Raw source archive (optional)
├── source_1.md
└── source_2.md
@@ -74,7 +73,6 @@ RESEARCH_DIR/
| Step 4 | Selected comparison framework + initial population | `03_comparison_framework.md` |
| Step 6 | Reasoning process for each dimension | `04_reasoning_chain.md` |
| Step 7 | Validation scenarios + results + review checklist | `05_validation_log.md` |
| Step 7.5 | Component exact-fit gate and selection status | `06_component_fit_matrix.md` |
| Step 8 | Complete solution draft | `OUTPUT_DIR/solution_draft##.md` |
### Save Principles
@@ -97,7 +95,6 @@ RESEARCH_DIR/
| `03_comparison_framework.md` | Selected framework and populated data | After Step 4 completion |
| `04_reasoning_chain.md` | Fact → conclusion reasoning | After Step 6 completion |
| `05_validation_log.md` | Use-case validation and review | After Step 7 completion |
| `06_component_fit_matrix.md` | Exact-fit matrix for every proposed component/tool/pattern with status `Selected` / `Rejected` / `Experimental only` / `Needs user decision` | Before Step 8 deliverable formatting |
| `OUTPUT_DIR/solution_draft##.md` | Complete solution draft | After Step 8 completion |
| `OUTPUT_DIR/tech_stack.md` | Tech stack evaluation and decisions | After Phase 3 (optional) |
| `OUTPUT_DIR/security_analysis.md` | Threat model and security controls | After Phase 4 (optional) |
@@ -73,18 +73,16 @@ Full 8-step research methodology. Produces the first solution draft.
**Task** (drives the 8-step engine):
1. Research existing/competitor solutions for similar problems — search broadly across industries and adjacent domains, not just the obvious competitors
2. Research the problem thoroughly — all possible ways to solve it, split into components; search for how different fields approach analogous problems
3. Derive a **Project Constraint Matrix** before evaluating component options. Extract exact constraints from `problem.md`, `restrictions.md`, `acceptance_criteria.md`, input data notes, and the Phase 1 AC assessment. Include required inputs/outputs, operating context, runtime envelope, data availability, lifecycle boundaries, non-functional targets, integration boundaries, security constraints, and explicit out-of-scope decisions.
4. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1
5. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions
6. Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist?
7. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues)
8. For every candidate component/tool/library/service/pattern/algorithm, prove exact fit against the Project Constraint Matrix. A field-adjacent solution is not selectable unless its documented implementation assumptions match the project's constraints. Mismatches must be recorded as disqualifiers and the candidate marked `Rejected`, `Experimental only`, or `Needs user decision`.
9. Include security considerations in each component analysis
10. Provide rough cost estimates for proposed solutions
3. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1
4. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions
5. Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist?
6. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues)
7. Include security considerations in each component analysis
8. Provide rough cost estimates for proposed solutions
Be concise in formulating. The fewer words, the better, but do not miss any important details.
**Save action**: Write `RESEARCH_DIR/06_component_fit_matrix.md` before the final draft, then write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md`
**Save action**: Write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md`
---
@@ -10,25 +10,18 @@ Full 8-step research methodology applied to assessing and improving an existing
**Task** (drives the 8-step engine):
1. Read the existing solution draft thoroughly
2. Derive or refresh the **Project Constraint Matrix** from all files in INPUT_DIR. Include required inputs/outputs, operating context, runtime envelope, data availability, lifecycle boundaries, non-functional targets, integration boundaries, security constraints, and explicit out-of-scope decisions.
3. Audit every component/decision in the existing draft against the Project Constraint Matrix before researching alternatives:
- If a component's documented implementation assumptions match the project constraints, keep it eligible and record evidence.
- If fit is unproven, mark it `Experimental only` until evidence is found.
- If constraints conflict, mark it `Rejected` and search for alternatives.
- If rejecting it changes product behavior or risk materially, escalate for user decision.
4. Research in internet extensively — for each component/decision in the draft, search for:
2. Research in internet extensively — for each component/decision in the draft, search for:
- Known problems and limitations of the chosen approach
- What practitioners say about using it in production
- Better alternatives that may have emerged recently
- Common failure modes and edge cases
- How competitors/similar projects solve the same problem differently
5. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure"
6. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft
7. Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports
8. For each identified weak point, search for multiple solution approaches and compare them
9. For every revised candidate, prove exact fit against the Project Constraint Matrix. Do not select field-adjacent or "similar problem" options unless their intrinsic implementation constraints match the project.
10. Based on findings, form a new solution draft in the same format
3. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure"
4. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft
5. Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports
6. For each identified weak point, search for multiple solution approaches and compare them
7. Based on findings, form a new solution draft in the same format
**Save action**: Write `RESEARCH_DIR/06_component_fit_matrix.md` before the final draft, then write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`
**Save action**: Write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`
**Optional follow-up**: After Mode B completes, the user can request Phase 3 (Tech Stack Consolidation) or Phase 4 (Security Deep Dive) using the revised draft. These phases work identically to their Mode A descriptions in `steps/01_mode-a-initial-research.md`.
@@ -40,7 +40,6 @@ Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources
- "What existing/competitor solutions address this problem?"
- "What are the component parts of this problem?"
- "For each component, what are the state-of-the-art solutions?"
- "For each component, what are the practical alternatives across simple baseline, established production option, open-source option, commercial option, current SOTA, adjacent-domain option, and no-build/defer option?"
- "What are the security considerations per component?"
- "What are the cost implications of each approach?"
@@ -49,7 +48,6 @@ Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources
- "What are the security vulnerabilities in the proposed architecture?"
- "Where are the performance bottlenecks?"
- "What solutions exist for each identified issue?"
- "For each component already selected in the draft, what alternatives should be considered before keeping, replacing, or rejecting it?"
**General sub-question patterns** (use when applicable):
- **Sub-question A**: "What is X and how does it work?" (Definition & mechanism)
@@ -86,27 +84,6 @@ For **each sub-question**, generate **at least 3-5 search query variants** befor
Record all planned queries in `00_question_decomposition.md` alongside each sub-question.
#### Component Option Breadth (MANDATORY)
Before Step 2, identify the component areas implied by the problem and create a search plan for options in each area. A component area is any replaceable tool, library, model, service, algorithm, data format, protocol, infrastructure pattern, or validation approach that could materially affect the solution.
For every component area, generate search queries for these option families unless clearly not applicable:
- **Simple baseline**: low-complexity classical or manual approach that can serve as a fallback or regression baseline.
- **Established production option**: mature library/service/pattern with field usage.
- **Open-source candidate**: permissive-license option with inspectable implementation and community history.
- **Commercial/vendor option**: paid or vendor-supported option, including SDK/platform constraints.
- **Current SOTA / research option**: recent model, paper, or benchmark leader that may be promising but immature.
- **Adjacent-domain option**: solution from a neighboring domain with similar constraints.
- **No-build / defer option**: whether the component can be avoided, simplified, or moved out of scope.
- **Known bad option**: candidate or family that appears attractive but has documented failure modes or disqualifiers.
For each component area, record:
- Candidate names and option families to search.
- At least 5 query variants covering alternatives, comparisons, limitations, licensing, runtime/scale, and exact project constraints.
- The minimum evidence needed to mark a candidate `Selected`, `Rejected`, `Experimental only`, or `Needs user decision`.
Add this as a "Component Option Search Plan" section in `00_question_decomposition.md`.
**Research Subject Boundary Definition (BLOCKING - must be explicit)**:
When decomposing questions, you must explicitly define the **boundaries of the research subject**:
@@ -117,9 +94,6 @@ When decomposing questions, you must explicitly define the **boundaries of the r
| **Geography** | Which region is being studied? | Chinese universities vs US universities vs global |
| **Timeframe** | Which period is being studied? | Post-2020 vs full historical picture |
| **Level** | Which level is being studied? | Undergraduate vs graduate vs vocational |
| **Operating context** | What exact environment, lifecycle phase, and runtime conditions must the solution support? | In-flight embedded runtime vs offline post-processing; production web traffic vs admin batch job |
| **Required interfaces** | What inputs, outputs, protocols, data shapes, and ownership boundaries are fixed? | One camera vs stereo rig; REST API vs message queue; local file boundary vs service API |
| **Non-functional envelope** | What latency, throughput, storage, memory, availability, safety, security, cost, and maintainability targets are binding? | <400 ms p95, 8 GB RAM, 99.9% availability, reversible migrations |
**Common mistake**: User asks about "university classroom issues" but sources include policies targeting "K-12 students" — mismatched target populations will invalidate the entire research.
@@ -142,11 +116,9 @@ Record the audit result in `00_question_decomposition.md` as a "Completeness Aud
- Summary of relevant problem context from INPUT_DIR
- Classified question type and rationale
- **Research subject boundary definition** (population, geography, timeframe, level)
- **Project Constraint Matrix summary** (operating context, required interfaces, non-functional envelope, lifecycle assumptions, and hard disqualifiers extracted from input files)
- List of decomposed sub-questions
- **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale
- **Search query variants** for each sub-question (at least 3-5 per sub-question)
- **Component Option Search Plan** (component areas, option families, candidate names, query variants, required evidence)
- **Completeness audit** (taxonomy cross-reference + domain discovery results)
4. Write TodoWrite to track progress
@@ -160,7 +132,7 @@ Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). C
**Tool Usage**:
- Use `WebSearch` for broad searches; `WebFetch` to read specific pages
- Use the `context7` MCP server (`resolve-library-id` then `query-docs` / `get-library-docs`) for up-to-date library/framework documentation. **Mandatory per lead candidate** — see "API Capability Verification" below.
- Use the `context7` MCP server (`resolve-library-id` then `get-library-docs`) for up-to-date library/framework documentation
- Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories)
- When citing web sources, include the URL and date accessed
@@ -173,72 +145,12 @@ Do not stop at the first few results. The goal is to build a comprehensive evide
- Consult at least **2 different source tiers** per sub-question (e.g., L1 official docs + L4 community discussion)
- If initial searches yield fewer than 3 relevant sources for a sub-question, **broaden the search** with alternative terms, related domains, or analogous problems
**Minimum search effort per component area**:
- Search every option family from the "Component Option Search Plan" before choosing a lead candidate.
- For each lead, fallback, or rejected candidate, search at least one official/source-of-truth page and at least one independent validation source when available.
- Search `"[component] alternatives"`, `"[candidate] vs [alternative]"`, `"[candidate] limitations"`, `"[candidate] license"`, `"[candidate] production"`, and `"[candidate] [binding project constraint]"`.
- If fewer than 3 realistic candidates are found for a component area, explicitly document why the landscape is narrow and search adjacent domains before accepting that result.
- Include at least one simple baseline and one "do not use" or disqualified candidate per component area when possible; these prevent false confidence in the selected option.
**Candidate implementation-limit searches (MANDATORY)**:
For every component/tool/library/service/pattern/algorithm that may be selected or recommended, search for its intrinsic implementation constraints. Do not rely on product category labels, marketing summaries, or examples from a different operating context. Include query variants for:
- Official supported inputs/outputs, protocols, data formats, and deployment modes
- Required hardware/runtime/platform/version constraints
- Timing, throughput, memory, storage, synchronization, and scaling assumptions
- Lifecycle assumptions: offline vs online, batch vs real time, development vs production, single tenant vs multi tenant, local vs networked
- Known unsupported scenarios, limitations, issue reports, production failures, and workarounds
- Licensing, security, maintenance, and community-health constraints
- Exact phrases from the project's restrictions and acceptance criteria combined with the candidate name
**API Capability Verification — Per-Mode (MANDATORY, BLOCKING for lead candidates)**:
**Applicability**: this section applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section, and only to lead candidates that are libraries/SDKs/frameworks/services/protocols/data formats with multiple modes or configurations. For non-technical research (concept comparison, market/policy investigation, knowledge organization, root-cause analysis without tooling commitments), skip this entire sub-section and continue with the rest of Step 2 — the broader candidate implementation-limit search above is sufficient. State the skip explicitly once in `02_fact_cards.md`: `API Capability Verification: not applicable — this run is a Non-technical investigation, no library/SDK/service candidates`.
Most libraries/SDKs/services expose **multiple modes or configurations** (e.g., monocular vs stereo VO, sync vs async API, batch vs streaming inference, write-through vs write-behind cache). Selecting a candidate "because it supports X" without pinning *which mode* the project will use, and *whether that exact mode produces the required outputs from the required inputs*, is the most common silent-failure path in research. A library can support a class of problem in mode A while being unusable for the project's specific configuration in mode B.
For every lead candidate that is a library/SDK/framework/service with multiple modes or configurations, do the following — in this order, before marking the candidate `Selected`:
1. **Pin the exact mode/configuration the project will use.**
Derived from the Project Constraint Matrix: which inputs are available (sensor count, sensor types, data shapes, rates), which outputs are required (per `acceptance_criteria.md` and contract files), which hardware/runtime is fixed (per `restrictions.md`). Write this as a single sentence: "We will use `<library>` in `<mode/config>` with inputs `<list>` and expect outputs `<list>` on `<runtime>`." Do not progress past this step on a vague mode description.
2. **Run `context7` (or equivalent docs lookup) for the candidate** — this is **mandatory for every lead library/SDK/framework candidate**, not optional. Minimum three queries per candidate:
1. *Mode enumeration*: "What modes/configurations does `<library>` support? List every value of the mode/config enum and what each requires as input."
2. *Project's exact mode*: "Show a minimum runnable example of `<library>` in `<the pinned mode>` with `<the project's input shape>`. What does it produce?"
3. *Disqualifier probe*: "Does `<library>` `<the pinned mode>` produce `<the required output>`? Are there published limitations of `<the pinned mode>` for `<the project's runtime/hardware>`?"
For services without context7 coverage, use official docs site + WebFetch on the API reference page + the project's example/tutorial directory in the source repo. Append every consulted URL to `01_source_registry.md`.
3. **Save a Minimum Viable Example (MVE) for the pinned mode.**
Append to `02_fact_cards.md` (or a sibling `02_mve_evidence.md`) at least one block per lead library candidate with:
```markdown
## MVE — <library> in <pinned mode>
- **Source**: <official URL or context7 reference, with date>
- **Inputs in the example**: <e.g., 2 calibrated cameras + IMU at 200 Hz>
- **Outputs in the example**: <e.g., 6-DoF pose with covariance>
- **Project inputs**: <e.g., 1 camera + IMU at 200 Hz>
- **Project outputs required**: <e.g., 6-DoF pose with metric translation>
- **Match assessment**: ✅ exact match / ⚠️ partial (specify dimension) / ❌ mismatch (specify dimension)
- **If ⚠️ or ❌**: cite the official-docs sentence that establishes the mismatch.
```
If no official example covers the project's exact configuration → the candidate cannot be marked `Selected` based on category fit alone. Status must be `Experimental only` (with required-evidence note) or `Rejected` (when the docs explicitly disqualify the configuration).
4. **Bind every numbered Restriction and Acceptance Criterion to the candidate's pinned mode.**
For each numbered line in `restrictions.md` and `acceptance_criteria.md`, decide one of: `Pass` (the pinned mode satisfies it with cited evidence), `Fail` (the pinned mode contradicts it with cited evidence), `Verify` (no evidence either way; deeper investigation required), `N/A` (the line is irrelevant to this component area). Record this in `02_fact_cards.md` under the candidate's MVE block. The structural matrix in Step 7.5 reads from these bindings.
5. **Treat "the same library in a different mode" as a different candidate.**
If the project's pinned mode is `Monocular` but the only documented evidence covers `Stereo`, do not silently soften "rotation only" into "rotation + translation". Open a separate candidate row for the Monocular mode, with its own MVE, fit assessment, and disqualifiers. Two modes of one library are two distinct candidates for the purposes of this gate.
**Common silent-failure pattern this guards against**: a fact card paraphrases the docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes". A category-level "Selected" decision then carries through every downstream artifact, masking that the project's required A+B combination does not exist as a single mode.
**Search broadening strategies** (use when results are thin):
- Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation"
- Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums
- Try different geographies: search in English + search for European/Asian approaches if relevant
- Try historical evolution: "history of X", "evolution of X approaches", "X state of the art 2024 2025"
- Try failure analysis: "X project failure", "X post-mortem", "X recall", "X incident report"
- Try disqualifier probes: "X unsupported", "X limitations", "X requirements", "X with [project constraint]", "X without [required input]", "X real-time [target]", "X production failure"
**Search saturation rule**: Continue searching until new queries stop producing substantially new information. If the last 3 searches only repeat previously found facts, the sub-question is saturated.
@@ -282,7 +194,6 @@ For each extracted fact, **immediately** append to `02_fact_cards.md`:
- **Target Audience**: [which group this fact applies to, inherited from source or further refined]
- **Confidence**: ✅/⚠️/❓
- **Related Dimension**: [corresponding comparison dimension]
- **Fit Impact**: [supports selection / disqualifies / makes experimental / needs user decision]
```
**Target audience in fact statements**:
@@ -24,18 +24,6 @@ Write to `03_comparison_framework.md`:
| ... | | | |
```
**Required exact-fit dimensions for component/tool decisions**:
When the output selects or recommends a component, tool, library, service, architecture pattern, or algorithm, the framework MUST include these dimensions unless explicitly not applicable:
- Option family (`Simple baseline`, `Established production`, `Open-source`, `Commercial/vendor`, `Current SOTA`, `Adjacent-domain`, `No-build/defer`, `Known bad`)
- Required inputs/outputs and ownership boundaries
- Operating context and lifecycle fit
- Non-functional envelope fit
- Implementation assumptions and hard disqualifiers
- Evidence quality and source tier
- Selection status (`Selected`, `Rejected`, `Experimental only`, `Needs user decision`)
For each component area, include multiple candidates in the initial population. Do not present only the preferred option unless the investigation found no realistic alternatives; if so, state the searches that proved the narrow landscape.
---
### Step 5: Reference Point Baseline Alignment
@@ -109,8 +97,6 @@ Validate conclusions against a typical scenario:
- [ ] Are there any important dimensions missed?
- [ ] Is there any over-extrapolation?
- [ ] Are conclusions actionable/verifiable?
- [ ] Does every selected component/tool/pattern match the Project Constraint Matrix?
- [ ] Are mismatches marked as disqualifiers instead of hidden as generic "limitations"?
**Save action**:
Write to `05_validation_log.md`:
@@ -142,66 +128,6 @@ If using Y: [expected behavior]
---
### Step 7.5: Component Applicability Gate (BLOCKING)
**Applicability**: this gate applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section. For non-technical research (concept comparison, market/policy investigation, root-cause analysis without tooling, knowledge organization), skip this entire step and proceed to Step 8 — there are no components to gate. State the skip once in `05_validation_log.md`: `Step 7.5 (Component Applicability Gate): not applicable — Non-technical investigation`. For mixed runs (some component areas technical, some not), apply this gate only to the technical component areas; the non-technical ones do not produce 7.5 rows.
Before finalizing the solution draft, build an exact-fit matrix for every component/tool/library/service/pattern/algorithm that is selected, recommended, rejected, or treated as a fallback. Free-form prose in a "Project Constraints Checked" column is **not sufficient** — mismatches hide inside rationale text. The matrix must be structured per restriction and per acceptance criterion.
#### 7.5.1 Top-level Component Fit Matrix
```markdown
# Component Fit Matrix
| Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
|----------------|-----------|--------------------|---------------|---------------|-------------------------|----------------------------|--------|--------------------|
| [area] | [name] | [exact mode/config the project will use, copied verbatim from the MVE block in Step 2] | [family] | [role] | MVE: [link to MVE block in `02_fact_cards.md` or `02_mve_evidence.md`]; docs: [Source #] | [none / list] | Selected / Rejected / Experimental only / Needs user decision | [why] |
```
The new **Pinned Mode/Config** column is mandatory. A row without a pinned mode is incomplete. The new **API Capability Evidence** column links to the Minimum Viable Example saved during Step 2's API Capability Verification — without an MVE link the candidate cannot be `Selected`.
#### 7.5.2 Restrictions × Candidate-Modes Sub-Matrix (MANDATORY)
For each lead candidate row in the top-level matrix, append a structured cross-check that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's **pinned mode/config**.
```markdown
## Sub-Matrix — <Candidate Name> in <Pinned Mode>
| Restriction / AC | Candidate-mode behavior | Result | Evidence |
|------------------|-------------------------|--------|----------|
| R1: <verbatim line from restrictions.md> | <how the pinned mode behaves under this restriction> | ✅ Pass / ❌ Fail / ❓ Verify / N/A | [Fact # / Source # / MVE link] |
| R2: ... | ... | ... | ... |
| ... | ... | ... | ... |
| AC-1.1: <verbatim line from acceptance_criteria.md> | <how the pinned mode satisfies (or contradicts) this AC's measurable target> | ✅ / ❌ / ❓ / N/A | [Fact # / Source # / MVE link] |
| AC-1.2: ... | ... | ... | ... |
| ... | ... | ... | ... |
```
Cell semantics:
- ✅ **Pass** — the candidate's pinned mode satisfies this line, with cited official-doc or MVE evidence.
- ❌ **Fail** — the candidate's pinned mode contradicts this line, with cited evidence. Even one ❌ disqualifies the candidate from `Selected` status.
- ❓ **Verify** — no evidence yet either way; further investigation required (loops back to Step 2 / Step 3.5). A row left ❓ at the end of analysis blocks the candidate.
- **N/A** — the line is irrelevant to this component area (state why in one phrase).
A candidate row may not be marked `Selected` while any cell is ❌ or ❓.
#### 7.5.3 Decision Rules
- `Selected` is allowed only when (a) the top-level row has an MVE link, (b) the sub-matrix has zero ❌, (c) the sub-matrix has zero ❓, and (d) the candidate's documented implementation assumptions match the project's explicit constraints and acceptance criteria.
- `Experimental only` is required when a candidate might work but lacks proof for the exact operating context (e.g., MVE exists for a similar configuration but not the exact one).
- `Rejected` is required when documented assumptions conflict with project constraints (any sub-matrix row is ❌ with cited evidence).
- `Needs user decision` is required when a mismatch changes scope, cost, safety, product behavior, or acceptance criteria — and the user has not yet been consulted.
- Each component area must include at least one selected or fallback-safe option, plus the most credible rejected/experimental alternatives discovered during web research.
- A component area with only one candidate is incomplete unless `00_question_decomposition.md` documents the broader searches and why they yielded no realistic alternatives.
- A candidate may not appear as the lead solution in Step 8 unless this gate marks it `Selected`.
- "Validation gate required" footnotes are not equivalent to `Selected`. If the validation gate concerns API capability (does the mode produce the required output?), that is a Step-2 / Step-7.5 question and must be resolved here, not deferred to runtime. Only validation gates concerning *runtime quality* (e.g., "does this VO converge on this terrain class?") may be carried forward as `Selected with runtime gate`.
**Save action**: Write `06_component_fit_matrix.md` containing both 7.5.1 (top-level) and 7.5.2 (per-candidate sub-matrices).
**BLOCKING**: If any lead candidate has ❌, ❓, `Experimental only`, `Rejected`, or `Needs user decision` status, do not silently proceed. Ask the user or choose a different selected candidate.
---
### Step 8: Deliverable Formatting
Make the output **readable, traceable, and actionable**.
@@ -10,21 +10,12 @@
[Architecture solution that meets restrictions and acceptance criteria.]
> **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical research outputs (concept comparison, market/policy report, investigation answer), this Architecture section may be replaced with a comparison/analysis section that does not use these columns; or the columns may be marked `N/A` per row when the row describes a non-technical "component" (a process, a policy, an organizational construct). For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
### Component: [Component Name]
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|-----------|-------------|-------------|----------|------|-------------------------|-----|
| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |
**Exact-fit evidence**:
- Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
- Evidence: [Fact # / Source #]
- Disqualifiers: [none or list]
- Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` § <Candidate Name>
- API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|-----------|-------------|-------------|----------|------|-----|
| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
[Repeat per component]
@@ -13,21 +13,12 @@
[Architecture solution that meets restrictions and acceptance criteria.]
> **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical assessment outputs (e.g., reassessing a policy approach, comparing organizational designs), this Architecture section may be replaced with the assessment content that does not use these columns; or the columns may be marked `N/A` per row for non-technical "components". For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
### Component: [Component Name]
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|-----------|-------------|-------------|----------|------------|-------------------------|-----|
| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |
**Exact-fit evidence**:
- Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
- Evidence: [Fact # / Source #]
- Disqualifiers: [none or list]
- Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` § <Candidate Name>
- API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|-----------|-------------|-------------|----------|------------|-----|
| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
[Repeat per component]
+1 -1
View File
@@ -22,7 +22,7 @@ test-run has two modes. The caller passes the mode explicitly; if missing, defau
| Mode | Scope | Typical caller | Input artifacts |
|------|-------|---------------|-----------------|
| `functional` (default) | Unit / integration / blackbox tests — correctness | autodev Steps that verify after Implement Tests or Implement | `scripts/run-tests.sh`, `_docs/02_document/tests/environment.md`, `_docs/02_document/tests/blackbox-tests.md` |
| `perf` | Performance / load / stress / soak tests — latency, throughput, error-rate thresholds | autodev greenfield Step 15, existing-code Step 15 (pre-deploy) | `scripts/run-performance-tests.sh`, `_docs/02_document/tests/performance-tests.md`, AC thresholds in `_docs/00_problem/acceptance_criteria.md` |
| `perf` | Performance / load / stress / soak tests — latency, throughput, error-rate thresholds | autodev greenfield Step 9, existing-code Step 15 (pre-deploy) | `scripts/run-performance-tests.sh`, `_docs/02_document/tests/performance-tests.md`, AC thresholds in `_docs/00_problem/acceptance_criteria.md` |
Direct user invocation (`/test-run`) defaults to `functional`. If the user says "perf tests", "load test", "performance", or passes a performance scenarios file, run `perf` mode.
@@ -95,7 +95,7 @@ Examples:
File: `expected_results/image_01_detections.json`
```json
```json
{
"input": "image_01.jpg",
"expected": {
@@ -119,7 +119,7 @@ File: `expected_results/image_01_detections.json`
]
}
}
```
```
```
---
-27
View File
@@ -1,27 +0,0 @@
.git
.github
.cursor
_docs
.venv
__pycache__
.pytest_cache
.ruff_cache
.mypy_cache
.env
.env.*
*.pem
*.key
*.secret
data/input/*
data/cache/*
data/fdr/*
data/test-results/*
*.tlog
*.ulg
*.bag
*.mcap
*.cbor
*.parquet
*.mp4
*.mov
*.avi
-10
View File
@@ -1,10 +0,0 @@
GPSD_ENV=development
GPSD_CONFIG_DIR=./config/development
GPSD_CACHE_DIR=./data/cache
GPSD_FDR_DIR=./data/fdr
GPSD_DATABASE_URL=postgresql://gpsd:gpsd@localhost:5432/gpsd
GPSD_MAVLINK_URL=udp:127.0.0.1:14550
GPSD_CAMERA_SOURCE=./data/input
GPSD_SIGNING_KEY_REF=test-key-ref
GPSD_MAX_FDR_BYTES=104857600
GPSD_LOG_LEVEL=info
-1
View File
@@ -1 +0,0 @@
_docs/00_problem/input_data/flight_derkachi/flight_derkachi.mp4 filter=lfs diff=lfs merge=lfs -text
-43
View File
@@ -1,43 +0,0 @@
name: CI
on:
pull_request:
push:
branches:
- dev
jobs:
python-quality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.10"
- name: Install
run: |
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"
- name: Format check
run: python -m black --check src tests
- name: Lint
run: python -m ruff check src tests
- name: Unit tests
run: python -m pytest tests/unit
replay-compose-smoke:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Validate compose files
run: |
docker compose -f docker-compose.yml config
docker compose -f docker-compose.test.yml config
- name: Collect artifact placeholders
run: mkdir -p data/test-results e2e/reports
- uses: actions/upload-artifact@v4
with:
name: replay-evidence-placeholders
path: |
data/test-results
e2e/reports
-41
View File
@@ -1,42 +1 @@
.DS_Store
.venv/
__pycache__/
*.py[cod]
.pytest_cache/
.ruff_cache/
.mypy_cache/
.coverage
htmlcov/
*.egg-info/
.env
.env.*
!.env.example
*.pem
*.key
*.secret
data/input/*
data/cache/*
data/fdr/*
data/test-results/*
data/expected/*
!data/input/.gitkeep
!data/cache/.gitkeep
!data/fdr/.gitkeep
!data/test-results/.gitkeep
!data/expected/.gitkeep
*.tlog
*.ulg
*.bag
*.mcap
*.cbor
*.parquet
*.mp4
*.mov
*.avi
*.jpg
*.jpeg
*.png
!_docs/00_problem/input_data/**
-22
View File
@@ -1,22 +0,0 @@
# GPS-Denied Onboard Runtime
Scaffold for the Jetson-hosted GPS-denied localization runtime, replay harness, and
deployment evidence paths.
The project uses a Python `src/` layout for orchestration code. Native bridge
placeholders live inside the owning component folders rather than in a shared
native tree.
Generated mission data, FDR payloads, cache payloads, and raw frame dumps are kept
out of git unless they are explicitly curated test fixtures.
## Local Development
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -e ".[dev]"
python -m pytest
```
Local replay infrastructure is described in `docker-compose.yml`; CI and black-box
test infrastructure are described in `docker-compose.test.yml`.
+35 -160
View File
@@ -1,175 +1,50 @@
# Acceptance Criteria
# Position Accuracy
> **Last revised**: 2026-05-01 (Phase 1 AC/restrictions assessment clarifications).
> Changes vs. previous version (2026-04-25): AC-1.2 split into hard-floor + stretch; AC-1.4 made quantitative; AC-2.2 split per pipeline stage; AC-3.4 dual-trigger; AC-4.3 autopilot-pinned; AC-5.2 N pinned; AC-7.1 scoped to level flight; AC-8.2 freshness by sector; six new AC added (AC-NEW-1 … AC-NEW-6).
> Changes 2026-04-26: AC-4.3 extended to dual-channel hybrid (GPS_INPUT primary + ODOMETRY auxiliary); AC-8.6 added (VPR retrieval-unit + change-robustness); AC-NEW-7 added with confirmed numeric thresholds (cache-poisoning safety budget).
> Changes 2026-04-29: AC-3.5 and AC-NEW-8 added for temporary visual blackout/cloud occlusion during GPS spoofing, including IMU-only degraded navigation, covariance growth, and failover limits.
> Changes 2026-05-01: AC-1.3 anchor-age reporting clarified; AC-2.1 split so the >95% rate applies to VO registration, not every satellite re-anchor; AC-5.2 and AC-NEW-2 now require ArduPilot Plane SITL trigger verification; AC-8.3 storage accounting and AC-NEW-7 Satellite Service ownership clarified.
- The system should determine GPS coordinates of frame centers for 80% of photos within 50m error compared to real GPS
- The system should determine GPS coordinates of frame centers for 60% of photos within 20m error compared to real GPS
- Maximum cumulative VO drift between satellite correction anchors should be less than 100 meters
- System should report a confidence score per position estimate (high = satellite-anchored, low = VO-extrapolated with drift)
## Position Accuracy
# Image Processing Quality
- **AC-1.1** — The system shall determine GPS coordinates of frame centers within **50 m** of true GPS for **≥80%** of photos in normal flight segments.
- **AC-1.2** — The system shall determine GPS coordinates of frame centers within **20 m** of true GPS for **≥50%** of photos in normal flight segments.
- **AC-1.3** — Maximum cumulative VO drift between two consecutive satellite-anchored fixes shall be **<100 m** (VO-only fallback) or **<50 m** (when IMU is fused). Drift is measured as ‖VO-extrapolated centre next anchor centre‖ at the moment of the anchor fix. Every emitted estimate shall include `last_satellite_anchor_age_ms`; validation results shall be binned by anchor age, and the solution draft must define the maximum anchor age after which estimates are treated as degraded (`vo_extrapolated` or `dead_reckoned`) with monotonically growing covariance.
- **AC-1.4** — The system shall report a **quantitative confidence score** per position estimate, comprising:
- the 95% covariance ellipse semi-major axis in meters, AND
- a categorical label `{satellite_anchored, vo_extrapolated, dead_reckoned}`.
- Image Registration Rate > 95% for normal flight segments. The system can find enough matching features to confidently calculate the camera's 6-DoF pose and stitch that image into the trajectory
- Mean Reprojection Error (MRE) < 1.0 pixels
## Image Processing Quality
# Resilience & Edge Cases
- **AC-2.1** — Image registration rate is split by registration type:
- **AC-2.1a — VO registration**: frame-to-frame visual registration shall succeed for **>95%** of normal flight segments (defined as: nadir flight ±10° bank / pitch, ≥40% overlap with prior frame, daytime, usable texture, no full visual blackout).
- **AC-2.1b — Satellite-anchor registration**: cross-domain UAV-photo to satellite/cache registration is measured separately and is not hidden inside AC-2.1a. Satellite anchoring must satisfy AC-1.1 / AC-1.2 position accuracy, AC-2.2 cross-domain MRE, AC-8.2 freshness, and AC-8.6 retrieval behavior on season-matched tiles.
- **AC-2.2** — Mean Reprojection Error (MRE):
- **<1.0 px** for VO frame-to-frame homography on overlapping aerial pairs;
- **<2.5 px** for satellite-anchored cross-domain (UAV photo ↔ ortho satellite tile) registration.
- The system should correctly continue work even in the presence of up to 350m outlier between 2 consecutive photos (due to tilt of the plane)
- System should correctly continue work during sharp turns, where the next photo doesn't overlap at all or overlaps less than 5%. The next photo should be within 200m drift and at an angle of less than 70 degrees. Sharp-turn frames are expected to fail VO and should be handled by satellite-based re-localization
- System should operate when UAV makes a sharp turn and next photos have no common points with previous route. It should figure out the location of the new route segment and connect it to the previous route. There could be more than 2 such disconnected segments, so this strategy must be core to the system
- In case the system cannot determine the position of 3 consecutive frames by any means, it should send a re-localization request to the ground station operator via telemetry link. While waiting for operator input, the system continues attempting VO/IMU dead reckoning and the flight controller uses last known position + IMU extrapolation
## Resilience & Edge Cases
# Real-Time Onboard Performance
- **AC-3.1** — The system shall correctly continue work in the presence of up to **350 m** outliers between two consecutive photos (caused by airframe tilt up to ±20°).
- **AC-3.2** — The system shall correctly continue work during sharp turns where the next photo overlaps **<5%** with the previous, drifts **<200 m**, and changes heading **<70°**. Sharp-turn frames are expected to fail VO and shall be handled by satellite-based re-localization (place recognition over the satellite tile cache).
- **AC-3.3** — The system shall handle **≥3 disconnected segments** per flight, connecting each new segment to the previous trajectory via global descriptor retrieval + RANSAC pose-graph relocalization. This is a core capability, not a degraded mode.
- **AC-3.4** — When the system cannot determine position for **≥3 consecutive frames AND ≥2 s**, it shall send a re-localization request to the ground station via telemetry. While waiting, it continues VO/IMU dead reckoning and the flight controller uses last known position + IMU extrapolation.
- **AC-3.5** — During temporary **visual blackout** where the navigation camera provides no usable ground signal (e.g., clouds/occlusion/whiteout) while GPS is denied or spoofed, the system shall switch to `{dead_reckoned}` within **≤1 processed frame OR ≤400 ms**, reject the spoofed GPS as an estimator input, and propagate position solely from the last trusted state + flight-controller IMU/attitude/airspeed/altitude inputs until visual or satellite anchoring recovers. During this mode, covariance shall grow monotonically, `GPS_INPUT.horiz_accuracy` shall not under-report the 95% covariance semi-major axis, and QGroundControl shall receive a `VISUAL_BLACKOUT_IMU_ONLY` status at **12 Hz**.
- Less than 400ms end-to-end per frame: from camera capture to GPS coordinate output to the flight controller (camera shoots at ~3fps)
- Memory usage should stay below 8GB shared memory (Jetson Orin Nano Super — CPU and GPU share the same 8GB LPDDR5 pool)
- The system must output calculated GPS coordinates directly to the flight controller via MAVLink GPS_INPUT messages (using MAVSDK)
- Position estimates are streamed to the flight controller frame-by-frame; the system does not batch or delay output
- The system may refine previously calculated positions and send corrections to the flight controller as updated estimates
## Real-Time Onboard Performance
# Startup & Failsafe
- **AC-4.1** — End-to-end latency from camera capture to GPS coordinate output to the flight controller shall be **<400 ms p95**. Up to ~10% of frames may be dropped under sustained load (skip-allowed). Heavy global VPR / cross-domain re-ranking shall be conditional, not part of the steady-state per-frame path, unless profiling proves the full path stays inside the latency and memory budgets on the target Jetson.
- **AC-4.2** — Memory usage shall remain below **8 GB** shared on Jetson Orin Nano Super (CPU and GPU share the same 8 GB LPDDR5 pool).
- **AC-4.3** — The system shall output its position estimate to the flight controller via **two parallel MAVLink channels**, both emitted by **pymavlink** (general telemetry uses MAVSDK):
- **Primary**: `GPS_INPUT` targeting **ArduPilot** with `GPS1_TYPE=14` (MAVLink GPS substitute). Matches the "replacement for the GPS module" framing of the build.
- **Auxiliary** (when the EKF emits a fix with full 6-DoF covariance and quality > VISO_QUAL_MIN): `ODOMETRY` so EKF3 can fuse the richer covariance + native yaw error + quality field. ArduPilot's own dev docs designate ODOMETRY as the preferred external-nav channel for non-GPS substitution; we hybridise to keep AC-4.3's GPS-substitute framing while not throwing away the covariance fidelity that AC-NEW-4 depends on.
- FC source priorities are configured so GPS_INPUT remains the failover path if ODOMETRY trips a parameter gate.
- **v1 scope clause (added 2026-04-26 — see solution_draft03 finding M-30)**: v1 ships **GPS_INPUT only**; the ODOMETRY auxiliary channel is intentionally **disabled** in v1 because feeding both `GPS_INPUT` and `ODOMETRY` for overlapping axes triggers ArduPilot EKF3 double-fusion bugs (issues #30076 / #32506). `EK3_SRC1_*=GPS+Compass`; ODOMETRY emission re-enables in v1.1 once F-T9 SITL confirms PR #30080-class clean source-switching. Tests therefore assert v1 emits GPS_INPUT only and that ODOMETRY is *intentionally absent* on the wire.
- (Decision rationale: MAVSDK has no native GPS_INPUT support — see `_docs/00_research/00_ac_assessment.md` Q-1; ODOMETRY hybrid rationale — see Mode B finding M-1 in `_docs/00_research/02_fact_cards.md`; v1 single-channel rationale — see Mode B round-2 finding M-30 in `_docs/00_research/02_fact_cards.md` / solution_draft03.)
- **AC-4.4** — Position estimates are streamed to the flight controller frame-by-frame; the system shall not batch or delay output.
- **AC-4.5** — The system may refine previously calculated positions and send corrections to the flight controller as updated estimates.
- The system initializes using the last known valid GPS position from the flight controller before GPS denial begins
- If the system completely fails to produce any position estimate for more than N seconds (TBD), the flight controller should fall back to IMU-only dead reckoning and the system should log the failure
- On companion computer reboot mid-flight, the system should attempt to re-initialize from the flight controller's current IMU-extrapolated position
## Startup & Failsafe
# Ground Station & Telemetry
- **AC-5.1** — The system shall initialise using the last known valid GPS position from the flight controller's EKF, plus IMU-extrapolated position at the moment of GPS denial.
- **AC-5.2** — If the system fails to produce any position estimate for **>3 s**, the flight controller shall fall back to IMU-only dead reckoning and the system shall log the failure. Because ArduPilot failsafe timing depends on vehicle type and parameters, this fallback behavior must be verified specifically in ArduPilot Plane SITL with the production parameter set; Copter defaults are reference evidence only.
- **AC-5.3** — On companion computer reboot mid-flight, the system shall attempt to re-initialise from the flight controller's current IMU-extrapolated position. See AC-NEW-1 for the cold-start time-to-first-fix budget.
- Position estimates and confidence scores should be streamed to the ground station via telemetry link for operator situational awareness
- The ground station can send commands to the onboard system (e.g., operator-assisted re-localization hint with approximate coordinates)
- Output coordinates in WGS84 format
## Ground Station & Telemetry
# Object Localization
- **AC-6.1** — Position estimates and confidence scores shall be streamed to **QGroundControl** via the MAVLink telemetry link. High-rate (per-frame) content stays on the local link for forensics; the GCS link is downsampled to **12 Hz** for situational awareness.
- **AC-6.2** — The ground station can send commands to the onboard system (e.g., operator-assisted re-localization hint with approximate coordinates) via STATUSTEXT, NAMED_VALUE_FLOAT, or a custom MAVLink dialect.
- **AC-6.3** — Output coordinates are in **WGS84** format (matches GPS_INPUT spec).
- Other onboard AI systems can request GPS coordinates of objects detected by the AI camera
- The GPS-Denied system calculates object coordinates trigonometrically using: current UAV GPS position (from GPS-Denied), known AI camera angle, zoom, and current flight altitude. Flat terrain is assumed
- Accuracy is consistent with the frame-center position accuracy of the GPS-Denied system
## Object Localization (AI Camera)
# Satellite Reference Imagery
- **AC-7.1** — Other onboard AI systems may request GPS coordinates of objects detected by the AI camera. Localization accuracy is **consistent with the frame-center accuracy of the GPS-Denied system in level flight (bank/pitch <5°)**. In maneuvering flight, ground-projection error is bounded by `altitude × |sin(unknown_bank_or_pitch)|` and the system shall publish that bound alongside the estimate.
- **AC-7.2** — The system computes object coordinates trigonometrically using: current UAV GPS position (from GPS-Denied), known AI-camera gimbal angle, zoom, and current flight altitude. Flat-terrain assumption applies.
## Satellite Reference Imagery
- **AC-8.1** — Satellite reference imagery is provided by the **Azaion Suite Satellite Service** (a separate component of the Suite). The runtime onboard system consumes this service through an offline tile cache interface; it does **not** call commercial providers (Maxar, Airbus, Planet, etc.) directly. The Satellite Service is responsible for upstream sourcing and is out of scope for this build. Required resolution at the cache interface: **at least 0.5 m/pixel, ideally 0.3 m/pixel**.
- **AC-8.2** — Satellite tiles consumed at runtime shall be:
- **<6 months old** for active-conflict sectors;
- **<12 months old** for stable rear sectors.
System shall reject or downgrade-confidence on tiles older than these thresholds (see AC-NEW-6).
- **AC-8.3** — Satellite imagery for the operational area shall be **pre-loaded and pre-processed** onto the companion computer before flight. Offline preprocessing time is not time-critical (minutes/hours). Pre-extracted tile descriptors (e.g., SuperPoint keypoints/descriptors and DINOv2-VLAD global descriptors) are part of the cache and count against the storage budget unless the solution draft explicitly defines a separate descriptor/index budget.
- **AC-8.4****Mid-flight tile generation & write-back**: during flight, the system shall continuously orthorectify navigation-camera frames into tiles aligned with the basemap projection and store them in the local cache, **deduplicated** so each ground sector is stored at most once (latest / highest-quality tile wins). On landing, the companion computer shall upload newly generated tiles back to the Azaion Suite Satellite Service so that the next mission cache contains imagery refreshed by the previous flight.
- **AC-8.5****Storage policy**: the system shall **not** retain raw navigation-camera frames or AI-camera frames as part of normal operation. Tiles are the only persistent imagery artifact. Forensic exception: a low-rate (≤0.1 Hz) thumbnail log of frames that **failed** tile generation may be retained for debugging within the FDR budget (AC-NEW-3).
- **AC-8.6****VPR retrieval unit + change-robustness**:
- The Visual Place Recognition (Component 2) FAISS index shall be built over **ground-footprint-sized "VPR chunks"** (~600800 m at the deployment altitude band, with **4050 % overlap** between adjacent chunks), **decoupled from the slippy-XYZ storage tile** (z=20). Any UAV frame footprint shall fall fully inside ≥1 chunk regardless of position.
- The index shall be **multi-scale**: in addition to fine-scale chunks (derived from z=20 storage), a coarser-scale chunk descriptor set (z=17 or z=18 effective scale) shall be maintained for change-robust retrieval in **active-conflict sectors** where building destruction or major scene change is expected.
- VPR top-K shall be **dynamically sized** by sector classification (AC-NEW-6) and EKF position covariance: K=5 in stable sectors with σ_xy ≤ 20 m; K=20 in active-conflict sectors; K=50 on expanding-window fallback.
- VPR shall be **invoked conditionally**, not on every frame: in steady state (last anchor age < 2 s, σ_xy < 20 m, VO healthy), the system uses a geometric prior from IMU+VO predicted position to rank candidate chunks by distance alone. VPR's DINOv2 forward is invoked on **re-loc triggers** (cold start AC-NEW-1, sharp turn AC-3.2, disconnected segment AC-3.3, σ_xy > 50 m, or VO failure for ≥2 frames).
## New AC (added in Phase 1 assessment, expanded with rationale & validation)
### AC-NEW-1 — Time-to-first-fix on cold start
**Statement.** From companion-computer boot, the system shall emit its first valid `GPS_INPUT` message in **<30 s**, given an IMU-extrapolated initial position handed over from the flight controller's EKF.
**Why it matters.** A mid-flight reboot (brown-out, watchdog reset, OS panic) is a realistic scenario on a fixed-wing UAV running an 8-hour mission. The autopilot continues to fly on IMU dead reckoning during the gap; a 30 s budget keeps that drift under ~500 m at 60 km/h cruise, which the EKF can absorb when our first fix arrives.
**Implementation drivers.** TensorRT engines must be built at install time (not at first run); CUDA / TRT init <5 s; tile-cache mmap warm at start; FAISS index loaded before MAVLink connect; first VPR retrieval + cross-view match must succeed at full resolution within the remaining budget.
**Validation.** Bench: cold-boot the companion 50× with simulated FC-pose input; record time from boot to first valid `GPS_INPUT` MAVLink frame. Pass = 95% percentile <30 s.
### AC-NEW-2 — Spoofing-promotion latency
**Statement.** When the flight controller signals GPS denial or spoofing (ArduPilot fix-loss / EKF lane-switch event; PX4 `EKF2_GPS_SPOOFED` flag if PX4 ever returns to scope), the GPS-Denied system shall promote its own estimate to the FC's primary GPS source within **<3 s**.
**Why it matters.** Without this gate, the FC may continue to follow a spoofed real-GPS source while our valid estimate sits idle. 3 s is short enough to keep the FC from acting on a malicious heading change but long enough to ride out a single-frame anomaly.
**Implementation drivers.** Subscribe to `GPS_RAW_INT`, `EKF_STATUS_REPORT`, `SYS_STATUS`, and any ArduPilot Plane EKF/GPS status messages available in the production firmware. Maintain an internal "real-GPS health" rolling average; switch to "primary" mode (raise our `GPS_INPUT` `fix_type` to 3D and assert) when the verified Plane-specific health trigger stays below threshold for >=1 s. Emit `STATUSTEXT` to QGC on every promotion / demotion.
**Validation.** ArduPilot Plane SITL: simulate spoofing (inject false `GPS_RAW_INT` from a malicious node); verify the exact trigger signals used by the production parameter set; measure time from spoof onset to our promotion. Pass = 95% percentile <3 s.
### AC-NEW-3 — Flight Data Recorder
**Statement.** The system shall retain to non-volatile storage, per flight: per-frame position estimates with covariance and source-label, IMU traces from the FC at full rate, all emitted `GPS_INPUT` frames, MAVLink raw stream (tlog), system health (CPU / GPU / temp / throttle), tiles generated mid-flight (AC-8.4), and a low-rate (≤0.1 Hz) thumbnail log of frames that failed tile generation. **Raw nav-cam frames and AI-cam frames are NOT retained** (AC-8.5). Storage cap **64 GB / flight**; recorder rolls over (oldest segment dropped first) after cap.
**Why it matters.** Tiles, telemetry traces, and IMU are the operationally useful artifacts: they reproduce the mission, feed the next mission's cache (AC-8.4), and let post-mission analysis explain any false-position event (AC-NEW-4). Raw frames are large and redundant once tiles exist.
**Implementation drivers.** Per-day directory layout; fixed-size segment files; rollover policy on segment-close, not on every write. NVMe ≥64 GB on top of the persistent satellite-tile cache.
**Validation.** Bench: run an 8-hour synthetic load (3 Hz nav frames replayed from disk), assert the FDR ends ≤64 GB and no payload class is silently dropped without a logged rollover event.
### AC-NEW-4 — False-position safety budget
**Statement.**
- P(reported estimate error > **500 m**) **<0.1 %** per flight.
- P(reported estimate error > **1 km**) **<0.01 %** per flight.
**Why it matters.** A single 1-km-off `GPS_INPUT` frame can hand the FC a heading that flies the UAV outside the geofence in seconds. The covariance carried in `GPS_INPUT` (`h_acc`) is the FC's only defense; this AC bounds the **probability** of our covariance under-reporting reality.
**Implementation drivers.** EKF covariance must be calibrated, not optimistic. Cross-view fixes with low inlier ratio must be **rejected**, not down-weighted to "small but non-zero". Outlier rejection at the EKF stage (Mahalanobis gate) is mandatory.
**Validation.** Monte Carlo over the AerialVL public dataset (S03) and our own recorded Mavic flights, with synthetic IMU injection where applicable; report error CDF; pass = both probabilities below budget across ≥100 simulated flights worth of frames.
### AC-NEW-5 — Operational environmental envelope
**Statement.** Operating temperature **20 °C to +50 °C**; vibration / shock per RTCA DO-160G low-altitude UAV-class envelope. The cooling solution shall sustain the **25 W** power mode at the upper temperature bound for the full **8-hour duty cycle** without thermal throttling.
**Why it matters.** Without this, all latency / accuracy ACs are conditional on a benign thermal day. Eastern/southern Ukraine summers easily exceed +35 °C ambient inside a UAV bay; without active cooling, the Jetson throttles to 15 W mode and our 400 ms latency budget collapses.
**Implementation drivers.** Forced-air or active heatsink sized for 25 W continuous at +50 °C ambient bay temperature; thermal sensors logged in FDR (AC-NEW-3); throttle event = automatic `STATUSTEXT` warning to QGC.
**Validation.** Hot-soak chamber test: 25 W workload at +50 °C ambient for 8 h; assert no throttle. Cold-soak: 20 °C cold-start to first fix within AC-NEW-1 budget.
### AC-NEW-6 — Imagery freshness enforcement
**Statement.** The system shall reject (or downgrade confidence on) any satellite tile whose capture date violates AC-8.2 (>6 months old in active-conflict sectors; >12 months old in stable rear sectors). Tiles generated mid-flight (AC-8.4) and not yet uploaded to the Suite Satellite Service are timestamped with the current flight date and treated as fresh.
**Why it matters.** Stale satellite tiles are the dominant cross-view-matching failure mode in active-conflict sectors (cratering, dam destruction, road realignment). A confident match against a stale tile is worse than no match.
**Implementation drivers.** Each tile carries `capture_date` metadata in the cache index. Sector classification (active vs stable) is part of the operational area definition handed in pre-flight. Confidence weight = 1.0 if within freshness budget, linearly decayed to 0.0 over a 30-day grace zone past the budget, hard reject beyond the grace.
**Validation.** Inject tiles with synthetic age into the cache; verify rejection / decay curve matches spec; verify a stale-tile match never produces a `satellite_anchored` source label.
### AC-NEW-7 — Cache-poisoning safety budget
**Statement.** Per flight, across all onboard tiles written by Component 1b (in-flight ortho-tile generator):
- P(onboard tile geo-misaligned > **30 m**) **<1 %**.
- P(onboard tile geo-misaligned > **100 m**) **<0.1 %**.
**Why it matters.** Onboard tiles feed back into the Suite Satellite Service's basemap (AC-8.4). Without this AC, a confidently-bad EKF pose can write a misaligned tile that, after Service ingest, becomes the next flight's satellite anchor — producing cross-flight error compounding that AC-NEW-4 (single-flight false-position budget) does not capture. This AC bounds the **probability** that an onboard tile's claimed geo-alignment is wrong by a margin that would propagate to a downstream flight.
**Implementation drivers.**
- Service-source tiles are immutable within freshness budget (AC-8.2); onboard tiles overwrite only stale or other-onboard tiles.
- The onboard GPS-Denied system writes tile-quality metadata required by the Suite Satellite Service. The Service-side ingest applies a **2-flight voting layer**: an onboard tile gets promoted to "trusted basemap" only after **N>=2 independent flights** confirm consistent geo-alignment within X m of each other. (Active sectors per AC-NEW-6 may use single-flight promotion when σ_xy <= 3 m AND OSM-road-overlap >= 70 %.) The voting layer is an external Suite Satellite Service dependency, not implemented inside this onboard build, but its contract is required for AC-NEW-7 to pass end-to-end.
- The Component-1b parent-pose covariance is a **hard gate** in the local quality score: σ_xy ≤ 5 m for a hard write (`trust_level = candidate`); σ_xy ≤ 3 m for `trust_level = candidate` with full quality; tiles written in the σ_xy ∈ (3, 5] m band are marked `trust_level = soft` in the sidecar.
- Eligibility check (Component 1b) tightens generation gate from σ_xy ≤ 10 m to σ_xy ≤ 5 m.
**Validation.** Multi-flight Monte Carlo replay over AerialVL + Mavic + AerialExtreMatch with **synthetic over-confidence injection** (artificially deflate EKF covariance by 1.5×–3×): assert both probabilities below budget across ≥100 simulated flights worth of frames. Independently, Service-side voting layer is exercised in F-T3 to verify candidate tiles are not promoted to trusted basemap before N-flight confirmation.
### AC-NEW-8 — Visual blackout + GPS spoofing degraded-mode budget
**Statement.** When the navigation camera is fully unusable for visual localization and the flight controller simultaneously reports GPS denial/spoofing, the onboard system shall:
- continue emitting `GPS_INPUT` from IMU-only propagation for **up to 30 s** after the last trusted visual/satellite anchor, unless the estimator covariance exceeds the fail threshold earlier;
- label every estimate `{dead_reckoned}` and set `fix_type=2` or lower when the 95% covariance semi-major axis exceeds **100 m**;
- emit `fix_type=0`, `horiz_accuracy=999.0`, and `STATUSTEXT: VISUAL_BLACKOUT_FAILSAFE` when the 95% covariance semi-major axis exceeds **500 m** OR visual blackout exceeds **30 s** without a trusted re-anchor;
- never promote spoofed real-GPS measurements back into the estimator during blackout unless the FC GPS health has been stable and non-spoofed for **≥10 s** and a visual/satellite consistency check has succeeded.
**Why it matters.** A cloud/whiteout period removes all visual correction exactly when spoofed GPS cannot be trusted. The only safe behavior is honest IMU-only dead reckoning with rapidly growing uncertainty, not pretending that a stale visual position or spoofed GPS remains valid.
**Implementation drivers.** Add an image-quality/occlusion classifier before VO/VPR, a blackout state in the ESKF mode machine, covariance floors for IMU-only propagation, strict GPS health gating, and QGC/FDR logging for blackout start, every degraded estimate, and blackout recovery/failsafe.
**Validation.** SITL/replay: inject a 5 s, 15 s, and 35 s full-camera blackout while spoofing `GPS_RAW_INT`; assert mode transition ≤400 ms, spoofed GPS is ignored, covariance grows monotonically, `GPS_INPUT` fields degrade at the thresholds above, and recovery only occurs after a trusted visual/satellite anchor or the 10 s GPS-health + visual-consistency gate.
- Satellite reference imagery resolution must be at least 0.5 m/pixel, ideally 0.3 m/pixel
- Satellite imagery for the operational area should be less than 2 years old where possible
- Satellite imagery must be pre-processed and loaded onto the companion computer before flight. Offline preprocessing time is not time-critical (can take minutes/hours)
@@ -1,2 +1,8 @@
- Height: 400m
- Camera: ADTi Surveyor Lite 20MP 20L V1
- Height
- 400m
- Camera:
- Name: ADTi Surveyor Lite 26S v2
- Resolution: 26MP
- Image resolution: 6252*4168
- Focal length: 25mm
- Sensor width: 23.5
@@ -1,97 +1,166 @@
# Expected Results Mapping
# Expected Results
## Scope
Maps every input data item to its quantifiable expected result.
Tests use this mapping to compare actual system output against known-correct answers.
`coordinates.csv` is the current source of truth for the provided still-image nadir set. It gives expected WGS84 frame-center coordinates for `AD000001.jpg` through `AD000060.jpg`.
## Result Format Legend
This data is sufficient for black-box frame-center geolocation tests against still images. The Derkachi representative fixture in `input_data/flight_derkachi/` adds cropped nadir video plus synchronized `SCALED_IMU2` and `GLOBAL_POSITION_INT` telemetry. It is sufficient for fixture validation, video/telemetry synchronization, replay, latency, VIO smoke tests, and trajectory comparison against the tlog GPS path. It is not sufficient by itself for final production accuracy because raw camera calibration, lens distortion, and exact camera-to-body calibration are still pending.
| Result Type | When to Use | Example |
|-------------|-------------|---------|
| Exact value | Output must match precisely | `fix_type: 3`, `satellites_visible: 10` |
| Tolerance range | Numeric output with acceptable variance | `lat: 48.275292 ± 50m` |
| Threshold | Output must exceed or stay below a limit | `latency < 400ms`, `memory < 8GB` |
| Pattern match | Output must match a string/regex pattern | `RELOC_REQ: last_lat=.* last_lon=.* uncertainty=.*m` |
| File reference | Complex output compared against a reference file | `match expected_results/position_accuracy.csv` |
| Set/count | Output must contain specific items or counts | `registered_frames / total_frames > 0.95` |
## Pass / Fail Rules
## Comparison Methods
- **Normal frame-center geolocation**: estimated frame center is within 50 m of the expected WGS84 coordinate.
- **Stretch accuracy bin**: estimated frame center is within 20 m of the expected WGS84 coordinate.
- **Dataset aggregate**: at least 80% of mapped images pass the 50 m threshold and at least 50% pass the 20 m threshold.
- **Output shape**: each result must include image name, estimated `lat`, estimated `lon`, error in meters, source label, 95% covariance semi-major axis, and `last_satellite_anchor_age_ms`.
| Method | Description | Tolerance Syntax |
|--------|-------------|-----------------|
| `numeric_tolerance` | abs(actual - expected) ≤ tolerance | `± <value>` |
| `threshold_min` | actual ≥ threshold | `≥ <value>` |
| `threshold_max` | actual ≤ threshold | `≤ <value>` |
| `percentage` | percentage of items meeting criterion | `≥ N%` |
| `exact` | actual == expected | N/A |
| `regex` | actual matches regex pattern | regex string |
| `file_reference` | compare against reference file | file path |
## Input To Expected Output Map
## Input Expected Result Mapping
### Still-Image Frame Centers
### Position Accuracy (60-image flight sequence)
| Input image | Expected latitude | Expected longitude | Primary threshold | Stretch threshold |
|-------------|-------------------|--------------------|-------------------|-------------------|
| AD000001.jpg | 48.275292 | 37.385220 | <= 50 m | <= 20 m |
| AD000002.jpg | 48.275001 | 37.382922 | <= 50 m | <= 20 m |
| AD000003.jpg | 48.274520 | 37.381657 | <= 50 m | <= 20 m |
| AD000004.jpg | 48.274956 | 37.379004 | <= 50 m | <= 20 m |
| AD000005.jpg | 48.273997 | 37.379828 | <= 50 m | <= 20 m |
| AD000006.jpg | 48.272538 | 37.380294 | <= 50 m | <= 20 m |
| AD000007.jpg | 48.272408 | 37.379153 | <= 50 m | <= 20 m |
| AD000008.jpg | 48.271992 | 37.377572 | <= 50 m | <= 20 m |
| AD000009.jpg | 48.271376 | 37.376671 | <= 50 m | <= 20 m |
| AD000010.jpg | 48.271233 | 37.374806 | <= 50 m | <= 20 m |
| AD000011.jpg | 48.270334 | 37.374442 | <= 50 m | <= 20 m |
| AD000012.jpg | 48.269922 | 37.373284 | <= 50 m | <= 20 m |
| AD000013.jpg | 48.269366 | 37.372134 | <= 50 m | <= 20 m |
| AD000014.jpg | 48.268759 | 37.370940 | <= 50 m | <= 20 m |
| AD000015.jpg | 48.268291 | 37.369815 | <= 50 m | <= 20 m |
| AD000016.jpg | 48.267719 | 37.368469 | <= 50 m | <= 20 m |
| AD000017.jpg | 48.267461 | 37.367255 | <= 50 m | <= 20 m |
| AD000018.jpg | 48.266663 | 37.365888 | <= 50 m | <= 20 m |
| AD000019.jpg | 48.266135 | 37.365460 | <= 50 m | <= 20 m |
| AD000020.jpg | 48.265574 | 37.364211 | <= 50 m | <= 20 m |
| AD000021.jpg | 48.264892 | 37.362998 | <= 50 m | <= 20 m |
| AD000022.jpg | 48.264393 | 37.361086 | <= 50 m | <= 20 m |
| AD000023.jpg | 48.263803 | 37.361028 | <= 50 m | <= 20 m |
| AD000024.jpg | 48.263014 | 37.359878 | <= 50 m | <= 20 m |
| AD000025.jpg | 48.262635 | 37.358277 | <= 50 m | <= 20 m |
| AD000026.jpg | 48.261819 | 37.357116 | <= 50 m | <= 20 m |
| AD000027.jpg | 48.261182 | 37.355907 | <= 50 m | <= 20 m |
| AD000028.jpg | 48.260727 | 37.354723 | <= 50 m | <= 20 m |
| AD000029.jpg | 48.260117 | 37.353469 | <= 50 m | <= 20 m |
| AD000030.jpg | 48.259677 | 37.352165 | <= 50 m | <= 20 m |
| AD000031.jpg | 48.258881 | 37.351376 | <= 50 m | <= 20 m |
| AD000032.jpg | 48.258425 | 37.349964 | <= 50 m | <= 20 m |
| AD000033.jpg | 48.258653 | 37.347004 | <= 50 m | <= 20 m |
| AD000034.jpg | 48.257879 | 37.347711 | <= 50 m | <= 20 m |
| AD000035.jpg | 48.256777 | 37.348444 | <= 50 m | <= 20 m |
| AD000036.jpg | 48.255756 | 37.348098 | <= 50 m | <= 20 m |
| AD000037.jpg | 48.255375 | 37.346549 | <= 50 m | <= 20 m |
| AD000038.jpg | 48.254799 | 37.345603 | <= 50 m | <= 20 m |
| AD000039.jpg | 48.254557 | 37.344566 | <= 50 m | <= 20 m |
| AD000040.jpg | 48.254380 | 37.344375 | <= 50 m | <= 20 m |
| AD000041.jpg | 48.253722 | 37.343093 | <= 50 m | <= 20 m |
| AD000042.jpg | 48.254205 | 37.340532 | <= 50 m | <= 20 m |
| AD000043.jpg | 48.252380 | 37.342112 | <= 50 m | <= 20 m |
| AD000044.jpg | 48.251489 | 37.343079 | <= 50 m | <= 20 m |
| AD000045.jpg | 48.251085 | 37.346128 | <= 50 m | <= 20 m |
| AD000046.jpg | 48.250413 | 37.344034 | <= 50 m | <= 20 m |
| AD000047.jpg | 48.249414 | 37.343296 | <= 50 m | <= 20 m |
| AD000048.jpg | 48.249114 | 37.346895 | <= 50 m | <= 20 m |
| AD000049.jpg | 48.250241 | 37.347741 | <= 50 m | <= 20 m |
| AD000050.jpg | 48.250974 | 37.348379 | <= 50 m | <= 20 m |
| AD000051.jpg | 48.251528 | 37.349468 | <= 50 m | <= 20 m |
| AD000052.jpg | 48.251873 | 37.350485 | <= 50 m | <= 20 m |
| AD000053.jpg | 48.252161 | 37.351491 | <= 50 m | <= 20 m |
| AD000054.jpg | 48.252685 | 37.352343 | <= 50 m | <= 20 m |
| AD000055.jpg | 48.253268 | 37.353119 | <= 50 m | <= 20 m |
| AD000056.jpg | 48.253767 | 37.354246 | <= 50 m | <= 20 m |
| AD000057.jpg | 48.254329 | 37.354946 | <= 50 m | <= 20 m |
| AD000058.jpg | 48.254874 | 37.355765 | <= 50 m | <= 20 m |
| AD000059.jpg | 48.255481 | 37.356501 | <= 50 m | <= 20 m |
| AD000060.jpg | 48.256246 | 37.357485 | <= 50 m | <= 20 m |
Ground truth GPS coordinates for each frame are in `coordinates.csv`. The system processes these frames sequentially (simulating a real flight) with corresponding IMU data (200Hz, from SITL ArduPilot or synthetic generation from trajectory) and satellite tile matches. The system outputs estimated GPS coordinates per frame. Expected results compare estimated positions against ground truth.
### Representative Derkachi Video/IMU Fixture
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 1 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | ≥ 80% of frames have position error < 50m from ground truth | percentage | ≥ 80% of frames within 50m | `expected_results/position_accuracy.csv` |
| 2 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | ≥ 60% of frames have position error < 20m from ground truth | percentage | ≥ 60% of frames within 20m | `expected_results/position_accuracy.csv` |
| 3 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | Per-frame position output in WGS84 (lat, lon) | numeric_tolerance | each frame ± 100m max (no single frame exceeds 100m error) | `expected_results/position_accuracy.csv` |
| 4 | `coordinates.csv` (all 60 frames) | Sequential flight images with ground truth GPS | Cumulative VO drift between satellite anchors < 100m | threshold_max | ≤ 100m drift between anchors | N/A |
| Input fixture | Expected validation result | Threshold |
|---------------|----------------------------|-----------|
| `flight_derkachi/data_imu.csv` | Telemetry CSV has required `timestamp(ms)`, `Time`, `SCALED_IMU2.*`, and `GLOBAL_POSITION_INT.*` columns; non-empty rows are monotonic from `Time=0.0` to `489.9` | 0 missing required columns; 0 decreasing timestamps; 4,900 nonblank rows |
| `flight_derkachi/flight_derkachi.mp4` | Video stream is readable as cropped nadir footage for replay | H.264, 880 x 720, 30 fps, approximately 490.07 s |
| Video/telemetry alignment | Video has 14,700 frames and telemetry has 4,900 rows | Exactly 3 video frames per telemetry row; duration delta <=250 ms |
| Derkachi trajectory comparison | Replay output can be compared to `GLOBAL_POSITION_INT.lat`, `GLOBAL_POSITION_INT.lon`, `GLOBAL_POSITION_INT.alt`, `GLOBAL_POSITION_INT.relative_alt`, velocity, and heading | Thresholds are calibration-gated; use for smoke/relative trajectory validation until intrinsics and camera-to-body calibration are pinned |
### GPS_INPUT Message Correctness
## Known Gaps
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 5 | Single frame + IMU data | Normal tracking frame with recent satellite match | `fix_type: 3`, `horiz_accuracy: 5-20m`, `satellites_visible: 10`, lat/lon populated | exact (fix_type, sat), numeric_tolerance (accuracy) | fix_type == 3, horiz_accuracy ∈ [1, 50] | N/A |
| 6 | Frame sequence, no satellite match for >30s | VO-only tracking, no recent satellite anchor | `fix_type: 3`, `horiz_accuracy: 20-50m` | exact (fix_type), range (accuracy) | fix_type == 3, horiz_accuracy ∈ [20, 100] | N/A |
| 7 | Frame sequence, VO lost + no satellite | IMU-only dead reckoning | `fix_type: 2`, `horiz_accuracy: 50-200m+` (growing over time) | exact (fix_type), threshold_min (accuracy) | fix_type == 2, horiz_accuracy ≥ 50 | N/A |
| 8 | VO lost + 3 consecutive satellite failures | Total position failure | `fix_type: 0`, `horiz_accuracy: 999.0` | exact | fix_type == 0, horiz_accuracy == 999.0 | N/A |
| 9 | Any valid frame | GPS_INPUT output rate | GPS_INPUT messages at 5-10Hz continuous | range | 5 ≤ rate_hz ≤ 10 | N/A |
- The still-image set has expected WGS84 centers but no synchronized IMU, attitude, airspeed, altitude, or timestamp stream.
- The Derkachi fixture has synchronized video, IMU, and GPS trajectory, but no raw camera calibration, lens distortion, exact camera-to-body transform, attitude, or airspeed columns.
- The still-image sample cadence is slower than the target 3 fps runtime profile; the Derkachi video is 30 fps and must be sampled to target replay cadence for runtime tests.
- Final production acceptance requires camera calibration and representative datasets with synchronized camera/IMU plus ground-truth trajectory.
### Confidence Tier Transitions
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 10 | Frame with satellite match <30s ago, covariance <400m² | HIGH confidence conditions | Confidence tier: HIGH, SSE confidence: "HIGH" | exact | N/A | N/A |
| 11 | Frame with cuVSLAM OK, no satellite match >30s | MEDIUM confidence conditions | Confidence tier: MEDIUM, SSE confidence: "MEDIUM" | exact | N/A | N/A |
| 12 | Frame with cuVSLAM lost, IMU-only | LOW confidence conditions | Confidence tier: LOW, SSE confidence: "LOW" | exact | N/A | N/A |
| 13 | 3+ consecutive total failures | FAILED conditions | Confidence tier: FAILED, SSE confidence: "FAILED", fix_type: 0 | exact | N/A | N/A |
### Image Registration & Visual Odometry
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 14 | 60 sequential flight images | Normal flight (no sharp turns) | Image registration rate ≥ 95% (≥ 57 of 60 registered) | percentage | ≥ 95% | N/A |
| 15 | 60 sequential flight images | Normal flight images | Mean reprojection error < 1.0 pixels | threshold_max | MRE < 1.0 px | N/A |
### Disconnected Route Segments & Sharp Turns
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 16 | Frames 32-43 from coordinates.csv | Trajectory with direction change (turn area) | System continues producing position estimates through the turn | threshold_min | ≥ 1 position output per frame | N/A |
| 17 | Simulated consecutive frames with 350m gap | Outlier between 2 consecutive photos due to tilt | System handles outlier, position estimate not corrupted (error < 100m for next valid frame) | threshold_max | ≤ 100m error after recovery | N/A |
| 18 | Simulated sharp turn (no overlap, <5% overlap, <70° angle, <200m drift) | Sharp turn where VO fails | Satellite re-localization triggers, position recovered within 3 frames after turn | threshold_max | position error ≤ 50m after re-localization | N/A |
| 19 | Simulated VO loss + satellite match success | Tracking loss → re-localization | cuVSLAM restarts, ESKF position corrected, tracking_state returns to NORMAL | exact | tracking_state == NORMAL after recovery | N/A |
### 3-Consecutive-Failure Re-Localization
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 20 | Simulated VO loss + 3 satellite match failures | Cannot determine position by any means | Re-localization request sent: `RELOC_REQ: last_lat=.* last_lon=.* uncertainty=.*m` | regex | message matches pattern | N/A |
| 21 | Re-localization request active | System waiting for operator | GPS_INPUT fix_type=0, system continues IMU prediction, continues satellite matching attempts | exact (fix_type) | fix_type == 0 | N/A |
| 22 | Operator sends approximate coordinates (lat, lon) | Operator re-localization hint | System uses hint as ESKF measurement (high covariance ~500m), attempts satellite match in new area | threshold_max | position error ≤ 500m initially, ≤ 50m after satellite match | N/A |
### Startup & Handoff
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 23 | System boot with GLOBAL_POSITION_INT available | Normal startup | System reads initial position, initializes ESKF, starts GPS_INPUT output | exact | GPS_INPUT output begins within 60s of boot | N/A |
| 24 | System boot + first satellite match | Startup validation | First satellite match validates initial position, position error drops | threshold_max | position error ≤ 50m after first satellite match | N/A |
### Mid-Flight Reboot Recovery
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 25 | System process killed mid-flight | Companion computer reboot | System recovers: reads FC position, inits ESKF with high uncertainty, loads TRT engines, starts cuVSLAM, performs satellite match | threshold_max | total recovery time ≤ 70s | N/A |
| 26 | Post-reboot first satellite match | Recovery validation | Position accuracy restored after first satellite match | threshold_max | position error ≤ 50m after first satellite match | N/A |
### Object Localization
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 27 | POST /objects/locate with pixel_x, pixel_y, gimbal angles, zoom, known UAV position | Object at known ground GPS | Response: `{ lat, lon, alt, accuracy_m, confidence }` with lat/lon matching ground truth | numeric_tolerance | lat/lon within accuracy_m of ground truth (consistent with frame-center accuracy) | N/A |
| 28 | POST /objects/locate with invalid pixel coordinates | Out-of-frame pixel | HTTP 422 or error response indicating invalid input | exact | HTTP status 422 | N/A |
### Coordinate Transform Chain
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 29 | Known GPS → NED → pixel → GPS round-trip | Coordinate transform validation | Round-trip error < 0.1m | threshold_max | ≤ 0.1m | N/A |
### API & Communication
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 30 | GET /health | Health check endpoint | HTTP 200, JSON with memory_mb, gpu_temp_c, status fields | exact (status code), regex (body) | status == 200, body contains `"status"` | N/A |
| 31 | POST /sessions | Start session | HTTP 200/201 with session ID | exact | status ∈ {200, 201} | N/A |
| 32 | GET /sessions/{id}/stream | SSE position stream | SSE events at ~1Hz with fields: type, timestamp, lat, lon, alt, accuracy_h, confidence, vo_status | regex | each event matches SSE schema | N/A |
| 33 | Unauthenticated request to /sessions | No JWT token | HTTP 401 Unauthorized | exact | status == 401 | N/A |
### Performance Thresholds
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 34 | Single camera frame (6252x4168) | End-to-end processing time | Total pipeline latency < 400ms (capture → GPS coordinate output) | threshold_max | ≤ 400ms | N/A |
| 35 | 30-minute sustained operation | Memory usage over time | Peak memory < 8GB, no memory leaks (growth < 50MB over 30min) | threshold_max | peak < 8192MB, growth ≤ 50MB | N/A |
| 36 | 30-minute sustained operation | GPU thermal | SoC junction temperature stays below 80°C (no throttling) | threshold_max | ≤ 80°C | N/A |
| 37 | cuVSLAM single frame | VO processing time | cuVSLAM inference ≤ 20ms per frame | threshold_max | ≤ 20ms | N/A |
| 38 | Satellite matching single frame | Satellite matching time (async) | LiteSAM/XFeat inference ≤ 330ms | threshold_max | ≤ 330ms | N/A |
| 39 | TRT engine load | Engine initialization time | All TRT engines loaded within 10s total | threshold_max | ≤ 10s | N/A |
### Satellite Tile Management
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 40 | Mission area definition (200km path, ±2km buffer, zoom 18) | Tile storage calculation | Total storage 500-800MB for zoom 18 + zoom 19 flight path | range | [300MB, 1000MB] | N/A |
| 41 | ESKF position ± 3σ search radius | Tile selection | Tiles covering search area loaded, mosaic assembled, covers at least 500m radius | threshold_min | coverage radius ≥ 500m | N/A |
### TRT Engine Validation
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 42 | LiteSAM PyTorch model → ONNX → TRT FP16 | TRT engine conversion | Engine builds successfully on Jetson Orin Nano Super | exact | exit_code == 0 | N/A |
| 43 | TRT engine output vs PyTorch reference (same input) | Inference correctness | Max L1 error between TRT and PyTorch output < 0.01 | threshold_max | L1_max < 0.01 | N/A |
| 44 | LiteSAM MinGRU operations | TRT compatibility check | All MinGRU ops supported in TRT 10.3 (polygraphy inspect) | exact | unsupported_ops == 0 | N/A |
### Telemetry
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| 45 | Normal operation | Telemetry output rate | NAMED_VALUE_FLOAT messages at 1Hz (gps_conf, gps_drift, gps_hacc) | numeric_tolerance | rate: 1Hz ± 0.2Hz | N/A |
| 46 | VO tracking lost + 3 satellite failures | Re-localization telemetry | STATUSTEXT with RELOC_REQ sent to ground station | regex | message matches `RELOC_REQ:.*` | N/A |
## Expected Result Reference Files
### position_accuracy.csv
Reference file: `expected_results/position_accuracy.csv`
Contains the ground truth GPS coordinate for each frame in the 60-image test sequence (copied from `coordinates.csv`) plus the acceptance thresholds. Test harness computes haversine distance between estimated and ground truth positions, then applies aggregate criteria.
Thresholds applied to the full 60-frame sequence:
- ≥ 80% of frames: error < 50m
- ≥ 60% of frames: error < 20m
- 0% of frames: error > 100m (no single frame exceeds 100m)
- Cumulative VO drift between satellite anchors: < 100m
@@ -1,14 +0,0 @@
# Derkachi Representative Flight Fixture
## Files
| File | Description | Observed Metadata |
|------|-------------|-------------------|
| `flight_derkachi.mp4` | Cropped nadir flight footage for replay | H.264, 880 x 720, 30 fps, about 490.07 s |
| `data_imu.csv` | Flight-controller telemetry trace exported from the tlog | 4,900 rows at 10 Hz from `Time=0.0` to `489.9`; includes `SCALED_IMU2` and `GLOBAL_POSITION_INT` trajectory fields |
## Test Use
Use this fixture for video/telemetry synchronization checks, representative replay smoke tests, VIO hot-path latency, frame-drop accounting, and trajectory comparison against `GLOBAL_POSITION_INT`. The video and telemetry align at exactly three video frames per telemetry row. Camera intrinsics, lens distortion, raw camera resolution, and exact camera-to-body calibration are still unknown, so this fixture is not sufficient by itself for final production camera calibration or satellite-anchor accuracy claims.
For the test recording, the rotating camera was mechanically fixed in a downward/nadir orientation. Treat the MP4 as a cleaned/cropped replay fixture rather than the raw camera feed.
File diff suppressed because it is too large Load Diff
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9acb97042fc648301d73d3c0fe7d80f7e3e2697000c0d33afa8a7b7a74a20005
size 282207328
+2 -2
View File
@@ -1,2 +1,2 @@
We have a wing-type UAV with a fixed downward navigation camera that can take photos 3 times per second. The authoritative navigation-camera spec is defined in `restrictions.md` as the ADTi 20MP 20L V1, APS-C sensor, about 5472 x 3648 px; older higher-resolution references are superseded. Also plane has flight controller with IMU. During the plane flight, we know GPS coordinates initially. During the flight, GPS could be disabled or spoofed. We need to determine the GPS of the centers of the next frame from the camera. And also the coordinates of the center of any object in these photos. We can use an external satellite provider for ground checks on the existing photos. So, before the flight, UAV's operator should upload the satellite photos to the plane's companion PC.
The real world examples are in input_data folder, but the original still-image set has a much larger distance between photos than the target aircraft will have. On that particular example photos were taken 1 photo per 2-3 seconds. But in real-world scenario frames would appear within the interval no more than 500ms. Additional representative data is available in `input_data/flight_derkachi/`: cropped nadir flight footage plus synchronized `SCALED_IMU2` and `GLOBAL_POSITION_INT` telemetry. This supports video/telemetry synchronization, replay, latency, VIO smoke tests, and trajectory comparison against the tlog GPS path. Camera intrinsics, lens distortion, raw camera feed parameters, and exact camera-to-body calibration are still pending, so final production accuracy claims remain gated on calibration data or a separately surveyed representative dataset.
We have a wing-type UAV with a camera pointing downwards that can take photos 3 times per second with a resolution 6200*4100. Also plane has flight controller with IMU. During the plane flight, we know GPS coordinates initially. During the flight, GPS could be disabled or spoofed. We need to determine the GPS of the centers of the next frame from the camera. And also the coordinates of the center of any object in these photos. We can use an external satellite provider for ground checks on the existing photos. So, before the flight, UAV's operator should upload the satellite photos to the plane's companion PC.
The real world examples are in input_data folder, but the distance between each photo is way bigger than it will be from a real plane. On that particular example photos were taken 1 photo per 2-3 seconds. But in real-world scenario frames would appear within the interval no more than 500ms or even 400 ms.
+28 -50
View File
@@ -1,59 +1,37 @@
# Restrictions
# UAV & Flight
> **Last revised**: 2026-05-01 (post Phase 1 AC/restrictions assessment clarifications).
- Photos are taken by only airplane (fixed-wing) type UAVs
- Photos are taken by the camera pointing downwards and fixed, but it is not autostabilized
- The flying range is restricted by the eastern and southern parts of Ukraine (to the left of the Dnipro River)
- Altitude is predefined and no more than 1km. The height of the terrain can be neglected
- Flights are done mostly in sunny weather
- During the flight, UAVs can make sharp turns, so that the next photo may be absolutely different from the previous one (no same objects), but it is rather an exception than the rule
- Number of photos per flight could be up to 3000, usually in the 500-1500 range
## UAV & Flight
# Cameras
- Photos are taken by airplane (fixed-wing) type UAVs only.
- Photos are taken by the navigation camera pointing downwards and fixed (not gimbal-stabilized).
- Operational area is the eastern and southern parts of Ukraine (east/left of the Dnipro River).
- Mission profile: 8-hour flights at ~60 km/h cruise. Two route shapes coexist:
- **Sector**: up to **10 × 15 km = 150 km²** of dense coverage.
- **Transit corridor**: ~**50 km × 1 km = 50 km²** strip in/out of the sector.
- **Total operational area: up to ~400 km²** of pre-cached satellite imagery per mission. Cache is **persistent across flights** (not redownloaded each mission). Storage budget **~10 GB** for the satellite tile cache; see AC-NEW-3 for flight-data-recorder budget.
- Altitude: pre-defined, **≤1 km AGL**. Terrain is assumed flat (operational area is rolling steppe / agricultural land); height differences are negligible.
- Weather: predominantly sunny daytime operations. Validation must still cover the seasonal/visibility classes that affect visual matching in the operational area: summer crop/field patterns, autumn/winter bare fields, cloud/smoke/haze, snow if missions can occur in winter, and low-texture agricultural repetition.
- Sharp turns occur but are the exception, not the rule. Two consecutive photos may share <5% overlap during a turn (see AC-3.2).
- **No photo-count cap.** The previously stated "up to 3000 photos per flight" was a legacy operator number from a Mavic-class workflow; it is dropped because (a) it is inconsistent with 8 h × 3 fps, and (b) the system does **not store raw photos at all** (see AC-8.5). Storage is bounded by the tile-cache + FDR caps (~10 GB persistent + 64 GB / flight, AC-NEW-3).
- UAV has two cameras:
1. **Navigation camera** — fixed, pointing downwards, not autostabilized. Used by GPS-Denied system for position estimation
2. **AI camera** — main camera with configurable angle and zoom, used by onboard AI detection systems
- Navigation camera resolution: FullHD to 6252*4168. Camera parameters are known: focal length, sensor width, resolution, etc.
- Cameras are connected to the companion computer (interface TBD: USB, CSI, or GigE)
- Terrain is assumed flat (eastern/southern Ukraine operational area); height differences are negligible
## Cameras
# Satellite Imagery
- The UAV carries **two cameras**:
1. **Navigation camera** — fixed, downward-pointing, not autostabilized. Consumed by the GPS-Denied system for position estimation.
2. **AI camera** — main mission camera with operator-controllable gimbal angle and zoom. Consumed by onboard AI detection systems.
- **Navigation camera**: **ADTi 20MP 20L V1, APS-C sensor, ~5472 × 3648 px (≈20 MP)**. APS-C sensor (~23.6 × 15.7 mm). Lens TBD — selected during solution-draft phase to land GSD in the **1020 cm/px band at 1 km AGL** (drives a frame ground footprint of ~470 m × 314 m to ~980 m × 655 m depending on focal length). Other intrinsics (focal length, exact sensor dimensions, distortion coefficients) are pinned at module-selection time and used by Component-1b orthorectification (pre-flight checkerboard calibration, F-F2).
- **AI camera pose information available to the GPS-Denied system**: gimbal angle and zoom only. The UAV's instantaneous bank/pitch is **not** published from the autopilot to the AI-camera reasoning path. Object-localization accuracy is therefore scoped to level flight (AC-7.1).
- Cameras connect to the companion computer over USB, MIPI-CSI, or GigE (specific interface TBD at solution-draft phase, dependent on chosen module).
- We can use satellite providers, but we're limited right now to Google Maps, which could be outdated for some regions
- Satellite imagery for the operational area must be pre-loaded onto the companion computer before flight
## Satellite Imagery
# Onboard Hardware
- **Source: Azaion Suite Satellite Service** (a separate component of the wider Suite). The onboard system is a **consumer** of this service, not a direct customer of commercial providers. Upstream sourcing (Maxar / Airbus / partner agencies / commissioned tasking) is the Satellite Service's concern, not this build's.
- **Onboard interface to the Service is offline-only**: the companion computer holds a local cache populated **before flight** by syncing from the Service for the operational area (AC-8.3). No satellite imagery is fetched in-flight from the Service.
- **Mid-flight tile generation (AC-8.4)**: during the mission the companion computer generates fresh tiles from the navigation camera, orthorectified into the basemap projection, deduplicated against the existing cache, and stored locally. On landing, those new tiles are uploaded back to the Suite Satellite Service for ingestion, so the next mission's cache is refreshed by the previous flight.
- **No raw photo storage** (AC-8.5): the tile is the unit of persistence. Raw nav-camera and AI-camera frames are not retained (except a low-rate failure-thumbnail log for forensics).
- **Resolution at the cache interface**: 0.5 m/pixel minimum, 0.3 m/pixel ideal (AC-8.1). The architecture is provider-agnostic at the cache boundary; whatever the Suite Satellite Service supplies must meet that bar.
- **Storage tile resolution convention**: cache imagery is specified by source pixel size, not by assuming a universal zoom-to-meter mapping. The cache interface accepts **0.5 m/px minimum, 0.3 m/px ideal** imagery, and every tile manifest records CRS, tile matrix convention, tile dimension, latitude-adjusted meters-per-pixel, capture date, source, and compression. If an XYZ/WebMercator tile pyramid is used, its zoom level is documented as a provider convention rather than treated as proof of physical resolution. The matcher (Component 3) needs <=~4x scale ratio between the UAV frame (~12 cm/px GSD at 1 km AGL with the 20 MP APS-C camera) and the reference; 0.3-0.5 m/px reference imagery gives a ~2.5-4.2x ratio. Storage budget across the 400 km² operational area remains capped at **10 GB** for the persistent cache and must be validated against the final provider format/compression. The 10 GB budget includes cache imagery, manifests, overviews, and any precomputed global/local descriptors unless the solution draft explicitly splits a separate descriptor/index budget. **VPR retrieval unit is decoupled from the storage tile** (see AC-8.6 below): VPR chunks are derived from the tile cache at ground-footprint scale (~600-800 m chunks with 40-50 % overlap), independent of the storage tile convention.
- **Freshness gates** (AC-8.2 / AC-NEW-6) are enforced at runtime: tiles older than 6 months (active-conflict sectors) or 12 months (stable rear sectors) are rejected or down-confidence-weighted. Tiles generated mid-flight are timestamped with the current flight date and treated as fresh.
- **Free public imagery (Sentinel-2 etc.)** is not on the runtime path. If the Suite Satellite Service ever returns Sentinel-class tiles, the cache rejects them as below the 0.5 m/px floor.
- Processing is done on a Jetson Orin Nano Super (67 TOPS, 8GB shared LPDDR5, 25W TDP)
- The companion computer runs JetPack (Ubuntu-based) with CUDA/TensorRT available
- Onboard storage for satellite imagery is limited (exact capacity TBD, but must be accounted for in tile preparation)
- Sustained GPU load may cause thermal throttling; the processing pipeline must stay within thermal envelope
## Onboard Hardware
# Sensors & Integration
- Processing platform: **Jetson Orin Nano Super** — 67 TOPS sparse INT8, **8 GB shared LPDDR5** (CPU and GPU share the same memory pool), **25 W TDP**.
- Companion computer runs JetPack (Ubuntu-based) with CUDA / TensorRT available.
- Sustained GPU load may cause thermal throttling; the processing pipeline must stay within the thermal envelope. The cooling solution shall sustain the 25 W power mode for the 8-hour duty cycle at the upper environmental-envelope temperature (AC-NEW-5).
- Onboard non-volatile storage: budget at least the satellite-cache (~10 GB) **plus** the flight-data-recorder cap (64 GB / flight, AC-NEW-3). Reuse-across-flights tile cache stays resident; per-flight FDR rolls over after cap.
## Sensors & Integration
- High-rate **IMU** data is available from the flight controller via MAVLink.
- The original still-image sample does **not** include synchronized IMU or ground-truth pose. The Derkachi representative fixture adds cropped nadir video plus synchronized `SCALED_IMU2` and `GLOBAL_POSITION_INT` telemetry, which is enough for replay, synchronization, latency, VIO smoke tests, and trajectory comparison against the tlog GPS path. Final production acceptance still requires camera intrinsics, lens distortion, exact camera-to-body calibration, and representative synchronized navigation-camera frames, FC IMU/attitude/airspeed/altitude, emitted MAVLink messages, and ground-truth trajectory from a representative flight or replay rig.
- The system communicates with the flight controller via MAVLink. Telemetry plumbing uses **MAVSDK**; the `GPS_INPUT` injection path is implemented via **pymavlink**, since MAVSDK does not expose a native `GPS_INPUT` API.
- **Autopilot target: ArduPilot only** (with `GPS1_TYPE=14` for MAVLink GPS injection). PX4 is out of scope for the build; if it ever returns to scope it will use `VISION_POSITION_ESTIMATE`, not `GPS_INPUT`. (See `_docs/00_research/00_ac_assessment.md` Q-1.)
- The system outputs WGS84 GPS coordinates to the flight controller as a replacement for the real GPS module (MAVLink GPS_INPUT, AC-4.3).
- **Ground station: QGroundControl** is the supported GCS. Mission Planner is not in scope. Telemetry link is bandwidth-limited and is not the primary output channel; per-frame data stays on the local FDR (AC-NEW-3), GCS sees a 12 Hz downsampled summary (AC-6.1).
## Failsafe & Safety
- If the GPS-Denied system fails to produce any position estimate for **>3 s**, the autopilot falls back to IMU-only dead reckoning (AC-5.2). N=3 s rides through one sharp turn at cruise speed without tripping the failsafe.
- The system must satisfy the false-position safety budget in AC-NEW-4 (P(error >500 m) <0.1%, P(error >1 km) <0.01% per flight).
- Cold-start time-to-first-fix budget is **<30 s** from companion-computer boot (AC-NEW-1); spoofing-promotion latency is **<3 s** from FC's GPS-loss signal (AC-NEW-2).
- There is a lot of data from IMU (via the flight controller)
- The system communicates with the flight controller via MAVLink protocol using MAVSDK library
- The system must output GPS coordinates to the flight controller as a replacement for the real GPS module (MAVLink GPS_INPUT message)
- Ground station telemetry link is available but bandwidth-limited; it is not the primary output channel
-54
View File
@@ -1,54 +0,0 @@
# Acceptance Criteria Assessment
Accessed: 2026-05-01. Rerun after user-approved clarifications: 2026-05-01.
## Research Scope
- **Output class**: Technical-component selection support.
- **Novelty sensitivity**: High for VPR, embedded AI, and autopilot integration; source preference is current papers and official docs.
- **Boundary**: Fixed-wing UAV, nadir navigation camera, ArduPilot Plane, Jetson Orin Nano Super, offline Azaion Suite Satellite Service cache, eastern/southern Ukraine terrain.
## Acceptance Criteria
| Criterion | Current Values | Researched Values / Evidence | Cost / Timeline Impact | Status |
|-----------|----------------|------------------------------|------------------------|--------|
| AC-1.1 / AC-1.2 frame-center accuracy | <=50 m for >=80%, <=20 m for >=50% in normal segments | Plausible only with periodic satellite anchoring plus VO/IMU propagation. Aerial VPR papers show the mechanism is viable but sensitive to weather, scale, repetition, and tile overlap. | High validation cost. | Keep, high-risk |
| AC-1.3 drift | VO-only <100 m, IMU-fused <50 m between anchors, anchor age reported | Updated AC now requires `last_satellite_anchor_age_ms`, binned validation, and degraded covariance after a solution-defined max anchor age. | Medium. | Updated |
| AC-1.4 confidence | 95% covariance ellipse + source label | `GPS_INPUT` supports accuracy fields; source labels must be carried in telemetry/FDR because `GPS_INPUT` has no semantic label field. | Medium. | Keep |
| AC-2.1 registration | VO >95%; satellite anchoring measured separately | Split is correct: VO success is not the same as cross-domain satellite anchor success. | Medium-high. | Updated |
| AC-2.2 reprojection | <1 px VO, <2.5 px satellite anchor | Reasonable image-space gates, with coordinate error still dependent on calibration, orthorectification, and satellite georegistration. | Medium. | Keep |
| AC-3.x resilience | Outliers, sharp turns, disconnected segments, blackout | Technically feasible only through mode switching: VO failure triggers VPR/relocalization, blackout triggers IMU-only propagation with honest covariance growth. | High test cost. | Keep |
| AC-4.1 latency | <400 ms p95, <=10% frame drops, heavy VPR conditional | Aerial VPR survey reports some re-ranking paths too slow for steady-state use; solution must keep global VPR off the per-frame hot path. | High optimization cost. | Updated |
| AC-4.2 memory | <8 GB shared | Feasible if descriptors are compressed/pruned and indices are memory-mapped or loaded selectively. | Medium-high. | Keep |
| AC-4.3 MAVLink | v1 GPS_INPUT only via pymavlink | ArduPilot docs require `GPS1_TYPE=14`; MAVLink defines required lat/lon, velocity, fix, and accuracy fields. MAVSDK should remain telemetry-oriented. | Medium. | Keep |
| AC-5.2 failsafe | >3 s no estimate triggers fallback, Plane SITL verified | Copter docs are reference only. Plane-specific production parameters must be verified in SITL. | Medium. | Updated |
| AC-7 object localization | Level-flight AI-camera object GPS | Realistic under level-flight clause; maneuvering estimates must publish conservative bound. | Medium. | Keep |
| AC-8.x satellite cache | 0.3-0.5 m/px, freshness, offline descriptors, VPR chunks | Resolution is feasible through commercial/service imagery. Storage must count descriptors unless separately budgeted. | Medium-high. | Updated |
| AC-NEW-1 / 2 startup and spoofing | <30 s first fix, <3 s promotion | Feasible only with prebuilt engines, warmed indices, and verified Plane GPS-health triggers. | Medium-high. | Keep with SITL gate |
| AC-NEW-3 FDR | <=64 GB per flight, no raw frames | Feasible with segment files and rollover. | Medium. | Keep |
| AC-NEW-4 / 7 safety budgets | False-position and cache-poisoning probabilities | Appropriate safety gates, but require Monte Carlo and representative flight/replay data. | High. | Keep |
| AC-NEW-5 environment | -20 C to +50 C, 25 W for 8 h | NVIDIA confirms 25 W mode; thermal design must prevent throttling. | Medium-high. | Keep |
## Restrictions Assessment
| Restriction | Current Values | Researched Values / Evidence | Cost / Timeline Impact | Status |
|-------------|----------------|------------------------------|------------------------|--------|
| Camera source of truth | `restrictions.md` pins ADTi 20MP ~5472 x 3648 | User confirmed `restrictions.md` is authoritative. Lens/FOV remains a design parameter. | Medium during module selection. | Updated |
| Fixed nadir camera | No gimbal stabilization | Good for orthorectification; turn/tilt requires attitude compensation and failure detection. | Medium. | Keep |
| Terrain/weather | Flat steppe/agricultural, seasonal classes included | Repetitive fields and seasonal changes are VPR hazards; validation must include those classes. | High validation cost. | Updated |
| Satellite Service boundary | Offline consumer of Suite Satellite Service | Strong separation; cache manifest and ingest-voting contract are required. | Medium. | Keep |
| 10 GB cache | Includes imagery, manifests, overviews, descriptors unless split | Plausible at 0.5 m/px with compression; 0.3 m/px plus descriptors may exceed unless pruned. | Medium. | Updated |
| Jetson Orin Nano Super | 67 TOPS INT8, 8 GB, 25 W | Official specs support the restriction; thermal throttling remains a risk. | Medium-high. | Keep |
| Test data gap | Sample imagery lacks IMU/ground truth | Public datasets help prototype, but final acceptance needs synchronized representative data. | High. | Updated |
## Key Findings
1. Use a hybrid estimator: VO/IMU for frame propagation, satellite/VPR anchors for absolute correction, ESKF covariance as the safety gate.
2. Do not run heavy VPR/re-ranking every frame; invoke it on cold start, VO failure, covariance growth, sharp turns, and disconnected segments.
3. Avoid GPL libraries in production dependencies unless the project accepts GPL obligations. GPL VIO/SLAM tools should be benchmarks or references, not selected production components.
4. The cache must be designed as imagery + metadata + descriptor index, not just raster tiles.
5. ArduPilot Plane SITL and representative camera+IMU data are blocking validation dependencies, but not blockers for solution drafting.
## Sources
See `_docs/00_research/01_source_registry.md` for the detailed source list.
@@ -1,145 +0,0 @@
# Question Decomposition
## Classification
- **Original question**: Design a GPS-denied onboard localization system for a fixed-wing UAV using a nadir camera, IMU, preloaded satellite imagery, and ArduPilot `GPS_INPUT`.
- **Active mode**: Mode A Phase 2, initial solution research.
- **Research output class**: Technical-component selection.
- **Question type**: Decision support with knowledge organization.
- **Timeliness sensitivity**: High for VPR, embedded AI inference, and MAVLink/ArduPilot integration; medium for geometry and filtering fundamentals.
## Research Boundary
| Dimension | Boundary |
|-----------|----------|
| Population | Fixed-wing UAV missions; not multirotor hover workflows. |
| Geography | Eastern/southern Ukraine operational areas east/left of the Dnipro River. |
| Timeframe | Current implementation target with 2024-2026 component evidence where possible. |
| Level | Onboard real-time production system, not offline post-processing. |
| Operating context | 8 h flight, 60 km/h, <=1 km AGL, 3 fps nav camera, Jetson Orin Nano Super, GPS denied/spoofed. |
| Required interfaces | Offline Satellite Service cache in; MAVLink `GPS_INPUT`, QGC telemetry, FDR records, and object-coordinate API out. |
| Non-functional envelope | <400 ms p95, <8 GB shared memory, 10 GB persistent cache target, 64 GB FDR cap, safety covariance and false-position budgets. |
## Project Constraint Matrix Summary
| Constraint Area | Binding Constraint |
|-----------------|-------------------|
| Camera | ADTi 20MP 20L V1, APS-C, ~5472 x 3648, fixed nadir, no gimbal stabilization. |
| Sensors | FC IMU/attitude/airspeed/altitude available over MAVLink; original still-image sample lacks synchronized IMU, while Derkachi replay data now provides synchronized IMU and `GLOBAL_POSITION_INT` trajectory. |
| Reference imagery | Offline cache only, 0.5 m/px minimum and 0.3 m/px ideal, freshness gates, no in-flight provider fetch. |
| Runtime | Jetson Orin Nano Super, CUDA/TensorRT available, 25 W thermal envelope. |
| Autopilot | ArduPilot only, v1 emits `GPS_INPUT` only; ODOMETRY intentionally disabled. |
| Storage | No raw frame retention; tiles + FDR only. Descriptor/index storage must be budgeted. |
| Safety | Reject weak anchors, never under-report covariance, fail/degrade honestly in blackout and spoofing. |
| Hard disqualifiers | Per-frame heavy VPR without profiling, runtime dependence on external network, stale-tile confident anchors, GPL production dependency unless licensing is accepted. |
## Perspectives
| Perspective | Focus |
|-------------|-------|
| Operator / mission user | Does the system keep the UAV navigable and report honest confidence under spoofing/blackout? |
| Embedded implementer | Can the pipeline fit <400 ms p95 and <8 GB on Jetson with maintainable interfaces? |
| Safety reviewer | Are false-position and cache-poisoning paths gated before they can steer the FC or poison future caches? |
| Field practitioner | Will seasonal agricultural repetition, turns, haze/smoke, and stale imagery break the architecture? |
| Contrarian | Which attractive libraries or SOTA models fail because of licensing, memory, latency, or input mismatch? |
## Sub-Questions And Query Variants
1. What architecture bounds drift while GPS is denied?
- fixed-wing UAV GPS-denied satellite image matching visual odometry
- visual odometry satellite imagery accumulated error fixed wing UAV
- monocular VIO aerial navigation scale ambiguity satellite anchor
- GPS spoofed UAV visual inertial navigation covariance failover
2. Which VO/VIO approach fits one nadir camera + IMU?
- OpenVINS monocular visual inertial odometry Jetson
- ORB-SLAM3 monocular inertial Jetson UAV limitations
- VINS-Fusion fixed wing monocular IMU outdoor aerial
- homography visual odometry nadir UAV IMU fusion
3. Which satellite retrieval and matching approach fits offline cache + <400 ms?
- aerial visual place recognition survey DINOv2 FAISS
- DINOv2 VLAD aerial VPR embedded memory
- LightGlue SuperPoint DISK ALIKED TensorRT Jetson
- cross-view UAV satellite matching failure modes farmland
4. How should the estimator and safety modes work?
- ESKF visual inertial GPS denied UAV covariance
- GPS_INPUT horiz_accuracy covariance external GPS ArduPilot
- visual blackout IMU dead reckoning UAV covariance growth
- false position rejection Mahalanobis gate visual localization
5. What cache format and data contract fit the onboard/Satellite Service boundary?
- COG PMTiles MBTiles offline raster cache embedded
- satellite tile descriptor index storage FAISS PMTiles
- cloud optimized geotiff local update limitations
- PMTiles read only update PostgreSQL/PostGIS-backed raster cache
6. How should MAVLink output integrate with ArduPilot Plane?
- ArduPilot GPS_INPUT GPS1_TYPE 14 Plane SITL
- pymavlink gps_input_send external GPS example
- MAVSDK GPS_INPUT support raw MAVLink
- ArduPilot EKF GPS glitch spoof failsafe Plane parameters
7. What validation datasets and tests are needed?
- AerialVL UAV satellite visual localization dataset
- VPAir aerial visual place recognition dataset
- EuRoC MAV visual inertial odometry dataset
- ArduPilot Plane SITL fake GPS spoofing simulation
## Component Option Search Plan
| Component Area | Option Families / Candidates | Evidence Needed |
|----------------|------------------------------|-----------------|
| Camera calibration and geometry | OpenCV calibration/homography; custom NumPy geometry; ROS camera pipeline | Official API for intrinsics, distortion, homography, RANSAC; permissive licensing; Jetson compatibility. |
| VO / VIO propagation | OpenVINS, ORB-SLAM3, VINS-Fusion, custom homography+IMU ESKF | Exact monocular+IMU input fit, output pose/covariance, licensing, runtime, initialization behavior. |
| VPR global retrieval | DINOv2-VLAD/AnyLoc, MixVPR/SALAD/SelaVPR, classical NetVLAD/BoW | Aerial benchmark evidence, descriptor size, offline index fit, embedded feasibility. |
| Local cross-domain matching | LightGlue + DISK/ALIKED, SuperPoint+LightGlue, LoFTR/XFeat, SIFT/ORB baseline | Inputs/outputs, match coordinates, license, runtime knobs, TensorRT/Jetson feasibility. |
| Vector index | FAISS CPU/GPU, PostgreSQL/pgvector metadata-assisted search, Annoy/HNSWLIB | Top-K retrieval, saved index, memory/compression knobs, ARM/Jetson feasibility. |
| Estimator | Custom ESKF, factor graph, robot_localization | Covariance output, mode labels, Mahalanobis gates, source-specific update control. |
| Cache/storage | COG, PostgreSQL/PostGIS manifest, PMTiles, MBTiles, raw tile folders | Offline read/update behavior, storage efficiency, metadata/manifest support. |
| MAVLink integration | pymavlink, MAVSDK, MAVProxy bridge | `GPS_INPUT` support, ArduPilot `GPS1_TYPE=14`, telemetry subscriptions, QGC status. |
| FDR | PostgreSQL event index, Parquet export, CBOR segment files | Streaming writes, rollover, compact typed records, replayability. |
## Completeness Audit
- **Cost/resources**: covered by Jetson, cache, thermal, and descriptor storage constraints.
- **Legal/licensing**: covered; GPL VIO/SLAM tools are not selected for production.
- **Dependencies**: Satellite Service cache contract, ArduPilot Plane SITL, and synchronized validation data are explicit dependencies.
- **Operating environment**: fixed-wing, altitude, terrain, seasonal/visibility classes, and blackout cases covered.
- **Failure modes**: VO failure, stale tiles, spoofing, blackout, thermal throttling, false anchors, cache poisoning covered.
- **Practitioner concerns**: real-time embedded performance and dataset mismatch covered through survey and benchmark sources.
- **Change over time**: DINOv2/VPR models and Jetson/TensorRT assumptions require version-pinned profiling during implementation.
## Mode B Round 2 Addendum — User-Requested Technology Check
### Research Output Class
Technical-component selection. The addendum verifies two implementation choices before autodev proceeds to planning:
1. Whether OpenVINS should replace the custom OpenCV-based VO/ESKF direction.
2. Whether DINOv2-VLAD + ALIKED/LightGlue is still the right satellite retrieval and anchor-verification stack.
### Boundary Clarification
"Custom OpenCV" is treated as OpenCV for calibration, undistortion, feature geometry, homography/RANSAC, and MRE measurement, plus a project-owned ESKF/mode machine. It is not treated as a naive OpenCV-only replacement for VIO.
### Additional Query Variants Executed
- OpenVINS GPL-3 license MSCKF visual inertial odometry documentation monocular IMU 2026
- OpenVINS visual inertial odometry GPS denied UAV MSCKF limitations monocular high altitude nadir camera
- why not use OpenVINS production GPL ROS dependency visual inertial odometry limitations
- OpenCV license BSD 3-Clause camera calibration findHomography RANSAC documentation 4.x
- custom visual odometry OpenCV homography IMU EKF fixed wing UAV satellite imagery GPS denied 2024
- DINOv2 VLAD AnyLoc visual place recognition aerial satellite retrieval benchmark 2024 2025
- DINOv2 VLAD limitations visual place recognition storage compute AnyLoc limitations
- DINOv2 TensorRT Jetson performance issue embedding accuracy visual place recognition
- ALIKED LightGlue license local feature matching aerial image registration 2024 2025
- ALIKED LightGlue ONNX TensorRT Jetson performance benchmark local feature matching
- aerial visual place recognition survey 2024 runtime memory re-ranking SuperGlue LightGlue satellite UAV retrieval
### Addendum Conclusion
OpenVINS is better than a pure custom OpenCV-only VIO implementation, but the production architecture should keep OpenCV as the utility layer and keep the project-owned ESKF/mode machine as the shipped estimator. OpenVINS becomes a mandatory benchmark/reference because it does not own the satellite anchor, spoofing/blackout, source-label, cache-write, and MAVLink semantics required by the acceptance criteria, and GPLv3 remains a production dependency blocker.
DINOv2-VLAD + CPU-first FAISS + ALIKED/LightGlue remains the preferred anchor stack, with two non-negotiable constraints: retrieval is trigger-based rather than per-frame, and TensorRT/ONNX optimizations are accepted only after descriptor-fidelity and Jetson latency tests.
-419
View File
@@ -1,419 +0,0 @@
# Source Registry
## Source #1
- **Title**: Visual Odometry in GPS-Denied Zones for Fixed-Wing UAV with Reduced Accumulative Error Based on Satellite Imagery
- **Link**: https://www.mdpi.com/2076-3417/14/16/7420
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: Currently valid
- **Target Audience**: UAV visual localization researchers/implementers
- **Research Boundary Match**: Full match
- **Summary**: Demonstrates fixed-wing high-altitude monocular VO corrected by satellite imagery; highlights scale ambiguity and accumulated drift.
- **Related Sub-question**: Architecture / drift bounding
## Source #2
- **Title**: Visual place recognition for aerial imagery: A survey
- **Link**: https://arxiv.org/abs/2406.00885
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: Currently valid
- **Target Audience**: Aerial VPR researchers/implementers
- **Research Boundary Match**: Full match
- **Summary**: Reviews aerial VPR, retrieval/re-ranking, overlap/scale effects, memory/runtime issues, and georeference recall.
- **Related Sub-question**: VPR / validation
## Source #3
- **Title**: OpenVINS documentation
- **Link**: https://docs.openvins.com/
- **Tier**: L1
- **Publication Date**: 2023 latest noted release
- **Timeliness Status**: Needs verification before implementation
- **Target Audience**: VIO researchers/implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: OpenVINS is an EKF/MSCKF visual-inertial estimator supporting monocular tracking, calibration, evaluation, and covariance-aware estimation; GPL-3 license.
- **Related Sub-question**: VO/VIO
## Source #4
- **Title**: ORB-SLAM3 README
- **Link**: https://raw.githubusercontent.com/UZ-SLAMLab/ORB_SLAM3/master/README.md
- **Tier**: L1
- **Publication Date**: 2021 README, still repository source
- **Timeliness Status**: Needs verification before implementation
- **Target Audience**: SLAM implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: ORB-SLAM3 supports monocular visual-inertial SLAM and multi-map operation, requires calibration, and is GPLv3.
- **Related Sub-question**: VO/VIO alternatives
## Source #5
- **Title**: OpenCV 4.x documentation via Context7
- **Link**: https://docs.opencv.org/4.x/
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: Computer vision implementers
- **Research Boundary Match**: Full match for utility layer
- **Summary**: Documents camera calibration, undistortion, and `findHomography` with RANSAC for robust geometry.
- **Related Sub-question**: Calibration / geometry
## Source #6
- **Title**: LightGlue README and Context7 docs
- **Link**: https://raw.githubusercontent.com/cvg/LightGlue/main/README.md
- **Tier**: L1
- **Publication Date**: Current repository, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: Feature-matching implementers
- **Research Boundary Match**: Full match for local matching
- **Summary**: LightGlue accepts local keypoints/descriptors and returns matched coordinates/scores; supports SuperPoint, DISK, ALIKED, SIFT, adaptive pruning, CUDA, and Apache-2 for code/weights while SuperPoint has restrictive licensing.
- **Related Sub-question**: Local matching
## Source #7
- **Title**: AnyLoc README
- **Link**: https://github.com/AnyLoc/AnyLoc
- **Tier**: L1
- **Publication Date**: 2023 repository, accessed 2026-05-01
- **Timeliness Status**: Needs profiling verification
- **Target Audience**: VPR implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: Provides DINOv2 + VLAD API examples and notes substantial storage/compute requirements for full experiments.
- **Related Sub-question**: VPR descriptors
## Source #8
- **Title**: DINOv2 repository
- **Link**: https://github.com/facebookresearch/dinov2
- **Tier**: L1
- **Publication Date**: 2023 repository, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: Vision model implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: Meta's DINOv2 implementation and models, Apache-2.0 / CC-BY-4.0 license notices.
- **Related Sub-question**: VPR descriptors
## Source #9
- **Title**: FAISS documentation and Context7 docs
- **Link**: https://faiss.ai/index.html
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: Vector search implementers
- **Research Boundary Match**: Full match
- **Summary**: FAISS supports dense vector search, top-k retrieval, CPU/GPU indexes, product quantization, and save/load APIs; GPU indexes must be converted to CPU before saving.
- **Related Sub-question**: Descriptor retrieval
## Source #10
- **Title**: MAVSDK documentation via Context7
- **Link**: https://github.com/mavlink/mavsdk
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: MAVLink application implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: MAVSDK provides telemetry APIs including raw GPS, GPS info, status text, position/velocity, and odometry subscriptions; `GPS_INPUT` emission should use raw MAVLink/pymavlink for this project.
- **Related Sub-question**: MAVLink integration
## Source #11
- **Title**: ArduPilot MAVProxy GPSInput
- **Link**: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: ArduPilot integrators
- **Research Boundary Match**: Full match
- **Summary**: External GPS input requires `GPS1_TYPE=14` and accepts MAVLink `GPS_INPUT` fields including WGS84 lat/lon, velocity, fix type, and accuracy.
- **Related Sub-question**: MAVLink output
## Source #12
- **Title**: MAVLink common message spec: GPS_INPUT
- **Link**: https://mavlink.io/en/messages/common.html#GPS_INPUT
- **Tier**: L1
- **Publication Date**: Current spec, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: MAVLink implementers
- **Research Boundary Match**: Full match
- **Summary**: Defines `GPS_INPUT` fields, fix type semantics, `horiz_accuracy`, and ignore flags.
- **Related Sub-question**: MAVLink output / confidence
## Source #13
- **Title**: ArduPilot GPS failsafe and glitch protection
- **Link**: https://ardupilot.org/copter/docs/gps-failsafe-glitch-protection.html
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Reference only for Plane
- **Target Audience**: ArduPilot operators
- **Research Boundary Match**: Partial overlap
- **Summary**: Documents GPS glitch protection and notes inertial-only position degrades quickly; Copter-specific defaults must not be assumed for Plane.
- **Related Sub-question**: Failsafe / spoofing
## Source #14
- **Title**: ArduPilot EKF failsafe
- **Link**: https://ardupilot.org/copter/docs/ekf-inav-failsafe.html
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Reference only for Plane
- **Target Audience**: ArduPilot operators
- **Research Boundary Match**: Partial overlap
- **Summary**: Explains EKF variance failsafe behavior and why spoof/glitch tests must be parameterized.
- **Related Sub-question**: Failsafe / spoofing
## Source #15
- **Title**: Jetson Orin Nano Super Developer Kit
- **Link**: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit/
- **Tier**: L1
- **Publication Date**: Current page, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: Embedded AI implementers
- **Research Boundary Match**: Full match
- **Summary**: Confirms 67 INT8 TOPS, 8 GB LPDDR5, 102 GB/s, and 7-25 W power range.
- **Related Sub-question**: Runtime
## Source #16
- **Title**: NVIDIA JetPack 6.2 Super Mode blog
- **Link**: https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/
- **Tier**: L2
- **Publication Date**: 2024
- **Timeliness Status**: Currently valid
- **Target Audience**: Jetson developers
- **Research Boundary Match**: Full match
- **Summary**: Explains 25 W and MAXN Super modes and warns thermal design must accommodate the new power modes or throttling occurs.
- **Related Sub-question**: Runtime / thermal
## Source #17
- **Title**: PMTiles Concepts
- **Link**: https://docs.protomaps.com/pmtiles/
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: Geospatial storage implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: PMTiles is single-file tiled archive, efficient for reads, but read-only and not update-in-place.
- **Related Sub-question**: Cache storage
## Source #18
- **Title**: GDAL COG driver
- **Link**: https://gdal.org/en/stable/drivers/raster/cog.html
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: Geospatial raster implementers
- **Research Boundary Match**: Full match
- **Summary**: Defines COG creation options for tiled, compressed, overview-enabled GeoTIFFs.
- **Related Sub-question**: Cache storage
## Source #19
- **Title**: AerialVL dataset
- **Link**: https://github.com/hmf21/AerialVL
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: Currently valid
- **Target Audience**: Aerial visual localization researchers
- **Research Boundary Match**: Partial overlap
- **Summary**: Public aerial localization benchmark with UAV sequences, reference maps, and geo-referenced evaluation data.
- **Related Sub-question**: Validation
## Source #20
- **Title**: EuRoC MAV Dataset
- **Link**: http://projects.asl.ethz.ch/datasets/euroc-mav/
- **Tier**: L1
- **Publication Date**: 2016
- **Timeliness Status**: Stable benchmark
- **Target Audience**: VIO researchers
- **Research Boundary Match**: Partial overlap
- **Summary**: Stereo camera + IMU + ground truth benchmark useful for VIO sanity tests but not representative of high-altitude nadir fixed-wing imagery.
- **Related Sub-question**: Validation
## Source #21
- **Title**: NVIDIA/TensorRT issue: DINOv2 TensorRT performance/precision on Jetson
- **Link**: https://github.com/NVIDIA/TensorRT/issues/4348
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: Needs verification
- **Target Audience**: Jetson/TensorRT implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: Reports limited mixed-precision gains for DINOv2-S on Jetson/RTX, suggesting DINOv2 optimization is not automatically beneficial.
- **Related Sub-question**: Mode B performance risk
## Source #22
- **Title**: NVIDIA Developer Forum: DINOv2 TensorRT model performance issue
- **Link**: https://forums.developer.nvidia.com/t/dinov2-tensorrt-model-performance-issue/312251
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: Needs verification
- **Target Audience**: Jetson/TensorRT implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: Reports DINOv2 embedding distance changes after TensorRT conversion on Jetson Orin Nano; requires embedding-fidelity validation before relying on TensorRT descriptors.
- **Related Sub-question**: Mode B performance/quality risk
## Source #23
- **Title**: LightGlue license issue discussions
- **Link**: https://github.com/cvg/LightGlue/issues/120
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: Currently relevant
- **Target Audience**: Feature-matching implementers
- **Research Boundary Match**: Full match for licensing
- **Summary**: Community discussion highlights restrictive SuperPoint licensing inside the LightGlue ecosystem and supports avoiding SuperPoint as default production extractor.
- **Related Sub-question**: Mode B licensing risk
## Source #24
- **Title**: ArduPilot issue: GPS_INPUT velocity ignore flag pitfall
- **Link**: https://github.com/ArduPilot/ardupilot/issues/19633
- **Tier**: L4
- **Publication Date**: 2021
- **Timeliness Status**: Needs SITL verification
- **Target Audience**: ArduPilot integrators
- **Research Boundary Match**: Full match for GPS_INPUT caution
- **Summary**: Reports EKF3 may use zero velocity when `GPS_INPUT_IGNORE_FLAG_VEL_HORIZ` is set, so velocity-source parameters must be tested rather than relying only on ignore flags.
- **Related Sub-question**: Mode B MAVLink pitfall
## Source #25
- **Title**: FAISS install documentation
- **Link**: https://github.com/facebookresearch/faiss/blob/main/INSTALL.md
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: Vector search implementers
- **Research Boundary Match**: Full match
- **Summary**: FAISS CPU conda package supports aarch64, while GPU package availability is x86-64 focused; Jetson design should assume CPU FAISS unless a custom build is proven.
- **Related Sub-question**: Mode B FAISS deployment
## Source #26
- **Title**: GNSS-denied geolocalization of UAVs by visual matching of onboard camera images with orthophotos
- **Link**: https://ar5iv.labs.arxiv.org/html/2103.14381
- **Tier**: L1
- **Publication Date**: 2021
- **Timeliness Status**: Stable mechanism reference
- **Target Audience**: UAV visual geolocalization researchers
- **Research Boundary Match**: Partial overlap
- **Summary**: Demonstrates visual matching with orthophotos and Monte Carlo/local planarity ideas; supports using orthorectified reference maps but does not cover all adversarial visual attacks.
- **Related Sub-question**: Mode B alternative / security limits
## Source #27
- **Title**: OpenVINS LICENSE
- **Link**: https://github.com/rpng/open_vins/blob/master/LICENSE
- **Tier**: L1
- **Publication Date**: Current repository, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: VIO implementers / product owners
- **Research Boundary Match**: Full match for licensing
- **Summary**: OpenVINS is GPLv3-licensed; this is a production dependency constraint, not a technical capability limitation.
- **Related Sub-question**: Mode B round 2 — OpenVINS vs custom production estimator
## Source #28
- **Title**: OpenVINS documentation and Context7 lookup
- **Link**: https://docs.openvins.com/index.html
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: VIO implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: OpenVINS is a strong EKF/MSCKF VIO system for monocular camera + IMU reference runs, with calibration and covariance-aware state estimation, but it does not own the project-specific satellite anchor, GPS_INPUT, source-label, spoofing, blackout, and cache-poisoning state machine.
- **Related Sub-question**: Mode B round 2 — OpenVINS vs custom production estimator
## Source #29
- **Title**: OpenCV 4.x calibration/homography documentation and Context7 lookup
- **Link**: https://docs.opencv.org/4.x/
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: Computer vision implementers
- **Research Boundary Match**: Full match for geometry utility layer
- **Summary**: OpenCV 4.x provides calibration, undistortion, homography estimation, RANSAC/USAC robust estimation, and reprojection-error primitives under a permissive license; it is a utility layer rather than a complete GPS-denied estimator.
- **Related Sub-question**: Mode B round 2 — custom OpenCV boundary
## Source #30
- **Title**: AnyLoc: Towards Universal Visual Place Recognition
- **Link**: https://arxiv.org/html/2308.00688
- **Tier**: L1
- **Publication Date**: 2023; ICRA 2024
- **Timeliness Status**: Currently valid, profiling required before deployment
- **Target Audience**: VPR implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: AnyLoc combines DINOv2 features with VLAD aggregation for broad VPR, including aerial data, and supports the selected DINOv2-VLAD retrieval family while leaving runtime/storage tuning as a deployment gate.
- **Related Sub-question**: Mode B round 2 — satellite retrieval
## Source #31
- **Title**: ALIKED-LightGlue-ONNX and LightGlue ONNX/TensorRT deployment reports
- **Link**: https://github.com/ikeboo/ALIKED-LightGlue-ONNX
- **Tier**: L2
- **Publication Date**: Current repository, accessed 2026-05-01
- **Timeliness Status**: Promising but needs Jetson verification
- **Target Audience**: Local feature matching implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: ONNX/optimized variants show a credible deployment path for ALIKED + LightGlue, but public evidence is not enough to assume Jetson Orin Nano p95 latency without project profiling.
- **Related Sub-question**: Mode B round 2 — local matcher deployability
## Source #32
- **Title**: Visual place recognition for aerial imagery: A survey
- **Link**: https://arxiv.org/abs/2406.00885
- **Tier**: L1
- **Publication Date**: 2024
- **Timeliness Status**: Currently valid
- **Target Audience**: Aerial VPR researchers / implementers
- **Research Boundary Match**: Full match
- **Summary**: Aerial VPR performance depends materially on tile scale, overlap, weather, repetitive patterns, and re-ranking cost; this supports overlapped VPR chunks, dynamic top-K, and conditional local verification.
- **Related Sub-question**: Mode B round 2 — satellite retrieval and anchor verification
## Source #33
- **Title**: BASALT repository and documentation
- **Link**: https://github.com/VladyslavUsenko/basalt
- **Tier**: L1
- **Publication Date**: Current repository, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: VIO implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: BASALT provides visual-inertial odometry and mapping, camera/IMU calibration tools, EuRoC/TUM VI support, and a BSD-style production-friendly licensing path.
- **Related Sub-question**: Mode B round 3 — Kimera vs BASALT vs OpenVINS
## Source #34
- **Title**: HybVIO: Pushing the Limits of Real-time Visual-inertial Odometry
- **Link**: https://arxiv.org/pdf/2106.11857
- **Tier**: L1
- **Publication Date**: 2021
- **Timeliness Status**: Stable benchmark reference
- **Target Audience**: VIO researchers / embedded implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: Reports EuRoC RMS ATE comparisons including BASALT mean about 0.051 m online stereo and Kimera mean about 0.12 m, plus notes that optimization-based methods often lack direct uncertainty quantification compared with filters.
- **Related Sub-question**: Mode B round 3 — VIO error and confidence comparison
## Source #35
- **Title**: OpenVINS issue #402 — up-to-date ATE and RTE metrics
- **Link**: https://github.com/rpng/open_vins/issues/402
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: Community benchmark, verify in our replay harness
- **Target Audience**: VIO implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: Community EuRoC comparison reports BASALT average ATE about 0.072 m with 100% completion, and OpenVINS average ATE about 0.091 m with about 88.55% completion and a divergence on one hard sequence.
- **Related Sub-question**: Mode B round 3 — BASALT vs OpenVINS error/completion
## Source #36
- **Title**: Kimera-VIO mono-inertial parameter issues
- **Link**: https://github.com/MIT-SPARK/Kimera-VIO/issues/254
- **Tier**: L4
- **Publication Date**: 2024
- **Timeliness Status**: Relevant implementation caveat
- **Target Audience**: VIO implementers
- **Research Boundary Match**: Partial overlap
- **Summary**: Kimera-VIO stereo path remains strong, but mono-inertial configurations had documented poor default performance; parameter changes improved one EuRoC mono setup to less than about +/-0.2 m per axis.
- **Related Sub-question**: Mode B round 3 — Kimera mono/nadir risk
## Source #37
- **Title**: RaD-VIO and downward-facing VIO literature
- **Link**: https://arxiv.org/abs/1810.08704
- **Tier**: L1
- **Publication Date**: 2018
- **Timeliness Status**: Stable mechanism reference
- **Target Audience**: MAV downward-camera VIO researchers
- **Research Boundary Match**: Full match for nadir-camera caveat
- **Summary**: Downward-facing monocular VIO has planar-scene and observability challenges; range/altitude and IMU constraints are important when the camera sees mostly ground plane.
- **Related Sub-question**: Mode B round 3 — nadir support and limitations
## Source #38
- **Title**: OpenVINS covariance documentation and StateHelper APIs
- **Link**: https://docs.openvins.com/dev-index.html
- **Tier**: L1
- **Publication Date**: Current docs, accessed 2026-05-01
- **Timeliness Status**: Currently valid
- **Target Audience**: VIO implementers
- **Research Boundary Match**: Full match for covariance/confidence output
- **Summary**: OpenVINS maintains EKF covariance and exposes full/marginal covariance helpers, making it the strongest reference for covariance consistency even if GPLv3 blocks default production use.
- **Related Sub-question**: Mode B round 3 — confidence/covariance support
-348
View File
@@ -1,348 +0,0 @@
# Fact Cards
## Fact #1
- **Statement**: Fixed-wing high-altitude monocular VO suffers from scale ambiguity and accumulated error; comparing against satellite imagery can reduce accumulated drift.
- **Source**: Source #1
- **Phase**: Phase 2
- **Target Audience**: UAV localization implementers
- **Confidence**: High
- **Related Dimension**: Architecture
- **Fit Impact**: Supports satellite-anchored hybrid estimator.
## Fact #2
- **Statement**: Aerial VPR is sensitive to weather, season, scale variation, repetitive patterns, and map tile construction; overlap and scale level materially affect retrieval quality.
- **Source**: Source #2
- **Confidence**: High
- **Related Dimension**: VPR
- **Fit Impact**: Supports AC-8.6 VPR chunks with overlap and seasonal validation.
## Fact #3
- **Statement**: Heavy VPR re-ranking can be too slow for steady-state embedded use; survey evidence reports some re-ranking around 1 s and SuperGlue much slower on evaluated hardware.
- **Source**: Source #2
- **Confidence**: High
- **Related Dimension**: Runtime
- **Fit Impact**: Disqualifies per-frame global VPR/re-ranking unless profiled on Jetson.
## Fact #4
- **Statement**: OpenVINS is an EKF/MSCKF visual-inertial estimator with monocular tracking and calibration support, but its code is GPL-3.
- **Source**: Source #3
- **Confidence**: High
- **Related Dimension**: VO/VIO
- **Fit Impact**: Reference/benchmark only unless GPL obligations are accepted.
## Fact #5
- **Statement**: ORB-SLAM3 supports monocular visual-inertial SLAM and multi-map operation, but it is GPLv3 and expects careful calibration and a SLAM-style runtime stack.
- **Source**: Source #4
- **Confidence**: High
- **Related Dimension**: VO/VIO
- **Fit Impact**: Rejected as production dependency; useful benchmark/reference.
## Fact #6
- **Statement**: OpenCV provides camera calibration APIs that output camera matrix and distortion coefficients, and homography estimation APIs including RANSAC.
- **Source**: Source #5
- **Confidence**: High
- **Related Dimension**: Calibration / geometry
- **Fit Impact**: Selected utility layer for calibration, undistortion, homography, and geometric validation.
## Fact #7
- **Statement**: LightGlue accepts local keypoints/descriptors from extractors such as DISK, ALIKED, SIFT, and SuperPoint, and returns matched keypoint indices, coordinates, and confidence scores.
- **Source**: Source #6
- **Confidence**: High
- **Related Dimension**: Local matching
- **Fit Impact**: Selected candidate for conditional cross-domain local matching.
## Fact #8
- **Statement**: LightGlue has adaptive depth/width pruning, FlashAttention, mixed precision, and benchmark scripts; runtime must be profiled on Jetson because defaults are optimized for desktop GPUs.
- **Source**: Source #6
- **Confidence**: High
- **Related Dimension**: Runtime
- **Fit Impact**: Selected with runtime-quality gate.
## Fact #9
- **Statement**: LightGlue code/weights are Apache-2.0, but SuperPoint pretrained weights/inference have restrictive licensing; DISK and ALIKED are safer extractor pairings from a licensing perspective.
- **Source**: Source #6
- **Confidence**: High
- **Related Dimension**: Licensing
- **Fit Impact**: Select DISK/ALIKED+LightGlue for production candidate; treat SuperPoint as license-gated.
## Fact #10
- **Statement**: AnyLoc provides DINOv2 feature extraction and VLAD aggregation APIs, but its full experiment setup notes large storage/compute requirements.
- **Source**: Source #7
- **Confidence**: High
- **Related Dimension**: VPR descriptors
- **Fit Impact**: DINOv2-VLAD selected as offline/conditional retrieval candidate, not unconditional per-frame path.
## Fact #11
- **Statement**: DINOv2 official repository provides Meta's DINOv2 implementation and model assets with Apache-2.0 / CC-BY-4.0 license notices.
- **Source**: Source #8
- **Confidence**: High
- **Related Dimension**: VPR descriptors
- **Fit Impact**: Supports DINOv2 as a permissible descriptor backbone subject to model-license review.
## Fact #12
- **Statement**: FAISS is designed for efficient dense vector similarity search, top-k nearest-neighbor retrieval, speed/accuracy tradeoffs, and indexes too large for simple exhaustive scanning.
- **Source**: Source #9
- **Confidence**: High
- **Related Dimension**: Descriptor retrieval
- **Fit Impact**: Selected vector index for offline VPR descriptors.
## Fact #13
- **Statement**: FAISS supports saving/loading indexes; GPU indexes must be converted to CPU before saving.
- **Source**: Source #9
- **Confidence**: High
- **Related Dimension**: Cache lifecycle
- **Fit Impact**: Supports install-time/index-build flow with runtime load.
## Fact #14
- **Statement**: MAVSDK provides telemetry subscriptions for raw GPS, GPS info, status text, odometry, and position/velocity; it does not remove the need for raw MAVLink control over `GPS_INPUT` emission.
- **Source**: Source #10
- **Confidence**: High
- **Related Dimension**: MAVLink integration
- **Fit Impact**: Select MAVSDK for telemetry, pymavlink/raw MAVLink for `GPS_INPUT`.
## Fact #15
- **Statement**: ArduPilot GPSInput requires `GPS1_TYPE=14` for MAVLink GPS input.
- **Source**: Source #11
- **Confidence**: High
- **Related Dimension**: MAVLink output
- **Fit Impact**: Confirms production parameter requirement.
## Fact #16
- **Statement**: `GPS_INPUT` carries WGS84 lat/lon, MSL altitude, velocity, `fix_type`, `horiz_accuracy`, `vert_accuracy`, `speed_accuracy`, and ignore flags.
- **Source**: Source #12
- **Confidence**: High
- **Related Dimension**: Output contract
- **Fit Impact**: Supports mapping estimator covariance to `horiz_accuracy` and failover fix types.
## Fact #17
- **Statement**: ArduPilot GPS glitch protection and EKF failsafe behavior are parameterized and vehicle-specific; Copter docs are not enough to prove Plane behavior.
- **Source**: Sources #13, #14
- **Confidence**: High
- **Related Dimension**: Failsafe
- **Fit Impact**: Requires ArduPilot Plane SITL validation.
## Fact #18
- **Statement**: Jetson Orin Nano Super provides 67 INT8 TOPS, 8 GB memory, 102 GB/s bandwidth, and 7-25 W power range.
- **Source**: Source #15
- **Confidence**: High
- **Related Dimension**: Runtime
- **Fit Impact**: Confirms target platform constraint.
## Fact #19
- **Statement**: NVIDIA warns Super power modes require thermal design that can handle the power modes; otherwise throttling can reduce performance.
- **Source**: Source #16
- **Confidence**: High
- **Related Dimension**: Thermal
- **Fit Impact**: Supports AC-NEW-5 hot-soak and throttle logging.
## Fact #20
- **Statement**: PMTiles is efficient for single-file tile reads but is read-only and cannot be updated in place.
- **Source**: Source #17
- **Confidence**: High
- **Related Dimension**: Cache storage
- **Fit Impact**: Rejected for mutable onboard tile writes; possible export/package format only.
## Fact #21
- **Statement**: COG supports tiled, compressed, overview-enabled GeoTIFFs suitable for efficient raster access and geospatial tooling.
- **Source**: Source #18
- **Confidence**: High
- **Related Dimension**: Cache storage
- **Fit Impact**: Selected imagery storage unit for immutable service tiles and generated candidate tiles.
## Fact #22
- **Statement**: AerialVL provides aerial visual localization sequences, reference maps, and geo-referenced evaluation data.
- **Source**: Source #19
- **Confidence**: Medium
- **Related Dimension**: Validation
- **Fit Impact**: Selected validation dataset for VPR/satellite-anchor algorithm development.
## Fact #23
- **Statement**: EuRoC provides synchronized camera/IMU and ground truth for VIO, but it is not representative of high-altitude fixed-wing nadir imagery.
- **Source**: Source #20
- **Confidence**: High
- **Related Dimension**: Validation
- **Fit Impact**: Use for VIO sanity checks only, not final AC proof.
## MVE Evidence
### MVE — OpenCV calibration and homography utilities
- **Source**: Source #5
- **Pinned mode/config**: Use OpenCV 4.x C++/Python APIs for checkerboard calibration, undistortion, homography estimation with RANSAC, and reprojection-error measurement.
- **Inputs in example**: Object/image point correspondences, image size, matched keypoints.
- **Outputs in example**: Camera matrix, distortion coefficients, rotation/translation vectors, homography matrix.
- **Project inputs**: ADTi nav-camera frames, checkerboard calibration images, matched VO/satellite points.
- **Project outputs required**: Intrinsics/distortion, homography, inlier mask, MRE.
- **Match assessment**: Exact match.
### MVE — LightGlue in DISK/ALIKED local-matching mode
- **Source**: Source #6
- **Pinned mode/config**: Use DISK+LightGlue or ALIKED+LightGlue on CUDA/TensorRT-profiled Jetson path, with inputs two normalized images and outputs matched keypoint coordinates plus confidence scores.
- **Inputs in example**: Two images loaded to GPU; local features extracted by DISK/ALIKED/SuperPoint.
- **Outputs in example**: `matches` shape `(K, 2)`, keypoint coordinates in each image, confidence scores.
- **Project inputs**: Orthorectified nav frame crop and candidate satellite/VPR chunk.
- **Project outputs required**: 2D-2D correspondences for RANSAC homography and cross-domain MRE.
- **Match assessment**: Exact interface match; runtime quality gate remains.
### MVE — FAISS top-K VPR retrieval
- **Source**: Source #9
- **Pinned mode/config**: Use FAISS CPU index with optional GPU acceleration for top-K nearest neighbor search over precomputed DINOv2/VLAD descriptors, saved/loaded at install/preflight time.
- **Inputs in example**: Float32 descriptor matrix, query descriptor, `k`.
- **Outputs in example**: Distance matrix `D` and index matrix `I`.
- **Project inputs**: Precomputed VPR chunk descriptors, query frame descriptor.
- **Project outputs required**: Top-K candidate chunk IDs for local matching.
- **Match assessment**: Exact match.
### MVE — MAVSDK telemetry + pymavlink GPS_INPUT
- **Source**: Sources #10, #11, #12
- **Pinned mode/config**: Use MAVSDK for telemetry subscriptions and pymavlink/raw MAVLink for `GPS_INPUT` emission to ArduPilot with `GPS1_TYPE=14`.
- **Inputs in example**: Telemetry streams, estimator lat/lon/alt/velocity/covariance.
- **Outputs in example**: `GPS_INPUT` fields accepted by ArduPilot GPS backend.
- **Project inputs**: ESKF state and covariance, source label, mode/fix quality.
- **Project outputs required**: Frame-by-frame WGS84 `GPS_INPUT`, status text, FDR record.
- **Match assessment**: Exact match for output contract; Plane SITL validation remains.
## Mode B Findings
### Fact #24
- **Statement**: DINOv2 TensorRT optimization on Jetson may provide limited speedup and can change embedding distances; descriptor fidelity must be tested against the PyTorch/ONNX baseline before selecting a TensorRT descriptor path.
- **Source**: Sources #21, #22
- **Phase**: Mode B
- **Confidence**: Medium
- **Related Dimension**: VPR runtime / quality
- **Fit Impact**: Adds embedding-fidelity gate; keeps DINOv2 selected only after profiling.
### Fact #25
- **Statement**: LightGlue's SuperPoint path has documented license concerns; DISK/ALIKED remain the safer production default unless legal review approves SuperPoint.
- **Source**: Source #23
- **Phase**: Mode B
- **Confidence**: High
- **Related Dimension**: Licensing
- **Fit Impact**: Confirms draft01 decision to avoid SuperPoint as default.
### Fact #26
- **Statement**: ArduPilot `GPS_INPUT_IGNORE_FLAG_VEL_HORIZ` has a reported EKF3 pitfall where velocity may become zero rather than truly ignored; SITL must validate velocity-source parameters and message fields.
- **Source**: Source #24
- **Phase**: Mode B
- **Confidence**: Medium
- **Related Dimension**: MAVLink integration
- **Fit Impact**: Adds a specific MAVLink test and parameter gate.
### Fact #27
- **Statement**: FAISS deployment on Jetson ARM64 should assume CPU FAISS by default; GPU FAISS packages are not the safe default on aarch64.
- **Source**: Source #25
- **Phase**: Mode B
- **Confidence**: Medium
- **Related Dimension**: Descriptor retrieval runtime
- **Fit Impact**: Changes FAISS pinned mode from CPU with optional GPU to CPU-first, with custom GPU build only as future optimization.
### Fact #28
- **Statement**: Visual matching with orthophotos is a known GNSS-denied UAV approach, but available sources do not prove robustness against adversarial visual attacks on imagery/cache content.
- **Source**: Source #26
- **Phase**: Mode B
- **Confidence**: Medium
- **Related Dimension**: Security
- **Fit Impact**: Adds cache integrity, signed manifests, and consistency checks as required controls.
### Fact #29
- **Statement**: COG creation is a write-new-object workflow; the live onboard cache should append/replace tile objects through manifests, not mutate a COG in place.
- **Source**: Source #18
- **Phase**: Mode B
- **Confidence**: High
- **Related Dimension**: Cache lifecycle
- **Fit Impact**: Clarifies cache implementation.
### Fact #30
- **Statement**: OpenVINS is technically stronger than a pure hand-rolled OpenCV-only VIO stack for camera+IMU odometry, but its GPLv3 license and generic VIO lifecycle make it unsuitable as the default production dependency for this product.
- **Source**: Sources #27, #28
- **Phase**: Mode B round 2
- **Confidence**: High
- **Related Dimension**: VO / VIO selection
- **Fit Impact**: Use OpenVINS as a mandatory benchmark/reference, not as the shipped estimator dependency unless GPL obligations are explicitly accepted.
### Fact #31
- **Statement**: The selected production estimator is not "custom OpenCV-only"; OpenCV is the geometry utility layer, while the product-owned ESKF/mode machine owns covariance, source labels, GPS spoofing, blackout, tile-write eligibility, and MAVLink semantics.
- **Source**: Sources #5, #29; AC-1.4, AC-3.5, AC-4.3, AC-NEW-4, AC-NEW-7, AC-NEW-8
- **Phase**: Mode B round 2
- **Confidence**: High
- **Related Dimension**: Estimator ownership
- **Fit Impact**: Keep custom production estimator, but reject any interpretation that means building a naive OpenCV-only VIO stack.
### Fact #32
- **Statement**: Fixed-wing GPS-denied UAV research supports a hybrid of visual odometry plus satellite/orthophoto matching to reduce accumulated drift, matching the project architecture better than a standalone VIO-only solution.
- **Source**: Sources #1, #26
- **Phase**: Mode B round 2
- **Confidence**: High
- **Related Dimension**: Architecture
- **Fit Impact**: Confirms that OpenVINS alone cannot satisfy the absolute-position and re-anchor responsibilities without the satellite anchor path.
### Fact #33
- **Statement**: DINOv2-VLAD/AnyLoc-style retrieval is a strong global candidate generator for aerial VPR, but descriptor size, model size, and environment-specific VLAD/index choices must be budgeted and profiled.
- **Source**: Sources #7, #30, #32
- **Phase**: Mode B round 2
- **Confidence**: High
- **Related Dimension**: Satellite retrieval
- **Fit Impact**: Select DINOv2-VLAD for triggered retrieval, not steady-state per-frame execution.
### Fact #34
- **Statement**: Aerial VPR sources emphasize tile/chunk scale, overlap, weather/season changes, repetitive patterns, and re-ranking cost; local matching should be a verification/rerank stage over bounded top-K candidates.
- **Source**: Source #32
- **Phase**: Mode B round 2
- **Confidence**: High
- **Related Dimension**: Anchor verification
- **Fit Impact**: Supports VPR chunks with 40-50% overlap, dynamic K, and conditional ALIKED/LightGlue verification.
### Fact #35
- **Statement**: ALIKED + LightGlue has an exact local matching interface and a plausible ONNX/TensorRT deployment path, but public evidence does not prove Jetson Orin Nano p95 latency for the project image sizes.
- **Source**: Sources #6, #31
- **Phase**: Mode B round 2
- **Confidence**: Medium
- **Related Dimension**: Local matching runtime
- **Fit Impact**: Keep ALIKED/LightGlue selected with runtime gate; benchmark DISK and SIFT/ORB as fallbacks.
### Fact #36
- **Statement**: DINOv2 TensorRT conversion can reduce embedding discrimination and may not provide meaningful speedup on Jetson-class devices; descriptor-fidelity tests must precede any optimized engine acceptance.
- **Source**: Source #22
- **Phase**: Mode B round 2
- **Confidence**: Medium
- **Related Dimension**: VPR deployment
- **Fit Impact**: TensorRT is an optimization candidate only after PyTorch/ONNX retrieval-rank equivalence is proven.
### Fact #37
- **Statement**: BASALT is the best production VIO candidate among BASALT, OpenVINS, and Kimera-VIO because it combines permissive licensing with strong published EuRoC accuracy and completion evidence.
- **Source**: Sources #33, #34, #35
- **Phase**: Mode B round 3
- **Confidence**: Medium
- **Related Dimension**: VO / VIO selection
- **Fit Impact**: Select BASALT as the production VIO candidate, pending project replay/profiling.
### Fact #38
- **Statement**: OpenVINS has the clearest EKF covariance story, including full/marginal covariance helpers and NEES-style evaluation support, but remains production-constrained by GPLv3.
- **Source**: Sources #27, #28, #38
- **Phase**: Mode B round 3
- **Confidence**: High
- **Related Dimension**: Confidence / covariance
- **Fit Impact**: Keep OpenVINS as covariance/reference baseline and use it to calibrate the BASALT wrapper's reported uncertainty.
### Fact #39
- **Statement**: Kimera-VIO is production-friendly from a license standpoint, but it is heavier/stereo-oriented and has documented mono-inertial parameter/performance caveats.
- **Source**: Sources #34, #36
- **Phase**: Mode B round 3
- **Confidence**: Medium
- **Related Dimension**: VO / VIO fallback
- **Fit Impact**: Keep Kimera-VIO as a backup candidate, not the first production choice for a single fixed nadir camera.
### Fact #40
- **Statement**: None of BASALT, OpenVINS, or Kimera-VIO provides a special fixed-wing nadir mode; downward-camera support depends on accurate camera-to-IMU extrinsics, altitude/scale constraints, and validation under low-parallax planar terrain.
- **Source**: Source #37
- **Phase**: Mode B round 3
- **Confidence**: High
- **Related Dimension**: Nadir-camera support
- **Fit Impact**: The architecture must keep satellite anchors and project-level confidence gates regardless of which VIO library is selected.
### Fact #41
- **Statement**: Published EuRoC-type VIO error rates are useful for ranking libraries but are not acceptance evidence for high-altitude fixed-wing nadir imagery over agricultural terrain.
- **Source**: Sources #34, #35, #37
- **Phase**: Mode B round 3
- **Confidence**: High
- **Related Dimension**: Validation
- **Fit Impact**: Require representative replay/flight data before claiming AC-1/AC-2 accuracy.
@@ -1,55 +0,0 @@
# Comparison Framework
## Selected Framework Type
Decision support with exact-fit component selection.
## Selected Dimensions
1. Required inputs/outputs and ownership boundaries
2. Operating context and lifecycle fit
3. Non-functional envelope fit: latency, memory, storage, thermal, safety
4. Licensing and deployability
5. Evidence quality and validation burden
6. Security and safety failure modes
7. Selection status
## Initial Population
| Component Area | Candidate | Option Family | Inputs / Outputs | Fit Summary | Status |
|----------------|-----------|---------------|------------------|-------------|--------|
| Calibration / geometry | OpenCV 4.x | Established production / open-source | image/object points -> intrinsics, distortion, homography, inlier mask | Exact utility fit; permissive production use. | Selected |
| VO / VIO | BASALT + project-owned safety/anchor wrapper | Open-source production candidate | nav frames + FC IMU/calibration -> relative VIO state; wrapper adds source labels, anchor fusion, calibrated confidence, and MAVLink semantics | Best production candidate after user decision: permissive license, strong benchmark evidence, and lower implementation burden than custom VIO. | Selected |
| VO / VIO | OpenVINS | Open-source research | monocular camera + IMU -> VIO state/covariance | Best covariance/reference story, but GPL-3 and generic VIO ownership make it a benchmark/reference rather than shipped core. | Reference only |
| VO / VIO | Kimera-VIO | Open-source production candidate / fallback | mono or stereo camera + IMU -> VIO/SLAM outputs | BSD-friendly, but heavier/stereo-oriented and mono-inertial path has documented parameter caveats. | Backup candidate |
| VO / VIO | OpenCV geometry + project-owned ESKF | Custom fallback | nav frames + FC IMU/attitude/altitude + satellite anchors -> relative motion, absolute updates, covariance, source labels | Fallback if BASALT fails project data/runtime tests; still needed as safety/anchor wrapper around any VIO library. | Fallback / wrapper |
| VO / VIO | ORB-SLAM3 | Open-source research | monocular/stereo/RGB-D + optional IMU -> SLAM pose | Useful benchmark; GPLv3, map runtime, and initialization complexity make it poor production dependency. | Rejected for production |
| Global retrieval | DINOv2-VLAD / AnyLoc-style descriptors | Current SOTA / research | image/chunk -> descriptor | Strong VPR evidence; trigger-only use due runtime/memory and TensorRT fidelity risk. | Selected with runtime gate |
| Vector retrieval | FAISS | Established production / open-source | descriptor matrix + query -> top-K IDs | Exact fit for offline VPR chunk retrieval. | Selected |
| Local matching | LightGlue + DISK/ALIKED | Current SOTA / open-source | two images -> keypoint correspondences/scores | Exact local-match interface; avoids SuperPoint license issue. | Selected with runtime gate |
| Local matching | SuperPoint + LightGlue | Current SOTA / known-bad/licensing caveat | two images -> matches | Technically good; licensing requires explicit review. | Needs user decision / fallback only |
| Cache imagery | COG + manifest + sidecars | Established geospatial format | georeferenced raster + metadata -> efficient local reads | Good immutable tile unit; generated tiles can be written as new COGs. | Selected |
| Cache packaging | PMTiles | Established web-map archive | tile pyramid -> single archive | Efficient reads, but read-only; not suitable for in-flight mutable writes. | Rejected for mutable cache |
| Estimator | Custom ESKF mode machine | Custom production | VO/IMU/VPR/GPS-health -> WGS84 state + covariance + label | Needed for source labels, covariance gates, blackout/spoofing behavior. | Selected |
| MAVLink integration | MAVSDK + pymavlink | Established APIs | telemetry in, `GPS_INPUT` out | MAVSDK handles telemetry; pymavlink handles raw `GPS_INPUT`. | Selected |
| FDR | PostgreSQL event index + CBOR payload segments with optional Parquet export | Established storage | frame estimates, IMU, MAVLink, health, tiles -> bounded replayable log | Matches project PostgreSQL choice while keeping compact append payloads. | Selected pattern |
| Validation | AerialVL + EuRoC + Plane SITL + representative flight | Multi-source test strategy | datasets/sim/flight -> AC evidence | Public data is partial; final representative data is mandatory. | Selected |
## Round 2 Decision Notes
- **OpenVINS vs custom OpenCV**: OpenVINS wins if the comparison is against a naive OpenCV-only VIO implementation. The selected design is not that; it is OpenCV geometry plus a product-owned estimator/state machine, with OpenVINS used as a benchmark/reference.
- **Satellite retrieval**: DINOv2-VLAD remains the best global candidate generator found, but aerial VPR sources require chunk scale/overlap tuning, dynamic top-K, and geometric verification.
- **Anchor verification**: ALIKED/LightGlue remains the preferred learned local matcher, while SIFT/ORB stays as a regression/fallback baseline and SuperPoint remains license-gated.
## Round 3 Decision Notes
- **User decision**: BASALT is selected as the production VIO candidate.
- **Confidence/covariance**: OpenVINS remains the covariance/reference baseline because its EKF exposes clearer uncertainty semantics than BASALT/Kimera.
- **Nadir support**: no compared VIO library has a special fixed-wing nadir mode; the acceptance path is calibration, altitude/scale constraints, satellite anchors, and representative replay validation.
## Baseline Alignment
- "Position estimate" means WGS84 frame center emitted to FC plus a 95% covariance semi-major axis and source label.
- "Satellite anchored" means a visual match passed VPR retrieval, local matching, RANSAC, freshness, covariance, and Mahalanobis gates.
- "Normal flight segment" means the AC-2.1a conditions, not turns/blackout/stale imagery.
- "Selected with runtime gate" means the API capability fits, but final deployment depends on Jetson profiling against AC-4.1 and AC-4.2.
-181
View File
@@ -1,181 +0,0 @@
# Reasoning Chain
## Dimension 1: Core Architecture
### Fact Confirmation
Fixed-wing monocular VO accumulates drift because scale and terrain assumptions are imperfect (Fact #1). Aerial VPR can provide absolute anchors but has weather, scale, and repetition failure modes (Fact #2).
### Reference Comparison
Pure VO/IMU cannot satisfy the long 8-hour mission. Pure satellite matching cannot run every frame inside the latency budget and will fail in turns, stale imagery, and repetitive fields.
### Conclusion
Select a hybrid architecture: steady-state VO/IMU propagation, conditional satellite/VPR anchoring, and ESKF covariance gates.
### Confidence
High. Supported by Sources #1 and #2.
## Dimension 2: VO / VIO Dependency
### Fact Confirmation
OpenVINS and ORB-SLAM3 both support visual-inertial estimation patterns, but both are GPL-family production dependency risks (Facts #4 and #5).
### Reference Comparison
The project needs a production system with custom mode labels, covariance propagation, MAVLink-specific output behavior, and no hidden GPL obligation. A full SLAM stack also adds initialization and map-management complexity that the ACs do not require.
### Conclusion
Use a custom VO/IMU propagation and ESKF mode machine for production. Use OpenVINS/ORB-SLAM3 only as benchmarks/reference algorithms.
### Confidence
High for licensing/scope decision; medium for final estimator performance until prototype profiling.
## Dimension 3: Satellite Retrieval And Local Matching
### Fact Confirmation
Aerial VPR benefits from scale/overlap-aware chunks and retrieval + local alignment (Fact #2). LightGlue provides keypoint correspondences/scores from local descriptors (Fact #7). FAISS provides top-K retrieval over descriptors (Fact #12).
### Reference Comparison
Global DINOv2/VLAD retrieval alone gives candidate chunks but not enough precision for AC-2.2. Local matching alone over the whole map is too expensive. The two-stage retrieval+alignment structure matches the operational need.
### Conclusion
Use DINOv2-VLAD descriptors with FAISS top-K for conditional candidate retrieval, followed by DISK/ALIKED+LightGlue and OpenCV RANSAC homography for local alignment.
### Confidence
Medium-high. API fit is strong; embedded runtime must be profiled.
## Dimension 4: Latency And Memory
### Fact Confirmation
Some aerial VPR re-ranking methods are too slow for the steady-state path (Fact #3). Jetson Orin Nano Super has 8 GB shared memory and 25 W power mode (Fact #18), with thermal throttling risk (Fact #19).
### Reference Comparison
At 3 fps and <400 ms p95, the system can process every frame through lightweight VO/IMU and use heavier VPR only on triggers. Running DINOv2/LightGlue across many candidates per frame would violate the AC unless proven otherwise.
### Conclusion
Make VPR conditional, cap top-K by covariance/sector, run descriptor extraction on downsampled/orthorectified crops, precompute cache descriptors offline, and add performance regression tests.
### Confidence
High for architecture direction; exact model sizes need profiling.
## Dimension 5: Cache Format
### Fact Confirmation
COG supports tiled compressed rasters and geospatial tooling (Fact #21). PMTiles is read-efficient but read-only (Fact #20).
### Reference Comparison
The onboard system must both read service tiles and write new generated tiles in-flight. A read-only archive is a poor primary mutable store.
### Conclusion
Use COG files plus STAC-like manifests/sidecars for imagery and metadata; use FAISS sidecar indexes for descriptors. PMTiles may be an export/snapshot format, not the live mutable cache.
### Confidence
High.
## Dimension 6: Autopilot Integration
### Fact Confirmation
ArduPilot MAVLink GPS input requires `GPS1_TYPE=14` (Fact #15). `GPS_INPUT` carries the fields needed for WGS84 position and accuracy (Fact #16). MAVSDK covers telemetry but raw `GPS_INPUT` emission still needs pymavlink/raw MAVLink (Fact #14).
### Reference Comparison
MAVSDK-only output would not satisfy AC-4.3. Raw pymavlink-only telemetry is possible but gives up MAVSDK's high-level subscriptions.
### Conclusion
Use MAVSDK for telemetry subscriptions and pymavlink for `GPS_INPUT` emission. Verify all failsafe/spoofing behavior in ArduPilot Plane SITL.
### Confidence
High for interface; medium for exact Plane failsafe timing until SITL.
## Dimension 7: Validation
### Fact Confirmation
AerialVL provides aerial localization/reference-map data (Fact #22). EuRoC provides camera+IMU+ground truth but not fixed-wing nadir imagery (Fact #23). The current sample set lacks IMU/ground truth.
### Reference Comparison
No single public dataset proves every AC. Public datasets can de-risk components, while final acceptance requires representative synchronized flight/replay data.
### Conclusion
Use layered validation: EuRoC for VIO sanity, AerialVL/VPAir-style data for VPR/anchor tests, ArduPilot Plane SITL for MAVLink/failsafe/spoofing, and a final representative flight/replay rig for AC proof.
### Confidence
High.
## Dimension 8: OpenVINS vs Project-Owned Estimator
### Fact Confirmation
OpenVINS is a strong monocular+IMU EKF/MSCKF VIO reference with covariance-aware estimation (Facts #4 and #30). OpenCV supplies calibration, undistortion, homography, RANSAC/USAC, and reprojection-error primitives, but it is not a full estimator by itself (Facts #6 and #31).
### Reference Comparison
If the alternative is a naive OpenCV-only VIO stack, OpenVINS is the better technical starting point. The project's actual production choice is different: it needs a project-owned ESKF/mode machine that fuses VO/IMU, accepts/rejects satellite anchors, emits `GPS_INPUT`, labels every estimate, handles spoofing/blackout, and gates cache write-back.
### Conclusion
Keep OpenVINS as a mandatory benchmark/reference implementation, not as the default production dependency. The production estimator remains project-owned, with OpenCV as the geometry utility layer.
### Confidence
High. The technical comparison favors OpenVINS over naive custom VIO, while the product fit favors the project-owned estimator.
## Dimension 9: Satellite Retrieval And Anchor Verification
### Fact Confirmation
DINOv2-VLAD/AnyLoc-style retrieval has strong VPR evidence but descriptor/model size and TensorRT fidelity must be validated (Facts #33 and #36). Aerial VPR survey evidence emphasizes tile scale, overlap, season/weather shifts, repetitive patterns, and re-ranking cost (Fact #34). LightGlue supports ALIKED/DISK/SIFT feature matching and returns correspondences/scores suitable for RANSAC verification, but Jetson latency must be profiled (Facts #7, #8, #35).
### Reference Comparison
Classical SIFT/ORB is simpler and cheap but weaker for cross-domain UAV-to-satellite matching. SuperPoint+LightGlue is technically strong but remains license-gated. Pure global retrieval without local verification is unsafe because repetitive farmland and stale imagery can produce plausible but wrong candidates.
### Conclusion
Use DINOv2-VLAD + CPU-first FAISS as the triggered global retriever, then verify bounded top-K candidates with ALIKED/LightGlue + OpenCV RANSAC. Keep SIFT/ORB as a regression baseline and SuperPoint only after legal approval.
### Confidence
High for architecture and interfaces; medium for final runtime until Jetson profiling.
## Dimension 10: BASALT vs Kimera-VIO vs OpenVINS
### Fact Confirmation
BASALT has strong published EuRoC evidence and production-friendly licensing (Fact #37). OpenVINS has the clearest EKF covariance API and consistency tooling, but GPLv3 remains a production constraint (Fact #38). Kimera-VIO is BSD-friendly but heavier and has documented mono-inertial caveats (Fact #39). All three require calibrated camera-to-IMU extrinsics; none has a special fixed-wing nadir mode (Fact #40).
### Reference Comparison
For a single fixed downward camera, the selection criterion is not just benchmark ATE. The project needs a VIO core that can run on Jetson, tolerate calibrated nadir geometry, and be wrapped by project-specific satellite-anchor, confidence, MAVLink, and safety logic. OpenVINS is attractive for confidence/covariance but problematic as a shipped component. Kimera is acceptable as a BSD fallback, but mono-inertial risk makes it weaker as the first pick. BASALT provides the best production trade-off if its uncertainty can be calibrated and wrapped.
### Conclusion
Select BASALT as the production VIO candidate. Keep OpenVINS as a reference/covariance baseline and Kimera-VIO as a backup candidate. The project-owned safety/anchor wrapper remains mandatory around BASALT because BASALT alone does not satisfy GPS-denied source labels, satellite anchors, false-position budgets, cache-write gates, or `GPS_INPUT` behavior.
### Confidence
Medium-high. Library ranking is well supported; final acceptance still depends on representative fixed-wing nadir replay data.
-51
View File
@@ -1,51 +0,0 @@
# Validation Log
## Validation Scenario
An 8-hour fixed-wing mission enters GPS-denied/spoofed mode after takeoff. The onboard system starts from last trusted FC state, processes 3 fps nadir frames, emits `GPS_INPUT`, handles normal flight, sharp turns, short visual blackouts, stale/changed tiles, and post-flight tile write-back.
## Expected Based On Conclusions
- **Normal segment**: VO/IMU propagates every processed frame; satellite anchors refresh state conditionally before covariance grows too large.
- **Sharp turn / <5% overlap**: VO is expected to fail; relocalization uses FAISS top-K VPR chunks followed by LightGlue/RANSAC.
- **Visual blackout + spoofing**: estimator switches to `dead_reckoned`, covariance grows monotonically, spoofed GPS is ignored, `GPS_INPUT` degrades honestly.
- **Stale tile**: anchor is rejected or down-confidence weighted and cannot emit `satellite_anchored`.
- **Cache write-back**: onboard generated tile is written only when parent-pose covariance passes AC-NEW-7 gates and carries metadata for Satellite Service voting.
## Actual Validation Plan
| Validation Target | Test Method | Pass Evidence |
|-------------------|-------------|---------------|
| VO/VIO propagation | EuRoC and synthetic nadir replay; then representative flight data | Drift vs anchor-age bins; AC-1.3 pass/fail. |
| Satellite anchor | AerialVL/VPAir-style benchmark plus project sample imagery with satellite cache | AC-1.1/1.2 accuracy, AC-2.2 MRE, georeference recall. |
| Runtime | Jetson Orin Nano Super profiling under 25 W, hot-soak included | <400 ms p95, <8 GB memory, no thermal throttle. |
| VPR retrieval | Offline descriptor build and FAISS query benchmark | Top-K recall, query latency, index size within cache budget. |
| MAVLink output | ArduPilot Plane SITL with `GPS1_TYPE=14` | Valid `GPS_INPUT`, fix-type/accuracy degradation, QGC status. |
| Spoofing promotion | Plane SITL false GPS injection | Promotion <3 s and spoofed GPS rejected during blackout. |
| FDR | 8-hour synthetic load | <=64 GB, rollover logged, no silent payload loss. |
| Cache poisoning | Monte Carlo with over-confident wrong anchors | AC-NEW-7 probabilities below budget; metadata contract emitted. |
| OpenVINS reference comparison | Replay the same synchronized camera+IMU segments through OpenVINS and the project-owned estimator | OpenVINS establishes a VIO baseline; production estimator must match/beat drift where applicable while preserving source labels and GPS_INPUT behavior. |
| BASALT production VIO candidate | Replay synchronized camera+IMU segments through BASALT, OpenVINS, and Kimera-VIO | BASALT selected if drift, completion rate, latency, and wrapper-calibrated covariance meet project gates. |
| DINOv2-VLAD fidelity | Compare PyTorch, ONNX, and TensorRT descriptor distances and FAISS rankings | Optimized engines accepted only if rank/top-K behavior stays within tolerance. |
| ALIKED/LightGlue runtime | Jetson benchmark across K candidates and project image sizes | Candidate accepted for runtime only if relocalization trigger path fits AC-4.1 with bounded frame drops. |
## Counterexamples And Risks
- Large DINOv2 variants or many local-match candidates may violate the Jetson latency/memory envelope.
- Agricultural fields can be visually repetitive; VPR confidence must not be treated as sufficient without geometric verification.
- Public datasets do not fully match Ukrainian fixed-wing operational conditions; final evidence requires representative data.
- GPL VIO/SLAM libraries are not production dependencies unless licensing is explicitly accepted.
- OpenVINS may outperform the first custom estimator prototype on pure VIO drift; that would trigger estimator improvement, not automatic GPL production adoption.
- BASALT covariance/confidence is less directly exposed than OpenVINS EKF covariance; the project wrapper must calibrate uncertainty before mapping it to `GPS_INPUT.horiz_accuracy`.
## Review Checklist
- [x] Draft conclusions are consistent with fact cards.
- [x] No important dimensions missed: architecture, VO, VPR, local matching, cache, estimator, MAVLink, FDR, validation covered.
- [x] No selected component relies only on field-adjacent fit.
- [x] Mismatches are recorded as rejected/reference/needs-decision rather than hidden.
- [x] Step 7.5 Component Applicability Gate applies and is saved in `06_component_fit_matrix.md`.
## Conclusions Requiring Revision
Round 3 applies the user decision to select BASALT as the production VIO candidate. The selected implementation is BASALT VIO plus a project-owned safety/anchor wrapper; OpenVINS remains the covariance/reference baseline, Kimera-VIO remains a backup candidate, and custom OpenCV-only VIO is no longer the primary path. Runtime gates and Plane SITL gates are implementation validation gates, not API capability blockers.
@@ -1,107 +0,0 @@
# Component Fit Matrix
## Top-Level Matrix
| Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
|----------------|-----------|--------------------|---------------|---------------|-------------------------|----------------------------|--------|--------------------|
| Calibration / geometry | OpenCV 4.x | C++/Python calibration, undistortion, RANSAC homography, reprojection-error measurement | Established production / open-source | Camera intrinsics, image normalization, VO/satellite homography verification | MVE: `02_fact_cards.md` OpenCV block; Source #5 | None | Selected | Exact API fit and permissive utility role. |
| VO / IMU propagation | BASALT + project-owned safety/anchor wrapper | BASALT VIO consumes calibrated nav-camera frames + FC IMU; wrapper fuses satellite anchors, calibrates uncertainty, emits source labels and GPS_INPUT semantics | Open-source production candidate | Relative VIO state, completion/error benchmark, wrapped covariance/confidence, degraded modes, GPS_INPUT semantics | Sources #33-#35; Facts #37, #40, #41 | BASALT covariance/confidence must be calibrated in wrapper; no special nadir mode | Selected | User-selected best production trade-off: permissive licensing and stronger benchmark/completion evidence than OpenVINS/Kimera, with wrapper covering project-specific safety semantics. |
| VO / VIO reference | OpenVINS | Monocular camera + IMU EKF/MSCKF reference runs with covariance extraction | Open-source research | Benchmark and covariance reference | Sources #3, #27, #28, #35, #38; Facts #4, #30, #38 | GPL-3 production dependency risk; completion/divergence risk on some sequences; does not own satellite anchor / GPS_INPUT / blackout / cache-write state machine | Reference only | Best covariance baseline, but not selected as shipped production dependency. |
| VO / VIO backup | Kimera-VIO | Mono/stereo camera + IMU VIO/SLAM backup candidate | Open-source production candidate / fallback | Alternative VIO baseline | Sources #34, #36; Fact #39 | Heavier/stereo-oriented; mono-inertial path has documented parameter caveats | Backup candidate | BSD-friendly backup if BASALT fails project replay/runtime gates. |
| VO / SLAM alternative | ORB-SLAM3 | Monocular-inertial SLAM | Open-source research | Benchmark and failure-mode comparison | Source #4, Fact #5 | GPLv3; heavier SLAM/map lifecycle than required | Rejected | Does not fit licensing/scope for production. |
| VPR descriptors | DINOv2-VLAD / AnyLoc-style | Precomputed satellite chunk descriptors; conditional query descriptor on relocalization triggers; TensorRT path only after embedding-fidelity check | Current SOTA / research | Global top-K candidate retrieval | Sources #7, #8, #21, #22, #30, #32; Facts #10, #11, #24, #33, #34, #36 | Runtime and embedding-fidelity gates on Jetson; model-size/index-size selection required | Selected with runtime gate | Best evidence for change-robust VPR, but not per-frame and not blindly TensorRT-converted. |
| Vector retrieval | FAISS | CPU-first aarch64 index; saved/loaded index over float/PQ descriptors; GPU only if custom Jetson build is proven | Established production / open-source | Top-K candidate chunk search | MVE: `02_fact_cards.md` FAISS block; Sources #9, #25 | GPU FAISS not default on Jetson ARM64 | Selected | Exact top-K descriptor retrieval fit with CPU-first deployment. |
| Local matching | LightGlue + DISK/ALIKED | CUDA/ONNX-profiled DISK or ALIKED feature extraction + LightGlue matching on bounded top-K candidates | Current SOTA / open-source | 2D-2D correspondences for RANSAC and MRE | MVE: `02_fact_cards.md` LightGlue block; Sources #6, #23, #31 | Runtime quality gate; extractor choice must avoid SuperPoint license issue | Selected with runtime gate | Exact input/output fit with deployable licensing path; ALIKED/LightGlue is preferred for anchor verification. |
| Local matching fallback | SuperPoint + LightGlue | SuperPoint features + LightGlue | Current SOTA / license-gated | Optional benchmark/fallback | Source #6 | SuperPoint restrictive license | Needs user decision | Do not use as default production dependency without legal review. |
| Cache imagery | COG + manifest/sidecar | Tiled compressed GeoTIFF tile objects with CRS, capture date, source, m/px, freshness, descriptor sidecars; write-new-object lifecycle | Established geospatial format | Immutable service tiles and generated candidate tiles | Source #18, Facts #21, #29 | No in-place mutation; manifest manages active tile version | Selected | Fits geospatial raster access and write-new-tile workflow. |
| Cache packaging | PMTiles | Read-only tile archive | Established web-map archive | Optional export/snapshot | Source #17, Fact #20 | Cannot update in place | Rejected for live cache | In-flight tile generation needs mutable write-new objects. |
| MAVLink | MAVSDK + pymavlink | MAVSDK telemetry subscriptions; pymavlink/raw MAVLink `GPS_INPUT` emission to ArduPilot `GPS1_TYPE=14`; velocity source/ignore-flag behavior SITL-tested | Established APIs | FC telemetry, QGC status, GPS substitute output | MVE: `02_fact_cards.md` MAVSDK/pymavlink block; Sources #10-#12, #24 | Plane SITL behavior and velocity-source parameters must be validated | Selected | Exact output contract with known ArduPilot pitfall covered. |
| Validation | EuRoC + AerialVL/VPAir + Plane SITL + representative flight | Layered validation suite | Test strategy | Prove ACs before production | Sources #19, #20 | Public data not sufficient for final proof | Selected | Covers component de-risking plus final representative proof. |
## Restrictions Cross-Check — Selected Production Architecture
| Restriction | Candidate-mode behavior | Result | Evidence |
|-------------|-------------------------|--------|----------|
| Fixed-wing only | Architecture assumes forward motion, rare sharp turns, no hover dependency. | Pass | Problem context; Source #1 |
| Fixed nadir nav camera | VO/orthorectification uses fixed camera extrinsics; attitude compensation from FC. | Pass | Source #5 |
| Operational area / flat terrain | Flat terrain assumption supported; repetitive agricultural terrain handled as validation class and confidence risk. | Pass | Source #2 |
| Weather/season classes | Validation matrix includes seasonal/visibility classes. | Pass | `05_validation_log.md` |
| Two-camera split | Nav camera drives localization; AI camera object localization uses current GPS-denied state and AI gimbal/zoom. | Pass | AC-7.1/7.2 |
| Satellite Service offline boundary | Runtime uses local COG/cache + FAISS descriptors only; no in-flight provider fetch. | Pass | Sources #17, #18 |
| Freshness gates | Tile manifest carries capture date and sector; stale tiles rejected/down-weighted. | Pass | AC-8.2, AC-NEW-6 |
| Jetson Orin Nano Super | Hot path is lightweight; heavy VPR conditional; runtime profiling gate required. | Pass | Sources #15, #16 |
| MAVLink / ArduPilot only | MAVSDK telemetry + pymavlink `GPS_INPUT`, `GPS1_TYPE=14`. | Pass | Sources #10-#12 |
| No raw frame storage | FDR keeps estimates, telemetry, tiles, and low-rate failure thumbnails only. | Pass | AC-8.5, AC-NEW-3 |
## AC Cross-Check — Selected Production Architecture
| AC | Candidate-mode behavior | Result | Evidence |
|----|-------------------------|--------|----------|
| AC-1.1 | Satellite anchors and ESKF output target <=50 m for >=80% normal frames. | Pass | Facts #1, #2 |
| AC-1.2 | Same pipeline targets <=20 m for >=50% normal frames; validated statistically. | Pass | `05_validation_log.md` |
| AC-1.3 | ESKF tracks anchor age and covariance; VO-only and IMU-fused drift measured between anchors. | Pass | Fact #1 |
| AC-1.4 | ESKF emits covariance; telemetry/FDR carries source label. | Pass | Fact #16 |
| AC-2.1a | VO hot path handles normal overlapping frames; failures trigger mode change. | Pass | Facts #1, #6 |
| AC-2.1b | Satellite anchor success measured separately through VPR + local match + RANSAC. | Pass | Facts #2, #7, #12 |
| AC-2.2 | OpenCV/LightGlue provide correspondences and homography MRE measurement. | Pass | Facts #6, #7 |
| AC-3.1 | ESKF innovation gates reject outliers; covariance grows instead of trusting jumps. | Pass | Reasoning chain |
| AC-3.2 | Sharp turn VO failure triggers VPR relocalization. | Pass | AC design; Source #2 |
| AC-3.3 | Disconnected segment handled by global retrieval and pose-graph/ESKF re-anchor. | Pass | Source #2 |
| AC-3.4 | Loss counter and timer trigger GCS relocalization request while dead reckoning continues. | Pass | AC design |
| AC-3.5 | Image-quality/blackout state switches to IMU-only and rejects spoofed GPS. | Pass | AC design; Facts #16, #17 |
| AC-4.1 | Heavy VPR is conditional; steady-state path is VO/IMU. Jetson profiling is a runtime quality gate. | Pass | Facts #3, #18 |
| AC-4.2 | Descriptor/index memory is budgeted; FAISS and cache are precomputed/pruned. | Pass | Facts #12, #13 |
| AC-4.3 | `GPS_INPUT` emitted by pymavlink; ODOMETRY remains disabled in v1. | Pass | Facts #14-#16 |
| AC-4.4 | Estimator emits frame-by-frame; no batching required. | Pass | Architecture |
| AC-4.5 | Corrections are emitted as updated estimates with timestamps and covariance. | Pass | Architecture |
| AC-5.1 | Startup initializes from last trusted FC state plus IMU propagation. | Pass | Architecture |
| AC-5.2 | >3 s no-estimate path enters degraded/failsafe behavior; Plane SITL proves FC response. | Pass | Fact #17 |
| AC-5.3 | Cold restart uses FC state and preloaded cache/index. | Pass | AC-NEW-1 |
| AC-6.1 | QGC receives downsampled status; FDR keeps high-rate details. | Pass | MAVSDK telemetry + FDR design |
| AC-6.2 | Command ingress reserved through MAVLink status/named values/custom dialect. | Pass | MAVLink design |
| AC-6.3 | `GPS_INPUT` lat/lon is WGS84. | Pass | Fact #16 |
| AC-7.1 | Object localization exposes level-flight accuracy and maneuver bound. | Pass | Geometry design |
| AC-7.2 | Flat-terrain trig projection from UAV GPS + gimbal/zoom/altitude. | Pass | Geometry design |
| AC-8.1 | Cache contract requires 0.3-0.5 m/px imagery. | Pass | Restrictions |
| AC-8.2 | Freshness metadata gates anchors. | Pass | Restrictions |
| AC-8.3 | Precomputed descriptors and FAISS index are part of cache budget. | Pass | Facts #10, #12, #13 |
| AC-8.4 | Generated tiles are new COGs with quality metadata for Service ingest. | Pass | Fact #21 |
| AC-8.5 | Raw frames are not retained. | Pass | FDR design |
| AC-8.6 | VPR chunks use overlap/multi-scale descriptors and dynamic K. | Pass | Facts #2, #12 |
| AC-NEW-1 | Engines/indexes built before flight; first fix benchmark validates <30 s. | Pass | Runtime plan |
| AC-NEW-2 | Plane SITL verifies spoofing trigger and promotion <3 s. | Pass | Fact #17 |
| AC-NEW-3 | FDR segment rollover validates <=64 GB. | Pass | Validation plan |
| AC-NEW-4 | Mahalanobis gates and calibrated covariance target false-position budget. | Pass | ESKF design |
| AC-NEW-5 | Thermal profiling at 25 W validates no throttle. | Pass | Facts #18, #19 |
| AC-NEW-6 | Tile age rejection/down-weighting built into anchor gate. | Pass | AC design |
| AC-NEW-7 | Tile writes require parent-pose covariance and sidecar metadata; Satellite Service voting is external contract. | Pass | AC design |
| AC-NEW-8 | Dead-reckoned blackout mode degrades `GPS_INPUT` fields at covariance thresholds. | Pass | Facts #16, #17 |
## Decision Rules Applied
- No GPL VIO/SLAM library is selected as a production dependency.
- Runtime gates are classified as runtime-quality validation gates, not API capability blockers.
- Every selected component has an exact input/output role and a validation path.
- Any candidate with license uncertainty is marked `Needs user decision` or non-production.
## Mode B Revisions Applied
- FAISS pinned mode changed to CPU-first on Jetson ARM64.
- DINOv2 TensorRT path requires descriptor-fidelity validation against PyTorch/ONNX.
- `GPS_INPUT` tests now include velocity-source and ignore-flag behavior.
- COG cache lifecycle clarified as write-new-object plus manifest versioning, not in-place mutation.
- Visual/satellite security controls now include signed manifests, cache provenance, stale-tile rejection, and multi-signal consistency checks.
## Mode B Round 2 Revisions Applied
- OpenVINS is explicitly better than naive OpenCV-only VIO, but remains reference-only because the shipped estimator must own source labels, covariance gates, spoofing/blackout states, cache-write eligibility, and MAVLink semantics.
- The selected VO wording is now "OpenCV geometry + project-owned ESKF" to avoid implying a fragile OpenCV-only odometry implementation.
- DINOv2-VLAD + CPU-first FAISS + ALIKED/LightGlue remains selected for satellite retrieval and anchor verification, with retrieval limited to relocalization triggers and bounded top-K verification.
- SIFT/ORB remains a regression/fallback baseline; SuperPoint remains non-production until legal approval.
## Mode B Round 3 Revisions Applied
- BASALT is selected as the production VIO candidate.
- OpenVINS remains the covariance/reference baseline, not a shipped dependency by default.
- Kimera-VIO remains a backup VIO candidate because its license is production-friendly but mono-inertial caveats make it weaker for the single-nadir-camera path.
- The project-owned safety/anchor wrapper remains mandatory around BASALT for satellite anchor acceptance, source labels, blackout/spoofing modes, cache-write eligibility, calibrated confidence, and MAVLink `GPS_INPUT`.
+582 -90
View File
@@ -1,130 +1,622 @@
# Solution
# Solution Draft
## Assessment Findings
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
| ------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| ESKF described as "16-state vector, ~10MB" with no mathematical specification | **Functional**: No state vector, no process model (F,Q), no measurement models (H for VO, H for satellite), no noise parameters, no scale observability analysis. Impossible to implement or validate accuracy claims. | **Define complete ESKF specification**: 15-state error vector, IMU-driven prediction, dual measurement models (VO relative pose, satellite absolute position), initial Q/R values, scale constraint via altitude + satellite corrections. |
| GPS_INPUT at 5-10Hz via pymavlink — no field mapping | **Functional**: GPS_INPUT requires 15+ fields (velocity, accuracy, hdop, fix_type, GPS time). No specification of how ESKF state maps to these fields. ArduPilot requires minimum 5Hz. | **Define GPS_INPUT population spec**: velocity from ESKF, accuracy from covariance, fix_type from confidence tier, GPS time from system clock conversion, synthesized hdop/vdop. |
| Confidence scoring "unchanged from draft03" — not in draft05 | **Functional**: Draft05 is supposed to be self-contained. Confidence scoring determines GPS_INPUT accuracy fields and fix_type — directly affects how ArduPilot EKF weights the position data. | **Define confidence scoring inline**: 3 tiers (satellite-anchored, VO-tracked, IMU-only) mapping to fix_type + accuracy values. |
| Coordinate transformations not defined | **Functional**: No pixel→camera→body→NED→WGS84 chain. Camera is not autostabilized, so body attitude matters. Satellite match → WGS84 conversion undefined. Object localization impossible without these transforms. | **Define coordinate transformation chain**: camera intrinsics K, camera-to-body extrinsic T_cam_body, body-to-NED from ESKF attitude, NED origin at mission start point. |
| Disconnected route segments — "satellite re-localization" mentioned but no algorithm | **Functional**: AC requires handling as "core to the system." Multiple disconnected segments expected. No tracking-loss detection, no re-localization trigger, no ESKF re-initialization, no cuVSLAM restart procedure. | **Define re-localization pipeline**: detect cuVSLAM tracking loss → IMU-only ESKF prediction → trigger satellite match on every frame → on match success: ESKF position reset + cuVSLAM restart → on 3 consecutive failures: operator re-localization request. |
| No startup handoff from GPS to GPS-denied | **Functional**: System reads GLOBAL_POSITION_INT at startup but no protocol for when GPS is lost/spoofed vs system start. No validation of initial position. | **Define handoff protocol**: system runs continuously, FC receives both real GPS and GPS_INPUT. GPS-denied system always provides its estimate; FC selects best source. Initial position validated against first satellite match. |
| No mid-flight reboot recovery | **Functional**: AC requires: "re-initialize from flight controller's current IMU-extrapolated position." No procedure defined. Recovery time estimation missing. | **Define reboot recovery sequence**: read FC position → init ESKF with high uncertainty → load TRT engines → start cuVSLAM → immediate satellite match. Estimated recovery: ~35-70s. Document as known limitation. |
| 3-consecutive-failure re-localization request undefined | **Functional**: AC requires ground station re-localization request. No message format, no operator workflow, no system behavior while waiting. | **Define re-localization protocol**: detect 3 failures → send custom MAVLink message with last known position + uncertainty → operator provides approximate coordinates → system uses as ESKF measurement with high covariance. |
| Object localization — "trigonometric calculation" with no details | **Functional**: No math, no API, no Viewpro gimbal integration, no accuracy propagation. Other onboard systems cannot use this component as specified. | **Define object localization**: pixel→ray using Viewpro intrinsics + gimbal angles → body frame → NED → ray-ground intersection → WGS84. FastAPI endpoint: POST /objects/locate. Accuracy propagated from UAV position + gimbal uncertainty. |
| Satellite matching — GSD normalization and tile selection unspecified | **Functional**: Camera GSD ~15.9 cm/px at 600m vs satellite ~0.3 m/px at zoom 19. The "pre-resize" step is mentioned but not specified. Tile selection radius based on ESKF uncertainty not defined. | **Define GSD handling**: downsample camera frame to match satellite GSD. Define tile selection: ESKF position ± 3σ_horizontal → select tiles covering that area. Assemble tile mosaic for matching. |
| Satellite tile storage requirements not calculated | **Functional**: "±2km" preload mentioned but no storage estimate. At zoom 19: a 200km path with ±2km buffer requires ~~130K tiles (~~2.5GB). | **Calculate tile storage**: specify zoom level (18 preferred — 0.6m/px, 4× fewer tiles), estimate storage per mission profile, define maximum mission area by storage limit. |
| FastAPI endpoints not in solution draft | **Functional**: Endpoints only in security_analysis.md. No request/response schemas. No SSE event format. No object localization endpoint. | **Consolidate API spec in solution**: define all endpoints, SSE event schema, object localization endpoint. Reference security_analysis.md for auth. |
| cuVSLAM configuration missing (calibration, IMU params, mode) | **Functional**: No camera calibration procedure, no IMU noise parameters, no T_imu_rig extrinsic, no mode selection (Mono vs Inertial). | **Define cuVSLAM configuration**: use Inertial mode, specify required calibration data (camera intrinsics, distortion, IMU noise params from datasheet, T_imu_rig from physical measurement), define calibration procedure. |
| tech_stack.md inconsistent with draft05 | **Functional**: tech_stack.md says 3fps (should be 0.7fps), LiteSAM at 480px (should be 1280px), missing EfficientLoFTR. | **Flag for update**: tech_stack.md must be synchronized with draft05 corrections. Not addressed in this draft — separate task. |
## Overall Maturity Assessment
| Category | Maturity (1-5) | Assessment |
| ----------------------------- | -------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| Hardware & Platform Selection | 3.5 | UAV airframe, cameras, Jetson, batteries — well-researched with specs, weight budget, endurance calculations. Ready for procurement. |
| Core Algorithm Selection | 3.0 | cuVSLAM, LiteSAM/XFeat, ESKF — components selected with comparison tables, fallback chains, decision trees. Day-one benchmarks defined. |
| AI Inference Runtime | 3.5 | TRT Engine migration thoroughly analyzed. Conversion workflows, memory savings, performance estimates. Code wrapper provided. |
| Sensor Fusion (ESKF) | 1.5 | Mentioned but not specified. No implementable detail. Blockerfor coding. |
| System Integration | 1.5 | GPS_INPUT, coordinate transforms, inter-component data flow — all under-specified. |
| Edge Cases & Resilience | 1.0 | Disconnected segments, reboot recovery, re-localization — acknowledged but no algorithms. |
| Operational Readiness | 0.5 | No pre-flight procedures, no in-flight monitoring, no failure response. |
| Security | 3.0 | Comprehensive threat model, OP-TEE analysis, LUKS, secure boot. Well-researched. |
| **Overall TRL** | **~2.5** | **Technology concept formulated + some component validation. Not implementation-ready.** |
The solution is at approximately **TRL 3** (proof of concept) for hardware/algorithm selection and **TRL 1-2** (basic concept) for system integration, ESKF, and operational procedures.
## Product Solution Description
Build an onboard GPS-denied localization service that runs on the Jetson companion computer, uses the fixed downward navigation camera and flight-controller inertial telemetry, and emits ArduPilot `GPS_INPUT` estimates with calibrated covariance and source labels.
A real-time GPS-denied visual navigation system for fixed-wing UAVs, running on a Jetson Orin Nano Super (8GB). All AI model inference uses native TensorRT Engine files. The system replaces the GPS module by sending MAVLink GPS_INPUT messages via pymavlink over UART at 5-10Hz.
The production architecture is a trigger-based hybrid estimator:
Position is determined by fusing: (1) CUDA-accelerated visual odometry (cuVSLAM in Inertial mode) from ADTI 20L V1 at 0.7 fps sustained, (2) absolute position corrections from satellite image matching (LiteSAM or XFeat — TRT Engine FP16) using keyframes from the same ADTI image stream, and (3) IMU data from the flight controller via ESKF. Viewpro A40 Pro is reserved for AI object detection only.
The ESKF is the central state estimator with 15-state error vector. It fuses:
- **IMU prediction** at 5-10Hz (high-frequency pose propagation)
- **cuVSLAM VO measurement** at 0.7Hz (relative pose correction)
- **Satellite matching measurement** at ~0.07-0.14Hz (absolute position correction)
GPS_INPUT messages carry position, velocity, and accuracy derived from the ESKF state and covariance.
**Hard constraint**: ADTI 20L V1 shoots at 0.7 fps sustained (1430ms interval). Full VO+ESKF pipeline within 400ms per frame. Satellite matching async on keyframes (every 5-10 camera frames). GPS_INPUT at 5-10Hz (ESKF IMU prediction fills gaps between camera frames).
```text
Nav camera + FC telemetry
|
v
Image quality + calibration + orthorectification
|
+--> Hot path: OpenCV geometry + BASALT VIO --> safety/anchor wrapper --> GPS_INPUT + QGC + FDR
|
+--> Reference path: OpenVINS replay benchmark for VIO drift/covariance tests; Kimera backup replay
|
+--> Trigger path: DINOv2-VLAD query --> CPU FAISS top-K --> ALIKED/DISK+LightGlue --> OpenCV RANSAC --> safety/anchor wrapper
|
+--> Tile path: new COG tile + quality/provenance sidecar --> manifest update --> post-flight Satellite Service sync
```
Heavy local retrieval and local matching are not steady-state per-frame dependencies. They run on cold start, VO failure, sharp turns, disconnected segments, covariance growth, stale-anchor age, or operator-assisted relocalization, using only preloaded cache/index data during flight.
┌─────────────────────────────────────────────────────────────────────┐
│ OFFLINE (Before Flight) │
│ 1. Satellite Tiles → Download & Validate → Pre-resize → Store │
│ (Google Maps) (≥0.5m/px, <2yr) (matcher res) (GeoHash)│
│ 2. TRT Engine Build (one-time per model version): │
│ PyTorch model → reparameterize → ONNX export → trtexec --fp16 │
│ Output: litesam.engine, xfeat.engine │
│ 3. Camera + IMU calibration (one-time per hardware unit) │
│ 4. Copy tiles + engines + calibration to Jetson storage │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ ONLINE (During Flight) │
│ │
│ STARTUP: │
│ 1. pymavlink → read GLOBAL_POSITION_INT → init ESKF state │
│ 2. Load TRT engines + allocate GPU buffers │
│ 3. Load camera calibration + IMU calibration │
│ 4. Start cuVSLAM (Inertial mode) with ADTI 20L V1 │
│ 5. Preload satellite tiles ±2km into RAM │
│ 6. First satellite match → validate initial position │
│ 7. Begin GPS_INPUT output loop at 5-10Hz │
│ │
│ EVERY CAMERA FRAME (0.7fps from ADTI 20L V1): │
│ ┌──────────────────────────────────────┐ │
│ │ ADTI 20L V1 → Downsample (CUDA) │ │
│ │ → cuVSLAM VO+IMU (~9ms) │ ← CUDA Stream A │
│ │ → ESKF VO measurement │ │
│ └──────────────────────────────────────┘ │
│ │
│ 5-10Hz CONTINUOUS (IMU-driven between camera frames): │
│ ┌──────────────────────────────────────┐ │
│ │ IMU data → ESKF prediction │ │
│ │ ESKF state → GPS_INPUT fields │ │
│ │ GPS_INPUT → Flight Controller (UART) │ │
│ └──────────────────────────────────────┘ │
│ │
│ KEYFRAMES (every 5-10 camera frames, async): │
│ ┌──────────────────────────────────────┐ │
│ │ Camera frame → GSD downsample │ │
│ │ Select satellite tile (ESKF pos±3σ) │ │
│ │ TRT inference (Stream B): LiteSAM/ │ │
│ │ XFeat → correspondences │ │
│ │ RANSAC → homography → WGS84 position │ │
│ │ ESKF satellite measurement update │──→ Position correction │
│ └──────────────────────────────────────┘ │
│ │
│ TRACKING LOSS (cuVSLAM fails — sharp turn / featureless): │
│ ┌──────────────────────────────────────┐ │
│ │ ESKF → IMU-only prediction (growing │ │
│ │ uncertainty) │ │
│ │ Satellite match on EVERY frame │ │
│ │ On match success → ESKF reset + │ │
│ │ cuVSLAM restart │ │
│ │ 3 consecutive failures → operator │ │
│ │ re-localization request │ │
│ └──────────────────────────────────────┘ │
│ │
│ TELEMETRY (1Hz): │
│ ┌──────────────────────────────────────┐ │
│ │ NAMED_VALUE_FLOAT: confidence, drift │──→ Ground Station │
│ └──────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
```
## Architecture
### Camera Ingest, Calibration, And Geometry
### Component: ESKF Sensor Fusion (NEW — previously unspecified)
| Solution | Tools | Pinned Mode/Config | Fit |
|----------|-------|--------------------|-----|
| OpenCV geometry utility layer | OpenCV 4.x | Calibration, undistortion, homography, RANSAC/USAC, MRE measurement | Selected. Mature, permissive, exact utility fit; not a full estimator. |
**Error-State Kalman Filter** fusing IMU, visual odometry, and satellite matching.
### VO / IMU Propagation And Estimator
**Nominal state vector** (propagated by IMU):
| Solution | Tools | Pinned Mode/Config | Fit |
|----------|-------|--------------------|-----|
| BASALT + safety/anchor wrapper | BASALT, OpenCV, custom wrapper | BASALT consumes calibrated nav-camera frames + FC IMU; wrapper fuses satellite anchors, calibrates uncertainty, emits source labels and `GPS_INPUT` fields | Selected. Best production VIO candidate found: permissive license, strong benchmark evidence, avoids custom VIO from scratch. |
| OpenVINS | OpenVINS | Monocular camera + IMU EKF/MSCKF reference runs with covariance extraction | Reference only. Strong VIO and covariance baseline, but GPLv3 and generic VIO ownership make it unsuitable as default shipped dependency. |
| Kimera-VIO | Kimera-VIO | Mono/stereo camera + IMU VIO/SLAM backup replay | Backup candidate. BSD-friendly but heavier/stereo-oriented; mono-inertial path has documented caveats. |
| ORB-SLAM3 | ORB-SLAM3 | Monocular-inertial SLAM | Rejected for production. GPLv3 and heavier SLAM/map lifecycle. |
BASALT does not replace the project-owned safety logic. The wrapper remains responsible for satellite anchor acceptance, confidence calibration, source labels, blackout/spoofing modes, tile-write eligibility, and MAVLink `GPS_INPUT` semantics.
| State | Symbol | Size | Description |
| ---------- | ------ | ---- | ------------------------------------------------ |
| Position | p | 3 | NED position relative to mission origin (meters) |
| Velocity | v | 3 | NED velocity (m/s) |
| Attitude | q | 4 | Unit quaternion (body-to-NED rotation) |
| Accel bias | b_a | 3 | Accelerometer bias (m/s²) |
| Gyro bias | b_g | 3 | Gyroscope bias (rad/s) |
### Satellite Service And Anchor Verification
| Solution | Tools | Pinned Mode/Config | Fit |
|----------|-------|--------------------|-----|
| DINOv2-VLAD + CPU FAISS + ALIKED/DISK+LightGlue | DINOv2/AnyLoc-style descriptors, FAISS CPU, LightGlue, OpenCV RANSAC | Offline VPR chunk descriptors; conditional query descriptor; CPU FAISS top-K; learned local match on bounded candidates; TensorRT only after fidelity check | Selected with runtime/fidelity gates. |
| SuperPoint+LightGlue | SuperPoint, LightGlue | Same matcher with SuperPoint features | License-gated benchmark/fallback only. |
| Classical SIFT/ORB | OpenCV | Handcrafted features + homography | Regression/fallback baseline. |
**Error-state vector** (estimated by ESKF): δx = [δp, δv, δθ, δb_a, δb_g]ᵀ ∈ ℝ¹⁵
where δθ ∈ so(3) is the 3D rotation error.
The Satellite Service component imports mission cache/index packages before flight, uploads generated-tile packages after landing, and serves local VPR queries during flight. The VPR index is built over ground-footprint-sized chunks with overlap and a multi-scale descriptor set. VPR is invoked only on relocalization triggers or covariance/anchor-age growth; normal flight uses BASALT VIO plus wrapper propagation. No satellite-provider or Satellite Service network calls are allowed mid-flight.
**Prediction step** (IMU at 5-10Hz from flight controller):
### Tile Manager
- Input: accelerometer a_m, gyroscope ω_m, dt
- Propagate nominal state: p += v·dt, v += (R(q)·(a_m - b_a) - g)·dt, q ⊗= Exp(ω_m - b_g)·dt
- Propagate error covariance: P = F·P·Fᵀ + Q
- F is the 15×15 error-state transition matrix (standard ESKF formulation)
- Q: process noise diagonal, initial values from IMU datasheet noise densities
| Solution | Tools | Pinned Mode/Config | Fit |
|----------|-------|--------------------|-----|
| COG tile objects + PostgreSQL/PostGIS manifest + signed JSON sidecars | GDAL COG, PostgreSQL/PostGIS, signed JSON sidecars, FAISS index files | Service tiles and generated tiles are write-new COG objects; active version selected by PostGIS-backed manifest | Selected. Fits geospatial raster access, provenance, spatial/freshness queries, and write-new tile lifecycle. |
| PMTiles | PMTiles | Read-only archive snapshot | Rejected for live cache because in-flight tile generation needs mutable write-new objects. |
**VO measurement update** (0.7Hz from cuVSLAM):
Service-source tiles and generated tiles carry CRS, capture date, source, m/px, freshness, quality score, sidecar hashes, and descriptor references. The Tile Manager also orthorectifies eligible nadir frames into generated COG tiles. Stale tiles are rejected or down-confidence weighted.
- cuVSLAM outputs relative pose: ΔR, Δt (camera frame)
- Transform to NED: Δp_ned = R_body_ned · T_cam_body · Δt
- Innovation: z = Δp_ned_measured - Δp_ned_predicted
- Observation matrix H_vo maps error state to relative position change
- R_vo: measurement noise, initial ~0.1-0.5m (from cuVSLAM precision at 600m+ altitude)
- Kalman update: K = P·Hᵀ·(H·P·Hᵀ + R)⁻¹, δx = K·z, P = (I - K·H)·P
### MAVLink Integration
**Satellite measurement update** (0.07-0.14Hz, async):
| Solution | Tools | Pinned Mode/Config | Fit |
|----------|-------|--------------------|-----|
| MAVSDK telemetry + pymavlink `GPS_INPUT` | MAVSDK, pymavlink | MAVSDK subscriptions; pymavlink emits `GPS_INPUT`; v1 emits GPS_INPUT only; Plane SITL validates `GPS1_TYPE=14`, velocity source params, ignore flags, fix types, accuracy fields | Selected. Exact output control with good telemetry ergonomics. |
- Satellite matching outputs absolute position: lat_sat, lon_sat in WGS84
- Convert to NED relative to mission origin
- Innovation: z = p_satellite - p_predicted
- H_sat = [I₃, 0, 0, 0, 0] (directly observes position)
- R_sat: measurement noise, from matching confidence (~5-20m based on RANSAC inlier ratio)
- Provides absolute position correction — bounds drift accumulation
The system emits per-frame estimates locally and downsampled status to QGroundControl. `GPS_INPUT.horiz_accuracy` must not under-report the calibrated 95% covariance semi-major axis.
**Scale observability**:
### Security And Safety Controls
- Monocular cuVSLAM has scale ambiguity during constant-velocity flight
- Scale is constrained by: (1) satellite matching absolute positions (primary), (2) known flight altitude from barometer + predefined mission altitude, (3) IMU accelerometer during maneuvers
- During long straight segments without satellite correction, scale drift is possible. Satellite corrections every ~7-14s re-anchor scale.
| Solution | Tools | Pinned Mode/Config | Fit |
|----------|-------|--------------------|-----|
| Consistency-gated anchor acceptance | Safety/anchor wrapper, cache manifest verification | Anchor accepted only if freshness, provenance, RANSAC, covariance, Mahalanobis, and temporal consistency pass | Selected. Prevents confident false fixes. |
| FDR audit trail | PostgreSQL event index + CBOR payload segments + hashes | Logs estimates, inputs, emitted GPS_INPUT, health, tile writes, anchor decisions | Selected. Supports incident analysis, indexed queries, and cache-poisoning audits. |
**Tuning approach**: Start with IMU datasheet noise values for Q. Start with conservative R values (high measurement noise). Tune on flight test data by comparing ESKF output to known GPS ground truth.
## Runtime Modes
| Mode | Trigger | Behavior | `GPS_INPUT` / Telemetry |
|------|---------|----------|--------------------------|
| `satellite_anchored` | VPR + local match passes all gates | Wrapper absolute update; tile write eligible only if sigma gate passes | 3D fix, `horiz_accuracy` >= 95% covariance semi-major axis |
| `vo_extrapolated` | BASALT VIO healthy and anchor age/covariance within bounds | BASALT VIO + wrapper propagation; covariance grows | 3D/2D depending covariance threshold |
| `dead_reckoned` | visual blackout or no accepted anchor | IMU-only propagation, monotonic covariance growth | degraded fix type; QGC `VISUAL_BLACKOUT_IMU_ONLY` |
| failsafe/no-fix | covariance >500 m or blackout >30 s | stop pretending position is valid | `fix_type=0`, `horiz_accuracy=999.0`, QGC `VISUAL_BLACKOUT_FAILSAFE` |
| Solution | Tools | Advantages | Limitations | Performance | Fit |
| -------------------------- | --------------- | ------------------------------------------------------------- | -------------------------------------- | ------------- | ----------- |
| Custom ESKF (Python/NumPy) | NumPy, SciPy | Full control, minimal dependencies, well-understood algorithm | Implementation effort, tuning required | <1ms per step | ✅ Selected |
| FilterPy ESKF | FilterPy v1.4.5 | Reference implementation, less code | Less flexible for multi-rate fusion | <1ms per step | ⚠️ Fallback |
### Component: Coordinate System & Transformations (NEW — previously undefined)
**Reference frames**:
- **Camera frame (C)**: origin at camera optical center, Z forward, X right, Y down (OpenCV convention)
- **Body frame (B)**: origin at UAV CG, X forward (nose), Y right (starboard), Z down
- **NED frame (N)**: North-East-Down, origin at mission start point
- **WGS84**: latitude, longitude, altitude (output format)
**Transformation chain**:
1. **Pixel → Camera ray**: p_cam = K⁻¹ · [u, v, 1]ᵀ where K = camera intrinsic matrix (ADTI 20L V1: fx, fy from 16mm lens + APS-C sensor)
2. **Camera → Body**: p_body = T_cam_body · p_cam where T_cam_body is the fixed mounting rotation (camera points nadir: 90° pitch rotation from body X-forward to camera Z-down)
3. **Body → NED**: p_ned = R_body_ned(q) · p_body where q is the ESKF quaternion attitude estimate
4. **NED → WGS84**: lat = lat_origin + p_north / R_earth, lon = lon_origin + p_east / (R_earth · cos(lat_origin)) where (lat_origin, lon_origin) is the mission start GPS position
**Camera intrinsic matrix K** (ADTI 20L V1 + 16mm lens):
- Sensor: 23.2 × 15.4 mm, Resolution: 5456 × 3632
- fx = fy = focal_mm × width_px / sensor_width_mm = 16 × 5456 / 23.2 = 3763 pixels
- cx = 2728, cy = 1816 (sensor center)
- Distortion: Brown model (k1, k2, p1, p2 from calibration)
**T_cam_body** (camera mount):
- Navigation camera is fixed, pointing nadir (downward), not autostabilized
- R_cam_body = R_x(180°) · R_z(0°) (camera Z-axis aligned with body -Z, camera X with body X)
- Translation: offset from CG to camera mount (measured during assembly, typically <0.3m)
**Satellite match → WGS84**:
- Feature correspondences between camera frame and geo-referenced satellite tile
- Homography H maps camera pixels to satellite tile pixels
- Satellite tile pixel → WGS84 via tile's known georeference (zoom level + tile x,y → lat,lon)
- Camera center projects to satellite pixel (cx_sat, cy_sat) via H
- Convert (cx_sat, cy_sat) to WGS84 using tile georeference
### Component: GPS_INPUT Message Population (NEW — previously undefined)
| GPS_INPUT Field | Source | Computation |
| ----------------------- | ---------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
| lat, lon | ESKF position (NED) | NED → WGS84 conversion using mission origin |
| alt | ESKF position (Down) + mission origin altitude | alt = alt_origin - p_down |
| vn, ve, vd | ESKF velocity state | Direct from ESKF v[0], v[1], v[2] |
| fix_type | Confidence tier | 3 (3D fix) when satellite-anchored (last match <30s). 2 (2D) when VO-only. 0 (no fix) when IMU-only >5s |
| hdop | ESKF horizontal covariance | hdop = sqrt(P[0,0] + P[1,1]) / 5.0 (approximate CEP→HDOP mapping) |
| vdop | ESKF vertical covariance | vdop = sqrt(P[2,2]) / 5.0 |
| horiz_accuracy | ESKF horizontal covariance | horiz_accuracy = sqrt(P[0,0] + P[1,1]) meters |
| vert_accuracy | ESKF vertical covariance | vert_accuracy = sqrt(P[2,2]) meters |
| speed_accuracy | ESKF velocity covariance | speed_accuracy = sqrt(P[3,3] + P[4,4]) m/s |
| time_week, time_week_ms | System time | Convert Unix time to GPS epoch (GPS epoch = 1980-01-06, subtract leap seconds) |
| satellites_visible | Constant | 10 (synthetic — prevents satellite-count failsafes in ArduPilot) |
| gps_id | Constant | 0 |
| ignore_flags | Constant | 0 (provide all fields) |
**Confidence tiers** mapping to GPS_INPUT:
| Tier | Condition | fix_type | horiz_accuracy | Rationale |
| ------ | ------------------------------------------------- | ---------- | ------------------------------- | -------------------------------------- |
| HIGH | Satellite match <30s ago, ESKF covariance < 400m² | 3 (3D fix) | From ESKF P (typically 5-20m) | Absolute position anchor recent |
| MEDIUM | cuVSLAM tracking OK, no recent satellite match | 3 (3D fix) | From ESKF P (typically 20-50m) | Relative tracking valid, drift growing |
| LOW | cuVSLAM lost, IMU-only | 2 (2D fix) | From ESKF P (50-200m+, growing) | Only IMU dead reckoning, rapid drift |
| FAILED | 3+ consecutive total failures | 0 (no fix) | 999.0 | System cannot determine position |
### Component: Disconnected Route Segment Handling (NEW — previously undefined)
**Trigger**: cuVSLAM reports tracking_lost OR tracking confidence drops below threshold
**Algorithm**:
```
STATE: TRACKING_NORMAL
cuVSLAM provides relative pose
ESKF VO measurement updates at 0.7Hz
Satellite matching on keyframes (every 5-10 frames)
STATE: TRACKING_LOST (enter when cuVSLAM reports loss)
1. ESKF continues with IMU-only prediction (no VO updates)
→ uncertainty grows rapidly (~1-5 m/s drift with consumer IMU)
2. Switch satellite matching to EVERY frame (not just keyframes)
→ maximize chances of getting absolute correction
3. For each camera frame:
a. Attempt satellite match using ESKF predicted position ± 3σ for tile selection
b. If match succeeds (RANSAC inlier ratio > 30%):
→ ESKF measurement update with satellite position
→ Restart cuVSLAM with current frame as new origin
→ Transition to TRACKING_NORMAL
→ Reset failure counter
c. If match fails:
→ Increment failure_counter
→ Continue IMU-only ESKF prediction
4. If failure_counter >= 3:
→ Send re-localization request to ground station
→ GPS_INPUT fix_type = 0 (no fix), horiz_accuracy = 999.0
→ Continue attempting satellite matching on each frame
5. If operator sends re-localization hint (approximate lat,lon):
→ Use as ESKF measurement with high covariance (~500m)
→ Attempt satellite match in that area
→ On success: transition to TRACKING_NORMAL
STATE: SEGMENT_DISCONNECT
After re-localization following tracking loss:
→ New cuVSLAM track is independent of previous track
→ ESKF maintains global NED position continuity via satellite anchor
→ No need to "connect" segments at the cuVSLAM level
→ ESKF already handles this: satellite corrections keep global position consistent
```
### Component: Satellite Image Matching Pipeline (UPDATED — added GSD + tile selection details)
**GSD normalization**:
- Camera GSD at 600m: ~15.9 cm/pixel (ADTI 20L V1 + 16mm)
- Satellite tile GSD at zoom 18: ~0.6 m/pixel
- Scale ratio: ~3.8:1
- Downsample camera image to satellite GSD before matching: resize from 5456×3632 to ~1440×960 (matching zoom 18 GSD)
- This is close to LiteSAM's 1280px input — use 1280px with minor GSD mismatch acceptable for matching
**Tile selection**:
- Input: ESKF position estimate (lat, lon) + horizontal covariance σ_h
- Search radius: max(3·σ_h, 500m) — at least 500m to handle initial uncertainty
- Compute geohash for center position → load tiles covering the search area
- Assemble tile mosaic if needed (typically 2×2 to 4×4 tiles for adequate coverage)
- If ESKF uncertainty > 2km: tile selection unreliable, fall back to wider search or request operator input
**Tile storage calculation** (zoom 18 — 0.6 m/pixel):
- Each 256×256 tile covers ~153m × 153m
- Flight path 200km with ±2km buffer: area ≈ 200km × 4km = 800 km²
- Tiles needed: 800,000,000 / (153 × 153) ≈ 34,200 tiles
- Storage: ~10-15KB per JPEG tile → ~340-510 MB
- With zoom 19 overlap tiles for higher precision: ×4 = ~1.4-2.0 GB
- Recommended: zoom 18 primary + zoom 19 for ±500m along flight path → ~500-800 MB total
| Solution | Tools | Advantages | Limitations | Performance (est. Orin Nano Super TRT FP16) | Params | Fit |
| -------------------------------------- | ------------------------- | ------------------------------------------------------------------------ | ------------------------------------------------------ | ------------------------------------------- | ------ | ------------------------------- |
| LiteSAM (opt) TRT Engine FP16 @ 1280px | trtexec + tensorrt Python | Best satellite-aerial accuracy (RMSE@30=17.86m UAV-VisLoc), 6.31M params | MinGRU TRT export needs verification (LOW-MEDIUM risk) | Est. ~165-330ms | 6.31M | ✅ Primary |
| EfficientLoFTR TRT Engine FP16 | trtexec + tensorrt Python | Proven TRT path (Coarse_LoFTR_TRT). Semi-dense. CVPR 2024. | 2.4x more params than LiteSAM. | Est. ~200-400ms | 15.05M | ✅ Fallback if LiteSAM TRT fails |
| XFeat TRT Engine FP16 | trtexec + tensorrt Python | Fastest. Proven TRT implementation. | General-purpose, not designed for cross-view gap. | Est. ~50-100ms | <5M | ✅ Speed fallback |
### Component: cuVSLAM Configuration (NEW — previously undefined)
**Mode**: Inertial (mono camera + IMU)
**Camera configuration** (ADTI 20L V1 + 16mm lens):
- Model: Brown distortion
- fx = fy = 3763 px (16mm on 23.2mm sensor at 5456px width)
- cx = 2728 px, cy = 1816 px
- Distortion coefficients: from calibration (k1, k2, p1, p2)
- Border: 50px (ignore lens edge distortion)
**IMU configuration** (Pixhawk 6x IMU — ICM-42688-P):
- Gyroscope noise density: 3.0 × 10⁻³ °/s/√Hz
- Gyroscope random walk: 5.0 × 10⁻⁵ °/s²/√Hz
- Accelerometer noise density: 70 µg/√Hz
- Accelerometer random walk: ~2.0 × 10⁻³ m/s³/√Hz
- IMU frequency: 200 Hz (from flight controller via MAVLink)
- T_imu_rig: measured transformation from Pixhawk IMU to camera center (translation + rotation)
**cuVSLAM settings**:
- OdometryMode: INERTIAL
- MulticameraMode: PRECISION (favor accuracy over speed — we have 1430ms budget)
- Input resolution: downsample to 1280×852 (or 720p) for processing speed
- async_bundle_adjustment: True
**Initialization**:
- cuVSLAM initializes automatically when it receives the first camera frame + IMU data
- First few frames used for feature initialization and scale estimation
- First satellite match validates and corrects the initial position
**Calibration procedure** (one-time per hardware unit):
1. Camera intrinsics: checkerboard calibration with OpenCV (or use manufacturer data if available)
2. Camera-IMU extrinsic (T_imu_rig): Kalibr tool with checkerboard + IMU data
3. IMU noise parameters: Allan variance analysis or use datasheet values
4. Store calibration files on Jetson storage
### Component: AI Model Inference Runtime (UNCHANGED)
Native TRT Engine — optimal performance and memory on fixed NVIDIA hardware. See draft05 for full comparison table and conversion workflow.
### Component: Visual Odometry (UNCHANGED)
cuVSLAM in Inertial mode, fed by ADTI 20L V1 at 0.7 fps sustained. See draft05 for feasibility analysis at 0.7fps.
### Component: Flight Controller Integration (UPDATED — added GPS_INPUT field spec)
pymavlink over UART at 5-10Hz. GPS_INPUT field population defined above.
ArduPilot configuration:
- GPS1_TYPE = 14 (MAVLink)
- GPS_RATE = 5 (minimum, matching our 5-10Hz output)
- EK3_SRC1_POSXY = 1 (GPS), EK3_SRC1_VELXY = 1 (GPS) — EKF uses GPS_INPUT as position/velocity source
### Component: Object Localization (NEW — previously undefined)
**Input**: pixel coordinates (u, v) in Viewpro A40 Pro image, current gimbal angles (pan_deg, tilt_deg), zoom factor, UAV position from GPS-denied system, UAV altitude
**Process**:
1. Pixel → camera ray: ray_cam = K_viewpro⁻¹(zoom) · [u, v, 1]ᵀ
2. Camera → gimbal frame: ray_gimbal = R_gimbal(pan, tilt) · ray_cam
3. Gimbal → body: ray_body = T_gimbal_body · ray_gimbal
4. Body → NED: ray_ned = R_body_ned(q) · ray_body
5. Ray-ground intersection: assuming flat terrain at UAV altitude h: t = -h / ray_ned[2], p_ground_ned = p_uav_ned + t · ray_ned
6. NED → WGS84: convert to lat, lon
**Output**: { lat, lon, accuracy_m, confidence }
- accuracy_m propagated from: UAV position accuracy (from ESKF) + gimbal angle uncertainty + altitude uncertainty
**API endpoint**: POST /objects/locate
- Request: { pixel_x, pixel_y, gimbal_pan_deg, gimbal_tilt_deg, zoom_factor }
- Response: { lat, lon, alt, accuracy_m, confidence, uav_position: {lat, lon, alt}, timestamp }
### Component: Startup, Handoff & Failsafe (UPDATED — added handoff + reboot + re-localization)
**GPS-denied handoff protocol**:
- GPS-denied system runs continuously from companion computer boot
- Reads initial position from FC (GLOBAL_POSITION_INT) — this may be real GPS or last known
- First satellite match validates the initial position
- FC receives both real GPS (if available) and GPS_INPUT; FC EKF selects best source based on accuracy
- No explicit "switch" — the GPS-denied system is a secondary GPS source
**Startup sequence** (expanded from draft05):
1. Boot Jetson → start GPS-Denied service (systemd)
2. Connect to flight controller via pymavlink on UART
3. Wait for heartbeat
4. Initialize PyCUDA context
5. Load TRT engines: litesam.engine + xfeat.engine (~1-3s each)
6. Allocate GPU I/O buffers
7. Create CUDA streams: Stream A (cuVSLAM), Stream B (satellite matching)
8. Load camera calibration + IMU calibration files
9. Read GLOBAL_POSITION_INT → set mission origin (NED reference point) → init ESKF
10. Start cuVSLAM (Inertial mode) with ADTI 20L V1 camera stream
11. Preload satellite tiles within ±2km into RAM
12. Trigger first satellite match → validate initial position
13. Begin GPS_INPUT output loop at 5-10Hz
14. System ready
**Mid-flight reboot recovery**:
1. Jetson boots (~30-60s)
2. GPS-Denied service starts, connects to FC
3. Read GLOBAL_POSITION_INT (FC's current IMU-extrapolated position)
4. Init ESKF with this position + HIGH uncertainty covariance (σ = 200m)
5. Load TRT engines (~2-6s total)
6. Start cuVSLAM (fresh, no prior map)
7. Immediate satellite matching on first camera frame
8. On satellite match success: ESKF corrected, uncertainty drops
9. Estimated total recovery: ~35-70s
10. During recovery: FC uses IMU-only dead reckoning (at 70 km/h: ~700-1400m uncontrolled drift)
11. **Known limitation**: recovery time is dominated by Jetson boot time
**3-consecutive-failure re-localization**:
- Trigger: VO lost + satellite match failed × 3 consecutive camera frames
- Action: send re-localization request via MAVLink STATUSTEXT or custom message
- Message content: "RELOC_REQ: last_lat={lat} last_lon={lon} uncertainty={σ}m"
- Operator response: MAVLink COMMAND_LONG with approximate lat/lon
- System: use operator position as ESKF measurement with R = diag(500², 500², 100²) meters²
- System continues satellite matching with updated search area
- While waiting: GPS_INPUT fix_type=0, IMU-only ESKF prediction continues
### Component: Ground Station Telemetry (UPDATED — added re-localization)
MAVLink messages to ground station:
| Message | Rate | Content |
| ----------------------------- | -------- | --------------------------------------------------- |
| NAMED_VALUE_FLOAT "gps_conf" | 1Hz | Confidence score (0.0-1.0) |
| NAMED_VALUE_FLOAT "gps_drift" | 1Hz | Estimated drift from last satellite anchor (meters) |
| NAMED_VALUE_FLOAT "gps_hacc" | 1Hz | Horizontal accuracy (meters, from ESKF) |
| STATUSTEXT | On event | "RELOC_REQ: ..." for re-localization request |
| STATUSTEXT | On event | Tracking loss / recovery notifications |
### Component: Thermal Management (UNCHANGED)
Same adaptive pipeline from draft05. Active cooling required at 25W. Throttling at 80°C SoC junction.
### Component: API & Inter-System Communication (NEW — consolidated)
FastAPI (Uvicorn) running locally on Jetson for inter-process communication with other onboard systems.
| Endpoint | Method | Purpose | Auth |
| --------------------- | --------- | -------------------------------------- | ---- |
| /sessions | POST | Start GPS-denied session | JWT |
| /sessions/{id}/stream | GET (SSE) | Real-time position + confidence stream | JWT |
| /sessions/{id}/anchor | POST | Operator re-localization hint | JWT |
| /sessions/{id} | DELETE | End session | JWT |
| /objects/locate | POST | Object GPS from pixel coordinates | JWT |
| /health | GET | System health + memory + thermal | None |
**SSE event schema** (1Hz):
```json
{
"type": "position",
"timestamp": "2026-03-17T12:00:00.000Z",
"lat": 48.123456,
"lon": 37.654321,
"alt": 600.0,
"accuracy_h": 15.2,
"accuracy_v": 8.1,
"confidence": "HIGH",
"drift_from_anchor": 12.5,
"vo_status": "tracking",
"last_satellite_match_age_s": 8.3
}
```
## UAV Platform
Unchanged from draft05. See draft05 for: airframe configuration (3.5m S-2 composite, 12.5kg AUW), flight performance (3.4h endurance at 50 km/h), camera specifications (ADTI 20L V1 + 16mm, Viewpro A40 Pro), ground coverage calculations.
## Speed Optimization Techniques
Unchanged from draft05. Key points: cuVSLAM ~9ms/frame, native TRT Engine (no ONNX RT), dual CUDA streams, 5-10Hz GPS_INPUT from ESKF IMU prediction.
## Processing Time Budget
Unchanged from draft05. VO frame: ~17-22ms. Satellite matching: ≤210ms async. Well within 1430ms frame interval.
## Memory Budget (Jetson Orin Nano Super, 8GB shared)
| Component | Memory | Notes |
| ------------------------- | -------------- | ------------------------------------------- |
| OS + runtime | ~1.5GB | JetPack 6.2 + Python |
| cuVSLAM | ~200-500MB | CUDA library + map |
| LiteSAM TRT engine | ~50-80MB | If LiteSAM fails: EfficientLoFTR ~100-150MB |
| XFeat TRT engine | ~30-50MB | |
| Preloaded satellite tiles | ~200MB | ±2km of flight plan |
| pymavlink + MAVLink | ~20MB | |
| FastAPI (local IPC) | ~50MB | |
| ESKF + buffers | ~10MB | |
| **Total** | **~2.1-2.9GB** | **26-36% of 8GB** |
## Key Risks and Mitigations
| Risk | Likelihood | Impact | Mitigation |
| ------------------------------------------------------- | ---------- | ----------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| LiteSAM MinGRU ops unsupported in TRT 10.3 | LOW-MEDIUM | LiteSAM TRT export fails | Day-one verification. Fallback: EfficientLoFTR TRT → XFeat TRT. |
| cuVSLAM fails on low-texture terrain at 0.7fps | HIGH | Frequent tracking loss | Satellite matching corrections bound drift. Re-localization pipeline handles tracking loss. IMU bridges short gaps. |
| Google Maps satellite quality in conflict zone | HIGH | Satellite matching fails, outdated imagery | Pre-flight tile validation. Consider alternative providers (Bing, Mapbox). Robust to seasonal appearance changes via feature-based matching. |
| ESKF scale drift during long constant-velocity segments | MEDIUM | Position error exceeds 100m between satellite anchors | Satellite corrections every 7-14s re-anchor. Altitude constraint from barometer. Monitor drift rate — if >50m between corrections, increase satellite matching frequency. |
| Monocular scale ambiguity | MEDIUM | Metric scale lost during constant-velocity flight | Satellite absolute corrections provide scale. Known altitude constrains vertical scale. IMU acceleration during turns provides observability. |
| AUW exceeds AT4125 recommended range | MEDIUM | Reduced endurance, motor thermal stress | 12.5 kg vs 8-10 kg recommended. Monitor motor temps. Weight optimization. |
| ADTI mechanical shutter lifespan | MEDIUM | Replacement needed periodically | ~8,800 actuations/flight at 0.7fps. Estimated 11-57 flights before replacement. Budget as consumable. |
| Mid-flight companion computer failure | LOW | ~35-70s position gap | Reboot recovery procedure defined. FC uses IMU dead reckoning during gap. Known limitation. |
| Thermal throttling on Jetson | MEDIUM | Satellite matching latency increases | Active cooling required. Monitor SoC temp. Throttling at 80°C. Our workload ~8-15W typical — well under 25W TDP. |
| Engine incompatibility after JetPack update | MEDIUM | Must rebuild engines | Include engine rebuild in update procedure. |
| TRT engine build OOM on 8GB | LOW | Cannot build on target | Models small (6.31M, <5M). Reduce --memPoolSize if needed. |
## Testing Strategy
### Integration / Functional Tests
- BASALT replay: assert AC-2.1a and AC-2.2 VO MRE on overlapping frame pairs, completion rate, latency, and wrapper-calibrated covariance.
- OpenVINS reference replay: compare VIO drift, failure cases, and covariance against BASALT + wrapper.
- Kimera-VIO backup replay: keep a second permissive candidate benchmark in case BASALT fails project replay/runtime gates.
- Satellite anchor replay: assert AC-1.1/1.2, AC-2.2 cross-domain MRE, freshness rejection, and source labels.
- DINOv2 descriptor fidelity: compare PyTorch/ONNX/TensorRT embeddings and retrieval rankings before accepting optimized engines.
- FAISS CPU index tests: top-K recall, query latency, index size, save/load behavior on Jetson ARM64.
- LightGlue extractor matrix: ALIKED vs DISK vs SIFT/ORB vs SuperPoint benchmark; SuperPoint excluded from production unless legal approves.
- Tile Manager: orthorectify eligible nadir frames into write-new generated tiles, update manifest, verify active version and rollback.
- `GPS_INPUT` SITL: validate fix type, `horiz_accuracy`, velocity fields, ignore flags, `EK3_SRC1_*` parameters, QGC behavior.
- Security gates: stale tile, mismatched tile hash, low inlier ratio, impossible velocity jump, and spoofed GPS during blackout.
- **ESKF correctness**: Feed recorded IMU + synthetic VO/satellite data → verify output matches reference ESKF implementation
- **GPS_INPUT field validation**: Send GPS_INPUT to SITL ArduPilot → verify EKF accepts and uses the data correctly
- **Coordinate transform chain**: Known GPS → NED → pixel → back to GPS — verify round-trip error <0.1m
- **Disconnected segment handling**: Simulate tracking loss → verify satellite re-localization triggers → verify cuVSLAM restarts → verify ESKF position continuity
- **3-consecutive-failure**: Simulate VO + satellite failures → verify re-localization request sent → verify operator hint accepted
- **Object localization**: Known object at known GPS → verify computed GPS matches within camera accuracy
- **Mid-flight reboot**: Kill GPS-denied process → restart → verify recovery within expected time → verify position accuracy after recovery
- **TRT engine load test**: Verify engines load successfully on Jetson
- **TRT inference correctness**: Compare TRT output vs PyTorch reference (max L1 error < 0.01)
- **CUDA Stream pipelining**: Verify Stream B satellite matching does not block Stream A VO
- **ADTI sustained capture rate**: Verify 0.7fps sustained >30 min without buffer overflow
- **Confidence tier transitions**: Verify fix_type and accuracy change correctly across HIGH → MEDIUM → LOW → FAILED transitions
### Non-Functional Tests
- Jetson latency and memory: <400 ms p95, <8 GB shared memory, no 25 W thermal throttle.
- Cache budget: 400 km² imagery + manifests + descriptors fits budget or reports explicit split budget.
- FDR 8-hour load: <=64 GB, rollover logged, no silent payload loss.
- Monte Carlo false-position and cache-poisoning tests for AC-NEW-4 and AC-NEW-7.
- Cold boot: first valid `GPS_INPUT` <30 s p95 across 50 runs.
- **End-to-end accuracy** (primary validation): Fly with real GPS recording → run GPS-denied system in parallel → compare estimated vs real positions → verify 80% within 50m, 60% within 20m
- **VO drift rate**: Measure cuVSLAM drift over 1km straight segment without satellite correction
- **Satellite matching accuracy**: Compare satellite-matched position vs real GPS at known locations
- **Processing time**: Verify end-to-end per-frame <400ms
- **Memory usage**: Monitor over 30-min session → verify <8GB, no leaks
- **Thermal**: Sustained 30-min run → verify no throttling
- **GPS_INPUT rate**: Verify consistent 5-10Hz delivery to FC
- **Tile storage**: Validate calculated storage matches actual for test mission area
- **MinGRU TRT compatibility** (day-one blocker): Clone LiteSAM → ONNX export → polygraphy → trtexec
- **Flight endurance**: Ground-test full system power draw against 267W estimate
## References
Detailed source registry: `_docs/00_research/01_source_registry.md`.
Key sources:
- BASALT repository: https://github.com/VladyslavUsenko/basalt
- HybVIO benchmark paper: https://arxiv.org/pdf/2106.11857
- OpenVINS docs: https://docs.openvins.com/index.html
- OpenVINS ATE/RTE comparison: https://github.com/rpng/open_vins/issues/402
- Kimera mono-inertial caveat: https://github.com/MIT-SPARK/Kimera-VIO/issues/254
- AnyLoc paper: https://arxiv.org/html/2308.00688
- Aerial VPR survey: https://arxiv.org/abs/2406.00885
- ALIKED-LightGlue-ONNX: https://github.com/ikeboo/ALIKED-LightGlue-ONNX
- DINOv2 TensorRT issue: https://github.com/NVIDIA/TensorRT/issues/4348
- ArduPilot GPS_RATE parameter: [https://github.com/ArduPilot/ardupilot/pull/15980](https://github.com/ArduPilot/ardupilot/pull/15980)
- MAVLink GPS_INPUT message: [https://ardupilot.org/mavproxy/docs/modules/GPSInput.html](https://ardupilot.org/mavproxy/docs/modules/GPSInput.html)
- pymavlink GPS_INPUT example: [https://webperso.ensta.fr/lebars/Share/GPS_INPUT_pymavlink.py](https://webperso.ensta.fr/lebars/Share/GPS_INPUT_pymavlink.py)
- ESKF reference (fixed-wing UAV): [https://github.com/ludvigls/ESKF](https://github.com/ludvigls/ESKF)
- ROS ESKF multi-sensor: [https://github.com/EliaTarasov/ESKF](https://github.com/EliaTarasov/ESKF)
- Range-VIO scale observability: [https://arxiv.org/abs/2103.15215](https://arxiv.org/abs/2103.15215)
- NaviLoc trajectory-level localization: [https://www.mdpi.com/2504-446X/10/2/97](https://www.mdpi.com/2504-446X/10/2/97)
- SatLoc-Fusion hierarchical framework: [https://www.scilit.com/publications/e5cafaf875a49297a62b298a89d5572f](https://www.scilit.com/publications/e5cafaf875a49297a62b298a89d5572f)
- Auterion GPS-denied workflow: [https://docs.auterion.com/vehicle-operation/auterion-mission-control/useful-resources/operations/gps-denied-workflow](https://docs.auterion.com/vehicle-operation/auterion-mission-control/useful-resources/operations/gps-denied-workflow)
- PX4 GNSS-denied flight: [https://docs.px4.io/main/en/advanced_config/gnss_degraded_or_denied_flight.html](https://docs.px4.io/main/en/advanced_config/gnss_degraded_or_denied_flight.html)
- ArduPilot GPS_INPUT advanced usage: [https://discuss.ardupilot.org/t/advanced-usage-of-gps-type-mav-14/99406](https://discuss.ardupilot.org/t/advanced-usage-of-gps-type-mav-14/99406)
- Google Maps Ukraine imagery: [https://newsukraine.rbc.ua/news/google-maps-has-surprise-for-satellite-imagery-1727182380.html](https://newsukraine.rbc.ua/news/google-maps-has-surprise-for-satellite-imagery-1727182380.html)
- Jetson Orin Nano Super thermal: [https://edgeaistack.app/blog/jetson-orin-nano-power-consumption/](https://edgeaistack.app/blog/jetson-orin-nano-power-consumption/)
- GSD matching research: [https://www.kjrs.org/journal/view.html?pn=related&uid=756&vmd=Full](https://www.kjrs.org/journal/view.html?pn=related&uid=756&vmd=Full)
- VO+satellite matching pipeline: [https://polen.itu.edu.tr/items/1fe1e872-7cea-44d8-a8de-339e4587bee6](https://polen.itu.edu.tr/items/1fe1e872-7cea-44d8-a8de-339e4587bee6)
- PyCuVSLAM docs: [https://wiki.seeedstudio.com/pycuvslam_recomputer_robotics/](https://wiki.seeedstudio.com/pycuvslam_recomputer_robotics/)
- Pixhawk 6x IMU (ICM-42688-P) datasheet: [https://invensense.tdk.com/products/motion-tracking/6-axis/icm-42688-p/](https://invensense.tdk.com/products/motion-tracking/6-axis/icm-42688-p/)
- All references from solution_draft05.md
## Related Artifacts
- Tech stack evaluation: `_docs/01_solution/tech_stack.md`
- Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
- Fact cards: `_docs/00_research/02_fact_cards.md`
- AC Assessment: `_docs/00_research/gps_denied_nav/00_ac_assessment.md`
- Completeness assessment research: `_docs/00_research/solution_completeness_assessment/`
- Previous research: `_docs/00_research/trt_engine_migration/`
- Tech stack evaluation: `_docs/01_solution/tech_stack.md` (needs sync with draft05 corrections)
- Security analysis: `_docs/01_solution/security_analysis.md`
- Previous draft: `_docs/01_solution/solution_draft05.md`
+241 -107
View File
@@ -2,148 +2,282 @@
## Product Solution Description
Build an onboard GPS-denied localization service that runs on the companion Jetson, consumes the fixed nadir navigation camera plus flight-controller IMU/attitude/altitude, and emits ArduPilot-compatible `GPS_INPUT` estimates with honest covariance.
A real-time GPS-denied visual navigation system for fixed-wing UAVs, running entirely on a Jetson Orin Nano Super (8GB). The system determines frame-center GPS coordinates by fusing three information sources: (1) CUDA-accelerated visual odometry (cuVSLAM), (2) absolute position corrections from satellite image matching, and (3) IMU-based motion prediction. Results stream to clients via REST API + SSE in real time.
The solution is a hybrid estimator:
**Hard constraint**: Camera shoots at ~3fps (333-400ms interval). The full pipeline must complete within **400ms per frame**.
```text
Nav camera + FC telemetry
|
v
Image quality + calibration + orthorectification
|
+--> Steady state: VO/IMU propagation --> ESKF --> GPS_INPUT + QGC + FDR
|
+--> Re-anchor triggers: VPR top-K --> local match/RANSAC --> ESKF anchor update
|
+--> Tile generation: ortho tile + quality sidecar --> local cache --> post-flight Satellite Service upload
**Satellite matching strategy**: Benchmark LiteSAM on actual Orin Nano Super hardware as a day-one priority. If LiteSAM cannot achieve ≤400ms at 480px resolution, **abandon it entirely** and use XFeat semi-dense matching as the primary satellite matcher. Speed is non-negotiable.
**Core architectural principles**:
1. **cuVSLAM handles VO** — NVIDIA's CUDA-accelerated library achieves 90fps on Jetson Orin Nano, giving VO essentially "for free" (~11ms/frame).
2. **Keyframe-based satellite matching** — satellite matcher runs on keyframes only (every 3-10 frames), amortizing its cost. Non-keyframes rely on cuVSLAM VO + IMU.
3. **Every keyframe independently attempts satellite-based geo-localization** — this handles disconnected segments natively.
4. **Pipeline parallelism** — satellite matching for frame N overlaps with VO processing of frame N+1 via CUDA streams.
```
┌─────────────────────────────────────────────────────────────────┐
│ OFFLINE (Before Flight) │
│ Satellite Tiles → Download & Crop → Store as tile pairs │
│ (Google Maps) (per flight plan) (disk, GeoHash indexed) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ ONLINE (During Flight) │
│ │
│ EVERY FRAME (400ms budget): │
│ ┌────────────────────────────────┐ │
│ │ Camera → Downsample (CUDA 2ms)│ │
│ │ → cuVSLAM VO+IMU (~11ms)│──→ ESKF Update → SSE Emit │
│ └────────────────────────────────┘ ↑ │
│ │ │
│ KEYFRAMES ONLY (every 3-10 frames): │ │
│ ┌────────────────────────────────────┐ │ │
│ │ Satellite match (async CUDA stream)│─────┘ │
│ │ LiteSAM or XFeat (see benchmark) │ │
│ │ (does NOT block VO output) │ │
│ └────────────────────────────────────┘ │
│ │
│ IMU: 100+Hz continuous → ESKF prediction │
└─────────────────────────────────────────────────────────────────┘
```
The steady-state path is intentionally lightweight. Heavy global retrieval and cross-domain local matching run only on cold start, VO failure, sharp turns, disconnected segments, covariance growth, or stale-anchor age.
## Speed Optimization Techniques
## Existing / Competitor Solutions Analysis
### 1. cuVSLAM for Visual Odometry (~11ms/frame)
NVIDIA's CUDA-accelerated VO library (v15.0.0, March 2026) achieves 90fps on Jetson Orin Nano. Supports monocular camera + IMU natively. Features: automatic IMU fallback when visual tracking fails, loop closure, Python and C++ APIs. This eliminates custom VO entirely.
Recent fixed-wing GPS-denied research supports the same high-level mechanism: monocular visual odometry alone accumulates scale/drift error, while satellite-image comparison can periodically correct it. Aerial VPR surveys also show why the implementation cannot be naive: weather, season, scale mismatch, repetitive fields, tile overlap, and re-ranking runtime all matter.
### 2. Keyframe-Based Satellite Matching
Not every frame needs satellite matching. Strategy:
- cuVSLAM provides VO at every frame (high-rate, low-latency)
- Satellite matching triggers on **keyframes** selected by:
- Fixed interval: every 3-10 frames (~1-3.3s between satellite corrections)
- Confidence drop: when ESKF covariance exceeds threshold
- VO failure: when cuVSLAM reports tracking loss (sharp turn)
The selected design borrows the proven structure but rejects an all-in-one SLAM dependency as the product core. OpenVINS and ORB-SLAM3 are useful benchmark/reference implementations, but their GPL-family licensing and broader SLAM lifecycle do not fit a default production dependency for this project.
### 3. Satellite Matcher Selection (Benchmark-Driven)
**Candidate A: LiteSAM (opt)** — Best accuracy for satellite-aerial matching (RMSE@30 = 17.86m on UAV-VisLoc). 6.31M params, MobileOne + TAIFormer + MinGRU. Benchmarked at 497ms on Jetson AGX Orin at 1184px. AGX Orin is 3-4x more powerful than Orin Nano Super (275 TOPS vs 67 TOPS, $2000+ vs $249).
Realistic Orin Nano Super estimates:
- At 1184px: ~1.5-2.0s (unusable)
- At 640px: ~500-800ms (borderline)
- At 480px: ~300-500ms (best case)
**Candidate B: XFeat semi-dense** — ~50-100ms on Orin Nano Super. Proven on Jetson. Not specifically designed for cross-view satellite-aerial, but fast and reliable.
**Decision rule**: Benchmark LiteSAM TensorRT FP16 at 480px on Orin Nano Super. If ≤400ms → use LiteSAM. If >400ms → **abandon LiteSAM, use XFeat as primary**. No hybrid compromises — pick one and optimize it.
### 4. TensorRT FP16 Optimization
LiteSAM's MobileOne backbone is reparameterizable — multi-branch training structure collapses to a single feed-forward path at inference. Combined with TensorRT FP16, this maximizes throughput. INT8 is possible for MobileOne backbone but ViT/transformer components may degrade with INT8.
### 5. CUDA Stream Pipelining
Overlap operations across consecutive frames:
- Stream A: cuVSLAM VO for current frame (~11ms) + ESKF fusion (~1ms)
- Stream B: Satellite matching for previous keyframe (async)
- CPU: SSE emission, tile management, keyframe selection logic
### 6. Pre-cropped Satellite Tiles
Offline: for each satellite tile, store both the raw image and a pre-resized version matching the satellite matcher's input resolution. Runtime avoids resize cost.
## Existing/Competitor Solutions Analysis
| Solution | Approach | Accuracy | Hardware | Limitations |
|----------|----------|----------|----------|-------------|
| Mateos-Ramirez et al. (2024) | VO (ORB) + satellite keypoint correction + Kalman | 142m mean / 17km (0.83%) | Orange Pi class | No re-localization; ORB only; 1000m+ altitude |
| SatLoc (2025) | DinoV2 + XFeat + optical flow + adaptive fusion | <15m, >90% coverage | Edge (unspecified) | Paper not fully accessible |
| LiteSAM (2025) | MobileOne + TAIFormer + MinGRU subpixel refinement | RMSE@30 = 17.86m on UAV-VisLoc | RTX 3090 (62ms), AGX Orin (497ms) | Not tested on Orin Nano; AGX Orin is 3-4x more powerful |
| TerboucheHacene/visual_localization | SuperPoint/SuperGlue/GIM + VO + satellite | Not quantified | Desktop-class | Not edge-optimized |
| cuVSLAM (NVIDIA, 2025-2026) | CUDA-accelerated VO+SLAM, mono/stereo/IMU | <1% trajectory error (KITTI), <5cm (EuRoC) | Jetson Orin Nano (90fps) | VO only, no satellite matching |
**Key insight**: Combine cuVSLAM (best-in-class VO for Jetson) with the fastest viable satellite-aerial matcher via ESKF fusion. LiteSAM is the accuracy leader but unproven on Orin Nano Super — benchmark first, abandon for XFeat if too slow.
## Architecture
### Component: Camera Ingest, Calibration, And Geometry
### Component: Visual Odometry
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| OpenCV geometry utility layer | OpenCV 4.x | Camera calibration, undistortion, RANSAC homography, reprojection-error measurement | Mature, local, fast, exact API fit | Not a full estimator | Calibration target, fixed intrinsics/extrinsics, lens/FOV selection | Local-only, no network | Low | MVE: `_docs/00_research/02_fact_cards.md`; Source #5 | Selected |
| Solution | Tools | Advantages | Limitations | Fit |
|----------|-------|-----------|-------------|-----|
| cuVSLAM (mono+IMU) | PyCuVSLAM / C++ API | 90fps on Orin Nano, NVIDIA-optimized, loop closure, IMU fallback | Closed-source CUDA library | ✅ Best |
| XFeat frame-to-frame | XFeatTensorRT | 5x faster than SuperPoint, open-source | ~30-50ms total, no IMU integration | ⚠️ Fallback |
| ORB-SLAM3 | OpenCV + custom | Well-understood, open-source | CPU-heavy, ~30fps on Orin | ⚠️ Slower |
**Exact-fit evidence**:
- Project constraints checked: fixed nadir camera, calibration, homography, MRE gates.
- Disqualifiers: none.
- Restrictions × AC matrix: `_docs/00_research/06_component_fit_matrix.md`.
**Selected**: **cuVSLAM (mono+IMU mode)** — purpose-built by NVIDIA for Jetson. ~11ms/frame leaves 389ms for everything else. Auto-fallback to IMU when visual tracking fails.
### Component: VO / IMU Propagation And Estimator
### Component: Satellite Image Matching
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| Custom VO/IMU ESKF | OpenCV + custom estimator | Frame-to-frame homography/features + FC IMU/attitude/altitude fused in a custom ESKF with source modes | Owns covariance, source labels, degraded modes, ArduPilot output semantics | More implementation work than adopting VIO library | Synchronized frames/IMU, calibration, replay tests | No third-party cloud; deterministic local logs | Medium | Facts #1, #4, #5, #16 | Selected |
| OpenVINS reference | OpenVINS | Monocular camera + IMU EKF/MSCKF reference runs | Strong VIO reference and evaluation tools | GPL-3 production dependency risk | Dataset/replay adapter | Local only | Low for benchmark | Source #3 | Reference only |
| ORB-SLAM3 alternative | ORB-SLAM3 | Monocular-inertial SLAM | Mature SLAM benchmark | GPLv3, heavier map lifecycle, initialization complexity | Calibration, vocabulary, runtime tuning | Local only | Medium | Source #4 | Rejected for production |
| Solution | Tools | Advantages | Limitations | Fit |
|----------|-------|-----------|-------------|-----|
| LiteSAM (opt) | TensorRT | Best satellite-aerial accuracy (RMSE@30 17.86m), 6.31M params, subpixel refinement | 497ms on AGX Orin at 1184px; AGX Orin is 3-4x more powerful than Orin Nano Super | ✅ If benchmark passes |
| XFeat semi-dense | XFeatTensorRT | ~50-100ms, lightweight, Jetson-proven | Not designed for cross-view satellite-aerial | ✅ If LiteSAM fails benchmark |
| EfficientLoFTR | TensorRT | Good accuracy, semi-dense | 15.05M params (2.4x LiteSAM), slower | ⚠️ Heavier |
| SuperPoint + LightGlue | TensorRT C++ | Good general matching | Sparse only, worse on satellite-aerial | ⚠️ Not specialized |
**Exact-fit evidence**:
- Project constraints checked: one camera + IMU, frame-by-frame output, covariance labels, blackout/spoofing modes.
- Runtime quality gate: validate drift and covariance on EuRoC-style and representative fixed-wing replay.
**Selection**: Benchmark-driven. Day-one test on Orin Nano Super:
1. Export LiteSAM (opt) to TensorRT FP16
2. Measure at 480px, 640px, 800px
3. If ≤400ms at 480px → **LiteSAM**
4. If >400ms at any viable resolution → **XFeat semi-dense** (primary, no hybrid)
### Component: Satellite Retrieval And Local Anchor
### Component: Sensor Fusion
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| DINOv2-VLAD + FAISS + LightGlue | DINOv2/AnyLoc-style descriptors, FAISS, DISK/ALIKED+LightGlue, OpenCV RANSAC | Offline precomputed VPR chunk descriptors; conditional query descriptor; top-K FAISS; local matching on candidates | Matches AC-8.6; scalable top-K; local geometry verifies anchors | Needs Jetson profiling and model-size pruning | Fresh satellite cache, descriptors, dynamic K, RANSAC gates | Cache is local; no in-flight provider calls | Medium-high | MVE blocks in `02_fact_cards.md`; Sources #6-#9 | Selected with runtime gate |
| SuperPoint + LightGlue | SuperPoint, LightGlue | Same local matching with SuperPoint features | Strong technical baseline | SuperPoint license is restrictive | Legal review | Local only | Medium | Source #6 | Needs user decision |
| Classical SIFT/ORB-only | OpenCV | Handcrafted features and homography | Simple fallback, low compute | Poor cross-domain robustness | Feature-rich scenes | Local only | Low | Source #5 | Fallback / regression baseline |
| Solution | Tools | Advantages | Limitations | Fit |
|----------|-------|-----------|-------------|-----|
| Error-State EKF (ESKF) | Custom Python/C++ | Lightweight, multi-rate, well-understood | Linear approximation | ✅ Best |
| Hybrid ESKF/UKF | Custom | 49% better accuracy | More complex | ⚠️ Upgrade path |
| Factor Graph (GTSAM) | GTSAM | Best accuracy | Heavy compute | ❌ Too heavy |
**Exact-fit evidence**:
- Project constraints checked: offline cache, top-K dynamic retrieval, cross-domain local match, <400 ms hot-path constraint.
- Runtime gate: heavy VPR/re-ranking is trigger-based, not per-frame.
**Selected**: **ESKF** with adaptive measurement noise. State vector: [position(3), velocity(3), orientation_quat(4), accel_bias(3), gyro_bias(3)] = 16 states.
### Component: Satellite Cache And Tile Write-Back
Measurement sources and rates:
- IMU prediction: 100+Hz
- cuVSLAM VO update: ~3Hz (every frame)
- Satellite update: ~0.3-1Hz (keyframes only, delayed via async pipeline)
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| COG + PostgreSQL/PostGIS manifest + descriptor sidecars | GDAL COG, PostgreSQL/PostGIS manifest, FAISS sidecars | Service tiles and generated candidate tiles stored as tiled compressed GeoTIFFs with CRS/date/source/meter-per-pixel metadata | Geospatial standard, supports write-new-tile workflow, descriptor accounting | Needs careful 10 GB budget | STAC-like manifest, freshness gates, descriptor pruning | Local signed manifests recommended | Medium | Source #18 | Selected |
| PMTiles archive | PMTiles | Single-file read archive | Efficient map reads | Read-only; cannot update in place | Archive rebuild for updates | Local file integrity | Low | Source #17 | Rejected for live mutable cache |
### Component: Satellite Tile Preprocessing (Offline)
**Exact-fit evidence**:
- Project constraints checked: offline-only, no raw photo storage, mid-flight generated tiles, Satellite Service ingest metadata.
- External dependency: Satellite Service owns the promotion/voting layer for trusted basemap updates.
**Selected**: **GeoHash-indexed tile pairs on disk**.
### Component: MAVLink Integration And Telemetry
Pipeline:
1. Define operational area from flight plan
2. Download satellite tiles from Google Maps Tile API at max zoom (18-19)
3. Pre-resize each tile to matcher input resolution
4. Store: original tile + resized tile + metadata (GPS bounds, zoom, GSD) in GeoHash-indexed directory structure
5. Copy to Jetson storage before flight
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| MAVSDK telemetry + pymavlink output | MAVSDK, pymavlink | MAVSDK subscribes to telemetry; pymavlink emits `GPS_INPUT` to ArduPilot with `GPS1_TYPE=14` | Exact `GPS_INPUT` field control while keeping high-level telemetry APIs | Plane-specific failsafe/spoof triggers need SITL proof | ArduPilot Plane params, QGC status, FDR tlog | Validate MAVLink source and message rate | Medium | Sources #10-#12 | Selected |
### Component: Re-localization (Disconnected Segments)
**Exact-fit evidence**:
- Project constraints checked: ArduPilot only, v1 `GPS_INPUT` only, WGS84 output, covariance to `horiz_accuracy`.
- Validation gate: Plane SITL with production parameters.
**Selected**: **Keyframe satellite matching is always active + expanded search on VO failure**.
### Component: Flight Data Recorder
When cuVSLAM reports tracking loss (sharp turn, no features):
1. Immediately flag next frame as keyframe → trigger satellite matching
2. Expand tile search radius (from ±200m to ±1km based on IMU dead-reckoning uncertainty)
3. If match found: position recovered, new segment begins
4. If 3+ consecutive keyframe failures: request user input via API
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| Segmented FDR | PostgreSQL event index + binary/CBOR payloads with Parquet export post-flight | Fixed-size segment files for per-frame estimates, IMU, MAVLink, health, emitted GPS_INPUT, generated-tile metadata | Replayable, bounded, no raw frame retention | Exact format selected during implementation | Rollover policy, monotonic timestamps | Integrity hash per segment recommended | Medium | AC-NEW-3 | Selected pattern |
### Component: Object Center Coordinates
## Runtime Modes
Geometric calculation once frame-center GPS is known:
1. Pixel offset from center: (dx_px, dy_px)
2. Convert to meters: dx_m = dx_px × GSD, dy_m = dy_px × GSD
3. Rotate by IMU yaw heading
4. Convert meter offset to lat/lon and add to frame-center GPS
| Mode | Trigger | Behavior | Output Label |
|------|---------|----------|--------------|
| Satellite anchored | VPR + local match passes freshness, RANSAC, covariance, and Mahalanobis gates | ESKF absolute update, low covariance, tile generation eligible if sigma gate passes | `satellite_anchored` |
| VO extrapolated | Last anchor fresh enough, VO healthy, normal overlap | VO/IMU propagation, covariance grows with anchor age | `vo_extrapolated` |
| Dead reckoned | Visual blackout, VO failure without anchor, spoofing while no visual signal | IMU-only propagation, monotonic covariance growth, degraded fix type thresholds | `dead_reckoned` |
| Failsafe / no fix | Blackout >30 s or covariance >500 m | `GPS_INPUT.fix_type=0`, `horiz_accuracy=999.0`, QGC status | `dead_reckoned` with failsafe status |
### Component: API & Streaming
**Selected**: **FastAPI + sse-starlette**. REST for session management, SSE for real-time position stream. OpenAPI auto-documentation.
## Processing Time Budget (per frame, 400ms budget)
### Normal Frame (non-keyframe, ~60-80% of frames)
| Step | Time | Notes |
|------|------|-------|
| Image capture + transfer | ~10ms | CSI/USB3 |
| Downsample (for cuVSLAM) | ~2ms | OpenCV CUDA |
| cuVSLAM VO+IMU | ~11ms | NVIDIA CUDA-optimized, 90fps capable |
| ESKF fusion (VO+IMU update) | ~1ms | C extension or NumPy |
| SSE emit | ~1ms | Async |
| **Total** | **~25ms** | Well within 400ms |
### Keyframe Satellite Matching (async, every 3-10 frames)
Runs asynchronously on a separate CUDA stream — does NOT block per-frame VO output.
**Path A — LiteSAM (if benchmark passes)**:
| Step | Time | Notes |
|------|------|-------|
| Downsample to ~480px | ~1ms | OpenCV CUDA |
| Load satellite tile | ~5ms | Pre-resized, from storage |
| LiteSAM (opt) matching | ~300-500ms | TensorRT FP16, 480px, Orin Nano Super estimate |
| Geometric pose (RANSAC) | ~5ms | Homography estimation |
| ESKF satellite update | ~1ms | Delayed measurement |
| **Total** | **~310-510ms** | Async, does not block VO |
**Path B — XFeat (if LiteSAM abandoned)**:
| Step | Time | Notes |
|------|------|-------|
| XFeat feature extraction (both images) | ~10-20ms | TensorRT FP16/INT8 |
| XFeat semi-dense matching | ~30-50ms | KNN + refinement |
| Geometric verification (RANSAC) | ~5ms | |
| ESKF satellite update | ~1ms | |
| **Total** | **~50-80ms** | Comfortably within budget |
### Per-Frame Wall-Clock Latency
Every frame:
- **VO result emitted in ~25ms** (cuVSLAM + ESKF + SSE)
- Satellite correction arrives asynchronously on keyframes
- Client gets immediate position, then refined position when satellite match completes
## Memory Budget (Jetson Orin Nano Super, 8GB shared)
| Component | Memory | Notes |
|-----------|--------|-------|
| OS + runtime | ~1.5GB | JetPack 6.2 + Python |
| cuVSLAM | ~200-300MB | NVIDIA CUDA library + internal state |
| Satellite matcher TensorRT | ~50-100MB | LiteSAM FP16 or XFeat FP16 |
| Current frame (downsampled) | ~2MB | 640×480×3 |
| Satellite tile (pre-resized) | ~1MB | Single active tile |
| ESKF state + buffers | ~10MB | |
| FastAPI + SSE runtime | ~100MB | |
| **Total** | **~1.9-2.4GB** | ~25-30% of 8GB — comfortable margin |
## Confidence Scoring
| Level | Condition | Expected Accuracy |
|-------|-----------|-------------------|
| HIGH | Satellite match succeeded + cuVSLAM consistent | <20m |
| MEDIUM | cuVSLAM VO only, recent satellite correction (<500m travel) | 20-50m |
| LOW | cuVSLAM VO only, no recent satellite correction | 50-100m+ |
| VERY LOW | IMU dead-reckoning only (cuVSLAM + satellite both failed) | 100m+ |
| MANUAL | User-provided position | As provided |
## Key Risks and Mitigations
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| LiteSAM too slow on Orin Nano Super | HIGH | Misses 400ms deadline | **Abandon LiteSAM, use XFeat**. Day-one benchmark is the go/no-go gate |
| cuVSLAM not supporting nadir-only camera well | MEDIUM | VO accuracy degrades | Fall back to XFeat frame-to-frame matching |
| Google Maps satellite quality in conflict zone | HIGH | Satellite matching fails | Accept VO+IMU with higher drift; request user input sooner; alternative satellite providers |
| XFeat cross-view accuracy insufficient | MEDIUM | Position corrections less accurate than LiteSAM | Increase keyframe frequency; multi-tile consensus voting; geometric verification with strict RANSAC |
| cuVSLAM is closed-source | LOW | Hard to debug | Fallback to XFeat VO; cuVSLAM has Python+C++ APIs |
## Testing Strategy
### Integration / Functional Tests
- Replay normal overlapping frames and assert AC-2.1a VO registration rate and AC-2.2 VO MRE.
- Replay satellite-anchor cases and assert AC-1.1/1.2, AC-2.2 cross-domain MRE, freshness gates, and source labels.
- Inject stale tiles and assert no `satellite_anchored` output.
- Inject sharp turns and disconnected segments and assert VPR relocalization.
- Run ArduPilot Plane SITL with `GPS1_TYPE=14` and assert valid `GPS_INPUT` fields, fix type degradation, and QGC statuses.
- Inject visual blackout + spoofed GPS and assert spoofed GPS is ignored, covariance grows, and thresholds match AC-NEW-8.
- Request AI-camera object coordinates and assert level-flight projection plus maneuver error bound.
- End-to-end pipeline test with real flight data (60 images from input_data/)
- Compare computed positions against ground truth GPS from coordinates.csv
- Measure: percentage within 50m, percentage within 20m
- Test sharp-turn handling: introduce 90-degree heading change in sequence
- Test user-input fallback: simulate 3+ consecutive failures
- Test SSE streaming: verify client receives VO result within 50ms, satellite-corrected result within 500ms
- Test session management: start/stop/restart flight sessions via REST API
### Non-Functional Tests
- Jetson Orin Nano Super profiling: <400 ms p95, <8 GB shared memory, 25 W no-throttle hot-soak.
- Cache build test for 400 km²: imagery + manifests + descriptors fit within budget or fail with explicit budget report.
- 8-hour FDR load test: <=64 GB, rollover logged, no silent data loss.
- Monte Carlo false-anchor and over-confidence tests for AC-NEW-4 and AC-NEW-7.
- Cold boot 50x: first valid `GPS_INPUT` <30 s p95.
- **Day-one benchmark**: LiteSAM TensorRT FP16 at 480/640/800px on Orin Nano Super → go/no-go for LiteSAM
- cuVSLAM benchmark: verify 90fps monocular+IMU on Orin Nano Super
- Performance: measure per-frame processing time (must be <400ms)
- Memory: monitor peak usage during 1000-frame session (must stay <8GB)
- Stress: process 3000 frames without memory leak
- Keyframe strategy: vary interval (2, 3, 5, 10) and measure accuracy vs latency tradeoff
## References
Detailed source registry: `_docs/00_research/01_source_registry.md`.
Key sources:
- Fixed-wing satellite-aided VO: https://www.mdpi.com/2076-3417/14/16/7420
- Aerial VPR survey: https://arxiv.org/abs/2406.00885
- OpenVINS docs: https://docs.openvins.com/
- ORB-SLAM3 README: https://raw.githubusercontent.com/UZ-SLAMLab/ORB_SLAM3/master/README.md
- LightGlue README: https://raw.githubusercontent.com/cvg/LightGlue/main/README.md
- FAISS docs: https://faiss.ai/index.html
- ArduPilot GPSInput: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
- MAVLink GPS_INPUT: https://mavlink.io/en/messages/common.html#GPS_INPUT
- Jetson Orin Nano Super: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit/
- PMTiles: https://docs.protomaps.com/pmtiles/
- GDAL COG: https://gdal.org/en/stable/drivers/raster/cog.html
- LiteSAM (2025): https://www.mdpi.com/2072-4292/17/19/3349
- LiteSAM code: https://github.com/boyagesmile/LiteSAM
- cuVSLAM (2025-2026): https://github.com/NVlabs/PyCuVSLAM
- PyCuVSLAM API: https://nvlabs.github.io/PyCuVSLAM/api.html
- Mateos-Ramirez et al. (2024): https://www.mdpi.com/2076-3417/14/16/7420
- SatLoc (2025): https://www.scilit.com/publications/e5cafaf875a49297a62b298a89d5572f
- XFeat (CVPR 2024): https://arxiv.org/abs/2404.19174
- XFeat TensorRT for Jetson: https://github.com/PranavNedunghat/XFeatTensorRT
- EfficientLoFTR (CVPR 2024): https://github.com/zju3dv/EfficientLoFTR
- JetPack 6.2: https://docs.nvidia.com/jetson/archives/jetpack-archived/jetpack-62/release-notes/
- Hybrid ESKF/UKF: https://arxiv.org/abs/2512.17505
- Google Maps Tile API: https://developers.google.com/maps/documentation/tile/satellite
## Related Artifacts
- AC assessment: `_docs/00_research/00_ac_assessment.md`
- Question decomposition: `_docs/00_research/00_question_decomposition.md`
- Source registry: `_docs/00_research/01_source_registry.md`
- Fact cards: `_docs/00_research/02_fact_cards.md`
- Comparison framework: `_docs/00_research/03_comparison_framework.md`
- Reasoning chain: `_docs/00_research/04_reasoning_chain.md`
- Validation log: `_docs/00_research/05_validation_log.md`
- Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
- AC Assessment: `_docs/00_research/gps_denied_nav/00_ac_assessment.md`
- Tech stack evaluation: `_docs/01_solution/tech_stack.md`
+316 -85
View File
@@ -4,122 +4,353 @@
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
|------------------------|----------------------------------------------|-------------|
| DINOv2-VLAD with possible TensorRT optimization | TensorRT conversion may produce limited speedup and can alter embedding distances on Jetson-class deployments. | Keep DINOv2-VLAD, but require descriptor-fidelity tests against PyTorch/ONNX before TensorRT descriptors are accepted. |
| FAISS CPU/GPU optional | FAISS GPU is not a safe default on Jetson ARM64/aarch64 packaging. | Pin FAISS as CPU-first on Jetson; use PQ/IVF and top-K caps before considering custom GPU builds. |
| LightGlue local matcher | SuperPoint path has license risk and community confusion. | Keep DISK/ALIKED+LightGlue as production default; SuperPoint remains license-gated benchmark/fallback only. |
| COG cache | "COG cache" could be misread as mutable in-place raster updates. | Use write-new COG tile objects plus manifest versioning and sidecars; never mutate COGs in place. |
| `GPS_INPUT` output | ArduPilot velocity ignore flags have reported EKF3 pitfalls. | SITL must validate velocity source parameters, ignore flags, and whether zero velocity is ever fused accidentally. |
| Visual/satellite anchoring | Draft did not emphasize adversarial/cache integrity enough. | Add signed cache manifests, tile provenance, freshness gates, anchor consistency checks, and FDR audit trail. |
| LiteSAM at 480px as satellite matcher | **Performance**: 497ms on AGX Orin at 1184px. Orin Nano Super is ~3-4x slower. At 480px estimated ~270-360ms — borderline. Paper uses PyTorch AMP, not TensorRT FP16. TensorRT could bring 2-3x improvement. | Add TensorRT FP16 as mandatory optimization step. Revised estimate at 480px with TensorRT: ~90-180ms. Still benchmark-driven: abandon if >400ms. |
| XFeat as LiteSAM fallback for satellite matching | **Functional**: XFeat is a general-purpose feature matcher, NOT designed for cross-view satellite-aerial gap. May fail on season/lighting differences between UAV and satellite imagery. | **Expand fallback options**: benchmark EfficientLoFTR (designed for weak-texture aerial) alongside XFeat. Consider STHN-style deep homography as third option. See detailed satellite matcher comparison below. |
| SP+LG considered as "sparse only, worse on satellite-aerial" | **Functional**: LiteSAM paper confirms "SP+LG achieves fastest inference speed but at expense of accuracy." Sparse matcher fails on texture-scarce regions. ~180-360ms on Orin Nano Super. | **Reject SP+LG** for both VO and satellite matching. cuVSLAM is 15-33x faster for VO. |
| cuVSLAM on low-texture terrain | **Functional**: cuVSLAM uses Shi-Tomasi corners + Lucas-Kanade tracking. On uniform agricultural fields/water bodies, features will be sparse → frequent tracking loss. IMU fallback lasts only ~1s. No published benchmarks for nadir agricultural terrain. Does NOT guarantee pose recovery after tracking loss. | **CRITICAL RISK**: cuVSLAM will likely fail frequently over low-texture terrain. Mitigation: (1) increase satellite matching frequency in low-texture areas, (2) use IMU dead-reckoning bridge, (3) accept higher drift in featureless segments, (4) XFeat VO as secondary fallback may also struggle on same terrain. |
| cuVSLAM memory estimate ~200-300MB | **Performance**: Map grows over time. For 3000-frame flights (~16min at 3fps), map could reach 500MB-1GB without pruning. | Configure cuVSLAM map pruning. Set max keyframes. Monitor memory. |
| Tile search on VO failure: "expand to ±1km" | **Functional**: Underspecified. Loading 10-20 tiles slow from disk I/O. | Preload tiles within ±2km of flight plan into RAM. Ranked search by IMU dead-reckoning position. |
| LiteSAM resolution | **Performance**: Paper benchmarked at 1184px on AGX Orin (497ms AMP). TensorRT FP16 with reparameterized MobileOne expected 2-3x faster. | Benchmark LiteSAM TRT FP16 at **1280px** on Orin Nano Super. If ≤200ms → use LiteSAM at 1280px. If >200ms → use XFeat. |
| SP+LG proposed for VO by user | **Performance**: ~130-280ms/frame on Orin Nano. cuVSLAM ~8.6ms/frame. No IMU, no loop closure. | **Reject SP+LG for VO.** cuVSLAM 15-33x faster. XFeat frame-to-frame remains fallback. |
## Product Solution Description
Build an onboard GPS-denied localization service that runs on the Jetson companion computer, uses the fixed downward navigation camera and flight-controller inertial telemetry, and emits ArduPilot `GPS_INPUT` estimates with calibrated covariance and source labels.
A real-time GPS-denied visual navigation system for fixed-wing UAVs, running entirely on a Jetson Orin Nano Super (8GB). The system determines frame-center GPS coordinates by fusing three information sources: (1) CUDA-accelerated visual odometry (cuVSLAM), (2) absolute position corrections from satellite image matching, and (3) IMU-based motion prediction. Results stream to clients via REST API + SSE in real time.
The production architecture is a trigger-based hybrid estimator:
**Hard constraint**: Camera shoots at ~3fps (333-400ms interval). The full pipeline must complete within **400ms per frame**.
```text
Nav camera + FC telemetry
|
v
Image quality + calibration + orthorectification
|
+--> Hot path: VO/IMU propagation --> custom ESKF --> GPS_INPUT + QGC + FDR
|
+--> Trigger path: DINOv2-VLAD query --> CPU FAISS top-K --> DISK/ALIKED+LightGlue --> RANSAC --> ESKF anchor
|
+--> Tile path: new COG tile + quality/provenance sidecar --> manifest update --> post-flight Satellite Service sync
**Satellite matching strategy**: Benchmark LiteSAM TensorRT FP16 at **1280px** on Orin Nano Super as a day-one priority. The paper's AGX Orin benchmark used PyTorch AMP — TensorRT FP16 with reparameterized MobileOne should yield 2-3x additional speedup. **Decision rule: if LiteSAM TRT FP16 at 1280px ≤200ms → use LiteSAM. If >200ms → use XFeat.**
**Core architectural principles**:
1. **cuVSLAM handles VO** — 116fps on Orin Nano 8GB, ~8.6ms/frame. SuperPoint+LightGlue was evaluated and rejected (15-33x slower, no IMU integration).
2. **Keyframe-based satellite matching** — satellite matcher runs on keyframes only (every 3-10 frames), amortizing cost. Non-keyframes rely on cuVSLAM VO + IMU.
3. **Every keyframe independently attempts satellite-based geo-localization** — handles disconnected segments natively.
4. **Pipeline parallelism** — satellite matching for frame N overlaps with VO processing of frame N+1 via CUDA streams.
5. **Proactive tile loading** — preload tiles within ±2km of flight plan into RAM for fast lookup during expanded search.
```
┌─────────────────────────────────────────────────────────────────┐
│ OFFLINE (Before Flight) │
│ Satellite Tiles → Download & Crop → Store as tile pairs │
│ (Google Maps) (per flight plan) (disk, GeoHash indexed) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ ONLINE (During Flight) │
│ │
│ EVERY FRAME (400ms budget): │
│ ┌────────────────────────────────┐ │
│ │ Camera → Downsample (CUDA 2ms)│ │
│ │ → cuVSLAM VO+IMU (~9ms) │──→ ESKF Update → SSE Emit │
│ └────────────────────────────────┘ ↑ │
│ │ │
│ KEYFRAMES ONLY (every 3-10 frames): │ │
│ ┌────────────────────────────────────┐ │ │
│ │ Satellite match (async CUDA stream)│─────┘ │
│ │ LiteSAM TRT FP16 or XFeat │ │
│ │ (does NOT block VO output) │ │
│ └────────────────────────────────────┘ │
│ │
│ IMU: 100+Hz continuous → ESKF prediction │
│ TILES: ±2km preloaded in RAM from flight plan │
└─────────────────────────────────────────────────────────────────┘
```
Heavy retrieval and local matching are not steady-state per-frame dependencies. They run on cold start, VO failure, sharp turns, disconnected segments, covariance growth, or stale-anchor age.
## Speed Optimization Techniques
### 1. cuVSLAM for Visual Odometry (~9ms/frame)
NVIDIA's CUDA-accelerated VO library (v15.0.0, March 2026) achieves 116fps on Jetson Orin Nano 8GB at 720p. Supports monocular camera + IMU natively. Features: automatic IMU fallback when visual tracking fails, loop closure, Python and C++ APIs.
**Why not SuperPoint+LightGlue for VO**: SP+LG is 15-33x slower (~130-280ms vs ~9ms). Lacks IMU integration, loop closure, auto-fallback.
**CRITICAL: cuVSLAM on difficult/even terrain (agricultural fields, water)**:
cuVSLAM uses Shi-Tomasi corner detection + Lucas-Kanade optical flow tracking (classical features, not learned). On uniform agricultural terrain or water bodies:
- Very few corners will be detected → sparse/unreliable tracking
- Frequent keyframe creation → heavier compute
- Tracking loss → IMU fallback (~1 second) → constant-velocity integrator (~0.5s more)
- cuVSLAM does NOT guarantee pose recovery after tracking loss
- All published benchmarks (KITTI: urban/suburban, EuRoC: indoor) do NOT include nadir agricultural terrain
- Multi-stereo mode helps with featureless surfaces, but we have mono camera only
**Mitigation strategy for low-texture terrain**:
1. **Increase satellite matching frequency**: In low-texture areas (detected by cuVSLAM's keypoint count dropping), switch from every 3-10 frames to every frame
2. **IMU dead-reckoning bridge**: When cuVSLAM reports tracking loss, ESKF continues with IMU prediction. At 3fps with ~1.5s IMU bridge, that covers ~4-5 frames
3. **Accept higher drift**: In featureless segments, position accuracy degrades to IMU-only level (50-100m+ over ~10s). Satellite matching must recover absolute position when texture returns
4. **Keypoint density monitoring**: Track cuVSLAM's number of tracked features per frame. When below threshold (e.g., <50), proactively trigger satellite matching
5. **XFeat frame-to-frame as VO fallback**: XFeat uses learned features that may detect texture invisible to Shi-Tomasi corners. But XFeat may also struggle on truly uniform terrain
### 2. Keyframe-Based Satellite Matching
Not every frame needs satellite matching. Strategy:
- cuVSLAM provides VO at every frame (high-rate, low-latency)
- Satellite matching triggers on **keyframes** selected by:
- Fixed interval: every 3-10 frames (~1-3.3s between satellite corrections)
- Confidence drop: when ESKF covariance exceeds threshold
- VO failure: when cuVSLAM reports tracking loss (sharp turn)
### 3. Satellite Matcher Selection (Benchmark-Driven)
**Important context**: Our UAV-to-satellite matching is EASIER than typical cross-view geo-localization problems. Both the UAV camera and satellite imagery are approximately nadir (top-down). The main challenges are season/lighting differences, resolution mismatch, and temporal changes — not the extreme viewpoint gap seen in ground-to-satellite matching. This means even general-purpose matchers may perform well.
**Candidate A: LiteSAM (opt) with TensorRT FP16 at 1280px** — Best satellite-aerial accuracy (RMSE@30 = 17.86m on UAV-VisLoc). 6.31M params, MobileOne reparameterizable for TensorRT. Paper benchmarked at 497ms on AGX Orin using AMP at 1184px. TensorRT FP16 with reparameterized MobileOne expected 2-3x faster than AMP. At 1280px (close to paper's 1184px benchmark resolution), accuracy should match published results.
Orin Nano Super TensorRT FP16 estimate at 1280px:
- AGX Orin AMP @ 1184px: 497ms
- TRT FP16 speedup over AMP: ~2-3x → AGX Orin TRT estimate: ~165-250ms
- Orin Nano Super is ~3-4x slower → estimate: ~500-1000ms without TRT
- With TRT FP16: **~165-330ms** (realistic range)
- Go/no-go threshold: **≤200ms**
**Candidate B (fallback): XFeat semi-dense** — ~50-100ms on Orin Nano Super. Proven on Jetson. General-purpose, not designed for cross-view gap. FASTEST option. Since our cross-view gap is small (both nadir), XFeat may work adequately for this specific use case.
**Other evaluated options (not selected)**:
- **EfficientLoFTR**: Semi-dense, 15.05M params, handles weak-texture well. ~20% slower than LiteSAM. Strong option if LiteSAM codebase proves difficult to export to TRT, but larger model footprint.
- **Deep Homography (STHN-style)**: End-to-end homography estimation, no feature/RANSAC pipeline. 4.24m at 50m range. Interesting future option but needs RGB retraining — higher implementation risk.
- **PFED and retrieval-based methods**: Image RETRIEVAL only (identifies which tile matches), not pixel-level matching. We already know which tile to use from ESKF position.
- **SuperPoint+LightGlue**: Sparse matcher. LiteSAM paper confirms worse satellite-aerial accuracy. Slower than XFeat.
**Decision rule** (day-one on Orin Nano Super):
1. Export LiteSAM (opt) to TensorRT FP16
2. Benchmark at **1280px**
3. **If ≤200ms → use LiteSAM at 1280px**
4. **If >200ms → use XFeat**
### 4. TensorRT FP16 Optimization
LiteSAM's MobileOne backbone is reparameterizable — multi-branch training structure collapses to a single feed-forward path at inference. Combined with TensorRT FP16, this maximizes throughput. **Do NOT use INT8 on transformer components** (TAIFormer) — accuracy degrades. INT8 is safe only for the MobileOne backbone CNN layers.
### 5. CUDA Stream Pipelining
Overlap operations across consecutive frames:
- Stream A: cuVSLAM VO for current frame (~9ms) + ESKF fusion (~1ms)
- Stream B: Satellite matching for previous keyframe (async)
- CPU: SSE emission, tile management, keyframe selection logic
### 6. Proactive Tile Loading
**Change from draft01**: Instead of loading tiles on-demand from disk, preload tiles within ±2km of the flight plan into RAM at session start. This eliminates disk I/O latency during flight. For a 50km flight path, ~2000 tiles at zoom 19 ≈ ~200MB RAM — well within budget.
On VO failure / expanded search:
1. Compute IMU dead-reckoning position
2. Rank preloaded tiles by distance to predicted position
3. Try top 3 tiles (not all tiles in ±1km radius)
4. If no match in top 3, expand to next 3
## Existing/Competitor Solutions Analysis
| Solution | Approach | Accuracy | Hardware | Limitations |
|----------|----------|----------|----------|-------------|
| Mateos-Ramirez et al. (2024) | VO (ORB) + satellite keypoint correction + Kalman | 142m mean / 17km (0.83%) | Orange Pi class | No re-localization; ORB only; 1000m+ altitude |
| SatLoc (2025) | DinoV2 + XFeat + optical flow + adaptive fusion | <15m, >90% coverage | Edge (unspecified) | Paper not fully accessible |
| LiteSAM (2025) | MobileOne + TAIFormer + MinGRU subpixel refinement | RMSE@30 = 17.86m on UAV-VisLoc | RTX 3090 (62ms), AGX Orin (497ms@1184px) | Not tested on Orin Nano; AGX Orin is 3-4x more powerful |
| TerboucheHacene/visual_localization | SuperPoint/SuperGlue/GIM + VO + satellite | Not quantified | Desktop-class | Not edge-optimized |
| cuVSLAM (NVIDIA, 2025-2026) | CUDA-accelerated VO+SLAM, mono/stereo/IMU | <1% trajectory error (KITTI), <5cm (EuRoC) | Jetson Orin Nano (116fps) | VO only, no satellite matching |
| VRLM (2024) | FocalNet backbone + multi-scale feature fusion | 83.35% MA@20 | Desktop | Not edge-optimized |
| Scale-Aware UAV-to-Satellite (2026) | Semantic geometric + metric scale recovery | N/A | Desktop | Addresses scale ambiguity problem |
| EfficientLoFTR (CVPR 2024) | Aggregated attention + adaptive token selection, semi-dense | Competitive with LiteSAM | 2.5x faster than LoFTR, TRT available | 15.05M params, heavier than LiteSAM |
| PFED (2025) | Knowledge distillation + multi-view refinement, retrieval | 97.15% Recall@1 (University-1652) | AGX Orin (251.5 FPS) | Retrieval only, not pixel-level matching |
| STHN (IEEE RA-L 2024) | Deep homography estimation, coarse-to-fine | 4.24m at 50m range | Open-source, lightweight | Trained on thermal, needs RGB retraining |
| Hierarchical AVL (2025) | DINOv2 retrieval + SuperPoint matching | 64.5-95% success rate | ROS, IMU integration | Two-stage complexity |
| JointLoc (IROS 2024) | Retrieval + VO fusion, adaptive weighting | 0.237m RMSE over 1km | Open-source | Designed for Mars/planetary, needs adaptation |
## Architecture
### Component: Camera Ingest, Calibration, And Geometry
### Component: Visual Odometry
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| OpenCV geometry utility layer | OpenCV 4.x | Calibration, undistortion, RANSAC homography, MRE measurement | Mature, exact fit, permissive | Not a full estimator | Checkerboard calibration, fixed extrinsics, lens/FOV selection | Local-only | Fast enough for hot-path utility use | MVE in `02_fact_cards.md`; Source #5 | Selected |
| Solution | Tools | Advantages | Limitations | Performance | Fit |
|----------|-------|-----------|-------------|------------|-----|
| cuVSLAM (mono+IMU) | PyCuVSLAM v15.0.0 | 116fps on Orin Nano, NVIDIA-optimized, loop closure, IMU fallback | Closed-source CUDA library | ~9ms/frame | ✅ Best |
| XFeat frame-to-frame | XFeatTensorRT | 5x faster than SuperPoint, open-source | ~30-50ms total, no IMU integration | ~30-50ms/frame | ⚠️ Fallback |
| SuperPoint+LightGlue | LightGlue-ONNX TRT | Good accuracy, adaptive pruning | ~130-280ms, no IMU, no loop closure | ~130-280ms/frame | ❌ Rejected |
| ORB-SLAM3 | OpenCV + custom | Well-understood, open-source | CPU-heavy, ~30fps on Orin | ~33ms/frame | ⚠️ Slower |
### Component: VO / IMU Propagation And Estimator
**Selected**: **cuVSLAM (mono+IMU mode)** — 116fps, purpose-built by NVIDIA for Jetson. Auto-fallback to IMU when visual tracking fails.
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| Custom VO/IMU ESKF | OpenCV + custom estimator | Nadir VO/homography + FC IMU/attitude/altitude fused in ESKF with mode labels | Owns covariance, source labels, blackout/spoofing behavior | More implementation effort | Synchronized frames/IMU, calibration, replay tests | No network dependency | Hot path is lightweight | Facts #1, #16 | Selected |
| OpenVINS | OpenVINS | Monocular+IMU reference runs | Strong EKF/MSCKF reference | GPL-3 production risk | Replay adapter | Local only | Benchmark only | Source #3 | Reference only |
| ORB-SLAM3 | ORB-SLAM3 | Monocular-inertial SLAM | Mature benchmark | GPLv3 and heavier SLAM lifecycle | Calibration/vocabulary/runtime tuning | Local only | Riskier on embedded | Source #4 | Rejected for production |
**SP+LG rejection rationale**: 15-33x slower than cuVSLAM. No built-in IMU fusion, loop closure, or tracking failure detection. Building these features around SP+LG would take significant development time and still be slower. XFeat at ~30-50ms is a better fallback for VO if cuVSLAM fails on nadir camera.
### Component: Satellite Retrieval And Anchor Verification
### Component: Satellite Image Matching
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| DINOv2-VLAD + CPU FAISS + DISK/ALIKED+LightGlue | DINOv2/AnyLoc-style descriptors, FAISS CPU, LightGlue, OpenCV RANSAC | Offline VPR chunk descriptors; conditional query descriptor; CPU FAISS top-K; local match on candidates; TensorRT only after fidelity check | Strong retrieval+geometry structure; avoids per-frame map search | Requires profiling and representative data | Descriptor cache, dynamic K, freshness, RANSAC, Mahalanobis gates | Signed manifests, provenance, stale-tile rejection | Trigger path only; top-K capped | MVE blocks in `02_fact_cards.md`; Sources #6-#9, #21-#25 | Selected with runtime/fidelity gates |
| SuperPoint+LightGlue | SuperPoint, LightGlue | Same matcher with SuperPoint features | Strong technical baseline | SuperPoint license risk | Legal review | Local only | Benchmark only | Sources #6, #23 | Needs user decision |
| Classical SIFT/ORB | OpenCV | Handcrafted features + homography | Simple and cheap | Weak cross-domain robustness | Feature-rich scenes | Local only | Fast | Source #5 | Regression baseline |
| Solution | Tools | Advantages | Limitations | Performance | Fit |
|----------|-------|-----------|-------------|------------|-----|
| LiteSAM (opt) TRT FP16 @ 1280px | TensorRT | Best satellite-aerial accuracy (RMSE@30 17.86m), 6.31M params, subpixel refinement | Untested on Orin Nano Super with TensorRT | Est. ~165-330ms @ 1280px TRT FP16 | ✅ If ≤200ms |
| XFeat semi-dense | XFeatTensorRT | ~50-100ms, lightweight, Jetson-proven, fastest | General-purpose, not designed for cross-view. Our nadir-nadir gap is small → may work. | ~50-100ms | ✅ Fallback if LiteSAM >200ms |
### Component: Cache And Tile Lifecycle
**Selection**: Day-one benchmark on Orin Nano Super:
1. Export LiteSAM (opt) to TensorRT FP16
2. Benchmark at **1280px**
3. **If ≤200ms → LiteSAM at 1280px**
4. **If >200ms → XFeat**
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| COG tile objects + manifest + sidecars | GDAL COG, manifest DB/JSON, FAISS index files | Service tiles and generated tiles are write-new COG objects; active version selected by manifest | Geospatial standard, supports provenance and quality metadata | Descriptor budget pressure | CRS/date/source/m/px/freshness, sidecar hashes | Signed manifests, tile provenance, hash verification | Efficient local reads | Source #18; Facts #21, #29 | Selected |
| PMTiles | PMTiles | Read-only archive snapshot | Compact read package | Cannot update in place | Archive rebuild | Hash archive | Good for read-only export | Source #17 | Rejected for live cache |
### Component: Sensor Fusion
### Component: MAVLink Integration
| Solution | Tools | Advantages | Limitations | Performance | Fit |
|----------|-------|-----------|-------------|------------|-----|
| Error-State EKF (ESKF) | Custom Python/C++ | Lightweight, multi-rate, well-understood | Linear approximation | <1ms/step | ✅ Best |
| Hybrid ESKF/UKF | Custom | 49% better accuracy | More complex | ~2-3ms/step | ⚠️ Upgrade path |
| Factor Graph (GTSAM) | GTSAM | Best accuracy | Heavy compute | ~10-50ms/step | ❌ Too heavy |
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| MAVSDK telemetry + pymavlink `GPS_INPUT` | MAVSDK, pymavlink | MAVSDK subscriptions; pymavlink emits `GPS_INPUT`; Plane SITL validates `GPS1_TYPE=14`, velocity source params, ignore flags, fix types, accuracy fields | Exact output control with good telemetry ergonomics | SITL required to prove Plane behavior | ArduPilot Plane params, QGC, tlog/FDR | Link/source validation, status audit | Light CPU load | Sources #10-#12, #24 | Selected |
**Selected**: **ESKF** with adaptive measurement noise. State vector: [position(3), velocity(3), orientation_quat(4), accel_bias(3), gyro_bias(3)] = 16 states.
### Component: Security And Safety Controls
### Component: Satellite Tile Preprocessing (Offline)
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| Consistency-gated anchor acceptance | Custom ESKF gates, cache manifest verification | Anchor accepted only if freshness, provenance, RANSAC, covariance, Mahalanobis, and temporal consistency pass | Prevents confident false fixes | Needs calibrated thresholds | Representative replay and Monte Carlo | Rejects stale/poisoned/low-confidence anchors | Lightweight after candidate generation | Facts #16, #17, #28 | Selected |
| FDR audit trail | Segmented logs + hashes | Logs estimates, inputs, emitted GPS_INPUT, health, tile writes, anchor decisions | Supports incident analysis and cache-poisoning audits | Schema work | 64 GB rollover | Tamper-evident hashes recommended | Sequential writes | AC-NEW-3 | Selected |
**Selected**: **GeoHash-indexed tile pairs on disk + RAM preloading**.
## Runtime Modes
Pipeline:
1. Define operational area from flight plan
2. Download satellite tiles from Google Maps Tile API at max zoom (18-19)
3. Pre-resize each tile to matcher input resolution
4. Store: original tile + resized tile + metadata (GPS bounds, zoom, GSD) in GeoHash-indexed directory structure
5. Copy to Jetson storage before flight
6. **At session start**: preload tiles within ±2km of flight plan into RAM (~200MB for 50km route)
| Mode | Trigger | Behavior | `GPS_INPUT` / Telemetry |
|------|---------|----------|--------------------------|
| `satellite_anchored` | VPR + local match passes all gates | ESKF absolute update; tile write eligible only if sigma gate passes | 3D fix, `horiz_accuracy` >= 95% covariance semi-major axis |
| `vo_extrapolated` | VO healthy and anchor age/covariance within bounds | VO/IMU propagation; covariance grows | 3D/2D depending covariance threshold |
| `dead_reckoned` | visual blackout or no accepted anchor | IMU-only propagation, monotonic covariance growth | degraded fix type; QGC `VISUAL_BLACKOUT_IMU_ONLY` |
| failsafe/no-fix | covariance >500 m or blackout >30 s | stop pretending position is valid | `fix_type=0`, `horiz_accuracy=999.0`, QGC `VISUAL_BLACKOUT_FAILSAFE` |
### Component: Re-localization (Disconnected Segments)
**Selected**: **Keyframe satellite matching is always active + ranked tile search on VO failure**.
When cuVSLAM reports tracking loss (sharp turn, no features):
1. Immediately flag next frame as keyframe → trigger satellite matching
2. Compute IMU dead-reckoning position since last known position
3. Rank preloaded tiles by distance to dead-reckoning position
4. Try top 3 tiles sequentially (not all tiles in radius)
5. If match found: position recovered, new segment begins
6. If 3 consecutive keyframe failures across top tiles: expand to next 3 tiles
7. If still no match after 3+ full attempts: request user input via API
### Component: Object Center Coordinates
Geometric calculation once frame-center GPS is known:
1. Pixel offset from center: (dx_px, dy_px)
2. Convert to meters: dx_m = dx_px × GSD, dy_m = dy_px × GSD
3. Rotate by IMU yaw heading
4. Convert meter offset to lat/lon and add to frame-center GPS
### Component: API & Streaming
**Selected**: **FastAPI + sse-starlette**. REST for session management, SSE for real-time position stream. OpenAPI auto-documentation.
## Processing Time Budget (per frame, 400ms budget)
### Normal Frame (non-keyframe, ~60-80% of frames)
| Step | Time | Notes |
|------|------|-------|
| Image capture + transfer | ~10ms | CSI/USB3 |
| Downsample (for cuVSLAM) | ~2ms | OpenCV CUDA |
| cuVSLAM VO+IMU | ~9ms | NVIDIA CUDA-optimized, 116fps capable |
| ESKF fusion (VO+IMU update) | ~1ms | C extension or NumPy |
| SSE emit | ~1ms | Async |
| **Total** | **~23ms** | Well within 400ms |
### Keyframe Satellite Matching (async, every 3-10 frames)
Runs asynchronously on a separate CUDA stream — does NOT block per-frame VO output.
**Path A — LiteSAM TRT FP16 at 1280px (if ≤200ms benchmark)**:
| Step | Time | Notes |
|------|------|-------|
| Downsample to 1280px | ~1ms | OpenCV CUDA |
| Load satellite tile | ~1ms | Pre-loaded in RAM |
| LiteSAM (opt) TRT FP16 matching | ≤200ms | TensorRT FP16, 1280px, go/no-go threshold |
| Geometric pose (RANSAC) | ~5ms | Homography estimation |
| ESKF satellite update | ~1ms | Delayed measurement |
| **Total** | **≤210ms** | Async, within budget |
**Path B — XFeat (if LiteSAM >200ms)**:
| Step | Time | Notes |
|------|------|-------|
| XFeat feature extraction (both images) | ~10-20ms | TensorRT FP16/INT8 |
| XFeat semi-dense matching | ~30-50ms | KNN + refinement |
| Geometric verification (RANSAC) | ~5ms | |
| ESKF satellite update | ~1ms | |
| **Total** | **~50-80ms** | Comfortably within budget |
## Memory Budget (Jetson Orin Nano Super, 8GB shared)
| Component | Memory | Notes |
|-----------|--------|-------|
| OS + runtime | ~1.5GB | JetPack 6.2 + Python |
| cuVSLAM | ~200-500MB | CUDA library + map state. **Configure map pruning for 3000-frame flights** |
| Satellite matcher TensorRT | ~50-100MB | LiteSAM FP16 or XFeat FP16 |
| Preloaded satellite tiles | ~200MB | ±2km of flight plan, pre-resized |
| Current frame (downsampled) | ~2MB | 640×480×3 |
| ESKF state + buffers | ~10MB | |
| FastAPI + SSE runtime | ~100MB | |
| **Total** | **~2.1-2.9GB** | ~26-36% of 8GB — comfortable margin |
## Confidence Scoring
| Level | Condition | Expected Accuracy |
|-------|-----------|-------------------|
| HIGH | Satellite match succeeded + cuVSLAM consistent | <20m |
| MEDIUM | cuVSLAM VO only, recent satellite correction (<500m travel) | 20-50m |
| LOW | cuVSLAM VO only, no recent satellite correction | 50-100m+ |
| VERY LOW | IMU dead-reckoning only (cuVSLAM + satellite both failed) | 100m+ |
| MANUAL | User-provided position | As provided |
## Key Risks and Mitigations
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| **cuVSLAM fails on low-texture agricultural terrain** | **HIGH** | Frequent tracking loss, degraded VO | Increase satellite matching frequency when keypoint count drops. IMU dead-reckoning bridge (~1.5s). Accept higher drift in featureless segments. Satellite matching recovers position when texture returns. |
| LiteSAM TRT FP16 >200ms at 1280px on Orin Nano Super | MEDIUM | Must use XFeat instead (less accurate for cross-view) | Day-one TRT FP16 benchmark. If >200ms → XFeat. Since our nadir-nadir gap is small, XFeat may still perform adequately. |
| XFeat cross-view accuracy insufficient | MEDIUM | Satellite corrections less accurate | Benchmark XFeat on actual operational area satellite-aerial pairs. Increase keyframe frequency; multi-tile consensus; strict RANSAC. |
| cuVSLAM map memory growth on long flights | MEDIUM | Memory pressure | Configure map pruning, set max keyframes. Monitor memory. |
| Google Maps satellite quality in conflict zone | HIGH | Satellite matching fails | Accept VO+IMU with higher drift; request user input sooner; alternative satellite providers |
| cuVSLAM is closed-source, no nadir benchmarks | MEDIUM | Unknown failure modes over farmland | Extensive testing with real nadir UAV imagery before deployment. XFeat VO as fallback (also uses learned features). |
| Tile I/O bottleneck during expanded search | LOW | Delayed re-localization | Preload ±2km tiles in RAM; ranked search instead of exhaustive |
## Testing Strategy
### Integration / Functional Tests
- VO replay: assert AC-2.1a and AC-2.2 VO MRE on overlapping frame pairs.
- Satellite anchor replay: assert AC-1.1/1.2, AC-2.2 cross-domain MRE, freshness rejection, and source labels.
- DINOv2 descriptor fidelity: compare PyTorch/ONNX/TensorRT embeddings and retrieval rankings before accepting optimized engines.
- FAISS CPU index tests: top-K recall, query latency, index size, save/load behavior on Jetson ARM64.
- LightGlue extractor matrix: DISK vs ALIKED vs SuperPoint benchmark; SuperPoint output excluded from production unless license approved.
- COG cache lifecycle: write-new generated tile, update manifest, verify active version and rollback.
- `GPS_INPUT` SITL: validate fix type, `horiz_accuracy`, velocity fields, ignore flags, `EK3_SRC1_*` parameters, QGC behavior.
- Security gates: stale tile, mismatched tile hash, low inlier ratio, impossible velocity jump, and spoofed GPS during blackout.
- End-to-end pipeline test with real flight data (60 images from input_data/)
- Compare computed positions against ground truth GPS from coordinates.csv
- Measure: percentage within 50m, percentage within 20m
- Test sharp-turn handling: introduce 90-degree heading change in sequence
- Test user-input fallback: simulate 3+ consecutive failures
- Test SSE streaming: verify client receives VO result within 50ms, satellite-corrected result within 500ms
- Test session management: start/stop/restart flight sessions via REST API
- Test cuVSLAM map memory: run 3000-frame session, monitor memory growth
### Non-Functional Tests
- Jetson latency and memory: <400 ms p95, <8 GB shared memory, no 25 W thermal throttle.
- Cache budget: 400 km² imagery + manifests + descriptors fits budget or reports explicit split budget.
- FDR 8-hour load: <=64 GB, rollover logged, no silent payload loss.
- Monte Carlo false-position and cache-poisoning tests for AC-NEW-4 and AC-NEW-7.
- Cold boot: first valid `GPS_INPUT` <30 s p95 across 50 runs.
- **Day-one satellite matcher benchmark**: LiteSAM TRT FP16 at **1280px** on Orin Nano Super. If ≤200ms → use LiteSAM. If >200ms → use XFeat. Also measure accuracy on test satellite-aerial pairs for both.
- cuVSLAM benchmark: verify 116fps monocular+IMU on Orin Nano Super
- **cuVSLAM terrain stress test**: test with nadir camera over (a) urban/structured terrain, (b) agricultural fields, (c) water/uniform terrain, (d) forest. Measure: keypoint count, tracking success rate, drift per 100 frames, IMU fallback frequency
- cuVSLAM keypoint monitoring: verify that low-keypoint detection triggers increased satellite matching
- Performance: measure per-frame processing time (must be <400ms)
- Memory: monitor peak usage during 3000-frame session (must stay <8GB)
- Stress: process 3000 frames without memory leak
- Keyframe strategy: vary interval (2, 3, 5, 10) and measure accuracy vs latency tradeoff
- Tile preloading: verify RAM usage of preloaded tiles for 50km flight plan
## References
Detailed source registry: `_docs/00_research/01_source_registry.md`.
Key added Mode B sources:
- DINOv2 TensorRT issue: https://github.com/NVIDIA/TensorRT/issues/4348
- DINOv2 Jetson forum issue: https://forums.developer.nvidia.com/t/dinov2-tensorrt-model-performance-issue/312251
- LightGlue license discussion: https://github.com/cvg/LightGlue/issues/120
- ArduPilot GPS_INPUT velocity issue: https://github.com/ArduPilot/ardupilot/issues/19633
- FAISS install docs: https://github.com/facebookresearch/faiss/blob/main/INSTALL.md
- Orthophoto visual geolocalization: https://ar5iv.labs.arxiv.org/html/2103.14381
- EfficientLoFTR (CVPR 2024): https://github.com/zju3dv/EfficientLoFTR
- EfficientLoFTR paper: https://zju3dv.github.io/efficientloftr/
- LoFTR TensorRT adaptation: https://github.com/Kolkir/LoFTR_TRT
- PFED (2025): https://github.com/SkyEyeLoc/PFED
- STHN (IEEE RA-L 2024): https://github.com/arplaboratory/STHN
- JointLoc (IROS 2024): https://github.com/LuoXubo/JointLoc
- Hierarchical AVL (MDPI 2025): https://www.mdpi.com/2072-4292/17/20/3470
- LiteSAM (2025): https://www.mdpi.com/2072-4292/17/19/3349
- LiteSAM code: https://github.com/boyagesmile/LiteSAM
- cuVSLAM (2025-2026): https://github.com/NVlabs/PyCuVSLAM
- cuVSLAM paper: https://arxiv.org/abs/2506.04359
- PyCuVSLAM API: https://nvlabs.github.io/PyCuVSLAM/api.html
- Intermodalics cuVSLAM benchmark: https://www.intermodalics.ai/blog/nvidia-isaac-ros-in-depth-cuvslam-and-the-dp3-1-release
- Mateos-Ramirez et al. (2024): https://www.mdpi.com/2076-3417/14/16/7420
- SatLoc (2025): https://www.scilit.com/publications/e5cafaf875a49297a62b298a89d5572f
- XFeat (CVPR 2024): https://arxiv.org/abs/2404.19174
- XFeat TensorRT for Jetson: https://github.com/PranavNedunghat/XFeatTensorRT
- EfficientLoFTR (CVPR 2024): https://github.com/zju3dv/EfficientLoFTR
- LightGlue (ICCV 2023): https://github.com/cvg/LightGlue
- LightGlue TensorRT: https://fabio-sim.github.io/blog/accelerating-lightglue-inference-onnx-runtime-tensorrt/
- LightGlue TRT Jetson: https://github.com/qdLMF/LightGlue-with-FlashAttentionV2-TensorRT
- ForestVO / SP+LG VO: https://arxiv.org/html/2504.01261v1
- vo_lightglue (SP+LG VO): https://github.com/himadrir/vo_lightglue
- JetPack 6.2: https://docs.nvidia.com/jetson/archives/jetpack-archived/jetpack-62/release-notes/
- Hybrid ESKF/UKF: https://arxiv.org/abs/2512.17505
- Google Maps Tile API: https://developers.google.com/maps/documentation/tile/satellite
## Related Artifacts
- AC Assessment: `_docs/00_research/gps_denied_nav/00_ac_assessment.md`
- Tech stack evaluation: `_docs/01_solution/tech_stack.md`
- Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
- Fact cards: `_docs/00_research/02_fact_cards.md`
- Security analysis: `_docs/01_solution/security_analysis.md`
+243 -45
View File
@@ -1,59 +1,257 @@
# Tech Stack Evaluation
## Requirements Analysis
## Requirements Summary
| Area | Requirement |
|------|-------------|
| Runtime | Jetson Orin Nano Super, Ubuntu/JetPack, CUDA/TensorRT available, 8 GB shared memory, 25 W thermal envelope. |
| Language | Python is acceptable for orchestration/prototyping; C++/TensorRT paths likely needed for hot vision loops. |
| Vision | Calibration, undistortion, homography, VPR descriptors, local matching, RANSAC verification. |
| Estimation | BASALT VIO for relative camera+IMU propagation, wrapped by project-owned safety/anchor estimator logic for covariance calibration, source labels, blackout/spoofing modes, and MAVLink output mapping. |
| Storage | Offline satellite cache, descriptor index, FDR with no raw frame retention. |
| Autopilot | ArduPilot Plane, `GPS_INPUT` through pymavlink, MAVSDK for telemetry. |
### Functional
- GPS-denied visual navigation for fixed-wing UAV
- Frame-center GPS estimation via VO + satellite matching + IMU fusion
- Object-center GPS via geometric projection
- Real-time streaming via REST API + SSE
- Disconnected route segment handling
- User-input fallback for unresolvable frames
### Non-Functional
- <400ms per-frame processing (camera @ ~3fps)
- <50m accuracy for 80% of frames, <20m for 60%
- <8GB total memory (CPU+GPU shared pool)
- Up to 3000 frames per flight session
- Image Registration Rate >95% (normal segments)
### Hardware Constraints
- **Jetson Orin Nano Super** (8GB LPDDR5, 1024 CUDA cores, 67 TOPS INT8)
- **JetPack 6.2.2**: CUDA 12.6.10, TensorRT 10.3.0, cuDNN 9.3
- ARM64 (aarch64) architecture
- No internet connectivity during flight
## Technology Evaluation
| Layer | Selected | Alternatives Considered | Rationale | Risk |
|-------|----------|-------------------------|-----------|------|
| OS / GPU stack | JetPack Ubuntu + CUDA + TensorRT | Plain Ubuntu without JetPack | Required for Jetson acceleration and profiling. | Thermal/performance tuning. |
| Calibration / geometry | OpenCV 4.x | Custom NumPy-only geometry | Mature APIs for calibration, undistortion, homography, RANSAC. | Version pin and calibration quality. |
| VO / estimator | BASALT + project-owned safety/anchor wrapper | OpenVINS, Kimera-VIO, custom OpenCV/ESKF, ORB-SLAM3, VINS-Fusion | BASALT is the selected production VIO candidate; wrapper owns source labels, covariance gates, degraded modes, satellite-anchor acceptance, and MAVLink semantics. | BASALT confidence/covariance must be calibrated; nadir fixed-wing replay required. |
| VPR descriptors | DINOv2-VLAD / AnyLoc-style, model-size profiled, TensorRT only after fidelity check | MixVPR, SALAD, NetVLAD, classical BoW | Strong retrieval evidence; good offline descriptor model. | Memory/latency and embedding drift if optimized incorrectly. |
| Vector search | FAISS CPU-first on Jetson ARM64 | HNSWLIB, PostgreSQL/pgvector metadata-assisted search, brute-force NumPy, custom FAISS GPU build | Mature top-K search, save/load, PQ compression; PostgreSQL stores spatial/mission metadata around descriptor files. | CPU query latency must be profiled; GPU FAISS is not default on aarch64. |
| Local matching | DISK/ALIKED + LightGlue | SuperPoint+LightGlue, SIFT/ORB, LoFTR | Exact match outputs; Apache/BSD-friendly path; adaptive speed knobs. | Jetson profiling needed. |
| Raster cache | COG + PostgreSQL/PostGIS manifest + signed JSON sidecars | PMTiles, MBTiles, loose tile folders | COG fits geospatial raster and write-new-tile workflow; PostGIS supports spatial/freshness queries. | 10 GB budget pressure and local DB availability. |
| MAVLink | MAVSDK telemetry + pymavlink `GPS_INPUT` | MAVSDK-only, MAVProxy bridge | MAVSDK does telemetry well; `GPS_INPUT` needs raw field control. | Plane SITL validation. |
| FDR | PostgreSQL event index + CBOR/binary payload segments with optional Parquet export | Raw images, plain CSV | Queryable event metadata, bounded payload segments, no raw frame retention. | Schema and local DB availability. |
| Testing | ArduPilot Plane SITL + AerialVL/VPAir + EuRoC + representative flight replay | Public datasets only | No public dataset covers all ACs. | Representative data collection required. |
### Platform & OS
| Option | Version | Score (1-5) | Notes |
|--------|---------|-------------|-------|
| **JetPack 6.2.2 (L4T)** | Ubuntu 22.04 based | **5** | Only supported OS for Orin Nano Super. Includes CUDA 12.6, TensorRT 10.3, cuDNN 9.3 |
**Selected**: JetPack 6.2.2 — no alternative.
### Primary Language
| Option | Fitness | Maturity | Perf on Jetson | Ecosystem | Score |
|--------|---------|----------|----------------|-----------|-------|
| **Python 3.10+** | 5 | 5 | 4 | 5 | **4.8** |
| C++ | 5 | 5 | 5 | 3 | 4.5 |
| Rust | 3 | 3 | 5 | 2 | 3.3 |
**Selected**: **Python 3.10+** as primary language.
- cuVSLAM provides Python bindings (PyCuVSLAM v15.0.0)
- TensorRT has Python API
- FastAPI is Python-native
- OpenCV has full Python+CUDA bindings
- Performance-critical paths offloaded to CUDA via cuVSLAM/TensorRT — Python is glue code only
- C++ for custom ESKF if NumPy proves too slow (unlikely for 16-state EKF at 100Hz)
### Visual Odometry
| Option | Version | FPS on Orin Nano | Memory | License | Score |
|--------|---------|------------------|--------|---------|-------|
| **cuVSLAM (PyCuVSLAM)** | v15.0.0 (Mar 2026) | 116fps @ 720p | ~200-300MB | Free (NVIDIA, closed-source) | **5** |
| XFeat frame-to-frame | TensorRT engine | ~30-50ms/frame | ~50MB | MIT | 3.5 |
| ORB-SLAM3 | v1.0 | ~30fps | ~300MB | GPLv3 | 2.5 |
**Selected**: **PyCuVSLAM v15.0.0**
- 116fps on Orin Nano 8G at 720p (verified via Intermodalics benchmark)
- Mono + IMU mode natively supported
- Auto IMU fallback on tracking loss
- Pre-built aarch64 wheel: `pip install -e bin/aarch64`
- Loop closure built-in
**Risk**: Closed-source; nadir-only camera not explicitly tested. **Fallback**: XFeat frame-to-frame matching.
### Satellite Image Matching (Benchmark-Driven Selection)
**Day-one benchmark decides between two candidates:**
| Option | Params | Accuracy (UAV-VisLoc) | Est. Time on Orin Nano | License | Score |
|--------|--------|----------------------|----------------------|---------|-------|
| **LiteSAM (opt)** | 6.31M | RMSE@30 = 17.86m | ~300-500ms @ 480px (estimated) | Open-source | **4** (if fast enough) |
| **XFeat semi-dense** | ~5M | Not benchmarked on UAV-VisLoc | ~50-100ms | MIT | **4** (if LiteSAM too slow) |
**Decision rule**:
1. Export LiteSAM (opt) to TensorRT FP16 on Orin Nano Super
2. Benchmark at 480px, 640px, 800px
3. If ≤400ms at 480px → LiteSAM
4. If >400ms → **abandon LiteSAM, XFeat is primary**
| Requirement | LiteSAM (opt) | XFeat semi-dense |
|-------------|---------------|------------------|
| PyTorch → ONNX → TensorRT export | Required | Required |
| TensorRT FP16 engine | 6.31M params, ~25MB engine | ~5M params, ~20MB engine |
| Input preprocessing | Resize to 480px, normalize | Resize to 640px, normalize |
| Matching pipeline | End-to-end (detect + match + refine) | Detect → KNN match → geometric verify |
| Cross-view robustness | Designed for satellite-aerial gap | General-purpose, less robust |
### Sensor Fusion
| Option | Complexity | Accuracy | Compute @ 100Hz | Score |
|--------|-----------|----------|-----------------|-------|
| **ESKF (custom)** | Low | Good | <1ms/step | **5** |
| Hybrid ESKF/UKF | Medium | 49% better | ~2-3ms/step | 3.5 |
| GTSAM Factor Graph | High | Best | ~10-50ms/step | 2 |
**Selected**: **Custom ESKF in Python (NumPy/SciPy)**
- 16-state vector, well within NumPy capability
- FilterPy (v1.4.5, MIT) as reference/fallback, but custom implementation preferred for tighter control
- If 100Hz IMU prediction step proves slow in Python: rewrite as Cython or C extension (~1 day effort)
### Image Preprocessing
| Option | Tool | Time on Orin Nano | Notes | Score |
|--------|------|-------------------|-------|-------|
| **OpenCV CUDA resize** | cv2.cuda.resize | ~2-3ms (pre-allocated) | Must build OpenCV with CUDA from source. Pre-allocate GPU mats to avoid allocation overhead | **4** |
| NVIDIA VPI resize | VPI 3.2 | ~1-2ms | Part of JetPack, potentially faster | 4 |
| CPU resize (OpenCV) | cv2.resize | ~5-10ms | No GPU needed, simpler | 3 |
**Selected**: **OpenCV CUDA** (pre-allocated GPU memory) or **VPI 3.2** (whichever is faster in benchmark). Both available in JetPack 6.2.
- Must build OpenCV from source with `CUDA_ARCH_BIN=8.7` for Orin Nano Ampere architecture
- Alternative: VPI 3.2 is pre-installed in JetPack 6.2, no build step needed
### API & Streaming Framework
| Option | Version | Async Support | SSE Support | Score |
|--------|---------|--------------|-------------|-------|
| **FastAPI + sse-starlette** | FastAPI 0.115+, sse-starlette 3.3.2 | Native async/await | EventSourceResponse with auto-disconnect | **5** |
| Flask + flask-sse | Flask 3.x | Limited | Redis dependency | 2 |
| Raw aiohttp | aiohttp 3.x | Full | Manual SSE implementation | 3 |
**Selected**: **FastAPI + sse-starlette v3.3.2**
- sse-starlette: 108M downloads/month, BSD-3 license, production-stable
- Auto-generated OpenAPI docs
- Native async for non-blocking VO + satellite pipeline
- Uvicorn as ASGI server
### Satellite Tile Storage & Indexing
| Option | Complexity | Lookup Speed | Score |
|--------|-----------|-------------|-------|
| **GeoHash-indexed directory** | Low | O(1) hash lookup | **5** |
| SQLite + spatial index | Medium | O(log n) | 4 |
| PostGIS | High | O(log n) | 2 (overkill) |
**Selected**: **GeoHash-indexed directory structure**
- Pre-flight: download tiles, store as `{geohash}/{zoom}_{x}_{y}.jpg` + `{geohash}/{zoom}_{x}_{y}_resized.jpg`
- Runtime: compute geohash from ESKF position → direct directory lookup
- Metadata in JSON sidecar files
- No database dependency on the Jetson during flight
### Satellite Tile Provider
| Provider | Max Zoom | GSD | Pricing | Eastern Ukraine Coverage | Score |
|----------|----------|-----|---------|--------------------------|-------|
| **Google Maps Tile API** | 18-19 | ~0.3-0.5 m/px | 100K tiles free/month, then $0.48/1K | Partial (conflict zone gaps) | **4** |
| Bing Maps | 18-19 | ~0.3-0.5 m/px | 125K free/year (basic) | Similar | 3.5 |
| Mapbox Satellite | 18-19 | ~0.5 m/px | 200K free/month | Similar | 3.5 |
**Selected**: **Google Maps Tile API** (per restrictions.md). 100K free tiles/month covers ~25km² at zoom 19. For larger operational areas, costs are manageable at $0.48/1K tiles.
### Output Format
| Format | Standard | Tooling | Score |
|--------|----------|---------|-------|
| **GeoJSON** | RFC 7946 | Universal GIS support | **5** |
| CSV (lat, lon, confidence) | De facto | Simple, lightweight | 4 |
**Selected**: **GeoJSON** as primary, CSV as export option. Per AC: WGS84 coordinates.
## Tech Stack Summary
- **Primary implementation**: Python for orchestration, test harness, cache tooling, and MAVLink integration; C++/TensorRT for hot-path vision if profiling requires it.
- **Vision utilities**: OpenCV 4.x.
- **Estimator**: BASALT VIO plus project-owned safety/anchor wrapper and mode machine.
- **Global retrieval**: DINOv2-VLAD style descriptors with CPU-first FAISS top-K search.
- **Local matching**: DISK/ALIKED + LightGlue; SuperPoint only after license review.
- **Cache**: COG imagery, PostgreSQL/PostGIS manifest metadata, signed JSON sidecars, FAISS index files.
- **Autopilot**: MAVSDK subscriptions plus pymavlink `GPS_INPUT`.
- **Validation**: public datasets for component de-risking, Plane SITL for integration, representative flight/replay data for acceptance.
```
┌────────────────────────────────────────────────────┐
│ HARDWARE: Jetson Orin Nano Super 8GB │
│ OS: JetPack 6.2.2 (L4T / Ubuntu 22.04) │
│ CUDA 12.6.10 / TensorRT 10.3.0 / cuDNN 9.3 │
├────────────────────────────────────────────────────┤
│ LANGUAGE: Python 3.10+ │
│ FRAMEWORK: FastAPI + sse-starlette 3.3.2 │
│ SERVER: Uvicorn (ASGI) │
├────────────────────────────────────────────────────┤
│ VISUAL ODOMETRY: PyCuVSLAM v15.0.0 │
│ SATELLITE MATCH: LiteSAM(opt) or XFeat (benchmark) │
│ SENSOR FUSION: Custom ESKF (NumPy/SciPy) │
│ PREPROCESSING: OpenCV CUDA or VPI 3.2 │
│ INFERENCE: TensorRT 10.3.0 (FP16) │
├────────────────────────────────────────────────────┤
│ TILE PROVIDER: Google Maps Tile API │
│ TILE STORAGE: GeoHash-indexed directory │
│ OUTPUT: GeoJSON (WGS84) via SSE stream │
└────────────────────────────────────────────────────┘
```
## Dependency List
### Python Packages (pip)
| Package | Version | Purpose |
|---------|---------|---------|
| pycuvslam | v15.0.0 (aarch64 wheel) | Visual odometry |
| fastapi | >=0.115 | REST API framework |
| sse-starlette | >=3.3.2 | SSE streaming |
| uvicorn | >=0.30 | ASGI server |
| numpy | >=1.26 | ESKF math, array ops |
| scipy | >=1.12 | Rotation matrices, spatial transforms |
| opencv-python (CUDA build) | >=4.8 | Image preprocessing (must build from source with CUDA) |
| torch (aarch64) | >=2.3 (JetPack-compatible) | LiteSAM model loading (if selected) |
| tensorrt | 10.3.0 (JetPack bundled) | Inference engine |
| pycuda | >=2024.1 | CUDA stream management |
| geojson | >=3.1 | GeoJSON output formatting |
| pygeohash | >=1.2 | GeoHash tile indexing |
### System Dependencies (JetPack 6.2.2)
| Component | Version | Notes |
|-----------|---------|-------|
| CUDA Toolkit | 12.6.10 | Pre-installed |
| TensorRT | 10.3.0 | Pre-installed |
| cuDNN | 9.3 | Pre-installed |
| VPI | 3.2 | Pre-installed, alternative to OpenCV CUDA for resize |
| cuVSLAM runtime | Bundled with PyCuVSLAM wheel | |
### Offline Preprocessing Tools (developer machine, not Jetson)
| Tool | Purpose |
|------|---------|
| Python 3.10+ | Tile download script |
| Google Maps Tile API key | Satellite tile access |
| torch + LiteSAM weights | Feature pre-extraction (if LiteSAM selected) |
| trtexec (TensorRT) | Model export to TensorRT engine |
## Risk Assessment
| Risk | Impact | Mitigation |
|------|--------|------------|
| VPR/local matching exceeds Jetson latency | AC-4.1 failure | Conditional VPR, top-K caps, downsampled descriptors, CPU FAISS profiling, TensorRT only after embedding-fidelity checks. |
| Descriptor cache exceeds 10 GB | AC-8.3/storage failure | PQ/compression, multi-scale budget report, split descriptor budget if needed. |
| GPL library accidentally becomes production dependency | Licensing issue | Keep OpenVINS/ORB-SLAM3 as reference only unless legal approves; BASALT is the production VIO candidate. |
| BASALT covariance under-reports real error | False-position safety budget failure | Calibrate wrapper covariance against OpenVINS covariance, ground truth replay, and satellite anchor residuals. |
| Plane failsafe differs from Copter docs | Safety behavior mismatch | Production-parameter ArduPilot Plane SITL gate. |
| `GPS_INPUT` velocity ignore flags behave unexpectedly | EKF drift or false velocity fusion | SITL tests for velocity fields, ignore flags, and `EK3_SRC1_*` source parameters. |
| Public datasets fail to represent Ukrainian agricultural terrain | False confidence | Require representative synchronized flight/replay data before AC signoff. |
| Thermal throttling | Latency regression | Hot-soak test and throttle logging per AC-NEW-5. |
| Technology | Risk | Likelihood | Impact | Mitigation |
|-----------|------|-----------|--------|------------|
| cuVSLAM | Closed-source, nadir camera untested | Medium | High | XFeat frame-to-frame as open-source fallback |
| LiteSAM | May exceed 400ms on Orin Nano Super | High | High | **Abandon for XFeat** — day-one benchmark is go/no-go |
| OpenCV CUDA build | Build complexity on Jetson, CUDA arch compatibility | Medium | Low | VPI 3.2 as drop-in alternative (pre-installed) |
| Google Maps Tile API | Conflict zone coverage gaps, EEA restrictions | Medium | Medium | Test tile availability for operational area pre-flight; alternative providers (Bing, Mapbox) |
| Custom ESKF | Implementation bugs, tuning effort | Low | Medium | FilterPy v1.4.5 as reference; well-understood algorithm |
| Python GIL | Concurrent VO + satellite matching contention | Low | Low | CUDA operations release GIL; use asyncio + threading for I/O |
## Learning / Implementation Requirements
## Learning Requirements
- Jetson profiling with CUDA/TensorRT and memory instrumentation.
- ArduPilot Plane SITL, `GPS_INPUT`, GPS spoof/failsafe parameters.
- Geospatial raster formats: COG, CRS, tile matrices, m/px metadata, manifests.
- BASALT integration, covariance calibration, and Mahalanobis rejection in the safety/anchor wrapper.
- Aerial VPR benchmark methodology and georeference recall.
| Technology | Team Expertise Needed | Ramp-up Time |
|-----------|----------------------|--------------|
| PyCuVSLAM | SLAM concepts, Python API, camera calibration | 2-3 days |
| TensorRT model export | ONNX export, trtexec, FP16 optimization | 2-3 days |
| LiteSAM architecture | Transformer-based matching (if selected) | 1-2 days |
| XFeat | Feature detection/matching concepts | 1 day |
| ESKF | Kalman filtering, quaternion math, multi-rate fusion | 3-5 days |
| FastAPI + SSE | Async Python, ASGI, SSE protocol | 1 day |
| GeoHash spatial indexing | Geospatial concepts | 0.5 days |
| Jetson deployment | JetPack, power modes, thermal management | 2-3 days |
## Development Environment
| Environment | Purpose | Setup |
|-------------|---------|-------|
| **Developer machine** (x86_64, GPU) | Development, unit testing, model export | Docker with CUDA + TensorRT |
| **Jetson Orin Nano Super** | Integration testing, benchmarking, deployment | JetPack 6.2.2 flashed, SSH access |
Code should be developed and unit-tested on x86_64, then deployed to Jetson for integration/performance testing. cuVSLAM and TensorRT engines are aarch64-only — mock these in x86_64 tests.
-148
View File
@@ -1,148 +0,0 @@
# GPS-Denied Onboard Localization — Planning Report
## Executive Summary
The solution planning phase decomposed the GPS-denied onboard localization service into 8 runtime implementation components, 2 cross-cutting foundation epics, a bootstrap epic, and separate e2e/blackbox test epics. The architecture centers on a Jetson-hosted hot path using camera ingest, BASALT VIO, and a project-owned safety/anchor wrapper, with triggered Satellite Service candidate retrieval and ALIKED/DISK-LightGlue anchor verification against an offline PostgreSQL/PostGIS-backed cache.
Jira epics were created in project `AZ` from AZ-206 through AZ-218. Total estimated effort across epics is approximately 87-141 story points, with large work intentionally decomposed into child tasks of 2, 3, or 5 points where possible.
## Problem Statement
The system must provide reliable onboard WGS84 localization when GPS is denied or spoofed, using a fixed nadir camera, flight-controller telemetry, and an offline satellite cache. It must emit ArduPilot-compatible position estimates, report confidence honestly, degrade safely under blackout, and preserve enough forensic evidence for post-flight analysis without retaining raw frames.
## Architecture Overview
The system is a trigger-based hybrid estimator. Normal flight uses camera ingest, pre-VIO occlusion checks, BASALT VIO, and a safety/anchor wrapper. Relocalization triggers use DINOv2-VLAD, FAISS, ALIKED/DISK-LightGlue, and OpenCV RANSAC against the offline cache. The wrapper is the safety authority for covariance, source labels, degraded modes, tile-write eligibility, and MAVLink output semantics.
**Technology stack**: Python orchestration, C++/native vision paths where needed, OpenCV 4.x, BASALT, DINOv2-VLAD, FAISS CPU, ALIKED/DISK-LightGlue, PostgreSQL/PostGIS, COG, CBOR FDR segments, MAVSDK + pymavlink.
**Deployment**: Local onboard Jetson runtime with Docker/replay and Plane SITL for validation; release gates require Jetson hardware, Plane SITL, and representative synchronized replay data.
## Component Summary
| # | Component | Purpose | Dependencies | Epic |
|---|-----------|---------|--------------|------|
| 01 | Camera Ingest And Calibration | Ingest frames, validate calibration, detect total occlusion before VIO | Bootstrap, shared geometry/time, config/errors | AZ-209 |
| 02 | VIO Adapter | Wrap the selected relative VIO backend and emit replaceable state DTOs | Camera, MAVLink telemetry, shared helpers | AZ-213 |
| 03 | Safety And Anchor Wrapper | Own localization state, covariance, anchors, blackout/failsafe, output semantics | Camera, MAVLink, VIO, anchor verification | AZ-216 |
| 04 | Satellite Service | Sync Satellite Service cache/upload packages and retrieve local VPR candidates from cache descriptors and FAISS | Camera, Tile Manager, shared helpers | AZ-214 |
| 05 | Anchor Verification | Verify retrieved candidates with learned matching and RANSAC | Satellite Service, camera, Tile Manager | AZ-215 |
| 06 | Tile Manager | Manage COGs, PostGIS manifests, sidecars, freshness, and orthorectified generated tiles | Bootstrap, shared helpers, config/errors | AZ-211 |
| 07 | MAVLink And GCS Integration | Consume FC telemetry and emit v1 `GPS_INPUT`/QGC status | Bootstrap, config/errors | AZ-210 |
| 08 | FDR And Observability | Record bounded replayable evidence and status | Bootstrap, config/errors, runtime DTOs | AZ-212 |
| Test | E2E Test Suite | Separate black-box replay, SITL, Jetson, and release evidence tests; not onboard runtime | All runtime components | AZ-217 |
**Implementation order**:
1. Bootstrap and cross-cutting foundations: AZ-206, AZ-207, AZ-208.
2. Independent adapters/stores: AZ-209, AZ-210, AZ-211, AZ-212.
3. Estimation/relocalization: AZ-213, AZ-214, AZ-215.
4. Safety orchestration: AZ-216.
5. Separate e2e/blackbox test implementation: AZ-217, AZ-218.
## System Flows
| Flow | Description | Key Components |
|------|-------------|----------------|
| Pre-flight cache preparation | Validate offline cache, sidecars, descriptors, and indexes | Satellite Service, Tile Manager |
| Normal frame processing | Route usable frames through BASALT; route total occlusion to IMU-only degraded path | Camera, BASALT, safety, MAVLink, FDR |
| Satellite relocalization | Retrieve and verify cache candidates, then accept/reject anchors | Safety, Satellite Service, anchor verification, Tile Manager |
| Visual blackout / spoofing | Propagate IMU-only from last trusted state and fail safe at thresholds | Camera, safety, MAVLink, QGC, FDR |
| Generated tile lifecycle | Write generated COG candidates only under covariance/quality gates | Safety, Tile Manager, FDR |
| Post-flight sync and audit | Package generated tiles and FDR evidence | Tile Manager, FDR, Satellite Service |
| Validation replay | Exercise runtime through public interfaces | Validation harness, all runtime components |
See `system-flows.md` for full diagrams and details.
## Risk Summary
| Level | Count | Key Risks |
|-------|-------|-----------|
| Critical | 0 | None |
| High | 7 | Camera spec mismatch, BASALT nadir fit, covariance under-reporting, total occlusion false-negative, IMU-only over-trust, Jetson trigger-path performance, PostgreSQL/PostGIS availability |
| Medium | 5 | Cache poisoning, dataset coverage/licensing, FDR append pressure, GPL/non-commercial leakage, generated tile promotion risk |
| Low | 0 | None |
**Iterations completed**: 1
**All Critical/High risks mitigated**: Yes. High risks have concrete gates in architecture, component specs, and tests.
See `risk_mitigations.md` for the full register.
## Test Coverage
| Component | Integration | Performance | Security | Acceptance | AC Coverage |
|-----------|-------------|-------------|----------|------------|-------------|
| Camera Ingest And Calibration | 3 | 1 | 1 | 2 | 7 ACs |
| VIO Adapter | 4 | 1 | 1 | 1 | 8 ACs |
| Safety And Anchor Wrapper | 7 | 1 | 1 | 3 | 15 ACs |
| Satellite Service | 4 | 2 | 1 | 1 | 10 ACs |
| Anchor Verification | 2 | 1 | 2 | 1 | 9 ACs |
| Tile Manager | 4 | 1 | 3 | 1 | 10 ACs |
| MAVLink And GCS Integration | 6 | 2 | 1 | 1 | 10 ACs |
| FDR And Observability | 6 | 1 | 1 | 1 | 11 ACs |
| E2E Test Suite | 9 | 2 | 1 | 2 | All AC groups |
**Overall acceptance criteria coverage**: 39 / 39 acceptance criteria covered (100%).
**Restrictions coverage**: 10 / 10 restriction groups covered (100%).
## Epic Roadmap
| Order | Epic | Component / Concern | Effort | Dependencies |
|-------|------|---------------------|--------|--------------|
| 1 | AZ-206: Bootstrap & Initial Structure | Scaffold | M / 5-8 pts | none |
| 2 | AZ-207: Cross-Cutting: Shared Geometry And Time Sync | Shared helper | S-M / 3-5 pts | AZ-206 |
| 3 | AZ-208: Cross-Cutting: Runtime Configuration And Errors | Shared helper | S-M / 3-5 pts | AZ-206 |
| 4 | AZ-209: Camera Ingest And Calibration | Component 01 | M / 5-8 pts | AZ-206, AZ-207, AZ-208 |
| 5 | AZ-210: MAVLink And GCS Integration | Component 07 | M / 5-8 pts | AZ-206, AZ-208 |
| 6 | AZ-211: Tile Manager | Component 06 | L / 8-13 pts | AZ-206, AZ-207, AZ-208 |
| 7 | AZ-212: FDR And Observability | Component 08 | M-L / 5-8 pts | AZ-206, AZ-208 |
| 8 | AZ-213: VIO Adapter | Component 02 | L / 8-13 pts | AZ-209, AZ-210 |
| 9 | AZ-214: Satellite Service | Component 04 | L / 8-13 pts | AZ-209, AZ-211 |
| 10 | AZ-215: Anchor Verification | Component 05 | L / 8-13 pts | AZ-214, AZ-209, AZ-211 |
| 11 | AZ-216: Safety And Anchor Wrapper | Component 03 | XL / 13-21 pts | AZ-209, AZ-210, AZ-213, AZ-215 |
| 12 | AZ-217: E2E Test Suite | Separate test support | L / 8-13 pts | Component epics |
| 13 | AZ-218: Blackbox Tests | System tests | L / 8-13 pts | AZ-217, component epics |
**Total estimated effort**: 87-141 story points.
## Key Decisions Made
| # | Decision | Rationale | Alternatives Rejected |
|---|----------|-----------|----------------------|
| 1 | Use BASALT as production VIO candidate | Permissive license and strong VIO benchmark fit | OpenVINS production dependency, custom VIO from scratch |
| 2 | Keep safety/anchor wrapper as authority | Product semantics require calibrated covariance, labels, gates, failsafe, MAVLink mapping | Letting BASALT/OpenVINS own output safety |
| 3 | Use ALIKED/DISK-LightGlue for anchor verification | Strong local correspondences for cross-domain verification | Per-frame learned matcher as primary VIO hot path |
| 4 | Add pre-VIO total-occlusion gate | Safer and cheaper than feeding fully unusable frames to VIO | Letting BASALT detect all visual failures |
| 5 | Use PostgreSQL/PostGIS for structured metadata | User confirmed PostgreSQL; PostGIS fits spatial cache/mission metadata | JSON-only or embedded single-file metadata DB |
| 6 | Use CBOR FDR payload segments with PostgreSQL index | Keeps high-volume append data bounded and queryable | Raw-frame retention, plain CSV, Parquet as runtime primary |
| 7 | v1 emits `GPS_INPUT` only | Avoid ArduPilot EKF3 double-fusion risk in v1 | Parallel `ODOMETRY` in v1 |
## Open Questions
| # | Question | Impact | Assigned To |
|---|----------|--------|-------------|
| 1 | Exact ADTi camera lens, interface, sustained FPS, and temperature spec | Blocks final camera calibration and runtime FPS assumptions | Hardware/product owner |
| 2 | Final representative synchronized target dataset collection timing | Blocks final acceptance, though public datasets can de-risk | Project/product owner |
| 3 | Dataset license approval for ALTO/Kagaru/EPFL/VPAir/UZH FPV use | Blocks commercial acceptance evidence for restricted datasets | Legal/product owner |
| 4 | Local onboard PostgreSQL/PostGIS deployment profile | Blocks implementation details for DB persistence and health checks | Backend/runtime owner |
## Artifact Index
| File | Description |
|------|-------------|
| `glossary.md` | Confirmed project glossary |
| `architecture.md` | System architecture and ADRs |
| `data_model.md` | System data model and storage strategy |
| `system-flows.md` | Main runtime and validation flows |
| `deployment/containerization.md` | Container/replay strategy |
| `deployment/ci_cd_pipeline.md` | CI/CD and release gates |
| `deployment/environment_strategy.md` | Environment and dataset strategy |
| `deployment/observability.md` | Runtime signals, logs, and alerts |
| `deployment/deployment_procedures.md` | Deployment, rollback, and health checks |
| `components/*/description.md` | Component specifications |
| `components/*/tests.md` | Component test specifications |
| `common-helpers/*.md` | Shared helper specifications |
| `diagrams/component_overview.md` | Component overview Mermaid diagram |
| `diagrams/flows/*.md` | Flow-specific Mermaid diagrams |
| `risk_mitigations.md` | Risk register and mitigations |
| `epics.md` | Jira epic mapping and dependency roadmap |
| `FINAL_report.md` | This final planning report |
-241
View File
@@ -1,241 +0,0 @@
# GPS-Denied Onboard Localization — Architecture
## Architecture Vision
Build a Jetson-hosted onboard localization pipeline for fixed-wing GPS-denied flight. The hot path fuses fixed nadir camera frames and FC telemetry through OpenCV geometry, BASALT VIO, and a project-owned safety/anchor wrapper that emits calibrated `GPS_INPUT` estimates and QGC/FDR status. A triggered satellite-anchor path uses DINOv2-VLAD, CPU FAISS, ALIKED/DISK+LightGlue, and RANSAC against the offline cache; generated tiles are written back only with strict provenance and covariance gates.
### Components / Responsibilities
- Camera ingest/calibration: load frames, apply intrinsics/extrinsics, validate image quality.
- VIO adapter: produce relative camera+IMU motion from synchronized nav frames and FC IMU.
- Safety/anchor wrapper: own covariance calibration, source labels, degraded modes, anchor fusion, and `GPS_INPUT`.
- Satellite Service: sync mission cache packages before flight, upload generated-tile packages after flight, and serve local VPR candidate retrieval from the offline cache.
- Anchor verification: run local matching/RANSAC and reject unsafe anchors.
- Tile Manager: manage COGs, manifests, freshness/provenance, orthorectified generated tiles, and local tile metadata.
- MAVLink/GCS integration: consume FC telemetry and emit `GPS_INPUT`/QGC status.
- FDR/observability: record replayable mission evidence under storage caps.
- Validation harness: run still-image, public dataset, SITL, Jetson, and representative replay tests.
### Principles / Non-Negotiables
- No in-flight satellite-provider or Satellite Service calls; runtime uses offline cache only.
- BASALT is a VIO component, not the safety authority.
- Confidence must be honest; covariance must grow in degraded modes.
- Heavy VPR/local matching is trigger-based, not per-frame.
- Raw nav/AI frames are not retained in normal operation.
- GPL VIO libraries remain reference-only unless explicitly approved.
- Plane SITL and Jetson hardware are release gates.
- Public datasets can de-risk, but representative synchronized flight data is required for final acceptance.
## 1. System Context
**Problem being solved**: During fixed-wing flight, GPS may be denied or spoofed. The onboard system must estimate WGS84 coordinates for navigation-camera frame centers and detected objects, stream `GPS_INPUT` to ArduPilot Plane, report confidence honestly, and maintain safety during VO failure, stale imagery, spoofing, and visual blackout.
**System boundaries**:
- In scope: onboard localization runtime, offline cache consumption, BASALT VIO integration, satellite anchor verification, MAVLink output, QGC status, FDR, generated tile metadata, and a separate e2e/black-box test suite.
- Out of scope: upstream commercial satellite-provider sourcing, Satellite Service ingest implementation, AI mission-camera detection itself, PX4 support, raw-frame retention as a normal operating mode.
**External systems**:
| System | Integration Type | Direction | Purpose |
|--------|------------------|-----------|---------|
| ArduPilot Plane FC | MAVLink | Inbound/Outbound | FC telemetry in, `GPS_INPUT` and status out |
| QGroundControl | MAVLink telemetry | Outbound | Downsampled operator status and failsafe messages |
| Azaion Suite Satellite Service | Offline file/cache sync | Inbound before flight, outbound after landing | Provides mission cache packages and receives generated-tile packages; never called mid-flight |
| Public/replay datasets | File/rosbag/fixture | Inbound to validation | De-risk BASALT, VPR, and anchor logic |
## 2. Technology Stack
| Layer | Technology | Version / Mode | Rationale |
|-------|------------|----------------|-----------|
| OS / GPU stack | JetPack Ubuntu + CUDA/TensorRT/ONNX Runtime | Jetson Orin Nano Super target | Required for production hardware profiling |
| Runtime language | Python + C++ | Python orchestration; C++ for BASALT/hot vision paths | Fits MAVLink/test tooling and native VIO dependencies |
| Geometry | OpenCV 4.x | Calibration, undistortion, homography, RANSAC/USAC | Mature utility layer |
| VIO | BASALT | Production candidate | BSD-friendly, strong benchmark evidence |
| VIO reference | OpenVINS | Reference/covariance baseline only | Strong EKF covariance story; GPLv3 risk |
| Backup VIO | Kimera-VIO | Backup candidate | BSD-friendly fallback with mono caveats |
| Local matching | ALIKED/DISK + LightGlue | Anchor verification and optional VO fallback | Strong learned correspondences; profile before hot-path use |
| Retrieval | DINOv2-VLAD + CPU FAISS | Triggered VPR only | Robust candidate retrieval under cache/offline constraints |
| Structured metadata DB | PostgreSQL + PostGIS | Onboard/local deployment | Spatial cache manifests, mission state, generated-tile metadata, and FDR event indexes |
| Cache imagery | COG + PostgreSQL/PostGIS manifest + signed JSON sidecars | Write-new COG objects | Efficient geospatial rasters with queryable spatial metadata and auditable sidecars |
| FDR | PostgreSQL event index + CBOR segment payloads, optional Parquet export | Per-flight rollover | Queryable event metadata with compact bounded payload segments |
| MAVLink | MAVSDK + pymavlink | MAVSDK telemetry, pymavlink `GPS_INPUT` | Exact output control |
**Key constraints from restrictions.md**:
- Jetson has 8 GB shared memory and 25 W thermal envelope, so heavy VPR/local matching cannot run every frame.
- Runtime must be offline with respect to satellite providers, so all imagery and descriptors are preloaded.
- The camera is fixed nadir; all VO choices must be validated against low-parallax/planar terrain.
- ADTi public specs conflict with current assumptions on resolution, continuous FPS, and operating temperature; manufacturer specs must be pinned before implementation.
## 3. Deployment Model
**Environments**: Development replay, public-dataset replay, Jetson hardware validation, Plane SITL, representative flight/replay rig.
**Infrastructure**:
- Onboard production runtime runs on the Jetson companion computer, not in cloud.
- Replay/test infrastructure may use Docker for deterministic fixture tests.
- Release gates require local Jetson hardware and ArduPilot Plane SITL.
**Environment-specific configuration**:
| Config | Development | Production |
|--------|-------------|------------|
| Satellite cache | Small fixture cache | Preloaded operational-area cache |
| Descriptor index | Fixture FAISS index | CPU-first FAISS index with PQ/IVF if needed |
| Secrets/signing | Local test keys | Mission/cache signing keys from Suite process |
| FDR | Local temp output | Per-flight bounded NVMe storage |
| MAVLink | SITL/replay | Physical FC telemetry link |
## 4. Data Model Overview
**Core entities**:
| Entity | Description | Owned By Component |
|--------|-------------|--------------------|
| FrameRecord | Navigation-camera frame metadata, total-occlusion status, and processing status | Camera ingest/calibration |
| TelemetrySample | FC IMU, attitude, airspeed, altitude, GPS health | MAVLink/GCS integration |
| VioState | Backend-relative pose/velocity/bias output and quality metadata | VIO adapter |
| PositionEstimate | WGS84 estimate, covariance, source label, fix type, anchor age | Safety/anchor wrapper |
| VprChunk | Retrieval unit over cache imagery and descriptors | Satellite Service |
| AnchorCandidate | Retrieved tile/chunk with local-match and RANSAC evidence | Anchor verification |
| CacheTile | COG tile plus manifest and sidecar metadata | Tile Manager |
| GeneratedTile | In-flight orthorectified tile with trust/provenance metadata | Tile Manager |
| FdrSegment | Bounded replayable log segment | FDR/observability |
**Data flow summary**:
- Frame quality/total-occlusion gate + telemetry -> BASALT VIO when usable, or IMU-only degraded mode when not -> safety/anchor wrapper -> `GPS_INPUT`, QGC, FDR.
- Relocalization trigger -> DINOv2-VLAD/FAISS -> ALIKED/DISK+LightGlue/RANSAC -> accepted/rejected anchor.
- High-confidence pose + frame -> generated tile -> manifest/sidecar -> post-flight Satellite Service sync.
## 5. Integration Points
### Internal Communication
| From | To | Protocol | Pattern | Notes |
|------|----|----------|---------|-------|
| Camera ingest/calibration | VIO adapter | In-process queue or shared frame bus | Streaming | Timestamp discipline is critical |
| MAVLink telemetry | VIO adapter | In-process telemetry buffer | Streaming | IMU/attitude/altitude sync |
| VIO adapter | Safety/anchor wrapper | Typed state messages | Streaming | Wrapper calibrates confidence |
| Safety/anchor wrapper | Satellite Service | Command | Triggered local request | Uses only preloaded cache/index data during flight |
| Satellite Service | Anchor verification | Candidate list | Request-response | Dynamic top-K |
| Anchor verification | Safety/anchor wrapper | Anchor decision | Request-response | Includes MRE/inliers/provenance |
| Safety/anchor wrapper | MAVLink/GCS integration | Position/status DTO | Streaming | `GPS_INPUT` emitted frame-by-frame |
| Safety/anchor wrapper | FDR/observability | Append-only events | Streaming | Bounded segments |
### External Integrations
| External System | Protocol | Auth | Failure Mode |
|-----------------|----------|------|--------------|
| ArduPilot Plane | MAVLink | Source/system ID allowlist | Degrade/failsafe; never trust spoofed GPS blindly |
| QGroundControl | MAVLink | FC telemetry path | Downsampled status may be delayed but local FDR remains authoritative |
| Azaion Suite Satellite Service | Offline package sync | Signed manifests/sidecars | Missing/stale cache causes degraded mode, not mid-flight network fetch |
| Public datasets | File/rosbag | License constraints | Not final acceptance unless representative and license-compatible |
## 6. Non-Functional Requirements
| Requirement | Target | Measurement | Priority |
|-------------|--------|-------------|----------|
| Frame latency | <400 ms p95 | Capture/replay timestamp to emitted estimate | High |
| Memory | <8 GB shared | Jetson monitoring | High |
| First fix | <30 s p95 | 50 cold starts | High |
| Thermal | No throttle at 25 W / +50 C | 8-hour hot-soak | High |
| FDR storage | <=64 GB/flight | 8-hour synthetic load | High |
| Cache storage | ~10 GB persistent budget | Full mission cache accounting | High |
| False position | P(error >500 m) <0.1%, >1 km <0.01% | Monte Carlo/replay | High |
## 7. Security Architecture
**Authentication / trust boundary**:
- Runtime accepts only local cache files with valid manifest/signature/provenance.
- MAVLink input is filtered by expected source/system IDs and FC health semantics.
**Data protection**:
- At rest: FDR and cache sidecars should be integrity protected; mission secrets/signing keys are not stored in code.
- In transit: no in-flight satellite-provider or Satellite Service network dependency; MAVLink link security depends on FC/GCS deployment.
**Audit logging**:
- FDR records estimates, covariance, anchors, rejected anchors, cache validation failures, spoofing/blackout transitions, emitted `GPS_INPUT`, resource health, and tile-write decisions.
## 8. Key Architectural Decisions
### ADR-001: BASALT As Production VIO Candidate
**Context**: A naive OpenCV-only VIO implementation is risky, while OpenVINS has GPLv3 production constraints.
**Decision**: Use BASALT as the production relative VIO candidate and keep OpenVINS as covariance/reference baseline.
**Alternatives considered**:
1. OpenVINS as production core — rejected by default because of GPLv3 and generic VIO ownership.
2. Kimera-VIO — retained as backup due to BSD license but mono-inertial caveats.
3. Fully custom OpenCV/ESKF — fallback only because implementation burden is high.
**Consequences**: The safety/anchor wrapper must calibrate confidence around BASALT and prove it on representative data.
### ADR-002: ALIKED-LightGlue Role
**Context**: ALIKED-LightGlue can produce strong local correspondences and can support frame-to-frame homography/pose estimation.
**Decision**: Use ALIKED/DISK+LightGlue for satellite-anchor verification and evaluate it as an optional VO fallback/keyframe-assist path, not as the default BASALT replacement.
**Alternatives considered**:
1. Per-frame ALIKED-LightGlue VO hot path — deferred until Jetson profiling proves latency/memory fit.
2. SIFT/ORB-only matching — retained as regression baseline, weaker under cross-domain conditions.
3. SuperPoint+LightGlue — license-gated.
**Consequences**: Implementation tasks must benchmark ALIKED-LightGlue on frame-to-frame VO and cross-domain anchor workloads separately.
### ADR-003: Cache Metadata Format
**Context**: JSON is simple and auditable, but operational cache queries need spatial indexing, freshness filters, update safety, and integration with the project PostgreSQL database.
**Decision**: Use PostgreSQL with PostGIS as the primary cache manifest/index database, with signed JSON sidecars for each tile/generated tile for auditability and interchange.
**Alternatives considered**:
1. JSON-only manifest — simpler, but weak for query/update scale, spatial search, and consistency.
2. Embedded single-file metadata DB — efficient for small deployments, but rejected because the project will use PostgreSQL/PostGIS.
**Consequences**: The Tile Manager owns PostgreSQL migrations, PostGIS indexes, signature checks, generated-tile orthorectification metadata, and sidecar/db consistency.
### ADR-004: FDR Format
**Context**: The FDR must be compact, bounded, replayable, and exportable for analysis.
**Decision**: Use PostgreSQL for FDR event indexes and mission-query metadata, with CBOR-backed segment payloads for bounded append-heavy runtime data and optional Parquet export after flight.
**Alternatives considered**:
1. Plain CSV — rejected for type safety, size, and complex payloads.
2. Parquet as primary onboard format — good analytics, but less ideal as the runtime append/rollover path.
**Consequences**: FDR implementation must define PostgreSQL tables/indexes, CBOR segment schema, rollover behavior, and export tooling.
### ADR-006: Total Occlusion Before VIO
**Context**: BASALT should not receive frames that are completely unusable because of lens cover, cloud/whiteout, decode failure, extreme exposure, or other total visual blackout.
**Decision**: Camera ingest performs a pre-VIO total-occlusion/blackout check. Total occlusion bypasses BASALT for that frame, sends a `total_occlusion` or `visual_blackout` degradation signal to the safety wrapper, and continues IMU-only propagation from the last trusted state.
**Alternatives considered**:
1. Let BASALT detect every visual failure — rejected because total occlusion is cheaper and safer to catch before the VIO hot path.
2. Drop frames silently — rejected because the wrapper must grow covariance and emit honest degraded output.
**Consequences**: The camera component must expose `occlusion_status`, and tests must assert mode transition to `dead_reckoned`/failsafe under total blackout.
### ADR-005: Public Dataset Strategy
**Context**: The original still-image sample lacks synchronized IMU and ground-truth trajectory. The Derkachi fixture adds cropped nadir video synchronized with IMU and `GLOBAL_POSITION_INT` trajectory, but camera intrinsics, distortion, and camera-to-body calibration remain pending.
**Decision**: Prioritize MUN-FRL for synchronized nadir camera + IMU + GNSS/ground truth; use ALTO for aerial localization/VPR and long nadir trajectories; investigate Kagaru/EPFL for fixed-wing/farmland relevance; use EuRoC/UZH FPV only as VIO proxies if license-compatible.
**Consequences**: Public datasets de-risk components but do not replace representative target flight data for final acceptance.
@@ -1,30 +0,0 @@
# Geo Geometry Helper
## Purpose
Shared geospatial and camera-geometry utilities used by camera ingest, safety wrapper, Tile Manager, anchor verification, and validation.
## Responsibilities
- WGS84 to local tangent plane conversions.
- Haversine/ground-distance calculations.
- Ground sampling distance calculations.
- Camera footprint projection from intrinsics, extrinsics, altitude, and attitude.
- Homography and covariance unit conversions for reporting.
## Non-Responsibilities
- No image matching.
- No state estimation.
- No MAVLink emission.
- No cache policy decisions.
## Consumers
| Component | Usage |
|-----------|-------|
| Camera ingest/calibration | Footprint and calibration sanity checks |
| Safety/anchor wrapper | Distance/covariance/unit conversion |
| Anchor verification | Pixel-to-ground error reporting |
| Tile Manager | Tile footprint metadata |
| Validation harness | Error thresholds and reports |
@@ -1,29 +0,0 @@
# Time Sync Helper
## Purpose
Shared timestamp validation and alignment utilities for frame, IMU, telemetry, FDR, and replay data.
## Responsibilities
- Monotonic timestamp checks.
- Frame-to-IMU window selection.
- Clock-domain conversion metadata.
- Replay ordering validation.
- Gap and jitter metrics.
## Non-Responsibilities
- No VIO state estimation.
- No MAVLink parsing beyond normalized timestamp fields.
- No recovery policy; callers decide whether to degrade or reject.
## Consumers
| Component | Usage |
|-----------|-------|
| Camera ingest/calibration | Frame ordering and timestamp metadata |
| VIO adapter | IMU/frame synchronization |
| MAVLink/GCS integration | Telemetry timestamp normalization |
| FDR/observability | Segment ordering |
| Validation harness | Fixture validation |
@@ -1,115 +0,0 @@
# Camera Ingest And Calibration
## 1. High-Level Overview
**Purpose**: Ingest navigation-camera frames, attach timestamps and calibration metadata, undistort/normalize imagery, detect total occlusion/blackout before VIO, and classify image quality before VIO or satellite matching consumes it.
**Architectural Pattern**: Streaming adapter + validation gate.
**Upstream dependencies**: Navigation camera, camera calibration files.
**Downstream consumers**: VIO adapter, Satellite Service, anchor verification, Tile Manager, FDR.
## 2. Internal Interfaces
### Interface: `FrameProvider`
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `next_frame` | `FrameRequest` | `FramePacket` | Yes | `FrameUnavailable`, `CalibrationMissing`, `InvalidFrame` |
| `detect_occlusion` | `FramePacket` | `OcclusionReport` | No | `InvalidFrame` |
| `classify_quality` | `FramePacket` | `ImageQualityReport` | No | `InvalidFrame` |
**Input DTOs**:
```yaml
FrameRequest:
source: enum(live_camera, replay_file)
timestamp_ns: integer optional
```
**Output DTOs**:
```yaml
FramePacket:
frame_id: string
timestamp_ns: integer
image_ref: path_or_buffer
camera_calibration_id: string
altitude_hint_m: number optional
occlusion: OcclusionReport
quality: ImageQualityReport
OcclusionReport:
status: enum(clear, partial_occlusion, total_occlusion, blackout)
reason: enum(cloud, lens_cover, whiteout, decode_failure, underexposed, overexposed, unknown) optional
usable_for_vio: boolean
usable_for_anchor: boolean
ImageQualityReport:
usable_for_vio: boolean
usable_for_anchor: boolean
blackout_detected: boolean
blur_score: number
texture_score: number
```
## 3. Data Access Patterns
| Query | Frequency | Hot Path | Index Needed |
|-------|-----------|----------|--------------|
| Load calibration by ID | Startup | Yes | No |
| Read replay frame | Test/replay | Yes | File order |
## 4. Implementation Details
**State Management**: Maintains current camera calibration version and frame sequence state.
**Key Dependencies**:
| Library | Purpose |
|---------|---------|
| OpenCV | Decode, undistort, homography support, image metrics |
| Camera SDK / V4L2 / GigE SDK | Live camera access once interface is selected |
**Error Handling Strategy**:
- Camera or decode errors emit degraded quality and FDR events.
- Total occlusion or blackout sets `usable_for_vio=false`, bypasses BASALT for that frame, and emits a degradation signal to the safety/anchor wrapper.
- Missing calibration blocks production startup.
- ADTi public spec mismatch is tracked as a verification blocker until manufacturer spec is pinned.
## 5. Extensions and Helpers
| Helper | Purpose | Used By |
|--------|---------|---------|
| `geo_geometry_helper` | Coordinate transforms, GSD, WGS84/local conversions | Camera ingest, safety wrapper, Tile Manager |
## 6. Caveats & Edge Cases
**Known limitations**:
- Public ADTi pages list 2 fps continuous capture and -10..40 C operating range; project assumptions need manufacturer verification.
- Live camera interface is TBD.
- Total occlusion detection must be conservative: false positives cause temporary IMU-only degradation, while false negatives can feed unusable frames into VIO.
**Performance bottlenecks**:
- Full-resolution preprocessing must stay inside the <400 ms p95 pipeline budget.
## 7. Dependency Graph
**Must be implemented after**: none.
**Can be implemented in parallel with**: Tile Manager, MAVLink/GCS integration.
**Blocks**: VIO adapter, anchor verification, generated tile lifecycle.
## 8. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | Camera unavailable or calibration missing | `camera_calibration_missing id=...` |
| WARN | Frame degraded or occluded | `frame_quality_degraded occlusion=total_occlusion blur=... texture=...` |
| INFO | Calibration loaded | `camera_calibration_loaded version=...` |
**Log format**: FDR structured event.
**Log storage**: FDR segment and health log.
@@ -1,139 +0,0 @@
# Test Specification — Camera Ingest And Calibration
## Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-2.1a | VO registration uses only normal usable frames | IT-01, AT-01 | Covered |
| AC-3.5 | Visual blackout switches to degraded mode quickly | IT-02, AT-02 | Covered |
| AC-4.1 | Capture-to-output latency budget | PT-01 | Covered |
| AC-4.2 | Jetson memory budget | PT-01 | Covered |
| AC-8.4 | Mid-flight tile generation input eligibility | IT-03 | Covered |
| AC-8.5 | No raw frame retention | ST-01 | Covered |
| AC-NEW-8 | Full blackout/occlusion degraded-mode trigger | IT-02, AT-02 | Covered |
## Blackbox Tests
### IT-01: Usable Frame Classification
**Summary**: Verify normal nadir frames are marked usable for VIO and anchor matching.
**Traces to**: AC-2.1a
**Input data**: Project still images plus calibration metadata.
**Expected result**: `FramePacket.quality.usable_for_vio=true`, `usable_for_anchor=true`, and `occlusion.status=clear` for normal daytime textured frames.
**Max execution time**: 100 ms per frame.
**Dependencies**: Calibration file, replay fixture.
---
### IT-02: Total Occlusion Gate Before VIO
**Summary**: Verify total occlusion is detected before BASALT receives the frame.
**Traces to**: AC-3.5, AC-NEW-8
**Input data**: Black/white/covered-lens frames, corrupt frame fixture, low-texture whiteout frames.
**Expected result**: `occlusion.status` is `total_occlusion` or `blackout`, `usable_for_vio=false`, `usable_for_anchor=false`, and a degradation signal is emitted to the safety wrapper.
**Max execution time**: 100 ms per frame.
**Dependencies**: Safety wrapper test double.
---
### IT-03: Tile Generation Eligibility Metadata
**Summary**: Verify frame metadata required for generated tiles is available without persisting raw frames.
**Traces to**: AC-8.4, AC-8.5
**Input data**: Valid frame, altitude hint, calibration, pose fixture.
**Expected result**: Frame metadata supports orthorectification handoff; raw frame is not retained after processing outside allowed FDR thumbnail exception.
**Max execution time**: 100 ms per frame.
**Dependencies**: Tile Manager test double.
## Performance Tests
### PT-01: Ingest And Quality Latency
**Summary**: Verify decode, calibration lookup, occlusion detection, and quality classification stay inside the system latency budget.
**Traces to**: AC-4.1, AC-4.2
**Load scenario**:
- Input rate: target camera/replay rate.
- Duration: 30 minutes replay.
- Dataset: project still images plus synthetic occlusion frames.
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Per-frame ingest p95 | <=100 ms | >150 ms |
| Memory contribution | <=1 GB | >1.5 GB |
| Dropped frames | <=10% under sustained load | >10% |
**Resource limits**: Jetson shared memory remains below system 8 GB cap.
## Security Tests
### ST-01: Raw Frame Retention Check
**Summary**: Verify normal operation does not persist raw navigation frames.
**Traces to**: AC-8.5
**Attack vector**: Sensitive raw imagery remains on disk after processing.
**Test procedure**:
1. Run replay with normal and failed tile-generation frames.
2. Inspect output directories and FDR artifacts.
**Expected behavior**: Only metadata, generated tiles, and allowed low-rate failed-frame thumbnails are retained.
**Pass criteria**: No raw full-resolution frames remain after teardown.
## Acceptance Tests
### AT-01: Normal Frame Enters VIO
**Summary**: Confirm normal usable frames are passed to BASALT.
**Traces to**: AC-2.1a
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Feed a calibrated normal frame | Occlusion status is `clear` |
| 2 | Process quality gate | Frame is emitted to VIO adapter |
---
### AT-02: Occluded Frame Bypasses VIO
**Summary**: Confirm full occlusion triggers IMU-only degraded path.
**Traces to**: AC-3.5, AC-NEW-8
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Feed total-occlusion frame | `usable_for_vio=false` |
| 2 | Observe downstream routing | BASALT is bypassed and safety wrapper receives degradation signal |
## Test Data Management
| Data Set | Description | Source | Size |
|----------|-------------|--------|------|
| `project_60_still_images` | Normal nadir frames | `_docs/00_problem/input_data/` | Project data |
| `synthetic_occlusion_frames` | Black/white/corrupt/whiteout frames | Generated fixture | Small |
**Setup procedure**: Load calibration fixture and mount read-only frame data.
**Teardown procedure**: Remove run-scoped temp directories and verify no raw-frame persistence.
**Data isolation strategy**: Each run writes to a unique `test-results/<run-id>/` directory.
@@ -1,92 +0,0 @@
# VIO Adapter
## 1. High-Level Overview
**Purpose**: Wrap the selected relative VIO backend as a replaceable component that consumes calibrated frames and FC IMU data, then emits relative pose/velocity/bias state and tracking quality.
**Architectural Pattern**: Adapter / anti-corruption layer.
**Upstream dependencies**: Camera ingest/calibration, MAVLink telemetry stream.
**Downstream consumers**: Safety/anchor wrapper, FDR, separate e2e test suite.
## 2. Internal Interfaces
### Interface: `VioAdapter`
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `initialize` | `VioInitRequest` | `VioInitResult` | No | `CalibrationInvalid`, `DatasetUnsupported` |
| `process` | `VioInputPacket` | `VioStatePacket` | Yes | `TrackingLost`, `TimestampMismatch` |
| `health` | none | `VioHealth` | No | none |
**Input DTOs**:
```yaml
VioInputPacket:
frame: FramePacket
imu_samples: list[TelemetrySample]
attitude_sample: TelemetrySample optional
```
**Output DTOs**:
```yaml
VioStatePacket:
timestamp_ns: integer
relative_pose: transform
velocity: vector3
bias_estimate: object optional
tracking_quality: number
completed: boolean
covariance_hint: matrix optional
```
## 3. Data Access Patterns
No persistent production data ownership. Reads calibration/config at startup and emits state to downstream consumers.
## 4. Implementation Details
**State Management**: Owns selected VIO backend runtime state and resets only through explicit wrapper command.
**Key Dependencies**:
| Library | Purpose |
|---------|---------|
| BASALT | Current selected relative visual-inertial odometry backend |
| Eigen/Sophus or backend-native math stack | Pose and transform representation |
**Error Handling Strategy**:
- Tracking loss is surfaced to the safety/anchor wrapper, not hidden.
- Timestamp/camera-IMU sync violations fail the packet and are logged.
- The adapter never emits WGS84 coordinates; absolute semantics belong to the wrapper.
## 5. Caveats & Edge Cases
**Known limitations**:
- BASALT has no special fixed-wing nadir mode; validation must prove fit under low-parallax/planar terrain.
- Backend covariance/confidence output is not the product authority; wrapper calibration is required.
**Performance bottlenecks**:
- Native VIO runtime and image resolution can exceed Jetson budget if not tuned.
## 6. Dependency Graph
**Must be implemented after**: Camera ingest/calibration, MAVLink telemetry DTO definitions.
**Can be implemented in parallel with**: Satellite Service, Tile Manager.
**Blocks**: Safety/anchor wrapper final integration.
## 7. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | VIO backend initialization fails | `vio_init_failed reason=...` |
| WARN | Tracking quality drops | `vio_tracking_degraded quality=...` |
| INFO | VIO reset/reinitialized | `vio_reset cause=...` |
**Log format**: FDR structured event.
**Log storage**: FDR segment.
@@ -1,141 +0,0 @@
# Test Specification — VIO Adapter
## Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-1.3 | Drift between anchors and anchor-age reporting input | IT-02, AT-01 | Covered |
| AC-2.1a | >95% VO registration on normal segments | IT-01, AT-01 | Covered |
| AC-2.2 | <1.0 px VO MRE | IT-01 | Covered |
| AC-3.1 | Handles 350 m outliers/tilt | IT-03 | Covered |
| AC-3.2 | Sharp turns trigger relocalization path | IT-04 | Covered |
| AC-3.4 | Loss threshold feeds relocalization/dead reckoning | IT-04 | Covered |
| AC-4.1 | Hot-path latency | PT-01 | Covered |
| AC-4.2 | Jetson memory budget | PT-01 | Covered |
## Blackbox Tests
### IT-01: Public Dataset VIO Replay
**Summary**: Verify the VIO adapter produces relative motion for synchronized camera/IMU replay.
**Traces to**: AC-2.1a, AC-2.2
**Input data**: Derkachi cropped nadir video + `SCALED_IMU2` + `GLOBAL_POSITION_INT`, MUN-FRL preferred slice, or representative synchronized nav-camera + IMU + ground truth.
**Expected result**: VO registration succeeds for >95% of normal usable frames; frame-to-frame MRE <1.0 px where ground-truth/feature evaluation supports it. Derkachi runs are accepted as calibration-limited until intrinsics, distortion, and camera-to-body transform are pinned.
**Max execution time**: Dataset-dependent; report per-frame latency.
**Dependencies**: Camera ingest, MAVLink telemetry/replay, calibration fixtures.
---
### IT-02: Relative Drift Reporting
**Summary**: Verify adapter emits state needed for wrapper drift and anchor-age accounting.
**Traces to**: AC-1.3
**Input data**: Segment with two known satellite anchors and IMU samples.
**Expected result**: Adapter emits continuous `VioStatePacket` values with timestamps and quality, enabling wrapper to compare VO extrapolation to next anchor.
**Max execution time**: Dataset-dependent.
**Dependencies**: Safety wrapper test harness.
---
### IT-03: Tilt/Outlier Robustness
**Summary**: Verify adapter reports degraded tracking without false success under tilt/outlier cases.
**Traces to**: AC-3.1
**Input data**: Replay segment with synthetic +/-20 degree tilt and up to 350 m apparent outlier.
**Expected result**: Adapter either tracks with quality metadata or emits `TrackingLost`; it never hides a failure as high-quality VIO.
**Max execution time**: 15 minutes per fixture.
---
### IT-04: Sharp Turn / Loss Signal
**Summary**: Verify sharp turns and disconnected visual overlap produce wrapper-visible failure signals.
**Traces to**: AC-3.2, AC-3.4
**Input data**: <5% overlap sequence with heading change <70 degrees.
**Expected result**: Adapter emits low tracking quality or `TrackingLost` within the loss window, allowing relocalization trigger.
**Max execution time**: 10 minutes.
## Performance Tests
### PT-01: VIO Adapter Runtime Budget
**Summary**: Verify VIO processing does not consume the full <400 ms system p95 budget.
**Traces to**: AC-4.1, AC-4.2
**Load scenario**:
- Input: Derkachi synchronized replay and public/representative replay.
- Duration: 30 minutes plus release long-run slice.
- Target: Jetson Orin Nano Super.
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Adapter p95 latency | <=250 ms | >300 ms |
| Memory contribution | <=3 GB | >4 GB |
| Tracking failure on normal segments | <5% | >=5% |
**Resource limits**: Total system memory remains below 8 GB.
## Security Tests
### ST-01: Timestamp Injection Rejection
**Summary**: Verify malformed or non-monotonic timestamps do not produce trusted VIO state.
**Traces to**: AC-NEW-4
**Attack vector**: Replay or telemetry timestamp manipulation.
**Test procedure**:
1. Feed non-monotonic frame and IMU timestamps.
2. Observe adapter output.
**Expected behavior**: Adapter returns `TimestampMismatch` or low-quality failure; wrapper does not trust the state.
**Pass criteria**: No high-quality VIO state is emitted from malformed timing.
## Acceptance Tests
### AT-01: Normal VIO State Contract
**Summary**: Confirm adapter output contract supports downstream localization.
**Traces to**: AC-1.3, AC-2.1a
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Initialize with calibrated frame/IMU config | `VioInitResult` succeeds |
| 2 | Replay normal frames | `VioStatePacket` includes timestamp, relative pose, velocity, tracking quality |
| 3 | End segment | State stream is continuous enough for wrapper drift accounting |
## Test Data Management
| Data Set | Description | Source | Size |
|----------|-------------|--------|------|
| `derkachi_video_telemetry` | Cropped nadir MP4 + synchronized IMU and `GLOBAL_POSITION_INT` trajectory | Project fixture | ~282 MB video + CSV |
| `public_nadir_vio_candidates` | MUN-FRL/ALTO/Kagaru/EPFL slices | Public pinned fixtures | Dataset-dependent |
| `representative_sync_replay` | Target camera + FC IMU + calibrated ground truth | Project collection | TBD |
**Setup procedure**: Pin calibration/extrinsics and mount read-only synchronized replay data.
**Teardown procedure**: Remove generated result reports and adapter temp state.
**Data isolation strategy**: One run directory per dataset slice and configuration hash.
@@ -1,106 +0,0 @@
# Safety And Anchor Wrapper
## 1. High-Level Overview
**Purpose**: Own the authoritative localization state, confidence calibration, source labels, anchor fusion, degraded modes, tile-write gates, and MAVLink output semantics.
**Architectural Pattern**: Stateful coordinator / safety facade.
**Upstream dependencies**: VIO adapter, anchor verification, MAVLink telemetry, camera quality reports.
**Downstream consumers**: MAVLink/GCS integration, FDR, Tile Manager, separate e2e test suite.
## 2. Internal Interfaces
### Interface: `LocalizationStateMachine`
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `update_vio` | `VioStatePacket` | `PositionEstimate` | Yes | `StateInconsistent` |
| `consider_anchor` | `AnchorDecision` | `AnchorAcceptanceResult` | No | `AnchorRejected` |
| `degrade` | `DegradationSignal` | `PositionEstimate` | No | none |
| `propagate_imu_only` | `ImuOnlyPropagationRequest` | `PositionEstimate` | Yes | `InsufficientTelemetry` |
| `tile_write_eligibility` | `FramePacket` | `TileWriteDecision` | No | none |
**Input DTOs**:
```yaml
AnchorDecision:
candidate_id: string
timestamp_ns: integer
estimated_pose_wgs84: object
inlier_count: integer
mre_px: number
tile_freshness_status: enum
provenance_status: enum
DegradationSignal:
type: enum(total_occlusion, visual_blackout, gps_spoofing, covariance_growth, tracking_lost)
timestamp_ns: integer
ImuOnlyPropagationRequest:
last_trusted_estimate: PositionEstimate
imu_samples: list[TelemetrySample]
elapsed_blackout_ms: integer
reason: enum(total_occlusion, visual_blackout, tracking_lost)
```
**Output DTOs**:
```yaml
PositionEstimate:
timestamp_ns: integer
lat_deg: number
lon_deg: number
alt_msl_m: number
covariance_95_semi_major_m: number
source_label: enum(satellite_anchored, vo_extrapolated, dead_reckoned)
fix_type: integer
horiz_accuracy_m: number
last_satellite_anchor_age_ms: integer
```
## 3. Data Access Patterns
No direct tile/image storage ownership. Writes all decisions to FDR via observability component.
## 4. Implementation Details
**State Management**: Owns the authoritative state machine and covariance growth model.
**Error Handling Strategy**:
- Reject uncertain anchors by default.
- Never emit optimistic accuracy when confidence is degraded.
- On total occlusion or visual blackout, do not call VIO for that frame; propagate from the last trusted state with IMU-only dynamics, set `source_label=dead_reckoned`, and grow covariance monotonically.
- If covariance or blackout thresholds exceed AC limits, emit no-fix/failsafe semantics.
- Treat cache freshness and provenance as evidence carried by `AnchorDecision`; do not call the Tile Manager directly during anchor acceptance.
## 5. Caveats & Edge Cases
**Known limitations**:
- Final covariance calibration requires representative synchronized data.
- The wrapper must stay independent of BASALT internals so VIO can be replaced.
- IMU-only propagation is an emergency bridge, not a reliable long-duration localization mode; the spec requires failsafe/no-fix once time or covariance thresholds are exceeded.
**Potential race conditions**:
- A delayed anchor result must be checked against current timestamp/state before acceptance.
## 6. Dependency Graph
**Must be implemented after**: VIO DTOs, anchor DTOs, MAVLink output contract.
**Can be implemented in parallel with**: FDR schema after DTOs stabilize.
**Blocks**: MAVLink production output, tile-write lifecycle, end-to-end validation.
## 7. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | State invariant violation | `localization_state_inconsistent reason=...` |
| WARN | Anchor rejected or mode degraded | `anchor_rejected reason=mahalanobis_gate` |
| INFO | Mode transition | `source_label_changed from=vo_extrapolated to=dead_reckoned reason=total_occlusion` |
**Log format**: FDR structured event.
**Log storage**: FDR segment and QGC status for operator-critical events.
@@ -1,208 +0,0 @@
# Test Specification — Safety And Anchor Wrapper
## Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-1.1 | >=80% within 50 m | AT-01 | Covered |
| AC-1.2 | >=50% within 20 m | AT-01 | Covered |
| AC-1.3 | Drift and anchor age | IT-01 | Covered |
| AC-1.4 | Quantitative confidence + label | IT-02, AT-02 | Covered |
| AC-3.4 | Relocalization after no-position window | IT-03 | Covered |
| AC-3.5 | Blackout to dead reckoned | IT-04, AT-03 | Covered |
| AC-4.3 | GPS_INPUT semantics source fields | IT-05 | Covered |
| AC-4.4 | Frame-by-frame streaming | PT-01 | Covered |
| AC-4.5 | Correction updates | IT-06 | Covered |
| AC-5.1 | Initialize from FC state | IT-07 | Covered |
| AC-5.2 | >3 s no-estimate fallback | IT-03 | Covered |
| AC-5.3 | Reinitialize after reboot | IT-07 | Covered |
| AC-NEW-2 | Spoofing promotion <3 s | IT-05 | Covered |
| AC-NEW-4 | False-position safety budget | ST-01, AT-02 | Covered |
| AC-NEW-8 | IMU-only blackout thresholds | IT-04, AT-03 | Covered |
## Blackbox Tests
### IT-01: Drift And Anchor Age Accounting
**Summary**: Verify VO extrapolation drift and anchor age are tracked per estimate.
**Traces to**: AC-1.3
**Input data**: VIO state stream with two accepted anchor decisions.
**Expected result**: Every `PositionEstimate` includes `last_satellite_anchor_age_ms`; drift at next anchor is measured and binned by age.
**Max execution time**: 5 minutes.
---
### IT-02: Confidence Output Contract
**Summary**: Verify every estimate has covariance and source label.
**Traces to**: AC-1.4
**Input data**: satellite-anchored, VO-extrapolated, and dead-reckoned state fixtures.
**Expected result**: Each output contains `covariance_95_semi_major_m`, `source_label`, `fix_type`, and `horiz_accuracy_m`; `horiz_accuracy_m` never under-reports covariance.
**Max execution time**: 2 minutes.
---
### IT-03: No-Position Relocalization Trigger
**Summary**: Verify no position for >=3 frames and >=2 s triggers relocalization/degraded behavior.
**Traces to**: AC-3.4, AC-5.2
**Input data**: VIO loss sequence with no accepted anchors.
**Expected result**: Relocalization request is emitted and wrapper continues dead reckoning until fail threshold.
**Max execution time**: 5 minutes.
---
### IT-04: Total Blackout IMU-Only Propagation
**Summary**: Verify total occlusion produces honest IMU-only estimates and failsafe thresholds.
**Traces to**: AC-3.5, AC-NEW-8
**Input data**: Last trusted estimate, IMU/attitude/airspeed/altitude stream, total-occlusion signal for 5 s, 15 s, and 35 s.
**Expected result**: Wrapper emits `dead_reckoned` within <=1 frame or <=400 ms, covariance grows monotonically, and no-fix/failsafe is emitted when blackout >30 s or covariance >500 m.
**Max execution time**: 10 minutes.
---
### IT-05: Spoofing Promotion And GPS_INPUT Mapping
**Summary**: Verify wrapper promotes own estimate and maps output fields correctly under GPS spoofing.
**Traces to**: AC-4.3, AC-NEW-2
**Input data**: Plane SITL spoofing telemetry and trusted wrapper estimate.
**Expected result**: Own estimate is promoted within <3 s; v1 emits `GPS_INPUT` only; source labels and accuracy fields are correct.
**Max execution time**: 10 minutes.
---
### IT-06: Correction Update
**Summary**: Verify refined estimates can correct previous positions.
**Traces to**: AC-4.5
**Input data**: VO estimate followed by accepted satellite anchor for same segment.
**Expected result**: Wrapper emits updated estimate/correction with improved source label and covariance.
**Max execution time**: 5 minutes.
---
### IT-07: Initialization And Reboot Recovery
**Summary**: Verify wrapper initializes from FC state and can reinitialize after reboot.
**Traces to**: AC-5.1, AC-5.3
**Input data**: FC EKF position, IMU-extrapolated state, simulated companion restart.
**Expected result**: Wrapper resumes from current FC state and reports degraded confidence until re-anchored.
**Max execution time**: 5 minutes.
## Performance Tests
### PT-01: Wrapper Streaming Overhead
**Summary**: Verify state-machine processing does not delay frame-by-frame output.
**Traces to**: AC-4.4
**Load scenario**:
- Input: 3 Hz estimate stream with anchor and blackout events.
- Duration: 8-hour synthetic run.
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Wrapper p95 processing | <=25 ms | >50 ms |
| Missed frame outputs | 0 except skip-allowed upstream drops | Any silent batch/delay |
**Resource limits**: Negligible memory growth across run.
## Security Tests
### ST-01: False Anchor Rejection
**Summary**: Verify impossible anchors do not become trusted estimates.
**Traces to**: AC-NEW-4
**Attack vector**: Anchor decision with plausible inliers but impossible jump or stale provenance.
**Test procedure**:
1. Feed an anchor >1 km from predicted state with low covariance.
2. Feed stale/provenance-failed anchor evidence.
**Expected behavior**: Anchor is rejected, FDR logs reason, source label remains degraded or VO extrapolated.
**Pass criteria**: 0 accepted impossible/stale anchors.
## Acceptance Tests
### AT-01: Position Accuracy Aggregation
**Summary**: Verify wrapper outputs can be scored against frame-center thresholds.
**Traces to**: AC-1.1, AC-1.2
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Replay mapped frame-center estimates | Report computes >=80% within 50 m and >=50% within 20 m |
| 2 | Include source labels | Accuracy is binned by source label |
---
### AT-02: Confidence Honesty
**Summary**: Verify reported confidence is conservative relative to measured error.
**Traces to**: AC-1.4, AC-NEW-4
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Replay ground-truth trajectory | Measured error is not systematically above reported covariance |
| 2 | Inject over-confident anchors | Mahalanobis gate rejects them |
---
### AT-03: Blackout Failsafe
**Summary**: Verify operator-visible blackout/failsafe behavior.
**Traces to**: AC-3.5, AC-NEW-8
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Inject total visual blackout | `dead_reckoned` within <=400 ms |
| 2 | Continue >30 s or covariance >500 m | `fix_type=0`, `horiz_accuracy=999.0`, QGC failsafe status |
## Test Data Management
| Data Set | Description | Source | Size |
|----------|-------------|--------|------|
| `wrapper_state_fixtures` | VIO states, anchors, blackouts, spoofing signals | Generated fixtures | Small |
| `representative_replay` | Target synchronized replay and ground truth | Project collection | TBD |
**Setup procedure**: Load deterministic VIO/anchor/telemetry fixtures and run with isolated PostgreSQL schema.
**Teardown procedure**: Drop run schema and remove FDR segments.
**Data isolation strategy**: Use per-run mission IDs and database schemas.
@@ -1,102 +0,0 @@
# Satellite Service
## 1. High-Level Overview
**Purpose**: Own the onboard boundary to the suite Satellite Service: import pre-flight mission cache packages, upload generated-tile packages after flight, and convert query frames into ranked local VPR candidates using preloaded DINOv2-VLAD descriptors and FAISS.
**Architectural Pattern**: Offline sync gateway + local retrieval index adapter.
**Upstream dependencies**: Camera ingest/calibration, Tile Manager, safety/anchor wrapper, Azaion Suite Satellite Service before/after flight.
**Downstream consumers**: Anchor verification, FDR.
## 2. Internal Interfaces
### Interface: `SatelliteService`
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `import_mission_cache` | `CacheImportRequest` | `CacheImportResult` | Yes | `SyncUnavailable`, `PackageInvalid` |
| `upload_generated_tiles` | `GeneratedTileUploadRequest` | `GeneratedTileUploadResult` | Yes | `SyncUnavailable`, `PackageRejected` |
| `retrieve` | `RetrievalRequest` | `RetrievalResult` | Yes | `IndexUnavailable`, `DescriptorFailed` |
| `load_index` | `IndexLoadRequest` | `IndexStatus` | No | `ManifestInvalid`, `IndexUnavailable` |
**Input DTOs**:
```yaml
RetrievalRequest:
frame: FramePacket
prior_estimate: PositionEstimate optional
search_radius_m: number optional
max_candidates: integer
```
**Output DTOs**:
```yaml
RetrievalResult:
query_descriptor_id: string
candidates: list[VprCandidate]
VprCandidate:
chunk_id: string
tile_id: string
score: number
footprint: geometry
freshness_status: enum
```
## 3. Data Access Patterns
| Query | Frequency | Hot Path | Index Needed |
|-------|-----------|----------|--------------|
| Top-K FAISS search | Triggered only | No steady-state | FAISS index |
| Import/export package sync | Pre-flight / post-flight only | No mid-flight | Package manifest and sidecar hashes |
| Load chunk metadata | Per candidate | No | PostgreSQL/PostGIS spatial and chunk indexes |
## 4. Implementation Details
**State Management**: Holds loaded descriptor model and FAISS index handles; tracks pre-flight import and post-flight upload package status.
**Key Dependencies**:
| Library | Purpose |
|---------|---------|
| DINOv2 / ONNX / TensorRT candidate path | Query descriptor extraction |
| FAISS CPU | Top-K retrieval |
| Satellite Service client | Pre-flight cache import and post-flight generated-tile upload |
**Error Handling Strategy**:
- If descriptor extraction or index load fails, return no candidates and trigger degraded mode.
- Optimized engines are allowed only after descriptor-fidelity tests pass.
- Network/package sync failures are allowed only before takeoff or after landing; during flight, the component must never call a satellite provider or suite service.
## 5. Caveats & Edge Cases
**Known limitations**:
- VPR result is only a candidate, never an accepted fix.
- Cross-domain retrieval can be wrong under seasonal, lighting, or terrain ambiguity.
- External Satellite Service availability cannot be part of the mid-flight localization safety case.
**Performance bottlenecks**:
- Descriptor extraction on Jetson must be trigger-limited and profiled separately from BASALT.
## 6. Dependency Graph
**Must be implemented after**: cache manifest/index schema, camera frame DTOs.
**Can be implemented in parallel with**: anchor verification.
**Blocks**: satellite relocalization flow.
## 7. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | Index unavailable | `faiss_index_unavailable id=...` |
| WARN | No candidates | `vpr_no_candidates frame_id=...` |
| INFO | Retrieval invoked | `vpr_query candidates=... latency_ms=...` |
**Log format**: FDR structured event.
**Log storage**: FDR segment.
@@ -1,172 +0,0 @@
# Test Specification — Satellite Service
## Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-3.2 | Sharp-turn relocalization | IT-02, AT-01 | Covered |
| AC-3.3 | >=3 disconnected segments | IT-03 | Covered |
| AC-3.4 | Relocalization request after loss | IT-02 | Covered |
| AC-4.1 | Trigger path latency | PT-01 | Covered |
| AC-4.2 | Memory budget | PT-01 | Covered |
| AC-8.1 | Cache imagery interface | IT-01 | Covered |
| AC-8.3 | Preloaded/preprocessed cache | IT-01 | Covered |
| AC-8.6 | VPR chunks, multi-scale, dynamic K | IT-01, IT-04 | Covered |
| AC-NEW-1 | Cold-start first fix support | PT-02 | Covered |
| AC-NEW-6 | Freshness-aware retrieval | IT-04, ST-01 | Covered |
## Blackbox Tests
### IT-01: Index Load And Chunk Coverage
**Summary**: Verify preloaded VPR chunks and FAISS index cover the operational area.
**Traces to**: AC-8.1, AC-8.3, AC-8.6
**Input data**: PostgreSQL/PostGIS cache manifest, VPR chunk metadata, FAISS index.
**Expected result**: Every test frame footprint falls inside at least one VPR chunk; fine and coarse descriptors are present where required.
**Max execution time**: 2 minutes per mission cache.
---
### IT-02: Sharp-Turn Local Retrieval Trigger
**Summary**: Verify sharp-turn state requests candidates rather than relying on frame-to-frame VO.
**Traces to**: AC-3.2, AC-3.4
**Input data**: Wrapper relocalization request with sharp-turn/loss reason.
**Expected result**: Satellite Service returns bounded top-K candidates from preloaded local indexes based on sector/covariance policy.
**Max execution time**: 2 seconds per query.
---
### IT-03: Disconnected Segment Retrieval
**Summary**: Verify at least three disconnected segments can each retrieve candidate chunks.
**Traces to**: AC-3.3
**Input data**: Three disconnected query frames with approximate prior/covariance.
**Expected result**: Each query returns a candidate set including the ground-truth region when covered by the cache fixture.
**Max execution time**: Dataset-dependent.
---
### IT-04: Dynamic K And Freshness Filter
**Summary**: Verify K varies by sector and covariance, and stale candidates are tagged.
**Traces to**: AC-8.6, AC-NEW-6
**Input data**: Stable and active-conflict sector cache fixtures with fresh/stale tiles.
**Expected result**: K=5 for stable low-covariance, K=20 for active-conflict, K=50 fallback; stale candidates are flagged for rejection/down-confidence.
**Max execution time**: 2 seconds per query.
## Performance Tests
### PT-01: Retrieval Query Runtime
**Summary**: Verify descriptor extraction and FAISS query fit trigger-path budget.
**Traces to**: AC-4.1, AC-4.2
**Load scenario**:
- Query set: 100 representative relocalization frames.
- Environment: Jetson and replay workstation.
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Retrieval p95 | <=300 ms trigger path share | >400 ms |
| Memory contribution | <=2 GB | >3 GB |
| Candidate count policy | Exact | Any wrong K |
**Resource limits**: Total system memory remains below 8 GB.
---
### PT-02: Cold-Start Index Load
**Summary**: Verify retrieval readiness supports first fix <30 s.
**Traces to**: AC-NEW-1
**Load scenario**:
- Cold boot 50 runs.
- Cache/index mounted locally.
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Index ready p95 | <=10 s | >15 s |
| First retrieval p95 | Fits <30 s first-fix budget | Exceeds budget |
## Security Tests
### ST-01: Stale Candidate Handling
**Summary**: Verify stale imagery cannot silently appear as a trusted retrieval candidate.
**Traces to**: AC-NEW-6
**Attack vector**: Manipulated cache manifest marks stale tile as available.
**Test procedure**:
1. Load cache fixture with stale capture dates.
2. Query against stale region.
**Expected behavior**: Candidate carries stale status; anchor path cannot accept it as `satellite_anchored`.
**Pass criteria**: 0 stale candidates without explicit stale/down-confidence metadata.
---
### ST-02: No Mid-Flight Satellite Service Calls
**Summary**: Verify relocalization never performs satellite-provider or suite Satellite Service network calls during flight.
**Traces to**: AC-8.3, R-SAT-01
**Attack vector**: Runtime attempts to fetch missing cache/index data over the network during relocalization.
**Test procedure**:
1. Disable external network access during a replay scenario.
2. Trigger relocalization against preloaded cache fixtures.
3. Inspect network call logs and Satellite Service client telemetry.
**Expected behavior**: Retrieval uses only mounted local cache/index data; missing data produces degraded/no-candidate behavior, not a network fetch.
**Pass criteria**: 0 mid-flight Satellite Service or satellite-provider calls.
## Acceptance Tests
### AT-01: Relocalization Candidate Returned
**Summary**: Verify a relocalization request returns usable candidates for anchor verification.
**Traces to**: AC-3.2, AC-8.6
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Submit sharp-turn query | Retrieval invokes VPR |
| 2 | Read output | Candidate list includes chunk IDs, tile IDs, scores, footprints, freshness |
## Test Data Management
| Data Set | Description | Source | Size |
|----------|-------------|--------|------|
| `cache_vpr_fixture` | PostGIS manifest, COGs, descriptors, FAISS index | Generated/cache fixture | Mission-dependent |
| `aerial_vpr_queries` | Aerial query frames with ground-truth regions | ALTO/AerialVL/representative | Dataset-dependent |
**Setup procedure**: Restore isolated PostgreSQL schema and mount read-only descriptor/index files.
**Teardown procedure**: Drop schema and remove generated retrieval reports.
**Data isolation strategy**: Per-run schema and read-only cache fixture volume.
@@ -1,93 +0,0 @@
# Anchor Verification
## 1. High-Level Overview
**Purpose**: Verify retrieved cache candidates with local feature matching and geometric checks before the safety wrapper considers an absolute anchor.
**Architectural Pattern**: Validation pipeline.
**Upstream dependencies**: Satellite Service, camera ingest/calibration, Tile Manager.
**Downstream consumers**: Safety/anchor wrapper, FDR.
## 2. Internal Interfaces
### Interface: `AnchorVerifier`
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `verify` | `AnchorVerificationRequest` | `AnchorDecision` | Yes | `TileUnavailable`, `MatchFailed`, `GeometryFailed` |
| `benchmark_matcher` | `MatcherBenchmarkRequest` | `MatcherBenchmarkReport` | Yes | `ModelUnavailable` |
**Input DTOs**:
```yaml
AnchorVerificationRequest:
frame: FramePacket
candidates: list[VprCandidate]
matcher_profile: enum(aliked, disk, sift_orb_baseline)
```
**Output DTOs**:
```yaml
AnchorDecision:
candidate_id: string
accepted_by_geometry: boolean
estimated_pose_wgs84: object optional
inlier_count: integer
mre_px: number
homography: matrix optional
rejection_reason: string optional
```
## 3. Data Access Patterns
| Query | Frequency | Hot Path | Index Needed |
|-------|-----------|----------|--------------|
| Read candidate COG footprint/window | Triggered only | No | Tile spatial metadata |
| Read matcher model | Startup | No | No |
## 4. Implementation Details
**State Management**: Loads matcher/extractor models and tracks benchmark-selected profile.
**Key Dependencies**:
| Library | Purpose |
|---------|---------|
| ALIKED/DISK + LightGlue | Learned local matching |
| OpenCV | RANSAC/USAC geometry and error metrics |
**Error Handling Strategy**:
- Low inlier count, high MRE, stale tile, or provenance failure returns a rejected decision with reason.
- SuperPoint can be benchmarked only if legal approval allows its license use.
## 5. Caveats & Edge Cases
**Known limitations**:
- ALIKED-LightGlue is not VIO by itself; it supplies correspondences. A full VIO path still needs state estimation and IMU fusion.
- Optional frame-to-frame VO fallback must be benchmarked separately from cross-domain anchor verification.
**Performance bottlenecks**:
- Learned matching can exceed the Jetson budget if run per-frame; default invocation is trigger-based.
## 6. Dependency Graph
**Must be implemented after**: Satellite Service candidate DTOs, Tile Manager tile access.
**Can be implemented in parallel with**: VIO adapter.
**Blocks**: accepted satellite-anchor path.
## 7. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | Matcher model unavailable | `lightglue_model_unavailable profile=aliked` |
| WARN | Candidate rejected | `anchor_geometry_rejected mre_px=... inliers=...` |
| INFO | Anchor verified | `anchor_verified tile_id=... mre_px=...` |
**Log format**: FDR structured event.
**Log storage**: FDR segment.
@@ -1,124 +0,0 @@
# Test Specification — Anchor Verification
## Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-1.1 | 50 m frame-center accuracy via accepted anchors | AT-01 | Covered |
| AC-1.2 | 20 m stretch accuracy via accepted anchors | AT-01 | Covered |
| AC-2.1b | Satellite-anchor registration measured separately | IT-01 | Covered |
| AC-2.2 | <2.5 px cross-domain MRE | IT-01, AT-01 | Covered |
| AC-3.1 | Outlier rejection | ST-01 | Covered |
| AC-4.1 | Trigger-path latency | PT-01 | Covered |
| AC-4.2 | Memory budget | PT-01 | Covered |
| AC-NEW-4 | False-position safety budget | ST-01 | Covered |
| AC-NEW-6 | Stale imagery rejection evidence | IT-02, ST-02 | Covered |
## Blackbox Tests
### IT-01: Cross-Domain Match And RANSAC
**Summary**: Verify ALIKED/DISK-LightGlue plus RANSAC produces measurable anchor evidence.
**Traces to**: AC-2.1b, AC-2.2
**Input data**: UAV frame, retrieved COG candidate window, ground-truth georegistration.
**Expected result**: Accepted anchors have MRE <2.5 px, sufficient inliers, and homography/pose evidence.
**Max execution time**: 2 seconds per candidate set.
---
### IT-02: Stale Candidate Verification
**Summary**: Verify stale or provenance-failed candidates are rejected or marked unsafe.
**Traces to**: AC-NEW-6
**Input data**: Candidate list with stale tile metadata and valid-looking image content.
**Expected result**: `AnchorDecision` is rejected or carries stale/provenance failure; safety wrapper cannot accept it as trusted.
**Max execution time**: 2 seconds per candidate.
## Performance Tests
### PT-01: Local Matcher Runtime
**Summary**: Verify learned matching stays bounded on Jetson when invoked.
**Traces to**: AC-4.1, AC-4.2
**Load scenario**:
- Candidate sets: K=5, K=20, K=50.
- Matcher profiles: ALIKED, DISK, SIFT/ORB baseline.
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| K=5 p95 | <=300 ms | >500 ms |
| K=20 p95 | Reported and bounded | Unbounded/no report |
| Memory contribution | <=2 GB | >3 GB |
**Resource limits**: Total Jetson shared memory remains below 8 GB.
## Security Tests
### ST-01: False Match / Impossible Jump Rejection
**Summary**: Verify visually plausible but geographically impossible matches are rejected.
**Traces to**: AC-3.1, AC-NEW-4
**Attack vector**: Candidate tile from wrong region with repetitive texture.
**Test procedure**:
1. Pair query with wrong-region candidate.
2. Run matcher and RANSAC.
3. Pass result to safety wrapper fixture.
**Expected behavior**: Anchor is rejected by geometry or downstream consistency gates.
**Pass criteria**: 0 accepted anchors from wrong-region fixtures.
---
### ST-02: Provenance Failure
**Summary**: Verify unsigned/hash-failed candidate tiles cannot produce accepted anchors.
**Traces to**: AC-NEW-6
**Attack vector**: Tampered COG or sidecar.
**Test procedure**: Run verification against tampered tile fixture.
**Expected behavior**: `AnchorDecision` includes provenance failure and is not accepted.
**Pass criteria**: 0 trusted anchors from tampered tiles.
## Acceptance Tests
### AT-01: Accepted Anchor Accuracy
**Summary**: Verify accepted anchors support position accuracy thresholds.
**Traces to**: AC-1.1, AC-1.2, AC-2.2
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Verify satellite candidate against query | MRE <2.5 px |
| 2 | Convert accepted anchor to WGS84 evidence | Error supports 50 m / 20 m aggregate thresholds |
## Test Data Management
| Data Set | Description | Source | Size |
|----------|-------------|--------|------|
| `anchor_match_fixture` | Query frames, COG windows, expected georegistration | ALTO/AerialVL/project cache | Dataset-dependent |
| `tampered_tile_fixture` | Hash/signature/stale cases | Generated fixture | Small |
**Setup procedure**: Load cache fixture and matcher model profile.
**Teardown procedure**: Remove matcher outputs and reports.
**Data isolation strategy**: Read-only imagery with per-run output folders.
@@ -1,92 +0,0 @@
# Tile Manager
## 1. High-Level Overview
**Purpose**: Manage local tiles: service-source COGs, manifests, descriptor metadata, freshness/provenance checks, nadir-image orthorectification into generated tiles, generated tile writes, and post-flight package preparation.
**Architectural Pattern**: Repository + policy gate.
**Upstream dependencies**: Satellite Service cache packages, safety/anchor wrapper, camera ingest/calibration.
**Downstream consumers**: Satellite Service, anchor verification, FDR, post-flight sync.
## 2. Internal Interfaces
### Interface: `TileManager`
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `validate_cache` | `CacheValidationRequest` | `CacheValidationReport` | No | `ManifestInvalid`, `SignatureInvalid` |
| `get_tile_window` | `TileWindowRequest` | `TileWindow` | No | `TileUnavailable`, `TileRejected` |
| `orthorectify_frame` | `TileGenerationRequest` | `GeneratedTileCandidate` | Yes | `TileWriteRejected`, `FrameNotUsable` |
| `write_generated_tile` | `GeneratedTileRequest` | `GeneratedTileRecord` | Yes | `TileWriteRejected`, `StorageFull` |
| `package_sync` | `SyncPackageRequest` | `SyncPackage` | Yes | `PackageFailed` |
## 3. Data Access Patterns
| Query | Frequency | Hot Path | Index Needed |
|-------|-----------|----------|--------------|
| Tile by footprint/time/freshness | Per retrieval/anchor | Yes during relocalization | Spatial/time indexes |
| Descriptor metadata by chunk | Per Satellite Service retrieval | Yes during relocalization | Chunk ID index |
| Generated tile by mission/sector | Post-flight | No | Mission ID index |
### Caching Strategy
| Data | Cache Type | TTL | Invalidation |
|------|------------|-----|--------------|
| Manifest metadata | PostgreSQL/PostGIS query cache / process cache | Mission duration | New mission cache load |
| Sidecar verification | In-memory result cache | Mission duration | File hash change |
### Storage Estimates
| Table/Collection | Est. Row Count | Row Size | Total Size | Growth Rate |
|------------------|----------------|----------|------------|-------------|
| Cache manifest tiles | Mission-dependent | Small metadata | Within ~10 GB package with imagery | Per mission |
| Generated tiles | Flight-dependent | Metadata + COG payload | FDR/cache budget constrained | Per flight |
## 4. Implementation Details
**State Management**: Owns PostgreSQL/PostGIS manifest connection, sidecar verification state, and generated tile staging area.
**Key Dependencies**:
| Library | Purpose |
|---------|---------|
| PostgreSQL + PostGIS | Manifest, spatial metadata, freshness queries, and generated-tile metadata |
| GDAL/rasterio candidate | COG read/write |
| OpenCV/GDAL geometry utilities | Nadir-frame orthorectification into generated COG tiles |
| Cryptographic hash/signature library | Sidecar validation |
**Error Handling Strategy**:
- Invalid signatures/hashes reject tiles.
- Storage-full blocks generated tile writes without affecting localization output.
- Cache validation failure blocks mission cache usage.
## 5. Caveats & Edge Cases
**Known limitations**:
- JSON-only manifests are avoided for scale and queryability, but signed JSON sidecars remain required for audit/interchange.
- PostgreSQL/PostGIS must be available locally before flight; runtime cannot depend on a remote DB link.
**Potential race conditions**:
- Generated tile and PostgreSQL manifest update must be atomic enough to avoid orphan trusted metadata.
## 6. Dependency Graph
**Must be implemented after**: data model schema decisions.
**Can be implemented in parallel with**: camera ingest, MAVLink integration.
**Blocks**: Satellite Service retrieval, anchor verification, generated tile lifecycle.
## 7. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | Cache package invalid | `cache_manifest_invalid reason=signature` |
| WARN | Tile rejected | `tile_rejected reason=stale tile_id=...` |
| INFO | Generated tile staged | `generated_tile_written tile_id=...` |
**Log format**: FDR structured event.
**Log storage**: FDR segment plus cache validation report.
@@ -1,167 +0,0 @@
# Test Specification — Tile Manager
## Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-4.2 | Memory/storage pressure | PT-01 | Covered |
| AC-8.1 | Resolution at cache interface | IT-01 | Covered |
| AC-8.2 | Freshness thresholds | IT-02, ST-01 | Covered |
| AC-8.3 | Preloaded/preprocessed offline cache | IT-01 | Covered |
| AC-8.4 | Mid-flight tile generation/write-back | IT-03, AT-01 | Covered |
| AC-8.5 | Persistent imagery policy | ST-02 | Covered |
| AC-8.6 | VPR chunk metadata | IT-04 | Covered |
| AC-NEW-3 | FDR/tile storage cap interaction | PT-01 | Covered |
| AC-NEW-6 | Imagery freshness enforcement | IT-02, ST-01 | Covered |
| AC-NEW-7 | Cache-poisoning safety budget | ST-03, AT-01 | Covered |
## Blackbox Tests
### IT-01: Mission Cache Validation
**Summary**: Verify preloaded COGs, PostGIS metadata, sidecars, descriptors, and indexes validate before flight.
**Traces to**: AC-8.1, AC-8.3
**Input data**: Mission cache package with COGs, signed JSON sidecars, PostGIS manifest seed, FAISS index files.
**Expected result**: Valid cache passes resolution, hash, signature, descriptor-reference, and spatial coverage checks.
**Max execution time**: 5 minutes per cache fixture.
---
### IT-02: Freshness Gate
**Summary**: Verify active-conflict and stable-rear freshness rules.
**Traces to**: AC-8.2, AC-NEW-6
**Input data**: Tiles at fresh, grace, and stale ages for both sector classes.
**Expected result**: Fresh tiles pass, grace tiles are down-confidence weighted if allowed, stale tiles are rejected and cannot emit `satellite_anchored`.
**Max execution time**: 2 minutes.
---
### IT-03: Generated Tile Write
**Summary**: Verify nadir frames are orthorectified and written as generated tiles only when pose and frame quality gates pass.
**Traces to**: AC-8.4
**Input data**: Frame metadata, pose covariance <=3 m, <=5 m, and >5 m.
**Expected result**: <=3 m writes full-quality candidate, 3-5 m writes soft candidate, >5 m rejects write.
**Max execution time**: 2 minutes.
---
### IT-04: VPR Chunk Metadata
**Summary**: Verify chunk metadata supports retrieval rules.
**Traces to**: AC-8.6
**Input data**: Operational-area cache manifest.
**Expected result**: Chunks are 600-800 m equivalent footprint with 40-50% overlap and multi-scale active-sector descriptors.
**Max execution time**: 2 minutes.
## Performance Tests
### PT-01: Cache And FDR Storage Budget
**Summary**: Verify cache metadata and generated tile writes stay within storage/memory budgets.
**Traces to**: AC-4.2, AC-NEW-3
**Load scenario**:
- Mission cache: up to operational budget.
- Generated tiles: 8-hour synthetic flight.
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Persistent cache | <=10 GB unless split budget approved | >budget without report |
| FDR + generated artifacts | <=64 GB per flight | >64 GB without rollover |
| DB query p95 | <=50 ms for indexed tile lookup | >150 ms |
**Resource limits**: PostgreSQL/PostGIS stays within total system 8 GB memory budget.
## Security Tests
### ST-01: Signed Manifest Enforcement
**Summary**: Verify unsigned/tampered/stale manifests are rejected.
**Traces to**: AC-8.2, AC-NEW-6
**Attack vector**: Tampered sidecar, bad hash, unsigned manifest.
**Test procedure**: Load invalid cache variants and run validation.
**Expected behavior**: Invalid tiles are rejected and logged.
**Pass criteria**: 0 invalid cache entries become available to retrieval/anchor verification.
---
### ST-02: Raw Frame Persistence Check
**Summary**: Verify Tile Manager persists tiles, not raw frames.
**Traces to**: AC-8.5
**Attack vector**: Raw frames accidentally stored as generated artifacts.
**Test procedure**: Run tile generation and inspect cache/FDR outputs.
**Expected behavior**: Only COG tiles, sidecars, manifests, and allowed failed-frame thumbnails exist.
**Pass criteria**: No raw full-resolution frames retained.
---
### ST-03: Cache Poisoning Gate
**Summary**: Verify misaligned generated tiles cannot become trusted basemap.
**Traces to**: AC-NEW-7
**Attack vector**: Over-confident pose writes misaligned generated tile.
**Test procedure**: Inject deflated covariance and wrong pose during tile write.
**Expected behavior**: Tile is rejected or marked candidate/soft; never promoted to trusted by onboard component.
**Pass criteria**: 0 direct trusted basemap promotions onboard.
## Acceptance Tests
### AT-01: Generated Tile Package For Satellite Service
**Summary**: Verify post-flight sync package contains valid generated tiles and metadata.
**Traces to**: AC-8.4, AC-NEW-7
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Orthorectify and write generated candidate tile | COG + sidecar + PostGIS manifest row created |
| 2 | Package post-flight sync | Manifest delta includes trust level and parent covariance |
| 3 | Inspect package | No tile is marked trusted basemap by onboard runtime |
## Test Data Management
| Data Set | Description | Source | Size |
|----------|-------------|--------|------|
| `cache_integrity_fixtures` | Valid/stale/unsigned/hash-mismatched manifests | Generated fixture | Small |
| `mission_cache_fixture` | COGs, descriptors, PostGIS seed | Satellite Service stub | Mission-dependent |
**Setup procedure**: Restore isolated PostgreSQL/PostGIS schema and mount cache fixture read-only except generated-tile staging.
**Teardown procedure**: Drop schema and delete generated staging volume.
**Data isolation strategy**: Per-run mission ID, schema, and staging directory.
@@ -1,69 +0,0 @@
# MAVLink And GCS Integration
## 1. High-Level Overview
**Purpose**: Subscribe to flight-controller telemetry, emit `GPS_INPUT`, and send downsampled QGroundControl status/failsafe messages.
**Architectural Pattern**: Protocol adapter.
**Upstream dependencies**: ArduPilot Plane FC, safety/anchor wrapper.
**Downstream consumers**: VIO adapter, safety/anchor wrapper, QGC, FDR.
## 2. Internal Interfaces
### Interface: `MavlinkGateway`
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `subscribe_telemetry` | `TelemetrySubscriptionRequest` | `TelemetrySample` | Yes | `MavlinkDisconnected` |
| `emit_gps_input` | `PositionEstimate` | `EmitResult` | Yes | `MavlinkDisconnected`, `InvalidGpsInput` |
| `emit_status` | `GcsStatusMessage` | `EmitResult` | Yes | `MavlinkDisconnected` |
## 3. Data Access Patterns
No persistent data ownership; telemetry and emitted packets are mirrored to FDR.
## 4. Implementation Details
**State Management**: Maintains MAVLink connection status, source/system IDs, and rate limiters for QGC status.
**Key Dependencies**:
| Library | Purpose |
|---------|---------|
| MAVSDK | Telemetry subscriptions |
| pymavlink | Exact `GPS_INPUT` field emission |
**Error Handling Strategy**:
- Invalid `GPS_INPUT` fields are rejected before emission.
- Connection loss is surfaced to wrapper/FDR and does not silently drop safety events.
## 5. Caveats & Edge Cases
**Known limitations**:
- v1 emits `GPS_INPUT` only, not velocity-target navigation commands.
- Plane parameter configuration must be validated in SITL before hardware use.
**Performance bottlenecks**:
- Status text must be rate-limited to avoid telemetry noise.
## 6. Dependency Graph
**Must be implemented after**: position estimate DTO and MAVLink output contract.
**Can be implemented in parallel with**: Tile Manager, camera ingest.
**Blocks**: SITL integration and production FC output.
## 7. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | MAVLink disconnected | `mavlink_disconnected endpoint=...` |
| WARN | Invalid output rejected | `gps_input_invalid reason=...` |
| INFO | FC link established | `mavlink_connected system_id=...` |
**Log format**: FDR structured event.
**Log storage**: FDR segment and optional tlog.
@@ -1,176 +0,0 @@
# Test Specification — MAVLink And GCS Integration
## Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-4.3 | v1 GPS_INPUT only for ArduPilot Plane | IT-01, AT-01 | Covered |
| AC-4.4 | Frame-by-frame streaming | PT-01 | Covered |
| AC-4.5 | Updated estimates/corrections | IT-02 | Covered |
| AC-5.1 | FC state initialization telemetry | IT-03 | Covered |
| AC-5.2 | Plane SITL fallback | IT-04 | Covered |
| AC-6.1 | QGC status 1-2 Hz | IT-05, PT-02 | Covered |
| AC-6.2 | GCS command ingress | IT-06, ST-01 | Covered |
| AC-6.3 | WGS84 output | IT-01 | Covered |
| AC-NEW-2 | Spoofing promotion <3 s | IT-04 | Covered |
| AC-NEW-8 | Blackout/failsafe status | IT-05 | Covered |
## Blackbox Tests
### IT-01: GPS_INPUT Field Mapping
**Summary**: Verify `PositionEstimate` maps to valid MAVLink `GPS_INPUT`.
**Traces to**: AC-4.3, AC-6.3
**Input data**: Position estimates across all source labels.
**Expected result**: v1 emits `GPS_INPUT` only, no `ODOMETRY`; WGS84 lat/lon/alt, fix type, ignore flags, and accuracy fields match contract.
**Max execution time**: 2 minutes.
---
### IT-02: Correction Emission
**Summary**: Verify updated estimates can be emitted without batching.
**Traces to**: AC-4.5
**Input data**: Original VO estimate followed by anchor-corrected estimate.
**Expected result**: Both estimates are emitted in order with updated accuracy/source label.
**Max execution time**: 2 minutes.
---
### IT-03: FC Telemetry Subscription
**Summary**: Verify telemetry needed for initialization and VIO is available.
**Traces to**: AC-5.1
**Input data**: Plane SITL or MAVLink replay with EKF position, IMU, attitude, airspeed, altitude.
**Expected result**: Normalized `TelemetrySample` stream includes required fields and timestamps.
**Max execution time**: 5 minutes.
---
### IT-04: Spoofing And Fallback In Plane SITL
**Summary**: Verify spoofing and no-estimate behavior in ArduPilot Plane SITL.
**Traces to**: AC-5.2, AC-NEW-2
**Input data**: Plane SITL production parameter set and spoofing trace.
**Expected result**: Own estimate promotion occurs within <3 s; fallback/no-estimate behavior matches Plane parameters.
**Max execution time**: 10 minutes.
---
### IT-05: QGC Blackout Status
**Summary**: Verify degraded-mode messages are visible at required rate.
**Traces to**: AC-6.1, AC-NEW-8
**Input data**: Safety wrapper emits blackout and failsafe statuses.
**Expected result**: QGC observer sees `VISUAL_BLACKOUT_IMU_ONLY` at 1-2 Hz and `VISUAL_BLACKOUT_FAILSAFE` at threshold.
**Max execution time**: 10 minutes.
---
### IT-06: Operator Relocalization Hint
**Summary**: Verify GCS command ingress can carry approximate relocalization hints.
**Traces to**: AC-6.2
**Input data**: STATUSTEXT/NAMED_VALUE_FLOAT/custom dialect hint fixture.
**Expected result**: Valid hint is parsed and forwarded to retrieval/safety logic; invalid hint is rejected.
**Max execution time**: 5 minutes.
## Performance Tests
### PT-01: Frame-Rate Emission
**Summary**: Verify output is streamed frame-by-frame and not batched.
**Traces to**: AC-4.4
**Load scenario**:
- Input estimate rate: target frame rate.
- Duration: 30 minutes.
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Output delay p95 | <=25 ms after wrapper output | >100 ms |
| Missing messages | 0 except upstream dropped frames | Any silent drop |
---
### PT-02: QGC Status Rate Limit
**Summary**: Verify QGC status is downsampled without losing critical transitions.
**Traces to**: AC-6.1
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Status rate | 1-2 Hz while active | <1 Hz or >2 Hz sustained |
| Critical transition delay | <=1 s | >2 s |
## Security Tests
### ST-01: MAVLink Source And Command Validation
**Summary**: Verify unauthorized or malformed MAVLink messages are rejected.
**Traces to**: AC-6.2
**Attack vector**: Malicious source sends spoofed command or GPS data.
**Test procedure**:
1. Send valid command from allowed source.
2. Send same command from disallowed source/system ID.
3. Send malformed values.
**Expected behavior**: Allowed command is accepted; disallowed/malformed messages are rejected and logged.
**Pass criteria**: 0 unauthorized commands affect localization state.
## Acceptance Tests
### AT-01: Plane SITL Output Acceptance
**Summary**: Verify ArduPilot Plane receives and uses v1 `GPS_INPUT` as configured.
**Traces to**: AC-4.3
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Start Plane SITL with production params | FC accepts external GPS substitute config |
| 2 | Emit `GPS_INPUT` estimate | Message is received with expected fields |
| 3 | Observe wire | `ODOMETRY` is absent in v1 |
## Test Data Management
| Data Set | Description | Source | Size |
|----------|-------------|--------|------|
| `sitl_spoofing_scenarios` | GPS loss/spoofing traces | Generated SITL | Small |
| `mavlink_output_fixtures` | PositionEstimate cases | Generated fixture | Small |
**Setup procedure**: Start SITL/QGC observer or replay MAVLink log.
**Teardown procedure**: Stop processes and archive tlogs.
**Data isolation strategy**: Unique MAVLink ports and run IDs per test.
@@ -1,79 +0,0 @@
# FDR And Observability
## 1. High-Level Overview
**Purpose**: Record bounded, replayable mission evidence and expose runtime health/status events for analysis and operator awareness.
**Architectural Pattern**: Append-only event sink + exporter.
**Upstream dependencies**: All runtime components.
**Downstream consumers**: Validation harness, post-flight audit tools, QGC status through MAVLink component.
## 2. Internal Interfaces
### Interface: `FlightRecorder`
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `append_event` | `FdrEvent` | `AppendResult` | Yes | `RecorderUnavailable`, `StorageFull` |
| `rollover` | `RolloverRequest` | `FdrSegmentInfo` | No | `RolloverFailed` |
| `export` | `ExportRequest` | `ExportResult` | Yes | `ExportFailed` |
## 3. Data Access Patterns
| Query | Frequency | Hot Path | Index Needed |
|-------|-----------|----------|--------------|
| Append event | High | Yes | Append index only |
| Export by time/type | Post-flight | No | Time/type index |
### Storage Estimates
| Table/Collection | Est. Row Count | Row Size | Total Size | Growth Rate |
|------------------|----------------|----------|------------|-------------|
| FDR events | Flight-dependent | Mixed | <=64 GB per 8 h | Per flight |
## 4. Implementation Details
**State Management**: Owns active segment, rollover policy, and export state.
**Key Dependencies**:
| Library | Purpose |
|---------|---------|
| PostgreSQL client | Event metadata, time/type indexes, mission query surface |
| CBOR writer | Bounded runtime payload segments |
| Parquet writer | Optional post-flight export |
**Error Handling Strategy**:
- Storage-full emits critical status and starts rollover/retention behavior.
- Append failures are surfaced to the caller and health system.
## 5. Caveats & Edge Cases
**Known limitations**:
- Raw frames are not retained by default; only metadata, decisions, hashes, and occlusion/blackout status are recorded.
- PostgreSQL availability is required for indexed FDR metadata; CBOR payload segments preserve bounded append behavior for high-volume data.
**Performance bottlenecks**:
- FDR appends must not block hot-path localization.
## 6. Dependency Graph
**Must be implemented after**: event schema and key DTOs.
**Can be implemented in parallel with**: MAVLink integration.
**Blocks**: release evidence and most validation reports.
## 7. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | Recorder unavailable | `fdr_unavailable path=...` |
| WARN | Rollover occurs | `fdr_rollover segment=...` |
| INFO | Export complete | `fdr_export_complete format=parquet` |
**Log format**: FDR event metadata plus local health logs.
**Log storage**: PostgreSQL FDR event tables plus CBOR segment payloads.
@@ -1,166 +0,0 @@
# Test Specification — FDR And Observability
## Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-1.3 | Anchor age/drift evidence | IT-01 | Covered |
| AC-1.4 | Confidence/source label retained | IT-01 | Covered |
| AC-4.4 | Per-frame local stream evidence | IT-01, PT-01 | Covered |
| AC-5.2 | Failure logging | IT-02 | Covered |
| AC-6.1 | QGC/status evidence | IT-03 | Covered |
| AC-8.4 | Generated tile audit | IT-04 | Covered |
| AC-8.5 | No raw frame retention | ST-01 | Covered |
| AC-NEW-3 | FDR retention and 64 GB cap | PT-01, AT-01 | Covered |
| AC-NEW-4 | False-position forensics | IT-05 | Covered |
| AC-NEW-5 | Thermal/throttle logging | IT-06 | Covered |
| AC-NEW-8 | Blackout/failsafe logging | IT-02, IT-03 | Covered |
## Blackbox Tests
### IT-01: Per-Estimate Event Capture
**Summary**: Verify every estimate stores covariance, source label, anchor age, and emitted output metadata.
**Traces to**: AC-1.3, AC-1.4, AC-4.4
**Input data**: Position estimate stream with satellite, VO, and dead-reckoned labels.
**Expected result**: PostgreSQL event index and CBOR payload segments contain all required fields with monotonic timestamps.
**Max execution time**: 5 minutes.
---
### IT-02: Failure And Blackout Logging
**Summary**: Verify no-estimate and blackout transitions are recorded.
**Traces to**: AC-5.2, AC-NEW-8
**Input data**: No-estimate gap and total blackout sequence.
**Expected result**: FDR records start, every degraded estimate, failsafe threshold, and recovery reason.
**Max execution time**: 10 minutes.
---
### IT-03: QGC Status Audit
**Summary**: Verify operator-visible status has matching FDR evidence.
**Traces to**: AC-6.1, AC-NEW-8
**Input data**: QGC status messages from MAVLink component.
**Expected result**: FDR contains status text, timestamp, and mode context.
**Max execution time**: 5 minutes.
---
### IT-04: Generated Tile Audit Trail
**Summary**: Verify tile-write decisions are recorded with parent covariance and trust level.
**Traces to**: AC-8.4
**Input data**: Accepted and rejected generated tile write decisions.
**Expected result**: FDR includes tile ID, parent covariance, trust level, sidecar hash, and rejection reason where applicable.
**Max execution time**: 5 minutes.
---
### IT-05: False-Position Investigation Bundle
**Summary**: Verify enough evidence exists to investigate a false-position event.
**Traces to**: AC-NEW-4
**Input data**: Simulated false anchor rejection and covariance growth sequence.
**Expected result**: Export includes estimates, anchor decisions, residuals, covariance, and emitted MAVLink fields.
**Max execution time**: 5 minutes.
---
### IT-06: Thermal/Throttle Event Capture
**Summary**: Verify resource health events are recorded.
**Traces to**: AC-NEW-5
**Input data**: Synthetic thermal/throttle metric stream.
**Expected result**: FDR records CPU/GPU/temp/throttle status and QGC warning trigger.
**Max execution time**: 5 minutes.
## Performance Tests
### PT-01: 8-Hour FDR Load
**Summary**: Verify FDR storage and append behavior under full mission load.
**Traces to**: AC-4.4, AC-NEW-3
**Load scenario**:
- Duration: 8 hours synthetic.
- Inputs: 3 Hz estimates, full-rate IMU, MAVLink tlog, health metrics, tile events.
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Total FDR size | <=64 GB | >64 GB without rollover |
| Append latency p95 | <=10 ms async enqueue | >25 ms |
| Silent payload loss | 0 | Any unlogged loss |
**Resource limits**: FDR must not block hot-path localization.
## Security Tests
### ST-01: Raw Frame Retention Audit
**Summary**: Verify FDR does not store raw full-resolution frames.
**Traces to**: AC-8.5
**Attack vector**: Debug logging accidentally persists raw camera frames.
**Test procedure**:
1. Run normal replay and failed tile-generation replay.
2. Inspect FDR payloads and output directories.
**Expected behavior**: Only metadata, hashes, estimates, tiles, and allowed low-rate failed-frame thumbnails are retained.
**Pass criteria**: No raw nav/AI camera frame payloads in normal FDR.
## Acceptance Tests
### AT-01: FDR Export
**Summary**: Verify post-flight export creates usable audit artifacts.
**Traces to**: AC-NEW-3
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | Complete synthetic flight | Segment rollover is logged and cap respected |
| 2 | Export FDR summary | Markdown/CSV/Parquet optional artifacts are produced |
| 3 | Query PostgreSQL index | Events can be filtered by time/type/mission |
## Test Data Management
| Data Set | Description | Source | Size |
|----------|-------------|--------|------|
| `fdr_synthetic_load` | Estimate, IMU, MAVLink, health, tile events | Generated fixture | Large |
| `incident_fixture` | False-position and blackout evidence | Generated fixture | Small |
**Setup procedure**: Create isolated PostgreSQL schema and FDR segment directory.
**Teardown procedure**: Export report, then remove schema and segment directory.
**Data isolation strategy**: Per-run mission ID, schema, and FDR directory.
@@ -1,51 +0,0 @@
# Contract: Config Errors Telemetry
**Component**: shared/config, shared/errors, shared/telemetry
**Producer task**: AZ-222 — AZ-222_runtime_config_errors_telemetry.md
**Consumer tasks**: AZ-223, AZ-224, AZ-225, AZ-226, AZ-227, AZ-228, AZ-229, AZ-230, AZ-231, AZ-232
**Version**: 1.0.0
**Status**: draft
**Last Updated**: 2026-05-03
## Purpose
Defines shared runtime configuration, error/result envelope, health, and telemetry metadata behavior consumed by all runtime components.
## Shape
| Contract | Required Behavior |
|----------|-------------------|
| Runtime profile | environment-specific settings loaded and validated before use |
| Error envelope | component, category, message, cause, retryability, severity |
| Health event | liveness/readiness status, dependency state, timestamp, component |
| Metrics labels | bounded component/action/status labels suitable for runtime reports |
## Invariants
- Missing required production settings fail startup or readiness loudly.
- Errors are returned or logged with component and category; no silent suppression.
- Secrets are referenced, not serialized into FDR, logs, or metrics.
## Non-Goals
- Does not define component-specific business errors.
- Does not replace FDR payload schemas.
## Versioning Rules
- Removing required config keys or error categories requires a major version bump.
- Adding optional health fields or metrics labels requires a minor version bump.
## Test Cases
| Case | Input | Expected | Notes |
|------|-------|----------|-------|
| missing-required-prod | production profile missing cache dir | readiness/startup failure | Clear error category |
| secret-value | signing key ref present | only key ref logged | No secret leakage |
| component-error | component reports dependency failure | structured envelope emitted | FDR-safe |
## Change Log
| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | 2026-05-03 | Initial contract | autodev |
@@ -1,52 +0,0 @@
# Contract: Geometry And Time Sync Helpers
**Component**: shared/geo_geometry, shared/time_sync
**Producer task**: AZ-221 — AZ-221_shared_geometry_time_sync.md
**Consumer tasks**: AZ-223, AZ-225, AZ-226, AZ-228, AZ-230, AZ-231, AZ-232
**Version**: 1.0.0
**Status**: draft
**Last Updated**: 2026-05-03
## Purpose
Defines shared geospatial and timestamp helper behavior used by runtime components to avoid duplicated math and inconsistent frame/IMU alignment.
## Shape
| API Area | Shape | Errors |
|----------|-------|--------|
| Coordinate conversion | WGS84/local tangent conversions and distance calculations | invalid CRS, missing origin |
| Camera footprint | intrinsics/extrinsics/attitude/altitude to footprint and GSD | invalid calibration, missing altitude |
| Homography metrics | homography/covariance conversions and MRE support | invalid geometry |
| Time sync | monotonic checks, frame-to-IMU window selection, replay ordering | timestamp mismatch, gap/jitter exceeded |
## Invariants
- Helpers are deterministic for the same calibration, pose, and timestamp inputs.
- Time helpers report gaps/jitter instead of silently dropping samples.
- Geometry helpers do not decide safety policy; callers decide degrade/reject behavior.
## Non-Goals
- No VIO state estimation.
- No MAVLink parsing beyond normalized timestamp fields.
- No tile freshness or cache policy decisions.
## Versioning Rules
- Breaking changes to units, coordinate frames, or timestamp semantics require a major version bump.
- New helper outputs may be added as optional fields in minor versions.
## Test Cases
| Case | Input | Expected | Notes |
|------|-------|----------|-------|
| valid-wgs84-local | known WGS84 point and origin | round-trip within tolerance | Uses representative coordinates |
| frame-imu-window | frame timestamp plus IMU samples | correct aligned window | Includes gap metrics |
| invalid-calibration | missing intrinsics/extrinsics | explicit error | No silent fallback |
## Change Log
| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | 2026-05-03 | Initial contract | autodev |
@@ -1,56 +0,0 @@
# Contract: Runtime Shared Contracts
**Component**: shared/contracts
**Producer task**: AZ-220 — AZ-220_shared_runtime_contracts.md
**Consumer tasks**: AZ-223, AZ-224, AZ-225, AZ-226, AZ-227, AZ-228, AZ-229, AZ-230, AZ-231, AZ-232
**Version**: 1.0.0
**Status**: draft
**Last Updated**: 2026-05-03
## Purpose
Defines the shared runtime DTO/event contract surface that component implementations consume instead of inventing local shapes.
## Shape
| Contract | Required Fields / Methods | Consumers |
|----------|---------------------------|-----------|
| `FramePacket` | frame ID, timestamp, image reference, calibration ID, occlusion, quality, normalization hint | camera, VIO, Satellite Service, Anchor Verification, Tile Manager, FDR |
| `TelemetrySample` | timestamp, IMU, attitude, altitude, airspeed, GPS health | MAVLink, VIO, safety wrapper, FDR |
| `VioStatePacket` | timestamp, relative pose, velocity, bias, tracking quality, covariance hint | VIO, safety wrapper, FDR |
| `PositionEstimate` | WGS84 coordinates, covariance, source label, fix type, horizontal accuracy, anchor age | safety wrapper, MAVLink, Tile Manager, FDR |
| `VprCandidate` | chunk ID, tile ID, score, footprint, freshness status | Satellite Service, Anchor Verification, FDR |
| `AnchorDecision` | candidate ID, acceptance result, estimated pose, inliers, MRE, rejection reason | Anchor Verification, safety wrapper, FDR |
| `CacheTileRecord` | tile ID, CRS, meters per pixel, capture date, signature/hash, trust level | Tile Manager, Satellite Service, Anchor Verification |
| `FdrEvent` | event type, timestamp, component, severity, payload reference, mission/run ID | all runtime components |
## Invariants
- Timestamps are normalized to a shared monotonic nanosecond representation before cross-component use.
- Confidence fields must not under-report known uncertainty.
- Raw frame payloads are referenced, not persisted in shared DTOs.
- Generated tile and anchor records must carry provenance/freshness metadata.
## Non-Goals
- Does not prescribe internal classes or storage implementation.
- Does not define e2e test runner-only report schemas.
## Versioning Rules
- Removing or renaming a field requires a major version bump.
- Adding optional telemetry or diagnostic fields requires a minor version bump.
## Test Cases
| Case | Input | Expected | Notes |
|------|-------|----------|-------|
| valid-frame | frame with timestamp, calibration, quality | accepted by consumers | Includes normalization hint |
| invalid-time | non-monotonic timestamp | rejected or marked invalid | Time-sync contract decides details |
| stale-anchor | anchor decision with stale freshness | rejected/down-confidenced | Safety wrapper must not accept blindly |
## Change Log
| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | 2026-05-03 | Initial contract | autodev |
-148
View File
@@ -1,148 +0,0 @@
# Data Model
## Scope
This model defines system-level runtime, cache, telemetry, and validation data. PostgreSQL with PostGIS is the primary structured store for manifests, spatial metadata, mission state, and FDR event indexes. Large binary payloads remain local files: COG tiles, descriptor/index files, FDR payload segments, and replay fixtures.
## Entity Overview
| Entity | Purpose | Storage / Transport | Owner |
|--------|---------|---------------------|-------|
| MissionProfile | Operational area, sector type, route shape, altitude band, cache budget | Mission config file | Tile Manager |
| CameraCalibration | Intrinsics, distortion, lens, fixed extrinsics, capture settings | Versioned calibration file | Camera ingest/calibration |
| FrameRecord | Per-frame metadata, timestamp, total-occlusion/blackout state, image quality, processing status | PostgreSQL/FDR event; replay fixture | Camera ingest/calibration |
| TelemetrySample | FC IMU, attitude, altitude, airspeed, GPS health | MAVLink stream; FDR event | MAVLink/GCS integration |
| VioState | Backend-relative state, velocity, bias, tracking quality | Internal DTO; FDR event | VIO adapter |
| PositionEstimate | WGS84 output, covariance, source label, anchor age, fix type | MAVLink DTO; FDR event | Safety/anchor wrapper |
| VprChunk | Retrieval footprint and descriptor metadata | PostgreSQL/PostGIS manifest + descriptor files | Satellite Service |
| AnchorCandidate | Top-K retrieval result and local verification metrics | Internal DTO; FDR event | Anchor verification |
| CacheTile | Service-source or generated COG tile metadata | PostgreSQL/PostGIS manifest + signed JSON sidecar | Tile Manager |
| GeneratedTile | In-flight tile candidate with trust/provenance metadata | COG + sidecar + FDR event | Tile Manager |
| FdrSegment | Bounded append-only mission evidence segment | PostgreSQL event index + CBOR segment payloads | FDR/observability |
| ValidationRun | Replay/test run metadata and outcomes | CSV/Markdown/test artifacts | Validation harness |
## Core Entity Attributes
### MissionProfile
| Field | Type | Required | Notes |
|-------|------|----------|-------|
| `mission_id` | string | yes | Unique mission/run identifier |
| `operational_area_polygon` | geometry | yes | Up to ~400 km² |
| `sector_classification` | enum | yes | `active_conflict` or `stable_rear` |
| `planned_altitude_agl_m` | number | yes | <=1000 m AGL |
| `route_type` | enum | yes | `sector`, `transit_corridor`, or mixed |
| `cache_budget_bytes` | integer | yes | Default ~10 GB persistent |
### CameraCalibration
| Field | Type | Required | Notes |
|-------|------|----------|-------|
| `camera_model` | string | yes | ADTi 20MP 20L V1 family |
| `sensor_width_mm` | number | yes | Public spec check currently indicates 23.20 mm |
| `sensor_height_mm` | number | yes | Public spec check currently indicates 15.40 mm |
| `image_width_px` | integer | yes | Public spec check currently indicates 5456 px |
| `image_height_px` | integer | yes | Public spec check currently indicates 3632 px |
| `pixel_pitch_um` | number | yes | Public spec check indicates 4.25 um |
| `lens_focal_length_mm` | number | yes | TBD before implementation |
| `distortion_coefficients` | array | yes | From checkerboard calibration |
| `body_T_camera` | transform | yes | Fixed camera-to-body extrinsics |
| `spec_verification_status` | enum | yes | `manufacturer_verified`, `public_page_only`, `operator_supplied` |
### PositionEstimate
| Field | Type | Required | Notes |
|-------|------|----------|-------|
| `timestamp_ns` | integer | yes | Frame-aligned |
| `lat_deg` | number | yes | WGS84 |
| `lon_deg` | number | yes | WGS84 |
| `alt_msl_m` | number | yes | MSL altitude for `GPS_INPUT` |
| `covariance_95_semi_major_m` | number | yes | Must not be under-reported |
| `source_label` | enum | yes | `satellite_anchored`, `vo_extrapolated`, `dead_reckoned` |
| `last_satellite_anchor_age_ms` | integer | yes | Monotonic until new anchor |
| `fix_type` | integer | yes | MAVLink fix semantics |
| `horiz_accuracy_m` | number | yes | >= covariance semi-major mapping |
| `quality_flags` | bitset/string array | yes | Anchor, blackout, spoofing, stale tile, etc. |
### FrameRecord
| Field | Type | Required | Notes |
|-------|------|----------|-------|
| `frame_id` | string | yes | Stable frame/run identifier |
| `timestamp_ns` | integer | yes | Camera clock normalized by time-sync helper |
| `camera_calibration_id` | string | yes | Links to `CameraCalibration` |
| `occlusion_status` | enum | yes | `clear`, `partial_occlusion`, `total_occlusion`, `blackout` |
| `usable_for_vio` | boolean | yes | Must be false for total occlusion/blackout |
| `usable_for_anchor` | boolean | yes | Must be false for total occlusion/blackout |
| `blackout_reason` | enum | optional | `cloud`, `lens_cover`, `whiteout`, `decode_failure`, `underexposed`, `overexposed`, `unknown` |
| `blur_score` | number | yes | Quality metric |
| `texture_score` | number | yes | Quality metric |
### CacheTile
| Field | Type | Required | Notes |
|-------|------|----------|-------|
| `tile_id` | string | yes | Stable ID |
| `tile_type` | enum | yes | `service_source`, `generated_candidate`, `generated_soft`, `trusted_basemap` |
| `cog_path` | string | yes | Local path |
| `crs` | string | yes | Projection metadata |
| `meters_per_pixel` | number | yes | Must satisfy cache interface floor |
| `capture_date` | date | yes | Freshness gate |
| `source` | string | yes | Satellite Service or onboard generation |
| `sha256` | string | yes | Integrity |
| `signature_status` | enum | yes | `valid`, `missing`, `invalid` |
| `parent_pose_covariance_m` | number | generated only | Tile-write gate |
| `trust_level` | enum | yes | `rejected`, `candidate`, `soft`, `trusted` |
## Relationships
```mermaid
erDiagram
MissionProfile ||--o{ CacheTile : scopes
MissionProfile ||--o{ VprChunk : indexes
CameraCalibration ||--o{ FrameRecord : calibrates
FrameRecord ||--o{ VioState : contributes_to
TelemetrySample ||--o{ VioState : contributes_to
VioState ||--o{ PositionEstimate : propagates
VprChunk ||--o{ AnchorCandidate : retrieved_as
CacheTile ||--o{ AnchorCandidate : verified_against
AnchorCandidate ||--o{ PositionEstimate : may_anchor
PositionEstimate ||--o{ GeneratedTile : gates
PositionEstimate ||--o{ FdrSegment : recorded_in
```
## Storage Strategy
| Data Class | Primary Format | Reason |
|------------|----------------|--------|
| Structured mission/cache/FDR metadata | PostgreSQL + PostGIS | Queryable freshness, coverage, spatial footprints, descriptors, tile status, and FDR event indexes |
| Tile audit sidecar | Signed JSON | Human/audit/service interchange per tile |
| Imagery tile | COG | Geospatial raster standard |
| Descriptor index | FAISS CPU index files + metadata | Fast top-K retrieval |
| FDR runtime payloads | CBOR segment files + PostgreSQL index | Bounded append payloads with queryable event metadata |
| FDR analysis export | Parquet optional | Post-flight analytics |
| Test report | CSV + Markdown | CI and human review |
## Migration Strategy
- PostgreSQL schemas use explicit `schema_version` and additive migrations by default.
- PostGIS geometry columns are used for mission polygons, tile footprints, VPR chunks, and generated-tile extents.
- FDR segment schema includes `segment_schema_version`; old readers must reject unknown required fields loudly.
- Sidecars include a `sidecar_version` and hash of the COG payload.
- Migrations are implemented as deterministic scripts with rollback for metadata-only changes.
- No database/table/column rename is allowed without explicit approval during implementation.
## Seed Data Requirements
| Environment | Seed Data |
|-------------|-----------|
| Development | 60 project images, `coordinates.csv`, small cache fixture, generated SITL traces |
| Public replay | Pinned MUN-FRL/ALTO/Kagaru/EPFL dataset slices and licenses |
| Jetson validation | Production-like cache/index, cold-start fixtures, thermal workload |
| Representative acceptance | Synchronized target nav-camera + FC telemetry + ground truth |
## Backward Compatibility
- Runtime should tolerate older cache sidecars if required fields exist and signatures validate.
- Generated tile sidecars must include all fields required by Satellite Service ingest; missing fields make the tile ineligible for promotion.
- FDR readers must support at least the current and previous segment schema version during the project lifecycle.
-11
View File
@@ -1,11 +0,0 @@
# Deployment Planning Index
This directory contains the system-level deployment plan produced during Plan Step 2:
- `containerization.md`
- `ci_cd_pipeline.md`
- `environment_strategy.md`
- `observability.md`
- `deployment_procedures.md`
Component-specific implementation tasks are created later during decomposition.

Some files were not shown because too many files have changed in this diff Show More