Update demo replay validation and testing documentation

- Modified the autodev state to reflect the current testing phase and details of the new `jetson-e2e` tests.
- Enhanced the "How to Test" documentation to provide clearer instructions on the demo replay validation process, including video and tlog alignment steps.
- Updated architectural documentation to include the new demo replay operator flow and its dependencies.
- Documented the removal of deprecated auto-sync features and clarified the operator-facing UI for replay validation.
- Added new entries in the dependencies table for upcoming tasks related to the demo replay flow.

These changes improve clarity and usability for operators and developers working with the demo replay system.
fixes in tests, autodev update
2026-06-23 08:41:13 +00:00 · 2026-06-20 11:24:43 +03:00 · 2026-06-18 12:13:43 +03:00 · 2026-06-10 05:35:01 +03:00 · 2026-06-09 20:43:15 +03:00 · 2026-06-09 14:06:35 +03:00
1172 changed files with 201738 additions and 2982 deletions
@@ -11,10 +11,20 @@ If you want to run a specific skill directly (without the orchestrator), use the
 ```
 /problem                — interactive problem gathering → _docs/00_problem/
 /research               — solution drafts → _docs/01_solution/
-/plan                   — architecture, components, tests → _docs/02_document/
-/decompose              — atomic task specs → _docs/02_tasks/todo/
-/implement              — batched parallel implementation → _docs/03_implementation/
-/deploy                 — containerization, CI/CD, observability → _docs/04_deploy/
+/plan                   — architecture, ADRs, components, tests, epics → _docs/02_document/
+/test-spec              — blackbox/perf/resilience/security test specs → _docs/02_document/tests/
+/decompose              — atomic task specs (multi-mode) → _docs/02_tasks/todo/
+/implement              — sequential dependency-aware batches with code review and completeness gates → _docs/03_implementation/
+/test-run               — runs the test suite (functional / perf modes) with gating
+/code-review            — multi-phase review used by /implement
+/refactor               — 8-phase structured refactoring (incl. testability sub-mode) → _docs/04_refactoring/
+/security               — OWASP-driven audit → _docs/05_security/
+/deploy                 — containerization, CI/CD, environments, observability, procedures, scripts → _docs/04_deploy/
+/release                — execute deploy artifacts in prod, smoke-test, watch, decide rollback → _docs/04_release/
+/document               — bottom-up reverse-engineering of an existing codebase → _docs/02_document/
+/new-task               — interactive feature planning for an existing codebase → _docs/02_tasks/todo/
+/ui-design              — HTML+CSS mockups + design system → _docs/02_document/ui_mockups/
+/retrospective          — metrics + lessons log → _docs/06_metrics/ + _docs/LESSONS.md
 ```

 ## How It Works
@@ -41,148 +51,201 @@ The state file tracks completed steps, key decisions, blockers, and session cont

 Skills auto-chain without pausing between them. The only pauses are:
 - **BLOCKING gates** inside each skill (user must confirm before proceeding)
- **Session boundary** after decompose (suggests new conversation before implement)
+- **Session boundaries** declared in each flow's auto-chain rules (e.g., after `decompose`, after `decompose tests`) — suggested new-conversation breakpoints to keep context fresh

-A typical project runs in 2-4 conversations:
- Session 1: Problem → Research → Research decision
- Session 2: Plan → Decompose
- Session 3: Implement (may span multiple sessions)
- Session 4: Deploy
+There are three flows, resolved on every invocation (see `skills/autodev/SKILL.md` § Flow Resolution):

-Re-entry is seamless: type `/autodev` in a new conversation and the orchestrator reads the state file to pick up exactly where you left off.
+| Flow | When | Steps |
+|------|------|-------|
+| **greenfield** | empty workspace, no source yet | 18 steps: Problem → Research → Plan → UI Design → Test Spec → Decompose → Implement → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (opt) → Performance Test (opt) → Deploy → Release → Retrospective |
+| **existing-code** | source files present | one-time baseline (Document → Architecture Baseline Scan → Test Spec → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → optional Refactor) then a feature-cycle loop (New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security Audit (opt) → Performance Test (opt) → Deploy → Release → Retrospective → loops back to New Task) |
+| **meta-repo** | `.gitmodules`, workspace manifest, or multi-component aggregator | uses `monorepo-*` skills + `_docs/_repo-config.yaml` instead of per-component BUILD-SHIP folders |
+
+A typical greenfield project spans several conversations because of session boundaries. Re-entry is seamless: type `/autodev` in a new conversation and the orchestrator reads `_docs/_autodev_state.md` to pick up exactly where you left off.
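
The flow table above can be read as a small decision procedure. The following is a minimal sketch only, not the orchestrator's actual logic: the function name `resolve_flow`, the extension list, and the precedence (meta-repo checked first) are assumptions made for illustration, and only `.gitmodules` is taken from the table as a meta-repo marker.

```python
from pathlib import Path

# Hypothetical set of extensions that count as "source files"; the real rule
# is defined in skills/autodev/SKILL.md and may differ.
SOURCE_SUFFIXES = {".py", ".ts", ".go", ".rs", ".cs", ".java"}

# Only `.gitmodules` is named in the table; other manifests are not modelled here.
META_REPO_MARKERS = (".gitmodules",)


def resolve_flow(workspace: Path) -> str:
    """Return 'meta-repo', 'existing-code', or 'greenfield' for a workspace."""
    # Assumption: a meta-repo marker wins over the presence of source files.
    if any((workspace / marker).exists() for marker in META_REPO_MARKERS):
        return "meta-repo"
    # Files under _docs/ are documentation artifacts, not source.
    has_source = any(
        path.is_file()
        and path.suffix in SOURCE_SUFFIXES
        and "_docs" not in path.relative_to(workspace).parts
        for path in workspace.rglob("*")
    )
    return "existing-code" if has_source else "greenfield"
```

Because the check is a pure function of the workspace contents, it can be re-run on every invocation without consulting stored state, which matches the "resolved on every invocation" wording.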

 ## Skill Descriptions

 ### autodev (meta-orchestrator)

-Auto-chaining engine that sequences the full BUILD → SHIP workflow. Persists state to `_docs/_autodev_state.md`, tracks key decisions and session context, and flows through problem → research → plan → decompose → implement → deploy without manual skill invocation. Maximizes work per conversation with seamless cross-session re-entry.
+Auto-chaining engine that sequences the full BUILD → SHIP → EVOLVE workflow. Persists state to `_docs/_autodev_state.md`, surfaces top-3 lessons from `_docs/LESSONS.md` at every invocation, replays any `_docs/_process_leftovers/` entries, tracks key decisions and session context, and flows through the active flow's steps without manual skill invocation. Maximizes work per conversation with seamless cross-session re-entry.

 ### problem

-Interactive interview that builds `_docs/00_problem/`. Asks probing questions across 8 dimensions (problem, scope, hardware, software, acceptance criteria, input data, security, operations) until all required files can be written with concrete, measurable content.
+Interactive 4-phase interview that builds `_docs/00_problem/`. Asks probing questions across 8 dimensions (problem & goals, scope, hardware & environment, software & tech, acceptance criteria, input data, security, operational) until all required files can be written with concrete, measurable, quantifiable content. Acceptance criteria must include numeric targets; input data must include `expected_results/` mappings.

 ### research

-8-step deep research methodology. Mode A produces initial solution drafts. Mode B assesses and revises existing drafts. Includes AC assessment, source tiering, fact extraction, comparison frameworks, and validation. Run multiple rounds until the solution is solid.
+8-step deep research methodology. Mode A produces initial solution drafts. Mode B assesses and revises existing drafts. Classifies output as **Technical-component selection** (full per-mode API verification gates apply) or **Non-technical investigation** (gates relaxed). Includes source tiering, fact extraction, comparison frameworks, validation, and exact-fit component selection. Run multiple rounds until the solution is solid.

 ### plan

-6-step planning workflow. Produces integration test specs, architecture, system flows, data model, deployment plan, component specs with interfaces, risk assessment, test specifications, and work item epics. Heavy interaction at BLOCKING gates.
+6-step planning workflow with one half-step (4.5: Architecture Decision Records). Produces blackbox test specs (delegated to test-spec), glossary, architecture vision, architecture document, data model, deployment plan, component specs with interfaces, risk assessment, ADRs, test specifications, and work item epics. Heavy interaction at BLOCKING gates (glossary+vision, architecture, components, mitigations, ADRs).
+
+### test-spec
+
+4-phase test specification workflow. Phase 1 analyzes input data + expected-results completeness. Phase 2 emits 8 test artifacts (environment, test-data, blackbox, performance, resilience, security, resource-limit, traceability matrix). Phase 3 is the hard gate that requires every test to have quantifiable expected results. Phase 4 emits runner scripts. Cycle-update mode for incremental refresh.

 ### decompose

-4-step task decomposition. Produces a bootstrap structure plan, atomic task specs per component, integration test tasks, and a cross-task dependency table. Each task gets a work item ticket and is capped at 8 complexity points.
+Multi-mode task decomposition with 6 internal step files. Implementation mode runs Step 1 (Bootstrap), 1.5 (Module Layout), 1.7 (System-Pipeline owner tasks), 2 (per-component tasks), 4 (Cross-Verification). Tests-only mode runs Step 1t (Test Infrastructure), 3 (Blackbox tasks), 4. Single-component mode runs Step 2 only. Each task is tracker-prefixed and capped at 5 complexity points. The 1.7 step exists specifically to prevent the GPS-passthrough class of failure (see `meta-rule.mdc`).

 ### implement

-Orchestrator that reads task specs, computes dependency-aware execution batches, launches up to 4 parallel implementer subagents, runs code review after each batch, and commits per batch. Does not write code itself.
-
-### deploy
-
-7-step deployment planning. Status check, containerization, CI/CD pipeline, environment strategy, observability, deployment procedures, and deployment scripts. Produces documents for steps 1-6 and executable scripts in step 7.
+Orchestrator that reads task specs, computes dependency-aware execution batches via topological sort, **implements tasks sequentially within each batch** (no subagents, no parallel execution — see `.cursor/rules/no-subagents.mdc`), runs code review after each batch, runs cumulative code review every K batches, and commits per batch. Has a Product Implementation Completeness Gate (Step 15) that compares promises in task specs / architecture against actual production code, plus a System-Pipeline Audit (Step 15.b) that walks architecture-named pipelines and verifies a real production caller wires each adjacent component pair. Either gate's FAIL stops the cycle until remediation tasks are created.
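
As an illustration of "dependency-aware execution batches via topological sort", here is a minimal sketch using Python's standard `graphlib`. The function name `compute_batches` and the task IDs are hypothetical; the skill's real batching rules (batch size limits, ordering within a batch) are not described in this document and are not modelled.

```python
from graphlib import TopologicalSorter


def compute_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; each batch depends only on earlier batches.

    `dependencies` maps a task ID to the set of task IDs it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches: list[list[str]] = []
    while sorter.is_active():
        # Every task returned here has all of its dependencies completed.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        # Tasks inside a batch would be implemented one after another,
        # then marked done so their dependents become ready.
        sorter.done(*ready)
    return batches


# Hypothetical task IDs, for illustration only.
example = {"T1": set(), "T2": {"T1"}, "T3": {"T1"}, "T4": {"T2", "T3"}}
print(compute_batches(example))  # [['T1'], ['T2', 'T3'], ['T4']]
```

Sorting each batch gives a deterministic order for the sequential run, which makes per-batch reports reproducible between sessions.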

 ### code-review

-Multi-phase code review against task specs. Produces structured findings with verdict: PASS, FAIL, or PASS_WITH_WARNINGS.
+7-phase code review against task specs (Phase 7 is Architecture Compliance against `module-layout.md` and `architecture.md`). Produces structured findings with verdict: PASS, PASS_WITH_WARNINGS, or FAIL. Three modes: full (per batch), baseline (one-time architecture scan of an existing codebase), cumulative (mid-implementation across batches with `## Baseline Delta`).
+
+### test-run
+
+Runs the test suite. Functional mode (default): detects pytest/dotnet/cargo/npm or `scripts/run-tests.sh`, applies a System-Under-Test Reality Gate to refuse passes where internal product modules were stubbed, classifies failures and skips, gates on outcome. Perf mode: detects `scripts/run-performance-tests.sh` or k6/locust/artillery/wrk, captures latency/throughput/error metrics, compares against thresholds.

 ### refactor

-6-phase structured refactoring: baseline, discovery, analysis, safety net, execution, hardening.
+8-phase structured refactoring: baseline → discovery → analysis → safety net → execution → test sync → verification → documentation. Two input modes (Automatic / Guided). Testability sub-mode skips Phase 3 by design and emits a `testability_changes_summary.md` for user review. Each run lives in its own `RUN_DIR` under `_docs/04_refactoring/NN-<run-name>/`.

 ### security

-OWASP-based security testing and audit.
+5-phase OWASP-based audit: dependency scan → static analysis → OWASP Top 10 review → infrastructure review → consolidated security report. Severity-ranked, evidence-based, actionable. Complementary to `code-review` Phase 4 (lightweight security quick-scan).
+
+### deploy
+
+7-step deployment planning. Produces documents for steps 1–6 (status & env, containerization, CI/CD pipeline, environment strategy, observability, deployment procedures) and executable scripts in step 7 (`deploy.sh`, `pull-images.sh`, `start-services.sh`, `stop-services.sh`, `health-check.sh`).
+
+### release
+
+Executes the deployment plan produced by `/deploy` against a target environment. 6 phases: pre-release gate (AC + risk + rollback readiness), strategy select (all-at-once / blue-green / canary / manual), execute (run scripts, monitor exit codes), smoke test (delegate to test-run prod-smoke), watch window (read observability for the configured duration), commit-or-rollback. Outputs `_docs/04_release/release_<version>.md`. Produces a definitive Released / Rolled-Back / Aborted verdict; failure of any phase auto-triggers rollback unless the user opts to investigate.
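
The "failure of any phase auto-triggers rollback" rule can be sketched as a simple sequential runner. This is an assumption-laden illustration: `run_release`, the `Phase` alias, and the callables are hypothetical, and the sketch models neither the Aborted verdict nor the option for the user to investigate instead of rolling back.

```python
from typing import Callable, Sequence

# A phase is a (name, callable) pair; the callable returns True on success.
Phase = tuple[str, Callable[[], bool]]


def run_release(phases: Sequence[Phase], rollback: Callable[[], None]) -> str:
    """Run phases in order; the first failure triggers rollback and stops."""
    for name, run in phases:
        if not run():
            rollback()
            return f"Rolled-Back (failed at: {name})"
    return "Released"


# Hypothetical phases: the smoke test fails, so the watch window never runs.
verdict = run_release(
    [
        ("pre-release gate", lambda: True),
        ("execute", lambda: True),
        ("smoke test", lambda: False),
        ("watch window", lambda: True),
    ],
    rollback=lambda: None,
)
print(verdict)  # Rolled-Back (failed at: smoke test)
```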

 ### retrospective

-Collects metrics from implementation batch reports, analyzes trends, produces improvement reports.
+4-step workflow: collect metrics → analyze trends → produce report → update lessons log (`_docs/LESSONS.md`, ring buffer of last 15 entries consumed by `new-task`, `plan`, `decompose`, and `autodev`). Cycle-end (default) and incident modes; incident mode is auto-invoked after a 3-strike failure.
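
A ring buffer of the last 15 entries behaves like a bounded queue: appending the 16th lesson drops the oldest. The sketch below uses `collections.deque` to show that behaviour. The helper names are hypothetical, and treating "top-3" as "three most recent" is an assumption, since the ranking rule is not stated here.

```python
from collections import deque

MAX_LESSONS = 15  # ring-buffer size stated in the skill description


def append_lesson(lessons: list[str], new_lesson: str) -> list[str]:
    """Return the lessons with `new_lesson` added, keeping only the newest 15."""
    buffer = deque(lessons, maxlen=MAX_LESSONS)
    buffer.append(new_lesson)  # silently evicts the oldest entry when full
    return list(buffer)


def top_lessons(lessons: list[str], count: int = 3) -> list[str]:
    """Assumption: 'top' means most recent; returns newest first."""
    return lessons[-count:][::-1]
```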

 ### document

-Bottom-up codebase documentation. Analyzes existing code from modules through components to architecture, then retrospectively derives problem/restrictions/acceptance criteria. Alternative entry point for existing codebases — produces the same `_docs/` artifacts as problem + plan, but from code analysis instead of user interview.
+Bottom-up codebase documentation. Analyzes existing code from modules through components to architecture, then retrospectively derives problem/restrictions/acceptance criteria. Alternative entry point for existing codebases — produces the same `_docs/` artifacts as problem + plan, but from code analysis instead of user interview. Two workflow files: `workflows/full.md` (full / focus-area / resume) and `workflows/task.md` (incremental update for a single task).
+
+### new-task
+
+Existing-code feature planning loop. Walks the user through Step 1 (description) → Step 2 (complexity assessment, consults `LESSONS.md`) → Step 3 (research if needed) → Step 4 (codebase analysis incl. test-coverage gap) → Step 4.5 (contract & layout check) → Step 5 (validate assumptions) → Step 6 (write task spec) → Step 7 (tracker ticket) → Step 8 (loop or finalize).
+
+### ui-design
+
+End-to-end UI workflow. Phase 0 (complexity detection: full vs quick) → Phase 1 (context check) → Phase 2 (requirements) → Phase 3 (direction exploration) → Phase 4 (design system synthesis: `DESIGN.md`) → Phase 5 (HTML+Tailwind code generation) → Phase 6 (visual verification, optional MCP enhancements) → Phase 7 (user review) → Phase 8 (iteration). Has an Applicability Check that refuses to run on non-UI projects.
+
+### monorepo-* (suite-level)
+
+Six skills for meta-repos: `monorepo-discover` (write/refresh `_docs/_repo-config.yaml`), `monorepo-document` (sync unified docs), `monorepo-cicd` (sync CI/compose/env templates), `monorepo-onboard` (atomic add-component), `monorepo-status` (read-only drift report), `monorepo-e2e` (sync suite-level integration harness). They never cross domains; each touches exactly one artifact class.

 ## Developer TODO (Project Mode)

-### BUILD
+The numbered list below mirrors greenfield-flow ordering. Existing-code projects start at `/document`, then enter the feature-cycle loop at `/new-task`. See `skills/autodev/flows/{greenfield,existing-code,meta-repo}.md` for the authoritative step tables.
+
+### BUILD (greenfield)

 ```
-0. /problem                                      — interactive interview → _docs/00_problem/
-   - problem.md                                   (required)
-   - restrictions.md                              (required)
-   - acceptance_criteria.md                       (required)
-   - input_data/                                  (required)
-   - security_approach.md                         (optional)
-
-1. /research                                     — solution drafts → _docs/01_solution/
-   Run multiple times: Mode A → draft, Mode B → assess & revise
-
-2. /plan                                         — architecture, data model, deployment, components, risks, tests, epics → _docs/02_document/
-
-3. /decompose                                    — atomic task specs + dependency table → _docs/02_tasks/todo/
-
-4. /implement                                    — batched parallel agents, code review, commit per batch → _docs/03_implementation/
+1. /problem            — interactive 4-phase interview → _docs/00_problem/
+                          required: problem.md, restrictions.md, acceptance_criteria.md, input_data/
+                          optional: security_approach.md
+2. /research           — solution drafts (Mode A draft, Mode B assess) → _docs/01_solution/
+3. /plan               — glossary, architecture vision, architecture, data model, deployment, components,
+                          risks, ADRs (Step 4.5), test specs, epics → _docs/02_document/
+                          (Step 1 invokes /test-spec internally)
+4. /ui-design          — HTML+Tailwind mockups (UI projects only) → _docs/02_document/ui_mockups/
+5. /test-spec          — produces 8 test-spec artifacts (incl. traceability matrix) → _docs/02_document/tests/
+                          (already invoked from /plan Step 1; Step 5 here is the explicit autodev step)
+6. /decompose          — implementation tasks + module-layout + system-pipeline owner tasks →
+                          _docs/02_tasks/todo/
+7. /implement          — sequential dependency-aware batches; per-batch code-review;
+                          Product Completeness Gate + System-Pipeline Audit → _docs/03_implementation/
+8. (auto) Code Testability Revision  — surgical refactor to make code runnable under tests
+9. /decompose tests    — test-only decomposition mode → _docs/02_tasks/todo/
+10. /implement (tests) — implements test tasks
+11. /test-run          — full functional suite gate
+12. /test-spec --cycle-update  — append implementation-learned scenarios
+13. /document --task   — update affected component / module / architecture docs
+14. /security          — OWASP-based audit (optional gate)
+15. /test-run --perf   — perf/load tests (optional gate)
 ```

 ### SHIP

 ```
-5. /deploy                                       — containerization, CI/CD, environments, observability, procedures → _docs/04_deploy/
+16. /deploy            — containerization, CI/CD, environments, observability, procedures, scripts → _docs/04_deploy/
+17. /release           — execute deploy artifacts in prod, smoke-test, watch, decide rollback → _docs/04_release/
 ```

 ### EVOLVE

 ```
-6. /refactor                                     — structured refactoring → _docs/04_refactoring/
-7. /retrospective                                — metrics, trends, improvement actions → _docs/06_metrics/
+18. /retrospective     — metrics + trends + lessons-log update → _docs/06_metrics/ + _docs/LESSONS.md
+       (cycle-end mode after release; incident mode auto-fires after 3-strike failure)
+
+After greenfield completes, the state file is rewritten to point at the existing-code flow's
+feature-cycle loop, which begins with /new-task and ends with /retrospective. The loop runs once
+per feature with state.cycle incremented.
+
+Off-cycle:
+/refactor              — full 8-phase refactor → _docs/04_refactoring/NN-<run-name>/
+/document              — full reverse-engineering of an unfamiliar codebase
 ```

-Or just use `/autodev` to run steps 0-5 automatically.
+Or just use `/autodev` to run all the above automatically — the orchestrator chooses the right flow, sequences steps, surfaces lessons, processes leftovers, and pauses only at BLOCKING gates and declared session boundaries.

 ## Available Skills

 | Skill | Triggers | Output |
 |-------|----------|--------|
-| **autodev** | "autodev", "auto", "start", "continue", "what's next" | Orchestrates full workflow |
+| **autodev** | "autodev", "auto", "start", "continue", "what's next" | Orchestrates full workflow (3 flows) |
 | **problem** | "problem", "define problem", "new project" | `_docs/00_problem/` |
 | **research** | "research", "investigate" | `_docs/01_solution/` |
-| **plan** | "plan", "decompose solution" | `_docs/02_document/` |
+| **plan** | "plan", "decompose solution" | `_docs/02_document/` (incl. ADRs) |
 | **test-spec** | "test spec", "blackbox tests", "test scenarios" | `_docs/02_document/tests/` + `scripts/` |
-| **decompose** | "decompose", "task decomposition" | `_docs/02_tasks/todo/` |
-| **implement** | "implement", "start implementation" | `_docs/03_implementation/` |
-| **test-run** | "run tests", "test suite", "verify tests" | Test results + verdict |
-| **code-review** | "code review", "review code" | Verdict: PASS / FAIL / PASS_WITH_WARNINGS |
+| **decompose** | "decompose", "task decomposition", "decompose tests" | `_docs/02_tasks/todo/` + `_docs/02_document/module-layout.md` |
+| **implement** | "implement", "start implementation" | `_docs/03_implementation/` (sequential — see `no-subagents.mdc`) |
+| **test-run** | "run tests", "test suite", "verify tests", "perf test" | Test results + verdict |
+| **code-review** | "code review", "review code" | Verdict: PASS / FAIL / PASS_WITH_WARNINGS (7 phases) |
 | **new-task** | "new task", "add feature", "new functionality" | `_docs/02_tasks/todo/` |
 | **ui-design** | "design a UI", "mockup", "design system" | `_docs/02_document/ui_mockups/` |
-| **refactor** | "refactor", "improve code" | `_docs/04_refactoring/` |
-| **security** | "security audit", "OWASP" | `_docs/05_security/` |
+| **refactor** | "refactor", "improve code", "testability" | `_docs/04_refactoring/NN-<run-name>/` |
+| **security** | "security audit", "OWASP", "vulnerability scan" | `_docs/05_security/` |
 | **document** | "document", "document codebase", "reverse-engineer docs" | `_docs/02_document/` + `_docs/00_problem/` + `_docs/01_solution/` |
-| **deploy** | "deploy", "CI/CD", "observability" | `_docs/04_deploy/` |
-| **retrospective** | "retrospective", "retro" | `_docs/06_metrics/` |
+| **deploy** | "deploy", "CI/CD", "observability", "containerize" | `_docs/04_deploy/` (plans + scripts) |
+| **release** | "release", "ship", "go live", "rollback" | `_docs/04_release/` (executed deploy + verdict) |
+| **retrospective** | "retrospective", "retro", "metrics review" | `_docs/06_metrics/` + `_docs/LESSONS.md` |
+| **monorepo-discover** | "discover monorepo", "scan submodules" | `_docs/_repo-config.yaml` |
+| **monorepo-document** | "sync monorepo docs" | unified `_docs/*.md` |
+| **monorepo-cicd** | "sync compose", "sync ci" | suite-level CI/compose/env templates |
+| **monorepo-onboard** | "onboard component", "register submodule" | atomic component addition |
+| **monorepo-status** | "monorepo status", "drift report" | read-only drift report |
+| **monorepo-e2e** | "suite e2e", "integration harness" | `e2e/docker-compose.suite-e2e.yml` and fixtures |

-## Tools
-
-| Tool | Type | Purpose |
-|------|------|---------|
-| `implementer` | Subagent | Implements a single task. Launched by `/implement`. |
+> The `.cursor/agents/` directory is intentionally empty. Per `.cursor/rules/no-subagents.mdc` the main agent does not delegate to subagents in this workspace; `/implement` runs tasks sequentially.

 ## Project Folder Structure

 ```
-_project.md                              — project-specific config (tracker type, project key, etc.)
 _docs/
-├── _autodev_state.md                  — autodev orchestrator state (progress, decisions, session context)
-├── 00_problem/                          — problem definition, restrictions, AC, input data
+├── _autodev_state.md                   — autodev orchestrator state (≤30 lines; pointer only)
+├── _process_leftovers/                  — deferred tracker writes replayed at next /autodev (per tracker.mdc)
+├── _repo-config.yaml                    — meta-repo only; produced by monorepo-discover
+├── LESSONS.md                           — ring buffer of last 15 actionable lessons (consumed by autodev/new-task/plan/decompose)
+├── 00_problem/                          — problem definition, restrictions, AC, input data + expected_results/
 ├── 00_research/                         — intermediate research artifacts
 ├── 01_solution/                         — solution drafts, tech stack, security analysis
 ├── 02_document/
-│   ├── architecture.md
+│   ├── architecture.md                 — includes ## Architecture Vision (user-confirmed)
+│   ├── glossary.md                     — user-confirmed terminology
 │   ├── system-flows.md
 │   ├── data_model.md
+│   ├── module-layout.md                — per-component Owns/Imports-from/Public API (decompose Step 1.5)
+│   ├── architecture_compliance_baseline.md  — existing-code baseline scan output
 │   ├── risk_mitigations.md
+│   ├── adr/[NNN]_[decision_slug].md    — Architectural Decision Records (plan Step 4.5)
 │   ├── components/[##]_[name]/          — description.md + tests.md per component
+│   ├── contracts/<component>/<name>.md  — versioned public-API contracts
 │   ├── common-helpers/
-│   ├── tests/                           — environment, test data, blackbox, performance, resilience, security, traceability
-│   ├── deployment/                      — containerization, CI/CD, environments, observability, procedures
+│   ├── tests/                           — environment, test-data, blackbox, performance, resilience, security, resource-limit, traceability matrix
 │   ├── ui_mockups/                      — HTML+CSS mockups, DESIGN.md (ui-design skill)
 │   ├── diagrams/
 │   └── FINAL_report.md
@@ -192,12 +255,13 @@ _docs/
 │   ├── backlog/                         — parked tasks (not scheduled yet)
 │   └── done/                            — completed/archived tasks
 ├── 02_task_plans/                       — per-task research artifacts (new-task skill)
-├── 03_implementation/                   — batch reports, implementation_report_*.md
+├── 03_implementation/                   — batch_*_cycle*.md, implementation_report_*.md, implementation_completeness_cycle*.md, cumulative_review_*.md
 │   └── reviews/                         — code review reports per batch
-├── 04_deploy/                           — containerization, CI/CD, environments, observability, procedures, scripts
-├── 04_refactoring/                      — baseline, discovery, analysis, execution, hardening
-├── 05_security/                         — dependency scan, SAST, OWASP review, security report
-└── 06_metrics/                          — retro_[YYYY-MM-DD].md
+├── 04_deploy/                           — containerization, CI/CD, environments, observability, procedures, deploy_scripts.md, reports/
+├── 04_refactoring/NN-<run-name>/        — baseline_metrics, discovery, analysis, test_specs, execution_log, test_sync, verification, FINAL_report (one folder per refactor run)
+├── 04_release/                          — release_<version>.md (one per /release invocation), rollback_<version>.md
+├── 05_security/                         — dependency_scan, static_analysis, owasp_review, infrastructure_review, security_report
+└── 06_metrics/                          — retro_<YYYY-MM-DD>.md, structure_<YYYY-MM-DD>.md, perf_<YYYY-MM-DD>_<run-label>.md, incident_<YYYY-MM-DD>_<skill>.md
 ```

 ## Standalone Mode
@@ -1,105 +0,0 @@
---
-name: implementer
-description: |
-  Implements a single task from its spec file. Use when implementing tasks from _docs/02_tasks/todo/.
-  Reads the task spec, analyzes the codebase, implements the feature with tests, and verifies acceptance criteria.
-  Launched by the /implement skill as a subagent.
---
-
-You are a professional software developer implementing a single task.
-
-## Input
-
-You receive from the `/implement` orchestrator:
- Path to a task spec file (e.g., `_docs/02_tasks/todo/[TRACKER-ID]_[short_name].md`)
- Files OWNED (exclusive write access — only you may modify these)
- Files READ-ONLY (shared interfaces, types — read but do not modify)
- Files FORBIDDEN (other agents' owned files — do not touch)
-
-## Context (progressive loading)
-
-Load context in this order, stopping when you have enough:
-
-1. Read the task spec thoroughly — acceptance criteria, scope, constraints, dependencies
-2. Read `_docs/02_tasks/_dependencies_table.md` to understand where this task fits
-3. Read project-level context:
-   - `_docs/00_problem/problem.md`
-   - `_docs/00_problem/restrictions.md`
-   - `_docs/01_solution/solution.md`
-4. Analyze the specific codebase areas related to your OWNED files and task dependencies
-
-## Boundaries
-
-**Always:**
- Run tests before reporting done
- Follow existing code conventions and patterns
- Implement error handling per the project's strategy
- Stay within the task spec's Scope/Included section
-
-**Ask first:**
- Adding new dependencies or libraries
- Creating files outside your OWNED directories
- Changing shared interfaces that other tasks depend on
-
-**Never:**
- Modify files in the FORBIDDEN list
- Skip writing tests
- Change database schema unless the task spec explicitly requires it
- Commit secrets, API keys, or passwords
- Modify CI/CD configuration unless the task spec explicitly requires it
-
-## Process
-
-1. Read the task spec thoroughly — understand every acceptance criterion
-2. Analyze the existing codebase: conventions, patterns, related code, shared interfaces
-3. Research best implementation approaches for the tech stack if needed
-4. If the task has a dependency on an unimplemented component, create a minimal interface mock
-5. Implement the feature following existing code conventions
-6. Implement error handling per the project's defined strategy
-7. Implement unit tests (use Arrange / Act / Assert section comments in language-appropriate syntax)
-8. Implement integration tests — analyze existing tests, add to them or create new
-9. Run all tests, fix any failures
-10. Verify every acceptance criterion is satisfied — trace each AC with evidence
-
-## Stop Conditions
-
- If the same fix fails 3+ times with different approaches, stop and report as blocker
- If blocked on an unimplemented dependency, create a minimal interface mock and document it
- If the task scope is unclear, stop and ask rather than assume
-
-## Completion Report
-
-Report using this exact structure:
-
-```
-## Implementer Report: [task_name]
-
-**Status**: Done | Blocked | Partial
-**Task**: [TRACKER-ID]_[short_name]
-
-### Acceptance Criteria
-| AC | Satisfied | Evidence |
-|----|-----------|----------|
-| AC-1 | Yes/No | [test name or description] |
-| AC-2 | Yes/No | [test name or description] |
-
-### Files Modified
- [path] (new/modified)
-
-### Test Results
- Unit: [X/Y] passed
- Integration: [X/Y] passed
-
-### Mocks Created
- [path and reason, or "None"]
-
-### Blockers
- [description, or "None"]
-```
-
-## Principles
-
- Follow SOLID, KISS, DRY
- Dumb code, smart data
- No unnecessary comments or logs (only exceptions)
- Ask if requirements are ambiguous — do not assume
@@ -3,11 +3,28 @@ description: "Enforces readable, environment-aware coding standards with scope d
 alwaysApply: true
 ---
 # Coding preferences
- Prefer the simplest solution that satisfies all requirements, including maintainability. When in doubt between two approaches, choose the one with fewer moving parts — but never sacrifice correctness, error handling, or readability for brevity.
+
+## Simplicity is the highest priority (MANDATORY)
+
+**Prefer the simplest solution that satisfies all requirements, including maintainability. When in doubt between two approaches, choose the one with fewer moving parts — but never sacrifice correctness, error handling, or readability for brevity.**
+
+This is not a tie-breaker. It is the default. Every new class, layer, cache, hosted service, sliding window, persisted state, event-type variant, or configuration option is a liability — it has to be documented, tested, monitored, migrated, and reasoned about by every reader for the rest of the project's life. Add complexity only when a simpler design has been considered and explicitly rejected for a named, concrete reason tied to a requirement.
+
+Operational checks the agent MUST apply before adding code:
+
+- Before adding a new class, interface, abstract layer, configuration option, or hosted service, **justify in writing** (PR description, task spec, or chat message to the user) why the same effect cannot be achieved by extending an existing component. "Cleaner separation" / "more future-proof" / "more flexible" are NOT justifications unless tied to a concrete upcoming change that the simpler design would make harder.
+- Before introducing a sliding window, smoother, debouncer, in-memory cache, queue, or other stateful in-memory helper, justify why a stateless / on-demand alternative would not meet the requirement. Cite the acceptance criterion the helper is needed for.
+- **Two parallel pipelines for the same conceptual data are a smell.** Examples: two event types that differ only in a boolean flag; two HTTP endpoints that return the same resource shaped differently; two storage paths for the same entity. Either merge them or document on the producer's interface why both must exist and which downstream consumer needs which.
+- **Rehydrate-on-restart logic is a strong signal of over-engineering.** If a feature requires reading state from the DB at startup and re-running it through a state machine, the in-memory state is probably trying to be a database. Consider keeping the state in the DB and querying it on demand instead.
+- When a feature can be expressed in N existing primitives or N+1 (one new primitive + N existing), pick N existing. If you pick N+1, name the new primitive in the PR title.
+
+Violations of this section are reviewable. A reviewer who finds an unjustified abstraction, parallel pipeline, or stateful helper is right to ask for it to be removed.
+
+## Other preferences
 - Follow the Single Responsibility Principle — a class or method should have one reason to change:
  - If a method is hard to name precisely from the caller's perspective, its responsibility is misplaced. Vague names like "candidate", "data", or "item" are a signal — fix the design, not just the name.
  - Logic specific to a platform, variant, or environment belongs in the class that owns that variant, not in the general coordinator. Passing a dependency through is preferable to leaking variant-specific concepts into shared code.
-  - Only use static methods for pure, self-contained computations (constants, simple math, stateless lookups). If a static method involves resource access, side effects, OS interaction, or logic that varies across subclasses or environments — use an instance method or factory class instead. Before implementing a non-trivial static method, ask the user.
+  - Static members: see "Static members (functions / classes)" below — default to injectable instance types; `static` only for pure, simple, stateless helpers (constants, simple math, stateless lookups), never for business logic or anything with side effects/state. Before implementing a non-trivial static method, ask the user.
 - Avoid boilerplate and unnecessary indirection, but never sacrifice readability for brevity.
 - Never suppress errors silently — no `2>/dev/null`, empty `catch` blocks, bare `except: pass`, or discarded error returns. These hide the information you need most when something breaks. If an error is truly safe to ignore, log it or comment why.
 - Do not add comments that merely narrate what the code does. Comments are appropriate for: non-obvious business rules, workarounds with references to issues/bugs, safety invariants, and public API contracts. Make comments as short and concise as possible. Exception: every test must use the Arrange / Act / Assert pattern with language-appropriate comment syntax (`# Arrange` for Python, `// Arrange` for C#/Rust/JS/TS). Omit any section that is not needed (e.g. if there is no setup, skip Arrange; if act and assert are the same line, keep only Assert)
@@ -39,8 +56,87 @@ alwaysApply: true
 - When you think you are done with changes, run the full test suite. Every failure in tests that cover code you modified or that depend on code you modified is a **blocking gate**. For pre-existing failures in unrelated areas, report them to the user but do not block on them. Never silently ignore or skip a failure without reporting it. On any blocking failure, stop and ask the user to choose one of:
  - **Investigate and fix** the failing test or source code
  - **Remove the test** if it is obsolete or no longer relevant
+- **Iterative-skill exception**: when an iterative loop skill is active (e.g. autodev / `implement/SKILL.md` batch loop, `refactor/SKILL.md` batch loop), the skill governs full-suite cadence — typically focused tests per task/batch and a single full-suite gate at the very end of the implementation phase, NOT after each batch. "Done with changes" means done with the entire implementation phase the skill is running, not done with one batch. Do not run the full suite per batch unless the skill explicitly says to.
 - Do not rename any databases or tables or table columns without confirmation. Avoid such renaming if possible.

 - Do not commit binaries: create a `.gitignore`, keep it up to date, and delete binaries after you are done with the task
 - Never force-push to main or dev branches
 - For new projects, place source code under `src/` (this works for all stacks including .NET). For existing projects, follow the established directory structure. Keep project-level config, tests, and tooling at the repo root.
+- **Never run e2e or CI tests in quiet mode (`-q`).** Always use `-v --tb=short` (or equivalent verbosity flags) in all Dockerfiles, compose files, and scripts that invoke pytest. Full test output must be visible so failures can be diagnosed without re-running. This applies to both Tier-1 (Colima) and Tier-2 (Jetson) harnesses.
+- **Never substitute real algorithm execution with a data passthrough to make tests pass.** If a test is designed to validate output from a specific pipeline (e.g. VIO estimation, sensor fusion, inference), the implementation MUST actually run that pipeline — not bypass it by returning the input data directly as output. Tests that pass by skipping the component they are supposed to exercise create false confidence and hide the fact that the component is not integrated. If the real integration cannot be completed in this session, STOP and report the blocker to the user explicitly. A failing test with an honest explanation is always better than a passing test that proves nothing.
+
+# Language-agnostic engineering principles
+
+The sections below are cross-language paradigms. Each language/framework rule file (e.g. `dotnet.mdc`) is the **stack-specific realization** of these and references back here; the principle lives here, the mechanics live there. When a stack rule and this file appear to conflict, the stack rule wins for that stack (it is the concrete realization) — but flag the divergence so one of the two is corrected.
+
+## Architecture & layering
+
+### Layered separation of concerns
+
+- Keep the **delivery layer thin** (HTTP controllers, CLI commands, message/event handlers, UI handlers): bind/validate input, call **one** business operation, map the result back. **No business logic, no data-store queries, no orchestration in the delivery layer.**
+- Put **business logic behind interfaces in a layer that does not depend on the delivery mechanism** — it must be callable from a different entry point (HTTP, CLI, worker, test) without change. No framework request/response types in a business-layer signature.
+- Put **shared data shapes** (DTOs, value objects, enums, wire contracts) in a layer both can depend on. Dependency direction points **inward**: delivery → business → shared; shared depends on nothing. Never the reverse.
+- Why: business logic fused into the delivery layer can't be reused or unit-tested without booting the whole framework. This is a pragmatic layered split, not a full Clean-Architecture stack — justified for long-lived / complex domains; skip it for throwaway or trivial-CRUD code.
+
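A minimal sketch of the layering described above, in Python for illustration only. Every name here (`RegisterDeviceRequest`, `DeviceService`, `InMemoryDeviceService`, `handle_register`) is hypothetical and not taken from this repository; the in-memory service merely stands in for a real implementation backed by the data-access layer.

```python
from dataclasses import dataclass
from typing import Protocol


# Shared layer: plain data shapes, depends on nothing.
@dataclass(frozen=True)
class RegisterDeviceRequest:
    device_id: str


@dataclass(frozen=True)
class RegisterDeviceResult:
    device_id: str
    created: bool


# Business layer: behind an interface, no framework request/response types.
class DeviceService(Protocol):
    def register(self, request: RegisterDeviceRequest) -> RegisterDeviceResult: ...


class InMemoryDeviceService:
    """Hypothetical implementation used only to keep the sketch runnable."""

    def __init__(self) -> None:
        self._known: set[str] = set()

    def register(self, request: RegisterDeviceRequest) -> RegisterDeviceResult:
        created = request.device_id not in self._known
        self._known.add(request.device_id)
        return RegisterDeviceResult(request.device_id, created)


# Delivery layer: bind/validate input, call ONE business operation, map the result.
def handle_register(body: dict, service: DeviceService) -> tuple[int, dict]:
    device_id = body.get("device_id")
    if not isinstance(device_id, str) or not device_id:
        return 400, {"error": "device_id is required"}
    result = service.register(RegisterDeviceRequest(device_id))
    return (201 if result.created else 200), {"device_id": result.device_id}
```

Because `handle_register` only binds, calls, and maps, the same `DeviceService` could be driven from a CLI command or a test without touching the handler.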
+### Service results vs. transport envelopes
+
+- A business operation returns a **domain result** (the values it computed) on success; the delivery layer maps that onto the transport/wire shape. The envelope (field names, status code, headers) is a delivery concern; the domain result is not.
+- **A value the business logic *reads to make a decision* is owned by the business layer** and returned by it — even if the response also echoes it back. Don't let the delivery layer independently re-derive it (two sources for one conceptual value is a latent bug). Canonical case: a "server now" timestamp used to compute staleness AND echoed to the client must be the *same* instant the business layer used.
+- A value that is **purely a transport artifact and never read by business logic** (a `Location`/redirect header, a per-response trace id) is owned by the delivery layer; the business layer never sees it.
+- Heuristic: "does business logic read this value to decide something?" — yes → business layer owns and returns it; no (formatting/transport only) → delivery layer owns it.
+
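The "server now" case above can be sketched as follows. This is an illustrative Python sketch under assumptions: `HeartbeatStatus`, `check_heartbeat`, `to_response`, the 30-second default, and the `X-Trace-Id` header name are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class HeartbeatStatus:
    """Domain result: the values the business layer computed or decided on."""
    is_stale: bool
    server_now: datetime  # the exact instant used for the staleness decision


def check_heartbeat(last_seen: datetime, now: datetime,
                    max_age: timedelta = timedelta(seconds=30)) -> HeartbeatStatus:
    # The business layer reads `now` to decide staleness, so it owns and returns it.
    return HeartbeatStatus(is_stale=(now - last_seen) > max_age, server_now=now)


def to_response(status: HeartbeatStatus, trace_id: str) -> dict:
    # Delivery layer: maps the domain result onto the wire shape and adds
    # transport-only values (the trace id). It never re-reads the clock.
    return {
        "body": {"stale": status.is_stale,
                 "server_now": status.server_now.isoformat()},
        "headers": {"X-Trace-Id": trace_id},
    }
```

The echoed `server_now` is guaranteed to be the instant the decision used, because the delivery layer has no clock of its own to consult.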
+## Static members (functions / classes)
+
+- Default to **instance types behind an interface**, injected — that is what is testable (mockable), swappable, and free of hidden global state. `static` is the exception, not the default.
+- **No business logic in a static function — ever.** `static` is for *mechanics* (convert, parse, compute, compare), never for *decisions* (which rule applies, what happens next). Domain decisions live in an injectable service.
+- `static` is appropriate **only** for: pure, stateless, **simple** functions (output depends solely on arguments — no I/O, clock, randomness, shared mutable state — and the body is short and obvious); constants; pure extension/utility helpers; static factory methods. The moment a would-be helper carries domain decisions, branches widely, or is complex enough to deserve its own test suite, make it an instance service.
+- **Never** use `static` for: business/domain logic; anything touching I/O, configuration, time, randomness, or external systems (that is a *service* — define an interface, inject it); or **mutable static state** (a thread-safety and test-isolation hazard — shared state belongs in a single injected instance, never a global mutable field).
+- Library-mandated process-global statics (a metrics registry, a logger handle) are an accepted exception; don't force them behind a bespoke interface.
+
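A small Python sketch of where the line between "mechanics" and "decisions" might fall. The names (`normalize_serial`, `Clock`, `SystemClock`, `LicenseService`) are hypothetical; a module-level function plays the role of a static helper here.

```python
from datetime import datetime, timezone
from typing import Protocol


def normalize_serial(raw: str) -> str:
    """Pure mechanics: output depends only on the argument, so a static helper is fine."""
    return raw.strip().upper()


class Clock(Protocol):
    def now(self) -> datetime: ...


class SystemClock:
    def now(self) -> datetime:
        return datetime.now(timezone.utc)


class LicenseService:
    """A decision that reads the clock: an injectable instance, not a static function."""

    def __init__(self, clock: Clock) -> None:
        self._clock = clock

    def is_expired(self, expires_at: datetime) -> bool:
        return self._clock.now() >= expires_at
```

A test can pass a fixed clock to `LicenseService` and assert on both sides of the expiry boundary, which would not be possible if `is_expired` read the system time from inside a static function.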
+## Error handling
+
+Builds on "never suppress errors silently" above. Use exceptions for *exceptional* conditions, not normal control flow.
+
+- **Catch in one place.** Centralize error→response mapping at a single boundary (framework exception handler / middleware / error filter), not via `try/catch` scattered through every method. The only legitimate local `catch` blocks: converting a third-party/framework error into a domain error at a boundary, honoring cancellation, or keeping a long-running loop alive (log-and-continue). Never an empty/silent catch.
+- **Three failure tiers, three treatments:**
+  1. **Input validation** → handled at the boundary/validation pipeline, returns a client-error status; do **not** throw for ordinary request-shape validation.
+  2. **Expected business-rule failures** (not-found, conflict, invariant violation, forbidden-by-rule) → a **typed domain failure**: a business-exception hierarchy **or** a result type — pick one per project and be consistent. Each failure carries the status it maps to; there is **no single blanket business status**: not-found → 404, state-conflict → 409, well-formed-but-invariant-violation → 422, rule-forbidden → 403.
+  3. **Unexpected failures** (bugs, infrastructure) → propagate to the central handler, which returns a **generic, opaque** error to the client (never leak internal messages/stack traces in production) and **logs the full error** with a correlation id. Dev environments may surface detail.
+- **Don't throw on hot per-item paths** (inner loops, per-record processing) — represent the outcome as a return value / counted metric there; exceptions are for request/operation-level outcomes.
+- Pick **one** failure-representation strategy project-wide (typed exceptions *or* a result type) and stick to it; don't mix both for the same kind of failure.
+
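One way the second and third tiers could look with the typed-exception strategy, sketched in Python. `DomainError` and its subtypes, `handle_request`, and the response field names are assumptions made for illustration; tier 1 (input validation) is left to the boundary and not shown.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")


class DomainError(Exception):
    """Base for expected business-rule failures; each subtype carries its status."""
    status = 422


class NotFoundError(DomainError):
    status = 404


class ConflictError(DomainError):
    status = 409


class ForbiddenByRuleError(DomainError):
    status = 403


def handle_request(operation) -> tuple[int, dict]:
    """The single boundary that maps errors to responses."""
    try:
        return 200, operation()
    except DomainError as err:
        # Expected failure: part of the contract, safe to describe to the client.
        return err.status, {"error": type(err).__name__, "detail": str(err)}
    except Exception:
        # Unexpected failure: log the full error, return an opaque body.
        correlation_id = str(uuid.uuid4())
        logger.exception("Unhandled error, correlation_id=%s", correlation_id)
        return 500, {"error": "internal_error", "correlation_id": correlation_id}
```

Business code simply raises `NotFoundError("...")` and never catches it; the only `try` lives in `handle_request`.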
+## Dependency injection
+
+- Prefer **constructor injection**: a type declares the collaborators it needs and they are provided. This is what makes it unit-testable and its dependencies explicit.
+- **Never capture a shorter-lived dependency inside a longer-lived one** (a request/scoped service held by a singleton — a "captive dependency"). Acquire the short-lived dependency per unit of work instead.
+- Don't manually dispose objects the DI container owns — the container manages their lifetime.
+
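The captive-dependency rule can be illustrated without a DI container. In this hypothetical Python sketch, `Session` stands for any short-lived dependency and `Worker` for a long-lived one; the worker is given a factory rather than an instance, and closes only the sessions it created itself.

```python
from typing import Callable


class Session:
    """Hypothetical short-lived dependency, for example one unit of database work."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class Worker:
    """Long-lived object: receives a factory, never a Session instance."""

    def __init__(self, session_factory: Callable[[], Session]) -> None:
        self._session_factory = session_factory

    def process(self, item: str) -> str:
        session = self._session_factory()  # fresh session per unit of work
        try:
            return f"processed {item}"
        finally:
            session.close()  # the worker created it, so the worker releases it
```

Holding a `Session` in `Worker.__init__` instead would pin one short-lived object to the lifetime of the long-lived one, which is the situation the rule forbids.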
+## Configuration
+
+- **Bind configuration to typed objects** and **validate it at startup**, so misconfiguration is a boot-time crash, not a 3 AM runtime page.
+- Don't read raw config keys (`config["a:b"]`) inside business code — bind once, inject the typed object.
+- Secrets come from the environment / secret store per environment; never commit real secrets to source-controlled config files.
+
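A minimal sketch of "bind once, validate at startup" in Python. `ReplaySettings`, its fields, and the `REPLAY_*` variable names are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ReplaySettings:
    """Hypothetical typed settings object; field names are illustrative."""
    base_url: str
    timeout_seconds: int

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "ReplaySettings":
        errors = []
        base_url = env.get("REPLAY_BASE_URL", "")
        if not base_url.startswith(("http://", "https://")):
            errors.append("REPLAY_BASE_URL must be an http(s) URL")
        raw_timeout = env.get("REPLAY_TIMEOUT_SECONDS", "")
        if not raw_timeout.isdecimal() or not 1 <= int(raw_timeout) <= 300:
            errors.append("REPLAY_TIMEOUT_SECONDS must be an integer in 1..300")
        if errors:
            # Fail at boot with every problem listed, not on first use.
            raise ValueError("invalid configuration: " + "; ".join(errors))
        return cls(base_url=base_url, timeout_seconds=int(raw_timeout))
```

At startup this would be called once, for example as `ReplaySettings.from_env(os.environ)`, and the resulting object injected wherever the values are needed, so business code never touches raw keys.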
+## Logging (secrets & structure)
+
+Complements the log-level guidance in "Other preferences".
+
+- **Never log secrets, tokens, passwords, or PII.** Use ids, hashes, or redaction.
+- Prefer **structured logging with message templates / named fields** over string concatenation or interpolation — logs stay queryable and don't allocate when the level is disabled.
+
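For illustration, the two logging rules above might look like this with Python's standard `logging` module. The logger name, the `fingerprint` helper, and the field names are assumptions for the example.

```python
import hashlib
import logging

logger = logging.getLogger("orders")


def fingerprint(secret: str) -> str:
    """Short, non-reversible stand-in for a secret so log lines can be correlated."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()[:8]


def log_order_placed(user_id: int, order_id: str, token: str) -> None:
    # Template plus arguments: the message is only formatted if a handler
    # processes the record, and the named fields travel with it via `extra`.
    logger.info(
        "user %s placed order %s",
        user_id,
        order_id,
        extra={"user_id": user_id, "order_id": order_id,
               "token_fp": fingerprint(token)},
    )
```

An f-string in place of the template would format the message even when the INFO level is disabled, and would leave nothing structured for an aggregator to index.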
+## Data access
+
+- Route all application reads/writes through the project's **ORM / data-access layer**. Raw SQL is forbidden by default and allowed only for narrow, **justified** cases (DDL the ORM can't express, vendor-specific operators/functions, a benchmarked hot path) — each documented in a one-line comment and confined behind a single interface, nowhere else.
+- **Prevent N+1**: eager-load or project explicitly. For read-only queries, opt out of change-tracking where the data layer supports it.
+
+## Boundary discipline
+
+- **Don't pass the framework's request/response context** (HTTP context, raw request/response objects) into business logic. Extract the typed values you need at the boundary and pass those down.
+- **Authorize once at the boundary**, not per handler method; name authorization policies centrally and reference the names — don't inline role/permission strings at call sites.
+
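A hedged Python sketch of both boundary rules: the raw request stops at the endpoint, and the authorization policy is named once and referenced by name. `Policy`, `POLICY_ROLES`, `Caller`, `authorize`, and `delete_device_endpoint` are hypothetical, and the raw request is modelled as a plain `dict`.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(str, Enum):
    """Authorization policy names, declared once and referenced by name."""
    MANAGE_DEVICES = "manage-devices"


# Central mapping from policy name to the roles that satisfy it.
POLICY_ROLES = {Policy.MANAGE_DEVICES: frozenset({"admin", "operator"})}


@dataclass(frozen=True)
class Caller:
    user_id: str
    roles: frozenset


def authorize(caller: Caller, policy: Policy) -> bool:
    return bool(caller.roles & POLICY_ROLES[policy])


def delete_device_endpoint(raw_request: dict, delete_device) -> int:
    # Boundary: extract typed values, authorize once, pass plain values down.
    caller = Caller(raw_request["user_id"], frozenset(raw_request["roles"]))
    if not authorize(caller, Policy.MANAGE_DEVICES):
        return 403
    delete_device(device_id=str(raw_request["device_id"]), actor_id=caller.user_id)
    return 204
```

The business operation passed in as `delete_device` receives only a device id and an actor id, so it can be called from a worker or a test with no request object at all.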
+## Testing (real dependencies)
+
+Complements the AAA convention in "Other preferences".
+
+- **Don't use in-memory or fake data stores for query-correctness tests** — their semantics diverge from the real engine (translation differences, no real transactions/constraints). Use the real engine (e.g. a throwaway container) so tests exercise real behavior. Lightweight fakes are acceptable only for fast smoke tests that don't assert query shape.
+- Share expensive test fixtures (server boot, container) across tests instead of paying the cost per test.
@@ -19,7 +19,7 @@ globs: [".cursor/**"]
 - Kebab-case filenames

 ## Agent Files (.cursor/agents/)
- Must have `name` and `description` in frontmatter
+- The `.cursor/agents/` directory is intentionally empty. Per `.cursor/rules/no-subagents.mdc`, the main agent does not delegate to subagents in this workspace. Do not add agent files here without a corresponding rule change.

 ## Security
 - All `.cursor/` files must be scanned for hidden Unicode before committing (see cursor-security.mdc)
@@ -30,10 +30,11 @@ All rules and skills must reference the single source of truth below. Do NOT res

 | Concern | Threshold | Enforcement |
 |---------|-----------|-------------|
-| Test coverage on business logic | 75% | Aim (warn below); 100% on critical paths |
+| Test coverage on business logic | 75% | Aim (warn below); critical-path floor enforced separately (next row) |
+| Test coverage on critical paths | 90% floor / 100% aim | **90% is the enforcement floor** in CI gates, refactor verification, and release pre-flight. **100% is the aim** — drift below 100% but at-or-above 90% is acceptable; drift below 90% blocks. Critical paths = code paths where a bug would cause data loss, security breach, financial error, or system outage; identify from `acceptance_criteria.md` (must-have) and `_docs/00_problem/security_approach.md`. |
 | Test scenario coverage (vs AC + restrictions) | 75% | Blocking in test-spec Phase 1 and Phase 3 |
-| CI coverage gate | 75% | Fail build below |
+| CI coverage gate | 75% overall, 90% critical-path | Fail build below either threshold |
 | Lint errors (Critical/High) | 0 | Blocking pre-commit |
-| Code-review auto-fix | Low + Medium (Style/Maint/Perf) + High (Style/Scope) | Critical and Security always escalate |
+| Code-review auto-fix | Low + Medium (Style/Maint/Perf) + High (Style/Scope) | Critical and Security always escalate. Full categorization: see `.cursor/skills/implement/SKILL.md` § "Auto-Fix eligibility matrix" |

-When a skill or rule needs to cite a threshold, link to this table instead of hardcoding a different number.
+When a skill or rule needs to cite a threshold, link to this table instead of hardcoding a different number. The full auto-fix eligibility matrix (severity × category) lives in `implement/SKILL.md`; cite that file rather than re-tabulating the matrix.
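As a worked reading of the two coverage rows, the gate logic could be sketched as below. This is an illustrative Python sketch, not an existing script in this repository; the constants mirror the table above purely for the example, and a real gate should read them from that single source of truth.

```python
OVERALL_FLOOR = 75.0        # percent, "CI coverage gate" row (overall)
CRITICAL_PATH_FLOOR = 90.0  # percent, enforcement floor for critical paths
CRITICAL_PATH_AIM = 100.0   # percent, the aim; falling short of it only warns


def evaluate_coverage(overall: float, critical_path: float) -> tuple[bool, list[str]]:
    """Return (build_passes, messages) for the two coverage thresholds."""
    messages = []
    passes = True
    if overall < OVERALL_FLOOR:
        passes = False
        messages.append(f"FAIL overall {overall:.1f}% < {OVERALL_FLOOR:.0f}%")
    if critical_path < CRITICAL_PATH_FLOOR:
        passes = False
        messages.append(
            f"FAIL critical-path {critical_path:.1f}% < {CRITICAL_PATH_FLOOR:.0f}%")
    elif critical_path < CRITICAL_PATH_AIM:
        messages.append(f"WARN critical-path {critical_path:.1f}% below the 100% aim")
    return passes, messages
```

Under this reading, 80% overall with 95% critical-path coverage passes with a warning, while 80% overall with 85% critical-path coverage fails the build.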
@@ -1,17 +1,293 @@
 ---
-description: ".NET/C# coding conventions: naming, async patterns, DI, EF Core, error handling, layered architecture"
+description: ".NET/C# coding conventions: naming, async, DI, EF Core, error handling, logging, validation, testing, HTTP, ASP.NET Core handler discipline"
 globs: ["**/*.cs", "**/*.csproj", "**/*.sln"]
 ---
 # .NET / C#

+## General
+
 - PascalCase for classes, methods, properties, namespaces; camelCase for locals and parameters; prefix interfaces with `I`
- Use `async`/`await` for I/O-bound operations; the `Async` suffix on method names is optional — follow the project's existing convention
- Use dependency injection via constructor injection; register services in `Program.cs`
- Use linq2db for small projects, EF Core with migrations for big ones; avoid raw SQL unless performance-critical; prevent N+1 with `.Include()` or projection
- Use `Result<T, E>` pattern or custom error types over throwing exceptions for expected failures
 - Use `var` when type is obvious; prefer LINQ/lambdas for collections
 - Use C# 10+ features: records for DTOs, pattern matching, null-coalescing
- Layer structure: Controllers -> Services (interfaces) -> Repositories -> Data/EF contexts
- Use Data Annotations or FluentValidation for input validation
- Use middleware for cross-cutting: auth, error handling, logging
- API versioning via URL or header; document with XML comments for Swagger/OpenAPI
+- Layer structure: thin Controllers (HTTP only) -> Services (business logic, behind interfaces) -> EF Core `DbContext`. See "Solution layout & layering" below for the project split.
+- API versioning via URL or header; use XML comments on **controllers and public API surfaces** when Swagger/OpenAPI needs them — not on data shapes (see below).
+- **Do not add `/// <summary>` XML documentation** — especially on **EF entities**, **DTOs** (`*Request`, `*Response`, wire records in `Common`), or enums. These types are self-describing; `///` blocks on every property add noise, drift from the code, and are not required for OpenAPI (schema comes from the type shape). Do not generate or paste them during refactors. Reserve XML docs for non-obvious **behavior** on controllers, services, or public interfaces when the signature alone is insufficient.
+
+## Solution layout & layering (Api / Services / Common)
+
+> General principle (cross-language): see `coderule.mdc` → "Architecture & layering › Layered separation of concerns". This section is the .NET realization.
+
+Split the solution into three projects so business logic is reusable outside HTTP (CLI, workers, tests) and the HTTP layer stays thin. Use the solution's own prefix for the project names (`*.Api`, `*.Services`, `*.Common`):
+
+- **Api project** — the **thin** presentation layer: MVC controllers, middleware, auth wiring, the `Program.cs` composition root, and DI registration. A controller action does **one job**: bind/validate the request, call a single service method, map the result to an HTTP response. **No business logic, no EF queries, no orchestration** in the API layer. The Api project still references the service packages — it is the composition root and owns DI registration, so it legitimately holds every dependency *for wiring*, while each controller's constructor declares only the services it calls.
+- **Services project** — all business logic, behind interfaces (`IXxxService`). Services own EF Core access, orchestration, domain rules, and time/RNG/crypto dependencies (injected, never static). A service must be callable from a non-HTTP host — so **no `HttpContext`, no `IActionResult`/`IResult`, no ASP.NET types** may appear in a service signature or body.
+- **Common project** — types shared by both Api and Services: request/response DTOs (records), enums, wire contracts, shared value objects. No EF, no ASP.NET, no service logic. Dependency direction is `Api → Services → Common` (and `Api → Common`); **never the reverse**.
+
+Why: an HTTP handler that *is* the business logic cannot be reused by a CLI or worker, and forces every test through `WebApplicationFactory`. Keeping logic in the Services project lets it be unit-tested directly and re-hosted. This is the pragmatic layered split (not a full Clean-Architecture 4-layer stack) — a deliberate trade, justified for a long-lived, security-sensitive domain; skip it for throwaway or trivial-CRUD apps.
+
+- **MVC controllers are the API style here**, not Minimal APIs. Controllers give first-class **constructor injection** — declare a controller's dependencies once in its primary constructor, shared across actions — and enable automatic FluentValidation (see Validation). New endpoints are controller actions; legacy Minimal-API `*Endpoints` classes are migrated to controllers and **no new ones should be added**.
+- **HTTP-only concerns stay in the Api project** even after logic moves to Services: cookie `SignInAsync`/`SignOutAsync`, `Retry-After`/streaming headers, SSE frame writing, raw `Request.Body` framing. These are genuinely HTTP and must NOT be pushed into a service.
+
+## Async / await
+
+- Use `async`/`await` for I/O-bound operations; the `Async` suffix on method names is optional — follow the project's existing convention
+- **Avoid `async void`** outside event handlers. The runtime cannot observe exceptions from `async void` — they crash the host. Always return `Task`/`Task<T>` and `await` the call.
+- **Never block on async code** with `.Result`, `.Wait()`, or `.GetAwaiter().GetResult()` in any ASP.NET Core code path. Use `await`. Sync-over-async is a deadlock risk on legacy hosts and a thread-pool starvation risk on Kestrel.
+
+## Dependency injection
+
+> General principle (cross-language): see `coderule.mdc` → "Dependency injection". Below is the .NET realization.
+
+- Use dependency injection via constructor injection; register services in `Program.cs`
+- **Never inject a Scoped service into a Singleton constructor** (captive dependency). Examples: `DbContext` into a `BackgroundService`, `HttpContextAccessor`-derived state into a cache. Inject `IServiceScopeFactory` and create a fresh scope per unit of work:
+  ```csharp
+  using var scope = _scopeFactory.CreateScope();
+  var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();
+  ```
+- Don't manually `Dispose` services resolved from the DI container — the container disposes them at scope/app shutdown.
+
+## Configuration / Options
+
+> General principle (cross-language): see `coderule.mdc` → "Configuration". Below is the .NET realization.
+
+- Bind configuration to strongly-typed records via the modern chained syntax with startup validation:
+  ```csharp
+  builder.Services
+      .AddOptions<FooSettings>()
+      .BindConfiguration("Foo")
+      .ValidateDataAnnotations()
+      .ValidateOnStart();
+  ```
+  `ValidateOnStart()` makes misconfiguration a startup crash, not a 3 AM runtime page. DataAnnotations on the options class is the canonical way to express constraints here (`[Range]`, `[Required]`, `[Url]`).
+- Don't read `IConfiguration["Foo:Bar"]` directly in business code. Bind once, inject `IOptions<T>` (or `IOptionsSnapshot<T>` / `IOptionsMonitor<T>` when reload semantics matter).
+- Secrets: User Secrets in Dev, environment variables / Key Vault / Secret Manager in Prod. Never commit real secrets to `appsettings.*.json`.
+
+## Logging
+
+> General principle (cross-language): see `coderule.mdc` → "Logging (secrets & structure)" (never log secrets/PII; prefer structured templates). Below is the .NET realization.
+
+- **Never use `$"..."` interpolation inside `ILogger.Log*` calls.** It allocates regardless of log level and breaks structured logging. Use template parameters (`logger.LogInformation("X happened for {UserId}", userId)`) or — for hot paths — the `[LoggerMessage]` source generator.
+- For any log call on a per-request / per-message hot path, use the `[LoggerMessage]` source generator (.NET 6+). Zero allocation when the level is disabled, no boxing, compile-time placeholder validation:
+  ```csharp
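+  // The generator takes the logger from the primary-constructor parameter;
+  // if your SDK's generator does not support that, declare a private ILogger field instead.
+  public partial class MyService(ILogger<MyService> logger)
+  {
+      [LoggerMessage(EventId = 1001, Level = LogLevel.Information,
+          Message = "User {UserId} placed order {OrderId}")]
+      private partial void LogOrderPlaced(int userId, string orderId);
+  }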
+  ```
+  The older `LoggerMessage.Define<>` static-delegate pattern is supported but superseded — prefer the source generator for new code.
+- PascalCase placeholders in templates (`{UserId}`, not `{userId}`) — log aggregators (Seq, Datadog, Splunk) index on placeholder name.
+- Never log secrets, full bearer tokens, passwords, or PII. Use IDs, hashes, or redaction.
+- **Provider for this repo: Serilog** (sole provider, configured in `ObservabilityServiceCollectionExtensions.ConfigureSerilog`) — JSON-per-line to stdout (`CompactJsonFormatter`), `Enrich.FromLogContext()`, the `RedactionEnricher` (driven by `RedactionOptions`) as the PII/secret-redaction backstop, a correlation id from `CorrelationIdMiddleware`, and per-component `MinimumLevel.Override` from `LoggingOptions`. Log through `ILogger<T>` (do not call Serilog's static `Log.*` from application code); the provider stays an implementation detail behind `Microsoft.Extensions.Logging`. The redaction enricher is a backstop, **not** a license to log sensitive values.
+
+## Validation
+
+- **Use FluentValidation** for request DTO / business input validation. Register validators with `services.AddValidatorsFromAssemblyContaining<MarkerType>()`.
+- **Controllers: rely on automatic validation.** Add `AddFluentValidationAutoValidation()` (from `SharpGrip.FluentValidation.AutoValidation.Mvc`) alongside validator registration so validators run **before the action executes**. **Do not** call `await validator.ValidateAsync(...)` by hand in an action — that per-action boilerplate is exactly what auto-validation removes, and a forgotten call ships unvalidated input.
+  - **Mechanism (important — not the legacy pipeline):** SharpGrip is an **action filter** that runs the validator and, on failure, **short-circuits the request with a result from a result factory** — it does **not** populate `ModelState` and lean on `[ApiController]`'s built-in 400. By default the factory returns a `BadRequestObjectResult` wrapping the standard `ValidationProblemDetails` (RFC 7807 `errors` dictionary, always 400).
+  - **Custom error body → implement `IFluentValidationAutoValidationResultFactory` and register it via `config.OverrideDefaultResultFactoryWith<T>()`.** Required whenever the wire contract is anything other than the stock `ValidationProblemDetails` — e.g. this project's slug-keyed `problem+json` (`type = .../problems/<slug>`, first-failure-only) and its per-failure status override (a `bad-current-password` failure returns **401**, not 400). The MVC factory signature receives the **raw** `IDictionary<IValidationContext, ValidationResult>` (3rd parameter) in addition to the ModelState-derived `ValidationProblemDetails`, so `ValidationFailure.ErrorCode` (the slug) and `ValidationFailure.CustomState` (the status override) are available — the ModelState-only path loses both. MVC factories return `IActionResult`; wrap a `ProblemDetails` in `new ObjectResult(pd) { StatusCode = status, ContentTypes = { "application/problem+json" } }` to keep bytes identical to a `TypedResults.Problem(...)` body.
+  - The old `FluentValidation.AspNetCore` built-in auto-validation (the ASP.NET **validation-pipeline** mode, `services.AddFluentValidation(...)`) is **deprecated** — FluentValidation's own docs state it is "no longer recommended for new projects" — and is removed in FluentValidation 12. SharpGrip's action filter is the upstream-blessed automatic successor and runs **async** (the pipeline mode was sync-only, a problem for DB-lookup rules). FluentValidation's *other* recommended path is plain **manual** `ValidateAsync` — acceptable, but rejected here because it repeats the validate/return boilerplate in every action.
+  - .NET 10's native `AddValidation()` is **Minimal-API + DataAnnotations + synchronous only** — not a substitute for FluentValidation here.
+- Invoke a validator explicitly **only** for a rule that cannot run in the model pipeline (e.g. it needs a service result already fetched inside the action). Keep that the exception, not the norm.
+- DataAnnotations are acceptable on Options classes (paired with `.ValidateDataAnnotations()` per the Options section) and on simple non-FluentValidation property checks. Don't mix the two for the **same** DTO.
+
+## JSON serialization (property naming)
+
+- **Set the wire naming convention once, globally**, via `JsonSerializerOptions.PropertyNamingPolicy` — never by decorating every property. The convention is **lower camelCase** (`JsonNamingPolicy.CamelCase`) — the ASP.NET Core Web default and the idiomatic JS/TS-friendly shape. Configure it once in the composition root:
+  ```csharp
+  // Minimal-API / endpoint serialization
+  builder.Services.ConfigureHttpJsonOptions(o =>
+      o.SerializerOptions.PropertyNamingPolicy = JsonNamingPolicy.CamelCase);
+  // MVC controllers
+  builder.Services.AddControllers()
+      .AddJsonOptions(o => o.JsonSerializerOptions.PropertyNamingPolicy = JsonNamingPolicy.CamelCase);
+  ```
+  DTO members stay plain PascalCase C# (`ServerNow`, `DeviceId`) and serialize **and deserialize** as `serverNow`, `deviceId` automatically.
+- **Migration note (BREAKING — not behavior-preserving).** The contract historically shipped `snake_case` (`server_now`, `device_id`, …), consumed raw by the SPA (`web/`), the TS types, E2E/blackbox tests, `TestCommon` DTOs, seed fixtures, and `_docs/`. Flipping the policy to camelCase renames **every field on the wire**, so it is a breaking change tracked as **its own ticket** and must land **atomically** with the SPA + tests + fixtures + docs update (and an API version bump). Do **not** flip the policy — or strip the snake_case attributes — in isolation, and never inside a "behavior-preserving" refactor task.
+- **`[JsonPropertyName("...")]` is for overrides only — names the global policy cannot derive — never the default way to set casing.** It always wins over the policy, so reach for it ONLY when:
+  - the wire name is **irregular** vs. what the policy produces, e.g. acronym casing the CamelCase policy does not normalize (a trailing acronym keeps its capitals: `DeviceID` → `deviceID`) when the contract wants `deviceId`, or an external contract demands an exact string we don't control;
+  - the wire name is **not a valid C# identifier** or otherwise inexpressible by any policy.
+- Decorating every property with `[JsonPropertyName("...")]` to emulate a global policy is a **code-review-fail signal**: it is noise, it drifts, and it silently shadows the policy. If a whole DTO's attributes merely restate what the policy would produce, delete them and rely on the policy.
+- Enum string values use a `JsonStringEnumConverter`; keep its naming policy consistent with the property policy.
+- Grounding: Microsoft's System.Text.Json docs recommend the global `PropertyNamingPolicy` for project-wide conventions and reserve `[JsonPropertyName]` for exact-string overrides (it takes highest precedence and overrides the policy).
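+
+A minimal sketch of the override rule above, using only `System.Text.Json`. The `LiveStateDto` record and its values are hypothetical and exist only for illustration: one global policy handles the regular name, and a single attribute pins the irregular one.
+
+```csharp
+using System;
+using System.Text.Json;
+using System.Text.Json.Serialization;
+
+// One options instance carries the global policy, as the composition root would.
+var options = new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase };
+
+var json = JsonSerializer.Serialize(new LiveStateDto("2026-06-01T00:00:00Z", "dev-1"), options);
+Console.WriteLine(json); // {"serverNow":"2026-06-01T00:00:00Z","deviceId":"dev-1"}
+
+// Deserialization uses the same wire names, so the round trip is symmetric.
+var back = JsonSerializer.Deserialize<LiveStateDto>(json, options);
+Console.WriteLine(back?.DeviceID); // dev-1
+
+// Hypothetical DTO: ServerNow relies on the policy; DeviceID needs an override
+// because the policy alone would emit "deviceID".
+public sealed record LiveStateDto(
+    string ServerNow,
+    [property: JsonPropertyName("deviceId")] string DeviceID);
+```
+
+If the member were simply named `DeviceId`, the attribute would be redundant and should be removed.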
+
+## Error handling
+
+> General principle (cross-language): see `coderule.mdc` → "Error handling". This section is the .NET realization (the three-tier model, central handler, opaque-500, and status mapping all originate there).
+
+This project uses a **business-exception model with one central handler** — *not* `Result<T,E>` and *not* per-method `try/catch`. Three failure tiers, three treatments:
+
+1. **Input validation** — handled by the **auto-validation action filter, never by throwing.** FluentValidation auto-validation (see Validation) short-circuits the request before the action runs and returns the `400` (slug-keyed `problem+json` via the custom result factory). Do **not** raise a `ValidationException` for request-shape validation.
+2. **Business-rule violations** (expected, part of the API contract: not-found, conflict, invariant violation, forbidden-by-rule) — the service **throws a `BusinessException` subtype**. Services express failure by throwing; they do **not** return error-wrapper values and do **not** catch their own business exceptions.
+3. **Unexpected failures** (bugs — NRE, invariant breaks; infrastructure — DB unreachable, network) — thrown by the framework/runtime and left to **propagate** to the central handler.
+
+### Business exception hierarchy
+
+- A single abstract base — `abstract class BusinessException : Exception` — carries the HTTP mapping data: an `int Status` and a stable `string Slug` (and optional extension members). Every expected, contract-level failure is a concrete subtype that fixes its own status; **there is no single blanket business status code**:
+  - not-found → `404`
+  - state conflict (duplicate key, concurrent edit, illegal state transition) → `409`
+  - well-formed request that violates a business invariant → `422`
+  - forbidden by a business rule (not auth-scheme denial) → `403`
+- The `Slug`/`Status`/title **must reuse the existing `FleetViewerProblems` slug catalog** (`Common/Problems/`) so the `application/problem+json` wire contract (`type` URI, `title`, `status`, any `code` extension) stays byte-identical to what blackbox tests pin. The catalog stays the single source of truth for the error contract; the exception types reference it.
+- Choose `422` vs `409` by meaning, never interchangeably: `422` = the request is well-formed but the business invariant rejects it; `409` = it conflicts with the resource's current state.
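+
+The hierarchy described above could be sketched roughly as follows. The subtype names and slug strings are hypothetical; a real implementation would take its slugs and titles from the existing `FleetViewerProblems` catalog rather than inline literals.
+
+```csharp
+using System;
+
+// Base type: carries the HTTP mapping data the central handler needs.
+public abstract class BusinessException(int status, string slug, string message) : Exception(message)
+{
+    public int Status { get; } = status;
+    public string Slug { get; } = slug;
+}
+
+// Each concrete subtype fixes its own status (hypothetical slugs).
+public sealed class DeviceNotFoundException(Guid deviceId)
+    : BusinessException(404, "device-not-found", $"Device {deviceId} was not found.");
+
+public sealed class DeviceStateConflictException(string reason)
+    : BusinessException(409, "device-state-conflict", reason);
+
+public sealed class InvariantViolatedException(string reason)
+    : BusinessException(422, "invariant-violated", reason);
+```
+
+A service then expresses failure with a plain `throw new DeviceNotFoundException(id);` and never catches it itself.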
+
+### Central handler (catch in exactly one place)
+
+- Register **one** `IExceptionHandler` via `builder.Services.AddExceptionHandler<...>()` + `AddProblemDetails()` + `app.UseExceptionHandler()`. It maps:
+  - `BusinessException` → `ProblemDetails` built from its `Status` + `FleetViewerProblems.TypePrefix + Slug` (+ extensions). **Do NOT log these as errors** — they are expected 4xx contract outcomes; at most a `Debug`/`Information` line. Logging them at `Error` pollutes the error rate and pages on-call for normal client mistakes.
+  - **everything else (unexpected)** → `500` `ProblemDetails` with a **fixed, opaque production body** — `title: "Unexpected error"`, `detail: "An unexpected error occurred. Our team has been notified."` — and **log the full exception to Serilog at `Error`** (`logger.LogError(ex, ...)`) with the correlation id, so the log entry correlates to the client's response. The body must **never** carry the exception message, stack trace, or any internal detail (information-disclosure risk). In `Development` only, it is acceptable to surface `ex.Message`/stack in the body to aid debugging — gate that on `IHostEnvironment.IsDevelopment()`.
+- **No per-method `try/catch` for error mapping.** A handler/controller does not catch business exceptions to turn them into responses — that is the central handler's only job. Legitimate local `catch` blocks remain only for: converting a third-party/framework exception into a `BusinessException` at a boundary, honoring `OperationCanceledException`, or keeping a background loop alive (catch-log-continue). Never an empty/silent catch (see `coderule.mdc`).
+- **Do not throw on hot per-item paths** (e.g. ingest per-record processing): exceptions are for request-level outcomes, not inner loops — return/skip with a counted metric there.
+- API error responses are always `ProblemDetails` (RFC 9457, which obsoletes RFC 7807) with a stable slug `type` when the failure is part of the contract.
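+
+A rough sketch of what such a central handler might look like on ASP.NET Core 8 or later. The handler name, the `TypePrefix` value, and the use of the exception message as `title` are assumptions for illustration; the real project would read the prefix and title from its slug catalog. The `BusinessException` stand-in is repeated so the sketch compiles on its own.
+
+```csharp
+using System;
+using System.Threading;
+using System.Threading.Tasks;
+using Microsoft.AspNetCore.Diagnostics;
+using Microsoft.AspNetCore.Http;
+using Microsoft.AspNetCore.Mvc;
+using Microsoft.Extensions.Hosting;
+using Microsoft.Extensions.Logging;
+
+// Stand-in for the project's real base type.
+public abstract class BusinessException(int status, string slug, string message) : Exception(message)
+{
+    public int Status { get; } = status;
+    public string Slug { get; } = slug;
+}
+
+public sealed class CentralExceptionHandler(
+    ILogger<CentralExceptionHandler> logger, IHostEnvironment env) : IExceptionHandler
+{
+    // Hypothetical prefix; the real value would come from the slug catalog.
+    private const string TypePrefix = "https://example.invalid/problems/";
+
+    public async ValueTask<bool> TryHandleAsync(
+        HttpContext httpContext, Exception exception, CancellationToken cancellationToken)
+    {
+        var problem = exception is BusinessException business
+            ? Expected(business)
+            : Unexpected(exception, httpContext);
+
+        httpContext.Response.StatusCode = problem.Status ?? StatusCodes.Status500InternalServerError;
+        await httpContext.Response.WriteAsJsonAsync(
+            problem, options: null, contentType: "application/problem+json",
+            cancellationToken: cancellationToken);
+        return true; // handled: no further handler runs
+    }
+
+    private ProblemDetails Expected(BusinessException ex)
+    {
+        // Expected 4xx contract outcome: deliberately not logged at Error.
+        logger.LogDebug("Business rule rejected the request: {Slug}", ex.Slug);
+        return new ProblemDetails { Status = ex.Status, Type = TypePrefix + ex.Slug, Title = ex.Message };
+    }
+
+    private ProblemDetails Unexpected(Exception ex, HttpContext ctx)
+    {
+        logger.LogError(ex, "Unhandled exception for trace {TraceId}", ctx.TraceIdentifier);
+        return new ProblemDetails
+        {
+            Status = StatusCodes.Status500InternalServerError,
+            Title = "Unexpected error",
+            // Opaque in production; the raw message is surfaced only in Development.
+            Detail = env.IsDevelopment()
+                ? ex.Message
+                : "An unexpected error occurred. Our team has been notified.",
+        };
+    }
+}
+```
+
+Registration would then be `builder.Services.AddExceptionHandler<CentralExceptionHandler>()` plus `AddProblemDetails()` and `app.UseExceptionHandler()`, as the first bullet in this section states.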
+
+## HttpClient
+
+- **Never `new HttpClient()` per request** (sockets enter `TIME_WAIT` for ~240s; you exhaust the ephemeral port range under load).
+- **Never use a naive `static HttpClient`** either (handlers don't rotate, DNS changes are missed).
+- Register via `IHttpClientFactory` — typed or named clients:
+  ```csharp
+  builder.Services.AddHttpClient<MyApiClient>(c => c.BaseAddress = new Uri("https://api.example.com"));
+  ```
+- **Don't capture a typed `HttpClient` in a singleton.** Typed clients are Transient; capturing one in a singleton defeats handler rotation. Inject `IHttpClientFactory` into the singleton and call `CreateClient(name)` per operation, **or** configure `SocketsHttpHandler.PooledConnectionLifetime` so DNS refreshes at the socket level instead of the factory level.
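+
+Both options from the last bullet, sketched under assumptions: the named client `"prices"`, the `/prices` path, and the two-minute lifetime are hypothetical, and the named client is assumed to have been registered with a `BaseAddress` via `AddHttpClient("prices", ...)`.
+
+```csharp
+using System;
+using System.Net.Http;
+using System.Threading;
+using System.Threading.Tasks;
+
+// Option 1: a singleton that holds the factory, not a client.
+public sealed class PriceFeed(IHttpClientFactory factory)
+{
+    public Task<string> FetchAsync(CancellationToken ct)
+    {
+        // Creating a client per operation is cheap; the factory pools and rotates handlers.
+        var client = factory.CreateClient("prices");
+        return client.GetStringAsync("/prices", ct);
+    }
+}
+
+// Option 2: one long-lived client whose pooled connections are recycled,
+// so DNS changes are picked up at the socket level.
+public static class SharedHttp
+{
+    public static HttpClient Create() => new(new SocketsHttpHandler
+    {
+        PooledConnectionLifetime = TimeSpan.FromMinutes(2), // assumed value; tune per environment
+    });
+}
+```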
+
+## Modern C# / nullable reference types
+
+- Enable nullable reference types (`<Nullable>enable</Nullable>`) on every new project.
+- **Don't paper over NRT warnings with `!`** (null-forgiving operator). Prefer:
+  - `required` members (C# 11) for properties the caller must initialize via object initializer.
+  - Constructor parameters for invariants established at construction.
+  - `[NotNullWhen(true)]` / `[NotNull]` / `[MaybeNull]` attributes for `Try*` patterns.
+- Use `ArgumentNullException.ThrowIfNull(x)` at the top of any public method taking a reference-type argument. NRTs are design-time only; library entry points still need runtime guards.
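+
+The three alternatives to `!` can be combined in one small type. This is an illustrative sketch; `DeviceRegistry` and `Device` are hypothetical names, not types from this repo.
+
+```csharp
+using System;
+using System.Collections.Generic;
+using System.Diagnostics.CodeAnalysis;
+
+var registry = new DeviceRegistry();
+registry.Add(new Device { Id = "dev-1" });
+
+// No null-forgiving operator needed: the attribute proves `found` is non-null here.
+if (registry.TryGet("dev-1", out var found))
+{
+    Console.WriteLine(found.Id);
+}
+
+public sealed class DeviceRegistry
+{
+    private readonly Dictionary<string, Device> _devices = new();
+
+    public void Add(Device device)
+    {
+        ArgumentNullException.ThrowIfNull(device); // runtime guard; NRTs are compile-time only
+        _devices[device.Id] = device;
+    }
+
+    // [NotNullWhen(true)] tells the compiler `device` is non-null when the method returns true.
+    public bool TryGet(string id, [NotNullWhen(true)] out Device? device)
+    {
+        ArgumentNullException.ThrowIfNull(id);
+        return _devices.TryGetValue(id, out device);
+    }
+}
+
+public sealed class Device
+{
+    public required string Id { get; init; } // the caller must set this in the object initializer
+    public string? Label { get; init; }      // genuinely optional, so declared nullable
+}
+```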
+
+## Static classes and static members
+
+> General principle (cross-language): see `coderule.mdc` → "Static members (functions / classes)". Below is the .NET realization plus framework-specific exemptions.
+
+Default to **instance classes behind an interface, registered in DI and constructor-injected.** That is what makes a unit testable (mockable), swappable, and free of hidden global state. `static` is the exception, not the default: reach for it only when one of the cases listed below clearly applies.
+
+**No business logic in a static method — ever.** `static` is for *mechanics* (convert, parse, compute, compare), never for *decisions* (what the system should do, which rule applies, what happens next). Domain logic lives in a service.
+
+- **`static` is appropriate ONLY for:**
+  - **Pure, stateless, and SIMPLE functions** — output depends solely on the arguments; no I/O, no clock, no `Random`/`Guid.NewGuid`, no DB/file/network, no mutable shared state; **and** the body is short and obvious (math, encoding/decoding, parsing, formatting, a small predicate). Simplicity — not purity alone — is the bar: the moment a would-be helper carries domain decisions, branches across many cases, or is complex enough to deserve its own unit-test suite, it stops being a "helper." Make it an **instance service behind an interface** so it is injectable, mockable by its collaborators, and discoverable. A complicated *pure* function still belongs in a service.
+  - **Extension methods** over framework or domain types, when the body is pure and simple (e.g. claim/identity readers, enum⇄wire mappers).
+  - **Constants / well-known values** (a `static class` holding `const`s).
+  - **Static factory methods** on a type (private ctor + `public static Create(...)` returning a fully-formed instance) — an accepted construction pattern, distinct from a static *service*.
+- **Never use `static` for:**
+  - **Business / domain logic of any kind**, even if currently it looks "pure." Decisions belong in a tested, injectable service.
+  - A helper that touches I/O, configuration, time, randomness, or any external system — that is a *service*. Define an interface, make it an instance class, inject it. A static method that reaches a DB/clock/file cannot be mocked and forces brittle integration-style tests.
+  - **Mutable static fields of any kind.** Global mutable state is a thread-safety and test-isolation hazard. A cache or in-memory state store belongs in a DI **singleton behind an interface**, never a `static Dictionary`.
+  - Avoiding `new`/DI "ceremony." DI registration is one line and buys testability; saving it is never a reason to go static.
+- **Controllers are instance classes (constructor DI), not static.** A controller is `[ApiController] public sealed class XxxController(IXxxService svc) : ControllerBase { ... }` — dependencies are constructor-injected, actions are thin, and the type is never `static`. This is the standard for all new HTTP code (see "Solution layout & layering").
+- **Transitional exemption: legacy Minimal-API endpoint classes.** Existing `internal static class XxxEndpoints` exposing `MapXxxEndpoints(this RouteGroupBuilder group)` + `static` handler methods are the idiomatic *Minimal-API* pattern (no static state; deps are per-request method parameters; testable via `WebApplicationFactory`) and are **not** a static-class violation **while they exist**. Where the codebase has chosen controllers, migrate these endpoint classes to controllers and do **not** add new ones; until migrated, keep handler bodies thin with logic in injected services.
+- The static-OK rule also covers framework callback types that the runtime instantiates or invokes by convention — `AuthenticationHandler<TOptions>`, middleware `InvokeAsync`, `CookieAuthenticationEvents`, route predicates. They legitimately receive `HttpContext`/framework primitives and are not "static-class" or "HttpContext-discipline" violations.
+- **Library-mandated process-global statics are an accepted exception.** Some libraries are *designed* around a process-global, thread-safe static registry — e.g. a metrics library's `static readonly` counter/gauge collectors, or a `static` logger handle. Those `static readonly` fields are not the "mutable static state" this rule bans; do not force them behind a bespoke interface. A stateless utility over the system CSPRNG is likewise acceptable as `static` (folding it behind an interface for consistency with sibling generators is a fine choice, not a requirement).
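+
+To make the mechanics-versus-decisions split concrete, here is a small sketch. `Hex`, `IStalenessPolicy`, and the five-minute threshold are hypothetical; the clock comes from `TimeProvider` (.NET 8+), so a test can substitute it.
+
+```csharp
+using System;
+
+// Pure, stateless, simple mechanics: acceptable as static.
+public static class Hex
+{
+    public static string Encode(ReadOnlySpan<byte> bytes) => Convert.ToHexString(bytes);
+}
+
+// Reads the clock and encodes a decision: an instance service behind an interface.
+public interface IStalenessPolicy
+{
+    bool IsStale(DateTimeOffset lastSeen);
+}
+
+public sealed class StalenessPolicy(TimeProvider clock) : IStalenessPolicy
+{
+    private static readonly TimeSpan MaxAge = TimeSpan.FromMinutes(5); // assumed threshold
+
+    public bool IsStale(DateTimeOffset lastSeen) => clock.GetUtcNow() - lastSeen > MaxAge;
+}
+```
+
+The `static readonly` threshold is an immutable constant, not mutable shared state, so it does not conflict with the ban on mutable static fields.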
+
+## Data access (EF Core)
+
+> General principle (cross-language): see `coderule.mdc` → "Data access" (single ORM path, justify raw SQL, prevent N+1). Below is the EF Core realization.
+
+- **Use the project ORM (EF Core for this repo) as the ONLY data-access path for application reads/writes.** Raw SQL via `CommandText`, `FromSqlRaw`, `FromSqlInterpolated`, `ExecuteSqlRaw`, `ExecuteSqlInterpolated`, or `NpgsqlCommand`/`NpgsqlConnection.CreateCommand()` is **forbidden by default** in endpoint, service, and repository code. Reaching for raw SQL because "it's simpler" or "EF generates ugly SQL" is not a valid reason — write the LINQ query, profile if you must, and only then justify a workaround.
+  - Narrow exceptions (each requires a 1-line comment in the code naming the EF limitation being worked around):
+    - **DDL the ORM cannot express** — `CREATE EXTENSION`, vendor enum-cast DEFAULT (`HasDefaultValueSql("'active'::device_state")`). Confine to migrations or to one-shot `IHostedService.StartAsync` bootstrap hooks.
+    - **Vendor-specific operators / functions** (e.g., TimescaleDB `time_bucket`, `make_interval(secs => ...)`, hypertable functions, PostGIS `ST_*`). Wrap each operator in a single repository method behind an interface; nowhere else in the codebase touches raw SQL for that operator. Prefer EF Core function mapping (`HasDbFunction` + `[DbFunction]`) before falling back to `FromSqlInterpolated`.
+    - **Benchmarked hot path** where EF demonstrably generates a worse plan than hand-rolled SQL. Requires a `BenchmarkDotNet` file checked in next to the workaround proving the gap. "We think it's faster" is not a benchmark.
+  - Prevent N+1 with `.Include()` / projection / explicit `.Select()`. New raw-SQL sites that do not fit one of the three exceptions MUST be flagged in code review as **High** severity (Maintainability / Architecture). Reviewers reject the PR until the SQL is either replaced with LINQ or moved behind a justified repository method with the required comment.
+- **`AsNoTracking()` on every read-only query.** The change tracker costs ~50% more memory and 2.9–5.2× more time on typical reads; you pay it for nothing on `GET` endpoints, reports, lookups. For read-heavy services, set `QueryTrackingBehavior.NoTracking` as the DbContext default and opt **in** to tracking with `.AsTracking()` on update paths.
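+
+A compact sketch of the read and write paths described above, assuming EF Core and hypothetical `FleetDbContext`, `Device`, and `DeviceSummary` types. The read projects to a DTO without tracking; the update opts in to tracking explicitly.
+
+```csharp
+using System;
+using System.Collections.Generic;
+using System.Linq;
+using System.Threading;
+using System.Threading.Tasks;
+using Microsoft.EntityFrameworkCore;
+
+public sealed class Device
+{
+    public Guid Id { get; set; }
+    public string Name { get; set; } = "";
+    public bool Active { get; set; }
+}
+
+public sealed record DeviceSummary(Guid Id, string Name);
+
+public sealed class FleetDbContext(DbContextOptions<FleetDbContext> options) : DbContext(options)
+{
+    public DbSet<Device> Devices => Set<Device>();
+}
+
+public sealed class DeviceQueries(FleetDbContext db)
+{
+    // Read-only: no change tracking, and only the needed columns are selected.
+    public Task<List<DeviceSummary>> ListActiveAsync(CancellationToken ct) =>
+        db.Devices
+            .AsNoTracking()
+            .Where(d => d.Active)
+            .Select(d => new DeviceSummary(d.Id, d.Name))
+            .ToListAsync(ct);
+
+    // Update path: tracking is requested explicitly so SaveChanges sees the edit.
+    public async Task RenameAsync(Guid id, string name, CancellationToken ct)
+    {
+        var device = await db.Devices.AsTracking().SingleAsync(d => d.Id == id, ct);
+        device.Name = name;
+        await db.SaveChangesAsync(ct);
+    }
+}
+```
+
+With `UseQueryTrackingBehavior(QueryTrackingBehavior.NoTracking)` set on the context options, the explicit `AsNoTracking()` becomes redundant and only the `AsTracking()` call remains necessary.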
+
+## ASP.NET Core handler discipline (controllers)
+
+> General principle (cross-language): see `coderule.mdc` → "Boundary discipline" (don't leak request/response context into business logic; authorize once at the boundary). Below is the ASP.NET Core realization.
+
+These rules keep controller actions and services free of framework primitives that hide dependencies, defeat unit testing, and bypass the auth/binding pipelines the framework already gives you. (They also apply to the legacy Minimal-API handlers still being migrated.)
+
+### `HttpContext` discipline
+
+- **Do not pass `HttpContext`, `HttpRequest`, `HttpResponse`, or `IHttpContextAccessor` into services or repositories.** Extract the values you need (headers, route values, body, `ClaimsPrincipal`) inside the handler and pass them down as typed parameters.
+- Take `HttpContext` (or `HttpRequest`/`HttpResponse`) as a handler parameter **only** when no binding source can express the requirement. Concrete examples that justify it:
+  - Custom body framing or streaming (you read `Request.Body`/`BodyReader` yourself).
+  - Multiple discriminated payload shapes on one URL that cannot be one DTO.
+  - Pre-allocation size caps that must reject **before** the body materializes into objects.
+  - Writing a custom response envelope that doesn't fit `Results.*`/`TypedResults.*`.
+  Document the reason with a `//` comment on the parameter or above the method.
+- Prefer **separate endpoints/methods** over discriminated payload shapes on one URL. Only fuse them when splitting would duplicate the majority of the validation logic — otherwise you trade testability for one fewer route registration, which is rarely worth it.
+- Default to specific binding sources: `[FromBody]`, `[FromQuery]`, `[FromHeader]`, `[FromRoute]`, `[FromServices]`, `ClaimsPrincipal user`, `CancellationToken cancellationToken`. Each of those is documented, testable, and integrates with OpenAPI.
+
+### JSON deserialization
+
+- **Default to `[FromBody]` + a typed `record`/DTO.** The framework calls `JsonSerializer.DeserializeAsync` for you, validates `Content-Type`, surfaces `BadHttpRequestException` on malformed input, and produces OpenAPI metadata.
+- Direct `JsonDocument` / `Utf8JsonReader` parsing of `Request.Body` is allowed **only** when typed deserialization cannot express the required validation. Allowed reasons:
+  - **Typed slug-keyed error envelopes** that the standard binder cannot produce (e.g., per-field problem+json with a stable `type` URI).
+  - **Pre-allocation size caps** that must reject `batch-too-large` before the array materializes.
+  - **Shape discrimination at parse time** when the alternative is a single fat DTO + runtime branching.
+  Each site needs a one-line comment naming which exception applies.
+- Reading raw `Request.Body` for plain typed JSON content is a code-review-fail signal in the absence of one of the named exceptions.
+
+### Custom authentication schemes
+
+- Custom bearer/token/API-key schemes go through **`AuthenticationHandler<TOptions>`** registered via `AddAuthentication().AddScheme<TOptions, THandler>(name, …)`. Apply `.RequireAuthorization(new AuthorizeAttribute { AuthenticationSchemes = name })` or `[Authorize(AuthenticationSchemes = name)]` on the endpoint.
+- **Do not read `Authorization` / cookie / API-key headers manually inside a handler that is `.AllowAnonymous()`.** That bypasses the auth pipeline, makes the auth logic unreusable for any second endpoint, and forces tests to reach the logic via reflection.
+- If you need a custom 401/403 body envelope (e.g. typed `application/problem+json` with a slug), override `HandleChallengeAsync` / `HandleForbiddenAsync` in the scheme handler — not by bypassing the pipeline.
+- In the endpoint, take `ClaimsPrincipal user` as a parameter and read identity from claims (`user.FindFirstValue(...)`). The auth handler is responsible for putting the right claims on the principal.
+
+### Authorization (declare-once at the boundary)
+
+- Authorize at the **boundary, once** — not per action. In MVC, put `[Authorize(Policy = "...")]` on the **controller class** (or a shared base controller); every action inherits it. Override on a single action with a narrower `[Authorize(Policy = ...)]` / `[AllowAnonymous]` only where it genuinely differs.
+- The Minimal-API equivalent is `group.MapGroup("/...").RequireAuthorization(policy)` on the **route group**. Both compile to the **same authorization metadata** — the group-level fluent call and the class-level attribute are equally correct and equally DRY. Per-method attributes / per-endpoint `RequireAuthorization` are for intentional per-route overrides only.
+- Name policies centrally (a single constants holder) and reference the constant — never inline role strings at the call site.
+
+### Current-user / identity access
+
+- **Inject `ClaimsPrincipal` directly into handlers for current-user identity; read it through the shared `ClaimsPrincipalExtensions` (`GetUserId()`, `GetSessionId()`, `GetDeviceId()`).** Do **not** wrap identity access in an `ICurrentUser` / `ICurrentUserProvider` service by default.
+- Why `ClaimsPrincipal` is the right seam here (not an over-coupling):
+  - It is a **data-driven seam whose producer is the auth handler** — the cookie scheme, `DeviceBearerAuthenticationHandler`, or any future JWT all populate the *same* `ClaimsPrincipal`. The handler is already decoupled from *how* identity was obtained.
+  - It is **available for free** in the HTTP layer — `ControllerBase.User` in a controller action (or a `ClaimsPrincipal user` parameter in a legacy Minimal-API handler), sourced from `HttpContext.User`; no `IHttpContextAccessor`, no scoped registration, no lifetime caveat. Identity stays in the `Api` layer: a controller reads `User`, extracts the IDs it needs via `ClaimsPrincipalExtensions`, and passes **plain values** (`Guid userId`) into the service — `ClaimsPrincipal` does not cross into the Services layer.
+  - It is **testable without an interface**: `ClaimsPrincipal` is `new`-able with arbitrary claims and its behaviour (`IsInRole`, `FindFirst`, the extensions) is fully driven by those claims. Construct a real principal with test claims — preferable to a mocked `IPrincipal`, which can diverge from real claim-matching semantics. (In this repo, handlers are exercised over HTTP via `WebApplicationFactory` with a real login, so identity is never substituted anyway.)
+  - The `ClaimsPrincipalExtensions` already provide the domain-friendly, centralized read surface that a provider's properties would duplicate.
+- A current-user provider adds a scoped `IHttpContextAccessor`-backed service — exactly the captive-dependency shape the DI section warns about — to replace a free, already-abstracted, already-testable binding. That fails the "simplicity is the highest priority" bar unless one of the concrete triggers below holds.
+- **Introduce an `ICurrentUser` abstraction ONLY when a named trigger appears:**
+  1. **Identity is needed outside an HTTP request** — background job, message consumer, worker thread — where `ClaimsPrincipal` cannot be bound from the pipeline. A provider with swappable impls (HTTP-backed vs job-context) earns its keep.
+  2. **The domain layer must consume identity** and you do not want `System.Security.Claims` types leaking into domain code — expose a domain-pure `ICurrentUser` value instead.
+  3. **You need richer-than-claims current-user data** (a loaded `User` entity, tenant, permission set) resolved and cached per request.
+  When introduced: back the HTTP implementation with `IHttpContextAccessor`, register it **Scoped**, never capture it in a singleton, and keep `ClaimsPrincipalExtensions` as the implementation detail it delegates to.
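+
+The testability point can be shown without any mocking library. In this sketch the claim type (`ClaimTypes.NameIdentifier`) and the extension class name are assumptions; the repo's actual `ClaimsPrincipalExtensions` may read a different claim.
+
+```csharp
+using System;
+using System.Security.Claims;
+
+// A real principal built from test claims; no interface or mock involved.
+var principal = new ClaimsPrincipal(new ClaimsIdentity(
+    new[] { new Claim(ClaimTypes.NameIdentifier, "11111111-2222-3333-4444-555555555555") },
+    authenticationType: "Test"));
+
+Guid userId = principal.GetUserId();
+Console.WriteLine(userId); // 11111111-2222-3333-4444-555555555555
+
+// Hypothetical stand-in for the shared extension class.
+public static class ClaimsPrincipalSketchExtensions
+{
+    public static Guid GetUserId(this ClaimsPrincipal user) =>
+        Guid.TryParse(user.FindFirst(ClaimTypes.NameIdentifier)?.Value, out var id)
+            ? id
+            : throw new InvalidOperationException("Principal carries no valid user id claim.");
+}
+```
+
+The controller would call `User.GetUserId()` and pass the resulting `Guid` into the service, so the principal itself never leaves the Api layer.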
+
+### Response shapes
+
+**Controllers (the standard here): default to `ActionResult<T>`.** It mixes the success type `T` with `ActionResult` error shapes, participates in MVC's configured output formatters / content negotiation, and is the most reliable for OpenAPI:
+- Annotate with `[ProducesResponseType]`; the `Type` can be **omitted for the success code** (`[ProducesResponseType(StatusCodes.Status200OK)]`) — it is inferred from `T`. Add one attribute per additional status code (`404`, `409`, …).
+- Return the value directly (`return product;` — implicit cast to `200 OK`) or a `ControllerBase` helper for other shapes (`NotFound()`, `Conflict()`, `BadRequest(error)`, `CreatedAtAction(...)`).
+- The auto-validation action filter already produces the `400` for invalid input before the action runs (see Validation) — don't hand-write that path.
+- Keep the action **thin**: it maps the service's **success value** onto the success shape (`return product;` → `200`, `CreatedAtAction(...)` → `201`) and does not compute the business decision itself. **Expected failures are not mapped here** — the service throws a `BusinessException` subtype and the central `IExceptionHandler` produces the `ProblemDetails` (see Error handling). So a controller action has essentially no error branches: happy path in, success shape out.
+- `TypedResults` / `Results<T1, T2, …>` / `IResult` **are** usable in controllers, but they are the *Minimal-API* idiom and they **bypass MVC's configured output formatters / content negotiation** (they write the response directly — Microsoft Learn: "Does not leverage the configured Formatters"). Prefer `ActionResult<T>` in a controller; reach for `IResult` only for a deliberately format-agnostic raw response.
+
+**Legacy Minimal-API endpoints (until migrated): default to `TypedResults.*`** over `Results.*`. `TypedResults` returns concrete types (`Ok<T>`, `NotFound`, `BadRequest<T>`) that carry OpenAPI metadata and are unit-testable without casting. For handlers that return more than one shape, declare the return type as `Results<T1, T2, ...>` — the compiler enforces every branch returns a declared type and the OpenAPI generator reads the union, so no `Produces`/`ProducesResponseType` attributes are needed:
+  ```csharp
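+  // IItemStore is a hypothetical injected lookup service, shown for illustration only.
+  app.MapGet("/items/{id}", Results<Ok<Item>, NotFound> (int id, IItemStore store) =>
+      store.Find(id) is { } item ? TypedResults.Ok(item) : TypedResults.NotFound());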
+  ```
+  Don't mix `Results.*` and `TypedResults.*` in the same handler — you lose the metadata.
+
+### Service results vs. wire envelopes
+
+> General principle (cross-language): see `coderule.mdc` → "Architecture & layering › Service results vs. transport envelopes". Below is the .NET realization.
+
+- A service returns a **domain result** — a record of the values it computed (`IReadOnlyList<LiveDevice>`, a small snapshot record) on success, and **throws a `BusinessException` subtype** on an expected failure (see Error handling); it does not return error-wrapper values. The **controller maps the success value onto the wire DTO**. The response envelope (the `*Response` record, its field names, the HTTP status) is an **Api-layer concern**; the domain result is not, and ASP.NET / wire types must not appear in a service signature (see "Solution layout & layering").
+- **A value that the response echoes to the client but that the service ALSO used to compute the result is owned by the service** — it returns that value alongside the data; the controller must NOT independently re-derive it. Two clocks/sources for the same conceptual value is a latent bug.
+  - Canonical case: a "server now" timestamp that a projection uses to decide freshness/staleness (which devices are dropped, what color each gets) **and** is echoed so the client renders relative ages consistently. If the controller stamped its own `DateTimeOffset.UtcNow`, it would diverge from the instant the service filtered against — a boundary bug.
+  - Pattern: the service injects `TimeProvider`, captures the instant **once**, uses it, and returns it inside a domain result — e.g. `LiveSnapshot(DateTimeOffset CapturedAt, IReadOnlyList<LiveDevice> Devices)`. The controller returns `ActionResult<LiveStateResponse>`, mapping `CapturedAt → server_now`. The envelope name and JSON shape stay in the Api layer; the *instant* originates in the Services layer where it is consumed.
+- The opposite case: a value that is **purely an HTTP/transport artifact and is never consumed by domain logic** (a `Location` header, a per-response correlation id minted for tracing) is owned by the **Api layer** and the service never sees it.
+- Heuristic: ask "does the business logic *read* this value to make a decision?" If yes → it lives in the service and is returned. If it is only *formatting/transport* → it lives in the controller.
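+
+The "server now" pattern might look like this in outline. `LiveStateService`, the 30-second threshold, and `FixedClock` are hypothetical; `FixedClock` is a test double that shows why the instant should come from an injected `TimeProvider`.
+
+```csharp
+using System;
+using System.Collections.Generic;
+using System.Linq;
+
+var clock = new FixedClock(new DateTimeOffset(2026, 6, 1, 12, 0, 0, TimeSpan.Zero));
+var snapshot = new LiveStateService(clock).Project(new[]
+{
+    new LiveDevice("fresh", clock.GetUtcNow().AddSeconds(-5)),
+    new LiveDevice("stale", clock.GetUtcNow().AddMinutes(-10)),
+});
+Console.WriteLine(snapshot.Devices.Count); // 1
+
+public sealed record LiveDevice(string Id, DateTimeOffset LastSeen);
+public sealed record LiveSnapshot(DateTimeOffset CapturedAt, IReadOnlyList<LiveDevice> Devices);
+
+public sealed class LiveStateService(TimeProvider clock)
+{
+    private static readonly TimeSpan StaleAfter = TimeSpan.FromSeconds(30); // assumed threshold
+
+    public LiveSnapshot Project(IEnumerable<LiveDevice> devices)
+    {
+        var now = clock.GetUtcNow(); // captured once, used for filtering AND returned
+        var fresh = devices.Where(d => now - d.LastSeen <= StaleAfter).ToList();
+        return new LiveSnapshot(now, fresh);
+    }
+}
+
+// Test double: a clock frozen at a known instant.
+public sealed class FixedClock(DateTimeOffset now) : TimeProvider
+{
+    public override DateTimeOffset GetUtcNow() => now;
+}
+```
+
+The controller maps `snapshot.CapturedAt` onto the response field and never reads a clock of its own.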
+
+## Testing
+
+> General principle (cross-language): see `coderule.mdc` → "Testing (real dependencies)" (real engine over fakes for query-correctness; share expensive fixtures). Below is the .NET realization.
+
+- **xUnit** is the test framework for this repo. Use its per-test class lifecycle (constructor = setup, `IDisposable.Dispose` / `IAsyncLifetime.DisposeAsync` = teardown) — that's what most integration-testing patterns assume.
+- **FluentAssertions** for assertions: `result.Should().Be(...)`, `collection.Should().HaveCount(3).And.ContainSingle(x => ...)`, etc. Failure messages are much clearer than raw `Assert.Equal`, and the fluent chain reads like the spec it tests.
+- **`WebApplicationFactory<Program>`** for ASP.NET Core integration tests. It boots the real DI container and pipeline from `Program.cs` in-memory. Expose `Program` to the test project with `public partial class Program;` in `Program.cs`. Share the factory across tests in a class with `IClassFixture<T>` and across classes with `ICollectionFixture<T>` — host-boot is the expensive step; don't re-pay it per test.
+- **Never use the EF Core in-memory provider for query-correctness tests.** Its semantics diverge from real Postgres/SQL Server (LINQ translation differences, no real transactions, no concurrency tokens). Use Testcontainers (real Postgres container via `IAsyncLifetime` on the factory) + Respawn for between-test cleanup. The in-memory provider is acceptable only for fast smoke tests where you're not asserting query shape.
+- Tests follow the Arrange / Act / Assert pattern with `// Arrange` / `// Act` / `// Assert` comments (workspace convention; see `coderule.mdc`).
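+
+Putting the conventions together, an integration test might be shaped like this. The `/health` route and the test names are hypothetical, and the sketch assumes the xUnit, FluentAssertions, and `Microsoft.AspNetCore.Mvc.Testing` packages plus a `Program` class visible to the test project.
+
+```csharp
+using System.Net;
+using System.Threading.Tasks;
+using FluentAssertions;
+using Microsoft.AspNetCore.Mvc.Testing;
+using Xunit;
+
+// The factory is injected as a class fixture, so the host boots once per class.
+public sealed class HealthEndpointTests(WebApplicationFactory<Program> factory)
+    : IClassFixture<WebApplicationFactory<Program>>
+{
+    [Fact]
+    public async Task Get_HealthEndpoint_ReturnsOk()
+    {
+        // Arrange
+        var client = factory.CreateClient();
+
+        // Act
+        var response = await client.GetAsync("/health");
+
+        // Assert
+        response.StatusCode.Should().Be(HttpStatusCode.OK);
+    }
+}
+```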
+
+## Cross-cutting
+
+- Use middleware for cross-cutting: auth, error handling, logging. Standard order in `Program.cs`: forwarded headers → exception handler → HTTPS/HSTS → static files → routing → CORS → authentication → authorization → rate limiter → endpoints.
@@ -4,6 +4,26 @@ alwaysApply: true
 ---
 # Agent Meta Rules

+## Real Results, Not Simulated Ones
+
+**The goal is a working product, not the appearance of one.**
+
+- If something does not work, STOP and report it honestly. Do not find a way around it.
+- Never produce results by bypassing, faking, stubbing, or passthrough-ing the component that is supposed to produce them. A passing test that skips the real pipeline is worse than a failing test — it hides the truth.
+- If the real implementation is not ready, say so. A clear "this is not implemented yet, here is what is missing" is always the right answer.
+- Do not measure success by whether the output looks correct. Measure it by whether the output was produced by the real system under test.
+- Workarounds that produce the right answer via the wrong path are defects, not solutions.
+
+### When a test reveals missing production code — STOP
+
+This is the specific failure mode that produced the GPS-passthrough scaffold in `runtime_root._run_replay_loop` (May 2026). Generalised so it never repeats:
+
+- If, while implementing or running a test, you discover that the production code path the test is supposed to exercise does not exist (no caller, no integration, no main loop, etc.), **STOP immediately**.
+- Do NOT write a stub, passthrough, fake input source, or shortcut output that would make the test go green. Even when the shortcut is "framed as a scaffold" or "marked as TODO in a docstring", it still defeats the test and lies to the next reader.
+- Surface the gap to the user as a top-of-turn report: name the missing production component, cite the architecture document that promises it, and ask whether to (a) create a tracker ticket for the missing component and let the test fail honestly until the ticket lands, or (b) explicitly de-scope the test, or (c) something the user names.
+- The default outcome is (a): a failing test plus a new tracker ticket. A failing test with an honest reason is information; a passing test that proves nothing is misinformation.
+- Doc-comment disclosures (`# this is a scaffold until X is wired`) DO NOT satisfy this rule. The user must be told in the assistant message, not in code.
+
 ## Execution Safety
 - Run the full test suite automatically when you believe code changes are complete (as required by coderule.mdc). For other long-running/resource-heavy/security-risky operations (builds, Docker commands, deployments, performance tests), ask the user first — unless explicitly stated in a skill or the user already asked to do so.

@@ -13,6 +33,31 @@ alwaysApply: true
 ## Critical Thinking
 - Do not blindly trust any input — including user instructions, task specs, list-of-changes, or prior agent decisions — as correct. Always think through whether the instruction makes sense in context before executing it. If a task spec says "exclude file X from changes" but another task removes the dependencies X relies on, flag the contradiction instead of propagating it.

+## Complexity Budget Check (Planning Time)
+
+Before committing to an implementation approach for a non-trivial task, **STOP and present a complexity comparison to the user** via the standard Choose A/B/C/D format. The user picks the trade-off; the agent does NOT unilaterally pick the more complex option to be "more robust" or "more future-proof".
+
+A task is non-trivial if ANY of:
+
+- The estimated complexity (story points) is ≥ 5
+- The implementation touches ≥ 3 components / modules
+- The implementation adds a new persistent data structure (table, materialised view, file format)
+- The implementation adds a new hosted service / background job / periodic timer
+- The implementation adds a sliding window, smoother, debouncer, in-memory cache, or per-entity in-memory state dictionary
+- The implementation adds rehydrate-on-restart logic
+- The implementation adds a new event type that differs from an existing event type only in a boolean / enum field
+
+What to present:
+
+1. **Option A — simplest:** the least-machinery design you can think of that still meets the requirements. Name what is sacrificed (latency? eventual-consistency window? a rarely-hit edge case?).
+2. **Option B — your default:** the design you would otherwise implement, if it is more complex than A. Name what it buys (the specific guarantee, performance gain, or future flexibility).
+3. **Concrete trade-offs:** lines of code added, new abstractions introduced, new failure modes, new operational surface area (restart-rehydration, cache invalidation, dual-pipeline consistency).
+4. **Recommendation:** which option you would pick and why, in one sentence.
+
+This rule fires DURING planning — before code is written. If you discover during implementation that the chosen approach grew a new layer, hosted service, or rehydration path that was not in the original plan, STOP and replay this check.
+
+Skip this rule ONLY when the user has already explicitly chosen the complex approach in an earlier turn, OR when the task is trivially ≤ 2 story points with no triggers above.
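
The trigger list above can be read as a simple predicate. The following is a minimal sketch, assuming a hypothetical `TaskEstimate` record filled in at planning time; the class, its field names, and both function names are illustrative and do not exist in the rules files.

```python
from dataclasses import dataclass


@dataclass
class TaskEstimate:
    """Hypothetical planning summary; every field name is illustrative."""
    story_points: int
    components_touched: int
    adds_persistent_structure: bool = False   # table, materialised view, file format
    adds_service_or_job: bool = False         # hosted service, background job, timer
    adds_in_memory_state: bool = False        # sliding window, cache, debouncer, ...
    adds_rehydrate_logic: bool = False
    adds_near_duplicate_event: bool = False   # differs only in a boolean / enum field


def non_trivial_triggers(task: TaskEstimate) -> list[str]:
    """Return the name of every trigger that fires for this task."""
    checks = {
        "story points >= 5": task.story_points >= 5,
        "touches >= 3 components": task.components_touched >= 3,
        "new persistent data structure": task.adds_persistent_structure,
        "new service / job / timer": task.adds_service_or_job,
        "new in-memory state": task.adds_in_memory_state,
        "rehydrate-on-restart logic": task.adds_rehydrate_logic,
        "near-duplicate event type": task.adds_near_duplicate_event,
    }
    return [name for name, fired in checks.items() if fired]


def needs_complexity_check(task: TaskEstimate, user_chose_complex: bool = False) -> bool:
    """True when the A/B comparison must be presented before coding."""
    if user_chose_complex:
        # The user already picked the complex approach in an earlier turn.
        return False
    return bool(non_trivial_triggers(task))
```

Returning the list of fired triggers, rather than a bare boolean, lets the comparison presented to the user name exactly why the check fired.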
+
 ## Skill Discipline

 Do exactly what the skill says. Nothing more.
@@ -8,8 +8,16 @@ globs: ["**/*test*", "**/*spec*", "**/*Test*", "**/tests/**", "**/test/**"]
 - One assertion per test when practical; name tests descriptively: `MethodName_Scenario_ExpectedResult`
 - Test boundary conditions, error paths, and happy paths
 - Use mocks only for external dependencies; prefer real implementations for internal code
- Aim for 75%+ coverage on business logic; 100% on critical paths (code paths where a bug would cause data loss, security breaches, financial errors, or system outages — identify from acceptance criteria marked as must-have or from security_approach.md). The 75% threshold is canonical — see `cursor-meta.mdc` Quality Thresholds.
+- Aim for 75%+ coverage on business logic; **90% floor / 100% aim on critical paths** (code paths where a bug would cause data loss, security breaches, financial errors, or system outages — identify from acceptance criteria marked as must-have or from `security_approach.md`). 90% is the enforcement floor (blocking in CI / refactor verification / release pre-flight); 100% is the aspirational aim — drift below 100% but at-or-above 90% is acceptable. Both numbers are canonical — see `cursor-meta.mdc` Quality Thresholds.
 - Integration tests use real database (Postgres testcontainers or dedicated test DB)
 - Never use Thread Sleep or fixed delays in tests; use polling or async waits
 - Keep test data factories/builders for reusable test setup
 - Tests must be independent: no shared mutable state between tests
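
As an illustration of the coverage thresholds above, here is a minimal sketch of how the three numbers could be turned into a verdict. The function name and the verdict strings are assumptions made for this example; the rules only fix the numbers (75% target, 90% floor, 100% aim) and state that the 90% floor is blocking.

```python
def coverage_verdict(percent: float, critical_path: bool) -> str:
    """Classify a coverage figure against the thresholds quoted in the rules.

    Verdict strings are hypothetical labels chosen for this sketch.
    """
    if critical_path:
        if percent < 90.0:
            return "blocking"            # below the enforcement floor
        if percent < 100.0:
            return "acceptable-drift"    # at or above the floor, below the aim
        return "ok"
    # Business logic: 75% is a target; the rules do not call it blocking.
    return "ok" if percent >= 75.0 else "below-target"
```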
+
+## Test environment (this project)
+
+- **Unit tests** (`tests/unit/`): may run locally on the dev workstation (`pytest tests/unit/` in the project venv). Local PASS is equivalent to Jetson PASS for this tier because the suite is fully synthetic.
+- **Blackbox / e2e / performance / resilience / security / resource-limit** tests (`tests/e2e/`, `e2e/tests/`, `tests/perf/`, …): MUST run on the Jetson Orin Nano Super (or a Jetson-equivalent arm64 agent). Use `scripts/run-tests-jetson.sh` for local dev; CI runs `.woodpecker/01-test.yml` on the colocated arm64 Jetson Woodpecker agent.
+- Do NOT run e2e tests on the local workstation and report the result. If the Jetson is unreachable, the e2e verdict is "not run" — record the gap in `_docs/_process_leftovers/` rather than substituting a local result.
+- Tests gated by `RUN_REPLAY_E2E` or `@pytest.mark.tier2` are expected to SKIP locally; that is correct behaviour, not a failure to investigate.
+- Canonical source for this policy: `_docs/02_document/tests/environment.md` § Where each tier runs (active policy).
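
The tier policy above could be sketched as a small reporting helper. This is an illustrative sketch only: the function name, the `ran_on` values, and the verdict strings are assumptions, and the only path treated as local-safe is the `tests/unit/` tier named in the policy.

```python
# Tiers whose local result is accepted as equivalent to a Jetson result.
LOCAL_OK_PREFIXES = ("tests/unit/",)


def reported_verdict(test_path: str, ran_on: str, passed: bool) -> str:
    """Return the verdict that may be reported for one test run.

    ran_on is a hypothetical label: "local" or "jetson".
    """
    outcome = "PASS" if passed else "FAIL"
    if test_path.startswith(LOCAL_OK_PREFIXES):
        return outcome
    if ran_on != "jetson":
        # A local e2e/perf result is never substituted for a Jetson result.
        return "not run"
    return outcome
```

A caller that receives `"not run"` would then record the gap under `_docs/_process_leftovers/` instead of reporting the local outcome.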
@@ -14,11 +14,14 @@ alwaysApply: true
 - Issue types: Epic, Story, Task, Bug, Subtask

 ## Tracker Availability Gate
- If Jira MCP returns **Unauthorized**, **errored**, **connection refused**, or any non-success response: **STOP** tracker operations and notify the user via the Choose A/B/C/D format documented in `.cursor/skills/autodev/protocols.md`.
+- If Jira MCP returns **Unauthorized**, **errored**, **connection refused**, **timeout**, a non-2xx status code, an empty body, or any response shape that does not clearly confirm the requested change: **STOP IMMEDIATELY** — no automatic retry, no silent continuation. Surface the full raw error/response to the user verbatim and notify via the Choose A/B/C/D format documented in `.cursor/skills/autodev/protocols.md`.
+- A minimal `{"success": true}` body with no echoed issue state is NOT a confirmed transition. When a transition's success matters (status moves, ticket creation, blocking link), follow it with a read-back call (`getJiraIssue` or equivalent) and confirm the new state matches what you asked for. If the read-back disagrees → STOP and ASK.
+- Do NOT loop "retry up to N times before asking". One call, one verification. On failure, the user decides whether to retry.
 - The user may choose to:
-  - **Retry authentication** — preferred; the tracker remains the source of truth.
+  - **Retry the same operation** — once, after the user authorizes it. If it fails again, surface both responses.
+  - **Retry authentication** — preferred when the failure looks like an auth/credentials problem; the tracker remains the source of truth.
  - **Continue in `tracker: local` mode** — only when the user explicitly accepts this option. In that mode all tasks keep numeric prefixes and a `Tracker: pending` marker is written into each task header. The state file records `tracker: local`. The mode is NOT silent — the user has been asked and has acknowledged the trade-off.
- Do NOT auto-fall-back to `tracker: local` without a user decision. Do not pretend a write succeeded. If the user is unreachable (e.g., non-interactive run), stop and wait.
+- Do NOT auto-fall-back to `tracker: local` without a user decision. Do not pretend a write succeeded. Do not paper over an opaque response by moving on. If the user is unreachable (e.g., non-interactive run), stop and wait.
 - When the tracker becomes available again, any `Tracker: pending` tasks should be synced — this is done at the start of the next `/autodev` invocation via the Leftovers Mechanism below.
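
The "one call, one verification" rule in this gate might look like the sketch below. The two callables stand in for whatever tracker client is in use; their names and signatures, `TrackerStop`, and `transition_once` are all hypothetical and are not part of any Jira MCP API.

```python
from typing import Callable, Optional


class TrackerStop(Exception):
    """Signals STOP-and-ask; the message carries the raw response verbatim."""


def transition_once(
    issue_key: str,
    target_status: str,
    do_transition: Callable[[str, str], object],   # hypothetical write call
    read_status: Callable[[str], Optional[str]],   # hypothetical read-back call
) -> str:
    """Perform one transition and one read-back; never retry automatically."""
    try:
        raw = do_transition(issue_key, target_status)
    except Exception as exc:
        raise TrackerStop(f"transition call failed: {exc!r}") from exc

    try:
        observed = read_status(issue_key)
    except Exception as exc:
        raise TrackerStop(f"read-back failed: {exc!r}; raw response: {raw!r}") from exc

    if observed != target_status:
        # A bare success body is not proof; the read-back disagrees.
        raise TrackerStop(
            f"read-back mismatch: wanted {target_status!r}, saw {observed!r}; "
            f"raw response: {raw!r}"
        )
    return observed
```

There is deliberately no loop: on `TrackerStop` the caller surfaces the message and the user decides whether to retry.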

 ## Leftovers Mechanism (non-user-input blockers only)
@@ -1,9 +1,9 @@
 ---
 name: autodev
 description: |
-  Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
+  Auto-chaining orchestrator that drives the full BUILD → SHIP → EVOLVE workflow from problem gathering through release and retrospective.
  Detects current project state from _docs/ folder, resumes from where it left off, and flows through
-  problem → research → plan → test specs → decompose → implement → tests → docs sync → deploy without manual skill invocation.
+  problem → research → plan (incl. ADRs) → test specs → decompose → implement → tests → docs sync → deploy → release → retrospective without manual skill invocation.
  Maximizes work per conversation by auto-transitioning between skills.
  Trigger phrases:
  - "autodev", "auto", "start", "continue"
@@ -15,7 +15,7 @@ disable-model-invocation: true

 # Autodev Orchestrator

-Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autodev` once — the engine handles sequencing, transitions, and re-entry.
+Auto-chaining execution engine that drives the full BUILD → SHIP → EVOLVE workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autodev` once — the engine handles sequencing, transitions, and re-entry.

 ## File Index

@@ -67,8 +67,9 @@ B3. Read state              — `_docs/_autodev_state.md` (if it exists).
 B4. Read File Index         — `state.md`, `protocols.md`, and the active flow file.

 ### Resolve (once per invocation, after Bootstrap)
-R1. Reconcile state         — verify state file against `_docs/` contents; on disagreement, trust the folders
-                               and update the state file (rules: `state.md` → "State File Rules" #4).
+R1. Reconcile state         — verify state file against `_docs/` contents; probe `<workspace-root>/../docs`
+                               (parent suite `docs/` — see `state.md` → "State File Rules" #4); on disagreement,
+                               trust the folders and update the state file (rules: `state.md` → "State File Rules" #4).
                               After this step, `state.step` / `state.status` are authoritative.
 R2. Resolve flow            — see §Flow Resolution above.
 R3. Resolve current step    — when a state file exists, `state.step` drives detection.
@@ -3,7 +3,7 @@
 Workflow for projects with an existing codebase. Structurally it has **two phases**:

 - **Phase A — One-time baseline setup (Steps 1–8)**: runs exactly once per codebase. Documents the code, produces test specs, makes the code testable, writes and runs the initial test suite, optionally refactors with that safety net.
- **Phase B — Feature cycle (Steps 9–17, loops)**: runs once per new feature. After Step 17 (Retrospective), the flow loops back to Step 9 (New Task) with `state.cycle` incremented.
+- **Phase B — Feature cycle (Steps 9–17, loops)**: runs once per new feature. After Step 17 (Retrospective), the flow loops back to Step 9 (New Task) with `state.cycle` incremented. Step 16.5 (Release) sits between Deploy (16) and Retrospective (17).

 A first-time run executes Phase A then Phase B; every subsequent invocation re-enters Phase B.

@@ -33,7 +33,8 @@ A first-time run executes Phase A then Phase B; every subsequent invocation re-e
 | 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
 | 14 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
 | 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
-| 16 | Deploy | deploy/SKILL.md | Step 1–7 |
+| 16 | Deploy | deploy/SKILL.md | Step 1–7 (optional) |
+| 16.5 | Release | release/SKILL.md | Phase 1–6 (optional — only if Step 16 completed) |
 | 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |

 After Step 17, the feature cycle completes and the flow loops back to Step 9 with `state.cycle + 1` — see "Re-Entry After Completion" below.
@@ -275,33 +276,63 @@ State-driven: reached by auto-chain from Step 14 (completed or skipped).
 Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
 - question:        `Run performance/load tests before deploy?`
 - option-a-label:  `Run performance tests (recommended for latency-sensitive or high-load systems)`
- option-b-label:  `Skip — proceed directly to deploy`
+- option-b-label:  `Skip — proceed to deploy choice`
 - recommendation:  `A or B — base on whether acceptance criteria include latency, throughput, or load requirements`
 - target-skill:    `.cursor/skills/test-run/SKILL.md` in **perf mode** (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
 - next-step:       Step 16 (Deploy)

 ---

-**Step 16 — Deploy**
+**Step 16 — Deploy (optional)**
 State-driven: reached by auto-chain from Step 15 (completed or skipped).

-Action: Read and execute `.cursor/skills/deploy/SKILL.md`.
+Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
+- question:        `Run deploy planning or refresh deploy artifacts for this cycle?`
+- option-a-label:  `Run deploy — update scripts/procedures for this release`
+- option-b-label:  `Skip — keep developing; deploy when ready for production`
+- recommendation:  `B during active feature work; A when this cycle should ship`
+- target-skill:    `.cursor/skills/deploy/SKILL.md`
+- next-step:       Step 16.5 (Release) — only when Step 16 was completed; otherwise Step 17 (Retrospective)

-After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 17 (Retrospective).
+On **skip**: mark Step 16 and Step 16.5 as `skipped`; auto-chain to Step 17 (Retrospective in cycle-end mode).
+
+On **complete**: mark Step 16 `completed` and auto-chain to Step 16.5 (Release).
+
+---
+
+**Step 16.5 — Release (optional)**
+State-driven: reached by auto-chain from Step 16 **only when Step 16 status is `completed`**, for the current `state.cycle`. If Step 16 was `skipped`, Step 16.5 is `skipped` and `/release` is not invoked.
+
+Action: Read and execute `.cursor/skills/release/SKILL.md`. The release skill owns its own user interaction (Phase 1 pre-release gate, Phase 2 strategy select, Phase 6 escalation). Autodev does NOT add a wrapping A/B/C gate. Pass cycle context (`cycle: state.cycle`).
+
+After the release skill exits, route on the verdict:
+
+- **Verdict `Released`** → mark Step 16.5 `completed` and auto-chain to Step 17 (Retrospective in cycle-end mode).
+- **Verdict `Released-with-override`** → mark Step 16.5 `completed` AND auto-chain to Step 17 (Retrospective in **incident mode**).
+- **Verdict `Rolled-Back`** → mark Step 16.5 `failed`. Auto-chain to Step 17 (Retrospective in **incident mode**). The cycle does NOT loop back to Step 9.
+- **Verdict `Aborted`** → mark Step 16.5 `not_started` (no live-system change) OR `failed` (live-system touched before abort). Surface the abort reason and STOP. Next `/autodev` invocation re-evaluates Phase B from the failed step.

 ---

 **Step 17 — Retrospective**
-State-driven: reached by auto-chain from Step 16, for the current `state.cycle`.
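+State-driven: reached by auto-chain from Step 16.5 (verdict `Released`, `Released-with-override`, or `Rolled-Back`; an `Aborted` verdict stops without chaining) OR when Steps 16 and 16.5 are both `skipped`, for the current `state.cycle`.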

-Action: Read and execute `.cursor/skills/retrospective/SKILL.md` in **cycle-end mode**. Pass cycle context (`cycle: state.cycle`) so the retro report and LESSONS.md entries record which feature cycle they came from.
+Action: Read and execute `.cursor/skills/retrospective/SKILL.md`. Mode selection:

-After retrospective completes, mark Step 17 as `completed` and enter "Re-Entry After Completion" evaluation.
+- Step 16.5 verdict `Released`, or Steps 16 and 16.5 both `skipped` → cycle-end mode
+- Step 16.5 verdict `Released-with-override` or `Rolled-Back` → incident mode
+
+Pass cycle context (`cycle: state.cycle`) so the retro report and LESSONS.md entries record which feature cycle they came from.
+
+After retrospective completes:
+
+- If Step 16.5 verdict was `Released` or `Released-with-override`, OR Step 16.5 was `skipped` → mark Step 17 as `completed` and enter "Re-Entry After Completion" evaluation (loop back to Step 9 for cycle N+1).
+- If Step 16.5 verdict was `Rolled-Back` → mark Step 17 as `completed` but do NOT loop back. Surface the incident retro path and STOP.

 ---

 **Re-Entry After Completion**
-State-driven: `state.step == done` OR Step 17 (Retrospective) is completed for `state.cycle`.
+State-driven: `state.step == done`, OR Step 17 (Retrospective) is completed for `state.cycle` AND Step 16.5 either returned `Released` / `Released-with-override` or was `skipped`. A `Rolled-Back` cycle does NOT trigger Re-Entry — the user must explicitly invoke `/autodev` again.

 Action: The project completed a full cycle. Print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:

@@ -316,7 +347,7 @@ Action: The project completed a full cycle. Print the status banner and automati

 Set `step: 9`, `status: not_started`, and **increment `cycle`** (`cycle: state.cycle + 1`) in the state file, then auto-chain to Step 9 (New Task). Reset `sub_step` to `phase: 0, name: awaiting-invocation, detail: ""` and `retry_count: 0`.

-Note: the loop (Steps 9 → 17 → 9) ensures every feature cycle includes: New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security → Performance → Deploy → Retrospective.
+Note: the loop (Steps 9 → 17 → 9) covers: New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security → Performance → Deploy (optional) → Release (optional) → Retrospective. The cycle completes (and loops back to Step 9) on a `Released` or `Released-with-override` verdict, or when deploy/release were skipped; rolled-back or aborted releases stop the cycle.
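
The state reset performed on re-entry can be sketched as a pure function over the state record. The dataclasses below are a hypothetical in-memory mirror of `_docs/_autodev_state.md`; only the field names `step`, `status`, `cycle`, `sub_step`, and `retry_count` come from the text above.

```python
from dataclasses import dataclass, field, replace
from typing import Union


@dataclass(frozen=True)
class SubStep:
    phase: int = 0
    name: str = "awaiting-invocation"
    detail: str = ""


@dataclass(frozen=True)
class AutodevState:
    step: Union[int, str]          # a step number, or "done"
    status: str
    cycle: int
    sub_step: SubStep = field(default_factory=SubStep)
    retry_count: int = 0


def reenter_feature_cycle(state: AutodevState) -> AutodevState:
    """Return the state written before auto-chaining back to Step 9."""
    return replace(
        state,
        step=9,
        status="not_started",
        cycle=state.cycle + 1,
        sub_step=SubStep(),   # phase 0, awaiting-invocation, empty detail
        retry_count=0,
    )
```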

 ## Auto-Chain Rules

@@ -343,9 +374,15 @@ Note: the loop (Steps 9 → 17 → 9) ensures every feature cycle includes: New
 | Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
 | Update Docs (13) | Auto-chain → Security Audit choice (14) |
 | Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
-| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
-| Deploy (16) | Auto-chain → Retrospective (17) |
-| Retrospective (17) | **Cycle complete** — loop back to New Task (9) with incremented cycle counter |
+| Performance Test (15, done or skipped) | Auto-chain → Deploy choice (16) |
+| Deploy (16, completed) | Auto-chain → Release (16.5) |
+| Deploy (16, skipped) | Mark 16.5 `skipped` → auto-chain → Retrospective (17, cycle-end mode) |
+| Release (16.5, verdict Released) | Auto-chain → Retrospective (17, cycle-end mode) |
+| Release (16.5, verdict Released-with-override) | Auto-chain → Retrospective (17, **incident mode**) |
+| Release (16.5, verdict Rolled-Back) | Auto-chain → Retrospective (17, **incident mode**); cycle does NOT loop back |
+| Release (16.5, verdict Aborted) | STOP — surface abort reason; do not auto-chain |
+| Retrospective (17, after Released / Released-with-override / deploy skipped) | **Cycle complete** — loop back to New Task (9) with incremented cycle counter |
+| Retrospective (17, after Rolled-Back) | Cycle remains incomplete — STOP and surface incident retro path |
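
The release rows of the table above amount to a small decision function. The sketch below assumes a hypothetical `Route` tuple and function names; the verdict strings, step statuses, and retrospective modes are the ones used in this flow.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    step_status: str             # status recorded for Step 16.5
    retro_mode: Optional[str]    # None means STOP: no retrospective is chained
    loops_back: bool             # whether the cycle returns to Step 9 afterwards


def route_after_release(verdict: str, live_system_touched: bool = False) -> Route:
    """Map a release verdict to the next action in the feature cycle."""
    if verdict == "Released":
        return Route("completed", "cycle-end", True)
    if verdict == "Released-with-override":
        return Route("completed", "incident", True)
    if verdict == "Rolled-Back":
        return Route("failed", "incident", False)
    if verdict == "Aborted":
        status = "failed" if live_system_touched else "not_started"
        return Route(status, None, False)
    raise ValueError(f"unknown release verdict: {verdict!r}")


def route_after_deploy_skipped() -> Route:
    """Step 16 skipped: Step 16.5 is skipped too and the retro is cycle-end."""
    return Route("skipped", "cycle-end", True)
```

Raising on an unknown verdict, instead of defaulting to a route, keeps an unexpected release outcome from silently looping the cycle.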

 ## Status Summary — Step List

@@ -381,9 +418,10 @@ Flow-specific slot values:
 | 14 | Security Audit             | — |
 | 15 | Performance Test           | — |
 | 16 | Deploy                     | — |
+| 16.5 | Release                  | `DONE (Released \| Released-with-override \| Rolled-Back \| Aborted)` |
 | 17 | Retrospective              | — |

-All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2, 4, 8, 12, 13, 14, 15 additionally accept `SKIPPED`.
+All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2, 4, 8, 12, 13, 14, 15, 16, 16.5 additionally accept `SKIPPED`.

 Row rendering format (renders with a phase separator between Step 8 and Step 9):

@@ -406,5 +444,6 @@ Row rendering format (renders with a phase separator between Step 8 and Step 9):
 Step 14   Security Audit           [<state token>]
 Step 15   Performance Test         [<state token>]
 Step 16   Deploy                   [<state token>]
+ Step 16.5 Release                  [<state token>]
 Step 17   Retrospective            [<state token>]
 ```
@@ -1,6 +1,6 @@
 # Greenfield Workflow

-Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement + Product Completeness Gate → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
+Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement + Product Completeness Gate → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy (optional) → Release (optional, only if Deploy ran) → Retrospective.

 ## Step Reference Table

@@ -8,7 +8,7 @@ Workflow for new projects built from scratch. Flows linearly: Problem → Resear
 |------|------|-----------|-------------------|
 | 1 | Problem | problem/SKILL.md | Phase 1–4 |
 | 2 | Research | research/SKILL.md | Mode A: Phase 1–4 · Mode B: Step 0–8 |
-| 3 | Plan | plan/SKILL.md | Step 1–6 + Final |
+| 3 | Plan | plan/SKILL.md | Step 1, 2, 3, 4, 4.5 (ADR Capture), 5, 6 + Final |
 | 4 | UI Design | ui-design/SKILL.md | Phase 0–8 (conditional — UI projects only) |
 | 5 | Test Spec | test-spec/SKILL.md | Phases 1–4 |
 | 6 | Decompose | decompose/SKILL.md (implementation task decomposition) | Step 1 + Step 1.5 + Step 2 + Step 4 |
@@ -21,7 +21,8 @@ Workflow for new projects built from scratch. Flows linearly: Problem → Resear
 | 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
 | 14 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
 | 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
-| 16 | Deploy | deploy/SKILL.md | Step 1–7 |
+| 16 | Deploy | deploy/SKILL.md | Step 1–7 (optional) |
+| 16.5 | Release | release/SKILL.md | Phase 1–6 (optional — only if Step 16 completed) |
 | 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |

 ## Detection Rules
@@ -279,26 +280,55 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga

 ---

-**Step 16 — Deploy**
+**Step 16 — Deploy (optional)**
 State-driven: reached by auto-chain from Step 15 (after Step 15 is completed or skipped).

-Action: Read and execute `.cursor/skills/deploy/SKILL.md`.
+Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
+- question:        `Run deploy planning (scripts, procedures, compose overlays) now?`
+- option-a-label:  `Run deploy — produce/update deploy artifacts and scripts`
+- option-b-label:  `Skip — continue development; deploy when ready for production`
+- recommendation:  `B when the product is not ready to ship; A when targeting a release soon`
+- target-skill:    `.cursor/skills/deploy/SKILL.md`
+- next-step:       Step 16.5 (Release) — only when Step 16 was completed; otherwise Step 17 (Retrospective)

-After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 17 (Retrospective).
+On **skip**: mark Step 16 and Step 16.5 as `skipped`; record in the release report (if one exists) or `_docs/_autodev_state.md` `sub_step.detail` that deploy/release were deferred; auto-chain to Step 17 (Retrospective in cycle-end mode).
+
+On **complete**: mark Step 16 `completed` and auto-chain to Step 16.5 (Release).
+
+---
+
+**Step 16.5 — Release (optional)**
+State-driven: reached by auto-chain from Step 16 **only when Step 16 status is `completed`**. If Step 16 was `skipped`, Step 16.5 is also `skipped` and the flow does not invoke `/release`.
+
+Action: Read and execute `.cursor/skills/release/SKILL.md`. The release skill is responsible for selecting the target environment, executing the deploy artifacts, smoke-testing, watching the rollout, and producing a definitive verdict (`Released`, `Released-with-override`, `Rolled-Back`, or `Aborted`).
+
+The release skill has its own internal BLOCKING gates (Phase 1 pre-release gate, Phase 2 strategy select, Phase 6 user confirmation when soft regression escalates). Autodev does NOT add a wrapping A/B/C gate — the release skill owns its own user interaction.
+
+After the release skill exits:
+
+- **Verdict `Released`** → mark Step 16.5 `completed` and auto-chain to Step 17 (Retrospective in cycle-end mode).
+- **Verdict `Released-with-override`** → mark Step 16.5 `completed` AND auto-chain to Step 17 (Retrospective in **incident mode**) — the override is itself an incident the retrospective must analyze.
+- **Verdict `Rolled-Back`** → mark Step 16.5 `failed`. Auto-chain to Step 17 (Retrospective in **incident mode**). Do NOT consider the project "Done" — the user owns the next move (re-run /implement on a fix branch, re-run /deploy, re-run /release).
+- **Verdict `Aborted`** → mark Step 16.5 `not_started` (the release was never started) OR `failed` if the abort came after Phase 3 had already touched the live system. Surface the abort reason and STOP — do not auto-chain to retrospective.

 ---

 **Step 17 — Retrospective**
-State-driven: reached by auto-chain from Step 16.
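+State-driven: reached by auto-chain from Step 16.5 (verdict `Released`, `Released-with-override`, or `Rolled-Back`; an `Aborted` verdict stops without chaining) OR when Steps 16 and 16.5 are both `skipped` (cycle-end mode — note deploy/release deferred in the retro report).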

-Action: Read and execute `.cursor/skills/retrospective/SKILL.md` in **cycle-end mode**. This closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` and appending the top-3 lessons to `_docs/LESSONS.md`.
+Action: Read and execute `.cursor/skills/retrospective/SKILL.md`. Mode selection:
+
+- Step 16.5 verdict `Released`, or Steps 16 and 16.5 both `skipped` → cycle-end mode
+- Step 16.5 verdict `Released-with-override` or `Rolled-Back` → incident mode
+
+The retrospective closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` (or `incident_<date>_release.md` in incident mode) and appending the top-3 lessons to `_docs/LESSONS.md`.

 After retrospective completes, mark Step 17 as `completed` and enter "Done" evaluation.

 ---

 **Done**
-State-driven: reached by auto-chain from Step 17. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
+State-driven: reached by auto-chain from Step 17. (Sanity check: if Step 16 was `completed`, `_docs/04_deploy/` should contain the expected deploy artifacts. If Step 16.5 was `completed`, `_docs/04_release/` should contain a release report with a definitive verdict. Skipped deploy/release is valid — no release report required.)

 Action: Report project completion with summary. Then **rewrite the state file** so the next `/autodev` invocation enters the feature-cycle loop in the existing-code flow:

@@ -336,8 +366,13 @@ On the next invocation, Flow Resolution rule 1 reads `flow: existing-code` and r
 | Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
 | Update Docs (13, done or skipped) | Auto-chain → Security Audit choice (14) |
 | Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
-| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
-| Deploy (16) | Auto-chain → Retrospective (17) |
+| Performance Test (15, done or skipped) | Auto-chain → Deploy choice (16) |
+| Deploy (16, completed) | Auto-chain → Release (16.5) |
+| Deploy (16, skipped) | Mark 16.5 `skipped` → auto-chain → Retrospective (17, cycle-end mode) |
+| Release (16.5, verdict Released) | Auto-chain → Retrospective (17, cycle-end mode) |
+| Release (16.5, verdict Released-with-override) | Auto-chain → Retrospective (17, **incident mode**) |
+| Release (16.5, verdict Rolled-Back) | Auto-chain → Retrospective (17, **incident mode**); do NOT enter Done |
+| Release (16.5, verdict Aborted) | STOP — surface abort reason; do not auto-chain |
 | Retrospective (17) | Report completion; rewrite state to existing-code flow, step 9 |

 ## Status Summary — Step List
@@ -362,9 +397,10 @@ Flow name: `greenfield`. Render using the banner template in `protocols.md` →
 | 14 | Security Audit             | — |
 | 15 | Performance Test           | — |
 | 16 | Deploy                     | — |
+| 16.5 | Release                  | `DONE (Released \| Released-with-override \| Rolled-Back \| Aborted)` |
 | 17 | Retrospective              | — |

-All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 12, 13, 14, 15 additionally accept `SKIPPED`.
+All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 12, 13, 14, 15, 16, 16.5 additionally accept `SKIPPED`.

 Row rendering format (step-number column is right-padded to 2 characters for alignment):

@@ -385,5 +421,6 @@ Row rendering format (step-number column is right-padded to 2 characters for ali
 Step 14  Security Audit            [<state token>]
 Step 15  Performance Test          [<state token>]
 Step 16  Deploy                    [<state token>]
+ Step 16.5 Release                  [<state token>]
 Step 17  Retrospective             [<state token>]
 ```
@@ -5,7 +5,8 @@ Workflow for **meta-repositories** — repos that aggregate multiple components
 This flow differs fundamentally from `greenfield` and `existing-code`:

 - **No problem/research/plan phases** — meta-repos don't build features, they coordinate existing ones
- **No test spec / implement / run tests** — the meta-repo has no code to test
+- **No test spec / run tests** — the meta-repo has no code to test
+- **`implement` is scoped to suite-level work only** — cross-repo concerns, repo/folder renames, suite-root infra additions (e.g., `.gitmodules`, `_infra/`, suite `e2e/`). Per-component implementation lives in each component's own workspace `/autodev` cycle. The meta-repo's implement step (Step 3.5) executes only when `_docs/tasks/todo/` is non-empty AND the user explicitly opts in; placement is **before** the sync skills so subsequent Doc/E2E/CICD sync propagates the post-implementation state.
 - **No `_docs/00_problem/` artifacts** — documentation target is `_docs/*.md` unified docs, not per-feature `_docs/NN_feature/` folders
 - **Primary artifact is `_docs/_repo-config.yaml`** — generated by `monorepo-discover`, read by every other step

@@ -17,6 +18,7 @@ This flow differs fundamentally from `greenfield` and `existing-code`:
 | 2 | Config Review | (human checkpoint, no sub-skill) | — |
 | 2.5 | Glossary & Architecture Vision | (inline, no sub-skill) | Steps 1–5 |
 | 3 | Status | monorepo-status/SKILL.md | Sections 1–5 |
+| 3.5 | Suite Implement | implement/SKILL.md (suite-level invocation context) | Steps 1–14 + 16 (Step 14.5 + Step 15 skipped); conditional on `_docs/tasks/todo/` non-empty AND user opt-in |
 | 4 | Document Sync | monorepo-document/SKILL.md | Phase 1–7 (conditional on doc drift) |
 | 4.5 | Integration Test Sync | monorepo-e2e/SKILL.md | Phase 1–6 (conditional on suite-e2e drift; skipped if `suite_e2e:` block absent in config) |
 | 5 | CICD Sync | monorepo-cicd/SKILL.md | Phase 1–7 (conditional on CI drift) |
@@ -184,11 +186,16 @@ The status report identifies:
 - Registry/config mismatches
 - Unresolved questions

-Based on the report, auto-chain branches:
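+Based on the report, auto-chain evaluates the rules below in order. Rules 1 and 2 are gates: once handled, evaluation continues with the rules that follow. Among rules 3–6 the first match wins: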

- If **doc drift** found → auto-chain to **Step 4 (Document Sync)**
- Else if **CI drift** (only) found → auto-chain to **Step 5 (CICD Sync)**
- Else if **registry mismatch** found (new components not in config) → present Choose format:
+1. **Registry mismatch** (new components not in config, or config component not in registry) → present the Choose format below FIRST. After the user resolves it (A: refresh discover, B: onboard, C: continue with mismatch acknowledged), proceed to the next rule. This rule has priority because a stale config would mislead Step 3.5's ownership-envelope synthesis and any sync skill's component scope.
+2. **Pre-routing gate (Step 3.5 detection)** — check `_docs/tasks/todo/` for suite-level task files (`*.md` excluding files starting with `_`). If ≥1 task is present, auto-chain to **Step 3.5 (Suite Implement)**. After Step 3.5 returns (regardless of A/B outcome), the post-implement re-status applies rules 3–6 below to the post-implementation state.
+3. If **doc drift** found → auto-chain to **Step 4 (Document Sync)**
+4. Else if **CI drift** (only) found → auto-chain to **Step 5 (CICD Sync)**
+5. Else if **suite-e2e drift** (only) found → auto-chain to **Step 4.5 (Integration Test Sync)** (only when `suite_e2e:` block exists in config)
+6. Else → **workflow done for this cycle**.
+
+**Registry mismatch Choose format** (rule 1):

 ```
 ══════════════════════════════════════
@@ -205,7 +212,134 @@ Based on the report, auto-chain branches:
 ══════════════════════════════════════
 ```

- Else → **workflow done for this cycle**. Report "No drift. Meta-repo is in sync." Loop waits for next invocation.
+When rule 6 fires (no drift, no todo tasks), report "No drift. Meta-repo is in sync." and end the cycle. Loop waits for next invocation.
+
+---
+
+**Step 3.5 — Suite Implement**
+
+Condition (folder fallback): `_docs/tasks/todo/` exists AND contains ≥1 file matching `*.md` excluding files starting with `_` (e.g., `_dependencies_table.md` is excluded by convention).
+
+State-driven: reached by auto-chain from Step 3 when the pre-routing gate detected todo tasks. Inserted **before** the sync skills (Step 4 / 4.5 / 5) by deliberate design: implementing renames + cross-repo edits first means the subsequent sync skills propagate the actual landed state rather than the pre-change state, avoiding a second cycle to fix downstream drift.
+
+**Skip condition**: `_docs/tasks/todo/` is empty, missing, or contains only `_*` files. In that case Step 3.5 is skipped entirely and the cycle proceeds with Step 3's existing drift-based routing.
+
+**Goal**: Execute suite-level implementation tasks — cross-repo concerns (e.g., `autopilot` + `ui` + suite `e2e/` cutover in a coordinated change-set), folder renames (e.g., `git mv flights missions` + `.gitmodules` edit + `_infra/` path refs), and suite-root infrastructure additions (e.g., `_infra/dev/docker-compose.dev.yml`). Per-component implementation work stays in each component's own workspace `/autodev` cycle.
+
+**Why this exists**: the meta-repo's existing sync skills (`monorepo-document`, `monorepo-cicd`, `monorepo-e2e`) only **propagate** changes that already landed. They cannot **execute** a task spec. Without Step 3.5, suite-level tickets like AZ-543 (B4 repo rename) or AZ-506 (new dev compose) have no flow path forward — they require operator action outside autodev.
+
+**Inputs**:
+
+- `_docs/tasks/todo/*.md` (excluding `_*`) — task specs in the existing format (`Task` / `Component` / `Dependencies` / `Acceptance criteria` headers)
+- `_docs/_repo-config.yaml` — `components[].path` list, used to compute the suite-level OWNED envelope (workspace root EXCLUDING any path under a component's folder)
+- `_docs/tasks/_dependencies_table.md` — synthesized by this step if missing (see Procedure)
+- `_docs/tasks/_suite_module_layout.md` — synthesized by this step if missing (see Procedure)
+
+**Procedure**:
+
+1. **Detection (already done by Step 3 pre-routing gate)**. List task files in `_docs/tasks/todo/` (excluding `_*`). If 0 → skip Step 3.5. If ≥1 → continue.
+
+2. **Present Choose**:
+
+   ```
+   ══════════════════════════════════════
+    DECISION REQUIRED: <N> suite-level task(s) in _docs/tasks/todo/
+   ══════════════════════════════════════
+    Task(s) detected:
+      - AZ-XXX: <title>           (deps: <list or "—">)
+      - AZ-YYY: <title>           (deps: <list or "—">)
+      ...
+
+    A) Run implement skill on these task(s) now (then continue to Doc / E2E / CICD sync)
+    B) Skip implement this cycle — continue to Doc / E2E / CICD sync without executing tasks
+    C) Pause — review the tasks before deciding (end session, no state changes)
+   ══════════════════════════════════════
+    Recommendation: A — running implement BEFORE syncs means subsequent
+                    sync skills propagate the post-implementation state.
+                    B is appropriate when tasks are blocked on user input
+                    or external coordination. C when the tasks themselves
+                    need owner clarification before execution.
+   ══════════════════════════════════════
+   ```
+
+3. **On user A — Pre-flight**:
+
+   a. **Working tree clean check**. Run `git status --porcelain`. If non-empty, surface to the user with a Choose A/B/C identical to the implement skill's prerequisite gate (commit/stash manually; agent commits as `chore: WIP pre-implement`; abort).
+
+   b. **Synthesize `_docs/tasks/_dependencies_table.md`** if missing. Parse each in-scope task's `Dependencies:` field. Write a minimal table of the form:
+
+      ```markdown
+      # Suite-Level Task Dependencies
+
+      | Task ID | Depends on | Notes |
+      |---------|------------|-------|
+      | AZ-XXX  | (none)     | — |
+      | AZ-YYY  | AZ-XXX     | — |
+      ```
+
+      If a task lists a dependency that is neither in `todo/` nor `done/`, log a warning in the synthesized file but do not block — implement skill's Step 1 (Parse) will surface the issue if it actually blocks execution.
+
+   c. **Synthesize `_docs/tasks/_suite_module_layout.md`** if missing. Default content:
+
+      ```markdown
+      # Suite-Level Module Layout (synthetic)
+
+      Generated by autodev meta-repo Step 3.5. The suite root has no per-feature decomposition; ownership is defined at the component-boundary level only.
+
+      ## Per-Component Mapping
+
+      | Component | Owns                             | Imports from |
+      |-----------|----------------------------------|--------------|
+      | suite     | (workspace root) excluding any path listed under `_repo-config.yaml.components[].path` | (read-only) every component's primary doc + `_docs/*.md` |
+
+      Suite-level tasks operate on: `.gitmodules`, `_infra/**`, `_docs/**` (excluding `_docs/tasks/_*` regenerated files), root `README.md`, `e2e/**` (suite e2e harness only).
+
+      Forbidden paths for suite-level tasks: `<component>/**` for every component listed in `_repo-config.yaml.components[].path` — those edits live in the component's own workspace `/autodev` cycle.
+      ```
+
+   d. **Prepare invocation context**:
+
+      ```
+      suite_level: true
+      TASKS_DIR: _docs/tasks/
+      module_layout_path: _docs/tasks/_suite_module_layout.md
+      ```
+
+4. **Invoke implement skill**. Read and execute `.cursor/skills/implement/SKILL.md` with the prepared context. The skill's "Suite-level invocation context" subsection (added in tandem with this flow change) honors the three flags above and skips:
+
+   - Step 14.5 (cumulative code review) — no `architecture_compliance_baseline.md` exists at the suite level; cross-task drift is captured by the next `monorepo-status` cycle instead.
+   - Step 15 (Product Implementation Completeness Gate) — the gate's inputs (`_docs/02_document/architecture.md`, `system-flows.md`, `components/*/description.md`) do not exist in the meta-repo artifact layout. Suite tasks are infrastructure / coordination work, not feature implementation.
+
+   All other implement skill steps (1–14, 16) execute unchanged. Tracker integration (Step 5: In Progress, Step 12: In Testing) runs normally.
+
+5. **Post-implement re-status**. After the implement skill completes (last batch committed, all originally-todo tasks moved to `_docs/tasks/done/`), silently re-run Step 3's drift detection logic — do NOT re-render the full Status report; just re-evaluate the drift signals against the post-implementation tree. Then auto-chain per the post-implementation drift findings:
+
+   - Doc drift → Step 4 (Document Sync)
+   - Suite-e2e drift only → Step 4.5
+   - CI drift only → Step 5
+   - No drift → cycle complete
+
+   Note: the post-implement re-status is exactly why Step 3.5 is placed before sync. A repo rename will typically introduce doc + CI drift; the next invocation of Step 4 / Step 5 catches it on the same cycle.
+
+6. **On user B (skip)** → mark Step 3.5 `skipped` in the state file. Apply Step 3's original drift-based routing (rules 3–6, computed from the pre-Step-3.5 Status report).
+
+7. **On user C (pause)** → end session. Update state to `step: 3.5, status: in_progress, sub_step: {phase: 0, name: awaiting-task-review, detail: "<N> tasks pending review"}`. Tell the user to invoke `/autodev` again after deciding. **Do NOT modify any files** — pre-flight has not run yet.
+
+**Self-verification** (executed before invoking implement):
+
+- [ ] Working tree is clean (or user explicitly chose B in the WIP-stash sub-Choose)
+- [ ] `_docs/tasks/_dependencies_table.md` exists (synthesized if it didn't)
+- [ ] `_docs/tasks/_suite_module_layout.md` exists (synthesized if it didn't)
+- [ ] All in-scope task files have a `Component:` field (skip + report any that don't — don't guess ownership)
+- [ ] Tracker availability gate satisfied per `protocols.md` (or `tracker: local` previously chosen)
+
+**Failure handling**:
+
+- If implement returns FAILED → standard Failure Handling (`protocols.md`): retry up to 3 times, then escalate.
+- If implement is interrupted mid-batch → next invocation re-detects via the implement skill's resumability protocol (read latest `_docs/03_implementation/suite_batch_*.md`). Step 3.5 itself is reentrant: on re-entry, if `todo/` still has tasks, it presents the Choose again with the remaining set.
+- **Half-applied state risk** (acknowledged): if implement is interrupted between commits, the working tree is clean at the last commit boundary but the in-flight batch is lost. The user is responsible for inspecting and re-invoking. This is intentional — automated rollback of suite-level renames + `.gitmodules` edits is more dangerous than a human-driven recovery.
+
+**Idempotency**: if `_docs/tasks/todo/` becomes empty after this step (all tasks moved to `done/`), the next `/autodev` invocation skips Step 3.5 entirely and proceeds with normal Status → sync flow.

 ---
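
The Step 3 evaluation order and the Step 3.5 task-file filter described above can be sketched roughly as follows. This is a minimal illustration, not the skill's implementation: the `Status` fields, the action strings, and the sample file name `AZ-543_rename.md` are hypothetical, and the sketch assumes that a combination of CI drift and suite-e2e drift without doc drift is left unrouted, because the rules only name the "(only)" cases.

```python
from dataclasses import dataclass


@dataclass
class Status:
    """Hypothetical summary of the Step 3 report (field names are illustrative)."""
    registry_mismatch: bool = False
    doc_drift: bool = False
    ci_drift: bool = False
    e2e_drift: bool = False
    has_suite_e2e_block: bool = False


def suite_tasks(filenames):
    """Keep *.md task files, dropping names that start with '_'."""
    return sorted(n for n in filenames if n.endswith(".md") and not n.startswith("_"))


def drift_route(status):
    """Rules 3-6: drift-based routing, first match wins."""
    if status.doc_drift:
        return "step 4"
    if status.ci_drift and status.e2e_drift:
        # Assumption: the rules only name the "(only)" cases, so the sketch refuses to guess.
        raise ValueError("CI + suite-e2e drift without doc drift: routing not specified")
    if status.ci_drift:
        return "step 5"
    if status.e2e_drift and status.has_suite_e2e_block:
        return "step 4.5"
    return "done"


def route(status, todo_filenames):
    """Return the ordered actions for one cycle."""
    actions = []
    if status.registry_mismatch:
        actions.append("choose: registry mismatch")  # rule 1: resolve first, then continue
    if suite_tasks(todo_filenames):
        actions.append("step 3.5")  # rule 2: pre-routing gate fires before drift routing
    else:
        actions.append(drift_route(status))
    return actions
```

With todo tasks present, drift routing is deferred until Step 3.5 has returned, which is why `route` yields only `"step 3.5"` in that case even when doc drift is already known.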

@@ -287,11 +421,16 @@ After onboarding completes, the config is updated. Auto-chain back to **Step 3 (
 | Config Review (2, user picked A, confirmed_by_user: true) | Auto-chain → Glossary & Architecture Vision (2.5) |
 | Config Review (2, user picked B) | **Session boundary** — end session, await re-invocation |
 | Glossary & Architecture Vision (2.5) | Auto-chain → Status (3) |
-| Status (3, doc drift) | Auto-chain → Document Sync (4) |
-| Status (3, suite-e2e drift only) | Auto-chain → Integration Test Sync (4.5) |
-| Status (3, CI drift only) | Auto-chain → CICD Sync (5) |
-| Status (3, no drift) | **Cycle complete** — end session, await re-invocation |
+| Status (3, todo tasks present) | Auto-chain → Suite Implement (3.5) — pre-routing gate fires before drift-based routing |
+| Status (3, no todo tasks, doc drift) | Auto-chain → Document Sync (4) |
+| Status (3, no todo tasks, suite-e2e drift only) | Auto-chain → Integration Test Sync (4.5) |
+| Status (3, no todo tasks, CI drift only) | Auto-chain → CICD Sync (5) |
+| Status (3, no todo tasks, no drift) | **Cycle complete** — end session, await re-invocation |
 | Status (3, registry mismatch) | Ask user (A: discover, B: onboard, C: continue) |
+| Suite Implement (3.5, user picked A, success) | Silent re-status; auto-chain per post-implementation drift (Step 4 / 4.5 / 5 / cycle complete) |
+| Suite Implement (3.5, user picked B) | Mark `skipped`; auto-chain per Step 3's original drift findings |
+| Suite Implement (3.5, user picked C) | **Session boundary** — end session, await re-invocation |
+| Suite Implement (3.5, FAILED ×3) | Standard Failure Handling escalation (`protocols.md`) |
 | Document Sync (4) + suite-e2e drift pending | Auto-chain → Integration Test Sync (4.5) |
 | Document Sync (4) + CI drift only pending | Auto-chain → CICD Sync (5) |
 | Document Sync (4) + no further drift | **Cycle complete** |
@@ -317,11 +456,12 @@ Flow-specific slot values:
 | 2 | Config Review                      | `IN PROGRESS (awaiting human)` |
 | 2.5 | Glossary & Architecture Vision   | `SKIPPED (already captured)` |
 | 3 | Status                             | `DONE (no drift)`, `DONE (N drifts)` |
+| 3.5 | Suite Implement                  | `DONE (N tasks)`, `SKIPPED (no todo tasks)`, `SKIPPED (user picked B)`, `IN PROGRESS (batch M of ~N)`, `IN PROGRESS (awaiting-task-review)` |
 | 4 | Document Sync                      | `DONE (N docs)`, `SKIPPED (no doc drift)` |
 | 4.5 | Integration Test Sync            | `DONE (N files)`, `SKIPPED (no suite-e2e drift)`, `SKIPPED (no suite_e2e config block)` |
 | 5 | CICD Sync                          | `DONE (N files)`, `SKIPPED (no CI drift)` |

-All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2.5, 4, 4.5, and 5 additionally accept `SKIPPED`.
+All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2.5, 3.5, 4, 4.5, and 5 additionally accept `SKIPPED`.

 Row rendering format:

@@ -330,6 +470,7 @@ Row rendering format:
 Step 2     Config Review                     [<state token>]
 Step 2.5   Glossary & Architecture Vision    [<state token>]
 Step 3     Status                            [<state token>]
+ Step 3.5   Suite Implement                   [<state token>]
 Step 4     Document Sync                     [<state token>]
 Step 4.5   Integration Test Sync             [<state token>]
 Step 5     CICD Sync                         [<state token>]
@@ -337,8 +478,12 @@ Row rendering format:

 ## Notes for the meta-repo flow

- **No session boundary except Step 2 and Step 2.5**: unlike existing-code flow (which has boundaries around decompose), meta-repo flow only pauses at config review and the one-shot glossary/vision capture. Once both are confirmed, syncing is fast enough to complete in one session and Step 2.5 idempotently no-ops on every subsequent invocation.
+- **Session boundaries**: Step 2 (Config Review pending), Step 2.5 (one-shot glossary/vision review), and Step 3.5 (when user picks C "Pause"). Step 3.5's A/B picks do NOT cross a session boundary — they auto-chain to syncs in the same session.
 - **Cyclical, not terminal**: no "done forever" state. Each invocation completes a drift cycle; next invocation starts fresh.
- **No tracker integration**: this flow does NOT create Jira/ADO tickets. Maintenance is not a feature — if a feature-level ticket spans the meta-repo's concerns, it lives in the per-component workspace.
+- **Tracker integration scope**: this flow does NOT create Jira/ADO tickets in its sync skills (Status / Document Sync / E2E / CICD). Step 3.5 (Suite Implement) IS tracker-integrated — it transitions existing tickets In Progress → In Testing per the implement skill's standard tracker handling. Suite-level tickets are authored manually by the operator (typically as children of an Epic that spans multiple components, like AZ-539); the flow doesn't auto-create them.
+- **Per-component vs. suite-level work**:
+  - Tickets that touch component source code (`<component>/src/**`) belong in that component's own workspace `/autodev` cycle. The meta-repo flow does NOT execute them.
+  - Tickets that touch suite-root paths only (`.gitmodules`, `_infra/**`, suite `e2e/**`, root `README.md`, suite `_docs/**` outside `tasks/_*`) are eligible for Step 3.5.
+  - Tickets that span both (e.g., AZ-550 B11 consumer cutover, which touches `autopilot/`, `ui/`, AND suite `e2e/`) are NOT executable from a single workspace by design — split the ticket so the suite-level slice can run in Step 3.5 and the component slices run in their owning workspaces.
 - **Onboarding is opt-in**: never auto-onboarded. User must explicitly request.
 - **Failure handling**: uses the same retry/escalation protocol as other flows (see `protocols.md`).
@@ -114,6 +114,7 @@ Before entering a step from this table for the first time in a session, verify t
 | greenfield | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | New Task | Step 7 — Ticket | Create ticket per task, link to epic |
+| meta-repo | Suite Implement | Step 3.5 — implement skill Step 5 / Step 12 | Transition existing tickets In Progress → In Testing per implement skill (does NOT create new tickets — operator authors them) |

 ### State File Marker

@@ -388,7 +389,7 @@ The banner shell is defined here once. Each flow file contributes only its step-
  where `<state token>` comes from the state-token set defined per row in the flow's step-list table.
 - `<current-suffix>` — optional, flow-specific. The existing-code flow appends ` (cycle <N>)` when `state.cycle > 1`; other flows leave it empty.
 - `Retry:` row — omit entirely when `retry_count` is 0. Include it with `<N>/3` otherwise.
- `<footer-extras>` — optional, flow-specific. The meta-repo flow adds a `Config:` line with `_docs/_repo-config.yaml` state; other flows leave it empty.
+- `<footer-extras>` — optional, flow-specific. The meta-repo flow adds a `Config:` line with `_docs/_repo-config.yaml` state; other flows leave it empty. Independently of the flow, **parent suite docs** may add a line: if `<workspace-root>/../docs` exists and is a directory, append `Suite docs (parent): <absolute path>` on its own line; if it is missing, omit the line entirely (a `Suite docs (parent): absent` line is not required). This line is orthogonal to flow-specific footer lines; both may appear.

 ### State token set (shared)

@@ -13,7 +13,7 @@ The autodev persists its position to `_docs/_autodev_state.md`. This is a lightw

 ## Current Step
 flow: [greenfield | existing-code | meta-repo]
-step: [1-17 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
+step: [1-17 for greenfield (incl. fractional 16.5), 1-17 for existing-code (incl. fractional 16.5), 1-6 for meta-repo (incl. fractional 2.5, 3.5, and 4.5), or "done"]
 name: [step name from the active flow's Step Reference Table]
 status: [not_started / in_progress / completed / skipped / failed]
 sub_step:
@@ -82,6 +82,19 @@ retry_count: 0
 cycle: 1
 ```

+```
+flow: meta-repo
+step: 3.5
+name: Suite Implement
+status: in_progress
+sub_step:
+  phase: 7
+  name: batch-loop
+  detail: "AZ-543 batch 1 of 1; suite-level"
+retry_count: 0
+cycle: 1
+```
+
 ```
 flow: existing-code
 step: 10
@@ -100,7 +113,7 @@ cycle: 3
 1. **Create** on the first autodev invocation (after state detection determines Step 1)
 2. **Update** after every change — this includes: batch completion, sub-step progress, step completion, session boundary, failed retry, or any meaningful state transition. The state file must always reflect the current reality.
 3. **Read** as the first action on every invocation — before folder scanning
-4. **Cross-check**: verify against actual `_docs/` folder contents. If they disagree, trust the folder structure and update the state file
+4. **Cross-check**: verify against actual `_docs/` folder contents. If they disagree, trust the folder structure and update the state file. **Parent suite `docs/`**: on every invocation, also probe `<workspace-root>/../docs` (the parent directory’s `docs` folder — typical suite-level shared documentation next to a component repo). If it exists, mention it in the Status Summary footer per `protocols.md`; use it only as supplemental reading context unless a flow step explicitly ties detection to it. It never replaces workspace `_docs/` for step detection by default.
 5. **Never delete** the state file
 6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` on success. If `retry_count` reaches 3, set `status: failed`
 7. **Failed state on re-entry**: if `status: failed` with `retry_count: 3`, do NOT auto-retry — present the issue to the user first
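
Rules 6 and 7 above describe a small retry state machine. A minimal sketch, assuming an in-memory `State` object that stands in for the `retry_count` and `status` fields of `_docs/_autodev_state.md` (the class and function names are hypothetical):

```python
from dataclasses import dataclass

MAX_RETRIES = 3  # matches the "retry N/3" limit used in the state rules


@dataclass
class State:
    """Hypothetical in-memory view of the state file's retry fields."""
    status: str = "in_progress"
    retry_count: int = 0


def record_attempt(state, succeeded):
    """Rule 6: reset the counter on success, increment it on a failed auto-retry."""
    if succeeded:
        state.retry_count = 0
        return state
    state.retry_count += 1
    if state.retry_count >= MAX_RETRIES:
        state.status = "failed"
    return state


def may_auto_retry(state):
    """Rule 7: a failed state with exhausted retries goes to the user, not to auto-retry."""
    return not (state.status == "failed" and state.retry_count >= MAX_RETRIES)
```

Persisting the result back to the state file after each call is left out of the sketch; rule 2 requires that write after every transition.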
@@ -2,7 +2,7 @@
 name: code-review
 description: |
  Multi-phase code review against task specs with structured findings output.
-  6-phase workflow: context loading, spec compliance, code quality, security quick-scan, performance scan, cross-task consistency.
+  7-phase workflow: context loading, spec compliance, code quality, security quick-scan, performance scan, cross-task consistency, architecture compliance.
  Produces a structured report with severity-ranked findings and a PASS/FAIL/PASS_WITH_WARNINGS verdict.
  Invoked by /implement skill after each batch, or manually.
  Trigger phrases:
@@ -106,11 +106,12 @@ When multiple tasks were implemented in the same batch:

 ## Phase 7: Architecture Compliance

-Verify the implemented code respects the architecture documented in `_docs/02_document/architecture.md` and the component boundaries declared in `_docs/02_document/module-layout.md`.
+Verify the implemented code respects the architecture documented in `_docs/02_document/architecture.md`, the component boundaries declared in `_docs/02_document/module-layout.md`, and the **accepted Architectural Decision Records** under `_docs/02_document/adr/`.

 **Inputs**:
 - `_docs/02_document/architecture.md` — layering, allowed dependencies, patterns
 - `_docs/02_document/module-layout.md` — per-component directories, Public API surface, `Imports from` lists, Allowed Dependencies table
+- `_docs/02_document/adr/` — every `Status: Accepted` ADR is an enforceable structural rule. `Status: Proposed`, `Status: Deprecated`, and `Status: Superseded` ADRs are NOT enforced (Proposed = not yet ratified; Deprecated/Superseded = a later ADR overturned it). If the directory does not exist or has only the index file, ADRs are skipped — log this skip in the report so the absence is visible.
 - The cumulative list of changed files (for per-batch invocation) or the full codebase (for baseline invocation)

 **Checks**:
@@ -125,6 +126,11 @@ Verify the implemented code respects the architecture documented in `_docs/02_do

 5. **Cross-cutting concerns not locally re-implemented**: if a file under a component directory contains logic that should live in `shared/<concern>/` (e.g., custom logging setup, config loader, error envelope), flag it. Severity: Medium. Category: Architecture.

+6. **ADR compliance**: for each `Status: Accepted` ADR, confirm the changed code does not contradict the ADR's `Decision`. Two failure modes are flagged:
+   - **ADR-Violation**: the changed code does the opposite of an Accepted ADR's `Decision`. Example: ADR-002 says "We will use Postgres for transactional data" and the changed code introduces a SQLite dependency for a transactional path. Severity: **Critical**. Category: Architecture. The finding cites the ADR by `NNN_<slug>` and the offending file/line.
+   - **ADR-Drift**: the changed code does something the ADR did not anticipate AND that materially affects the ADR's `Consequences` (positive or negative). Example: ADR-004 says "Event-driven cross-component comms" and a changed file introduces a new synchronous HTTP call between two components. Severity: **High**. Category: Architecture. The finding either proposes "Update ADR-NNN to acknowledge the new pattern" or "Remove the drift to align with ADR-NNN" — never silently accepts.
+   The check skips ADRs that are explicitly out of scope of the changed batch (e.g., ADR-001 about deployment pipeline when the batch only touches business-logic files). Use the ADR's `Evidence` section to determine scope: if no Evidence path overlaps with any changed file, skip the ADR for this batch.
+
 **Detection approach (per language)**:

 - Python: parse `import` / `from ... import` statements; optionally AST with `ast` module for reliable symbol resolution.
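
The status filter and scope rule from check 6 (ADR compliance) can be sketched in the same language. This is an illustration under stated assumptions: the `Adr` record and its fields are hypothetical, and "an Evidence path overlaps a changed file" is interpreted here as a glob match via `fnmatch`, which the source does not specify.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch

# Severities as stated for the two ADR failure modes.
SEVERITY = {"ADR-Violation": "Critical", "ADR-Drift": "High"}


@dataclass
class Adr:
    """Hypothetical parsed ADR; field names are illustrative."""
    adr_id: str
    status: str
    evidence: list = field(default_factory=list)  # path patterns from the Evidence section


def adrs_in_scope(adrs, changed_files):
    """Return ids of ADRs that are enforced for this batch.

    Only Accepted ADRs are enforced, and an ADR is skipped when none of its
    Evidence patterns matches any changed file (assumed glob semantics).
    """
    enforced = []
    for adr in adrs:
        if adr.status != "Accepted":
            continue  # Proposed / Deprecated / Superseded are not enforced
        if any(fnmatch(path, pattern) for pattern in adr.evidence for path in changed_files):
            enforced.append(adr.adr_id)
    return enforced
```

Deciding whether a change actually contradicts an in-scope ADR's `Decision` is a review judgment that the sketch does not attempt; it only narrows the set of ADRs worth reading for a given batch.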
@@ -197,7 +203,7 @@ Produce a structured report with findings deduplicated and sorted by severity:

 Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope, Architecture

-`Architecture` findings come from Phase 7. They indicate layering violations, Public API bypasses, new cyclic dependencies, duplicate symbols, or cross-cutting concerns re-implemented locally.
+`Architecture` findings come from Phase 7. They indicate layering violations, Public API bypasses, new cyclic dependencies, duplicate symbols, cross-cutting concerns re-implemented locally, **ADR-Violation** (changed code contradicts an `Accepted` ADR's Decision — Critical), or **ADR-Drift** (changed code introduces a pattern that materially affects an `Accepted` ADR's Consequences without superseding it — High).

 ## Verdict Logic

@@ -232,7 +238,7 @@ The implement skill invokes code-review by:

 1. Reading `.cursor/skills/code-review/SKILL.md`
 2. Providing the inputs above as context (read the files, pass content to the review phases)
-3. Executing all 6 phases sequentially
+3. Executing all 7 phases sequentially
 4. Consuming the verdict from the output

 ### Outputs (returned to the implement skill)
@@ -65,6 +65,7 @@ Announce the selected entrypoint and resolved paths to the user before proceedin
 | 1 Bootstrap Structure | `steps/01_bootstrap-structure.md` | ✓ | — | — |
 | 1t Test Infrastructure | `steps/01t_test-infrastructure.md` | — | — | ✓ |
 | 1.5 Module Layout | `steps/01-5_module-layout.md` | ✓ | — | — |
+| 1.7 System-Pipeline Tasks | `steps/01-7_system-pipeline-tasks.md` | ✓ | — | — |
 | 2 Task Decomposition | `steps/02_task-decomposition.md` | ✓ | ✓ | — |
 | 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | — | — | ✓ |
 | 4 Cross-Verification | `steps/04_cross-verification.md` | ✓ | — | ✓ |
@@ -191,6 +192,20 @@ Read and follow `steps/01-5_module-layout.md`.

 ---

+### Step 1.7: System-Pipeline Tasks (implementation mode only)
+
+Read and follow `steps/01-7_system-pipeline-tasks.md`.
+
+This step exists because per-component task decomposition (Step 2)
+produces one task per component but NEVER produces a task whose
+deliverable is "the production code that drives the end-to-end
+pipeline by calling each component in order against real inputs".
+The architecture document describes the loop; nobody owns it. The
+GPS-passthrough incident (May 2026) is the canonical failure this
+step prevents.
+
+---
+
 ### Step 2: Task Decomposition (implementation and single component modes)

 Read and follow `steps/02_task-decomposition.md`.
@@ -243,6 +258,8 @@ Read and follow `steps/04_cross-verification.md`.
 │       [BLOCKING: user confirms structure]                       │
 │  1.5  Module Layout       → steps/01-5_module-layout.md         │
 │       [BLOCKING: user confirms layout]                          │
+│  1.7  System-Pipeline     → steps/01-7_system-pipeline-tasks.md │
+│       [BLOCKING: user confirms pipeline owners]                 │
 │  2.   Component Tasks     → steps/02_task-decomposition.md      │
 │  4.   Cross-Verification  → steps/04_cross-verification.md      │
 │       [BLOCKING: user confirms dependencies]                    │
@@ -16,7 +16,8 @@
 3. Each component owns ONE top-level directory. Shared code goes under `<root>/shared/` (or language equivalent).
 4. Public API surface = files in the layout's `public:` list for each component; everything else is internal and MUST NOT be imported from other components.
 5. Cross-cutting concerns (logging, error handling, config, telemetry, auth middleware, feature flags, i18n) each get ONE entry under Shared / Cross-Cutting; per-component tasks consume them (see Step 2 cross-cutting rule).
-6. Write `_docs/02_document/module-layout.md` using `templates/module-layout.md` format.
+6. **ADR cross-check**: if `_docs/02_document/adr/` exists, read every `Status: Accepted` ADR. For each, confirm the proposed module layout does not contradict the ADR's `Decision` (e.g., an ADR mandating an event-bus boundary between two components must show up as an `Imports from` exclusion in the layout; an ADR locking a layering style must show up in the Layering table). If an ADR conflicts with the language-conventional layout from step 2, the ADR wins — record the conflict in a `## ADR-driven exceptions to the conventional layout` section of `module-layout.md` with `See ADR NNN_<slug>` references. If the ADR conflict is irreconcilable (the ADR demands something the language genuinely cannot express), STOP and ask the user A/B/C: (A) update the ADR via plan Step 4.5 supersede flow, (B) accept a layered exception with documented rationale, (C) re-open architecture.
+7. Write `_docs/02_document/module-layout.md` using `templates/module-layout.md` format. Each Per-Component Mapping entry that is governed by an ADR includes a trailing `> See ADR NNN_<slug>` line.

 ## Self-verification

@@ -26,6 +27,8 @@
 - [ ] No component's `Imports from` list points at a higher layer
 - [ ] Paths follow the detected language's convention
 - [ ] No two components own overlapping paths
+- [ ] If `_docs/02_document/adr/` exists with Accepted ADRs, every layout decision that an ADR governs has a trailing `> See ADR NNN_<slug>` reference
+- [ ] No Accepted ADR is contradicted by the layout without a documented exception

 ## Save action

@@ -0,0 +1,72 @@
+# Step 1.7: System-Pipeline Tasks (implementation mode only)
+
+**Role**: Professional software architect, integration-focused.
+**Goal**: For every end-to-end pipeline named in `_docs/02_document/architecture.md` and `_docs/02_document/system-flows.md`, ensure there is exactly ONE explicit task that owns the production code that drives that pipeline against real inputs. This step prevents the failure mode where every individual component is "complete" but no production code wires them together (May 2026 GPS-passthrough incident — see `meta-rule.mdc` "When a test reveals missing production code").
+
+**Constraints**:
+
+- This step produces *integration* tasks, not per-component tasks. Per-component tasks come from Step 2.
+- An integration task's owner is typically the composition root, runtime root, main loop, or whichever component the module layout (Step 1.5) names as the "system spine". It is NEVER a leaf component.
+- Each integration task must be sized at 5 points or fewer. If the pipeline is too large for one task, split it into per-stage integration tasks (e.g. "wire ingress → C1", then "wire C1 → C5") rather than one giant task.
+
+## Inputs
+
+| File | Purpose |
+|------|---------|
+| `_docs/02_document/architecture.md` | Source of named end-to-end pipelines and their component sequences |
+| `_docs/02_document/system-flows.md` | Source of operational flows (per-frame loop, request lifecycle, batch job, etc.) |
+| `_docs/02_document/module-layout.md` | Produced by Step 1.5. Names the "system spine" component(s) — typically `runtime_root`, `app`, `main`, `composition`, or equivalent. |
+| `_docs/02_document/components/*/description.md` | Per-component contracts so you can tell which side of a seam each method lives on |
+
+## Steps
+
+1. **Enumerate end-to-end pipelines.** Read `architecture.md` and `system-flows.md`. For each named pipeline / flow that spans 2+ components, record:
+   - The pipeline name (e.g. "per-frame nav loop", "tile-cache build", "operator pre-flight verification").
+   - The ordered sequence of components it touches (e.g. `frame_source → c1_vio → c2_vpr → ... → c5_state → replay_sink`).
+   - The trigger (per-frame, per-request, scheduled, manual).
+   - The output (what the pipeline emits and to whom).
+2. **For each pipeline, locate the owner.** Use `module-layout.md` to find the component that owns the orchestration (the "spine"). If `module-layout.md` does not name one, STOP and ASK the user which component owns the pipeline. Do NOT silently default to the bootstrap structure task — bootstrap is about project skeleton, not behavior.
+3. **Check whether the pipeline is already covered by an existing task spec or by the bootstrap-structure task.** A pipeline is "covered" only if:
+   - A task spec's `Outcome` or `Acceptance Criteria` section explicitly names "drives the {pipeline_name} end-to-end against real production components", AND
+   - That task's owned files include the orchestration code (typically the spine component's main loop / entrypoint).
+4. **For every uncovered pipeline, create a system-integration task spec** in `_docs/02_tasks/todo/` using `.cursor/skills/decompose/templates/task.md`:
+   - **Component**: the spine component from step 2 (e.g. `runtime_root`).
+   - **Outcome**: the production callsite that drives the pipeline exists and runs end-to-end on real inputs.
+   - **Scope / Included**: the orchestration code (loop body, dispatcher, scheduler, entrypoint); explicit list of every component it must call in order; the data type at each seam.
+   - **Acceptance Criteria** (write each as testable):
+     - At least one production caller of every component method in the pipeline can be found by grep — name the methods explicitly.
+     - The orchestration runs against the real production component instances (NOT mocks, NOT a passthrough that bypasses them).
+     - At least one integration test exercises the orchestration end-to-end against real inputs.
+   - **Dependencies**: every per-component task whose component appears in the pipeline.
+   - **Complexity points**: ≤5; split the pipeline if it doesn't fit.
+   - **Tracker**: create a ticket immediately (per `decompose/SKILL.md` "Tracker inline" principle); rename the file to `[TRACKER-ID]_pipeline_<name>.md`.
+5. **Make the integration task a dependency of the integration test task.** If `tests-only` decomposition has already produced an e2e/integration test task for this pipeline, append the new integration task to its `Dependencies` field so the test cannot be "made green" before the integration ships.
+
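A minimal sketch of Steps 1, 3, and 4 above, assuming task specs have already been parsed into simple in-memory records. The `Pipeline` and `TaskSpec` classes and the `owns_orchestration_code` flag are hypothetical names used for illustration only; the real artifacts are markdown files, and the coverage phrase is matched here as a plain substring.

```python
from dataclasses import dataclass


@dataclass
class Pipeline:
    name: str            # e.g. "per-frame nav loop"
    sequence: list[str]  # ordered components the pipeline touches
    trigger: str         # per-frame, per-request, scheduled, manual
    output: str          # what the pipeline emits and to whom


@dataclass
class TaskSpec:
    # Hypothetical parsed view of a task spec file.
    outcome: str
    acceptance_criteria: str
    owns_orchestration_code: bool


def is_covered(pipeline: Pipeline, task: TaskSpec) -> bool:
    """Step 3: BOTH conditions must hold for a task to cover a pipeline."""
    phrase = (
        f"drives the {pipeline.name} end-to-end "
        "against real production components"
    )
    names_pipeline = (
        phrase in task.outcome or phrase in task.acceptance_criteria
    )
    return names_pipeline and task.owns_orchestration_code


def uncovered(pipelines: list[Pipeline], tasks: list[TaskSpec]) -> list[Pipeline]:
    """Pipelines that still need a system-integration task spec (Step 4)."""
    return [p for p in pipelines if not any(is_covered(p, t) for t in tasks)]
```

Every pipeline returned by `uncovered` would get its own integration task in `_docs/02_tasks/todo/`. A task that mentions the pipeline but does not own the orchestration code deliberately does not count as coverage.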
+## Anti-patterns this step explicitly blocks
+
+- **"compose_root returns a wired runtime"** prose interpreted as "the loop exists". Composition assembles the graph; it is NOT the loop. The loop is the code that pulls inputs, drives each node, and emits outputs. If grep finds zero callers of the leaf components, the loop does not exist regardless of what compose_root does.
+- **Treating the bootstrap-structure task as the home of the main loop.** Bootstrap is project skeleton (package layout, CLI scaffold, build files). It is NOT the main loop. The main loop is its own task.
+- **Per-component tasks claiming integration scope.** A C1 VIO task's deliverable is "C1 works in isolation against unit tests". A C1 task's acceptance criteria MUST NOT include "C1 is wired into the runtime" — that's the integration task's job.
+
+## Self-verification
+
+- [ ] Every pipeline named in `architecture.md` / `system-flows.md` is listed in your enumeration.
+- [ ] Every enumerated pipeline either (a) has an existing covered task, or (b) has a new integration task in `todo/`.
+- [ ] No integration task exceeds 5 complexity points.
+- [ ] Every integration task names every component in the pipeline as a `Dependencies` entry.
+- [ ] No integration task is owned by a leaf component — every owner is named in `module-layout.md` as a spine / orchestrator.
+- [ ] Every integration task has a tracker ticket created and the filename renamed to `[TRACKER-ID]_pipeline_<name>.md`.
+
+## Save action
+
+Write the new integration task files into `_docs/02_tasks/todo/`. They will be picked up by Step 2 (Task Decomposition's dependency-table writer) and by Step 4 (Cross-Verification).
+
+## Blocking
+
+**BLOCKING**: Present the pipeline enumeration + the list of new integration tasks to the user. Do NOT proceed to Step 2 until the user confirms:
+
+- The enumeration matches what they expect from the architecture documents.
+- Every uncovered pipeline now has an integration task.
+- The chosen spine owners are correct.
+
+If the user identifies a pipeline you missed, add it before proceeding. If the user names a different spine owner, update the task and re-run self-verification.
@@ -29,7 +29,7 @@ Save as `_docs/04_deploy/ci_cd_pipeline.md`.
 ### Test
 - Unit tests: [framework and command]
 - Blackbox tests: [framework and command, uses docker-compose.test.yml]
- Coverage threshold: 75% overall, 90% critical paths
+- Coverage threshold: 75% overall, 90% critical-path floor (100% aim) — per `.cursor/rules/cursor-meta.mdc` Quality Thresholds
 - Coverage report published as pipeline artifact

 ### Security
@@ -64,6 +64,27 @@ TASKS_DIR/
 └── done/        ← completed tasks (moved here after implementation)
 ```

+### Suite-level invocation context (meta-repo flow)
+
+When invoked from `.cursor/skills/autodev/flows/meta-repo.md` Step 3.5 (or any caller that supplies the same context envelope), the skill receives:
+
+```
+suite_level: true
+TASKS_DIR: <override>          # e.g., _docs/tasks/  (vs. default _docs/02_tasks/)
+module_layout_path: <override>  # e.g., _docs/tasks/_suite_module_layout.md
+```
+
+When `suite_level: true` is present, the following gate adjustments apply — and ONLY these. All other steps (1–14, 16) execute unchanged:
+
+1. **TASKS_DIR override** is honored throughout the skill (Step 1 Parse, Step 13 Archive, Step 15 input paths if it ran). Default `_docs/02_tasks/` is replaced by the supplied path.
+2. **module_layout_path override** is read instead of the hardcoded `_docs/02_document/module-layout.md` in Step 4 (Assign File Ownership). The supplied file uses the same `Per-Component Mapping` schema. If both the override and the hardcoded path are missing, behavior is unchanged from default mode (STOP and instruct).
+3. **Step 14.5 (Cumulative Code Review) — SKIPPED**. The meta-repo has no `_docs/02_document/architecture_compliance_baseline.md`; cross-task drift is captured by the next `monorepo-status` cycle instead.
+4. **Step 15 (Product Implementation Completeness Gate) — SKIPPED**. The gate's hard inputs (`_docs/02_document/architecture.md`, `system-flows.md`, `components/*/description.md`) do not exist in the meta-repo artifact layout. Suite-level tasks are infrastructure / coordination work (renames, cross-repo edits, suite-root infra additions), not feature implementation; the equivalent completeness signal is the next `monorepo-status` drift report (which the meta-repo flow re-runs immediately after Step 3.5 returns).
+5. **Final report filename**: `_docs/03_implementation/suite_implementation_report_{run_name}.md` (in addition to the existing feature/test/refactor variants). Batch reports follow `_docs/03_implementation/suite_batch_{NN}_report.md`.
+6. **Tracker integration** (Step 5: In Progress, Step 12: In Testing) runs unchanged — suite-level tickets follow the same tracker rules as any other.
+
+Without `suite_level: true`, none of these adjustments apply and the skill runs exactly as documented in default mode.
+
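The gate adjustments above can be sketched as a small resolver, assuming the invocation context arrives as a dictionary. `RunConfig` and `resolve_run_config` are hypothetical names, and falling back to the default path when an override key is absent in suite mode is an assumption, not something the skill text spells out.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_TASKS_DIR = "_docs/02_tasks/"
DEFAULT_MODULE_LAYOUT = "_docs/02_document/module-layout.md"


@dataclass(frozen=True)
class RunConfig:
    tasks_dir: str
    module_layout_path: str
    skipped_steps: tuple[str, ...]
    suite_report: Optional[str]  # None in default mode


def resolve_run_config(context: dict, run_name: str) -> RunConfig:
    if context.get("suite_level") is not True:
        # Default mode: none of the adjustments apply, so overrides are ignored.
        return RunConfig(DEFAULT_TASKS_DIR, DEFAULT_MODULE_LAYOUT, (), None)
    return RunConfig(
        # Assumption: a missing override key falls back to the default path.
        tasks_dir=context.get("TASKS_DIR", DEFAULT_TASKS_DIR),
        module_layout_path=context.get("module_layout_path", DEFAULT_MODULE_LAYOUT),
        # Steps 14.5 and 15 are the only steps skipped at suite level.
        skipped_steps=("14.5", "15"),
        suite_report=(
            "_docs/03_implementation/"
            f"suite_implementation_report_{run_name}.md"
        ),
    )
```

Keeping the whole decision in one function makes the "and ONLY these" rule easy to audit: any step not listed in `skipped_steps` runs as in default mode.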
 ## Prerequisite Checks (BLOCKING)

 1. `TASKS_DIR/todo/` exists and contains at least one task file for the selected context — **STOP if missing**
@@ -103,7 +124,7 @@ TASKS_DIR/

 ### 4. Assign File Ownership

-The authoritative file-ownership map is `_docs/02_document/module-layout.md` (produced by the decompose skill's Step 1.5). Task specs are purely behavioral — they do NOT carry file paths. Derive ownership from the layout, not from the task spec's prose.
+The authoritative file-ownership map is `_docs/02_document/module-layout.md` (produced by the decompose skill's Step 1.5), unless `suite_level: true` was supplied in the invocation context — in which case the `module_layout_path` override is read instead (see "Suite-level invocation context" above). Task specs are purely behavioral — they do NOT carry file paths. Derive ownership from the layout, not from the task spec's prose.

 For each task in the batch:
 - Read the task spec's **Component** field.
@@ -222,6 +243,8 @@ For product implementation, this archive means "batch implementation accepted."

 ### 14.5. Cumulative Code Review (every K batches)

+**Skipped entirely when `suite_level: true`** (see "Suite-level invocation context" above) — the meta-repo has no `architecture_compliance_baseline.md` to evaluate against; cross-task drift is captured by the next `monorepo-status` cycle.
+
 - **Trigger**: every K completed batches (default `K = 3`; configurable per run via a `cumulative_review_interval` knob in the invocation context)
 - **Purpose**: per-batch review (Step 9) catches batch-local issues; cumulative review catches issues that only appear when tasks are combined — architecture drift, cross-task inconsistency, duplicate symbols introduced across different batches, contracts that drifted across producer/consumer batches
 - **Scope**: the union of files changed since the **last** cumulative review (or since the start of the run if this is the first)
@@ -239,7 +262,7 @@ For product implementation, this archive means "batch implementation accepted."

 ### 15. Product Implementation Completeness Gate

-Run this gate after all **product implementation** tasks are complete and before writing any final product implementation report or allowing autodev to proceed to testability/test decomposition. Skip this gate only when the remaining context is explicitly test implementation or refactoring, as determined by the task files and report filename rules.
+Run this gate after all **product implementation** tasks are complete and before writing any final product implementation report or allowing autodev to proceed to testability/test decomposition. Skip this gate when (a) the remaining context is explicitly test implementation or refactoring (as determined by the task files and report filename rules), OR (b) `suite_level: true` was supplied in the invocation context (the gate's inputs do not exist in the meta-repo artifact layout — see "Suite-level invocation context" above).

 **Goal**: catch the failure mode where narrow tests validate scaffold behavior while the task's actual outcome, included scope, architecture promise, or named integration remains unimplemented.

@@ -268,19 +291,46 @@ For each completed product task:
   - **BLOCKED**: production code exists but cannot be fully verified due to external hardware/data/license/runtime prerequisites; the blocker is explicit and tests report blocked/skipped with reason.
   - **FAIL**: promised production behavior is missing, only scaffolded, or only represented in tests/reports.

+#### 15.b System-Pipeline Check (runs ONCE per gate invocation, after per-task classification)
+
+The per-task classification above (steps 1–8) operates on `_docs/02_tasks/done/`. It catches missing component-local behavior but it CANNOT catch a missing *integration* — there is no task to fail if no task ever owned the integration in the first place. The GPS-passthrough incident (May 2026) escaped this gate because every per-component task in `done/` was honestly complete; the missing piece was the cross-component loop, which had no owning task.
+
+The system-pipeline check fixes that by walking the architecture documents directly, independent of `done/`.
+
+**Inputs**:
+- `_docs/02_document/architecture.md`
+- `_docs/02_document/system-flows.md`
+- Full source tree under the project's production directory (e.g. `src/`).
+
+**Procedure**:
+
+1. **Enumerate end-to-end pipelines.** Read `architecture.md` and `system-flows.md`. For each named pipeline / operational flow that spans 2+ components, record the ordered component sequence and the trigger (per-frame, per-request, scheduled, manual).
+2. **Grep for production callers of each seam method.** For each adjacent pair `A → B` in a pipeline, find a production source file (not under `tests/`, not under a `bench/` package, not a doc) that calls `A`'s public output method AND passes the result into `B`'s public input method.
+3. **Classify the pipeline**:
+   - **WIRED**: a production caller exists and the chain is complete from the first to the last component in the sequence.
+   - **PARTIALLY WIRED**: some adjacent pairs have callers but at least one seam is missing.
+   - **NOT WIRED**: no production code calls the pipeline's components in order. Bench tools, unit tests, and microbenchmarks do NOT count as "wiring".
+4. **Distinguish "wired but stubbed" from "wired with real components"**: a caller that invokes a passthrough / GPS-from-tlog / mock-output-generator instead of the real component is `NOT WIRED` for the purposes of this gate. The seam exists in the source file but the production behavior is faked. Grep for the same scaffold markers Step 15 already enumerates (`placeholder`, `stub`, `passthrough`, `scaffold until`, etc.) inside the caller's body.
+5. **Output**: append a `## System Pipeline Audit` section to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md`. Per-pipeline row: name, sequence, classification, evidence file (the caller, or "NONE FOUND"), remediation suggestion if not `WIRED`.
+
+**Pipeline classification feeds the combined gate below.** Any pipeline that is not `WIRED` is a system-level FAIL that the per-task gate cannot rescue.
+
+**Why this is here and not only in decompose**: decompose Step 1.7 creates integration tasks up front; this check verifies the integration tasks actually got implemented (or, if they were never created, surfaces the gap before the cycle closes). The two layers are belt-and-suspenders by design.
+
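A rough sketch of procedure steps 3 and 4 and of the combined gate they feed, assuming the grep results have already been collected into a mapping from each adjacent pair to the body text of its production caller. `classify_pipeline`, `gate_decision`, and the `seam_callers` shape are hypothetical; the marker scan is a naive substring match and covers only the markers named above.

```python
SCAFFOLD_MARKERS = ("placeholder", "stub", "passthrough", "scaffold until")


def classify_pipeline(sequence, seam_callers):
    """Classify one pipeline from per-seam evidence.

    seam_callers maps (A, B) to the source text of the production caller
    that passes A's output into B, or omits the pair if none was found.
    """
    if len(sequence) < 2:
        raise ValueError("a pipeline spans at least two components")
    seams = list(zip(sequence, sequence[1:]))
    wired = 0
    for seam in seams:
        body = seam_callers.get(seam)
        if body is None:
            continue  # no production caller for this seam
        if any(marker in body.lower() for marker in SCAFFOLD_MARKERS):
            # Step 4: a faked seam counts as NOT WIRED for this gate.
            return "NOT WIRED"
        wired += 1
    if wired == len(seams):
        return "WIRED"
    return "PARTIALLY WIRED" if wired else "NOT WIRED"


def gate_decision(task_results, pipeline_results):
    """Combined gate: tasks must be PASS/BLOCKED and pipelines WIRED.

    Assumption: every BLOCKED entry already carries explicit
    prerequisite evidence, which this sketch does not verify.
    """
    tasks_ok = all(r in ("PASS", "BLOCKED") for r in task_results.values())
    pipelines_ok = all(r == "WIRED" for r in pipeline_results.values())
    return "CONTINUE" if tasks_ok and pipelines_ok else "STOP"
```

Note that a run where every per-task result is `PASS` still stops if a single pipeline is `PARTIALLY WIRED`, which is the failure mode this check exists to surface.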
 Save the audit to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` with:

 - Per-task classification
 - Evidence files/symbols checked
 - Any unresolved scaffold/native placeholders
 - Any named promised technologies not integrated
+- **System Pipeline Audit table** (per pipeline: name, sequence, WIRED / PARTIALLY WIRED / NOT WIRED, evidence file, remediation suggestion)
 - Required remediation task suggestions, each sized to 5 points or less

 Gate:

- If every product task is `PASS` or `BLOCKED` with explicit prerequisite evidence, continue to Final Test Run.
- If any product task is `FAIL`, STOP. Do not write the final product implementation report and do not proceed to any downstream autodev step. Completed original task files remain in `done/`; the missing work is represented by remediation tasks. Present a Choose block:
-  - A) Create remediation tasks now and return to implementation
+- If every product task is `PASS` or `BLOCKED` with explicit prerequisite evidence, AND every enumerated pipeline is `WIRED`, continue to Final Test Run.
+- If any product task is `FAIL` OR any pipeline is `PARTIALLY WIRED` / `NOT WIRED`, STOP. Do not write the final product implementation report and do not proceed to any downstream autodev step. Completed original task files remain in `done/`; the missing work is represented by remediation tasks. Present a Choose block:
+  - A) Create remediation tasks now and return to implementation. (For pipeline FAILs the remediation task is a NEW integration task owned by the spine component per `_docs/02_document/module-layout.md`; it is NOT a test task and NOT a doc task; its deliverable is production code that drives the pipeline against real components.)
  - B) Mark the missing behavior explicitly out of scope in task/docs, then re-run this gate
  - C) Abort for manual correction
 - Recommendation must normally be A unless the user deliberately accepts reduced scope.
@@ -309,8 +359,9 @@ After each batch completes, save the batch report to `_docs/03_implementation/ba
 - **Test implementation** (tasks from test decomposition): `_docs/03_implementation/implementation_report_tests.md`
 - **Feature implementation**: `_docs/03_implementation/implementation_report_{feature_slug}_cycle{N}.md` where `{feature_slug}` is derived from the batch task names (e.g., `implementation_report_core_api_cycle2.md`) and `{N}` is the current `state.cycle` from `_docs/_autodev_state.md`. If `state.cycle` is absent (pre-migration), default to `cycle1`.
 - **Refactoring**: `_docs/03_implementation/implementation_report_refactor_{run_name}.md`
+- **Suite-level** (when `suite_level: true` was supplied — see "Suite-level invocation context" above): `_docs/03_implementation/suite_implementation_report_{run_name}.md`. Batch reports use `_docs/03_implementation/suite_batch_{NN}_report.md`. `{run_name}` is derived from the batch task IDs (e.g., `suite_implementation_report_az543_az549_az550.md`).

-Determine the context from the task files being implemented: if all tasks have test-related names or belong to a test epic, use the tests filename; otherwise derive the feature slug from the component names and append the cycle suffix.
+Determine the context from the task files being implemented: if all tasks have test-related names or belong to a test epic, use the tests filename; if `suite_level: true` was supplied, use the suite filename; otherwise derive the feature slug from the component names and append the cycle suffix.

 Batch report filenames must also include the cycle counter when running feature implementation: `_docs/03_implementation/batch_{NN}_cycle{N}_report.md` (test and refactor runs may use the plain `batch_{NN}_report.md` form since they are not cycle-scoped).

@@ -84,29 +84,66 @@ Assess the change along these dimensions:
 - **Novelty**: does it involve libraries, protocols, or patterns not already in the codebase?
 - **Risk**: could it break existing functionality or require architectural changes?

-Classification:
+### 2a. Complexity-Points Estimate
+
+Project policy (per the workspace user-rule on ADO points): aim for tasks at 2–3 points (rarely 5). Tasks at 8 points are high risk; tasks at 13 are too complex and MUST be broken down. The new-task skill enforces this here, before producing a single-file task spec.
+
+Map the Scope/Novelty/Risk profile to a points estimate using this table:
+
+| Profile | Points | Examples |
+|---------|--------|----------|
+| All three low | **1–2** | One-line config change; trivial CRUD field addition |
+| Two low + one medium | **3** | Localized refactor; add one well-understood endpoint |
+| One low + two medium, OR all medium | **5** | New small feature touching 2–3 components; integration with a known library |
+| Exactly one high (the other two low or medium) | **8** | Cross-cutting concern across 4+ components; integration with an unfamiliar protocol; significant architectural change |
+| Two or three high | **13** | New subsystem; unfamiliar tech across the stack; multiple unknown unknowns |
+
+If a relevant LESSONS.md entry biases the estimate (e.g., "auth-related changes historically take 2× estimate"), apply the multiplier and round up to the next discrete point on the scale (1, 2, 3, 5, 8, 13).
+
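The mapping table and the LESSONS.md multiplier can be sketched as follows. This is one reading of the table, assuming that a single high dimension yields 8 and that two or three yield 13; `estimate_points` is a hypothetical helper, the all-low case returns the upper bound of the 1–2 range, and results above 13 are capped at 13.

```python
SCALE = (1, 2, 3, 5, 8, 13)
LEVELS = ("low", "medium", "high")


def estimate_points(scope, novelty, risk, multiplier=1.0):
    """Map a Scope/Novelty/Risk profile to a point on the scale."""
    profile = (scope, novelty, risk)
    for level in profile:
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level!r}")
    highs = profile.count("high")
    mediums = profile.count("medium")
    if highs >= 2:
        base = 13
    elif highs == 1:
        base = 8
    elif mediums >= 2:
        base = 5
    elif mediums == 1:
        base = 3
    else:
        base = 2  # table says 1-2; upper bound chosen as an assumption
    scaled = base * multiplier
    # Round up to the next discrete point; cap at 13 (assumption).
    return next((p for p in SCALE if p >= scaled), SCALE[-1])
```

For example, a "two low + one medium" profile is 3 points, and a 2× historical multiplier lifts it to 6, which rounds up to 8 and therefore changes the routing decision in Step 2b.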
+### 2b. Routing by Complexity
+
+| Estimate | Default routing | Override path |
+|----------|-----------------|---------------|
+| **1–5** | Continue this skill at Step 3 (Research) or Step 4 (Codebase Analysis) — see classification below | — |
+| **8** | **STOP this skill and recommend handoff to `/decompose @<feature_description>`** (single-component decompose mode if the affected scope fits inside one component, default mode if it does not). The user may override and proceed in `/new-task`, but the override must be explicitly chosen. | C) Proceed in /new-task anyway with the user's acknowledgement that the resulting task is high-risk and may need to be re-decomposed mid-implementation |
+| **13** | **STOP this skill — auto-handoff is mandatory.** A 13-point feature cannot be a single task spec. Invoke `/decompose @<feature_description>` (default mode) before writing any task file. Surface the handoff to the user with no override path; this is a hard policy gate. | None — must decompose |
+
+For the auto-handoff path:
+
+1. Render a one-paragraph description of the feature suitable to feed `/decompose` (combine Step 1's verbatim user description with the complexity-points reasoning).
+2. Save it to `_docs/02_task_plans/<feature_slug>/feature-description.md` so the decompose skill has a stable input file.
+3. Either (a) directly auto-chain into `.cursor/skills/decompose/SKILL.md` in default mode with this file as input, or (b) report the handoff to the user along with the exact `/decompose` invocation and stop. Pick (a) only if the user has explicitly enabled auto-chain across skills (e.g., we are inside an `/autodev` invocation); otherwise pick (b).
+
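A small sketch of the routing table and the handoff choice, assuming the points estimate and the user's decisions are already known. `route` and its return strings are hypothetical, and applying the same handoff mechanics to a declined 8-point override as to the 13-point case is an assumption.

```python
SCALE = (1, 2, 3, 5, 8, 13)


def route(points, user_override=False, auto_chain_enabled=False):
    """Return the next action for a given complexity estimate."""
    if points not in SCALE:
        raise ValueError(f"not on the points scale: {points}")
    if points <= 5:
        return "continue in /new-task"
    if points == 8 and user_override:
        # Only an explicit user choice keeps an 8-point task here.
        return "continue in /new-task (high-risk, acknowledged)"
    # 8 without override, or 13 (which has no override path).
    if auto_chain_enabled:
        return "auto-chain into /decompose"
    return "report /decompose invocation and stop"
```

The `user_override` flag is ignored at 13 points on purpose, which mirrors the hard policy gate in the table.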
+### 2c. Research vs Skip Research (only for ≤5 estimates)
+
+Classification (independent of points; runs only when points ≤ 5 and Step 2b chose Continue):

 | Category | Criteria | Action |
 |----------|----------|--------|
-| **Needs research** | New libraries/frameworks, unfamiliar protocols, significant architectural change, multiple unknowns | Proceed to Step 3 (Research) |
+| **Needs research** | New libraries/frameworks, unfamiliar protocols, multiple unknowns | Proceed to Step 3 (Research) |
 | **Skip research** | Extends existing functionality, uses patterns already in codebase, straightforward new component with known tech | Skip to Step 4 (Codebase Analysis) |

-Present the assessment to the user:
+Present the full assessment to the user:

 ```
 ══════════════════════════════════════
 COMPLEXITY ASSESSMENT
 ══════════════════════════════════════
- Scope:   [low / medium / high]
- Novelty: [low / medium / high]
- Risk:    [low / medium / high]
+ Scope:    [low / medium / high]
+ Novelty:  [low / medium / high]
+ Risk:     [low / medium / high]
+ Points:   [1 / 2 / 3 / 5 / 8 / 13]   (project aim: 2–3, rarely 5)
+ Routing:  [Continue in /new-task | Hand off to /decompose]
 ══════════════════════════════════════
- Recommendation: [Research needed / Skip research]
- Reason: [one-line justification]
+ Recommendation: [Research needed | Skip research | Decompose required]
+ Reason: [one-line justification, including any LESSONS.md influence]
 ══════════════════════════════════════
 ```

-**BLOCKING**: Ask the user to confirm or override the recommendation before proceeding.
+**BLOCKING**:
+- If points ≤ 5 → ask the user to confirm or override the research recommendation before proceeding.
+- If points = 8 → ask the user to choose between handoff to /decompose (recommended) and continuing in /new-task with explicit risk acknowledgement.
+- If points = 13 → STOP and present the handoff plan; do not offer a continue-anyway override.

 ---

@@ -203,7 +240,13 @@ Apply the four shared-task triggers from `.cursor/skills/decompose/SKILL.md` Ste
  2. Add the layout edit to the task's deliverables; the implementer writes it alongside the code change.
  3. If `module-layout.md` does not exist, STOP and instruct the user to run `/document` first (existing-code flow) or `/decompose` default mode (greenfield). Do not guess.

-Record the classification and any contract/layout deliverables in the working notes; they feed Step 5 (Validate Assumptions) and Step 6 (Create Task).
+- **ADR cross-check** — runs unconditionally for every new-task in any of the three classifications above:
+  1. If `_docs/02_document/adr/` exists, scan every `Status: Accepted` ADR. For each, ask: "would the proposed task either contradict this ADR's `Decision` or materially affect its `Consequences`?"
+  2. **Conflict** (task contradicts an Accepted ADR) → STOP and Choose A/B/C: **A)** Re-scope the task to comply with the ADR, **B)** Propose superseding the ADR — the task spec then includes a deliverable to invoke `/plan --adr-only` (or the next `/plan` cycle's Step 4.5) with `Supersedes: ADR-NNN`, and the new task does NOT proceed until that supersede ADR is `Accepted`, **C)** Park the task in `backlog/` with a `Blocked-By: ADR-NNN review` note. Do not silently approve a contradictory task.
+  3. **Drift** (task changes assumptions an ADR depends on but does not directly contradict it) → record the affected ADR(s) under a new `### ADR Impact` section in the task spec with `> Affects ADR NNN_<slug>: <one-line summary>`. The implementer surfaces this at code-review Phase 7 (which then classifies it as ADR-Drift if not addressed).
+  4. **Aligned** (task implements something an Accepted ADR mandates) → cite the ADR(s) under `### ADR Compliance` in the task spec with `> Implements ADR NNN_<slug>`. Code-review Phase 7 then expects matching evidence in the implemented code.
+
+Record the classification, any contract/layout deliverables, and any ADR cross-check outcomes in the working notes; they feed Step 5 (Validate Assumptions) and Step 6 (Create Task).

 **BLOCKING**: none — this step surfaces findings; the user confirms them in Step 5.

@@ -263,6 +306,9 @@ Present using the Choose format for each decision that has meaningful alternativ
 - [ ] If Step 4.5 classified the task as producer, the `## Contract` section exists and points at a contract file
 - [ ] If Step 4.5 classified the task as consumer, `### Document Dependencies` lists the relevant contract file
 - [ ] If Step 4.5 flagged a layout delta, the task's Scope.Included names the `module-layout.md` edit
+- [ ] If Step 4.5 flagged an ADR conflict, the task is either re-scoped (A), explicitly blocked on a supersede ADR (B), or parked in backlog (C) — never silently bypassed
+- [ ] If Step 4.5 flagged ADR drift, the task spec has an `### ADR Impact` section listing the affected ADR(s)
+- [ ] If Step 4.5 flagged ADR alignment, the task spec has an `### ADR Compliance` section citing the implemented ADR(s)

 ---

@@ -15,7 +15,7 @@ disable-model-invocation: true

 # Solution Planning

-Decompose a problem and solution into architecture, data model, deployment plan, system flows, components, tests, and work item epics through a systematic 6-step workflow.
+Decompose a problem and solution into architecture, data model, deployment plan, system flows, components, ADRs, tests, and work item epics through a systematic workflow with seven step files (1, 2, 3, 4, 4.5, 5, 6) plus a Final quality checklist.

 ## Core Principles

@@ -55,7 +55,7 @@ Read `steps/01_artifact-management.md` for directory structure, save timing, sav

 ## Progress Tracking

-At the start of execution, create a TodoWrite with all steps (1 through 6 plus Final). Update status as each step completes.
+At the start of execution, create a TodoWrite with all steps (1, 2, 3, 4, 4.5, 5, 6 plus Final). Update status as each step completes. The fractional Step 4.5 (ADR Capture) sits between Architecture Review (Step 4) and Test Specifications (Step 5).

 ## Workflow

@@ -85,6 +85,16 @@ Read and follow `steps/04_review-risk.md`.

 ---

+### Step 4.5: Architecture Decision Records (ADRs)
+
+Read and follow `steps/04-5_adr-capture.md`.
+
+This step captures the architecture and tech-stack decisions that were made (or revised) in Steps 2–4 as durable, dated, immutable records under `_docs/02_document/adr/`. ADRs are the single thing in `_docs/` that explains the **why** of each major decision after the conversation history is gone. They are consumed by `decompose` (when bootstrapping module layout), `new-task` (when assessing a new feature against existing decisions), `refactor` (when proposing replacements), and any future code-review cycle that needs to confirm a structural choice was deliberate.
+
+This step is **BLOCKING**: the ADR set must be reviewed and confirmed by the user before Step 5 begins.
+
+---
+
 ### Step 5: Test Specifications

 Read and follow `steps/05_test-specifications.md`.
@@ -120,7 +130,7 @@ Read and follow `steps/07_quality-checklist.md`.
 |-----------|--------|
 | Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — planning cannot proceed |
 | Ambiguous requirements | ASK user |
-| Input data coverage below 75% | Search internet for supplementary data, ASK user to validate |
+| Input data coverage below the canonical threshold (`cursor-meta.mdc` Quality Thresholds) | Search internet for supplementary data, ASK user to validate |
 | Technology choice with multiple valid options | ASK user |
 | Component naming | PROCEED, confirm at next BLOCKING gate |
 | File structure within templates | PROCEED |
@@ -146,6 +156,8 @@ Read and follow `steps/07_quality-checklist.md`.
 │    [BLOCKING: user confirms components]                        │
 │ 4. Review & Risk       → risk register, iterations              │
 │    [BLOCKING: user confirms mitigations]                       │
+│ 4.5 ADR Capture        → _docs/02_document/adr/NNN_*.md         │
+│    [BLOCKING: user confirms ADR set]                           │
 │ 5. Test Specifications → per-component test specs               │
 │ 6. Work Item Epics     → epic per component + bootstrap         │
 │    ─────────────────────────────────────────────────           │
@@ -26,6 +26,10 @@ DOCUMENT_DIR/
 │   └── deployment_procedures.md
 ├── risk_mitigations.md
 ├── risk_mitigations_02.md          (iterative, ## as sequence)
+├── adr/
+│   ├── 001_[decision_slug].md
+│   ├── 002_[decision_slug].md
+│   └── ...
 ├── components/
 │   ├── 01_[name]/
 │   │   ├── description.md
@@ -66,6 +70,8 @@ DOCUMENT_DIR/
 | Step 3 | Common helpers generated | `common-helpers/[##]_helper_[name].md` |
 | Step 3 | Diagrams generated | `diagrams/` |
 | Step 4 | Risk assessment complete | `risk_mitigations.md` |
+| Step 4.5 | Each ADR captured | `adr/NNN_[decision_slug].md` |
+| Step 4.5 | ADR index updated | `adr/README.md` |
 | Step 5 | Tests written per component | `components/[##]_[name]/tests.md` |
 | Step 6 | Epics created in work item tracker | Tracker via MCP |
 | Final | All steps complete | `FINAL_report.md` |
@@ -85,3 +91,15 @@ If DOCUMENT_DIR already contains artifacts:
 2. Identify the last completed step based on which artifacts exist
 3. Resume from the next incomplete step
 4. Inform the user which steps are being skipped
+
+#### Step 4.5 (ADR Capture) resumption rule
+
+ADR files have a `Status` field that disambiguates "step in progress" from "step done":
+
+- `Status: Proposed` → Step 4.5 is **in progress**. The user has not yet hit the BLOCKING gate (or hit it and chose B/C/D, which kept files at `Proposed`). Resume Step 4.5 at Phase 4.5f and re-present the BLOCKING Choose to the user. Do NOT skip to Step 5.
+- `Status: Accepted` AND `adr/README.md` index exists AND every Accepted ADR is referenced in the index → Step 4.5 is **done**. Skip to Step 5.
+- `Status: Accepted` but `adr/README.md` is missing or out of date → Step 4.5 is **partially complete**. Resume at Phase 4.5d (Maintain the ADR Index) before moving on.
+- Mixed `Proposed` + `Accepted` files in the same directory → Step 4.5 is **in progress** with prior partial confirmations. Resume at Phase 4.5f and re-present only the still-`Proposed` ADRs.
+- Empty `adr/` directory or no `adr/` directory → Step 4.5 has not started yet. Begin at Phase 4.5a.
+
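The resumption rule above can be sketched as a pure function, assuming the `Status` of each ADR file and the set of ADRs referenced by `adr/README.md` have already been read. `resume_point` and its argument shapes are hypothetical; statuses other than `Proposed` and `Accepted` are rejected because the rule does not define them.

```python
def resume_point(statuses, index_refs=None):
    """Decide where to resume Step 4.5.

    statuses:   ADR filename -> "Proposed" or "Accepted".
    index_refs: filenames referenced in adr/README.md,
                or None when the index file is missing.
    """
    for name, status in statuses.items():
        if status not in ("Proposed", "Accepted"):
            raise ValueError(f"{name}: unhandled status {status!r}")
    if not statuses:
        return "4.5a"  # empty or absent adr/ directory: not started
    if "Proposed" in statuses.values():
        return "4.5f"  # in progress, including the mixed case
    # All ADRs are Accepted from here on.
    if index_refs is None or not set(statuses) <= set(index_refs):
        return "4.5d"  # index missing or out of date
    return "Step 5"
```

Checking for `Proposed` before looking at the index matches the rule's ordering: a mixed directory resumes at the BLOCKING gate even when the index happens to be current.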
+The `Date` field on every Accepted ADR is the date the user confirmed it; do not regenerate it during resumption.
@@ -0,0 +1,187 @@
+# Step 4.5: Architecture Decision Records (ADRs)
+
+**Role**: Architect / technical writer
+**Goal**: Capture every major architecture, tech-stack, data-model, and integration decision made during Steps 2–4 as a durable, dated, immutable record under `_docs/02_document/adr/`.
+**Constraints**: ADRs only — do not re-open architecture; do not make new decisions in this step. Document what has been decided, not what is still open.
+
+ADRs are the single thing in `_docs/` that explains the **why** of each major decision after the conversation history is gone. They are consumed by:
+
+- `decompose` Step 1.5 (`steps/01-5_module-layout.md`) — every Accepted ADR is cross-checked against the module-layout proposal; conflicts trigger an explicit Choose between supersede / exception / re-open.
+- `new-task` Step 4.5 (`SKILL.md` § "Step 4.5: Contract & Layout Check") — every new task is classified against Accepted ADRs as Conflict / Drift / Aligned; conflicts STOP the task with a Choose A/B/C; drift adds an `### ADR Impact` section; alignment adds an `### ADR Compliance` section.
+- `refactor` Phase 2b.1 (`phases/02-analysis.md`) — every Accepted ADR is diffed against the proposed roadmap; Violations trigger a BLOCKING supersede gate that produces a `supersede_adr_NNN.md` task before any refactor task is created.
+- `code-review` Phase 7 (`SKILL.md` § "Phase 7: Architecture Compliance") — every changed-files batch is checked against Accepted ADRs; ADR-Violation findings are Critical, ADR-Drift findings are High.
+
+Discipline that still relies on the human: when a downstream skill detects a Drift case, the resulting task spec MUST include its `### ADR Impact` section (or `### ADR Compliance` for an Aligned case); the implementer must address it; the next code-review batch then has the context it needs. Drift left undocumented is the silent-failure path; every consumer hook above is designed to make it visible.
+
+## Inputs
+
+- `_docs/02_document/architecture.md` (incl. confirmed `## Architecture Vision`)
+- `_docs/02_document/glossary.md`
+- `_docs/02_document/data_model.md`
+- `_docs/02_document/system-flows.md`
+- `_docs/02_document/risk_mitigations.md` (and any `risk_mitigations_NN.md` iterations from Step 4)
+- `_docs/02_document/components/[##]_[name]/description.md`
+- `_docs/02_document/deployment/` (CI/CD, environments, observability)
+- `_docs/00_problem/restrictions.md` and `_docs/00_problem/acceptance_criteria.md` (each ADR must reference relevant constraints / AC by ID)
+- Optional: `_docs/01_solution/solution.md` and `_docs/01_solution/tech_stack.md` (research output)
+- Optional: `_docs/LESSONS.md` — surface any lessons in the `architecture` / `dependencies` categories that bias the recommendation
+
+## What is an ADR (and what is not)
+
+Capture an ADR when **all** of the following hold:
+
+1. The decision picks between two or more genuinely valid approaches with meaningful trade-offs.
+2. The decision has **downstream consequences** that other decisions, code, or tasks inherit from.
+3. The decision is **non-obvious** to a future reader who only sees the final code — they would ask "why was it built this way?" rather than discovering the answer by reading the source.
+
+Do NOT create an ADR for:
+
+- Naming, formatting, or purely cosmetic choices.
+- A choice that is fully implied by a single explicit restriction (`restrictions.md` is itself the record — link to it from the architecture doc instead).
+- A choice the team has not actually made yet — open questions live in `risk_mitigations.md` or `_docs/_process_leftovers/`, not in ADRs.
+- A technology selection where research already produced an exact-fit selection with one viable option (the research doc is the record — link to the relevant `solution_draft*.md` section).
+
+## Process
+
+### Phase 4.5a: Decision Inventory
+
+Walk the inputs and list candidate decisions. For each candidate, record a one-liner:
+
+```
+- [decision] — [trade-off summary] — [downstream consumers] — [evidence file:section]
+```
+
+Inspect at minimum:
+
+| Inspection target | Typical decisions surfaced |
+|-------------------|----------------------------|
+| `architecture.md` § layering | Layering style (clean vs hex vs n-tier), which layer owns transactions, how cross-cutting concerns enter |
+| `architecture.md` § Architecture Vision | The North Star principle (e.g., "edge-first, sync-second"); ADR captures the implication for one specific subsystem |
+| `data_model.md` | Datastore choice (Postgres vs Mongo), partitioning, soft vs hard deletes, schema evolution strategy |
+| `system-flows.md` | Sync vs async boundaries, idempotency strategy, retry policy ownership, error envelope shape |
+| `components/*/description.md` § interfaces | Public-API style (REST vs RPC vs event), versioning strategy, auth/authorization placement |
+| `deployment/containerization.md` | Single container vs sidecar vs init container, base image lineage |
+| `deployment/ci_cd_pipeline.md` | Trunk-based vs feature-branch, gate ordering, deploy strategy (blue-green / canary / all-at-once) |
+| `deployment/observability.md` | Logging stack, metric backend, sampling rate decisions, retention |
+| `risk_mitigations.md` | Risk-acceptance trade-offs (e.g., "we accept N% data loss in exchange for sub-100ms p99") |
+| Tech-stack from `_docs/01_solution/tech_stack.md` | Anything where research recorded ≥2 candidates and a winner |
+
+Drop any candidate that fails the three "what is an ADR" criteria above. Keep the rest.
+
+### Phase 4.5b: Numbering and Slugs
+
+ADRs are numbered globally per project, monotonically, never re-used.
+
+1. List existing files under `_docs/02_document/adr/` matching `^[0-9]{3}_.+\.md$`.
+2. The next ADR number is `max(existing) + 1`, zero-padded to 3 digits.
+3. The slug is kebab-case, ≤6 words, derived from the decision summary. Examples: `001_use-postgres-for-transactional-data.md`, `004_event-driven-cross-component-comms.md`.
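The numbering and slug rules above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names are hypothetical, and truncating long summaries to the first six words is an assumption (the rule only states the ≤6-word limit).

```python
import re

# Filename pattern taken from step 1 of Phase 4.5b.
ADR_FILE = re.compile(r"^([0-9]{3})_.+\.md$")


def next_adr_number(filenames):
    """Return max(existing) + 1, zero-padded to 3 digits ("001" if none exist)."""
    used = [int(m.group(1)) for name in filenames if (m := ADR_FILE.match(name))]
    return f"{max(used, default=0) + 1:03d}"


def slugify(summary, max_words=6):
    """Kebab-case slug; keeping only the first max_words words is an assumption."""
    words = re.findall(r"[a-z0-9]+", summary.lower())
    return "-".join(words[:max_words])


def adr_filename(filenames, summary):
    """Combine the next number and the slug into NNN_<slug>.md."""
    return f"{next_adr_number(filenames)}_{slugify(summary)}.md"
```

Note that this sketch only sees files currently on disk. If the highest-numbered ADR file were deleted, `max(existing) + 1` would hand the same number out again, so honoring the "never re-used" rule in that case would need a separate record of issued numbers.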
+
+### Phase 4.5c: Render One ADR Per Decision
+
+For each kept candidate, render the ADR using `templates/adr.md`. Required sections (do NOT omit any):
+
+| Section | Content |
+|---------|---------|
+| **Number** | `NNN` |
+| **Title** | One-line decision statement (matches slug) |
+| **Status** | `Proposed` (only during Step 4.5 iteration) → `Accepted` (after user confirmation at the BLOCKING gate) |
+| **Date** | YYYY-MM-DD (the date the user confirmed) |
+| **Deciders** | The user (project owner) — the AI is not a decider |
+| **Context** | The problem this decision addresses, including links to AC IDs, restriction IDs, risks, and (where relevant) the research draft section |
+| **Decision** | The chosen approach in one sentence, then the supporting detail |
+| **Alternatives Considered** | Each alternative with a one-line "rejected because…" |
+| **Consequences** | Positive (what becomes easier / cheaper / faster) and negative (what becomes harder / locked in / costly to undo). Be honest — every decision has a downside. |
+| **Supersedes / Superseded by** | Empty initially; updated when a future ADR overturns this one |
+| **Evidence** | File-and-section pointers into `_docs/` showing where the decision is reflected (architecture.md § layering, components/02_*/description.md § interface, etc.) |
+
+After rendering, write each file to `_docs/02_document/adr/NNN_<slug>.md`. Keep `Status: Proposed` until the BLOCKING gate.
+
+### Phase 4.5d: Maintain the ADR Index
+
+Write or update `_docs/02_document/adr/README.md` with this exact shape:
+
+```markdown
+# Architecture Decision Records
+
+This index lists every ADR for this project, in number order. ADRs are immutable once `Accepted` —
+new decisions that overturn a prior ADR are recorded as new ADRs whose `Supersedes` field points
+back, and the original ADR's `Superseded by` field is updated.
+
+| # | Title | Status | Date | Supersedes |
+|---|-------|--------|------|------------|
+| 001 | Use Postgres for transactional data | Accepted | 2026-05-21 | — |
+| 002 | Event-driven cross-component comms | Accepted | 2026-05-21 | — |
+| ... | ... | ... | ... | ... |
+```
+
+Sort by `#` ascending. Include all ADRs ever written, even superseded ones — the audit trail is the point.
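As an illustration of the index shape, the table rows could be generated from a list of ADR records like this. The `AdrEntry` class and `render_index_rows` function are hypothetical names introduced for this sketch; only the column layout and the ascending sort come from the text above.

```python
from dataclasses import dataclass


@dataclass
class AdrEntry:
    """One row of the index (hypothetical record type for this sketch)."""
    number: int
    title: str
    status: str
    date: str
    supersedes: str = "—"  # same placeholder the example index uses


def render_index_rows(entries):
    """Render the index table, sorted by ADR number ascending."""
    header = [
        "| # | Title | Status | Date | Supersedes |",
        "|---|-------|--------|------|------------|",
    ]
    rows = [
        f"| {e.number:03d} | {e.title} | {e.status} | {e.date} | {e.supersedes} |"
        for e in sorted(entries, key=lambda e: e.number)
    ]
    return "\n".join(header + rows)
```

Superseded ADRs stay in the input list and are rendered like any other row, which keeps the audit trail complete.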
+
+### Phase 4.5e: Cross-Link from architecture.md
+
+In `architecture.md`, every section that reflects an ADR decision gets a one-line trailing reference:
+
+```markdown
+> See ADR 001 (Use Postgres for transactional data), ADR 002 (Event-driven cross-component comms).
+```
+
+Place the reference at the end of the section, after the prose. This lets a future reader of `architecture.md` jump straight to the rationale.
+
+### Phase 4.5f: BLOCKING Gate — User Confirmation
+
+Present the ADR set to the user using the Choose format from `.cursor/skills/autodev/protocols.md` (or plain text if AskQuestion is unavailable):
+
+```
+══════════════════════════════════════
+ DECISION REQUIRED: ADR set captured (N records)
+══════════════════════════════════════
+ 001 — [title]
+ 002 — [title]
+ ...
+══════════════════════════════════════
+ A) Accept all ADRs as written
+ B) Edit specific ADRs (numbers and edits)
+ C) Add a missed decision (description)
+ D) Remove an ADR (number and reason)
+══════════════════════════════════════
+ Recommendation: A — review the rendered set and confirm; corrections are quick on Round 2
+══════════════════════════════════════
+```
+
+Loop:
+
+- **A** → flip every ADR's `Status` from `Proposed` to `Accepted`, set `Date` to today's date, save, exit step.
+- **B** → apply edits, re-present the modified ADRs, loop.
+- **C** → run Phase 4.5a–4.5e for the missed decision only, append to the set, re-present, loop.
+- **D** → confirm with the user that the candidate fails the three "what is an ADR" criteria, remove the file, update the index, loop.
+
+Do NOT mark `Accepted` without an explicit user A.
+
+## Self-verification
+
+- [ ] Every kept candidate from Phase 4.5a has a corresponding file under `adr/`
+- [ ] Every ADR has all required sections (none empty except `Supersedes` / `Superseded by`)
+- [ ] `Decision` sections are one-sentence-then-detail, not "we'll figure it out"
+- [ ] `Alternatives Considered` lists at least one rejected alternative per ADR
+- [ ] `Consequences` lists both positive AND negative consequences (an ADR with no negatives is suspect)
+- [ ] `Evidence` points at real `_docs/` sections that exist on disk
+- [ ] `adr/README.md` index lists every file in the directory and matches their `Status` / `Date`
+- [ ] `architecture.md` has a trailing `See ADR …` reference at every section that an ADR reflects
+- [ ] The user confirmed the set via Choose A; every ADR is `Accepted` with today's date
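Part of this checklist can be checked mechanically. The sketch below assumes ADRs follow the heading names of `templates/adr.md` and reports headings that are missing or have no body; the function names and the choice of which headings to require are assumptions for illustration, and the remaining checklist items still need human review.

```python
import re

# Leaf headings assumed from templates/adr.md; "## Consequences" is covered
# through its Positive / Negative subsections.
REQUIRED = [
    "## Context",
    "## Decision",
    "## Alternatives Considered",
    "### Positive",
    "### Negative",
    "## Evidence",
]


def section_bodies(adr_text):
    """Map each markdown heading to the non-blank lines that follow it."""
    bodies, current = {}, None
    for line in adr_text.splitlines():
        if re.match(r"^#{1,6} ", line):
            current = line.strip()
            bodies[current] = []
        elif current is not None and line.strip():
            bodies[current].append(line.strip())
    return bodies


def check_adr(adr_text):
    """Return a list of problems: required headings that are missing or empty."""
    bodies = section_bodies(adr_text)
    problems = []
    for heading in REQUIRED:
        if heading not in bodies:
            problems.append(f"missing: {heading}")
        elif not bodies[heading]:
            problems.append(f"empty: {heading}")
    return problems
```

An empty `### Negative` section is reported explicitly, which matches the checklist's point that an ADR with no negatives is suspect.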
+
+## Common mistakes
+
+- **Re-opening architecture**: Step 4.5 records; it does not decide. If a candidate decision turns out to be unsettled, that is a gap in the originating step (2 / 3 / 4); return there rather than papering over it with a non-committal ADR.
+- **Decision-of-the-week**: do not write an ADR for every minor pattern choice. The bar is "non-obvious to a future reader". 5–15 ADRs is typical for a planning round; 40+ is over-capture.
+- **Negative consequences left empty**: every real decision has costs. If you cannot name one, the decision was not actually weighed.
+- **Vague evidence**: `architecture.md` is not enough — point at the specific section. `architecture.md § Layering` ≠ `architecture.md`.
+- **Numbering reuse**: never recycle a number from a deleted ADR. The audit trail is more important than tidy numbering.
+- **Superseding without recording**: when a later cycle overturns an ADR, the new ADR must point at the old one via `Supersedes`, AND the old ADR's `Superseded by` field must be updated. The index must reflect both. (This is enforced when `decompose` or `refactor` later updates ADRs.)
+
+## Escalation
+
+| Situation | Action |
+|-----------|--------|
+| Candidate decision is unsettled (the team has not actually decided) | Return to the originating step (2 / 3 / 4); do NOT write a placeholder ADR |
+| Two candidates in Phase 4.5a turn out to be the same decision phrased differently | Merge into one ADR, list both phrasings in `Context` |
+| User picks D (remove an ADR) and the AI judges the decision is genuinely worth recording | Surface the disagreement, ASK why the user wants it removed, defer to user |
+| Existing `adr/` directory has files but `adr/README.md` is missing or stale | Rebuild the index from the directory before adding new ADRs |
@@ -2,7 +2,7 @@

 **Role**: Professional Quality Assurance Engineer

-**Goal**: Write test specs for each component achieving minimum 75% acceptance criteria coverage
+**Goal**: Write test specs for each component achieving the canonical minimum acceptance-criteria coverage (currently 75% — see `.cursor/rules/cursor-meta.mdc` Quality Thresholds; do not restate a different number here)

 **Constraints**: Test specs only — no test code. Each test must trace to an acceptance criterion.

@@ -0,0 +1,67 @@
+# ADR-{NNN}: {decision-title}
+
+- **Status**: {Proposed | Accepted | Deprecated | Superseded}
+- **Date**: {YYYY-MM-DD}
+- **Deciders**: {user / project owner}
+- **Supersedes**: {ADR-NNN | —}
+- **Superseded by**: {ADR-NNN | —}
+
+## Context
+
+What problem does this decision address? Cite the relevant constraint(s), acceptance criterion / criteria, and risk(s) by ID.
+
+- Acceptance criteria addressed: AC-{ID-1}, AC-{ID-2}
+- Restrictions addressed: R-{ID-1}, R-{ID-2}
+- Risks addressed: RISK-{ID-1}
+- Research source (if any): `_docs/01_solution/solution_draftN.md` § {section}
+
+A short paragraph (3–6 sentences) explaining why a choice is required now and what makes it non-trivial. Do not pre-announce the decision here — that goes in `Decision`. Focus on the forces at play (load, scale, team familiarity, hardware constraints, regulatory drivers, third-party limits).
+
+## Decision
+
+One declarative sentence: **"We will …"** Then 1–3 paragraphs of supporting detail explaining how the decision will be implemented at the boundaries between components.
+
+Be specific. "We will use Postgres" is too thin; "We will use Postgres 16 with logical replication for read scaling, restricting JSONB columns to top-level metadata only, with all transactional data in normalized tables" is the right resolution.
+
+## Alternatives Considered
+
+| Alternative | Rejected because |
+|-------------|------------------|
+| {Alt 1 — short label} | {one line: the cost / mismatch / risk that ruled it out, ideally referencing a measurable criterion} |
+| {Alt 2 — short label} | {one line} |
+| {Alt 3 — short label} | {one line} |
+
+At least one rejected alternative is mandatory. If only one option was ever considered, this is not an ADR — link to the source restriction or research selection from the parent doc instead.
+
+## Consequences
+
+### Positive
+
+- {What becomes easier / cheaper / faster, with concrete examples where possible}
+- {…}
+
+### Negative
+
+- {What becomes harder / locked in / costly to undo}
+- {…}
+
+Every real decision has both. If the negatives section is hard to fill, the alternatives were probably not weighed seriously — return to the prior step.
+
+### Neutral / Open
+
+- {What is unchanged but worth flagging for future readers (e.g., "this does not change the auth boundary; auth remains in component 02_user_management as decided in ADR-003")}
+
+## Evidence
+
+Where this decision is reflected on disk. Use `file:section` links so future readers can jump.
+
+- `_docs/02_document/architecture.md` § {section}
+- `_docs/02_document/data_model.md` § {section}
+- `_docs/02_document/components/{##_name}/description.md` § {section}
+- `_docs/02_document/system-flows.md` § {flow name}
+- `_docs/02_document/deployment/{file}.md` § {section}
+- {add more as needed}
+
+## Notes
+
+Optional. Use for caveats that did not fit above, links to external research, or follow-ups that the team agreed to revisit on a known trigger ("re-evaluate after 6 months in production" / "re-evaluate when load exceeds 10× baseline").
@@ -1,6 +1,6 @@
 # Final Planning Report Template

-Use this template after completing all 6 steps and the quality checklist. Save as `_docs/02_document/FINAL_report.md`.
+Use this template after completing all steps (1, 2, 3, 4, 4.5, 5, 6) and the quality checklist. Save as `_docs/02_document/FINAL_report.md`.

 ---

@@ -39,6 +39,44 @@ Write `RUN_DIR/analysis/research_findings.md`:
 4. Prioritize changes by impact and effort
 5. Reject or escalate any proposed refactor that improves code structure while weakening required behavior, integration contracts, runtime constraints, safety/security posture, or acceptance criteria

+### 2b.1. ADR Superseding Gate (BLOCKING)
+
+A refactor that improves code structure while overturning a documented architecture decision is the kind of silent drift that has repeatedly burned the project (see `meta-rule.mdc` § GPS-passthrough postmortem and the auto-lessons it produced). This gate makes drift visible and forces a deliberate ADR update.
+
+1. **List candidate ADRs**: read every `Status: Accepted` file in `_docs/02_document/adr/`. If the directory does not exist or contains only the index, log `No ADRs in scope` to `RUN_DIR/analysis/adr_impact.md` and skip the rest of this gate.
+2. **Diff each candidate against the proposed refactor roadmap**: for each ADR, ask the same two questions as code-review Phase 7:
+   - **Violation**: does any roadmap item do the *opposite* of the ADR's `Decision`?
+   - **Drift**: does any roadmap item materially affect the ADR's `Consequences` (positive or negative) without contradicting the Decision outright?
+3. **Classify each impacted ADR** in `RUN_DIR/analysis/adr_impact.md`:
+
+   | ADR | Roadmap item | Impact | Required action |
+   |-----|--------------|--------|-----------------|
+   | NNN | `roadmap-item-NN` | Violation / Drift / Aligned | (filled by Choose A/B/C below) |
+
+4. **For every Violation row, present a BLOCKING Choose**:
+
+   ```
+   ══════════════════════════════════════
+    DECISION REQUIRED: Refactor would violate ADR-NNN (<title>)
+   ══════════════════════════════════════
+    A) Update the ADR via supersede: the refactor produces a NEW ADR
+       (`Supersedes: NNN`) capturing the new Decision, and ADR-NNN's
+       `Superseded by` field is updated. The supersede ADR is itself a
+       deliverable of this refactor run (added to RUN_DIR/analysis/adr_impact.md
+       and to TASKS_DIR as a task) and must be `Accepted` before Phase 4.
+    B) Reduce the refactor scope to NOT violate ADR-NNN
+    C) Re-evaluate ADR-NNN: keep the refactor but only after ADR-NNN is
+       formally re-opened in a new /plan Step 4.5 round
+   ══════════════════════════════════════
+    Recommendation: A — supersede is the only path that keeps the audit
+    trail intact while letting the refactor land
+   ══════════════════════════════════════
+   ```
+
+5. **For every Drift row**: do not block, but the roadmap item must include a `## ADR Impact` section in its task spec citing the affected ADR(s). The implementer surfaces this at code-review Phase 7, which would otherwise classify the change as ADR-Drift (High) without context.
+6. **For every Aligned row**: cite the ADR in the roadmap item's task spec under `## ADR Compliance`. No further action.
+7. **Self-supersede deliverable**: any Choose A path adds a `[##]_supersede_adr_NNN.md` task file to the refactor run's TASKS_DIR with the new ADR text drafted (using `.cursor/skills/plan/templates/adr.md`). The task's only Acceptance Criterion is "ADR file exists at `_docs/02_document/adr/<next>_<slug>.md` with `Status: Accepted`, ADR-NNN's `Superseded by` field updated, and `_docs/02_document/adr/README.md` index reflects both."
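The gate logic in steps 3 to 6 can be sketched as a small data model. Everything named here (`ImpactRow`, `unresolved_violations`, `required_task_section`, `task_creation_blocked`) is hypothetical; the sketch only encodes the rules stated above: Violation rows need a chosen path A, B, or C, Drift rows need a `## ADR Impact` section, and Aligned rows need `## ADR Compliance`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImpactRow:
    """One row of adr_impact.md (hypothetical in-memory form)."""
    adr: str
    roadmap_item: str
    impact: str                   # "Violation" | "Drift" | "Aligned"
    choice: Optional[str] = None  # "A" | "B" | "C", only for Violation rows


def unresolved_violations(rows):
    """Violation rows that have no chosen path yet."""
    return [r for r in rows if r.impact == "Violation" and r.choice not in {"A", "B", "C"}]


def required_task_section(row):
    """Section the roadmap item's task spec must carry (None for Violations)."""
    return {"Drift": "## ADR Impact", "Aligned": "## ADR Compliance"}.get(row.impact)


def task_creation_blocked(rows):
    """True while any Violation row is unresolved."""
    return bool(unresolved_violations(rows))
```

A caller would evaluate `task_creation_blocked` before creating any refactor task and present the Choose for each row returned by `unresolved_violations`.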
+
 Present optional hardening tracks for user to include in the roadmap:

 ```
@@ -67,6 +105,8 @@ Write `RUN_DIR/analysis/refactoring_roadmap.md`:

 **BLOCKING applicability gate**: Before 2c and 2d, every recommendation in the roadmap must be `Selected`. Items marked `Rejected` are excluded. Items marked `Experimental only` or `Needs user decision` require a user decision before task creation.

+**BLOCKING ADR-supersede gate**: Before 2c and 2d, every Violation row in `RUN_DIR/analysis/adr_impact.md` (from 2b.1) must be resolved via Choose A, B, or C. A Violation row with no chosen path blocks task creation.
+
 ## 2c. Create Epic

 Create a work item tracker epic for this refactoring run:
@@ -111,6 +151,10 @@ Convert the finalized `RUN_DIR/list-of-changes.md` into implementable task files
 - [ ] Task dependencies are consistent (no circular dependencies)
 - [ ] `_dependencies_table.md` includes all refactoring tasks
 - [ ] Every task has a work item ticket (or PENDING placeholder)
+- [ ] If `_docs/02_document/adr/` exists with Accepted ADRs, `RUN_DIR/analysis/adr_impact.md` has been written and every Violation row is resolved (A/B/C) — no implicit overrides
+- [ ] For every Violation resolved via Choose A, a `[##]_supersede_adr_NNN.md` task exists in TASKS_DIR with the drafted supersede ADR
+- [ ] For every Drift row, the corresponding roadmap-item task spec has a `## ADR Impact` section
+- [ ] For every Aligned row, the corresponding roadmap-item task spec has a `## ADR Compliance` section

 **Save action**: Write analysis artifacts to RUN_DIR, task files to TASKS_DIR

@@ -15,9 +15,9 @@ Before designing or implementing any new tests, check what already exists:
 1. Scan the project for existing test files (unit tests, integration tests, blackbox tests)
 2. Run the existing test suite — record pass/fail counts
 3. Measure current coverage against the areas being refactored (from `RUN_DIR/list-of-changes.md` file paths)
-4. Assess coverage against thresholds:
+4. Assess coverage against thresholds (canonical: see `.cursor/rules/cursor-meta.mdc` Quality Thresholds — never hardcode a different number):
   - Minimum overall coverage: 75%
-   - Critical path coverage: 90%
+   - Critical path coverage: **90% floor / 100% aim** — 90% is the enforcement floor (blocks Phase 4 if not met); 100% is the aspirational target. Refactors are NOT permitted to drop below 90% on the critical paths covered by the in-scope changes.
   - All public APIs must have blackbox tests
   - All error handling paths must be tested
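A minimal sketch of the numeric part of this assessment is shown below. The constants repeat the values quoted above purely for illustration; per the note on canonical thresholds, a real check would read them from `.cursor/rules/cursor-meta.mdc` instead of hardcoding them. The function name and the result keys are assumptions.

```python
# Illustrative copies of the canonical thresholds (percentages).
OVERALL_MIN = 75.0
CRITICAL_FLOOR = 90.0
CRITICAL_AIM = 100.0


def assess_coverage(overall, critical):
    """Compare measured coverage percentages against the thresholds."""
    return {
        "overall_ok": overall >= OVERALL_MIN,
        # Falling below the floor is what blocks Phase 4.
        "critical_floor_ok": critical >= CRITICAL_FLOOR,
        # The aim is aspirational and does not block on its own.
        "critical_aim_met": critical >= CRITICAL_AIM,
    }
```

For example, 80% overall with 95% critical-path coverage passes both enforcement checks while still falling short of the 100% aim.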

@@ -47,7 +47,7 @@ For each uncovered critical area, write test specs to `RUN_DIR/test_specs/[##]_[
 4. Document any discovered issues

 **Self-verification**:
-- [ ] Coverage requirements met (75% overall, 90% critical paths) across existing + new tests
+- [ ] Coverage requirements met (75% overall, 90% critical-path floor — 100% aim — per canonical `cursor-meta.mdc` Quality Thresholds) across existing + new tests
 - [ ] All tests pass on current codebase
 - [ ] All public APIs in refactoring scope have blackbox tests
 - [ ] Test data fixtures are configured
@@ -45,7 +45,7 @@ Write `RUN_DIR/test_sync/new_tests.md`:
 - [ ] All obsolete tests removed or merged
 - [ ] All pre-existing tests pass after updates
 - [ ] New code from Phase 4 has test coverage
-- [ ] Overall coverage meets or exceeds Phase 3 baseline (75% overall, 90% critical paths)
+- [ ] Overall coverage meets or exceeds Phase 3 baseline (75% overall, 90% critical-path floor / 100% aim — per `.cursor/rules/cursor-meta.mdc` Quality Thresholds)
 - [ ] No tests reference removed or renamed code

 **Save action**: Write test_sync artifacts; implemented tests go into the project's test folder
@@ -0,0 +1,290 @@
+---
+name: release
+description: |
+  Executes the deployment plan produced by /deploy against a target environment.
+  Closes the loop between "we have a plan" and "the new version is running in production with a verdict on disk."
+  6-phase workflow: pre-release gate, strategy select, execute, smoke test, watch window, commit-or-rollback.
+  Outputs _docs/04_release/release_<version>_<env>_<YYYY-MM-DD-HHmm>.md with a definitive Released / Rolled-Back / Aborted verdict.
+  Trigger phrases:
+  - "release", "ship", "go live", "release this version"
+  - "deploy to prod", "promote to staging", "roll out"
+  - "rollback", "abort the release"
+category: ship
+tags: [release, deployment, rollback, smoke-test, observability, production]
+disable-model-invocation: true
+---
+
+# Release Execution
+
+The `/deploy` skill produces a plan and scripts. The `/release` skill **runs** them, verifies the live system, watches it for a defined window, and produces a definitive verdict on disk.
+
+## Core Principles
+
+- **Real execution, not simulation**: every phase must actually run against the target environment. If a phase cannot be executed (missing scripts, no SSH access, disabled secrets, registry auth failure), STOP — do not pretend a step succeeded. See `meta-rule.mdc` § "Real Results, Not Simulated Ones".
+- **Verifiable rollback path**: the release does not start until rollback is proven viable for this version. "We can roll back" without evidence is not a rollback path.
+- **Quiet failure is a release failure**: a deploy script that exits 0 but emits no observable signal in the watch window is treated as a regression, not a success.
+- **One release per invocation**: a single `/release` execution targets exactly one version against exactly one environment. Multi-stage promotion (staging → prod) is two invocations, not one.
+- **Never skip the watch window**: even successful deploys can degrade after 5–60 minutes (cache warm-up, scheduled jobs, downstream backpressure). The watch window is mandatory.
+- **Autonomous rollback on hard regressions**: critical health-check failure, error-rate spike above threshold, or smoke-test failure → automatic rollback. Soft regressions (latency drift, capacity warnings) escalate to the user.
+
+## Context Resolution
+
+Fixed paths:
+
+- DEPLOY_DIR: `_docs/04_deploy/`
+- RELEASE_DIR: `_docs/04_release/`
+- SCRIPTS_DIR: `scripts/`
+- DEPLOY_SCRIPT: `scripts/deploy.sh`
+- HEALTH_SCRIPT: `scripts/health-check.sh`
+- ENV_TEMPLATE: `.env.example`
+- OBSERVABILITY_DOC: `_docs/04_deploy/observability.md`
+- ENVIRONMENT_DOC: `_docs/04_deploy/environment_strategy.md`
+- PROCEDURES_DOC: `_docs/04_deploy/deployment_procedures.md`
+- ARCHITECTURE: `_docs/02_document/architecture.md`
+- RESTRICTIONS: `_docs/00_problem/restrictions.md`
+
+Announce the resolved paths and the **target environment + version + strategy** to the user before any phase that touches the live system.
+
+## Inputs (BLOCKING prerequisites)
+
+| Input | Required | Source |
+|-------|----------|--------|
+| Target environment | Yes — ASK user | `environment_strategy.md` enumerates valid options |
+| Target version / image tag | Yes — ASK user | Must exist in the registry; verified in Phase 1 |
+| Rollback target version | Yes — ASK user | Defaults to currently-deployed version if discoverable |
+| `scripts/deploy.sh` | Yes | Produced by `/deploy` Step 7. STOP if missing → run `/deploy` first |
+| `scripts/health-check.sh` | Yes | Same |
+| `_docs/04_deploy/deployment_procedures.md` | Yes | Defines per-environment runbook, manual approval rules, change-window restrictions |
+| `_docs/04_deploy/observability.md` | Yes | Defines watch metrics, thresholds, and dashboards |
+| `_docs/04_deploy/environment_strategy.md` | Yes | Defines target hostnames, registries, secrets, deploy strategy per env |
+
+## Outputs
+
+```
+RELEASE_DIR/
+├── release_<version>_<env>_<YYYY-MM-DD-HHmm>.md       (mandatory; one per invocation)
+├── rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md      (only when rollback fires; pairs with the release file)
+└── manual_approvals/
+    └── approval_<version>_<env>.md                     (when restrictions require manual approval, written before Phase 3)
+```
+
+The release report (`templates/release-report.md`) is appended to as each phase completes — it is durable across phase failures and reflects partial progress so the next operator can resume or audit.
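The file-naming scheme above could be produced as follows. This is a sketch with hypothetical function names; it assumes the `HHmm` part means a 24-hour clock without a separator and that the caller supplies the invocation timestamp.

```python
from datetime import datetime


def _stamp(started_at):
    """Format a datetime as YYYY-MM-DD-HHmm (24-hour clock, an assumption)."""
    return started_at.strftime("%Y-%m-%d-%H%M")


def release_report_name(version, env, started_at):
    """Name of the mandatory per-invocation release report."""
    return f"release_{version}_{env}_{_stamp(started_at)}.md"


def rollback_report_name(version, env, started_at):
    """Name of the paired rollback report; reuses the release timestamp."""
    return f"rollback_{version}_{env}_{_stamp(started_at)}.md"
```

Passing the same `started_at` to both functions keeps the rollback file visibly paired with its release file.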
+
+## Phases
+
+```
+┌────────────────────────────────────────────────────────────────┐
+│             Release Execution (6-Phase Method)                  │
+├────────────────────────────────────────────────────────────────┤
+│ PREREQ: deploy artifacts on disk; tests green at HEAD          │
+│                                                                │
+│ 1. Pre-Release Gate     → AC + change summary + readiness      │
+│    [BLOCKING: user confirms or aborts]                         │
+│ 2. Strategy Select      → all-at-once / blue-green / canary    │
+│    [BLOCKING: user picks strategy]                             │
+│ 3. Execute              → run deploy.sh, capture exit + logs   │
+│    [AUTO-ROLLBACK on non-zero exit]                            │
+│ 4. Smoke Test           → /test-run prod-smoke in target env   │
+│    [AUTO-ROLLBACK on failure]                                  │
+│ 5. Watch Window         → poll observability for N minutes     │
+│    [AUTO-ROLLBACK on hard threshold breach]                    │
+│ 6. Commit or Rollback   → finalize verdict, update tracker     │
+│    [BLOCKING: user confirms only if soft regression escalated] │
+├────────────────────────────────────────────────────────────────┤
+│ Verdicts: Released · Rolled-Back · Aborted                     │
+└────────────────────────────────────────────────────────────────┘
+```
+
+### Phase 1: Pre-Release Gate
+
+**Goal**: Refuse to start if the system is not ready for a real release.
+
+1. **Acceptance criteria check**: read `_docs/00_problem/acceptance_criteria.md`. If any AC is marked unmet OR if any AC has no associated test marked `Passed` in the latest `test-run` report, STOP and surface the unmet items. Do not let the user override with "ship anyway" without a recorded reason in the release report.
+2. **Test status check**: read the most recent `_docs/06_metrics/perf_*.md` (if perf is required by restrictions) and the latest functional test report. Any failing or skipped test that maps to a critical-path AC blocks the release.
+3. **Change summary**: read the git log between the version-tag-of-last-release and HEAD (or, if no prior release exists, from the project root commit). Render a short list grouped by component: features, fixes, breaking changes, security fixes. Cross-reference against the latest implementation reports under `_docs/03_implementation/`.
+4. **Rollback readiness**:
+   - Confirm the previous version's image is still pullable from the registry (do not deploy without this).
+   - Confirm `scripts/deploy.sh --rollback` works as documented (read the script; if `--rollback` flag is missing, STOP — that is a deploy-skill bug).
+   - Confirm a rollback target exists (e.g., previously-deployed image tag) and is recorded in the release report under `Rollback Plan`.
+5. **Restrictions**: read `_docs/00_problem/restrictions.md` for change-window rules, manual-approval rules, blackout windows, regulatory requirements (e.g., 4-eyes review, ITAR controls). If any apply, gate accordingly — write a `manual_approvals/approval_<version>_<env>.md` file once received.
+6. **Tracker check**: list tracker tickets in the release scope (per `tracker.mdc` rules). Any ticket still in `In Progress` or `Code Review` that maps to a change in the release scope blocks Phase 1. Move-and-deploy is not allowed.
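Two of the checks above (acceptance criteria and tracker status) reduce to simple filters. The sketch below assumes the caller has already parsed the acceptance-criteria file and the tracker into plain dictionaries; the input shapes and the function name are hypothetical, while the blocking ticket states are the ones named in step 6.

```python
# Ticket states that block the release, per the tracker check.
BLOCKING_TICKET_STATES = {"In Progress", "Code Review"}


def blocking_issues(ac_results, tickets):
    """Collect blocking issues for the pre-release summary.

    ac_results: {ac_id: True if a test marked Passed covers it, else False}
    tickets:    {ticket_id: tracker status string}
    """
    issues = [
        f"{ac}: no passing test"
        for ac, passed in sorted(ac_results.items())
        if not passed
    ]
    issues += [
        f"{ticket}: still {status}"
        for ticket, status in sorted(tickets.items())
        if status in BLOCKING_TICKET_STATES
    ]
    return issues
```

An empty list corresponds to `Blocking issues: none` in the gate summary; the rollback-readiness and restrictions checks are not modeled here.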
+
+**BLOCKING gate**: present the assembled summary to the user using Choose A/B/C:
+
+```
+══════════════════════════════════════
+ PRE-RELEASE GATE
+══════════════════════════════════════
+ Target env:        {env}
+ Target version:    {version} ({git-sha})
+ Rollback target:   {previous-version}
+ Changes:           N tickets, M components
+   - {summary list}
+ Open risks:        {summary or "none"}
+ Blocking issues:   {summary or "none"}
+══════════════════════════════════════
+ A) Proceed to Strategy Select
+ B) Abort — fix blocking issue and re-invoke
+ C) Edit release scope — exclude a ticket and reassemble
+══════════════════════════════════════
+```
+
+If A → write Phase 1 section to release report, proceed. If B → write `Aborted` verdict to release report with reason, exit. If C → loop back into Phase 1 with edited scope.
+
+### Phase 2: Strategy Select
+
+**Goal**: Pick the deployment strategy that fits the change risk and environment capability.
+
+Read `environment_strategy.md` and `deployment_procedures.md` to learn which strategies the target env supports. Strategies and when each is appropriate:
+
+| Strategy | When to pick | Risk if wrong |
+|----------|--------------|---------------|
+| **all-at-once** | Internal tools, low traffic, well-rehearsed change, env supports nothing else | All users hit the new version simultaneously — bug blast radius is 100% |
+| **blue-green** | Stateless services with a load balancer, env has dual-stack capability | Cutover is binary — observability must be ready to detect issues fast |
+| **canary** | Customer-facing, traffic-tier load balancer in place, gradual rollout possible | Canary metric thresholds must be well-tuned or canary fails for harmless reasons |
+| **manual** | Non-automatable env (one-off VMs, regulated infrastructure, non-Docker host) | The whole release becomes a runbook and the watch window phases are operator-driven; the release skill records but does not execute |
+
+Recommend a default based on:
+- Risk level inferred from change summary (any breaking change → bias toward canary or blue-green)
+- Restrictions (e.g., regulatory rules forcing manual approval at each step)
+- Environment capability (some envs may only support all-at-once)
+
+**BLOCKING gate**: Choose A/B/C/D between strategies. Record the choice in the release report.
+
+### Phase 3: Execute
+
+**Goal**: Actually run the deploy. Capture exit code and full stdout/stderr.
+
+1. Validate environment file (`.env`) exists, all required vars from `.env.example` are set, no placeholder secrets remain.
+2. Source the env file and run `scripts/deploy.sh` against the target host. The script produced by `/deploy` Step 7 is the point of execution; do NOT bypass it. If a strategy-specific flag is needed (e.g., `--canary 5%`), pass it through.
+3. Stream stdout/stderr to the release report, with timestamps, in a fenced code block under `## Phase 3: Execute`.
+4. Capture exit code.
+5. **AUTO-ROLLBACK trigger**: non-zero exit code → immediately invoke Phase 6 with verdict `Rolled-Back: deploy script failure`. Do NOT continue to Phase 4.
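
Step 1 of the list above could look roughly like the sketch below. It assumes that every key in `.env.example` is required and that placeholder secrets can be spotted by a short list of marker substrings; both are assumptions for illustration, and `parse_env`, `validate_env`, and `PLACEHOLDER_MARKERS` are hypothetical names.

```python
# Assumed marker substrings; a real project would define its own list.
PLACEHOLDER_MARKERS = ("REPLACE", "PASTE", "CHANGEME", "XXXX")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def validate_env(example_text: str, env_text: str) -> list:
    """Return a list of problems; an empty list means the env file passes."""
    required = parse_env(example_text)
    actual = parse_env(env_text)
    problems = []
    for key in required:
        value = actual.get(key)
        if not value:
            problems.append(f"{key}: missing or empty")
        elif any(marker in value.upper() for marker in PLACEHOLDER_MARKERS):
            problems.append(f"{key}: placeholder value")
    return problems
```

A non-empty result would be treated as a precondition violation before `scripts/deploy.sh` is run.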
+
+If `deploy.sh` emits no output for more than the configured idle threshold (default 5 minutes; check `deployment_procedures.md` for an explicit value), treat it as hung — capture a snapshot of what's running on the target, kill the script, and AUTO-ROLLBACK with reason `Deploy hung — manual investigation required`.
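
One way to detect a hung deploy is an idle watchdog around the child process. The sketch below is an assumption about how that could be wired, not the skill's implementation: `run_with_idle_timeout` is a hypothetical helper, and it does not capture the target-host snapshot described above, which would have to happen before the kill.

```python
import queue
import subprocess
import sys
import threading
import time


def run_with_idle_timeout(cmd, idle_seconds=300.0, sink=sys.stdout):
    """Run cmd, timestamp each output line, and kill it if it goes silent.

    Returns (exit_code, lines, hung); exit_code is None when the run was killed.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = queue.Queue()

    def pump():
        # Forward every output line to the queue; None marks end of stream.
        for line in proc.stdout:
            lines.put(line.rstrip("\n"))
        lines.put(None)

    threading.Thread(target=pump, daemon=True).start()

    captured = []
    while True:
        try:
            # Block for at most idle_seconds waiting for the next line.
            line = lines.get(timeout=idle_seconds)
        except queue.Empty:
            # No output within the idle threshold: treat the run as hung.
            proc.kill()
            proc.wait()
            return None, captured, True
        if line is None:
            return proc.wait(), captured, False
        stamped = f"{time.strftime('%Y-%m-%dT%H:%M:%S')} {line}"
        captured.append(stamped)
        sink.write(stamped + "\n")
```

A caller would map `hung=True` to the `Deploy hung` rollback reason and a non-zero exit code to the `deploy script failure` verdict.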
+
+**Manual strategy**: if Phase 2 picked `manual`, write a checklist of operator steps from `deployment_procedures.md` to the release report and pause until the user types `done` or `failed`. Phase 3 then records the user's report verbatim.
+
+### Phase 4: Smoke Test
+
+**Goal**: Verify the new version is *actually serving traffic correctly* in the target environment.
+
+1. Resolve the smoke-test command from `_docs/02_document/tests/blackbox-tests.md` § Production Smoke Tests, OR delegate to `/test-run` in `--prod-smoke` mode against the target environment.
+2. The smoke-test set must (a) hit each public endpoint of each component, (b) include at least one read AND one write per public endpoint where applicable, and (c) complete in under 5 minutes total.
+3. Capture pass/fail per case to the release report.
+4. **AUTO-ROLLBACK trigger**: any smoke-test failure → invoke Phase 6 with verdict `Rolled-Back: smoke test failure: <test-name>`.
+
+If smoke tests are **missing** for the target environment (no production-mode test set), STOP. Write a leftover entry to `_docs/_process_leftovers/` per `tracker.mdc` and do not proceed to the watch window without smoke coverage. Write `Aborted: smoke tests missing for prod-mode target` to the release report and ASK the user.
+
+### Phase 5: Watch Window
+
+**Goal**: Observe the live system for a defined window to catch latent regressions.
+
+1. Read `observability.md` for the project's metrics, dashboards, and threshold definitions. Required watch metrics for any production target (per cursor-meta convention) include error rate, request rate, p99 latency, and saturation (CPU/memory/queue-depth).
+2. Compute the watch-window duration from `deployment_procedures.md`. If unspecified, default to **15 minutes** for staging and **60 minutes** for production.
+3. Poll the observability backend at 1-minute intervals (or the configured cadence). For each interval, record metric snapshots to the release report.
+4. Threshold rules:
+   - **Hard breach** (auto-rollback): error-rate ≥ 2× baseline, p99 latency ≥ 3× baseline, any health-check failure persisting for 2 consecutive intervals.
+   - **Soft breach** (escalate): metric drift between 1.5× and 2× baseline, single-interval health blip, queue-depth steady but elevated.
+   - **No data** (auto-rollback): if metrics are not flowing within the first 3 minutes, treat the absence as a hard breach, because observability is itself broken.
+5. **AUTO-ROLLBACK trigger**: hard breach at any interval. Move to Phase 6 with verdict `Rolled-Back: <metric> breached <multiplier>× baseline at T+<minutes>`.
+6. **ESCALATE trigger**: soft breach. Pause polling, surface the metric, and ask the user A/B/C:
+   - A) Continue watch — accept current drift, keep polling
+   - B) Roll back now — treat soft drift as hard
+   - C) Extend watch window by N minutes
+7. End of watch window with no breach → proceed to Phase 6.
+
+The watch window cannot be skipped. If the user explicitly demands skipping (e.g., emergency rollforward), record the override reason in the release report and continue, but mark the verdict as `Released-with-override` — this triggers an automatic incident retrospective per `retrospective/SKILL.md`.
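
The threshold rules in step 4 can be expressed as a per-interval classifier. This is a sketch under stated assumptions: `Breach` and `classify_interval` are hypothetical names, queue-depth and no-data handling are left out, and drift at or above 1.5× that stays below a hard threshold (for example p99 latency at 2.5× baseline) is treated as soft, which the rules above do not state explicitly.

```python
from enum import Enum


class Breach(Enum):
    NONE = "none"
    SOFT = "soft"   # escalate to the user
    HARD = "hard"   # auto-rollback


def classify_interval(error_rate, p99_latency, baseline_error_rate,
                      baseline_p99_latency, consecutive_health_failures):
    """Classify one polling interval against the baseline multipliers."""
    if baseline_error_rate <= 0 or baseline_p99_latency <= 0:
        raise ValueError("baselines must be positive to compute a multiplier")
    error_ratio = error_rate / baseline_error_rate
    latency_ratio = p99_latency / baseline_p99_latency

    # Hard breach: error rate >= 2x, p99 >= 3x, or 2 consecutive health failures.
    if error_ratio >= 2.0 or latency_ratio >= 3.0 or consecutive_health_failures >= 2:
        return Breach.HARD
    # Soft breach: drift from 1.5x upward (assumed to extend to the hard limit)
    # or a single-interval health blip.
    if error_ratio >= 1.5 or latency_ratio >= 1.5 or consecutive_health_failures == 1:
        return Breach.SOFT
    return Breach.NONE
```

For instance, a p99 latency of 340 ms against a 200 ms baseline is 1.7× and classifies as soft, which corresponds to the A/B/C escalation in step 6.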
+
+### Phase 6: Commit or Rollback
+
+**Goal**: Finalize the release with a definitive verdict on disk.
+
+**Path A — Commit (clean release)**:
+1. Update tracker tickets: every ticket in scope moves to `Released` (or `Done`, per project convention defined in `tracker.mdc` / `_docs/_repo-config.yaml`).
+2. Tag the git HEAD with `release/<version>` (or the project's tag convention from `deployment_procedures.md`).
+3. Write the final `Released` verdict to the release report with a summary table.
+4. Trigger `/retrospective --cycle-end` with this release as the cycle terminus.
+5. Auto-chain to autodev's next step (Retrospective in greenfield, or feature-cycle loop start in existing-code).
+
+**Path B — Rollback (auto-fired or user-elected)**:
+1. Run `scripts/deploy.sh --rollback` with the rollback target captured in Phase 1.
+2. Stream output to a new file `RELEASE_DIR/rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md` AND append a summary to the original release report under `## Rollback`.
+3. Re-run Phase 4 (smoke test) and a 5-minute mini watch window against the rolled-back version. If THAT also fails, escalate immediately — the system is in an unknown state and needs human takeover.
+4. Update tracker tickets back to `Ready for Release` (or the project's pre-release status).
+5. Write the final `Rolled-Back` verdict with full reason chain.
+6. Auto-trigger `/retrospective --incident` with this release as the incident anchor (per `retrospective/SKILL.md` incident mode).
+7. Do NOT auto-chain to anything else — the user owns the next step.
+
+**Path C — Aborted**:
+Reached only via Phase 1 Choose B, Phase 4 smoke-tests-missing escalation, or any phase that detects a precondition violation. Write `Aborted: <reason>` to the release report. Do not auto-chain.
+
+## Self-verification
+
+- [ ] Release report exists at `RELEASE_DIR/release_<version>_<env>_<timestamp>.md` with verdict (Released / Rolled-Back / Aborted)
+- [ ] Every phase that ran has a section in the release report with timestamps and tool output
+- [ ] On Released: tracker tickets moved to release status; git tag pushed (if convention)
+- [ ] On Rolled-Back: rollback report exists at `RELEASE_DIR/rollback_<version>_<env>_<timestamp>.md`; tracker tickets moved back to pre-release status; incident retrospective scheduled
+- [ ] On Aborted: reason recorded; no live-system changes attempted; no tracker movement
+- [ ] No phase was skipped without an explicit reason recorded in the release report
+
+## Escalation Rules
+
+| Situation | Action |
+|-----------|--------|
+| `scripts/deploy.sh` missing or `--rollback` unsupported | STOP — return to `/deploy` Step 7, do not patch the script in `/release` |
+| Registry auth failure during pre-release | STOP — fix credentials at infra layer (per `coderule.mdc`); do not embed creds in the script |
+| Smoke tests missing for prod target | STOP — write a leftover; do not improvise smoke tests in `/release` |
+| Observability backend unreachable | STOP — observability blindness is itself a release blocker |
+| User asks to skip the watch window | Record override, mark verdict `Released-with-override`, fire incident retro |
+| Rollback also fails its smoke test | ESCALATE to user — system is in unknown state; do not loop deploys |
+| Tracker MCP returns Unauthorized during ticket movement | Per `tracker.mdc`, write a leftover entry; do NOT silently continue without confirming the move |
+| Multiple environments named in user request | STOP — one release per invocation; ask user to pick one |
+| Production smoke test would touch real customer data | STOP — that is a `coderule.mdc` violation; ask user to define a smoke endpoint or test account |
+
+## Common Mistakes
+
+- **Skipping the watch window when "everything looks fine after deploy"** — a deploy that exited 0 is not a release that's stable. Watch is mandatory.
+- **Faking smoke tests** to pass the gate when the prod test set is incomplete. STOP and surface the gap; do not embed prod URLs into ad-hoc curl commands.
+- **Rolling forward through a failure** ("the next deploy will fix it"). Roll back first, fix the cause, then deploy a real fix.
+- **Treating the release report as optional** when only an internal tool changed. Every release writes a report — the audit trail is the value, not the prose volume.
+- **Approving manual gates yourself** without the user's input when restrictions require human approval. The release skill records, the human approves.
+- **Reusing `release_<version>` filenames** across attempted releases. Always include the timestamp in the filename so re-attempts are visible side-by-side.
+- **Letting tracker drift silently** between release attempts. If Phase 6 cannot move tickets, the release is not complete — write a leftover and stop.
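
To keep re-attempts visible side by side, the report filename can be derived from the version, environment, and a minute-resolution timestamp. The sketch below follows the `<kind>_<version>_<env>_<YYYY-MM-DD-HHmm>.md` pattern used in Phase 6; `report_filename` is a hypothetical helper, and applying the same timestamp format to release reports is an assumption.

```python
from datetime import datetime, timezone


def report_filename(kind: str, version: str, env: str, when: datetime) -> str:
    """Build a release or rollback report filename with a timestamp suffix."""
    if kind not in ("release", "rollback"):
        raise ValueError(f"unknown report kind: {kind}")
    stamp = when.strftime("%Y-%m-%d-%H%M")
    return f"{kind}_{version}_{env}_{stamp}.md"


# Example with a fixed timestamp so the result is reproducible.
name = report_filename(
    "rollback", "1.4.2", "staging",
    datetime(2026, 6, 23, 8, 41, tzinfo=timezone.utc),
)
```

Two attempts within the same minute would still collide, so a caller may want to check for an existing file before writing.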
+
+## Project Mode vs Standalone
+
+- **Project mode** (default): autodev invokes `/release` after `/deploy`. State writes occur under `_docs/_autodev_state.md`. Full integration with retrospective and feature-cycle loop.
+- **Standalone mode**: `/release` invoked directly with `@<artifact>` (rare; usually only for re-running a rollback against a specific version). All outputs still go to `RELEASE_DIR/`.
+
+## Methodology Quick Reference
+
+```
+┌────────────────────────────────────────────────────────────────┐
+│                Release (6 phases, 3 verdicts)                  │
+├────────────────────────────────────────────────────────────────┤
+│ Phase 1  Pre-Release Gate                                      │
+│           AC + tests + change summary + rollback path          │
+│           [BLOCKING — user A/B/C]                              │
+│ Phase 2  Strategy Select                                       │
+│           all-at-once · blue-green · canary · manual           │
+│           [BLOCKING — user picks]                              │
+│ Phase 3  Execute                                               │
+│           scripts/deploy.sh, capture exit code + logs          │
+│           [AUTO-ROLLBACK on non-zero or hang]                  │
+│ Phase 4  Smoke Test                                            │
+│           /test-run --prod-smoke against target                │
+│           [AUTO-ROLLBACK on any failure]                       │
+│ Phase 5  Watch Window                                          │
+│           Poll observability for N minutes                     │
+│           [AUTO-ROLLBACK on hard breach; escalate on soft]     │
+│ Phase 6  Commit or Rollback                                    │
+│           Released → tracker, tag, retrospective               │
+│           Rolled-Back → tracker reset, incident retrospective  │
+│           Aborted → no live-system change                      │
+├────────────────────────────────────────────────────────────────┤
+│ Principles: real execution · verifiable rollback ·             │
+│             quiet failure = release failure ·                  │
+│             watch window mandatory                             │
+└────────────────────────────────────────────────────────────────┘
+```
@@ -0,0 +1,114 @@
+# Release Report — {version} → {env}
+
+- **Date**: {YYYY-MM-DD HH:MM} {timezone}
+- **Operator**: {user}
+- **Strategy**: {all-at-once | blue-green | canary | manual}
+- **Verdict**: {Released | Released-with-override | Rolled-Back | Aborted}
+- **Verdict reason**: {one-line summary}
+
+## Pre-Release Gate (Phase 1)
+
+### Acceptance Criteria
+
+| AC ID | Status | Evidence |
+|-------|--------|----------|
+| AC-001 | Met / Unmet | path:section, test report, etc. |
+
+### Test Status
+
+| Suite | Pass | Fail | Skip | Source |
+|-------|------|------|------|--------|
+| Functional | N | N | N | _docs/03_implementation/{batch}.md |
+| Performance | N | N | N | _docs/06_metrics/perf_*.md |
+
+### Change Summary
+
+| Component | Tickets | Type |
+|-----------|---------|------|
+| {component} | TKT-001, TKT-002 | feature / fix / breaking / security |
+
+### Rollback Plan
+
+- Previous version: `{previous-version}` (registry digest: `{sha}`)
+- Rollback script: `scripts/deploy.sh --rollback`
+- Rollback target verified pullable: yes / no
+- Rollback target verified bootable in target env: yes / no
+
+### Restrictions / Approvals
+
+- Change-window restrictions: {none | description}
+- Manual approvals required: {none | reference to approval file}
+
+### Tracker State at Gate
+
+- Tickets in scope: {N}
+- Tickets blocking release: {0 — list any}
+
+## Strategy Select (Phase 2)
+
+- Recommended: {strategy} — reasoning
+- Chosen: {strategy} — reasoning (if differs from recommended)
+
+## Execute (Phase 3)
+
+- Start: {timestamp}
+- End: {timestamp}
+- Exit code: {0 / non-zero}
+
+```
+<scripts/deploy.sh stdout/stderr stream, with timestamps>
+```
+
+## Smoke Test (Phase 4)
+
+- Mode: {/test-run --prod-smoke | manual smoke set}
+- Start: {timestamp}
+- End: {timestamp}
+
+| Test | Result | Notes |
+|------|--------|-------|
+| {name} | Pass / Fail | response time, status, etc. |
+
+## Watch Window (Phase 5)
+
+- Duration: {minutes}
+- Cadence: {minutes per poll}
+- Backend: {observability source — Prometheus, CloudWatch, Datadog, etc.}
+
+| T+min | error_rate | rps | p99_latency | saturation | health | notes |
+|-------|------------|-----|-------------|------------|--------|-------|
+| 0 | … | … | … | … | OK | … |
+| 1 | … | … | … | … | OK | … |
+| … | … | … | … | … | … | … |
+
+### Threshold breaches
+
+- {None | "p99 latency 1.7× baseline at T+8 — soft breach, user accepted continuation"}
+
+## Commit or Rollback (Phase 6)
+
+### If Released
+
+- Tracker tickets moved: {list}
+- Git tag pushed: {tag} → {sha}
+- Retrospective scheduled: yes — {/retrospective --cycle-end output path}
+
+### If Rolled-Back
+
+- Trigger: {auto / user-elected}
+- Reason: {phase + one-line cause}
+- Rollback start: {timestamp}
+- Rollback end: {timestamp}
+- Post-rollback smoke: pass / fail
+- Tracker tickets moved back: {list}
+- Incident retrospective scheduled: yes — {/retrospective --incident output path}
+
+### If Aborted
+
+- Phase that aborted: {1 / 2 / 3 / 4 / 5}
+- Reason: {one-line cause}
+- No live-system changes attempted: yes / no (if live changes were made, document them under Phase 3 above and treat the release as Rolled-Back instead)
+
+## Lessons (one-liners; full incident retro if Rolled-Back / Released-with-override)
+
+- {Optional: short one-liner observations the operator wants the next /retrospective to consider}
@@ -2,9 +2,9 @@
 name: retrospective
 description: |
  Collect metrics from implementation batch reports and code review findings, analyze trends across cycles,
-  and produce improvement reports with actionable recommendations.
-  3-step workflow: collect metrics, analyze trends, produce report.
-  Outputs to _docs/06_metrics/.
+  and produce improvement reports plus a lessons-log update with actionable recommendations.
+  4-step workflow: collect metrics, analyze trends, produce report, update lessons log.
+  Outputs to _docs/06_metrics/ and appends to _docs/LESSONS.md (ring buffer, last 15).
  Trigger phrases:
  - "retrospective", "retro", "run retro"
  - "metrics review", "feedback loop"
@@ -232,7 +232,7 @@ Present the report summary to the user.

 ```
 ┌────────────────────────────────────────────────────────────────┐
-│              Retrospective (3-Step Method)                     │
+│              Retrospective (4-Step Method)                     │
 ├────────────────────────────────────────────────────────────────┤
 │ PREREQ: batch reports exist in _docs/03_implementation/        │
 │                                                                │
@@ -202,12 +202,12 @@ If invoked in `cycle-update` mode (see "Invocation Modes" above), read and follo
 | Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — specification cannot proceed |
 | Missing input_data/expected_results/results_report.md | **STOP** — ask user to provide expected results mapping using the template |
 | Ambiguous requirements | ASK user |
-| Input data coverage below 75% (Phase 1) | Search internet for supplementary data, ASK user to validate |
+| Input data coverage below the canonical threshold (Phase 1) | Search internet for supplementary data, ASK user to validate. See `.cursor/rules/cursor-meta.mdc` Quality Thresholds for the canonical 75% number — do not hardcode a different threshold here. |
 | Expected results missing or not quantifiable (Phase 1) | ASK user to provide quantifiable expected results before proceeding |
 | Test scenario conflicts with restrictions | ASK user to clarify intent |
 | System interfaces unclear (no architecture.md) | ASK user or derive from solution.md |
 | Test data or expected result not provided for a test scenario (Phase 3) | WARN user and REMOVE the test |
-| Final coverage below 75% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |
+| Final coverage below the canonical threshold after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec (see `cursor-meta.mdc` Quality Thresholds) |

 ## Common Mistakes

@@ -252,7 +252,8 @@ When the user wants to:
 │                                                                      │
 │ Phase 3: Test Data & Expected Results Validation Gate (HARD GATE)    │
 │   → phases/03-data-validation-gate.md                                │
-│   [BLOCKING: coverage ≥ 75% required to pass]                        │
+│   [BLOCKING: coverage ≥ canonical threshold required to pass —      │
+│    see cursor-meta.mdc Quality Thresholds (75%)]                    │
 │                                                                      │
 │ Hardware-Dependency Assessment (BLOCKING, pre-Phase-4)               │
 │   → phases/hardware-assessment.md                                    │
@@ -1,7 +1,7 @@
 # Phase 3: Test Data & Expected Results Validation Gate (HARD GATE)

 **Role**: Professional Quality Assurance Engineer
-**Goal**: Ensure every test scenario produced in Phase 2 has concrete, sufficient test data. Remove tests that lack data. Verify final coverage stays above 75%.
+**Goal**: Ensure every test scenario produced in Phase 2 has concrete, sufficient test data. Remove tests that lack data. Verify final coverage stays above the canonical threshold (currently 75% — see `.cursor/rules/cursor-meta.mdc` Quality Thresholds; never hardcode a different number in any phase).
 **Constraints**: This phase is MANDATORY and cannot be skipped.

 ## Step 1 — Build the requirements checklist
@@ -28,10 +28,22 @@ LOG_SINK=console
 # Dev key from tests/fixtures/mavlink_signing/dev_key in dev-tier1.
 MAVLINK_SIGNING_KEY=tests/fixtures/mavlink_signing/dev_key

-# CMake build flags (per-binary; honoured at compile time)
+# CMake / runtime BUILD_* gating flags
+# Defaults below match the airborne deployment binary (ADR-002 / ADR-011).
+# Strategy flags use OFF for opt-in non-default strategies; ON for the
+# deployment defaults that the runtime expects to be linked.
 BUILD_VINS_MONO=OFF
 BUILD_SALAD=OFF
 BUILD_C11_TILE_MANAGER=OFF
+# Replay-mode strategy flags (ADR-011) — must be ON in the airborne and
+# research binaries so replay can run from the same image. The CI test
+# compose files already set these explicitly; production sets them ON.
+# BUILD_VIDEO_FILE_FRAME_SOURCE=ON
+# BUILD_TLOG_REPLAY_ADAPTER=ON
+# BUILD_REPLAY_SINK_JSONL=ON
+# Dev-only: enables `signing_key_source='dev_static'` on the AP FC adapter.
+# MUST stay OFF on production images; ON only in dev/CI containers.
+# BUILD_DEV_STATIC_KEY=OFF

 # Required: C7 inference backend (tensorrt | pytorch_fp16 | onnx_trt_ep)
 INFERENCE_BACKEND=pytorch_fp16
@@ -0,0 +1,42 @@
+# AZ-688: dev-only environment for the Jetson e2e harness.
+# Jetson-only test policy (2026-05-20) — see _docs/LESSONS.md.
+#
+# Copy this file to `.env.test` and customize. NEVER commit `.env.test`
+# (gitignored). Sourced by `scripts/run-tests-jetson.sh` before
+# `docker compose up`.
+
+# Suite JWT contract — see ../_docs/10_auth.md. The same secret signs the
+# dev JWT (AZ-690) and validates it at the satellite-provider boundary.
+# MUST be ≥ 32 bytes UTF-8. Generate a fresh value with:
+#   openssl rand -hex 32
+JWT_SECRET=DEV-ONLY-REPLACE-WITH-OPENSSL-RAND-HEX-32-OUTPUT-XXXXXXX
+
+# JWT issuer / audience claims. Dev-only values that ONLY validate against
+# the dev secret above. Production deploys MUST use real values provided
+# by the admin team (the admin API stamps `iss`; satellite-provider
+# validates `aud`).
+JWT_ISSUER=DEV-ONLY-iss-admin-azaion-local
+JWT_AUDIENCE=DEV-ONLY-aud-satellite-provider
+
+# Google Maps Platform key. Left empty: AZ-689 seeds local fixture tiles
+# instead, so the hermetic Derkachi e2e flow never calls Google Maps. If
+# you need to exercise the real GMaps tile-download path, set this to a
+# valid key.
+GOOGLE_MAPS_API_KEY=
+
+# AZ-777: Bearer token C11 sends to satellite-provider as
+# `Authorization: Bearer <token>`. The token is a JWT signed with
+# JWT_SECRET above and stamped with the same iss/aud the provider
+# validates. Mint a dev token with:
+#   python scripts/mint_dev_jwt.py
+# Production deploys retrieve this from the admin API and rotate per
+# operator session — never commit a real one.
+SATELLITE_PROVIDER_API_KEY=PASTE-MINTED-JWT-HERE
+
+# SECURITY: development-only TLS bypass for the parent-suite
+# satellite-provider self-signed dev cert. The compose env block sets
+# SATELLITE_PROVIDER_TLS_INSECURE=1 — it stays inside the Jetson e2e
+# harness, never in production. Production deploys MUST use a real
+# CA-issued cert (or your own internal CA) and leave this unset (or
+# set to "0"). C11 logs a single WARNING at startup whenever the
+# insecure flag is active so the operator can audit it.
@@ -0,0 +1,4 @@
+_docs/00_problem/input_data/flight_derkachi/flight_derkachi.mp4 filter=lfs diff=lfs merge=lfs -text
+models/**/*.pt filter=lfs diff=lfs merge=lfs -text
+models/**/*.onnx filter=lfs diff=lfs merge=lfs -text
+models/**/*.engine filter=lfs diff=lfs merge=lfs -text
@@ -14,7 +14,12 @@ jobs:
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
-      - run: pip install -e ".[dev]"
+      # AZ-300 — `[inference]` (torch + torchvision + onnxruntime) is now
+      # required for `mypy src` to type-check `c7_inference.pytorch_fp16_runtime`
+      # and for `pytest` to collect `test_pytorch_fp16_runtime.py`. Tier-1
+      # CI uses the CPU-only torch wheel; CUDA-gated tests skip themselves
+      # via `pytest.mark.skipif(not torch.cuda.is_available(), ...)`.
+      - run: pip install -e ".[dev,inference]"
      - run: ruff check src tests
      - run: mypy src

@@ -26,7 +31,7 @@ jobs:
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
-      - run: pip install -e ".[dev]"
+      - run: pip install -e ".[dev,inference]"
      - name: pytest unit (per-component coverage gate)
        run: pytest -q --cov=gps_denied_onboard --cov-fail-under=75 tests/unit

@@ -47,10 +52,20 @@ jobs:
      matrix:
        kind: [deployment, research]
        include:
+          # AZ-332 — BUILD_OKVIS2 forced OFF in Tier-1 CI until the tier2
+          # follow-up wires `okvis::ThreadedKFVio` end-to-end. The C++
+          # binding skeleton + CMake glue still ship in this build; full
+          # OKVIS2 native compile is gated on installing Ceres-solver +
+          # OKVIS2 vendored submodules (BRISK, DBoW2) via apt, plus
+          # `submodules: recursive` checkout. That CI lift is the
+          # tier2 task's surface, not AZ-332's.
          - kind: deployment
-            cmake_flags: "-DBUILD_VINS_MONO=OFF -DBUILD_VPR_SALAD=OFF -DBUILD_C11_TILE_MANAGER=OFF"
+            cmake_flags: >-
+              -DBUILD_OKVIS2=OFF -DBUILD_VINS_MONO=OFF
+              -DBUILD_VPR_SALAD=OFF -DBUILD_C11_TILE_MANAGER=OFF
          - kind: research
-            cmake_flags: "-DBUILD_VINS_MONO=ON -DBUILD_VPR_SALAD=ON"
+            cmake_flags: >-
+              -DBUILD_OKVIS2=OFF -DBUILD_VINS_MONO=ON -DBUILD_VPR_SALAD=ON
    steps:
      - uses: actions/checkout@v4
      - run: cmake -S . -B build ${{ matrix.cmake_flags }}
@@ -13,12 +13,12 @@ jobs:
      - name: Build JetPack image
        run: echo "JetPack image build + sign + attest — concrete wiring lands per deploy task"

-  operator-tooling-tarball:
+  operator-orchestrator-tarball:
    runs-on: ubuntu-22.04
    needs: jetpack-image
    steps:
      - uses: actions/checkout@v4
-      - name: Bundle operator-tooling tarball
+      - name: Bundle operator-orchestrator tarball
        run: |
          mkdir -p dist
-          tar -czf dist/operator-tooling.tar.gz docker-compose.yml docker/ _docs/
+          tar -czf dist/operator-orchestrator.tar.gz docker-compose.yml docker/ _docs/
@@ -43,6 +43,20 @@ tests/fixtures/flight_derkachi/*.h264
 tests/fixtures/flight_derkachi/*.tlog
 tests/fixtures/tiles_corpus/*.jpg
 tests/fixtures/tiles_corpus/*.png
+e2e/fixtures/sitl_replay/
+
+# Problem-folder flight-log inputs (binary, out-of-band)
+_docs/00_problem/input_data/**/*.tlog
+_docs/00_problem/input_data/**/*.mp4
+_docs/00_problem/input_data/**/*.h264
+_docs/00_problem/input_data/**/*.mkv
+_docs/00_problem/input_data/**/*.zip
+
+# Locally-generated evidence frames for extraction fixtures (large, regenerable)
+_docs/00_problem/input_data/**/frames_src/
+_docs/00_problem/input_data/**/frames_optA/
+_docs/00_problem/input_data/**/frames_optB/
+_docs/00_problem/input_data/**/frames_optC/

 # Editor / OS noise
 .idea/
@@ -57,9 +71,17 @@ Thumbs.db
 /var/lib/gps-denied/
 fdr_output/
 tile_cache/
+e2e-results/
+
+# Local scratch / one-off diagnostics
+_scratch/

 # Secrets
 .env
 .env.local
+.env.test
 *.key
 !tests/fixtures/mavlink_signing/dev_key
+
+# Deploy rollback bookmark (written by scripts/stop-services.sh)
+.previous-tags.env
@@ -0,0 +1,6 @@
+[submodule "cpp/pybind11/upstream"]
+	path = cpp/pybind11/upstream
+	url = https://github.com/pybind/pybind11.git
+[submodule "cpp/okvis2/upstream"]
+	path = cpp/okvis2/upstream
+	url = https://github.com/smartroboticslab/okvis2.git
@@ -0,0 +1,43 @@
+# Cycle-1 trigger: manual-only.
+#
+# Rationale (per _docs/04_deploy/ci_cd_pipeline.md → Decision Record):
+#   The Tier-1 e2e harness (docker-compose.test.yml + tests/e2e/Dockerfile)
+#   is heavy: TensorRT-class pytorch fp16, gtsam, Postgres 16, and the
+#   Derkachi replay clip. It is shipped opt-in until per-run wall-clock on
+#   the colocated arm64 Jetson agent is characterised.
+#
+# Flip-back (cycle-2 polish item #1 in _docs/04_deploy/ci_cd_pipeline.md):
+#   1. Replace `event: [manual]` with `event: [push, pull_request, manual]`
+#      below.
+#   2. Add `depends_on: [01-test]` to .woodpecker/02-build-push.yml.
+
+when:
+  event: [manual]
+  branch: [dev, stage, main]
+
+matrix:
+  include:
+    - PLATFORM: arm64
+      TAG_SUFFIX: arm
+    # - PLATFORM: amd64
+    #   TAG_SUFFIX: amd
+
+labels:
+  platform: ${PLATFORM}
+
+steps:
+  - name: e2e
+    image: docker
+    commands:
+      - docker compose -f docker-compose.test.yml up --build --abort-on-container-exit --exit-code-from e2e-runner
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
+
+  - name: down
+    image: docker
+    when:
+      status: [success, failure]
+    commands:
+      - docker compose -f docker-compose.test.yml down -v
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
@@ -0,0 +1,85 @@
+# Cycle-1 trigger: push + manual on dev/stage/main, NO depends_on.
+#
+# Rationale (per _docs/04_deploy/ci_cd_pipeline.md → Decision Record):
+#   01-test.yml runs `event: [manual]` only in cycle-1, so a `depends_on:
+#   [01-test]` clause here would skip every push (no preceding test run to
+#   succeed against). The un-gated stance mirrors the `detections` deferral
+#   pattern documented in `../_infra/ci/README.md` → "detections deferral".
+#
+# Re-gate (cycle-2 polish item #1 in _docs/04_deploy/ci_cd_pipeline.md):
+#   Add `depends_on: [01-test]` below once .woodpecker/01-test.yml flips to
+#   `event: [push, pull_request, manual]`.
+#
+# Images pushed in cycle-1:
+#   - azaion/gps-denied-onboard-companion-tier1:${BRANCH}-${TAG_SUFFIX}
+#   - azaion/gps-denied-onboard-operator-orchestrator:${BRANCH}-${TAG_SUFFIX}
+#
+# Image NOT pushed in cycle-1 (reserved for cycle-2 / companion-jetson):
+#   - azaion/gps-denied-onboard:${BRANCH}-${TAG_SUFFIX}
+#     (parent-suite Jetson compose at ../_infra/deploy/jetson/docker-compose.yml
+#      expects this exact tag; cycle-1 must not write to it or Watchtower
+#      on fielded Jetsons will pull a Tier-1 dev image.)
+#
+# OCI labels (suite-mandated, AZ-204 — see ../_infra/ci/README.md → "OCI
+# image labels and commit provenance"):
+#   org.opencontainers.image.revision = $CI_COMMIT_SHA
+#   org.opencontainers.image.created  = <UTC RFC 3339>
+#   org.opencontainers.image.source   = $CI_REPO_URL
+# Plus --build-arg CI_COMMIT_SHA so the Dockerfile can bake ENV AZAION_REVISION.
+
+when:
+  event: [push, manual]
+  branch: [dev, stage, main]
+
+matrix:
+  include:
+    - PLATFORM: arm64
+      TAG_SUFFIX: arm
+    # - PLATFORM: amd64
+    #   TAG_SUFFIX: amd
+
+labels:
+  platform: ${PLATFORM}
+
+steps:
+  - name: build-push-companion-tier1
+    image: docker
+    environment:
+      REGISTRY_HOST:  { from_secret: registry_host }
+      REGISTRY_USER:  { from_secret: registry_user }
+      REGISTRY_TOKEN: { from_secret: registry_token }
+    commands:
+      - echo "$REGISTRY_TOKEN" | docker login "$REGISTRY_HOST" -u "$REGISTRY_USER" --password-stdin
+      - export TAG=${CI_COMMIT_BRANCH}-${TAG_SUFFIX}
+      - export BUILD_DATE=$(date -u +%Y-%m-%dT%H:%M:%SZ)
+      - |
+        docker build -f docker/companion-tier1.Dockerfile \
+          --build-arg CI_COMMIT_SHA=$CI_COMMIT_SHA \
+          --label org.opencontainers.image.revision=$CI_COMMIT_SHA \
+          --label org.opencontainers.image.created=$BUILD_DATE \
+          --label org.opencontainers.image.source=$CI_REPO_URL \
+          -t $REGISTRY_HOST/azaion/gps-denied-onboard-companion-tier1:$TAG .
+      - docker push $REGISTRY_HOST/azaion/gps-denied-onboard-companion-tier1:$TAG
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
+
+  - name: build-push-operator-orchestrator
+    image: docker
+    environment:
+      REGISTRY_HOST:  { from_secret: registry_host }
+      REGISTRY_USER:  { from_secret: registry_user }
+      REGISTRY_TOKEN: { from_secret: registry_token }
+    commands:
+      - echo "$REGISTRY_TOKEN" | docker login "$REGISTRY_HOST" -u "$REGISTRY_USER" --password-stdin
+      - export TAG=${CI_COMMIT_BRANCH}-${TAG_SUFFIX}
+      - export BUILD_DATE=$(date -u +%Y-%m-%dT%H:%M:%SZ)
+      - |
+        docker build -f docker/operator-orchestrator.Dockerfile \
+          --build-arg CI_COMMIT_SHA=$CI_COMMIT_SHA \
+          --label org.opencontainers.image.revision=$CI_COMMIT_SHA \
+          --label org.opencontainers.image.created=$BUILD_DATE \
+          --label org.opencontainers.image.source=$CI_REPO_URL \
+          -t $REGISTRY_HOST/azaion/gps-denied-onboard-operator-orchestrator:$TAG .
+      - docker push $REGISTRY_HOST/azaion/gps-denied-onboard-operator-orchestrator:$TAG
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
@@ -23,4 +23,4 @@ For full Tier-1 integration via Docker, see [`_docs/02_document/deployment/conta

 ## Build matrix

-Four binaries built from this codebase: **airborne**, **research**, **operator-tooling**, **replay-cli**. CMake `BUILD_*` flags gate component inclusion per binary — see [`cmake/build_options.cmake`](cmake/build_options.cmake) and [`_docs/02_document/module-layout.md` § Build-Time Exclusion Map](_docs/02_document/module-layout.md#build-time-exclusion-map-adr-002).
+Four binaries built from this codebase: **airborne**, **research**, **operator-orchestrator**, **replay-cli**. CMake `BUILD_*` flags gate component inclusion per binary — see [`cmake/build_options.cmake`](cmake/build_options.cmake) and [`_docs/02_document/module-layout.md` § Build-Time Exclusion Map](_docs/02_document/module-layout.md#build-time-exclusion-map-adr-002).
@@ -12,3 +12,31 @@
 Use this fixture for video/telemetry synchronization checks, representative replay smoke tests, VIO hot-path latency, frame-drop accounting, and trajectory comparison against `GLOBAL_POSITION_INT`. The video and telemetry align at exactly three video frames per telemetry row. Camera intrinsics, lens distortion, raw camera resolution, and exact camera-to-body calibration are still unknown, so this fixture is not sufficient by itself for final production camera calibration or satellite-anchor accuracy claims.

 For the test recording, the rotating camera was mechanically fixed in a downward/nadir orientation. Treat the MP4 as a cleaned/cropped replay fixture rather than the raw camera feed.
+
+## Derkachi C6 reference seeding (cycle 3 — AZ-777 + Epic AZ-835)
+
+The end-to-end replay pipeline needs the C6 tile cache pre-populated with the satellite imagery that covers this flight. The seed scripts live under `tests/fixtures/derkachi_c6/`:
+
+| Script | Purpose |
+|--------|---------|
+| `tests/fixtures/derkachi_c6/seed_region.py` (AZ-777 Phase 2) | Bbox-driven seed. Calls `POST /api/satellite/request` on the running `satellite-provider` to onboard the Derkachi area (~50.05–50.15 lat, 36.05–36.15 lon, zoom 15–18). Companion to the existing bbox-download workflow. |
+| `tests/fixtures/derkachi_c6/seed_route.py` (AZ-838 / Epic AZ-835 C2) | Route-driven seed. Reads `derkachi.tlog`, extracts a ≤ 10-waypoint corridor via `replay_input.tlog_route.extract_route_from_tlog`, posts it to `satellite-provider`'s Route API, polls until `mapsReady=true`, and verifies coverage via inventory. ~100× more tile-efficient than the bbox path for this clip. |
+| `tests/fixtures/derkachi_c6/bbox.yaml` | Derkachi bbox + zoom levels + license-attribution metadata (Google Maps Platform ToS + "Imagery © Google" attribution string). |
+| `tests/fixtures/derkachi_c6/README.md` | Step-by-step re-seeding instructions for when the `satellite-provider` postgres is wiped; the license attribution that operators must propagate; and a pointer to the parent-suite ticket (TBD) for migrating to a true CC-BY satellite source for production. |
+
+Both seed scripts require:
+
+- A running `satellite-provider` reachable at `SATELLITE_PROVIDER_URL` (typically `https://satellite-provider:8080` inside the Jetson compose network).
+- A valid JWT — either `SATELLITE_PROVIDER_API_KEY` env var or `--auto-mint-jwt` (uses `scripts/mint_dev_jwt.py`).
+- `SATELLITE_PROVIDER_TLS_INSECURE=1` if the parent suite is using the self-signed dev cert (development only — production deploys must validate against a CA-issued cert).
+
+The end-to-end orchestrator test `tests/e2e/replay/test_az835_e2e_real_flight.py` (AZ-840) takes only `(derkachi.tlog, flight_derkachi.mp4, khp20s30_factory.json)` and runs the full 7-step pipeline against a populated C6 — see `_docs/02_document/contracts/replay/replay_protocol.md` Invariant 12.b for the orchestration.
+
+### License attribution caveat (cycle 3)
+
+The Jetson `satellite-provider` instance downloads from the **Google Maps satellite layer** (`lyrs=s`), governed by Google Maps Platform Terms of Service. This fixture and the seed scripts are dev/research use only. Production deployment requires either:
+
+- Google Maps Platform licensing review for offline-cache use, OR
+- A parent-suite ticket to switch satellite-provider's upstream to a true CC-BY satellite source (Esri World Imagery, Mapbox satellite, Sentinel-2, etc.).
+
+The "Imagery © Google" attribution string is recorded in the seeded catalog's metadata and must be propagated downstream by any operator workflow that surfaces the imagery.
@@ -1,3 +1,34 @@
-Camera model: Topotek KHP20S30
-Daylight Sensor: 1/2.8" CMOS (2.13 Мп).
- Full HD (1920x1080), 30/60 fps
+# Derkachi camera
+
+Camera model: **Topotek KHP20S30**
+Daylight sensor: 1/2.8" CMOS (Sony IMX291-class, 2.13 MP)
+Image resolution: Full HD 1920×1080 @ 30/60 fps
+Lens: 20× optical zoom, f = 4.7 mm – 94 mm
+
+## Calibration
+
+**File**: [`khp20s30_factory.json`](./khp20s30_factory.json)
+**Acquisition method**: `factory_sheet` (AZ-702 — factory-sheet approximation)
+**Assumed zoom setting**: wide-angle (f = 4.7 mm), HFOV ≈ 59.5°
+
+Per-unit checkerboard refinement is **deferred** (no hardware access to the
+Derkachi unit). The factory-sheet calibration is the cheapest reasonable
+starting point. The residual focal-length error is expected to be in the
+**1–3 %** band; at high AGL this may push horizontal position error past the
+AC-3 100 m budget, in which case AZ-699 (T3 real-flight validation) reports
+the honest finding and a follow-up checkerboard task is filed.
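+
+As a sketch, the factory-sheet numbers in `khp20s30_factory.json` can be
+reproduced from the pinhole model, assuming square pixels, a centred principal
+point, and the sensor dimensions recorded in that file's `metadata` block
+(5.37 mm × 3.02 mm). The symbols `W`, `H`, `w_s`, `h_s` and `f_mm` are notation
+introduced here for illustration, not names used elsewhere in the project:
+
+```latex
+f_x = f_y = f_{mm}\,\frac{W}{w_s} = 4.7 \times \frac{1920}{5.37} \approx 1680.45\ \text{px},
+\qquad c_x = \frac{W}{2} = 960, \quad c_y = \frac{H}{2} = 540
+
+\mathrm{HFOV} = 2\arctan\frac{w_s}{2 f_{mm}} = 2\arctan\frac{5.37}{9.4} \approx 59.5^\circ,
+\qquad
+\mathrm{VFOV} = 2\arctan\frac{h_s}{2 f_{mm}} = 2\arctan\frac{3.02}{9.4} \approx 35.6^\circ
+```
+
+Because `f_x` scales linearly with the assumed focal length, a 1–3 % error in
+the focal length carries over as the same relative error in `f_x`.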
+
+### Why factory-sheet (not checkerboard or PnP-from-tlog)
+
+* **Checkerboard**: needs physical access to the airframe + a known-geometry
+  calibration target. Not in scope for AZ-696.
+* **PnP-from-tlog back-computation**: would require a 5-point task in its own
+  right; deferred as an AZ-696 follow-up if the residual budget proves
+  insufficient.
+
+### Replay-test wiring
+
+`tests/e2e/replay/conftest.py::_calibration_path()` prefers this file when
+present and falls back to `tests/fixtures/calibration/adti26.json` otherwise,
+so dev environments that don't carry the calibration file still exercise the
+AC-1 / AC-2 / AC-5 / AC-6 paths.
@@ -0,0 +1,34 @@
+{
+  "camera_id": "khp20s30_factory",
+  "intrinsics_3x3": [
+    [1680.4469, 0.0,       960.0],
+    [0.0,       1680.4469, 540.0],
+    [0.0,       0.0,         1.0]
+  ],
+  "distortion": [0.0, 0.0, 0.0, 0.0, 0.0],
+  "body_to_camera_se3": [
+    [1.0, 0.0, 0.0, 0.0],
+    [0.0, 1.0, 0.0, 0.0],
+    [0.0, 0.0, 1.0, 0.0],
+    [0.0, 0.0, 0.0, 1.0]
+  ],
+  "acquisition_method": "factory_sheet",
+  "metadata": {
+    "model": "Topotek KHP20S30",
+    "sensor": "1/2.8\" CMOS (Sony IMX291-class), 2.13 MP",
+    "image_resolution_px": [1920, 1080],
+    "sensor_width_mm": 5.37,
+    "sensor_height_mm": 3.02,
+    "assumed_focal_length_mm": 4.7,
+    "focal_length_range_mm": [4.7, 94.0],
+    "assumed_zoom": "wide-angle (max FOV, f=4.7 mm)",
+    "computed_hfov_deg": 59.48,
+    "computed_vfov_deg": 35.62,
+    "intrinsics_formula": "fx = fy = focal_mm * (image_width_px / sensor_width_mm); cx = width/2; cy = height/2",
+    "body_to_camera_convention": "identity-down (nadir, camera-z aligned with aircraft body-z = down per FRD body frame)",
+    "residual_budget_pct": 3.0,
+    "note": "Factory-sheet approximation per AZ-702. The KHP20S30 is a 20x optical-zoom camera (f=4.7-94 mm); the wide-angle f=4.7 mm setting is assumed without per-flight EXIF confirmation. Per-unit checkerboard refinement is deferred — see _docs/00_problem/input_data/flight_derkachi/camera_info.md and the AZ-696 epic. AC-3 (<= 100 m horizontal error) may honestly fail if the assumed focal length is wrong by enough to swamp the 100 m budget at the Derkachi AGL band.",
+    "task": "AZ-702",
+    "epic": "AZ-696"
+  }
+}
@@ -0,0 +1,167 @@
+# Question Decomposition — Mode B (focused) — Video Extraction from GCS Recording
+
+> Run date: 2026-05-29. Triggered by user question on
+> `_docs/00_problem/input_data/10.05.2026/2026-05-09 16-10-54.mkv`.
+> Active mode: **Mode B** (solution_draft01.md exists). Scope of this run is
+> deliberately narrower than a full solution reassessment — it asks whether the
+> existing solution can ingest a *new representative-data class* (operator-side
+> GCS screen recordings of gimbaled multi-sensor balls) as replay fixtures, and
+> what cleanup pipeline is required.
+
+## Original question
+
+> "I have `2026-05-09 16-10-54.mkv` but it's obscured by other elements. Is it
+> possible to make out of it a proper video as from a nadir camera? What's
+> possible options?"
+
+## Research Output Class
+
+**Technical-component selection** (per SKILL.md → Research Output Class table).
+The deliverable will name specific tools (FFmpeg filters, deep video inpainting
+models, mask-aware feature extractors) that will be implemented or operated
+against. All technical-component gates apply (per-mode API verification, MVE,
+fit matrix, Restrictions × Candidate-Mode sub-matrix).
+
+## Active mode
+
+| Aspect | Value |
+|---|---|
+| Skill mode | Mode B (Solution Assessment) |
+| Existing draft | `_docs/01_solution/solution_draft01.md` (329 lines) |
+| Scope of revision | Additive — propose a new test-fixture-prep component (does **not** alter runtime pipeline) |
+| Output | `_docs/01_solution/solution_draft02.md` |
+| Working dir | `_docs/00_research/_mode_b_2026-05-29_video_extraction/` |
+
+## Question type
+
+**Decision Support** — weigh trade-offs across multiple options for converting
+an OSD-burned-in screen-recorded video into a clean nadir replay fixture.
+
+## Novelty Sensitivity
+
+**Medium**. The underlying tools (FFmpeg filters) have been stable for >15 years.
+Deep-learning video inpainting evolves rapidly (E2FGVI 2022 → ProPainter 2023 →
+VideoPainter 2025 → VidPivot 2025); version annotations required.
+
+## Project context grounding
+
+From `_docs/00_problem/`:
+- **Spec'd nav-camera (`restrictions.md`)**: ADTi 20MP 20L V1, APS-C, ~5472×3648
+  px, fixed-downward, no gimbal. The `flight_derkachi.mp4` representative
+  fixture is a Topotek KHP20S30 1/2.8" CMOS, 1920×1080, mechanically locked
+  nadir, OSD-off — already pre-cleaned.
+- **The new MKV is a different class of input**: a screen capture of a Ground
+  Control Station UI displaying a Topotek/Viewpro multi-sensor gimbal feed,
+  1280×720 30 fps H.264, ~6 m 7 s, with three layers of overlay: (a) GCS UI
+  chrome (sidebars, minimap, status bar), (b) gimbal-burned-in OSD (attitude,
+  crosshair, FOV brackets, status text, IR PIP), (c) the underlying EO video.
+- **Use-case (per user's selection)**: replay/test fixture for the runtime
+  C1/C2/C3/C4/C5 pipeline, analogous to `flight_derkachi.mp4`.
+- **Constraint (per user's selection)**: only the recorded MKV is available;
+  cannot re-record with OSD off, cannot lock gimbal nadir, cannot pull RTSP
+  stream from the camera.
+
+## Research subject boundary
+
+| Dimension | Boundary |
+|---|---|
+| Population | Single MKV file + the *class* of similar future GCS screen recordings |
+| Geography | Project's operational area (eastern/southern Ukraine) |
+| Timeframe | Cleanup tooling for legacy recordings (no live-system requirement) |
+| Operating context | Offline, developer workstation; output consumed by `tests/e2e/replay/` |
+| Required interfaces | Input: `.mkv` (any container with H.264). Output: H.264 MP4 ingestable by `flight_derkachi.mp4`-style replay path |
+| Non-functional envelope | Offline (no real-time constraint). Hardware: developer workstation (CPU+optional GPU). Output ≤ a few hundred MB per flight. |
+
+## Project Constraint Matrix (relevant subset)
+
+Extracted from `restrictions.md`, `acceptance_criteria.md`, and the Derkachi
+fixture conventions:
+
+| # | Constraint | Source | Binding for this run? |
+|---|---|---|---|
+| C1 | Replay fixtures must be ingestable by `tests/e2e/replay/test_az835_e2e_real_flight.py` (takes a `.mp4` + `.tlog` + calibration JSON) | `flight_derkachi/README.md` | **Yes** |
+| C2 | Output must NOT have synthetic content fabricated by generative models (would invalidate VPR/matching evaluation — pipeline could anchor on hallucinated features instead of real terrain) | `coderule.mdc` "Real Results, Not Simulated Ones" + `meta-rule.mdc` | **Yes** |
+| C3 | Output frame rate may differ from the spec'd 3 Hz; replay layer subsamples | Existing fixtures (`flight_derkachi.mp4` is 30 fps) | No (downstream handles) |
+| C4 | Frame-to-frame registration must succeed for >95% of normal-flight segments (AC-2.1a) — applies if and only if the cleaned fixture is treated as a normal-flight fixture | `acceptance_criteria.md` | Soft: only if frames qualify as nadir |
+| C5 | Output cannot lie about the underlying camera spec; calibration file must reflect the actual recording source (Topotek/Viewpro, not ADTi 20MP) | `flight_derkachi/camera_info.md` shows the convention is to ship a per-camera calibration JSON | **Yes** |
+| C6 | The pipeline producing fixtures should be **reproducible** (versioned scripts, pinned tool versions) so a re-run produces the same fixture | `coderule.mdc` testing principles | **Yes** |
+| C7 | Cleanup must NOT introduce false-positive features the downstream matcher could anchor on | derived from C2; specific to mask-aware vs inpaint trade-off | **Yes** |
+| C8 | Gimbaled, non-nadir frames must be either filtered out or labeled — feeding forward-looking frames into a nadir-tuned VPR will produce nonsense matches | `restrictions.md` "navigation camera fixed downward (no gimbal)" + project's level-flight assumption | **Yes** |
+
+## Sub-questions
+
+1. **SQ-1 — Layer identification**: What spatially-distinct layers are in the
+   recorded video, and which are removable by cropping vs which require active
+   removal?
+2. **SQ-2 — GCS UI chrome removal**: Best technique to remove the deterministic
+   GCS UI sidebars, minimap, status bar, IR PIP?
+3. **SQ-3 — Gimbal-burned OSD removal**: Best technique to remove burned-in
+   gimbal HUD elements (attitude ladder, crosshair, FOV brackets, status text)
+   without fabricating content the downstream matcher could anchor on?
+4. **SQ-4 — Mask-aware downstream alternative**: Can the project's existing
+   C2/C3 stack (DISK + LightGlue) consume a binary mask of OSD regions
+   directly, sidestepping the need to inpaint at all?
+5. **SQ-5 — Non-nadir frame filtering**: How to detect and exclude frames where
+   the gimbal is pointed off-nadir (the burned-in attitude ladder shows the
+   gimbal angle)?
+6. **SQ-6 — Acceptance against existing replay infrastructure**: What
+   metadata/companion-files does the new fixture need to drop into the
+   `flight_derkachi.mp4`-style replay path?
+
+## Perspectives chosen (≥3)
+
+| Perspective | Why | Sub-questions emphasized |
+|---|---|---|
+| **Implementer / Engineer** | This is fundamentally a tooling/pipeline question — the engineer building the fixture cleanup script needs concrete commands and gotchas | SQ-2, SQ-3, SQ-5 |
+| **Contrarian / Devil's advocate** | The naive "just inpaint it with AI" approach has a specific failure mode in this domain (fabricated terrain features) that must be flagged | SQ-3, SQ-4 |
+| **Domain expert / Academic** | VPR + matching algorithms have published mask-aware inference paths; the question of "do we need clean pixels or can we just signal which pixels to ignore" has a literature answer | SQ-4 |
+
+## Question Explosion (search query variants)
+
+For SQ-1 (layer identification): inspection-based, no web search.
+
+For SQ-2 (GCS UI chrome removal):
+- "FFmpeg crop filter exact pixel coordinates"
+- "FFmpeg crop video specific region command line"
+
+For SQ-3 (gimbal-burned OSD removal):
+- "FFmpeg delogo filter remove static OSD overlay video burned-in HUD"
+- "FFmpeg removelogo PNG mask filter syntax"
+- "ProPainter E2FGVI video inpainting state of the art 2025 2026 mask region"
+- "video OSD removal practitioner experience drone gimbal"
+- "temporal median filter remove static HUD OSD video keep moving content"
+- "drone gimbal video OSD removal extract clean nadir feed Topotek Viewpro"
+
+For SQ-4 (mask-aware downstream):
+- "SuperPoint LightGlue masked feature detection ignore region keypoints"
+- "DISK keypoint detector mask region of interest pytorch implementation"
+- "Kornia DISK mask parameter forward pass"
+
+For SQ-5 (non-nadir frame filtering):
+- "MAVLink MOUNT_STATUS gimbal attitude tlog parsing"
+- "OCR pitch angle text from drone HUD video frame"
+
+For SQ-6 (replay infrastructure):
+- (no web; read project docs directly)
+
+## Component Option Search Plan
+
+| Component area | Option families to cover | Required evidence to mark Selected |
+|---|---|---|
+| Frame extraction & re-encode | Simple baseline (FFmpeg `crop`), Established (FFmpeg `crop` + container remux), Open-source (FFmpeg-python wrapper) | Verified `crop` syntax against FFmpeg 8.1 docs; PoC produces playable output |
+| Static-region OSD removal | Simple (FFmpeg `delogo`), Established (FFmpeg `removelogo` with PNG mask), Open-source (Python+OpenCV inpaint per-frame), SOTA (ProPainter, VideoPainter), Adjacent (temporal-median `tmedian`/`atadenoise`), No-build (skip; pass mask downstream), Known-bad (generative models that fabricate content) | Comparison of per-region quality vs cost vs fabrication risk |
+| Mask-aware downstream matcher | The project's existing DISK + LightGlue path with a binary mask injected | Verified Kornia DISK has a `mask` parameter; verified LightGlue maintainers recommend score-map masking |
+| Non-nadir frame filtering | Tlog-based (parse `MOUNT_STATUS`/`MOUNT_ORIENTATION`), OCR-based (read burned-in pitch text), Pixel-pattern-based (detect attitude-ladder rotation), No-build (accept all frames; downstream covariance grows) | Known whether the paired `.tlog` contains gimbal attitude messages |
+| Calibration metadata | Per-camera JSON file in same form as `khp20s30_factory.json` | Topotek/Viewpro spec sheet exists; "factory_sheet" approximation acceptable per AZ-702 precedent |
+
+## Completeness Audit
+
+- ✅ **Layer identification** covered (SQ-1).
+- ✅ **Removal techniques** covered for both GCS UI (SQ-2) and gimbal OSD (SQ-3).
+- ✅ **Alternative path** considered (SQ-4 — mask-aware matchers, no inpainting).
+- ✅ **Frame relevance** covered (SQ-5 — gimbal pointing).
+- ✅ **Integration** covered (SQ-6 — replay path metadata).
+- ✅ **Contrarian view** covered (generative-AI fabrication risk).
+- 🚫 **Audio handling** — not covered; trivially answered (discard audio stream).
+- 🚫 **Frame rate normalization** — not covered; trivially answered (replay
+  layer already subsamples; preserve native 30 fps).
@@ -0,0 +1,202 @@
+# Source Registry — Mode B Video Extraction Run
+
+> All sources accessed 2026-05-29.
+
+## L1 — Official documentation / source code
+
+### #1 — FFmpeg `delogo` filter (official ffmpeg-filters-docs)
+- URL: https://ayosec.github.io/ffmpeg-filters-docs/6.0/Filters/Video/delogo.html
+- Type: L1 (mirror of official FFmpeg filter docs)
+- Tier rationale: Direct documentation of a built-in FFmpeg filter
+- Key claims: rectangular logo region, parameters `x, y, w, h, show`,
+  interpolation from immediately-outside pixels
+- Verified locally: yes — `ffmpeg -h filter=delogo` on FFmpeg 8.1 confirms the
+  parameter set (the `band` parameter present in older versions has been
+  removed in 8.1)
+
+### #2 — FFmpeg `delogo` source (`vf_delogo.c`)
+- URL: https://github.com/FFmpeg/FFmpeg/blob/master/libavfilter/vf_delogo.c
+- Type: L1 (FFmpeg upstream source)
+- Tier rationale: Authoritative implementation
+- Key claims: applies a "simple delogo algorithm" interpolating surrounding
+  pixels into the rectangular logo region
+
+### #3 — FFmpeg `removelogo` source (`vf_removelogo.c`)
+- URL: https://www.ffmpeg.org/doxygen/trunk/vf__removelogo_8c_source.html
+- Type: L1 (FFmpeg upstream source)
+- Tier rationale: Authoritative implementation
+- Key claims: bitmap-mask-based blur; "major improvement on the old delogo
+  filter"; mask must be a PNG where pixels are LOGO (white) vs source (black);
+  "only pixels in the mask that line up to pixels outside the logo are used"
+- Local note: The filter exists in FFmpeg 8.1 but rejected our PNG mask with
+  "Invalid argument" (-22); the expected mask format is likely stricter than
+  documented. The sub-matrix marks this `Verify` rather than blocking.
+
+### #4 — Topotek Gimbals on ArduPilot Copter docs
+- URL: https://ardupilot.org/copter/docs/common-topotek-gimbal.html
+- Type: L1 (ArduPilot upstream documentation)
+- Tier rationale: Direct integration documentation for the camera class shown
+  in this project's screenshots `1.jpeg`–`4.png`
+- Key claims (relevant subset):
+  - Two RTSP video streams: `rtsp://192.168.144.108:554/stream=0` (1080p) and
+    `stream=1` (480p)
+  - Configuration via "GimbalControl" Ethernet app (OSD on/off configurable)
+  - Captured images/videos retrievable from `camera/DCIM/snap` and
+    `camera/DCIM/record` over Ethernet/SMB
+- Implication for this run: The cleanest source recovery path (raw RTSP or
+  on-camera DCIM) was explicitly excluded by the user's "only have this MKV"
+  constraint, but is recorded here as the recommended Option Z for any future
+  recordings.
+
+### #5 — LightGlue maintainer guidance on mask injection (cvg/LightGlue#97)
+- URL: https://github.com/cvg/LightGlue/issues/97
+- Type: L1 (issue answered by repo maintainer @Phil26AT, an author)
+- Tier rationale: Direct from the project that this codebase already uses
+  (per `solution_draft01.md` C3 component)
+- Key claims:
+  - SuperPoint does **not** natively accept a mask in its forward pass
+  - Two recommended workarounds: (a) extract all keypoints, then filter by
+    mask post-hoc, or (b) multiply the SuperPoint score map by a binary mask
+    before NMS
+  - Maintainer comment: "(b) you would get more points in the specified area,
+    and thus more matches"
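+
+  Workaround (a) can be sketched in plain Python as below. This is a minimal
+  illustration, not the project's matcher code: `build_keep_mask`,
+  `filter_keypoints`, the 8×6 frame, and the rectangle coordinates are all
+  hypothetical, and keypoints are assumed to be `(x, y)` pixel tuples.
+
+```python
+def build_keep_mask(width, height, osd_rects):
+    """Return a height x width grid: 1 = usable pixel, 0 = covered by OSD."""
+    mask = [[1] * width for _ in range(height)]
+    for x, y, w, h in osd_rects:
+        # Clamp each rectangle to the frame before zeroing it out.
+        for row in range(max(0, y), min(height, y + h)):
+            for col in range(max(0, x), min(width, x + w)):
+                mask[row][col] = 0
+    return mask
+
+
+def filter_keypoints(keypoints, mask):
+    """Drop keypoints that fall outside the frame or on a masked (0) pixel."""
+    height, width = len(mask), len(mask[0])
+    kept = []
+    for x, y in keypoints:
+        col, row = int(round(x)), int(round(y))
+        if 0 <= col < width and 0 <= row < height and mask[row][col] == 1:
+            kept.append((x, y))
+    return kept
+
+
+# Hypothetical 8x6 frame with one OSD rectangle covering x=2..4, y=1..2.
+mask = build_keep_mask(8, 6, [(2, 1, 3, 2)])
+keypoints = [(0.0, 0.0), (3.2, 1.4), (6.0, 5.0), (9.0, 1.0)]
+print(filter_keypoints(keypoints, mask))
+```
+
+  Filtering after extraction leaves the detector untouched, at the cost of
+  spending part of the keypoint budget on points that are then discarded.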
+
+### #6 — Kornia `DISK.forward(img, mask=None)` API (Kornia docs)
+- URL: https://kornia.readthedocs.io/en/latest/feature.html
+- Type: L1 (Kornia official documentation)
+- Tier rationale: Authoritative for the Kornia DISK wrapper; relevant because
+  DISK is the project's chosen C3 detector per `solution_draft01.md`
+- Key claims:
+  - `kornia.feature.DISK.forward(img, mask=None)` accepts `mask` as
+    `(B, 1, H, W)` with values in `[0, 1]`
+  - "the score map is multiplied by this mask before keypoint detection so
+    that features are suppressed in masked regions"
+- Implication: **the project's existing C3 stack is already mask-capable**.
+  This makes Option B (mask-aware downstream, no inpainting) the lowest-risk
+  high-quality path.
+
+### #7 — DISK upstream source (`disk/model/disk.py`)
+- URL: https://github.com/cvlab-epfl/disk/blob/master/disk/model/disk.py
+- Type: L1 (DISK upstream)
+- Tier rationale: Authoritative for DISK semantics
+- Key claims: DISK produces a per-pixel `heatmap` of detection scores;
+  multiplying this by a spatial mask before NMS / sampling is the canonical
+  way to restrict detection to a region
+
+### #8 — FFmpeg `tmedian` filter (built-in)
+- URL: https://ffmpeg.org/ffmpeg-filters.html#tmedian
+- Type: L1 (FFmpeg official filter docs)
+- Tier rationale: Authoritative
+- Key claims: `tmedian` computes per-pixel temporal median over a configurable
+  radius window; built into recent FFmpeg
+
+### #9 — `flight_derkachi/README.md` (project's existing fixture convention)
+- URL: `_docs/00_problem/input_data/flight_derkachi/README.md` (in-repo)
+- Type: L1 (project documentation)
+- Key claims:
+  - Replay fixture is 880×720 H.264 30 fps MP4 with paired `.tlog`-derived
+    `data_imu.csv` and per-camera calibration JSON
+  - The MP4 is a "cleaned/cropped replay fixture rather than the raw camera
+    feed"
+  - "the rotating camera was mechanically fixed in a downward/nadir orientation"
+- Implication: the new MKV-derived fixture should match the same shape
+  (cleaned/cropped MP4 + calibration JSON + telemetry CSV)
+
+### #10 — `flight_derkachi/camera_info.md`
+- URL: `_docs/00_problem/input_data/flight_derkachi/camera_info.md` (in-repo)
+- Type: L1 (project documentation)
+- Key claims:
+  - Derkachi camera: Topotek KHP20S30, 1/2.8" CMOS, 1920×1080
+  - Calibration via "factory_sheet" approximation (AZ-702) is project-accepted
+    when a checkerboard calibration isn't possible; the same approach applies
+    to the new gimbal
+
+## L2 — Peer-reviewed papers / preprints
+
+### #11 — ProPainter (ICCV 2023)
+- URL: https://shangchenzhou.com/projects/ProPainter/
+- Date accessed: 2026-05-29
+- Type: L2 (peer-reviewed conference paper, project page)
+- Tier rationale: ICCV 2023 paper; SOTA (at publication) non-generative video
+  inpainting baseline
+- Key claims:
+  - Recurrent flow completion + dual-domain (image+feature) propagation +
+    mask-guided sparse Transformer
+  - 808G FLOPs/10 frames at 480p; 0.249 s/frame on undisclosed GPU
+  - +1.46 dB PSNR vs prior SOTA
+- Relevance: Baseline option for offline OSD inpainting; non-generative means
+  it propagates pixels from neighboring frames (no fabricated content) — this
+  is the property our project requires.
+
+### #12 — VideoPainter (arXiv 2503.05639, 2025)
+- URL: https://arxiv.org/html/2503.05639v3
+- Type: L2 (arXiv preprint)
+- Tier rationale: Most recent generative video inpainting (2025)
+- Key claims:
+  - Generative dual-branch architecture
+  - Outperforms ProPainter on segmentation-based VPBench
+  - **Critical caveat for our use case**: explicitly described as a
+    *generative* model that synthesizes fully-masked-object content
+- Implication: **Disqualified for our use case**. Synthesized terrain features
+  would corrupt VPR/matching evaluation (project's `meta-rule.mdc` "Real
+  Results, Not Simulated Ones").
+
+### #13 — VidPivot / DiffuEraser comparison (arXiv 2510.21461, 2025)
+- URL: https://arxiv.org/html/2510.21461v2
+- Type: L2 (arXiv preprint)
+- Key claims: cross-comparison between ProPainter, DiffuEraser, VideoPainter,
+  VidPivot on object removal; ProPainter "effectively removes the target
+  region but struggles to generate semantically consistent content"
+- Implication: confirms ProPainter is the best non-generative option;
+  generative variants share the fabrication risk.
+
+### #14 — DISK paper (NeurIPS 2020, arXiv 2006.13566)
+- URL: https://arxiv.org/abs/2006.13566
+- Type: L2 (peer-reviewed)
+- Key claims: DISK is RL-trained; produces a dense heatmap; trains on
+  homographies
+- Relevance: confirms DISK exposes a heatmap that can be multiplied by a
+  spatial mask before keypoint sampling
+
+## L3 — Practitioner / blog / community
+
+### #15 — "Removing obnoxious logos from videos" (Domain of the Technomancer blog)
+- URL: https://www.technomancer.com/archives/248
+- Type: L3 (practitioner blog)
+- Key claims: practitioner walkthrough of FFmpeg `delogo`+`removelogo`,
+  including the workflow of building a PNG mask from a single frame screenshot
+
+### #16 — Conditional Temporal Median Filter (kevina.org)
+- URL: http://www.kevina.org/temporal_median/
+- Type: L3 (older practitioner page; methodology still cited)
+- Key claims: motion-conditional temporal median — apply median only where
+  motion is below threshold, preserves moving content while suppressing
+  static artifacts
+- Relevance: the "static OSD on moving video" use case maps directly to this
+  filter family. However, in our test the burned-in OSD is *also moving*
+  visually because text values change every frame, so motion-conditional
+  median has limitations.
+
+### #17 — Foundry Nuke `TemporalMedian` reference
+- URL: https://learn.foundry.com/nuke/content/reference_guide/time_nodes/temporalmedian.html
+- Type: L3 (commercial-tool documentation)
+- Key claims: Nuke's `TemporalMedian` exposes a mask channel; effect can be
+  limited to the masked region only — same pattern that FFmpeg `tmedian` lacks
+  natively
+
+## In-repo cross-references (project artifacts)
+
+### #R1 — `_docs/01_solution/solution_draft01.md`
+- C2 component: MixVPR (TensorRT, INT8+FP16) for retrieval
+- C3 component: DISK + LightGlue for matching
+- C5 component: GTSAM iSAM2 + CombinedImuFactor
+- The pipeline does not have a "data ingestion / fixture-prep" component —
+  this is the gap this run addresses.
+
+### #R2 — `_docs/00_research/06_component_fit_matrix/00_summary.md`
+- Lists every component in the existing solution with selection status
+- Confirms no fixture-cleanup component exists
+
+### #R3 — `_docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json`
+- Existing per-camera calibration JSON convention; new gimbal needs an
+  equivalent
@@ -0,0 +1,283 @@
+# Fact Cards — Mode B Video Extraction Run
+
+> Confidence symbols: ✅ High (L1 official) — ⚠️ Medium (L2 academic / official
+> blog) — ❓ Low (L3 practitioner / inference)
+
+## Layer characterization (from local pixel-variance analysis)
+
+### Fact #1 — Three independent overlay layers
+- **Statement**: The recorded `2026-05-09 16-10-54.mkv` (1280×720 H.264 30 fps,
+  6 m 7 s) contains three spatially-overlapping layers: (a) GCS UI chrome
+  rendered as fixed pixel rectangles by the operator's GCS application,
+  (b) gimbal-burned-in OSD rendered upstream of the recorder by the camera
+  itself (attitude ladder, crosshair, FOV brackets, status text, IR
+  picture-in-picture), (c) the underlying EO video.
+- **Source**: Local 12-frame variance analysis (`/tmp/nadir_research/`),
+  extracted frames at t=10,30,60,90,120,150,180,210,240,270,300,330 s
+- **Confidence**: ✅ High (direct measurement)
+- **Related Dimension**: SQ-1 (layer identification)
+- **Fit Impact**: Establishes the action space — each layer needs its own
+  removal/handling strategy
+
+### Fact #2 — IR PIP is itself a live video stream, not a static element
+- **Statement**: The picture-in-picture in the upper-right (~x=720–1080,
+  y=25–235) has 85% dynamic-pixel fraction across the 12 sample frames,
+  consistent with a live IR/thermal video feed, not a static UI element.
+- **Source**: Local variance analysis
+- **Confidence**: ✅ High
+- **Fit Impact**: Cannot be ignored as "noise". Either crop it out
+  geometrically or treat as an opaque rectangle in the OSD mask.
+
+### Fact #3 — GCS UI sidebars contain live values, not pure-static chrome
+- **Statement**: Left sidebar (SL STATS panel) and right sidebar (ROLL/SPEED/
+  DIST/BATT/CURRENT) have mean per-pixel std ≈30–40 across frames, comparable
+  to the actual EO video region. They are pixel-deterministic — same fixed
+  positions on every frame — but the *values* update.
+- **Source**: Local variance analysis
+- **Confidence**: ✅ High
+- **Fit Impact**: Pure geometric crop removes them entirely; no need to
+  inpaint. Easy.
+
+### Fact #4 — Gimbal HUD text is *also* dynamic-content text on top of moving video
+- **Statement**: The top-left HUD block (`00:00/00`, timestamps, EO/IR zoom,
+  FOV) and bottom-right gimbal text show high std (≈39–40), because both the
+  HUD values change AND the underlying video changes. The HUD is rendered
+  upstream by the camera and is **always at the same screen position**.
+- **Source**: Local variance analysis + visual inspection of frames
+- **Confidence**: ✅ High
+- **Fit Impact**: Position-deterministic but content-dynamic. Inpainting must
+  either propagate from neighboring frames (temporal) or from spatially
+  adjacent pixels (FFmpeg `delogo`).
+
+### Fact #5 — Frame at t=30 s shows gimbal pointed forward (horizon visible), frame at t=300 s shows nadir
+- **Statement**: The gimbal is operator-pointable; not all frames are nadir.
+  Burned-in attitude indicator shows pitch numbers from `-3.7°` (near level)
+  to clearly off-nadir values. The aircraft also appears to be a multirotor
+  (frame at t=300 s shows DIST=17.0 m at low altitude, inconsistent with
+  fixed-wing 1 km AGL).
+- **Source**: Direct visual inspection of `f_030.png` and `f_300.png`
+- **Confidence**: ✅ High (visual)
+- **Fit Impact**: Frame-level filtering required before treating output as a
+  nadir fixture. The replay pipeline tuned for nadir-only would mis-handle
+  forward-looking frames.
+
+## FFmpeg techniques
+
+### Fact #6 — FFmpeg `crop` is a pixel-level deterministic geometric crop
+- **Statement**: `crop=W:H:X:Y` extracts a sub-region at arbitrary integer
+  coordinates. FFmpeg cannot combine filtering with stream copy (`-c:v copy`),
+  so the cropped output must always be re-encoded.
+- **Source**: Source #1 + locally tested (PoC1 in `/tmp/nadir_research/`)
+- **Confidence**: ✅ High
+- **Related Dimension**: SQ-2 (GCS chrome removal)
+- **Fit Impact**: Trivially implements the entire GCS-chrome-removal step.
+
+### Fact #7 — FFmpeg `delogo` replaces a rectangle with interpolation from neighboring pixels
+- **Statement**: `delogo=x=X:y=Y:w=W:h=H` interpolates from the immediately-
+  outside pixels of the rectangle. In FFmpeg 8.1 the `band` parameter has been
+  removed; only `x, y, w, h, show` remain. The filter is timeline-enabled
+  (can be activated only on certain frames via `enable=` expression).
+- **Source**: Source #1 + Source #2 + locally verified (`ffmpeg -h
+  filter=delogo` on FFmpeg 8.1)
+- **Confidence**: ✅ High
+- **Related Dimension**: SQ-3 (gimbal OSD removal)
+- **Fit Impact**: Cheap, deterministic, works for small rectangles. Quality
+  degrades for large rectangles or when the region's interior is full of
+  texture (e.g., text on grass).
+- **Caveat**: Cannot place the rectangle touching the image edge — there are
+  no surrounding pixels to interpolate from.
+
+### Fact #8 — Multiple `delogo` filters can be chained via comma
+- **Statement**: A filter graph like
+  `crop=W:H:X:Y,delogo=...,delogo=...,delogo=...` chains successive `delogo`
+  passes, each operating on the output of the previous.
+- **Source**: Locally verified (PoC4 produced `poc4_delogo.mp4` via 3 chained
+  `delogo` filters after `crop`)
+- **Confidence**: ✅ High (direct test)
+- **Fit Impact**: Practical recipe for removing the 5–6 burned-OSD regions in
+  this video.
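+
+A minimal sketch of how such a chain could be assembled programmatically
+instead of hand-typed. `Rect` and `build_filter_chain` are hypothetical
+helper names, not part of FFmpeg or the project; the rectangle values mirror
+the PoC4 command. The helper only formats the `-vf` string and rejects
+rectangles that touch the cropped frame's edge (the Fact #7 caveat):
+
```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def build_filter_chain(crop_w: int, crop_h: int, crop_x: int, crop_y: int,
                       osd_rects: list[Rect]) -> str:
    """Format a `crop` followed by one `delogo` per OSD rectangle.

    Rectangle coordinates are relative to the cropped frame, because each
    filter operates on the output of the previous one.
    """
    parts = [f"crop={crop_w}:{crop_h}:{crop_x}:{crop_y}"]
    for r in osd_rects:
        # delogo interpolates from pixels just outside the rectangle, so the
        # rectangle must leave at least one pixel of margin on every side.
        if r.x < 1 or r.y < 1 or r.x + r.w >= crop_w or r.y + r.h >= crop_h:
            raise ValueError(f"{r} touches the edge of the {crop_w}x{crop_h} frame")
        parts.append(f"delogo=x={r.x}:y={r.y}:w={r.w}:h={r.h}")
    return ",".join(parts)


# Rectangles taken from the PoC4 command; hypothetical for any other recording.
POC4_RECTS = [Rect(5, 35, 180, 115), Rect(395, 5, 275, 70), Rect(130, 265, 690, 50)]
vf = build_filter_chain(900, 445, 50, 25, POC4_RECTS)
print(vf)
```
+
+The resulting string can be passed to `ffmpeg -vf`. Keeping the rectangles
+as data rather than as a literal string makes it easier to swap in a
+different OSD layout.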
+
+### Fact #9 — FFmpeg `removelogo` accepts a PNG mask but is fragile in FFmpeg 8.1
+- **Statement**: `removelogo=mask.png` should accept a PNG where black=clean,
+  white=logo. In our local FFmpeg 8.1 tests it failed with `Invalid argument`
+  (`-22`) on both grayscale and RGB masks of the correct dimensions.
+  Documentation (Source #3) suggests strict requirements on the mask format
+  that FFmpeg 8.1 enforces but does not document clearly. Practitioner
+  walkthroughs (Source #15) used the filter successfully on older FFmpeg.
+- **Source**: Source #3 + Source #15 + local test failure
+- **Confidence**: ⚠️ Medium (works in principle, version-dependent in practice)
+- **Fit Impact**: Use chained `delogo` instead, or use a per-frame OpenCV
+  inpaint script if `removelogo` cannot be made to work on the team's pinned
+  FFmpeg version.
+
+### Fact #10 — FFmpeg `tmedian` computes per-pixel temporal median over a window
+- **Statement**: `tmedian=radius=N` outputs each pixel as the median of pixels
+  at the same coordinates over the window of `2N+1` frames. For a moving
+  camera over rich terrain, the underlying scene changes every frame so the
+  temporal median tends to wash out — producing motion-blur-like
+  ghosting rather than clean output.
+- **Source**: Source #8 + locally tested (PoC3 produced
+  `poc3_crop_tmedian.mp4`)
+- **Confidence**: ✅ High (direct test)
+- **Fit Impact**: **Not suitable** for our case — both the OSD values and the
+  underlying video change every frame, so temporal median produces ghosted
+  output that's worse for downstream matching than the original OSD-laden
+  frames.
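+
+The effect can be reproduced on a toy clip. The sketch below is a NumPy
+illustration of a per-pixel temporal median, not FFmpeg's implementation;
+`temporal_median` is a hypothetical helper, and its handling of the first
+and last frames (they are simply dropped) is an assumption that may differ
+from `tmedian`. It shows that a feature occupying a given pixel in only one
+frame of the window does not survive the median:
+
```python
import numpy as np


def temporal_median(frames: np.ndarray, radius: int) -> np.ndarray:
    """Per-pixel median over a centred window of 2*radius + 1 frames.

    frames has shape (T, H, W). Only frames with a full window are returned,
    so the output has T - 2*radius frames.
    """
    window = 2 * radius + 1
    if frames.shape[0] < window:
        raise ValueError("need at least 2*radius + 1 frames")
    out = [np.median(frames[t:t + window], axis=0)
           for t in range(frames.shape[0] - window + 1)]
    return np.stack(out)


# Toy clip: a bright 1-pixel feature moves one column per frame,
# standing in for terrain sliding under a moving camera.
T, H, W = 7, 1, 7
clip = np.zeros((T, H, W), dtype=np.float32)
for t in range(T):
    clip[t, 0, t] = 1.0

filtered = temporal_median(clip, radius=1)
print(filtered.shape)   # frames with a full 3-frame window
print(filtered.max())   # brightest value left after filtering
```
+
+In this toy case the moving feature is removed entirely, which is the
+opposite of what a feature matcher needs.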
+
+## Deep-learning video inpainting
+
+### Fact #11 — ProPainter is the SOTA non-generative video inpainter (as of late 2023)
+- **Statement**: ProPainter (Zhou et al., ICCV 2023) uses recurrent flow
+  completion + dual-domain propagation (image and feature) + mask-guided
+  sparse Transformer. Explicitly described as non-generative — it propagates
+  pixels from non-masked frames rather than synthesizing new content.
+  ~0.249 s/frame at 480p, 808G FLOPs/10 frames.
+- **Source**: #11 (ProPainter project page)
+- **Confidence**: ⚠️ Medium (paper claims; per-deployment runtime varies)
+- **Related Dimension**: SQ-3 (gimbal OSD removal, high-quality option)
+- **Fit Impact**: Highest-quality option for OSD removal that respects the
+  "no fabrication" constraint. Cost: GPU + Python toolchain; offline-only.
+
+### Fact #12 — VideoPainter and successors are *generative* and DISQUALIFIED for our use case
+- **Statement**: VideoPainter (2025), DiffuEraser (2025), VidPivot (2025),
+  OmniPainter use I2V or diffusion backbones to *synthesize* content for fully
+  masked regions. They produce more visually pleasing output than ProPainter
+  but the synthesized content is **not** a faithful representation of the real
+  underlying scene.
+- **Source**: #12 + #13
+- **Confidence**: ✅ High (explicit in the papers)
+- **Related Dimension**: SQ-3
+- **Fit Impact**: **Disqualifier**. Project rule (`meta-rule.mdc` "Real
+  Results, Not Simulated Ones"): a fixture that fabricates terrain features
+  the matcher might anchor on is worse than no fixture. Status: `Rejected`.
+
+## Mask-aware downstream
+
+### Fact #13 — Kornia's `DISK.forward()` accepts a binary mask natively
+- **Statement**: `kornia.feature.DISK.forward(img, mask=None)` takes a mask
+  argument of shape `(B, 1, H, W)` with values in `[0, 1]`. The score map is
+  multiplied by this mask before keypoint detection — keypoints in masked
+  regions are suppressed by construction, with no preprocessing of pixels.
+- **Source**: #6 (Kornia docs L1)
+- **Confidence**: ✅ High
+- **Related Dimension**: SQ-4 (mask-aware downstream)
+- **Fit Impact**: **Lowest-risk highest-quality option**. The project's chosen
+  C3 detector (DISK per `solution_draft01.md`) already supports mask injection
+  out of the box — *no video preprocessing required* beyond the deterministic
+  GCS-chrome crop.
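+
+The suppression mechanism described in the Statement can be illustrated
+without any detector at all. The following is a toy NumPy sketch of
+"multiply the score map by a keep-mask, then pick the top scores"; it is not
+Kornia's code, and `masked_topk_keypoints` is a hypothetical name used only
+for illustration:
+
```python
import numpy as np


def masked_topk_keypoints(scores: np.ndarray, keep_mask: np.ndarray, k: int):
    """Return up to k (row, col) positions with the highest masked score.

    keep_mask is 1 where keypoints are allowed and 0 where they are suppressed.
    """
    masked = scores * keep_mask                      # suppressed pixels get score 0
    flat = np.argsort(masked, axis=None)[::-1][:k]   # indices of the k largest scores
    rows, cols = np.unravel_index(flat, masked.shape)
    # Drop zero-score positions so suppressed pixels are never returned.
    return [(int(r), int(c)) for r, c in zip(rows, cols) if masked[r, c] > 0]


scores = np.array([[0.9, 0.1, 0.2],
                   [0.3, 0.8, 0.1],
                   [0.2, 0.1, 0.7]])
keep = np.ones_like(scores)
keep[0, 0] = 0.0    # pretend (0, 0) is an OSD pixel with a strong response

keypoints = masked_topk_keypoints(scores, keep, k=2)
print(keypoints)
```
+
+The strongest raw response sits on the "OSD" pixel, yet it never appears in
+the result, and the pixel values themselves are untouched.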
+
+### Fact #14 — LightGlue's matching layer needs no mask; suppression at detect time is sufficient
+- **Statement**: LightGlue's authors recommend (issue #97, by maintainer
+  Phil26AT) suppressing keypoints at detect time via score-map masking; once
+  no keypoints are produced in the masked region, LightGlue has nothing to
+  match there.
+- **Source**: #5 (LightGlue issue, maintainer reply)
+- **Confidence**: ✅ High
+- **Related Dimension**: SQ-4
+- **Fit Impact**: Confirms Option B is feasible end-to-end with the existing
+  C3 stack.
+
+## Source recovery (informational; ruled out by user)
+
+### Fact #15 — Topotek/Viewpro multi-sensor balls expose RTSP and DCIM directly
+- **Statement**: Topotek camera class (per ArduPilot integration docs) exposes
+  two RTSP streams (`rtsp://192.168.144.108:554/stream=0` 1080p,
+  `stream=1` 480p) and on-camera recordings retrievable via Ethernet/SMB at
+  `camera/DCIM/snap` and `camera/DCIM/record`. OSD overlays can be disabled
+  via the GimbalControl Ethernet utility.
+- **Source**: #4
+- **Confidence**: ✅ High
+- **Fit Impact**: For *future* recordings this is the dominant path
+  (no cleanup needed). Out of scope for the current MKV per user constraint
+  but recorded as Option Z in the comparison framework.
+
+## Project-context inheritance
+
+### Fact #16 — `flight_derkachi.mp4` is the existing reference fixture shape
+- **Statement**: Existing replay fixture is 880×720 H.264 30 fps MP4, paired
+  with `data_imu.csv` (10 Hz from `.tlog`) and per-camera calibration JSON
+  (`khp20s30_factory.json`). The MP4 is described as "cleaned/cropped replay
+  fixture rather than the raw camera feed" with the "rotating camera
+  mechanically fixed in a downward/nadir orientation".
+- **Source**: #9 + #10
+- **Confidence**: ✅ High
+- **Fit Impact**: New fixture must match this structure to drop into the
+  existing `tests/e2e/replay/test_az835_e2e_real_flight.py` harness.
+
+### Fact #17 — A "factory_sheet" calibration approximation is project-accepted when checkerboard isn't possible
+- **Statement**: The Derkachi calibration was sourced via "factory_sheet"
+  approximation (AZ-702) since per-unit checkerboard refinement was deferred
+  for lack of hardware access. Residual focal-length error expected in 1–3%
+  band. Project acknowledges this is the cheapest acceptable starting point.
+- **Source**: #10 (`camera_info.md`)
+- **Confidence**: ✅ High
+- **Fit Impact**: A new calibration JSON for the Topotek/Viewpro multi-sensor
+  ball can use the same approach — published spec sheet → focal length, FOV,
+  pixel size approximations, marked `factory_sheet` source.
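+
+As a rough illustration of the spec-sheet route, the pinhole relation
+`fx = (W / 2) / tan(HFOV / 2)` turns a published horizontal FOV into a focal
+length in pixels. The sketch below assumes an ideal pinhole model with no
+distortion; the 1920 px width and 60° HFOV are placeholder numbers, not
+values from any Topotek/Viewpro data sheet, and both helper names are
+hypothetical:
+
```python
import math


def focal_length_px(image_width_px: int, hfov_deg: float) -> float:
    """Pinhole focal length in pixels from image width and horizontal FOV."""
    return (image_width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)


def focal_length_band(fx: float, rel_error: float) -> tuple[float, float]:
    """Lower and upper focal length for a given relative error (0.03 = 3%)."""
    return fx * (1.0 - rel_error), fx * (1.0 + rel_error)


# Placeholder spec-sheet values, for illustration only.
fx = focal_length_px(image_width_px=1920, hfov_deg=60.0)
low, high = focal_length_band(fx, rel_error=0.03)
print(f"fx = {fx:.1f} px, 3% band = [{low:.1f}, {high:.1f}] px")
```
+
+The band helper mirrors the 1–3% residual error quoted above and gives a
+quick sense of how far the approximated focal length may sit from the true
+value.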
+
+### Fact #18 — Existing solution has no "data ingestion / fixture-prep" component
+- **Statement**: Components C1 (VIO) through C12 (build cache orchestrator) in
+  `solution_draft01.md` cover runtime + pre-flight + deploy concerns but do
+  not include a fixture-cleanup or data-ingestion component. Fixtures appear
+  in the `tests/e2e/replay/` infrastructure as already-cleaned MP4s.
+- **Source**: #R1 + #R2
+- **Confidence**: ✅ High
+- **Fit Impact**: This is the *gap* the Mode B revision addresses. The new
+  fixture-prep component does not modify the runtime; it adds a developer
+  tool under `tools/` or `tests/fixtures/` that produces fixtures consumable
+  by the existing replay path.
+
+## API Capability Verification — applied to lead candidates
+
+This section is mandatory per SKILL.md → Step 2 → API Capability Verification.
+
+### MVE — Kornia DISK in mask-aware mode
+
+- **Source**: Source #6 (Kornia docs, accessed 2026-05-29)
+- **Inputs in the docs example**: `img` of shape `(B, C, H, W)`, `mask` of
+  shape `(B, 1, H, W)` with values in `[0, 1]`
+- **Outputs in the example**: list of `Features` (keypoints + descriptors)
+  with no keypoints in masked regions
+- **Project inputs**: 1 image (`B=1`), `mask` derived once from a static OSD
+  layout, applied per-frame
+- **Project outputs required**: keypoints + descriptors that can be passed
+  into LightGlue (the project's existing C3.2 component)
+- **Match assessment**: ✅ exact match — Kornia DISK is the same library the
+  existing solution uses; the mask path is documented and exercised by Kornia
+  tests
+- **MVE code (project's expected use):**
+  ```python
+  import torch, kornia.feature as KF
+  from PIL import Image
+  import numpy as np
+
+  disk = KF.DISK.from_pretrained("depth").eval()
+  # osd_mask.png: white (255) over OSD pixels, black (0) over terrain
+  mask_np = np.asarray(Image.open("osd_mask.png").convert("L")) / 255.0
+  # mask: 1 where keep, 0 where suppress (matches Kornia semantics)
+  mask = torch.from_numpy((mask_np < 0.5).astype("float32"))[None, None]
+  img = ...  # (1, 3, H, W) float tensor for the current frame
+  feats = disk(img, mask=mask, n=2048)
+  ```
+
+### MVE — FFmpeg crop + chained delogo (project's primary cleanup path)
+
+- **Source**: Source #1 (FFmpeg delogo docs) + local PoC4
+- **Inputs in our test**: `2026-05-09 16-10-54.mkv` (1280×720 H.264 30 fps)
+- **Outputs in our test**: `poc4_delogo.mp4` (900×445 H.264 30 fps with three
+  burned-OSD rectangles overwritten by interpolated pixels)
+- **Project inputs**: matches
+- **Project outputs required**: a file the replay harness can consume
+- **Match assessment**: ✅ exact match — local PoC produced a valid playable
+  output, dimensions match the existing fixture convention class
+  (sub-1080p H.264 MP4)
+- **MVE command:**
+  ```bash
+  ffmpeg -i input.mkv \
+    -vf "crop=900:445:50:25,delogo=x=5:y=35:w=180:h=115,delogo=x=395:y=5:w=275:h=70,delogo=x=130:y=265:w=690:h=50" \
+    -an -c:v libx264 -crf 18 fixture.mp4
+  ```
+
+### Skipped — VideoPainter / DiffuEraser / VidPivot
+- These candidates are rejected on the fabrication-risk disqualifier (Fact
+  #12), not on API capability. No MVE built; not progressing to Step 7.5
+  Selected status.
@@ -0,0 +1,88 @@
+# Comparison Framework — Video Extraction Options
+
+## Selected Framework Type
+
+**Decision Support** — multiple candidates, weighted on cost vs quality vs
+risk, with the goal of selecting the best path (or composition of paths) for
+the project's replay-fixture use case.
+
+## Selected Dimensions
+
+1. **Output fidelity** — Are the underlying terrain pixels preserved
+   verbatim, or modified/synthesized?
+2. **Fabrication risk** — Could the technique introduce features the
+   downstream matcher could anchor on but that don't exist in reality?
+   (Project's "Real Results, Not Simulated Ones" rule.)
+3. **Pixel coverage** — How much of the original EO video region is usable
+   in the output?
+4. **Cost & complexity** — Lines of code, dependencies, runtime per frame,
+   GPU required?
+5. **Reproducibility** — Same input → same output across runs, machines, and
+   time?
+6. **Project-pipeline integration cost** — How much of the existing C2/C3
+   pipeline needs to change to consume the output?
+7. **Coverage of layers** — Which of the three layers (GCS chrome /
+   gimbal-burned OSD / IR PIP) does the technique address?
+8. **Per-frame gimbal-pointing handling** — Does the technique help filter
+   non-nadir frames?
+
+## Initial Population — Option Matrix
+
+> Notation: ✅ ideal — ✓ acceptable — ⚠️ caveat — ❌ disqualifier
+> Pixel coverage is in % of the 1280×720 original (1280×720 = 921 600 px)
+
+| # | Option | Output fidelity | Fabrication risk | Pixel coverage | Cost & complexity | Reproducibility | C2/C3 integration cost | Layer coverage | Non-nadir filtering |
+|---|---|---|---|---|---|---|---|---|---|
+| **A** | **Crop only** (FFmpeg `crop`) | ✅ Verbatim | ✅ None | ⚠️ ~42% (740×525 = 388 500 px after removing chrome+IR-PIP+minimap; ~70% of EO area) | ✅ Trivial (one filter) | ✅ Bit-deterministic | ✅ Zero changes | GCS chrome: ✅ — Gimbal OSD: ❌ remains burned in — IR PIP: ✅ excluded by tight crop | ❌ No |
+| **B** | **Crop + mask-aware DISK** (Fact #13) | ✅ Verbatim | ✅ None | ✅ ~80% of EO area (mask only suppresses keypoints in OSD pixels, pixels themselves are unchanged) | ✓ Trivial pipeline change: pass `osd_mask.png` to DISK forward call; one-time mask build | ✅ Mask is a static PNG | ⚠️ One-line C3 code change to pass `mask=` parameter | GCS chrome: ✅ — Gimbal OSD: ✅ via score-map suppression — IR PIP: ✅ via mask | ❌ No (orthogonal concern) |
+| **C** | **Crop + chained `delogo`** (Fact #7, #8) | ✓ Mostly verbatim, OSD regions are interpolated from neighbor pixels | ✓ Low — interpolation produces blurry but plausible content; could create weak features but no semantic terrain hallucination | ✅ ~85% (interpolation fills the OSD region) | ✓ Cheap (one FFmpeg invocation, ~5 chained filters) | ✅ Bit-deterministic | ✅ Zero changes (output is plain MP4) | GCS chrome: ✅ — Gimbal OSD: ✓ each OSD region passed to a separate `delogo` — IR PIP: ⚠️ too large for `delogo`, must crop or use removelogo | ❌ No |
+| **D** | **Crop + `removelogo` PNG mask** (Fact #9) | ✓ Mostly verbatim, mask-shaped blur fills OSD regions | ✓ Low (same blur-based approach as `delogo`) | ✅ ~85% | ⚠️ Cheap but version-fragile in our tests on FFmpeg 8.1 (failed); more reliable on older FFmpeg | ✅ if it works on the target version | ✅ Zero changes | All layers via single mask | ❌ No |
+| **E** | **Crop + ProPainter video inpainting** (Fact #11) | ✓ Verbatim where possible, propagated from non-masked frames where occluded | ✓ Low — non-generative, propagation-based; but if the OSD covers the same scene region for many frames the propagation may guess | ✅ ~85% | ❌ Expensive: GPU required; ~0.25 s/frame at 480p, scales with resolution; Python toolchain (PyTorch + custom build) | ✓ Reproducible if model weights pinned | ✅ Zero changes (output is plain MP4) | All layers if mask covers them | ❌ No |
+| **F** | **Crop + temporal-median (`tmedian`)** (Fact #10) | ❌ Smeared — both OSD and underlying scene change per frame; median washes both | High risk: smeared output may produce false features OR suppress real ones | ⚠️ Coverage is full but quality is degraded everywhere | ✓ Cheap | ✅ | ✅ | All if motion is right; **doesn't work for our case** because OSD values *also* change per frame | ❌ No |
+| **G** | **Crop + generative video inpainting (VideoPainter et al.)** (Fact #12) | ❌ Synthesized | ❌❌ **High** — fabricates terrain features that don't exist | ✅ ~85% | ❌ Very expensive: SOTA generative VIs require multi-GB models on H100-class GPUs | ✓ but content is non-deterministic across runs (unless seed pinned) | ✅ output is plain MP4 | All layers | ❌ No |
+| **H** | **Per-frame OpenCV navier-stokes / telea inpaint** (with the same OSD mask) | ✓ Verbatim where possible, deterministic non-generative inpaint | ✓ Low | ✅ ~85% | ✓ Cheap (Python + OpenCV); slower than FFmpeg but trivial code | ✅ | ✅ output is plain MP4 | All layers | ❌ No |
+| **I** | **Tlog-based gimbal-attitude filter** (orthogonal, applied to A/B/C) | n/a — filtering only | n/a | Reduces output to nadir-band frames only | ✓ Cheap if `MOUNT_STATUS`/`MOUNT_ORIENTATION` is in the paired `2026-05-09 16-09-54.tlog` | ✅ | ✓ stand-alone tool that drops frames before encoding | n/a (frame-level) | ✅ **Yes** — gates by gimbal pitch from telemetry |
+| **J** | **OCR-based pitch-from-OSD filter** (orthogonal, applied to A/B/C) | n/a | n/a | Reduces output to nadir-band frames only | ⚠️ More complex (Tesseract or PaddleOCR per-frame) and OCR errors propagate | ✓ | ✓ stand-alone tool | n/a (frame-level) | ✅ via OCR on the `-3.7°` text in the burned attitude indicator |
+| **Z** | **Source recovery** (re-record with OSD off / pull RTSP / pull DCIM) | ✅ Native | ✅ None | ✅ 100% | ✅ Trivial *if* hardware access | ✅ | ✅ Zero changes | All layers (no overlay produced) | ⚠️ Depends on whether gimbal can be locked nadir |
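+
+Coverage relative to the full frame can be checked with a few lines of
+arithmetic. A sketch, assuming the 1280×720 source size stated in the
+notation note; `coverage_percent` is a hypothetical helper name, and the two
+crop sizes are the ones quoted for Option A and for the PoC4 output:
+
```python
def coverage_percent(crop_w: int, crop_h: int,
                     full_w: int = 1280, full_h: int = 720) -> float:
    """Share of the full frame's pixels retained by a crop, in percent."""
    return 100.0 * (crop_w * crop_h) / (full_w * full_h)


for name, (w, h) in {"Option A crop": (740, 525), "PoC4 output": (900, 445)}.items():
    print(f"{name}: {w}x{h} = {w * h} px, {coverage_percent(w, h):.1f}% of 1280x720")
```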
+
+## Composition note
+
+Options are not all mutually exclusive. The three orthogonal axes are:
+- **Pixel handling**: choose ONE of {A, B, C, D, E, F, G, H, Z}
+- **Frame filtering** (non-nadir rejection): choose ZERO OR ONE of {I, J} on
+  top of the pixel-handling choice
+- **Source class**: Option Z replaces all of the above when source access is
+  available; for the current MKV (user constraint = "only have this MKV"), Z
+  is unavailable.
+
+## Recommended composition
+
+**Primary**: **B + I** — crop the GCS chrome geometrically, build a binary
+OSD mask (a PNG once, hand-edited or scripted from the variance map), and
+inject the mask into the project's existing DISK detector via the
+already-supported `mask=` parameter; in parallel, parse the paired
+`.tlog` for gimbal attitude and drop frames where the gimbal is off-nadir.
+
+**Fallback** (when modifying the C3 path is not desirable for this fixture):
+**C + I** — produce a plain `.mp4` via crop + chained `delogo` so the new
+fixture can drop into the existing replay path with **zero** code changes,
+then apply the same tlog-based frame filter.
+
+**Disqualified options**: G (generative inpainting), F (temporal median —
+doesn't work for our case because OSD values change per frame).
+
+**Excluded by user constraint, but recommended for future recordings**:
+Z (source recovery — pull RTSP or DCIM directly from the camera).
+
+## Reasoning summary table
+
+| Question dimension | Winner | Why |
+|---|---|---|
+| Output fidelity | B (and Z when available) | No pixels modified |
+| Fabrication risk | B, A | No new pixels invented |
+| Pixel coverage | B, C, D, H | Whole EO region usable |
+| Cost & complexity | A, C | Single FFmpeg command |
+| Reproducibility | All except G | Deterministic |
+| C2/C3 integration | A, C, D, H | No code changes |
+| Layer coverage | B, D | Single mask handles all |
+| Non-nadir filtering | I (with any pixel option) | Telemetry-driven |
@@ -0,0 +1,247 @@
+# Reasoning Chain — Video Extraction Decisions
+
+## Dimension 1 — Why three layers and not two
+
+### Fact confirmation
+Local 12-frame variance analysis (Fact #1) showed at least three pixel
+populations distinguishable by their behavior over time:
+1. Pixel-stable rectangles around the periphery (left/right sidebars,
+   minimap) — the GCS UI chrome.
+2. Pixel-stable rectangles in the central video area (top-left HUD,
+   top-center attitude ladder, crosshair, FOV brackets, bottom-right
+   coordinates) — gimbal-burned-in OSD.
+3. The dynamic remainder — the actual EO video, plus the IR PIP, which is
+   *itself* a dynamic video stream stamped at fixed coordinates.
+
+### Reference comparison
+A simpler "UI vs video" two-layer model would suggest a single mask covering
+all overlays. But the IR PIP behaves like the EO video (Fact #2), and the
+GCS chrome includes live-updating values (Fact #3), so the distinction
+that matters is *who renders the pixel and how*, not *whether the pixel
+is constant*:
+- GCS chrome is rendered by the GCS application **after** the camera stream
+  arrives → it's removable by cropping to the region the GCS shows the EO
+  in.
+- Burned-in gimbal OSD is rendered **inside the camera** before the recorder
+  sees it → it's pixel-baked into the EO video and only removable by
+  inpainting or by mask-aware downstream consumption.
+- IR PIP is **also** rendered by the camera (the gimbal stamps the IR
+  channel into a corner of the EO output stream) → behaves like burned-in
+  OSD: pixel-baked, removable only by masking or cropping it out.
+
+### Conclusion
+Three layers, two removal classes:
+- Class 1 (GCS chrome): pure crop.
+- Class 2 (gimbal OSD + IR PIP): mask or inpaint.
+
+### Confidence
+✅ High — pixel-variance evidence is direct measurement.
+
+---
+
+## Dimension 2 — Why mask-aware downstream wins over inpainting
+
+### Fact confirmation
+The project's chosen C3 detector is DISK + LightGlue
+(`solution_draft01.md`). Kornia's DISK accepts a `mask=(B,1,H,W)` parameter
+that multiplies the detection score map (Fact #13). LightGlue's authors
+confirm that suppressing keypoints at detect time is sufficient — once the
+detector returns no keypoints in a region, the matcher has nothing to match
+there (Fact #14).
+
+### Reference comparison
+Inpainting-based options (C, D, E, H) all share the property that they
+synthesize *some* content for the OSD region. Even non-generative
+techniques like FFmpeg `delogo` (interpolation from outside pixels) or
+ProPainter (propagation from neighbor frames) produce pixels that *look*
+like terrain but didn't come from the actual terrain at that location. A
+feature detector running on those inpainted pixels could legitimately fire
+on the inpaint artifacts. Without a mask, the downstream pipeline cannot
+distinguish a real feature from a fake one. With a mask, it doesn't have to:
+the score map is zeroed before NMS, so no keypoint is produced for the OSD
+region in the first place.
+
+### Conclusion
+Mask-aware downstream is **strictly better** than inpainting for this
+project's use case, because:
+1. Output fidelity is verbatim (no synthesized pixels enter the matcher).
+2. The mask is a single static PNG, computed once from the OSD layout — far
+   simpler than per-frame inpainting.
+3. The integration cost is one parameter on the existing `DISK.forward()`
+   call (Fact #13).
+4. The OSD coverage is the union of all OSD elements, so the mask trivially
+   handles all of them at once (top-left HUD, attitude ladder, crosshair,
+   etc.) without one filter per region.
+
+The only reason to fall back to inpainting (Option C/H) is if we want a
+fixture that can be dropped into the existing replay path **without any
+code change**, because today's replay tooling treats the input MP4 as
+pristine. Even then, the right answer is to extend the replay tooling to
+carry an optional companion `osd_mask.png` per fixture — at which point
+Option B is again preferable.
+
+### Confidence
+✅ High — both the existence of the API and its semantic effect are
+documented at L1 (Kornia docs, LightGlue maintainer reply).
+
+---
+
+## Dimension 3 — Why generative inpainting is disqualified
+
+### Fact confirmation
+VideoPainter (2025), DiffuEraser (2025), VidPivot (2025) and similar SOTA
+inpainters (Fact #12) explicitly *generate* content for masked regions
+using video-diffusion or I2V backbones. The papers claim these models
+produce *plausible* terrain even where the masked region was fully
+occluded.
+
+### Reference comparison
+Project's `meta-rule.mdc` rule "Real Results, Not Simulated Ones" is
+unambiguous: the goal is a working product, not the appearance of one.
+Specifically: "Never produce results by bypassing, faking, stubbing, or
+passthrough-ing the component that is supposed to produce them."
+
+The downstream component is a feature matcher whose entire purpose is to
+detect real terrain features and match them to a satellite tile. A
+generative inpaint inserts plausible-but-false terrain features into the
+input. The matcher cannot tell the difference. It will happily match
+fabricated grass texture to a real satellite-tile region with similar
+texture and produce a confident but wrong fix.
+
+The same argument applies even more sharply to the project's
+**AC-NEW-7** "cache-poisoning safety budget": onboard tiles fed back into
+the basemap must not be misaligned. A fixture validating tile generation
+that includes synthesized terrain features tests the wrong thing — it
+validates that the system handles plausible-looking pixels, not that it
+handles real-flight pixels.
+
+### Conclusion
+Generative inpainters (Option G) are **rejected**. They optimize the wrong
+objective for this project.
+
+### Confidence
+✅ High — disqualifier comes from explicit project rule + reading of
+upstream paper claims.
+
+---
+
+## Dimension 4 — Why temporal median fails for this case
+
+### Fact confirmation
+FFmpeg `tmedian=radius=N` outputs the per-pixel median over `2N+1`
+neighboring frames (Fact #10). This works as an OSD-removal trick when:
+1. The OSD pixels are **stable** (same value every frame, or at least the
+   majority of frames).
+2. The underlying scene **changes** per frame (so the median over the
+   window is dominated by underlying scene values, not OSD values).
+
+In our recorded video, both the OSD values **and** the underlying scene
+change per frame:
+- Burned-in OSD text shows live counters like `00:04:24` that update each
+  second; pitch number `-3.7°` updates with gimbal motion; HDOP/SATS values
+  change.
+- Underlying EO video shows the ground moving as the UAV moves.
+
+### Reference comparison
+A *motion-conditional* temporal median (Source #16, Source #17), which
+applies the median only where motion is below a threshold, addresses the
+issue in principle. But the static-OSD assumption underneath that approach
+does not hold in our case: the OSD *positions* are static, but the
+*content* at those positions is dynamic.
+
+### Conclusion
+Temporal median is **not suitable** for this video. The local PoC
+(`poc3_crop_tmedian.mp4`) confirms: output shows ghosted, smeared OSD text
+overlapping with smeared/aliased terrain — strictly worse than the
+original for downstream feature matching.
+
+### Confidence
+✅ High — direct experimental result.
+
+---
+
+## Dimension 5 — Why frame filtering by gimbal pointing is mandatory
+
+### Fact confirmation
+Frame at t=30 s shows gimbal pointed forward (sky/horizon visible), frame
+at t=300 s shows gimbal pointed near nadir (ground texture filling frame)
+(Fact #5). The gimbal is operator-controlled — mid-flight pointing is
+common; only a subset of frames are nadir.
+
+### Reference comparison
+The project's nav-camera spec is "fixed downward (no gimbal)"
+(`restrictions.md`). The C2 VPR component is trained / tuned on satellite
+imagery with the assumption that the query is a top-down view of the
+ground. Forward-looking frames (sky, distant horizon, oblique terrain) are
+out-of-distribution for the VPR retrieval and would produce poor or
+spurious matches.
+
+### Conclusion
+A fixture derived from this MKV that contains forward-looking frames is
+not a valid representative-data fixture for the nadir-tuned pipeline. A
+frame-level filter is needed — either:
+- **Option I** (telemetry-based): parse `MOUNT_STATUS`/`MOUNT_ORIENTATION`
+  from the paired `2026-05-09 16-09-54.tlog`. Cheaper and more reliable.
+- **Option J** (OCR-based): read the burned-in `-3.7°` text from the
+  attitude indicator. Lower setup cost (no telemetry parser) but OCR
+  errors propagate.
+
+### Confidence
+✅ High — the gimbal-pointing fact is direct visual evidence; the
+out-of-distribution argument is a derived consequence consistent with the
+project's `restrictions.md` AC-2.1a "nadir ±10° bank/pitch" qualifier.
+
+---
+
+## Dimension 6 — Why this is a fixture-prep tooling concern, not a runtime concern
+
+### Fact confirmation
+Existing `solution_draft01.md` does not have a "data ingestion / fixture
+prep" component (Fact #18). Replay fixtures appear in the test
+infrastructure as already-cleaned MP4s + companion CSV/JSON.
+
+### Reference comparison
+The runtime nav-camera (per project spec) is the ADTi 20MP fixed-downward
+without OSD. There is no expectation that the runtime pipeline ever sees
+an OSD-laden frame from a multi-sensor gimbal. So the right place to
+handle this MKV is **not** in the runtime — it is in the developer
+tooling that produces fixtures.
+
+### Conclusion
+The Mode B revision is **additive, not subtractive**: it identifies a gap
+(no fixture-prep component) and adds a developer tool. It does **not**
+modify any runtime component. The C1/C2/C3/C4/C5 components in
+`solution_draft01.md` are unchanged.
+
+### Confidence
+✅ High — direct read of `solution_draft01.md` confirms no such
+component exists.
+
+---
+
+## Dimension 7 — Why the existing `flight_derkachi.mp4` precedent matters
+
+### Fact confirmation
+`flight_derkachi.mp4` is described as "cleaned/cropped replay fixture
+rather than the raw camera feed" with "the rotating camera mechanically
+fixed in a downward/nadir orientation" (Fact #16). It was produced by a
+process that:
+1. Disabled the gimbal OSD (likely via Topotek's GimbalControl Ethernet
+   utility).
+2. Mechanically locked the gimbal nadir.
+3. Recorded a 1080p clean stream.
+4. Cropped to 880×720 (probably to remove residual borders or reframe).
+
+### Reference comparison
+The new MKV represents the *opposite* situation: OSD on, gimbal
+unconstrained, GCS-screen-recorded rather than direct camera capture. The
+existing fixture-creation procedure (steps 1–4 above) does not apply.
+
+### Conclusion
+A new, documented procedure is needed for the GCS-screen-recorded
+class of input. That procedure is the deliverable of this Mode B run
+(see `solution_draft02.md`). It complements the existing Derkachi
+procedure — does not replace it.
+
+### Confidence
+✅ High.
@@ -0,0 +1,133 @@
+# Validation Log — Mode B Video Extraction Run
+
+## Validation scenario
+
+A developer wants to use `2026-05-09 16-10-54.mkv` as a representative
+replay fixture for the GPS-denied pipeline (analogous to
+`flight_derkachi.mp4`), to extend testing to a new aircraft/camera class
+(multi-sensor gimbal ball, multirotor profile) and a new operating
+condition (low-altitude / non-nadir gimbal).
+
+## Expected behavior under each candidate
+
+### Option A (crop only)
+Expected: produces a 740×525-ish MP4 with gimbal OSD elements still burned
+in at the same screen positions. Replay infrastructure consumes it as-is.
+Downstream C2/C3 detect features inside OSD text regions and produce false
+matches. Drift accumulates, AC-2.1a fails.
+
+**Actual** (from PoC1 + reasoning): predicted behavior matches. Output is a
+valid MP4 but feeding it into a feature matcher would produce keypoints
+inside the burned-in `-3.7°` and `FOV 53.2°` text regions, since those
+regions have high local contrast.
+
+### Option B (crop + mask-aware DISK)
+Expected: same MP4 as Option A, plus a static `osd_mask.png` companion file.
+Replay infrastructure modified to inject the mask into the C3 detect call.
+DISK detector returns no keypoints inside masked regions (per Fact #13
+score-map multiplication semantics). LightGlue matches only real-terrain
+features. AC-2.1a passes for nadir frames.
+
+**Actual** (predicted, no end-to-end PoC run): matches the documented
+Kornia DISK contract. The change to the replay tooling is one optional
+parameter added to the `DISK.forward()` call. Risk: if the existing
+production code path uses a wrapper around DISK that does not forward the
+`mask=` parameter, the wrapper needs adjustment.
+
+### Option C (crop + chained `delogo`)
+Expected: a 740×525-ish MP4 with OSD regions replaced by interpolation
+from neighboring pixels. Replay infrastructure unchanged. Downstream C2/C3
+detect features in the interpolated regions; some weak features may be
+detected (interpolation produces low-contrast smooth regions) but
+significantly fewer than the original OSD regions.
+
+**Actual** (from PoC4): output looks reasonable with three OSD rectangles
+replaced by smoothed interpolation. Some chained `delogo` filters caused
+issues when their rectangles touched image edges in earlier attempts —
+mitigated by avoiding edge contact.
+
+### Option F (temporal median)
+Expected: smeared, ghosted output as both OSD and underlying terrain
+average together over the window.
+
+**Actual** (from PoC3): confirmed. Output shows visible motion-blur ghosts
+of OSD text across the frame, plus desaturated and smeared underlying
+terrain. **Disqualified.**
+
+### Option I (tlog-based gimbal-pointing filter)
+Expected: parse `MOUNT_STATUS`/`MOUNT_ORIENTATION` messages from the
+companion `2026-05-09 16-09-54.tlog`, build a frame index → gimbal-pitch
+table, drop frames where pitch is more than (e.g.) 10° off-nadir. Output
+preserves only nadir-band frames, suitable for the level-flight VPR
+assumption.
+
+**Actual** (predicted): depends on whether the camera class actually emits
+`MOUNT_STATUS` to the FC. ArduPilot's documented gimbal integration
+(Source #4) confirms gimbal angles are reported back to the FC for some
+Topotek models. **Verify** before relying on this — if the tlog lacks
+gimbal angle, fall back to Option J (OCR).
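+
+The frame index → gimbal-pitch table described above can be sketched as
+follows. This is a minimal illustration, not the project's tool: it assumes
+the pitch samples have already been extracted from the tlog as sorted
+`(time_s, pitch_deg)` pairs with strictly increasing times, and that the
+video and tlog clocks are already aligned. The names `pitch_at` and
+`nadir_frames` are hypothetical.
+
+```python
+from bisect import bisect_left
+
+
+def pitch_at(t, samples):
+    """Linearly interpolate gimbal pitch (deg) at time t from sorted (time_s, pitch_deg) samples."""
+    times = [s[0] for s in samples]
+    i = bisect_left(times, t)
+    if i == 0:
+        return samples[0][1]          # before the first sample: hold the first value
+    if i == len(samples):
+        return samples[-1][1]         # after the last sample: hold the last value
+    (t0, p0), (t1, p1) = samples[i - 1], samples[i]
+    return p0 + (p1 - p0) * (t - t0) / (t1 - t0)
+
+
+def nadir_frames(n_frames, fps, samples, tolerance_deg=10.0):
+    """Return indices of frames whose interpolated pitch is within tolerance of -90 degrees."""
+    return [
+        idx for idx in range(n_frames)
+        if abs(pitch_at(idx / fps, samples) - (-90.0)) <= tolerance_deg
+    ]
+
+
+# Illustrative samples only: the gimbal holds nadir, then tilts up to -60 degrees.
+samples = [(0.0, -90.0), (1.0, -90.0), (3.0, -60.0)]
+kept = nadir_frames(n_frames=6, fps=2.0, samples=samples)
+print(kept)
+```
+
+The 10° default mirrors the example tolerance in the text; the real value
+would come from the project's nadir tolerance.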
+
+## Counterexamples
+
+### Counterexample 1: gimbal-locked nadir flight
+**Scenario**: the user happens to have already locked the gimbal to nadir and
+the entire recording is nadir-only. **Implication**: Option I/J becomes a
+no-op; the rest of the pipeline works the same. **No change to
+recommendation.**
+
+### Counterexample 2: text values in OSD overlap with bright terrain
+**Scenario**: the green attitude ladder text overlaps with bright sky in
+forward-looking frames — does Option C `delogo` interpolation produce
+something useful? **Predicted**: only the rectangle is touched; if the
+rectangle covers sky-only pixels, the interpolation produces sky-colored
+output (acceptable). If the rectangle straddles sky/horizon, the
+interpolation may produce a smeared horizon line (mild artifact,
+acceptable for non-nadir frames which would be filtered by Option I/J
+anyway).
+
+### Counterexample 3: future MKV recordings have different OSD layout
+**Scenario**: a later flight uses a different GCS that places the OSD
+elsewhere, breaking the hardcoded coordinates in the chained `delogo`
+recipe. **Implication**: the developer tool must be parametrized, not
+hardcoded. The proposed fixture-prep tool ships with a **per-recording
+OSD profile** (a small YAML or JSON listing the GCS-chrome crop box and
+the OSD rectangles) so adding a new recording class is a few-line config
+change.
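+
+A per-recording OSD profile of this kind could drive the filter chain as in
+the sketch below. The profile field names (`crop`, `osd_rects`), the numbers,
+and the helper `build_filter_chain` are assumptions made for illustration;
+only the FFmpeg `crop` and `delogo` option syntax is taken from FFmpeg
+itself. The edge check reflects the PoC4 finding that `delogo` rectangles
+must not touch the image border.
+
+```python
+import json
+
+# Hypothetical profile; values are illustrative and not measured from any recording.
+PROFILE_JSON = """
+{
+  "crop": {"w": 900, "h": 445, "x": 50, "y": 25},
+  "osd_rects": [
+    {"x": 10, "y": 10, "w": 200, "h": 60},
+    {"x": 640, "y": 380, "w": 250, "h": 55}
+  ]
+}
+"""
+
+
+def build_filter_chain(profile):
+    """Return an FFmpeg -vf string: one crop, then one delogo per OSD rectangle."""
+    crop = profile["crop"]
+    parts = [f"crop={crop['w']}:{crop['h']}:{crop['x']}:{crop['y']}"]
+    for r in profile["osd_rects"]:
+        # Rectangle coordinates are relative to the cropped frame. delogo needs
+        # at least one pixel of real image on every side to interpolate from.
+        touches_edge = (
+            r["x"] < 1 or r["y"] < 1
+            or r["x"] + r["w"] > crop["w"] - 1
+            or r["y"] + r["h"] > crop["h"] - 1
+        )
+        if touches_edge:
+            raise ValueError(f"OSD rectangle touches the frame edge: {r}")
+        parts.append(f"delogo=x={r['x']}:y={r['y']}:w={r['w']}:h={r['h']}")
+    return ",".join(parts)
+
+
+chain = build_filter_chain(json.loads(PROFILE_JSON))
+print(chain)
+```
+
+The resulting string would be passed to `ffmpeg -vf`, so supporting a new
+recording class means editing the profile rather than the tool.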
+
+## Review checklist
+
+- [x] Draft conclusions consistent with fact cards
+- [x] No important dimensions missed (audio handling and frame-rate
+      normalization are noted as trivial in `00_question_decomposition.md`'s
+      Completeness Audit)
+- [x] No over-extrapolation — claims are tied to specific facts
+- [x] Conclusions actionable: a developer can follow the recipes in
+      `solution_draft02.md` to produce a new fixture
+- [x] Every selected component matches the project's constraint matrix
+      (verified in `06_component_fit_matrix.md`)
+- [x] Mismatches marked as disqualifiers (Option G, F)
+- [x] Per-mode API capability verified for both lead candidates (Kornia
+      DISK in mask mode, FFmpeg `crop`+`delogo` chain) — both have saved
+      MVE blocks in `02_fact_cards.md`
+
+## Open questions deferred to user / out-of-scope
+
+1. **Does the paired `2026-05-09 16-09-54.tlog` contain `MOUNT_STATUS`
+   messages?** — not verified in this run. Recommendation: open the tlog
+   with `pymavlink` and grep for `MOUNT_STATUS`; if absent, fall back to
+   Option J or accept all frames + downstream covariance.
+2. **Should this fixture replace `flight_derkachi.mp4` as the primary
+   replay fixture, or supplement it?** — supplement. Different aircraft
+   class, different sensor class. Both fixtures have value for
+   different test scenarios.
+3. **Is the project willing to commit to one extra parameter on the
+   `tests/e2e/replay/conftest.py::_calibration_path()` family of helpers
+   for an optional `osd_mask.png` companion?** — recommended yes; it is
+   the cleanest path. Not blocking for this run; can be deferred to a
+   follow-up tracker ticket if Option C fallback is acceptable for now.
+
+## Validation conclusion
+
+The recommended composition (B + I primary, C + I fallback, Z preferred
+for future recordings) holds up under the validation scenarios. Move to
+Step 7.5 (Component Applicability Gate).
@@ -0,0 +1,100 @@
+# Component Fit Matrix — Video Extraction Pipeline
+
+> Step 7.5 — Component Applicability Gate. Applies because this run is
+> classified as Technical-component selection.
+
+## 7.5.1 Top-level Component Fit Matrix
+
+| Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
+|---|---|---|---|---|---|---|---|---|
+| Geometric crop (GCS chrome removal) | FFmpeg `crop` filter | Single static crop box `crop=W:H:X:Y` derived per recording from variance-map analysis; one-shot CLI invocation | Established production | Strip GCS UI sidebars/minimap/IR-PIP from recorded MKV | MVE block in `02_fact_cards.md` (PoC1 produced playable output); docs Source #1 | None for the user's pinned use case (offline tool) | **Selected** | Trivial, lossless within re-encode, deterministic |
+| OSD-pixel handling (PRIMARY) | Kornia `DISK.forward(img, mask=...)` | mask-aware mode `(B, 1, H, W)` mask, multiplied into the DISK score map before NMS | Established production (existing project component) | Suppress keypoint detection inside burned-in OSD regions | MVE block in `02_fact_cards.md`; docs Source #6 | Requires the existing C3 wrapper around DISK to forward the `mask=` parameter (one-line code change) | **Selected** | No pixel modification; fabrication-risk = 0; matches existing C3 stack exactly |
+| OSD-pixel handling (FALLBACK) | FFmpeg `delogo` chained | Multiple `delogo=x:y:w:h` filters chained for each OSD rectangle, after `crop`. **Important**: rectangles must NOT touch image edges (no border pixels to interpolate from) | Simple baseline | Replace burned-in OSD rectangles with interpolation-from-neighbors output, producing a plain MP4 ingestable by the existing replay path with no code changes | MVE block in `02_fact_cards.md` (PoC4); docs Source #1, #2 | Quality degrades when the OSD rectangle is large (e.g., the IR PIP at 360×210 px) — for that, `removelogo` mask or geometric crop is preferred | **Selected (fallback)** | Cheap, deterministic, no toolchain beyond FFmpeg |
+| OSD-pixel handling (REJECTED for fragility) | FFmpeg `removelogo` PNG mask | Single PNG mask covering all OSD elements, applied via `removelogo=mask.png` | Established production | One-shot OSD removal via mask | Source #3 docs claim it works; local test on FFmpeg 8.1 failed with `Invalid argument` (-22) | Version-fragile; could not be made to work in our local FFmpeg 8.1 with grayscale or RGB masks of correct dimensions | **Experimental only** | Try first if available on team's pinned FFmpeg version; fall back to chained `delogo` |
+| OSD-pixel handling (REJECTED, fabrication risk) | VideoPainter / DiffuEraser / VidPivot (and any generative video inpainter) | Diffusion-backbone or I2V generative inpainter applied to the OSD mask | SOTA / Known bad | High-fidelity-looking OSD removal | Sources #12, #13 — papers explicitly describe synthesis | Generates terrain content that does not exist in the real recording. Project rule "Real Results, Not Simulated Ones" is unambiguous | **Rejected** | Disqualified by `meta-rule.mdc` |
+| OSD-pixel handling (REJECTED, wrong assumption) | FFmpeg `tmedian` temporal median | `tmedian=radius=N` after `crop` | Adjacent domain | Suppress static OSD via temporal median | PoC3 test result | OSD values change every frame (timestamps, gimbal angle, HDOP), so the static-OSD assumption underneath the technique fails. Output is smeared | **Rejected** | Disqualified by direct experimental evidence |
+| OSD-pixel handling (DEFER) | ProPainter | ProPainter checkpoint with mask-guided sparse Transformer | Current SOTA non-generative | High-quality OSD removal that respects no-fabrication constraint | Source #11 paper claims | Adds Python+PyTorch+CUDA toolchain; offline runtime ~0.25 s/frame at 480p; not necessary if Option B is implemented | **Experimental only** | Keep available for cases where Option B's downstream code change is rejected and the masked-region size is too large for `delogo` to interpolate cleanly |
+| Frame filtering by gimbal pointing (PRIMARY) | `pymavlink` parser of paired `.tlog` for `MOUNT_STATUS` / `MOUNT_ORIENTATION` | Read paired `2026-05-09 16-09-54.tlog`, build a `frame_idx → gimbal_pitch_deg` table by interpolating message timestamps to the 30 fps frame timeline, drop frames where `abs(pitch − (-90°)) > 10°` (or per-project nadir tolerance) | Established production | Reject non-nadir frames before encoding the cleaned MP4 | Verified path (`pymavlink` is already used in the project's `derkachi.tlog` pipeline per `flight_derkachi/README.md`) | **Verify**: must confirm the paired tlog actually emits `MOUNT_STATUS` for the camera in question; if the gimbal does not report attitude over MAVLink, this option fails | **Needs user decision** (effectively Selected if tlog has the messages) | Cleanest signal; deterministic; reuses existing project tooling |
+| Frame filtering by gimbal pointing (FALLBACK) | OCR (Tesseract or PaddleOCR) on the burned-in pitch-angle text | Per-frame OCR of the `-3.7°` text in the attitude indicator | Adjacent domain | Recover gimbal pitch when telemetry path is unavailable | OCR libraries are common; no project-specific MVE built | OCR errors propagate; need confidence thresholding | **Experimental only** | Use only if the tlog lacks gimbal attitude |
+| Calibration JSON | Per-camera `khp20s30_factory.json`-equivalent for the Topotek/Viewpro multi-sensor ball | "factory_sheet" approximation per the AZ-702 precedent | Established production (project precedent) | Provide intrinsics consumable by `tests/e2e/replay/` | Source #10 (Derkachi camera_info.md showing the convention) | None — same approach as the existing fixture | **Selected** | Project-accepted precedent |
+| Companion telemetry CSV | Existing `derkachi.tlog → data_imu.csv` exporter, retargeted to `2026-05-09 16-09-54.tlog` | Run the same exporter that produced `data_imu.csv` for Derkachi | Established production (existing project tool) | Provide synchronized IMU data for the new fixture | Existing pipeline (`flight_derkachi/data_imu.csv`); reuses `pymavlink` | None | **Selected** | Reuses existing tool |
+
+## 7.5.2 Restrictions × Candidate-Mode Sub-Matrix
+
+> The "constraints" here come from the run-specific Project Constraint Matrix
+> in `00_question_decomposition.md` (Constraints C1–C8 — fixture must drop
+> into existing replay infrastructure, no fabrication, etc.). Numbered AC
+> from `acceptance_criteria.md` are referenced where directly relevant — but
+> note this is a **fixture-prep tool, not a runtime component**, so most
+> runtime-AC rows are N/A.
+
+### Sub-Matrix — FFmpeg `crop` (geometric chrome removal)
+
+| Constraint / AC | Candidate-mode behavior | Result | Evidence |
+|---|---|---|---|
+| C1 (ingestable by `tests/e2e/replay/`) | Output is a plain H.264 MP4 with arbitrary integer dimensions; the existing replay path consumes 880×720 (Derkachi) so any sub-1080p H.264 MP4 works | ✅ Pass | Fact #6 + #16 |
+| C2 (no synthetic content) | `crop` discards pixels; never invents | ✅ Pass | Fact #6 |
+| C3 (frame rate flexibility) | `crop` preserves frame rate | N/A | — |
+| C5 (calibration honesty) | Crop changes principal point — calibration must be derived for the cropped frame, not the original 1280×720. Per-camera JSON should reflect the cropped image dimensions and shifted principal point | ✅ Pass (with derived calibration) | `flight_derkachi/camera_info.md` precedent |
+| C6 (reproducibility) | Single deterministic FFmpeg command | ✅ Pass | Fact #6 |
+| C7 (no false-positive features) | Cropped pixels are verbatim; remaining OSD is handled by other components | N/A (this component does not address OSD) | — |
+| C8 (non-nadir frame filtering) | Crop is frame-agnostic | N/A | — |
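+
+The C5 row can be made concrete with a small sketch. For a pure crop with
+no resize, the focal lengths in pixels stay the same and the principal
+point shifts by the crop's top-left offset. The function name
+`crop_intrinsics` and the intrinsics below are hypothetical, and the sketch
+ignores any rescaling the GCS may have applied when rendering the video.
+
+```python
+def crop_intrinsics(fx, fy, cx, cy, crop_x, crop_y):
+    """Return (fx, fy, cx, cy) for an image cropped at top-left offset (crop_x, crop_y).
+
+    A pure crop leaves focal lengths (in pixels) unchanged and shifts the
+    principal point by the crop offset.
+    """
+    return fx, fy, cx - crop_x, cy - crop_y
+
+
+# Illustrative intrinsics for a 1280x720 frame (not a real calibration),
+# cropped with an offset of (50, 25).
+fx, fy, cx, cy = crop_intrinsics(1000.0, 1000.0, 640.0, 360.0, crop_x=50, crop_y=25)
+print(fx, fy, cx, cy)
+```
+
+If the crop does not contain the optical centre, the shifted principal
+point falls outside the cropped image; that is valid but worth recording in
+the calibration provenance.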
+
+### Sub-Matrix — Kornia DISK in mask-aware mode (PRIMARY)
+
+| Constraint / AC | Candidate-mode behavior | Result | Evidence |
+|---|---|---|---|
+| C1 (ingestable by `tests/e2e/replay/`) | Requires one-line modification to the C3 detector wrapper to forward `mask=` | ✅ Pass with caveat | Fact #13 |
+| C2 (no synthetic content) | Mask suppresses score-map values in OSD regions; pixel values are unchanged | ✅ Pass | Fact #13 + Fact #14 |
+| C5 (calibration honesty) | Mask path orthogonal to calibration | N/A | — |
+| C6 (reproducibility) | Mask is a static PNG file checked into the fixture directory | ✅ Pass | — |
+| C7 (no false-positive features in OSD region) | DISK returns no keypoints in masked region by construction | ✅ Pass | Fact #13 |
+| AC-2.1a (frame-to-frame registration >95%) | OSD region's keypoints removed before matching; matching depends only on real terrain features in the unmasked region | ✅ Pass for nadir frames (subject to C8 filter) | Fact #14 |
+| AC-2.2 (Mean Reprojection Error <1.0 px frame-to-frame) | Reprojection error is computed on real-terrain matches only; not affected by mask | ✅ Pass | — |
+
+### Sub-Matrix — FFmpeg `delogo` chained (FALLBACK)
+
+| Constraint / AC | Candidate-mode behavior | Result | Evidence |
+|---|---|---|---|
+| C1 (ingestable) | Output is plain MP4 | ✅ Pass | Fact #7, #8 + PoC4 |
+| C2 (no synthetic content) | `delogo` interpolates from neighbors — non-generative; no semantic terrain features synthesized | ✅ Pass with caveat (interpolation is *new* pixels, but they are computed from real adjacent pixels and produce smooth low-contrast regions unlikely to spawn false features) | Fact #7 |
+| C6 (reproducibility) | Single deterministic FFmpeg command | ✅ Pass | Fact #7 |
+| C7 (no false-positive features) | Smooth interpolated regions are unlikely to spawn high-confidence keypoints, but they CAN — DISK keypoints can fire on smooth gradient transitions; risk is real but small | ❓ Verify with empirical keypoint-density test on `poc4_delogo.mp4` vs the original | PoC4 visual inspection |
+| AC-2.1a | Conditional on C7 result | ❓ Verify | — |
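+
+The keypoint-density check called for in the C7 row could look like the
+sketch below. It assumes keypoints have already been extracted from the
+original and the `delogo` output as `(x, y)` pairs and that the OSD
+rectangles are known as `(x, y, w, h)`; the helper names and the sample data
+are hypothetical.
+
+```python
+def count_in_rects(keypoints, rects):
+    """Count (x, y) keypoints that fall inside any (x, y, w, h) rectangle."""
+    return sum(
+        any(rx <= x < rx + rw and ry <= y < ry + rh for rx, ry, rw, rh in rects)
+        for x, y in keypoints
+    )
+
+
+def osd_keypoint_ratio(original_kps, delogo_kps, rects):
+    """Ratio of in-rectangle keypoints after delogo to before; None if there were none before."""
+    before = count_in_rects(original_kps, rects)
+    after = count_in_rects(delogo_kps, rects)
+    return None if before == 0 else after / before
+
+
+# Illustrative data only: one OSD rectangle and hand-written keypoints.
+rects = [(10, 10, 100, 40)]
+original_kps = [(15, 20), (50, 30), (90, 45), (300, 200)]
+delogo_kps = [(50, 30), (300, 200)]
+ratio = osd_keypoint_ratio(original_kps, delogo_kps, rects)
+print(ratio)
+```
+
+A ratio well below 1 would support the C7 claim for this fixture; the
+acceptance threshold is a project decision and is not defined here.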
+
+### Sub-Matrix — `pymavlink` MOUNT_STATUS frame filter (PRIMARY for non-nadir filtering)
+
+| Constraint / AC | Candidate-mode behavior | Result | Evidence |
+|---|---|---|---|
+| C8 (non-nadir frame filtering) | Drops frames where gimbal pitch is off-nadir | ✅ Pass IF the tlog contains MOUNT_STATUS | Source #4 (ArduPilot Topotek docs reference gimbal angle messaging) |
+| C6 (reproducibility) | Deterministic Python script | ✅ Pass | — |
+| Tlog content actually contains MOUNT_STATUS for this gimbal | unverified — depends on whether the operator's autopilot was wired to receive and forward gimbal attitude | ❓ Verify | — |
+
+### Sub-Matrix — Generative video inpainters (REJECTED)
+
+| Constraint / AC | Candidate-mode behavior | Result | Evidence |
+|---|---|---|---|
+| C2 (no synthetic content) | Synthesizes terrain features that do not exist | ❌ Fail | Fact #12 |
+
+## 7.5.3 Decision Summary
+
+| Component area | Selected | Status notes |
+|---|---|---|
+| Chrome removal | FFmpeg `crop` | Selected, no caveats |
+| OSD pixel handling (primary) | Kornia DISK mask-aware mode | Selected, conditional on one-line wrapper change |
+| OSD pixel handling (fallback) | FFmpeg `delogo` chained | Selected fallback for fixtures that must drop into existing replay path with zero code changes |
+| OSD pixel handling (other options) | `removelogo` (Experimental only — version-fragile), ProPainter (Experimental only — toolchain cost), `tmedian` (Rejected — disqualified by experiment), generative inpainters (Rejected — fabrication risk) | — |
+| Non-nadir filter (primary) | `pymavlink` parser of paired tlog | Needs user decision: depends on whether tlog has MOUNT_STATUS |
+| Non-nadir filter (fallback) | OCR on burned-in pitch text | Experimental only |
+| Calibration JSON | Per-camera "factory_sheet" approximation | Selected (project precedent) |
+| Telemetry CSV | Reuse existing tlog → CSV exporter | Selected |
+
+**Blocker check**: One row is **Needs user decision** (tlog content not yet
+verified). The user should be asked to either (a) confirm the tlog has
+gimbal attitude, in which case Option I is Selected, or (b) accept Option J
+fallback / accept all frames, in which case the fixture is supplied without
+filtering and the test plan documents the limitation.
+
+This open item does not block the *technical recommendation*: the user
+can choose either path and the rest of the pipeline is unchanged. It is
+recorded in `solution_draft02.md`'s "Open questions" section.
@@ -0,0 +1,441 @@
+# Solution Draft 02 — Recovering a Clean Nadir Fixture from `2026-05-09 16-10-54.mkv`
+
+> **Mode**: B (Solution Assessment) — additive. This draft does **not** modify any runtime component in `_docs/01_solution/solution_draft01.md` (C1…C12). It adds a *fixture-prep developer tool* that converts an OSD-burned-in GCS screen recording into the `flight_derkachi.mp4`-shaped artifact consumed by `tests/e2e/replay/test_az835_e2e_real_flight.py`.
+>
+> **Run date**: 2026-05-30. Continues the 2026-05-29 Mode B investigation (`_docs/00_research/_mode_b_2026-05-29_video_extraction/`), with one previously-open "Needs user decision" row now resolved by a fresh tlog scan (Section 5 below).
+>
+> **Extraction executed on 2026-05-30**. The primary path (§4.1 Steps 1 + 2) was run against this MKV; the resulting fixture is at [`../../00_problem/input_data/flight_topotek_2026-05-09/`](../../00_problem/input_data/flight_topotek_2026-05-09/) with its own short README. The non-nadir frame filter (§4.1 Steps 4–5) and the companion calibration / IMU files (§4.3) were intentionally NOT produced — they are downstream decisions, not part of "extract a clean video". The verified crop coordinates differ from the 2026-05-29 draft's PoC4 values (which assumed a smaller IR PIP); the current §4.1 numbers reflect what was actually used.
+>
+> **Backing artifacts** (read these alongside this draft for full evidence):
+> - Question decomposition: [`../../00_research/_mode_b_2026-05-29_video_extraction/00_question_decomposition.md`](../../00_research/_mode_b_2026-05-29_video_extraction/00_question_decomposition.md)
+> - Source registry (17 L1/L2/L3 sources): [`../../00_research/_mode_b_2026-05-29_video_extraction/01_source_registry.md`](../../00_research/_mode_b_2026-05-29_video_extraction/01_source_registry.md)
+> - Fact cards (18 verified facts incl. local PoC results): [`../../00_research/_mode_b_2026-05-29_video_extraction/02_fact_cards.md`](../../00_research/_mode_b_2026-05-29_video_extraction/02_fact_cards.md)
+> - Comparison framework: [`../../00_research/_mode_b_2026-05-29_video_extraction/03_comparison_framework.md`](../../00_research/_mode_b_2026-05-29_video_extraction/03_comparison_framework.md)
+> - Reasoning chain: [`../../00_research/_mode_b_2026-05-29_video_extraction/04_reasoning_chain.md`](../../00_research/_mode_b_2026-05-29_video_extraction/04_reasoning_chain.md)
+> - Validation log: [`../../00_research/_mode_b_2026-05-29_video_extraction/05_validation_log.md`](../../00_research/_mode_b_2026-05-29_video_extraction/05_validation_log.md)
+> - Component fit matrix: [`../../00_research/_mode_b_2026-05-29_video_extraction/06_component_fit_matrix.md`](../../00_research/_mode_b_2026-05-29_video_extraction/06_component_fit_matrix.md)
+> - Inputs: [`../../00_problem/input_data/10.05.2026/2026-05-09 16-10-54.mkv`](../../00_problem/input_data/10.05.2026/2026-05-09%2016-10-54.mkv) and [`../../00_problem/input_data/10.05.2026/2026-05-09 16-09-54.zip`](../../00_problem/input_data/10.05.2026/2026-05-09%2016-09-54.zip) (contains the paired `.tlog`)
+
+---
+
+## 1. TL;DR
+
+**Yes, a clean nadir replay fixture can be recovered**, but the answer has two parts that must both be done; doing only one will produce a fixture that quietly misleads the runtime pipeline.
+
+| Concern | Recommended primary | Cheap fallback (zero replay-code changes) |
+|---|---|---|
+| **Strip the GCS UI chrome (sidebars / minimap / IR-PIP)** | `ffmpeg crop` (deterministic, verbatim pixels) | — same — |
+| **Handle the gimbal's burned-in OSD (attitude ladder, crosshair, FOV brackets, status text)** | **Inject a static `osd_mask.png` into the existing C3 `kornia.feature.DISK.forward(img, mask=…)` call.** Zero pixel modification, zero fabrication risk. | `ffmpeg crop + delogo` chain (interpolates from neighbor pixels — non-generative; locally verified working as `poc4_delogo.mp4` on the prior run) |
+| **Filter out frames where the gimbal is not nadir** | **OCR the burned-in pitch text** (Option J) — the *previously* preferred telemetry path is dead for this recording (see Section 5). | Manual labeling pass: ship a small `frame_ranges.yaml` of nadir vs non-nadir segments alongside the MP4. ~30 min of human labour for a 6-minute clip. |
+
+**Disqualified**:
+- Generative video inpainters (VideoPainter / DiffuEraser / VidPivot et al.) — they fabricate terrain, which corrupts VPR/matching evaluation and violates `meta-rule.mdc` "Real Results, Not Simulated Ones".
+- FFmpeg `tmedian` (temporal median) — both the OSD text *and* the underlying scene change every frame, so the median is smeared in both regions (locally verified as `poc3_crop_tmedian.mp4` on the prior run).
+
+**Not available to this project** — the source MKV is what we have; there is no access to the camera, the GCS host, or the upstream RTSP. The ideal-but-out-of-reach path would have been to pull RTSP directly from the gimbal (`rtsp://192.168.144.108:554/stream=0` for the Topotek / Viewpro multi-sensor ball class) or extract DCIM with OSD off via the GimbalControl Ethernet utility (ArduPilot Source #4). That path is documented here only for completeness (and because the `flight_derkachi.mp4` fixture was produced that way, which is why it is already clean); it is not actionable for this data source. The only thing that could replace the cleanup pipeline is the original supplier voluntarily re-recording with OSD off — which is outside this project's control.
+
+---
+
+## 2. What is in `2026-05-09 16-10-54.mkv`
+
+### 2.1 Technical metadata (verified via `ffprobe`)
+
+| Field | Value |
+|---|---|
+| Container | Matroska (`.mkv`) |
+| Video codec | H.264 |
+| Resolution | 1280 × 720 |
+| Frame rate | 30/1 fps |
+| Duration | 367.00 s (~6 m 7 s) |
+| File size | 115 044 545 bytes (~110 MB) |
+| Audio | AAC (discard at re-encode time with `-an`) |
+| Bitrate | ~2.5 Mbit/s |
+
+### 2.2 Three overlay layers (Fact #1 — direct 12-frame variance analysis on the prior run)
+
+```
+-----------------------------------------------------------------------+
+|  GCS chrome top bar (status, mode, GPS, alt)                          |
+------------+----------------------------------------+-----------------+
+|            |  TOP-LEFT HUD (burned by camera)       | IR PIP          |
+| SL STATS   |  · timer 00:04:24                      | (live IR/thermal|
+| (live      |  · EO/IR zoom, FOV 53.2 °              |  stream stamped |
+| sidebar    |  · target lat/lon                      |  by the gimbal) |
+| values)    |                                         |                 |
+|            |   [actual EO video region]              |                 |
+|            |   crosshair, attitude ladder            |                 |
+|            |   FOV brackets, +/-3.7 ° pitch text     |                 |
+|            |                                         +-----------------+
+|            |  BOTTOM-RIGHT GIMBAL TEXT               | ROLL / SPEED /  |
+|            |  · 50.0823, 36.2515                     | DIST / BATT /   |
+|            |  · azimuth, elevation                   | CURRENT (live)  |
+------------+----------------------------------------+-----------------+
+|  Minimap / bottom status bar                                           |
+-----------------------------------------------------------------------+
+```
+
+The three layers map to **two removal classes** (per Reasoning Chain Dimension 1):
+
+| Layer | Renderer | Removal class |
+|---|---|---|
+| GCS UI chrome (sidebars, minimap, status bars) | the GCS application, **after** the video stream arrives | **Pure crop** — discard the columns and rows around the EO region; pixels are *outside* the camera's video, no inpainting needed. |
+| Burned-in gimbal OSD (attitude ladder, crosshair, FOV brackets, top-left HUD, bottom-right text) | the **camera itself**, before the recorder ever saw the stream | **Mask or inpaint** — these pixels overwrite real EO pixels; you must either tell the downstream not to look at them (mask) or fill them with something visually plausible (inpaint). |
+| IR PIP (upper-right rectangle, ~360×210 px) | the camera (it stamps its IR channel into a corner of the EO output) | **Crop it out** geometrically — the rectangle is large enough that `delogo`'s interpolation is poor; cleanest to just keep the crop tight enough to exclude it. |
+
+### 2.3 The aircraft / gimbal class (corroborated by the tlog scan in Section 5)
+
+- Airframe: **multirotor**, ArduCopter 4.6.3 on Pixhawk6X, QUAD/X frame (`STATUSTEXT`: `'Frame: QUAD/X'`). The project's spec'd nav-camera is a fixed-downward APS-C sensor on a *fixed-wing* per `restrictions.md` — this MKV represents a **different aircraft class** than the primary runtime target. That's a feature (extends test coverage) not a bug, but the fixture's metadata must record the discrepancy.
+- Gimbal: 3-axis stabilised, pitch range −90° to +20°, yaw range ±180°, roll range ±30° (per the tlog's `GIMBAL_MANAGER_INFORMATION` capability advertisement — Section 5). Consistent with the Topotek / Viewpro multi-sensor ball family identified by the prior run's visual inspection.
+- GCS: **Mission Planner 1.3.83** (per `STATUSTEXT`). The project's `restrictions.md` mandates QGroundControl as the production GCS; for this fixture, the GCS is just whatever was used to make the recording — not a runtime concern.
+
+---
+
+## 3. Where this fits in the existing solution (and where it does not)
+
+### 3.1 The gap
+
+`_docs/01_solution/solution_draft01.md` defines components C1 (VIO), C2 (VPR), C3 (matchers), C4 (PnP), C5 (state estimator), C6 (tile cache), C7 (inference runtime), C8 (FC adapter), C10 (provisioning) and more — all *runtime* concerns on the Jetson Orin Nano Super. None of them is a "data ingestion / fixture-prep" component. Replay fixtures appear in `tests/e2e/replay/` as already-cleaned MP4s (Fact #18).
+
+This is fine for `flight_derkachi.mp4`, which arrived pre-cleaned because the operator (a) disabled the gimbal OSD via Topotek's GimbalControl utility, (b) mechanically locked the gimbal nadir, and (c) recorded the direct camera feed at 1080p before cropping to 880×720 (Reasoning Chain Dimension 7).
+
+`2026-05-09 16-10-54.mkv` arrived from the *opposite* situation: OSD on, gimbal unconstrained, GCS-screen-recorded. There is no existing project tool to turn this class of input into a usable fixture, which is why the question came up.
+
+### 3.2 What this draft adds
+
+A new **fixture-prep developer tool** (location: `tools/fixture_prep/` or `tests/fixtures/<flight_id>/build.py`, per existing project layout conventions) that converts one GCS-screen-recorded `.mkv` (plus its paired `.tlog`) into a directory of files in the same shape as `_docs/00_problem/input_data/flight_derkachi/`:
+
+```
+input_data/flight_topotek_2026-05-09/
+├── flight_topotek_2026-05-09.mp4        # cleaned, cropped, OSD handled (Section 4)
+├── osd_mask.png                         # 1-channel mask used by Option B (Section 4.1)
+├── 2026-05-09 16-09-54.tlog             # unpacked from the supplied .zip
+├── data_imu.csv                         # SCALED_IMU2 + GLOBAL_POSITION_INT export
+├── frame_ranges.yaml                    # nadir vs non-nadir frame ranges (Section 4.3)
+├── camera_info.md                       # camera class + calibration provenance
+└── topotek_gimbal_factory.json          # calibration JSON, factory-sheet provenance
+```
+
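+The tool is offline-only, deterministic, versioned, and reproducible: re-running it on the same input with the same pinned FFmpeg/libx264 build is expected to produce byte-identical outputs. libx264 is deterministic by default for a fixed configuration, so the main remaining sources of variance are the auto-selected thread count (which differs between machines) and build-specific muxer metadata. Pinning `-threads 1` and adding `-fflags +bitexact` addresses both, at the cost of a slower re-encode. Encoder presets and tunes (`-preset`, `-tune`) change speed and quality, not determinism.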
+
+**It does not change any runtime component.** C1…C12 are untouched. The single optional change in the *test* layer is to teach `tests/e2e/replay/conftest.py::_calibration_path()` (or its sibling helpers) to also look for a companion `osd_mask.png` if Option B is selected — and to forward it as an extra kwarg to whatever wraps `DISK.forward()` inside C3 (see Section 4.1 for the exact one-line change required).
+
+---
+
+## 4. The recommended pipeline (and the cheap fallback)
+
+### 4.1 PRIMARY — A + B + J: crop, mask-aware DISK, OCR pitch filter
+
+**Step 1 — Geometric crop (FFmpeg)**: discard the GCS chrome and the IR PIP rectangle.
+
+```bash
+INPUT="_docs/00_problem/input_data/10.05.2026/2026-05-09 16-10-54.mkv"
+OUTPUT_DIR="_docs/00_problem/input_data/flight_topotek_2026-05-09"
+mkdir -p "$OUTPUT_DIR"
+
+# Crop coordinates *verified for this specific MKV* by direct frame inspection +
+# luminance/saturation discontinuity detection on 2026-05-30 (see fixture README).
+# Output: 610x260 EO-only region anchored at (250, 440) in the 1280x720 source.
+#
+# Why these numbers and not the prior research's draft (crop=900:445:50:25):
+#   - The IR PIP is much larger than initially estimated: it spans roughly
+#     x=620..1140, y=35..383 in the source frame. The prior crop's right edge at
+#     x=950 cut into the PIP and the left edge at x=50 still included the
+#     GCS left icon strip + the SL STATS panel.
+#   - The IR PIP rectangle (~520 wide x ~350 tall) is too large for FFmpeg
+#     `delogo` to interpolate cleanly. Geometric exclusion is the only honest
+#     option for this recording.
+#   - The largest *clean* EO rectangle (no GCS chrome, no IR PIP, almost no
+#     burned-in OSD) is in the lower half of the frame, below the IR PIP.
+#
+# Re-verify if you ingest a future recording with a different GCS layout or
+# IR-PIP placement; see fixture README for the derivation script.
+ffmpeg -y -i "$INPUT" \
+       -vf "crop=610:260:250:440" \
+       -an -c:v libx264 -crf 18 -preset medium \
+       "$OUTPUT_DIR/flight_topotek_2026-05-09.mp4"
+```
+
+This is Option A: verbatim pixels, no inpainting, no fabrication, deterministic. On this specific MKV the crop is tight enough that essentially **no** burned-in gimbal OSD survives inside the output (verified on 8 sample frames spread across the recording — variance analysis flagged 1/158 600 = 0.0006 % of pixels as "static OSD-like"). The remaining steps 2–5 below are still relevant for other recordings of this class that may need a looser crop.
+
+**Step 2 — Build the OSD mask** (one-time, then versioned in the repo).
+
+Build a 1-channel PNG of the same dimensions as the cropped output (610×260 for the Step 1 crop above), where white (255) marks "real EO pixels: DISK is allowed to detect keypoints here" and black (0) marks "burned-in OSD pixels: DISK must suppress detection here". One quick recipe:
+
+```python
+# tools/fixture_prep/build_osd_mask.py
+import cv2, numpy as np
+from pathlib import Path
+
+# Manual route: open a sample cropped frame in any image editor, trace the OSD rectangles
+# by hand, then export as a 610x260 grayscale PNG. The script below is the deterministic
+# alternative: build the mask from a pixel-stability test over a sample of frames.
+
+src = Path("input_data/flight_topotek_2026-05-09/flight_topotek_2026-05-09.mp4")
+cap = cv2.VideoCapture(str(src))
+n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+sample = [int(n_frames * f) for f in (0.05, 0.15, 0.30, 0.45, 0.60, 0.75, 0.95)]
+
+stack = []
+for idx in sample:
+    cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
+    ok, frame = cap.read()
+    if not ok: raise RuntimeError(f"frame {idx} unreadable")
+    stack.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32))
+
+stack = np.stack(stack)            # (N, H, W)
+std = stack.std(axis=0)            # (H, W)
+mean = stack.mean(axis=0)          # (H, W)
+
+# OSD heuristic: text/lines render as high-brightness, low-std (the *position* is stable
+# even if the *value* in that position changes — the bounding box itself does not move).
+# Real EO terrain over a moving camera is mid-brightness, high-std.
+osd_likely = (std < 12.0) & (mean > 180.0)      # white/bright pixels stable in position
+osd_likely = cv2.dilate(osd_likely.astype(np.uint8) * 255, np.ones((7, 7), np.uint8))
+
+mask = 255 - osd_likely                          # invert: white=keep, black=suppress
+cv2.imwrite("input_data/flight_topotek_2026-05-09/osd_mask.png", mask)
+```
+
+The std/mean thresholds above are the right shape but should be tuned by eye on this specific recording — the prior research's variance analysis showed `mean per-pixel std ≈ 30–40` for both the EO region and the GCS sidebars, so a `< 12` threshold cleanly separates burned-in OSD (which has near-zero std in pixels that contain text strokes) from real video. Inspect the saved `osd_mask.png` against a sample frame and refine the thresholds (or hand-trace) before committing it.
+
+**Step 3 — One-line wrapper change in C3 to forward the mask** (the only code change this draft proposes).
+
+`kornia.feature.DISK.forward(img, mask=None)` already accepts a mask argument of shape `(B, 1, H, W)` with values in `[0, 1]`, and multiplies the score map by it before NMS — keypoints in masked regions are suppressed by construction, with no preprocessing of the pixels themselves (Fact #13, Source #6 Kornia docs L1). The LightGlue maintainer (`cvg/LightGlue#97`) explicitly recommends this approach over post-hoc keypoint filtering.
+
+Locate the project's existing `kornia.feature.DISK(...)` instantiation and the call site that invokes it (per `solution_draft01.md` C3 the detector is DISK + LightGlue; the call site is somewhere under `src/.../matchers/` or the runtime DISK wrapper). Pass `mask=<tensor>` through, where `<tensor>` is loaded once at fixture-init time from `osd_mask.png` and re-used per frame.
+
+Sketch (project-specific paths to be filled in):
+
+```python
+# Existing
+feats = self.disk(img, n=self.n_kp)
+# Becomes
+feats = self.disk(img, n=self.n_kp, mask=self.osd_mask)
+```
+
+`self.osd_mask` is loaded once in `__init__`: decode `fixture_dir / "osd_mask.png"` as a single-channel image (for example with `cv2.imread(..., cv2.IMREAD_GRAYSCALE)`), divide by 255, and reshape to a `(1, 1, H, W)` float32 tensor in `[0, 1]`. If the fixture has no `osd_mask.png`, the wrapper falls through to the original mask-less call, so the existing `flight_derkachi.mp4` fixture continues to work unchanged.
+
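The fall-through behaviour described above can be sketched as follows. This is a minimal illustration rather than the project's wrapper: `MaskAwareDetector` and `load_mask` are hypothetical names, the detector is any callable taking `img`, `n`, and optionally `mask` (the `mask=` keyword is taken from the text above, not verified here), and decoding the PNG into a `(1, 1, H, W)` tensor is left to the injected loader.

```python
from pathlib import Path
from typing import Any, Callable, Optional


class MaskAwareDetector:
    """Hypothetical wrapper: forwards mask= only when the fixture ships osd_mask.png."""

    def __init__(
        self,
        detector: Callable[..., Any],
        fixture_dir: Path,
        n_kp: int,
        load_mask: Callable[[Path], Any],
    ) -> None:
        self.detector = detector
        self.n_kp = n_kp
        mask_path = fixture_dir / "osd_mask.png"
        # Load once at init time; None means "this fixture has no OSD mask".
        self.osd_mask: Optional[Any] = (
            load_mask(mask_path) if mask_path.exists() else None
        )

    def __call__(self, img: Any) -> Any:
        if self.osd_mask is None:
            # Original mask-less call, unchanged for fixtures without a mask.
            return self.detector(img, n=self.n_kp)
        return self.detector(img, n=self.n_kp, mask=self.osd_mask)
```

Keeping the decision in `__init__` means the per-frame path adds only one `None` check, and fixtures without a mask never see the extra keyword.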
+**Step 4 — Frame-level filter (Option J: OCR pitch from burned-in attitude indicator)**.
+
+The previously-preferred telemetry path (Option I — parse `MOUNT_STATUS` / `GIMBAL_DEVICE_ATTITUDE_STATUS` from the paired `.tlog`) is **not viable for this recording**. See Section 5 for the evidence. The remaining viable paths are:
+
+- **(J) OCR the burned-in pitch number** — the gimbal renders pitch as text such as `-3.7°` in the attitude indicator. Use Tesseract or PaddleOCR per frame on a fixed crop around that text region, then drop frames where `|pitch − (−90°)| > 10°`. Quick recipe (project must add `pytesseract` or `paddleocr` to `requirements-dev.txt`):
+
+  ```python
+  # tools/fixture_prep/frame_pitch_from_ocr.py
+  import json
+  import re
+
+  import cv2
+  import pytesseract
+
+  pat = re.compile(r"(-?\d+\.\d+)")
+  # If the Step 1 crop removed the attitude indicator, point src at the original MKV
+  # instead (assuming the crop re-encode preserved the frame count).
+  src = "input_data/flight_topotek_2026-05-09/flight_topotek_2026-05-09_cropped.mp4"
+  # PLACEHOLDER box around the pitch text; measure the real coordinates on a sample frame.
+  x0, y0, x1, y1 = 0, 0, 120, 40
+  cap = cv2.VideoCapture(src)
+  out = []
+  for frame_idx in range(int(cap.get(cv2.CAP_PROP_FRAME_COUNT))):
+      ok, frame = cap.read()
+      if not ok:
+          break
+      # Crop the attitude-indicator text region and OCR it as a single text line.
+      roi = frame[y0:y1, x0:x1]
+      text = pytesseract.image_to_string(roi, config="--psm 7 -c tessedit_char_whitelist=-0123456789.")
+      m = pat.search(text)
+      out.append({"frame": frame_idx, "pitch_deg": float(m.group(1)) if m else None})
+  cap.release()
+  with open("input_data/flight_topotek_2026-05-09/frame_pitch.json", "w") as f:
+      json.dump(out, f)
+  ```
+
+  Then derive `frame_ranges.yaml` from `frame_pitch.json` by clustering contiguous frame indices whose pitch is within the nadir band.
+
+- **(Manual)**: for a one-off fixture of 6 minutes, the cheapest deterministic alternative is a *manual labeling pass*: a developer watches the cropped video once, notes the time ranges where the gimbal is at nadir (`[0:42, 1:53], [3:58, 6:06]` etc.), and saves them as frame ranges in `frame_ranges.yaml`. About 30 minutes of human labour, no OCR failure modes, and reproducible by anyone who can re-watch the same MP4. This is the recommended path **for this specific fixture** unless additional GCS-screen-recorded fixtures are expected, in which case the OCR script amortises across them.
+
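The clustering step that turns `frame_pitch.json` into frame ranges can be sketched as below. This is a minimal sketch under stated assumptions: `nadir_ranges` and `min_len` are hypothetical names, ranges are inclusive `(start_frame, end_frame)` pairs, and frames where OCR returned no number are treated as non-nadir. Writing the result to `frame_ranges.yaml` is left out because the YAML schema is not pinned in this draft.

```python
from typing import Iterable, List, Optional, Tuple

NADIR_DEG = -90.0      # nadir pitch, per the filter rule above
TOLERANCE_DEG = 10.0   # frames with |pitch - (-90)| > 10 are dropped


def nadir_ranges(records: Iterable[dict], min_len: int = 1) -> List[Tuple[int, int]]:
    """Cluster contiguous frame indices whose pitch lies inside the nadir band.

    Each record looks like {"frame": int, "pitch_deg": float or None}.
    Returns inclusive (start_frame, end_frame) pairs of at least min_len frames.
    """
    ranges: List[Tuple[int, int]] = []
    start: Optional[int] = None   # first frame of the currently open range
    prev: Optional[int] = None    # last frame of the currently open range
    for rec in sorted(records, key=lambda r: r["frame"]):
        pitch = rec["pitch_deg"]
        ok = pitch is not None and abs(pitch - NADIR_DEG) <= TOLERANCE_DEG
        contiguous = prev is not None and rec["frame"] == prev + 1
        if ok and start is not None and contiguous:
            prev = rec["frame"]              # extend the open range
            continue
        if start is not None:                # close the open range
            if prev - start + 1 >= min_len:
                ranges.append((start, prev))
            start = None
        if ok:
            start = rec["frame"]             # open a new range
        prev = rec["frame"] if ok else None
    if start is not None and prev - start + 1 >= min_len:
        ranges.append((start, prev))
    return ranges
```

In practice `records` would come from `json.load` on `frame_pitch.json`. A `min_len` of a few seconds' worth of frames is one way to keep isolated OCR hits from producing one-frame ranges.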
+**Step 5 — Filter the cropped video down to nadir-only frames** (using `frame_ranges.yaml` from Step 4).
+
+```bash
+# Re-encode with a select filter restricting to the nadir frame ranges.
+# Build the select expression programmatically from frame_ranges.yaml.
+ffmpeg -y -i "$OUTPUT_DIR/flight_topotek_2026-05-09_cropped.mp4" \
+       -vf "select='between(n,1260,3390)+between(n,7140,10980)',setpts=N/FRAME_RATE/TB" \
+       -an -c:v libx264 -crf 18 -preset slow \
+       "$OUTPUT_DIR/flight_topotek_2026-05-09.mp4"
+```
+
+(Numbers above are illustrative; the actual `between(n, …)` segments come from the YAML.)
+
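Building the `select` expression programmatically could look like the sketch below. The function names are hypothetical, ranges are assumed to be inclusive `(start_frame, end_frame)` pairs, and the timestamp helper assumes a constant 30 fps as stated in the companion-files table.

```python
from typing import Iterable, Tuple


def select_filter(ranges: Iterable[Tuple[int, int]]) -> str:
    """Build the ffmpeg -vf value from inclusive (start_frame, end_frame) pairs."""
    terms = []
    for start, end in ranges:
        if start > end:
            raise ValueError(f"bad range ({start}, {end})")
        terms.append(f"between(n,{start},{end})")
    if not terms:
        raise ValueError("no frame ranges given")
    # Summing the between() terms acts as a logical OR; setpts rebuilds the
    # timestamps so the kept frames play back contiguously.
    expr = "+".join(terms)
    return f"select='{expr}',setpts=N/FRAME_RATE/TB"


def mmss_to_frame(stamp: str, fps: int = 30) -> int:
    """Convert an 'M:SS' timestamp to a frame index at a constant frame rate."""
    minutes, seconds = stamp.split(":")
    return (int(minutes) * 60 + int(seconds)) * fps
```

With the illustrative timestamps from the manual-labeling bullet, `mmss_to_frame("0:42")` gives 1260 and `mmss_to_frame("6:06")` gives 10980, which matches the frame numbers used in the command above.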
+### 4.2 FALLBACK — A + C + (J or manual): no code change to C3
+
+If teaching the C3 wrapper to forward `mask=…` is rejected for this fixture (a reasonable choice to keep `tests/e2e/replay/` purely "drop in an MP4 + a JSON + a CSV" with zero glue code), substitute **Option C** for Option B: replace each burned-in OSD rectangle with FFmpeg `delogo` interpolation.
+
+**On this specific MKV, Option C collapses into Option A.** The verified crop in §4.1 Step 1 already produces a near-zero-OSD output (0.0006 % of pixels flagged as static-OSD-like over 8 sample frames), so there are no rectangles left to delogo. The Option-C-versus-Option-B trade-off only re-emerges for hypothetical *other* recordings of this class that need a looser crop — e.g. a recording where the camera HUD is positioned differently and there is no clean rectangle wholly outside it. The generic recipe shape for such a recording would be:
+
+```bash
+# Template only — instantiate W/H/X/Y for the looser crop and (x,y,w,h)
+# rectangles for each surviving OSD region, all in cropped-frame coords.
+# The delogo filter in FFmpeg 8.1 has no 'band' parameter (removed); only x, y,
+# w, h, show remain. Rectangles must NOT touch the cropped frame's edge.
+ffmpeg -y -i "$INPUT" \
+       -vf "crop=W:H:X:Y,\
+delogo=x=x1:y=y1:w=w1:h=h1,\
+delogo=x=x2:y=y2:w=w2:h=h2,\
+..." \
+       -an -c:v libx264 -crf 18 -preset medium \
+       "$OUTPUT_DIR/${FIXTURE_ID}_delogo.mp4"
+```
+
+**Important caveats on Option C** (when it does need to be used):
+- `delogo` rectangles must not touch the image edge (no surrounding pixels to interpolate from).
+- `delogo` produces *new* pixels (interpolated from the immediate neighbourhood). They are not synthesised semantic terrain content, but they *are* new pixels that did not exist in the original camera capture. The downstream feature detector *can* fire on smooth interpolated regions (DISK keypoints sometimes detect on smooth gradient transitions). This is the residual risk of Option C versus Option B; quantify it by running both pipelines on a few nadir segments and comparing the keypoint density inside the masked regions on the Option C output to zero (the trivially-correct value Option B delivers).
+- `delogo` does **not** scale to rectangles much larger than ~50 px in their shorter dimension. For this MKV the IR PIP is ~520 × 350 px and cannot be cleanly delogo'd at all — geometric exclusion (i.e. the corrected crop in §4.1 Step 1) is the only honest option.
+- Then chain Step 4 + Step 5 from Section 4.1 on top of this Option C output to get the same nadir-only result.
+
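For a recording that does need Option C, the filter chain and the edge rule from the caveats can be generated and checked together. The sketch below is illustrative only: `delogo_chain` is a hypothetical helper, the one-pixel margin is this sketch's reading of "must not touch the edge", and all rectangles are in cropped-frame coordinates.

```python
from typing import Iterable, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, w, h) in cropped-frame coordinates


def delogo_chain(crop_w: int, crop_h: int, crop_x: int, crop_y: int,
                 rects: Iterable[Rect]) -> str:
    """Build 'crop=W:H:X:Y,delogo=...' and reject rectangles touching the frame edge."""
    parts = [f"crop={crop_w}:{crop_h}:{crop_x}:{crop_y}"]
    for x, y, w, h in rects:
        # Require at least one pixel of real image on every side of the rectangle,
        # so delogo has neighbours to interpolate from.
        inside = x >= 1 and y >= 1 and x + w <= crop_w - 1 and y + h <= crop_h - 1
        if w <= 0 or h <= 0 or not inside:
            raise ValueError(
                f"delogo rectangle {(x, y, w, h)} touches or exceeds the frame edge"
            )
        parts.append(f"delogo=x={x}:y={y}:w={w}:h={h}")
    return ",".join(parts)
```

The returned string is what would be passed to `-vf`. Failing early in the prep script is cheaper than discovering an FFmpeg error partway through a re-encode.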
+### 4.3 Companion files
+
+| File | Source | Conventions |
+|---|---|---|
+| `flight_topotek_2026-05-09.mp4` | Sections 4.1 / 4.2 | H.264, 30 fps, exactly the cropped + OSD-handled + nadir-filtered video. Matches the `flight_derkachi.mp4` shape (any sub-1080p H.264 MP4 the replay harness already accepts). |
+| `osd_mask.png` | Section 4.1 Step 2 (only for Option B) | Grayscale PNG with the same dimensions as the cropped MP4 (610×260 for the §4.1 Step 1 crop), white=keep, black=suppress. Versioned alongside the MP4. |
+| `2026-05-09 16-09-54.tlog` | Just unzip the `.zip` from `input_data/10.05.2026/` | Identical to the supplied tlog; ArduCopter 4.6.3 (Pixhawk6X), 133 191 messages over 446.8 s. |
+| `data_imu.csv` | Reuse the existing `derkachi.tlog → data_imu.csv` exporter, retargeted at this new tlog | 10 Hz table of `SCALED_IMU2` and `GLOBAL_POSITION_INT` per the `flight_derkachi/README.md` convention. |
+| `frame_ranges.yaml` | Section 4.1 Step 4 | List of `(start_frame, end_frame)` pairs the fixture considers "valid nadir frames". |
+| `camera_info.md` | Hand-written, modelled on `flight_derkachi/camera_info.md` | Records: camera class (Topotek / Viewpro 3-axis multi-sensor ball, per ArduPilot Source #4 + the tlog's `GIMBAL_MANAGER_INFORMATION` cap_flags), recording chain (camera HDMI → GCS app → desktop screen recorder → MKV), and the calibration's provenance flag (`factory_sheet`, per AZ-702 precedent — Fact #17). |
+| `topotek_gimbal_factory.json` | Same shape as `khp20s30_factory.json` (Fact #17) | Per-camera intrinsics + lens distortion from the camera's published spec sheet. Mark provenance `factory_sheet`. Residual focal-length error expected in the 1–3 % band, same envelope the project already accepts for `flight_derkachi.mp4`. |
+
+---
+
+## 5. New evidence — the paired tlog's gimbal state
+
+The prior 2026-05-29 run left exactly one unresolved row in `06_component_fit_matrix.md`:
+
+> **Frame filtering by gimbal pointing (PRIMARY) — `pymavlink` parser of paired `.tlog` for `MOUNT_STATUS` / `MOUNT_ORIENTATION`** → **Needs user decision**: depends on whether the paired tlog actually emits `MOUNT_STATUS` for the camera in question; if the gimbal does not report attitude over MAVLink, this option fails.
+
+This draft resolves that row by directly scanning the paired tlog. Here is the evidence.
+
+### 5.1 What is in the tlog (`2026-05-09 16-09-54.tlog`, unpacked from the supplied `.zip`)
+
+`pymavlink 2.4.49` with `MAVLINK20=1`, `MAVLINK_DIALECT=all`. Scanned: 133 191 messages over 446.8 s, 46 distinct message types. Relevant subset:
+
+| Message type | Count | Mean rate | Notes |
+|---|---|---|---|
+| `HEARTBEAT` | 1492 | 3.3 Hz | 4 endpoints: `(sys=1, comp=1, autopilot=3, type=2)` = ArduCopter / QUAD multirotor; `(sys=1, comp=191, autopilot=0, type=6)` = a GCS-class component co-resident on sysid 1; `(sys=255, comp=0, autopilot=8, type=18)` and `(sys=255, comp=190, autopilot=8, type=6)` = Mission Planner GCS. |
+| `ATTITUDE` (vehicle) | 4174 | 9.3 Hz | Body pitch range: min −12.47°, max +4.68°, mean −3.95°. **This is the airframe attitude**, not the gimbal. |
+| `GIMBAL_DEVICE_ATTITUDE_STATUS` | **4338** | **9.7 Hz** | **All 4338 messages carry the identity quaternion `q = (1.0, 0.0, 0.0, 0.0)`** (exactly one distinct quaternion value across the entire flight). `flags = 0x002c` = `YAW_IN_VEHICLE_FRAME | PITCH_LOCK | ROLL_LOCK`. `failure_flags = 0x00000000`. |
+| `GIMBAL_MANAGER_INFORMATION` | 26 | discovery exchange | `gimbal_device_id=1`. Capability: pitch range `[−90°, +20°]`, yaw range `±180°`, roll range `±30°`. cap_flags=206847. Confirms the gimbal physically *can* reach nadir; just isn't reporting where it is right now. |
+| `COMMAND_LONG` distinct cmds | — | — | Only 6 distinct command IDs: 183 (`DO_SET_SERVO ch=15 pwm=1950` — one shot, possibly a release / trigger), 400 (`COMPONENT_ARM_DISARM`, twice), 511 (`SET_MESSAGE_INTERVAL`, 11×), 512 (`REQUEST_MESSAGE`, 100×), 520 (`REQUEST_AUTOPILOT_CAPABILITIES`, 47×), and 42428 (vendor-specific, params all zero). **None of these is a gimbal control command (no `MAV_CMD_DO_MOUNT_CONTROL` = 205, no `MAV_CMD_DO_GIMBAL_MANAGER_PITCHYAW` = 1000).** |
+| `NAMED_VALUE_FLOAT` names | 1 unique | — | Only `ESCs_CURR`. No gimbal-related custom variable. |
+| `STATUSTEXT` | 38 | — | Includes `'ArduCopter 4.6.3 - Agile(px6) (92b0cd78)'`, `'Pixhawk6X 001E0036 …'`, `'Frame: QUAD/X'`, `'Mission Planner 1.3.83'`. No gimbal-related text. |
+
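Two of the derived values in the table can be re-checked by hand: the mean rates and the decoding of `flags = 0x002c`. The sketch below is a stdlib-only cross-check, not the scan script itself. The bit assignments are this sketch's transcription of `GIMBAL_DEVICE_FLAGS` from the MAVLink common message set and should be confirmed against the dialect in use.

```python
from typing import List

# Bit values of GIMBAL_DEVICE_FLAGS (transcribed from the MAVLink common set).
GIMBAL_DEVICE_FLAGS = {
    0x0001: "RETRACT",
    0x0002: "NEUTRAL",
    0x0004: "ROLL_LOCK",
    0x0008: "PITCH_LOCK",
    0x0010: "YAW_LOCK",
    0x0020: "YAW_IN_VEHICLE_FRAME",
    0x0040: "YAW_IN_EARTH_FRAME",
    0x0080: "ACCEPTS_YAW_IN_EARTH_FRAME",
    0x0100: "RC_EXCLUSIVE",
    0x0200: "RC_MIXED",
}


def decode_flags(value: int) -> List[str]:
    """Return the names of all flag bits set in value, lowest bit first."""
    return [name for bit, name in sorted(GIMBAL_DEVICE_FLAGS.items()) if value & bit]


def mean_rate_hz(count: int, duration_s: float) -> float:
    """Mean message rate over the whole log."""
    return count / duration_s


DURATION_S = 446.8  # log duration reported in Section 5.1
flags = decode_flags(0x002C)
gimbal_rate = round(mean_rate_hz(4338, DURATION_S), 1)
```

Here `flags` comes out as `ROLL_LOCK`, `PITCH_LOCK`, `YAW_IN_VEHICLE_FRAME`, and `gimbal_rate` as 9.7 Hz, which agrees with the table row.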
+### 5.2 What the identity quaternion really means
+
+`q = (1, 0, 0, 0)` is the null rotation. Per the MAVLink GIMBAL_DEVICE_ATTITUDE_STATUS spec, that means "the gimbal is in its default forward-pointing pose" (no rotation away from the body frame's +X). But the prior run's frame-by-frame visual inspection saw the gimbal *clearly pointing forward at t=30s and clearly pointing nadir at t=300s* (Fact #5). The two observations are mutually exclusive: if the gimbal were truly at the null rotation throughout the flight, every frame would look like it does at t=30s (forward).
+
+The reconciliation: **the gimbal is being moved by the operator, but the actual angle is not being reported back over MAVLink in this recording.** The gimbal driver is emitting the placeholder identity quaternion every ~100 ms because the ArduPilot mount driver expects to publish *something* at the configured rate, but no real angle is available (the gimbal device either isn't wired to talk back over MAVLink, or it is wired but isn't responding, or it is responding on a different transport — most likely the camera's own Ethernet protocol talking directly to the GCS, bypassing the autopilot).
+
+This is consistent with:
+- Mission Planner being able to control Topotek / Viewpro gimbals directly over Ethernet/UDP, separate from the ArduPilot MAVLink path.
+- The `DO_SET_SERVO ch=15 pwm=1950` one-shot pointing to a *trigger* (likely shutter / record-toggle), not a per-frame angle command.
+- The absence of `MAV_CMD_DO_MOUNT_CONTROL` and `MAV_CMD_DO_GIMBAL_MANAGER_PITCHYAW` in `COMMAND_LONG` (§5.1).
+
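To make the identity-quaternion argument concrete, the sketch below converts a `(w, x, y, z)` quaternion to Euler angles and shows what a real nadir report would have looked like. It assumes the common ZYX (yaw-pitch-roll) convention and a unit quaternion. `pitch_only_quat` is a hypothetical helper for building test values, not something read from the tlog.

```python
import math
from typing import Tuple


def quat_to_euler_deg(w: float, x: float, y: float, z: float) -> Tuple[float, float, float]:
    """Convert a unit quaternion to (roll, pitch, yaw) in degrees, ZYX convention."""
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Clamp to guard against rounding pushing the argument just outside [-1, 1].
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return (math.degrees(roll), math.degrees(pitch), math.degrees(yaw))


def pitch_only_quat(pitch_deg: float) -> Tuple[float, float, float, float]:
    """Quaternion for a pure rotation about the body Y axis (pitch only)."""
    half = math.radians(pitch_deg) / 2.0
    return (math.cos(half), 0.0, math.sin(half), 0.0)


identity_angles = quat_to_euler_deg(1.0, 0.0, 0.0, 0.0)
nadir_pitch = quat_to_euler_deg(*pitch_only_quat(-90.0))[1]
```

The identity quaternion yields zero roll, pitch, and yaw, while a gimbal genuinely reporting nadir would send roughly `(0.707, 0, -0.707, 0)`. A log with exactly one distinct quaternion value therefore cannot describe a gimbal that was seen both forward and at nadir.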
+### 5.3 Effect on the recommendation
+
+| Component-fit row (from the 2026-05-29 component fit matrix) | Original status | Status after Section 5 |
+|---|---|---|
+| Frame filtering by gimbal pointing (PRIMARY) — `pymavlink` parser of `MOUNT_STATUS`/`MOUNT_ORIENTATION` from the paired tlog | **Needs user decision** | **❌ Rejected for this recording.** The message type IS present at 9.7 Hz, but every quaternion is the placeholder identity value; the data carries zero information about the actual gimbal angle. |
+| Frame filtering by gimbal pointing (FALLBACK) — OCR on the burned-in pitch text | Experimental only | **✅ Selected as primary** (or the manual labeling pass, for one-off fixtures of this size). |
+
+**No other row in `06_component_fit_matrix.md` changes.** The pixel-handling recommendation (Option B primary, Option C fallback) and the rejections (Options F generative / G temporal-median) stand.
+
+### 5.4 Why this is not a project-runtime issue
+
+The project's *runtime* nav-camera per `restrictions.md` is the ADTi 20MP fixed-downward (no gimbal at all). The runtime pipeline never sees a multi-sensor-ball gimbal-attitude stream. So the gap discovered here ("Mission-Planner-driven Topotek gimbals don't expose attitude over MAVLink") is only relevant for fixture preparation, not for the runtime contract. The follow-up "change the recording procedure to enable Topotek's own attitude-publish path" would require camera/GCS access this project does not have, so it is unavailable as a workaround. For any further recordings of this class, **plan on OCR-based pitch recovery (Option J) — or a manual labelling pass per fixture — as the standing strategy**, not as a temporary fallback.
+
+---
+
+## 6. Component fit summary (consolidated)
+
+> Full detail per row in [`../../00_research/_mode_b_2026-05-29_video_extraction/06_component_fit_matrix.md`](../../00_research/_mode_b_2026-05-29_video_extraction/06_component_fit_matrix.md). The table below is the *post-tlog-scan* update.
+
+| Component area | Candidate | Pinned mode | Status | Notes |
+|---|---|---|---|---|
+| GCS-chrome geometric crop | FFmpeg `crop` filter | `crop=W:H:X:Y` per recording, derived from variance-map analysis (`crop=610:260:250:440` for this MKV, §4.1 Step 1) | **Selected** | Trivial, deterministic, and pixel-verbatim apart from the re-encode. PoC1 produced playable output on the prior run. |
+| OSD pixel handling (PRIMARY) | Kornia `DISK.forward(img, mask=…)` | mask-aware mode, `(B, 1, H, W)` mask multiplied into the DISK score map before NMS | **Selected** | No pixel modification; fabrication-risk = 0. Requires one-line C3 wrapper change to forward `mask=`. Already API-verified against Kornia docs L1 (Source #6) + LightGlue maintainer reply (Source #5). |
+| OSD pixel handling (FALLBACK) | FFmpeg `delogo` chained | multiple `delogo=x:y:w:h` after `crop`, rectangles inside the cropped frame | **Selected (fallback)** | PoC4 produced `poc4_delogo.mp4` on the prior run. Pick this if the C3 wrapper change is rejected for this fixture. |
+| OSD pixel handling | FFmpeg `removelogo` PNG mask | `removelogo=mask.png` | **Experimental only** | Failed locally with `Invalid argument` (-22) on FFmpeg 8.1; works in older versions per Source #15. Try first on your team's pinned FFmpeg before falling through to chained `delogo`. |
+| OSD pixel handling | ProPainter (non-generative video inpainter, ICCV 2023) | mask-guided sparse Transformer with flow completion | **Experimental only** | Highest visual quality among non-generative options. Adds PyTorch+CUDA toolchain; ~0.25 s/frame at 480p (Fact #11). Use only if a future recording's masked regions are too large for `delogo` interpolation. |
+| OSD pixel handling | VideoPainter / DiffuEraser / VidPivot (and any generative video inpainter) | diffusion-backbone I2V generative inpainter | **❌ Rejected** | Synthesises terrain content. Disqualified by `meta-rule.mdc` "Real Results, Not Simulated Ones" (Fact #12). |
+| OSD pixel handling | FFmpeg `tmedian` temporal median | `tmedian=radius=N` | **❌ Rejected** | Burned-in OSD text values change every frame, so the static-OSD assumption underneath the technique fails. PoC3 confirmed: smeared, ghosted output (Fact #10). |
+| Non-nadir frame filter (PRIMARY) | `pymavlink` MOUNT_STATUS / GIMBAL_DEVICE_ATTITUDE_STATUS | parse paired tlog → `frame_idx → gimbal_pitch_deg` table | **❌ Rejected for this recording (NEW)** | Section 5: message present at 9.7 Hz, but all 4338 quaternions are identity (1,0,0,0) — no real angle data. |
+| Non-nadir frame filter (PRIMARY, new) | OCR (Tesseract or PaddleOCR) on burned-in pitch text | per-frame OCR of the `−3.7°` text in the attitude indicator | **Selected (NEW)** | Was "Experimental only" pre-tlog-scan; promoted to primary now that the telemetry path is dead. Add `pytesseract` or `paddleocr` to `requirements-dev.txt`. |
+| Non-nadir frame filter (one-off alternative) | Manual labeling pass | developer watches the 6-min clip, marks ranges, commits `frame_ranges.yaml` | **Selected (for this fixture only)** | Cheapest deterministic path; recommended for this specific MKV unless additional GCS-screen-recorded fixtures are expected. |
+| Calibration JSON | Per-camera `topotek_gimbal_factory.json` (same shape as `khp20s30_factory.json`) | "factory_sheet" provenance per AZ-702 precedent | **Selected** | Project-accepted (Fact #17). Residual 1–3 % focal-length error envelope. |
+| Companion telemetry CSV | Existing `derkachi.tlog → data_imu.csv` exporter, retargeted | unchanged | **Selected** | Reuses existing tool. |
+| Source recovery (Option Z) | Pull RTSP / extract DCIM from gimbal with OSD disabled via Topotek GimbalControl utility (ArduPilot Source #4) | n/a — out-of-band camera access | **❌ Not available — no camera / GCS access for this data source.** | Documented here only because it is the cleanest path *in principle* and because the existing `flight_derkachi.mp4` fixture was produced this way. Not actionable for this project's data pipeline. The only way it returns to the table is if the original supplier voluntarily re-records with OSD off — outside this project's control. |
+
+---
+
+## 7. Testing strategy
+
+### 7.1 Functional / integration
+
+1. **Crop-coordinate validation.** Decode 5 frames from `flight_topotek_2026-05-09.mp4` and assert they match the §4.1 Step 1 crop size (610×260); assert the IR PIP is *not* present anywhere in the frame; assert the GCS sidebars are *not* present in the leftmost / rightmost columns.
+2. **OSD mask validation.** Open `osd_mask.png`, assert dimensions match the MP4, assert the union of black pixels covers ≥95 % of the union of OSD rectangles you would otherwise pass to `delogo`. Optionally, render `cropped_frame * (mask/255.)` and eyeball that the burned-in text is dimmed to black while the EO terrain is preserved.
+3. **DISK mask-aware contract.** Add a unit test under `tests/unit/c3_matchers/` that loads the existing DISK wrapper, passes a synthetic 900×445 image with a checkerboard pattern + a corner rectangle of pure white, passes a mask zeroing the corner, and asserts no keypoint is returned at coordinates inside that corner.
+4. **End-to-end replay smoke test.** Add a sibling test to `tests/e2e/replay/test_az835_e2e_real_flight.py` parameterised over `flight_topotek_2026-05-09` and confirm the pipeline runs to completion. Track end-to-end accuracy separately under AC-1.x.
+5. **Frame-range filter sanity.** Iterate `frame_ranges.yaml` and assert: every range's `start_frame < end_frame`, ranges are non-overlapping, and the union covers at least N seconds of footage (where N is a project-chosen minimum-fixture-duration).
+
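The frame-range sanity check in item 5 can be sketched as a small pure function. This is an illustration under assumptions: `validate_frame_ranges` is a hypothetical name, ranges are inclusive `(start_frame, end_frame)` pairs already parsed from `frame_ranges.yaml`, and `fps` and `min_seconds` are caller-supplied.

```python
from typing import Iterable, Optional, Tuple


def validate_frame_ranges(
    ranges: Iterable[Tuple[int, int]],
    fps: float = 30.0,
    min_seconds: float = 0.0,
) -> float:
    """Check ordering, overlap and total coverage; return the covered seconds.

    Raises ValueError on the first violated rule.
    """
    total_frames = 0
    prev_end: Optional[int] = None
    for start, end in sorted(ranges):
        if not start < end:
            raise ValueError(f"range ({start}, {end}) must have start_frame < end_frame")
        if prev_end is not None and start <= prev_end:
            raise ValueError(f"range ({start}, {end}) overlaps a previous range")
        total_frames += end - start + 1   # inclusive end frame
        prev_end = end
    covered_s = total_frames / fps
    if covered_s < min_seconds:
        raise ValueError(f"only {covered_s:.1f} s covered, need {min_seconds:.1f} s")
    return covered_s
```

A test can call this once per fixture and let the raised `ValueError` message identify the offending range.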
+### 7.2 Non-functional
+
+- **Reproducibility**: re-run the entire `tools/fixture_prep/` script twice on a clean checkout and assert byte-identical outputs (pin libx264 settings; pin `pytesseract` version; pin Python version in `pyproject.toml`).
+- **Throughput**: the entire fixture-prep run for one 6-minute MKV should complete in well under 10 minutes on a developer workstation (no AC requirement; sanity ceiling).
+- **No fabrication regression**: extract keypoints from the masked region of the Option B output and assert count == 0; for the Option C output, assert keypoint count is at most 5 % of the unmasked terrain keypoint count.
+
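The "no fabrication regression" check reduces to counting keypoints that land on suppressed mask pixels. The sketch below is one possible shape for that assertion: the function names are hypothetical, keypoints are assumed to be `(x, y)` pixel coordinates, and the mask is a plain row-major grid of 0/255 values rather than a decoded PNG.

```python
from typing import Iterable, Sequence, Tuple


def count_in_suppressed(
    keypoints: Iterable[Tuple[float, float]],
    mask: Sequence[Sequence[int]],
) -> int:
    """Count keypoints that fall on black (0) mask pixels; mask is indexed mask[y][x]."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    count = 0
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < height and 0 <= col < width and mask[row][col] == 0:
            count += 1
    return count


def option_c_within_budget(masked_count: int, terrain_count: int,
                           max_ratio: float = 0.05) -> bool:
    """Option C passes if masked-region keypoints are at most 5 % of terrain keypoints."""
    return masked_count <= max_ratio * terrain_count
```

For Option B the assertion is simply `count_in_suppressed(...) == 0`. For Option C the second helper encodes the 5 % ceiling stated above.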
+---
+
+## 8. Open questions
+
+1. **Adopt the one-line C3 wrapper change?** Section 4.1 Step 3 proposes forwarding `mask=…` through the existing DISK call. This is the lowest-risk highest-quality path (Option B) but requires touching `src/.../matchers/`. The fallback (Option C, chained `delogo`) avoids this code change entirely and only touches the fixture-prep script. **Either is defensible** — the choice depends on whether the team is willing to formalise mask-aware fixtures as a first-class concept in the replay layer (recommended yes) or wants to keep that layer "drop in an MP4" pure (defensible too).
+2. **Use OCR for the frame filter, or just hand-label this one fixture?** For a single 6-minute clip, the manual labeling pass is cheaper than building, validating, and pinning a Tesseract/PaddleOCR pipeline. Use OCR only if you expect to ingest additional fixtures of the same class (same gimbal HUD layout) and want the script to amortise. Either way, the YAML output format is the same — so this can be revisited later.
+3. **Future recordings — Option Z (direct RTSP/DCIM extraction with OSD off) is not available** to this project: no camera or GCS access exists for this data source. The only theoretical path to bypass the cleanup pipeline is to ask the original supplier to re-record with OSD disabled (via Topotek's GimbalControl utility on their side, or by setting `MNT1_OPTIONS` / equivalent on their flight controller). Whether to even make that request is a separate decision; it is not a technical option this project can execute on its own. Assume Option Z stays unavailable and plan all future fixtures of this class around the OCR / manual-labelling path in §4 + §5.3.
+
+---
+
+## 9. References
+
+L1 (official documentation / source):
+- FFmpeg `delogo` filter, vf_delogo.c [Sources #1, #2 in source registry]
+- FFmpeg `removelogo` filter, vf_removelogo.c [Source #3]
+- FFmpeg `tmedian` filter [Source #8]
+- ArduPilot Topotek Gimbal docs [Source #4]
+- LightGlue maintainer reply on score-map masking, issue cvg/LightGlue#97 [Source #5]
+- Kornia `DISK.forward()` documentation [Source #6]
+- DISK upstream source (`disk/model/disk.py`) [Source #7]
+- Project: `_docs/00_problem/input_data/flight_derkachi/README.md` [Source #9]
+- Project: `_docs/00_problem/input_data/flight_derkachi/camera_info.md` [Source #10]
+
+L2 (peer-reviewed):
+- ProPainter (ICCV 2023) [Source #11]
+- VideoPainter (arXiv 2503.05639, 2025) [Source #12] — referenced as disqualified
+- VidPivot / DiffuEraser comparison (arXiv 2510.21461, 2025) [Source #13]
+- DISK paper (NeurIPS 2020, arXiv 2006.13566) [Source #14]
+
+L3 (practitioner / community):
+- "Removing obnoxious logos from videos" blog [Source #15]
+- Conditional Temporal Median Filter reference [Source #16]
+- Foundry Nuke TemporalMedian reference [Source #17]
+
+In-repo cross-references:
+- `_docs/01_solution/solution_draft01.md` — existing solution; C2 (MixVPR TensorRT INT8+FP16), C3 (DISK + LightGlue), C5 (GTSAM iSAM2 + CombinedImuFactor) [Source #R1]
+- `_docs/00_research/06_component_fit_matrix/00_summary.md` — confirms no fixture-prep component exists in the runtime [Source #R2]
+- `_docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json` — existing per-camera calibration JSON precedent [Source #R3]
+
+This-run new evidence:
+- ffprobe verification of `2026-05-09 16-10-54.mkv` technical metadata (Section 2.1)
+- `pymavlink` scan of unpacked `2026-05-09 16-09-54.tlog` (Section 5) — 133 191 messages over 446.8 s, GIMBAL_DEVICE_ATTITUDE_STATUS at 9.7 Hz, all identity quaternions
+
+---
+
+## 10. Related artifacts
+
+| Artifact | Status |
+|---|---|
+| `_docs/00_research/_mode_b_2026-05-29_video_extraction/` | Complete through Step 7.5 — this draft is its Step 8 deliverable, with one row updated by new tlog evidence |
+| `_docs/01_solution/solution_draft01.md` | Untouched. C1–C12 unchanged. This draft is purely additive. |
+| `_docs/01_solution/solution.md` | Untouched. |
+| `_docs/00_problem/input_data/10.05.2026/2026-05-09 16-10-54.mkv` | Source MKV. Untouched. |
+| `_docs/00_problem/input_data/10.05.2026/2026-05-09 16-09-54.zip` | Source tlog archive. Untouched. |
+| Future: `input_data/flight_topotek_2026-05-09/` | The cleaned fixture directory this draft proposes producing. Not yet created. |
+| Future: `tools/fixture_prep/` | The reproducible script that will produce the above. Not yet created. |
@@ -36,8 +36,8 @@ See `architecture.md` for the full ADR set (ADR-001..ADR-009), 12 architectural
 | 09 | C7 Inference Runtime | TensorRT 10.3 engines (Polygraphy / trtexec / IBuilderConfig hybrid); ORT+TRT EP fallback; PyTorch FP16 baseline | E-BOOT, E-CC-CONF, E-CC-FDR-CLIENT | AZ-249 |
 | 10 | C8 FC + GCS Adapter | `pymavlink` `GPS_INPUT` for ArduPilot (signed) + `MSP2_SENSOR_GPS` for iNav (unsigned, accepted residual risk); honest 6×6 → 2×2 covariance projection; GCS 1–2 Hz downsampled telemetry | C5, E-CC-CONF, E-CC-LOG | AZ-261 |
 | 11 | C10 Pre-flight Cache Provisioning | Builds model-derived cache (descriptors, engines, manifest, content hashes); F2 takeoff verifier; does NOT touch `satellite-provider` (network I/O lives in C11) | C6, C7, E-CC-LOG | AZ-252 |
-| 12 | C11 Tile Manager | Operator-side `TileDownloader` (pre-flight) + `TileUploader` (post-landing, gated `flight_state == ON_GROUND`); excluded from airborne image | C6, E-CC-CONF, E-CC-LOG | AZ-251 |
-| 13 | C12 Operator Pre-flight Tooling | CLI subcommands (`download`, `build-cache`, `upload-pending`, `reloc-confirm`); sector classification UI hook; FDR retrieval helpers | C10, C11, E-CC-LOG | AZ-253 |
+| 12 | C11 Tile Manager | Operator-side `TileDownloader` (pre-flight) + `TileUploader` (post-landing — no internal flight-state gate after Batch 44; gating lives in C12); excluded from airborne image | C6, E-CC-CONF, E-CC-LOG | AZ-251 |
+| 13 | C12 Operator Pre-flight Orchestrator | CLI subcommands (`download`, `build-cache`, `upload-pending`, `reloc-confirm`); `PostLandingUploadOrchestrator` (gates on `flight_footer.clean_shutdown`); `OperatorReLocService` (AC-3.4 hint via `OperatorCommandTransport`); sector classification UI hook; FDR retrieval helpers | C10, C11, E-CC-LOG | AZ-253 |
 | 14 | C13 Flight Data Recorder | Per-flight ≤64 GB NVM ring (estimates + IMU + emitted MAVLink + health + mid-flight tiles + ≤0.1 Hz failed-tile thumbnails); raw nav/AI-cam frames excluded | E-BOOT, E-CC-LOG, E-CC-CONF, E-CC-FDR-CLIENT | AZ-248 |

 **Cross-cutting epics** (not components, but shared concerns): E-BOOT (AZ-244), E-CC-LOG (AZ-245), E-CC-CONF (AZ-246), E-CC-FDR-CLIENT (AZ-247).
@@ -103,7 +103,7 @@ The test suite is organised as scenario specs (no source code yet). Per-componen
 | C8 | `components/10_c8_fc_adapter/tests.md` |
 | C10 | `components/11_c10_provisioning/tests.md` |
 | C11 | `components/12_c11_tilemanager/tests.md` |
-| C12 | `components/13_c12_operator_tooling/tests.md` |
+| C12 | `components/13_c12_operator_orchestrator/tests.md` |
 | C13 | `components/14_c13_fdr/tests.md` |

 ### System-level scenario suites (`_docs/02_document/tests/`)
@@ -142,7 +142,7 @@ Both the inclusive reading (PARTIAL = covered) and the strict reading clear the
 | 7 | AZ-250: E-C6 — Tile Cache + Spatial Index | C6 | M | 13–21 | E-BOOT, E-CC-LOG, E-CC-CONF |
 | 8 | AZ-251: E-C11 — Tile Manager | C11 | M | 13–21 | E-C6, E-CC-CONF, E-CC-LOG |
 | 9 | AZ-252: E-C10 — Pre-flight Cache Provisioning | C10 | M | 13–21 | E-C6, E-C7, E-CC-LOG |
-| 10 | AZ-253: E-C12 — Operator Pre-flight Tooling | C12 | M | 13–21 | E-C10, E-C11, E-CC-LOG |
+| 10 | AZ-253: E-C12 — Operator Pre-flight Orchestrator | C12 | M | 13–21 | E-C10, E-C11, E-CC-LOG |
 | 11 | AZ-254: E-C1 — Visual / Visual-Inertial Odometry | C1 | XL | 34–55 | E-BOOT, E-CC-FDR-CLIENT, E-C7 |
 | 12 | AZ-255: E-C2 — Visual Place Recognition | C2 | L | 21–34 | E-C6, E-C7, E-CC-FDR-CLIENT |
 | 13 | AZ-256: E-C2.5 — Inlier-based Re-rank | C2.5 | S | 5–8 | E-C2, E-C7, E-C6 (shared LightGlue helper) |
@@ -167,7 +167,7 @@ Both the inclusive reading (PARTIAL = covered) and the strict reading clear the
 | 6 | D-CROSS-LATENCY-1 hybrid (ADR-006): K=3 baseline auto-degrades to K=2 + Jacobian covariance under thermal throttle | Preserves AC-4.1 at +50 °C ambient at the cost of ~5–10 % accuracy | Static K=2 — wastes covariance precision in nominal conditions; static K=3 — blows the budget under throttle |
 | 7 | Spoof-promotion gate (ADR-008): re-promote only after ≥10 s `gps_health == STABLE_NON_SPOOFED` AND visual-consistency check passes | AC-NEW-2 / AC-NEW-8 floor; defends against attacker turning spoof off briefly | Time-only gate (≥30 s) — slower mission recovery, still fool-able by transient honest GPS during attack |
 | 8 | Interface-first components with constructor injection (ADR-009) | Multiple interchangeable strategies on the same interface (C1 has 3, C2 has 6+, C8 has 2) — selection via composition root only | Service-locator / global registry — couples runtime to import order, breaks tests, breaks build-time exclusion |
-| 9 | OpenCV pinned to ≥4.12.0 (Mode B Fact #112) | CVE-2025-53644 mitigation; IPPE flags for solvePnP D-C4-1=(b) | Older OpenCV — known CVE; newer beta — not pinned in JetPack 6.2 |
+| 9 | OpenCV pin **temporarily relaxed** to `>=4.11.0.86,<4.12` (was `>=4.12.0` per original Mode B Fact #112) | `gtsam==4.2.1` (only published wheel) is built against the numpy 1.x ABI, while `opencv-python>=4.12` requires numpy>=2. The D-CROSS-CVE-1 follow-up is tracked in `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md` and reinstates the `>=4.12.0` pin once gtsam ships numpy-2 wheels (or an alternative SE(3) backend is adopted) | Keep numpy>=2: `gtsam.Pose3` segfaults; force `opencv-python>=4.12`: uninstallable; drop gtsam: loses ADR-003 honest 6×6 covariance |
 | 10 | DISK + LightGlue replaces SuperPoint+SuperGlue for cross-domain matching (D-C3-1 = (a)) | License — SP+SG is Magic Leap noncommercial canonical; DISK+LightGlue is BSD-3-Clause | SuperPoint+SuperGlue — license incompatibility; XFeat — promising but unproven cross-domain |
 | 11 | AC-NEW-4 / AC-NEW-7 text relaxed 2026-05-09 to Monte-Carlo-over-current-data with stated 95 % CI | D-PROJ-3 multi-flight fixture acquisition is out of scope this cycle; literal "≥100 flights" wording cannot be met | Block planning on D-PROJ-3 — cycle would not close; relaxed wording is documented residual risk in R11 |
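
The spoof-promotion gate in Decision 7 can be read as a dwell timer combined with a second, independent check. The sketch below is only an illustration under assumptions: the class name `SpoofPromotionGate`, the `update` signature, and the use of a monotonic seconds clock are hypothetical and not taken from the code base; only the 10 s dwell, the `STABLE_NON_SPOOFED` value, and the AND with the visual-consistency check come from the table above.

```python
from dataclasses import dataclass
from typing import Optional

STABLE_NON_SPOOFED = "STABLE_NON_SPOOFED"
MIN_STABLE_S = 10.0  # dwell time from Decision 7 (ADR-008)


@dataclass
class SpoofPromotionGate:
    """Hypothetical helper: tracks how long gps_health has been continuously stable."""

    stable_since_s: Optional[float] = None

    def update(self, now_s: float, gps_health: str, visual_consistent: bool) -> bool:
        """Return True if GPS may be re-promoted at time now_s."""
        if gps_health != STABLE_NON_SPOOFED:
            # Any non-stable sample resets the dwell timer.
            self.stable_since_s = None
            return False
        if self.stable_since_s is None:
            self.stable_since_s = now_s
        dwell_ok = (now_s - self.stable_since_s) >= MIN_STABLE_S
        # Both conditions must hold; time alone is not sufficient.
        return dwell_ok and visual_consistent
```

Because the timer resets on every non-stable sample, an attacker who switches the spoof off for less than the dwell window never satisfies the gate in this sketch.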

@@ -203,6 +203,55 @@ Both the inclusive reading (PARTIAL = covered) and the strict reading clear the
 | `diagrams/components.drawio` | Component-level diagram (visual companion to `components/`) |
 | `diagrams/flows/00_index.md` | Per-flow index pointing into `system-flows.md` |

+## Cycle 1 Implementation Status
+
+> Appended 2026-05-19 as part of greenfield Step 13 (Update Docs, task mode). Captures the as-built deltas from this planning report after 97 implementation batches + Step 11 Run Tests + Step 12 Test-Spec Sync. Sources: `_docs/03_implementation/implementation_completeness_cycle1_report.md`, `_docs/03_implementation/run_tests_step11_report.md`, `_docs/02_tasks/done/` (165 task specs), `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`.
+
+### Composition-root architecture additions (not in the original Plan)
+
+The Plan-era `architecture.md` § ADR-009 (interface-first, constructor injection) and module-layout.md's "Composition root is `runtime_root/`" rule remain correct, but cycle 1 added two cross-cutting Tier-1 mechanisms inside `runtime_root/` that the Plan did not anticipate. Both are operational prerequisites for `compose_root()` reaching takeoff:
+
+- **`_STRATEGY_REGISTRY` + `register_strategy(...)` API (AZ-591)** — a module-level `dict[(component_slug, strategy_name)] → factory` populated per-binary. `runtime_root.airborne_bootstrap.register_airborne_strategies()` fills 7 airborne slots (`c1_vio`, `c2_vpr`, `c2_5_rerank`, `c3_matcher`, `c3_5_adhop`, `c4_pose`, `c5_state`) with `tier="airborne"`. Without this, `compose_root()` raises `StrategyNotLinkedError` on the first config-driven strategy lookup. The registry is the runtime-side complement to ADR-002 build-time exclusion: the build chooses which strategies are even available to register, the registry chooses which one this binary serves.
+- **`pre_constructed` kwarg + `build_pre_constructed(config)` (AZ-618 umbrella → subtasks AZ-619..AZ-624)** — `compose_root(config, pre_constructed=...)` now requires a 12-key dict of infrastructure objects (`c13_fdr`, `c6_descriptor_index`, `c6_tile_store`, `c7_inference`, `c3_lightglue_runtime`, `c3_feature_extractor`, `c282_ransac_filter`, `c5_wgs_converter`, `c5_se3_utils`, `c5_isam2_graph_handle`, `c5_imu_preintegrator`, `clock`). The airborne entrypoint builds these in 6 dependency-ordered phases (A=c13/clock → F=wire_main); GPU-touching builders gate on the corresponding `BUILD_*` env flag. Missing keys raise `AirborneBootstrapError` with the missing-key name; tests stub by passing the same `pre_constructed=...` kwarg.
+
+These additions sit inside `runtime_root/`; no component crosses the import boundary AZ-507 enforces. Both still need to be folded into `architecture.md` (ADR-009 sibling notes) and `module-layout.md` § "Composition Root" — deferred to a follow-up `/document` task pass.
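
The two mechanisms above can be pictured with a minimal sketch. It is not the actual `runtime_root` code: the function signatures, the `resolve_strategy` and `check_pre_constructed` helpers, the `"klt_ransac"` strategy name, and the shortened key list are assumptions made for illustration. Only the registry keying by `(component_slug, strategy_name)`, the two error names, and the "fail with the missing key name" behaviour come from the text.

```python
from typing import Any, Callable, Dict, Mapping, Tuple


class StrategyNotLinkedError(LookupError):
    """No factory registered for the requested (component, strategy) pair."""


class AirborneBootstrapError(RuntimeError):
    """A required pre_constructed key is missing."""


Factory = Callable[..., Any]
_STRATEGY_REGISTRY: Dict[Tuple[str, str], Factory] = {}

# Only two of the twelve keys listed above, to keep the sketch short.
REQUIRED_KEYS = ("c13_fdr", "clock")


def register_strategy(component_slug: str, strategy_name: str, factory: Factory) -> None:
    """Record a factory for one (component, strategy) slot."""
    _STRATEGY_REGISTRY[(component_slug, strategy_name)] = factory


def resolve_strategy(component_slug: str, strategy_name: str) -> Factory:
    """Look up a factory, failing loudly when the slot was never registered."""
    try:
        return _STRATEGY_REGISTRY[(component_slug, strategy_name)]
    except KeyError:
        raise StrategyNotLinkedError(
            f"{component_slug}/{strategy_name} is not registered in this binary"
        ) from None


def check_pre_constructed(pre_constructed: Mapping[str, Any]) -> None:
    """Reject a pre_constructed mapping that lacks a required key."""
    missing = [key for key in REQUIRED_KEYS if key not in pre_constructed]
    if missing:
        raise AirborneBootstrapError(f"missing pre_constructed key: {missing[0]}")
```

A test would then stub the infrastructure by passing plain objects, for example `check_pre_constructed({"c13_fdr": object(), "clock": object()})`.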
+
+### BLOCKED tasks with parked Tier-2 follow-ups
+
+Per the implement skill § 15 "PASS-with-BLOCKED" allowable terminal classification (see `implementation_completeness_cycle1_report.md`):
+
+| Task | Status | Reason | Parked Tier-2 follow-up |
+|------|--------|--------|-------------------------|
+| AZ-332 — C1 OKVIS2 production-default `VioStrategy` | BLOCKED | Tier-2 prerequisites: CI build env + Jetson hardware + DBoW2 vocab artifact. Ships a Python facade + pybind11 binding skeleton; first `add_frame` raises until upstream `okvis::ThreadedSlam` wiring lands. | **AZ-592** (`_docs/02_tasks/backlog/`) |
+| AZ-333 — C1 VINS-Mono research-only `VioStrategy` | BLOCKED | Tier-2 prerequisites + upstream vendoring decision (HKUST + ROS-strip vs. community fork). Same skeleton-only state as AZ-332. | **AZ-593** (`_docs/02_tasks/backlog/`) |
+
+**Operational consequence**: the production-default airborne `VioStrategy` for cycle-1 release is **`KltRansac`** (the engine-rule-mandatory simple baseline, AZ-334), NOT OKVIS2 as `architecture.md` ADR-001 nominally implies. ADR-001 / ADR-002 remain architecturally correct (the seam exists; the build-flag gating works); the production *default selection* shifts until Tier-2 lands. The `_STRATEGY_REGISTRY` still registers the OKVIS2 + VINS-Mono slots so the registry seam stays correct — selecting them via config raises `StrategyNotAvailableError` from `vio_factory.py` until their `BUILD_*` flag is ON.
+
+**Closed Won't-Fix during this cycle**: AZ-589 + AZ-590 (the original remediation tasks for AZ-332 + AZ-333). Both targeted upstream APIs that do not exist: the checked-in OKVIS2 submodule does not expose them, and there is no VINS-Mono submodule at all. The post-mortem details are in `implementation_completeness_cycle1_report.md` § "Verdict — Revised 2026-05-16".
+
+### Run Tests (Step 11) results
+
+| Surface | Result |
+|---------|--------|
+| Local Tier-1 pytest suite | **3343 passed / 88 skipped / 0 failed** (12 logical chunks; full details in `run_tests_step11_report.md`) |
+| Docker Tier-1 SUT Reality Gate | **NOT MET** — both harnesses (`scripts/run-tests.sh` + `e2e/docker/run-tier1.sh`) have pre-existing drift unrelated to Step 10 work (missing `ardupilot/*` + `inavflight/*` images on Docker Hub; broken Dockerfile entrypoints; unseeded tile-cache volume). Rehabilitation epic **AZ-602** owns this. |
+| Skip classification | 14 Tier-2-only (Jetson) — legitimate; 8 CUDA/GPU absent on macOS dev host — legitimate; 6 TensorRT-on-Tier-2-only — legitimate; 57 Docker-compose-dependent — borderline, becomes covered once any harness runs end-to-end; 3 console-scripts-not-on-PATH — env-conditional; remainder legitimate |
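
As a quick cross-check of the skip classification against the 88 skipped tests reported for the local suite, the named categories can be tallied. The dictionary keys below are shorthand labels chosen for this sketch, not identifiers from the test reports.

```python
# Counts copied from the "Skip classification" row; keys are illustrative labels.
named_skips = {
    "tier2_only_jetson": 14,
    "cuda_gpu_absent_on_macos": 8,
    "tensorrt_tier2_only": 6,
    "docker_compose_dependent": 57,
    "console_scripts_not_on_path": 3,
}

TOTAL_SKIPPED = 88  # from the "Local Tier-1 pytest suite" row

named_total = sum(named_skips.values())
remainder = TOTAL_SKIPPED - named_total

print(f"named categories: {named_total}, remainder: {remainder}")
```

On these figures the five named categories already account for every skip, so the "remainder" bucket is empty.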
+
+### Dependency pin drift since Plan
+
+- **opencv-python**: relaxed from `>=4.12.0` to `>=4.11.0.86,<4.12` (D-CROSS-CVE-1 deferred per `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`; root cause is the gtsam==4.2.1 / numpy<2.0 ABI lock). See revised Decision 9 above. The original CVE-2025-53644 intent stands, and the `>=4.12.0` pin will be reinstated once gtsam ships numpy-2 wheels.
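
To make the relaxed range concrete, the sketch below checks a dotted version string against `>=4.11.0.86,<4.12` using plain tuple comparison. It is a simplification that assumes purely numeric release segments and does not implement full PEP 440 semantics; `opencv_pin_ok` is a hypothetical helper, not part of the project.

```python
from typing import Tuple

LOWER: Tuple[int, ...] = (4, 11, 0, 86)  # inclusive floor
UPPER: Tuple[int, ...] = (4, 12)         # exclusive ceiling


def parse_release(version: str) -> Tuple[int, ...]:
    """Split a purely numeric dotted version into an integer tuple."""
    return tuple(int(part) for part in version.split("."))


def opencv_pin_ok(version: str) -> bool:
    """True when the version falls inside the cycle-1 relaxed pin."""
    release = parse_release(version)
    return LOWER <= release < UPPER
```

For example, `opencv_pin_ok("4.11.0.86")` is accepted, while `"4.12.0.88"` and `"4.10.0.84"` both fall outside the range.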
+
+### Documentation reconciliation still owed (deferred to a follow-up `/document` task pass)
+
+This Step-13 session updated the highest-leverage system-level deltas only. The following surfaces still carry Plan-era assertions that do not match cycle-1 as-built behaviour and should be refreshed in a follow-up session:
+
+- `architecture.md` — add ADR-009 sibling notes for `_STRATEGY_REGISTRY` + `pre_constructed`; revise OpenCV mentions (§ Technology stack, § Risks); reflect KltRansac-as-production-default in § C1 ADR commentary.
+- `module-layout.md` — extend § "Composition Root" with `airborne_bootstrap` + `build_pre_constructed` ownership; add AZ-618 / AZ-591 ownership rows in the runtime_root section.
+- `components/01_c1_vio/description.md` — note KltRansac is the operational default while AZ-592 / AZ-593 are parked.
+- `components/<NN>_<cN>/description.md` for the other 13 components — task-by-task reconciliation against the 80 product tasks in `done/`.
+- `common-helpers/*.md` — task-by-task reconciliation against the 8 helper tasks in `done/`.
+- `tests/*.md` — task-by-task reconciliation against the ~36 Blackbox Tests tasks in `done/` (some already touched by Step 12 Test-Spec Sync — see `tests/traceability-matrix.md` and `tests/resilience-tests.md` diffs).
+
 ## Quality Checklist Verification

 All 8 checklist sections from `.cursor/skills/plan/steps/07_quality-checklist.md` pass for this cycle:
@@ -11,7 +11,7 @@ The system is a **Jetson Orin Nano Super-hosted onboard companion** that deliver

 ### Components — intent-level (formal decomposition belongs to Step 3)

-- **C1 — Visual / Visual-Inertial Odometry**: pluggable `VioStrategy` (Okvis2 default, VinsMono in research builds only, KltRansac mandatory simple-baseline), config-selected at startup, not hot-swappable mid-flight.
+- **C1 — Visual / Visual-Inertial Odometry**: pluggable `VioStrategy` (Okvis2 architecturally-nominated production-default, VinsMono in research builds only, KltRansac mandatory simple-baseline), config-selected at startup, not hot-swappable mid-flight. **Cycle-1 operational reality**: AZ-332 (Okvis2) and AZ-333 (VinsMono) shipped as facade-only — both require Tier-2 prerequisites (CI build env + Jetson hardware + DBoW2 vocab artifact) that cycle 1 did not deliver, so the production-default selection is **KltRansac** (AZ-334) until AZ-592 / AZ-593 (Tier-2 follow-ups) land. ADR-001 / ADR-002 are unchanged — the seam holds; the *selection* shifted.
 - **C2 — Visual Place Recognition**: pre-cached satellite-tile retrieval (UltraVPR primary, MegaLoc secondary, MixVPR / SelaVPR / EigenPlaces / NetVLAD / SALAD additional candidates), all behind a single `VprStrategy` interface; concrete implementation chosen by config at startup.
 - **C2.5 — Top-N inlier-based re-rank**: re-ranks the top-K=10 VPR candidates by single-pair LightGlue inlier count down to top-N=3.
 - **C3 — Cross-domain matcher**: DISK+LightGlue (D-C3-1 = (a)) over the N=3 retained candidates; ALIKED+LightGlue secondary; XFeat alternate.
@@ -22,8 +22,8 @@ The system is a **Jetson Orin Nano Super-hosted onboard companion** that deliver
 - **C7 — On-Jetson inference runtime**: TensorRT 10.3 engines (Polygraphy / trtexec / IBuilderConfig hybrid orchestration), JetPack 6.2, SM 87; ONNX Runtime + TRT EP fallback; pure PyTorch FP16 baseline.
 - **C8 — Flight-Controller adapter**: `pymavlink` `GPS_INPUT` for ArduPilot Plane (MAVLink 2.0 message signing on the companion ↔ AP wired channel, D-C8-9 = (d)) and `YAMSPy` / INAV-Toolkit `MSP2_SENSOR_GPS` for iNav (signing-gap accepted residual risk).
 - **C10 — Pre-flight cache provisioning**: builds the **model-derived** cache artifacts (descriptor generation, engine compilation, manifest + content-hash) on top of an already-populated tile store; F2 takeoff verifier (D-C10-1, D-C10-3, D-C10-6, D-C10-7). C10 does NOT touch `satellite-provider` — tile network I/O lives in C11.
-- **C11 — Tile Manager** (operator-side, distinct binary/image, ADR-004 process-isolated): owns operator-side network I/O against `satellite-provider` in **both directions**. `TileDownloader` interface fetches tiles into C6 during F1 (TLS + service-internal API key); `TileUploader` interface, gated on `flight_state == ON_GROUND`, pushes mid-flight tiles to `satellite-provider`'s ingest endpoint (D-PROJ-2 contract; not yet implemented service-side). The component bundles both interfaces because they share auth, HTTP client, deployment unit, and the airborne-exclusion property.
-- **C12 — Operator pre-flight tooling** (Plan-phase carryforward, deferred from research): cache provisioning UI, sector classification (active-conflict vs stable rear), freshness pipeline workflow.
+- **C11 — Tile Manager** (operator-side, distinct binary/image, ADR-004 process-isolated): owns operator-side network I/O against `satellite-provider` in **both directions**. `TileDownloader` interface fetches tiles into C6 during F1 (TLS + service-internal API key); `TileUploader` interface pushes mid-flight tiles to `satellite-provider`'s ingest endpoint (D-PROJ-2 contract; not yet implemented service-side). C11 carries **no flight-state gating** of its own (Batch 44 SRP refactor) — the post-landing safety check lives in C12 (single source of truth). The component bundles both interfaces because they share auth, HTTP client, deployment unit, and the airborne-exclusion property.
+- **C12 — Operator Pre-flight Orchestrator** (operator-side, same image as C11): orchestrates the operator-side workflows that C11 implements. Hosts the pre-flight cache provisioning UI, sector classification (active-conflict vs stable rear), the `Flight` resolver (`FlightsApiClient` → bbox + takeoff origin), the **post-landing upload orchestrator** (gates `TileUploader` on the `flight_footer` FDR record's `clean_shutdown` field — AZ-329), and the **operator re-localization service** (AC-3.4 visual-loss hint dispatched to the airborne companion over the GCS link via the `OperatorCommandTransport` Protocol; concrete pymavlink-backed impl is an E-C8 deliverable — AZ-330). The C12 ⇄ C11 boundary is a thin Protocol cut (`TileUploaderCut`) so C12 does not import C11 directly (AZ-507).
 - **C13 — Flight Data Recorder (FDR)**: per-flight ≤64 GB NVM record of estimates + IMU traces + emitted MAVLink + system health + mid-flight tiles + ≤0.1 Hz failed-tile thumbnails; raw nav/AI-cam frames excluded (AC-8.5, AC-NEW-3).
 - **External: `satellite-provider`** (parent-suite .NET 8 service): tile producer pre-flight; tile sink post-landing (D-PROJ-2). Treated as a planned external dependency on the upload + voting paths.

@@ -31,8 +31,8 @@ The system is a **Jetson Orin Nano Super-hosted onboard companion** that deliver

 1. **Camera-specific math enters only via a `Camera calibration artifact` JSON** (intrinsics + distortion + body-to-camera extrinsics + acquisition method `factory_sheet | checkerboard_refined | hybrid`). No hard-coded camera math anywhere; test fixtures (`adti26`) and production deployments (`adti20`) load different artifacts on the same code path.
 2. **VioStrategy is selected at startup via config; not hot-swappable mid-flight.**
-3. **Build-time exclusion of unused `Strategy` implementations.** A given binary links only the implementations it actually uses at runtime. The default deployment binary links the production-default strategies (e.g. OKVIS2 on C1) plus the engine-rule-mandatory simple-baseline (KltRansac on C1); the IT-12 comparative-study binary links all C1 implementations side-by-side. The mechanism is per-component CMake `BUILD_*` flags (`BUILD_VINS_MONO`, `BUILD_SALAD`, …) plus the per-binary composition root choosing among the linked implementations at startup. **Justification is technical** — binary size on the 8 GB shared Jetson, boot/load time inside the AC-NEW-1 30 s budget, deployed dependency / attack surface, and accidental-selection risk reduction (a binary with only OKVIS2 + KltRansac linked cannot be misconfigured into running VINS-Mono). **Component licenses do not drive this decision** — see ADR-002. CI emits both the deployment binary and the research binary on every PR.
-4. **In-air network I/O against `satellite-provider` is forbidden — in BOTH directions.** Enforced primarily by **process-level isolation** — the Tile Manager (C11), which carries both the `TileDownloader` and the `TileUploader` interfaces, is not loaded in the airborne companion image. Software guard on `flight_state == ON_GROUND` (upload) is a defense-in-depth check, not the primary control. The companion is read-only against C6 in flight; both pre-flight tile fetching and post-landing tile upload happen on the operator workstation.
+3. **Build-time exclusion of unused `Strategy` implementations.** A given binary links only the implementations it actually uses at runtime. The default deployment binary links the production-default strategies (architecturally OKVIS2 on C1; **operationally KltRansac in cycle 1** while AZ-332 / AZ-333 are BLOCKED awaiting Tier-2 prerequisites — see Components C1 above and FINAL_report § "Cycle 1 Implementation Status") plus the engine-rule-mandatory simple-baseline (KltRansac on C1); the IT-12 comparative-study binary links all C1 implementations side-by-side. The mechanism is per-component CMake `BUILD_*` flags (`BUILD_VINS_MONO`, `BUILD_SALAD`, …) plus the per-binary composition root choosing among the linked implementations at startup. **Justification is technical** — binary size on the 8 GB shared Jetson, boot/load time inside the AC-NEW-1 30 s budget, deployed dependency / attack surface, and accidental-selection risk reduction (a binary with only OKVIS2 + KltRansac linked cannot be misconfigured into running VINS-Mono). **Component licenses do not drive this decision** — see ADR-002. CI emits both the deployment binary and the research binary on every PR.
+4. **In-air network I/O against `satellite-provider` is forbidden — in BOTH directions.** Enforced primarily by **process-level isolation** — the Tile Manager (C11), which carries both the `TileDownloader` and the `TileUploader` interfaces, is not loaded in the airborne companion image. The defense-in-depth software guard is a C12-side `flight_footer.clean_shutdown == True` check (read by `PostLandingUploadOrchestrator` from the post-flight FDR via `FdrFooterReader`); C11 itself no longer gates (Batch 44 SRP refactor). The companion is read-only against C6 in flight; both pre-flight tile fetching and post-landing tile upload happen on the operator workstation.
 5. **All persistent imagery is in `satellite-provider`'s on-disk tile format** (`./tiles/{zoomLevel}/{x}/{y}.jpg` + matching metadata) so post-landing upload is byte-identical. No raw frames on disk except the AC-8.5 forensic ≤0.1 Hz failed-tile thumbnail log inside FDR.
 6. **Honest 6×6 posterior covariance via GTSAM `Marginals`** is the safety floor for AC-NEW-4 and AC-NEW-7. Under-reported `horiz_accuracy` is a defect, not a tuning knob.
 7. **MAVLink 2.0 message signing on the companion ↔ ArduPilot wired channel**, with per-flight key rotation (D-C8-9 = (d)). iNav has no signing equivalent — accepted residual risk, Plan-phase carryforward proposes an iNav firmware feature request.
@@ -66,7 +66,7 @@ The system is a **Jetson Orin Nano Super-hosted onboard companion** that deliver
 | All onboard pose-estimation logic (C1–C8, C13) | Parent-suite `satellite-provider` (.NET 8 REST microservice) |
 | Pre-flight cache artifact build (C10 — engines + descriptors + manifest) | Parent-suite `flights` REST service (.NET 8; owns the `Flight` + `Waypoint` DTOs) |
 | Operator-side Tile Manager (C11 — pre-flight download + post-landing upload) | Parent-suite Mission Planner UI (`suite/ui` — where operators plan the route) |
-| Operator pre-flight tooling (C12) | GCS (QGroundControl) |
+| Operator Pre-flight Orchestrator (C12) | GCS (QGroundControl) |
 | FDR writer (C13) | Nav camera hardware (`adti20`); AI-camera hardware |
 | Camera calibration artifact format + loader | UAV airframe / FC IMU / sensors |
 | | Operator's workstation OS / authentication |
@@ -99,13 +99,13 @@ The system is a **Jetson Orin Nano Super-hosted onboard companion** that deliver
 | VPR (primary) | UltraVPR | RAL 2025 / ICRA 2026 (cbbhuxx/UltraVPR) | Documentary Lead PRIMARY; rotation-invariant, unsupervised aerial pretrain (multi-heading aerial flight + closes D-C2-1 retrain cost) |
 | VPR (secondary) | MegaLoc, MixVPR, SelaVPR, EigenPlaces, NetVLAD | upstream HEAD pinned per Plan-phase | Mode B Fact #110/#113 + mandatory simple-baseline (NetVLAD/MixVPR) |
 | State estimator | GTSAM + `gtsam_unstable.IncrementalFixedLagSmoother` | per Plan-phase pin (no published CVE at audit time) | Native 6×6 covariance; D-C5-5 = (c) `PriorFactorPose3` only |
-| Image / pose math | OpenCV (Python+C++) | **≥ 4.12.0** | CVE-2025-53644 mitigation (Mode B Fact #112); IPPE flags for D-C4-1 = (b) |
+| Image / pose math | OpenCV (Python+C++) | **≥ 4.11.0.86, < 4.12** (cycle-1 relaxation; original target ≥ 4.12.0) | CVE-2025-53644 mitigation target was ≥ 4.12.0 (Mode B Fact #112); cycle 1 relaxed the floor because `gtsam==4.2.1` only ships numpy<2 wheels and `opencv-python>=4.12` requires numpy>=2 — see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`. 4.11.0.86 is in the supported 4.x line and receives security patches; the ≥ 4.12.0 pin replays once gtsam ships numpy-2 wheels or an alternative SE(3) backend lands. IPPE flags for D-C4-1 = (b) unaffected. |
 | VPR descriptor index | FAISS HNSW | upstream HEAD pinned per Plan-phase | `faiss.write_index` + atomicwrites + SHA-256 content-hash gate (D-C10-3) |
 | FC adapter (ArduPilot) | `pymavlink` + MAVLink 2.0 signing | bundled unmodified per D-C8-3 | Verified Source #4; ArduPilot canonical signing per Source #128 |
 | FC adapter (iNav) | YAMSPy + INAV-Toolkit MSP2 | MIT throughout | iNav has no inbound MAVLink ext-positioning handler (SQ6) |
-| VIO (production) | OKVIS2 (BSD-3-Clause) | upstream HEAD pinned per Plan-phase | D-C1-1-SUB-A = (a) production-default |
-| VIO (research / IT-12) | VINS-Mono | upstream HEAD pinned per Plan-phase | Research binary only (`BUILD_VINS_MONO=ON`) for IT-12 comparative study; build-time exclusion from deployment binary per ADR-002 |
-| VIO (mandatory baseline) | KLT+RANSAC over OpenCV | OpenCV ≥ 4.12.0 | Engine-rule-required mandatory simple-baseline |
+| VIO (production) | OKVIS2 (BSD-3-Clause) | upstream HEAD pinned per Plan-phase | D-C1-1-SUB-A = (a) architecturally-nominated production-default. **Cycle-1**: AZ-332 BLOCKED — facade + pybind11 skeleton ship; first `add_frame` raises until Tier-2 prerequisites (CI build env + Jetson hardware + DBoW2 vocab) and AZ-592 follow-up land. |
+| VIO (research / IT-12) | VINS-Mono | upstream HEAD pinned per Plan-phase | Research binary only (`BUILD_VINS_MONO=ON`) for IT-12 comparative study; build-time exclusion from deployment binary per ADR-002. **Cycle-1**: AZ-333 BLOCKED — same skeleton-only state as AZ-332, plus pending upstream-vendoring decision (HKUST + ROS-strip vs. community fork); AZ-593 follow-up. |
+| VIO (mandatory baseline) | KLT+RANSAC over OpenCV | OpenCV ≥ 4.11.0.86 (cycle-1 relaxation; see OpenCV row) | Engine-rule-required mandatory simple-baseline. **Cycle-1**: serves as the operational airborne `VioStrategy` default while AZ-332 / AZ-333 remain BLOCKED. |
 | Tile cache backend | PostgreSQL + filesystem | PostgreSQL 16 (mirror of `satellite-provider`) | C6 mirrors `satellite-provider`'s on-disk and table layout so C11 `TileUploader`'s post-landing payload is byte-identical to what the parent suite already serves |
 | Container runtime | Docker (Tier-1) + bare JetPack (Tier-2) | Docker 27.x; JetPack 6.2 | Tier-1 workstation Docker; Tier-2 Jetson native (no Docker — direct JetPack to keep INT8 calibration cache trustworthy per D-C10-6) |
 | Build system | CMake + Python `pyproject.toml` | CMake ≥ 3.27 | CMake `option(BUILD_VINS_MONO ...)` D-C1-1-SUB-A; Python wheels built per Jetson via cibuildwheel-equivalent recipe |
@@ -135,13 +135,13 @@ The system is a **Jetson Orin Nano Super-hosted onboard companion** that deliver
 | `staging-tier1` | CI runs that don't require Jetson hardware | GitHub-hosted runner (x86_64); Docker |
 | `staging-tier2` | CI runs that require Jetson (AC-bound jobs only) | Self-hosted Jetson runner; bare JetPack (no Docker) |
 | `production` | Deployed companion image on a UAV | Jetson Orin Nano Super (pinned); bare JetPack; no inbound network listening (defense-in-depth, NFT-SEC-05) |
-| `production-operator-workstation` | Pre-flight tile download + cache artifact build (C10) + post-landing tile upload (C11) + FDR retrieval | Operator's Linux workstation; Docker for `satellite-provider` mirror |
+| `production-operator-workstation` | Operator-side workflows orchestrated by C12: pre-flight tile download (C11 `TileDownloader`), cache artifact build (C10), post-landing tile upload (C12 `PostLandingUploadOrchestrator` → C11 `TileUploader`), AC-3.4 re-loc hint dispatch (C12 `OperatorReLocService`), FDR retrieval | Operator's Linux workstation; Docker for `satellite-provider` mirror |

 **Infrastructure**:

-- **No cloud orchestration**. The companion is an embedded edge device; the operator's workstation is a single host that runs the operator tooling (C11 Tile Manager + C12 Operator Pre-flight Tooling) and a local `satellite-provider` mirror or VPN-reaches the lab `satellite-provider`.
-- **Two binaries shipped on every PR** (ADR-002): `deployment-binary` (links the production-default strategy on each component + the mandatory simple-baseline; CMake `BUILD_VINS_MONO=OFF`, `BUILD_SALAD=OFF`, …) and `research-binary` (links every available strategy on every component; all `BUILD_*` flags `ON`, used for the IT-12 comparative study). The deployment binary is what installs onto an operational Jetson; the research binary runs on dev/lab Jetson hardware for the comparative-study report. The same code base produces both — ADR-002 mechanism scales to additional binary variants later if packaging strategy requires it.
-- **Container scope**: Tier-1 uses Docker (`docker compose` for the developer setup including a `mock-suite-sat-service` container, the operator-tool container, and a Postgres for C6). **Tier-2 (Jetson) does NOT use Docker** — TensorRT INT8 calibration caches and `jetson-stats` thermal telemetry are most reliable without a container layer, per D-C7-9 + D-C10-6. The deployed image on the Jetson is a JetPack-based system image with the deployment binary preinstalled.
+- **No cloud orchestration**. The companion is an embedded edge device; the operator's workstation is a single host that runs the operator tooling (C11 Tile Manager + C12 Operator Pre-flight Orchestrator) and a local `satellite-provider` mirror or VPN-reaches the lab `satellite-provider`.
+- **Two airborne binaries shipped on every PR** (ADR-002): `deployment-binary` (links the production-default strategy on each component + the mandatory simple-baseline; CMake `BUILD_VINS_MONO=OFF`, `BUILD_SALAD=OFF`, …) and `research-binary` (links every available strategy on every component; all `BUILD_*` flags `ON`, used for the IT-12 comparative study). The deployment binary is what installs onto an operational Jetson; the research binary runs on dev/lab Jetson hardware for the comparative-study report. The same code base produces both — ADR-002 mechanism scales to additional binary variants later if packaging strategy requires it. **Replay is not a separate binary** (ADR-011): the deployment-binary runs both live and replay modes from the same image, swapping `FrameSource` / `FcAdapter` / `MavlinkTransport` strategies at startup based on `config.mode`. A third binary — `operator-orchestrator` (C10 + C11 + C12) — ships from the same source tree for the operator workstation; the airborne deployment-binary does NOT contain the operator-orchestrator components (ADR-004 process isolation).
+- **Container scope**: Tier-1 uses Docker (`docker compose` for the developer setup including a `mock-suite-sat-service` container, the operator-orchestrator container, and a Postgres for C6). **Tier-2 (Jetson) does NOT use Docker** — TensorRT INT8 calibration caches and `jetson-stats` thermal telemetry are most reliable without a container layer, per D-C7-9 + D-C10-6. The deployed image on the Jetson is a JetPack-based system image with the deployment binary preinstalled.
 - **Scaling**: not applicable (per-UAV, single companion). Failover is per-airframe (the FC's IMU-only fallback at AC-5.2 is the system's "scale-out").
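
The "replay is not a separate binary" point above amounts to choosing a different set of I/O strategies at startup from the same image. The following sketch only illustrates that idea: the `IoStrategies` container, the `select_io_strategies` helper, the mode values `"live"` / `"replay"`, and every concrete strategy label are assumptions, since the text names only the three strategy roles and `config.mode`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IoStrategies:
    """Names of the three strategies that differ between live and replay."""

    frame_source: str
    fc_adapter: str
    mavlink_transport: str


# Hypothetical labels; the real strategy names are not given in this document.
_BY_MODE = {
    "live": IoStrategies("camera_frame_source", "serial_fc_adapter", "uart_transport"),
    "replay": IoStrategies("video_frame_source", "tlog_fc_adapter", "null_transport"),
}


def select_io_strategies(mode: str) -> IoStrategies:
    """Pick the I/O strategy set once, at startup, from the configured mode."""
    try:
        return _BY_MODE[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
```

Everything downstream of these three seams stays identical between the two modes, which is why a single deployment binary can serve both.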

 **Environment-specific configuration**:
@@ -167,10 +167,10 @@ source repo
   │     └─ tier2 (self-hosted Jetson) AC-bound suite (NFT-PERF-*, NFT-LIM-*, IT-12)
   │
   ├─→ release artifacts:
-   │     ├─ deployment-binary tarball (production-default strategies + mandatory baselines, ADR-002)
-   │     ├─ research-binary tarball (all strategies linked; for IT-12 comparative study)
+   │     ├─ deployment-binary tarball (production-default strategies + mandatory baselines + replay strategies, ADR-002 + ADR-011; runs both live and replay modes from a single image)
+   │     ├─ research-binary tarball (all strategies linked; for IT-12 comparative study; also includes replay strategies)
   │     ├─ JetPack image (deployment-binary preinstalled)
-   │     └─ operator-tooling tarball (C11 + C12 + e2e-test mock-suite-sat-service compose for offline integration testing)
+   │     └─ operator-orchestrator tarball (C11 + C12 + e2e-test mock-suite-sat-service compose for offline integration testing)
   │
   └─→ deploy paths:
         ├─ Jetson operational deploy: JetPack image flash (deployment-binary)
@@ -200,7 +200,10 @@ source repo
 | `Tile` | JPEG body + center lat/lon + zoomLevel + tile_size_meters/pixels + capture_timestamp + source + freshness flag + (mid-flight only) quality_metadata | C6 |
 | `TileQualityMetadata` | `estimator_label`, 2×2 covariance sub-matrix, `last_anchor_age_ms`, MRE, IMU bias norm — sufficient for D-PROJ-2 voting | C6 (write side from C5/C4 outputs) |
 | `EmittedExternalPosition` | WGS84 + honest `horiz_accuracy` + per-FC encoding (MAVLink `GPS_INPUT` for AP, MSP2 `MSP2_SENSOR_GPS` for iNav) | C8 |
-| `FlightStateSignal` | `IN_AIR | ON_GROUND` boolean derived from FC `MAV_STATE` | C8 inbound side; published to C11 only post-landing |
+| `FlightStateSignal` | `IN_AIR \| ON_GROUND` boolean derived from FC `MAV_STATE` | C8 inbound side; used internally by C8/C5 for live-flight state machines. **Not** consumed by C11/C12 — post-landing gating reads the C13-written `flight_footer` FDR record instead (Batch 44 SRP refactor) |
+| `FlightFooterRecord` | `{flight_id, clean_shutdown, total_records, segment_count, …}` — single FDR record written by C13 on clean shutdown | C13 (writer) → C12 `PostLandingUploadOrchestrator` (reader, via `FdrFooterReader`) |
+| `PostLandingUploadRequest` | `{flight_id, satellite_provider_url, api_key, batch_size}` | C12 CLI → C12 `PostLandingUploadOrchestrator` |
+| `ReLocHint` | Operator-supplied position hint for AC-3.4 visual-loss re-localization: `{approximate_position_wgs84: LatLonAlt, confidence_radius_m, reason}`; validated at construction (lat ∈ [-90,90]; lon ∈ (-180,180]; radius > 0; reason non-empty); emitted to airborne companion via `OperatorCommandTransport` Protocol (E-C8 concrete) | Operator CLI → C12 `OperatorReLocService` → (GCS link) airborne companion |
 | `FdrRecord` | Estimates + IMU traces + emitted MAVLink + system health + tiles + thumbnails (≤ 64 GB / flight) | C13 |
 | `Manifest` | Hash of (model + calibration + corpus + sector classification + takeoff origin) for D-C10-1 idempotence | C10 |
 | `EngineCacheEntry` | TRT engine + INT8 calibration cache keyed by SM/JP/TRT/precision tuple (D-C10-7) | C10, C7 |
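
The construction-time validation listed for `ReLocHint` can be sketched as a frozen dataclass. The field names inside `LatLonAlt` (`lat_deg`, `lon_deg`, `alt_m`) and the error messages are assumptions for illustration; the bounds themselves (lat in [-90, 90], lon in (-180, 180], radius > 0, non-empty reason) are the ones stated in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatLonAlt:
    # Field names are hypothetical; the document only names the type.
    lat_deg: float
    lon_deg: float
    alt_m: float


@dataclass(frozen=True)
class ReLocHint:
    approximate_position_wgs84: LatLonAlt
    confidence_radius_m: float
    reason: str

    def __post_init__(self) -> None:
        pos = self.approximate_position_wgs84
        # Latitude is a closed interval; longitude is open at -180, closed at 180.
        if not -90.0 <= pos.lat_deg <= 90.0:
            raise ValueError("lat must be in [-90, 90]")
        if not -180.0 < pos.lon_deg <= 180.0:
            raise ValueError("lon must be in (-180, 180]")
        if not self.confidence_radius_m > 0:
            raise ValueError("confidence_radius_m must be > 0")
        if not self.reason:
            raise ValueError("reason must be non-empty")
```

Writing the checks as `not (a <= x <= b)` also rejects NaN inputs, because every comparison with NaN is false.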
@@ -227,7 +230,8 @@ source repo
 - Mid-flight tile gen: `NavCameraFrame` + `PoseEstimate` → orthorectify → dedup → write to local C6 (no upload).
 - GCS telemetry: C5 → C8 → 1–2 Hz downsampled summary to QGroundControl.
 - FDR: every emitted/received stream → C13 ring with per-flight ≤ 64 GB cap.
-- Post-landing: operator triggers C11 `TileUploader` → reads C6 → uploads to `satellite-provider` ingest endpoint (D-PROJ-2 contract).
+- Post-landing: operator triggers C12 `PostLandingUploadOrchestrator` → reads `flight_footer` from FDR via `FdrFooterReader` → on `clean_shutdown == True` invokes C11 `TileUploader` (via `TileUploaderCut` Protocol) → reads C6 → uploads to `satellite-provider` ingest endpoint (D-PROJ-2 contract). Refusal modes (`footer_missing`, `unclean_shutdown`, `flight_id_not_found`, `fdr_unreadable`) raise `FlightStateNotConfirmedError` with operator-actionable remediation text and a distinct CLI exit code per mode.
+- Operator re-loc (AC-3.4 visual-loss path): operator submits `ReLocHint` via the `reloc-confirm` CLI → C12 `OperatorReLocService` validates the DTO → forwards to airborne companion via `OperatorCommandTransport` (E-C8 concrete) → records a `c12.reloc.requested` FDR record (`outcome ∈ {sent, failed}`). Live logs are redacted (lat/lon rounded to 5 decimals; `reason` truncated to 200 chars); the FDR record persists the full, unredacted hint for post-flight forensics.
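
The post-landing gating described above can be sketched as a small guard in front of the uploader. This is an illustration under assumptions, not the C12 implementation: the footer is modelled as a plain mapping, `gate_upload` and `run_cli` are hypothetical helpers, and the numeric exit codes are invented, since the text only says each refusal mode has a distinct code. Only two of the four refusal modes are exercised here; `flight_id_not_found` and `fdr_unreadable` would be raised by whatever reads the FDR.

```python
from typing import Callable, Mapping, Optional


class FlightStateNotConfirmedError(Exception):
    """Upload refused because a clean landing could not be confirmed."""

    def __init__(self, mode: str, remediation: str) -> None:
        super().__init__(f"{mode}: {remediation}")
        self.mode = mode
        self.remediation = remediation


# Hypothetical exit codes, one per refusal mode.
EXIT_CODES = {
    "footer_missing": 10,
    "unclean_shutdown": 11,
    "flight_id_not_found": 12,
    "fdr_unreadable": 13,
}


def gate_upload(footer: Optional[Mapping[str, object]], upload: Callable[[], None]) -> None:
    """Invoke the uploader only when the footer confirms a clean shutdown."""
    if footer is None:
        raise FlightStateNotConfirmedError(
            "footer_missing", "retrieve the complete FDR before uploading"
        )
    if footer.get("clean_shutdown") is not True:
        raise FlightStateNotConfirmedError(
            "unclean_shutdown", "inspect the flight record before uploading"
        )
    upload()


def run_cli(footer: Optional[Mapping[str, object]], upload: Callable[[], None]) -> int:
    """Map each refusal mode to its own process exit code."""
    try:
        gate_upload(footer, upload)
    except FlightStateNotConfirmedError as exc:
        print(exc)
        return EXIT_CODES[exc.mode]
    return 0
```

Keeping the check in front of the uploader, rather than inside it, mirrors the single-source-of-truth split between C12 and C11 that the text describes.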

 ---

@@ -258,11 +262,48 @@ source repo
 | ArduPilot Plane FC | MAVLink 2.0 (`GPS_INPUT` 5 Hz; `MAV_CMD_SET_EKF_SOURCE_SET`; `STATUSTEXT` / `NAMED_VALUE_FLOAT`) over UART/USB | MAVLink 2.0 message signing, per-flight key (D-C8-9 = (d)) | 5 Hz periodic emit; signing handshake at takeoff load (≤ 5 s, AC-NEW-1) | Signing handshake fail → companion refuses takeoff; mid-flight signing key compromise → FC ignores unsigned messages, AC-5.2 takes over |
 | iNav FC | MSP2 `MSP2_SENSOR_GPS` over UART; MAVLink outbound for telemetry | None (iNav has no signing) — accepted residual risk per Mode B Source #129 | 5 Hz periodic emit | Mid-flight bad-frame → iNav `mspGPSReceiveNewData()` receives only the latest frame; honest `hPosAccuracy` is the only safety net |
 | QGroundControl (GCS) | MAVLink 2.0 (`STATUSTEXT`, `NAMED_VALUE_FLOAT`, `GPS_RAW_INT`) | Same MAVLink 2.0 signing as the AP path (AP profile); no signing on iNav profile | 1–2 Hz downsampled (AC-6.1); operator commands are best-effort | GCS link drop → companion continues; no mid-flight reconfiguration is required from GCS |
-| `satellite-provider` (pre-flight) | REST over HTTP, OpenAPI at `/swagger`; filesystem access if co-located | TLS + service-internal API key (operator workstation only); the companion never reaches `satellite-provider` directly while airborne | Off-line pre-flight; not time-critical | Cache miss → C11 `TileDownloader` fails fast pre-flight; C10 build is blocked downstream; takeoff blocked |
+| `satellite-provider` (pre-flight read — bbox + slippy-map) | REST `POST /api/satellite/tiles/inventory` (bulk lookup by `(z,x,y)`, ≤ 5000 entries / request) + `GET /tiles/{z}/{x}/{y}` (slippy-map JPEG fetch); OpenAPI at `/swagger`; filesystem access if co-located | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) over TLS; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert. The companion never reaches `satellite-provider` directly while airborne. | Off-line pre-flight; not time-critical | Cache miss → C11 `TileDownloader` fails fast pre-flight; C10 build is blocked downstream; takeoff blocked |
+| `satellite-provider` (pre-flight route seed — cycle 3 / Epic AZ-835) | REST `POST /api/satellite/route` (corridor onboarding; body per `CreateRouteRequest.cs` DTO) + `GET /api/satellite/route/{id}` (status polling; terminal-success `mapsReady=true`) | Same JWT Bearer / TLS-insecure as the read path; validated pre-emptively against AZ-809 `CreateRouteRequestValidator` bounds | Off-line pre-flight; bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s) | Terminal failure → `RouteTerminalFailureError`; transient → `RouteTransientError`; validation → `RouteValidationError`. C11's `SatelliteProviderRouteClient` (AZ-838) owns the surface. |
 | `satellite-provider` (post-landing ingest, D-PROJ-2, **planned**) | REST `POST /api/satellite/tiles/ingest` (multipart) | Per-flight onboard signing key (carried with each tile); rate-limited | Bursty post-landing | Endpoint not yet implemented service-side → C11 keeps batches queued locally; never blocks the pre-flight cycle |
 | Operator workstation (pre-flight stage) | Filesystem (USB / Ethernet) | OS-level (operator login) | Not time-critical | Bad-stage detection via Manifest content-hash gate (D-C10-3) |
 | Nav camera | USB / MIPI-CSI / GigE (lens-module dependent) | n/a | 3 Hz | Frame drop / hardware fault → "VISUAL_BLACKOUT" path (AC-3.5, AC-NEW-8) |

+### `satellite-provider` integration (cycle-3 ground truth)
+
+**The Jetson e2e harness now consumes the REAL parent-suite `satellite-provider` .NET service** (lineage AZ-688 / AZ-691 / AZ-692; `satellite-provider` + `satellite-provider-postgres` services in `docker-compose.test.jetson.yml`). The legacy `mock-sat` fixture is retired from the Jetson compose; D-PROJ-2 `POST /api/satellite/upload` has shipped service-side (`Program.cs:211`). Tier-1 `docker-compose.test.yml` is deprecated as of 2026-05-20 per `_docs/02_document/tests/environment.md`.
+
+Two consequences for the architecture:
+
+1. **C11 read contract adapted to the v1.0.0 inventory shape (AZ-777 Phase 1)** — `POST /api/satellite/tiles/inventory` + `GET /tiles/{z}/{x}/{y}` replace the historical `GET /api/satellite/tiles?bbox=…&zoom=…` shape. The bbox-driven `download_tiles_for_area` entry point and its DTOs are unchanged at the call-site level; the contract adaptation is internal to `HttpTileDownloader`. Auth is JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) over TLS; `SATELLITE_PROVIDER_TLS_INSECURE=1` is a documented dev-only knob for self-signed certs. **Proposed successor (ADR-013 / AZ-976)**: gRPC `satellite.v1.RouteTileDelivery.DeliverRouteTiles` server-streaming with client tile catalog — see `tile_provision_grpc.md`; supersedes the never-shipped inventory REST endpoint.
+2. **Route-driven seeding (Epic AZ-835 / AZ-969)** — the operator submits a tlog-derived `RouteSpec` (produced by `replay_input.tlog_route.extract_route_from_tlog` — AZ-836) via C12 `seed-cache-from-tlog` (AZ-974) or the F11 `replay_api` demo job (AZ-973). E2E fixture `operator_pre_flight_setup` wraps the same production `operator_replay.cache_seed` module.
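
The bounded status polling used by the route-seed path (`poll_max_attempts × poll_interval_s`, default 60 × 5 s) might look roughly like the sketch below. The function name, the `fetch_status` callable, and the use of the built-in `TimeoutError` on budget exhaustion are assumptions; how `SatelliteProviderRouteClient` actually maps an exhausted budget onto its own error types is not stated here.

```python
import time
from typing import Any, Callable, Mapping


def poll_route_until_ready(
    fetch_status: Callable[[str], Mapping[str, Any]],
    route_id: str,
    *,
    poll_max_attempts: int = 60,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, Any]:
    """Poll a route's status until `mapsReady` is true or the attempt budget runs out.

    `fetch_status` stands in for `GET /api/satellite/route/{id}`; it is injected
    so the loop can be exercised without a network.
    """
    for attempt in range(1, poll_max_attempts + 1):
        status = fetch_status(route_id)
        if status.get("mapsReady") is True:
            return status
        if attempt < poll_max_attempts:
            sleep(poll_interval_s)  # no sleep after the final attempt
    # Assumption: a plain TimeoutError; the real client raises its own error types.
    raise TimeoutError(
        f"route {route_id} not ready after {poll_max_attempts} attempts"
    )
```

With the defaults the loop waits a little under 300 s in total, which is the bound the integration table quotes.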
+
+**Imagery source license attribution (cycle 3)**: the Jetson `satellite-provider` instance downloads from the **Google Maps satellite layer** (`lyrs=s`), governed by Google Maps Platform Terms of Service. Dev/research use only; production deployment requires either a Google Maps Platform licensing review or migration to a true CC-BY satellite source on the parent-suite side (parent-suite ticket TBD). Operator-side seed scripts (`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate the "Imagery © Google" attribution.
+
+**AZ-777 Phase 3+ superseded by Epic AZ-835**: AZ-777 originally proposed five phases — wire e2e-runner (Phase 1), seed Derkachi bbox (Phase 2), rewrite `operator_pre_flight_setup` fixture (Phase 3), un-xfail AC-4 / AC-5 (Phase 4), docs (Phase 5). Phases 1+2 shipped under AZ-777 itself (batch 104, cycle 3). Phases 3 and 5 were **superseded** when the user redirected the work to a route-driven flow: Phase 3 → AZ-839 (real fixture wiring C1+C2+C11+C10), Phase 5 → AZ-842 (this docs ticket). Phase 4 (un-xfail) was deferred to backlog after the cycle-4 redesign (AZ-895) took the un-xfail target along a different path and is not on the active epic. The AZ-777 task spec at `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md` carries the supersedure banner; this architecture document is the authoritative high-level pointer for that decision.
+
+No new ADR — this is execution of existing decisions (architectural principle #5 satellite-provider on-disk layout end-to-end; ADR-004 process-level isolation unchanged; ADR-011 replay is a configuration unchanged). The architectural surface gained the route-driven seeding path inside C11; nothing else moved.
+
+### Replay input redesign (cycle 4 — single canonical clock + CSV-driven path)
+
+Cycle 4 rebuilt the replay-mode operator-input surface around a single canonical clock to close the AZ-848 ESKF out-of-order regression and to retire the tlog auto-sync surface that produced the misalignment risk in the first place. Four tickets ship the change:
+
+| Ticket | Role | Description |
+|--------|------|-------------|
+| **AZ-894** (CSV adapter) | New primary path | `csv_replay_input.CsvReplayInputAdapter` consumes a paired `(video, CSV)` input in which the CSV's `Time` column is the canonical clock for every IMU/GPS sample. Gated by `BUILD_CSV_REPLAY_ADAPTER`: ON in airborne and research binaries; OFF in operator-orchestrator. |
+| **AZ-895** (auto-sync deprecation) | Removed legacy | `replay_input.auto_sync` (AZ-405) is reduced to a stub that raises on first call; `tlog_video_adapter.py` is reduced to a deprecated stub whose `open()` raises immediately. The legacy `--time-offset-ms` / `--skip-auto-sync` / `--auto-trim` CLI flags are accepted with a warning and then ignored. Hard removal is tracked in AZ-908 (cycle 5+ backlog). |
+| **AZ-896** (CSV format spec) | Contract | `_docs/02_document/contracts/replay/csv_replay_format.md` documents the CSV row schema, the row-0-alignment-with-video-frame-0 invariant, and an example `data_imu.csv` shipped under the same path. |
+| **AZ-897** (operator UI) | Cycle 5 — Epic AZ-969 | Dual-timeline `(video, tlog)` alignment UI in `../ui`; uploads raw tlog, calls `replay_api` preview/align/demo endpoints; displays map + verdict. Spec: `../ui/_docs/02_tasks/todo/AZ-897_operator_replay_sync_ui.md`. |
+
+The architectural rationale is captured in **Invariant 14** of the replay protocol (`_docs/02_document/contracts/replay/replay_protocol.md`): the system runs as a single edge process on a single device; there must be exactly one wall/monotonic clock authoritative for timestamps that cross component boundaries. In live mode that clock is the C8 inbound `FcAdapter`'s FC-boot-relative timestamp; in replay mode (after cycle 4) it is the CSV row's `Time` column. The previous design's two-clock surface (Jetson monotonic at C1 VIO emission, FC-boot at C8 IMU window arrival) produced the AZ-848 regression and is retired with the auto-sync deprecation.
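
One practical consequence of a single canonical clock is that ordering can be checked on one column. The sketch below scans a CSV's `Time` column for samples that go backwards in time. It assumes `Time` parses as a number; the second column in the usage example is made up, and this is not the adapter's actual validation logic.

```python
import csv
import io


def find_time_regressions(csv_text: str, time_column: str = "Time"):
    """Return (row_index, previous_time, current_time) for every row whose
    timestamp is earlier than the row before it."""
    regressions = []
    previous = None
    for row_index, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        current = float(row[time_column])
        if previous is not None and current < previous:
            regressions.append((row_index, previous, current))
        previous = current
    return regressions
```

For example, `find_time_regressions("Time,ax\n0.0,1\n0.1,2\n0.05,3\n0.2,4\n")` reports a single regression at data row 2, where the clock steps back from 0.1 to 0.05.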
+
+The legacy `TlogReplayFcAdapter` is retained for audit paths — offline FDR analysis and `gps-denied-tlog-to-csv` export (AZ-972). Runtime replay uses the CSV adapter after operator alignment (F11 / Epic AZ-969).
+
+### Demo replay operator flow (cycle 5 — Epic AZ-969)
+
+F11 in `system-flows.md` is the **primary product demo**, not an e2e-test concern. Raw operator inputs are `(video, tlog, calibration)`; alignment produces an AZ-896 CSV on a single canonical clock; route-driven cache seeding uses `extract_route_from_tlog` via C12 / `replay_api` production modules (AZ-974, AZ-973). Backend children: AZ-970 (preview API), AZ-971 (alignment refine), AZ-972 (CSV export), AZ-973 (orchestration), AZ-974 (C12 seed CLI), AZ-975 (docs). UI: AZ-897 in `../ui`.
+
+The cycle-4 `(video, CSV)` upload bypass (AZ-959) remains for operators who already have an aligned CSV; it is not the default demo entry.
+
 ### `satellite-provider` upload contract (per D-PROJ-2 carryforward)

 The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md`. From this architecture's standpoint:
@@ -270,7 +311,7 @@ The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/202
 - **`Tile` writes are append-only and idempotent** (the same `(zoomLevel, lat, lon, capture_timestamp, companion_id, flight_id)` tuple is the dedup key).
 - **Quality metadata is mandatory on every uploaded tile** so the planned voting layer can promote `pending → trusted` without re-deriving statistics on the service side.
 - **Onboard tiles never claim the `trusted` status**; they are uploaded as `pending` and the parent-suite voting layer (D-PROJ-2 design task #2) decides promotion.
- **Test substitute**: `mock-suite-sat-service` is an e2e-test-only fixture (under `tests/fixtures/mock-suite-sat-service/`) that implements the upload contract for NFT-SEC-01 / FT-P-17 / IT runs until D-PROJ-2 lands service-side. It is **not a component** in the architectural sense — the production architectural counterparty for both download and upload is the real `satellite-provider`. The fixture is retired the moment the real ingest endpoint ships.
+- **Test substitute**: `mock-suite-sat-service` is an e2e-test-only fixture (under `tests/fixtures/mock-suite-sat-service/`) that implements the upload contract for NFT-SEC-01 / FT-P-17 / IT runs until D-PROJ-2 lands service-side. It is **not a component** in the architectural sense — the production architectural counterparty for both download and upload is the real `satellite-provider`. The fixture is retired the moment the real ingest endpoint ships. (Download + route-seed integration tests on the Jetson harness already run against the real service as of cycle 3.)
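
The append-only, idempotent write rule can be sketched with an in-memory store keyed by the dedup tuple. The class and method names are hypothetical and the field types are guesses; only the key fields, the mandatory quality metadata, and the `pending` status come from the bullets above.

```python
from typing import Any, Dict, NamedTuple


class TileKey(NamedTuple):
    """Dedup key from the upload contract; field types are assumptions."""
    zoom_level: int
    lat: float
    lon: float
    capture_timestamp: str
    companion_id: str
    flight_id: str


class AppendOnlyTileLog:
    """Illustrative in-memory stand-in for an append-only, idempotent tile store."""

    def __init__(self) -> None:
        self._records: Dict[TileKey, Dict[str, Any]] = {}

    def append(self, key: TileKey, payload: bytes, quality: Dict[str, Any]) -> bool:
        """Store a tile once; a repeat of the same key is a no-op.

        Returns True if the tile was newly stored, False if it was a duplicate.
        """
        if not quality:
            raise ValueError("quality metadata is mandatory on every uploaded tile")
        if key in self._records:
            return False  # idempotent: never overwrite an existing record
        self._records[key] = {
            "payload": payload,
            "quality": dict(quality),
            "status": "pending",  # onboard tiles never claim 'trusted'
        }
        return True

    def status_of(self, key: TileKey) -> str:
        return self._records[key]["status"]

    def __len__(self) -> int:
        return len(self._records)
```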

 ---

@@ -293,7 +334,7 @@ The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/202
 | Visual blackout failsafe (AC-NEW-8) | Mode transition ≤ 400 ms; covariance grows monotonically; spoofed GPS never re-promoted without 10 s + visual consistency gate | FT-N-04 + NFT-RES-04 | High | `tests/resilience-tests.md` + `tests/blackbox-tests.md` |
 | Cross-FC covariance honesty (AC-NEW-4 cross-FC) | `horiz_accuracy` (m, AP) and `hPosAccuracy` (mm, iNav) carry mathematically equivalent values from the same 2×2 sub-matrix | IT-10 cross-FC | High | `tests/blackbox-tests.md` |
 | MAVLink message-signing posture (AC-4.3 + D-C8-9) | Signing enabled on AP wired channel; per-flight key rotation logged to FDR; iNav documented residual risk | NFT-8 + NFT-SEC-03 | High | `tests/security-tests.md` |
-| Dependency CVE pinning (D-CROSS-CVE-1) | OpenCV ≥ 4.12.0; SBOM clean of unpatched CVEs at audit time; monthly re-scan | NFT-10 SBOM CVE audit | High | `tests/security-tests.md` |
+| Dependency CVE pinning (D-CROSS-CVE-1) | Target: OpenCV ≥ 4.12.0; SBOM clean of unpatched CVEs at audit time; monthly re-scan. **Cycle-1**: relaxed to `>=4.11.0.86,<4.12` per `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md` (gtsam-4.2.1/numpy-1.x ABI block); CVE-2025-53644 to be re-validated against 4.11.0.86 before close. | NFT-10 SBOM CVE audit | High | `tests/security-tests.md` |
 | GCS bandwidth budget (AC-6.1) | 1–2 Hz downsampled summary | FT-P-12 | Medium | `tests/blackbox-tests.md` |
 | Frame-by-frame streaming (AC-4.4) | No batching/delay; estimates emitted per frame | NFT-PERF-02 | High | `tests/performance-tests.md` |
 | Smoothing-loop look-back (AC-4.5, Mode B Fact #107) | FDR contains smoothed past-frame estimates; smoothing horizon converges within X m of ground truth at K = 10–20 keyframes | IT-11 | Medium | `tests/blackbox-tests.md` |
@@ -321,12 +362,12 @@ The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/202
 | Companion ↔ GCS (AP profile) | MAVLink 2.0 signing inherited from the FC channel |
 | Operator workstation ↔ `satellite-provider` (pre-flight) | TLS + service-internal API key (workstation only; never on the airborne companion) |
 | Companion ↔ `satellite-provider` (post-landing upload, **D-PROJ-2 planned**) | Per-flight onboard signing key carried with each uploaded tile; the planned ingest endpoint verifies the key |
-| Operator workstation pre-flight stage | OS-level (operator login + workstation hardening — operator-tooling concern, C12) |
+| Operator workstation pre-flight stage | OS-level (operator login + workstation hardening — operator-orchestrator concern, C12) |

 **Authorization**:

 - **Onboard runtime**: a single principal (the runtime process); no in-process privilege boundaries. The Tile Manager (C11) runs as a different principal on the operator workstation, holding the only credentials that reach `satellite-provider` (TLS API key for download; per-flight onboard signing key for post-landing upload). The airborne image does not contain the C11 binary at all.
- **GCS**: operator commands (`AC-6.2`) are best-effort hints; the operator cannot promote a pose, override covariance, or reach the `satellite-provider` write path. Operator re-loc requests trigger the satellite re-localization flow (F6) but do not bypass any safety gate.
+- **GCS**: operator commands (`AC-6.2`) are best-effort hints; the operator cannot promote a pose, override covariance, or reach the `satellite-provider` write path. Operator re-loc requests (C12 `OperatorReLocService` → `OperatorCommandTransport` over the GCS link) trigger the satellite re-localization flow (F6) but do not bypass any safety gate — the airborne pipeline still validates the hint against the visual/satellite consistency check before promoting any pose.

 **Data protection**:

@@ -371,6 +412,8 @@ The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/202

 **Consequences**: A flight is locked to one VIO; failure of the active strategy = AC-5.2 fallback (FC IMU-only). The comparative study is a per-replay artifact, not a runtime decision.

+**Cycle-1 operational note (2026-05-19, post-Implement)**: AZ-332 (OKVIS2) and AZ-333 (VINS-Mono) shipped as facade-only with `BLOCKED` terminal classification per the implement skill's PASS-with-BLOCKED policy (Tier-2 prerequisites: CI build env + Jetson hardware + DBoW2 vocab artifact for AZ-332; same plus upstream-vendoring decision for AZ-333). The `_STRATEGY_REGISTRY` (see ADR-009 cycle-1 note below) registers all three slots so the seam stays correct, but selecting `okvis2` or `vins_mono` raises `StrategyNotAvailableError` from `vio_factory.py` until the gating `BUILD_*` flag turns on. The cycle-1 production-default selection is **`klt_ransac`** (AZ-334). Follow-ups: **AZ-592** (Tier-2 OKVIS2 wiring) and **AZ-593** (VINS-Mono vendoring + wiring) — both parked in `_docs/02_tasks/backlog/`. Closed Won't-Fix during cycle 1: AZ-589 + AZ-590 (original remediation — they targeted upstream APIs that don't exist in the actually-checked-in OKVIS2 submodule). Full post-mortem in `_docs/03_implementation/implementation_completeness_cycle1_report.md` § "Verdict — Revised 2026-05-16".
+
 ### ADR-002 — Build-time exclusion of unused `Strategy` implementations (D-C1-1-SUB-A = (a))

 **Context**: The architecture deliberately requires multiple interchangeable implementations per component (three `VioStrategy` for C1; multiple `VprStrategy` for C2; two FC adapters for C8). At runtime each binary uses exactly one of them per component. Linking *all* implementations into every binary would inflate binary size on the 8 GB shared Jetson, increase boot/load time inside the AC-NEW-1 ≤ 30 s p95 budget, expand the deployed dependency / attack surface, and create accidental-selection risk (a misconfigured runtime accidentally booting a non-deployment-default strategy). A single binary with all strategies present is also harder to reason about for the IT-12 comparative study, which deliberately wants the *opposite* — every strategy present and replayed against the same footage.
@@ -413,7 +456,9 @@ This decision is made on **technical grounds only**. Component licenses (BSD/Apa

 **Context**: AC-8.4 forbids in-air outbound writes to `satellite-provider` for drone-security reasons. The companion is also read-only against `satellite-provider` while airborne — there is no operational reason to fetch tiles in flight either, since the pre-flight cache is the contract. A software guard checking `flight_state == ON_GROUND` can be bypassed by code injection if the network I/O code path is ever loaded.

-**Decision**: The Tile Manager (C11) is a **separate binary / image** that runs only on the operator's workstation; the airborne companion image does not contain the C11 binary at all — neither the `TileDownloader` (pre-flight) nor the `TileUploader` (post-landing) code paths can be reached from the airborne process. The `flight_state == ON_GROUND` software guard inside the `TileUploader` remains as defense-in-depth for the upload direction. The local mid-flight tile format is byte-identical to `satellite-provider`'s on-disk layout so no transformation is needed at upload time.
+**Decision**: The Tile Manager (C11) is a **separate binary / image** that runs only on the operator's workstation; the airborne companion image does not contain the C11 binary at all — neither the `TileDownloader` (pre-flight) nor the `TileUploader` (post-landing) code paths can be reached from the airborne process. The defense-in-depth software guard is owned by **C12's `PostLandingUploadOrchestrator`**, which reads the `flight_footer` FDR record's `clean_shutdown` field before invoking C11's `TileUploader` (Batch 44 SRP refactor — the gate's single source of truth is the FDR footer C13 writes only on clean shutdown; C11 itself no longer gates). The local mid-flight tile format is byte-identical to `satellite-provider`'s on-disk layout so no transformation is needed at upload time.
+
+**Why the gate moved to C12 (Batch 44)**: An earlier iteration placed the gate inside C11's `TileUploader` (consuming a live `FlightStateSignal` from C8). That duplicated the safety invariant on both sides of the C11/C12 boundary and coupled C11 to C8 just for the post-landing check. The current design (a) consolidates ownership on the operator-side workflow head (C12) — single responsibility per component, single source of truth for "vehicle is fully stopped" (= C13's footer write decision), and (b) collapses an arbitrary 30-second hold-down heuristic to an exact boolean (`clean_shutdown`). The `TileUploader` Protocol contract is frozen at v2.0.0 with the gate parameters removed; AZ-317 is superseded.
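
A minimal sketch of the footer gate follows, under stated assumptions: the `FlightFooter` shape, the helper names, and the numeric exit codes are all hypothetical (the document only says each refusal mode gets a distinct code). Two of the four refusal modes are exercised; `flight_id_not_found` and `fdr_unreadable` would be raised by the footer reader, which is not modelled here.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical exit codes; the document only requires them to be distinct per mode.
EXIT_CODES = {
    "footer_missing": 10,
    "unclean_shutdown": 11,
    "flight_id_not_found": 12,
    "fdr_unreadable": 13,
}


class FlightStateNotConfirmedError(Exception):
    """Illustrative stand-in carrying the refusal mode and remediation text."""

    def __init__(self, mode: str, remediation: str) -> None:
        super().__init__(f"{mode}: {remediation}")
        self.mode = mode
        self.remediation = remediation


@dataclass(frozen=True)
class FlightFooter:
    flight_id: str
    clean_shutdown: bool


def confirm_clean_shutdown(footer: Optional[FlightFooter]) -> FlightFooter:
    """Refuse unless a footer exists and records a clean shutdown."""
    if footer is None:
        raise FlightStateNotConfirmedError(
            "footer_missing", "no flight_footer record found; inspect the FDR"
        )
    if footer.clean_shutdown is not True:
        raise FlightStateNotConfirmedError(
            "unclean_shutdown", "flight did not shut down cleanly; do not upload"
        )
    return footer


def run_upload(footer: Optional[FlightFooter], upload: Callable[[], None]) -> int:
    """Gate first, then invoke the uploader; returns a CLI-style exit code."""
    try:
        confirm_clean_shutdown(footer)
    except FlightStateNotConfirmedError as exc:
        return EXIT_CODES[exc.mode]
    upload()
    return 0
```

Because the uploader is passed in as a plain callable, it needs no knowledge of flight state, which mirrors the point of moving the gate out of C11.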

 **Enforcement gates (per R02 risk register)**:
 1. **CI SBOM diff**: the build pipeline fails the airborne `production-binary` artifact if any symbol from `c11_tilemanager/` (or any module that transitively imports `c11_tilemanager`) appears in the linked image. This is an extension of the per-implementation SBOM enforcement already in ADR-002.
@@ -424,7 +469,7 @@ This decision is made on **technical grounds only**. Component licenses (BSD/Apa
 1. Single binary with software-only guard — rejected on principle: a runtime guard cannot be the primary control for an "is the system airborne?" safety property.
 2. Hardware-level switch (e.g., physical write-enable jumper) — rejected: adds operations cost; software-image-isolation gives equivalent assurance for this threat model.

-**Consequences**: Two binaries to maintain (companion image + operator-tooling image). CI builds and tests both. The operator workflow has an explicit post-landing step ("run the upload tool") which is itself a feature, not a bug.
+**Consequences**: Two binaries to maintain (companion image + operator-orchestrator image). CI builds and tests both. The operator workflow has an explicit post-landing step ("run the upload tool") which is itself a feature, not a bug.

 ### ADR-005 — Two execution tiers (Tier-1 / Tier-2) are first-class architectural concerns (F6)

@@ -462,7 +507,7 @@ This decision is made on **technical grounds only**. Component licenses (BSD/Apa
 1. Keep ADR-007 as originally written — rejected: see "Why reversed".
 2. Wait for D-PROJ-2 service-side implementation before any tests — rejected: blocks the onboard cycle.

-**Consequences**: The mock continues to ship in the operator-tooling tarball's compose file as a test-time service, but it is no longer documented under `_docs/02_document/components/`. Test specs and CI references treat it as a fixture. When `satellite-provider` ships the real endpoint, the fixture is replaced by pointing tests at the real service; no architectural changes flow from that switch.
+**Consequences**: The mock continues to ship in the operator-orchestrator tarball's compose file as a test-time service, but it is no longer documented under `_docs/02_document/components/`. Test specs and CI references treat it as a fixture. When `satellite-provider` ships the real endpoint, the fixture is replaced by pointing tests at the real service; no architectural changes flow from that switch.

 ### ADR-008 — D-C8-2 source-set switch is `Selected with runtime gate` (Mode B Fact #111)

@@ -610,6 +655,32 @@ This decision is made on **technical grounds only**. Component licenses (BSD/Apa
 - Per-component folders give each implementation a natural home for its own `tests/`, fixtures, and adapter-specific helpers — matching coderule.mdc's "logic specific to a platform, variant, or environment belongs in the class that owns that variant".
 - Adding a new C2 VPR backbone (e.g., a future foundation-model retrieval backbone via D-C2-12) is a folder-add + interface-conformance change; no other component is touched.

+#### Cross-Component Contract Surface (AZ-507)
+
+The ADR-009 "interface, not concrete" rule has an architectural sibling: cross-component imports go through `_types/*.py` (DTOs + typed-error envelopes such as `_types.inference_errors`), never through `components.X (Public API)`. The only exception is `runtime_root/*` (the composition root), which is allowed to import concrete strategies across components precisely because it is the single place that resolves Protocol parameters to concrete classes. Every other module under `components/**/*.py` consumes cross-component contracts via (a) shared DTOs in `_types/*`, and (b) consumer-side structural `Protocol` cuts defined locally inside the consuming component (e.g. `c10_provisioning.engine_compiler.CompileEngineCallable` for the narrow `compile_engine` surface of the C7 InferenceRuntime). This is the same architectural property as constructor-injection-against-interface, applied to the import graph rather than the call graph. The AZ-270 `test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies` lint enforces this on every `components/**/*.py`; AZ-507 reconciles `module-layout.md` with the lint so the documentation and the build gate agree.
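
A consumer-side structural `Protocol` cut can be sketched as follows. The signature given to `CompileEngineCallable` here (a model path and a precision string returning bytes) is an assumption for illustration, as are `provision_engine` and `FakeInferenceRuntime`; the point is only that the consumer declares the narrow callable it needs and never imports the provider's concrete class.

```python
from typing import Protocol


class CompileEngineCallable(Protocol):
    """Narrow cut declared inside the consuming component (signature assumed)."""

    def __call__(self, model_path: str, precision: str) -> bytes: ...


def provision_engine(compile_engine: CompileEngineCallable, model_path: str) -> bytes:
    """Consumer code depends on the Protocol only, not on any C7 class."""
    return compile_engine(model_path, "int8")


class FakeInferenceRuntime:
    """Stand-in provider; it satisfies the cut structurally, with no inheritance."""

    def compile_engine(self, model_path: str, precision: str) -> bytes:
        return f"{precision}:{model_path}".encode()
```

A composition root would pass `FakeInferenceRuntime().compile_engine` (or the real runtime's bound method) into `provision_engine`, so the import graph of the consumer stays free of cross-component concrete types.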
+
+#### Cycle-1 implementation: `_STRATEGY_REGISTRY` + `pre_constructed` (AZ-591, AZ-618)
+
+Two cross-cutting Tier-1 mechanisms shipped inside `runtime_root/` during cycle 1 that the Plan-era ADR-009 sketch did not anticipate. Both are operational prerequisites for `compose_root()` reaching takeoff and are extensions of — not deviations from — the constructor-injection-against-interface rule above.
+
+1. **`_STRATEGY_REGISTRY` + `register_strategy(...)` API (AZ-591).** A module-level `dict[(component_slug, strategy_name) → _Registration]` populated per-binary. The airborne entrypoint calls `runtime_root.airborne_bootstrap.register_airborne_strategies()` once at process start, which fills 7 strategy-selecting airborne component slots (`c1_vio`, `c2_vpr`, `c2_5_rerank`, `c3_matcher`, `c3_5_adhop`, `c4_pose`, `c5_state`) with `tier="airborne"`. Without this, `compose_root()` raises `StrategyNotLinkedError` on the first config-driven strategy lookup. The registry is the **runtime-side complement to ADR-002 build-time exclusion**: the build chooses which strategies are even available to register; the registry chooses which one this binary serves; the config chooses which registered slot to wire. A misconfigured runtime asking for an unlinked strategy still fails fast: `StrategyNotLinkedError` carries the offending strategy name, the component slug, and the actually-linked alternatives, so the operator gets a clear next step. The `register_strategy` call site is restricted by lint (AZ-270): only the composition root or a binary-specific bootstrap module may call it; calls from component modules are an architecture violation.
+
+2. **`pre_constructed` kwarg + `build_pre_constructed(config)` (AZ-618 umbrella → subtasks AZ-619..AZ-624).** `compose_root(config, *, pre_constructed=...)` now accepts a dict of pre-built infrastructure objects keyed by documented strategy slug, consumed by the airborne wrapper factories registered in step 1. The airborne entrypoint builds these via `airborne_bootstrap.build_pre_constructed(config)` in 6 dependency-ordered phases:
+
+   | Phase | Slugs seeded | Notes |
+   |-------|--------------|-------|
+   | A (AZ-619) | `c13_fdr`, `clock` | `c13_fdr` is per-producer-cached; `clock` is fresh `WallClock` |
+   | B (AZ-620) | `c6_descriptor_index`, `c6_tile_store` | gated on `BUILD_FAISS_INDEX` per consumer |
+   | C (AZ-621) | `c7_inference` | gated on `BUILD_TENSORRT_RUNTIME` / `BUILD_PYTORCH_FP16_RUNTIME` |
+   | D (AZ-622) | `c3_lightglue_runtime`, `c3_feature_extractor` | LightGlue runtime reuses Phase C `c7_inference` engine (no double build); gated on `C3_MATCHER_BUILD_FLAGS[strategy]` |
+   | E (AZ-623) | `c282_ransac_filter`, `c5_imu_preintegrator`, `c5_se3_utils`, `c5_wgs_converter` | IMU preintegrator cached at module level keyed by camera-calibration path |
+   | E.5 (AZ-625) | `c5_isam2_graph_handle` (+ internal `_c5_prebuilt_estimator`) | eager `(StateEstimator, ISam2GraphHandle)` build so C4 receives the handle (C4 runs before C5 in topo order) and the C5 wrapper short-circuits without re-invoking the factory; gated on `C5_STATE_BUILD_FLAGS[strategy]` |
+   | F (AZ-624) | (no slot keys; wires `runtime_root.main()` and verifies AC-1..AC-5 end-to-end) | terminal phase |
+
+   The expected per-component dependency keys are documented in `airborne_bootstrap.AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`. Missing keys raise `AirborneBootstrapError` with the missing-key name + the consuming component slug + the relevant gating `BUILD_*` flag, so the operator-facing error names exactly which build flag or which input is wrong. Tests stub by passing the same `pre_constructed=...` kwarg with mock objects; the bootstrap's caching makes two calls within a process return the same `c13_fdr` object (AC-619.2) without changing the contract. In replay mode (ADR-011), `compose_root` merges replay-built `frame_source` / `fc_adapter` / `clock` / `mavlink_transport` / `replay_sink` over `pre_constructed` so the replay branch's `TlogDerivedClock` correctly overrides the bootstrap's `WallClock`. AZ-687 added a guard for the minimal replay `Config` that omits strategy-component blocks — the bootstrap skips the `_build_c6_*` / `_build_c7_*` / `_build_c5_*` seeds when their component block is absent, since the corresponding wrappers do not run.
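
The two `pre_constructed` behaviours described in this item, fail-fast on missing keys and replay-built objects overriding bootstrap-built ones, can be sketched as below. The helper names, the shape of the required-keys mapping, and the error message are assumptions; the real `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS` structure is not shown in this document.

```python
from typing import Any, Dict, Mapping, Sequence


class AirborneBootstrapError(Exception):
    """Illustrative stand-in for the bootstrap's error type."""


def check_pre_constructed(
    pre_constructed: Mapping[str, Any],
    required: Mapping[str, Sequence[str]],
) -> None:
    """Raise if any consuming component is missing one of its expected keys.

    `required` maps a component slug to the slugs it consumes (assumed shape).
    """
    for component, keys in required.items():
        missing = [key for key in keys if key not in pre_constructed]
        if missing:
            raise AirborneBootstrapError(
                f"{component} is missing pre-constructed keys: {missing}"
            )


def merge_replay_overrides(
    pre_constructed: Mapping[str, Any],
    replay_built: Mapping[str, Any],
) -> Dict[str, Any]:
    """Replay-built objects win over bootstrap-built ones with the same slug."""
    return {**pre_constructed, **replay_built}
```

The merge order is the whole point: placing `replay_built` last is what lets a replay clock replace the bootstrap's wall clock without the bootstrap knowing about replay.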
+
+Both additions sit inside `runtime_root/`; no component crosses the AZ-507 import boundary. They preserve every ADR-009 invariant — interface-first components, constructor-injected dependencies, single composition root, build-time-exclusion-as-architectural-property — and add the runtime mechanics needed to make a 12+ infrastructure-dependency graph wirable without losing fail-fast behaviour. `module-layout.md` § shared/runtime_root carries the file-level ownership; this section is the architectural rationale.
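
For orientation, a registry of the kind described for AZ-591 could be as small as the sketch below. The `_Registration` fields, the `resolve_strategy` helper, and the error message format are assumptions; only the key shape, the `register_strategy` name, the `tier` tag, and the fail-fast error carrying the linked alternatives come from the text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


class StrategyNotLinkedError(LookupError):
    """Illustrative stand-in for the runtime's fail-fast lookup error."""


@dataclass(frozen=True)
class _Registration:
    factory: Callable[..., Any]
    tier: str


_STRATEGY_REGISTRY: Dict[Tuple[str, str], _Registration] = {}


def register_strategy(
    component_slug: str,
    strategy_name: str,
    factory: Callable[..., Any],
    *,
    tier: str,
) -> None:
    """Record which factory serves (component, strategy) in this binary."""
    _STRATEGY_REGISTRY[(component_slug, strategy_name)] = _Registration(factory, tier)


def resolve_strategy(component_slug: str, strategy_name: str) -> _Registration:
    """Look up a registration, naming the linked alternatives on failure."""
    try:
        return _STRATEGY_REGISTRY[(component_slug, strategy_name)]
    except KeyError:
        linked = sorted(
            name for (slug, name) in _STRATEGY_REGISTRY if slug == component_slug
        )
        raise StrategyNotLinkedError(
            f"strategy {strategy_name!r} is not linked for {component_slug!r}; "
            f"linked alternatives: {linked}"
        ) from None
```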
+
 ### ADR-010 — Operator-planned mission is the cold-start trust anchor; FC GPS is secondary

 **Context**: The original cold-start design (AZ-419 / FT-P-11) assumed the FC EKF's last valid GPS fix is available at takeoff to seed C5. Field reality contradicts this: a UAV operating in a contested-EW environment may have GPS jammed **before** takeoff (the jamming radius reaches the launch site, the unit launches under a jammer's umbrella, etc.). In that case the FC EKF has no GPS fix to give, and the companion has nothing to anchor the initial pose to — the entire downstream pipeline (VIO bootstrap, VPR retrieval scope, satellite anchoring) collapses or runs blind. At the same time, the parent suite already requires the operator to author a route in the **Mission Planner UI** (`suite/ui`) and persist it to the **`flights` REST service** (`suite/flights`) before any flight runs. The waypoint ordering is operationally meaningful: waypoint[0] is the planned takeoff point. The operator therefore already declares the takeoff position with operationally relevant accuracy (typically a few tens of metres) hours before launch, in a context that has no dependency on GPS at all. This information is the natural cold-start trust anchor.
@@ -638,3 +709,110 @@ This decision is made on **technical grounds only**. Component licenses (BSD/Apa
 - C12 gains the `FlightsApiClient` boundary + offline `--flight-file` path (AZ-489).
 - Principle #11 (the spoofed-GPS gate) is extended with the bounded-delta clause; the gate now serves both takeoff and mid-flight.
 - The companion binary's network surface is unchanged — only C12 (operator-side, separate binary) talks to the flights service.
+
+### ADR-011 — Replay is a configuration of the airborne binary, not a separate image (REVERSES the v1.0.0 four-binary design)
+
+**Context**: The original Decompose Step 2 design for epic AZ-265 (E-DEMO-REPLAY) treated replay as a **fourth Docker image** (`gps-denied-replay-cli`) built from the same source tree with a different `BUILD_*` flag combination — specifically `BUILD_C6=OFF`, `BUILD_C10=OFF`, `BUILD_C11=OFF`, `BUILD_C12=OFF`, plus the new replay-only build flags ON. The justification was the same as ADR-002 for the live/research/operator split: minimize binary size, attack surface, and accidental-selection risk. An SBOM-diff CI step was specified (AZ-403) to enforce the exclusion of the four "off" components from the replay binary.
+
+Two facts surfaced during the Step 7 (Implement) batch loop that contradicted this design:
+
+1. **The C2 (VPR) → C6 dependency cannot be honestly removed.** C2 retrieves candidate tiles by querying the C6 `DescriptorIndex` (FAISS HNSW over pre-built per-tile descriptors). With C6 absent the index has no host, and C2's `VprStrategy.lookup(c1)` either returns empty (replay produces no positioning fixes, defeating epic AC-3 of ≤ 100 m for ≥ 80 % of ticks) or has to be backed by a parallel "lite" index variant (which is not the production code path and therefore destroys the epic's premise that demo confidence equals field-test confidence on the same footage). Either way the v1.0.0 design's `BUILD_C6=OFF` flag for replay conflicts with the v1.0.0 epic AC-3.
+2. **The user requirement is the opposite of binary isolation.** Replay's purpose is "demo confidence equals field-test confidence on the same footage" — i.e., the demo and the real flight should run **exactly** the same code path. Reducing the binary's component set (even one with a sound technical justification like ADR-002) actively works against that purpose: any divergence between the replay image and the airborne image becomes a potential source of demo↔field drift that no SBOM diff can detect once the two binaries' source trees evolve independently.
+
+**Decision**:
+
+1. **Replay is a configuration of the airborne binary.** The airborne Docker image is the replay image. No fourth Docker image, no SBOM-diff CI step, no `BUILD_C6=OFF` for replay. The operator runs the same image with the same `gps-denied-onboard` entry point (or its sibling `gps-denied-replay` console-script wrapper) — only the config differs.
+2. **The mode-aware decision is `config.mode = "live" | "replay"` resolved once at startup in `compose_root`.** The composition root branch (the single point of mode awareness in the codebase) swaps three strategies and adds one observer:
+   - `FrameSource`: `LiveCameraFrameSource` ↔ `VideoFileFrameSource`.
+   - `FcAdapter`: `PymavlinkArdupilotAdapter` / `Msp2InavAdapter` ↔ `TlogReplayFcAdapter`.
+   - `MavlinkTransport`: `SerialMavlinkTransport` ↔ `NoopMavlinkTransport` (the outbound bytes go nowhere in replay; the C8 encoder code path is unchanged — see Invariant 5 of the replay protocol).
+   - **Adds** `JsonlReplaySink` as an additional listener on C5's `EstimatorOutput` stream (replay-only; the UI consumes the JSONL file). The live binary's downstream sinks (C8 outbound to FC, QGC telemetry adapter, C13 FDR) are unchanged.
+3. **A new `replay_input/` Layer-4 cross-cutting module owns `(video, tlog)` → `(FrameSource, FcAdapter, Clock)` convergence.** It instantiates the replay strategies, applies the time-offset (manual or auto via AZ-405), and hands the composition root a `ReplayInputBundle`. The composition root sees no `if mode == "replay"` plumbing — it sees standard `FrameSource` + `FcAdapter` + `Clock` instances. This is the architectural mechanism that delivers Principle #13's interface-first promise for the replay-vs-live boundary.
+4. **Operator pre-flight workflow is identical between replay and live.** The operator plans a route in the parent-suite Mission Planner UI (`suite/ui`); the route persists in the `flights` REST service; C12 reads the `Flight`, derives the bbox + takeoff origin, calls C11 `TileDownloader` against `satellite-provider`, builds the C10 cache (descriptor index + engines + manifest). The only step that differs is "go fly" → "run `gps-denied-replay` against video + tlog". The companion image consumes the cache identically in both modes (Invariant 12 of the replay protocol).
+5. **MAVLink emit destinations in replay are no-op sinks for non-UI consumers.** The C8 outbound encoders (`GPS_INPUT`, GCS `STATUSTEXT`, `NAMED_VALUE_FLOAT`, `MAV_CMD_SET_EKF_SOURCE_SET`) run unchanged; their byte streams hit `NoopMavlinkTransport` and disappear. The user-confirmed design intent: the **only** position output the UI cares about in replay is the per-tick C5 `EstimatorOutput`, which is captured by `JsonlReplaySink` and tailed by the parent-suite UI. MAVLink signing key is mandatory in both modes (Invariant 11 of the replay protocol — the operator supplies a dummy key file for replay; the signing handshake runs and its bytes are dropped by the noop transport).
+6. **Three binaries, not four.** The active build matrix returns to the ADR-002 cadence: **airborne** (Tier-1 + Tier-2 production; live + replay both run from this image), **research** (IT-12 comparative-study, mirrors airborne plus the additional VioStrategy / VprStrategy variants), **operator-orchestrator** (pre-flight workflows on operator workstation). The replay-cli column is removed from `module-layout.md`'s Build-Time Exclusion Map; the replay-only `BUILD_*` flags (`BUILD_VIDEO_FILE_FRAME_SOURCE`, `BUILD_TLOG_REPLAY_ADAPTER`, `BUILD_REPLAY_SINK_JSONL`) are ON in airborne and research, OFF in operator-orchestrator.
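
The strategy swap in Decision 2 can be sketched as follows. This is a minimal illustration, not the shipped `compose_root`: the `Config` dataclass, `select_transport`, and `emit_gps_input` are hypothetical names, and the "serial" transport here only records bytes so the sketch runs without hardware. It shows the mode decision made once, with the encoder-side caller seeing only the `MavlinkTransport` Protocol.

```python
from dataclasses import dataclass
from typing import Literal, Protocol


class MavlinkTransport(Protocol):
    def send(self, payload: bytes) -> None: ...


class SerialMavlinkTransport:
    """Stand-in for the live transport; records bytes instead of opening a port."""

    def __init__(self) -> None:
        self.sent: list[bytes] = []

    def send(self, payload: bytes) -> None:
        self.sent.append(payload)


class NoopMavlinkTransport:
    """Replay transport: accepts encoder output and drops it."""

    def send(self, payload: bytes) -> None:
        return None


@dataclass(frozen=True)
class Config:
    # Hypothetical config shape; only the `mode` field matters for this sketch.
    mode: Literal["live", "replay"]


def select_transport(config: Config) -> MavlinkTransport:
    # Hypothetical helper: the single place that inspects config.mode.
    if config.mode == "replay":
        return NoopMavlinkTransport()
    if config.mode == "live":
        return SerialMavlinkTransport()
    raise ValueError(f"unknown mode: {config.mode!r}")


def emit_gps_input(transport: MavlinkTransport, payload: bytes) -> int:
    # Encoder-side code is the same in both modes; it only sees the Protocol.
    transport.send(payload)
    return len(payload)
```

Because the caller never branches on the mode, swapping the transport leaves the encoder code path untouched, which is the property Invariant 5 relies on.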
+
+**Alternatives considered**:
+
+1. **Keep the fourth `gps-denied-replay-cli` binary with `BUILD_C6=OFF`** (status quo of v1.0.0) — rejected for the two reasons in the Context section: the C2→C6 dependency makes `BUILD_C6=OFF` incompatible with epic AC-3, and the very purpose of replay (demo↔field fidelity) is undermined by any source-tree divergence the SBOM-diff step cannot detect.
+2. **Keep the fourth binary but with `BUILD_C6=ON`** — rejected: with C6 restored, the replay component set equals the airborne component set (the airborne binary already excludes C10/C11/C12 per ADR-002 / ADR-004). The fourth binary would be byte-identical to the airborne image; maintaining it as a separate CI artifact adds work for zero gain.
+3. **Make replay an HTTP service rather than a CLI** — rejected as out of scope for this ADR: the parent-suite UI subprocess + JSONL tail design predates this decision. The replay CLI / live entry-point split is a CLI shape concern, not an architectural concern; the airborne binary remains a long-lived process with no HTTP listener.
+4. **Move the JSONL sink to a different output (e.g., piped into stdout, or a unix socket)** — deferred. The current `results.jsonl` file output is the simplest UI-tailable contract and matches the parent-suite UI's subprocess assumption. If the UI later needs streaming-without-disk, the sink Protocol allows a `StdoutReplaySink` or `UnixSocketReplaySink` strategy without any change to the composition root.
+
+**Consequences**:
+
+- `_docs/02_document/contracts/replay/replay_protocol.md` is at **v2.0.0** (replaces v1.0.0). New invariants 5, 11, 12 codify the encoder-mode-agnosticism, the signing-key mandate, and the real-C6-cache-in-replay properties.
+- `module-layout.md` Build-Time Exclusion Map drops the `Replay-cli` column; airborne column gains `BUILD_VIDEO_FILE_FRAME_SOURCE=ON`, `BUILD_TLOG_REPLAY_ADAPTER=ON`, `BUILD_REPLAY_SINK_JSONL=ON`. The narrative reduces "Four binaries…" to "Three binaries…".
+- `module-layout.md` Cross-Cutting section gains a `replay_input/` entry (Layer-4 coordinator, owned by AZ-405).
+- AZ-403 (replay-cli Dockerfile + SBOM diff CI step) is **cancelled**; its task file moves to `done/` with a cancellation banner pointing at this ADR. Its dependency edges (incoming from AZ-404, outgoing to nothing) are removed from `_docs/02_tasks/_dependencies_table.md`. The Jira ticket transition to "Cancelled" is recorded in `_docs/_process_leftovers/` if the tracker MCP is unavailable at execution time.
+- AZ-401 shrinks: it no longer authors a separate `compose_replay` function; it extends `compose_root` with the `config.mode == "replay"` branch and wires `JsonlReplaySink` + `NoopMavlinkTransport`. Complexity drops from 3 to 2 points.
+- AZ-402 shrinks: it is a thin mode-config wrapper that dispatches into the live entry point, not a standalone CLI.
+- AZ-405 grows slightly: it now also owns the `replay_input/` coordinator (the natural home for the auto-sync logic + the time-offset application).
+- AZ-404 (E2E replay test) is unchanged in scope but reworded: it asserts mode-agnosticism (Invariant 1) and runs against the unified airborne image — no fourth-image entrypoint to verify.
+- C8 gains a thin `MavlinkTransport` Protocol seam introduced by AZ-400: `SerialMavlinkTransport` (live) and `NoopMavlinkTransport` (replay) implement it. This is a no-op restructure of the existing C8 transport code; the encoders are unchanged. The Protocol seam is the architectural mechanism for Invariant 5 (encoders are byte-identical).
+- Demo↔field fidelity is now structurally guaranteed: the same binary runs in both contexts; any drift between them is a behavioural-test failure, not an SBOM-diff failure.
+
+### ADR-012 — Open-loop ESKF composition profile via `c4_pose.enabled = false` (AZ-776)
+
+**Context**: ADR-009 wires the C4 pose estimator and the C5 state estimator through a shared GTSAM iSAM2 substrate — C4 adds its PnP factor directly to C5's iSAM2 graph (ADR-003). The `c4_pose` slot in `runtime_root/airborne_bootstrap.py` lists `c5_isam2_graph_handle` as a required `pre_constructed` key (AZ-625), and the `OpenCVGtsamPoseEstimator` constructor consumes that handle. This wiring was sound for the steady-state GTSAM-iSAM2 build of C5.
+
+When C5 ships a second strategy — `eskf` (ESKF baseline, AZ-588) — the substrate is **not** an iSAM2 graph: ESKF propagates its IMU-driven covariance forward in closed form, with no factor graph behind it. Its `create()` factory returns `(estimator, None)`, where the second tuple element (the iSAM2 handle slot) is `None`. Two facts surfaced from this:
+
+1. **`c4_pose` cannot be the gate.** C4 owns satellite-anchored pose estimation. ESKF runs satellite-free open-loop. Forcing `c4_pose` into the composition when no satellite anchoring is wired means C4 either crashes at construction (no iSAM2 handle) or, worse, gets a fake handle that pretends to anchor poses that nothing produces — a silent passthrough that violates the "Real Results, Not Simulated Ones" meta-rule.
+2. **The replay Tier-2 smoke profile needs an honest minimum.** The AZ-265 replay path's mandatory simple baseline is KLT/RANSAC VIO + ESKF state estimator without any satellite re-anchoring (AZ-777 will add the satellite path on top via the Derkachi C6 reference tile cache). Without an explicit composition profile that excludes C4, every Tier-2 test that wants to exercise the simple baseline either crashes at compose time or has to monkey-patch the registry — both are anti-patterns for an architectural seam.
+
+**Decision**:
+
+1. **`C4PoseConfig.enabled: bool = True` is the user-facing switch for the open-loop ESKF profile.** Default ON preserves the ADR-003 steady-state airborne path. Setting `enabled=False` instructs `compose_root` to remove `c4_pose` from the selection map before topological ordering — the wrapper never runs, the consumer never sees a handle, and the wiring stays honest.
+2. **`compose_root` enforces the C4↔C5 pairing matrix at compose time.** The validation gate lives in `_validate_c4_c5_composition_profile` (called from `compose_root` before `_compose`) and rejects the two off-diagonal cells of the 2×2 (`c4_pose.enabled`, `c5_state.strategy`) matrix with a `CompositionError` naming both blocks. The two valid combinations are:
+   - `c4_pose.enabled=True` + `c5_state.strategy="gtsam_isam2"` — the ADR-003 / ADR-009 steady-state airborne path.
+   - `c4_pose.enabled=False` + `c5_state.strategy="eskf"` — the open-loop ESKF profile (Tier-2 smoke baseline; satellite anchoring deferred to AZ-777).
+   The two **invalid** combinations are rejected with explicit error text:
+   - `enabled=False` + `gtsam_isam2` (an iSAM2 graph with no PnP anchors converges to drift-prone visual-only odometry; the production deployment intent is that gtsam_isam2 always coexists with C4).
+   - `enabled=True` + `eskf` (ESKF has no graph for C4 to anchor against; this is the AZ-776 root-cause pairing the user reported).
+3. **`build_pre_constructed` honours `c4_pose.enabled`.** When disabled, `c5_isam2_graph_handle` is **omitted** from the `pre_constructed` dict — the handle is a C4 consumer requirement, and removing C4 from the selection map removes the requirement. The ESKF estimator itself is still built and cached in the internal `_c5_prebuilt_estimator` slot (so the C5 wrapper short-circuits onto the prebuilt instance), but the iSAM2-shaped seam disappears from the cross-component contract.
+4. **Component selection is the only thing that changes.** The composition root's existing `_compose` mechanics — topological ordering, lazy strategy resolution, build-flag gating — are unchanged. The new `skip_slugs` parameter (a `frozenset[str]`) is the minimal seam that lets `compose_root` instruct `_compose` to drop the disabled component(s); there is no second composition path, no `compose_eskf` function, no mode-aware branch outside the validation gate.
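
The pairing gate in Decision 2 and the `skip_slugs` seam in Decision 4 can be sketched as below. This is an illustration under assumptions: `validate_c4_c5_profile` and `skip_slugs_for` are hypothetical names (the real gate is `_validate_c4_c5_composition_profile`), the error text is invented, and the sketch also rejects any strategy name outside the two the ADR lists.

```python
class CompositionError(Exception):
    """Raised when the requested component profile is inconsistent."""


# The two valid (c4_pose.enabled, c5_state.strategy) pairings named in the ADR.
_VALID_PROFILES = frozenset({(True, "gtsam_isam2"), (False, "eskf")})


def validate_c4_c5_profile(c4_enabled: bool, c5_strategy: str) -> None:
    # Accept only the two documented pairings; everything else fails at
    # compose time with an error that names both config blocks.
    if (c4_enabled, c5_strategy) in _VALID_PROFILES:
        return
    raise CompositionError(
        f"c4_pose.enabled={c4_enabled} is incompatible with "
        f"c5_state.strategy={c5_strategy!r}"
    )


def skip_slugs_for(c4_enabled: bool) -> frozenset[str]:
    # Components to drop from the selection map before topological ordering.
    return frozenset() if c4_enabled else frozenset({"c4_pose"})
```

Keeping the valid pairings in one table means a future strategy has to be added there explicitly, which matches the ADR's preference for an explicit flag over a derived one.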
+
+**Alternatives considered**:
+
+1. **Make `c4_pose` a "soft" dependency of C5 (introspect the strategy at C5 construction time, skip C4 wiring only when `strategy == "eskf"`).** Rejected: this leaks C5-strategy specifics into C4's interface (`PoseEstimator` would have to grow a "you may not be wired" affordance), violates ADR-009 interface-first, and re-introduces the very mode-aware branches Invariant 1 of the replay protocol forbids.
+2. **Make `compose_root` derive `c4_pose.enabled` automatically from `c5_state.strategy` (no user-facing flag).** Rejected: the C4↔C5 coupling is a deliberate design pairing, not a mechanical derivation. Future research strategies (e.g. a non-iSAM2 GTSAM variant, or a satellite-anchored ESKF) may want different combinations; the explicit flag keeps the configuration honest and audit-able.
+3. **Keep the wiring as-is and rely on the registry mechanism to skip C4.** Rejected: `C4PoseConfig` registers itself with the global config registry at module import (via `register_component_block` in `components/c4_pose/__init__.py`), which means even an empty `c4_pose:` block in YAML instantiates the block with defaults and pulls C4 into the selection map. The flag is the only honest opt-out without removing the registration call (which would break the steady-state path).
+4. **Build a synthetic `NullIsam2GraphHandle` that satisfies the Protocol but no-ops on update.** Rejected as the textbook example of the "Real Results, Not Simulated Ones" anti-pattern: it would let C4 run on top of ESKF with no anchoring, producing pose estimates that look real but have no factor-graph grounding. The composition-time gate is the honest answer.
+
+**Consequences**:
+
+- `tests/e2e/replay/conftest.py` writes `c4_pose: { enabled: false }` into the Tier-2 replay `config.yaml`, alongside the existing `c1_vio: klt_ransac` + `c5_state: eskf` block. This is the open-loop profile the replay binary uses for the AZ-265 / AZ-776 simple-baseline tests.
+- `tests/e2e/replay/test_derkachi_1min.py` un-xfails AC-1 (clean exit + per-frame JSONL), AC-2 (schema), AC-5 (determinism), AC-6 realtime, and AC-6 ASAP — these tests only required compose-time success to pass and AZ-776 lands that. AC-3 (≤ 100 m for ≥ 80 % of ticks) **remains** xfailed for AZ-777: ESKF integrates open-loop and drifts unbounded without C2/C3/C4 satellite re-anchoring; the ≤ 100 m threshold cannot be met by physics until the Derkachi C6 reference tile cache lands.
+- `_docs/02_document/contracts/replay/replay_protocol.md` gains a new "Open-loop ESKF composition profile" sub-section in **Composition root extension** plus a new **Invariant 13** ("C4↔C5 pairing matrix is enforced at compose time") that the AZ-776 unit tests own.
+- `_docs/02_document/components/06_c4_pose/description.md` gains an "Enabled flag" sub-section that points at this ADR; the rest of the component contract is unchanged.
+- The unit-test surface at `tests/unit/runtime_root/test_az776_open_loop_eskf_composition.py` owns the seven invariants AZ-776 introduces: `C4PoseConfig.enabled` default-true, AC-1 (open-loop ESKF composes without C4), AC-2 (default GTSAM profile still includes C4), AC-3a + AC-3b (the two forbidden pairings raise `CompositionError`), and the two `pre_constructed` behaviours (`c5_isam2_graph_handle` omitted when C4 disabled, present when C4 enabled). The full suite passes in ~4 s.
+- The composition root's contract surface in `runtime_root/__init__.py` gains one public helper (`CompositionError` was already public; the new `skip_slugs` parameter to `_compose` is module-private). No public CLI flag is added — operators set `c4_pose.enabled = false` in YAML.
+
+### ADR-013 — gRPC server-streaming tile provision for operator pre-flight (AZ-976)
+
+**Context**: Operator-side cache build (C11/C12 ↔ `satellite-provider`) is off the hot airborne path but dominates time-to-ready when a corridor has thousands of tiles. The current REST shape (`POST /route` + poll + planned `POST /inventory` + N× `GET /tiles/{z}/{x}/{y}`) multiplies round-trips and cannot overlap "tiles already on SP disk" with "tiles still downloading from Google Maps". The inventory POST was specified in AZ-777 but never shipped in satellite-provider, so Jetson smoke tests currently get HTTP 404 from it. Both codebases are owned by the same team (.NET satellite-provider, Python gps-denied operator tooling), so a typed streaming contract is feasible without a browser client.
+
+**Decision**:
+
+1. **We will add `satellite.v1.RouteTileDelivery.DeliverRouteTiles`** — unary request (`RouteSpec` + `client_tiles`), server-streaming `RouteTileEvent` (manifest → batches → progress → complete | error) — as the primary operator-side pre-flight transport (Epic AZ-976). Proto: `tile_provision.proto`; human contract: `tile_provision_grpc.md`.
+2. **The request carries `RouteSpec.route_id` (idempotent UUID) plus `ClientTileRecord[]`.** satellite-provider omits tiles when the client catalog already has equal-or-better resolution and equal-or-newer `captured_at` (lower m/px = better).
+3. **First stream event is `RouteManifest`** (`total_candidates`, `skipped_by_client`, `to_deliver`); then `TileBatch` messages with inline JPEGs. Server sends on-disk hits before externally fetched tiles (wire-agnostic ordering; `TilePayload.route_priority` hints along-route order).
+4. **ADR-004 boundary is preserved**: only C11/C12 on the operator workstation import gRPC stubs.
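
The skip rule in Decision 2 (omit a tile when the client already holds equal-or-better resolution and equal-or-newer imagery) can be sketched as a plain predicate. The `TileRecord` dataclass and its field names are assumptions made for illustration; they are not taken from `tile_provision.proto`.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TileRecord:
    # Hypothetical mirror of a ClientTileRecord; field names are assumptions.
    z: int
    x: int
    y: int
    metres_per_pixel: float
    captured_at: datetime


def client_copy_is_sufficient(client: TileRecord, server: TileRecord) -> bool:
    same_tile = (client.z, client.x, client.y) == (server.z, server.x, server.y)
    # Lower m/px means better resolution.
    equal_or_better_resolution = client.metres_per_pixel <= server.metres_per_pixel
    equal_or_newer = client.captured_at >= server.captured_at
    return same_tile and equal_or_better_resolution and equal_or_newer


def tiles_to_deliver(candidates, client_tiles):
    # Index the client catalog by tile coordinate, then keep every candidate
    # the client lacks or holds only in a worse or older version.
    held_by_key = {(t.z, t.x, t.y): t for t in client_tiles}
    out = []
    for cand in candidates:
        held = held_by_key.get((cand.z, cand.x, cand.y))
        if held is None or not client_copy_is_sufficient(held, cand):
            out.append(cand)
    return out
```

In terms of the manifest, `len(candidates)` would correspond to `total_candidates` and `len(tiles_to_deliver(...))` to `to_deliver`, with the difference being `skipped_by_client`.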
+
+**Alternatives considered**:
+
+| Alternative | Rejected because |
+|-------------|------------------|
+| REST `POST /inventory` + parallel GET | Never implemented in satellite-provider; still N+1 HTTP; no overlap of cached vs in-flight fetch |
+| SSE over HTTPS | Weaker typing; both sides are service binaries, not browsers — gRPC + protobuf is the better fit |
+| ZeroMQ between products | Poor fit across WAN/NAT; better kept **inside** satellite-provider's fetch workers |
+| In-flight streaming to UAV | Violates RESTRICT-SAT-1 / ADR-004; wrong reliability model for the aircraft |
+
+**Consequences**:
+
+- Epic AZ-976 decomposes: AZ-977 (SP gRPC server), AZ-978 (C11 client + C12 wiring), AZ-979 (Jetson benchmark + flip default).
+- REST `route_client` + `HttpTileDownloader` remain as fallback until AZ-979 benchmark promotes gRPC.
+- Finished C6 is still staged onto the Jetson via USB/rsync before flight — this ADR optimizes operator wait time, not in-air link dependency.
+
+**Evidence**: `_docs/02_document/contracts/c11_tilemanager/tile_provision.proto`, `tile_provision_grpc.md`, `_docs/02_tasks/todo/AZ-976_grpc_tile_provision_epic.md`.
@@ -0,0 +1,123 @@
+# Architecture Compliance Baseline
+
+> **Purpose.** Single canonical document against which every cumulative-review
+> report (per `.cursor/skills/code-review/SKILL.md` Phase 7 + the implement
+> skill's Step 14.5 cumulative review) computes its `## Baseline Delta` —
+> the count of **carried-over**, **resolved**, and **newly-introduced**
+> architecture violations. Without this file, cumulative reviews log
+> "baseline not found → no Baseline Delta section emitted" and structural
+> regressions are visible only pairwise per batch instead of cumulatively.
+
+**Baseline established**: 2026-05-26 (cycle-4 Step 10, batch 1, AZ-899)
+**Source-of-truth snapshot**: `_docs/06_metrics/structure_2026-05-20.md`
+**Initial violation count**: **0**
+**Cycle of last refresh**: 4
+
+## Source
+
+The "0 violations" claim is grounded in the structural facts captured by the
+cycle-1-close snapshot (`_docs/06_metrics/structure_2026-05-20.md`):
+
+| Fact | Value |
+|------|-------|
+| Inventory entries | 15 (14 production components C1–C13 + 1 cross-cutting `helpers/runtime_root` row) |
+| Import cycles in component graph | 0 (verified across batches 88–92 cumulative reviews; no back-edges) |
+| Contract files | 5 (`fdr_record_schema.md`, `fdr_client_protocol.md`, `log_record_schema.md`, the `shared_satellite_provider_ingest/` placeholder, `shared_flights_api/`) |
+| `_STRATEGY_REGISTRY` composition seam | `runtime_root.airborne_bootstrap` + `runtime_root.operator_bootstrap` (single composition root per binary, ADR-009) |
+| Layering rule | Layer-3 → Layer-4 imports **BANNED**; AZ-507 cross-component contract surface enforced by `tests/unit/test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies` lint |
+
+The architecture is documented in `_docs/02_document/architecture.md` (ADR-001
+monolith, ADR-002 build-time exclusion, ADR-009 interface-first DI,
+ADR-011 single-image live+replay). File ownership is documented in
+`_docs/02_document/module-layout.md`.
+
+## Violations
+
+*None at baseline.*
+
+This section is the append target for every cumulative-review run that
+detects an architecture finding (severity ≥ Medium, category =
+`Architecture`). The append schema is documented under § Update Protocol
+below.
+
+## Update Protocol
+
+### When a cumulative review finds a NEW architecture violation
+
+The reviewing skill (typically `.cursor/skills/code-review/SKILL.md` Phase 7,
+invoked from the implement skill's Step 14.5 cumulative review every K=3
+batches) MUST append a row to § Violations using this schema:
+
+| Field | Example |
+|-------|---------|
+| Finding ID | `arch-2026-06-15-1` (date + sequence within the day) |
+| Batch range | `batches 17–19 cycle 4` |
+| Severity | `High` / `Medium` (Critical findings escalate immediately; Low findings stay in the per-batch report) |
+| Subcategory | `import-cycle` / `cross-component-import` / `parallel-pipeline` / `layer-violation` / `seam-bypass` |
+| File:line | `src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py:117` |
+| One-line summary | `c2_vpr imports c6_tile_cache directly, bypassing the consumer-side Protocol cut required by AZ-507` |
+| Cumulative-review report | `_docs/03_implementation/cumulative_review_batches_17-19_cycle4_report.md` |
+| Status | `OPEN` (newly introduced) |
+
+The append happens IN THIS FILE, not in the cumulative-review report. The
+cumulative-review report references this file's row by Finding ID.
+
+### When a violation is resolved
+
+Update the violating row in place: change `Status: OPEN` to
+`Status: RESOLVED in batch <N> cycle <M> via <commit-hash>`. Do NOT delete
+the row — the audit trail must show both the introduction and the
+resolution.
+
+### When the structural snapshot is refreshed
+
+Any cycle that materially changes structure — new component, new
+cross-component edge, new contract file, new composition root — re-snapshots
+to a fresh `_docs/06_metrics/structure_<YYYY-MM-DD>.md` (the cycle-end
+retrospective triggers this when the diff is non-trivial). When that
+happens:
+
+1. Update the `**Source-of-truth snapshot**` header pointer at the top of
+   this file to the new file.
+2. Update the `Cycle of last refresh` header to the cycle that produced the
+   new snapshot.
+3. Update the § Source table values (inventory-entry count, import-cycle
+   count, contract-file count) to match the new snapshot.
+4. Do NOT clear § Violations — open findings carry across snapshots.
+   Resolution status is per-finding, not per-snapshot.
+
+The refresh procedure is the same one that produced
+`structure_2026-05-20.md`: count the
+`src/gps_denied_onboard/components/*/` directories plus
+`src/gps_denied_onboard/runtime_root/` and `helpers/`; run the AZ-270
+composition-root lint to detect cycles; enumerate the
+`_docs/02_document/contracts/` subdirectories. If the procedure has been
+extracted into `tools/structure_snapshot.py` between cycles, use that
+script; otherwise follow the manual approach documented at the top of the
+source snapshot file.
+
+## Baseline Delta — how cumulative-review reports consume this file
+
+Every cumulative-review report MUST emit a `## Baseline Delta` section with
+three counts derived from this file:
+
+- **Carried-over**: count of rows that were already `Status: OPEN` (or
+  `Status: ACCEPTED-RISK`) at the start of this review's batch window and
+  whose status did not change during it.
+- **Resolved**: count of rows that transitioned from `OPEN` to
+  `RESOLVED in batch ...` during this review's batch window.
+- **Newly-introduced**: count of rows added during this review's batch
+  window.
+
+An empty Baseline Delta (`0 new, 0 resolved, 0 carried-over`) is still
+emitted — its presence confirms the cumulative-review consulted the
+baseline rather than silently skipping the section as in cycles 1–3.
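
The three counts can be derived mechanically from the § Violations rows. The sketch below is one possible reading, under stated assumptions: `ViolationRow` and `baseline_delta` are hypothetical names, batches are identified by a single increasing integer (cycle boundaries are ignored), and a row both introduced and resolved inside the window counts as newly-introduced and as resolved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ViolationRow:
    # Hypothetical in-memory form of one row in the Violations section.
    finding_id: str
    introduced_batch: int
    resolved_batch: Optional[int] = None  # None while the row is still OPEN


def baseline_delta(rows, window_start, window_end):
    carried = resolved = new = 0
    for row in rows:
        introduced_in_window = window_start <= row.introduced_batch <= window_end
        resolved_in_window = (
            row.resolved_batch is not None
            and window_start <= row.resolved_batch <= window_end
        )
        still_open_after_window = (
            row.resolved_batch is None or row.resolved_batch > window_end
        )
        if introduced_in_window:
            new += 1
        if resolved_in_window:
            resolved += 1
        elif row.introduced_batch < window_start and still_open_after_window:
            carried += 1
    return {"carried_over": carried, "resolved": resolved, "newly_introduced": new}
```

With no rows at all the function returns three zeros, which is the empty Baseline Delta that must still be emitted.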
+
+## References
+
+- Cycle-3 retro § Top 3 Improvement Actions #3 — `_docs/06_metrics/retro_2026-05-26.md`
+- Cycle-1 retro § Top 3 Improvement Actions #3 (original) — `_docs/06_metrics/retro_2026-05-20.md`
+- Source snapshot — `_docs/06_metrics/structure_2026-05-20.md`
+- Existing-code flow Step 2 — `.cursor/skills/autodev/flows/existing-code.md` § "Step 2 — Architecture Baseline Scan"
+- Implement skill Step 14.5 — `.cursor/skills/implement/SKILL.md` § "Cumulative Code Review (every K batches)"
+- Architecture doc — `_docs/02_document/architecture.md`
+- Module-layout — `_docs/02_document/module-layout.md`
@@ -30,3 +30,19 @@ class ImuPreintegrator:

 - Bias drift is the responsibility of the consumers (C1 + C5) who call `reset_with_bias(...)` whenever their estimate of the IMU bias changes.
 - The preintegrator does not own a clock — every `integrate_*` call requires a monotonic timestamp on the IMU sample.
+
+## Cycle-1 operational reality
+
+The shipped surface in `src/gps_denied_onboard/helpers/imu_preintegrator.py` (AZ-276) extends the sketch above; this section is the authoritative inventory of what cycle-1 consumers actually see. Sketch return types in § Interface remain accurate for *intent*; the precise types appear here.
+
+- **Factory** — `make_imu_preintegrator(calibration: CameraCalibration) -> ImuPreintegrator` reads gyro/accel noise covariances from `calibration.metadata["imu_noise_model"]` and constructs `gtsam.PreintegrationCombinedParams.MakeSharedU(gravity_m_s2)`. Every key in the noise block is optional and independently defaulted, so partial blocks are honoured.
+- **BMI088-class defaults** (used when `imu_noise_model` is absent — bring-up + unit tests only; production deployments MUST supply a per-deployment noise model in `CameraCalibration.metadata`): `accel_noise_density=1.86e-3 m/s²/√Hz`, `gyro_noise_density=1.87e-4 rad/s/√Hz`, `accel_bias_rw=4.33e-4 m/s³/√Hz`, `gyro_bias_rw=2.66e-5 rad/s²/√Hz`, `integration_noise=1e-8`, `gravity_m_s2=9.80665`.
+- **`ImuPreintegrationError`** — single public exception type. Raised on (a) non-monotonic `sample.ts_ns`, (b) GTSAM's lower-level PIM rejection (wrapped, not propagated), (c) `current_preintegration()` / `reset_for_new_keyframe()` called with zero samples since last reset. Carries offending vs. previous `ts_ns` in its message so consumers can write FDR `kind="imu.skew"` events per AZ-276 Risk 2.
+- **Actual return type** of `current_preintegration()` and `reset_for_new_keyframe()` is `gtsam.PreintegratedCombinedMeasurements` (the PIM), not a constructed `CombinedImuFactor`. Consumers build the factor at attach time as `gtsam.CombinedImuFactor(*keys, pim)` — the helper cannot know the GTSAM keys. The `CombinedImuFactor` alias is still re-exported (so consumers do NOT import GTSAM directly), but it names the **type** the consumer constructs, not the helper's return value. Contract `imu_preintegrator.md` v1.0.0 still reflects the original "returns CombinedImuFactor" wording — a contract minor revision is queued for the next contracts-folder sweep.
+- **`reset_with_bias` is destructive in cycle-1** — it discards the partial integration accumulator (re-initialises the PIM with the new bias and a clean `last_ts_ns=None`). The contract's "subsequent samples only" wording is honoured at the new-window granularity: consumers MUST close the prior window via `reset_for_new_keyframe()` BEFORE changing bias if they want to retain its contribution.
+- **First-sample dt handling** — the first sample after a reset is recorded (`_last_ts_ns` updated, `_sample_count` incremented) but NOT integrated (GTSAM rejects `dt==0`). The second sample is the first integration call. This matters for AC-1 length counting: 100 samples ⇒ 99 GTSAM integrations.
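
The first-sample rule and the monotonic-timestamp check can be illustrated without GTSAM. The sketch below is a dependency-free stand-in, not the shipped helper: `SampleWindow` and its `integrations` counter are hypothetical, the exception is redefined locally, and treating an equal timestamp as non-monotonic is an assumption based on the note that GTSAM rejects `dt==0`.

```python
class ImuPreintegrationError(Exception):
    """Local stand-in for the helper's public exception type."""


class SampleWindow:
    """Tracks timestamps only; the real helper would also drive a GTSAM PIM."""

    def __init__(self) -> None:
        self._last_ts_ns = None
        self._sample_count = 0
        self.integrations = 0

    def add_sample(self, ts_ns: int) -> None:
        if self._last_ts_ns is not None and ts_ns <= self._last_ts_ns:
            # Message carries offending and previous timestamps.
            raise ImuPreintegrationError(
                f"non-monotonic ts_ns: got {ts_ns}, previous {self._last_ts_ns}"
            )
        if self._last_ts_ns is not None:
            # Second and later samples: a positive dt exists, so integrate.
            self.integrations += 1
        # The first sample after a reset is recorded but not integrated.
        self._last_ts_ns = ts_ns
        self._sample_count += 1

    @property
    def sample_count(self) -> int:
        return self._sample_count
```

Feeding 100 strictly increasing timestamps leaves `sample_count` at 100 and `integrations` at 99, which is the off-by-one that AC-1 length counting has to account for.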
+
+### Cycle-1 task lineage
+
+- AZ-276 — initial helper, contract producer.
+- No cycle-1 follow-up tasks touched this helper.
@@ -28,3 +28,19 @@ def adjoint(pose: SE3) -> Matrix6
 ## Caveats

 - Library-grade Lie-algebra functions exist in `manifpy` and `pylie`; we use GTSAM's primitives directly to avoid pulling in a second math library. If a future strategy needs richer manifold ops, evaluate `manifpy` then.
+
+## Cycle-1 operational reality
+
+The shipped surface in `src/gps_denied_onboard/helpers/se3_utils.py` (AZ-277) extends the sketch above; this section is the authoritative inventory of what cycle-1 consumers actually see.
+
+- **Type alias** — `SE3 = gtsam.Pose3` is re-exported by the helper. Consumers MUST import `SE3` from `helpers.se3_utils` and never `gtsam.Pose3` directly (keeps the Lie-algebra backend swappable without touching C1/C4/C5).
+- **`Se3InvalidMatrixError`** — single public exception type. Raised on (a) wrong array shape, (b) `dtype != float64`, (c) bottom row != `[0, 0, 0, 1]`, (d) rotation drift `‖R^TR − I‖_F > atol`, (e) negative-determinant rotation (mirror), (f) non-ndarray inputs. `matrix_to_se3` and `exp_map` raise this; `se3_to_matrix`, `log_map`, `adjoint` are no-throw on the typed input.
+- **Strict caller-orthogonalisation invariant** — the helper does NOT silently re-orthogonalise. AC-7 / `matrix_to_se3` always validates `‖R^TR − I‖_F ≤ atol` and rejects drift. Callers (C4 in particular, since `solvePnPRansac` output is not orthogonal to numerical precision) MUST run their own orthogonalisation (`cv2.Rodrigues` round-trip or `scipy.linalg.polar`) before calling `matrix_to_se3`. Default tolerance: `_DEFAULT_ROT_ATOL = 1e-6`; callers can pass a looser `atol` for relaxed contexts (none in cycle-1).
+- **`exp_map` near-identity fallback** — twist vectors with `‖xi‖ < _SMALL_ANGLE_THRESHOLD = 1e-10` return the identity `SE3()` instead of delegating to GTSAM's `Pose3.Expmap`. This guards against the `sin(theta)/theta` under-flow that surfaces when iSAM2's relinearisation produces a near-identity twist after a converged step.
+- **`is_valid_rotation(R_3x3, *, atol=1e-6)`** — predicate (no exception) for "is this matrix safe to feed to `matrix_to_se3`?". Returns False for non-ndarray, wrong shape, wrong dtype, orthogonality drift > atol, or negative determinant. Cycle-1 consumers: C4's `MarginalsAdapter` short-circuit (`opencv_gtsam_marginals.py` from AZ-358) and the contract test for AC-7.
+- **`dtype=float64` everywhere** — every public function enforces `float64`. `np.ndarray` returned from `se3_to_matrix`, `log_map`, `adjoint` is `np.ascontiguousarray(..., dtype=np.float64)` so callers can pass it through to GTSAM/Eigen without a copy.
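
The two numerical checks behind the rotation validation (orthogonality drift in the Frobenius norm and determinant sign) can be sketched without NumPy. This is an illustration only: `rotation_checks_pass` is a hypothetical name, it takes a 3x3 nested list rather than an `ndarray`, and it omits the type and dtype checks the shipped predicate performs.

```python
import math


def rotation_checks_pass(R, *, atol=1e-6):
    """Return True if the 3x3 nested-list matrix R is orthonormal with det > 0."""
    if len(R) != 3 or any(len(row) != 3 for row in R):
        return False
    # Frobenius norm of (R^T R - I), accumulated entry by entry.
    drift_sq = 0.0
    for i in range(3):
        for j in range(3):
            dot = sum(R[k][i] * R[k][j] for k in range(3))
            drift_sq += (dot - (1.0 if i == j else 0.0)) ** 2
    if math.sqrt(drift_sq) > atol:
        return False
    # Cofactor expansion along the first row; a mirror has det = -1.
    det = (
        R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
        - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
        + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0])
    )
    return det > 0.0
```

A caller that fails this check is expected to orthogonalise on its own side before conversion, since the helper deliberately does not repair the matrix.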
+
+### Cycle-1 task lineage
+
+- AZ-277 — initial helper, contract producer.
+- No cycle-1 follow-up tasks touched this helper.
@@ -28,3 +28,20 @@ class LightGlueRuntime:
 ## Caveats

 - The features fed in MUST come from the same backbone as the LightGlue engine was trained for (DISK in production-default; ALIKED / XFeat in alternates). Mixing backbones is a runtime error caught by the matcher's input shape check.
+
+## Cycle-1 operational reality
+
+The shipped surface in `src/gps_denied_onboard/helpers/lightglue_runtime.py` (AZ-278) is the structural fix for R14 (re-rank vs. matcher double-load of the LightGlue engine). Composition root wires ONE `LightGlueRuntime` instance and constructor-injects it into both C2.5 (`InlierBasedReranker`) and C3 (`CrossDomainMatcher`).
+
+- **Constructor** — `LightGlueRuntime(engine_handle: EngineHandle)`. `EngineHandle` is a `Protocol` from `_types/manifests.py` (descriptor_dim, forward(...)) — Layer 1 helper invariant means we do NOT import `gps_denied_onboard.components.*`. C7's `InferenceRuntime.deserialize_engine(LIGHTGLUE_ENGINE_CACHE_ENTRY)` returns the concrete handle at takeoff; the composition root passes it in.
+- **Construction guards** — `LightGlueRuntimeError` is raised for: `engine_handle is None`; `engine_handle` missing the `descriptor_dim` Protocol attribute; `descriptor_dim < 1`.
+- **Descriptor-dim mismatch** — both `match` and `match_batch` validate every `KeypointSet.descriptors` against the engine's `descriptor_dim` and raise `LightGlueRuntimeError` on mismatch (catches "DISK features fed into an ALIKED-trained LightGlue" regressions).
+- **Concurrent-access guard is non-blocking** — the runtime owns a `threading.Lock` but never calls `.acquire(blocking=True)`. Concurrent entry raises `LightGlueConcurrentAccessError` immediately rather than serialising. This is intentional: if you see this exception, the composition root wired the runtime into more than one thread by mistake — fix the composition, do NOT add blocking serialisation. The lock guards the body of both `match` and `match_batch`; `descriptor_dim()` is lock-free.
+- **`match_batch` equal-length precondition** — raises `LightGlueRuntimeError` if `len(features_a_list) != len(features_b_list)`. Iteration uses `zip(..., strict=True)`. Validation errors carry indexed labels (`features_a_list[i]`) so that a downstream test failure points to the offending pair.
+- **`descriptor_dim()` accessor** — returns the engine's descriptor dim as a plain `int` (cached on construction so per-call overhead is one attribute lookup).
+- **Public exceptions** — `LightGlueRuntimeError` (construction / descriptor-dim mismatch / batch-length mismatch) and `LightGlueConcurrentAccessError` (composition-root violation). Both subclass `RuntimeError`.
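
The fail-fast locking behaviour described in the list above can be sketched as follows. This is a minimal illustration, not the shipped `LightGlueRuntime`: `GuardedRuntime` and `ConcurrentAccessError` are hypothetical stand-ins, and the overlapping entry is simulated by re-entering from the same thread (a plain `threading.Lock` is not re-entrant, so the second `acquire(blocking=False)` fails just as it would from another thread).

```python
import threading


class ConcurrentAccessError(RuntimeError):
    """Hypothetical stand-in for LightGlueConcurrentAccessError."""


class GuardedRuntime:
    """Toy runtime whose match() body is protected by a non-blocking lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()

    def match(self, work):
        # Try to take the lock without waiting; an overlapping caller fails fast.
        if not self._lock.acquire(blocking=False):
            raise ConcurrentAccessError("runtime entered concurrently")
        try:
            return work()
        finally:
            self._lock.release()


runtime = GuardedRuntime()

# Normal single-caller use succeeds.
single_result = runtime.match(lambda: "matched")

# Simulated overlap: the inner call arrives while the outer call holds the lock.
try:
    runtime.match(lambda: runtime.match(lambda: "never reached"))
    overlap_outcome = "no error"
except ConcurrentAccessError:
    overlap_outcome = "rejected"

# The lock is released on the error path, so later calls still work.
after_result = runtime.match(lambda: "matched again")
```

Raising instead of waiting turns a wiring mistake into an immediate, attributable failure rather than a silent latency cost on the frame path.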
+
+### Cycle-1 task lineage
+
+- AZ-278 — initial helper, contract producer.
+- R14 structural fix: composition-root single-instance injection into C2.5 + C3 lands in `runtime_root/airborne_bootstrap.py` (the `lightglue_runtime` pre-constructed key consumed by both `_C2_5_STRATEGIES` and `_C3_STRATEGIES`).
@@ -41,3 +41,19 @@ class WgsConverter:

 - The static-only design satisfies the coderule.mdc constraint ("only use static methods for pure self-contained computations"). If a future deployment needs alternative datum support, switch to an instance-based factory then.
 - Tile-coordinate math is zoom-level-sensitive; callers MUST pass the right zoom level for the tile in question (typically zoomLevel from `TileMetadata`).
+
+## Cycle-1 operational reality
+
+The shipped surface in `src/gps_denied_onboard/helpers/wgs_converter.py` (AZ-279, extended by AZ-490) is the canonical entry point for every geodesy hop in the system. Stateless and `pyproj`-backed (`EPSG:4326 ↔ EPSG:4978`), with module-level `Transformer` instances cached on import.
+
+- **Public constants** — `WEB_MERCATOR_MAX_LAT_DEG = 85.0511287798066` (the slippy-map cutoff; outside this band, `latlon_to_tile_xy` raises) and `MAX_ZOOM = 22` (slippy-map upper bound; exposed so callers can validate operator input without hard-coding the limit).
+- **`WgsConversionError`** — single public exception type (subclasses `ValueError`). Raised on: non-finite `lat/lon/alt`; latitude/longitude out of WGS-84 range; non-`ndarray` or wrong-shape ECEF input; non-`float64` ECEF input; `zoom` not a non-bool `int` or out of `[0, MAX_ZOOM]`; tile `(x, y)` out of `[0, 2**zoom)`; latitude outside the Web-Mercator band for `latlon_to_tile_xy`.
+- **ECEF arrays are `np.ndarray` of shape `(3,)` and `dtype=float64`** — the Interface sketch above uses "Vector3" as a placeholder. `latlonalt_to_ecef` returns a freshly-allocated array; `ecef_to_latlonalt` and `local_enu_to_latlonalt` validate input shape/dtype and raise `WgsConversionError` on mismatch.
+- **`horizontal_distance_m(a: LatLonAlt, b: LatLonAlt) -> float`** — new method added in AZ-490 for C5's `set_takeoff_origin` bounded-delta gate. Computes the geodesic horizontal distance in metres via the same ECEF transformer used by `latlonalt_to_local_enu`: convert `b` into the local-ENU frame anchored at `a`, then take `hypot(east, north)`. Altitude is ignored (flat distance on the WGS-84 ellipsoid, NOT a 3-D distance). The result agrees with Vincenty to within a sub-millimetre error for separations of up to a few km; the bounded-delta gate operates at ≤ ~1 km, so AZ-490's "geodesic horizontal distance" AC is satisfied.
+- **Slippy-map tile math** — hand-rolled (NOT `pyproj`) to match OSM's `{zoom}/{x}/{y}.jpg` convention byte-equal so files produced by `satellite-provider` round-trip exactly. `latlon_to_tile_xy` clamps the output into `[0, n-1]` after `floor` — out-of-band latitude is rejected before this clamp via the Web-Mercator range check. `tile_xy_to_latlon_bounds` returns a `BoundingBox(min_lat_deg, min_lon_deg, max_lat_deg, max_lon_deg)` matching the tile's outer extent.
+- **`pyproj` import** — `from pyproj import Transformer` is tagged `# type: ignore[import-not-found]` because `pyproj` ships type stubs in a separate package; the project pin does not add the stubs. Don't drop the ignore comment in mypy passes.
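
The slippy-map tile math mentioned above follows the standard OSM tile-name formulas. The sketch below is an assumption-laden illustration of that convention, not the shipped `WgsConverter`: it raises a plain `ValueError` instead of `WgsConversionError`, and it returns the bounds as a bare tuple in `(min_lat_deg, min_lon_deg, max_lat_deg, max_lon_deg)` order rather than a `BoundingBox`.

```python
import math

MAX_ZOOM = 22
WEB_MERCATOR_MAX_LAT_DEG = 85.0511287798066


def latlon_to_tile_xy(lat_deg: float, lon_deg: float, zoom: int) -> tuple[int, int]:
    """Map a WGS-84 lat/lon to the OSM tile index at the given zoom."""
    if isinstance(zoom, bool) or not isinstance(zoom, int) or not 0 <= zoom <= MAX_ZOOM:
        raise ValueError("zoom must be an int in [0, MAX_ZOOM]")
    if not (math.isfinite(lat_deg) and math.isfinite(lon_deg)):
        raise ValueError("lat/lon must be finite")
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude out of range")
    # Range check happens before the clamp, as described in the list above.
    if abs(lat_deg) > WEB_MERCATOR_MAX_LAT_DEG:
        raise ValueError("latitude outside the Web-Mercator band")
    n = 2 ** zoom
    x = math.floor((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that the exact band edges (lon = 180, lat = +/- max) stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_xy_to_latlon_bounds(x: int, y: int, zoom: int) -> tuple[float, float, float, float]:
    """Return (min_lat, min_lon, max_lat, max_lon) of the tile's outer extent."""
    n = 2 ** zoom
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("tile (x, y) out of range for this zoom")

    def lat_of_row(row: int) -> float:
        # Latitude of the northern edge of the given tile row.
        return math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * row / n))))

    min_lon = x / n * 360.0 - 180.0
    max_lon = (x + 1) / n * 360.0 - 180.0
    return lat_of_row(y + 1), min_lon, lat_of_row(y), max_lon


origin_tile = latlon_to_tile_xy(0.0, 0.0, 1)
edge_tile = latlon_to_tile_xy(0.0, 180.0, 1)
corner_tile = latlon_to_tile_xy(WEB_MERCATOR_MAX_LAT_DEG, -180.0, 1)
world_bounds = tile_xy_to_latlon_bounds(0, 0, 0)

# Round trip: a point must fall inside the bounds of the tile it maps to.
sample_lat, sample_lon, sample_zoom = 50.45, 30.52, 10
sample_tile = latlon_to_tile_xy(sample_lat, sample_lon, sample_zoom)
sample_bounds = tile_xy_to_latlon_bounds(*sample_tile, sample_zoom)
```

Tile rows grow southwards, which is why the northern edge of row `y` is the tile's maximum latitude.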
+
+### Cycle-1 task lineage
+
+- AZ-279 — initial helper, contract producer (`latlonalt_to_*`, `*_to_latlonalt`, `latlon_to_tile_xy`, `tile_xy_to_latlon_bounds`).
+- AZ-490 — `horizontal_distance_m` addition for C5's takeoff-origin bounded-delta gate. Contract minor revision (v1.0.0 → v1.1.0) is queued for the next contracts-folder sweep.
@@ -34,3 +34,20 @@ class Sha256Sidecar:

 - The atomic rename is filesystem-level — works on POSIX local filesystems, not on NFS / SMB / overlayfs. For production deployments the cache root MUST live on a local filesystem.
 - The sidecar is NOT cryptographically signed; it protects against accidental corruption + file-replacement-after-staging, NOT against an attacker with write access to the cache root. Threat model treats the operator workstation as trusted; the companion's write access is restricted to F4 (mid-flight tile gen) which has its own per-flight signing key path.
+
+## Cycle-1 operational reality
+
+The shipped surface in `src/gps_denied_onboard/helpers/sha256_sidecar.py` (AZ-280) is static-only by design. Atomicity comes from `atomicwrites.atomic_write` (temp-file → `os.replace`). All four entry points wrap `OSError` and `ValueError` into a single exception hierarchy.
+
+- **`Sha256SidecarError`** — single public exception type (subclasses `RuntimeError`). Raised on: `write_atomic` OS failure; `write_atomic_and_sidecar` sidecar OS failure; `verify` finds the sidecar missing for an existing payload; sidecar text not exactly 64 lowercase hex chars; `aggregate_hash` finds a missing or unreadable path.
+- **`SIDECAR_SUFFIX = ".sha256"`** — public module-level constant for callers (e.g. takeoff-load verifier listing) that need to spell the sidecar suffix without hard-coding it.
+- **Sidecar file format** — pure hex digest, no JSON wrapper, exactly 64 chars, all lowercase. The validator rejects uppercase or wrong-length sidecars hard (catches "user edited the sidecar by hand and broke it"). Keeps verification trivial.
+- **Sidecar path appends `.sha256` verbatim** — `Path.with_suffix` would re-interpret an existing extension; we explicitly use `Path(str(payload_path) + ".sha256")`. So `manifest` → `manifest.sha256` AND `engine.engine` → `engine.engine.sha256`. This is the AC-NEW-CACHE-3 / D-C10-3 invariant.
+- **Streaming digests** — `verify` and `aggregate_hash` stream the file in 1 MiB chunks (`_digest_file`) so an 8 GB engine file does not require 8 GB of RAM. `write_atomic` is the only entry point that operates on in-memory `bytes`.
+- **`verify` semantics** — returns `False` (not raise) when the payload path is missing entirely ("not verifiable" rather than "verification error"); raises `Sha256SidecarError` when the payload exists but the sidecar is missing, unreadable, or malformed. Callers can branch on `path.exists()` first if they need to distinguish missing-payload from corrupt-sidecar.
+- **`aggregate_hash` is byte-deterministic** — input list is sorted lexicographically by `str(path)` before hashing. The digest is computed over the concatenation of `<basename>\0<hex-digest>\n` lines (basename only, NOT full path, so the same physical file at a different mount point still produces the same aggregate). Missing paths in the input list raise instead of being silently skipped.
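
The sidecar conventions listed above (verbatim suffix append, 1 MiB streaming, strict 64-character lowercase hex, sorted basename-based aggregate) can be sketched with the standard library alone. This is a hedged illustration, not the shipped `Sha256Sidecar`: the function names are hypothetical, plain `RuntimeError` stands in for `Sha256SidecarError`, and the atomic-write step is omitted.

```python
import hashlib
import re
import tempfile
from pathlib import Path

SIDECAR_SUFFIX = ".sha256"
_CHUNK_BYTES = 1024 * 1024  # 1 MiB streaming chunk
_HEX64 = re.compile(r"[0-9a-f]{64}")


def sidecar_path(payload_path: Path) -> Path:
    # Append verbatim; Path.with_suffix would replace an existing extension.
    return Path(str(payload_path) + SIDECAR_SUFFIX)


def digest_file(path: Path) -> str:
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(_CHUNK_BYTES):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify(payload_path: Path) -> bool:
    if not payload_path.exists():
        return False  # "not verifiable", not an error
    sidecar = sidecar_path(payload_path)
    if not sidecar.exists():
        raise RuntimeError(f"sidecar missing for {payload_path}")
    recorded = sidecar.read_text(encoding="ascii")
    if _HEX64.fullmatch(recorded) is None:
        raise RuntimeError(f"malformed sidecar for {payload_path}")
    return recorded == digest_file(payload_path)


def aggregate_hash(paths: list[Path]) -> str:
    hasher = hashlib.sha256()
    # Sort by str(path); hash basename + digest so mount points do not matter.
    for path in sorted(paths, key=str):
        hasher.update(f"{path.name}\0{digest_file(path)}\n".encode("utf-8"))
    return hasher.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    engine = root / "engine.engine"
    manifest = root / "manifest"
    for payload, content in ((engine, b"engine-bytes"), (manifest, b"manifest-bytes")):
        payload.write_bytes(content)
        sidecar_path(payload).write_text(digest_file(payload), encoding="ascii")

    sidecar_names = sorted(p.name for p in root.glob("*" + SIDECAR_SUFFIX))
    verified = verify(engine)
    missing_payload = verify(root / "absent.bin")
    order_independent = (
        aggregate_hash([engine, manifest]) == aggregate_hash([manifest, engine])
    )

    # A hand-edited (uppercased) sidecar is rejected rather than normalised.
    sidecar_path(manifest).write_text(digest_file(manifest).upper(), encoding="ascii")
    try:
        verify(manifest)
        uppercase_rejected = False
    except RuntimeError:
        uppercase_rejected = True
```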
+
+### Cycle-1 task lineage
+
+- AZ-280 — initial helper, contract producer.
+- No cycle-1 follow-up tasks touched this helper. The C10 / C6 / C7 task batch that consumes it (AZ-301 C7 engine gate, AZ-303 C6 storage interfaces, AZ-305 C6 postgres+filesystem store, AZ-321 C10 engine compiler, AZ-322 C10 descriptor batcher, AZ-323 C10 manifest builder, AZ-324 C10 manifest verifier, AZ-325 C10 cache provisioner) cycles through the four `Sha256Sidecar` static methods without extending them.
@@ -48,3 +48,20 @@ HostCapabilities:

 - The dotted-version format must round-trip cleanly through filesystems (no `/` or `\` in dotted versions; safe).
 - Adding a new tuple dimension (e.g., a per-binary `BUILD_*` flag combination) requires extending the schema AND every existing `.engine` filename. Versioning the schema itself is a Plan-phase carryforward if/when needed.
+
+## Cycle-1 operational reality
+
+The shipped surface in `src/gps_denied_onboard/helpers/engine_filename_schema.py` (AZ-281) is stateless and static-only, with a single compiled regex governing `parse` and `build`. The host-match predicate compares `(sm, jetpack, trt)` exactly; **precision is NOT part of the host match** (a `fp16` engine and an `int8` engine for the same SM/JetPack/TRT both "match the host" — the takeoff-load verifier picks the one it wants by precision separately).
+
+- **`EngineFilenameSchemaError`** — single public exception type (subclasses `ValueError`). Raised on: non-`str` inputs to `build` / `parse`; missing `.engine` suffix; regex non-match; reserved `__` separator inside `model_name`; `model_name` outside `[a-z0-9_]+` or longer than 64 chars; `sm` not a non-bool positive int; version not matching `\d+\.\d+`; precision not in `ALLOWED_PRECISIONS`.
+- **`ENGINE_SUFFIX = ".engine"`** — public module-level constant.
+- **`ALLOWED_PRECISIONS = frozenset({"fp16", "int8", "mixed"})`** — public module-level constant; exposed so C7's takeoff-load decision tree and C10's engine-build orchestration can validate operator-supplied precision without hard-coding the enum.
+- **Strict model-name validation** — `[a-z0-9_]+`, non-empty, ≤64 chars, no embedded `__` (reserved as the model/SM separator). Catches "operator typed a model name with a hyphen" before any filesystem operation runs.
+- **Strict version validation** — both `jetpack` and `trt` must match dotted `<major>.<minor>` (e.g. `"6.2"`, `"10.3"`). Patch components are deliberately NOT supported in the filename — the engine ABI is stable within `<major>.<minor>` per the JetPack/TRT release notes.
+- **`sm` validation** — must be a non-bool `int > 0`. Python's `bool ⊆ int` quirk would otherwise let `True` slip through as `sm=1`.
+- **`matches_host` ignores precision by design** — the filename's `precision` segment is informational for the host-match check. C7's `deserialize_engine` uses `matches_host` to filter "engines this host can run at all" before applying its own precision policy.
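
The field-level rules above can be sketched independently of the filename layout itself, which this section does not spell out and which the example therefore does not attempt to reproduce. `EngineFields`, `validate_fields`, and `is_valid` are hypothetical names, plain `ValueError` stands in for `EngineFilenameSchemaError`, and the sample values (`"disk"`, `sm=87`) are illustrative only.

```python
import re
from dataclasses import dataclass

ALLOWED_PRECISIONS = frozenset({"fp16", "int8", "mixed"})
_MODEL_NAME = re.compile(r"[a-z0-9_]+")
_VERSION = re.compile(r"\d+\.\d+")


@dataclass(frozen=True)
class EngineFields:
    """Hypothetical container for the parsed filename segments."""
    model_name: str
    sm: int
    jetpack: str
    trt: str
    precision: str


def validate_fields(fields: EngineFields) -> None:
    name = fields.model_name
    if not isinstance(name, str) or _MODEL_NAME.fullmatch(name) is None:
        raise ValueError("model_name must match [a-z0-9_]+")
    if len(name) > 64 or "__" in name:
        raise ValueError("model_name too long or contains reserved '__'")
    # bool is a subclass of int, so it has to be rejected explicitly.
    if isinstance(fields.sm, bool) or not isinstance(fields.sm, int) or fields.sm <= 0:
        raise ValueError("sm must be a positive, non-bool int")
    for label, version in (("jetpack", fields.jetpack), ("trt", fields.trt)):
        if not isinstance(version, str) or _VERSION.fullmatch(version) is None:
            raise ValueError(f"{label} must be dotted <major>.<minor>")
    if fields.precision not in ALLOWED_PRECISIONS:
        raise ValueError("precision not in ALLOWED_PRECISIONS")


def is_valid(fields: EngineFields) -> bool:
    try:
        validate_fields(fields)
    except ValueError:
        return False
    return True


def matches_host(fields: EngineFields, sm: int, jetpack: str, trt: str) -> bool:
    # Precision is deliberately left out of the comparison.
    return (fields.sm, fields.jetpack, fields.trt) == (sm, jetpack, trt)


fp16_engine = EngineFields("disk", 87, "6.2", "10.3", "fp16")
int8_engine = EngineFields("disk", 87, "6.2", "10.3", "int8")

both_match_host = (
    matches_host(fp16_engine, 87, "6.2", "10.3")
    and matches_host(int8_engine, 87, "6.2", "10.3")
)
bool_sm_ok = is_valid(EngineFields("disk", True, "6.2", "10.3", "fp16"))
hyphen_ok = is_valid(EngineFields("light-glue", 87, "6.2", "10.3", "fp16"))
patch_version_ok = is_valid(EngineFields("disk", 87, "6.2.1", "10.3", "fp16"))
reserved_sep_ok = is_valid(EngineFields("disk__v2", 87, "6.2", "10.3", "fp16"))
```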
+
+### Cycle-1 task lineage
+
+- AZ-281 — initial helper, contract producer.
+- No cycle-1 follow-up tasks touched this helper. C7's `deserialize_engine` (AZ-301) and C10's engine compiler (AZ-321) consume it without extension.
@@ -47,3 +47,21 @@ RansacResult:

 - The RANSAC threshold is a tunable; defaults are documented per-component (C3, C3.5, C4) in their specs.
 - For 2D-3D RANSAC inside C4's `solvePnPRansac`, OpenCV does it internally — this helper is for the standalone reprojection-residual computation that lives outside the PnP call.
+
+## Cycle-1 operational reality
+
+The shipped surface in `src/gps_denied_onboard/helpers/ransac_filter.py` (AZ-282, extended via composition in AZ-623) is static-only and deterministic — `cv2.setRNGSeed(0)` is called immediately before every `cv2.findHomography(..., RANSAC)` so the same correspondences always produce the same inlier mask (AC-3 byte-equal determinism).
+
+- **`RansacFilterError`** — single public exception type (subclasses `ValueError`). Raised on: non-`ndarray` correspondences; wrong-shape correspondences (anything other than `(N, 4)`); non-positive `ransac_threshold_px`; negative `min_inliers`; fewer than 4 correspondences for `filter_correspondences` (homography needs ≥4); non-`(3, 3)` `K`; distortion not shape `(5,)` or `(8,)`; OpenCV exceptions are wrapped (`cv2.error` → `RansacFilterError`).
+- **`RansacResult` is a frozen `@dataclass`** — `inlier_correspondences: np.ndarray`, `inlier_count: int`, `outlier_count: int`, `median_residual_px: float`. The numpy array is NOT copied; consumers MUST treat it as read-only.
+- **Median, not mean** — both `filter_correspondences` (homography residual) and `compute_reprojection_residual` (post-pose residual) pin **median** as the residual statistic. This matches the contract for C3.5 (post-AdHoP residual gate) and C4 (per-frame FDR residual). Mean is more sensitive to remaining outliers and would defeat the gate.
+- **NaN residual for empty inliers** — both methods return `float("nan")` when the inlier set is empty. Consumers must NOT propagate `nan` as a numeric residual; treat it as "no residual computable" and fall back to the C3.5/C4 "matcher returned nothing useful" branch.
+- **`min_inliers` is INFORMATIONAL only** — passed to `filter_correspondences`, validated for non-negativity, but does NOT gate the result. The returned `RansacResult` always reflects the actual RANSAC outcome; callers decide whether `result.inlier_count >= min_inliers` is acceptable. This is the contract's "Min-inliers semantics" invariant — encoding the gate in the helper would conflate three separate component thresholds (C3 / C3.5 / C4).
+- **`filter_correspondences` residual** uses `cv2.perspectiveTransform` (the homography fit's own residual). `compute_reprojection_residual` uses `cv2.projectPoints` with the supplied pose, back-projecting image-a pixels through `K^{-1}` to `z=1` in camera-a frame. The two residuals are NOT interchangeable — one measures homography fit quality, the other measures pose fit quality.
+- **Imports `se3_utils`** — `from gps_denied_onboard.helpers.se3_utils import SE3, se3_to_matrix`. Layer 1 helper-on-helper import is allowed (both are Layer 1). The `SE3 = gtsam.Pose3` alias is the runtime pose type; `se3_to_matrix(pose)` extracts the 4×4 transform.
+- **OpenCV pin** — uses `cv2.findHomography`, `cv2.perspectiveTransform`, `cv2.projectPoints`, `cv2.Rodrigues`, `cv2.setRNGSeed`. All exist in `opencv-python>=4.5`, so the cycle-1 pin relaxation to `>=4.11.0.86,<4.12` (D-CROSS-CVE-1 leftover) does not affect this helper.
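
The "median, not mean", "NaN for empty inliers", and "`min_inliers` is informational" points above combine into a caller-side gate. The sketch below shows one way a consumer might apply them. `ResidualSummary`, `summarise`, and `accept` are hypothetical names that mirror only two of the `RansacResult` fields, and the thresholds are made-up illustrative values, not the C3 / C3.5 / C4 defaults.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidualSummary:
    """Hypothetical subset of RansacResult used for gating."""
    inlier_count: int
    median_residual_px: float


def summarise(inlier_residuals_px: list[float]) -> ResidualSummary:
    # Empty inlier set: report NaN instead of inventing a numeric residual.
    if not inlier_residuals_px:
        return ResidualSummary(0, float("nan"))
    return ResidualSummary(
        len(inlier_residuals_px), statistics.median(inlier_residuals_px)
    )


def accept(summary: ResidualSummary, min_inliers: int, max_median_px: float) -> bool:
    # The caller owns the gate; NaN means "no residual computable".
    if math.isnan(summary.median_residual_px):
        return False
    return (
        summary.inlier_count >= min_inliers
        and summary.median_residual_px <= max_median_px
    )


# One surviving outlier barely moves the median but dominates the mean.
residuals = [0.4, 0.5, 0.6, 0.7, 40.0]
summary = summarise(residuals)
mean_residual = statistics.fmean(residuals)

accepted = accept(summary, min_inliers=4, max_median_px=2.0)
empty_summary = summarise([])
empty_accepted = accept(empty_summary, min_inliers=4, max_median_px=2.0)
```

With these sample values a mean-based gate at the same threshold would reject the frame, which is the failure mode the median statistic avoids.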
+
+### Cycle-1 task lineage
+
+- AZ-282 — initial helper, contract producer.
+- AZ-623 (`pre_constructed_phase_e_ransac_c5_helpers`) — composition-root sweep that wires C5 to consume the same static helper as C3/C3.5/C4; no signature changes, no new public surface added by this task.
@@ -31,3 +31,19 @@ class DescriptorNormaliser:

 - Zero-norm vectors are returned as the zero vector (no division-by-zero); callers must filter or accept that such descriptors will match nothing.
 - The choice of "inner product on L2-normalised" rather than "cosine" is FAISS-idiomatic — FAISS does not have a built-in cosine metric; cosine is achieved by L2-normalising and using inner product.
+
+## Cycle-1 operational reality
+
+The shipped surface in `src/gps_denied_onboard/helpers/descriptor_normaliser.py` (AZ-283, extended by AZ-338 NetVLAD per-cluster method) is static-only, stateless, and dtype-preserving. Norms are computed in `float32` to stabilise `float16` inputs against under/overflow, then cast back to the input dtype — the helper NEVER silently up-casts the returned descriptor.
+
+- **`DescriptorNormaliserError`** — single public exception type (subclasses `ValueError`). Raised on: non-`ndarray` input; wrong dimensionality (`l2_normalise` requires 1-D, `l2_normalise_batch` requires 2-D, `intra_cluster_normalise` requires 1-D); zero-length axis; dtype not in `ALLOWED_DTYPES`; `num_clusters` not a non-bool positive int that divides `descriptor.shape[0]`.
+- **`ALLOWED_DTYPES = (np.float16, np.float32)`** — public module-level constant. Anything else is rejected hard; this keeps the FAISS index and the runtime query path on the same precision (catches "C10 built the index in float32 but C2 fed a float64 query" regressions).
+- **`intra_cluster_normalise(descriptor, num_clusters)` — NEW METHOD (AZ-338)**. Per-cluster L2 normalisation for VLAD-aggregated descriptors. NetVLAD's published preprocessing chain L2-normalises each per-cluster sub-vector BEFORE the global L2 step (`l2_normalise`). The input is a flat 1-D VLAD descriptor of shape `(num_clusters * cluster_dim,)`; the method reshapes to `(num_clusters, cluster_dim)`, normalises row-wise (zero-norm rows stay zero), and flattens back. `num_clusters` MUST divide `descriptor.shape[0]` — otherwise `DescriptorNormaliserError`.
+- **`descriptor_metric()` returns the literal string `"inner_product"`** — the source of truth for FAISS HNSW index construction. C6's `DescriptorIndex.search_topk` and C10's index-build code both consult this; do NOT hard-code the metric string anywhere else.
+- **Zero-norm vectors return zeros** — `l2_normalise`, `l2_normalise_batch`, and `intra_cluster_normalise` all guard the divisor. Callers that want to reject zero-norm descriptors must do so explicitly; the helper never raises on zero norm (it would be the wrong layer to decide the policy).
+- **`l2_normalise_batch` vectorised** — uses `np.where(norms == 0.0, ...)` to apply the zero-guard row-wise so a batch of N descriptors with K zeros costs the same as a batch of N non-zero descriptors plus K boolean comparisons (no per-row branch).
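
The per-cluster step followed by the global L2 step can be illustrated without NumPy. This is a pure-Python sketch of the arithmetic only: it works on lists of floats, ignores the dtype handling described above, and raises a plain `ValueError` in place of `DescriptorNormaliserError`.

```python
import math


def l2_normalise(vec: list[float]) -> list[float]:
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        # Zero-norm input stays zero; no division by zero, no exception.
        return [0.0 for _ in vec]
    return [v / norm for v in vec]


def intra_cluster_normalise(descriptor: list[float], num_clusters: int) -> list[float]:
    if isinstance(num_clusters, bool) or not isinstance(num_clusters, int) or num_clusters <= 0:
        raise ValueError("num_clusters must be a positive, non-bool int")
    if len(descriptor) == 0 or len(descriptor) % num_clusters != 0:
        raise ValueError("num_clusters must divide the descriptor length")
    cluster_dim = len(descriptor) // num_clusters
    out: list[float] = []
    # Equivalent to reshape (num_clusters, cluster_dim), normalise rows, flatten.
    for c in range(num_clusters):
        out.extend(l2_normalise(descriptor[c * cluster_dim:(c + 1) * cluster_dim]))
    return out


# Three clusters of dimension two; the middle cluster is all zeros.
vlad = [3.0, 4.0, 0.0, 0.0, 0.0, 5.0]
per_cluster = intra_cluster_normalise(vlad, 3)
final = l2_normalise(per_cluster)

# After the global step, the inner product of the descriptor with itself is 1,
# which is why inner product on normalised vectors behaves like cosine similarity.
self_similarity = sum(v * v for v in final)
```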
+
+### Cycle-1 task lineage
+
+- AZ-283 — initial helper, contract producer (`l2_normalise`, `l2_normalise_batch`, `descriptor_metric`).
+- AZ-338 — `intra_cluster_normalise` addition for the C2 VPR NetVLAD preprocessing path (`ultra_vpr` AZ-337 consumer). Contract minor revision (v1.0.0 → v1.1.0) is queued for the next contracts-folder sweep.
@@ -4,7 +4,9 @@

 **Purpose**: produce a per-frame relative pose SE(3) + 6×6 covariance + IMU bias estimate + feature-quality summary from the nav-camera frame and the FC IMU/attitude window, fusing visual and inertial cues without any external (satellite) reference.

-**Architectural Pattern**: Strategy — `VioStrategy` interface with three concrete implementations (Okvis2 production-default, VinsMono research-only, KltRansac mandatory simple-baseline), constructor-injected at the composition root (ADR-009), build-time gated by per-implementation CMake `BUILD_*` flags (ADR-002), runtime selection by config at startup (ADR-001), not hot-swappable mid-flight.
+**Architectural Pattern**: Strategy — `VioStrategy` interface with three concrete implementations (Okvis2 nominal production-default, VinsMono research-only, KltRansac mandatory simple-baseline), constructor-injected at the composition root (ADR-009), build-time gated by per-implementation CMake `BUILD_*` flags (ADR-002), runtime selection by config at startup (ADR-001), not hot-swappable mid-flight.
+
+**Cycle-1 operational reality**: the airborne binary ships with **`KltRansac` as the production-default selection** while the OKVIS2 + VINS-Mono native wirings are parked as Tier-2 follow-ups (`AZ-592` for AZ-332 OKVIS2; `AZ-593` for AZ-333 VINS-Mono — see `FINAL_report.md` § Cycle 1 Implementation Status). Both higher-fidelity strategies have their Python facade + pybind11 binding skeleton + `_STRATEGY_REGISTRY` registration in place; the first `process_frame` call into the OKVIS2/VINS-Mono native side raises until the upstream wiring (`okvis::ThreadedSlam` / VINS-Mono ROS-strip) lands. ADR-001 / ADR-002 remain correct — the seam exists, the build-flag gating works — only the operational default-selection shifts. Runtime selection of `okvis2` or `vins_mono` via config currently raises `StrategyNotAvailableError` from `runtime_root/vio_factory.py` until their `BUILD_*` flag is ON.

 **Upstream dependencies**:
 - Camera ingest thread → `NavCameraFrame` (3 Hz nominal, drop-oldest queue).
@@ -85,7 +87,7 @@ No database access, no cache layer beyond the in-process keyframe window.
 |---------|---------|---------|
 | OKVIS2 (C++) | upstream HEAD pinned per Plan-phase | Production-default tightly-coupled VIO; BSD-3-Clause |
 | VINS-Mono (C++) | upstream HEAD pinned per Plan-phase | Research-only loosely-coupled VIO for IT-12 comparative study; behind `BUILD_VINS_MONO` |
-| OpenCV | ≥ 4.12.0 (CVE-2025-53644 mitigation) | KLT pyramidal optical flow + RANSAC for the simple-baseline strategy |
+| OpenCV | `>=4.11.0.86,<4.12` (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred — see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`) | KLT pyramidal optical flow + RANSAC for the KltRansac strategy (cycle-1 production-default) |
 | Eigen | matches OKVIS2 / GTSAM pin | Lie-algebra math for SE(3) + 6×6 covariance |
 | pybind11 | matches OKVIS2 / VINS-Mono build | Python bindings for the C++ strategies |

@@ -112,7 +114,11 @@ No database access, no cache layer beyond the in-process keyframe window.
 - The camera ingest thread is the sole producer; C5 is the sole consumer. Concurrent calls to `process_frame` on a single strategy instance are forbidden — enforce in the composition root by binding one strategy instance to the camera ingest thread.

 **Performance bottlenecks**:
-- Okvis2 sliding-window optimisation can spike to 80–120 ms on a thermally-throttled Jetson; D-CROSS-LATENCY-1 hybrid auto-degrades C4 covariance recovery (not C1) to free budget.
+- Okvis2 sliding-window optimisation can spike to 80–120 ms on a thermally-throttled Jetson; D-CROSS-LATENCY-1 hybrid auto-degrades C4 covariance recovery (not C1) to free budget. (Behaviour is documented for when AZ-592 wires the OKVIS2 native side; in cycle-1, KltRansac is the active backend and its per-frame cost is `O(F)` only.)
+
+**Cycle-1 Tier-2 follow-up dependencies**:
+- AZ-592 (parked from AZ-332): wires `okvis::ThreadedSlam` into `_native/okvis2_binding.cpp` and lands the OKVIS2 CI matrix (Ceres + vendored submodules) + Tier-2 Jetson validation against Derkachi-class fixtures. Until this lands, requesting `config.vio.strategy="okvis2"` raises `StrategyNotAvailableError` regardless of `BUILD_OKVIS2`.
+- AZ-593 (parked from AZ-333): finalises the de-ROSified VINS-Mono upstream pin (HKUST + in-tree ROS-strip vs. community fork) and wires the `vins_estimator::Estimator` into `_native/vins_mono_binding.cpp`. Until this lands, requesting `config.vio.strategy="vins_mono"` raises `StrategyNotAvailableError` regardless of `BUILD_VINS_MONO`.

 ## 8. Dependency Graph

@@ -6,6 +6,8 @@

 **Architectural Pattern**: Strategy — `VprStrategy` interface; concrete implementations (UltraVPR primary, MegaLoc secondary, MixVPR / SelaVPR / EigenPlaces / NetVLAD / SALAD additional candidates) selected at startup by config (ADR-001); build-time gated per-implementation by `BUILD_*` flags (ADR-002); composition-root wired (ADR-009).

+**Cycle-1 operational reality**: the airborne binary wires C2 through the `_STRATEGY_REGISTRY` + `register_airborne_strategies()` runtime gate (AZ-591) on top of the build-flag matrix, and constructor injection flows through the `pre_constructed` dict passed to `compose_root(config, pre_constructed=...)` (AZ-618 umbrella → AZ-620 c6 storage phase + AZ-623 c7 inference phase). All seven backbones (`ultra_vpr`, `net_vlad`, `mega_loc`, `mix_vpr`, `sela_vpr`, `eigen_places`, `salad`) have wired strategy modules + `_preprocessor_*` siblings + `_faiss_bridge`; their `BUILD_VPR_<variant>` env flags default OFF (tests/CI must opt in per strategy — see `runtime_root/vpr_factory.py::_is_build_flag_on`). The cycle-1 `C2VprConfig.strategy` default is `net_vlad` (the mandatory simple-baseline per Plan-phase D-C2-1) — `ultra_vpr` remains the Documentary Lead's PRIMARY backbone but additionally requires a pre-compiled `.trt` engine produced by C10's engine compiler (AZ-321). The `c2_vpr` slot lists `("c6_descriptor_index", "c7_inference")` in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`; missing keys raise `AirborneBootstrapError` at composition time, not at first frame.
+
 **Upstream dependencies**:
 - Camera ingest thread → `NavCameraFrame` (parallel fan-out with C1; same frame, distinct queue depth).
 - C7 InferenceRuntime → backbone forward pass (TRT/ONNX/PyTorch per active runtime).
@@ -92,6 +94,7 @@ C2 is read-only against C6 during F3/F4/F6. Pre-flight, F1 triggers C10 (after C
 | PyTorch | matches simple-baseline track | FP16 baseline (NetVLAD / MixVPR mandatory) |
 | UltraVPR (research code drop) | upstream HEAD pinned per Plan-phase | Documentary Lead PRIMARY backbone |
 | MegaLoc, MixVPR, SelaVPR, EigenPlaces, NetVLAD | upstream HEAD pinned per Plan-phase | Secondary + mandatory simple-baselines |
+| OpenCV (`cv2`) | `>=4.11.0.86,<4.12` (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred — see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`) | Image decode + colour-space conversions in the per-strategy `_preprocessor_*.py` modules |

 **Error Handling Strategy**:
 - `VprBackboneError`: backbone forward pass failed (CUDA OOM, TRT engine deserialize mismatch). C2 emits no `VprResult`; C5 falls back to VIO-only with provenance label `visual_propagated` (AC-1.4).
@@ -4,11 +4,14 @@

 **Purpose**: re-rank C2's top-K=10 VPR candidates down to top-N=3 by single-pair LightGlue inlier count, producing a higher-precision input for the cross-domain matcher (C3). The re-rank step is the architectural boundary between cheap descriptor retrieval (C2) and expensive cross-domain matching (C3) — it pays a small extra cost so C3 only operates on the most promising candidates.

-**Architectural Pattern**: Strategy (single concrete implementation today: `InlierCountReRanker`). Future re-rank algorithms can be added as additional `ReRankStrategy` implementations behind the same interface.
+**Architectural Pattern**: Strategy (single concrete implementation today: `InlierCountReRanker`, AZ-343). Future re-rank algorithms can be added as additional `ReRankStrategy` implementations behind the same interface.
+
+**Cycle-1 operational reality**: the airborne binary registers `c2_5_rerank` into `_STRATEGY_REGISTRY` via `register_airborne_strategies()` (AZ-591) with a single strategy slot (`inlier_count`); the composition root passes infrastructure deps through the `pre_constructed` dict (AZ-618 → AZ-620 c6 phase + AZ-621 c3 helpers phase). The `c2_5_rerank` slot lists `("c6_tile_store", "c3_lightglue_runtime", "c3_feature_extractor", "clock")` in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`; `c13_fdr` is optional. Missing keys raise `AirborneBootstrapError` at composition time, naming the consumer and missing key. The `FeatureExtractor` placeholder remains `OpenCvOrbExtractor` in cycle-1; the TRT-backed DISK/ALIKED swap is parked as a Tier-2 follow-up (carried as the `FeatureExtractor` row in § 6 below).
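
The composition-time key check described above might look roughly like the sketch below. Everything except the key names is an assumption: `BootstrapError`, `REQUIRED_KEYS`, and `check_pre_constructed` are hypothetical stand-ins for `AirborneBootstrapError`, `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`, and whatever the real bootstrap does internally.

```python
class BootstrapError(RuntimeError):
    """Hypothetical stand-in for AirborneBootstrapError."""


# Required keys per consumer slot; only the c2_5_rerank row is taken from the text.
REQUIRED_KEYS = {
    "c2_5_rerank": (
        "c6_tile_store",
        "c3_lightglue_runtime",
        "c3_feature_extractor",
        "clock",
    ),
}


def check_pre_constructed(consumer: str, pre_constructed: dict) -> None:
    """Fail at composition time, naming the consumer and every missing key."""
    missing = [key for key in REQUIRED_KEYS[consumer] if key not in pre_constructed]
    if missing:
        raise BootstrapError(
            f"{consumer}: missing pre-constructed key(s): {', '.join(missing)}"
        )


# Placeholder objects stand in for the real helper instances.
complete = {key: object() for key in REQUIRED_KEYS["c2_5_rerank"]}
check_pre_constructed("c2_5_rerank", complete)  # passes silently

incomplete = dict(complete)
del incomplete["clock"]
try:
    check_pre_constructed("c2_5_rerank", incomplete)
    failure_message = ""
except BootstrapError as exc:
    failure_message = str(exc)
```

Optional keys such as `c13_fdr` are simply absent from the required tuple, so their absence never triggers the check.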

 **Upstream dependencies**:
 - C2 → `VprResult` (top-K=10 candidates).
 - Shared `LightGlueRuntime` helper (used in single-pair mode for inlier counting; the same matcher object is shared with C3 — owned by the helper, not by C3, so neither component depends on the other at build time).
+- Shared `FeatureExtractor` helper (`helpers/feature_extractor.py`, AZ-343 scope expansion) — extracts `KeypointSet` from both the per-frame nav image and each candidate's tile JPEG; the placeholder impl is `OpenCvOrbExtractor`, swapped out for a TRT-backed deep extractor before flight.
 - C6 TileStore → fetch tile pixels for each candidate (cheap, in-memory page-cache hit during a flight).
 - Camera calibration artifact — for nav-frame preprocessing.

@@ -59,7 +62,7 @@ No caching layer beyond C6's mmap. The same tile may be fetched repeatedly acros

 **Algorithmic Complexity**: `O(K)` LightGlue forward passes per frame (K=10), each `O(M_tile · M_query)` in feature counts. The whole step is GPU-bound on the same engine that C3 uses — hence the shared LightGlue runtime.

-**State Management**: stateless per-frame. Holds a reference to the shared LightGlue object owned by C3.
+**State Management**: stateless per-frame. Holds references to the constructor-injected `LightGlueRuntime`, `FeatureExtractor`, `TileStore`, `Clock`, and (optionally) `FdrClient` — all lifecycle-owned by the runtime root, not by C2.5.

 **Key Dependencies**:

@@ -67,6 +70,7 @@ No caching layer beyond C6's mmap. The same tile may be fetched repeatedly acros
 |---------|---------|---------|
 | LightGlue (Python) | upstream HEAD pinned per Plan-phase | Single-pair matching for inlier count |
 | TensorRT | matches C7 | LightGlue inference engine reuse |
+| OpenCV (`cv2`) | `>=4.11.0.86,<4.12` (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred — see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`) | `OpenCvOrbExtractor` placeholder feature extraction (will be swapped to a TRT-backed DISK/ALIKED extractor in a follow-up) |

 **Error Handling Strategy**:
 - `RerankBackboneError`: LightGlue forward pass failed on one or more candidates. The candidate is dropped from the rerank set; if fewer than N=3 candidates survive, C2.5 returns whatever it has and C3 proceeds with reduced N.
@@ -78,6 +82,8 @@ No caching layer beyond C6's mmap. The same tile may be fetched repeatedly acros
 | Helper | Purpose | Used By |
 |--------|---------|---------|
 | `LightGlueRuntime` | shared LightGlue inference handle (one engine, many call sites) | C2.5, C3 |
+| `FeatureExtractor` (`helpers/feature_extractor.py`) | shared image → `KeypointSet` extractor; default `OpenCvOrbExtractor`, target TRT-backed DISK/ALIKED | C2.5, future C3 backbones |
+| `Clock` (`gps_denied_onboard.clock`) | composition-root time source; stamps `RerankResult.reranked_at` via `clock.monotonic_ns()` (Invariant 2 of the replay contract — no direct `time.*` in components) | every C* component |

 ## 7. Caveats & Edge Cases

@@ -6,6 +6,8 @@

 **Architectural Pattern**: Strategy — `CrossDomainMatcher` interface, with concrete implementations DISK+LightGlue (D-C3-1 = (a) primary), ALIKED+LightGlue (secondary), XFeat (alternate). Selection at startup by config (ADR-001); build-time gating by `BUILD_*` flags (ADR-002); composition-root wired (ADR-009).

+**Cycle-1 operational reality**: the airborne binary wires C3 through `_STRATEGY_REGISTRY` + `register_airborne_strategies()` (AZ-591) on top of the `BUILD_MATCHER_*` build-flag matrix (`disk_lightglue` / `aliked_lightglue` / `xfeat` — see `runtime_root/airborne_bootstrap.py::C3_MATCHER_BUILD_FLAGS`). The `c3_matcher` airborne slot registers **only `disk_lightglue` + `aliked_lightglue`** — `xfeat` has its build-flag wired but is parked as a Tier-2 follow-up (no airborne registry slot, no airborne `_C3_MATCHER_STRATEGIES` entry). Constructor injection flows through the `pre_constructed` dict passed to `compose_root(config, pre_constructed=...)` (AZ-618 umbrella → AZ-621 c3 helpers phase + AZ-622 LightGlue runtime builder + AZ-623 c7 inference phase). The `c3_matcher` slot lists `("c3_lightglue_runtime", "c282_ransac_filter", "c7_inference")` in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`; `clock` and `c13_fdr` are optional. Missing required keys raise `AirborneBootstrapError` at composition time, naming the consumer and missing key.
+
 **Upstream dependencies**:
 - C2.5 → `RerankResult` (top-N=3 candidates).
 - C7 InferenceRuntime → backbone forward pass.
@@ -79,7 +81,7 @@ No additional caching beyond C6.
 | LightGlue (Python) | upstream HEAD pinned per Plan-phase | Primary matcher; replaces SuperPoint+SuperGlue (Magic Leap noncommercial) |
 | ALIKED (Python) | upstream HEAD pinned per Plan-phase | Secondary feature extractor |
 | XFeat (Python) | upstream HEAD pinned per Plan-phase | Alternate (lightweight) feature+matcher |
-| OpenCV | ≥ 4.12.0 | RANSAC + reprojection residual computation |
+| OpenCV | `>=4.11.0.86,<4.12` (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred — see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`) | RANSAC + reprojection residual computation (consumed via the shared `RansacFilter` helper) |
 | TensorRT | matches C7 | Backbone engine compilation/runtime |

 **Error Handling Strategy**:
@@ -100,6 +102,9 @@ No additional caching beyond C6.
 - The cross-domain gap (nav-camera vs satellite tile) is the hardest step in the pipeline. Backbone choice depends on the deployment camera's spectral and resolution characteristics; the current default (DISK+LightGlue) is locked per Mode B Fact #110 / D-C3-1 = (a) pending IT-12 verdict.
 - D-C2-12 (DINOv2-feature-based matcher) is a carryforward research item that may displace DISK in a future cycle.

+**Cycle-1 Tier-2 follow-up dependencies**:
+- XFeat — the build flag (`BUILD_MATCHER_XFEAT`), the `matcher_factory._STRATEGY_TO_BUILD_FLAG` mapping, and the concrete `c3_matcher/xfeat.py` module are all in place, but `c3_matcher`'s `_C3_MATCHER_STRATEGIES` tuple in `runtime_root/airborne_bootstrap.py` registers only `disk_lightglue` + `aliked_lightglue`. Selecting `xfeat` via airborne config currently raises `StrategyNotLinkedError` from the `_STRATEGY_REGISTRY` lookup. Tier-2 follow-up: extend the airborne registration tuple and run airborne Jetson validation against Derkachi-class fixtures.
+
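To illustrate why a strategy can be built into the binary yet still be unselectable, here is a rough sketch of a registry lookup. Everything in it is an assumption made for illustration: the `(slot, strategy)` key shape, the `register_strategy` / `lookup_strategy` helpers, and the placeholder factory strings are hypothetical, and the real `_STRATEGY_REGISTRY` may be organised differently.

```python
# Hypothetical sketch of a strategy registry; not the project's implementation.
class StrategyNotLinkedError(LookupError):
    """Stand-in for the project's error type."""


_STRATEGY_REGISTRY = {}  # assumed shape: (slot, strategy) -> factory

# Mirrors the text: only two matchers are registered for the airborne slot.
_C3_MATCHER_STRATEGIES = ("disk_lightglue", "aliked_lightglue")


def register_strategy(slot: str, strategy: str, factory) -> None:
    _STRATEGY_REGISTRY[(slot, strategy)] = factory


def register_airborne_strategies_sketch() -> None:
    """Register placeholder factories for the airborne c3_matcher slot."""
    for name in _C3_MATCHER_STRATEGIES:
        register_strategy("c3_matcher", name, f"factory:{name}")


def lookup_strategy(slot: str, strategy: str):
    """Return the registered factory or raise if the strategy was never linked."""
    try:
        return _STRATEGY_REGISTRY[(slot, strategy)]
    except KeyError:
        raise StrategyNotLinkedError(
            f"{slot}: strategy {strategy!r} is not registered"
        ) from None
```

In this sketch `xfeat` fails at lookup time even though nothing about its module is broken, which matches the behaviour the bullet describes.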
 **Potential race conditions**:
 - Shared LightGlue runtime with C2.5; serial access from one ingest thread.

@@ -6,6 +6,8 @@

 **Architectural Pattern**: Strategy with two concrete implementations: `AdHoPRefiner` (real refinement) and `PassthroughRefiner` (no-op for the non-conditional baseline / smoke tests). Selection at startup by config (ADR-001); both implementations linked into the deployment binary by default (refinement is conditionally invoked at runtime, not gated at build time).

+**Cycle-1 operational reality**: the airborne binary wires C3.5 through `_STRATEGY_REGISTRY` + `register_airborne_strategies()` (AZ-591). The `c3_5_adhop` airborne slot's `_C3_5_ADHOP_STRATEGIES` tuple in `runtime_root/airborne_bootstrap.py` registers **only `adhop`** — `PassthroughRefiner` is linked into the binary (no `BUILD_REFINER_*` gate; see `C3_5RefinerConfig.KNOWN_STRATEGIES = {"adhop", "passthrough"}`) and is freely selectable in non-airborne composition (unit tests, IT-12 baseline), but selecting `strategy="passthrough"` via the airborne config currently raises `StrategyNotLinkedError` from the `_STRATEGY_REGISTRY` lookup. Constructor injection flows through the `pre_constructed` dict passed to `compose_root(config, pre_constructed=...)` (AZ-618 umbrella → AZ-621 c3 helpers phase + AZ-623 c7 inference phase). The `c3_5_adhop` slot lists `("c282_ransac_filter", "c7_inference")` in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`; `clock` and `c13_fdr` are optional. Missing required keys raise `AirborneBootstrapError` at composition time, naming the consumer and missing key. The `c282_ransac_filter` and `c3_lightglue_runtime` helper instances are identity-shared with C3 / C4 (R14 / AZ-282).
+
 **Upstream dependencies**:
 - C3 → `MatchResult`.
 - C7 InferenceRuntime — AdHoP backbone forward pass when invoked.
@@ -61,7 +63,7 @@ No additional caching.
 | Library | Version | Purpose |
 |---------|---------|---------|
 | OrthoLoC AdHoP (research code drop) | upstream HEAD pinned per Plan-phase | Conditional refinement |
-| OpenCV | ≥ 4.12.0 | Reprojection residual computation, perspective transforms |
+| OpenCV (via shared `RansacFilter` / reprojection helpers) | `>=4.11.0.86,<4.12` (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred — see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`) | Reprojection residual computation, perspective transforms, RANSAC re-filtering of AdHoP-preconditioned correspondences |
 | TensorRT | matches C7 | AdHoP backbone engine when invoked |

 **Error Handling Strategy**:
@@ -86,6 +88,9 @@ No additional caching.
 **Performance bottlenecks**:
 - AdHoP invocation is the variable cost in the F3 budget. NFT-PERF-01 measures the invocation rate; an invocation rate above ~30% suggests the threshold needs revisiting.

+**Cycle-1 Tier-2 follow-up dependencies**:
+- `PassthroughRefiner` — the module + `register()` hook + `C3_5RefinerConfig.KNOWN_STRATEGIES` entry are all in place, but `c3_5_adhop`'s `_C3_5_ADHOP_STRATEGIES` tuple in `runtime_root/airborne_bootstrap.py` registers only `adhop`. Selecting `strategy="passthrough"` via airborne config currently raises `StrategyNotLinkedError`. Tier-2 follow-up: extend the airborne registration tuple if the IT-12 baseline comparison or a smoke-test deployment needs the passthrough path on a flight binary (today it's available via unit-test composition only).
+
 ## 8. Dependency Graph

 **Must be implemented after**: C3 (input), C7 (inference runtime).
@@ -6,6 +6,10 @@

 **Architectural Pattern**: single concrete implementation `OpenCVGtsamPoseEstimator` behind the `PoseEstimator` interface. The pose estimator and the state estimator (C5) **share the GTSAM substrate**; the C4 factor is added directly to C5's iSAM2 graph rather than computed in isolation.

+**Cycle-1 operational reality**: the airborne binary wires C4 through `_STRATEGY_REGISTRY` + `register_airborne_strategies()` (AZ-591) with a single strategy slot (`opencv_gtsam` — `C4PoseConfig.KNOWN_POSE_STRATEGIES = {"opencv_gtsam"}`). Constructor injection flows through the `pre_constructed` dict passed to `compose_root(config, pre_constructed=...)` (AZ-618 umbrella → AZ-623 c5 helpers phase + AZ-625 eager iSAM2 handle phase). The `c4_pose` slot lists `("c282_ransac_filter", "c5_wgs_converter", "c5_se3_utils", "c5_isam2_graph_handle")` in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`; `c13_fdr` and `clock` are optional. The `c5_isam2_graph_handle` slot is the **shared GTSAM substrate seam** — `build_pre_constructed` eagerly invokes `build_state_estimator` once (AZ-625 / Phase E.5) so the (`StateEstimator`, `ISam2GraphHandle`) tuple is constructed BEFORE either the C4 or C5 wrapper runs (C4 runs first in topo order via `_C4_POSE_DEPENDS_ON = ("c1_vio", "c3_matcher")`, then C5 short-circuits on the prebuilt estimator via the internal `_c5_prebuilt_estimator` key). The cross-seam identity invariant (`c4_pose._isam2_handle is c5_state._isam2_handle`) is verified by AC-625.3. Missing required keys raise `AirborneBootstrapError` at composition time, naming the consumer and missing key.
+
+**Enabled flag (AZ-776 / ADR-012)**: `C4PoseConfig.enabled: bool = True` is the user-facing switch that controls C4's participation in the composition graph. Default ON preserves the ADR-003 steady-state airborne path. Setting `c4_pose.enabled = false` in YAML removes C4 from the component selection map at compose time — the wrapper never runs, the consumer never sees an iSAM2 handle, and `build_pre_constructed` omits `c5_isam2_graph_handle` from the `pre_constructed` dict. The flag exists to support the open-loop ESKF composition profile (the AZ-265 replay Tier-2 smoke baseline) where C5 runs as the `eskf` strategy with no factor graph for C4 to anchor against. `compose_root` enforces the 2×2 pairing matrix between `c4_pose.enabled` and `c5_state.strategy` at compose time and rejects the off-diagonal cells (`enabled=False` + `gtsam_isam2`, `enabled=True` + `eskf`) with a `CompositionError`. See ADR-012 (architecture.md) and Invariant 13 in `_docs/02_document/contracts/replay/replay_protocol.md`.
+
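The 2×2 pairing rule can be written down as a small validation function. This is a sketch only: `check_c4_c5_pairing` is a hypothetical name, the local `CompositionError` is a stand-in, and the real `compose_root` check may be structured differently.

```python
# Hypothetical sketch of the c4_pose.enabled / c5_state.strategy pairing rule.
class CompositionError(ValueError):
    """Stand-in for the project's error type."""


# Diagonal cells accepted by the rule described in the text.
_VALID_PAIRINGS = {
    (True, "gtsam_isam2"),  # steady-state airborne path (C4 anchors in the graph)
    (False, "eskf"),        # open-loop ESKF profile (no factor graph for C4)
}


def check_c4_c5_pairing(c4_enabled: bool, c5_strategy: str) -> None:
    """Reject the off-diagonal cells of the 2x2 matrix."""
    if (c4_enabled, c5_strategy) not in _VALID_PAIRINGS:
        raise CompositionError(
            f"c4_pose.enabled={c4_enabled} is incompatible with "
            f"c5_state.strategy={c5_strategy!r}"
        )
```

Keeping the accepted cells in one set makes the rule easy to audit against ADR-012 if the matrix ever grows.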
 **Upstream dependencies**:
 - C3.5 → `MatchResult` (refined or passthrough).
 - C5 StateEstimator — supplies the GTSAM iSAM2 handle so C4 can add its factor in-graph (architecture principle: shared substrate per ADR-003).
@@ -65,7 +69,7 @@ Stateless w.r.t. persistent storage; reads camera calibration once at constructi

 | Library | Version | Purpose |
 |---------|---------|---------|
-| OpenCV | ≥ 4.12.0 (CVE-2025-53644 mitigation) | `solvePnPRansac` with `SOLVEPNP_IPPE` flag; D-C4-1 = (b) |
+| OpenCV (`cv2`) | `>=4.11.0.86,<4.12` (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred — see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`) | `solvePnPRansac` with `SOLVEPNP_IPPE` flag in `opencv_gtsam_estimator.py`; D-C4-1 = (b) |
 | GTSAM (Python + C++) | per Plan-phase pin | `Marginals.marginalCovariance(pose_key)` for native 6×6 covariance |
 | Eigen | matches GTSAM | Lie-algebra math |

@@ -6,6 +6,8 @@

 **Architectural Pattern**: Strategy with two concrete implementations: `GtsamIsam2StateEstimator` (production-default) and `EskfStateEstimator` (mandatory simple-baseline). Selection at startup (ADR-001), `BUILD_*` gating (ADR-002), composition-root wired (ADR-009).

+**Cycle-1 operational reality**: the airborne binary wires C5 through `_STRATEGY_REGISTRY` + `register_airborne_strategies()` (AZ-591) on top of the `BUILD_STATE_*` build-flag matrix (`runtime_root/airborne_bootstrap.py::C5_STATE_BUILD_FLAGS = {"gtsam_isam2": "BUILD_STATE_GTSAM_ISAM2", "eskf": "BUILD_STATE_ESKF"}`). Both strategies appear in `_C5_STATE_STRATEGIES`; the `gtsam_isam2` flag defaults ON-when-unset and `eskf` defaults OFF-when-unset (mirrors `state_factory._STATE_BUILD_FLAGS`). Strategy registration is lazy: `_ensure_state_strategy_registered` imports the concrete module (`gtsam_isam2_estimator` or `eskf_baseline`) only when the configured strategy's `BUILD_STATE_*` flag is ON, so a binary configured for `eskf` never imports gtsam. Constructor injection flows through the `pre_constructed` dict passed to `compose_root(config, pre_constructed=...)` (AZ-618 umbrella → AZ-623 c5 helpers phase + AZ-625 eager `(estimator, handle)` pair phase). The `c5_state` slot lists `("c5_imu_preintegrator", "c5_se3_utils", "c5_wgs_converter", "c13_fdr")` in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`; `c6_tile_store`, `camera_calibration`, `flight_id`, and `companion_id` are optional (consumed only when `c5_state.orthorectifier.enabled` is True — see AZ-389 / § 7 below). Missing required keys raise `AirborneBootstrapError` at composition time, naming the consumer and missing key. The `c5_imu_preintegrator` is per-process-cached keyed by `config.runtime.camera_calibration_path` (AC-623.2) so two `build_pre_constructed` invocations return the SAME instance — protecting its bias / sample accumulator from a silent reset on re-invocation. AZ-625 short-circuit: `build_pre_constructed` eagerly invokes `build_state_estimator` and stashes the `StateEstimator` under the private `_c5_prebuilt_estimator` key; `_c5_state_wrapper` returns the prebuilt instance so `c4_pose._isam2_handle` and `c5_state._isam2_handle` reference ONE object across the C4 / C5 seam (AC-625.3). 
+AZ-687 replay-mode guard: when `config.mode == "replay"` and the minimal replay `Config` omits the `c5_state` block, the bootstrap skips the eager `(estimator, handle)` build to avoid forcing the gtsam import on the replay binary (the C5 wrapper itself never runs without the block).
+
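The per-process caching of the IMU preintegrator can be sketched as a dictionary keyed by the calibration path. The class and function names below are hypothetical stand-ins, and the real cache may live elsewhere or use a different key; the point is only that a second build call returns the same object rather than a fresh, empty accumulator.

```python
# Hypothetical sketch of a per-process cache keyed by calibration path.
_PREINTEGRATOR_CACHE = {}


class ImuPreintegratorSketch:
    """Stand-in with a sample accumulator that must survive re-invocation."""

    def __init__(self, calibration_path: str) -> None:
        self.calibration_path = calibration_path
        self.samples = []


def get_imu_preintegrator(calibration_path: str) -> ImuPreintegratorSketch:
    """Return the cached instance for this path, creating it on first use."""
    instance = _PREINTEGRATOR_CACHE.get(calibration_path)
    if instance is None:
        instance = ImuPreintegratorSketch(calibration_path)
        _PREINTEGRATOR_CACHE[calibration_path] = instance
    return instance
```

Because the second call hands back the first instance, any samples already accumulated remain visible to later callers.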
 **Upstream dependencies**:
 - C1 → `VioOutput` (relative pose + IMU bias).
 - C4 → `PoseEstimate` (absolute satellite-anchored pose); C4 adds factors directly to C5's iSAM2 graph (shared substrate).
@@ -88,6 +90,7 @@ C5 is bounded by design — no unbounded growth.
 | GTSAM (Python + C++) | per Plan-phase pin | iSAM2 + `CombinedImuFactor` + `BetweenFactorPose3` + `GenericProjectionFactorCal3DS2` + `Marginals` |
 | `gtsam_unstable.IncrementalFixedLagSmoother` | per Plan-phase pin | Bounded keyframe window (D-C5-3 K=10–20) |
 | Eigen | matches GTSAM | Lie-algebra math |
+| OpenCV (`cv2`, AZ-389 orthorectifier subsystem only) | `>=4.11.0.86,<4.12` (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred — see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`) | Imported by `_orthorectifier.py` for warp / JPEG encode when `c5_state.orthorectifier.enabled = True`; default OFF means the import is loaded but the cv2 code path is unreached in cycle-1 |

 **Error Handling Strategy**:
 - `StateEstimatorConfigError`: `set_takeoff_origin` called with a malformed `LatLonAlt` (out of WGS-84 bounds / non-finite) OR with non-positive / non-finite sigmas, OR re-called inside the cold-start window with conflicting args. `EstimatorAlreadyStartedError` (a `StateEstimatorConfigError` subclass): `set_takeoff_origin` called after the first `add_*` call sealed the cold-start window. Caller must surface to operator; takeoff blocked.
@@ -116,6 +119,10 @@ C5 is bounded by design — no unbounded growth.
 **Performance bottlenecks**:
 - `Marginals.marginalCovariance(pose_key)` is the per-frame hot spot. D-CROSS-LATENCY-1 hybrid degrades C4's covariance recovery (not C5's) under thermal throttle.

+**Cycle-1 Tier-2 follow-up dependencies**:
+- AZ-389 **orthorectifier wiring** — `_orthorectifier.py` + `OrthorectifierConfig` (`enabled`, `cov_norm_threshold`, `inlier_floor`, `tile_size_meters`, `tile_size_pixels`, `zoom_level`, `jpeg_quality`) are wired into `C5StateConfig` and `build_state_estimator`. The default is `enabled: bool = False`, which preserves the existing smoke-test wiring that does not provide a `TileStore` — when False the runtime root skips orthorectifier construction entirely. Production enablement is parked pending AZ-624: the airborne `pre_constructed` dict must populate `camera_calibration`, `flight_id`, `companion_id`, and `c6_tile_store` from the operator-supplied manifest / takeoff orchestrator before flipping `orthorectifier.enabled=True`; until AZ-624 lands those four pre-constructed slots, only the test fixture path (`tests/unit/c5_state/test_az389_*.py`) exercises the orthorectifier subsystem.
+- AZ-624 **operator-supplied flight metadata** — the `_c5_state_wrapper` and `_build_c5_state_estimator_pair` already accept `flight_id` and `companion_id` kwargs and forward them to `build_state_estimator`. In cycle-1 the airborne `build_pre_constructed` only seeds the `tile_store` slot from `_build_c6_tile_store(config)`; the other three (`camera_calibration`, `flight_id`, `companion_id`) are passed as `None`. Tier-2 follow-up: AZ-624 production `main()` wiring populates these from the manifest + takeoff orchestrator handshake (currently their `None` value means orthorectifier disablement is the only safe runtime state — see prior bullet).
+
 ## 8. Dependency Graph

 **Must be implemented after**: C1 (input), C4 (input + shared graph), C8 inbound (FC IMU prior).
@@ -6,6 +6,8 @@

 **Architectural Pattern**: Repository — three concrete stores behind separate interfaces (`TileStore` for pixel + metadata I/O; `TileMetadataStore` for the Postgres spatial index; `DescriptorIndex` for FAISS HNSW). Single concrete implementation per interface today (`PostgresFilesystemStore`, `FaissDescriptorIndex`); future variants (e.g., RocksDB-backed metadata for resource-constrained tiers) can be added behind the same interfaces.

+**Cycle-1 operational reality**: C6 is **infrastructure** — it does NOT have a `c6_tile_cache` slot in the `_STRATEGY_REGISTRY` populated by `register_airborne_strategies()` (AZ-591). Instead the airborne binary materialises C6's two consumer-facing handles via `runtime_root/airborne_bootstrap.py::build_pre_constructed`, which seeds `pre_constructed["c6_descriptor_index"]` (via `_build_c6_descriptor_index` → `storage_factory.build_descriptor_index`, gated by `BUILD_FAISS_INDEX` per `airborne_bootstrap.FAISS_BUILD_FLAG`) and `pre_constructed["c6_tile_store"]` (via `_build_c6_tile_store` → `storage_factory.build_tile_store`, no `BUILD_*` flag — always built when the c6 block is configured). `compose_root` then passes those instances to the downstream wrappers that list them in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS` (c2_vpr consumes `c6_descriptor_index`; c2_5_rerank consumes `c6_tile_store`; c5_state optionally consumes `c6_tile_store` for the AZ-389 orthorectifier path). Strategy slots are populated from `C6TileCacheConfig.KNOWN_*_RUNTIMES`: `{store,metadata}_runtime ∈ {"postgres_filesystem"}` and `descriptor_index_runtime ∈ {"faiss_hnsw"}` — only one concrete impl per slot in cycle-1; the field exists to keep the contract open for a future SQLite Tier-0 dev runtime. When `BUILD_FAISS_INDEX` is OFF and any configured downstream consumer still requires the descriptor index, `_build_c6_descriptor_index` re-raises the lower-level `RuntimeNotAvailableError` as an `AirborneBootstrapError` naming `c6_descriptor_index`, the gating flag, and the consuming component slug. AZ-687 replay-mode guard: when `config.mode == "replay"` and the minimal replay `Config` omits the `c6_tile_cache` block, `build_pre_constructed` skips both C6 seeds entirely — the only wrappers that read those slots (c2_vpr / c2_5_rerank / c5_state) also require their own component entries in `config.components`, which the minimal replay `Config` likewise omits, so the skipped slots are never read.
+
 **Upstream dependencies**:
 - C11 `TileDownloader` (writes `tiles` rows + JPEGs during F1 pre-flight provisioning, source='googlemaps').
 - C10 CacheProvisioner (writes Manifest + FAISS index during F1 pre-flight provisioning, after C11 has populated tiles).
@@ -119,7 +121,7 @@ Not applicable — internal-only; C11 `TileUploader` reads via `TileStore` for u
 |---------|---------|---------|
 | PostgreSQL (server + libpq) | 16.x (mirror of `satellite-provider`'s pin) | Spatial metadata index |
 | psycopg / asyncpg | per project pin | Python Postgres client |
-| FAISS (Python + C++) | upstream HEAD pinned per Plan-phase | HNSW retrieval |
+| faiss-cpu (PyPI wheel) | `>=1.7,<2.0` | HNSW retrieval (AZ-306 chose the upstream wheel over a custom pybind11 wrapper; runtime-gated by `BUILD_FAISS_INDEX` at the storage factory) |
 | atomicwrites | latest | Atomic file replacement for `.index` rebuild (D-C10-3) |
 | hashlib (stdlib) | stdlib | SHA-256 content-hash sidecars |

@@ -6,6 +6,8 @@

 **Architectural Pattern**: Strategy — `InferenceRuntime` interface with three concrete implementations: `TensorrtRuntime` (production-default per D-C7-9 JetPack 6.2 + TensorRT 10.3 lock), `OnnxTrtEpRuntime` (fallback), `PytorchFp16Runtime` (mandatory simple-baseline). Selection at startup by config (ADR-001), build-time gating by `BUILD_*` flags (ADR-002), composition-root wired (ADR-009).

+**Cycle-1 operational reality**: C7 is **infrastructure shared across consumers** — it does NOT have its own slot in the `_STRATEGY_REGISTRY` populated by `register_airborne_strategies()` (AZ-591). Instead the airborne binary builds the `InferenceRuntime` once via `runtime_root/airborne_bootstrap.py::_build_c7_inference` → `inference_factory.build_inference_runtime`, and seeds the single instance into `pre_constructed["c7_inference"]` (AZ-621 / Phase C). The same instance is reused as the engine source for the shared `LightGlueRuntime` load (AZ-622 / Phase D, `_build_c3_lightglue_runtime`), so the bootstrap never double-builds the runtime; downstream wrappers (c2_vpr / c3_matcher / c3_5_adhop, per `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`) then receive the identity-shared runtime via `compose_root`'s constructor injection. Airborne-buildable runtimes are gated by `C7_AIRBORNE_BUILD_FLAGS = (("tensorrt", "BUILD_TENSORRT_RUNTIME"), ("pytorch_fp16", "BUILD_PYTORCH_FP16_RUNTIME"))` — `tensorrt` is the production-default, `pytorch_fp16` is the Tier-0 / workstation fallback (and is the conservative `C7InferenceConfig.runtime` default so unconfigured test environments resolve to the Tier-0 baseline). `onnx_trt_ep` is **deliberately omitted** from the airborne flag matrix even though `inference_factory._RUNTIME_TO_BUILD_FLAG` recognises it — see § 7 Tier-2 follow-up. When no airborne runtime is buildable (both `BUILD_TENSORRT_RUNTIME` and `BUILD_PYTORCH_FP16_RUNTIME` OFF, or the configured runtime's flag is OFF) and any configured consumer still requires `c7_inference`, `_build_c7_inference` surfaces the upstream `RuntimeNotAvailableError` as an `AirborneBootstrapError` (AC-621.2) naming the missing key, BOTH airborne `BUILD_*` flags + their runtimes, and the consuming component slug(s) — narrowed to the configured consumers when available. 
+AZ-687 replay-mode guard: when `config.mode == "replay"` and the minimal replay `Config` omits the `c7_inference` block, `build_pre_constructed` skips both `c7_inference` AND the cascading `c3_lightglue_runtime` seed (the LightGlue runtime depends on the inference runtime); the c2_vpr / c3_matcher / c2_5_rerank / c3_5_adhop wrappers that would have consumed the runtime are likewise absent from the replay `Config` and therefore never look at the skipped slot.
+
 **Upstream dependencies**:
 - C10 CacheProvisioner → during F1 (after C11 `TileDownloader` has populated C6) triggers engine compilation when no cached engine matches the `(SM, JP, TRT, precision)` tuple.
 - F2 takeoff load → triggers `deserialize_cached_engine` for every model used by C1/C2/C2.5/C3/C3.5.
@@ -134,6 +136,9 @@ Not applicable.
 **Performance bottlenecks**:
 - Per-frame inference cost is the F3 hot path's largest contributor. NFT-PERF-01 partition is the source of truth.

+**Cycle-1 Tier-2 follow-up dependencies**:
+- `OnnxTrtEpRuntime` — the module + class are implemented and the lower-level `inference_factory._RUNTIME_TO_BUILD_FLAG` maps `"onnx_trt_ep" → "BUILD_ONNX_TRT_EP_RUNTIME"`, but the **airborne** `C7_AIRBORNE_BUILD_FLAGS` tuple in `runtime_root/airborne_bootstrap.py` deliberately omits it (research-only per the AZ-621 task spec). Setting `config.components['c7_inference'].runtime = "onnx_trt_ep"` on an airborne binary raises `AirborneBootstrapError` from `_build_c7_inference` whose message lists ONLY the two airborne flag options (tensorrt / pytorch_fp16) — operators see a clean recovery path instead of a research-build escape hatch. Tier-2 follow-up: extend `C7_AIRBORNE_BUILD_FLAGS` (and gate it on `BUILD_ONNX_TRT_EP_RUNTIME=ON`) only if a future deployment scenario justifies the ORT-TRT-EP path on a flight binary; until then the runtime is exercised via unit-test composition and ad-hoc workstation runs only.
+
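A rough sketch of how an airborne-only flag table yields an error that lists just the two supported runtimes. The `resolve_airborne_runtime` helper is hypothetical, build flags are passed in as a plain mapping (how the real binary reads them is not stated here), and treating an unset flag as OFF is an assumption of this sketch.

```python
# Hypothetical sketch; only the tuple contents come from the documentation.
class AirborneBootstrapError(RuntimeError):
    """Stand-in for the project's error type."""


C7_AIRBORNE_BUILD_FLAGS = (
    ("tensorrt", "BUILD_TENSORRT_RUNTIME"),
    ("pytorch_fp16", "BUILD_PYTORCH_FP16_RUNTIME"),
)


def resolve_airborne_runtime(runtime: str, flags: dict) -> str:
    """Return `runtime` if it is airborne-buildable, else raise with the options."""
    for name, flag in C7_AIRBORNE_BUILD_FLAGS:
        # Assumption: a flag missing from `flags` counts as OFF.
        if name == runtime and flags.get(flag, False):
            return name
    options = ", ".join(f"{name} ({flag})" for name, flag in C7_AIRBORNE_BUILD_FLAGS)
    raise AirborneBootstrapError(
        f"c7_inference: runtime {runtime!r} is not buildable on the airborne "
        f"binary; options: {options}"
    )
```

Since `onnx_trt_ep` never appears in the airborne tuple, its own build flag is not consulted and is not mentioned in the error message.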
 ## 8. Dependency Graph

 **Must be implemented after**: nothing internal — C7 is foundational.
@@ -6,6 +6,8 @@

 **Architectural Pattern**: Strategy — `FcAdapter` interface with two concrete implementations: `PymavlinkArdupilotAdapter`, `Msp2InavAdapter`. Plus a `GcsAdapter` (single concrete `QgcTelemetryAdapter` today). All selected at startup by config (ADR-001), build-time gating per `BUILD_*` flags (ADR-002, both adapters typically linked into the deployment binary so a single image can target both FCs by configuration), composition-root wired (ADR-009).

+**Cycle-1 operational reality**: C8 is composed via a **separate registry path** from the rest of the strategy-selecting components — there is no `c8_fc_adapter` slot in the central `_STRATEGY_REGISTRY` populated by `register_airborne_strategies()` (AZ-591), and no `c8_*` row in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`. Instead `runtime_root/fc_factory.py` owns two private registries — `_FC_REGISTRY` and `_GCS_REGISTRY` — and exposes `register_fc_adapter()` / `register_gcs_adapter()` so per-binary bootstrap modules (one per `BUILD_FC_<VARIANT>` / `BUILD_GCS_<VARIANT>` combination) can lazy-link the concrete adapter classes ahead of `build_fc_adapter()` / `build_gcs_adapter()`. Both calls run the build-flag gate against `_FC_BUILD_FLAGS = {"ardupilot_plane": "BUILD_FC_ARDUPILOT_PLANE", "inav": "BUILD_FC_INAV"}` and `_GCS_BUILD_FLAGS = {"qgc_mavlink": "BUILD_GCS_QGC_MAVLINK"}` and surface an OFF flag as `FcAdapterConfigError` / `GcsAdapterConfigError` naming the disabled flag (AC-4). The outbound single-writer invariant is enforced by `bind_outbound_emit_thread()` — called once per process before wiring outbound emit, a second call from any other thread raises `OutboundThreadAlreadyBoundError` (AC-6 / Invariant 8). Unit-test isolation uses `clear_strategy_registries()` + `clear_outbound_thread_binding()`. C8 takes **no** infrastructure dependency on `pre_constructed`; its `**deps` kwargs are populated by `compose_root`'s C5 / C13 wiring path (per-binary `main()` constructs the concrete adapter with the FC's port config and the C5 estimator's outbound `EstimatorOutput` stream).
+
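The outbound single-writer invariant can be approximated with a lock-protected thread identity. This is a sketch, not the `fc_factory.py` implementation: the module-level state is hypothetical, and allowing a repeat call from the already-bound thread is an assumption (the text only states that a call from another thread raises).

```python
# Hypothetical sketch of a bind-once outbound thread guard.
import threading


class OutboundThreadAlreadyBoundError(RuntimeError):
    """Stand-in for the project's error type."""


_bind_lock = threading.Lock()
_bound_thread_id = None


def bind_outbound_emit_thread() -> None:
    """Bind the calling thread as the sole outbound writer."""
    global _bound_thread_id
    caller = threading.get_ident()
    with _bind_lock:
        # Assumption: re-binding from the same thread is a harmless no-op.
        if _bound_thread_id is not None and _bound_thread_id != caller:
            raise OutboundThreadAlreadyBoundError(
                "outbound emit is already bound to another thread"
            )
        _bound_thread_id = caller


def clear_outbound_thread_binding() -> None:
    """Reset the binding; intended for unit-test isolation only."""
    global _bound_thread_id
    with _bind_lock:
        _bound_thread_id = None
```

The lock makes the check-and-set atomic, so two threads racing to bind cannot both succeed.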
 **Upstream dependencies**:
 - C5 StateEstimator → `EstimatorOutput` (5 Hz periodic emit driver).
 - Hardware: UART/USB to FC; UART (or USB) to GCS (often shared or via FC mavlink-routing).
@@ -14,7 +16,7 @@
 - C5 StateEstimator (consumes `ImuWindow`, `AttitudeWindow`, `GpsHealth`, `FlightStateSignal`).
 - C1 VIO (consumes `ImuWindow`).
 - C13 FDR (consumes raw inbound + emitted outbound MAVLink/MSP2 streams; signing key rotation events; spoof-promotion events).
-- C11 `TileUploader` (consumes `FlightStateSignal == ON_GROUND` confirmation; runs on a different process / image, so the signal flows out-of-band via the FDR or a small bus the operator tool subscribes to post-flight). The C11 `TileDownloader` does NOT depend on `FlightStateSignal` — it runs pre-flight when the companion is plugged into the operator workstation.
+- C11 `TileUploader` does **NOT** depend on `FlightStateSignal` after Batch 44 (SRP refactor — the post-landing safety gate now lives in C12's `PostLandingUploadOrchestrator`, which reads the `flight_footer` FDR record's `clean_shutdown` field, not the live `FlightStateSignal`). The C11 `TileDownloader` likewise does not depend on `FlightStateSignal` — it runs pre-flight when the companion is plugged into the operator workstation.

 ## 2. Internal Interfaces

@@ -8,6 +8,8 @@

 **Architectural Pattern**: Coordinator — single concrete implementation `CacheProvisioner` behind two interfaces (`CacheProvisioner` for the F1 build phase, `ManifestVerifier` for F2's content-hash gate). The interfaces are split because F2 only needs the verifier and shouldn't pull in the full provisioning code path.

+**Cycle-1 operational reality**: C10 is **operator-side / cross-tier infrastructure**, NOT an airborne strategy slot — it does not appear in `_AIRBORNE_REGISTRATIONS` and `register_airborne_strategies()` (AZ-591) never registers it; equivalently it has no row in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`. The operator binary composes C10 via `runtime_root/c10_factory.py`, which exposes six tiny per-service factories (`build_engine_compiler`, `build_backbone_specs`, `build_manifest_builder`, `build_manifest_verifier`, `build_descriptor_batcher`, `build_cache_provisioner`) that the CLI wires directly. The factory reuses the C7 `InferenceRuntime` via `inference_factory.build_inference_runtime` for the engine-compile path (honouring `BUILD_TENSORRT_RUNTIME` / `BUILD_PYTORCH_FP16_RUNTIME`) and threads `Sha256Sidecar`, `Ed25519ManifestSigner`, and a structured logger explicitly — no global registry. The AZ-323 `ManifestBuilder` reads `config.components['c10_provisioning'].manifest` (`C10ManifestConfig`: `signing_mode ∈ {operator, dev}`, `allowed_operator_fingerprints`, `schema_version="1.1"`); operator-mode signs only with an allowlisted Ed25519 key fingerprint, dev-mode warns when an allowlisted key is used. AZ-324's `ManifestVerifierImpl` has two modes selected by `with_tile_store`: `False` (airborne C5 path, MV-INV-5: trust the Ed25519 signature + recorded `tiles_coverage_sha256`) and `True` (operator C12 path: re-derive the aggregate from C6 and report drift) — wired in `build_manifest_verifier` and never silently flipping. The AZ-507 cross-component cut keeps C10 from importing C6 directly: `c10_factory.py` owns three composition-root adapters (`c6_tile_metadata_store_to_tiles_query`, `c6_tile_store_to_pixel_opener`, `c6_descriptor_index_to_rebuilder`) that translate C6's DTOs into C10's narrow `TileHashRecord` / `TileBboxRecord` / `TilePixelOpener` / `DescriptorIndexRebuilder` cuts. 
+AZ-687 replay-mode guard does not apply to C10 — replay-mode binaries are airborne-only and never invoke the C10 build path.
+
 **Upstream dependencies**:

 - C12 OperatorTooling → triggers `build_cache_artifacts(...)` after C11 `TileDownloader` has populated C6.
@@ -100,6 +102,24 @@ C10 reads `tiles` rows from C6 (scoped to the build's bbox + zoom_levels), write
 | atomicwrites | latest | Atomic file replacement for `.index` + Manifest (D-C10-3) |
 | hashlib (stdlib) | stdlib | SHA-256 content-hash sidecars |
 | PyYAML / orjson | per project pin | Manifest serialization |
+| numpy | per project pin | Descriptor batch ndarray container (AZ-322 `DescriptorBatcher`) |
+
+**AZ-322 internal phase — `DescriptorBatcher`**:
+
+The `populate_descriptors` phase walks every tile in C6 for the requested
+`(bbox, zoom_levels, sector_class)`, embeds them through C7's `InferenceRuntime`
+(via `C7EngineBackboneEmbedder`, the default `BackboneEmbedder` impl), and
+hands the resulting `(N, descriptor_dim)` ndarray to AZ-306's
+`DescriptorIndex.rebuild_from_descriptors` for atomic FAISS index write.
+CUDA OOM is handled via halve-and-retry bounded by `C10BatcherConfig.max_oom_retries`
+(default 1: 64 → 32, then succeed-or-fail-fast) so a real GPU regression
+surfaces in seconds rather than via silent retries. Per-10% progress is
+emitted both as DEBUG logs (`c10.descriptor.progress`) and via an optional
+`progress_callback` so operator tooling can wire a TTY/GUI bar without
+touching the batcher itself. The descriptor int64 id formula is the
+canonical AZ-306 scheme (`int.from_bytes(sha256("zoom|lat|lon").first8, "big", signed=True)`);
+a formula invented locally to avoid a circular dependency back into C6 internals
+would break AC-6.
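
Two pieces of the phase above lend themselves to a short sketch: the id derivation and the halve-and-retry loop. Both functions are hypothetical illustrations. The exact string formatting of `zoom|lat|lon` is an assumption (the canonical AZ-306 encoding is not reproduced here), `OutOfMemorySketch` stands in for a real CUDA OOM exception, and restarting the whole pass after an OOM is a simplification.

```python
# Hypothetical sketch of the id formula and the bounded halve-and-retry loop.
import hashlib


def descriptor_id(zoom: int, lat: float, lon: float) -> int:
    """Signed 64-bit id from the first 8 bytes of a SHA-256 digest."""
    # Assumption: plain str() formatting of the three fields joined by "|".
    key = f"{zoom}|{lat}|{lon}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big", signed=True)


class OutOfMemorySketch(RuntimeError):
    """Stand-in for a CUDA out-of-memory error."""


def embed_with_oom_retry(tiles, embed_batch, batch_size=64, max_oom_retries=1):
    """Embed all tiles, halving the batch size on OOM up to the retry bound."""
    retries = 0
    while True:
        try:
            out = []
            for start in range(0, len(tiles), batch_size):
                out.extend(embed_batch(tiles[start:start + batch_size]))
            return out
        except OutOfMemorySketch:
            if retries >= max_oom_retries or batch_size <= 1:
                raise  # fail fast once the retry budget is spent
            retries += 1
            batch_size //= 2  # e.g. 64 -> 32 with the default budget of 1
```

With the default budget of one retry, a persistent OOM surfaces after the second attempt instead of looping through ever smaller batches.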

 **Error Handling Strategy**:

@@ -127,7 +147,7 @@ C10 reads `tiles` rows from C6 (scoped to the build's bbox + zoom_levels), write

 **Potential race conditions**:

-- Concurrent `build_cache_artifacts` invocations on the same cache root would corrupt state. Single-process operator-tool wraps with a filesystem lockfile (the same lockfile C11 honours); if a second invocation tries to start, fail with explicit error.
+- Concurrent `build_cache_artifacts` invocations on the same cache root would corrupt state. Single-process operator-orchestrator wraps with a filesystem lockfile (the same lockfile C11 honours); if a second invocation tries to start, fail with explicit error.
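
A filesystem lockfile of the kind mentioned above can be sketched with an exclusive-create open. The lock file name, the `cache_root_lock` helper, and the error class are all hypothetical; the real orchestrator may use a different mechanism, and this sketch does not handle stale locks left behind by a crashed process.

```python
# Hypothetical sketch of a cache-root lockfile using exclusive file creation.
import os
from contextlib import contextmanager


class CacheRootLockedError(RuntimeError):
    """Raised when another invocation already holds the lock."""


@contextmanager
def cache_root_lock(cache_root: str, name: str = ".build.lock"):
    """Hold an exclusive lock file under `cache_root` for the with-block."""
    path = os.path.join(cache_root, name)
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise CacheRootLockedError(f"another build holds {path}") from None
    try:
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield path
    finally:
        os.close(fd)
        os.unlink(path)
```

A second invocation fails immediately with an explicit error rather than waiting, which matches the fail-fast behaviour the bullet asks for.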

 **Performance bottlenecks**:

@@ -2,21 +2,32 @@

 ## 1. High-Level Overview

-**Purpose**: own the operator-side network I/O against `satellite-provider` for the onboard tile corpus, in **both directions**:
+**Purpose**: own the operator-side network I/O against `satellite-provider` for the onboard tile corpus, in **three directions**:

+- **Route seed** (pre-flight, F1, route-driven variant — Cycle 3 / Epic AZ-835): submit a tlog-derived `RouteSpec` (waypoints + per-waypoint coverage radius, produced by `replay_input.tlog_route.extract_route_from_tlog` — AZ-836) to `satellite-provider`'s Route API and poll until corridor tile materialisation completes. Lets the operator pre-commit the cache to where the drone actually flew rather than a bounding box.
 - **Download** (pre-flight, F1): fetch tiles from `satellite-provider` for the operational area, apply AC-NEW-6 freshness gating, and write into C6 (`TileStore` + `TileMetadataStore`). C11 is the **only** path that crosses the workstation/companion enclave to the parent suite for tile pixels — C10 reads from the populated C6 store and never touches `satellite-provider` itself.
-- **Upload** (post-landing, F10): when `flight_state == ON_GROUND` is confirmed, read pending mid-flight tiles from C6 and POST to `satellite-provider`'s ingest endpoint (D-PROJ-2 contract sketch).
+- **Upload** (post-landing, F10): read pending mid-flight tiles from C6 and POST to `satellite-provider`'s ingest endpoint (D-PROJ-2 contract sketch). C11 itself does NOT gate on flight state — it is a dumb pipe; the post-landing safety gate is owned by C12's `PostLandingUploadOrchestrator` (AZ-329 / Batch 44), which checks the C13 `flight_footer` FDR record for `clean_shutdown=True` before invoking `TileUploader.upload_pending_tiles`.
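
The split between the gate and the "dumb pipe" in the upload direction can be sketched as follows. The function name and the dictionary shape of the footer record are assumptions for illustration; the real `PostLandingUploadOrchestrator` and the `flight_footer` FDR record are not reproduced here.

```python
# Hypothetical sketch: the orchestrator owns the safety gate, the uploader does not.
def upload_if_clean_shutdown(flight_footer: dict, upload_pending_tiles) -> bool:
    """Invoke the uploader only when the footer records a clean shutdown.

    `flight_footer` is assumed to be a dict-like view of the FDR record;
    `upload_pending_tiles` is any zero-argument callable.
    """
    if flight_footer.get("clean_shutdown") is not True:
        return False  # gate closed: the uploader is never called
    upload_pending_tiles()
    return True
```

Passing the uploader in as a callable keeps the gate testable without any network client.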

-C11 is a **separate operator-side binary / image**. The airborne companion image's CMake target deliberately excludes the entire `c11_tilemanager/` source tree so the airborne process cannot accidentally execute either the download path or the upload path even via reflection or config error (ADR-004 process-level isolation, AC-8.4). Both directions of tile I/O are operator-driven on the operator workstation; the companion only consumes the populated C6 store while airborne.
+C11 is a **separate operator-side binary / image**. The airborne companion image's CMake target deliberately excludes the entire `c11_tile_manager/` source tree so the airborne process cannot accidentally execute the seed path, the download path, or the upload path even via reflection or config error (ADR-004 process-level isolation, AC-8.4). All three directions of tile I/O are operator-driven on the operator workstation; the companion only consumes the populated C6 store while airborne.

-**Architectural Pattern**: Pipeline behind two interfaces (`TileDownloader`, `TileUploader`) under one component, consistent with C8's multi-interface shape (FC-AP, FC-iNav, GCS adapters under one component). The two interfaces are bundled into C11 because they share auth (TLS + service-internal API key for download, per-flight onboard signing key for upload), HTTP client, network configuration, deployment unit (operator-tooling tarball), and the airborne-exclusion property — splitting them into two components would duplicate all of that. They are kept as **two interfaces** so SRP is preserved at the call-site level: C12 binds `TileDownloader` for the F1 cache-build workflow, `TileUploader` for the F10 post-landing trigger; neither is forced to depend on the other.
+**Architectural Pattern**: Pipeline behind three interfaces (`SatelliteProviderRouteClient`, `TileDownloader`, `TileUploader`) under one component, consistent with C8's multi-interface shape (FC-AP, FC-iNav, GCS adapters under one component). The three interfaces are bundled into C11 because they share auth (JWT Bearer + optional TLS-insecure flag for dev self-signed certs across all three; the upload direction additionally signs each tile with the per-flight onboard signing key), HTTP client (`httpx`), network configuration, deployment unit (operator-tooling tarball), and the airborne-exclusion property — splitting them into separate components would duplicate all of that. They are kept as **three interfaces** so SRP is preserved at the call-site level: C12 binds `SatelliteProviderRouteClient.seed_route` to materialise the corridor cache from a tlog (cycle 3 e2e fixture today; planned C12 production path), `TileDownloader.download_tiles_for_area` for the F1 bbox-driven cache-build workflow, `TileUploader.upload_pending_tiles` for the F10 post-landing trigger; none is forced to depend on the others.
+
+**Cycle-1 operational reality**: C11 is **operator-workstation-only**, NOT an airborne strategy slot — there is no `c11_tile_manager` slot in `_AIRBORNE_REGISTRATIONS`, no row in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`, and the airborne companion image's build target deliberately excludes the entire `c11_tile_manager/` source tree (ADR-004 process-level isolation; AC-8.4). The operator binary composes C11 via `runtime_root/c11_factory.py`, which exposes three tiny per-service factories — `build_per_flight_key_manager` (AZ-318), `build_tile_uploader` (AZ-319 + AZ-320), and `build_tile_downloader` (AZ-316) — each called explicitly by C12's CLI; no central registry. FDR wiring goes through the per-producer `make_fdr_client` cache: AZ-318 `PerFlightKeyManager` defaults to `make_fdr_client("c11_tile_manager.signing_key", config)`, AZ-319 `HttpTileUploader` to `make_fdr_client("c11_tile_manager.tile_uploader", config)` — both distinct from the airborne `"airborne_main"` producer, so the operator-workstation process gets its own per-component FdrClient instances rather than sharing the airborne singleton. AZ-320's `IdempotentRetryTileUploader` decorator wraps `HttpTileUploader` by default (per-call + per-tile bounded retry); `config.components['c11_tile_manager'].disable_retry_decorator = True` suppresses the wrap for low-level debugging or test wiring that needs to observe the inner uploader. The AZ-507 cross-component cut keeps C11 from importing C6 directly: `tile_store` / `tile_metadata_store` are passed in by the operator-binary composition root as consumer-side cuts; `http_client` (an `httpx.Client`) is also caller-owned so tests can swap in `httpx.MockTransport`. AZ-687 replay-mode guard does not apply — C11 has no airborne footprint.
+
+**Cycle-3 operational reality (AZ-777 Phase 1 + Epic AZ-835)**: the e2e harness now wires the e2e-runner against the **real** parent-suite `satellite-provider` .NET service in `docker-compose.test.jetson.yml` (lineage AZ-688 / AZ-691 / AZ-692; tier-1 `docker-compose.test.yml` deprecated 2026-05-20). Two consequences cascaded into C11:
+
+- **`TileDownloader` contract adaptation (AZ-777 Phase 1)** — `HttpTileDownloader._INVENTORY_PATH = "/api/satellite/tiles/inventory"` (POST, bulk lookup by (z,x,y)) and `HttpTileDownloader._TILES_PATH = "/tiles"` (GET, slippy-map fetch via `/tiles/{z}/{x}/{y}`). Previously documented as `GET /api/satellite/tiles?bbox=…&zoom=…`; the real `satellite-provider` API surface uses the inventory + slippy-map split per `tile-inventory.md` v1.0.0 (AZ-505). The bbox-driven `download_tiles_for_area` entry point and its `DownloadRequest` / `DownloadBatchReport` DTOs are unchanged at the call-site level; the contract adaptation is internal. Because the inventory response does not carry a `Content-Length` hint, AZ-308's pre-write budget check uses `_DEFAULT_ESTIMATED_TILE_BYTES = 50 000` (conservative over-reserve; typical 256×256 JPEG basemap tile is 8–80 KiB). Auth is `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert.
+- **Third interface — `SatelliteProviderRouteClient` (AZ-838 / Epic AZ-835 C2)** — `seed_route(spec: RouteSpec) -> RouteSeedResult` POSTs the spec to `POST /api/satellite/route` (`requestMaps=true`, `createTilesZip=false`), polls `GET /api/satellite/route/{id}` until `mapsReady=true` (or a terminal-failure status), then verifies coverage via `POST /api/satellite/tiles/inventory`. Pre-emptively enforces AZ-809's `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges) so obviously-bad input fails before the HTTP POST. Default cadence: `poll_interval_s = 5.0`, `poll_max_attempts = 60`, `request_timeout_s = 30.0`. Errors form a dedicated hierarchy (`RouteValidationError` 4xx + RFC 7807 ProblemDetails; `RouteTransientError` 5xx / network / timeout with `__cause__` set; `RouteTerminalFailureError` for non-success terminal status) rooted at `SatelliteProviderRouteError` — independent of `TileManagerError` because the Route API is a corridor-onboarding flow, not a per-tile transfer.
+
+The route-driven path is exercised today by `tests/e2e/replay/conftest.py::operator_pre_flight_setup` (AZ-839 — replaces the cycle-1 `mkdir` placeholder; yields a `PopulatedC6Cache` dataclass) and `tests/e2e/replay/test_az835_e2e_real_flight.py` (AZ-840 — single test that takes only `(tlog, video, calibration)` and runs the full 7-step pipeline). The C12 production CLI binding for the route path is a future-cycle integration; today's C12 still drives only `download_tiles_for_area` for production pre-flight cache builds.
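
The pre-emptive bounds check described for `seed_route` can be sketched as below. This is a minimal illustration, not the actual `c11_tile_manager.route_client` code: the function name `validate_route_request`, its argument shapes, the `field` / `reason` attributes, and the latitude/longitude limits (±90 / ±180) are assumptions; only the `points`, `regionSizeMeters`, and `zoomLevel` bounds and the two error class names come from the text above.

```python
class SatelliteProviderRouteError(Exception):
    """Root of the route-seed error hierarchy (name from the text above)."""


class RouteValidationError(SatelliteProviderRouteError):
    """Raised before any HTTP call when the input violates a validator bound."""

    def __init__(self, field: str, reason: str) -> None:
        super().__init__(f"{field}: {reason}")
        self.field = field      # hypothetical attribute, for illustration
        self.reason = reason    # hypothetical attribute, for illustration


# Bounds quoted above from AZ-809's CreateRouteRequestValidator.
MIN_POINTS, MAX_POINTS = 2, 500
MIN_REGION_M, MAX_REGION_M = 100.0, 10_000.0
MIN_ZOOM, MAX_ZOOM = 0, 22


def validate_route_request(points, region_size_meters, zoom_level) -> None:
    """Reject obviously-bad input locally; returns None when everything is in range."""
    if not MIN_POINTS <= len(points) <= MAX_POINTS:
        raise RouteValidationError(
            "points", f"count {len(points)} outside {MIN_POINTS}..{MAX_POINTS}"
        )
    for index, (lat, lon) in enumerate(points):
        # Assumed ranges: the source only says "lat/lon ranges".
        if not -90.0 <= lat <= 90.0:
            raise RouteValidationError(f"points[{index}].lat", f"{lat} outside -90..90")
        if not -180.0 <= lon <= 180.0:
            raise RouteValidationError(f"points[{index}].lon", f"{lon} outside -180..180")
    if not MIN_REGION_M <= region_size_meters <= MAX_REGION_M:
        raise RouteValidationError(
            "regionSizeMeters", f"{region_size_meters} outside 100..10000"
        )
    if not MIN_ZOOM <= zoom_level <= MAX_ZOOM:
        raise RouteValidationError("zoomLevel", f"{zoom_level} outside 0..22")
```

Checking locally first means a one-waypoint route or an out-of-range zoom never costs a round trip to `satellite-provider`; the service-side validator remains the authority for anything this sketch does not cover.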

 **Upstream dependencies**:

- C12 OperatorTooling → invokes `TileDownloader.download_tiles_for_area(...)` during F1 and `TileUploader.upload_pending_tiles(...)` post-landing.
+- C12 OperatorTooling → invokes `TileDownloader.download_tiles_for_area(...)` during F1 and `TileUploader.upload_pending_tiles(...)` post-landing. (Cycle-3 e2e fixtures also drive `SatelliteProviderRouteClient.seed_route(...)` for the route-driven F1 variant; C12 production binding for the route path is a future cycle.)
 - C6 TileStore + TileMetadataStore → write target during download (`source = googlemaps`); read source during upload (`source = onboard_ingest`, `voting_status = pending`).
+- `replay_input.tlog_route.RouteSpec` (AZ-836; `_types/route.py` canonical home per AZ-845) → input DTO to `SatelliteProviderRouteClient.seed_route`.
 - Operator workstation OS → invocation entry point (CLI / tray app, owned by C12).
- `satellite-provider` (external) → `GET /api/satellite/tiles?bbox=…&zoom=…` for download; `POST /api/satellite/tiles/ingest` for upload (D-PROJ-2 design task #1, **planned, not yet implemented service-side**).
+- `satellite-provider` (external) → for download: `POST /api/satellite/tiles/inventory` (bulk lookup by (z,x,y)) + `GET /tiles/{z}/{x}/{y}` (slippy-map fetch, per `tile-inventory.md` v1.0.0 / AZ-505); for route seeding: `POST /api/satellite/route` + `GET /api/satellite/route/{id}` (per `CreateRouteRequest.cs` DTO + AZ-809 validator); for upload: `POST /api/satellite/tiles/ingest` (D-PROJ-2 design task #1, **planned, not yet implemented service-side**).

 **Downstream consumers**:

@@ -25,6 +36,12 @@ C11 is a **separate operator-side binary / image**. The airborne companion image

 ## 2. Internal Interfaces

+### Interface: `SatelliteProviderRouteClient` (cycle 3 — AZ-838 / Epic AZ-835 C2)
+
+| Method | Input | Output | Async | Error Types |
+|--------|-------|--------|-------|-------------|
+| `seed_route` | `RouteSpec` (from `_types/route.py`; `name: str \| None` optional) | `RouteSeedResult` | No (poll loop; seconds–minutes) | `RouteValidationError`, `RouteTransientError`, `RouteTerminalFailureError` (all under `SatelliteProviderRouteError`) |
+
 ### Interface: `TileDownloader`

 | Method | Input | Output | Async | Error Types |
@@ -36,13 +53,29 @@ C11 is a **separate operator-side binary / image**. The airborne companion image

 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
-| `confirm_flight_state` | `()` | `FlightStateSignal` (must be ON_GROUND) | No | `FlightStateNotOnGroundError` |
 | `enumerate_pending_tiles` | `flight_id: uuid (optional)` | `list[TileMetadata]` | No | `TileMetadataError` |
 | `upload_pending_tiles` | `UploadRequest` | `UploadBatchReport` | No | `SatelliteProviderError`, `RateLimitedError`, `SignatureRejectedError` |

+C11 no longer exposes `confirm_flight_state` — the post-landing flight-state gate moved to C12 (`PostLandingUploadOrchestrator`, AZ-329) per Batch 44. `FlightStateNotOnGroundError` is retired from C11; the corresponding refusal now lives at the C12 boundary as `FlightStateNotConfirmedError`.
+
 **Input/Output DTOs**:

 ```
+RouteSpec (cycle 3 — _types/route.py, produced by replay_input/tlog_route.py):
+  waypoints:                       tuple[tuple[float, float], ...]   # (lat, lon), 1..max_waypoints
+  suggested_region_size_meters:    float                              # per-waypoint coverage radius
+  source_tlog:                     Path                               # provenance
+  source_segment:                  tuple[int, int]                    # (start_idx, end_idx) into tlog GPS rows
+  total_distance_meters:           float                              # along-track distance of active segment
+
+RouteSeedResult (cycle 3 — c11_tile_manager.route_client):
+  route_id:                        uuid
+  terminal_status:                 string
+  maps_ready:                      bool
+  tile_count:                      int
+  elapsed_ms:                      int
+  submitted_payload_sha256:        string
+
 DownloadRequest:
  bbox:                       BoundingBox (lat_min, lon_min, lat_max, lon_max)
  zoom_levels:                list[int]
@@ -65,8 +98,6 @@ UploadRequest:
  batch_size:                 int
  satellite_provider_url:     URL

-FlightStateSignal:                see C8 — must be ON_GROUND for any upload to proceed
-
 UploadBatchReport:
  batch_uuid:                       uuid (assigned by satellite-provider per D-PROJ-2 contract)
  per_tile_status:                  list[(tile_id, status: enum {queued, rejected, duplicate, superseded})]
@@ -77,17 +108,25 @@ UploadBatchReport:

 ## 3. External API Specification

-C11 is a **client** of `satellite-provider`'s REST surface in both directions.
+C11 is a **client** of `satellite-provider`'s REST surface in three directions.

-### 3.1 Download — read path (existing `satellite-provider` API)
+### 3.1 Route seed — corridor materialisation (cycle 3 — AZ-838 / Epic AZ-835 C2)

 | Endpoint | Method | Auth | Rate Limit | Description |
 |----------|--------|------|------------|-------------|
-| `/api/satellite/tiles?bbox=…&zoom=…` | GET | TLS + service-internal API key | parent-suite enforces | Paged tile blobs + metadata for a bounding box at the given zoom level(s). |
+| `/api/satellite/route` | POST | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) + optional dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` | parent-suite enforces | Submit a `RouteSpec` (waypoints + region size + zoom level). Body shape per `CreateRouteRequest.cs` / `RoutePoint.cs` (`lat` / `lon` JSON property names) / `GeoPoint.cs` DTOs. Query: `requestMaps=true&createTilesZip=false`. Validated pre-emptively against AZ-809 `CreateRouteRequestValidator` rules. |
+| `/api/satellite/route/{id}` | GET | same as above | parent-suite enforces | Poll route processing status. Returns `mapsReady: bool` + a `status` string. Terminal-success: `mapsReady=true`. Terminal-failure: `status ∈ {failed, error, rejected}`. Default cadence: 5 s × ≤ 60 attempts. |
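
The polling contract in the table above can be sketched as a small loop. This is an illustration under stated assumptions, not the real route client: `poll_until_ready` and the injected `fetch_status` / `sleep` callables are hypothetical, status strings are lower-cased before comparison as a guess, and raising the built-in `TimeoutError` when attempts run out is a placeholder because the source does not say which error that case maps to.

```python
import time
from typing import Callable, Mapping

# Terminal-failure statuses listed in the table above.
TERMINAL_FAILURE_STATUSES = frozenset({"failed", "error", "rejected"})


class RouteTerminalFailureError(Exception):
    """Parent suite reported a non-success terminal status."""

    def __init__(self, detail: Mapping) -> None:
        super().__init__(f"route ended with status={detail.get('status')!r}")
        self.detail = detail  # response JSON, as described for `.detail`


def poll_until_ready(
    fetch_status: Callable[[], Mapping],
    *,
    poll_interval_s: float = 5.0,
    poll_max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until `mapsReady` is true; return the attempt number that succeeded."""
    for attempt in range(1, poll_max_attempts + 1):
        body = fetch_status()  # stands in for GET /api/satellite/route/{id}
        if body.get("mapsReady") is True:
            return attempt
        status = str(body.get("status", "")).lower()
        if status in TERMINAL_FAILURE_STATUSES:
            raise RouteTerminalFailureError(dict(body))
        if attempt < poll_max_attempts:
            sleep(poll_interval_s)  # no sleep after the final attempt
    # Assumption: the source does not name the error for an exhausted poll budget.
    raise TimeoutError(f"route not ready after {poll_max_attempts} attempts")
```

Injecting `sleep` keeps the loop testable without waiting. With the defaults, the loop sleeps at most 59 times, so the waiting time alone stays just under the 60 × 5 s ceiling (request latency comes on top).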

-C11 honours `Retry-After` on 429s, fails fast on TLS / auth errors, retries with backoff on 5xx. Resolution below 0.5 m/px (RESTRICT-SAT-4) is rejected at the C11 boundary, not pushed downstream.
+### 3.2 Download — read path (`satellite-provider` v1.0.0 inventory contract — AZ-505 / AZ-777 Phase 1)

-### 3.2 Upload — write path (D-PROJ-2 contract sketch, **planned**)
+| Endpoint | Method | Auth | Rate Limit | Description |
+|----------|--------|------|------------|-------------|
+| `/api/satellite/tiles/inventory` | POST | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) + optional dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` | parent-suite enforces | Bulk lookup of `(zoom, x, y)` slippy-map coords (≤ 5000 entries / request); body shape per `tile-inventory.md` v1.0.0. Response order matches request order; each entry carries `present: true\|false` plus metadata when present (`resolutionMPerPx`, `producedAt`, …). |
+| `/tiles/{z}/{x}/{y}` | GET | same as above | parent-suite enforces | Slippy-map tile fetch by coordinates (binary JPEG response). Issued only for inventory entries with `present=true`. |
+
+C11 honours `Retry-After` on 429s, fails fast on TLS / auth errors, retries with backoff on 5xx. Resolution below 0.5 m/px (RESTRICT-SAT-4) is rejected at the C11 boundary, not pushed downstream. Because the inventory response carries no `Content-Length` hint, AZ-308's pre-write budget check uses a conservative `_DEFAULT_ESTIMATED_TILE_BYTES = 50 000` per-tile reserve.
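
To make the inventory-then-fetch flow concrete, the sketch below enumerates slippy-map coordinates for a bounding box, splits them into inventory requests of at most 5000 entries, and applies the per-tile reserve to a budget check. It is a hedged illustration: the helper names are hypothetical, the bbox-to-tile conversion uses the standard Web Mercator slippy-map formula (the source does not say how `HttpTileDownloader` does it), and treating the AC-8.3 budget as 10 decimal GB is an assumption.

```python
import math
from itertools import islice

INVENTORY_MAX_ENTRIES = 5000              # per-request cap from the table above
DEFAULT_ESTIMATED_TILE_BYTES = 50_000     # per-tile reserve quoted above
CACHE_BUDGET_BYTES = 10 * 10**9           # AC-8.3 "<= 10 GB"; decimal GB assumed


def lat_lon_to_tile(lat: float, lon: float, zoom: int) -> tuple[int, int]:
    """Standard slippy-map (Web Mercator) conversion, clamped to the valid grid."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(lat_min, lon_min, lat_max, lon_max, zoom):
    """List every (z, x, y) covering the box; tile y grows southwards."""
    x_min, y_max = lat_lon_to_tile(lat_min, lon_min, zoom)
    x_max, y_min = lat_lon_to_tile(lat_max, lon_max, zoom)
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]


def chunked(entries, size=INVENTORY_MAX_ENTRIES):
    """Yield successive lists of at most `size` entries, one per inventory POST."""
    iterator = iter(entries)
    while batch := list(islice(iterator, size)):
        yield batch


def fits_budget(present_count, used_bytes=0, budget_bytes=CACHE_BUDGET_BYTES):
    """Pre-write check: reserve a fixed estimate per tile reported as present."""
    return used_bytes + present_count * DEFAULT_ESTIMATED_TILE_BYTES <= budget_bytes
```

With these numbers, the reserve admits at most 200 000 tiles into an empty 10 GB cache, which gives a quick sanity bound on how large a bbox and zoom range a single download can request.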
+
+### 3.3 Upload — write path (D-PROJ-2 contract sketch, **planned**)

 | Endpoint | Method | Auth | Rate Limit | Description |
 |----------|--------|------|------------|-------------|
@@ -135,27 +174,30 @@ C11 reads from / writes to C6 (the local store) and reads from / writes to `sate

 **Algorithmic Complexity**:

+- Route seed: bounded by parent-suite tile materialisation latency (~seconds–minutes for the Derkachi corridor; gated by `poll_max_attempts × poll_interval_s`).
 - Download: linear in tile count; bandwidth-bound by the operator workstation's link to `satellite-provider`.
 - Upload: linear in pending tile count; bandwidth-bound; bursty post-landing.

-**State Management**: stateless except for the two journals.
+**State Management**: stateless except for the two journals (download / pending-upload). The route client is fully stateless — each `seed_route` call submits, polls, verifies, and returns.

 **Key Dependencies**:

 | Library | Version | Purpose |
 |---------|---------|---------|
-| httpx | per project pin | GET (download) + multipart POST (upload) to `satellite-provider` |
+| httpx | per project pin | POST inventory + GET slippy-map (download), POST route + GET status (route seed), multipart POST (upload) to `satellite-provider` |
 | atomicwrites | latest | Journal updates |
 | cryptography | per project pin | Per-flight signing key (upload payload signing); the production `satellite-provider` ingest endpoint and the e2e-test `mock-suite-sat-service` fixture both verify with the same key family |

 **Error Handling Strategy**:

- `SatelliteProviderError`: HTTP timeout / 5xx / TLS failure on either direction. Retry-with-backoff on 5xx; fail fast on TLS / auth. On download, surface to operator + takeoff blocked. On upload, leave tiles in the pending-upload journal and surface to operator. **Do not delete uploaded tiles from C6** until acknowledged.
+- `SatelliteProviderError`: HTTP timeout / 5xx / TLS failure on download / upload. Retry-with-backoff on 5xx; fail fast on TLS / auth. On download, surface to operator + takeoff blocked. On upload, leave tiles in the pending-upload journal and surface to operator. **Do not delete uploaded tiles from C6** until acknowledged.
 - `RateLimitedError` (429): obey `Retry-After`; the operator can also re-invoke later. Same handling either direction.
 - `FreshnessRejectionError` / `ResolutionRejectionError`: download-side only. Per AC-NEW-6 / RESTRICT-SAT-4 — never silently downgrade fresh-required tiles in `active_conflict` sectors. Surface counts in the `DownloadBatchReport`.
 - `CacheBudgetExceededError`: download-side only. Pre-flight free-space check against AC-8.3 (≤ 10 GB). Fail fast with explicit budget delta; no partial write.
- `FlightStateNotOnGroundError`: upload-side only. Refuse to start; log + show explicit reason. ADR-004 process-level isolation means C11 should never run when the FC believes it's airborne — this error is a defense-in-depth, not the primary control.
 - `SignatureRejectedError`: upload-side only. Per-flight signing key was rejected by `satellite-provider`. This is a security-critical event — do NOT silently drop; surface to operator + log to FDR.
+- **Route-seed errors** (cycle 3, dedicated hierarchy under `SatelliteProviderRouteError`): `RouteValidationError` (4xx + RFC 7807 `errors` dict; raised pre-emptively for AZ-809 validator violations BEFORE the HTTP POST), `RouteTransientError` (5xx / network / timeout; carries `__cause__`), `RouteTerminalFailureError` (parent suite reports a non-success terminal status; `.detail` carries the response JSON). Separate hierarchy from `TileManagerError` because the route flow is corridor onboarding, not per-tile transfer.
+
+Post-landing safety: C11's upload path no longer gates on flight state internally. The check now lives in C12's `PostLandingUploadOrchestrator` (AZ-329 / Batch 44), which refuses to invoke `TileUploader.upload_pending_tiles` unless the C13 `flight_footer` FDR record carries `clean_shutdown=True` for the target flight. ADR-004 process-level isolation remains the primary control — C11 should never run on the companion at all.
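
The C12-side gate can be pictured roughly as follows. This is a sketch under assumptions rather than the `PostLandingUploadOrchestrator` implementation: the record shape (`kind`, `flight_id`, `payload` keys on plain dicts), the function name `confirm_clean_shutdown`, and the reason string `"unclean_shutdown"` are hypothetical; the `flight_footer` kind, the `clean_shutdown` payload field, the error class name, and the `"footer_missing"` reason come from the surrounding documentation.

```python
from typing import Iterable, Mapping


class FlightStateNotConfirmedError(Exception):
    """Refusal raised at the C12 boundary before C11's uploader is invoked."""

    def __init__(self, not_confirmed_reason: str) -> None:
        super().__init__(not_confirmed_reason)
        self.not_confirmed_reason = not_confirmed_reason


def confirm_clean_shutdown(records_newest_first: Iterable[Mapping], flight_id: str) -> None:
    """Return normally only if the flight's footer reports a clean shutdown."""
    for record in records_newest_first:
        if record.get("kind") != "flight_footer":
            continue  # skip every non-footer FDR record
        if record.get("flight_id") != flight_id:
            continue  # footer belongs to a different flight
        if record.get("payload", {}).get("clean_shutdown") is True:
            return
        # Hypothetical reason string for a footer that reports an unclean stop.
        raise FlightStateNotConfirmedError("unclean_shutdown")
    # No footer at all: truncated log or crash before close_flight().
    raise FlightStateNotConfirmedError("footer_missing")
```

A caller would run this check first and call `TileUploader.upload_pending_tiles` only when it returns without raising, so a truncated log can never trigger an upload.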

 ## 6. Extensions and Helpers

@@ -168,8 +210,10 @@ C11 reads from / writes to C6 (the local store) and reads from / writes to `sate
 **Known limitations**:

 - D-PROJ-2 ingest endpoint is NOT yet implemented service-side. Until parent-suite delivers the endpoint, C11 will fail every upload — the pending-upload journal accumulates. Operator workflow tolerates this.
- The e2e-test `mock-suite-sat-service` fixture implements only the planned POST contract (per the leftover file). Download integration tests run against the real `satellite-provider`. Production runs reach `satellite-provider` directly in both directions; the fixture is never on the production path.
- `TileDownloader` requires the operator workstation to have network reach to `satellite-provider` (the only path that crosses out of the workstation enclave). Pre-flight network configuration is an operator concern owned by C12; C11 fails fast if reachability is missing.
+- The e2e-test `mock-suite-sat-service` fixture implements only the planned POST upload contract (per the leftover file). Download + route-seed integration tests run against the real `satellite-provider` on the Jetson harness. Production runs reach `satellite-provider` directly in all three directions; the fixture is never on the production path.
+- `TileDownloader` and `SatelliteProviderRouteClient` require the operator workstation to have network reach to `satellite-provider` (the only path that crosses out of the workstation enclave). Pre-flight network configuration is an operator concern owned by C12; C11 fails fast if reachability is missing.
+- **Imagery source license attribution (cycle 3 — AZ-777 Phase 2)**: the Jetson `satellite-provider` instance downloads from the **Google Maps** satellite layer (`lyrs=s`), governed by Google Maps Platform Terms of Service. Dev/research use only; the operator-side seed scripts (`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate the "Imagery © Google" attribution string. Production deployment requires either a Google Maps Platform licensing review or migration to a true CC-BY satellite source on the satellite-provider side (parent-suite ticket TBD; surfaced in `_docs/00_problem/input_data/flight_derkachi/README.md`).
+- **Dev TLS cert**: the e2e-runner today accepts the self-signed dev cert via `SATELLITE_PROVIDER_TLS_INSECURE=1`. Production deploys must validate against a CA-issued cert (`SATELLITE_PROVIDER_TLS_INSECURE=0`); the env knob is documented in `.env.test.example` + the smoke test + this section as **development-only**.

 **Potential race conditions**:

@@ -177,25 +221,28 @@ C11 reads from / writes to C6 (the local store) and reads from / writes to `sate

 **Performance bottlenecks**:

+- Route seed: parent-suite tile-materialisation latency dominates (corridor onboarding from Google Maps upstream). Bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s = 5 min wall-clock ceiling).
 - Download: bandwidth-bound by the operator workstation's `satellite-provider` link; descriptor / engine work is downstream in C10 (offline, minutes).
 - Upload: bandwidth-bound. Per-flight upload volume is bounded by the F4 mid-flight tile gen cap (typically a few hundred tiles, each 50–200 KB → tens of MB per flight).

 ## 8. Dependency Graph

-**Must be implemented after**: C6 (read source for upload, write target for download), `satellite-provider` (download path; existing) + D-PROJ-2 endpoint (upload path; the e2e-test `mock-suite-sat-service` fixture covers tests until the real endpoint ships).
+**Must be implemented after**: C6 (read source for upload, write target for download), `satellite-provider` (download + route-seed paths; existing) + D-PROJ-2 endpoint (upload path; the e2e-test `mock-suite-sat-service` fixture covers tests until the real endpoint ships). `replay_input.tlog_route` (AZ-836) is a soft prerequisite for the route-seed path — the route client accepts any `RouteSpec` regardless of how it was produced, but the cycle-3 e2e fixture wires `extract_route_from_tlog` upstream.

 **Can be implemented in parallel with**: anything except C6 changes.

-**Blocks**: F1 (pre-flight cache build cannot start without `TileDownloader`), F10 (post-landing upload cannot start without `TileUploader`).
+**Blocks**: F1 (pre-flight cache build cannot start without `TileDownloader` or — for the route-driven variant — `SatelliteProviderRouteClient.seed_route`), F10 (post-landing upload cannot start without `TileUploader`).

 ## 9. Logging Strategy

 | Log Level | When | Example |
 |-----------|------|---------|
-| ERROR | `FlightStateNotOnGroundError`, `SignatureRejectedError`, persistent `SatelliteProviderError`, `CacheBudgetExceededError` | `C11 refused to start: flight_state=IN_AIR; safeguard active` |
-| WARN | one-off network failure, scheduled retry, freshness-driven rejections (counts) | `C11 batch upload retry: batch_uuid=…; next_retry_in_s=30` |
-| INFO | session start/end; per-batch report (download + upload) | `C11 download complete: 87654 tiles, 12 stale-rejected; bbox=…` |
-| DEBUG | per-tile request/response | `C11 tile uploaded: tile_id=(z=18,lat=…,lon=…); status=queued` |
+| ERROR | `SignatureRejectedError`, persistent `SatelliteProviderError`, `CacheBudgetExceededError`, `RouteTerminalFailureError` | `C11 upload failure: signature rejected by satellite-provider`; `c11.route.poll.terminal kind=failed route_id=…` |
+| WARN | one-off network failure, scheduled retry, freshness-driven rejections (counts), `RouteTransientError` retries, `RouteValidationError` pre-flight rejections | `C11 batch upload retry: batch_uuid=…; next_retry_in_s=30`; `c11.route.validation_failed field=points reason=below_min(2)` |
+| INFO | session start/end; per-batch report (download + upload); route submit + each poll tick + inventory verify | `C11 download complete: 87654 tiles, 12 stale-rejected; bbox=…`; `c11.route.submit route_id=…`; `c11.route.poll.tick attempt=3 status=processing` |
+| DEBUG | per-tile request/response; per-tile inventory entries | `C11 tile uploaded: tile_id=(z=18,lat=…,lon=…); status=queued` |
+
+Cycle-3 route-client log kinds: `c11.route.submit`, `c11.route.poll.tick`, `c11.route.poll.terminal`, `c11.route.inventory`, `c11.route.validation_failed` (component `c11_tile_manager.route_client`).

 **Log format**: structured JSON.
 **Log storage**: operator workstation log file (e.g. `~/.azaion/onboard/c11-tilemanager.log`); also writes per-run summaries (download report, upload report) to the operator workstation cache root for audit. The companion's FDR is NOT involved (C11 doesn't run on the companion).
@@ -66,19 +66,13 @@ Component-scoped. Suite-level coverage in `_docs/02_document/tests/*.md`. C11 wa

 ---

-### C11-IT-04: TileUploader gates on `flight_state == ON_GROUND`
+### C11-IT-04: post-landing safety gate lives in C12 (cross-reference)

-**Summary**: `TileUploader.upload_pending` refuses to run if `FlightStateSignal != ON_GROUND` (defense-in-depth atop ADR-004 process isolation).
+**Summary**: post-landing safety is owned by C12, not C11. The gate that historically lived in `TileUploader.upload_pending_tiles` was removed in Batch 44 (supersedes AZ-317); the equivalent check now lives in C12's `PostLandingUploadOrchestrator` (AZ-329) and refuses to invoke `TileUploader.upload_pending_tiles` unless the C13 `flight_footer` FDR record carries `clean_shutdown=True` for the target flight.

-**Traces to**: AC-8.4 (defensive — ADR-004's secondary guard)
+**Traces to**: see `_docs/02_document/components/13_c12_operator_orchestrator/tests.md` → C12-IT-03 for the post-landing safety test.

-**Description**: call `upload_pending` with `FlightStateSignal == IN_FLIGHT`; assert `UploadGateBlockedError`. Same with `UNKNOWN`. Set `ON_GROUND` and assert upload proceeds.
-
-**Input data**: scripted FlightStateSignal source.
-
-**Expected result**: upload blocked except in `ON_GROUND`.
-
-**Max execution time**: 30 s.
+**Status**: cross-reference only. C11's `TileUploader` no longer exposes `confirm_flight_state` or raises `FlightStateNotOnGroundError`.

 ---

@@ -193,10 +187,10 @@ Component-scoped. Suite-level coverage in `_docs/02_document/tests/*.md`. C11 wa

 | Step | Action | Expected Result |
 |------|--------|-----------------|
-| 1 | `operator-tool download --area derkachi.geojson --since 2026-01` | `DownloadBatchReport` printed; tiles in C6 |
-| 2 | `operator-tool build-cache` | C10 builds engines + descriptors + Manifest |
+| 1 | `operator-orchestrator download --area derkachi.geojson --since 2026-01` | `DownloadBatchReport` printed; tiles in C6 |
+| 2 | `operator-orchestrator build-cache` | C10 builds engines + descriptors + Manifest |
 | 3 | (simulate flight) | (covered by other tests) |
-| 4 | `operator-tool upload-pending` | Pending-upload tiles POSTed; report printed |
+| 4 | `operator-orchestrator upload-pending` | Pending-upload tiles POSTed; report printed |

 ---

@@ -1,4 +1,4 @@
-# C12 — Operator Pre-flight Tooling
+# C12 — Operator Pre-flight Orchestrator

 ## 1. High-Level Overview

@@ -6,6 +6,8 @@

 **Architectural Pattern**: Coordinator (single concrete `OperatorTooling` today). Three interfaces: `CacheBuildWorkflow` (pre-flight UX), `FlightsApiClient` (read the operator-authored `Flight` from the parent-suite `flights` REST service or a local JSON export — AZ-489, ADR-010), and `OperatorReLocService` (mid-flight operator re-loc requests, AC-3.4). `OperatorReLocService` is consumed via the GCS link by C8 (operator commands subscription); `FlightsApiClient` is operator-workstation-only and never reaches the airborne companion (Principle #9).

+**Cycle-1 operational reality**: C12 is the **operator-workstation CLI** — a standalone binary with `__main__.py` / `cli.py` entry points, NOT an airborne strategy slot. There is no `c12_operator_orchestrator` slot in `_AIRBORNE_REGISTRATIONS`, no row in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`, and the airborne companion image's build target deliberately excludes the entire `c12_operator_orchestrator/` source tree (ADR-004 + Principle #9 process-level isolation). The CLI composes its dependencies via `runtime_root/c12_factory.py`, which exposes one builder per service and aggregates them into the immutable `OperatorOrchestratorServices` dataclass: `build_flights_api_client` (AZ-489, returns `HttpxFlightsApiClient` with TLS verify ON), `build_sector_classification_store` (AZ-326), `build_companion_bringup` (AZ-327, paramiko-based SSH with `ParamikoSshSessionFactory` + `RemoteSidecarVerifier`), `build_build_cache_orchestrator` (AZ-328, wraps `FilelockFileLockFactory` + `RemoteCacheProvisionerInvoker` + a C11 `TileDownloaderCut` adapter), `build_post_landing_upload_orchestrator` (AZ-329, reads `LocalFdrFooterReader` and gates on `clean_shutdown=True` before invoking C11's `TileUploader`), and `build_operator_reloc_service` (AZ-330, the GCS-side `OperatorCommandTransport` producer). The `build_cache_orchestrator`, `post_landing_upload_orchestrator`, and `operator_reloc_service` fields default to `None` so CLI subcommands that don't need them short-circuit with `EXIT_OK` rather than fail — keeping AZ-326 / AZ-327 unit tests stable as later sibling tasks added service fields in-place. The AZ-507 cross-component cut keeps C12 from importing C10 or C11 directly: `RemoteCacheProvisionerInvoker` calls C10 over SSH, and `TileDownloaderCut` / `TileUploaderCut` adapters translate C11's real `DownloadRequest` / `UploadBatchReport` DTOs to the local cut DTOs the orchestrator consumes. The AZ-687 replay-mode guard does not apply — C12 has no airborne footprint.
+
 **Upstream dependencies**:
 - Operator (human input — flight ID or flight file path, sector classification, calibration path).
 - `flights` REST service (parent-suite `suite/flights/`) — read via `FlightsApiClient` to fetch the operator-authored `Flight` (waypoints + altitudes); offline alternative is a local JSON export in the same DTO shape.
@@ -26,7 +28,7 @@
 | Method | Input | Output | Async | Error Types |
 |--------|-------|--------|-------|-------------|
 | `build_cache` | `flight_id` (online) OR `flight_file: Path` (offline), `sector_class`, `calibration_path`, `satellite_provider_url`, `api_key` | `CacheBuildReport` (wraps `FlightResolveReport` + C11 `DownloadBatchReport` + C10 `BuildReport`) | No (operator-facing; minutes) | `CacheBuildError` (wraps `FlightNotFoundError`, `FlightsApiUnreachableError`, `SatelliteProviderError`, `EngineBuildError`, etc.) |
-| `trigger_post_landing_upload` | `flight_id` | C11 `UploadBatchReport` | No (operator-facing; minutes) | `CacheBuildError` wrapper around `FlightStateNotOnGroundError`, `SignatureRejectedError`, etc. |
+| `trigger_post_landing_upload` | `PostLandingUploadRequest` (`flight_id`, `satellite_provider_url`, `api_key`, `batch_size`) | C11 `UploadBatchReport` (re-exposed as `UploadBatchReportCut`) | No (operator-facing; minutes) | `FlightStateNotConfirmedError` (footer missing / unclean / fdr-unreadable / flight-id not found), `SatelliteProviderError`, `SignatureRejectedError` (passthrough from C11) |
 | `verify_companion_ready` | `companion_address` | `ReadinessReport` | No | `CompanionUnreachableError`, `ContentHashMismatchError` |
 | `set_sector_classification` | `area, sector_class` | `None` | No | — |
 | `apply_freshness_threshold` | `sector_class` | `int (months)` | No | — |
@@ -1,4 +1,4 @@
-# Test Specification — C12 Operator Pre-flight Tooling
+# Test Specification — C12 Operator Pre-flight Orchestrator

 Component-scoped. Suite-level coverage in `_docs/02_document/tests/*.md`. C12 sequences the F1 (C11 download → C10 build) and F10 (C11 upload trigger) operator-side flows.

@@ -47,17 +47,17 @@ Component-scoped. Suite-level coverage in `_docs/02_document/tests/*.md`. C12 se

 ---

-### C12-IT-03: trigger_post_landing_upload invokes C11 TileUploader on confirmed ON_GROUND
+### C12-IT-03: trigger_post_landing_upload invokes C11 TileUploader on confirmed clean-shutdown footer

-**Summary**: `trigger_post_landing_upload` reads the most recent `FlightStateSignal` from the post-flight FDR; if `ON_GROUND` is confirmed for ≥ a configurable safety threshold (default 30 s), it invokes `C11.TileUploader.upload_pending`. If `ON_GROUND` is not confirmed, it refuses and returns a clear error.
+**Summary**: `trigger_post_landing_upload` reads the post-flight FDR newest-segment-first looking for a `flight_footer` record (kind registered by C13 in AZ-292; emitted exactly once per flight on `close_flight()`); if found with `payload["clean_shutdown"] == True`, it invokes `C11.TileUploader.upload_pending_tiles(UploadRequest(flight_id=..., ...))`. If the footer is absent (truncation / crash) or carries `clean_shutdown == False`, it refuses with `FlightStateNotConfirmedError`.

 **Traces to**: AC-8.4

-**Description**: stage two flight FDR fixtures — one ending with confirmed ON_GROUND for 60 s, one ending with `IN_FLIGHT` (incomplete log). Call `trigger_post_landing_upload`; assert (a) first case invokes upload, (b) second case refuses with `FlightStateNotConfirmedError`.
+**Description**: stage two flight FDR fixtures produced by C13's `FileFdrWriter` — one with a clean-shutdown footer (the writer's standard `close_flight()` path, which always sets `clean_shutdown=True` in the current AZ-292 implementation), one truncated (writer terminated before `close_flight()` ran, so no footer record). Call `trigger_post_landing_upload`; assert (a) first case invokes upload via the `TileUploaderCut` and returns the recorded `UploadBatchReport`, (b) second case refuses with `FlightStateNotConfirmedError(not_confirmed_reason="footer_missing")`.

-**Input data**: 2 scripted FDR fixtures.
+**Input data**: 2 FDR fixtures generated by `FileFdrWriter` (one closed cleanly; one with the close skipped).

-**Expected result**: per assertion.
+**Expected result**: per assertion. No 30-second ON_GROUND threshold is consulted — the footer's existence + `clean_shutdown` flag is the sole signal.

 **Max execution time**: 60 s.
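
The footer gate that C12-IT-03 exercises can be sketched as follows. This is a minimal illustration under stated assumptions, not the AZ-329 implementation: the record shape (dicts with `kind` and `payload` keys), the function name `check_footer`, the local stand-in exception class, and the `"footer_unclean"` reason string are all hypothetical. Only `flight_footer`, `clean_shutdown`, and `footer_missing` come from the spec text.

```python
from typing import Any, Iterable


class FlightStateNotConfirmedError(Exception):
    """Local stand-in for the C12 error; only the reason attribute is modelled."""

    def __init__(self, not_confirmed_reason: str) -> None:
        super().__init__(not_confirmed_reason)
        self.not_confirmed_reason = not_confirmed_reason


def check_footer(records_newest_first: Iterable[dict[str, Any]]) -> None:
    """Return normally only if a flight_footer with clean_shutdown=True exists.

    The caller is assumed to pass FDR records newest-first, mirroring the
    newest-segment-first scan described in the summary.
    """
    for record in records_newest_first:
        if record.get("kind") != "flight_footer":
            continue  # not a footer, keep scanning older records
        if record.get("payload", {}).get("clean_shutdown") is True:
            return  # gate open: the upload may proceed
        # Footer present but not clean; the reason string is hypothetical.
        raise FlightStateNotConfirmedError("footer_unclean")
    # No footer at all (truncated or crashed writer).
    raise FlightStateNotConfirmedError("footer_missing")
```

In this sketch the upload call would sit directly after a successful `check_footer(...)`, so a truncated fixture never reaches the uploader.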

@@ -99,7 +99,7 @@ Component-scoped. Suite-level coverage in `_docs/02_document/tests/*.md`. C12 se

 ### C12-ST-01: CLI rejects writes to airborne images

-**Summary**: the operator-tool CLI has no command path that writes into the airborne `production-binary` image (defends against operator-side mistakes that would defeat ADR-004).
+**Summary**: the operator-orchestrator CLI has no command path that writes into the airborne `production-binary` image (defends against operator-side mistakes that would defeat ADR-004).

 **Traces to**: ADR-004 R02 enforcement (C12 side)

@@ -135,7 +135,7 @@ Component-scoped. Suite-level coverage in `_docs/02_document/tests/*.md`. C12 se
 | Data Set | Source | Size |
 |----------|--------|------|
 | Operator-tooling tarball | CI build artifact | varies |
-| FDR fixtures (ON_GROUND-confirmed and IN_FLIGHT) | scripted | <100 MB each |
+| FDR fixtures (clean-shutdown footer present / footer absent) | generated by C13 `FileFdrWriter` | <100 MB each |
 | Small Derkachi sub-area for C12-IT-02 | scripted | <500 MB |

 **Setup**: extract operator-tooling tarball; bring up Docker compose.
@@ -6,6 +6,8 @@

 **Architectural Pattern**: single concrete `FileFdrWriter` behind a `FdrWriter` interface. Single writer thread fed by lock-free in-process queues from every component. Lossy on writer-thread overrun **only by logging the rollover event**, never silently.

+**Cycle-1 operational reality**: C13 is **airborne infrastructure** seeded as the very first slot of `build_pre_constructed` — `constructed["c13_fdr"] = make_fdr_client(AIRBORNE_MAIN_PRODUCER_ID, config)` (AZ-619 Phase A, where `AIRBORNE_MAIN_PRODUCER_ID = "airborne_main"`). The `make_fdr_client(producer_id, config)` factory in `fdr_client/client.py` carries a process-level `_CACHE` keyed by `producer_id`, so any later `make_fdr_client("airborne_main", config)` call in the same process returns the SAME `FdrClient` instance — that's the AC-619.2 cross-component singleton guarantee. Per-component callers can also obtain their OWN per-producer FdrClient via `make_fdr_client("<their_slug>", config)`: C11 uses `"c11_tile_manager.signing_key"` (AZ-318) and `"c11_tile_manager.tile_uploader"` (AZ-319), C6's `freshness_gate.py` uses its own producer, etc. — each entry in the cache is a distinct `SpscRingBuffer` consumer side. Per-producer capacity comes from `config.fdr.per_producer_capacity[producer_id]` (override) or `config.fdr.queue_size` (default), rounded UP to the next power of two and clipped to `MIN_CAPACITY`. The drop-oldest overrun policy (AZ-274 `default_overrun_policy`) is wired automatically at `FdrClient` construction time; AZ-274 also routes the dropped record through the `on_overrun` hook so the rollover-log event is emitted exactly once per overrun, never silently. Required-key relationship: `c1_vio` and `c5_state` list `c13_fdr` in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS` (missing raises `AirborneBootstrapError`); `c2_5_rerank`, `c3_matcher`, `c3_5_adhop`, and `c4_pose` read it via `constructed.get("c13_fdr")` (optional — silently passes `None` to the wrapper, which is the documented contract for "FDR off" test fixtures). 
+AZ-687 replay-mode guard does NOT apply to C13: the slot is seeded unconditionally before any `_replay_omits_component_block(...)` check; a replay binary still writes FDR (TlogDerivedClock-stamped) so post-flight analysis tools can drain the queue.
+
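The per-producer capacity rule above (explicit override, else the default queue size, rounded up to a power of two) can be sketched as below. The function names and the `min_capacity=2` default are illustrative assumptions: the real `MIN_CAPACITY` value is not stated here, and "clipped to `MIN_CAPACITY`" is read as a lower bound.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("capacity must be a positive integer")
    return 1 << (n - 1).bit_length()


def capacity_for(
    producer_id: str,
    per_producer_capacity: dict[str, int],
    queue_size: int,
    min_capacity: int = 2,  # assumed stand-in for MIN_CAPACITY
) -> int:
    """Resolve the ring-buffer capacity for one producer."""
    # A per-producer override wins; otherwise fall back to the default size.
    requested = per_producer_capacity.get(producer_id, queue_size)
    # Round up to a power of two, then enforce the assumed floor.
    return max(next_power_of_two(requested), min_capacity)
```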
 **Upstream dependencies**: every component publishes to C13 via in-process pub/sub (drop-oldest-with-rollover-log on overrun).

 **Downstream consumers**:
@@ -90,7 +92,7 @@ Not applicable.
 |---------|---------|---------|
 | orjson / msgpack | per project pin | Record serialisation (serialised format choice during decompose phase) |
 | atomicwrites | latest | Segment file rotation (atomic open of new segment + close of previous) |
-| filelock | per project pin | Cross-process safety for the FDR root (operator-tool reads while companion writes — companion-only access during flight) |
+| filelock | per project pin | Cross-process safety for the FDR root (operator-orchestrator reads while companion writes — companion-only access during flight) |

 **Error Handling Strategy**:
 - `FdrOpenError` at takeoff: refuse takeoff (per AC-NEW-3 every payload class must be present from t=0).
@@ -0,0 +1,126 @@
+# Contract: route_client
+
+**Component**: c11_tilemanager
+**Producer task**: AZ-838_satellite_provider_route_client (Epic AZ-835 C2)
+**Consumer tasks**: AZ-839 (`operator_pre_flight_setup` real fixture, Epic AZ-835 C3); AZ-840 (E2E orchestrator test, Epic AZ-835 C4); future C12 production binding (deferred — see § Non-Goals).
+**Version**: 1.0.0
+**Status**: stable
+**Last Updated**: 2026-05-26
+
+## Purpose
+
+The `SatelliteProviderRouteClient` is C11's operator-side **route-onboarding** interface. Given a `RouteSpec` (a coarsened, tlog-derived flight corridor produced by `replay_input.tlog_route.extract_route_from_tlog` — AZ-836), it registers the corridor with the parent-suite `satellite-provider` Route API, polls until materialisation completes, and verifies coverage via the inventory contract.
+
+The route-driven seeding flow lets the operator pre-commit the C6 cache to the precise corridor the drone actually flew rather than a coarse bounding box — typically ~100× more tile-efficient on long, narrow flights.
+
+C11 is operator-side ONLY; ADR-004 forbids the airborne companion image from importing this module.
+
+**Upstream API** (cycle 3 — AZ-838): `POST /api/satellite/route` (corridor onboarding; body shape per `CreateRouteRequest.cs` / `RoutePoint.cs` / `GeoPoint.cs` DTOs; query `requestMaps=true&createTilesZip=false`) + `GET /api/satellite/route/{id}` (status polling; terminal-success when `mapsReady=true`; terminal-failure when `status ∈ {failed, error, rejected}`) + `POST /api/satellite/tiles/inventory` (post-materialisation coverage verification, shared with `tile_downloader`). Authentication: `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert.
+
+## Shape
+
+### Function / method API
+
+```python
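+from __future__ import annotations   # lets seed_route reference RouteSeedResult before it is defined
+
+import uuid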
+from gps_denied_onboard._types.route import RouteSpec   # AZ-845 canonical home
+
+class SatelliteProviderRouteClient:
+    def __init__(
+        self,
+        base_url: str,
+        jwt: str,
+        *,
+        tls_insecure: bool = False,
+        request_timeout_s: float = 30.0,
+        poll_interval_s: float = 5.0,
+        poll_max_attempts: int = 60,
+    ) -> None: ...
+
+    def seed_route(
+        self,
+        spec: RouteSpec,
+        *,
+        name: str | None = None,
+    ) -> RouteSeedResult: ...
+```
+
+| Name | Signature | Throws / Errors | Blocking? |
+|------|-----------|-----------------|-----------|
+| `seed_route` | `(spec: RouteSpec, *, name: str \| None = None) -> RouteSeedResult` | `RouteValidationError`, `RouteTransientError`, `RouteTerminalFailureError` (all under `SatelliteProviderRouteError`) | sync; poll loop bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s = 5 min ceiling) |
+
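The bounded poll loop behind `seed_route` can be sketched as follows. This is a hedged illustration, not the C11 code: `poll_until_ready`, the injected `fetch_status` callable (standing in for `GET /api/satellite/route/{id}`), the injectable `sleep`, and the local exception classes are assumptions made so the sketch runs without a network.

```python
import time
from typing import Any, Callable, Optional

TERMINAL_FAILURE = frozenset({"failed", "error", "rejected"})


class RouteTransientError(Exception):
    """Local stand-in for the contract's transient error."""


class RouteTerminalFailureError(Exception):
    """Local stand-in for the contract's terminal-failure error."""


def poll_until_ready(
    fetch_status: Callable[[], dict[str, Any]],
    *,
    poll_interval_s: float = 5.0,
    poll_max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until mapsReady is true, a failure status appears, or the budget ends."""
    last_status: Optional[str] = None
    for attempt in range(poll_max_attempts):
        if attempt > 0:
            sleep(poll_interval_s)  # lower bound between successive GETs
        body = fetch_status()
        last_status = body.get("status")
        if body.get("mapsReady") is True:
            return body  # terminal success
        if last_status in TERMINAL_FAILURE:
            raise RouteTerminalFailureError(body)
        # Any other status means "still processing": poll again.
    raise RouteTransientError(
        f"poll budget exhausted; last status={last_status!r}"
    )
```

With the defaults this waits at most 59 intervals between 60 attempts, which stays inside the 5 min ceiling quoted in the table.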
+### Data DTOs
+
+```python
+from dataclasses import dataclass
+from pathlib import Path
+import uuid
+
+@dataclass(frozen=True, slots=True)
+class RouteSpec:                                          # _types/route.py (AZ-845)
+    waypoints:                       tuple[tuple[float, float], ...]   # (lat, lon)
+    suggested_region_size_meters:    float                              # per-waypoint coverage radius
+    source_tlog:                     Path                               # provenance
+    source_segment:                  tuple[int, int]                    # (start_idx, end_idx) into tlog GPS rows
+    total_distance_meters:           float                              # along-track distance of active segment
+
+@dataclass(frozen=True, slots=True)
+class RouteSeedResult:                                    # c11_tile_manager/route_client.py
+    route_id:                         uuid.UUID
+    terminal_status:                  str                # e.g. "completed", "done", "succeeded"
+    maps_ready:                       bool               # True on terminal success
+    tile_count:                       int                # present=true entries from inventory verify
+    elapsed_ms:                       int                # POST → terminal-status wall time
+    submitted_payload_sha256:         str                # provenance for the inventory verify step
+```
+
+| Field | Type | Required | Description | Constraints |
+|-------|------|----------|-------------|-------------|
+| `RouteSpec.waypoints` | `tuple[tuple[float, float], ...]` | yes | Ordered list of (lat, lon) waypoints | `2 ≤ len(waypoints) ≤ 500` (AZ-809 validator); each `lat ∈ [-90, 90]`, `lon ∈ [-180, 180]` |
+| `RouteSpec.suggested_region_size_meters` | `float` | yes | Per-waypoint coverage radius | `100.0 ≤ value ≤ 10_000.0` (AZ-809 validator) |
+| `RouteSpec.source_tlog` | `Path` | yes | Provenance — which tlog produced this spec | filesystem path |
+| `RouteSeedResult.route_id` | `uuid.UUID` | yes | Server-assigned route id | non-zero |
+| `RouteSeedResult.terminal_status` | `str` | yes | Last status observed from `GET /api/satellite/route/{id}` | one of `{"completed", "failed", "error", "done", "succeeded", "rejected"}` |
+| `RouteSeedResult.maps_ready` | `bool` | yes | True iff parent suite reported `mapsReady=true` (terminal success) | Always True on a returned result; if the poll budget is exhausted before a terminal status, `seed_route` raises `RouteTransientError` (I-4) instead of returning |
+| `RouteSeedResult.tile_count` | `int` | yes | Inventory `present=true` count over the route's enumerated coverage | ≥ 0 (lower bound — server may interpolate between waypoints) |
+
+## Invariants
+
+- I-1: **Pre-emptive validation** rejects obviously-bad input as `RouteValidationError` BEFORE the HTTP POST. The client mirrors the AZ-809 `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges; `name`/`description` max lengths). The list MUST stay in sync with `SatelliteProvider.Api/Validators/CreateRouteRequestValidator.cs` (parent suite source).
+- I-2: The client POSTs the wire shape exactly per `CreateRouteRequest.cs` + `RoutePoint.cs` + `GeoPoint.cs` (note: `RoutePoint` uses `lat` / `lon` JSON property names for both input and output; the input/output naming asymmetry flagged in AZ-809 AC-10 is a parent-suite concern, not a client adaptation).
+- I-3: Poll cadence MUST respect `poll_interval_s` (lower bound between successive `GET /api/satellite/route/{id}` calls) and `poll_max_attempts` (upper bound on attempt count). The client logs every poll tick at INFO with the observed status.
+- I-4: Terminal-success is exactly `mapsReady=true`. Terminal-failure is exactly `status ∈ {"failed", "error", "rejected"}`. Any other status is treated as "still processing" and triggers the next poll. If the poll budget is exhausted without terminal status, `RouteTransientError` is raised with the last observed status.
+- I-5: 4xx responses with RFC 7807 `ProblemDetails` → `RouteValidationError`; `field_errors` is populated from the `errors` dict so the caller can render per-field rejections.
+- I-6: 5xx / network / timeout → `RouteTransientError` with `__cause__` set to the underlying `httpx` exception. The retry semantics are caller-driven — the route client itself does NOT retry the POST, leaving the policy to the fixture / CLI (e.g., `tests/e2e/replay/conftest.py::operator_pre_flight_setup` retries up to 3 times using C11's `_DEFAULT_BACKOFF_SCHEDULE_S = (1, 2, 4, 8)`).
+- I-7: The inventory verify step uses `POST /api/satellite/tiles/inventory` (≤ 5000 entries / request) and enumerates the route's tile coverage locally from `(waypoints, suggested_region_size_meters)` using the parent suite's web-Mercator math (`_EARTH_EQUATORIAL_CIRCUMFERENCE_M = 40 075 016.686`). The result is a **lower bound** on actual server coverage — the server may interpolate intermediate corridor tiles that the local enumeration misses; this is documented and acceptable as a sanity-check signal, not a coverage proof.
+
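The local coverage enumeration in I-7 can be sketched with the standard slippy-map (web-Mercator) tile formula. This is an assumption-laden illustration: the function names, the square bounding box around each waypoint, the reading of `suggested_region_size_meters` as a half-extent, and the explicit `zoom` argument (which `RouteSpec` does not carry) are all hypothetical, and the parent suite's actual math may differ in detail.

```python
import math
from typing import Iterable

EARTH_EQUATORIAL_CIRCUMFERENCE_M = 40_075_016.686


def latlon_to_tile(lat: float, lon: float, zoom: int) -> tuple[int, int]:
    """Standard slippy-map tile index; lat is assumed within Web Mercator range."""
    n = 1 << zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge stay inside the valid index range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def enumerate_route_tiles(
    waypoints: Iterable[tuple[float, float]],
    region_size_m: float,
    zoom: int,
) -> set[tuple[int, int, int]]:
    """Collect (zoom, x, y) tiles covering a square box around each waypoint."""
    metres_per_degree = EARTH_EQUATORIAL_CIRCUMFERENCE_M / 360.0
    tiles: set[tuple[int, int, int]] = set()
    for lat, lon in waypoints:
        dlat = region_size_m / metres_per_degree
        dlon = region_size_m / (metres_per_degree * math.cos(math.radians(lat)))
        x_min, y_min = latlon_to_tile(lat + dlat, lon - dlon, zoom)  # north-west
        x_max, y_max = latlon_to_tile(lat - dlat, lon + dlon, zoom)  # south-east
        for x in range(x_min, x_max + 1):
            for y in range(y_min, y_max + 1):
                tiles.add((zoom, x, y))
    return tiles


def inventory_batches(tiles, size: int = 5000) -> list[list]:
    """Split tiles into request-sized chunks (at most 5000 entries each)."""
    ordered = sorted(tiles)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]
```

Because only per-waypoint boxes are enumerated, tiles the server interpolates between distant waypoints are missed, which is why the resulting count is a lower bound.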
+## Non-Goals
+
+- Not covered: producing the `RouteSpec` — owned by `replay_input.tlog_route.extract_route_from_tlog` (AZ-836).
+- Not covered: orchestration of when the operator runs the seed — owned by C12 (production binding deferred; cycle-3 e2e fixture `operator_pre_flight_setup` is the current driver — AZ-839).
+- Not covered: FAISS index construction over the populated cache — owned by C10 `DescriptorBatcher`.
+- Not covered: bbox-based seeding — handled by `tile_downloader.download_tiles_for_area` (and by `tests/fixtures/derkachi_c6/seed_region.py` for the e2e fixture).
+- Not covered: multi-route batching — one `RouteSpec` per `seed_route` call. Multi-flight aggregate corridors are an operator-workflow concern.
+
+## Versioning Rules
+
+- **Breaking changes** (renamed method, removed required field, changed return type, parent-suite Route API contract break) require a major version bump. Coordinate with the C3 fixture (AZ-839) and any future C12 production binding via Choose A/B/C/D before bumping.
+- **Non-breaking additions** (new optional constructor kwarg, new field on `RouteSeedResult`, new error variant the consumer catches via `SatelliteProviderRouteError`) require a minor version bump.
+- The pre-emptive validation bounds (I-1) MUST track the parent-suite `CreateRouteRequestValidator.cs` exactly. Drift between client and server validators is a defect, not a version concern — fix the client to match the server.
+
+## Test Cases
+
+| Case | Input | Expected | Notes |
+|------|-------|----------|-------|
+| route-happy-path | `RouteSpec` for Derkachi tlog (2-waypoint corridor, region_size=500m) against a stubbed `satellite-provider` returning `mapsReady=true` on the 2nd poll | `RouteSeedResult` with `maps_ready=True`, `tile_count > 0`, `terminal_status="completed"`, `elapsed_ms` reflects 2 polls | AZ-838 AC-1, AC-2 |
+| validation-empty-points | `RouteSpec(waypoints=(), …)` | `RouteValidationError` raised BEFORE HTTP POST | I-1, AZ-838 AC-6 |
+| validation-too-many-points | `RouteSpec` with 501 waypoints | `RouteValidationError` raised BEFORE HTTP POST | I-1, AZ-838 AC-6 |
+| validation-region-too-large | `RouteSpec(suggested_region_size_meters=10_001.0, …)` | `RouteValidationError` raised BEFORE HTTP POST | I-1, AZ-838 AC-6 |
+| 4xx-problem-details | server returns 400 + RFC 7807 `errors` dict | `RouteValidationError` with `field_errors` populated from the response | I-5, AZ-838 AC-3 |
+| 5xx-transient | server returns 503 | `RouteTransientError` with `__cause__` set to the underlying `httpx` exception | I-6, AZ-838 AC-4 |
+| terminal-failure | server reports `status="failed"` mid-poll | `RouteTerminalFailureError`; `.detail` carries the response JSON | I-4, AZ-838 AC-5 |
+| poll-budget-exhausted | server stays in `status="processing"` past 60 attempts | `RouteTransientError` referencing the last observed status | I-3, I-4 |
+| inventory-verify-counts-present | `mapsReady=true` then inventory POST returns mixed `present=true/false` entries | `tile_count` equals the count of `present=true` entries | I-7 |
+| integration-derkachi | `RouteSpec` from real Derkachi tlog, against the Jetson `satellite-provider` (gated by `RUN_E2E=1` + `SATELLITE_PROVIDER_URL`) | `tile_count > 0`, `maps_ready=True`, completes in ≤ 15 s on the 2-waypoint reference route | AZ-838 AC-10 (Jetson-only, Tier-2) |
+
+## Change Log
+
+| Version | Date | Change | Author |
+|---------|------|--------|--------|
+| 1.0.0 | 2026-05-26 | Initial contract — produced by AZ-838 (Epic AZ-835 C2). Cycle-3 addition; consumed by AZ-839 (`operator_pre_flight_setup` real fixture) and AZ-840 (E2E orchestrator test). | autodev |
@@ -1,18 +1,20 @@
 # Contract: tile_downloader

 **Component**: c11_tilemanager
-**Producer task**: AZ-316_c11_tile_downloader
+**Producer task**: AZ-316_c11_tile_downloader (initial), AZ-777 Phase 1 (cycle-3 inventory-contract adaptation)
 **Consumer tasks**: AZ-253 (E-C12 Operator Pre-flight Tooling — TBD at C12 decompose time)
-**Version**: 1.0.0
-**Status**: draft
-**Last Updated**: 2026-05-10
+**Version**: 1.1.0
+**Status**: stable
+**Last Updated**: 2026-05-26

 ## Purpose

-The `TileDownloader` Protocol is C11's operator-side download interface. C12 invokes it during F1 (pre-flight cache build) to fetch satellite tiles from the parent suite's `satellite-provider` GET surface, apply RESTRICT-SAT-4 resolution gating at the C11 boundary, and write accepted tiles into C6. Freshness rejections surfacing from C6 (AZ-307) are counted and surfaced in the report.
+The `TileDownloader` Protocol is C11's operator-side download interface. C12 invokes it during F1 (pre-flight cache build) to fetch satellite tiles from the parent suite's `satellite-provider` inventory + slippy-map surface, apply RESTRICT-SAT-4 resolution gating at the C11 boundary, and write accepted tiles into C6. Freshness rejections surfacing from C6 (AZ-307) are counted and surfaced in the report.

 C11 is operator-side ONLY; ADR-004 forbids the airborne companion image from importing this module.

+**Upstream API (cycle 3 — AZ-777 Phase 1)**: against the real parent-suite `satellite-provider` v1.0.0 inventory contract — `POST /api/satellite/tiles/inventory` (bulk lookup by `(zoom, x, y)`, ≤ 5000 entries / request, per `tile-inventory.md` v1.0.0 / AZ-505) + `GET /tiles/{z}/{x}/{y}` (slippy-map JPEG fetch, issued only for inventory entries with `present=true`). Authentication: `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert (production must validate against a CA-issued cert). Because the inventory response carries no `Content-Length` hint, AZ-308's pre-write budget pre-check uses a conservative `_DEFAULT_ESTIMATED_TILE_BYTES = 50 000` per-tile reserve.
+
 ## Shape

 ### Function / method API
@@ -79,7 +81,7 @@ class TileSummary:
 - I-1: `tiles_downloaded + tiles_rejected_resolution + tiles_rejected_freshness == sum of attempted tiles`. The report accounts for every tile the downloader attempted; no silent drops.
 - I-2: A re-run of `download_tiles_for_area` for the same `(bbox, zoom_levels, sector_class, flight_id)` after a successful prior run is idempotent: `outcome = idempotent_no_op` and no GETs are issued. Idempotence is enforced by C11's download-progress journal under `cache_root/.c11/journal/`.
 - I-3: Every accepted tile passes BOTH the C11 resolution gate (≥ 0.5 m/px per RESTRICT-SAT-4) AND the C6 freshness gate (AZ-307). A tile that fails either is excluded from `tiles_downloaded`.
-- I-4: TLS + service-internal API key authenticate the GET; auth failure surfaces as `SatelliteProviderError` and aborts the run with `outcome = failure`. The downloader does NOT fall back to plaintext or unauthenticated requests.
+- I-4: JWT Bearer authentication (`SATELLITE_PROVIDER_API_KEY`) over TLS authenticates the inventory POST and the slippy-map GET; auth failure surfaces as `SatelliteProviderError` and aborts the run with `outcome = failure`. The downloader does NOT fall back to plaintext or unauthenticated requests. `SATELLITE_PROVIDER_TLS_INSECURE=1` is a dev-only knob for self-signed certs; production must run with it unset.
 - I-5: The downloader writes via the AZ-303 `TileStore`/`TileMetadataStore` Protocols; it does NOT touch C6's filesystem layout directly.
 - I-6: A `CacheBudgetExceededError` aborts pre-write with no partial write and `outcome = failure`. The C6 cache budget enforcer (AZ-308) drives the headroom check.

@@ -112,4 +114,5 @@ class TileSummary:

 | Version | Date | Change | Author |
 |---------|------|--------|--------|
+| 1.1.0 | 2026-05-26 | Internal upstream contract adapted to `satellite-provider` v1.0.0 inventory contract (AZ-777 Phase 1): `POST /api/satellite/tiles/inventory` + `GET /tiles/{z}/{x}/{y}` replace the previous `GET /api/satellite/tiles?bbox=…&zoom=…` shape. `download_tiles_for_area` / `DownloadRequest` / `DownloadBatchReport` surface UNCHANGED — non-breaking minor bump. Auth tightened to JWT Bearer over TLS. Status moved draft → stable. | autodev |
 | 1.0.0 | 2026-05-10 | Initial contract — produced by AZ-316 (E-C11 decomposition) | autodev |
@@ -0,0 +1,95 @@
+syntax = "proto3";
+
+package satellite.v1;
+
+import "google/protobuf/timestamp.proto";
+
+option csharp_namespace = "Satellite.V1";
+
+service RouteTileDelivery {
+  rpc DeliverRouteTiles(DeliverRouteTilesRequest) returns (stream RouteTileEvent);
+}
+
+message DeliverRouteTilesRequest {
+  RouteSpec route = 1;
+  repeated ClientTileRecord client_tiles = 2;
+}
+
+message RouteSpec {
+  string route_id = 1;
+  repeated Waypoint waypoints = 2;
+  double region_size_meters = 3;
+  int32 zoom = 4;
+  repeated GeofencePolygon geofences = 5;
+  bool include_geofence_tiles = 6;
+}
+
+message Waypoint {
+  double lat = 1;
+  double lon = 2;
+}
+
+message GeofencePolygon {
+  repeated Waypoint vertices = 1;
+}
+
+message ClientTileRecord {
+  int32 z = 1;
+  int32 x = 2;
+  int32 y = 3;
+  double resolution_m_per_px = 4;
+  google.protobuf.Timestamp captured_at = 5;
+  optional string source = 6;
+  bytes content_sha256 = 7;
+}
+
+message RouteTileEvent {
+  oneof payload {
+    RouteManifest manifest = 1;
+    TileBatch batch = 2;
+    ProgressUpdate progress = 3;
+    DeliveryComplete complete = 4;
+    DeliveryError error = 5;
+  }
+}
+
+message RouteManifest {
+  uint32 total_candidates = 1;
+  uint32 skipped_by_client = 2;
+  uint32 to_deliver = 3;
+}
+
+message TileBatch {
+  uint32 batch_seq = 1;
+  repeated TilePayload tiles = 2;
+}
+
+message TilePayload {
+  int32 z = 1;
+  int32 x = 2;
+  int32 y = 3;
+  double resolution_m_per_px = 4;
+  google.protobuf.Timestamp captured_at = 5;
+  string source = 6;
+  bytes jpeg = 7;
+  bytes content_sha256 = 8;
+  uint32 route_priority = 9;
+}
+
+message ProgressUpdate {
+  uint32 delivered = 1;
+  uint32 total = 2;
+  uint32 downloading = 3;
+}
+
+message DeliveryComplete {
+  uint32 delivered = 1;
+  uint32 skipped_client = 2;
+  uint32 skipped_server_filter = 3;
+}
+
+message DeliveryError {
+  string code = 1;
+  string message = 2;
+  bool retryable = 3;
+}
@@ -0,0 +1,143 @@
+# Contract: RouteTileDelivery (gRPC)
+
+**Component**: c11_tilemanager (consumer), satellite-provider (producer)
+**Epic**: AZ-976
+**ADR**: ADR-013 (architecture.md)
+**Proto**: `tile_provision.proto` — `package satellite.v1`
+**Version**: 0.3.0
+**Status**: proposed
+**Last Updated**: 2026-06-19
+
+## Purpose
+
+Operator-side **pre-flight cache provisioning**. Client sends route + onboard tile catalog once; server streams `RouteTileEvent` messages until `DeliveryComplete` or `DeliveryError`.
+
+satellite-provider does **not** receive `flight_id` — that is a C6 bookkeeping concern on the gps-denied side only (`route_id` is the wire correlation id).
+
+C11/C12 run on the **operator workstation** only. Per ADR-004, the airborne image must not import stubs or open this channel.
+
+## RPC
+
+```protobuf
+service RouteTileDelivery {
+  rpc DeliverRouteTiles(DeliverRouteTilesRequest) returns (stream RouteTileEvent);
+}
+```
+
+| Concern | Rule |
+|---------|------|
+| Auth | gRPC metadata `authorization: Bearer <JWT>` |
+| TLS | Required in production; `SATELLITE_PROVIDER_TLS_INSECURE=1` dev knob |
+| Idempotency | `RouteSpec.route_id` (UUID string) |
+| Resume | Client persists last acked `batch_seq` per `route_id` locally (not on wire) |
+
+## Request
+
+### `DeliverRouteTilesRequest`
+
+| Field | Description |
+|-------|-------------|
+| `route` | Corridor geometry + single zoom |
+| `client_tiles` | Onboard inventory snapshot (route intersection only) |
+
+### `RouteSpec`
+
+| Field | Maps from gps-denied |
+|-------|----------------------|
+| `route_id` | Client-generated UUID per provision job |
+| `waypoints` | `replay_input.tlog_route.RouteSpec.waypoints` |
+| `region_size_meters` | `RouteSpec.suggested_region_size_meters` |
+| `zoom` | Single slippy zoom level (confirmed sufficient) |
+| `geofences` | Optional inclusion polygons |
+| `include_geofence_tiles` | Union geofence tiles with corridor grid |
+
+### `ClientTileRecord`
+
+Canonical key: **`(z, x, y)`**. `source` is informational only — **not** used in skip logic.
+
+| Field | C6 mapping |
+|-------|------------|
+| `resolution_m_per_px` | RESTRICT-SAT-4 (lower = better) |
+| `captured_at` | `TileMetadata.capture_timestamp` |
+| `content_sha256` | `TileMetadata.content_sha256_hex`, hex-decoded to the raw 32-byte digest |
+
+## Server skip rule (client catalog)
+
+For each server candidate tile, **omit from stream** when `client_tiles` has matching `(z,x,y)` and **any** of:
+
+1. `client.content_sha256` is non-empty and **equals** server payload hash → skip (byte-identical)
+2. `client.resolution_m_per_px <= server.resolution_m_per_px` **and** `client.captured_at >= server.captured_at` → skip (metadata-sufficient)
+
+`source` is **not** compared.
+
+`RouteManifest.skipped_by_client` counts tiles removed by this rule.
+
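The server skip rule above can be expressed as a small predicate. This is a sketch only: `TileMeta` and `server_should_skip` are hypothetical names, and the catalog lookup by `(z, x, y)` is assumed to have happened already (a missing client record is passed as `None`).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class TileMeta:
    """Hypothetical view of the fields the skip rule compares."""
    resolution_m_per_px: float
    captured_at: datetime
    content_sha256: bytes = b""


def server_should_skip(client: Optional[TileMeta], server: TileMeta) -> bool:
    """True when the client's copy makes streaming the server tile unnecessary."""
    if client is None:
        return False  # no matching (z, x, y) in client_tiles
    # Rule 1: non-empty hash equal to the server payload hash (byte-identical).
    if client.content_sha256 and client.content_sha256 == server.content_sha256:
        return True
    # Rule 2: client copy is at least as sharp AND at least as recent.
    return (
        client.resolution_m_per_px <= server.resolution_m_per_px
        and client.captured_at >= server.captured_at
    )
```

Note that `source` does not appear in `TileMeta` at all, which mirrors the statement that it is never compared.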
+## Sector — not on this wire
+
+**Sector** (`active_conflict` vs `stable_rear`) controls **how stale a tile may be before C6 rejects it on write** (AC-NEW-6 freshness). It is an operator decision about the geographic area, not something satellite-provider needs to deliver tiles.
+
+| Layer | Who applies sector |
+|-------|-------------------|
+| satellite-provider | Does not need sector — streams tiles by route geometry |
+| C11 client write | Reads sector from **C11/C12 config** (same as today) when calling C6 freshness gate |
+
+No `SectorClass` field on the gRPC request.
+
+## Response stream: `RouteTileEvent`
+
+Typical sequence:
+
+1. **`RouteManifest`** — `total_candidates`, `skipped_by_client`, `to_deliver`
+2. **`TileBatch`** — monotonic `batch_seq`; on-disk hits first, then freshly fetched
+3. **`ProgressUpdate`** — optional
+4. **`DeliveryComplete`** or **`DeliveryError`**
+
+### `DeliveryComplete` counters
+
+| Field | Meaning |
+|-------|---------|
+| `delivered` | Tiles actually sent in `TileBatch` messages |
+| `skipped_client` | Same as manifest `skipped_by_client` (echo for client verify) |
+| `skipped_server_filter` | Tiles SP required but **did not send** after client dedup — see below |
+
+#### `skipped_server_filter` — what counts
+
+Tiles that entered the post-client-dedup work queue but never appeared in a batch:
+
+| Reason | Example |
+|--------|---------|
+| **Fetch failed** | External imagery provider 404/timeout after retries |
+| **Below SP min resolution** | SP refuses to store/serve below its configured floor |
+| **Geometry clip** | Tile dropped after server-side corridor/geofence validation |
+| **Operational cap** | Job hit max-tiles / rate limit (if SP enforces) |
+
+Tiles skipped by the **client catalog rule** are **not** included here (they are `skipped_client`).
+
+If SP has no server-side filters in v1, `skipped_server_filter` may be **0**; the field is reserved for observability.
+
+### `TilePayload`
+
+| Field | Notes |
+|-------|-------|
+| `content_sha256` | 32-byte SHA-256 of `jpeg`; matches C6 DB invariant |
+| `route_priority` | Lower = earlier along route |
+
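A client-side integrity check for `TilePayload.content_sha256`, plus the hex-to-bytes conversion needed when filling `ClientTileRecord.content_sha256` from a hex digest, might look like this. The function names are hypothetical; only `hashlib` and `bytes.fromhex` from the standard library are used.

```python
import hashlib


def payload_hash_ok(jpeg: bytes, content_sha256: bytes) -> bool:
    """Check that content_sha256 is the 32-byte SHA-256 digest of the JPEG bytes."""
    return (
        len(content_sha256) == 32
        and hashlib.sha256(jpeg).digest() == content_sha256
    )


def hex_to_wire(content_sha256_hex: str) -> bytes:
    """Convert a stored hex digest into the raw 32 bytes sent on the wire."""
    raw = bytes.fromhex(content_sha256_hex)
    if len(raw) != 32:
        raise ValueError("expected a 64-character SHA-256 hex digest")
    return raw
```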
+## Client write path (gps-denied)
+
+`RouteTileDeliveryClient` (C11):
+
+- Assigns C6 `flight_id` from operator context locally (not from SP)
+- Applies RESTRICT-SAT-4, **sector-based freshness**, AZ-308 budget, download journal
+- Resumes via persisted `route_id` + `batch_seq`
+
+## Migration
+
+REST `route_client` + `HttpTileDownloader` remain fallback until AZ-979 benchmark.
+
+## Change log
+
+| Version | Date | Change |
+|---------|------|--------|
+| 0.3.0 | 2026-06-19 | `ClientTileRecord.content_sha256`; sequential field nums on `TilePayload`; sector/flight_id off wire; skip rule + `skipped_server_filter` defined |
+| 0.2.0 | 2026-06-19 | `satellite.v1.RouteTileDelivery` + `RouteTileEvent` oneof |
+| 0.1.0 | 2026-06-19 | Initial draft (superseded) |
@@ -1,17 +1,23 @@
 # Contract: tile_uploader

 **Component**: c11_tilemanager
-**Producer task**: AZ-319_c11_tile_uploader
-**Consumer tasks**: AZ-253 (E-C12 Operator Pre-flight Tooling — TBD at C12 decompose time)
-**Version**: 1.0.0
-**Status**: draft
-**Last Updated**: 2026-05-10
+**Producer task**: AZ-319_c11_tile_uploader (initial), Batch 44 C11-SRP-revert (v2.0.0 gate removal)
+**Consumer tasks**: AZ-329 (C12 `PostLandingUploadOrchestrator`) — see `_docs/02_document/contracts/c12_operator_orchestrator/` for the C12 surface that owns the post-landing safety gate.
+**Version**: 2.0.0
+**Status**: frozen
+**Last Updated**: 2026-05-13
+
+## Migration note — v1.0.0 → v2.0.0
+
+Batch 44 removed C11's internal post-landing safety gate per SRP. v1.0.0 exposed `confirm_flight_state(): FlightStateSignal` and raised `FlightStateNotOnGroundError` from `upload_pending_tiles`. v2.0.0 drops both — the equivalent check moved to C12's `PostLandingUploadOrchestrator` (AZ-329), which inspects the C13 `flight_footer` FDR record and refuses to invoke `upload_pending_tiles` unless `clean_shutdown=True` is recorded. C11 is now a dumb pipe.
+
+Consumers that still call `confirm_flight_state` or catch `FlightStateNotOnGroundError` MUST migrate to consuming C12's `FlightStateNotConfirmedError` family instead. ADR-004 process-level isolation remains the primary control — C11 never runs on the companion at all.

 ## Purpose

-The `TileUploader` Protocol is C11's operator-side post-landing upload interface. C12 invokes it during F10 (post-landing) to read mid-flight tiles flagged pending-upload from C6 (`source = onboard_ingest`, `voting_status = pending`), package them per the D-PROJ-2 ingest contract sketch, sign each tile payload with the per-flight ephemeral key (AZ-318), and POST to `satellite-provider`'s `/api/satellite/tiles/ingest` endpoint. Acknowledged tiles are marked uploaded in C6.
+The `TileUploader` Protocol is C11's operator-side post-landing upload interface. C12's `PostLandingUploadOrchestrator` (AZ-329) invokes it during F10 (post-landing) AFTER it has confirmed `clean_shutdown=True` from the C13 `flight_footer` FDR record. C11 then reads mid-flight tiles flagged pending-upload from C6 (`source = onboard_ingest`, `voting_status = pending`), packages them per the D-PROJ-2 ingest contract sketch, signs each tile payload with the per-flight ephemeral key (AZ-318), and POSTs to `satellite-provider`'s `/api/satellite/tiles/ingest` endpoint. Acknowledged tiles are marked uploaded in C6.

-The uploader gates on `flight_state == ON_GROUND` (AZ-317) before any network egress. C11 is operator-side ONLY; ADR-004 forbids the airborne companion image from importing this module.
+C11 is operator-side ONLY; ADR-004 forbids the airborne companion image from importing this module.

 ## Shape

@@ -24,14 +30,12 @@ from typing import Protocol, runtime_checkable
 class TileUploader(Protocol):
    def upload_pending_tiles(self, request: UploadRequest) -> UploadBatchReport: ...
    def enumerate_pending_tiles(self, flight_id: uuid.UUID | None = None) -> list[TileMetadata]: ...
-    def confirm_flight_state(self) -> FlightStateSignal: ...
 ```

 | Name | Signature | Throws / Errors | Blocking? |
 |------|-----------|-----------------|-----------|
-| `upload_pending_tiles` | `(request: UploadRequest) -> UploadBatchReport` | `FlightStateNotOnGroundError`, `SatelliteProviderError`, `RateLimitedError`, `SignatureRejectedError`, `TileMetadataError` | sync (post-landing; minutes) |
+| `upload_pending_tiles` | `(request: UploadRequest) -> UploadBatchReport` | `SatelliteProviderError`, `RateLimitedError`, `SignatureRejectedError`, `TileMetadataError` | sync (post-landing; minutes) |
 | `enumerate_pending_tiles` | `(flight_id: uuid.UUID \| None) -> list[TileMetadata]` | `TileMetadataError` | sync (seconds) |
-| `confirm_flight_state` | `() -> FlightStateSignal` | `FlightStateNotOnGroundError` | sync (≤ 1 ms) |

 ### Data DTOs

@@ -70,7 +74,7 @@ class PerTileStatus:

 ## Invariants

-- I-1: `confirm_flight_state` is called by `upload_pending_tiles` BEFORE any C6 read or network egress; if `FlightStateNotOnGroundError` is raised, NO tiles are read, NO POSTs are issued, NO C6 mutation occurs. The gate is closed by default.
+- I-1 (v2.0.0): C11 itself does NOT gate on flight state. The pre-call gate is C12's `PostLandingUploadOrchestrator` (AZ-329), which inspects the C13 `flight_footer` FDR record for `clean_shutdown=True` BEFORE invoking `upload_pending_tiles`. C11 is a dumb pipe — once called, it proceeds to read C6 + POST to the satellite-provider with no internal short-circuit. ADR-004 process-level isolation remains the primary defence (C11 never runs on the companion).
 - I-2: Every uploaded tile carries a signature produced by the AZ-318 per-flight key manager's `sign(payload)`. The parent suite verifies against the public key it received via the safety officer's pre-flight enrolment OR the `kind="c11.upload.session.key.public"` FDR record.
 - I-3: A tile acknowledged as `queued`, `duplicate`, or `superseded` by the parent suite is marked `uploaded` in C6 (`mark_uploaded(tile_id)`); a tile acknowledged as `rejected` is NOT marked uploaded — it remains `pending` for human review.
 - I-4: The per-flight signing key is zeroised at the end of `upload_pending_tiles` regardless of success or failure (try/finally in the caller; AZ-318's `end_session()`).
@@ -98,8 +102,8 @@ class PerTileStatus:

 | Case | Input | Expected | Notes |
 |------|-------|----------|-------|
-| upload-happy-path | 50 pending tiles, ON_GROUND, parent-suite returns 202 with all `queued` | `UploadBatchReport.outcome = success`; all 50 marked `uploaded` in C6; signature verifies on each | C11-IT-03 |
-| flight-state-blocks | `FlightStateSource` returns `IN_FLIGHT` | `FlightStateNotOnGroundError`; zero C6 reads; zero POSTs | C11-IT-04 |
+| upload-happy-path | 50 pending tiles, parent-suite returns 202 with all `queued` | `UploadBatchReport.outcome = success`; all 50 marked `uploaded` in C6; signature verifies on each | C11-IT-03 |
+| post-landing-gate-in-c12 | C12 `PostLandingUploadOrchestrator` invocation flow | The flight-state gate lives in C12 (`FlightStateNotConfirmedError`), not C11. v2.0.0 removed the C11 internal gate. | See `c12_operator_orchestrator` contract + AZ-329 spec |
 | signature-rejected | Parent suite returns `rejected` for 1 tile with reason `"invalid signature"` | `PerTileStatus.status = rejected`; `outcome = partial`; FDR `c11.upload.signature_rejected` emitted; the tile NOT marked uploaded | I-5 |
 | duplicate-acknowledged | Parent suite returns `duplicate` for 5 tiles (already ingested in a prior batch) | All 5 marked `uploaded`; `outcome = success` | I-3 |
 | signing-key-zeroised | Run a successful upload, then assert the AZ-318 manager's `_private_key is None` | Always zeroised; FDR `c11.upload.session.key.zeroised` recorded | I-4 |
@@ -112,3 +116,4 @@ class PerTileStatus:
 | Version | Date | Change | Author |
 |---------|------|--------|--------|
 | 1.0.0 | 2026-05-10 | Initial contract — produced by AZ-319 (E-C11 decomposition) | autodev |
+| 2.0.0 | 2026-05-13 | Batch 44: remove C11 internal flight-state gate per SRP. `confirm_flight_state` method dropped; `FlightStateNotOnGroundError` retired; post-landing safety gate now owned by C12's `PostLandingUploadOrchestrator` (AZ-329). Breaking — consumers MUST migrate to C12's `FlightStateNotConfirmedError`. | autodev (Batch 44) |
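
The v2.0.0 split of responsibilities can be sketched as follows: the caller checks the `flight_footer` record and only then invokes `upload_pending_tiles`, while the uploader itself performs no flight-state check. Everything here except the names `FlightStateNotConfirmedError`, `clean_shutdown` and `upload_pending_tiles` is a hypothetical stand-in; the real C12 surface is defined in the `c12_operator_orchestrator` contract.

```python
from dataclasses import dataclass
from typing import Optional


class FlightStateNotConfirmedError(Exception):
    """Stand-in for C12's error family (the real hierarchy is in the C12 contract)."""


@dataclass(frozen=True)
class FlightFooter:
    """Hypothetical view of the C13 `flight_footer` FDR record."""
    clean_shutdown: bool


class RecordingUploader:
    """Hypothetical C11 stand-in: no flight-state gate, it only counts calls."""

    def __init__(self) -> None:
        self.calls = 0

    def upload_pending_tiles(self, request: object) -> str:
        self.calls += 1
        return "success"


def run_post_landing_upload(
    footer: Optional[FlightFooter],
    uploader: RecordingUploader,
    request: object,
) -> str:
    """Caller-side gate: refuse to invoke C11 unless clean_shutdown=True is recorded."""
    if footer is None or not footer.clean_shutdown:
        raise FlightStateNotConfirmedError("flight_footer lacks clean_shutdown=True")
    return uploader.upload_pending_tiles(request)
```

When the footer is missing or reports `clean_shutdown=False`, the uploader is never called, which is the behaviour the `post-landing-gate-in-c12` test case assigns to C12.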
@@ -1,6 +1,6 @@
 # Contract: flights_api_client

-**Component**: c12_operator_tooling
+**Component**: c12_operator_orchestrator
 **Producer task**: AZ-489 — `_docs/02_tasks/todo/AZ-489_c12_flights_api_client.md`
 **Consumer tasks**: AZ-326 (CLI app — wires `--flight-id` / `--flight-file` flags), AZ-328 (build-cache orchestrator — calls `fetch_flight` / `load_flight_file`, then `bbox_from_waypoints` + `takeoff_origin_from_flight`)
 **Version**: 1.0.0
@@ -1,6 +1,6 @@
 # Contract: operator_command_transport

-**Component**: c12_operator_tooling
+**Component**: c12_operator_orchestrator
 **Producer task**: AZ-330 — `_docs/02_tasks/todo/AZ-330_c12_operator_reloc_service.md`
 **Consumer tasks**: TBD — a future E-C8 (AZ-261) task implements `MavlinkOperatorCommandTransport` against pymavlink
 **Version**: 1.0.0
@@ -9,7 +9,7 @@

 ## Purpose

-Defines the operator-workstation ↔ companion command channel for AC-3.4 operator-relocalization. C12 owns the Protocol shape; E-C8 (AZ-261) ships the pymavlink-backed concrete implementation that encodes the hint into a MAVLink message and transmits it over the GCS link to the airborne companion. Decoupling the two sides through this Protocol prevents C12 from having to know MAVLink details, and prevents E-C8 from having to know operator-tool internals — they meet at this contract.
+Defines the operator-workstation ↔ companion command channel for AC-3.4 operator-relocalization. C12 owns the Protocol shape; E-C8 (AZ-261) ships the pymavlink-backed concrete implementation that encodes the hint into a MAVLink message and transmits it over the GCS link to the airborne companion. Decoupling the two sides through this Protocol prevents C12 from having to know MAVLink details, and prevents E-C8 from having to know operator-orchestrator internals — they meet at this contract.

 ## Shape

@@ -127,6 +127,12 @@ Lives at `src/gps_denied_onboard/runtime_root/vio_factory.py`. Selects the strat

 - **6×6 SPD covariance always returned**: `pose_covariance_6x6` is symmetric and positive-definite for every `VioOutput`. Implementations MUST NOT return a "tightened" covariance (smaller Frobenius norm) during a degradation event; honest covariance is the safety floor for AC-NEW-4 and AC-NEW-7. A test (covariance-monotonicity contract test, deferred to Step 9 / E-BBT) asserts this across all three strategies.
 - **`frame_id` echo**: `VioOutput.frame_id` equals the input `NavCameraFrame.frame_id`. C5 relies on this for time-aligned factor insertion.
+- **`relative_pose_T.translation()` is in metres** (NOT pixels, NOT unit-length). Every strategy MUST emit metric translation; C5 fuses it directly into the state estimator without further scaling. Monocular strategies (KLT/RANSAC) recover scale through an injected `AltitudeProvider` (see AZ-919/AZ-920); stereo / VIO strategies (OKVIS2, VINS-Mono) get scale from their backend optimization.
+- **`scale_quality` carries the per-frame degraded-mode signal** (AZ-921). Three values:
+  - `"metric"` — translation is in metres, fully trustworthy. ESKF consumes `pose_covariance_6x6` as-is.
+  - `"direction_only"` — translation direction is informative but magnitude is not (near-vertical motion in a nadir camera; banked turn). ESKF overrides `R_meas[0:3, 0:3]` to `_DIRECTION_ONLY_TRANSLATION_SIGMA_M² = 64 m²` so the rotation update is honoured and the position update contributes little.
+  - `"unknown"` — translation is not trustworthy at all (AGL missing, zero inlier flow, hover, stationary). ESKF overrides `R_meas[0:3, 0:3]` to `_UNKNOWN_TRANSLATION_SIGMA_M² = 1e6 m²` so the position update is effectively skipped while the rotation update remains active.
+  Default `"unknown"` on the `VioOutput` dataclass keeps legacy strategies bug-for-bug compatible until they opt in to the AZ-919 `AltitudeProvider` plumbing.
 - **Single-threaded by contract**: each `VioStrategy` instance is bound to one writer thread (the camera ingest thread). Concurrent calls to `process_frame` on the same instance are undefined behaviour. The composition root binds one instance per ingest thread.
 - **`reset_to_warm_start` is destructive**: clears the strategy's keyframe window, IMU integration state, and feature track buffer; subsequent `process_frame` calls re-initialise from the hint. Calling `reset_to_warm_start` mid-flight is allowed (F8 reboot recovery) but must not be issued concurrently with a `process_frame` call on the same instance.
 - **`current_strategy_label()` is constant per instance**: returns the same string for the lifetime of the instance and matches `config.vio.strategy` exactly. The label is FDR-stamped on every `VioHealth` event for AC-NEW-3 audit.
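
The `scale_quality` rule in the list above can be sketched as a function that maps the three values to a measurement-noise matrix. This is an illustration under assumptions, not the ESKF code: the function name is hypothetical, the override is assumed to be a diagonal 3×3 block, and the translation/rotation cross terms are assumed to stay untouched.

```python
# Variances quoted in the invariant above, in square metres.
DIRECTION_ONLY_TRANSLATION_SIGMA_M2 = 64.0
UNKNOWN_TRANSLATION_SIGMA_M2 = 1e6


def measurement_noise(pose_covariance_6x6, scale_quality):
    """Return a copy of the 6x6 covariance with the translation block adjusted.

    Assumes rows/columns 0..2 are translation, matching R_meas[0:3, 0:3].
    """
    r_meas = [list(row) for row in pose_covariance_6x6]
    if scale_quality == "metric":
        return r_meas  # consumed as-is
    if scale_quality == "direction_only":
        sigma2 = DIRECTION_ONLY_TRANSLATION_SIGMA_M2
    elif scale_quality == "unknown":
        sigma2 = UNKNOWN_TRANSLATION_SIGMA_M2
    else:
        raise ValueError(f"unexpected scale_quality: {scale_quality!r}")
    # Assumption: diagonal override, zero off-diagonals inside the block.
    for i in range(3):
        for j in range(3):
            r_meas[i][j] = sigma2 if i == j else 0.0
    return r_meas
```

The rotation block (rows and columns 3..5) is copied unchanged in every branch, which is what keeps the rotation update active in the two degraded modes.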
@@ -4,7 +4,7 @@
 **Producer task**: AZ-342 (`ReRankStrategy` Protocol + factory + composition)
 **Consumer tasks**: AZ-343 (`InlierCountReRanker` impl); downstream c3_matcher (epic AZ-257 / E-C3 — TBD at AZ-257 decompose time) which consumes `RerankResult`
 **Version**: 1.0.0
-**Status**: draft, awaiting AZ-342 implementation
+**Status**: v1.0.0 (AZ-342 implemented 2026-05-12)
 **Last Updated**: 2026-05-10
 **Module-layout home**: `src/gps_denied_onboard/components/c2_5_rerank/interface.py` (Protocol), `src/gps_denied_onboard/components/c2_5_rerank/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/rerank_factory.py` (factory)

@@ -75,31 +75,29 @@ class ReRankStrategy(Protocol):

 ```python
 from dataclasses import dataclass
-from uuid import UUID
-import numpy as np


@dataclass(frozen=True, slots=True)
 class RerankCandidate:
    """One re-rank survivor. Carries the C2-stage descriptor_distance forward for FDR provenance plus the new inlier_count from single-pair LightGlue."""

-    tile_id: tuple  # composite (zoomLevel, lat, lon); see C6 TileRecord
-    inlier_count: int  # single-pair LightGlue inliers; > 0 for any survivor
-    descriptor_distance: float  # carried forward from C2's VprCandidate
-    descriptor_dim: int  # carried forward from C2 for sanity assertions
-    tile_pixels_handle: object  # opaque page-cache-backed pixel reference; see C6 TileStore contract
+    tile_id: tuple[int, float, float]  # composite (zoom_level, lat, lon); matches c6_tile_cache.TileId. tuple form keeps _types/ free of an L1→L3 import per module-layout.md.
+    inlier_count: int                  # single-pair LightGlue inliers; > 0 for any survivor
+    descriptor_distance: float         # carried forward from C2's VprCandidate
+    descriptor_dim: int                # carried forward from C2 for sanity assertions
+    tile_pixels_handle: object         # opaque page-cache-backed pixel reference; see C6 TileStore contract


@dataclass(frozen=True, slots=True)
 class RerankResult:
    """Top-N survivors from `ReRankStrategy.rerank`. Consumed by C3 CrossDomainMatcher."""

-    frame_id: UUID
-    candidates: list[RerankCandidate]  # 0 < len <= n; sorted descending by inlier_count, ties broken by descriptor_distance ascending
-    reranked_at: int  # monotonic_ns
-    rerank_label: str  # non-empty; matches BUILD_RERANK_<variant> lowercase (e.g., "inlier_count")
-    candidates_input: int  # len(vpr_result.candidates) at entry — for FDR observability
-    candidates_dropped: int  # candidates_input - len(candidates)
+    frame_id: int                          # echoes NavCameraFrame.frame_id (int across the pipeline)
+    candidates: tuple[RerankCandidate, ...]  # 0 < len <= n; descending by inlier_count, ties broken by descriptor_distance ascending. tuple (not list) so the frozen+slots invariant holds.
+    reranked_at: int                       # monotonic_ns from injected Clock
+    rerank_label: str                      # non-empty; matches BUILD_RERANK_<variant> lowercase
+    candidates_input: int                  # len(vpr_result.candidates) at entry — for FDR observability
+    candidates_dropped: int                # candidates_input - len(candidates)
 ```

 ### Error Hierarchy (in `c2_5_rerank/errors.py`)
@@ -4,7 +4,7 @@
 **Producer task**: AZ-336 (`VprStrategy` Protocol + factory + composition)
 **Consumer tasks**: AZ-337 (UltraVPR), AZ-338 (NetVLAD baseline), AZ-339 (MegaLoc + MixVPR), AZ-340 (SelaVPR + EigenPlaces + SALAD), AZ-341 (FAISS HNSW retrieve wiring), and downstream c2_5_rerank (AZ-256 / E-C2.5)
 **Module-layout home**: `src/gps_denied_onboard/components/c2_vpr/interface.py` (Protocols), `src/gps_denied_onboard/components/c2_vpr/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/vpr_factory.py` (factory)
-**Status**: draft, awaiting AZ-336 implementation
+**Status**: v1.0.0 (AZ-336 implemented 2026-05-12)

 ## Purpose

@@ -69,25 +69,23 @@ class VprStrategy(Protocol):

 ```python
 from dataclasses import dataclass
-from uuid import UUID
-import numpy as np


@dataclass(frozen=True, slots=True)
 class VprQuery:
    """Backbone embedding for a single nav-camera frame. Produced by `VprStrategy.embed_query`; consumed by `VprStrategy.retrieve_topk` (same instance) or — in the C10 corpus-build path — by `DescriptorIndexBuilder` to populate the corpus descriptor matrix."""

-    frame_id: UUID
-    embedding: np.ndarray  # shape (D,), dtype float16 or float32; L2-normalised
-    produced_at: int  # monotonic_ns
+    frame_id: int                # echoes NavCameraFrame.frame_id (the source carries int across the pipeline)
+    embedding: object            # numpy.ndarray, shape (D,), dtype float16|float32; L2-normalised. typed object to keep _types/ free of numpy import-time dep.
+    produced_at: int             # monotonic_ns from injected Clock


@dataclass(frozen=True, slots=True)
 class VprCandidate:
    """One retrieval candidate from the top-K result."""

-    tile_id: tuple  # composite (zoomLevel, lat, lon); see C6 TileRecord
-    descriptor_distance: float  # backbone-specific metric (cosine for L2-normalised; Euclidean for raw)
+    tile_id: tuple[int, float, float]  # composite (zoom_level, lat, lon); matches c6_tile_cache.TileId. tuple form keeps _types/ free of an L1→L3 import per module-layout.md.
+    descriptor_distance: float         # backbone-specific metric (cosine for L2-normalised; Euclidean for raw)
    descriptor_dim: int


@@ -95,10 +93,10 @@ class VprCandidate:
 class VprResult:
    """Top-K candidates from `VprStrategy.retrieve_topk`. Consumed by C2.5 ReRanker."""

-    frame_id: UUID
-    candidates: list[VprCandidate]  # length == k, sorted ascending by descriptor_distance
-    retrieved_at: int  # monotonic_ns
-    backbone_label: str  # non-empty; matches BUILD_VPR_<variant> lowercase
+    frame_id: int                        # echoes the source NavCameraFrame.frame_id
+    candidates: tuple[VprCandidate, ...] # length == k, ascending by descriptor_distance. tuple (not list) so the frozen+slots invariant holds.
+    retrieved_at: int                    # monotonic_ns from injected Clock
+    backbone_label: str                  # non-empty; matches BUILD_VPR_<variant> lowercase
 ```

 ### Protocol: `BackbonePreprocessor` (C2-internal; lives in `c2_vpr/_preprocessor.py`)
@@ -155,18 +153,21 @@ class IndexUnavailableError(VprError):
 # src/gps_denied_onboard/runtime_root/vpr_factory.py

 from typing import TYPE_CHECKING
-from gps_denied_onboard.config import Config
+from gps_denied_onboard.config.schema import Config
 from gps_denied_onboard.components.c2_vpr import VprStrategy
-from gps_denied_onboard.components.c6_tile_cache import TileStore
+from gps_denied_onboard.components.c6_tile_cache import DescriptorIndex
 from gps_denied_onboard.components.c7_inference import InferenceRuntime


 def build_vpr_strategy(
    config: Config,
-    tile_store: TileStore,
+    *,
+    descriptor_index: DescriptorIndex,
    inference_runtime: InferenceRuntime,
 ) -> VprStrategy:
-    """Composition-root factory. Reads `config.vpr.strategy` and `config.vpr.backbone_weights_path`; lazy-imports the concrete strategy module gated by its CMake `BUILD_VPR_<variant>` flag; refuses to instantiate a strategy whose flag is OFF (raises `ConfigurationError` pointing at the offending strategy name + missing flag).
+    """Composition-root factory. Reads `config.components['c2_vpr'].strategy` and `config.components['c2_vpr'].backbone_weights_path`; lazy-imports the concrete strategy module gated by its CMake `BUILD_VPR_<variant>` flag; refuses to instantiate a strategy whose flag is OFF (raises `StrategyNotAvailableError` pointing at the offending strategy name + missing flag).
+
+    `descriptor_index` (NOT `tile_store`) is injected: the pre-flight `descriptor_dim` validation reads from C6's `DescriptorIndex.descriptor_dim()`, the Public API that owns the FAISS index sidecar. An earlier draft of this contract named the parameter `tile_store`; the implementation replaced it to match C6's actual Public API.

    Strategy resolution table:

@@ -180,9 +181,9 @@ def build_vpr_strategy(
    | "eigen_places"      | EigenPlacesStrategy  | components.c2_vpr.eigen_places                | BUILD_VPR_EIGENPLACES   |
    | "salad"             | SaladStrategy        | components.c2_vpr.salad                       | BUILD_VPR_SALAD         |

-    Pre-flight validation: after constructing the strategy, the factory queries `strategy.descriptor_dim()` and asserts it matches the C6 corpus index's declared `descriptor_dim` (read from the FAISS index sidecar). Mismatch → `ConfigurationError` at startup, NOT at first frame.
+    Pre-flight validation: after constructing the strategy, the factory queries `strategy.descriptor_dim()` and asserts it matches `descriptor_index.descriptor_dim()` (the FAISS index sidecar value). Mismatch → `ConfigError` at startup, NOT at first frame.

-    Returns a fully-constructed strategy ready for `embed_query` / `retrieve_topk` invocation. The caller (runtime root) is responsible for binding the instance to one ingest thread.
+    Returns a fully-constructed strategy ready for `embed_query` / `retrieve_topk` invocation. The caller (runtime root) is responsible for binding the instance to one ingest thread (AC-9 deferred until the generic compose_root thread-binding registry is in place; see task spec Risk 4).
    """
    ...
 ```
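
The pre-flight validation described in the docstring above amounts to comparing two `descriptor_dim()` values at startup. A minimal sketch, assuming only that both objects expose `descriptor_dim()`; the stub classes, the local `ConfigError` and the helper name `check_descriptor_dim` are hypothetical.

```python
class ConfigError(Exception):
    """Local stand-in for the project's ConfigError."""


class StubStrategy:
    """Hypothetical strategy exposing only descriptor_dim()."""

    def __init__(self, dim: int) -> None:
        self._dim = dim

    def descriptor_dim(self) -> int:
        return self._dim


class StubDescriptorIndex:
    """Hypothetical index exposing only descriptor_dim()."""

    def __init__(self, dim: int) -> None:
        self._dim = dim

    def descriptor_dim(self) -> int:
        return self._dim


def check_descriptor_dim(strategy, descriptor_index) -> None:
    """Fail at startup, not at first frame, when the dimensions disagree."""
    expected = descriptor_index.descriptor_dim()
    actual = strategy.descriptor_dim()
    if actual != expected:
        raise ConfigError(
            f"strategy descriptor_dim={actual} does not match index descriptor_dim={expected}"
        )
```

Running this check inside the factory means a mismatched backbone and corpus index stop the process before any frame is embedded.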
@@ -4,8 +4,8 @@
 **Producer task**: AZ-348 (Protocol + factory + DTOs + composition + `PassthroughRefiner`)
 **Consumer tasks**: AZ-349 (`AdHoPRefiner` real refinement); downstream c4_pose (epic AZ-259) which consumes the (possibly refined) `MatchResult`
 **Version**: 1.0.0
-**Status**: draft, awaiting Producer task implementation
-**Last Updated**: 2026-05-10
+**Status**: v1.0.0 (AZ-348 implemented 2026-05-12; PassthroughRefiner shipped — AdHoPRefiner pending AZ-349)
+**Last Updated**: 2026-05-12
 **Module-layout home**: `src/gps_denied_onboard/components/c3_5_adhop/interface.py` (Protocol), `src/gps_denied_onboard/components/c3_5_adhop/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/refiner_factory.py` (factory)

 > **Public API symbol naming.** The component's public interface symbol is named `ConditionalRefiner` in `description.md` § 2 and `AdHoPRefinementStrategy` in `module-layout.md` § c3_5_adhop. Both refer to the SAME Protocol; the canonical class name in code is `ConditionalRefiner`, which describes the component's role and matches the method `refine_if_needed`. The producer task ALSO updates `module-layout.md` to align (`AdHoPRefinementStrategy` → `ConditionalRefiner`) so the two documents agree.
@@ -4,8 +4,8 @@
 **Producer task**: AZ-344 (`CrossDomainMatcher` Protocol + factory + composition)
 **Consumer tasks**: AZ-345 (DISK+LightGlue primary), AZ-346 (ALIKED+LightGlue secondary), AZ-347 (XFeat alternate); downstream c3_5_adhop (epic AZ-258) which consumes `MatchResult`
 **Version**: 1.0.0
-**Status**: draft, awaiting AZ-344 implementation
-**Last Updated**: 2026-05-10
+**Status**: v1.0.0 (AZ-344 implemented 2026-05-12)
+**Last Updated**: 2026-05-12
 **Module-layout home**: `src/gps_denied_onboard/components/c3_matcher/interface.py` (Protocol), `src/gps_denied_onboard/components/c3_matcher/__init__.py` (re-exports), `src/gps_denied_onboard/runtime_root/matcher_factory.py` (factory)

 ## Purpose
@@ -65,14 +65,13 @@ class CrossDomainMatcher(Protocol):

 ```python
 from dataclasses import dataclass
-from uuid import UUID
 import numpy as np


@dataclass(frozen=True, slots=True)
 class CandidateMatchSet:
    """Per-candidate matching outcome inside a MatchResult."""
-    tile_id: tuple  # composite (zoomLevel, lat, lon)
+    tile_id: tuple[int, float, float]  # composite (zoom_level, lat, lon); mirrors VprCandidate / RerankCandidate encoding so the L1 _types layer is free of an L1→L3 import to c6_tile_cache.TileId
    inlier_count: int
    inlier_correspondences: np.ndarray  # shape (I, 4) float32; (px_query, py_query, px_tile, py_tile)
    ransac_outlier_count: int
@@ -82,8 +81,8 @@ class CandidateMatchSet:
@dataclass(frozen=True, slots=True)
 class MatchResult:
    """Cross-domain match outcome for one frame. Consumed by C3.5 ConditionalRefiner."""
-    frame_id: UUID
-    per_candidate: list[CandidateMatchSet]  # 0 < len <= N=3, ranked by inlier_count descending; ties broken by per_candidate_residual_px ascending
+    frame_id: int  # mirrors NavCameraFrame.frame_id; matches AZ-336 / AZ-342 encoding
+    per_candidate: tuple[CandidateMatchSet, ...]  # 0 < len <= N=3, ranked by inlier_count descending; ties broken by per_candidate_residual_px ascending. tuple (not list) so frozen+slots actually holds.
    best_candidate_idx: int  # 0 by construction (sorted)
    reprojection_residual_px: float  # best candidate's median residual
    matched_at: int  # monotonic_ns
@@ -115,6 +114,8 @@ class InsufficientInliersError(MatcherError):
    """Every candidate failed OR every candidate's inlier count is below `config.matcher.min_inliers_threshold`. Raised by `match`. C5 falls back to VIO-only."""
 ```

+The composition-time selection error is **`StrategyNotAvailableError`** (`runtime_root.errors`), NOT a member of `MatcherError`: it surfaces when the binary lacks the requested `BUILD_MATCHER_<variant>` flag or the concrete strategy module is not built yet (AZ-345..AZ-347 pending). This matches the C2 VPR (AZ-336) and C2.5 ReRank (AZ-342) factory pattern: per-frame matcher errors live in the C3 family; composition-time selection errors live in the shared runtime-root family.
+
 ## Composition-Root Factory

 ```python
@@ -6,10 +6,10 @@
 - AZ-TBD-c6-postgres-filesystem-store (implements)
 - AZ-TBD-c6-freshness-gate (insert hook + sector classification reader)
 - AZ-TBD-c6-cache-budget-eviction (LRU candidate enumeration + delete coordination)
-- TBD at decompose time: E-C10 (AZ-252 — manifest + provisioning), E-C11 (AZ-251 — both `TileDownloader` insert and `TileUploader` reader queries), E-C12 (AZ-253 — operator pre-flight tooling)
-**Version**: 1.0.0
+- TBD at decompose time: E-C10 (AZ-252 — manifest + provisioning), E-C11 (AZ-251 — both `TileDownloader` insert and `TileUploader` reader queries), E-C12 (AZ-253 — operator pre-flight orchestrator)
+**Version**: 1.3.0
 **Status**: draft
-**Last Updated**: 2026-05-10
+**Last Updated**: 2026-05-13

 ## Purpose

@@ -32,6 +32,7 @@ Defines the typed boundary to the Postgres-backed spatial index over `TileMetada
 | `lru_candidates` | `(*, max_count: int) -> list[TileMetadata]` | `TileMetadataError` | sync (oldest-`accessed_at`-first; bounded result set) |
 | `total_disk_bytes` | `() -> int` | `TileMetadataError` | sync (sum of `disk_bytes` column; ≤ 100 ms even at 100k rows) |
 | `get_by_id` | `(tile_id: TileId) -> Optional[TileMetadata]` | `TileMetadataError` | sync; returns `None` if absent (NOT `TileNotFoundError`) |
+| `increment_upload_attempts` | `(tile_id: TileId) -> int` | `TileMetadataError`, `TileNotFoundError` | sync; atomic `UPDATE … RETURNING` (per-row lock); added in v1.3.0 |

 ### DTOs

@@ -63,6 +64,17 @@ class TileMetadataPersistent:

 The Protocol returns `TileMetadata` from queries. `TileMetadataPersistent` is the in-process view of LRU and disk-budget state, accessible only via `lru_candidates` / `record_lru_access` / `total_disk_bytes`.

+#### TileMetadata.location_hash (v1.2.0)
+
+```python
+@dataclass(frozen=True)
+class TileMetadata:
+    # ...existing AZ-303 v1.1.0 fields unchanged...
+    location_hash: UUID | None = None    # uuidv5(TILE_NAMESPACE_UUID, "{zoom}/{tile_x}/{tile_y}")
+```
+
+`location_hash` is a deterministic per-cell-bag identifier (UUIDv5, namespace-pinned in `c6_tile_cache._uuid_namespace.TILE_NAMESPACE_UUID`) shared by every row at the same `(zoom_level, tile_x, tile_y)` regardless of source or flight (Scenario 1 UI lookup, Scenario 6 voting query of the 2026-05-12 tile-schema scenario analysis). Defaults to `None` so AZ-303-era constructors continue to work; `PostgresFilesystemStore.insert_metadata` derives the value via `derive_location_hash(zoom_level, tile_x, tile_y)` when `None`, and the DB-side NOT-NULL constraint is the safety net. Cross-repo coordinated with `satellite-provider` per `AZ-TBD_tile_identity_uuidv5_bulk_list`.
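
As a rough illustration of the UUIDv5 derivation described above, the standard-library `uuid.uuid5` produces the same identifier for the same cell every time. The namespace below is a placeholder invented for this sketch (the real value is pinned in `c6_tile_cache._uuid_namespace.TILE_NAMESPACE_UUID` and is not shown here), and `derive_location_hash_sketch` is a hypothetical stand-in for the project's `derive_location_hash`.

```python
import uuid

# Placeholder namespace for illustration only; NOT the project's pinned value.
EXAMPLE_NAMESPACE_UUID = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/tiles")


def derive_location_hash_sketch(zoom_level: int, tile_x: int, tile_y: int) -> uuid.UUID:
    """Deterministic per-cell identifier built from the "{zoom}/{tile_x}/{tile_y}" name."""
    return uuid.uuid5(EXAMPLE_NAMESPACE_UUID, f"{zoom_level}/{tile_x}/{tile_y}")
```

Because the name string contains neither source nor `flight_id`, rows from different flights at the same cell share one value, which is the cell-bag behaviour the paragraph above describes.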
+
 ### Sector classification (read-only input to the freshness gate)

 ```python
@@ -77,18 +89,18 @@ class SectorBoundary:
    classification: SectorClassification
 ```

-`SectorClassification` is set pre-flight by the operator via C12; the metadata store reads `SectorBoundary` rows from a sibling table (`sector_boundaries`) at insert-time to decide which freshness rule to apply. The Protocol does NOT expose insert-side methods for `SectorBoundary` rows — that surface lives in C12.
+`SectorClassification` is set pre-flight by the operator via C12; at insert time the metadata store reads `SectorBoundary` rows from the sibling table `sector_classifications` to decide which freshness rule to apply. The table follows the AZ-263 baseline schema; AZ-304 adds the NULLable `min_lat` / `min_lon` / `max_lat` / `max_lon` bbox columns that operators populate. The Protocol does NOT expose insert-side methods for `SectorBoundary` rows — that surface lives in C12.

 ## Invariants

-- **I-1 (composite key uniqueness):** `(zoom_level, lat, lon, source)` is the unique key in the `tiles` table. Re-inserting the same key with different content_sha256 raises `TileMetadataError` — no silent overwrite.
+- **I-1 (natural-key uniqueness, per-flight separated):** the storage's unique key is `(zoom_level, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'::uuid))`. The integer slippy-tile coordinates `(tile_x, tile_y)` are derived from the DTO's WGS84 `(lat, lon)` and `zoom_level` via the project's shared Web-Mercator helper at insert time; `lat` / `lon` are persisted advisory-only and are NOT part of the uniqueness predicate. The `flight_id` coalesce term means two `ONBOARD_INGEST` rows for the same cell from different flights coexist (required by the future D-PROJ-2 voting layer), while two `GOOGLEMAPS` rows for the same cell (both `flight_id` = NULL → both coalesce to the zero UUID) cannot. Re-inserting an identical natural key with different `content_sha256_hex` raises `TileMetadataError` — no silent overwrite.
 - **I-2 (freshness gate at insert):** `insert_metadata` rejects (raises `FreshnessRejectionError`) iff the tile's `(lat, lon)` falls inside an `ACTIVE_CONFLICT` sector AND `capture_timestamp < now() - active_conflict_max_age`. The freshness rules table is configured per-flight (default 6 months for active_conflict; 12 months for stable_rear which downgrades rather than rejects).
 - **I-3 (downgrade marking):** when a tile in a `STABLE_REAR` sector is older than `stable_rear_max_age`, the row is inserted with `freshness_label=DOWNGRADED` (NOT rejected). `query_by_bbox` returns the downgrade flag intact so consumers (C2 / C3 spoof-rejection) can act on it.
 - **I-4 (LRU clock):** `record_lru_access` updates `accessed_at = max(current accessed_at, supplied timestamp)`; clock skew never sets `accessed_at` backward. `lru_candidates` returns oldest-first.
 - **I-5 (disk-budget invariant):** `total_disk_bytes` MUST equal `SUM(disk_bytes)` over all rows where `voting_status != REJECTED`. Rejected rows are tombstones — they keep the on-disk file deleted but retain the row for the manifest's content-hash check (D-C10-3).
 - **I-6 (frozen DTOs):** `Bbox`, `SectorBoundary`, `TileMetadataPersistent` are `@dataclass(frozen=True)`.
 - **I-7 (transactional writes):** `insert_metadata` is a single transaction over the `tiles` table; the freshness check + the row insert MUST be atomic (a parallel sector-boundary update MUST NOT race the gate).
-- **I-8 (no silent voting-status downgrade):** `update_voting_status` accepts only forward transitions (`PENDING → TRUSTED`, `PENDING → REJECTED`); a backward transition raises `TileMetadataError`. `TRUSTED → REJECTED` is allowed (covers the cache-poisoning recall path).
+- **I-8 (no silent voting-status downgrade):** `update_voting_status` accepts only forward transitions (`PENDING → TRUSTED`, `PENDING → REJECTED`, `TRUSTED → REJECTED`, `PENDING → UPLOAD_GIVEUP`, `TRUSTED → UPLOAD_GIVEUP`); a backward transition raises `TileMetadataError`. `TRUSTED → REJECTED` covers the cache-poisoning recall path; the two `UPLOAD_GIVEUP` transitions (added in v1.3.0 by AZ-320) cover the C11 retry decorator's per-tile budget exhaustion. `UPLOAD_GIVEUP → anything` is forbidden — recovery is an out-of-band SQL UPDATE by the operator.
 - **I-9 (`pending_uploads` is the single source for C11 TileUploader):** the uploader MUST NOT scan the filesystem for pending tiles; it MUST drive its loop off `pending_uploads()`. The metadata store is the bookkeeping.
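
The forward-only rule in I-8 can be expressed as a lookup over the five allowed `(current, new)` pairs. This is a sketch, not the store's implementation: the enum, the local `TileMetadataError` and the helper name are stand-ins, and rejecting same-state "transitions" is an assumption the invariant does not spell out.

```python
from enum import Enum


class VotingStatus(Enum):
    PENDING = "pending"
    TRUSTED = "trusted"
    REJECTED = "rejected"
    UPLOAD_GIVEUP = "upload_giveup"


class TileMetadataError(Exception):
    """Local stand-in for the store's error type."""


# The five forward transitions listed in I-8 (v1.3.0).
ALLOWED_TRANSITIONS = frozenset({
    (VotingStatus.PENDING, VotingStatus.TRUSTED),
    (VotingStatus.PENDING, VotingStatus.REJECTED),
    (VotingStatus.TRUSTED, VotingStatus.REJECTED),
    (VotingStatus.PENDING, VotingStatus.UPLOAD_GIVEUP),
    (VotingStatus.TRUSTED, VotingStatus.UPLOAD_GIVEUP),
})


def check_transition(current: VotingStatus, new: VotingStatus) -> None:
    """Raise TileMetadataError for anything outside the allowed set."""
    if (current, new) not in ALLOWED_TRANSITIONS:
        raise TileMetadataError(f"forbidden transition {current.name} -> {new.name}")
```

No pair starts from `UPLOAD_GIVEUP`, so every attempt to leave that state is rejected, consistent with recovery being an out-of-band SQL UPDATE.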

 ## Non-Goals
@@ -98,7 +110,7 @@ class SectorBoundary:
 - **Not covered: sector boundary insert / update.** Owned by C12 operator-tooling against a sibling table; this Protocol is read-only on `SectorBoundary` and does NOT expose CRUD.
 - **Not covered: cross-flight aggregation / voting threshold computation.** That's `satellite-provider`'s D-PROJ-2 trust layer (parent suite); C6 just stamps the per-row `voting_status`.
 - **Not covered: full-text search / arbitrary-WHERE queries.** Only the methods above; ad-hoc queries go through DBA tooling, not this Protocol.
-- **Not covered: schema migrations.** Migration scripts live in `c6_tile_cache/_alembic/`; the Protocol is shape-only.
+- **Not covered: schema migrations.** Migration scripts live in `db/migrations/versions/` (project-level Alembic env owned by c6_tile_cache per `module-layout.md`; `0001_initial.py` shipped by AZ-263, `0002_c6_tile_identity_and_lru.py` by AZ-304); the Protocol is shape-only.

 ## Versioning Rules

@@ -111,7 +123,8 @@ Same rules as `tile_store.md` § Versioning Rules.
 | protocol-conformance-full | A class implementing all 9 methods | `isinstance(impl, TileMetadataStore) == True` | Producer AC-1 |
 | query-by-bbox-basic | bbox covering 100 inserted tiles at zoom=18 | Returns exactly the 100 tiles; `voting_filter=None` returns all statuses | Smoke |
 | query-by-bbox-voting-filter | Same with `voting_filter=TRUSTED` | Returns only TRUSTED tiles in bbox | Used by C10 manifest builder |
-| insert-duplicate-key | Insert (z=18, lat, lon, src=GOOGLEMAPS) twice with different content_sha256 | First succeeds; second raises `TileMetadataError` | I-1 |
+| insert-duplicate-key | Insert (z=18, tile_x, tile_y, tile_size_meters, src=GOOGLEMAPS, flight_id=NULL) twice with different content_sha256 | First succeeds; second raises `TileMetadataError` | I-1 |
+| insert-per-flight-coexists | Insert (z=18, tile_x, tile_y, tile_size_meters, src=ONBOARD_INGEST) twice with different `flight_id` values | Both succeed; rows share the same derived `location_hash` cell-bag identifier | I-1 / D-PROJ-2 |
 | insert-active-conflict-stale | Insert into ACTIVE_CONFLICT sector, capture_timestamp = now - 7 months | `FreshnessRejectionError`; row not committed | I-2 / C6-IT-02 |
 | insert-stable-rear-stale | Insert into STABLE_REAR sector, capture_timestamp = now - 13 months | Row inserted with `freshness_label=DOWNGRADED` | I-3 |
 | update-voting-status-forward | PENDING → TRUSTED | Succeeds | I-8 |
@@ -130,3 +143,6 @@ Same rules as `tile_store.md` § Versioning Rules.
 | Version | Date | Change | Author |
 |---------|------|--------|--------|
 | 1.0.0 | 2026-05-10 | Initial contract — 9-method Protocol + LRU/disk-budget extensions + freshness gate semantics + composite-key uniqueness invariant. | autodev (decompose Step 2 of AZ-250 / E-C6) |
+| 1.1.0 | 2026-05-12 | Non-breaking refinement of Invariant I-1: natural key switched from `(zoom_level, lat, lon, source)` (float-based) to `(zoom_level, tile_x, tile_y, tile_size_meters, source, COALESCE(flight_id, zero_uuid))` (integer + per-flight separated). Protocol surface unchanged; consumers gain the ability to observe multiple ONBOARD_INGEST rows for the same cell from different flights (required by D-PROJ-2 voting). Driven by `_docs/_process_leftovers/2026-05-12_tile-schema-scenario-analysis.md` and the cross-workspace satellite-provider task `AZ-TBD_tile_identity_uuidv5_bulk_list`. | autodev (AZ-304 batch 27 of cycle 1) |
+| 1.2.0 | 2026-05-12 | Non-breaking addition of `TileMetadata.location_hash: UUID \| None = None` (cross-source/cross-flight cell-bag identifier; UUIDv5 over `(zoom, tile_x, tile_y)`). Corrected stale references: sector table name (`sector_boundaries` → `sector_classifications`) and Alembic env path (`c6_tile_cache/_alembic/` → `db/migrations/versions/`). Protocol surface unchanged; existing constructors continue to work because the field defaults to `None`. Shipped by AZ-304 alongside the additive `0002_c6_tile_identity_and_lru` migration. | autodev (AZ-304 batch 27 of cycle 1) |
+| 1.3.0 | 2026-05-13 | Non-breaking addition of (a) `VotingStatus.UPLOAD_GIVEUP` terminal state, (b) two new forward transitions (`PENDING → UPLOAD_GIVEUP`, `TRUSTED → UPLOAD_GIVEUP`) under Invariant I-8, (c) the `increment_upload_attempts(tile_id) -> int` Protocol method (atomic per-row UPDATE … RETURNING), and (d) the `tiles.upload_attempts INTEGER NOT NULL DEFAULT 0` column. The Protocol method body raises `NotImplementedError` so legacy duck-typed impls keep their conformance — production wiring uses `PostgresFilesystemStore` which ships the SQL. `pending_uploads()` now also excludes `voting_status = upload_giveup`. Shipped by AZ-320 (C11 retry decorator) alongside the additive `0003_c11_upload_attempts` migration. | autodev (AZ-320 batch 41 of cycle 1) |
@@ -7,7 +7,7 @@
 - AZ-TBD-c6-freshness-gate (insert hook collaborator)
 - AZ-TBD-c6-cache-budget-eviction (uses `tile_exists` + `delete_tile`)
 - TBD at decompose time: E-C2.5 (AZ-256), E-C3 (AZ-257), E-C11 (AZ-251 — both `TileDownloader` and `TileUploader`)
-**Version**: 1.0.0
+**Version**: 1.1.0
 **Status**: draft
 **Last Updated**: 2026-05-10

@@ -104,11 +104,12 @@ All under `c6_tile_cache.errors`:

 ```
 TileCacheError (Exception subclass)
-├── TileNotFoundError          # tile_id not present on disk
-├── TileFsError                # I/O error on read/write/rename
-├── TileMetadataError          # row missing despite file present, or vice-versa (consistency violation)
-├── ContentHashMismatchError   # supplied JPEG bytes don't match declared content_sha256
-└── FreshnessRejectionError    # rejected by the C6 freshness gate (raised on insert in active_conflict)
+├── TileNotFoundError            # tile_id not present on disk
+├── TileFsError                  # I/O error on read/write/rename
+├── TileMetadataError            # row missing despite file present, or vice-versa (consistency violation)
+├── ContentHashMismatchError     # supplied JPEG bytes don't match declared content_sha256
+├── FreshnessRejectionError      # rejected by the C6 freshness gate (raised on insert in active_conflict)
+└── CacheBudgetExhaustedError    # LRU sweep ran to completion but couldn't free `needed_bytes` (AZ-308)
 ```

 `IndexUnavailableError` lives under the same package but is exclusively raised by `DescriptorIndex` — it is not part of `TileStore`'s envelope.
@@ -164,3 +165,4 @@ JPEG body lands at `<root>/tiles/{zoom_level}/{x}/{y}.jpg` where `(x, y)` is der
 | Version | Date | Change | Author |
 |---------|------|--------|--------|
 | 1.0.0 | 2026-05-10 | Initial contract — Protocol + DTOs + 5-error family + filesystem byte-identity invariant. | autodev (decompose Step 2 of AZ-250 / E-C6) |
+| 1.1.0 | 2026-05-12 | Additive: `CacheBudgetExhaustedError` joins the `TileCacheError` family for AZ-308 cache-budget enforcement. No existing-shape changes. | autodev (AZ-308) |
@@ -0,0 +1,157 @@
+# Replay-input CSV format (AZ-896)
+
+**Status**: canonical operator-facing spec for the `--imu` argument of
+`gps-denied-replay` (AZ-894).
+**Audience**: operators preparing a (video, CSV) replay pair, plus engineers
+implementing alternative replay backends.
+**Companion artifacts**:
+
+- `_docs/02_document/contracts/replay/example_data_imu.csv` — minimal valid
+  example (20 rows = 2 s at 10 Hz).
+- `_docs/00_problem/input_data/flight_derkachi/data_imu.csv` — full Derkachi
+  fixture (4,900 rows = 489.9 s at 10 Hz).
+- Parser implementation:
+  `src/gps_denied_onboard/replay_input/csv_ground_truth.py`.
+
+## Hard contract (read before generating a file)
+
+The replay pipeline trusts the CSV blindly inside the loop. Violations of any
+of the following will produce silently wrong outputs (the parser only catches
+schema-level faults, not semantic ones), so the operator owns these
+invariants:
+
+1. **Nadir camera.** The companion `.mp4` must be a nadir (straight-down)
+   recording. The C1 VIO and C2 VPR stages assume nadir framing; oblique
+   imagery breaks the satellite-anchor and VIO scale recovery.
+2. **Airborne at row 0.** The UAV must already be airborne at the first CSV
+   row / first video frame. The replay pipeline does not implement a
+   take-off detector — feeding a ground-roll segment yields garbage IMU
+   integration.
+3. **Aligned start.** Row 0's `Time = 0.0` must correspond to the first
+   video frame. The CLI does not perform sub-frame alignment; offset the
+   CSV/clip pair offline before invoking `gps-denied-replay`.
+4. **Monotonic, uniformly-spaced `Time`.** Rows must be strictly increasing
+   on `Time` and uniformly spaced (the Derkachi fixture is 10 Hz). The
+   parser enforces monotonicity (AC-5); uniform spacing is the operator's
+   responsibility — non-uniform spacing skews the ESKF prediction step
+   without raising an error.
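
Because uniform spacing is not enforced by the parser, an operator may want a pre-flight check before invoking the CLI. The following is a minimal sketch under stated assumptions: `check_time_column` is a hypothetical helper (not part of the shipped tooling), and the relative tolerance is an arbitrary illustrative choice.

```python
import csv
import io
import math


def check_time_column(lines, rel_tol=1e-6):
    """Return the sample period in seconds, or raise ValueError.

    `lines` is any iterable of CSV text lines (an open file works).
    Checks only the operator-owned invariants: Time starts at 0.0 and
    every step equals the first step within `rel_tol`.
    """
    times = [float(row["Time"]) for row in csv.DictReader(lines)]
    if len(times) < 2:
        raise ValueError("need at least two data rows to measure spacing")
    if times[0] != 0.0:
        raise ValueError(f"first Time is {times[0]}, expected 0.0")
    period = times[1] - times[0]
    if period <= 0:
        raise ValueError("Time does not increase between the first two rows")
    for i in range(2, len(times)):
        step = times[i] - times[i - 1]
        if not math.isclose(step, period, rel_tol=rel_tol, abs_tol=1e-9):
            raise ValueError(
                f"data row {i}: step {step:.6f} s differs from period {period:.6f} s"
            )
    return period
```

For a real file this could be called as `check_time_column(open("data_imu.csv", newline=""))`; for the 10 Hz fixture the returned period should be close to 0.1 s.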
+
+## Schema
+
+The CSV must be header-first, comma-separated, UTF-8 encoded. Column order
+does not matter — the parser uses `csv.DictReader` and looks up by name —
+but the column **names** must match exactly (case-sensitive).
+
+15 columns are required. Four further columns found in the Derkachi dump
+(the three mag fields and `relative_alt`) are tolerated and ignored.
+
+### Required columns
+
+The IMU columns keep the raw MAVLink `SCALED_IMU2` scaling (mG accel,
+mrad/s gyro, FRD body frame); the GPS columns use the units listed in
+the table. The parser converts to SI / FLU at the `ImuSample`
+boundary via
+`gps_denied_onboard.helpers.imu_units.mavlink_imu_to_si_flu` (AZ-918)
+so downstream C5 ESKF + `imu_preintegrator` consumers see the contract
+they were built for. **Operator-facing CSV files keep the raw scaling**;
+the conversion is a parser-internal concern.
+
+| # | Column | Unit (CSV) | Type | Notes |
+|---|--------|------------|------|-------|
+| 1 | `timestamp(ms)` | ms | float | Pixhawk wall clock at sample capture. **Ignored by the replay pipeline** — kept only for trace-back to the original tlog. |
+| 2 | `Time` | s | float | **Canonical replay clock.** Must start at `0.0`, be strictly increasing, and be uniformly spaced. The replay loop uses this column for every timestamp it emits. |
+| 3 | `SCALED_IMU2.xacc` | mG, FRD | float | Body-frame X accelerometer, MAVLink `SCALED_IMU2` raw scaling. Converted by the parser to m/s² in `ImuSample.accel_xyz[0]` (FLU body). |
+| 4 | `SCALED_IMU2.yacc` | mG, FRD | float | Body-frame Y accelerometer; sign-flipped during FRD→FLU. |
+| 5 | `SCALED_IMU2.zacc` | mG, FRD | float | Body-frame Z accelerometer; sign-flipped during FRD→FLU. |
+| 6 | `SCALED_IMU2.xgyro` | mrad/s, FRD | float | Body-frame X gyro, MAVLink `SCALED_IMU2` raw scaling. Converted to rad/s in `ImuSample.gyro_xyz[0]` (FLU body). |
+| 7 | `SCALED_IMU2.ygyro` | mrad/s, FRD | float | Body-frame Y gyro; sign-flipped during FRD→FLU. |
+| 8 | `SCALED_IMU2.zgyro` | mrad/s, FRD | float | Body-frame Z gyro; sign-flipped during FRD→FLU. |
+| 9 | `GLOBAL_POSITION_INT.lat` | degrees | float | WGS84 latitude. **Already in decimal degrees** (Derkachi dump convention — pre-divided by 1e7 from MAVLink's int representation). |
+| 10 | `GLOBAL_POSITION_INT.lon` | degrees | float | WGS84 longitude (same convention as `lat`). |
+| 11 | `GLOBAL_POSITION_INT.alt` | mm | float | MSL altitude. Parser divides by 1000 to emit metres. |
+| 12 | `GLOBAL_POSITION_INT.vx` | cm/s | float | NED north velocity. Parser divides by 100 to emit m/s. |
+| 13 | `GLOBAL_POSITION_INT.vy` | cm/s | float | NED east velocity. |
+| 14 | `GLOBAL_POSITION_INT.vz` | cm/s | float | NED down velocity. |
+| 15 | `GLOBAL_POSITION_INT.hdg` | cdeg | float | Heading, 0–35999. Parser divides by 100 to emit degrees. |
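
The conversions described in the table can be sketched as follows. This is an illustration only, not the body of `mavlink_imu_to_si_flu`: the names `imu_to_si_flu` and `gps_to_si` are hypothetical, the use of standard gravity (9.80665 m/s²) for the mG conversion is an assumption, and applying the same divide-by-100 to `vy` and `vz` as the table states for `vx` is inferred from their shared cm/s unit.

```python
STANDARD_GRAVITY = 9.80665  # m/s^2 per g (assumed constant for this sketch)


def imu_to_si_flu(acc_mg, gyro_mrad_s):
    """Convert raw SCALED_IMU2 triples (mG, mrad/s, FRD) to SI in FLU.

    Returns (accel_xyz in m/s^2, gyro_xyz in rad/s).
    """
    ax, ay, az = (v * 1e-3 * STANDARD_GRAVITY for v in acc_mg)
    gx, gy, gz = (v * 1e-3 for v in gyro_mrad_s)
    # FRD -> FLU: X is unchanged, Y and Z are negated.
    return (ax, -ay, -az), (gx, -gy, -gz)


def gps_to_si(alt_mm, vx_cm_s, vy_cm_s, vz_cm_s, hdg_cdeg):
    """Convert the GLOBAL_POSITION_INT columns that still need scaling."""
    return {
        "alt_m": alt_mm / 1000.0,
        "v_north_m_s": vx_cm_s / 100.0,
        "v_east_m_s": vy_cm_s / 100.0,
        "v_down_m_s": vz_cm_s / 100.0,
        "heading_deg": hdg_cdeg / 100.0,
    }
```

Applied to the first data row of `example_data_imu.csv`, the Z accelerometer reading of -984 mG (FRD) becomes a positive, roughly 1 g upward value in FLU, as expected for near-level flight.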
+
+### Tolerated extra columns
+
+The following may be present but are not consumed:
+
+| Column | Reason kept | Reason unused |
+|--------|-------------|---------------|
+| `SCALED_IMU2.xmag`, `.ymag`, `.zmag` | Symmetric with the accel/gyro triples in the Derkachi dump | The current ESKF does not integrate magnetometer; AZ-848 follow-up may add it |
+| `GLOBAL_POSITION_INT.relative_alt` | Present in the MAVLink dump | The replay pipeline uses MSL `alt` only |
+
+Additional columns beyond these are ignored without warning. Missing
+required columns cause the load to raise
+`ReplayInputAdapterError` before the replay loop starts (AC-5).
+
+## Schema-level errors the parser catches
+
+The parser raises `ReplayInputAdapterError` (CLI exit code 1) for any of:
+
+- File does not exist or is not a regular file.
+- File is empty (no header row).
+- File has a header but no data rows.
+- Any required column from the table above is missing from the header.
+- The `Time` column at any row contains a non-numeric / NaN / Inf value.
+- The `Time` column is non-monotonic (`Time[i] <= Time[i-1]`).
+- Any required IMU or GPS column at any row contains a non-numeric / NaN /
+  Inf value.
+
+The error message includes the row number (1-based, where row 1 is the
+header, so the first data row is row 2; this is the row the hard contract
+calls "row 0"). Operators should treat the first parse failure as
+authoritative and fix the source CSV; the parser does not continue after
+the first invalid row.
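
The schema-level checks listed above can be approximated with `csv.DictReader`. The sketch below is not the shipped parser: `validate_replay_csv` is a hypothetical name, `ReplayInputAdapterError` is a local stand-in class, the message wording is invented, and file-existence checks are omitted because the function takes already-opened lines.

```python
import csv
import io
import math

REQUIRED_COLUMNS = (
    "timestamp(ms)", "Time",
    "SCALED_IMU2.xacc", "SCALED_IMU2.yacc", "SCALED_IMU2.zacc",
    "SCALED_IMU2.xgyro", "SCALED_IMU2.ygyro", "SCALED_IMU2.zgyro",
    "GLOBAL_POSITION_INT.lat", "GLOBAL_POSITION_INT.lon",
    "GLOBAL_POSITION_INT.alt", "GLOBAL_POSITION_INT.vx",
    "GLOBAL_POSITION_INT.vy", "GLOBAL_POSITION_INT.vz",
    "GLOBAL_POSITION_INT.hdg",
)
# timestamp(ms) must be present but is ignored, so it is not value-checked.
NUMERIC_COLUMNS = REQUIRED_COLUMNS[1:]


class ReplayInputAdapterError(Exception):
    """Local stand-in for the adapter's error type."""


def validate_replay_csv(lines):
    """Return the number of data rows, or raise on the first fault."""
    reader = csv.DictReader(lines)
    if reader.fieldnames is None:
        raise ReplayInputAdapterError("file is empty (no header row)")
    missing = [c for c in REQUIRED_COLUMNS if c not in reader.fieldnames]
    if missing:
        raise ReplayInputAdapterError(f"missing required columns: {missing}")
    previous_time = None
    count = 0
    for row in reader:
        row_number = reader.line_num  # 1-based; the header is row 1
        values = {}
        for column in NUMERIC_COLUMNS:
            try:
                values[column] = float(row[column])
            except (TypeError, ValueError):
                raise ReplayInputAdapterError(
                    f"row {row_number}: {column} is not numeric"
                ) from None
            if not math.isfinite(values[column]):
                raise ReplayInputAdapterError(
                    f"row {row_number}: {column} is NaN or Inf"
                )
        if previous_time is not None and values["Time"] <= previous_time:
            raise ReplayInputAdapterError(
                f"row {row_number}: Time is not strictly increasing"
            )
        previous_time = values["Time"]
        count += 1
    if count == 0:
        raise ReplayInputAdapterError("header present but no data rows")
    return count
```

Using `reader.line_num` rather than a manual counter keeps the reported row aligned with the physical line in the file.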
+
+## Operator workflow
+
+```bash
+gps-denied-replay \
+  --video ./flight.mp4 \
+  --imu ./data_imu.csv \
+  --output ./estimator_output.jsonl \
+  --camera-calibration ./calib.json \
+  --config ./config.yaml \
+  --mavlink-signing-key ./signing_key.bin
+```
+
+`--tlog` is accepted as a deprecated alias and will be removed by AZ-895.
+When both `--imu` and `--tlog` are supplied, `--imu` wins and a deprecation
+warning is printed to stderr.
+
+## Deriving a new CSV from an ArduPilot tlog
+
+The Derkachi fixture was produced with `pymavlink`'s `mavlogdump.py`. The
+short version:
+
+```bash
+mavlogdump.py --format csv \
+  --types SCALED_IMU2,GLOBAL_POSITION_INT \
+  ./flight.tlog > ./raw_dump.csv
+```
+
+Then post-process to:
+
+1. Rename / merge the per-message timestamp into a single `Time` column
+   relative to the first row.
+2. Drop pre-takeoff rows (the UAV must be airborne at row 0 — see the hard
+   contract above).
+3. Divide `lat` / `lon` by 1e7 to convert MAVLink's integer (degrees × 1e7)
+   representation into decimal degrees.
+4. Re-sample to a uniform 10 Hz cadence if the tlog dump produced
+   non-uniform spacing.
+
+A reference post-processor script is **not** shipped; operators
+historically write a one-off Python script or pandas pipeline per source
+aircraft.
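
As a starting point for such a one-off script, steps 1, 3, and 4 can be sketched with the standard library alone. Everything here is hypothetical (the names `dege7_to_degrees` and `resample_uniform`, and the choice of linear interpolation); it is not the procedure used to build the Derkachi fixture. Linear interpolation is also unsuitable for `hdg` across the 35999 → 0 wrap, which a real script would need to handle separately.

```python
import math


def dege7_to_degrees(value_e7):
    """Step 3: MAVLink integer lat/lon (degrees * 1e7) to decimal degrees."""
    return value_e7 / 1e7


def resample_uniform(times, values, rate_hz=10.0):
    """Steps 1 and 4: rebase Time to 0.0 and resample onto a uniform grid.

    `times` must be strictly increasing with at least two entries.
    Returns (new_times, new_values), linearly interpolated.
    """
    if len(times) < 2:
        raise ValueError("need at least two samples")
    start = times[0]
    rel = [t - start for t in times]  # step 1: Time relative to first row
    period = 1.0 / rate_hz
    # Small epsilon guards against float round-off at the final grid point.
    n = int(math.floor(rel[-1] / period + 1e-9)) + 1
    new_times, new_values = [], []
    j = 0
    for k in range(n):
        t = k * period
        # Advance to the source interval [rel[j], rel[j + 1]] containing t.
        while j + 1 < len(rel) - 1 and rel[j + 1] < t:
            j += 1
        weight = (t - rel[j]) / (rel[j + 1] - rel[j])
        new_times.append(t)
        new_values.append(values[j] + weight * (values[j + 1] - values[j]))
    return new_times, new_values
```

Each numeric column would be passed through `resample_uniform` against the same source timestamps so that all columns share one output grid.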
+
+## Cross-references
+
+- AZ-894 — the CLI + adapter that consumes this format.
+- AZ-895 — deletes the legacy `--tlog` argument once all callers migrate.
+- AZ-897 — operator replay UI; links to this page and serves
+  `example_data_imu.csv`.
+- `_docs/02_document/contracts/replay/replay_protocol.md` — the broader
+  replay orchestration contract.
+- `_docs/00_problem/input_data/flight_derkachi/README.md` — fixture
+  provenance and license caveats.
@@ -0,0 +1,21 @@
+timestamp(ms),Time,SCALED_IMU2.xacc,SCALED_IMU2.yacc,SCALED_IMU2.zacc,SCALED_IMU2.xgyro,SCALED_IMU2.ygyro,SCALED_IMU2.zgyro,SCALED_IMU2.xmag,SCALED_IMU2.ymag,SCALED_IMU2.zmag,GLOBAL_POSITION_INT.lat,GLOBAL_POSITION_INT.lon,GLOBAL_POSITION_INT.alt,GLOBAL_POSITION_INT.relative_alt,GLOBAL_POSITION_INT.vx,GLOBAL_POSITION_INT.vy,GLOBAL_POSITION_INT.vz,GLOBAL_POSITION_INT.hdg
+4551116.348,0,21,-3,-984,52,32,-5,312,-1048,442,50.0809634,36.1115442,141290,23.182,-4,-6,-88,35041
+4551216.348,0.1,-68,-9,-995,58,-17,1,309,-1016,441,50.0809634,36.1115441,141360,23.251,-5,-2,-89,35042
+4551316.348,0.2,9,108,-988,69,-65,13,308,-964,436,50.0809633,36.1115441,141410,23.303,-1,-2,-86,35048
+4551416.348,0.3,-20,27,-977,55,10,26,310,-988,438,50.0809633,36.1115441,141450,23.348,-5,-6,-84,35057
+4551516.348,0.4,-40,40,-1026,0,65,10,306,-1076,440,50.0809633,36.111544,141510,23.402,-2,-2,-86,35065
+4551616.348,0.5,30,126,-1050,-1,75,14,321,-1146,442,50.0809633,36.111544,141570,23.464,0,0,-88,35074
+4551716.348,0.6,-64,67,-1031,-31,-6,21,314,-1066,438,50.0809632,36.1115439,141640,23.53,-5,1,-90,35080
+4551816.348,0.7,-22,112,-1027,-61,-88,-5,302,-951,436,50.0809632,36.1115439,141710,23.601,-2,3,-90,35082
+4551916.348,0.8,-123,-16,-998,-55,-104,-12,301,-942,440,50.0809631,36.1115439,141770,23.669,-10,0,-91,35079
+4552016.348,0.9,-64,-13,-1003,13,-70,-30,301,-936,442,50.080963,36.1115439,141860,23.755,-2,0,-90,35073
+4552116.348,1,-22,39,-995,73,20,-18,314,-988,436,50.080963,36.1115439,141930,23.826,-2,-2,-88,35070
+4552216.348,1.1,-49,-69,-984,2,29,1,317,-992,433,50.080963,36.1115438,142010,23.9,-6,-2,-88,35068
+4552316.348,1.2,-16,98,-991,-59,-28,-11,310,-970,435,50.080963,36.1115438,142080,23.975,-1,6,-86,35063
+4552416.348,1.3,-6,169,-998,-29,2,-2,310,-983,435,50.0809629,36.1115438,142150,24.042,-3,5,-83,35059
+4552516.348,1.4,-31,53,-1003,2,13,-10,317,-1042,438,50.0809629,36.1115438,142210,24.102,-3,3,-83,35051
+4552616.348,1.5,-47,21,-1023,13,13,-14,320,-1069,439,50.0809629,36.1115438,142270,24.166,2,2,-83,35047
+4552716.348,1.6,-30,-59,-1020,-18,24,0,315,-1083,438,50.0809629,36.1115439,142340,24.236,-5,1,-86,35049
+4552816.348,1.7,-103,23,-1058,-59,26,-7,314,-1113,442,50.0809629,36.1115439,142430,24.321,-4,4,-90,35050
+4552916.348,1.8,-17,51,-1037,-9,80,11,317,-1087,444,50.0809629,36.1115439,142510,24.404,-5,0,-93,35049
+4553016.348,1.9,-87,72,-1022,-10,-45,0,309,-1004,439,50.0809628,36.111544,142600,24.494,-6,2,-97,35046