[AZ-626] Decompose complete: 47 tasks + docs + module layout

Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy Qt/C++ to a Rust workspace.

- Remove legacy Qt/C++ tree (ai_controller, drone_controller, misc/camera, python_scaffold, root Dockerfile, autopilot.pro, legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria, security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model, glossary, decision-rationale, deployment, 13 component descriptions, tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed, ready for Step 7 Implement).

Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks AZ-640..AZ-686.
Total complexity 173 points across 12 epics.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-21 09:51:10 +00:00 · 2026-05-19 11:02:01 +03:00
parent f7d6cb4a3a
commit bc40ea7300
235 changed files with 12585 additions and 15097 deletions
@@ -0,0 +1,124 @@
+# autopilot — Documentation Index
+
+**Status**: forward-looking design (Rust). The implementation is in flight. This page is the entry point into the doc set; it does not duplicate content.
+
+If you are new to autopilot, read in this order: `architecture.md` → `system-flows.md` → the component spec(s) you care about → `data_model.md` for entity-level detail → `decision-rationale.md` for *why* the design looks the way it does.
+
+---
+
+## 1. Doc set at a glance
+
+| File | Purpose |
+|---|---|
+| `architecture.md` | The system. System context, component layering, NFRs, detailed design (problem, restrictions, AC, training data, solution architecture, MAVLink and piloting, MapObjects/H3, MGRS sync, target relocation, MapObjects sync with central DB, tech stack), open questions, scope boundary. |
+| `system-flows.md` | Per-flow narratives + sequence diagrams. Frame pipeline, movement detection (zoom-out + zoom-in), VLM confirmation, scan-controller behaviour tree, operator round trip, mission lifecycle, MapObjects + ignored items, MapObjects sync, pre-flight BIT, lost-link failsafe ladder. |
+| `data_model.md` | Canonical entity catalogue. Frames, detections, POIs, VlmAssessment, MapObject + observation log + bundle, IgnoredItem, OperatorCommand envelope, MissionItem vs MissionWaypoint, MGRS wire format, persistence + versioning. |
+| `decision-rationale.md` | Load-bearing research and decision evidence (per-dimension reasoning chain, fact cards, fit matrix, validation log, source registry, weak-point→fix table, historical seed narrative). |
+| `glossary.md` | Project-specific terms. |
+| `components/<name>/description.md` | One per autopilot component (13 total): purpose, inputs, outputs, responsibilities, state, failure modes, dependencies, NFR targets, references. |
+| `deployment/containerization.md` | Single-binary deployment options (native systemd vs container), target hardware, configuration surface, health endpoint. |
+| `deployment/ci_cd_pipeline.md` | Build, test, SITL conformance, benchmark gate, sign + publish. |
+| `deployment/observability.md` | Logs (`tracing` + JSON), metrics, traces, health aggregation, replay-driven debugging. |
+| `FINAL_report.md` | This file. |
+
+---
+
+## 2. The system in two minutes
+
+`autopilot` is the onboard mission executor for a reconnaissance winged UAV. It runs as a single Rust process on an aarch64 Jetson Orin Nano. It pulls a mission from the external `missions` API (and the mission area's last-known MapObjects state), controls the UAV through a hand-rolled MAVLink layer, drives a ViewPro A40 gimbal in a two-level scan-and-zoom loop (zoom-out wide sweep + zoom-in on POI), streams camera frames + telemetry continuously over modem to an external Ground Station so the operator watches in a browser, and uses bi-directional gRPC to delegate primitive object detection to the external `../detections` API. Movement detection runs at both zoom levels with mandatory ego-motion compensation. Semantic-vision reasoning (Tier 2 + an optional local VLM), a POI scheduler with a ≤5 POIs/min operator-review cap, and a target-follow mode after operator confirmation all run inside autopilot. Pre-flight self-test gates takeoff; the mission's full pass diff is pushed back to the central MapObjects store at mission end. Operator commands are authenticated, signed, and replay-protected.
+
+Full synopsis: `architecture.md §Synopsis`.
+
+---
+
+## 3. Components
+
+The system is 13 components organised into 4 planes:
+
+| Plane | Components |
+|---|---|
+| Perception (data plane in) | `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client` (optional) |
+| Decision + Memory | `scan_controller`, `mapobjects_store` |
+| Action (data plane out) | `gimbal_controller`, `operator_bridge`, `mission_executor`, `mavlink_layer`, `mission_client` |
+| Telemetry plane (always-on, parallel) | `telemetry_stream` |
+
+Per-component design specs: `components/<name>/description.md`.
+
+---
+
+## 4. Architectural non-negotiables
+
+These are stated once in `architecture.md §5` and referenced everywhere:
+
+- Detection-as-a-service (Tier 1 lives in `../detections`).
+- Hand-rolled MAVLink (no third-party SDK).
+- Deterministic typed state machine for scan control: `ZoomedOut`, `ZoomedIn`, `TargetFollow`.
+- Ego-motion compensation is mandatory for movement detection. Movement detection runs at **both** zoom-out and zoom-in (per-zoom-band thresholds; classical-CV adequacy at zoom-in is benchmark-gated).
+- Operator workload cap of ≤5 POIs/minute is hard.
+- Operator timeout scales with confidence.
+- **Operator commands are authenticated, signed, and replay-protected** (modem encryption alone is not sufficient).
+- Local VLM with structured `VlmAssessment` schema; no cloud egress.
+- Always-on camera + telemetry stream to Ground Station.
+- **Lost-link failsafe is explicit** (`mission_executor` runs a typed ladder; default RTL after 30 s grace).
+- **Pre-flight self-test (BIT) gates takeoff** including MapObjects pre-flight pull.
+- **MapObjects are mission-bracketed and centrally synchronised** via the `missions` API extension `/missions/{id}/mapobjects`.
+- `autopilot` and `missions` are separate repos with a shared `mission-schema` artefact.
+- No silent error swallowing; health endpoint reflects every dependency including `mapobjects_sync`.
+- Geofence enforcement is symmetric: both INCLUSION and EXCLUSION are honoured.
+
+---
+
+## 5. Open questions
+
+Surfaced explicitly in `architecture.md §8`:
+
+| # | Question | Blocks |
+|---|---|---|
+| Q1 | Sweep pattern (pendulum / raster / lawn-mower), FOV per zoom tier, dwell time. | `scan_controller` zoom-out implementation. |
+| Q2 | Ground Station API contract (stream protocol, auth, bbox-overlay rendering). | `telemetry_stream` + `operator_bridge` design. |
+| Q3 | `mapobjects_store` engine (SQLite + H3 / KV / in-memory + snapshot). | Persistent-state design. |
+| Q4 | Tier 1 contract evolution / `detection_client` versioning. | gRPC contract definition. |
+| Q5 | `mission-schema` extraction location. | Schema sharing between `autopilot` and `missions`. |
+| Q6 | MAVLink-2 message signing. | `mavlink_layer` startup handshake. |
+| Q7 | Central MapObjects API contract details (paging, photo-ref upload, retention). | `missions` repo work + `mission_client` MapObjects sync code. |
+| Q8 | MapObjects conflict resolution (projection rules, REMOVED-claim expiry, multi-class disambiguation). | Central `map_objects_current` view definition. |
+| Q9 | Operator-command authentication scheme (HMAC vs ed25519 vs MAVLink-2 sig vs separate envelope). | `operator_bridge` validation logic + Ground Station integration. |
+| Q10 | Software rollback policy on the airframe (boot-time check, A/B partition, watchdog rollback). | Deployment design + on-airframe service supervision. |
+| Q11 | Multi-operator session policy (single active vs quorum). | `operator_bridge` session model. |
+| Q12 | Comms blackout during banking turns (tolerate as `LinkDegraded` vs suppress lost-link during turns). | Lost-link ladder timing constants. |
+| Q13 | All-season acceptance flight gates (minimum flights per season, per-season acceptance criteria). | MVP sign-off scope. |
+| Q14 | Movement-detector zoom-in fallback selection (learned optical flow vs CNN motion-segmentation vs IMU-tighter classical CV) if classical CV fails the per-zoom-band FP cap. | `movement_detector` zoom-in scope. |
+
+---
+
+## 6. Suite-level docs autopilot consumes
+
+These live in `../_docs/` (parent suite repo):
+
+| Path | Used for |
+|---|---|
+| `../_docs/00_top_level_architecture.md` | Suite topology, edge tier, flight-gate convention. |
+| `../_docs/02_missions.md` | Mission / Waypoint / Vehicle schemas (consumed by `mission_client`). |
+| `../_docs/03_detections.md` | Detections gRPC API (consumed by `detection_client`). |
+| `../_docs/04_system_design_clarifications.md` | REST patterns, stream-detection protocol, edge-device connection semantics. |
+| `../_docs/11_gps_denied.md` | GPS-Denied service architecture (out of autopilot scope). |
+| `../_docs/12_ai_training.md` | AI training pipeline (autopilot consumes the resulting models via the suite-wide model-sync timer). |
+
+Full table with ownership: `architecture.md §10`.
+
+---
+
+## 7. Where to put new content
+
+| You want to document… | Put it in… |
+|---|---|
+| A new flow between components | `system-flows.md` (and add a sequence diagram). |
+| A new entity / schema | `data_model.md`. |
+| A change in NFR target | `architecture.md §6`. |
+| A change in a single component's responsibilities | `components/<name>/description.md`. |
+| A change in the MAVLink command surface | `architecture.md §7.7`. |
+| A new architectural principle | `architecture.md §5`. |
+| A new design decision with research backing | `decision-rationale.md`. |
+| A new term | `glossary.md`. |
+| A change in deployment shape | `deployment/<file>.md`. |
+| Ad-hoc internal team note | not in `_docs/`. |
@@ -0,0 +1,847 @@
+# autopilot — Architecture
+
+**Status**: forward-looking design (Rust). The implementation is in flight; the system described here is the target architecture, not what runs today. Confirmed by user 2026-05-17.
+
+## Synopsis
+
+`autopilot` is the onboard mission executor for a reconnaissance winged UAV. It runs as a single Rust process on an aarch64 Jetson Orin Nano edge device. It pulls a mission from the external `missions` API, controls the UAV through a hand-rolled MAVLink layer (~10–15 commands; no third-party SDK), drives a ViewPro A40 gimbal in a two-level scan-and-zoom loop (zoom-out wide sweep + zoom-in on POI), streams camera frames + telemetry continuously over modem to an external Ground Station API so the operator watches in a browser, and uses bi-directional gRPC to delegate primitive object detection to the external `../detections` API. Semantic-vision reasoning (Tier 2 ROI analysis + an optional local VLM), a POI scheduler with an operator-review rate cap, and a target-follow mode after operator confirmation all run inside autopilot. The dominant pattern is a deterministic typed state machine (zoom-out / zoom-in / target-follow) coordinating a small set of async actors.
+
+---
+
+## 1. System Context
+
+Autopilot integrates with six external systems. The local VLM is optional (benchmark-gated); everything else is mandatory.
+
+```mermaid
+flowchart LR
+    cam["ViewPro A40<br/>RTSP camera + gimbal"]
+    det["../detections<br/>Tier 1 YOLO service"]
+    vlm["NanoLLM VILA1.5-3B<br/>(optional, local IPC)"]
+    miss["missions API"]
+    gs["Ground Station<br/>operator UI"]
+    ap["ArduPilot / PX4"]
+    autopilot["autopilot<br/>onboard mission + scan + perception"]
+    cam <-->|RTSP frames / UDP gimbal control| autopilot
+    autopilot <-->|bidir gRPC| det
+    autopilot <-.->|Unix-domain socket IPC| vlm
+    autopilot <-->|REST GET / POST| miss
+    autopilot <-->|stream over modem| gs
+    autopilot <-->|MAVLink v2| ap
+```
+
+Per-edge protocol details:
+
+| Edge | Protocol | Direction | Purpose |
+|---|---|---|---|
+| ViewPro A40 (camera) | RTSP/RTP over TCP/UDP | inbound | live H.264/265 1080p video to `frame_ingest`. |
+| ViewPro A40 (gimbal) | UDP, vendor control protocol | bidirectional | yaw / pitch / zoom commands + status; driven by `gimbal_controller`. |
+| `../detections` | bi-directional gRPC | bidirectional | frames out, bounding boxes back; driven by `detection_client`. |
+| NanoLLM VILA1.5-3B | Unix-domain socket IPC (peer-cred check) | bidirectional | bounded ROI + short prompt → structured `VlmAssessment`; optional. |
+| `missions` API | HTTPS REST (GET / POST) | bidirectional | mission pull on start; middle-waypoint POST on operator confirmation; **MapObjects** pre-flight pull + post-flight push (`/missions/{id}/mapobjects`, see §7.13). |
+| Ground Station API | continuous push over modem (protocol per `../_docs/04_system_design_clarifications.md`) | bidirectional | always-on camera feed + telemetry + bbox overlay; operator confirm / decline / target-follow. |
+| ArduPilot / PX4 | MAVLink v2 over UDP or serial | bidirectional | the small command surface in §7.7. |
+
+---
+
+## 2. Component Layering
+
+Three internal layers (Perception → Decision + Memory → Action) plus an always-on Telemetry plane that runs parallel to the decision loop.
+
+```mermaid
+flowchart TB
+    subgraph autopilot ["autopilot"]
+        subgraph perception ["Perception (data plane in)"]
+            fi[frame_ingest]
+            dc[detection_client]
+            md[movement_detector]
+            sa[semantic_analyzer]
+            vc["vlm_client (opt)"]
+        end
+        subgraph brain ["Decision + Memory"]
+            sc[scan_controller]
+            mo[mapobjects_store]
+        end
+        subgraph action ["Action (data plane out)"]
+            gc[gimbal_controller]
+            ob[operator_bridge]
+            me[mission_executor]
+            ml[mavlink_layer]
+            msc[mission_client]
+        end
+        subgraph tplane ["Telemetry plane (always-on, parallel)"]
+            ts[telemetry_stream]
+        end
+    end
+    perception ==>|"inputs (bboxes, motion, Tier 2, VlmAssessment)"| brain
+    brain ==>|"commands + POI updates + middle-waypoint hints"| action
+    perception -.->|"frames + bboxes"| tplane
+    action -.->|"telemetry"| tplane
+```
+
+Per-flow component-to-component sequence diagrams live in `system-flows.md`.
+
+---
+
+## 3. Components
+
+| Component | Layer | Responsibility |
+|---|---|---|
+| `frame_ingest` | Perception | Pull RTSP from ViewPro A40; decode; timestamp; hand frames to `detection_client`, `movement_detector`, and `telemetry_stream` (zero-copy where possible). |
+| `detection_client` | Perception | Bi-directional gRPC to `../detections`; streams frames out, receives bounding boxes back; same bboxes are reused for Tier 2 ROI selection and for operator overlay. Versioned against the `../_docs/03_detections.md` contract. |
+| `movement_detector` | Perception | Active in **both** zoom-out and zoom-in levels (skipped only during target-follow). OpenCV optical-flow / global-motion estimation fused with timestamped gimbal angle, zoom state, and UAV motion telemetry. Emits residual-motion clusters as POI candidates. Ego-motion compensation is mandatory; naive frame-differencing is rejected. Zoom-in adequacy of classical CV is benchmark-gated — see §7.6 Movement detector and Open Question Q14. |
+| `semantic_analyzer` | Perception | Tier 2. Primitive graph + lightweight ROI CNN over zoom-in crops. Owns path-freshness scoring, endpoint scoring, branch choice at intersections, and concealment-POI scoring. |
+| `vlm_client` | Perception (optional) | Local-IPC client to a NanoLLM/VILA1.5-3B process. Validates ROI payload size/format, calls the VLM with a bounded crop and short prompt, validates the response against a structured `VlmAssessment` schema. No cloud egress. Optional behind a `vlm_enabled` flag and a feature module (see §7.6 Local VLM Confirmation). |
+| `scan_controller` | Decision + Memory | Central deterministic typed state machine — `ZoomedOut`, `ZoomedIn`, `TargetFollow`. Owns the POI queue, timeouts, ≤5 POIs/min cap, confidence-scaled operator-decision window, and gimbal-command issuance. Full behaviour-tree spec in `system-flows.md §F4`. |
+| `mapobjects_store` | Decision + Memory | On-device H3-indexed map of detected objects + ignored-items list. Pre-flight pull of the mission-area map from the central `missions` API; in-flight on-device authoritative; post-flight push of the mission diff back to central. Computes new / moved / existing / removed diffs across passes (§7.10, §7.11, §7.12). Read/written directly by `scan_controller`; sync pulls/pushes are handled via `mission_client`. |
+| `gimbal_controller` | Action | ViewPro A40 control protocol (yaw / pitch / zoom). Honours ≤2 s zoom transition budget and ≤500 ms decision-to-movement latency. Owns the smooth-pan path-tracking primitive used in zoom-in level. |
+| `operator_bridge` | Action | Surfaces POIs and target-follow lifecycle events through `telemetry_stream` to the Ground Station; receives confirm / decline / target-follow start-release back. On decline, appends an `IgnoredItem` via `mapobjects_store`. On confirm, hands a middle-waypoint hint to `mission_executor`. |
+| `mission_executor` | Action | Multirotor and fixed-wing variants of the platform state machine: takeoff / climb / cruise / land for multirotor; upload-and-await-AUTO for fixed-wing. Owns geofence enforcement (both INCLUSION and EXCLUSION). Issues MAVLink commands through `mavlink_layer`; consumes `mission_client` mission state. Inserts middle waypoints on operator-confirmed targets. |
+| `mavlink_layer` | Action | Hand-rolled MAVLink v2 transport (UDP or serial) implementing only the ~10–15 commands this codebase needs. See §7.7 for the command surface. No third-party SDK. |
+| `mission_client` | Action | Pulls mission JSON from the `missions` API on start; validates against `mission-schema`; handles mid-flight middle-waypoint inserts (POST). Survives transient connection loss with bounded retry. |
+| `telemetry_stream` | Telemetry plane | Continuous push of camera frames + flight telemetry + bbox overlay to the Ground Station API over modem. Always-on; not detection-gated. Carries operator commands (confirm / decline / target-follow start-release) on the return path. |
+
+The system is intentionally a small set of well-named components rather than 30+ files. Everything in `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, and `vlm_client` runs on the **input data plane** — no UAV control, no operator surface. Everything in `gimbal_controller`, `mission_executor`, `mavlink_layer`, `mission_client`, and `operator_bridge` runs on the **output control plane** — UAV motion + operator interaction. `scan_controller` and `mapobjects_store` are the **brain** in between. `telemetry_stream` is parallel; it never sits in the decision path.
+
+Per-component design specs (purpose, inputs, outputs, state, failure modes, NFRs) live in `components/<name>/description.md`.
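As an illustration of the `scan_controller` row above, the three states can be written as one Rust enum with a single transition function. This is a minimal sketch, not the specified behaviour tree (that lives in `system-flows.md §F4`). The `Roi` and `TargetId` shapes, the `ScanEvent` names, and the assumption that confirmation arrives during a zoom-in hold and that target-follow returns to `ZoomedOut` are all hypothetical.

```rust
use std::time::Instant;

/// Hypothetical target identifier; the real type is defined by the data model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TargetId(pub u64);

/// Hypothetical region of interest in normalised image coordinates.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Roi {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

/// The three scan states named in the architecture.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScanState {
    ZoomedOut,
    ZoomedIn { roi: Roi, hold_started_at: Instant },
    TargetFollow { target_id: TargetId, started_at: Instant },
}

/// Hypothetical event set, chosen only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScanEvent {
    /// A POI was popped from the queue.
    PoiPopped { roi: Roi },
    /// Zoom-in analysis finished or the hold timed out.
    HoldFinished,
    /// The operator confirmed a target.
    OperatorConfirmed { target_id: TargetId },
    /// The operator released the target or tracking was lost.
    TargetReleased,
}

impl ScanState {
    /// Pure transition function: the current state plus an event gives the next state.
    /// Any pair not listed leaves the state unchanged.
    pub fn on_event(self, event: ScanEvent, now: Instant) -> ScanState {
        match (self, event) {
            (ScanState::ZoomedOut, ScanEvent::PoiPopped { roi }) => {
                ScanState::ZoomedIn { roi, hold_started_at: now }
            }
            (ScanState::ZoomedIn { .. }, ScanEvent::HoldFinished) => ScanState::ZoomedOut,
            (ScanState::ZoomedIn { .. }, ScanEvent::OperatorConfirmed { target_id }) => {
                ScanState::TargetFollow { target_id, started_at: now }
            }
            (ScanState::TargetFollow { .. }, ScanEvent::TargetReleased) => ScanState::ZoomedOut,
            (state, _) => state,
        }
    }
}
```

Because the per-state data lives inside the enum variants, a hold start time cannot exist without a zoom-in ROI. That is one way to get the "no ad-hoc booleans" property the design asks for.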
+
+---
+
+## 4. Major Data Flows
+
+1. **Frame pipeline**. ViewPro A40 RTSP → `frame_ingest` → `detection_client` (bi-dir gRPC to `../detections`) → bboxes back → `movement_detector` (active at both zoom-out and zoom-in; residual-motion clusters) → `scan_controller` POI queue. The same bboxes also flow into `telemetry_stream` for operator overlay. (`system-flows.md §F1`)
+2. **Zoom-in + confirmation**. `scan_controller` pops a POI → `gimbal_controller` zooms ViewPro A40 → `semantic_analyzer` runs Tier 2 over the ROI → optionally `vlm_client` runs Tier 3 → `scan_controller` decides. Movement candidates emerging during the zoom-in hold are still consumed (subject to telemetry-skew tolerance and the per-zoom-band thresholds). (`system-flows.md §F2`, `§F3`)
+3. **Operator round trip**. `telemetry_stream` pushes camera + telemetry + bbox overlay → Ground Station → operator browser → confirm / decline / target-follow start-release → modem → `operator_bridge` → `mapobjects_store` (decline) or `mission_executor` (confirm) or `scan_controller` (target-follow). Always-on, not detection-gated. Operator commands are authenticated, signed, and replay-protected (§5; scheme TBD per Q9). (`system-flows.md §F5`)
+4. **Mission lifecycle**. `mission_client` pulls from `missions` API → `mission_executor` issues MAVLink waypoints via `mavlink_layer` → `gimbal_controller` runs the zoom-out sweep along the route. On operator confirmation, `mission_executor` inserts a middle waypoint and resumes after target-follow ends. (`system-flows.md §F6`)
+5. **MapObjects + ignored items**. New detections compute an H3 cell, query the k-ring of neighbours, classify as new / moved / existing / removed (§7.12), and check for an `IgnoredItem` match before surfacing to the operator. (`system-flows.md §F7`)
+6. **MapObjects sync** (mission-bracketing). Pre-flight: `mission_client` pulls the last-known map state for the mission area from the `missions` API and hydrates `mapobjects_store`. Post-flight: `mission_client` pushes the mission's full pass diff (NEW / MOVED / REMOVED / CONFIRMED-EXISTING) back. In-flight sync is **batched only** for MVP — no streaming over modem (§7.13; `system-flows.md §F8`).
+
+---
+
+## 5. Architectural Principles / Non-Negotiables
+
+- **Detection-as-a-service.** Primitive (Tier 1) detection lives in `../detections`, not in autopilot. Autopilot owns Tier 2 (semantic) and Tier 3 (VLM, optional) only.
+- **Hand-rolled MAVLink.** No third-party SDK. The MAVLink command surface is small enough to hand-implement; eliminates the largest current dependency-risk item.
+- **Deterministic typed state machine** for scan control. States are `ZoomedOut | ZoomedIn { roi, hold_started_at } | TargetFollow { target_id, started_at }`. No ad-hoc booleans, no shared mutable flags. The full behaviour-tree spec lives in `system-flows.md §F4`.
+- **Ego-motion compensation is mandatory** for movement detection. Naive frame-differencing is rejected outright. Movement detection runs at **both** zoom-out and zoom-in (skipped only during target-follow); zoom-in adequacy of classical CV is benchmark-gated (§7.6, Q14).
+- **Operator workload cap of ≤5 POIs/minute** is hard, not soft. `scan_controller` enforces it.
+- **Operator timeout scales with confidence** — 40 % → 30 s, 100 % → 120 s, linear; below 40 % the target is not surfaced. Timeout = forget; decline = `IgnoredItem` entry.
+- **Operator commands are authenticated, signed, and replay-protected.** Modem-link encryption alone is not sufficient — every confirm / decline / target-follow / abort command MUST carry a session-bound, replay-resistant signature that `operator_bridge` validates before dispatch. Exact scheme TBD (§8 Q9).
+- **Local VLM with structured `VlmAssessment` schema.** Free-form VLM text is not a downstream API. No cloud egress.
+- **Always-on camera + telemetry stream** to Ground Station is part of the mission contract — operator always sees the live feed, not just on detection.
+- **Lost-link failsafe is explicit.** Loss of the operator/Ground-Station modem link triggers a typed failsafe ladder in `mission_executor` (§7.7). The ladder is deterministic; default action is RTL after a configured grace window.
+- **Pre-flight self-test (BIT) gates takeoff.** Every dependency in the health-endpoint list of this section (see "No silent error swallowing" below), plus mission load and the MapObjects pre-flight pull (cached fallback acknowledged), must pass before `mission_executor` enters `ARMED` (multirotor) or `WAIT_AUTO` (fixed-wing). Health endpoint distinguishes pre-flight vs in-flight readiness.
+- **`autopilot` and `missions` are separate repos** with a shared `mission-schema` artefact. The same `missions` API also hosts the central MapObjects endpoints (§7.13).
+- **MapObjects are mission-bracketed and centrally synchronised.** Pre-flight pull on start; on-device authoritative in-flight; full pass diff pushed at mission end. The on-device store is a working copy of the central state for the mission's bounding box, not a private database.
+- **No silent error swallowing** anywhere in the pipeline. Health endpoint reflects every dependency: `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client` (if enabled), `scan_controller`, `gimbal_controller`, `mavlink_layer`, `mission_client`, `mission_executor`, `operator_bridge`, `telemetry_stream`, `mapobjects_store`, plus `mapobjects_sync` (pre-flight pull / post-flight push status).
+- **Geofence enforcement is symmetric.** Both INCLUSION and EXCLUSION polygons are honoured. (Earlier C++ behaviour silently ignored EXCLUSION; the rewrite explicitly enforces both.)
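The symmetric geofence rule in the last bullet can be sketched as a single predicate: a position is allowed only if it lies inside no EXCLUSION polygon and, when any INCLUSION polygon is defined, inside at least one of them. The sketch below assumes planar coordinates and even-odd ray casting. The `Point`, `Fence`, and `position_allowed` names are hypothetical, and treating a missing INCLUSION fence as "unrestricted" is an assumption rather than something the design states.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FenceKind {
    Inclusion,
    Exclusion,
}

#[derive(Debug, Clone)]
pub struct Fence {
    pub kind: FenceKind,
    pub vertices: Vec<Point>,
}

/// Even-odd ray casting: count how many polygon edges a horizontal ray
/// from `p` towards +x crosses; an odd count means `p` is inside.
pub fn contains(poly: &[Point], p: Point) -> bool {
    let n = poly.len();
    if n < 3 {
        return false;
    }
    let mut inside = false;
    let mut j = n - 1;
    for i in 0..n {
        let (a, b) = (poly[i], poly[j]);
        // The edge straddles the ray's y level, so a.y != b.y and the division is safe.
        if (a.y > p.y) != (b.y > p.y) {
            let x_cross = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if p.x < x_cross {
                inside = !inside;
            }
        }
        j = i;
    }
    inside
}

/// Both fence kinds are honoured: any EXCLUSION hit rejects the position,
/// and if INCLUSION fences exist the position must be inside one of them.
pub fn position_allowed(fences: &[Fence], p: Point) -> bool {
    let mut has_inclusion = false;
    let mut in_inclusion = false;
    for fence in fences {
        let hit = contains(&fence.vertices, p);
        match fence.kind {
            FenceKind::Exclusion if hit => return false,
            FenceKind::Exclusion => {}
            FenceKind::Inclusion => {
                has_inclusion = true;
                in_inclusion |= hit;
            }
        }
    }
    !has_inclusion || in_inclusion
}
```

A real implementation would work on geodetic coordinates and would need a policy for points exactly on a boundary; neither is covered here.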
+
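The confidence-scaled operator timeout listed above (40 % → 30 s, 100 % → 120 s, linear, nothing surfaced below 40 %) reduces to one interpolation. The sketch below is illustrative only: the function and constant names are hypothetical, and clamping confidences above 1.0 and rejecting NaN are assumptions.

```rust
use std::time::Duration;

/// Below this confidence a target is not surfaced to the operator.
pub const MIN_SURFACED_CONFIDENCE: f64 = 0.40;
const MIN_TIMEOUT_S: f64 = 30.0;
const MAX_TIMEOUT_S: f64 = 120.0;

/// Returns the operator-decision window for a confidence in [0, 1],
/// or `None` when the target should not be surfaced at all.
pub fn operator_timeout(confidence: f64) -> Option<Duration> {
    if confidence.is_nan() || confidence < MIN_SURFACED_CONFIDENCE {
        return None;
    }
    // Assumption: values above 1.0 are clamped rather than rejected.
    let c = confidence.min(1.0);
    // Fraction of the way from 40 % to 100 %.
    let t = (c - MIN_SURFACED_CONFIDENCE) / (1.0 - MIN_SURFACED_CONFIDENCE);
    let secs = MIN_TIMEOUT_S + t * (MAX_TIMEOUT_S - MIN_TIMEOUT_S);
    Some(Duration::from_secs_f64(secs))
}
```

With these constants a 70 % confidence sits halfway along the line and yields a window of about 75 s.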
+---
+
+## 6. Non-Functional Targets
+
+| Concern | Target | Owner |
+|---|---|---|
+| Tier 1 latency | ≤100 ms / frame (end-to-end at 1280 px, FP16, batch 1) | `../detections` (autopilot's call budget respects it) |
+| Tier 2 latency | ≤200 ms / ROI | `semantic_analyzer` |
+| Tier 3 (VLM) latency | ≤5 s / ROI | `vlm_client` |
+| ViewPro A40 zoom transition | ≤2 s (medium → high) | `gimbal_controller` |
+| Decision-to-movement latency | ≤500 ms | `gimbal_controller` |
+| POI rate to operator | ≤5 POIs / min (hard cap) | `scan_controller` |
+| Concealed-position recall | ≥60 % | `semantic_analyzer` |
+| Concealed-position precision | ≥20 % (operators filter) | `semantic_analyzer` |
+| New per-class P / R | ≥80 % | `../detections` |
+| Footpath detection recall | ≥70 % | `semantic_analyzer` |
+| Movement-candidate enqueue latency | ≤1 s from detection (zoom-out); ≤1.5 s (zoom-in, accommodating gimbal slew) | `movement_detector` |
+| Zoom-out → zoom-in transition | ≤2 s including physical zoom | `scan_controller` + `gimbal_controller` |
+| Telemetry rate (position) | 1 Hz min, 10 Hz target | `mavlink_layer` |
+| Memory budget (semantic + movement + VLM) | ≤6 GB on Jetson Orin Nano (8 GB total, ~2 GB reserved for YOLO) | system-wide |
+| Watchdog / retry on MAVLink failures | bounded retry with exponential backoff; explicit max-retry; health flips to red | `mission_executor` |
+| Operator command → action latency | ≤500 ms operator-click → outbound MAVLink / gimbal command (excludes modem RTT) | `operator_bridge` + downstream |
+| Sustained frame-rate floor | ≥10 fps; below this `scan_controller` suppresses zoom-in transitions and surfaces health → yellow | `frame_ingest` + `scan_controller` |
+| MapObjects pre-flight pull | ≤30 s for a 30 km × 30 km mission area; cache-fallback acceptable on timeout | `mission_client` + `mapobjects_store` |
+| MapObjects post-flight push | ≤2 min for a 60 min mission's pass diff; bounded retry; persisted on disk if push fails | `mission_client` + `mapobjects_store` |
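The hard POI cap in the table (≤5 POIs / min) could be enforced with a sliding-window counter. The design does not say whether the window is sliding or fixed, so the sliding window here is an assumption, and `PoiRateCap` and `try_surface` are hypothetical names.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Sliding-window limiter: at most `max_per_window` POIs are surfaced
/// within any interval of length `window`.
pub struct PoiRateCap {
    max_per_window: usize,
    window: Duration,
    surfaced_at: VecDeque<Instant>,
}

impl PoiRateCap {
    pub fn new(max_per_window: usize, window: Duration) -> Self {
        Self {
            max_per_window,
            window,
            surfaced_at: VecDeque::new(),
        }
    }

    /// Returns `true` and records the POI if it may be surfaced at `now`;
    /// returns `false` if surfacing it would exceed the cap.
    pub fn try_surface(&mut self, now: Instant) -> bool {
        // Drop timestamps that have aged out of the window.
        while let Some(&oldest) = self.surfaced_at.front() {
            if now.saturating_duration_since(oldest) >= self.window {
                self.surfaced_at.pop_front();
            } else {
                break;
            }
        }
        if self.surfaced_at.len() < self.max_per_window {
            self.surfaced_at.push_back(now);
            true
        } else {
            false
        }
    }
}
```

Constructed as `PoiRateCap::new(5, Duration::from_secs(60))`, the limiter rejects a sixth POI until the oldest surfaced one is at least 60 s old. What happens to a rejected POI (requeue, drop, or defer) is a `scan_controller` decision that this sketch leaves open.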
+
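The "bounded retry with exponential backoff; explicit max-retry" row can be read as the policy sketched below. The `RetryPolicy` fields, the doubling factor, and the injected `sleep` callback are assumptions made for illustration; the design fixes none of these values.

```rust
use std::time::Duration;

/// Hypothetical retry policy: the delay doubles per attempt, capped at `max_delay`.
#[derive(Debug, Clone, Copy)]
pub struct RetryPolicy {
    pub base: Duration,
    pub max_delay: Duration,
    pub max_retries: u32,
}

impl RetryPolicy {
    /// Delay to wait before retry number `attempt` (0-based),
    /// or `None` once the retry budget is exhausted.
    pub fn delay_for(&self, attempt: u32) -> Option<Duration> {
        if attempt >= self.max_retries {
            return None;
        }
        let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
        let delay = self.base.checked_mul(factor).unwrap_or(self.max_delay);
        Some(delay.min(self.max_delay))
    }
}

/// Runs `op` until it succeeds or the policy is exhausted. The last error is
/// returned to the caller instead of being swallowed. `sleep` is injected so
/// the loop can be driven by a real timer or by a test double.
pub fn run_with_retry<T, E>(
    policy: &RetryPolicy,
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) => match policy.delay_for(attempt) {
                Some(delay) => {
                    sleep(delay);
                    attempt += 1;
                }
                None => return Err(err),
            },
        }
    }
}
```

Returning the final error gives the caller one place to flip health to red once the retry budget is spent, which is the behaviour the table row asks for.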
+---
+
+## 7. Detailed Design
+
+This section covers the rewrite-time problem narrative, suite-level concerns (mission regions, MapObjects, MGRS sync, new-vs-existing object detection), constraints, acceptance criteria, the chosen solution architecture, the MAVLink command surface, and the tech stack.
+
+### 7.1 Problem
+
+The reconnaissance winged UAV detects vehicles and military equipment with YOLO, but current high-value targets are camouflaged positions: FPV operator hideouts, hidden artillery emplacements, and dugouts masked by branches. These cannot be found by visual similarity to known object classes alone.
+
+The new approach has three cooperating search engines:
+
+- **Camera sweep** — follow the UAV route at wide or light/medium zoom with left-right gimbal movement to cover terrain and queue POIs.
+- **Movement detection** — runs in **both** zoom-out and zoom-in levels (skipped only during target-follow). Per-zoom-band thresholds keep false-positive rate below the operator-review cap; classical OpenCV adequacy at zoom-in is benchmark-gated (Q14).
+- **Semantic zoom search** — detect primitives such as black entrances, branch piles, footpaths, roads, trees, and tree blocks, then reason over scene context to find concealed positions.
+
+The system controls a two-level scan:
+
+- **Zoom-out level (wide-area sweep)** — the camera follows the UAV route at wide or light/medium zoom, sweeping left-right across the flight path while detecting primitives, buildings, vehicles, and small motion candidates. Footpath starts, suspicious branch piles, tree rows, movement candidates, and similar POIs are marked with GPS-denied coordinates and queued.
+- **Zoom-in level (detailed scan)** — the camera zooms into each queued POI or movement candidate for confirmation. It follows detected footpaths from origin to endpoint, keeps paths centered while the UAV moves, follows the freshest or most promising branch at intersections, holds on endpoints for VLM analysis of branch piles, dark entrances, dugouts, vehicles, or people, and slowly pans broader POIs such as tree rows or clearings. Movement detection continues, scaled for the higher pixel-to-metre ratio. After analysis or timeout, it returns to zoom-out and continues the queue or route.
+
+When an operator confirms a target, the system switches to **target-follow mode**: keep the target centered with gimbal control while the UAV moves, until the operator releases it or tracking is lost.
+
+### 7.2 Mission Regions and Reconnaissance Flow
+
+Mission directions can be vague. Waypoints define a route that passes through multiple regions:
+
+```text
+Start → Point1 → Point2 → Point3 → Point4 → Point5 → Point6 → Finish
+                    ╔═══════════════╗
+                    ║   Region 1    ║
+                    ╚═══════════════╝
+         ╔══════════════════╗
+         ║    Region 2      ║
+         ╚══════════════════╝
+  ╔══════════════╗
+  ║   Region 3   ║
+  ╚══════════════╝
+```
+
+The autopilot decides the route within each region (1, 2, and 3).
+
+**Alternative scenario — region-only search.** The user selects only a region for the search (no explicit waypoints inside). The autopilot plans its own route within the region.
+
+```text
+Start ──┐
+        │    ╔═══════════════╗
+        ├───►║    Region     ║  (contains Points)
+        │    ╚═══════════════╝
+Finish◄─┘
+```
+
+**Reconnaissance flow.**
+
+1. The reconnaissance UAV searches within the region and finds potential targets.
+2. The reconnaissance UAV sends images to the retranslation (relay) UAV.
+3. The retranslation UAV forwards them to the human operator.
+4. The human operator decides on each target that the behaviour-tree-driven `scan_controller` logic surfaces (`system-flows.md §F4`).
+
+**Scanning strategy.**
+
+- **Zoom-out level — wide-area scan.** Camera points along the UAV route with left-right swing. The detections service continuously recognises specific patterns as POIs. This initial scan runs at medium zoom while moving between targets. POI types: tree rows (potential caponiers, entrances concealed by tree rows); polygons (areas where military vehicles could be hidden); houses with vehicles or traces; roads and routes on snow or terrain, inside the forest, or near houses.
+- **Zoom-in level — detailed scan.** When the camera finds a POI or movement candidate, it zooms in and performs a detailed scan. During detailed scan it searches for trees, caponiers, military vehicles, and so on. Movement detection continues during the zoom-in hold (subject to the per-zoom-band thresholds) so a moving small target found mid-detail-scan is not lost.
+
+### 7.3 Restrictions
+
+**Hardware and camera.**
+
+- Jetson Orin Nano Super: 67 TOPS INT8, 8 GB shared LPDDR5; YOLO uses ~2 GB RAM, leaving ~6 GB for semantic detection, movement detection, and VLM.
+- All models run at FP16 precision only (frozen choice).
+- Primary camera: ViewPro A40, 1080p (1920×1080), 40× optical zoom, f=4.25–170 mm, Sony 1/2.8" CMOS (IMX462LQR), HDMI or IP output at 1080p 30/60 fps.
+- Alternative camera: ViewPro Z40K at higher cost.
+- Thermal sensor (640×512, NETD ≤50 mK) is available only as a future enhancement, not a core requirement.
+
+**Operational.**
+
+- Flight altitude: 600–1000 m.
+- Support all seasons and terrain types: winter snow, spring mud, summer vegetation, autumn; forest, open field, urban edges, and mixed terrain. (Frozen choice: MVP must cover **all** seasons, not winter-first only.)
+- ViewPro A40 40× optical zoom traversal takes 1–2 s; the zoom-out → zoom-in transition must complete within 2 s, including physical zoom.
+- Movement detection runs at **both** zoom-out and zoom-in levels, compensates for UAV/gimbal motion, and queues candidates for zoom confirmation; target following starts only after operator confirmation. Per-zoom-band thresholds (cluster persistence, residual-velocity floor, telemetry-skew tolerance) are configurable.
+
+**Software.**
+
+- Inference: TensorRT on Jetson, ONNX Runtime fallback, 1280 px model input, tile splitting for large images.
+- VLM must run locally on Jetson with no cloud dependency, as a separate IPC process — not compiled into the autopilot binary.
+- YOLO and VLM inference run sequentially because they share GPU memory; no concurrent execution.
+
+**Reliability and safety.**
+
+- **Lost-link failsafe is mandatory.** Loss of the operator/Ground-Station modem link triggers a deterministic ladder in `mission_executor` (default RTL after a 30 s grace; configurable per mission). Loss of the airframe MAVLink link itself triggers immediate health → red and degrades to whatever ArduPilot/PX4's own failsafe dictates.
+- **Pre-flight self-test (BIT) gates takeoff.** GPS lock, camera RTSP healthy, gimbal homed (yaw/pitch/zoom feedback within tolerance), `../detections` reachable + warmed, mission loaded + validated, MapObjects pre-flight pull complete (or cached fallback acknowledged with operator confirm), VLM warm (if `vlm_enabled`), persistent-store space ≥ configured floor.
+- **Battery / fuel thresholds enforced.** `mission_executor` triggers RTL at battery ≤ configured RTL-floor (e.g. 25 %); land-now at hard-floor (e.g. 15 %); ignored only on operator override. Surfaces health → yellow / red accordingly. Threshold values are mission-configurable.
+- **Sustained frame-rate floor.** The floor is 10 fps sustained. Below it, `scan_controller` suppresses zoom-in transitions (only TIER 1 + operator overlay continue) and surfaces health → yellow.
+- **Wall-clock time source.** Monotonic clock is authoritative for telemetry-skew compensation and tick budgets. Wall-clock is bound to GPS time once GPS is locked (preferred) or NTP-set at boot if reachable; both are recorded with `clock_source` and `last_sync_at`. Drift > 200 ms surfaces health → yellow.
+- **On-device storage is bounded.** `mapobjects_store` retention + log buffer have configured caps; on cap-hit, oldest pre-current-mission data is evicted; persistent-store-full pre-flight is a BIT failure.
+
+**Integration and scope.**
+
+- The `../detections` service is FastAPI + Cython + TensorRT in a Docker container on Jetson; consumed via bi-directional gRPC.
+- Consume YOLO boxes with class, confidence, and normalised coordinates; output boxes in the same format for operator display.
+- Movement candidates and confirmed followed targets use the same normalised box format for operator display.
+- GPS coordinates come from the GPS-denied service (`../_docs/11_gps_denied.md`) and are out of scope for autopilot's own implementation.
+- **MapObjects sync** uses the central `missions` API extension `/missions/{id}/mapobjects` (pre-flight GET, post-flight POST). Schema in §7.13.
+- Annotation tooling, training pipeline, and data-collection automation are separate repositories and out of scope.
+- GPS-denied navigation is a separate project; mission planning and route selection inside a region remain in autopilot.
+
+**Frozen choices (2026-05-06, updated 2026-05-18).** Gating decisions for downstream design:
+
+1. **Tier 1 remains FP16-only** for all models. INT8 is rejected for MVP.
+2. **MVP acceptance requires all seasons**, not winter-first only.
+3. **Operator-review cap is ≤5 POIs/minute** (moderate cap chosen).
+4. **Movement detection assumes timestamped video, gimbal angle/zoom, and UAV motion telemetry** for MVP. Naive frame-differencing is rejected. Movement detection runs at both zoom-out and zoom-in; classical OpenCV adequacy at zoom-in is benchmark-gated (Q14).
+5. **Local VLM is required for MVP** if and only if the exact model satisfies ≤5 s/ROI and the memory budget; otherwise VLM is disabled for MVP and `scan_controller` operates without it.
+6. **MapObjects are mission-bracketed and centrally synchronised** via the `missions` API. In-flight sync is **batched only** for MVP (no streaming over modem).
+7. **Operator commands are authenticated, signed, and replay-protected.** Modem-link encryption alone is not sufficient.
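+
+The operator-review cap in frozen choice 3 can be pictured as a sliding-window limiter. The following is a minimal sketch, assuming a 60 s window; the `RateCap` type and its method names are hypothetical and do not come from the design docs.
+
+```rust
+use std::collections::VecDeque;
+use std::time::{Duration, Instant};
+
+/// Hypothetical sliding-window limiter for POIs surfaced to the operator.
+struct RateCap {
+    window: Duration,
+    max: usize,
+    surfaced: VecDeque<Instant>,
+}
+
+impl RateCap {
+    fn new(max: usize, window: Duration) -> Self {
+        RateCap { window, max, surfaced: VecDeque::new() }
+    }
+
+    /// Returns true and records the event if one more POI may be surfaced at `now`.
+    fn try_surface(&mut self, now: Instant) -> bool {
+        // Forget events that have aged out of the window.
+        while let Some(&oldest) = self.surfaced.front() {
+            if now.duration_since(oldest) >= self.window {
+                self.surfaced.pop_front();
+            } else {
+                break;
+            }
+        }
+        if self.surfaced.len() < self.max {
+            self.surfaced.push_back(now);
+            true
+        } else {
+            false
+        }
+    }
+}
+```
+
+With `RateCap::new(5, Duration::from_secs(60))`, a sixth POI inside the same minute is held back in the queue rather than surfaced; what happens to held-back POIs (aging, dropping) is left to the queue policy.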
+
+### 7.4 Acceptance Criteria
+
+**Latency.**
+
+| Tier | Target | Hardware |
+|---|---|---|
+| Tier 1 fast probe (YOLO26 + YOLOE-26) | ≤100 ms/frame | Jetson Orin Nano Super |
+| Tier 2 fast confirmation (custom CNN) | ≤200 ms/ROI | Jetson Orin Nano Super |
+| Tier 3 optional deep analysis (VLM) | ≤5 s/ROI | Jetson Orin Nano Super |
+
+**YOLO object detection.**
+
+- Add classes: black entrances of various sizes, branch piles, footpaths, roads, trees, and tree blocks.
+- New classes target: P ≥80 %, R ≥80 %; existing class performance must not degrade.
+- Baseline reference: current YOLO achieves P=81.6 %, R=85.2 % on non-masked objects.
+
+**Semantic detection.**
+
+- Initial concealed-position recall: ≥60 %, accepting high false positives for later reduction.
+- Initial concealed-position precision: ≥20 %, with operators filtering candidates.
+- Footpath detection recall: ≥70 %.
+- Pipeline consumes YOLO primitives (footpaths, roads, branch piles, entrances, trees), assesses path freshness, traces paths to endpoints, identifies concealed structures, and follows the freshest or most promising branch at intersections.
+
+**Movement detection.**
+
+- During the zoom-out sweep, detect small moving point/cluster candidates that are not yet classifiable and enqueue them for zoom confirmation within 1 s.
+- During the zoom-in hold, continue movement detection (independent residual-motion clustering, scaled for the zoomed pixel-to-metre ratio) so a moving small target appearing inside a held POI is not lost; enqueue within 1.5 s.
+- Account for UAV and gimbal motion: stable objects (trees, houses, roads, terrain) must not be treated as moving only because the camera platform moves.
+- Movement candidates become zoom-in POIs; after zoom, the system attempts semantic / YOLO confirmation as vehicle, people, or other relevant target.
+- Zoom-in adequacy of classical OpenCV optical-flow / global-motion estimation is benchmark-gated. If the false-positive rate at zoom-in exceeds the per-zoom-band budget, fall back to a learned optical-flow / CNN-based motion module behind a feature flag (Q14).
+
+**Scan and camera control.**
+
+- Zoom-out level covers the planned route with a wide or light/medium-zoom left-right sweep; POIs include footpaths, tree rows, branch piles, black entrances, movement candidates, houses with vehicles or traces, and roads on snow / terrain / forest.
+- Transition zoom-out → zoom-in within 2 s of POI detection, including physical zoom from medium to high.
+- Zoom-in level keeps camera lock while the UAV flies, compensates for aircraft motion, pans along footpaths or movement candidates so they stay visible and centered, holds endpoints for VLM analysis up to 2 s, and returns to zoom-out after analysis or configurable timeout (default 5 s/POI).
+- After operator confirmation, target-follow mode keeps the target in the centre 25 % of frame while visible, until operator release, target loss, or timeout.
+- Gimbal module commands ViewPro A40 pan/tilt/zoom with ≤500 ms decision-to-movement latency, smooth transitions, and footpaths/moving targets kept centered during pan.
+- Maintain an ordered POI queue prioritised by confidence and proximity to current camera position.
+
+**Resources and data.**
+
+- Semantic module + movement module + VLM RAM: ≤6 GB on Jetson Orin Nano Super.
+- Must coexist with the running YOLO pipeline without degrading YOLO performance.
+- Training data: hundreds to thousands of annotated images/sequences across all seasons and terrain types.
+- Dedicated annotation needed for black entrances, branch piles, footpaths, roads, trees, and tree blocks; available dataset assembly effort is 1.5 months at 5 hours/day.
+
+### 7.5 Training Data
+
+**Source.**
+
+- Aerial imagery from reconnaissance winged UAVs at 600–1000 m altitude.
+- ViewPro A40 camera, 1080p resolution, various zoom levels.
+- Extracted from video frames and still images.
+- Movement detection requires frame sequences, not still images only; include camera/gimbal telemetry where available to separate target motion from UAV motion.
+
+**Target classes.**
+
+- Footpaths / trails (linear features on snow, mud, forest floor).
+- Fresh footpaths (distinct edges, undisturbed surroundings, recent track marks).
+- Stale footpaths (partially covered by snow / vegetation, faded edges).
+- Concealed structures: branch-pile hideouts, dugout entrances, squared / circular openings.
+- Tree rows (potential concealment lines).
+- Open clearings connected to paths (FPV launch points).
+- Moving point/cluster candidates across the full zoom range (wide, light/medium, full zoom-in) — sequences must include both zoom-out and zoom-in examples to support per-zoom-band threshold tuning.
+
+**YOLO primitive classes (new).**
+
+- Black entrances to hideouts (various sizes).
+- Piles of tree branches.
+- Footpaths.
+- Roads.
+- Trees, tree blocks.
+
+**Annotation format.**
+
+- Managed by existing annotation tooling in a separate repository.
+- Expected: bounding boxes and/or segmentation masks depending on model architecture.
+- Footpaths may require polyline or segmentation annotation rather than bounding boxes.
+
+**Seasonal coverage.**
+
+- Winter: snow-covered terrain (footpaths as dark lines on white).
+- Spring: mud season (footpaths as compressed/disturbed soil).
+- Summer: full vegetation (paths through grass/undergrowth).
+- Autumn: mixed leaf cover, partial snow.
+
+**Volume.**
+
+- Target: hundreds to thousands of annotated images/sequences.
+- Available effort: 1.5 months at 5 hours/day.
+- Potential for annotation-process automation.
+
+### 7.6 Solution Architecture
+
+A two-level onboard scan system (zoom-out wide sweep + zoom-in confirmation). The system delegates Tier 1 detection to the existing FastAPI / Cython / TensorRT YOLO service (`../detections`), adds a central scan/perception scheduler (`scan_controller`), compensates motion using synchronised video / gimbal / UAV telemetry (movement detection runs at both zoom levels), controls the ViewPro A40 through a deterministic state machine, and invokes a secured local VLM process only for bounded zoom-in confirmation.
+
+Before implementation decomposition, the project must pass a **benchmark gate** on target hardware: Tier 1 latency, Tier 2 ROI latency, VLM latency / memory, A40 zoom timing, movement-replay false-positive rate, and all-season dataset readiness.
+
+```text
+Video frames + timestamped gimbal/zoom/UAV telemetry
+        |
+        v
+Input validation + telemetry synchronisation
+        |
+        v
+Central scan/perception scheduler (scan_controller)
+        |
+        +---> Existing FastAPI/Cython TensorRT service (../detections)
+        |       YOLO26 + YOLOE-26 fixed-class FP16 engines
+        |
+        +---> Movement detector (active in ZoomedOut and ZoomedIn)
+        |       OpenCV ego-motion compensation + residual clusters,
+        |       per-zoom-band thresholds; learned-CV fallback Q14
+        |
+        +---> Tier 2 semantic analyzer
+        |       primitive graph + lightweight ROI CNN (zoom-in only)
+        |
+        v
+POI queue (confidence + proximity + aging + <=5 POIs/min cap)
+        |
+        +---> ViewPro A40 state-machine controller
+        |
+        +---> Secured local VLM IPC (optional, benchmark-gated)
+                NanoLLM VILA1.5-3B, structured VlmAssessment output
+```
+
+#### Benchmark gate
+
+The first implementation milestone is a proof suite, not product code. It validates:
+
+- YOLO26 + YOLOE-26 FP16 TensorRT, fixed 1280 px, batch 1, end-to-end ≤100 ms/frame.
+- Tier 2 primitive graph + lightweight CNN ≤200 ms/ROI.
+- NanoLLM VILA1.5-3B local VLM ≤5 s/ROI and within remaining memory budget while the YOLO container is present.
+- ViewPro A40 medium-to-high zoom transition and command-to-movement latency.
+- Movement replay false-positive rate **measured independently** at zoom-out and zoom-in, under the ≤5 POIs/minute operator-review cap. If zoom-in exceeds the per-zoom-band cap with classical CV, the learned-CV fallback (Q14) becomes a benchmark-gate prerequisite for the zoom-in scope.
+- All-season dataset readiness and hard-negative coverage.
+
+#### Tier 1 primitive detector
+
+Use custom-trained fixed-class YOLO26 and YOLOE-26 TensorRT FP16 engines, owned by `../detections`. Runtime open-vocabulary prompt mutation is **not** part of MVP; fixed project classes or pre-baked embeddings are required. Outputs remain normalised boxes for operator display, with optional masks or path geometry passed as POI metadata.
+
+#### Tier 2 semantic analyzer
+
+Use a primitive graph plus a lightweight ROI CNN to reason over paths, branch piles, dark entrances, roads, trees, tree blocks, clearings, vehicles, people, and endpoint context. This layer owns path freshness, endpoint scoring, branch choice at intersections, and concealment-POI scoring. Active in the zoom-in level only.
+
+#### Movement detector
+
+Active at **both** zoom-out and zoom-in (skipped only during target-follow). Use OpenCV optical flow / global-motion estimation fused with timestamped gimbal angle, zoom state, and UAV motion telemetry. Naive frame differencing is rejected because it cannot distinguish target motion from platform motion. A telemetry synchronisation contract specifies maximum tolerated frame ↔ gimbal ↔ zoom ↔ UAV timestamp skew before motion compensation; out-of-tolerance samples must be rejected or downgraded.
+
+**Per-zoom-band tuning.** Cluster persistence threshold, residual-velocity floor, and telemetry-skew tolerance are configured per zoom band (zoom-out, zoom-in). The pixel-to-metre ratio differs by ~10× between bands, so identical residual pixel motion implies very different physical motion; thresholds must scale.
+
+**Adequacy at zoom-in (research item, Q14).** Classical optical flow / global-motion estimation is well-validated at zoom-out (UAV cruising, gimbal sweeping, large FOV, ego-motion is the dominant signal and easily fitted). At zoom-in the gimbal is actively path-following, the FOV is narrow, motion blur from any small command is large, and the homography model degrades. The benchmark gate (above) MUST measure the false-positive rate at zoom-in independently from zoom-out; if it exceeds the per-zoom-band cap, the implementation falls back to a learned optical-flow module (e.g. RAFT-derived) or a CNN-based motion-segmentation module behind a feature flag, while keeping the same input/output contract.
+
+#### Scan controller and POI queue
+
+Use a deterministic typed state machine with **`ZoomedOut`**, **`ZoomedIn { roi, hold_started_at }`**, and **`TargetFollow { target_id, started_at }`** states. The queue is ordered by confidence, proximity, and aging while enforcing the ≤5 POIs/minute operator-review cap. The controller handles timeouts, target loss, VLM waits, return-to-zoom-out, and target-follow centre-window behaviour. The full behaviour-tree spec — including tick scenarios and the 15 fixed-wing rules — lives in `system-flows.md §F4`.
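+
+A minimal sketch of how the three typed states could be expressed in Rust. The state names and their fields follow the paragraph above; the `Roi` shape, the `Event` enum, the `step` function, and the 120 s follow timeout are illustrative assumptions, and the full rule set in `system-flows.md §F4` is not reproduced here.
+
+```rust
+use std::time::{Duration, Instant};
+
+/// Hypothetical ROI in normalised image coordinates.
+#[derive(Debug, Clone, Copy, PartialEq)]
+struct Roi { cx: f32, cy: f32, w: f32, h: f32 }
+
+#[derive(Debug, Clone, Copy, PartialEq)]
+enum ScanState {
+    ZoomedOut,
+    ZoomedIn { roi: Roi, hold_started_at: Instant },
+    TargetFollow { target_id: u64, started_at: Instant },
+}
+
+/// Hypothetical event set; the real controller has more inputs.
+#[derive(Debug, Clone, Copy)]
+enum Event {
+    PoiDequeued(Roi),
+    OperatorConfirmed { target_id: u64 },
+    OperatorReleased,
+    TargetLost,
+    Tick,
+}
+
+const HOLD_TIMEOUT: Duration = Duration::from_secs(5); // default 5 s/POI
+const FOLLOW_TIMEOUT: Duration = Duration::from_secs(120); // assumed value
+
+/// Pure transition function: same state + event + time always gives the same result.
+fn step(state: ScanState, event: Event, now: Instant) -> ScanState {
+    use ScanState::*;
+    match (state, event) {
+        (ZoomedOut, Event::PoiDequeued(roi)) => ZoomedIn { roi, hold_started_at: now },
+        (ZoomedIn { .. }, Event::OperatorConfirmed { target_id }) => {
+            TargetFollow { target_id, started_at: now }
+        }
+        (ZoomedIn { hold_started_at, .. }, Event::Tick)
+            if now.duration_since(hold_started_at) >= HOLD_TIMEOUT => ZoomedOut,
+        (TargetFollow { .. }, Event::OperatorReleased | Event::TargetLost) => ZoomedOut,
+        (TargetFollow { started_at, .. }, Event::Tick)
+            if now.duration_since(started_at) >= FOLLOW_TIMEOUT => ZoomedOut,
+        // Every other combination leaves the state unchanged.
+        (s, _) => s,
+    }
+}
+```
+
+Keeping `step` free of I/O makes the tick scenarios replayable in unit tests: the caller supplies `now`, so timeouts can be exercised without sleeping.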
+
+#### Local VLM confirmation
+
+Run NanoLLM with VILA1.5-3B through a separate local IPC process **if** the benchmark gate passes. Use one bounded ROI crop, short prompt, short answer, and a validated `VlmAssessment` schema. Free-form VLM text is not a downstream API. The IPC channel uses Unix-domain socket permissions and peer-credential checks where available.
+
+**Optionality model.** VLM is the only optional Tier in the system. Two complementary mechanisms model this:
+
+1. **Runtime configuration flag (`vlm_enabled`)**, gated by the benchmark-gate result. When the flag is `false`, `scan_controller` skips the VLM-confirmation step and proceeds with Tier 2 evidence alone for the zoom-in hold; the operator timeout still applies.
+2. **Build-time feature module.** The `vlm_client` component is a separate module behind a feature flag; the binary must build, link, and run identically when the module is absent. `scan_controller` MUST NOT contain a hard dependency on `vlm_client`'s presence — it depends only on a `VlmAssessment` provider trait whose default implementation returns `status: vlm_disabled`.
+
+The implementation chooses one of these (or both); both must yield the same observable behaviour: the system functions correctly with VLM absent, only losing the zoom-in confirmation step.
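+
+One way to picture the provider-trait mechanism is sketched below. Only the `VlmAssessment` name and the `vlm_disabled` status come from the text above; the trait name, the fields, the `confirm_poi` helper, and the 0.5 thresholds are assumptions made for illustration.
+
+```rust
+#[derive(Debug, Clone, Copy, PartialEq)]
+enum VlmStatus {
+    VlmDisabled,
+    Ok,
+}
+
+#[derive(Debug, Clone, PartialEq)]
+struct VlmAssessment {
+    status: VlmStatus,
+    /// Hypothetical field: model confidence, present only when status is Ok.
+    confidence: Option<f32>,
+}
+
+/// Hypothetical provider trait; the default method models "VLM absent".
+trait VlmAssessmentProvider {
+    fn assess(&self, _roi_jpeg: &[u8]) -> VlmAssessment {
+        VlmAssessment { status: VlmStatus::VlmDisabled, confidence: None }
+    }
+}
+
+/// Stand-in used when `vlm_client` is not compiled in or `vlm_enabled` is false.
+struct DisabledVlm;
+impl VlmAssessmentProvider for DisabledVlm {}
+
+/// Assumed decision rule: use the VLM verdict when available, else Tier 2 alone.
+fn confirm_poi(tier2_score: f32, roi_jpeg: &[u8], vlm: &dyn VlmAssessmentProvider) -> bool {
+    let assessment = vlm.assess(roi_jpeg);
+    match assessment.status {
+        VlmStatus::Ok => assessment.confidence.unwrap_or(0.0) >= 0.5,
+        VlmStatus::VlmDisabled => tier2_score >= 0.5,
+    }
+}
+```
+
+Because `confirm_poi` takes `&dyn VlmAssessmentProvider`, the caller compiles and runs the same way whether a real client or `DisabledVlm` is passed in, which is the property the optionality model asks for.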
+
+#### Integration and reliability
+
+Preserve the normalised-box contract while adding POI metadata. A central scheduler (`scan_controller`) owns GPU-heavy work and enforces no concurrent YOLO/VLM execution. No silent exception swallowing; health must reflect every dependency listed in §5.
+
+#### Security and operational controls
+
+- Validate image / ROI payload size and format before decoding or inference.
+- Use patched OpenCV versions and an image-format allow-list.
+- Enforce local IPC authorisation and payload limits for the VLM process (Unix-domain socket permissions plus peer-credential checks).
+- Log POI creation reasons, source detections, queue decisions, gimbal commands, VLM requests, operator confirmations, and failure states.
+- Keep VLM local with no cloud egress.
+
+### 7.7 MAVLink and Piloting
+
+`mavlink_layer` is a hand-rolled MAVLink v2 transport. There is no third-party SDK dependency. The layer owns serialisation / deserialisation, heartbeat, sequence numbers, retry, and a single connection abstraction (UDP or serial, picked at startup from CLI / env).
+
+**Command surface (~10–15 commands).** Only what the system actually needs:
+
+| MAVLink message | Direction | Used by | Purpose |
+|---|---|---|---|
+| `HEARTBEAT` | bidirectional | `mavlink_layer` | liveness + GCS-vs-companion identification |
+| `COMMAND_LONG` (subset) | out | `mission_executor` | arm / disarm, takeoff, set-mode, change-speed, change-alt, land, RTL |
+| `COMMAND_ACK` | in | `mavlink_layer` | command-result demux, retry trigger |
+| `MISSION_COUNT` | out | `mission_executor` | pre-upload count |
+| `MISSION_REQUEST_INT` | in | `mission_executor` | pull-side mission upload |
+| `MISSION_ITEM_INT` | out | `mission_executor` | per-waypoint upload |
+| `MISSION_ACK` | in | `mission_executor` | upload completion |
+| `MISSION_SET_CURRENT` | out | `mission_executor` | start at item 0 |
+| `MISSION_CURRENT` | in | `mission_executor` | progress |
+| `MISSION_ITEM_REACHED` | in | `mission_executor` | progress |
+| `MISSION_CLEAR_ALL` | out | `mission_executor` | reset before re-upload (e.g., middle waypoint) |
+| `GLOBAL_POSITION_INT` | in | `telemetry_stream`, `mission_executor` | live position |
+| `ATTITUDE` | in | `telemetry_stream` | attitude for operator overlay |
+| `SYS_STATUS` / `EXTENDED_SYS_STATE` | in | health aggregator | mode, battery, sensor health |
+| `STATUSTEXT` | in | logger | autopilot diagnostic lines |
+| `SET_MODE` (or `COMMAND_LONG MAV_CMD_DO_SET_MODE`) | out | `mission_executor` | flight-mode transitions for fixed-wing |
+
+If the autopilot link supports MAVLink-2 message signing it is enabled; otherwise the link is treated as trusted (it is point-to-point on a closed serial / UDP path on the airframe).
+
+**Piloting variants.** `mission_executor` runs one of two state machines depending on the airframe declared at startup:
+
+- **Multirotor variant**: `DISCONNECTED → CONNECTED → HEALTH_OK → ARMED → TAKE_OFF → MISSION_UPLOADED → FLY_MISSION → LAND`. The executor arms, takes off to a configured altitude, and only then uploads + starts the mission. Bounded retry with exponential backoff at every transition; explicit max-retry; on exceeding it, health flips to red and the executor surfaces the failure via the operator bridge.
+- **Fixed-wing variant**: `DISCONNECTED → CONNECTED → HEALTH_OK → MISSION_UPLOADED → WAIT_AUTO → FLY_MISSION → LAND`. The executor skips arm/takeoff (the airframe is assumed already airborne under RC control), uploads the mission, and waits for the operator to switch the airframe into AUTO mode via RC. Same retry policy.
+
+**Geofence enforcement.** `mission_executor` honours both INCLUSION and EXCLUSION polygons declared in the mission. INCLUSION violations halt forward progress and trigger return-to-launch (RTL); EXCLUSION violations trigger the same. The earlier C++ implementation parsed but silently ignored EXCLUSION; the new design rejects that behaviour explicitly.
+
+**Mission uploads and middle-waypoint inserts.** When the operator confirms a target, `operator_bridge` hands a middle-waypoint hint to `mission_executor`. The executor recomputes the mission (current-position → middle-waypoint → resume original route), clears the existing autopilot mission via `MISSION_CLEAR_ALL`, re-uploads the new mission via the standard `MISSION_COUNT` / `MISSION_ITEM_INT` / `MISSION_ACK` sequence, and resumes flight. After target-follow ends (operator release, target loss, or timeout), the same sequence reverts to the original mission.
+
+**Lost-link failsafe (operator/Ground-Station modem link).** A typed failsafe ladder runs in `mission_executor`, evaluated each tick:
+
+| Stage | Trigger | Action |
+|---|---|---|
+| `LinkOk` | last operator heartbeat ≤ 5 s | continue mission; no behavioural change |
+| `LinkDegraded` | 5 s < last heartbeat ≤ 30 s | continue mission; surface health → yellow; queue all POI surface-events for replay-on-recovery |
+| `LinkLost` | last heartbeat > 30 s **and** target-follow inactive | trigger RTL via `MAV_CMD_NAV_RETURN_TO_LAUNCH`; log mission abort with reason; continue logging the mission diff for post-flight upload via `mapobjects_store` |
+| `LinkLostInFollow` | last heartbeat > 30 s **and** in target-follow | hold target-follow for an additional 30 s grace (operator may have momentarily lost link); thereafter fall through to `LinkLost` |
+
+The grace windows (5 s, 30 s, 30 s) are mission-configurable. **MAVLink-link loss to ArduPilot/PX4 itself** is not the same event — it triggers immediate health → red and falls through to whatever the airframe autopilot's own failsafe does (we do NOT override it).
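+
+The ladder above reduces to a pure classification over the time since the last operator heartbeat. A minimal sketch follows; the stage names and default windows come from the table, while `LinkConfig`, its field names, and `classify_link` are hypothetical.
+
+```rust
+use std::time::Duration;
+
+#[derive(Debug, Clone, Copy, PartialEq)]
+enum LinkStage {
+    LinkOk,
+    LinkDegraded,
+    LinkLost,
+    LinkLostInFollow,
+}
+
+/// Mission-configurable windows (field names are assumptions).
+struct LinkConfig {
+    ok_window: Duration,
+    lost_after: Duration,
+    follow_grace: Duration,
+}
+
+impl Default for LinkConfig {
+    fn default() -> Self {
+        LinkConfig {
+            ok_window: Duration::from_secs(5),
+            lost_after: Duration::from_secs(30),
+            follow_grace: Duration::from_secs(30),
+        }
+    }
+}
+
+/// Evaluated each tick from the age of the last operator heartbeat.
+fn classify_link(since_heartbeat: Duration, in_follow: bool, cfg: &LinkConfig) -> LinkStage {
+    if since_heartbeat <= cfg.ok_window {
+        LinkStage::LinkOk
+    } else if since_heartbeat <= cfg.lost_after {
+        LinkStage::LinkDegraded
+    } else if in_follow && since_heartbeat <= cfg.lost_after + cfg.follow_grace {
+        // Extra grace while following a target; afterwards fall through to LinkLost.
+        LinkStage::LinkLostInFollow
+    } else {
+        LinkStage::LinkLost
+    }
+}
+```
+
+With the defaults, a heartbeat gap of 45 s yields `LinkLostInFollow` during target-follow and `LinkLost` otherwise; the RTL command itself is issued by whatever consumes the stage, not by this function.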
+
+**Battery / fuel thresholds.** `mission_executor` reads `SYS_STATUS` / `EXTENDED_SYS_STATE` and enforces:
+
+- `battery ≤ rtl_threshold` (default 25 %) → trigger RTL, log reason, continue post-mission upload.
+- `battery ≤ hard_floor` (default 15 %) → land-now via `MAV_CMD_NAV_LAND` at safest reachable point; surface health → red.
+
+Operator override is permitted via a signed command (per Q9); without it, the thresholds are hard.
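+
+The two thresholds and the override can be sketched as a single decision function. This is an illustration only: `BatteryAction` and `battery_action` are hypothetical names, and the sketch assumes the signed-override check has already happened upstream.
+
+```rust
+#[derive(Debug, Clone, Copy, PartialEq)]
+enum BatteryAction {
+    Continue,
+    ReturnToLaunch,
+    LandNow,
+}
+
+/// `percent` is the remaining battery; thresholds are mission-configurable.
+/// `operator_override` is assumed to be true only after a valid signed command.
+fn battery_action(
+    percent: u8,
+    rtl_threshold: u8,
+    hard_floor: u8,
+    operator_override: bool,
+) -> BatteryAction {
+    if operator_override {
+        return BatteryAction::Continue;
+    }
+    // Check the hard floor first so it wins when both conditions hold.
+    if percent <= hard_floor {
+        BatteryAction::LandNow
+    } else if percent <= rtl_threshold {
+        BatteryAction::ReturnToLaunch
+    } else {
+        BatteryAction::Continue
+    }
+}
+```
+
+With the defaults (25 and 15), a reading of 20 % maps to `ReturnToLaunch` and 15 % maps to `LandNow`.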
+
+**Connection configuration.** A single connection URI at startup: `udp://...` or `serial:///dev/...`. No runtime URI swap.
+
+**Frames and altitudes.** All waypoints in the mission API use `MAV_FRAME_GLOBAL_RELATIVE_ALT`. Terrain-following frames are not used (no SRTM database on the airframe).
+
+### 7.8 Detection Classes
+
+These classes extend the default seed set used by the detections service.
+
+| Class           | Local Name (UA) | Notes                      |
+|-----------------|-----------------|----------------------------|
+| Rows of trees   | Посадка         | Linear vegetation cover    |
+| Trenches/Ditches| Рів             | Linear earthwork features  |
+| Trash piles     | Сміття          | Indicators of activity     |
+| Tire tracks     | Сліди від шин   | Signs of movement          |
+
+Plus the new YOLO primitive classes from §7.5 Training Data: black entrances of various sizes, branch piles, footpaths, roads, trees, and tree blocks.
+
+### 7.9 MapObjects (H3 spatial index)
+
+`MapObjects` are created and managed internally by autopilot. There is **no** standalone REST API for MapObjects: during a mission, autopilot reads and writes them directly in the on-device store (`mapobjects_store`). The only external touchpoints are the pre-flight pull and post-flight push through the `missions` API extension (§7.13) and the delete cascade in `DELETE /missions/{id}` (per the suite-level missions API).
+
+Autopilot needs to store objects on a 2D map efficiently in order to find differences fast:
+
+- New objects (new pile of trash, new tire tracks).
+- Changed objects.
+- Removed objects.
+
+Each object on the map is described by:
+
+- `gps(lat, lon)` — geographic position.
+- `size(width, height)` — bounding area.
+
+**Spatial indexing.** Use a hexagonal spatial index to efficiently store and query objects by location.
+
+**Approach:** H3 library (by Uber) — hierarchical hexagonal geospatial indexing system.
+
+| Aspect              | Detail                                     |
+|---------------------|--------------------------------------------|
+| Library             | H3 (`h3rs` crate for Rust)                 |
+| Algorithm basis     | 3D icosahedron → 2D hexagonal tessellation |
+| Key advantage       | Near-uniform cell area, good neighbour queries |
+| Open question       | Optimal tile/resolution size               |
+| Known issue         | Discontinuity problem at cell boundaries   |
+
+The hexagonal grid avoids the distortion problems of square grids and provides consistent neighbour relationships, making it suitable for fast spatial diff operations (detecting new, changed, and removed objects).
+
+### 7.10 Drone ⇄ Operator Sync Message Format
+
+Detection data is synced between drone and operator using a compact message format. MGRS (Military Grid Reference System) is used as the primary coordinate encoding — compact, standardised, and directly usable on military maps.
+
+**Drone → Operator (detection report):**
+
+```text
+missionId :: MGRS(encoded) :: class :: confidence :: size_width_m :: size_length_m :: photo_metadata :: flags
+```
+
+**Operator → Drone (command/acknowledgment):**
+
+```text
+missionId :: Encoded(GroundMGRS :: Time) :: ... :: missionId2
+```
+
+Wire-level field semantics live in `data_model.md §MGRS sync message`.
+
+### 7.11 Target Relocation / Movement Analysis
+
+The system maintains a live **map of objects** and detects changes between survey passes.
+
+**Map update types.**
+
+| Type    | Meaning                                      |
+|---------|----------------------------------------------|
+| New     | Object not seen before in this area          |
+| Moved   | Object of same class appeared nearby         |
+| Removed | Previously recorded object no longer present |
+
+**Map hashtable.** Objects are stored in a hashtable keyed by MGRS grid reference:
+
+```text
+MGRS1  -> Object1
+MGRS2  -> Object5
+MGRS12 -> Object2
+MGRSN  -> ObjectM
+```
+
+### 7.12 New vs Existing / Moved / Removed Object Detection
+
+When a detection occurs, the system must determine whether the object is **new**, **moved**, or **already known**. This must be done efficiently in real time. This is the implementation of `scan_controller`'s map-diff responsibilities; it lives in `mapobjects_store`.
+
+**Algorithm.**
+
+```text
+On each detection(gps, class, confidence, size):
+
+1. Compute H3 cell index at chosen resolution (e.g. res 10, ~15,000 m² cell area, edge on the order of 70 m).
+2. Build composite key = H3_cell + class.
+3. Query k-ring(H3_cell, k=2) -> get all neighbouring cells.
+4. For each neighbouring cell, lookup objects with same or similar class:
+     similar_classes = {military_vehicle, tank, artillery}  (configurable groups)
+5. Compare:
+     - If matching object found within distance_threshold (config, e.g. 50m)
+       AND same class group -> EXISTING (or MOVED if position delta > move_threshold).
+     - If no match -> NEW -> insert into map with H3 hash key.
+6. After full sweep: objects in the region that were NOT re-observed -> REMOVED candidates.
+```
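+
+Step 5 of the algorithm (the NEW / EXISTING / MOVED decision) can be sketched without an H3 dependency. In the sketch below a linear scan over an already-fetched candidate slice stands in for the H3 neighbour query, and distance is computed with the haversine formula; the `MapObject` fields, the `Match` enum, and both function names are hypothetical.
+
+```rust
+#[derive(Debug, Clone, Copy)]
+struct MapObject {
+    lat: f64,
+    lon: f64,
+    /// Hypothetical id of the configurable class group.
+    class_group: u32,
+}
+
+#[derive(Debug, PartialEq)]
+enum Match {
+    New,
+    /// Index of the matched candidate.
+    Existing(usize),
+    Moved(usize),
+}
+
+/// Great-circle distance in metres on a spherical Earth.
+fn haversine_m(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
+    const EARTH_RADIUS_M: f64 = 6_371_000.0;
+    let dphi = (lat2 - lat1).to_radians();
+    let dlambda = (lon2 - lon1).to_radians();
+    let a = (dphi / 2.0).sin().powi(2)
+        + lat1.to_radians().cos() * lat2.to_radians().cos() * (dlambda / 2.0).sin().powi(2);
+    2.0 * EARTH_RADIUS_M * a.sqrt().asin()
+}
+
+/// `candidates` is assumed to be the result of the neighbour-cell lookup.
+fn classify_detection(
+    det: &MapObject,
+    candidates: &[MapObject],
+    distance_threshold_m: f64,
+    move_threshold_m: f64,
+) -> Match {
+    let nearest = candidates
+        .iter()
+        .enumerate()
+        .filter(|(_, o)| o.class_group == det.class_group)
+        .map(|(i, o)| (i, haversine_m(det.lat, det.lon, o.lat, o.lon)))
+        .filter(|(_, d)| *d <= distance_threshold_m)
+        .min_by(|a, b| a.1.total_cmp(&b.1));
+    match nearest {
+        None => Match::New,
+        Some((i, d)) if d > move_threshold_m => Match::Moved(i),
+        Some((i, _)) => Match::Existing(i),
+    }
+}
+```
+
+With the example thresholds (50 m and 10 m), a same-group detection about 22 m from a stored object is reported as `Moved`, while one about 111 m away is `New`. REMOVED candidates (step 6) need a per-sweep "re-observed" flag and are outside this sketch.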
+
+**Why H3 + MGRS.**
+
+| Step                     | Mechanism                  | Complexity |
+|--------------------------|----------------------------|------------|
+| Spatial cell lookup      | H3 `latlng_to_cell`        | O(1)       |
+| Neighbour query          | H3 `grid_disk(k=2)`        | O(1)       |
+| Object lookup per cell   | Hashtable by `MGRS+class`  | O(1)       |
+| Total per detection      | ~constant time             | O(k²)      |
+
+**Configurable parameters.**
+
+| Parameter            | Example Value | Purpose                                              |
+|----------------------|---------------|------------------------------------------------------|
+| search_radius_km     | 30            | Max radius to search for previously known objects    |
+| distance_threshold_m | 50            | Max distance to consider same object                 |
+| move_threshold_m     | 10            | Min displacement to flag as "moved"                  |
+| h3_resolution        | 10            | ~15,000 m² cell area (edge on the order of 70 m); few vehicle-sized objects per cell |
+| similar_classes      | per config    | Class groups treated as equivalent for matching      |
+
+**Notes.**
+
+- The 30 km radius is for the broad initial query ("get all previously stored objects within 30 km"). H3 `grid_disk` at resolution 10 with k=2 reaches two rings of cells, roughly 250 m around the centre cell, which comfortably exceeds `distance_threshold_m` and handles fine-grained matching. For the broad query, use a coarser H3 resolution (e.g. res 4 ~22 km edge) as a pre-filter.
+- `MGRS+class` is the composite key for the hashtable so that lookups are partitioned by both location and object type.
+- The discontinuity problem at H3 cell boundaries is solved by always querying the k-ring (centre cell + neighbours), ensuring objects near an edge are still matched.
+
+### 7.13 MapObjects Sync (central DB)
+
+`mapobjects_store` is **not** a private on-device database. It is the working copy of a centrally maintained map of detected objects, scoped to the mission's bounding box, synchronised on a per-mission basis.
+
+**Mirror of the GPS-Denied satellite-tile pattern.** Pre-flight, autopilot pulls the relevant central state into the on-device store; in-flight the on-device store is authoritative; post-flight, autopilot pushes the mission's full pass diff back to the central store. The central store is the source of truth across missions and across UAVs; the on-device store is the source of truth during the active mission.
+
+**Endpoint hosting (frozen 2026-05-18).** The endpoints are an extension of the existing `missions` API. There is no separate `mapobjects` service.
+
+| Endpoint | Method | Purpose |
+|---|---|---|
+| `/missions/{id}/mapobjects` | `GET` | Pre-flight: returns the central map state for the mission's bounding box (last-known objects + ignored items). |
+| `/missions/{id}/mapobjects` | `POST` | Post-flight: uploads the mission's full pass diff (NEW / MOVED / REMOVED-CANDIDATE / CONFIRMED-EXISTING) for central merge. |
+| `/missions/{id}/mapobjects/ignored` | `GET` | Pre-flight: returns the central ignored-items list scoped to the mission area. |
+| `/missions/{id}/mapobjects/ignored` | `POST` | Post-flight: uploads ignored-items appended during the mission. |
+| `/missions/{id}` | `DELETE` (existing) | Cascade: drops mission-scoped MapObjects and IgnoredItems centrally as well as on-device. |
+
+In-flight sync is **batched only** for MVP — no streaming over modem. Cross-UAV awareness lags by mission length; this is an explicit MVP trade-off (Frozen choice 6 in §7.3).
+
+**Sync lifecycle (per mission).**
+
+1. **Pre-flight pull** — `mission_client` calls `GET /missions/{id}/mapobjects` after fetching the mission itself. Response hydrates `mapobjects_store`. Failure modes:
+   - **Reachable + 200**: hydrate; record `pull_completed_at`. Sync state = `synced`.
+   - **Reachable + 4xx**: fail BIT; surface error; operator must investigate (likely mission-id mismatch or unauthorised UAV).
+   - **Unreachable / timeout**: BIT degrades. Operator may acknowledge to continue with **last-cached** state for this mission area (sync state = `cached_fallback`); the BIT failure is recorded for post-mission audit.
+   - **Empty response**: sync state = `synced`, store empty (legitimate first flight in this area).
+2. **In-flight** — store is authoritative. All NEW / MOVED / EXISTING / IgnoredItem appends accumulate in the on-device store with `pending_upload = true`. No central writes.
+3. **Post-flight push** — `mission_client` calls `POST /missions/{id}/mapobjects` with the mission's full pass diff after landing or RTL. Conflict resolution is server-side per §7.13 conflict rules. Failure modes:
+   - **Reachable + 200**: clear `pending_upload`; record `push_completed_at`. Sync state = `synced`.
+   - **Unreachable / timeout / 5xx**: persist the pending diff on disk and retry with backoff. Once the retry window is exhausted (configurable, default 24 h), surface a warning; the operator may manually trigger a replay or accept the loss.
+   - **4xx (rejected)**: log full payload, surface to operator; do not silently discard — the mission's results are at risk.
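+
+The pre-flight pull outcomes above reduce to a small decision table. The following is a minimal Rust sketch, not the committed design: `PullOutcome`, `PullDecision` and `decide_pull` are hypothetical names introduced for illustration, and an empty 200 response is assumed to be handled exactly like a populated one.
+
+```rust
+/// Sync states named in the lifecycle above.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum SyncState {
+    Synced,
+    CachedFallback,
+}
+
+/// Hypothetical classification of the pre-flight GET result.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum PullOutcome {
+    /// Reachable + 200; an empty body also lands here.
+    Ok,
+    /// Reachable + 4xx.
+    ClientError,
+    /// Unreachable or timed out.
+    Unreachable,
+}
+
+/// What the built-in test (BIT) should do next.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum PullDecision {
+    Proceed(SyncState),
+    FailBit,
+    AwaitOperatorAck,
+}
+
+/// Maps a pull outcome plus the operator acknowledgement to a decision.
+pub fn decide_pull(outcome: PullOutcome, operator_acked: bool) -> PullDecision {
+    match outcome {
+        PullOutcome::Ok => PullDecision::Proceed(SyncState::Synced),
+        PullOutcome::ClientError => PullDecision::FailBit,
+        // Cached state is used only after an explicit acknowledgement.
+        PullOutcome::Unreachable if operator_acked => {
+            PullDecision::Proceed(SyncState::CachedFallback)
+        }
+        PullOutcome::Unreachable => PullDecision::AwaitOperatorAck,
+    }
+}
+```
+
+Passing the acknowledgement in as an explicit argument keeps the cached-fallback path impossible to reach by accident, which matches the requirement that the fallback is operator-acknowledged and auditable.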
+
+**Conflict resolution at the central store (open question Q8 — proposed).** When two missions report contradicting state for the same `(h3_cell, class_group)`:
+
+- Both observations are **appended** to the per-`(h3_cell, class_group)` observation log (no destructive overwrite).
+- The "current view" surfaced to operator UI is computed from the observation log: most recent confirmed-existing observation wins; older REMOVED claims expire after a configurable age; class-group ambiguities surface as multi-class candidates.
+- IgnoredItems are union-merged (any operator-decline at any UAV propagates to all future missions in the same area, until explicit clear).
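+
+One possible reading of the proposed projection rule, sketched in Rust for a single `(h3_cell, class_group)` log. The exact rules remain open under Q8, so everything here is an assumption: the type and function names are hypothetical, timestamps are plain seconds, and a REMOVED claim is taken to win only when it is both unexpired and newer than the latest presence observation.
+
+```rust
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum DiffKind {
+    New,
+    Moved,
+    Existing,
+    RemovedCandidate,
+}
+
+#[derive(Debug, Clone, Copy)]
+pub struct Observation {
+    pub observed_at_s: i64,
+    pub kind: DiffKind,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum CurrentView {
+    Present,
+    RemovedCandidate,
+    Unknown,
+}
+
+/// Projects an append-only observation log onto a current view.
+/// `removed_ttl_s` is the configurable age after which REMOVED claims expire.
+pub fn project(log: &[Observation], now_s: i64, removed_ttl_s: i64) -> CurrentView {
+    // Most recent observation asserting that the object is there.
+    let latest_presence = log
+        .iter()
+        .filter(|o| o.kind != DiffKind::RemovedCandidate)
+        .map(|o| o.observed_at_s)
+        .max();
+    // Most recent REMOVED claim that has not yet expired.
+    let latest_live_removal = log
+        .iter()
+        .filter(|o| o.kind == DiffKind::RemovedCandidate)
+        .filter(|o| now_s - o.observed_at_s <= removed_ttl_s)
+        .map(|o| o.observed_at_s)
+        .max();
+    match (latest_presence, latest_live_removal) {
+        (Some(p), Some(r)) if r > p => CurrentView::RemovedCandidate,
+        (Some(_), _) => CurrentView::Present,
+        (None, Some(_)) => CurrentView::RemovedCandidate,
+        (None, None) => CurrentView::Unknown,
+    }
+}
+```
+
+Because the log is never rewritten, changing the expiry window or the tie-break rule only changes this projection; no stored observation has to be migrated.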
+
+**Central-side schema (SQL, indicative).**
+
+```sql
+-- Observations: every detection ever reported by any UAV/mission, never overwritten.
+CREATE TABLE map_object_observations (
+    id              UUID PRIMARY KEY,
+    h3_cell         BIGINT NOT NULL,
+    class           TEXT NOT NULL,
+    class_group     TEXT NOT NULL,
+    mission_id      UUID NOT NULL REFERENCES missions(id) ON DELETE CASCADE,
+    uav_id          UUID NOT NULL,
+    observed_at     TIMESTAMPTZ NOT NULL,
+    gps_lat         DOUBLE PRECISION NOT NULL,
+    gps_lon         DOUBLE PRECISION NOT NULL,
+    mgrs            TEXT NOT NULL,
+    size_width_m    REAL,
+    size_length_m   REAL,
+    confidence      REAL NOT NULL,
+    diff_kind       TEXT NOT NULL CHECK (diff_kind IN ('NEW','MOVED','EXISTING','REMOVED_CANDIDATE')),
+    photo_ref       TEXT,
+    raw_evidence    JSONB
+);
+CREATE INDEX ON map_object_observations (h3_cell, class_group);
+CREATE INDEX ON map_object_observations (mission_id);
+CREATE INDEX ON map_object_observations (observed_at DESC);
+
+-- IgnoredItems: per-area operator declines, union-merged across missions.
+CREATE TABLE map_object_ignored (
+    id              UUID PRIMARY KEY,
+    h3_cell         BIGINT NOT NULL,
+    mgrs            TEXT NOT NULL,
+    class_group     TEXT NOT NULL,
+    declined_at     TIMESTAMPTZ NOT NULL,
+    operator_id     UUID,
+    mission_id      UUID REFERENCES missions(id) ON DELETE SET NULL,
+    retention_scope TEXT NOT NULL CHECK (retention_scope IN ('mission','session','until_expiry')),
+    expires_at      TIMESTAMPTZ
+);
+CREATE INDEX ON map_object_ignored (h3_cell, class_group);
+CREATE INDEX ON map_object_ignored (expires_at) WHERE retention_scope = 'until_expiry';
+
+-- Materialised "current view" derived from observations + ignored.
+-- Recomputed nightly or on POST. Exact projection rules per §7.13 conflict resolution.
+CREATE MATERIALIZED VIEW map_objects_current AS ...;
+```
+
+**On-device-side schema (engine TBD per §8 Q3 — indicative shape).**
+
+```text
+mapobjects_store/
+  current_state            -- key = (h3_cell, class_group); value = MapObject record
+  pending_observations     -- ordered log of unflushed observations for post-flight POST
+  pending_ignored          -- unflushed IgnoredItem appends
+  sync_state               -- {pull_completed_at, push_completed_at, last_error, kind}
+```
+
+The on-device shape is intentionally narrower than the central schema — the on-device store does not need full observation history beyond the active mission; older history is only ever consulted via the central pull.
+
+**Bounding-box pull strategy.** The central API uses the mission's geofence INCLUSION polygon (or a generous AABB if no INCLUSION is set) to scope the response. Pulled records are filtered by retention age (default ≤30 days); operator can override to "all". The 30 km / k-ring numbers in §7.12 apply to **on-device** spatial queries; the pull radius is mission-defined.
+
+### 7.14 Tech Stack
+
+**Requirements.**
+
+| Area | Requirement |
+|---|---|
+| Runtime hardware | Jetson Orin Nano Super 8 GB, locked JetPack/power mode, ViewPro A40. |
+| Inference (Tier 1) | FP16 only, TensorRT primary, ONNX Runtime fallback, 1280 px model input. Lives in `../detections`. |
+| Service integration | Bi-directional gRPC client to the existing FastAPI + Cython + TensorRT detections service. |
+| VLM | Local-only, separate IPC process, sequential with YOLO, ≤5 s/ROI if used for MVP. |
+| Movement | Active at zoom-out and zoom-in, moving-camera compensation with timestamped video / gimbal / UAV telemetry; per-zoom-band thresholds; learned-CV fallback per Q14. |
+| MapObjects sync | Mission-bracketed: pre-flight `GET` + post-flight `POST` against `/missions/{id}/mapobjects`. Batched only for MVP. |
+| Output | Existing normalised-box format plus POI metadata for queue / reasoning. |
+| Proof gates | Hardware/replay benchmark suite before implementation decomposition; movement zoom-in benchmark independent of zoom-out. |
+
+**Selected stack.**
+
+| Layer | Selection | Rationale |
+|---|---|---|
+| Language (autopilot) | Rust | Memory safety, performance, single-binary deployment, strong type system for the deterministic state machine. |
+| Language (`../detections`) | Python + Cython | Existing service; we consume it, not rewrite it. |
+| Tier 1 detector | YOLO26 + YOLOE-26 fixed-class FP16 TensorRT | Best fit with acceptance criteria and export docs. Owned by `../detections`. |
+| Tier 2 analyzer | Primitive graph + lightweight CNN | Fast, explainable, data-efficient. |
+| Movement | OpenCV optical flow + telemetry | Directly addresses moving-camera constraint. |
+| VLM runtime | NanoLLM / VILA1.5-3B (with fallback benchmark path) | Documented local-multimodal path; matches no-cloud requirement. |
+| Scan controller | Deterministic typed state machine (Rust) | Simpler and easier to test for a fixed `ZoomedOut` / `ZoomedIn` / `TargetFollow` lifecycle. |
+| MAVLink transport | Hand-rolled in autopilot (Rust) | Eliminates the largest current dependency-risk item; small command surface (§7.7). |
+| Gimbal protocol | ViewPro A40 vendor protocol over UDP | Matches the deployed camera. |
+| `mapobjects_store` engine | TBD (SQLite + H3 extension / KV / in-memory + snapshot) | Open question; see §8. |
+| Inter-component IPC (in-process) | Tokio channels / actors | Idiomatic Rust async. |
+| External IPC (VLM) | Unix-domain socket with peer-credential check | Local-only authorisation. |
+| VLM output | Validated structured `VlmAssessment` schema | Makes VLM output a stable API contract. |
+| Input security | Content / size allow-list + patched OpenCV | Reduces crafted-input and resource-exhaustion risk. |
+| Observability | `tracing` + JSON logs to stdout, scraped by the deployment's log-shipping stack | See `deployment/observability.md`. |
+| Build | `cargo` cross-compile for `aarch64-unknown-linux-gnu` | See `deployment/ci_cd_pipeline.md`. |
+
+**Risk register.**
+
+| Risk | Impact | Mitigation |
+|---|---|---|
+| Tier 1 misses ≤100 ms/frame | Blocks acceptance | Fixed-shape FP16 engines, batch 1, benchmark before implementation decomposition. |
+| VLM misses ≤5 s/ROI or memory budget | Blocks VLM-required MVP policy | Benchmark NanoLLM / VILA first; fall back to smaller VLM only if it passes the same gates; otherwise disable VLM via `vlm_enabled=false`. |
+| All-season MVP data is insufficient | Blocks detection-quality targets | Per-season dataset gates and hard-negative mining. |
+| Movement false positives exceed ≤5 POIs/min | Operator overload | Telemetry-aided compensation, replay tests, queue cap, per-zoom-band thresholds. |
+| Classical OpenCV optical flow inadequate at zoom-in | Loss of zoom-in movement detection | Benchmark gate measures zoom-in independently; fallback to learned-CV / CNN motion module behind feature flag (Q14). |
+| Operator/Ground-Station modem link lost mid-flight | Uncontrolled UAV | Typed lost-link failsafe ladder in `mission_executor` (§7.7); RTL after 30 s grace; configurable. |
+| Battery / fuel below threshold mid-mission | Forced landing or crash | Hard-coded RTL + land-now thresholds (§7.7); operator override only via signed command. |
+| Operator command spoofing / replay over modem RF | Hostile hijack of operator commands | Authenticated, signed, replay-protected command envelope (§5; scheme TBD per Q9). |
+| Pre-flight self-test (BIT) misses a degraded dependency | Mid-flight component failure | BIT covers every dependency in §5 plus mission load + MapObjects pre-flight pull; cached-fallback acknowledgement is explicit. |
+| Wall-clock drift breaks operator-command timestamping | Forensic + audit failures | GPS-time-bound when GPS locked; NTP at boot; drift > 200 ms surfaces health → yellow. |
+| MapObjects post-flight push fails | Loss of mission-diff data centrally | Persist pending diff on disk; bounded retry; operator-visible warning; manual replay supported. |
+| A40 zoom transition exceeds ≤2 s | Breaks scan timing | Hardware-in-loop timing test; revise scan timeout / zoom range if needed. |
+| Hand-rolled MAVLink misses an edge case | Mission failure or hard-to-debug protocol behaviour | Conformance test against ArduPilot SITL; replay-based regression tests. |
+| Unstructured VLM output corrupts downstream decisions | Operator-facing false confidence | Schema validation, confidence enum, timeout / error state, fail-closed behaviour. |
+| Telemetry skew breaks movement compensation | False motion candidates | Define maximum frame / gimbal / UAV timestamp skew; reject / degrade unsynchronised samples. |
+| Untrusted image / ROI payloads exploit decoders or memory | Security and availability risk | Pin patched OpenCV, restrict formats, enforce size caps before decode. |
+
+---
+
+## 8. Open Questions
+
+| # | Question | Impact |
+|---|---|---|
+| Q1 | **Sweep pattern specification.** Pattern shape (pendulum / raster / lawn-mower), FOV per zoom tier, dwell time per direction, and whether sweep runs continuously or only between specific mission waypoints. | Blocks `scan_controller` zoom-out implementation. |
+| Q2 | **Ground Station API contract.** Stream protocol (WebRTC / WebSocket-H.264 / gRPC server-streaming?), session/auth model, and bbox-overlay rendering (server-side burn-in vs client-side render). | Blocks `telemetry_stream` + `operator_bridge` design. |
+| Q3 | **`mapobjects_store` engine.** SQLite + H3 extension / KV / in-memory + snapshot. | Blocks persistent-state design for ignored items + MapObjects. |
+| Q4 | **Tier 1 contract evolution.** How `detection_client` is versioned against an evolving `../detections` schema. | Blocks the gRPC contract definition. |
+| Q5 | **`mission-schema` extraction location.** `_infra/` at suite root, or a small third repo. | Blocks the `mission_client` / `missions` API contract sharing. |
+| Q6 | **MAVLink-2 message signing.** Whether the airframe link enables MAVLink-2 signing or treats the link as trusted. | Affects `mavlink_layer` startup handshake. |
+| Q7 | **Central MapObjects API contract.** Endpoint hosting is frozen as an extension of the `missions` API (§7.13). The remaining contract concerns are: schema versioning, paging strategy for large mission areas, photo-reference upload mechanism (URL handoff vs inline), and observation-history retention policy. | Blocks `missions` repo work + `mission_client` MapObjects sync code. |
+| Q8 | **MapObjects conflict resolution.** When two missions report contradicting state for the same `(h3_cell, class_group)`, the proposed rule is "append-only observation log + computed current view" (§7.13). Open: exact projection rules, REMOVED-claim expiry window, multi-class disambiguation. | Blocks central `map_objects_current` view definition. |
+| Q9 | **Operator-command authentication scheme.** The principle is committed (§5: signed, replay-protected). Scheme open: HMAC over (session_token, sequence_number, payload) vs JWT-style ed25519 vs MAVLink-2 signing extended to operator commands vs separate envelope. | Blocks `operator_bridge` validation logic + Ground Station integration. |
+| Q10 | **Software rollback policy on the airframe.** Watchtower OTA is mentioned in `../_docs/00_top_level_architecture.md`. Policy open: how a bad autopilot update is detected on the airframe (boot-time self-check, A/B partition, watchdog rollback) and rolled back without crew intervention. | Affects deployment design + on-airframe service supervision. |
+| Q11 | **Multi-operator session policy.** When two operators connect (one in primary station, one remote), which is authoritative for confirm/decline? Single active operator at a time, or quorum? How is `operator_id` recorded in `IgnoredItem`? | Blocks `operator_bridge` session model. |
+| Q12 | **Comms blackout during banking turns.** Winged UAV banking can lose modem LOS to Ground Station. Policy: tolerate brief blackouts as `LinkDegraded`, or suppress lost-link failsafe during known turn arcs (computed from mission shape)? | Affects lost-link failsafe ladder timing constants (§7.7). |
+| Q13 | **All-season acceptance flight gates.** Dataset gates (§7.4) are committed; flight-test gates are not. Open: minimum number of real flights per season before MVP acceptance, per-season acceptance pass criteria. | Affects MVP sign-off scope. |
+| Q14 | **Movement detection at zoom-in — fallback selection.** If classical OpenCV optical flow / global-motion estimation does not meet the per-zoom-band false-positive cap at zoom-in, the fallback module choice is open: learned optical flow (RAFT / FlowNet derivative) vs CNN motion segmentation vs IMU-tighter-coupled classical CV. The interface contract (`Frame + telemetry → Vec<MovementCandidate>`) is fixed; the implementation is replaceable. | Blocks `movement_detector` zoom-in scope if classical CV fails benchmark gate. |
+
+---
+
+## 9. Out of Scope
+
+- Multi-airframe coordination, fleet management, swarm logic.
+- Mission re-planning beyond middle-waypoint inserts.
+- Mission planning / route selection for arbitrary mission shapes (only intra-region routing).
+- GPS-denied navigation algorithms (delegated to the GPS-denied service, `../_docs/11_gps_denied.md`).
+- Cloud-hosted VLM or any external inference dependency.
+- Encrypted transport beyond what MAVLink-2 message signing and modem-level link encryption already provide.
+- Annotation tooling, model training, dataset curation (separate `ai-training` repo).
+- Operator browser UI (Ground Station hosts it; autopilot only feeds it).
+
+---
+
+## 10. External Suite Documents
+
+These suite-level documents live in the parent suite repo (`../_docs/`) and are consumed by autopilot but **not owned** by autopilot.
+
+| Suite-level path | Owner / primary-for | What autopilot uses it for |
+|---|---|---|
+| `../_docs/00_top_level_architecture.md` | suite (cross-cutting) | Suite topology, deployment tiers (`edge`), the **flight-gate convention** (`/run/azaion/in-flight` — written by autopilot, read by `model-sync.service`), Watchtower OTA model. Defines autopilot's place in the 11-component system. |
+| `../_docs/02_missions.md` | `missions` repo (.NET service) | Mission / Waypoint / Vehicle schemas. Autopilot consumes the missions API via `mission_client`. |
+| `../_docs/03_detections.md` | `detections` repo (Cython service) | Detections API spec. Autopilot consumes via bi-directional gRPC in `detection_client`. |
+| `../_docs/04_system_design_clarifications.md` | suite (cross-cutting) | REST patterns, stream-detection protocol, edge-device connection semantics. Defines the Ground Station push contract used by `telemetry_stream`. |
+| `../_docs/11_gps_denied.md` | `gps-denied-onboard` / `gps-denied-desktop` (shared primary) | GPS-Denied service architecture. Autopilot does NOT host any GPS-denied code; it consumes corrected GPS through the shared edge data path. |
+| `../_docs/12_ai_training.md` | `ai-training` repo | AI training pipeline. Autopilot consumes the resulting ONNX/TensorRT models via the rclone model-sync timer (flight-gate-aware). |
@@ -0,0 +1,76 @@
+# Component — `detection_client`
+
+**Layer**: Perception (data plane in)
+**Status**: forward-looking design (Rust)
+
+## 1. Purpose
+
+Bi-directional gRPC client to the external `../detections` service. Streams frames out, receives bounding-box detections back. Same bboxes are reused by `semantic_analyzer` (Tier 2 ROI selection) and by `telemetry_stream` (operator overlay). This is the only component in autopilot that talks to `../detections`.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| `Frame` | `frame_ingest` | up to 30 fps | Skipped when `ai_locked` is set. |
+| Tier-1 service config | startup config | once | gRPC endpoint, TLS settings, request budget, max concurrent streams. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| `DetectionBatch` | `scan_controller`, `semantic_analyzer`, `telemetry_stream` | `{ frame_seq: u64, detections: Vec<Detection>, latency_ms, model_version }` |
+| Health metric | health aggregator | gRPC connection state, `requests_in_flight`, `latency_p50/p99`, `errors_by_kind`, `model_version`. |
+
+`Detection` mirrors the `../detections` contract: `{ class_id, class_name, confidence, bbox_normalized, optional_mask_or_polyline, source_frame_seq }`.
+
+## 4. Key Responsibilities
+
+- Maintain a single bi-directional gRPC stream to `../detections`. Reconnect on stream loss with bounded exponential backoff.
+- Frame budgeting: respect the Tier-1 ≤100 ms/frame target by dropping older in-flight frames when a new frame arrives before the previous response has returned (configurable).
+- Validate the response payload against the schema version the client was built against. Surface a hard error on schema mismatch; do not silently downcast.
+- Tag each `DetectionBatch` with the source frame's monotonic timestamp so downstream consumers can compute end-to-end latency.
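+
+The frame-budgeting rule can be sketched as a bounded window of in-flight sequence numbers. This is an illustrative Rust sketch only: `InFlightWindow` and its methods are hypothetical names, and the real client would hang this off the gRPC stream rather than call it directly.
+
+```rust
+use std::collections::VecDeque;
+
+/// Bounded window of frame sequence numbers awaiting a Tier-1 response.
+pub struct InFlightWindow {
+    max: usize,
+    seqs: VecDeque<u64>,
+}
+
+impl InFlightWindow {
+    /// `max` is the configurable in-flight budget (clamped to at least 1).
+    pub fn new(max: usize) -> Self {
+        Self { max: max.max(1), seqs: VecDeque::new() }
+    }
+
+    /// Registers a frame that is about to be sent.
+    /// Returns the older sequence numbers evicted to stay within budget.
+    pub fn on_send(&mut self, seq: u64) -> Vec<u64> {
+        self.seqs.push_back(seq);
+        let mut dropped = Vec::new();
+        while self.seqs.len() > self.max {
+            if let Some(old) = self.seqs.pop_front() {
+                dropped.push(old);
+            }
+        }
+        dropped
+    }
+
+    /// Returns true if the response belongs to a frame still in the window.
+    /// Responses for evicted frames are late and should be discarded.
+    pub fn on_response(&mut self, seq: u64) -> bool {
+        match self.seqs.iter().position(|&s| s == seq) {
+            Some(i) => {
+                self.seqs.remove(i);
+                true
+            }
+            None => false,
+        }
+    }
+}
+```
+
+Evicting from the front of the window gives the drop-oldest behaviour listed under backpressure, and the evicted sequence numbers can feed the health counters.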
+
+## 5. Internal State
+
+- gRPC channel, stream handle, reconnect state.
+- Sliding window of in-flight frame sequence numbers.
+- Last-known model version (echoed by `../detections` on each response or on stream init).
+
+State is in-process only.
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| `../detections` unreachable | gRPC connect error | Bounded exponential backoff; health → red after threshold; `scan_controller` continues but the `detection_client` health flag is red. |
+| Mid-stream cancellation by server | stream error | Reopen the stream and retry the latest in-flight frame only (best-effort); older in-flight frames are dropped. |
+| Schema mismatch | response decode error | Hard error to the health aggregator; reject the response; alert. |
+| Model version change at runtime | new `model_version` on the stream | Log it; if the change implies new classes, surface to `scan_controller` so per-class thresholds can be reloaded. |
+| Consistent latency above budget | `latency_p99 > 100 ms` over a sliding window | Health → yellow; `scan_controller` may degrade to alternate-frame inference. |
+
+## 7. Dependencies
+
+**In-process**: `frame_ingest` (input), `scan_controller` / `semantic_analyzer` / `telemetry_stream` (output).
+
+**External**:
+- `../detections` gRPC service. Contract owner: `../_docs/03_detections.md`. Bi-directional streaming.
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Per-frame round-trip latency | ≤100 ms (Tier-1 NFR; mostly owned by `../detections`, and autopilot's call budget respects it) |
+| Reconnect latency | ≤2 s after `../detections` returns |
+| Throughput | up to 30 fps at 1080p |
+| Backpressure | drop oldest in-flight rather than queue indefinitely |
+
+## 9. Open Questions
+
+- Versioning strategy of the gRPC contract (covered in `architecture.md §8 Q4`).
+
+## 10. References
+
+- `architecture.md §1`, `§3`, `§7.6`.
+- `system-flows.md §F1`.
+- `../_docs/03_detections.md`.
+- `data_model.md §Detection`, `§DetectionBatch`.
@@ -0,0 +1,74 @@
+# Component — `frame_ingest`
+
+**Layer**: Perception (data plane in)
+**Status**: forward-looking design (Rust)
+
+## 1. Purpose
+
+Pull RTSP from the ViewPro A40 camera, decode H.264/265 to raw frames, attach a monotonic timestamp + sequence number, and hand each frame to the downstream consumers (`detection_client`, `movement_detector`, `telemetry_stream`) without copying frame buffers more than once.
+
+Frames are the system's primary input. Everything downstream of `frame_ingest` is rate-limited by it.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| RTSP video stream | ViewPro A40 (via airframe IP/port) | 30 fps at 1080p (60 fps capable) | TCP or UDP transport per camera config. Re-opens on failure with bounded backoff. |
+| Camera startup config | Static config (env or CLI) | once at process start | Stream URL, transport, decode codec preference. |
+| `bringCameraDown` / `bringCameraUp` health signal | local supervisor (if present) | event | Optional. Used by deployments that gate AI access to the camera (e.g., during RC takeover). When `down` is asserted, `frame_ingest` continues decoding for `telemetry_stream` but flags frames as "AI-locked" so downstream consumers skip detection. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| `Frame` | `detection_client`, `movement_detector`, `telemetry_stream` | `{ seq: u64, capture_ts_monotonic: ns, decode_ts_monotonic: ns, pixels: Arc<Bytes>, width, height, pix_fmt, ai_locked: bool }` |
+| Health metric | health aggregator | `frames/s`, `decode_ms_p50/p99`, `last_frame_age_ms`, `reopens_total`, `decode_errors_total` |
+
+## 4. Key Responsibilities
+
+- Open the RTSP session and recover from transient connection loss with bounded exponential backoff.
+- Decode frames using a hardware decoder where available (NVDEC on Jetson) with software fallback.
+- Stamp each frame with a monotonic capture timestamp at the earliest practical point in the pipeline; this is what `movement_detector` uses for telemetry-skew checks.
+- Publish frames through a single multi-consumer channel (Tokio broadcast or equivalent) using `Arc<Bytes>` for pixel data so consumers do not copy.
+- Drop frames if downstream consumers fall behind beyond a configured queue depth; record the drop with a reason (`detection_client_slow`, `movement_detector_slow`, or `telemetry_slow`) and surface it through the health endpoint.
+
+## 5. Internal State
+
+- RTSP session handle and reconnect state (closed / connecting / streaming / failing).
+- Last-frame timestamp and sequence number.
+- Per-consumer drop counters.
+
+State is in-process only; nothing persists across restarts.
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| RTSP connection refused / lost | TCP connect error / read timeout | Bounded exponential backoff (1 s → 30 s cap); health flips to yellow after first failure, red after `last_frame_age_ms` exceeds a configured threshold. |
+| Decode error on a single frame | decoder returns error | Drop the frame; increment `decode_errors_total`; do not abort the stream. |
+| Decoder cold-start latency | first-frame timestamp far from session-open | Surface `decode_ms_first_frame` once; not an alert by itself. |
+| Downstream consumer slow | broadcast channel back-pressure | Drop the oldest frame for that consumer; counter-tagged drop; warning on sustained drops. |
+| Camera output format mismatch | unexpected SPS/PPS | Hard-fail at session open with an explicit error; do not silently pick a wrong decode path. |
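+
+The bounded exponential backoff in the first row (1 s → 30 s cap) can be computed as below. This is a minimal Rust sketch using only the standard library; `backoff_delay` is a hypothetical helper name, and jitter is deliberately left out even though a production reconnect loop may want it.
+
+```rust
+use std::time::Duration;
+
+/// Delay before reconnect attempt `attempt` (0-based): base * 2^attempt,
+/// never exceeding `cap`.
+pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
+    // 2^attempt, saturating once the shift would overflow a u32.
+    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
+    // If the multiplication itself overflows, fall back to the cap.
+    base.checked_mul(factor).map_or(cap, |d| d.min(cap))
+}
+```
+
+With `base = 1 s` and `cap = 30 s` the sequence is 1, 2, 4, 8, 16, 30, 30, ... seconds, so the reconnect loop never waits longer than the cap regardless of how long the camera stays down.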
+
+## 7. Dependencies
+
+**In-process**: none upstream; downstream consumers are `detection_client`, `movement_detector`, `telemetry_stream`.
+
+**External**:
+- ViewPro A40 RTSP (live).
+- Hardware video decoder (NVDEC on Jetson) via FFmpeg / GStreamer or a Rust binding.
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| End-to-end frame latency (RTSP rx → publish to consumers) | ≤30 ms p99 on Jetson Orin Nano. |
+| Frame drop rate | ≤0.1 % under normal conditions. |
+| Reconnect latency after camera reboot | ≤5 s from camera availability. |
+| Memory | one decoded-frame buffer pool with bounded size; no unbounded growth on slow consumers. |
+
+## 9. References
+
+- `architecture.md §1 System Context`, `§3 Components`, `§7.6 Solution Architecture`.
+- `system-flows.md §F1 Frame pipeline`.
+- `data_model.md §Frame`.
@@ -0,0 +1,78 @@
+# Component — `gimbal_controller`
+
+**Layer**: Action (data plane out)
+**Status**: forward-looking design (Rust); ViewPro A40 vendor protocol
+
+## 1. Purpose
+
+Drives the ViewPro A40 gimbal: pan (yaw), tilt (pitch), and zoom. Honours the ≤2 s zoom-transition budget and ≤500 ms decision-to-movement latency. Owns the zoom-out sweep, the smooth-pan path-tracking primitive used during the zoom-in level (follow-the-footpath behaviour), and the centre-window primitive used during target-follow.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| `GimbalCommand` | `scan_controller` | per state-machine tick or per zoom-in plan step | yaw / pitch / zoom goal; or pan plan; or centre-on-target. |
+| Sweep config | startup config | once | Zoom-out sweep pattern (pendulum / raster / lawn-mower — see `architecture.md §8 Q1`). |
+| Live gimbal status | ViewPro A40 (vendor protocol) | as emitted by camera | yaw / pitch / zoom feedback + faults. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| Vendor-protocol commands | ViewPro A40 (UDP) | yaw / pitch / zoom commands |
+| `GimbalState` | `frame_ingest` (for telemetry tagging), `movement_detector` (for ego-motion compensation) | `{ yaw, pitch, zoom, ts_monotonic, command_in_flight: bool }` |
+| Health metric | health aggregator | `commands_per_min`, `decision_to_movement_p99_ms`, `zoom_transition_p99_ms`, `vendor_faults_total`. |
+
+## 4. Key Responsibilities
+
+- Send vendor-protocol commands to the ViewPro A40 over UDP. Re-issue on timeout with bounded retry.
+- Run the zoom-out sweep pattern when `scan_controller` is in `ZoomedOut` (pattern itself depends on `architecture.md §8 Q1` resolution).
+- For the zoom-in path-follow, accept a pan plan (sequence of yaw / pitch / zoom goals with timing) from `scan_controller` / `semantic_analyzer` and execute it smoothly.
+- For target-follow, accept a centre-on-target stream (target bbox normalized) from `scan_controller` and command the gimbal to keep the target inside the centre 25 % of frame while visible.
+- Stamp every emitted command with a monotonic timestamp so `movement_detector` can synchronise it with frames.
+- Surface vendor-protocol faults to health and to `scan_controller`.
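+
+The centre-window check for target-follow can be sketched as follows. This is an assumption-laden Rust sketch: "centre 25 % of frame" is read here as the central region covering 25 % of the frame area (half the width by half the height), which the spec does not state explicitly, and `BboxNorm`, `centre_offset` and `needs_correction` are hypothetical names.
+
+```rust
+/// Normalised bounding box, all coordinates in [0, 1].
+#[derive(Debug, Clone, Copy)]
+pub struct BboxNorm {
+    pub x_min: f32,
+    pub y_min: f32,
+    pub x_max: f32,
+    pub y_max: f32,
+}
+
+/// Offset of the bbox centre from the frame centre, in normalised units.
+/// Positive x means the target is right of centre, positive y below centre.
+pub fn centre_offset(b: BboxNorm) -> (f32, f32) {
+    (
+        (b.x_min + b.x_max) / 2.0 - 0.5,
+        (b.y_min + b.y_max) / 2.0 - 0.5,
+    )
+}
+
+/// True when the bbox centre lies outside the centre window.
+/// `half_window = 0.25` corresponds to the 25 %-of-area reading above.
+pub fn needs_correction(b: BboxNorm, half_window: f32) -> bool {
+    let (dx, dy) = centre_offset(b);
+    dx.abs() > half_window || dy.abs() > half_window
+}
+```
+
+The offset is what a yaw / pitch correction would be derived from; converting it to gimbal angles depends on the current zoom FOV and is not shown.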
+
+## 5. Internal State
+
+- Last-known commanded yaw / pitch / zoom.
+- Last-known reported yaw / pitch / zoom (from gimbal feedback).
+- Sweep pattern state (current direction, dwell counter).
+- Current execution mode: `Sweep | PanPlan | CentreOnTarget | Idle`.
+
+State is in-process only.
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| ViewPro A40 not responding | command timeout | Bounded exponential backoff; health → yellow then red; `scan_controller` is informed and may pause zoom-in. |
+| Decision-to-movement above budget | self-instrumented | Health → yellow; investigate (likely UDP loss or vendor firmware issue). |
+| Zoom transition stalls | feedback shows no zoom progress | Re-issue command; health → yellow; report to `scan_controller`. |
+| Target lost during target-follow | feedback + tracker | Surface `target_lost` to `scan_controller`; controller decides to release follow. |
+| Conflicting commands | execution-mode mismatch | Reject the lower-priority command; log a hard error; never silently merge. |
+
+## 7. Dependencies
+
+**In-process** (input): `scan_controller`.
+**In-process** (output): `frame_ingest`, `movement_detector` (timestamped state).
+
+**External**: ViewPro A40 over UDP (vendor protocol).
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Decision-to-movement latency | ≤500 ms |
+| Zoom transition (medium → high) | ≤2 s |
+| Sweep pattern stability | bounded jitter; no overshoot beyond configured FOV bounds |
+| Target-follow centre-window | target inside centre 25 % of frame while visible |
+
+## 9. Open Questions
+
+- Sweep pattern specification (`architecture.md §8 Q1`): pendulum / raster / lawn-mower; FOV per zoom tier; dwell time per direction.
+
+## 10. References
+
+- `architecture.md §3`, `§6 NFR`, `§7.6 Solution Architecture`.
+- `system-flows.md §F2 Movement detection (zoom-out + zoom-in)`.
+- `data_model.md §GimbalState`.
@@ -0,0 +1,124 @@
+# Component — `mapobjects_store`
+
+**Layer**: Decision + Memory
+**Status**: forward-looking design (Rust); on-device working copy of the central MapObjects state, mission-bracketed
+
+## 1. Purpose
+
+On-device, H3-indexed working copy of the centrally maintained MapObjects state plus the IgnoredItems list, scoped to the active mission's bounding box. Computes new / moved / existing / removed diffs across survey passes and is the source of truth for the operator-decline suppression rule **for the duration of the active mission**.
+
+This is **not** a private database. It is hydrated pre-flight from the central `missions` API (`/missions/{id}/mapobjects`) and the mission's full pass diff is pushed back post-flight. The central observation log + computed current view are authoritative across missions and across UAVs (`architecture.md §7.13`).
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| Pre-flight pull payload | `mission_client` (from `missions` API) | once per mission | Hydrates `current_state` + `pending_ignored`. |
+| New detection / movement candidate (with MGRS + class + size) | `scan_controller` | per detection | Each is classified as new / moved / existing. |
+| `IgnoredItem` append | `scan_controller` (on operator decline) | event | `(MGRS, class_group)` plus operator metadata. |
+| End-of-pass marker | `scan_controller` / `mission_executor` | event per pass over a region | Triggers the removed-candidate sweep. |
+| Mission delete cascade | suite-level missions API hook (process-level config; not a network call) | event | Drops mission-scoped objects on mission deletion. |
+| Post-flight push trigger | `mission_executor` | once per mission, on terminal state | Causes `mission_client` to drain `pending_observations` + `pending_ignored` to the central API. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| `MapObjectClassification` | `scan_controller` | `new \| moved \| existing \| removed_candidate` per detection |
+| `IgnoredItem` match | `scan_controller` | suppression flag for (MGRS, class_group) |
+| Pass diff | `mission_client` (post-flight upload) + `operator_bridge` (optionally surfaced in-flight) | new / moved / removed lists per pass |
+| Sync state | `scan_controller`, health aggregator | `synced \| cached_fallback \| degraded`; `pending_observations_count`, `pending_ignored_count` |
+
+## 4. Key Responsibilities
+
+- **Pre-flight hydrate** from `mission_client` pull. Establish `current_state` and the `IgnoredItem` set. Surface `sync_state` (`synced`, `cached_fallback`, or `degraded`).
+- Compute the H3 cell for each detection at the configured resolution (default res 10; average hexagon edge roughly 65–75 m per the H3 resolution table).
+- Build the composite key `H3_cell + class`. Maintain an in-memory hashmap; persist asynchronously to disk for crash recovery.
+- Answer queries: `classify(detection) → new | moved | existing` using k-ring lookup and `(distance_threshold_m, move_threshold_m, similar_classes)` configuration.
+- After a region's scan-pass ends, return objects in the region that were not re-observed as `removed_candidate`s (the operator decides on actual removal).
+- Maintain the `IgnoredItem` set; answer suppression queries (`is_ignored(MGRS, class_group)`).
+- Append every NEW / MOVED / EXISTING / REMOVED-CANDIDATE / IgnoredItem event to `pending_observations` / `pending_ignored` for the post-flight push (in-flight central writes are forbidden — Frozen choice 6 in `architecture.md §7.3`).
+- **Post-flight push**: hand the contents of `pending_observations` + `pending_ignored` to `mission_client` for `POST /missions/{id}/mapobjects` and `POST /missions/{id}/mapobjects/ignored`. On ack, clear pending; on failure, persist for retry.
+- On `DELETE /missions/{id}` cascade signal (received via `mission_client`), drop all objects scoped to that mission. The central side cascades as well.
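
The `classify` query above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the planned implementation: `CellId` is a hypothetical stand-in for an H3 cell index, the k-ring neighbourhood is computed by the caller, positions are local east/north metres, and the reading of the two thresholds (`distance_threshold_m` bounds "same place", `move_threshold_m` bounds "same object, displaced") is an assumption. `similar_classes` is ignored here.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an H3 cell index.
pub type CellId = u64;

#[derive(Debug, Clone, PartialEq)]
pub struct MapObject {
    pub class: String,
    pub east_m: f64,
    pub north_m: f64,
}

#[derive(Debug, PartialEq)]
pub enum Classification {
    New,
    Moved,
    Existing,
}

pub struct ClassifyConfig {
    pub distance_threshold_m: f64,
    pub move_threshold_m: f64,
}

pub struct Store {
    /// Composite key: (cell, class).
    objects: HashMap<(CellId, String), MapObject>,
}

impl Store {
    pub fn new() -> Self {
        Store { objects: HashMap::new() }
    }

    pub fn insert(&mut self, cell: CellId, obj: MapObject) {
        self.objects.insert((cell, obj.class.clone()), obj);
    }

    /// `neighbourhood` is the detection's own cell plus its k-ring.
    pub fn classify(
        &self,
        neighbourhood: &[CellId],
        det: &MapObject,
        cfg: &ClassifyConfig,
    ) -> Classification {
        // Distance to the nearest same-class object in the neighbourhood.
        let nearest = neighbourhood
            .iter()
            .filter_map(|cell| self.objects.get(&(*cell, det.class.clone())))
            .map(|o| (o.east_m - det.east_m).hypot(o.north_m - det.north_m))
            .fold(f64::INFINITY, f64::min);

        if nearest <= cfg.distance_threshold_m {
            Classification::Existing
        } else if nearest <= cfg.move_threshold_m {
            Classification::Moved
        } else {
            Classification::New
        }
    }
}
```

Because the lookup is bounded by the size of the k-ring rather than by the number of stored objects, the per-detection cost stays constant for a fixed `k`.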
+
+## 5. Sync state machine
+
+```text
+fresh_boot
+   │
+   ├──> pre-flight pull
+   │       │
+   │       ├── 200 OK ────────────> synced
+   │       ├── unreachable ────────> [operator ack required]
+   │       │                           │
+   │       │                           ├── ack on cache ──> cached_fallback
+   │       │                           └── abort ─────────> BIT fail
+   │       └── 4xx ─────────────────> BIT fail
+   │
+   ├── (during flight; in-process writes only)
+   │       │
+   │       ├── pending_observations grow
+   │       └── pending_ignored grow
+   │
+   └── post-flight push
+           │
+           ├── 200 OK on both endpoints ──> synced (pending cleared)
+           ├── partial ────────────────────> retry per-endpoint
+           └── persistent failure ─────────> degraded (operator warning, manual replay)
+```
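
One way to encode the diagram above is a pure transition function over a state enum and an event enum. The sketch below is illustrative only: the names `SyncEvent`, `AwaitingOperatorAck`, and `BitFail` are assumptions made for the example (the spec only names `synced`, `cached_fallback`, and `degraded` as `sync_state` values), and in-flight growth of the pending logs is not modelled because it does not change the state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncState {
    FreshBoot,
    AwaitingOperatorAck, // hypothetical intermediate state for "[operator ack required]"
    Synced,
    CachedFallback,
    Degraded,
    BitFail, // hypothetical terminal outcome; BIT itself is owned by mission_executor
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncEvent {
    PullOk,
    PullUnreachable,
    PullClientError, // 4xx
    OperatorAckCache,
    OperatorAbort,
    PushOk,
    PushPartial,
    PushPersistentFailure,
}

/// Returns the next state, or `None` when the event is not valid in `state`.
pub fn next_state(state: SyncState, event: SyncEvent) -> Option<SyncState> {
    use SyncEvent::*;
    use SyncState::*;
    match (state, event) {
        (FreshBoot, PullOk) => Some(Synced),
        (FreshBoot, PullUnreachable) => Some(AwaitingOperatorAck),
        (FreshBoot, PullClientError) => Some(BitFail),
        (AwaitingOperatorAck, OperatorAckCache) => Some(CachedFallback),
        (AwaitingOperatorAck, OperatorAbort) => Some(BitFail),
        (Synced | CachedFallback, PushOk) => Some(Synced),
        // Partial success: stay put while the failed endpoint is retried.
        (s @ (Synced | CachedFallback), PushPartial) => Some(s),
        (Synced | CachedFallback, PushPersistentFailure) => Some(Degraded),
        _ => None,
    }
}
```

Returning `None` for unexpected pairs keeps invalid transitions explicit, so the caller can log them instead of silently ignoring them.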
+
+## 6. Internal State
+
+- In-memory hashmap of `(H3_cell + class) → MapObject`.
+- `IgnoredItem` set keyed by `(MGRS, class_group)`.
+- Per-region pass tracker for removed-candidate detection.
+- `pending_observations`: ordered log of NEW / MOVED / REMOVED-CANDIDATE / EXISTING events not yet pushed centrally.
+- `pending_ignored`: ordered log of IgnoredItem appends not yet pushed centrally.
+- `sync_state`: enum + last-pull timestamp + last-push timestamp + last error.
+- Persistence layer (engine TBD — see Open Questions) for crash recovery and post-flight upload durability.
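
The per-region pass tracker might look something like the sketch below. It assumes a string region identifier and a `(cell, class)` key with a plain `u64` standing in for the H3 cell; both are hypothetical choices for illustration. Objects known in a region but not re-observed before the end-of-pass marker come back as removed candidates.

```rust
use std::collections::{HashMap, HashSet};

/// (hypothetical H3 cell id, class)
pub type ObjectKey = (u64, String);

#[derive(Default)]
pub struct PassTracker {
    /// Region id -> keys re-observed during the current pass.
    seen: HashMap<String, HashSet<ObjectKey>>,
}

impl PassTracker {
    /// Record that an object was re-observed in `region` during this pass.
    pub fn mark_observed(&mut self, region: &str, key: ObjectKey) {
        self.seen.entry(region.to_string()).or_default().insert(key);
    }

    /// Called on the end-of-pass marker. Returns the known objects that were
    /// not re-observed and resets the tracker for that region.
    pub fn end_of_pass(&mut self, region: &str, known_in_region: &[ObjectKey]) -> Vec<ObjectKey> {
        let seen = self.seen.remove(region).unwrap_or_default();
        known_in_region
            .iter()
            .filter(|k| !seen.contains(*k))
            .cloned()
            .collect()
    }
}
```

The result is only a candidate list; per §4 the operator decides on actual removal.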
+
+## 7. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| Pre-flight pull unreachable | network | Surface BIT degradation; operator must acknowledge cached fallback or abort. Never silent. |
+| Pre-flight pull stale beyond freshness window | last-fetch-at compared to configured staleness | `sync_state = degraded`; operator must acknowledge or abort. |
+| Persistence write failure | engine error | Log + retry; in-memory state continues authoritative for this mission; health → yellow. |
+| Persistence corruption on startup | checksum / open failure | Refuse to start with stale state; require explicit recovery (engine-specific); surface to operator at startup. |
+| H3 query inconsistency near cell boundaries | algorithmic | Always query the k-ring (k=2 default) so boundary objects are matched anyway. |
+| Mission cascade signal lost | absent signal | `DELETE /missions/{id}` is the only cleanup trigger; on lost signal, mission-scoped objects accumulate. Operator-driven manual purge is acceptable. |
+| Post-flight push partial success | per-endpoint status | Independent retry per endpoint; do not roll back the successful one. |
+| Post-flight push persistent failure | bounded retries exhausted | `sync_state = degraded`; pending diff persisted on disk; operator-visible warning; manual replay supported. Mission's central data integrity at risk until replayed. |
+| In-flight crash | startup detects non-empty `pending_*` for a terminated mission | `mission_client` runs the post-flight push at startup before BIT completes for any new mission. |
+
+## 8. Dependencies
+
+**In-process**: `scan_controller`, `mission_client` (for pull/push round-trips), `mission_executor` (for post-flight trigger).
+
+**External**: H3 spatial-index library (Rust crate). Persistent store engine — TBD (SQLite + H3 extension / KV / in-memory + snapshot — see Open Questions). Central API contract via `mission_client`'s extension of the `missions` API (per `architecture.md §7.13`).
+
+## 9. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Per-detection classify latency | O(1); p99 ≤1 ms |
+| Pre-flight pull time | ≤30 s for a 30 km × 30 km mission area (per `architecture.md §6 NFR`) |
+| Post-flight push time | ≤2 min for a 60 min mission's pass diff (per `architecture.md §6 NFR`) |
+| Persistent-store size (single mission) | bounded; configurable retention |
+| Crash recovery time | ≤2 s to a usable state; in-flight crash → next-boot push of pending |
+| Boundary correctness | guaranteed by k-ring query |
+
+## 10. Open Questions
+
+- **Engine choice** (`architecture.md §8 Q3`): SQLite + H3 extension / KV / in-memory + snapshot.
+- **Central API schema details** (`architecture.md §8 Q7`): paging strategy, photo-reference upload mechanism, observation-history retention policy.
+- **Conflict resolution rules** (`architecture.md §8 Q8`): exact projection from observation log to current view; REMOVED-claim expiry window; multi-class disambiguation.
+- Optimal H3 resolution per terrain class.
+- Class-group definitions (`military_vehicle_group` vs `concealed_position_group` vs `movement_candidate`) — currently in `scan_controller` config.
+
+## 11. References
+
+- `architecture.md §3`, `§5 Architectural Principles` (MapObjects are mission-bracketed and centrally synchronised), `§6 NFR`, `§7.9 MapObjects (H3 spatial index)`, `§7.10 Sync Message Format`, `§7.11 Target Relocation`, `§7.12 New vs Existing object detection`, `§7.13 MapObjects Sync`.
+- `system-flows.md §F7 MapObjects + ignored-items` (in-flight diff), `§F8 MapObjects sync (central DB, mission-bracketing)`.
+- `data_model.md §MapObject`, `§IgnoredItem`, `§MapObjectObservation`, `§MapObjectsBundle`.
+- `../_docs/02_missions.md` (mission cascade contract; new MapObjects endpoints).
@@ -0,0 +1,87 @@
+# Component — `mavlink_layer`
+
+**Layer**: Action (data plane out)
+**Status**: forward-looking design (Rust); hand-rolled (no third-party SDK)
+
+## 1. Purpose
+
+Hand-rolled MAVLink v2 transport. Implements only the ~10–15 commands this codebase needs (full list in `architecture.md §7.7`). Owns serialisation / deserialisation, heartbeat, sequence numbers, retry, and a single connection abstraction (UDP or serial, picked at startup from CLI / env). No third-party SDK — eliminating the largest current dependency-risk item.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| Outgoing `COMMAND_LONG`, `MISSION_*`, `SET_MODE` | `mission_executor` | per state transition | Hand-rolled message constructors per command. |
+| Outgoing heartbeat | self (timer) | 1 Hz | `HEARTBEAT` to keep the autopilot's GCS-link alive. |
+| Connection URI | startup config | once | `udp://...` or `serial:///dev/...`. |
+| MAVLink-2 signing config | startup config | once | If supported by the link, signing is enabled; otherwise the link is treated as trusted. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| Decoded MAVLink messages | `mission_executor`, `telemetry_stream`, `movement_detector` (for UAV motion telemetry) | typed enum per message kind |
+| Connection state | health aggregator | `connected`, `last_heartbeat_age_ms`, `tx_seq`, `rx_seq`, `parse_errors_total`, `signing_enabled`. |
+
+The supported message surface (concise list; full table in `architecture.md §7.7`):
+
+- `HEARTBEAT` (bidir)
+- `COMMAND_LONG` subset (out): arm/disarm, takeoff, set-mode, change-speed, change-alt, land, RTL
+- `COMMAND_ACK` (in)
+- `MISSION_COUNT`, `MISSION_REQUEST_INT`, `MISSION_ITEM_INT`, `MISSION_ACK`, `MISSION_SET_CURRENT`, `MISSION_CURRENT`, `MISSION_ITEM_REACHED`, `MISSION_CLEAR_ALL`
+- `GLOBAL_POSITION_INT`, `ATTITUDE`, `SYS_STATUS`, `EXTENDED_SYS_STATE`, `STATUSTEXT`
+- `SET_MODE` (out, fixed-wing)
+
+## 4. Key Responsibilities
+
+- Open and maintain the MAVLink connection (UDP or serial). Reconnect on transport loss with bounded backoff.
+- Encode outgoing messages with correct sequence numbers, system / component IDs, and (when enabled) MAVLink-2 signing.
+- Decode incoming messages with strict validation: reject malformed frames, unknown message IDs, and signing failures.
+- Emit a 1 Hz heartbeat. Detect autopilot heartbeat timeouts and surface to health.
+- Demux `COMMAND_ACK` to the originating caller (per `command_id`); enforce a wall-clock ack timeout.
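
For orientation, encoding an unsigned MAVLink v2 frame could be sketched as below. The function names are hypothetical; the frame layout (`0xFD` start byte, 10-byte header, 3-byte little-endian message ID, trailing-zero payload truncation) and the X.25 / MCRF4XX checksum seeded with `0xFFFF` and finished with the per-message `CRC_EXTRA` byte follow the MAVLink v2 serialisation rules. Signing is deliberately left out.

```rust
/// One step of the X.25 / MCRF4XX checksum used by MAVLink.
pub fn crc_accumulate(crc: u16, byte: u8) -> u16 {
    let mut tmp = byte ^ (crc as u8);
    tmp ^= tmp << 4;
    (crc >> 8) ^ ((tmp as u16) << 8) ^ ((tmp as u16) << 3) ^ ((tmp as u16) >> 4)
}

/// Checksum over a byte slice, starting from the 0xFFFF seed.
pub fn crc_x25(bytes: &[u8]) -> u16 {
    bytes.iter().fold(0xFFFF, |crc, b| crc_accumulate(crc, *b))
}

const STX_V2: u8 = 0xFD;

/// Encode an unsigned MAVLink v2 frame (hypothetical helper, no signing).
pub fn encode_v2_unsigned(
    seq: u8,
    sys_id: u8,
    comp_id: u8,
    msg_id: u32,
    payload: &[u8],
    crc_extra: u8,
) -> Vec<u8> {
    assert!(payload.len() <= 255, "MAVLink payload is at most 255 bytes");

    // MAVLink 2 truncates trailing zero bytes but keeps the first payload byte.
    let mut len = payload.len();
    while len > 1 && payload[len - 1] == 0 {
        len -= 1;
    }
    let payload = &payload[..len];

    let mut frame = Vec::with_capacity(12 + len);
    frame.push(STX_V2);
    frame.push(len as u8);
    frame.push(0); // incompat_flags (no signing)
    frame.push(0); // compat_flags
    frame.push(seq);
    frame.push(sys_id);
    frame.push(comp_id);
    frame.extend_from_slice(&msg_id.to_le_bytes()[..3]);
    frame.extend_from_slice(payload);

    // The checksum covers everything after STX, then the CRC_EXTRA byte.
    let crc = crc_accumulate(crc_x25(&frame[1..]), crc_extra);
    frame.extend_from_slice(&crc.to_le_bytes());
    frame
}
```

A `HEARTBEAT` (message ID 0, `CRC_EXTRA` 50) has a 9-byte payload whose last byte is the MAVLink version, so it is never shortened by the truncation rule.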
+
+## 5. Internal State
+
+- Connection handle (UDP socket or serial port).
+- Outgoing sequence number.
+- In-flight command map (`command_id → (caller, deadline)`).
+- Per-message-kind parse error counters.
+
+State is in-process only.
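
The in-flight command map and its ack timeout could be sketched as follows. `AckDemux`, the `u32` caller handle, and the one-entry-per-`command_id` simplification are assumptions for illustration; the current time is passed in explicitly so the logic stays testable without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

pub struct InFlight {
    caller: u32, // hypothetical caller handle
    deadline: Instant,
}

#[derive(Default)]
pub struct AckDemux {
    pending: HashMap<u16, InFlight>,
}

impl AckDemux {
    /// Track an outgoing command until it is acked or its deadline passes.
    pub fn register(&mut self, command_id: u16, caller: u32, now: Instant, timeout: Duration) {
        self.pending.insert(command_id, InFlight { caller, deadline: now + timeout });
    }

    /// Route an incoming COMMAND_ACK; returns the waiting caller, if any.
    pub fn on_ack(&mut self, command_id: u16) -> Option<u32> {
        self.pending.remove(&command_id).map(|f| f.caller)
    }

    /// Remove and return `(command_id, caller)` for every expired entry.
    pub fn expire(&mut self, now: Instant) -> Vec<(u16, u32)> {
        let expired: Vec<u16> = self
            .pending
            .iter()
            .filter(|(_, f)| f.deadline <= now)
            .map(|(id, _)| *id)
            .collect();
        expired
            .into_iter()
            .filter_map(|id| self.pending.remove(&id).map(|f| (id, f.caller)))
            .collect()
    }
}
```

Expired entries are handed back to the caller rather than retried here, which matches the rule that `mission_executor` decides between retry and failure.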
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| Transport open failure | OS error | Bounded backoff; surface to health → red. |
+| Heartbeat from autopilot missing | wall-clock timeout | Surface `link_lost` to health and to `mission_executor`; do not silently fail. |
+| Command-ack timeout | wall-clock | Bubble timeout to `mission_executor`; the executor decides retry vs failure. |
+| Malformed inbound frame | parser error | Drop the frame; increment counter; do not abort the link. |
+| MAVLink-2 signing mismatch (if enabled) | signature check | Reject the frame; alert; do not silently accept. |
+| Sequence-number gap | rx_seq vs expected | Log; not a hard failure on its own. |
+
+## 7. Dependencies
+
+**In-process** (input): `mission_executor`.
+**In-process** (output): `mission_executor`, `telemetry_stream`, `movement_detector`.
+
+**External**: ArduPilot / PX4 over MAVLink v2 (UDP or serial).
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Per-message round-trip on a healthy link | ≤50 ms p99 |
+| Heartbeat cadence | 1 Hz out |
+| Command-ack timeout | configurable; default 1 s, with retry handled by `mission_executor` |
+| Reconnect after transport loss | ≤2 s on serial / ≤5 s on UDP |
+| Message subset | ~10–15 commands only — adding more requires explicit design review |
+
+## 9. Open Questions
+
+- **MAVLink-2 message signing** (`architecture.md §8 Q6`): whether the airframe link enables signing or treats the link as trusted.
+
+## 10. References
+
+- `architecture.md §3`, `§5 Architectural Principles` (no MAVSDK, no silent error swallowing), `§7.7 MAVLink and Piloting`.
+- `system-flows.md §F6 Mission lifecycle`.
@@ -0,0 +1,93 @@
+# Component — `mission_client`
+
+**Layer**: Action (data plane out)
+**Status**: forward-looking design (Rust)
+
+## 1. Purpose
+
+Pulls the mission from the external `missions` API on start; validates against the shared `mission-schema` artefact; supplies the parsed mission to `mission_executor`. POSTs middle-waypoint inserts on operator-confirmed targets, owns the **MapObjects pre-flight pull / post-flight push** round trips against the same `missions` API, and survives transient connection loss with bounded retry.
+
+`autopilot` and `missions` are **separate repos** with a shared `mission-schema`. There is no in-process mission database in autopilot. The MapObjects endpoints (`/missions/{id}/mapobjects` GET + POST) are an extension of the `missions` API per `architecture.md §7.13`.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| `mission_id` | startup CLI / env | once | Identifies which mission to fetch. |
+| Missions API endpoint + auth | startup config | once | HTTPS REST; auth model TBD per `../_docs/02_missions.md`. |
+| Middle-waypoint POST request | `mission_executor` (via `scan_controller` / `operator_bridge`) | event | The mission with the inserted middle waypoint. |
+| Mission-update notification | missions API (push or poll) | event | Optional; if missions API supports change notifications, propagate to `mission_executor`. |
+| MapObjects post-flight push trigger | `mission_executor` (on terminal state) + `mapobjects_store` (pending diff handle) | once per mission | Triggers the F8 post-flight upload. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| Parsed mission | `mission_executor` | `{ waypoints: Vec<MissionWaypoint>, geofences: Vec<Geofence>, return_point, mission_id, schema_version }` |
+| Pre-flight MapObjects bundle | `mapobjects_store` | `{ map_objects, ignored_items, fetched_at, schema_version, fallback_used: bool }` |
+| Post-flight push status | `mapobjects_store`, health aggregator | per-endpoint ack / retry / failure |
+| Mission cascade signal (`DELETE /missions/{id}` echoed by missions API) | `mapobjects_store` | event |
+| Health metric | health aggregator | `last_fetch_ts`, `fetch_errors_total`, `schema_version`, `connection_state`, `mapobjects_pull_state`, `mapobjects_push_pending`. |
+
+## 4. Key Responsibilities
+
+- Fetch the mission by `mission_id` on startup. Validate against `mission-schema`. Reject on schema-invalid; do not silently downcast.
+- **MapObjects pre-flight pull.** Immediately after the mission fetch succeeds, call `GET /missions/{id}/mapobjects` (and `GET /missions/{id}/mapobjects/ignored` if separated). Hand the bundle to `mapobjects_store`. On failure, surface to `mission_executor` BIT (F9) — operator may acknowledge cached fallback or abort. Never silent.
+- POST middle-waypoint updates; await ack; surface failure to `mission_executor` (which decides whether to halt, RTL, or proceed with the original mission).
+- **MapObjects post-flight push.** When `mission_executor` reaches a terminal state, drain `mapobjects_store`'s pending diff and call `POST /missions/{id}/mapobjects` + `POST /missions/{id}/mapobjects/ignored`. Independent retry per endpoint with bounded backoff. On persistent failure, persist pending diff on disk and surface a warning (operator may manually replay).
+- **Crash-recovery push.** At startup, if `mapobjects_store` reports non-empty pending diff for a previously terminated mission, run the post-flight push for that mission BEFORE BIT for any new mission begins.
+- On `DELETE /missions/{mission_id}` (observed via missions API or out-of-band signal), notify `mapobjects_store` to drop mission-scoped objects.
+- Survive transient connection loss with bounded exponential backoff. Pre-flight, this delays takeoff. In-flight, missing connectivity does not stop execution of the already-in-memory mission. (No central writes happen in-flight by design — Frozen choice 6.)
+
+## 5. Internal State
+
+- Currently active mission (the original, plus any patched version from middle-waypoint inserts).
+- Schema version reported by missions API at fetch.
+- MapObjects pull state: `not_started | in_flight | synced | cached_fallback | failed`.
+- MapObjects push queue: per-mission pending diff with retry counter and last-failure reason.
+- Retry counter and last-failure reason for each endpoint.
+
+State is in-process only **except** for the post-flight push queue, which is durable on disk so a crash mid-mission does not lose the diff.
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| Missions API unreachable at startup | HTTP error / DNS failure | Bounded retry; if max-retry exceeded, refuse to start the mission; health → red; surface to operator. |
+| Schema mismatch (mission or mapobjects) | response decoder | Refuse to start the mission; surface raw response (size-capped) for offline analysis. |
+| Pre-flight MapObjects pull fails | HTTP error / timeout | BIT degrades; operator may acknowledge cached fallback or abort. Never silent. |
+| Mid-flight middle-waypoint POST fails | HTTP error | `mission_executor` decides: continue with the existing in-memory mission, or RTL if the failure is persistent. |
+| Post-flight MapObjects push fails | HTTP error / 5xx | Persist pending diff on disk; bounded retry with exponential backoff; operator-visible warning after max retries. |
+| Post-flight push partial success | per-endpoint status | Independent retry per endpoint; do not roll back the successful one. |
+| Mission deleted mid-flight | `DELETE` notification | Surface to operator; safe-shutdown decision is a policy in `mission_executor` (default: continue current mission and notify on landing). The post-flight push will then receive a 404; the pending diff is preserved as orphaned data for forensic review. |
+
+## 7. Dependencies
+
+**In-process** (input): startup config, `mission_executor`, `operator_bridge` (via `scan_controller`), `mapobjects_store` (pending-diff handle).
+**In-process** (output): `mission_executor`, `mapobjects_store`.
+
+**External**: missions API (HTTPS REST), including the MapObjects extension. Contract owner: `../_docs/02_missions.md` (with the §7.13 extension proposed in this repo).
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Startup mission fetch | ≤5 s on healthy connectivity |
+| Pre-flight MapObjects pull | ≤30 s for a 30 km × 30 km mission area |
+| Middle-waypoint POST | ≤2 s on healthy connectivity |
+| Post-flight MapObjects push | ≤2 min for a 60 min mission's pass diff; persisted on disk if push fails |
+| Bounded retry | configurable max; default 5 attempts with exponential backoff for synchronous calls; 24 h durable retry window for the post-flight push |
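
The bounded-retry row can be illustrated with a small policy type. This is a sketch only: the `base` and `cap` durations are assumed values (the spec fixes just the default of 5 attempts), the type and method names are hypothetical, and jitter is omitted for brevity.

```rust
use std::time::Duration;

pub struct RetryPolicy {
    pub max_attempts: u32,
    pub base: Duration, // assumed value, not specified in this document
    pub cap: Duration,  // assumed value, not specified in this document
}

impl RetryPolicy {
    /// Delay to wait after `failed_attempts` consecutive failures, or `None`
    /// once the attempt budget is exhausted.
    pub fn delay_after_failure(&self, failed_attempts: u32) -> Option<Duration> {
        if failed_attempts >= self.max_attempts {
            return None;
        }
        // Double the delay for each failure after the first, then clamp to the cap.
        let factor = 2u32.saturating_pow(failed_attempts.saturating_sub(1));
        Some(self.base.saturating_mul(factor).min(self.cap))
    }
}
```

With `max_attempts = 5`, a 500 ms base and a 3 s cap, the waits between attempts are 0.5 s, 1 s, 2 s and 3 s, after which the caller gives up and surfaces the failure.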
+
+## 9. Open Questions
+
+- **`mission-schema` extraction location** (`architecture.md §8 Q5`): `_infra/` at suite root, or a small third repo.
+- **MapObjects endpoint contract** (`architecture.md §8 Q7`): paging, photo-ref upload, retention policy.
+- **MapObjects conflict resolution** (`architecture.md §8 Q8`): server-side; this component only consumes the result.
+- Auth / session model for the missions API (per `../_docs/02_missions.md`).
+
+## 10. References
+
+- `architecture.md §3`, `§5 Architectural Principles` (separate repos + shared schema; MapObjects mission-bracketed), `§7.6 Solution Architecture`, `§7.13 MapObjects Sync`.
+- `system-flows.md §F6 Mission lifecycle`, `§F8 MapObjects sync`.
+- `data_model.md §MissionItem`, `§MissionWaypoint`, `§Geofence`, `§MapObjectsBundle`, `§MapObjectObservation`.
+- `../_docs/02_missions.md`.
@@ -0,0 +1,94 @@
+# Component — `mission_executor`
+
+**Layer**: Action (data plane out)
+**Status**: forward-looking design (Rust)
+
+## 1. Purpose
+
+Drives the airframe through a typed state machine: connect → health-check → **pre-flight self-test (BIT, F9)** → (variant-specific arm/takeoff or wait-for-AUTO) → upload mission → fly mission → land. Owns geofence enforcement (both INCLUSION and EXCLUSION), the **lost-link failsafe ladder** (F10), and **battery / fuel threshold enforcement**. Inserts middle waypoints on operator-confirmed targets and resumes the original mission after target-follow ends. Issues all autopilot-facing commands through `mavlink_layer`. Triggers post-flight MapObjects push (F8) on terminal state.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| Mission JSON (parsed) | `mission_client` | once at start; on middle-waypoint update | Contains waypoints + INCLUSION/EXCLUSION geofences + return point. |
+| Airframe variant | startup config | once | `multirotor` or `fixed_wing`. |
+| MAVLink telemetry | `mavlink_layer` | continuous | Position, attitude, mode, sys-status, mission progress. |
+| Middle-waypoint hint | `scan_controller` (from `operator_bridge`) | event on operator confirm | Triggers mission re-upload. |
+| Target-follow release / loss / timeout | `scan_controller` | event | Triggers reverting to the original mission. |
+| Health input from peer components | health aggregator | continuous | Used for the health-check gate before takeoff. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| MAVLink commands (arm, takeoff, set-mode, change-speed, change-alt, land, RTL, mission-clear, mission-upload, set-current) | `mavlink_layer` | per state transition |
+| UAV telemetry (forwarded) | `scan_controller`, `movement_detector`, `telemetry_stream` | continuous |
+| Mission state | `scan_controller`, `operator_bridge` | event on transitions |
+| Health metric | health aggregator | current state, `state_duration_ms`, `transition_failures_by_state`, geofence violations, retry counts. |
+
+## 4. Key Responsibilities
+
+- Run the variant-specific state machine (see `architecture.md §7.7`):
+  - **Multirotor**: `DISCONNECTED → CONNECTED → HEALTH_OK → BIT_OK → ARMED → TAKE_OFF → MISSION_UPLOADED → FLY_MISSION → LAND → POST_FLIGHT_SYNC → DONE`.
+  - **Fixed-wing**: `DISCONNECTED → CONNECTED → HEALTH_OK → BIT_OK → MISSION_UPLOADED → WAIT_AUTO → FLY_MISSION → LAND → POST_FLIGHT_SYNC → DONE`.
+- Apply bounded retry with exponential backoff at every transition; explicit max-retry; on exceeding it, health flips to red and the executor surfaces the failure via `operator_bridge`. **No infinite retry.**
+- **Run pre-flight BIT (F9)** before transitioning to `ARMED` / `WAIT_AUTO`. BIT covers every dependency in `architecture.md §5` plus mission load + MapObjects pre-flight pull (cached fallback acknowledged) + persistent-store free space + wall-clock binding. On BIT FAIL, no transition. On DEGRADED, surface to operator for signed acknowledgement (per Q9).
+- **Run the lost-link failsafe ladder (F10)** every tick: `LinkOk → LinkDegraded → LinkLost → LinkLostInFollow`. Default RTL after 30 s grace; configurable. MAVLink-link loss to ArduPilot itself is a separate, more severe event — health → red, airframe failsafe takes over (we do NOT override it).
+- **Enforce battery / fuel thresholds.** Read `SYS_STATUS` / `EXTENDED_SYS_STATE` continuously; trigger RTL at `battery ≤ rtl_threshold` (default 25 %); land-now at `battery ≤ hard_floor` (default 15 %); operator override only via signed command.
+- Enforce geofences. INCLUSION violations halt forward progress and trigger RTL; EXCLUSION violations trigger the same. Both are honoured (the earlier C++ behaviour silently ignored EXCLUSION; the new design rejects that).
+- On middle-waypoint hint: recompute the mission (`current_position → middle_waypoint → resume_original_route`), `MISSION_CLEAR_ALL`, re-upload via the standard sequence, `MISSION_SET_CURRENT(0)`, and resume.
+- On target-follow ending: recompute and re-upload the original mission from the current position; resume.
+- **Trigger post-flight MapObjects push (F8)** on entry to `POST_FLIGHT_SYNC` — that is, after `LAND` completes (or after RTL completes, or after operator-acknowledged abort). Hand off to `mission_client`.
+- Forward MAVLink telemetry to `scan_controller` (for proximity priority + middle-waypoint computation), to `movement_detector` (for ego-motion compensation), and to `telemetry_stream` (for operator overlay).
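
The two variant-specific happy paths listed above can be captured in a single transition function. The sketch below covers only the nominal sequence; retries, failure handling, and the RTL / abort paths into `POST_FLIGHT_SYNC` are left out, and the Rust type and function names are illustrative.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Variant {
    Multirotor,
    FixedWing,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Disconnected,
    Connected,
    HealthOk,
    BitOk,
    Armed,
    TakeOff,
    MissionUploaded,
    WaitAuto,
    FlyMission,
    Land,
    PostFlightSync,
    Done,
}

/// Nominal successor of `state` for the given airframe variant, or `None`
/// when the state is terminal or not part of that variant's sequence.
pub fn next_flight_state(variant: Variant, state: State) -> Option<State> {
    use State::*;
    Some(match (variant, state) {
        (_, Disconnected) => Connected,
        (_, Connected) => HealthOk,
        (_, HealthOk) => BitOk,
        (Variant::Multirotor, BitOk) => Armed,
        (Variant::Multirotor, Armed) => TakeOff,
        (Variant::Multirotor, TakeOff) => MissionUploaded,
        (Variant::Multirotor, MissionUploaded) => FlyMission,
        (Variant::FixedWing, BitOk) => MissionUploaded,
        (Variant::FixedWing, MissionUploaded) => WaitAuto,
        (Variant::FixedWing, WaitAuto) => FlyMission,
        (_, FlyMission) => Land,
        (_, Land) => PostFlightSync,
        (_, PostFlightSync) => Done,
        _ => return None,
    })
}
```

Keeping the variant in the match key makes states such as `ARMED` unreachable for fixed-wing by construction rather than by runtime checks.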
+
+## 5. Internal State
+
+- Current state + variant.
+- Currently active mission (original) + active patched mission (with middle waypoint), if any.
+- Per-transition retry counter and last-failure reason.
+- Mission progress (current item index).
+- Geofence violation history (for diagnostics).
+
+State is in-process only; restart re-runs the state machine from `DISCONNECTED`.
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| MAVLink connection lost | heartbeat timeout from `mavlink_layer` | Bounded retry; health → red after threshold; state machine pauses (does not reset). |
+| Health-check gate fails (sensors not ok, low battery, etc.) | telemetry inspection | Stay in `CONNECTED` state; alert; no takeoff. |
+| BIT FAIL on any item | F9 evaluation | No transition to `BIT_OK`; surface report to operator; remain in `HEALTH_OK`. |
+| Mission upload `MISSION_ACK` rejection | `mavlink_layer` response | Bounded retry with full re-upload; on max-retry, health → red, surface to operator. |
+| Geofence INCLUSION exit | telemetry vs polygon | Trigger RTL via MAVLink; surface alert; transition to `LAND`. |
+| Geofence EXCLUSION entry | telemetry vs polygon | Trigger RTL via MAVLink; surface alert; transition to `LAND`. |
+| Operator/Ground-Station modem link lost | F10 ladder evaluation | `LinkDegraded` (5–30 s) → health yellow + queue events; `LinkLost` (>30 s) → RTL; `LinkLostInFollow` (>30 s in target-follow) → 30 s grace then RTL. Configurable. |
+| MAVLink-link loss to ArduPilot/PX4 | heartbeat timeout | Health → red; airframe's own MAVLink failsafe takes over (we do NOT override). |
+| Battery ≤ rtl_threshold (default 25 %) | SYS_STATUS | Trigger RTL; surface alert; transition to `LAND`. |
+| Battery ≤ hard_floor (default 15 %) | SYS_STATUS | Land-now via `MAV_CMD_NAV_LAND` at safest reachable point; health → red. |
+| Operator override of safety threshold | signed command (Q9) | Permitted; recorded in audit log with operator ID + rationale. |
+| Middle-waypoint compute fails (e.g., target outside INCLUSION) | pre-upload validation | Reject the hint with reason; surface to `operator_bridge`; original mission continues. |
+| Target-follow handover from `scan_controller` while not yet airborne | state guard | Reject; surface error; never deliver target-follow before `FLY_MISSION`. |
+| Post-flight MapObjects push fails | F8 status | Persist pending diff on disk; bounded retry; operator-visible warning after max retries. State machine still reaches `DONE` so a new mission can start. |
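
The lost-link ladder and battery rows in the table above reduce to two small classification functions. The sketch below hard-codes the defaults quoted in the table (5 s, 30 s, 25 %, 15 %) for readability; in the design these are configurable. The function names are hypothetical, the extra follow-mode grace period before RTL is not modelled, and the battery reading is assumed to be an already validated percentage.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LinkState {
    LinkOk,
    LinkDegraded,
    LinkLost,
    LinkLostInFollow,
}

/// Classify the operator link from how long it has been silent.
pub fn link_state(silence: Duration, in_target_follow: bool) -> LinkState {
    let degraded_after = Duration::from_secs(5);
    let lost_after = Duration::from_secs(30);
    if silence > lost_after {
        if in_target_follow {
            LinkState::LinkLostInFollow
        } else {
            LinkState::LinkLost
        }
    } else if silence >= degraded_after {
        LinkState::LinkDegraded
    } else {
        LinkState::LinkOk
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BatteryAction {
    Continue,
    ReturnToLaunch,
    LandNow,
}

/// Map a battery percentage to an action; the hard floor wins over the RTL threshold.
pub fn battery_action(remaining_pct: u8, rtl_threshold: u8, hard_floor: u8) -> BatteryAction {
    if remaining_pct <= hard_floor {
        BatteryAction::LandNow
    } else if remaining_pct <= rtl_threshold {
        BatteryAction::ReturnToLaunch
    } else {
        BatteryAction::Continue
    }
}
```

Both functions are pure, so the executor can evaluate them on every tick and act only when the returned value changes.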
+
+## 7. Dependencies
+
+**In-process** (input): `mission_client`, `mavlink_layer`, `scan_controller`, health aggregator.
+**In-process** (output): `mavlink_layer`, `scan_controller`, `movement_detector`, `telemetry_stream`, `operator_bridge`.
+
+**External**: ArduPilot / PX4 over MAVLink (mediated by `mavlink_layer`).
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Time-to-takeoff (multirotor, healthy startup) | bounded; no infinite waits |
+| Mission-upload retry budget | configurable max; default 3 attempts |
+| Geofence response time | ≤500 ms from violation detection to RTL command |
+| Middle-waypoint re-upload | ≤2 s end-to-end |
+
+## 9. References
+
+- `architecture.md §3`, `§5 Architectural Principles` (bounded retry, geofence symmetric, lost-link mandatory, BIT mandatory, MapObjects mission-bracketed), `§7.3 Reliability and safety`, `§7.7 MAVLink and Piloting` (lost-link ladder + battery thresholds).
+- `system-flows.md §F6 Mission lifecycle`, `§F8 MapObjects sync`, `§F9 Pre-flight self-test`, `§F10 Lost-link failsafe ladder`.
+- `data_model.md §MissionItem`, `§MissionWaypoint`, `§Geofence`.
@@ -0,0 +1,96 @@
+# Component — `movement_detector`
+
+**Layer**: Perception (data plane in)
+**Status**: forward-looking design (Rust + OpenCV bindings; learned-CV fallback per `architecture.md §8 Q14`)
+
+## 1. Purpose
+
+Detect small moving point/cluster candidates that are not yet classifiable by Tier 1, in **both** the zoom-out and zoom-in scan levels, and enqueue them as POIs for confirmation. Compensates for UAV and gimbal motion using synchronised telemetry; naive frame differencing is rejected.
+
+The component is suppressed only during `scan_controller`'s `TargetFollow` state (the gimbal is dominated by tracking commands during follow).
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| `Frame` | `frame_ingest` | up to 30 fps | Frames are not processed for candidate emission when `ai_locked` is set or the system is in `TargetFollow`; during `TargetFollow` they still feed the motion history (see §4). |
+| Gimbal angle (yaw, pitch) | `gimbal_controller` | per frame, monotonic-timestamped | Telemetry-skew gate: reject samples where frame ↔ gimbal skew exceeds the configured tolerance for the current zoom band. |
+| Zoom state | `gimbal_controller` | per frame, monotonic-timestamped | Drives zoom-band selection (`zoomed_out` vs `zoomed_in`) and per-band thresholds; also used for residual-motion scaling. |
+| UAV motion telemetry | `mavlink_layer` (via `mission_executor`) | 10 Hz target | Position + attitude + velocity + monotonic timestamp. |
+| Active-state hint | `scan_controller` | event | `enable_zoomed_out` / `enable_zoomed_in` / `disable` (the last is set during `TargetFollow`). |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| `MovementCandidate` | `scan_controller` | `{ frame_seq, bbox_normalized, residual_velocity_estimate, telemetry_quality, source_frame_ts, source_zoom_band }` |
+| Health metric | health aggregator | `enabled`, `current_zoom_band`, `candidates_per_min_zoomed_out`, `candidates_per_min_zoomed_in`, `telemetry_skew_drops_total`, `compensation_quality_per_band`. |
+
+## 4. Key Responsibilities
+
+- Compute per-frame ego-motion using OpenCV optical flow / global motion estimation (e.g. sparse pyramidal Lucas-Kanade, dense Farnebäck flow, or a feature-based homography), refined by the synchronised gimbal + UAV telemetry.
+- Subtract estimated ego-motion from per-pixel motion; cluster the residuals.
+- Emit clusters that meet the **per-zoom-band** minimum size + persistence threshold as `MovementCandidate`s, capped to honour the system-wide ≤5 POIs/min operator-review budget shared with `scan_controller`.
+- Self-disable in `TargetFollow`. The component still consumes frames while disabled (to keep its motion-history warm) but emits no candidates.
+- Tag each emitted candidate with `source_zoom_band` so `scan_controller` can apply zoom-band-aware queueing logic (described in `system-flows.md §F2`).
+
+## 5. Per-zoom-band tuning
+
+The same code path runs at zoom-out and zoom-in, but the configuration differs because the pixel-to-metre ratio differs by ~10×.
+
+| Knob | Zoom-out (typical) | Zoom-in (typical) |
+|---|---|---|
+| Cluster persistence threshold | 3–5 frames | 6–10 frames (gimbal-pan-induced flicker is more frequent at narrow FOV) |
+| Residual-velocity floor | low (small physical motion is enough) | higher (small physical motion is amplified pixel-wise; raising the floor reduces FP from compensation residuals) |
+| Telemetry-skew tolerance | 50 ms frame ↔ gimbal, 100 ms frame ↔ UAV | 25 ms frame ↔ gimbal, 50 ms frame ↔ UAV (stricter — gimbal slewing dominates zoomed FOV) |
+| Enqueue-latency budget | ≤1 s | ≤1.5 s (allows brief gimbal-stability window) |
+| FP cap (per-band) | per `architecture.md §6 NFR` | per `architecture.md §6 NFR`; if exceeded, fallback per Q14 |
+
+Exact values are mission-tunable; defaults are calibrated during the benchmark gate.
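
As an illustration of the telemetry-skew row, a per-band gate might look like the sketch below. The tolerances are the typical values from the table; the type and function names are hypothetical, and timestamps are modelled as offsets on a shared monotonic clock.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ZoomBand {
    ZoomedOut,
    ZoomedIn,
}

pub struct SkewTolerance {
    pub gimbal: Duration,
    pub uav: Duration,
}

/// Typical per-band tolerances from the tuning table.
pub fn default_tolerance(band: ZoomBand) -> SkewTolerance {
    match band {
        ZoomBand::ZoomedOut => SkewTolerance {
            gimbal: Duration::from_millis(50),
            uav: Duration::from_millis(100),
        },
        ZoomBand::ZoomedIn => SkewTolerance {
            gimbal: Duration::from_millis(25),
            uav: Duration::from_millis(50),
        },
    }
}

fn abs_diff(a: Duration, b: Duration) -> Duration {
    if a > b { a - b } else { b - a }
}

/// True when both telemetry samples are close enough to the frame timestamp.
pub fn within_skew(
    frame_ts: Duration,
    gimbal_ts: Duration,
    uav_ts: Duration,
    tol: &SkewTolerance,
) -> bool {
    abs_diff(frame_ts, gimbal_ts) <= tol.gimbal && abs_diff(frame_ts, uav_ts) <= tol.uav
}
```

A frame whose gimbal sample is 40 ms off passes at zoom-out but is dropped at zoom-in, which is the stricter behaviour the table calls for.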
+
+## 6. Internal State
+
+- Rolling motion-history buffer (a few seconds of frames + telemetry). One buffer per zoom band; switching bands does not invalidate the buffer for the other.
+- Per-cluster persistence counters (per zoom band).
+- Telemetry-sync state machine.
+- `current_zoom_band` derived from `gimbal_controller`'s zoom state.
+
+State is in-process only.
+
+## 7. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| Telemetry skew above tolerance (per zoom band) | timestamp delta exceeds threshold | Drop that frame's compensation; do not emit candidates for the affected window; counter-tagged drop. |
+| Optical-flow degenerate | flow magnitudes implausible (e.g. camera failure, full motion blur) | Skip emission for that frame; surface as a health signal on sustained occurrence. |
+| Sustained candidate flood at zoom-in (FP cap exceeded) | candidates_per_min_zoomed_in over a sliding window | Suppress zoom-in emission only; keep zoom-out emission running; surface health → yellow; this is the trigger condition for the Q14 fallback. |
+| Sustained candidate flood at zoom-out (FP cap exceeded) | candidates_per_min_zoomed_out over a sliding window | Down-rank lowest-confidence candidates; surface health → yellow; never silently drop without counting. |
+| Component disabled by `scan_controller` | active-state hint = `disable` | Emit zero candidates; keep motion history warm. |
+
+## 8. Dependencies
+
+**In-process**: `frame_ingest`, `gimbal_controller`, `mavlink_layer`, `scan_controller`.
+
+**External**: OpenCV (patched, version-pinned). Optional: a learned-CV crate / module (RAFT-derivative or CNN motion-segmentation) behind a build-time feature flag — engaged only when the Q14 fallback is required.
+
+## 9. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Candidate enqueue latency (zoom-out) | ≤1 s from detection to POI in queue |
+| Candidate enqueue latency (zoom-in) | ≤1.5 s from detection to POI in queue |
+| False-positive rate at the operator surface | bounded by `scan_controller`'s ≤5 POIs/min cap; per-zoom-band internal caps prevent zoom-in starving zoom-out |
+| CPU budget on Jetson | configurable; must coexist with Tier 1 (running in `../detections`) and Tier 2 |
+| Telemetry-skew tolerance | per-zoom-band; defaults in §5 |
+
+## 10. Open Questions
+
+- **Q14 fallback selection** (architecture.md §8): if classical OpenCV fails the per-zoom-band FP cap at zoom-in, the fallback module — learned optical flow vs CNN motion-segmentation vs IMU-tighter-coupled classical — is open. Interface contract is fixed (`Frame + telemetry → Vec<MovementCandidate>`).
+- Minimum cluster persistence threshold across zoom bands (refined during benchmark gate).
+- Whether to keep the motion-history buffers across zoom-band transitions or reset them on transition (§6 currently keeps one buffer per band and does not invalidate either on a switch).
+
+## 11. References
+
+- `architecture.md §3`, `§5 Architectural Principles` (ego-motion compensation mandatory; movement runs at both zoom levels), `§7.6 Movement detector`, `§8 Q14`.
+- `system-flows.md §F2 Movement detection (zoom-out + zoom-in)`.
+- `data_model.md §MovementCandidate`.
@@ -0,0 +1,89 @@
+# Component — `operator_bridge`
+
+**Layer**: Action (data plane out)
+**Status**: forward-looking design (Rust)
+
+## 1. Purpose
+
+Surfaces POIs to the operator (via the always-on `telemetry_stream`) and routes operator commands (confirm / decline / target-follow start / target-follow release / safety-override / BIT-degraded-acknowledge) back into autopilot. On decline, appends an `IgnoredItem`. On confirm, hands a middle-waypoint hint to `mission_executor`. On target-follow start / release, drives `scan_controller`'s state transition. **Validates every operator command's authentication signature, replay protection, and session binding before dispatching it** — the modem link's encryption alone is not sufficient (per `architecture.md §5` and Q9).
+
+The Ground Station is the operator-facing UI; `operator_bridge` is the autopilot-side counterpart.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| POI surface request | `scan_controller` | per POI | Includes Tier 1, Tier 2, and (optional) Tier 3 evidence. |
+| POI dequeue / replace | `scan_controller` | event | When the queue rotates (cap, age-out, or completion). |
+| Operator command (confirm / decline / target-follow start / target-follow release / safety-override / BIT-degraded-acknowledge) | Ground Station (via `telemetry_stream`) | event | Acked back to operator with command id + result. |
+| Modem link state | `telemetry_stream` | event | Used to decide whether to surface POIs at all (see Failure Modes). |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| Operator-facing POI event | `telemetry_stream` (which pushes to Ground Station) | `{ poi_id, mgrs, class_group, confidence, vlm_status, tier2_evidence_summary, photo_metadata }` |
+| `IgnoredItem` append | `mapobjects_store` (via `scan_controller`) | on operator decline |
+| Middle-waypoint hint | `mission_executor` (via `scan_controller`) | on operator confirm |
+| Target-follow start / release | `scan_controller` | on operator command |
+| Health metric | health aggregator | `pois_surfaced_per_min`, `decision_latency_p50/p99` (operator-side), `commands_in_flight`. |
+
+## 4. Key Responsibilities
+
+- Translate `POI` events from `scan_controller` into the wire format defined in `architecture.md §7.10 Drone ⇄ Operator Sync Message Format` and push them through `telemetry_stream`.
+- Receive operator commands on the return path; **validate the authentication signature, replay-protection sequence number, and session token** before any other processing. Reject and surface to health on signature failure, sequence-number reuse, or unknown session.
+- Validate the command id matches a POI in flight (or a target-follow session, BIT report, or safety-override scope); ack the operator with the result.
+- Attach the confidence-scaled operator decision deadline (40 % → 30 s, 100 % → 120 s, linear) to each surfaced POI. The timeout itself is enforced by `scan_controller`; this component only ensures the surfaced POI carries the deadline.
+- On confirm, hand `(target_mgrs, target_class)` to `scan_controller` (which forwards a middle-waypoint hint to `mission_executor`).
+- On decline, hand `(MGRS, class_group)` to `scan_controller` for `IgnoredItem` append.
+- Forward BIT-degraded acknowledgements (signed) to `mission_executor` (F9), and safety-override commands (signed) for battery / lost-link suppression to `mission_executor` (F10).
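
The validation order in the responsibilities above (signature, then session, then sequence number) can be sketched as follows. This is an illustration under stated assumptions: the authentication scheme is still open (Q9), so signature verification is stubbed as a boolean, and sequence numbers are assumed to be strictly increasing per session. `CommandEnvelope`, `ReplayGuard`, and `Reject` are hypothetical names.

```rust
use std::collections::HashMap;

/// Rejection reasons, matching the `auth_failed` / `replay_detected` acks.
#[derive(Debug, PartialEq, Eq)]
pub enum Reject {
    AuthFailed,
    ReplayDetected,
}

/// Hypothetical envelope; `signature_ok` stands in for real verification.
pub struct CommandEnvelope {
    pub session_token: String,
    pub sequence_number: u64,
    pub signature_ok: bool,
}

/// Tracks the highest sequence number accepted per known session.
pub struct ReplayGuard {
    sessions: HashMap<String, Option<u64>>,
}

impl ReplayGuard {
    pub fn new() -> Self {
        ReplayGuard { sessions: HashMap::new() }
    }

    /// Registers a session established at the Ground Station.
    pub fn open_session(&mut self, token: &str) {
        self.sessions.insert(token.to_string(), None);
    }

    /// Runs the checks before any other processing of the command.
    pub fn admit(&mut self, cmd: &CommandEnvelope) -> Result<(), Reject> {
        if !cmd.signature_ok {
            return Err(Reject::AuthFailed);
        }
        let last = match self.sessions.get_mut(&cmd.session_token) {
            Some(last) => last,
            None => return Err(Reject::AuthFailed), // unknown session token
        };
        if let Some(prev) = *last {
            if cmd.sequence_number <= prev {
                return Err(Reject::ReplayDetected);
            }
        }
        *last = Some(cmd.sequence_number);
        Ok(())
    }
}
```

How this gate interacts with the idempotency cache for legitimate retransmissions is not specified here and would need to be settled together with Q9.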
+
+## 5. Internal State
+
+- Currently surfaced POIs by id (with deadlines).
+- In-flight target-follow session (if any).
+- Per-command idempotency keys.
+
+State is in-process only.
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| Modem link down | `telemetry_stream` health | Stop surfacing POIs; queue them in `scan_controller` (whose cap still applies); resume on reconnect. F10 lost-link ladder owns the larger response. |
+| Operator command for unknown POI id | command validation | Ack with error; do not act on it. |
+| Operator command after deadline | command validation | Ack with `expired`; do not act on it. |
+| Duplicate operator command (re-tx) | idempotency key | Ack with the cached result; do not double-act. |
+| `scan_controller` rejects the confirm (e.g., already in target-follow) | response from controller | Ack operator with `rejected: already_following`; surface the active target. |
+| Operator command signature invalid | auth check | Reject with `auth_failed`; log; surface health → red on sustained failures (potential hostile injection). |
+| Operator command sequence number reused | replay-protection check | Reject with `replay_detected`; log; do not act on it. |
+| Unknown session token | session validation | Reject with `auth_failed`; log; require operator re-auth at Ground Station. |
+| Operator attempts to acknowledge a BIT FAIL as DEGRADED | severity check | Rejected by validation; surface to operator as `cannot_acknowledge_fail`. |
+
+## 7. Dependencies
+
+**In-process** (input): `scan_controller`, `telemetry_stream`.
+**In-process** (output): `scan_controller` (for state transitions), indirectly `mapobjects_store` and `mission_executor` (via `scan_controller`).
+
+**External**: Ground Station API (operator-facing); contract owned by `../_docs/04_system_design_clarifications.md`.
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| POI surface → operator visible | ≤1 s under normal modem conditions |
+| Operator command → autopilot effect | ≤1 s under normal modem conditions |
+| Idempotency window | 60 s (per-command-id cache) |
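
The 60 s idempotency window could be implemented as a small per-command-id result cache. The sketch below assumes a `u64` command id and a string ack, both placeholders; `IdempotencyCache` and `handle` are hypothetical names.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Caches the ack for each command id for a fixed window.
pub struct IdempotencyCache {
    window: Duration,
    results: HashMap<u64, (Instant, String)>,
}

impl IdempotencyCache {
    pub fn new(window: Duration) -> Self {
        IdempotencyCache { window, results: HashMap::new() }
    }

    /// Returns the cached ack for a re-transmitted command, or runs `act`
    /// once and caches its ack. Entries older than the window are evicted.
    pub fn handle(
        &mut self,
        command_id: u64,
        now: Instant,
        act: impl FnOnce() -> String,
    ) -> String {
        let window = self.window;
        self.results
            .retain(|_, entry| now.duration_since(entry.0) < window);
        if let Some(entry) = self.results.get(&command_id) {
            return entry.1.clone(); // duplicate: do not double-act
        }
        let ack = act();
        self.results.insert(command_id, (now, ack.clone()));
        ack
    }
}
```

Passing `now` in explicitly keeps the cache testable without sleeping; a caller would typically pass `Instant::now()`.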
+
+## 9. Open Questions
+
+- Ground Station API contract (`architecture.md §8 Q2`): stream protocol (WebRTC / WebSocket-H.264 / gRPC server-streaming?), session/auth model, bbox-overlay rendering.
+- **Operator-command authentication scheme** (`architecture.md §8 Q9`): HMAC over (session_token, sequence_number, payload) vs JWT-style ed25519 vs MAVLink-2 signing extended to operator commands vs separate envelope. The principle is committed; the scheme is open.
+- **Multi-operator session policy** (`architecture.md §8 Q11`): single active operator at a time, or quorum?
+
+## 10. References
+
+- `architecture.md §3`, `§5 Architectural Principles` (operator commands authenticated, signed, replay-protected), `§7.10 Drone ⇄ Operator Sync Message Format`, `§8 Q9 / Q11`.
+- `system-flows.md §F5 Operator round trip`, `§F9 Pre-flight self-test`, `§F10 Lost-link failsafe ladder`.
+- `data_model.md §POI`, `§IgnoredItem`, `§OperatorCommand`.
+- `../_docs/04_system_design_clarifications.md`.
@@ -0,0 +1,96 @@
+# Component — `scan_controller`
+
+**Layer**: Decision + Memory
+**Status**: forward-looking design (Rust)
+
+## 1. Purpose
+
+The system's brain. A deterministic typed state machine — `ZoomedOut`, `ZoomedIn { roi, hold_started_at }`, and `TargetFollow { target_id, started_at }`. Owns the POI queue, timeouts, the ≤5 POIs/min operator-review cap, the confidence-scaled operator-decision window, gimbal command issuance, and the new/moved/existing/removed dispatch into `mapobjects_store`.
+
+The full behaviour-tree spec — including tick scenarios and the 15 fixed-wing rules — lives in `system-flows.md §F4`.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| `DetectionBatch` | `detection_client` | per frame | Tier 1 primitives. |
+| `MovementCandidate` | `movement_detector` | per frame at both zoom-out and zoom-in (suppressed only during `TargetFollow`) | Each candidate carries `source_zoom_band`. |
+| `Tier2Evidence` | `semantic_analyzer` | per zoom-in hold | Path / endpoint / concealment scoring. |
+| `VlmAssessment` | `vlm_client` (optional) | per zoom-in endpoint hold | `status: disabled` if VLM is off. |
+| Operator commands | `operator_bridge` | event | confirm / decline / target-follow start / target-follow release. Authenticated, signed, replay-protected upstream of this component. |
+| UAV telemetry | `mavlink_layer` (via `mission_executor`) | 10 Hz target | Position used for proximity-weighted POI priority and middle-waypoint computation. |
+| Mission state | `mission_executor` | event | Current waypoint, mission progress; used for sweep-vs-route alignment. |
+| MapObjects sync state | `mapobjects_store` | event at startup + post-flight | `synced` / `cached_fallback` / `degraded` — surfaces a health flag and (for `degraded`) suppresses MapObject diff classifications until corrected. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| `GimbalCommand` (yaw / pitch / zoom) | `gimbal_controller` | per state-machine tick or per zoom-in plan step |
+| `POI` to operator | `operator_bridge` (then `telemetry_stream`) | enqueue / dequeue events |
+| Middle-waypoint hint | `mission_executor` | event on operator-confirmed target |
+| MapObjects update | `mapobjects_store` | new / moved / existing / removed dispatch |
+| Health metric | health aggregator | `state`, `pois_in_queue`, `pois_per_min`, `tick_latency_p99`, `last_state_change_ts`, `mapobjects_sync_state`. |
+
+## 4. Key Responsibilities
+
+- Run the `ZoomedOut` / `ZoomedIn` / `TargetFollow` state machine. Transitions are explicit, typed, and fully enumerated; no ad-hoc booleans.
+- Maintain the POI queue ordered by `confidence × proximity_to_current_camera × age_factor`. Hard-cap output to ≤5 POIs/min surfaced to the operator.
+- Apply the confidence-scaled operator decision window (40 % → 30 s, 100 % → 120 s, linear; below 40 % the POI is not surfaced). Timeout = forget; decline = `IgnoredItem` entry via `mapobjects_store`.
+- Suppress new POIs whose `(MGRS, class_group)` matches an existing `IgnoredItem`.
+- For each new detection or movement candidate: compute the H3 cell, ask `mapobjects_store` to classify as new / moved / existing, and only surface non-existing entries.
+- **Zoom-in candidate handling.** When a `MovementCandidate` arrives with `source_zoom_band = zoomed_in`, evaluate against the current ROI: if inside, bump current-ROI confidence; if outside the ROI but inside the broader zoomed FOV, enqueue as a candidate-POI; only interrupt the current zoom-in hold if the candidate's priority exceeds the current hold's priority.
+- On operator confirmation: hand a middle-waypoint hint to `mission_executor`, transition to `TargetFollow`, and command `gimbal_controller` to keep the target in the centre 25 % of frame.
+- On operator decline, timeout, or target loss: return to `ZoomedOut`. On decline only, also append an `IgnoredItem`.
+- On `mapobjects_store` reporting `sync_state = degraded`, surface health → red and **do not** classify new detections (avoid corrupting the central observation log on next push); continue to surface POIs to the operator on Tier-1 + movement evidence alone.
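
The confidence-scaled decision window above is a linear interpolation between two stated points: 40 % maps to 30 s and 100 % maps to 120 s, so window(c) = 30 s + (c − 0.40) / 0.60 × 90 s. A minimal sketch, assuming confidence is an `f32` in `[0.0, 1.0]`; the function and constant names are hypothetical.

```rust
use std::time::Duration;

/// Below this confidence a POI is not surfaced to the operator.
pub const MIN_SURFACE_CONFIDENCE: f32 = 0.40;

/// Operator decision window for a POI, or `None` if it must not be surfaced.
/// Linear between (0.40, 30 s) and (1.00, 120 s).
pub fn decision_window(confidence: f32) -> Option<Duration> {
    if confidence.is_nan() || confidence < MIN_SURFACE_CONFIDENCE {
        return None;
    }
    // Clamp so out-of-range inputs never extend the window past 120 s.
    let c = confidence.min(1.0);
    let fraction = (c - MIN_SURFACE_CONFIDENCE) / (1.0 - MIN_SURFACE_CONFIDENCE);
    Some(Duration::from_secs_f32(30.0 + fraction * 90.0))
}
```

For example, a POI at 70 % confidence sits halfway along the ramp and gets roughly 75 s.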
+
+## 5. Internal State
+
+The state machine lives entirely in this component. State variables:
+
+- Current state: `ZoomedOut | ZoomedIn { roi, hold_started_at } | TargetFollow { target_id, started_at }`.
+- POI queue: ordered, with per-entry priority and queue position.
+- Per-class operator-decision-window thresholds.
+- Last-N tick timestamps for tick-latency observability.
+- Frame-rate floor monitor: when sustained FPS < 10, suppress `ZoomedOut → ZoomedIn` transitions and surface health → yellow.
+
+State is in-process only; restart starts in `ZoomedOut` with an empty queue.
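
The typed state machine could take roughly the following shape. This is a sketch, not the transition table: the authoritative behaviour tree lives in `system-flows.md §F4`. The event names, the `u32` identifiers, and the nanosecond timestamps are assumptions made for illustration.

```rust
/// The three scan states, as named in this spec.
#[derive(Clone, Debug, PartialEq)]
pub enum ScanState {
    ZoomedOut,
    ZoomedIn { roi: u32, hold_started_at_ns: u64 },
    TargetFollow { target_id: u32, started_at_ns: u64 },
}

/// Hypothetical events driving transitions.
#[derive(Clone, Debug, PartialEq)]
pub enum ScanEvent {
    PoiSelected { roi: u32, now_ns: u64 },
    OperatorConfirmed { target_id: u32, now_ns: u64 },
    OperatorDeclined,
    DecisionTimedOut,
    TargetLost,
    FollowReleased,
}

/// Pure transition function; `fps_ok` is the frame-rate floor monitor result.
/// Any (state, event) pair not listed leaves the state unchanged.
pub fn step(state: ScanState, event: ScanEvent, fps_ok: bool) -> ScanState {
    match (state, event) {
        // Zoom-in transitions are suppressed while FPS is below the floor.
        (ScanState::ZoomedOut, ScanEvent::PoiSelected { roi, now_ns }) if fps_ok => {
            ScanState::ZoomedIn { roi, hold_started_at_ns: now_ns }
        }
        // A confirm is accepted unless a target is already being followed.
        (
            ScanState::ZoomedOut | ScanState::ZoomedIn { .. },
            ScanEvent::OperatorConfirmed { target_id, now_ns },
        ) => ScanState::TargetFollow { target_id, started_at_ns: now_ns },
        (
            ScanState::ZoomedIn { .. },
            ScanEvent::OperatorDeclined | ScanEvent::DecisionTimedOut,
        ) => ScanState::ZoomedOut,
        (
            ScanState::TargetFollow { .. },
            ScanEvent::TargetLost | ScanEvent::FollowReleased,
        ) => ScanState::ZoomedOut,
        (unchanged, _) => unchanged,
    }
}
```

Keeping `step` pure (no I/O, no clock reads) is one way to make the "deterministic" property directly testable.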
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| `detection_client` health red | health input | Continue zoom-out sweep; emit no new POIs from Tier 1; movement candidates still flow. |
+| `movement_detector` health red | health input | Continue; lose movement-candidate enqueueing. |
+| `semantic_analyzer` health red | health input | Skip Tier 2; surface POIs with Tier-1-only evidence; flag in operator overlay. |
+| `vlm_client` returns `status: disabled \| timeout \| ipc_error \| schema_invalid` | per-call status | Surface POI without VLM evidence (fail-closed). |
+| `gimbal_controller` not ready | health input | Stay in current state; alert; do not silently drop scan steps. |
+| `operator_bridge` disconnected | health input | Continue zoom-out (operator UI is unreachable, but the system must not crash); pause POI surfacing; resume on reconnect. F10 lost-link ladder owns the larger response. |
+| `mapobjects_store` sync degraded | sync_state input | Suppress diff classifications; surface POIs on Tier-1 + movement only; health → red. |
+| Sustained FPS < 10 | self-instrumented | Suppress zoom-in transitions; health → yellow. |
+| Tick-latency above budget | self-instrumented | Health → yellow; investigate (likely upstream consumer slowness). |
+
+## 7. Dependencies
+
+**In-process** (input): `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client`, `operator_bridge`, `mission_executor`, `mapobjects_store`.
+**In-process** (output): `gimbal_controller`, `operator_bridge`, `mission_executor`, `mapobjects_store`.
+
+**External**: none directly. All external integrations are mediated by other components.
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Tick latency | ≤10 ms p99 |
+| POI enqueue → operator surface | ≤1 s in normal load |
+| POI rate to operator | ≤5 POIs/min (hard cap) |
+| Zoom-out → zoom-in transition | ≤2 s including physical zoom |
+| Zoom-in hold duration | configurable; default 5 s/POI |
+| Target-follow centre-window | target inside centre 25 % of frame while visible |
+| Frame-rate floor | ≥10 fps sustained; below this, suppress zoom-in transitions |
+
+## 9. References
+
+- `architecture.md §3`, `§5 Architectural Principles`, `§6 NFR`, `§7.6 Scan controller and POI queue`, `§7.12 New vs Existing / Moved / Removed Object Detection`, `§7.13 MapObjects Sync`.
+- `system-flows.md §F4 Scan controller behaviour tree` (full BT spec, tick scenarios, 15 fixed-wing rules).
+- `data_model.md §POI`, `§IgnoredItem`, `§MapObject`.
@@ -0,0 +1,70 @@
+# Component — `semantic_analyzer`
+
+**Layer**: Perception (data plane in)
+**Status**: forward-looking design (Rust + ONNX/TensorRT bindings)
+
+## 1. Purpose
+
+Tier 2 of the perception pipeline. Reasons over zoom-in crops using a primitive graph plus a lightweight ROI CNN. Active only when `scan_controller` is in `ZoomedIn`. Owns path-freshness scoring, endpoint scoring, branch choice at intersections, and concealment-POI scoring. Operates on bounded ROIs only — never full frames.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| `DetectionBatch` (Tier 1 primitives) | `detection_client` | per zoom-in frame | Used for primitive-graph construction (paths, branches, entrances, trees). |
+| Zoom-in frame + ROI selection | `frame_ingest` (frame), `scan_controller` (ROI bounds) | per zoom-in hold | Bounded crop only; full frame is not consumed. |
+| Per-class config | startup config | once | Confidence floors, freshness thresholds, branch-priority rules. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| `Tier2Evidence` | `scan_controller` | `{ roi_id, path_freshness, endpoint_score, concealment_score, recommended_next_action: PanFollowFootpath \| HoldEndpoint \| PanBroad \| ReturnToZoomOut, source_detections: Vec<DetectionId>, status: ok \| timeout \| oversize \| error }` |
+| `Pan plan` | `scan_controller` (then `gimbal_controller`) | sequence of pan goals for footpath following |
+| Health metric | health aggregator | `tier2_latency_p50/p99`, `roi_size_bytes_p99`, `errors_total`. |
+
+## 4. Key Responsibilities
+
+- Build a small primitive graph from Tier-1 detections inside the ROI: path nodes (footpaths, roads), endpoint nodes (branch piles, dark entrances, dugouts), context nodes (trees, tree blocks).
+- Score path freshness using the freshness model (texture, edge clarity, undisturbed-surroundings cues).
+- Score concealment for endpoint candidates.
+- At intersections, recommend the freshest / most-promising branch for `gimbal_controller` to pan toward; emit a follow plan that keeps the path centred while the UAV moves.
+- Bound every inference call by a strict ROI size and timeout. Never run on a full frame.
+
+## 5. Internal State
+
+- ROI-scoped primitive graphs (per-ROI lifetime; dropped on zoom-in exit).
+- Lightweight CNN session (ONNX/TensorRT engine).
+
+State is in-process only.
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| ROI size exceeds limit | pre-decode size check | Reject the ROI; surface to `scan_controller` as `tier2_oversize`; do not decode. |
+| Inference timeout (>200 ms) | wall-clock | Return `Tier2Evidence` with `status: timeout`; `scan_controller` decides to skip VLM and surface a low-evidence POI. |
+| CNN session OOM or hardware error | inference call error | Health → red on sustained errors; `scan_controller` falls back to Tier-1-only POI surfacing. |
+| Inconsistent primitive graph (e.g., disconnected paths) | graph validation step | Emit `Tier2Evidence` with `recommended_next_action: ReturnToZoomOut` and `path_freshness: null`. |
+
+## 7. Dependencies
+
+**In-process**: `detection_client`, `frame_ingest`, `scan_controller`.
+
+**External**: ONNX Runtime / TensorRT (whichever the lightweight CNN ships with), OpenCV (preprocessing).
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Per-ROI latency | ≤200 ms p99 |
+| Concealed-position recall | ≥60 % |
+| Concealed-position precision | ≥20 % (operators filter) |
+| Footpath detection recall | ≥70 % |
+| ROI memory footprint | bounded; no unbounded buffering |
+
+## 9. References
+
+- `architecture.md §3`, `§7.6 Tier 2 semantic analyzer`, `§7.5 Training Data`.
+- `system-flows.md §F1 Frame pipeline`, `§F4 Scan controller behaviour tree`.
+- `data_model.md §Tier2Evidence`.
@@ -0,0 +1,78 @@
+# Component — `telemetry_stream`
+
+**Layer**: Telemetry plane (always-on, parallel to the decision loop)
+**Status**: forward-looking design (Rust)
+
+## 1. Purpose
+
+Continuous, always-on push of the camera feed + UAV telemetry + bbox overlay to the Ground Station API over modem. Carries operator commands (confirm / decline / target-follow start / target-follow release) on the return path. Independent of the decision loop — the operator always sees the live feed, not just on detection.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| Decoded `Frame` | `frame_ingest` | up to 30 fps | Re-encoded for the modem link bandwidth. |
+| `DetectionBatch` | `detection_client` | per frame | Used to build the bbox overlay (server-burn-in or client-render — see Open Questions). |
+| `MovementCandidate` (zoom-out + zoom-in) | `scan_controller` (forwarded) | per candidate | Surfaced in operator overlay; the `source_zoom_band` tag is preserved so the overlay can render zoom-out vs zoom-in candidates differently. |
+| UAV telemetry | `mavlink_layer` (via `mission_executor`) | 10 Hz | Position, attitude, mode, sys-status. |
+| Gimbal state | `gimbal_controller` | per change | yaw / pitch / zoom. |
+| `POI` events | `operator_bridge` | per POI surface / dequeue | Passed straight through. |
+| Operator commands | Ground Station (return path) | event | Forwarded to `operator_bridge`. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| Outbound stream | Ground Station API (over modem) | per stream protocol (TBD — see Open Questions) |
+| Inbound operator commands | `operator_bridge` | event |
+| Health metric | health aggregator | `link_state`, `bandwidth_used_mbps`, `frame_drop_rate`, `last_command_received_ts`. |
+
+## 4. Key Responsibilities
+
+- Encode and push the camera feed + telemetry + bbox overlay continuously, regardless of detection state.
+- Apply bandwidth-aware rate adaptation (drop bbox-overlay frequency before frame frequency; drop frame frequency before resolution).
+- Surface the modem link state to the health aggregator; `operator_bridge` consults this to decide whether to surface POIs.
+- Receive operator commands on the return path; forward to `operator_bridge` with monotonic timestamps.
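
The degradation order in the rate-adaptation bullet above (overlay frequency first, then frame frequency, then resolution) can be sketched as a step-down ladder. The halving steps and the floor values below are purely illustrative assumptions; `StreamProfile` and `step_down` are hypothetical names.

```rust
/// Hypothetical outbound stream settings.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct StreamProfile {
    pub overlay_hz: u32,
    pub frame_fps: u32,
    pub height_px: u32,
}

// Illustrative floors; the feed is degraded but never switched off.
pub const MIN_OVERLAY_HZ: u32 = 1;
pub const MIN_FRAME_FPS: u32 = 5;
pub const MIN_HEIGHT_PX: u32 = 360;

/// One degradation step under bandwidth pressure. Only the first knob that
/// still has headroom is reduced, in the order overlay, frame rate, resolution.
pub fn step_down(p: StreamProfile) -> StreamProfile {
    if p.overlay_hz > MIN_OVERLAY_HZ {
        StreamProfile { overlay_hz: (p.overlay_hz / 2).max(MIN_OVERLAY_HZ), ..p }
    } else if p.frame_fps > MIN_FRAME_FPS {
        StreamProfile { frame_fps: (p.frame_fps / 2).max(MIN_FRAME_FPS), ..p }
    } else if p.height_px > MIN_HEIGHT_PX {
        StreamProfile { height_px: (p.height_px / 2).max(MIN_HEIGHT_PX), ..p }
    } else {
        p // already at the floor on every knob
    }
}
```

Because each call changes a single knob, repeated calls give the gradual degradation that the non-functional targets ask for, rather than a jump from full resolution to no feed.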
+
+## 5. Internal State
+
+- Stream session handle.
+- Rate-adaptation state machine.
+- In-flight frame buffer (bounded).
+
+State is in-process only.
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| Modem link down | transport error / heartbeat | Surface `link_lost`; pause outbound push (do not buffer indefinitely); `operator_bridge` pauses POI surfacing. |
+| Bandwidth saturation | adaptive monitor | Reduce bbox-overlay rate, then frame rate, then resolution; surface to health → yellow. |
+| Inbound command unparseable | parser error | Reject; ack with error; do not act. |
+| Inbound command from unauthenticated peer | session check (per Ground Station contract) | Reject; alert. |
+
+## 7. Dependencies
+
+**In-process** (input): `frame_ingest`, `detection_client`, `scan_controller`, `mavlink_layer`, `gimbal_controller`, `operator_bridge`.
+**In-process** (output): `operator_bridge` (return-path commands).
+
+**External**: Ground Station API. Contract owner: `../_docs/04_system_design_clarifications.md`.
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| End-to-end glass-to-operator latency | bounded by modem characteristics; target ≤2 s p99 on a healthy link |
+| Always-on | yes; not detection-gated |
+| Rate adaptation | smooth; no sudden full-resolution → no-feed transitions |
+| Outbound buffering | bounded; no unbounded growth on slow link |
+
+## 9. Open Questions
+
+- **Ground Station API contract** (`architecture.md §8 Q2`): stream protocol (WebRTC / WebSocket-H.264 / gRPC server-streaming?), session/auth model, bbox-overlay rendering (server-side burn-in vs client-side render).
+
+## 10. References
+
+- `architecture.md §3`, `§5 Architectural Principles` (always-on stream, no silent error swallowing), `§7.6 Integration and reliability`.
+- `system-flows.md §F5 Operator round trip`.
+- `../_docs/04_system_design_clarifications.md`.
@@ -0,0 +1,82 @@
+# Component — `vlm_client` (optional)
+
+**Layer**: Perception (data plane in)
+**Status**: forward-looking design (Rust); optional behind a feature flag and a runtime config flag
+
+## 1. Purpose
+
+Tier 3 of the perception pipeline. Asks a local NanoLLM/VILA1.5-3B process to confirm a zoom-in endpoint POI using one bounded ROI crop and a short prompt. Returns a structured `VlmAssessment`. The free-form VLM text is **not** a downstream API contract — only the validated structured output is.
+
+VLM is optional; the system MUST function correctly when VLM is disabled or absent.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| Zoom-in ROI crop + prompt | `scan_controller` | per zoom-in endpoint hold | One bounded crop, short prompt, short answer. |
+| `vlm_enabled` runtime flag | startup config | once at start (re-readable on SIGHUP if implemented) | Gates whether `scan_controller` calls this component at all. |
+| IPC socket path | startup config | once | Unix-domain socket to the NanoLLM process. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| `VlmAssessment` | `scan_controller` | `{ label, confidence, status: ok \| inconclusive \| timeout \| schema_invalid \| ipc_error \| disabled, source_roi_id, latency_ms, model_version }` |
+| Health metric | health aggregator | `enabled`, `vlm_latency_p50/p99`, `errors_by_kind`, `peer_cred_check_pass_rate`. |
+
+## 4. Key Responsibilities
+
+- Validate the ROI payload (size, format) **before** sending it across the IPC channel.
+- Maintain the Unix-domain-socket connection to the NanoLLM process; perform a peer-credential check on connect (where supported by the platform).
+- Send one bounded ROI + short prompt; await one short response within ≤5 s.
+- Validate the response against the `VlmAssessment` schema; on schema-invalid, return `status: schema_invalid` to `scan_controller` and surface to health.
+- Return `status: disabled` when the runtime flag is `false`; `scan_controller` treats this identically to "VLM not present" and proceeds with Tier 2 evidence alone.
+- Capture `model_version` (whatever the NanoLLM process reports for its loaded weights) on every assessment for forensic correlation; log the version on change.
+
+## 5. Internal State
+
+- IPC socket handle and peer-credential cache.
+- In-flight request map (request id → caller).
+
+State is in-process only.
+
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| VLM process not reachable | connect / send error | Return `status: ipc_error`; bounded-backoff reconnect; health → yellow then red. |
+| Peer-cred check fails | platform API | Hard-fail the connect; do not retry without operator intervention; health → red. |
+| Response timeout (>5 s) | wall-clock | Return `status: timeout`; do not block `scan_controller` past the budget. |
+| Schema-invalid response | response parser | Return `status: schema_invalid`; log the raw response (size-capped) for offline analysis. |
+| ROI payload too large | pre-send size check | Return `status: schema_invalid` synchronously; never send. |
+| Optional component absent at build time | feature flag off at compile time | `scan_controller` depends only on the `VlmAssessment` provider trait; the default impl returns `status: disabled`. The binary builds and runs identically without `vlm_client`. |
+
+## 7. Dependencies
+
+**In-process**: `scan_controller`.
+
+**External**: NanoLLM / VILA1.5-3B local process. IPC over Unix-domain socket. No network egress.
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Per-ROI latency | ≤5 s p99 |
+| Memory budget | within the 6 GB shared budget after Tier 1 + Tier 2 |
+| Cloud egress | **none** (hard rule) |
+| Failure mode | fail-closed — never surface a POI with VLM evidence on a degraded VLM call |
+
+## 9. Optionality Model
+
+Two complementary mechanisms; the implementation chooses one or both:
+
+1. **Runtime flag (`vlm_enabled`)** gated by the benchmark-gate result. When `false`, `scan_controller` skips VLM confirmation; the zoom-in hold proceeds with Tier 2 evidence alone.
+2. **Build-time feature module.** `vlm_client` is a separate Cargo feature; the binary builds, links, and runs identically when the feature is off. `scan_controller` depends on a `VlmAssessmentProvider` trait whose default impl returns `status: disabled`.
+
+Both must yield the same observable behaviour: the system functions correctly with VLM absent, only losing the zoom-in confirmation step.
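
One possible shape for the provider trait is sketched below. It assumes a default trait method that returns `status: disabled`, a `u128` standing in for the ROI uuid, and optional fields where no VLM answer exists; the method signature, `NoVlm`, and `usable_vlm_evidence` are hypothetical.

```rust
/// Status values from the `VlmAssessment` output shape.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum VlmStatus {
    Ok,
    Inconclusive,
    Timeout,
    SchemaInvalid,
    IpcError,
    Disabled,
}

/// Simplified stand-in for the structured assessment.
#[derive(Clone, Debug, PartialEq)]
pub struct VlmAssessment {
    pub label: Option<String>,
    pub confidence: Option<f32>,
    pub status: VlmStatus,
    pub source_roi_id: u128, // stand-in for a uuid
    pub latency_ms: u32,
    pub model_version: Option<String>,
}

impl VlmAssessment {
    /// The assessment returned whenever VLM is off or absent.
    pub fn disabled(source_roi_id: u128) -> Self {
        VlmAssessment {
            label: None,
            confidence: None,
            status: VlmStatus::Disabled,
            source_roi_id,
            latency_ms: 0,
            model_version: None,
        }
    }
}

/// What `scan_controller` would depend on instead of `vlm_client` itself.
pub trait VlmAssessmentProvider {
    /// Default behaviour: VLM not present, so report `disabled`.
    fn assess(&mut self, roi_id: u128, _roi_crop: &[u8], _prompt: &str) -> VlmAssessment {
        VlmAssessment::disabled(roi_id)
    }
}

/// Provider used when the Cargo feature is off or `vlm_enabled` is false.
pub struct NoVlm;
impl VlmAssessmentProvider for NoVlm {}

/// Fail-closed: only an `ok` assessment counts as VLM evidence.
pub fn usable_vlm_evidence(a: &VlmAssessment) -> bool {
    a.status == VlmStatus::Ok
}
```

With this shape, the runtime flag and the build-time feature both reduce to selecting `NoVlm`, which is one way to get the identical observable behaviour required above.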
+
+## 10. References
+
+- `architecture.md §5 Architectural Principles` (no cloud egress, fail-closed), `§7.6 Local VLM confirmation`.
+- `system-flows.md §F3 VLM confirmation` (with explicit fail-closed and disabled branches).
+- `data_model.md §VlmAssessment`.
@@ -0,0 +1,384 @@
+# autopilot — Data Model
+
+**Status**: forward-looking design (Rust). This is the canonical entity catalogue.
+
+The autopilot binary itself has **one** persistent store: the on-device `mapobjects_store` (engine TBD — `architecture.md §8 Q3`). Everything else is in-memory only. Mission state and the central MapObjects state are pulled from the external `missions` API on start; there is no in-process mission database. The on-device `mapobjects_store` is a working copy of the central MapObjects state for the active mission's bounding box; the central observation log is the source of truth across missions (per `architecture.md §7.13`).
+
+---
+
+## 1. Entity Map
+
+```mermaid
+erDiagram
+    Frame ||--o{ Detection : "produced by detection_client"
+    Frame ||--o{ MovementCandidate : "produced by movement_detector (zoom-out + zoom-in)"
+    Detection ||--|| BoundingBox : "bbox_normalized"
+    DetectionBatch ||--o{ Detection : "contains"
+
+    POI ||--o| Tier2Evidence : "zoom-in Tier 2"
+    POI ||--o| VlmAssessment : "Tier 3 (optional, zoom-in)"
+    POI }o--o| MapObject : "lookup by H3 + class"
+    POI }o--o| IgnoredItem : "decline_suppressed"
+
+    MapObject ||--o{ MapObjectObservation : "history (central append-only log)"
+    MapObjectsBundle ||--o{ MapObject : "pre-flight pull"
+    MapObjectsBundle ||--o{ MapObjectObservation : "post-flight push"
+    MapObjectsBundle ||--o{ IgnoredItem : "ignored items round-trip"
+
+    OperatorCommand ||--o| POI : "confirm/decline target"
+
+    MissionItem ||--o{ MissionWaypoint : "translates to"
+    MissionItem ||--o{ Geofence : "carries"
+    Geofence }o--o{ Coordinate : "polygon"
+    MissionWaypoint ||--|| Coordinate : "at"
+```
+
+---
+
+## 2. Perception entities
+
+### `Frame`
+
+A decoded video frame. Produced by `frame_ingest`; consumed by `detection_client`, `movement_detector`, `telemetry_stream`.
+
+| Field | Type | Notes |
+|---|---|---|
+| `seq` | u64 | Monotonic sequence number; primary key for cross-component correlation. |
+| `capture_ts_monotonic_ns` | u64 | Wall-clock-independent timestamp taken at the earliest practical point in the pipeline. |
+| `decode_ts_monotonic_ns` | u64 | When `frame_ingest` finished decoding. |
+| `pixels` | `Arc<Bytes>` | Raw pixel data; consumers do not copy. |
+| `width`, `height` | u32 | |
+| `pix_fmt` | enum | `NV12` \| `YUV420P` \| `RGB24` (decoder dependent). |
+| `ai_locked` | bool | If set, downstream consumers skip detection (operator-side or supervisor gating). |
+
+In-memory only.
+
+### `BoundingBox`
+
+| Field | Type | Notes |
+|---|---|---|
+| `x_min`, `y_min`, `x_max`, `y_max` | f32 | Normalised to `[0.0, 1.0]` in image coordinates. |
+
+### `Detection`
+
+One Tier-1 detection. Mirrors the `../detections` contract; carries through to operator overlay unchanged.
+
+| Field | Type | Notes |
+|---|---|---|
+| `class_id` | u32 | |
+| `class_name` | string | Human-readable label. |
+| `confidence` | f32 | 0.0–1.0. |
+| `bbox_normalized` | `BoundingBox` | |
+| `mask_or_polyline` | optional bytes | For polyline classes (e.g. footpaths). |
+| `source_frame_seq` | u64 | Foreign key into `Frame`. |
+
+### `DetectionBatch`
+
+| Field | Type | Notes |
+|---|---|---|
+| `frame_seq` | u64 | |
+| `detections` | `Vec<Detection>` | |
+| `latency_ms` | u32 | Tier-1 round-trip; monitored against the latency budget. |
+| `model_version` | string | Reported by `../detections`; logged on change. |
+
+### `MovementCandidate`
+
+A residual-motion cluster surviving ego-motion compensation in `movement_detector`.
+
+| Field | Type | Notes |
+|---|---|---|
+| `frame_seq` | u64 | |
+| `bbox_normalized` | `BoundingBox` | |
+| `residual_velocity_estimate` | optional struct | Direction + magnitude in image coords; used for prioritisation. |
+| `telemetry_quality` | enum | `synced` \| `degraded` \| `unsynced` (drives whether the candidate may be surfaced at all). |
+| `source_frame_ts_monotonic_ns` | u64 | |
+| `source_zoom_band` | enum | `zoomed_out` \| `zoomed_in`. Drives `scan_controller`'s queueing logic (per `system-flows.md §F2`): zoom-out candidates enter the POI queue normally; zoom-in candidates may bump current-ROI confidence or enter the queue with their own priority. |
+
+### `Tier2Evidence`
+
+Output of `semantic_analyzer` for a single zoom-in ROI hold.
+
+| Field | Type | Notes |
+|---|---|---|
+| `roi_id` | uuid | Stable identifier within a zoom-in hold. |
+| `path_freshness` | f32 \| null | 0.0 = no path / not applicable; 1.0 = fresh. |
+| `endpoint_score` | f32 \| null | Concealed-position likelihood at an endpoint (branch pile / dark entrance). |
+| `concealment_score` | f32 \| null | General concealment-POI score. |
+| `recommended_next_action` | enum | `PanFollowFootpath` \| `HoldEndpoint` \| `PanBroad` \| `ReturnToZoomOut`. |
+| `source_detections` | `Vec<DetectionId>` | For audit / replay. |
+| `status` | enum | `ok` \| `timeout` \| `oversize` \| `error`. |
+
+### `VlmAssessment`
+
+Validated, structured response from `vlm_client`. Free-form VLM text is **not** a downstream API.
+
+| Field | Type | Notes |
+|---|---|---|
+| `label` | enum | `confirmed_concealed_position` \| `rejected` \| `inconclusive` \| `error`. |
+| `confidence` | f32 | 0.0–1.0; VLM-reported or derived. |
+| `evidence_spans` | `Vec<string>` | Short justifications, bounded length. |
+| `reason` | string | One-line rationale; bounded length. |
+| `status` | enum | `ok` \| `timeout` \| `schema_invalid` \| `ipc_error` \| `disabled`. |
+| `latency_ms` | u32 | Round-trip including IPC. |
+| `model_version` | string | Reported by the NanoLLM process for the loaded weights; logged on change for forensic correlation. |
+
+`status` semantics: any value other than `ok` MUST result in `label = inconclusive` (or `error` for a critical failure). `scan_controller` MUST NOT promote a POI to a confirmed target on a non-`ok` `VlmAssessment`.
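+
+The `status` rule above can be sketched as two small functions. This is a minimal illustration, not the `vlm_client` implementation: the names `normalize_label` and `may_promote` are hypothetical, and treating `ipc_error` as the "critical failure" case is an assumption made only for the example.
+
```rust
/// Mirrors the `label` enum from the `VlmAssessment` table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VlmLabel {
    ConfirmedConcealedPosition,
    Rejected,
    Inconclusive,
    Error,
}

/// Mirrors the `status` enum from the `VlmAssessment` table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VlmStatus {
    Ok,
    Timeout,
    SchemaInvalid,
    IpcError,
    Disabled,
}

/// Hypothetical helper: force a safe label for any non-`ok` status.
/// Assumption: only `IpcError` is treated as a critical failure here.
pub fn normalize_label(label: VlmLabel, status: VlmStatus) -> VlmLabel {
    match status {
        VlmStatus::Ok => label,
        VlmStatus::IpcError => VlmLabel::Error,
        _ => VlmLabel::Inconclusive,
    }
}

/// Hypothetical gate: the VLM result supports promotion only when the
/// status is `ok` and the label is a confirmation.
pub fn may_promote(label: VlmLabel, status: VlmStatus) -> bool {
    status == VlmStatus::Ok && label == VlmLabel::ConfirmedConcealedPosition
}
```
+
+Keeping the gate as a pure function of `(label, status)` makes the MUST / MUST NOT rules easy to cover with table-driven tests.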
+
+---
+
+## 3. Decision entities
+
+### `POI`
+
+A Point-of-Interest enqueued by `scan_controller`. Source: a Tier-1 detection, a movement candidate from `movement_detector`, or a Tier-2 semantic finding.
+
+| Field | Type | Notes |
+|---|---|---|
+| `id` | uuid | Stable for the POI's lifetime. |
+| `confidence` | f32 (0.0–1.0) | Composite of detection / motion / Tier-2 score. |
+| `mgrs` | string | MGRS coordinate from the GPS-Denied service or autopilot GPS. |
+| `class` | string | Concrete class. |
+| `class_group` | enum | Per `mapobjects_store` config (e.g. `military_vehicle_group`, `concealed_position_group`, `movement_candidate`). |
+| `source_detection_ids` | `Vec<DetectionId>` | For audit / replay. |
+| `enqueued_at` | timestamp | For queue ageing. |
+| `priority` | f32 | `confidence × proximity_to_current_camera × age_factor`. |
+| `decline_suppressed` | bool | True if `(MGRS, class_group)` matches an existing `IgnoredItem`. |
+| `vlm_status` | enum | Mirrors `VlmAssessment.status` (or `not_requested` / `pending`). |
+| `tier2_evidence` | optional `Tier2Evidence` | |
+| `deadline` | timestamp | Per the confidence-scaled operator-decision window. |
+
+Field `queue_position` is **not** stored; it is computed at read time from `priority` + `enqueued_at`.
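+
+A minimal sketch of how `priority` and the read-time queue order might be computed. The struct `PoiEntry`, the integer `id`, and the millisecond timestamp are stand-ins for illustration; the tie-break (earlier `enqueued_at` first) and the shape of `age_factor` are assumptions, since this document does not define them.
+
```rust
/// Illustrative subset of the `POI` fields needed for ordering.
#[derive(Debug, Clone)]
pub struct PoiEntry {
    pub id: u32,             // stand-in for the uuid `id`
    pub priority: f32,       // as defined in the `POI` table
    pub enqueued_at_ms: u64, // stand-in for the `enqueued_at` timestamp
}

/// priority = confidence × proximity_to_current_camera × age_factor.
/// How each factor is derived is outside the scope of this sketch.
pub fn priority(confidence: f32, proximity: f32, age_factor: f32) -> f32 {
    confidence * proximity * age_factor
}

/// Computes queue order at read time: highest priority first.
/// Assumption: ties are broken by the earlier enqueue time.
pub fn queue_order(pois: &[PoiEntry]) -> Vec<u32> {
    let mut sorted: Vec<&PoiEntry> = pois.iter().collect();
    sorted.sort_by(|a, b| {
        b.priority
            .total_cmp(&a.priority)
            .then(a.enqueued_at_ms.cmp(&b.enqueued_at_ms))
    });
    sorted.iter().map(|p| p.id).collect()
}
```
+
+Because the order is derived on every read, a change to any POI's `priority` is reflected immediately without rewriting stored positions.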
+
+### `MapObject`
+
+A persisted map entry, indexed by H3 cell. Owned by `mapobjects_store`; written on each `NEW` / `MOVED` classification, read on each new detection. The on-device `MapObject` is a **working copy** of the central state for the active mission.
+
+| Field | Type | Notes |
+|---|---|---|
+| `h3_cell` | u64 | H3 cell index at the configured resolution (default `res 10`, ~66 m average edge, ~15,000 m² cell area). |
+| `mgrs_key` | string | MGRS coordinate; together with `class` forms the hashtable composite key. |
+| `class` | string | Concrete class (not the group). |
+| `class_group` | string | Group used for matching during `EXISTING` / `MOVED` / `NEW` classification. |
+| `gps_lat`, `gps_lon` | f64 | For distance calculation against incoming detections. |
+| `size_width_m`, `size_length_m` | f32 | Bounding area on the ground. |
+| `confidence` | f32 | Latest observation confidence (or running average, per implementation). |
+| `first_seen`, `last_seen` | timestamp | Earliest and most recent observation; `last_seen` drives the `REMOVED` candidate diff at region-end. |
+| `mission_id` | string | For the `DELETE /missions/{id}` cascade. |
+| `source` | enum | `central_pulled` (came from pre-flight pull) \| `local_observed` (added during this mission). On post-flight push only `local_observed` records become new observations centrally. |
+| `pending_upload` | bool | True for any `local_observed` entry not yet pushed centrally. Cleared on successful `POST /missions/{id}/mapobjects` ack. |
+
+Persisted in `mapobjects_store` (engine TBD per `architecture.md §8 Q3`).
+
+### `MapObjectObservation`
+
+A single per-detection record. The on-device store appends one of these per NEW / MOVED / EXISTING / REMOVED-CANDIDATE classification; the post-flight push uploads the unflushed list to the central `missions` API. The central side stores all observations append-only as the source of truth (per `architecture.md §7.13`).
+
+| Field | Type | Notes |
+|---|---|---|
+| `id` | uuid | Locally generated; stable across the mission. |
+| `h3_cell` | u64 | |
+| `class` | string | Concrete class. |
+| `class_group` | string | Group used for the diff. |
+| `mission_id` | string | |
+| `uav_id` | string | Identifies the airframe; assigned at provisioning. |
+| `observed_at_monotonic_ns` | u64 | Local monotonic at observation. |
+| `observed_at_wallclock` | timestamp | Bound from GPS or NTP per the wall-clock policy. |
+| `gps_lat`, `gps_lon` | f64 | |
+| `mgrs` | string | |
+| `size_width_m`, `size_length_m` | f32 | |
+| `confidence` | f32 | |
+| `diff_kind` | enum | `NEW` \| `MOVED` \| `EXISTING` \| `REMOVED_CANDIDATE`. |
+| `photo_ref` | string \| null | URL or compact reference; uploaded out-of-band per the central API contract (Q7). |
+| `raw_evidence` | json \| null | Audit payload; size-capped. |
+
+In-memory; durably persisted in `mapobjects_store` until the post-flight push is acknowledged. On the central side, `map_object_observations` is the corresponding table (see `architecture.md §7.13`).
+
+### `MapObjectsBundle`
+
+The wire shape for both the pre-flight pull (response body) and the post-flight push (request body) on `/missions/{id}/mapobjects`.
+
+| Field | Type | Notes |
+|---|---|---|
+| `schema_version` | string | Semver; mismatched versions are rejected. |
+| `mission_id` | string | |
+| `bbox` | `Coordinate[2]` (NW + SE) | The mission area; used by the central API to scope the response. |
+| `map_objects` | `Vec<MapObject>` | Pre-flight: current view from the central store. Post-flight: not used; the push carries `observations` instead (next row). |
+| `observations` | `Vec<MapObjectObservation>` | Post-flight: the full pass diff. |
+| `ignored_items` | `Vec<IgnoredItem>` | Pre-flight: union-merged from the central store. Post-flight: only items appended during this mission. |
+| `as_of` | timestamp | Pre-flight: when the central store snapshot was computed. Post-flight: when the on-device flush started. |
+| `freshness` | enum (pre-flight only) | `fresh` (≤ configured staleness window) \| `stale` (operator must acknowledge to use). |
+
+### `IgnoredItem`
+
+A scene the operator declined; consulted by `scan_controller` before promoting any future detection to a POI. Union-merged across missions on the central side (per `architecture.md §7.13` conflict resolution).
+
+| Field | Type | Notes |
+|---|---|---|
+| `id` | uuid | Locally generated. |
+| `mgrs` | string | Decline location. |
+| `h3_cell` | u64 | For central-side indexing. |
+| `class_group` | string | Class group of the declined detection. |
+| `decline_time` | timestamp | Wall-clock at decline (operator-side). |
+| `operator_id` | string \| null | If known from the Ground Station session. |
+| `mission_id` | string | The mission during which the decline happened. |
+| `retention_scope` | enum | `mission` (cleared at mission end on-device, retained centrally indexed by mission) \| `session` (cleared at session end on-device) \| `until_expiry` (carries `expires_at`). |
+| `expires_at` | timestamp \| null | Required when `retention_scope = until_expiry`. |
+| `source` | enum | `central_pulled` (pre-flight pull) \| `local_appended` (during this mission). Only `local_appended` is uploaded to central in the post-flight push. |
+| `pending_upload` | bool | True for any `local_appended` entry not yet pushed centrally. |
+
+Lookup key: `(MGRS, class_group)` exact match (subject to the same H3 k-ring widening as `MapObject` lookups, when configured).
+
+Persisted in `mapobjects_store`. Central-side table: `map_object_ignored` per `architecture.md §7.13`.
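+
+The exact-match lookup on `(MGRS, class_group)` can be sketched with a hash set. `IgnoredIndex` and its methods are hypothetical names, the MGRS strings are treated as opaque keys, and the optional H3 k-ring widening is deliberately left out.
+
```rust
use std::collections::HashSet;

/// Hypothetical in-memory index over declined scenes.
/// Key: (MGRS string, class_group), matched exactly.
pub struct IgnoredIndex {
    keys: HashSet<(String, String)>,
}

impl IgnoredIndex {
    pub fn new() -> Self {
        IgnoredIndex { keys: HashSet::new() }
    }

    /// Records an operator decline for this location and class group.
    pub fn decline(&mut self, mgrs: &str, class_group: &str) {
        self.keys.insert((mgrs.to_string(), class_group.to_string()));
    }

    /// True if a new detection here should set `decline_suppressed`.
    pub fn is_suppressed(&self, mgrs: &str, class_group: &str) -> bool {
        self.keys
            .contains(&(mgrs.to_string(), class_group.to_string()))
    }
}
```
+
+Matching on the group rather than the concrete class means a decline for one class also suppresses sibling classes in the same group at that location.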
+
+---
+
+## 4. Action / piloting entities
+
+### `Coordinate`
+
+| Field | Type | Notes |
+|---|---|---|
+| `latitude` | f64 | Geographic; degrees. |
+| `longitude` | f64 | Geographic; degrees. |
+| `altitude_m` | f32 | Above ground or above home, depending on usage; the carrying entity defines the frame. |
+
+### `Geofence`
+
+A polygon on the mission. Both INCLUSION and EXCLUSION are honoured by `mission_executor`.
+
+| Field | Type | Notes |
+|---|---|---|
+| `kind` | enum | `INCLUSION` \| `EXCLUSION`. |
+| `vertices` | `Vec<Coordinate>` | Polygon vertices in order. |
+
+### `MissionItem`
+
+The business-level mission item: what the `missions` API delivers and what the operator authored. **Owned by `mission-schema`**, the artefact shared with the `missions` repo (extraction location TBD — `architecture.md §8 Q5`).
+
+| Field | Type | Notes |
+|---|---|---|
+| `id` | uuid | |
+| `kind` | enum | `waypoint` \| `search` \| `region_search` \| `return` \| `target_follow_breakpoint`. |
+| `at` | optional `Coordinate` | For `waypoint` / `return`. |
+| `region` | optional polygon | For `region_search`. |
+| `cruise_speed_mps` | optional f32 | If set, `mission_executor` emits a `MAV_CMD_DO_CHANGE_SPEED` waypoint before the affected items. |
+| `target_classes` | optional `Vec<string>` | Per-item search hint (e.g. `tank`, `artillery`). |
+
+### `MissionWaypoint`
+
+The MAVLink-level wire item: what `mavlink_layer` sends to ArduPilot / PX4. **Owned by `mavlink_layer`**.
+
+| Field | Type | Notes |
+|---|---|---|
+| `seq` | u16 | MAVLink mission item sequence number. |
+| `frame` | enum | `MAV_FRAME_GLOBAL_RELATIVE_ALT` (system default; no terrain-following). |
+| `command` | enum | One of: `MAV_CMD_NAV_TAKEOFF`, `MAV_CMD_NAV_WAYPOINT`, `MAV_CMD_NAV_LAND`, `MAV_CMD_DO_CHANGE_SPEED`, `MAV_CMD_NAV_RETURN_TO_LAUNCH`, `MAV_CMD_DO_SET_MODE`. |
+| `current` | bool | True only for the very first item in a fresh upload. |
+| `auto_continue` | bool | True for everything except the final item. |
+| `param_1..param_4` | f32 | Command-specific. |
+| `lat_deg_e7`, `lon_deg_e7` | i32 | Scaled-integer geographic coordinates. |
+| `alt_m` | f32 | Above home (relative). |
+
+### Translation contract — `MissionItem` → `MissionWaypoint`
+
+Owner: `mission_executor`, variant-aware (multirotor / fixed-wing).
+
+| Source `MissionItem.kind` | Resulting `MissionWaypoint`(s) |
+|---|---|
+| `waypoint` | exactly one `MAV_CMD_NAV_WAYPOINT` |
+| `region_search` | sequence of `MAV_CMD_NAV_WAYPOINT`s computed per the sweep pattern (`architecture.md §8 Q1`) |
+| `return` | one `MAV_CMD_NAV_RETURN_TO_LAUNCH` (or `MAV_CMD_NAV_LAND` at the explicit return point) |
+| `target_follow_breakpoint` | (none) — used only as a structural marker for re-upload; not sent to MAVLink |
+| (cruise speed carried by a `MissionItem`) | one `MAV_CMD_DO_CHANGE_SPEED` placed **before** the affected `MAV_CMD_NAV_WAYPOINT`s |
+
+Multirotor variants prepend `MAV_CMD_NAV_TAKEOFF` and append `MAV_CMD_NAV_LAND`. Fixed-wing variants do neither (the airframe is RC-launched and put into AUTO by the operator); they only upload + start the mission.
+
+A declared cruise speed must actually **reach the autopilot**. If a `MissionItem` declares a cruise speed, the corresponding `MAV_CMD_DO_CHANGE_SPEED` MUST be present in the uploaded sequence, with the speed type in `param_1` and the speed (m/s) in `param_2`, as the MAVLink common message set defines the command. Conformance test in `deployment/ci_cd_pipeline.md §5`.
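+
+A simplified sketch of the `waypoint` row of the translation contract, including the cruise-speed rule. It assumes the MAVLink common definition of `MAV_CMD_DO_CHANGE_SPEED`, where param 1 is the speed type and param 2 is the speed in m/s. The types are cut down for illustration (`current`, `auto_continue`, and the takeoff / land items are omitted), and the choice of ground speed as the speed type is an assumption.
+
```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MavCmd {
    NavWaypoint,
    DoChangeSpeed,
}

/// Cut-down `MissionWaypoint` (only the fields this sketch needs).
#[derive(Debug, Clone, PartialEq)]
pub struct MissionWaypoint {
    pub seq: u16,
    pub command: MavCmd,
    pub param_1: f32,
    pub param_2: f32,
    pub lat_deg_e7: i32,
    pub lon_deg_e7: i32,
    pub alt_m: f32,
}

/// Cut-down `MissionItem` of kind `waypoint`.
pub struct WaypointItem {
    pub latitude: f64,
    pub longitude: f64,
    pub altitude_m: f32,
    pub cruise_speed_mps: Option<f32>,
}

/// Degrees to the scaled-integer form used on the wire.
fn to_e7(deg: f64) -> i32 {
    (deg * 1e7).round() as i32
}

/// Emits one NAV_WAYPOINT per item, preceded by DO_CHANGE_SPEED
/// whenever the item declares a cruise speed.
pub fn translate(items: &[WaypointItem]) -> Vec<MissionWaypoint> {
    let mut out = Vec::new();
    for item in items {
        if let Some(speed) = item.cruise_speed_mps {
            let seq = out.len() as u16;
            out.push(MissionWaypoint {
                seq,
                command: MavCmd::DoChangeSpeed,
                param_1: 1.0, // speed type: ground speed (assumed choice)
                param_2: speed,
                lat_deg_e7: 0,
                lon_deg_e7: 0,
                alt_m: 0.0,
            });
        }
        let seq = out.len() as u16;
        out.push(MissionWaypoint {
            seq,
            command: MavCmd::NavWaypoint,
            param_1: 0.0,
            param_2: 0.0,
            lat_deg_e7: to_e7(item.latitude),
            lon_deg_e7: to_e7(item.longitude),
            alt_m: item.altitude_m,
        });
    }
    out
}
```
+
+A conformance test can then assert that every item with a cruise speed is immediately preceded by a `DoChangeSpeed` entry carrying that speed.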
+
+### `OperatorCommand`
+
+Every command from the Ground Station to autopilot is wrapped in this authenticated envelope. The principle is committed (`architecture.md §5`); the exact signature scheme is open per Q9. `operator_bridge` rejects any command that fails signature validation, replay-protection check, or session validation.
+
+| Field | Type | Notes |
+|---|---|---|
+| `command_id` | uuid | Idempotency key; cached for 60 s by `operator_bridge`. |
+| `session_token` | string | Opaque session token issued by the Ground Station at operator login; bound to `operator_id`. |
+| `sequence_number` | u64 | Monotonically increasing per-session; replay-protection. Lower-or-equal numbers per session are rejected. |
+| `issued_at_wallclock` | timestamp | Operator-side wall-clock. Used for forensic audit; not used for trust decisions. |
+| `kind` | enum | `confirm_poi` \| `decline_poi` \| `start_target_follow` \| `release_target_follow` \| `acknowledge_bit_degraded` \| `safety_override` \| `mission_abort`. |
+| `payload` | json | Action-specific body. |
+| `signature` | bytes | Signature over (`session_token`, `sequence_number`, `kind`, `payload`). Scheme TBD per Q9. |
+
+`scan_controller` and `mission_executor` see only the validated payload; the auth envelope is opaque to them. Audit logs record `command_id`, `operator_id` (resolved from session token), `kind`, and result.
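+
+The replay-protection and idempotency checks described for `operator_bridge` can be sketched as below. Signature and session validation are omitted because the scheme is still open (Q9). `ReplayGuard`, `Verdict`, the millisecond clock argument, and the decision to test for a duplicate `command_id` before the sequence check are all assumptions of this sketch.
+
```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Accept,
    Duplicate, // same command_id seen within the cache window
    Replay,    // sequence_number not above the session's last accepted one
}

/// Hypothetical guard combining the 60 s command-id cache with
/// per-session monotonic sequence numbers.
pub struct ReplayGuard {
    last_seq: HashMap<String, u64>, // session_token -> last accepted sequence
    seen: HashMap<String, u64>,     // command_id -> first-seen time (ms)
    ttl_ms: u64,
}

impl ReplayGuard {
    pub fn new() -> Self {
        ReplayGuard {
            last_seq: HashMap::new(),
            seen: HashMap::new(),
            ttl_ms: 60_000,
        }
    }

    pub fn check(
        &mut self,
        session_token: &str,
        sequence_number: u64,
        command_id: &str,
        now_ms: u64,
    ) -> Verdict {
        // Drop command ids older than the cache window.
        let ttl = self.ttl_ms;
        self.seen
            .retain(|_, first_seen| now_ms.saturating_sub(*first_seen) < ttl);

        // Assumption: a retried command is reported as a duplicate,
        // not as a replay, so the caller can answer idempotently.
        if self.seen.contains_key(command_id) {
            return Verdict::Duplicate;
        }
        if let Some(&last) = self.last_seq.get(session_token) {
            if sequence_number <= last {
                return Verdict::Replay;
            }
        }
        self.last_seq
            .insert(session_token.to_string(), sequence_number);
        self.seen.insert(command_id.to_string(), now_ms);
        Verdict::Accept
    }
}
```
+
+Passing the clock in as `now_ms` keeps the guard deterministic under test; a real component would presumably read a monotonic clock instead.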
+
+### `GimbalState`
+
+| Field | Type | Notes |
+|---|---|---|
+| `yaw`, `pitch` | f32 | Degrees. |
+| `zoom` | f32 | Effective focal length or zoom factor (vendor-specific). |
+| `ts_monotonic_ns` | u64 | Stamp at the moment the gimbal feedback was received. |
+| `command_in_flight` | bool | True between command issuance and feedback that motion completed. |
+
+In-memory only; consumed by `frame_ingest` and `movement_detector` for telemetry-skew compensation.
+
+---
+
+## 5. Sync / wire formats
+
+### MGRS sync message — wire format
+
+The operator round trip (`telemetry_stream` ⇄ Ground Station) uses MGRS-encoded payloads in both directions. Field separator is `::`.
+
+**Drone → Operator (detection report):**
+
+| Position | Field | Type | Notes |
+|---|---|---|---|
+| 1 | `missionId` | string | Server-assigned mission UUID. |
+| 2 | `MGRS(encoded)` | string | MGRS coordinate (compact, military-grid). |
+| 3 | `class` | string | Concrete detection class. |
+| 4 | `confidence` | f32 | 0.0–1.0. |
+| 5 | `size_width_m` | f32 | Ground-projected width. |
+| 6 | `size_length_m` | f32 | Ground-projected length. |
+| 7 | `photo_metadata` | string | URL or compact reference to the snapshot frame. |
+| 8 | `flags` | bitmask | Reserved (e.g. `target_follow_active`, `vlm_used`, `movement_origin`). |
+
+**Operator → Drone (command / acknowledgment):**
+
+| Position | Field | Type | Notes |
+|---|---|---|---|
+| 1 | `missionId` | string | Must match the drone-side mission. |
+| 2 | `Encoded(GroundMGRS :: Time)` | string | Operator's ground location + decision timestamp. |
+| 3 | (variable) | … | Action-specific payload (POI ID, action enum, follow-toggle, etc.). |
+| N | `missionId2` | string | Echo of `missionId` for stream-multiplexing safety. |
+
+The exact serialisation of position 3 (action payload) is left to the Ground Station API contract (open question; see `architecture.md §8 Q2`).
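+
+A minimal sketch of encoding the Drone → Operator detection report with the `::` separator. The struct name, the two-decimal float formatting, the decimal rendering of `flags`, and the sample values are assumptions; the sketch also assumes that no field value contains `::` itself.
+
```rust
/// Illustrative in-memory form of the detection report (positions 1 to 8).
pub struct DetectionReport {
    pub mission_id: String,
    pub mgrs: String,
    pub class: String,
    pub confidence: f32,
    pub size_width_m: f32,
    pub size_length_m: f32,
    pub photo_metadata: String,
    pub flags: u32,
}

/// Joins the eight fields in positional order with `::`.
/// Float precision (two decimals) is an assumed convention.
pub fn encode(r: &DetectionReport) -> String {
    format!(
        "{}::{}::{}::{:.2}::{:.2}::{:.2}::{}::{}",
        r.mission_id,
        r.mgrs,
        r.class,
        r.confidence,
        r.size_width_m,
        r.size_length_m,
        r.photo_metadata,
        r.flags
    )
}

/// Splits a message back into its positional fields.
pub fn split_fields(msg: &str) -> Vec<&str> {
    msg.split("::").collect()
}
```
+
+Since the format is positional, a decoder should check the field count before reading any position; this is also why field-position changes are breaking.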
+
+---
+
+## 6. Persistence and lifecycle
+
+| Entity | Persisted? | Where | Lifecycle |
+|---|---|---|---|
+| `Frame`, `Detection`, `DetectionBatch`, `MovementCandidate`, `Tier2Evidence`, `VlmAssessment`, `GimbalState` | no | in-memory | per frame / per ROI / per command — dropped on state change. |
+| `POI` | no | in-memory inside `scan_controller` | enqueued, surfaced, decided (confirm / decline / timeout), then dropped. |
+| `MapObject` | yes | `mapobjects_store` (working copy of central state) | mission-scoped on-device; appended to central observation log via post-flight push (F8); cleared on `DELETE /missions/{id}` cascade. |
+| `MapObjectObservation` | yes | `mapobjects_store` until acknowledged centrally | per-detection append-log; durable across in-flight crash; cleared per record on `POST /missions/{id}/mapobjects` ack. |
+| `IgnoredItem` | yes | `mapobjects_store` (working copy + post-flight upload of locally-appended items) | per `retention_scope`; central side union-merged. |
+| `MissionItem` | no in autopilot | source of truth is the `missions` API | pulled on start; refreshed on middle-waypoint POST. |
+| `MissionWaypoint` | no | in-memory inside `mavlink_layer` | re-derived from `MissionItem`s on each upload / re-upload. |
+| `OperatorCommand` | partial | command-id cache (60 s) for idempotency; full audit log persisted on disk | per-command; audit-retained per configured policy. |
+
+---
+
+## 7. Versioning and contracts
+
+| Contract | Owner | Versioning |
+|---|---|---|
+| `mission-schema` (the `MissionItem` shape) | shared between `autopilot` and `missions` repos; extraction location TBD (`architecture.md §8 Q5`) | semantic versioning; `mission_client` validates `schema_version` on fetch. |
+| MapObjects bundle schema (`MapObjectsBundle` for pull/push, `MapObjectObservation` for the central observation log) | shared between `autopilot` and `missions` repos as part of the §7.13 endpoint extension | semantic versioning; `mission_client` validates `schema_version`; central side rejects mismatches with 4xx (`architecture.md §8 Q7`). |
+| `../detections` gRPC contract | `../detections` repo (per `../_docs/03_detections.md`) | versioned; `detection_client` rejects schema mismatches (`architecture.md §8 Q4`). |
+| `VlmAssessment` schema | autopilot-internal (this document is the source of truth) | versioned; `vlm_client` rejects schema-invalid responses. The `model_version` field correlates assessments with VLM weights. |
+| MGRS sync wire format | autopilot-internal (this document is the source of truth) | versioned; field-position changes are breaking. |
+| MAVLink command surface | per `architecture.md §7.7` | adding messages requires explicit design review. |
+| `OperatorCommand` envelope (signature scheme) | open per `architecture.md §8 Q9` | once chosen, versioned; both Ground Station and `operator_bridge` must agree. |
@@ -0,0 +1,816 @@
+# autopilot — Decision Rationale
+
+This file is the load-bearing research evidence behind the design. It captures the per-dimension reasoning, the fact cards backing each decision, the component-fit matrix, the validation log, the source bibliography, the evolution from the early draft to the final solution, and the original seed problem narrative. It is **read-only** in the sense that decisions documented here have already shaped `architecture.md §7 Detailed Design` and `system-flows.md §F1–F7`; updates here should follow updates there, not lead them.
+
+## Reasoning chain (per-dimension)
+
+### Dimension 1: Tiered Perception Pipeline
+
+**Fact confirmation.** Existing YOLO integration already emits normalized boxes through a FastAPI/Cython/TensorRT service (Fact #16). Ultralytics supports TensorRT FP16 export for YOLO26-style models (Fact #19). UAV small-object and camouflaged-object literature shows that small concealed targets need class-specific and attention/semantic support rather than assuming generic object-detection transfer (Fact #8, Fact #11).
+
+**Reference comparison.** A single detector is simpler but cannot satisfy footpath tracing, endpoint reasoning, motion candidates, and VLM confirmation. A full VLM-first approach is too slow and memory-sensitive for the zoom-out / zoom-in fast paths (Fact #12, Fact #24).
+
+**Conclusion.** Use a three-tier perception pipeline: Tier 1 fixed-class YOLO26 / YOLOE-26 TensorRT FP16 primitives, Tier 2 primitive-graph plus lightweight ROI confirmation, and Tier 3 NanoLLM VLM only for bounded zoom-in endpoint / POI questions.
+
+**Confidence.** Medium-high. API fit is supported; runtime targets still require hardware benchmarks.
+
+### Dimension 2: Movement Detection
+
+**Fact confirmation.** Dynamic-camera motion detection needs ego-motion compensation because platform movement creates apparent motion in stable objects (Fact #6, Fact #7). OpenCV provides sparse optical flow, feature tracking, and global-motion estimation APIs (Fact #22). The user confirmed timestamped video, gimbal, zoom, and UAV telemetry are available for MVP.
+
+**Reference comparison.** Naive frame differencing is simpler but directly conflicts with the stable-scene rejection requirement. Pure learned tracking without telemetry may work later, but it adds data requirements and hides failure modes.
+
+**Conclusion.** Select telemetry-aided OpenCV ego-motion compensation as the MVP movement-detector baseline, with residual cluster extraction. Run movement detection at **both** zoom-out and zoom-in (per-zoom-band thresholds), benchmark-gating classical CV adequacy at zoom-in before MVP acceptance. The ≤5 POIs/minute cap is enforced by `scan_controller`'s POI scheduler, not by the detector itself, so the same detector can serve both zoom levels.
+
+**Confidence.** High for mechanism fit; medium for runtime and false-positive performance until replay-tested.
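+
+The ≤5 POIs/minute cap that `scan_controller` enforces could take the form of a sliding-window limiter such as the sketch below. `PoiRateCap` and the millisecond clock argument are hypothetical, and the real scheduler may well use a different windowing policy.
+
```rust
use std::collections::VecDeque;

/// Hypothetical sliding-window cap on surfaced POIs.
pub struct PoiRateCap {
    window_ms: u64,
    max_in_window: usize,
    surfaced_at_ms: VecDeque<u64>,
}

impl PoiRateCap {
    pub fn new(max_in_window: usize, window_ms: u64) -> Self {
        PoiRateCap {
            window_ms,
            max_in_window,
            surfaced_at_ms: VecDeque::new(),
        }
    }

    /// Returns true and records the event if a POI may be surfaced now.
    pub fn try_surface(&mut self, now_ms: u64) -> bool {
        // Forget events that have left the window.
        while let Some(&oldest) = self.surfaced_at_ms.front() {
            if now_ms.saturating_sub(oldest) >= self.window_ms {
                self.surfaced_at_ms.pop_front();
            } else {
                break;
            }
        }
        if self.surfaced_at_ms.len() < self.max_in_window {
            self.surfaced_at_ms.push_back(now_ms);
            true
        } else {
            false
        }
    }
}
```
+
+Placing the cap in the scheduler rather than in the detector is what lets the same detector run unchanged in both zoom bands.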
+
+### Dimension 3: Scan and Gimbal Control
+
+**Fact confirmation.** The official ViewPro A40 Pro specifications document fast tracking-output metadata and a 40× optical camera, but do not prove the project's full zoom traversal time (Fact #4, Fact #5). Behaviour trees help large UAV autonomy systems, but this project has a small deterministic scan lifecycle (Fact #26).
+
+**Reference comparison.** Behaviour trees are more extensible, but a deterministic state machine gives simpler timing, queue, and timeout tests for `ZoomedOut`, `ZoomedIn`, and `TargetFollow` states.
+
+**Conclusion.** Use a typed `scan_controller` state machine with explicit states, queue ageing, timeouts, and target-loss handling. Treat ViewPro zoom timing as a hardware-in-loop acceptance test.
+
+**Confidence.** High for architecture fit; medium for physical zoom timing until measured.
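+
+A typed state machine over the three named states might look like the sketch below. Only the state names come from this document; the event names and the specific transitions are assumptions chosen for illustration, and queue ageing is not modelled.
+
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScanState {
    ZoomedOut,
    ZoomedIn,
    TargetFollow,
}

/// Hypothetical events; the real set is defined by `scan_controller`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScanEvent {
    PoiSelected,
    HoldTimedOut,
    FollowStarted,
    FollowReleased,
    TargetLost,
}

/// Pure transition function: unknown (state, event) pairs leave the
/// state unchanged, so every case is explicit and testable.
pub fn step(state: ScanState, event: ScanEvent) -> ScanState {
    use ScanEvent::*;
    use ScanState::*;
    match (state, event) {
        (ZoomedOut, PoiSelected) => ZoomedIn,
        (ZoomedIn, HoldTimedOut) => ZoomedOut,
        (ZoomedIn, FollowStarted) => TargetFollow,
        (TargetFollow, FollowReleased) | (TargetFollow, TargetLost) => ZoomedOut,
        (s, _) => s,
    }
}
```
+
+A pure `step` function keeps timing and timeout tests simple: each test feeds a sequence of events and asserts the resulting state, with no gimbal hardware involved.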
+
+### Dimension 4: VLM Confirmation
+
+**Fact confirmation.** NanoLLM documents local multimodal VILA1.5-3B image+text prompting with MLC and quantisation options (Fact #23). Orin Nano 8 GB VLM deployment is memory-sensitive and needs strict context / token limits (Fact #24). The user confirmed VLM is required for MVP only if the exact model / runtime passes ≤5 s/ROI and memory gates.
+
+**Reference comparison.** Using the VLM for every ROI would exceed the latency and memory budgets. Skipping the VLM entirely would fail the requirement. A separate local VLM IPC process preserves the no-cloud and isolation constraints while allowing a scheduler to avoid concurrent GPU use.
+
+**Conclusion.** Select NanoLLM + VILA1.5-3B MLC quantised as the lead VLM, run only on bounded zoom-in crops, and enforce hard benchmark gates before MVP acceptance.
+
+**Confidence.** Medium. API capability is proven; runtime-quality fit is not proven without target hardware.
+
+### Dimension 5: Data and Acceptance Risk
+
+**Fact confirmation.** All-season MVP was confirmed by the user. UAV small-object and camouflaged-object detection is sensitive to background, scale, and season (Fact #8, Fact #11). Annotation effort is plausible only with assistance and careful prioritisation (Fact #14, Fact #15).
+
+**Reference comparison.** Winter-first MVP would lower risk but conflicts with the confirmed requirement. All-season MVP demands stronger dataset gates and should not rely on aggregate metrics.
+
+**Conclusion.** Keep all-season MVP, but make per-class, per-season, per-terrain validation mandatory. Use annotation assistance and hard-negative mining from false positives to control schedule risk.
+
+**Confidence.** Medium. The requirement is clear; dataset availability is the main risk.
+
+## Fact cards
+
+These are the load-bearing facts referenced from the reasoning chain and the fit matrix. Each card lists the source, confidence, related dimension, and fit impact. Source numbers refer to the bibliography in §References below.
+
+### Fact #1
+
+- **Statement**: Jetson Orin Nano Super is officially specified at 67 INT8 TOPS with 8 GB 128-bit LPDDR5 memory and 102 GB/s memory bandwidth.
+- **Source**: Source #1
+- **Confidence**: High
+- **Related dimension**: Hardware feasibility
+- **Fit impact**: Supports the hardware restriction, but does not prove FP16 multi-model latency.
+
+### Fact #2
+
+- **Statement**: NVIDIA's Super Mode performance gain depends on the JetPack / software configuration and power mode, so benchmark results must record the installed JetPack / L4T and power mode.
+- **Source**: Source #2
+- **Confidence**: High
+- **Related dimension**: Runtime reproducibility
+- **Fit impact**: Adds a missing restriction: lock and report JetPack / power mode for all latency tests.
+
+### Fact #3
+
+- **Statement**: Ultralytics provides Jetson / TensorRT deployment guidance, but the consulted documentation / search results do not prove a two-model YOLO26 + YOLOE-26 pipeline at 1280 px will stay below 100 ms/frame including preprocessing, tiling, and postprocessing.
+- **Source**: Source #3
+- **Confidence**: Medium
+- **Related dimension**: Tier 1 latency
+- **Fit impact**: Makes the ≤100 ms/frame criterion plausible but unproven until benchmarked with the exact exported engines.
+
+### Fact #4
+
+- **Statement**: ViewPro A40 Pro official specifications list 1080p output, 40× optical zoom with 4.25–170 mm focal range, 30 Hz tracking deviation update rate, less than 30 ms deviation output delay, and 5×5 pixel minimum AI target size for the built-in AI feature.
+- **Source**: Source #4
+- **Confidence**: High
+- **Related dimension**: Camera / gimbal feasibility
+- **Fit impact**: Supports control-loop feasibility but does not prove full wide-to-high optical zoom traversal in ≤2 s.
+
+### Fact #5
+
+- **Statement**: The official ViewPro A40 Pro page does not provide a direct full-range optical zoom traversal time; the project-specific 1–2 s zoom traversal claim must be measured on the target camera / interface.
+- **Source**: Source #4
+- **Confidence**: High
+- **Related dimension**: zoom-out → zoom-in transition
+- **Fit impact**: Adds a validation prerequisite for the ≤2 s transition criterion.
+
+### Fact #6
+
+- **Statement**: Recent dynamic-camera moving-object detection work uses optical flow plus additional mechanisms such as tracking-any-point, adaptive bounding-box filtering, segmentation priors, or focus-of-expansion reasoning, because camera motion alone produces apparent motion.
+- **Source**: Source #5, Source #6
+- **Confidence**: High
+- **Related dimension**: Movement detection
+- **Fit impact**: Supports the requirement to compensate UAV / gimbal motion and disqualifies naive frame differencing.
+
+### Fact #7
+
+- **Statement**: Moving-object detection from UAV footage is difficult because objects are small, camera motion is complex, and structured backgrounds can make optical-flow-only approaches unreliable.
+- **Source**: Source #5, Source #6
+- **Confidence**: Medium
+- **Related dimension**: Movement detection reliability
+- **Fit impact**: Adds a missing false-positive / false-negative acceptance criterion for zoom-out motion candidates (and, after the zoom-in benchmark gate, an analogous per-zoom-band criterion for zoom-in).
+
+### Fact #8
+
+- **Statement**: UAV small-object detection literature repeatedly identifies small pixel footprint, complex backgrounds, low contrast, and scale variation as major causes of missed detections and false alarms.
+- **Source**: Source #7, Source #8
+- **Confidence**: High
+- **Related dimension**: YOLO and semantic detection quality
+- **Fit impact**: Makes 80 % precision / recall for new primitive classes realistic only with class-specific validation, tiling, and seasonal coverage.
+
+### Fact #9
+
+- **Statement**: Recent UAV YOLO variants improve small-target results through attention, receptive-field, or feature-fusion changes, implying generic YOLO baseline performance should not be assumed to transfer unchanged to small concealed primitives.
+- **Source**: Source #7, Source #8
+- **Confidence**: High
+- **Related dimension**: Model selection
+- **Fit impact**: Supports keeping "existing class performance must not degrade" and adding per-class / season reporting.
+
+### Fact #10
+
+- **Statement**: Trail / path detection can be treated as a structured perception problem using neural detection plus path-continuity reasoning, not just independent bounding boxes.
+- **Source**: Source #9
+- **Confidence**: Medium
+- **Related dimension**: Footpath detection
+- **Fit impact**: Supports requiring path tracing, freshness scoring, endpoint reasoning, and branch-following behaviour.
+
+### Fact #11
+
+- **Statement**: Camouflaged-object detection papers use specialised attention, illumination, frequency / spatial, or super-resolution methods because camouflaged targets are intentionally similar to the background.
+- **Source**: Source #14
+- **Confidence**: Medium
+- **Related dimension**: Concealed-position detection
+- **Fit impact**: Supports the project's claim that visual similarity to known object classes is insufficient.
+
+### Fact #12
+
+- **Statement**: Small local VLMs can run on Jetson-class devices, but model choice, quantisation, context size, crop size, and runtime container determine whether memory and ≤5 s/ROI are realistic.
+- **Source**: Source #1, Source #12, Source #13
+- **Confidence**: Medium
+- **Related dimension**: VLM feasibility
+- **Fit impact**: Makes local VLM feasible only as a tightly bounded optional Tier 3 module with an exact-model benchmark.
+
+### Fact #13
+
+- **Statement**: The project has about 6 GB remaining RAM only because existing YOLO is assumed to use about 2 GB; unified-memory contention means VLM and YOLO scheduling must be sequential and benchmarked together, not in isolation.
+- **Source**: Source #1, project restrictions
+- **Confidence**: Medium
+- **Related dimension**: Resource budget
+- **Fit impact**: Supports the restriction against concurrent YOLO / VLM GPU inference and adds a whole-pipeline memory test.
+
+### Fact #14
+
+- **Statement**: Interactive or model-assisted segmentation can reduce mask annotation time compared with manual polygon annotation, but this benefit depends on tooling and object-boundary clarity.
+- **Source**: Source #10
+- **Confidence**: High
+- **Related dimension**: Annotation effort
+- **Fit impact**: Makes hundreds-to-thousands of labels plausible in 225 hours only if annotation assistance and prioritisation are used.
+
+### Fact #15
+
+- **Statement**: Label propagation can reduce annotation effort for related frames / sequences, which is relevant to movement-detection video data.
+- **Source**: Source #11
+- **Confidence**: Medium
+- **Related dimension**: Movement dataset creation
+- **Fit impact**: Supports using video / sequential annotation tools for movement candidates rather than frame-by-frame manual labelling only.
+
+### Fact #16
+
+- **Statement**: The existing FastAPI service has endpoints that emit normalized boxes and uses a global inference object around Cython / TensorRT inference.
+- **Source**: `../detections/main.py` (existing detections service)
+- **Confidence**: High
+- **Related dimension**: Integration boundary
+- **Fit impact**: Supports keeping normalized-box output but favours isolating VLM and scan control outside the Cython inference path.
+
+### Fact #17
+
+- **Statement**: The input images show long thin paths, dark narrow entrances, branch / forest-edge concealment, and partial occlusion, so bounding boxes alone may be weak for footpaths and path-follow behaviour.
+- **Source**: original problem-side data parameters (deleted on doc consolidation 2026-05-17; reference PNGs `semantic01..04.png` lived alongside)
+- **Confidence**: Medium
+- **Related dimension**: Annotation format
+- **Fit impact**: Supports allowing segmentation masks or polylines for footpaths instead of boxes only.
+
+### Fact #18
+
+- **Statement**: The project provides no explicit acceptance criteria for false positives per route / time, operator-review workload, queue starvation, telemetry availability, power / thermal throttling, or evidence logging.
+- **Source**: original problem-side acceptance criteria + restrictions (deleted on doc consolidation 2026-05-17; consolidated into `architecture.md §7.3 Restrictions` and `§7.4 Acceptance Criteria`)
+- **Confidence**: High
+- **Related dimension**: Missing criteria
+- **Fit impact**: Requires adding or confirming these criteria before final architecture planning. (Most are now folded into `architecture.md §7.4 Acceptance Criteria > Frozen choices (2026-05-06)`.)
+
+### Fact #19
+
+- **Statement**: Ultralytics YOLO supports TensorRT engine export with FP16 through `half=True`; TensorRT export is GPU-only and supports arguments including `dynamic`, `half`, `int8`, and workspace configuration.
+- **Source**: Source #15
+- **Confidence**: High
+- **Related dimension**: Tier 1 primitive detector
+- **Fit impact**: Supports selecting custom-trained YOLO26 TensorRT FP16 as the primary primitive detector.
+
+### Fact #20
+
+- **Statement**: Jetson TensorRT export can run into workspace and dynamic-shape memory issues, so fixed input shapes, batch 1, and on-device export / benchmarking are safer for this project than dynamic batch export.
+- **Source**: Source #18
+- **Confidence**: Medium
+- **Related dimension**: Tier 1 latency
+- **Fit impact**: Adds a hard implementation constraint for the FP16 TensorRT engines.
+
+### Fact #21
+
+- **Statement**: YOLOE supports open-vocabulary detection / segmentation, but TensorRT runtime should not depend on Python open-vocabulary prompt-mutation APIs; MVP runtime should use fixed trained classes or pre-baked class embeddings only.
+- **Source**: Source #19
+- **Confidence**: Medium
+- **Related dimension**: YOLOE exact-fit
+- **Fit impact**: Selects YOLOE-26 only in fixed-class FP16 TensorRT mode, not runtime open-vocabulary mode.
+
+### Fact #22
+
+- **Statement**: OpenCV 4.x provides Lucas-Kanade sparse optical flow (`calcOpticalFlowPyrLK`), feature detection (`goodFeaturesToTrack`), and global-motion estimation APIs that can estimate frame-to-frame background motion before residual moving-object detection.
+- **Source**: Source #16
+- **Confidence**: High
+- **Related dimension**: Movement detector
+- **Fit impact**: Supports selecting telemetry-aided OpenCV ego-motion compensation as the movement baseline.
+
+### Fact #23
+
+- **Statement**: NanoLLM supports model loading through MLC / AWQ / HF APIs with quantisation settings such as `q4f16_ft`, and multimodal chat examples using VILA1.5-3B with image prompts.
+- **Source**: Source #17
+- **Confidence**: High
+- **Related dimension**: VLM confirmation
+- **Fit impact**: Supports selecting NanoLLM + VILA1.5-3B MLC as the lead local VLM candidate, subject to runtime-quality benchmark.
+
+### Fact #24
+
+- **Statement**: VILA1.5-3B on Orin Nano 8 GB is plausible but memory-sensitive; context length, max tokens, crop count, and container / storage footprint must be capped.
+- **Source**: Source #21
+- **Confidence**: Medium
+- **Related dimension**: VLM feasibility
+- **Fit impact**: Requires the VLM process to use bounded crops, short prompts, short answers, and a watchdog.
+
+### Fact #25
+
+- **Statement**: NanoSAM / MobileSAM-style segmentation is useful for ROI mask refinement and annotation assistance, but not as the zoom-out wide-area sweep lead because it still adds an image-encoder cost and prompt dependency.
+- **Source**: Source #20
+- **Confidence**: Medium
+- **Related dimension**: Segmentation fallback
+- **Fit impact**: Marks segmentation foundation models as fallback / annotation-assist, not primary runtime.
+
+### Fact #26
+
+- **Statement**: Behaviour trees improve modularity for large UAV autonomy systems, but this project's scan lifecycle has a small fixed set of states and strict timing, making a typed deterministic state machine simpler for MVP.
+- **Source**: Source #22
+- **Confidence**: Medium
+- **Related dimension**: Scan control
+- **Fit impact**: Selects a deterministic scan state machine with explicit queues / timeouts; behaviour tree remains a later extensibility option (the BT primer in `system-flows.md §F4` is the canonical decomposition the state machine must satisfy).
+
+### Fact #27
+
+- **Statement**: Multiple GPU inference contexts / processes can complicate TensorRT scheduling and memory behaviour on Jetson; the project should centralise GPU scheduling and preserve the restriction that YOLO and VLM do not run concurrently.
+- **Source**: Source #23, project restrictions
+- **Confidence**: Medium
+- **Related dimension**: Integration boundary
+- **Fit impact**: Selects a local IPC VLM process controlled by an integration scheduler, not unmanaged concurrent inference.
+
+### Fact #28
+
+- **Statement**: The first draft under-specified the proof gates that must happen before implementation: Tier 1 latency, VLM memory / latency, ViewPro zoom timing, movement false-positive replay, and all-season dataset readiness.
+- **Source**: `solution_draft01.md` (superseded), `validation_log` (this file §Validation)
+- **Confidence**: High
+- **Related dimension**: Planning readiness
+- **Fit impact**: Adds a required benchmark-gate stage before decomposition / implementation.
+
+### Fact #29
+
+- **Statement**: Secure FastAPI file / image handling should not trust client content-type headers alone and should enforce size limits, validation, authorisation, cleanup, and audit logging.
+- **Source**: Source #24
+- **Confidence**: Medium
+- **Related dimension**: API security
+- **Fit impact**: Adds explicit upload / payload validation requirements for image, ROI, and VLM IPC inputs.
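+
+A minimal sketch of the payload check this fact implies, assuming a small allow-list of image formats. The size limit, the format list, and the function names are illustrative assumptions, not project values. The declared content type is only cross-checked against the payload bytes and is never trusted on its own.
+
```python
from typing import Optional

# Hypothetical limit and allow-list; real values belong to project configuration.
MAX_IMAGE_BYTES = 8 * 1024 * 1024
MAGIC_PREFIXES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def sniff_image_format(payload: bytes) -> Optional[str]:
    """Return the allow-listed format whose signature the payload starts with."""
    for name, prefix in MAGIC_PREFIXES.items():
        if payload.startswith(prefix):
            return name
    return None


def validate_upload(payload: bytes, declared_content_type: str) -> str:
    """Reject oversized, unknown, or mislabelled payloads before any decoder runs."""
    if len(payload) > MAX_IMAGE_BYTES:
        raise ValueError("payload exceeds size limit")
    detected = sniff_image_format(payload)
    if detected is None:
        raise ValueError("payload is not an allow-listed image format")
    # The header is cross-checked only; it is never the sole evidence.
    if declared_content_type != f"image/{detected}":
        raise ValueError("declared content type does not match payload bytes")
    return detected
```
+
+A signature check of this kind does not replace a patched decoder; it only narrows what reaches it.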
+
+### Fact #30
+
+- **Statement**: Local IPC can use Unix-socket filesystem permissions and peer-credential checks such as `SO_PEERCRED` to restrict which local processes may call the VLM service.
+- **Source**: Source #25
+- **Confidence**: Medium
+- **Related dimension**: IPC security
+- **Fit impact**: Replaces vague "local IPC authorisation" with a concrete Unix-socket permission and peer-credential control.
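+
+The peer-credential check can be sketched as follows, assuming a Linux host, where `SO_PEERCRED` returns the peer's pid, uid, and gid as three C integers. The helper names and the uid allow-list policy are illustrative assumptions.
+
```python
import os
import socket
import struct


def peer_credentials(conn: socket.socket) -> tuple:
    """Return (pid, uid, gid) of the process at the other end of a Unix socket (Linux only)."""
    size = struct.calcsize("3i")
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack("3i", raw)


def same_user_allowlist() -> frozenset:
    """Hypothetical default policy: only processes running under the service's own uid."""
    return frozenset({os.getuid()})


def is_allowed_caller(conn: socket.socket, allowed_uids: frozenset) -> bool:
    """Accept the connection only if the peer runs under an allow-listed uid."""
    _pid, uid, _gid = peer_credentials(conn)
    return uid in allowed_uids
```
+
+Filesystem permissions on the socket path remain the first barrier; the credential check is a second, per-connection one.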
+
+### Fact #31
+
+- **Statement**: Production LLM / VLM integrations should validate or constrain outputs against a schema before downstream use, because free-form text is not a stable API contract.
+- **Source**: Source #26
+- **Confidence**: Medium
+- **Related dimension**: VLM output reliability
+- **Fit impact**: Adds a structured `VlmAssessment` schema and retry / fail-closed behaviour.
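+
+A sketch of the validation step, assuming the VLM is prompted to answer in JSON. Only the schema name `VlmAssessment` comes from this document; the field set is reduced and the label values are hypothetical. Anything that does not validate yields `None`, so the caller can retry or fail closed.
+
```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical label set; the real enum is defined by the project schema.
ALLOWED_LABELS = {"confirmed", "rejected", "uncertain"}


@dataclass(frozen=True)
class VlmAssessment:
    label: str
    confidence: float
    reason: str


def parse_assessment(raw_text: str) -> Optional[VlmAssessment]:
    """Return a validated assessment, or None so the caller can retry or fail closed."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    label = data.get("label")
    confidence = data.get("confidence")
    reason = data.get("reason")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        return None
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    if not isinstance(reason, str) or not reason.strip():
        return None
    return VlmAssessment(label=label, confidence=float(confidence), reason=reason.strip())
```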
+
+### Fact #32
+
+- **Statement**: Sensor-fusion systems use correlated timestamps to align camera frames and telemetry; movement detection should define maximum tolerated skew between video frames, gimbal state, and UAV motion data.
+- **Source**: Source #27
+- **Confidence**: Medium
+- **Related dimension**: Telemetry synchronisation
+- **Fit impact**: Adds a telemetry-synchronisation contract before movement detection can claim compensation correctness.
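+
+One way to express the maximum-skew rule, assuming frames and telemetry samples share a clock and telemetry timestamps arrive sorted. The 20 ms bound is a placeholder, not a measured or agreed value, and the function name is illustrative.
+
```python
from bisect import bisect_left
from typing import Optional, Sequence

# Hypothetical bound; the real maximum skew must come from the telemetry-sync contract.
MAX_SKEW_S = 0.020


def nearest_sample_index(sample_times: Sequence[float], frame_time: float) -> Optional[int]:
    """Index of the telemetry sample closest to the frame, or None if outside tolerance.

    `sample_times` must be sorted ascending and share a clock with `frame_time`.
    """
    if not sample_times:
        return None
    pos = bisect_left(sample_times, frame_time)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(sample_times)]
    best = min(candidates, key=lambda i: abs(sample_times[i] - frame_time))
    if abs(sample_times[best] - frame_time) > MAX_SKEW_S:
        return None
    return best
```
+
+A frame that returns `None` here has no trustworthy telemetry, so it should not feed ego-motion compensation.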
+
+### Fact #33
+
+- **Statement**: TensorRT performance must be measured under the actual model configuration and scheduler; documentation-level export support does not prove end-to-end latency with multiple engines and preprocessing.
+- **Source**: Source #28
+- **Confidence**: High
+- **Related dimension**: GPU scheduling
+- **Fit impact**: Strengthens the central GPU scheduler and benchmark gate.
+
+### Fact #34
+
+- **Statement**: OpenCV image decoders have had critical crafted-image vulnerabilities in recent 4.x versions, including CVE-2025-53644 affecting 4.10.0 / 4.11.0 and patched in 4.12.0.
+- **Source**: Source #29
+- **Confidence**: High
+- **Related dimension**: Image-processing security
+- **Fit impact**: Requires patched OpenCV version and image-format allow-list for untrusted inputs.
+
+### Fact #35
+
+- **Statement**: The existing `main.py` swallows refresh / posting / detection exceptions in several paths and returns healthy status even when inference initialisation fails, which would hide critical runtime failures in the expanded system.
+- **Source**: `../detections/main.py` (existing detections service)
+- **Confidence**: High
+- **Related dimension**: Observability and reliability
+- **Fit impact**: Adds a reliability task to replace silent exception handling in touched service paths.
+
+### MVE: Ultralytics YOLO26 / YOLOE-26 in fixed-class TensorRT FP16 mode
+
+- **Source**: Source #15, Source #19
+- **Pinned mode**: Custom-trained YOLO26 detector and YOLOE-26 segmentation / detection engines exported as TensorRT FP16 with fixed project classes, batch 1, fixed 1280 px input, no runtime open-vocabulary prompt mutation.
+- **Inputs in the example**: Image input passed to `YOLO("yolo26n.pt")`, exported with `model.export(format="engine", half=True)`, then loaded as `.engine` for prediction.
+- **Outputs in the example**: Detection / segmentation results from a TensorRT engine.
+- **Project inputs**: 1080p UAV frames or tiles resized / split for 1280 px model input.
+- **Project outputs required**: Normalized boxes for primitives and operator display; optional masks / polylines for path / branch reasoning.
+- **Match assessment**: Exact API / deployment match for fixed-class TensorRT FP16 engines; runtime open-vocabulary YOLOE behaviour is rejected.
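+
+The pinned export mode can be sketched as below. `format`, `half`, `imgsz`, `batch`, and `dynamic` are Ultralytics export arguments; the helper names are illustrative, and the export call itself needs the `ultralytics` package and an NVIDIA GPU, so it is kept in a separate function with a lazy import.
+
```python
def fixed_shape_export_args(imgsz: int = 1280) -> dict:
    """Export arguments matching the pinned mode: FP16, batch 1, static shape."""
    return {
        "format": "engine",  # TensorRT engine
        "half": True,        # FP16 precision
        "imgsz": imgsz,      # fixed model input size
        "batch": 1,
        "dynamic": False,    # no dynamic-shape export
    }


def export_engine(weights: str = "yolo26n.pt"):
    """Export on the target device; returns whatever `model.export` returns."""
    from ultralytics import YOLO  # lazy import: the helper above works without it

    model = YOLO(weights)
    return model.export(**fixed_shape_export_args())
```
+
+Running `export_engine` on the Jetson itself, rather than on a workstation, keeps the engine tied to the hardware it will be benchmarked on.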
+
+#### Restrictions and AC binding — YOLO26 / YOLOE-26 fixed-class FP16
+
+| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
+|---|---|---|---|
+| FP16 precision | TensorRT export supports `half=True`. | Pass | Fact #19 |
+| TensorRT primary / ONNX fallback | TensorRT engine export is documented; ONNX remains project fallback. | Pass | Fact #19 |
+| 1280 px input | Export supports `imgsz`; exact latency requires benchmark. | Pass for API; runtime gate | Fact #19, Fact #20 |
+| ≤100 ms/frame Tier 1 | API can run TensorRT FP16; runtime quality must be measured end-to-end. | Pass with runtime-quality gate | Fact #20 |
+| Normalized boxes output | YOLO result conversion can preserve existing normalized-box DTO contract. | Pass | Fact #16 |
+| No degradation of existing classes | Requires validation, not an API capability. | Pass with runtime-quality gate | Fact #9 |
+| All seasons MVP | Requires dataset / training coverage, not an API capability. | Pass with data-quality gate | Fact #8 |
+
+### MVE: OpenCV telemetry-aided ego-motion compensation
+
+- **Source**: Source #16
+- **Pinned mode**: OpenCV 4.x sparse optical flow + feature tracking + global-motion estimation, fused with timestamped gimbal angle / zoom and UAV motion telemetry before residual moving-candidate extraction.
+- **Inputs in the example**: Consecutive video frames converted to grayscale; features from `goodFeaturesToTrack`; tracked points from `calcOpticalFlowPyrLK`.
+- **Outputs in the example**: Matched point trajectories and estimated motion between frames.
+- **Project inputs**: 1080p zoom-out frame sequences plus timestamped gimbal / UAV telemetry; zoom-in frame sequences for the per-zoom-band benchmark.
+- **Project outputs required**: Small residual moving point / cluster candidate boxes queued within 1 s.
+- **Match assessment**: Exact match for ego-motion compensation primitives; project-specific candidate thresholds require benchmark.
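+
+The residual step can be illustrated without OpenCV. The sketch below assumes matched point pairs are already available (in the pinned mode they would come from `goodFeaturesToTrack` and `calcOpticalFlowPyrLK`) and approximates global motion as the median translation, which is a deliberate simplification of an affine or homography model. The function name and the pixel threshold are assumptions.
+
```python
from statistics import median
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def residual_movers(prev_pts: Sequence[Point], next_pts: Sequence[Point],
                    threshold_px: float = 3.0) -> List[int]:
    """Indices of tracked points whose motion departs from the background shift.

    Global motion is approximated as the median translation of all points.
    """
    if len(prev_pts) != len(next_pts):
        raise ValueError("point lists must be matched one-to-one")
    if not prev_pts:
        return []
    dx = [n[0] - p[0] for p, n in zip(prev_pts, next_pts)]
    dy = [n[1] - p[1] for p, n in zip(prev_pts, next_pts)]
    gx, gy = median(dx), median(dy)  # estimated platform-induced shift
    movers = []
    for i, (ddx, ddy) in enumerate(zip(dx, dy)):
        residual = ((ddx - gx) ** 2 + (ddy - gy) ** 2) ** 0.5
        if residual > threshold_px:
            movers.append(i)
    return movers
```
+
+A pure translation model ignores rotation and zoom change, which is one reason the telemetry fusion and per-zoom-band thresholds remain benchmark-gated.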
+
+#### Restrictions and AC binding — OpenCV movement detector
+
+| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
+|---|---|---|---|
+| Movement detection at zoom-out + zoom-in | OpenCV frame-to-frame processing applies at both zoom levels with per-zoom-band thresholds. Classical-CV adequacy at zoom-in is benchmark-gated; if the FP cap fails, fall back per Q14. | Pass with runtime-quality gate | Fact #22 |
+| Compensate UAV / gimbal motion | Optical flow / global motion plus telemetry directly supports compensation. | Pass | Fact #6, Fact #22 |
+| Enqueue within 1 s | CPU / GPU cost depends on implementation; API supports required operations. | Pass with runtime-quality gate | Fact #22 |
+| Stable objects must not be flagged as moving due to platform motion | Compensation design directly targets this failure mode. | Pass | Fact #6 |
+| Timestamped telemetry available | User confirmed full telemetry is available for MVP. | Pass | User decision |
+
+### MVE: NanoLLM VILA1.5-3B local VLM ROI confirmation
+
+- **Source**: Source #17, Source #21
+- **Pinned mode**: NanoLLM multimodal chat with MLC backend, `Efficient-Large-Model/VILA1.5-3b`, quantised mode such as `q4f16_ft`, one bounded ROI crop, short prompt, short answer.
+- **Inputs in the example**: Image-path prompt plus text prompt, e.g. `--prompt '/data/images/lake.jpg' --prompt 'please describe the scene.'`.
+- **Outputs in the example**: Natural-language generated answer from the VLM.
+- **Project inputs**: zoom-in ROI crop around path endpoint, branch pile, dark entrance, dugout, person, or vehicle candidate.
+- **Project outputs required**: Confirmation label / reason that can be converted to POI metadata and operator display-box status.
+- **Match assessment**: Exact API capability match for image+text ROI reasoning; latency and memory are runtime-quality gates.
+
+#### Restrictions and AC binding — NanoLLM VILA1.5-3B
+
+| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
+|---|---|---|---|
+| Local VLM, no cloud | NanoLLM runs local models. | Pass | Fact #23 |
+| Separate IPC process | NanoLLM can run as a separate process / container invoked by local IPC. | Pass | Fact #23, Fact #27 |
+| Sequential with YOLO | Scheduler can enforce no concurrent GPU execution. | Pass | Fact #27 |
+| ≤5 s/ROI | API can process image prompts; exact latency must be benchmarked on Jetson. | Pass with runtime-quality gate | Fact #24 |
+| ≤6 GB remaining RAM | Quantised mode is supported; exact memory must be benchmarked with YOLO container present. | Pass with runtime-quality gate | Fact #23, Fact #24 |
+| MVP requires VLM if benchmark passes | User-confirmed policy. | Pass | User decision |
+
+## Component fit matrix
+
+### Top-level component fit matrix
+
+| Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
+|---|---|---|---|---|---|---|---|---|
+| Tier 1 primitive detection | Ultralytics YOLO26 + YOLOE-26 | Fixed-class TensorRT FP16 engines, batch 1, fixed 1280 px input, no runtime open-vocabulary prompt mutation | Established/open-source + current SOTA | Fast zoom-out primitive boxes/masks for paths, roads, trees, branch piles, entrances | MVE above; docs: Source #15, #19 | Runtime open-vocabulary TensorRT APIs rejected; dynamic batch rejected | Selected with runtime-quality gate | Best fit with existing TensorRT/Cython service and FP16 restriction. |
+| Tier 2 semantic analyzer | Primitive graph + lightweight custom CNN | ROI crop and Tier 1 primitives → POI score, path freshness, endpoint, concealment candidate | Simple baseline + custom model | Confirm and reason over primitives within ≤200 ms/ROI | Facts #10, #17, #25 | None at API level; data-quality gate remains | Selected | Keeps reasoning explainable and faster than VLM-first confirmation. |
+| Movement detection | OpenCV 4.x optical flow / global motion + timestamped UAV / gimbal telemetry | Zoom-out and zoom-in frame pairs plus telemetry → residual moving-point / cluster boxes (per-zoom-band thresholds) | Established production baseline | Detect moving candidates while rejecting platform-induced motion at both zoom levels | MVE above; docs: Source #16 | Video-only mode is not selected for MVP. Zoom-in classical CV is benchmark-gated; learned fallback per Q14 if the FP cap fails. | Selected with runtime-quality gate | Directly matches user-confirmed telemetry and movement restrictions. |
+| Tier 3 VLM confirmation | NanoLLM + VILA1.5-3B | MLC backend, quantised mode such as `q4f16_ft`, one bounded ROI crop, short prompt, short response | Open-source edge VLM | Local confirmation of endpoint / branch-pile / entrance / dugout ROI | MVE above; docs: Source #17, #21 | Must pass ≤5 s/ROI and memory gate; otherwise smaller-VLM fallback | Selected with runtime-quality gate | Satisfies local / no-cloud / VLM-required policy if benchmark passes. |
+| Scan control | Typed deterministic state machine | `ZoomedOut`, `ZoomedIn`, `TargetFollow` states with POI queue, timeouts, target-loss, gimbal command adapters | Simple baseline | Camera sweep, zoom, POI servicing, target follow | Source #4, #22, Fact #26 | Behaviour tree deferred (canonical decomposition kept in `system-flows.md §F4`) | Selected | Small fixed lifecycle favours deterministic timing and testability. |
+| Integration boundary | Existing FastAPI / Cython YOLO core + `scan_controller` scheduler + local IPC VLM process | Normalized-box contract + POI metadata; central GPU scheduler enforces sequential YOLO / VLM | Established production pattern | Integrate modules without compiling VLM into Cython | Fact #16, #27 | Unmanaged multiprocessing / concurrent GPU rejected | Selected | Preserves existing service and isolates memory-heavy VLM. |
+
+### Sub-matrix: YOLO26 / YOLOE-26 fixed-class TensorRT FP16
+
+| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
+|---|---|---|---|
+| Hardware: Jetson Orin Nano Super 8 GB | TensorRT FP16 is an NVIDIA GPU deployment path; memory must be benchmarked. | Pass with runtime-quality gate | Fact #1, Fact #20 |
+| FP16 precision | Uses `half=True` TensorRT export. | Pass | Fact #19 |
+| 1280 px model input | Export supports image-size configuration; use fixed 1280 px / batch 1. | Pass | Fact #19 |
+| Existing tile splitting | Candidate accepts image / tiles and returns detections per tile. | Pass | Fact #16, Fact #19 |
+| YOLO and VLM sequential | Tier 1 runs before VLM; scheduler prevents concurrency. | Pass | Fact #27 |
+| Output normalized boxes | Existing DTO contract can wrap candidate outputs. | Pass | Fact #16 |
+| New primitive classes | Fixed custom classes support the required primitive set. | Pass | Fact #19, Fact #21 |
+| P ≥80 %, R ≥80 % and no degradation | Model API supports training / validation; actual performance is data / runtime quality. | Pass with runtime-quality gate | Fact #8, Fact #9 |
+| All-season MVP | Requires dataset coverage rather than API feature. | Pass with data-quality gate | Fact #8, user confirmation |
+
+### Sub-matrix: Primitive graph + lightweight CNN
+
+| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
+|---|---|---|---|
+| Tier 2 ≤200 ms/ROI | Bounded ROI crop and lightweight CNN / rules keep workload limited. | Pass with runtime-quality gate | Fact #10, Fact #17 |
+| Consumes YOLO primitives | Candidate uses primitive boxes / masks as primary input. | Pass | Fact #10 |
+| Path freshness and endpoint tracing | Graph / path model represents path continuity and endpoint scoring. | Pass | Fact #10, Fact #17 |
+| Branch choice at intersections | Queue / path scorer can select freshest / most promising branch by configured score. | Pass | Fact #10 |
+| VLM sequentiality | Candidate can run before VLM and invoke VLM only after endpoint hold. | Pass | Fact #27 |
+
+### Sub-matrix: OpenCV telemetry-aided movement detector
+
+| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
+|---|---|---|---|
+| Both zoom levels | Runs during both zoom-out and zoom-in scan states. | Pass with runtime-quality gate | Fact #22 |
+| Wide / light / medium zoom | Candidate consumes only selected zoom-state frames. | Pass | User confirmation, Fact #22 |
+| Timestamped video / gimbal / UAV telemetry | User confirmed full telemetry is available for MVP. | Pass | User decision |
+| Compensate UAV / gimbal motion | Optical flow / global motion plus telemetry estimate ego-motion before residuals. | Pass | Fact #6, Fact #22 |
+| Enqueue within 1 s | Candidate operations support streaming implementation; exact latency is runtime quality. | Pass with runtime-quality gate | Fact #22 |
+| Stable objects not treated as moving | Ego-motion compensation directly addresses this failure mode. | Pass | Fact #6 |
+| Output normalized movement boxes | Residual clusters can be converted to normalized candidate boxes. | Pass | Fact #16 |
+
+### Sub-matrix: NanoLLM VILA1.5-3B
+
+| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
+|---|---|---|---|
+| Local VLM, no cloud | Runs local model through NanoLLM. | Pass | Fact #23 |
+| Separate IPC process | Candidate can run as an isolated process / container behind local IPC. | Pass | Fact #23, Fact #27 |
+| Sequential with YOLO | Scheduler grants VLM GPU slot only after YOLO / Tier 2 work. | Pass | Fact #27 |
+| ≤5 s/ROI | API supports image+text prompt; exact latency is runtime quality. | Pass with runtime-quality gate | Fact #23, Fact #24 |
+| ≤6 GB remaining RAM | Quantised mode supports smaller memory footprint; exact budget is runtime quality. | Pass with runtime-quality gate | Fact #23, Fact #24 |
+| Required for MVP if benchmark passes | User-confirmed policy. | Pass | User decision |
+| Output usable for operator display | Text confirmation can be converted into POI metadata while display box comes from Tier 1 / 2. | Pass | Fact #23 |
+
+### Sub-matrix: scan controller state machine
+
+| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
+|---|---|---|---|
+| Zoom-out route sweep | State machine owns sweep pattern and POI queueing. | Pass | Fact #26 |
+| Zoom-out → zoom-in ≤2 s | State machine can command transition; physical zoom timing must be measured. | Pass with runtime-quality gate | Fact #5 |
+| Zoom-in lock, pan, hold, timeout | Explicit states encode lock, follow path, endpoint hold, VLM request, timeout, return. | Pass | Fact #26 |
+| Target-follow centre 25 % | Target-follow state can enforce centre-window metric. | Pass | Source #4, Fact #26 |
+| Decision-to-movement ≤500 ms | Controller can timestamp commands; physical / protocol latency is runtime quality. | Pass with runtime-quality gate | Fact #4 |
+| Ordered POI queue with confidence / proximity | Queue orders POIs by confidence / proximity and can also apply the user-confirmed ≤5 POIs/minute cap and ageing. | Pass | User decision |
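+
+A typed transition table is one way to make the lifecycle above explicit. Only the three state names come from this document; the event names and the exact set of allowed transitions are assumptions for illustration.
+
```python
from enum import Enum


class ScanState(Enum):
    ZOOMED_OUT = "ZoomedOut"
    ZOOMED_IN = "ZoomedIn"
    TARGET_FOLLOW = "TargetFollow"


# Hypothetical events and transitions; only the three states come from the design.
TRANSITIONS = {
    (ScanState.ZOOMED_OUT, "poi_dequeued"): ScanState.ZOOMED_IN,
    (ScanState.ZOOMED_IN, "hold_complete"): ScanState.ZOOMED_OUT,
    (ScanState.ZOOMED_IN, "timeout"): ScanState.ZOOMED_OUT,
    (ScanState.ZOOMED_IN, "operator_confirmed"): ScanState.TARGET_FOLLOW,
    (ScanState.TARGET_FOLLOW, "target_lost"): ScanState.ZOOMED_OUT,
    (ScanState.TARGET_FOLLOW, "operator_released"): ScanState.ZOOMED_OUT,
}


def step(state: ScanState, event: str) -> ScanState:
    """Apply one event; undefined (state, event) pairs are rejected, never ignored."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.value}") from None
```
+
+Because every legal transition is a table entry, a test can enumerate all of them, which is the testability argument made for the state machine over a behaviour tree.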
+
+### Sub-matrix: integration scheduler and IPC
+
+| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
+|---|---|---|---|
+| Extend existing FastAPI + Cython service | Keeps existing YOLO core and adds scheduler / adapters around it. | Pass | Fact #16 |
+| VLM separate IPC | VLM remains outside Cython and communicates locally. | Pass | Fact #23, Fact #27 |
+| No concurrent YOLO / VLM GPU inference | Central scheduler serializes GPU-heavy work. | Pass | Fact #27 |
+| Same normalized-box output | Integration layer preserves current DTOs and adds POI metadata separately. | Pass | Fact #16 |
+| GPS-denied coordinates out of scope | Scheduler stores external coordinate references but does not estimate them. | Pass | Project restrictions |
+| Annotation / training separate repos | Integration consumes trained-model artefacts and label schema only. | Pass | Project restrictions |
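+
+The "no concurrent YOLO / VLM GPU inference" row can be sketched as a single-slot scheduler. This is an in-process illustration with assumed names; in the design the VLM is a separate process, so the scheduler would hold the slot while it waits on the IPC call.
+
```python
import threading
from contextlib import contextmanager


class GpuSlot:
    """Grants the single GPU slot to one workload at a time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.holder = None  # name of the workload currently holding the slot

    @contextmanager
    def acquire(self, workload: str, timeout_s: float):
        # A bounded wait keeps a stuck workload from blocking the scan forever.
        if not self._lock.acquire(timeout=timeout_s):
            raise TimeoutError(f"{workload} could not get the GPU slot in {timeout_s} s")
        self.holder = workload
        try:
            yield
        finally:
            self.holder = None
            self._lock.release()
```
+
+A caller wraps each inference in `with slot.acquire("yolo", timeout_s=...)`; a timeout surfaces as an explicit error instead of silent concurrent execution.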
+
+### Mode B revised fit additions
+
+These rows extend the top-level matrix with the cross-cutting concerns surfaced during the second draft of the solution.
+
+| Component Area | Candidate | Pinned Mode/Config | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
+|---|---|---|---|---|---|---|---|
+| Benchmark gate | Pre-implementation proof suite | Hardware-in-loop and replay benchmarks for Tier 1, Tier 2, VLM, A40 zoom, movement, all-season data readiness | Prevent implementation from depending on unproven runtime-quality assumptions | Facts #28, #33 | None | Selected | Converts draft caveats into explicit go / no-go gates. |
+| Telemetry sync contract | Frame / telemetry alignment layer | Timestamped frame, gimbal angle, zoom, and UAV motion samples with maximum-skew validation | Make movement compensation testable and reproducible | Fact #32 | Telemetry-missing MVP rejected by user decision | Selected | Required for movement-detector exact fit. |
+| VLM output contract | Structured `VlmAssessment` schema | Label enum, confidence, evidence spans, reason text, timeout / error status; validate before accepting | Prevent free-form VLM text from becoming an unstable API | Fact #31 | Raw free-text VLM output rejected | Selected | Needed for operator display and downstream logs. |
+| IPC security | Unix-domain socket permissions + peer credentials | Local socket with filesystem permissions, peer-credential check, payload-size limits | Restrict local VLM callers and bound payload abuse | Fact #30 | Unauthenticated localhost HTTP rejected for VLM control plane | Selected | Local-only is not sufficient without local authorisation. |
+| Input security | Image / ROI payload validation | MIME / format allow-list, size limits, patched OpenCV, decode sandbox where practical | Reduce crafted-input and resource-exhaustion risk | Fact #29, Fact #34 | Trusting headers / client filenames rejected | Selected | Existing service will process more image / ROI inputs. |
+| Service reliability | Explicit errors and health semantics | No silent exception swallowing in touched paths; health reflects inference / scheduler / VLM availability | Make failures visible during scans and tests | Fact #35 | "Always healthy" failure masking rejected | Selected | Required before expanding mission-critical behaviour. |
+
+## Validation
+
+### Validation scenario
+
+A winged UAV flies a planned route at 600–1000 m over mixed winter forest and field terrain. In `ZoomedOut`, the camera sweeps left-right at wide / light zoom. The system detects a faint footpath and a small moving dot, queues both, zooms to the path endpoint within 2 s (entering `ZoomedIn`), keeps the endpoint centred while the UAV moves, asks the local VLM for a bounded confirmation, then returns to `ZoomedOut`. Later, an operator confirms a target and `TargetFollow` mode keeps it in the centre 25 % of frame.
+
+### Expected behaviour (based on conclusions)
+
+- Tier 1 emits primitive boxes / masks for path, branch pile, road / tree context, and fixed known object classes.
+- Movement detector compensates gimbal / UAV ego-motion with telemetry and optical flow before residual moving-cluster extraction.
+- `scan_controller` queues POIs by confidence / proximity plus ageing and enforces ≤5 POIs/minute operator-review budget.
+- Zoom-in and endpoint hold run through a deterministic state machine with timeouts and target-loss handling.
+- VLM runs only on bounded ROI crops through local IPC and only when the scheduler grants the GPU slot.
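+
+The queueing rule in the list above (confidence / proximity ordering, ageing, and the ≤5 POIs/minute budget) could look like the sketch below. The scoring formula, the ageing weight, and the class and field names are assumptions; only the 5-per-minute cap comes from this document.
+
```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional

MAX_POIS_PER_MINUTE = 5    # operator-review budget from the design
AGEING_BONUS_PER_S = 0.01  # hypothetical ageing weight


@dataclass
class Poi:
    name: str
    confidence: float  # 0..1 from the detector
    proximity: float   # 0..1, higher means closer to the current view (assumed scale)
    queued_at: float   # seconds on the scheduler clock


class PoiQueue:
    def __init__(self) -> None:
        self._items: List[Poi] = []
        self._served: Deque[float] = deque()  # service times within the last minute

    def push(self, poi: Poi) -> None:
        self._items.append(poi)

    def _score(self, poi: Poi, now: float) -> float:
        # Ageing raises the score of waiting items so they are not starved.
        return poi.confidence + poi.proximity + AGEING_BONUS_PER_S * (now - poi.queued_at)

    def pop(self, now: float) -> Optional[Poi]:
        """Return the best POI, or None if the queue is empty or the budget is spent."""
        while self._served and now - self._served[0] >= 60.0:
            self._served.popleft()
        if not self._items or len(self._served) >= MAX_POIS_PER_MINUTE:
            return None
        best = max(self._items, key=lambda p: self._score(p, now))
        self._items.remove(best)
        self._served.append(now)
        return best
```
+
+A linear scan keeps the sketch short; because scores change with time, a heap would need re-scoring on every pop anyway.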
+
+### Actual validation results
+
+The architecture is internally consistent with the researched constraints and user confirmations. Runtime quality still requires hardware validation:
+
+1. Tier 1 end-to-end frame latency for fixed-shape YOLO26 + YOLOE-26 FP16 engines at 1280 px.
+2. ViewPro A40 medium-to-high zoom transition under the selected control protocol.
+3. Movement false-positive rate with timestamped telemetry and representative zoom-out panning, plus zoom-in tracking. Both must satisfy per-zoom-band caps.
+4. NanoLLM VILA1.5-3B ≤5 s/ROI and memory budget while the existing YOLO container is present.
+5. All-season validation coverage and hard-negative mining.
+
+### Counterexamples
+
+- If YOLOE TensorRT requires runtime prompt mutation for the chosen classes, it is not a valid MVP runtime path; use fixed trained classes only.
+- If VILA1.5-3B misses memory or latency gates, MVP cannot claim VLM-required acceptance until a smaller local VLM passes the same API and runtime gates. In that case, `scan_controller` operates with VLM disabled per the optionality model in `architecture.md §7.6 Local VLM confirmation`.
+- If telemetry is unavailable or unsynchronised, the movement detector must degrade to stabilised video-only mode and should not claim the zoom-out movement criteria.
+
+### Review checklist
+
+- [x] Draft conclusions are consistent with fact cards.
+- [x] Important dimensions include hardware, model runtime, movement compensation, scan control, data, security, and integration boundaries.
+- [x] No selected runtime component depends on cloud services.
+- [x] No selected TensorRT YOLOE path depends on runtime open-vocabulary prompt mutation.
+- [x] Runtime-quality gates are separated from API capability gates.
+- [x] All selected components match the project constraint matrix at the API / architecture level.
+
+### Conclusions requiring revision
+
+None at research-draft level. Hardware benchmark failures may revise the selected model variants during planning or implementation.
+
+## References (source registry)
+
+Access date for web sources: 2026-05-06.
+
+### Source #1 — Jetson Orin Nano Super Developer Kit
+
+- **Link**: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit
+- **Tier**: L1 (vendor primary)
+- **Version info**: 67 INT8 TOPS, 8 GB LPDDR5, 102 GB/s
+- **Boundary match**: Full match
+- **Summary**: Official NVIDIA page for Jetson Orin Nano Super Developer Kit confirms 67 INT8 TOPS, 8 GB 128-bit LPDDR5 at 102 GB/s, 7–25 W power, generative-AI edge positioning.
+- **Used for**: Latency and memory feasibility (Facts #1, #12, #13).
+
+### Source #2 — NVIDIA JetPack 6.2 Super Mode
+
+- **Link**: https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/
+- **Tier**: L2 (vendor blog)
+- **Version info**: JetPack 6.2 / Super Mode
+- **Boundary match**: Full match
+- **Summary**: Describes the software-enabled Super Mode performance gain and applicable power modes for Jetson Orin Nano.
+- **Used for**: Reproducibility constraint (Fact #2).
+
+### Source #3 — Ultralytics Jetson / TensorRT deployment
+
+- **Link**: https://docs.ultralytics.com/yolov5/jetson_nano/
+- **Tier**: L1 (vendor docs)
+- **Boundary match**: Partial overlap
+- **Summary**: Official Ultralytics documentation for Jetson deployment and TensorRT export.
+- **Used for**: Tier 1 latency / deployment feasibility (Fact #3).
+
+### Source #4 — ViewPro A40 Pro official spec
+
+- **Link**: https://www.viewprotech.com/index.php?ac=article&at=read&did=561
+- **Tier**: L1 (vendor primary)
+- **Version info**: A40 Pro
+- **Boundary match**: Full match
+- **Summary**: 1080p output, 40× optical zoom, 4.25–170 mm focal range, 30 Hz tracking deviation update, <30 ms deviation output delay, 5×5 px minimum AI target size for built-in AI.
+- **Used for**: Camera and gimbal control feasibility (Facts #4, #5).
+
+### Source #5 — MONA: Moving Object Detection from Videos Shot by Dynamic Camera
+
+- **Link**: https://arxiv.org/html/2501.13183v1
+- **Tier**: L1 (peer-reviewed venue / arXiv)
+- **Boundary match**: Partial overlap
+- **Summary**: Optical flow + tracking-any-point + adaptive bounding-box filtering + segmentation for moving-camera object detection.
+- **Used for**: Movement detection under moving camera (Facts #6, #7).
+
+### Source #6 — Moving Object Detection from Moving Camera using Focus of Expansion Likelihood and Segmentation
+
+- **Link**: https://arxiv.org/html/2507.13628v1
+- **Tier**: L1
+- **Boundary match**: Partial overlap
+- **Summary**: Optical flow, focus-of-expansion likelihood, segmentation priors for moving-camera object detection.
+- **Used for**: Movement detection under moving camera (Facts #6, #7).
+
+### Source #7 — RFAG-YOLO: receptive-field attention-guided YOLO for small-object detection in UAV images
+
+- **Link**: https://pmc.ncbi.nlm.nih.gov/articles/PMC11991089/
+- **Tier**: L1
+- **Boundary match**: Partial overlap
+- **Summary**: UAV small-object detection difficulties; improvements from receptive-field attention.
+- **Used for**: Detection-quality realism (Facts #8, #9).
+
+### Source #8 — YOLO-CAM: lightweight UAV detector with combined attention for small targets
+
+- **Link**: https://www.mdpi.com/2072-4292/17/21/3575
+- **Tier**: L1
+- **Boundary match**: Partial overlap
+- **Summary**: Lightweight UAV small-target detection using attention.
+- **Used for**: Detection-quality realism (Facts #8, #9).
+
+### Source #9 — Accurate Natural Trail Detection (DNN + Dynamic Programming)
+
+- **Link**: https://mdpi-res.com/d_attachment/sensors/sensors-18-00178/article_deploy/sensors-18-00178.pdf?version=1515603128
+- **Tier**: L1
+- **Boundary match**: Reference only
+- **Summary**: Trail detection using DNN + dynamic programming; supports path-as-structured-perception view.
+- **Used for**: Footpath / trail detection (Fact #10).
+
+### Source #10 — Large-Scale Interactive Object Segmentation with Human Annotators
+
+- **Link**: https://arxiv.org/pdf/1903.10830
+- **Tier**: L1
+- **Boundary match**: Reference only
+- **Summary**: Interactive segmentation faster than manual polygon annotation while maintaining quality.
+- **Used for**: Annotation effort (Fact #14).
+
+### Source #11 — Scaling up Instance Annotation via Label Propagation
+
+- **Link**: http://scaling-anno.csail.mit.edu/
+- **Tier**: L1
+- **Boundary match**: Reference only
+- **Summary**: Label propagation reducing annotation effort for video / object masks.
+- **Used for**: Annotation effort for movement sequences (Fact #15).
+
+### Source #12 — Getting Started with VLM on Jetson Nano
+
+- **Link**: https://learnopencv.com/vlm-on-jetson-nano/
+- **Tier**: L3 (third-party tutorial)
+- **Boundary match**: Partial overlap
+- **Summary**: Small VLMs can run on Jetson-class hardware with careful runtime / memory tuning.
+- **Used for**: Local VLM feasibility (Fact #12).
+
+### Source #13 — NVIDIA Jetson AI Lab
+
+- **Link**: https://www.jetson-ai-lab.com/
+- **Tier**: L2
+- **Boundary match**: Partial overlap
+- **Summary**: NVIDIA-linked Jetson AI Lab as the official-adjacent source for model-specific local VLM deployment evidence.
+- **Used for**: Local VLM feasibility (Fact #12).
+
+### Source #14 — CAMOUFLAGE-Net / Improved YOLOv7 Tiny / FSEL camouflaged-object detection
+
+- **Link**: https://link.springer.com/article/10.1007/s40031-024-01152-6
+- **Tier**: L1
+- **Boundary match**: Reference only
+- **Summary**: Camouflage detection requires specialised features / attention; cannot be assumed from generic object detection.
+- **Used for**: Concealed-target detection realism (Fact #11).
+
+### Source #15 — Ultralytics YOLO TensorRT integration
+
+- **Link**: https://docs.ultralytics.com/integrations/tensorrt/
+- **Tier**: L1
+- **Boundary match**: Full match
+- **Summary**: Ultralytics export to TensorRT engine format with FP16 through `half=True`; TensorRT as the recommended high-performance NVIDIA deployment path.
+- **Used for**: Tier 1 primitive detector (Facts #19, #21).
+
+### Source #16 — OpenCV 4.x optical-flow / videostab APIs
+
+- **Link**: https://docs.opencv.org/4.x/d4/dee/tutorial_optical_flow.html
+- **Tier**: L1
+- **Boundary match**: Full match
+- **Summary**: Lucas-Kanade optical flow, feature tracking, global-motion estimation APIs useful for ego-motion compensation.
+- **Used for**: Movement detection (Fact #22).
+
+### Source #17 — NanoLLM multimodal documentation
+
+- **Link**: https://github.com/dusty-nv/nanollm/blob/main/docs/multimodal.md
+- **Tier**: L1
+- **Boundary match**: Full match
+- **Summary**: NanoLLM multimodal chat with `Efficient-Large-Model/VILA1.5-3b`, image prompts, MLC quantisation options.
+- **Used for**: Local VLM confirmation (Fact #23).
+
+### Source #18 — TensorRT FP16 YOLO export on Jetson — limitations and issues
+
+- **Link**: https://docs.ultralytics.com/integrations/tensorrt/
+- **Tier**: L1 / L4 mixed (docs + community issues)
+- **Boundary match**: Full match
+- **Summary**: Official docs support FP16 TensorRT export; community results highlight Jetson workspace / dynamic-shape memory issues, mitigated with fixed shapes and careful workspace configuration.
+- **Used for**: Tier 1 latency / export reliability (Fact #20).
+
+### Source #19 — YOLOE open-vocabulary detection and TensorRT export notes
+
+- **Link**: https://v8docs.ultralytics.com/models/yoloe/
+- **Tier**: L1 / L4 mixed
+- **Boundary match**: Partial overlap
+- **Summary**: YOLOE supports open-vocabulary detection / segmentation, but TensorRT-engine use should not rely on runtime prompt APIs; fixed trained classes are safer for MVP runtime.
+- **Used for**: Semantic primitive detector (Fact #21).
+
+### Source #20 — NanoSAM / MobileSAM Jetson Orin Nano segmentation
+
+- **Link**: https://github.com/NVIDIA-AI-IOT/nanosam/
+- **Tier**: L1 / L2
+- **Boundary match**: Partial overlap
+- **Summary**: NanoSAM / MobileSAM as Jetson-optimised segmentation; useful as ROI mask refinement / annotation assist rather than primary sweep model.
+- **Used for**: Segmentation fallback (Fact #25).
+
+### Source #21 — VILA1.5-3B and VLM performance on Jetson Orin Nano
+
+- **Link**: https://dusty-nv.github.io/NanoLLM/multimodal.html
+- **Tier**: L1 / L4 mixed
+- **Boundary match**: Full match
+- **Summary**: VILA1.5-3B documented for NanoLLM multimodal usage; community results warn that Orin Nano 8 GB requires strict context / token / crop limits.
+- **Used for**: VLM feasibility (Fact #24).
+
+### Source #22 — Behaviour trees for UAV autonomy
+
+- **Link**: https://www.sciencedirect.com/science/article/pii/S0921889022000513
+- **Tier**: L1
+- **Boundary match**: Reference only
+- **Summary**: Behaviour-tree literature supports modular, reactive UAV behaviour; this project's zoom-out / zoom-in scan behaviour is small enough for a deterministic FSM first.
+- **Used for**: Scan controller architecture (Fact #26).
+
+### Source #23 — TensorRT concurrency / multiprocessing issue evidence
+
+- **Link**: https://github.com/NVIDIA/TensorRT/issues/2474
+- **Tier**: L4 (community issue tracker)
+- **Boundary match**: Partial overlap
+- **Summary**: Multiple TensorRT engines / processes on one GPU can cause context and performance problems; central GPU scheduler is safer for sequential-inference restriction.
+- **Used for**: Integration boundary / GPU scheduling (Fact #27).
+
+### Source #24 — FastAPI file-upload security references
+
+- **Link**: https://fastapi.tiangolo.com/tutorial/request-files
+- **Tier**: L1 / L3 mixed
+- **Boundary match**: Partial overlap
+- **Summary**: Secure upload handling needs content-type verification beyond headers, size limits, streaming behaviour, cleanup, authorisation, audit logging.
+- **Used for**: Security weak points (Fact #29).
+
+### Source #25 — Unix-domain socket authentication and peer credentials
+
+- **Link**: https://linux.die.net/man/7/unix
+- **Tier**: L1 / L3 mixed
+- **Boundary match**: Partial overlap
+- **Summary**: Local IPC can use filesystem permissions and peer-credential checks (`SO_PEERCRED`) to restrict which processes may connect.
+- **Used for**: VLM IPC security (Fact #30).
+
+### Source #26 — Structured output for LLM / VLM production use
+
+- **Link**: https://docs.vllm.ai/en/v0.6.5/usage/structured_outputs.html
+- **Tier**: L1 / L3 mixed
+- **Boundary match**: Reference only
+- **Summary**: Production systems should constrain or validate model output against schemas before using it in APIs / databases.
+- **Used for**: VLM output reliability (Fact #31).
+
+### Source #27 — NVIDIA / Isaac ROS timestamp synchronisation
+
+- **Link**: https://nvidia-isaac-ros.github.io/v/release-3.2/repositories_and_packages/isaac_ros_nova/isaac_ros_correlated_timestamp_driver/index.html
+- **Tier**: L1 / L2 mixed
+- **Boundary match**: Reference only
+- **Summary**: Jetson sensor-fusion uses hardware / correlated timestamps to reduce synchronisation jitter.
+- **Used for**: Telemetry synchronisation (Fact #32).
+
+### Source #28 — NVIDIA TensorRT performance optimisation
+
+- **Link**: https://docs.nvidia.com/deeplearning/tensorrt/latest/performance/optimization.html
+- **Tier**: L1
+- **Boundary match**: Partial overlap
+- **Summary**: TensorRT performance depends on batching, engine configuration, and runtime scheduling; project-specific latency must be measured under the actual scheduler.
+- **Used for**: GPU scheduler / benchmark gate (Fact #33).
+
+### Source #29 — OpenCV CVE-2025-53644
+
+- **Link**: https://securitylab.github.com/advisories/GHSL-2025-057_OpenCV
+- **Tier**: L1 / L2 mixed
+- **Version info**: OpenCV 4.10.0 / 4.11.0 affected; 4.12.0 patched
+- **Boundary match**: Partial overlap
+- **Summary**: Crafted image inputs caused critical OpenCV decoder vulnerabilities; image-input validation and pinned patched OpenCV versions matter.
+- **Used for**: Image-processing security (Fact #34).
+
+## Solution evolution
+
+The final solution architecture (now in `architecture.md §7.6 Solution Architecture`) evolved from an earlier draft that under-specified several gating concerns. Each row below pairs an old draft component with the weak point that made it insufficient, and the corresponding fix that landed in the final design.
+
+| Old component | Weak point | New solution |
+|---|---|---|
+| Tiered Jetson pipeline with runtime gates | Gates were listed as caveats, not as a concrete pre-implementation stage. | Add a mandatory benchmark gate before implementation decomposition: Tier 1 latency, Tier 2 ROI latency, VLM latency / memory, A40 zoom timing, movement replay, and all-season dataset readiness. |
+| YOLO26 / YOLOE-26 TensorRT FP16 | YOLOE runtime prompt / open-vocabulary behaviour could be accidentally assumed. | Runtime uses only fixed trained classes / pre-baked embeddings in FP16 TensorRT; runtime open-vocabulary mutation remains rejected. |
+| Movement detector with telemetry | Telemetry availability was confirmed, but synchronisation tolerance was not specified. | Add a telemetry-synchronisation contract with frame / gimbal / zoom / UAV timestamps and a maximum tolerated skew before motion compensation. |
+| NanoLLM VLM IPC | Free-form VLM output is not a stable interface for operator-facing decisions. | Add a structured `VlmAssessment` schema, validation, retry / timeout handling, and fail-closed behaviour. |
+| Local VLM process | "Local IPC authorisation" was too vague. | Use Unix-domain socket permissions plus peer-credential checks where available; enforce payload size limits. |
+| FastAPI / image processing surface | The draft did not address file / image payload validation or OpenCV decoder risk. | Add content validation, image-format allow-list, size limits, patched OpenCV requirement, and audit logs. |
+| Existing service integration | Existing code swallows several exceptions and reports healthy status even when inference fails. | Add reliability tasks for touched paths: explicit error propagation, meaningful health, structured failure logs. |
+| Scan controller | Queue cap was present, but not tied to benchmark evidence. | Include ≤5 POIs/minute in replay tests and queue backpressure behaviour. |
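+
+The telemetry-synchronisation contract in the table above can be illustrated with a minimal Rust sketch. It assumes all four timestamps are monotonic nanoseconds taken from one clock. The `SyncStamps` type, its field names, and both function names are hypothetical, and the tolerated skew is a parameter because no value is fixed here.
+
+```rust
+use std::time::Duration;
+
+/// Timestamps (monotonic nanoseconds) for one frame and the telemetry
+/// samples paired with it. Type and field names are hypothetical.
+#[derive(Debug, Clone, Copy)]
+pub struct SyncStamps {
+    pub frame_ns: u64,
+    pub gimbal_ns: u64,
+    pub zoom_ns: u64,
+    pub uav_ns: u64,
+}
+
+/// Largest absolute offset between the frame timestamp and any
+/// telemetry timestamp.
+pub fn max_skew(s: &SyncStamps) -> Duration {
+    let worst = [s.gimbal_ns, s.zoom_ns, s.uav_ns]
+        .iter()
+        .map(|&t| t.abs_diff(s.frame_ns))
+        .max()
+        .unwrap_or(0);
+    Duration::from_nanos(worst)
+}
+
+/// True when the frame may be handed to motion compensation.
+pub fn within_tolerance(s: &SyncStamps, max_tolerated: Duration) -> bool {
+    max_skew(s) <= max_tolerated
+}
+```
+
+A frame that fails `within_tolerance` would be skipped for motion compensation rather than compensated with stale telemetry; that case lines up with the `telemetry_skew_drops_total` counter listed in `observability.md`.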
+
+## Historical seed
+
+This is the original (March 2026) articulation of the semantic-detection problem that the system ultimately addresses. It is preserved here for traceability — it is the seed of the entire `architecture.md §7.1 Problem` narrative. The reference images it mentions (`semantic01.png` … `semantic04.png`) lived in the original problem-side data parameters (deleted on doc consolidation 2026-05-17); they are not duplicated here.
+
+Currently, the system consists mainly of three parts:
+
+1. **AI object detection.** Automatically detects objects in video and images by class, using a pre-trained AI model. Detection is based on visual similarity. The idea is that the UAV can detect objects automatically and act on them. If it is a reconnaissance UAV, it should deliver a short message with the detected image to the operator to confirm the target. The detection process is described in the suite-level detections doc.
+2. **GPS-Denied.** Detection of the current GPS coordinates based on a downward-facing camera and IMU, AI models for optical flow, and pre-downloaded satellite imagery for the route of the plane. Implemented by the suite-level GPS-Denied service (`gps-denied-onboard`).
+3. **Search algorithm.** Before the flight, the operator selects a region and a route. During the flight, the system uses the scanning strategy described in `architecture.md §7.2 Mission Regions and Reconnaissance Flow`.
+
+But this whole workflow has a fundamental flaw, which lies in AI object detection. Regular object detection cannot help with the current frontline situation: it picks up old and already-destroyed vehicles, military ones included, and these have zero value for the system.
+
+Current targets are well hidden and well masked. They are mostly hidden positions of FPV operators, along with well-hidden artillery positions and other masked positions. Simple object detection is not enough, because the main object to search for is a small entrance to a hidden safe place: typically a black circular or square hole near a building, or a dugout masked by tree branches, where personnel or artillery are hidden.
+
+The reference images (deleted on doc consolidation 2026-05-17) showed the typical pattern: a footpath through forest or snow leading to a mass of black colour (mostly tree branches concealing a hideout); footpaths leading to open clearings used as FPV launch points; footpaths terminating at square hideout structures; footpaths terminating at tree-branch concealment.
+
+The main research question that motivated the design: which AI can handle these tasks? Is it possible to instruct an AI to recognise these patterns, follow footsteps (fresh only) and footpaths, analyse the potential hideouts, and flag them to the operator? First, it should pick up footpaths; then distinguish stale from fresh footpaths; then find potential hideouts at the endpoints of the freshest footpaths; then signal potential targets.
+
+This question is now answered by the three-tier perception pipeline (Tier 1 fixed-class YOLO primitives → Tier 2 primitive-graph + lightweight ROI CNN → optional Tier 3 local VLM confirmation), the deterministic `scan_controller` state machine, and the H3-indexed `mapobjects_store`, all documented in `architecture.md §7 Detailed Design`.
@@ -0,0 +1,90 @@
+# CI / CD Pipeline
+
+**Status**: forward-looking design (Rust). Final pipeline file lands during build-system bring-up. The shape below describes the intent.
+
+## 1. Goals
+
+The pipeline must:
+
+- Build the autopilot Rust binary cross-compiled for `aarch64-unknown-linux-gnu`.
+- Run the full Rust test suite (unit + integration + replay-based) on every commit.
+- Run a software-in-the-loop (SITL) conformance gate against an ArduPilot SITL instance (covers `mavlink_layer` + `mission_executor`).
+- Run a benchmark gate on representative target hardware (covers Tier 1 / Tier 2 / VLM / gimbal latency budgets — see `architecture.md §7.6 Benchmark gate`).
+- Sign and publish artefacts (binary + container image) on tagged releases.
+- Never auto-deploy to the airframe. Deployment is a human-driven operation tied to the suite's flight-gate convention (`/run/azaion/in-flight`).
+
+## 2. Pipeline stages
+
+Single Woodpecker pipeline, multi-stage. Stages run sequentially; a failed stage stops the run.
+
+| Stage | Purpose | Notes |
+|---|---|---|
+| **fetch** | Clone, restore Cargo cache | `cargo fetch` with a remote cache key. |
+| **lint** | `cargo fmt --check`, `cargo clippy --all-targets --all-features -- -D warnings` | Hard fail on any warning. |
+| **unit-test** | `cargo test --workspace` (host-arch) | Most logic is platform-independent; runs in parallel on host. |
+| **build-arm64** | Cross-compile for `aarch64-unknown-linux-gnu` | `cross` or `cargo zigbuild` depending on Rust toolchain. Produces the production binary + a debug symbol artefact. |
+| **integration-test** | Replay-based integration tests under emulation | Fixtures: pre-recorded RTSP clip, MAVLink replay, synthetic telemetry. No hardware required. |
+| **sitl-conformance** | ArduPilot SITL conformance gate | Spins up ArduPilot SITL + autopilot binary in a container; runs a fixed mission script; asserts MAVLink command surface (per `architecture.md §7.7`) and geofence enforcement. |
+| **benchmark-gate** *(opt-in, manual / nightly)* | Tier 1 / 2 / VLM / gimbal latency on real Jetson | Runs on a self-hosted Jetson Orin Nano runner. Asserts `architecture.md §6 NFR` budgets. Slow; not on every PR. |
+| **package** | Build container image (Option B from `containerization.md`) | Arch-suffixed tag: `azaion/autopilot:<branch>-arm64`. |
+| **sign** | Sign binary + image | Cosign for the image; OS-vendor signing flow for the binary if used in native deployment. |
+| **publish** | Push image + binary to internal registry | Tagged builds only. |
+
+## 3. Artefacts
+
+| Artefact | Where | Retention |
+|---|---|---|
+| `autopilot` binary (aarch64) | internal artefact store | last 10 builds per branch; tagged builds kept indefinitely |
+| Debug symbols (`.dwp`) | internal artefact store, separate path | matched to binary lifetime |
+| Container image | internal Docker registry | last 10 dev builds; tagged builds kept indefinitely |
+| Cosign signature | next to image | matched to image lifetime |
+| Test logs | CI run | per Woodpecker retention |
+| Benchmark gate report | internal artefact store (Markdown + JSON) | per-tag retention |
+
+## 4. Build matrix
+
+Single matrix entry today:
+
+| Toolchain | Target | Tier-1 dep | VLM feature |
+|---|---|---|---|
+| Rust stable | `aarch64-unknown-linux-gnu` | `../detections` (Cython service consumed via gRPC; not built here) | `cargo build --features vlm` and plain `cargo build`; both must build |
+
+The `--features vlm` and the no-feature path are both built and tested to enforce the optionality contract from `architecture.md §7.6 Local VLM confirmation`.
+
+## 5. SITL conformance gate (in detail)
+
+The stage runs in CI and produces a pass/fail signal that gates merges to `dev`.
+
+**Setup:**
+
+1. Start ArduPilot SITL in a container, listening on `udp://0.0.0.0:14550`.
+2. Start autopilot binary configured for SITL endpoint.
+3. Pre-load a fixture mission via the missions API mock (`mission_client` HTTP target).
+4. Pre-load a fixture RTSP source (looped clip).
+5. Mock the `../detections` service with deterministic detections.
+
+**Assertions:**
+
+- All MAVLink message kinds in `architecture.md §7.7` succeed at least once.
+- Mission upload + start completes within the configured retry budget.
+- INCLUSION geofence violation triggers RTL.
+- EXCLUSION geofence violation triggers RTL (regression gate against the earlier silent-ignore behaviour).
+- Middle-waypoint POST + re-upload succeeds within 2 s.
+- Health endpoint returns `green` once steady state is reached.
+
+## 6. Branch policy
+
+| Branch | Triggers | Required gates |
+|---|---|---|
+| feature branches (PR) | on push | fetch → lint → unit-test → build-arm64 → integration-test → sitl-conformance |
+| `dev` | on merge | all PR gates + package |
+| tagged release (`v*`) | on tag | all `dev` gates + sign + publish + benchmark-gate (manual approval) |
+
+`main` and `dev` are protected. Force-push is forbidden. Merges require a green pipeline.
+
+## 7. Out of scope here
+
+- Airframe deployment automation (manual; tied to flight-gate).
+- Ground Station and `../detections` pipelines (each owns its own).
+- AI training pipeline — `../_docs/12_ai_training.md`.
+- Model-sync to the airframe (`model-sync.service`, suite-level — `../_docs/00_top_level_architecture.md`).
@@ -0,0 +1,142 @@
+# Containerisation
+
+**Status**: forward-looking design (Rust). Final shape will surface during build-system bring-up; treat the choices below as the current intent, not commitments.
+
+## 1. Deployment shape
+
+`autopilot` is a single Rust binary. Two delivery options are considered:
+
+| Option | Form | Pros | Cons |
+|---|---|---|---|
+| **A — native systemd unit** | bare binary deployed to `/usr/local/bin/autopilot` + a `.service` unit | minimum overhead on Jetson; closest to airframe constraints; trivial flight-gate integration | per-host installation discipline; less reproducible across nodes |
+| **B — single container image** | `azaion/autopilot:<branch>-arm64` | consistent across environments; matches the suite's existing OTA model (Watchtower) | container runtime adds startup latency and one more failure surface on the airframe |
+
+The decision is **Option A** for the on-airframe deployment (lowest overhead, closest to the autopilot's real-time constraints), and **Option B** for development / CI / emulated-hardware testing (reproducibility wins). The same Rust binary is built once and packaged into both.
+
+## 2. Target hardware
+
+| Item | Value |
+|---|---|
+| Edge device | NVIDIA Jetson Orin Nano Super 8 GB |
+| Architecture | aarch64 |
+| OS | Ubuntu 22.04 (JetPack-bundled) — locked JetPack version + power mode |
+| Camera | ViewPro A40 Pro (RTSP + UDP control) |
+| Autopilot | ArduPilot or PX4 over MAVLink v2 (UDP or serial) |
+
+## 3. Native deployment (Option A — production)
+
+**Layout:**
+
+```text
+/usr/local/bin/autopilot                  Rust binary
+/etc/azaion/autopilot/config.toml         runtime config
+/etc/systemd/system/autopilot.service     systemd unit
+/var/lib/autopilot/                       persistent state (mapobjects_store)
+/run/azaion/in-flight                     flight-gate marker (per ../_docs/00_top_level_architecture.md)
+```
+
+**systemd unit highlights:**
+
+- `Type=notify` — autopilot signals readiness once Tier 1, gimbal, and MAVLink links are healthy.
+- `Restart=on-failure`, `RestartSec=2s`, `StartLimitBurst=5` — bounded restart (so a hard-broken binary doesn't loop forever).
+- `MemoryMax=` — enforces the on-airframe memory budget (~6 GB; Tier-1 YOLO container holds ~2 GB).
+- `LimitNOFILE`, `LimitNPROC` set explicitly.
+- `ExecStartPre=/bin/sh -c 'mkdir -p /run/azaion && touch /run/azaion/in-flight'` — asserts the suite-wide flight-gate so `model-sync.service` does not pull a new model mid-flight.
+- `ExecStopPost=/bin/rm -f /run/azaion/in-flight` — clears the flight-gate on shutdown.
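+
+As an illustration of the `Type=notify` readiness signal, the sketch below sends `READY=1` as a datagram to the socket named by `NOTIFY_SOCKET`, following the systemd notification protocol. It uses only the standard library. The function names are hypothetical, and abstract-namespace sockets (a leading `@`) are deliberately rejected rather than handled.
+
+```rust
+use std::env;
+use std::io;
+use std::os::unix::net::UnixDatagram;
+use std::path::Path;
+
+/// Send `READY=1` to the notification socket at `socket_path`.
+pub fn notify_ready_at(socket_path: &Path) -> io::Result<()> {
+    let sock = UnixDatagram::unbound()?;
+    sock.send_to(b"READY=1", socket_path)?;
+    Ok(())
+}
+
+/// Signal readiness if `NOTIFY_SOCKET` is set.
+/// Returns `Ok(false)` when the variable is absent, i.e. the process was
+/// not started by a notify-type unit.
+pub fn notify_ready() -> io::Result<bool> {
+    let Some(raw) = env::var_os("NOTIFY_SOCKET") else {
+        return Ok(false);
+    };
+    // Assumption of this sketch: only filesystem socket paths are supported.
+    if raw.to_string_lossy().starts_with('@') {
+        return Err(io::Error::new(
+            io::ErrorKind::Unsupported,
+            "abstract NOTIFY_SOCKET is not handled in this sketch",
+        ));
+    }
+    notify_ready_at(Path::new(&raw))?;
+    Ok(true)
+}
+```
+
+In the binary, a call like `notify_ready()` would come only after the Tier 1, gimbal, and MAVLink links report healthy, so that systemd does not treat the unit as started before then.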
+
+**Runtime config** (`/etc/azaion/autopilot/config.toml`) is the single source for non-secret configuration: RTSP URL, gimbal endpoint, MAVLink connection URI, missions API endpoint, Ground Station endpoint, VLM IPC socket path, `vlm_enabled` flag, log level. Secrets (if any — TBD per `../_docs/02_missions.md` auth model) come from the systemd `EnvironmentFile=` pointing at a permission-restricted file.
+
+## 4. Container image (Option B — dev / CI / emulation)
+
+**Base image:** `nvcr.io/nvidia/l4t-base:<JetPack-pinned-tag>` for production-equivalent NVDEC + TensorRT plumbing; `ubuntu:22.04` for emulation (no GPU acceleration).
+
+**Image layout:**
+
+```text
+/usr/local/bin/autopilot                  Rust binary (built outside the image)
+/etc/azaion/autopilot/config.toml         runtime config (mounted at runtime)
+/var/lib/autopilot/                       persistent state (volume-mounted)
+```
+
+**Image is non-root.** Default `USER` is `autopilot:autopilot`; `/var/lib/autopilot/` is owned by that user.
+
+**Compose example** (development):
+
+```yaml
+services:
+  autopilot:
+    image: azaion/autopilot:dev-arm64
+    restart: unless-stopped
+    environment:
+      AUTOPILOT_CONFIG: /etc/azaion/autopilot/config.toml
+    volumes:
+      - ./config/autopilot.toml:/etc/azaion/autopilot/config.toml:ro
+      - autopilot-state:/var/lib/autopilot
+      - /run/azaion:/run/azaion
+    devices:
+      - /dev/ttyUSB0:/dev/ttyUSB0   # MAVLink serial (if used)
+    network_mode: host              # RTSP / UDP gimbal / Ground Station modem all on host
+volumes:
+  autopilot-state: {}
+```
+
+`network_mode: host` is intentional on Jetson: RTSP, gimbal UDP, MAVLink UDP, and the modem-link to the Ground Station all share the airframe's network namespace.
+
+## 5. External dependencies on the airframe
+
+`autopilot` itself is the only autopilot-owned process. The on-airframe tier also runs (separately):
+
+- **`../detections`** — Tier 1 YOLO service. Container delivered from its own pipeline. Bi-directional gRPC endpoint consumed by `detection_client`.
+- **NanoLLM / VILA1.5-3B** (optional) — local IPC peer of `vlm_client`. Separate container or process; not embedded in the autopilot binary. Surfaces a Unix-domain socket; peer-credential check is mandatory when supported.
+- **GPS-Denied service** — separate edge service, owned by `gps-denied-onboard`; consumed indirectly through the shared edge data path (per `../_docs/11_gps_denied.md`).
+- **`model-sync.service`** — suite-wide rclone-driven model puller. Reads `/run/azaion/in-flight` to defer model swaps during flight (per `../_docs/00_top_level_architecture.md`).
+
+## 6. Configuration surface
+
+All configuration is declarative (`config.toml`); there is no compile-time configuration of endpoints, addresses, or feature switches **except** the `vlm_client` build-time feature flag (see `architecture.md §7.6 Local VLM confirmation > Optionality model`).
+
+| Concern | Mechanism |
+|---|---|
+| RTSP / gimbal / MAVLink endpoints | `config.toml` |
+| `missions` API endpoint + auth | `config.toml` (auth pulled from `EnvironmentFile=`) |
+| Ground Station endpoint | `config.toml` |
+| VLM IPC socket path | `config.toml` |
+| `vlm_enabled` runtime flag | `config.toml` |
+| `vlm_client` build-time feature | `cargo build --features vlm` at build |
+| Log level + format | `RUST_LOG` env (`tracing-subscriber` honours it) |
+| Mission ID for the current flight | CLI arg (per-flight, not per-host) |
+
+## 7. Health endpoint
+
+`autopilot` exposes a single HTTP health endpoint (port and bind address from `config.toml`; default `127.0.0.1:8080`). It aggregates per-component readiness:
+
+```json
+{
+  "status": "green | yellow | red",
+  "components": {
+    "frame_ingest":      "green",
+    "detection_client":  "green",
+    "movement_detector": "green",
+    "semantic_analyzer": "green",
+    "vlm_client":        "disabled",
+    "scan_controller":   "green",
+    "mapobjects_store":  "green",
+    "gimbal_controller": "green",
+    "operator_bridge":   "yellow",
+    "mission_executor":  "green",
+    "mavlink_layer":     "green",
+    "mission_client":    "green",
+    "telemetry_stream":  "green"
+  },
+  "last_state_change": "2026-05-17T12:00:00Z"
+}
+```
+
+`yellow` means degraded but running; `red` means at least one essential component is in an unrecoverable state. The aggregator surfaces details on each transition through `tracing` (see `observability.md`).
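+
+The aggregation rule can be sketched in Rust as follows. This is an illustration under stated assumptions, not the final design: it treats every component as essential (any `red` makes the overall status `red`), ignores `disabled` components, and reports `green` for an empty map. The `Health` enum and both function names are hypothetical; the gauge values follow the `health_state{component}` mapping in `observability.md`.
+
+```rust
+use std::collections::BTreeMap;
+
+/// Per-component readiness as reported by the health endpoint.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum Health {
+    Green,
+    Yellow,
+    Red,
+    Disabled,
+}
+
+impl Health {
+    /// Gauge value for `health_state{component}`: 0 red, 1 yellow, 2 green.
+    /// Assumption: a disabled component exports no gauge value.
+    pub fn gauge(self) -> Option<u8> {
+        match self {
+            Health::Red => Some(0),
+            Health::Yellow => Some(1),
+            Health::Green => Some(2),
+            Health::Disabled => None,
+        }
+    }
+}
+
+/// Overall status is the worst state among enabled components.
+pub fn aggregate(components: &BTreeMap<&str, Health>) -> Health {
+    let mut worst = Health::Green;
+    for &h in components.values() {
+        match h {
+            Health::Red => return Health::Red,
+            Health::Yellow => worst = Health::Yellow,
+            Health::Green | Health::Disabled => {}
+        }
+    }
+    worst
+}
+```
+
+With the component states from the JSON example above (`operator_bridge` yellow, `vlm_client` disabled, the rest green), this rule would yield `yellow`. If only some components turn out to be essential, the `Red` arm would need a per-component essential flag.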
+
+## 8. Out of scope here
+
+- Provisioning the Jetson host itself (Ansible / Kickstart / disk imaging) — owned by airframe ops.
+- Build pipeline (cross-compile, signing, registry push) — see `ci_cd_pipeline.md`.
+- Observability stack (tracing exporter, log shipper, metrics scraper) — see `observability.md`.
+- Mission delivery to the airframe — owned by `missions` API.
@@ -0,0 +1,142 @@
+# Observability
+
+**Status**: forward-looking design (Rust). Treat the choices below as the intended approach; the exact tracing exporter / metrics scraper / log-shipping target depend on the suite's overall observability stack at deploy time.
+
+## 1. Posture
+
+- **One binary, one process.** Per-component instrumentation is structured (each component listed in `architecture.md §3` is a `tracing` target).
+- **Structured logs are primary.** Metrics are derived from log spans and counters; traces follow a frame's journey through the pipeline end to end.
+- **No silent error swallowing.** Every failure path increments a counter, emits a span event, or both.
+- **Health is aggregated**, not derived from logs. The HTTP health endpoint (`containerization.md §7`) is the source of truth for live readiness.
+
+## 2. Logs
+
+**Library**: `tracing` + `tracing-subscriber`.
+
+**Format**: JSON to stdout. Captured by the host's journald (Option A) or by the container runtime (Option B), then shipped to the suite's log aggregator.
+
+**Per-line fields:**
+
+| Field | Source | Notes |
+|---|---|---|
+| `ts` | wall clock | ISO-8601 UTC. |
+| `ts_mono_ns` | monotonic clock | For ordering across components without clock-skew artefacts. |
+| `level` | `tracing` | `error \| warn \| info \| debug \| trace`. |
+| `target` | component name | One of `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client`, `scan_controller`, `mapobjects_store`, `gimbal_controller`, `operator_bridge`, `mission_executor`, `mavlink_layer`, `mission_client`, `telemetry_stream`. |
+| `frame_seq` | propagated context | Where applicable. Lets us reconstruct one frame's journey. |
+| `poi_id`, `roi_id`, `target_id`, `mission_id`, `command_id` | propagated context | Where applicable. |
+| `event` | message | Short, machine-friendly identifier (e.g., `frame.dropped`, `vlm.timeout`, `mission.geofence_violation`, `bit.check_failed`, `failsafe.lost_link`, `mapobjects.push_failed`, `operator.auth_rejected`). |
+| `model_version` | propagated context | Version string for `tier1_model_version` and `vlm_model_version`. Required on every `vlm.response` and on every Tier-2 evidence span for forensic correlation. |
+| `wall_clock_source` | telemetry frame | `gnss \| host \| coast`; emitted on every state-transition span and on every operator-command audit log line. |
+| `reason` | message | Free-form for human readers. |
+
+**Log level defaults:**
+
+- `info`: lifecycle (startup / shutdown / state transitions), all error and security events.
+- `warn`: degraded-but-running events (yellow health, retries, drops).
+- `error`: red health, hard failures, schema violations, security violations.
+- `debug` / `trace`: off in production; enabled per-target via `RUST_LOG`.
+
+**Always logged at `warn` or higher** (per `coderule.mdc`):
+
+- Every exception path that the operator could care about.
+- Authentication / authorisation failures (peer-cred check failures on VLM IPC, malformed Ground Station session, MAVLink-2 signing rejection).
+- Geofence violations.
+- Schema validation failures (Tier 1 response, VLM response, mission JSON).
+
+## 3. Metrics
+
+Derived from log spans + a small set of explicit counters. Exporter: Prometheus-compatible (per the suite's stack).
+
+**Per-component counters** (illustrative — exact names finalised at implementation):
+
+| Component | Metrics | Type |
+|---|---|---|
+| `frame_ingest` | `frames_received_total`, `frames_dropped_total{reason}`, `decode_errors_total` | counter |
+| `frame_ingest` | `decode_ms` | histogram |
+| `detection_client` | `requests_total`, `errors_total{kind}`, `latency_ms` | counter / histogram |
+| `movement_detector` | `candidates_total`, `telemetry_skew_drops_total` | counter |
+| `semantic_analyzer` | `tier2_runs_total`, `tier2_latency_ms`, `tier2_oversize_total` | counter / histogram |
+| `vlm_client` | `vlm_requests_total{status}`, `vlm_latency_ms` | counter / histogram |
+| `scan_controller` | `state_transitions_total{from,to}`, `pois_in_queue`, `pois_per_min`, `tick_latency_ms` | counter / gauge / histogram |
+| `mapobjects_store` | `classify_total{result}`, `ignored_items_total`, `removed_candidates_total` | counter |
+| `gimbal_controller` | `commands_total`, `decision_to_movement_ms`, `zoom_transition_ms`, `vendor_faults_total` | counter / histogram |
+| `mavlink_layer` | `messages_in_total{kind}`, `messages_out_total{kind}`, `command_acks_total{result}`, `parse_errors_total`, `link_state` | counter / gauge |
+| `mission_executor` | `state_transitions_total{from,to}`, `mission_uploads_total{result}`, `geofence_violations_total{kind}` | counter |
+| `mission_client` | `fetches_total{result}`, `middle_waypoint_posts_total{result}`, `mapobjects_pull_total{result}`, `mapobjects_push_total{result}`, `mapobjects_pull_bytes`, `mapobjects_push_bytes`, `mapobjects_sync_lag_s` | counter / gauge |
+| `mission_executor` (BIT) | `bit_runs_total{result}`, `bit_check_failures_total{check}` | counter |
+| `mission_executor` (failsafe) | `link_loss_events_total{trigger}`, `failsafe_action_total{action}` | counter |
+| `operator_bridge` | `pois_surfaced_total`, `commands_received_total{kind,result}`, `decision_latency_ms`, `auth_rejections_total{reason}`, `command_e2e_ms` | counter / histogram |
+| `telemetry_stream` | `bytes_out_total`, `frames_out_total`, `link_state`, `bandwidth_used_mbps` | counter / gauge |
+
+**Aggregated:**
+
+- `health_state{component}` — 0 (red) / 1 (yellow) / 2 (green); enables alerting per-component.
+- `process_uptime_seconds`, `process_resident_memory_bytes` — standard.
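+
+As a minimal sketch of the `health_state{component}` encoding above: the enum below is a hypothetical stand-in for `shared::health::HealthLevel`, and the worst-of aggregation (including treating an empty set as green) is an assumption made for illustration, not something this document specifies.
+
+```rust
+/// Health level of one component.
+/// Hypothetical mirror of `shared::health::HealthLevel`; the discriminants
+/// follow the 0 (red) / 1 (yellow) / 2 (green) encoding listed above.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+pub enum HealthLevel {
+    Red = 0,
+    Yellow = 1,
+    Green = 2,
+}
+
+impl HealthLevel {
+    /// Value exported as the `health_state{component}` gauge.
+    pub fn as_gauge(self) -> i64 {
+        self as i64
+    }
+}
+
+/// Worst level across components (assumption: aggregation is worst-of,
+/// and an empty slice counts as green).
+pub fn aggregate(levels: &[HealthLevel]) -> HealthLevel {
+    levels.iter().copied().min().unwrap_or(HealthLevel::Green)
+}
+```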
+
+## 4. Traces
+
+`tracing` spans cover the path of a single frame and the path of a single POI.
+
+**Frame trace** (per `Frame`):
+
+```text
+frame_ingest.publish
+  detection_client.request
+    detection_client.response
+  movement_detector.tick
+    [movement_detector.emit_candidate]
+  telemetry_stream.push
+```
+
+**POI trace** (per `POI`):
+
+```text
+scan_controller.enqueue
+  scan_controller.dequeue
+    gimbal_controller.zoom
+    semantic_analyzer.tier2
+      [vlm_client.request -> vlm_client.response]
+    operator_bridge.surface
+      [operator_bridge.confirm | decline | timeout]
+        mission_executor.middle_waypoint    # confirm path
+        mapobjects_store.append_ignored     # decline path
+```
+
+Spans propagate via context across in-process channels. Trace export target depends on the suite's stack (OTLP / Jaeger / Tempo).
+
+## 5. Health endpoint
+
+See `containerization.md §7`. The endpoint is the operator-facing readiness API; metrics + logs are the engineer-facing investigation API.
+
+A red health state for any of these components is unrecoverable for the current flight:
+
+- `frame_ingest` red → no input → cannot operate.
+- `mavlink_layer` red → no UAV control → RTL is triggered by the flight controller's own failsafe (the flight controller enforces this itself when the MAVLink heartbeat stops).
+- `mission_executor` red → mission lifecycle stuck → operator must take RC control.
+
+A red health state for these components is degraded-but-survivable:
+
+- `detection_client` → continue zoom-out sweep; lose Tier 1.
+- `movement_detector` → continue; lose movement-candidate POI source.
+- `semantic_analyzer` → continue; surface Tier-1-only POIs.
+- `vlm_client` → fail-closed (POIs surfaced without VLM evidence).
+- `mapobjects_store` → continue with in-memory state; persistent diff lost on restart. Sync state may transition to `Stale` (operator visible).
+- `mapobjects_sync` (logical, owned by `mission_client`) → mission proceeds with stale snapshot; post-flight push retries via leftover spool. Operator sees `mapobjects_sync = degraded`.
+- `operator_bridge` / `telemetry_stream` → continue zoom-out sweep; pause POI surfacing; resume on reconnect. F10 lost-link ladder owns the larger response.
+- `gimbal_controller` → pause zoom-in / target-follow; zoom-out sweep stops.
+- `mission_client` → continue current mission from in-memory copy.
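+
+The two lists above amount to a lookup from component name to severity. A minimal sketch follows; `RedSeverity` and `red_severity` are hypothetical names introduced here, and components not named in this section deliberately map to `None` rather than to a guessed severity.
+
+```rust
+/// How a red health state affects the current flight.
+#[derive(Debug, PartialEq, Eq)]
+pub enum RedSeverity {
+    /// Unrecoverable for the current flight.
+    Unrecoverable,
+    /// Degraded but survivable.
+    Degraded,
+}
+
+/// Maps a component that reported red health to its severity.
+/// Returns `None` for any name this section does not list.
+pub fn red_severity(component: &str) -> Option<RedSeverity> {
+    match component {
+        "frame_ingest" | "mavlink_layer" | "mission_executor" => {
+            Some(RedSeverity::Unrecoverable)
+        }
+        "detection_client" | "movement_detector" | "semantic_analyzer"
+        | "vlm_client" | "mapobjects_store" | "mapobjects_sync"
+        | "operator_bridge" | "telemetry_stream" | "gimbal_controller"
+        | "mission_client" => Some(RedSeverity::Degraded),
+        _ => None,
+    }
+}
+```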
+
+## 6. Replay-driven debugging
+
+All non-trivial decisions in `scan_controller`, `movement_detector`, `semantic_analyzer`, `vlm_client`, and `mission_executor` are reconstructable from logs + the (size-capped) raw inputs that drove them:
+
+- Frame seq, gimbal state at decode, telemetry sample used, Tier-1 detections returned, Tier-2 score, VLM raw response (size-capped), operator command, resulting state transition.
+
+This is the foundation of the replay-based integration tests in `ci_cd_pipeline.md §2`.
+
+## 7. Out of scope here
+
+- Suite-wide observability stack choice (OTLP vs Loki vs Tempo vs Promtail) — owned by suite ops.
+- Persistent log retention policy — owned by suite ops.
+- Alerting routing (Slack / PagerDuty / email) — owned by suite ops.
@@ -0,0 +1,113 @@
+# autopilot — Glossary
+
+**Status**: confirmed-by-user (2026-05-17), updated for the rewrite paradigm.
+
+Project-specific terms only. Generic CS / industry terms (RTSP, gRPC, FastAPI, MAVLink, JSON, etc.) are intentionally omitted.
+
+---
+
+**AUTO mode** — fixed-wing autopilot flight mode in which the airframe follows its uploaded mission. `mission_executor`'s fixed-wing variant uploads the mission and waits for the operator to switch the airframe into AUTO via RC; only then does it transition to `FLY_MISSION`. source: `architecture.md §7.7`.
+
+**Behaviour tree (BT)** — hierarchical decision-making model (Selector / Sequence / Condition / Action / Decorator) ticked from the root every cycle. The canonical decomposition of `scan_controller`'s logic. The implementation may use a typed deterministic state machine that satisfies the same priorities, preemption, and tick scenarios. source: `system-flows.md §F4`.
+
+**Benchmark gate** — proof-of-concept milestone. Tier 1 ≤100 ms/frame, Tier 2 ≤200 ms/ROI, VLM ≤5 s/ROI, A40 transition ≤2 s, decision-to-movement ≤500 ms, ≤5 POIs/min. Must pass before product code begins. source: `architecture.md §7.6 Benchmark gate`.
+
+**`../detections`** — separate sibling repo. FastAPI / Cython service running TensorRT YOLO26 + YOLOE-26 FP16 engines. Tier 1 primitive detection lives here, NOT in autopilot. Consumed by `detection_client` over bi-directional gRPC. source: `architecture.md §1`, `../_docs/03_detections.md`.
+
+**detection_client** — autopilot component: bi-directional gRPC client to `../detections`; streams frames out, receives bounding boxes back; same bboxes are reused for Tier 2 ROI selection and for operator overlay. source: `components/detection_client/description.md`.
+
+**Confidence-scaled timeout** — operator-decision window scales linearly with target confidence: 40 % → 30 s, 100 % → 120 s. Below 40 % a target is not surfaced. Timeout = forget; decline = `IgnoredItem` entry. source: `architecture.md §5 Architectural Principles`.
+
+**Ego-motion compensation** — separating target motion from platform motion in `movement_detector`, using synchronised video + gimbal angle + zoom state + UAV telemetry. Naive frame-differencing is explicitly rejected. Per-zoom-band thresholds: tighter at zoom-in. source: `architecture.md §5 Architectural Principles`, `components/movement_detector/description.md`.
+
+**Flight-gate** — suite-wide convention: a marker file (`/run/azaion/in-flight`) written by autopilot at startup and removed at shutdown, read by `model-sync.service` to defer model swaps during flight. source: `../_docs/00_top_level_architecture.md`, `deployment/containerization.md §3`.
+
+**frame_ingest** — autopilot component: pulls RTSP from ViewPro A40, decodes, timestamps, hands frames to `detection_client`, `movement_detector`, and `telemetry_stream`. source: `components/frame_ingest/description.md`.
+
+**Geofence (INCLUSION / EXCLUSION)** — polygonal area constraint in the mission. **Both** are enforced symmetrically in the rewrite (`mission_executor`); a violation triggers RTL. source: `architecture.md §5 Architectural Principles`, `§7.7 MAVLink and Piloting`, `components/mission_executor/description.md`.
+
+**gimbal_controller** — autopilot component: ViewPro A40 control protocol (yaw / pitch / zoom) + zoom-out sweep + zoom-in path-follow + target-follow centre-window. source: `components/gimbal_controller/description.md`.
+
+**Ground Station API** — external, out-of-this-repo service that receives a continuous camera + telemetry stream from each UAV and hosts the operator browser UI (bbox overlay, target confirm/decline). Not built; not in autopilot scope. source: `architecture.md §1`, `../_docs/04_system_design_clarifications.md`.
+
+**Hand-rolled MAVLink layer** — `mavlink_layer` implements the ~10–15 MAVLink commands this codebase actually uses with no third-party SDK. Eliminates the largest dependency-risk item. source: `architecture.md §7.7`, `components/mavlink_layer/description.md`.
+
+**H3 spatial index** — hexagonal hierarchical geospatial indexing used by `mapobjects_store` for fast new / moved / existing / removed diffs. source: `architecture.md §7.9`, `components/mapobjects_store/description.md`.
+
+**IgnoredItem** — operator-declined target. Persisted in `mapobjects_store` as `(MGRS, class_group)`; new detections matching an entry are suppressed before reaching the operator. source: `architecture.md §7.12`, `components/mapobjects_store/description.md`, `data_model.md §IgnoredItem`.
+
+**Jetson Orin Nano** — edge-device compute platform (NVIDIA, aarch64, CUDA-capable). 8 GB shared LPDDR5; ~2 GB used by Tier 1, ~6 GB available for the rest of autopilot + VLM. source: `architecture.md §7.3`.
+
+**Zoom-out / zoom-in scan** — two-tier search behaviour. **Zoom-out level** = wide / light-medium zoom sweep along the UAV route, runs `movement_detector` + Tier 1. **Zoom-in level** = zoom into a queued POI for Tier 2 ROI analysis + optional VLM confirmation; `movement_detector` continues to run with per-zoom-band thresholds. State-machine variants: `ZoomedOut`, `ZoomedIn { roi, hold_started_at }`. source: `architecture.md §7.1`, `components/scan_controller/description.md`.
+
+**MapObject** — entry in `mapobjects_store`; keyed by `H3_cell + class`; carries GPS, size, class, and a list of recent observations. source: `architecture.md §7.9`, `data_model.md §MapObject`.
+
+**mapobjects_store** — autopilot component: on-device H3-indexed map of detected objects + ignored-items list. No REST API. source: `components/mapobjects_store/description.md`.
+
+**mavlink_layer** — autopilot component: hand-rolled MAVLink v2 transport + the small command set this codebase needs. source: `components/mavlink_layer/description.md`.
+
+**MGRS** — Military Grid Reference System; primary coordinate encoding for autopilot ⇄ operator sync messages and for `mapobjects_store` keys. source: `architecture.md §7.10`, `data_model.md §MGRS sync message`.
+
+**Middle waypoint** — autopilot-inserted waypoint between the current position and the next mission waypoint, computed from an operator-confirmed POI. Triggers a mission re-upload (`MISSION_CLEAR_ALL` + standard upload sequence). source: `architecture.md §7.7`, `components/mission_executor/description.md`.
+
+**mission_client** — autopilot component: pulls the mission from the `missions` API on start; POSTs middle-waypoint inserts; honours the mission cascade signal. source: `components/mission_client/description.md`.
+
+**mission_executor** — autopilot component: variant-specific (multirotor / fixed-wing) state machine that drives the airframe through connect → health-check → arm/takeoff (multirotor) or wait-for-AUTO (fixed-wing) → upload → fly → land. Owns geofence enforcement. source: `architecture.md §7.7`, `components/mission_executor/description.md`.
+
+**mission-schema** — shared schema artefact between `autopilot` and `missions` repos. Extraction location TBD (`_infra/` at suite root, or a small third repo) — `architecture.md §8 Q5`. source: `architecture.md §5`.
+
+**`missions`** — separate sibling repo (.NET service). Hosts the missions REST API. Stays separate from `autopilot`; the two share `mission-schema`. source: `../_docs/02_missions.md`.
+
+**Movement candidate** — small moving point/cluster emitted by `movement_detector` in either zoom-out or zoom-in. Tagged with `source_zoom_band`. Promoted to a zoom-in POI by `scan_controller` (or used to bump in-progress ROI confidence at zoom-in). source: `architecture.md §7.4`, `data_model.md §MovementCandidate`.
+
+**movement_detector** — autopilot component: OpenCV optical-flow / global-motion estimation with mandatory ego-motion compensation. Active at **both** zoom-out and zoom-in (suppressed only during target-follow); per-zoom-band thresholds. Classical-CV adequacy at zoom-in is benchmark-gated; learned-CV fallback per Q14. source: `components/movement_detector/description.md`.
+
+**operator_bridge** — autopilot component: surfaces POIs (via `telemetry_stream` → Ground Station) for operator confirm / decline; forwards target-follow start / release; on decline appends an `IgnoredItem`. source: `components/operator_bridge/description.md`.
+
+**Optionality model (VLM)** — VLM is the only optional Tier. Two complementary mechanisms: a runtime `vlm_enabled` flag, and a build-time feature module. The system MUST function correctly with VLM absent. source: `architecture.md §7.6 Local VLM confirmation > Optionality model`, `components/vlm_client/description.md §9`.
+
+**POI (Point of Interest)** — a queued candidate for zoom-in inspection (footpath start, branch pile, tree row, movement candidate, etc.). source: `architecture.md §7.1`, `data_model.md §POI`.
+
+**POI queue** — operator-review queue inside `scan_controller`; ordered by `confidence × proximity × age_factor`; hard cap of **≤5 POIs/min** to bound operator workload. source: `architecture.md §5`, `components/scan_controller/description.md`.
+
+**RTL (Return-to-Launch)** — MAVLink-driven return to the configured rally point; triggered by INCLUSION / EXCLUSION violation, by max-retry exhaustion in `mission_executor`, or by failsafe in the autopilot itself. source: `architecture.md §7.7`, `components/mission_executor/description.md`.
+
+**Scan controller** — autopilot component: central deterministic typed state machine — `ZoomedOut`, `ZoomedIn`, `TargetFollow`. Owns POI queue, timeouts, gimbal commands, ≤5 POIs/min cap. source: `architecture.md §7.6 Scan controller and POI queue`, `components/scan_controller/description.md` (full BT spec in `system-flows.md §F4`).
+
+**semantic_analyzer** — autopilot component (Tier 2): primitive graph + lightweight ROI CNN, reasoning over paths, branch piles, dark entrances, etc. source: `components/semantic_analyzer/description.md`.
+
+**SITL conformance gate** — CI stage in which autopilot runs against ArduPilot SITL with a mocked `../detections` and a fixture mission, asserting the MAVLink command surface and geofence enforcement. source: `deployment/ci_cd_pipeline.md §5`.
+
+**Sweep** — zoom-out camera motion: gimbal swings left-right across the UAV's flight path while the UAV flies the mission. Exact pattern (pendulum / raster / lawn-mower), FOV per zoom tier, dwell time, and mission-segment alignment are unresolved (`architecture.md §8 Q1`). source: `architecture.md §7.1`, `components/gimbal_controller/description.md`.
+
+**Target-follow mode** — gimbal keeps an operator-confirmed target centred (within the centre 25 % of frame) while the UAV continues to move. Ends on operator release or tracking loss. State-machine variant: `TargetFollow { target_id, started_at }`. source: `architecture.md §7.1`, `components/scan_controller/description.md`.
+
+**telemetry_stream** — autopilot component: continuous (always-on) push of camera frames + flight telemetry + bbox overlay over modem to the Ground Station API. Operator always sees live feed, not just on detection. Carries operator commands on the return path. source: `components/telemetry_stream/description.md`.
+
+**Tier 1 detection** — primitive YOLO over the full frame; delegated to `../detections`. source: `architecture.md §7.6 Tier 1 primitive detector`.
+
+**Tier 2 semantic** — primitive-graph + lightweight ROI CNN reasoning over zoom-in crops; lives in autopilot. source: `architecture.md §7.6 Tier 2 semantic analyzer`.
+
+**Tier 3 / VLM (Vision Language Model)** — NanoLLM running VILA1.5-3B in a separate local process, invoked only for bounded zoom-in ROI confirmation. Local IPC over Unix domain socket with peer-credential check. No cloud egress. Optional. source: `architecture.md §7.6 Local VLM confirmation`, `components/vlm_client/description.md`, `system-flows.md §F3`.
+
+**vlm_client** — autopilot component: optional local-IPC client to a NanoLLM/VILA1.5-3B process; validates ROI payload, calls VLM, validates response against the `VlmAssessment` schema. source: `components/vlm_client/description.md`.
+
+**VlmAssessment** — structured-schema output from the VLM. The free-form VLM text is not a downstream API contract. source: `architecture.md §5 Architectural Principles`, `data_model.md §VlmAssessment`.
+
+**ViewPro A40** — deployment gimbal hardware. NFR budget: zoom transition ≤2 s, decision-to-movement ≤500 ms. source: `architecture.md §7.3`, `components/gimbal_controller/description.md`.
+
+**Waypoint** — mission node coordinate (lat, lon, alt). Pulled from the `missions` API by `mission_client`; the operator-confirm flow may insert a **middle waypoint** to detour toward a confirmed target. source: `architecture.md §7.7`, `data_model.md §MissionWaypoint`.
+
+**BIT (Built-In Self Test)** — pre-flight gate run by `mission_executor`. Covers GPS lock, camera RTSP, gimbal homing, `../detections`, VLM warmup (if enabled), mission load, MapObjects pre-flight pull, persistent-store free space, wall-clock binding, MAVLink + airframe health. Items return `OK | DEGRADED | FAIL`. DEGRADED requires signed operator acknowledgement. FAIL blocks transition past `HEALTH_OK`. source: `architecture.md §5 / §7.3`, `system-flows.md §F9`.
+
+**Lost-link failsafe ladder** — typed ladder evaluated each tick by `mission_executor` against the operator/Ground-Station modem heartbeat: `LinkOk` (≤5 s) → `LinkDegraded` (≤30 s, queue events, health yellow) → `LinkLost` (>30 s, no follow) → `LinkLostInFollow` (>30 s in target-follow, +30 s grace). Default action on lost link is RTL. MAVLink-link loss to ArduPilot itself is a separate, more severe event. source: `architecture.md §5 / §7.7`, `system-flows.md §F10`.
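+
+A minimal sketch of the ladder as a pure function of heartbeat silence, using the thresholds quoted in this entry. The function names are hypothetical, and reading the "+30 s grace" as "RTL becomes due after 60 s of silence while in target-follow" is an assumption; `system-flows.md §F10` remains the authority.
+
+```rust
+/// Ladder states named in the glossary entry.
+#[derive(Debug, PartialEq, Eq)]
+pub enum LinkState {
+    LinkOk,
+    LinkDegraded,
+    LinkLost,
+    LinkLostInFollow,
+}
+
+/// Classifies the link from seconds since the last operator/Ground-Station
+/// modem heartbeat and whether target-follow is active.
+pub fn classify_link(silence_s: f64, in_target_follow: bool) -> LinkState {
+    if silence_s <= 5.0 {
+        LinkState::LinkOk
+    } else if silence_s <= 30.0 {
+        LinkState::LinkDegraded
+    } else if in_target_follow {
+        LinkState::LinkLostInFollow
+    } else {
+        LinkState::LinkLost
+    }
+}
+
+/// Whether the default lost-link action (RTL) is due.
+/// Assumption: the 30 s grace in target-follow means RTL at > 60 s silence.
+pub fn rtl_due(silence_s: f64, in_target_follow: bool) -> bool {
+    match classify_link(silence_s, in_target_follow) {
+        LinkState::LinkLost => true,
+        LinkState::LinkLostInFollow => silence_s > 60.0,
+        LinkState::LinkOk | LinkState::LinkDegraded => false,
+    }
+}
+```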
+
+**MapObjects sync** — pre-flight pull + post-flight push of MapObjects + IgnoredItems against the central `missions` API extension `/missions/{id}/mapobjects`. In-flight is batched only (no streaming over modem). On-device store is a working copy; central store is the source of truth across missions. source: `architecture.md §5 / §7.13`, `system-flows.md §F8`, `components/mapobjects_store/description.md`, `components/mission_client/description.md`.
+
+**Per-zoom-band thresholds** — `movement_detector` configuration is split between zoom-out and zoom-in because the pixel-to-metre ratio differs by ~10×. Cluster persistence threshold, residual-velocity floor, telemetry-skew tolerance, and enqueue-latency budget are all per-band. source: `architecture.md §7.6 Movement detector`, `components/movement_detector/description.md §5`.
+
+**Operator-command authentication** — every operator command (confirm / decline / target-follow / safety-override / BIT-degraded-acknowledge) carries a session-bound signature with replay protection, validated by `operator_bridge` before dispatch. The principle is committed; the exact scheme is open per Q9. source: `architecture.md §5 / §8 Q9`, `components/operator_bridge/description.md`.
+
+**Sync state (`mapobjects_store`)** — `synced | cached_fallback | degraded`. `synced` after a fresh successful pull or a successful post-flight push. `cached_fallback` when pre-flight pull failed and the operator acknowledged continuing on cache. `degraded` after a persistent push failure or a stale cache. `scan_controller` suppresses MapObject diff classifications while `degraded` to avoid corrupting the central observation log. source: `components/mapobjects_store/description.md §5`.
+
+**Wall-clock source** — autopilot binds wall-clock to GPS time once GPS is locked (preferred) or to NTP at boot if reachable. Drift > 200 ms surfaces health → yellow. Monotonic clock (independent of wall-clock) is authoritative for telemetry-skew compensation and tick budgets. source: `architecture.md §7.3 Reliability and safety`.
@@ -0,0 +1,357 @@
+# Module Layout
+
+**Language**: rust
+**Layout Convention**: crates-workspace
+**Root**: `crates/`
+**Last Updated**: 2026-05-19
+
+## Layout Rules
+
+1. Each component owns ONE top-level directory under `crates/`. The directory name matches the component name in `_docs/02_document/components/`.
+2. Shared code lives in a single `crates/shared/` crate. Cross-cutting concerns are modules inside it (`shared/models/`, `shared/config/`, `shared/error/`, `shared/health/`, `shared/observability/`, `shared/clock/`, `shared/contracts/`). Other crates re-export from `shared::`; they MUST NOT duplicate types.
+3. Public API surface per component = the files in `Public API` below. Everything under `src/internal/` (and any other module not listed in `Public API`) is internal and other crates MUST NOT use it.
+4. Tests live in each crate's own `tests/` directory (Rust convention). Workspace-level end-to-end tests live at `tests/e2e/` (the workspace root, not under any crate).
+5. **Stream-based wiring**: tokio channels carrying shared data types are passed into actor constructors by the composition root (`crates/autopilot`). This keeps Layer 2 actors free of sibling imports — they receive `Receiver<Frame>`, `Receiver<GimbalState>`, etc. from `shared::models` without importing the crate that produces them.
+6. **Sink traits in shared**: where one component must push into another's transport (e.g. `operator_bridge` pushes POIs through `telemetry_stream`), the receiving side implements a trait defined in `shared::contracts` (`TelemetrySink`, `MavlinkSink`, etc.). The producing side depends only on the trait, not on the receiving crate.
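+
+Rules 5 and 6 can be sketched in a single file. This is an illustration under stated assumptions, not the planned code: `std::sync::mpsc` stands in for tokio channels so the example needs no external crates, `Frame` is reduced to a sequence number, and `push_poi`, `drain`, `RecordingSink`, and `wire` are hypothetical names.
+
+```rust
+use std::sync::mpsc::{channel, Receiver};
+use std::sync::{Arc, Mutex};
+
+// --- would live in `shared::models` / `shared::contracts` ---
+#[derive(Debug, Clone, PartialEq)]
+pub struct Frame {
+    pub seq: u64,
+}
+
+pub trait TelemetrySink: Send + Sync {
+    fn push_poi(&self, poi_id: u64);
+}
+
+// --- rule 5: a Layer 2 actor receives a channel end, never a sibling crate ---
+pub struct MovementDetector {
+    frames: Receiver<Frame>,
+}
+
+impl MovementDetector {
+    pub fn new(frames: Receiver<Frame>) -> Self {
+        Self { frames }
+    }
+
+    /// Drains the frames queued so far and returns their sequence numbers.
+    pub fn drain(&self) -> Vec<u64> {
+        self.frames.try_iter().map(|f| f.seq).collect()
+    }
+}
+
+// --- rule 6: the producing side depends only on the trait ---
+pub struct OperatorBridge {
+    sink: Arc<dyn TelemetrySink>,
+}
+
+impl OperatorBridge {
+    pub fn new(sink: Arc<dyn TelemetrySink>) -> Self {
+        Self { sink }
+    }
+
+    pub fn surface_poi(&self, poi_id: u64) {
+        self.sink.push_poi(poi_id);
+    }
+}
+
+// --- the receiving side implements the trait (test double here) ---
+#[derive(Default)]
+pub struct RecordingSink {
+    pub seen: Mutex<Vec<u64>>,
+}
+
+impl TelemetrySink for RecordingSink {
+    fn push_poi(&self, poi_id: u64) {
+        self.seen.lock().unwrap().push(poi_id);
+    }
+}
+
+/// Composition root: creates channels and sinks, hands each end to its owner.
+pub fn wire() -> (Vec<u64>, Vec<u64>) {
+    let (frame_tx, frame_rx) = channel::<Frame>();
+    let detector = MovementDetector::new(frame_rx);
+    let sink = Arc::new(RecordingSink::default());
+    let bridge = OperatorBridge::new(sink.clone());
+
+    frame_tx.send(Frame { seq: 1 }).unwrap();
+    frame_tx.send(Frame { seq: 2 }).unwrap();
+    bridge.surface_poi(7);
+
+    let seen = sink.seen.lock().unwrap().clone();
+    (detector.drain(), seen)
+}
+```
+
+`wire()` returns the frame sequence numbers the detector drained and the POI ids the sink recorded. Neither `MovementDetector` nor `OperatorBridge` names the type on the other end of its channel or sink.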
+
+## Per-Component Mapping
+
+### Component: shared
+
+- **Epic**: AZ-626 (Bootstrap & Initial Structure — shared crate lands as part of AZ-640 initial structure task)
+- **Directory**: `crates/shared/`
+- **Public API**:
+  - `crates/shared/src/lib.rs` (re-exports the submodules listed below)
+  - `crates/shared/src/models/mod.rs` (`Frame`, `BoundingBox`, `Detection`, `DetectionBatch`, `MovementCandidate`, `Tier2Evidence`, `VlmAssessment`, `POI`, `MapObject`, `MapObjectObservation`, `MapObjectsBundle`, `IgnoredItem`, `Coordinate`, `Geofence`, `MissionItem`, `MissionWaypoint`, `OperatorCommand`, `GimbalState`)
+  - `crates/shared/src/config/mod.rs` (`Config`, `ConfigLoader`, per-component typed sections)
+  - `crates/shared/src/error.rs` (`AutopilotError`, `Result<T>`)
+  - `crates/shared/src/health.rs` (`ComponentHealth`, `AggregatedHealth`, `HealthLevel`)
+  - `crates/shared/src/observability/mod.rs` (`tracing` init, log-field constants per `observability.md §2`)
+  - `crates/shared/src/clock.rs` (`MonoClock`, `WallClock`, `ClockSource`)
+  - `crates/shared/src/contracts/mod.rs` (`TelemetrySink`, `MavlinkSink`, `VlmProvider`, `OperatorCommandSink`)
+- **Internal**: none — shared has no `internal/` subtree; everything in shared is part of its public API by design.
+- **Owns (exclusive write during implementation)**: `crates/shared/**`
+- **Imports from**: (none — Layer 1)
+- **Consumed by**: every other component crate + the `autopilot` binary
+
+---
+
+### Component: mavlink_layer
+
+- **Epic**: AZ-637
+- **Directory**: `crates/mavlink_layer/`
+- **Public API**:
+  - `crates/mavlink_layer/src/lib.rs` (`MavlinkLayer`, `MavlinkHandle`, `MavlinkConnection`, public message types re-exported from `shared::models`)
+- **Internal**:
+  - `crates/mavlink_layer/src/internal/codec/*` (MAVLink v2 encode/decode for the §7.7 surface only)
+  - `crates/mavlink_layer/src/internal/transport/udp.rs`
+  - `crates/mavlink_layer/src/internal/transport/serial.rs`
+  - `crates/mavlink_layer/src/internal/heartbeat.rs`
+  - `crates/mavlink_layer/src/internal/retry.rs`
+- **Owns**: `crates/mavlink_layer/**`
+- **Imports from**: `shared`
+- **Consumed by**: `mission_executor`, `telemetry_stream` (via constructor-injected `Receiver<MavlinkTelemetry>` or via the `MavlinkSink` trait)
+
+---
+
+### Component: mission_client
+
+- **Epic**: AZ-638
+- **Directory**: `crates/mission_client/`
+- **Public API**:
+  - `crates/mission_client/src/lib.rs` (`MissionClient`, `MissionClientHandle::pull_mission()`, `post_middle_waypoint()`, `pull_mapobjects()`, `push_mapobjects()`, `health()`)
+- **Internal**:
+  - `crates/mission_client/src/internal/missions_api/*` (REST client + retry + auth)
+  - `crates/mission_client/src/internal/mapobjects_sync/*` (pre-flight GET + post-flight POST bundles)
+  - `crates/mission_client/src/internal/schema/*` (schema-version validation against `mission-schema`)
+- **Owns**: `crates/mission_client/**`
+- **Imports from**: `shared`
+- **Consumed by**: `mission_executor` (for mission lifecycle), `mapobjects_store` (for hydrate/dump indirectly through `mission_executor` orchestration)
+
+---
+
+### Component: frame_ingest
+
+- **Epic**: AZ-627
+- **Directory**: `crates/frame_ingest/`
+- **Public API**:
+  - `crates/frame_ingest/src/lib.rs` (`FrameIngest`, `FrameIngestHandle::subscribe() -> Receiver<Frame>`, `health()`)
+- **Internal**:
+  - `crates/frame_ingest/src/internal/rtsp_client.rs`
+  - `crates/frame_ingest/src/internal/decoder.rs`
+  - `crates/frame_ingest/src/internal/timestamp.rs`
+- **Owns**: `crates/frame_ingest/**`
+- **Imports from**: `shared`
+- **Consumed by**: `detection_client`, `movement_detector`, `telemetry_stream` (all via composition-root-wired `Receiver<Frame>`)
+
+---
+
+### Component: detection_client
+
+- **Epic**: AZ-628
+- **Directory**: `crates/detection_client/`
+- **Public API**:
+  - `crates/detection_client/src/lib.rs` (`DetectionClient`, `DetectionClientHandle::request(Frame) -> Result<DetectionBatch>`, `health()`)
+- **Internal**:
+  - `crates/detection_client/build.rs` (`tonic-build` for the gRPC proto)
+  - `crates/detection_client/proto/detections.proto` (vendored copy of `../detections` contract per `architecture.md §10`)
+  - `crates/detection_client/src/internal/grpc/*` (bi-directional streaming client, version handshake)
+- **Owns**: `crates/detection_client/**`
+- **Imports from**: `shared`
+- **Consumed by**: `scan_controller` (handle for direct request), `telemetry_stream` (via constructor-injected `Receiver<DetectionBatch>` for operator overlay)
+
+---
+
+### Component: gimbal_controller
+
+- **Epic**: AZ-634
+- **Directory**: `crates/gimbal_controller/`
+- **Public API**:
+  - `crates/gimbal_controller/src/lib.rs` (`GimbalController`, `GimbalControllerHandle::set_pose(...)`, `zoom(level)`, `state() -> GimbalState`, `state_stream() -> Receiver<GimbalState>`, `health()`)
+- **Internal**:
+  - `crates/gimbal_controller/src/internal/a40_protocol/*` (ViewPro A40 UDP vendor protocol — encode, decode, CRC)
+  - `crates/gimbal_controller/src/internal/smooth_pan.rs` (smooth-pan path-tracking primitive)
+- **Owns**: `crates/gimbal_controller/**`
+- **Imports from**: `shared`
+- **Consumed by**: `scan_controller` (handle), `movement_detector` (via constructor-injected `Receiver<GimbalState>`), `frame_ingest` (constructor-injected `Receiver<GimbalState>` for timestamp annotation if needed)
+
+---
+
+### Component: semantic_analyzer
+
+- **Epic**: AZ-630
+- **Directory**: `crates/semantic_analyzer/`
+- **Public API**:
+  - `crates/semantic_analyzer/src/lib.rs` (`SemanticAnalyzer`, `SemanticAnalyzerHandle::analyze(roi) -> Result<Tier2Evidence>`, `health()`)
+- **Internal**:
+  - `crates/semantic_analyzer/src/internal/primitive_graph/*` (path, branch-pile, entrance, road graph reasoner)
+  - `crates/semantic_analyzer/src/internal/roi_cnn.rs` (TensorRT ROI CNN wrapper)
+  - `crates/semantic_analyzer/src/internal/scoring/*` (path-freshness, endpoint, concealment)
+- **Owns**: `crates/semantic_analyzer/**`
+- **Imports from**: `shared`
+- **Consumed by**: `scan_controller` (handle)
+
+---
+
+### Component: vlm_client
+
+- **Epic**: AZ-631
+- **Directory**: `crates/vlm_client/`
+- **Public API**:
+  - `crates/vlm_client/src/lib.rs` (`VlmClient` implementing `shared::contracts::VlmProvider`; `VlmClient::with_default()` returns the no-op impl returning `VlmAssessment { status: vlm_disabled }`; real impl is gated behind `feature = "vlm"`)
+- **Internal**:
+  - `crates/vlm_client/src/internal/uds_client.rs` (Unix-domain socket IPC + peer-credential check)
+  - `crates/vlm_client/src/internal/schema_validate.rs` (`VlmAssessment` schema validation)
+  - `crates/vlm_client/src/internal/prompt.rs` (bounded prompt + payload size enforcement)
+- **Owns**: `crates/vlm_client/**`
+- **Imports from**: `shared`
+- **Consumed by**: `scan_controller` (via the `shared::contracts::VlmProvider` trait — never directly)
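+
+A minimal sketch of the no-op default described under Public API. The trait method `assess`, the `VlmStatus` enum, and `default_provider` are hypothetical; only the `VlmProvider` and `VlmAssessment` names and the `vlm_disabled` status come from this document.
+
+```rust
+/// Hypothetical status field of `VlmAssessment`.
+#[derive(Debug, Clone, PartialEq)]
+pub enum VlmStatus {
+    VlmDisabled,
+    Ok,
+}
+
+#[derive(Debug, Clone, PartialEq)]
+pub struct VlmAssessment {
+    pub status: VlmStatus,
+}
+
+/// Stand-in for `shared::contracts::VlmProvider`; the method shape is assumed.
+pub trait VlmProvider {
+    fn assess(&self, roi_payload: &[u8]) -> VlmAssessment;
+}
+
+/// No-op provider: never contacts a VLM process and always reports disabled.
+pub struct NoopVlm;
+
+impl VlmProvider for NoopVlm {
+    fn assess(&self, _roi_payload: &[u8]) -> VlmAssessment {
+        VlmAssessment { status: VlmStatus::VlmDisabled }
+    }
+}
+
+/// What `scan_controller` would hold: a trait object, never the concrete type.
+pub fn default_provider() -> Box<dyn VlmProvider> {
+    Box::new(NoopVlm)
+}
+```
+
+Because the caller sees only the trait object, a feature-gated real implementation could be swapped in without `scan_controller` changing.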
+
+---
+
+### Component: mapobjects_store
+
+- **Epic**: AZ-633
+- **Directory**: `crates/mapobjects_store/`
+- **Public API**:
+  - `crates/mapobjects_store/src/lib.rs` (`MapObjectsStore`, `MapObjectsStoreHandle::classify(Detection) -> Classification`, `apply_decline(POI)`, `dump_pending() -> MapObjectsBundle`, `hydrate(MapObjectsBundle)`, `set_sync_state(SyncState)`, `health()`)
+- **Internal**:
+  - `crates/mapobjects_store/src/internal/h3_index/*` (`h3rs` wrapper + k-ring queries)
+  - `crates/mapobjects_store/src/internal/engine/mod.rs` (`StorageEngine` trait — pluggable for Q3)
+  - `crates/mapobjects_store/src/internal/engine/in_memory_snapshot.rs` (default impl: in-memory + JSON snapshot on flush)
+  - `crates/mapobjects_store/src/internal/diff.rs` (NEW / MOVED / EXISTING / REMOVED-CANDIDATE classification)
+  - `crates/mapobjects_store/src/internal/ignored.rs`
+- **Owns**: `crates/mapobjects_store/**`
+- **Imports from**: `shared`
+- **Consumed by**: `scan_controller`, `operator_bridge`, `mission_executor` (for hydrate at pre-flight + dump_pending at post-flight)
+
+---
+
+### Component: movement_detector
+
+- **Epic**: AZ-629
+- **Directory**: `crates/movement_detector/`
+- **Public API**:
+  - `crates/movement_detector/src/lib.rs` (`MovementDetector`, `MovementDetectorHandle::candidates() -> Receiver<MovementCandidate>`, `health()`; constructor takes `Receiver<Frame>`, `Receiver<GimbalState>`, `Receiver<MavlinkTelemetry>`)
+- **Internal**:
+  - `crates/movement_detector/src/internal/ego_motion.rs` (homography-based ego-motion estimate)
+  - `crates/movement_detector/src/internal/optical_flow/*` (classical CV path)
+  - `crates/movement_detector/src/internal/learned_cv/*` (fallback per Q14 — behind `feature = "learned_cv"`)
+  - `crates/movement_detector/src/internal/zoom_bands.rs` (per-zoom-band threshold tables)
+  - `crates/movement_detector/src/internal/telemetry_sync.rs` (frame ↔ gimbal ↔ UAV skew gate)
+- **Owns**: `crates/movement_detector/**`
+- **Imports from**: `shared`
+- **Consumed by**: `scan_controller` (consumes the `MovementCandidate` stream)
+
+---
+
+### Component: telemetry_stream
+
+- **Epic**: AZ-639
+- **Directory**: `crates/telemetry_stream/`
+- **Public API**:
+  - `crates/telemetry_stream/src/lib.rs` (`TelemetryStream` implementing `shared::contracts::TelemetrySink`; `TelemetryStreamHandle::commands() -> Receiver<OperatorCommand>`, `health()`; constructor takes `Receiver<Frame>`, `Receiver<DetectionBatch>`, `Receiver<MavlinkTelemetry>`, `Receiver<BboxOverlay>`)
+- **Internal**:
+  - `crates/telemetry_stream/src/internal/uplink/*` (modem push: protocol per `../_docs/04_system_design_clarifications.md` — Q2)
+  - `crates/telemetry_stream/src/internal/downlink/*` (operator-command receive path)
+  - `crates/telemetry_stream/src/internal/encode/*` (frame + telemetry + bbox-overlay serialisation)
+- **Owns**: `crates/telemetry_stream/**`
+- **Imports from**: `shared`
+- **Consumed by**: `operator_bridge` (via the `TelemetrySink` trait in `shared::contracts`; commands consumed via constructor-injected `Receiver<OperatorCommand>`)
+
+---
+
+### Component: operator_bridge
+
+- **Epic**: AZ-635
+- **Directory**: `crates/operator_bridge/`
+- **Public API**:
+  - `crates/operator_bridge/src/lib.rs` (`OperatorBridge`, `OperatorBridgeHandle::surface_poi(POI) -> OperatorDecision`, `middle_waypoint_hints() -> Receiver<MiddleWaypointHint>`, `target_follow_events() -> Receiver<TargetFollowEvent>`, `health()`; constructor takes `Arc<dyn TelemetrySink>` and `Receiver<OperatorCommand>`)
+- **Internal**:
+  - `crates/operator_bridge/src/internal/auth/*` (`OperatorCommand` envelope validation — signature, replay protection, session validation; scheme stubbed pending Q9)
+  - `crates/operator_bridge/src/internal/audit_log.rs` (persistent audit log writer for `/var/lib/autopilot/audit/`)
+  - `crates/operator_bridge/src/internal/decision_window.rs` (confidence-scaled timeout: 40 % → 30 s, 100 % → 120 s linear)
+- **Owns**: `crates/operator_bridge/**`
+- **Imports from**: `shared`, `mapobjects_store`
+- **Consumed by**: `scan_controller` (for `surface_poi`), `mission_executor` (consumes `middle_waypoint_hints` stream)
+
+---
+
+### Component: mission_executor
+
+- **Epic**: AZ-636
+- **Directory**: `crates/mission_executor/`
+- **Public API**:
+  - `crates/mission_executor/src/lib.rs` (`MissionExecutor`, `MissionExecutorHandle::start(Mission)`, `insert_middle_waypoint(Coordinate)`, `failsafe_trigger(FailsafeKind)`, `state() -> ExecutorState`, `health()`; constructor takes `Receiver<MiddleWaypointHint>` from operator_bridge)
+- **Internal**:
+  - `crates/mission_executor/src/internal/multirotor/fsm.rs` (DISCONNECTED → … → LAND)
+  - `crates/mission_executor/src/internal/fixed_wing/fsm.rs` (DISCONNECTED → … → WAIT_AUTO → LAND)
+  - `crates/mission_executor/src/internal/geofence/*` (INCLUSION + EXCLUSION enforcement)
+  - `crates/mission_executor/src/internal/failsafe/ladder.rs` (lost-link `LinkOk → LinkDegraded → LinkLost → LinkLostInFollow`)
+  - `crates/mission_executor/src/internal/battery_thresholds.rs` (RTL floor, hard floor)
+  - `crates/mission_executor/src/internal/bit.rs` (pre-flight built-in self-test; orchestrates pre-flight `mapobjects_store.hydrate(mission_client.pull_mapobjects(...))`)
+  - `crates/mission_executor/src/internal/middle_waypoint.rs` (re-upload sequence on operator confirm)
+  - `crates/mission_executor/src/internal/post_flight.rs` (orchestrates post-flight `mission_client.push_mapobjects(mapobjects_store.dump_pending())`)
+- **Owns**: `crates/mission_executor/**`
+- **Imports from**: `shared`, `mavlink_layer`, `mission_client`, `mapobjects_store`
+- **Consumed by**: `scan_controller` (for `failsafe_trigger` and `insert_middle_waypoint`)
+
+---
+
+### Component: scan_controller
+
+- **Epic**: AZ-632
+- **Directory**: `crates/scan_controller/`
+- **Public API**:
+  - `crates/scan_controller/src/lib.rs` (`ScanController`, `ScanControllerHandle::tick()`, `submit_operator_cmd(OperatorCommand)`, `state() -> ScanState`, `health()`; constructor takes `Receiver<DetectionBatch>`, `Receiver<MovementCandidate>`, `Receiver<Frame>` plus handles for `mapobjects_store`, `gimbal_controller`, `mission_executor`, `semantic_analyzer`, `operator_bridge`, and `Arc<dyn VlmProvider>`)
+- **Internal**:
+  - `crates/scan_controller/src/internal/state_machine/mod.rs` (`ZoomedOut`, `ZoomedIn { roi, hold_started_at }`, `TargetFollow { target_id, started_at }`)
+  - `crates/scan_controller/src/internal/state_machine/transitions.rs`
+  - `crates/scan_controller/src/internal/poi_queue/*` (priority queue + `≤5 POIs/min` cap + confidence × proximity × age ordering)
+  - `crates/scan_controller/src/internal/behaviour_tree/*` (per `system-flows.md §F4`)
+  - `crates/scan_controller/src/internal/timeouts.rs` (operator-decision window, POI timeouts, VLM waits)
+  - `crates/scan_controller/src/internal/frame_rate_guard.rs` (suppress zoom-in transitions when the frame rate falls below 10 fps; surface yellow health)
+- **Owns**: `crates/scan_controller/**`
+- **Imports from**: `shared`, `mapobjects_store`, `gimbal_controller`, `mission_executor`, `semantic_analyzer`, `operator_bridge`
+- **Consumed by**: `autopilot` (composition root)
+
+---
+
+### Component: autopilot (binary, composition root)
+
+- **Epic**: AZ-626 (Bootstrap & Initial Structure — the binary scaffold is part of AZ-640)
+- **Directory**: `crates/autopilot/`
+- **Public API**: this is a `[[bin]]` crate — it exposes no library API.
+- **Internal**:
+  - `crates/autopilot/src/main.rs` (CLI parse, config load, `tracing` init, build component instances, run)
+  - `crates/autopilot/src/runtime.rs` (build channels, wire actors, owns the `Vec<JoinHandle>`, shutdown orchestration)
+  - `crates/autopilot/src/health_server.rs` (HTTP `/health` endpoint per `containerization.md §7`)
+  - `crates/autopilot/src/bit_runner.rs` (invokes `mission_executor.bit()` and gates startup)
+- **Owns**: `crates/autopilot/**`
+- **Imports from**: `shared` + every Layer 2 actor crate + every Layer 3 coordinator + `scan_controller`
+- **Consumed by**: nothing — this is the binary
+
+---
+
+## Shared / Cross-Cutting
+
+All cross-cutting concerns live as modules inside the single `crates/shared/` crate (Rust convention prefers a single shared crate over many tiny ones; the module boundaries inside `shared::` enforce conceptual separation).
+
+### shared::models
+- **Path**: `crates/shared/src/models/`
+- **Purpose**: the canonical entity catalogue from `_docs/02_document/data_model.md`. One submodule per entity grouping (`frame.rs`, `detection.rs`, `movement.rs`, `tier2.rs`, `vlm.rs`, `poi.rs`, `mapobject.rs`, `mission.rs`, `operator.rs`, `gimbal.rs`).
+- **Owned by**: AZ-640 initial structure task (under epic AZ-626).
+- **Consumed by**: every component crate + the `autopilot` binary.
+
+### shared::config
+- **Path**: `crates/shared/src/config/`
+- **Purpose**: TOML loader (per `containerization.md §6`), typed per-component sections, environment-variable overlay, secrets resolution (via path to `EnvironmentFile=`).
+- **Owned by**: AZ-640 initial structure task.
+- **Consumed by**: every component crate.
+
+### shared::error
+- **Path**: `crates/shared/src/error.rs`
+- **Purpose**: `AutopilotError` enum + `Result<T> = std::result::Result<T, AutopilotError>` alias.
+- **Owned by**: AZ-640 initial structure task.
+- **Consumed by**: every crate.
+
+### shared::health
+- **Path**: `crates/shared/src/health.rs`
+- **Purpose**: `ComponentHealth`, `HealthLevel { Green, Yellow, Red, Disabled }`, `AggregatedHealth` — each component exposes its own `health() -> ComponentHealth`; `autopilot::health_server` aggregates per `containerization.md §7`.
+- **Owned by**: AZ-640 initial structure task.
+- **Consumed by**: every component + the binary's health server.
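
The aggregation step can be sketched as follows. The type and variant names come from the Purpose line above; the worst-of rule, the decision to skip `Disabled` components, and the `aggregate` function name are assumptions made for illustration, since the authoritative rule is the one in `containerization.md §7`.

```rust
/// Health levels as named in the spec.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HealthLevel {
    Green,
    Yellow,
    Red,
    Disabled,
}

/// Per-component health report. The field set is hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ComponentHealth {
    pub component: &'static str,
    pub level: HealthLevel,
}

/// Assumed aggregation rule: the worst level wins, and `Disabled`
/// components do not degrade the aggregate.
pub fn aggregate(components: &[ComponentHealth]) -> HealthLevel {
    let mut worst = HealthLevel::Green;
    for c in components {
        worst = match (worst, c.level) {
            (_, HealthLevel::Red) | (HealthLevel::Red, _) => HealthLevel::Red,
            (_, HealthLevel::Yellow) | (HealthLevel::Yellow, _) => HealthLevel::Yellow,
            // Green or Disabled on both sides leaves the aggregate unchanged.
            _ => worst,
        };
    }
    worst
}
```

Under this assumption a single yellow component turns the aggregate yellow, while a disabled component (for example an optional VLM) leaves an otherwise green system green.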
+
+### shared::observability
+- **Path**: `crates/shared/src/observability/`
+- **Purpose**: `tracing-subscriber` init (JSON to stdout); log-field constants for the §2 fields in `observability.md`; span helpers for frame trace + POI trace.
+- **Owned by**: AZ-640 initial structure task.
+- **Consumed by**: every component (for spans and counters).
+
+### shared::clock
+- **Path**: `crates/shared/src/clock.rs`
+- **Purpose**: `MonoClock` (monotonic, authoritative for telemetry-skew compensation and tick budgets), `WallClock` (bound to GPS time once locked, NTP at boot), `ClockSource { Gnss, Host, Coast }`. Drift > 200 ms → yellow health.
+- **Owned by**: AZ-640 initial structure task.
+- **Consumed by**: every component that timestamps anything (frame_ingest, movement_detector, scan_controller, operator_bridge audit log, mapobjects_store).
+
+### shared::contracts
+- **Path**: `crates/shared/src/contracts/`
+- **Purpose**: trait definitions for cross-component coupling that we want to keep import-free:
+  - `TelemetrySink` — `push_frame`, `push_telemetry`, `push_overlay` (impl: `telemetry_stream`)
+  - `MavlinkSink` — `send` (impl: `mavlink_layer`; lets `mission_executor` depend on a trait rather than the concrete crate when convenient)
+  - `VlmProvider` — `assess(roi) -> VlmAssessment` (impl: `vlm_client`; default no-op impl returns `vlm_disabled`)
+  - `OperatorCommandSink` — `dispatch(OperatorCommand)` (lets the composition root forward decoded commands from `telemetry_stream` to `operator_bridge` without coupling them)
+- **Owned by**: AZ-640 initial structure task.
+- **Consumed by**: `operator_bridge` (TelemetrySink), `scan_controller` (VlmProvider), `mission_executor` (may use MavlinkSink), `telemetry_stream` + `vlm_client` (implement the traits).
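
A minimal sketch of the trait-based decoupling described above, using `VlmProvider` as the example. Only the trait name, the `assess` method, and the idea of a no-op implementation that reports the VLM as disabled come from this section; the `Roi` fields, the `VlmAssessment` variants, and the `NoopVlm` and `assess_with` names are hypothetical.

```rust
use std::sync::Arc;

/// Region of interest handed to the VLM. The fields are hypothetical.
pub struct Roi {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

/// Assessment result. `Disabled` stands in for the `vlm_disabled` value
/// mentioned in the spec; `Assessed` is an assumed shape.
#[derive(Debug, PartialEq)]
pub enum VlmAssessment {
    Disabled,
    Assessed { label: String, confidence: f32 },
}

/// The contract lives in `shared`, so the consumer never imports `vlm_client`.
pub trait VlmProvider: Send + Sync {
    fn assess(&self, roi: &Roi) -> VlmAssessment;
}

/// No-op provider for deployments without a VLM.
pub struct NoopVlm;

impl VlmProvider for NoopVlm {
    fn assess(&self, _roi: &Roi) -> VlmAssessment {
        VlmAssessment::Disabled
    }
}

/// What a consumer such as `scan_controller` might do with the injected provider.
pub fn assess_with(provider: &Arc<dyn VlmProvider>, roi: &Roi) -> VlmAssessment {
    provider.assess(roi)
}
```

The composition root would choose between `Arc::new(NoopVlm)` and a real `vlm_client` implementation at start-up; the consumer only ever sees `Arc<dyn VlmProvider>`.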
+
+## Allowed Dependencies (layering)
+
+Read top-to-bottom; an upper layer may import from a lower layer but NEVER the reverse. Same-layer imports are explicitly listed in each component's `Imports from`.
+
+| Layer | Components | May import from |
+|---|---|---|
+| 5. Composition | `autopilot` (binary) | 1, 2, 3, 4 |
+| 4. Brain | `scan_controller` | 1, 2, 3 |
+| 3. Coordinators | `operator_bridge`, `mission_executor` | 1, 2 |
+| 2. Actors / Transports / Storage | `mavlink_layer`, `mission_client`, `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client`, `mapobjects_store`, `gimbal_controller`, `telemetry_stream` | 1 |
+| 1. Shared / Foundation | `shared/*` | (none) |
+
+Violations of this table are **Architecture** findings in code-review and are **High** severity. Specifically:
+
+- A Layer 2 actor MAY NOT import a sibling Layer 2 actor. Stream dependencies (e.g. `movement_detector` consuming `Frame`) are wired via constructor-injected channels by the composition root; sink dependencies (e.g. `operator_bridge` pushing into `telemetry_stream`) are bridged via a trait in `shared::contracts`.
+- A Layer 3 coordinator MAY import any Layer 2 actor whose handle it directly calls. `operator_bridge` imports `mapobjects_store` for `apply_decline`. `mission_executor` imports `mavlink_layer`, `mission_client`, and `mapobjects_store`.
+- A Layer 3 coordinator MAY NOT import another Layer 3 coordinator. `mission_executor` consumes `MiddleWaypointHint` from `operator_bridge` via a constructor-injected `Receiver<MiddleWaypointHint>` wired by the composition root.
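
The last rule can be sketched with standard-library channels. This is an illustration under stated assumptions, not the project's wiring code: the real crates may use a different channel type, and the `MiddleWaypointHint` fields and the `confirm`, `poll_hint`, and `wire` names are hypothetical. The point is that neither coordinator names the other's type; only the composition root sees both.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Lives in `shared::models` in the real layout; the fields here are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct MiddleWaypointHint {
    pub lat: f64,
    pub lon: f64,
}

/// Layer 3 coordinator that produces hints. It knows only the channel.
pub struct OperatorBridge {
    hints_tx: Sender<MiddleWaypointHint>,
}

impl OperatorBridge {
    pub fn new(hints_tx: Sender<MiddleWaypointHint>) -> Self {
        Self { hints_tx }
    }

    /// Returns false if the receiving side has been dropped.
    pub fn confirm(&self, hint: MiddleWaypointHint) -> bool {
        self.hints_tx.send(hint).is_ok()
    }
}

/// Layer 3 coordinator that consumes hints. It never imports `OperatorBridge`.
pub struct MissionExecutor {
    hints_rx: Receiver<MiddleWaypointHint>,
}

impl MissionExecutor {
    pub fn new(hints_rx: Receiver<MiddleWaypointHint>) -> Self {
        Self { hints_rx }
    }

    /// Non-blocking poll for the next hint, if any.
    pub fn poll_hint(&self) -> Option<MiddleWaypointHint> {
        self.hints_rx.try_recv().ok()
    }
}

/// Stand-in for the composition root: the only place that sees both types.
pub fn wire() -> (OperatorBridge, MissionExecutor) {
    let (tx, rx) = channel();
    (OperatorBridge::new(tx), MissionExecutor::new(rx))
}
```

In the workspace each struct would sit in its own crate, so the import ban could be enforced by simply not listing the sibling crate as a dependency.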
+
+## Layout Conventions (reference)
+
+| Language | Root | Per-component path | Public API file | Test path |
+|---|---|---|---|---|
+| Rust | `crates/` | `crates/<component>/` | `crates/<component>/src/lib.rs` | `crates/<component>/tests/` (crate-level) + `tests/e2e/` (workspace-level) |
+
+## Self-verification
+
+- [x] Every component in `_docs/02_document/components/` has a Per-Component Mapping entry (13 components + `shared` + `autopilot` binary).
+- [x] Every shared / cross-cutting concern has a Shared section entry (`models`, `config`, `error`, `health`, `observability`, `clock`, `contracts`).
+- [x] Layering table covers every component, with `shared` at the bottom and `autopilot` binary at the top.
+- [x] No component's `Imports from` list points at a higher layer. (`scan_controller` Layer 4 → Layers 1, 2, 3; coordinators Layer 3 → Layers 1, 2; actors Layer 2 → Layer 1 only.)
+- [x] Paths follow Rust's `crates/<component>/` convention.
+- [x] No two components own overlapping paths — each `Owns` glob is rooted at a distinct `crates/<component>/**`.
@@ -0,0 +1,535 @@
+# Blackbox Tests
+
+Authored by `/test-spec` Phase 2 (2026-05-19). Every scenario observes the SUT only through public surfaces (RTSP, gRPC, MAVLink, REST, operator stream, gimbal UDP, VLM IPC, health endpoint, structured logs). No scenario imports internal modules or peeks at on-device state directly.
+
+Each scenario header records:
+
+- **Summary** — one-line behaviour validated.
+- **Traces to** — AC ID(s) and any RESTRICT ID.
+- **Tier** — execution tier required (U / I / B / E / HW).
+- **Test status** — `READY` or `DEFERRED — <reason>` (per the 2026-05-19 override, deferred scenarios are kept in the spec as release-gate items).
+
+The `Expected outcome` field gives the inline pass/fail criterion; the authoritative comparison lives in `_docs/00_problem/input_data/expected_results/results_report.md` (referenced by row id).
+
+---
+
+## Positive Scenarios — Detection Quality (functional)
+
+### FT-P-001: Tier-1 normalised-box contract conformance
+**Summary**: Frame in → autopilot must consume and re-emit the Tier-1 detection stream conforming to the suite's normalised-box schema (class ids 0..18, coords ∈ [0,1]).
+**Traces to**: AC `Detection Quality / D6`, RESTRICT `Suite-level architectural splits — Tier 1 lives in ../detections`.
+**Tier**: B (mock detector) + E (live `../detections`).
+**Test status**: READY.
+
+**Preconditions**:
+- SUT started; `detections-mock` serving recorded Tier-1 stream for `image-set-existing`.
+- `e2e-consumer` subscribed to the SUT's outbound normalised-box stream (observable via the operator-stream channel and via the internal-test-only `/debug/detections` socket IF exposed in test build; otherwise via operator-stream only).
+
+**Input data**: `fixtures/images/4d6e1830d211ad50.jpg` (1280 px aerial frame) → encoded into `rtsp-loopback` as a 1-second loop.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Begin RTSP playback of the frame loop | SUT consumes the frame; emits a normalised-box detection record on the operator-stream channel |
+| 2 | Capture one emitted detection record | Record validates against `fixtures/schemas/expected_detections.schema.json`; every bbox coord ∈ [0,1]; class_id ∈ {0..18} |
+
+**Expected outcome**: D6 — schema-match + range comparison passes.
+**Max execution time**: 10 s.
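
The range part of the D6 check can be sketched as a small predicate. The `Detection` field names are assumptions; the authoritative shape is `fixtures/schemas/expected_detections.schema.json`, which this sketch does not replace.

```rust
/// One normalised-box record as a test consumer might decode it.
/// The field names are hypothetical.
pub struct Detection {
    pub class_id: u8,
    pub x_min: f32,
    pub y_min: f32,
    pub x_max: f32,
    pub y_max: f32,
}

/// True when the class id is in 0..=18 and every coordinate lies in [0, 1].
/// NaN coordinates fail the check because range containment is false for NaN.
pub fn conforms(d: &Detection) -> bool {
    let coords = [d.x_min, d.y_min, d.x_max, d.y_max];
    d.class_id <= 18 && coords.iter().all(|c| (0.0..=1.0).contains(c))
}
```

A test harness would run this predicate on every captured record after schema validation has passed.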
+
+---
+
+### FT-P-002: Existing-class regression vs documented baseline
+**Summary**: Per-class precision and recall must stay within ±2 percentage points of the pinned baseline (P=0.816, R=0.852).
+**Traces to**: AC `Detection Quality — Existing-class regression / D2`.
+**Tier**: E + HW (HW required for the project-level Acceptance Gate).
+**Test status**: DEFERRED — expected_results baseline JSON not yet recorded (`<DEFERRED: expected_results/existing_classes_baseline.json>`). Visual fixtures (5 frames) are on disk; baseline numbers depend on a recording against the currently pinned `../detections` model.
+
+**Preconditions**:
+- Baseline JSON recorded against pinned `../detections` model (DEFERRED).
+- SUT + live `../detections` running (Tier E) or HW Jetson (HW).
+
+**Input data**: `fixtures/images/{4d6e1830...,54f6459...,6dd601b7...,805bcf1e...,f997d093...}.jpg` (5 frames).
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Stream each frame through RTSP | SUT emits detections per frame |
+| 2 | Compare aggregated per-class P/R to baseline | each per-class P, R within ± 0.02 absolute of baseline |
+
+**Expected outcome**: D2 — `numeric_tolerance` passes.
+**Max execution time**: 60 s.
+
+---
+
+### FT-P-003: New-class precision and recall ≥80%
+**Summary**: New target classes (black entrances, branch piles, footpaths, roads, trees, tree blocks) reach precision ≥0.80 AND recall ≥0.80 per class.
+**Traces to**: AC `Detection Quality — New target classes / D1`.
+**Tier**: E + HW.
+**Test status**: DEFERRED — multi-season annotated new-class eval set not acquired; annotation campaign owned by `../ai-training` repo. `<DEFERRED: expected_results/new_classes_pr.json>`.
+
+**Preconditions**:
+- Multi-season annotated new-class eval set acquired.
+- Tier-1 model updated to include the new target classes listed in the Summary.
+
+**Input data**: `<DEFERRED: new-class eval set across all four seasons>`.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Stream eval-set frames through RTSP | SUT emits detections including new-class items |
+| 2 | Compute per-class P, R | each ≥ 0.80 |
+
+**Expected outcome**: D1 — `threshold_min` passes for every new class.
+**Max execution time**: 120 s.
+
+---
+
+### FT-P-004: Concealed-position recall ≥60% (initial gate)
+**Summary**: System surfaces concealed positions (FPV hideouts, dugouts) with recall ≥0.60, accepting high false-positive rate as operators filter.
+**Traces to**: AC `Detection Quality — Concealed-position recall / D3`.
+**Tier**: E + HW.
+**Test status**: DEFERRED — only 4 starter PNGs on disk; full multi-season annotated set required.
+
+**Input data**: `fixtures/semantic/semantic0[1-4].png` (starter) + `<DEFERRED full set>`.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Stream concealed-position frames | SUT emits concealed-structure POIs |
+| 2 | Compute aggregate recall against ground truth | recall ≥ 0.60 |
+
+**Expected outcome**: D3 — `threshold_min` passes.
+**Max execution time**: 120 s.
+
+---
+
+### FT-P-005: Concealed-position precision ≥20% (initial gate)
+**Summary**: Concealed-position precision ≥0.20 (operators filter; high-FP-accepting gate).
+**Traces to**: AC `Detection Quality — Concealed-position precision / D4`.
+**Tier**: E + HW.
+**Test status**: DEFERRED — same dataset gap as FT-P-004.
+
+**Input data**: same as FT-P-004.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Stream concealed-position frames | SUT emits POIs |
+| 2 | Compute aggregate precision against ground truth | precision ≥ 0.20 |
+
+**Expected outcome**: D4 — `threshold_min` passes.
+
+---
+
+### FT-P-006: Footpath recall ≥70%
+**Summary**: Footpath recall ≥0.70 across multi-season polyline-annotated eval set.
+**Traces to**: AC `Detection Quality — Footpath recall / D5`.
+**Tier**: E + HW.
+**Test status**: DEFERRED — `<DEFERRED: footpath sequences (fresh + stale, all seasons), polyline-annotated>`.
+
+**Input data**: `fixtures/semantic/semantic0[1-4].png` (starter; 4 frames feature footpaths leading to concealment) + `<DEFERRED full multi-season set>`.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Stream footpath-bearing frames | SUT emits footpath polyline annotations |
+| 2 | Compute recall against polyline ground truth | recall ≥ 0.70 |
+
+**Expected outcome**: D5 — `threshold_min` passes.
+
+---
+
+## Positive Scenarios — Movement Detection Behaviour
+
+### FT-P-007: Ego-motion compensation rejects stable scene elements
+**Summary**: With the camera platform itself moving, stable elements (tree rows, houses, roads) MUST NOT generate movement candidates; only the actual mover does.
+**Traces to**: AC `Movement Detection — Stable objects MUST NOT be treated as moving / M1`, RESTRICT `Operational — moving camera platform`.
+**Tier**: B (with paired CSVs) + E.
+**Test status**: DEFERRED — `<DEFERRED: paired gimbal.csv + telemetry.csv for video01.mp4; scene must contain 1 stable tree row + 1 moving vehicle>`.
+
+**Preconditions**:
+- `rtsp-loopback` plays `fixtures/movement/video01.mp4` at 30 fps.
+- `gimbal-mock` replays paired gimbal.csv synchronised to RTSP frame timestamps.
+- `mavlink-sitl` replays paired telemetry (position + attitude) for the same duration.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Begin synchronised playback (video + gimbal + telemetry) | SUT begins consuming frames and ego-motion compensating |
+| 2 | Capture every movement candidate emitted on operator-stream for the clip duration | exactly 1 candidate (the vehicle); tree-row position is NOT among candidates |
+
+**Expected outcome**: M1 — `set_contains` passes; candidate set == {vehicle}; tree-row position ∉ candidates.
+**Max execution time**: clip_duration + 10 s.
+
+---
+
+### FT-P-008: Movement detection continues during zoomed-in hold
+**Summary**: While the camera is in a zoomed-in hold on a confirmed POI, a new mover appearing in the ROI is still detected and enqueued; current ROI is preempted only if the new candidate's priority exceeds it.
+**Traces to**: AC `Movement Detection — MUST continue during the zoomed-in inspection / M2`.
+**Tier**: B + E.
+**Test status**: DEFERRED — `<DEFERRED zoomed-in gimbal.csv + telemetry.csv pair; 1 small mover>`.
+
+**Input data**: `fixtures/movement/video02.mp4` + DEFERRED CSV pair.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Drive SUT into ZoomedIn hold via prior FT-P-016 setup | SUT in `ZoomedIn { roi, hold_started_at }` |
+| 2 | Begin playback of the zoomed-in scene with the small mover | Movement candidate enqueued within 1.5 s (timing checked by NFT-PERF-L7) |
+| 3 | Observe ROI lifecycle | ROI is preempted only if new candidate's `confidence × proximity × age_factor` exceeds the held ROI's; otherwise the held ROI completes |
+
+**Expected outcome**: M2 — `exact` passes; 1 candidate enqueued; ROI preempt decision matches the documented priority rule.
+
+---
+
+### FT-P-009: Per-zoom-band threshold honoured (no false candidate at edge)
+**Summary**: When a movement-cluster persists for one frame BELOW the configured per-zoom-band threshold, no candidate is emitted.
+**Traces to**: AC `Movement Detection — configurable per-zoom-band false-positive budget MUST be honoured / M3`.
+**Tier**: B.
+**Test status**: DEFERRED — `<DEFERRED gimbal.csv simulating threshold edge>`.
+
+**Input data**: `fixtures/movement/video03.mp4` + DEFERRED CSV.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Replay scene at the threshold edge | SUT processes frames |
+| 2 | Observe candidate count over the clip duration | count == 0 |
+
+**Expected outcome**: M3 — `exact` passes.
+
+---
+
+### FT-P-010: Movement zoomed-in benchmark FP-rate budget
+**Summary**: Across the zoom-out + zoomed-in benchmark suite, false-positive rate per zoom band stays within the configurable per-zoom-band budget (Q14 fallback trigger).
+**Traces to**: AC `Q-tagged criteria — Movement detection FP rate at zoomed-in inspection / M4` (depends on Q14).
+**Tier**: B + E.
+**Test status**: DEFERRED — `<DEFERRED: zoom-out + zoomed-in benchmark suite + expected_results/movement_benchmark_caps.json; Q14>`.
+
+**Input data**: `fixtures/movement/video04.mp4` (visual ref) + DEFERRED benchmark suite.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Replay the benchmark suite end-to-end | SUT processes all frames |
+| 2 | Aggregate FP candidates per zoom band | rate per band ≤ configured cap (default ≤ 0.5 / min at zoomed-in) |
+
+**Expected outcome**: M4 — `threshold_max` passes per zoom band.
+
+---
+
+## Positive Scenarios — Scan & Camera Control
+
+### FT-P-011: Sweep → zoomed-inspection transition + POI enqueue
+**Summary**: A POI detected mid-sweep triggers a transition into zoomed-inspection within 2 s (timing: NFT-PERF-L8) AND the POI is enqueued correctly.
+**Traces to**: AC `Scan & Camera Control — Transition from sweep to detailed inspection / S1`.
+**Tier**: B + E.
+**Test status**: DEFERRED — `<DEFERRED: scripted mission with planned route + simulated POI detected mid-sweep>`.
+
+**Input data**: scripted MAVLink mission + scripted Tier-1 detection injection at known frame index.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Start SUT with scripted mission; begin RTSP playback | SUT enters `ZoomedOut`, performs sweep |
+| 2 | Inject Tier-1 detection of a high-confidence target at frame N | SUT transitions to `ZoomedIn { roi, hold_started_at }`; ROI bbox matches the injected detection's bbox; POI queue length increments by 1 |
+
+**Expected outcome**: S1 — `exact (transition)` + `exact (ROI matches POI bbox)` + `exact (queue Δ+1)`.
+
+---
+
+### FT-P-012: Footpath-pan during zoomed-in hold
+**Summary**: During a zoomed-in hold on a footpath ROI, the camera pans along the footpath while the airframe continues to fly. The footpath stays in the centre 50% of frame for the duration of the hold.
+**Traces to**: AC `Scan & Camera Control — pan to keep features visible / S2`.
+**Tier**: B + E.
+**Test status**: DEFERRED — `<DEFERRED: zoomed-inspection scenario with footpath polyline overlapping the ROI>`.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Drive SUT into ZoomedIn hold on a footpath ROI | SUT in `ZoomedIn { roi, hold_started_at }` |
+| 2 | Continue airframe flight; observe gimbal commands stream | SUT issues pan commands to track the footpath; observed centre offset ≤ 25% per frame |
+
+**Expected outcome**: S2 — `numeric_tolerance` passes; per-frame centre offset ≤ 0.25 × frame_dim.
+
+---
+
+### FT-P-013: Target-follow centre-window
+**Summary**: After operator confirmation, target-follow mode keeps the target within the centre 25% of frame while visible.
+**Traces to**: AC `Scan & Camera Control — target-follow mode / S3`.
+**Tier**: B + E.
+**Test status**: DEFERRED — `<DEFERRED: operator-confirmed target + 60 s follow window>`.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Drive SUT into `TargetFollow { target_id, started_at }` via prior FT-P-016 | mode == target-follow |
+| 2 | Observe gimbal commands + per-frame target position for 60 s | per-frame \|dx\| and \|dy\| each ≤ 0.125 × frame_size |
+
+**Expected outcome**: S3 — `threshold_max` passes per frame.
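
The per-frame check can be sketched as below, assuming target positions and frame dimensions are given in pixels and the offset is measured from the frame centre on each axis independently. The function and parameter names are hypothetical.

```rust
/// True when the target lies within `max_offset_frac` of the frame dimension
/// from the frame centre on both axes. For the centre 25 % window the
/// fraction is 0.125.
pub fn within_centre_window(
    target_x: f32,
    target_y: f32,
    frame_w: f32,
    frame_h: f32,
    max_offset_frac: f32,
) -> bool {
    let dx = (target_x - frame_w / 2.0).abs();
    let dy = (target_y - frame_h / 2.0).abs();
    dx <= max_offset_frac * frame_w && dy <= max_offset_frac * frame_h
}
```

For a 1920 × 1080 frame the limits work out to 240 px horizontally and 135 px vertically. Passing 0.25 instead gives the centre 50 % window used by the footpath-pan scenario.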
+
+---
+
+### FT-P-014: POI queue ordering by `confidence × proximity × age_factor`
+**Summary**: With 3 POIs varying in confidence × proximity × age_factor, the system pops them in the documented relative order.
+**Traces to**: AC `Scan & Camera Control — POI queue MUST be ordered by … / S4`.
+**Tier**: B.
+**Test status**: READY (synthetic-poi-feeds inline-authorable).
+
+**Input data**: `synthetic-poi-feeds` ordering test — 3 POIs with confidence ∈ {0.50, 0.80, 0.60}, proximity ∈ {near, mid, far}, age_factor ∈ {fresh, fresh, stale} chosen to produce a known relative ordering.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Inject the 3 POIs as Tier-1 detections | all 3 enter the queue |
+| 2 | Observe ZoomedIn transitions over the next N seconds | SUT inspects POIs in the documented relative order |
+
+**Expected outcome**: S4 — `exact (order)` passes.
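
A sketch of how the synthetic feed could fix a known ordering. The spec gives the product `confidence × proximity × age_factor` but not the numeric encoding of near/mid/far or fresh/stale, so the scores used in the usage note below are assumptions chosen only to make the order unambiguous.

```rust
/// A queued POI with the three factors already reduced to numeric scores.
#[derive(Debug, Clone, PartialEq)]
pub struct Poi {
    pub id: u32,
    pub confidence: f32,
    pub proximity: f32,
    pub age_factor: f32,
}

impl Poi {
    /// Priority as documented: the product of the three factors.
    pub fn priority(&self) -> f32 {
        self.confidence * self.proximity * self.age_factor
    }
}

/// Ids in the order the queue would pop them: highest priority first.
pub fn inspection_order(mut pois: Vec<Poi>) -> Vec<u32> {
    pois.sort_by(|a, b| b.priority().total_cmp(&a.priority()));
    pois.into_iter().map(|p| p.id).collect()
}
```

With the assumed scores near = 1.0, mid = 0.6, far = 0.3, fresh = 1.0 and stale = 0.5, the three POIs from the input data get priorities of 0.50, 0.48 and 0.09, so the expected inspection order is POI 1, POI 2, POI 3 even though POI 2 has the highest raw confidence.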
+
+---
+
+### FT-P-015: Zoomed-in hold cap interacts with deep-analysis
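+**Summary**: Zoomed-in hold defaults to 5 s/POI, but deep-analysis interactions are capped at 2 s. With deep analysis enabled, the hold ends at min(deep_analysis_complete_at, 2 s); with deep analysis disabled, the hold runs to the 5 s per-POI timeout.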
+**Traces to**: AC `Scan & Camera Control — hold endpoints up to 2 s for deep analysis … per-POI timeout (default 5 s/POI) / S5`.
+**Tier**: B + E.
+**Test status**: DEFERRED — `<DEFERRED: VLM-enabled hold scenario with vlm_io_pair returning within 2 s>`.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Drive SUT into ZoomedIn hold; enable deep-analysis | SUT begins VLM IPC call on enter |
+| 2a | Case A: VLM returns at 1.5 s | hold ends at 1.5 s (deep_analysis_complete) |
+| 2b | Case B: VLM returns at 3.0 s | hold ends at 2.0 s (deep-analysis cap) |
+| 2c | Case C: deep-analysis disabled | hold ends at 5.0 s (per-POI timeout) |
+
+**Expected outcome**: S5 — `exact` passes for each case.
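
The three cases in the table can be sketched as one function. The 5 s and 2 s constants come from the scenario; the function signature, and the choice to end the hold at the cap when the VLM never answers, are assumptions.

```rust
use std::time::Duration;

/// Per-POI hold timeout (default from the scenario).
pub const PER_POI_TIMEOUT: Duration = Duration::from_secs(5);
/// Upper bound on how long a hold waits for deep analysis.
pub const DEEP_ANALYSIS_CAP: Duration = Duration::from_secs(2);

/// Hold duration measured from hold start.
/// `vlm_returns_after` is `None` when the VLM never answers.
pub fn hold_duration(deep_analysis_enabled: bool, vlm_returns_after: Option<Duration>) -> Duration {
    if !deep_analysis_enabled {
        // Case C: no deep analysis, so the per-POI timeout applies.
        return PER_POI_TIMEOUT;
    }
    match vlm_returns_after {
        // Cases A and B: whichever comes first, the VLM answer or the cap.
        Some(t) => t.min(DEEP_ANALYSIS_CAP),
        // Assumed: a silent VLM also ends the hold at the cap.
        None => DEEP_ANALYSIS_CAP,
    }
}
```

Case A (answer at 1.5 s) yields 1.5 s, Case B (answer at 3.0 s) yields 2 s, and Case C yields 5 s.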
+
+---
+
+## Positive Scenarios — Operator Workflow
+
+### FT-P-016: Operator confirm → middle waypoint inserted + target-follow
+**Summary**: Valid + signed operator-confirm command results in a middle waypoint POSTed to `missions` AND a transition into target-follow mode.
+**Traces to**: AC `Operator Workflow — Operator confirmation MUST result in … / O8`.
+**Tier**: B + E.
+**Test status**: READY for happy path (default placeholder envelope until Q9 resolves; envelope replaced when Q9 ships).
+
+**Input data**: `operator-envelopes` (valid happy path) + `mission-suite-fixture` (DEFERRED full version) + `operator-session-scripts` (nominal session).
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | SUT in ZoomedIn hold on a POI surfaced to the operator | mode == ZoomedIn |
+| 2 | Replay operator-confirm envelope on the return path | SUT validates envelope; commits decision |
+| 3 | Observe HTTPS POST to `missions-mock` | `POST /missions/{id}` with a middle waypoint at the POI MGRS; HTTP 200 |
+| 4 | Observe scan-mode state | mode == `TargetFollow { target_id, started_at }` |
+
+**Expected outcome**: O8 — `exact (HTTP 200)` + `exact (mode == TargetFollow)`.
+
+---
+
+### FT-P-017: Decision window = 30 s at conf = 0.40
+**Summary**: At confidence = 0.40 the decision window surfaced to the operator MUST equal 30 s (lower-bound anchor of the linear scale).
+**Traces to**: AC `Operator Workflow — decision window … 40% confidence → 30 s / O1`.
+**Tier**: B.
+**Test status**: READY.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Inject a synthetic POI at conf=0.40 | POI surfaced on operator-stream with `decision_window_seconds: 30` |
+
+**Expected outcome**: O1 — `exact (window == 30 s)`.
+
+---
+
+### FT-P-018: Decision window = 120 s at conf = 1.00
+**Summary**: At confidence = 1.00 the decision window MUST equal 120 s (upper-bound anchor).
+**Traces to**: AC `O2`.
+**Tier**: B.
+**Test status**: READY.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Inject a synthetic POI at conf=1.00 | window == 120 s |
+
+**Expected outcome**: O2 — `exact`.
+
+---
+
+### FT-P-019: Decision window linear interpolation at conf = 0.70
+**Summary**: At conf=0.70 the window is interpolated linearly between (0.40, 30 s) and (1.00, 120 s) → 75 s ± 0.5 s.
+**Traces to**: AC `O3`.
+**Tier**: B.
+**Test status**: READY.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Inject a synthetic POI at conf=0.70 | window ≈ 75 s ± 0.5 s |
+
+**Expected outcome**: O3 — `numeric_tolerance ± 0.5 s`.
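
The interpolation works out as 30 + (0.70 − 0.40) / (1.00 − 0.40) × (120 − 30) = 30 + 0.5 × 90 = 75 s. A minimal sketch of the same formula follows; clamping confidences outside [0.40, 1.00] to the anchors is an assumption, since the scenarios only pin the two anchors and the midpoint, and the function name is hypothetical.

```rust
/// Decision window in seconds for a POI confidence, interpolated linearly
/// between the anchors (0.40, 30 s) and (1.00, 120 s).
/// Assumption: out-of-range confidences are clamped to the nearest anchor.
pub fn decision_window_seconds(confidence: f64) -> f64 {
    let c = confidence.clamp(0.40, 1.00);
    30.0 + (c - 0.40) / (1.00 - 0.40) * (120.0 - 30.0)
}
```

Because the arithmetic is floating-point, a test should compare the midpoint against 75 s with a tolerance, as the ± 0.5 s criterion above already does.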
+
+---
+
+### FT-P-020: Operator decline → persistent ignored-item
+**Summary**: Operator-decline on a surfaced POI MUST persist an ignored-item entry keyed by `(MGRS cell, class_group)`.
+**Traces to**: AC `Operator Workflow — Operator-decline MUST result in a persistent ignored-item entry / O5`.
+**Tier**: B + E.
+**Test status**: READY (operator-session-scripts inline-authorable; envelope uses default placeholder until Q9 resolves).
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Surface a POI to the operator | POI on operator-stream |
+| 2 | Replay operator-decline envelope | SUT validates; ignored-item count via health endpoint increments by 1; new item has `(MGRS, class_group)` matching the declined POI |
+
+**Expected outcome**: O5 — `exact (count Δ+1)` + `schema_match` (ignored-item record shape).
+
+---
+
+### FT-P-021: Ignored-item suppresses future matching detections
+**Summary**: A new detection whose `(MGRS, class_group)` matches an existing ignored-item MUST NOT be surfaced to the operator.
+**Traces to**: AC `Operator Workflow — A new detection whose (MGRS, class_group) matches an existing ignored-item MUST NOT be surfaced / O6`.
+**Tier**: B + E.
+**Test status**: READY.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Seed an ignored-item for `(MGRS=X, class_group=Y)` via FT-P-020 | ignored-item present |
+| 2 | Inject a new detection at `(MGRS=X, class_group=Y)` | operator-stream emits NO POI for this detection; counter `pois_suppressed_by_ignored_total` increments |
+
+**Expected outcome**: O6 — `exact (count surfaced == 0)`.
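
The decline-then-suppress behaviour exercised by FT-P-020 and FT-P-021 can be sketched with an in-memory set keyed by `(MGRS cell, class_group)`. The real store persists the entries; persistence is omitted here, and the type and method names are hypothetical. The counter mirrors `pois_suppressed_by_ignored_total`.

```rust
use std::collections::HashSet;

/// Key of an ignored-item entry, as described in the scenario.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct IgnoredKey {
    pub mgrs_cell: String,
    pub class_group: String,
}

/// In-memory stand-in for the ignored-item store.
#[derive(Default)]
pub struct IgnoredItems {
    keys: HashSet<IgnoredKey>,
    suppressed_total: u64,
}

impl IgnoredItems {
    /// Record an operator decline. Returns true when a new entry was created.
    pub fn decline(&mut self, key: IgnoredKey) -> bool {
        self.keys.insert(key)
    }

    /// False when the detection matches an ignored item; the suppression
    /// counter is incremented in that case.
    pub fn should_surface(&mut self, key: &IgnoredKey) -> bool {
        if self.keys.contains(key) {
            self.suppressed_total += 1;
            false
        } else {
            true
        }
    }

    /// Number of detections suppressed so far.
    pub fn suppressed_total(&self) -> u64 {
        self.suppressed_total
    }
}
```

An operator timeout would call neither method, which is what keeps the ignored-item count unchanged in the forget case.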
+
+---
+
+### FT-P-022: Operator timeout = forget (no ignored-item)
+**Summary**: If the decision window expires with no operator response, the POI is removed from the queue but NO ignored-item is created (forget, do not blacklist).
+**Traces to**: AC `Operator Workflow — Timeout (no operator response within the window) MUST NOT create an ignored-item entry / O7`.
+**Tier**: B + E.
+**Test status**: READY.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Surface a POI at conf=0.40 (30 s window) | POI on operator-stream |
+| 2 | Wait > 30 s with no response | POI removed from queue; ignored-item count UNCHANGED |
+
+**Expected outcome**: O7 — `exact (queue −1)` + `exact (ignored-item count unchanged)`.
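+
+The decline / suppress / timeout semantics of FT-P-020 through FT-P-022 can be sketched as a small keyed set. This is a minimal illustration, not the autopilot's implementation: `IgnoredItems`, `Outcome`, and the in-memory `HashSet` are hypothetical names standing in for whatever persistent store the SUT actually uses.
+
+```rust
+use std::collections::HashSet;
+
+/// How an operator decision window ended (illustrative names).
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum Outcome {
+    Confirmed,
+    Declined,
+    TimedOut,
+}
+
+/// Hypothetical in-memory stand-in for the persistent ignored-item store.
+#[derive(Default)]
+pub struct IgnoredItems {
+    keys: HashSet<(String, String)>, // (MGRS, class_group)
+}
+
+impl IgnoredItems {
+    /// Record how a decision window ended. Only an explicit decline creates
+    /// an ignored-item; a timeout forgets the POI without blacklisting it.
+    pub fn record(&mut self, mgrs: &str, class_group: &str, outcome: Outcome) {
+        if outcome == Outcome::Declined {
+            self.keys.insert((mgrs.to_string(), class_group.to_string()));
+        }
+    }
+
+    /// True when a new detection must NOT be surfaced to the operator.
+    pub fn suppresses(&self, mgrs: &str, class_group: &str) -> bool {
+        self.keys.contains(&(mgrs.to_string(), class_group.to_string()))
+    }
+
+    /// Number of ignored-items currently held.
+    pub fn len(&self) -> usize {
+        self.keys.len()
+    }
+}
+```
+
+A harness-side model of this kind could act as an oracle for the O5 to O7 assertions: the count rises by one after a decline, stays unchanged after a timeout, and a matching `(MGRS, class_group)` is suppressed afterwards.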
+
+---
+
+## Positive Scenarios — Pre-flight & Map Reconciliation
+
+### FT-P-023: BIT pre-flight pass with every dependency healthy
+**Summary**: When every external dependency is reachable + healthy AND on-device storage < 95 % full AND wall-clock is bound, BIT passes and takeoff is permitted.
+**Traces to**: AC `Reliability & Safety — Pre-flight self-test MUST pass / R1`, RESTRICT `Reliability & Safety obligations — Pre-flight self-test (BIT) MUST gate takeoff`.
+**Tier**: B + E.
+**Test status**: READY (bit-scenarios inline-authorable).
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Bring up all mocks healthy + clean autopilot-state volume | every dependency green |
+| 2 | Trigger BIT via the BIT-arm operator command (or scripted in `operator-session-scripts`) | health endpoint returns `{ "ok": true, "deps": { ...all green }, "takeoff_permitted": true }` |
+
+**Expected outcome**: R1 — `exact (takeoff_permitted == true)` + `exact (health.all == "green")`.
+
+---
+
+### FT-P-024: Pre-flight map pull ≤ 30 s for a 30×30 km region
+**Summary**: Pulling the area-level map of previously-detected objects for a 30 km × 30 km mission area MUST complete within 30 s wall-clock.
+**Traces to**: AC `Map Reconciliation — Pre-flight map pull / Mp1`.
+**Tier**: B + E.
+**Test status**: DEFERRED — `<DEFERRED: mock central area-map service with ~10000 map objects for the 30 km × 30 km region>`.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Configure `missions-mock` with the 30×30 km mapobjects fixture | mock ready |
+| 2 | Trigger BIT (which pulls the map) | SUT issues `GET /missions/{id}/mapobjects`; local copy hydrated within 30 s |
+| 3 | Confirm BIT proceeds normally afterwards | takeoff permitted |
+
+**Expected outcome**: Mp1 — `threshold_max` passes (NFT-PERF measures the latency; this scenario asserts the functional pathway).
+
+---
+
+### FT-P-025: Post-flight map diff push for a 60-minute mission
+**Summary**: Pushing the post-flight pass diff (~17 500 records: NEW + MOVED + REMOVED + CONFIRMED-EXISTING) for a 60-minute mission MUST complete within 120 s wall-clock.
+**Traces to**: AC `Map Reconciliation — Post-flight pass diff push / Mp3`.
+**Tier**: B + E.
+**Test status**: DEFERRED — `<DEFERRED: 60-minute mission pass diff fixture>`.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Land the SUT after a 60-minute mission (scripted) | SUT enters post-flight reconciliation |
+| 2 | Observe HTTPS POST to `missions-mock` | `POST /missions/{id}/mapobjects` with the diff; HTTP 200 within 120 s |
+
+**Expected outcome**: Mp3 — `threshold_max` passes (NFT-PERF measures latency).
+
+---
+
+### FT-P-026: MapObjects conflict resolution (append-only + projection)
+**Summary**: When two map updates conflict for the same `(spatial-cell, class_group)`, the SUT records both observations append-only AND computes the current view per the documented resolution rule.
+**Traces to**: AC `Q-tagged — MapObjects conflict resolution / Mp5` (depends on Q8).
+**Tier**: B + E.
+**Test status**: DEFERRED — `<DEFERRED: conflict pair fixture + expected_results/mapobjects_conflict_resolution.json; Q8>`.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Seed the local mapobjects store via map pull | local store hydrated |
+| 2 | Trigger two conflicting observations for `(cell=X, class=Y)` | both appended to the observation log |
+| 3 | Observe the projected current view (via the operator-stream map-overlay channel or health debug) | current view matches the resolution rule (Q8) |
+
+**Expected outcome**: Mp5 — `json_diff` passes against the reference.
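+
+The append-only log plus projected current view described in FT-P-026 can be sketched as follows. The resolution rule is still open (Q8), so the rule used here, "the observation with the latest timestamp wins", is only a placeholder assumption; `Observation`, `ObservationLog`, and every field name are hypothetical.
+
+```rust
+use std::collections::HashMap;
+
+/// One observation of a (spatial-cell, class_group) pair. Illustrative shape.
+#[derive(Debug, Clone, PartialEq)]
+pub struct Observation {
+    pub cell: String,
+    pub class_group: String,
+    pub observed_at_ms: u64,
+    pub present: bool,
+}
+
+/// Append-only log: entries are never mutated or removed.
+#[derive(Default)]
+pub struct ObservationLog {
+    entries: Vec<Observation>,
+}
+
+impl ObservationLog {
+    pub fn append(&mut self, obs: Observation) {
+        self.entries.push(obs);
+    }
+
+    pub fn len(&self) -> usize {
+        self.entries.len()
+    }
+
+    /// Project the current view. PLACEHOLDER rule (Q8 unresolved): the latest
+    /// timestamp wins; on a tie the later-appended entry wins.
+    pub fn project(&self) -> HashMap<(String, String), &Observation> {
+        let mut view: HashMap<(String, String), &Observation> = HashMap::new();
+        for obs in &self.entries {
+            let key = (obs.cell.clone(), obs.class_group.clone());
+            let newer = view
+                .get(&key)
+                .map_or(true, |cur| obs.observed_at_ms >= cur.observed_at_ms);
+            if newer {
+                view.insert(key, obs);
+            }
+        }
+        view
+    }
+}
+```
+
+Keeping the projection as a pure function over the log means the rule can be swapped once Q8 resolves without touching the stored observations.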
+
+---
+
+## Negative Scenarios
+
+### FT-N-001: BIT inhibits takeoff when Tier-1 detection is unreachable
+**Summary**: When `../detections` is unreachable at BIT, takeoff MUST be inhibited and the detection dependency MUST report red.
+**Traces to**: AC `Reliability & Safety — Pre-flight self-test MUST pass / R2`, RESTRICT `Suite-level architectural splits — Tier 1 lives in ../detections`.
+**Tier**: B + E.
+**Test status**: READY.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Stop `detections-mock` | dependency unreachable |
+| 2 | Trigger BIT | health endpoint returns `takeoff_permitted: false`; `deps.detection == "red"`; operator-stream surfaces a BIT-failure event with category `detection` |
+| 3 | Attempt to issue a takeoff MAVLink command (scripted) | SUT refuses; no MAVLink takeoff command observed on `mavlink-sitl` |
+
+**Expected outcome**: R2 — `exact (takeoff inhibited)`.
+
+---
+
+### FT-N-002: BIT inhibits takeoff when persistent storage ≥ 95 % full
+**Summary**: When the on-device persistent store is ≥ 95 % full at BIT, takeoff MUST be inhibited.
+**Traces to**: AC `Reliability & Safety — Pre-flight self-test MUST pass / R3`, RESTRICT `On-device storage MUST be bounded`.
+**Tier**: B.
+**Test status**: READY.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Pre-fill `autopilot-state` volume to ≥ 95 % via seed file | storage threshold tripped |
+| 2 | Trigger BIT | `takeoff_permitted: false`; `deps.storage == "red"` |
+
+**Expected outcome**: R3 — `exact (takeoff inhibited)`.
+
+---
+
+### FT-N-003: Cache-fallback on map-pull timeout requires operator acknowledgement
+**Summary**: When the pre-flight map pull times out, the SUT falls back to last-known cached MapObjects, reports `map_sync == "cached_fallback"`, AND MUST require explicit operator acknowledgement before takeoff is permitted.
+**Traces to**: AC `Map Reconciliation — Cache-fallback on timeout is acceptable only with explicit operator acknowledgement / Mp2`.
+**Tier**: B + E.
+**Test status**: READY (operator-session-scripts inline-authorable; cached state seeded from prior pull).
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Seed `autopilot-state` with a known prior MapObjects snapshot | cached map present |
+| 2 | Configure `missions-mock` to time out on `GET /missions/{id}/mapobjects` | mock returns 504 / silent timeout |
+| 3 | Trigger BIT | SUT falls back to cached; `map_sync == "cached_fallback"`; BIT reports `takeoff_permitted: false, awaiting_ack: ["map_cache_fallback"]` |
+| 4 | Replay operator-ack envelope for `map_cache_fallback` | BIT now reports `takeoff_permitted: true`; one structured-log entry at WARN with `map_cache_fallback_acked_by_operator` |
+| 5 | In a separate run that stops after step 3 (no ack replayed), attempt takeoff | takeoff remains inhibited |
+
+**Expected outcome**: Mp2 — `exact (cached_fallback)` + `exact (BIT requires explicit ack)`.
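+
+Taken together, FT-P-023 and FT-N-001 through FT-N-003 describe one gate: takeoff is permitted only when every dependency is healthy, storage is below 95 %, the wall-clock is bound, and no acknowledgement is outstanding. A minimal sketch of that decision follows, assuming hypothetical `BitInputs` / `BitResult` types; the real health payload may be shaped differently.
+
+```rust
+/// Inputs to the pre-flight gate. Field names are illustrative.
+pub struct BitInputs {
+    /// Every external dependency is reachable and healthy.
+    pub deps_green: bool,
+    /// Fill level of the on-device persistent store, in percent.
+    pub storage_used_pct: f64,
+    /// Wall-clock is bound to a trusted source.
+    pub clock_bound: bool,
+    /// The pre-flight map pull timed out and the cached map is in use.
+    pub map_from_cache: bool,
+    /// The operator acknowledged the cache fallback.
+    pub cache_fallback_acked: bool,
+}
+
+pub struct BitResult {
+    pub takeoff_permitted: bool,
+    pub awaiting_ack: Vec<&'static str>,
+}
+
+pub fn evaluate_bit(i: &BitInputs) -> BitResult {
+    let mut awaiting_ack = Vec::new();
+    if i.map_from_cache && !i.cache_fallback_acked {
+        awaiting_ack.push("map_cache_fallback");
+    }
+    // Storage at or above 95 % inhibits takeoff (R3).
+    let healthy = i.deps_green && i.clock_bound && i.storage_used_pct < 95.0;
+    let takeoff_permitted = healthy && awaiting_ack.is_empty();
+    BitResult { takeoff_permitted, awaiting_ack }
+}
+```
+
+The sketch collapses per-dependency status into one boolean for brevity; the scenarios above assert on individual `deps.*` entries, so a real model would keep them separate.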
+
+---
+
+### FT-N-004: Below-threshold POI suppression (conf < 40 %)
+**Summary**: A POI at confidence < 0.40 MUST NOT be surfaced to the operator at all.
+**Traces to**: AC `Operator Workflow — Below 40% confidence, the POI MUST NOT be surfaced at all / O4`.
+**Tier**: B.
+**Test status**: READY.
+
+| Step | Consumer Action | Expected System Response |
+|---|---|---|
+| 1 | Inject a synthetic POI at conf=0.39 | POI does NOT appear on operator-stream; counter `pois_below_threshold_total` increments by 1 |
+
+**Expected outcome**: O4 — `exact (count surfaced == 0)`.
+
+---
+
+## Notes for downstream skills
+
+- Decompose: every `READY` scenario above maps to at least one blackbox test task. DEFERRED scenarios MUST still produce a task spec (so the implementation has a placeholder), but the task spec's `Acceptance` section will reference the leftover entry that gates the fixture.
+- Implement Tests: per-scenario assertion helpers (RTSP playback orchestration, MAVLink observer, operator-stream observer) are likely shared across scenarios — Phase 4's runner scripts will assume a thin `e2e/consumer/lib/` module that all scenarios depend on.
+- Test-Spec Sync (cycle-update mode): post-implementation, scenarios may be split (e.g. FT-P-015's three sub-cases may become FT-P-015a/b/c) or merged. The traceability-matrix is the source of truth — every scenario MUST trace to at least one AC or RESTRICT.
@@ -0,0 +1,320 @@
+# Test Environment
+
+Authored by `/test-spec` Phase 2 (2026-05-19) against:
+
+- `_docs/00_problem/problem.md`, `acceptance_criteria.md`, `restrictions.md`, `security_approach.md`
+- `_docs/01_solution/solution_draft01.md`
+- `_docs/02_document/architecture.md` (incl. §6 NFR Targets, §7 Detailed Design)
+- `_docs/00_problem/input_data/data_parameters.md`, `services.md`, `fixtures/README.md`, `expected_results/results_report.md`
+
+Per `.cursor/rules/artifact-srp.mdc` this artifact owns ONLY the test environment / harness shape — measurable thresholds belong in `acceptance_criteria.md`, fixture inventory belongs in `test-data.md`, and per-test specs belong in the sibling `*-tests.md` files.
+
+---
+
+## Overview
+
+**System under test (SUT)**: `autopilot`, a single Rust binary that runs on the Jetson Orin Nano Super of a reconnaissance UAV. Its observable external surfaces:
+
+| Surface | Direction | Protocol | Source/Sink in production |
+|---|---|---|---|
+| Tier-1 detection RPC | autopilot ⇄ detector | bi-directional gRPC streaming (local) | `../detections` |
+| MAVLink command/telemetry | autopilot ⇄ airframe | MAVLink v2 over UDP (or serial) | ArduPilot / PX4 |
+| Camera RTSP feed | camera → autopilot | H.264/265 1080p, 30/60 fps | ViewPro A40 |
+| Gimbal control + telemetry | autopilot ⇄ camera | ViewPro vendor UDP | ViewPro A40 |
+| Mission + MapObjects REST | autopilot ⇄ central | HTTPS JSON | `missions` service |
+| Operator stream (telemetry out, commands in) | autopilot ⇄ GS | Suite-level modem protocol, signed commands | Ground Station |
+| Deep-analysis VLM IPC (optional) | autopilot ⇄ VLM | Unix-domain socket | local-onboard VLM |
+| Health endpoint | autopilot → ops | HTTP/JSON | scraped by ops |
+| Structured logs | autopilot → ops | JSON to stdout | log shipper |
+
+The harness exercises every one of those surfaces from outside the SUT process. No test reaches inside the binary (no module imports, no direct DB peeks, no shared memory).
+
+**Consumer app purpose**: a black-box test runner (`e2e-consumer`) that:
+
+1. Brings up the SUT in a controlled topology (with mock or live peers).
+2. Drives inputs through public surfaces.
+3. Captures every observable: outbound network frames, MAVLink commands, gimbal UDP commands, REST calls, operator-stream messages, health-endpoint JSON, log lines, plus passive resource metrics (RSS, CPU, GPU).
+4. Compares each observation against the expected result tagged in `_docs/00_problem/input_data/expected_results/results_report.md` and emits a CSV report.
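+
+Step 4 above, comparing an observation against its expected result, might look like the sketch below. Only two of the comparison methods named in the scenario specs (`exact` and `threshold_max`) are modelled; `schema_match` and `json_diff` are omitted. The `Compare` type is hypothetical, and treating `threshold_max` as an inclusive bound is an assumption based on the "≤ 30 s" wording in FT-P-024.
+
+```rust
+/// A subset of the comparison methods, as an illustrative enum.
+#[derive(Debug, Clone, PartialEq)]
+pub enum Compare {
+    /// The observed value must equal the expected string exactly.
+    Exact(String),
+    /// The observed numeric value must not exceed the limit (assumed inclusive).
+    ThresholdMax(f64),
+}
+
+impl Compare {
+    /// Returns true when the observed value satisfies the expectation.
+    /// A non-numeric observation never passes a threshold check.
+    pub fn passes(&self, actual: &str) -> bool {
+        match self {
+            Compare::Exact(expected) => actual == expected.as_str(),
+            Compare::ThresholdMax(limit) => actual
+                .trim()
+                .parse::<f64>()
+                .map(|v| v <= *limit)
+                .unwrap_or(false),
+        }
+    }
+}
+```
+
+Taking the observation as a string mirrors the `actual_value` column of the CSV report, so the same value that is compared is the one that gets logged.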
+
+## Test execution tiers
+
+Five execution tiers exist; each test scenario declares which tier(s) it must run in:
+
+| Tier | Purpose | What is real vs mocked | When it runs |
+|---|---|---|---|
+| **U** — unit | Pure in-process logic with no external surface (state-machine transitions, geometry helpers, schema validators) | Everything in-process | Per commit (cargo test) |
+| **I** — component-integration | One autopilot component against mocks for every peer | SUT component real; all peers stubbed/replayed | Per commit; isolates contract drift |
+| **B** — blackbox / harness | Full SUT binary against mock peers in containers | SUT binary real; every external peer mocked (HTTPS mock, gRPC replay, MAVLink SITL, scripted operator trace, RTSP loopback) | Per commit + nightly |
+| **E** — suite-e2e | Full SUT against live siblings (`../detections`, `../missions`, ArduPilot SITL, Ground Station replay) | All real services in the suite-e2e compose | Nightly + pre-release |
+| **HW** — hardware/replay benchmark | SUT binary on representative Jetson hardware OR on a benchmarked replay of that hardware | Real Jetson Orin Nano Super OR benchmarked replay | Pre-release; the only path that satisfies the `acceptance_criteria.md → Acceptance Gates (project-level)` hardware gate |
+
+Hardware-dependency analysis (which AC rows require HW vs replay vs commodity) is produced by the test-spec `phases/hardware-assessment.md` step before Phase 4 runner scripts are generated and is appended to this file as `## Hardware Execution Matrix`.
+
+## Docker environment (Tier B + E)
+
+The suite-e2e compose lives at the monorepo level (`../e2e/docker-compose.suite-e2e.yml`, owned by the `monorepo-e2e` skill — see `_docs/00_problem/input_data/services.md`). The autopilot-local harness lives at `e2e/docker-compose.autopilot-e2e.yml` (created by Phase 4) and brings up only the SUT + mocks needed for Tier-B runs.
+
+### Services (Tier B — autopilot-local harness)
+
+| Service | Image / Build | Purpose | Ports |
+|---|---|---|---|
+| `autopilot` | build: `.` (cross to `aarch64-unknown-linux-gnu` for HW, native for Tier B) | SUT | health: 9100/tcp; log: stdout; MAVLink: 14550/udp; gimbal: 9201/udp; operator: 9301/tcp |
+| `detections-mock` | build: `e2e/mocks/detections-mock` (Python) | Bi-directional gRPC mock replaying recorded `Detections` streams | 50051/tcp |
+| `missions-mock` | build: `e2e/mocks/missions-mock` (Python FastAPI) | HTTPS REST mock — `GET/POST /missions/{id}` + `/mapobjects` | 8443/tcp (TLS) |
+| `rtsp-loopback` | image: `bluenviron/mediamtx` | RTSP server playing back recorded `.mp4` frame sequences at 30/60 fps | 8554/tcp |
+| `gimbal-mock` | build: `e2e/mocks/gimbal-mock` (Rust) | ViewPro UDP echo + scripted yaw/pitch/zoom telemetry replays | 9200/udp |
+| `mavlink-sitl` | image: `ardupilot/ardupilot-sitl` | ArduPilot SITL — MAVLink v2 endpoint for the autopilot to drive | 14551/udp |
+| `vlm-mock` | build: `e2e/mocks/vlm-mock` (Python, UDS) | Optional Tier-3 VLM IPC mock; replays recorded `VlmAssessment` JSON | (UDS only) |
+| `operator-replay` | build: `e2e/mocks/operator-replay` (Python) | Scripted Ground Station session trace: connect / push frame / push telemetry / operator-click / modem-drop / reconnect / lost-link | 9300/tcp |
+| `time-injector` | build: `e2e/mocks/time-injector` (Rust) | Injects clock-drift / NTP-loss scenarios into the SUT container's clock via `faketime` LD_PRELOAD shim | — |
+| `e2e-consumer` | build: `e2e/consumer` (Rust + assert crates) | The black-box test runner that drives scenarios + compares observables to expected results | — |
+
+### Networks
+
+| Network | Services | Purpose |
+|---|---|---|
+| `autopilot-e2e` | all | Isolated test network; no egress |
+
+### Volumes
+
+| Volume | Mounted to | Purpose |
+|---|---|---|
+| `fixtures-ro` | every mock service (read-only) | Mounts `_docs/00_problem/input_data/fixtures/` for replay sources |
+| `expected-ro` | `e2e-consumer:/expected:ro` | Mounts `_docs/00_problem/input_data/expected_results/` for assertion comparison |
+| `reports-rw` | `e2e-consumer:/reports` | CSV + JSON test output |
+| `autopilot-state` | `autopilot:/var/lib/autopilot` | On-device persistent store (R3, Mp4) — wiped between runs |
+
+### docker-compose structure (outline only — not runnable)
+
+```yaml
+services:
+  autopilot:
+    build: .
+    depends_on: [detections-mock, missions-mock, rtsp-loopback, gimbal-mock, mavlink-sitl, operator-replay]
+    networks: [autopilot-e2e]
+    environment:
+      DETECTOR_GRPC: detections-mock:50051
+      MISSIONS_URL: https://missions-mock:8443
+      RTSP_URL: rtsp://rtsp-loopback:8554/feed
+      GIMBAL_UDP: gimbal-mock:9200
+      MAVLINK_UDP: mavlink-sitl:14551
+      OPERATOR_TCP: operator-replay:9300
+      VLM_SOCK: /tmp/vlm.sock
+      AUTOPILOT_CONFIG: /etc/autopilot/test.toml
+    volumes:
+      - autopilot-state:/var/lib/autopilot
+  detections-mock: { build: e2e/mocks/detections-mock, volumes: [fixtures-ro:/fixtures:ro] }
+  missions-mock:   { build: e2e/mocks/missions-mock,   volumes: [fixtures-ro:/fixtures:ro] }
+  rtsp-loopback:   { image: bluenviron/mediamtx,       volumes: [fixtures-ro:/fixtures:ro] }
+  gimbal-mock:     { build: e2e/mocks/gimbal-mock,     volumes: [fixtures-ro:/fixtures:ro] }
+  mavlink-sitl:    { image: ardupilot/ardupilot-sitl }
+  vlm-mock:        { build: e2e/mocks/vlm-mock,        volumes: [fixtures-ro:/fixtures:ro] }
+  operator-replay: { build: e2e/mocks/operator-replay, volumes: [fixtures-ro:/fixtures:ro] }
+  time-injector:   { build: e2e/mocks/time-injector }
+  e2e-consumer:
+    build: e2e/consumer
+    depends_on: [autopilot]
+    volumes: [expected-ro:/expected:ro, reports-rw:/reports]
+networks:
+  autopilot-e2e: {}
+volumes:
+  fixtures-ro: { driver_opts: { type: none, o: bind, device: "${PWD}/_docs/00_problem/input_data/fixtures" } }
+  expected-ro: { driver_opts: { type: none, o: bind, device: "${PWD}/_docs/00_problem/input_data/expected_results" } }
+  reports-rw: {}
+  autopilot-state: {}
+```
+
+### Suite-e2e compose (Tier E) — referenced, not redefined
+
+For Tier-E runs the harness uses `../e2e/docker-compose.suite-e2e.yml` (owned by `monorepo-e2e`). It adds the real `../detections`, real `../missions`, and a richer `mavlink-sitl` configuration. Autopilot's Tier-E entries in this file MUST mirror the suite-e2e topology — drift is reconciled by the `monorepo-e2e` skill, not here.
+
+## Consumer application (`e2e-consumer`)
+
+**Tech stack**: Rust + `assert_cmd` + `testcontainers-rs` + `prost`/`tonic` (for gRPC observation) + `mavlink-rs` (for MAVLink observation) + `reqwest`/`hyper` (for HTTPS observation) + `tokio-tungstenite` (for operator-stream observation). Tests are organised one-scenario-per-file under `e2e/consumer/tests/scenarios/`.
+
+**Entry point**: `cargo test --release --test scenarios` (orchestrated by `scripts/run-tests.sh`, produced in Phase 4).
+
+### Communication with the system under test
+
+| Interface | Protocol | Endpoint / Topic | Authentication |
+|---|---|---|---|
+| Health endpoint | HTTP GET | `http://autopilot:9100/health` | none (loopback) |
+| Structured log stream | line-delimited JSON on stdout | docker-compose log tail | none |
+| MAVLink observed | MAVLink v2 / UDP | `mavlink-sitl:14551` (the harness records both sides of the link) | per Q6: MAVLink-2 message signing if configured |
+| Gimbal observed | ViewPro UDP | `gimbal-mock:9200` (commands recorded + telemetry replayed) | none |
+| RTSP delivered | RTSP | `rtsp://rtsp-loopback:8554/feed` (consumer schedules which clip plays per scenario) | none |
+| Detection RPC observed | gRPC streaming | `detections-mock:50051` (consumer scripts the recorded replay served) | none |
+| Mission REST observed | HTTPS | `missions-mock:8443` (consumer scripts JSON fixtures + asserts captured request bodies) | TLS cert (self-signed for test) |
+| Operator stream observed | Suite modem protocol | `operator-replay:9300` (consumer scripts session traces + signed-command envelopes) | per Q9: signed envelope (HMAC / ed25519 / MAVLink-2-ext) |
+| VLM IPC observed (when enabled) | Unix-domain socket | `/tmp/vlm.sock` shared with `vlm-mock` | peer-credential check (security_approach §"Local IPC peer authorisation") |
+
+### What the consumer does NOT have access to
+
+- No direct database access to the autopilot's on-device persistent store (`autopilot-state` volume) — the consumer reads it only via the health endpoint, the operator telemetry stream, or as a post-run forensic check (the storage AC R3 is checked via the BIT health response, not by peeking at SQLite rows).
+- No internal Rust module imports — the consumer is a separate crate compiled against published public proto/schema files only.
+- No shared memory, no `/proc/$pid/...` inspection beyond passive resource metrics.
+- No direct reading of in-flight POI queue ordering — ordering is observed indirectly via the operator-stream emission order and the gimbal command stream.
+
+## External dependency mocks
+
+| Dependency | Mock service | Determinism guarantee | Source fixture(s) |
+|---|---|---|---|
+| `../detections` Tier-1 RPC | `detections-mock` | Replays recorded `Detections` stream byte-for-byte; same input → same output | `<DEFERRED: tier1_replay/*.replay; services.md §1>` (live `../detections` used as fallback in Tier-E) |
+| `missions` API | `missions-mock` | Static JSON responses per scenario; recorded round-trip captured for `POST` | `<DEFERRED: missions_fixtures/*.json; services.md §2>` |
+| ViewPro A40 camera frames | `rtsp-loopback` (mediamtx) | Plays back `.mp4` at exact configured fps; frame timestamps deterministic | `fixtures/videos/94d42580bd1ad6ff.mp4`, `fixtures/movement/video0[1-4].mp4` |
+| ViewPro A40 gimbal control | `gimbal-mock` | Replays `gimbal.csv` per scenario; echoes commands with bounded latency budget per scenario | `<DEFERRED: gimbal_csv/*.csv paired with movement videos; services.md §6>` |
+| ArduPilot airframe | `mavlink-sitl` (ArduPilot SITL) | Deterministic seed + scripted mission | scripted per scenario; no fixture file required for Tier B (SITL is the fixture) |
+| Ground Station modem session | `operator-replay` | Replays `(t, event)` script per scenario | `<DEFERRED: operator_sessions/*.script; services.md §3>` |
+| Local VLM (Tier-3 optional) | `vlm-mock` | Returns paired `(roi.png → VlmAssessment)` from disk; schema-violation fixtures for fail-closed tests | `<DEFERRED: vlm_io_pairs/*.json; services.md §7>` |
+| Wall-clock / GPS / NTP | `time-injector` (faketime LD_PRELOAD) | Scripted offset / jump / source-loss; injected at SUT process start | scripted per scenario; no fixture file required |
+
+Mocks that are marked `<DEFERRED:>` are bridged through `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`. Scenarios that consume those mocks declare `Test status: DEFERRED — input fixture not yet acquired (see leftover row N)` in their entry under the relevant `*-tests.md` file.
+
+## CI/CD integration
+
+| Stage | Tier(s) | When | Gate | Timeout |
+|---|---|---|---|---|
+| PR pipeline | U, I | on every PR push | block merge on FAIL | 10 min |
+| dev-branch nightly | U, I, B | nightly | warn on FAIL; report attached | 60 min |
+| weekly suite-e2e | U, I, B, E | weekly + on release branch | block release on FAIL | 180 min |
+| pre-release HW benchmark | HW | manual + pre-release | block release on FAIL | 240 min |
+
+Pipeline orchestration is owned by `_docs/02_document/deployment/ci_cd_pipeline.md`; this file only declares which tier each scenario MUST run in.
+
+## Reporting
+
+**Format**: CSV (one row per scenario per run).
+
+**Columns**:
+
+| Column | Type | Notes |
+|---|---|---|
+| `test_id` | string | e.g. `FT-P-001`, `NFT-PERF-L1`, `NFT-SEC-O9` |
+| `test_name` | string | short title from the scenario header |
+| `tier` | enum | U / I / B / E / HW |
+| `seed` | int | deterministic seed used (where applicable) |
+| `start_ts_utc` | ISO 8601 | scenario start |
+| `duration_ms` | int | total execution time |
+| `result` | enum | PASS / FAIL / SKIP / DEFERRED |
+| `expected_result_ref` | string | row id in `expected_results/results_report.md` (e.g. `L1`, `Mp3`) |
+| `actual_value` | string | quantitative observation (latency_ms, count, etc.) |
+| `compare_method` | string | one of the comparison methods defined in `expected_results/results_report.md` (e.g. `exact`, `threshold_max`) |
+| `tolerance` | string | as declared in the expected-results row |
+| `failure_reason` | string | populated only on FAIL or DEFERRED |
+| `artifacts_path` | string | path under `/reports/<run-id>/` for captured logs / pcaps / mavlink dumps |
+
+**Output path**: `e2e/consumer/reports/<run-id>/report.csv` (mounted host-side to `./reports/<run-id>/report.csv`).
+
+**Sidecar artifacts** per scenario (one folder per `test_id`): `stdout.log`, `stderr.log`, `mavlink.tlog` (where applicable), `pcap.bin` (where applicable), `health-trace.jsonl`, `actual-output.json`.
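+
+As a rough illustration of the report format, the sketch below lists the thirteen columns in table order and serialises one row. The quoting follows the usual RFC 4180 convention, which matters because `failure_reason` is free text; `COLUMNS`, `csv_field`, and `csv_line` are hypothetical helper names, not part of any existing crate.
+
+```rust
+/// Column order of `report.csv`, taken from the table above.
+pub const COLUMNS: [&str; 13] = [
+    "test_id",
+    "test_name",
+    "tier",
+    "seed",
+    "start_ts_utc",
+    "duration_ms",
+    "result",
+    "expected_result_ref",
+    "actual_value",
+    "compare_method",
+    "tolerance",
+    "failure_reason",
+    "artifacts_path",
+];
+
+/// Quote a field when it contains a comma, a double quote, or a line break;
+/// embedded double quotes are doubled.
+pub fn csv_field(raw: &str) -> String {
+    if raw.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
+        format!("\"{}\"", raw.replace('"', "\"\""))
+    } else {
+        raw.to_string()
+    }
+}
+
+/// Serialise one report row; the array length enforces the column count.
+pub fn csv_line(fields: &[&str; 13]) -> String {
+    fields
+        .iter()
+        .map(|f| csv_field(f))
+        .collect::<Vec<_>>()
+        .join(",")
+}
+```
+
+The header row is then `COLUMNS.join(",")`. A production runner would more likely rely on an established CSV library; the hand-rolled version only keeps the sketch dependency-free.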
+
+## Test Execution
+
+**Decision** (recorded 2026-05-19 by `phases/hardware-assessment.md`): **local-only on Jetson Orin Nano Super**. Every scenario — Tier B, Tier E, Tier HW — runs on representative Jetson hardware (the same hardware the airborne payload deploys to). Docker is used for **service orchestration** (mocks, sibling services) on the Jetson host, NOT for SUT execution on x86.
+
+### Hardware dependencies found
+
+| File | Dependency surfaced |
+|---|---|
+| `_docs/00_problem/restrictions.md → "Hardware"` | Jetson Orin Nano Super (aarch64), 8 GB shared LPDDR5, 67 TOPS INT8; ViewPro A40 (40× optical zoom + vendor UDP); ViewPro Z40K compatibility |
+| `_docs/00_problem/restrictions.md → "Software environment"` | FP16 precision (INT8 rejected); no cloud egress; Tier 1 + local large models share Jetson GPU with mutual exclusion |
+| `_docs/01_solution/solution_draft01.md` | "single Rust binary on Jetson Orin Nano Super (aarch64)"; TensorRT FP16; Tokio + Unix-domain-socket VLM IPC |
+| `_docs/02_document/architecture.md §6` (NFR Targets) + `§7.6` (Solution Architecture) + `§7.14` (Tech Stack) | cross-compile target `aarch64-unknown-linux-gnu`; TensorRT engine; gimbal UDP; MAVLink-v2 transport |
+| `_docs/02_document/components/*/description.md` (13 components) | physical UDP (gimbal_controller), RTSP capture (frame_ingest), MAVLink airframe link (mavlink_layer), local-onboard model (semantic_analyzer + vlm_client) |
+
+### Why local-only on Jetson
+
+The choice rejects two alternatives:
+
+- **Docker-only on x86** would leave Tier-HW rows (L1–L9, Re1, Re2, NFT-RES-LIM-CPU, NFT-RES-LIM-GPU) `SKIPPED-NO-HW`. That defeats the project-level Acceptance Gate (`acceptance_criteria.md → "Acceptance Gates (project-level)"`: every latency criterion MUST be measured on the deployed compute device).
+- **Both x86 + Jetson** would split the test surface and let Tier-B scenarios pass on x86 while masking real-hardware regressions (e.g. GPU contention is invisible on x86). The honest path is to exercise the actual hardware path uniformly.
+
+### Execution instructions (local on Jetson)
+
+**Prerequisites** (one-time, per Jetson runner):
+- JetPack 6.x SDK + L4T r36.x (matches the airborne deployment image).
+- Rust toolchain pinned to the workspace's `rust-toolchain.toml` (added by Step 7 Implement); rustup target `aarch64-unknown-linux-gnu` already native here.
+- Docker + Docker Compose v2 (for orchestrating the mock services + sibling repos in Tier-E mode).
+- `mavlink-router`, `tegrastats`, `iperf3`, `tc` (network shaping).
+- ViewPro A40 (or Z40K for the Z40K-swap regression run) connected over Ethernet at the documented control endpoint.
+- ArduPilot SITL binary installed natively (the Docker image is x86-only; on Jetson aarch64 we run SITL natively or via Apptainer).
+- A representative ViewPro A40 RTSP feed source — either the physical camera or a recorded `.mp4` looped through a local `mediamtx`.
+
+**How to start services**: `docker compose -f e2e/docker-compose.autopilot-e2e.yml up -d` brings up `detections-mock`, `missions-mock`, `rtsp-loopback`, `gimbal-mock`, `vlm-mock`, `operator-replay`, `time-injector` on the Jetson host. The SUT (`autopilot` binary) runs **outside** the compose — `cargo run --release` on the Jetson directly, with env vars pointing at the compose-side mock endpoints. For Tier E, swap `detections-mock` → live `../detections` and `missions-mock` → live `missions` per `../e2e/docker-compose.suite-e2e.yml`.
+
+**How to run the test runner**: `scripts/run-tests.sh` (to be created by a Decompose task per `traceability-matrix.md → "Phase 4 SKIPPED"` handoff) orchestrates: bring up compose → start SUT → run `cargo test --release --test scenarios -p e2e-consumer` → tear down. The runner reads `RUN_TIER ∈ {B, E, HW}` to decide which scenarios to execute.
+
+**Environment variables** (consumed by both the SUT and the consumer):
+- `RUN_TIER` (`B` | `E` | `HW`) — selects scenario set per the matrix below.
+- `AUTOPILOT_CONFIG` — path to the test profile TOML (overrides per-scenario thresholds + Q-tagged defaults).
+- `AUTOPILOT_RNG_SEED` — deterministic-seed per scenario; captured in the CSV report.
+- `JETSON_RUNNER_ID` — identifier for the physical Jetson + camera+gimbal hardware combo; carried into every CSV row for forensic comparison across runners.
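+
+A minimal sketch of how the runner might parse `RUN_TIER` is shown below. The `RunTier` enum and `run_tier_from_env` helper are hypothetical; failing on an unset or unknown value (rather than silently defaulting to a tier) is a design assumption, not something the spec mandates.
+
+```rust
+use std::str::FromStr;
+
+/// The tiers selectable through `RUN_TIER`.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum RunTier {
+    B,
+    E,
+    Hw,
+}
+
+impl FromStr for RunTier {
+    type Err = String;
+
+    fn from_str(s: &str) -> Result<Self, Self::Err> {
+        match s {
+            "B" => Ok(RunTier::B),
+            "E" => Ok(RunTier::E),
+            "HW" => Ok(RunTier::Hw),
+            other => Err(format!("RUN_TIER must be B, E, or HW, got {other:?}")),
+        }
+    }
+}
+
+/// Read `RUN_TIER` from the environment. An unset variable is an error so
+/// that a misconfigured runner cannot silently execute the wrong scenario set.
+pub fn run_tier_from_env() -> Result<RunTier, String> {
+    std::env::var("RUN_TIER")
+        .map_err(|e| format!("RUN_TIER: {e}"))?
+        .parse()
+}
+```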
+
+### CI/CD addendum (overrides the earlier `## CI/CD integration` table)
+
+The earlier table assumed a Docker-on-x86 PR pipeline. Under this decision, every tier runs on a Jetson runner. Operationally that means:
+
+| Stage | Tier(s) | When | Gate | Timeout | Runner |
+|---|---|---|---|---|---|
+| PR pipeline | U, I | on every PR push | block merge on FAIL | 10 min | Jetson runner (native cargo test for U + I) |
+| dev-branch nightly | U, I, B | nightly | warn on FAIL; report attached | 60 min | Jetson runner |
+| weekly suite-e2e | U, I, B, E | weekly + on release branch | block release on FAIL | 180 min | Jetson runner + live siblings reachable from it |
+| pre-release HW benchmark | HW | manual + pre-release | block release on FAIL | 240 min | Jetson runner + physical A40 + airframe SITL/HW |
+
+Capacity note: the PR pipeline running on Jetson trades x86 throughput for execution honesty. If PR latency becomes painful, the team's mitigation is to add more Jetson runners — NOT to fall back to x86 for Tier B (that would defeat the choice).
+
+## Hardware Execution Matrix
+
+Per the local-only-on-Jetson decision, every tier runs on Jetson. The matrix below is collapsed accordingly: it records **what each scenario actually exercises on the Jetson** (which hardware surface is the load-bearing one) so that a runner-capacity planner can predict which scenarios contend for the same physical resource.
+
+| Scenario | Tier | Jetson surface exercised | Concurrent-with constraint |
+|---|---|---|---|
+| FT-P-001 (D6 Tier-1 contract) | B + E | GPU (Tier 1 inference) | conflicts with NFT-RES-LIM-Re2 / GPU |
+| FT-P-002 — FT-P-006 (D1–D5) | E + HW | GPU (Tier 1 inference) | as above |
+| FT-P-007 — FT-P-010 (M1–M4) | B + E | CPU (movement) + GPU (Tier 1 inputs) | as above |
+| FT-P-011 — FT-P-015 (S1–S5) | B + E | CPU + gimbal UDP + GPU (Tier 3 in S5) | gimbal contention serialises S1/S2/S3 |
+| FT-P-016 — FT-P-022 (O1–O7, O8 happy) | B + E | CPU + operator-stream | low contention |
+| FT-P-023 (R1 BIT pass) | B + E | every dep mocked | none |
+| FT-N-001 — FT-N-002 (R2/R3) | B + E | none (storage seed manipulation) | none |
+| FT-N-003 (Mp2 cache-fallback) | B + E | mock timeout on `missions-mock` | none |
+| FT-N-004 (O4 below-threshold) | B | CPU only | none |
+| FT-P-024 / FT-P-025 / FT-P-026 (Mp1/Mp3/Mp5) | B + E | network + persistent store | persistent-store contention serialises |
+| NFT-PERF-L1 | **HW** | GPU (Tier 1) | dedicate runner — measurement integrity |
+| NFT-PERF-L2 | HW + B | GPU (Tier 2) | conflicts with L1/L3/L8 — serialise |
+| NFT-PERF-L3 | HW + B (vlm-mock) | GPU (Tier 3 VLM) | conflicts with L1/L2 — serialise |
+| NFT-PERF-L4 | **HW** | A40 physical zoom motor | dedicate runner — physical motion |
+| NFT-PERF-L5 | HW + B | CPU + gimbal UDP | serialise with L4/L8 |
+| NFT-PERF-L6 / L7 | B + E | CPU + ego-motion + GPU (Tier 1 inputs) | serialise with L1 |
+| NFT-PERF-L8 | HW + B | A40 physical zoom + Tier 1 GPU | dedicate runner |
+| NFT-PERF-L9 | B + E | CPU + operator-stream | low contention |
+| NFT-PERF-T1 | B | CPU + queue | none |
+| NFT-PERF-T2 | B + E | airframe link | low |
+| NFT-PERF-T3 | B | RTSP throttling + health | none |
+| NFT-RES-R4–R9 | B + E | airframe link + persistent store | serialise per-mission |
+| NFT-RES-Mp2 / Mp4 | B + E | network + persistent store | low |
+| NFT-SEC-O9 / O10 | B + E | operator-stream + crypto path | low |
+| NFT-SEC-CraftedFrame / OversizeCrop | B | decoder CPU | low |
+| NFT-SEC-VlmSchemaViolation / FreeFormText | B (vlm-mock) | UDS IPC | low |
+| NFT-SEC-IpcPeerAuth | B | UDS IPC + peer-cred | low |
+| NFT-SEC-Tier1SchemaViolation | B | Tier-1 RPC | none |
+| NFT-SEC-MavlinkUnsigned | B + E | airframe link (Q6 dep) | low |
+| NFT-SEC-HealthExposesSecurity | B | counters + health | low |
+| NFT-RES-LIM-Re1 | **HW** | full Jetson workload (RSS) | dedicate runner — measurement integrity |
+| NFT-RES-LIM-Re2 | **HW** | Tier 1 + autopilot workload concurrent | runs back-to-back with NFT-PERF-L1 in same session |
+| NFT-RES-LIM-Storage | B + HW | persistent store | low |
+| NFT-RES-LIM-CPU | **HW** | full CPU | dedicate runner |
+| NFT-RES-LIM-GPU | **HW** | GPU mutex (Tier 1 vs Tier 3) | dedicate runner |
+| NFT-RES-LIM-FileHandles | B + HW | `/proc/<pid>/fd` | low |
+
+**Bold Tier values** mark scenarios that REQUIRE a physical Jetson and, in some cases, a physical A40 to satisfy the project-level Acceptance Gate; surrogate replay does NOT count for those rows.
+
+**Capacity rule**: scenarios marked `dedicate runner` MUST NOT run concurrently with any other scenario on the same Jetson — measurement integrity depends on the workload being exclusively the SUT.
+
+## Open dependencies that affect the harness
+
+| Open Q | Affects | Default until resolved |
+|---|---|---|
+| Q6 (MAVLink-2 signing) | `mavlink-sitl` config + observed-MAVLink assertions | signing disabled; tests skip signing assertions until Q6 lands |
+| Q8 (MapObjects conflict resolution) | Mp5 fixture shape | `<DEFERRED>` |
+| Q9 (Operator-command auth scheme) | `operator-replay` envelope format + signature validator | `<DEFERRED>` for O9/O10; O8 runs the happy path only |
+| Q11 (multi-operator session policy) | `operator-replay` session-id semantics | single-operator only |
+| Q14 (movement-detection classical vs learned-CV) | M4 benchmark fixture shape | `<DEFERRED>` |
@@ -0,0 +1,270 @@
+# Performance Tests
+
+Authored by `/test-spec` Phase 2 (2026-05-19). Performance tests measure latency / rate / sustained-load characteristics. Functional behaviour that those characteristics enable lives in `blackbox-tests.md`. Resource ceilings live in `resource-limit-tests.md`.
+
+Every scenario records steady-state metrics — cold-start measurements are explicitly excluded by a warm-up precondition. Pass criteria use the methods in `_docs/00_problem/input_data/expected_results/results_report.md` (referenced by row id).
+
+---
+
+## Latency
+
+### NFT-PERF-L1: Tier-1 per-frame end-to-end latency ≤ 100 ms
+**Summary**: Per-frame end-to-end latency through the Tier-1 contract (frame in → normalised-box record out) ≤ 100 ms at 1280 px input.
+**Traces to**: AC `Latency — Primitive (Tier 1) object detection / L1`.
+**Tier**: HW (representative Jetson Orin Nano Super) OR benchmarked replay; these are the only tiers that satisfy the project-level Acceptance Gate.
+**Metric**: per-frame wall-clock from RTSP frame-receive timestamp to normalised-box emission timestamp.
+
+**Preconditions**:
+- Warm-up: 100 frames played before measurement starts (TensorRT engine warm, autopilot's frame pipeline in steady state).
+- Single 1280 px frame replayed via `rtsp-loopback`; the live Tier-1 service is colocated on the same Jetson.
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Play `fixtures/images/4d6e1830d211ad50.jpg` as a 60 s loop at 30 fps | record per-frame (frame_receive_ts, normalised_box_emit_ts); compute Δms |
+| 2 | Aggregate over the measurement window | report p50, p95, p99, max |
+
+**Pass criteria**: `p95 ≤ 100 ms` AND `max ≤ 150 ms` (max gives a soft headroom; AC enforces the p95 line).
+**Duration**: 60 s after warm-up.
+**Test status**: READY (fixture present); Tier requires HW for the release gate.
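
The step 2 aggregation and the pass criteria above could be evaluated along these lines. This is a minimal sketch, not the harness's actual implementation: the names `percentile` and `evaluate_l1` are hypothetical, and the nearest-rank percentile method is an assumption (the authoritative method is whatever `results_report.md` specifies).

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile (assumed method; the spec does not name one)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate_l1(deltas_ms, p95_limit_ms=100.0, max_limit_ms=150.0):
    """Summarise per-frame latencies (ms) and apply the L1 pass criteria."""
    summary = {
        "p50": percentile(deltas_ms, 50),
        "p95": percentile(deltas_ms, 95),
        "p99": percentile(deltas_ms, 99),
        "max": max(deltas_ms),
    }
    # Both limits must hold: the AC line (p95) and the soft headroom (max).
    summary["passed"] = (
        summary["p95"] <= p95_limit_ms and summary["max"] <= max_limit_ms
    )
    return summary
```

Only steady-state samples should be passed in; the 100 warm-up frames are assumed to be dropped by the caller.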
+
+---
+
+### NFT-PERF-L2: Tier-2 per-ROI semantic confirmation ≤ 200 ms
+**Summary**: Per-ROI latency through Tier-2 semantic confirmation ≤ 200 ms.
+**Traces to**: AC `Latency — Semantic confirmation (Tier 2) / L2`.
+**Tier**: HW + Tier-B (inline ROI crop generation).
+**Metric**: per-ROI wall-clock from ROI submitted to Tier-2 to Tier-2 emits semantic confirmation.
+
+**Preconditions**:
+- Warm-up: 50 ROIs processed before measurement.
+- Test runner derives a ~640×640 ROI inline from `fixtures/images/4d6e1830d211ad50.jpg` and injects it directly into the SUT's Tier-2 entry (via a test-only ROI submission API exposed in test builds).
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Submit 1000 ROIs at 5 Hz | per-ROI Δms |
+| 2 | Aggregate | p50, p95, p99 |
+
+**Pass criteria**: `p95 ≤ 200 ms`.
+**Duration**: 200 s.
+**Test status**: READY.
+
+---
+
+### NFT-PERF-L3: Tier-3 deep-analysis ≤ 5 s per ROI
+**Summary**: Per-ROI deep-analysis (Tier-3 / VLM, when enabled) ≤ 5 s.
+**Traces to**: AC `Latency — Deep semantic confirmation (Tier 3 / VLM, when enabled) / L3`.
+**Tier**: HW + Tier-B (vlm-mock).
+**Metric**: per-ROI wall-clock from SUT issuing a Tier-3 IPC call to VLM response received and schema-validated.
+
+**Preconditions**:
+- Warm-up: 5 Tier-3 calls.
+- `vlm-mock` configured to respond from `vlm-io-pairs` fixture; Tier-3 enabled via SUT config.
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Trigger 100 Tier-3 calls via injected ROIs | per-call Δms |
+| 2 | Aggregate | p50, p95, p99 |
+
+**Pass criteria**: `p95 ≤ 5000 ms`.
+**Duration**: as needed for 100 calls.
+**Test status**: DEFERRED — `<DEFERRED: vlm-io-pairs (real I/O) and the pinned local VLM model>`.
+
+---
+
+### NFT-PERF-L4: Camera zoom transition (medium → high) ≤ 2 s
+**Summary**: Wall-clock from issuing the medium→high zoom command to the physical zoom transition completing ≤ 2 s, including the 1–2 s physical floor (restriction).
+**Traces to**: AC `Latency — Camera zoom transition / L4`, RESTRICT `Hardware — 40× optical zoom traversal takes 1–2 s wall-clock`.
+**Tier**: HW (physical A40 OR benchmarked replay); pure-emulator runs are not acceptable per `expected_results/results_report.md → Notes on this spec`.
+**Metric**: wall-clock from outbound zoom command (observed on gimbal UDP) to gimbal-mock zoom telemetry reporting target_zoom_band.
+
+**Preconditions**:
+- SUT in `ZoomedIn` mode after a sweep-to-zoom transition; gimbal at medium zoom.
+- HW Jetson OR `gimbal-mock` replaying recorded A40 zoom telemetry with realistic traversal time.
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Trigger 30 medium→high zoom transitions via scripted POI sequence | per-transition Δms |
+| 2 | Aggregate | p50, p95, max |
+
+**Pass criteria**: `p95 ≤ 2000 ms`.
+**Test status**: DEFERRED — `<DEFERRED: SITL or hardware-in-loop ViewPro A40 zoom command capture>`.
+
+---
+
+### NFT-PERF-L5: Decision-to-movement latency ≤ 500 ms
+**Summary**: From the internal scan-control decision (POI detected mid-sweep) to the camera physically beginning to move ≤ 500 ms.
+**Traces to**: AC `Latency — Decision-to-movement latency / L5`.
+**Tier**: HW + Tier-B.
+**Metric**: wall-clock from Tier-1 detection received at the scan-controller to first gimbal command observed on `gimbal-mock`.
+
+**Preconditions**:
+- Warm-up: 10 scripted POI events.
+- Scripted scan-decision events followed by camera physical motion observed on the gimbal UDP channel.
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Inject 100 POI detections at random sweep positions | per-event Δms (detection-receive-ts → gimbal-command-out-ts) |
+| 2 | Aggregate | p95 |
+
+**Pass criteria**: `p95 ≤ 500 ms`.
+**Test status**: DEFERRED — `<DEFERRED: scripted scan decision events with paired gimbal telemetry capture>`.
+
+---
+
+### NFT-PERF-L6: Movement candidate enqueue ≤ 1 s (wide sweep)
+**Summary**: From the movement event in the visual stream to candidate enqueued for zoomed inspection ≤ 1 s during the wide-area sweep.
+**Traces to**: AC `Latency — Movement candidate enqueue / L6`.
+**Tier**: B + E.
+**Metric**: wall-clock from ground-truth movement-event timestamp (annotated in the fixture) to candidate appearing on operator-stream.
+
+**Preconditions**:
+- Warm-up: 30 s of sweep playback.
+- Synchronised RTSP + gimbal.csv + telemetry.csv (DEFERRED CSV pair).
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Replay `fixtures/movement/video01.mp4` + paired CSVs | record per-event Δms |
+| 2 | Aggregate over ~20 movement events | p95 |
+
+**Pass criteria**: `p95 ≤ 1000 ms`.
+**Test status**: DEFERRED — `<DEFERRED: paired gimbal.csv + telemetry.csv for video01.mp4 with annotated movement-event timestamps>`.
+
+---
+
+### NFT-PERF-L7: Movement candidate enqueue ≤ 1.5 s (zoomed-in)
+**Summary**: Same as L6 but during a zoomed-in hold; budget relaxed to 1.5 s to accommodate gimbal slew.
+**Traces to**: AC `Latency — Movement candidate enqueue … during the zoomed-in inspection / L7`.
+**Tier**: B + E.
+**Metric**: same as L6 but starting from a ZoomedIn hold.
+
+**Preconditions**:
+- SUT in ZoomedIn hold; small mover appears mid-hold.
+- DEFERRED zoomed-in CSV pair.
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Drive SUT into ZoomedIn hold; replay zoomed-in scene with small mover | per-event Δms |
+| 2 | Aggregate over ~10 movement events | p95 |
+
+**Pass criteria**: `p95 ≤ 1500 ms`.
+**Test status**: DEFERRED — `<DEFERRED: paired gimbal.csv + telemetry.csv at zoomed-in band>`.
+
+---
+
+### NFT-PERF-L8: Zoom-out → zoom-in transition ≤ 2 s
+**Summary**: From POI detected during sweep to ROI fully zoomed and held ≤ 2 s wall-clock.
+**Traces to**: AC `Latency — Zoom-out → zoom-in transition / L8`.
+**Tier**: HW + Tier-B.
+**Metric**: wall-clock from Tier-1 detection injected → first frame at full zoom on the ROI (observed via gimbal-mock zoom telemetry and the operator-stream ROI overlay).
+
+**Preconditions**:
+- Warm-up.
+- Scripted sweep + injected POI.
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Inject 30 mid-sweep POIs | per-transition Δms |
+| 2 | Aggregate | p95 |
+
+**Pass criteria**: `p95 ≤ 2000 ms`.
+**Test status**: DEFERRED — `<DEFERRED: sweep → zoomed-inspection transition capture with annotated transition-complete timestamps>`.
+
+---
+
+### NFT-PERF-L9: Operator command → action ≤ 500 ms
+**Summary**: From the operator click event (entering the SUT on the operator-stream return path) to the corresponding outbound command observed on its destination channel ≤ 500 ms; modem RTT is explicitly excluded by measuring on the SUT side of the modem.
+**Traces to**: AC `Latency — Operator command → action / L9`.
+**Tier**: B + E.
+**Metric**: wall-clock from operator-stream message arrival at SUT → first outbound command observed on the affected channel (MAVLink waypoint upload, gimbal command, mode-change emission).
+
+**Preconditions**:
+- Operator-session-scripts include click events at deterministic offsets.
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Replay scripted operator-click sequence (50 clicks across confirm / decline / target-follow / abort) | per-click Δms |
+| 2 | Aggregate | p95 |
+
+**Pass criteria**: `p95 ≤ 500 ms`.
+**Test status**: DEFERRED for signed commands: `<DEFERRED: operator-envelopes once Q9 resolves>`. The happy-path placeholder is usable today for an early measurement, which must be marked as an interim baseline only.
+
+---
+
+## Throughput / Rate
+
+### NFT-PERF-T1: POI rate to operator capped at ≤ 5 / min
+**Summary**: Even when Tier-1 produces detections faster than the cap, the rate of POIs SURFACED to the operator MUST stay ≤ 5 / min (hard cap, frozen 2026-05-06).
+**Traces to**: AC `Throughput / Rate — POI rate surfaced to the operator / T1`.
+**Tier**: B.
+**Metric**: count of POIs emitted on operator-stream per rolling 60 s window.
+
+**Preconditions**:
+- Synthetic POI feed sustained at 20 POIs / min via `synthetic-poi-feeds`.
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Inject sustained 20 POI/min feed for 10 minutes | per-minute count of surfaced POIs |
+| 2 | Compute max over any rolling 60 s window | rolling-max |
+
+**Pass criteria**: `rolling-max ≤ 5` POIs/min for every 60 s window.
+**Duration**: 10 min.
+**Test status**: READY (synthetic feeds inline-authorable).
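
One way to compute the rolling-max in step 2 is a sliding window over the surfaced-POI timestamps. The sketch below is illustrative only: `rolling_max_count` and `evaluate_t1` are hypothetical names, timestamps are assumed to be seconds on a single monotonic clock, and the window is treated as half-open (an event exactly 60 s older than the newest one has left the window).

```python
from collections import deque


def rolling_max_count(timestamps_s, window_s=60.0):
    """Largest number of events found inside any window of window_s seconds."""
    window = deque()
    worst = 0
    for ts in sorted(timestamps_s):
        window.append(ts)
        # Drop events that are a full window (or more) older than the newest.
        while ts - window[0] >= window_s:
            window.popleft()
        worst = max(worst, len(window))
    return worst


def evaluate_t1(surfaced_ts_s, cap_per_min=5):
    """Apply the T1 hard cap to the timestamps of POIs surfaced to the operator."""
    worst = rolling_max_count(surfaced_ts_s)
    return {"rolling_max": worst, "passed": worst <= cap_per_min}
```

Under these assumptions a POI surfaced every 12 s sits exactly at the cap (5 per window), while one every 10 s yields 6 and fails.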
+
+---
+
+### NFT-PERF-T2: Position telemetry rate ∈ [1 Hz, 10 Hz]
+**Summary**: The position telemetry the SUT consumes from the airframe link MUST sustain ≥1 Hz, target 10 Hz, over a 60 s window.
+**Traces to**: AC `Throughput / Rate — Position telemetry rate / T2`.
+**Tier**: B (with MAVLink replay) + E (live SITL).
+**Metric**: count of `GLOBAL_POSITION_INT` messages consumed by the SUT per second.
+
+**Preconditions**:
+- MAVLink stream replayed at the configured target rate (10 Hz).
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Replay 60 s of GLOBAL_POSITION_INT at 10 Hz | per-second consumed count |
+| 2 | Aggregate | min, mean |
+
+**Pass criteria**: `min ≥ 1 Hz` AND `mean ≥ 9.5 Hz` (target 10 Hz with ≤ 5 % tolerance).
+**Test status**: DEFERRED — `<DEFERRED: MAVLink replay fixture over a 60 s window>`.
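
The per-second count and the two-part pass criterion could be checked roughly as follows. This is a sketch under stated assumptions: `per_second_counts` and `evaluate_t2` are hypothetical names, and message timestamps are assumed to be seconds relative to the start of the 60 s measurement window.

```python
from collections import Counter


def per_second_counts(msg_ts_s, window_s=60):
    """Messages consumed in each whole second of [0, window_s); gaps count as 0."""
    buckets = Counter(int(ts) for ts in msg_ts_s if 0 <= ts < window_s)
    return [buckets.get(sec, 0) for sec in range(window_s)]


def evaluate_t2(msg_ts_s, window_s=60, min_hz=1, mean_hz=9.5):
    """Apply the T2 criteria: floor on the worst second, target on the mean."""
    counts = per_second_counts(msg_ts_s, window_s)
    mean = sum(counts) / len(counts)
    return {
        "min": min(counts),
        "mean": mean,
        "passed": min(counts) >= min_hz and mean >= mean_hz,
    }
```

Counting empty seconds as zero matters: a one-second dropout fails the `min ≥ 1 Hz` floor even when the mean stays above 9.5 Hz.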
+
+---
+
+### NFT-PERF-T3: Frame-rate floor → suppress zoom-in + health yellow
+**Summary**: When the sustained camera frame rate drops below 10 fps for ≥5 s, zoom-in transitions MUST be suppressed AND overall health MUST surface yellow.
+**Traces to**: AC `Throughput / Rate — Sustained camera frame-rate floor / T3`.
+**Tier**: B.
+**Metric**: a pair of booleans: (a) was a zoom-in suppressed during the low-FPS window? (b) did health surface yellow?
+
+**Preconditions**:
+- SUT in normal sweep mode.
+- `rtsp-loopback` plays `fixtures/videos/94d42580bd1ad6ff.mp4` with throttled decode injecting frame drops to keep FPS < 10 for ≥ 5 s.
+
+| Step | Consumer Action | Measurement |
+|---|---|---|
+| 1 | Start playback at normal 30 fps | health remains green; zoom-in proceeds normally on detection |
+| 2 | Throttle decode + drop frames to push FPS below 10 for ≥ 5 s | record: (a) whether a zoom-in-required event during this window was suppressed; (b) whether `GET /health` returns `overall == "yellow"` |
+
+**Pass criteria**: both observations TRUE.
+**Duration**: 30 s (5 s low-FPS window + buffer).
+**Test status**: READY (fixture present; throttling implemented by consumer).
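
Before asserting the two observations, the consumer has to confirm that the throttling really produced a qualifying low-FPS window. A minimal sketch, assuming the harness records one FPS sample per second; `sustained_low_fps` and `evaluate_t3` are hypothetical names.

```python
def sustained_low_fps(fps_samples, floor_fps=10.0, min_duration_s=5):
    """Return (start, end) second indices (end exclusive) of every run where
    FPS stayed below floor_fps for at least min_duration_s samples."""
    windows = []
    run_start = None
    for i, fps in enumerate(fps_samples):
        if fps < floor_fps:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_duration_s:
                windows.append((run_start, i))
            run_start = None
    # A run that is still open when the recording ends also counts.
    if run_start is not None and len(fps_samples) - run_start >= min_duration_s:
        windows.append((run_start, len(fps_samples)))
    return windows


def evaluate_t3(fps_samples, zoom_suppressed, health_yellow):
    """Pass only if a qualifying window occurred AND both observations are true."""
    return bool(sustained_low_fps(fps_samples)) and zoom_suppressed and health_yellow
```

Requiring the window to exist guards against a false pass when the throttle never pushed the rate below the floor for long enough.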
+
+---
+
+## Sustained-load (handoff to resource-limit-tests)
+
+The two sustained-resource AC rows (Re1, Re2) live as resource-limit tests rather than performance tests because the pass criterion is "stays within ceiling for the duration", not "is fast enough":
+
+- Re1 — combined RSS ≤ 6 GB onboard for everything autopilot owns — see `resource-limit-tests.md → NFT-RES-LIM-Re1`.
+- Re2 — Tier-1 per-frame latency Δ ≤ 5 ms when autopilot's workload runs concurrently — see `resource-limit-tests.md → NFT-RES-LIM-Re2`. Re2 is the Tier-1 non-degradation contract; the absolute Tier-1 latency target is L1.
+
+---
+
+## Common preconditions for every performance scenario
+
+- **Warm-up**: every scenario MUST include an explicit warm-up phase whose duration is recorded in the CSV report. This separates cold-start cost from steady-state behaviour.
+- **Steady-state window**: pass criteria apply only to the steady-state window (after warm-up), not to the warm-up itself.
+- **Hardware honesty**: scenarios that name Tier HW MUST run on representative Jetson Orin Nano Super OR on a benchmarked replay. Pure-x86-emulator runs report results but do NOT contribute to the project-level Acceptance Gate.
+- **Concurrent workload disclosure**: every scenario records whether other autopilot subsystems were running concurrently (Tier-1 inference, VLM, MAVLink, etc.). Re2 is the only scenario that REQUIRES concurrent workload; the others MUST report it for context.
+- **Seed + determinism**: where the test inputs involve randomness (e.g., synthetic-POI ordering tie-breakers), the seed is captured in the CSV report.
@@ -0,0 +1,196 @@
+# Resilience Tests
+
+Authored by `/test-spec` Phase 2 (2026-05-19). Resilience tests inject a fault, observe behaviour during the fault, observe recovery behaviour, and assert against both. The fault and the recovery contract are both quantifiable.
+
+BIT pre-flight pathways (positive R1, negatives R2/R3) are in `blackbox-tests.md` because they assert a functional gate. The runtime fault scenarios live here.
+
+---
+
+### NFT-RES-R4: Lost operator/Ground-Station link → RTL at 30 s grace (default)
+**Summary**: Sustained loss of the operator/Ground-Station radio link MUST trigger an RTL exactly at the configured grace window (default 30 s), and operator-link health MUST flip red.
+**Traces to**: AC `Reliability & Safety — Loss of operator/Ground-Station radio link MUST trigger a known mission-safe outcome / R4`, RESTRICT `Reliability & Safety — Lost operator-link failsafe MUST be deterministic and bounded`.
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT mid-flight (scripted MAVLink stream + active operator session).
+- Operator session in steady state for ≥ 30 s before fault injection.
+- Grace window configured to default 30 s.
+
+**Fault injection**:
+- `operator-replay` issues `lost-link` event at T=0 and STAYS silent (no reconnect) for the remainder of the window.
+
+| Step | Action | Expected Behaviour |
+|---|---|---|
+| 1 | Inject lost-link event at T=0 | health endpoint immediately shows `deps.operator_link == "red"`; `last_seen_at` frozen |
+| 2 | Wait 25 s (within grace) | NO RTL command yet on `mavlink-sitl`; SUT continues mission |
+| 3 | Wait until T=30 s | RTL command observed on `mavlink-sitl` at T = 30 s ± 1 s; operator-stream emits a `failsafe_triggered` event with reason `operator_link_lost` |
+| 4 | Optionally reconnect operator-replay after RTL | RTL persists (operator cannot un-RTL silently — requires explicit operator override per AC); health.operator_link transitions back to green when traffic resumes |
+
+**Pass criteria**: RTL command at T = 30 s ± 1 s (`exact` with ± 1 s tolerance), `exact` operator-link red.
+**Recovery time bound**: RTL must be issued within 31 s of fault start.
+**Test status**: READY (operator-session-scripts inline-authorable; mavlink-sitl runs an ArduPilot SITL accepting RTL).
+
+---
+
+### NFT-RES-R5: Battery at RTL-floor → RTL
+**Summary**: When the airframe battery sample drops to the configured RTL floor (e.g. 25 %), the SUT MUST issue an RTL and health MUST surface yellow.
+**Traces to**: AC `Reliability & Safety — Battery at or below the configured RTL floor / R5`.
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT mid-flight; battery telemetry replayed via `mavlink-sitl` at 1 Hz.
+
+**Fault injection**:
+- `mavlink-sitl` scripted battery curve: starts at 80 %; ramps down to 25 % at T=T0; held at 25 % afterwards.
+
+| Step | Action | Expected Behaviour |
+|---|---|---|
+| 1 | At T=T0, battery reads 25 % | within 1 sample period (1 s) the SUT issues RTL on `mavlink-sitl`; health transitions to `overall == "yellow"`; operator-stream emits `failsafe_triggered` with reason `battery_rtl_floor` |
+| 2 | Battery continues at 25 % | RTL persists; no oscillation |
+
+**Pass criteria**: `exact (RTL command observed)` + `exact (health.overall == "yellow")`.
+**Test status**: DEFERRED — `<DEFERRED: mid-flight battery sample at RTL-floor via mavlink-sitl battery curve script>`.
+
+---
+
+### NFT-RES-R6: Battery at hard floor → land-now
+**Summary**: When the battery hits the configured hard floor (e.g. 15 %), the SUT MUST issue land-now and ONLY an authenticated operator command may override.
+**Traces to**: AC `Reliability & Safety — battery at or below the hard floor / R6`.
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT mid-flight; battery ramps to 15 %.
+
+**Fault injection**: same as R5 but ramp continues to 15 %.
+
+| Step | Action | Expected Behaviour |
+|---|---|---|
+| 1 | At T=T0, battery reads 15 % | within 1 sample period the SUT issues land-now (`MAV_CMD_NAV_LAND` or equivalent) on `mavlink-sitl`; health red; operator-stream emits `failsafe_triggered` with reason `battery_hard_floor` |
+| 2 | Replay an UNAUTHENTICATED operator-override command | SUT REFUSES; land-now persists |
+| 3 | Replay an AUTHENTICATED operator-override (placeholder until Q9; full once Q9 resolves) | land-now cancelled; SUT returns to prior mode |
+
+**Pass criteria**: `exact (land_now observed)`; `exact (refusal of unauthenticated override)`; `exact (acceptance of authenticated override)`.
+**Test status**: DEFERRED — same fixture gap as R5; step 3's full authentication semantics also `<DEFERRED: Q9>`.
+
+---
+
+### NFT-RES-R7: Airframe link exhaustion → health red after max-retry
+**Summary**: When MAVLink commands fail through the configured bounded-retry budget (no airframe response), the airframe-link dependency MUST flip health red.
+**Traces to**: AC `Reliability & Safety — MAVLink command exhaustion (bounded retry with exponential backoff fails through max-retry) / R7`.
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT mid-flight; max-retry configured (e.g., 5 attempts; exponential backoff base 100 ms).
+
+**Fault injection**:
+- `mavlink-sitl` configured to drop all command-ack messages for the duration of the test (peer non-responsive).
+
+| Step | Action | Expected Behaviour |
+|---|---|---|
+| 1 | SUT issues a MAVLink command (e.g., waypoint upload) | command sent; no ack received |
+| 2 | Backoff + retry loop executes through max-retry | retries observed on the wire with exponential backoff |
+| 3 | After final retry exhausts | health.airframe_link transitions to red; operator-stream emits a `dependency_degraded` event with reason `airframe_link_retry_exhausted` |
+
+**Pass criteria**: `exact (health.airframe_link == "red")` after max-retry; retries observed with backoff base 100 ms ± 20 ms.
+**Test status**: DEFERRED — `<DEFERRED: airframe link command + bounded retry/backoff with peer not responding through max-retries>`.
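
The wire-level retry assertion could be sketched as below. Several details are assumptions, not part of the spec: the backoff is taken to double on each retry with no jitter, "5 attempts" is read as the total number of sends including the first, and the ± 20 ms tolerance on the 100 ms base is scaled proportionally (± 20 %) to later intervals. `expected_backoff_ms` and `evaluate_r7` are hypothetical names.

```python
def expected_backoff_ms(attempts, base_ms=100.0, factor=2.0):
    """Delays between consecutive sends: base, base*factor, base*factor**2, ..."""
    return [base_ms * factor**i for i in range(attempts - 1)]


def evaluate_r7(send_ts_ms, max_attempts=5, base_ms=100.0, rel_tol=0.2):
    """Check that the observed send times show a bounded, exponential schedule."""
    if len(send_ts_ms) != max_attempts:
        return False  # gave up early, or kept retrying past the cap
    gaps = [later - earlier for earlier, later in zip(send_ts_ms, send_ts_ms[1:])]
    expected = expected_backoff_ms(max_attempts, base_ms)
    return all(abs(gap - want) <= rel_tol * want for gap, want in zip(gaps, expected))
```

The length check covers the bounded-behaviour invariant: an extra send beyond the cap fails the scenario even when every interval is correct.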
+
+---
+
+### NFT-RES-R8: Wall-clock drift > 200 ms → time-source yellow
+**Summary**: When wall-clock drift versus GPS or NTP source exceeds 200 ms, the time-source dependency MUST report yellow, AND `clock_source` + `last_sync_at` MUST reflect the drift.
+**Traces to**: AC `Reliability & Safety — Wall-clock drift greater than 200 ms / R8`, RESTRICT `Wall-clock MUST be bound to GPS time once GPS is locked, or NTP at boot`.
+**Tier**: B.
+
+**Preconditions**:
+- SUT running with `time-injector` LD_PRELOAD active.
+- GPS source initially locked via `mavlink-sitl` GPS_RAW_INT messages.
+
+**Fault injection**:
+- `time-injector` advances the SUT process clock by 250 ms over a 1 s window while keeping GPS source locked.
+
+| Step | Action | Expected Behaviour |
+|---|---|---|
+| 1 | Bind clock to GPS at boot | health.time_source == green; `clock_source == "gps"`; `last_sync_at` recent |
+| 2 | Inject 250 ms drift | within 5 s health.time_source transitions to yellow; `clock_source` and `last_sync_at` updated to reflect the drift |
+| 3 | Stop drift | health.time_source returns to green within the next sync cycle |
+
+**Pass criteria**: `exact (health.time_source == "yellow")` during step 2; `exact (clock_source updated)` + `exact (last_sync_at updated)`.
+**Test status**: READY (time-drift-scripts inline-authorable).
+
+---
+
+### NFT-RES-R9: Geofence EXCLUSION crossing → waypoint refusal + RTL
+**Summary**: When a simulated waypoint crosses an EXCLUSION polygon, the SUT MUST refuse the waypoint AND trigger RTL. Symmetric behaviour for INCLUSION violations.
+**Traces to**: AC `Reliability & Safety — Geofence INCLUSION and EXCLUSION violations MUST both result in waypoint refusal + RTL / R9`, RESTRICT `Geofence enforcement MUST be symmetric`.
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT mid-flight; geofence INCLUSION + EXCLUSION polygons loaded as part of the mission.
+
+**Fault injection**:
+- Scripted waypoint upload that crosses the EXCLUSION polygon, followed by a second upload whose waypoint exits the INCLUSION polygon.
+
+| Step | Action | Expected Behaviour |
+|---|---|---|
+| 1 | Upload waypoint crossing EXCLUSION polygon | SUT refuses the waypoint; structured-log WARN with `geofence_violation_exclusion`; RTL command observed on `mavlink-sitl` |
+| 2 | Reset; upload waypoint exiting the INCLUSION polygon | identical behaviour — refused + RTL |
+
+**Pass criteria**: `exact (waypoint rejected)` + `exact (RTL command observed)` for both EXCLUSION and INCLUSION cases.
+**Test status**: DEFERRED — `<DEFERRED: geofence EXCLUSION polygon crossed by simulated waypoint via mavlink-sitl scripted mission>`.
+
+---
+
+### NFT-RES-Mp2: Map-pull timeout → cache-fallback (functional coverage in FT-N-003)
+**Summary**: When the pre-flight map pull times out, the SUT falls back to last-known cached MapObjects and surfaces `map_sync == "cached_fallback"` with an operator-ack gate. (Functional gate semantics are tested in `blackbox-tests.md → FT-N-003`; this scenario adds the **timing+recovery** dimension.)
+**Traces to**: AC `Map Reconciliation — Cache-fallback on timeout / Mp2`.
+**Tier**: B.
+
+**Preconditions**:
+- `autopilot-state` seeded with a known prior MapObjects snapshot.
+- `missions-mock` configured to time out on `GET /missions/{id}/mapobjects` for a configurable duration.
+
+**Fault injection**:
+- `missions-mock` returns 504 / silent timeout for 60 s; then responds normally.
+
+| Step | Action | Expected Behaviour |
+|---|---|---|
+| 1 | Trigger BIT | SUT issues `GET /missions/{id}/mapobjects`; observes timeout (per its configured request timeout); within 5 s falls back to cached snapshot; `map_sync == "cached_fallback"`; BIT requires explicit operator ack (see FT-N-003) |
+| 2 | Mock recovers (responds normally) | next periodic resync re-attempts; once successful, `map_sync == "live"`; structured-log INFO `map_resync_recovered` |
+
+**Pass criteria**: `exact (cached_fallback within 5 s of timeout)`; recovery within the next resync cycle.
+**Test status**: READY (the only fixture dependency is the cached snapshot seed; `mission-suite-fixture` is DEFERRED, but a minimal-scale cached snapshot can be authored inline).
+
+---
+
+### NFT-RES-Mp4: Post-flight map-push 5xx → persist + bounded retry + operator warning
+**Summary**: When the post-flight `POST /missions/{id}/mapobjects` returns 5xx, the pending diff MUST be persisted on durable on-device storage, an operator-visible warning MUST surface, AND bounded retry MUST execute (capped at the configured retry limit).
+**Traces to**: AC `Map Reconciliation — Failure MUST persist the pending diff to durable on-device storage with bounded retry / Mp4`, RESTRICT `On-device storage MUST be bounded`.
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT post-landing; pending diff ready to push.
+- `missions-mock` configured to return 5xx N times then 200.
+
+**Fault injection**:
+- `missions-mock` returns 503 for the first N attempts (N = configured retry-cap + 1); then returns 200.
+
+| Step | Action | Expected Behaviour |
+|---|---|---|
+| 1 | Trigger post-flight reconciliation | SUT issues `POST /missions/{id}/mapobjects`; receives 503 |
+| 2 | Observe persistence | pending diff file exists under `autopilot-state/pending_map_diff/<mission-id>.json`; size > 0 |
+| 3 | Observe operator-stream | warning event `map_push_failed` surfaced |
+| 4 | Observe retry loop | retries observed within the configured cap; backoff with jitter |
+| 5 | After retry-cap reached without success | SUT stops retrying; pending file remains for next session pickup |
+| 6 | Eventual success (mock returns 200) | next attempt succeeds; pending file removed; warning cleared |
+
+**Pass criteria**: `exact (pending file exists)` + `exact (warning surfaced)` + `threshold_max (retries ≤ configured cap)`.
+**Test status**: DEFERRED — `<DEFERRED: same fixture as Mp3 (60-minute pass diff)>`.
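
The choice of N = retry-cap + 1 in the fault injection can be illustrated with a toy model: the first session exhausts its budget without success, and the next session's first attempt succeeds. This is a sketch only; `make_mock` is a hypothetical stand-in for `missions-mock`, `push_with_bounded_retry` is not the SUT's real API, and backoff, jitter and file persistence are deliberately omitted.

```python
def make_mock(fail_times):
    """Hypothetical stand-in for missions-mock: 503 for the first fail_times
    calls, then 200 for every call after that."""
    calls = {"n": 0}

    def post():
        calls["n"] += 1
        return 503 if calls["n"] <= fail_times else 200

    return post


def push_with_bounded_retry(post, retry_cap):
    """One initial attempt plus at most retry_cap retries.
    Returns (succeeded, attempts_made)."""
    attempts = 0
    for _ in range(retry_cap + 1):
        attempts += 1
        if post() == 200:
            return True, attempts
    return False, attempts
```

With a cap of 3 and a mock that fails 4 times, the first call returns `(False, 4)` (steps 1 to 5) and a second call, modelling the next-session pickup, returns `(True, 1)` (step 6).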
+
+---
+
+## Recovery-time invariants common to every scenario
+
+- **No silent error swallowing.** Every fault scenario MUST observe a corresponding structured-log entry at WARN+ AND a corresponding health-endpoint transition. A fault that the SUT handles without surfacing through both channels is a TEST FAILURE per `security_approach.md → "No silent error swallowing for security-relevant failures"` (extended here to operational faults per `coderule.mdc → "Never suppress errors silently"`).
+- **Bounded behaviour.** Every retry/backoff loop MUST be bounded — the scenario asserts the cap on retry count and the cap on backoff window. Open-ended retry is a test failure.
+- **State integrity post-recovery.** After fault recovery (when applicable), the scenario asserts that the SUT returns to a known state — mode unchanged unless the fault legitimately altered it (e.g., RTL stays RTL until operator override).
+- **Symmetry assertions.** R9 explicitly tests both INCLUSION and EXCLUSION because the AC names symmetric behaviour. Wherever an AC pairs two outcomes (`fail-fast` + `fail-closed`, `red` + `yellow`, etc.), the resilience scenario MUST cover both halves.
@@ -0,0 +1,156 @@
+# Resource Limit Tests
+
+Authored by `/test-spec` Phase 2 (2026-05-19). Resource-limit tests assert that the SUT stays within a quantified resource ceiling for the configured duration. Short bursts do not satisfy these tests — every scenario has an explicit sustained-monitoring window.
+
+---
+
+### NFT-RES-LIM-Re1: Combined onboard RSS ≤ 6 GB sustained
+**Summary**: Combined process RSS on the deployed compute device for everything autopilot owns onboard (excluding Tier 1) MUST stay ≤ 6 GB throughout a 5-minute steady-state window with the full onboard workload active.
+**Traces to**: AC `Resources & Data — Combined RSS on the deployed compute device, for everything autopilot owns onboard (excluding Tier 1), MUST stay within ≤ 6 GB / Re1`, RESTRICT `Hardware — Compute device: Jetson Orin Nano Super, 8 GB shared LPDDR5; Tier 1 consumes ~2 GB, leaving ~6 GB for autopilot`.
+
+**Tier**: HW (representative Jetson Orin Nano Super); pure-x86 runs are informational only and do NOT satisfy the project-level Acceptance Gate.
+
+**Preconditions**:
+- Full onboard workload active: frame ingest from `rtsp-loopback`, Tier-2 + Tier-3 (when enabled) inferring at the documented steady-state load, gimbal commands flowing, MAVLink stream consumed at 10 Hz, operator-stream connected, MapObjects store hydrated for a 30×30 km region.
+- Warm-up: 60 s before measurement starts (any first-load model warm-up complete).
+- Tier-1 process is RUNNING in parallel but its RSS is EXCLUDED from the measurement (the AC scope is autopilot-owned RSS, excluding Tier 1).
+
+**Monitoring**:
+- Cgroup-level RSS for every process the SUT owns (the SUT binary plus any child processes it spawns — e.g., the VLM IPC peer if it lives in autopilot's cgroup), sampled at 1 Hz.
+- Cgroup-level RSS for Tier 1 sampled at the same cadence (for the Re2 cross-reference).
+- Per-process RSS captured to `reports/<run-id>/rss-trace.csv` for forensic review on failure.
+
+**Duration**: 5 minutes of measurement after warm-up.
+
+**Pass criteria**:
+- `threshold_max`: per 1 s sample, `sum(autopilot_owned_RSS) ≤ 6 GB`.
+- No single 1 s sample exceeds the ceiling.
+- (Reporting only — not pass/fail): peak RSS, mean RSS, P95 RSS recorded in the CSV report.
+
+**Test status**: DEFERRED — `<DEFERRED: long-running scenario harness exercising the full onboard workload for 5 min; inline-authorable but requires that the SUT be operational end-to-end first>`.
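
The per-sample ceiling check could look roughly like this. It is a sketch under stated assumptions: `evaluate_re1` is a hypothetical name, each sample is assumed to be a mapping from process name to RSS in bytes with Tier 1 already excluded, and "6 GB" is interpreted as 6 GiB (the spec does not say which unit is meant).

```python
GIB = 1024**3  # assumption: the 6 GB ceiling is read as 6 GiB


def evaluate_re1(samples, ceiling_bytes=6 * GIB):
    """samples: one {process_name: rss_bytes} dict per 1 s tick.
    Any single tick over the ceiling fails the scenario."""
    totals = [sum(sample.values()) for sample in samples]
    breaches = [i for i, total in enumerate(totals) if total > ceiling_bytes]
    return {
        "peak": max(totals),
        "mean": sum(totals) / len(totals),
        "breaches": breaches,  # tick indices, useful for the forensic trace
        "passed": not breaches,
    }
```

Returning the breaching tick indices lets a failure be lined up against the per-process rows in `rss-trace.csv`.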
+
+---
+
+### NFT-RES-LIM-Re2: Tier-1 non-degradation under autopilot workload
+**Summary**: When autopilot's full onboard workload runs concurrently with Tier 1 on the same Jetson, Tier-1 per-frame latency MUST NOT degrade by more than ± 5 ms versus the Tier-1-alone baseline (recorded by NFT-PERF-L1).
+**Traces to**: AC `Resources & Data — Tier 1 per-frame latency MUST NOT degrade by more than ± 5 ms when autopilot's own onboard workload is running concurrently / Re2`, RESTRICT `Tier 1 (YOLO) and any local large model with GPU memory pressure share the Jetson GPU — only one of them may execute at any wall-clock instant`.
+
+**Tier**: HW (the only meaningful environment for this assertion — GPU contention behaviour does not reproduce on x86).
+
+**Preconditions**:
+- NFT-PERF-L1 has been run on the same HW configuration in the SAME session and a baseline `tier1_baseline_p95_ms` recorded.
+- Full onboard workload active (same as Re1).
+
+**Monitoring**:
+- Tier-1 per-frame latency sampled per frame for the duration of the test.
+- The same metric source as NFT-PERF-L1 — for direct delta comparison.
+
+**Duration**: 5 minutes of measurement after warm-up (matches Re1 window so both can run in the same session).
+
+**Pass criteria**:
+- `numeric_tolerance`: `|p95(tier1_with_autopilot) - tier1_baseline_p95_ms| ≤ 5 ms`.
+- (Reporting only): mean, P95, max delta over the window.
+
+**Test status**: DEFERRED — same fixture dependency as Re1; requires SUT operational + Tier 1 colocated on HW.
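+
+The `numeric_tolerance` check above can be sketched as follows. This is illustrative Python under one assumption: P95 is computed with the nearest-rank definition, whereas the definition NFT-PERF-L1 actually uses is not stated here and must match for the delta to be meaningful.
+
```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile (one of several common definitions)."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), 1-based, integer-exact
    return ordered[rank - 1]


def re2_passes(with_autopilot_ms: list[float], tier1_baseline_p95_ms: float,
               tolerance_ms: float = 5.0) -> bool:
    """True when |p95(with autopilot) - baseline p95| <= tolerance."""
    return abs(p95(with_autopilot_ms) - tier1_baseline_p95_ms) <= tolerance_ms
```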
+
+---
+
+### NFT-RES-LIM-Storage: On-device persistent store stays under 95 % for in-flight operation
+**Summary**: During a steady-state mission run (no abnormal load), the on-device persistent store MUST NOT exceed 95 % full. This protects the takeoff gate (R3) from being silently violated mid-mission and protects the post-flight push (Mp4) from running out of room to persist a failed diff.
+**Traces to**: AC `Reliability & Safety — On-device storage MUST be bounded` (via R3 BIT gate), RESTRICT `On-device storage MUST be bounded`.
+
+**Tier**: B + HW.
+
+**Preconditions**:
+- SUT mid-flight; persistent store at typical post-takeoff utilisation (e.g. 30 %).
+- Normal-operation event volume: telemetry persistence, ignored-item appends, pending map-diff buffer (empty in this scenario).
+
+**Monitoring**:
+- Volume utilisation sampled at 10 Hz throughout the duration.
+
+**Duration**: 60 minutes (representative mission duration per Mp3).
+
+**Pass criteria**:
+- `threshold_max`: `volume_used / volume_total ≤ 0.95` at every sample point.
+- On approach to 85 %: structured-log INFO `storage_pressure` with current utilisation.
+- On approach to 90 %: structured-log WARN with current utilisation; health.storage transitions to yellow.
+- At 95 %: health.storage transitions to red and the SUT begins its documented eviction policy. This scenario does NOT test the policy semantics, which belong to a separate scenario; it only asserts that the policy IS triggered.
+
+**Test status**: READY (no external fixture beyond the SUT itself; the persistent-store seed file controls starting utilisation).
+
+---
+
+### NFT-RES-LIM-CPU: CPU headroom for the Tier-1 colocation guarantee
+**Summary**: Combined CPU utilisation of every autopilot-owned process MUST leave enough Jetson CPU headroom for Tier 1 to keep its NFT-PERF-L1 budget. Concretely: per-second sustained CPU usage by autopilot-owned processes MUST stay ≤ the configured budget (default 60 % of total CPU cycles measured at the cgroup level) for the duration of the run.
+**Traces to**: AC `Resources & Data — Tier 1 per-frame latency MUST NOT degrade by more than ± 5 ms / Re2` (CPU-side mechanism backing Re2), RESTRICT `Hardware — Jetson Orin Nano Super`.
+
+**Tier**: HW (CPU contention does not reproduce on x86).
+
+**Preconditions**:
+- Same workload as Re1 + Re2.
+
+**Monitoring**:
+- Cgroup CPU usage at 1 Hz.
+
+**Duration**: 5 minutes after warm-up.
+
+**Pass criteria**:
+- `threshold_max`: per 1 s sample, `sum(autopilot_cpu_usage) ≤ 60 %` of total CPU.
+- Reporting: mean, P95, max.
+
+**Test status**: DEFERRED — same dependency as Re1/Re2.
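+
+A sketch of how per-second CPU utilisation could be derived from cumulative cgroup CPU time. It assumes the monitor reads a cumulative microsecond counter once per second (as cgroup v2 exposes via `usage_usec` in `cpu.stat`); whether the target image uses cgroup v2 is an assumption, and `n_cpus` is left as a parameter rather than hard-coded.
+
```python
def cpu_percent_of_total(usage_usec_samples: list[int], interval_s: float,
                         n_cpus: int) -> list[float]:
    """Convert cumulative CPU-time samples into per-interval % of total CPU capacity."""
    capacity_usec = interval_s * 1_000_000 * n_cpus
    return [
        100.0 * (later - earlier) / capacity_usec
        for earlier, later in zip(usage_usec_samples, usage_usec_samples[1:])
    ]


def cpu_budget_violations(percentages: list[float], budget_pct: float = 60.0) -> list[int]:
    """Indices of 1 s samples whose utilisation exceeds the budget."""
    return [i for i, pct in enumerate(percentages) if pct > budget_pct]
```
+
+Normalising by `n_cpus` matters: without it, a process saturating a single core would read as 100 % even though most of the total capacity is idle.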
+
+---
+
+### NFT-RES-LIM-GPU: GPU mutual exclusion contract (Tier 1 vs local large model)
+**Summary**: Per RESTRICT (`Tier 1 (YOLO) and any local large model with GPU memory pressure share the Jetson GPU — only one of them may execute at any wall-clock instant`), the SUT MUST NOT issue a GPU compute call (e.g. Tier-3 VLM inference) while Tier 1 is executing on the GPU. The serialisation MUST be observable: at any instant, at most one of Tier 1 and the local large model occupies the GPU.
+**Traces to**: RESTRICT `Tier 1 and any local large model … only one of them may execute at any wall-clock instant`.
+
+**Tier**: HW.
+
+**Preconditions**:
+- Tier 1 active; SUT in a ZoomedIn hold with deep-analysis enabled (Tier-3 will fire).
+
+**Monitoring**:
+- GPU-instance occupancy via `tegrastats` / equivalent at the highest available sampling rate.
+- The SUT's own internal "compute-class" telemetry exposed on the health endpoint as `gpu_owner_current` ∈ { "tier1", "tier3", "idle" }.
+
+**Duration**: 60 s containing ≥ 5 Tier-3 hold cycles.
+
+**Pass criteria**:
+- `exact`: at every sample point, `gpu_owner_current ∈ { "tier1", "tier3", "idle" }` holds with exactly one owner reported; the value is never `"tier3"` while Tier 1 is executing on the GPU.
+- `tegrastats` peak GPU occupancy attributable to autopilot processes never overlaps Tier 1's known activity window for the same wall-clock instant.
+
+**Test status**: DEFERRED — depends on the SUT being operational end-to-end + Tier-3 enabled; also depends on the SUT exposing `gpu_owner_current` (which is an architectural choice not yet locked).
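+
+The non-overlap criterion can be sketched as an interval check. The sketch assumes the monitoring data has already been reduced to half-open `(start_s, end_s)` activity windows per tier; how those windows are extracted from `tegrastats` output is not specified here and is left out.
+
```python
Interval = tuple[float, float]  # (start_s, end_s), half-open


def overlapping_pairs(tier1: list[Interval],
                      tier3: list[Interval]) -> list[tuple[Interval, Interval]]:
    """Return every (tier1, tier3) pair of activity windows that overlap in time."""
    return [
        (a, b)
        for a in tier1
        for b in tier3
        if a[0] < b[1] and b[0] < a[1]
    ]
```
+
+An empty list corresponds to a pass. Windows that merely touch (one ends exactly when the other starts) are treated as serialised, not overlapping.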
+
+---
+
+### NFT-RES-LIM-FileHandles: File-descriptor and socket bound
+**Summary**: Sustained operation MUST NOT leak file descriptors or sockets. The count MUST stay within a documented headroom of the initial-post-warmup baseline for the duration of the run.
+**Traces to**: RESTRICT `On-device storage MUST be bounded` (general bounded-resource principle), security principle `No silent error swallowing for security-relevant failures` (FD exhaustion would silently break the operator-stream).
+
+**Tier**: B + HW.
+
+**Preconditions**:
+- Warm-up: 60 s.
+- Workload: full onboard workload at steady state.
+
+**Monitoring**:
+- `/proc/<pid>/fd` count per autopilot process at 1 Hz.
+
+**Duration**: 60 minutes.
+
+**Pass criteria**:
+- `threshold_max`: at every sample point, `fd_count ≤ fd_baseline_post_warmup + 50` (50 = documented churn headroom for intermittent operator reconnects).
+- A monotonically rising trend (slope > 0 over the run) is a TEST FAILURE even if the absolute ceiling is not breached.
+
+**Test status**: READY for a Tier-B run; gains its real value once HW + sustained-workload land.
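+
+Both pass criteria (absolute ceiling and rising trend) can be sketched as below. The trend is estimated with an ordinary least-squares slope over the 1 Hz trace; the `slope_epsilon` parameter is an assumption added for illustration, since a noisy but healthy trace may show a tiny positive slope and the spec does not define a tolerance.
+
```python
def least_squares_slope(counts: list[int]) -> float:
    """Slope (FDs per sample) of the ordinary least-squares line through the trace."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def fd_check(counts: list[int], baseline: int, headroom: int = 50,
             slope_epsilon: float = 0.0) -> list[str]:
    """Return the list of failed criteria; an empty list means the run passes."""
    failures = []
    if max(counts) > baseline + headroom:
        failures.append("ceiling_breached")
    if least_squares_slope(counts) > slope_epsilon:
        failures.append("rising_trend")
    return failures
```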
+
+---
+
+## Common assertions for every resource-limit scenario
+
+- **Sustained monitoring is non-negotiable.** Each scenario specifies a duration ≥ 60 s; short bursts that pass do not satisfy the test. The CSV report records the full sample trace path under `artifacts_path`.
+- **No silent eviction.** Where a ceiling is approached, the SUT MUST surface the pressure (structured-log INFO at 85 %, WARN at 90 %, transition to yellow/red on health) BEFORE reaching the ceiling. A pass with no observable pressure signal at thresholds is a TEST FAILURE.
+- **HW reporting vs gating.** Pure-x86 runs report informational deltas only; they do NOT satisfy the project-level Acceptance Gate. Every CSV row records its tier so this distinction stays auditable.
+- **Re1 + Re2 are paired.** Re1 establishes the autopilot RSS ceiling; Re2 establishes that respecting Re1 does not cost Tier 1 latency. They MUST be run in the same session to make the Re2 baseline meaningful.
@@ -0,0 +1,215 @@
+# Security Tests
+
+Authored by `/test-spec` Phase 2 (2026-05-19). Security tests validate blackbox-observable security properties derived from `_docs/00_problem/security_approach.md` and the AC operator-command rules. Code-level vulnerability scanning is out of scope at this layer (see deploy-time security audit `Step 14` of the autodev flow).
+
+Each scenario observes the SUT through its public surfaces only; pass criteria assert that an attack attempt produces no state change AND surfaces a structured-log entry / health signal — silent rejection is a test failure.
+
+---
+
+### NFT-SEC-O9: Operator-command replay protection
+**Summary**: An operator command envelope replayed within (or outside) the replay-protection window MUST be rejected; system state MUST NOT change; security WARN logged with reason `replay`.
+**Traces to**: AC `Operator Workflow — A replayed or unsigned operator command MUST be rejected with a logged security warning / O9`, security principle `Operator commands MUST be authenticated, signed, and replay-protected`.
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT in steady state; a prior valid operator-confirm envelope already accepted.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | Capture the valid envelope from the prior FT-P-016 run | envelope captured (sequence_id S, timestamp T) |
+| 2 | Replay the exact same envelope a second time | SUT rejects at the boundary; no `POST /missions/{id}` observed; no mode change; counter `operator_cmd_rejected_replay_total` += 1; structured-log WARN with `reason: "replay"`, `sequence_id: S`, `originating_envelope_id` recorded |
+| 3 | Replay an envelope with sequence_id S but timestamp T+window+1s (outside replay window) | rejected as expired; counter `operator_cmd_rejected_expired_total` += 1; structured-log WARN reason `expired` |
+
+**Pass criteria**: `exact (state unchanged)` AND `substring (log contains "replay")` for step 2; `exact (state unchanged)` AND `substring (log contains "expired")` for step 3.
+**Test status**: DEFERRED — `<DEFERRED: operator-envelopes (replayed) fixture; services.md §8 — blocked on Q9 operator-command auth scheme>`. Until Q9 resolves, this scenario asserts only that a duplicate envelope at the byte level is rejected (placeholder behaviour); the full replay-window semantics land with Q9.
+
+---
+
+### NFT-SEC-O10: Operator-command signature validation
+**Summary**: A malformed, unsigned, or wrongly signed operator command MUST be rejected with a specific reason (`invalid_signature`, `unsigned`, or `unauthorised_signer`); state MUST NOT change.
+**Traces to**: AC `O10`, security principle `Operator commands MUST be authenticated, signed, and replay-protected`.
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT in steady state.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | Send a malformed envelope (signature bytes flipped) | rejected; no state change; counter `operator_cmd_rejected_signature_total` += 1; structured-log WARN reason `invalid_signature` |
+| 2 | Send an UNSIGNED envelope (signature field absent / zero) | rejected; counter increments; structured-log WARN reason `unsigned` |
+| 3 | Send a well-formed envelope but signed with a key NOT in the operator's authorised set | rejected; counter increments; reason `unauthorised_signer` |
+| 4 | Send a valid envelope (control case) | accepted; state changes as per the command type |
+
+**Pass criteria**: steps 1–3 all `exact (state unchanged)` + `substring (log contains "invalid"|"unsigned"|"unauthorised")`; step 4 succeeds normally.
+**Test status**: DEFERRED — `<DEFERRED: operator-envelopes (malformed / unsigned / wrong-key); blocked on Q9>`.
+
+---
+
+### NFT-SEC-CraftedFrame: Crafted RTSP frame → no decoder OOM / no crash
+**Summary**: A crafted H.264/H.265 frame (oversize SPS, malformed NAL, truncated slice) MUST NOT crash or hang the SUT and MUST NOT consume unbounded memory. Each crafted frame is dropped with a counter increment.
+**Traces to**: security principle `Bounded input for any model call`, RESTRICT `On-device storage / RSS budgets`.
+**Tier**: B.
+
+**Preconditions**:
+- SUT in normal sweep mode; `rtsp-loopback` switched to a corpus of crafted clips.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | Stream a fuzzed clip corpus (≥ 100 crafted frames) | each crafted frame dropped at decode; counter `frame_decode_error_total` increments per drop; structured-log WARN with `reason: "decode_error"` |
+| 2 | Observe SUT process | RSS does NOT exceed 1.2 × baseline; no crash; no hang; gimbal & operator-stream still responsive within their normal latency budgets |
+
+**Pass criteria**: `exact (no crash)`; `threshold_max (RSS ≤ 1.2 × baseline)`; counter consistent with crafted-frame count.
+**Test status**: READY (crafted-clip corpus authorable inline using afl++ / honggfuzz output against a vanilla H.264 decoder; corpus stored in `e2e/consumer/fixtures/fuzzed_clips/`).
+
+---
+
+### NFT-SEC-OversizeCrop: Bounded crop enforcement
+**Summary**: An attempt to submit an oversize ROI crop (above the configured max bytes or outside the format allow-list) to any onboard model entry point MUST be rejected at the boundary; downstream models MUST NOT be invoked.
+**Traces to**: security principle `Bounded input for any model call`.
+**Tier**: B.
+
+**Preconditions**:
+- SUT with Tier-2 + Tier-3 enabled.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | Submit a 5000 × 5000 PNG (above the configured 1024 × 1024 cap) to the Tier-2 ROI entry | rejected; Tier-2 inference NOT invoked (verified via `tier2_inference_total` counter unchanged); structured-log WARN `reason: "roi_too_large"` |
+| 2 | Submit a BMP (not in the allow-list) | rejected; reason `roi_format_not_allowed` |
+| 3 | Submit a well-formed 640×640 JPEG (control) | accepted; Tier-2 invoked normally |
+
+**Pass criteria**: `exact (downstream model not invoked)` for steps 1–2; `exact (downstream invoked)` for step 3.
+**Test status**: READY (oversize PNG + BMP generated inline).
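+
+A minimal sketch of the boundary check the three steps exercise. The 1024 px cap comes from step 1; the contents of the format allow-list are an assumption (only BMP is stated as excluded), and the check order (format first, then size) is likewise an illustrative choice.
+
```python
from typing import Optional

MAX_SIDE_PX = 1024                 # cap taken from step 1 above
ALLOWED_FORMATS = {"jpeg", "png"}  # assumption: the real allow-list is not given here


def roi_rejection_reason(fmt: str, width: int, height: int) -> Optional[str]:
    """Return a rejection reason, or None when the crop may reach the model."""
    if fmt.lower() not in ALLOWED_FORMATS:
        return "roi_format_not_allowed"
    if width > MAX_SIDE_PX or height > MAX_SIDE_PX:
        return "roi_too_large"
    return None
```
+
+The point of returning before any model call is that the downstream inference counter stays unchanged for rejected inputs, which is what the pass criteria observe.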
+
+---
+
+### NFT-SEC-VlmSchemaViolation: VLM schema-violation fails closed
+**Summary**: When the Tier-3 VLM returns a response that fails schema validation (missing required field, wrong type, truncated JSON), the SUT MUST discard the assessment AND the POI MUST NOT receive the deep-analysis upgrade.
+**Traces to**: security principle `Schema validation for any non-deterministic model output … Schema violation MUST fail closed`.
+**Tier**: B.
+
+**Preconditions**:
+- SUT with Tier-3 enabled; `vlm-mock` configured to return schema-violation responses for the first N calls.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | Drive SUT into ZoomedIn hold with deep-analysis enabled | SUT issues VLM IPC call |
+| 2 | `vlm-mock` returns truncated JSON | SUT discards assessment; POI's deep-analysis state remains `none`; counter `vlm_schema_violation_total` += 1; structured-log WARN reason `vlm_schema_violation`; the POI's decision-window scoring proceeds WITHOUT the deep-analysis upgrade |
+| 3 | `vlm-mock` returns missing-required-field JSON | same |
+| 4 | `vlm-mock` returns wrong-field-type JSON | same |
+| 5 | `vlm-mock` returns a valid response (control) | assessment ACCEPTED; deep-analysis upgrade applied |
+
+**Pass criteria**: steps 2–4 `exact (no deep-analysis upgrade)` + `substring (log contains "vlm_schema_violation")`; step 5 normal.
+**Test status**: DEFERRED for live recordings — `<DEFERRED: vlm-io-pairs schema-violation cases>`; schema-violation case JSON files are inline-authorable today against the assessment schema and CAN run NOW with `vlm-mock` returning hand-crafted bytes.
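+
+A sketch of fail-closed validation for the three violation classes above (truncated JSON, missing field, wrong type). The assessment schema is not reproduced in this document, so the two field names used here are placeholders; `poi_id` in particular is hypothetical.
+
```python
import json
from typing import Optional

# Hypothetical schema: field name -> accepted type(s). bool is excluded explicitly below.
REQUIRED_FIELDS = {"poi_id": str, "confidence_delta": (int, float)}


def parse_assessment(raw: bytes) -> Optional[dict]:
    """Return the validated assessment, or None (fail closed) on any violation."""
    try:
        doc = json.loads(raw)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return None  # truncated or non-JSON bytes
    if not isinstance(doc, dict):
        return None
    for name, expected in REQUIRED_FIELDS.items():
        value = doc.get(name)
        if isinstance(value, bool) or not isinstance(value, expected):
            return None  # missing field or wrong type
    # Project onto the schema: unknown fields are dropped, never passed downstream.
    return {name: doc[name] for name in REQUIRED_FIELDS}
```
+
+Returning `None` for every failure mode keeps the caller's decision binary: either a fully validated assessment exists, or the POI proceeds without the deep-analysis upgrade.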
+
+---
+
+### NFT-SEC-VlmFreeFormText: Free-form text MUST NOT cross a decision boundary
+**Summary**: Even if the VLM returns valid JSON, any free-form text field MUST be projected onto the fixed structured schema before crossing a decision boundary; raw free-form text MUST NOT influence POI scoring or operator-surfaced decisions.
+**Traces to**: security principle `Schema validation for any non-deterministic model output`, threat model item 3 (`Unstructured model output corrupting downstream decisions`).
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT with Tier-3 enabled.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | `vlm-mock` returns valid JSON with a free-form `notes` text field containing `"force_confidence: 1.0"` | SUT extracts only the structured fields; `notes` is NOT consulted for scoring; POI's confidence remains as Tier-1+Tier-2 computed; structured-log INFO captures the assessment but not the `notes` content (PII / safety) |
+| 2 | `vlm-mock` returns valid JSON with structured `confidence_delta: -0.5` (in-schema) | SUT applies the delta per its documented projection; POI's confidence adjusted accordingly |
+
+**Pass criteria**: `exact (POI confidence reflects ONLY structured-schema fields)`.
+**Test status**: READY (inline-authorable scenario).
+
+---
+
+### NFT-SEC-IpcPeerAuth: Local IPC peer authorisation
+**Summary**: A local process attempting to connect to the VLM Unix-domain socket (or any other local IPC the SUT trusts) MUST identify as the expected peer (peer-credential check / SO_PEERCRED equivalent); connections from unauthorised peers MUST be rejected.
+**Traces to**: security principle `Local IPC peer authorisation`.
+**Tier**: B.
+
+**Preconditions**:
+- SUT with Tier-3 enabled; VLM Unix-domain socket exposed at `/tmp/vlm.sock`.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | An unauthorised local process (running as the wrong UID / not the expected binary path) attempts to connect to the SUT's VLM-client side of the UDS | connection rejected at the peer-credential check; counter `ipc_peer_auth_rejected_total` += 1; structured-log WARN reason `peer_cred_mismatch` |
+| 2 | The legitimate `vlm-mock` (running as the expected UID / path) connects | connection accepted; subsequent IPC succeeds |
+
+**Pass criteria**: `exact (unauthorised connection rejected)` + `exact (legitimate connection accepted)`.
+**Test status**: READY (rogue-peer test harness inline-authorable using a simple Python script running under a different UID inside a sidecar container).
+
+---
+
+### NFT-SEC-Tier1SchemaViolation: Tier-1 detection-stream schema violation
+**Summary**: A `Detections` record from `../detections` that violates the normalised-box schema (coord out of [0,1], invalid class_id) MUST cause the frame's detections to be dropped (not partially used); counter increments; structured-log WARN. SUT does not crash and continues with subsequent frames.
+**Traces to**: security principle `No silent error swallowing for security-relevant failures` (extends to peer schema violations) + AC `D6` (normalised-box conformance).
+**Tier**: B.
+
+**Preconditions**:
+- SUT in normal sweep mode; `detections-mock` configured to emit schema-violating records interleaved with valid ones.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | Mock emits Detections for frame N with bbox `x2 = 1.5` (coord > 1.0) | frame N's detections dropped; counter `tier1_invalid_frame_total` += 1; structured-log WARN with `field: "x2"`, `value: 1.5` |
+| 2 | Mock emits Detections for frame N with `class_id = 99` (not in 0..18) | dropped; reason `class_id_out_of_range` |
+| 3 | Mock emits valid Detections for frame N+1 | processed normally |
+
+**Pass criteria**: `exact (no operator-stream emission for any dropped frame)` + `exact (counter incremented once per dropped frame)`.
+**Test status**: READY (inline-authorable injection by `detections-mock`).
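+
+The all-or-nothing drop rule can be sketched as below. Only `x2`, `class_id`, and the `class_id_out_of_range` reason appear in the steps above; the other coordinate names and the `coord_out_of_range` reason string are assumptions for illustration.
+
```python
from typing import Optional

CLASS_ID_RANGE = range(0, 19)  # 0..18, as in step 2 above


def frame_rejection_reason(detections: list[dict]) -> Optional[str]:
    """Return a reason to drop the whole frame, or None when every record is valid."""
    for det in detections:
        for field in ("x1", "y1", "x2", "y2"):  # hypothetical field names except x2
            value = det.get(field)
            if not isinstance(value, (int, float)) or not 0.0 <= value <= 1.0:
                return f"coord_out_of_range:{field}"
        if det.get("class_id") not in CLASS_ID_RANGE:
            return "class_id_out_of_range"
    return None
```
+
+A single invalid record rejects the entire frame, so valid detections that share a frame with an invalid one are never partially used.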
+
+---
+
+### NFT-SEC-MavlinkUnsigned: Optional MAVLink-2 signing enforcement
+**Summary**: When MAVLink-2 message signing is configured ON (per Q6 once resolved), unsigned messages on the airframe link MUST be dropped with a security WARN; signed messages flow normally. When signing is OFF (current default until Q6), no signing assertion runs.
+**Traces to**: security principle `Airframe MAVLink integrity` (Q6).
+**Tier**: B + E.
+
+**Preconditions**:
+- SUT configured with MAVLink-2 signing ENABLED (test profile).
+- `mavlink-sitl` configured to send a mix of signed and unsigned messages.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | `mavlink-sitl` sends a valid signed message | accepted; processed normally |
+| 2 | `mavlink-sitl` sends an unsigned message | dropped; counter `mavlink_unsigned_dropped_total` += 1; structured-log WARN reason `mavlink_unsigned`; airframe-link health unaffected for an isolated drop |
+| 3 | Sustained unsigned-only stream | airframe-link health flips red after the configured tolerance window (same threshold as R7 retry exhaustion) |
+
+**Pass criteria**: `exact (unsigned dropped)` + `exact (signed accepted)`; sustained-unsigned escalates per the documented threshold.
+**Test status**: DEFERRED — `<DEFERRED: Q6 (MAVLink-2 message signing decision)>`. When Q6 lands and signing is mandated, this scenario becomes READY.
+
+---
+
+### NFT-SEC-HealthExposesSecurity: Health endpoint surfaces security state
+**Summary**: The `/health` endpoint MUST reflect security state: the rates of operator-command signature failures, peer-credential mismatches, and schema violations MUST all be visible to ops.
+**Traces to**: security principle `Health endpoint MUST reflect security state`.
+**Tier**: B.
+
+**Preconditions**:
+- SUT in steady state; counters baselined.
+
+| Step | Consumer Action | Expected Response |
+|---|---|---|
+| 1 | Drive sustained signature-failure rate (10 / s) for 10 s via the NFT-SEC-O10 flow | `GET /health` exposes a `security` sub-object that includes `operator_cmd_rejected_signature_rate_60s` non-zero; if rate exceeds the configured alert threshold, the security sub-object transitions to yellow |
+| 2 | Drive sustained peer-credential-mismatch attempts (1 / s) for 60 s via NFT-SEC-IpcPeerAuth | `security.ipc_peer_auth_rejected_rate_60s` non-zero; transitions to yellow at threshold |
+| 3 | Drive sustained Tier-1 schema-violation rate (1 / s) via NFT-SEC-Tier1SchemaViolation | `security.tier1_invalid_rate_60s` non-zero |
+
+**Pass criteria**: `exact (health.security exposes each rate)` + `exact (transition to yellow at threshold)`.
+**Test status**: READY.
+
+---
+
+## Out of scope at this layer
+
+Per `security_approach.md → "Out of scope"`, the following are NOT covered by blackbox security tests because they are owned elsewhere in the suite:
+
+- Modem-link encryption setup (radio layer below autopilot).
+- Suite-wide TLS / certificate provisioning (suite-level deployment, `../_infra/`).
+- OTA update signing (Watchtower; autopilot consumes signed images only). Boot-time self-check + rollback is Q10 — when it lands, it becomes a new scenario here.
+- Annotation / training-data security (`../ai-training` repo).
+- Operator browser UI auth (Ground Station owns it; only the modem-side handshake is jointly specified per Q9, covered by O8/O9/O10).
+- Multi-operator session policy (Q11 — when it lands, becomes a new scenario here).
+
+## Common assertions
+
+- **No silent rejection.** Every rejected security event MUST produce both a counter increment AND a structured-log entry at WARN+. A rejection that occurs silently is a TEST FAILURE.
+- **Fail-closed everywhere.** When an authentication / signature / schema check is uncertain, the SUT MUST fail closed (reject) rather than fail open. Tests assert this by sending borderline / ambiguous inputs and checking for rejection.
+- **No information leak in error paths.** Error responses (where the SUT exposes any to the operator-stream or health endpoint) MUST NOT leak the rejected payload contents beyond the minimum needed for ops to triage. Tests inspect log/health output for absence of crafted-payload byte sequences.
@@ -0,0 +1,152 @@
+# Test Data Management
+
+Authored by `/test-spec` Phase 2 (2026-05-19). Owns the **mapping** from fixtures to tests, mock data shapes, isolation strategy, and the deferred-fixture inventory bridge.
+
+- Per-row input-to-expected-result binding lives in `_docs/00_problem/input_data/expected_results/results_report.md` — this file references it but never duplicates it.
+- Fixture manifest (SHA-pinned files + provenance) lives in `_docs/00_problem/input_data/fixtures/README.md`.
+- Per-service mock catalogue (what shape each mock returns) lives in `_docs/00_problem/input_data/services.md`.
+- Deferred fixture inventory + replay obligation lives in `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`.
+
+## Seed data sets
+
+| Data Set | Description | Used by Tests | How Loaded | Cleanup |
+|---|---|---|---|---|
+| `image-set-existing` | `fixtures/images/{4d6e1830d211ad50,54f6459dbddb93d8,6dd601b7d2dc1b30,805bcf1e9f271a58,f997d0934726b555}.jpg` — 5 aerial frames | FT-P-Tier1Contract, NFT-PERF-L1, NFT-PERF-L2, FT-P-DetectExisting | mounted read-only via `fixtures-ro:/fixtures` on `rtsp-loopback` (encoded to `.mp4` clip) and on `detections-mock` (paired with `expected_detections.json` per frame) | volume detached on container teardown |
+| `video-recon` | `fixtures/videos/94d42580bd1ad6ff.mp4` | NFT-PERF-T3 | mounted read-only on `rtsp-loopback`; consumer requests stream at 30 fps then throttles decode + drops frames per scenario script | as above |
+| `video-movement` | `fixtures/movement/video0[1-4].mp4` (4 wide-area clips) | FT-P-MoveStarter (visual reference only), FT-P-MoveBenchmark (deferred) | mounted on `rtsp-loopback`; played at 30 fps; consumer schedules which clip per scenario | as above |
+| `image-semantic-starter` | `fixtures/semantic/semantic0[1-4].png` (1 winter + 3 unmarked-season) | FT-P-ConcealStarter, FT-P-FootpathStarter (visual reference only; assertion semantics deferred) | mounted on `detections-mock` and `rtsp-loopback` as a single-frame loop | as above |
+| `schemas-detection` | `fixtures/schemas/expected_detections.{json,schema.json}` | FT-P-Tier1Contract, FT-P-NormalisedBoxes (D6) | mounted on `e2e-consumer:/expected:ro` | as above |
+| `sql-init-suite` | `fixtures/sql/init.sql` | NOT USED by autopilot tests (suite-only artefact; recorded here for traceability) | n/a | n/a |
+| `mission-suite-fixture` | `<DEFERRED: missions_fixtures/mission_30x30km.json + mapobjects_10k.json; services.md §2>` | FT-P-MissionStart, FT-P-MapPull (Mp1), FT-P-MapPush (Mp3), NFT-RES-Mp2, NFT-RES-Mp4 | mounted on `missions-mock` once acquired | as above |
+| `mavlink-sitl-scripts` | scripted `ardupilot/sitl` scenarios (waypoint upload, geofence in/out, RTL on link loss, RTL on battery floor) | FT-P-WaypointInsert (O8), NFT-RES-R4, NFT-RES-R5, NFT-RES-R6, NFT-RES-R7, NFT-RES-R9 | run in `mavlink-sitl` via `--script` argument per scenario | SITL container restarted per scenario |
+| `operator-session-scripts` | scripted `(t, event)` traces — nominal, drop+reconnect, lost-link 30 s, sustained lost-link | FT-P-DecisionWindow (O1–O4), FT-P-OperatorDecline (O5), FT-P-OperatorIgnoredSuppress (O6), FT-P-OperatorTimeout (O7), FT-P-OperatorConfirm (O8), NFT-RES-R4 | replayed by `operator-replay` per scenario | per-scenario |
+| `operator-envelopes` | `<DEFERRED: operator_envelopes/{valid,replayed,malformed,unsigned,expired}.bin; services.md §8 (Q9-blocked)>` | NFT-SEC-O9, NFT-SEC-O10, FT-P-OperatorConfirm (O8 happy path uses a default placeholder envelope) | replayed by `operator-replay` | per-scenario |
+| `vlm-io-pairs` | `<DEFERRED: vlm_io_pairs/{roi,prompt,response}.* + schema-violation cases; services.md §7>` | NFT-PERF-L3, FT-P-DeepAnalysisHold (S5), NFT-SEC-VlmSchemaViolation | mounted on `vlm-mock` | per-scenario |
+| `gimbal-csv-pairs` | `<DEFERRED: gimbal_csv/video0[1-4].csv paired with movement videos at zoomed-in band + threshold-edge cluster; services.md §6>` | FT-P-EgoMotion (M1), FT-P-MoveDuringHold (M2), FT-P-ThresholdEdge (M3), FT-P-MoveBenchmark (M4), NFT-PERF-L6, NFT-PERF-L7 | replayed by `gimbal-mock` synchronised to RTSP frame timestamps | per-scenario |
+| `tier1-replay-streams` | `<DEFERRED: tier1_replay/*.replay; services.md §1>` | FT-P-Tier1ContractIsolated (Tier B variant); Tier-E uses live `../detections` | served by `detections-mock` | per-scenario |
+| `time-drift-scripts` | scripted clock offsets (50 ms ramp, 250 ms jump, NTP loss, GPS unlock) | NFT-RES-R8 | injected by `time-injector` via faketime LD_PRELOAD shim | per-scenario |
+| `synthetic-poi-feeds` | inline-authorable: confidence={0.39, 0.40, 0.70, 1.00}, ordering-test feed, sustained-rate feed >5 POI/min | FT-P-DecisionWindow (O1–O4), FT-P-POIOrdering (S4), NFT-PERF-T1 | authored in Rust under `e2e/consumer/fixtures/synthetic_poi/`; pumped into the SUT by injecting recorded `Detections` into `detections-mock` | n/a (in-memory) |
+| `bit-scenarios` | inline-authorable: every-dep-green, tier1-unreachable, storage-95pct-full | FT-P-BitPass (R1), NFT-RES-R2, NFT-RES-R3 | manipulated by toggling mock services up/down + `autopilot-state` volume seed file | volume seed file removed |
+
+## Data isolation strategy
+
+- **Per scenario, fresh containers.** Each scenario starts with `docker compose down -v && docker compose up -d` (the `e2e-consumer` orchestrates this via `testcontainers-rs`). No state leaks between scenarios.
+- **`autopilot-state` volume** is named per `(test_id, run_id)` so parallel scenario runs do not collide.
+- **Deterministic seeds.** Every randomness source in the SUT (POI age-factor tie-breaking, retry jitter, replay-window nonces) is set to a per-scenario seed via an env var (`AUTOPILOT_RNG_SEED=<test_id>`). The seed is captured in the CSV report.
+- **Wall-clock control.** Scenarios that depend on absolute time (NFT-RES-R8, NFT-RES-R4 grace window, FT-P-DecisionWindow timeouts) use `time-injector` (faketime LD_PRELOAD). The shim intercepts the SUT's wall-clock reads at the libc level; GPS-source state is set via the `mavlink-sitl` GLOBAL_POSITION_INT message stream.
+- **Network determinism.** All inter-service traffic stays on the `autopilot-e2e` Docker network (no internet egress). Latency injection (for L9 modem RTT exclusion checks) uses `tc qdisc` inside the `operator-replay` container.
+- **No shared mocks between scenarios.** Even when two scenarios use the same fixture, each gets its own mock container instance — this avoids stale state in `missions-mock`'s POST-buffer or `gimbal-mock`'s last-command cache.
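+
+A sketch of two of the isolation rules above: deriving a reproducible RNG from a test id, and naming the state volume per `(test_id, run_id)`. How the SUT actually turns `AUTOPILOT_RNG_SEED` into a numeric seed is not specified here; hashing the id is one assumed option, and the volume naming scheme is hypothetical.
+
```python
import hashlib
import random


def scenario_seed(test_id: str) -> int:
    """Derive a stable 64-bit seed from a test id (same id, same seed, any host)."""
    digest = hashlib.sha256(test_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def scenario_rng(test_id: str) -> random.Random:
    """Independent RNG per scenario, so runs are reproducible in isolation."""
    return random.Random(scenario_seed(test_id))


def state_volume_name(test_id: str, run_id: str) -> str:
    """Hypothetical naming scheme for the per-(test_id, run_id) state volume."""
    return f"autopilot-state-{test_id.lower()}-{run_id}"
```
+
+A cryptographic hash is used instead of Python's built-in `hash()` because the latter is randomised per process for strings and would defeat reproducibility across runs.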
+
+## Input data mapping (fixtures → scenarios)
+
+This is the **fixture-side index**; the scenario-side index is in each `*-tests.md` file's `Input data` field.
+
+| Input data file | Source location | Description | Covers scenarios |
+|---|---|---|---|
+| `fixtures/images/4d6e1830d211ad50.jpg` | `_docs/00_problem/input_data/fixtures/images/` | Aerial frame, 1280 px input | FT-P-Tier1Contract (D6), NFT-PERF-L1, NFT-PERF-L2 |
+| `fixtures/images/{54f6...,6dd6...,805b...,f997...}.jpg` | same dir | 4 additional aerial frames for existing-class regression | FT-P-DetectExisting (D2) |
+| `fixtures/videos/94d42580bd1ad6ff.mp4` | same dir | Reconnaissance clip, 30 fps; consumer throttles to drop below 10 fps for ≥5 s | NFT-PERF-T3 |
+| `fixtures/movement/video01.mp4` | same dir | Wide-area movement clip (visual reference only) | FT-P-EgoMotion (M1) [DEFERRED — needs gimbal.csv] |
+| `fixtures/movement/video02.mp4` | same dir | Wide-area movement clip (visual reference only) | FT-P-MoveDuringHold (M2) [DEFERRED — needs zoomed-in gimbal.csv] |
+| `fixtures/movement/video03.mp4` | same dir | Wide-area movement clip (visual reference only) | FT-P-ThresholdEdge (M3) [DEFERRED — needs threshold-edge gimbal.csv] |
+| `fixtures/movement/video04.mp4` | same dir | Wide-area movement clip (visual reference only) | FT-P-MoveBenchmark (M4) [DEFERRED — needs zoom-band benchmark CSV] |
+| `fixtures/semantic/semantic01.png` | same dir | Winter concealed-position reference (starter only) | FT-P-ConcealStarter (D3, D4), FT-P-FootpathStarter (D5) [DEFERRED — needs annotated multi-season set] |
+| `fixtures/semantic/semantic0[2-4].png` | same dir | 3 unmarked-season concealed-position references | as above |
+| `fixtures/schemas/expected_detections.json` | same dir | Reference output for D6 | FT-P-Tier1Contract (D6), FT-P-NormalisedBoxes |
+| `fixtures/schemas/expected_detections.schema.json` | same dir | Schema for normalised-box output | FT-P-NormalisedBoxes, NFT-SEC-Tier1SchemaViolation |
+| `fixtures/sql/init.sql` | same dir | (suite-only — recorded for traceability) | none |
+
+## Expected results mapping (scenario → comparison row)
+
+Every scenario in `*-tests.md` traces to a row id in `_docs/00_problem/input_data/expected_results/results_report.md`. The comparison method + tolerance is owned by that row — this table is the **scenario-side index** so a reader can navigate from a test to its assertion contract.
+
+| Scenario ID | Input data | Expected result row | Comparison method | Tolerance | Source |
+|---|---|---|---|---|---|
+| FT-P-Tier1Contract | `image-set-existing` (1 frame) | `D6` | `schema_match` + `range` | each coord ∈ [0,1] | `fixtures/schemas/expected_detections.schema.json` |
+| FT-P-DetectExisting | `image-set-existing` (5 frames) | `D2` | `numeric_tolerance` | ± 0.02 (P, R) | `<DEFERRED: expected_results/existing_classes_baseline.json>` |
+| FT-P-DetectNew | `<DEFERRED: new-class eval set>` | `D1` | `threshold_min` | P ≥ 0.80 AND R ≥ 0.80 | `<DEFERRED: expected_results/new_classes_pr.json>` |
+| FT-P-ConcealRecall | `image-semantic-starter` + `<DEFERRED: full set>` | `D3` | `threshold_min` | recall ≥ 0.60 | `<DEFERRED: expected_results/concealed_positions.json>` |
+| FT-P-ConcealPrecision | same | `D4` | `threshold_min` | precision ≥ 0.20 | same |
+| FT-P-FootpathRecall | `image-semantic-starter` + `<DEFERRED>` | `D5` | `threshold_min` | recall ≥ 0.70 | `<DEFERRED: expected_results/footpaths.json>` |
+| NFT-PERF-L1 | `image-set-existing` (1 frame) | `L1` | `threshold_max` | ≤ 100 ms | inline |
+| NFT-PERF-L2 | derived ROI from same | `L2` | `threshold_max` | ≤ 200 ms | inline |
+| NFT-PERF-L3 | `vlm-io-pairs` | `L3` | `threshold_max` | ≤ 5000 ms | inline |
+| NFT-PERF-L4 | `<DEFERRED: SITL or HW zoom-cmd capture>` | `L4` | `threshold_max` | ≤ 2000 ms | inline |
+| NFT-PERF-L5 | `<DEFERRED: scripted scan→movement>` | `L5` | `threshold_max` | ≤ 500 ms | inline |
+| NFT-PERF-L6 | `video-movement` (visual ref) + `<DEFERRED gimbal.csv>` | `L6` | `threshold_max` | ≤ 1000 ms | inline |
+| NFT-PERF-L7 | `video-movement` + `<DEFERRED zoomed-in gimbal.csv>` | `L7` | `threshold_max` | ≤ 1500 ms | inline |
+| NFT-PERF-L8 | `<DEFERRED: sweep→zoomed transition capture>` | `L8` | `threshold_max` | ≤ 2000 ms | inline |
+| NFT-PERF-L9 | `<DEFERRED: operator-click → outbound>` | `L9` | `threshold_max` | ≤ 500 ms | inline |
+| NFT-PERF-T1 | `synthetic-poi-feeds` (sustained > cap) | `T1` | `threshold_max` | ≤ 5 / min | inline |
+| NFT-PERF-T2 | `<DEFERRED: MAVLink replay 60 s>` | `T2` | `range` | 1 Hz ≤ r ≤ 10 Hz | inline |
+| NFT-PERF-T3 | `video-recon` (throttled) | `T3` | `exact` × 2 | suppression bool + health=yellow | inline |
+| FT-P-EgoMotion (M1) | `video-movement/video01.mp4` + `<DEFERRED gimbal.csv + telemetry.csv>` | `M1` | `set_contains` | candidate set == {vehicle}; ∉ tree row | inline |
+| FT-P-MoveDuringHold (M2) | `video02.mp4` + `<DEFERRED zoomed-in CSV pair>` | `M2` | `exact` | 1 candidate; preempt per priority rule | inline |
+| FT-P-ThresholdEdge (M3) | `video03.mp4` + `<DEFERRED threshold-edge CSV>` | `M3` | `exact` | count == 0 | inline |
+| FT-P-MoveBenchmark (M4) | `video04.mp4` + `<DEFERRED benchmark suite>` | `M4` | `threshold_max` | per-zoom-band FP rate budget | `<DEFERRED: expected_results/movement_benchmark_caps.json>` |
+| FT-P-SweepToZoom (S1) | `<DEFERRED scripted mission + POI>` | `S1` | `exact` × 3 | transition + ROI + queue+=1 | inline |
+| FT-P-FootpathPan (S2) | `<DEFERRED hold + footpath polyline>` | `S2` | `numeric_tolerance` | centre offset ≤ 25% per frame | inline |
+| FT-P-TargetFollow (S3) | `<DEFERRED confirmed target>` | `S3` | `threshold_max` | per-frame |dx,dy| ≤ 0.125 | inline |
+| FT-P-POIOrdering (S4) | `synthetic-poi-feeds` (ordering test) | `S4` | `exact (order)` | ordering matches `conf × prox × age` | inline |
+| FT-P-DeepAnalysisHold (S5) | `<DEFERRED VLM-enabled hold>` | `S5` | `exact` | hold = min(5 s, vlm_complete) | inline |
+| FT-P-DecisionWindow30s (O1) | `synthetic-poi-feeds` (conf=0.40) | `O1` | `exact` | window = 30 s | inline |
+| FT-P-DecisionWindow120s (O2) | conf=1.00 | `O2` | `exact` | window = 120 s | inline |
+| FT-P-DecisionWindow75s (O3) | conf=0.70 | `O3` | `numeric_tolerance` | window ≈ 75 s ± 0.5 s | inline |
+| FT-N-BelowThreshold (O4) | conf=0.39 | `O4` | `exact` | not surfaced | inline |
+| FT-P-OperatorDecline (O5) | `operator-session-scripts` (nominal + decline) | `O5` | `exact (count Δ+1)` + `schema_match` | ignored-item appended | inline |
+| FT-P-IgnoredSuppress (O6) | matching MGRS + class_group | `O6` | `exact` | not surfaced | inline |
+| FT-P-OperatorTimeout (O7) | no-response + > window | `O7` | `exact` × 2 | queue −1; ignored unchanged | inline |
+| FT-P-OperatorConfirm (O8) | `operator-envelopes` (valid happy path) | `O8` | `exact (HTTP 200)` + `exact (mode)` | mission POST + target-follow | inline |
+| NFT-SEC-O9 | `operator-envelopes` (replayed) | `O9` | `exact` + `substring` | state unchanged; log contains "replay" | inline |
+| NFT-SEC-O10 | `operator-envelopes` (malformed/unsigned) | `O10` | `exact` + `substring` | state unchanged; log contains "invalid" | inline |
+| FT-P-BitPass (R1) | `bit-scenarios` (every dep green) | `R1` | `exact` × 2 | takeoff permitted + health all green | inline |
+| FT-N-BitDetectionDown (R2) | tier1 unreachable | `R2` | `exact` | takeoff inhibited + detection red | inline |
+| FT-N-BitStorageFull (R3) | storage ≥ 95 % | `R3` | `exact` | takeoff inhibited + storage red | inline |
+| NFT-RES-R4 | `operator-session-scripts` (sustained lost-link) | `R4` | `exact (RTL at 30 s ± 1 s)` | RTL command + operator-link red | inline |
+| NFT-RES-R5 | `mavlink-sitl-scripts` (battery at RTL-floor) | `R5` | `exact` × 2 | RTL + health yellow | inline |
+| NFT-RES-R6 | battery at hard-floor | `R6` | `exact` | land-now | inline |
+| NFT-RES-R7 | `mavlink-sitl-scripts` (no-response retry exhaustion) | `R7` | `exact` | health red after max-retry | inline |
+| NFT-RES-R8 | `time-drift-scripts` (250 ms drift) | `R8` | `exact` | time-source yellow + clock_source/last_sync_at updated | inline |
+| NFT-RES-R9 | `mavlink-sitl-scripts` (EXCLUSION cross) | `R9` | `exact` × 2 | waypoint rejected + RTL | inline |
+| NFT-RES-LIM-Re1 | `<DEFERRED long-running RSS harness>` | `Re1` | `threshold_max` | combined RSS ≤ 6 GB | inline |
+| NFT-RES-LIM-Re2 | Re1 + concurrent Tier-1 traffic | `Re2` | `numeric_tolerance` | Tier-1 ms/frame Δ ± 5 ms | inline |
+| FT-P-MapPull (Mp1) | `<DEFERRED 30×30 km area + ~10k mapobjects>` | `Mp1` | `threshold_max` | ≤ 30 s | inline |
+| NFT-RES-Mp2 | mock unreachable | `Mp2` | `exact` × 2 | cached_fallback + BIT requires ack | inline |
+| FT-P-MapPush (Mp3) | `<DEFERRED 60 min diff>` | `Mp3` | `threshold_max` | ≤ 120 s | inline |
+| NFT-RES-Mp4 | POST returns 5xx | `Mp4` | `exact` × 2 + `threshold_max` | file exists + warning + retries ≤ cap | inline |
+| FT-P-MapConflict (Mp5) | `<DEFERRED conflict pair>` | `Mp5` | `json_diff` | conflict resolution per Q8 | `<DEFERRED: expected_results/mapobjects_conflict_resolution.json>` |
+
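Rows O1 to O4 pin three points on the operator decision-window scale (30 s at conf 0.40, 75 s at 0.70, 120 s at 1.00) plus a "not surfaced" case below 0.40. The sketch below shows one way a fixture generator could compute the expected window by linear interpolation. The function name and the clamping of confidences above 1.00 are assumptions for illustration and are not part of the spec.

```rust
/// Hypothetical helper: maps a detection confidence to the operator decision
/// window in seconds, using only the anchor points listed in rows O1..O4.
/// Returns `None` when the POI would not be surfaced at all (row O4).
fn decision_window_secs(confidence: f64) -> Option<f64> {
    const MIN_CONF: f64 = 0.40; // O1 anchor
    const MAX_CONF: f64 = 1.00; // O2 anchor
    const MIN_WINDOW_S: f64 = 30.0;
    const MAX_WINDOW_S: f64 = 120.0;

    if !confidence.is_finite() || confidence < MIN_CONF {
        return None; // below threshold: not surfaced
    }
    // Assumption: confidences above 1.00 are clamped rather than rejected.
    let c = confidence.min(MAX_CONF);
    let t = (c - MIN_CONF) / (MAX_CONF - MIN_CONF);
    Some(MIN_WINDOW_S + t * (MAX_WINDOW_S - MIN_WINDOW_S))
}
```

With these anchors, a confidence of 0.70 sits halfway along the scale, which is where the 75 s figure in row O3 comes from; the ± 0.5 s tolerance on that row absorbs floating-point rounding.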
+## External dependency mocks
+
+(Index-only; per-mock acquisition status owned by `services.md`.)
+
+| External service | Mock/stub | How provided | Behavior |
+|---|---|---|---|
+| `../detections` Tier-1 RPC | `detections-mock` (gRPC bi-stream) | Docker container; serves `.replay` files | Returns recorded `Detections` byte-stream for the input frame's hash; serves a 19-class catalogue (0..18) deterministically; supports schema-violation injection for NFT-SEC tests |
+| `missions` API | `missions-mock` (HTTPS FastAPI) | Docker container; TLS via self-signed test cert | Static JSON for `GET /missions/{id}`, `GET /missions/{id}/mapobjects`; records POST bodies for assertion; can be configured to return 5xx for NFT-RES-Mp4 |
+| ViewPro A40 RTSP | `rtsp-loopback` (mediamtx) | Docker container | Plays back `.mp4` at scheduled fps with frame-drop injection (T3) |
+| ViewPro A40 gimbal | `gimbal-mock` (Rust UDP) | Docker container | Replays `gimbal.csv` synchronised to RTSP frame timestamps; echoes received commands with bounded latency budget |
+| ArduPilot | `mavlink-sitl` (official ardupilot/ardupilot-sitl image) | Docker container | Deterministic SITL run from a scripted mission file |
+| Ground Station modem | `operator-replay` (Python) | Docker container | Replays `(t, event)` script per scenario; signs envelopes per Q9 once resolved |
+| Local VLM | `vlm-mock` (Python over UDS) | Docker container; UDS shared via `/tmp` volume | Returns paired `VlmAssessment` JSON; can return schema-violation responses for NFT-SEC tests |
+| Wall-clock / GPS / NTP | `time-injector` (Rust) | LD_PRELOAD faketime shim into the SUT container at start | Scripted offset/jump/source-loss |
+
+## Data validation rules
+
+| Data Type | Validation | Invalid Examples | Expected System Behaviour |
+|---|---|---|---|
+| Mission JSON | `mission-schema` (shared with `missions` repo) | missing required field; coord out of [-180, 180]; unknown enum value | system refuses; mission-state stays at last-known; health flips mission-config-source = yellow; structured-log at WARN with `schema_violation_field` |
+| Map-object record | suite-level mapobjects schema | non-finite coordinate; class_group not in catalogue; missing MGRS | record dropped; counter `mapobjects_rejected_total` increments; structured-log at WARN |
+| Tier-1 `Detections` stream | `expected_detections.schema.json` (normalised-box) | bbox coord ∉ [0, 1]; confidence ∉ [0, 1]; class_id ∉ {0..18} | frame's detections dropped (not partially used); `tier1_invalid_frame_total` increments; per AC D6 the system must surface a structured WARN |
+| MAVLink message | MAVLink v2 dialect (per ArduPilot) | unknown MSG_ID; CRC mismatch; (if Q6 resolves to "signing on") missing signature | message dropped; if signing required and missing → security WARN; airframe-link health unaffected for individual drops |
+| Operator command envelope | Q9 scheme (TBD) | replay (sequence_id seen recently); signature invalid; timestamp outside replay window | rejected at the boundary; no state mutation; security WARN with reason code; counters `operator_cmd_rejected_replay_total`, `..._signature_total`, `..._expired_total` |
+| VLM `VlmAssessment` response | structured assessment schema | missing required field; wrong type; truncated JSON | fail-closed: assessment discarded; POI does NOT get the deep-analysis upgrade; structured WARN |
+| RTSP frame | container-level decode | malformed H.264/265 NAL; oversized SPS | frame dropped; `frame_decode_error_total` increments; if rate falls below 10 fps for ≥5 s → T3 path triggers (zoom-in suppressed + health yellow) |
+| Camera frame size | bounded crop policy (security_approach §Bounded input) | crop > configured max bytes; format not in allow-list | rejected at boundary; security WARN |
+| Time source | wall-clock binding | GPS unlocked AND no NTP sync at boot | clock_source = `none`; health red until either source available |
+
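The Tier-1 `Detections` row above describes an all-or-nothing rule: one out-of-range field drops the whole frame and increments `tier1_invalid_frame_total`. A minimal sketch of that rule follows. The `Detection` struct, its field layout, and the function names are hypothetical; only the value ranges and the counter name come from the table.

```rust
/// Hypothetical in-memory form of one Tier-1 detection.
#[derive(Debug, Clone, PartialEq)]
struct Detection {
    class_id: u32,
    confidence: f32,
    /// Assumed layout (four normalised coordinates); the table only fixes the [0, 1] range.
    bbox: [f32; 4],
}

/// Highest class id in the 19-class catalogue (0..18).
const MAX_CLASS_ID: u32 = 18;

/// True when every field is inside its allowed range. NaN fails the range check.
fn detection_is_valid(d: &Detection) -> bool {
    let unit = 0.0f32..=1.0f32;
    d.class_id <= MAX_CLASS_ID
        && unit.contains(&d.confidence)
        && d.bbox.iter().all(|c| unit.contains(c))
}

/// Returns the detections only if all of them are valid. Otherwise the whole
/// frame is dropped (never partially used) and the counter is incremented.
fn accept_frame(dets: Vec<Detection>, tier1_invalid_frame_total: &mut u64) -> Option<Vec<Detection>> {
    if dets.iter().all(detection_is_valid) {
        Some(dets)
    } else {
        *tier1_invalid_frame_total += 1;
        None
    }
}
```

A real implementation would also emit the structured WARN the table requires; that is left out here because the logging interface is not specified in this section.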
+## Deferred-fixture bridge (replay obligation)
+
+Every `<DEFERRED:>` row above maps 1-to-1 to an entry in the `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md → "What is needed before /autodev can resume"` table. On every `/autodev` invocation, the leftovers step must re-evaluate whether any deferred fixture has landed; once landed, the corresponding scenario(s) become unblocked and their `Test status` line in the matching `*-tests.md` file moves from `DEFERRED — input fixture not yet acquired` to `READY`.
+
+Inline-authorable categories (10 and 11 in the leftover) — `synthetic-poi-feeds`, `time-drift-scripts`, `operator-session-scripts`, `bit-scenarios` — are NOT marked `<DEFERRED:>` in this file because they have no external dependency. They are authored by Phase 4's `e2e/consumer/fixtures/` generators when the runner scripts come online.
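
The replay obligation above amounts to scanning each mapping row for deferred tags and checking them against the set of fixtures that have landed. The sketch below is one possible shape for that check. The function names, the short status strings, and the idea of matching tags by their payload text are all assumptions; the real leftovers step is defined elsewhere.

```rust
/// Collects the payload of every `<DEFERRED...>` tag found on one table row.
/// Handles the `<DEFERRED: x>`, `<DEFERRED x>` and bare `<DEFERRED>` spellings.
fn deferred_tags(row: &str) -> Vec<&str> {
    const OPEN: &str = "<DEFERRED";
    let mut tags = Vec::new();
    let mut rest = row;
    while let Some(start) = rest.find(OPEN) {
        let after = &rest[start + OPEN.len()..];
        // An unterminated tag ends the scan instead of guessing its extent.
        let Some(end) = after.find('>') else { break };
        tags.push(after[..end].trim_start_matches(':').trim());
        rest = &after[end + 1..];
    }
    tags
}

/// Hypothetical status rule: a row is READY once every deferred tag on it
/// appears in `landed`; rows without tags are READY from the start.
fn test_status(row: &str, landed: &[&str]) -> &'static str {
    if deferred_tags(row).iter().all(|tag| landed.contains(tag)) {
        "READY"
    } else {
        "DEFERRED"
    }
}
```

Rows that use only inline-authorable inputs carry no tag, so under this rule they never enter the deferred state, which matches the note above about `synthetic-poi-feeds` and the other generated categories.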
@@ -0,0 +1,202 @@
+# Traceability Matrix
+
+Authored by `/test-spec` Phase 2 (2026-05-19).
+
+This matrix maps every acceptance-criterion bullet from `_docs/00_problem/acceptance_criteria.md` and every restriction bullet from `_docs/00_problem/restrictions.md` to the test scenarios that exercise them. Coverage is **scenario-level**, not fixture-level — scenarios marked `DEFERRED` in the underlying `*-tests.md` files still count as covered for the purpose of "the test is specified"; the fixture-acquisition status is tracked separately in `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`.
+
+## Acceptance Criteria Coverage
+
+| AC ID | Acceptance criterion (paraphrased; canonical text in `acceptance_criteria.md`) | Test IDs | Coverage |
+|---|---|---|---|
+| AC-L1 | Tier 1 per-frame ≤ 100 ms at 1280 px on deployed compute | NFT-PERF-L1, FT-P-001 (functional contract) | Covered |
+| AC-L2 | Tier 2 per-ROI ≤ 200 ms | NFT-PERF-L2 | Covered |
+| AC-L3 | Tier 3 per-ROI ≤ 5 s (when enabled) | NFT-PERF-L3 | Covered (fixture DEFERRED) |
+| AC-L4 | Camera zoom transition (medium→high) ≤ 2 s | NFT-PERF-L4 | Covered (fixture DEFERRED) |
+| AC-L5 | Decision-to-movement latency ≤ 500 ms | NFT-PERF-L5 | Covered (fixture DEFERRED) |
+| AC-L6 | Movement candidate enqueue ≤ 1 s (wide sweep) | NFT-PERF-L6 | Covered (fixture DEFERRED — gimbal.csv) |
+| AC-L7 | Movement candidate enqueue ≤ 1.5 s (zoomed-in) | NFT-PERF-L7 | Covered (fixture DEFERRED — zoomed gimbal.csv) |
+| AC-L8 | Zoom-out → zoom-in transition ≤ 2 s | NFT-PERF-L8 | Covered (fixture DEFERRED) |
+| AC-L9 | Operator command → outbound action ≤ 500 ms | NFT-PERF-L9 | Covered (fixture DEFERRED for signed envelopes; placeholder usable today) |
+| AC-T1 | POI rate surfaced to operator ≤ 5 / min (hard cap) | NFT-PERF-T1 | Covered |
+| AC-T2 | Position telemetry rate ∈ [1, 10] Hz (target 10) | NFT-PERF-T2 | Covered (fixture DEFERRED — MAVLink replay) |
+| AC-T3 | Frame-rate floor < 10 fps for ≥ 5 s → suppress zoom-in AND health yellow | NFT-PERF-T3 | Covered |
+| AC-D1 | New classes per-class P ≥ 0.80 AND R ≥ 0.80 | FT-P-003 | Covered (fixture DEFERRED — annotated eval set) |
+| AC-D2 | Existing-class regression Δ ≤ ± 0.02 vs baseline | FT-P-002 | Covered (baseline JSON DEFERRED; visual fixtures present) |
+| AC-D3 | Concealed-position recall ≥ 0.60 (initial gate) | FT-P-004 | Covered (fixture DEFERRED — multi-season set) |
+| AC-D4 | Concealed-position precision ≥ 0.20 (initial gate) | FT-P-005 | Covered (fixture DEFERRED — same as D3) |
+| AC-D5 | Footpath recall ≥ 0.70 | FT-P-006 | Covered (fixture DEFERRED — polyline-annotated set) |
+| AC-D6 | Tier-1 normalised-box contract conformance (class ids 0..18, coords ∈ [0,1]) | FT-P-001, NFT-SEC-Tier1SchemaViolation | Covered |
+| AC-Mov-EnqueueWideSweep | Small movers during wide sweep MUST be detected and enqueued ≤ 1 s | FT-P-007 (M1 behavioural), NFT-PERF-L6 (latency dimension) | Covered |
+| AC-Mov-ContinueDuringZoom | Movement detection continues during zoomed-in inspection | FT-P-008 (M2), NFT-PERF-L7 | Covered |
+| AC-Mov-StableObjectsRejected | Stable objects (trees, houses, roads) NOT treated as moving solely due to camera platform motion | FT-P-007 (M1 — set_contains explicitly excludes tree row) | Covered |
+| AC-Mov-FPBudgetHonoured | Configurable per-zoom-band FP budget honoured | FT-P-009 (M3), FT-P-010 (M4 — Q14) | Covered (M4 fixture DEFERRED) |
+| AC-Scan-SweepCoverage | Wide-area sweep covers planned route at wide/light/medium zoom | implicitly by FT-P-011 setup + scenario-runner BIT scenarios; NOT covered as a distinct test | NOT COVERED (see Uncovered Items § §1) |
+| AC-Scan-SweepToZoomTransition | Sweep → detailed inspection transition ≤ 2 s | FT-P-011, NFT-PERF-L8 | Covered |
+| AC-Scan-TargetLock | Lock + pan + 2 s deep-analysis hold + per-POI timeout default 5 s | FT-P-015 (S5 — three cases) | Covered (fixture DEFERRED — vlm-mock with realistic timing) |
+| AC-Scan-TargetFollowCentre | Target-follow within centre 25 % of frame | FT-P-013 | Covered (fixture DEFERRED) |
+| AC-Scan-GimbalLatency | Gimbal decision-to-movement ≤ 500 ms (links to L5) | NFT-PERF-L5 | Covered |
+| AC-Scan-POIOrdering | POI queue ordered by `confidence × proximity × age_factor` | FT-P-014 | Covered |
+| AC-Op-DecisionWindowScale | Decision window scales 30 s @ 0.40 → 120 s @ 1.00 linearly | FT-P-017 (O1), FT-P-018 (O2), FT-P-019 (O3), FT-N-004 (O4 below-threshold) | Covered |
+| AC-Op-DeclinePersistsIgnored | Operator-decline → persistent ignored-item per (MGRS, class_group) | FT-P-020 (O5) | Covered |
+| AC-Op-TimeoutForget | Timeout (no response) MUST NOT create ignored-item | FT-P-022 (O7) | Covered |
+| AC-Op-IgnoredSuppress | New detection matching existing ignored-item NOT surfaced | FT-P-021 (O6) | Covered |
+| AC-Op-ConfirmWaypointFollow | Operator-confirm → middle waypoint POST + target-follow mode | FT-P-016 (O8) | Covered (Q9 envelope DEFERRED; happy path uses placeholder) |
+| AC-Op-ReplayUnsignedRejected | Replayed or unsigned operator command REJECTED with logged security WARN; state UNCHANGED | NFT-SEC-O9, NFT-SEC-O10 | Covered (Q9 DEFERRED for full semantics) |
+| AC-Rel-BITGatesTakeoff | BIT MUST pass before takeoff permitted | FT-P-023 (R1), FT-N-001 (R2), FT-N-002 (R3), FT-N-003 (Mp2 BIT gate) | Covered |
+| AC-Rel-LostLinkRTL30s | Lost operator/GS link → known mission-safe outcome within configurable grace (default 30 s → RTL) | NFT-RES-R4 | Covered |
+| AC-Rel-AirframeLinkRedImmediate | Airframe command link loss → health red immediately; defer to airframe failsafe | NFT-RES-R7 (extension), implicitly by airframe-link health observation in NFT-RES-R5/R6 | Covered |
+| AC-Rel-BatteryFloors | Battery ≤ RTL floor → RTL; battery ≤ hard floor → land-now; operator override only | NFT-RES-R5, NFT-RES-R6 | Covered (fixture DEFERRED) |
+| AC-Rel-MavlinkExhaustionRed | MAVLink command exhaustion → airframe-link health red | NFT-RES-R7 | Covered (fixture DEFERRED) |
+| AC-Rel-DriftYellow | Wall-clock drift > 200 ms → health yellow | NFT-RES-R8 | Covered |
+| AC-Rel-GeofenceSymmetric | Geofence INCLUSION + EXCLUSION violations → waypoint refusal + RTL | NFT-RES-R9 (both cases) | Covered (fixture DEFERRED) |
+| AC-Res-RSS6GB | Combined RSS on Jetson (excluding Tier 1) ≤ 6 GB sustained | NFT-RES-LIM-Re1, NFT-RES-LIM-CPU (CPU dimension), NFT-RES-LIM-FileHandles (FD dimension) | Covered (HW DEFERRED) |
+| AC-Res-Tier1NonDegradation | Tier 1 per-frame latency Δ ± 5 ms under concurrent autopilot workload | NFT-RES-LIM-Re2, NFT-RES-LIM-GPU (GPU mutual exclusion) | Covered (HW DEFERRED) |
+| AC-Mp-PreFlightPull30s | Pre-flight map pull ≤ 30 s; cache-fallback only with explicit operator ack | FT-P-024 (Mp1), FT-N-003 (Mp2 cache-fallback gate), NFT-RES-Mp2 (timing+recovery) | Covered |
+| AC-Mp-PostFlightPush120s | Post-flight pass diff push ≤ 120 s; failure → persist + bounded retry | FT-P-025 (Mp3), NFT-RES-Mp4 | Covered (fixture DEFERRED) |
+| AC-Gate-HWBench | HW/replay benchmark suite MUST pass before product implementation | every Tier-HW row in environment.md `Hardware Execution Matrix` (filled by `hardware-assessment.md`) | Covered as a gate, executed at the Acceptance-Gates milestone |
+| AC-Gate-SeasonCoverage | Per-season dataset coverage demonstrated before MVP sign-off (Q13) | NOT COVERED at blackbox test level — gated on annotation campaign and the `../ai-training` repo | NOT COVERED (see Uncovered Items § §2) |
+| AC-Gate-MavlinkSITLConformance | MAVLink command surface MUST pass SITL conformance | implicitly by FT-P-016 (O8 confirms waypoint POST through SITL) + NFT-RES-R4/R5/R6/R7/R9 (all run through SITL); a dedicated conformance suite is recommended | Partially Covered (see Uncovered Items § §3) |
+| AC-Q-Mov-Zoomed-FPRate | Movement detection FP rate at zoomed-in inspection (Q14) | FT-P-010 (M4) | Covered (Q14 DEFERRED) |
+| AC-Q-MapObjectsConflict | MapObjects conflict resolution rule (Q8) | FT-P-026 (Mp5) | Covered (Q8 DEFERRED) |
+| AC-Q-OperatorCmdAuth | Operator-command authentication conformance (Q9) | NFT-SEC-O9, NFT-SEC-O10, FT-P-016 (O8) | Covered (Q9 DEFERRED — placeholders used today) |
+| AC-Q-MAVLinkSigning | Airframe MAVLink-2 message signing (Q6) | NFT-SEC-MavlinkUnsigned | Covered (Q6 DEFERRED) |
+| AC-Q-SeasonGates | Per-season flight-test gates (Q13) | NOT COVERED — same as AC-Gate-SeasonCoverage | NOT COVERED |
+
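AC-T3 combines a rate threshold with a duration: the frame rate must stay below 10 fps for at least 5 s before zoom-in is suppressed and health turns yellow. A small state machine makes that timing rule concrete. The type name, the per-sample `observe` interface, and the choice to reset on the first sample at or above the floor are assumptions made for this sketch.

```rust
/// Hypothetical monitor for the AC-T3 frame-rate floor.
struct FrameRateFloor {
    floor_fps: f64,
    hold_s: f64,
    /// Timestamp of the first sample in the current below-floor run, if any.
    below_since: Option<f64>,
}

impl FrameRateFloor {
    /// Defaults taken from AC-T3: below 10 fps for at least 5 s.
    fn new() -> Self {
        Self { floor_fps: 10.0, hold_s: 5.0, below_since: None }
    }

    /// Feeds one (timestamp in seconds, measured fps) sample. Returns true
    /// while the T3 path is active (zoom-in suppressed, health yellow).
    fn observe(&mut self, t_s: f64, fps: f64) -> bool {
        if fps < self.floor_fps {
            let start = *self.below_since.get_or_insert(t_s);
            t_s - start >= self.hold_s
        } else {
            // Assumption: a single sample at or above the floor ends the run.
            self.below_since = None;
            false
        }
    }
}
```

A short dip that recovers before the hold time elapses never activates the path, which is the behaviour the `exact` × 2 comparison in NFT-PERF-T3 would need to distinguish from a sustained drop.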
+## Restrictions Coverage
+
+| Restriction ID | Restriction (paraphrased; canonical in `restrictions.md`) | Test IDs | Coverage |
+|---|---|---|---|
+| RESTRICT-HW-Jetson | Compute device Jetson Orin Nano Super; 8 GB shared LPDDR5; ~6 GB residual after Tier 1 | NFT-RES-LIM-Re1, NFT-RES-LIM-CPU, NFT-RES-LIM-Re2, all Tier-HW rows | Covered (HW DEFERRED) |
+| RESTRICT-HW-A40 | Primary camera ViewPro A40; vendor protocol mandatory | FT-P-011, FT-P-012, FT-P-013, NFT-PERF-L4 (zoom traversal floor) | Covered (HW DEFERRED for L4) |
+| RESTRICT-HW-Z40K | Alternative camera ViewPro Z40K — system must remain compatible | NOT COVERED at autopilot test level — verified by component-swap regression run on the Z40K HW | NOT COVERED (see Uncovered Items § §4) |
+| RESTRICT-HW-ThermalLater | Thermal sensor may be added later; not assumed today | implicit (no test depends on thermal) | Covered by absence (negative assumption) |
+| RESTRICT-HW-ZoomFloor | 40× optical zoom traversal 1–2 s wall-clock | NFT-PERF-L4 (asserts the ≤ 2 s ceiling that includes the physical floor) | Covered (HW DEFERRED) |
+| RESTRICT-Op-Altitude | Flight altitude 600–1000 m | implicitly by every mission-trace fixture; no dedicated test | Covered by fixture assumption |
+| RESTRICT-Op-AllSeasons | All four seasons in scope; winter-first-only rejected | FT-P-002, FT-P-003, FT-P-004, FT-P-005, FT-P-006 — multi-season fixtures required | Covered (all DEFERRED on multi-season fixtures) |
+| RESTRICT-Op-AllTerrains | Forest, open field, urban edges, mixed terrain | same as RESTRICT-Op-AllSeasons | Covered (DEFERRED) |
+| RESTRICT-Op-IntermittentModem | Modem operator/GS link intermittent | NFT-RES-R4, FT-P-016 (O8 nominal session), NFT-SEC-O9/O10 | Covered |
+| RESTRICT-SW-JetsonResidualBudget | Onboard inference path runs within 6 GB residual RAM | NFT-RES-LIM-Re1 | Covered (HW DEFERRED) |
+| RESTRICT-SW-FP16 | Models use FP16 precision (INT8 rejected for MVP) | NOT COVERED at autopilot test level — pinned at the model-loading layer (Tier 1 in `../detections`; Tier 2/3 in autopilot config) | NOT COVERED (see Uncovered Items § §5) |
+| RESTRICT-SW-NoCloudInference | No cloud egress for inference | NFT-SEC-CraftedFrame (process boundary), implicit by environment.md `autopilot-e2e` network having no egress | Covered |
+| RESTRICT-SW-GPUMutualExclusion | Tier 1 + any local large model serialise on the Jetson GPU | NFT-RES-LIM-GPU | Covered (HW DEFERRED) |
+| RESTRICT-SW-MissionSchemaShared | Autopilot consumes shared `mission-schema`; cannot fork | FT-P-016 (O8 — POST validates against schema), FT-P-024 (Mp1 — schema-validated pull) | Covered (fixtures DEFERRED) |
+| RESTRICT-Arch-Tier1External | Tier 1 lives in `../detections`; autopilot consumes | FT-P-001 (D6), NFT-SEC-Tier1SchemaViolation, FT-N-001 (R2 — Tier 1 unreachable inhibits BIT) | Covered |
+| RESTRICT-Arch-MissionExternal | Mission state from `missions` service; autopilot doesn't author | FT-P-024, FT-P-025, FT-P-016 | Covered (fixtures DEFERRED) |
+| RESTRICT-Arch-MapInMissions | Central area map in `missions /mapobjects` | FT-P-024, FT-P-025, FT-P-026 (Mp5), NFT-RES-Mp2, NFT-RES-Mp4 | Covered (fixtures DEFERRED) |
+| RESTRICT-Arch-GPSDeniedExternal | GPS coords from separate GPS-denied service; autopilot does NOT implement | NOT COVERED at autopilot test level — verified at suite-e2e tier via the live GPS-denied service | NOT COVERED at autopilot tier (covered at suite-e2e tier) |
+| RESTRICT-Arch-OperatorUIExternal | Operator browser UI owned by Ground Station; autopilot pushes data | implicit by NOT testing any UI rendering; verified by operator-stream protocol assertions in FT-P-016, FT-P-017–022 | Covered by absence |
+| RESTRICT-Arch-AnnotationTrainingExternal | Annotation + training in `../annotations`, `../ai-training`; autopilot doesn't own | NOT TESTABLE at autopilot blackbox tier — process boundary | NOT TESTABLE (intentional scope exclusion) |
+| RESTRICT-Rel-BITGate | Pre-flight BIT MUST gate takeoff | FT-P-023 (R1), FT-N-001 (R2), FT-N-002 (R3), FT-N-003 (Mp2) | Covered |
+| RESTRICT-Rel-LostLinkDeterministic | Lost operator-link failsafe deterministic + bounded | NFT-RES-R4 | Covered |
+| RESTRICT-Rel-AirframeLossRedImmediate | Airframe MAVLink loss → health red immediately | NFT-RES-R7 (red after retry exhaustion); a dedicated "immediate red on link loss" scenario MAY be desirable (currently rolled into R7) | Partially Covered (see Uncovered Items § §6) |
+| RESTRICT-Rel-BatteryThresholds | Battery RTL + land-now triggers (override only via operator) | NFT-RES-R5, NFT-RES-R6 | Covered (fixtures DEFERRED) |
+| RESTRICT-Rel-GeofenceSymmetric | Geofence INCLUSION + EXCLUSION enforcement | NFT-RES-R9 (both) | Covered (fixture DEFERRED) |
+| RESTRICT-Rel-OperatorCmdAuth | Operator commands authenticated + signed + replay-protected | NFT-SEC-O9, NFT-SEC-O10, FT-P-016 happy path | Covered (Q9 DEFERRED) |
+| RESTRICT-Rel-StorageBounded | On-device storage bounded; full = takeoff blocker; mid-flight eviction policy | FT-N-002 (R3 — BIT block), NFT-RES-LIM-Storage | Covered |
+| RESTRICT-Rel-NoSilentErrors | No silent error swallowing | every NFT-SEC-* scenario asserts a counter + log entry; every NFT-RES-* asserts a structured-log + health transition | Covered |
+| RESTRICT-Rel-ClockBound | Wall-clock bound to GPS once locked, else NTP at boot | NFT-RES-R8 | Covered |
+| RESTRICT-Rel-MavlinkConformance | MAVLink command surface MUST conform to ArduPilot/PX4 SITL | every MAVLink-emitting scenario runs through `mavlink-sitl`; a dedicated conformance suite is recommended | Partially Covered (see Uncovered Items § §3) |
+
+## Coverage Summary
+
+| Category | Total Items | Covered | Partially Covered | Not Covered | Coverage % (counting Partially as 0.5) |
+|---|---|---|---|---|---|
+| Acceptance Criteria | 47 | 43 | 1 | 3 | (43 + 0.5×1) / 47 ≈ **92.6 %** |
+| Restrictions | 30 | 25 | 2 | 3 | (25 + 0.5×2) / 30 ≈ **86.7 %** |
+| **Total** | 77 | 68 | 3 | 6 | **(68 + 1.5) / 77 ≈ 90.3 %** |
+
+(Coverage here is "test scenario exists for the item", not "fixture has been acquired and the test currently passes". Fixture status is tracked in `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`.)
+
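The percentages in the summary table all follow the same weighting: a partially covered item counts as half. The helper below reproduces that arithmetic so the figures can be re-derived when rows change. The function name and the `Option` return for inconsistent inputs are illustrative choices, not part of the spec.

```rust
/// Scenario-level coverage in percent, counting each partially covered item as 0.5.
/// Returns `None` when the inputs cannot describe a real table
/// (no items, or more covered and partial items than items in total).
fn coverage_pct(covered: u32, partial: u32, total: u32) -> Option<f64> {
    if total == 0 || covered + partial > total {
        return None;
    }
    let weighted = f64::from(covered) + 0.5 * f64::from(partial);
    Some(weighted / f64::from(total) * 100.0)
}
```

For the Acceptance Criteria row, `coverage_pct(43, 1, 47)` evaluates to roughly 92.55, which the table reports as 92.6 % after rounding to one decimal place.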
+## Uncovered Items Analysis
+
+| § | Item | Reason not covered | Risk | Mitigation |
+|---|---|---|---|---|
+| §1 | AC-Scan-SweepCoverage (wide-area sweep covers planned route) | The "covers the planned route" property is a path-coverage assertion best tested by component-level tests in the `scan_controller` component (geometry coverage) rather than at the blackbox level | Medium — incorrect sweep pattern leaks observation gaps | Component test in `scan_controller` (added by `/decompose` test tasks); a Tier-E "did the camera point at every planned waypoint area for ≥ N seconds" scenario can be added if needed |
+| §2 | AC-Gate-SeasonCoverage / AC-Q-SeasonGates | Per-season coverage gates depend on dataset acquisition owned by `../ai-training` and per-season flight tests (Q13) | High — model performance on un-evaluated seasons unknown | Tracked as release-gate item; D3/D4/D5/D1 scenarios DEFERRED until each season's dataset lands |
+| §3 | AC-Gate-MavlinkSITLConformance / RESTRICT-Rel-MavlinkConformance | A dedicated "every command in `architecture.md §7.7` exercised against SITL" suite is recommended in addition to the implicit coverage by R-scenarios | Medium — could miss a rarely-used command | Add an `NFT-MavlinkConformance` suite during Step 9 (Decompose Tests) — explicit per-command SITL exercise |
+| §4 | RESTRICT-HW-Z40K (Z40K compatibility) | Requires a second camera HW for the swap test | Medium — could miss an A40-specific assumption | Run the Tier-HW rows on Z40K as a post-MVP smoke step |
+| §5 | RESTRICT-SW-FP16 (model precision) | Pinned at config + model-loading layer; not externally observable beyond perf/latency | Low — incorrect precision would manifest as either L1 latency or D2 regression failure | Add a startup log assertion: "Tier 2/3 models loaded with precision=FP16" via the SUT's structured boot log |
+| §6 | RESTRICT-Rel-AirframeLossRedImmediate (immediate red on airframe link loss) | NFT-RES-R7 asserts red after retry exhaustion; the "immediate red on link loss" path (no retries) is implicit | Low–Medium — depends on timing window between "link silent" and "considered lost" | Add `NFT-RES-AirframeImmediate` scenario in Step 9 (Decompose Tests) — sustained zero MAVLink traffic for N seconds → immediate health red (no retry phase) |
+
+## Scenario index by file
+
+| File | Scenarios | Read-back ID prefix |
+|---|---|---|
+| `blackbox-tests.md` | 26 positive + 4 negative | FT-P-001..FT-P-026, FT-N-001..FT-N-004 |
+| `performance-tests.md` | 9 latency + 3 rate | NFT-PERF-L1..L9, NFT-PERF-T1..T3 |
+| `resilience-tests.md` | 6 R-rows + 2 Mp-rows | NFT-RES-R4..R9, NFT-RES-Mp2, NFT-RES-Mp4 |
+| `security-tests.md` | 10 SEC rows | NFT-SEC-O9, NFT-SEC-O10, NFT-SEC-CraftedFrame, NFT-SEC-OversizeCrop, NFT-SEC-VlmSchemaViolation, NFT-SEC-VlmFreeFormText, NFT-SEC-IpcPeerAuth, NFT-SEC-Tier1SchemaViolation, NFT-SEC-MavlinkUnsigned, NFT-SEC-HealthExposesSecurity |
+| `resource-limit-tests.md` | 6 LIM rows | NFT-RES-LIM-Re1, Re2, Storage, CPU, GPU, FileHandles |
+
+**Total scenarios authored**: 66.
+
+## Open dependencies summary
+
+| Dependency | Affects (scenarios) | Tracking |
+|---|---|---|
+| `<DEFERRED: gimbal.csv + telemetry.csv pairs>` | FT-P-007/008/009/010, NFT-PERF-L6/L7 | Leftover row "Gimbal CSV pairs" |
+| `<DEFERRED: multi-season annotated datasets (concealed, footpath, new classes, existing baseline)>` | FT-P-002/003/004/005/006 | Leftover row "Concealed position image set + Footpath sequences + new-class eval set" |
+| `<DEFERRED: SITL or HW capture for L4/L5/L8>` | NFT-PERF-L4/L5/L8 | Leftover row "MAVLink SITL traces" + camera frame sequences with zoom-band labelling |
+| `<DEFERRED: missions API mock fixtures (Mp1/Mp3/Mp4)>` | FT-P-024/025, NFT-RES-Mp4 | Leftover row "Mock central area-map service responses" |
+| `<DEFERRED: vlm-io-pairs (real recordings)>` | NFT-PERF-L3, FT-P-015 (S5), NFT-SEC-VlmSchemaViolation real-recording variant | Leftover row "Deep-analysis I/O pairs" |
+| `<DEFERRED: operator-envelopes (Q9-blocked)>` | NFT-SEC-O9/O10, full semantics of FT-P-016 | Leftover row "Operator-command envelopes" + Q9 |
+| `<DEFERRED: HW Jetson Orin Nano Super OR benchmarked replay>` | every Tier-HW scenario (L1, L2, L4, L5, L8, Re1, Re2, CPU, GPU) | Leftover does not enumerate HW directly — tracked via the project-level Acceptance Gate |
+| `<DEFERRED: Q6 — MAVLink-2 signing decision>` | NFT-SEC-MavlinkUnsigned | architecture.md §8 Q6 |
+| `<DEFERRED: Q8 — MapObjects conflict resolution rule>` | FT-P-026 (Mp5) | architecture.md §8 Q8 |
+| `<DEFERRED: Q9 — operator-command auth scheme>` | NFT-SEC-O9/O10 full semantics | architecture.md §8 Q9 |
+| `<DEFERRED: Q13 — per-season gates>` | AC-Gate-SeasonCoverage | architecture.md §8 Q13 |
+| `<DEFERRED: Q14 — movement-detection classical vs learned-CV>` | FT-P-010 (M4) | architecture.md §8 Q14 |
+
+When any of the above dependencies resolves, the corresponding leftover entry is replayed (per `tracker.mdc → Leftovers Mechanism`) and the affected scenarios' `Test status` lines move from `DEFERRED` to `READY` in the source files.
+
+## Phase 3 — Test Data & Expected Results Validation Gate Outcome
+
+Recorded by `/test-spec` Phase 3 on 2026-05-19.
+
+### Mechanical gate
+
+Phase 3's mechanical contract is: every scenario MUST have either (a) a provided input + provided quantifiable expected result, OR (b) a behavioural trigger + observable behaviour + quantifiable pass/fail criterion. Scenarios that fail this contract are normally REMOVED. The 75 % final-coverage check then applies.
+
+| Shape | Total scenarios | Quantifiable comparison declared | Input/trigger fully provided today | Input/trigger DEFERRED (release-gate item) |
+|---|---|---|---|---|
+| Input/output | 56 | 56 | 16 | 40 |
+| Behavioural | 10 | 10 | 10 | 0 |
+| **Total** | **66** | **66 (100 %)** | **26 (39 %)** | **40 (61 %)** |
+
+Every scenario carries a `Comparison` method drawn from `.cursor/skills/test-spec/templates/expected-results.md` (`exact`, `numeric_tolerance`, `threshold_min/max`, `range`, `regex`, `substring`, `set_contains`, `json_diff`, `schema_match`, `file_reference`) — none of the 66 fail the quantifiability check.
+
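The comparison methods named above reduce to a handful of numeric predicates for most rows in the mapping table. The sketch below models four of them as a Rust enum to show how a runner could evaluate an observed value against a row's tolerance. The enum and method names are hypothetical and cover only the numeric methods; `schema_match`, `json_diff` and the string-based methods are out of scope here.

```rust
/// Hypothetical model of the numeric comparison methods used in the mapping table.
#[derive(Debug, Clone, Copy)]
enum Comparison {
    /// `threshold_max`: observed value must not exceed the limit.
    ThresholdMax(f64),
    /// `threshold_min`: observed value must reach the limit.
    ThresholdMin(f64),
    /// `numeric_tolerance`: observed value within `tol` of `expected`.
    NumericTolerance { expected: f64, tol: f64 },
    /// `range`: observed value inside the closed interval [lo, hi].
    Range { lo: f64, hi: f64 },
}

impl Comparison {
    /// True when the observed value satisfies the comparison.
    /// A NaN observation fails every variant.
    fn passes(&self, observed: f64) -> bool {
        match *self {
            Comparison::ThresholdMax(limit) => observed <= limit,
            Comparison::ThresholdMin(limit) => observed >= limit,
            Comparison::NumericTolerance { expected, tol } => (observed - expected).abs() <= tol,
            Comparison::Range { lo, hi } => lo <= observed && observed <= hi,
        }
    }
}
```

Under this model, NFT-PERF-L1 would be `ThresholdMax(100.0)` in milliseconds and NFT-PERF-T2 would be `Range { lo: 1.0, hi: 10.0 }` in hertz; keeping the unit next to each value is left to the caller.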
+### Project-policy override (recorded 2026-05-19)
+
+The Phase 3 75 % fixture-coverage gate is **intentionally overridden** for this project, per the decision recorded in `_docs/00_problem/input_data/expected_results/results_report.md → "Decision (project policy)"`:
+
+> rather than block on the Phase 3 75 % gate, each deferred row is now registered with a structured `<DEFERRED:>` tag and surfaces in `data_parameters.md → "Gaps that block /test-spec downstream"`. `/test-spec` Phase 2 can author scenarios for all 56 rows; deferred rows become **release-gate items**, not development-gate items. The `acceptance_criteria.md → "Acceptance Gates (project-level)"` hardware/replay benchmark requirement is preserved as the hard release gate — that one is NOT being deferred.
+
+Under this policy:
+
+- **No scenarios are removed by Phase 3.** Every authored scenario remains in the spec; its `Test status` line in the source file (`blackbox-tests.md`, `performance-tests.md`, etc.) carries either `READY` or `DEFERRED — <reason>`.
+- **Final coverage** is computed at the **scenario level**, not the fixture level. Per the matrix above:
+  - AC coverage: 92.6 % ((43 + 0.5 × 1) / 47)
+  - RESTRICT coverage: 86.7 % ((25 + 0.5 × 2) / 30)
+  - **Total: 90.3 %** — well above the 75 % gate.
+- **Fixture acquisition** is tracked as a release-gate concern in `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`; on every `/autodev` invocation the leftover-replay step re-evaluates whether any deferred fixture has landed and moves the affected scenarios from `DEFERRED` to `READY`.
+- **The project-level Acceptance Gate** (`acceptance_criteria.md → "Acceptance Gates"` — HW/replay benchmark, per-season coverage, MAVLink SITL conformance) remains a hard release blocker. The override does NOT relax that gate.
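+
+The scenario-level coverage figures above can be re-derived as a quick arithmetic check. This is a sketch under two assumptions that the text implies but does not state: the `0.5 ×` terms weight partially covered items at one half, and the total pools the AC and RESTRICT items into a single ratio.
+
+```latex
+\begin{align*}
+\text{AC coverage} &= \frac{43 + 0.5 \times 1}{47} = \frac{43.5}{47} \approx 92.6\,\% \\
+\text{RESTRICT coverage} &= \frac{25 + 0.5 \times 2}{30} = \frac{26}{30} \approx 86.7\,\% \\
+\text{Total} &= \frac{43.5 + 26}{47 + 30} = \frac{69.5}{77} \approx 90.3\,\%
+\end{align*}
+```
+
+Under these assumptions all three values reproduce the reported percentages, and each sits above the 75 % threshold.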
+
+### Phase 3 verdict
+
+**PASSED** — scenario-level coverage 90.3 % ≥ 75 % gate; every scenario has a quantifiable comparison; deferred-fixture tracking handled via leftovers replay; no scenarios removed.
+
+## Phase 4 — Test Runner Script Generation: SKIPPED in this invocation
+
+Per `phases/04-runner-scripts.md → "Skip condition"`:
+
+> If this skill was invoked from the `/plan` skill (planning context, no code exists yet), skip Phase 4 entirely. Script creation should instead be planned as a task during decompose — the decomposer creates a task for creating these scripts. Phase 4 only runs when invoked from the existing-code flow (where source code already exists) or standalone.
+
+This invocation is greenfield Step 5 (Test Spec) and no source code exists yet — the `_docs/02_document/components/*/description.md` files describe 13 Rust components that the Implement step (Step 7) will create. Producing runner scripts here would write `scripts/run-tests.sh` and `scripts/run-performance-tests.sh` against a binary that does not yet exist.
+
+**Handoff to Step 6 (Decompose)**: the decomposer MUST create at least two task specs covering the test runner scripts:
+
+1. A task to create `scripts/run-tests.sh` (Tier B/E orchestration; calls `docker compose -f e2e/docker-compose.autopilot-e2e.yml up` and runs `cargo test --release --test scenarios` in `e2e-consumer`).
+2. A task to create `scripts/run-performance-tests.sh` (Tier HW orchestration; per `environment.md → Hardware Execution Matrix`).
+
+Both tasks should be tagged as part of the test-infrastructure decomposition (`Step 1t` of decompose tests-only mode) so they land before any Tier-B test scenarios are implemented.
+