[AZ-626] Decompose complete: 47 tasks + docs + module layout

Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy
Qt/C++ to a Rust workspace.

- Remove legacy Qt/C++ tree (ai_controller, drone_controller,
  misc/camera, python_scaffold, root Dockerfile, autopilot.pro,
  legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria,
  security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model,
  glossary, decision-rationale, deployment, 13 component descriptions,
  tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one
  bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for
  canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed,
  ready for Step 7 Implement).

Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks
AZ-640..AZ-686. Total complexity 173 points across 12 epics.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-19 11:02:01 +03:00
parent f7d6cb4a3a
commit bc40ea7300
235 changed files with 12585 additions and 15097 deletions
@@ -0,0 +1,60 @@
---
description: "Single Responsibility Principle applied to _docs/ artifacts. Each canonical file owns ONE concern and MUST NOT bleed into a sibling artifact's concern."
alwaysApply: true
---
# Artifact Single Responsibility
SRP is not only for code. Every canonical `_docs/` artifact owns exactly **one** concern. Mixing concerns across artifacts is a violation — fix the artifact, do not let the leak survive.
## Canonical artifact responsibilities
| Artifact | Owns ONLY | MUST NOT contain |
|---|---|---|
| `_docs/00_problem/problem.md` | What the system is for, the problem it solves, who uses it, the operational/environmental reality that defines the problem space. WHO + WHAT + WHY. | Technology choices, frameworks, languages, libraries, state-machine designs, component lists, internal data flows, IPC mechanisms, algorithms, schema names, "uses X library", "implements Y pattern". |
| `_docs/00_problem/restrictions.md` | Externally imposed constraints the system MUST satisfy: hardware (the device that already exists), regulatory, operational (deployment environment, climate, link reliability), vendor-fixed protocols (a chosen camera or radio whose protocol cannot be changed), legal/budget/timeline. | Design choices framed as constraints. "We chose Rust for memory safety" is design, not restriction. "The Jetson Orin Nano has 8 GB RAM" is a restriction. |
| `_docs/00_problem/acceptance_criteria.md` | Measurable, design-independent outcomes. What "done" looks like, expressed so a black-box test can verify it. | Implementation choices (libraries, params, algorithms, internal component names). AC is reverse-engineered FROM problem+restrictions, never FROM solution. |
| `_docs/00_problem/input_data/` | Reference data the system consumes + the input→quantifiable-expected-output mapping consumed by `/test-spec`. | Solution design or AC restatement. |
| `_docs/00_problem/security_approach.md` | Threat model + non-negotiable security principles + open security decisions. | Specific algorithms / libraries unless the AC truly mandates them (e.g. "must use AES-256" only if regulation forces it). |
| `_docs/01_solution/solution.md` | The chosen solution shape: high-level approach, the component breakdown name list, the tech stack with one-line rationale, pointers to the architecture deep dive. | Detailed flows (those belong in system-flows.md). Per-component contracts (those belong in component specs). Re-statement of the problem (point to it, do not duplicate). |
| `_docs/02_document/architecture.md` | System context, component layering, NFR targets, detailed design, MAVLink command surface, sync protocols, open architecture questions, scope boundary. The "how" at a system level. | Wholesale re-statement of problem.md, restrictions.md, AC, or solution overview. May briefly reference them; must not duplicate them. (If the project predates this rule and architecture.md has §Problem / §Restrictions / §AC sections, leave them but mark them as "MOVED to canonical location — keep this in sync or delete on next refactor".) |
| `_docs/02_document/system-flows.md` | Per-flow narratives + sequence diagrams. Behaviour over time. | Component implementation details (those live in component specs). |
| `_docs/02_document/data_model.md` | Canonical entity catalogue. | Component implementation details. |
| `_docs/02_document/components/<name>/description.md` | Per-component: purpose, inputs, outputs, responsibilities, state, failure modes, NFR targets, dependencies. | Cross-component flows (those live in system-flows.md). |
| `_docs/02_document/decision-rationale.md` | The "why" behind every load-bearing decision. Research evidence, reasoning chain, fact cards, fit matrix, validation log. | Authoritative architecture (point to architecture.md). |
| `_docs/02_document/glossary.md` | Project-specific terms only. | Generic CS/industry terms (RTSP, gRPC, JSON, etc.). |
## Litmus test (apply before writing or editing any of the above)
Before you save a file, scan each sentence and ask: **does this sentence belong to this artifact's concern (per the table above)?** If it belongs to a sibling artifact, move it there. Do not "summarise the system architecture in problem.md so the reader has context" — that is exactly the violation this rule exists to prevent.
Specific signals that you are leaking:
- problem.md mentions a programming language, framework, library, IPC mechanism, state-machine pattern, container, file format, RPC framework, or algorithm → solution leakage. Remove.
- restrictions.md says "we will use X because Y" or "the solution must be implemented with Z" → design choice masquerading as restriction. Move to solution.md (or architecture.md if it is a design non-negotiable).
- acceptance_criteria.md names a specific library, model file, or component → implementation leakage. Re-express as observable behaviour ("system returns N detections within Tms"), not "library X must return N detections".
- solution.md re-explains the problem in detail (more than a one-paragraph context-setter) → duplication. Point to problem.md instead.
- architecture.md restates AC numerically instead of referencing acceptance_criteria.md → duplication that will drift.
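The leak signals above lend themselves to a rough automated pre-check. The following is a minimal sketch, assuming a hand-maintained keyword list; `SOLUTION_TERMS` and `find_solution_leaks` are hypothetical names for illustration and are not part of the rule or of any existing tooling.

```python
import re

# Hypothetical keyword list for illustration only. The rule names categories
# (language, framework, library, IPC mechanism, ...), not these exact words.
SOLUTION_TERMS = ["rust", "python", "grpc", "protobuf", "docker", "state machine"]


def find_solution_leaks(text, terms=SOLUTION_TERMS):
    """Return (line_number, term) pairs for lines that mention a solution-side term."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for term in terms:
            # Word-boundary match so "rust" does not fire on "trust".
            if re.search(rf"\b{re.escape(term)}\b", line, flags=re.IGNORECASE):
                hits.append((lineno, term))
    return hits


sample = (
    "The UAV must work in winter snow.\n"
    "The service is written in Rust and uses gRPC.\n"
    "Operators trust the feed."
)
print(find_solution_leaks(sample))
```

A keyword scan like this can only flag candidates; deciding whether a sentence actually belongs to a sibling artifact still needs the per-sentence litmus test.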
## When a fact is genuinely cross-cutting
Sometimes a single fact touches multiple concerns. Pick the artifact whose concern is *primary* and reference from the others:
- "ViewPro A40 is the camera." Hardware reality → **restrictions.md**. solution.md / architecture.md reference it.
- "Tier-1 inference lives in `../detections`, not in autopilot." Architectural non-negotiable → **architecture.md §5**. solution.md mentions it; restrictions.md does NOT (it is not an external constraint, it is a chosen split).
- "Operator commands must be authenticated, signed, replay-protected." This is a **principle / restriction** the threat model imposes → security_approach.md owns the principle; architecture.md owns the chosen scheme.
- "≤5 POIs / minute" is a **product requirement** → acceptance_criteria.md owns it; architecture.md owns how scan_controller enforces it.
## When you are tempted to skip the rule
Common excuses and the answer:
- "But the architecture document was authored before the canonical problem/solution split existed." → Then the architecture document over-reaches into other concerns. Mark the over-reaching sections "MOVED — see <canonical path>" and shrink them on the next refactor. Do not propagate the over-reach into newly authored artifacts.
- "But the reader needs context to understand the problem statement." → Context for the *problem* means **operational + environmental + user reality** (e.g. "the UAV flies at 600–1000 m, must work in winter snow"). Context does NOT mean a tour of the solution design.
- "But everyone will read both files anyway." → Then the duplication is harmless? No — duplication drifts. The two copies diverge silently and a reader cannot tell which one is authoritative.
- "But the source paragraph in architecture.md said it this way." → architecture.md may itself be in violation (see the "If the project predates this rule" note in the architecture.md row of the table above). Do not propagate a pre-existing violation when authoring a new file.
## Enforcement
- Any `/autodev` or skill workflow that writes one of the canonical artifacts MUST self-check against the table above before saving.
- When auditing an existing artifact, flag any sentence that violates the table. If the violation is in a file you are editing for another reason, fix it inline (per the "adjacent hygiene" allowance in coderule.mdc → "Scope discipline"). If it is in a file outside your current scope, record it in `_docs/_process_leftovers/` for later cleanup.
@@ -1,10 +0,0 @@
FROM python:3.11-slim
ARG CI_COMMIT_SHA=unknown
ENV AZAION_REVISION=$CI_COMMIT_SHA
RUN apt-get update && apt-get install -y libxml2-dev libxslt1-dev && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
@@ -0,0 +1,93 @@
# Acceptance Criteria
Measurable, design-independent success criteria. Implementation choices (specific models, libraries, components, algorithms) belong in `_docs/01_solution/` and `_docs/02_document/`, NOT here. (Audited against `.cursor/rules/artifact-srp.mdc`.)
Every criterion below is observable through the system's external behaviour and can be evaluated by a black-box test.
## Latency
- Primitive (Tier 1) object detection — per-frame end-to-end on the deployed compute device: **≤100 ms** at 1280 px input.
- Semantic confirmation (Tier 2) over a single ROI: **≤200 ms**.
- Deep semantic confirmation (Tier 3 / VLM, when enabled): **≤5 s** per ROI.
- Camera zoom transition (medium → high): **≤2 s** wall-clock, including the physical zoom traversal.
- Decision-to-movement latency (internal scan-control decision → camera physically moving): **≤500 ms**.
- Movement candidate enqueue: **≤1 s** during the wide-area sweep; **≤1.5 s** during the zoomed-in inspection (accommodating gimbal slew).
- Zoom-out → zoom-in transition (POI detected → ROI fully zoomed): **≤2 s** wall-clock.
- Operator command → action: **≤500 ms** from operator click to outbound command (modem RTT excluded).
## Throughput / Rate
- POI rate surfaced to the operator: **≤5 POIs / minute** (hard cap; frozen 2026-05-06).
- Position telemetry rate: **≥1 Hz**, target **10 Hz**.
- Sustained camera frame-rate floor: **≥10 fps**. Below this, zoom-in transitions MUST be suppressed and overall health MUST surface yellow.
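For illustration only (this is not part of the criteria), the two rate rules above could be exercised with a sketch like the one below. It assumes a sliding 60-second window for the POI cap, which the criterion does not specify; `PoiRateCap` and `frame_rate_policy` are hypothetical names.

```python
from collections import deque


class PoiRateCap:
    """Sliding-window cap on surfaced POIs (window semantics are an assumption)."""

    def __init__(self, max_per_window=5, window_s=60.0):
        self.max_per_window = max_per_window
        self.window_s = window_s
        self._surfaced = deque()  # timestamps (seconds) of surfaced POIs

    def try_surface(self, now_s):
        # Forget POIs that have aged out of the window.
        while self._surfaced and now_s - self._surfaced[0] >= self.window_s:
            self._surfaced.popleft()
        if len(self._surfaced) >= self.max_per_window:
            return False  # cap reached: do not surface
        self._surfaced.append(now_s)
        return True


def frame_rate_policy(fps, floor_fps=10.0):
    """Below the floor: suppress zoom-in transitions and force health yellow."""
    below = fps < floor_fps
    return {"allow_zoom_in": not below, "force_health_yellow": below}


cap = PoiRateCap()
print([cap.try_surface(t) for t in (0, 10, 20, 30, 40, 50)])
print(frame_rate_policy(9.5))
```

A black-box test would observe the same behaviour from outside: feed more than five candidate POIs in a minute and count what reaches the operator.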
## Detection Quality
(Behaviour as observed at the system boundary. Model identity, training data, and label catalogue live in `_docs/02_document/architecture.md` and the `../ai-training` repo.)
- New target classes (black entrances, branch piles, footpaths, roads, trees, tree blocks): per-class **precision ≥80%** AND **recall ≥80%**.
- Existing-class regression: per-class precision and recall MUST NOT degrade by more than 2 percentage points against the documented baseline.
- Concealed-position recall (initial gate, accepting high false-positive rate): **≥60%**.
- Concealed-position precision (initial gate, operators filter): **≥20%**.
- Footpath recall: **≥70%**.
## Movement Detection Behaviour
- Small moving point/cluster candidates that are not yet classifiable MUST be detected during the wide-area sweep and enqueued for zoomed inspection within **≤1 s**.
- Movement detection MUST continue during the zoomed-in inspection (a moving target that appears inside a held POI must not be lost), with enqueue within **≤1.5 s**.
- Stable objects (trees, houses, roads, terrain) MUST NOT be treated as moving solely because the camera platform itself moves.
- A configurable per-zoom-band false-positive budget MUST be honoured (the system must not flood the operator with false candidates by ignoring its own threshold).
## Scan & Camera Control Behaviour
- The wide-area sweep MUST cover the planned route with a left-right gimbal pattern at wide or light/medium zoom.
- Transition from sweep to detailed inspection MUST complete within **≤2 s** of POI detection (including physical zoom).
- During detailed inspection the system MUST keep the target locked while the airframe flies, pan to keep features visible, hold endpoints up to **2 s** for deep analysis, and return to the sweep after analysis or a configurable per-POI timeout (default **5 s/POI**).
- After operator confirmation, target-follow mode MUST keep the target within the **centre 25%** of the frame while visible.
- Gimbal commands MUST achieve **≤500 ms** decision-to-movement latency with visibly smooth transitions.
- The POI queue MUST be ordered by confidence × proximity to current camera × age factor (relative ranking, not absolute formula).
## Operator Workflow
- The decision window surfaced to the operator MUST scale linearly with confidence: **40% confidence → 30 s; 100% confidence → 120 s**. Below 40% confidence, the POI MUST NOT be surfaced at all.
- Operator-decline MUST result in a persistent ignored-item entry for the matching `(MGRS cell, class group)` so the same target is not re-surfaced.
- Timeout (no operator response within the window) MUST NOT create an ignored-item entry (forget, do not blacklist).
- A new detection whose `(MGRS cell, class group)` matches an existing ignored-item MUST NOT be surfaced.
- Operator confirmation MUST result in (a) a middle waypoint inserted into the mission and (b) a transition to target-follow mode.
- A replayed or unsigned operator command MUST be rejected with a logged security warning; system state MUST NOT change.
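As a worked example of the decision-window and ignored-item rules above (illustration only, not a prescribed implementation), the sketch below interpolates linearly between the two stated anchor points and keys ignored items on `(MGRS cell, class group)`. The function and class names, the in-memory set standing in for persistent storage, and the sample MGRS string are all assumptions.

```python
from typing import Optional

MIN_CONF, MAX_CONF = 0.40, 1.00          # anchor confidences from the criterion
MIN_WINDOW_S, MAX_WINDOW_S = 30.0, 120.0  # anchor windows from the criterion


def decision_window_s(confidence: float) -> Optional[float]:
    """Linear window between the anchors; None means 'do not surface'."""
    if confidence < MIN_CONF:
        return None
    c = min(confidence, MAX_CONF)
    frac = (c - MIN_CONF) / (MAX_CONF - MIN_CONF)
    return MIN_WINDOW_S + frac * (MAX_WINDOW_S - MIN_WINDOW_S)


class IgnoredItems:
    """In-memory stand-in for the persistent ignored-item store."""

    def __init__(self):
        self._keys = set()

    def record_outcome(self, mgrs_cell, class_group, outcome):
        # Only an explicit decline blacklists; a timeout is simply forgotten.
        if outcome == "decline":
            self._keys.add((mgrs_cell, class_group))

    def should_surface(self, mgrs_cell, class_group, confidence):
        if (mgrs_cell, class_group) in self._keys:
            return False
        return decision_window_s(confidence) is not None


ignored = IgnoredItems()
ignored.record_outcome("36UXV1234", "vehicle", "timeout")   # hypothetical cell id
ignored.record_outcome("36UXV1234", "footpath", "decline")
print(decision_window_s(0.40), decision_window_s(1.00))
print(ignored.should_surface("36UXV1234", "vehicle", 0.8))
print(ignored.should_surface("36UXV1234", "footpath", 0.8))
```

At 70% confidence the interpolation gives roughly 75 s, halfway between the two anchors.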
## Reliability & Safety
- Pre-flight self-test MUST pass (every dependency healthy OR explicit operator acknowledgement of a known degraded state) before takeoff is permitted.
- Loss of operator/Ground-Station radio link MUST trigger a known mission-safe outcome within a deterministic, configurable grace window (default **30 s grace → RTL**).
- Loss of airframe command link MUST surface health red immediately and defer to the airframe autopilot's own failsafe.
- Battery at or below the configured **RTL floor** (e.g. 25%) MUST trigger RTL automatically; battery at or below the **hard floor** (e.g. 15%) MUST trigger land-now. Only an authenticated operator command may override.
- MAVLink command exhaustion (bounded retry with exponential backoff fails through max-retry) MUST flip the airframe-link health to red.
- Wall-clock drift greater than **200 ms** versus GPS or NTP source MUST surface health yellow.
- Geofence INCLUSION and EXCLUSION violations MUST both result in waypoint refusal + RTL.
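The battery floors and the bounded-retry rule above can be sketched as follows. This is an illustration under stated assumptions: the 25% and 15% defaults are the example values from the criterion, while the action names, the base delay, and the retry count are hypothetical.

```python
def battery_action(percent, rtl_floor=25.0, hard_floor=15.0, authenticated_override=False):
    """Map a battery level to an action; only an authenticated command overrides."""
    if authenticated_override:
        return "continue"
    if percent <= hard_floor:
        return "land_now"
    if percent <= rtl_floor:
        return "rtl"
    return "continue"


def backoff_delays(base_s, max_retries):
    """Exponential backoff schedule: base, 2*base, 4*base, ... (bounded)."""
    return [base_s * (2 ** attempt) for attempt in range(max_retries)]


def send_with_retry(send, base_s=0.5, max_retries=4):
    """Try a command up to max_retries times; report 'red' when exhausted.

    `send` is a caller-supplied callable returning True on acknowledgement.
    Delays are returned rather than slept so the sketch stays side-effect free.
    """
    waited = []
    for delay in backoff_delays(base_s, max_retries):
        if send():
            return {"acked": True, "link_health": None, "waited_s": waited}
        waited.append(delay)
    return {"acked": False, "link_health": "red", "waited_s": waited}


print(battery_action(25.0), battery_action(15.0), battery_action(40.0))
print(send_with_retry(lambda: False))
```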
## Resources & Data
- Combined RSS on the deployed compute device, for everything autopilot owns onboard (excluding Tier 1), MUST stay within **≤6 GB**.
- Tier 1 per-frame latency MUST NOT degrade by more than **5 ms** when autopilot's own onboard workload is running concurrently.
## Map Reconciliation (with the central area-level map)
- Pre-flight map pull for a **30 km × 30 km** mission area: **≤30 s** wall-clock. Cache-fallback on timeout is acceptable only with explicit operator acknowledgement.
- Post-flight pass diff push for a **60-minute** mission: **≤2 min** wall-clock. Failure MUST persist the pending diff to durable on-device storage with bounded retry.
## Acceptance Gates (project-level)
- A hardware/replay benchmark suite MUST pass before product implementation begins. Specifically: every latency criterion above MUST be measured on the deployed compute device, not on a developer workstation.
- Per-season dataset coverage MUST be demonstrated before MVP sign-off (winter, spring, summer, autumn).
- MAVLink command surface MUST pass SITL conformance against ArduPilot.
## Q-tagged criteria (depend on open architecture decisions)
These criteria are real and measurable; their tolerance ranges may sharpen once the linked open question resolves. The questions are tracked in `_docs/02_document/architecture.md §8`.
- Movement detection false-positive rate at zoomed-in inspection — depends on **Q14** (classical-CV adequacy vs learned-CV fallback).
- MapObjects conflict resolution behaviour — depends on **Q8** (append-only log + projection rules).
- Operator-command authentication conformance — depends on **Q9** (signing scheme).
- Airframe MAVLink-2 message signing — depends on **Q6**.
- Per-season flight-test gates — depends on **Q13**.
@@ -0,0 +1,58 @@
# Input Data
Runtime inputs the autopilot consumes when flying, plus reference fixtures + expected-output assertions for tests. **All fixtures live inside this workspace** (`fixtures/`) — never reach into sibling repos at `../` for inputs. The autopilot repo is self-sufficient.
## Layout
| Path | Owns |
|---|---|
| `data_parameters.md` | Description of runtime input shapes (camera, telemetry, gRPC, mission JSON, operator commands, VLM IPC) + the categories of reference data tests need + Tier-1/Tier-2 class catalogue. |
| `services.md` | Per-external-service test-mock requirements: what shape of mock/fixture each of the 7 external systems needs and the acquisition status of each. |
| `fixtures/README.md` | File-by-file manifest of every fixture in this directory: SHA-256, size, upstream provenance, which `expected_results/results_report.md` rows consume it. |
| `fixtures/images/` | Real aerial frames (5 images, ~9 MB total) — Tier-1 inputs for detection-quality assertions (L1, D2, D6). |
| `fixtures/videos/` | Real reconnaissance video (1 clip, 12 MB) for frame-rate floor + sequence tests (T3). |
| `fixtures/movement/` | Wide-area movement-detection visual reference clips (4 clips, ~23 MB total). **No paired `gimbal.csv` / `telemetry.csv`**, so ego-motion compensation (M1–M4) cannot run against these alone. |
| `fixtures/semantic/` | Concealed-position semantic reference frames (4 PNGs, ~11 MB total) + `data_parameters.md` describing the new YOLO primitive classes the examples motivate. **Starter set only**, not a graded eval set. |
| `fixtures/schemas/` | Detection-result contract schemas (JSON + JSON-schema) for D6. |
| `fixtures/sql/` | Database init script — reference only; not directly asserted by an autopilot AC. |
| `expected_results/results_report.md` | The input → quantifiable-expected-output mapping consumed by `/test-spec` Phase 1. Every row keys off an AC in `../acceptance_criteria.md`; deferred rows carry a structured `<DEFERRED: <shape>; ref <pointer>>` tag. |
## Why fixtures are local
The autopilot repo MUST be self-sufficient — a developer with only the autopilot clone (no parent suite checked out) MUST be able to run the test specifications. Cross-repo `../` paths are forbidden in `results_report.md` and in any test runner script. When a sibling repo (`../detections/`, `../e2e/`, `../missions/`, etc.) is the upstream source of a fixture, we **copy** it in and SHA-pin it in `fixtures/README.md` so upstream drift is detectable.
## Suite-level coupling that still matters
Even though fixtures are local, the underlying contracts the fixtures embody come from suite-level decisions. When those decisions change, the fixtures here go stale:
- **Tier-1 detection model / classes** — when `../detections` ships a new model the `expected_detections.json` baseline goes stale; D1, D2, D6 rows in `results_report.md` must be re-recorded.
- **`mission-schema`**: shared between autopilot and the `missions` repo. Schema changes break the mission JSON contract; the mock fixtures for Mp1–Mp5 (when authored) must re-pin.
- **Detection classes catalogue** — class IDs 0..18 are governed at the suite level. Autopilot's normalised-box output uses the same IDs. The 5 new Tier-1 classes documented in `data_parameters.md → "Class catalogue"` must land in the suite catalogue before D1 can be measured.
Today these couplings are tracked manually. The `monorepo-e2e` skill at the suite root will eventually own the drift detection.
## Fixture gaps and the project policy on `/test-spec` Phase 3
`/test-spec` Phase 3 has a **hard 75% coverage gate** on rows with real input fixtures + real expected results. Today's coverage is well below that gate (see `expected_results/results_report.md → "Coverage Status"`). **Project policy as of 2026-05-19**: rather than block the autodev flow at the gate, each deferred row is registered with a structured `<DEFERRED: <shape>; ref <pointer>>` tag in `results_report.md`, pointing at the per-service acquisition path in `services.md` or at an open architecture question (Q-tag). Deferred rows become **release-gate items**, not development-gate items. The `acceptance_criteria.md → "Acceptance Gates (project-level)"` hardware/replay benchmark requirement remains a hard release blocker.
Summary of open gaps (authoritative list lives in `services.md` and `fixtures/README.md`):
1. **Paired `gimbal.csv` + `telemetry.csv` for the 4 movement clips**: highest priority (blocks M1–M4 and tightens L6/L7). **User-confirmed unavailable today (2026-05-19).**
2. Annotated multi-season eval set (concealed positions + footpaths).
3. Mock `missions` API exchanges + mock `/mapobjects` round-trip.
4. Mock Ground Station session traces.
5. ArduPilot SITL traces.
6. Operator-command envelopes (blocked on Q9).
7. VLM I/O pairs.
8. GPS / NTP drift scripts.
Closing each gap is its own workstream tracked in Jira; the autodev flow does not block on them.
## Adding new fixtures
1. Drop the file under `fixtures/<images|videos|movement|semantic|schemas|sql|gimbal|telemetry|mavlink|vlm|operator|mapobjects>/<descriptive-name>.<ext>` — create the subdirectory if it does not exist.
2. Compute SHA-256 (`shasum -a 256 <file>`).
3. Add a row to the matching subsection in `fixtures/README.md` (file path, size, SHA, upstream provenance, which `results_report.md` rows consume it).
4. Replace the matching `<DEFERRED: ...>` placeholder(s) in `expected_results/results_report.md` with the local path `fixtures/<...>`.
5. If the fixture replaces a service mock, also update `services.md → "Coverage summary by service"` to reflect the new acquisition status.
6. If the fixture is binary and large (> 50 MB), consider gitignoring it and adding an acquisition script per the e2e pattern; for everything in the current set, direct commit is fine.
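Steps 2 and 3 above can be scripted. The sketch below computes the SHA-256 and size of a fixture and formats a manifest row; the column order follows step 3, but the exact table layout of `fixtures/README.md` is an assumption, and `sha256_of` / `manifest_row` are hypothetical helper names.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in chunks so large fixtures do not load into memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_row(path, provenance, consumed_by):
    """Format one manifest row: path, size, SHA, provenance, consuming rows."""
    path = Path(path)
    size = path.stat().st_size
    rows = ", ".join(consumed_by)
    return f"| `{path.name}` | {size} B | `{sha256_of(path)}` | {provenance} | {rows} |"


# Demo against a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"abc")
    demo_digest = sha256_of(sample)
    demo_row = manifest_row(sample, "hypothetical upstream", ["L1"])

print(demo_row)
```

Re-running `sha256_of` on a committed fixture and comparing against the pinned value is also one way to detect the upstream drift the manifest is meant to catch.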
@@ -0,0 +1,101 @@
# Input Data Parameters
Describes the **categories of input data** the system consumes at runtime, and the **categories of reference data** tests need. Internal component names, programming languages, IPC mechanisms, schema class names, and specific model choices are design and live in `_docs/02_document/architecture.md` — they do not belong in this file (per `.cursor/rules/artifact-srp.mdc`).
Local fixtures live in `fixtures/`; see `fixtures/README.md` for the manifest. External-service test-mock requirements live in `services.md`; the per-row binding to AC criteria lives in `expected_results/results_report.md`.
## Runtime inputs (what the system consumes when flying)
| Input | Source | Format | Cadence | Notes |
|---|---|---|---|---|
| Camera frames | ViewPro A40 (or alternative ViewPro Z40K) | H.264 / H.265 over RTSP, 1080p (1920×1080) | 30 / 60 fps | Frame timestamps are mandatory. |
| Primitive (Tier 1) detection responses | `../detections` service over a bi-directional streaming RPC contract | Bounding boxes with class id, confidence, normalised coordinates | Per frame | Same boxes feed Tier-2 ROI selection and the operator overlay. |
| UAV telemetry | Airframe via MAVLink v2 (UDP or serial) | MAVLink messages: position, attitude, velocity, battery, link health, GPS fix | ≥1 Hz (10 Hz target) | Source-of-truth for ego-motion compensation. |
| Gimbal feedback | ViewPro A40 vendor protocol over UDP | Yaw / pitch / zoom angle telemetry | per-tick | Source-of-truth for camera-pose compensation. |
| Mission JSON | `missions` service via HTTPS REST | Shared `mission-schema` JSON | Once at mission start + middle-waypoint updates | Validated against the shared schema. |
| Area-level map state | `missions` service extension `/missions/{id}/mapobjects` (GET) | Map-object records keyed by spatial cell | Once at mission start | Hydrates the system's local copy of the area map; cache-fallback on timeout. |
| Operator commands | Ground Station via modem (return path of the outbound telemetry stream) | Authenticated + signed + replay-protected command envelope (scheme open per Q9) | Event-driven | confirm / decline / target-follow start / target-follow release / abort. |
| Deep-analysis responses (optional) | Local-onboard model accessed via local IPC | Structured assessment schema (validated) | Per zoomed-in endpoint hold (when deep-analysis is enabled) | Schema-violation fails closed. |
## Class catalogue (Tier-1 + Tier-2)
Detection-quality acceptance criteria (`acceptance_criteria.md → Detection Quality`) are evaluated against a class catalogue that combines pre-existing suite-level classes with new autopilot-driven additions. Class IDs are governed at the suite level (`../detections` owns the catalogue); autopilot only consumes the IDs.
### New Tier-1 (YOLO primitive) classes — to be added to the suite catalogue
| # | Class name | Annotation hint | Motivated by |
|---|---|---|---|
| 1 | Black entrances | Bounding box; various sizes (small hideout openings to dugout entrances) | Concealed-position detection (D3, D4) |
| 2 | Branch piles | Bounding box | Concealment material around hideouts (D3, D4) |
| 3 | Footpaths | **Polyline / segmentation preferred over bbox** for linear features | Footpath recall gate (D5) |
| 4 | Roads | Polyline / segmentation | Distinguishing roads from footpaths in the same scene |
| 5 | Trees / tree blocks | Bounding box; tree-block annotation may use larger box for clusters | Concealment-context anchor; reduces false positives around tree-rows in movement detection (M1) |
### Tier-2 semantic attributes — composed by `semantic_analyzer`, NOT added to YOLO catalogue
| # | Attribute | Composed from | Used by |
|---|---|---|---|
| 1 | Footpath freshness (fresh / stale) | Footpath bbox + texture/edge analysis + seasonal context | Decision-window scoring, D5 partial coverage |
| 2 | Concealed-structure inference | Black-entrance + branch-piles + footpath-approach proximity | POI surfacing for D3/D4 (the structure itself is composed, not directly labelled) |
| 3 | Open clearing connected to path | Cleared-terrain texture + footpath endpoint | FPV-launch-point flagging |
### Existing classes (already in the suite catalogue)
The existing-class baseline (P=0.816, R=0.852 per the AC) covers the suite's pre-autopilot class set (vehicles, military equipment, etc.). Autopilot must not degrade these — see D2.
### Reference for IDs
The 19-id catalogue (0..18) is owned by `../detections`. Autopilot's normalised-box output uses the same IDs. When `../detections` ships a new model or renumbers IDs, the `expected_detections.json` baseline goes stale and D1, D2, D6 rows must be re-recorded.
## Reference data needed for testing
### Local fixtures already on disk
See `fixtures/README.md` for the SHA-pinned manifest. Categorised summary:
| Local fixture category | Files | Purpose | Bound to AC rows |
|---|---|---|---|
| `fixtures/images/*.jpg` | 5 aerial frames | Tier-1 detection contract; existing-class regression; normalised-box conformance | L1, D2, D6 |
| `fixtures/videos/94d42580bd1ad6ff.mp4` | 1 reconnaissance clip | Frame-rate floor scenario, reserved for future movement-sequence tests | T3 |
| `fixtures/schemas/expected_detections.{json,schema.json}` | 2 schema files | Detection-result contract shape reference | D6 |
| `fixtures/sql/init.sql` | 1 SQL file | Suite-e2e DB seed reference | (suite-only; no autopilot AC) |
| `fixtures/movement/video0[1-4].mp4` | 4 wide-area clips | Visual reference for movement-detection scenarios; **no paired telemetry CSVs**, so ego-motion assertions are unfalsifiable until those land | M1–M4 (visual reference only) |
| `fixtures/semantic/semantic0[1-4].png` | 4 reference frames | Visual reference for concealed-position semantic targets — **starter set only, not a graded eval set** | D3, D4, D5 (starter only) |
### Reference shapes still needed but not yet on disk
The per-service mock catalogue is in `services.md` (authoritative). Summary of categories tests need:
| Reference shape | Why it's needed | See |
|---|---|---|
| Frame sequences with synchronised `gimbal.csv` + `telemetry.csv` | Ego-motion compensation at zoom-out AND zoomed-in inspection | `services.md §6 Gimbal telemetry CSV` |
| Concealed-position image set across all four seasons (annotated) | Concealed-position recall ≥60% and precision ≥20% | `services.md §5 Camera frame sequences` |
| Footpath sequences (fresh, stale, all four seasons, polyline-annotated) | Footpath recall ≥70% | `services.md §5` |
| New-class evaluation set (5 new classes above) | New-class per-class P/R ≥80% without degrading existing-class performance | `services.md §1 Tier-1 detection replay` (plus annotation campaign owned by `../ai-training` repo) |
| Mock Tier-1 streaming-RPC replays | Detection-consumer isolation tests | `services.md §1` |
| Mock Ground Station session traces | Lost-link failsafe ladder + operator-link reconnect | `services.md §3` |
| MAVLink SITL traces | MAVLink conformance + waypoint insertion + geofence enforcement | `services.md §4` |
| Mock central area-map service responses | Pre-flight pull / post-flight push round-trip; conflict cases (Q8) | `services.md §2` |
| Operator-command envelopes | Signature + replay-protection tests (once Q9 resolves) | `services.md §8` |
| VLM I/O pairs | Bounded ROI inputs + structured assessment outputs + schema-violation cases | `services.md §7` |
| GPS / NTP drift scenarios | Wall-clock drift health-yellow gate | `services.md §9` |
## Data volume targets
- Training data: hundreds to thousands of annotated images/sequences total.
- Seasonal coverage: winter (snow), spring (mud), summer (vegetation), autumn (mixed leaf + partial snow).
- Available assembly effort: 1.5 months at 5 hours/day.
- Movement detection requires **frame sequences** (not still images only) with synchronised camera + gimbal + UAV telemetry.
- Footpaths require polyline or segmentation annotation rather than bounding boxes (see "Class catalogue" above).
## Gaps that block `/test-spec` downstream
`/test-spec` Phase 1 will pass on prerequisite existence (`expected_results/results_report.md` is non-empty). Phase 3 has a **hard 75% coverage gate** on rows with real input fixtures + real expected results.
**Current coverage state** (re-computed 2026-05-19 after fixture restoration):
- Rows bound to real local fixtures: L1, D2, D6, T3 (4 rows). These are also the rows whose fixtures were restored on 2026-05-19 from sibling repos.
- Rows bound to **starter-only** fixtures (insufficient on their own): D3, D4, D5 (semantic PNGs), M1–M4 (movement videos without CSV).
- Rows still deferred for fixture acquisition: see `fixtures/README.md → "Gaps still pending fixture acquisition"` and `services.md` for the authoritative list.
**Project policy on the Phase 3 gate**: rather than block `/test-spec` at the 75% gate, the autodev flow registers each deferred row with a structured `<DEFERRED: <shape>; ref <pointer>>` tag in `expected_results/results_report.md`. Test-spec authoring proceeds; deferred rows become release-gate items, not development-gate items. The `acceptance_criteria.md` project-level gate ("MUST pass before product implementation begins") still applies to the hardware/replay benchmark, which remains a hard release blocker and is not deferred.
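The 75% gate is a simple ratio, and a sketch makes the current shortfall concrete. The row set below is an illustrative subset taken from the bullets above (the full row list lives in `results_report.md`, so the printed percentage is not the project's real figure); the status labels and function name are assumptions.

```python
GATE = 0.75  # Phase 3 coverage gate


def coverage_ratio(row_status):
    """Fraction of rows backed by real fixtures and real expected results."""
    if not row_status:
        return 0.0
    real = sum(1 for status in row_status.values() if status == "real")
    return real / len(row_status)


# Illustrative subset only: rows named in this section, not the full report.
rows = {
    "L1": "real", "D2": "real", "D6": "real", "T3": "real",
    "D3": "starter", "D4": "starter", "D5": "starter",
    "M1": "starter", "M2": "starter", "M3": "starter", "M4": "starter",
}

ratio = coverage_ratio(rows)
passes_gate = ratio >= GATE
release_gate_items = sorted(row for row, status in rows.items() if status != "real")

print(f"coverage {ratio:.0%}, gate passed: {passes_gate}")
print("release-gate items:", release_gate_items)
```

Under the stated policy, a failing ratio does not stop test-spec authoring; the non-real rows are carried forward as release-gate items instead.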
@@ -0,0 +1,153 @@
# Expected Results
Maps every quantifiable acceptance criterion from `_docs/00_problem/acceptance_criteria.md` to an input fixture + a measurable expected result. Consumed by `/test-spec` Phase 1.
Per `.cursor/rules/artifact-srp.mdc`, this file uses **role / observable-behaviour language**, not internal component slugs. The system's externally observable behaviour is what's tested. Implementation names (component slugs, libraries, model names) live in `_docs/02_document/`.
**Fixture sourcing**: all fixtures live in `fixtures/` (sibling-repo `../` paths are forbidden). Where no fixture exists yet, the `Input` cell carries a structured `<DEFERRED: <shape>; ref services.md §N>` tag. Phase 3 has a hard 75% coverage gate — the autodev flow registers deferred rows as release-gate items rather than blocking on the gate; see `data_parameters.md → "Gaps that block /test-spec downstream"`.
**Comparison vocabulary**: see `.cursor/skills/test-spec/templates/expected-results.md` for canonical methods (`exact`, `numeric_tolerance`, `threshold_min`, `threshold_max`, `range`, `regex`, `substring`, `set_contains`, `json_diff`, `file_reference`).
**Deferred-tag legend**: `<DEFERRED: <shape>; ref <pointer>>` where `<pointer>` is a section in `../services.md` (per-service mock requirements), an open architecture question (e.g. `Q9`), or `inline-authorable` (no external dependency — just not yet written).
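
The legend above can be read mechanically. The following is a minimal sketch, assuming each cell holds at most one tag and that the pointer, when present, is the last `;`-separated segment (the shape itself may contain `;`). The names `DeferredTag` and `parse_deferred` are hypothetical and not part of the autodev tooling.

```python
from typing import NamedTuple, Optional


class DeferredTag(NamedTuple):
    shape: str              # what kind of fixture is missing
    pointer: Optional[str]  # where the requirement is tracked, if stated


def parse_deferred(cell: str) -> Optional[DeferredTag]:
    """Parse the `<DEFERRED: ...>` tag in a cell that holds at most one tag."""
    marker = "<DEFERRED:"
    start = cell.find(marker)
    end = cell.rfind(">")
    if start == -1 or end < start:
        return None
    body = cell[start + len(marker):end].strip()
    # Assumption: the pointer is the final ';'-separated segment.
    shape, sep, tail = body.rpartition(";")
    tail = tail.strip()
    if sep and tail.startswith("ref "):
        return DeferredTag(shape.strip(), tail[len("ref "):].strip())
    if sep and tail == "inline-authorable":
        return DeferredTag(shape.strip(), tail)
    # No recognised pointer segment: treat the whole body as the shape.
    return DeferredTag(body, None)
```

A tag without a pointer, such as a deferred reference file, comes back with `pointer=None`, which makes unpointed gaps easy to list.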
---
## Latency
Source ACs: `acceptance_criteria.md → Latency`.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|---|---|---|---|---|---|
| L1 | `fixtures/images/4d6e1830d211ad50.jpg` | Single 1280 px aerial frame consumed through the Tier-1 contract; measure end-to-end | per-frame end-to-end latency | threshold_max | ≤ 100 ms | N/A |
| L2 | derived ROI ~640×640 from `fixtures/images/4d6e1830d211ad50.jpg` (inline-cropped by the test runner) | Tier-2 semantic confirmation over a single ROI | per-ROI latency | threshold_max | ≤ 200 ms | N/A |
| L3 | `<DEFERRED: bounded ROI crop matching the deep-analysis input contract; ref services.md §7>` | Tier-3 deep-analysis (when enabled) local-IPC call | per-ROI call latency | threshold_max | ≤ 5000 ms | N/A |
| L4 | `<DEFERRED: SITL or hardware-in-loop ViewPro A40 zoom command (medium→high); ref services.md §5>` | A40 physical zoom transition | wall-clock transition duration | threshold_max | ≤ 2000 ms | N/A |
| L5 | `<DEFERRED: scripted scan decision event followed by camera physical motion; ref services.md §3, §5>` | Decision-to-movement latency end-to-end | wall-clock decision→motion duration | threshold_max | ≤ 500 ms | N/A |
| L6 | `fixtures/movement/video01.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv; ref services.md §6>` | Movement candidate enqueue at the wide-area sweep | detection→enqueue duration | threshold_max | ≤ 1000 ms | N/A |
| L7 | `fixtures/movement/video02.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv at zoomed-in band; ref services.md §6>` | Movement candidate enqueue during zoomed inspection | detection→enqueue duration | threshold_max | ≤ 1500 ms | N/A |
| L8 | `<DEFERRED: full sweep → zoomed-inspection transition (POI detected → ROI fully zoomed); ref services.md §3, §5>` | Scan-mode transition including physical zoom | wall-clock transition | threshold_max | ≤ 2000 ms | N/A |
| L9 | `<DEFERRED: scripted operator-click → outbound command emitted by the system (modem RTT excluded); ref services.md §3>` | Operator command → action latency | wall-clock click→outbound | threshold_max | ≤ 500 ms | N/A |
## Throughput / Rate
Source ACs: `acceptance_criteria.md → Throughput / Rate`.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|---|---|---|---|---|---|
| T1 | `<DEFERRED: long synthetic POI feed sustained above the cap (e.g. 20 POIs/min); inline-authorable>` | Cap enforcement on POIs surfaced to operator | POI rate surfaced | threshold_max | ≤ 5 / min | N/A |
| T2 | `<DEFERRED: airframe MAVLink telemetry replay over a 60 s window; ref services.md §4>` | Position telemetry consumed from the airframe link | reported position rate | range | 1 Hz ≤ rate ≤ 10 Hz (10 Hz target) | N/A |
| T3 | `fixtures/videos/94d42580bd1ad6ff.mp4` replayed with throttled-decode + frame-drop injection to drop below 10 fps for ≥5 s | Frame-rate floor trigger | zoom-in transitions suppressed AND overall health surfaces yellow | exact (suppression bool) + exact (health = yellow) | N/A | N/A |
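
For T1, the "≤ 5 / min" cap can be illustrated with a sliding-window limiter. This is only a sketch: whether the cap is enforced over a sliding or a fixed window is not stated in the ACs, and `PoiRateCap` is a hypothetical name.

```python
from collections import deque


class PoiRateCap:
    """Allow at most `max_per_window` POIs in any sliding `window_s` seconds."""

    def __init__(self, max_per_window: int = 5, window_s: float = 60.0):
        self.max_per_window = max_per_window
        self.window_s = window_s
        self._surfaced = deque()  # timestamps of POIs already surfaced

    def try_surface(self, now_s: float) -> bool:
        # Forget surfaced POIs that have aged out of the window.
        while self._surfaced and now_s - self._surfaced[0] >= self.window_s:
            self._surfaced.popleft()
        if len(self._surfaced) < self.max_per_window:
            self._surfaced.append(now_s)
            return True
        return False


# 20 synthetic POIs spread evenly over one minute, like the T1 example feed.
cap = PoiRateCap()
surfaced = [t for t in (i * 3.0 for i in range(20)) if cap.try_surface(t)]
```

With this feed only the first five POIs are surfaced; the rest are dropped until the oldest surfaced timestamp leaves the window.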
## Detection Quality
Source ACs: `acceptance_criteria.md → Detection Quality`. Evaluation runs against the Tier-1 detection pipeline that the system consumes; autopilot's role is correct consumption + re-emission of the normalised-box contract. Class catalogue (5 new Tier-1 classes + 3 Tier-2 attributes) is defined in `../data_parameters.md → "Class catalogue"`.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|---|---|---|---|---|---|
| D1 | `<DEFERRED: new-class eval set across all four seasons (black entrances, branch piles, footpaths, roads, trees, tree blocks); ref services.md §1, annotation campaign in ../ai-training>` | Per-class precision/recall for added classes | per-class precision ≥ 0.80 AND recall ≥ 0.80 | threshold_min (both) | N/A | `<DEFERRED: expected_results/new_classes_pr.json>` |
| D2 | `fixtures/images/{4d6e1830d211ad50,54f6459dbddb93d8,6dd601b7d2dc1b30,805bcf1e9f271a58,f997d0934726b555}.jpg` (5 frames) | Existing-class regression — must not degrade vs documented baseline P=0.816, R=0.852 | per-class precision + recall delta vs baseline | numeric_tolerance | ± 0.02 absolute | `<DEFERRED: expected_results/existing_classes_baseline.json — to be recorded against the pinned ../detections model>` |
| D3 | `fixtures/semantic/semantic0[1-4].png` (4 starter frames — 1 winter, 3 unmarked season) + `<DEFERRED: full multi-season annotated concealed-position set; ref services.md §5>` | Concealed-position recall (initial gate, accepting high FP) | recall | threshold_min | ≥ 0.60 | `<DEFERRED: expected_results/concealed_positions.json>` |
| D4 | Same as D3 | Concealed-position precision (operators filter) | precision | threshold_min | ≥ 0.20 | same as D3 |
| D5 | `fixtures/semantic/semantic0[1-4].png` (all 4 feature footpaths leading to concealment — starter set) + `<DEFERRED: footpath sequences (fresh + stale, all four seasons), polyline-annotated; ref services.md §5>` | Footpath recall | recall | threshold_min | ≥ 0.70 | `<DEFERRED: expected_results/footpaths.json>` |
| D6 | `fixtures/images/4d6e1830d211ad50.jpg` | Single-frame Tier-1 contract — system must consume the bbox stream and re-emit normalised-box format | output box stream conforms to the suite-level class catalogue (ids 0..18) + normalised coordinates ∈ [0,1] | schema_match + range | each coord ∈ [0,1] | `fixtures/schemas/expected_detections.schema.json` |
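
The D6 range check can be sketched as below. The key names follow the `bbox_samples` fields in `fixtures/schemas/expected_detections.schema.json`; whether the live Tier-1 stream uses the same keys is an assumption, and `box_conforms` is a hypothetical helper.

```python
def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def box_conforms(det: dict, max_class_id: int = 18) -> bool:
    """True if the class id is in 0..max_class_id and every coordinate is in [0, 1]."""
    class_id = det.get("class_id")
    if not isinstance(class_id, int) or isinstance(class_id, bool):
        return False
    if not 0 <= class_id <= max_class_id:
        return False
    for key in ("center_x", "center_y", "width", "height"):
        value = det.get(key)
        # A missing key, a non-number, or NaN all fail the range check.
        if not _is_number(value) or not 0.0 <= value <= 1.0:
            return False
    return True
```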
## Movement Detection Behaviour
Source ACs: `acceptance_criteria.md → Movement Detection`. Latency aspects (L6, L7) live under Latency.
**Note**: M1–M4 each have a visual-reference video on disk but NO paired `gimbal.csv` / `telemetry.csv`. Ego-motion compensation cannot be verified against these videos; the visual binding is provided so a smoke harness can run, but the assertions in this section require the deferred CSVs to be meaningful. User confirmed 2026-05-19: paired CSVs do not exist today.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|---|---|---|---|---|---|
| M1 | `fixtures/movement/video01.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv; scene must contain 1 stable tree row + 1 moving vehicle; ref services.md §6>` | Ego-motion compensation: stable objects rejected | system emits exactly 1 movement candidate (the vehicle); does NOT emit a candidate for the tree row | set_contains | candidate set == {vehicle}; tree row ∉ candidate set | N/A |
| M2 | `fixtures/movement/video02.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv at zoomed-in band; 1 small mover; ref services.md §6>` | Movement detection continues during zoomed-in hold | system enqueues 1 candidate while the camera is in the zoomed-in hold; current ROI is not preempted unless the candidate's priority exceeds it | exact | 1 candidate enqueued; ROI preempt decision matches priority rule | N/A |
| M3 | `fixtures/movement/video03.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv simulating per-zoom-band threshold edge (cluster persistence one frame below threshold); ref services.md §6>` | Per-zoom-band threshold honoured (no false candidate) | no candidate emitted | exact | count == 0 | N/A |
| M4 | `fixtures/movement/video04.mp4` (visual reference) + `<DEFERRED: zoom-out + zoomed-in benchmark suite measuring false-positive rate at each band; ref services.md §6, Q14>` | Movement zoomed-in benchmark gate (Q14 fallback trigger) | false-positive rate per zoom band | threshold_max | ≤ per-zoom-band budget (configurable; default ≤ 0.5 / minute at zoomed-in) | `<DEFERRED: expected_results/movement_benchmark_caps.json>` |
## Scan & Camera Control Behaviour
Source ACs: `acceptance_criteria.md → Scan and Camera Control`.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|---|---|---|---|---|---|
| S1 | `<DEFERRED: scripted mission with planned route + simulated POI detected mid-sweep; ref services.md §3, §4>` | Sweep → zoomed-inspection transition within 2 s (L8) AND POI properly enqueued | transition completes; ROI matches POI bbox; queue length increments | exact (multiple) | N/A | N/A |
| S2 | `<DEFERRED: zoomed-inspection hold scenario with footpath polyline overlapping the ROI; ref services.md §5, §6>` | Camera lock + pan along footpath while airframe flies | camera commands keep the footpath in the centre 50% of frame for the duration of the hold | numeric_tolerance | centre offset ≤ 25% per frame | N/A |
| S3 | `<DEFERRED: operator-confirmed target + 60 s follow window; ref services.md §3>` | Target-follow centre-window | target inside centre 25% of frame while visible | threshold_max | per-frame \|dx\|, \|dy\| ≤ 0.125 × frame_size | N/A |
| S4 | `<DEFERRED: queue with 3 POIs at varied confidence × proximity scores; inline-authorable>` | POI queue ordering | system pops POIs in order of `confidence × proximity × age_factor` (relative order matches) | exact (order) | N/A | N/A |
| S5 | `<DEFERRED: hold endpoint with deep-analysis enabled — assessment returns within 2 s; ref services.md §7>` | Zoomed-in hold timeout default 5 s/POI; deep-analysis hold capped at 2 s | hold ends at min(5 s, deep_analysis_complete) | exact | N/A | N/A |
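
The centre-window tolerances in S2 and S3 reduce to a per-axis offset check. A minimal sketch, assuming `frame_size` is applied per axis (width for dx, height for dy); `inside_centre_window` is a hypothetical helper.

```python
def inside_centre_window(target_x: float, target_y: float,
                         frame_w: float, frame_h: float,
                         half_extent: float = 0.125) -> bool:
    """True if the target's offset from the frame centre is within
    `half_extent` of the frame size on both axes."""
    dx = abs(target_x - frame_w / 2.0)
    dy = abs(target_y - frame_h / 2.0)
    return dx <= half_extent * frame_w and dy <= half_extent * frame_h
```

The default `half_extent=0.125` matches the S3 tolerance (centre 25% of frame); passing `half_extent=0.25` gives the S2 tolerance (centre 50% of frame).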
## Operator Workflow
Source ACs: `acceptance_criteria.md → Operator Workflow`.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|---|---|---|---|---|---|
| O1 | `<DEFERRED: synthetic POI at confidence = 0.40; inline-authorable>` | Confidence-scaled decision window lower bound | window duration | exact | 30 s | N/A |
| O2 | `<DEFERRED: synthetic POI at confidence = 1.00; inline-authorable>` | Confidence-scaled decision window upper bound | window duration | exact | 120 s | N/A |
| O3 | `<DEFERRED: synthetic POI at confidence = 0.70; inline-authorable>` | Linear interpolation (40% → 30 s, 100% → 120 s) | window duration ≈ 30 + (0.70-0.40)/(1.00-0.40) × (120-30) = 75 s | numeric_tolerance | ± 0.5 s | N/A |
| O4 | `<DEFERRED: synthetic POI at confidence = 0.39; inline-authorable>` | Below-threshold suppression | POI NOT surfaced to operator | exact | count surfaced == 0 | N/A |
| O5 | `<DEFERRED: surfaced POI followed by operator decline event; inline-authorable>` | Decline → ignored-item entry persisted | ignored-item appended with `(MGRS, class_group)` matching the declined POI | exact (count delta +1) + schema_match | N/A | N/A |
| O6 | `<DEFERRED: new detection whose (MGRS, class_group) matches an existing ignored-item; inline-authorable>` | Ignored-item suppression | POI NOT surfaced | exact | count surfaced == 0 | N/A |
| O7 | `<DEFERRED: surfaced POI + no operator response, > decision-window; inline-authorable>` | Timeout = forget (NOT blacklisted) | POI removed from queue; no ignored-item written | exact (queue count delta -1) + exact (ignored-item count unchanged) | N/A | N/A |
| O8 | `<DEFERRED: operator confirm command — valid + signed + within sequence; ref services.md §3, §8 (Q9)>` | Confirm → middle waypoint inserted; mode transitions to target-follow | mission update POSTed; scan-mode reports target-follow | exact (HTTP 200) + exact (mode) | N/A | N/A |
| O9 | `<DEFERRED: replayed operator command — same envelope a second time; ref services.md §8 (blocked on Q9)>` | Replay protection | command rejected; security WARN logged; no state change | exact (state unchanged) + substring (log contains "replay") | N/A | N/A |
| O10 | `<DEFERRED: malformed / unsigned operator command; ref services.md §8 (blocked on Q9)>` | Signature validation | command rejected; security WARN logged | exact (state unchanged) + substring (log contains "invalid") | N/A | N/A |
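
Rows O1 to O4 describe one piecewise rule: suppress below 0.40, then interpolate linearly from 30 s to 120 s. A minimal sketch of that rule, with `decision_window_s` as a hypothetical name and clamping above 1.00 as an assumption:

```python
from typing import Optional


def decision_window_s(confidence: float,
                      min_conf: float = 0.40, max_conf: float = 1.00,
                      min_window_s: float = 30.0,
                      max_window_s: float = 120.0) -> Optional[float]:
    """Return the operator decision window in seconds, or None when the POI
    is below the surfacing threshold (O4)."""
    if confidence < min_conf:
        return None
    c = min(confidence, max_conf)  # assumption: clamp anything above max_conf
    fraction = (c - min_conf) / (max_conf - min_conf)
    return min_window_s + fraction * (max_window_s - min_window_s)
```

Because the midpoint is computed in floating point, a test for O3 should compare against 75 s with the ± 0.5 s tolerance from the table rather than with exact equality.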
## Reliability & Safety
Source ACs: `acceptance_criteria.md → Reliability & Safety` + lost-link failsafe ladder.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|---|---|---|---|---|---|
| R1 | `<DEFERRED: BIT scenario — every dependency healthy; inline-authorable>` | Pre-flight self-test passes | health endpoint returns all green; takeoff permitted | exact (state) + exact (health.all == "green") | N/A | N/A |
| R2 | `<DEFERRED: BIT scenario — Tier-1 detection unreachable; inline-authorable>` | BIT fails the takeoff gate | takeoff NOT permitted; detection dependency reports red | exact (takeoff inhibited) | N/A | N/A |
| R3 | `<DEFERRED: BIT scenario — persistent-store ≥95% full; inline-authorable>` | Storage floor BIT failure | takeoff NOT permitted; storage dependency reports red | exact (takeoff inhibited) | N/A | N/A |
| R4 | `<DEFERRED: in-flight operator/Ground-Station modem-link loss + 30 s elapsed; ref services.md §3, §4>` | Lost-link failsafe ladder (default 30 s grace → RTL) | system issues RTL 30 s after link loss; operator-link dependency reports red | exact (RTL command at 30 s ± 1 s) | ± 1 s | N/A |
| R5 | `<DEFERRED: mid-flight battery sample at RTL-floor (e.g. 25%); ref services.md §4>` | RTL trigger | system issues RTL; health → yellow | exact (RTL command) + exact (health == yellow) | N/A | N/A |
| R6 | `<DEFERRED: mid-flight battery sample at hard-floor (e.g. 15%); ref services.md §4>` | Land-now trigger (only operator-overridable) | system issues land-now | exact (land_now command) | N/A | N/A |
| R7 | `<DEFERRED: airframe link command + simulated bounded retry/backoff with peer not responding through max-retries; ref services.md §4>` | Watchdog flips health red on exhaustion | airframe-link dependency reports red after configured max-retry | exact (health == red) | N/A | N/A |
| R8 | `<DEFERRED: wall-clock drift > 200 ms simulation (GPS lock present, NTP disabled); ref services.md §9>` | Drift alarm | time-source dependency reports yellow; `clock_source` + `last_sync_at` reflect the drift | exact (health == yellow) | N/A | N/A |
| R9 | `<DEFERRED: geofence EXCLUSION polygon crossed by simulated waypoint; ref services.md §4>` | Symmetric geofence enforcement | waypoint refused; RTL triggered | exact (waypoint rejected) + exact (RTL) | N/A | N/A |
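
Rows R4 to R6 can be combined into one decision function for test authoring. This is a sketch under stated assumptions: the 25% and 15% floors are the example values from R5/R6, the comparisons are inclusive, and the most severe condition wins. `Action` and `failsafe_action` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    RTL = "rtl"
    LAND_NOW = "land_now"


def failsafe_action(battery_pct: float, link_lost_for_s: float,
                    rtl_floor_pct: float = 25.0,
                    hard_floor_pct: float = 15.0,
                    link_grace_s: float = 30.0) -> Action:
    # Assumption: the hard battery floor outranks any RTL trigger.
    if battery_pct <= hard_floor_pct:
        return Action.LAND_NOW
    # RTL on battery floor (R5) or on lost link after the grace period (R4).
    if battery_pct <= rtl_floor_pct or link_lost_for_s >= link_grace_s:
        return Action.RTL
    return Action.CONTINUE
```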
## Resources & Data
Source ACs: `acceptance_criteria.md → Resources & Data`.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|---|---|---|---|---|---|
| Re1 | `<DEFERRED: long-running scenario — system's full onboard workload active for 5 min, monitored via process RSS; inline-authorable harness>` | Onboard memory budget (everything autopilot owns, excluding Tier 1) | combined RSS on the deployed compute device | threshold_max | ≤ 6 GB | N/A |
| Re2 | Same as Re1 with concurrent Tier-1 traffic | Tier-1 non-degradation | Tier-1 ms/frame delta vs baseline (L1) | numeric_tolerance | ± 5 ms | N/A |
## Map Reconciliation
Source ACs: `acceptance_criteria.md → Map Reconciliation`.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|---|---|---|---|---|---|
| Mp1 | `<DEFERRED: mock central area-map service — 30 km × 30 km region, ~10000 map objects; ref services.md §2>` | Pre-flight pull | wall-clock GET → local copy hydrated | threshold_max | ≤ 30 s | N/A |
| Mp2 | `<DEFERRED: same mock but unreachable (timeout); ref services.md §2>` | Cache-fallback path | system falls back to last-known cached state; reports `map_sync == "cached_fallback"`; operator MUST acknowledge before takeoff | exact (state) + exact (BIT requires explicit ack) | N/A | N/A |
| Mp3 | `<DEFERRED: simulated 60-minute mission pass diff (~5000 NEW + ~2000 MOVED + ~500 REMOVED + ~10000 CONFIRMED-EXISTING); ref services.md §2>` | Post-flight push | wall-clock POST → 200 OK | threshold_max | ≤ 120 s | N/A |
| Mp4 | `<DEFERRED: same as Mp3 but POST returns 5xx; ref services.md §2>` | Persist-on-disk + bounded retry | pending diff written to on-device storage; operator-visible warning surfaced; retry attempts logged | exact (file exists) + exact (warning surfaced) + threshold_max (retries ≤ configured cap) | N/A | N/A |
| Mp5 | `<DEFERRED: two map updates with conflicting state for same (spatial-cell, class_group) — append-only log scenario; ref services.md §2, Q8>` | Conflict-resolution rule (Q8 placeholder) | append-only observation log + computed current view; conflict resolution per documented rule | json_diff | N/A | `<DEFERRED: expected_results/mapobjects_conflict_resolution.json — pending Q8>` |
---
## Coverage Status (auto-recomputed 2026-05-19)
- **Total rows**: 53 (L1–L9, T1–T3, D1–D6, M1–M4, S1–S5, O1–O10, R1–R9, Re1–Re2, Mp1–Mp5).
- **Fully bound to real fixtures**: L1, T3, D2, D6 = **4 rows (~8%)**.
- **Bound to derived inline fixture** (no external acquisition needed): L2 = **+1 row (5 total, ~9%)**.
- **Bound to starter/partial fixtures** (visual reference only; assertions need additional deferred inputs to be meaningful): D3, D4, D5, M1, M2, M3, M4 = **+7 rows (12 total partial, ~23%)**.
- **Inline-authorable but not yet authored** (no external dependency; can be unblocked anytime by writing the fixture): T1, S4, O1–O7, R1–R3, R8, Re1, Re2 = **15 rows (~28%)**. Lifting these alone would bring effective coverage to ~51%.
- **Blocked on external acquisition** (real recordings, SITL, annotated eval sets, mock services): L3–L9 (minus L6/L7 partial), T2, D1, M1–M4 (CSV pairs), S1, S2, S3, S5, R4–R7, R9, Mp1–Mp5 = **25 rows (~47%)**, including M1–M4, which are also counted as partial above.
- **Blocked on architecture questions**: O8 (depends on Q9 partially), O9, O10 (Q9), M4 (Q14), Mp5 (Q8) = **5 rows**.
**Decision (project policy)**: rather than block on the Phase 3 75% gate, each deferred row is now registered with a structured `<DEFERRED:>` tag and surfaces in `data_parameters.md → "Gaps that block /test-spec downstream"`. `/test-spec` Phase 2 can author scenarios for all 53 rows; deferred rows become **release-gate items**, not development-gate items. The `acceptance_criteria.md → "Acceptance Gates (project-level)"` hardware/replay benchmark requirement is preserved as the hard release gate; that one is NOT being deferred.
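
The row total and the coverage fractions in this section can be re-derived from the row-id ranges instead of being counted by hand. A minimal sketch, with `expand` and `coverage` as hypothetical helpers and ASCII hyphens used in the range specs:

```python
import re


def expand(spec: str) -> list:
    """Expand 'L1-L9' into ['L1', ..., 'L9']; a bare id such as 'T2' expands to itself."""
    m = re.fullmatch(r"([A-Za-z]+)(\d+)(?:-[A-Za-z]+(\d+))?", spec)
    if m is None:
        raise ValueError(f"unrecognised row spec: {spec!r}")
    prefix = m.group(1)
    first = int(m.group(2))
    last = int(m.group(3) or m.group(2))
    return [f"{prefix}{i}" for i in range(first, last + 1)]


# Row-id ranges as listed under "Total rows".
ALL_ROWS = [row
            for spec in ("L1-L9", "T1-T3", "D1-D6", "M1-M4", "S1-S5",
                         "O1-O10", "R1-R9", "Re1-Re2", "Mp1-Mp5")
            for row in expand(spec)]


def coverage(bound_rows: list) -> float:
    """Fraction of all rows that appear in `bound_rows` (duplicates ignored)."""
    return len(set(bound_rows) & set(ALL_ROWS)) / len(ALL_ROWS)


fully_bound = ["L1", "T3", "D2", "D6"]
meets_phase3_gate = coverage(fully_bound) >= 0.75
```

Recomputing the total this way whenever a table gains or loses a row keeps the bullet percentages from drifting.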
## Notes on this spec
- Every row carries a quantifiable comparison + tolerance — no row is "should work".
- Where the AC depends on hardware (the deployed compute device, ViewPro A40), the test must run on representative hardware OR a benchmarked replay; pure-emulator runs are NOT acceptable for L1–L9, T1–T3, Re1–Re2.
- Where the AC depends on an external service (`../detections`, `missions`, Ground Station), the test runs against either (a) the real service in the suite e2e (`../e2e/docker-compose.suite-e2e.yml`), or (b) a recorded replay fixture for isolation tests. Both modes are valid; the test scenario states which.
- Q-tagged rows (M4 → Q14, Mp5 → Q8, O8–O10 → Q9) depend on open architecture questions. Their tolerance ranges may sharpen once those questions resolve; the existence of each row is non-negotiable.
- M1–M4 visual-reference bindings (`fixtures/movement/video0[1-4].mp4`) are usable for harness smoke testing but DO NOT satisfy the assertion semantics; paired `gimbal.csv` + `telemetry.csv` are required for ego-motion compensation to be verifiable. This is the single highest-priority fixture gap.
@@ -0,0 +1,90 @@
# Fixture manifest
All fixtures live **inside this workspace** so the autopilot repo is self-sufficient — downstream test runners must never reach into a sibling repo at `../`. When you add or refresh a fixture, update the matching SHA-256 in this manifest AND the rows in `../expected_results/results_report.md` that consume it.
Total on-disk size: ~57 MB.
## Files
### Still-image aerial frames — `images/`
Used as Tier-1 input frames for detection-quality assertions.
| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
|---|---|---|---|---|
| `images/4d6e1830d211ad50.jpg` | 152 KB | `4c396495af64aaf9aac5ecb92431bf0c75db42b0bdb8e4eec1937f9995acee42` | `../detections/data/images/` (re-copied 2026-05-19) | L1, D6 |
| `images/54f6459dbddb93d8.jpg` | 6.7 MB | `cd65c76a080ef72ce3528031f003f067fca6091c067a86d527a1ae91cd78be59` | `../detections/data/images/` (re-copied 2026-05-19) | D2 |
| `images/6dd601b7d2dc1b30.jpg` | 1.4 MB | `45edd83a357a9f852e14e5845265cd09c20b4b99b1828c160cb3298f0e160181` | `../detections/data/images/` (re-copied 2026-05-19) | D2 |
| `images/805bcf1e9f271a58.jpg` | 176 KB | `fe696899225fc04f2335e87acf6a3ad8a00cd3950c5940d5e73e5ce438f36257` | `../detections/data/images/` (re-copied 2026-05-19) | D2 |
| `images/f997d0934726b555.jpg` | 232 KB | `5d1c9c551c0680e5b3d0aab261bca71e724c78f6db3580da598c680b4f7d4d79` | `../detections/data/images/` (re-copied 2026-05-19) | D2 |
### Reconnaissance video — `videos/`
| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
|---|---|---|---|---|
| `videos/94d42580bd1ad6ff.mp4` | 12 MB | `602b22a42515a754313551847caa6d6a6d7b3cde1d857cbd08ebc5543fb8cf7c` | `../detections/data/videos/` (re-copied 2026-05-19) | T3 (frame-rate floor scenario) |
### Movement-detection clips — `movement/`
Wide-area reconnaissance clips intended for movement-detection visual baselines. **Important**: these clips DO NOT have paired `gimbal.csv` / `telemetry.csv` files, so ego-motion compensation assertions (M1–M4) cannot run against them. They are useful for visual harness work, frame-count assertions, and as visual reference for the movement-detection scenarios.
| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
|---|---|---|---|---|
| `movement/video01.mp4` | 5.3 MB | `6f37186f5e9be97109db8d0d220df96d21cac9ce5b50b576234c6f7ee369d2bb` | local; provenance pre-existing in workspace | M1 (visual reference only — no telemetry) |
| `movement/video02.mp4` | 5.9 MB | `7de7981e511e21e1e72f506d44541b44a4c27a995c9505ef8e3b48e69b416367` | local; provenance pre-existing in workspace | M2 (visual reference only — no telemetry) |
| `movement/video03.mp4` | 6.1 MB | `df441164da7f37d715968212b95e9bf53c8e37384f20ddfab61cd6d0d18b4f3a` | local; provenance pre-existing in workspace | M3 (visual reference only — no telemetry) |
| `movement/video04.mp4` | 5.8 MB | `36445bf1c86c5afa524000b5b2da7fc9cb3d39c745f9ad830b3d60f6868948e7` | local; provenance pre-existing in workspace | M4 (visual reference only — no telemetry) |
### Semantic reference frames — `semantic/`
Annotated reference examples for concealed-position semantic targets. **Not a graded eval set** — these are 4 hand-picked examples of footpath-to-concealment patterns, intended as visual reference for what the system should recognise. Detection-quality gates (D1, D3, D4, D5) need a full annotated multi-season eval set; these 4 PNGs are insufficient for those gates and serve as starter reference only.
| File | Size | SHA-256 | Description | `results_report.md` rows |
|---|---|---|---|---|
| `semantic/semantic01.png` | 3.1 MB | `339ad4d35ab36052828f05652ab7249801bcd5d7bb04522f0ab9cbf6f0ca008a` | Footpath leading to branch-pile hideout in winter forest | D3, D4, D5 (starter only — full multi-season set still required) |
| `semantic/semantic02.png` | 5.1 MB | `ffe3c49f5f1833724ce46083d212e714422e664b635cdd48b63311adefcd7b1f` | Footpath to FPV launch clearing, branch mass at forest edge | D3, D4, D5 (starter only) |
| `semantic/semantic03.png` | 1.0 MB | `ce89c139815e9a80679237008f7cfc3039bbd53f162d48017e840ff91e57b109` | Footpath to squared hideout structure | D3, D4, D5 (starter only) |
| `semantic/semantic04.png` | 1.3 MB | `b25c689b7aa543ec15858e4b5edfa32387ced4930130eb280d952c555f104e69` | Footpath terminating at tree-branch concealment | D3, D4, D5 (starter only) |
| `semantic/data_parameters.md` | 2 KB | n/a (text) | Description of the four reference examples + the new YOLO primitive classes that motivate them | reference only |
### Detection contract schemas — `schemas/`
| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
|---|---|---|---|---|
| `schemas/expected_detections.json` | 1.4 KB | `ce60c105d697efe0359d2e6b1b46fc63e53d3789b067d53501f9c76aad9bd1ae` | `../e2e/fixtures/` (re-copied 2026-05-19) | D6 (sample Tier-1 response) |
| `schemas/expected_detections.schema.json` | 2.4 KB | `a7174e0b083dcbf42fa8672acd3e1807d11ea0629cc636ff958a4d77168733b9` | `../e2e/fixtures/` (re-copied 2026-05-19) | D6 (JSON-schema for the Tier-1 contract) |
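
The sample baseline in `schemas/expected_detections.json` carries a `bbox_iou_min` tolerance and describes `bbox_samples` entries as centre-format boxes (`center_x`, `center_y`, `width`, `height`). A minimal sketch of the IoU comparison that tolerance implies follows; `iou_center_boxes` is a hypothetical helper, not the suite's actual comparator.

```python
def iou_center_boxes(a: dict, b: dict) -> float:
    """Intersection-over-union of two centre-format boxes."""

    def corners(box):
        half_w, half_h = box["width"] / 2.0, box["height"] / 2.0
        return (box["center_x"] - half_w, box["center_y"] - half_h,
                box["center_x"] + half_w, box["center_y"] + half_h)

    ax0, ay0, ax1, ay1 = corners(a)
    bx0, by0, bx1, by1 = corners(b)
    # Overlap extent on each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = a["width"] * a["height"] + b["width"] * b["height"] - inter
    return inter / union if union > 0 else 0.0
```

A sample would then pass when `iou_center_boxes(expected, actual) >= bbox_iou_min`.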
### Database init script — `sql/`
| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
|---|---|---|---|---|
| `sql/init.sql` | 3.7 KB | `b61e452c549f7b006db88d265f4346837e0a33d1abd4d977ebf3d48d8c943439` | `../e2e/fixtures/` (re-copied 2026-05-19) | suite-only reference; no autopilot AC row asserts against this |
## Copy vs reference
Fixtures were COPIED (not moved). The sibling repos still own the originals — keeping autopilot's copy in sync when an upstream changes is a manual chore today (the `monorepo-e2e` skill at the suite root will eventually own this drift; see `_docs/_process_leftovers/` if a sync is pending).
When an upstream fixture changes:
1. Recompute the SHA-256 in the source repo.
2. Re-copy into the matching `fixtures/` subdirectory here.
3. Update this manifest's SHA-256 column.
4. If the change invalidates an assertion in `../expected_results/results_report.md`, fix the row's expected result too — do not let assertions drift silently against new data.
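
The manual sync steps above can be backed by a drift check that compares on-disk digests with the manifest. A minimal sketch using `hashlib` from the standard library; `sha256_of` and `find_drift` are hypothetical helpers, and the manifest is assumed to be available as a plain `{relative_path: hex_digest}` mapping.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_drift(fixtures_root: Path, manifest: dict) -> list:
    """Return relative paths whose file is missing or whose digest differs from the manifest."""
    drifted = []
    for rel_path, expected in manifest.items():
        path = fixtures_root / rel_path
        if not path.is_file() or sha256_of(path) != expected.lower():
            drifted.append(rel_path)
    return drifted
```

For example, `find_drift(Path("fixtures"), {"images/4d6e1830d211ad50.jpg": "4c396495af64aaf9aac5ecb92431bf0c75db42b0bdb8e4eec1937f9995acee42"})` returns an empty list while the local copy still matches the digest recorded in this manifest.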
## Gaps still pending fixture acquisition
The authoritative per-service acquisition catalogue lives in `../services.md`. Summary of the still-open gaps (each is also tagged on its row in `../expected_results/results_report.md` with a structured `<DEFERRED: ...>` marker, and a `_docs/_process_leftovers/` entry records the replay obligation):
| Gap | What's missing | Blocks AC rows | Acquisition status |
|---|---|---|---|
| Paired gimbal+telemetry CSVs for the 4 movement clips | `gimbal.csv` + `telemetry.csv` aligned to each video's frame timestamps | M1–M4, tightens L6/L7 | **Confirmed unavailable today** (user 2026-05-19); requires re-flight or new recording with gimbal-feedback channel captured |
| Annotated eval set across all four seasons | Hundreds to thousands of labelled images per season for concealed-position + footpath gates | D1, D3, D4, D5 | needs annotation campaign (1.5 months at 5 hrs/day target per `semantic/data_parameters.md`) |
| Per-zoom-band frame sequences | Same kind of clip as `movement/` but recorded at light, medium, and high zoom bands | tightens M2, L7, S2 | needs flight time + zoom-band metadata in the recorder |
| Mock `missions` HTTPS exchanges | Recorded JSON request/response pairs for mission GET/POST + mapobjects GET/POST | Mp1–Mp5 | inline-authorable against the `mission-schema`; not yet authored |
| Mock Ground Station session traces | Scripted timing trace (connect / push / drop / reconnect / lost-link) | R4, O8 | inline-authorable; not yet authored |
| ArduPilot SITL traces | Recorded MAVLink streams for waypoint upload, geofence INCLUSION + EXCLUSION, RTL on lost-link, RTL on battery floor | R4, R5, R6, R7, R9 + project SITL conformance gate | needs SITL run |
| Operator-command envelopes | Valid / expired / replayed / malformed envelopes under the chosen Q9 auth scheme | O9, O10 | **blocked on Q9** (`_docs/02_document/architecture.md §8`) |
| VLM I/O pairs | Bounded ROI in → structured `VlmAssessment` out + schema-violation cases | L3, S5 | inline-authorable against the assessment schema once the local model is pinned |
| GPS / NTP drift scenarios | Scripted offset / lock-loss traces | R8 | inline-authorable |
When a fixture from this list lands, copy it under `fixtures/<category>/`, add a row to the relevant subsection above, and bind the matching `<DEFERRED>` row in `../expected_results/results_report.md` to its new local path.
Binary file not shown. Size: 149 KiB

Binary file not shown. Size: 6.7 MiB

Binary file not shown. Size: 1.4 MiB

Binary file not shown. Size: 173 KiB

Binary file not shown. Size: 231 KiB

@@ -0,0 +1,32 @@
{
  "$schema": "./expected_detections.schema.json",
  "_meta": {
    "fixture_version": "0.1.0-placeholder",
    "video": "sample.mp4",
    "video_sha256": "TBD-after-fixture-recording",
    "model": {
      "_comment": "Pinned model + classes that detections must run when this baseline applies. Refresh this block (and counts/bboxes below) whenever detections ships a new model.",
      "name": "TBD",
      "revision": "TBD",
      "classes_source": "annotations/src/Database/DatabaseMigrator.cs (ids 0..18)"
    },
    "tolerance": {
      "_comment": "Spec asserts ranges, not exact values. INT8 calibration drift can move pixel positions by a few units; absolute count can drift by ±1 across re-runs of the same engine on the same Jetson.",
      "count_delta": 1,
      "bbox_iou_min": 0.8,
      "confidence_delta": 0.1
    }
  },
  "expected": {
    "total_annotations": 0,
    "by_class": [
      {
        "class_id": 0,
        "class_name": "ArmorVehicle",
        "count": 0,
        "bbox_samples": []
      }
    ],
    "_placeholder_note": "Replace this block with the real baseline once sample.mp4 is recorded. Each entry under `by_class` carries: class_id, class_name (must match detection_classes.name), count, and bbox_samples (an array of {time_sec, center_x, center_y, width, height, confidence} entries the spec uses for IoU comparison)."
  }
}
@@ -0,0 +1,66 @@
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Suite e2e expected detections baseline",
  "type": "object",
  "required": ["_meta", "expected"],
  "properties": {
    "$schema": { "type": "string" },
    "_meta": {
      "type": "object",
      "required": ["fixture_version", "video", "video_sha256", "model", "tolerance"],
      "properties": {
        "fixture_version": { "type": "string" },
        "video": { "type": "string" },
        "video_sha256": { "type": "string" },
        "model": {
          "type": "object",
          "required": ["name", "revision", "classes_source"],
          "additionalProperties": true
        },
        "tolerance": {
          "type": "object",
          "required": ["count_delta", "bbox_iou_min", "confidence_delta"],
          "properties": {
            "count_delta": { "type": "integer", "minimum": 0 },
            "bbox_iou_min": { "type": "number", "minimum": 0, "maximum": 1 },
            "confidence_delta": { "type": "number", "minimum": 0, "maximum": 1 }
          }
        }
      }
    },
    "expected": {
      "type": "object",
      "required": ["total_annotations", "by_class"],
      "properties": {
        "total_annotations": { "type": "integer", "minimum": 0 },
        "by_class": {
          "type": "array",
          "items": {
            "type": "object",
            "required": ["class_id", "class_name", "count"],
            "properties": {
              "class_id": { "type": "integer", "minimum": 0 },
              "class_name": { "type": "string" },
              "count": { "type": "integer", "minimum": 0 },
              "bbox_samples": {
                "type": "array",
                "items": {
                  "type": "object",
                  "required": ["time_sec", "center_x", "center_y", "width", "height"],
                  "properties": {
                    "time_sec": { "type": "number", "minimum": 0 },
                    "center_x": { "type": "number" },
                    "center_y": { "type": "number" },
                    "width": { "type": "number", "minimum": 0 },
                    "height": { "type": "number", "minimum": 0 },
                    "confidence": { "type": "number", "minimum": 0, "maximum": 1 }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
@@ -0,0 +1,45 @@
# Semantic And Movement Detection Training Data
# Source
- Aerial imagery from reconnaissance winged UAVs at 600–1000 m altitude
- ViewPro A40 camera, 1080p resolution, various zoom levels
- Extracted from video frames and still images
- Movement detection requires frame sequences, not still images only; include camera/gimbal telemetry where available to separate target motion from UAV motion.
# Target Classes
- Footpaths / trails (linear features on snow, mud, forest floor)
- Fresh footpaths (distinct edges, undisturbed surroundings, recent track marks)
- Stale footpaths (partially covered by snow/vegetation, faded edges)
- Concealed structures: branch pile hideouts, dugout entrances, squared/circular openings
- Tree rows (potential concealment lines)
- Open clearings connected to paths (FPV launch points)
- Moving point/cluster candidates at wide or light/medium zoom
# YOLO Primitive Classes (new)
- Black entrances to hideouts (various sizes)
- Piles of tree branches
- Footpaths
- Roads
- Trees, tree blocks
# Annotation Format
- Managed by existing annotation tooling in separate repository
- Expected: bounding boxes and/or segmentation masks depending on model architecture
- Footpaths may require polyline or segmentation annotation rather than bounding boxes
# Seasonal Coverage Required
- Winter: snow-covered terrain (footpaths as dark lines on white)
- Spring: mud season (footpaths as compressed/disturbed soil)
- Summer: full vegetation (paths through grass/undergrowth)
- Autumn: mixed leaf cover, partial snow
# Volume
- Target: hundreds to thousands of annotated images/sequences
- Available effort: 1.5 months, 5 hours/day
- Potential for annotation process automation
# Reference Examples
- semantic01.png — footpath leading to branch-pile hideout in winter forest
- semantic02.png — footpath to FPV launch clearing, branch mass at forest edge
- semantic03.png — footpath to squared hideout structure
- semantic04.png — footpath terminating at tree-branch concealment
4 binary image files not shown (sizes: 2.1 MiB, 4.3 MiB, 1.0 MiB, 1.3 MiB).
@@ -0,0 +1,104 @@
-- Suite e2e database seed.
--
-- Loaded by the `db-seed` service in docker-compose.suite-e2e.yml after
-- annotations has run its own DatabaseMigrator (which creates the schema +
-- inserts the canonical detection_classes 0..18). This file therefore only
-- adds rows that the e2e scenario depends on but the production runtime does
-- NOT seed automatically.
--
-- Idempotency: every statement uses ON CONFLICT / IF NOT EXISTS so re-running
-- the seed (e.g. on a `down -v` followed by `up`) lands the same final state.
--
-- Schema reference: annotations/src/Database/DatabaseMigrator.cs.
\set ON_ERROR_STOP on
-- Wait until annotations has populated its schema. The db-seed container starts
-- only after postgres-local is healthy, but annotations may still be spinning
-- up its tables. A bounded poll keeps the seed deterministic.
DO $$
DECLARE
attempt int := 0;
BEGIN
WHILE attempt < 60 LOOP
PERFORM 1
FROM information_schema.tables
WHERE table_schema = 'public' AND table_name = 'detection_classes';
IF FOUND THEN
EXIT;
END IF;
PERFORM pg_sleep(1);
attempt := attempt + 1;
END LOOP;
IF attempt >= 60 THEN
RAISE EXCEPTION 'detection_classes table not found after 60s — annotations migration did not complete';
END IF;
END $$;
-- Default system_settings row. Annotations starts without one, but several
-- spec assertions rely on `silent_detection = false` and known thumbnail dims
-- so overlay rendering is reproducible.
INSERT INTO system_settings (
id, name, military_unit,
default_camera_width, default_camera_fov,
thumbnail_width, thumbnail_height, thumbnail_border,
generate_annotated_image, silent_detection
) VALUES (
'00000000-0000-0000-0000-00000000aaaa',
'azaion-suite-e2e',
'e2e-unit',
3840, 70,
240, 135, 10,
true, false
) ON CONFLICT (id) DO NOTHING;
-- Default directory_settings row. Annotations writes media files under the
-- paths defined here; the e2e-runner doesn't read these directly but the
-- service requires the row to exist on first hit.
INSERT INTO directory_settings (
id, videos_dir, images_dir, labels_dir, results_dir,
thumbnails_dir, gps_sat_dir, gps_route_dir
) VALUES (
'00000000-0000-0000-0000-00000000bbbb',
'/data/videos', '/data/images', '/data/labels', '/data/results',
'/data/thumbnails', '/data/gps_sat', '/data/gps_route'
) ON CONFLICT (id) DO NOTHING;
-- Default camera_settings row used by detections to size bbox-to-meters.
INSERT INTO camera_settings (
id, altitude, focal_length, sensor_width
) VALUES (
'00000000-0000-0000-0000-00000000cccc',
100, 50, 36
) ON CONFLICT (id) DO NOTHING;
-- Stable e2e user. The UUID is referenced by the spec when asserting
-- annotation rows. Annotations does not own a `users` table — user identity
-- is carried in JWTs minted with JWT_SECRET; the user_id here just needs to
-- be deterministic and stable across runs.
-- Stored in user_settings so the spec can `SELECT user_id` to confirm the
-- seed ran.
INSERT INTO user_settings (
id, user_id,
annotations_left_panel_width, annotations_right_panel_width,
dataset_left_panel_width, dataset_right_panel_width
) VALUES (
'00000000-0000-0000-0000-00000000dddd',
'00000000-0000-0000-0000-0000e2e2e2e2',
300, 400, 320, 320
) ON CONFLICT (id) DO NOTHING;
-- Sanity check — fail loudly if the canonical detection_classes are missing.
-- annotations/src/Database/DatabaseMigrator.cs inserts ids 0..18 unconditionally.
DO $$
DECLARE
cnt int;
BEGIN
SELECT COUNT(*) INTO cnt FROM detection_classes WHERE id BETWEEN 0 AND 18;
IF cnt < 19 THEN
RAISE EXCEPTION 'expected canonical detection_classes 0..18 (count=19), got %', cnt;
END IF;
END $$;
\echo 'suite-e2e seed complete'
@@ -0,0 +1,113 @@
# External Services — Test-Mock Requirements
Black-box catalogue of every external system autopilot depends on at runtime, with the **test-fixture / mock shape required for each**. Service-side design (protocols, component contracts, ownership boundaries) lives in `_docs/02_document/architecture.md` — this file owns ONLY the test-data dependency view (per `.cursor/rules/artifact-srp.mdc`, `_docs/00_problem/input_data/` is a test-data concern).
Runtime input shapes (frame rates, message types) are described in `data_parameters.md`. This file extends them with the **acquisition status of the corresponding test fixture**.
## Index
| # | External system | Production role | Test-mock shape needed | Acquisition status |
|---|---|---|---|---|
| 1 | Tier-1 detection (`../detections`) | Primitive YOLO inference on every frame; returns class + bbox + confidence | Recorded bi-stream replay file (`request frame` → `response detections`) | **MISSING** — no replay recorded yet |
| 2 | Mission planner (`missions` API) | Mission JSON pull at start; middle-waypoint POST on operator-confirm; pre-flight area-map pull + post-flight diff push | Mock HTTPS exchanges for GET/POST + sample mission + sample mapobjects state | **MISSING** — schema known (mission-schema), no fixture recorded |
| 3 | Ground Station (modem) | Continuous push of camera + telemetry + bbox overlay; return path carries operator commands (confirm / decline / target-follow / abort) | Scripted session traces: nominal session, modem drop at T=N, reconnect at T=M, lost-link sustained ≥30 s | **MISSING** — authorable inline (no external dependency) |
| 4 | Airframe autopilot (ArduPilot / PX4) | MAVLink v2 transport for the ~10–15 commands in `architecture.md §7.7`; battery + position telemetry; geofence enforcement | ArduPilot SITL traces: waypoint upload, geofence INCLUSION + EXCLUSION, RTL on lost-link, RTL on battery floor | **MISSING** — needs SITL run with scripted scenarios |
| 5 | ViewPro A40 camera (frames) | H.264/265 1080p RTSP video feed at 30/60 fps | Recorded frame sequences (`.mp4`) — wide-zoom, light-zoom, medium-zoom, high-zoom variants | **PARTIAL** — 4 wide-zoom clips on disk (`fixtures/movement/video0[1-4].mp4`); zoom-band variants missing |
| 6 | ViewPro A40 gimbal (control) | Vendor UDP control protocol; yaw / pitch / zoom telemetry per tick | Per-frame-sequence `gimbal.csv` paired with the matching video; per-tick yaw/pitch/zoom + timestamp | **MISSING** — no `gimbal.csv` paired with the 4 movement videos; ego-motion compensation (M1–M4) is unfalsifiable without this |
| 7 | Deep-analysis VLM (local IPC) | Optional Tier-3 confirmation over bounded ROI; structured `VlmAssessment` response | Recorded I/O pairs (ROI in → `VlmAssessment` out) + schema-violation cases for fail-closed tests | **MISSING** — depends on the local model choice; can be authored against the assessment schema once the model is pinned |
| 8 | Time source (GPS / NTP) | Wall-clock; drift triggers the R8 health-yellow gate | Scripted drift scenarios (no real GPS/NTP hardware needed) — clock offset, jump, source loss | **MISSING** — authorable inline |
## Per-service detail — what acquisition would look like
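The table above is the index; the sections below explain the shape and acquisition path so the gaps can be planned out one at a time. Section numbers do not map one-to-one onto index rows: operator-command envelopes (section 8) are split out from the Ground Station return path (index row 3), which shifts GPS / NTP drift to section 9.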
### 1. Tier-1 detection replay (`../detections`)
- Production transport: bi-directional gRPC. The autopilot streams frames out; `../detections` streams `Detections` messages back.
- Mock shape: a `.replay` file (one per scenario) recording timestamped frames + the exact `Detections` responses the model emitted. Used by `detection_client` integration tests in isolation — no need to boot the real Tier-1 service.
- Acquisition path: record one replay against the currently pinned `../detections` model. Re-record when the upstream model changes (the `monorepo-e2e` skill should eventually own this drift; see the suite's leftovers).
- Blocks AC rows: every row that needs a deterministic detection stream — practically L1, L2, D1, D2, D6 in isolation; in suite-e2e mode these run live against the real `../detections`.
### 2. Mission + MapObjects mock (`missions` API)
- Production transport: HTTPS REST.
- Mock shape: JSON fixtures per endpoint + a small mock HTTP server (or replay-style fixtures consumed by a test double). Endpoints in scope:
- `GET /missions/{id}` — mission JSON, validated against `mission-schema`.
- `POST /missions/{id}` — middle-waypoint insertion (200 OK + updated mission).
- `GET /missions/{id}/mapobjects` — pre-flight area-map pull (response shape: map-object records keyed by spatial cell; volume target ~10000 objects for the 30×30 km gate Mp1).
- `POST /missions/{id}/mapobjects` — post-flight diff push (NEW / MOVED / REMOVED / CONFIRMED-EXISTING; volume target per Mp3 ~17500 records).
- Acquisition path: author JSON fixtures against the known schema; record real exchanges once `missions` is reachable from the test bench.
- Blocks AC rows: Mp1–Mp5 (all 5 map-reconciliation rows).
### 3. Ground Station session trace
- Production transport: continuous push over modem (suite-level protocol).
- Mock shape: scripted timing trace per scenario. Each scenario is a list of `(t, event)` pairs: connect, push frame, push telemetry, operator-click, modem-drop, reconnect, lost-link.
- Acquisition path: authorable inline from `architecture.md §7` and `acceptance_criteria.md §Reliability & Safety`. No external dependency — just a fixture generator.
- Blocks AC rows: R4 (lost-link → RTL at 30 s); O8, O9, O10 (operator command lifecycle on the return path, **but** O9/O10 also depend on Q9 for the auth scheme).
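The `(t, event)` trace shape described above could be generated along these lines. This is a sketch under stated assumptions: the `Event` variants simply mirror the event names listed in the mock shape, while `Trace`, `drop_trace`, and the `lost_link_after` parameter are hypothetical names, and the 30 s grace used in tests would come from the R4 row rather than from this helper.

```rust
use std::time::Duration;

/// Event kinds taken from the mock-shape list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    Connect,
    PushFrame,
    PushTelemetry,
    OperatorClick,
    ModemDrop,
    Reconnect,
    LostLink,
}

/// A scenario is an ordered list of `(t, event)` pairs.
pub type Trace = Vec<(Duration, Event)>;

/// Builds a modem-drop scenario: connect at t=0, drop at `drop_at`, then
/// either a reconnect before the lost-link deadline, or a `LostLink` event
/// at the deadline (followed by a late reconnect if one is scripted).
pub fn drop_trace(
    drop_at: Duration,
    reconnect_at: Option<Duration>,
    lost_link_after: Duration,
) -> Trace {
    let mut trace = vec![(Duration::ZERO, Event::Connect), (drop_at, Event::ModemDrop)];
    let deadline = drop_at + lost_link_after;
    match reconnect_at {
        Some(t) if t < deadline => trace.push((t, Event::Reconnect)),
        Some(t) => {
            trace.push((deadline, Event::LostLink));
            trace.push((t, Event::Reconnect));
        }
        None => trace.push((deadline, Event::LostLink)),
    }
    trace
}
```

Keeping the generator pure (no clock reads, no I/O) means the same trace can be replayed deterministically by any test double that consumes it.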
### 4. MAVLink SITL trace
- Production transport: MAVLink v2 over UDP or serial.
- Mock shape: ArduPilot SITL recording capturing the autopilot's command stream + the airframe's response stream. One trace per scenario: waypoint upload, geofence INCLUSION violation, geofence EXCLUSION violation, lost-link RTL, battery RTL-floor RTL, battery hard-floor land-now.
- Acquisition path: run ArduPilot SITL with a scripted mission; capture the full MAVLink stream with mavlink-router or equivalent.
- Blocks AC rows: R4 (RTL exact timing), R5, R6, R7, R9; plus the project-level "MAVLink command surface MUST pass SITL conformance" gate.
### 5. Camera frame sequences (ViewPro A40)
- Production transport: RTSP/RTP over TCP/UDP, 1080p H.264/265 at 30/60 fps.
- Current local fixtures: `fixtures/movement/video0[1-4].mp4` (4 clips, ~56 MB each), `fixtures/videos/94d42580bd1ad6ff.mp4` (one reconnaissance clip used for T3 frame-rate floor).
- Mock-shape gap: zoom-band coverage. Each AC scenario that names a zoom level (wide, light, medium, high) needs a representative clip at that zoom band. The 4 movement clips do not enumerate which zoom band each represents — this needs documenting per clip OR re-recording with zoom-band labels.
- Acquisition path: existing clips usable for movement-detection visual baselines; new recordings at each zoom band require flight time.
### 6. Gimbal telemetry CSV (paired with frames)
- Production transport: ViewPro A40 vendor protocol over UDP; per-tick yaw/pitch/zoom updates.
- Mock shape: `gimbal.csv` with columns `(t, yaw_deg, pitch_deg, zoom_band, focal_mm)`, one CSV per video file, timestamps aligned to frame timestamps within ≤ 1 frame.
- Acquisition path: requires re-flying the recording with the gimbal-feedback channel captured alongside. CANNOT be back-fitted to existing videos.
- Blocks AC rows: M1, M2, M3, M4 (movement-detection ego-motion compensation); also tightens L6, L7 (movement candidate enqueue latency).
- **Confirmed not available today (user-stated 2026-05-19).**
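Once a `gimbal.csv` exists, a loader and alignment check for the column layout above might look like the following sketch. It assumes `t` is in seconds, `zoom_band` is a free-form label, and there is no header row or quoting; `GimbalRow`, `parse_row`, and `aligned` are hypothetical names, not an existing API.

```rust
/// One row of `gimbal.csv`: (t, yaw_deg, pitch_deg, zoom_band, focal_mm).
#[derive(Debug, Clone, PartialEq)]
pub struct GimbalRow {
    pub t: f64,
    pub yaw_deg: f64,
    pub pitch_deg: f64,
    pub zoom_band: String,
    pub focal_mm: f64,
}

/// Parses a single comma-separated row; rejects wrong column counts
/// and non-numeric values instead of guessing.
pub fn parse_row(line: &str) -> Result<GimbalRow, String> {
    let cols: Vec<&str> = line.split(',').map(str::trim).collect();
    if cols.len() != 5 {
        return Err(format!("expected 5 columns, got {}", cols.len()));
    }
    let num = |i: usize| {
        cols[i]
            .parse::<f64>()
            .map_err(|e| format!("column {}: {}", i, e))
    };
    Ok(GimbalRow {
        t: num(0)?,
        yaw_deg: num(1)?,
        pitch_deg: num(2)?,
        zoom_band: cols[3].to_string(),
        focal_mm: num(4)?,
    })
}

/// True when every frame timestamp has a gimbal sample within one frame
/// period (1 / fps), which is the alignment bound stated above.
pub fn aligned(rows: &[GimbalRow], frame_times: &[f64], fps: f64) -> bool {
    let period = 1.0 / fps;
    frame_times
        .iter()
        .all(|&ft| rows.iter().any(|r| (r.t - ft).abs() <= period))
}
```

The alignment check is quadratic for simplicity; a real loader would likely walk both sorted sequences once.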
### 7. VLM I/O pairs
- Production transport: Unix-domain socket IPC to local-onboard VLM (NanoLLM / VILA1.5-3B per architecture §1).
- Mock shape: paired `(roi.png, prompt.txt, vlm_response.json)` per scenario + a small set of schema-violation cases (truncated JSON, wrong field types, missing required fields) for fail-closed tests.
- Acquisition path: depends on the local VLM model choice. Once pinned, capture real I/O during a flight or scripted run; schema-violation cases authored inline.
- Blocks AC rows: L3 (Tier-3 ≤5 s latency on bounded ROI), S5 (deep-analysis hold-cap interaction).
### 8. Operator-command envelopes
- Production transport: comes back to autopilot via Ground Station modem return path.
- Mock shape: per envelope, a `(scheme, payload, signature, sequence_id)` tuple. One fixture per case: valid, expired, replayed (same envelope sent twice), malformed (signature mismatch), unsigned.
- Acquisition path: **blocked on Q9** (operator-command auth scheme — open in `_docs/02_document/architecture.md §8`). Once the scheme is chosen, envelopes are authorable inline.
- Blocks AC rows: O9 (replay protection), O10 (signature validation); strengthens O8 (confirm pathway).
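The fixture cases listed above (valid, replayed, malformed, unsigned) can be exercised against a gate like the sketch below. Because the auth scheme is still open under Q9, signature verification is passed in as a closure and is a placeholder, not real cryptography. The expired case is left out because the tuple above carries no timestamp. `Envelope`, `Verdict`, and `Gate` are hypothetical names.

```rust
use std::collections::HashSet;

/// Mirrors the `(scheme, payload, signature, sequence_id)` tuple.
#[derive(Debug, Clone)]
pub struct Envelope {
    pub scheme: String,
    pub payload: Vec<u8>,
    pub signature: Option<Vec<u8>>,
    pub sequence_id: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Accepted,
    Unsigned,
    BadSignature,
    Replayed,
}

/// Rejects unsigned, badly signed, and replayed envelopes.
/// `verify(payload, signature)` stands in for the undecided scheme.
pub struct Gate<F: Fn(&[u8], &[u8]) -> bool> {
    verify: F,
    seen: HashSet<u64>,
}

impl<F: Fn(&[u8], &[u8]) -> bool> Gate<F> {
    pub fn new(verify: F) -> Self {
        Gate { verify, seen: HashSet::new() }
    }

    pub fn check(&mut self, env: &Envelope) -> Verdict {
        let sig = match &env.signature {
            Some(s) => s,
            None => return Verdict::Unsigned,
        };
        if !(self.verify)(env.payload.as_slice(), sig.as_slice()) {
            return Verdict::BadSignature;
        }
        // Record the sequence id only after the signature passes, so a
        // forged envelope cannot burn a legitimate sequence number.
        if !self.seen.insert(env.sequence_id) {
            return Verdict::Replayed;
        }
        Verdict::Accepted
    }
}
```

Checking the signature before touching replay state is a deliberate ordering choice in this sketch; the real ordering depends on whichever scheme Q9 settles on.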
### 9. GPS / NTP drift scripts
- Production transport: kernel-level wall clock + GPS lock state.
- Mock shape: scripted offset injection — bump the clock by N ms, drop GPS lock, change time source.
- Acquisition path: authorable inline; no external dependency.
- Blocks AC rows: R8.
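The scripted offset injection above could be modelled with a small mock clock, sketched here. The three step kinds come from the mock shape; the threshold is passed in as a parameter (the security approach cites drift above 200 ms as the health-yellow trigger). `MockClock`, `Step`, `ClockSource`, and `Health` are hypothetical names, and this sketch evaluates health from drift alone.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ClockSource {
    Gps,
    Ntp,
    Unsynced,
}

/// One scripted drift step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    /// Bump the clock by N milliseconds (may be negative).
    BumpMs(i64),
    /// Drop GPS lock; only has an effect while GPS is the source.
    DropGpsLock,
    /// Change the time source.
    SwitchSource(ClockSource),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    Green,
    Yellow,
}

pub struct MockClock {
    pub offset_ms: i64,
    pub source: ClockSource,
}

impl MockClock {
    pub fn apply(&mut self, step: Step) {
        match step {
            Step::BumpMs(ms) => self.offset_ms += ms,
            Step::DropGpsLock => {
                if self.source == ClockSource::Gps {
                    self.source = ClockSource::Unsynced;
                }
            }
            Step::SwitchSource(s) => self.source = s,
        }
    }

    /// Yellow once the absolute offset exceeds the threshold.
    pub fn health(&self, threshold_ms: i64) -> Health {
        if self.offset_ms.abs() > threshold_ms {
            Health::Yellow
        } else {
            Health::Green
        }
    }
}
```

A scenario is then just a `Vec<Step>` applied in order, with a health assertion after each step.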
## Coverage summary by service
| Service | Rows covered (real fixture) | Rows blocked on this service | Acquisition priority |
|---|---|---|---|
| Tier-1 replay | L1, D2, D6 (live; replay desirable for isolation) | none independently blocked | low (can use live `../detections` in suite-e2e) |
| `missions` mock | none | Mp1–Mp5 (5 rows) | medium |
| Ground Station trace | none | R4, O8 (2 rows) | low (inline-authorable) |
| MAVLink SITL | none | R4, R5, R6, R7, R9 (5 rows) + project conformance gate | high |
| Frame sequences | L1 (with image), T3 (with video) | enriches L6/L7 with telemetry | medium |
| Gimbal CSV | none | M1–M4 (4 rows) + L6, L7 | **high — explicit user gap** |
| VLM I/O pairs | none | L3, S5 (2 rows) | low (model-choice gated) |
| Operator envelopes | none | O9, O10 (2 rows) | blocked on Q9 |
| GPS/NTP drift | none | R8 | low (inline-authorable) |
Per-row binding lives in `expected_results/results_report.md`. The status of each gap is mirrored in `_docs/_process_leftovers/` so the next `/autodev` run can replay the missing-fixture decision.
## What this file does NOT own
- Component design (how `detection_client` talks to Tier-1, how `mission_client` retries, etc.) — `_docs/02_document/architecture.md` and `_docs/02_document/components/*/description.md`.
- Production data shapes (frame rate, MAVLink message types) — `data_parameters.md` already has these.
- AC text — `_docs/00_problem/acceptance_criteria.md`.
- The choice of which mocks to use during a given test run (live vs replay vs scripted) — `_docs/02_document/tests/` (test strategy doc, authored by `/test-spec` Phase 2).
@@ -0,0 +1,55 @@
# Problem
## What is being built
`autopilot` is the onboard mission executor for a reconnaissance winged UAV. It runs on the airframe's edge compute device. It receives a mission from outside, controls the airframe, drives the camera + gimbal to inspect terrain, and feeds a remote human operator with everything the operator needs to confirm or decline each candidate target.
## What problem it solves
The reconnaissance UAV detects vehicles and military equipment well enough today, but the current high-value targets are **camouflaged positions** — FPV-operator hideouts, hidden artillery emplacements, dugouts masked by branches. These cannot be found by visual similarity to known object classes alone.
Three observation gaps must be closed:
- **Visual sweep coverage** — the camera must follow the planned route and keep eyes on the terrain it overflies, not only on already-known targets.
- **Movement detection on a moving camera platform** — small movers must be surfaced as they appear, even while the airframe and gimbal are themselves moving and even at higher zoom levels.
- **Context-aware target recognition** — a candidate position has to be assessed against scene context (footpaths arriving at it, fresh-vs-stale tracks, concealment patterns), not just shape.
For every candidate it does surface, the system must reach a human operator quickly enough to act, without overwhelming the operator with too many candidates at once, and with confidence-scaled urgency so high-confidence targets are not lost to a low-confidence noise queue.
## Who uses it
- **Operators** — single primary, optional remote secondary. They see camera feed + telemetry + candidate overlays in a browser at a Ground Station and respond with confirm / decline / target-follow / abort. Their decisions must be authenticated, signed, and replay-protected because the radio link is hostile territory.
- **Mission planners** — define the mission region and consume the post-mission diff of what was found.
- **Airframe / Ground-Station crews** — depend on the system to safely abort or RTL when the operator link is lost, and to refuse takeoff if the system is not in a flight-ready state.
- **Suite operations** — need to know when the airframe is in flight so that other ground-side housekeeping (model updates, OTA) does not interfere.
## The operational reality this problem lives in
Stated as fact, not as a design choice. (Design lives in `_docs/01_solution/solution.md` and `_docs/02_document/architecture.md`.)
- The airframe is a reconnaissance winged UAV flying at 600–1000 m altitude.
- Missions cover all four seasons and all common terrain types (winter snow, spring mud, summer vegetation, autumn; forest, open field, urban edges, mixed terrain).
- The link between the airframe and the Ground Station is a modem radio that can degrade or drop entirely mid-flight; the system has to keep flying safely when this happens.
- The operator is remote, watches a browser UI on the Ground Station, and is not co-located with the airframe.
- Primitive (Tier 1) object detection is the responsibility of a separate service running alongside the autopilot on the same compute, accessible over a local interface — this split is fixed at the suite level, not something autopilot can choose.
- Mission state and the area-level map of previously-seen objects come from a separate `missions` service over the network and are reconciled before takeoff and after landing.
## What this system is NOT for
(Scope-clarifying so the reader does not project unrelated concerns onto autopilot.)
- Multi-airframe coordination, fleet management, swarm logic.
- Mission planning across regions.
- GPS-denied navigation algorithms (a separate suite service provides corrected GPS).
- Annotation tooling, model training, dataset curation.
- The operator browser UI itself (the Ground Station hosts it; autopilot feeds it).
- Cloud-hosted inference of any kind.
## Where to read further
- `_docs/00_problem/restrictions.md` — the hard constraints (hardware, environment, regulatory).
- `_docs/00_problem/acceptance_criteria.md` — measurable success criteria.
- `_docs/00_problem/security_approach.md` — threat model + security non-negotiables.
- `_docs/00_problem/input_data/` — runtime inputs + test fixture references.
- `_docs/01_solution/solution.md` — the chosen solution shape (component breakdown, tech stack rationale).
- `_docs/02_document/architecture.md` — the full architectural design.
@@ -0,0 +1,54 @@
# Restrictions
Externally imposed constraints the system MUST satisfy. Design choices — even frozen ones — live in `_docs/02_document/architecture.md`, not here. (Audited against `.cursor/rules/artifact-srp.mdc`.)
## Hardware (fixed at the suite level — autopilot does not choose)
- Compute device: **Jetson Orin Nano Super** (aarch64), 67 TOPS INT8, **8 GB shared LPDDR5**. Tier 1 detection consumes ~2 GB of that, leaving ~6 GB for everything autopilot owns.
- Primary camera: **ViewPro A40**. 1080p (1920×1080), 40× optical zoom, f=4.25–170 mm, Sony 1/2.8" CMOS (IMX462LQR), HDMI or IP output at 1080p 30/60 fps. The A40's vendor control protocol is the only way to drive its pan/tilt/zoom — autopilot must speak it.
- Alternative camera: **ViewPro Z40K** (higher cost; the system must remain compatible).
- Thermal sensor (640×512, NETD ≤50 mK) may be added later; the system must not assume it is present today.
- 40× optical zoom traversal takes 1–2 s wall-clock. Any sub-2-second zoom-out → zoom-in product behaviour must account for this physical floor.
## Operational
- Flight altitude: 600–1000 m.
- All seasons in scope: winter snow, spring mud, summer vegetation, autumn. Winter-first-only is rejected (frozen 2026-05-06).
- All terrain types in scope: forest, open field, urban edges, mixed terrain.
- The operator/Ground-Station radio link is a modem with intermittent reliability — the system must tolerate degradation and full loss mid-flight.
## Software environment (externally imposed)
- The chosen onboard inference path must run on Jetson Orin Nano Super within the 6 GB residual RAM budget (after Tier 1).
- **Models use FP16 precision** (frozen 2026-05-06; INT8 is rejected for MVP). Applies to every model loaded onto Jetson.
- **No cloud egress for inference.** Any model larger than the in-binary footprint must run locally on the same Jetson, not in the cloud. Network calls for inference are forbidden.
- Tier 1 (YOLO) and any local large model with GPU memory pressure share the Jetson GPU — only one of them may execute at any wall-clock instant. (This is a hardware-resource fact; how the system serialises them is design.)
- The mission file format is the shared `mission-schema` artefact owned jointly by autopilot and the `missions` service. Autopilot MUST consume that schema; it cannot fork it.
## Suite-level architectural splits (autopilot does not own these decisions)
- Tier 1 primitive object detection runs in the sibling **`../detections`** service. Autopilot consumes its output; autopilot does NOT host Tier 1.
- Mission state (waypoints, region, etc.) comes from the **`missions`** service. Autopilot does not author missions.
- Central map of previously-detected objects lives in **`missions`** (extension `/missions/{id}/mapobjects`). Autopilot reconciles with it pre-flight and post-flight; in-flight, autopilot is authoritative for its mission's area.
- GPS coordinates come from a separate **GPS-denied service** (`../gps-denied-onboard` / `../gps-denied-desktop`). Autopilot does NOT implement GPS-denied algorithms.
- Operator browser UI is owned by the **Ground Station**. Autopilot pushes the data; it does NOT render the UI.
- Annotation tooling + model training live in **separate repos** (`../annotations`, `../ai-training`). Autopilot does NOT own them.
## Reliability & Safety obligations (mandatory)
These are existence-of-the-rule constraints. The specific numeric thresholds (RTL grace, drift bound, retry count) are measured success criteria and live in `acceptance_criteria.md`.
- **Pre-flight self-test (BIT) MUST gate takeoff.** The airframe must not take off until every dependency the mission needs is verifiably healthy or the operator has explicitly accepted a known degraded state (e.g. cached MapObjects fallback).
- **Lost operator-link failsafe MUST be deterministic and bounded.** Loss of the operator/Ground-Station radio link cannot result in undefined behaviour. The eventual outcome must be a known mission-safe state (RTL by default, configurable per mission).
- **Airframe MAVLink link loss MUST surface health-red immediately** and defer behaviour to the autopilot stack on the airframe (ArduPilot / PX4).
- **Battery / fuel thresholds MUST trigger pre-defined safety behaviour** (RTL above a soft floor; land-now below a hard floor). Only operator override may bypass.
- **Geofence enforcement MUST be symmetric** — both INCLUSION and EXCLUSION polygons honoured.
- **Operator commands MUST be authenticated, signed, and replay-protected.** Modem-link encryption alone is not sufficient. (Threat model + open scheme choice live in `security_approach.md`.)
- **On-device storage MUST be bounded.** Persistent-store full is a takeoff-blocker; mid-flight eviction policy is mandatory.
- **No silent error swallowing.** Every dependency state MUST surface through a health endpoint.
- **Wall-clock MUST be bound to GPS time once GPS is locked, or NTP at boot.** Forensic timestamping of operator commands depends on this.
- **MAVLink command surface MUST conform** to whatever ArduPilot/PX4 actually accepts (SITL is the conformance reference). Inventing MAVLink semantics is not permitted.
## Out of scope — see `problem.md → "What this system is NOT for"`
Scope-exclusion statements are owned by `problem.md`. Not duplicated here.
@@ -0,0 +1,52 @@
# Security Approach
Threat model + non-negotiable security principles. Specific schemes / libraries / algorithms (HMAC vs ed25519, Unix-domain socket peer-cred mechanism, etc.) are design choices and live in `_docs/02_document/architecture.md` + per-component specs. (Audited against `.cursor/rules/artifact-srp.mdc`.)
## Threat model
The autopilot runs onboard a flying UAV. The threats it must defend against on the MVP timeline:
1. **Hijack of operator commands over the radio link.** Even with modem-level link encryption, an attacker who acquires session state could replay a confirm / decline / target-follow / abort command and seize the system's behaviour. The radio link is hostile territory; link encryption alone cannot be the entire defence.
2. **Crafted input payloads** (image / video crops sent to onboard models, malformed messages on the airframe link, oversize attachments to any onboard service) exploiting decoders, memory bugs, or causing resource exhaustion.
3. **Unstructured model output** corrupting downstream decisions and producing false operator-facing confidence (e.g. a free-form VLM text response treated as a trusted downstream API).
4. **Mid-flight peer spoofing** — a fake sibling service (Tier 1 detection, mission service, or any local IPC peer) impersonating a trusted dependency.
5. **Forensic / audit gaps** — wall-clock drift breaking operator-command timestamping, post-mission diff attribution, or replay-protection windowing.
**Out of scope** (lives elsewhere in the suite or is not relevant to the airborne payload):
- Cloud-hosted secret management — autopilot does not call cloud services.
- Multi-tenancy — single mission per flight; single operator-or-paired-operator session per flight.
- Web-attack surface — the operator browser UI lives in the Ground Station, not in autopilot.
- OTA update signing — Watchtower at the suite level owns it; autopilot only consumes signed images.
## Non-negotiable security principles
These are existence-of-the-rule constraints. The chosen mechanism for each is a design decision and lives in `_docs/02_document/architecture.md`.
- **Operator commands MUST be authenticated, signed, and replay-protected.** Every confirm / decline / target-follow / abort command MUST carry a session-bound, replay-resistant signature that is validated before any state change. Failures are logged at WARN+ and dropped silently from the system's state machine; they are never permitted to take effect.
- **No cloud egress for inference.** Tier 2 + Tier 3 (if enabled) MUST run on the same compute as the rest of autopilot. No HTTP / external network call originating from autopilot for inference is permitted.
- **No silent error swallowing for security-relevant failures.** Signature invalid, peer-credential mismatch, schema violation, oversize payload rejected — each MUST surface through the health endpoint and the structured log.
- **Bounded input for any model call.** Crop size + format allow-list + patched image decoders. Crafted-input and resource-exhaustion mitigation is mandatory; "accept anything and hope the decoder handles it" is not acceptable.
- **Schema validation for any non-deterministic model output.** Free-form generative output (e.g. VLM text) MUST be projected onto a fixed structured schema before it crosses any decision boundary inside autopilot. Schema violation MUST fail closed.
- **Local IPC peer authorisation.** Any onboard IPC peer that autopilot trusts MUST be identifiable as the expected local process (not just "anyone who can reach the socket"). The mechanism is a design choice.
- **Health endpoint MUST reflect security state.** Pre-flight BIT covers reachability + warm-up of every external dependency; the same endpoint surfaces in-flight security signals (repeated signature failures, peer-credential mismatch, schema-violation rate).
- **Wall-clock binding requirement.** Operator-command timestamping requires a trusted clock source. Wall-clock MUST be bound to GPS time once GPS is locked, or NTP at boot. Both sources MUST be recorded with `clock_source` + `last_sync_at`. Drift > 200 ms surfaces health yellow (the AC enforces the threshold; this rule mandates the binding).
- **Airframe MAVLink integrity.** Whether the airframe link MUST use MAVLink-2 message signing depends on whether the link is physically isolated. If it is not physically isolated, message signing MUST be enabled. (The decision and the mechanism are tracked as Q6 in `architecture.md §8`.)
## What this system does NOT own
- Modem-link encryption setup — handled at the radio layer below autopilot.
- Suite-wide TLS / certificate provisioning — delegated to suite-level deployment (`../_infra/`).
- OTA update signing — Watchtower; autopilot consumes already-signed images. Boot-time self-check + rollback policy is an open suite-level question (Q10 in `architecture.md §8`).
- Annotation / training-data security — lives in the `ai-training` repo.
- Operator browser UI auth — Ground Station owns it; the modem-side handshake is jointly specified per the operator-command auth scheme (Q9).
## Open security decisions (tracked in `_docs/02_document/architecture.md §8`)
- **Q6** — MAVLink-2 message signing on the airframe link.
- **Q9** — Operator-command authentication scheme (HMAC / ed25519 / MAVLink-2-extension / separate envelope).
- **Q10** — Software rollback policy on the airframe (boot-time self-check, A/B partition, watchdog rollback).
- **Q11** — Multi-operator session policy (single active operator vs quorum).
- **Q12** — Comms blackout during banking turns (tolerate vs suppress lost-link failsafe during known turn arcs).
None of these block the rest of the design. Each affected component spec calls out the question it depends on and the temporary contract used until the question resolves.
+50
@@ -0,0 +1,50 @@
# Solution
The solution for `autopilot` is captured **in full** in `_docs/02_document/architecture.md`, `_docs/02_document/system-flows.md`, `_docs/02_document/data_model.md`, `_docs/02_document/decision-rationale.md`, the 13 per-component specs under `_docs/02_document/components/`, and `_docs/02_document/glossary.md`. These were produced before the canonical greenfield Problem step and were confirmed by the user on 2026-05-17.
This file is the **canonical greenfield Solution pointer** — it exists so downstream skills that expect `_docs/01_solution/solution.md` (test-spec, decompose, plan-resume) have a single entry point, and it summarises the decision shape; it does not duplicate the architecture.
## What is the solution
A single Rust binary on Jetson Orin Nano Super (aarch64) that runs the mission, drives the gimbal in a two-level scan loop, ingests RTSP, delegates Tier 1 detection to `../detections` over bi-directional gRPC, runs Tier 2 + optional Tier 3 (VLM) locally, talks to a remote operator over modem via an always-on telemetry stream, and bracket-synchronises a local H3-indexed MapObjects store with the central `missions` API. The dominant pattern is a deterministic typed state machine — `ZoomedOut`, `ZoomedIn { roi, hold_started_at }`, `TargetFollow { target_id, started_at }` — coordinating a small set of Tokio actor components.
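A minimal sketch of what the typed state machine could look like follows. The three states and their fields are taken from the paragraph above; the `Roi` struct, the `ScanEvent` variants, the millisecond timestamps, and the transition table are illustrative assumptions. The authoritative behaviour is specified in `system-flows.md §F4`.

```rust
/// Hypothetical region of interest in normalised image coordinates.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Roi {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

/// The three scan states named in the solution summary.
/// Timestamps are modelled as milliseconds for simplicity (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScanState {
    ZoomedOut,
    ZoomedIn { roi: Roi, hold_started_at: u64 },
    TargetFollow { target_id: u32, started_at: u64 },
}

/// Hypothetical events; the real event set is defined by the behaviour-tree spec.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScanEvent {
    PoiPopped { roi: Roi },
    HoldFinished,
    OperatorConfirmed { target_id: u32 },
    TargetReleased,
}

/// Pure transition function: every (state, event) pair maps to exactly one next state.
pub fn step(state: ScanState, event: ScanEvent, now_ms: u64) -> ScanState {
    use ScanEvent::*;
    use ScanState::*;
    match (state, event) {
        (ZoomedOut, PoiPopped { roi }) => ZoomedIn { roi, hold_started_at: now_ms },
        (ZoomedIn { .. }, HoldFinished) => ZoomedOut,
        (ZoomedIn { .. }, OperatorConfirmed { target_id }) => {
            TargetFollow { target_id, started_at: now_ms }
        }
        (TargetFollow { .. }, TargetReleased) => ZoomedOut,
        // Any other combination leaves the state untouched (assumption).
        (unchanged, _) => unchanged,
    }
}
```

Because each state carries its own data, a `ZoomedIn` value cannot exist without an ROI and a `TargetFollow` value cannot exist without a target id, which is the property the design relies on in place of shared flags.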
## Component breakdown
13 components organised into four planes (see `architecture.md §2`, §3 and per-component specs):
- **Perception (data plane in)**: `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client` (optional).
- **Decision + Memory**: `scan_controller`, `mapobjects_store`.
- **Action (data plane out)**: `gimbal_controller`, `operator_bridge`, `mission_executor`, `mavlink_layer`, `mission_client`.
- **Telemetry plane (always-on, parallel)**: `telemetry_stream`.
Per-component design contracts (inputs, outputs, state, failure modes, NFRs) live in `_docs/02_document/components/<name>/description.md`.
## Tech stack rationale (one-line summary per choice; full rationale in `decision-rationale.md`)
| Layer | Selection | Rationale |
|---|---|---|
| Language | Rust | Memory safety, performance, single-binary deployment, strong typing for the deterministic state machine. |
| Tier 1 detector | YOLO26 + YOLOE-26 FP16 TensorRT (in `../detections`) | Best fit with acceptance criteria + existing export pipeline. Not owned by autopilot. |
| Tier 2 analyzer | Primitive graph + lightweight ROI CNN | Fast, explainable, data-efficient. |
| Movement detection | OpenCV optical flow + telemetry; learned-CV fallback per Q14 | Addresses moving-camera constraint directly; benchmark-gated. |
| VLM runtime | NanoLLM / VILA1.5-3B (optional, local IPC) | Local multimodal path that matches the no-cloud requirement. |
| MAVLink transport | Hand-rolled (Rust) | Eliminates the largest current dependency-risk item; command surface is small (`architecture.md §7.7`). |
| Gimbal protocol | ViewPro A40 vendor protocol over UDP | Matches the deployed camera. |
| Inter-component IPC | Tokio channels / actors | Idiomatic Rust async. |
| External IPC (VLM) | Unix-domain socket + peer-credential check | Local-only authorisation. |
| MapObjects engine | TBD (SQLite + H3 / KV / in-memory + snapshot) | Open question Q3; does not block decomposition of the rest of the system. |
| Observability | `tracing` + JSON logs to stdout | Scraped by the deployment's log-shipping stack. |
| Build | `cargo` cross-compile for `aarch64-unknown-linux-gnu` | See `_docs/02_document/deployment/ci_cd_pipeline.md`. |
## Reading order for downstream skills
1. `_docs/02_document/architecture.md` — start with §0 Synopsis, then §3 Components, §5 Architectural Principles, §6 NFR Targets, §7 Detailed Design (in section order).
2. `_docs/02_document/system-flows.md` — flow-by-flow walkthroughs; cross-referenced from the architecture sections.
3. `_docs/02_document/data_model.md` — canonical entities (Frame, Detection, POI, VlmAssessment, MapObject, IgnoredItem, MissionItem, ...).
4. `_docs/02_document/components/<name>/description.md` — one per component; consumed by `/decompose` to map tasks to components.
5. `_docs/02_document/glossary.md` — project-specific terms (also user-confirmed 2026-05-17).
6. `_docs/02_document/decision-rationale.md` — load-bearing research and decision evidence (the equivalent of `research/` Mode A + Mode B outputs).
## Open questions / open decisions
Tracked in `_docs/02_document/architecture.md §8 Open Questions` (Q1–Q14). None of them block initial implementation decomposition; each component spec calls out the questions it depends on and what the temporary contract is until the question resolves.
+124
@@ -0,0 +1,124 @@
# autopilot — Documentation Index
**Status**: forward-looking design (Rust). The implementation is in flight. This page is the entry point into the doc set; it does not duplicate content.
If you are new to autopilot, read in this order: `architecture.md` → `system-flows.md` → the component spec(s) you care about → `data_model.md` for entity-level detail → `decision-rationale.md` for *why* the design looks the way it does.
---
## 1. Doc set at a glance
| File | Purpose |
|---|---|
| `architecture.md` | The system. System context, component layering, NFRs, detailed design (problem, restrictions, AC, training data, solution architecture, MAVLink and piloting, MapObjects/H3, MGRS sync, target relocation, MapObjects sync with central DB, tech stack), open questions, scope boundary. |
| `system-flows.md` | Per-flow narratives + sequence diagrams. Frame pipeline, movement detection (zoom-out + zoom-in), VLM confirmation, scan-controller behaviour tree, operator round trip, mission lifecycle, MapObjects + ignored items, MapObjects sync, pre-flight BIT, lost-link failsafe ladder. |
| `data_model.md` | Canonical entity catalogue. Frames, detections, POIs, VlmAssessment, MapObject + observation log + bundle, IgnoredItem, OperatorCommand envelope, MissionItem vs MissionWaypoint, MGRS wire format, persistence + versioning. |
| `decision-rationale.md` | Load-bearing research and decision evidence (per-dimension reasoning chain, fact cards, fit matrix, validation log, source registry, weak-point→fix table, historical seed narrative). |
| `glossary.md` | Project-specific terms. |
| `components/<name>/description.md` | One per autopilot component (13 total): purpose, inputs, outputs, responsibilities, state, failure modes, dependencies, NFR targets, references. |
| `deployment/containerization.md` | Single-binary deployment options (native systemd vs container), target hardware, configuration surface, health endpoint. |
| `deployment/ci_cd_pipeline.md` | Build, test, SITL conformance, benchmark gate, sign + publish. |
| `deployment/observability.md` | Logs (`tracing` + JSON), metrics, traces, health aggregation, replay-driven debugging. |
| `FINAL_report.md` | This file. |
---
## 2. The system in two minutes
`autopilot` is the onboard mission executor for a reconnaissance winged UAV. It runs as a single Rust process on an aarch64 Jetson Orin Nano. It pulls a mission from the external `missions` API (and the mission area's last-known MapObjects state), controls the UAV through a hand-rolled MAVLink layer, drives a ViewPro A40 gimbal in a two-level scan-and-zoom loop (zoom-out wide sweep + zoom-in on POI), streams camera frames + telemetry continuously over modem to an external Ground Station so the operator watches in a browser, and uses bi-directional gRPC to delegate primitive object detection to the external `../detections` API. Movement detection runs at both zoom levels with mandatory ego-motion compensation. Semantic-vision reasoning (Tier 2 + an optional local VLM), a POI scheduler with a ≤5 POIs/min operator-review cap, and a target-follow mode after operator confirmation all run inside autopilot. Pre-flight self-test gates takeoff; the mission's full pass diff is pushed back to the central MapObjects store at mission end. Operator commands are authenticated, signed, and replay-protected.
Full synopsis: `architecture.md §Synopsis`.
---
## 3. Components
The system is 13 components organised into 4 planes:
| Plane | Components |
|---|---|
| Perception (data plane in) | `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client` (optional) |
| Decision + Memory | `scan_controller`, `mapobjects_store` |
| Action (data plane out) | `gimbal_controller`, `operator_bridge`, `mission_executor`, `mavlink_layer`, `mission_client` |
| Telemetry plane (always-on, parallel) | `telemetry_stream` |
Per-component design specs: `components/<name>/description.md`.
---
## 4. Architectural non-negotiables
These are stated once in `architecture.md §5` and referenced everywhere:
- Detection-as-a-service (Tier 1 lives in `../detections`).
- Hand-rolled MAVLink (no third-party SDK).
- Deterministic typed state machine for scan control: `ZoomedOut`, `ZoomedIn`, `TargetFollow`.
- Ego-motion compensation is mandatory for movement detection. Movement detection runs at **both** zoom-out and zoom-in (per-zoom-band thresholds; classical-CV adequacy at zoom-in is benchmark-gated).
- Operator workload cap of ≤5 POIs/minute is hard.
- Operator timeout scales with confidence.
- **Operator commands are authenticated, signed, and replay-protected** (modem encryption alone is not sufficient).
- Local VLM with structured `VlmAssessment` schema; no cloud egress.
- Always-on camera + telemetry stream to Ground Station.
- **Lost-link failsafe is explicit** (`mission_executor` runs a typed ladder; default RTL after 30 s grace).
- **Pre-flight self-test (BIT) gates takeoff** including MapObjects pre-flight pull.
- **MapObjects are mission-bracketed and centrally synchronised** via the `missions` API extension `/missions/{id}/mapobjects`.
- `autopilot` and `missions` are separate repos with a shared `mission-schema` artefact.
- No silent error swallowing; health endpoint reflects every dependency including `mapobjects_sync`.
- Geofence enforcement is symmetric: both INCLUSION and EXCLUSION are honoured.
---
## 5. Open questions
Surfaced explicitly in `architecture.md §8`:
| # | Question | Blocks |
|---|---|---|
| Q1 | Sweep pattern (pendulum / raster / lawn-mower), FOV per zoom tier, dwell time. | `scan_controller` zoom-out implementation. |
| Q2 | Ground Station API contract (stream protocol, auth, bbox-overlay rendering). | `telemetry_stream` + `operator_bridge` design. |
| Q3 | `mapobjects_store` engine (SQLite + H3 / KV / in-memory + snapshot). | Persistent-state design. |
| Q4 | Tier 1 contract evolution / `detection_client` versioning. | gRPC contract definition. |
| Q5 | `mission-schema` extraction location. | Schema sharing between `autopilot` and `missions`. |
| Q6 | MAVLink-2 message signing. | `mavlink_layer` startup handshake. |
| Q7 | Central MapObjects API contract details (paging, photo-ref upload, retention). | `missions` repo work + `mission_client` MapObjects sync code. |
| Q8 | MapObjects conflict resolution (projection rules, REMOVED-claim expiry, multi-class disambiguation). | Central `map_objects_current` view definition. |
| Q9 | Operator-command authentication scheme (HMAC vs ed25519 vs MAVLink-2 sig vs separate envelope). | `operator_bridge` validation logic + Ground Station integration. |
| Q10 | Software rollback policy on the airframe (boot-time check, A/B partition, watchdog rollback). | Deployment design + on-airframe service supervision. |
| Q11 | Multi-operator session policy (single active vs quorum). | `operator_bridge` session model. |
| Q12 | Comms blackout during banking turns (tolerate as `LinkDegraded` vs suppress lost-link during turns). | Lost-link ladder timing constants. |
| Q13 | All-season acceptance flight gates (minimum flights per season, per-season acceptance criteria). | MVP sign-off scope. |
| Q14 | Movement-detector zoom-in fallback selection (learned optical flow vs CNN motion-segmentation vs IMU-tighter classical CV) if classical CV fails the per-zoom-band FP cap. | `movement_detector` zoom-in scope. |
---
## 6. Suite-level docs autopilot consumes
These live in `../_docs/` (parent suite repo):
| Path | Used for |
|---|---|
| `../_docs/00_top_level_architecture.md` | Suite topology, edge tier, flight-gate convention. |
| `../_docs/02_missions.md` | Mission / Waypoint / Vehicle schemas (consumed by `mission_client`). |
| `../_docs/03_detections.md` | Detections gRPC API (consumed by `detection_client`). |
| `../_docs/04_system_design_clarifications.md` | REST patterns, stream-detection protocol, edge-device connection semantics. |
| `../_docs/11_gps_denied.md` | GPS-Denied service architecture (out of autopilot scope). |
| `../_docs/12_ai_training.md` | AI training pipeline (autopilot consumes the resulting models via the suite-wide model-sync timer). |
Full table with ownership: `architecture.md §10`.
---
## 7. Where to put new content
| You want to document… | Put it in… |
|---|---|
| A new flow between components | `system-flows.md` (and add a sequence diagram). |
| A new entity / schema | `data_model.md`. |
| A change in NFR target | `architecture.md §6`. |
| A change in a single component's responsibilities | `components/<name>/description.md`. |
| A change in the MAVLink command surface | `architecture.md §7.7`. |
| A new architectural principle | `architecture.md §5`. |
| A new design decision with research backing | `decision-rationale.md`. |
| A new term | `glossary.md`. |
| A change in deployment shape | `deployment/<file>.md`. |
| Ad-hoc internal team note | not in `_docs/`. |
+847
@@ -0,0 +1,847 @@
# autopilot — Architecture
**Status**: forward-looking design (Rust). The implementation is in flight; the system described here is the target architecture, not what runs today. Confirmed by user 2026-05-17.
## Synopsis
`autopilot` is the onboard mission executor for a reconnaissance winged UAV. It runs as a single Rust process on an aarch64 Jetson Orin Nano edge device. It pulls a mission from the external `missions` API, controls the UAV through a hand-rolled MAVLink layer (~10–15 commands; no third-party SDK), drives a ViewPro A40 gimbal in a two-level scan-and-zoom loop (zoom-out wide sweep + zoom-in on POI), streams camera frames + telemetry continuously over modem to an external Ground Station API so the operator watches in a browser, and uses bi-directional gRPC to delegate primitive object detection to the external `../detections` API. Semantic-vision reasoning (Tier 2 ROI analysis + an optional local VLM), a POI scheduler with an operator-review rate cap, and a target-follow mode after operator confirmation all run inside autopilot. The dominant pattern is a deterministic typed state machine (zoom-out / zoom-in / target-follow) coordinating a small set of async actors.
---
## 1. System Context
Autopilot integrates with six external systems. The local VLM is optional (benchmark-gated); everything else is mandatory.
```mermaid
flowchart LR
cam["ViewPro A40<br/>RTSP camera + gimbal"]
det["../detections<br/>Tier 1 YOLO service"]
vlm["NanoLLM VILA1.5-3B<br/>(optional, local IPC)"]
miss["missions API"]
gs["Ground Station<br/>operator UI"]
ap["ArduPilot / PX4"]
autopilot["autopilot<br/>onboard mission + scan + perception"]
cam <-->|RTSP frames / UDP gimbal control| autopilot
autopilot <-->|bidir gRPC| det
autopilot <-.->|Unix-domain socket IPC| vlm
autopilot <-->|REST GET / POST| miss
autopilot <-->|stream over modem| gs
autopilot <-->|MAVLink v2| ap
```
Per-edge protocol details:
| Edge | Protocol | Direction | Purpose |
|---|---|---|---|
| ViewPro A40 (camera) | RTSP/RTP over TCP/UDP | inbound | live H.264/265 1080p video to `frame_ingest`. |
| ViewPro A40 (gimbal) | UDP, vendor control protocol | bidirectional | yaw / pitch / zoom commands + status; driven by `gimbal_controller`. |
| `../detections` | bi-directional gRPC | bidirectional | frames out, bounding boxes back; driven by `detection_client`. |
| NanoLLM VILA1.5-3B | Unix-domain socket IPC (peer-cred check) | bidirectional | bounded ROI + short prompt → structured `VlmAssessment`; optional. |
| `missions` API | HTTPS REST (GET / POST) | bidirectional | mission pull on start; middle-waypoint POST on operator confirmation; **MapObjects** pre-flight pull + post-flight push (`/missions/{id}/mapobjects`, see §7.13). |
| Ground Station API | continuous push over modem (protocol per `../_docs/04_system_design_clarifications.md`) | bidirectional | always-on camera feed + telemetry + bbox overlay; operator confirm / decline / target-follow. |
| ArduPilot / PX4 | MAVLink v2 over UDP or serial | bidirectional | the small command surface in §7.7. |
---
## 2. Component Layering
Three internal layers (Perception → Decision + Memory → Action) plus an always-on Telemetry plane that runs parallel to the decision loop.
```mermaid
flowchart TB
subgraph autopilot ["autopilot"]
subgraph perception ["Perception (data plane in)"]
fi[frame_ingest]
dc[detection_client]
md[movement_detector]
sa[semantic_analyzer]
vc["vlm_client (opt)"]
end
subgraph brain ["Decision + Memory"]
sc[scan_controller]
mo[mapobjects_store]
end
subgraph action ["Action (data plane out)"]
gc[gimbal_controller]
ob[operator_bridge]
me[mission_executor]
ml[mavlink_layer]
msc[mission_client]
end
subgraph tplane ["Telemetry plane (always-on, parallel)"]
ts[telemetry_stream]
end
end
perception ==>|"inputs (bboxes, motion, Tier 2, VlmAssessment)"| brain
brain ==>|"commands + POI updates + middle-waypoint hints"| action
perception -.->|"frames + bboxes"| tplane
action -.->|"telemetry"| tplane
```
Per-flow component-to-component sequence diagrams live in `system-flows.md`.
---
## 3. Components
| Component | Layer | Responsibility |
|---|---|---|
| `frame_ingest` | Perception | Pull RTSP from ViewPro A40; decode; timestamp; hand frames to `detection_client`, `movement_detector`, and `telemetry_stream` (zero-copy where possible). |
| `detection_client` | Perception | Bi-directional gRPC to `../detections`; streams frames out, receives bounding boxes back; same bboxes are reused for Tier 2 ROI selection and for operator overlay. Versioned against the `../_docs/03_detections.md` contract. |
| `movement_detector` | Perception | Active in **both** zoom-out and zoom-in levels (skipped only during target-follow). OpenCV optical-flow / global-motion estimation fused with timestamped gimbal angle, zoom state, and UAV motion telemetry. Emits residual-motion clusters as POI candidates. Ego-motion compensation is mandatory; naive frame-differencing is rejected. Zoom-in adequacy of classical CV is benchmark-gated — see §7.6 Movement detector and Open Question Q14. |
| `semantic_analyzer` | Perception | Tier 2. Primitive graph + lightweight ROI CNN over zoom-in crops. Owns path-freshness scoring, endpoint scoring, branch choice at intersections, and concealment-POI scoring. |
| `vlm_client` | Perception (optional) | Local-IPC client to a NanoLLM/VILA1.5-3B process. Validates ROI payload size/format, calls the VLM with a bounded crop and short prompt, validates the response against a structured `VlmAssessment` schema. No cloud egress. Optional behind a `vlm_enabled` flag and a feature module (see §7.6 Local VLM Confirmation). |
| `scan_controller` | Decision + Memory | Central deterministic typed state machine — `ZoomedOut`, `ZoomedIn`, `TargetFollow`. Owns the POI queue, timeouts, ≤5 POIs/min cap, confidence-scaled operator-decision window, and gimbal-command issuance. Full behaviour-tree spec in `system-flows.md §F4`. |
| `mapobjects_store` | Decision + Memory | On-device H3-indexed map of detected objects + ignored-items list. Pre-flight pull of the mission-area map from the central `missions` API; in-flight on-device authoritative; post-flight push of the mission diff back to central. Computes new / moved / existing / removed diffs across passes (§7.10, §7.11, §7.12). Read/written directly by `scan_controller`; sync pulls/pushes are handled via `mission_client`. |
| `gimbal_controller` | Action | ViewPro A40 control protocol (yaw / pitch / zoom). Honours ≤2 s zoom transition budget and ≤500 ms decision-to-movement latency. Owns the smooth-pan path-tracking primitive used in zoom-in level. |
| `operator_bridge` | Action | Surfaces POIs and target-follow lifecycle events through `telemetry_stream` to the Ground Station; receives confirm / decline / target-follow start-release back. On decline, appends an `IgnoredItem` via `mapobjects_store`. On confirm, hands a middle-waypoint hint to `mission_executor`. |
| `mission_executor` | Action | Multirotor and fixed-wing variants of the platform state machine: takeoff / climb / cruise / land for multirotor; upload-and-await-AUTO for fixed-wing. Owns geofence enforcement (both INCLUSION and EXCLUSION). Issues MAVLink commands through `mavlink_layer`; consumes `mission_client` mission state. Inserts middle waypoints on operator-confirmed targets. |
| `mavlink_layer` | Action | Hand-rolled MAVLink v2 transport (UDP or serial) implementing only the ~10–15 commands this codebase needs. See §7.7 for the command surface. No third-party SDK. |
| `mission_client` | Action | Pulls mission JSON from the `missions` API on start; validates against `mission-schema`; handles mid-flight middle-waypoint inserts (POST). Survives transient connection loss with bounded retry. |
| `telemetry_stream` | Telemetry plane | Continuous push of camera frames + flight telemetry + bbox overlay to the Ground Station API over modem. Always-on; not detection-gated. Carries operator commands (confirm / decline / target-follow start-release) on the return path. |
The system is intentionally a small set of well-named components rather than 30+ files. Everything in `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, and `vlm_client` runs on the **input data plane** — no UAV control, no operator surface. Everything in `gimbal_controller`, `mission_executor`, `mavlink_layer`, `mission_client`, and `operator_bridge` runs on the **output control plane** — UAV motion + operator interaction. `scan_controller` and `mapobjects_store` are the **brain** in between. `telemetry_stream` is parallel; it never sits in the decision path.
Per-component design specs (purpose, inputs, outputs, state, failure modes, NFRs) live in `components/<name>/description.md`.
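As one illustration of a component responsibility from the table above, the symmetric geofence check owned by `mission_executor` could be sketched as follows. This is a sketch under stated assumptions: coordinates are treated as planar (a real implementation would need a projection), the `Point` type and function names are hypothetical, and a mission with no INCLUSION polygon is treated as "nothing allowed", which is a fail-closed choice made here and not something the spec states.

```rust
/// Hypothetical planar point; real geofences use geodetic coordinates.
#[derive(Debug, Clone, Copy)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Ray-casting point-in-polygon test for a simple (non-self-intersecting) polygon.
pub fn contains(polygon: &[Point], p: Point) -> bool {
    let n = polygon.len();
    if n < 3 {
        return false;
    }
    let mut inside = false;
    let mut j = n - 1;
    for i in 0..n {
        let (a, b) = (polygon[i], polygon[j]);
        // Does the edge a-b straddle the horizontal line through p?
        if (a.y > p.y) != (b.y > p.y) {
            let x_cross = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if p.x < x_cross {
                inside = !inside;
            }
        }
        j = i;
    }
    inside
}

/// A position is allowed only if it lies inside some INCLUSION polygon
/// and inside no EXCLUSION polygon.
pub fn position_allowed(inclusion: &[Vec<Point>], exclusion: &[Vec<Point>], p: Point) -> bool {
    let in_inclusion = inclusion.iter().any(|poly| contains(poly, p));
    let in_exclusion = exclusion.iter().any(|poly| contains(poly, p));
    in_inclusion && !in_exclusion
}
```

Evaluating both polygon kinds in a single predicate makes it hard to honour one and silently skip the other.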
---
## 4. Major Data Flows
1. **Frame pipeline**. ViewPro A40 RTSP → `frame_ingest` → `detection_client` (bi-dir gRPC to `../detections`) → bboxes back → `movement_detector` (active at both zoom-out and zoom-in; residual-motion clusters) → `scan_controller` POI queue. The same bboxes also flow into `telemetry_stream` for operator overlay. (`system-flows.md §F1`)
2. **Zoom-in + confirmation**. `scan_controller` pops a POI → `gimbal_controller` zooms ViewPro A40 → `semantic_analyzer` runs Tier 2 over the ROI → optionally `vlm_client` runs Tier 3 → `scan_controller` decides. Movement candidates emerging during the zoom-in hold are still consumed (subject to telemetry-skew tolerance and the per-zoom-band thresholds). (`system-flows.md §F2`, `§F3`)
3. **Operator round trip**. `telemetry_stream` pushes camera + telemetry + bbox overlay → Ground Station → operator browser → confirm / decline / target-follow start-release → modem → `operator_bridge` → `mapobjects_store` (decline) or `mission_executor` (confirm) or `scan_controller` (target-follow). Always-on, not detection-gated. Operator commands are authenticated, signed, and replay-protected (§5; scheme TBD per Q9). (`system-flows.md §F5`)
4. **Mission lifecycle**. `mission_client` pulls from `missions` API → `mission_executor` issues MAVLink waypoints via `mavlink_layer` → `gimbal_controller` runs the zoom-out sweep along the route. On operator confirmation, `mission_executor` inserts a middle waypoint and resumes after target-follow ends. (`system-flows.md §F6`)
5. **MapObjects + ignored items**. New detections compute an H3 cell, query the k-ring of neighbours, classify as new / moved / existing / removed (§7.12), and check for an `IgnoredItem` match before surfacing to the operator. (`system-flows.md §F7`)
6. **MapObjects sync** (mission-bracketing). Pre-flight: `mission_client` pulls the last-known map state for the mission area from the `missions` API and hydrates `mapobjects_store`. Post-flight: `mission_client` pushes the mission's full pass diff (NEW / MOVED / REMOVED / CONFIRMED-EXISTING) back. In-flight sync is **batched only** for MVP — no streaming over modem (§7.13; `system-flows.md §F8`).
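The MapObjects classification step in flow 5 can be sketched roughly as below. Everything here is an assumption for illustration: cell ids are plain `u64` stand-ins for H3 indexes, the caller supplies the k-ring (no H3 library is used), the matching rule (same class in the same cell is existing, same class in a neighbouring cell is moved) is a guess at the shape of §7.12 and not its content, and the removed case, which needs a whole-pass view, is left out.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for an H3 cell index (assumption; no H3 library is used here).
pub type CellId = u64;

#[derive(Debug, PartialEq, Eq)]
pub enum Classification {
    New,
    Moved { from: CellId },
    Existing,
    /// Matches an `IgnoredItem`; not surfaced to the operator.
    Ignored,
}

/// Hypothetical in-memory view of the store.
pub struct Store {
    /// Cells where each object class was seen on earlier passes.
    pub known: HashMap<String, HashSet<CellId>>,
    /// (class, cell) pairs the operator has declined.
    pub ignored: HashSet<(String, CellId)>,
}

impl Store {
    /// Classifies one detection given its cell and the caller-computed k-ring.
    pub fn classify(&self, class: &str, cell: CellId, k_ring: &[CellId]) -> Classification {
        if self.ignored.contains(&(class.to_string(), cell)) {
            return Classification::Ignored;
        }
        let Some(cells) = self.known.get(class) else {
            return Classification::New;
        };
        if cells.contains(&cell) {
            return Classification::Existing;
        }
        // Same class in a neighbouring cell: treat as the same object having moved.
        match k_ring.iter().find(|c| cells.contains(*c)) {
            Some(&from) => Classification::Moved { from },
            None => Classification::New,
        }
    }
}
```

The ignored-item check runs first in this sketch so that a declined item is never re-surfaced regardless of how it would otherwise classify; the source only requires that the check happens before surfacing.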
---
## 5. Architectural Principles / Non-Negotiables
- **Detection-as-a-service.** Primitive (Tier 1) detection lives in `../detections`, not in autopilot. Autopilot owns Tier 2 (semantic) and Tier 3 (VLM, optional) only.
- **Hand-rolled MAVLink.** No third-party SDK. The MAVLink command surface is small enough to hand-implement; eliminates the largest current dependency-risk item.
- **Deterministic typed state machine** for scan control. States are `ZoomedOut | ZoomedIn { roi, hold_started_at } | TargetFollow { target_id, started_at }`. No ad-hoc booleans, no shared mutable flags. The full behaviour-tree spec lives in `system-flows.md §F4`.
- **Ego-motion compensation is mandatory** for movement detection. Naive frame-differencing is rejected outright. Movement detection runs at **both** zoom-out and zoom-in (skipped only during target-follow); zoom-in adequacy of classical CV is benchmark-gated (§7.6, Q14).
- **Operator workload cap of ≤5 POIs/minute** is hard, not soft. `scan_controller` enforces it.
- **Operator timeout scales with confidence** — 40 % → 30 s, 100 % → 120 s, linear; below 40 % the target is not surfaced. Timeout = forget; decline = `IgnoredItem` entry.
- **Operator commands are authenticated, signed, and replay-protected.** Modem-link encryption alone is not sufficient — every confirm / decline / target-follow / abort command MUST carry a session-bound, replay-resistant signature that `operator_bridge` validates before dispatch. Exact scheme TBD (§8 Q9).
- **Local VLM with structured `VlmAssessment` schema.** Free-form VLM text is not a downstream API. No cloud egress.
- **Always-on camera + telemetry stream** to Ground Station is part of the mission contract — operator always sees the live feed, not just on detection.
- **Lost-link failsafe is explicit.** Loss of the operator/Ground-Station modem link triggers a typed failsafe ladder in `mission_executor` (§7.7). The ladder is deterministic; default action is RTL after a configured grace window.
- **Pre-flight self-test (BIT) gates takeoff.** Every dependency in the health-endpoint list later in this section, plus mission load + MapObjects pre-flight pull (cached fallback acknowledged), must pass before `mission_executor` enters `ARMED` (multirotor) or `WAIT_AUTO` (fixed-wing). Health endpoint distinguishes pre-flight vs in-flight readiness.
- **`autopilot` and `missions` are separate repos** with a shared `mission-schema` artefact. The same `missions` API also hosts the central MapObjects endpoints (§7.13).
- **MapObjects are mission-bracketed and centrally synchronised.** Pre-flight pull on start; on-device authoritative in-flight; full pass diff pushed at mission end. The on-device store is a working copy of the central state for the mission's bounding box, not a private database.
- **No silent error swallowing** anywhere in the pipeline. Health endpoint reflects every dependency: `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client` (if enabled), `scan_controller`, `gimbal_controller`, `mavlink_layer`, `mission_client`, `mission_executor`, `operator_bridge`, `telemetry_stream`, `mapobjects_store`, plus `mapobjects_sync` (pre-flight pull / post-flight push status).
- **Geofence enforcement is symmetric.** Both INCLUSION and EXCLUSION polygons are honoured. (Earlier C++ behaviour silently ignored EXCLUSION; the rewrite explicitly enforces both.)
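The confidence-scaled operator timeout in the list above is a linear interpolation between two anchor points (40 % → 30 s, 100 % → 120 s), so each percentage point of confidence adds 1.5 s. A minimal sketch, assuming confidence is passed as a fraction in `[0, 1]` and that values above 1.0 are rejected as invalid input (the function name and both of those choices are assumptions):

```rust
use std::time::Duration;

/// Operator-decision window for a target of the given confidence.
/// Returns `None` when the target must not be surfaced (below 40 %)
/// or when the input is out of range or NaN (assumed to be invalid).
pub fn operator_timeout(confidence: f64) -> Option<Duration> {
    const MIN_CONF: f64 = 0.40;
    const MAX_CONF: f64 = 1.00;
    const MIN_SECS: f64 = 30.0;
    const MAX_SECS: f64 = 120.0;

    if !(MIN_CONF..=MAX_CONF).contains(&confidence) {
        return None;
    }
    // Fraction of the way from the 40 % anchor to the 100 % anchor.
    let t = (confidence - MIN_CONF) / (MAX_CONF - MIN_CONF);
    Some(Duration::from_secs_f64(MIN_SECS + t * (MAX_SECS - MIN_SECS)))
}
```

For example, a 70 % confidence sits halfway between the anchors and yields a window of about 75 s, while a 39 % confidence yields `None` and the target is never shown to the operator.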
---
## 6. Non-Functional Targets
| Concern | Target | Owner |
|---|---|---|
| Tier 1 latency | ≤100 ms / frame (end-to-end at 1280 px, FP16, batch 1) | `../detections` (autopilot's call budget respects it) |
| Tier 2 latency | ≤200 ms / ROI | `semantic_analyzer` |
| Tier 3 (VLM) latency | ≤5 s / ROI | `vlm_client` |
| ViewPro A40 zoom transition | ≤2 s (medium → high) | `gimbal_controller` |
| Decision-to-movement latency | ≤500 ms | `gimbal_controller` |
| POI rate to operator | ≤5 POIs / min (hard cap) | `scan_controller` |
| Concealed-position recall | ≥60 % | `semantic_analyzer` |
| Concealed-position precision | ≥20 % (operators filter) | `semantic_analyzer` |
| New per-class P / R | ≥80 % | `../detections` |
| Footpath detection recall | ≥70 % | `semantic_analyzer` |
| Movement-candidate enqueue latency | ≤1 s from detection (zoom-out); ≤1.5 s (zoom-in, accommodating gimbal slew) | `movement_detector` |
| Zoom-out → zoom-in transition | ≤2 s including physical zoom | `scan_controller` + `gimbal_controller` |
| Telemetry rate (position) | 1 Hz min, 10 Hz target | `mavlink_layer` |
| Memory budget (semantic + movement + VLM) | ≤6 GB on Jetson Orin Nano (8 GB total, ~2 GB reserved for YOLO) | system-wide |
| Watchdog / retry on MAVLink failures | bounded retry with exponential backoff; explicit max-retry; health flips to red | `mission_executor` |
| Operator command → action latency | ≤500 ms operator-click → outbound MAVLink / gimbal command (excludes modem RTT) | `operator_bridge` + downstream |
| Sustained frame-rate floor | ≥10 fps; below this `scan_controller` suppresses zoom-in transitions and surfaces health → yellow | `frame_ingest` + `scan_controller` |
| MapObjects pre-flight pull | ≤30 s for a 30 km × 30 km mission area; cache-fallback acceptable on timeout | `mission_client` + `mapobjects_store` |
| MapObjects post-flight push | ≤2 min for a 60 min mission's pass diff; bounded retry; persisted on disk if push fails | `mission_client` + `mapobjects_store` |
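The hard POI rate cap in the table above could be enforced with a sliding-window counter along these lines. This is a sketch only: the `PoiRateCap` type and its methods are hypothetical, and reading "≤5 POIs / min" as a sliding 60-second window (as opposed to fixed calendar minutes) is an assumption.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Sliding-window limiter for POIs surfaced to the operator (hypothetical type).
pub struct PoiRateCap {
    max_per_window: usize,
    window: Duration,
    surfaced: VecDeque<Instant>,
}

impl PoiRateCap {
    pub fn new(max_per_window: usize, window: Duration) -> Self {
        Self { max_per_window, window, surfaced: VecDeque::new() }
    }

    /// The cap from the NFR table: at most 5 POIs per 60 s.
    pub fn operator_review_cap() -> Self {
        Self::new(5, Duration::from_secs(60))
    }

    /// Returns `true` and records the POI if it may be surfaced at `now`;
    /// returns `false` if surfacing it would exceed the cap.
    pub fn try_surface(&mut self, now: Instant) -> bool {
        // Drop entries that have aged out of the window.
        while let Some(&oldest) = self.surfaced.front() {
            if now.duration_since(oldest) >= self.window {
                self.surfaced.pop_front();
            } else {
                break;
            }
        }
        if self.surfaced.len() < self.max_per_window {
            self.surfaced.push_back(now);
            true
        } else {
            false
        }
    }
}
```

Passing `now` in explicitly keeps the limiter deterministic under test and replay, which fits the deterministic-state-machine principle; what happens to a POI that is refused (requeue, drop, or defer) is a `scan_controller` policy question this sketch does not answer.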
---
## 7. Detailed Design
This section covers the rewrite-time problem narrative, suite-level concerns (mission regions, MapObjects, MGRS sync, new-vs-existing object detection), constraints, acceptance criteria, the chosen solution architecture, the MAVLink command surface, and the tech stack.
### 7.1 Problem
The reconnaissance winged UAV detects vehicles and military equipment with YOLO, but current high-value targets are camouflaged positions: FPV operator hideouts, hidden artillery emplacements, and dugouts masked by branches. These cannot be found by visual similarity to known object classes alone.
The new approach has three cooperating search engines:
- **Camera sweep** — follow the UAV route at wide or light/medium zoom with left-right gimbal movement to cover terrain and queue POIs.
- **Movement detection** — runs in **both** zoom-out and zoom-in levels (skipped only during target-follow). Per-zoom-band thresholds keep false-positive rate below the operator-review cap; classical OpenCV adequacy at zoom-in is benchmark-gated (Q14).
- **Semantic zoom search** — detect primitives such as black entrances, branch piles, footpaths, roads, trees, and tree blocks, then reason over scene context to find concealed positions.
The system controls a two-level scan:
- **Zoom-out level (wide-area sweep)** — the camera follows the UAV route at wide or light/medium zoom, sweeping left-right across the flight path while detecting primitives, buildings, vehicles, and small motion candidates. Footpath starts, suspicious branch piles, tree rows, movement candidates, and similar POIs are marked with GPS-denied coordinates and queued.
- **Zoom-in level (detailed scan)** — the camera zooms into each queued POI or movement candidate for confirmation. It follows detected footpaths from origin to endpoint, keeps paths centered while the UAV moves, follows the freshest or most promising branch at intersections, holds on endpoints for VLM analysis of branch piles, dark entrances, dugouts, vehicles, or people, and slowly pans broader POIs such as tree rows or clearings. Movement detection continues, scaled for the higher pixel-to-metre ratio. After analysis or timeout, it returns to zoom-out and continues the queue or route.
When an operator confirms a target, the system switches to **target-follow mode**: keep the target centered with gimbal control while the UAV moves, until the operator releases it or tracking is lost.
### 7.2 Mission Regions and Reconnaissance Flow
Mission directions can be vague. Waypoints define a route that passes through multiple regions:
```text
Start → Point1 → Point2 → Point3 → Point4 → Point5 → Point6 → Finish
╔═══════════════╗
║ Region 1 ║
╚═══════════════╝
╔══════════════════╗
║ Region 2 ║
╚══════════════════╝
╔══════════════╗
║ Region 3 ║
╚══════════════╝
```
The autopilot decides the route within each region (1, 2, and 3).
**Alternative scenario — region-only search.** The user selects only a region for the search (no explicit waypoints inside). The autopilot plans its own route within the region.
```text
Start ──┐
│ ╔═══════════════╗
├───►║ Region ║ (contains Points)
│ ╚═══════════════╝
Finish◄─┘
```
**Reconnaissance flow.**
1. The reconnaissance UAV searches within the region and finds potential targets.
2. It sends images to the retranslation UAV.
3. The retranslation UAV forwards them to the human operator.
4. The human operator makes a decision on the target, which the behaviour-tree-driven `scan_controller` logic then acts on (`system-flows.md §F4`).
**Scanning strategy.**
- **Zoom-out level — wide-area scan.** Camera points along the UAV route with left-right swing. The detections service continuously recognises specific patterns as POIs. This initial scan runs at medium zoom while moving between targets. POI types: tree rows (potential caponiers, entrances concealed by tree rows); polygons (areas where military vehicles could be hidden); houses with vehicles or traces; roads and routes on snow or terrain, inside the forest, or near houses.
- **Zoom-in level — detailed scan.** When the camera finds a POI or movement candidate, it zooms in and performs a detailed scan. During detailed scan it searches for trees, caponiers, military vehicles, and so on. Movement detection continues during the zoom-in hold (subject to the per-zoom-band thresholds) so a moving small target found mid-detail-scan is not lost.
### 7.3 Restrictions
**Hardware and camera.**
- Jetson Orin Nano Super: 67 TOPS INT8, 8 GB shared LPDDR5; YOLO uses ~2 GB RAM, leaving ~6 GB for semantic detection, movement detection, and VLM.
- All models use FP16 precision (frozen choice: keep FP16-only for all models).
- Primary camera: ViewPro A40, 1080p (1920×1080), 40× optical zoom, f=4.25–170 mm, Sony 1/2.8" CMOS (IMX462LQR), HDMI or IP output at 1080p 30/60 fps.
- Alternative camera: ViewPro Z40K at higher cost.
- Thermal sensor (640×512, NETD ≤50 mK) is available only as a future enhancement, not a core requirement.
**Operational.**
- Flight altitude: 600–1000 m.
- Support all seasons and terrain types: winter snow, spring mud, summer vegetation, autumn; forest, open field, urban edges, and mixed terrain. (Frozen choice: MVP must cover **all** seasons, not winter-first only.)
- ViewPro A40 40× optical zoom traversal takes 1–2 s; the zoom-out → zoom-in transition must complete within 2 s, including physical zoom.
- Movement detection runs at **both** zoom-out and zoom-in levels, compensates for UAV/gimbal motion, and queues candidates for zoom confirmation; target following starts only after operator confirmation. Per-zoom-band thresholds (cluster persistence, residual-velocity floor, telemetry-skew tolerance) are configurable.
**Software.**
- Inference: TensorRT on Jetson, ONNX Runtime fallback, 1280 px model input, tile splitting for large images.
- VLM must run locally on Jetson with no cloud dependency, as a separate IPC process — not compiled into the autopilot binary.
- YOLO and VLM inference run sequentially because they share GPU memory; no concurrent execution.
**Reliability and safety.**
- **Lost-link failsafe is mandatory.** Loss of the operator/Ground-Station modem link triggers a deterministic ladder in `mission_executor` (default RTL after a 30 s grace; configurable per mission). Loss of the airframe MAVLink link itself triggers immediate health → red and degrades to whatever ArduPilot/PX4's own failsafe dictates.
- **Pre-flight self-test (BIT) gates takeoff.** GPS lock, camera RTSP healthy, gimbal homed (yaw/pitch/zoom feedback within tolerance), `../detections` reachable + warmed, mission loaded + validated, MapObjects pre-flight pull complete (or cached fallback acknowledged with operator confirm), VLM warm (if `vlm_enabled`), persistent-store space ≥ configured floor.
- **Battery / fuel thresholds enforced.** `mission_executor` triggers RTL at battery ≤ configured RTL-floor (e.g. 25 %); land-now at hard-floor (e.g. 15 %); ignored only on operator override. Surfaces health → yellow / red accordingly. Threshold values are mission-configurable.
- **Sustained frame-rate floor.** The floor is ≥10 fps sustained. Below it, `scan_controller` suppresses zoom-in transitions (only TIER 1 + operator overlay continue) and surfaces health → yellow.
- **Wall-clock time source.** Monotonic clock is authoritative for telemetry-skew compensation and tick budgets. Wall-clock is bound to GPS time once GPS is locked (preferred) or NTP-set at boot if reachable; both are recorded with `clock_source` and `last_sync_at`. Drift > 200 ms surfaces health → yellow.
- **On-device storage is bounded.** `mapobjects_store` retention + log buffer have configured caps; on cap-hit, oldest pre-current-mission data is evicted; persistent-store-full pre-flight is a BIT failure.
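The pre-flight self-test gate listed above can be sketched as a pure function over a snapshot of check results. This is a minimal illustration, not the project's BIT implementation: the `BitInputs` struct, its field names, and the check labels are hypothetical, and how each boolean is produced (RTSP probing, gimbal feedback tolerance, and so on) is out of scope here.

```rust
/// Hypothetical snapshot of the pre-flight checks (names are illustrative).
#[derive(Debug, Clone, Copy)]
pub struct BitInputs {
    pub gps_locked: bool,
    pub camera_rtsp_healthy: bool,
    pub gimbal_homed: bool,
    pub detections_warm: bool,
    pub mission_validated: bool,
    /// Pull completed, or cached fallback acknowledged by the operator.
    pub mapobjects_ready: bool,
    pub vlm_enabled: bool,
    pub vlm_warm: bool,
    pub free_store_bytes: u64,
    pub store_floor_bytes: u64,
}

/// Returns the labels of failed checks; an empty list means takeoff may proceed.
pub fn bit_failures(i: &BitInputs) -> Vec<&'static str> {
    let checks = [
        ("gps_lock", i.gps_locked),
        ("camera_rtsp", i.camera_rtsp_healthy),
        ("gimbal_homed", i.gimbal_homed),
        ("detections_warm", i.detections_warm),
        ("mission_validated", i.mission_validated),
        ("mapobjects_pull", i.mapobjects_ready),
        // VLM warm-up only counts when the VLM is enabled.
        ("vlm_warm", !i.vlm_enabled || i.vlm_warm),
        ("store_space", i.free_store_bytes >= i.store_floor_bytes),
    ];
    checks
        .iter()
        .filter(|(_, ok)| !*ok)
        .map(|(name, _)| *name)
        .collect()
}
```

Returning the full list of failures, rather than stopping at the first one, lets the operator see every blocking item in a single pass.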
**Integration and scope.**
- The `../detections` service is FastAPI + Cython + TensorRT in a Docker container on Jetson; consumed via bi-directional gRPC.
- Consume YOLO boxes with class, confidence, and normalised coordinates; output boxes in the same format for operator display.
- Movement candidates and confirmed followed targets use the same normalised box format for operator display.
- GPS coordinates come from the GPS-denied service (`../_docs/11_gps_denied.md`) and are out of scope for autopilot's own implementation.
- **MapObjects sync** uses the central `missions` API extension `/missions/{id}/mapobjects` (pre-flight GET, post-flight POST). Schema in §7.13.
- Annotation tooling, training pipeline, and data-collection automation are separate repositories and out of scope.
- GPS-denied navigation is a separate project; mission planning and route selection inside a region remain in autopilot.
**Frozen choices (2026-05-06, updated 2026-05-18).** Gating decisions for downstream design:
1. **Tier 1 remains FP16-only** for all models. INT8 is rejected for MVP.
2. **MVP acceptance requires all seasons**, not winter-first only.
3. **Operator-review cap is ≤5 POIs/minute** (moderate cap chosen).
4. **Movement detection assumes timestamped video, gimbal angle/zoom, and UAV motion telemetry** for MVP. Naive frame-differencing is rejected. Movement detection runs at both zoom-out and zoom-in; classical OpenCV adequacy at zoom-in is benchmark-gated (Q14).
5. **Local VLM is required for MVP** if and only if the exact model satisfies ≤5 s/ROI and the memory budget; otherwise VLM is disabled for MVP and `scan_controller` operates without it.
6. **MapObjects are mission-bracketed and centrally synchronised** via the `missions` API. In-flight sync is **batched only** for MVP (no streaming over modem).
7. **Operator commands are authenticated, signed, and replay-protected.** Modem-link encryption alone is not sufficient.
### 7.4 Acceptance Criteria
**Latency.**
| Tier | Target | Hardware |
|---|---|---|
| Tier 1 fast probe (YOLO26 + YOLOE-26) | ≤100 ms/frame | Jetson Orin Nano Super |
| Tier 2 fast confirmation (custom CNN) | ≤200 ms/ROI | Jetson Orin Nano Super |
| Tier 3 optional deep analysis (VLM) | ≤5 s/ROI | Jetson Orin Nano Super |
**YOLO object detection.**
- Add classes: black entrances of various sizes, branch piles, footpaths, roads, trees, and tree blocks.
- New classes target: P ≥80 %, R ≥80 %; existing class performance must not degrade.
- Baseline reference: current YOLO achieves P=81.6 %, R=85.2 % on non-masked objects.
**Semantic detection.**
- Initial concealed-position recall: ≥60 %, accepting high false positives for later reduction.
- Initial concealed-position precision: ≥20 %, with operators filtering candidates.
- Footpath detection recall: ≥70 %.
- Pipeline consumes YOLO primitives (footpaths, roads, branch piles, entrances, trees), assesses path freshness, traces paths to endpoints, identifies concealed structures, and follows the freshest or most promising branch at intersections.
**Movement detection.**
- During the zoom-out sweep, detect small moving point/cluster candidates that are not yet classifiable and enqueue them for zoom confirmation within 1 s.
- During the zoom-in hold, continue movement detection (independent residual-motion clustering, scaled for the zoomed pixel-to-metre ratio) so a moving small target appearing inside a held POI is not lost; enqueue within 1.5 s.
- Account for UAV and gimbal motion: stable objects (trees, houses, roads, terrain) must not be treated as moving only because the camera platform moves.
- Movement candidates become zoom-in POIs; after zoom, the system attempts semantic / YOLO confirmation as vehicle, people, or other relevant target.
- Zoom-in adequacy of classical OpenCV optical-flow / global-motion estimation is benchmark-gated. If the false-positive rate at zoom-in exceeds the per-zoom-band budget, fall back to a learned optical-flow / CNN-based motion module behind a feature flag (Q14).
**Scan and camera control.**
- Zoom-out level covers the planned route with a wide or light/medium-zoom left-right sweep; POIs include footpaths, tree rows, branch piles, black entrances, movement candidates, houses with vehicles or traces, and roads on snow / terrain / forest.
- Transition zoom-out → zoom-in within 2 s of POI detection, including physical zoom from medium to high.
- Zoom-in level keeps camera lock while the UAV flies, compensates for aircraft motion, pans along footpaths or movement candidates so they stay visible and centered, holds endpoints for VLM analysis up to 2 s, and returns to zoom-out after analysis or configurable timeout (default 5 s/POI).
- After operator confirmation, target-follow mode keeps the target in the centre 25 % of frame while visible, until operator release, target loss, or timeout.
- Gimbal module commands ViewPro A40 pan/tilt/zoom with ≤500 ms decision-to-movement latency, smooth transitions, and footpaths/moving targets kept centered during pan.
- Maintain an ordered POI queue prioritised by confidence and proximity to current camera position.
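As an illustration of the POI queue requirement, the sketch below orders candidates by confidence and proximity and applies a sliding-window limiter for the ≤5 POIs/minute operator-review cap from the frozen choices. The `Poi` fields, the scoring formula, and its 0.3 weight and 90° normaliser are placeholder assumptions, not tuned or agreed values; aging is omitted.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

#[derive(Debug, Clone)]
pub struct Poi {
    pub id: u32,
    /// Detector confidence in 0.0..=1.0.
    pub confidence: f32,
    /// Gimbal travel needed from the current camera position (assumed metric).
    pub angular_distance_deg: f32,
}

/// Hypothetical score: higher confidence and shorter gimbal travel rank first.
pub fn score(p: &Poi) -> f32 {
    p.confidence - 0.3 * (p.angular_distance_deg / 90.0).min(1.0)
}

/// Sorts the queue so the best candidate comes first.
pub fn prioritise(queue: &mut Vec<Poi>) {
    queue.sort_by(|a, b| score(b).total_cmp(&score(a)));
}

/// Sliding-window limiter for the operator-review cap.
pub struct ReviewCap {
    window: Duration,
    max: usize,
    surfaced: VecDeque<Instant>,
}

impl ReviewCap {
    pub fn new(max: usize, window: Duration) -> Self {
        Self { window, max, surfaced: VecDeque::new() }
    }

    /// Returns true, and records the event, if another POI may be surfaced at `now`.
    pub fn try_surface(&mut self, now: Instant) -> bool {
        // Drop events that have left the window.
        while let Some(&t) = self.surfaced.front() {
            if now.duration_since(t) >= self.window {
                self.surfaced.pop_front();
            } else {
                break;
            }
        }
        if self.surfaced.len() < self.max {
            self.surfaced.push_back(now);
            true
        } else {
            false
        }
    }
}
```

A caller would construct the limiter as `ReviewCap::new(5, Duration::from_secs(60))` and call `try_surface` before pushing a POI to the operator; rejected POIs stay in the queue.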
**Resources and data.**
- Semantic module + movement module + VLM RAM: ≤6 GB on Jetson Orin Nano Super.
- Must coexist with the running YOLO pipeline without degrading YOLO performance.
- Training data: hundreds to thousands of annotated images/sequences across all seasons and terrain types.
- Dedicated annotation needed for black entrances, branch piles, footpaths, roads, trees, and tree blocks; available dataset assembly effort is 1.5 months at 5 hours/day.
### 7.5 Training Data
**Source.**
- Aerial imagery from reconnaissance winged UAVs at 600–1000 m altitude.
- ViewPro A40 camera, 1080p resolution, various zoom levels.
- Extracted from video frames and still images.
- Movement detection requires frame sequences, not still images only; include camera/gimbal telemetry where available to separate target motion from UAV motion.
**Target classes.**
- Footpaths / trails (linear features on snow, mud, forest floor).
- Fresh footpaths (distinct edges, undisturbed surroundings, recent track marks).
- Stale footpaths (partially covered by snow / vegetation, faded edges).
- Concealed structures: branch-pile hideouts, dugout entrances, squared / circular openings.
- Tree rows (potential concealment lines).
- Open clearings connected to paths (FPV launch points).
- Moving point/cluster candidates across the full zoom range (wide, light/medium, full zoom-in) — sequences must include both zoom-out and zoom-in examples to support per-zoom-band threshold tuning.
**YOLO primitive classes (new).**
- Black entrances to hideouts (various sizes).
- Piles of tree branches.
- Footpaths.
- Roads.
- Trees, tree blocks.
**Annotation format.**
- Managed by existing annotation tooling in a separate repository.
- Expected: bounding boxes and/or segmentation masks depending on model architecture.
- Footpaths may require polyline or segmentation annotation rather than bounding boxes.
**Seasonal coverage.**
- Winter: snow-covered terrain (footpaths as dark lines on white).
- Spring: mud season (footpaths as compressed/disturbed soil).
- Summer: full vegetation (paths through grass/undergrowth).
- Autumn: mixed leaf cover, partial snow.
**Volume.**
- Target: hundreds to thousands of annotated images/sequences.
- Available effort: 1.5 months at 5 hours/day.
- Potential for annotation-process automation.
### 7.6 Solution Architecture
A two-level onboard scan system (zoom-out wide sweep + zoom-in confirmation). The system delegates Tier 1 detection to the existing FastAPI / Cython / TensorRT YOLO service (`../detections`), adds a central scan/perception scheduler (`scan_controller`), compensates motion using synchronised video / gimbal / UAV telemetry (movement detection runs at both zoom levels), controls the ViewPro A40 through a deterministic state machine, and invokes a secured local VLM process only for bounded zoom-in confirmation.
Before implementation decomposition, the project must pass a **benchmark gate** on target hardware: Tier 1 latency, Tier 2 ROI latency, VLM latency / memory, A40 zoom timing, movement-replay false-positive rate, and all-season dataset readiness.
```text
Video frames + timestamped gimbal/zoom/UAV telemetry
|
v
Input validation + telemetry synchronisation
|
v
Central scan/perception scheduler (scan_controller)
|
+---> Existing FastAPI/Cython TensorRT service (../detections)
| YOLO26 + YOLOE-26 fixed-class FP16 engines
|
+---> Movement detector (active in ZoomedOut and ZoomedIn)
| OpenCV ego-motion compensation + residual clusters,
| per-zoom-band thresholds; learned-CV fallback Q14
|
+---> Tier 2 semantic analyzer
| primitive graph + lightweight ROI CNN (zoom-in only)
|
v
POI queue (confidence + proximity + aging + <=5 POIs/min cap)
|
+---> ViewPro A40 state-machine controller
|
+---> Secured local VLM IPC (optional, benchmark-gated)
NanoLLM VILA1.5-3B, structured VlmAssessment output
```
#### Benchmark gate
The first implementation milestone is a proof suite, not product code. It validates:
- YOLO26 + YOLOE-26 FP16 TensorRT, fixed 1280 px, batch 1, end-to-end ≤100 ms/frame.
- Tier 2 primitive graph + lightweight CNN ≤200 ms/ROI.
- NanoLLM VILA1.5-3B local VLM ≤5 s/ROI and within remaining memory budget while the YOLO container is present.
- ViewPro A40 medium-to-high zoom transition and command-to-movement latency.
- Movement replay false-positive rate **measured independently** at zoom-out and zoom-in, under the ≤5 POIs/minute operator-review cap. If zoom-in exceeds the per-zoom-band cap with classical CV, the learned-CV fallback (Q14) becomes a benchmark-gate prerequisite for the zoom-in scope.
- All-season dataset readiness and hard-negative coverage.
#### Tier 1 primitive detector
Use custom-trained fixed-class YOLO26 and YOLOE-26 TensorRT FP16 engines, owned by `../detections`. Runtime open-vocabulary prompt mutation is **not** part of MVP; fixed project classes or pre-baked embeddings are required. Outputs remain normalised boxes for operator display, with optional masks or path geometry passed as POI metadata.
#### Tier 2 semantic analyzer
Use a primitive graph plus a lightweight ROI CNN to reason over paths, branch piles, dark entrances, roads, trees, tree blocks, clearings, vehicles, people, and endpoint context. This layer owns path freshness, endpoint scoring, branch choice at intersections, and concealment-POI scoring. Active in the zoom-in level only.
#### Movement detector
Active at **both** zoom-out and zoom-in (skipped only during target-follow). Use OpenCV optical flow / global-motion estimation fused with timestamped gimbal angle, zoom state, and UAV motion telemetry. Naive frame differencing is rejected because it cannot distinguish target motion from platform motion. A telemetry synchronisation contract specifies maximum tolerated frame ↔ gimbal ↔ zoom ↔ UAV timestamp skew before motion compensation; out-of-tolerance samples must be rejected or downgraded.
**Per-zoom-band tuning.** Cluster persistence threshold, residual-velocity floor, and telemetry-skew tolerance are configured per zoom band (zoom-out, zoom-in). The pixel-to-metre ratio differs by ~10× between bands, so identical residual pixel motion implies very different physical motion; thresholds must scale.
**Adequacy at zoom-in (research item, Q14).** Classical optical flow / global-motion estimation is well-validated at zoom-out (UAV cruising, gimbal sweeping, large FOV; ego-motion is the dominant signal and easily fitted). At zoom-in the gimbal is actively path-following, the FOV is narrow, motion blur from any small command is large, and the homography model degrades. The benchmark gate (above) MUST measure the false-positive rate at zoom-in independently from zoom-out; if it exceeds the per-zoom-band cap, the implementation falls back to a learned optical-flow module (e.g. RAFT-derived) or a CNN-based motion-segmentation module behind a feature flag, while keeping the same input/output contract.
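The per-zoom-band configuration and the telemetry-skew rejection rule could look roughly like the sketch below. All type and field names are hypothetical, skew is measured here as the spread between the earliest and latest of the four timestamps (one possible reading of the contract), and only the "reject" branch is modelled; downgrading out-of-tolerance samples is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ZoomBand {
    ZoomOut,
    ZoomIn,
}

/// The three per-band thresholds named in the text (units are assumptions).
#[derive(Debug, Clone, Copy)]
pub struct BandThresholds {
    pub min_cluster_persistence_frames: u32,
    pub residual_velocity_floor_px_s: f32,
    pub max_telemetry_skew_ms: u32,
}

#[derive(Debug, Clone, Copy)]
pub struct MovementConfig {
    pub zoom_out: BandThresholds,
    pub zoom_in: BandThresholds,
}

impl MovementConfig {
    pub fn for_band(&self, band: ZoomBand) -> BandThresholds {
        match band {
            ZoomBand::ZoomOut => self.zoom_out,
            ZoomBand::ZoomIn => self.zoom_in,
        }
    }
}

/// Monotonic-clock timestamps (ms) of the samples fused for one frame.
#[derive(Debug, Clone, Copy)]
pub struct SampleTimes {
    pub frame_ms: u64,
    pub gimbal_ms: u64,
    pub zoom_ms: u64,
    pub uav_ms: u64,
}

/// Spread between the earliest and latest timestamp in the sample set.
pub fn max_skew_ms(s: &SampleTimes) -> u64 {
    let all = [s.frame_ms, s.gimbal_ms, s.zoom_ms, s.uav_ms];
    let max = all.iter().copied().max().unwrap_or(0);
    let min = all.iter().copied().min().unwrap_or(0);
    max - min
}

/// True when the sample set is within the skew tolerance of the active band.
pub fn accept_sample(cfg: &MovementConfig, band: ZoomBand, s: &SampleTimes) -> bool {
    max_skew_ms(s) <= u64::from(cfg.for_band(band).max_telemetry_skew_ms)
}
```

Keeping both bands in one config struct makes it explicit that the same detector code runs at both zoom levels with different numbers.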
#### Scan controller and POI queue
Use a deterministic typed state machine with **`ZoomedOut`**, **`ZoomedIn { roi, hold_started_at }`**, and **`TargetFollow { target_id, started_at }`** states. The queue is ordered by confidence, proximity, and aging while enforcing the ≤5 POIs/minute operator-review cap. The controller handles timeouts, target loss, VLM waits, return-to-zoom-out, and target-follow centre-window behaviour. The full behaviour-tree spec — including tick scenarios and the 15 fixed-wing rules — lives in `system-flows.md §F4`.
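A minimal sketch of the three-state machine, assuming a simplified event set. The state names and their fields follow the text; `Roi`, `ScanEvent`, and the `step` function are hypothetical, and VLM waits, the target-follow timeout, and the fixed-wing rules from `system-flows.md §F4` are deliberately left out. Accepting operator confirmation from either zoom state is also an assumption.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, PartialEq)]
pub struct Roi {
    pub poi_id: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ScanState {
    ZoomedOut,
    ZoomedIn { roi: Roi, hold_started_at: Instant },
    TargetFollow { target_id: u32, started_at: Instant },
}

/// Hypothetical event set; the real controller has more inputs.
#[derive(Debug, Clone, PartialEq)]
pub enum ScanEvent {
    PoiDequeued(Roi),
    AnalysisDone,
    OperatorConfirmed { target_id: u32 },
    OperatorReleased,
    TargetLost,
    Tick,
}

/// Pure transition function: same inputs always give the same next state.
pub fn step(state: ScanState, event: ScanEvent, now: Instant, hold_timeout: Duration) -> ScanState {
    use ScanEvent::*;
    use ScanState::*;
    match (state, event) {
        (ZoomedOut, PoiDequeued(roi)) => ZoomedIn { roi, hold_started_at: now },
        (ZoomedIn { .. }, AnalysisDone) => ZoomedOut,
        // Per-POI hold timeout returns the camera to the wide sweep.
        (ZoomedIn { hold_started_at, .. }, Tick)
            if now.duration_since(hold_started_at) >= hold_timeout =>
        {
            ZoomedOut
        }
        (ZoomedOut, OperatorConfirmed { target_id })
        | (ZoomedIn { .. }, OperatorConfirmed { target_id }) => {
            TargetFollow { target_id, started_at: now }
        }
        (TargetFollow { .. }, OperatorReleased) | (TargetFollow { .. }, TargetLost) => ZoomedOut,
        // Every other combination leaves the state unchanged.
        (s, _) => s,
    }
}
```

Passing `now` in as an argument, instead of reading the clock inside `step`, keeps the transition function deterministic and easy to replay in tests.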
#### Local VLM confirmation
Run NanoLLM with VILA1.5-3B through a separate local IPC process **if** the benchmark gate passes. Use one bounded ROI crop, short prompt, short answer, and a validated `VlmAssessment` schema. Free-form VLM text is not a downstream API. The IPC channel uses Unix-domain socket permissions and peer-credential checks where available.
**Optionality model.** VLM is the only optional Tier in the system. Two complementary mechanisms model this:
1. **Runtime configuration flag (`vlm_enabled`)**, gated by the benchmark-gate result. When the flag is `false`, `scan_controller` skips the VLM-confirmation step and proceeds with Tier 2 evidence alone for the zoom-in hold; the operator timeout still applies.
2. **Build-time feature module.** The `vlm_client` component is a separate module behind a feature flag; the binary must build, link, and run identically when the module is absent. `scan_controller` MUST NOT contain a hard dependency on `vlm_client`'s presence — it depends only on a `VlmAssessment` provider trait whose default implementation returns `status: vlm_disabled`.
The implementation chooses one of these (or both); both must yield the same observable behaviour: the system functions correctly with VLM absent, only losing the zoom-in confirmation step.
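The provider-trait mechanism can be sketched as follows. Only `VlmAssessment` and the `vlm_disabled` status come from the text; the trait name, the other fields, `RoiCrop`, `NoVlm`, and the 50/50 score blend in `confirm_with_vlm` are illustrative assumptions.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum VlmStatus {
    Ok,
    VlmDisabled,
    Timeout,
}

/// Structured result; the real schema lives elsewhere in the docs.
#[derive(Debug, Clone, PartialEq)]
pub struct VlmAssessment {
    pub status: VlmStatus,
    pub label: Option<String>,
    pub confidence: Option<f32>,
}

/// One bounded ROI crop handed to the provider (hypothetical shape).
pub struct RoiCrop<'a> {
    pub jpeg: &'a [u8],
}

pub trait VlmAssessmentProvider {
    /// Default implementation: behaves as if the VLM module is absent.
    fn assess(&self, _roi: &RoiCrop<'_>) -> VlmAssessment {
        VlmAssessment { status: VlmStatus::VlmDisabled, label: None, confidence: None }
    }
}

/// Provider used when `vlm_enabled` is false or the feature is compiled out.
pub struct NoVlm;
impl VlmAssessmentProvider for NoVlm {}

/// Caller side: depends only on the trait, never on a concrete VLM client.
pub fn confirm_with_vlm(
    provider: &dyn VlmAssessmentProvider,
    roi: &RoiCrop<'_>,
    tier2_score: f32,
) -> f32 {
    match provider.assess(roi) {
        // Placeholder blend; the real fusion rule is not specified here.
        VlmAssessment { status: VlmStatus::Ok, confidence: Some(c), .. } => {
            0.5 * (tier2_score + c)
        }
        // Disabled, timed out, or malformed: fall back to Tier 2 evidence alone.
        _ => tier2_score,
    }
}
```

With this shape, a build without the VLM module only needs `NoVlm`, and a runtime flag can select between `NoVlm` and a real client without the caller changing.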
#### Integration and reliability
Preserve the normalised-box contract while adding POI metadata. A central scheduler (`scan_controller`) owns GPU-heavy work and enforces no concurrent YOLO/VLM execution. No silent exception swallowing; health must reflect every dependency listed in §5.
#### Security and operational controls
- Validate image / ROI payload size and format before decoding or inference.
- Use patched OpenCV versions and an image-format allow-list.
- Enforce local IPC authorisation and payload limits for the VLM process (Unix-domain socket permissions plus peer-credential checks).
- Log POI creation reasons, source detections, queue decisions, gimbal commands, VLM requests, operator confirmations, and failure states.
- Keep VLM local with no cloud egress.
### 7.7 MAVLink and Piloting
`mavlink_layer` is a hand-rolled MAVLink v2 transport. There is no third-party SDK dependency. The layer owns serialisation / deserialisation, heartbeat, sequence numbers, retry, and a single connection abstraction (UDP or serial, picked at startup from CLI / env).
**Command surface (~10–15 commands).** Only what the system actually needs:
| MAVLink message | Direction | Used by | Purpose |
|---|---|---|---|
| `HEARTBEAT` | bidirectional | `mavlink_layer` | liveness + GCS-vs-companion identification |
| `COMMAND_LONG` (subset) | out | `mission_executor` | arm / disarm, takeoff, set-mode, change-speed, change-alt, land, RTL |
| `COMMAND_ACK` | in | `mavlink_layer` | command-result demux, retry trigger |
| `MISSION_COUNT` | out | `mission_executor` | pre-upload count |
| `MISSION_REQUEST_INT` | in | `mission_executor` | pull-side mission upload |
| `MISSION_ITEM_INT` | out | `mission_executor` | per-waypoint upload |
| `MISSION_ACK` | in | `mission_executor` | upload completion |
| `MISSION_SET_CURRENT` | out | `mission_executor` | start at item 0 |
| `MISSION_CURRENT` | in | `mission_executor` | progress |
| `MISSION_ITEM_REACHED` | in | `mission_executor` | progress |
| `MISSION_CLEAR_ALL` | out | `mission_executor` | reset before re-upload (e.g., middle waypoint) |
| `GLOBAL_POSITION_INT` | in | `telemetry_stream`, `mission_executor` | live position |
| `ATTITUDE` | in | `telemetry_stream` | attitude for operator overlay |
| `SYS_STATUS` / `EXTENDED_SYS_STATE` | in | health aggregator | mode, battery, sensor health |
| `STATUSTEXT` | in | logger | autopilot diagnostic lines |
| `SET_MODE` (or `COMMAND_LONG MAV_CMD_DO_SET_MODE`) | out | `mission_executor` | flight-mode transitions for fixed-wing |
If the autopilot link supports MAVLink-2 message signing it is enabled; otherwise the link is treated as trusted (it is point-to-point on a closed serial / UDP path on the airframe).
**Piloting variants.** `mission_executor` runs one of two state machines depending on the airframe declared at startup:
- **Multirotor variant**: `DISCONNECTED → CONNECTED → HEALTH_OK → ARMED → TAKE_OFF → MISSION_UPLOADED → FLY_MISSION → LAND`. The executor arms, takes off to a configured altitude, and only then uploads + starts the mission. Bounded retry with exponential backoff at every transition; explicit max-retry; on exceeding it, health flips to red and the executor surfaces the failure via the operator bridge.
- **Fixed-wing variant**: `DISCONNECTED → CONNECTED → HEALTH_OK → MISSION_UPLOADED → WAIT_AUTO → FLY_MISSION → LAND`. The executor skips arm/takeoff (the airframe is assumed already airborne under RC control), uploads the mission, and waits for the operator to switch the airframe into AUTO mode via RC. Same retry policy.
**Geofence enforcement.** `mission_executor` honours both INCLUSION and EXCLUSION polygons declared in the mission. INCLUSION violations halt forward progress and trigger return-to-launch (RTL); EXCLUSION violations trigger the same. The earlier C++ implementation parsed but silently ignored EXCLUSION; the new design rejects that behaviour explicitly.
**Mission uploads and middle-waypoint inserts.** When the operator confirms a target, `operator_bridge` hands a middle-waypoint hint to `mission_executor`. The executor recomputes the mission (current-position → middle-waypoint → resume original route), clears the existing autopilot mission via `MISSION_CLEAR_ALL`, re-uploads the new mission via the standard `MISSION_COUNT` / `MISSION_ITEM_INT` / `MISSION_ACK` sequence, and resumes flight. After target-follow ends (operator release, target loss, or timeout), the same sequence reverts to the original mission.
**Lost-link failsafe (operator/Ground-Station modem link).** A typed failsafe ladder runs in `mission_executor`, evaluated each tick:
| Stage | Trigger | Action |
|---|---|---|
| `LinkOk` | last operator heartbeat ≤ 5 s | continue mission; no behavioural change |
| `LinkDegraded` | 5 s < last heartbeat ≤ 30 s | continue mission; surface health → yellow; queue all POI surface-events for replay-on-recovery |
| `LinkLost` | last heartbeat > 30 s **and** target-follow inactive | trigger RTL via `MAV_CMD_NAV_RETURN_TO_LAUNCH`; log mission abort with reason; continue logging the mission diff for post-flight upload via `mapobjects_store` |
| `LinkLostInFollow` | last heartbeat > 30 s **and** in target-follow | hold target-follow for an additional 30 s grace (operator may have momentarily lost link); thereafter fall through to `LinkLost` |
The grace windows (5 s, 30 s, 30 s) are mission-configurable. **MAVLink-link loss to ArduPilot/PX4 itself** is not the same event — it triggers immediate health → red and falls through to whatever the airframe autopilot's own failsafe does (we do NOT override it).
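The ladder in the table above maps directly onto a small pure function evaluated each tick. The stage names and default windows come from the table; the `LinkWindows` struct, its field names, and the `link_stage` signature are assumptions made for this sketch.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LinkStage {
    LinkOk,
    LinkDegraded,
    LinkLost,
    LinkLostInFollow,
}

/// Mission-configurable grace windows (field names are illustrative).
#[derive(Debug, Clone, Copy)]
pub struct LinkWindows {
    pub ok: Duration,
    pub lost: Duration,
    pub follow_grace: Duration,
}

impl Default for LinkWindows {
    fn default() -> Self {
        Self {
            ok: Duration::from_secs(5),
            lost: Duration::from_secs(30),
            follow_grace: Duration::from_secs(30),
        }
    }
}

/// Classifies the operator link from the time since the last heartbeat.
pub fn link_stage(since_heartbeat: Duration, in_target_follow: bool, w: &LinkWindows) -> LinkStage {
    if since_heartbeat <= w.ok {
        LinkStage::LinkOk
    } else if since_heartbeat <= w.lost {
        LinkStage::LinkDegraded
    } else if in_target_follow && since_heartbeat <= w.lost + w.follow_grace {
        // Extra grace while following a target, then fall through to LinkLost.
        LinkStage::LinkLostInFollow
    } else {
        LinkStage::LinkLost
    }
}
```

The actions attached to each stage (health colour, RTL command, event queuing) would sit in the caller; keeping the classification separate makes the thresholds easy to test in isolation.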
**Battery / fuel thresholds.** `mission_executor` reads `SYS_STATUS` / `EXTENDED_SYS_STATE` and enforces:
- `battery ≤ rtl_threshold` (default 25 %) → trigger RTL, log reason, continue post-mission upload.
- `battery ≤ hard_floor` (default 15 %) → land-now via `MAV_CMD_NAV_LAND` at safest reachable point; surface health → red.
Operator override is permitted via a signed command (per Q9); without it, the thresholds are hard.
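A sketch of the threshold check, assuming the remaining charge is already available as a percentage and that the override flag has been verified upstream (signature checking is not shown). The type and function names are hypothetical; the defaults in the usage note are the ones quoted above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BatteryAction {
    Continue,
    ReturnToLaunch,
    LandNow,
}

#[derive(Debug, Clone, Copy)]
pub struct BatteryThresholds {
    pub rtl_pct: u8,
    pub hard_floor_pct: u8,
}

/// Decides the action for the current charge level.
/// `verified_override` is assumed to come from an already-authenticated,
/// signed operator command; this sketch does not verify anything itself.
pub fn battery_action(
    remaining_pct: u8,
    t: &BatteryThresholds,
    verified_override: bool,
) -> BatteryAction {
    if verified_override {
        return BatteryAction::Continue;
    }
    // Check the hard floor first so it wins when both thresholds are crossed.
    if remaining_pct <= t.hard_floor_pct {
        BatteryAction::LandNow
    } else if remaining_pct <= t.rtl_pct {
        BatteryAction::ReturnToLaunch
    } else {
        BatteryAction::Continue
    }
}
```

With `BatteryThresholds { rtl_pct: 25, hard_floor_pct: 15 }`, a reading of 20 % yields `ReturnToLaunch` and 15 % yields `LandNow`. Whether an override should also suppress the hard floor is a policy decision this sketch does not settle.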
**Connection configuration.** A single connection URI at startup: `udp://...` or `serial:///dev/...`. No runtime URI swap.
**Frames and altitudes.** All waypoints in the mission API use `MAV_FRAME_GLOBAL_RELATIVE_ALT`. Terrain-following frames are not used (no SRTM database on the airframe).
### 7.8 Detection Classes
These classes extend the default seed set used by the detections service.
| Class | Local Name (UA) | Notes |
|-----------------|-----------------|----------------------------|
| Rows of trees | Посадка | Linear vegetation cover |
| Trenches/Ditches| Рів | Linear earthwork features |
| Trash piles | Сміття | Indicators of activity |
| Tire tracks | Сліди від шин | Signs of movement |
Plus the new YOLO primitive classes from §7.5 Training Data: black entrances of various sizes, branch piles, footpaths, roads, trees, and tree blocks.
### 7.9 MapObjects (H3 spatial index)
`MapObjects` are created and managed internally by autopilot. There is **no** standalone MapObjects service: in flight, autopilot reads and writes them directly in the on-device store (`mapobjects_store`). The only external touchpoints are the pre-flight pull and post-flight push through the `missions` API extension (§7.13) and the delete cascade in `DELETE /missions/{id}` (per the suite-level missions API).
Autopilot needs to store objects on a 2D map efficiently in order to find differences fast:
- New objects (new pile of trash, new tire tracks).
- Changed objects.
- Removed objects.
Each object on the map is described by:
- `gps(lat, lon)` — geographic position.
- `size(width, height)` — bounding area.
**Spatial indexing.** Use a hexagonal spatial index to efficiently store and query objects by location.
**Approach:** H3 library (by Uber) — hierarchical hexagonal geospatial indexing system.
| Aspect | Detail |
|---------------------|--------------------------------------------|
| Library | H3 (`h3rs` crate for Rust) |
| Algorithm basis | 3D icosahedron → 2D hexagonal tessellation |
| Key advantage | Uniform area cells, good neighbour queries |
| Open question | Optimal tile/resolution size |
| Known issue | Discontinuity problem at cell boundaries |
The hexagonal grid avoids the distortion problems of square grids and provides consistent neighbour relationships, making it suitable for fast spatial diff operations (detecting new, changed, and removed objects).
### 7.10 Drone ⇄ Operator Sync Message Format
Detection data is synced between drone and operator using a compact message format. MGRS (Military Grid Reference System) is used as the primary coordinate encoding — compact, standardised, and directly usable on military maps.
**Drone → Operator (detection report):**
```text
missionId :: MGRS(encoded) :: class :: confidence :: size_width_m :: size_length_m :: photo_metadata :: flags
```
**Operator → Drone (command/acknowledgment):**
```text
missionId :: Encoded(GroundMGRS :: Time) :: ... :: missionId2
```
Wire-level field semantics live in `data_model.md §MGRS sync message`.
### 7.11 Target Relocation / Movement Analysis
The system maintains a live **map of objects** and detects changes between survey passes.
**Map update types.**
| Type | Meaning |
|---------|----------------------------------------------|
| New | Object not seen before in this area |
| Moved | Object of same class appeared nearby |
| Removed | Previously recorded object no longer present |
**Map hashtable.** Objects are stored in a hashtable keyed by MGRS grid reference:
```text
MGRS1 -> Object1
MGRS2 -> Object5
MGRS12 -> Object2
MGRSN -> ObjectM
```
### 7.12 New vs Existing / Moved / Removed Object Detection
When a detection occurs, the system must determine whether the object is **new**, **moved**, or **already known**. This must be done efficiently in real time. This is the implementation of `scan_controller`'s map-diff responsibilities; it lives in `mapobjects_store`.
**Algorithm.**
```text
On each detection(gps, class, confidence, size):
  1. Compute H3 cell index at chosen resolution (e.g. res 10, ~66 m average edge).
  2. Build composite key = H3_cell + class.
  3. Query k-ring(H3_cell, k=2) -> get all neighbouring cells.
  4. For each neighbouring cell, look up objects with same or similar class:
       similar_classes = {military_vehicle, tank, artillery}  (configurable groups)
  5. Compare:
       - If matching object found within distance_threshold (config, e.g. 50 m)
         AND same class group -> EXISTING (or MOVED if position delta > move_threshold).
       - If no match -> NEW -> insert into map with H3 hash key.
  6. After full sweep: objects in the region that were NOT re-observed -> REMOVED candidates.
```
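Step 5 of the algorithm (the comparison itself) can be sketched independently of the spatial index. The code below assumes the candidate list has already been gathered from the neighbouring cells and that class groups have been resolved to a numeric id; it uses a plain haversine distance on a spherical Earth rather than any H3 call. All type and function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LatLon {
    pub lat_deg: f64,
    pub lon_deg: f64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct MapObject {
    pub id: u64,
    pub pos: LatLon,
    /// Id of the configurable "similar classes" group.
    pub class_group: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Match {
    New,
    Existing { id: u64 },
    Moved { id: u64, delta_m: f64 },
}

/// Great-circle distance in metres (haversine, spherical Earth).
pub fn haversine_m(a: LatLon, b: LatLon) -> f64 {
    const R: f64 = 6_371_000.0;
    let (p1, p2) = (a.lat_deg.to_radians(), b.lat_deg.to_radians());
    let dphi = p2 - p1;
    let dlam = (b.lon_deg - a.lon_deg).to_radians();
    let h = (dphi / 2.0).sin().powi(2) + p1.cos() * p2.cos() * (dlam / 2.0).sin().powi(2);
    2.0 * R * h.sqrt().asin()
}

/// Classifies a detection against candidates from the neighbouring cells.
pub fn classify(
    det_pos: LatLon,
    det_group: u32,
    candidates: &[MapObject],
    distance_threshold_m: f64,
    move_threshold_m: f64,
) -> Match {
    // Nearest same-group object within the matching distance, if any.
    let nearest = candidates
        .iter()
        .filter(|o| o.class_group == det_group)
        .map(|o| (o.id, haversine_m(det_pos, o.pos)))
        .filter(|(_, d)| *d <= distance_threshold_m)
        .min_by(|a, b| a.1.total_cmp(&b.1));
    match nearest {
        None => Match::New,
        Some((id, d)) if d > move_threshold_m => Match::Moved { id, delta_m: d },
        Some((id, _)) => Match::Existing { id },
    }
}
```

REMOVED candidates (step 6) need a per-sweep "re-observed" set and are not covered here.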
**Why H3 + MGRS.**
| Step | Mechanism | Complexity |
|--------------------------|----------------------------|------------|
| Spatial cell lookup | H3 `latlng_to_cell` | O(1) |
| Neighbour query | H3 `grid_disk(k=2)` | O(1) |
| Object lookup per cell | Hashtable by `MGRS+class` | O(1) |
| Total per detection | ~constant time | O(k²) |
**Configurable parameters.**
| Parameter | Example Value | Purpose |
|----------------------|---------------|------------------------------------------------------|
| search_radius_km | 30 | Max radius to search for previously known objects |
| distance_threshold_m | 50 | Max distance to consider same object |
| move_threshold_m | 10 | Min displacement to flag as "moved" |
| h3_resolution        | 10            | ~66 m average edge length (~15,000 m² cells)         |
| similar_classes | per config | Class groups treated as equivalent for matching |
**Notes.**
- The 30 km radius is for the broad initial query ("get all previously stored objects within 30 km"). H3 `grid_disk` at resolution 10 with k=2 covers roughly 240 m around the detection (average edge length ≈66 m), which comfortably contains the 50 m `distance_threshold_m` and handles fine-grained matching. For the broad query, use a coarser H3 resolution (e.g. res 4, ~22 km edge) as a pre-filter.
- `MGRS+class` is the composite key for the hashtable so that lookups are partitioned by both location and object type.
- The discontinuity problem at H3 cell boundaries is solved by always querying the k-ring (centre cell + neighbours), ensuring objects near an edge are still matched.
### 7.13 MapObjects Sync (central DB)
`mapobjects_store` is **not** a private on-device database. It is the working copy of a centrally maintained map of detected objects, scoped to the mission's bounding box, synchronised on a per-mission basis.
**Mirror of the GPS-Denied satellite-tile pattern.** Pre-flight, autopilot pulls the relevant central state into the on-device store; in-flight the on-device store is authoritative; post-flight, autopilot pushes the mission's full pass diff back to the central store. The central store is the source of truth across missions and across UAVs; the on-device store is the source of truth during the active mission.
**Endpoint hosting (frozen 2026-05-18).** The endpoints are an extension of the existing `missions` API. There is no separate `mapobjects` service.
| Endpoint | Method | Purpose |
|---|---|---|
| `/missions/{id}/mapobjects` | `GET` | Pre-flight: returns the central map state for the mission's bounding box (last-known objects + ignored items). |
| `/missions/{id}/mapobjects` | `POST` | Post-flight: uploads the mission's full pass diff (NEW / MOVED / REMOVED-CANDIDATE / CONFIRMED-EXISTING) for central merge. |
| `/missions/{id}/mapobjects/ignored` | `GET` | Pre-flight: returns the central ignored-items list scoped to the mission area. |
| `/missions/{id}/mapobjects/ignored` | `POST` | Post-flight: uploads ignored-items appended during the mission. |
| `/missions/{id}` | `DELETE` | Cascade (existing endpoint): drops mission-scoped MapObjects and IgnoredItems centrally as well as on-device. |
In-flight sync is **batched only** for MVP — no streaming over modem. Cross-UAV awareness lags by mission length; this is an explicit MVP trade-off (Frozen choice 6 in §7.3).
**Sync lifecycle (per mission).**
1. **Pre-flight pull**: `mission_client` calls `GET /missions/{id}/mapobjects` after fetching the mission itself. The response hydrates `mapobjects_store`. Failure modes:
   - **Reachable + 200**: hydrate; record `pull_completed_at`. Sync state = `synced`.
   - **Reachable + 4xx**: fail BIT; surface the error; the operator must investigate (likely a mission-id mismatch or an unauthorised UAV).
   - **Unreachable / timeout**: BIT degrades. The operator may acknowledge to continue with the **last-cached** state for this mission area (sync state = `cached_fallback`); the BIT failure is recorded for post-mission audit.
   - **Empty response**: sync state = `synced`, store empty (a legitimate first flight in this area).
2. **In-flight**: the store is authoritative. All NEW / MOVED / EXISTING / IgnoredItem appends accumulate in the on-device store with `pending_upload = true`. No central writes.
3. **Post-flight push**: `mission_client` calls `POST /missions/{id}/mapobjects` with the mission's full pass diff after landing or RTL. Conflicts are resolved server-side per the conflict-resolution rules later in this section. Failure modes:
   - **Reachable + 200**: clear `pending_upload`; record `push_completed_at`. Sync state = `synced`.
   - **Unreachable / timeout / 5xx**: persist the pending diff on disk and retry with backoff. Once the retry window (configurable, default 24 h) is exhausted, surface a warning; the operator may manually trigger a replay or accept the loss.
   - **4xx (rejected)**: log the full payload and surface it to the operator; never discard silently, because the mission's results are at risk.
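The lifecycle's failure handling can be read as two small decision functions. The sketch below is an assumption-laden illustration: every type and function name is hypothetical, and `AwaitOperatorAck` is an invented label for the "BIT degraded, operator has not yet acknowledged" situation.

```rust
/// Result of the pre-flight `GET /missions/{id}/mapobjects` call.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PullOutcome {
    /// Reachable + 200, including an empty response.
    Ok,
    /// Reachable + 4xx.
    Rejected,
    /// Unreachable or timed out.
    Unreachable,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SyncState {
    Synced,
    CachedFallback,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PreflightDecision {
    Proceed(SyncState),
    FailBit,
    /// Hypothetical: BIT degraded, waiting for the operator to acknowledge.
    AwaitOperatorAck,
}

pub fn preflight_decision(outcome: PullOutcome, operator_ack: bool) -> PreflightDecision {
    match (outcome, operator_ack) {
        (PullOutcome::Ok, _) => PreflightDecision::Proceed(SyncState::Synced),
        (PullOutcome::Rejected, _) => PreflightDecision::FailBit,
        (PullOutcome::Unreachable, true) => PreflightDecision::Proceed(SyncState::CachedFallback),
        (PullOutcome::Unreachable, false) => PreflightDecision::AwaitOperatorAck,
    }
}

/// Result of the post-flight `POST /missions/{id}/mapobjects` call.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PushOutcome {
    Ok,
    /// 4xx: the server rejected the payload.
    Rejected,
    /// Unreachable, timeout, or 5xx.
    Transient,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PushAction {
    ClearPending,
    LogAndSurface,
    PersistAndRetry,
}

pub fn push_action(outcome: PushOutcome) -> PushAction {
    match outcome {
        PushOutcome::Ok => PushAction::ClearPending,
        PushOutcome::Rejected => PushAction::LogAndSurface,
        PushOutcome::Transient => PushAction::PersistAndRetry,
    }
}
```

Encoding the outcomes as enums keeps every branch of the lifecycle visible to the compiler; adding a new failure mode forces each `match` to be revisited.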
**Conflict resolution at the central store (open question Q8 — proposed).** When two missions report contradicting state for the same `(h3_cell, class_group)`:
- Both observations are **appended** to the per-`(h3_cell, class_group)` observation log (no destructive overwrite).
- The "current view" surfaced to operator UI is computed from the observation log: most recent confirmed-existing observation wins; older REMOVED claims expire after a configurable age; class-group ambiguities surface as multi-class candidates.
- IgnoredItems are union-merged (any operator-decline at any UAV propagates to all future missions in the same area, until explicit clear).
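Since the projection rules are still open (Q8), the following is only one possible reading of the proposal, written as a Rust sketch for a single `(h3_cell, class_group)` log. It assumes that NEW / MOVED / EXISTING all count as "present" claims, that a REMOVED claim only counts while it is younger than the expiry age, and that timestamps are plain seconds; all names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DiffKind {
    New,
    Moved,
    Existing,
    RemovedCandidate,
}

#[derive(Debug, Clone, Copy)]
pub struct Observation {
    pub observed_at_s: u64,
    pub kind: DiffKind,
}

#[derive(Debug, PartialEq)]
pub enum CurrentView {
    Present,
    RemovedCandidate,
    Unknown,
}

/// Projects an append-only observation log onto a single current status.
pub fn project(log: &[Observation], now_s: u64, removed_max_age_s: u64) -> CurrentView {
    // Timestamp of the latest observation that is (or is not) a removal claim.
    let latest = |want_removed: bool| {
        log.iter()
            .filter(|o| (o.kind == DiffKind::RemovedCandidate) == want_removed)
            .map(|o| o.observed_at_s)
            .max()
    };
    let present = latest(false);
    // Removal claims older than the configured age are ignored.
    let removed = latest(true).filter(|t| now_s.saturating_sub(*t) <= removed_max_age_s);

    match (present, removed) {
        (Some(p), Some(r)) if r > p => CurrentView::RemovedCandidate,
        (Some(_), _) => CurrentView::Present,
        (None, Some(_)) => CurrentView::RemovedCandidate,
        (None, None) => CurrentView::Unknown,
    }
}
```

Because the log is never overwritten, a different projection rule can later be applied to the same data without any migration.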
**Central-side schema (SQL, indicative).**
```sql
-- Observations: every detection ever reported by any UAV/mission, never overwritten.
CREATE TABLE map_object_observations (
id UUID PRIMARY KEY,
h3_cell BIGINT NOT NULL,
class TEXT NOT NULL,
class_group TEXT NOT NULL,
mission_id UUID NOT NULL REFERENCES missions(id) ON DELETE CASCADE,
uav_id UUID NOT NULL,
observed_at TIMESTAMPTZ NOT NULL,
gps_lat DOUBLE PRECISION NOT NULL,
gps_lon DOUBLE PRECISION NOT NULL,
mgrs TEXT NOT NULL,
size_width_m REAL,
size_length_m REAL,
confidence REAL NOT NULL,
diff_kind TEXT NOT NULL CHECK (diff_kind IN ('NEW','MOVED','EXISTING','REMOVED_CANDIDATE')),
photo_ref TEXT,
raw_evidence JSONB
);
CREATE INDEX ON map_object_observations (h3_cell, class_group);
CREATE INDEX ON map_object_observations (mission_id);
CREATE INDEX ON map_object_observations (observed_at DESC);
-- IgnoredItems: per-area operator declines, union-merged across missions.
CREATE TABLE map_object_ignored (
id UUID PRIMARY KEY,
h3_cell BIGINT NOT NULL,
mgrs TEXT NOT NULL,
class_group TEXT NOT NULL,
declined_at TIMESTAMPTZ NOT NULL,
operator_id UUID,
mission_id UUID REFERENCES missions(id) ON DELETE SET NULL,
retention_scope TEXT NOT NULL CHECK (retention_scope IN ('mission','session','until_expiry')),
expires_at TIMESTAMPTZ
);
CREATE INDEX ON map_object_ignored (h3_cell, class_group);
CREATE INDEX ON map_object_ignored (expires_at) WHERE retention_scope = 'until_expiry';
-- Materialised "current view" derived from observations + ignored.
-- Recomputed nightly or on POST. Exact projection rules per §7.13 conflict resolution.
CREATE MATERIALIZED VIEW map_objects_current AS ...;
```
**On-device-side schema (engine TBD per §8 Q3 — indicative shape).**
```text
mapobjects_store/
current_state -- key = (h3_cell, class_group); value = MapObject record
pending_observations -- ordered log of unflushed observations for post-flight POST
pending_ignored -- unflushed IgnoredItem appends
sync_state -- {pull_completed_at, push_completed_at, last_error, kind}
```
The on-device shape is intentionally narrower than the central schema — the on-device store does not need full observation history beyond the active mission; older history is only ever consulted via the central pull.
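As a rough illustration of that narrower shape, the store could be modelled in memory as below. This is a sketch under assumptions: the storage engine is still open (Q3), the `MapObject` fields are reduced to a placeholder, and the method names are hypothetical.

```rust
use std::collections::HashMap;

/// (h3_cell, class_group), the key used by `current_state`.
pub type CellKey = (u64, String);

/// Placeholder record; the real MapObject carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct MapObject {
    pub class: String,
    pub lat: f64,
    pub lon: f64,
}

#[derive(Default)]
pub struct MapObjectsStore {
    pub current_state: HashMap<CellKey, MapObject>,
    /// Ordered log of observations not yet accepted by the central API.
    pub pending_observations: Vec<(CellKey, MapObject)>,
    pub pending_ignored: Vec<CellKey>,
}

impl MapObjectsStore {
    /// Records an in-flight observation: updates the current state and
    /// queues the observation for the post-flight POST.
    pub fn record(&mut self, key: CellKey, obj: MapObject) {
        self.current_state.insert(key.clone(), obj.clone());
        self.pending_observations.push((key, obj));
    }

    /// Queues an operator decline for the post-flight POST.
    pub fn ignore(&mut self, key: CellKey) {
        self.pending_ignored.push(key);
    }

    /// Clears the pending queues; intended to be called only after a 200.
    pub fn mark_pushed(&mut self) {
        self.pending_observations.clear();
        self.pending_ignored.clear();
    }
}
```

Keeping the pending queues separate from `current_state` means a failed push never loses data: the queues are cleared only once the central store has accepted them.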
**Bounding-box pull strategy.** The central API uses the mission's geofence INCLUSION polygon (or a generous AABB if no INCLUSION is set) to scope the response. Pulled records are filtered by retention age (default ≤30 days); operator can override to "all". The 30 km / k-ring numbers in §7.12 apply to **on-device** spatial queries; the pull radius is mission-defined.
### 7.14 Tech Stack
**Requirements.**
| Area | Requirement |
|---|---|
| Runtime hardware | Jetson Orin Nano Super 8 GB, locked JetPack/power mode, ViewPro A40. |
| Inference (Tier 1) | FP16 only, TensorRT primary, ONNX Runtime fallback, 1280 px model input. Lives in `../detections`. |
| Service integration | Bi-directional gRPC client to the existing FastAPI + Cython + TensorRT detections service. |
| VLM | Local-only, separate IPC process, sequential with YOLO, ≤5 s/ROI if used for MVP. |
| Movement | Active at zoom-out and zoom-in, moving-camera compensation with timestamped video / gimbal / UAV telemetry; per-zoom-band thresholds; learned-CV fallback per Q14. |
| MapObjects sync | Mission-bracketed: pre-flight `GET` + post-flight `POST` against `/missions/{id}/mapobjects`. Batched only for MVP. |
| Output | Existing normalised-box format plus POI metadata for queue / reasoning. |
| Proof gates | Hardware/replay benchmark suite before implementation decomposition; movement zoom-in benchmark independent of zoom-out. |
**Selected stack.**
| Layer | Selection | Rationale |
|---|---|---|
| Language (autopilot) | Rust | Memory safety, performance, single-binary deployment, strong type system for the deterministic state machine. |
| Language (`../detections`) | Python + Cython | Existing service; we consume it, not rewrite it. |
| Tier 1 detector | YOLO26 + YOLOE-26 fixed-class FP16 TensorRT | Best fit with acceptance criteria and export docs. Owned by `../detections`. |
| Tier 2 analyzer | Primitive graph + lightweight CNN | Fast, explainable, data-efficient. |
| Movement | OpenCV optical flow + telemetry | Directly addresses moving-camera constraint. |
| VLM runtime | NanoLLM / VILA1.5-3B (with fallback benchmark path) | Documented local-multimodal path; matches no-cloud requirement. |
| Scan controller | Deterministic typed state machine (Rust) | Simpler and easier to test for a fixed `ZoomedOut` / `ZoomedIn` / `TargetFollow` lifecycle. |
| MAVLink transport | Hand-rolled in autopilot (Rust) | Eliminates the largest current dependency-risk item; small command surface (§7.7). |
| Gimbal protocol | ViewPro A40 vendor protocol over UDP | Matches the deployed camera. |
| `mapobjects_store` engine | TBD (SQLite + H3 extension / KV / in-memory + snapshot) | Open question; see §8. |
| Inter-component IPC (in-process) | Tokio channels / actors | Idiomatic Rust async. |
| External IPC (VLM) | Unix-domain socket with peer-credential check | Local-only authorisation. |
| VLM output | Validated structured `VlmAssessment` schema | Makes VLM output a stable API contract. |
| Input security | Content / size allow-list + patched OpenCV | Reduces crafted-input and resource-exhaustion risk. |
| Observability | `tracing` + JSON logs to stdout, scraped by the deployment's log-shipping stack | See `deployment/observability.md`. |
| Build | `cargo` cross-compile for `aarch64-unknown-linux-gnu` | See `deployment/ci_cd_pipeline.md`. |
**Risk register.**
| Risk | Impact | Mitigation |
|---|---|---|
| Tier 1 misses ≤100 ms/frame | Blocks acceptance | Fixed-shape FP16 engines, batch 1, benchmark before implementation decomposition. |
| VLM misses ≤5 s/ROI or memory budget | Blocks VLM-required MVP policy | Benchmark NanoLLM / VILA first; fall back to smaller VLM only if it passes the same gates; otherwise disable VLM via `vlm_enabled=false`. |
| All-season MVP data is insufficient | Blocks detection-quality targets | Per-season dataset gates and hard-negative mining. |
| Movement false positives exceed ≤5 POIs/min | Operator overload | Telemetry-aided compensation, replay tests, queue cap, per-zoom-band thresholds. |
| Classical OpenCV optical flow inadequate at zoom-in | Loss of zoom-in movement detection | Benchmark gate measures zoom-in independently; fallback to learned-CV / CNN motion module behind feature flag (Q14). |
| Operator/Ground-Station modem link lost mid-flight | Uncontrolled UAV | Typed lost-link failsafe ladder in `mission_executor` (§7.7); RTL after 30 s grace; configurable. |
| Battery / fuel below threshold mid-mission | Forced landing or crash | Hard-coded RTL + land-now thresholds (§7.7); operator override only via signed command. |
| Operator command spoofing / replay over modem RF | Hostile hijack of operator commands | Authenticated, signed, replay-protected command envelope (§5; scheme TBD per Q9). |
| Pre-flight self-test (BIT) misses a degraded dependency | Mid-flight component failure | BIT covers every dependency in §5 plus mission load + MapObjects pre-flight pull; cached-fallback acknowledgement is explicit. |
| Wall-clock drift breaks operator-command timestamping | Forensic + audit failures | GPS-time-bound when GPS locked; NTP at boot; drift > 200 ms surfaces health → yellow. |
| MapObjects post-flight push fails | Loss of mission-diff data centrally | Persist pending diff on disk; bounded retry; operator-visible warning; manual replay supported. |
| A40 zoom transition exceeds ≤2 s | Breaks scan timing | Hardware-in-loop timing test; revise scan timeout / zoom range if needed. |
| Hand-rolled MAVLink misses an edge case | Mission failure or hard-to-debug protocol behaviour | Conformance test against ArduPilot SITL; replay-based regression tests. |
| Unstructured VLM output corrupts downstream decisions | Operator-facing false confidence | Schema validation, confidence enum, timeout / error state, fail-closed behaviour. |
| Telemetry skew breaks movement compensation | False motion candidates | Define maximum frame / gimbal / UAV timestamp skew; reject / degrade unsynchronised samples. |
| Untrusted image / ROI payloads exploit decoders or memory | Security and availability risk | Pin patched OpenCV, restrict formats, enforce size caps before decode. |
---
## 8. Open Questions
| # | Question | Impact |
|---|---|---|
| Q1 | **Sweep pattern specification.** Pattern shape (pendulum / raster / lawn-mower), FOV per zoom tier, dwell time per direction, and whether sweep runs continuously or only between specific mission waypoints. | Blocks `scan_controller` zoom-out implementation. |
| Q2 | **Ground Station API contract.** Stream protocol (WebRTC / WebSocket-H.264 / gRPC server-streaming?), session/auth model, and bbox-overlay rendering (server-side burn-in vs client-side render). | Blocks `telemetry_stream` + `operator_bridge` design. |
| Q3 | **`mapobjects_store` engine.** SQLite + H3 extension / KV / in-memory + snapshot. | Blocks persistent-state design for ignored items + MapObjects. |
| Q4 | **Tier 1 contract evolution.** How `detection_client` is versioned against an evolving `../detections` schema. | Blocks the gRPC contract definition. |
| Q5 | **`mission-schema` extraction location.** `_infra/` at suite root, or a small third repo. | Blocks the `mission_client` / `missions` API contract sharing. |
| Q6 | **MAVLink-2 message signing.** Whether the airframe link enables MAVLink-2 signing or treats the link as trusted. | Affects `mavlink_layer` startup handshake. |
| Q7 | **Central MapObjects API contract.** Endpoint hosting is frozen as an extension of the `missions` API (§7.13). The remaining contract concerns are: schema versioning, paging strategy for large mission areas, photo-reference upload mechanism (URL handoff vs inline), and observation-history retention policy. | Blocks `missions` repo work + `mission_client` MapObjects sync code. |
| Q8 | **MapObjects conflict resolution.** When two missions report contradicting state for the same `(h3_cell, class_group)`, the proposed rule is "append-only observation log + computed current view" (§7.13). Open: exact projection rules, REMOVED-claim expiry window, multi-class disambiguation. | Blocks central `map_objects_current` view definition. |
| Q9 | **Operator-command authentication scheme.** The principle is committed (§5: signed, replay-protected). Scheme open: HMAC over (session_token, sequence_number, payload) vs JWT-style ed25519 vs MAVLink-2 signing extended to operator commands vs separate envelope. | Blocks `operator_bridge` validation logic + Ground Station integration. |
| Q10 | **Software rollback policy on the airframe.** Watchtower OTA is mentioned in `../_docs/00_top_level_architecture.md`. Policy open: how a bad autopilot update is detected on the airframe (boot-time self-check, A/B partition, watchdog rollback) and rolled back without crew intervention. | Affects deployment design + on-airframe service supervision. |
| Q11 | **Multi-operator session policy.** When two operators connect (one in primary station, one remote), which is authoritative for confirm/decline? Single active operator at a time, or quorum? How is `operator_id` recorded in `IgnoredItem`? | Blocks `operator_bridge` session model. |
| Q12 | **Comms blackout during banking turns.** Winged UAV banking can lose modem LOS to Ground Station. Policy: tolerate brief blackouts as `LinkDegraded`, or suppress lost-link failsafe during known turn arcs (computed from mission shape)? | Affects lost-link failsafe ladder timing constants (§7.7). |
| Q13 | **All-season acceptance flight gates.** Dataset gates (§7.4) are committed; flight-test gates are not. Open: minimum number of real flights per season before MVP acceptance, per-season acceptance pass criteria. | Affects MVP sign-off scope. |
| Q14 | **Movement detection at zoom-in — fallback selection.** If classical OpenCV optical flow / global-motion estimation does not meet the per-zoom-band false-positive cap at zoom-in, the fallback module choice is open: learned optical flow (RAFT / FlowNet derivative) vs CNN motion segmentation vs IMU-tighter-coupled classical CV. The interface contract (`Frame + telemetry → Vec<MovementCandidate>`) is fixed; the implementation is replaceable. | Blocks `movement_detector` zoom-in scope if classical CV fails benchmark gate. |
---
## 9. Out of Scope
- Multi-airframe coordination, fleet management, swarm logic.
- Mission re-planning beyond middle-waypoint inserts.
- Mission planning / route selection for arbitrary mission shapes (only intra-region routing).
- GPS-denied navigation algorithms (delegated to the GPS-denied service, `../_docs/11_gps_denied.md`).
- Cloud-hosted VLM or any external inference dependency.
- Encrypted transport beyond what MAVLink-2 message signing and modem-level link encryption already provide.
- Annotation tooling, model training, dataset curation (separate `ai-training` repo).
- Operator browser UI (Ground Station hosts it; autopilot only feeds it).
---
## 10. External Suite Documents
These suite-level documents live in the parent suite repo (`../_docs/`) and are consumed by autopilot but **not owned** by autopilot.
| Suite-level path | Owner / primary-for | What autopilot uses it for |
|---|---|---|
| `../_docs/00_top_level_architecture.md` | suite (cross-cutting) | Suite topology, deployment tiers (`edge`), the **flight-gate convention** (`/run/azaion/in-flight` — written by autopilot, read by `model-sync.service`), Watchtower OTA model. Defines autopilot's place in the 11-component system. |
| `../_docs/02_missions.md` | `missions` repo (.NET service) | Mission / Waypoint / Vehicle schemas. Autopilot consumes the missions API via `mission_client`. |
| `../_docs/03_detections.md` | `detections` repo (Cython service) | Detections API spec. Autopilot consumes via bi-directional gRPC in `detection_client`. |
| `../_docs/04_system_design_clarifications.md` | suite (cross-cutting) | REST patterns, stream-detection protocol, edge-device connection semantics. Defines the Ground Station push contract used by `telemetry_stream`. |
| `../_docs/11_gps_denied.md` | `gps-denied-onboard` / `gps-denied-desktop` (shared primary) | GPS-Denied service architecture. Autopilot does NOT host any GPS-denied code; it consumes corrected GPS through the shared edge data path. |
| `../_docs/12_ai_training.md` | `ai-training` repo | AI training pipeline. Autopilot consumes the resulting ONNX/TensorRT models via the rclone model-sync timer (flight-gate-aware). |
@@ -0,0 +1,76 @@
# Component — `detection_client`
**Layer**: Perception (data plane in)
**Status**: forward-looking design (Rust)
## 1. Purpose
Bi-directional gRPC client to the external `../detections` service. Streams frames out and receives bounding-box detections back. The same bounding boxes are reused by `semantic_analyzer` (Tier 2 ROI selection) and by `telemetry_stream` (operator overlay). This is the only component in autopilot that talks to `../detections`.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| `Frame` | `frame_ingest` | up to 30 fps | Skipped when `ai_locked` is set. |
| Tier-1 service config | startup config | once | gRPC endpoint, TLS settings, request budget, max concurrent streams. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| `DetectionBatch` | `scan_controller`, `semantic_analyzer`, `telemetry_stream` | `{ frame_seq: u64, detections: Vec<Detection>, latency_ms, model_version }` |
| Health metric | health aggregator | gRPC connection state, `requests_in_flight`, `latency_p50/p99`, `errors_by_kind`, `model_version`. |
`Detection` mirrors the `../detections` contract: `{ class_id, class_name, confidence, bbox_normalized, optional_mask_or_polyline, source_frame_seq }`.
## 4. Key Responsibilities
- Maintain a single bi-directional gRPC stream to `../detections`. Reconnect on stream loss with bounded exponential backoff.
- Frame budgeting: respect the Tier-1 ≤100 ms/frame target by dropping older in-flight frames if a new frame arrives before the previous response (configurable).
- Validate the response payload against the schema version the client was built against. Surface a hard error on schema mismatch; do not silently downcast.
- Tag each `DetectionBatch` with the source frame's monotonic timestamp so downstream consumers can compute end-to-end latency.
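The frame-budgeting rule (drop older in-flight frames rather than queue) can be sketched with a bounded window of sequence numbers. This is an illustrative Rust sketch only; `InFlightWindow` and its methods are hypothetical names, and the gRPC plumbing is omitted.

```rust
use std::collections::VecDeque;

/// Tracks frames sent to the detection service that have no response yet.
pub struct InFlightWindow {
    max_in_flight: usize,
    seqs: VecDeque<u64>,
}

impl InFlightWindow {
    pub fn new(max_in_flight: usize) -> Self {
        assert!(max_in_flight > 0, "window must hold at least one frame");
        Self { max_in_flight, seqs: VecDeque::new() }
    }

    /// Registers a newly sent frame. Returns the sequence numbers that were
    /// dropped (oldest first) to keep the window within its bound.
    pub fn push(&mut self, seq: u64) -> Vec<u64> {
        let mut dropped = Vec::new();
        while self.seqs.len() >= self.max_in_flight {
            if let Some(old) = self.seqs.pop_front() {
                dropped.push(old);
            }
        }
        self.seqs.push_back(seq);
        dropped
    }

    /// Marks a response as received. Returns false for a stale response,
    /// i.e. one whose frame was already dropped from the window.
    pub fn complete(&mut self, seq: u64) -> bool {
        match self.seqs.iter().position(|s| *s == seq) {
            Some(pos) => {
                self.seqs.remove(pos);
                true
            }
            None => false,
        }
    }
}
```

Returning the dropped sequence numbers lets the caller count them in `errors_by_kind` or a similar health metric instead of losing them silently.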
## 5. Internal State
- gRPC channel, stream handle, reconnect state.
- Sliding window of in-flight frame sequence numbers.
- Last-known model version (echoed by `../detections` on each response or on stream init).
State is in-process only.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| `../detections` unreachable | gRPC connect error | Bounded exponential backoff; health → red after threshold; `scan_controller` continues but the `detection_client` health flag is red. |
| Mid-stream cancellation by server | stream error | Reopen the stream; retry only the latest in-flight frame (best effort) and count older in-flight frames as dropped. |
| Schema mismatch | response decode error | Hard error to the health aggregator; reject the response; alert. |
| Model version change at runtime | new `model_version` on the stream | Log it; if the change implies new classes, surface to `scan_controller` so per-class thresholds can be reloaded. |
| Sustained latency above budget | `latency_p99 > 100 ms` over a sliding window | Health → yellow; `scan_controller` may degrade to alternate-frame inference. |
## 7. Dependencies
**In-process**: `frame_ingest` (input), `scan_controller` / `semantic_analyzer` / `telemetry_stream` (output).
**External**:
- `../detections` gRPC service. Contract owner: `../_docs/03_detections.md`. Bi-directional streaming.
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| Per-frame round-trip latency | ≤100 ms (Tier-1 NFR; mostly owned by `../detections`, autopilot's call budget respects it) |
| Reconnect latency | ≤2 s after `../detections` becomes reachable again |
| Throughput | up to 30 fps at 1080p |
| Backpressure | drop oldest in-flight rather than queue indefinitely |
## 9. Open Questions
- Versioning strategy of the gRPC contract (covered in `architecture.md §8 Q4`).
## 10. References
- `architecture.md §1`, `§3`, `§7.6`.
- `system-flows.md §F1`.
- `../_docs/03_detections.md`.
- `data_model.md §Detection`, `§DetectionBatch`.
@@ -0,0 +1,74 @@
# Component — `frame_ingest`
**Layer**: Perception (data plane in)
**Status**: forward-looking design (Rust)
## 1. Purpose
Pull RTSP from the ViewPro A40 camera, decode H.264 / H.265 to raw frames, attach a monotonic timestamp + sequence number, and hand each frame to the downstream consumers (`detection_client`, `movement_detector`, `telemetry_stream`) without copying frame buffers more than once.
Frames are the system's primary input. Everything downstream of `frame_ingest` is rate-limited by it.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| RTSP video stream | ViewPro A40 (via airframe IP/port) | 30 fps at 1080p (60 fps capable) | TCP or UDP transport per camera config. Re-opens on failure with bounded backoff. |
| Camera startup config | Static config (env or CLI) | once at process start | Stream URL, transport, decode codec preference. |
| `bringCameraDown` / `bringCameraUp` health signal | local supervisor (if present) | event | Optional. Used by deployments that gate AI access to the camera (e.g., during RC takeover). When `down` is asserted, `frame_ingest` continues decoding for `telemetry_stream` but flags frames as "AI-locked" so downstream consumers skip detection. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| `Frame` | `detection_client`, `movement_detector`, `telemetry_stream` | `{ seq: u64, capture_ts_monotonic: ns, decode_ts_monotonic: ns, pixels: Arc<Bytes>, width, height, pix_fmt, ai_locked: bool }` |
| Health metric | health aggregator | `frames/s`, `decode_ms_p50/p99`, `last_frame_age_ms`, `reopens_total`, `decode_errors_total` |
## 4. Key Responsibilities
- Open the RTSP session and recover from transient connection loss with bounded exponential backoff.
- Decode frames using a hardware decoder where available (NVDEC on Jetson) with software fallback.
- Stamp each frame with a monotonic capture timestamp at the earliest practical point in the pipeline; this is what `movement_detector` uses for telemetry-skew checks.
- Publish frames through a single multi-consumer channel (Tokio broadcast or equivalent) using `Arc<Bytes>` for pixel data so consumers do not copy.
- Drop frames if downstream consumers fall behind beyond a configured queue depth; record the drop with a reason (`detection_client_slow`, `movement_detector_slow`, or `telemetry_slow`) and surface it through the health endpoint.
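The zero-copy hand-off relies on reference counting: cloning a frame clones the handle, not the pixels. The sketch below uses `Arc<[u8]>` from the standard library as a stand-in for `Arc<Bytes>` so that it needs no external crate; the reduced `Frame` fields and `fan_out` are illustrative only.

```rust
use std::sync::Arc;

/// Reduced version of the `Frame` shape; `Arc<[u8]>` stands in for `Arc<Bytes>`.
#[derive(Clone)]
pub struct Frame {
    pub seq: u64,
    pub capture_ts_monotonic_ns: u64,
    pub pixels: Arc<[u8]>,
    pub width: u32,
    pub height: u32,
    pub ai_locked: bool,
}

/// Produces one handle per consumer. Each clone bumps the reference count;
/// the pixel buffer itself is shared, not copied.
pub fn fan_out(frame: &Frame, consumers: usize) -> Vec<Frame> {
    (0..consumers).map(|_| frame.clone()).collect()
}
```

After `fan_out(&frame, 3)` there are four handles to one pixel buffer, and the buffer is freed when the last handle is dropped.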
## 5. Internal State
- RTSP session handle and reconnect state (closed / connecting / streaming / failing).
- Last-frame timestamp and sequence number.
- Per-consumer drop counters.
State is in-process only; nothing persists across restarts.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| RTSP connection refused / lost | TCP connect error / read timeout | Bounded exponential backoff (1 s → 30 s cap); health flips to yellow after first failure, red after `last_frame_age_ms` exceeds a configured threshold. |
| Decode error on a single frame | decoder returns error | Drop the frame; increment `decode_errors_total`; do not abort the stream. |
| Decoder cold-start latency | first-frame timestamp far from session-open | Surface `decode_ms_first_frame` once; not an alert by itself. |
| Downstream consumer slow | broadcast channel back-pressure | Drop the oldest frame for that consumer; increment that consumer's drop counter with the reason tag; warn on sustained drops. |
| Camera output format mismatch | unexpected SPS/PPS | Hard-fail at session open with an explicit error; do not silently pick a wrong decode path. |
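The "1 s → 30 s cap" reconnect policy amounts to a doubling delay with an upper bound. A minimal sketch, assuming pure doubling without jitter (the document does not say whether jitter is used) and a hypothetical function name:

```rust
use std::time::Duration;

/// Delay before reconnect attempt number `attempt` (0-based):
/// base * 2^attempt, never exceeding `cap`.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    // Saturate the multiplier instead of overflowing on large attempt counts.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(cap, |d| d.min(cap))
}
```

With a 1 s base and a 30 s cap the delays run 1, 2, 4, 8, 16, 30, 30, ... seconds.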
## 7. Dependencies
**In-process**: none upstream; downstream consumers are `detection_client`, `movement_detector`, `telemetry_stream`.
**External**:
- ViewPro A40 RTSP (live).
- Hardware video decoder (NVDEC on Jetson) via FFmpeg / GStreamer or a Rust binding.
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| End-to-end frame latency (RTSP rx → publish to consumers) | ≤30 ms p99 on Jetson Orin Nano. |
| Frame drop rate | ≤0.1 % under normal conditions. |
| Reconnect latency after camera reboot | ≤5 s from camera availability. |
| Memory | one decoded-frame buffer pool with bounded size; no unbounded growth on slow consumers. |
## 9. References
- `architecture.md §1 System Context`, `§3 Components`, `§7.6 Solution Architecture`.
- `system-flows.md §F1 Frame pipeline`.
- `data_model.md §Frame`.
@@ -0,0 +1,78 @@
# Component — `gimbal_controller`
**Layer**: Action (data plane out)
**Status**: forward-looking design (Rust); ViewPro A40 vendor protocol
## 1. Purpose
Drives the ViewPro A40 gimbal: pan (yaw), tilt (pitch), and zoom. Honours the ≤2 s zoom-transition budget and ≤500 ms decision-to-movement latency. Owns the zoom-out sweep, the smooth-pan path-tracking primitive used during the zoom-in level (follow-the-footpath behaviour), and the centre-window primitive used during target-follow.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| `GimbalCommand` | `scan_controller` | per state-machine tick or per zoom-in plan step | yaw / pitch / zoom goal; or pan plan; or centre-on-target. |
| Sweep config | startup config | once | Zoom-out sweep pattern (pendulum / raster / lawn-mower — see `architecture.md §8 Q1`). |
| Live gimbal status | ViewPro A40 (vendor protocol) | as emitted by camera | yaw / pitch / zoom feedback + faults. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| Vendor-protocol commands | ViewPro A40 (UDP) | yaw / pitch / zoom commands |
| `GimbalState` | `frame_ingest` (for telemetry tagging), `movement_detector` (for ego-motion compensation) | `{ yaw, pitch, zoom, ts_monotonic, command_in_flight: bool }` |
| Health metric | health aggregator | `commands_per_min`, `decision_to_movement_p99_ms`, `zoom_transition_p99_ms`, `vendor_faults_total`. |
## 4. Key Responsibilities
- Send vendor-protocol commands to the ViewPro A40 over UDP. Re-issue on timeout with bounded retry.
- Run the zoom-out sweep pattern when `scan_controller` is in `ZoomedOut` (pattern itself depends on `architecture.md §8 Q1` resolution).
- For the zoom-in path-follow, accept a pan plan (sequence of yaw / pitch / zoom goals with timing) from `scan_controller` / `semantic_analyzer` and execute it smoothly.
- For target-follow, accept a centre-on-target stream (target bbox normalized) from `scan_controller` and command the gimbal to keep the target inside the centre 25 % of frame while visible.
- Stamp every emitted command with a monotonic timestamp so `movement_detector` can synchronise it with frames.
- Surface vendor-protocol faults to health and to `scan_controller`.
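The centre-window rule for target-follow can be expressed as a simple check on the normalised bbox. The sketch below makes two assumptions that the text does not settle: "centre 25 %" is read as a centred window covering 25 % of the frame area (half the width by half the height), and the test is applied to the bbox centre rather than the whole box. All names are hypothetical.

```rust
/// Bounding box in normalised frame coordinates (0.0..=1.0).
#[derive(Debug, Clone, Copy)]
pub struct NormBox {
    pub x_min: f64,
    pub y_min: f64,
    pub x_max: f64,
    pub y_max: f64,
}

/// Offset of the bbox centre from the frame centre, per axis.
/// A controller could use this as the pan / tilt error signal.
pub fn centre_offset(b: &NormBox) -> (f64, f64) {
    ((b.x_min + b.x_max) / 2.0 - 0.5, (b.y_min + b.y_max) / 2.0 - 0.5)
}

/// True when the bbox centre lies inside a centred window that covers
/// `window_area_fraction` of the frame area (0.25 for the 25 % rule).
pub fn in_centre_window(b: &NormBox, window_area_fraction: f64) -> bool {
    let half_side = window_area_fraction.sqrt() / 2.0;
    let (dx, dy) = centre_offset(b);
    dx.abs() <= half_side && dy.abs() <= half_side
}
```

When the check fails, the sign of each offset indicates which way the gimbal needs to move to re-centre the target.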
## 5. Internal State
- Last-known commanded yaw / pitch / zoom.
- Last-known reported yaw / pitch / zoom (from gimbal feedback).
- Sweep pattern state (current direction, dwell counter).
- Current execution mode: `Sweep | PanPlan | CentreOnTarget | Idle`.
State is in-process only.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| ViewPro A40 not responding | command timeout | Bounded exponential backoff; health → yellow then red; `scan_controller` is informed and may pause zoom-in. |
| Decision-to-movement above budget | self-instrumented | Health → yellow; investigate (likely UDP loss or vendor firmware issue). |
| Zoom transition stalls | feedback shows no zoom progress | Re-issue command; health → yellow; report to `scan_controller`. |
| Target lost during target-follow | feedback + tracker | Surface `target_lost` to `scan_controller`; controller decides to release follow. |
| Conflicting commands | execution-mode mismatch | Reject the lower-priority command; log a hard error; never silently merge. |
## 7. Dependencies
**In-process** (input): `scan_controller`.
**In-process** (output): `frame_ingest`, `movement_detector` (timestamped state).
**External**: ViewPro A40 over UDP (vendor protocol).
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| Decision-to-movement latency | ≤500 ms |
| Zoom transition (medium → high) | ≤2 s |
| Sweep pattern stability | bounded jitter; no overshoot beyond configured FOV bounds |
| Target-follow centre-window | target inside centre 25 % of frame while visible |
## 9. Open Questions
- Sweep pattern specification (`architecture.md §8 Q1`): pendulum / raster / lawn-mower; FOV per zoom tier; dwell time per direction.
## 10. References
- `architecture.md §3`, `§6 NFR`, `§7.6 Solution Architecture`.
- `system-flows.md §F2 Movement detection (zoom-out + zoom-in)`.
- `data_model.md §GimbalState`.
@@ -0,0 +1,124 @@
# Component — `mapobjects_store`
**Layer**: Decision + Memory
**Status**: forward-looking design (Rust); on-device working copy of the central MapObjects state, mission-bracketed
## 1. Purpose
On-device, H3-indexed working copy of the centrally maintained MapObjects state plus the IgnoredItems list, scoped to the active mission's bounding box. Computes new / moved / existing / removed diffs across survey passes and is the source of truth for the operator-decline suppression rule **for the duration of the active mission**.
This is **not** a private database. It is hydrated pre-flight from the central `missions` API (`/missions/{id}/mapobjects`) and the mission's full pass diff is pushed back post-flight. The central observation log + computed current view are authoritative across missions and across UAVs (`architecture.md §7.13`).
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| Pre-flight pull payload | `mission_client` (from `missions` API) | once per mission | Hydrates `current_state` + `pending_ignored`. |
| New detection / movement candidate (with MGRS + class + size) | `scan_controller` | per detection | Each is classified as new / moved / existing. |
| `IgnoredItem` append | `scan_controller` (on operator decline) | event | `(MGRS, class_group)` plus operator metadata. |
| End-of-pass marker | `scan_controller` / `mission_executor` | event per pass over a region | Triggers the removed-candidate sweep. |
| Mission delete cascade | suite-level missions API hook (process-level config; not a network call) | event | Drops mission-scoped objects on mission deletion. |
| Post-flight push trigger | `mission_executor` | once per mission, on terminal state | Causes `mission_client` to drain `pending_observations` + `pending_ignored` to the central API. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| `MapObjectClassification` | `scan_controller` | `new \| moved \| existing \| removed_candidate` per detection |
| `IgnoredItem` match | `scan_controller` | suppression flag for (MGRS, class_group) |
| Pass diff | `mission_client` (post-flight upload) + `operator_bridge` (optionally surfaced in-flight) | new / moved / removed lists per pass |
| Sync state | `scan_controller`, health aggregator | `synced \| cached_fallback \| degraded`; `pending_observations_count`, `pending_ignored_count` |
## 4. Key Responsibilities
- **Pre-flight hydrate** from `mission_client` pull. Establish `current_state` and `pending_ignored`. Surface `sync_state` (`synced` or `cached_fallback` or `degraded`).
- Compute H3 cell for each detection at the configured resolution (default res 10, ~66 m average edge length).
- Build the composite key `H3_cell + class`. Maintain an in-memory hashmap; persist asynchronously to disk for crash recovery.
- Answer queries: `classify(detection) → new | moved | existing` using k-ring lookup and `(distance_threshold_m, move_threshold_m, similar_classes)` configuration.
- After a region's scan-pass ends, return objects in the region that were not re-observed as `removed_candidate`s (the operator decides on actual removal).
- Maintain the `IgnoredItem` set; answer suppression queries (`is_ignored(MGRS, class_group)`).
- Append every NEW / MOVED / EXISTING / REMOVED-CANDIDATE / IgnoredItem event to `pending_observations` / `pending_ignored` for the post-flight push (in-flight central writes are forbidden — Frozen choice 6 in `architecture.md §7.3`).
- **Post-flight push**: hand the contents of `pending_observations` + `pending_ignored` to `mission_client` for `POST /missions/{id}/mapobjects` and `POST /missions/{id}/mapobjects/ignored`. On ack, clear pending; on failure, persist for retry.
- On `DELETE /missions/{id}` cascade signal (received via `mission_client`), drop all objects scoped to that mission. The central side cascades as well.
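
The `classify` query above can be sketched as follows. This is a minimal illustration, not the planned implementation: it assumes the k-ring lookup has already produced the same-class neighbours in a local metric frame, and it assumes one possible reading of the thresholds (within `move_threshold_m` means `existing`, within `distance_threshold_m` means `moved`, otherwise `new`). The type and field names are hypothetical.

```rust
/// Classification result for one detection (names follow the Outputs table).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Classification {
    New,
    Moved,
    Existing,
}

/// Hypothetical configuration; the threshold semantics are an assumption.
pub struct ClassifyConfig {
    pub move_threshold_m: f64,
    pub distance_threshold_m: f64,
}

/// A stored object in a local east/north frame, in metres (assumed representation).
#[derive(Debug, Clone, Copy)]
pub struct StoredObject {
    pub east_m: f64,
    pub north_m: f64,
}

/// Classify a detection against same-class neighbours gathered by the k-ring lookup.
pub fn classify(
    det_east_m: f64,
    det_north_m: f64,
    same_class_neighbours: &[StoredObject],
    cfg: &ClassifyConfig,
) -> Classification {
    // Distance to the nearest known object; infinity when there are no neighbours.
    let nearest_m = same_class_neighbours
        .iter()
        .map(|o| ((o.east_m - det_east_m).powi(2) + (o.north_m - det_north_m).powi(2)).sqrt())
        .fold(f64::INFINITY, f64::min);

    if nearest_m <= cfg.move_threshold_m {
        Classification::Existing
    } else if nearest_m <= cfg.distance_threshold_m {
        Classification::Moved
    } else {
        Classification::New
    }
}
```

Handling of `similar_classes` is left out of the sketch; it would widen the neighbour set before this function is called.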
## 5. Sync state machine
```text
fresh_boot
├──> pre-flight pull
│ │
│ ├── 200 OK ────────────> synced
│ ├── unreachable ────────> [operator ack required]
│ │ │
│ │ ├── ack on cache ──> cached_fallback
│ │ └── abort ─────────> BIT fail
│ └── 4xx ─────────────────> BIT fail
├── (during flight; in-process writes only)
│ │
│ ├── pending_observations grow
│ └── pending_ignored grow
└── post-flight push
├── 200 OK on both endpoints ──> synced (pending cleared)
├── partial ────────────────────> retry per-endpoint
└── persistent failure ─────────> degraded (operator warning, manual replay)
```
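
One way to encode the diagram above as plain transition functions is sketched below. The enum and function names are hypothetical. `FreshBoot` and `BitFail` are modelled as states only for illustration, since the published `sync_state` values are `synced`, `cached_fallback`, and `degraded`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncState {
    FreshBoot,
    Synced,
    CachedFallback,
    Degraded,
    /// Not a published sync_state value; stands for "BIT fail" in the diagram.
    BitFail,
}

/// Result of the pre-flight pull as seen by the store (hypothetical type).
pub enum PullOutcome {
    Ok,
    Unreachable { operator_ack_cache: bool },
    ClientError, // any 4xx
}

/// Result of one post-flight push attempt across both endpoints (hypothetical type).
pub enum PushOutcome {
    BothOk,
    Partial,
    PersistentFailure,
}

/// State reached after the pre-flight pull completes.
pub fn after_pull(outcome: PullOutcome) -> SyncState {
    match outcome {
        PullOutcome::Ok => SyncState::Synced,
        PullOutcome::Unreachable { operator_ack_cache: true } => SyncState::CachedFallback,
        PullOutcome::Unreachable { operator_ack_cache: false } => SyncState::BitFail,
        PullOutcome::ClientError => SyncState::BitFail,
    }
}

/// State reached after a post-flight push attempt.
/// A partial success keeps the current state while the failed endpoint is retried.
pub fn after_push(current: SyncState, outcome: PushOutcome) -> SyncState {
    match outcome {
        PushOutcome::BothOk => SyncState::Synced,
        PushOutcome::Partial => current,
        PushOutcome::PersistentFailure => SyncState::Degraded,
    }
}
```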
## 6. Internal State
- In-memory hashmap of `(H3_cell + class) → MapObject`.
- `IgnoredItem` set keyed by `(MGRS, class_group)`.
- Per-region pass tracker for removed-candidate detection.
- `pending_observations`: ordered log of NEW / MOVED / REMOVED-CANDIDATE / EXISTING events not yet pushed centrally.
- `pending_ignored`: ordered log of IgnoredItem appends not yet pushed centrally.
- `sync_state`: enum + last-pull timestamp + last-push timestamp + last error.
- Persistence layer (engine TBD — see Open Questions) for crash recovery and post-flight upload durability.
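
The per-region pass tracker listed above could look roughly like this. It is a sketch under assumptions: the composite key is modelled as `(u64, String)` (an H3 cell index plus a class name), and `PassTracker` is a hypothetical name. Objects are only reported, never deleted, because the operator decides on actual removal.

```rust
use std::collections::HashSet;

/// Composite key: (H3 cell index, class). The concrete types are an assumption.
pub type ObjectKey = (u64, String);

/// Tracks which known objects of a region were re-observed during the current pass.
pub struct PassTracker {
    known: HashSet<ObjectKey>,
    seen: HashSet<ObjectKey>,
}

impl PassTracker {
    pub fn new(known: impl IntoIterator<Item = ObjectKey>) -> Self {
        Self { known: known.into_iter().collect(), seen: HashSet::new() }
    }

    /// Record that an object was matched by a detection in this pass.
    pub fn mark_observed(&mut self, key: ObjectKey) {
        self.seen.insert(key);
    }

    /// On the end-of-pass marker: return known objects that were not re-observed,
    /// sorted for deterministic output, then reset the tracker for the next pass.
    pub fn end_pass(&mut self) -> Vec<ObjectKey> {
        let mut removed: Vec<ObjectKey> = self.known.difference(&self.seen).cloned().collect();
        removed.sort();
        self.seen.clear();
        removed
    }
}
```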
## 7. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| Pre-flight pull unreachable | network | Surface BIT degradation; operator must acknowledge cached fallback or abort. Never silent. |
| Pre-flight pull stale beyond freshness window | last-fetch-at compared to configured staleness | `sync_state = degraded`; operator must acknowledge or abort. |
| Persistence write failure | engine error | Log + retry; in-memory state continues authoritative for this mission; health → yellow. |
| Persistence corruption on startup | checksum / open failure | Refuse to start with stale state; require explicit recovery (engine-specific); surface to operator at startup. |
| H3 query inconsistency near cell boundaries | algorithmic | Always query the k-ring (k=2 default) so boundary objects are matched anyway. |
| Mission cascade signal lost | absent signal | `DELETE /missions/{id}` is the only cleanup trigger; on lost signal, mission-scoped objects accumulate. Operator-driven manual purge is acceptable. |
| Post-flight push partial success | per-endpoint status | Independent retry per endpoint; do not roll back the successful one. |
| Post-flight push persistent failure | bounded retries exhausted | `sync_state = degraded`; pending diff persisted on disk; operator-visible warning; manual replay supported. Mission's central data integrity at risk until replayed. |
| In-flight crash | startup detects non-empty `pending_*` for a terminated mission | `mission_client` runs the post-flight push at startup before BIT completes for any new mission. |
## 8. Dependencies
**In-process**: `scan_controller`, `mission_client` (for pull/push round-trips), `mission_executor` (for post-flight trigger).
**External**: H3 spatial-index library (Rust crate). Persistent store engine — TBD (SQLite + H3 extension / KV / in-memory + snapshot — see Open Questions). Central API contract via `mission_client`'s extension of the `missions` API (per `architecture.md §7.13`).
## 9. Non-Functional Targets
| Concern | Target |
|---|---|
| Per-detection classify latency | O(1); p99 ≤1 ms |
| Pre-flight pull time | ≤30 s for a 30 km × 30 km mission area (per `architecture.md §6 NFR`) |
| Post-flight push time | ≤2 min for a 60 min mission's pass diff (per `architecture.md §6 NFR`) |
| Persistent-store size (single mission) | bounded; configurable retention |
| Crash recovery time | ≤2 s to a usable state; in-flight crash → next-boot push of pending |
| Boundary correctness | guaranteed by k-ring query |
## 10. Open Questions
- **Engine choice** (`architecture.md §8 Q3`): SQLite + H3 extension / KV / in-memory + snapshot.
- **Central API schema details** (`architecture.md §8 Q7`): paging strategy, photo-reference upload mechanism, observation-history retention policy.
- **Conflict resolution rules** (`architecture.md §8 Q8`): exact projection from observation log to current view; REMOVED-claim expiry window; multi-class disambiguation.
- Optimal H3 resolution per terrain class.
- Class-group definitions (`military_vehicle_group` vs `concealed_position_group` vs `movement_candidate`) — currently in `scan_controller` config.
## 11. References
- `architecture.md §3`, `§5 Architectural Principles` (MapObjects are mission-bracketed and centrally synchronised), `§6 NFR`, `§7.9 MapObjects (H3 spatial index)`, `§7.10 Sync Message Format`, `§7.11 Target Relocation`, `§7.12 New vs Existing object detection`, `§7.13 MapObjects Sync`.
- `system-flows.md §F7 MapObjects + ignored-items` (in-flight diff), `§F8 MapObjects sync (central DB, mission-bracketing)`.
- `data_model.md §MapObject`, `§IgnoredItem`, `§MapObjectObservation`, `§MapObjectsBundle`.
- `../_docs/02_missions.md` (mission cascade contract; new MapObjects endpoints).
@@ -0,0 +1,87 @@
# Component — `mavlink_layer`
**Layer**: Action (data plane out)
**Status**: forward-looking design (Rust); hand-rolled (no third-party SDK)
## 1. Purpose
Hand-rolled MAVLink v2 transport. Implements only the ~10–15 commands this codebase needs (full list in `architecture.md §7.7`). Owns serialisation / deserialisation, heartbeat, sequence numbers, retry, and a single connection abstraction (UDP or serial, picked at startup from CLI / env). No third-party SDK is used, which eliminates the largest current dependency-risk item.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| Outgoing `COMMAND_LONG`, `MISSION_*`, `SET_MODE` | `mission_executor` | per state transition | Hand-rolled message constructors per command. |
| Outgoing heartbeat | self (timer) | 1 Hz | `HEARTBEAT` to keep the autopilot's GCS-link alive. |
| Connection URI | startup config | once | `udp://...` or `serial:///dev/...`. |
| MAVLink-2 signing config | startup config | once | If supported by the link, signing is enabled; otherwise the link is treated as trusted. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| Decoded MAVLink messages | `mission_executor`, `telemetry_stream`, `movement_detector` (for UAV motion telemetry) | typed enum per message kind |
| Connection state | health aggregator | `connected`, `last_heartbeat_age_ms`, `tx_seq`, `rx_seq`, `parse_errors_total`, `signing_enabled`. |
The supported message surface (concise list; full table in `architecture.md §7.7`):
- `HEARTBEAT` (bidir)
- `COMMAND_LONG` subset (out): arm/disarm, takeoff, set-mode, change-speed, change-alt, land, RTL
- `COMMAND_ACK` (in)
- `MISSION_COUNT`, `MISSION_REQUEST_INT`, `MISSION_ITEM_INT`, `MISSION_ACK`, `MISSION_SET_CURRENT`, `MISSION_CURRENT`, `MISSION_ITEM_REACHED`, `MISSION_CLEAR_ALL`
- `GLOBAL_POSITION_INT`, `ATTITUDE`, `SYS_STATUS`, `EXTENDED_SYS_STATE`, `STATUSTEXT`
- `SET_MODE` (out, fixed-wing)
## 4. Key Responsibilities
- Open and maintain the MAVLink connection (UDP or serial). Reconnect on transport loss with bounded backoff.
- Encode outgoing messages with correct sequence numbers, system / component IDs, and (when enabled) MAVLink-2 signing.
- Decode incoming messages with strict validation: reject malformed frames, unknown message IDs, and signing failures.
- Emit a 1 Hz heartbeat. Detect autopilot heartbeat timeouts and surface to health.
- Demux `COMMAND_ACK` to the originating caller (per `command_id`); enforce a wall-clock ack timeout.
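
As an illustration of the encoding responsibility above, the sketch below builds an unsigned MAVLink v2 `HEARTBEAT` frame, including the X.25 (CRC-16/MCRF4XX) checksum seeded with the message's `CRC_EXTRA` byte. The function names are hypothetical, and the payload values (GCS type, invalid autopilot, active state) are assumptions chosen for the example. Signing is not covered.

```rust
/// One accumulate step of the X.25 / CRC-16/MCRF4XX checksum used by MAVLink.
pub fn crc_accumulate(byte: u8, crc: u16) -> u16 {
    let mut tmp = byte ^ (crc & 0x00ff) as u8;
    tmp ^= tmp << 4;
    (crc >> 8) ^ ((tmp as u16) << 8) ^ ((tmp as u16) << 3) ^ ((tmp as u16) >> 4)
}

/// Checksum over a byte slice, starting from the 0xFFFF seed.
pub fn crc16_mcrf4xx(data: &[u8]) -> u16 {
    data.iter().fold(0xffff_u16, |crc, &b| crc_accumulate(b, crc))
}

const MAVLINK_V2_STX: u8 = 0xFD;
const HEARTBEAT_MSG_ID: u32 = 0;
const HEARTBEAT_CRC_EXTRA: u8 = 50;

/// Encode an unsigned MAVLink v2 HEARTBEAT frame (hypothetical helper).
pub fn encode_heartbeat(seq: u8, sys_id: u8, comp_id: u8) -> Vec<u8> {
    // Payload in wire order: custom_mode (u32 LE), type, autopilot,
    // base_mode, system_status, mavlink_version.
    let payload: [u8; 9] = [
        0, 0, 0, 0, // custom_mode
        6,          // MAV_TYPE_GCS (assumed for this example)
        8,          // MAV_AUTOPILOT_INVALID
        0,          // base_mode
        4,          // MAV_STATE_ACTIVE
        3,          // mavlink_version
    ];

    let mut frame = Vec::with_capacity(10 + payload.len() + 2);
    frame.push(MAVLINK_V2_STX);
    frame.push(payload.len() as u8);
    frame.push(0); // incompat_flags (0 = unsigned)
    frame.push(0); // compat_flags
    frame.push(seq);
    frame.push(sys_id);
    frame.push(comp_id);
    frame.extend_from_slice(&HEARTBEAT_MSG_ID.to_le_bytes()[..3]); // 24-bit message id
    frame.extend_from_slice(&payload);

    // The checksum covers everything after STX, then the CRC_EXTRA byte.
    let crc = crc_accumulate(HEARTBEAT_CRC_EXTRA, crc16_mcrf4xx(&frame[1..]));
    frame.extend_from_slice(&crc.to_le_bytes());
    frame
}
```

A real encoder would also apply MAVLink v2 trailing-zero payload truncation; it does not change this frame because the last payload byte is non-zero.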
## 5. Internal State
- Connection handle (UDP socket or serial port).
- Outgoing sequence number.
- In-flight command map (`command_id → (caller, deadline)`).
- Per-message-kind parse error counters.
State is in-process only.
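
The in-flight command map can be sketched as a small table keyed by `command_id` with a wall-clock deadline per entry. This is an assumption-laden illustration: `InFlightCommands` is a hypothetical name, the caller handle is a generic `C`, and only one outstanding command per `command_id` is modelled.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Maps an outstanding command id to its caller handle and ack deadline.
pub struct InFlightCommands<C> {
    pending: HashMap<u16, (C, Instant)>,
    ack_timeout: Duration,
}

impl<C> InFlightCommands<C> {
    pub fn new(ack_timeout: Duration) -> Self {
        Self { pending: HashMap::new(), ack_timeout }
    }

    /// Record a command sent at `now`; its deadline is `now + ack_timeout`.
    pub fn register(&mut self, command_id: u16, caller: C, now: Instant) {
        self.pending.insert(command_id, (caller, now + self.ack_timeout));
    }

    /// Route an incoming COMMAND_ACK to the originating caller, if one is waiting.
    pub fn on_ack(&mut self, command_id: u16) -> Option<C> {
        self.pending.remove(&command_id).map(|(caller, _)| caller)
    }

    /// Remove and return every entry whose deadline has passed at `now`.
    pub fn expire(&mut self, now: Instant) -> Vec<(u16, C)> {
        let expired: Vec<u16> = self
            .pending
            .iter()
            .filter(|(_, entry)| entry.1 <= now)
            .map(|(id, _)| *id)
            .collect();
        expired
            .into_iter()
            .filter_map(|id| self.pending.remove(&id).map(|(caller, _)| (id, caller)))
            .collect()
    }
}
```

Passing `now` explicitly keeps the timeout logic testable without sleeping; the expired entries are what would be bubbled to `mission_executor` as ack timeouts.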
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| Transport open failure | OS error | Bounded backoff; surface to health → red. |
| Heartbeat from autopilot missing | wall-clock timeout | Surface `link_lost` to health and to `mission_executor`; do not silently fail. |
| Command-ack timeout | wall-clock | Bubble timeout to `mission_executor`; the executor decides retry vs failure. |
| Malformed inbound frame | parser error | Drop the frame; increment counter; do not abort the link. |
| MAVLink-2 signing mismatch (if enabled) | signature check | Reject the frame; alert; do not silently accept. |
| Sequence-number gap | rx_seq vs expected | Log; not a hard failure on its own. |
## 7. Dependencies
**In-process** (input): `mission_executor`.
**In-process** (output): `mission_executor`, `telemetry_stream`, `movement_detector`.
**External**: ArduPilot / PX4 over MAVLink v2 (UDP or serial).
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| Per-message round-trip on a healthy link | ≤50 ms p99 |
| Heartbeat cadence | 1 Hz out |
| Command-ack timeout | configurable; default 1 s, with retry handled by `mission_executor` |
| Reconnect after transport loss | ≤2 s on serial / ≤5 s on UDP |
| Message subset | ~10–15 commands only; adding more requires explicit design review |
## 9. Open Questions
- **MAVLink-2 message signing** (`architecture.md §8 Q6`): whether the airframe link enables signing or treats the link as trusted.
## 10. References
- `architecture.md §3`, `§5 Architectural Principles` (no MAVSDK, no silent error swallowing), `§7.7 MAVLink and Piloting`.
- `system-flows.md §F6 Mission lifecycle`.
@@ -0,0 +1,93 @@
# Component — `mission_client`
**Layer**: Action (data plane out)
**Status**: forward-looking design (Rust)
## 1. Purpose
Pulls the mission from the external `missions` API on start; validates against the shared `mission-schema` artefact; supplies the parsed mission to `mission_executor`. POSTs middle-waypoint inserts on operator-confirmed targets, owns the **MapObjects pre-flight pull / post-flight push** round trips against the same `missions` API, and survives transient connection loss with bounded retry.
`autopilot` and `missions` are **separate repos** with a shared `mission-schema`. There is no in-process mission database in autopilot. The MapObjects endpoints (`/missions/{id}/mapobjects` GET + POST) are an extension of the `missions` API per `architecture.md §7.13`.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| `mission_id` | startup CLI / env | once | Identifies which mission to fetch. |
| Missions API endpoint + auth | startup config | once | HTTPS REST; auth model TBD per `../_docs/02_missions.md`. |
| Middle-waypoint POST request | `mission_executor` (via `scan_controller` / `operator_bridge`) | event | The mission with the inserted middle waypoint. |
| Mission-update notification | missions API (push or poll) | event | Optional; if missions API supports change notifications, propagate to `mission_executor`. |
| MapObjects post-flight push trigger | `mission_executor` (on terminal state) + `mapobjects_store` (pending diff handle) | once per mission | Triggers the F8 post-flight upload. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| Parsed mission | `mission_executor` | `{ waypoints: Vec<MissionWaypoint>, geofences: Vec<Geofence>, return_point, mission_id, schema_version }` |
| Pre-flight MapObjects bundle | `mapobjects_store` | `{ map_objects, ignored_items, fetched_at, schema_version, fallback_used: bool }` |
| Post-flight push status | `mapobjects_store`, health aggregator | per-endpoint ack / retry / failure |
| Mission cascade signal (`DELETE /missions/{id}` echoed by missions API) | `mapobjects_store` | event |
| Health metric | health aggregator | `last_fetch_ts`, `fetch_errors_total`, `schema_version`, `connection_state`, `mapobjects_pull_state`, `mapobjects_push_pending`. |
## 4. Key Responsibilities
- Fetch the mission by `mission_id` on startup. Validate against `mission-schema`. Reject on schema-invalid; do not silently downcast.
- **MapObjects pre-flight pull.** Immediately after the mission fetch succeeds, call `GET /missions/{id}/mapobjects` (and `GET /missions/{id}/mapobjects/ignored` if separated). Hand the bundle to `mapobjects_store`. On failure, surface to `mission_executor` BIT (F9) — operator may acknowledge cached fallback or abort. Never silent.
- POST middle-waypoint updates; await ack; surface failure to `mission_executor` (which decides whether to halt, RTL, or proceed with the original mission).
- **MapObjects post-flight push.** When `mission_executor` reaches a terminal state, drain `mapobjects_store`'s pending diff and call `POST /missions/{id}/mapobjects` + `POST /missions/{id}/mapobjects/ignored`. Independent retry per endpoint with bounded backoff. On persistent failure, persist pending diff on disk and surface a warning (operator may manually replay).
- **Crash-recovery push.** At startup, if `mapobjects_store` reports non-empty pending diff for a previously terminated mission, run the post-flight push for that mission BEFORE BIT for any new mission begins.
- On `DELETE /missions/{mission_id}` (observed via missions API or out-of-band signal), notify `mapobjects_store` to drop mission-scoped objects.
- Survive transient connection loss with bounded exponential backoff. Pre-flight, this delays takeoff. In-flight, missing connectivity does not stop execution of the already-in-memory mission. (No central writes happen in-flight by design — Frozen choice 6.)
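
The bounded exponential backoff mentioned above can be sketched as a precomputed delay schedule. The function name, the doubling factor, and the cap are assumptions for illustration; the source only fixes that retries are bounded and exponential. Jitter is omitted for clarity.

```rust
use std::time::Duration;

/// Delay before each retry attempt: base, 2x base, 4x base, ... capped at `cap`.
/// The schedule has exactly `max_attempts` entries, so retry is always bounded.
pub fn backoff_schedule(base: Duration, cap: Duration, max_attempts: u32) -> Vec<Duration> {
    (0..max_attempts)
        .map(|attempt| {
            let factor = 2u32.saturating_pow(attempt);
            base.saturating_mul(factor).min(cap)
        })
        .collect()
}
```

With a 500 ms base, a 5 s cap, and five attempts, the schedule is 0.5 s, 1 s, 2 s, 4 s, 5 s.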
## 5. Internal State
- Currently active mission (the original, plus any patched version from middle-waypoint inserts).
- Schema version reported by missions API at fetch.
- MapObjects pull state: `not_started | in_flight | synced | cached_fallback | failed`.
- MapObjects push queue: per-mission pending diff with retry counter and last-failure reason.
- Retry counter and last-failure reason for each endpoint.
State is in-process only **except** for the post-flight push queue, which is durable on disk so a crash mid-mission does not lose the diff.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| Missions API unreachable at startup | HTTP error / DNS failure | Bounded retry; if max-retry exceeded, refuse to start the mission; health → red; surface to operator. |
| Schema mismatch (mission or mapobjects) | response decoder | Refuse to start the mission; surface raw response (size-capped) for offline analysis. |
| Pre-flight MapObjects pull fails | HTTP error / timeout | BIT degrades; operator may acknowledge cached fallback or abort. Never silent. |
| Mid-flight middle-waypoint POST fails | HTTP error | `mission_executor` decides: continue with the existing in-memory mission, or RTL if the failure is persistent. |
| Post-flight MapObjects push fails | HTTP error / 5xx | Persist pending diff on disk; bounded retry with exponential backoff; operator-visible warning after max retries. |
| Post-flight push partial success | per-endpoint status | Independent retry per endpoint; do not roll back the successful one. |
| Mission deleted mid-flight | `DELETE` notification | Surface to operator; safe-shutdown decision is a policy in `mission_executor` (default: continue current mission and notify on landing). The post-flight push will receive 404; data preserved as orphaned for forensic review. |
## 7. Dependencies
**In-process** (input): startup config, `mission_executor`, `operator_bridge` (via `scan_controller`), `mapobjects_store` (pending-diff handle).
**In-process** (output): `mission_executor`, `mapobjects_store`.
**External**: missions API (HTTPS REST), including the MapObjects extension. Contract owner: `../_docs/02_missions.md` (with the §7.13 extension proposed in this repo).
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| Startup mission fetch | ≤5 s on healthy connectivity |
| Pre-flight MapObjects pull | ≤30 s for a 30 km × 30 km mission area |
| Middle-waypoint POST | ≤2 s on healthy connectivity |
| Post-flight MapObjects push | ≤2 min for a 60 min mission's pass diff; persisted on disk if push fails |
| Bounded retry | configurable max; default 5 attempts with exponential backoff for synchronous calls; 24 h durable retry window for the post-flight push |
## 9. Open Questions
- **`mission-schema` extraction location** (`architecture.md §8 Q5`): `_infra/` at suite root, or a small third repo.
- **MapObjects endpoint contract** (`architecture.md §8 Q7`): paging, photo-ref upload, retention policy.
- **MapObjects conflict resolution** (`architecture.md §8 Q8`): server-side; this component only consumes the result.
- Auth / session model for the missions API (per `../_docs/02_missions.md`).
## 10. References
- `architecture.md §3`, `§5 Architectural Principles` (separate repos + shared schema; MapObjects mission-bracketed), `§7.6 Solution Architecture`, `§7.13 MapObjects Sync`.
- `system-flows.md §F6 Mission lifecycle`, `§F8 MapObjects sync`.
- `data_model.md §MissionItem`, `§MissionWaypoint`, `§Geofence`, `§MapObjectsBundle`, `§MapObjectObservation`.
- `../_docs/02_missions.md`.
@@ -0,0 +1,94 @@
# Component — `mission_executor`
**Layer**: Action (data plane out)
**Status**: forward-looking design (Rust)
## 1. Purpose
Drives the airframe through a typed state machine: connect → health-check → **pre-flight self-test (BIT, F9)** → (variant-specific arm/takeoff or wait-for-AUTO) → upload mission → fly mission → land. Owns geofence enforcement (both INCLUSION and EXCLUSION), the **lost-link failsafe ladder** (F10), and **battery / fuel threshold enforcement**. Inserts middle waypoints on operator-confirmed targets and resumes the original mission after target-follow ends. Issues all autopilot-facing commands through `mavlink_layer`. Triggers post-flight MapObjects push (F8) on terminal state.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| Mission JSON (parsed) | `mission_client` | once at start; on middle-waypoint update | Contains waypoints + INCLUSION/EXCLUSION geofences + return point. |
| Airframe variant | startup config | once | `multirotor` or `fixed_wing`. |
| MAVLink telemetry | `mavlink_layer` | continuous | Position, attitude, mode, sys-status, mission progress. |
| Middle-waypoint hint | `scan_controller` (from `operator_bridge`) | event on operator confirm | Triggers mission re-upload. |
| Target-follow release / loss / timeout | `scan_controller` | event | Triggers reverting to the original mission. |
| Health input from peer components | health aggregator | continuous | Used for the health-check gate before takeoff. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| MAVLink commands (arm, takeoff, set-mode, change-speed, change-alt, land, RTL, mission-clear, mission-upload, set-current) | `mavlink_layer` | per state transition |
| UAV telemetry (forwarded) | `scan_controller`, `movement_detector`, `telemetry_stream` | continuous |
| Mission state | `scan_controller`, `operator_bridge` | event on transitions |
| Health metric | health aggregator | current state, `state_duration_ms`, `transition_failures_by_state`, geofence violations, retry counts. |
## 4. Key Responsibilities
- Run the variant-specific state machine (see `architecture.md §7.7`):
- **Multirotor**: `DISCONNECTED → CONNECTED → HEALTH_OK → BIT_OK → ARMED → TAKE_OFF → MISSION_UPLOADED → FLY_MISSION → LAND → POST_FLIGHT_SYNC → DONE`.
- **Fixed-wing**: `DISCONNECTED → CONNECTED → HEALTH_OK → BIT_OK → MISSION_UPLOADED → WAIT_AUTO → FLY_MISSION → LAND → POST_FLIGHT_SYNC → DONE`.
- Apply bounded retry with exponential backoff at every transition; explicit max-retry; on exceeding it, health flips to red and the executor surfaces the failure via `operator_bridge`. **No infinite retry.**
- **Run pre-flight BIT (F9)** before transitioning to `ARMED` / `WAIT_AUTO`. BIT covers every dependency in `architecture.md §5` plus mission load + MapObjects pre-flight pull (cached fallback acknowledged) + persistent-store free space + wall-clock binding. On BIT FAIL, no transition. On DEGRADED, surface to operator for signed acknowledgement (per Q9).
- **Run the lost-link failsafe ladder (F10)** every tick: `LinkOk → LinkDegraded → LinkLost → LinkLostInFollow`. Default RTL after 30 s grace; configurable. MAVLink-link loss to ArduPilot itself is a separate, more severe event — health → red, airframe failsafe takes over (we do NOT override it).
- **Enforce battery / fuel thresholds.** Read `SYS_STATUS` / `EXTENDED_SYS_STATE` continuously; trigger RTL at `battery ≤ rtl_threshold` (default 25 %); land-now at `battery ≤ hard_floor` (default 15 %); operator override only via signed command.
- Enforce geofences. INCLUSION violations halt forward progress and trigger RTL; EXCLUSION violations trigger the same. Both are honoured (the earlier C++ behaviour silently ignored EXCLUSION; the new design rejects that).
- On middle-waypoint hint: recompute the mission (`current_position → middle_waypoint → resume_original_route`), `MISSION_CLEAR_ALL`, re-upload via the standard sequence, `MISSION_SET_CURRENT(0)`, and resume.
- On target-follow ending: recompute and re-upload the original mission from the current position; resume.
- **Trigger post-flight MapObjects push (F8)** on entry to `POST_FLIGHT_SYNC` — that is, after `LAND` completes (or after RTL completes, or after operator-acknowledged abort). Hand off to `mission_client`.
- Forward MAVLink telemetry to `scan_controller` (for proximity priority + middle-waypoint computation), to `movement_detector` (for ego-motion compensation), and to `telemetry_stream` (for operator overlay).
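
The two variant-specific state sequences above can be expressed as a single typed transition function. The sketch below covers only the nominal forward path; retries, failure handling, and RTL are left out, and the Rust names are hypothetical renderings of the state names in this document.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Variant {
    Multirotor,
    FixedWing,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Disconnected,
    Connected,
    HealthOk,
    BitOk,
    Armed,
    TakeOff,
    MissionUploaded,
    WaitAuto,
    FlyMission,
    Land,
    PostFlightSync,
    Done,
}

/// Next state on the nominal path, or `None` if the state is terminal
/// or not part of the given variant's sequence.
pub fn next_state(variant: Variant, state: State) -> Option<State> {
    use State::*;
    use Variant::*;
    Some(match (variant, state) {
        (_, Disconnected) => Connected,
        (_, Connected) => HealthOk,
        (_, HealthOk) => BitOk,
        (Multirotor, BitOk) => Armed,
        (Multirotor, Armed) => TakeOff,
        (Multirotor, TakeOff) => MissionUploaded,
        (Multirotor, MissionUploaded) => FlyMission,
        (FixedWing, BitOk) => MissionUploaded,
        (FixedWing, MissionUploaded) => WaitAuto,
        (FixedWing, WaitAuto) => FlyMission,
        (_, FlyMission) => Land,
        (_, Land) => PostFlightSync,
        (_, PostFlightSync) => Done,
        (FixedWing, Armed) | (FixedWing, TakeOff) | (Multirotor, WaitAuto) | (_, Done) => {
            return None
        }
    })
}

/// Walk the nominal path from `Disconnected` to the terminal state.
pub fn happy_path(variant: Variant) -> Vec<State> {
    let mut current = State::Disconnected;
    let mut path = vec![current];
    while let Some(next) = next_state(variant, current) {
        path.push(next);
        current = next;
    }
    path
}
```

Because the match is exhaustive, adding a state forces every variant's handling of it to be decided at compile time.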
## 5. Internal State
- Current state + variant.
- Currently active mission (original) + active patched mission (with middle waypoint), if any.
- Per-transition retry counter and last-failure reason.
- Mission progress (current item index).
- Geofence violation history (for diagnostics).
State is in-process only; restart re-runs the state machine from `DISCONNECTED`.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| MAVLink connection lost | heartbeat timeout from `mavlink_layer` | Bounded retry; health → red after threshold; state machine pauses (does not reset). |
| Health-check gate fails (sensors not ok, low battery, etc.) | telemetry inspection | Stay in `CONNECTED` state; alert; no takeoff. |
| BIT FAIL on any item | F9 evaluation | No transition to `BIT_OK`; surface report to operator; remain in `HEALTH_OK`. |
| Mission upload `MISSION_ACK` rejection | `mavlink_layer` response | Bounded retry with full re-upload; on max-retry, health → red, surface to operator. |
| Geofence INCLUSION exit | telemetry vs polygon | Trigger RTL via MAVLink; surface alert; transition to `LAND`. |
| Geofence EXCLUSION entry | telemetry vs polygon | Trigger RTL via MAVLink; surface alert; transition to `LAND`. |
| Operator/Ground-Station modem link lost | F10 ladder evaluation | `LinkDegraded` (5–30 s) → health yellow + queue events; `LinkLost` (>30 s) → RTL; `LinkLostInFollow` (>30 s in target-follow) → 30 s grace then RTL. Configurable. |
| MAVLink-link loss to ArduPilot/PX4 | heartbeat timeout | Health → red; airframe's own MAVLink failsafe takes over (we do NOT override). |
| Battery ≤ rtl_threshold (default 25 %) | SYS_STATUS | Trigger RTL; surface alert; transition to `LAND`. |
| Battery ≤ hard_floor (default 15 %) | SYS_STATUS | Land-now via `MAV_CMD_NAV_LAND` at safest reachable point; health → red. |
| Operator override of safety threshold | signed command (Q9) | Permitted; recorded in audit log with operator ID + rationale. |
| Middle-waypoint compute fails (e.g., target outside INCLUSION) | pre-upload validation | Reject the hint with reason; surface to `operator_bridge`; original mission continues. |
| Target-follow handover from `scan_controller` while not yet airborne | state guard | Reject; surface error; never deliver target-follow before `FLY_MISSION`. |
| Post-flight MapObjects push fails | F8 status | Persist pending diff on disk; bounded retry; operator-visible warning after max retries. State machine still reaches `DONE` so a new mission can start. |
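
The lost-link ladder row above can be sketched as a pure function of link silence and follow state. The thresholds are read from that row as a 5 s degraded boundary and a 30 s lost boundary; both are treated here as assumed, configurable defaults, and the type names are hypothetical.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LinkState {
    LinkOk,
    LinkDegraded,
    LinkLost,
    LinkLostInFollow,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LinkAction {
    Continue,
    /// Health yellow; queue operator-bound events.
    QueueEvents,
    Rtl,
    /// Start the follow-specific grace period, then RTL.
    GraceThenRtl,
}

/// Ladder thresholds (assumed defaults; the design states they are configurable).
pub struct LadderConfig {
    pub degraded_after: Duration,
    pub lost_after: Duration,
}

impl Default for LadderConfig {
    fn default() -> Self {
        Self { degraded_after: Duration::from_secs(5), lost_after: Duration::from_secs(30) }
    }
}

/// Evaluate the ladder for the time elapsed since the last operator-link traffic.
pub fn evaluate(silence: Duration, in_target_follow: bool, cfg: &LadderConfig) -> (LinkState, LinkAction) {
    if silence > cfg.lost_after {
        if in_target_follow {
            (LinkState::LinkLostInFollow, LinkAction::GraceThenRtl)
        } else {
            (LinkState::LinkLost, LinkAction::Rtl)
        }
    } else if silence >= cfg.degraded_after {
        (LinkState::LinkDegraded, LinkAction::QueueEvents)
    } else {
        (LinkState::LinkOk, LinkAction::Continue)
    }
}
```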
## 7. Dependencies
**In-process** (input): `mission_client`, `mavlink_layer`, `scan_controller`, health aggregator.
**In-process** (output): `mavlink_layer`, `scan_controller`, `movement_detector`, `telemetry_stream`, `operator_bridge`.
**External**: ArduPilot / PX4 over MAVLink (mediated by `mavlink_layer`).
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| Time-to-takeoff (multirotor, healthy startup) | bounded; no infinite waits |
| Mission-upload retry budget | configurable max; default 3 attempts |
| Geofence response time | ≤500 ms from violation detection to RTL command |
| Middle-waypoint re-upload | ≤2 s end-to-end |
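
The symmetric geofence check behind the response-time target can be illustrated with a ray-casting point-in-polygon test. This sketch assumes planar coordinates (for example a local east/north projection) and assumes that a position must lie inside every INCLUSION polygon and outside every EXCLUSION polygon; the types are hypothetical stand-ins for the `Geofence` entity in `data_model.md`.

```rust
#[derive(Debug, Clone, Copy)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GeofenceKind {
    Inclusion,
    Exclusion,
}

pub struct Geofence {
    pub kind: GeofenceKind,
    pub polygon: Vec<Point>,
}

/// Ray-casting test: count crossings of a horizontal ray from `p` towards +x.
pub fn contains(polygon: &[Point], p: Point) -> bool {
    let n = polygon.len();
    if n < 3 {
        return false;
    }
    let mut inside = false;
    let mut j = n - 1;
    for i in 0..n {
        let (a, b) = (polygon[i], polygon[j]);
        // Only edges that straddle the ray's y coordinate can cross it.
        if (a.y > p.y) != (b.y > p.y) {
            let x_cross = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if p.x < x_cross {
                inside = !inside;
            }
        }
        j = i;
    }
    inside
}

/// True if the position breaches any fence: outside an INCLUSION or inside an EXCLUSION.
pub fn violates(fences: &[Geofence], p: Point) -> bool {
    fences.iter().any(|f| match f.kind {
        GeofenceKind::Inclusion => !contains(&f.polygon, p),
        GeofenceKind::Exclusion => contains(&f.polygon, p),
    })
}
```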
## 9. References
- `architecture.md §3`, `§5 Architectural Principles` (bounded retry, geofence symmetric, lost-link mandatory, BIT mandatory, MapObjects mission-bracketed), `§7.3 Reliability and safety`, `§7.7 MAVLink and Piloting` (lost-link ladder + battery thresholds).
- `system-flows.md §F6 Mission lifecycle`, `§F8 MapObjects sync`, `§F9 Pre-flight self-test`, `§F10 Lost-link failsafe ladder`.
- `data_model.md §MissionItem`, `§MissionWaypoint`, `§Geofence`.
@@ -0,0 +1,96 @@
# Component — `movement_detector`
**Layer**: Perception (data plane in)
**Status**: forward-looking design (Rust + OpenCV bindings; learned-CV fallback per `architecture.md §8 Q14`)
## 1. Purpose
Detect small moving point/cluster candidates that are not yet classifiable by Tier 1, in **both** the zoom-out and zoom-in scan levels, and enqueue them as POIs for confirmation. Compensates for UAV and gimbal motion using synchronised telemetry; naive frame differencing is rejected.
The component is suppressed only during `scan_controller`'s `TargetFollow` state (the gimbal is dominated by tracking commands during follow).
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| `Frame` | `frame_ingest` | up to 30 fps | Frames are skipped when `ai_locked` is set or the system is in `TargetFollow`. |
| Gimbal angle (yaw, pitch) | `gimbal_controller` | per frame, monotonic-timestamped | Telemetry-skew gate: reject samples where frame ↔ gimbal skew exceeds the configured tolerance for the current zoom band. |
| Zoom state | `gimbal_controller` | per frame, monotonic-timestamped | Drives zoom-band selection (`zoomed_out` vs `zoomed_in`) and per-band thresholds; also used for residual-motion scaling. |
| UAV motion telemetry | `mavlink_layer` (via `mission_executor`) | 10 Hz target | Position + attitude + velocity + monotonic timestamp. |
| Active-state hint | `scan_controller` | event | `enable_zoomed_out` / `enable_zoomed_in` / `disable` (the last is set during `TargetFollow`). |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| `MovementCandidate` | `scan_controller` | `{ frame_seq, bbox_normalized, residual_velocity_estimate, telemetry_quality, source_frame_ts, source_zoom_band }` |
| Health metric | health aggregator | `enabled`, `current_zoom_band`, `candidates_per_min_zoomed_out`, `candidates_per_min_zoomed_in`, `telemetry_skew_drops_total`, `compensation_quality_per_band`. |
## 4. Key Responsibilities
- Compute per-frame ego-motion using OpenCV optical flow / global motion estimation (e.g. pyramidal Lucas-Kanade feature tracking, dense Farnebäck flow, or a feature-based homography), refined by the synchronised gimbal + UAV telemetry.
- Subtract estimated ego-motion from per-pixel motion; cluster the residuals.
- Emit clusters that meet the **per-zoom-band** minimum size + persistence threshold as `MovementCandidate`s, capped to honour the system-wide ≤5 POIs/min operator-review budget shared with `scan_controller`.
- Self-disable in `TargetFollow`. The component still consumes frames while disabled (to keep its motion-history warm) but emits no candidates.
- Tag each emitted candidate with `source_zoom_band` so `scan_controller` can apply zoom-band-aware queueing logic (described in `system-flows.md §F2`).
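
The emission cap in the list above can be sketched as a sliding-window limiter that counts what it suppresses, so nothing is dropped silently. `RateCap` is a hypothetical name, and timestamps are plain monotonic milliseconds for the sake of a self-contained example.

```rust
use std::collections::VecDeque;

/// Allows at most `max_in_window` emissions in any `window_ms` interval.
pub struct RateCap {
    window_ms: u64,
    max_in_window: usize,
    emitted_at_ms: VecDeque<u64>,
    /// Count of suppressed candidates, exposed for the health metric.
    pub suppressed_total: u64,
}

impl RateCap {
    pub fn new(max_in_window: usize, window_ms: u64) -> Self {
        Self { window_ms, max_in_window, emitted_at_ms: VecDeque::new(), suppressed_total: 0 }
    }

    /// Returns true if a candidate may be emitted at `now_ms`; otherwise counts the drop.
    pub fn try_emit(&mut self, now_ms: u64) -> bool {
        // Forget emissions that have left the sliding window.
        while let Some(&oldest) = self.emitted_at_ms.front() {
            if now_ms.saturating_sub(oldest) >= self.window_ms {
                self.emitted_at_ms.pop_front();
            } else {
                break;
            }
        }
        if self.emitted_at_ms.len() < self.max_in_window {
            self.emitted_at_ms.push_back(now_ms);
            true
        } else {
            self.suppressed_total += 1;
            false
        }
    }
}
```

For the ≤5 POIs/min budget this would be constructed as `RateCap::new(5, 60_000)`; how the budget is split between the detector and `scan_controller` is not specified here.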
## 5. Per-zoom-band tuning
The same code path runs at zoom-out and zoom-in, but the configuration differs because the pixel-to-metre ratio differs by ~10×.
| Knob | Zoom-out (typical) | Zoom-in (typical) |
|---|---|---|
| Cluster persistence threshold | 3–5 frames | 6–10 frames (gimbal-pan-induced flicker is more frequent at narrow FOV) |
| Residual-velocity floor | low (small physical motion is enough) | higher (small physical motion is amplified pixel-wise; raising the floor reduces FP from compensation residuals) |
| Telemetry-skew tolerance | 50 ms frame ↔ gimbal, 100 ms frame ↔ UAV | 25 ms frame ↔ gimbal, 50 ms frame ↔ UAV (stricter — gimbal slewing dominates zoomed FOV) |
| Enqueue-latency budget | ≤1 s | ≤1.5 s (allows brief gimbal-stability window) |
| FP cap (per-band) | per `architecture.md §6 NFR` | per `architecture.md §6 NFR`; if exceeded, fallback per Q14 |
Exact values are mission-tunable; defaults are calibrated during the benchmark gate.
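
A minimal sketch of the per-band telemetry-skew gate, using the typical tolerances from the table above as defaults. The type and function names are hypothetical, and timestamps are assumed to be monotonic milliseconds from a shared clock.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ZoomBand {
    ZoomedOut,
    ZoomedIn,
}

/// Maximum allowed timestamp skew between a frame and its telemetry samples.
pub struct SkewTolerance {
    pub gimbal_ms: u64,
    pub uav_ms: u64,
}

/// Typical per-band defaults taken from the tuning table; mission-tunable in practice.
pub fn default_tolerance(band: ZoomBand) -> SkewTolerance {
    match band {
        ZoomBand::ZoomedOut => SkewTolerance { gimbal_ms: 50, uav_ms: 100 },
        ZoomBand::ZoomedIn => SkewTolerance { gimbal_ms: 25, uav_ms: 50 },
    }
}

/// True if both telemetry samples are close enough in time to the frame
/// for ego-motion compensation to be trusted in the given band.
pub fn skew_ok(band: ZoomBand, frame_ts_ms: u64, gimbal_ts_ms: u64, uav_ts_ms: u64) -> bool {
    let tol = default_tolerance(band);
    frame_ts_ms.abs_diff(gimbal_ts_ms) <= tol.gimbal_ms
        && frame_ts_ms.abs_diff(uav_ts_ms) <= tol.uav_ms
}
```

A frame that fails the gate would have its compensation dropped and be counted in `telemetry_skew_drops_total`.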
## 6. Internal State
- Rolling motion-history buffer (a few seconds of frames + telemetry). One buffer per zoom band; switching bands does not invalidate the buffer for the other.
- Per-cluster persistence counters (per zoom band).
- Telemetry-sync state machine.
- `current_zoom_band` derived from `gimbal_controller`'s zoom state.
State is in-process only.
## 7. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| Telemetry skew above tolerance (per zoom band) | timestamp delta exceeds threshold | Drop that frame's compensation; do not emit candidates for the affected window; counter-tagged drop. |
| Optical-flow degenerate | flow magnitudes implausible (e.g. camera failure, full motion blur) | Skip emission for that frame; surface as a health signal on sustained occurrence. |
| Sustained candidate flood at zoom-in (FP cap exceeded) | `candidates_per_min_zoomed_in` over a sliding window | Suppress zoom-in emission only; keep zoom-out emission running; surface health → yellow; this is the trigger condition for the Q14 fallback. |
| Sustained candidate flood at zoom-out (FP cap exceeded) | `candidates_per_min_zoomed_out` over a sliding window | Down-rank lowest-confidence candidates; surface health → yellow; never silently drop without counting. |
| Component disabled by `scan_controller` | active-state hint = `disable` | Emit zero candidates; keep motion history warm. |
## 8. Dependencies
**In-process**: `frame_ingest`, `gimbal_controller`, `mavlink_layer`, `scan_controller`.
**External**: OpenCV (patched, version-pinned). Optional: a learned-CV crate / module (RAFT-derivative or CNN motion-segmentation) behind a build-time feature flag — engaged only when the Q14 fallback is required.
## 9. Non-Functional Targets
| Concern | Target |
|---|---|
| Candidate enqueue latency (zoom-out) | ≤1 s from detection to POI in queue |
| Candidate enqueue latency (zoom-in) | ≤1.5 s from detection to POI in queue |
| False-positive rate at the operator surface | bounded by `scan_controller`'s ≤5 POIs/min cap; per-zoom-band internal caps prevent zoom-in starving zoom-out |
| CPU budget on Jetson | configurable; must coexist with Tier 1 (running in `../detections`) and Tier 2 |
| Telemetry-skew tolerance | per-zoom-band; defaults in §5 |
## 10. Open Questions
- **Q14 fallback selection** (architecture.md §8): if classical OpenCV fails the per-zoom-band FP cap at zoom-in, the fallback module — learned optical flow vs CNN motion-segmentation vs IMU-tighter-coupled classical — is open. Interface contract is fixed (`Frame + telemetry → Vec<MovementCandidate>`).
- Minimum cluster persistence threshold across zoom bands (refined during benchmark gate).
- Whether to share the motion-history buffer across zoom-band transitions or reset on transition (§6 currently says share).
## 11. References
- `architecture.md §3`, `§5 Architectural Principles` (ego-motion compensation mandatory; movement runs at both zoom levels), `§7.6 Movement detector`, `§8 Q14`.
- `system-flows.md §F2 Movement detection (zoom-out + zoom-in)`.
- `data_model.md §MovementCandidate`.
@@ -0,0 +1,89 @@
# Component — `operator_bridge`
**Layer**: Action (data plane out)
**Status**: forward-looking design (Rust)
## 1. Purpose
Surfaces POIs to the operator (via the always-on `telemetry_stream`) and routes operator commands (confirm / decline / target-follow start / target-follow release / safety-override / BIT-degraded-acknowledge) back into autopilot. On decline, appends an `IgnoredItem`. On confirm, hands a middle-waypoint hint to `mission_executor`. On target-follow start / release, drives `scan_controller`'s state transition. **Validates every operator command's authentication signature, replay protection, and session binding before dispatching it** — the modem link's encryption alone is not sufficient (per `architecture.md §5` and Q9).
The Ground Station is the operator-facing UI; `operator_bridge` is the autopilot-side counterpart.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| POI surface request | `scan_controller` | per POI | Includes Tier 1, Tier 2, and (optional) Tier 3 evidence. |
| POI dequeue / replace | `scan_controller` | event | When the queue rotates (cap, age-out, or completion). |
| Operator command (confirm / decline / target-follow start / target-follow release) | Ground Station (via `telemetry_stream`) | event | Acked back to operator with command id + result. |
| Modem link state | `telemetry_stream` | event | Used to decide whether to surface POIs at all (see Failure Modes). |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| Operator-facing POI event | `telemetry_stream` (which pushes to Ground Station) | `{ poi_id, mgrs, class_group, confidence, vlm_status, tier2_evidence_summary, photo_metadata }` |
| `IgnoredItem` append | `mapobjects_store` (via `scan_controller`) | on operator decline |
| Middle-waypoint hint | `mission_executor` (via `scan_controller`) | on operator confirm |
| Target-follow start / release | `scan_controller` | on operator command |
| Health metric | health aggregator | `pois_surfaced_per_min`, `decision_latency_p50/p99` (operator-side), `commands_in_flight`. |
## 4. Key Responsibilities
- Translate `POI` events from `scan_controller` into the wire format defined in `architecture.md §7.10 Drone ⇄ Operator Sync Message Format` and push them through `telemetry_stream`.
- Receive operator commands on the return path; **validate the authentication signature, replay-protection sequence number, and session token** before any other processing. Reject and surface to health on signature failure, sequence-number reuse, or unknown session.
- Validate that the command id matches a POI in flight (or a target-follow session, BIT report, or safety-override scope); ack the operator with the result.
- Attach the confidence-scaled operator decision deadline (40 % → 30 s, 100 % → 120 s, linear) to every surfaced POI. The timeout itself is enforced by `scan_controller`; this component only ensures the surfaced POI carries the deadline.
- On confirm, hand `(target_mgrs, target_class)` to `scan_controller` (which forwards a middle-waypoint hint to `mission_executor`).
- On decline, hand `(MGRS, class_group)` to `scan_controller` for `IgnoredItem` append.
- Forward BIT-degraded acknowledgements (signed) to `mission_executor` (F9), and safety-override commands (signed) for battery / lost-link suppression to `mission_executor` (F10).
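
The validate-before-dispatch rule above could be sketched roughly as follows. Everything here is an assumption made for illustration: `CommandEnvelope`, `CommandGate`, and `Reject` are hypothetical names, the signature check is a plain function pointer standing in for the still-open scheme (Q9), and "strictly increasing per session" is only one possible replay-protection policy.

```rust
use std::collections::HashMap;

/// Rejection reasons, named after the ack codes in the failure-mode table.
#[derive(Debug, PartialEq, Eq)]
pub enum Reject {
    AuthFailed,
    ReplayDetected,
}

/// Hypothetical envelope; the real wire format is defined elsewhere.
pub struct CommandEnvelope<'a> {
    pub session_token: &'a str,
    pub sequence_number: u64,
    pub payload: &'a [u8],
    pub signature: &'a [u8],
}

pub struct CommandGate {
    /// Stand-in for the undecided signature scheme.
    verify_signature: fn(&CommandEnvelope<'_>) -> bool,
    /// Known sessions mapped to the highest sequence number accepted so far.
    last_seq: HashMap<String, Option<u64>>,
}

impl CommandGate {
    pub fn new(verify_signature: fn(&CommandEnvelope<'_>) -> bool) -> Self {
        Self { verify_signature, last_seq: HashMap::new() }
    }

    /// Registers a session token (assumed to happen after operator auth).
    pub fn open_session(&mut self, token: &str) {
        self.last_seq.insert(token.to_owned(), None);
    }

    /// Runs all three checks; the command may be dispatched only on `Ok`.
    pub fn admit(&mut self, cmd: &CommandEnvelope<'_>) -> Result<(), Reject> {
        // 1. Session binding: unknown tokens are rejected outright.
        let last = match self.last_seq.get(cmd.session_token) {
            Some(last) => *last,
            None => return Err(Reject::AuthFailed),
        };
        // 2. Authentication signature.
        if !(self.verify_signature)(cmd) {
            return Err(Reject::AuthFailed);
        }
        // 3. Replay protection: sequence numbers must strictly increase (assumed policy).
        if matches!(last, Some(prev) if cmd.sequence_number <= prev) {
            return Err(Reject::ReplayDetected);
        }
        self.last_seq.insert(cmd.session_token.to_owned(), Some(cmd.sequence_number));
        Ok(())
    }
}
```

The sequence counter advances only after the signature check passes, so a forged command cannot burn sequence numbers for the legitimate operator.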
## 5. Internal State
- Currently surfaced POIs by id (with deadlines).
- In-flight target-follow session (if any).
- Per-command idempotency keys.
State is in-process only.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| Modem link down | `telemetry_stream` health | Stop surfacing POIs; queue them in `scan_controller` (whose cap still applies); resume on reconnect. F10 lost-link ladder owns the larger response. |
| Operator command for unknown POI id | command validation | Ack with error; do not act on it. |
| Operator command after deadline | command validation | Ack with `expired`; do not act on it. |
| Duplicate operator command (re-tx) | idempotency key | Ack with the cached result; do not double-act. |
| `scan_controller` rejects the confirm (e.g., already in target-follow) | response from controller | Ack operator with `rejected: already_following`; surface the active target. |
| Operator command signature invalid | auth check | Reject with `auth_failed`; log; surface health → red on sustained failures (potential hostile injection). |
| Operator command sequence number reused | replay-protection check | Reject with `replay_detected`; log; do not act on it. |
| Unknown session token | session validation | Reject with `auth_failed`; log; require operator re-auth at Ground Station. |
| Operator attempts to acknowledge a BIT FAIL as DEGRADED | severity check | Rejected by validation; surface to operator as `cannot_acknowledge_fail`. |
## 7. Dependencies
**In-process** (input): `scan_controller`, `telemetry_stream`.
**In-process** (output): `scan_controller` (for state transitions), indirectly `mapobjects_store` and `mission_executor` (via `scan_controller`).
**External**: Ground Station API (operator-facing); contract owned by `../_docs/04_system_design_clarifications.md`.
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| POI surface → operator visible | ≤1 s under normal modem conditions |
| Operator command → autopilot effect | ≤1 s under normal modem conditions |
| Idempotency window | 60 s (per-command-id cache) |
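
One way the per-command-id cache behind the 60 s idempotency window might look is sketched below. `IdempotencyCache` and `AckResult` are hypothetical, the `u64` command id is a placeholder type, and the clock is passed in explicitly so the behaviour can be tested without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical ack payload cached per command id.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AckResult {
    Applied,
    Expired,
    Rejected,
}

pub struct IdempotencyCache {
    window: Duration,
    entries: HashMap<u64, (Instant, AckResult)>,
}

impl IdempotencyCache {
    pub fn new(window: Duration) -> Self {
        Self { window, entries: HashMap::new() }
    }

    /// Returns the cached ack for a re-transmitted command, if still inside the window.
    pub fn lookup(&mut self, command_id: u64, now: Instant) -> Option<AckResult> {
        let window = self.window;
        // Evict entries older than the window so the map stays bounded.
        self.entries
            .retain(|_, (seen, _)| now.saturating_duration_since(*seen) < window);
        self.entries.get(&command_id).map(|(_, result)| result.clone())
    }

    /// Records the result of a command the first time it is acted on.
    pub fn record(&mut self, command_id: u64, result: AckResult, now: Instant) {
        self.entries.insert(command_id, (now, result));
    }
}
```

A duplicate that arrives inside the window gets the cached ack back and is not acted on again; after the window the id is forgotten.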
## 9. Open Questions
- Ground Station API contract (`architecture.md §8 Q2`): stream protocol (WebRTC / WebSocket-H.264 / gRPC server-streaming?), session/auth model, bbox-overlay rendering.
- **Operator-command authentication scheme** (`architecture.md §8 Q9`): HMAC over (session_token, sequence_number, payload) vs JWT-style ed25519 vs MAVLink-2 signing extended to operator commands vs separate envelope. The principle is committed; the scheme is open.
- **Multi-operator session policy** (`architecture.md §8 Q11`): single active operator at a time, or quorum?
## 10. References
- `architecture.md §3`, `§5 Architectural Principles` (operator commands authenticated, signed, replay-protected), `§7.10 Drone ⇄ Operator Sync Message Format`, `§8 Q9 / Q11`.
- `system-flows.md §F5 Operator round trip`, `§F9 Pre-flight self-test`, `§F10 Lost-link failsafe ladder`.
- `data_model.md §POI`, `§IgnoredItem`, `§OperatorCommand`.
- `../_docs/04_system_design_clarifications.md`.
@@ -0,0 +1,96 @@
# Component — `scan_controller`
**Layer**: Decision + Memory
**Status**: forward-looking design (Rust)
## 1. Purpose
The system's brain. A deterministic typed state machine — `ZoomedOut`, `ZoomedIn { roi, hold_started_at }`, and `TargetFollow { target_id, started_at }`. Owns the POI queue, timeouts, the ≤5 POIs/min operator-review cap, the confidence-scaled operator-decision window, gimbal command issuance, and the new/moved/existing/removed dispatch into `mapobjects_store`.
The full behaviour-tree spec — including tick scenarios and the 15 fixed-wing rules — lives in `system-flows.md §F4`.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| `DetectionBatch` | `detection_client` | per frame | Tier 1 primitives. |
| `MovementCandidate` | `movement_detector` | per frame at both zoom-out and zoom-in (suppressed only during `TargetFollow`) | Each candidate carries `source_zoom_band`. |
| `Tier2Evidence` | `semantic_analyzer` | per zoom-in hold | Path / endpoint / concealment scoring. |
| `VlmAssessment` | `vlm_client` (optional) | per zoom-in endpoint hold | `status: disabled` if VLM is off. |
| Operator commands | `operator_bridge` | event | confirm / decline / target-follow start / target-follow release. Authenticated, signed, replay-protected upstream of this component. |
| UAV telemetry | `mavlink_layer` (via `mission_executor`) | 10 Hz target | Position used for proximity-weighted POI priority and middle-waypoint computation. |
| Mission state | `mission_executor` | event | Current waypoint, mission progress; used for sweep-vs-route alignment. |
| MapObjects sync state | `mapobjects_store` | event at startup + post-flight | `synced` / `cached_fallback` / `degraded` — surfaces a health flag and (for `degraded`) suppresses MapObject diff classifications until corrected. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| `GimbalCommand` (yaw / pitch / zoom) | `gimbal_controller` | per state-machine tick or per zoom-in plan step |
| `POI` to operator | `operator_bridge` (then `telemetry_stream`) | enqueue / dequeue events |
| Middle-waypoint hint | `mission_executor` | event on operator-confirmed target |
| MapObjects update | `mapobjects_store` | new / moved / existing / removed dispatch |
| Health metric | health aggregator | `state`, `pois_in_queue`, `pois_per_min`, `tick_latency_p99`, `last_state_change_ts`, `mapobjects_sync_state`. |
## 4. Key Responsibilities
- Run the `ZoomedOut` / `ZoomedIn` / `TargetFollow` state machine. Transitions are explicit, typed, and fully enumerated; no ad-hoc booleans.
- Maintain the POI queue ordered by `confidence × proximity_to_current_camera × age_factor`. Hard-cap output to ≤5 POIs/min surfaced to the operator.
- Apply the confidence-scaled operator decision window (40 % → 30 s, 100 % → 120 s, linear; below 40 % the POI is not surfaced). Timeout = forget; decline = `IgnoredItem` entry via `mapobjects_store`.
- Suppress new POIs whose `(MGRS, class_group)` matches an existing `IgnoredItem`.
- For each new detection or movement candidate: compute the H3 cell, ask `mapobjects_store` to classify as new / moved / existing, and only surface non-existing entries.
- **Zoom-in candidate handling.** When a `MovementCandidate` arrives with `source_zoom_band = zoomed_in`, evaluate against the current ROI: if inside, bump current-ROI confidence; if outside the ROI but inside the broader zoomed FOV, enqueue as a candidate-POI; only interrupt the current zoom-in hold if the candidate's priority exceeds the current hold's priority.
- On operator confirmation: hand a middle-waypoint hint to `mission_executor`, transition to `TargetFollow`, and command `gimbal_controller` to keep the target in the centre 25 % of frame.
- On operator decline / timeout / target loss: append (decline only) an `IgnoredItem` and return to `ZoomedOut`.
- On `mapobjects_store` reporting `sync_state = degraded`, surface health → red and **do not** classify new detections (avoid corrupting the central observation log on next push); continue to surface POIs to the operator on Tier-1 + movement evidence alone.
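
The confidence-scaled decision window in the list above is a linear interpolation between two anchor points, which could be sketched as below. The function and constant names are hypothetical, and clamping confidences above 1.0 is an assumption the document does not state.

```rust
use std::time::Duration;

const MIN_SURFACED_CONFIDENCE: f32 = 0.40;
const WINDOW_AT_MIN_S: f32 = 30.0; // window at 40 % confidence
const WINDOW_AT_MAX_S: f32 = 120.0; // window at 100 % confidence

/// Returns the operator decision window, or `None` when the POI is
/// below the 40 % floor and must not be surfaced at all.
pub fn decision_window(confidence: f32) -> Option<Duration> {
    if confidence.is_nan() || confidence < MIN_SURFACED_CONFIDENCE {
        return None;
    }
    // Assumption: out-of-range confidences are clamped rather than rejected.
    let c = confidence.min(1.0);
    let t = (c - MIN_SURFACED_CONFIDENCE) / (1.0 - MIN_SURFACED_CONFIDENCE);
    let seconds = WINDOW_AT_MIN_S + t * (WINDOW_AT_MAX_S - WINDOW_AT_MIN_S);
    Some(Duration::from_secs_f32(seconds))
}
```

Under this reading a POI at 70 % confidence sits halfway between the anchors and gets roughly 75 s.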
## 5. Internal State
The state machine lives entirely in this component. State variables:
- Current state: `ZoomedOut | ZoomedIn { roi, hold_started_at } | TargetFollow { target_id, started_at }`.
- POI queue: ordered, with per-entry priority and queue position.
- Per-class operator-decision-window thresholds.
- Last-N tick timestamps for tick-latency observability.
- Frame-rate floor monitor: when sustained FPS < 10, suppress `ZoomedOut → ZoomedIn` transitions and surface health → yellow.
State is in-process only; restart starts in `ZoomedOut` with an empty queue.
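
A minimal sketch of the typed state above, assuming a small illustrative event set. `ScanEvent`, `Roi`, the `u64` target id, and the transition table are hypothetical; the authoritative transitions are in `system-flows.md §F4`. The catch-all arm keeps the example short, whereas the design calls for every transition to be enumerated.

```rust
use std::time::Instant;

/// Hypothetical ROI bounds in normalised image coordinates.
#[derive(Debug, Clone, PartialEq)]
pub struct Roi {
    pub x_min: f32,
    pub y_min: f32,
    pub x_max: f32,
    pub y_max: f32,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ScanState {
    ZoomedOut,
    ZoomedIn { roi: Roi, hold_started_at: Instant },
    TargetFollow { target_id: u64, started_at: Instant },
}

/// Illustrative events only; not the full F4 behaviour tree.
#[derive(Debug, Clone, PartialEq)]
pub enum ScanEvent {
    ZoomInRequested { roi: Roi },
    OperatorConfirmed { target_id: u64 },
    OperatorDeclined,
    HoldTimedOut,
    TargetLost,
    TargetFollowReleased,
}

impl ScanState {
    /// Restart always begins in `ZoomedOut`.
    pub fn restart() -> Self {
        Self::ZoomedOut
    }

    /// Pure transition function; `fps_ok` is false while sustained FPS < 10.
    pub fn on_event(self, event: ScanEvent, now: Instant, fps_ok: bool) -> Self {
        match (self, event) {
            (Self::ZoomedOut, ScanEvent::ZoomInRequested { roi }) if fps_ok => {
                Self::ZoomedIn { roi, hold_started_at: now }
            }
            (Self::ZoomedIn { .. }, ScanEvent::OperatorConfirmed { target_id }) => {
                Self::TargetFollow { target_id, started_at: now }
            }
            (Self::ZoomedIn { .. }, ScanEvent::OperatorDeclined | ScanEvent::HoldTimedOut) => {
                Self::ZoomedOut
            }
            (Self::TargetFollow { .. }, ScanEvent::TargetLost | ScanEvent::TargetFollowReleased) => {
                Self::ZoomedOut
            }
            // Everything else leaves the state unchanged in this sketch.
            (state, _) => state,
        }
    }
}
```

Because the per-state data lives inside the enum variants, a `hold_started_at` cannot exist outside `ZoomedIn`, which is the property the "no ad-hoc booleans" rule is after.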
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| `detection_client` health red | health input | Continue zoom-out sweep; emit no new POIs from Tier 1; movement candidates still flow. |
| `movement_detector` health red | health input | Continue; lose movement-candidate enqueueing. |
| `semantic_analyzer` health red | health input | Skip Tier 2; surface POIs with Tier-1-only evidence; flag in operator overlay. |
| `vlm_client` returns `status: disabled \| timeout \| ipc_error \| schema_invalid` | per-call status | Surface POI without VLM evidence (fail-closed). |
| `gimbal_controller` not ready | health input | Stay in current state; alert; do not silently drop scan steps. |
| `operator_bridge` disconnected | health input | Continue zoom-out (operator UI is unreachable, but the system must not crash); pause POI surfacing; resume on reconnect. F10 lost-link ladder owns the larger response. |
| `mapobjects_store` sync degraded | sync_state input | Suppress diff classifications; surface POIs on Tier-1 + movement only; health → red. |
| Sustained FPS < 10 | self-instrumented | Suppress zoom-in transitions; health → yellow. |
| Tick-latency above budget | self-instrumented | Health → yellow; investigate (likely upstream consumer slowness). |
## 7. Dependencies
**In-process** (input): `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client`, `operator_bridge`, `mission_executor`, `mapobjects_store`.
**In-process** (output): `gimbal_controller`, `operator_bridge`, `mission_executor`, `mapobjects_store`.
**External**: none directly. All external integrations are mediated by other components.
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| Tick latency | ≤10 ms p99 |
| POI enqueue → operator surface | ≤1 s in normal load |
| POI rate to operator | ≤5 POIs/min (hard cap) |
| Zoom-out → zoom-in transition | ≤2 s including physical zoom |
| Zoom-in hold duration | configurable; default 5 s/POI |
| Target-follow centre-window | target inside centre 25 % of frame while visible |
| Frame-rate floor | ≥10 fps sustained; below this, suppress zoom-in transitions |
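
The ≤5 POIs/min hard cap could be enforced with a sliding-window counter along the lines below. Whether the cap is meant as a sliding window or a fixed per-minute bucket is not stated, so the sliding window is an assumption, and `SurfaceRateCap` is a hypothetical name.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

pub struct SurfaceRateCap {
    max_per_window: usize,
    window: Duration,
    surfaced_at: VecDeque<Instant>,
}

impl SurfaceRateCap {
    pub fn new(max_per_window: usize, window: Duration) -> Self {
        Self { max_per_window, window, surfaced_at: VecDeque::new() }
    }

    /// Returns true (and records the event) if one more POI may be surfaced now.
    pub fn try_surface(&mut self, now: Instant) -> bool {
        // Forget surfacing events that have slid out of the window.
        while let Some(&oldest) = self.surfaced_at.front() {
            if now.saturating_duration_since(oldest) >= self.window {
                self.surfaced_at.pop_front();
            } else {
                break;
            }
        }
        if self.surfaced_at.len() < self.max_per_window {
            self.surfaced_at.push_back(now);
            true
        } else {
            false
        }
    }
}
```

A POI refused by `try_surface` would stay in the queue and be retried on a later tick rather than being dropped.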
## 9. References
- `architecture.md §3`, `§5 Architectural Principles`, `§6 NFR`, `§7.6 Scan controller and POI queue`, `§7.12 New vs Existing / Moved / Removed Object Detection`, `§7.13 MapObjects Sync`.
- `system-flows.md §F4 Scan controller behaviour tree` (full BT spec, tick scenarios, 15 fixed-wing rules).
- `data_model.md §POI`, `§IgnoredItem`, `§MapObject`.
@@ -0,0 +1,70 @@
# Component — `semantic_analyzer`
**Layer**: Perception (data plane in)
**Status**: forward-looking design (Rust + ONNX/TensorRT bindings)
## 1. Purpose
Tier 2 of the perception pipeline. Reasons over zoom-in crops using a primitive graph plus a lightweight ROI CNN. Active only when `scan_controller` is in `ZoomedIn`. Owns path-freshness scoring, endpoint scoring, branch choice at intersections, and concealment-POI scoring. Operates on bounded ROIs only — never full frames.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| `DetectionBatch` (Tier 1 primitives) | `detection_client` | per zoom-in frame | Used for primitive-graph construction (paths, branches, entrances, trees). |
| Zoom-in frame + ROI selection | `frame_ingest` (frame), `scan_controller` (ROI bounds) | per zoom-in hold | Bounded crop only; full frame is not consumed. |
| Per-class config | startup config | once | Confidence floors, freshness thresholds, branch-priority rules. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| `Tier2Evidence` | `scan_controller` | `{ roi_id, path_freshness, endpoint_score, concealment_score, recommended_next_action: PanFollowFootpath \| HoldEndpoint \| PanBroad \| ReturnToZoomOut, source_detections: Vec<DetectionId> }` |
| `Pan plan` | `scan_controller` (then `gimbal_controller`) | sequence of pan goals for footpath following |
| Health metric | health aggregator | `tier2_latency_p50/p99`, `roi_size_bytes_p99`, `errors_total`. |
## 4. Key Responsibilities
- Build a small primitive graph from Tier-1 detections inside the ROI: path nodes (footpaths, roads), endpoint nodes (branch piles, dark entrances, dugouts), context nodes (trees, tree blocks).
- Score path freshness using the freshness model (texture, edge clarity, undisturbed-surroundings cues).
- Score concealment for endpoint candidates.
- At intersections, recommend the freshest / most-promising branch for `gimbal_controller` to pan toward; emit a follow plan that keeps the path centred while the UAV moves.
- Bound every inference call by a strict ROI size and timeout. Never run on a full frame.
## 5. Internal State
- ROI-scoped primitive graphs (per-ROI lifetime; dropped on zoom-in exit).
- Lightweight CNN session (ONNX/TensorRT engine).
State is in-process only.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| ROI size exceeds limit | pre-decode size check | Reject the ROI; surface to `scan_controller` as `tier2_oversize`; do not decode. |
| Inference timeout (>200 ms) | wall-clock | Return `Tier2Evidence` with `status: timeout`; `scan_controller` decides to skip VLM and surface a low-evidence POI. |
| CNN session OOM or hardware error | inference call error | Health → red on sustained errors; `scan_controller` falls back to Tier-1-only POI surfacing. |
| Inconsistent primitive graph (e.g., disconnected paths) | graph validation step | Emit `Tier2Evidence` with `recommended_next_action: ReturnToZoomOut` and `path_freshness: null`. |
## 7. Dependencies
**In-process**: `detection_client`, `frame_ingest`, `scan_controller`.
**External**: ONNX Runtime / TensorRT (whichever the lightweight CNN ships with), OpenCV (preprocessing).
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| Per-ROI latency | ≤200 ms p99 |
| Concealed-position recall | ≥60 % |
| Concealed-position precision | ≥20 % (operators filter) |
| Footpath detection recall | ≥70 % |
| ROI memory footprint | bounded; no unbounded buffering |
## 9. References
- `architecture.md §3`, `§7.6 Tier 2 semantic analyzer`, `§7.5 Training Data`.
- `system-flows.md §F1 Frame pipeline`, `§F4 Scan controller behaviour tree`.
- `data_model.md §Tier2Evidence`.
@@ -0,0 +1,78 @@
# Component — `telemetry_stream`
**Layer**: Telemetry plane (always-on, parallel to the decision loop)
**Status**: forward-looking design (Rust)
## 1. Purpose
Continuous, always-on push of the camera feed + UAV telemetry + bbox overlay to the Ground Station API over modem. Carries operator commands (confirm / decline / target-follow start / target-follow release) on the return path. Independent of the decision loop — the operator always sees the live feed, not just on detection.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| Decoded `Frame` | `frame_ingest` | up to 30 fps | Re-encoded for the modem link bandwidth. |
| `DetectionBatch` | `detection_client` | per frame | Used to build the bbox overlay (server-burn-in or client-render — see Open Questions). |
| `MovementCandidate` (zoom-out + zoom-in) | `scan_controller` (forwarded) | per candidate | Surfaced in operator overlay; the `source_zoom_band` tag is preserved so the overlay can render zoom-out vs zoom-in candidates differently. |
| UAV telemetry | `mavlink_layer` (via `mission_executor`) | 10 Hz | Position, attitude, mode, sys-status. |
| Gimbal state | `gimbal_controller` | per change | yaw / pitch / zoom. |
| `POI` events | `operator_bridge` | per POI surface / dequeue | Passed straight through. |
| Operator commands | Ground Station (return path) | event | Forwarded to `operator_bridge`. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| Outbound stream | Ground Station API (over modem) | per stream protocol (TBD — see Open Questions) |
| Inbound operator commands | `operator_bridge` | event |
| Health metric | health aggregator | `link_state`, `bandwidth_used_mbps`, `frame_drop_rate`, `last_command_received_ts`. |
## 4. Key Responsibilities
- Encode and push the camera feed + telemetry + bbox overlay continuously, regardless of detection state.
- Apply bandwidth-aware rate adaptation (drop bbox-overlay frequency before frame frequency; drop frame frequency before resolution).
- Surface the modem link state to the health aggregator; `operator_bridge` consults this to decide whether to surface POIs.
- Receive operator commands on the return path; forward to `operator_bridge` with monotonic timestamps.
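
The degradation order in the rate-adaptation bullet (overlay rate first, then frame rate, then resolution) could be expressed as a single step-down function, sketched below. `StreamProfile` and `step_down` are hypothetical, and halving each knob per step is an arbitrary choice made only for illustration.

```rust
/// Hypothetical snapshot of the three knobs the stream can trade away.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StreamProfile {
    pub overlay_hz: u32,
    pub frame_fps: u32,
    pub height_px: u32,
}

/// Returns the next, cheaper profile, or `None` once every knob is at its floor.
/// Order: bbox-overlay frequency, then frame frequency, then resolution.
pub fn step_down(current: StreamProfile, floor: StreamProfile) -> Option<StreamProfile> {
    if current.overlay_hz > floor.overlay_hz {
        Some(StreamProfile {
            overlay_hz: (current.overlay_hz / 2).max(floor.overlay_hz),
            ..current
        })
    } else if current.frame_fps > floor.frame_fps {
        Some(StreamProfile {
            frame_fps: (current.frame_fps / 2).max(floor.frame_fps),
            ..current
        })
    } else if current.height_px > floor.height_px {
        Some(StreamProfile {
            height_px: (current.height_px / 2).max(floor.height_px),
            ..current
        })
    } else {
        None
    }
}
```

Because each call changes exactly one knob and never goes below the floor, the stream degrades in small steps instead of jumping from full resolution to no feed.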
## 5. Internal State
- Stream session handle.
- Rate-adaptation state machine.
- In-flight frame buffer (bounded).
State is in-process only.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| Modem link down | transport error / heartbeat | Surface `link_lost`; pause outbound push (do not buffer indefinitely); `operator_bridge` pauses POI surfacing. |
| Bandwidth saturation | adaptive monitor | Reduce bbox-overlay rate, then frame rate, then resolution; surface to health → yellow. |
| Inbound command unparseable | parser error | Reject; ack with error; do not act. |
| Inbound command from unauthenticated peer | session check (per Ground Station contract) | Reject; alert. |
## 7. Dependencies
**In-process** (input): `frame_ingest`, `detection_client`, `scan_controller`, `mavlink_layer`, `gimbal_controller`, `operator_bridge`.
**In-process** (output): `operator_bridge` (return-path commands).
**External**: Ground Station API. Contract owner: `../_docs/04_system_design_clarifications.md`.
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| End-to-end glass-to-operator latency | bounded by modem characteristics; target ≤2 s p99 on a healthy link |
| Always-on | yes; not detection-gated |
| Rate adaptation | smooth; no sudden full-resolution → no-feed transitions |
| Outbound buffering | bounded; no unbounded growth on slow link |
## 9. Open Questions
- **Ground Station API contract** (`architecture.md §8 Q2`): stream protocol (WebRTC / WebSocket-H.264 / gRPC server-streaming?), session/auth model, bbox-overlay rendering (server-side burn-in vs client-side render).
## 10. References
- `architecture.md §3`, `§5 Architectural Principles` (always-on stream, no silent error swallowing), `§7.6 Integration and reliability`.
- `system-flows.md §F5 Operator round trip`.
- `../_docs/04_system_design_clarifications.md`.
@@ -0,0 +1,82 @@
# Component — `vlm_client` (optional)
**Layer**: Perception (data plane in)
**Status**: forward-looking design (Rust); optional behind a feature flag and a runtime config flag
## 1. Purpose
Tier 3 of the perception pipeline. Asks a local NanoLLM/VILA1.5-3B process to confirm a zoom-in endpoint POI using one bounded ROI crop and a short prompt. Returns a structured `VlmAssessment`. The free-form VLM text is **not** a downstream API contract — only the validated structured output is.
VLM is optional; the system MUST function correctly when VLM is disabled or absent.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| Zoom-in ROI crop + prompt | `scan_controller` | per zoom-in endpoint hold | One bounded crop, short prompt, short answer. |
| `vlm_enabled` runtime flag | startup config | once at start (re-readable on SIGHUP if implemented) | Gates whether `scan_controller` calls this component at all. |
| IPC socket path | startup config | once | Unix-domain socket to the NanoLLM process. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| `VlmAssessment` | `scan_controller` | `{ label, confidence, status: ok \| inconclusive \| timeout \| schema_invalid \| ipc_error \| disabled, source_roi_id, latency_ms, model_version }` |
| Health metric | health aggregator | `enabled`, `vlm_latency_p50/p99`, `errors_by_kind`, `peer_cred_check_pass_rate`. |
## 4. Key Responsibilities
- Validate the ROI payload (size, format) **before** sending it across the IPC channel.
- Maintain the Unix-domain-socket connection to the NanoLLM process; perform a peer-credential check on connect (where supported by the platform).
- Send one bounded ROI + short prompt; await one short response within ≤5 s.
- Validate the response against the `VlmAssessment` schema; on schema-invalid, return `status: schema_invalid` to `scan_controller` and surface to health.
- Return `status: disabled` when the runtime flag is `false`; `scan_controller` treats this identically to "VLM not present" and proceeds with Tier 2 evidence alone.
- Capture `model_version` (whatever the NanoLLM process reports for its loaded weights) on every assessment for forensic correlation; log the version on change.
## 5. Internal State
- IPC socket handle and peer-credential cache.
- In-flight request map (request id → caller).
State is in-process only.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| VLM process not reachable | connect / send error | Return `status: ipc_error`; bounded-backoff reconnect; health → yellow then red. |
| Peer-cred check fails | platform API | Hard-fail the connect; do not retry without operator intervention; health → red. |
| Response timeout (>5 s) | wall-clock | Return `status: timeout`; do not block `scan_controller` past the budget. |
| Schema-invalid response | response parser | Return `status: schema_invalid`; log the raw response (size-capped) for offline analysis. |
| ROI payload too large | pre-send size check | Return `status: schema_invalid` synchronously; never send. |
| Optional component absent at build time | feature flag off at compile | `scan_controller` depends only on the `VlmAssessment` provider trait; the default impl returns `status: disabled`. The binary builds and runs identically without `vlm_client`. |
## 7. Dependencies
**In-process**: `scan_controller`.
**External**: NanoLLM / VILA1.5-3B local process. IPC over Unix-domain socket. No network egress.
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| Per-ROI latency | ≤5 s p99 |
| Memory budget | within the 6 GB shared budget after Tier 1 + Tier 2 |
| Cloud egress | **none** (hard rule) |
| Failure mode | fail-closed — never surface a POI with VLM evidence on a degraded VLM call |
## 9. Optionality Model
Two complementary mechanisms; the implementation chooses one or both:
1. **Runtime flag (`vlm_enabled`)** gated by the benchmark-gate result. When `false`, `scan_controller` skips VLM confirmation; the zoom-in hold proceeds with Tier 2 evidence alone.
2. **Build-time feature module.** `vlm_client` is a separate Cargo feature; the binary builds, links, and runs identically when the feature is off. `scan_controller` depends on a `VlmAssessmentProvider` trait whose default impl returns `status: disabled`.
Both must yield the same observable behaviour: the system functions correctly with VLM absent, only losing the zoom-in confirmation step.
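
A sketch of how both mechanisms might converge on one trait is shown below. The method signature, the trimmed `VlmAssessment` fields, `DisabledVlm`, and `select_provider` are assumptions; only the trait name and the `status: disabled` behaviour come from the text. "Default impl" is read here as a default implementor type, though a default trait method would serve equally well.

```rust
/// Status values as listed in this component's output shape.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VlmStatus {
    Ok,
    Inconclusive,
    Timeout,
    SchemaInvalid,
    IpcError,
    Disabled,
}

/// Trimmed-down assessment; the real entity carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct VlmAssessment {
    pub status: VlmStatus,
    pub confidence: f32,
    pub latency_ms: u32,
}

/// The only VLM-related item `scan_controller` needs to see.
pub trait VlmAssessmentProvider {
    fn assess(&mut self, roi_crop: &[u8], prompt: &str) -> VlmAssessment;
}

/// Default provider: always present, never calls out to any process.
pub struct DisabledVlm;

impl VlmAssessmentProvider for DisabledVlm {
    fn assess(&mut self, _roi_crop: &[u8], _prompt: &str) -> VlmAssessment {
        VlmAssessment { status: VlmStatus::Disabled, confidence: 0.0, latency_ms: 0 }
    }
}

/// Runtime gate. `real` would be `None` when the Cargo feature is compiled out.
pub fn select_provider(
    vlm_enabled: bool,
    real: Option<Box<dyn VlmAssessmentProvider>>,
) -> Box<dyn VlmAssessmentProvider> {
    match (vlm_enabled, real) {
        (true, Some(client)) => client,
        _ => Box::new(DisabledVlm),
    }
}
```

Either a `false` runtime flag or a missing build-time client lands on `DisabledVlm`, so the caller sees identical behaviour in both cases.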
## 10. References
- `architecture.md §5 Architectural Principles` (no cloud egress, fail-closed), `§7.6 Local VLM confirmation`.
- `system-flows.md §F3 VLM confirmation` (with explicit fail-closed and disabled branches).
- `data_model.md §VlmAssessment`.
@@ -0,0 +1,384 @@
# autopilot — Data Model
**Status**: forward-looking design (Rust). This is the canonical entity catalogue.
The autopilot binary itself has **one** persistent store: the on-device `mapobjects_store` (engine TBD — `architecture.md §8 Q3`). Everything else is in-memory only. Mission state and the central MapObjects state are pulled from the external `missions` API on start; there is no in-process mission database. The on-device `mapobjects_store` is a working copy of the central MapObjects state for the active mission's bounding box; the central observation log is the source of truth across missions (per `architecture.md §7.13`).
---
## 1. Entity Map
```mermaid
erDiagram
Frame ||--o{ Detection : "produced by detection_client"
Frame ||--o{ MovementCandidate : "produced by movement_detector (zoom-out + zoom-in)"
Detection ||--|| BoundingBox : "bbox_normalized"
DetectionBatch ||--o{ Detection : "contains"
POI ||--o| Tier2Evidence : "zoom-in Tier 2"
POI ||--o| VlmAssessment : "Tier 3 (optional, zoom-in)"
POI }o--o| MapObject : "lookup by H3 + class"
POI }o--o| IgnoredItem : "decline_suppressed"
MapObject ||--o{ MapObjectObservation : "history (central append-only log)"
MapObjectsBundle ||--o{ MapObject : "pre-flight pull"
MapObjectsBundle ||--o{ MapObjectObservation : "post-flight push"
MapObjectsBundle ||--o{ IgnoredItem : "ignored items round-trip"
OperatorCommand ||--o| POI : "confirm/decline target"
MissionItem ||--o{ MissionWaypoint : "translates to"
MissionItem ||--o{ Geofence : "carries"
Geofence }o--o{ Coordinate : "polygon"
MissionWaypoint ||--|| Coordinate : "at"
```
---
## 2. Perception entities
### `Frame`
A decoded video frame. Produced by `frame_ingest`; consumed by `detection_client`, `movement_detector`, `telemetry_stream`.
| Field | Type | Notes |
|---|---|---|
| `seq` | u64 | Monotonic sequence number; primary key for cross-component correlation. |
| `capture_ts_monotonic_ns` | u64 | Wall-clock-independent timestamp taken at the earliest practical point in the pipeline. |
| `decode_ts_monotonic_ns` | u64 | When `frame_ingest` finished decoding. |
| `pixels` | `Arc<Bytes>` | Raw pixel data; consumers do not copy. |
| `width`, `height` | u32 | |
| `pix_fmt` | enum | `NV12` \| `YUV420P` \| `RGB24` (decoder dependent). |
| `ai_locked` | bool | If set, downstream consumers skip detection (operator-side or supervisor gating). |
In-memory only.
### `BoundingBox`
| Field | Type | Notes |
|---|---|---|
| `x_min`, `y_min`, `x_max`, `y_max` | f32 | Normalised to `[0.0, 1.0]` in image coordinates. |
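
For illustration, a `BoundingBox` with the normalisation rule enforced at construction might look like the sketch below. The `new` and `to_pixels` helpers and the `min <= max` check are assumptions; the table only specifies the four `f32` fields and the `[0.0, 1.0]` range.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BoundingBox {
    pub x_min: f32,
    pub y_min: f32,
    pub x_max: f32,
    pub y_max: f32,
}

impl BoundingBox {
    /// Builds a box only if every coordinate is in [0.0, 1.0] and min <= max.
    pub fn new(x_min: f32, y_min: f32, x_max: f32, y_max: f32) -> Option<Self> {
        let in_unit = |v: f32| (0.0..=1.0).contains(&v);
        let all_in_unit = [x_min, y_min, x_max, y_max].iter().all(|&v| in_unit(v));
        if all_in_unit && x_min <= x_max && y_min <= y_max {
            Some(Self { x_min, y_min, x_max, y_max })
        } else {
            None
        }
    }

    /// Converts to pixel coordinates for a frame of the given size.
    pub fn to_pixels(&self, width: u32, height: u32) -> (u32, u32, u32, u32) {
        let scale = |v: f32, extent: u32| (v * extent as f32).round() as u32;
        (
            scale(self.x_min, width),
            scale(self.y_min, height),
            scale(self.x_max, width),
            scale(self.y_max, height),
        )
    }
}
```

Keeping coordinates normalised lets the same box be reused across the re-encoded stream and the original frame without rescaling state.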
### `Detection`
One Tier-1 detection. Mirrors the `../detections` contract; carries through to operator overlay unchanged.
| Field | Type | Notes |
|---|---|---|
| `class_id` | u32 | |
| `class_name` | string | Human-readable label. |
| `confidence` | f32 | 0.0–1.0. |
| `bbox_normalized` | `BoundingBox` | |
| `mask_or_polyline` | optional bytes | For polyline classes (e.g. footpaths). |
| `source_frame_seq` | u64 | Foreign key into `Frame`. |
### `DetectionBatch`
| Field | Type | Notes |
|---|---|---|
| `frame_seq` | u64 | |
| `detections` | `Vec<Detection>` | |
| `latency_ms` | u32 | Tier-1 round-trip; observed for budget compliance. |
| `model_version` | string | Reported by `../detections`; logged on change. |
### `MovementCandidate`
A residual-motion cluster surviving ego-motion compensation in `movement_detector`.
| Field | Type | Notes |
|---|---|---|
| `frame_seq` | u64 | |
| `bbox_normalized` | `BoundingBox` | |
| `residual_velocity_estimate` | optional struct | Direction + magnitude in image coords; used for prioritisation. |
| `telemetry_quality` | enum | `synced` \| `degraded` \| `unsynced` (drives whether the candidate may be surfaced at all). |
| `source_frame_ts_monotonic_ns` | u64 | |
| `source_zoom_band` | enum | `zoomed_out` \| `zoomed_in`. Drives `scan_controller`'s queueing logic (per `system-flows.md §F2`): zoom-out candidates enter the POI queue normally; zoom-in candidates may bump current-ROI confidence or enter the queue with their own priority. |
### `Tier2Evidence`
Output of `semantic_analyzer` for a single zoom-in ROI hold.
| Field | Type | Notes |
|---|---|---|
| `roi_id` | uuid | Stable identifier within a zoom-in hold. |
| `path_freshness` | f32 \| null | 0.0 = no path / not applicable; 1.0 = fresh. |
| `endpoint_score` | f32 \| null | Concealed-position likelihood at an endpoint (branch pile / dark entrance). |
| `concealment_score` | f32 \| null | General concealment-POI score. |
| `recommended_next_action` | enum | `PanFollowFootpath` \| `HoldEndpoint` \| `PanBroad` \| `ReturnToZoomOut`. |
| `source_detections` | `Vec<DetectionId>` | For audit / replay. |
| `status` | enum | `ok` \| `timeout` \| `oversize` \| `error`. |
### `VlmAssessment`
Validated, structured response from `vlm_client`. Free-form VLM text is **not** a downstream API.
| Field | Type | Notes |
|---|---|---|
| `label` | enum | `confirmed_concealed_position` \| `rejected` \| `inconclusive` \| `error`. |
| `confidence` | f32 | 0.0–1.0; VLM-reported or derived. |
| `evidence_spans` | `Vec<string>` | Short justifications, bounded length. |
| `reason` | string | One-line rationale; bounded length. |
| `status` | enum | `ok` \| `timeout` \| `schema_invalid` \| `ipc_error` \| `disabled`. |
| `latency_ms` | u32 | Round-trip including IPC. |
| `model_version` | string | Reported by the NanoLLM process for the loaded weights; logged on change for forensic correlation. |
`status` semantics: any value other than `ok` MUST result in `label = inconclusive` (or `error` for a critical failure). `scan_controller` MUST NOT promote a POI to a confirmed target on a non-`ok` `VlmAssessment`.
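The status rule above can be sketched as a pair of pure functions. This is a minimal illustration, not the `vlm_client` implementation: `normalize_label` and `may_promote` are hypothetical names, and treating `ipc_error` as the "critical failure" case is an assumption.

```rust
/// Mirrors `VlmAssessment.status`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VlmStatus {
    Ok,
    Timeout,
    SchemaInvalid,
    IpcError,
    Disabled,
}

/// Mirrors `VlmAssessment.label`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VlmLabel {
    ConfirmedConcealedPosition,
    Rejected,
    Inconclusive,
    Error,
}

/// Hypothetical helper: force the label that a given status permits.
pub fn normalize_label(status: VlmStatus, reported: VlmLabel) -> VlmLabel {
    match status {
        VlmStatus::Ok => reported,
        // Assumption: an IPC failure is the "critical failure" case.
        VlmStatus::IpcError => VlmLabel::Error,
        VlmStatus::Timeout | VlmStatus::SchemaInvalid | VlmStatus::Disabled => {
            VlmLabel::Inconclusive
        }
    }
}

/// Hypothetical gate: a POI may be promoted only on an `ok` confirmation.
pub fn may_promote(status: VlmStatus, label: VlmLabel) -> bool {
    status == VlmStatus::Ok && label == VlmLabel::ConfirmedConcealedPosition
}
```

Keeping the gate separate from the normalisation means `scan_controller` refuses promotion even if an upstream component forgets to normalise the label.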
---
## 3. Decision entities
### `POI`
A Point-of-Interest enqueued by `scan_controller`. Source: a Tier-1 detection, a movement candidate from `movement_detector`, or a Tier-2 semantic finding.
| Field | Type | Notes |
|---|---|---|
| `id` | uuid | Stable for the POI's lifetime. |
| `confidence` | f32 (0.0–1.0) | Composite of detection / motion / Tier-2 score. |
| `mgrs` | string | MGRS coordinate from the GPS-Denied service or autopilot GPS. |
| `class` | string | Concrete class. |
| `class_group` | enum | Per `mapobjects_store` config (e.g. `military_vehicle_group`, `concealed_position_group`, `movement_candidate`). |
| `source_detection_ids` | `Vec<DetectionId>` | For audit / replay. |
| `enqueued_at` | timestamp | For queue ageing. |
| `priority` | f32 | `confidence × proximity_to_current_camera × age_factor`. |
| `decline_suppressed` | bool | True if `(MGRS, class_group)` matches an existing `IgnoredItem`. |
| `vlm_status` | enum | Mirrors `VlmAssessment.status` (or `not_requested` / `pending`). |
| `tier2_evidence` | optional `Tier2Evidence` | |
| `deadline` | timestamp | Per the confidence-scaled operator-decision window. |
Field `queue_position` is **not** stored; it is computed at read time from `priority` + `enqueued_at`.
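A minimal sketch of how `priority` and the derived queue order could be computed at read time. The linear `age_factor`, the `proximity` field, and the integer `id` (standing in for the uuid) are assumptions made for illustration; the actual weighting is not specified in this document.

```rust
/// Reduced POI: only the fields needed for ordering.
#[derive(Debug, Clone)]
pub struct Poi {
    pub id: u32,             // stand-in for the uuid
    pub confidence: f32,     // 0.0..=1.0
    pub proximity: f32,      // hypothetical: 1.0 = at current camera aim, 0.0 = far
    pub enqueued_at_ms: u64, // monotonic milliseconds
}

/// Assumption: priority grows linearly with queue age, +100 % per minute.
pub fn age_factor(age_ms: u64) -> f32 {
    1.0 + age_ms as f32 / 60_000.0
}

/// `confidence × proximity_to_current_camera × age_factor`.
pub fn priority(poi: &Poi, now_ms: u64) -> f32 {
    let age_ms = now_ms.saturating_sub(poi.enqueued_at_ms);
    poi.confidence * poi.proximity * age_factor(age_ms)
}

/// Queue order at read time: highest priority first, older entry wins a tie.
/// The index of an id in the result is its `queue_position`.
pub fn queue_order(pois: &[Poi], now_ms: u64) -> Vec<u32> {
    let mut ranked: Vec<(f32, u64, u32)> = pois
        .iter()
        .map(|p| (priority(p, now_ms), p.enqueued_at_ms, p.id))
        .collect();
    ranked.sort_by(|a, b| b.0.total_cmp(&a.0).then(a.1.cmp(&b.1)));
    ranked.into_iter().map(|(_, _, id)| id).collect()
}
```

Because the order is recomputed from `priority` and `enqueued_at` on every read, no stored position can go stale when the camera moves or entries age.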
### `MapObject`
A persisted map entry, indexed by H3 cell. Owned by `mapobjects_store`; written on each `NEW` / `MOVED` classification, read on each new detection. The on-device `MapObject` is a **working copy** of the central state for the active mission.
| Field | Type | Notes |
|---|---|---|
| `h3_cell` | u64 | H3 cell index at the configured resolution (default `res 10`; average cell area ~15,000 m², edge length on the order of 70 m). |
| `mgrs_key` | string | MGRS coordinate; together with `class` forms the hashtable composite key. |
| `class` | string | Concrete class (not the group). |
| `class_group` | string | Group used for matching during `EXISTING` / `MOVED` / `NEW` classification. |
| `gps_lat`, `gps_lon` | f64 | For distance calculation against incoming detections. |
| `size_width_m`, `size_length_m` | f32 | Bounding area on the ground. |
| `confidence` | f32 | Latest observation confidence (or running average, per implementation). |
| `first_seen`, `last_seen` | timestamp | Earliest and most recent observation; `last_seen` drives the `REMOVED` candidate diff at region-end. |
| `mission_id` | string | For the `DELETE /missions/{id}` cascade. |
| `source` | enum | `central_pulled` (came from pre-flight pull) \| `local_observed` (added during this mission). On post-flight push only `local_observed` records become new observations centrally. |
| `pending_upload` | bool | True for any `local_observed` entry not yet pushed centrally. Cleared on successful `POST /missions/{id}/mapobjects` ack. |
Persisted in `mapobjects_store` (engine TBD per `architecture.md §8 Q3`).
### `MapObjectObservation`
A single per-detection record. The on-device store appends one of these per `NEW` / `MOVED` / `EXISTING` / `REMOVED_CANDIDATE` classification; the post-flight push uploads the unflushed list to the central `missions` API. The central side stores all observations append-only as the source of truth (per `architecture.md §7.13`).
| Field | Type | Notes |
|---|---|---|
| `id` | uuid | Locally generated; stable across the mission. |
| `h3_cell` | u64 | |
| `class` | string | Concrete class. |
| `class_group` | string | Group used for the diff. |
| `mission_id` | string | |
| `uav_id` | string | Identifies the airframe; assigned at provisioning. |
| `observed_at_monotonic_ns` | u64 | Local monotonic at observation. |
| `observed_at_wallclock` | timestamp | Bound from GPS or NTP per the wall-clock policy. |
| `gps_lat`, `gps_lon` | f64 | |
| `mgrs` | string | |
| `size_width_m`, `size_length_m` | f32 | |
| `confidence` | f32 | |
| `diff_kind` | enum | `NEW` \| `MOVED` \| `EXISTING` \| `REMOVED_CANDIDATE`. |
| `photo_ref` | string \| null | URL or compact reference; uploaded out-of-band per the central API contract (Q7). |
| `raw_evidence` | json \| null | Audit payload; size-capped. |
In-memory; durably persisted in `mapobjects_store` until the post-flight push acknowledges. On the central side, `map_object_observations` is the corresponding table (see `architecture.md §7.13`).
### `MapObjectsBundle`
The wire shape for both the pre-flight pull (response body) and the post-flight push (request body) on `/missions/{id}/mapobjects`.
| Field | Type | Notes |
|---|---|---|
| `schema_version` | string | Semver; mismatched versions are rejected. |
| `mission_id` | string | |
| `bbox` | `Coordinate[2]` (NW + SE) | The mission area; used by the central API to scope the response. |
| `map_objects` | `Vec<MapObject>` | Pre-flight: current view from the central store. Post-flight push uses `MapObjectObservation` instead (see below). |
| `observations` | `Vec<MapObjectObservation>` | Post-flight: the full pass diff. |
| `ignored_items` | `Vec<IgnoredItem>` | Pre-flight: union-merged from the central store. Post-flight: only items appended during this mission. |
| `as_of` | timestamp | Pre-flight: when the central store snapshot was computed. Post-flight: when the on-device flush started. |
| `freshness` | enum (pre-flight only) | `fresh` (≤ configured staleness window) \| `stale` (operator must acknowledge to use). |
### `IgnoredItem`
A scene the operator declined; consulted by `scan_controller` before promoting any future detection to a POI. Union-merged across missions on the central side (per `architecture.md §7.13` conflict resolution).
| Field | Type | Notes |
|---|---|---|
| `id` | uuid | Locally generated. |
| `mgrs` | string | Decline location. |
| `h3_cell` | u64 | For central-side indexing. |
| `class_group` | string | Class group of the declined detection. |
| `decline_time` | timestamp | Wall-clock at decline (operator-side). |
| `operator_id` | string \| null | If known from the Ground Station session. |
| `mission_id` | string | The mission during which the decline happened. |
| `retention_scope` | enum | `mission` (cleared at mission end on-device, retained centrally indexed by mission) \| `session` (cleared at session end on-device) \| `until_expiry` (carries `expires_at`). |
| `expires_at` | timestamp \| null | Required when `retention_scope = until_expiry`. |
| `source` | enum | `central_pulled` (pre-flight pull) \| `local_appended` (during this mission). Only `local_appended` is uploaded to central in the post-flight push. |
| `pending_upload` | bool | True for any `local_appended` entry not yet pushed centrally. |
Lookup key: `(MGRS, class_group)` exact match (subject to the same H3 k-ring widening as `MapObject` lookups, when configured).
Persisted in `mapobjects_store`. Central-side table: `map_object_ignored` per `architecture.md §7.13`.
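The exact-match lookup and the `retention_scope` rule can be sketched as below. `IgnoredIndex` and its methods are hypothetical names, timestamps are plain seconds, and H3 k-ring widening is deliberately left out. Modelling `until_expiry` as an enum variant that carries its timestamp makes "`expires_at` is required" a compile-time property.

```rust
use std::collections::HashMap;

/// Mirrors `IgnoredItem.retention_scope`; `UntilExpiry` carries `expires_at`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RetentionScope {
    Mission,
    Session,
    UntilExpiry { expires_at_s: u64 },
}

/// Hypothetical in-memory index keyed by `(MGRS, class_group)`.
#[derive(Default)]
pub struct IgnoredIndex {
    items: HashMap<(String, String), RetentionScope>,
}

impl IgnoredIndex {
    /// Record an operator decline.
    pub fn insert(&mut self, mgrs: &str, class_group: &str, scope: RetentionScope) {
        self.items
            .insert((mgrs.to_owned(), class_group.to_owned()), scope);
    }

    /// True if a new detection at this key must not be promoted to a POI.
    pub fn is_suppressed(&self, mgrs: &str, class_group: &str, now_s: u64) -> bool {
        match self.items.get(&(mgrs.to_owned(), class_group.to_owned())) {
            // An expiring decline stops suppressing once its deadline passes.
            Some(RetentionScope::UntilExpiry { expires_at_s }) => now_s < *expires_at_s,
            // Mission- and session-scoped declines are cleared by the owner
            // at mission / session end, so presence alone means "suppress".
            Some(_) => true,
            None => false,
        }
    }
}
```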
---
## 4. Action / piloting entities
### `Coordinate`
| Field | Type | Notes |
|---|---|---|
| `latitude` | f64 | Geographic; degrees. |
| `longitude` | f64 | Geographic; degrees. |
| `altitude_m` | f32 | Above ground or above home, depending on usage; the carrying entity defines the frame. |
### `Geofence`
A polygon on the mission. Both INCLUSION and EXCLUSION are honoured by `mission_executor`.
| Field | Type | Notes |
|---|---|---|
| `kind` | enum | `INCLUSION` \| `EXCLUSION`. |
| `vertices` | `Vec<Coordinate>` | Polygon vertices in order. |
### `MissionItem`
The business-level mission item: what the `missions` API delivers and what the operator authored. **Owned by `mission-schema`**, the artefact shared with the `missions` repo (extraction location TBD — `architecture.md §8 Q5`).
| Field | Type | Notes |
|---|---|---|
| `id` | uuid | |
| `kind` | enum | `waypoint` \| `search` \| `region_search` \| `return` \| `target_follow_breakpoint`. |
| `at` | optional `Coordinate` | For `waypoint` / `return`. |
| `region` | optional polygon | For `region_search`. |
| `cruise_speed_mps` | optional f32 | If set, `mission_executor` emits a `MAV_CMD_DO_CHANGE_SPEED` waypoint before the affected items. |
| `target_classes` | optional `Vec<string>` | Per-item search hint (e.g. `tank`, `artillery`). |
### `MissionWaypoint`
The MAVLink-level wire item: what `mavlink_layer` sends to ArduPilot / PX4. **Owned by `mavlink_layer`**.
| Field | Type | Notes |
|---|---|---|
| `seq` | u16 | MAVLink mission item sequence number. |
| `frame` | enum | `MAV_FRAME_GLOBAL_RELATIVE_ALT` (system default; no terrain-following). |
| `command` | enum | One of: `MAV_CMD_NAV_TAKEOFF`, `MAV_CMD_NAV_WAYPOINT`, `MAV_CMD_NAV_LAND`, `MAV_CMD_DO_CHANGE_SPEED`, `MAV_CMD_NAV_RETURN_TO_LAUNCH`, `MAV_CMD_DO_SET_MODE`. |
| `current` | bool | True only for the very first item in a fresh upload. |
| `auto_continue` | bool | True for everything except the final item. |
| `param_1..param_4` | f32 | Command-specific. |
| `lat_deg_e7`, `lon_deg_e7` | i32 | Scaled-integer geographic coordinates. |
| `alt_m` | f32 | Above home (relative). |
### Translation contract — `MissionItem` → `MissionWaypoint`
Owner: `mission_executor`, variant-aware (multirotor / fixed-wing).
| Source `MissionItem.kind` | Resulting `MissionWaypoint`(s) |
|---|---|
| `waypoint` | exactly one `MAV_CMD_NAV_WAYPOINT` |
| `region_search` | sequence of `MAV_CMD_NAV_WAYPOINT`s computed per the sweep pattern (`architecture.md §8 Q1`) |
| `return` | one `MAV_CMD_NAV_RETURN_TO_LAUNCH` (or `MAV_CMD_NAV_LAND` at the explicit return point) |
| `target_follow_breakpoint` | (none) — used only as a structural marker for re-upload; not sent to MAVLink |
| (cruise speed carried by a `MissionItem`) | one `MAV_CMD_DO_CHANGE_SPEED` placed **before** the affected `MAV_CMD_NAV_WAYPOINT`s |
Multirotor variants prepend `MAV_CMD_NAV_TAKEOFF` and append `MAV_CMD_NAV_LAND`. Fixed-wing variants do neither (the airframe is RC-launched and put into AUTO by the operator); they only upload + start the mission.
The cruise-speed translation is required to **reach the autopilot**. If a `MissionItem` declares a cruise speed, the corresponding `MAV_CMD_DO_CHANGE_SPEED` MUST be present in the uploaded sequence with the speed in `param_2` (in the MAVLink common message set, `param_1` of this command selects the speed type and `param_2` carries the speed in m/s). Conformance test in `deployment/ci_cd_pipeline.md §5`.
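A reduced sketch of the translation contract for `waypoint` items only. All type and function names are illustrative, not the `mission_executor` API. Assumptions: the takeoff altitude is taken from the first item, the landing point from the last item, and `MAV_CMD_DO_CHANGE_SPEED` is filled per the MAVLink common message set (param 1 = speed type, param 2 = speed in m/s, param 3 = throttle, with -1 meaning "no change").

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MavCmd { NavTakeoff, NavWaypoint, NavLand, DoChangeSpeed }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Variant { Multirotor, FixedWing }

#[derive(Debug, Clone, PartialEq)]
pub struct MissionWaypoint {
    pub seq: u16,
    pub command: MavCmd,
    pub params: [f32; 4], // param_1..param_4
    pub lat_deg_e7: i32,
    pub lon_deg_e7: i32,
    pub alt_m: f32,
}

/// Reduced `MissionItem` of kind `waypoint`.
pub struct WaypointItem {
    pub latitude: f64,
    pub longitude: f64,
    pub altitude_m: f32,
    pub cruise_speed_mps: Option<f32>,
}

/// Degrees to the scaled-integer form used on the wire (degrees × 10^7).
pub fn to_e7(deg: f64) -> i32 {
    (deg * 1e7).round() as i32
}

fn wp(command: MavCmd, params: [f32; 4], lat: f64, lon: f64, alt_m: f32) -> MissionWaypoint {
    MissionWaypoint { seq: 0, command, params, lat_deg_e7: to_e7(lat), lon_deg_e7: to_e7(lon), alt_m }
}

pub fn translate(items: &[WaypointItem], variant: Variant) -> Vec<MissionWaypoint> {
    let mut out = Vec::new();
    if variant == Variant::Multirotor {
        // Assumption: climb to the first item's altitude.
        let alt = items.first().map_or(0.0, |i| i.altitude_m);
        out.push(wp(MavCmd::NavTakeoff, [0.0; 4], 0.0, 0.0, alt));
    }
    for item in items {
        if let Some(speed) = item.cruise_speed_mps {
            // Placed before the waypoint it affects; 1.0 = ground speed.
            out.push(wp(MavCmd::DoChangeSpeed, [1.0, speed, -1.0, 0.0], 0.0, 0.0, 0.0));
        }
        out.push(wp(MavCmd::NavWaypoint, [0.0; 4], item.latitude, item.longitude, item.altitude_m));
    }
    if variant == Variant::Multirotor {
        // Assumption: land at the last item's position.
        let (lat, lon) = items.last().map_or((0.0, 0.0), |i| (i.latitude, i.longitude));
        out.push(wp(MavCmd::NavLand, [0.0; 4], lat, lon, 0.0));
    }
    // Sequence numbers are assigned last so inserted items stay contiguous.
    for (i, w) in out.iter_mut().enumerate() {
        w.seq = i as u16;
    }
    out
}
```

A conformance test can then assert that every item with a cruise speed is immediately preceded by a `DoChangeSpeed` entry carrying that speed.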
### `OperatorCommand`
Every command from the Ground Station to autopilot is wrapped in this authenticated envelope. The principle is committed (`architecture.md §5`); the exact signature scheme is open per Q9. `operator_bridge` rejects any command that fails signature validation, replay-protection check, or session validation.
| Field | Type | Notes |
|---|---|---|
| `command_id` | uuid | Idempotency key; cached for 60 s by `operator_bridge`. |
| `session_token` | string | Opaque session token issued by the Ground Station at operator login; bound to `operator_id`. |
| `sequence_number` | u64 | Monotonically increasing per-session; replay-protection. Lower-or-equal numbers per session are rejected. |
| `issued_at_wallclock` | timestamp | Operator-side wall-clock. Used for forensic audit; not used for trust decisions. |
| `kind` | enum | `confirm_poi` \| `decline_poi` \| `start_target_follow` \| `release_target_follow` \| `acknowledge_bit_degraded` \| `safety_override` \| `mission_abort`. |
| `payload` | json | Action-specific body. |
| `signature` | bytes | Signature over (`session_token`, `sequence_number`, `kind`, `payload`). Scheme TBD per Q9. |
`scan_controller` and `mission_executor` see only the validated payload; the auth envelope is opaque to them. Audit logs record `command_id`, `operator_id` (resolved from session token), `kind`, and result.
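The idempotency cache and the per-session sequence check could be combined as in the sketch below. Signature validation is omitted because the scheme is still open (Q9). `ReplayGuard`, `Verdict`, the `u128` command id (standing in for the uuid), and the choice to check idempotency before the sequence number are all assumptions.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Accept,
    /// Same `command_id` seen within the cache window: reply with the cached result.
    Duplicate,
    /// `sequence_number` not strictly greater than the last accepted one.
    Replay,
}

/// Hypothetical guard held by `operator_bridge`.
pub struct ReplayGuard {
    last_seq: HashMap<String, u64>, // session_token -> last accepted sequence_number
    seen: HashMap<u128, u64>,       // command_id -> first-seen time (seconds)
    ttl_s: u64,
}

impl ReplayGuard {
    pub fn new(ttl_s: u64) -> Self {
        Self { last_seq: HashMap::new(), seen: HashMap::new(), ttl_s }
    }

    pub fn check(&mut self, session: &str, seq: u64, command_id: u128, now_s: u64) -> Verdict {
        // Evict idempotency entries older than the cache window.
        let ttl = self.ttl_s;
        self.seen.retain(|_, first_seen| now_s.saturating_sub(*first_seen) < ttl);

        // Assumption: a retransmit of a cached command is a duplicate, not a replay.
        if self.seen.contains_key(&command_id) {
            return Verdict::Duplicate;
        }
        if let Some(&last) = self.last_seq.get(session) {
            if seq <= last {
                return Verdict::Replay;
            }
        }
        self.last_seq.insert(session.to_owned(), seq);
        self.seen.insert(command_id, now_s);
        Verdict::Accept
    }
}
```

With a 60-second window, a retransmitted command inside the window yields `Duplicate`; once the entry expires, the same retransmit is still rejected, this time as `Replay`, because its sequence number is no longer fresh.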
### `GimbalState`
| Field | Type | Notes |
|---|---|---|
| `yaw`, `pitch` | f32 | Degrees. |
| `zoom` | f32 | Effective focal length or zoom factor (vendor-specific). |
| `ts_monotonic_ns` | u64 | Stamp at the moment the gimbal feedback was received. |
| `command_in_flight` | bool | True between command issuance and feedback that motion completed. |
In-memory only; consumed by `frame_ingest` and `movement_detector` for telemetry-skew compensation.
---
## 5. Sync / wire formats
### MGRS sync message — wire format
The operator round trip (`telemetry_stream` ⇄ Ground Station) uses MGRS-encoded payloads in both directions. Field separator is `::`.
**Drone → Operator (detection report):**
| Position | Field | Type | Notes |
|---|---|---|---|
| 1 | `missionId` | string | Server-assigned mission UUID. |
| 2 | `MGRS(encoded)` | string | MGRS coordinate (compact, military-grid). |
| 3 | `class` | string | Concrete detection class. |
| 4 | `confidence` | f32 | 0.0–1.0. |
| 5 | `size_width_m` | f32 | Ground-projected width. |
| 6 | `size_length_m` | f32 | Ground-projected length. |
| 7 | `photo_metadata` | string | URL or compact reference to the snapshot frame. |
| 8 | `flags` | bitmask | Reserved (e.g. `target_follow_active`, `vlm_used`, `movement_origin`). |
**Operator → Drone (command / acknowledgment):**
| Position | Field | Type | Notes |
|---|---|---|---|
| 1 | `missionId` | string | Must match the drone-side mission. |
| 2 | `Encoded(GroundMGRS :: Time)` | string | Operator's ground location + decision timestamp. |
| 3 | (variable) | … | Action-specific payload (POI ID, action enum, follow-toggle, etc.). |
| N | `missionId2` | string | Echo of `missionId` for stream-multiplexing safety. |
The exact serialisation of position 3 (action payload) is left to the Ground Station API contract (open question; see `architecture.md §8 Q2`).
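A minimal sketch of the Drone → Operator direction, assuming a plain-text encoding in which no field value contains the `::` separator and `flags` travels as a decimal integer. Neither detail is fixed by this document; `encode` and `decode` are hypothetical names.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct DetectionReport {
    pub mission_id: String,
    pub mgrs: String,
    pub class: String,
    pub confidence: f32,
    pub size_width_m: f32,
    pub size_length_m: f32,
    pub photo_metadata: String,
    pub flags: u32,
}

const SEP: &str = "::";

/// Serialise the eight positional fields in table order.
pub fn encode(r: &DetectionReport) -> String {
    [
        r.mission_id.clone(),
        r.mgrs.clone(),
        r.class.clone(),
        r.confidence.to_string(),
        r.size_width_m.to_string(),
        r.size_length_m.to_string(),
        r.photo_metadata.clone(),
        r.flags.to_string(),
    ]
    .join(SEP)
}

/// Parse a report; field positions are part of the contract, so the count is strict.
pub fn decode(line: &str) -> Result<DetectionReport, String> {
    let f: Vec<&str> = line.split(SEP).collect();
    if f.len() != 8 {
        return Err(format!("expected 8 fields, got {}", f.len()));
    }
    let num = |i: usize| -> Result<f32, String> {
        f[i].parse::<f32>()
            .map_err(|e| format!("field {}: {}", i + 1, e))
    };
    Ok(DetectionReport {
        mission_id: f[0].to_owned(),
        mgrs: f[1].to_owned(),
        class: f[2].to_owned(),
        confidence: num(3)?,
        size_width_m: num(4)?,
        size_length_m: num(5)?,
        photo_metadata: f[6].to_owned(),
        flags: f[7].parse::<u32>().map_err(|e| format!("field 8: {}", e))?,
    })
}
```

The strict field count is what makes position changes breaking, as the versioning table states. A real encoder would also need an escaping rule if `photo_metadata` may ever contain `::`.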
---
## 6. Persistence and lifecycle
| Entity | Persisted? | Where | Lifecycle |
|---|---|---|---|
| `Frame`, `Detection`, `DetectionBatch`, `MovementCandidate`, `Tier2Evidence`, `VlmAssessment`, `GimbalState` | no | in-memory | per frame / per ROI / per command — dropped on state change. |
| `POI` | no | in-memory inside `scan_controller` | enqueued, surfaced, decided (confirm / decline / timeout), then dropped. |
| `MapObject` | yes | `mapobjects_store` (working copy of central state) | mission-scoped on-device; appended to central observation log via post-flight push (F8); cleared on `DELETE /missions/{id}` cascade. |
| `MapObjectObservation` | yes | `mapobjects_store` until acknowledged centrally | per-detection append-log; durable across in-flight crash; cleared per record on `POST /missions/{id}/mapobjects` ack. |
| `IgnoredItem` | yes | `mapobjects_store` (working copy + post-flight upload of locally-appended items) | per `retention_scope`; central side union-merged. |
| `MissionItem` | no in autopilot | source of truth is the `missions` API | pulled on start; refreshed on middle-waypoint POST. |
| `MissionWaypoint` | no | in-memory inside `mavlink_layer` | re-derived from `MissionItem`s on each upload / re-upload. |
| `OperatorCommand` | partial | command-id cache (60 s) for idempotency; full audit log persisted on disk | per-command; audit-retained per configured policy. |
---
## 7. Versioning and contracts
| Contract | Owner | Versioning |
|---|---|---|
| `mission-schema` (the `MissionItem` shape) | shared between `autopilot` and `missions` repos; extraction location TBD (`architecture.md §8 Q5`) | semantic versioning; `mission_client` validates `schema_version` on fetch. |
| MapObjects bundle schema (`MapObjectsBundle` for pull/push, `MapObjectObservation` for the central observation log) | shared between `autopilot` and `missions` repos as part of the §7.13 endpoint extension | semantic versioning; `mission_client` validates `schema_version`; central side rejects mismatches with 4xx (`architecture.md §8 Q7`). |
| `../detections` gRPC contract | `../detections` repo (per `../_docs/03_detections.md`) | versioned; `detection_client` rejects schema mismatches (`architecture.md §8 Q4`). |
| `VlmAssessment` schema | autopilot-internal (this document is the source of truth) | versioned; `vlm_client` rejects schema-invalid responses. The `model_version` field correlates assessments with VLM weights. |
| MGRS sync wire format | autopilot-internal (this document is the source of truth) | versioned; field-position changes are breaking. |
| MAVLink command surface | per `architecture.md §7.7` | adding messages requires explicit design review. |
| `OperatorCommand` envelope (signature scheme) | open per `architecture.md §8 Q9` | once chosen, versioned; both Ground Station and `operator_bridge` must agree. |
+816
@@ -0,0 +1,816 @@
# autopilot — Decision Rationale
This file is the load-bearing research evidence behind the design. It captures the per-dimension reasoning, the fact cards backing each decision, the component-fit matrix, the validation log, the source bibliography, the evolution from the early draft to the final solution, and the original seed problem narrative. It is **read-only** in the sense that decisions documented here have already shaped `architecture.md §7 Detailed Design` and `system-flows.md §F1–F7`; updates here should follow updates there, not lead them.
## Reasoning chain (per-dimension)
### Dimension 1: Tiered Perception Pipeline
**Fact confirmation.** Existing YOLO integration already emits normalized boxes through a FastAPI/Cython/TensorRT service (Fact #16). Ultralytics supports TensorRT FP16 export for YOLO26-style models (Fact #19). UAV small-object and camouflaged-object literature shows that small concealed targets need class-specific and attention/semantic support rather than assuming generic object-detection transfer (Fact #8, Fact #11).
**Reference comparison.** A single detector is simpler but cannot satisfy footpath tracing, endpoint reasoning, motion candidates, and VLM confirmation. A full VLM-first approach is too slow and memory-sensitive for the zoom-out / zoom-in fast paths (Fact #12, Fact #24).
**Conclusion.** Use a three-tier perception pipeline: Tier 1 fixed-class YOLO26 / YOLOE-26 TensorRT FP16 primitives, Tier 2 primitive-graph plus lightweight ROI confirmation, and Tier 3 NanoLLM VLM only for bounded zoom-in endpoint / POI questions.
**Confidence.** Medium-high. API fit is supported; runtime targets still require hardware benchmarks.
### Dimension 2: Movement Detection
**Fact confirmation.** Dynamic-camera motion detection needs ego-motion compensation because platform movement creates apparent motion in stable objects (Fact #6, Fact #7). OpenCV provides sparse optical flow, feature tracking, and global-motion estimation APIs (Fact #22). The user confirmed timestamped video, gimbal, zoom, and UAV telemetry are available for MVP.
**Reference comparison.** Naive frame differencing is simpler but directly conflicts with the stable-scene rejection requirement. Pure learned tracking without telemetry may work later, but it adds data requirements and hides failure modes.
**Conclusion.** Select telemetry-aided OpenCV ego-motion compensation as the MVP movement-detector baseline, with residual cluster extraction. Run movement detection at **both** zoom-out and zoom-in (per-zoom-band thresholds), benchmark-gating classical CV adequacy at zoom-in before MVP acceptance. The ≤5 POIs/minute cap is enforced by `scan_controller`'s POI scheduler, not by the detector itself, so the same detector can serve both zoom levels.
**Confidence.** High for mechanism fit; medium for runtime and false-positive performance until replay-tested.
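The idea of ego-motion compensation followed by residual extraction can be shown without OpenCV. The sketch below is a deliberately simplified stand-in: it models global motion as a single median translation over tracked feature points, whereas the selected baseline would estimate a richer global-motion model from OpenCV features and telemetry. `Track` and `residual_movers` are hypothetical names.

```rust
/// One tracked feature: position in the previous and the current frame (pixels).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Track {
    pub prev: (f32, f32),
    pub curr: (f32, f32),
}

/// Upper median of a non-empty list.
fn median(mut v: Vec<f32>) -> f32 {
    v.sort_by(|a, b| a.total_cmp(b));
    v[v.len() / 2]
}

/// Indices of tracks whose motion is not explained by the global shift.
pub fn residual_movers(tracks: &[Track], threshold_px: f32) -> Vec<usize> {
    if tracks.is_empty() {
        return Vec::new();
    }
    // Global (background) motion: the median displacement is robust as long
    // as most tracked points belong to the static scene.
    let dx = median(tracks.iter().map(|t| t.curr.0 - t.prev.0).collect());
    let dy = median(tracks.iter().map(|t| t.curr.1 - t.prev.1).collect());

    tracks
        .iter()
        .enumerate()
        .filter(|(_, t)| {
            // Residual = observed displacement minus the camera-induced shift.
            let rx = t.curr.0 - t.prev.0 - dx;
            let ry = t.curr.1 - t.prev.1 - dy;
            (rx * rx + ry * ry).sqrt() > threshold_px
        })
        .map(|(i, _)| i)
        .collect()
}
```

Naive frame differencing would flag every point in a panning frame; here a uniform pan yields zero residual everywhere, which is the stable-scene rejection property the comparison above relies on. Per-zoom-band thresholds would map onto `threshold_px`.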
### Dimension 3: Scan and Gimbal Control
**Fact confirmation.** ViewPro A40 official specs support fast tracking output metadata and a 40× optical camera, but do not prove the project's full zoom traversal time (Fact #4, Fact #5). Behaviour trees help large UAV autonomy systems, but this project has a small deterministic scan lifecycle (Fact #26).
**Reference comparison.** Behaviour trees are more extensible, but a deterministic state machine gives simpler timing, queue, and timeout tests for `ZoomedOut`, `ZoomedIn`, and `TargetFollow` states.
**Conclusion.** Use a typed `scan_controller` state machine with explicit states, queue ageing, timeouts, and target-loss handling. Treat ViewPro zoom timing as a hardware-in-loop acceptance test.
**Confidence.** High for architecture fit; medium for physical zoom timing until measured.
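A typed state machine for the three named states might look like the sketch below. The state names come from this section; the event set and the transition table are assumptions chosen to illustrate the testability argument, not the `scan_controller` design.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScanState {
    ZoomedOut,
    ZoomedIn { poi_id: u32 },
    TargetFollow { poi_id: u32 },
}

/// Hypothetical event set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    PoiDequeued { poi_id: u32 },
    StartTargetFollow,
    Declined,
    HoldTimeout,
    TargetLost,
    ReleaseTargetFollow,
}

/// Pure transition function: no I/O, so every path is unit-testable.
pub fn step(state: ScanState, event: Event) -> ScanState {
    use Event::*;
    use ScanState::*;
    match (state, event) {
        (ZoomedOut, PoiDequeued { poi_id }) => ZoomedIn { poi_id },
        (ZoomedIn { poi_id }, StartTargetFollow) => TargetFollow { poi_id },
        (ZoomedIn { .. }, Declined | HoldTimeout) => ZoomedOut,
        (TargetFollow { .. }, TargetLost | ReleaseTargetFollow) => ZoomedOut,
        // Any event that is not valid in the current state is ignored.
        (s, _) => s,
    }
}
```

Because `step` is a total function over a closed set of states and events, timeout and target-loss handling reduce to table-driven tests, which is the simplification over a behaviour tree that the comparison above points to.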
### Dimension 4: VLM Confirmation
**Fact confirmation.** NanoLLM documents local multimodal VILA1.5-3B image+text prompting with MLC and quantisation options (Fact #23). Orin Nano 8 GB VLM deployment is memory-sensitive and needs strict context / token limits (Fact #24). The user confirmed VLM is required for MVP only if the exact model / runtime passes ≤5 s/ROI and memory gates.
**Reference comparison.** Using VLM for every ROI would overload latency and memory. Skipping VLM entirely would miss the requirement. A separate local VLM IPC process preserves no-cloud and isolation constraints while allowing a scheduler to avoid concurrent GPU use.
**Conclusion.** Select NanoLLM + VILA1.5-3B MLC quantised as the lead VLM, run only on bounded zoom-in crops, and enforce hard benchmark gates before MVP acceptance.
**Confidence.** Medium. API capability is proven; runtime-quality fit is not proven without target hardware.
### Dimension 5: Data and Acceptance Risk
**Fact confirmation.** All-season MVP was confirmed by the user. UAV small-object and camouflaged-object detection is sensitive to background, scale, and season (Fact #8, Fact #11). Annotation effort is plausible only with assistance and careful prioritisation (Fact #14, Fact #15).
**Reference comparison.** Winter-first MVP would lower risk but conflicts with the confirmed requirement. All-season MVP demands stronger dataset gates and should not rely on aggregate metrics.
**Conclusion.** Keep all-season MVP, but make per-class, per-season, per-terrain validation mandatory. Use annotation assistance and hard-negative mining from false positives to control schedule risk.
**Confidence.** Medium. The requirement is clear; dataset availability is the main risk.
## Fact cards
These are the load-bearing facts referenced from the reasoning chain and the fit matrix. Each card lists the source, confidence, related dimension, and fit impact. Source numbers refer to the bibliography in §References below.
### Fact #1
- **Statement**: Jetson Orin Nano Super is officially specified at 67 INT8 TOPS with 8 GB 128-bit LPDDR5 memory and 102 GB/s memory bandwidth.
- **Source**: Source #1
- **Confidence**: High
- **Related dimension**: Hardware feasibility
- **Fit impact**: Supports the hardware restriction, but does not prove FP16 multi-model latency.
### Fact #2
- **Statement**: NVIDIA's Super Mode performance gain depends on the JetPack / software configuration and power mode, so benchmark results must record the installed JetPack / L4T and power mode.
- **Source**: Source #2
- **Confidence**: High
- **Related dimension**: Runtime reproducibility
- **Fit impact**: Adds a missing restriction: lock and report JetPack / power mode for all latency tests.
### Fact #3
- **Statement**: Ultralytics provides Jetson / TensorRT deployment guidance, but the consulted documentation / search results do not prove a two-model YOLO26 + YOLOE-26 pipeline at 1280 px will stay below 100 ms/frame including preprocessing, tiling, and postprocessing.
- **Source**: Source #3
- **Confidence**: Medium
- **Related dimension**: Tier 1 latency
- **Fit impact**: Makes the ≤100 ms/frame criterion plausible but unproven until benchmarked with the exact exported engines.
### Fact #4
- **Statement**: ViewPro A40 Pro official specifications list 1080p output, 40× optical zoom with 4.25–170 mm focal range, 30 Hz tracking deviation update rate, less than 30 ms deviation output delay, and 5×5 pixel minimum AI target size for the built-in AI feature.
- **Source**: Source #4
- **Confidence**: High
- **Related dimension**: Camera / gimbal feasibility
- **Fit impact**: Supports control-loop feasibility but does not prove full wide-to-high optical zoom traversal in ≤2 s.
### Fact #5
- **Statement**: The official ViewPro A40 Pro page does not provide a direct full-range optical zoom traversal time; the project-specific 1–2 s zoom traversal claim must be measured on the target camera / interface.
- **Source**: Source #4
- **Confidence**: High
- **Related dimension**: zoom-out → zoom-in transition
- **Fit impact**: Adds a validation prerequisite for the ≤2 s transition criterion.
### Fact #6
- **Statement**: Recent dynamic-camera moving-object detection work uses optical flow plus additional mechanisms such as tracking-any-point, adaptive bounding-box filtering, segmentation priors, or focus-of-expansion reasoning, because camera motion alone produces apparent motion.
- **Source**: Source #5, Source #6
- **Confidence**: High
- **Related dimension**: Movement detection
- **Fit impact**: Supports the requirement to compensate UAV / gimbal motion and disqualifies naive frame differencing.
### Fact #7
- **Statement**: Moving-object detection from UAV footage is difficult because objects are small, camera motion is complex, and structured backgrounds can make optical-flow-only approaches unreliable.
- **Source**: Source #5, Source #6
- **Confidence**: Medium
- **Related dimension**: Movement detection reliability
- **Fit impact**: Adds a missing false-positive / false-negative acceptance criterion for zoom-out motion candidates (and, after the zoom-in benchmark gate, an analogous per-zoom-band criterion for zoom-in).
### Fact #8
- **Statement**: UAV small-object detection literature repeatedly identifies small pixel footprint, complex backgrounds, low contrast, and scale variation as major causes of missed detections and false alarms.
- **Source**: Source #7, Source #8
- **Confidence**: High
- **Related dimension**: YOLO and semantic detection quality
- **Fit impact**: Makes 80 % precision / recall for new primitive classes realistic only with class-specific validation, tiling, and seasonal coverage.
### Fact #9
- **Statement**: Recent UAV YOLO variants improve small-target results through attention, receptive-field, or feature-fusion changes, implying generic YOLO baseline performance should not be assumed to transfer unchanged to small concealed primitives.
- **Source**: Source #7, Source #8
- **Confidence**: High
- **Related dimension**: Model selection
- **Fit impact**: Supports keeping "existing class performance must not degrade" and adding per-class / season reporting.
### Fact #10
- **Statement**: Trail / path detection can be treated as a structured perception problem using neural detection plus path-continuity reasoning, not just independent bounding boxes.
- **Source**: Source #9
- **Confidence**: Medium
- **Related dimension**: Footpath detection
- **Fit impact**: Supports requiring path tracing, freshness scoring, endpoint reasoning, and branch-following behaviour.
### Fact #11
- **Statement**: Camouflaged-object detection papers use specialised attention, illumination, frequency / spatial, or super-resolution methods because camouflaged targets are intentionally similar to the background.
- **Source**: Source #14
- **Confidence**: Medium
- **Related dimension**: Concealed-position detection
- **Fit impact**: Supports the project's claim that visual similarity to known object classes is insufficient.
### Fact #12
- **Statement**: Small local VLMs can run on Jetson-class devices, but model choice, quantisation, context size, crop size, and runtime container determine whether memory and ≤5 s/ROI are realistic.
- **Source**: Source #1, Source #12, Source #13
- **Confidence**: Medium
- **Related dimension**: VLM feasibility
- **Fit impact**: Makes local VLM feasible only as a tightly bounded optional Tier 3 module with an exact-model benchmark.
### Fact #13
- **Statement**: The project has about 6 GB remaining RAM only because existing YOLO is assumed to use about 2 GB; unified-memory contention means VLM and YOLO scheduling must be sequential and benchmarked together, not in isolation.
- **Source**: Source #1, project restrictions
- **Confidence**: Medium
- **Related dimension**: Resource budget
- **Fit impact**: Supports the restriction against concurrent YOLO / VLM GPU inference and adds a whole-pipeline memory test.
### Fact #14
- **Statement**: Interactive or model-assisted segmentation can reduce mask annotation time compared with manual polygon annotation, but this benefit depends on tooling and object-boundary clarity.
- **Source**: Source #10
- **Confidence**: High
- **Related dimension**: Annotation effort
- **Fit impact**: Makes hundreds-to-thousands of labels plausible in 225 hours only if annotation assistance and prioritisation are used.
### Fact #15
- **Statement**: Label propagation can reduce annotation effort for related frames / sequences, which is relevant to movement-detection video data.
- **Source**: Source #11
- **Confidence**: Medium
- **Related dimension**: Movement dataset creation
- **Fit impact**: Supports using video / sequential annotation tools for movement candidates rather than frame-by-frame manual labelling only.
### Fact #16
- **Statement**: The existing FastAPI service has endpoints that emit normalized boxes and uses a global inference object around Cython / TensorRT inference.
- **Source**: `../detections/main.py` (existing detections service)
- **Confidence**: High
- **Related dimension**: Integration boundary
- **Fit impact**: Supports keeping normalized-box output but favours isolating VLM and scan control outside the Cython inference path.
### Fact #17
- **Statement**: The input images show long thin paths, dark narrow entrances, branch / forest-edge concealment, and partial occlusion, so bounding boxes alone may be weak for footpaths and path-follow behaviour.
- **Source**: original problem-side data parameters (deleted on doc consolidation 2026-05-17; reference PNGs `semantic01..04.png` lived alongside)
- **Confidence**: Medium
- **Related dimension**: Annotation format
- **Fit impact**: Supports allowing segmentation masks or polylines for footpaths instead of boxes only.
### Fact #18
- **Statement**: The project provides no explicit acceptance criteria for false positives per route / time, operator-review workload, queue starvation, telemetry availability, power / thermal throttling, or evidence logging.
- **Source**: original problem-side acceptance criteria + restrictions (deleted on doc consolidation 2026-05-17; consolidated into `architecture.md §7.3 Restrictions` and `§7.4 Acceptance Criteria`)
- **Confidence**: High
- **Related dimension**: Missing criteria
- **Fit impact**: Requires adding or confirming these criteria before final architecture planning. (Most are now folded into `architecture.md §7.4 Acceptance Criteria > Frozen choices (2026-05-06)`.)
### Fact #19
- **Statement**: Ultralytics YOLO supports TensorRT engine export with FP16 through `half=True`; TensorRT export is GPU-only and supports arguments including `dynamic`, `half`, `int8`, and workspace configuration.
- **Source**: Source #15
- **Confidence**: High
- **Related dimension**: Tier 1 primitive detector
- **Fit impact**: Supports selecting custom-trained YOLO26 TensorRT FP16 as the primary primitive detector.
### Fact #20
- **Statement**: Jetson TensorRT export can run into workspace and dynamic-shape memory issues, so fixed input shapes, batch 1, and on-device export / benchmarking are safer for this project than dynamic batch export.
- **Source**: Source #18
- **Confidence**: Medium
- **Related dimension**: Tier 1 latency
- **Fit impact**: Adds a hard implementation constraint for the FP16 TensorRT engines.
### Fact #21
- **Statement**: YOLOE supports open-vocabulary detection / segmentation, but TensorRT runtime should not depend on Python open-vocabulary prompt-mutation APIs; MVP runtime should use fixed trained classes or pre-baked class embeddings only.
- **Source**: Source #19
- **Confidence**: Medium
- **Related dimension**: YOLOE exact-fit
- **Fit impact**: Selects YOLOE-26 only in fixed-class FP16 TensorRT mode, not runtime open-vocabulary mode.
### Fact #22
- **Statement**: OpenCV 4.x provides Lucas-Kanade sparse optical flow (`calcOpticalFlowPyrLK`), feature detection (`goodFeaturesToTrack`), and global-motion estimation APIs that can estimate frame-to-frame background motion before residual moving-object detection.
- **Source**: Source #16
- **Confidence**: High
- **Related dimension**: Movement detector
- **Fit impact**: Supports selecting telemetry-aided OpenCV ego-motion compensation as the movement baseline.
### Fact #23
- **Statement**: NanoLLM supports model loading through MLC / AWQ / HF APIs with quantisation settings such as `q4f16_ft`, and multimodal chat examples using VILA1.5-3B with image prompts.
- **Source**: Source #17
- **Confidence**: High
- **Related dimension**: VLM confirmation
- **Fit impact**: Supports selecting NanoLLM + VILA1.5-3B MLC as the lead local VLM candidate, subject to runtime-quality benchmark.
### Fact #24
- **Statement**: VILA1.5-3B on Orin Nano 8 GB is plausible but memory-sensitive; context length, max tokens, crop count, and container / storage footprint must be capped.
- **Source**: Source #21
- **Confidence**: Medium
- **Related dimension**: VLM feasibility
- **Fit impact**: Requires the VLM process to use bounded crops, short prompts, short answers, and a watchdog.
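
The watchdog requirement can be illustrated with a minimal sketch. It assumes the VLM request is wrapped in a plain callable; `call_with_watchdog` and `ROI_BUDGET_S` are hypothetical names, and the 5 s value mirrors the ≤5 s/ROI criterion. A thread-based timeout only stops waiting; it does not stop the work, so a real deployment would more likely terminate or restart the separate VLM process.

```python
"""Sketch: watchdog around a VLM call; a timeout is treated as 'no answer'."""
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Optional

ROI_BUDGET_S = 5.0  # mirrors the <=5 s/ROI criterion; name is hypothetical


def call_with_watchdog(vlm_call: Callable[[], str],
                       budget_s: float = ROI_BUDGET_S) -> Optional[str]:
    """Return the VLM answer, or None if it did not arrive within the budget."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(vlm_call)
    try:
        return future.result(timeout=budget_s)
    except FutureTimeout:
        # Fail closed: the caller treats None as "not confirmed".
        return None
    finally:
        # Do not block on a stuck call; the worker thread is left to finish.
        pool.shutdown(wait=False)
```

Returning `None` rather than raising keeps the scan loop simple: a missed budget is just an unconfirmed ROI.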
### Fact #25
- **Statement**: NanoSAM / MobileSAM-style segmentation is useful for ROI mask refinement and annotation assistance, but not as the zoom-out wide-area sweep lead because it still adds an image-encoder cost and prompt dependency.
- **Source**: Source #20
- **Confidence**: Medium
- **Related dimension**: Segmentation fallback
- **Fit impact**: Marks segmentation foundation models as fallback / annotation-assist, not primary runtime.
### Fact #26
- **Statement**: Behaviour trees improve modularity for large UAV autonomy systems, but this project's scan lifecycle has a small fixed set of states and strict timing, making a typed deterministic state machine simpler for MVP.
- **Source**: Source #22
- **Confidence**: Medium
- **Related dimension**: Scan control
- **Fit impact**: Selects a deterministic scan state machine with explicit queues / timeouts; behaviour tree remains a later extensibility option (the BT primer in `system-flows.md §F4` is the canonical decomposition the state machine must satisfy).
### Fact #27
- **Statement**: Multiple GPU inference contexts / processes can complicate TensorRT scheduling and memory behaviour on Jetson; the project should centralise GPU scheduling and preserve the restriction that YOLO and VLM do not run concurrently.
- **Source**: Source #23, project restrictions
- **Confidence**: Medium
- **Related dimension**: Integration boundary
- **Fit impact**: Selects a local IPC VLM process controlled by an integration scheduler, not unmanaged concurrent inference.
### Fact #28
- **Statement**: The first draft under-specified the proof gates that must happen before implementation: Tier 1 latency, VLM memory / latency, ViewPro zoom timing, movement false-positive replay, and all-season dataset readiness.
- **Source**: `solution_draft01.md` (superseded), `validation_log` (this file §Validation)
- **Confidence**: High
- **Related dimension**: Planning readiness
- **Fit impact**: Adds a required benchmark-gate stage before decomposition / implementation.
### Fact #29
- **Statement**: Secure FastAPI file / image handling should not trust client content-type headers alone and should enforce size limits, validation, authorisation, cleanup, and audit logging.
- **Source**: Source #24
- **Confidence**: Medium
- **Related dimension**: API security
- **Fit impact**: Adds explicit upload / payload validation requirements for image, ROI, and VLM IPC inputs.
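
As an illustration of not trusting client headers, the sketch below checks payload size and leading magic bytes before any decoder is invoked. The allow-list (JPEG and PNG only), the 8 MiB limit, and the names `sniff_image_format` and `PayloadRejected` are assumptions for the example, not project decisions.

```python
"""Sketch: validate an image payload by size and magic bytes, not by headers."""

MAX_BYTES = 8 * 1024 * 1024  # assumed size limit

# Assumed allow-list: format name -> leading file signature.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


class PayloadRejected(ValueError):
    """Raised when a payload fails size or format validation."""


def sniff_image_format(payload: bytes, max_bytes: int = MAX_BYTES) -> str:
    """Return the allow-listed format name, or raise PayloadRejected."""
    if not payload:
        raise PayloadRejected("empty payload")
    if len(payload) > max_bytes:
        raise PayloadRejected("payload exceeds size limit")
    for name, signature in SIGNATURES.items():
        if payload.startswith(signature):
            return name
    raise PayloadRejected("format not on allow-list")
```

A signature check is only a first filter; the decoder itself still has to be a patched version.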
### Fact #30
- **Statement**: Local IPC can use Unix-socket filesystem permissions and peer-credential checks such as `SO_PEERCRED` to restrict which local processes may call the VLM service.
- **Source**: Source #25
- **Confidence**: Medium
- **Related dimension**: IPC security
- **Fit impact**: Replaces vague "local IPC authorisation" with a concrete Unix-socket permission and peer-credential control.
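
A minimal Linux-only sketch of the peer-credential check follows. It assumes the VLM service accepts connections on a Unix-domain socket; `peer_credentials`, `is_authorised`, and the UID allow-list are hypothetical, while `SO_PEERCRED` and its `pid, uid, gid` layout are the standard Linux socket option.

```python
"""Sketch: Linux peer-credential check on a Unix-domain socket."""
import socket
import struct
from typing import Iterable, NamedTuple


class PeerCred(NamedTuple):
    pid: int
    uid: int
    gid: int


def peer_credentials(conn: socket.socket) -> PeerCred:
    """Read the kernel-reported credentials of the connected peer (Linux only)."""
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize("3i"))
    return PeerCred(*struct.unpack("3i", raw))


def is_authorised(conn: socket.socket, allowed_uids: Iterable[int]) -> bool:
    """Accept the caller only if its UID is on the allow-list."""
    return peer_credentials(conn).uid in set(allowed_uids)
```

The credentials come from the kernel rather than from the request payload, so a local caller cannot spoof them. Filesystem permissions on the socket path remain the first line of control.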
### Fact #31
- **Statement**: Production LLM / VLM integrations should validate or constrain outputs against a schema before downstream use, because free-form text is not a stable API contract.
- **Source**: Source #26
- **Confidence**: Medium
- **Related dimension**: VLM output reliability
- **Fit impact**: Adds a structured `VlmAssessment` schema and retry / fail-closed behaviour.
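
One way to picture the schema and fail-closed behaviour is sketched below. The `VlmAssessment` name comes from this document; the reduced field set (label, confidence, reason), the label values, and `parse_assessment` are assumptions for illustration, and the sketch presumes the VLM is prompted to answer in JSON.

```python
"""Sketch: validate VLM output against a fixed schema and fail closed."""
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VlmLabel(str, Enum):
    # Hypothetical label set; the real enum belongs to the schema definition.
    CONFIRMED = "confirmed"
    REJECTED = "rejected"
    UNCERTAIN = "uncertain"


@dataclass(frozen=True)
class VlmAssessment:
    label: VlmLabel
    confidence: float
    reason: str


def parse_assessment(raw: str) -> Optional[VlmAssessment]:
    """Return a validated assessment, or None so the caller can retry or fail closed."""
    try:
        data = json.loads(raw)
        label = VlmLabel(data["label"])
        confidence = float(data["confidence"])
        reason = str(data["reason"])
    except (ValueError, KeyError, TypeError):
        # Not JSON, missing field, unknown label, or wrong type.
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    return VlmAssessment(label, confidence, reason)
```

Anything that does not parse is dropped before it reaches POI metadata, so free-form text never becomes part of the downstream contract.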
### Fact #32
- **Statement**: Sensor-fusion systems use correlated timestamps to align camera frames and telemetry; movement detection should define maximum tolerated skew between video frames, gimbal state, and UAV motion data.
- **Source**: Source #27
- **Confidence**: Medium
- **Related dimension**: Telemetry synchronisation
- **Fit impact**: Adds a telemetry-synchronisation contract before movement detection can claim compensation correctness.
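
The maximum-skew rule can be sketched as a nearest-sample lookup. This assumes frame and telemetry timestamps share one clock and that telemetry timestamps are sorted; the 50 ms budget and the name `nearest_within_skew` are placeholders, since the actual tolerance is still to be defined by the contract.

```python
"""Sketch: pair a frame with the nearest telemetry sample within a skew budget."""
from bisect import bisect_left
from typing import Optional, Sequence

MAX_SKEW_S = 0.050  # placeholder; the real value comes from the sync contract


def nearest_within_skew(frame_ts: float,
                        telemetry_ts: Sequence[float],
                        max_skew_s: float = MAX_SKEW_S) -> Optional[int]:
    """Return the index of the nearest telemetry timestamp, or None if too far.

    `telemetry_ts` must be sorted ascending and use the same clock as `frame_ts`.
    """
    if not telemetry_ts:
        return None
    i = bisect_left(telemetry_ts, frame_ts)
    # The nearest sample is either just before or just after the insertion point.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(telemetry_ts)]
    best = min(candidates, key=lambda j: abs(telemetry_ts[j] - frame_ts))
    if abs(telemetry_ts[best] - frame_ts) > max_skew_s:
        return None
    return best
```

A `None` result means the frame has no trustworthy telemetry and should not be used for compensated movement detection.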
### Fact #33
- **Statement**: TensorRT performance must be measured under the actual model configuration and scheduler; documentation-level export support does not prove end-to-end latency with multiple engines and preprocessing.
- **Source**: Source #28
- **Confidence**: High
- **Related dimension**: GPU scheduling
- **Fit impact**: Strengthens the central GPU scheduler and benchmark gate.
### Fact #34
- **Statement**: OpenCV image decoders have had critical crafted-image vulnerabilities in recent 4.x versions, including CVE-2025-53644 affecting 4.10.0 / 4.11.0 and patched in 4.12.0.
- **Source**: Source #29
- **Confidence**: High
- **Related dimension**: Image-processing security
- **Fit impact**: Requires patched OpenCV version and image-format allow-list for untrusted inputs.
### Fact #35
- **Statement**: The existing `main.py` swallows refresh / posting / detection exceptions in several paths and returns healthy status even when inference initialisation fails, which would hide critical runtime failures in the expanded system.
- **Source**: `../detections/main.py` (existing detections service)
- **Confidence**: High
- **Related dimension**: Observability and reliability
- **Fit impact**: Adds a reliability task to replace silent exception handling in touched service paths.
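
The intended health semantics can be sketched as a small registry in which a component is unhealthy until it explicitly reports success. `HealthRegistry` and the component names are hypothetical and are not taken from the existing `main.py`.

```python
"""Sketch: health status derived from component state instead of a constant."""
from typing import Dict, Iterable, Optional


class HealthRegistry:
    """Tracks the last known error per component; None means healthy."""

    def __init__(self, components: Iterable[str]) -> None:
        # Components start unhealthy until they report a successful init.
        self._errors: Dict[str, Optional[str]] = {
            name: "not initialised" for name in components
        }

    def mark_ok(self, name: str) -> None:
        self._errors[name] = None

    def mark_failed(self, name: str, error: str) -> None:
        self._errors[name] = error

    def report(self) -> dict:
        """Return overall health plus the errors that caused any failure."""
        errors = {n: e for n, e in self._errors.items() if e is not None}
        return {"healthy": not errors, "errors": errors}
```

A health endpoint built on `report()` cannot return healthy while inference initialisation has failed, which is the masking behaviour described above.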
### MVE: Ultralytics YOLO26 / YOLOE-26 in fixed-class TensorRT FP16 mode
- **Source**: Source #15, Source #19
- **Pinned mode**: Custom-trained YOLO26 detector and YOLOE-26 segmentation / detection engines exported as TensorRT FP16 with fixed project classes, batch 1, fixed 1280 px input, no runtime open-vocabulary prompt mutation.
- **Inputs in the example**: Image input passed to `YOLO("yolo26n.pt")`, exported with `model.export(format="engine", half=True)`, then loaded as `.engine` for prediction.
- **Outputs in the example**: Detection / segmentation results from a TensorRT engine.
- **Project inputs**: 1080p UAV frames or tiles resized / split for 1280 px model input.
- **Project outputs required**: Normalized boxes for primitives and operator display; optional masks / polylines for path / branch reasoning.
- **Match assessment**: Exact API / deployment match for fixed-class TensorRT FP16 engines; runtime open-vocabulary YOLOE behaviour is rejected.
#### Restrictions and AC binding — YOLO26 / YOLOE-26 fixed-class FP16
| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
|---|---|---|---|
| FP16 precision | TensorRT export supports `half=True`. | Pass | Fact #19 |
| TensorRT primary / ONNX fallback | TensorRT engine export is documented; ONNX remains project fallback. | Pass | Fact #19 |
| 1280 px input | Export supports `imgsz`; exact latency requires benchmark. | Pass for API; runtime gate | Fact #19, Fact #20 |
| ≤100 ms/frame Tier 1 | API can run TensorRT FP16; runtime quality must be measured end-to-end. | Pass with runtime-quality gate | Fact #20 |
| Normalized boxes output | YOLO result conversion can preserve existing normalized-box DTO contract. | Pass | Fact #16 |
| No degradation of existing classes | Requires validation, not an API capability. | Pass with runtime-quality gate | Fact #9 |
| All seasons MVP | Requires dataset / training coverage, not an API capability. | Pass with data-quality gate | Fact #8 |
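
The pinned export mode can be written down as a fixed argument set. This is a sketch only, assuming the `ultralytics` package, TensorRT, and an NVIDIA GPU are present on the target device; `PINNED_EXPORT_ARGS` and `export_pinned_engine` are hypothetical names, and `yolo26n.pt` is the weights file from the example above, not the project's trained model.

```python
"""Sketch: pinned TensorRT FP16 export settings (helper names are hypothetical)."""

# Export arguments matching the pinned mode: FP16, fixed 1280 px input,
# batch 1, no dynamic shapes.
PINNED_EXPORT_ARGS = {
    "format": "engine",  # TensorRT engine
    "half": True,        # FP16
    "imgsz": 1280,
    "batch": 1,
    "dynamic": False,
}


def export_pinned_engine(weights: str = "yolo26n.pt") -> str:
    """Export on the target device and return the path of the exported engine."""
    # Imported lazily: the import and export need a GPU-capable environment.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.export(**PINNED_EXPORT_ARGS)
```

Keeping the arguments in one constant makes it harder for a benchmark run and a production export to drift apart.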
### MVE: OpenCV telemetry-aided ego-motion compensation
- **Source**: Source #16
- **Pinned mode**: OpenCV 4.x sparse optical flow + feature tracking + global-motion estimation, fused with timestamped gimbal angle / zoom and UAV motion telemetry before residual moving-candidate extraction.
- **Inputs in the example**: Consecutive video frames converted to grayscale; features from `goodFeaturesToTrack`; tracked points from `calcOpticalFlowPyrLK`.
- **Outputs in the example**: Matched point trajectories and estimated motion between frames.
- **Project inputs**: 1080p zoom-out frame sequences plus timestamped gimbal / UAV telemetry; zoom-in frame sequences for the per-zoom-band benchmark.
- **Project outputs required**: Small residual moving point / cluster candidate boxes queued within 1 s.
- **Match assessment**: Exact match for ego-motion compensation primitives; project-specific candidate thresholds require benchmark.
#### Restrictions and AC binding — OpenCV movement detector
| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
|---|---|---|---|
| Movement detection at zoom-out + zoom-in | OpenCV frame-to-frame processing applies at both zoom levels with per-zoom-band thresholds. Classical-CV adequacy at zoom-in is benchmark-gated; if the FP cap fails, fall back per Q14. | Pass with runtime-quality gate | Fact #22 |
| Compensate UAV / gimbal motion | Optical flow / global motion plus telemetry directly supports compensation. | Pass | Fact #6, Fact #22 |
| Enqueue within 1 s | CPU / GPU cost depends on implementation; API supports required operations. | Pass with runtime-quality gate | Fact #22 |
| Stable objects must not be flagged as moving due to platform motion | Compensation design directly targets this failure mode. | Pass | Fact #6 |
| Timestamped telemetry available | User confirmed full telemetry is available for MVP. | Pass | User decision |
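
The residual step can be illustrated without OpenCV. The sketch below assumes matched point pairs are already available (for example from `goodFeaturesToTrack` and `calcOpticalFlowPyrLK`) and swaps in a deliberately simplified translation-only background model, the median displacement, in place of a full global-motion estimate or telemetry fusion. `residual_movers` and the 3 px threshold are hypothetical.

```python
"""Sketch: residual motion after subtracting a translation-only global motion."""
from statistics import median
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def residual_movers(prev_pts: Sequence[Point],
                    next_pts: Sequence[Point],
                    threshold_px: float = 3.0) -> List[int]:
    """Return indices of tracked points whose motion disagrees with the background.

    Simplification: background motion is modelled as one median translation,
    which ignores rotation, zoom, and parallax.
    """
    if not prev_pts:
        return []
    dx = [n[0] - p[0] for p, n in zip(prev_pts, next_pts)]
    dy = [n[1] - p[1] for p, n in zip(prev_pts, next_pts)]
    global_dx, global_dy = median(dx), median(dy)
    movers = []
    for i, (ddx, ddy) in enumerate(zip(dx, dy)):
        residual = ((ddx - global_dx) ** 2 + (ddy - global_dy) ** 2) ** 0.5
        if residual > threshold_px:
            movers.append(i)
    return movers
```

Points that move with the platform cancel out against the median, so only independently moving points survive as candidates.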
### MVE: NanoLLM VILA1.5-3B local VLM ROI confirmation
- **Source**: Source #17, Source #21
- **Pinned mode**: NanoLLM multimodal chat with MLC backend, `Efficient-Large-Model/VILA1.5-3b`, quantised mode such as `q4f16_ft`, one bounded ROI crop, short prompt, short answer.
- **Inputs in the example**: Image-path prompt plus text prompt, e.g. `--prompt '/data/images/lake.jpg' --prompt 'please describe the scene.'`.
- **Outputs in the example**: Natural-language generated answer from the VLM.
- **Project inputs**: zoom-in ROI crop around path endpoint, branch pile, dark entrance, dugout, person, or vehicle candidate.
- **Project outputs required**: Confirmation label / reason that can be converted to POI metadata and operator display-box status.
- **Match assessment**: Exact API capability match for image+text ROI reasoning; latency and memory are runtime-quality gates.
#### Restrictions and AC binding — NanoLLM VILA1.5-3B
| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
|---|---|---|---|
| Local VLM, no cloud | NanoLLM runs local models. | Pass | Fact #23 |
| Separate IPC process | NanoLLM can run as a separate process / container invoked by local IPC. | Pass | Fact #23, Fact #27 |
| Sequential with YOLO | Scheduler can enforce no concurrent GPU execution. | Pass | Fact #27 |
| ≤5 s/ROI | API can process image prompts; exact latency must be benchmarked on Jetson. | Pass with runtime-quality gate | Fact #24 |
| ≤6 GB remaining RAM | Quantised mode is supported; exact memory must be benchmarked with YOLO container present. | Pass with runtime-quality gate | Fact #23, Fact #24 |
| MVP requires VLM if benchmark passes | User-confirmed policy. | Pass | User decision |
## Component fit matrix
### Top-level component fit matrix
| Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
|---|---|---|---|---|---|---|---|---|
| Tier 1 primitive detection | Ultralytics YOLO26 + YOLOE-26 | Fixed-class TensorRT FP16 engines, batch 1, fixed 1280 px input, no runtime open-vocabulary prompt mutation | Established/open-source + current SOTA | Fast zoom-out primitive boxes/masks for paths, roads, trees, branch piles, entrances | MVE above; docs: Source #15, #19 | Runtime open-vocabulary TensorRT APIs rejected; dynamic batch rejected | Selected with runtime-quality gate | Best fit with existing TensorRT/Cython service and FP16 restriction. |
| Tier 2 semantic analyzer | Primitive graph + lightweight custom CNN | ROI crop and Tier 1 primitives → POI score, path freshness, endpoint, concealment candidate | Simple baseline + custom model | Confirm and reason over primitives within ≤200 ms/ROI | Facts #10, #17, #25 | None at API level; data-quality gate remains | Selected | Keeps reasoning explainable and faster than VLM-first confirmation. |
| Movement detection | OpenCV 4.x optical flow / global motion + timestamped UAV / gimbal telemetry | Zoom-out and zoom-in frame pairs plus telemetry → residual moving-point / cluster boxes (per-zoom-band thresholds) | Established production baseline | Detect moving candidates while rejecting platform-induced motion at both zoom levels | MVE above; docs: Source #16 | Video-only mode is not selected for MVP. Zoom-in classical CV is benchmark-gated; learned fallback per Q14 if the FP cap fails. | Selected with runtime-quality gate | Directly matches user-confirmed telemetry and movement restrictions. |
| Tier 3 VLM confirmation | NanoLLM + VILA1.5-3B | MLC backend, quantised mode such as `q4f16_ft`, one bounded ROI crop, short prompt, short response | Open-source edge VLM | Local confirmation of endpoint / branch-pile / entrance / dugout ROI | MVE above; docs: Source #17, #21 | Must pass ≤5 s/ROI and memory gate; otherwise smaller-VLM fallback | Selected with runtime-quality gate | Satisfies local / no-cloud / VLM-required policy if benchmark passes. |
| Scan control | Typed deterministic state machine | `ZoomedOut`, `ZoomedIn`, `TargetFollow` states with POI queue, timeouts, target-loss, gimbal command adapters | Simple baseline | Camera sweep, zoom, POI servicing, target follow | Source #4, #22, Fact #26 | Behaviour tree deferred (canonical decomposition kept in `system-flows.md §F4`) | Selected | Small fixed lifecycle favours deterministic timing and testability. |
| Integration boundary | Existing FastAPI / Cython YOLO core + `scan_controller` scheduler + local IPC VLM process | Normalized-box contract + POI metadata; central GPU scheduler enforces sequential YOLO / VLM | Established production pattern | Integrate modules without compiling VLM into Cython | Fact #16, #27 | Unmanaged multiprocessing / concurrent GPU rejected | Selected | Preserves existing service and isolates memory-heavy VLM. |
### Sub-matrix: YOLO26 / YOLOE-26 fixed-class TensorRT FP16
| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
|---|---|---|---|
| Hardware: Jetson Orin Nano Super 8 GB | TensorRT FP16 is an NVIDIA GPU deployment path; memory must be benchmarked. | Pass with runtime-quality gate | Fact #1, Fact #20 |
| FP16 precision | Uses `half=True` TensorRT export. | Pass | Fact #19 |
| 1280 px model input | Export supports image-size configuration; use fixed 1280 px / batch 1. | Pass | Fact #19 |
| Existing tile splitting | Candidate accepts image / tiles and returns detections per tile. | Pass | Fact #16, Fact #19 |
| YOLO and VLM sequential | Tier 1 runs before VLM; scheduler prevents concurrency. | Pass | Fact #27 |
| Output normalized boxes | Existing DTO contract can wrap candidate outputs. | Pass | Fact #16 |
| New primitive classes | Fixed custom classes support the required primitive set. | Pass | Fact #19, Fact #21 |
| P ≥80 %, R ≥80 % and no degradation | Model API supports training / validation; actual performance is data / runtime quality. | Pass with runtime-quality gate | Fact #8, Fact #9 |
| All-season MVP | Requires dataset coverage rather than API feature. | Pass with data-quality gate | Fact #8, user confirmation |
### Sub-matrix: Primitive graph + lightweight CNN
| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
|---|---|---|---|
| Tier 2 ≤200 ms/ROI | Bounded ROI crop and lightweight CNN / rules keep workload limited. | Pass with runtime-quality gate | Fact #10, Fact #17 |
| Consumes YOLO primitives | Candidate uses primitive boxes / masks as primary input. | Pass | Fact #10 |
| Path freshness and endpoint tracing | Graph / path model represents path continuity and endpoint scoring. | Pass | Fact #10, Fact #17 |
| Branch choice at intersections | Queue / path scorer can select freshest / most promising branch by configured score. | Pass | Fact #10 |
| VLM sequentiality | Candidate can run before VLM and invoke VLM only after endpoint hold. | Pass | Fact #27 |
### Sub-matrix: OpenCV telemetry-aided movement detector
| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
|---|---|---|---|
| Both zoom levels | Runs during both zoom-out and zoom-in scan states. | Pass with runtime-quality gate | Fact #22 |
| Wide / light / medium zoom | Candidate consumes only selected zoom-state frames. | Pass | User confirmation, Fact #22 |
| Timestamped video / gimbal / UAV telemetry | User confirmed full telemetry is available for MVP. | Pass | User decision |
| Compensate UAV / gimbal motion | Optical flow / global motion plus telemetry estimate ego-motion before residuals. | Pass | Fact #6, Fact #22 |
| Enqueue within 1 s | Candidate operations support streaming implementation; exact latency is runtime quality. | Pass with runtime-quality gate | Fact #22 |
| Stable objects not treated as moving | Ego-motion compensation directly addresses this failure mode. | Pass | Fact #6 |
| Output normalized movement boxes | Residual clusters can be converted to normalized candidate boxes. | Pass | Fact #16 |
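
Converting a residual cluster to a normalized box is simple arithmetic. The sketch assumes a corner-format box `(x1, y1, x2, y2)` scaled to `[0, 1]`; the existing DTO layout is not reproduced in this document, so the format and the name `normalise_box` are assumptions.

```python
"""Sketch: convert a pixel-space box to a normalized corner-format box."""
from typing import Tuple


def normalise_box(x1: float, y1: float, x2: float, y2: float,
                  width: int, height: int) -> Tuple[float, float, float, float]:
    """Clamp the box to the frame and scale each coordinate to [0, 1]."""
    if width <= 0 or height <= 0:
        raise ValueError("frame size must be positive")

    def clamp(value: float, upper: float) -> float:
        return max(0.0, min(value, upper))

    left, right = sorted((clamp(x1, width), clamp(x2, width)))
    top, bottom = sorted((clamp(y1, height), clamp(y2, height)))
    return (left / width, top / height, right / width, bottom / height)
```

For a 1920 x 1080 frame, a cluster spanning pixels (480, 270) to (960, 540) maps to (0.25, 0.25, 0.5, 0.5).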
### Sub-matrix: NanoLLM VILA1.5-3B
| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
|---|---|---|---|
| Local VLM, no cloud | Runs local model through NanoLLM. | Pass | Fact #23 |
| Separate IPC process | Candidate can run as an isolated process / container behind local IPC. | Pass | Fact #23, Fact #27 |
| Sequential with YOLO | Scheduler grants VLM GPU slot only after YOLO / Tier 2 work. | Pass | Fact #27 |
| ≤5 s/ROI | API supports image+text prompt; exact latency is runtime quality. | Pass with runtime-quality gate | Fact #23, Fact #24 |
| ≤6 GB remaining RAM | Quantised mode supports smaller memory footprint; exact budget is runtime quality. | Pass with runtime-quality gate | Fact #23, Fact #24 |
| Required for MVP if benchmark passes | User-confirmed policy. | Pass | User decision |
| Output usable for operator display | Text confirmation can be converted into POI metadata while display box comes from Tier 1 / 2. | Pass | Fact #23 |
### Sub-matrix: scan controller state machine
| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
|---|---|---|---|
| Zoom-out route sweep | State machine owns sweep pattern and POI queueing. | Pass | Fact #26 |
| Zoom-out → zoom-in ≤2 s | State machine can command transition; physical zoom timing must be measured. | Pass with runtime-quality gate | Fact #5 |
| Zoom-in lock, pan, hold, timeout | Explicit states encode lock, follow path, endpoint hold, VLM request, timeout, return. | Pass | Fact #26 |
| Target-follow centre 25 % | Target-follow state can enforce centre-window metric. | Pass | Source #4, Fact #26 |
| Decision-to-movement ≤500 ms | Controller can timestamp commands; physical / protocol latency is runtime quality. | Pass with runtime-quality gate | Fact #4 |
| Ordered POI queue with confidence / proximity | Queue orders POIs by confidence / proximity and can also apply the user-confirmed ≤5 POIs/minute cap and ageing. | Pass | User decision |
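
A typed, table-driven state machine of the kind selected here might look like the sketch below. The three states are the ones named in this document; the event names and the transition table are assumptions chosen for illustration, not the canonical decomposition.

```python
"""Sketch: typed deterministic scan state machine with an explicit transition table."""
from enum import Enum


class State(Enum):
    ZOOMED_OUT = "ZoomedOut"
    ZOOMED_IN = "ZoomedIn"
    TARGET_FOLLOW = "TargetFollow"


class Event(Enum):
    # Hypothetical events; the real set belongs to the scan-control spec.
    POI_SELECTED = "poi_selected"
    ROI_DONE = "roi_done"
    TIMEOUT = "timeout"
    OPERATOR_CONFIRMED = "operator_confirmed"
    TARGET_LOST = "target_lost"


TRANSITIONS = {
    (State.ZOOMED_OUT, Event.POI_SELECTED): State.ZOOMED_IN,
    (State.ZOOMED_IN, Event.ROI_DONE): State.ZOOMED_OUT,
    (State.ZOOMED_IN, Event.TIMEOUT): State.ZOOMED_OUT,
    (State.ZOOMED_OUT, Event.OPERATOR_CONFIRMED): State.TARGET_FOLLOW,
    (State.ZOOMED_IN, Event.OPERATOR_CONFIRMED): State.TARGET_FOLLOW,
    (State.TARGET_FOLLOW, Event.TARGET_LOST): State.ZOOMED_OUT,
}


def step(state: State, event: Event) -> State:
    """Return the next state, or raise if the event is not valid in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event.name} is not allowed in {state.value}") from None
```

Because every legal transition is a table entry, tests can enumerate the table and assert that everything else is rejected.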
### Sub-matrix: integration scheduler and IPC
| Restriction / AC | Candidate-mode behaviour | Result | Evidence |
|---|---|---|---|
| Extend existing FastAPI + Cython service | Keeps existing YOLO core and adds scheduler / adapters around it. | Pass | Fact #16 |
| VLM separate IPC | VLM remains outside Cython and communicates locally. | Pass | Fact #23, Fact #27 |
| No concurrent YOLO / VLM GPU inference | Central scheduler serializes GPU-heavy work. | Pass | Fact #27 |
| Same normalized-box output | Integration layer preserves current DTOs and adds POI metadata separately. | Pass | Fact #16 |
| GPS-denied coordinates out of scope | Scheduler stores external coordinate references but does not estimate them. | Pass | Project restrictions |
| Annotation / training separate repos | Integration consumes trained-model artefacts and label schema only. | Pass | Project restrictions |
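
The serialisation rule can be sketched in-process with a single lock. This assumes the scheduler lives in one process and that YOLO and VLM work is requested through it; `GpuScheduler` and `slot` are hypothetical names, and a real cross-process VLM would need the grant to travel over the IPC channel instead.

```python
"""Sketch: central scheduler that grants one GPU slot at a time."""
import threading
from contextlib import contextmanager
from typing import Iterator, Optional


class GpuScheduler:
    """Serialises GPU-heavy work so YOLO and VLM never overlap."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.owner: Optional[str] = None  # who currently holds the slot

    @contextmanager
    def slot(self, owner: str) -> Iterator[None]:
        """Block until the GPU slot is free, then hold it for the `with` body."""
        with self._lock:
            self.owner = owner
            try:
                yield
            finally:
                self.owner = None
```

Usage is a `with scheduler.slot("yolo"):` block around each inference call, so the sequential restriction is enforced in one place rather than by convention.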
### Mode B revised fit additions
These rows extend the top-level matrix with the cross-cutting concerns surfaced during the second draft of the solution.
| Component Area | Candidate | Pinned Mode/Config | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
|---|---|---|---|---|---|---|---|
| Benchmark gate | Pre-implementation proof suite | Hardware-in-loop and replay benchmarks for Tier 1, Tier 2, VLM, A40 zoom, movement, all-season data readiness | Prevent implementation from depending on unproven runtime-quality assumptions | Facts #28, #33 | None | Selected | Converts draft caveats into explicit go / no-go gates. |
| Telemetry sync contract | Frame / telemetry alignment layer | Timestamped frame, gimbal angle, zoom, and UAV motion samples with maximum-skew validation | Make movement compensation testable and reproducible | Fact #32 | Telemetry-missing MVP rejected by user decision | Selected | Required for movement-detector exact fit. |
| VLM output contract | Structured `VlmAssessment` schema | Label enum, confidence, evidence spans, reason text, timeout / error status; validate before accepting | Prevent free-form VLM text from becoming an unstable API | Fact #31 | Raw free-text VLM output rejected | Selected | Needed for operator display and downstream logs. |
| IPC security | Unix-domain socket permissions + peer credentials | Local socket with filesystem permissions, peer-credential check, payload-size limits | Restrict local VLM callers and bound payload abuse | Fact #30 | Unauthenticated localhost HTTP rejected for VLM control plane | Selected | Local-only is not sufficient without local authorisation. |
| Input security | Image / ROI payload validation | MIME / format allow-list, size limits, patched OpenCV, decode sandbox where practical | Reduce crafted-input and resource-exhaustion risk | Fact #29, Fact #34 | Trusting headers / client filenames rejected | Selected | Existing service will process more image / ROI inputs. |
| Service reliability | Explicit errors and health semantics | No silent exception swallowing in touched paths; health reflects inference / scheduler / VLM availability | Make failures visible during scans and tests | Fact #35 | "Always healthy" failure masking rejected | Selected | Required before expanding mission-critical behaviour. |
## Validation
### Validation scenario
A winged UAV flies a planned route at 600–1000 m over mixed winter forest and field terrain. In `ZoomedOut`, the camera sweeps left-right at wide / light zoom. The system detects a faint footpath and a small moving dot, queues both, zooms to the path endpoint within 2 s (entering `ZoomedIn`), keeps the endpoint centred while the UAV moves, asks the local VLM for a bounded confirmation, then returns to `ZoomedOut`. Later, an operator confirms a target and `TargetFollow` mode keeps it in the centre 25 % of frame.
### Expected behaviour (based on conclusions)
- Tier 1 emits primitive boxes / masks for path, branch pile, road / tree context, and fixed known object classes.
- Movement detector compensates gimbal / UAV ego-motion with telemetry and optical flow before residual moving-cluster extraction.
- `scan_controller` queues POIs by confidence / proximity plus ageing and enforces ≤5 POIs/minute operator-review budget.
- The zoom-in transition and endpoint hold run through a deterministic state machine with timeouts and target-loss handling.
- VLM runs only on bounded ROI crops through local IPC and only when the scheduler grants the GPU slot.
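
The queueing behaviour in the list above can be sketched as a scoring function plus a sliding-window budget. Only the ≤5 POIs/minute cap comes from this document; the weights, the ageing rate, the field names, and the names `Poi`, `priority`, `next_poi`, and `ReviewBudget` are assumptions.

```python
"""Sketch: POI priority with ageing plus a sliding-window review cap."""
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional

AGEING_PER_S = 0.01  # assumed bonus per second waited


@dataclass
class Poi:
    poi_id: str
    confidence: float   # 0..1
    proximity: float    # 0..1, higher means closer to the current aim (assumed)
    enqueued_at: float  # seconds on the scan clock


def priority(poi: Poi, now: float) -> float:
    """Higher is served first; waiting POIs slowly gain priority."""
    age_s = now - poi.enqueued_at
    return 0.6 * poi.confidence + 0.4 * poi.proximity + AGEING_PER_S * age_s


def next_poi(queue: List[Poi], now: float) -> Optional[Poi]:
    return max(queue, key=lambda p: priority(p, now)) if queue else None


class ReviewBudget:
    """Allows at most `limit` POIs per sliding window."""

    def __init__(self, limit: int = 5, window_s: float = 60.0) -> None:
        self.limit, self.window_s = limit, window_s
        self._sent: Deque[float] = deque()

    def try_consume(self, now: float) -> bool:
        # Forget dispatches that have left the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False
        self._sent.append(now)
        return True
```

Ageing is what prevents queue starvation: a modest POI that has waited long enough eventually outranks a fresh high-confidence one.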
### Actual validation results
The architecture is internally consistent with the researched constraints and user confirmations. Runtime quality still requires hardware validation:
1. Tier 1 end-to-end frame latency for fixed-shape YOLO26 + YOLOE-26 FP16 engines at 1280 px.
2. ViewPro A40 medium-to-high zoom transition under the selected control protocol.
3. Movement false-positive rate with timestamped telemetry and representative zoom-out panning, plus zoom-in tracking. Both must satisfy per-zoom-band caps.
4. NanoLLM VILA1.5-3B ≤5 s/ROI and memory budget while the existing YOLO container is present.
5. All-season validation coverage and hard-negative mining.
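
Each latency item above reduces to comparing measured samples against a budget. The sketch below uses a nearest-rank percentile; gating on the 95th percentile is an assumption, since the criteria in this document state budgets (for example ≤100 ms/frame for Tier 1) without naming a percentile, and `percentile` and `gate_passes` are hypothetical helpers.

```python
"""Sketch: go / no-go gate on measured latencies against a budget."""
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample set."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def gate_passes(latencies_ms: Sequence[float], budget_ms: float,
                pct: float = 95.0) -> bool:
    """True when the chosen percentile of measured latency is within budget."""
    return percentile(latencies_ms, pct) <= budget_ms
```

Samples should be collected on the target hardware with the full pipeline running, since export support alone does not prove end-to-end latency.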
### Counterexamples
- If YOLOE TensorRT requires runtime prompt mutation for the chosen classes, it is not a valid MVP runtime path; use fixed trained classes only.
- If VILA1.5-3B misses memory or latency gates, MVP cannot claim VLM-required acceptance until a smaller local VLM passes the same API and runtime gates. In that case, `scan_controller` operates with VLM disabled per the optionality model in `architecture.md §7.6 Local VLM confirmation`.
- If telemetry is unavailable or unsynchronised, the movement detector must degrade to stabilised video-only mode and should not claim the zoom-out movement criteria.
### Review checklist
- [x] Draft conclusions are consistent with fact cards.
- [x] Important dimensions include hardware, model runtime, movement compensation, scan control, data, security, and integration boundaries.
- [x] No selected runtime component depends on cloud services.
- [x] No selected TensorRT YOLOE path depends on runtime open-vocabulary prompt mutation.
- [x] Runtime-quality gates are separated from API capability gates.
- [x] All selected components match the project constraint matrix at the API / architecture level.
### Conclusions requiring revision
None at research-draft level. Hardware benchmark failures may revise the selected model variants during planning or implementation.
## References (source registry)
Access date for web sources: 2026-05-06.
### Source #1 — Jetson Orin Nano Super Developer Kit
- **Link**: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit
- **Tier**: L1 (vendor primary)
- **Version info**: 67 INT8 TOPS, 8 GB LPDDR5, 102 GB/s
- **Boundary match**: Full match
- **Summary**: Official NVIDIA page for Jetson Orin Nano Super Developer Kit confirms 67 INT8 TOPS, 8 GB 128-bit LPDDR5 at 102 GB/s, 7–25 W power, generative-AI edge positioning.
- **Used for**: Latency and memory feasibility (Facts #1, #12, #13).
### Source #2 — NVIDIA JetPack 6.2 Super Mode
- **Link**: https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/
- **Tier**: L2 (vendor blog)
- **Version info**: JetPack 6.2 / Super Mode
- **Boundary match**: Full match
- **Summary**: Describes the software-enabled Super Mode performance gain and applicable power modes for Jetson Orin Nano.
- **Used for**: Reproducibility constraint (Fact #2).
### Source #3 — Ultralytics Jetson / TensorRT deployment
- **Link**: https://docs.ultralytics.com/yolov5/jetson_nano/
- **Tier**: L1 (vendor docs)
- **Boundary match**: Partial overlap
- **Summary**: Official Ultralytics documentation for Jetson deployment and TensorRT export.
- **Used for**: Tier 1 latency / deployment feasibility (Fact #3).
### Source #4 — ViewPro A40 Pro official spec
- **Link**: https://www.viewprotech.com/index.php?ac=article&at=read&did=561
- **Tier**: L1 (vendor primary)
- **Version info**: A40 Pro
- **Boundary match**: Full match
- **Summary**: 1080p output, 40× optical zoom, 4.25–170 mm focal range, 30 Hz tracking deviation update, <30 ms deviation output delay, 5×5 px minimum AI target size for built-in AI.
- **Used for**: Camera and gimbal control feasibility (Facts #4, #5).
### Source #5 — MONA: Moving Object Detection from Videos Shot by Dynamic Camera
- **Link**: https://arxiv.org/html/2501.13183v1
- **Tier**: L1 (peer-reviewed venue / arXiv)
- **Boundary match**: Partial overlap
- **Summary**: Optical flow + tracking-any-point + adaptive bounding-box filtering + segmentation for moving-camera object detection.
- **Used for**: Movement detection under moving camera (Facts #6, #7).
### Source #6 — Moving Object Detection from Moving Camera using Focus of Expansion Likelihood and Segmentation
- **Link**: https://arxiv.org/html/2507.13628v1
- **Tier**: L1
- **Boundary match**: Partial overlap
- **Summary**: Optical flow, focus-of-expansion likelihood, segmentation priors for moving-camera object detection.
- **Used for**: Movement detection under moving camera (Facts #6, #7).
### Source #7 — RFAG-YOLO: receptive-field attention-guided YOLO for small-object detection in UAV images
- **Link**: https://pmc.ncbi.nlm.nih.gov/articles/PMC11991089/
- **Tier**: L1
- **Boundary match**: Partial overlap
- **Summary**: UAV small-object detection difficulties; improvements from receptive-field attention.
- **Used for**: Detection-quality realism (Facts #8, #9).
### Source #8 — YOLO-CAM: lightweight UAV detector with combined attention for small targets
- **Link**: https://www.mdpi.com/2072-4292/17/21/3575
- **Tier**: L1
- **Boundary match**: Partial overlap
- **Summary**: Lightweight UAV small-target detection using attention.
- **Used for**: Detection-quality realism (Facts #8, #9).
### Source #9 — Accurate Natural Trail Detection (DNN + Dynamic Programming)
- **Link**: https://mdpi-res.com/d_attachment/sensors/sensors-18-00178/article_deploy/sensors-18-00178.pdf?version=1515603128
- **Tier**: L1
- **Boundary match**: Reference only
- **Summary**: Trail detection using DNN + dynamic programming; supports path-as-structured-perception view.
- **Used for**: Footpath / trail detection (Fact #10).
### Source #10 — Large-Scale Interactive Object Segmentation with Human Annotators
- **Link**: https://arxiv.org/pdf/1903.10830
- **Tier**: L1
- **Boundary match**: Reference only
- **Summary**: Interactive segmentation faster than manual polygon annotation while maintaining quality.
- **Used for**: Annotation effort (Fact #14).
### Source #11 — Scaling up Instance Annotation via Label Propagation
- **Link**: http://scaling-anno.csail.mit.edu/
- **Tier**: L1
- **Boundary match**: Reference only
- **Summary**: Label propagation reducing annotation effort for video / object masks.
- **Used for**: Annotation effort for movement sequences (Fact #15).
### Source #12 — Getting Started with VLM on Jetson Nano
- **Link**: https://learnopencv.com/vlm-on-jetson-nano/
- **Tier**: L3 (third-party tutorial)
- **Boundary match**: Partial overlap
- **Summary**: Small VLMs can run on Jetson-class hardware with careful runtime / memory tuning.
- **Used for**: Local VLM feasibility (Fact #12).
### Source #13 — NVIDIA Jetson AI Lab
- **Link**: https://www.jetson-ai-lab.com/
- **Tier**: L2
- **Boundary match**: Partial overlap
- **Summary**: NVIDIA-linked Jetson AI Lab as the official-adjacent source for model-specific local VLM deployment evidence.
- **Used for**: Local VLM feasibility (Fact #12).
### Source #14 — CAMOUFLAGE-Net / Improved YOLOv7 Tiny / FSEL camouflaged-object detection
- **Link**: https://link.springer.com/article/10.1007/s40031-024-01152-6
- **Tier**: L1
- **Boundary match**: Reference only
- **Summary**: Camouflage detection requires specialised features / attention; cannot be assumed from generic object detection.
- **Used for**: Concealed-target detection realism (Fact #11).
### Source #15 — Ultralytics YOLO TensorRT integration
- **Link**: https://docs.ultralytics.com/integrations/tensorrt/
- **Tier**: L1
- **Boundary match**: Full match
- **Summary**: Ultralytics export to TensorRT engine format with FP16 through `half=True`; TensorRT as the recommended high-performance NVIDIA deployment path.
- **Used for**: Tier 1 primitive detector (Facts #19, #21).
### Source #16 — OpenCV 4.x optical-flow / videostab APIs
- **Link**: https://docs.opencv.org/4.x/d4/dee/tutorial_optical_flow.html
- **Tier**: L1
- **Boundary match**: Full match
- **Summary**: Lucas-Kanade optical flow, feature tracking, global-motion estimation APIs useful for ego-motion compensation.
- **Used for**: Movement detection (Fact #22).
### Source #17 — NanoLLM multimodal documentation
- **Link**: https://github.com/dusty-nv/nanollm/blob/main/docs/multimodal.md
- **Tier**: L1
- **Boundary match**: Full match
- **Summary**: NanoLLM multimodal chat with `Efficient-Large-Model/VILA1.5-3b`, image prompts, MLC quantisation options.
- **Used for**: Local VLM confirmation (Fact #23).
### Source #18 — TensorRT FP16 YOLO export on Jetson — limitations and issues
- **Link**: https://docs.ultralytics.com/integrations/tensorrt/
- **Tier**: L1 / L4 mixed (docs + community issues)
- **Boundary match**: Full match
- **Summary**: Official docs support FP16 TensorRT export; community results highlight Jetson workspace / dynamic-shape memory issues, mitigated with fixed shapes and careful workspace configuration.
- **Used for**: Tier 1 latency / export reliability (Fact #20).
### Source #19 — YOLOE open-vocabulary detection and TensorRT export notes
- **Link**: https://v8docs.ultralytics.com/models/yoloe/
- **Tier**: L1 / L4 mixed
- **Boundary match**: Partial overlap
- **Summary**: YOLOE supports open-vocabulary detection / segmentation, but TensorRT-engine use should not rely on runtime prompt APIs; fixed trained classes are safer for MVP runtime.
- **Used for**: Semantic primitive detector (Fact #21).
### Source #20 — NanoSAM / MobileSAM Jetson Orin Nano segmentation
- **Link**: https://github.com/NVIDIA-AI-IOT/nanosam/
- **Tier**: L1 / L2
- **Boundary match**: Partial overlap
- **Summary**: NanoSAM / MobileSAM as Jetson-optimised segmentation; useful as ROI mask refinement / annotation assist rather than primary sweep model.
- **Used for**: Segmentation fallback (Fact #25).
### Source #21 — VILA1.5-3B and VLM performance on Jetson Orin Nano
- **Link**: https://dusty-nv.github.io/NanoLLM/multimodal.html
- **Tier**: L1 / L4 mixed
- **Boundary match**: Full match
- **Summary**: VILA1.5-3B documented for NanoLLM multimodal usage; community results warn that Orin Nano 8 GB requires strict context / token / crop limits.
- **Used for**: VLM feasibility (Fact #24).
### Source #22 — Behaviour trees for UAV autonomy
- **Link**: https://www.sciencedirect.com/science/article/pii/S0921889022000513
- **Tier**: L1
- **Boundary match**: Reference only
- **Summary**: Behaviour-tree literature supports modular, reactive UAV behaviour; this project's zoom-out / zoom-in scan behaviour is small enough for a deterministic FSM first.
- **Used for**: Scan controller architecture (Fact #26).
### Source #23 — TensorRT concurrency / multiprocessing issue evidence
- **Link**: https://github.com/NVIDIA/TensorRT/issues/2474
- **Tier**: L4 (community issue tracker)
- **Boundary match**: Partial overlap
- **Summary**: Multiple TensorRT engines / processes on one GPU can cause context and performance problems; central GPU scheduler is safer for sequential-inference restriction.
- **Used for**: Integration boundary / GPU scheduling (Fact #27).
### Source #24 — FastAPI file-upload security references
- **Link**: https://fastapi.tiangolo.com/tutorial/request-files
- **Tier**: L1 / L3 mixed
- **Boundary match**: Partial overlap
- **Summary**: Secure upload handling needs content-type verification beyond headers, size limits, streaming behaviour, cleanup, authorisation, audit logging.
- **Used for**: Security weak points (Fact #29).
### Source #25 — Unix-domain socket authentication and peer credentials
- **Link**: https://linux.die.net/man/7/unix
- **Tier**: L1 / L3 mixed
- **Boundary match**: Partial overlap
- **Summary**: Local IPC can use filesystem permissions and peer-credential checks (`SO_PEERCRED`) to restrict which processes may connect.
- **Used for**: VLM IPC security (Fact #30).
### Source #26 — Structured output for LLM / VLM production use
- **Link**: https://docs.vllm.ai/en/v0.6.5/usage/structured_outputs.html
- **Tier**: L1 / L3 mixed
- **Boundary match**: Reference only
- **Summary**: Production systems should constrain or validate model output against schemas before using it in APIs / databases.
- **Used for**: VLM output reliability (Fact #31).
### Source #27 — NVIDIA / Isaac ROS timestamp synchronisation
- **Link**: https://nvidia-isaac-ros.github.io/v/release-3.2/repositories_and_packages/isaac_ros_nova/isaac_ros_correlated_timestamp_driver/index.html
- **Tier**: L1 / L2 mixed
- **Boundary match**: Reference only
- **Summary**: Jetson sensor-fusion uses hardware / correlated timestamps to reduce synchronisation jitter.
- **Used for**: Telemetry synchronisation (Fact #32).
### Source #28 — NVIDIA TensorRT performance optimisation
- **Link**: https://docs.nvidia.com/deeplearning/tensorrt/latest/performance/optimization.html
- **Tier**: L1
- **Boundary match**: Partial overlap
- **Summary**: TensorRT performance depends on batching, engine configuration, and runtime scheduling; project-specific latency must be measured under the actual scheduler.
- **Used for**: GPU scheduler / benchmark gate (Fact #33).
### Source #29 — OpenCV CVE-2025-53644
- **Link**: https://securitylab.github.com/advisories/GHSL-2025-057_OpenCV
- **Tier**: L1 / L2 mixed
- **Version info**: OpenCV 4.10.0 / 4.11.0 affected; 4.12.0 patched
- **Boundary match**: Partial overlap
- **Summary**: Crafted image inputs caused critical OpenCV decoder vulnerabilities; image-input validation and pinned patched OpenCV versions matter.
- **Used for**: Image-processing security (Fact #34).
## Solution evolution
The final solution architecture (now in `architecture.md §7.6 Solution Architecture`) evolved from an earlier draft that under-specified several gating concerns. Each row below pairs an old draft component with the weak point that made it insufficient, and the corresponding fix that landed in the final design.
| Old component | Weak point | New solution |
|---|---|---|
| Tiered Jetson pipeline with runtime gates | Gates were listed as caveats, not as a concrete pre-implementation stage. | Add a mandatory benchmark gate before implementation decomposition: Tier 1 latency, Tier 2 ROI latency, VLM latency / memory, A40 zoom timing, movement replay, and all-season dataset readiness. |
| YOLO26 / YOLOE-26 TensorRT FP16 | YOLOE runtime prompt / open-vocabulary behaviour could be accidentally assumed. | Runtime uses only fixed trained classes / pre-baked embeddings in FP16 TensorRT; runtime open-vocabulary mutation remains rejected. |
| Movement detector with telemetry | Telemetry availability was confirmed, but synchronisation tolerance was not specified. | Add a telemetry-synchronisation contract with frame / gimbal / zoom / UAV timestamps and a maximum tolerated skew before motion compensation. |
| NanoLLM VLM IPC | Free-form VLM output is not a stable interface for operator-facing decisions. | Add a structured `VlmAssessment` schema, validation, retry / timeout handling, and fail-closed behaviour. |
| Local VLM process | "Local IPC authorisation" was too vague. | Use Unix-domain socket permissions plus peer-credential checks where available; enforce payload size limits. |
| FastAPI / image processing surface | The draft did not address file / image payload validation or OpenCV decoder risk. | Add content validation, image-format allow-list, size limits, patched OpenCV requirement, and audit logs. |
| Existing service integration | Existing code swallows several exceptions and reports healthy status even when inference fails. | Add reliability tasks for touched paths: explicit error propagation, meaningful health, structured failure logs. |
| Scan controller | Queue cap was present, but not tied to benchmark evidence. | Include ≤5 POIs/minute in replay tests and queue backpressure behaviour. |
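The telemetry-synchronisation row above can be illustrated with a minimal sketch. It assumes all four timestamps are already expressed on one monotonic clock in nanoseconds; the `SyncSample` type, its field names, and the tolerance value are hypothetical and do not come from the design documents.

```rust
/// Capture timestamps (monotonic nanoseconds) for one frame and the
/// telemetry samples paired with it. Type and field names are illustrative.
#[derive(Debug, Clone, Copy)]
struct SyncSample {
    frame_ns: u64,
    gimbal_ns: u64,
    zoom_ns: u64,
    uav_ns: u64,
}

impl SyncSample {
    /// Largest gap between any two of the four timestamps.
    fn skew_ns(&self) -> u64 {
        let ts = [self.frame_ns, self.gimbal_ns, self.zoom_ns, self.uav_ns];
        let min = ts.iter().copied().min().unwrap_or(0);
        let max = ts.iter().copied().max().unwrap_or(0);
        max - min
    }

    /// True when the sample is tight enough to feed motion compensation.
    /// `max_skew_ns` stands in for the contract's maximum tolerated skew.
    fn within_tolerance(&self, max_skew_ns: u64) -> bool {
        self.skew_ns() <= max_skew_ns
    }
}
```

A sample that fails the check would be dropped before motion compensation rather than used with mismatched gimbal or zoom state.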
## Historical seed
This is the original (March 2026) articulation of the semantic-detection problem that the system ultimately addresses. It is preserved here for traceability, as the seed of the entire `architecture.md §7.1 Problem` narrative. The reference images it mentions (`semantic01.png` through `semantic04.png`) lived in the original problem-side data parameters (deleted on doc consolidation 2026-05-17); they are not duplicated here.
The system currently consists of three main parts:
1. **AI object detection.** Detects objects in video or still images automatically, by class, using a pre-trained AI model. Detection is based on visual similarity. The idea is that the UAV detects objects automatically and acts on them. A reconnaissance UAV should deliver a short message with the detected image to the operator to confirm the target. The detection process is described in the suite-level detections doc.
2. **GPS-Denied.** Detection of the current GPS coordinates based on a downward-facing camera and IMU, AI models for optical flow, and pre-downloaded satellite imagery for the route of the plane. Implemented by the suite-level GPS-Denied service (`gps-denied-onboard`).
3. **Search algorithm.** Before the flight, the operator selects a region and a route. During the flight, the system uses the scanning strategy described in `architecture.md §7.2 Mission Regions and Reconnaissance Flow`.
But this whole workflow has a fundamental flaw, which lies in AI object detection. Regular object detection cannot help with the current frontline situation: it picks up old and already-destroyed vehicles, including military vehicles, which have zero value for the system.
Current targets are well hidden and well masked. Most are hidden positions of FPV operators; there are also well-hidden artillery positions and other masked positions. Simple object detection is not enough, because the main object to search for is a small entrance to a hidden safe place: typically a black circular or square hole near a building, or a dugout masked by tree branches, where personnel or artillery are hidden.
The reference images (deleted on doc consolidation 2026-05-17) showed the typical pattern: a footpath through forest or snow leading to a mass of black colour (mostly tree branches concealing a hideout); footpaths leading to open clearings used as FPV launch points; footpaths terminating at square hideout structures; footpaths terminating at tree-branch concealment.
The main research question that motivated the design: which AI can handle these tasks? Is it possible to instruct an AI to recognise these patterns, follow footsteps (fresh only) and footpaths, analyse the potential hideouts, and signal them to the operator? First, it should pick up footpaths; then distinguish stale from fresh footpaths; then find potential hideouts at the endpoints of the freshest footpaths; only then can it signal potential targets.
This question is now answered by the three-tier perception pipeline (Tier 1 fixed-class YOLO primitives → Tier 2 primitive-graph + lightweight ROI CNN → optional Tier 3 local VLM confirmation), the deterministic `scan_controller` state machine, and the H3-indexed `mapobjects_store`, all documented in `architecture.md §7 Detailed Design`.
@@ -0,0 +1,90 @@
# CI / CD Pipeline
**Status**: forward-looking design (Rust). Final pipeline file lands during build-system bring-up. The shape below describes the intent.
## 1. Goals
The pipeline must:
- Build the autopilot Rust binary cross-compiled for `aarch64-unknown-linux-gnu`.
- Run the full Rust test suite (unit + integration + replay-based) on every commit.
- Run a hardware-in-loop conformance gate against an ArduPilot SITL instance (covers `mavlink_layer` + `mission_executor`).
- Run a benchmark gate on representative target hardware (covers Tier 1 / Tier 2 / VLM / gimbal latency budgets — see `architecture.md §7.6 Benchmark gate`).
- Sign and publish artefacts (binary + container image) on tagged releases.
- Never auto-deploy to the airframe. Deployment is a human-driven operation tied to the suite's flight-gate convention (`/run/azaion/in-flight`).
## 2. Pipeline stages
Single Woodpecker pipeline, multi-stage. Stages run sequentially; a failed stage stops the run.
| Stage | Purpose | Notes |
|---|---|---|
| **fetch** | Clone, restore Cargo cache | `cargo fetch` with a remote cache key. |
| **lint** | `cargo fmt --check`, `cargo clippy --all-targets --all-features -- -D warnings` | Hard fail on any warning. |
| **unit-test** | `cargo test --workspace` (host-arch) | Most logic is platform-independent; runs in parallel on host. |
| **build-arm64** | Cross-compile for `aarch64-unknown-linux-gnu` | `cross` or `cargo zigbuild` depending on Rust toolchain. Produces the production binary + a debug symbol artefact. |
| **integration-test** | Replay-based integration tests under emulation | Fixtures: pre-recorded RTSP clip, MAVLink replay, synthetic telemetry. No hardware required. |
| **sitl-conformance** | ArduPilot SITL conformance gate | Spins up ArduPilot SITL + autopilot binary in a container; runs a fixed mission script; asserts MAVLink command surface (per `architecture.md §7.7`) and geofence enforcement. |
| **benchmark-gate** *(opt-in, manual / nightly)* | Tier 1 / 2 / VLM / gimbal latency on real Jetson | Runs on a self-hosted Jetson Orin Nano runner. Asserts `architecture.md §6 NFR` budgets. Slow; not on every PR. |
| **package** | Build container image (Option B from `containerization.md`) | Arch-suffixed tag: `azaion/autopilot:<branch>-arm64`. |
| **sign** | Sign binary + image | Cosign for the image; OS-vendor signing flow for the binary if used in native deployment. |
| **publish** | Push image + binary to internal registry | Tagged builds only. |
## 3. Artefacts
| Artefact | Where | Retention |
|---|---|---|
| `autopilot` binary (aarch64) | internal artefact store | last 10 builds per branch; tagged builds kept indefinitely |
| Debug symbols (`.dwp`) | internal artefact store, separate path | matched to binary lifetime |
| Container image | internal Docker registry | last 10 dev builds; tagged builds kept indefinitely |
| Cosign signature | next to image | matched to image lifetime |
| Test logs | CI run | per Woodpecker retention |
| Benchmark gate report | internal artefact store (Markdown + JSON) | per-tag retention |
## 4. Build matrix
Single matrix entry today:
| Toolchain | Target | Tier-1 dep | VLM feature |
|---|---|---|---|
| Rust stable | `aarch64-unknown-linux-gnu` | `../detections` (Cython service consumed via gRPC; not built here) | `cargo build --features vlm` (also `cargo build` without the feature; both must build) |
The `--features vlm` and the no-feature path are both built and tested to enforce the optionality contract from `architecture.md §7.6 Local VLM confirmation`.
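A minimal sketch of how the build-time feature and the runtime `vlm_enabled` flag could combine. The `VlmMode` enum and `vlm_mode` function are hypothetical names; only the `vlm` feature name and the `vlm_enabled` flag come from this document.

```rust
/// Effective VLM state after combining the build-time feature with the
/// runtime `vlm_enabled` flag. Names are illustrative.
#[derive(Debug, PartialEq, Eq)]
enum VlmMode {
    /// The `vlm` feature was not compiled in; the runtime flag has no effect.
    NotCompiled,
    /// Compiled in, but switched off in `config.toml`.
    DisabledAtRuntime,
    /// Compiled in and switched on.
    Active,
}

// The `allow` only silences the lint when this file is built with plain
// `rustc`; under Cargo the feature would be declared in `Cargo.toml`.
#[allow(unexpected_cfgs)]
fn vlm_mode(vlm_enabled: bool) -> VlmMode {
    if !cfg!(feature = "vlm") {
        VlmMode::NotCompiled
    } else if vlm_enabled {
        VlmMode::Active
    } else {
        VlmMode::DisabledAtRuntime
    }
}
```

Building and testing both configurations exercises both arms of the `cfg!` check, which is the point of keeping two build paths in the matrix.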
## 5. SITL conformance gate (in detail)
Stage runs in CI; produces a pass/fail signal that gates merge to `dev`.
**Setup:**
1. Start ArduPilot SITL in a container, listening on `udp://0.0.0.0:14550`.
2. Start autopilot binary configured for SITL endpoint.
3. Pre-load a fixture mission via the missions API mock (`mission_client` HTTP target).
4. Pre-load a fixture RTSP source (looped clip).
5. Mock the `../detections` service with deterministic detections.
**Assertions:**
- All MAVLink message kinds in `architecture.md §7.7` succeed at least once.
- Mission upload + start completes within the configured retry budget.
- INCLUSION geofence violation triggers RTL.
- EXCLUSION geofence violation triggers RTL (regression gate against the earlier silent-ignore behaviour).
- Middle-waypoint POST + re-upload succeeds within 2 s.
- Health endpoint returns `green` once steady state is reached.
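The two geofence assertions above can be sketched as one decision function. This is an illustration under stated assumptions: the axis-aligned `Fence` box is a stand-in for the real fence geometry, and every type and function name here is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FenceKind {
    Inclusion,
    Exclusion,
}

#[derive(Debug, PartialEq, Eq)]
enum FenceAction {
    Continue,
    ReturnToLaunch,
}

/// Axis-aligned box in degrees; a simplification of the real fence shape.
struct Fence {
    kind: FenceKind,
    min_lat: f64,
    max_lat: f64,
    min_lon: f64,
    max_lon: f64,
}

impl Fence {
    fn contains(&self, lat: f64, lon: f64) -> bool {
        lat >= self.min_lat && lat <= self.max_lat && lon >= self.min_lon && lon <= self.max_lon
    }

    /// Inclusion fences are violated outside the box, exclusion fences inside it.
    fn violated_by(&self, lat: f64, lon: f64) -> bool {
        match self.kind {
            FenceKind::Inclusion => !self.contains(lat, lon),
            FenceKind::Exclusion => self.contains(lat, lon),
        }
    }
}

/// Any violated fence, of either kind, yields RTL; nothing is silently ignored.
fn evaluate(fences: &[Fence], lat: f64, lon: f64) -> FenceAction {
    if fences.iter().any(|f| f.violated_by(lat, lon)) {
        FenceAction::ReturnToLaunch
    } else {
        FenceAction::Continue
    }
}
```

Handling both kinds in one `match` means that adding a new fence kind fails to compile until it is given explicit behaviour.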
## 6. Branch policy
| Branch | Triggers | Required gates |
|---|---|---|
| feature branches (PR) | on push | fetch → lint → unit-test → build-arm64 → integration-test → sitl-conformance |
| `dev` | on merge | all PR gates + package |
| tagged release (`v*`) | on tag | all `dev` gates + sign + publish + benchmark-gate (manual approval) |
`main` and `dev` are protected. Force-push is forbidden. Merges require a green pipeline.
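The branch policy and the stop-on-first-failure rule can be expressed as a small sketch. The `Trigger` enum and both functions are hypothetical; the gate names and their order are taken from the tables in this document.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Trigger {
    PullRequest,
    DevMerge,
    TaggedRelease,
}

const PR_GATES: [&str; 6] = [
    "fetch",
    "lint",
    "unit-test",
    "build-arm64",
    "integration-test",
    "sitl-conformance",
];

/// Gates required for a trigger, in execution order.
fn required_gates(trigger: Trigger) -> Vec<&'static str> {
    let mut gates = PR_GATES.to_vec();
    if matches!(trigger, Trigger::DevMerge | Trigger::TaggedRelease) {
        gates.push("package");
    }
    if trigger == Trigger::TaggedRelease {
        gates.extend(["sign", "publish", "benchmark-gate"]);
    }
    gates
}

/// Runs gates sequentially and returns the first failing gate, if any.
/// `run` is a caller-supplied closure that returns `true` on success.
fn run_gates<F>(gates: &[&'static str], mut run: F) -> Result<(), &'static str>
where
    F: FnMut(&str) -> bool,
{
    for gate in gates.iter().copied() {
        if !run(gate) {
            return Err(gate);
        }
    }
    Ok(())
}
```

In the real pipeline this ordering lives in the Woodpecker configuration; the sketch only makes the policy easy to reason about and test.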
## 7. Out of scope here
- Airframe deployment automation (manual; tied to flight-gate).
- Ground Station and `../detections` pipelines (each owns its own).
- AI training pipeline — `../_docs/12_ai_training.md`.
- Model-sync to the airframe (`model-sync.service`, suite-level — `../_docs/00_top_level_architecture.md`).
@@ -0,0 +1,142 @@
# Containerisation
**Status**: forward-looking design (Rust). Final shape will surface during build-system bring-up; treat the choices below as the current intent, not commitments.
## 1. Deployment shape
`autopilot` is a single Rust binary. Two delivery options are considered:
| Option | Form | Pros | Cons |
|---|---|---|---|
| **A — native systemd unit** | bare binary deployed to `/usr/local/bin/autopilot` + a `.service` unit | minimum overhead on Jetson; closest to airframe constraints; trivial flight-gate integration | per-host installation discipline; less reproducible across nodes |
| **B — single container image** | `azaion/autopilot:<branch>-arm64` | consistent across environments; matches the suite's existing OTA model (Watchtower) | container runtime adds startup latency and one more failure surface on the airframe |
The decision is **Option A** for the on-airframe deployment (lowest overhead, closest to the autopilot's real-time constraints), and **Option B** for development / CI / emulated-hardware testing (reproducibility wins). The same Rust binary is built once and packaged into both.
## 2. Target hardware
| Item | Value |
|---|---|
| Edge device | NVIDIA Jetson Orin Nano Super 8 GB |
| Architecture | aarch64 |
| OS | Ubuntu 22.04 (JetPack-bundled) — locked JetPack version + power mode |
| Camera | ViewPro A40 (RTSP + UDP control) |
| Autopilot | ArduPilot or PX4 over MAVLink v2 (UDP or serial) |
## 3. Native deployment (Option A — production)
**Layout:**
```text
/usr/local/bin/autopilot                 Rust binary
/etc/azaion/autopilot/config.toml        runtime config
/etc/systemd/system/autopilot.service    systemd unit
/var/lib/autopilot/                      persistent state (mapobjects_store)
/run/azaion/in-flight                    flight-gate marker (per ../_docs/00_top_level_architecture.md)
```
**systemd unit highlights:**
- `Type=notify` — autopilot signals readiness once Tier 1, gimbal, and MAVLink links are healthy.
- `Restart=on-failure`, `RestartSec=2s`, `StartLimitBurst=5` — bounded restart (so a hard-broken binary doesn't loop forever).
- `MemoryMax=` — enforces the on-airframe memory budget (~6 GB; Tier-1 YOLO container holds ~2 GB).
- `LimitNOFILE`, `LimitNPROC` set explicitly.
- `ExecStartPre=/bin/sh -c 'mkdir -p /run/azaion && touch /run/azaion/in-flight'` — asserts the suite-wide flight-gate so `model-sync.service` does not pull a new model mid-flight.
- `ExecStopPost=/bin/rm -f /run/azaion/in-flight` — clears the flight-gate on shutdown.
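For `Type=notify`, systemd passes a datagram socket path in the `NOTIFY_SOCKET` environment variable and treats the service as ready once it receives `READY=1`. The following is a std-only sketch of that handshake; the function name is hypothetical, and abstract-namespace sockets (paths starting with `@`) are deliberately not handled here.

```rust
use std::io;
use std::os::unix::net::UnixDatagram;

/// Sends `READY=1` to the given notify socket.
/// Returns `Ok(false)` when no usable socket path is provided, for example
/// when the process is not running under systemd.
fn sd_notify_ready(notify_socket: Option<&str>) -> io::Result<bool> {
    let path = match notify_socket {
        // Abstract-namespace sockets ('@' prefix) are out of scope for this sketch.
        Some(p) if !p.is_empty() && !p.starts_with('@') => p,
        _ => return Ok(false),
    };
    let sock = UnixDatagram::unbound()?;
    sock.send_to(b"READY=1", path)?;
    Ok(true)
}
```

A caller would invoke it once Tier 1, gimbal, and MAVLink links report healthy, for instance as `sd_notify_ready(std::env::var("NOTIFY_SOCKET").ok().as_deref())`.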
**Runtime config** (`/etc/azaion/autopilot/config.toml`) is the single source for non-secret configuration: RTSP URL, gimbal endpoint, MAVLink connection URI, missions API endpoint, Ground Station endpoint, VLM IPC socket path, `vlm_enabled` flag, log level. Secrets (if any — TBD per `../_docs/02_missions.md` auth model) come from the systemd `EnvironmentFile=` pointing at a permission-restricted file.
## 4. Container image (Option B — dev / CI / emulation)
**Base image:** `nvcr.io/nvidia/l4t-base:<JetPack-pinned-tag>` for production-equivalent NVDEC + TensorRT plumbing; `ubuntu:22.04` for emulation (no GPU acceleration).
**Image layout:**
```text
/usr/local/bin/autopilot                 Rust binary (built outside the image)
/etc/azaion/autopilot/config.toml        runtime config (mounted at runtime)
/var/lib/autopilot/                      persistent state (volume-mounted)
```
**Image is non-root.** Default `USER` is `autopilot:autopilot`; `/var/lib/autopilot/` is owned by that user.
**Compose example** (development):
```yaml
services:
  autopilot:
    image: azaion/autopilot:dev-arm64
    restart: unless-stopped
    environment:
      AUTOPILOT_CONFIG: /etc/azaion/autopilot/config.toml
    volumes:
      - ./config/autopilot.toml:/etc/azaion/autopilot/config.toml:ro
      - autopilot-state:/var/lib/autopilot
      - /run/azaion:/run/azaion
    devices:
      - /dev/ttyUSB0:/dev/ttyUSB0   # MAVLink serial (if used)
    network_mode: host              # RTSP / UDP gimbal / Ground Station modem all on host
volumes:
  autopilot-state: {}
```
`network_mode: host` is intentional on Jetson: RTSP, gimbal UDP, MAVLink UDP, and the modem-link to the Ground Station all share the airframe's network namespace.
## 5. External dependencies on the airframe
`autopilot` itself is the only autopilot-owned process. The on-airframe tier also runs (separately):
- **`../detections`** — Tier 1 YOLO service. Container delivered from its own pipeline. Bi-directional gRPC endpoint consumed by `detection_client`.
- **NanoLLM / VILA1.5-3B** (optional) — local IPC peer of `vlm_client`. Separate container or process; not embedded in the autopilot binary. Surfaces a Unix-domain socket; peer-credential check is mandatory when supported.
- **GPS-Denied service** — separate edge service, owned by `gps-denied-onboard`; consumed indirectly through the shared edge data path (per `../_docs/11_gps_denied.md`).
- **`model-sync.service`** — suite-wide rclone-driven model puller. Reads `/run/azaion/in-flight` to defer model swaps during flight (per `../_docs/00_top_level_architecture.md`).
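The filesystem-permission half of the VLM IPC restriction can be sketched with the standard library alone. This sketch assumes the listening side owns the socket path; the function name and the `0o600` mode are illustrative choices, and the peer-credential check is not shown.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Binds a Unix-domain socket and restricts it to its owner (mode 0600).
/// There is a short window between `bind` and `set_permissions`; a real
/// deployment would also keep the socket inside a directory that only the
/// service user can enter.
fn bind_restricted(path: &Path) -> io::Result<UnixListener> {
    // Remove a stale socket left behind by a previous run, if any.
    match fs::remove_file(path) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    let listener = UnixListener::bind(path)?;
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))?;
    Ok(listener)
}
```

With this mode, only processes running as the socket owner (or root) can connect, which narrows the set of peers before any credential check runs.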
## 6. Configuration surface
All configuration is declarative (`config.toml`); there is no compile-time configuration of endpoints, addresses, or feature switches **except** the `vlm_client` build-time feature flag (see `architecture.md §7.6 Local VLM confirmation > Optionality model`).
| Concern | Mechanism |
|---|---|
| RTSP / gimbal / MAVLink endpoints | `config.toml` |
| `missions` API endpoint + auth | `config.toml` (auth pulled from `EnvironmentFile=`) |
| Ground Station endpoint | `config.toml` |
| VLM IPC socket path | `config.toml` |
| `vlm_enabled` runtime flag | `config.toml` |
| `vlm_client` build-time feature | `cargo build --features vlm` |
| Log level + format | `RUST_LOG` env (`tracing-subscriber` honours it) |
| Mission ID for the current flight | CLI arg (per-flight, not per-host) |
## 7. Health endpoint
`autopilot` exposes a single HTTP health endpoint (port and bind address from `config.toml`; default `127.0.0.1:8080`). It aggregates per-component readiness:
```json
{
  "status": "green | yellow | red",
  "components": {
    "frame_ingest": "green",
    "detection_client": "green",
    "movement_detector": "green",
    "semantic_analyzer": "green",
    "vlm_client": "disabled",
    "scan_controller": "green",
    "mapobjects_store": "green",
    "gimbal_controller": "green",
    "operator_bridge": "yellow",
    "mission_executor": "green",
    "mavlink_layer": "green",
    "mission_client": "green",
    "telemetry_stream": "green"
  },
  "last_state_change": "2026-05-17T12:00:00Z"
}
```
`yellow` is degraded-but-running; `red` is unrecoverable for at least one essential component. The aggregator surfaces details on each transition through `tracing` (see `observability.md`).
## 8. Out of scope here
- Provisioning the Jetson host itself (Ansible / Kickstart / disk imaging) — owned by airframe ops.
- Build pipeline (cross-compile, signing, registry push) — see `ci_cd_pipeline.md`.
- Observability stack (tracing exporter, log shipper, metrics scraper) — see `observability.md`.
- Mission delivery to the airframe — owned by `missions` API.
@@ -0,0 +1,142 @@
# Observability
**Status**: forward-looking design (Rust). Treat the choices below as the intended approach; the exact tracing exporter / metrics scraper / log-shipping target depend on the suite's overall observability stack at deploy time.
## 1. Posture
- **One binary, one process.** Per-component instrumentation is structured (each component listed in `architecture.md §3` is a `tracing` target).
- **Structured logs are primary.** Metrics are derived from log spans and counters; traces follow a frame's journey through the pipeline end to end.
- **No silent error swallowing.** Every failure path increments a counter, emits a span event, or both.
- **Health is aggregated**, not derived from logs. The HTTP health endpoint (`containerization.md §7`) is the source of truth for live readiness.
## 2. Logs
**Library**: `tracing` + `tracing-subscriber`.
**Format**: JSON to stdout. Captured by the host's journald (Option A) or by the container runtime (Option B), then shipped to the suite's log aggregator.
**Per-line fields:**
| Field | Source | Notes |
|---|---|---|
| `ts` | wall clock | ISO-8601 UTC. |
| `ts_mono_ns` | monotonic clock | For ordering across components without clock-skew artefacts. |
| `level` | `tracing` | `error \| warn \| info \| debug \| trace`. |
| `target` | component name | One of `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client`, `scan_controller`, `mapobjects_store`, `gimbal_controller`, `operator_bridge`, `mission_executor`, `mavlink_layer`, `mission_client`, `telemetry_stream`. |
| `frame_seq` | propagated context | Where applicable. Lets us reconstruct one frame's journey. |
| `poi_id`, `roi_id`, `target_id`, `mission_id`, `command_id` | propagated context | Where applicable. |
| `event` | message | Short, machine-friendly identifier (e.g., `frame.dropped`, `vlm.timeout`, `mission.geofence_violation`, `bit.check_failed`, `failsafe.lost_link`, `mapobjects.push_failed`, `operator.auth_rejected`). |
| `model_version` | propagated context | Version string for `tier1_model_version` and `vlm_model_version`. Required on every `vlm.response` and on every Tier-2 evidence span for forensic correlation. |
| `wall_clock_source` | telemetry frame | `gnss \| host \| coast`; emitted on every state-transition span and on every operator-command audit log line. |
| `reason` | message | Free-form for human readers. |
**Log level defaults:**
- `info`: lifecycle (startup / shutdown / state transitions), all error and security events.
- `warn`: degraded-but-running events (yellow health, retries, drops).
- `error`: red health, hard failures, schema violations, security violations.
- `debug` / `trace`: off in production; enabled per-target via `RUST_LOG`.
**Always logged at `warn` or higher** (per `coderule.mdc`):
- Every exception path that the operator could care about.
- Authentication / authorisation failures (peer-cred check failures on VLM IPC, malformed Ground Station session, MAVLink-2 signing rejection).
- Geofence violations.
- Schema validation failures (Tier 1 response, VLM response, mission JSON).
## 3. Metrics
Derived from log spans + a small set of explicit counters. Exporter: Prometheus-compatible (per the suite's stack).
**Per-component counters** (illustrative — exact names finalised at implementation):
| Component | Counter | Type |
|---|---|---|
| `frame_ingest` | `frames_received_total`, `frames_dropped_total{reason}`, `decode_errors_total` | counter |
| `frame_ingest` | `decode_ms` | histogram |
| `detection_client` | `requests_total`, `errors_total{kind}`, `latency_ms` | counter / histogram |
| `movement_detector` | `candidates_total`, `telemetry_skew_drops_total` | counter |
| `semantic_analyzer` | `tier2_runs_total`, `tier2_latency_ms`, `tier2_oversize_total` | counter / histogram |
| `vlm_client` | `vlm_requests_total{status}`, `vlm_latency_ms` | counter / histogram |
| `scan_controller` | `state_transitions_total{from,to}`, `pois_in_queue`, `pois_per_min`, `tick_latency_ms` | counter / gauge / histogram |
| `mapobjects_store` | `classify_total{result}`, `ignored_items_total`, `removed_candidates_total` | counter |
| `gimbal_controller` | `commands_total`, `decision_to_movement_ms`, `zoom_transition_ms`, `vendor_faults_total` | counter / histogram |
| `mavlink_layer` | `messages_in_total{kind}`, `messages_out_total{kind}`, `command_acks_total{result}`, `parse_errors_total`, `link_state` | counter / gauge |
| `mission_executor` | `state_transitions_total{from,to}`, `mission_uploads_total{result}`, `geofence_violations_total{kind}` | counter |
| `mission_client` | `fetches_total{result}`, `middle_waypoint_posts_total{result}`, `mapobjects_pull_total{result}`, `mapobjects_push_total{result}`, `mapobjects_pull_bytes`, `mapobjects_push_bytes`, `mapobjects_sync_lag_s` | counter / gauge |
| `mission_executor` (BIT) | `bit_runs_total{result}`, `bit_check_failures_total{check}` | counter |
| `mission_executor` (failsafe) | `link_loss_events_total{trigger}`, `failsafe_action_total{action}` | counter |
| `operator_bridge` | `pois_surfaced_total`, `commands_received_total{kind,result}`, `decision_latency_ms`, `auth_rejections_total{reason}`, `command_e2e_ms` | counter / histogram |
| `telemetry_stream` | `bytes_out_total`, `frames_out_total`, `link_state`, `bandwidth_used_mbps` | counter / gauge |
**Aggregated:**
- `health_state{component}` — 0 (red) / 1 (yellow) / 2 (green); enables alerting per-component.
- `process_uptime_seconds`, `process_resident_memory_bytes` — standard.
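As an illustration of the counter and `health_state` conventions above, here is a std-only sketch that renders values in the Prometheus text exposition style. The `Metrics` type and its methods are hypothetical; a real build would more likely use an existing metrics crate.

```rust
use std::collections::BTreeMap;

/// Numeric encoding used by `health_state{component}`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Health {
    Red = 0,
    Yellow = 1,
    Green = 2,
}

#[derive(Default)]
struct Metrics {
    /// (metric name, pre-formatted label set) -> counter value.
    counters: BTreeMap<(String, String), u64>,
    /// component name -> current health.
    health: BTreeMap<String, Health>,
}

impl Metrics {
    fn inc(&mut self, name: &str, labels: &str) {
        *self
            .counters
            .entry((name.to_string(), labels.to_string()))
            .or_insert(0) += 1;
    }

    fn set_health(&mut self, component: &str, state: Health) {
        self.health.insert(component.to_string(), state);
    }

    /// Renders one `name{labels} value` line per series.
    fn render(&self) -> String {
        let mut out = String::new();
        for ((name, labels), value) in &self.counters {
            if labels.is_empty() {
                out.push_str(&format!("{} {}\n", name, value));
            } else {
                out.push_str(&format!("{}{{{}}} {}\n", name, labels, value));
            }
        }
        for (component, state) in &self.health {
            out.push_str(&format!(
                "health_state{{component=\"{}\"}} {}\n",
                component, *state as u8
            ));
        }
        out
    }
}
```

Keeping the numeric health encoding in one enum keeps alert thresholds (for example `health_state < 2`) consistent across components.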
## 4. Traces
`tracing` spans cover the path of a single frame and the path of a single POI.
**Frame trace** (per `Frame`):
```text
frame_ingest.publish
detection_client.request
detection_client.response
movement_detector.tick
[movement_detector.emit_candidate]
telemetry_stream.push
```
**POI trace** (per `POI`):
```text
scan_controller.enqueue
scan_controller.dequeue
gimbal_controller.zoom
semantic_analyzer.tier2
[vlm_client.request -> vlm_client.response]
operator_bridge.surface
[operator_bridge.confirm | decline | timeout]
mission_executor.middle_waypoint # confirm path
mapobjects_store.append_ignored # decline path
```
Spans propagate via context across in-process channels. Trace export target depends on the suite's stack (OTLP / Jaeger / Tempo).
## 5. Health endpoint
See `containerization.md §7`. The endpoint is the operator-facing readiness API; metrics + logs are the engineer-facing investigation API.
A red health state for any of these components is unrecoverable for the current flight:
- `frame_ingest` red → no input → cannot operate.
- `mavlink_layer` red → no UAV control → trigger RTL via the flight controller's failsafe (ArduPilot / PX4 itself enforces this when the MAVLink heartbeat stops).
- `mission_executor` red → mission lifecycle stuck → operator must take RC control.
A red health state for these components is degraded-but-survivable:
- `detection_client` → continue zoom-out sweep; lose Tier 1.
- `movement_detector` → continue; lose movement-candidate POI source.
- `semantic_analyzer` → continue; surface Tier-1-only POIs.
- `vlm_client` → fail-closed (POIs surfaced without VLM evidence).
- `mapobjects_store` → continue with in-memory state; persistent diff lost on restart. Sync state may transition to `degraded` (stale cache; operator visible).
- `mapobjects_sync` (logical, owned by `mission_client`) → mission proceeds with stale snapshot; post-flight push retries via leftover spool. Operator sees `mapobjects_sync = degraded`.
- `operator_bridge` / `telemetry_stream` → continue zoom-out sweep; pause POI surfacing; resume on reconnect. F10 lost-link ladder owns the larger response.
- `gimbal_controller` → pause zoom-in / target-follow; zoom-out sweep stops.
- `mission_client` → continue current mission from in-memory copy.
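The two lists above amount to a lookup from component name to flight impact. A minimal sketch, assuming a string-keyed lookup; the `RedImpact` type and `red_impact` function are hypothetical names used only for illustration:

```rust
/// What a red health state means for the current flight.
#[derive(Debug, PartialEq, Eq)]
pub enum RedImpact {
    /// The flight cannot continue normally.
    Unrecoverable,
    /// The flight continues with reduced capability.
    Degraded,
}

/// Classify a red component according to the two lists in this section.
/// Components not named in either list return `None`.
pub fn red_impact(component: &str) -> Option<RedImpact> {
    match component {
        "frame_ingest" | "mavlink_layer" | "mission_executor" => {
            Some(RedImpact::Unrecoverable)
        }
        "detection_client" | "movement_detector" | "semantic_analyzer"
        | "vlm_client" | "mapobjects_store" | "mapobjects_sync"
        | "operator_bridge" | "telemetry_stream" | "gimbal_controller"
        | "mission_client" => Some(RedImpact::Degraded),
        _ => None,
    }
}
```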
## 6. Replay-driven debugging
All non-trivial decisions in `scan_controller`, `movement_detector`, `semantic_analyzer`, `vlm_client`, and `mission_executor` are reconstructable from logs + the (size-capped) raw inputs that drove them:
- Frame seq, gimbal state at decode, telemetry sample used, Tier-1 detections returned, Tier-2 score, VLM raw response (size-capped), operator command, resulting state transition.
This is the foundation of the replay-based integration tests in `ci_cd_pipeline.md §2`.
## 7. Out of scope here
- Suite-wide observability stack choice (OTLP vs Loki vs Tempo vs Promtail) — owned by suite ops.
- Persistent log retention policy — owned by suite ops.
- Alerting routing (Slack / PagerDuty / email) — owned by suite ops.
+113
@@ -0,0 +1,113 @@
# autopilot — Glossary
**Status**: confirmed-by-user (2026-05-17), updated for the rewrite paradigm.
Project-specific terms only. Generic CS / industry terms (RTSP, gRPC, FastAPI, MAVLink, JSON, etc.) are intentionally omitted.
---
**AUTO mode** — fixed-wing autopilot flight mode in which the airframe follows its uploaded mission. `mission_executor`'s fixed-wing variant uploads the mission and waits for the operator to switch the airframe into AUTO via RC; only then does it transition to `FLY_MISSION`. source: `architecture.md §7.7`.
**Behaviour tree (BT)** — hierarchical decision-making model (Selector / Sequence / Condition / Action / Decorator) ticked from the root every cycle. The canonical decomposition of `scan_controller`'s logic. The implementation may use a typed deterministic state machine that satisfies the same priorities, preemption, and tick scenarios. source: `system-flows.md §F4`.
**Benchmark gate** — proof-of-concept milestone. Tier 1 ≤100 ms/frame, Tier 2 ≤200 ms/ROI, VLM ≤5 s/ROI, A40 transition ≤2 s, decision-to-movement ≤500 ms, ≤5 POIs/min. Must pass before product code begins. source: `architecture.md §7.6 Benchmark gate`.
**`../detections`** — separate sibling repo. FastAPI / Cython service running TensorRT YOLO26 + YOLOE-26 FP16 engines. Tier 1 primitive detection lives here, NOT in autopilot. Consumed by `detection_client` over bi-directional gRPC. source: `architecture.md §1`, `../_docs/03_detections.md`.
**detection_client** — autopilot component: bi-directional gRPC client to `../detections`; streams frames out, receives bounding boxes back; same bboxes are reused for Tier 2 ROI selection and for operator overlay. source: `components/detection_client/description.md`.
**Confidence-scaled timeout** — operator-decision window scales linearly with target confidence: 40 % → 30 s, 100 % → 120 s. Below 40 % a target is not surfaced. Timeout = forget; decline = `IgnoredItem` entry. source: `architecture.md §5 Architectural Principles`.
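The linear scaling can be sketched as a straight interpolation between the two stated endpoints. This is a minimal sketch, assuming confidence is a fraction in `[0, 1]` and that values above 1.0 are clamped; the function name `decision_window` is hypothetical.

```rust
use std::time::Duration;

/// Operator-decision window for a target with the given confidence.
/// Returns `None` below 40 %, because such targets are not surfaced.
pub fn decision_window(confidence: f64) -> Option<Duration> {
    const MIN_CONF: f64 = 0.40; // 40 % maps to 30 s
    const MIN_SECS: f64 = 30.0;
    const MAX_SECS: f64 = 120.0; // 100 % maps to 120 s

    if confidence.is_nan() || confidence < MIN_CONF {
        return None;
    }
    // Assumption: anything above 100 % is treated as 100 %.
    let c = confidence.min(1.0);
    // Position of `c` on the 40 %..100 % range, from 0.0 to 1.0.
    let t = (c - MIN_CONF) / (1.0 - MIN_CONF);
    Some(Duration::from_secs_f64(MIN_SECS + t * (MAX_SECS - MIN_SECS)))
}
```

Under these assumptions a 70 % target sits halfway along the range and gets roughly 75 s.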
**Ego-motion compensation** — separating target motion from platform motion in `movement_detector`, using synchronised video + gimbal angle + zoom state + UAV telemetry. Naive frame-differencing is explicitly rejected. Per-zoom-band thresholds: tighter at zoom-in. source: `architecture.md §5 Architectural Principles`, `components/movement_detector/description.md`.
**Flight-gate** — suite-wide convention: a marker file (`/run/azaion/in-flight`) written by autopilot at startup and removed at shutdown, read by `model-sync.service` to defer model swaps during flight. source: `../_docs/00_top_level_architecture.md`, `deployment/containerization.md §3`.
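One way to express the write-at-startup, remove-at-shutdown convention is an RAII guard. This is a sketch under assumptions: the `FlightGate` type and its methods are hypothetical, and the marker path is passed in rather than hard-coded so the example can run against a temporary directory.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Holds the in-flight marker for as long as the value is alive.
pub struct FlightGate {
    marker: PathBuf,
}

impl FlightGate {
    /// Create (or truncate) the marker file, e.g. `/run/azaion/in-flight`.
    pub fn acquire(marker: impl Into<PathBuf>) -> io::Result<Self> {
        let marker = marker.into();
        fs::write(&marker, b"")?;
        Ok(Self { marker })
    }

    /// What a reader such as a model-sync job would check.
    pub fn is_in_flight(marker: &Path) -> bool {
        marker.exists()
    }
}

impl Drop for FlightGate {
    /// Remove the marker on orderly shutdown; errors are ignored here.
    fn drop(&mut self) {
        let _ = fs::remove_file(&self.marker);
    }
}
```

`Drop` does not run if the process is killed or aborts, so a real deployment would also need the marker cleared on the next boot; placing it under `/run` (normally a tmpfs) is one way to get that.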
**frame_ingest** — autopilot component: pulls RTSP from ViewPro A40, decodes, timestamps, hands frames to `detection_client`, `movement_detector`, and `telemetry_stream`. source: `components/frame_ingest/description.md`.
**Geofence (INCLUSION / EXCLUSION)** — polygonal area constraint in the mission. **Both** are enforced symmetrically in the rewrite (`mission_executor`); a violation triggers RTL. source: `architecture.md §5 Architectural Principles`, `§7.7 MAVLink and Piloting`, `components/mission_executor/description.md`.
**gimbal_controller** — autopilot component: ViewPro A40 control protocol (yaw / pitch / zoom) + zoom-out sweep + zoom-in path-follow + target-follow centre-window. source: `components/gimbal_controller/description.md`.
**Ground Station API** — external, out-of-this-repo service that receives a continuous camera + telemetry stream from each UAV and hosts the operator browser UI (bbox overlay, target confirm/decline). Not built; not in autopilot scope. source: `architecture.md §1`, `../_docs/04_system_design_clarifications.md`.
**Hand-rolled MAVLink layer** — `mavlink_layer` implements only the ~10-15 MAVLink commands this codebase actually uses, with no third-party SDK. This eliminates the largest dependency-risk item. source: `architecture.md §7.7`, `components/mavlink_layer/description.md`.
**H3 spatial index** — hexagonal hierarchical geospatial indexing used by `mapobjects_store` for fast new / moved / existing / removed diffs. source: `architecture.md §7.9`, `components/mapobjects_store/description.md`.
**IgnoredItem** — operator-declined target. Persisted in `mapobjects_store` as `(MGRS, class_group)`; new detections matching an entry are suppressed before reaching the operator. source: `architecture.md §7.12`, `components/mapobjects_store/description.md`, `data_model.md §IgnoredItem`.
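The suppression check can be sketched as a set lookup on the `(MGRS, class_group)` pair. This is a minimal sketch, assuming both parts are compared as exact strings at a fixed MGRS precision; the `IgnoredItems` type, its methods, and the sample values in the usage note are hypothetical.

```rust
use std::collections::HashSet;

/// In-memory set of operator-declined `(MGRS, class_group)` pairs.
#[derive(Default)]
pub struct IgnoredItems {
    entries: HashSet<(String, String)>,
}

impl IgnoredItems {
    /// Record an operator decline.
    pub fn decline(&mut self, mgrs: &str, class_group: &str) {
        self.entries
            .insert((mgrs.to_owned(), class_group.to_owned()));
    }

    /// True when a new detection matches a declined entry and should
    /// be dropped before it reaches the operator.
    pub fn is_suppressed(&self, mgrs: &str, class_group: &str) -> bool {
        self.entries
            .contains(&(mgrs.to_owned(), class_group.to_owned()))
    }
}
```

For example, after `decline("36UXA1234512345", "vehicle")` a later detection with the same cell and class group is suppressed, while a different class group in the same cell is not.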
**Jetson Orin Nano** — edge-device compute platform (NVIDIA, aarch64, CUDA-capable). 8 GB shared LPDDR5; ~2 GB used by Tier 1, ~6 GB available for the rest of autopilot + VLM. source: `architecture.md §7.3`.
**Zoom-out / zoom-in scan** — two-tier search behaviour. **Zoom-out level** = wide / light-medium zoom sweep along the UAV route, runs `movement_detector` + Tier 1. **Zoom-in level** = zoom into a queued POI for Tier 2 ROI analysis + optional VLM confirmation; `movement_detector` continues to run with per-zoom-band thresholds. State-machine variants: `ZoomedOut`, `ZoomedIn { roi, hold_started_at }`. source: `architecture.md §7.1`, `components/scan_controller/description.md`.
**MapObject** — entry in `mapobjects_store`; keyed by `H3_cell + class`; carries GPS, size, class, and a list of recent observations. source: `architecture.md §7.9`, `data_model.md §MapObject`.
**mapobjects_store** — autopilot component: on-device H3-indexed map of detected objects + ignored-items list. No REST API. source: `components/mapobjects_store/description.md`.
**mavlink_layer** — autopilot component: hand-rolled MAVLink v2 transport + the small command set this codebase needs. source: `components/mavlink_layer/description.md`.
**MGRS** — Military Grid Reference System; primary coordinate encoding for autopilot ⇄ operator sync messages and for `mapobjects_store` keys. source: `architecture.md §7.10`, `data_model.md §MGRS sync message`.
**Middle waypoint** — autopilot-inserted waypoint between the current position and the next mission waypoint, computed from an operator-confirmed POI. Triggers a mission re-upload (`MISSION_CLEAR_ALL` + standard upload sequence). source: `architecture.md §7.7`, `components/mission_executor/description.md`.
**mission_client** — autopilot component: pulls the mission from the `missions` API on start; POSTs middle-waypoint inserts; honours the mission cascade signal. source: `components/mission_client/description.md`.
**mission_executor** — autopilot component: variant-specific (multirotor / fixed-wing) state machine that drives the airframe through connect → health-check → arm/takeoff (multirotor) or wait-for-AUTO (fixed-wing) → upload → fly → land. Owns geofence enforcement. source: `architecture.md §7.7`, `components/mission_executor/description.md`.
**mission-schema** — shared schema artefact between `autopilot` and `missions` repos. Extraction location TBD (`_infra/` at suite root, or a small third repo) — `architecture.md §8 Q5`. source: `architecture.md §5`.
**`missions`** — separate sibling repo (.NET service). Hosts the missions REST API. Stays separate from `autopilot`; the two share `mission-schema`. source: `../_docs/02_missions.md`.
**Movement candidate** — small moving point/cluster emitted by `movement_detector` in either zoom-out or zoom-in. Tagged with `source_zoom_band`. Promoted to a zoom-in POI by `scan_controller` (or used to bump in-progress ROI confidence at zoom-in). source: `architecture.md §7.4`, `data_model.md §MovementCandidate`.
**movement_detector** — autopilot component: OpenCV optical-flow / global-motion estimation with mandatory ego-motion compensation. Active at **both** zoom-out and zoom-in (suppressed only during target-follow); per-zoom-band thresholds. Classical-CV adequacy at zoom-in is benchmark-gated; learned-CV fallback per Q14. source: `components/movement_detector/description.md`.
**operator_bridge** — autopilot component: surfaces POIs (via `telemetry_stream` → Ground Station) for operator confirm / decline; forwards target-follow start / release; on decline appends an `IgnoredItem`. source: `components/operator_bridge/description.md`.
**Optionality model (VLM)** — VLM is the only optional Tier. Two complementary mechanisms: a runtime `vlm_enabled` flag, and a build-time feature module. The system MUST function correctly with VLM absent. source: `architecture.md §7.6 Local VLM confirmation > Optionality model`, `components/vlm_client/description.md §9`.
**POI (Point of Interest)** — a queued candidate for zoom-in inspection (footpath start, branch pile, tree row, movement candidate, etc.). source: `architecture.md §7.1`, `data_model.md §POI`.
**POI queue** — operator-review queue inside `scan_controller`; ordered by `confidence × proximity × age_factor`; hard cap of **≤5 POIs/min** to bound operator workload. source: `architecture.md §5`, `components/scan_controller/description.md`.
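The ordering key and the rate cap can be sketched separately. The sketch below rests on assumptions: `proximity` and `age_factor` are treated as already-normalised scores whose definitions are not given here, and the cap is modelled as a sliding 60 s window over surfaced POIs. The names `priority` and `SurfaceCap` are hypothetical.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Ordering key: higher means the POI is reviewed sooner.
pub fn priority(confidence: f64, proximity: f64, age_factor: f64) -> f64 {
    confidence * proximity * age_factor
}

/// Sliding-window limiter: at most `max` POIs per `window`.
pub struct SurfaceCap {
    window: Duration,
    max: usize,
    recent: VecDeque<Instant>,
}

impl SurfaceCap {
    pub fn new(max: usize, window: Duration) -> Self {
        Self { window, max, recent: VecDeque::new() }
    }

    /// Returns true and records `now` if another POI may be surfaced.
    pub fn try_surface(&mut self, now: Instant) -> bool {
        // Forget surfacing events that have left the window.
        while let Some(&oldest) = self.recent.front() {
            if now.duration_since(oldest) >= self.window {
                self.recent.pop_front();
            } else {
                break;
            }
        }
        if self.recent.len() < self.max {
            self.recent.push_back(now);
            true
        } else {
            false
        }
    }
}
```

With `SurfaceCap::new(5, Duration::from_secs(60))`, a sixth POI inside the same minute is refused and becomes eligible again once the oldest surfaced POI is 60 s old.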
**RTL (Return-to-Launch)** — MAVLink-driven return to the configured rally point; triggered by INCLUSION / EXCLUSION violation, by max-retry exhaustion in `mission_executor`, or by failsafe in the autopilot itself. source: `architecture.md §7.7`, `components/mission_executor/description.md`.
**scan_controller** — autopilot component: central deterministic typed state machine — `ZoomedOut`, `ZoomedIn`, `TargetFollow`. Owns POI queue, timeouts, gimbal commands, ≤5 POIs/min cap. source: `architecture.md §7.6 Scan controller and POI queue`, `components/scan_controller/description.md` (full BT spec in `system-flows.md §F4`).
**semantic_analyzer** — autopilot component (Tier 2): primitive graph + lightweight ROI CNN, reasoning over paths, branch piles, dark entrances, etc. source: `components/semantic_analyzer/description.md`.
**SITL conformance gate** — CI stage in which autopilot runs against ArduPilot SITL with a mocked `../detections` and a fixture mission, asserting the MAVLink command surface and geofence enforcement. source: `deployment/ci_cd_pipeline.md §5`.
**Sweep** — zoom-out camera motion: gimbal swings left-right across the UAV's flight path while the UAV flies the mission. Exact pattern (pendulum / raster / lawn-mower), FOV per zoom tier, dwell time, and mission-segment alignment are unresolved (`architecture.md §8 Q1`). source: `architecture.md §7.1`, `components/gimbal_controller/description.md`.
**Target-follow mode** — gimbal keeps an operator-confirmed target centred (within the centre 25 % of frame) while the UAV continues to move. Ends on operator release or tracking loss. State-machine variant: `TargetFollow { target_id, started_at }`. source: `architecture.md §7.1`, `components/scan_controller/description.md`.
**telemetry_stream** — autopilot component: continuous (always-on) push of camera frames + flight telemetry + bbox overlay over modem to the Ground Station API. Operator always sees live feed, not just on detection. Carries operator commands on the return path. source: `components/telemetry_stream/description.md`.
**Tier 1 detection** — primitive YOLO over the full frame; delegated to `../detections`. source: `architecture.md §7.6 Tier 1 primitive detector`.
**Tier 2 semantic** — primitive-graph + lightweight ROI CNN reasoning over zoom-in crops; lives in autopilot. source: `architecture.md §7.6 Tier 2 semantic analyzer`.
**Tier 3 / VLM (Vision Language Model)** — NanoLLM running VILA1.5-3B in a separate local process, invoked only for bounded zoom-in ROI confirmation. Local IPC over Unix domain socket with peer-credential check. No cloud egress. Optional. source: `architecture.md §7.6 Local VLM confirmation`, `components/vlm_client/description.md`, `system-flows.md §F3`.
**vlm_client** — autopilot component: optional local-IPC client to a NanoLLM/VILA1.5-3B process; validates ROI payload, calls VLM, validates response against the `VlmAssessment` schema. source: `components/vlm_client/description.md`.
**VlmAssessment** — structured-schema output from the VLM. The free-form VLM text is not a downstream API contract. source: `architecture.md §5 Architectural Principles`, `data_model.md §VlmAssessment`.
**ViewPro A40** — deployment gimbal hardware. NFR budget: zoom transition ≤2 s, decision-to-movement ≤500 ms. source: `architecture.md §7.3`, `components/gimbal_controller/description.md`.
**Waypoint** — mission node coordinate (lat, lon, alt). Pulled from the `missions` API by `mission_client`; the operator-confirm flow may insert a **middle waypoint** to detour toward a confirmed target. source: `architecture.md §7.7`, `data_model.md §MissionWaypoint`.
**BIT (Built-In Self Test)** — pre-flight gate run by `mission_executor`. Covers GPS lock, camera RTSP, gimbal homing, `../detections`, VLM warmup (if enabled), mission load, MapObjects pre-flight pull, persistent-store free space, wall-clock binding, MAVLink + airframe health. Items return `OK | DEGRADED | FAIL`. DEGRADED requires signed operator acknowledgement. FAIL blocks transition past `HEALTH_OK`. source: `architecture.md §5 / §7.3`, `system-flows.md §F9`.
**Lost-link failsafe ladder** — typed ladder evaluated each tick by `mission_executor` against the operator/Ground-Station modem heartbeat: `LinkOk` (≤5 s) → `LinkDegraded` (≤30 s, queue events, health yellow) → `LinkLost` (>30 s, no follow) → `LinkLostInFollow` (>30 s in target-follow, +30 s grace). Default action on lost link is RTL. MAVLink-link loss to ArduPilot itself is a separate, more severe event. source: `architecture.md §5 / §7.7`, `system-flows.md §F10`.
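The ladder can be sketched as a pure function of the time since the last modem heartbeat. The thresholds come from the entry above; treating the "+30 s grace" as delaying RTL until 60 s of silence while in target-follow is an assumption, as are the function names `classify_link` and `rtl_due`.

```rust
use std::time::Duration;

const OK_MAX: Duration = Duration::from_secs(5);
const DEGRADED_MAX: Duration = Duration::from_secs(30);
const FOLLOW_GRACE: Duration = Duration::from_secs(30);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LinkState {
    LinkOk,
    LinkDegraded,
    LinkLost,
    LinkLostInFollow,
}

/// Evaluate the ladder for one tick.
pub fn classify_link(since_heartbeat: Duration, in_target_follow: bool) -> LinkState {
    if since_heartbeat <= OK_MAX {
        LinkState::LinkOk
    } else if since_heartbeat <= DEGRADED_MAX {
        LinkState::LinkDegraded
    } else if in_target_follow {
        LinkState::LinkLostInFollow
    } else {
        LinkState::LinkLost
    }
}

/// Whether the default lost-link action (RTL) is due on this tick.
/// Assumption: target-follow earns an extra grace period first.
pub fn rtl_due(since_heartbeat: Duration, in_target_follow: bool) -> bool {
    match classify_link(since_heartbeat, in_target_follow) {
        LinkState::LinkLost => true,
        LinkState::LinkLostInFollow => since_heartbeat > DEGRADED_MAX + FOLLOW_GRACE,
        LinkState::LinkOk | LinkState::LinkDegraded => false,
    }
}
```

Because the function takes an elapsed duration, it can be driven from a monotonic clock and exercised in tests without any real waiting.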
**MapObjects sync** — pre-flight pull + post-flight push of MapObjects + IgnoredItems against the central `missions` API extension `/missions/{id}/mapobjects`. In-flight is batched only (no streaming over modem). On-device store is a working copy; central store is the source of truth across missions. source: `architecture.md §5 / §7.13`, `system-flows.md §F8`, `components/mapobjects_store/description.md`, `components/mission_client/description.md`.
**Per-zoom-band thresholds** — `movement_detector` configuration is split between zoom-out and zoom-in because the pixel-to-metre ratio differs by ~10×. Cluster persistence threshold, residual-velocity floor, telemetry-skew tolerance, and enqueue-latency budget are all per-band. source: `architecture.md §7.6 Movement detector`, `components/movement_detector/description.md §5`.
**Operator-command authentication** — every operator command (confirm / decline / target-follow / safety-override / BIT-degraded-acknowledge) carries a session-bound signature with replay protection, validated by `operator_bridge` before dispatch. The principle is committed; the exact scheme is open per Q9. source: `architecture.md §5 / §8 Q9`, `components/operator_bridge/description.md`.
**Sync state (`mapobjects_store`)** — `synced | cached_fallback | degraded`. `synced` after a fresh successful pull or a successful post-flight push. `cached_fallback` when pre-flight pull failed and the operator acknowledged continuing on cache. `degraded` after a persistent push failure or a stale cache. `scan_controller` suppresses MapObject diff classifications while `degraded` to avoid corrupting the central observation log. source: `components/mapobjects_store/description.md §5`.
**Wall-clock source** — autopilot binds wall-clock to GPS time once GPS is locked (preferred) or to NTP at boot if reachable. Drift > 200 ms surfaces health → yellow. Monotonic clock (independent of wall-clock) is authoritative for telemetry-skew compensation and tick budgets. source: `architecture.md §7.3 Reliability and safety`.
+357
@@ -0,0 +1,357 @@
# Module Layout
**Language**: rust
**Layout Convention**: crates-workspace
**Root**: `crates/`
**Last Updated**: 2026-05-19
## Layout Rules
1. Each component owns ONE top-level directory under `crates/`. The directory name matches the component name in `_docs/02_document/components/`.
2. Shared code lives in a single `crates/shared/` crate. Cross-cutting concerns are modules inside it (`shared/models/`, `shared/config/`, `shared/error/`, `shared/health/`, `shared/observability/`, `shared/clock/`, `shared/contracts/`). Other crates re-export from `shared::`; they MUST NOT duplicate types.
3. Public API surface per component = the files in `Public API` below. Everything under `src/internal/` (and any other module not listed in `Public API`) is internal and other crates MUST NOT use it.
4. Tests live in each crate's own `tests/` directory (Rust convention). Workspace-level end-to-end tests live at `tests/e2e/` (the workspace root, not under any crate).
5. **Stream-based wiring**: tokio channels carrying shared data types are passed into actor constructors by the composition root (`crates/autopilot`). This keeps Layer 2 actors free of sibling imports — they receive `Receiver<Frame>`, `Receiver<GimbalState>`, etc. from `shared::models` without importing the crate that produces them.
6. **Sink traits in shared**: where one component must push into another's transport (e.g. `operator_bridge` pushes POIs through `telemetry_stream`), the receiving side implements a trait defined in `shared::contracts` (`TelemetrySink`, `MavlinkSink`, etc.). The producing side depends only on the trait, not on the receiving crate.
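Rules 5 and 6 can be illustrated together in one small sketch. To stay dependency-free it uses `std::sync::mpsc` as a stand-in for tokio channels, and every type in it (`Frame` with a single `seq` field, `FrameCounter`, `Bridge`, `RecordingSink`, the `push_poi` method) is a simplified, hypothetical placeholder rather than the real shared-crate definition.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::sync::{Arc, Mutex};

// Stand-in for a type that would live in `shared::models`.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub seq: u64,
}

// Stand-in for a trait that would live in `shared::contracts`.
pub trait TelemetrySink: Send + Sync {
    fn push_poi(&self, poi_id: u64);
}

/// Rule 5: a consumer actor receives a `Receiver<Frame>` in its
/// constructor and never imports the crate that produces frames.
pub struct FrameCounter {
    frames: Receiver<Frame>,
}

impl FrameCounter {
    pub fn new(frames: Receiver<Frame>) -> Self {
        Self { frames }
    }
    /// Count the frames currently waiting in the channel.
    pub fn drain(&self) -> usize {
        self.frames.try_iter().count()
    }
}

/// Rule 6: the producing side depends only on the trait object.
pub struct Bridge {
    sink: Arc<dyn TelemetrySink>,
}

impl Bridge {
    pub fn new(sink: Arc<dyn TelemetrySink>) -> Self {
        Self { sink }
    }
    pub fn surface(&self, poi_id: u64) {
        self.sink.push_poi(poi_id);
    }
}

/// The receiving side implements the trait; here it only records calls.
#[derive(Default)]
pub struct RecordingSink {
    pub seen: Mutex<Vec<u64>>,
}

impl TelemetrySink for RecordingSink {
    fn push_poi(&self, poi_id: u64) {
        self.seen.lock().unwrap().push(poi_id);
    }
}

/// What the composition root does: build channels, hand out the ends.
pub fn composition_root_demo() -> (usize, Vec<u64>) {
    let (tx, rx) = channel();
    let counter = FrameCounter::new(rx);
    let sink = Arc::new(RecordingSink::default());
    let bridge = Bridge::new(sink.clone());

    tx.send(Frame { seq: 1 }).unwrap();
    tx.send(Frame { seq: 2 }).unwrap();
    bridge.surface(7);

    let seen = sink.seen.lock().unwrap().clone();
    (counter.drain(), seen)
}
```

Only `composition_root_demo` names both ends of each connection, which is the property the two rules are meant to preserve.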
## Per-Component Mapping
### Component: shared
- **Epic**: AZ-626 (Bootstrap & Initial Structure — shared crate lands as part of AZ-640 initial structure task)
- **Directory**: `crates/shared/`
- **Public API**:
- `crates/shared/src/lib.rs` (re-exports the submodules listed below)
- `crates/shared/src/models/mod.rs` (`Frame`, `BoundingBox`, `Detection`, `DetectionBatch`, `MovementCandidate`, `Tier2Evidence`, `VlmAssessment`, `POI`, `MapObject`, `MapObjectObservation`, `MapObjectsBundle`, `IgnoredItem`, `Coordinate`, `Geofence`, `MissionItem`, `MissionWaypoint`, `OperatorCommand`, `GimbalState`)
- `crates/shared/src/config/mod.rs` (`Config`, `ConfigLoader`, per-component typed sections)
- `crates/shared/src/error.rs` (`AutopilotError`, `Result<T>`)
- `crates/shared/src/health.rs` (`ComponentHealth`, `AggregatedHealth`, `HealthLevel`)
- `crates/shared/src/observability/mod.rs` (`tracing` init, log-field constants per `observability.md §2`)
- `crates/shared/src/clock.rs` (`MonoClock`, `WallClock`, `ClockSource`)
- `crates/shared/src/contracts/mod.rs` (`TelemetrySink`, `MavlinkSink`, `VlmProvider`, `OperatorCommandSink`)
- **Internal**: none — shared has no `internal/` subtree; everything in shared is part of its public API by design.
- **Owns (exclusive write during implementation)**: `crates/shared/**`
- **Imports from**: (none — Layer 1)
- **Consumed by**: every other component crate + the `autopilot` binary
---
### Component: mavlink_layer
- **Epic**: AZ-637
- **Directory**: `crates/mavlink_layer/`
- **Public API**:
- `crates/mavlink_layer/src/lib.rs` (`MavlinkLayer`, `MavlinkHandle`, `MavlinkConnection`, public message types re-exported from `shared::models`)
- **Internal**:
- `crates/mavlink_layer/src/internal/codec/*` (MAVLink v2 encode/decode for the §7.7 surface only)
- `crates/mavlink_layer/src/internal/transport/udp.rs`
- `crates/mavlink_layer/src/internal/transport/serial.rs`
- `crates/mavlink_layer/src/internal/heartbeat.rs`
- `crates/mavlink_layer/src/internal/retry.rs`
- **Owns**: `crates/mavlink_layer/**`
- **Imports from**: `shared`
- **Consumed by**: `mission_executor`, `telemetry_stream` (via constructor-injected `Receiver<MavlinkTelemetry>` or via the `MavlinkSink` trait)
---
### Component: mission_client
- **Epic**: AZ-638
- **Directory**: `crates/mission_client/`
- **Public API**:
- `crates/mission_client/src/lib.rs` (`MissionClient`, `MissionClientHandle::pull_mission()`, `post_middle_waypoint()`, `pull_mapobjects()`, `push_mapobjects()`, `health()`)
- **Internal**:
- `crates/mission_client/src/internal/missions_api/*` (REST client + retry + auth)
- `crates/mission_client/src/internal/mapobjects_sync/*` (pre-flight GET + post-flight POST bundles)
- `crates/mission_client/src/internal/schema/*` (schema-version validation against `mission-schema`)
- **Owns**: `crates/mission_client/**`
- **Imports from**: `shared`
- **Consumed by**: `mission_executor` (for mission lifecycle), `mapobjects_store` (for hydrate/dump indirectly through `mission_executor` orchestration)
---
### Component: frame_ingest
- **Epic**: AZ-627
- **Directory**: `crates/frame_ingest/`
- **Public API**:
- `crates/frame_ingest/src/lib.rs` (`FrameIngest`, `FrameIngestHandle::subscribe() -> Receiver<Frame>`, `health()`)
- **Internal**:
- `crates/frame_ingest/src/internal/rtsp_client.rs`
- `crates/frame_ingest/src/internal/decoder.rs`
- `crates/frame_ingest/src/internal/timestamp.rs`
- **Owns**: `crates/frame_ingest/**`
- **Imports from**: `shared`
- **Consumed by**: `detection_client`, `movement_detector`, `telemetry_stream` (all via composition-root-wired `Receiver<Frame>`)
---
### Component: detection_client
- **Epic**: AZ-628
- **Directory**: `crates/detection_client/`
- **Public API**:
- `crates/detection_client/src/lib.rs` (`DetectionClient`, `DetectionClientHandle::request(Frame) -> Result<DetectionBatch>`, `health()`)
- **Internal**:
- `crates/detection_client/build.rs` (`tonic-build` for the gRPC proto)
- `crates/detection_client/proto/detections.proto` (vendored copy of `../detections` contract per `architecture.md §10`)
- `crates/detection_client/src/internal/grpc/*` (bi-directional streaming client, version handshake)
- **Owns**: `crates/detection_client/**`
- **Imports from**: `shared`
- **Consumed by**: `scan_controller` (handle for direct request), `telemetry_stream` (via constructor-injected `Receiver<DetectionBatch>` for operator overlay)
---
### Component: gimbal_controller
- **Epic**: AZ-634
- **Directory**: `crates/gimbal_controller/`
- **Public API**:
- `crates/gimbal_controller/src/lib.rs` (`GimbalController`, `GimbalControllerHandle::set_pose(...)`, `zoom(level)`, `state() -> GimbalState`, `state_stream() -> Receiver<GimbalState>`, `health()`)
- **Internal**:
- `crates/gimbal_controller/src/internal/a40_protocol/*` (ViewPro A40 UDP vendor protocol — encode, decode, CRC)
- `crates/gimbal_controller/src/internal/smooth_pan.rs` (smooth-pan path-tracking primitive)
- **Owns**: `crates/gimbal_controller/**`
- **Imports from**: `shared`
- **Consumed by**: `scan_controller` (handle), `movement_detector` (via constructor-injected `Receiver<GimbalState>`), `frame_ingest` (constructor-injected `Receiver<GimbalState>` for timestamp annotation if needed)
---
### Component: semantic_analyzer
- **Epic**: AZ-630
- **Directory**: `crates/semantic_analyzer/`
- **Public API**:
- `crates/semantic_analyzer/src/lib.rs` (`SemanticAnalyzer`, `SemanticAnalyzerHandle::analyze(roi) -> Result<Tier2Evidence>`, `health()`)
- **Internal**:
- `crates/semantic_analyzer/src/internal/primitive_graph/*` (path, branch-pile, entrance, road graph reasoner)
- `crates/semantic_analyzer/src/internal/roi_cnn.rs` (TensorRT ROI CNN wrapper)
- `crates/semantic_analyzer/src/internal/scoring/*` (path-freshness, endpoint, concealment)
- **Owns**: `crates/semantic_analyzer/**`
- **Imports from**: `shared`
- **Consumed by**: `scan_controller` (handle)
---
### Component: vlm_client
- **Epic**: AZ-631
- **Directory**: `crates/vlm_client/`
- **Public API**:
- `crates/vlm_client/src/lib.rs` (`VlmClient` implementing `shared::contracts::VlmProvider`; `VlmClient::with_default()` returns the no-op impl returning `VlmAssessment { status: vlm_disabled }`; real impl is gated behind `feature = "vlm"`)
- **Internal**:
- `crates/vlm_client/src/internal/uds_client.rs` (Unix-domain socket IPC + peer-credential check)
- `crates/vlm_client/src/internal/schema_validate.rs` (`VlmAssessment` schema validation)
- `crates/vlm_client/src/internal/prompt.rs` (bounded prompt + payload size enforcement)
- **Owns**: `crates/vlm_client/**`
- **Imports from**: `shared`
- **Consumed by**: `scan_controller` (via the `shared::contracts::VlmProvider` trait — never directly)
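The no-op default behind the `VlmProvider` trait might look roughly like this. It is a sketch under assumptions: the trait method `assess`, the `VlmStatus` enum, the `NoopVlm` type, and the `default_provider` helper are hypothetical, and `VlmAssessment` is reduced to a single field.

```rust
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VlmStatus {
    /// VLM is compiled out or switched off at runtime.
    VlmDisabled,
    /// A real assessment was produced.
    Assessed,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct VlmAssessment {
    pub status: VlmStatus,
}

/// Stand-in for the trait that would live in `shared::contracts`.
pub trait VlmProvider: Send + Sync {
    fn assess(&self, roi_jpeg: &[u8]) -> VlmAssessment;
}

/// Default provider: never calls out, always reports "disabled".
pub struct NoopVlm;

impl VlmProvider for NoopVlm {
    fn assess(&self, _roi_jpeg: &[u8]) -> VlmAssessment {
        VlmAssessment { status: VlmStatus::VlmDisabled }
    }
}

/// Placeholder for the real client; compiled only with `--features vlm`.
#[cfg(feature = "vlm")]
pub struct UdsVlm;

/// What a consumer holds: a trait object, never the concrete client.
pub fn default_provider() -> Arc<dyn VlmProvider> {
    Arc::new(NoopVlm)
}
```

Because the consumer only sees `Arc<dyn VlmProvider>`, a build without the `vlm` feature still compiles and behaves the same apart from the `VlmDisabled` status.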
---
### Component: mapobjects_store
- **Epic**: AZ-633
- **Directory**: `crates/mapobjects_store/`
- **Public API**:
- `crates/mapobjects_store/src/lib.rs` (`MapObjectsStore`, `MapObjectsStoreHandle::classify(Detection) -> Classification`, `apply_decline(POI)`, `dump_pending() -> MapObjectsBundle`, `hydrate(MapObjectsBundle)`, `set_sync_state(SyncState)`, `health()`)
- **Internal**:
- `crates/mapobjects_store/src/internal/h3_index/*` (`h3rs` wrapper + k-ring queries)
- `crates/mapobjects_store/src/internal/engine/mod.rs` (`StorageEngine` trait — pluggable for Q3)
- `crates/mapobjects_store/src/internal/engine/in_memory_snapshot.rs` (default impl: in-memory + JSON snapshot on flush)
- `crates/mapobjects_store/src/internal/diff.rs` (NEW / MOVED / EXISTING / REMOVED-CANDIDATE classification)
- `crates/mapobjects_store/src/internal/ignored.rs`
- **Owns**: `crates/mapobjects_store/**`
- **Imports from**: `shared`
- **Consumed by**: `scan_controller`, `operator_bridge`, `mission_executor` (for hydrate at pre-flight + dump_pending at post-flight)
---
### Component: movement_detector
- **Epic**: AZ-629
- **Directory**: `crates/movement_detector/`
- **Public API**:
- `crates/movement_detector/src/lib.rs` (`MovementDetector`, `MovementDetectorHandle::candidates() -> Receiver<MovementCandidate>`, `health()`; constructor takes `Receiver<Frame>`, `Receiver<GimbalState>`, `Receiver<MavlinkTelemetry>`)
- **Internal**:
- `crates/movement_detector/src/internal/ego_motion.rs` (homography-based ego-motion estimate)
- `crates/movement_detector/src/internal/optical_flow/*` (classical CV path)
- `crates/movement_detector/src/internal/learned_cv/*` (fallback per Q14 — behind `feature = "learned_cv"`)
- `crates/movement_detector/src/internal/zoom_bands.rs` (per-zoom-band threshold tables)
- `crates/movement_detector/src/internal/telemetry_sync.rs` (frame ↔ gimbal ↔ UAV skew gate)
- **Owns**: `crates/movement_detector/**`
- **Imports from**: `shared`
- **Consumed by**: `scan_controller` (consumes the `MovementCandidate` stream)
---
### Component: telemetry_stream
- **Epic**: AZ-639
- **Directory**: `crates/telemetry_stream/`
- **Public API**:
- `crates/telemetry_stream/src/lib.rs` (`TelemetryStream` implementing `shared::contracts::TelemetrySink`; `TelemetryStreamHandle::commands() -> Receiver<OperatorCommand>`, `health()`; constructor takes `Receiver<Frame>`, `Receiver<DetectionBatch>`, `Receiver<MavlinkTelemetry>`, `Receiver<BboxOverlay>`)
- **Internal**:
- `crates/telemetry_stream/src/internal/uplink/*` (modem push: protocol per `../_docs/04_system_design_clarifications.md` — Q2)
- `crates/telemetry_stream/src/internal/downlink/*` (operator-command receive path)
- `crates/telemetry_stream/src/internal/encode/*` (frame + telemetry + bbox-overlay serialisation)
- **Owns**: `crates/telemetry_stream/**`
- **Imports from**: `shared`
- **Consumed by**: `operator_bridge` (via the `TelemetrySink` trait in `shared::contracts`; commands consumed via constructor-injected `Receiver<OperatorCommand>`)
---
### Component: operator_bridge
- **Epic**: AZ-635
- **Directory**: `crates/operator_bridge/`
- **Public API**:
- `crates/operator_bridge/src/lib.rs` (`OperatorBridge`, `OperatorBridgeHandle::surface_poi(POI) -> OperatorDecision`, `middle_waypoint_hints() -> Receiver<MiddleWaypointHint>`, `target_follow_events() -> Receiver<TargetFollowEvent>`, `health()`; constructor takes `Arc<dyn TelemetrySink>` and `Receiver<OperatorCommand>`)
- **Internal**:
- `crates/operator_bridge/src/internal/auth/*` (`OperatorCommand` envelope validation — signature, replay protection, session validation; scheme stubbed pending Q9)
- `crates/operator_bridge/src/internal/audit_log.rs` (persistent audit log writer for `/var/lib/autopilot/audit/`)
- `crates/operator_bridge/src/internal/decision_window.rs` (confidence-scaled timeout: 40 % → 30 s, 100 % → 120 s linear)
- **Owns**: `crates/operator_bridge/**`
- **Imports from**: `shared`, `mapobjects_store`
- **Consumed by**: `scan_controller` (for `surface_poi`), `mission_executor` (consumes `middle_waypoint_hints` stream)
---
### Component: mission_executor
- **Epic**: AZ-636
- **Directory**: `crates/mission_executor/`
- **Public API**:
- `crates/mission_executor/src/lib.rs` (`MissionExecutor`, `MissionExecutorHandle::start(Mission)`, `insert_middle_waypoint(Coordinate)`, `failsafe_trigger(FailsafeKind)`, `state() -> ExecutorState`, `health()`; constructor takes `Receiver<MiddleWaypointHint>` from operator_bridge)
- **Internal**:
- `crates/mission_executor/src/internal/multirotor/fsm.rs` (DISCONNECTED → … → LAND)
- `crates/mission_executor/src/internal/fixed_wing/fsm.rs` (DISCONNECTED → … → WAIT_AUTO → LAND)
- `crates/mission_executor/src/internal/geofence/*` (INCLUSION + EXCLUSION enforcement)
- `crates/mission_executor/src/internal/failsafe/ladder.rs` (lost-link `LinkOk → LinkDegraded → LinkLost → LinkLostInFollow`)
- `crates/mission_executor/src/internal/battery_thresholds.rs` (RTL floor, hard floor)
- `crates/mission_executor/src/internal/bit.rs` (pre-flight built-in self-test; orchestrates pre-flight `mapobjects_store.hydrate(mission_client.pull_mapobjects(...))`)
- `crates/mission_executor/src/internal/middle_waypoint.rs` (re-upload sequence on operator confirm)
- `crates/mission_executor/src/internal/post_flight.rs` (orchestrates post-flight `mission_client.push_mapobjects(mapobjects_store.dump_pending())`)
- **Owns**: `crates/mission_executor/**`
- **Imports from**: `shared`, `mavlink_layer`, `mission_client`, `mapobjects_store`
- **Consumed by**: `scan_controller` (for `failsafe_trigger` and `insert_middle_waypoint`)
---
### Component: scan_controller
- **Epic**: AZ-632
- **Directory**: `crates/scan_controller/`
- **Public API**:
- `crates/scan_controller/src/lib.rs` (`ScanController`, `ScanControllerHandle::tick()`, `submit_operator_cmd(OperatorCommand)`, `state() -> ScanState`, `health()`; constructor takes `Receiver<DetectionBatch>`, `Receiver<MovementCandidate>`, `Receiver<Frame>` plus handles for `mapobjects_store`, `gimbal_controller`, `mission_executor`, `semantic_analyzer`, `operator_bridge`, and `Arc<dyn VlmProvider>`)
- **Internal**:
- `crates/scan_controller/src/internal/state_machine/mod.rs` (`ZoomedOut`, `ZoomedIn { roi, hold_started_at }`, `TargetFollow { target_id, started_at }`)
- `crates/scan_controller/src/internal/state_machine/transitions.rs`
- `crates/scan_controller/src/internal/poi_queue/*` (priority queue + `≤5 POIs/min` cap + confidence × proximity × age ordering)
- `crates/scan_controller/src/internal/behaviour_tree/*` (per `system-flows.md §F4`)
- `crates/scan_controller/src/internal/timeouts.rs` (operator-decision window, POI timeouts, VLM waits)
- `crates/scan_controller/src/internal/frame_rate_guard.rs` (suppress zoom-in transitions while the frame rate is below the 10 fps floor; surface yellow health)
- **Owns**: `crates/scan_controller/**`
- **Imports from**: `shared`, `mapobjects_store`, `gimbal_controller`, `mission_executor`, `semantic_analyzer`, `operator_bridge`
- **Consumed by**: `autopilot` (composition root)
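
The three scan states listed under `state_machine/mod.rs` could be modelled as a plain Rust enum. This is a minimal sketch, not the crate's actual code: the `Roi` field layout, the `u64` target id, and the `on_operator_confirm` helper are assumptions made for illustration; only the variant and field names come from the list above.

```rust
use std::time::Instant;

/// Region of interest in normalised image coordinates (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Roi {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

/// Scan states named in `state_machine/mod.rs`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScanState {
    ZoomedOut,
    ZoomedIn { roi: Roi, hold_started_at: Instant },
    TargetFollow { target_id: u64, started_at: Instant },
}

impl ScanState {
    /// Hypothetical transition: an operator confirm promotes a zoomed-in
    /// hold to target-follow; every other state is left unchanged.
    pub fn on_operator_confirm(self, target_id: u64, now: Instant) -> ScanState {
        match self {
            ScanState::ZoomedIn { .. } => ScanState::TargetFollow {
                target_id,
                started_at: now,
            },
            other => other,
        }
    }
}
```

Keeping the per-state data inside the variants means a `hold_started_at` can only exist while the controller is actually in a zoomed-in hold.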
---
### Component: autopilot (binary, composition root)
- **Epic**: AZ-626 (Bootstrap & Initial Structure — the binary scaffold is part of AZ-640)
- **Directory**: `crates/autopilot/`
- **Public API**: this is a `[[bin]]` crate — it exposes no library API.
- **Internal**:
- `crates/autopilot/src/main.rs` (CLI parse, config load, `tracing` init, build component instances, run)
- `crates/autopilot/src/runtime.rs` (build channels, wire actors, owns the `Vec<JoinHandle>`, shutdown orchestration)
- `crates/autopilot/src/health_server.rs` (HTTP `/health` endpoint per `containerization.md §7`)
- `crates/autopilot/src/bit_runner.rs` (invokes `mission_executor.bit()` and gates startup)
- **Owns**: `crates/autopilot/**`
- **Imports from**: `shared` + every Layer 2 actor crate + every Layer 3 coordinator + `scan_controller`
- **Consumed by**: nothing — this is the binary
---
## Shared / Cross-Cutting
All cross-cutting concerns live as modules inside the single `crates/shared/` crate (Rust convention prefers a single shared crate over many tiny ones; the module boundaries inside `shared::` enforce conceptual separation).
### shared::models
- **Path**: `crates/shared/src/models/`
- **Purpose**: the canonical entity catalogue from `_docs/02_document/data_model.md`. One submodule per entity grouping (`frame.rs`, `detection.rs`, `movement.rs`, `tier2.rs`, `vlm.rs`, `poi.rs`, `mapobject.rs`, `mission.rs`, `operator.rs`, `gimbal.rs`).
- **Owned by**: AZ-640 initial structure task (under epic AZ-626).
- **Consumed by**: every component crate + the `autopilot` binary.
### shared::config
- **Path**: `crates/shared/src/config/`
- **Purpose**: TOML loader (per `containerization.md §6`), typed per-component sections, environment-variable overlay, secrets resolution (via path to `EnvironmentFile=`).
- **Owned by**: AZ-640 initial structure task.
- **Consumed by**: every component crate.
### shared::error
- **Path**: `crates/shared/src/error.rs`
- **Purpose**: `AutopilotError` enum + `Result<T> = std::result::Result<T, AutopilotError>` alias.
- **Owned by**: AZ-640 initial structure task.
- **Consumed by**: every crate.
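
A rough sketch of the `AutopilotError` enum plus `Result<T>` alias pattern, using only the standard library. The variants (`Config`, `Io`) and the `parse_tick_hz` helper are hypothetical; the source names only the enum and the alias.

```rust
pub mod error {
    use std::fmt;

    /// Workspace-wide error type (variants are illustrative assumptions).
    #[derive(Debug)]
    pub enum AutopilotError {
        Config(String),
        Io(std::io::Error),
    }

    impl fmt::Display for AutopilotError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                AutopilotError::Config(msg) => write!(f, "config error: {msg}"),
                AutopilotError::Io(err) => write!(f, "io error: {err}"),
            }
        }
    }

    impl std::error::Error for AutopilotError {}

    impl From<std::io::Error> for AutopilotError {
        fn from(err: std::io::Error) -> Self {
            AutopilotError::Io(err)
        }
    }

    /// Alias so component crates can write `Result<T>` instead of the full type.
    pub type Result<T> = std::result::Result<T, AutopilotError>;

    /// Hypothetical helper showing the alias in a signature.
    pub fn parse_tick_hz(raw: &str) -> Result<u32> {
        raw.trim()
            .parse::<u32>()
            .map_err(|e| AutopilotError::Config(format!("tick_hz: {e}")))
    }
}
```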
### shared::health
- **Path**: `crates/shared/src/health.rs`
- **Purpose**: `ComponentHealth`, `HealthLevel { Green, Yellow, Red, Disabled }`, `AggregatedHealth` — each component exposes its own `health() -> ComponentHealth`; `autopilot::health_server` aggregates per `containerization.md §7`.
- **Owned by**: AZ-640 initial structure task.
- **Consumed by**: every component + the binary's health server.
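
One way the aggregation could work is a worst-of fold over the per-component levels. The sketch below assumes that rule (with `Disabled` ranked lowest so it never masks a real fault); the authoritative rule is whatever `containerization.md §7` specifies, and the struct fields shown are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HealthLevel {
    Green,
    Yellow,
    Red,
    Disabled,
}

#[derive(Debug, Clone)]
pub struct ComponentHealth {
    pub component: &'static str,
    pub level: HealthLevel,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AggregatedHealth {
    pub level: HealthLevel,
}

/// Assumed ranking: Disabled < Green < Yellow < Red.
fn severity(level: HealthLevel) -> u8 {
    match level {
        HealthLevel::Disabled => 0,
        HealthLevel::Green => 1,
        HealthLevel::Yellow => 2,
        HealthLevel::Red => 3,
    }
}

/// Worst-of aggregation; an empty input is reported as Disabled.
pub fn aggregate(parts: &[ComponentHealth]) -> AggregatedHealth {
    let level = parts
        .iter()
        .map(|p| p.level)
        .max_by_key(|l| severity(*l))
        .unwrap_or(HealthLevel::Disabled);
    AggregatedHealth { level }
}
```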
### shared::observability
- **Path**: `crates/shared/src/observability/`
- **Purpose**: `tracing-subscriber` init (JSON to stdout); log-field constants for the §2 fields in `observability.md`; span helpers for frame trace + POI trace.
- **Owned by**: AZ-640 initial structure task.
- **Consumed by**: every component (for spans and counters).
### shared::clock
- **Path**: `crates/shared/src/clock.rs`
- **Purpose**: `MonoClock` (monotonic, authoritative for telemetry-skew compensation and tick budgets), `WallClock` (bound to GPS time once locked, NTP at boot), `ClockSource { Gnss, Host, Coast }`. Drift > 200 ms → yellow health.
- **Owned by**: AZ-640 initial structure task.
- **Consumed by**: every component that timestamps anything (frame_ingest, movement_detector, scan_controller, operator_bridge audit log, mapobjects_store).
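
The "drift > 200 ms → yellow health" rule can be illustrated as a comparison of elapsed wall-clock time against elapsed monotonic time over the same interval. How drift is actually measured is not stated in the source, so this measurement method and the function name are assumptions.

```rust
use std::time::Duration;

/// Threshold taken from the `shared::clock` description.
pub const DRIFT_YELLOW_THRESHOLD: Duration = Duration::from_millis(200);

/// Returns true when the wall clock and the monotonic clock disagree by
/// more than the threshold over the same measurement interval.
pub fn drift_exceeds_threshold(wall_elapsed: Duration, mono_elapsed: Duration) -> bool {
    let drift = if wall_elapsed > mono_elapsed {
        wall_elapsed - mono_elapsed
    } else {
        mono_elapsed - wall_elapsed
    };
    drift > DRIFT_YELLOW_THRESHOLD
}
```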
### shared::contracts
- **Path**: `crates/shared/src/contracts/`
- **Purpose**: trait definitions for cross-component coupling that we want to keep import-free:
- `TelemetrySink`: `push_frame`, `push_telemetry`, `push_overlay` (impl: `telemetry_stream`)
- `MavlinkSink`: `send` (impl: `mavlink_layer`; lets `mission_executor` depend on a trait rather than the concrete crate when convenient)
- `VlmProvider`: `assess(roi) -> VlmAssessment` (impl: `vlm_client`; default no-op impl returns `vlm_disabled`)
- `OperatorCommandSink`: `dispatch(OperatorCommand)` (lets the composition root forward decoded commands from `telemetry_stream` to `operator_bridge` without coupling them)
- **Owned by**: AZ-640 initial structure task.
- **Consumed by**: `operator_bridge` (TelemetrySink), `scan_controller` (VlmProvider), `mission_executor` (may use MavlinkSink), `telemetry_stream` + `vlm_client` (implement the traits).
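
As an illustration of the contract pattern, the `VlmProvider` trait and its no-op default might look roughly like this. The `Roi` and `VlmAssessment` shapes, the `NoopVlm` name, and the `provider_or_noop` helper are assumptions; the source states only the `assess(roi) -> VlmAssessment` signature and that the default returns `vlm_disabled`.

```rust
use std::sync::Arc;

/// Region of interest handed to the VLM (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Roi {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

/// Assessment result; `VlmDisabled` stands in for the `vlm_disabled` value.
#[derive(Debug, Clone, PartialEq)]
pub enum VlmAssessment {
    VlmDisabled,
    Assessed { label: String, confidence: f32 },
}

/// Contract that `scan_controller` depends on instead of the `vlm_client` crate.
pub trait VlmProvider: Send + Sync {
    fn assess(&self, roi: &Roi) -> VlmAssessment;
}

/// Default no-op provider used when no VLM is configured.
pub struct NoopVlm;

impl VlmProvider for NoopVlm {
    fn assess(&self, _roi: &Roi) -> VlmAssessment {
        VlmAssessment::VlmDisabled
    }
}

/// Hypothetical composition-root helper: fall back to the no-op provider.
pub fn provider_or_noop(configured: Option<Arc<dyn VlmProvider>>) -> Arc<dyn VlmProvider> {
    match configured {
        Some(provider) => provider,
        None => Arc::new(NoopVlm),
    }
}
```

Because the consumer holds an `Arc<dyn VlmProvider>`, swapping the real client for the no-op needs no change in `scan_controller` itself.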
## Allowed Dependencies (layering)
Read top-to-bottom; an upper layer may import from a lower layer but NEVER the reverse. Same-layer imports are explicitly listed in each component's `Imports from`.
| Layer | Components | May import from |
|---|---|---|
| 5. Composition | `autopilot` (binary) | 1, 2, 3, 4 |
| 4. Brain | `scan_controller` | 1, 2, 3 |
| 3. Coordinators | `operator_bridge`, `mission_executor` | 1, 2 |
| 2. Actors / Transports / Storage | `mavlink_layer`, `mission_client`, `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `vlm_client`, `mapobjects_store`, `gimbal_controller`, `telemetry_stream` | 1 |
| 1. Shared / Foundation | `shared/*` | (none) |
Violations of this table are **Architecture** findings in code-review and are **High** severity. Specifically:
- A Layer 2 actor MAY NOT import a sibling Layer 2 actor. Stream dependencies (e.g. `movement_detector` consuming `Frame`) are wired via constructor-injected channels by the composition root; sink dependencies (e.g. `operator_bridge` pushing into `telemetry_stream`) are bridged via a trait in `shared::contracts`.
- A Layer 3 coordinator MAY import any Layer 2 actor whose handle it directly calls. `operator_bridge` imports `mapobjects_store` for `apply_decline`. `mission_executor` imports `mavlink_layer`, `mission_client`, and `mapobjects_store`.
- A Layer 3 coordinator MAY NOT import another Layer 3 coordinator. `mission_executor` consumes `MiddleWaypointHint` from `operator_bridge` via a constructor-injected `Receiver<MiddleWaypointHint>` wired by the composition root.
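
The layering table and the three rules above lend themselves to a mechanical check. The following is a sketch of such a check, assuming the strict reading that an import is legal only when the target sits in a lower layer; the function names are hypothetical and no such lint is described in the source.

```rust
/// Layer number for each component, transcribed from the layering table.
pub fn layer_of(component: &str) -> Option<u8> {
    match component {
        "autopilot" => Some(5),
        "scan_controller" => Some(4),
        "operator_bridge" | "mission_executor" => Some(3),
        "mavlink_layer" | "mission_client" | "frame_ingest" | "detection_client"
        | "movement_detector" | "semantic_analyzer" | "vlm_client"
        | "mapobjects_store" | "gimbal_controller" | "telemetry_stream" => Some(2),
        "shared" => Some(1),
        _ => None,
    }
}

/// An import is allowed only from a higher layer into a strictly lower one.
/// Unknown crate names are rejected rather than guessed.
pub fn import_allowed(from: &str, to: &str) -> bool {
    match (layer_of(from), layer_of(to)) {
        (Some(from_layer), Some(to_layer)) => to_layer < from_layer,
        _ => false,
    }
}
```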
## Layout Conventions (reference)
| Language | Root | Per-component path | Public API file | Test path |
|---|---|---|---|---|
| Rust | `crates/` | `crates/<component>/` | `crates/<component>/src/lib.rs` | `crates/<component>/tests/` (crate-level) + `tests/e2e/` (workspace-level) |
## Self-verification
- [x] Every component in `_docs/02_document/components/` has a Per-Component Mapping entry (13 components + `shared` + `autopilot` binary).
- [x] Every shared / cross-cutting concern has a Shared section entry (`models`, `config`, `error`, `health`, `observability`, `clock`, `contracts`).
- [x] Layering table covers every component, with `shared` at the bottom and `autopilot` binary at the top.
- [x] No component's `Imports from` list points at a higher layer. (`scan_controller` Layer 4 → Layers 1, 2, 3; coordinators Layer 3 → Layers 1, 2; actors Layer 2 → Layer 1 only.)
- [x] Paths follow Rust's `crates/<component>/` convention.
- [x] No two components own overlapping paths — each `Owns` glob is rooted at a distinct `crates/<component>/**`.
@@ -0,0 +1,535 @@
# Blackbox Tests
Authored by `/test-spec` Phase 2 (2026-05-19). Every scenario observes the SUT only through public surfaces (RTSP, gRPC, MAVLink, REST, operator stream, gimbal UDP, VLM IPC, health endpoint, structured logs). No scenario imports internal modules or peeks at on-device state directly.
Each scenario header records:
- **Summary** — one-line behaviour validated.
- **Traces to** — AC ID(s) and any RESTRICT ID.
- **Tier** — execution tier required (U / I / B / E / HW).
- **Test status** — `READY` or `DEFERRED — <reason>` (per the override 2026-05-19 deferred scenarios are kept; release-gate items).
The `Expected result` field gives the inline pass/fail criterion; the authoritative comparison lives in `_docs/00_problem/input_data/expected_results/results_report.md` (referenced by row id).
---
## Positive Scenarios — Detection Quality (functional)
### FT-P-001: Tier-1 normalised-box contract conformance
**Summary**: Frame in → autopilot must consume and re-emit the Tier-1 detection stream conforming to the suite's normalised-box schema (class ids 0..18, coords ∈ [0,1]).
**Traces to**: AC `Detection Quality / D6`, RESTRICT `Suite-level architectural splits — Tier 1 lives in ../detections`.
**Tier**: B (mock detector) + E (live `../detections`).
**Test status**: READY.
**Preconditions**:
- SUT started; `detections-mock` serving recorded Tier-1 stream for `image-set-existing`.
- `e2e-consumer` subscribed to the SUT's outbound normalised-box stream (observable via the operator-stream channel and via the internal-test-only `/debug/detections` socket IF exposed in test build; otherwise via operator-stream only).
**Input data**: `fixtures/images/4d6e1830d211ad50.jpg` (1280 px aerial frame) → encoded into `rtsp-loopback` as a 1-second loop.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Begin RTSP playback of the frame loop | SUT consumes the frame; emits a normalised-box detection record on the operator-stream channel |
| 2 | Capture one emitted detection record | Record validates against `fixtures/schemas/expected_detections.schema.json`; every bbox coord ∈ [0,1]; class_id ∈ {0..18} |
**Expected outcome**: D6 — schema-match + range comparison passes.
**Max execution time**: 10 s.
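
The range part of the D6 check (class ids 0..18, coords ∈ [0,1]) can be sketched as a small predicate. The `Detection` struct and its field names are assumptions for illustration; the real record shape is defined by `fixtures/schemas/expected_detections.schema.json`.

```rust
/// Hypothetical normalised-box record; field names are illustrative only.
pub struct Detection {
    pub class_id: u32,
    pub x_min: f32,
    pub y_min: f32,
    pub x_max: f32,
    pub y_max: f32,
}

/// True when the class id is in 0..=18 and every coordinate lies in [0, 1].
/// NaN coordinates fail the range test and are therefore rejected.
pub fn conforms(d: &Detection) -> bool {
    let coords = [d.x_min, d.y_min, d.x_max, d.y_max];
    d.class_id <= 18 && coords.iter().all(|c| (0.0..=1.0).contains(c))
}
```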
---
### FT-P-002: Existing-class regression vs documented baseline
**Summary**: Per-class precision and recall must stay within ±2 percentage points of the pinned baseline (P=0.816, R=0.852).
**Traces to**: AC `Detection Quality — Existing-class regression / D2`.
**Tier**: E + HW (HW required for the project-level Acceptance Gate).
**Test status**: DEFERRED — expected_results baseline JSON not yet recorded (`<DEFERRED: expected_results/existing_classes_baseline.json>`). Visual fixtures (5 frames) are on disk; baseline numbers depend on a recording against the currently pinned `../detections` model.
**Preconditions**:
- Baseline JSON recorded against pinned `../detections` model (DEFERRED).
- SUT + live `../detections` running (Tier E) or HW Jetson (HW).
**Input data**: `fixtures/images/{4d6e1830...,54f6459...,6dd601b7...,805bcf1e...,f997d093...}.jpg` (5 frames).
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Stream each frame through RTSP | SUT emits detections per frame |
| 2 | Compare aggregated per-class P/R to baseline | each per-class P, R within ± 0.02 absolute of baseline |
**Expected outcome**: D2 — `numeric_tolerance` passes.
**Max execution time**: 60 s.
---
### FT-P-003: New-class precision and recall ≥80%
**Summary**: New target classes (black entrances, branch piles, footpaths, roads, trees, tree blocks) reach precision ≥0.80 AND recall ≥0.80 per class.
**Traces to**: AC `Detection Quality — New target classes / D1`.
**Tier**: E + HW.
**Test status**: DEFERRED — multi-season annotated new-class eval set not acquired; annotation campaign owned by `../ai-training` repo. `<DEFERRED: expected_results/new_classes_pr.json>`.
**Preconditions**:
- Multi-season annotated new-class eval set acquired.
- Tier-1 model updated to include the new target classes.
**Input data**: `<DEFERRED: new-class eval set across all four seasons>`.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Stream eval-set frames through RTSP | SUT emits detections including new-class items |
| 2 | Compute per-class P, R | each ≥ 0.80 |
**Expected outcome**: D1 — `threshold_min` passes for every new class.
**Max execution time**: 120 s.
---
### FT-P-004: Concealed-position recall ≥60% (initial gate)
**Summary**: System surfaces concealed positions (FPV hideouts, dugouts) with recall ≥0.60, accepting high false-positive rate as operators filter.
**Traces to**: AC `Detection Quality — Concealed-position recall / D3`.
**Tier**: E + HW.
**Test status**: DEFERRED — only 4 starter PNGs on disk; full multi-season annotated set required.
**Input data**: `fixtures/semantic/semantic0[1-4].png` (starter) + `<DEFERRED full set>`.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Stream concealed-position frames | SUT emits concealed-structure POIs |
| 2 | Compute aggregate recall against ground truth | recall ≥ 0.60 |
**Expected outcome**: D3 — `threshold_min` passes.
**Max execution time**: 120 s.
---
### FT-P-005: Concealed-position precision ≥20% (initial gate)
**Summary**: Concealed-position precision ≥0.20 (operators filter; high-FP-accepting gate).
**Traces to**: AC `Detection Quality — Concealed-position precision / D4`.
**Tier**: E + HW.
**Test status**: DEFERRED — same dataset gap as FT-P-004.
**Input data**: same as FT-P-004.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Stream concealed-position frames | SUT emits POIs |
| 2 | Compute aggregate precision against ground truth | precision ≥ 0.20 |
**Expected outcome**: D4 — `threshold_min` passes.
---
### FT-P-006: Footpath recall ≥70%
**Summary**: Footpath recall ≥0.70 across multi-season polyline-annotated eval set.
**Traces to**: AC `Detection Quality — Footpath recall / D5`.
**Tier**: E + HW.
**Test status**: DEFERRED — `<DEFERRED: footpath sequences (fresh + stale, all seasons), polyline-annotated>`.
**Input data**: `fixtures/semantic/semantic0[1-4].png` (starter; 4 frames feature footpaths leading to concealment) + `<DEFERRED full multi-season set>`.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Stream footpath-bearing frames | SUT emits footpath polyline annotations |
| 2 | Compute recall against polyline ground truth | recall ≥ 0.70 |
**Expected outcome**: D5 — `threshold_min` passes.
---
## Positive Scenarios — Movement Detection Behaviour
### FT-P-007: Ego-motion compensation rejects stable scene elements
**Summary**: With the camera platform itself moving, stable elements (tree rows, houses, roads) MUST NOT generate movement candidates; only the actual mover does.
**Traces to**: AC `Movement Detection — Stable objects MUST NOT be treated as moving / M1`, RESTRICT `Operational — moving camera platform`.
**Tier**: B (with paired CSVs) + E.
**Test status**: DEFERRED — `<DEFERRED: paired gimbal.csv + telemetry.csv for video01.mp4; scene must contain 1 stable tree row + 1 moving vehicle>`.
**Preconditions**:
- `rtsp-loopback` plays `fixtures/movement/video01.mp4` at 30 fps.
- `gimbal-mock` replays paired gimbal.csv synchronised to RTSP frame timestamps.
- `mavlink-sitl` replays paired telemetry (position + attitude) for the same duration.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Begin synchronised playback (video + gimbal + telemetry) | SUT begins consuming frames and ego-motion compensating |
| 2 | Capture every movement candidate emitted on operator-stream for the clip duration | exactly 1 candidate (the vehicle); tree-row position is NOT among candidates |
**Expected outcome**: M1 — `set_contains` passes; candidate set == {vehicle}; tree-row position ∉ candidates.
**Max execution time**: clip_duration + 10 s.
---
### FT-P-008: Movement detection continues during zoomed-in hold
**Summary**: While the camera is in a zoomed-in hold on a confirmed POI, a new mover appearing in the ROI is still detected and enqueued; current ROI is preempted only if the new candidate's priority exceeds it.
**Traces to**: AC `Movement Detection — MUST continue during the zoomed-in inspection / M2`.
**Tier**: B + E.
**Test status**: DEFERRED — `<DEFERRED zoomed-in gimbal.csv + telemetry.csv pair; 1 small mover>`.
**Input data**: `fixtures/movement/video02.mp4` + DEFERRED CSV pair.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Drive SUT into ZoomedIn hold via prior FT-P-016 setup | SUT in `ZoomedIn { roi, hold_started_at }` |
| 2 | Begin playback of the zoomed-in scene with the small mover | Movement candidate enqueued within 1.5 s (timing checked by NFT-PERF-L7) |
| 3 | Observe ROI lifecycle | ROI is preempted only if new candidate's `confidence × proximity × age_factor` exceeds the held ROI's; otherwise the held ROI completes |
**Expected outcome**: M2 — `exact` passes; 1 candidate enqueued; ROI preempt decision matches the documented priority rule.
---
### FT-P-009: Per-zoom-band threshold honoured (no false candidate at edge)
**Summary**: When a movement-cluster persists for one frame BELOW the configured per-zoom-band threshold, no candidate is emitted.
**Traces to**: AC `Movement Detection — configurable per-zoom-band false-positive budget MUST be honoured / M3`.
**Tier**: B.
**Test status**: DEFERRED — `<DEFERRED gimbal.csv simulating threshold edge>`.
**Input data**: `fixtures/movement/video03.mp4` + DEFERRED CSV.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Replay scene at the threshold edge | SUT processes frames |
| 2 | Observe candidate count over the clip duration | count == 0 |
**Expected outcome**: M3 — `exact` passes.
---
### FT-P-010: Movement zoomed-in benchmark FP-rate budget
**Summary**: Across the zoom-out + zoomed-in benchmark suite, false-positive rate per zoom band stays within the configurable per-zoom-band budget (Q14 fallback trigger).
**Traces to**: AC `Q-tagged criteria — Movement detection FP rate at zoomed-in inspection / M4` (depends on Q14).
**Tier**: B + E.
**Test status**: DEFERRED — `<DEFERRED: zoom-out + zoomed-in benchmark suite + expected_results/movement_benchmark_caps.json; Q14>`.
**Input data**: `fixtures/movement/video04.mp4` (visual ref) + DEFERRED benchmark suite.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Replay the benchmark suite end-to-end | SUT processes all frames |
| 2 | Aggregate FP candidates per zoom band | rate per band ≤ configured cap (default ≤ 0.5 / min at zoomed-in) |
**Expected outcome**: M4 — `threshold_max` passes per zoom band.
---
## Positive Scenarios — Scan & Camera Control
### FT-P-011: Sweep → zoomed-inspection transition + POI enqueue
**Summary**: A POI detected mid-sweep triggers a transition into zoomed-inspection within 2 s (timing: NFT-PERF-L8) AND the POI is enqueued correctly.
**Traces to**: AC `Scan & Camera Control — Transition from sweep to detailed inspection / S1`.
**Tier**: B + E.
**Test status**: DEFERRED — `<DEFERRED: scripted mission with planned route + simulated POI detected mid-sweep>`.
**Input data**: scripted MAVLink mission + scripted Tier-1 detection injection at known frame index.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Start SUT with scripted mission; begin RTSP playback | SUT enters `ZoomedOut`, performs sweep |
| 2 | Inject Tier-1 detection of a high-confidence target at frame N | SUT transitions to `ZoomedIn { roi, hold_started_at }`; ROI bbox matches the injected detection's bbox; POI queue length increments by 1 |
**Expected outcome**: S1 — `exact (transition)` + `exact (ROI matches POI bbox)` + `exact (queue Δ+1)`.
---
### FT-P-012: Footpath-pan during zoomed-in hold
**Summary**: During a zoomed-in hold on a footpath ROI, the camera pans along the footpath while the airframe continues to fly. The footpath stays in the centre 50% of frame for the duration of the hold.
**Traces to**: AC `Scan & Camera Control — pan to keep features visible / S2`.
**Tier**: B + E.
**Test status**: DEFERRED — `<DEFERRED: zoomed-inspection scenario with footpath polyline overlapping the ROI>`.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Drive SUT into ZoomedIn hold on a footpath ROI | SUT in `ZoomedIn { roi, hold_started_at }` |
| 2 | Continue airframe flight; observe gimbal commands stream | SUT issues pan commands to track the footpath; observed centre offset ≤ 25% per frame |
**Expected outcome**: S2 — `numeric_tolerance` passes; per-frame centre offset ≤ 0.25 × frame_dim.
---
### FT-P-013: Target-follow centre-window
**Summary**: After operator confirmation, target-follow mode keeps the target within the centre 25% of frame while visible.
**Traces to**: AC `Scan & Camera Control — target-follow mode / S3`.
**Tier**: B + E.
**Test status**: DEFERRED — `<DEFERRED: operator-confirmed target + 60 s follow window>`.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Drive SUT into `TargetFollow { target_id, started_at }` via prior FT-P-016 | mode == target-follow |
| 2 | Observe gimbal commands + per-frame target position for 60 s | per-frame \|dx, dy\| ≤ 0.125 × frame_size |
**Expected outcome**: S3 — `threshold_max` passes per frame.
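
The centre-window criterion used here (centre 25%, so offset ≤ 0.125 × frame size) and in FT-P-012 (centre 50%, so offset ≤ 0.25) reduces to the same per-axis check. A minimal sketch, assuming target coordinates normalised to [0, 1] with the frame centre at (0.5, 0.5); the function name is hypothetical.

```rust
/// True when the target lies inside a centred window covering
/// `window_fraction` of the frame on each axis (0.25 for target-follow,
/// 0.50 for the footpath-pan hold).
pub fn within_centre_window(target_x: f32, target_y: f32, window_fraction: f32) -> bool {
    let half_window = window_fraction / 2.0;
    let dx = (target_x - 0.5).abs();
    let dy = (target_y - 0.5).abs();
    dx <= half_window && dy <= half_window
}
```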
---
### FT-P-014: POI queue ordering by `confidence × proximity × age_factor`
**Summary**: With 3 POIs varying in confidence × proximity × age_factor, the system pops them in the documented relative order.
**Traces to**: AC `Scan & Camera Control — POI queue MUST be ordered by … / S4`.
**Tier**: B.
**Test status**: READY (synthetic-poi-feeds inline-authorable).
**Input data**: `synthetic-poi-feeds` ordering test — 3 POIs with confidence ∈ {0.50, 0.80, 0.60}, proximity ∈ {near, mid, far}, age_factor ∈ {fresh, fresh, stale} chosen to produce a known relative ordering.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Inject the 3 POIs as Tier-1 detections | all 3 enter the queue |
| 2 | Observe ZoomedIn transitions over the next N seconds | SUT inspects POIs in the documented relative order |
**Expected outcome**: S4 — `exact (order)` passes.
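
The ordering rule can be sketched as a sort on the product `confidence × proximity × age_factor`. The numeric encodings used below for near/mid/far and fresh/stale are assumptions chosen only to make the example concrete; the fixture defines the real values.

```rust
#[derive(Debug, Clone)]
pub struct Poi {
    pub id: u32,
    pub confidence: f32,
    /// Assumed encoding: larger means closer.
    pub proximity: f32,
    /// Assumed encoding: larger means fresher.
    pub age_factor: f32,
}

impl Poi {
    pub fn priority(&self) -> f32 {
        self.confidence * self.proximity * self.age_factor
    }
}

/// Returns POI ids in the order they would be inspected, highest priority first.
pub fn pop_order(mut pois: Vec<Poi>) -> Vec<u32> {
    pois.sort_by(|a, b| b.priority().total_cmp(&a.priority()));
    pois.into_iter().map(|p| p.id).collect()
}
```

Note that the highest-confidence POI does not necessarily come first: with the illustrative weights in the test data, a near POI at 0.50 outranks a mid-range POI at 0.80.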
---
### FT-P-015: Zoomed-in hold cap interacts with deep-analysis
**Summary**: Zoomed-in hold defaults to 5 s/POI; when deep analysis is enabled the hold is capped at 2 s, so actual hold duration = min(2 s, deep_analysis_complete_at) with deep analysis enabled and 5 s (per-POI timeout) otherwise.
**Traces to**: AC `Scan & Camera Control — hold endpoints up to 2 s for deep analysis … per-POI timeout (default 5 s/POI) / S5`.
**Tier**: B + E.
**Test status**: DEFERRED — `<DEFERRED: VLM-enabled hold scenario with vlm_io_pair returning within 2 s>`.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Drive SUT into ZoomedIn hold; enable deep-analysis | SUT begins VLM IPC call on enter |
| 2a | Case A: VLM returns at 1.5 s | hold ends at 1.5 s (deep_analysis_complete) |
| 2b | Case B: VLM returns at 3.0 s | hold ends at 2.0 s (deep-analysis cap) |
| 2c | Case C: deep-analysis disabled | hold ends at 5.0 s (per-POI timeout) |
**Expected outcome**: S5 — `exact` passes for each case.
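
Cases A to C in the table can be expressed as one small function. This is a sketch under the assumption that the 2 s cap applies whenever deep analysis is enabled, including when the VLM never answers; the function and constant names are hypothetical.

```rust
use std::time::Duration;

pub const PER_POI_TIMEOUT: Duration = Duration::from_secs(5);
pub const DEEP_ANALYSIS_CAP: Duration = Duration::from_secs(2);

/// Hold duration for one POI.
/// `vlm_returns_after` is the time at which the VLM call completes, if it does.
pub fn hold_duration(deep_analysis_enabled: bool, vlm_returns_after: Option<Duration>) -> Duration {
    if !deep_analysis_enabled {
        // Case C: no deep analysis, the per-POI timeout applies.
        return PER_POI_TIMEOUT;
    }
    match vlm_returns_after {
        // Cases A and B: end at VLM completion, but never later than the cap.
        Some(completed_at) => completed_at.min(DEEP_ANALYSIS_CAP),
        // Assumption: a VLM that never answers is cut off at the cap.
        None => DEEP_ANALYSIS_CAP,
    }
}
```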
---
## Positive Scenarios — Operator Workflow
### FT-P-016: Operator confirm → middle waypoint inserted + target-follow
**Summary**: Valid + signed operator-confirm command results in a middle waypoint POSTed to `missions` AND a transition into target-follow mode.
**Traces to**: AC `Operator Workflow — Operator confirmation MUST result in … / O8`.
**Tier**: B + E.
**Test status**: READY for happy path (default placeholder envelope until Q9 resolves; envelope replaced when Q9 ships).
**Input data**: `operator-envelopes` (valid happy path) + `mission-suite-fixture` (DEFERRED full version) + `operator-session-scripts` (nominal session).
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | SUT in ZoomedIn hold on a POI surfaced to the operator | mode == ZoomedIn |
| 2 | Replay operator-confirm envelope on the return path | SUT validates envelope; commits decision |
| 3 | Observe HTTPS POST to `missions-mock` | `POST /missions/{id}` with a middle waypoint at the POI MGRS; HTTP 200 |
| 4 | Observe scan-mode state | mode == `TargetFollow { target_id, started_at }` |
**Expected outcome**: O8 — `exact (HTTP 200)` + `exact (mode == TargetFollow)`.
---
### FT-P-017: Decision window = 30 s at conf = 0.40
**Summary**: At confidence = 0.40 the decision window surfaced to the operator MUST equal 30 s (lower-bound anchor of the linear scale).
**Traces to**: AC `Operator Workflow — decision window … 40% confidence → 30 s / O1`.
**Tier**: B.
**Test status**: READY.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Inject a synthetic POI at conf=0.40 | POI surfaced on operator-stream with `decision_window_seconds: 30` |
**Expected outcome**: O1 — `exact (window == 30 s)`.
---
### FT-P-018: Decision window = 120 s at conf = 1.00
**Summary**: At confidence = 1.00 the decision window MUST equal 120 s (upper-bound anchor).
**Traces to**: AC `O2`.
**Tier**: B.
**Test status**: READY.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Inject a synthetic POI at conf=1.00 | window == 120 s |
**Expected outcome**: O2 — `exact`.
---
### FT-P-019: Decision window linear interpolation at conf = 0.70
**Summary**: At conf=0.70 the window is interpolated linearly between (0.40, 30 s) and (1.00, 120 s) → 75 s ± 0.5 s.
**Traces to**: AC `O3`.
**Tier**: B.
**Test status**: READY.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Inject a synthetic POI at conf=0.70 | window ≈ 75 s ± 0.5 s |
**Expected outcome**: O3 — `numeric_tolerance ± 0.5 s`.
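
The linear scale behind FT-P-017 to FT-P-019 runs between the anchors (0.40, 30 s) and (1.00, 120 s), so at 0.70 it gives 30 + (0.30 / 0.60) × 90 = 75 s. A minimal sketch follows; clamping confidences outside [0.40, 1.00] to the nearest anchor is an assumption, since the scenarios only cover the anchors and the midpoint.

```rust
/// Decision window in seconds for a POI confidence, interpolated linearly
/// between (0.40, 30 s) and (1.00, 120 s).
pub fn decision_window_seconds(confidence: f64) -> f64 {
    const CONF_LO: f64 = 0.40;
    const CONF_HI: f64 = 1.00;
    const WINDOW_LO: f64 = 30.0;
    const WINDOW_HI: f64 = 120.0;

    // Assumption: out-of-range confidences are clamped to the anchors.
    let c = confidence.clamp(CONF_LO, CONF_HI);
    WINDOW_LO + (c - CONF_LO) / (CONF_HI - CONF_LO) * (WINDOW_HI - WINDOW_LO)
}
```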
---
### FT-P-020: Operator decline → persistent ignored-item
**Summary**: Operator-decline on a surfaced POI MUST persist an ignored-item entry keyed by `(MGRS cell, class_group)`.
**Traces to**: AC `Operator Workflow — Operator-decline MUST result in a persistent ignored-item entry / O5`.
**Tier**: B + E.
**Test status**: READY (operator-session-scripts inline-authorable; envelope uses default placeholder until Q9 resolves).
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Surface a POI to the operator | POI on operator-stream |
| 2 | Replay operator-decline envelope | SUT validates; ignored-item count via health endpoint increments by 1; new item has `(MGRS, class_group)` matching the declined POI |
**Expected outcome**: O5 — `exact (count Δ+1)` + `schema_match` (ignored-item record shape).
---
### FT-P-021: Ignored-item suppresses future matching detections
**Summary**: A new detection whose `(MGRS, class_group)` matches an existing ignored-item MUST NOT be surfaced to the operator.
**Traces to**: AC `Operator Workflow — A new detection whose (MGRS, class_group) matches an existing ignored-item MUST NOT be surfaced / O6`.
**Tier**: B + E.
**Test status**: READY.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Seed an ignored-item for `(MGRS=X, class_group=Y)` via FT-P-020 | ignored-item present |
| 2 | Inject a new detection at `(MGRS=X, class_group=Y)` | operator-stream emits NO POI for this detection; counter `pois_suppressed_by_ignored_total` increments |
**Expected outcome**: O6 — `exact (count surfaced == 0)`.
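
The decline and suppression behaviour of FT-P-020 and FT-P-021 amounts to a set lookup keyed by `(MGRS cell, class_group)`. The sketch below is in-memory only (the real entry must be persistent), and the type name, method names, and string keys are assumptions; the counter mirrors `pois_suppressed_by_ignored_total`.

```rust
use std::collections::HashSet;

/// In-memory stand-in for the persistent ignored-item store.
#[derive(Default)]
pub struct IgnoredItems {
    keys: HashSet<(String, String)>,
    /// Mirrors the `pois_suppressed_by_ignored_total` counter.
    pub suppressed_total: u64,
}

impl IgnoredItems {
    /// Record an operator decline for the given cell and class group.
    pub fn decline(&mut self, mgrs_cell: &str, class_group: &str) {
        self.keys.insert((mgrs_cell.to_string(), class_group.to_string()));
    }

    /// Returns false (and bumps the counter) when the detection matches an
    /// ignored item and must not be surfaced to the operator.
    pub fn should_surface(&mut self, mgrs_cell: &str, class_group: &str) -> bool {
        let key = (mgrs_cell.to_string(), class_group.to_string());
        if self.keys.contains(&key) {
            self.suppressed_total += 1;
            false
        } else {
            true
        }
    }

    pub fn len(&self) -> usize {
        self.keys.len()
    }
}
```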
---
### FT-P-022: Operator timeout = forget (no ignored-item)
**Summary**: If the decision window expires with no operator response, the POI is removed from the queue but NO ignored-item is created (forget, do not blacklist).
**Traces to**: AC `Operator Workflow — Timeout (no operator response within the window) MUST NOT create an ignored-item entry / O7`.
**Tier**: B + E.
**Test status**: READY.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Surface a POI at conf=0.40 (30 s window) | POI on operator-stream |
| 2 | Wait > 30 s with no response | POI removed from queue; ignored-item count UNCHANGED |
**Expected outcome**: O7 — `exact (queue Δ-1)` + `exact (ignored-item count unchanged)`.
---
## Positive Scenarios — Pre-flight & Map Reconciliation
### FT-P-023: BIT pre-flight pass with every dependency healthy
**Summary**: When every external dependency is reachable + healthy AND on-device storage < 95 % full AND wall-clock is bound, BIT passes and takeoff is permitted.
**Traces to**: AC `Reliability & Safety — Pre-flight self-test MUST pass / R1`, RESTRICT `Reliability & Safety obligations — Pre-flight self-test (BIT) MUST gate takeoff`.
**Tier**: B + E.
**Test status**: READY (bit-scenarios inline-authorable).
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Bring up all mocks healthy + clean autopilot-state volume | every dependency green |
| 2 | Trigger BIT via the BIT-arm operator command (or scripted in `operator-session-scripts`) | health endpoint returns `{ "ok": true, "deps": { ...all green }, "takeoff_permitted": true }` |
**Expected outcome**: R1 — `exact (takeoff_permitted == true)` + `exact (health.all == "green")`.
---
### FT-P-024: Pre-flight map pull ≤ 30 s for a 30×30 km region
**Summary**: Pulling the area-level map of previously-detected objects for a 30 km × 30 km mission area MUST complete within 30 s wall-clock.
**Traces to**: AC `Map Reconciliation — Pre-flight map pull / Mp1`.
**Tier**: B + E.
**Test status**: DEFERRED — `<DEFERRED: mock central area-map service with ~10000 map objects for the 30 km × 30 km region>`.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Configure `missions-mock` with the 30×30 km mapobjects fixture | mock ready |
| 2 | Trigger BIT (which pulls the map) | SUT issues `GET /missions/{id}/mapobjects`; local copy hydrated within 30 s |
| 3 | Confirm BIT proceeds normally afterwards | takeoff permitted |
**Expected outcome**: Mp1 — `threshold_max` passes (NFT-PERF measures the latency; this scenario asserts the functional pathway).
---
### FT-P-025: Post-flight map diff push for a 60-minute mission
**Summary**: Pushing the post-flight pass diff (~17 500 records: NEW + MOVED + REMOVED + CONFIRMED-EXISTING) for a 60-minute mission MUST complete within 120 s wall-clock.
**Traces to**: AC `Map Reconciliation — Post-flight pass diff push / Mp3`.
**Tier**: B + E.
**Test status**: DEFERRED — `<DEFERRED: 60-minute mission pass diff fixture>`.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Land the SUT after a 60-minute mission (scripted) | SUT enters post-flight reconciliation |
| 2 | Observe HTTPS POST to `missions-mock` | `POST /missions/{id}/mapobjects` with the diff; HTTP 200 within 120 s |
**Expected outcome**: Mp3 — `threshold_max` passes (NFT-PERF measures latency).
---
### FT-P-026: MapObjects conflict resolution (append-only + projection)
**Summary**: When two map updates conflict for the same `(spatial-cell, class_group)`, the SUT records both observations append-only AND computes the current view per the documented resolution rule.
**Traces to**: AC `Q-tagged — MapObjects conflict resolution / Mp5` (depends on Q8).
**Tier**: B + E.
**Test status**: DEFERRED — `<DEFERRED: conflict pair fixture + expected_results/mapobjects_conflict_resolution.json; Q8>`.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Seed the local mapobjects store via map pull | local store hydrated |
| 2 | Trigger two conflicting observations for `(cell=X, class=Y)` | both appended to the observation log |
| 3 | Observe the projected current view (via the operator-stream map-overlay channel or health debug) | current view matches the resolution rule (Q8) |
**Expected outcome**: Mp5 — `json_diff` passes against the reference.
---
## Negative Scenarios
### FT-N-001: BIT inhibits takeoff when Tier-1 detection is unreachable
**Summary**: When `../detections` is unreachable at BIT, takeoff MUST be inhibited and the detection dependency MUST report red.
**Traces to**: AC `Reliability & Safety — Pre-flight self-test MUST pass / R2`, RESTRICT `Suite-level architectural splits — Tier 1 lives in ../detections`.
**Tier**: B + E.
**Test status**: READY.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Stop `detections-mock` | dependency unreachable |
| 2 | Trigger BIT | health endpoint returns `takeoff_permitted: false`; `deps.detection == "red"`; operator-stream surfaces a BIT-failure event with category `detection` |
| 3 | Attempt to issue a takeoff MAVLink command (scripted) | SUT refuses; no MAVLink takeoff command observed on `mavlink-sitl` |
**Expected outcome**: R2 — `exact (takeoff inhibited)`.
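The step-2 health assertion could look roughly like the helper below, sketched in Python for brevity. The field names `takeoff_permitted` and `deps.detection` come from the table above; the exact JSON nesting, the helper name `check_bit_inhibited`, and the sample body are assumptions.

```python
import json


def check_bit_inhibited(health_body: str, dep: str) -> list:
    """Return a list of mismatches; an empty list means the check passed."""
    health = json.loads(health_body)
    problems = []
    if health.get("takeoff_permitted") is not False:
        problems.append("takeoff_permitted is not false")
    if health.get("deps", {}).get(dep) != "red":
        problems.append(f"deps.{dep} is not red")
    return problems


# Illustrative health body (assumed layout, not captured from a real run).
sample = '{"takeoff_permitted": false, "deps": {"detection": "red", "storage": "green"}}'
detection_problems = check_bit_inhibited(sample, "detection")
storage_problems = check_bit_inhibited(sample, "storage")
```

Returning a list of mismatches rather than raising on the first one lets a single run report every field that deviates, which is useful for the `failure_reason` column of the report.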
---
### FT-N-002: BIT inhibits takeoff when persistent storage ≥ 95 % full
**Summary**: When the on-device persistent store is ≥ 95 % full at BIT, takeoff MUST be inhibited.
**Traces to**: AC `Reliability & Safety — Pre-flight self-test MUST pass / R3`, RESTRICT `On-device storage MUST be bounded`.
**Tier**: B.
**Test status**: READY.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Pre-fill `autopilot-state` volume to ≥ 95 % via seed file | storage threshold tripped |
| 2 | Trigger BIT | `takeoff_permitted: false`; `deps.storage == "red"` |
**Expected outcome**: R3 — `exact (takeoff inhibited)`.
---
### FT-N-003: Cache-fallback on map-pull timeout requires operator acknowledgement
**Summary**: When the pre-flight map pull times out, the SUT falls back to last-known cached MapObjects, reports `map_sync == "cached_fallback"`, AND MUST require explicit operator acknowledgement before takeoff is permitted.
**Traces to**: AC `Map Reconciliation — Cache-fallback on timeout is acceptable only with explicit operator acknowledgement / Mp2`.
**Tier**: B + E.
**Test status**: READY (operator-session-scripts inline-authorable; cached state seeded from prior pull).
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Seed `autopilot-state` with a known prior MapObjects snapshot | cached map present |
| 2 | Configure `missions-mock` to time out on `GET /missions/{id}/mapobjects` | mock returns 504 / silent timeout |
| 3 | Trigger BIT | SUT falls back to cached; `map_sync == "cached_fallback"`; BIT reports `takeoff_permitted: false, awaiting_ack: ["map_cache_fallback"]` |
| 4 | Replay operator-ack envelope for `map_cache_fallback` | BIT now reports `takeoff_permitted: true`; one structured-log entry at WARN with `map_cache_fallback_acked_by_operator` |
| 5 | Replay a takeoff scenario WITHOUT the ack | takeoff remains inhibited |
**Expected outcome**: Mp2 — `exact (cached_fallback)` + `exact (BIT requires explicit ack)`.
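The gating logic exercised in steps 3 to 5 can be modelled in a few lines. This is a sketch under stated assumptions: the function `bit_report` and its arguments are hypothetical, and only the field values (`cached_fallback`, `awaiting_ack`, `map_cache_fallback`, `takeoff_permitted`) are taken from the table above.

```python
def bit_report(deps_green: bool, map_sync: str, acks: set) -> dict:
    """Model of the BIT outcome: a cached-fallback map needs an explicit operator ack."""
    awaiting = []
    if map_sync == "cached_fallback" and "map_cache_fallback" not in acks:
        awaiting.append("map_cache_fallback")
    return {
        "map_sync": map_sync,
        "awaiting_ack": awaiting,
        # Takeoff is permitted only when nothing is red and nothing awaits an ack.
        "takeoff_permitted": deps_green and not awaiting,
    }


before_ack = bit_report(True, "cached_fallback", set())
after_ack = bit_report(True, "cached_fallback", {"map_cache_fallback"})
```

In this model the ack only clears the `awaiting_ack` entry; it never overrides a red dependency, which matches the intent that the acknowledgement is a permission for the fallback and not a general bypass.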
---
### FT-N-004: Below-threshold POI suppression (conf < 40 %)
**Summary**: A POI at confidence < 0.40 MUST NOT be surfaced to the operator at all.
**Traces to**: AC `Operator Workflow — Below 40% confidence, the POI MUST NOT be surfaced at all / O4`.
**Tier**: B.
**Test status**: READY.
| Step | Consumer Action | Expected System Response |
|---|---|---|
| 1 | Inject a synthetic POI at conf=0.39 | POI does NOT appear on operator-stream; counter `pois_below_threshold_total` increments by 1 |
**Expected outcome**: O4 — `exact (count surfaced == 0)`.
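A minimal sketch of the suppression rule and its counter, assuming a strict `<` comparison at the 0.40 boundary (the scenario only fixes the behaviour below the threshold). The class `PoiGate` and the POI ids are illustrative; the counter name is the one used in the table.

```python
SURFACE_THRESHOLD = 0.40  # from the scenario: below 40 % is never surfaced


class PoiGate:
    """Hypothetical gate in front of the operator stream."""

    def __init__(self):
        self.surfaced = []
        self.pois_below_threshold_total = 0

    def offer(self, poi_id: str, conf: float) -> bool:
        """Surface the POI unless its confidence is below the threshold."""
        if conf < SURFACE_THRESHOLD:
            self.pois_below_threshold_total += 1
            return False
        self.surfaced.append(poi_id)
        return True


gate = PoiGate()
gate.offer("poi-1", 0.39)  # suppressed, counter increments
gate.offer("poi-2", 0.40)  # at the boundary: surfaced under the strict-< assumption
```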
---
## Notes for downstream skills
- Decompose: every `READY` scenario above maps to at least one blackbox test task. DEFERRED scenarios MUST still produce a task spec (so the implementation has a placeholder), but the task spec's `Acceptance` section will reference the leftover entry that gates the fixture.
- Implement Tests: per-scenario assertion helpers (RTSP playback orchestration, MAVLink observer, operator-stream observer) are likely shared across scenarios — Phase 4's runner scripts will assume a thin `e2e/consumer/lib/` module that all scenarios depend on.
- Test-Spec Sync (cycle-update mode): post-implementation, scenarios may be split (e.g. FT-P-015's three sub-cases may become FT-P-015a/b/c) or merged. The traceability-matrix is the source of truth — every scenario MUST trace to at least one AC or RESTRICT.
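The traceability rule in the last note lends itself to a mechanical check. The sketch below scans a `*-tests.md` file for scenario headers and flags any scenario whose block has no `Traces to` line citing an AC or RESTRICT. The function name, the header regex, and the `FT-X-999` sample id are assumptions for illustration.

```python
import re

SCENARIO_HEADER = re.compile(r"^### ((?:FT|NFT)-[A-Z0-9-]+):")


def untraced_scenarios(markdown: str) -> list:
    """Return scenario ids whose block has no `Traces to` line citing AC or RESTRICT."""
    missing, current, traced = [], None, False
    for line in markdown.splitlines():
        header = SCENARIO_HEADER.match(line)
        if header:
            # Close the previous scenario block before starting a new one.
            if current and not traced:
                missing.append(current)
            current, traced = header.group(1), False
        elif line.startswith("**Traces to**") and re.search(r"\b(AC|RESTRICT)\b", line):
            traced = True
    if current and not traced:
        missing.append(current)
    return missing


spec = """### FT-N-004: Below-threshold POI suppression
**Traces to**: AC `Operator Workflow / O4`.
### FT-X-999: Hypothetical untraced scenario
**Tier**: B.
"""
missing = untraced_scenarios(spec)
```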
@@ -0,0 +1,320 @@
# Test Environment
Authored by `/test-spec` Phase 2 (2026-05-19) against:
- `_docs/00_problem/problem.md`, `acceptance_criteria.md`, `restrictions.md`, `security_approach.md`
- `_docs/01_solution/solution_draft01.md`
- `_docs/02_document/architecture.md` (incl. §6 NFR Targets, §7 Detailed Design)
- `_docs/00_problem/input_data/data_parameters.md`, `services.md`, `fixtures/README.md`, `expected_results/results_report.md`
Per `.cursor/rules/artifact-srp.mdc` this artifact owns ONLY the test environment / harness shape — measurable thresholds belong in `acceptance_criteria.md`, fixture inventory belongs in `test-data.md`, and per-test specs belong in the sibling `*-tests.md` files.
---
## Overview
**System under test (SUT)**: `autopilot`, a single Rust binary that runs on the Jetson Orin Nano Super of a reconnaissance UAV. Its observable external surfaces:
| Surface | Direction | Protocol | Source/Sink in production |
|---|---|---|---|
| Tier-1 detection RPC | autopilot ⇄ detector | bi-directional gRPC streaming (local) | `../detections` |
| MAVLink command/telemetry | autopilot ⇄ airframe | MAVLink v2 over UDP (or serial) | ArduPilot / PX4 |
| Camera RTSP feed | camera → autopilot | H.264/265 1080p, 30/60 fps | ViewPro A40 |
| Gimbal control + telemetry | autopilot ⇄ camera | ViewPro vendor UDP | ViewPro A40 |
| Mission + MapObjects REST | autopilot ⇄ central | HTTPS JSON | `missions` service |
| Operator stream (telemetry out, commands in) | autopilot ⇄ GS | Suite-level modem protocol, signed commands | Ground Station |
| Deep-analysis VLM IPC (optional) | autopilot ⇄ VLM | Unix-domain socket | local-onboard VLM |
| Health endpoint | autopilot → ops | HTTP/JSON | scraped by ops |
| Structured logs | autopilot → ops | JSON to stdout | log shipper |
The harness exercises every one of those surfaces from outside the SUT process. No test reaches inside the binary (no module imports, no direct DB peeks, no shared memory).
**Consumer app purpose**: a black-box test runner (`e2e-consumer`) that:
1. Brings up the SUT in a controlled topology (with mock or live peers).
2. Drives inputs through public surfaces.
3. Captures every observable: outbound network frames, MAVLink commands, gimbal UDP commands, REST calls, operator-stream messages, health-endpoint JSON, log lines, plus passive resource metrics (RSS, CPU, GPU).
4. Compares each observation against the expected result tagged in `_docs/00_problem/input_data/expected_results/results_report.md` and emits a CSV report.
## Test execution tiers
Five execution tiers exist; each test scenario declares which tier(s) it must run in:
| Tier | Purpose | What is real vs mocked | When it runs |
|---|---|---|---|
| **U** — unit | Pure in-process logic with no external surface (state-machine transitions, geometry helpers, schema validators) | Everything in-process | Per commit (cargo test) |
| **I** — component-integration | One autopilot component against mocks for every peer | SUT component real; all peers stubbed/replayed | Per commit; isolates contract drift |
| **B** — blackbox / harness | Full SUT binary against mock peers in containers | SUT binary real; every external peer mocked (HTTPS mock, gRPC replay, MAVLink SITL, scripted operator trace, RTSP loopback) | Per commit + nightly |
| **E** — suite-e2e | Full SUT against live siblings (`../detections`, `../missions`, ArduPilot SITL, Ground Station replay) | All real services in the suite-e2e compose | Nightly + pre-release |
| **HW** — hardware/replay benchmark | SUT binary on representative Jetson hardware OR on a benchmarked replay of that hardware | Real Jetson Orin Nano Super OR benchmarked replay | Pre-release; the only path that satisfies the `acceptance_criteria.md → Acceptance Gates (project-level)` hardware gate |
Hardware-dependency analysis (which AC rows require HW vs replay vs commodity) is produced by the test-spec `phases/hardware-assessment.md` step before Phase 4 runner scripts are generated and is appended to this file as `## Hardware Execution Matrix`.
## Docker environment (Tier B + E)
The suite-e2e compose lives at the monorepo level (`../e2e/docker-compose.suite-e2e.yml`, owned by the `monorepo-e2e` skill — see `_docs/00_problem/input_data/services.md`). The autopilot-local harness lives at `e2e/docker-compose.autopilot-e2e.yml` (created by Phase 4) and brings up only the SUT + mocks needed for Tier-B runs.
### Services (Tier B — autopilot-local harness)
| Service | Image / Build | Purpose | Ports |
|---|---|---|---|
| `autopilot` | build: `.` (cross to `aarch64-unknown-linux-gnu` for HW, native for Tier B) | SUT | health: 9100/tcp; log: stdout; MAVLink: 14550/udp; gimbal: 9201/udp; operator: 9301/tcp |
| `detections-mock` | build: `e2e/mocks/detections-mock` (Python) | Bi-directional gRPC mock replaying recorded `Detections` streams | 50051/tcp |
| `missions-mock` | build: `e2e/mocks/missions-mock` (Python FastAPI) | HTTPS REST mock — `GET/POST /missions/{id}` + `/mapobjects` | 8443/tcp (TLS) |
| `rtsp-loopback` | image: `bluenviron/mediamtx` | RTSP server playing back recorded `.mp4` frame sequences at 30/60 fps | 8554/tcp |
| `gimbal-mock` | build: `e2e/mocks/gimbal-mock` (Rust) | ViewPro UDP echo + scripted yaw/pitch/zoom telemetry replays | 9200/udp |
| `mavlink-sitl` | image: `ardupilot/ardupilot-sitl` | ArduPilot SITL — MAVLink v2 endpoint for the autopilot to drive | 14551/udp |
| `vlm-mock` | build: `e2e/mocks/vlm-mock` (Python, UDS) | Optional Tier-3 VLM IPC mock; replays recorded `VlmAssessment` JSON | (UDS only) |
| `operator-replay` | build: `e2e/mocks/operator-replay` (Python) | Scripted Ground Station session trace: connect / push frame / push telemetry / operator-click / modem-drop / reconnect / lost-link | 9300/tcp |
| `time-injector` | build: `e2e/mocks/time-injector` (Rust) | Injects clock-drift / NTP-loss scenarios into the SUT container's clock via `faketime` LD_PRELOAD shim | — |
| `e2e-consumer` | build: `e2e/consumer` (Rust + assert crates) | The black-box test runner that drives scenarios + compares observables to expected results | — |
### Networks
| Network | Services | Purpose |
|---|---|---|
| `autopilot-e2e` | all | Isolated test network; no egress |
### Volumes
| Volume | Mounted to | Purpose |
|---|---|---|
| `fixtures-ro` | every mock service (read-only) | Mounts `_docs/00_problem/input_data/fixtures/` for replay sources |
| `expected-ro` | `e2e-consumer:/expected:ro` | Mounts `_docs/00_problem/input_data/expected_results/` for assertion comparison |
| `reports-rw` | `e2e-consumer:/reports` | CSV + JSON test output |
| `autopilot-state` | `autopilot:/var/lib/autopilot` | On-device persistent store (R3, Mp4) — wiped between runs |
### docker-compose structure (outline only — not runnable)
```yaml
services:
  autopilot:
    build: .
    depends_on: [detections-mock, missions-mock, rtsp-loopback, gimbal-mock, mavlink-sitl, operator-replay]
    networks: [autopilot-e2e]
    environment:
      DETECTOR_GRPC: detections-mock:50051
      MISSIONS_URL: https://missions-mock:8443
      RTSP_URL: rtsp://rtsp-loopback:8554/feed
      GIMBAL_UDP: gimbal-mock:9200
      MAVLINK_UDP: mavlink-sitl:14551
      OPERATOR_TCP: operator-replay:9300
      VLM_SOCK: /tmp/vlm.sock
      AUTOPILOT_CONFIG: /etc/autopilot/test.toml
    volumes:
      - autopilot-state:/var/lib/autopilot
  detections-mock: { build: e2e/mocks/detections-mock, volumes: [fixtures-ro:/fixtures:ro] }
  missions-mock: { build: e2e/mocks/missions-mock, volumes: [fixtures-ro:/fixtures:ro] }
  rtsp-loopback: { image: bluenviron/mediamtx, volumes: [fixtures-ro:/fixtures:ro] }
  gimbal-mock: { build: e2e/mocks/gimbal-mock, volumes: [fixtures-ro:/fixtures:ro] }
  mavlink-sitl: { image: ardupilot/ardupilot-sitl }
  vlm-mock: { build: e2e/mocks/vlm-mock, volumes: [fixtures-ro:/fixtures:ro] }
  operator-replay: { build: e2e/mocks/operator-replay, volumes: [fixtures-ro:/fixtures:ro] }
  time-injector: { build: e2e/mocks/time-injector }
  e2e-consumer:
    build: e2e/consumer
    depends_on: [autopilot]
    volumes: [expected-ro:/expected:ro, reports-rw:/reports]
networks:
  autopilot-e2e: {}
volumes:
  fixtures-ro: { driver_opts: { type: none, o: bind, device: ${PWD}/_docs/00_problem/input_data/fixtures } }
  expected-ro: { driver_opts: { type: none, o: bind, device: ${PWD}/_docs/00_problem/input_data/expected_results } }
  reports-rw: {}
  autopilot-state: {}
```
### Suite-e2e compose (Tier E) — referenced, not redefined
For Tier-E runs the harness uses `../e2e/docker-compose.suite-e2e.yml` (owned by `monorepo-e2e`). It adds the real `../detections`, real `../missions`, and a richer `mavlink-sitl` configuration. Autopilot's Tier-E entries in this file MUST mirror the suite-e2e topology — drift is reconciled by the `monorepo-e2e` skill, not here.
## Consumer application (`e2e-consumer`)
**Tech stack**: Rust + `assert_cmd` + `testcontainers-rs` + `prost`/`tonic` (for gRPC observation) + `mavlink-rs` (for MAVLink observation) + `reqwest`/`hyper` (for HTTPS observation) + `tokio-tungstenite` (for operator-stream observation). Tests are organised one-scenario-per-file under `e2e/consumer/tests/scenarios/`.
**Entry point**: `cargo test --release --test scenarios` (orchestrated by `scripts/run-tests.sh`, produced in Phase 4).
### Communication with the system under test
| Interface | Protocol | Endpoint / Topic | Authentication |
|---|---|---|---|
| Health endpoint | HTTP GET | `http://autopilot:9100/health` | none (loopback) |
| Structured log stream | line-delimited JSON on stdout | docker-compose log tail | none |
| MAVLink observed | MAVLink v2 / UDP | `mavlink-sitl:14551` (the harness records both sides of the link) | per Q6: MAVLink-2 message signing if configured |
| Gimbal observed | ViewPro UDP | `gimbal-mock:9200` (commands recorded + telemetry replayed) | none |
| RTSP delivered | RTSP | `rtsp://rtsp-loopback:8554/feed` (consumer schedules which clip plays per scenario) | none |
| Detection RPC observed | gRPC streaming | `detections-mock:50051` (consumer scripts the recorded replay served) | none |
| Mission REST observed | HTTPS | `missions-mock:8443` (consumer scripts JSON fixtures + asserts captured request bodies) | TLS cert (self-signed for test) |
| Operator stream observed | Suite modem protocol | `operator-replay:9300` (consumer scripts session traces + signed-command envelopes) | per Q9: signed envelope (HMAC / ed25519 / MAVLink-2-ext) |
| VLM IPC observed (when enabled) | Unix-domain socket | `/tmp/vlm.sock` shared with `vlm-mock` | peer-credential check (security_approach §"Local IPC peer authorisation") |
### What the consumer does NOT have access to
- No direct database access to the autopilot's on-device persistent store (`autopilot-state` volume) — the consumer reads it only via the health endpoint, the operator telemetry stream, or as a post-run forensic check (the storage AC R3 is checked via the BIT health response, not by peeking at SQLite rows).
- No internal Rust module imports — the consumer is a separate crate compiled against published public proto/schema files only.
- No shared memory, no `/proc/$pid/...` inspection beyond passive resource metrics.
- No direct reading of in-flight POI queue ordering — ordering is observed indirectly via the operator-stream emission order and the gimbal command stream.
## External dependency mocks
| Dependency | Mock service | Determinism guarantee | Source fixture(s) |
|---|---|---|---|
| `../detections` Tier-1 RPC | `detections-mock` | Replays recorded `Detections` stream byte-for-byte; same input → same output | `<DEFERRED: tier1_replay/*.replay; services.md §1>` (live `../detections` used as fallback in Tier-E) |
| `missions` API | `missions-mock` | Static JSON responses per scenario; recorded round-trip captured for `POST` | `<DEFERRED: missions_fixtures/*.json; services.md §2>` |
| ViewPro A40 camera frames | `rtsp-loopback` (mediamtx) | Plays back `.mp4` at exact configured fps; frame timestamps deterministic | `fixtures/videos/94d42580bd1ad6ff.mp4`, `fixtures/movement/video0[1-4].mp4` |
| ViewPro A40 gimbal control | `gimbal-mock` | Replays `gimbal.csv` per scenario; echoes commands with bounded latency budget per scenario | `<DEFERRED: gimbal_csv/*.csv paired with movement videos; services.md §6>` |
| ArduPilot airframe | `mavlink-sitl` (ArduPilot SITL) | Deterministic seed + scripted mission | scripted per scenario; no fixture file required for Tier B (SITL is the fixture) |
| Ground Station modem session | `operator-replay` | Replays `(t, event)` script per scenario | `<DEFERRED: operator_sessions/*.script; services.md §3>` |
| Local VLM (Tier-3 optional) | `vlm-mock` | Returns paired `(roi.png → VlmAssessment)` from disk; schema-violation fixtures for fail-closed tests | `<DEFERRED: vlm_io_pairs/*.json; services.md §7>` |
| Wall-clock / GPS / NTP | `time-injector` (faketime LD_PRELOAD) | Scripted offset / jump / source-loss; injected at SUT process start | scripted per scenario; no fixture file required |
Mocks that are marked `<DEFERRED:>` are bridged through `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`. Scenarios that consume those mocks declare `Test status: DEFERRED — input fixture not yet acquired (see leftover row N)` in their entry under the relevant `*-tests.md` file.
## CI/CD integration
| Stage | Tier(s) | When | Gate | Timeout |
|---|---|---|---|---|
| PR pipeline | U, I | on every PR push | block merge on FAIL | 10 min |
| dev-branch nightly | U, I, B | nightly | warn on FAIL; report attached | 60 min |
| weekly suite-e2e | U, I, B, E | weekly + on release branch | block release on FAIL | 180 min |
| pre-release HW benchmark | HW | manual + pre-release | block release on FAIL | 240 min |
Owned in `_docs/02_document/deployment/ci_cd_pipeline.md`. This file only declares which tier each scenario MUST run in; the pipeline orchestration is documented there.
## Reporting
**Format**: CSV (one row per scenario per run).
**Columns**:
| Column | Type | Notes |
|---|---|---|
| `test_id` | string | e.g. `FT-P-001`, `NFT-PERF-L1`, `NFT-SEC-O9` |
| `test_name` | string | short title from the scenario header |
| `tier` | enum | U / I / B / E / HW |
| `seed` | int | deterministic seed used (where applicable) |
| `start_ts_utc` | ISO 8601 | scenario start |
| `duration_ms` | int | total execution time |
| `result` | enum | PASS / FAIL / SKIP / DEFERRED |
| `expected_result_ref` | string | row id in `expected_results/results_report.md` (e.g. `L1`, `Mp3`) |
| `actual_value` | string | quantitative observation (latency_ms, count, etc.) |
| `compare_method` | string | one of the comparison methods defined in `expected_results/results_report.md` |
| `tolerance` | string | as declared in the expected-results row |
| `failure_reason` | string | populated only on FAIL or DEFERRED |
| `artifacts_path` | string | path under `/reports/<run-id>/` for captured logs / pcaps / mavlink dumps |
**Output path**: `e2e/consumer/reports/<run-id>/report.csv` (mounted host-side to `./reports/<run-id>/report.csv`).
**Sidecar artifacts** per scenario (one folder per `test_id`): `stdout.log`, `stderr.log`, `mavlink.tlog` (where applicable), `pcap.bin` (where applicable), `health-trace.jsonl`, `actual-output.json`.
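As an illustration of the report shape, the sketch below writes one row with the columns listed above and rejects result values outside the declared enum. It uses only Python's standard `csv` module; the function name `write_report` and every value in the sample row are made up for the example and are not results from a real run.

```python
import csv
import io

COLUMNS = [
    "test_id", "test_name", "tier", "seed", "start_ts_utc", "duration_ms",
    "result", "expected_result_ref", "actual_value", "compare_method",
    "tolerance", "failure_reason", "artifacts_path",
]
RESULTS = {"PASS", "FAIL", "SKIP", "DEFERRED"}


def write_report(rows, stream):
    """Write scenario rows as CSV, rejecting unknown result values."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        if row["result"] not in RESULTS:
            raise ValueError(f"unknown result {row['result']!r}")
        # Missing optional columns (e.g. failure_reason on PASS) become empty cells.
        writer.writerow({col: row.get(col, "") for col in COLUMNS})


buffer = io.StringIO()
write_report([{
    "test_id": "FT-N-004", "test_name": "Below-threshold POI suppression",
    "tier": "B", "seed": 1, "start_ts_utc": "2026-05-19T08:00:00Z",
    "duration_ms": 1200, "result": "PASS", "expected_result_ref": "O4",
    "actual_value": "0", "compare_method": "exact", "tolerance": "",
    "artifacts_path": "/reports/run-0001/FT-N-004/",
}], buffer)
lines = buffer.getvalue().splitlines()
```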
## Test Execution
**Decision** (recorded 2026-05-19 by `phases/hardware-assessment.md`): **local-only on Jetson Orin Nano Super**. Every scenario — Tier B, Tier E, Tier HW — runs on representative Jetson hardware (the same hardware the airborne payload deploys to). Docker is used for **service orchestration** (mocks, sibling services) on the Jetson host, NOT for SUT execution on x86.
### Hardware dependencies found
| File | Dependency surfaced |
|---|---|
| `_docs/00_problem/restrictions.md → "Hardware"` | Jetson Orin Nano Super (aarch64), 8 GB shared LPDDR5, 67 TOPS INT8; ViewPro A40 (40× optical zoom + vendor UDP); ViewPro Z40K compatibility |
| `_docs/00_problem/restrictions.md → "Software environment"` | FP16 precision (INT8 rejected); no cloud egress; Tier 1 + local large models share Jetson GPU with mutual exclusion |
| `_docs/01_solution/solution_draft01.md` | "single Rust binary on Jetson Orin Nano Super (aarch64)"; TensorRT FP16; Tokio + Unix-domain-socket VLM IPC |
| `_docs/02_document/architecture.md §6` (NFR Targets) + `§7.6` (Solution Architecture) + `§7.14` (Tech Stack) | cross-compile target `aarch64-unknown-linux-gnu`; TensorRT engine; gimbal UDP; MAVLink-v2 transport |
| `_docs/02_document/components/*/description.md` (13 components) | physical UDP (gimbal_controller), RTSP capture (frame_ingest), MAVLink airframe link (mavlink_layer), local-onboard model (semantic_analyzer + vlm_client) |
### Why local-only on Jetson
The choice rejects two alternatives:
- **Docker-only on x86** would leave Tier-HW rows (L1–L9, Re1, Re2, NFT-RES-LIM-CPU, NFT-RES-LIM-GPU) `SKIPPED-NO-HW`. That defeats the project-level Acceptance Gate (`acceptance_criteria.md → "Acceptance Gates (project-level)"`: every latency criterion MUST be measured on the deployed compute device).
- **Both x86 + Jetson** would split the test surface and let Tier-B scenarios pass on x86 while masking real-hardware regressions (e.g. GPU contention is invisible on x86). The honest path is to exercise the actual hardware path uniformly.
### Execution instructions (local on Jetson)
**Prerequisites** (one-time, per Jetson runner):
- JetPack 6.x SDK + L4T r36.x (matches the airborne deployment image).
- Rust toolchain pinned to the workspace's `rust-toolchain.toml` (added by Step 7 Implement); rustup target `aarch64-unknown-linux-gnu` already native here.
- Docker + Docker Compose v2 (for orchestrating the mock services + sibling repos in Tier-E mode).
- `mavlink-router`, `tegrastats`, `iperf3`, `tc` (network shaping).
- ViewPro A40 (or Z40K for the Z40K-swap regression run) connected over Ethernet at the documented control endpoint.
- ArduPilot SITL binary installed natively (the Docker image is x86-only; on Jetson aarch64 we run SITL natively or via Apptainer).
- A representative ViewPro A40 RTSP feed source — either the physical camera or a recorded `.mp4` looped through a local `mediamtx`.
**How to start services**: `docker compose -f e2e/docker-compose.autopilot-e2e.yml up -d` brings up `detections-mock`, `missions-mock`, `rtsp-loopback`, `gimbal-mock`, `vlm-mock`, `operator-replay`, `time-injector` on the Jetson host. The SUT (`autopilot` binary) runs **outside** the compose — `cargo run --release` on the Jetson directly, with env vars pointing at the compose-side mock endpoints. For Tier E, swap `detections-mock` → live `../detections` and `missions-mock` → live `missions` per `../e2e/docker-compose.suite-e2e.yml`.
**How to run the test runner**: `scripts/run-tests.sh` (to be created by a Decompose task per `traceability-matrix.md → "Phase 4 SKIPPED"` handoff) orchestrates: bring up compose → start SUT → run `cargo test --release --test scenarios -p e2e-consumer` → tear down. The runner reads `RUN_TIER ∈ {B, E, HW}` to decide which scenarios to execute.
**Environment variables** (consumed by both the SUT and the consumer):
- `RUN_TIER` (`B` | `E` | `HW`) — selects scenario set per the matrix below.
- `AUTOPILOT_CONFIG` — path to the test profile TOML (overrides per-scenario thresholds + Q-tagged defaults).
- `AUTOPILOT_RNG_SEED` — deterministic-seed per scenario; captured in the CSV report.
- `JETSON_RUNNER_ID` — identifier for the physical Jetson + camera+gimbal hardware combo; carried into every CSV row for forensic comparison across runners.
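How a runner might consume `RUN_TIER` and `AUTOPILOT_RNG_SEED` is sketched below in Python. The runner itself is not yet written, so the function `select_scenarios`, the default seed of 0, and the three-entry scenario map are assumptions; the tier assignments in the map are copied from the scenario entries in this spec.

```python
import os

VALID_TIERS = ("B", "E", "HW")

# Hypothetical excerpt of a scenario-to-tiers mapping.
SCENARIO_TIERS = {
    "FT-N-001": {"B", "E"},
    "FT-N-004": {"B"},
    "NFT-PERF-L1": {"HW"},
}


def select_scenarios(env=os.environ):
    """Read the tier and seed from the environment and pick matching scenarios."""
    tier = env.get("RUN_TIER", "")
    if tier not in VALID_TIERS:
        raise ValueError(f"RUN_TIER must be one of {VALID_TIERS}, got {tier!r}")
    seed = int(env.get("AUTOPILOT_RNG_SEED", "0"))  # default 0 is an assumption
    selected = sorted(s for s, tiers in SCENARIO_TIERS.items() if tier in tiers)
    return tier, seed, selected


tier, seed, selected = select_scenarios({"RUN_TIER": "B", "AUTOPILOT_RNG_SEED": "42"})
```

Failing fast on an unset or unknown `RUN_TIER` avoids a silent run that executes zero scenarios and reports success.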
### CI/CD addendum (overrides the earlier `## CI/CD integration` table)
The earlier table assumed a Docker-on-x86 PR pipeline. Under this decision, every tier runs on a Jetson runner. Operationally that means:
| Stage | Tier(s) | When | Gate | Timeout | Runner |
|---|---|---|---|---|---|
| PR pipeline | U, I | on every PR push | block merge on FAIL | 10 min | Jetson runner (native cargo test for U + I) |
| dev-branch nightly | U, I, B | nightly | warn on FAIL; report attached | 60 min | Jetson runner |
| weekly suite-e2e | U, I, B, E | weekly + on release branch | block release on FAIL | 180 min | Jetson runner + live siblings reachable from it |
| pre-release HW benchmark | HW | manual + pre-release | block release on FAIL | 240 min | Jetson runner + physical A40 + airframe SITL/HW |
Capacity note: the PR pipeline running on Jetson trades x86 throughput for execution honesty. If PR latency becomes painful, the team's mitigation is to add more Jetson runners — NOT to fall back to x86 for Tier B (that would defeat the choice).
## Hardware Execution Matrix
Per the local-only-on-Jetson decision, every tier runs on Jetson. The matrix below is collapsed accordingly: it records **what each scenario actually exercises on the Jetson** (which hardware surface is the load-bearing one) so that a runner-capacity planner can predict which scenarios contend for the same physical resource.
| Scenario | Tier | Jetson surface exercised | Concurrent-with constraint |
|---|---|---|---|
| FT-P-001 (D6 Tier-1 contract) | B + E | GPU (Tier 1 inference) | conflicts with NFT-RES-LIM-Re2 / GPU |
| FT-P-002 — FT-P-006 (D1–D5) | E + HW | GPU (Tier 1 inference) | as above |
| FT-P-007 — FT-P-010 (M1–M4) | B + E | CPU (movement) + GPU (Tier 1 inputs) | as above |
| FT-P-011 — FT-P-015 (S1–S5) | B + E | CPU + gimbal UDP + GPU (Tier 3 in S5) | gimbal contention serialises S1/S2/S3 |
| FT-P-016 — FT-P-022 (O1–O7, O8 happy) | B + E | CPU + operator-stream | low contention |
| FT-P-023 (R1 BIT pass) | B + E | every dep mocked | none |
| FT-N-001 — FT-N-002 (R2/R3) | B + E | none (storage seed manipulation) | none |
| FT-N-003 (Mp2 cache-fallback) | B + E | mock timeout on `missions-mock` | none |
| FT-N-004 (O4 below-threshold) | B | CPU only | none |
| FT-P-024 / FT-P-025 / FT-P-026 (Mp1/Mp3/Mp5) | B + E | network + persistent store | persistent-store contention serialises |
| NFT-PERF-L1 | **HW** | GPU (Tier 1) | dedicate runner — measurement integrity |
| NFT-PERF-L2 | HW + B | GPU (Tier 2) | conflicts with L1/L3/L8 — serialise |
| NFT-PERF-L3 | HW + B (vlm-mock) | GPU (Tier 3 VLM) | conflicts with L1/L2 — serialise |
| NFT-PERF-L4 | **HW** | A40 physical zoom motor | dedicate runner — physical motion |
| NFT-PERF-L5 | HW + B | CPU + gimbal UDP | serialise with L4/L8 |
| NFT-PERF-L6 / L7 | B + E | CPU + ego-motion + GPU (Tier 1 inputs) | serialise with L1 |
| NFT-PERF-L8 | HW + B | A40 physical zoom + Tier 1 GPU | dedicate runner |
| NFT-PERF-L9 | B + E | CPU + operator-stream | low contention |
| NFT-PERF-T1 | B | CPU + queue | none |
| NFT-PERF-T2 | B + E | airframe link | low |
| NFT-PERF-T3 | B | RTSP throttling + health | none |
| NFT-RES-R4–R9 | B + E | airframe link + persistent store | serialise per-mission |
| NFT-RES-Mp2 / Mp4 | B + E | network + persistent store | low |
| NFT-SEC-O9 / O10 | B + E | operator-stream + crypto path | low |
| NFT-SEC-CraftedFrame / OversizeCrop | B | decoder CPU | low |
| NFT-SEC-VlmSchemaViolation / FreeFormText | B (vlm-mock) | UDS IPC | low |
| NFT-SEC-IpcPeerAuth | B | UDS IPC + peer-cred | low |
| NFT-SEC-Tier1SchemaViolation | B | Tier-1 RPC | none |
| NFT-SEC-MavlinkUnsigned | B + E | airframe link (Q6 dep) | low |
| NFT-SEC-HealthExposesSecurity | B | counters + health | low |
| NFT-RES-LIM-Re1 | **HW** | full Jetson workload (RSS) | dedicate runner — measurement integrity |
| NFT-RES-LIM-Re2 | **HW** | Tier 1 + autopilot workload concurrent | runs back-to-back with NFT-PERF-L1 in same session |
| NFT-RES-LIM-Storage | B + HW | persistent store | low |
| NFT-RES-LIM-CPU | **HW** | full CPU | dedicate runner |
| NFT-RES-LIM-GPU | **HW** | GPU mutex (Tier 1 vs Tier 3) | dedicate runner |
| NFT-RES-LIM-FileHandles | B + HW | `/proc/<pid>/fd` | low |
**Bold Tier values** mark scenarios that REQUIRE physical Jetson + (sometimes) physical A40 to satisfy the project-level Acceptance Gate; surrogate replay does NOT count for those rows.
**Capacity rule**: scenarios marked `dedicate runner` MUST NOT run concurrently with any other scenario on the same Jetson — measurement integrity depends on the workload being exclusively the SUT.
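A runner-capacity planner could apply the capacity rule along these lines. The sketch is deliberately simplified: it only separates `dedicate runner` scenarios into batches of their own and does not model the per-row serialise constraints. The function `plan_batches` and its input shape are assumptions; the dedicated flags in the example mirror the matrix rows above.

```python
def plan_batches(scenarios):
    """Group scenarios into sequential batches for one Jetson runner.

    `scenarios` is a list of (test_id, dedicated) pairs. A dedicated scenario
    gets a batch of its own; the rest share a single batch in this model.
    """
    shared, batches = [], []
    for test_id, dedicated in scenarios:
        if dedicated:
            batches.append([test_id])
        else:
            shared.append(test_id)
    if shared:
        batches.insert(0, shared)
    return batches


batches = plan_batches([
    ("NFT-PERF-L9", False),
    ("NFT-PERF-L1", True),
    ("NFT-SEC-O9", False),
    ("NFT-RES-LIM-CPU", True),
])
```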
## Open dependencies that affect the harness
| Open Q | Affects | Default until resolved |
|---|---|---|
| Q6 (MAVLink-2 signing) | `mavlink-sitl` config + observed-MAVLink assertions | signing disabled; tests skip signing assertions until Q6 lands |
| Q8 (MapObjects conflict resolution) | Mp5 fixture shape | `<DEFERRED>` |
| Q9 (Operator-command auth scheme) | `operator-replay` envelope format + signature validator | `<DEFERRED>` for O9/O10; O8 runs the happy path only |
| Q11 (multi-operator session policy) | `operator-replay` session-id semantics | single-operator only |
| Q14 (movement-detection classical vs learned-CV) | M4 benchmark fixture shape | `<DEFERRED>` |
@@ -0,0 +1,270 @@
# Performance Tests
Authored by `/test-spec` Phase 2 (2026-05-19). Performance tests measure latency / rate / sustained-load characteristics. Functional behaviour that those characteristics enable lives in `blackbox-tests.md`. Resource ceilings live in `resource-limit-tests.md`.
Every scenario records steady-state metrics — cold-start measurements are explicitly excluded by a warm-up precondition. Pass criteria use the methods in `_docs/00_problem/input_data/expected_results/results_report.md` (referenced by row id).
---
## Latency
### NFT-PERF-L1: Tier-1 per-frame end-to-end latency ≤ 100 ms
**Summary**: Per-frame end-to-end latency through the Tier-1 contract (frame in → normalised-box record out) ≤ 100 ms at 1280 px input.
**Traces to**: AC `Latency — Primitive (Tier 1) object detection / L1`.
**Tier**: HW (representative Jetson Orin Nano Super) OR benchmarked replay (the only way to satisfy the project-level Acceptance Gate).
**Metric**: per-frame wall-clock from RTSP frame-receive timestamp to normalised-box emission timestamp.
**Preconditions**:
- Warm-up: 100 frames played before measurement starts (TensorRT engine warm, autopilot's frame pipeline in steady state).
- Single 1280 px frame replayed via `rtsp-loopback`; the live Tier-1 service is colocated on the same Jetson.
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Play `fixtures/images/4d6e1830d211ad50.jpg` as a 60 s loop at 30 fps | record per-frame (frame_receive_ts, normalised_box_emit_ts); compute Δms |
| 2 | Aggregate over the measurement window | report p50, p95, p99, max |
**Pass criteria**: `p95 ≤ 100 ms` AND `max ≤ 150 ms` (max gives a soft headroom; AC enforces the p95 line).
**Duration**: 60 s after warm-up.
**Test status**: READY (fixture present); Tier requires HW for the release gate.
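The aggregation and pass check for this scenario can be sketched as below. The spec does not fix a percentile definition, so nearest-rank is used here as an assumption; the function names and the synthetic timestamps are illustrative only and do not represent measured latencies.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile (one common definition; an assumption here)."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def evaluate_l1(samples, warmup_frames=100, p95_limit_ms=100.0, max_limit_ms=150.0):
    """`samples` is a list of (frame_receive_ts, normalised_box_emit_ts) in seconds."""
    # Drop the warm-up frames so only steady-state latency is measured.
    deltas = sorted((emit - recv) * 1000.0 for recv, emit in samples[warmup_frames:])
    stats = {
        "p50": percentile(deltas, 50),
        "p95": percentile(deltas, 95),
        "p99": percentile(deltas, 99),
        "max": deltas[-1],
    }
    stats["pass"] = stats["p95"] <= p95_limit_ms and stats["max"] <= max_limit_ms
    return stats


# Synthetic data: 100 slow warm-up frames, then 200 frames at 40 ms and 10 at 120 ms.
samples = [(t, t + 0.5) for t in range(100)]
samples += [(t, t + 0.040) for t in range(100, 300)]
samples += [(t, t + 0.120) for t in range(300, 310)]
stats = evaluate_l1(samples)
```

With this data the slow warm-up frames would fail both limits if they were counted, which is why the warm-up precondition is applied before any statistic is computed.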
---
### NFT-PERF-L2: Tier-2 per-ROI semantic confirmation ≤ 200 ms
**Summary**: Per-ROI latency through Tier-2 semantic confirmation ≤ 200 ms.
**Traces to**: AC `Latency — Semantic confirmation (Tier 2) / L2`.
**Tier**: HW + Tier-B (inline ROI crop generation).
**Metric**: per-ROI wall-clock from the ROI being submitted to Tier-2 until Tier-2 emits its semantic confirmation.
**Preconditions**:
- Warm-up: 50 ROIs processed before measurement.
- Test runner derives a ~640×640 ROI inline from `fixtures/images/4d6e1830d211ad50.jpg` and injects it directly into the SUT's Tier-2 entry (via a test-only ROI submission API exposed in test builds).
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Submit 1000 ROIs at 5 Hz | per-ROI Δms |
| 2 | Aggregate | p50, p95, p99 |
**Pass criteria**: `p95 ≤ 200 ms`.
**Duration**: 200 s.
**Test status**: READY.
---
### NFT-PERF-L3: Tier-3 deep-analysis ≤ 5 s per ROI
**Summary**: Per-ROI deep-analysis (Tier-3 / VLM, when enabled) ≤ 5 s.
**Traces to**: AC `Latency — Deep semantic confirmation (Tier 3 / VLM, when enabled) / L3`.
**Tier**: HW + Tier-B (vlm-mock).
**Metric**: per-ROI wall-clock from the SUT issuing a Tier-3 IPC call until the VLM response is received and schema-validated.
**Preconditions**:
- Warm-up: 5 Tier-3 calls.
- `vlm-mock` configured to respond from `vlm-io-pairs` fixture; Tier-3 enabled via SUT config.
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Trigger 100 Tier-3 calls via injected ROIs | per-call Δms |
| 2 | Aggregate | p50, p95, p99 |
**Pass criteria**: `p95 ≤ 5000 ms`.
**Duration**: as needed for 100 calls.
**Test status**: DEFERRED — `<DEFERRED: vlm-io-pairs (real I/O) and the pinned local VLM model>`.
---
### NFT-PERF-L4: Camera zoom transition (medium → high) ≤ 2 s
**Summary**: Wall-clock from issuing the medium→high zoom command to the physical zoom transition completing ≤ 2 s, including the 1–2 s physical floor (restriction).
**Traces to**: AC `Latency — Camera zoom transition / L4`, RESTRICT `Hardware — 40× optical zoom traversal takes 1–2 s wall-clock`.
**Tier**: HW (physical A40 OR benchmarked replay); pure-emulator runs are not acceptable per `expected_results/results_report.md → Notes on this spec`.
**Metric**: wall-clock from outbound zoom command (observed on gimbal UDP) to gimbal-mock zoom telemetry reporting target_zoom_band.
**Preconditions**:
- SUT in `ZoomedIn` mode after a sweep-to-zoom transition; gimbal at medium zoom.
- HW Jetson OR `gimbal-mock` replaying recorded A40 zoom telemetry with realistic traversal time.
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Trigger 30 medium→high zoom transitions via scripted POI sequence | per-transition Δms |
| 2 | Aggregate | p50, p95, max |
**Pass criteria**: `p95 ≤ 2000 ms`.
**Test status**: DEFERRED — `<DEFERRED: SITL or hardware-in-loop ViewPro A40 zoom command capture>`.
---
### NFT-PERF-L5: Decision-to-movement latency ≤ 500 ms
**Summary**: From the internal scan-control decision (POI detected mid-sweep) to the camera physically beginning to move ≤ 500 ms.
**Traces to**: AC `Latency — Decision-to-movement latency / L5`.
**Tier**: HW + Tier-B.
**Metric**: wall-clock from Tier-1 detection received at the scan-controller to first gimbal command observed on `gimbal-mock`.
**Preconditions**:
- Warm-up: 10 scripted POI events.
- Scripted scan-decision events followed by camera physical motion observed on the gimbal UDP channel.
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Inject 100 POI detections at random sweep positions | per-event Δms (detection-receive-ts → gimbal-command-out-ts) |
| 2 | Aggregate | p95 |
**Pass criteria**: `p95 ≤ 500 ms`.
**Test status**: DEFERRED — `<DEFERRED: scripted scan decision events with paired gimbal telemetry capture>`.
---
### NFT-PERF-L6: Movement candidate enqueue ≤ 1 s (wide sweep)
**Summary**: From the movement event in the visual stream to candidate enqueued for zoomed inspection ≤ 1 s during the wide-area sweep.
**Traces to**: AC `Latency — Movement candidate enqueue / L6`.
**Tier**: B + E.
**Metric**: wall-clock from ground-truth movement-event timestamp (annotated in the fixture) to candidate appearing on operator-stream.
**Preconditions**:
- Warm-up: 30 s of sweep playback.
- Synchronised RTSP + gimbal.csv + telemetry.csv (DEFERRED CSV pair).
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Replay `fixtures/movement/video01.mp4` + paired CSVs | record per-event Δms |
| 2 | Aggregate over ~20 movement events | p95 |
**Pass criteria**: `p95 ≤ 1000 ms`.
**Test status**: DEFERRED — `<DEFERRED: paired gimbal.csv + telemetry.csv for video01.mp4 with annotated movement-event timestamps>`.
---
### NFT-PERF-L7: Movement candidate enqueue ≤ 1.5 s (zoomed-in)
**Summary**: Same as L6 but during a zoomed-in hold; budget relaxed to 1.5 s to accommodate gimbal slew.
**Traces to**: AC `Latency — Movement candidate enqueue … during the zoomed-in inspection / L7`.
**Tier**: B + E.
**Metric**: same as L6 but starting from a ZoomedIn hold.
**Preconditions**:
- SUT in ZoomedIn hold; small mover appears mid-hold.
- DEFERRED zoomed-in CSV pair.
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Drive SUT into ZoomedIn hold; replay zoomed-in scene with small mover | per-event Δms |
| 2 | Aggregate over ~10 movement events | p95 |
**Pass criteria**: `p95 ≤ 1500 ms`.
**Test status**: DEFERRED — `<DEFERRED: paired gimbal.csv + telemetry.csv at zoomed-in band>`.
---
### NFT-PERF-L8: Zoom-out → zoom-in transition ≤ 2 s
**Summary**: From POI detected during sweep to ROI fully zoomed and held ≤ 2 s wall-clock.
**Traces to**: AC `Latency — Zoom-out → zoom-in transition / L8`.
**Tier**: HW + Tier-B.
**Metric**: wall-clock from Tier-1 detection injected → first frame at full zoom on the ROI (observed via gimbal-mock zoom telemetry and the operator-stream ROI overlay).
**Preconditions**:
- Warm-up.
- Scripted sweep + injected POI.
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Inject 30 mid-sweep POIs | per-transition Δms |
| 2 | Aggregate | p95 |
**Pass criteria**: `p95 ≤ 2000 ms`.
**Test status**: DEFERRED — `<DEFERRED: sweep → zoomed-inspection transition capture with annotated transition-complete timestamps>`.
---
### NFT-PERF-L9: Operator command → action ≤ 500 ms
**Summary**: From operator click event (entering the SUT on the operator-stream return path) to the corresponding outbound command observed on its destination channel ≤ 500 ms; modem RTT is explicitly excluded by measuring on the SUT side of the modem.
**Traces to**: AC `Latency — Operator command → action / L9`.
**Tier**: B + E.
**Metric**: wall-clock from operator-stream message arrival at SUT → first outbound command observed on the affected channel (MAVLink waypoint POST, gimbal command, mode-change emission).
**Preconditions**:
- Operator-session-scripts include click events at deterministic offsets.
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Replay scripted operator-click sequence (50 clicks across confirm / decline / target-follow / abort) | per-click Δms |
| 2 | Aggregate | p95 |
**Pass criteria**: `p95 ≤ 500 ms`.
**Test status**: DEFERRED — `<DEFERRED: operator-envelopes once Q9 resolves>` for signed commands; happy-path placeholder usable today for an early measurement (mark interim baseline only).
---
## Throughput / Rate
### NFT-PERF-T1: POI rate to operator capped at ≤ 5 / min
**Summary**: Even when Tier-1 produces detections faster than the cap, the rate of POIs SURFACED to the operator MUST stay ≤ 5 / min (hard cap, frozen 2026-05-06).
**Traces to**: AC `Throughput / Rate — POI rate surfaced to the operator / T1`.
**Tier**: B.
**Metric**: count of POIs emitted on operator-stream per rolling 60 s window.
**Preconditions**:
- Synthetic POI feed sustained at 20 POIs / min via `synthetic-poi-feeds`.
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Inject sustained 20 POI/min feed for 10 minutes | per-minute count of surfaced POIs |
| 2 | Compute max over any rolling 60 s window | rolling-max |
**Pass criteria**: `rolling-max ≤ 5` POIs/min for every 60 s window.
**Duration**: 10 min.
**Test status**: READY (synthetic feeds inline-authorable).
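
The rolling-max measurement in step 2 can be sketched with a two-pointer sliding window. This is only an illustration under stated assumptions: the function name is hypothetical, timestamps are taken to be milliseconds sorted ascending, and the window is treated as half-open.

```rust
/// Largest number of events inside any half-open window [t, t + window_ms).
/// Assumption: `timestamps_ms` is sorted ascending (milliseconds).
fn rolling_max(timestamps_ms: &[u64], window_ms: u64) -> usize {
    if window_ms == 0 {
        return 0;
    }
    let mut best: usize = 0;
    let mut start: usize = 0; // index of the oldest event still inside the window
    for (end, &t) in timestamps_ms.iter().enumerate() {
        // Drop events that are a full window (or more) older than the current one.
        while t - timestamps_ms[start] >= window_ms {
            start += 1;
        }
        best = best.max(end - start + 1);
    }
    best
}
```

The pass criterion then reduces to `rolling_max(&surfaced_ts, 60_000) <= 5`, where `surfaced_ts` (a hypothetical name) holds the operator-stream emission times of surfaced POIs.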
---
### NFT-PERF-T2: Position telemetry rate ∈ [1 Hz, 10 Hz]
**Summary**: The position telemetry the SUT consumes from the airframe link MUST sustain ≥1 Hz, target 10 Hz, over a 60 s window.
**Traces to**: AC `Throughput / Rate — Position telemetry rate / T2`.
**Tier**: B (with MAVLink replay) + E (live SITL).
**Metric**: count of `GLOBAL_POSITION_INT` messages consumed by the SUT per second.
**Preconditions**:
- MAVLink stream replayed at the configured target rate (10 Hz).
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Replay 60 s of GLOBAL_POSITION_INT at 10 Hz | per-second consumed count |
| 2 | Aggregate | min, mean |
**Pass criteria**: `min ≥ 1 Hz` AND `mean ≥ 9.5 Hz` (target 10 Hz with ≤ 5 % tolerance).
**Test status**: DEFERRED — `<DEFERRED: MAVLink replay fixture over a 60 s window>`.
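
One possible shape for the per-second count and the `min` / `mean` gate is sketched below. The names are hypothetical, and the bucketing into fixed 1 s bins aligned to the window start is an assumption; the spec only says "per-second consumed count".

```rust
/// Buckets message arrival times (ms since window start) into 1 s bins and
/// returns (min, mean) messages per second over `window_s` seconds.
/// Assumption: bins are aligned to the start of the measurement window.
fn rate_stats(arrivals_ms: &[u64], window_s: usize) -> Option<(u32, f64)> {
    if window_s == 0 {
        return None;
    }
    let mut bins = vec![0u32; window_s];
    for &t in arrivals_ms {
        let idx = (t / 1000) as usize;
        if idx < window_s {
            bins[idx] += 1;
        }
    }
    let min = *bins.iter().min()?;
    let mean = bins.iter().sum::<u32>() as f64 / window_s as f64;
    Some((min, mean))
}

/// Gate from the pass criteria: min >= 1 Hz AND mean >= 9.5 Hz.
fn t2_passes(arrivals_ms: &[u64], window_s: usize) -> bool {
    matches!(rate_stats(arrivals_ms, window_s), Some((min, mean)) if min >= 1 && mean >= 9.5)
}
```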
---
### NFT-PERF-T3: Frame-rate floor → suppress zoom-in + health yellow
**Summary**: When the sustained camera frame rate drops below 10 fps for ≥5 s, zoom-in transitions MUST be suppressed AND overall health MUST surface yellow.
**Traces to**: AC `Throughput / Rate — Sustained camera frame-rate floor / T3`.
**Tier**: B.
**Metric**: pair: (boolean — was a zoom-in suppressed during the low-FPS window?), (boolean — did health surface yellow?).
**Preconditions**:
- SUT in normal sweep mode.
- `rtsp-loopback` plays `fixtures/videos/94d42580bd1ad6ff.mp4` with throttled decode injecting frame drops to keep FPS < 10 for ≥ 5 s.
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Start playback at normal 30 fps | health remains green; zoom-in proceeds normally on detection |
| 2 | Throttle decode + drop frames to push FPS below 10 for ≥ 5 s | record: (a) whether a zoom-in-required event during this window was suppressed; (b) whether `GET /health` returns `overall == "yellow"` |
**Pass criteria**: both observations TRUE.
**Duration**: 30 s (5 s low-FPS window + buffer).
**Test status**: READY (fixture present; throttling implemented by consumer).
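
To confirm that the throttled playback really produced the precondition (FPS below 10 for at least 5 s), the consumer could check its own FPS trace before evaluating the two observations. A minimal sketch, assuming one FPS sample per second and a hypothetical function name:

```rust
/// Returns true if the per-second FPS trace contains a run of at least
/// `min_run_s` consecutive samples strictly below `floor_fps`.
/// Assumption: the trace holds exactly one sample per wall-clock second.
fn sustained_below_floor(fps_per_second: &[f64], floor_fps: f64, min_run_s: usize) -> bool {
    let mut run = 0usize;
    for &fps in fps_per_second {
        run = if fps < floor_fps { run + 1 } else { 0 };
        if run >= min_run_s {
            return true;
        }
    }
    false
}
```

A trace that fails this check means the fault was never injected as specified, so the run would be treated as invalid rather than as a pass or a fail.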
---
## Sustained-load (handoff to resource-limit-tests)
The two sustained-resource AC rows (Re1, Re2) live as resource-limit tests rather than performance tests because the pass criterion is "stays within ceiling for the duration", not "is fast enough":
- Re1 — combined RSS ≤ 6 GB onboard for everything autopilot owns — see `resource-limit-tests.md → NFT-RES-LIM-Re1`.
- Re2 — Tier-1 per-frame latency Δ ≤ 5 ms when autopilot's workload runs concurrently — see `resource-limit-tests.md → NFT-RES-LIM-Re2`. Re2 is the Tier-1 non-degradation contract; the absolute Tier-1 latency target is L1.
---
## Common preconditions for every performance scenario
- **Warm-up**: every scenario MUST include an explicit warm-up phase whose duration is recorded in the CSV report. This separates cold-start cost from steady-state behaviour.
- **Steady-state window**: pass criteria apply only to the steady-state window (after warm-up), not to the warm-up itself.
- **Hardware honesty**: scenarios that name Tier HW MUST run on a representative Jetson Orin Nano Super OR on a benchmarked replay. Pure-x86-emulator runs report results but do NOT contribute to the project-level Acceptance Gate.
- **Concurrent workload disclosure**: every scenario records whether other autopilot subsystems were running concurrently (Tier-1 inference, VLM, MAVLink, etc.). Re2 is the only scenario that REQUIRES concurrent workload; the others MUST report it for context.
- **Seed + determinism**: where test inputs involve randomness (e.g., synthetic-POI ordering tie-breakers), the seed is captured in the CSV report.
@@ -0,0 +1,196 @@
# Resilience Tests
Authored by `/test-spec` Phase 2 (2026-05-19). Resilience tests inject a fault, observe behaviour during the fault, observe recovery behaviour, and assert against both. The fault and the recovery contract are both quantifiable.
BIT pre-flight pathways (positive R1, negatives R2/R3) are in `blackbox-tests.md` because they assert a functional gate. The runtime fault scenarios live here.
---
### NFT-RES-R4: Lost operator/Ground-Station link → RTL at 30 s grace (default)
**Summary**: Sustained loss of the operator/Ground-Station radio link MUST trigger an RTL exactly at the configured grace window (default 30 s), and operator-link health MUST flip red.
**Traces to**: AC `Reliability & Safety — Loss of operator/Ground-Station radio link MUST trigger a known mission-safe outcome / R4`, RESTRICT `Reliability & Safety — Lost operator-link failsafe MUST be deterministic and bounded`.
**Tier**: B + E.
**Preconditions**:
- SUT mid-flight (scripted MAVLink stream + active operator session).
- Operator session in steady state for ≥ 30 s before fault injection.
- Grace window configured to default 30 s.
**Fault injection**:
- `operator-replay` issues `lost-link` event at T=0 and STAYS silent (no reconnect) for the remainder of the window.
| Step | Action | Expected Behaviour |
|---|---|---|
| 1 | Inject lost-link event at T=0 | health endpoint immediately shows `deps.operator_link == "red"`; `last_seen_at` frozen |
| 2 | Wait 25 s (within grace) | NO RTL command yet on `mavlink-sitl`; SUT continues mission |
| 3 | Wait until T=30 s | RTL command observed on `mavlink-sitl` at T = 30 s ± 1 s; operator-stream emits a `failsafe_triggered` event with reason `operator_link_lost` |
| 4 | Optionally reconnect operator-replay after RTL | RTL persists (operator cannot un-RTL silently — requires explicit operator override per AC); health.operator_link transitions back to green when traffic resumes |
**Pass criteria**: RTL command at T = 30 s ± 1 s (`exact` with ± 1 s tolerance), `exact` operator-link red.
**Recovery time bound**: RTL must be issued within 31 s of fault start.
**Test status**: READY (operator-session-scripts inline-authorable; mavlink-sitl runs an ArduPilot SITL accepting RTL).
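
The timing assertion (RTL at T = 30 s ± 1 s, and not before) could be expressed as a small classifier. This is a sketch under assumptions: the enum and function names are hypothetical, and the RTL timestamp is taken as milliseconds since the injected lost-link event.

```rust
#[derive(Debug, PartialEq)]
enum RtlVerdict {
    Pass,
    TooEarly,
    TooLate,
    NeverIssued,
}

/// Classifies the observed RTL time against the grace window.
/// `rtl_at_ms` is measured from the lost-link event (T = 0); None means no
/// RTL command was seen on the airframe link during the observation window.
fn check_rtl_timing(rtl_at_ms: Option<u64>, grace_ms: u64, tolerance_ms: u64) -> RtlVerdict {
    match rtl_at_ms {
        None => RtlVerdict::NeverIssued,
        Some(t) if t < grace_ms.saturating_sub(tolerance_ms) => RtlVerdict::TooEarly,
        Some(t) if t > grace_ms + tolerance_ms => RtlVerdict::TooLate,
        Some(_) => RtlVerdict::Pass,
    }
}
```

Distinguishing `TooEarly` from `TooLate` keeps step 2 (no RTL within grace) and the 31 s recovery bound separately diagnosable in the report.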
---
### NFT-RES-R5: Battery at RTL-floor → RTL
**Summary**: When the airframe battery sample drops to the configured RTL floor (e.g. 25 %), the SUT MUST issue an RTL and health MUST surface yellow.
**Traces to**: AC `Reliability & Safety — Battery at or below the configured RTL floor / R5`.
**Tier**: B + E.
**Preconditions**:
- SUT mid-flight; battery telemetry replayed via `mavlink-sitl` at 1 Hz.
**Fault injection**:
- `mavlink-sitl` scripted battery curve: starts at 80 %; ramps down to 25 % at T=T0; held at 25 % afterwards.
| Step | Action | Expected Behaviour |
|---|---|---|
| 1 | At T=T0, battery reads 25 % | within 1 sample period (1 s) the SUT issues RTL on `mavlink-sitl`; health transitions to `overall == "yellow"`; operator-stream emits `failsafe_triggered` with reason `battery_rtl_floor` |
| 2 | Battery continues at 25 % | RTL persists; no oscillation |
**Pass criteria**: `exact (RTL command observed)` + `exact (health.overall == "yellow")`.
**Test status**: DEFERRED — `<DEFERRED: mid-flight battery sample at RTL-floor via mavlink-sitl battery curve script>`.
---
### NFT-RES-R6: Battery at hard floor → land-now
**Summary**: When the battery hits the configured hard floor (e.g. 15 %), the SUT MUST issue land-now and ONLY an authenticated operator command may override.
**Traces to**: AC `Reliability & Safety — battery at or below the hard floor / R6`.
**Tier**: B + E.
**Preconditions**:
- SUT mid-flight; battery ramps to 15 %.
**Fault injection**: same as R5 but ramp continues to 15 %.
| Step | Action | Expected Behaviour |
|---|---|---|
| 1 | At T=T0, battery reads 15 % | within 1 sample period the SUT issues land-now (`MAV_CMD_NAV_LAND` or equivalent) on `mavlink-sitl`; health red; operator-stream emits `failsafe_triggered` with reason `battery_hard_floor` |
| 2 | Replay an UNAUTHENTICATED operator-override command | SUT REFUSES; land-now persists |
| 3 | Replay an AUTHENTICATED operator-override (placeholder until Q9; full once Q9 resolves) | land-now cancelled; SUT returns to prior mode |
**Pass criteria**: `exact (land_now observed)`; `exact (refusal of unauthenticated override)`; `exact (acceptance of authenticated override)`.
**Test status**: DEFERRED — same fixture gap as R5; step 3's full authentication semantics also `<DEFERRED: Q9>`.
---
### NFT-RES-R7: Airframe link exhaustion → health red after max-retry
**Summary**: When MAVLink commands fail through the configured bounded-retry budget (no airframe response), the airframe-link dependency MUST flip health red.
**Traces to**: AC `Reliability & Safety — MAVLink command exhaustion (bounded retry with exponential backoff fails through max-retry) / R7`.
**Tier**: B + E.
**Preconditions**:
- SUT mid-flight; max-retry configured (e.g., 5 attempts; exponential backoff base 100 ms).
**Fault injection**:
- `mavlink-sitl` configured to drop all command-ack messages for the duration of the test (peer non-responsive).
| Step | Action | Expected Behaviour |
|---|---|---|
| 1 | SUT issues a MAVLink command (e.g., waypoint upload) | command sent; no ack received |
| 2 | Backoff + retry loop executes through max-retry | retries observed on the wire with exponential backoff |
| 3 | After final retry exhausts | health.airframe_link transitions to red; operator-stream emits a `dependency_degraded` event with reason `airframe_link_retry_exhausted` |
**Pass criteria**: `exact (health.airframe_link == "red")` after max-retry; retries observed with backoff base 100 ms ± 20 ms.
**Test status**: DEFERRED — `<DEFERRED: airframe link command + bounded retry/backoff with peer not responding through max-retries>`.
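
The wire-level backoff assertion might be checked as sketched below. Two assumptions are made loudly here because the spec does not state them: the backoff factor is taken to be 2 (doubling per retry), and the ± 20 ms tolerance is applied to the base implied by each gap rather than to the gap itself. The function name is hypothetical.

```rust
/// Checks observed send times (ms) of one command against an exponential
/// retry schedule. The first entry is the original attempt; the rest are retries.
/// Assumption: gap i is expected to be base_ms * 2^i (doubling backoff).
fn backoff_ok(send_times_ms: &[u64], max_retries: usize, base_ms: f64, tol_ms: f64) -> bool {
    // The scenario expects the retry budget to be used up exactly: no more, no fewer.
    if send_times_ms.len() != max_retries + 1 {
        return false;
    }
    send_times_ms.windows(2).enumerate().all(|(i, pair)| {
        let gap = pair[1] as f64 - pair[0] as f64;
        let implied_base = gap / 2f64.powi(i as i32);
        (implied_base - base_ms).abs() <= tol_ms
    })
}
```

Requiring exactly `max_retries` retries also covers the bounded-behaviour invariant: one extra send on the wire fails the check.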
---
### NFT-RES-R8: Wall-clock drift > 200 ms → time-source yellow
**Summary**: When wall-clock drift versus GPS or NTP source exceeds 200 ms, the time-source dependency MUST report yellow, AND `clock_source` + `last_sync_at` MUST reflect the drift.
**Traces to**: AC `Reliability & Safety — Wall-clock drift greater than 200 ms / R8`, RESTRICT `Wall-clock MUST be bound to GPS time once GPS is locked, or NTP at boot`.
**Tier**: B.
**Preconditions**:
- SUT running with `time-injector` LD_PRELOAD active.
- GPS source initially locked via `mavlink-sitl` GPS_RAW_INT messages.
**Fault injection**:
- `time-injector` advances the SUT process clock by 250 ms over a 1 s window while keeping GPS source locked.
| Step | Action | Expected Behaviour |
|---|---|---|
| 1 | Bind clock to GPS at boot | health.time_source == green; `clock_source == "gps"`; `last_sync_at` recent |
| 2 | Inject 250 ms drift | within 5 s health.time_source transitions to yellow; `clock_source` and `last_sync_at` updated to reflect the drift |
| 3 | Stop drift | health.time_source returns to green within the next sync cycle |
**Pass criteria**: `exact (health.time_source == "yellow")` during step 2; `exact (clock_source updated)` + `exact (last_sync_at updated)`.
**Test status**: READY (time-drift-scripts inline-authorable).
---
### NFT-RES-R9: Geofence EXCLUSION crossing → waypoint refusal + RTL
**Summary**: When a simulated waypoint crosses an EXCLUSION polygon, the SUT MUST refuse the waypoint AND trigger RTL. Symmetric behaviour for INCLUSION violations.
**Traces to**: AC `Reliability & Safety — Geofence INCLUSION and EXCLUSION violations MUST both result in waypoint refusal + RTL / R9`, RESTRICT `Geofence enforcement MUST be symmetric`.
**Tier**: B + E.
**Preconditions**:
- SUT mid-flight; geofence INCLUSION + EXCLUSION polygons loaded as part of the mission.
**Fault injection**:
- Scripted waypoint upload that crosses the EXCLUSION polygon; subsequently INCLUSION-exit test.
| Step | Action | Expected Behaviour |
|---|---|---|
| 1 | Upload waypoint crossing EXCLUSION polygon | SUT refuses the waypoint; structured-log WARN with `geofence_violation_exclusion`; RTL command observed on `mavlink-sitl` |
| 2 | Reset; upload waypoint exiting the INCLUSION polygon | identical behaviour — refused + RTL |
**Pass criteria**: `exact (waypoint rejected)` + `exact (RTL command observed)` for both EXCLUSION and INCLUSION cases.
**Test status**: DEFERRED — `<DEFERRED: geofence EXCLUSION polygon crossed by simulated waypoint via mavlink-sitl scripted mission>`.
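
For authoring the scripted waypoints, a test-side oracle for the symmetric rule can be useful. The sketch below is an assumption-heavy simplification, not the SUT's geofence logic: it treats coordinates as planar, tests only the waypoint position (not the path segment leading to it), and uses hypothetical names throughout.

```rust
/// (x, y) in a local planar frame. Assumption: the area is small enough
/// that a planar approximation is acceptable for fixture authoring.
type Point = (f64, f64);

/// Ray-casting point-in-polygon test; the polygon is implicitly closed.
fn contains(polygon: &[Point], p: Point) -> bool {
    let mut inside = false;
    let n = polygon.len();
    for i in 0..n {
        let (xi, yi) = polygon[i];
        let (xj, yj) = polygon[(i + n - 1) % n];
        // Does the edge (j -> i) straddle the horizontal ray through p?
        if (yi > p.1) != (yj > p.1) {
            let x_cross = xi + (p.1 - yi) * (xj - xi) / (yj - yi);
            if p.0 < x_cross {
                inside = !inside;
            }
        }
    }
    inside
}

#[derive(Debug, PartialEq)]
enum GeofenceVerdict {
    Accept,
    RefuseAndRtl,
}

/// Symmetric rule: leaving the INCLUSION polygon and entering any EXCLUSION
/// polygon produce the same verdict.
fn check_waypoint(inclusion: &[Point], exclusions: &[Vec<Point>], wp: Point) -> GeofenceVerdict {
    let violates = !contains(inclusion, wp) || exclusions.iter().any(|ex| contains(ex, wp));
    if violates {
        GeofenceVerdict::RefuseAndRtl
    } else {
        GeofenceVerdict::Accept
    }
}
```

Both violation kinds map to a single `RefuseAndRtl` variant on purpose, so the oracle cannot encode asymmetric expectations by accident.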
---
### NFT-RES-Mp2: Map-pull timeout → cache-fallback (functional coverage in FT-N-003)
**Summary**: When the pre-flight map pull times out, the SUT falls back to last-known cached MapObjects and surfaces `map_sync == "cached_fallback"` with an operator-ack gate. (Functional gate semantics are tested in `blackbox-tests.md → FT-N-003`; this scenario adds the **timing+recovery** dimension.)
**Traces to**: AC `Map Reconciliation — Cache-fallback on timeout / Mp2`.
**Tier**: B.
**Preconditions**:
- `autopilot-state` seeded with a known prior MapObjects snapshot.
- `missions-mock` configured to time out on `GET /missions/{id}/mapobjects` for a configurable duration.
**Fault injection**:
- `missions-mock` returns 504 / silent timeout for 60 s; then responds normally.
| Step | Action | Expected Behaviour |
|---|---|---|
| 1 | Trigger BIT | SUT issues `GET /missions/{id}/mapobjects`; observes timeout (per its configured request timeout); within 5 s falls back to cached snapshot; `map_sync == "cached_fallback"`; BIT requires explicit operator ack (see FT-N-003) |
| 2 | Mock recovers (responds normally) | next periodic resync re-attempts; once successful, `map_sync == "live"`; structured-log INFO `map_resync_recovered` |
**Pass criteria**: `exact (cached_fallback within 5 s of timeout)`; recovery within the next resync cycle.
**Test status**: READY (the only external fixture is `mission-suite-fixture` (DEFERRED), used as the cached snapshot seed; a minimal-scale snapshot can be authored inline instead).
---
### NFT-RES-Mp4: Post-flight map-push 5xx → persist + bounded retry + operator warning
**Summary**: When the post-flight `POST /missions/{id}/mapobjects` returns 5xx, the pending diff MUST be persisted on durable on-device storage, an operator-visible warning MUST surface, AND bounded retry MUST execute (capped at the configured retry limit).
**Traces to**: AC `Map Reconciliation — Failure MUST persist the pending diff to durable on-device storage with bounded retry / Mp4`, RESTRICT `On-device storage MUST be bounded`.
**Tier**: B + E.
**Preconditions**:
- SUT post-landing; pending diff ready to push.
- `missions-mock` configured to return 5xx N times then 200.
**Fault injection**:
- `missions-mock` returns 503 for the first N attempts (N = configured retry-cap + 1); then returns 200.
| Step | Action | Expected Behaviour |
|---|---|---|
| 1 | Trigger post-flight reconciliation | SUT issues `POST /missions/{id}/mapobjects`; receives 503 |
| 2 | Observe persistence | pending diff file exists under `autopilot-state/pending_map_diff/<mission-id>.json`; size > 0 |
| 3 | Observe operator-stream | warning event `map_push_failed` surfaced |
| 4 | Observe retry loop | retries observed within the configured cap; backoff with jitter |
| 5 | After retry-cap reached without success | SUT stops retrying; pending file remains for next session pickup |
| 6 | Eventual success (mock returns 200) | next attempt succeeds; pending file removed; warning cleared |
**Pass criteria**: `exact (pending file exists)` + `exact (warning surfaced)` + `threshold_max (retries ≤ configured cap)`.
**Test status**: DEFERRED — `<DEFERRED: same fixture as Mp3 (60-minute pass diff)>`.
---
## Recovery-time invariants common to every scenario
- **No silent error swallowing.** Every fault scenario MUST observe a corresponding structured-log entry at WARN+ AND a corresponding health-endpoint transition. A fault that the SUT handles without surfacing through both channels is a TEST FAILURE per `security_approach.md → "No silent error swallowing for security-relevant failures"` (extended here to operational faults per `coderule.mdc → "Never suppress errors silently"`).
- **Bounded behaviour.** Every retry/backoff loop MUST be bounded — the scenario asserts the cap on retry count and the cap on backoff window. Open-ended retry is a test failure.
- **State integrity post-recovery.** After fault recovery (when applicable), the scenario asserts that the SUT returns to a known state — mode unchanged unless the fault legitimately altered it (e.g., RTL stays RTL until operator override).
- **Symmetry assertions.** R9 explicitly tests both INCLUSION and EXCLUSION because the AC names symmetric behaviour. Wherever an AC pairs two outcomes (`fail-fast` + `fail-closed`, `red` + `yellow`, etc.), the resilience scenario MUST cover both halves.
@@ -0,0 +1,156 @@
# Resource Limit Tests
Authored by `/test-spec` Phase 2 (2026-05-19). Resource-limit tests assert that the SUT stays within a quantified resource ceiling for the configured duration. Short bursts do not satisfy these tests — every scenario has an explicit sustained-monitoring window.
---
### NFT-RES-LIM-Re1: Combined onboard RSS ≤ 6 GB sustained
**Summary**: Combined process RSS on the deployed compute device for everything autopilot owns onboard (excluding Tier 1) MUST stay ≤ 6 GB throughout a 5-minute steady-state window with the full onboard workload active.
**Traces to**: AC `Resources & Data — Combined RSS on the deployed compute device, for everything autopilot owns onboard (excluding Tier 1), MUST stay within ≤ 6 GB / Re1`, RESTRICT `Hardware — Compute device: Jetson Orin Nano Super, 8 GB shared LPDDR5; Tier 1 consumes ~2 GB, leaving ~6 GB for autopilot`.
**Tier**: HW (representative Jetson Orin Nano Super); pure-x86 runs are informational only and do NOT satisfy the project-level Acceptance Gate.
**Preconditions**:
- Full onboard workload active: frame ingest from `rtsp-loopback`, Tier-2 + Tier-3 (when enabled) inferring at the documented steady-state load, gimbal commands flowing, MAVLink stream consumed at 10 Hz, operator-stream connected, MapObjects store hydrated for a 30×30 km region.
- Warm-up: 60 s before measurement starts (any first-load model warm-up complete).
- Tier-1 process is RUNNING in parallel but its RSS is EXCLUDED from the measurement (the AC scope is autopilot-owned RSS, excluding Tier 1).
**Monitoring**:
- Cgroup-level RSS for every process the SUT owns (the SUT binary plus any child processes it spawns — e.g., the VLM IPC peer if it lives in autopilot's cgroup), sampled at 1 Hz.
- Cgroup-level RSS for Tier 1 sampled at the same cadence (for the Re2 cross-reference).
- Per-process RSS captured to `reports/<run-id>/rss-trace.csv` for forensic review on failure.
**Duration**: 5 minutes of measurement after warm-up.
**Pass criteria**:
- `threshold_max`: per 1 s sample, `sum(autopilot_owned_RSS) ≤ 6 GB`.
- No single 1 s sample exceeds the ceiling.
- (Reporting only — not pass/fail): peak RSS, mean RSS, P95 RSS recorded in the CSV report.
**Test status**: DEFERRED — `<DEFERRED: long-running scenario harness exercising the full onboard workload for 5 min; inline-authorable but requires that the SUT be operational end-to-end first>`.
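
The per-sample ceiling check can be sketched as a scan over the RSS trace. The struct and function names are hypothetical, and reading "6 GB" as 6 GiB is an assumption; the spec does not say whether the ceiling is decimal or binary.

```rust
/// Assumption: the 6 GB ceiling is interpreted as 6 GiB.
const GIB: u64 = 1024 * 1024 * 1024;

/// One 1 Hz sample: RSS in bytes for each autopilot-owned process.
/// Tier-1 RSS is assumed to be tracked separately and is not included here.
struct RssSample {
    t_s: u64,
    rss_bytes: Vec<u64>,
}

/// Returns the timestamps of every sample whose summed RSS exceeds the ceiling.
fn rss_violations(samples: &[RssSample], ceiling_bytes: u64) -> Vec<u64> {
    samples
        .iter()
        .filter(|s| s.rss_bytes.iter().sum::<u64>() > ceiling_bytes)
        .map(|s| s.t_s)
        .collect()
}
```

An empty result is the pass condition; returning the offending timestamps rather than a boolean makes it easier to line failures up against the per-process trace.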
---
### NFT-RES-LIM-Re2: Tier-1 non-degradation under autopilot workload
**Summary**: When autopilot's full onboard workload runs concurrently with Tier 1 on the same Jetson, Tier-1 per-frame latency MUST NOT degrade by more than ± 5 ms versus the Tier-1-alone baseline (recorded by NFT-PERF-L1).
**Traces to**: AC `Resources & Data — Tier 1 per-frame latency MUST NOT degrade by more than ± 5 ms when autopilot's own onboard workload is running concurrently / Re2`, RESTRICT `Tier 1 (YOLO) and any local large model with GPU memory pressure share the Jetson GPU — only one of them may execute at any wall-clock instant`.
**Tier**: HW (the only meaningful environment for this assertion — GPU contention behaviour does not reproduce on x86).
**Preconditions**:
- NFT-PERF-L1 has been run on the same HW configuration in the SAME session and a baseline `tier1_baseline_p95_ms` recorded.
- Full onboard workload active (same as Re1).
**Monitoring**:
- Tier-1 per-frame latency sampled per frame for the duration of the test.
- The same metric source as NFT-PERF-L1 — for direct delta comparison.
**Duration**: 5 minutes of measurement after warm-up (matches Re1 window so both can run in the same session).
**Pass criteria**:
- `numeric_tolerance`: `|p95(tier1_with_autopilot) - tier1_baseline_p95_ms| ≤ 5 ms`.
- (Reporting only): mean, P95, max delta over the window.
**Test status**: DEFERRED — same fixture dependency as Re1; requires SUT operational + Tier 1 colocated on HW.
---
### NFT-RES-LIM-Storage: On-device persistent store stays under 95 % for in-flight operation
**Summary**: During a steady-state mission run (no abnormal load), the on-device persistent store MUST NOT exceed 95 % full. This protects the takeoff gate (R3) from being silently violated mid-mission and protects the post-flight push (Mp4) from running out of room to persist a failed diff.
**Traces to**: AC `Reliability & Safety — On-device storage MUST be bounded` (via R3 BIT gate), RESTRICT `On-device storage MUST be bounded`.
**Tier**: B + HW.
**Preconditions**:
- SUT mid-flight; persistent store at typical post-takeoff utilisation (e.g. 30 %).
- Normal-operation event volume: telemetry persistence, ignored-item appends, pending map-diff buffer (empty in this scenario).
**Monitoring**:
- Volume utilisation sampled at 10 Hz throughout the duration.
**Duration**: 60 minutes (representative mission duration per Mp3).
**Pass criteria**:
- `threshold_max`: `volume_used / volume_total ≤ 0.95` at every sample point.
- On approach to 85 %: structured-log INFO `storage_pressure` with current utilisation.
- On approach to 90 %: structured-log WARN with current utilisation; health.storage transitions to yellow.
- On 95 %: health.storage transitions to red; the SUT begins its documented eviction policy (this scenario does NOT test the policy semantics — that belongs to its own scenario; this scenario only asserts the policy IS triggered).
**Test status**: READY (no external fixture beyond the SUT itself; the persistent-store seed file controls starting utilisation).
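
The threshold ladder in the pass criteria can be written as a pure function that the test uses as its oracle for each utilisation sample. The names are hypothetical, and reading "on approach to 85 %" as "at or above 85 %" is an assumption.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum StorageSignal {
    Quiet,         // below 85 %: no signal expected
    LogInfo,       // >= 85 %: structured-log INFO `storage_pressure`
    LogWarnYellow, // >= 90 %: structured-log WARN, health.storage yellow
    Red,           // >= 95 %: health.storage red, eviction policy triggered
}

/// Expected signal for a given utilisation. Integer arithmetic avoids
/// floating-point edge cases exactly at the thresholds.
fn storage_signal(used_bytes: u64, total_bytes: u64) -> StorageSignal {
    if total_bytes == 0 {
        return StorageSignal::Red; // assumption: an unusable volume is treated as full
    }
    let used = used_bytes as u128 * 100;
    let total = total_bytes as u128;
    if used >= total * 95 {
        StorageSignal::Red
    } else if used >= total * 90 {
        StorageSignal::LogWarnYellow
    } else if used >= total * 85 {
        StorageSignal::LogInfo
    } else {
        StorageSignal::Quiet
    }
}
```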
---
### NFT-RES-LIM-CPU: CPU headroom for the Tier-1 colocation guarantee
**Summary**: Combined CPU utilisation of every autopilot-owned process MUST leave enough Jetson CPU headroom for Tier 1 to keep its NFT-PERF-L1 budget. Concretely: per-second sustained CPU usage by autopilot-owned processes MUST stay ≤ the configured budget (default 60 % of total CPU cycles measured at the cgroup level) for the duration of the run.
**Traces to**: AC `Resources & Data — Tier 1 per-frame latency MUST NOT degrade by more than ± 5 ms / Re2` (CPU-side mechanism backing Re2), RESTRICT `Hardware — Jetson Orin Nano Super`.
**Tier**: HW (CPU contention does not reproduce on x86).
**Preconditions**:
- Same workload as Re1 + Re2.
**Monitoring**:
- Cgroup CPU usage at 1 Hz.
**Duration**: 5 minutes after warm-up.
**Pass criteria**:
- `threshold_max`: per 1 s sample, `sum(autopilot_cpu_usage) ≤ 60 %` of total CPU.
- Reporting: mean, P95, max.
**Test status**: DEFERRED — same dependency as Re1/Re2.
---
### NFT-RES-LIM-GPU: GPU mutual exclusion contract (Tier 1 vs local large model)
**Summary**: Per RESTRICT (`Tier 1 (YOLO) and any local large model with GPU memory pressure share the Jetson GPU — only one of them may execute at any wall-clock instant`), the SUT MUST NOT issue a GPU compute call (e.g. Tier-3 VLM inference) while Tier 1 is executing on the GPU. The serialisation MUST be observable: at most one GPU owner is active at any instant.
**Traces to**: RESTRICT `Tier 1 and any local large model … only one of them may execute at any wall-clock instant`.
**Tier**: HW.
**Preconditions**:
- Tier 1 active; SUT in a ZoomedIn hold with deep-analysis enabled (Tier-3 will fire).
**Monitoring**:
- GPU-instance occupancy via `tegrastats` / equivalent at the highest available sampling rate.
- The SUT's own internal "compute-class" telemetry exposed on the health endpoint as `gpu_owner_current` ∈ { "tier1", "tier3", "idle" }.
**Duration**: 60 s containing ≥ 5 Tier-3 hold cycles.
**Pass criteria**:
- `exact`: at every sample point, `gpu_owner_current ∈ { "tier1", "tier3", "idle" }`; never simultaneously both.
- `tegrastats` peak GPU occupancy attributable to autopilot processes never overlaps Tier 1's known activity window for the same wall-clock instant.
**Test status**: DEFERRED — depends on the SUT being operational end-to-end + Tier-3 enabled; also depends on the SUT exposing `gpu_owner_current` (which is an architectural choice not yet locked).
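
Once busy intervals for each GPU owner have been reconstructed from the monitoring sources, the mutual-exclusion assertion is an interval-overlap check. A minimal sketch, assuming half-open intervals in microseconds and hypothetical names; how the intervals are derived from `tegrastats` or `gpu_owner_current` is outside this sketch.

```rust
/// Half-open busy interval [start_us, end_us) in microseconds.
type Interval = (u64, u64);

/// Returns the first overlapping pair (one Tier-1 interval, one Tier-3
/// interval), or None if the two owners never occupy the GPU at the same time.
fn first_overlap(tier1: &[Interval], tier3: &[Interval]) -> Option<(Interval, Interval)> {
    for &a in tier1 {
        for &b in tier3 {
            // Two half-open intervals overlap iff each starts before the other ends.
            if a.0 < b.1 && b.0 < a.1 {
                return Some((a, b));
            }
        }
    }
    None
}
```

The quadratic scan is adequate for a 60 s window; a sorted sweep would be the natural replacement if the traces grow large.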
---
### NFT-RES-LIM-FileHandles: File-descriptor and socket bound
**Summary**: Sustained operation MUST NOT leak file descriptors or sockets. The count MUST stay within a documented headroom of the initial-post-warmup baseline for the duration of the run.
**Traces to**: RESTRICT `On-device storage MUST be bounded` (general bounded-resource principle), security principle `No silent error swallowing for security-relevant failures` (FD exhaustion would silently break the operator-stream).
**Tier**: B + HW.
**Preconditions**:
- Warm-up: 60 s.
- Workload: full onboard workload at steady state.
**Monitoring**:
- `/proc/<pid>/fd` count per autopilot process at 1 Hz.
**Duration**: 60 minutes.
**Pass criteria**:
- `threshold_max`: at every sample point, `fd_count ≤ fd_baseline_post_warmup + 50` (50 = documented churn headroom for intermittent operator reconnects).
- A monotonically rising trend (slope > 0 over the run) is a TEST FAILURE even if the absolute ceiling is not breached.
**Test status**: READY for a Tier-B run; gains its real value once HW + sustained-workload land.
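
Both pass criteria (absolute ceiling and no rising trend) can be sketched together. The names are hypothetical, and the least-squares slope is an assumption about how "slope over the run" would be computed. The `max_slope` parameter is also an assumption: passing `0.0` reproduces the spec literally, while a small positive value would tolerate sampling noise.

```rust
/// Least-squares slope of `ys` against the sample index (units of y per sample).
fn slope(ys: &[f64]) -> Option<f64> {
    if ys.len() < 2 {
        return None;
    }
    let n = ys.len() as f64;
    let mean_x = (n - 1.0) / 2.0;
    let mean_y = ys.iter().sum::<f64>() / n;
    let mut num = 0.0;
    let mut den = 0.0;
    for (i, &y) in ys.iter().enumerate() {
        let dx = i as f64 - mean_x;
        num += dx * (y - mean_y);
        den += dx * dx;
    }
    Some(num / den)
}

/// Passes only if every sample is under `baseline + headroom` AND the
/// fitted trend does not exceed `max_slope`.
fn fd_check(fd_counts: &[u32], baseline: u32, headroom: u32, max_slope: f64) -> bool {
    let under_ceiling = fd_counts.iter().all(|&c| c <= baseline + headroom);
    let ys: Vec<f64> = fd_counts.iter().map(|&c| c as f64).collect();
    let flat = slope(&ys).map_or(true, |s| s <= max_slope);
    under_ceiling && flat
}
```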
---
## Common assertions for every resource-limit scenario
- **Sustained-monitoring is non-negotiable.** Each scenario specifies a duration ≥ 60 s; short bursts that pass do not satisfy the test. The CSV report records the full sample trace path under `artifacts_path`.
- **No silent eviction.** Where a ceiling is approached, the SUT MUST surface the pressure (structured-log INFO at 85 %, WARN at 90 %, transition to yellow/red on health) BEFORE reaching the ceiling. A pass with no observable pressure signal at thresholds is a TEST FAILURE.
- **HW reporting vs gating.** Pure-x86 runs report informational deltas only; they do NOT satisfy the project-level Acceptance Gate. Every CSV row records its tier so this distinction stays auditable.
- **Re1 + Re2 are paired.** Re1 establishes the autopilot RSS ceiling; Re2 establishes that respecting Re1 does not cost Tier 1 latency. They MUST be run in the same session to make the Re2 baseline meaningful.
+215
@@ -0,0 +1,215 @@
# Security Tests
Authored by `/test-spec` Phase 2 (2026-05-19). Security tests validate blackbox-observable security properties derived from `_docs/00_problem/security_approach.md` and the AC operator-command rules. Code-level vulnerability scanning is out of scope at this layer (see deploy-time security audit `Step 14` of the autodev flow).
Each scenario observes the SUT through its public surfaces only; pass criteria assert that an attack attempt produces no state change AND surfaces a structured-log entry / health signal — silent rejection is a test failure.
---
### NFT-SEC-O9: Operator-command replay protection
**Summary**: An operator command envelope replayed within (or outside) the replay-protection window MUST be rejected; system state MUST NOT change; security WARN logged with reason `replay`.
**Traces to**: AC `Operator Workflow — A replayed or unsigned operator command MUST be rejected with a logged security warning / O9`, security principle `Operator commands MUST be authenticated, signed, and replay-protected`.
**Tier**: B + E.
**Preconditions**:
- SUT in steady state; a prior valid operator-confirm envelope already accepted.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Capture the valid envelope from the prior FT-P-016 run | envelope captured (sequence_id S, timestamp T) |
| 2 | Replay the exact same envelope a second time | SUT rejects at the boundary; no `POST /missions/{id}` observed; no mode change; counter `operator_cmd_rejected_replay_total` += 1; structured-log WARN with `reason: "replay"`, `sequence_id: S`, `originating_envelope_id` recorded |
| 3 | Replay an envelope with sequence_id S but timestamp T+window+1s (outside replay window) | rejected as expired; counter `operator_cmd_rejected_expired_total` += 1; structured-log WARN reason `expired` |
**Pass criteria**: `exact (state unchanged)` AND `substring (log contains "replay")` for step 2; `exact (state unchanged)` AND `substring (log contains "expired")` for step 3.
**Test status**: DEFERRED — `<DEFERRED: operator-envelopes (replayed) fixture; services.md §8 — blocked on Q9 operator-command auth scheme>`. Until Q9 resolves, this scenario asserts only that a duplicate envelope at the byte level is rejected (placeholder behaviour); the full replay-window semantics land with Q9.
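Because Q9 is unresolved, the real replay-window semantics are not known. The sketch below shows one common pattern (a freshness check against the receiver clock, followed by a seen-sequence check) that would produce the `replay` and `expired` outcomes of steps 2 and 3. `ReplayGuard`, its parameters, and the ordering of the two checks are assumptions for illustration only.

```python
from typing import Dict


class ReplayGuard:
    """Hypothetical replay filter: rejects stale timestamps and repeated sequence ids."""

    def __init__(self, window_s: float) -> None:
        self.window_s = window_s
        # sequence_id -> timestamp of the accepted envelope.
        # A real implementation would also bound this map; omitted here.
        self._seen: Dict[int, float] = {}

    def check(self, sequence_id: int, timestamp: float, now: float) -> str:
        # Freshness first: anything outside the window is rejected as expired.
        if abs(now - timestamp) > self.window_s:
            return "expired"
        # Inside the window, a sequence id may be used only once.
        if sequence_id in self._seen:
            return "replay"
        self._seen[sequence_id] = timestamp
        return "accepted"
```

Under these assumptions, the exact envelope from step 2 is classified as `replay`, and the step 3 envelope (same `sequence_id`, timestamp one second beyond the window) is classified as `expired`.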
---
### NFT-SEC-O10: Operator-command signature validation
**Summary**: A malformed, unsigned, or wrongly signed operator command MUST be rejected with a reason that identifies the failure (`invalid_signature`, `unsigned`, or `unauthorised_signer`); state MUST NOT change.
**Traces to**: AC `O10`, security principle `Operator commands MUST be authenticated, signed, and replay-protected`.
**Tier**: B + E.
**Preconditions**:
- SUT in steady state.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Send a malformed envelope (signature bytes flipped) | rejected; no state change; counter `operator_cmd_rejected_signature_total` += 1; structured-log WARN reason `invalid_signature` |
| 2 | Send an UNSIGNED envelope (signature field absent / zero) | rejected; counter increments; structured-log WARN reason `unsigned` |
| 3 | Send a well-formed envelope but signed with a key NOT in the operator's authorised set | rejected; counter increments; reason `unauthorised_signer` |
| 4 | Send a valid envelope (control case) | accepted; state changes as per the command type |
**Pass criteria**: steps 1–3 all `exact (state unchanged)` + `substring (log contains "invalid"|"unsigned"|"unauthorised")`; step 4 succeeds normally.
**Test status**: DEFERRED — `<DEFERRED: operator-envelopes (malformed / unsigned / wrong-key); blocked on Q9>`.
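The four outcomes in the step table can be illustrated with a small classifier. The signing scheme is blocked on Q9, so HMAC-SHA256 is used here purely as a stand-in; `AUTHORISED_KEYS`, the `key_id` field, and `classify_envelope` are hypothetical names, not part of the specified envelope format.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical registry of authorised signers: key_id -> shared secret.
AUTHORISED_KEYS = {"operator-1": b"test-secret-1"}


def sign(key: bytes, payload: bytes) -> bytes:
    """Stand-in signature: HMAC-SHA256 over the payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def classify_envelope(key_id: str, payload: bytes, signature: Optional[bytes]) -> str:
    # Absent or all-zero signature field counts as unsigned (step 2).
    if not signature or not any(signature):
        return "unsigned"
    # Signer not in the authorised set (step 3).
    key = AUTHORISED_KEYS.get(key_id)
    if key is None:
        return "unauthorised_signer"
    # Constant-time comparison; any flipped byte fails here (step 1).
    if not hmac.compare_digest(sign(key, payload), signature):
        return "invalid_signature"
    return "accepted"
```

Each non-`accepted` return value maps onto one rejection reason in the table, so a test can assert the reason string and the matching counter together.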
---
### NFT-SEC-CraftedFrame: Crafted RTSP frame → no decoder OOM / no crash
**Summary**: A crafted H.264/265 frame (oversize SPS, malformed NAL, truncated slice) MUST NOT crash or hang the SUT and MUST NOT consume unbounded memory. Frame is dropped with a counter increment.
**Traces to**: security principle `Bounded input for any model call`, RESTRICT `On-device storage / RSS budgets`.
**Tier**: B.
**Preconditions**:
- SUT in normal sweep mode; `rtsp-loopback` switched to a corpus of crafted clips.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Stream a fuzzed clip corpus (≥ 100 crafted frames) | each crafted frame dropped at decode; counter `frame_decode_error_total` increments per drop; structured-log WARN with `reason: "decode_error"` |
| 2 | Observe SUT process | RSS does NOT exceed 1.2 × baseline; no crash; no hang; gimbal & operator-stream still responsive within their normal latency budgets |
**Pass criteria**: `exact (no crash)`; `threshold_max (RSS ≤ 1.2 × baseline)`; counter consistent with crafted-frame count.
**Test status**: READY (crafted-clip corpus authorable inline using afl++ / honggfuzz output against a vanilla H.264 decoder; corpus stored in `e2e/consumer/fixtures/fuzzed_clips/`).
---
### NFT-SEC-OversizeCrop: Bounded crop enforcement
**Summary**: An attempt to submit an oversize ROI crop (above the configured max bytes or outside the format allow-list) to any onboard model entry point MUST be rejected at the boundary; downstream models MUST NOT be invoked.
**Traces to**: security principle `Bounded input for any model call`.
**Tier**: B.
**Preconditions**:
- SUT with Tier-2 + Tier-3 enabled.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Submit a 5000 × 5000 PNG (above the configured 1024 × 1024 cap) to the Tier-2 ROI entry | rejected; Tier-2 inference NOT invoked (verified via `tier2_inference_total` counter unchanged); structured-log WARN `reason: "roi_too_large"` |
| 2 | Submit a BMP (not in the allow-list) | rejected; reason `roi_format_not_allowed` |
| 3 | Submit a well-formed 640×640 JPEG (control) | accepted; Tier-2 invoked normally |
**Pass criteria**: `exact (downstream model not invoked)` for steps 1–2; `exact (downstream invoked)` for step 3.
**Test status**: READY (oversize PNG + BMP generated inline).
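A boundary check of the kind this scenario exercises might look like the sketch below. The 1024 × 1024 cap and the two reason strings come from the step table; the allow-list contents, the `Roi` type, and the `submit` gate are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

MAX_SIDE_PX = 1024                            # cap quoted in step 1
ALLOWED_FORMATS = frozenset({"jpeg", "png"})  # assumed; the table only shows that BMP is excluded


@dataclass(frozen=True)
class Roi:
    fmt: str
    width: int
    height: int


def check_roi(roi: Roi) -> str:
    """Classify an ROI crop before it can reach any model entry point."""
    if roi.fmt.lower() not in ALLOWED_FORMATS:
        return "roi_format_not_allowed"
    if roi.width > MAX_SIDE_PX or roi.height > MAX_SIDE_PX:
        return "roi_too_large"
    return "accepted"


def submit(roi: Roi, infer: Callable[[Roi], None]) -> str:
    """Invoke the downstream model only when the boundary check accepts the crop."""
    verdict = check_roi(roi)
    if verdict == "accepted":
        infer(roi)
    return verdict
```

Passing a recording callable as `infer` gives the test the same evidence the scenario asks for: the downstream call count stays unchanged for rejected inputs.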
---
### NFT-SEC-VlmSchemaViolation: VLM schema-violation fails closed
**Summary**: When the Tier-3 VLM returns a response that fails schema validation (missing required field, wrong type, truncated JSON), the SUT MUST discard the assessment AND the POI MUST NOT receive the deep-analysis upgrade.
**Traces to**: security principle `Schema validation for any non-deterministic model output … Schema violation MUST fail closed`.
**Tier**: B.
**Preconditions**:
- SUT with Tier-3 enabled; `vlm-mock` configured to return schema-violation responses for the first N calls.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Drive SUT into ZoomedIn hold with deep-analysis enabled | SUT issues VLM IPC call |
| 2 | `vlm-mock` returns truncated JSON | SUT discards assessment; POI's deep-analysis state remains `none`; counter `vlm_schema_violation_total` += 1; structured-log WARN reason `vlm_schema_violation`; the POI's decision-window scoring proceeds WITHOUT the deep-analysis upgrade |
| 3 | `vlm-mock` returns missing-required-field JSON | same |
| 4 | `vlm-mock` returns wrong-field-type JSON | same |
| 5 | `vlm-mock` returns a valid response (control) | assessment ACCEPTED; deep-analysis upgrade applied |
**Pass criteria**: steps 2–4 `exact (no deep-analysis upgrade)` + `substring (log contains "vlm_schema_violation")`; step 5 normal.
**Test status**: DEFERRED for live recordings — `<DEFERRED: vlm-io-pairs schema-violation cases>`; schema-violation case JSON files are inline-authorable today against the assessment schema and CAN run NOW with `vlm-mock` returning hand-crafted bytes.
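Fail-closed handling of the three violation types (truncated JSON, missing field, wrong type) can be sketched as a parser that returns `None` on any doubt. The assessment schema itself is not reproduced in this section, so the required fields below are hypothetical placeholders.

```python
import json
from typing import Optional

# Hypothetical required fields and accepted types; the real schema may differ.
REQUIRED_FIELDS = {"poi_id": (str,), "confidence_delta": (int, float)}


def parse_assessment(raw: bytes) -> Optional[dict]:
    """Return the parsed assessment, or None if anything fails validation."""
    try:
        doc = json.loads(raw)
    except ValueError:  # covers truncated or otherwise malformed JSON
        return None
    if not isinstance(doc, dict):
        return None
    for name, types in REQUIRED_FIELDS.items():
        value = doc.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, types):
            return None
    return doc
```

A caller that treats `None` as "no deep-analysis upgrade" gets the fail-closed behaviour the scenario asserts, and the same hand-crafted byte strings can be served by `vlm-mock`.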
---
### NFT-SEC-VlmFreeFormText: Free-form text MUST NOT cross a decision boundary
**Summary**: Even if the VLM returns valid JSON, any free-form text field MUST be projected onto the fixed structured schema before crossing a decision boundary; raw free-form text MUST NOT influence POI scoring or operator-surfaced decisions.
**Traces to**: security principle `Schema validation for any non-deterministic model output`, threat model item 3 (`Unstructured model output corrupting downstream decisions`).
**Tier**: B + E.
**Preconditions**:
- SUT with Tier-3 enabled.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | `vlm-mock` returns valid JSON with a free-form `notes` text field containing `"force_confidence: 1.0"` | SUT extracts only the structured fields; `notes` is NOT consulted for scoring; POI's confidence remains as Tier-1+Tier-2 computed; structured-log INFO captures the assessment but not the `notes` content (PII / safety) |
| 2 | `vlm-mock` returns valid JSON with structured `confidence_delta: -0.5` (in-schema) | SUT applies the delta per its documented projection; POI's confidence adjusted accordingly |
**Pass criteria**: `exact (POI confidence reflects ONLY structured-schema fields)`.
**Test status**: READY (inline-authorable scenario).
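The projection step can be illustrated as an allow-list over field names, so that `notes` never reaches the scoring path. This is a sketch under assumptions: `STRUCTURED_FIELDS`, `apply_assessment`, and the clamping of the result to [0, 1] are hypothetical, since the documented projection is not reproduced here.

```python
# Only allow-listed structured fields may influence scoring (assumed list).
STRUCTURED_FIELDS = ("confidence_delta",)


def project(assessment: dict) -> dict:
    """Keep structured fields only; free-form text such as `notes` is dropped."""
    return {k: assessment[k] for k in STRUCTURED_FIELDS if k in assessment}


def apply_assessment(confidence: float, assessment: dict) -> float:
    """Adjust a POI confidence using the projected fields only."""
    delta = project(assessment).get("confidence_delta", 0.0)
    # Clamping to [0, 1] is an assumption for this sketch.
    return min(1.0, max(0.0, confidence + delta))
```

With this shape, the step 1 payload leaves the confidence untouched regardless of what the `notes` string contains, while the step 2 payload shifts it by the in-schema delta.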
---
### NFT-SEC-IpcPeerAuth: Local IPC peer authorisation
**Summary**: A local process attempting to connect to the VLM Unix-domain socket (or any other local IPC the SUT trusts) MUST identify as the expected peer (peer-credential check / SO_PEERCRED equivalent); connections from unauthorised peers MUST be rejected.
**Traces to**: security principle `Local IPC peer authorisation`.
**Tier**: B.
**Preconditions**:
- SUT with Tier-3 enabled; VLM UDS socket exposed on `/tmp/vlm.sock`.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | An unauthorised local process (running as the wrong UID / not the expected binary path) attempts to connect to the SUT's VLM-client side of the UDS | connection rejected at the peer-credential check; counter `ipc_peer_auth_rejected_total` += 1; structured-log WARN reason `peer_cred_mismatch` |
| 2 | The legitimate `vlm-mock` (running as the expected UID / path) connects | connection accepted; subsequent IPC succeeds |
**Pass criteria**: `exact (unauthorised connection rejected)` + `exact (legitimate connection accepted)`.
**Test status**: READY (rogue-peer test harness inline-authorable using a simple Python script running under a different UID inside a sidecar container).
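On Linux, the peer-credential check named in the summary can be performed with the `SO_PEERCRED` socket option, which returns the peer's pid, uid, and gid for a Unix-domain socket. The sketch below checks the UID only; `authorise_peer` and the choice to compare against a single expected UID are assumptions, and the binary-path check mentioned in step 1 is not covered.

```python
import socket
import struct
from typing import Tuple

# struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid
_UCRED = struct.Struct("iII")


def peer_credentials(conn: socket.socket) -> Tuple[int, int, int]:
    """Return (pid, uid, gid) of the peer of a connected Unix-domain socket (Linux only)."""
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, _UCRED.size)
    return _UCRED.unpack(raw)


def authorise_peer(conn: socket.socket, expected_uid: int) -> str:
    """Accept the connection only if the peer runs under the expected UID."""
    _pid, uid, _gid = peer_credentials(conn)
    return "accepted" if uid == expected_uid else "peer_cred_mismatch"
```

A rogue-peer harness running under a different UID in a sidecar container would then be expected to see `peer_cred_mismatch`, while `vlm-mock` under the expected UID is accepted.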
---
### NFT-SEC-Tier1SchemaViolation: Tier-1 detection-stream schema violation
**Summary**: A `Detections` record from `../detections` that violates the normalised-box schema (coord out of [0,1], invalid class_id) MUST cause the frame's detections to be dropped (not partially used); counter increments; structured-log WARN. SUT does not crash and continues with subsequent frames.
**Traces to**: security principle `No silent error swallowing for security-relevant failures` (extends to peer schema violations) + AC `D6` (normalised-box conformance).
**Tier**: B.
**Preconditions**:
- SUT in normal sweep mode; `detections-mock` configured to emit schema-violating records interleaved with valid ones.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Mock emits Detections for frame N with bbox `x2 = 1.5` (coord > 1.0) | frame N's detections dropped; counter `tier1_invalid_frame_total` += 1; structured-log WARN with `field: "x2"`, `value: 1.5` |
| 2 | Mock emits Detections for frame N with `class_id = 99` (not in 0..18) | dropped; reason `class_id_out_of_range` |
| 3 | Mock emits valid Detections for frame N+1 | processed normally |
**Pass criteria**: `exact (no operator-stream emission for frame N)` + `exact (counter incremented per dropped frame)`.
**Test status**: READY (inline-authorable injection by `detections-mock`).
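The "drop the whole frame, never partially use it" rule can be sketched as a validator that reports the first offending field and an acceptor that returns either all detections or none. Only `x2` and `class_id` are named in this scenario; the other field names (`x1`, `y1`, `y2`, `confidence`) are assumed for illustration.

```python
from typing import List, Optional, Tuple

UNIT_FIELDS = ("x1", "y1", "x2", "y2", "confidence")  # all expected in [0, 1]; names partly assumed
MAX_CLASS_ID = 18                                     # catalogue is 0..18


def _is_number(value) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def first_violation(detections: List[dict]) -> Optional[Tuple[str, object]]:
    """Return (field, value) for the first schema violation, or None if the frame is clean."""
    for det in detections:
        for field in UNIT_FIELDS:
            value = det.get(field)
            if not _is_number(value) or not 0.0 <= value <= 1.0:
                return field, value
        class_id = det.get("class_id")
        if isinstance(class_id, bool) or not isinstance(class_id, int) or not 0 <= class_id <= MAX_CLASS_ID:
            return "class_id", class_id
    return None


def accept_frame(detections: List[dict]) -> List[dict]:
    """Return every detection of a clean frame, or nothing if any record is invalid."""
    return [] if first_violation(detections) is not None else list(detections)
```

The `(field, value)` pair mirrors the `field: "x2"`, `value: 1.5` entries expected in the structured-log WARN of step 1.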
---
### NFT-SEC-MavlinkUnsigned: Optional MAVLink-2 signing enforcement
**Summary**: When MAVLink-2 message signing is configured ON (per Q6 once resolved), unsigned messages on the airframe link MUST be dropped with a security WARN; signed messages flow normally. When signing is OFF (current default until Q6), no signing assertion runs.
**Traces to**: security principle `Airframe MAVLink integrity` (Q6).
**Tier**: B + E.
**Preconditions**:
- SUT configured with MAVLink-2 signing ENABLED (test profile).
- `mavlink-sitl` configured to send a mix of signed and unsigned messages.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | `mavlink-sitl` sends a valid signed message | accepted; processed normally |
| 2 | `mavlink-sitl` sends an unsigned message | dropped; counter `mavlink_unsigned_dropped_total` += 1; structured-log WARN reason `mavlink_unsigned`; airframe-link health unaffected for an isolated drop |
| 3 | Sustained unsigned-only stream | airframe-link health flips red after the configured tolerance window (same threshold as R7 retry exhaustion) |
**Pass criteria**: `exact (unsigned dropped)` + `exact (signed accepted)`; sustained-unsigned escalates per the documented threshold.
**Test status**: DEFERRED — `<DEFERRED: Q6 (MAVLink-2 message signing decision)>`. When Q6 lands and signing is mandated, this scenario becomes READY.
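For orientation, a MAVLink 2 frame starts with the magic byte `0xFD`, carries the payload length in byte 1 and the incompatibility flags in byte 2, and marks a signed frame with flag bit `0x01`, in which case a 13-byte signature block follows the checksum. The sketch below only classifies a frame as signed or unsigned from those bytes; it does not verify the signature, and `is_signed_v2` is a hypothetical helper rather than part of any MAVLink library.

```python
MAVLINK2_STX = 0xFD          # MAVLink 2 start-of-frame marker
MAVLINK_IFLAG_SIGNED = 0x01  # incompat_flags bit: frame carries a signature
HEADER_LEN = 10              # STX, len, incompat, compat, seq, sysid, compid, msgid (3 bytes)
CHECKSUM_LEN = 2
SIGNATURE_LEN = 13           # link id (1) + timestamp (6) + signature (6)


def is_signed_v2(frame: bytes) -> bool:
    """True if the frame is MAVLink 2, has the signed flag set, and has room for the signature."""
    if len(frame) < HEADER_LEN + CHECKSUM_LEN or frame[0] != MAVLINK2_STX:
        return False
    payload_len = frame[1]
    signed = bool(frame[2] & MAVLINK_IFLAG_SIGNED)
    expected_len = HEADER_LEN + payload_len + CHECKSUM_LEN + SIGNATURE_LEN
    return signed and len(frame) == expected_len
```

A test double could use such a check to split the `mavlink-sitl` traffic mix into the two populations the pass criteria compare; actual signature verification would still be the SUT's job.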
---
### NFT-SEC-HealthExposesSecurity: Health endpoint surfaces security state
**Summary**: The `/health` endpoint MUST reflect security state — repeated operator-command signature failures, repeated peer-credential mismatches, repeated schema-violation rates all MUST be visible to ops.
**Traces to**: security principle `Health endpoint MUST reflect security state`.
**Tier**: B.
**Preconditions**:
- SUT in steady state; counters baselined.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Drive sustained signature-failure rate (10 / s) for 10 s via the NFT-SEC-O10 flow | `GET /health` exposes a `security` sub-object that includes `operator_cmd_rejected_signature_rate_60s` non-zero; if rate exceeds the configured alert threshold, the security sub-object transitions to yellow |
| 2 | Drive sustained peer-credential-mismatch attempts (1 / s) for 60 s via NFT-SEC-IpcPeerAuth | `security.ipc_peer_auth_rejected_rate_60s` non-zero; transitions to yellow at threshold |
| 3 | Drive sustained Tier-1 schema-violation rate (1 / s) via NFT-SEC-Tier1SchemaViolation | `security.tier1_invalid_rate_60s` non-zero |
**Pass criteria**: `exact (health.security exposes each rate)` + `exact (transition to yellow at threshold)`.
**Test status**: READY.
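The `*_rate_60s` fields suggest a rolling window over recent rejection events. One way to sketch that, assuming the rate means events per second averaged over the last 60 s (the source does not define the unit), is shown below; `RollingRate` and `security_status` are hypothetical names.

```python
from collections import deque
from typing import Deque


class RollingRate:
    """Events per second averaged over a trailing window (assumed meaning of *_rate_60s)."""

    def __init__(self, window_s: float = 60.0) -> None:
        self.window_s = window_s
        self._events: Deque[float] = deque()

    def record(self, t: float) -> None:
        self._events.append(t)

    def rate(self, now: float) -> float:
        # Drop events that have fallen out of the trailing window.
        while self._events and self._events[0] <= now - self.window_s:
            self._events.popleft()
        return len(self._events) / self.window_s


def security_status(rate: float, alert_threshold: float) -> str:
    """Yellow once the rate exceeds the configured alert threshold, otherwise green."""
    return "yellow" if rate > alert_threshold else "green"
```

Under this reading, the step 1 load (10 rejections per second for 10 s) yields 100 events in the window, so the reported rate is 100 / 60 events per second immediately after the burst and decays to zero once the window has passed.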
---
## Out of scope at this layer
Per `security_approach.md → "Out of scope"`, the following are NOT covered by blackbox security tests because they are owned elsewhere in the suite:
- Modem-link encryption setup (radio layer below autopilot).
- Suite-wide TLS / certificate provisioning (suite-level deployment, `../_infra/`).
- OTA update signing (Watchtower; autopilot consumes signed images only). Boot-time self-check + rollback is Q10 — when it lands, it becomes a new scenario here.
- Annotation / training-data security (`../ai-training` repo).
- Operator browser UI auth (Ground Station owns it; only the modem-side handshake is jointly specified per Q9, covered by O8/O9/O10).
- Multi-operator session policy (Q11 — when it lands, becomes a new scenario here).
## Common assertions
- **No silent rejection.** Every rejected security event MUST produce both a counter increment AND a structured-log entry at WARN+. A rejection that occurs silently is a TEST FAILURE.
- **Fail-closed everywhere.** When an authentication / signature / schema check is uncertain, the SUT MUST fail closed (reject) rather than fail open. Tests assert this by sending borderline / ambiguous inputs and checking for rejection.
- **No information leak in error paths.** Error responses (where the SUT exposes any to the operator-stream or health endpoint) MUST NOT leak the rejected payload contents beyond the minimum needed for ops to triage. Tests inspect log/health output for absence of crafted-payload byte sequences.
+152
@@ -0,0 +1,152 @@
# Test Data Management
Authored by `/test-spec` Phase 2 (2026-05-19). Owns the **mapping** from fixtures to tests, mock data shapes, isolation strategy, and the deferred-fixture inventory bridge.
- Per-row input-to-expected-result binding lives in `_docs/00_problem/input_data/expected_results/results_report.md` — this file references it but never duplicates it.
- Fixture manifest (SHA-pinned files + provenance) lives in `_docs/00_problem/input_data/fixtures/README.md`.
- Per-service mock catalogue (what shape each mock returns) lives in `_docs/00_problem/input_data/services.md`.
- Deferred fixture inventory + replay obligation lives in `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`.
## Seed data sets
| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|---|---|---|---|---|
| `image-set-existing` | `fixtures/images/{4d6e1830d211ad50,54f6459dbddb93d8,6dd601b7d2dc1b30,805bcf1e9f271a58,f997d0934726b555}.jpg` — 5 aerial frames | FT-P-Tier1Contract, NFT-PERF-L1, NFT-PERF-L2, FT-P-DetectExisting | mounted read-only via `fixtures-ro:/fixtures` on `rtsp-loopback` (encoded to `.mp4` clip) and on `detections-mock` (paired with `expected_detections.json` per frame) | volume detached on container teardown |
| `video-recon` | `fixtures/videos/94d42580bd1ad6ff.mp4` | NFT-PERF-T3 | mounted read-only on `rtsp-loopback`; consumer requests stream at 30 fps then throttles decode + drops frames per scenario script | as above |
| `video-movement` | `fixtures/movement/video0[1-4].mp4` (4 wide-area clips) | FT-P-MoveStarter (visual reference only), FT-P-MoveBenchmark (deferred) | mounted on `rtsp-loopback`; played at 30 fps; consumer schedules which clip per scenario | as above |
| `image-semantic-starter` | `fixtures/semantic/semantic0[1-4].png` (1 winter + 3 unmarked season) | FT-P-ConcealStarter, FT-P-FootpathStarter (visual reference only; assertion semantics deferred) | mounted on `detections-mock` and `rtsp-loopback` as a single-frame loop | as above |
| `schemas-detection` | `fixtures/schemas/expected_detections.{json,schema.json}` | FT-P-Tier1Contract, FT-P-NormalisedBoxes (D6) | mounted on `e2e-consumer:/expected:ro` | as above |
| `sql-init-suite` | `fixtures/sql/init.sql` | NOT USED by autopilot tests (suite-only artefact; recorded here for traceability) | n/a | n/a |
| `mission-suite-fixture` | `<DEFERRED: missions_fixtures/mission_30x30km.json + mapobjects_10k.json; services.md §2>` | FT-P-MissionStart, FT-P-MapPull (Mp1), FT-P-MapPush (Mp3), NFT-RES-Mp2, NFT-RES-Mp4 | mounted on `missions-mock` once acquired | as above |
| `mavlink-sitl-scripts` | scripted `ardupilot/sitl` scenarios (waypoint upload, geofence in/out, RTL on link loss, RTL on battery floor) | FT-P-WaypointInsert (O8), NFT-RES-R4, NFT-RES-R5, NFT-RES-R6, NFT-RES-R7, NFT-RES-R9 | run in `mavlink-sitl` via `--script` argument per scenario | SITL container restarted per scenario |
| `operator-session-scripts` | scripted `(t, event)` traces: nominal, drop+reconnect, lost-link 30 s, sustained lost-link | FT-P-DecisionWindow (O1–O3, O4), FT-P-OperatorDecline (O5), FT-P-OperatorIgnoredSuppress (O6), FT-P-OperatorTimeout (O7), FT-P-OperatorConfirm (O8), NFT-RES-R4 | replayed by `operator-replay` per scenario | per-scenario |
| `operator-envelopes` | `<DEFERRED: operator_envelopes/{valid,replayed,malformed,unsigned,expired}.bin; services.md §8 (Q9-blocked)>` | NFT-SEC-O9, NFT-SEC-O10, FT-P-OperatorConfirm (O8 happy path uses a default placeholder envelope) | replayed by `operator-replay` | per-scenario |
| `vlm-io-pairs` | `<DEFERRED: vlm_io_pairs/{roi,prompt,response}.* + schema-violation cases; services.md §7>` | NFT-PERF-L3, FT-P-DeepAnalysisHold (S5), NFT-SEC-VlmSchemaViolation | mounted on `vlm-mock` | per-scenario |
| `gimbal-csv-pairs` | `<DEFERRED: gimbal_csv/video0[1-4].csv paired with movement videos at zoomed-in band + threshold-edge cluster; services.md §6>` | FT-P-EgoMotion (M1), FT-P-MoveDuringHold (M2), FT-P-ThresholdEdge (M3), FT-P-MoveBenchmark (M4), NFT-PERF-L6, NFT-PERF-L7 | replayed by `gimbal-mock` synchronised to RTSP frame timestamps | per-scenario |
| `tier1-replay-streams` | `<DEFERRED: tier1_replay/*.replay; services.md §1>` | FT-P-Tier1ContractIsolated (Tier B variant); Tier-E uses live `../detections` | served by `detections-mock` | per-scenario |
| `time-drift-scripts` | scripted clock offsets (50 ms ramp, 250 ms jump, NTP loss, GPS unlock) | NFT-RES-R8 | injected by `time-injector` via faketime LD_PRELOAD shim | per-scenario |
| `synthetic-poi-feeds` | inline-authorable: confidence={0.39, 0.40, 0.70, 1.00}, ordering-test feed, sustained-rate feed >5 POI/min | FT-P-DecisionWindow (O1–O4), FT-P-POIOrdering (S4), NFT-PERF-T1 | authored in Rust under `e2e/consumer/fixtures/synthetic_poi/`; pumped into the SUT by injecting recorded `Detections` into `detections-mock` | n/a (in-memory) |
| `bit-scenarios` | inline-authorable: every-dep-green, tier1-unreachable, storage-95pct-full | FT-P-BitPass (R1), NFT-RES-R2, NFT-RES-R3 | manipulated by toggling mock services up/down + `autopilot-state` volume seed file | volume seed file removed |
## Data isolation strategy
- **Per scenario, fresh containers.** Each scenario starts with `docker compose down -v && docker compose up -d` (the `e2e-consumer` orchestrates this via `testcontainers-rs`). No state leaks between scenarios.
- **`autopilot-state` volume** is named per `(test_id, run_id)` so parallel scenario runs do not collide.
- **Deterministic seeds.** Every randomness source in the SUT (POI age-factor tie-breaking, retry jitter, replay-window nonce window) is configured to a per-scenario seed via env vars (`AUTOPILOT_RNG_SEED=<test_id>`). The seed is captured in the CSV report.
- **Wall-clock control.** Scenarios that depend on absolute time (NFT-RES-R8, NFT-RES-R4 grace window, FT-P-DecisionWindow timeouts) use `time-injector` (faketime LD_PRELOAD). The SUT's `time.now()` calls are intercepted; GPS-source state is set via the `mavlink-sitl` GLOBAL_POSITION_INT message stream.
- **Network determinism.** All inter-service traffic stays on the `autopilot-e2e` Docker network (no internet egress). Latency injection (for L9 modem RTT exclusion checks) uses `tc qdisc` inside the `operator-replay` container.
- **No shared mocks between scenarios.** Even when two scenarios use the same fixture, each gets its own mock container instance — this avoids stale state in `missions-mock`'s POST-buffer or `gimbal-mock`'s last-command cache.
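Two of the isolation rules above (per-run volume names and per-scenario seeds) are easy to get subtly wrong, so a small sketch may help. Everything here is hypothetical: the volume naming pattern and the idea of hashing the test id into a numeric seed are assumptions, since the source only states that the volume is named per `(test_id, run_id)` and that `AUTOPILOT_RNG_SEED` is set from the test id.

```python
import hashlib
import random


def state_volume_name(test_id: str, run_id: str) -> str:
    """Assumed naming pattern: one autopilot-state volume per (test_id, run_id)."""
    return f"autopilot-state-{test_id.lower()}-{run_id}"


def seed_from_test_id(test_id: str) -> int:
    """Derive a stable 64-bit seed from the test id (assumed derivation)."""
    digest = hashlib.sha256(test_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def scenario_env(test_id: str) -> dict:
    """Environment passed to the SUT container; the seed value is the test id itself."""
    return {"AUTOPILOT_RNG_SEED": test_id}
```

Hashing with SHA-256 rather than Python's built-in `hash()` matters in a sketch like this, because `hash()` of a string varies between interpreter runs and would break reproducibility across sessions.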
## Input data mapping (fixtures → scenarios)
This is the **fixture-side index**; the scenario-side index is in each `*-tests.md` file's `Input data` field.
| Input data file | Source location | Description | Covers scenarios |
|---|---|---|---|
| `fixtures/images/4d6e1830d211ad50.jpg` | `_docs/00_problem/input_data/fixtures/images/` | Aerial frame, 1280 px input | FT-P-Tier1Contract (D6), NFT-PERF-L1, NFT-PERF-L2 |
| `fixtures/images/{54f6...,6dd6...,805b...,f997...}.jpg` | same dir | 4 additional aerial frames for existing-class regression | FT-P-DetectExisting (D2) |
| `fixtures/videos/94d42580bd1ad6ff.mp4` | same dir | Reconnaissance clip, 30 fps; consumer throttles to drop below 10 fps for ≥5 s | NFT-PERF-T3 |
| `fixtures/movement/video01.mp4` | same dir | Wide-area movement clip (visual reference only) | FT-P-EgoMotion (M1) [DEFERRED — needs gimbal.csv] |
| `fixtures/movement/video02.mp4` | same dir | Wide-area movement clip (visual reference only) | FT-P-MoveDuringHold (M2) [DEFERRED — needs zoomed-in gimbal.csv] |
| `fixtures/movement/video03.mp4` | same dir | Wide-area movement clip (visual reference only) | FT-P-ThresholdEdge (M3) [DEFERRED — needs threshold-edge gimbal.csv] |
| `fixtures/movement/video04.mp4` | same dir | Wide-area movement clip (visual reference only) | FT-P-MoveBenchmark (M4) [DEFERRED — needs zoom-band benchmark CSV] |
| `fixtures/semantic/semantic01.png` | same dir | Winter concealed-position reference (starter only) | FT-P-ConcealStarter (D3, D4), FT-P-FootpathStarter (D5) [DEFERRED — needs annotated multi-season set] |
| `fixtures/semantic/semantic0[2-4].png` | same dir | 3 unmarked-season concealed-position references | as above |
| `fixtures/schemas/expected_detections.json` | same dir | Reference output for D6 | FT-P-Tier1Contract (D6), FT-P-NormalisedBoxes |
| `fixtures/schemas/expected_detections.schema.json` | same dir | Schema for normalised-box output | FT-P-NormalisedBoxes, NFT-SEC-Tier1SchemaViolation |
| `fixtures/sql/init.sql` | same dir | (suite-only — recorded for traceability) | none |
## Expected results mapping (scenario → comparison row)
Every scenario in `*-tests.md` traces to a row id in `_docs/00_problem/input_data/expected_results/results_report.md`. The comparison method + tolerance is owned by that row — this table is the **scenario-side index** so a reader can navigate from a test to its assertion contract.
| Scenario ID | Input data | Expected result row | Comparison method | Tolerance | Source |
|---|---|---|---|---|---|
| FT-P-Tier1Contract | `image-set-existing` (1 frame) | `D6` | `schema_match` + `range` | each coord ∈ [0,1] | `fixtures/schemas/expected_detections.schema.json` |
| FT-P-DetectExisting | `image-set-existing` (5 frames) | `D2` | `numeric_tolerance` | ± 0.02 (P, R) | `<DEFERRED: expected_results/existing_classes_baseline.json>` |
| FT-P-DetectNew | `<DEFERRED: new-class eval set>` | `D1` | `threshold_min` | P ≥ 0.80 AND R ≥ 0.80 | `<DEFERRED: expected_results/new_classes_pr.json>` |
| FT-P-ConcealRecall | `image-semantic-starter` + `<DEFERRED: full set>` | `D3` | `threshold_min` | recall ≥ 0.60 | `<DEFERRED: expected_results/concealed_positions.json>` |
| FT-P-ConcealPrecision | same | `D4` | `threshold_min` | precision ≥ 0.20 | same |
| FT-P-FootpathRecall | `image-semantic-starter` + `<DEFERRED>` | `D5` | `threshold_min` | recall ≥ 0.70 | `<DEFERRED: expected_results/footpaths.json>` |
| NFT-PERF-L1 | `image-set-existing` (1 frame) | `L1` | `threshold_max` | ≤ 100 ms | inline |
| NFT-PERF-L2 | derived ROI from same | `L2` | `threshold_max` | ≤ 200 ms | inline |
| NFT-PERF-L3 | `vlm-io-pairs` | `L3` | `threshold_max` | ≤ 5000 ms | inline |
| NFT-PERF-L4 | `<DEFERRED: SITL or HW zoom-cmd capture>` | `L4` | `threshold_max` | ≤ 2000 ms | inline |
| NFT-PERF-L5 | `<DEFERRED: scripted scan→movement>` | `L5` | `threshold_max` | ≤ 500 ms | inline |
| NFT-PERF-L6 | `video-movement` (visual ref) + `<DEFERRED gimbal.csv>` | `L6` | `threshold_max` | ≤ 1000 ms | inline |
| NFT-PERF-L7 | `video-movement` + `<DEFERRED zoomed-in gimbal.csv>` | `L7` | `threshold_max` | ≤ 1500 ms | inline |
| NFT-PERF-L8 | `<DEFERRED: sweep→zoomed transition capture>` | `L8` | `threshold_max` | ≤ 2000 ms | inline |
| NFT-PERF-L9 | `<DEFERRED: operator-click → outbound>` | `L9` | `threshold_max` | ≤ 500 ms | inline |
| NFT-PERF-T1 | `synthetic-poi-feeds` (sustained > cap) | `T1` | `threshold_max` | ≤ 5 / min | inline |
| NFT-PERF-T2 | `<DEFERRED: MAVLink replay 60 s>` | `T2` | `range` | 1 Hz ≤ r ≤ 10 Hz | inline |
| NFT-PERF-T3 | `video-recon` (throttled) | `T3` | `exact` × 2 | suppression bool + health=yellow | inline |
| FT-P-EgoMotion (M1) | `video-movement/video01.mp4` + `<DEFERRED gimbal.csv + telemetry.csv>` | `M1` | `set_contains` | candidate set == {vehicle}; ∉ tree row | inline |
| FT-P-MoveDuringHold (M2) | `video02.mp4` + `<DEFERRED zoomed-in CSV pair>` | `M2` | `exact` | 1 candidate; preempt per priority rule | inline |
| FT-P-ThresholdEdge (M3) | `video03.mp4` + `<DEFERRED threshold-edge CSV>` | `M3` | `exact` | count == 0 | inline |
| FT-P-MoveBenchmark (M4) | `video04.mp4` + `<DEFERRED benchmark suite>` | `M4` | `threshold_max` | per-zoom-band FP rate budget | `<DEFERRED: expected_results/movement_benchmark_caps.json>` |
| FT-P-SweepToZoom (S1) | `<DEFERRED scripted mission + POI>` | `S1` | `exact` × 3 | transition + ROI + queue+=1 | inline |
| FT-P-FootpathPan (S2) | `<DEFERRED hold + footpath polyline>` | `S2` | `numeric_tolerance` | centre offset ≤ 25% per frame | inline |
| FT-P-TargetFollow (S3) | `<DEFERRED confirmed target>` | `S3` | `threshold_max` | per-frame \|dx,dy\| ≤ 0.125 | inline |
| FT-P-POIOrdering (S4) | `synthetic-poi-feeds` (ordering test) | `S4` | `exact (order)` | ordering matches `conf × prox × age` | inline |
| FT-P-DeepAnalysisHold (S5) | `<DEFERRED VLM-enabled hold>` | `S5` | `exact` | hold = min(5 s, vlm_complete) | inline |
| FT-P-DecisionWindow30s (O1) | `synthetic-poi-feeds` (conf=0.40) | `O1` | `exact` | window = 30 s | inline |
| FT-P-DecisionWindow120s (O2) | conf=1.00 | `O2` | `exact` | window = 120 s | inline |
| FT-P-DecisionWindow75s (O3) | conf=0.70 | `O3` | `numeric_tolerance` | window ≈ 75 s ± 0.5 s | inline |
| FT-N-BelowThreshold (O4) | conf=0.39 | `O4` | `exact` | not surfaced | inline |
| FT-P-OperatorDecline (O5) | `operator-session-scripts` (nominal + decline) | `O5` | `exact (count Δ+1)` + `schema_match` | ignored-item appended | inline |
| FT-P-IgnoredSuppress (O6) | matching MGRS + class_group | `O6` | `exact` | not surfaced | inline |
| FT-P-OperatorTimeout (O7) | no-response + > window | `O7` | `exact` × 2 | queue −1; ignored unchanged | inline |
| FT-P-OperatorConfirm (O8) | `operator-envelopes` (valid happy path) | `O8` | `exact (HTTP 200)` + `exact (mode)` | mission POST + target-follow | inline |
| NFT-SEC-O9 | `operator-envelopes` (replayed) | `O9` | `exact` + `substring` | state unchanged; log contains "replay" | inline |
| NFT-SEC-O10 | `operator-envelopes` (malformed/unsigned) | `O10` | `exact` + `substring` | state unchanged; log contains "invalid" | inline |
| FT-P-BitPass (R1) | `bit-scenarios` (every dep green) | `R1` | `exact` × 2 | takeoff permitted + health all green | inline |
| FT-N-BitDetectionDown (R2) | tier1 unreachable | `R2` | `exact` | takeoff inhibited + detection red | inline |
| FT-N-BitStorageFull (R3) | storage ≥ 95 % | `R3` | `exact` | takeoff inhibited + storage red | inline |
| NFT-RES-R4 | `operator-session-scripts` (sustained lost-link) | `R4` | `exact (RTL at 30 s ± 1 s)` | RTL command + operator-link red | inline |
| NFT-RES-R5 | `mavlink-sitl-scripts` (battery at RTL-floor) | `R5` | `exact` × 2 | RTL + health yellow | inline |
| NFT-RES-R6 | battery at hard-floor | `R6` | `exact` | land-now | inline |
| NFT-RES-R7 | `mavlink-sitl-scripts` (no-response retry exhaustion) | `R7` | `exact` | health red after max-retry | inline |
| NFT-RES-R8 | `time-drift-scripts` (250 ms drift) | `R8` | `exact` | time-source yellow + clock_source/last_sync_at updated | inline |
| NFT-RES-R9 | `mavlink-sitl-scripts` (EXCLUSION cross) | `R9` | `exact` × 2 | waypoint rejected + RTL | inline |
| NFT-RES-LIM-Re1 | `<DEFERRED long-running RSS harness>` | `Re1` | `threshold_max` | combined RSS ≤ 6 GB | inline |
| NFT-RES-LIM-Re2 | Re1 + concurrent Tier-1 traffic | `Re2` | `numeric_tolerance` | Tier-1 ms/frame Δ ± 5 ms | inline |
| FT-P-MapPull (Mp1) | `<DEFERRED 30×30 km area + ~10k mapobjects>` | `Mp1` | `threshold_max` | ≤ 30 s | inline |
| NFT-RES-Mp2 | mock unreachable | `Mp2` | `exact` × 2 | cached_fallback + BIT requires ack | inline |
| FT-P-MapPush (Mp3) | `<DEFERRED 60 min diff>` | `Mp3` | `threshold_max` | ≤ 120 s | inline |
| NFT-RES-Mp4 | POST returns 5xx | `Mp4` | `exact` × 2 + `threshold_max` | file exists + warning + retries ≤ cap | inline |
| FT-P-MapConflict (Mp5) | `<DEFERRED conflict pair>` | `Mp5` | `json_diff` | conflict resolution per Q8 | `<DEFERRED: expected_results/mapobjects_conflict_resolution.json>` |
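The scalar comparison methods named in the table can be sketched as a single dispatcher. This is an illustration of how the method names might map onto checks, not the harness's implementation; `compare` and its argument layout are assumptions, and `schema_match`, `set_contains`, and `json_diff` are deliberately left out because they need structured inputs.

```python
def compare(method: str, actual, expected, tolerance: float = 0.0) -> bool:
    """Evaluate one scalar comparison method from the expected-results table."""
    if method == "exact":
        return actual == expected
    if method == "substring":
        return expected in actual
    if method == "threshold_max":
        return actual <= expected
    if method == "threshold_min":
        return actual >= expected
    if method == "numeric_tolerance":
        return abs(actual - expected) <= tolerance
    if method == "range":
        low, high = expected
        return low <= actual <= high
    raise ValueError(f"unsupported comparison method: {method}")
```

For example, the O3 row (window ≈ 75 s ± 0.5 s) would be checked as `compare("numeric_tolerance", measured_window_s, 75.0, 0.5)`, and the L1 row as `compare("threshold_max", latency_ms, 100)`.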
## External dependency mocks
(Index-only; per-mock acquisition status owned by `services.md`.)
| External service | Mock/stub | How provided | Behavior |
|---|---|---|---|
| `../detections` Tier-1 RPC | `detections-mock` (gRPC bi-stream) | Docker container; serves `.replay` files | Returns recorded `Detections` byte-stream for the input frame's hash; serves a 19-class catalogue (0..18) deterministically; supports schema-violation injection for NFT-SEC tests |
| `missions` API | `missions-mock` (HTTPS FastAPI) | Docker container; TLS via self-signed test cert | Static JSON for `GET /missions/{id}`, `GET /missions/{id}/mapobjects`; records POST bodies for assertion; can be configured to return 5xx for NFT-RES-Mp4 |
| ViewPro A40 RTSP | `rtsp-loopback` (mediamtx) | Docker container | Plays back `.mp4` at scheduled fps with frame-drop injection (T3) |
| ViewPro A40 gimbal | `gimbal-mock` (Rust UDP) | Docker container | Replays `gimbal.csv` synchronised to RTSP frame timestamps; echoes received commands with bounded latency budget |
| ArduPilot | `mavlink-sitl` (official ardupilot/ardupilot-sitl image) | Docker container | Deterministic SITL run from a scripted mission file |
| Ground Station modem | `operator-replay` (Python) | Docker container | Replays `(t, event)` script per scenario; signs envelopes per Q9 once resolved |
| Local VLM | `vlm-mock` (Python over UDS) | Docker container; UDS shared via `/tmp` volume | Returns paired `VlmAssessment` JSON; can return schema-violation responses for NFT-SEC tests |
| Wall-clock / GPS / NTP | `time-injector` (Rust) | LD_PRELOAD faketime shim into the SUT container at start | Scripted offset/jump/source-loss |
## Data validation rules
| Data Type | Validation | Invalid Examples | Expected System Behaviour |
|---|---|---|---|
| Mission JSON | `mission-schema` (shared with `missions` repo) | missing required field; coord out of [-180, 180]; unknown enum value | system refuses; mission-state stays at last-known; health flips mission-config-source = yellow; structured-log at WARN with `schema_violation_field` |
| Map-object record | suite-level mapobjects schema | non-finite coordinate; class_group not in catalogue; missing MGRS | record dropped; counter `mapobjects_rejected_total` increments; structured-log at WARN |
| Tier-1 `Detections` stream | `expected_detections.schema.json` (normalised-box) | bbox coord ∉ [0, 1]; confidence ∉ [0, 1]; class_id ∉ {0..18} | frame's detections dropped (not partially used); `tier1_invalid_frame_total` increments; per AC D6 the system must surface a structured WARN |
| MAVLink message | MAVLink v2 dialect (per ArduPilot) | unknown MSG_ID; CRC mismatch; (if Q6 resolves to "signing on") missing signature | message dropped; if signing required and missing → security WARN; airframe-link health unaffected for individual drops |
| Operator command envelope | Q9 scheme (TBD) | replay (sequence_id seen recently); signature invalid; timestamp outside replay window | rejected at the boundary; no state mutation; security WARN with reason code; counters `operator_cmd_rejected_replay_total`, `..._signature_total`, `..._expired_total` |
| VLM `VlmAssessment` response | structured assessment schema | missing required field; wrong type; truncated JSON | fail-closed: assessment discarded; POI does NOT get the deep-analysis upgrade; structured WARN |
| RTSP frame | container-level decode | malformed H.264/265 NAL; oversized SPS | frame dropped; `frame_decode_error_total` increments; if rate falls below 10 fps for ≥5 s → T3 path triggers (zoom-in suppressed + health yellow) |
| Camera frame size | bounded crop policy (security_approach §Bounded input) | crop > configured max bytes; format not in allow-list | rejected at boundary; security WARN |
| Time source | wall-clock binding | GPS unlocked AND no NTP sync at boot | clock_source = `none`; health red until either source available |
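
The Tier-1 `Detections` row above describes an all-or-nothing rule: one out-of-range value causes the whole frame's detections to be dropped and `tier1_invalid_frame_total` to increment. A minimal sketch of that rule follows, assuming a hypothetical `Detection` record; the field names and the counter dictionary are illustrative, and the canonical contract remains `expected_detections.schema.json`.

```python
from dataclasses import dataclass

NUM_CLASSES = 19  # class ids 0..18, per the validation table


@dataclass(frozen=True)
class Detection:
    # Hypothetical field names; the real contract is the JSON schema.
    bbox: tuple        # (x_min, y_min, x_max, y_max), normalised to [0, 1]
    confidence: float  # expected in [0, 1]
    class_id: int      # expected in 0..18


def is_valid(det):
    """Check one detection against the three range rules from the table."""
    coords_ok = len(det.bbox) == 4 and all(0.0 <= c <= 1.0 for c in det.bbox)
    conf_ok = 0.0 <= det.confidence <= 1.0
    class_ok = 0 <= det.class_id < NUM_CLASSES
    return coords_ok and conf_ok and class_ok


def accept_frame(detections, counters):
    """Return all detections if every one is valid; otherwise drop the
    whole frame (no partial use) and bump the invalid-frame counter."""
    if all(is_valid(d) for d in detections):
        return list(detections)
    counters["tier1_invalid_frame_total"] = (
        counters.get("tier1_invalid_frame_total", 0) + 1
    )
    return None
```

A NaN coordinate or confidence fails the range comparisons and is therefore rejected as well. The structured WARN required by AC D6 is left out of the sketch.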
## Deferred-fixture bridge (replay obligation)
Every `<DEFERRED:>` row above maps 1-to-1 to an entry in `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md → "What is needed before /autodev can resume"` table. On every `/autodev` invocation, the leftovers step must re-evaluate whether any deferred fixture has landed; once landed, the corresponding scenario(s) become unblocked and their `Test status` line in the matching `*-tests.md` file moves from `DEFERRED — input fixture not yet acquired` to `READY`.
Inline-authorable categories (10 and 11 in the leftover) — `synthetic-poi-feeds`, `time-drift-scripts`, `operator-session-scripts`, `bit-scenarios` — are NOT marked `<DEFERRED:>` in this file because they have no external dependency. They are authored by Phase 4's `e2e/consumer/fixtures/` generators when the runner scripts come online.
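
The replay obligation above can be sketched as a pure function from "which fixtures have landed" to per-scenario status. The function name, the dictionary shape, and the rule that a scenario waiting on several fixtures becomes `READY` only once all of them have landed are assumptions for illustration; the leftovers file stays the source of truth.

```python
def replay_leftovers(deferred_by_scenario, landed):
    """Map each scenario id to READY or DEFERRED after a leftovers replay.

    deferred_by_scenario: scenario id -> set of fixture names it waits on.
    landed: set of fixture names that have been acquired so far.
    """
    statuses = {}
    for scenario_id, fixtures in deferred_by_scenario.items():
        missing = set(fixtures) - set(landed)
        # Assumption: every deferred fixture must land before unblocking.
        statuses[scenario_id] = "READY" if not missing else "DEFERRED"
    return statuses
```

For example, a scenario that waits on both `gimbal.csv` and `telemetry.csv` stays `DEFERRED` while only one of the two has landed.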
@@ -0,0 +1,202 @@
# Traceability Matrix
Authored by `/test-spec` Phase 2 (2026-05-19).
This matrix maps every acceptance-criterion bullet from `_docs/00_problem/acceptance_criteria.md` and every restriction bullet from `_docs/00_problem/restrictions.md` to the test scenarios that exercise them. Coverage is **scenario-level**, not fixture-level — scenarios marked `DEFERRED` in the underlying `*-tests.md` files still count as covered for the purpose of "the test is specified"; the fixture-acquisition status is tracked separately in `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`.
## Acceptance Criteria Coverage
| AC ID | Acceptance criterion (paraphrased; canonical text in `acceptance_criteria.md`) | Test IDs | Coverage |
|---|---|---|---|
| AC-L1 | Tier 1 per-frame ≤ 100 ms at 1280 px on deployed compute | NFT-PERF-L1, FT-P-001 (functional contract) | Covered |
| AC-L2 | Tier 2 per-ROI ≤ 200 ms | NFT-PERF-L2 | Covered |
| AC-L3 | Tier 3 per-ROI ≤ 5 s (when enabled) | NFT-PERF-L3 | Covered (fixture DEFERRED) |
| AC-L4 | Camera zoom transition (medium→high) ≤ 2 s | NFT-PERF-L4 | Covered (fixture DEFERRED) |
| AC-L5 | Decision-to-movement latency ≤ 500 ms | NFT-PERF-L5 | Covered (fixture DEFERRED) |
| AC-L6 | Movement candidate enqueue ≤ 1 s (wide sweep) | NFT-PERF-L6 | Covered (fixture DEFERRED — gimbal.csv) |
| AC-L7 | Movement candidate enqueue ≤ 1.5 s (zoomed-in) | NFT-PERF-L7 | Covered (fixture DEFERRED — zoomed gimbal.csv) |
| AC-L8 | Zoom-out → zoom-in transition ≤ 2 s | NFT-PERF-L8 | Covered (fixture DEFERRED) |
| AC-L9 | Operator command → outbound action ≤ 500 ms | NFT-PERF-L9 | Covered (fixture DEFERRED for signed envelopes; placeholder usable today) |
| AC-T1 | POI rate surfaced to operator ≤ 5 / min (hard cap) | NFT-PERF-T1 | Covered |
| AC-T2 | Position telemetry rate ∈ [1, 10] Hz (target 10) | NFT-PERF-T2 | Covered (fixture DEFERRED — MAVLink replay) |
| AC-T3 | Frame-rate floor < 10 fps for ≥ 5 s → suppress zoom-in AND health yellow | NFT-PERF-T3 | Covered |
| AC-D1 | New classes per-class P ≥ 0.80 AND R ≥ 0.80 | FT-P-003 | Covered (fixture DEFERRED — annotated eval set) |
| AC-D2 | Existing-class regression Δ ≤ ± 0.02 vs baseline | FT-P-002 | Covered (baseline JSON DEFERRED; visual fixtures present) |
| AC-D3 | Concealed-position recall ≥ 0.60 (initial gate) | FT-P-004 | Covered (fixture DEFERRED — multi-season set) |
| AC-D4 | Concealed-position precision ≥ 0.20 (initial gate) | FT-P-005 | Covered (fixture DEFERRED — same as D3) |
| AC-D5 | Footpath recall ≥ 0.70 | FT-P-006 | Covered (fixture DEFERRED — polyline-annotated set) |
| AC-D6 | Tier-1 normalised-box contract conformance (class ids 0..18, coords ∈ [0,1]) | FT-P-001, NFT-SEC-Tier1SchemaViolation | Covered |
| AC-Mov-EnqueueWideSweep | Small movers during wide sweep MUST be detected and enqueued ≤ 1 s | FT-P-007 (M1 behavioural), NFT-PERF-L6 (latency dimension) | Covered |
| AC-Mov-ContinueDuringZoom | Movement detection continues during zoomed-in inspection | FT-P-008 (M2), NFT-PERF-L7 | Covered |
| AC-Mov-StableObjectsRejected | Stable objects (trees, houses, roads) NOT treated as moving solely due to camera platform motion | FT-P-007 (M1 — set_contains explicitly excludes tree row) | Covered |
| AC-Mov-FPBudgetHonoured | Configurable per-zoom-band FP budget honoured | FT-P-009 (M3), FT-P-010 (M4 — Q14) | Covered (M4 fixture DEFERRED) |
| AC-Scan-SweepCoverage | Wide-area sweep covers planned route at wide/light/medium zoom | implicitly by FT-P-011 setup + scenario-runner BIT scenarios; NOT covered as a distinct test | NOT COVERED (see Uncovered Items § §1) |
| AC-Scan-SweepToZoomTransition | Sweep → detailed inspection transition ≤ 2 s | FT-P-011, NFT-PERF-L8 | Covered |
| AC-Scan-TargetLock | Lock + pan + 2 s deep-analysis hold + per-POI timeout default 5 s | FT-P-015 (S5 — three cases) | Covered (fixture DEFERRED — vlm-mock with realistic timing) |
| AC-Scan-TargetFollowCentre | Target-follow within centre 25 % of frame | FT-P-013 | Covered (fixture DEFERRED) |
| AC-Scan-GimbalLatency | Gimbal decision-to-movement ≤ 500 ms (links to L5) | NFT-PERF-L5 | Covered |
| AC-Scan-POIOrdering | POI queue ordered by `confidence × proximity × age_factor` | FT-P-014 | Covered |
| AC-Op-DecisionWindowScale | Decision window scales 30 s @ 0.40 → 120 s @ 1.00 linearly | FT-P-017 (O1), FT-P-018 (O2), FT-P-019 (O3), FT-N-004 (O4 below-threshold) | Covered |
| AC-Op-DeclinePersistsIgnored | Operator-decline → persistent ignored-item per (MGRS, class_group) | FT-P-020 (O5) | Covered |
| AC-Op-TimeoutForget | Timeout (no response) MUST NOT create ignored-item | FT-P-022 (O7) | Covered |
| AC-Op-IgnoredSuppress | New detection matching existing ignored-item NOT surfaced | FT-P-021 (O6) | Covered |
| AC-Op-ConfirmWaypointFollow | Operator-confirm → middle waypoint POST + target-follow mode | FT-P-016 (O8) | Covered (Q9 envelope DEFERRED; happy path uses placeholder) |
| AC-Op-ReplayUnsignedRejected | Replayed or unsigned operator command REJECTED with logged security WARN; state UNCHANGED | NFT-SEC-O9, NFT-SEC-O10 | Covered (Q9 DEFERRED for full semantics) |
| AC-Rel-BITGatesTakeoff | BIT MUST pass before takeoff permitted | FT-P-023 (R1), FT-N-001 (R2), FT-N-002 (R3), FT-N-003 (Mp2 BIT gate) | Covered |
| AC-Rel-LostLinkRTL30s | Lost operator/GS link → known mission-safe outcome within configurable grace (default 30 s → RTL) | NFT-RES-R4 | Covered |
| AC-Rel-AirframeLinkRedImmediate | Airframe command link loss → health red immediately; defer to airframe failsafe | NFT-RES-R7 (extension), implicitly by airframe-link health observation in NFT-RES-R5/R6 | Covered |
| AC-Rel-BatteryFloors | Battery ≤ RTL floor → RTL; battery ≤ hard floor → land-now; operator override only | NFT-RES-R5, NFT-RES-R6 | Covered (fixture DEFERRED) |
| AC-Rel-MavlinkExhaustionRed | MAVLink command exhaustion → airframe-link health red | NFT-RES-R7 | Covered (fixture DEFERRED) |
| AC-Rel-DriftYellow | Wall-clock drift > 200 ms → health yellow | NFT-RES-R8 | Covered |
| AC-Rel-GeofenceSymmetric | Geofence INCLUSION + EXCLUSION violations → waypoint refusal + RTL | NFT-RES-R9 (both cases) | Covered (fixture DEFERRED) |
| AC-Res-RSS6GB | Combined RSS on Jetson (excluding Tier 1) ≤ 6 GB sustained | NFT-RES-LIM-Re1, NFT-RES-LIM-CPU (CPU dimension), NFT-RES-LIM-FileHandles (FD dimension) | Covered (HW DEFERRED) |
| AC-Res-Tier1NonDegradation | Tier 1 per-frame latency Δ ± 5 ms under concurrent autopilot workload | NFT-RES-LIM-Re2, NFT-RES-LIM-GPU (GPU mutual exclusion) | Covered (HW DEFERRED) |
| AC-Mp-PreFlightPull30s | Pre-flight map pull ≤ 30 s; cache-fallback only with explicit operator ack | FT-P-024 (Mp1), FT-N-003 (Mp2 cache-fallback gate), NFT-RES-Mp2 (timing+recovery) | Covered |
| AC-Mp-PostFlightPush120s | Post-flight pass diff push ≤ 120 s; failure → persist + bounded retry | FT-P-025 (Mp3), NFT-RES-Mp4 | Covered (fixture DEFERRED) |
| AC-Gate-HWBench | HW/replay benchmark suite MUST pass before product implementation | every Tier-HW row in environment.md `Hardware Execution Matrix` (filled by `hardware-assessment.md`) | Covered as a gate, executed at the Acceptance-Gates milestone |
| AC-Gate-SeasonCoverage | Per-season dataset coverage demonstrated before MVP sign-off (Q13) | NOT COVERED at blackbox test level — gated on annotation campaign and the `../ai-training` repo | NOT COVERED (see Uncovered Items § §2) |
| AC-Gate-MavlinkSITLConformance | MAVLink command surface MUST pass SITL conformance | implicitly by FT-P-016 (O8 confirms waypoint POST through SITL) + NFT-RES-R4/R5/R6/R7/R9 (all run through SITL); a dedicated conformance suite is recommended | Partially Covered (see Uncovered Items § §3) |
| AC-Q-Mov-Zoomed-FPRate | Movement detection FP rate at zoomed-in inspection (Q14) | FT-P-010 (M4) | Covered (Q14 DEFERRED) |
| AC-Q-MapObjectsConflict | MapObjects conflict resolution rule (Q8) | FT-P-026 (Mp5) | Covered (Q8 DEFERRED) |
| AC-Q-OperatorCmdAuth | Operator-command authentication conformance (Q9) | NFT-SEC-O9, NFT-SEC-O10, FT-P-016 (O8) | Covered (Q9 DEFERRED — placeholders used today) |
| AC-Q-MAVLinkSigning | Airframe MAVLink-2 message signing (Q6) | NFT-SEC-MavlinkUnsigned | Covered (Q6 DEFERRED) |
| AC-Q-SeasonGates | Per-season flight-test gates (Q13) | NOT COVERED — same as AC-Gate-SeasonCoverage | NOT COVERED |
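
AC-Op-DecisionWindowScale in the table above specifies a linear scale from 30 s at confidence 0.40 to 120 s at confidence 1.00. A minimal sketch of that interpolation follows. The function name is hypothetical, and returning `None` below the 0.40 threshold is an assumption standing in for whatever FT-N-004 (O4) actually asserts.

```python
def decision_window_s(confidence, lo=(0.40, 30.0), hi=(1.00, 120.0)):
    """Linearly interpolate the operator decision window in seconds.

    lo and hi are (confidence, window_seconds) anchor points taken from
    the acceptance criterion.
    """
    c0, w0 = lo
    c1, w1 = hi
    if confidence < c0:
        # Assumption: below-threshold POIs get no decision window.
        return None
    c = min(confidence, c1)  # clamp anything above the upper anchor
    return w0 + (w1 - w0) * (c - c0) / (c1 - c0)
```

Under these anchors a confidence of 0.70 sits halfway along the scale and yields a window of about 75 s.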
## Restrictions Coverage
| Restriction ID | Restriction (paraphrased; canonical in `restrictions.md`) | Test IDs | Coverage |
|---|---|---|---|
| RESTRICT-HW-Jetson | Compute device Jetson Orin Nano Super; 8 GB shared LPDDR5; ~6 GB residual after Tier 1 | NFT-RES-LIM-Re1, NFT-RES-LIM-CPU, NFT-RES-LIM-Re2, all Tier-HW rows | Covered (HW DEFERRED) |
| RESTRICT-HW-A40 | Primary camera ViewPro A40; vendor protocol mandatory | FT-P-011, FT-P-012, FT-P-013, NFT-PERF-L4 (zoom traversal floor) | Covered (HW DEFERRED for L4) |
| RESTRICT-HW-Z40K | Alternative camera ViewPro Z40K — system must remain compatible | NOT COVERED at autopilot test level — verified by component-swap regression run on the Z40K HW | NOT COVERED (see Uncovered Items § §4) |
| RESTRICT-HW-ThermalLater | Thermal sensor may be added later; not assumed today | implicit (no test depends on thermal) | Covered by absence (negative assumption) |
| RESTRICT-HW-ZoomFloor | 40× optical zoom traversal 1–2 s wall-clock | NFT-PERF-L4 (asserts the ≤ 2 s ceiling that includes the physical floor) | Covered (HW DEFERRED) |
| RESTRICT-Op-Altitude | Flight altitude 600–1000 m | implicitly by every mission-trace fixture; no dedicated test | Covered by fixture assumption |
| RESTRICT-Op-AllSeasons | All four seasons in scope; winter-first-only rejected | FT-P-002, FT-P-003, FT-P-004, FT-P-005, FT-P-006 — multi-season fixtures required | Covered (all DEFERRED on multi-season fixtures) |
| RESTRICT-Op-AllTerrains | Forest, open field, urban edges, mixed terrain | same as RESTRICT-Op-AllSeasons | Covered (DEFERRED) |
| RESTRICT-Op-IntermittentModem | Modem operator/GS link intermittent | NFT-RES-R4, FT-P-016 (O8 nominal session), NFT-SEC-O9/O10 | Covered |
| RESTRICT-SW-JetsonResidualBudget | Onboard inference path runs within 6 GB residual RAM | NFT-RES-LIM-Re1 | Covered (HW DEFERRED) |
| RESTRICT-SW-FP16 | Models use FP16 precision (INT8 rejected for MVP) | NOT COVERED at autopilot test level — pinned at the model-loading layer (Tier 1 in `../detections`; Tier 2/3 in autopilot config) | NOT COVERED (see Uncovered Items § §5) |
| RESTRICT-SW-NoCloudInference | No cloud egress for inference | NFT-SEC-CraftedFrame (process boundary), implicit by environment.md `autopilot-e2e` network having no egress | Covered |
| RESTRICT-SW-GPUMutualExclusion | Tier 1 + any local large model serialise on the Jetson GPU | NFT-RES-LIM-GPU | Covered (HW DEFERRED) |
| RESTRICT-SW-MissionSchemaShared | Autopilot consumes shared `mission-schema`; cannot fork | FT-P-016 (O8 — POST validates against schema), FT-P-024 (Mp1 — schema-validated pull) | Covered (fixtures DEFERRED) |
| RESTRICT-Arch-Tier1External | Tier 1 lives in `../detections`; autopilot consumes | FT-P-001 (D6), NFT-SEC-Tier1SchemaViolation, FT-N-001 (R2 — Tier 1 unreachable inhibits BIT) | Covered |
| RESTRICT-Arch-MissionExternal | Mission state from `missions` service; autopilot doesn't author | FT-P-024, FT-P-025, FT-P-016 | Covered (fixtures DEFERRED) |
| RESTRICT-Arch-MapInMissions | Central area map in `missions /mapobjects` | FT-P-024, FT-P-025, FT-P-026 (Mp5), NFT-RES-Mp2, NFT-RES-Mp4 | Covered (fixtures DEFERRED) |
| RESTRICT-Arch-GPSDeniedExternal | GPS coords from separate GPS-denied service; autopilot does NOT implement | NOT COVERED at autopilot test level — verified at suite-e2e tier via the live GPS-denied service | NOT COVERED at autopilot tier (covered at suite-e2e tier) |
| RESTRICT-Arch-OperatorUIExternal | Operator browser UI owned by Ground Station; autopilot pushes data | implicit by NOT testing any UI rendering; verified by operator-stream protocol assertions in FT-P-016, FT-P-017..FT-P-022 | Covered by absence |
| RESTRICT-Arch-AnnotationTrainingExternal | Annotation + training in `../annotations`, `../ai-training`; autopilot doesn't own | NOT TESTABLE at autopilot blackbox tier — process boundary | NOT TESTABLE (intentional scope exclusion) |
| RESTRICT-Rel-BITGate | Pre-flight BIT MUST gate takeoff | FT-P-023 (R1), FT-N-001 (R2), FT-N-002 (R3), FT-N-003 (Mp2) | Covered |
| RESTRICT-Rel-LostLinkDeterministic | Lost operator-link failsafe deterministic + bounded | NFT-RES-R4 | Covered |
| RESTRICT-Rel-AirframeLossRedImmediate | Airframe MAVLink loss → health red immediately | NFT-RES-R7 (red after retry exhaustion); a dedicated "immediate red on link loss" scenario MAY be desirable (currently rolled into R7) | Partially Covered (see Uncovered Items § §6) |
| RESTRICT-Rel-BatteryThresholds | Battery RTL + land-now triggers (override only via operator) | NFT-RES-R5, NFT-RES-R6 | Covered (fixtures DEFERRED) |
| RESTRICT-Rel-GeofenceSymmetric | Geofence INCLUSION + EXCLUSION enforcement | NFT-RES-R9 (both) | Covered (fixture DEFERRED) |
| RESTRICT-Rel-OperatorCmdAuth | Operator commands authenticated + signed + replay-protected | NFT-SEC-O9, NFT-SEC-O10, FT-P-016 happy path | Covered (Q9 DEFERRED) |
| RESTRICT-Rel-StorageBounded | On-device storage bounded; full = takeoff blocker; mid-flight eviction policy | FT-N-002 (R3 — BIT block), NFT-RES-LIM-Storage | Covered |
| RESTRICT-Rel-NoSilentErrors | No silent error swallowing | every NFT-SEC-* scenario asserts a counter + log entry; every NFT-RES-* asserts a structured-log + health transition | Covered |
| RESTRICT-Rel-ClockBound | Wall-clock bound to GPS once locked, else NTP at boot | NFT-RES-R8 | Covered |
| RESTRICT-Rel-MavlinkConformance | MAVLink command surface MUST conform to ArduPilot/PX4 SITL | every MAVLink-emitting scenario runs through `mavlink-sitl`; a dedicated conformance suite is recommended | Partially Covered (see Uncovered Items § §3) |
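
RESTRICT-Rel-ClockBound, AC-Rel-DriftYellow, and the Time source validation rule together describe a small decision: bind to GPS once locked, else NTP, else no source (health red), and flag drift above 200 ms as yellow. A minimal sketch of that combined rule follows. The function name, the `"gps"`/`"ntp"` source labels, and the `"green"` state are assumptions; the documents only name `none`, yellow, and red.

```python
DRIFT_YELLOW_MS = 200  # AC-Rel-DriftYellow: drift > 200 ms means yellow


def clock_health(gps_locked, ntp_synced, drift_ms):
    """Return (clock_source, health) for the time-source health signal."""
    if gps_locked:
        source = "gps"   # hypothetical label
    elif ntp_synced:
        source = "ntp"   # hypothetical label
    else:
        # Neither source available: red until either one shows up.
        return ("none", "red")
    health = "yellow" if abs(drift_ms) > DRIFT_YELLOW_MS else "green"
    return (source, health)
```

With the 250 ms drift used by the `time-drift-scripts` fixture for NFT-RES-R8, this sketch reports yellow, which matches the expected result for that scenario.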
## Coverage Summary
| Category | Total Items | Covered | Partially Covered | Not Covered | Coverage % (counting Partially as 0.5) |
|---|---|---|---|---|---|
| Acceptance Criteria | 47 | 43 | 1 | 3 | (43 + 0.5×1) / 47 ≈ **92.6 %** |
| Restrictions | 30 | 25 | 2 | 3 | (25 + 0.5×2) / 30 ≈ **86.7 %** |
| **Total** | 77 | 68 | 3 | 6 | **(68 + 1.5) / 77 ≈ 90.3 %** |
(Coverage here is "test scenario exists for the item", not "fixture has been acquired and the test currently passes". Fixture status is tracked in `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`.)
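
The percentages in the summary table follow from one formula that weights each partially covered item at 0.5. A short sketch that reproduces them from the counts in the table (the function name is illustrative):

```python
def coverage_pct(covered, partial, total, partial_weight=0.5):
    """Scenario-level coverage in percent, counting partial items at a weight."""
    return 100.0 * (covered + partial_weight * partial) / total


# (covered, partially covered, total) as stated in the summary table.
ROWS = {
    "Acceptance Criteria": (43, 1, 47),
    "Restrictions": (25, 2, 30),
}

summary = {name: round(coverage_pct(*row), 1) for name, row in ROWS.items()}

# The total row is computed from summed counts, not by averaging percentages.
total = tuple(sum(col) for col in zip(*ROWS.values()))
summary["Total"] = round(coverage_pct(*total), 1)
```

Summing counts before dividing matters: averaging the two category percentages would weight the 30 restrictions the same as the 47 acceptance criteria.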
## Uncovered Items Analysis
| § | Item | Reason not covered | Risk | Mitigation |
|---|---|---|---|---|
| §1 | AC-Scan-SweepCoverage (wide-area sweep covers planned route) | The "covers the planned route" property is a path-coverage assertion best tested by component-level tests in the `scan_controller` component (geometry coverage) rather than at the blackbox level | Medium — incorrect sweep pattern leaks observation gaps | Component test in `scan_controller` (added by `/decompose` test tasks); a Tier-E "did the camera point at every planned waypoint area for ≥ N seconds" scenario can be added if needed |
| §2 | AC-Gate-SeasonCoverage / AC-Q-SeasonGates | Per-season coverage gates depend on dataset acquisition owned by `../ai-training` and per-season flight tests (Q13) | High — model performance on un-evaluated seasons unknown | Tracked as release-gate item; D3/D4/D5/D1 scenarios DEFERRED until each season's dataset lands |
| §3 | AC-Gate-MavlinkSITLConformance / RESTRICT-Rel-MavlinkConformance | A dedicated "every command in `architecture.md §7.7` exercised against SITL" suite is recommended in addition to the implicit coverage by R-scenarios | Medium — could miss a rarely-used command | Add a `NFT-MavlinkConformance` suite during Step 9 (Decompose Tests) — explicit per-command SITL exercise |
| §4 | RESTRICT-HW-Z40K (Z40K compatibility) | Requires a second camera HW for the swap test | Medium — could miss an A40-specific assumption | Run the Tier-HW rows on Z40K as a post-MVP smoke step |
| §5 | RESTRICT-SW-FP16 (model precision) | Pinned at config + model-loading layer; not externally observable beyond perf/latency | Low — incorrect precision would manifest as either L1 latency or D2 regression failure | Add a startup log assertion: "Tier 2/3 models loaded with precision=FP16" via the SUT's structured boot log |
| §6 | RESTRICT-Rel-AirframeLossRedImmediate (immediate red on airframe link loss) | NFT-RES-R7 asserts red after retry exhaustion; the "immediate red on link loss" path (no retries) is implicit | Low–Medium — depends on timing window between "link silent" and "considered lost" | Add `NFT-RES-AirframeImmediate` scenario in Step 9 (Decompose Tests) — sustained zero MAVLink traffic for N seconds → immediate health red (no retry phase) |
## Scenario index by file
| File | Scenarios | Read-back ID prefix |
|---|---|---|
| `blackbox-tests.md` | 26 positive + 4 negative | FT-P-001..FT-P-026, FT-N-001..FT-N-004 |
| `performance-tests.md` | 9 latency + 3 rate | NFT-PERF-L1..L9, NFT-PERF-T1..T3 |
| `resilience-tests.md` | 6 R-rows + 2 Mp-rows | NFT-RES-R4..R9, NFT-RES-Mp2, NFT-RES-Mp4 |
| `security-tests.md` | 10 SEC rows | NFT-SEC-O9, NFT-SEC-O10, NFT-SEC-CraftedFrame, NFT-SEC-OversizeCrop, NFT-SEC-VlmSchemaViolation, NFT-SEC-VlmFreeFormText, NFT-SEC-IpcPeerAuth, NFT-SEC-Tier1SchemaViolation, NFT-SEC-MavlinkUnsigned, NFT-SEC-HealthExposesSecurity |
| `resource-limit-tests.md` | 6 LIM rows | NFT-RES-LIM-Re1, Re2, Storage, CPU, GPU, FileHandles |
**Total scenarios authored**: 66.
## Open dependencies summary
| Dependency | Affects (scenario count) | Tracking |
|---|---|---|
| `<DEFERRED: gimbal.csv + telemetry.csv pairs>` | FT-P-007/008/009/010, NFT-PERF-L6/L7 | Leftover row "Gimbal CSV pairs" |
| `<DEFERRED: multi-season annotated datasets (concealed, footpath, new classes, existing baseline)>` | FT-P-002/003/004/005/006 | Leftover row "Concealed position image set + Footpath sequences + new-class eval set" |
| `<DEFERRED: SITL or HW capture for L4/L5/L8>` | NFT-PERF-L4/L5/L8 | Leftover row "MAVLink SITL traces" + camera frame sequences with zoom-band labelling |
| `<DEFERRED: missions API mock fixtures (Mp1/Mp3/Mp4)>` | FT-P-024/025, NFT-RES-Mp4 | Leftover row "Mock central area-map service responses" |
| `<DEFERRED: vlm-io-pairs (real recordings)>` | NFT-PERF-L3, FT-P-015 (S5), NFT-SEC-VlmSchemaViolation real-recording variant | Leftover row "Deep-analysis I/O pairs" |
| `<DEFERRED: operator-envelopes (Q9-blocked)>` | NFT-SEC-O9/O10, full semantics of FT-P-016 | Leftover row "Operator-command envelopes" + Q9 |
| `<DEFERRED: HW Jetson Orin Nano Super OR benchmarked replay>` | every Tier-HW scenario (L1, L2, L4, L5, L8, Re1, Re2, CPU, GPU) | Leftover does not enumerate HW directly — tracked via the project-level Acceptance Gate |
| `<DEFERRED: Q6 — MAVLink-2 signing decision>` | NFT-SEC-MavlinkUnsigned | architecture.md §8 Q6 |
| `<DEFERRED: Q8 — MapObjects conflict resolution rule>` | FT-P-026 (Mp5) | architecture.md §8 Q8 |
| `<DEFERRED: Q9 — operator-command auth scheme>` | NFT-SEC-O9/O10 full semantics | architecture.md §8 Q9 |
| `<DEFERRED: Q13 — per-season gates>` | AC-Gate-SeasonCoverage | architecture.md §8 Q13 |
| `<DEFERRED: Q14 — movement-detection classical vs learned-CV>` | FT-P-010 (M4) | architecture.md §8 Q14 |
When any of the above dependencies resolves, the corresponding leftover entry is replayed (per `tracker.mdc → Leftovers Mechanism`) and the affected scenarios' `Test status` lines move from `DEFERRED` to `READY` in the source files.
## Phase 3 — Test Data & Expected Results Validation Gate Outcome
Recorded by `/test-spec` Phase 3 on 2026-05-19.
### Mechanical gate
Phase 3's mechanical contract is: every scenario MUST have either (a) a provided input + provided quantifiable expected result, OR (b) a behavioural trigger + observable behaviour + quantifiable pass/fail criterion. Scenarios that fail this contract are normally REMOVED. The 75 % final-coverage check then applies.
| Shape | Total scenarios | Quantifiable comparison declared | Input/trigger fully provided today | Input/trigger DEFERRED (release-gate item) |
|---|---|---|---|---|
| Input/output | 56 | 56 | 16 | 40 |
| Behavioural | 10 | 10 | 10 | 0 |
| **Total** | **66** | **66 (100 %)** | **26 (39 %)** | **40 (61 %)** |
Every scenario carries a `Comparison` method drawn from `.cursor/skills/test-spec/templates/expected-results.md` (`exact`, `numeric_tolerance`, `threshold_min/max`, `range`, `regex`, `substring`, `set_contains`, `json_diff`, `schema_match`, `file_reference`) — none of the 66 fail the quantifiability check.
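
To make the comparison methods concrete, here is a minimal sketch of a dispatcher covering the simpler ones. The semantics are inferred from the method names and from how the result tables use them (for example `threshold_max` for "combined RSS ≤ 6 GB"). The shape of `expected` for each method is an assumption; the template file remains authoritative. `json_diff`, `schema_match`, and `file_reference` are omitted.

```python
import re

# Each comparator takes (actual, expected) and returns a truthy/falsy value.
COMPARATORS = {
    "exact": lambda actual, expected: actual == expected,
    # Assumed shape: expected = {"value": v, "tolerance": t}
    "numeric_tolerance": lambda actual, expected: (
        abs(actual - expected["value"]) <= expected["tolerance"]
    ),
    "threshold_max": lambda actual, expected: actual <= expected,
    "threshold_min": lambda actual, expected: actual >= expected,
    # Assumed shape: expected = (low, high), both inclusive
    "range": lambda actual, expected: expected[0] <= actual <= expected[1],
    "regex": lambda actual, expected: re.search(expected, actual) is not None,
    "substring": lambda actual, expected: expected in actual,
    "set_contains": lambda actual, expected: set(expected) <= set(actual),
}


def compare(method, actual, expected):
    """Apply one named comparison; unknown methods raise ValueError."""
    try:
        check = COMPARATORS[method]
    except KeyError:
        raise ValueError(f"unsupported comparison method: {method}") from None
    return bool(check(actual, expected))
```

Rows that declare `exact` × 2 would simply call `compare` once per asserted value.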
### Project-policy override (recorded 2026-05-19)
The Phase 3 75 % fixture-coverage gate is **intentionally overridden** for this project, per the decision recorded in `_docs/00_problem/input_data/expected_results/results_report.md → "Decision (project policy)"`:
> rather than block on the Phase 3 75 % gate, each deferred row is now registered with a structured `<DEFERRED:>` tag and surfaces in `data_parameters.md → "Gaps that block /test-spec downstream"`. `/test-spec` Phase 2 can author scenarios for all 56 rows; deferred rows become **release-gate items**, not development-gate items. The `acceptance_criteria.md → "Acceptance Gates (project-level)"` hardware/replay benchmark requirement is preserved as the hard release gate — that one is NOT being deferred.
Under this policy:
- **No scenarios are removed by Phase 3.** Every authored scenario remains in the spec; its `Test status` line in the source file (`blackbox-tests.md`, `performance-tests.md`, etc.) carries either `READY` or `DEFERRED — <reason>`.
- **Final coverage** is computed at the **scenario level**, not the fixture level. Per the matrix above:
  - AC coverage: 92.6 % ((43 + 0.5 × 1) / 47)
  - RESTRICT coverage: 86.7 % ((25 + 0.5 × 2) / 30)
  - **Total: 90.3 %**, well above the 75 % gate.
- **Fixture acquisition** is tracked as a release-gate concern in `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`; on every `/autodev` invocation the leftover-replay step re-evaluates whether any deferred fixture has landed and moves the affected scenarios from `DEFERRED` to `READY`.
- **The project-level Acceptance Gate** (`acceptance_criteria.md → "Acceptance Gates"` — HW/replay benchmark, per-season coverage, MAVLink SITL conformance) remains a hard release blocker. The override does NOT relax that gate.
### Phase 3 verdict
**PASSED** — scenario-level coverage 90.3 % ≥ 75 % gate; every scenario has a quantifiable comparison; deferred-fixture tracking handled via leftovers replay; no scenarios removed.
## Phase 4 — Test Runner Script Generation: SKIPPED in this invocation
Per `phases/04-runner-scripts.md → "Skip condition"`:
> If this skill was invoked from the `/plan` skill (planning context, no code exists yet), skip Phase 4 entirely. Script creation should instead be planned as a task during decompose — the decomposer creates a task for creating these scripts. Phase 4 only runs when invoked from the existing-code flow (where source code already exists) or standalone.
This invocation is greenfield Step 5 (Test Spec) and no source code exists yet — the `_docs/02_document/components/*/description.md` files describe 13 Rust components that the Implement step (Step 7) will create. Producing runner scripts here would write `scripts/run-tests.sh` and `scripts/run-performance-tests.sh` against a binary that does not yet exist.
**Handoff to Step 6 (Decompose)**: the decomposer MUST create at least two task specs covering the test runner scripts:
1. A task to create `scripts/run-tests.sh` (Tier B/E orchestration; calls `docker compose -f e2e/docker-compose.autopilot-e2e.yml up` and runs `cargo test --release --test scenarios` in `e2e-consumer`).
2. A task to create `scripts/run-performance-tests.sh` (Tier HW orchestration; per `environment.md → Hardware Execution Matrix`).
Both tasks should be tagged as part of the test-infrastructure decomposition (`Step 1t` of decompose tests-only mode) so they land before any Tier-B test scenarios are implemented.
+111
@@ -0,0 +1,111 @@
# Dependencies Table
**Date**: 2026-05-19
**Total Tasks**: 47 (1 bootstrap + 46 component)
**Total Complexity Points**: 177
**Cycle Check**: PASS (DAG verified)
**Coverage Check**: PASS (every named architectural capability has an implementation task)
## Recommended Execution Order
Tasks are listed in a topologically valid order. The `/implement` skill batches by dependency depth automatically; the order here is one valid linearization.
| Task | Name | Complexity | Dependencies | Epic |
|------|------|-----------|-------------|------|
| AZ-640 | initial_structure | 5 | None | AZ-626 |
| AZ-641 | mavlink_transport_and_heartbeat | 3 | AZ-640 | AZ-637 |
| AZ-642 | mavlink_codec | 5 | AZ-640 | AZ-637 |
| AZ-643 | mavlink_ack_demux_and_signing | 3 | AZ-640, AZ-641, AZ-642 | AZ-637 |
| AZ-644 | mission_client_pull_and_schema | 3 | AZ-640 | AZ-638 |
| AZ-645 | mission_client_waypoint_post | 2 | AZ-640, AZ-644 | AZ-638 |
| AZ-646 | mission_client_mapobjects_pull | 3 | AZ-640, AZ-644 | AZ-638 |
| AZ-647 | mission_client_mapobjects_push | 5 | AZ-640, AZ-644, AZ-646 | AZ-638 |
| AZ-648 | mission_executor_state_machine | 5 | AZ-640, AZ-641, AZ-642, AZ-643 | AZ-636 |
| AZ-649 | mission_executor_telemetry_forwarding | 2 | AZ-640, AZ-648 | AZ-636 |
| AZ-650 | mission_executor_bit_f9 | 5 | AZ-640, AZ-648, AZ-649, AZ-644, AZ-646 | AZ-636 |
| AZ-651 | mission_executor_lost_link_ladder | 3 | AZ-640, AZ-648, AZ-649 | AZ-636 |
| AZ-652 | mission_executor_safety_and_resume | 5 | AZ-640, AZ-648, AZ-649, AZ-643, AZ-647 | AZ-636 |
| AZ-653 | gimbal_a40_transport | 5 | AZ-640 | AZ-634 |
| AZ-654 | gimbal_zoom_out_sweep | 3 | AZ-640, AZ-653 | AZ-634 |
| AZ-655 | gimbal_smooth_pan_plan | 3 | AZ-640, AZ-653 | AZ-634 |
| AZ-656 | gimbal_centre_on_target | 3 | AZ-640, AZ-653 | AZ-634 |
| AZ-657 | frame_ingest_rtsp_session | 3 | AZ-640 | AZ-627 |
| AZ-658 | frame_ingest_decoder | 5 | AZ-640, AZ-657 | AZ-627 |
| AZ-659 | frame_ingest_publisher | 3 | AZ-640, AZ-657, AZ-658 | AZ-627 |
| AZ-660 | detection_client_grpc_stream | 5 | AZ-640, AZ-659 | AZ-628 |
| AZ-661 | detection_client_schema_and_health | 2 | AZ-640, AZ-660 | AZ-628 |
| AZ-662 | movement_detector_ego_motion | 5 | AZ-640, AZ-659, AZ-656, AZ-649 | AZ-629 |
| AZ-663 | movement_detector_clustering_and_emission | 5 | AZ-640, AZ-662 | AZ-629 |
| AZ-664 | movement_detector_fp_cap_and_q14_fallback | 3 | AZ-640, AZ-662, AZ-663 | AZ-629 |
| AZ-665 | mapobjects_store_h3_classify | 5 | AZ-640 | AZ-633 |
| AZ-666 | mapobjects_store_ignored_and_pass_sweep | 3 | AZ-640, AZ-665 | AZ-633 |
| AZ-667 | mapobjects_store_hydrate_and_pending | 5 | AZ-640, AZ-665, AZ-666 | AZ-633 |
| AZ-668 | mapobjects_store_persistence | 3 | AZ-640, AZ-665, AZ-667 | AZ-633 |
| AZ-669 | semantic_analyzer_primitive_graph | 5 | AZ-640, AZ-660, AZ-661 | AZ-630 |
| AZ-670 | semantic_analyzer_roi_cnn | 5 | AZ-640, AZ-669 | AZ-630 |
| AZ-671 | semantic_analyzer_action_policy | 3 | AZ-640, AZ-669, AZ-670 | AZ-630 |
| AZ-672 | vlm_client_provider_trait | 2 | AZ-640 | AZ-631 |
| AZ-673 | vlm_client_nanollm_ipc | 5 | AZ-640, AZ-672 | AZ-631 |
| AZ-674 | vlm_client_schema_and_model_version | 3 | AZ-640, AZ-673 | AZ-631 |
| AZ-675 | telemetry_stream_grpc_server | 3 | AZ-640, AZ-649, AZ-657 | AZ-637 |
| AZ-676 | telemetry_stream_video_path | 3 | AZ-640, AZ-657, AZ-675 | AZ-637 |
| AZ-677 | telemetry_stream_mapobjects_snapshot | 3 | AZ-640, AZ-675, AZ-667 | AZ-637 |
| AZ-678 | operator_bridge_command_auth | 5 | AZ-640, AZ-675 | AZ-628 |
| AZ-679 | operator_bridge_poi_surface | 3 | AZ-640, AZ-675 | AZ-628 |
| AZ-680 | operator_bridge_command_dispatch | 3 | AZ-640, AZ-678 | AZ-628 |
| AZ-681 | operator_bridge_safety_and_bit_ack | 3 | AZ-640, AZ-678, AZ-650, AZ-652 | AZ-628 |
| AZ-682 | scan_controller_state_machine | 5 | AZ-640, AZ-649 | AZ-635 |
| AZ-683 | scan_controller_poi_queue_and_window | 5 | AZ-640, AZ-682 | AZ-635 |
| AZ-684 | scan_controller_evidence_ladder | 5 | AZ-640, AZ-682, AZ-683, AZ-660, AZ-671, AZ-672 | AZ-635 |
| AZ-685 | scan_controller_mapobjects_dispatch | 3 | AZ-640, AZ-682, AZ-684, AZ-665, AZ-666, AZ-667 | AZ-635 |
| AZ-686 | scan_controller_gimbal_issuance | 3 | AZ-640, AZ-682, AZ-683, AZ-684, AZ-654, AZ-655, AZ-656, AZ-648 | AZ-635 |
## Per-Epic Roll-Up
| Epic | Component | Tasks | Points |
|------|-----------|-------|--------|
| AZ-626 | bootstrap | 1 | 5 |
| AZ-627 | frame_ingest | 3 | 11 |
| AZ-628 | detection_client + operator_bridge | 6 | 21 |
| AZ-629 | movement_detector | 3 | 13 |
| AZ-630 | semantic_analyzer | 3 | 13 |
| AZ-631 | vlm_client | 3 | 10 |
| AZ-633 | mapobjects_store | 4 | 16 |
| AZ-634 | gimbal_controller | 4 | 14 |
| AZ-635 | scan_controller | 5 | 21 |
| AZ-636 | mission_executor | 5 | 20 |
| AZ-637 | mavlink_layer + telemetry_stream | 6 | 20 |
| AZ-638 | mission_client | 4 | 13 |
| **TOTAL** | — | **47** | **173** |
> Note: Epic AZ-628 holds both `detection_client` and `operator_bridge` (per the Plan-step epic assignment, which placed both Action-plane components under the same epic). Epic AZ-637 holds both `mavlink_layer` and `telemetry_stream`.
## Coverage Check (named runtime capabilities → task)
Every named capability from `architecture.md`, `system-flows.md`, and per-component `description.md` files maps to a production-implementation task (no scaffold-only coverage):
- MAVLink transport + heartbeat → AZ-641; MAVLink v2 codec → AZ-642; ack demux + signing → AZ-643
- Mission pull + schema → AZ-644; middle-waypoint POST → AZ-645; MapObjects pull → AZ-646; MapObjects push (durable) → AZ-647
- Mission state machine → AZ-648; telemetry forwarding → AZ-649; BIT F9 → AZ-650; lost-link F10 → AZ-651; geofence/battery + safety + resume → AZ-652
- A40 vendor transport (CRC16, UDP) → AZ-653; zoom-out sweep → AZ-654; smooth-pan plan → AZ-655; centre-on-target → AZ-656
- RTSP session → AZ-657; H.264/265 decode (NVDEC + sw fallback) → AZ-658; broadcast publisher (zero-copy) → AZ-659
- Detection bi-di gRPC + budgeting → AZ-660; detection schema + model_version + health → AZ-661
- Ego-motion (OpenCV + telemetry sync) → AZ-662; clustering + persistence + emission → AZ-663; FP cap + Q14 hook → AZ-664
- H3 classify → AZ-665; ignored set + sweep → AZ-666; hydrate + sync_state + pending → AZ-667; persistence (JSON snapshot) → AZ-668
- Primitive graph + freshness → AZ-669; ROI CNN (ONNX) → AZ-670; action policy + pan plan → AZ-671
- VLM provider trait + disabled + feature → AZ-672; NanoLLM UDS + peer-cred → AZ-673; schema + model_version → AZ-674
- Operator telemetry gRPC + per-client lossy → AZ-675; video path (rtsp_forward / bytes_inline) + ai_locked → AZ-676; MapObjects snapshot + diff + resnap → AZ-677
- Operator command auth (sig + replay + session) → AZ-678; POI surface format → AZ-679; command dispatch + idempotency → AZ-680; BIT-degraded ack + safety-override → AZ-681
- Scan state machine + frame-rate floor → AZ-682; POI queue + 5/min cap + decision window → AZ-683; evidence ladder + zoom-in candidate routing → AZ-684; classify/ignored/degraded-sync dispatch → AZ-685; gimbal issuance + mission hint + health fallback → AZ-686
## DAG Validation Notes
- Every dependency targets a strictly smaller task number → no cycles.
- `AZ-640_initial_structure` is the unique root.
- Maximum dependency depth = 11 (AZ-686: 640 → 657 → 658 → 659 → 660 → 661 → 669 → 670 → 671 → 684 → 686 along one path; AZ-671 is reachable from AZ-660 only through AZ-661, AZ-669, and AZ-670).
- No task depends on a task in a later epic that itself depends back into an earlier epic.
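The first and third validation notes can be checked mechanically. The following is a minimal sketch, assuming the dependency table has been loaded into a map from task number to dependency numbers; the `Deps` alias, the function names, and the hand-copied `sample_rows` excerpt (including AZ-657's single dependency on AZ-640) are illustrative assumptions, not part of the task specs.

```rust
use std::collections::HashMap;

/// Task number -> task numbers it depends on (hypothetical in-memory form of the table).
type Deps = HashMap<u32, Vec<u32>>;

/// True when every dependency points at a strictly smaller task number.
/// That ordering alone rules out cycles.
fn deps_point_backwards(deps: &Deps) -> bool {
    deps.iter().all(|(task, ds)| ds.iter().all(|d| d < task))
}

/// Number of tasks on the longest dependency chain ending at `task`
/// (a root counts as 1). Only call this after `deps_point_backwards`
/// has returned true, otherwise the recursion may not terminate.
fn depth(task: u32, deps: &Deps, memo: &mut HashMap<u32, usize>) -> usize {
    if let Some(&d) = memo.get(&task) {
        return d;
    }
    let mut best = 0;
    if let Some(ds) = deps.get(&task) {
        for &dep in ds {
            best = best.max(depth(dep, deps, memo));
        }
    }
    memo.insert(task, best + 1);
    best + 1
}

/// Hand-copied excerpt: the frame_ingest and detection_client rows only.
fn sample_rows() -> Deps {
    HashMap::from([
        (640, vec![]),
        (657, vec![640]),
        (658, vec![640, 657]),
        (659, vec![640, 657, 658]),
        (660, vec![640, 659]),
        (661, vec![640, 660]),
    ])
}
```

Running `depth` over every task and taking the maximum gives the figure quoted in the notes, so the number can be regenerated whenever the table changes instead of being maintained by hand.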
## Notes for /implement
- Total complexity 173 pts: at ~5 pts/day per implementer that's ~35 implementer-days. Parallelism is bounded by the DAG; the widest parallel layer is just after AZ-640 (15 leaf-level tasks: AZ-641, AZ-642, AZ-644, AZ-653, AZ-657, AZ-665, AZ-672 plus the bootstrap-only-dep tasks below them on each chain).
- Suite-level integration tests (Phase 2 in monorepo-document terms) are NOT in this table — they belong to the suite-e2e epic and are decomposed separately when the suite-e2e harness is wired up.
@@ -0,0 +1,374 @@
# Initial Project Structure
**Task**: AZ-640_initial_structure
**Name**: Initial Structure
**Description**: Scaffold the Rust cargo workspace — per-component crates, shared crate, runtime composition root, Dockerfile + docker-compose for dev/test, Woodpecker CI pipeline, observability scaffold, on-device state directory, env config, and replay-based integration test layout.
**Complexity**: 5 points
**Dependencies**: None
**Component**: Bootstrap
**Tracker**: AZ-640
**Epic**: AZ-626
## Project Folder Layout
```
autopilot/
├── Cargo.toml # cargo workspace manifest
├── Cargo.lock
├── rust-toolchain.toml # pin stable channel + components
├── .cargo/
│ └── config.toml # cross-compile target = aarch64-unknown-linux-gnu
├── .woodpecker.yml # CI pipeline (per deployment/ci_cd_pipeline.md)
├── .dockerignore
├── Dockerfile # multi-stage; non-root; pinned l4t-base for prod, ubuntu:22.04 for emul
├── docker-compose.yml # dev: autopilot + mock detections + mock missions + mock ground-station
├── docker-compose.test.yml # blackbox: autopilot + ArduPilot SITL + mock detections + replay sources
├── .env.example # documented environment variables
├── config/
│ ├── autopilot.dev.toml # dev profile (mock endpoints)
│ ├── autopilot.staging.toml # staging profile (real endpoints, non-flight)
│ └── autopilot.prod.toml # prod template (Jetson on-airframe)
├── crates/
│ ├── autopilot/ # binary crate — runtime composition root
│ │ ├── Cargo.toml # `[[bin]] name = "autopilot"`
│ │ ├── src/
│ │ │ ├── main.rs # CLI parse, config load, wire actors, run
│ │ │ ├── runtime.rs # actor topology, health aggregator, shutdown
│ │ │ └── health_server.rs # HTTP /health endpoint (port from config)
│ │ └── tests/ # cross-crate integration tests (replay-based)
│ ├── shared/
│ │ ├── Cargo.toml
│ │ └── src/
│ │ ├── lib.rs # re-exports
│ │ ├── models/ # canonical entities from data_model.md
│ │ │ ├── mod.rs
│ │ │ ├── frame.rs # Frame, BoundingBox
│ │ │ ├── detection.rs # Detection, DetectionBatch
│ │ │ ├── movement.rs # MovementCandidate
│ │ │ ├── tier2.rs # Tier2Evidence
│ │ │ ├── vlm.rs # VlmAssessment
│ │ │ ├── poi.rs # POI
│ │ │ ├── mapobject.rs # MapObject, MapObjectObservation, MapObjectsBundle, IgnoredItem
│ │ │ ├── mission.rs # MissionItem, MissionWaypoint, Geofence, Coordinate
│ │ │ ├── operator.rs # OperatorCommand
│ │ │ └── gimbal.rs # GimbalState
│ │ ├── config/ # toml loader + typed config sections
│ │ ├── error.rs # AutopilotError enum, Result alias
│ │ ├── health.rs # ComponentHealth, AggregatedHealth
│ │ ├── observability/ # tracing-subscriber init + log field constants
│ │ └── clock.rs # monotonic + wall-clock binding (GPS / NTP)
│ ├── frame_ingest/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs # public API trait + actor handle
│ │ ├── src/internal/ # decoder, RTSP client
│ │ └── tests/ # replay-based unit tests against fixture RTSP clips
│ ├── detection_client/
│ │ ├── Cargo.toml
│ │ ├── build.rs # tonic-build for ../detections .proto
│ │ ├── proto/ # copy of ../detections gRPC contract
│ │ ├── src/lib.rs
│ │ └── tests/
│ ├── movement_detector/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs
│ │ ├── src/internal/ # ego-motion, optical-flow, per-zoom-band thresholds
│ │ └── tests/ # replay fixtures, zoom-out + zoom-in
│ ├── semantic_analyzer/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs
│ │ ├── src/internal/ # primitive graph, ROI CNN call
│ │ └── tests/
│ ├── vlm_client/
│ │ ├── Cargo.toml # feature = ["vlm"] — see autopilot/Cargo.toml
│ │ ├── src/lib.rs # default impl returns VlmAssessment{status=vlm_disabled}
│ │ ├── src/internal/ # UDS client + peer-cred check
│ │ └── tests/
│ ├── scan_controller/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs
│ │ ├── src/state_machine/ # ZoomedOut / ZoomedIn / TargetFollow types
│ │ ├── src/poi_queue/ # priority queue + ≤5 POIs/min cap
│ │ └── tests/ # behaviour-tree scenarios from system-flows.md §F4
│ ├── mapobjects_store/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs
│ │ ├── src/internal/h3_index/ # h3rs wrapper
│ │ ├── src/internal/engine/ # engine trait + in-memory+snapshot default impl (Q3)
│ │ └── tests/
│ ├── gimbal_controller/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs
│ │ ├── src/internal/a40_protocol/ # ViewPro A40 UDP vendor protocol
│ │ └── tests/ # mock A40 over UDP
│ ├── operator_bridge/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs
│ │ ├── src/internal/auth/ # OperatorCommand envelope validation (Q9 — stubbed)
│ │ └── tests/
│ ├── mission_executor/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs
│ │ ├── src/internal/multirotor/ # multirotor variant FSM
│ │ ├── src/internal/fixed_wing/ # fixed-wing variant FSM
│ │ ├── src/internal/geofence/ # INCLUSION + EXCLUSION enforcement
│ │ ├── src/internal/failsafe/ # lost-link ladder, battery thresholds
│ │ └── tests/ # ArduPilot SITL fixtures
│ ├── mavlink_layer/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs
│ │ ├── src/internal/codec/ # MAVLink v2 encode/decode (only §7.7 surface)
│ │ ├── src/internal/transport/ # UDP and serial connection abstraction
│ │ └── tests/ # SITL conformance fixtures
│ ├── mission_client/
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs
│ │ ├── src/internal/missions_api/ # HTTPS REST client; pull + middle-waypoint POST
│ │ ├── src/internal/mapobjects_sync/ # pre-flight GET + post-flight POST of /mapobjects bundles
│ │ └── tests/ # mock missions API
│ └── telemetry_stream/
│ ├── Cargo.toml
│ ├── src/lib.rs
│ ├── src/internal/uplink/ # modem push of frames + telemetry + bbox overlay
│ └── tests/ # mock Ground Station receiver
├── tests/
│ └── e2e/ # cross-crate blackbox scenarios (used by docker-compose.test.yml)
├── benches/
│ ├── tier1_latency.rs # benchmark-gate harness for §6 NFRs
│ ├── tier2_latency.rs
│ ├── gimbal_zoom.rs
│ └── movement_fpr.rs # per-zoom-band FPR replay benchmark
├── fixtures/
│ ├── rtsp/ # pre-recorded RTSP clips
│ ├── mavlink/ # ArduPilot SITL replay scripts
│ ├── missions/ # mission JSON fixtures
│ └── detections/ # deterministic Tier-1 response fixtures
├── deploy/
│ ├── systemd/
│ │ └── autopilot.service # per deployment/containerization.md §3
│ └── jetson/
│ └── README.md # on-airframe install steps
└── README.md
```
### Layout Rationale
- **Cargo workspace with one crate per component** matches the recommended Rust layout in `_docs/02_document/decompose/templates/module-layout.md` (`crates/<component>/`). It enforces module boundaries: a crate's internals (`internal/`, private modules) are unreachable from sibling components; only its `lib.rs` public surface is reachable.
- **Single binary crate `crates/autopilot/`** is the runtime composition root (per `deployment/containerization.md` — "single Rust binary"). It depends on every component crate and wires the actor topology in `runtime.rs`.
- **`crates/shared/`** owns the canonical entity catalogue from `data_model.md` and cross-cutting concerns (config, error, health, observability, clock). All component crates may import from it; it imports from no one.
- **`fixtures/` separate from `tests/`** because the same fixtures feed unit tests, replay-based integration tests, blackbox tests, and benchmark gates.
- **`vlm_client` crate exists unconditionally**; the optional behaviour is implemented via a default `VlmAssessment` provider that returns `status=vlm_disabled` when the `vlm` feature is off (per `architecture.md §7.6` "Optionality model").
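The optionality model in the last bullet can be sketched with a trait whose default method returns the disabled status. This is only an illustration under assumptions: the `VlmStatus` enum, the `Assessed` variant, the placeholder `Roi` type, the `String` error, and `default_provider` are hypothetical names; the spec itself only fixes `VlmProvider`, `VlmAssessment`, and `status=vlm_disabled`.

```rust
/// Outcome tag carried by an assessment. Only `VlmDisabled` comes from the
/// spec (`status=vlm_disabled`); `Assessed` is a placeholder for the real path.
#[derive(Debug, Clone, PartialEq, Eq)]
enum VlmStatus {
    Assessed,
    VlmDisabled,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct VlmAssessment {
    status: VlmStatus,
}

/// Hypothetical stand-in for the region-of-interest type.
struct Roi;

trait VlmProvider {
    /// Default behaviour: report that the VLM is disabled.
    /// A real provider (for example the NanoLLM IPC client) overrides this.
    fn assess(&self, _roi: &Roi) -> Result<VlmAssessment, String> {
        Ok(VlmAssessment { status: VlmStatus::VlmDisabled })
    }
}

/// Provider used when the `vlm` feature is off: it inherits the default method.
struct DisabledVlm;
impl VlmProvider for DisabledVlm {}

/// What the composition root could hand to `scan_controller` in a no-vlm build.
fn default_provider() -> Box<dyn VlmProvider> {
    Box::new(DisabledVlm)
}
```

Because callers only see `dyn VlmProvider`, the consumer compiles identically with and without the feature; in a feature-enabled build the composition root would presumably return a different implementation from the same function.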
## DTOs and Interfaces
### Shared DTOs (live in `crates/shared/src/models/`)
| DTO | Source spec | Used by components |
|---|---|---|
| `Frame`, `BoundingBox` | `data_model.md §2` | `frame_ingest`, `detection_client`, `movement_detector`, `semantic_analyzer`, `telemetry_stream` |
| `Detection`, `DetectionBatch` | `data_model.md §2` | `detection_client`, `scan_controller`, `telemetry_stream`, `operator_bridge` |
| `MovementCandidate` | `data_model.md §2` | `movement_detector`, `scan_controller` |
| `Tier2Evidence` | `data_model.md §2` | `semantic_analyzer`, `scan_controller` |
| `VlmAssessment` | `data_model.md §2` | `vlm_client`, `scan_controller` |
| `POI` | `data_model.md §3` | `scan_controller`, `operator_bridge`, `telemetry_stream` |
| `MapObject`, `MapObjectObservation`, `MapObjectsBundle`, `IgnoredItem` | `data_model.md §3` | `mapobjects_store`, `mission_client`, `scan_controller` |
| `Coordinate`, `Geofence`, `MissionItem` | `data_model.md §4` | `mission_client`, `mission_executor`, `operator_bridge` |
| `MissionWaypoint` | `data_model.md §4` | `mission_executor`, `mavlink_layer` |
| `OperatorCommand` | `data_model.md §4` | `operator_bridge`, `scan_controller`, `mission_executor` |
| `GimbalState` | `data_model.md §4` | `gimbal_controller`, `frame_ingest`, `movement_detector` |
| `AutopilotError`, `Result<T>` | new | every crate |
| `ComponentHealth`, `AggregatedHealth` | new (per `containerization.md §7`) | every crate + `autopilot/runtime.rs` |
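As an illustration of the two health DTOs in the last row, here is a minimal sketch. The field names, the `HealthStatus` enum, and the worst-of aggregation rule (including treating an empty list as green) are assumptions for the example; the authoritative shape is whatever `containerization.md §7` defines.

```rust
/// Ordered so that the derived `Ord` ranks Red as the worst status.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum HealthStatus {
    Green,
    Yellow,
    Red,
}

#[derive(Debug, Clone)]
struct ComponentHealth {
    component: &'static str,
    status: HealthStatus,
}

#[derive(Debug)]
struct AggregatedHealth {
    status: HealthStatus,
    components: Vec<ComponentHealth>,
}

/// Assumed rule: the aggregate is as bad as the worst component.
/// An empty list is reported as green, which is a choice made for this sketch.
fn aggregate(components: Vec<ComponentHealth>) -> AggregatedHealth {
    let status = components
        .iter()
        .map(|c| c.status)
        .max()
        .unwrap_or(HealthStatus::Green);
    AggregatedHealth { status, components }
}
```

Keeping the per-component entries alongside the aggregate lets a `/health` response show both the overall colour and the breakdown.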
### Component Public APIs (live in each component's `lib.rs`)
Each component exposes an actor handle plus its narrow request/response trait. Inter-component communication is Tokio channels owned inside the component; consumers receive a typed handle, not the underlying `tokio::sync::*` types.
| Component | Public surface (handle methods) | Exposed to |
|---|---|---|
| `frame_ingest` | `FrameIngestHandle::subscribe() -> FrameStream`, `health()` | `detection_client`, `movement_detector`, `telemetry_stream` |
| `detection_client` | `DetectionClientHandle::request(Frame) -> Result<DetectionBatch>`, `health()` | `scan_controller`, `movement_detector`, `telemetry_stream` |
| `movement_detector` | `MovementDetectorHandle::candidates() -> CandidateStream`, `health()` | `scan_controller` |
| `semantic_analyzer` | `SemanticAnalyzerHandle::analyze(Roi) -> Result<Tier2Evidence>`, `health()` | `scan_controller` |
| `vlm_client` (trait) | `VlmProvider::assess(Roi) -> Result<VlmAssessment>` (default impl returns `vlm_disabled`) | `scan_controller` |
| `scan_controller` | `ScanControllerHandle::tick(), submit_operator_cmd(OperatorCommand)`, `health()` | `autopilot::runtime` |
| `mapobjects_store` | `MapObjectsStoreHandle::classify(Detection) -> Classification`, `apply_decline(POI)`, `dump_pending() -> MapObjectsBundle`, `hydrate(MapObjectsBundle)`, `health()` | `scan_controller`, `mission_client` |
| `gimbal_controller` | `GimbalControllerHandle::set_pose(GimbalCommand), zoom(level), state() -> GimbalState`, `health()` | `scan_controller` |
| `operator_bridge` | `OperatorBridgeHandle::surface_poi(POI) -> OperatorDecision`, `cmds() -> CommandStream`, `health()` | `scan_controller`, `mission_executor` |
| `mission_executor` | `MissionExecutorHandle::start(Mission), insert_middle_waypoint(Coordinate), failsafe_trigger(FailsafeKind)`, `health()` | `scan_controller`, `operator_bridge` |
| `mavlink_layer` | `MavlinkHandle::send(Command), telemetry() -> TelemetryStream`, `health()` | `mission_executor`, `telemetry_stream` |
| `mission_client` | `MissionClientHandle::pull_mission() -> Mission`, `post_middle_waypoint(Coordinate)`, `pull_mapobjects(MissionId) -> MapObjectsBundle`, `push_mapobjects(MapObjectsBundle)`, `health()` | `mission_executor`, `mapobjects_store` |
| `telemetry_stream` | `TelemetryStreamHandle::push_frame(Frame, Overlay), push_telemetry(Sample)`, `health()` | `frame_ingest`, `detection_client`, `mavlink_layer`, `operator_bridge` |
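The handle pattern behind this table (channels owned inside the component, a typed handle outside) can be sketched as follows. To stay dependency-free the sketch swaps Tokio channels for `std::sync::mpsc` and a thread, which is a deliberate substitution; the struct fields, the `AnalyzeMsg` type, the `String` error, and the constant score are hypothetical.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

struct Roi {
    id: u32,
}

#[derive(Debug, PartialEq)]
struct Tier2Evidence {
    roi_id: u32,
    score: f32,
}

/// Message sent to the actor: the request plus a one-shot reply channel.
struct AnalyzeMsg {
    roi: Roi,
    reply: Sender<Tier2Evidence>,
}

/// Cloneable handle; no channel type appears in its public method signatures.
#[derive(Clone)]
struct SemanticAnalyzerHandle {
    tx: Sender<AnalyzeMsg>,
}

impl SemanticAnalyzerHandle {
    /// Starts the actor and returns the handle that talks to it.
    fn spawn() -> Self {
        let (tx, rx) = channel::<AnalyzeMsg>();
        thread::spawn(move || {
            // Actor loop: ends once every handle has been dropped.
            for msg in rx {
                // Placeholder analysis; the real actor would run Tier 2 here.
                let evidence = Tier2Evidence { roi_id: msg.roi.id, score: 0.5 };
                let _ = msg.reply.send(evidence);
            }
        });
        Self { tx }
    }

    /// Request/response call that hides the message passing.
    fn analyze(&self, roi: Roi) -> Result<Tier2Evidence, String> {
        let (reply, rx) = channel();
        self.tx
            .send(AnalyzeMsg { roi, reply })
            .map_err(|_| "actor stopped".to_string())?;
        rx.recv().map_err(|_| "actor dropped the request".to_string())
    }
}
```

A consumer such as `scan_controller` would hold a clone of the handle and never learn which channel implementation sits behind it, so the internals could move between channel types without touching callers.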
## CI/CD Pipeline
Single Woodpecker pipeline (per `deployment/ci_cd_pipeline.md §2`). Stages run sequentially; a failed stage stops the run.
| Stage | Purpose | Tool / Command |
|---|---|---|
| Fetch | Clone, restore Cargo cache | `cargo fetch` with remote cache key |
| Lint | Formatting and lint gate; hard fail on any warning | `cargo fmt --check`; `cargo clippy --all-targets --all-features -- -D warnings` |
| Unit Tests | Host-arch unit tests; most logic is platform-independent | `cargo test --workspace` |
| Build arm64 | Cross-compile for `aarch64-unknown-linux-gnu`; produce binary + debug symbols | `cross` or `cargo zigbuild` |
| Build no-vlm | Enforce the VLM optionality contract | `cargo build --workspace --no-default-features` |
| Integration Tests | Replay-based, no hardware; fixtures from `fixtures/` | `cargo test --test '*'` (tests marked `#[ignore]` are skipped by default) |
| SITL Conformance | ArduPilot SITL + autopilot binary in containers, fixed mission, asserts §7.7 surface + geofence | `docker compose -f docker-compose.test.yml up --abort-on-container-exit` |
| Security Scan | Dependency CVE scan | `cargo audit` + `cargo deny check` |
| Benchmark Gate (manual / nightly) | Tier 1 / 2 / VLM / gimbal latency on real Jetson | Runs on self-hosted Jetson Orin Nano runner |
| Package | Build container image | Multi-arch tag `azaion/autopilot:<branch>-arm64` |
| Sign | Cosign for image; OS signing flow for binary | Tagged builds only |
| Publish | Push image + binary to internal registry | Tagged builds only |
### Pipeline Configuration Notes
- Cache `~/.cargo/registry/`, `~/.cargo/git/`, and `target/` between runs keyed on `Cargo.lock` hash.
- `--features vlm` and the no-feature path are both built and tested to enforce the optionality contract.
- `dev` and `main` branches are protected; force-push forbidden; merges require a green pipeline.
- Benchmark gate is opt-in (manual approval or nightly cron) because it requires a Jetson runner.
## Environment Strategy
| Environment | Purpose | Configuration Notes |
|---|---|---|
| Development (local) | Run autopilot locally against mock detections + mock missions + mock Ground Station; iterate on logic | `docker compose -f docker-compose.yml up`; `config/autopilot.dev.toml`; `RUST_LOG=info,autopilot=debug` |
| Staging | Pre-production: real `../detections`, real `missions` API, real `Ground Station`, but no airframe MAVLink (SITL instead) | `config/autopilot.staging.toml`; secrets via `EnvironmentFile=` |
| Production (airframe) | Native systemd on Jetson Orin Nano per `containerization.md §3` | `/etc/azaion/autopilot/config.toml`; `/etc/systemd/system/autopilot.service`; `/var/lib/autopilot/`; `/run/azaion/in-flight` flight-gate marker |
| CI (Tier-1) | Lint + unit + replay-based integration on amd64 | Woodpecker agent on amd64; no GPU |
| CI (Tier-2) | Benchmark gate on real Jetson | Self-hosted Jetson Orin Nano Super runner; pinned JetPack + power mode |
### Environment Variables
| Variable | Dev | Staging | Production | Description |
|---|---|---|---|---|
| `AUTOPILOT_CONFIG` | `./config/autopilot.dev.toml` | `/etc/azaion/autopilot/config.toml` | `/etc/azaion/autopilot/config.toml` | Path to TOML config |
| `RUST_LOG` | `info,autopilot=debug` | `info` | `info` | `tracing-subscriber` filter |
| `AUTOPILOT_MISSION_ID` | (per-flight CLI arg) | (per-flight CLI arg) | (per-flight CLI arg) | Active mission UUID; CLI arg, not env |
| `AUTOPILOT_HEALTH_BIND` | `127.0.0.1:8080` | `127.0.0.1:8080` | `127.0.0.1:8080` | HTTP `/health` bind address |
| `AUTOPILOT_VLM_ENABLED` | `false` | `false` (until benchmark passes) | per benchmark | Runtime VLM flag; binary must also build with `--features vlm` |
| `MISSIONS_API_TOKEN` | (mock) | from `EnvironmentFile=` | from `EnvironmentFile=` | Bearer token; never in `config.toml` |
| `GROUND_STATION_TOKEN` | (mock) | from `EnvironmentFile=` | from `EnvironmentFile=` | Bearer / session token |
All non-secret configuration lives in `config.toml` (per `containerization.md §6`). Secrets come from `EnvironmentFile=` under systemd and from compose `secrets:` in containers.
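A minimal sketch of how the binary might read these variables at start-up follows. The `EnvSettings` struct, the function names, and the choice to fall back to the Dev column when a variable is unset are assumptions made for illustration; the lookup is passed in as a closure so the logic can be exercised without touching the real process environment.

```rust
use std::net::SocketAddr;

/// Hypothetical container for the environment-derived settings.
#[derive(Debug)]
struct EnvSettings {
    config_path: String,
    health_bind: SocketAddr,
    /// Secret: comes from the environment only, never from `config.toml`.
    missions_api_token: Option<String>,
}

/// `get` abstracts the environment lookup.
fn load_env(get: impl Fn(&str) -> Option<String>) -> Result<EnvSettings, String> {
    // Assumed defaults: the Dev column of the table above.
    let config_path =
        get("AUTOPILOT_CONFIG").unwrap_or_else(|| "./config/autopilot.dev.toml".to_string());
    let bind_raw = get("AUTOPILOT_HEALTH_BIND").unwrap_or_else(|| "127.0.0.1:8080".to_string());
    let health_bind = bind_raw
        .parse::<SocketAddr>()
        .map_err(|e| format!("AUTOPILOT_HEALTH_BIND: {e}"))?;
    Ok(EnvSettings {
        config_path,
        health_bind,
        missions_api_token: get("MISSIONS_API_TOKEN"),
    })
}

/// Production entry point: reads the real process environment.
fn load_from_process_env() -> Result<EnvSettings, String> {
    load_env(|key| std::env::var(key).ok())
}
```

A malformed bind address is returned as an error rather than replaced by a default, so a typo in the unit's environment file stops start-up instead of silently binding elsewhere.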
## Database Migration Approach
**Migration tool**: none — autopilot has **no traditional database**.
**Persistence strategy**: the only persisted data is the on-device `mapobjects_store`. Its engine is open (`architecture.md §8 Q3`); the bootstrap default is **in-memory + snapshot to `/var/lib/autopilot/mapobjects/`** (file-backed, no schema migrations). When Q3 resolves toward SQLite + H3 or another engine, the `mapobjects_store` crate's engine module is swapped without changing its public API. The central `missions` API owns its own Postgres schema (per `architecture.md §7.13`) — autopilot does NOT migrate central tables.
### Initial Persisted Surface
| Subsystem | What is persisted | Where | Format |
|---|---|---|---|
| `mapobjects_store` | `current_state`, `pending_observations`, `pending_ignored`, `sync_state` | `/var/lib/autopilot/mapobjects/` | engine-defined; default = JSON snapshots + append-only log |
| `operator_bridge` audit log | accepted/rejected `OperatorCommand` envelopes | `/var/lib/autopilot/audit/` | newline-delimited JSON |
| `mission_client` deferred uploads | post-flight push payload on push failure | `/var/lib/autopilot/pending_pushes/` | JSON files keyed by mission ID |
Disk quota for `/var/lib/autopilot/` is configured in `config.toml`; persistent-store-full at pre-flight BIT is a takeoff blocker (per `architecture.md §5`).
## Test Structure
```
crates/<component>/
└── tests/ # crate-level integration tests; per-crate
└── <scenario>.rs
tests/
└── e2e/ # workspace-level end-to-end (uses docker-compose.test.yml)
├── sitl_conformance.rs # SITL gate per ci_cd_pipeline.md §5
├── geofence_inclusion.rs
├── geofence_exclusion.rs # explicit regression vs earlier silent-ignore behaviour
├── lost_link_failsafe.rs
└── operator_command_replay.rs
fixtures/
├── rtsp/<clip>.h264
├── mavlink/<replay>.tlog
├── missions/<mission>.json
└── detections/<deterministic>.json
benches/
├── tier1_latency.rs # benchmark-gate harness
├── tier2_latency.rs
├── gimbal_zoom.rs
└── movement_fpr.rs # per-zoom-band FPR replay
```
### Test Configuration Notes
- **Unit tests** live alongside each component's source in `#[cfg(test)] mod tests { ... }` within `src/` files. They MUST run in <5 s on a developer workstation; no network, no Docker.
- **Crate-level integration tests** live in `crates/<component>/tests/`. They may use fixtures from `fixtures/` but MUST NOT cross component boundaries — that's what workspace e2e is for.
- **Workspace e2e** in `tests/e2e/` exercises the full binary against a docker-compose-managed stack (ArduPilot SITL, mock missions API, mock detections gRPC, replay RTSP).
- **Replay-driven debugging**: all non-trivial decisions are reconstructable from logs + size-capped raw inputs (per `observability.md §6`). Replay fixtures are the foundation of regression tests.
- **Test runner**: `cargo test --workspace` for unit + integration; `docker compose -f docker-compose.test.yml up --abort-on-container-exit` for e2e; `cargo bench` (or `criterion`) for benchmark-gate measurements.
- **Mock-data discipline**: mocks live in `tests/` directories only — never in production crates (per `coderule.mdc`).
## Implementation Order
| Order | Component | Reason |
|---|---|---|
| 1 | `shared` (models + config + error + health + observability + clock) | Every other crate depends on it; it depends on no other workspace crate. Must land first. |
| 2 | `mavlink_layer` | Self-contained transport; required by `mission_executor` and `telemetry_stream`; SITL conformance lands the first hard gate early. |
| 3 | `mission_client` | Self-contained REST client; required by `mission_executor` and `mapobjects_store` sync. |
| 4 | `mission_executor` | Combines `mavlink_layer` + `mission_client` + geofence/failsafe logic; gates takeoff via BIT. |
| 5 | `gimbal_controller` | Self-contained A40 UDP driver; required by `scan_controller`. |
| 6 | `frame_ingest` | RTSP decoder; required by all perception crates. |
| 7 | `detection_client` | gRPC client to `../detections`; required by `scan_controller` and `telemetry_stream`. |
| 8 | `movement_detector` | Depends on `frame_ingest` + `GimbalState`; standalone otherwise. |
| 9 | `mapobjects_store` | Engine choice may be deferred; default in-memory+snapshot unblocks `scan_controller`. |
| 10 | `semantic_analyzer` | Tier 2; depends on `Frame` + `Detection`. |
| 11 | `vlm_client` | Optional; default impl returns `vlm_disabled`. Real IPC implementation can land later. |
| 12 | `telemetry_stream` | Pure egress; ready once `frame_ingest`, `detection_client`, `mavlink_layer` exist. |
| 13 | `operator_bridge` | Depends on `telemetry_stream` + `mapobjects_store`; envelope auth scheme is Q9-stubbed. |
| 14 | `scan_controller` | Sits on top of everything in Perception + Action; lands last. |
| 15 | `autopilot` binary (composition root) | Wires every component handle; runs the actor topology. |
## Acceptance Criteria
**AC-1: Workspace scaffolded**
Given the structure plan above
When the implementer executes this task
Then `cargo metadata` lists all 15 crates (`shared`, `autopilot`, and the 13 component crates, including the optional `vlm_client`) and `cargo check --workspace` succeeds with no compile errors.
**AC-2: Stub tests runnable**
Given the scaffolded workspace
When `cargo test --workspace` runs on a developer workstation (no Docker, no GPU)
Then every crate's stub test (e.g. `it_compiles()`) passes within 5 seconds total.
**AC-3: CI pipeline configured**
Given the scaffolded workspace
When the Woodpecker pipeline runs on a feature branch push
Then `fetch → lint → unit-test → build-arm64 → build-no-vlm → integration-test → sitl-conformance` all complete successfully on a known-good baseline commit.
**AC-4: Dev compose boots**
Given `docker-compose.yml`
When `docker compose -f docker-compose.yml up -d` runs on a fresh workstation
Then the autopilot container starts, the `/health` endpoint returns HTTP 200 with `status: green | yellow` (red is acceptable here only for components without a mock target), and the mock detections + mock missions services are reachable.
**AC-5: Blackbox compose boots with SITL**
Given `docker-compose.test.yml`
When `docker compose -f docker-compose.test.yml up --abort-on-container-exit` runs
Then ArduPilot SITL + autopilot + mock detections + replay RTSP all start, and the SITL conformance e2e test exits 0.
**AC-6: Optionality contract enforced**
Given the scaffolded workspace
When `cargo build --workspace --no-default-features` runs
Then the binary builds and links without the `vlm` feature; `cargo test --workspace --no-default-features` passes; the `VlmProvider` default impl returns `VlmAssessment{status=vlm_disabled}`.
**AC-7: Cross-compile target ready**
Given `.cargo/config.toml` configured for `aarch64-unknown-linux-gnu`
When `cross build --target aarch64-unknown-linux-gnu --release` (or `cargo zigbuild` equivalent) runs in CI
Then an aarch64 binary is produced and stored as an artifact.
**AC-8: Flight-gate marker wiring exists**
Given `deploy/systemd/autopilot.service`
When systemd parses the unit
Then `ExecStartPre` asserts `/run/azaion/in-flight` is created and `ExecStopPost` removes it (per `containerization.md §3` and the suite-level flight-gate convention).
**AC-9: Observability scaffold initialised**
Given the autopilot binary
When it starts
Then `tracing-subscriber` emits JSON-formatted logs to stdout with the per-line fields enumerated in `observability.md §2` (`ts`, `ts_mono_ns`, `level`, `target`, `event`), and the `/health` endpoint returns the per-component breakdown documented in `containerization.md §7`.
**AC-10: Persistent state directory created**
Given `/var/lib/autopilot/` (or its container-mounted equivalent)
When autopilot starts in dev or prod
Then the binary creates `mapobjects/`, `audit/`, and `pending_pushes/` subdirectories owned by the user the service runs as, fails closed if any directory cannot be created, and surfaces the failure to `/health` (red on `mapobjects_store`).
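The directory part of AC-10 could look roughly like the sketch below. It is an assumption-laden illustration: `ensure_state_dirs` and `STATE_SUBDIRS` are hypothetical names, ownership is not handled (directories end up owned by whichever user runs the process), and reporting the failure to `/health` is left to the caller.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Subdirectories named in AC-10.
const STATE_SUBDIRS: [&str; 3] = ["mapobjects", "audit", "pending_pushes"];

/// Creates every state subdirectory under `root`.
/// Fails closed: the first error is returned and nothing further is attempted.
fn ensure_state_dirs(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut created = Vec::new();
    for name in STATE_SUBDIRS {
        let dir = root.join(name);
        // `create_dir_all` succeeds if the directory already exists.
        fs::create_dir_all(&dir)?;
        created.push(dir);
    }
    Ok(created)
}
```

The caller would map an `Err` to a red `mapobjects_store` health entry and refuse to continue start-up.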
@@ -0,0 +1,80 @@
# MAVLink Transport and Heartbeat
**Task**: AZ-641_mavlink_transport_and_heartbeat
**Name**: MAVLink transport + heartbeat
**Description**: Single connection abstraction (UDP or serial, picked at startup), 1 Hz outgoing HEARTBEAT, bounded reconnect on transport loss, autopilot-heartbeat timeout detection.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure
**Component**: mavlink_layer
**Tracker**: AZ-641
**Epic**: AZ-637
## Problem
`mavlink_layer` needs a single, stable connection abstraction to the airframe autopilot (ArduPilot / PX4). The connection is either UDP or serial — picked once at startup from the connection URI (`udp://...` or `serial:///dev/...`); no runtime URI swap. The link must self-heal on transport loss with bounded backoff and surface link health to the rest of the system without silent failure.
## Outcome
- A `MavlinkConnection` opens once at startup and reconnects automatically on transport loss with bounded exponential backoff (≤2 s on serial / ≤5 s on UDP).
- A 1 Hz outgoing `HEARTBEAT` keeps the autopilot's GCS-link path alive.
- Autopilot heartbeats received on the inbound stream are timestamped; a configurable wall-clock timeout flips link state to `lost` and surfaces it via `health()` and a typed signal consumed by `mission_executor`.
- Health surface includes `connected`, `last_heartbeat_age_ms`, `signing_enabled`.
## Scope
### Included
- Connection-URI parser (`udp://host:port` and `serial:///dev/...`).
- UDP socket and serial port concrete transports behind a single `Transport` trait.
- Bounded exponential backoff on transport-open failure and on read failure.
- 1 Hz outgoing `HEARTBEAT` timer.
- Inbound heartbeat timestamping + wall-clock timeout → `link_lost` signal.
- `ComponentHealth` surface fields above.
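The first item in the list above, the connection-URI parser, might be sketched like this. The `ConnectionUri` enum, its field names, and the `String` error type are assumptions; serial parameters such as baud rate are not covered by the two URI forms in the spec and are therefore left out.

```rust
use std::path::PathBuf;

/// The two transports the task allows, chosen once at startup.
#[derive(Debug, PartialEq, Eq)]
enum ConnectionUri {
    Udp { host: String, port: u16 },
    Serial { device: PathBuf },
}

/// Parses `udp://host:port` or `serial:///dev/...`; anything else is an error.
fn parse_connection_uri(uri: &str) -> Result<ConnectionUri, String> {
    if let Some(rest) = uri.strip_prefix("udp://") {
        let (host, port) = rest
            .rsplit_once(':')
            .ok_or_else(|| format!("missing port in {uri}"))?;
        if host.is_empty() {
            return Err(format!("missing host in {uri}"));
        }
        let port = port
            .parse::<u16>()
            .map_err(|e| format!("bad port in {uri}: {e}"))?;
        Ok(ConnectionUri::Udp { host: host.to_string(), port })
    } else if let Some(rest) = uri.strip_prefix("serial://") {
        // `serial:///dev/ttyX` leaves the absolute path `/dev/ttyX` after the scheme.
        if !rest.starts_with('/') {
            return Err(format!("serial device must be an absolute path in {uri}"));
        }
        Ok(ConnectionUri::Serial { device: PathBuf::from(rest) })
    } else {
        Err(format!("unsupported scheme in {uri}"))
    }
}
```

Returning an enum rather than a boxed transport keeps parsing separate from socket or port opening, so a bad URI can be rejected before any I/O is attempted.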
### Excluded
- Message encoding / decoding (separate task, AZ-642).
- Command-ack demux and retry (separate task, AZ-643).
- MAVLink-2 signing (separate task, AZ-643; only the `signing_enabled` flag is plumbed here).
## Acceptance Criteria
**AC-1: UDP connection opens and survives drop**
Given a configured `udp://127.0.0.1:14550` endpoint
When the autopilot is not listening at process start
Then `MavlinkLayer::run()` retries with exponential backoff up to its cap and reports `connected = false` via `health()`; when the autopilot becomes reachable, the link reconnects within ≤5 s.
**AC-2: Serial connection opens and survives drop**
Given a configured `serial:///dev/pts/N` endpoint backed by a `socat` pair (or equivalent)
When the peer end is closed and reopened
Then `mavlink_layer` reconnects within ≤2 s and resumes heartbeat emission.
**AC-3: Heartbeat emitted at 1 Hz**
Given a healthy link
When the connection is open for 10 s
Then 10 ± 1 outbound `HEARTBEAT` frames are observed by the peer.
**AC-4: Autopilot heartbeat loss flips link state**
Given a healthy link on which the peer has been sending heartbeats
When the peer stops sending heartbeats
Then within the configured timeout (default 3 s) `health()` reports `link_lost = true` and a typed `LinkLost` signal is emitted on the public output channel.
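The timeout behaviour in AC-4 can be sketched as a small state holder that is told the current time on every call. Names such as `HeartbeatMonitor` and `LinkState` are hypothetical, and the sketch uses the monotonic `Instant` rather than wall-clock time, which is a substitution made here so that clock adjustments cannot trigger a false loss.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LinkState {
    Up,
    Lost,
}

struct HeartbeatMonitor {
    timeout: Duration,
    last_seen: Instant,
    state: LinkState,
}

impl HeartbeatMonitor {
    fn new(timeout: Duration, now: Instant) -> Self {
        Self { timeout, last_seen: now, state: LinkState::Up }
    }

    /// Call for every inbound peer HEARTBEAT.
    fn on_heartbeat(&mut self, now: Instant) {
        self.last_seen = now;
        self.state = LinkState::Up;
    }

    /// Call periodically. Returns `Some(LinkState::Lost)` once, on the
    /// Up -> Lost transition, so the caller can emit a single typed signal.
    fn poll(&mut self, now: Instant) -> Option<LinkState> {
        let age = now.saturating_duration_since(self.last_seen);
        if self.state == LinkState::Up && age >= self.timeout {
            self.state = LinkState::Lost;
            return Some(LinkState::Lost);
        }
        None
    }

    /// Value for the `last_heartbeat_age_ms` health field.
    fn last_heartbeat_age_ms(&self, now: Instant) -> u128 {
        now.saturating_duration_since(self.last_seen).as_millis()
    }
}
```

Passing `now` in explicitly keeps the monitor deterministic under test; the run loop would supply `Instant::now()`.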
## Non-Functional Requirements
**Performance**
- Reconnect latency: ≤2 s serial, ≤5 s UDP.
- Heartbeat cadence: 1 Hz ± 50 ms.
**Reliability**
- No infinite retry — bounded backoff cap is configurable (default 30 s).
- Transport-open failure surfaces to health → red; never silently absorbed.
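A bounded exponential backoff of the kind these requirements describe can be sketched in a few lines. The `Backoff` type and the doubling factor are assumptions, as is the 250 ms initial delay used when exercising it; the cap would come from configuration.

```rust
use std::time::Duration;

/// Doubling delay that never exceeds `cap`.
struct Backoff {
    initial: Duration,
    next: Duration,
    cap: Duration,
}

impl Backoff {
    fn new(initial: Duration, cap: Duration) -> Self {
        let initial = initial.min(cap);
        Self { initial, next: initial, cap }
    }

    /// Delay to wait before the next reconnect attempt.
    fn next_delay(&mut self) -> Duration {
        let delay = self.next;
        self.next = self.next.saturating_mul(2).min(self.cap);
        delay
    }

    /// Call after a successful reconnect so the next outage starts small again.
    fn reset(&mut self) {
        self.next = self.initial;
    }
}
```

The delay between attempts is bounded by the cap; whether the number of attempts is also bounded is a separate policy decision that this sketch does not make.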
## Constraints
- Hand-rolled — no third-party MAVLink SDK (per `architecture.md §5`).
- Single connection per process; no runtime URI swap.
## Runtime Completeness
- **Named capability**: MAVLink emission (HEARTBEAT) and link liveness.
- **Production code that must exist**: real UDP socket and real serial port transports.
- **Allowed external stubs**: in CI / integration tests, the peer end may be `socat` for serial or a loopback UDP listener.
- **Unacceptable substitutes**: a "fake transport" that swallows writes and synthesises heartbeats is not allowed in production code — only as a test double under `#[cfg(test)]`.
@@ -0,0 +1,79 @@
# MAVLink Message Codec (§7.7 Surface)
**Task**: AZ-642_mavlink_codec
**Name**: MAVLink v2 encode/decode for the §7.7 surface
**Description**: Encode and decode the ~10–15 MAVLink v2 messages this codebase needs (the §7.7 surface only) with strict validation.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure
**Component**: mavlink_layer
**Tracker**: AZ-642
**Epic**: AZ-637
## Problem
Autopilot speaks a deliberately narrow MAVLink command surface (per `architecture.md §7.7` — ~10–15 messages). Adding messages outside that list requires explicit design review. A hand-rolled MAVLink v2 codec must encode outbound messages with correct sequence numbers, system / component IDs, and (when enabled) signing, and decode inbound messages with strict validation — rejecting malformed frames, unknown IDs, and signing failures.
## Outcome
- Outbound encoder produces wire-correct MAVLink v2 frames for the message surface in §7.7 with per-link sequence numbers that increment by one per frame and wrap around.
- Inbound decoder parses the same surface, rejecting malformed frames and dropping unknown message IDs; sequence-number gaps are logged, not hard-failed.
- Decoded messages are exposed as a typed `MavlinkMessage` enum (one variant per supported message kind) on the inbound channel.
- Per-message-kind parse error counters are exposed via `health()`.
## Scope
### Included
- Encode + decode for `HEARTBEAT` (bidir), `COMMAND_LONG` outbound subset (arm/disarm, takeoff, set-mode, change-speed, change-alt, land, RTL), `COMMAND_ACK` inbound, `MISSION_COUNT`, `MISSION_REQUEST_INT`, `MISSION_ITEM_INT`, `MISSION_ACK`, `MISSION_SET_CURRENT`, `MISSION_CURRENT`, `MISSION_ITEM_REACHED`, `MISSION_CLEAR_ALL`, `GLOBAL_POSITION_INT`, `ATTITUDE`, `SYS_STATUS`, `EXTENDED_SYS_STATE`, `STATUSTEXT`, `SET_MODE`.
- Per-link outbound `tx_seq` counter with wrap-around handling.
- Strict size + CRC validation; reject malformed frames.
- Unknown message IDs counted and dropped (not hard-failed).
- Sequence-number gap detection (logged, not fatal).
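The `tx_seq` wrap-around and gap-detection items above can be sketched as follows. MAVLink's `seq` field is a single byte, so both sides use wrapping arithmetic. `TxSeq` and `RxSeqTracker` are hypothetical names, and a real decoder would presumably keep one tracker per remote system/component pair rather than one per link.

```rust
/// Outbound per-link sequence counter (the MAVLink `seq` field is one byte).
#[derive(Default)]
pub struct TxSeq(u8);

impl TxSeq {
    /// Return the sequence number for the next frame, then advance (255 wraps to 0).
    pub fn next_seq(&mut self) -> u8 {
        let current = self.0;
        self.0 = self.0.wrapping_add(1);
        current
    }
}

/// Inbound gap detector: counts frames that appear to be missing; never rejects a frame.
#[derive(Default)]
pub struct RxSeqTracker {
    last: Option<u8>,
    pub gaps_total: u64,
}

impl RxSeqTracker {
    /// Observe an inbound `seq`; returns how many frames were skipped since the last one.
    /// Limitation of this sketch: a repeated `seq` is indistinguishable from 255 lost frames.
    pub fn observe(&mut self, seq: u8) -> u8 {
        let missed = match self.last {
            Some(prev) => seq.wrapping_sub(prev).wrapping_sub(1),
            None => 0,
        };
        self.last = Some(seq);
        self.gaps_total += u64::from(missed);
        missed
    }
}
```

The caller would log a non-zero return value and carry on decoding, which matches the "logged, not fatal" rule.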
### Excluded
- Transport and reconnect (task 02).
- Heartbeat scheduling (task 02).
- Ack demultiplexing to callers (task 04).
- MAVLink-2 signing (task 04).
- Any message not in the §7.7 surface — adding new messages requires design review.
## Acceptance Criteria
**AC-1: Round-trip every supported message**
Given the encoder produces a frame for each message kind in the §7.7 surface with deterministic field values
When the same frame is fed back through the decoder
Then the typed `MavlinkMessage` matches the original fields and `parse_errors_total` does not increment.
**AC-2: Malformed frame is rejected**
Given a byte buffer with a truncated payload or a wrong CRC
When the decoder consumes it
Then the frame is dropped, `parse_errors_total{kind="crc" | "truncated"}` increments by 1, and the codec continues processing subsequent bytes.
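A sketch of the size and CRC check this criterion exercises. The checksum routine follows the byte-wise accumulate step used by the MAVLink reference C headers (CRC-16/MCRF4XX, seed `0xFFFF`, with the per-message `CRC_EXTRA` byte folded in last). `validate_v2` and `ParseErrorKind` are hypothetical names, and the sketch assumes the caller has already synchronised on the `0xFD` start byte.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ParseErrorKind {
    Truncated,
    Crc,
}

/// One accumulate step of the MAVLink checksum.
fn crc_accumulate(crc: u16, byte: u8) -> u16 {
    let mut tmp = byte ^ (crc & 0x00ff) as u8;
    tmp ^= tmp << 4;
    (crc >> 8) ^ (u16::from(tmp) << 8) ^ (u16::from(tmp) << 3) ^ (u16::from(tmp) >> 4)
}

/// CRC-16/MCRF4XX over `bytes`.
pub fn crc16_mcrf4xx(bytes: &[u8]) -> u16 {
    bytes.iter().fold(0xffff_u16, |crc, &b| crc_accumulate(crc, b))
}

/// Frame checksum: header (without the start byte) + payload, then CRC_EXTRA.
pub fn mavlink_crc(bytes: &[u8], crc_extra: u8) -> u16 {
    crc_accumulate(crc16_mcrf4xx(bytes), crc_extra)
}

/// Validate length and checksum of one MAVLink v2 frame starting at the 0xFD byte.
/// Any trailing signature bytes are ignored here.
pub fn validate_v2(frame: &[u8], crc_extra: u8) -> Result<(), ParseErrorKind> {
    const HEADER_LEN: usize = 10; // STX, len, incompat, compat, seq, sysid, compid, msgid[3]
    if frame.len() < HEADER_LEN {
        return Err(ParseErrorKind::Truncated);
    }
    let payload_end = HEADER_LEN + usize::from(frame[1]);
    if frame.len() < payload_end + 2 {
        return Err(ParseErrorKind::Truncated);
    }
    let expected = mavlink_crc(&frame[1..payload_end], crc_extra);
    let received = u16::from_le_bytes([frame[payload_end], frame[payload_end + 1]]);
    if expected == received { Ok(()) } else { Err(ParseErrorKind::Crc) }
}
```

Each `Err` variant maps onto one `parse_errors_total{kind=...}` label; after an error the decoder would resume scanning at the next start byte.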
**AC-3: Unknown message ID is counted, not fatal**
Given an inbound frame with a message ID outside the §7.7 surface
When the decoder consumes it
Then the frame is dropped, `parse_errors_total{kind="unknown_id"}` increments by 1, and decoding continues.
**AC-4: SITL round-trip**
Given an ArduPilot SITL instance configured for `udp://127.0.0.1:14550`
When `mavlink_layer` emits a `COMMAND_LONG` for `MAV_CMD_NAV_RETURN_TO_LAUNCH`
Then SITL receives the command and replies with a matching `COMMAND_ACK`; the decoder emits a `MavlinkMessage::CommandAck` with `result = MAV_RESULT_ACCEPTED`.
## Non-Functional Requirements
**Performance**
- Per-message encode + decode round-trip: ≤50 ms p99 on a healthy link (per `description.md §8`).
**Reliability**
- No silent acceptance of malformed or signature-mismatched frames.
## Constraints
- Hand-rolled — no third-party MAVLink SDK.
- Adding any message outside the §7.7 surface requires an explicit design review noted in the PR description.
## Runtime Completeness
- **Named capability**: MAVLink v2 wire-correct encode/decode for the §7.7 command surface.
- **Production code that must exist**: real byte-level encoder + decoder; CRC computation; sequence number handling.
- **Allowed external stubs**: ArduPilot SITL is the conformance reference for the SITL round-trip AC.
- **Unacceptable substitutes**: a JSON or human-readable "MAVLink-like" envelope is not acceptable — the wire format must be MAVLink v2.
@@ -0,0 +1,74 @@
# MAVLink Ack Demux, Retry, and Signing
**Task**: AZ-643_mavlink_ack_demux_and_signing
**Name**: Command-ack demux + retry handle + optional MAVLink-2 signing
**Description**: Map outbound `COMMAND_LONG` requests to their `COMMAND_ACK` responses by `command_id`, enforce ack timeout, surface result to the originating caller; optionally enable MAVLink-2 message signing.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-641_mavlink_transport_and_heartbeat, AZ-642_mavlink_codec
**Component**: mavlink_layer
**Tracker**: AZ-643
**Epic**: AZ-637
## Problem
Outbound MAVLink commands are async with respect to their acks. `mission_executor` (and other callers) need a synchronous-feeling `send_command(...) -> Result<CommandAck>` API that times out at a configurable wall-clock deadline (default 1 s) — the retry decision then belongs to the caller, not to `mavlink_layer`. Separately, when the autopilot link supports it, MAVLink-2 message signing should be enabled for outbound frames and validated for inbound frames; mismatched signatures are rejected.
## Outcome
- `MavlinkHandle::send_command(cmd) -> Result<CommandAck, AckTimeout>` resolves when a matching `COMMAND_ACK` arrives within the deadline, or returns `AckTimeout` otherwise.
- An in-flight command map (`command_id → (caller, deadline)`) is correctly populated and cleared on success and on timeout (no leaks).
- When `signing_enabled = true` at config time, outbound frames are signed; inbound frames with bad signatures are rejected and counted (`parse_errors_total{kind="signing_mismatch"}`).
- `signing_enabled` is reported in `health()`.
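The in-flight map can be sketched with the standard library alone. This is illustrative: `InFlight` and its methods are hypothetical, and where the spec stores `(caller, deadline)` the sketch stores only the deadline. In an async implementation the caller slot would hold something like a one-shot sender that resolves `send_command`.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-flight command map: `command_id -> deadline`.
pub struct InFlight {
    deadline: Duration,
    pending: HashMap<u16, Instant>,
}

impl InFlight {
    pub fn new(deadline: Duration) -> Self {
        Self { deadline, pending: HashMap::new() }
    }

    /// Track a newly sent command. Returns false if that command id is already pending.
    pub fn register(&mut self, command_id: u16, now: Instant) -> bool {
        if self.pending.contains_key(&command_id) {
            return false;
        }
        self.pending.insert(command_id, now + self.deadline);
        true
    }

    /// A COMMAND_ACK arrived. Returns true if it matched a pending command (entry removed).
    pub fn on_ack(&mut self, command_id: u16) -> bool {
        self.pending.remove(&command_id).is_some()
    }

    /// Remove every entry whose deadline has passed; the caller turns each into `AckTimeout`.
    pub fn evict_expired(&mut self, now: Instant) -> Vec<u16> {
        let mut expired = Vec::new();
        self.pending.retain(|id, expires_at| {
            let keep = now < *expires_at;
            if !keep {
                expired.push(*id);
            }
            keep
        });
        expired.sort_unstable();
        expired
    }

    pub fn commands_in_flight(&self) -> usize {
        self.pending.len()
    }
}
```

Both exit paths (`on_ack` and `evict_expired`) remove the entry, which is what keeps the map leak-free; a late ack for an evicted command simply finds nothing to match.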
## Scope
### Included
- In-flight command map with deadline-driven eviction.
- Public `send_command(...) -> Result<CommandAck, AckTimeout>` API.
- MAVLink-2 outbound signature + inbound signature validation (off-by-default; on when configured).
- Health fields: `commands_in_flight`, `signing_enabled`.
### Excluded
- The decision to retry on `AckTimeout` (belongs to `mission_executor`).
- Encoding the new commands themselves (task 03).
## Acceptance Criteria
**AC-1: Command-ack happy path**
Given a healthy SITL link
When `send_command(MAV_CMD_NAV_RETURN_TO_LAUNCH)` is called
Then within 1 s the result resolves with `MAV_RESULT_ACCEPTED` and `commands_in_flight` returns to 0.
**AC-2: Ack timeout returns explicit error**
Given a SITL instance that is configured not to ack commands (or is paused)
When `send_command(...)` is called with the default 1 s deadline
Then the call resolves with `Err(AckTimeout)`; the in-flight map is cleared; the link stays open.
**AC-3: Signing rejection counted**
Given `signing_enabled = true` and an inbound frame whose signature does not match
When the decoder runs on the frame
Then the frame is rejected, `parse_errors_total{kind="signing_mismatch"}` increments by 1, and the link stays open.
**AC-4: Optional signing — disabled path**
Given `signing_enabled = false`
When inbound frames arrive (signed or unsigned)
Then the signature field is ignored and `parse_errors_total{kind="signing_mismatch"}` stays at 0.
## Non-Functional Requirements
**Performance**
- Ack demux lookup: O(1); does not contribute measurably to the ≤50 ms per-message round-trip target.
**Reliability**
- No leaked entries in the in-flight map; every `send_command` either resolves or times out.
## Constraints
- Signing scheme decision (Q6) lives elsewhere — this task only wires the on/off mechanism using the spec-defined MAVLink-2 signing.
## Runtime Completeness
- **Named capability**: MAVLink-2 message signing (when enabled) + COMMAND_ACK demux.
- **Production code that must exist**: real signature computation + verification; in-flight map keyed by `command_id`.
- **Allowed external stubs**: SITL with signing disabled is the default test fixture; a separate fixture exercises the signing path.
- **Unacceptable substitutes**: signature stub that always returns "valid" is not acceptable in production.
@@ -0,0 +1,81 @@
# Mission Pull + Schema Validation
**Task**: AZ-644_mission_client_pull_and_schema
**Name**: HTTPS mission fetch + schema validation
**Description**: HTTPS REST client to the external `missions` API, mission fetch by `mission_id` on startup, validate the response against the shared `mission-schema`, bounded retry on transient connection loss.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure
**Component**: mission_client
**Tracker**: AZ-644
**Epic**: AZ-638
## Problem
`autopilot` does not own the missions database — it fetches the mission by ID from the external `missions` API at startup. The response must validate against the shared `mission-schema`; on schema-invalid the mission MUST be rejected (no silent downcast). On transient connectivity failure, fetch is retried with bounded exponential backoff; on exceeding the cap, the mission start is refused and health flips to red.
## Outcome
- `MissionClient::fetch(mission_id) -> Result<Mission, FetchError>` performs an HTTPS GET against the configured `missions` endpoint, validates the response against the bundled `mission-schema` (schema version recorded), and returns a typed `Mission` (`{ waypoints, geofences, return_point, mission_id, schema_version }`).
- Transient failures (timeout, 5xx, DNS) are retried with bounded exponential backoff; max attempts configurable (default 5).
- On schema mismatch the call returns `Err(SchemaInvalid)` with a size-capped sample of the raw response for offline analysis.
- Health surface includes `last_fetch_ts`, `fetch_errors_total`, `schema_version`, `connection_state`.
## Scope
### Included
- HTTPS client (`reqwest` or `hyper` — pick the one already pinned in `shared`).
- Auth header plumb-through (concrete scheme deferred to `../_docs/02_missions.md`; passed as opaque `Authorization` header).
- Schema validation against `mission-schema` (bundled in `shared/contracts/`).
- Bounded exponential backoff.
### Excluded
- Middle-waypoint POST (task 06).
- MapObjects pre-flight pull (task 07).
- MapObjects post-flight push and durable queue (task 08).
## Acceptance Criteria
**AC-1: Happy path fetch**
Given a fixture `missions` API that returns a schema-valid mission JSON for `mission_id = M1`
When `MissionClient::fetch("M1")` is called
Then it returns `Ok(Mission { ... })` and `health()` reports `last_fetch_ts` updated, `connection_state = "ok"`.
**AC-2: Schema-invalid is rejected**
Given a fixture `missions` API that returns a valid HTTP 200 but the JSON body has a missing required field
When `MissionClient::fetch("M1")` is called
Then it returns `Err(SchemaInvalid)` and `health()` records the failure; a size-capped excerpt of the raw response is logged.
**AC-3: Transient failure retries within budget**
Given the missions API returns `503` for the first two attempts and `200` on the third
When `MissionClient::fetch("M1")` is called
Then it returns `Ok` after the third attempt; backoff is observed between attempts.
**AC-4: Cap exhaustion refuses start**
Given the missions API is unreachable for all 5 default attempts
When `MissionClient::fetch("M1")` is called
Then it returns `Err(MaxRetriesExceeded)` and `health()` is red.
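AC-3 and AC-4 together describe a retry loop that distinguishes transient from permanent failures. A sketch under stated assumptions: the `FetchError` variants other than `SchemaInvalid` and `MaxRetriesExceeded` are invented for illustration, the doubling schedule is an assumption, and the HTTP call and the sleep are injected as closures so the loop can be exercised without a network or a clock.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
pub enum FetchError {
    /// Timeout, 5xx, DNS failure: worth retrying (hypothetical variant).
    Transient(String),
    /// Permanent: never retried.
    SchemaInvalid,
    MaxRetriesExceeded,
}

/// Run `attempt_fetch` up to `max_attempts` times, sleeping between transient failures.
pub fn fetch_with_retry<T>(
    max_attempts: u32,
    base_delay: Duration,
    mut attempt_fetch: impl FnMut() -> Result<T, FetchError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, FetchError> {
    for attempt in 0..max_attempts {
        match attempt_fetch() {
            Ok(value) => return Ok(value),
            Err(FetchError::Transient(_)) => {
                // Back off only if another attempt will follow.
                if attempt + 1 < max_attempts {
                    sleep(base_delay.saturating_mul(2u32.saturating_pow(attempt)));
                }
            }
            Err(permanent) => return Err(permanent),
        }
    }
    Err(FetchError::MaxRetriesExceeded)
}
```

Returning `SchemaInvalid` immediately keeps a malformed response from consuming the retry budget; only connectivity problems do.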
## Non-Functional Requirements
**Performance**
- Startup fetch completes in ≤5 s on healthy connectivity.
**Reliability**
- No silent downcast on schema mismatch.
- No infinite retry — bounded backoff cap is configurable.
## Constraints
- Mission schema is shared with the external `missions` repo; the schema file lives in `shared/contracts/mission-schema.json` (bundled at build time).
## Contract
- `mission-schema.json` is the authoritative wire contract. Owner: `../_docs/02_missions.md`. Bundled copy in `shared/contracts/mission-schema.json`.
- Canonical typed model: `data_model.md §MissionItem`, `§MissionWaypoint`, `§Geofence`.
## Runtime Completeness
- **Named capability**: HTTPS REST to the external `missions` API + JSON Schema validation.
- **Production code that must exist**: real HTTPS request; real JSON Schema validator (e.g. `jsonschema` crate).
- **Allowed external stubs**: in tests, the missions API can be a local `wiremock`/`mockito` server.
- **Unacceptable substitutes**: skipping schema validation in production "for speed" is not acceptable; validation is a safety boundary.
@@ -0,0 +1,64 @@
# Middle-Waypoint POST
**Task**: AZ-645_mission_client_waypoint_post
**Name**: Middle-waypoint POST to missions API
**Description**: POST the updated mission (with operator-confirmed middle waypoint inserted) to the external `missions` API; bounded retry; surface failure to `mission_executor`.
**Complexity**: 2 points
**Dependencies**: AZ-640_initial_structure, AZ-644_mission_client_pull_and_schema
**Component**: mission_client
**Tracker**: AZ-645
**Epic**: AZ-638
## Problem
When the operator confirms a POI, `scan_controller` hands a middle-waypoint hint to `mission_executor`, which computes the patched mission (`current_position → middle_waypoint → resume_original_route`). That patched mission must be POSTed to the external `missions` API for persistence and traceability. If the POST fails, the executor decides whether to halt, RTL, or continue with the in-memory mission — `mission_client` only surfaces the failure.
## Outcome
- `MissionClient::post_middle_waypoint(mission_id, patched_mission) -> Result<MissionUpdateAck, PostError>` performs a `POST /missions/{id}/middle-waypoint` (exact path per `../_docs/02_missions.md`) and awaits an ack.
- Bounded exponential backoff on transient failure (default 3 attempts).
- On final failure returns a typed error; never silent.
- Health field `last_middle_waypoint_post_status` updated.
## Scope
### Included
- POST endpoint call with the patched mission body.
- Bounded retry on 5xx / timeout.
- Error surface to caller.
### Excluded
- The decision to RTL on failure (`mission_executor`).
- Recomputing the patched mission (`mission_executor`).
## Acceptance Criteria
**AC-1: Happy path POST**
Given a fixture missions API that accepts the POST and returns `200`
When `post_middle_waypoint("M1", patched)` is called
Then it returns `Ok(MissionUpdateAck { ... })` in ≤2 s and `health.last_middle_waypoint_post_status = "ok"`.
**AC-2: Transient failure retries**
Given the API returns `503` once then `200`
When the call is made
Then it returns `Ok` on the second attempt.
**AC-3: Cap exhaustion bubbles error**
Given the API returns `500` for all 3 default attempts
When the call is made
Then it returns `Err(MaxRetriesExceeded)` and the error is surfaced to the caller; no silent absorption.
## Non-Functional Requirements
**Performance**
- Single happy-path POST completes in ≤2 s on healthy connectivity.
**Reliability**
- Bounded backoff; no infinite retry.
## Runtime Completeness
- **Named capability**: middle-waypoint POST against the external `missions` API.
- **Production code that must exist**: real HTTPS POST.
- **Allowed external stubs**: `wiremock`/`mockito` for tests.
- **Unacceptable substitutes**: swallowing the error and proceeding is not acceptable.
@@ -0,0 +1,76 @@
# MapObjects Pre-Flight Pull
**Task**: AZ-646_mission_client_mapobjects_pull
**Name**: Pre-flight MapObjects GET + cached-fallback handshake
**Description**: After mission fetch succeeds, GET `/missions/{id}/mapobjects` (and `/ignored` if separated). Surface the bundle to `mapobjects_store`. On failure, surface BIT degradation — operator must acknowledge cached fallback or abort. Never silent.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-644_mission_client_pull_and_schema
**Component**: mission_client
**Tracker**: AZ-646
**Epic**: AZ-638
## Problem
The MapObjects working copy is hydrated pre-flight from the central `missions` API. The pull must complete before `mission_executor` proceeds past `BIT_OK`. On pull failure the system must NOT silently proceed; instead, `mission_executor`'s BIT (F9) surfaces a degraded state — the operator either acknowledges cached fallback (signed acknowledgement per Q9) or aborts.
## Outcome
- `MissionClient::pull_mapobjects(mission_id) -> Result<MapObjectsBundle, PullError>` performs a `GET /missions/{id}/mapobjects` (and `/ignored` if the API splits them) and returns a typed `MapObjectsBundle { map_objects, ignored_items, fetched_at, schema_version, fallback_used: bool }`.
- On 200, the bundle is handed to `mapobjects_store` for hydration; `mapobjects_pull_state = synced`.
- On error or timeout, `mapobjects_pull_state = failed`; the typed error is surfaced to `mission_executor` (F9 BIT degrades, never silent).
- Health fields: `mapobjects_pull_state`, `last_mapobjects_pull_ts`.
## Scope
### Included
- GET endpoint(s) call.
- Schema validation of the bundle (using the shared MapObjects schema in `shared/contracts/`).
- Cached-fallback semantics: the **cache** itself lives in `mapobjects_store` (task 28); this task only sets `fallback_used = true` when the cached copy is used after operator acknowledgement.
- Health surface fields above.
### Excluded
- The cache storage itself (lives in `mapobjects_store`).
- Operator-acknowledgement flow (`operator_bridge`).
- BIT orchestration (`mission_executor`).
## Acceptance Criteria
**AC-1: Happy path pull**
Given a fixture API that returns a schema-valid MapObjects bundle
When `pull_mapobjects("M1")` is called
Then it returns `Ok(bundle)`, `pull_state = synced`, and the bundle reaches `mapobjects_store` for hydration.
**AC-2: Schema-invalid is rejected**
Given the API returns a 200 with a missing required field
When `pull_mapobjects("M1")` is called
Then it returns `Err(SchemaInvalid)` and `pull_state = failed`; no silent acceptance.
**AC-3: Network failure surfaces to F9**
Given the API is unreachable
When `pull_mapobjects("M1")` is called
Then it returns `Err(Unreachable)`, `pull_state = failed`, and the error is observable by `mission_executor`'s BIT path.
**AC-4: 30 km × 30 km area completes within budget**
Given a fixture bundle the size of a 30 km × 30 km mission area
When the pull is performed on a 100 Mbps loopback link
Then the call completes in ≤30 s.
## Non-Functional Requirements
**Performance**
- ≤30 s for a 30 km × 30 km mission area on healthy connectivity (per `description.md §8`).
**Reliability**
- Never silent on failure.
## Contract
- MapObjects bundle schema: `shared/contracts/mapobjects-bundle.json`. Owner: `../_docs/02_missions.md` §7.13 extension.
- Canonical typed model: `data_model.md §MapObjectsBundle`.
## Runtime Completeness
- **Named capability**: HTTPS GET against the central MapObjects extension + schema validation.
- **Production code that must exist**: real HTTPS GET; real schema validator.
- **Allowed external stubs**: `wiremock`/`mockito`.
- **Unacceptable substitutes**: skipping schema validation in production.
@@ -0,0 +1,84 @@
# MapObjects Post-Flight Push + Durable Queue
**Task**: AZ-647_mission_client_mapobjects_push
**Name**: Post-flight MapObjects push with durable queue and crash-recovery push
**Description**: On `mission_executor` terminal state, drain `mapobjects_store`'s pending diff and POST to `/missions/{id}/mapobjects` + `/missions/{id}/mapobjects/ignored`. Independent retry per endpoint. Persist pending diff on disk for 24 h durable retry. At startup, replay any non-empty pending diff from a previously terminated mission BEFORE BIT for any new mission begins.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-644_mission_client_pull_and_schema, AZ-646_mission_client_mapobjects_pull
**Component**: mission_client
**Tracker**: AZ-647
**Epic**: AZ-638
## Problem
The full pass diff (NEW / MOVED / EXISTING / REMOVED-candidate observations + IgnoredItem appends) must reach the central API after the mission ends. In-flight central writes are forbidden (Frozen choice 6 — `architecture.md §7.3`). The post-flight push must survive transient failure (independent retry per endpoint), persistent failure (operator-visible warning + manual replay), and crash mid-mission (next-boot push of pending diff). The durable queue is the disk-backed safety net.
## Outcome
- `MissionClient::push_mapobjects_diff(mission_id, diff) -> PushReport` posts the observations and ignored-items independently; partial success does not roll back the successful endpoint.
- The pending diff is persisted on disk at `${state_dir}/mapobjects_push/<mission_id>.json` BEFORE the push starts (write-ahead).
- Per-endpoint bounded exponential backoff (24 h durable retry window; configurable).
- Persistent failure: `sync_state = degraded`; operator-visible warning; entry stays on disk for manual replay.
- At startup, if `${state_dir}/mapobjects_push/` has any non-empty file, run the push for those missions BEFORE BIT for any new mission begins (crash-recovery path).
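The write-ahead step and the startup sweep can be sketched with `std::fs`. This is a sketch, not the implementation: function names are hypothetical, `mission_id` is assumed to be already validated as a safe file name, and a production version would likely also fsync the parent directory after the rename.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write the pending diff to `<dir>/<mission_id>.json` via a temp file and an atomic rename,
/// so a crash leaves either the old file or the complete new one, never a partial write.
pub fn persist_pending(dir: &Path, mission_id: &str, body: &[u8]) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    let tmp = dir.join(format!("{mission_id}.json.tmp"));
    let dst = dir.join(format!("{mission_id}.json"));
    let mut file = File::create(&tmp)?;
    file.write_all(body)?;
    file.sync_all()?; // flush file contents to disk before the rename makes them visible
    fs::rename(&tmp, &dst)?;
    Ok(dst)
}

/// Startup sweep: mission ids that still have a non-empty pending diff on disk.
pub fn pending_missions(dir: &Path) -> io::Result<Vec<String>> {
    let mut ids = Vec::new();
    if !dir.exists() {
        return Ok(ids);
    }
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        let is_json = path.extension().map_or(false, |ext| ext == "json");
        if is_json && entry.metadata()?.len() > 0 {
            if let Some(stem) = path.file_stem().and_then(|s| s.to_str()) {
                ids.push(stem.to_owned());
            }
        }
    }
    ids.sort();
    Ok(ids)
}
```

At boot, every id returned by `pending_missions` would be pushed before BIT starts for the new mission; leftover `.tmp` files are ignored because their extension is not `json`.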
## Scope
### Included
- Two POST endpoints, called independently with separate retry/backoff state.
- Write-ahead persistence of the pending diff before the network call.
- Crash-recovery sweep at startup.
- `PushReport { observations: PerEndpointStatus, ignored: PerEndpointStatus }`.
- Health surface: `mapobjects_push_pending`, `last_push_ts`, per-endpoint last error.
### Excluded
- Building the pending diff (`mapobjects_store` — task 28 owns `pending_observations` + `pending_ignored`).
- Choosing what's a terminal state (`mission_executor`).
- Operator UI for the manual-replay warning (`operator_bridge` / Ground Station).
## Acceptance Criteria
**AC-1: Happy path push**
Given the mission ended with N observations and M ignored items
When `push_mapobjects_diff("M1", diff)` is called and both endpoints return 200
Then both succeed, the disk file is cleared, and `sync_state = synced`.
**AC-2: Partial success — independent retry**
Given `/mapobjects` returns 200 and `/mapobjects/ignored` returns 503
When the push runs
Then the observations endpoint is reported success, the ignored endpoint is queued for retry, and the disk file retains ONLY the ignored portion.
**AC-3: Persistent failure persists for manual replay**
Given both endpoints return 503 for all 24 h of bounded retry
When the retry window closes
Then `sync_state = degraded`, the disk file remains intact, and a manual-replay warning is observable in `health()`.
**AC-4: Crash-recovery push at startup**
Given a previous run terminated with a non-empty disk file at `${state_dir}/mapobjects_push/M0.json`
When the process starts a new run for mission `M1`
Then the push for `M0` is attempted before BIT begins for `M1`; the order is observable via logs.
**AC-5: 60-min mission push within budget**
Given a fixture pass diff sized for a 60-min mission
When the push is performed on a 100 Mbps loopback link
Then both endpoints complete in ≤2 min.
## Non-Functional Requirements
**Performance**
- ≤2 min for a 60-min mission's pass diff (per `description.md §8`).
**Reliability**
- 24 h durable retry window.
- Crash-mid-mission: nothing is lost on disk.
## Contract
- MapObjects POST schemas: `shared/contracts/mapobjects-observations.json` and `shared/contracts/mapobjects-ignored.json`. Owner: `../_docs/02_missions.md` §7.13 extension.
- Canonical typed model: `data_model.md §MapObjectObservation`, `§IgnoredItem`.
## Runtime Completeness
- **Named capability**: durable on-disk queue + post-flight push to the central `missions` API.
- **Production code that must exist**: real disk write-ahead (atomic rename); real HTTPS POST; real backoff state machine; real crash-recovery sweep.
- **Allowed external stubs**: `wiremock`/`mockito` for tests; `tempfile` for the disk-queue tests.
- **Unacceptable substitutes**: an in-memory-only queue is not acceptable (crash recovery requires disk).
@@ -0,0 +1,82 @@
# Mission Executor State Machine (Both Variants)
**Task**: AZ-648_mission_executor_state_machine
**Name**: Variant-aware mission state machine
**Description**: Typed state machine for both multirotor and fixed-wing variants. Transitions are explicit and fully enumerated; bounded retry per transition with explicit max-retry. No infinite retry. State is in-process only; restart re-runs from `DISCONNECTED`.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-641_mavlink_transport_and_heartbeat, AZ-642_mavlink_codec, AZ-643_mavlink_ack_demux_and_signing
**Component**: mission_executor
**Tracker**: AZ-648
**Epic**: AZ-636
## Problem
`mission_executor` drives the airframe through a typed state machine. The flow differs per variant (multirotor vs fixed-wing); both variants share the same transition discipline and observability surface. Every transition has a bounded retry budget — on cap exhaustion health flips to red and the failure surfaces via `operator_bridge`. **No infinite retry** is permitted (per `architecture.md §5`).
## Outcome
- A typed `MissionState` enum encodes:
- Multirotor: `DISCONNECTED → CONNECTED → HEALTH_OK → BIT_OK → ARMED → TAKE_OFF → MISSION_UPLOADED → FLY_MISSION → LAND → POST_FLIGHT_SYNC → DONE`.
- Fixed-wing: `DISCONNECTED → CONNECTED → HEALTH_OK → BIT_OK → MISSION_UPLOADED → WAIT_AUTO → FLY_MISSION → LAND → POST_FLIGHT_SYNC → DONE`.
- `MissionExecutor::tick(now, telemetry)` advances the state machine; each transition is gated by an explicit guard.
- Per-transition retry counter + last-failure reason; on cap exhaustion the machine pauses and health → red.
- Health surface: current state, `state_duration_ms`, `transition_failures_by_state`, retry counts.
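The two state graphs above can be written as one exhaustive `match` over `(state, variant)`. The sketch below encodes only the happy-path successor of each state; guards, retries, and failure handling are omitted, and the Rust identifiers are assumed spellings of the spec's state names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Variant {
    Multirotor,
    FixedWing,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MissionState {
    Disconnected, Connected, HealthOk, BitOk, Armed, TakeOff,
    MissionUploaded, WaitAuto, FlyMission, Land, PostFlightSync, Done,
}

impl MissionState {
    /// Happy-path successor for the given variant; `None` for `Done` and for
    /// states that do not exist in that variant's graph.
    pub fn successor(self, variant: Variant) -> Option<MissionState> {
        use MissionState::*;
        use Variant::*;
        match (self, variant) {
            (Disconnected, _) => Some(Connected),
            (Connected, _) => Some(HealthOk),
            (HealthOk, _) => Some(BitOk),
            (BitOk, Multirotor) => Some(Armed),
            (BitOk, FixedWing) => Some(MissionUploaded),
            (Armed, Multirotor) => Some(TakeOff),
            (TakeOff, Multirotor) => Some(MissionUploaded),
            (MissionUploaded, Multirotor) => Some(FlyMission),
            (MissionUploaded, FixedWing) => Some(WaitAuto),
            (WaitAuto, FixedWing) => Some(FlyMission),
            (FlyMission, _) => Some(Land),
            (Land, _) => Some(PostFlightSync),
            (PostFlightSync, _) => Some(Done),
            (Done, _) => None,
            (Armed, FixedWing) | (TakeOff, FixedWing) | (WaitAuto, Multirotor) => None,
        }
    }
}

/// Full happy path for a variant, starting at `Disconnected`.
pub fn path(variant: Variant) -> Vec<MissionState> {
    let mut current = MissionState::Disconnected;
    let mut states = vec![current];
    while let Some(next) = current.successor(variant) {
        states.push(next);
        current = next;
    }
    states
}
```

Because the `match` has no catch-all arm for states, adding a new `MissionState` variant fails to compile until its transitions are spelled out, which is the property a typed state machine is meant to give.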
## Scope
### Included
- Both variant state graphs.
- Bounded retry per transition (configurable; default 3 attempts).
- `Variant` enum (`Multirotor`, `FixedWing`) wired from startup config.
- State-transition events published on an output channel for `scan_controller` and `telemetry_stream`.
- Mission re-upload sequence (`MISSION_CLEAR_ALL` → upload waypoints → `MISSION_SET_CURRENT`) — invoked from `MISSION_UPLOADED` entry guards.
### Excluded
- BIT (F9) — separate task 11.
- Lost-link failsafe ladder (F10) — separate task 12.
- Geofence + battery enforcement — separate task 13.
- Middle-waypoint re-upload logic: separate task 13; only the base mission upload sequence is exercised here.
- Post-flight push trigger — separate task 13.
## Acceptance Criteria
**AC-1: Happy-path multirotor flow against SITL**
Given a multirotor SITL + `mavlink_layer` healthy + a valid in-memory mission
When `mission_executor::run()` is started
Then it reaches `DONE` traversing the multirotor state graph; transitions are observable as events; mission progress reaches all waypoints.
**AC-2: Happy-path fixed-wing flow against SITL**
Given a fixed-wing SITL + the operator's GCS sets AUTO mode externally
When `mission_executor::run()` is started
Then it traverses the fixed-wing graph (no `ARMED`/`TAKE_OFF`; `WAIT_AUTO` waits for the AUTO transition) and reaches `DONE`.
**AC-3: Bounded retry on mission-upload rejection**
Given SITL is configured to reject the mission upload (non-accepted `MISSION_ACK`) on the first attempt and accept the second
When the executor reaches `MISSION_UPLOADED`
Then the retry counter increments to 1, the second attempt succeeds, and the machine proceeds.
**AC-4: Cap exhaustion flips health to red**
Given SITL is configured to reject the mission upload (non-accepted `MISSION_ACK`) on all 3 default attempts
When the executor reaches `MISSION_UPLOADED`
Then the machine pauses, health → red, and the failure is observable on the output channel; no transition past `MISSION_UPLOADED`.
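The retry budget exercised by AC-3 and AC-4 might look like the sketch below. `RetryBudget` and `RetryVerdict` are hypothetical names; the only behaviour taken from the spec is that failures are counted, the last reason is kept, and the budget is exhausted after the configured number of attempts (default 3).

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum RetryVerdict {
    /// Another attempt is allowed.
    Retry,
    /// Budget spent: pause the machine and flip health to red.
    Exhausted,
}

/// Per-transition retry counter with the last failure reason kept for the health surface.
pub struct RetryBudget {
    max_attempts: u32,
    pub failures: u32,
    pub last_failure: Option<String>,
}

impl RetryBudget {
    pub fn new(max_attempts: u32) -> Self {
        Self { max_attempts, failures: 0, last_failure: None }
    }

    /// Record one failed attempt and say whether the transition may be tried again.
    pub fn record_failure(&mut self, reason: &str) -> RetryVerdict {
        self.failures += 1;
        self.last_failure = Some(reason.to_owned());
        if self.failures >= self.max_attempts {
            RetryVerdict::Exhausted
        } else {
            RetryVerdict::Retry
        }
    }
}
```

With the default of 3, the first two rejections yield `Retry` and the third yields `Exhausted`, so the number of attempts can never exceed the configured cap.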
## Non-Functional Requirements
**Performance**
- Mission-upload retry budget: configurable; default 3 attempts.
- State-machine tick: ≤10 ms p99.
**Reliability**
- No infinite retry anywhere.
## Constraints
- `mavlink_layer::send_command` is the only path to the airframe.
- Variant is fixed at startup; no runtime swap.
## Runtime Completeness
- **Named capability**: variant-aware state machine + mission upload via MAVLink.
- **Production code that must exist**: explicit transition guards; real retry counters; real mission-upload sequence.
- **Allowed external stubs**: ArduPilot SITL is the conformance target (both `arducopter` and `arduplane`).
- **Unacceptable substitutes**: a generic "if-else cascade" instead of typed state transitions is not acceptable.
@@ -0,0 +1,65 @@
# Telemetry Forwarding from Mission Executor
**Task**: AZ-649_mission_executor_telemetry_forwarding
**Name**: Telemetry forwarding to scan, movement, telemetry, BIT input
**Description**: Forward decoded MAVLink telemetry (position, attitude, mode, sys-status) from `mavlink_layer` to `scan_controller` (proximity + middle-waypoint computation), `movement_detector` (ego-motion compensation), and `telemetry_stream` (operator overlay). Provide a typed `UavTelemetry` snapshot for BIT consumption.
**Complexity**: 2 points
**Dependencies**: AZ-640_initial_structure, AZ-648_mission_executor_state_machine
**Component**: mission_executor
**Tracker**: AZ-649
**Epic**: AZ-636
## Problem
`mission_executor` is the only component subscribed to the raw decoded MAVLink stream — it owns the airframe relationship. Downstream components (`scan_controller`, `movement_detector`, `telemetry_stream`) and the BIT path need the same telemetry, but in a typed, projection-friendly form (`UavTelemetry { position, attitude, mode, sys_status, monotonic_ts }`). Forwarding must not duplicate decode work and must not drop messages silently.
## Outcome
- `UavTelemetry` is published on three lossy broadcast channels (one per downstream consumer) with monotonic timestamps; a consumer that falls behind has its drops counted and never blocks the others.
- `UavTelemetrySnapshot` (latest-state view) is exposed for BIT and health-check consumers.
- Health surface: `last_telemetry_ts`, per-consumer drop counters.
## Scope
### Included
- Subscribe to the typed `MavlinkMessage` enum from `mavlink_layer`.
- Project to `UavTelemetry` (`data_model.md §UavTelemetry`).
- Publish on three Tokio broadcast channels.
- Maintain an atomic latest-snapshot for synchronous reads.
### Excluded
- Decoding MAVLink (task 03).
- Geofence/battery checks (task 13).
- BIT logic (task 11).
## Acceptance Criteria
**AC-1: Telemetry reaches all three consumers**
Given a healthy SITL link
When `GLOBAL_POSITION_INT` and `ATTITUDE` arrive at 10 Hz
Then `UavTelemetry` is observed at ≥10 Hz on all three downstream channels, with monotonic timestamps.
**AC-2: Slow consumer drops, fast consumers unaffected**
Given a slow consumer that yields every 500 ms while telemetry arrives at 10 Hz
When the slow consumer's channel buffer fills
Then the slow consumer's drop counter increments while the other two channels deliver every frame.
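The drop-counting behaviour in AC-2 can be illustrated with a single-threaded, standard-library model: each consumer gets a bounded queue that discards its oldest frame when full. This is a model of the semantics only, not the Tokio-based implementation the spec calls for; `LossyQueue` and `FanOut` are hypothetical names.

```rust
use std::collections::VecDeque;

/// Bounded queue that drops the oldest item when full and counts the drop.
pub struct LossyQueue<T> {
    capacity: usize,
    items: VecDeque<T>,
    pub dropped: u64,
}

impl<T> LossyQueue<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, items: VecDeque::with_capacity(capacity), dropped: 0 }
    }

    /// Never blocks: a full queue loses its oldest entry instead.
    pub fn push(&mut self, item: T) {
        if self.items.len() == self.capacity {
            self.items.pop_front();
            self.dropped += 1;
        }
        self.items.push_back(item);
    }

    pub fn pop(&mut self) -> Option<T> {
        self.items.pop_front()
    }
}

/// One independent queue per downstream consumer.
pub struct FanOut<T> {
    pub consumers: Vec<LossyQueue<T>>,
}

impl<T: Clone> FanOut<T> {
    pub fn new(consumer_count: usize, capacity: usize) -> Self {
        Self { consumers: (0..consumer_count).map(|_| LossyQueue::new(capacity)).collect() }
    }

    /// Deliver a copy of `item` to every consumer; a full queue affects only its owner.
    pub fn publish(&mut self, item: &T) {
        for queue in &mut self.consumers {
            queue.push(item.clone());
        }
    }
}
```

Since each queue is independent, a stalled consumer only ever loses its own frames, and its `dropped` counter is what would feed the per-consumer health field.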
**AC-3: Latest-snapshot is monotonic**
Given a sequence of telemetry messages with monotonically advancing timestamps
When `latest_snapshot()` is read concurrently
Then every read returns a snapshot whose `monotonic_ts` is `>=` the previously observed value.
## Non-Functional Requirements
**Performance**
- Telemetry republish adds ≤2 ms to the MAVLink decode-to-consumer path.
**Reliability**
- Slow consumer never blocks fast consumers (lossy broadcast).
- Drops are counted, never silent.
## Runtime Completeness
- **Named capability**: typed telemetry fan-out to three concurrent consumers.
- **Production code that must exist**: real Tokio broadcast or equivalent; real atomic snapshot.
- **Unacceptable substitutes**: blocking single-consumer queue is not acceptable (it would gate the slowest downstream).
@@ -0,0 +1,69 @@
# Pre-Flight BIT (F9)
**Task**: AZ-650_mission_executor_bit_f9
**Name**: Built-In Test gate before ARMED/WAIT_AUTO
**Description**: Pre-flight Built-In Test (F9). Gates the transition to `ARMED` (multirotor) or `WAIT_AUTO` (fixed-wing). Covers every dependency in `architecture.md §5` plus mission load + MapObjects pre-flight pull (cached fallback acknowledged) + persistent-store free space + wall-clock binding. On FAIL no transition. On DEGRADED, surface to operator for signed acknowledgement (per Q9).
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-648_mission_executor_state_machine, AZ-649_mission_executor_telemetry_forwarding, AZ-644_mission_client_pull_and_schema, AZ-646_mission_client_mapobjects_pull
**Component**: mission_executor
**Tracker**: AZ-650
**Epic**: AZ-636
## Problem
The airframe must not be armed until every load-bearing dependency is verified healthy and every load-bearing input has been ingested. The BIT is the deliberate gate that implements `architecture.md §5` "BIT mandatory" + `system-flows.md §F9`. On FAIL the executor MUST refuse to transition to `BIT_OK`. On DEGRADED the executor requires a signed acknowledgement from the operator (per Q9) and proceeds only once that ack is observed.
## Outcome
- `Bit::evaluate(env) -> BitReport { items: Vec<BitItem { name, status: Pass | Degraded | Fail, detail }> }` returns a structured report.
- BIT items cover (at minimum): `mavlink_link`, `gimbal_link`, `camera_rtsp`, `detection_grpc`, `movement_telemetry_sync_ready`, `mapobjects_synced_or_cached_acked`, `mission_loaded`, `state_dir_free_space`, `wall_clock_bound`, `tier2_session_ready` (if enabled), `vlm_session_ready` (if enabled), `operator_bridge_session`.
- On `Fail` for any item, the state machine stays at `HEALTH_OK` and does NOT enter `BIT_OK`; the report surfaces via `operator_bridge`.
- On `Degraded` items, the state machine waits for a signed `BitDegradedAck` from `operator_bridge` (matching the report id); on ack, proceeds; on timeout (configurable; default 5 min), surfaces failure.
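
The status fusion described above can be sketched as a "worst item wins" reduction. This is a minimal illustration, not the task's implementation: `BitStatus`, `BitGate`, `overall`, and `gate` are hypothetical names, the `detail` field is simplified to a `String`, and treating an empty report as `Fail` is an assumption made here so that a missing evaluator list cannot pass the gate.

```rust
/// Status of one BIT item, declared from best to worst so that the derived
/// ordering makes `max` return the fused (worst) status.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum BitStatus {
    Pass,
    Degraded,
    Fail,
}

#[derive(Debug, Clone)]
pub struct BitItem {
    pub name: &'static str,
    pub status: BitStatus,
    pub detail: String,
}

impl BitItem {
    pub fn new(name: &'static str, status: BitStatus) -> Self {
        Self { name, status, detail: String::new() }
    }
}

/// What the state machine may do next (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum BitGate {
    Proceed,
    AwaitSignedAck,
    Block,
}

/// Fuse item statuses: the worst item decides the overall result.
/// Assumption: an empty report is treated as `Fail`.
pub fn overall(items: &[BitItem]) -> BitStatus {
    items.iter().map(|i| i.status).max().unwrap_or(BitStatus::Fail)
}

/// `ack_received` stands for an already validated, signed `BitDegradedAck`
/// that matches this report; validation itself is out of scope here.
pub fn gate(items: &[BitItem], ack_received: bool) -> BitGate {
    match overall(items) {
        BitStatus::Pass => BitGate::Proceed,
        BitStatus::Degraded if ack_received => BitGate::Proceed,
        BitStatus::Degraded => BitGate::AwaitSignedAck,
        BitStatus::Fail => BitGate::Block,
    }
}
```

Deriving `Ord` on the enum keeps the fusion rule in one place: adding a status between `Degraded` and `Fail` would only require placing the variant correctly.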
## Scope
### Included
- BIT item evaluators (one per item).
- Report aggregation + status fusion.
- Signed `BitDegradedAck` handling (the auth check itself lives in `operator_bridge` — this task only consumes the validated event).
- Timeout for ack.
### Excluded
- BIT UI / operator overlay (Ground Station + `operator_bridge`).
- Operator-command auth validation (lives in `operator_bridge` — task 41).
## Acceptance Criteria
**AC-1: All-pass BIT proceeds**
Given every dependency is healthy
When the executor reaches `HEALTH_OK` and runs BIT
Then `BitReport.overall = Pass`, the machine transitions to `BIT_OK`, and proceeds to `ARMED` (multirotor) or `MISSION_UPLOADED` (fixed-wing).
**AC-2: Fail blocks transition**
Given `camera_rtsp` reports `Fail`
When BIT runs
Then `BitReport.overall = Fail`, the machine stays at `HEALTH_OK`, and the report is observable via `operator_bridge`.
**AC-3: Degraded requires signed ack**
Given `mapobjects_synced_or_cached_acked` reports `Degraded` (cached fallback)
When BIT runs
Then the executor waits; only after a signed `BitDegradedAck` matching the report id does the machine transition to `BIT_OK`.
**AC-4: Degraded ack timeout fails the BIT**
Given a Degraded report with no ack within the configured timeout (default 5 min)
When the timeout fires
Then `BitReport.overall = Fail`, the machine stays at `HEALTH_OK`, and the timeout is observable.
## Non-Functional Requirements
**Performance**
- BIT evaluation completes in ≤2 s when all dependencies are healthy.
**Reliability**
- No silent FAIL; every item's status is observable.
## Runtime Completeness
- **Named capability**: F9 BIT — production gate before arming.
- **Production code that must exist**: real evaluators that read live health from each dependency; real signed-ack consumption path.
- **Unacceptable substitutes**: a hardcoded "BIT always passes" path in production is unacceptable.
@@ -0,0 +1,72 @@
# Lost-Link Failsafe Ladder (F10)
**Task**: AZ-651_mission_executor_lost_link_ladder
**Name**: Lost-link ladder LinkOk → LinkDegraded → LinkLost → LinkLostInFollow
**Description**: Per-tick evaluation of the operator/Ground-Station modem link state. Default RTL after 30 s grace. Configurable. MAVLink-link loss to ArduPilot itself is a separate, more severe event — health → red, airframe failsafe takes over (we do NOT override it).
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-648_mission_executor_state_machine, AZ-649_mission_executor_telemetry_forwarding
**Component**: mission_executor
**Tracker**: AZ-651
**Epic**: AZ-636
## Problem
The operator's modem link is critical to safe operation but inherently flaky. The failsafe must escalate predictably from `LinkOk` to `LinkDegraded` (5–30 s) to `LinkLost` (>30 s) to `LinkLostInFollow` (the special-cased target-follow case), each step with a defined behaviour. Default action on `LinkLost` is RTL after a grace window. Crucially, MAVLink-link loss to ArduPilot is a different event: autopilot does NOT override the airframe's built-in failsafe in that case.
## Outcome
- `LostLinkLadder::tick(now, link_state)` updates an enum `LadderState ∈ {LinkOk, LinkDegraded, LinkLost, LinkLostInFollow}` deterministically based on the elapsed time since the last operator-link heartbeat.
- `LinkDegraded` for 5–30 s: health → yellow; events queued; no command to airframe.
- `LinkLost` for >30 s (configurable): trigger RTL via `mavlink_layer`; transition to `LAND`.
- `LinkLostInFollow` (active `TargetFollow` + >30 s): 30 s grace, then RTL.
- MAVLink-link loss to ArduPilot: detected via `mavlink_layer`'s `LinkLost`; health → red; do NOT issue RTL (airframe handles it).
- Health surface: current `LadderState`, time-in-state, RTL trigger count.
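
A minimal sketch of the ladder classification, assuming the state is a pure function of the time since the last operator-link heartbeat. `LadderConfig`, `classify`, and `should_rtl` are hypothetical names, and the spec's `tick(now, link_state)` signature is simplified. It also assumes the follow grace is counted on top of the lost threshold (RTL at 60 s of silence with the defaults); the spec text can be read either way, so treat that as an open point.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LadderState {
    LinkOk,
    LinkDegraded,
    LinkLost,
    LinkLostInFollow,
}

/// Thresholds for the operator link (defaults taken from the Scope section).
#[derive(Debug, Clone, Copy)]
pub struct LadderConfig {
    pub degraded_after: Duration,
    pub lost_after: Duration,
    pub follow_grace: Duration,
}

impl Default for LadderConfig {
    fn default() -> Self {
        Self {
            degraded_after: Duration::from_secs(5),
            lost_after: Duration::from_secs(30),
            follow_grace: Duration::from_secs(30),
        }
    }
}

/// Classify the operator link from the silence since the last heartbeat.
pub fn classify(silence: Duration, in_target_follow: bool, cfg: &LadderConfig) -> LadderState {
    if silence > cfg.lost_after {
        if in_target_follow {
            LadderState::LinkLostInFollow
        } else {
            LadderState::LinkLost
        }
    } else if silence >= cfg.degraded_after {
        LadderState::LinkDegraded
    } else {
        LadderState::LinkOk
    }
}

/// Whether RTL should be requested for this silence duration.
/// Assumption: in target-follow the grace window is added to `lost_after`.
pub fn should_rtl(silence: Duration, in_target_follow: bool, cfg: &LadderConfig) -> bool {
    match classify(silence, in_target_follow, cfg) {
        LadderState::LinkLost => true,
        LadderState::LinkLostInFollow => silence > cfg.lost_after + cfg.follow_grace,
        LadderState::LinkOk | LadderState::LinkDegraded => false,
    }
}
```

The caller would still need to latch the RTL so it is issued exactly once, and to handle MAVLink-link loss on a separate path that never reaches `should_rtl`.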
## Scope
### Included
- Ladder state machine.
- Subscribe to operator-link state from `telemetry_stream` (forwarded by `operator_bridge` health).
- Subscribe to MAVLink-link state from `mavlink_layer`.
- Configurable thresholds (defaults: degraded=5 s, lost=30 s, follow-grace=30 s).
- RTL command issuance via `mavlink_layer::send_command(MAV_CMD_NAV_RETURN_TO_LAUNCH)`.
### Excluded
- Operator command auth checks (`operator_bridge`).
- Target-follow state ownership (`scan_controller`).
## Acceptance Criteria
**AC-1: Operator-link degraded then recovers**
Given a healthy link
When the operator-link heartbeat stops for 10 s and resumes
Then the ladder reports `LinkOk → LinkDegraded → LinkOk` with correct dwell times; no RTL is issued.
**AC-2: Operator-link lost triggers RTL**
Given a healthy link
When the operator-link heartbeat stops for 31 s
Then the ladder reports `LinkLost`, `send_command(MAV_CMD_NAV_RETURN_TO_LAUNCH)` is issued exactly once, and the state machine transitions to `LAND`.
**AC-3: Lost-in-follow grace then RTL**
Given the system is in `TargetFollow` and the operator-link drops
When the link is down for 30 s (grace), then continues to be down past the grace
Then RTL is triggered after the grace fires, not earlier.
**AC-4: MAVLink loss does NOT trigger autopilot-side RTL**
Given the MAVLink link to ArduPilot is lost (`mavlink_layer` reports `LinkLost`)
When the ladder tick runs
Then health → red, no `MAV_CMD_NAV_RETURN_TO_LAUNCH` is issued by autopilot (airframe failsafe owns the response), and the event is observable.
## Non-Functional Requirements
**Performance**
- Ladder tick: ≤5 ms.
**Reliability**
- All thresholds configurable; no hardcoded defaults beyond the defaults documented above.
## Runtime Completeness
- **Named capability**: F10 lost-link failsafe ladder.
- **Production code that must exist**: real state machine; real RTL command issuance.
- **Unacceptable substitutes**: omitting the `LinkLostInFollow` grace is not acceptable (an operator may have momentary glitches mid-follow).
@@ -0,0 +1,92 @@
# Geofence + Battery Enforcement + Middle-Waypoint Re-Upload + Post-Flight Trigger
**Task**: AZ-652_mission_executor_safety_and_resume
**Name**: Geofence + battery thresholds + middle-waypoint re-upload + post-flight push trigger
**Description**: Continuous safety enforcement (INCLUSION + EXCLUSION geofences honoured equally; battery thresholds with operator override). Mission re-upload on middle-waypoint hint. Mission revert on target-follow ending. Trigger post-flight MapObjects push on `POST_FLIGHT_SYNC` entry.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-648_mission_executor_state_machine, AZ-649_mission_executor_telemetry_forwarding, AZ-643_mavlink_ack_demux_and_signing, AZ-647_mission_client_mapobjects_push
**Component**: mission_executor
**Tracker**: AZ-652
**Epic**: AZ-636
## Problem
The state machine alone is not enough — three continuous concerns must run on every tick:
1. **Geofence enforcement**: both INCLUSION and EXCLUSION violations trigger RTL. The earlier C++ behaviour silently ignored EXCLUSION; the new design rejects that.
2. **Battery / fuel thresholds**: RTL at `battery ≤ rtl_threshold` (default 25 %); land-now at `battery ≤ hard_floor` (default 15 %); operator override only via signed command.
3. **Middle-waypoint re-upload + target-follow revert**: on operator confirm, recompute and re-upload (`MISSION_CLEAR_ALL` → re-upload → `MISSION_SET_CURRENT(0)`); on target-follow ending, recompute and re-upload the original mission from current position.
Plus the post-flight trigger: on `POST_FLIGHT_SYNC` entry, hand off to `mission_client::push_mapobjects_diff`.
## Outcome
- `GeofenceMonitor::tick(uav_telemetry, mission_geofences)` triggers RTL on INCLUSION exit or EXCLUSION entry within ≤500 ms; alert is observable.
- `BatteryMonitor::tick(sys_status, ext_sys_state)` triggers RTL at `≤rtl_threshold`, land-now at `≤hard_floor`; signed operator-override is honoured and audit-logged.
- `MissionRePlanner::on_middle_waypoint(hint)` computes the patched mission and issues the re-upload sequence; result is observable.
- `MissionRePlanner::on_target_follow_release(reason)` recomputes the original mission from the current position and re-uploads.
- On entry to `POST_FLIGHT_SYNC`, the executor calls `mission_client::push_mapobjects_diff(mission_id, diff)`; the result is logged; the machine still reaches `DONE` even on push failure (a failed push surfaces a manual-replay warning).
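
The symmetric INCLUSION/EXCLUSION check can be illustrated with a hand-rolled ray-casting test. This is a sketch under stated assumptions: coordinates are treated as planar (a real implementation would project lat/lon first and would likely use the `geo` crate named in Scope), `Geofence`, `FenceKind`, and `violated` are hypothetical names, and requiring the UAV to be inside every INCLUSION fence is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FenceKind {
    Inclusion,
    Exclusion,
}

pub struct Geofence {
    pub kind: FenceKind,
    /// Polygon vertices in order; the ring is closed implicitly.
    pub ring: Vec<Point>,
}

/// Ray casting: count how many polygon edges a horizontal ray from `p`
/// crosses; an odd count means `p` is inside.
pub fn point_in_polygon(p: Point, ring: &[Point]) -> bool {
    let n = ring.len();
    if n < 3 {
        return false;
    }
    let mut inside = false;
    let mut j = n - 1;
    for i in 0..n {
        let (a, b) = (ring[i], ring[j]);
        // The edge straddles the ray's y, so a.y != b.y and the division is safe.
        if (a.y > p.y) != (b.y > p.y) {
            let x_cross = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if p.x < x_cross {
                inside = !inside;
            }
        }
        j = i;
    }
    inside
}

/// Returns the kind of the first violated fence, if any. Both kinds are
/// checked in the same pass so neither can be silently skipped.
pub fn violated(p: Point, fences: &[Geofence]) -> Option<FenceKind> {
    for f in fences {
        let inside = point_in_polygon(p, &f.ring);
        match f.kind {
            FenceKind::Inclusion if !inside => return Some(FenceKind::Inclusion),
            FenceKind::Exclusion if inside => return Some(FenceKind::Exclusion),
            _ => {}
        }
    }
    None
}
```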
## Scope
### Included
- Continuous geofence check using `geo` crate or equivalent (point-in-polygon).
- Continuous battery check using `SYS_STATUS` + `EXTENDED_SYS_STATE`.
- Re-upload sequence helpers.
- Post-flight push trigger.
### Excluded
- Middle-waypoint computation algorithm (`scan_controller` provides the hint with `target_mgrs` + `target_class`; the executor only handles re-upload mechanics).
- Operator signature validation (`operator_bridge`).
- The actual push (`mission_client` task 08).
- The audit log persistence layer (lives in `shared::audit`).
## Acceptance Criteria
**AC-1: INCLUSION geofence exit triggers RTL**
Given a multirotor flying inside an INCLUSION polygon
When the UAV position crosses outside the polygon
Then RTL is triggered within ≤500 ms; the alert is observable; the state machine transitions to `LAND`.
**AC-2: EXCLUSION geofence entry triggers RTL**
Given a multirotor flying outside an EXCLUSION polygon
When the UAV position crosses into the polygon
Then RTL is triggered within ≤500 ms (parity with INCLUSION); the alert is observable.
**AC-3: Battery thresholds**
Given a multirotor flying with battery at 30 %
When `SYS_STATUS` reports battery at 24 %
Then RTL is triggered; transition to `LAND`.
When (in a separate scenario) `SYS_STATUS` drops below 15 %
Then `MAV_CMD_NAV_LAND` is issued (land-now); health → red.
**AC-4: Signed operator override of battery RTL**
Given the battery monitor would otherwise RTL at 24 %
When a signed `BatteryOverride { until_ts }` is received from `operator_bridge`
Then RTL is suppressed until `until_ts`; the override is recorded with operator id + rationale in the audit log.
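
The threshold logic in AC-3 and AC-4 can be sketched as a single decision function. Names here (`BatteryAction`, `BatteryThresholds`, `decide`) are hypothetical, timestamps are plain integers for brevity, and the sketch assumes the signed override suppresses only the RTL threshold, never the hard floor; the spec does not state that explicitly.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BatteryAction {
    Continue,
    ReturnToLaunch,
    LandNow,
}

#[derive(Debug, Clone, Copy)]
pub struct BatteryThresholds {
    pub rtl_pct: u8,
    pub hard_floor_pct: u8,
}

impl Default for BatteryThresholds {
    fn default() -> Self {
        // Defaults from the Problem section: RTL at 25 %, land-now at 15 %.
        Self { rtl_pct: 25, hard_floor_pct: 15 }
    }
}

/// Decide the action for one battery reading.
/// `override_until_ts` is `Some(ts)` only when a validated, signed
/// `BatteryOverride { until_ts }` has been received.
pub fn decide(
    battery_pct: u8,
    now_ts: u64,
    override_until_ts: Option<u64>,
    t: &BatteryThresholds,
) -> BatteryAction {
    // Assumption: the hard floor always wins, even under an active override.
    if battery_pct <= t.hard_floor_pct {
        return BatteryAction::LandNow;
    }
    if battery_pct <= t.rtl_pct {
        let suppressed = matches!(override_until_ts, Some(until) if now_ts < until);
        if suppressed {
            BatteryAction::Continue
        } else {
            BatteryAction::ReturnToLaunch
        }
    } else {
        BatteryAction::Continue
    }
}
```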
**AC-5: Middle-waypoint re-upload sequence**
Given a confirmed POI yields a middle-waypoint hint
When `on_middle_waypoint` is invoked
Then the sequence `MISSION_CLEAR_ALL` → upload all waypoints → `MISSION_SET_CURRENT(0)` is issued in order, completing in ≤2 s end-to-end.
**AC-6: Post-flight push trigger**
Given the executor enters `POST_FLIGHT_SYNC`
When the entry guard runs
Then `mission_client::push_mapobjects_diff(mission_id, diff)` is called exactly once; the executor reaches `DONE` regardless of push success.
## Non-Functional Requirements
**Performance**
- Geofence response time: ≤500 ms from violation detection to RTL command.
- Middle-waypoint re-upload: ≤2 s end-to-end.
**Reliability**
- Both geofence variants enforced; symmetric behaviour.
- No infinite retry on re-upload — bounded by the executor's transition-retry budget.
## Runtime Completeness
- **Named capability**: geofence enforcement (both variants) + battery thresholds + re-upload sequence + post-flight push trigger.
- **Production code that must exist**: real point-in-polygon; real `SYS_STATUS` decode; real `MAV_CMD_*` issuance.
- **Unacceptable substitutes**: ignoring EXCLUSION (the pre-existing C++ bug) is unacceptable; honouring a battery override without signed proof is unacceptable.
@@ -0,0 +1,79 @@
# ViewPro A40 Vendor Transport
**Task**: AZ-653_gimbal_a40_transport
**Name**: ViewPro A40 vendor protocol UDP transport
**Description**: UDP transport, frame encode/decode, CRC16 (vendor spec), bounded retry on command timeout. Surface vendor faults to health.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure
**Component**: gimbal_controller
**Tracker**: AZ-653
**Epic**: AZ-634
## Problem
The gimbal is a ViewPro A40 vendor product reachable over UDP using a vendor-specified frame format with CRC16. The transport layer must encode and decode every command/response frame this codebase issues (yaw, pitch, zoom, feedback request, mode commands), validate CRC on inbound frames, and re-issue on timeout with bounded retry. The vendor protocol is fixed by the camera — the device's binary protocol is a `restrictions.md` constraint, not a design choice.
## Outcome
- `A40Transport::send(cmd) -> Result<A40Response, A40Error>` writes a CRC-correct vendor frame to the configured UDP endpoint and awaits the matching response within a deadline.
- Inbound frames are CRC-validated; mismatches are dropped and counted as `vendor_faults_total{kind="crc"}`.
- Bounded retry on timeout (default 3 attempts; configurable).
- Health surface: `commands_per_min`, `vendor_faults_total`, `last_command_in_flight`.
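
The bounded-retry behaviour can be sketched independently of the UDP and framing details. This is an illustration only: `send_with_retry` is a hypothetical helper, the `Timeout` variant is an assumption (only `MaxRetriesExceeded` appears in this spec), and the closure stands in for "write one frame and wait for the matching response within the deadline".

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum A40Error {
    /// Assumed variant: one attempt hit its deadline without a response.
    Timeout,
    MaxRetriesExceeded,
}

/// Run `attempt` up to `max_attempts` times, retrying only on timeout.
/// Returns the final result plus the number of timeouts observed, which a
/// caller could add to a `vendor_faults_total{kind="timeout"}` counter.
pub fn send_with_retry<T, F>(max_attempts: u32, mut attempt: F) -> (Result<T, A40Error>, u32)
where
    F: FnMut(u32) -> Result<T, A40Error>,
{
    let mut timeouts = 0;
    for n in 1..=max_attempts {
        match attempt(n) {
            Ok(v) => return (Ok(v), timeouts),
            Err(A40Error::Timeout) => timeouts += 1,
            // Any other error is not retried.
            Err(other) => return (Err(other), timeouts),
        }
    }
    (Err(A40Error::MaxRetriesExceeded), timeouts)
}
```

Keeping the retry loop separate from the socket code makes the cap testable without a fake endpoint; the transport can pass a closure that performs the real send and receive.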
## Scope
### Included
- UDP socket (single endpoint).
- CRC16 (vendor polynomial) encode/decode helpers.
- Frame encoders for yaw / pitch / zoom commands + feedback request.
- Frame decoders for yaw / pitch / zoom feedback + vendor fault frames.
- Bounded retry on timeout.
### Excluded
- Sweep pattern primitive (task 15).
- Smooth-pan plan execution (task 16).
- Centre-on-target primitive (task 17).
- Vendor protocol *specification*: assumed to be reverse-engineered or vendor-supplied separately; this task implements against the documented frame layout in `misc/camera/a8/` (the predecessor model, A8; the A40 differs in command codes per `architecture.md`).
## Acceptance Criteria
**AC-1: CRC round-trip**
Given the encoder produces a yaw command frame for `yaw = 30°`
When the same frame is fed back through the decoder
Then the decoded command matches and `vendor_faults_total{kind="crc"} = 0`.
**AC-2: CRC mismatch counted**
Given an inbound frame with corrupted CRC
When the decoder consumes it
Then the frame is dropped and `vendor_faults_total{kind="crc"}` increments by 1.
**AC-3: Command timeout retries**
Given a fake A40 endpoint that drops the first command silently
When `send(yaw_cmd)` is called with default 3 attempts
Then the call succeeds on retry; `vendor_faults_total{kind="timeout"}` reports 1.
**AC-4: Cap exhaustion returns explicit error**
Given the endpoint never responds
When `send(yaw_cmd)` is called
Then after 3 attempts the call returns `Err(MaxRetriesExceeded)` and the error surfaces to the caller.
## Non-Functional Requirements
**Performance**
- Single command round-trip: ≤200 ms on a healthy link (well under the ≤500 ms decision-to-movement budget).
**Reliability**
- CRC mismatches counted, never silent.
- Bounded retry; no infinite retry.
## Constraints
- Vendor protocol is fixed; no negotiation.
- One A40 per autopilot instance.
## Runtime Completeness
- **Named capability**: ViewPro A40 vendor protocol on UDP.
- **Production code that must exist**: real CRC16; real UDP socket; real per-command encoder/decoder.
- **Allowed external stubs**: in tests, a UDP echo with vendor-frame replay can simulate the camera.
- **Unacceptable substitutes**: a generic "send raw bytes and assume success" path is unacceptable — the protocol's frame format and CRC are non-negotiable.
@@ -0,0 +1,64 @@
# Zoom-Out Sweep Pattern
**Task**: AZ-654_gimbal_zoom_out_sweep
**Name**: Zoom-out sweep pattern primitive
**Description**: Run the zoom-out sweep pattern when `scan_controller` is in `ZoomedOut`. The exact pattern (pendulum / raster / lawn-mower) is gated by `architecture.md §8 Q1`; this task implements one selectable default with the pattern enum and exposes the choice through config.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-653_gimbal_a40_transport
**Component**: gimbal_controller
**Tracker**: AZ-654
**Epic**: AZ-634
## Problem
In `ZoomedOut`, the gimbal must sweep its FOV continuously to maximise coverage. The exact pattern is an open architecture question (Q1); this task implements the `SweepEngine` abstraction and ships `Pendulum` as the safe default, with `Raster` and `LawnMower` enum variants reserved. Switching pattern is config-only — no API change to consumers.
## Outcome
- `SweepEngine::next_step(state) -> GimbalCommand` produces a sequence of yaw / pitch / zoom commands implementing the configured sweep pattern with bounded jitter and no overshoot beyond configured FOV bounds.
- Default pattern is `Pendulum`; `Raster` and `LawnMower` are wired as enum variants (one implemented; the others reserved).
- Sweep config (FOV per zoom tier, dwell time per direction, step size) is loaded from startup config.
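
A minimal sketch of the `Pendulum` default, assuming a yaw-only sweep and a dwell expressed in ticks rather than milliseconds. The field names on `SweepConfig`, the `SweepError` type, and the `f64` return value (instead of a full `GimbalCommand`) are simplifications for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SweepPattern {
    Pendulum,
    Raster,
    LawnMower,
}

#[derive(Debug, PartialEq)]
pub enum SweepError {
    NotImplemented(SweepPattern),
}

#[derive(Debug, Clone, Copy)]
pub struct SweepConfig {
    pub min_yaw: f64,
    pub max_yaw: f64,
    pub step_deg: f64,
    /// Ticks to hold at each bound (stand-in for a millisecond dwell).
    pub dwell_ticks: u32,
}

pub struct SweepEngine {
    pattern: SweepPattern,
    cfg: SweepConfig,
    yaw: f64,
    dir: f64,
    dwell_left: u32,
}

impl SweepEngine {
    pub fn new(pattern: SweepPattern, cfg: SweepConfig) -> Self {
        Self { pattern, cfg, yaw: cfg.min_yaw, dir: 1.0, dwell_left: 0 }
    }

    /// Next yaw to command. Reserved patterns fail explicitly.
    pub fn next_step(&mut self) -> Result<f64, SweepError> {
        match self.pattern {
            SweepPattern::Pendulum => Ok(self.pendulum_step()),
            SweepPattern::Raster | SweepPattern::LawnMower => {
                Err(SweepError::NotImplemented(self.pattern))
            }
        }
    }

    fn pendulum_step(&mut self) -> f64 {
        if self.dwell_left > 0 {
            self.dwell_left -= 1;
            return self.yaw;
        }
        let (min, max) = (self.cfg.min_yaw, self.cfg.max_yaw);
        // Clamping guarantees the command never overshoots the bounds.
        let next = (self.yaw + self.dir * self.cfg.step_deg).clamp(min, max);
        if next <= min || next >= max {
            self.dir = -self.dir;
            self.dwell_left = self.cfg.dwell_ticks;
        }
        self.yaw = next;
        next
    }
}
```

Because `next_step` matches every variant without a wildcard arm, adding a fourth pattern to the enum would fail to compile until it is handled.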
## Scope
### Included
- `SweepPattern` enum with all three variants declared; default impl for `Pendulum`.
- `SweepEngine` struct holding the current direction + dwell counter.
- Bounded-jitter command emission.
- FOV-bound enforcement.
### Excluded
- The pattern selection rationale (Q1 — resolved separately).
- Smooth-pan plan execution (task 16).
- Centre-on-target (task 17).
## Acceptance Criteria
**AC-1: Pendulum sweep emits a bounded-jitter command stream**
Given `SweepEngine::new(SweepPattern::Pendulum, config)`
When `next_step()` is called 100 times
Then the yaw values stay within `[config.min_yaw, config.max_yaw]`, never overshoot, and reverse direction at each bound.
**AC-2: Dwell at bounds is respected**
Given a config with `dwell_ms = 500`
When the sweep reaches a yaw bound
Then `next_step()` returns the same yaw for at least 500 ms before reversing direction.
**AC-3: Pattern enum exhaustiveness**
Given the `SweepPattern` enum
When client code matches on it exhaustively
Then the compiler requires arms for `Pendulum`, `Raster`, and `LawnMower`; unimplemented variants return `Err(NotImplemented)` at runtime and never silently fall back.
## Non-Functional Requirements
**Performance**
- `next_step()` p99 ≤1 ms.
**Reliability**
- Bounded jitter; no overshoot.
## Runtime Completeness
- **Named capability**: zoom-out sweep pattern (default `Pendulum`).
- **Production code that must exist**: real bounded sweep state machine.
- **Unacceptable substitutes**: random walk is not acceptable — sweep coverage must be deterministic and bounded.
@@ -0,0 +1,64 @@
# Smooth-Pan Path-Tracking Plan Executor
**Task**: AZ-655_gimbal_smooth_pan_plan
**Name**: Smooth-pan plan executor (zoom-in path-follow)
**Description**: Accept a pan plan (sequence of yaw / pitch / zoom goals with timing) from `semantic_analyzer` via `scan_controller` and execute it smoothly. Used for follow-the-footpath behaviour during the zoom-in level.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-653_gimbal_a40_transport
**Component**: gimbal_controller
**Tracker**: AZ-655
**Epic**: AZ-634
## Problem
When `scan_controller` is in `ZoomedIn` and `semantic_analyzer` recommends `PanFollowFootpath`, a sequence of yaw/pitch/zoom goals with timing arrives. The executor must interpolate between goals smoothly (no step jumps) and respect the vendor's command rate — if the plan is too dense, drop the lowest-priority goals rather than blocking the queue.
## Outcome
- `PlanExecutor::load(plan: PanPlan)` accepts an ordered sequence `Vec<(yaw, pitch, zoom, at_ts)>`.
- `next_step(now)` returns the interpolated `GimbalCommand` to issue at `now`: goals whose `at_ts` has already passed are skipped, and the command is linearly interpolated toward the next goal whose `at_ts` is still ahead.
- The executor self-throttles: emits at most one command per `min_cmd_interval_ms` (default 50 ms), dropping intermediate interpolations.
- Health: `plan_loaded_at`, `commands_emitted_total`, `commands_dropped_to_throttle_total`.
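
Interpolation and self-throttling can be sketched together. This is a simplified illustration: `Goal` replaces the `(yaw, pitch, zoom, at_ts)` tuple, timestamps are plain milliseconds, `load` takes the throttle interval directly, and goals are assumed to be sorted by strictly increasing `at_ms`. Holding the first goal before the plan starts is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Goal {
    pub yaw: f64,
    pub pitch: f64,
    pub zoom: f64,
    pub at_ms: u64,
}

pub struct PlanExecutor {
    goals: Vec<Goal>,
    min_cmd_interval_ms: u64,
    last_emit_ms: Option<u64>,
    pub emitted: u64,
    pub dropped_to_throttle: u64,
}

impl PlanExecutor {
    /// Assumes `goals` is sorted by strictly increasing `at_ms`.
    pub fn load(goals: Vec<Goal>, min_cmd_interval_ms: u64) -> Self {
        Self { goals, min_cmd_interval_ms, last_emit_ms: None, emitted: 0, dropped_to_throttle: 0 }
    }

    fn interpolate(&self, now_ms: u64) -> Option<Goal> {
        let first = *self.goals.first()?;
        let last = *self.goals.last()?;
        if now_ms <= first.at_ms {
            return Some(first); // before the plan starts: hold the first goal
        }
        if now_ms >= last.at_ms {
            return Some(last); // past the end: clamp to the last goal
        }
        // Find the segment [a, b) that contains `now_ms`.
        let seg = self.goals.windows(2).find(|w| w[0].at_ms <= now_ms && now_ms < w[1].at_ms)?;
        let (a, b) = (seg[0], seg[1]);
        let t = (now_ms - a.at_ms) as f64 / (b.at_ms - a.at_ms) as f64;
        let lerp = |x: f64, y: f64| x + (y - x) * t;
        Some(Goal {
            yaw: lerp(a.yaw, b.yaw),
            pitch: lerp(a.pitch, b.pitch),
            zoom: lerp(a.zoom, b.zoom),
            at_ms: now_ms,
        })
    }

    /// Returns `None` when the call is throttled or the plan is empty.
    pub fn next_step(&mut self, now_ms: u64) -> Option<Goal> {
        if let Some(last) = self.last_emit_ms {
            if now_ms < last + self.min_cmd_interval_ms {
                self.dropped_to_throttle += 1;
                return None;
            }
        }
        let cmd = self.interpolate(now_ms)?;
        self.last_emit_ms = Some(now_ms);
        self.emitted += 1;
        Some(cmd)
    }
}
```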
## Scope
### Included
- `PanPlan` data type (`data_model.md §PanPlan`).
- Linear interpolation between adjacent goals.
- Self-throttling.
### Excluded
- Generating the plan (`semantic_analyzer`).
- Sweep pattern (task 15).
- Centre-on-target (task 17).
## Acceptance Criteria
**AC-1: Linear interpolation between goals**
Given a plan with two goals 1 s apart and yaw `0° → 30°`
When `next_step(now=500ms)` is called
Then the returned `yaw` is `15°` ± a defined epsilon.
**AC-2: Self-throttle drops intermediate calls**
Given `min_cmd_interval_ms = 100`
When `next_step()` is called every 10 ms for 1 s
Then approximately 10 commands are emitted (the rest counted as throttled).
**AC-3: Plan past its end clamps to last goal**
Given a plan whose last `at_ts` is in the past
When `next_step(now)` is called
Then the returned command equals the last goal's `(yaw, pitch, zoom)`; no error.
## Non-Functional Requirements
**Performance**
- `next_step()` p99 ≤1 ms.
**Reliability**
- Throttle drops are counted, never silent.
## Runtime Completeness
- **Named capability**: smooth-pan plan execution + interpolation.
- **Production code that must exist**: real interpolation; real self-throttle.
- **Unacceptable substitutes**: dispatching every plan goal directly without interpolation/throttling is not acceptable (causes jerky panning).
@@ -0,0 +1,64 @@
# Centre-On-Target Primitive + GimbalState Publish
**Task**: AZ-656_gimbal_centre_on_target
**Name**: Centre-on-target primitive + timestamped GimbalState publish
**Description**: During `TargetFollow`, accept a centre-on-target stream (target bbox normalized) from `scan_controller` and command the gimbal to keep the target inside the centre 25 % of frame while visible. Stamp every emitted command + reported state with a monotonic timestamp so `movement_detector` can synchronise.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-653_gimbal_a40_transport
**Component**: gimbal_controller
**Tracker**: AZ-656
**Epic**: AZ-634
## Problem
During target-follow, the gimbal must continuously re-aim to keep the target inside the centre 25 % of frame. The control loop must converge without overshoot, and every emitted command + every reported `GimbalState` must carry a monotonic timestamp so `movement_detector` can synchronise gimbal motion with the per-frame ego-motion estimate.
## Outcome
- `CentreOnTarget::tick(bbox_normalized, current_state) -> GimbalCommand` produces the yaw/pitch command needed to nudge the target toward frame centre; convergence within ≤3 ticks under nominal latency.
- Reported `GimbalState { yaw, pitch, zoom, ts_monotonic, command_in_flight }` is published on the state channel for `frame_ingest` (telemetry tagging) and `movement_detector` (ego-motion sync) consumption.
- If the target bbox is missing for 3 consecutive ticks, emit a `target_lost` signal to `scan_controller`.
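
A minimal sketch of the proportional nudge and the debounced `target_lost` signal. Several points are assumptions rather than spec: the bbox is read as `(x, y, w, h)` with a top-left origin in normalized coordinates, "centre 25 %" is read as 25 % of frame area (a half-extent of 0.25 per axis), the output is an error-proportional delta in arbitrary units, and `TickOutput` is a hypothetical type standing in for `GimbalCommand` plus the signal.

```rust
#[derive(Debug, PartialEq)]
pub enum TickOutput {
    /// Proportional correction; sign conventions are illustrative only.
    Nudge { d_yaw: f64, d_pitch: f64 },
    Hold,
    TargetLost,
}

pub struct CentreOnTarget {
    gain: f64,
    missing: u32,
    lost_signalled: bool,
}

/// Assumed half-extent of the centre region per axis (25 % of frame area).
const CENTRE_HALF_EXTENT: f64 = 0.25;
const MISSING_TICKS_BEFORE_LOST: u32 = 3;

impl CentreOnTarget {
    pub fn new(gain: f64) -> Self {
        Self { gain, missing: 0, lost_signalled: false }
    }

    pub fn tick(&mut self, bbox: Option<(f64, f64, f64, f64)>) -> TickOutput {
        match bbox {
            Some((x, y, w, h)) => {
                // Target visible again: reset the debounce state.
                self.missing = 0;
                self.lost_signalled = false;
                let (ex, ey) = (x + w / 2.0 - 0.5, y + h / 2.0 - 0.5);
                if ex.abs() <= CENTRE_HALF_EXTENT && ey.abs() <= CENTRE_HALF_EXTENT {
                    TickOutput::Hold
                } else {
                    TickOutput::Nudge { d_yaw: self.gain * ex, d_pitch: self.gain * ey }
                }
            }
            None => {
                self.missing += 1;
                if self.missing >= MISSING_TICKS_BEFORE_LOST && !self.lost_signalled {
                    self.lost_signalled = true; // emit exactly once per loss
                    TickOutput::TargetLost
                } else {
                    TickOutput::Hold
                }
            }
        }
    }
}
```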
## Scope
### Included
- Centre-25% control loop (proportional, configurable gain).
- Monotonic timestamp stamping (single source of truth: `Instant::now()` at emit point).
- `GimbalState` publisher.
- `target_lost` signal on 3 consecutive missing bboxes.
### Excluded
- Target-follow state ownership (`scan_controller`).
- Sweep (task 15) and pan plan (task 16).
## Acceptance Criteria
**AC-1: Centre convergence**
Given a target initially at bbox `(0.7, 0.5, 0.1, 0.1)` (right side of frame) and a healthy A40
When `tick()` is invoked over 3 cycles at 100 ms each
Then by the third cycle the target bbox centre is within the centre 25 % region.
**AC-2: GimbalState carries monotonic timestamp**
Given a sequence of `tick()` calls
When the resulting `GimbalState` is observed
Then `ts_monotonic` is strictly monotonically increasing across observations.
**AC-3: Target loss signals after 3 missing ticks**
Given the target bbox stream goes empty
When 3 consecutive ticks have no bbox
Then a `target_lost` signal is published exactly once; subsequent ticks do not re-emit.
## Non-Functional Requirements
**Performance**
- `tick()` p99 ≤2 ms.
- Centre convergence within ≤3 ticks at 10 Hz.
**Reliability**
- `target_lost` debounced — never spurious.
## Runtime Completeness
- **Named capability**: target-follow centre-25% loop + timestamped GimbalState publish.
- **Production code that must exist**: real control loop; real monotonic timestamping.
- **Unacceptable substitutes**: open-loop "send target position once" is not acceptable — the loop must close.
@@ -0,0 +1,71 @@
# RTSP Session + Reconnect + AI-Lock Signal
**Task**: AZ-657_frame_ingest_rtsp_session
**Name**: RTSP session lifecycle + bounded reconnect + AI-lock plumb-through
**Description**: Open the RTSP session to the ViewPro A40, recover from transient connection loss with bounded exponential backoff (1 s → 30 s cap), and plumb through the `bringCameraDown`/`bringCameraUp` AI-lock signal so downstream consumers can skip detection.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure
**Component**: frame_ingest
**Tracker**: AZ-657
**Epic**: AZ-627
## Problem
The RTSP session is the foundation of the perception pipeline. It must (a) open against the camera at startup, (b) recover from drops with bounded backoff (no infinite retry), and (c) carry the `ai_locked` flag through to every emitted `Frame` so that downstream consumers (`detection_client`, `movement_detector`) know to skip detection while the local supervisor is asserting an RC-takeover lock.
## Outcome
- `RtspSession::open(config) -> Result<Self, OpenError>` opens with TCP or UDP transport per camera config.
- On stream loss the session reopens with exponential backoff `1 s → 2 s → 4 s ...` capped at 30 s.
- A subscription to `bringCameraDown` / `bringCameraUp` toggles `ai_locked` on every subsequently emitted frame.
- Health surface: `reopens_total`, `last_frame_age_ms`, `session_state ∈ {closed, connecting, streaming, failing}`, `ai_locked`.
- Camera output-format mismatch (unexpected SPS/PPS) hard-fails at session open with an explicit error; never silently picks a wrong decode path.
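
The capped exponential backoff (`1 s → 2 s → 4 s ...` up to 30 s) can be sketched as a small state holder. `Backoff`, `next_delay`, and `reset` are hypothetical names; the sketch covers only the delay schedule, not the reopen loop or any limit on the number of attempts.

```rust
use std::time::Duration;

pub struct Backoff {
    base: Duration,
    cap: Duration,
    attempt: u32,
}

impl Backoff {
    pub fn new(base: Duration, cap: Duration) -> Self {
        Self { base, cap, attempt: 0 }
    }

    /// Delay before the next reopen attempt: base * 2^attempt, capped.
    pub fn next_delay(&mut self) -> Duration {
        // Saturate instead of overflowing once the shift exceeds 31 bits.
        let factor = 1u32.checked_shl(self.attempt).unwrap_or(u32::MAX);
        let delay = self.base.checked_mul(factor).unwrap_or(self.cap).min(self.cap);
        self.attempt = self.attempt.saturating_add(1);
        delay
    }

    /// Call after a successful reopen so the next loss starts at `base` again.
    pub fn reset(&mut self) {
        self.attempt = 0;
    }
}
```

With a 1 s base and a 30 s cap the schedule is 1, 2, 4, 8, 16, 30, 30, ... seconds.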
## Scope
### Included
- RTSP client (FFmpeg / GStreamer binding or pure-Rust client — pick what `shared` pins).
- Backoff state machine.
- AI-lock signal source subscription (the supervisor channel is implementation-defined; the local supervisor signals over a unix-domain socket per `architecture.md`).
- Session state surface.
### Excluded
- Frame decoding (task 19).
- Multi-consumer publisher (task 20).
## Acceptance Criteria
**AC-1: Open against ViewPro A40 (fixture)**
Given a fixture RTSP server (e.g. `MediaMTX`) replaying a sample stream
When `RtspSession::open(...)` is called
Then it returns `Ok` within ≤2 s and `session_state = "streaming"`.
**AC-2: Reconnect on drop**
Given a healthy session for 5 s
When the fixture RTSP server is killed and restarted
Then the session reopens within ≤5 s and `reopens_total` increments by 1.
**AC-3: SPS/PPS mismatch hard-fails**
Given a fixture stream that announces an unsupported codec profile
When `RtspSession::open(...)` is called
Then it returns `Err(UnsupportedProfile { details })`; no silent decode-path selection.
**AC-4: AI-lock toggles ai_locked flag**
Given a healthy session emitting frames
When `bringCameraDown` is asserted
Then subsequent emitted frames have `ai_locked = true`; when `bringCameraUp` is asserted, they revert to `false`.
## Non-Functional Requirements
**Performance**
- Reconnect latency: ≤5 s from camera availability (per `description.md §8`).
**Reliability**
- Bounded backoff cap configurable; no infinite retry.
## Runtime Completeness
- **Named capability**: RTSP transport against ViewPro A40 + AI-lock signal plumb.
- **Production code that must exist**: real RTSP session; real AI-lock subscription.
- **Allowed external stubs**: `MediaMTX` or `live555-test` as fixture in dev/CI.
- **Unacceptable substitutes**: bypassing AI-lock entirely is unacceptable — it is a safety boundary.
@@ -0,0 +1,72 @@
# Frame Decoder (NVDEC + Software Fallback)
**Task**: AZ-658_frame_ingest_decoder
**Name**: H.264/265 decoder (NVDEC primary, software fallback) + monotonic timestamps
**Description**: Decode H.264/265 to raw frames using NVDEC on Jetson Orin Nano, with software fallback. Stamp each frame with a monotonic capture timestamp + sequence number at the earliest practical point in the pipeline.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-657_frame_ingest_rtsp_session
**Component**: frame_ingest
**Tracker**: AZ-658
**Epic**: AZ-627
## Problem
Every frame downstream needs a monotonic capture timestamp so `movement_detector` can detect telemetry skew. Decoding must use the hardware decoder (NVDEC on Jetson) where present and fall back to software otherwise, without changing the emitted `Frame` shape. A frame that fails to decode must be dropped and counted; a single bad frame must not abort the stream. Cold-start latency is surfaced once per session but is not an alert by itself.
## Outcome
- `FrameDecoder::decode(packet) -> Result<Frame, DecodeError>` emits a `Frame { seq, capture_ts_monotonic, decode_ts_monotonic, pixels: Arc<Bytes>, width, height, pix_fmt, ai_locked }`.
- NVDEC code path is used when available; software fallback otherwise (selection is automatic and observable in health).
- Single-frame errors are dropped and counted as `decode_errors_total`; the stream is never aborted on a single frame.
- Cold-start latency (first-frame decode time) is surfaced as `decode_ms_first_frame` once per session open.
- Health surface: `decode_ms_p50`, `decode_ms_p99`, `decoder_backend ∈ {NVDEC, Software}`, `decode_errors_total`.
## Scope
### Included
- NVDEC binding (via Jetson Multimedia API or GStreamer `nvv4l2decoder`).
- Software decoder fallback (FFmpeg `libavcodec`).
- Monotonic timestamping at the earliest point in the decode pipeline.
- Sequence-number generation (monotonic u64 per session).
- Single-frame error handling.
### Excluded
- RTSP session lifecycle (task 18).
- Multi-consumer publisher (task 20).
## Acceptance Criteria
**AC-1: Software-path decode of a sample stream**
Given a sample H.264 RTSP stream at 1080p / 30 fps and a host without NVDEC
When the decoder runs for 10 s
Then ≥285 frames are emitted; `decoder_backend = "Software"`; sequence numbers are strictly monotonic.
**AC-2: NVDEC-path selection on Jetson**
Given the host has NVDEC available
When the decoder is initialized
Then `decoder_backend = "NVDEC"`; functional correctness is identical to software path.
**AC-3: Single-frame decode error does not abort the stream**
Given the input contains one corrupted frame
When the decoder runs
Then that single frame is dropped, `decode_errors_total` increments by 1, and subsequent frames continue to be emitted.
**AC-4: Monotonic timestamps**
Given a sequence of decoded frames
When their `capture_ts_monotonic` is read
Then values are strictly monotonically increasing.
## Non-Functional Requirements
**Performance**
- End-to-end RTSP-rx → publish ≤30 ms p99 on Jetson Orin Nano (per `description.md §8`); decoder portion of that budget ≤20 ms p99.
**Reliability**
- Single-frame errors do not abort the stream.
- Cold-start latency surfaced once; not an alert.
## Runtime Completeness
- **Named capability**: H.264/265 decode (NVDEC primary, software fallback) — production decode path required.
- **Production code that must exist**: real NVDEC binding; real software fallback; real monotonic timestamping.
- **Unacceptable substitutes**: shipping without the NVDEC code path is unacceptable (the latency target cannot be met without it); software decode on Jetson is acceptable only as a fallback.
@@ -0,0 +1,64 @@
# Multi-Consumer Frame Publisher + Back-Pressure Drops
**Task**: AZ-659_frame_ingest_publisher
**Name**: Tokio broadcast publisher + per-consumer drop counters + zero-copy `Arc<Bytes>`
**Description**: Publish `Frame`s through a single multi-consumer channel using `Arc<Bytes>` for pixel data so consumers do not copy. Drop frames when downstream consumers fall behind beyond a configured queue depth; record per-consumer drop counters with reason tags.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-657_frame_ingest_rtsp_session, AZ-658_frame_ingest_decoder
**Component**: frame_ingest
**Tracker**: AZ-659
**Epic**: AZ-627
## Problem
Three downstream consumers (`detection_client`, `movement_detector`, `telemetry_stream`) all need the same frames at the same rate. A single-consumer queue would serialise the slowest; a per-consumer fan-out with cloned pixel buffers would multiply memory. The right structure is a Tokio `broadcast` channel (or equivalent) carrying `Arc<Bytes>` so pixels are shared by reference. Slow consumers drop their oldest frame, with the drop counted (and reason-tagged) — never silently coalesced.
## Outcome
- `FramePublisher::subscribe() -> FrameReceiver` returns a per-consumer receiver.
- `Frame` carries `Arc<Bytes>` for `pixels` so consumers do not copy.
- When a consumer falls behind beyond `channel_depth` (configurable, default 4), the oldest frame is dropped for THAT consumer; per-consumer counters increment with reason tag (`{detection_client_slow, movement_detector_slow, telemetry_slow}`).
- Health surface: per-consumer drop counters, total publish count.
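The drop-oldest behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the `tokio::sync::broadcast` implementation the task calls for: `ConsumerQueue` and `publish` are hypothetical names, and `Arc<[u8]>` stands in for `Arc<Bytes>`. It shows the two properties the outcome list asks for: only the `Arc` handle is cloned per consumer, and a lagging consumer loses its own oldest frame with the drop counted.

```rust
use std::collections::VecDeque;
use std::sync::Arc;

/// Stand-in for the spec's `Frame`; `Arc<[u8]>` plays the role of `Arc<Bytes>`.
#[derive(Clone, Debug)]
pub struct Frame {
    pub seq: u64,
    pub pixels: Arc<[u8]>,
}

/// One bounded queue per consumer (hypothetical type, for illustration only).
pub struct ConsumerQueue {
    pub reason_tag: &'static str,
    depth: usize,
    frames: VecDeque<Frame>,
    pub drops_total: u64,
}

impl ConsumerQueue {
    pub fn new(reason_tag: &'static str, depth: usize) -> Self {
        assert!(depth > 0, "channel_depth must be at least 1");
        Self { reason_tag, depth, frames: VecDeque::with_capacity(depth), drops_total: 0 }
    }

    /// Enqueue a frame. If this consumer is already `depth` frames behind,
    /// its oldest frame is dropped and the drop is counted.
    pub fn push(&mut self, frame: Frame) {
        if self.frames.len() == self.depth {
            self.frames.pop_front();
            self.drops_total += 1;
        }
        self.frames.push_back(frame);
    }

    /// Consumer side: take the oldest queued frame, if any.
    pub fn pop(&mut self) -> Option<Frame> {
        self.frames.pop_front()
    }
}

/// Fan one frame out to every consumer. `Frame::clone` copies only the
/// sequence number and the `Arc` handle, never the pixel buffer.
pub fn publish(queues: &mut [ConsumerQueue], frame: &Frame) {
    for queue in queues.iter_mut() {
        queue.push(frame.clone());
    }
}
```

With a real `tokio::sync::broadcast` channel the receiver learns about missed messages through `RecvError::Lagged(n)`, so the per-consumer counter would be incremented on the receiving side rather than in the publisher.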
## Scope
### Included
- `tokio::sync::broadcast` (or equivalent) with `Arc<Bytes>` payload.
- Per-consumer drop counter (statically known three consumer ids; future-extensible).
- Channel-depth config.
### Excluded
- RTSP session (task 18).
- Decoder (task 19).
## Acceptance Criteria
**AC-1: Three consumers receive every frame at nominal rate**
Given three subscribers consuming at 30 fps and source at 30 fps
When the publisher runs for 10 s
Then each consumer observes ~300 frames; per-consumer drop counters = 0.
**AC-2: Slow consumer drops, fast consumers unaffected**
Given a slow consumer that yields every 200 ms while source is 30 fps and `channel_depth = 4`
When the publisher runs for 5 s
Then the slow consumer's drop counter increments and fast consumers continue to receive every frame.
**AC-3: Zero-copy under load**
Given a publisher emitting at 30 fps for 60 s with three subscribers
When peak memory is sampled
Then memory does not scale linearly with consumer count (i.e. `Arc<Bytes>` is correctly shared).
## Non-Functional Requirements
**Performance**
- Publish-to-consumer p99 ≤5 ms (helps keep total RTSP-rx-to-publish under the 30 ms p99 budget).
**Reliability**
- Drops are counted with reason; never silent.
- No unbounded memory growth on slow consumer.
## Runtime Completeness
- **Named capability**: lossy multi-consumer frame fan-out with `Arc<Bytes>`.
- **Production code that must exist**: real broadcast channel; real per-consumer drop accounting.
- **Unacceptable substitutes**: cloning pixel buffers per consumer is unacceptable (multiplies memory); blocking the publisher on a slow consumer is unacceptable (gates the whole pipeline).
@@ -0,0 +1,77 @@
# Detection gRPC Bi-Directional Stream + Frame Budgeting
**Task**: AZ-660_detection_client_grpc_stream
**Name**: Bi-directional gRPC stream to ../detections + drop-oldest frame budgeting
**Description**: Single bi-directional gRPC stream to the external `../detections` service. Reconnect on stream loss with bounded exponential backoff. Frame budgeting: drop older in-flight frames if a new frame arrives before the previous response, respecting the Tier-1 ≤100 ms/frame target.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-659_frame_ingest_publisher
**Component**: detection_client
**Tracker**: AZ-660
**Epic**: AZ-628
## Problem
`detection_client` is the only autopilot component talking to `../detections`. The contract is a bi-directional gRPC stream; the client must maintain it (reconnect with bounded backoff), respect the Tier-1 latency target by NOT queueing frames indefinitely (drop-oldest in-flight when a newer frame arrives), and never block the upstream `frame_ingest` publisher.
## Outcome
- `DetectionClient::run(frame_rx)` maintains one bi-directional gRPC stream to `../detections`; reconnect on stream loss with exponential backoff capped at 30 s.
- Outbound: send each `Frame` (skipping `ai_locked` ones) up to `max_concurrent_in_flight` (default 2); drop older in-flight frames when the budget is full and a new frame arrives (logged as `budget_drop`).
- Inbound: receive `DetectionBatch` and publish on the output channel; tag with the source frame's `capture_ts_monotonic`.
- Health surface: `gRPC_connection_state`, `requests_in_flight`, `latency_p50/p99`, `errors_by_kind`, `budget_drops_total`.
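The drop-oldest budgeting in the outbound bullet can be sketched as a small in-flight tracker. This is an illustrative sketch under assumptions: `InFlightBudget`, `admit`, and `complete` are hypothetical names, and the tracker only records `frame_seq` values; the actual gRPC send and cancel are out of the picture.

```rust
use std::collections::VecDeque;

/// Tracks `frame_seq` values that are awaiting a response (hypothetical helper).
pub struct InFlightBudget {
    max_in_flight: usize,
    in_flight: VecDeque<u64>,
    pub budget_drops_total: u64,
}

impl InFlightBudget {
    pub fn new(max_in_flight: usize) -> Self {
        assert!(max_in_flight > 0, "max_concurrent_in_flight must be at least 1");
        Self { max_in_flight, in_flight: VecDeque::new(), budget_drops_total: 0 }
    }

    /// Admit a new frame. When the budget is full, the oldest in-flight frame
    /// is evicted (a `budget_drop`) and its sequence number is returned.
    pub fn admit(&mut self, frame_seq: u64) -> Option<u64> {
        let evicted = if self.in_flight.len() >= self.max_in_flight {
            self.budget_drops_total += 1;
            self.in_flight.pop_front()
        } else {
            None
        };
        self.in_flight.push_back(frame_seq);
        evicted
    }

    /// Record a response. Returns `false` for a late response whose frame
    /// was already evicted from the budget.
    pub fn complete(&mut self, frame_seq: u64) -> bool {
        match self.in_flight.iter().position(|&seq| seq == frame_seq) {
            Some(index) => {
                self.in_flight.remove(index);
                true
            }
            None => false,
        }
    }

    pub fn requests_in_flight(&self) -> usize {
        self.in_flight.len()
    }
}
```

Because `admit` never waits, the upstream publisher is not blocked when the server is slow; the cost shows up only in `budget_drops_total`.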
## Scope
### Included
- `tonic` (or equivalent) gRPC client + bi-directional streaming.
- Reconnect state machine.
- In-flight tracker (sliding window of `frame_seq`).
- Drop-oldest budgeting.
### Excluded
- Schema validation + model_version handling (task 22).
- The `../detections` service itself (separate repo).
## Acceptance Criteria
**AC-1: Happy path against fixture**
Given a fixture gRPC server that returns a `DetectionBatch` per request within 50 ms
When `DetectionClient::run` is started against a 30 fps frame source for 10 s
Then ≥285 `DetectionBatch` messages are observed on the output channel; `latency_p99` ≤100 ms; `budget_drops_total` = 0.
**AC-2: Reconnect after server restart**
Given a healthy stream
When the gRPC server is killed and restarted
Then the client reconnects within 2 s; subsequent frames flow through.
**AC-3: Budget drop on slow server**
Given the server takes 200 ms per response and the source is 30 fps
When the client runs for 5 s
Then `budget_drops_total > 0`, frames continue to flow, and the publisher is never blocked.
**AC-4: ai_locked frames are skipped**
Given a frame stream where every 5th frame has `ai_locked = true`
When the client runs
Then no requests are sent for `ai_locked` frames (observable via outgoing count).
## Non-Functional Requirements
**Performance**
- Per-frame round-trip ≤100 ms p99 (Tier-1 NFR; mostly owned by `../detections`).
- Reconnect latency: ≤2 s after `../detections` returns.
**Reliability**
- Drop-oldest never queues indefinitely.
- Reconnect is bounded.
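A bounded reconnect delay can be computed as below. This is a sketch only: the function name `backoff_delay` and the 250 ms base used in the usage note are assumptions, the 30 s cap comes from the Outcome section, and jitter is left out for brevity.

```rust
use std::time::Duration;

/// Delay before reconnect attempt number `attempt` (0-based):
/// `base * 2^attempt`, never larger than `cap`.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    // `checked_shl` returns None once the shift would exceed the width of u32,
    // so very large attempt counts saturate instead of overflowing.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.saturating_mul(factor).min(cap)
}
```

For example, with a 250 ms base the delays run 250 ms, 500 ms, 1 s, 2 s and so on, and stay at 30 s from the eighth attempt onwards.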
## Contract
- gRPC service contract owner: `../_docs/03_detections.md`.
- Canonical typed model: `data_model.md §Detection`, `§DetectionBatch`.
## Runtime Completeness
- **Named capability**: bi-directional gRPC stream against `../detections`.
- **Production code that must exist**: real `tonic` (or equivalent) bi-directional stream; real budgeting.
- **Allowed external stubs**: a fixture gRPC server in tests; the real `../detections` for integration.
- **Unacceptable substitutes**: a unary call-per-frame instead of streaming is unacceptable (multiplies per-request overhead).
@@ -0,0 +1,62 @@
# Detection Schema Validation + Model-Version + Health
**Task**: AZ-661_detection_client_schema_and_health
**Name**: Response schema validation + model_version tracking + Tier-1 health degradation signal
**Description**: Validate every `DetectionBatch` response against the schema version the client was built against. Surface a hard error on schema mismatch (never silent downcast). Track `model_version`; on change, surface to `scan_controller` so per-class thresholds can be reloaded. Track sliding-window latency; on `latency_p99 > 100 ms` flip health → yellow so `scan_controller` can degrade to alternate-frame inference.
**Complexity**: 2 points
**Dependencies**: AZ-640_initial_structure, AZ-660_detection_client_grpc_stream
**Component**: detection_client
**Tracker**: AZ-661
**Epic**: AZ-628
## Problem
Schema drift between `../detections` and autopilot must be caught loudly — not silently downcast. The model version can change at runtime (model swap); when it does, the per-class confidence thresholds may need to be reloaded by `scan_controller`. The Tier-1 latency target (≤100 ms) is mostly owned by `../detections` but autopilot must observe drift and surface health degradation so the scan controller can take action.
## Outcome
- Every response is validated against the bundled schema; on mismatch, returns a hard error to the output channel and health → red.
- `last_model_version` is tracked; on change, a `ModelVersionChanged(new_version)` event is emitted on the output channel.
- A sliding-window latency tracker (e.g. last 1 min) emits a `Tier1Degraded { reason: HighLatency }` event when `latency_p99 > 100 ms`.
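The sliding-window latency signal can be sketched as follows. `LatencyWindow` and `observe` are hypothetical names; timestamps are plain milliseconds so the example stays deterministic; the p99 uses the nearest-rank method and re-sorts on every sample, which is simple but not tuned for the ≤1 ms overhead target. `observe` returns `true` only on the transition into the degraded state, which matches "emitted exactly once until latency falls back below 100 ms".

```rust
use std::collections::VecDeque;

/// Sliding-window p99 tracker with an edge-triggered degradation signal.
pub struct LatencyWindow {
    window_ms: u64,
    threshold_ms: u64,
    samples: VecDeque<(u64, u64)>, // (arrival time in ms, latency in ms)
    degraded: bool,
}

impl LatencyWindow {
    pub fn new(window_ms: u64, threshold_ms: u64) -> Self {
        Self { window_ms, threshold_ms, samples: VecDeque::new(), degraded: false }
    }

    /// Nearest-rank p99 over the samples currently in the window.
    pub fn p99(&self) -> Option<u64> {
        if self.samples.is_empty() {
            return None;
        }
        let mut latencies: Vec<u64> = self.samples.iter().map(|&(_, l)| l).collect();
        latencies.sort_unstable();
        // rank = ceil(0.99 * n), 1-based
        let rank = (latencies.len() * 99 + 99) / 100;
        Some(latencies[rank - 1])
    }

    /// Record one response latency. Returns `true` exactly when a
    /// `Tier1Degraded` event should be emitted (healthy -> degraded edge).
    pub fn observe(&mut self, now_ms: u64, latency_ms: u64) -> bool {
        self.samples.push_back((now_ms, latency_ms));
        // Evict samples that have aged out of the window.
        while let Some(&(ts, _)) = self.samples.front() {
            if now_ms.saturating_sub(ts) > self.window_ms {
                self.samples.pop_front();
            } else {
                break;
            }
        }
        let high = self.p99().map_or(false, |p| p > self.threshold_ms);
        let fire = high && !self.degraded;
        self.degraded = high;
        fire
    }
}
```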
## Scope
### Included
- Schema validation hook on every response.
- `model_version` tracker.
- Sliding-window latency tracker + degradation signal.
### Excluded
- The reaction to `Tier1Degraded` (lives in `scan_controller`).
- The schema definition itself (lives in the contract).
## Acceptance Criteria
**AC-1: Schema mismatch surfaces as hard error**
Given the fixture server returns a `DetectionBatch` with an unknown field type
When the client validates the response
Then a hard error is emitted on the output channel and `errors_by_kind{kind="schema_mismatch"}` increments by 1.
**AC-2: Model version change is signalled**
Given the server reports `model_version = "v1.2"` on initial stream open
When a subsequent response reports `model_version = "v1.3"`
Then exactly one `ModelVersionChanged("v1.3")` event is emitted.
**AC-3: Latency degradation signal**
Given the server's response latency rises to 150 ms p99 over a 1-min window
When the latency tracker evaluates
Then `Tier1Degraded { reason: HighLatency }` is emitted exactly once until latency falls back below 100 ms.
## Non-Functional Requirements
**Performance**
- Validation overhead: ≤1 ms per response.
**Reliability**
- Schema mismatches never silent.
## Runtime Completeness
- **Named capability**: response schema validation + model-version awareness + latency-degradation signal.
- **Production code that must exist**: real schema validation; real model-version tracker; real percentile tracker.
- **Unacceptable substitutes**: silently downcasting an unknown response shape is unacceptable.
@@ -0,0 +1,64 @@
# Ego-Motion Estimator + Telemetry Sync Gate
**Task**: AZ-662_movement_detector_ego_motion
**Name**: OpenCV optical-flow / global-motion estimator + telemetry-skew gate
**Description**: Compute per-frame ego-motion using OpenCV (Lucas-Kanade optical flow or feature-based homography), refined by the synchronised gimbal + UAV telemetry. Drop frames whose telemetry skew exceeds the per-zoom-band tolerance; never silent.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-659_frame_ingest_publisher, AZ-656_gimbal_centre_on_target, AZ-649_mission_executor_telemetry_forwarding
**Component**: movement_detector
**Tracker**: AZ-662
**Epic**: AZ-629
## Problem
Naive frame differencing is rejected — the UAV and gimbal are moving, so most pixel motion is ego-motion. The estimator must (a) recover camera motion from the frame stream and (b) cross-check against telemetry (gimbal + UAV) within a per-zoom-band skew tolerance. Frames whose telemetry skew exceeds the tolerance MUST be dropped (with a counter), never silently consumed — otherwise the compensation is wrong and false positives flood the operator.
## Outcome
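- `EgoMotionEstimator::estimate(frame, gimbal_state, uav_telemetry) -> Result<EgoMotion, SkewExceeded>` returns the per-frame ego-motion vector (or homography) refined by telemetry, OR rejects the frame as skewed. AC-3 additionally expects `Err(OpticalFlowDegenerate)`, so the error type in the final signature must cover both rejection reasons.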
- Per-zoom-band tolerance from config (defaults per `description.md §5`): zoom-out 50 ms frame↔gimbal / 100 ms frame↔UAV; zoom-in 25 ms / 50 ms.
- Health surface: `telemetry_skew_drops_total`, `optical_flow_degenerate_total`, `current_zoom_band`.
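The telemetry-skew gate reduces to a per-band comparison of timestamps. The sketch below hard-codes the defaults listed above; `ZoomBand`, `SkewTolerance`, `default_tolerance`, and `check_skew` are hypothetical names, timestamps are assumed to be milliseconds on one shared monotonic clock, and the `SkewExceeded` fields are an illustration rather than the canonical type.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ZoomBand {
    ZoomedOut,
    ZoomedIn,
}

/// Maximum allowed |frame_ts - telemetry_ts| per telemetry source.
pub struct SkewTolerance {
    pub gimbal_ms: u64,
    pub uav_ms: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub struct SkewExceeded {
    pub band: ZoomBand,
    pub gimbal_skew_ms: u64,
    pub uav_skew_ms: u64,
}

/// Defaults quoted in the Outcome section (configurable in the real component).
pub fn default_tolerance(band: ZoomBand) -> SkewTolerance {
    match band {
        ZoomBand::ZoomedOut => SkewTolerance { gimbal_ms: 50, uav_ms: 100 },
        ZoomBand::ZoomedIn => SkewTolerance { gimbal_ms: 25, uav_ms: 50 },
    }
}

/// Reject the frame when either telemetry source is further from the frame
/// timestamp than the band's tolerance allows.
pub fn check_skew(
    band: ZoomBand,
    frame_ts_ms: u64,
    gimbal_ts_ms: u64,
    uav_ts_ms: u64,
) -> Result<(), SkewExceeded> {
    let tolerance = default_tolerance(band);
    let gimbal_skew_ms = frame_ts_ms.abs_diff(gimbal_ts_ms);
    let uav_skew_ms = frame_ts_ms.abs_diff(uav_ts_ms);
    if gimbal_skew_ms > tolerance.gimbal_ms || uav_skew_ms > tolerance.uav_ms {
        Err(SkewExceeded { band, gimbal_skew_ms, uav_skew_ms })
    } else {
        Ok(())
    }
}
```

The caller would increment `telemetry_skew_drops_total` with the band label on every `Err`, so the drop is never silent.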
## Scope
### Included
- OpenCV bindings (Rust crate `opencv`).
- Optical-flow primary path (dense Lucas-Kanade or feature-based homography; in the Rust `opencv` crate these are the `opencv::video::calc_optical_flow_*` functions or `opencv::calib3d::find_homography`).
- Telemetry-skew gate per zoom band.
- Compensation output (the residual-pixel-motion field; downstream task 24 clusters it).
### Excluded
- Cluster persistence + candidate emission (task 24).
- Q14 fallback (task 25).
## Acceptance Criteria
**AC-1: Synthetic pure-pan: residual ≈ 0**
Given a synthetic frame pair where the camera panned by `dx` and the entire scene is static
When `estimate(frame, gimbal_state, uav_telemetry)` runs
Then the returned ego-motion captures `dx` and the residual motion field is ≈ 0 within epsilon.
**AC-2: Telemetry skew above zoom-out tolerance is dropped**
Given a frame whose gimbal-telemetry timestamp differs by 200 ms while `zoom_band = zoomed_out` (tolerance 50 ms)
When `estimate(...)` is called
Then it returns `Err(SkewExceeded)` and `telemetry_skew_drops_total{band="zoomed_out"}` increments by 1.
**AC-3: Optical-flow degenerate is observable**
Given a fully-saturated white frame
When `estimate(...)` runs
Then it returns `Err(OpticalFlowDegenerate)` and `optical_flow_degenerate_total` increments by 1.
## Non-Functional Requirements
**Performance**
- Per-frame ego-motion estimation: ≤30 ms p99 on Jetson Orin Nano (must coexist with Tier 1 + Tier 2 — per `description.md §9`).
**Reliability**
- Drops never silent.
## Runtime Completeness
- **Named capability**: ego-motion estimation using real OpenCV; telemetry-skew gating.
- **Production code that must exist**: real OpenCV optical-flow / homography path; real synchronisation logic.
- **Allowed external stubs**: synthetic frame pairs in tests; pinned `opencv` Rust crate in CI.
- **Unacceptable substitutes**: a fake/stub estimator that always returns "no motion" is unacceptable in production (would mask real movement candidates).
@@ -0,0 +1,79 @@
# Cluster Persistence + Candidate Emission
**Task**: AZ-663_movement_detector_clustering_and_emission
**Name**: Residual-motion clustering + per-zoom-band persistence + candidate emission with source_zoom_band
**Description**: Subtract estimated ego-motion from per-pixel motion; cluster residuals; emit clusters meeting per-zoom-band minimum size + persistence threshold as `MovementCandidate`s. Self-disable in `TargetFollow` (consume frames to keep history warm; emit nothing).
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-662_movement_detector_ego_motion
**Component**: movement_detector
**Tracker**: AZ-663
**Epic**: AZ-629
## Problem
Once ego-motion is compensated, the remaining residual pixel motion is the candidate signal. Residuals must be clustered (connected components or DBSCAN-like spatial cluster) and tracked across frames; only clusters that persist for the per-zoom-band threshold count as candidates. Single-frame noise blips MUST NOT surface to the operator.
The candidate emission also carries `source_zoom_band` (`zoomed_out | zoomed_in`) so `scan_controller` can apply zoom-band-aware queueing logic.
The component must self-disable when `scan_controller` is in `TargetFollow` — emit zero candidates but keep consuming frames so the motion-history buffer stays warm for the next state transition.
## Outcome
- `MovementClusterer::ingest(frame, residual_motion)` updates per-cluster persistence counters per zoom band.
- A cluster meeting `min_size_px` + `min_persistence_frames` emits a `MovementCandidate { frame_seq, bbox_normalized, residual_velocity_estimate, telemetry_quality, source_frame_ts, source_zoom_band }`.
- Per-zoom-band knobs (defaults per `description.md §5`):
- zoom-out: persistence 3-5 frames; residual-velocity floor low.
- zoom-in: persistence 6-10 frames; residual-velocity floor higher.
- Active-state hint `disable` (during `TargetFollow`) suppresses emission but keeps history.
- Health surface: `candidates_per_min_zoomed_out`, `candidates_per_min_zoomed_in`, `current_zoom_band`, `compensation_quality_per_band`.
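The persistence gate can be sketched as a per-cluster streak counter. Assumptions are labelled here and in the comments: cluster identity across frames (the association step) is taken as given and represented by a `u32` id, `PersistenceTracker` is a hypothetical name, and one tracker instance would exist per zoom band. A cluster is emitted once, when its streak first reaches `min_persistence_frames` while emission is enabled; with emission disabled the streaks keep counting, which is what keeps the history warm.

```rust
use std::collections::HashMap;

struct Streak {
    frames: u32,   // consecutive frames in which the cluster was seen
    emitted: bool, // whether a candidate was already emitted for it
}

/// Per-zoom-band persistence gate (hypothetical helper).
pub struct PersistenceTracker {
    min_persistence_frames: u32,
    streaks: HashMap<u32, Streak>,
}

impl PersistenceTracker {
    pub fn new(min_persistence_frames: u32) -> Self {
        Self { min_persistence_frames, streaks: HashMap::new() }
    }

    /// Feed the ids of clusters present in the current frame (assumed unique).
    /// Returns the ids that become candidates on this frame.
    pub fn ingest(&mut self, seen: &[u32], emission_enabled: bool) -> Vec<u32> {
        // A cluster missing from this frame loses its streak.
        self.streaks.retain(|id, _| seen.contains(id));

        let mut candidates = Vec::new();
        for &id in seen {
            let streak = self
                .streaks
                .entry(id)
                .or_insert(Streak { frames: 0, emitted: false });
            streak.frames += 1;
            if emission_enabled
                && !streak.emitted
                && streak.frames >= self.min_persistence_frames
            {
                streak.emitted = true;
                candidates.push(id);
            }
        }
        candidates
    }
}
```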
## Scope
### Included
- Connected-component (or spatial cluster) extraction over the residual motion field.
- Per-cluster persistence tracker, per zoom band.
- Per-band motion-history buffer (a few seconds of frames + residuals; one per zoom band).
- Candidate emission with full metadata.
- Active-state hint handling.
### Excluded
- Ego-motion estimation (task 23).
- Q14 fallback (task 25).
- POI queue ordering (`scan_controller`).
## Acceptance Criteria
**AC-1: Single-frame blip is suppressed**
Given a single isolated 5×5 px residual motion blip in one frame at zoom-out
When the clusterer runs over 30 frames
Then no `MovementCandidate` is emitted (below `min_persistence_frames = 3`).
**AC-2: Persistent moving target emits a candidate**
Given a 20×20 px residual cluster persisting across 5 consecutive frames at zoom-out
When the clusterer runs
Then exactly one `MovementCandidate` is emitted with `source_zoom_band = "zoomed_out"` and `bbox_normalized` localised around the cluster centre.
**AC-3: Zoom-in stricter threshold**
Given the same persistent cluster but at zoom-in with `min_persistence_frames = 8`
When the clusterer runs for only 5 frames
Then no candidate is emitted; if the cluster keeps persisting, exactly one candidate is emitted once it reaches 8 consecutive frames.
**AC-4: TargetFollow suppresses emission, keeps history warm**
Given the active-state hint is `disable`
When 30 frames with persistent clusters arrive
Then zero candidates are emitted; `compensation_quality_per_band` is still updated; when `disable` is lifted, the next persistent cluster is emitted on the SAME zoom band's threshold (history is warm).
## Non-Functional Requirements
**Performance**
- Per-frame clustering + emission: ≤20 ms p99.
- Candidate enqueue latency: zoom-out ≤1 s, zoom-in ≤1.5 s (per `description.md §9`).
**Reliability**
- Single-frame blips never surface as candidates.
## Runtime Completeness
- **Named capability**: persistent-cluster candidate detection with per-zoom-band tuning.
- **Production code that must exist**: real residual clustering; real per-band persistence tracker.
- **Unacceptable substitutes**: emitting every residual blip without persistence gating is unacceptable (operator would be flooded).
@@ -0,0 +1,70 @@
# FP Cap Monitor + Q14 Fallback Hook
**Task**: AZ-664_movement_detector_fp_cap_and_q14_fallback
**Name**: FP cap monitor + Q14 fallback module hook
**Description**: Monitor per-zoom-band candidate flood. If sustained candidates_per_min exceeds the configured cap, suppress that band's emission (zoom-in only at first; zoom-out down-ranks lowest-confidence). Q14 fallback engages a learned-CV module behind a build-time feature flag — wire the `EgoMotionProvider` trait + a stub fallback impl that returns `not_engaged`; the real ML module is a follow-up if the benchmark gate triggers Q14.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-662_movement_detector_ego_motion, AZ-663_movement_detector_clustering_and_emission
**Component**: movement_detector
**Tracker**: AZ-664
**Epic**: AZ-629
## Problem
If classical OpenCV optical flow fails to meet the per-zoom-band FP cap at zoom-in (Q14 trigger), the system must degrade safely: suppress zoom-in emission, keep zoom-out running, and engage the (optional) learned-CV fallback module if compiled in. The fallback's interface contract is fixed (`Frame + telemetry → Vec<MovementCandidate>`); the impl is a separate engineering effort gated on benchmark-gate results.
This task delivers the trait, the FP-cap monitor, the suppression behaviour, and a stub fallback impl. The real learned-CV impl is out of scope here (separate Q14 follow-up if and when the benchmark gate fires).
## Outcome
- `FpCapMonitor::tick(per_band_rate)` flags when `candidates_per_min_zoomed_in > cap`; suppresses zoom-in emission for the duration of the breach + a configurable hysteresis.
- Zoom-out FP-cap breach down-ranks lowest-confidence candidates rather than suppressing entirely (zoom-out is the only source for far-field threats).
- `EgoMotionProvider` trait with the fixed contract; default `OpenCvEgoMotion` impl wraps task 23; `LearnedCvFallback` stub returns `not_engaged`.
- Build-time feature flag `learned_cv_fallback` reserves the slot; if off, the build is otherwise identical and `OpenCvEgoMotion` is the only provider.
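The suppression-with-hysteresis behaviour of the monitor can be sketched as a small state machine. This is an illustration under assumptions: the struct fields and the `tick(now_ms, rate_per_min)` signature are hypothetical (the spec only names `FpCapMonitor::tick(per_band_rate)`), hysteresis is modelled as a time the rate must stay at or below the cap, and only the zoom-in suppression path is shown.

```rust
/// Zoom-in FP-cap monitor with time-based hysteresis (illustrative sketch).
pub struct FpCapMonitor {
    cap_per_min: f64,
    hysteresis_ms: u64,
    suppressed: bool,
    below_cap_since_ms: Option<u64>,
}

impl FpCapMonitor {
    pub fn new(cap_per_min: f64, hysteresis_ms: u64) -> Self {
        Self { cap_per_min, hysteresis_ms, suppressed: false, below_cap_since_ms: None }
    }

    /// Evaluate the current candidate rate. Returns `true` while emission
    /// for the band should be suppressed.
    pub fn tick(&mut self, now_ms: u64, rate_per_min: f64) -> bool {
        if rate_per_min > self.cap_per_min {
            // Breach: suppress and restart the hysteresis timer.
            self.suppressed = true;
            self.below_cap_since_ms = None;
        } else if self.suppressed {
            // Back under the cap: resume only after the rate has stayed
            // there for the full hysteresis period.
            let since = *self.below_cap_since_ms.get_or_insert(now_ms);
            if now_ms.saturating_sub(since) >= self.hysteresis_ms {
                self.suppressed = false;
                self.below_cap_since_ms = None;
            }
        }
        self.suppressed
    }
}
```

A short dip below the cap followed by another breach restarts the timer, which is what prevents the monitor from toggling on a noisy rate.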
## Scope
### Included
- `EgoMotionProvider` trait (re-exported from `shared::contracts`).
- `FpCapMonitor` with sliding-window per-band rate + hysteresis.
- Zoom-in suppression behaviour.
- Zoom-out down-rank behaviour.
- Stub `LearnedCvFallback` returning `not_engaged`.
### Excluded
- Real learned-CV implementation (Q14 follow-up, gated on benchmark results).
- Benchmark-gate orchestration (out of scope; manual decision based on benchmark data).
## Acceptance Criteria
**AC-1: Zoom-in suppression on flood**
Given `candidates_per_min_zoomed_in = 20` over 60 s while cap is 10
When the FP-cap monitor evaluates
Then zoom-in emission is suppressed and `health → yellow`; once the rate has fallen back below the cap and the hysteresis period has elapsed, emission resumes.
**AC-2: Zoom-out down-ranks instead of suppressing**
Given a similar zoom-out flood
When the monitor evaluates
Then no emission is suppressed; instead, the lowest-confidence candidates are down-ranked (counted as `down_ranked_total`).
**AC-3: Feature-flag absence does not break build**
Given the binary is built WITHOUT the `learned_cv_fallback` feature
When the build runs
Then the binary builds cleanly and `EgoMotionProvider` is satisfied by `OpenCvEgoMotion` exclusively.
**AC-4: Stub fallback returns not_engaged**
Given the `learned_cv_fallback` feature IS enabled and the stub is registered
When `LearnedCvFallback::estimate(...)` is called
Then it returns `Status::NotEngaged` immediately; no real ML is run.
## Non-Functional Requirements
**Reliability**
- FP-cap monitor never spuriously toggles (hysteresis required).
## Runtime Completeness
- **Named capability**: FP-cap monitor + Q14 fallback trait wiring. Real learned-CV impl is explicitly out of scope here.
- **Production code that must exist**: real FP-cap monitor + real suppression logic + real trait.
- **Allowed external stubs**: `LearnedCvFallback` is a stub by design until benchmark-gate triggers Q14.
- **Unacceptable substitutes**: silently dropping zoom-in candidates without an observable signal is unacceptable.
@@ -0,0 +1,81 @@
# H3 Indexing + Classify
**Task**: AZ-665_mapobjects_store_h3_classify
**Name**: H3 indexing + k-ring classify(detection) → new/moved/existing
**Description**: Compute the H3 cell for each detection at the configured resolution (default 10; average hexagon edge roughly 65-75 m, average cell area about 15 000 m²). Maintain an in-memory `(H3_cell + class) → MapObject` hashmap. Answer `classify(detection)` using k-ring (k=2 default) lookup against `(distance_threshold_m, move_threshold_m, similar_classes)` config.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure
**Component**: mapobjects_store
**Tracker**: AZ-665
**Epic**: AZ-633
## Problem
The H3 spatial index is the foundation of new-vs-existing detection (`architecture.md §7.12`). Each detection's MGRS position is converted to an H3 cell at the configured resolution; the composite key `(H3_cell, class)` keys an in-memory map of known MapObjects. Classification answers `new | moved | existing` by querying the k-ring of cells (boundary correctness) and computing distance against move thresholds.
## Outcome
- `H3Index::cell_of(mgrs, resolution) -> H3Cell`.
- `MapObjectsStore::classify(detection) -> MapObjectClassification ∈ {New, Moved { from_mgrs, to_mgrs }, Existing { existing_id }}`.
- k-ring lookup (default k=2) over the in-memory hashmap.
- `distance_threshold_m` (default 30 m), `move_threshold_m` (default 50 m), `similar_classes` (configured set per `data_model.md §IgnoredItem` class groups) read from config.
- `classify` runs in O(1) expected time with respect to store size; p99 ≤1 ms.
## Scope
### Included
- H3 binding (Rust crate `h3o` or equivalent).
- `MapObjectsStore` struct + in-memory hashmap.
- `classify` API.
- Config-driven thresholds.
### Excluded
- IgnoredItem suppression (task 27).
- Pre-flight hydrate + sync_state machine (task 28).
- Persistence (task 29).
- End-of-pass removed-candidate sweep (task 27).
## Acceptance Criteria
**AC-1: New detection at unseen MGRS**
Given an empty store
When `classify(detection_at_M1, class=A)` is called
Then it returns `MapObjectClassification::New`.
**AC-2: Existing detection at known MGRS within threshold**
Given the store has a MapObject at `M1, class=A`
When `classify(detection_at_M1+5m, class=A)` is called and `distance_threshold_m = 30`
Then it returns `MapObjectClassification::Existing { existing_id: ... }`.
**AC-3: Moved detection beyond move threshold**
Given the store has a MapObject at `M1, class=A`
When `classify(detection_at_M1+60m, class=A)` is called and `move_threshold_m = 50`
Then it returns `MapObjectClassification::Moved { from_mgrs: M1, to_mgrs: M1+60m }`.
**AC-4: k-ring boundary lookup**
Given the store has a MapObject in cell `C1`
When a new detection of the same class falls in cell `C2` (a cell adjacent to `C1`) within `distance_threshold_m` of that MapObject
Then with k=2 the lookup finds `C1` and returns `Existing` (not `New`).
**AC-5: Classify p99 ≤1 ms**
Given a store warmed with 10 000 MapObjects
When `classify` is called 1 000 times
Then p99 latency is ≤1 ms.
## Non-Functional Requirements
**Performance**
- O(1) classify p99 ≤1 ms (per `description.md §9`).
**Reliability**
- k-ring boundary correctness guaranteed by default config.
## Contract
- Canonical typed model: `data_model.md §MapObject`, `§MapObjectClassification`.
## Runtime Completeness
- **Named capability**: H3 spatial index + k-ring queries — production new/moved/existing dispatch.
- **Production code that must exist**: real H3 crate; real k-ring lookup.
- **Unacceptable substitutes**: Euclidean-distance-only naive search is unacceptable for production (loses boundary correctness and O(1) latency).
@@ -0,0 +1,64 @@
# IgnoredItem Set + End-of-Pass Sweep
**Task**: AZ-666_mapobjects_store_ignored_and_pass_sweep
**Name**: IgnoredItem set + end-of-pass removed-candidate sweep
**Description**: `IgnoredItem` set keyed by `(MGRS, class_group)`. `is_ignored(MGRS, class_group)` suppression query. End-of-pass sweep: after a region's pass ends, return objects in the region that were not re-observed as `removed_candidate`s.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-665_mapobjects_store_h3_classify
**Component**: mapobjects_store
**Tracker**: AZ-666
**Epic**: AZ-633
## Problem
When the operator declines a POI, the (MGRS, class_group) pair is added to the `IgnoredItem` set; subsequent detections matching the pair are suppressed BEFORE they reach the queue. Separately, when a scan pass over a region ends (signal from `scan_controller` / `mission_executor`), MapObjects that were known in the region but NOT re-observed during the pass should be flagged `removed_candidate` — the operator (not the system) decides actual removal.
## Outcome
- `IgnoredSet::append(item: IgnoredItem)` stores the entry.
- `is_ignored(mgrs, class_group) -> bool` answers in O(1).
- `MapObjectsStore::end_of_pass(region_bbox) -> Vec<RemovedCandidate>` returns objects in the region that were NOT re-observed since the pass started.
- Per-region pass tracker (start_ts, observed_ids) maintained.
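The suppression query and the end-of-pass sweep can be sketched with plain hash sets. The names below are assumptions for illustration: `CellId` is a `u64` stand-in for `H3Cell`, class groups are static strings, `PassTracker` and `mark_observed` are hypothetical, and the caller is assumed to have already selected the objects that fall inside the region.

```rust
use std::collections::HashSet;

pub type CellId = u64; // stand-in for `H3Cell`
pub type ClassGroup = &'static str;

#[derive(Default)]
pub struct IgnoredSet {
    items: HashSet<(CellId, ClassGroup)>,
}

impl IgnoredSet {
    pub fn append(&mut self, cell: CellId, class_group: ClassGroup) {
        self.items.insert((cell, class_group));
    }

    /// Expected O(1) membership test.
    pub fn is_ignored(&self, cell: CellId, class_group: ClassGroup) -> bool {
        self.items.contains(&(cell, class_group))
    }
}

pub struct MapObject {
    pub id: u32,
    pub cell: CellId,
    pub class_group: ClassGroup,
}

/// Ids re-observed since the pass over a region started.
#[derive(Default)]
pub struct PassTracker {
    observed_ids: HashSet<u32>,
}

impl PassTracker {
    pub fn mark_observed(&mut self, id: u32) {
        self.observed_ids.insert(id);
    }

    /// Objects known in the region that were not re-observed during the pass
    /// and are not ignored; these are the removed-candidates.
    pub fn end_of_pass(&self, objects_in_region: &[MapObject], ignored: &IgnoredSet) -> Vec<u32> {
        objects_in_region
            .iter()
            .filter(|o| !self.observed_ids.contains(&o.id))
            .filter(|o| !ignored.is_ignored(o.cell, o.class_group))
            .map(|o| o.id)
            .collect()
    }
}
```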
## Scope
### Included
- `IgnoredSet` using a `HashSet<(H3Cell, ClassGroup)>` keyed structure.
- Class-group resolution (read group from config; e.g. `military_vehicle_group`, `concealed_position_group`, `movement_candidate`).
- Per-region pass tracker.
- End-of-pass sweep query.
### Excluded
- H3 classify (task 26).
- Pre-flight hydrate (task 28).
- Persistence (task 29).
- Append to `pending_observations` / `pending_ignored` (task 28).
## Acceptance Criteria
**AC-1: Ignored item suppresses subsequent detections**
Given `append(IgnoredItem { mgrs: M1, class_group: G })`
When `is_ignored(M1, G)` is called
Then it returns `true`; calls for other pairs return `false`.
**AC-2: End-of-pass returns un-observed objects**
Given a store with MapObjects at `M1, M2, M3` in region `R`
When the pass starts at `t0`, only `M1` is re-observed, and `end_of_pass(R)` is called at `t1`
Then it returns `[M2, M3]` as `RemovedCandidate`s.
**AC-3: End-of-pass excludes ignored**
Given `M2` was un-observed AND `is_ignored(M2.mgrs, M2.class_group) == true`
When `end_of_pass(R)` is called
Then `M2` is NOT in the returned list (ignored objects are not surfaced as removed-candidates).
## Non-Functional Requirements
**Performance**
- `is_ignored` p99 ≤1 ms.
- `end_of_pass` p99 ≤50 ms for a 30 km × 30 km region with ≤1 000 known objects.
## Runtime Completeness
- **Named capability**: IgnoredItem suppression + end-of-pass sweep.
- **Production code that must exist**: real HashSet + real per-region pass tracker.
- **Unacceptable substitutes**: re-querying the store for every detection without an `IgnoredSet` cache is unacceptable (latency violation).
@@ -0,0 +1,80 @@
# Pre-Flight Hydrate + Sync State Machine + Pending Logs
**Task**: AZ-667_mapobjects_store_hydrate_and_pending
**Name**: Pre-flight hydrate from MapObjectsBundle + sync_state machine + pending_observations/pending_ignored append logs
**Description**: Hydrate the store from a `MapObjectsBundle` (from `mission_client`'s pull). Maintain a `sync_state` enum (`synced | cached_fallback | degraded | failed`). Append every NEW / MOVED / EXISTING / REMOVED-CANDIDATE / IgnoredItem event to `pending_observations` / `pending_ignored` for the post-flight push.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-665_mapobjects_store_h3_classify, AZ-666_mapobjects_store_ignored_and_pass_sweep
**Component**: mapobjects_store
**Tracker**: AZ-667
**Epic**: AZ-633
## Problem
The on-device working copy is hydrated pre-flight from the central API. The sync_state machine (`fresh_boot → synced | cached_fallback | degraded`) tracks the relationship to the central source of truth. During flight, every classification event is appended to `pending_observations` (or, for declines, `pending_ignored`) — central writes are forbidden mid-flight (Frozen choice 6). The pending logs feed the post-flight push.
## Outcome
- `hydrate(bundle: MapObjectsBundle) -> Result<()>` loads the bundle into the in-memory hashmap + IgnoredSet; sets `sync_state = synced` (or `cached_fallback` if `bundle.fallback_used`).
- `on_classify_result(classification, detection)` appends a `MapObjectObservation` to `pending_observations` for NEW / MOVED / EXISTING / REMOVED-CANDIDATE.
- `on_decline(ignored_item)` appends to `pending_ignored`.
- `drain_pending() -> (Vec<MapObjectObservation>, Vec<IgnoredItem>)` is called by `mission_client::push_mapobjects_diff` post-flight.
- Health surface: `sync_state`, `pending_observations_count`, `pending_ignored_count`, `last_pull_ts`, `last_push_ts`.
- On `DELETE /missions/{id}` cascade signal from `mission_client`, drop mission-scoped objects.
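The append-and-drain contract for the pending logs can be sketched as follows. The field layouts of `MapObjectObservation` and `IgnoredItem` are placeholders (the canonical models live in `data_model.md`), and `PendingLogs` is a hypothetical container; the point is that `drain_pending` hands back both logs and leaves them empty in one step.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ObservationKind {
    New,
    Moved,
    Existing,
    RemovedCandidate,
}

/// Placeholder shape; the canonical model is defined in `data_model.md`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MapObjectObservation {
    pub kind: ObservationKind,
    pub detection_id: u64,
}

/// Placeholder shape; the canonical model is defined in `data_model.md`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IgnoredItem {
    pub mgrs: String,
    pub class_group: String,
}

/// In-memory append-only logs feeding the post-flight push.
#[derive(Default)]
pub struct PendingLogs {
    observations: Vec<MapObjectObservation>,
    ignored: Vec<IgnoredItem>,
}

impl PendingLogs {
    pub fn on_classify_result(&mut self, kind: ObservationKind, detection_id: u64) {
        self.observations.push(MapObjectObservation { kind, detection_id });
    }

    pub fn on_decline(&mut self, item: IgnoredItem) {
        self.ignored.push(item);
    }

    pub fn pending_observations_count(&self) -> usize {
        self.observations.len()
    }

    pub fn pending_ignored_count(&self) -> usize {
        self.ignored.len()
    }

    /// Return both logs and reset them to empty.
    pub fn drain_pending(&mut self) -> (Vec<MapObjectObservation>, Vec<IgnoredItem>) {
        (
            std::mem::take(&mut self.observations),
            std::mem::take(&mut self.ignored),
        )
    }
}
```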
## Scope
### Included
- `MapObjectsBundle` hydration (model = `data_model.md §MapObjectsBundle`).
- Sync-state enum + transitions.
- Append-only `pending_observations` + `pending_ignored` logs (in-memory; durable disk handoff lives in `mission_client` task 08).
- Drain API.
- Mission-cascade handler.
### Excluded
- H3 classify (task 26).
- Disk persistence (task 29) — this task keeps pending in memory + lets `mission_client` task 08 handle disk durability.
- Post-flight push (lives in `mission_client` task 08).
## Acceptance Criteria
**AC-1: Hydrate from bundle**
Given a `MapObjectsBundle` with N MapObjects and M IgnoredItems
When `hydrate(bundle)` is called
Then the store contains all N MapObjects and all M IgnoredItems, and `sync_state = "synced"`.
**AC-2: Fallback bundle sets cached_fallback**
Given a bundle with `fallback_used = true`
When `hydrate(bundle)` is called
Then `sync_state = "cached_fallback"`.
**AC-3: Classify appends pending observation**
Given the store hydrated and a detection that classifies as `New`
When `on_classify_result(New, detection)` is called
Then `pending_observations_count` increments by 1.
**AC-4: Drain returns and clears pending**
Given pending_observations_count = 5, pending_ignored_count = 2
When `drain_pending()` is called
Then it returns 5 observations + 2 ignored items; counts return to 0.
**AC-5: Cascade drops mission-scoped objects**
Given `M1` (mission A) and `M2` (mission B) objects in the store
When the cascade signal for mission A arrives
Then `M1` is dropped; `M2` remains.
## Non-Functional Requirements
**Performance**
- Hydrate from a 30 km × 30 km bundle: ≤2 s (peer of pre-flight pull's 30 s budget).
- Append per classification: ≤100 µs.
## Contract
- Canonical typed model: `data_model.md §MapObjectsBundle`, `§MapObjectObservation`.
## Runtime Completeness
- **Named capability**: hydrate + sync_state + pending event logs.
- **Production code that must exist**: real hydrate; real pending append; real drain.
- **Unacceptable substitutes**: central writes mid-flight are forbidden (Frozen choice 6).
@@ -0,0 +1,76 @@
# Persistence — In-Memory + JSON Snapshot (Q3 Default)
**Task**: AZ-668_mapobjects_store_persistence
**Name**: In-memory + JSON snapshot persistence (default per Q3)
**Description**: Crash-recovery and post-flight upload durability for the in-memory MapObjects state. Default engine: in-memory + atomic JSON snapshot to `${state_dir}/mapobjects/<mission_id>.json` per checkpoint. Q3 reserves the slot for SQLite+H3 / KV alternatives.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-665_mapobjects_store_h3_classify, AZ-667_mapobjects_store_hydrate_and_pending
**Component**: mapobjects_store
**Tracker**: AZ-668
**Epic**: AZ-633
## Problem
The in-memory hashmap is authoritative for the active mission, but a crash mid-mission must not lose the pending diff. The persistence engine choice is Q3 (open); the default is in-memory + JSON snapshot (atomic rename), which keeps the engine choice cleanly behind a `MapObjectsPersistence` trait so SQLite+H3 or RocksDB can swap in later without touching call sites.
## Outcome
- `MapObjectsPersistence` trait with `save_snapshot(state) -> Result<()>` and `load_snapshot(path) -> Result<State>`.
- `JsonSnapshotEngine` impl that writes to `${state_dir}/mapobjects/<mission_id>.json` via atomic rename (write to `.tmp` then rename).
- Snapshot cadence: configurable; default every 30 s OR on every N pending-observation appends, whichever comes first.
- Crash recovery: at startup, load the most recent snapshot for any mission that did not reach `POST_FLIGHT_SYNC`.
- Health surface: `last_snapshot_ts`, `snapshot_size_bytes`, `snapshot_errors_total`.
- Persistence corruption on startup: refuse to start with stale state; surface explicit error to the operator.
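The write-to-`.tmp`-then-rename pattern and the cadence rule from the list above could look roughly like the following. It is a sketch under assumptions: `write_snapshot_atomic` and `SnapshotCadence` are hypothetical names, the snapshot is passed in as already-serialized bytes (JSON encoding is out of scope here), and the rename is only atomic when the temporary file sits on the same filesystem as the target.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;
use std::time::Duration;

/// Write `bytes` to `final_path` through a sibling `<name>.tmp` file, then rename.
/// Readers see either the previous complete snapshot or the new one.
pub fn write_snapshot_atomic(final_path: &Path, bytes: &[u8]) -> io::Result<()> {
    let mut tmp_name = final_path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = final_path.with_file_name(tmp_name);

    if let Some(dir) = final_path.parent() {
        fs::create_dir_all(dir)?;
    }
    {
        let mut f = File::create(&tmp_path)?;
        f.write_all(bytes)?;
        // Flush file contents to disk before the rename publishes them.
        f.sync_all()?;
    }
    fs::rename(&tmp_path, final_path)
}

/// Hypothetical cadence policy: snapshot when either limit is reached.
pub struct SnapshotCadence {
    pub max_interval: Duration,
    pub max_appends: usize,
}

impl SnapshotCadence {
    pub fn due(&self, since_last: Duration, appends_since_last: usize) -> bool {
        since_last >= self.max_interval || appends_since_last >= self.max_appends
    }
}
```

A process killed before the rename leaves at most a stale `.tmp` file behind; the previous snapshot stays untouched, which is the behaviour AC-2 asks for.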
## Scope
### Included
- `MapObjectsPersistence` trait.
- `JsonSnapshotEngine` (default impl).
- Atomic rename pattern.
- Crash-recovery load.
- Snapshot cadence policy.
### Excluded
- SQLite+H3 alternative (Q3 follow-up if chosen later).
- KV alternative (Q3 follow-up).
- The post-flight push itself (`mission_client` task 08).
## Acceptance Criteria
**AC-1: Snapshot + reload round-trip**
Given a store with 100 MapObjects + 10 IgnoredItems + 5 pending observations
When `save_snapshot()` writes to disk and a fresh process calls `load_snapshot()`
Then the loaded state equals the saved state.
**AC-2: Atomic rename prevents partial writes**
Given a snapshot write is interrupted mid-write (simulated kill -9)
When a fresh process starts
Then it loads the previous good snapshot, not the partial one (no corruption observed).
**AC-3: Crash recovery loads pending**
Given a previous run terminated with non-empty pending_observations
When the new process calls `load_snapshot()` for the same mission_id
Then pending_observations is non-empty and matches the pre-crash count.
**AC-4: Corruption surfaces explicit error**
Given a snapshot file with truncated content
When `load_snapshot()` runs
Then it returns `Err(CorruptSnapshot)` and `snapshot_errors_total` increments; the store does NOT silently start empty.
## Non-Functional Requirements
**Performance**
- Snapshot of a 30 km × 30 km mission (≤1 000 MapObjects): ≤1 s.
- Crash recovery: ≤2 s to a usable state (per `description.md §9`).
**Reliability**
- Atomic rename — no partial-write corruption.
- Corruption never silent.
## Runtime Completeness
- **Named capability**: persistent MapObjects state with crash recovery — default engine in-memory + JSON snapshot per Q3.
- **Production code that must exist**: real disk write; real atomic rename; real corruption-detection on load.
- **Allowed external stubs**: `tempfile` for test fixtures.
- **Unacceptable substitutes**: a no-op persistence in production is unacceptable (crash mid-flight loses the diff).
@@ -0,0 +1,64 @@
# Primitive Graph Builder + Path Freshness Scoring
**Task**: AZ-669_semantic_analyzer_primitive_graph
**Name**: Primitive graph from Tier-1 detections + path-freshness scoring
**Description**: Build a small ROI-scoped primitive graph from Tier-1 detections (path nodes, endpoint nodes, context nodes). Score path freshness using texture, edge clarity, undisturbed-surroundings cues.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-660_detection_client_grpc_stream, AZ-661_detection_client_schema_and_health
**Component**: semantic_analyzer
**Tracker**: AZ-669
**Epic**: AZ-630
## Problem
Tier 2 reasons over zoom-in crops using a primitive graph built from Tier-1 detections. The graph captures footpaths (path nodes), branch piles / dark entrances / dugouts (endpoint nodes), and trees / tree-blocks (context nodes). Path-freshness scoring combines surface texture, edge clarity, and undisturbed-surroundings cues into a single freshness score consumed by the recommended-action policy.
## Outcome
- `PrimitiveGraph::build(roi, detections) -> Graph` builds the graph from Tier-1 detections inside the ROI.
- `FreshnessScorer::score(graph, frame_crop) -> PathFreshnessScore` returns a normalized 0–1 score per path node.
- Graph validation: disconnected paths trigger an explicit warning (consumed by task 32).
- Health surface: `graphs_built_total`, `freshness_score_p50/p99`, `disconnected_graphs_total`.
## Scope
### Included
- Graph data structures (path / endpoint / context node types).
- Detection-to-node mapping (per-class).
- Freshness scoring (computer-vision-style: edge density, texture variance, surrounding undisturbed area).
- Graph validation.
### Excluded
- ROI CNN inference (task 31).
- Recommended-action policy (task 32).
- VLM (separate component).
## Acceptance Criteria
**AC-1: Graph contains all relevant detections**
Given a `DetectionBatch` with 3 footpath bboxes + 2 branch-pile bboxes + 5 tree bboxes inside the ROI
When `build(roi, batch)` runs
Then the graph contains 3 path nodes + 2 endpoint nodes + 5 context nodes.
**AC-2: Freshness score is bounded**
Given any valid graph + frame crop
When `score(graph, crop)` runs
Then every emitted freshness score is in `[0.0, 1.0]`.
**AC-3: Disconnected graph is flagged**
Given a graph with two unconnected path components
When validation runs
Then `disconnected_graphs_total` increments by 1 and the graph is marked invalid.
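One possible shape for the per-class node mapping and the disconnected-path check is sketched below. The class label strings, the edge-list representation, and the function names are all assumptions made for illustration; the spec does not say how path nodes are linked, so the sketch simply treats an explicit edge between two path nodes as a connection and counts components with a union-find.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeKind {
    Path,
    Endpoint,
    Context,
}

/// Hypothetical class labels; the real Tier-1 class names may differ.
pub fn node_kind_for_class(class: &str) -> Option<NodeKind> {
    match class {
        "footpath" => Some(NodeKind::Path),
        "branch_pile" | "dark_entrance" | "dugout" => Some(NodeKind::Endpoint),
        "tree" | "tree_block" => Some(NodeKind::Context),
        _ => None,
    }
}

// Union-find root lookup with path halving.
fn find(parent: &mut [usize], mut x: usize) -> usize {
    while parent[x] != x {
        let grandparent = parent[parent[x]];
        parent[x] = grandparent;
        x = grandparent;
    }
    x
}

/// Count connected components among path nodes only.
/// Edges that touch a non-path node or an out-of-range index are ignored.
pub fn path_component_count(kinds: &[NodeKind], edges: &[(usize, usize)]) -> usize {
    let mut parent: Vec<usize> = (0..kinds.len()).collect();
    for &(a, b) in edges {
        let both_paths =
            kinds.get(a) == Some(&NodeKind::Path) && kinds.get(b) == Some(&NodeKind::Path);
        if !both_paths {
            continue;
        }
        let ra = find(&mut parent, a);
        let rb = find(&mut parent, b);
        if ra != rb {
            parent[ra] = rb;
        }
    }
    let mut roots = HashSet::new();
    for (i, kind) in kinds.iter().enumerate() {
        if *kind == NodeKind::Path {
            roots.insert(find(&mut parent, i));
        }
    }
    roots.len()
}

/// A graph is flagged when its path nodes form more than one component.
pub fn is_disconnected(kinds: &[NodeKind], edges: &[(usize, usize)]) -> bool {
    path_component_count(kinds, edges) > 1
}
```

A caller would increment `disconnected_graphs_total` and mark the graph invalid whenever `is_disconnected` returns `true`.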
## Non-Functional Requirements
**Performance**
- Graph build: ≤30 ms per ROI on Jetson Orin Nano.
- Freshness scoring: ≤50 ms per ROI.
## Runtime Completeness
- **Named capability**: primitive graph construction + path-freshness scoring — production reasoning path.
- **Production code that must exist**: real graph construction; real freshness scorer.
- **Allowed external stubs**: `opencv` for texture/edge feature extraction.
- **Unacceptable substitutes**: a constant-score scorer in production is unacceptable.
@@ -0,0 +1,70 @@
# ROI CNN Inference + Size/Timeout Bounds + Concealment Scoring
**Task**: AZ-670_semantic_analyzer_roi_cnn
**Name**: ONNX/TensorRT ROI CNN + ROI size/timeout enforcement + concealment scoring
**Description**: Lightweight CNN session (ONNX/TensorRT) for endpoint-candidate concealment scoring. Bound every inference by strict ROI size and timeout. Never run on a full frame.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-669_semantic_analyzer_primitive_graph
**Component**: semantic_analyzer
**Tracker**: AZ-670
**Epic**: AZ-630
## Problem
Endpoint candidates (branch piles, dark entrances, dugouts) need a concealment score that combines visual cues a primitive graph can't capture alone. A lightweight CNN session (ONNX or TensorRT) runs on bounded ROI crops with a strict timeout — never on a full frame. Oversize ROIs are rejected pre-decode. Inference timeout returns a structured `Tier2Evidence { status: timeout }` so `scan_controller` can decide to skip VLM and surface a low-evidence POI.
## Outcome
- `RoiInference::infer(roi_crop) -> Result<ConcealmentScore, RoiError>` runs the CNN session.
- ROI size check pre-decode: reject if larger than `max_roi_bytes` config; `RoiError::Oversize`.
- Wall-clock timeout `inference_timeout_ms` (default 200 ms); on timeout returns `RoiError::Timeout`.
- CNN backend: ONNX Runtime primary (CPU); TensorRT optional behind a build-time feature for Jetson.
- Health surface: `tier2_latency_p50/p99`, `roi_size_bytes_p99`, `errors_total`, `oversize_rejections_total`, `timeouts_total`.
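The size and timeout bounds above can be illustrated without any inference backend. In this sketch the model is a plain closure, `infer_bounded` is a hypothetical name, and the score is a bare `f32` rather than the spec's `ConcealmentScore`. The timeout is enforced with a worker thread and a channel, which is one option among several and not necessarily what the real `ort` or TensorRT integration would use.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum RoiError {
    Oversize,
    Timeout,
}

/// Reject oversize input before any decode, then run the model under a
/// wall-clock deadline. `run_model` stands in for the real CNN session.
pub fn infer_bounded<F>(
    roi_bytes: Vec<u8>,
    max_roi_bytes: usize,
    timeout: Duration,
    run_model: F,
) -> Result<f32, RoiError>
where
    F: FnOnce(Vec<u8>) -> f32 + Send + 'static,
{
    if roi_bytes.len() > max_roi_bytes {
        return Err(RoiError::Oversize);
    }
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(run_model(roi_bytes));
    });
    // Any receive failure (deadline passed, or the worker panicked and dropped
    // the sender) is reported as Timeout in this sketch.
    rx.recv_timeout(timeout).map_err(|_| RoiError::Timeout)
}
```

Note that the worker thread is not cancelled when the deadline passes; it only stops being waited on. A production version would need a way to abort or reuse the in-flight session.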
## Scope
### Included
- ONNX Runtime binding (Rust crate `ort` or equivalent).
- TensorRT integration behind feature flag (defer real impl if not Jetson-ready).
- ROI size + timeout bounds.
- Concealment scoring (raw CNN output + post-process).
### Excluded
- Primitive graph + freshness scoring (task 30).
- Recommended-action policy (task 32).
- The CNN model weights themselves (treated as a build/deploy artefact; ONNX model file path is config).
## Acceptance Criteria
**AC-1: Inference happy path**
Given a 256×256 RGB ROI and a fixture CNN model
When `infer(roi)` runs
Then it returns `Ok(ConcealmentScore { value: f32, model_version })` within ≤200 ms p99.
**AC-2: Oversize ROI rejected pre-decode**
Given an ROI larger than `max_roi_bytes`
When `infer(roi)` is called
Then it returns `Err(RoiError::Oversize)` immediately; no decode happens.
**AC-3: Inference timeout returns explicit error**
Given a fixture CNN that takes 500 ms
When `infer(roi)` is called with `inference_timeout_ms = 200`
Then it returns `Err(RoiError::Timeout)` and `timeouts_total` increments by 1.
**AC-4: TensorRT feature absent does not break build**
Given the binary is built WITHOUT the `tensorrt` feature
When the build runs
Then it builds cleanly using ONNX Runtime only.
## Non-Functional Requirements
**Performance**
- Per-ROI inference: ≤200 ms p99 (per `description.md §8`).
- Concealed-position recall ≥60 %, precision ≥20 % (per `description.md §8`; both measured against the benchmark dataset, not asserted here).
## Runtime Completeness
- **Named capability**: ROI CNN inference on ONNX (TensorRT optional).
- **Production code that must exist**: real ONNX session; real ROI bounds; real timeout.
- **Allowed external stubs**: a tiny fixture ONNX model for unit tests.
- **Unacceptable substitutes**: running on the full frame instead of an ROI is unacceptable (memory + latency).
@@ -0,0 +1,69 @@
# Recommended-Action Policy + Pan Plan Emission
**Task**: AZ-671_semantic_analyzer_action_policy
**Name**: Tier2Evidence action policy + pan-plan emission for footpath-follow
**Description**: At intersections, recommend `PanFollowFootpath | HoldEndpoint | PanBroad | ReturnToZoomOut` based on the primitive graph + freshness + concealment scores + Tier2Evidence shape. Emit a pan plan (sequence of pan goals) when `PanFollowFootpath` is chosen.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-669_semantic_analyzer_primitive_graph, AZ-670_semantic_analyzer_roi_cnn
**Component**: semantic_analyzer
**Tracker**: AZ-671
**Epic**: AZ-630
## Problem
Once the primitive graph + freshness + concealment scores are computed, the policy must choose what the gimbal does next. At intersections, the freshest / most-promising branch is recommended for `gimbal_controller` to pan toward; an explicit `pan plan` (sequence of pan goals with timing) is emitted that keeps the path centered while the UAV moves.
## Outcome
- `ActionPolicy::recommend(graph, freshness, concealment, current_roi) -> Tier2Evidence` returns the typed evidence with `recommended_next_action`.
- For `PanFollowFootpath`, the evidence carries an attached `PanPlan` (sequence of `(yaw, pitch, zoom, at_ts)` goals) consumed by `gimbal_controller` (task 16).
- `HoldEndpoint`, `PanBroad`, `ReturnToZoomOut` are returned without a pan plan.
- For graph-invalid (disconnected) cases, returns `recommended_next_action: ReturnToZoomOut` + `path_freshness: undefined`.
## Scope
### Included
- Policy rule table (graph-shape × scores × current ROI → action).
- Pan plan generator (footpath traversal sequence).
- Tier2Evidence type assembly.
### Excluded
- Plan execution (`gimbal_controller` task 16).
- VLM gating (lives in `scan_controller`).
## Acceptance Criteria
**AC-1: Single fresh footpath → PanFollowFootpath**
Given a graph with one path node, freshness > 0.7, no endpoint nodes
When `recommend(...)` runs
Then it returns `Tier2Evidence { recommended_next_action: PanFollowFootpath, pan_plan: Some(...) }`.
**AC-2: Branched intersection picks freshest branch**
Given a graph with three path nodes meeting at an intersection, freshness `[0.3, 0.9, 0.5]`
When `recommend(...)` runs
Then the emitted pan plan's first non-trivial pan goal lies along the branch with freshness 0.9.
**AC-3: High-concealment endpoint → HoldEndpoint**
Given a graph with one endpoint node, concealment > 0.8
When `recommend(...)` runs
Then it returns `Tier2Evidence { recommended_next_action: HoldEndpoint, pan_plan: None }`.
**AC-4: Disconnected graph → ReturnToZoomOut**
Given a graph marked invalid (disconnected paths)
When `recommend(...)` runs
Then it returns `Tier2Evidence { recommended_next_action: ReturnToZoomOut, path_freshness: undefined, pan_plan: None }`.
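A rule table consistent with the four criteria above might be sketched as follows. The thresholds 0.7 and 0.8 are taken from AC-1 and AC-3; the precedence order (invalid graph first, then endpoint, then footpath, then `PanBroad` as the fallback) is an assumption of this sketch, as are the `Recommendation` struct and the flattened inputs. The real policy returns a full `Tier2Evidence` with an attached `PanPlan`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    PanFollowFootpath,
    HoldEndpoint,
    PanBroad,
    ReturnToZoomOut,
}

/// Hypothetical reduced result: the chosen action plus the branch to follow.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Recommendation {
    pub action: Action,
    pub follow_branch: Option<usize>,
}

const FOLLOW_FRESHNESS: f32 = 0.7; // from AC-1
const HOLD_CONCEALMENT: f32 = 0.8; // from AC-3

pub fn recommend(
    graph_valid: bool,
    path_freshness: &[f32],
    endpoint_concealment: &[f32],
) -> Recommendation {
    // Rule 1: an invalid (disconnected) graph always zooms back out.
    if !graph_valid {
        return Recommendation { action: Action::ReturnToZoomOut, follow_branch: None };
    }
    // Rule 2: a strongly concealed endpoint is held (assumed to outrank paths).
    if endpoint_concealment.iter().any(|&c| c > HOLD_CONCEALMENT) {
        return Recommendation { action: Action::HoldEndpoint, follow_branch: None };
    }
    // Rule 3: follow the freshest branch if it clears the threshold.
    let freshest = path_freshness
        .iter()
        .copied()
        .enumerate()
        .filter(|(_, f)| !f.is_nan())
        .max_by(|a, b| a.1.total_cmp(&b.1));
    match freshest {
        Some((idx, f)) if f > FOLLOW_FRESHNESS => Recommendation {
            action: Action::PanFollowFootpath,
            follow_branch: Some(idx),
        },
        // Rule 4: nothing promising, so scan broadly.
        _ => Recommendation { action: Action::PanBroad, follow_branch: None },
    }
}
```

Keeping the rules as ordered early returns makes the precedence explicit and easy to test one criterion at a time.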
## Non-Functional Requirements
**Performance**
- Policy + pan-plan generation: ≤20 ms p99 (well within the ≤200 ms Tier 2 budget).
## Contract
- Canonical typed model: `data_model.md §Tier2Evidence`, `§PanPlan`.
## Runtime Completeness
- **Named capability**: Tier-2 action policy + pan-plan emission.
- **Production code that must exist**: real policy rules; real pan-plan generator.
- **Unacceptable substitutes**: an "always return PanFollowFootpath" placeholder is unacceptable in production.
@@ -0,0 +1,70 @@
# VLM Provider Trait + Disabled Default Impl + Feature Flag
**Task**: AZ-672_vlm_client_provider_trait
**Name**: VlmAssessmentProvider trait + default disabled impl + build-time feature gating
**Description**: Define `VlmAssessmentProvider` trait (in `shared::contracts`) and a default impl that always returns `status: disabled`. The `vlm_client` crate is behind a build-time feature flag; with the feature off the default impl is used and the binary builds + runs identically without `vlm_client`.
**Complexity**: 2 points
**Dependencies**: AZ-640_initial_structure
**Component**: vlm_client
**Tracker**: AZ-672
**Epic**: AZ-631
## Problem
VLM is optional in two ways: at runtime (`vlm_enabled` flag) and at build time (`vlm_client` Cargo feature). `scan_controller` depends only on the trait — never on the `vlm_client` crate directly — so the binary builds and runs with VLM absent. The default trait impl returns `status: disabled` so the call-site code path is identical whether VLM is enabled or absent.
## Outcome
- `VlmAssessmentProvider` trait in `shared::contracts::vlm`:
```text
trait VlmAssessmentProvider {
async fn assess(&self, roi_crop: &RoiCrop, prompt: &str) -> VlmAssessment;
}
```
- Default impl `DisabledVlmProvider` returns `VlmAssessment { status: Disabled, .. }` for every call.
- `vlm_client` Cargo feature gates inclusion of the real `vlm_client` crate; with feature off, only `DisabledVlmProvider` is registered.
- Runtime flag `vlm_enabled = false` causes the composition root to install `DisabledVlmProvider` even when the feature is compiled in.
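The two optionality layers in the list above can be sketched as a single selection function at the composition root. Several simplifications are assumed here: the trait is synchronous for brevity (the spec's `assess` is `async`), the ROI is a byte slice instead of `RoiCrop`, `VlmAssessment` carries only a status, and `select_provider` is a hypothetical name. The build-time feature is modelled as an optional factory, which would be `None` when the `vlm_client` feature is compiled out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Ok,
    Inconclusive,
    Timeout,
    SchemaInvalid,
    IpcError,
    Disabled,
}

// Reduced to the status field for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct VlmAssessment {
    pub status: Status,
}

// Synchronous simplification of the async trait in the spec.
pub trait VlmAssessmentProvider {
    fn assess(&self, roi_crop: &[u8], prompt: &str) -> VlmAssessment;
}

/// Default provider: every call reports `Disabled` without doing any work.
pub struct DisabledVlmProvider;

impl VlmAssessmentProvider for DisabledVlmProvider {
    fn assess(&self, _roi_crop: &[u8], _prompt: &str) -> VlmAssessment {
        VlmAssessment { status: Status::Disabled }
    }
}

/// Pick the provider once at startup.
/// `real_factory` is `None` when the real client is not compiled in; it is
/// only invoked when the runtime flag is on, so a disabled configuration
/// never constructs the real client.
pub fn select_provider<F>(
    vlm_enabled: bool,
    real_factory: Option<F>,
) -> Box<dyn VlmAssessmentProvider>
where
    F: FnOnce() -> Box<dyn VlmAssessmentProvider>,
{
    match (vlm_enabled, real_factory) {
        (true, Some(make_real)) => make_real(),
        _ => Box::new(DisabledVlmProvider),
    }
}
```

Because callers only ever hold a `Box<dyn VlmAssessmentProvider>`, the call site is identical in all three configurations.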
## Scope
### Included
- Trait definition in `shared::contracts::vlm`.
- `DisabledVlmProvider` default impl (also in `shared` so it's available regardless of feature).
- Cargo feature flag wiring in `Cargo.toml` (workspace + binary).
- Runtime flag plumb from config.
### Excluded
- The real NanoLLM IPC client (task 34).
- Schema validation (task 35).
## Acceptance Criteria
**AC-1: Disabled default returns disabled status**
Given a `DisabledVlmProvider`
When `assess(roi, "...")` is called
Then it returns `VlmAssessment { status: Status::Disabled, .. }` immediately (≤1 ms).
**AC-2: Binary builds without vlm_client feature**
Given the binary is built with `--no-default-features` (or whatever toggles the `vlm_client` feature off)
When the build runs
Then it succeeds; the `vlm_client` crate is NOT a build dependency.
**AC-3: Runtime vlm_enabled = false uses disabled impl**
Given the binary is built WITH the `vlm_client` feature but config sets `vlm_enabled = false`
When the composition root constructs the provider
Then `DisabledVlmProvider` is installed; the real NanoLLM client is NOT constructed.
## Non-Functional Requirements
**Performance**
- `DisabledVlmProvider::assess` ≤1 ms.
## Contract
- Canonical typed model: `data_model.md §VlmAssessment`.
## Runtime Completeness
- **Named capability**: optional-VLM trait + disabled default.
- **Production code that must exist**: real trait; real disabled impl; real feature-flag wiring.
- **Unacceptable substitutes**: hardcoding `vlm_client` as a non-optional dependency is unacceptable per `description.md §9 Optionality Model`.
@@ -0,0 +1,72 @@
# NanoLLM UDS Client + Peer-Cred Check + Pre-Send Validation
**Task**: AZ-673_vlm_client_nanollm_ipc
**Name**: Unix-domain socket client to NanoLLM + peer-cred check + ROI pre-send validation
**Description**: Maintain the Unix-domain-socket connection to the NanoLLM process. Perform a peer-credential check on connect (where supported). Validate ROI payload (size, format) BEFORE sending across the IPC channel. No network egress — UDS only.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure, AZ-672_vlm_client_provider_trait
**Component**: vlm_client
**Tracker**: AZ-673
**Epic**: AZ-631
## Problem
VLM runs as a local NanoLLM/VILA1.5-3B process. The link is a Unix-domain socket: no network egress, ever. The connection MUST be peer-credential-checked on connect (Linux `SO_PEERCRED`) to confirm the peer process belongs to the expected UID / GID; failure is a hard error requiring operator intervention, not a silent retry. ROI payloads MUST be validated for size + format BEFORE crossing the socket, so no IPC time is spent on a payload already known to be too large.
## Outcome
- `NanoLlmClient::connect(socket_path) -> Result<Self, ConnectError>` opens a UDS connection and performs `SO_PEERCRED` check; mismatch returns `Err(PeerCredMismatch)`.
- `NanoLlmClient::assess(roi_crop, prompt) -> VlmAssessment` validates the ROI pre-send and sends a single request; awaits one response within ≤5 s; returns `VlmAssessment`.
- Bounded reconnect on transport loss; on peer-cred failure NO reconnect happens (operator intervention required).
- Health surface: `vlm_latency_p50/p99`, `errors_by_kind`, `peer_cred_check_pass_rate`.
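The reconnect rule above (bounded retries on transport loss, no retry at all after a peer-credential failure) can be captured as a small state machine that is independent of the socket code. Everything named here is hypothetical, and the exponential backoff is an assumption; the spec only requires that reconnects be bounded.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ConnectFailure {
    Transport,
    PeerCredMismatch,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NextStep {
    /// Try again after this many milliseconds.
    RetryAfterMs(u64),
    /// Retry budget exhausted for this outage.
    GiveUp,
    /// Peer-cred failure seen; blocked until an operator resets the policy.
    Latched,
}

pub struct ReconnectPolicy {
    max_attempts: u32,
    base_delay_ms: u64,
    attempts: u32,
    latched: bool,
}

impl ReconnectPolicy {
    pub fn new(max_attempts: u32, base_delay_ms: u64) -> Self {
        ReconnectPolicy { max_attempts, base_delay_ms, attempts: 0, latched: false }
    }

    pub fn on_failure(&mut self, failure: ConnectFailure) -> NextStep {
        if self.latched {
            return NextStep::Latched;
        }
        match failure {
            ConnectFailure::PeerCredMismatch => {
                self.latched = true;
                NextStep::Latched
            }
            ConnectFailure::Transport => {
                if self.attempts >= self.max_attempts {
                    return NextStep::GiveUp;
                }
                // Assumed backoff: base * 2^attempts, with the shift capped.
                let delay = self.base_delay_ms.saturating_mul(1u64 << self.attempts.min(16));
                self.attempts += 1;
                NextStep::RetryAfterMs(delay)
            }
        }
    }

    /// A successful connect restores the retry budget.
    pub fn on_connected(&mut self) {
        if !self.latched {
            self.attempts = 0;
        }
    }

    /// Explicit operator intervention is the only way out of the latch.
    pub fn operator_reset(&mut self) {
        self.latched = false;
        self.attempts = 0;
    }
}
```

Keeping the latch inside the policy means the transport loop cannot accidentally retry after a credential mismatch, whatever error it sees next.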
## Scope
### Included
- UDS client (`tokio::net::UnixStream`).
- `SO_PEERCRED` check (Linux; on macOS dev hosts, log a warning and proceed for development purposes only — production target is Jetson Linux).
- Pre-send size + format validation.
- Reconnect state machine (bounded).
- Bounded request deadline.
### Excluded
- VlmAssessment schema validation (task 35).
- Provider trait wiring (task 33).
## Acceptance Criteria
**AC-1: Happy path against fixture NanoLLM**
Given a fixture NanoLLM process listening on a UDS path with correct peer-cred
When `connect` is called and then `assess(roi, "is this concealed?")` is called
Then `connect` returns Ok; `assess` returns `VlmAssessment { status: Ok, label, confidence, .. }` within ≤5 s.
**AC-2: Peer-cred mismatch hard-fails connect**
Given a fixture peer with wrong UID
When `connect` is called
Then it returns `Err(PeerCredMismatch)`; subsequent connect attempts are blocked until config-driven intervention (no automatic retry); health → red.
**AC-3: Oversize ROI rejected pre-send**
Given an ROI larger than `max_roi_bytes`
When `assess(...)` is called
Then it returns `VlmAssessment { status: SchemaInvalid, .. }` synchronously without writing to the socket.
**AC-4: Response timeout returns explicit status**
Given a fixture NanoLLM that never responds within 5 s
When `assess(...)` is called
Then it returns `VlmAssessment { status: Timeout, .. }` after ≤5 s; subsequent requests are not blocked.
## Non-Functional Requirements
**Performance**
- Per-ROI latency: ≤5 s p99 (per `description.md §8`).
**Reliability**
- No network egress (hard rule — UDS only).
- Peer-cred mismatch never silently retried.
## Runtime Completeness
- **Named capability**: NanoLLM/VILA1.5-3B IPC over UDS + peer-cred enforcement.
- **Production code that must exist**: real UDS connection; real `SO_PEERCRED`; real pre-send validation.
- **Allowed external stubs**: a Python NanoLLM stub script in tests that echoes a canned response.
- **Unacceptable substitutes**: TCP to localhost instead of UDS is unacceptable (violates the no-network-egress rule).
@@ -0,0 +1,72 @@
# VlmAssessment Schema Validation + Model-Version Tracking
**Task**: AZ-674_vlm_client_schema_and_model_version
**Name**: VlmAssessment schema validation + model_version tracking + status enum coverage
**Description**: Validate every NanoLLM response against the `VlmAssessment` schema. On schema-invalid, return `status: schema_invalid` + log the raw response (size-capped) for offline analysis. Capture `model_version` on every assessment for forensic correlation; log on change.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-673_vlm_client_nanollm_ipc
**Component**: vlm_client
**Tracker**: AZ-674
**Epic**: AZ-631
## Problem
The NanoLLM process emits free-form text, but the autopilot consumes ONLY a validated structured `VlmAssessment`. Schema-invalid responses MUST not propagate as malformed evidence — they're returned as `status: schema_invalid` with the raw response logged size-capped for offline analysis. Model-version capture supports forensic correlation when an assessment's quality is later disputed.
## Outcome
- `VlmAssessmentParser::parse(raw_response) -> VlmAssessment` validates the response against the schema; on failure returns `VlmAssessment { status: SchemaInvalid, .. }` and logs the raw response (size-capped to e.g. 4 KB) at warn level.
- `model_version` field is populated on every assessment from the NanoLLM-reported version; changes are logged at info level once per change.
- Status enum exhaustively covers `Ok | Inconclusive | Timeout | SchemaInvalid | IpcError | Disabled`; consumer match-exhaustion is enforced by the type.
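Two helpers from the list above lend themselves to a short sketch: capping the logged raw response and detecting a model-version change exactly once. The names `capped_excerpt` and `ModelVersionTracker` are hypothetical, and the sketch assumes the very first observed version does not count as a change.

```rust
/// Return at most `max_bytes` of `raw`, cut on a UTF-8 character boundary
/// so the excerpt is always valid text for the log.
pub fn capped_excerpt(raw: &str, max_bytes: usize) -> &str {
    if raw.len() <= max_bytes {
        return raw;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !raw.is_char_boundary(end) {
        end -= 1;
    }
    &raw[..end]
}

/// Remembers the last reported model version.
#[derive(Default)]
pub struct ModelVersionTracker {
    last: Option<String>,
}

impl ModelVersionTracker {
    /// Returns `true` only when `version` differs from the previously seen
    /// one; the caller logs at info level on `true`.
    pub fn observe(&mut self, version: &str) -> bool {
        let changed = match &self.last {
            Some(prev) => prev.as_str() != version,
            None => false, // assumption: first sighting is not a change
        };
        if changed || self.last.is_none() {
            self.last = Some(version.to_string());
        }
        changed
    }
}
```

Cutting on a character boundary matters because NanoLLM output is free-form text; slicing a `&str` in the middle of a multi-byte character would panic.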
## Scope
### Included
- Schema definition in `shared/contracts/vlm-assessment.json` (or equivalent Rust schema).
- Parser implementation.
- Model-version change detection.
- Size-capped raw-response logging.
### Excluded
- The UDS transport (task 34).
- Provider trait wiring (task 33).
## Acceptance Criteria
**AC-1: Valid response parses successfully**
Given a fixture NanoLLM response with all required fields
When `parse(raw)` runs
Then it returns `VlmAssessment { status: Ok, label, confidence, model_version, .. }`.
**AC-2: Schema-invalid response returns schema_invalid + logs**
Given a fixture response missing a required field
When `parse(raw)` runs
Then it returns `VlmAssessment { status: SchemaInvalid, .. }` and the raw response excerpt (size-capped) is observable in log output.
**AC-3: Model version change logged once**
Given an assessment with `model_version = "v1.0"` followed by another with `model_version = "v1.1"`
When the change is detected
Then a single log entry observes the change; subsequent assessments with `v1.1` do NOT re-log.
**AC-4: Status enum is exhaustive**
Given consumer code that matches on `VlmAssessment.status`
When a new variant is added (compile-time)
Then the compiler forces handling of the new variant; no `_ => …` catch-all in the policy code-path.
## Non-Functional Requirements
**Performance**
- Schema validation: ≤2 ms.
**Reliability**
- Schema mismatches never silent.
## Contract
- Canonical typed model: `data_model.md §VlmAssessment`. Schema lives at `shared/contracts/vlm-assessment.json`.
## Runtime Completeness
- **Named capability**: VlmAssessment schema validation + model-version awareness.
- **Production code that must exist**: real schema validator; real model-version tracker.
- **Unacceptable substitutes**: silently mapping a schema-invalid response to `status: Ok` with placeholder fields is unacceptable.
@@ -0,0 +1,68 @@
# Telemetry gRPC Server + Per-Client Lossy Subscriber
**Task**: AZ-675_telemetry_stream_grpc_server
**Name**: Tonic gRPC server bind + per-client lossy subscriber bounded queue
**Description**: Bring up the operator-bound telemetry gRPC server (Tonic). Per-client subscriber has a bounded queue. Slow clients drop oldest, count drops; never block the producer.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-649_mission_executor_telemetry_forwarding, AZ-657_frame_ingest_rtsp_session
**Component**: telemetry_stream
**Tracker**: AZ-675
**Epic**: AZ-637
## Problem
`telemetry_stream` is the operator-bound publisher for `TelemetrySample`, `GimbalState`, `DetectionEvent`, `MovementCandidate`, `MapObjectsBundle`. Throttling MUST be lossy and per-client so a slow client never starves a healthy one. The server runs over the operator-link gRPC channel: the same physical transport as `operator_bridge`, but a separate logical service.
## Outcome
- Tonic gRPC server bound on `telemetry.listen_addr` exposing a single subscribe-style streaming RPC per topic (or a multiplex RPC).
- Each connected client has a `(bounded_queue, drop_counter, last_sent_seq)` state.
- Producer fan-out copies (refcount where possible) the message into each subscriber's queue. Full queue → drop oldest, increment `drops_total{client_id, topic}`.
- Disconnects cleanly tear down the subscriber.
- Health surface: `subscribed_clients`, `drops_total{client_id, topic}`, `bytes_out_per_topic`.
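The drop-oldest behaviour described above reduces to a bounded deque per subscriber plus a fan-out loop. The sketch below leaves out Tonic entirely and uses hypothetical names (`LossyQueue`, `fan_out`); messages are shared through `Arc` so fan-out clones a pointer rather than the payload, in the spirit of the "refcount where possible" note.

```rust
use std::collections::VecDeque;
use std::sync::Arc;

/// One subscriber's bounded queue. When full, the oldest entry is discarded.
pub struct LossyQueue<T> {
    buf: VecDeque<T>,
    cap: usize,
    drops_total: u64,
}

impl<T> LossyQueue<T> {
    pub fn new(cap: usize) -> Self {
        assert!(cap > 0, "queue capacity must be positive");
        LossyQueue { buf: VecDeque::with_capacity(cap), cap, drops_total: 0 }
    }

    /// Never blocks: a full queue loses its oldest item and counts the drop.
    pub fn push(&mut self, item: T) {
        if self.buf.len() == self.cap {
            self.buf.pop_front();
            self.drops_total += 1;
        }
        self.buf.push_back(item);
    }

    pub fn pop(&mut self) -> Option<T> {
        self.buf.pop_front()
    }

    pub fn drops_total(&self) -> u64 {
        self.drops_total
    }
}

/// Producer side: hand the same message to every subscriber queue.
pub fn fan_out<T>(queues: &mut [LossyQueue<Arc<T>>], msg: Arc<T>) {
    for q in queues.iter_mut() {
        q.push(Arc::clone(&msg));
    }
}
```

Since each subscriber owns its queue and counter, a paused client only ever loses its own backlog, which is the isolation AC-2 checks.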
## Scope
### Included
- Tonic server bind + cleanup.
- Per-client subscriber state.
- Drop-oldest back-pressure.
- Disconnect handling.
### Excluded
- The .proto schema (lives in `shared/contracts/telemetry-stream.proto`; if absent, add it as a side-effect of this task).
- Diff-based snapshot emission for `MapObjectsBundle` (task 38).
- Operator commands (lives in `operator_bridge` component).
## Acceptance Criteria
**AC-1: Multiple subscribers receive the same stream**
Given 3 clients subscribed to `TelemetrySample`
When 100 samples are published
Then each client receives all 100 (assuming no slowness); ordering preserved.
**AC-2: Slow subscriber drops oldest, healthy unaffected**
Given client A reads slowly and client B reads at full speed
When producer pushes 1000 samples while A is paused
Then client A's queue grows up to `max_queue` and then drops oldest (drops_total{A} > 0); client B receives all 1000.
**AC-3: Disconnect cleanly removes subscriber**
Given a connected client
When the gRPC stream is canceled
Then `subscribed_clients` decrements by 1; producer fan-out no longer copies to that client.
## Non-Functional Requirements
**Performance**
- Per-message fan-out CPU: ≤2 ms p99 for ≤10 clients (per architecture NFR class).
- Tx tail latency end-to-end (producer → wire) ≤100 ms p95 over a healthy link.
**Reliability**
- No producer-side blocking on slow clients (hard rule).
## Runtime Completeness
- **Named capability**: Tonic gRPC operator telemetry stream with lossy per-client throttling.
- **Production code that must exist**: real gRPC server; real per-client subscriber state machine; real drop counters.
- **Allowed external stubs**: an in-process gRPC client in tests.
- **Unacceptable substitutes**: a single global queue (head-of-line blocking) is unacceptable.
@@ -0,0 +1,65 @@
# Video Path Selection (Forward RTSP vs Encoded Bytes) + AI-Lock Coordination
**Task**: AZ-676_telemetry_stream_video_path
**Name**: Operator-bound video path: forward RTSP URL OR carry encoded bytes; coordinate with frame_ingest ai_locked signal
**Description**: Two delivery modes for the operator video path (config-driven): (1) forward the RTSP URL to the operator (most common), (2) carry encoded bytes over the operator gRPC stream. Coordinate with `frame_ingest`'s `ai_locked` signal so AI inference is suppressed only while operator-led control occupies the frame budget.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-657_frame_ingest_rtsp_session, AZ-675_telemetry_stream_grpc_server
**Component**: telemetry_stream
**Tracker**: AZ-676
**Epic**: AZ-637
## Problem
The operator sees the camera feed. Two modes are supported because some operator stacks attach to the RTSP source directly (lower onboard cost, recommended default), and others need bytes carried over the same operator-link channel (no separate RTSP socket to the operator).
When the operator-bound feed is active (either mode), `ai_locked` MUST be `true` so Tier-1 inference does not run on the same frames the operator is actively driving. The mechanism is a shared `Arc<AtomicBool>` (or equivalent) that `telemetry_stream` toggles on session start/stop and that `frame_ingest` (task 18) and `detection_client` (task 21) read.
## Outcome
- Config flag `video_path = "rtsp_forward" | "bytes_inline"`; default `rtsp_forward`.
- `rtsp_forward`: emit the canonical RTSP URL as part of the session-start telemetry.
- `bytes_inline`: take frames from `frame_ingest`'s broadcast channel and forward bytes to subscribed operator clients.
- `ai_locked` shared flag plumbed at startup; flipped to `true` while at least one operator session is consuming the video path, `false` otherwise.
- Health surface: `video_path_mode`, `ai_locked_state`, `bytes_inline_drops_total`.
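The session-count gating of `ai_locked` described above might be wired as follows. `VideoSessionGate` and its methods are hypothetical names. The count lives behind a `Mutex` and the flag is written while the lock is held, so a concurrent start and stop cannot leave the flag out of step with the count; readers such as `frame_ingest` only need the `Arc<AtomicBool>`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Tracks active operator video sessions and mirrors "count > 0" into the
/// shared `ai_locked` flag.
pub struct VideoSessionGate {
    active: Mutex<usize>,
    ai_locked: Arc<AtomicBool>,
}

impl VideoSessionGate {
    pub fn new(ai_locked: Arc<AtomicBool>) -> Self {
        VideoSessionGate { active: Mutex::new(0), ai_locked }
    }

    /// Called when an operator client starts consuming the video path.
    pub fn session_started(&self) {
        let mut n = self.active.lock().unwrap();
        *n += 1;
        self.ai_locked.store(true, Ordering::SeqCst);
    }

    /// Called when a client disconnects; the last one out clears the flag.
    pub fn session_ended(&self) {
        let mut n = self.active.lock().unwrap();
        *n = n.saturating_sub(1);
        if *n == 0 {
            self.ai_locked.store(false, Ordering::SeqCst);
        }
    }

    pub fn active_sessions(&self) -> usize {
        *self.active.lock().unwrap()
    }
}
```

Consumers would poll the flag with `ai_locked.load(Ordering::SeqCst)` on their own hot path, without ever touching the mutex.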
## Scope
### Included
- Both modes (rtsp_forward + bytes_inline).
- ai_locked toggle wiring.
- Session-tracking (active client count gating ai_locked).
### Excluded
- RTSP server stream itself (it's owned by the camera; we just forward the URL).
- `frame_ingest` reading the flag (task 18 owns that).
- Snapshot/diff for MapObjects (task 38).
## Acceptance Criteria
**AC-1: rtsp_forward mode emits URL only**
Given `video_path = "rtsp_forward"` and a client subscribes
When the session starts
Then the client receives the configured RTSP URL in the session-start message; no bytes are streamed by this component.
**AC-2: bytes_inline forwards encoded frames**
Given `video_path = "bytes_inline"` and a client subscribes
When `frame_ingest` publishes 100 frames
Then the client receives all 100 (modulo bounded-queue drops handled by task 36).
**AC-3: ai_locked toggles on session start/stop**
Given no operator session is active (`ai_locked = false`)
When the first client subscribes to the video stream
Then `ai_locked` flips to `true`; when all clients disconnect, `ai_locked` flips back to `false`.
## Non-Functional Requirements
**Performance**
- bytes_inline: frame copy cost ≤2 ms p99 per frame on Jetson Orin Nano.
- AI-lock toggle latency: ≤50 ms from subscribe → flag flip.
## Runtime Completeness
- **Named capability**: operator video path (dual mode) + ai_locked coordination.
- **Production code that must exist**: both modes; real ai_locked atomic wired to consumers.
- **Unacceptable substitutes**: rtsp_forward that doesn't actually emit the URL (or bytes_inline that doesn't read frame_ingest) is unacceptable.
@@ -0,0 +1,62 @@
# Pre-Flight MapObjects Snapshot + In-Flight Diffs + Reconnect Resync
**Task**: AZ-677_telemetry_stream_mapobjects_snapshot
**Name**: MapObjects bundle: pre-flight snapshot + in-flight diff stream + reconnect re-snapshot
**Description**: Emit a full `MapObjectsBundle` snapshot on operator client connect/reconnect, then stream diff messages as the store appends new observations / ignored items. On client reconnect after disconnect, emit a fresh snapshot rather than trying to replay diffs.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-675_telemetry_stream_grpc_server, AZ-667_mapobjects_store_hydrate_and_pending
**Component**: telemetry_stream
**Tracker**: AZ-677
**Epic**: AZ-637
## Problem
The operator views the live map state. Sending the entire `MapObjectsBundle` on every change is wasteful, but streaming diffs without a baseline forces the operator to recover from missing state on reconnect. The pattern: snapshot on connect, then diffs while connected. On disconnect-then-reconnect, treat as fresh client → re-snapshot. No best-effort gap-filling.
## Outcome
- On client subscribe to `MapObjectsBundle` topic: read current store state via `MapObjectsStore::snapshot()`; emit one `MapObjectsBundleSnapshot` message.
- During the session: subscribe to the store's append log (pending_observations + pending_ignored streams); emit `MapObjectsDiff { added: [...], moved: [...], removed_candidates: [...], ignored: [...] }` messages.
- On client disconnect: drop the subscriber.
- On reconnect: treat as new subscribe; emit a fresh snapshot. NO diff replay.
- Health: `mapobjects_snapshot_bytes`, `mapobjects_diff_count`, `mapobjects_resnap_count`.
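The outcome above can be sketched as follows, under several assumptions: entries are reduced to `u64` ids, the `moved` and `removed_candidates` fields are omitted for brevity, and `Hub`, `Message`, `subscribe`, and `append` are hypothetical names, not the real `MapObjectsStore` API. The sketch uses an unbounded `std::sync::mpsc` channel, whereas the real per-client queue is bounded and owned by another task. Taking the snapshot and registering the subscriber under one lock ensures no append can fall between the snapshot and the first diff.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Simplified wire messages (hypothetical shape).
#[derive(Clone, Debug, PartialEq)]
pub enum Message {
    Snapshot { objects: Vec<u64>, ignored: Vec<u64> },
    Diff { added: Vec<u64>, ignored: Vec<u64> },
}

#[derive(Default)]
struct Inner {
    objects: Vec<u64>,
    ignored: Vec<u64>,
    subscribers: Vec<Sender<Message>>,
}

/// Stand-in for the store plus its fan-out to connected clients.
#[derive(Default)]
pub struct Hub {
    inner: Mutex<Inner>,
}

impl Hub {
    /// New or reconnecting client: always starts with a full snapshot.
    pub fn subscribe(&self) -> Receiver<Message> {
        let mut inner = self.inner.lock().unwrap();
        let (tx, rx) = channel();
        let snapshot = Message::Snapshot {
            objects: inner.objects.clone(),
            ignored: inner.ignored.clone(),
        };
        // Cannot fail here: `rx` is still alive in this scope.
        let _ = tx.send(snapshot);
        inner.subscribers.push(tx);
        rx
    }

    /// Store append: update state, then fan the diff out.
    pub fn append(&self, added: Vec<u64>, ignored: Vec<u64>) {
        let mut inner = self.inner.lock().unwrap();
        inner.objects.extend(&added);
        inner.ignored.extend(&ignored);
        let diff = Message::Diff { added, ignored };
        // A failed send means the client disconnected: drop the subscriber.
        // Nothing is buffered for it, so there is no diff replay.
        inner.subscribers.retain(|tx| tx.send(diff.clone()).is_ok());
    }
}
```

Because a disconnected subscriber is simply removed, a reconnect goes through `subscribe` again and receives the current state in one message.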
## Scope
### Included
- Snapshot emission on subscribe.
- Diff stream from store append log.
- Re-snapshot on reconnect.
### Excluded
- Store implementation (task 28).
- Per-client subscriber state machine (task 36).
## Acceptance Criteria
**AC-1: First subscribe receives snapshot**
Given a store with 50 MapObjects + 10 IgnoredItems hydrated
When a client subscribes to the MapObjectsBundle topic
Then it receives exactly one `MapObjectsBundleSnapshot` containing all 50 MapObjects and all 10 IgnoredItems.
**AC-2: In-flight changes emit diffs**
Given a connected client
When 3 new observations and 1 ignored item are appended to the store
Then the client receives one or more `MapObjectsDiff` messages whose combined contents = `{added: 3, ignored: 1}`.
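Since AC-2 allows the changes to arrive in one or more messages, a test would have to sum across diffs before comparing. A possible helper is sketched below. The field names follow the `MapObjectsDiff` shape listed under Outcome, but the `u64` element type, `DiffTotals`, and `totals` are assumptions made for illustration.

```rust
/// Diff message with the four fields named in the task spec.
/// Element types are placeholders (ids only).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct MapObjectsDiff {
    pub added: Vec<u64>,
    pub moved: Vec<u64>,
    pub removed_candidates: Vec<u64>,
    pub ignored: Vec<u64>,
}

/// Per-field entry counts accumulated over several diffs.
#[derive(Debug, Default, PartialEq)]
pub struct DiffTotals {
    pub added: usize,
    pub moved: usize,
    pub removed_candidates: usize,
    pub ignored: usize,
}

/// Sum the entry counts of every diff the client received.
pub fn totals(diffs: &[MapObjectsDiff]) -> DiffTotals {
    diffs.iter().fold(DiffTotals::default(), |mut acc, d| {
        acc.added += d.added.len();
        acc.moved += d.moved.len();
        acc.removed_candidates += d.removed_candidates.len();
        acc.ignored += d.ignored.len();
        acc
    })
}
```

With this, the assertion does not depend on how the server batches appends into messages.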
**AC-3: Reconnect re-snapshots**
Given a client disconnected mid-session and the store grew by 5 entries while disconnected
When the client reconnects
Then the client receives a fresh `MapObjectsBundleSnapshot` reflecting the current state; NO diff replay.
## Non-Functional Requirements
**Performance**
- Snapshot serialization: ≤200 ms p99 for ≤10 000 MapObjects.
- Diff fan-out: ≤2 ms p99 per append.
## Runtime Completeness
- **Named capability**: snapshot + diff transport for MapObjects.
- **Production code that must exist**: real snapshot emission; real diff streaming; real re-snapshot on reconnect.
- **Unacceptable substitutes**: emitting full snapshots on every change (wastes bandwidth) or replaying diffs across reconnect (consistency hazard) are both unacceptable.

Some files were not shown because too many files have changed in this diff.