[AZ-626] Decompose complete: 47 tasks + docs + module layout

Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy Qt/C++ to a Rust workspace.

- Remove legacy Qt/C++ tree (ai_controller, drone_controller, misc/camera, python_scaffold, root Dockerfile, autopilot.pro, legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria, security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model, glossary, decision-rationale, deployment, 13 component descriptions, tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed, ready for Step 7 Implement).

Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks AZ-640..AZ-686.
Total complexity 173 points across 12 epics.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-21 14:41:09 +00:00 · 2026-05-19 11:02:01 +03:00
parent f7d6cb4a3a
commit bc40ea7300
235 changed files with 12585 additions and 15097 deletions
@@ -0,0 +1,58 @@
+# Input Data
+
+Runtime inputs the autopilot consumes when flying, plus reference fixtures + expected-output assertions for tests. **All fixtures live inside this workspace** (`fixtures/`) — never reach into sibling repos at `../` for inputs. The autopilot repo is self-sufficient.
+
+## Layout
+
+| Path | Owns |
+|---|---|
+| `data_parameters.md` | Description of runtime input shapes (camera, telemetry, gRPC, mission JSON, operator commands, VLM IPC) + the categories of reference data tests need + Tier-1/Tier-2 class catalogue. |
+| `services.md` | Per-external-service test-mock requirements: what shape of mock/fixture each of the 7 external systems needs and the acquisition status of each. |
+| `fixtures/README.md` | File-by-file manifest of every fixture in this directory: SHA-256, size, upstream provenance, which `expected_results/results_report.md` rows consume it. |
+| `fixtures/images/` | Real aerial frames (5 images, ~9 MB total) — Tier-1 inputs for detection-quality assertions (L1, D2, D6). |
+| `fixtures/videos/` | Real reconnaissance video (1 clip, 12 MB) for frame-rate floor + sequence tests (T3). |
+| `fixtures/movement/` | Wide-area movement-detection visual reference clips (4 clips, ~23 MB total). **No paired `gimbal.csv` / `telemetry.csv`** — ego-motion compensation (M1–M4) cannot run against these alone. |
+| `fixtures/semantic/` | Concealed-position semantic reference frames (4 PNGs, ~11 MB total) + `data_parameters.md` describing the new YOLO primitive classes the examples motivate. **Starter set only**, not a graded eval set. |
+| `fixtures/schemas/` | Detection-result contract schemas (JSON + JSON-schema) for D6. |
+| `fixtures/sql/` | Database init script — reference only; not directly asserted by an autopilot AC. |
+| `expected_results/results_report.md` | The input → quantifiable-expected-output mapping consumed by `/test-spec` Phase 1. Every row keys off an AC in `../acceptance_criteria.md`; deferred rows carry a structured `<DEFERRED: <shape>; ref <pointer>>` tag. |
+
+## Why fixtures are local
+
+The autopilot repo MUST be self-sufficient — a developer with only the autopilot clone (no parent suite checked out) MUST be able to run the test specifications. Cross-repo `../` paths are forbidden in `results_report.md` and in any test runner script. When a sibling repo (`../detections/`, `../e2e/`, `../missions/`, etc.) is the upstream source of a fixture, we **copy** it in and SHA-pin it in `fixtures/README.md` so upstream drift is detectable.
+
+## Suite-level coupling that still matters
+
+Even though fixtures are local, the underlying contracts the fixtures embody come from suite-level decisions. When those decisions change, the fixtures here go stale:
+
+- **Tier-1 detection model / classes** — when `../detections` ships a new model the `expected_detections.json` baseline goes stale; D1, D2, D6 rows in `results_report.md` must be re-recorded.
+- **`mission-schema`** — shared between autopilot and the `missions` repo. Schema changes break the mission JSON contract; the mock fixtures for Mp1–Mp5 (when authored) must re-pin.
+- **Detection classes catalogue** — class IDs 0..18 are governed at the suite level. Autopilot's normalised-box output uses the same IDs. The 5 new Tier-1 classes documented in `data_parameters.md → "Class catalogue"` must land in the suite catalogue before D1 can be measured.
+
+Today these couplings are tracked manually. The `monorepo-e2e` skill at the suite root will eventually own the drift detection.
+
+## Fixture gaps and the project policy on `/test-spec` Phase 3
+
+`/test-spec` Phase 3 has a **hard 75% coverage gate** on rows with real input fixtures + real expected results. Today's coverage is well below that gate (see `expected_results/results_report.md → "Coverage Status"`). **Project policy as of 2026-05-19**: rather than block the autodev flow at the gate, each deferred row is registered with a structured `<DEFERRED: <shape>; ref <pointer>>` tag in `results_report.md`, pointing at the per-service acquisition path in `services.md` or at an open architecture question (Q-tag). Deferred rows become **release-gate items**, not development-gate items. The `acceptance_criteria.md → "Acceptance Gates (project-level)"` hardware/replay benchmark requirement remains a hard release blocker.
+
+Summary of open gaps (authoritative list lives in `services.md` and `fixtures/README.md`):
+
+1. **Paired `gimbal.csv` + `telemetry.csv` for the 4 movement clips** — highest priority (blocks M1–M4 + tightens L6/L7). **User-confirmed unavailable today (2026-05-19).**
+2. Annotated multi-season eval set (concealed positions + footpaths).
+3. Mock `missions` API exchanges + mock `/mapobjects` round-trip.
+4. Mock Ground Station session traces.
+5. ArduPilot SITL traces.
+6. Operator-command envelopes (blocked on Q9).
+7. VLM I/O pairs.
+8. GPS / NTP drift scripts.
+
+Closing each gap is its own workstream tracked in Jira; the autodev flow does not block on them.
+
+## Adding new fixtures
+
+1. Drop the file under `fixtures/<images|videos|movement|semantic|schemas|sql|gimbal|telemetry|mavlink|vlm|operator|mapobjects>/<descriptive-name>.<ext>` — create the subdirectory if it does not exist.
+2. Compute SHA-256 (`shasum -a 256 <file>`).
+3. Add a row to the matching subsection in `fixtures/README.md` (file path, size, SHA, upstream provenance, which `results_report.md` rows consume it).
+4. Replace the matching `<DEFERRED: ...>` placeholder(s) in `expected_results/results_report.md` with the local path `fixtures/<...>`.
+5. If the fixture replaces a service mock, also update `services.md → "Coverage summary by service"` to reflect the new acquisition status.
+6. If the fixture is binary and large (> 50 MB), consider gitignoring it and adding an acquisition script per the e2e pattern; for everything in the current set, direct commit is fine.
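
Steps 2 and 3 of the list above can be scripted. The following is a minimal sketch, assuming Python is available in the tooling environment. `sha256_of`, `manifest_row`, and `verify_pin` are hypothetical helper names, and the row layout only mirrors the fields listed in step 3; it is not the actual column layout of `fixtures/README.md`.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large clips are never fully loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_row(path: Path, provenance: str, consumed_by: list) -> str:
    """Render one Markdown table row (hypothetical column order: path, size, SHA, provenance, rows)."""
    size = path.stat().st_size
    rows = ", ".join(consumed_by)
    return f"| `{path.as_posix()}` | {size} B | `{sha256_of(path)}` | {provenance} | {rows} |"


def verify_pin(path: Path, pinned_sha: str) -> bool:
    """Return True when the file on disk still matches the SHA pinned in the manifest."""
    return sha256_of(path) == pinned_sha.strip().lower()
```

`sha256_of` yields the same hex digest that `shasum -a 256 <file>` prints, so either tool can populate the manifest; `verify_pin` is the kind of check a drift detector could run against each pinned value.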
@@ -0,0 +1,101 @@
+# Input Data Parameters
+
+Describes the **categories of input data** the system consumes at runtime, and the **categories of reference data** tests need. Internal component names, programming languages, IPC mechanisms, schema class names, and specific model choices are design concerns and live in `_docs/02_document/architecture.md`; they do not belong in this file (per `.cursor/rules/artifact-srp.mdc`).
+
+Local fixtures live in `fixtures/`; see `fixtures/README.md` for the manifest. External-service test-mock requirements live in `services.md`; the per-row binding to AC criteria lives in `expected_results/results_report.md`.
+
+## Runtime inputs (what the system consumes when flying)
+
+| Input | Source | Format | Cadence | Notes |
+|---|---|---|---|---|
+| Camera frames | ViewPro A40 (or alternative ViewPro Z40K) | H.264 / H.265 over RTSP, 1080p (1920×1080) | 30 / 60 fps | Frame timestamps are mandatory. |
+| Primitive (Tier 1) detection responses | `../detections` service over a bi-directional streaming RPC contract | Bounding boxes with class id, confidence, normalised coordinates | Per frame | Same boxes feed Tier-2 ROI selection and the operator overlay. |
+| UAV telemetry | Airframe via MAVLink v2 (UDP or serial) | MAVLink messages: position, attitude, velocity, battery, link health, GPS fix | ≥1 Hz (10 Hz target) | Source-of-truth for ego-motion compensation. |
+| Gimbal feedback | ViewPro A40 vendor protocol over UDP | Yaw / pitch / zoom angle telemetry | per-tick | Source-of-truth for camera-pose compensation. |
+| Mission JSON | `missions` service via HTTPS REST | Shared `mission-schema` JSON | Once at mission start + middle-waypoint updates | Validated against the shared schema. |
+| Area-level map state | `missions` service extension `/missions/{id}/mapobjects` (GET) | Map-object records keyed by spatial cell | Once at mission start | Hydrates the system's local copy of the area map; cache-fallback on timeout. |
+| Operator commands | Ground Station via modem (return path of the outbound telemetry stream) | Authenticated + signed + replay-protected command envelope (scheme open per Q9) | Event-driven | confirm / decline / target-follow start / target-follow release / abort. |
+| Deep-analysis responses (optional) | Local-onboard model accessed via local IPC | Structured assessment schema (validated) | Per zoomed-in endpoint hold (when deep-analysis is enabled) | Schema-violation fails closed. |
+
+## Class catalogue (Tier-1 + Tier-2)
+
+Detection-quality acceptance criteria (`acceptance_criteria.md → Detection Quality`) are evaluated against a class catalogue that combines pre-existing suite-level classes with new autopilot-driven additions. Class IDs are governed at the suite level (`../detections` owns the catalogue); autopilot only consumes the IDs.
+
+### New Tier-1 (YOLO primitive) classes — to be added to the suite catalogue
+
+| # | Class name | Annotation hint | Motivated by |
+|---|---|---|---|
+| 1 | Black entrances | Bounding box; various sizes (small hideout openings to dugout entrances) | Concealed-position detection (D3, D4) |
+| 2 | Branch piles | Bounding box | Concealment material around hideouts (D3, D4) |
+| 3 | Footpaths | **Polyline / segmentation preferred over bbox** for linear features | Footpath recall gate (D5) |
+| 4 | Roads | Polyline / segmentation | Distinguishing roads from footpaths in the same scene |
+| 5 | Trees / tree blocks | Bounding box; tree-block annotation may use larger box for clusters | Concealment-context anchor; reduces false positives around tree-rows in movement detection (M1) |
+
+### Tier-2 semantic attributes — composed by `semantic_analyzer`, NOT added to YOLO catalogue
+
+| # | Attribute | Composed from | Used by |
+|---|---|---|---|
+| 1 | Footpath freshness (fresh / stale) | Footpath bbox + texture/edge analysis + seasonal context | Decision-window scoring, D5 partial coverage |
+| 2 | Concealed-structure inference | Black-entrance + branch-piles + footpath-approach proximity | POI surfacing for D3/D4 (the structure itself is composed, not directly labelled) |
+| 3 | Open clearing connected to path | Cleared-terrain texture + footpath endpoint | FPV-launch-point flagging |
+
+### Existing classes (already in the suite catalogue)
+
+The existing-class baseline (P=0.816, R=0.852 per the AC) covers the suite's pre-autopilot class set (vehicles, military equipment, etc.). Autopilot must not degrade these — see D2.
+
+### Reference for IDs
+
+The 19-id catalogue (0..18) is owned by `../detections`. Autopilot's normalised-box output uses the same IDs. When `../detections` ships a new model or renumbers IDs, the `expected_detections.json` baseline goes stale and D1, D2, D6 rows must be re-recorded.
+
+## Reference data needed for testing
+
+### Local fixtures already on disk
+
+See `fixtures/README.md` for the SHA-pinned manifest. Categorised summary:
+
+| Local fixture category | Files | Purpose | Bound to AC rows |
+|---|---|---|---|
+| `fixtures/images/*.jpg` | 5 aerial frames | Tier-1 detection contract; existing-class regression; normalised-box conformance | L1, D2, D6 |
+| `fixtures/videos/94d42580bd1ad6ff.mp4` | 1 reconnaissance clip | Frame-rate floor scenario; also reserved for future movement-sequence tests | T3 |
+| `fixtures/schemas/expected_detections.{json,schema.json}` | 2 schema files | Detection-result contract shape reference | D6 |
+| `fixtures/sql/init.sql` | 1 SQL file | Suite-e2e DB seed reference | (suite-only; no autopilot AC) |
+| `fixtures/movement/video0[1-4].mp4` | 4 wide-area clips | Visual reference for movement-detection scenarios — **no paired telemetry CSVs**, ego-motion assertions unfalsifiable until those land | M1–M4 (visual reference only) |
+| `fixtures/semantic/semantic0[1-4].png` | 4 reference frames | Visual reference for concealed-position semantic targets — **starter set only, not a graded eval set** | D3, D4, D5 (starter only) |
+
+### Reference shapes still needed but not yet on disk
+
+The per-service mock catalogue is in `services.md` (authoritative). Summary of categories tests need:
+
+| Reference shape | Why it's needed | See |
+|---|---|---|
+| Frame sequences with synchronised `gimbal.csv` + `telemetry.csv` | Ego-motion compensation at zoom-out AND zoomed-in inspection | `services.md §6 Gimbal telemetry CSV` |
+| Concealed-position image set across all four seasons (annotated) | Concealed-position recall ≥60% and precision ≥20% | `services.md §5 Camera frame sequences` |
+| Footpath sequences (fresh, stale, all four seasons, polyline-annotated) | Footpath recall ≥70% | `services.md §5` |
+| New-class evaluation set (5 new classes above) | New-class per-class P/R ≥80% without degrading existing-class performance | `services.md §1 Tier-1 detection replay` (plus annotation campaign owned by `../ai-training` repo) |
+| Mock Tier-1 streaming-RPC replays | Detection-consumer isolation tests | `services.md §1` |
+| Mock Ground Station session traces | Lost-link failsafe ladder + operator-link reconnect | `services.md §3` |
+| MAVLink SITL traces | MAVLink conformance + waypoint insertion + geofence enforcement | `services.md §4` |
+| Mock central area-map service responses | Pre-flight pull / post-flight push round-trip; conflict cases (Q8) | `services.md §2` |
+| Operator-command envelopes | Signature + replay-protection tests (once Q9 resolves) | `services.md §8` |
+| VLM I/O pairs | Bounded ROI inputs + structured assessment outputs + schema-violation cases | `services.md §7` |
+| GPS / NTP drift scenarios | Wall-clock drift health-yellow gate | `services.md §9` |
+
+## Data volume targets
+
+- Training data: hundreds to thousands of annotated images/sequences total.
+- Seasonal coverage: winter (snow), spring (mud), summer (vegetation), autumn (mixed leaf + partial snow).
+- Available assembly effort: 1.5 months at 5 hours/day.
+- Movement detection requires **frame sequences** (not still images only) with synchronised camera + gimbal + UAV telemetry.
+- Footpaths require polyline or segmentation annotation rather than bounding boxes (see "Class catalogue" above).
+
+## Gaps that block `/test-spec` downstream
+
+`/test-spec` Phase 1 will pass on prerequisite existence (`expected_results/results_report.md` is non-empty). Phase 3 has a **hard 75% coverage gate** on rows with real input fixtures + real expected results.
+
+**Current coverage state** (re-computed 2026-05-19 after fixture restoration):
+
+- Rows bound to real local fixtures: L1, D2, D6, T3 (~4 rows) — these are also the rows whose fixtures were restored on 2026-05-19 from sibling repos.
+- Rows bound to **starter-only** fixtures (insufficient on their own): D3, D4, D5 (semantic PNGs), M1–M4 (movement videos without CSV).
+- Rows still deferred for fixture acquisition: see `fixtures/README.md → "Gaps still pending fixture acquisition"` and `services.md` for the authoritative list.
+
+**Project policy on the Phase 3 gate**: rather than block `/test-spec` at the 75% gate, the autodev flow registers each deferred row with a structured `<DEFERRED: <shape>; ref <pointer>>` tag in `expected_results/results_report.md`. Test-spec authoring proceeds; deferred rows become release-gate items, not development-gate items. The `acceptance_criteria.md` project-level gate ("MUST pass before product implementation begins") still applies for the hardware/replay benchmark; that remains a hard release blocker, not deferred.
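
The coverage arithmetic behind the Phase 3 gate can be illustrated as follows. This is a sketch under stated assumptions, not the actual `/test-spec` implementation: a row is treated as uncovered whenever its Input cell carries a `<DEFERRED:` tag, the 75% gate is treated as inclusive, and `coverage_report` plus the abbreviated sample cells are hypothetical.

```python
GATE = 0.75  # hard Phase 3 coverage gate quoted in the section above


def is_deferred(input_cell: str) -> bool:
    # Assumption: any "<DEFERRED:" tag in the Input cell means the row is not yet covered.
    return "<DEFERRED:" in input_cell


def coverage_report(rows: dict) -> dict:
    """rows maps a row id (e.g. "L1") to the text of its Input cell."""
    deferred = sorted(rid for rid, cell in rows.items() if is_deferred(cell))
    covered = len(rows) - len(deferred)
    ratio = covered / len(rows) if rows else 0.0
    return {
        "coverage": ratio,
        "passes_gate": ratio >= GATE,  # assumption: the gate is inclusive
        "release_gate_items": deferred,  # deferred rows are tracked, not blocking
    }


# Abbreviated sample cells, for illustration only.
sample = {
    "L1": "`fixtures/images/4d6e1830d211ad50.jpg`",
    "T3": "`fixtures/videos/94d42580bd1ad6ff.mp4` replayed with throttled decode",
    "L3": "`<DEFERRED: bounded ROI crop; ref services.md §7>`",
    "O1": "`<DEFERRED: synthetic POI at confidence = 0.40; inline-authorable>`",
}
print(coverage_report(sample))
```

Under this reading, rows that pair a local clip with a deferred CSV (such as the movement rows) also count as uncovered, which matches the "starter-only" classification above.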
@@ -0,0 +1,153 @@
+# Expected Results
+
+Maps every quantifiable acceptance criterion from `_docs/00_problem/acceptance_criteria.md` to an input fixture + a measurable expected result. Consumed by `/test-spec` Phase 1.
+
+Per `.cursor/rules/artifact-srp.mdc`, this file uses **role / observable-behaviour language**, not internal component slugs. The system's externally observable behaviour is what's tested. Implementation names (component slugs, libraries, model names) live in `_docs/02_document/`.
+
+**Fixture sourcing**: all fixtures live in `fixtures/` (sibling-repo `../` paths are forbidden). Where no fixture exists yet, the `Input` cell carries a structured `<DEFERRED: <shape>; ref services.md §N>` tag. Phase 3 has a hard 75% coverage gate — the autodev flow registers deferred rows as release-gate items rather than blocking on the gate; see `data_parameters.md → "Gaps that block /test-spec downstream"`.
+
+**Comparison vocabulary**: see `.cursor/skills/test-spec/templates/expected-results.md` for canonical methods (`exact`, `numeric_tolerance`, `threshold_min`, `threshold_max`, `range`, `regex`, `substring`, `set_contains`, `json_diff`, `file_reference`).
+
+**Deferred-tag legend**: `<DEFERRED: <shape>; ref <pointer>>` where `<pointer>` is a section in `../services.md` (per-service mock requirements), an open architecture question (e.g. `Q9`), or `inline-authorable` (no external dependency — just not yet written).
+
+---
+
+## Latency
+
+Source ACs: `acceptance_criteria.md → Latency`.
+
+| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+|---|---|---|---|---|---|---|
+| L1 | `fixtures/images/4d6e1830d211ad50.jpg` | Single 1280 px aerial frame consumed through the Tier-1 contract; measure end-to-end | per-frame end-to-end latency | threshold_max | ≤ 100 ms | N/A |
+| L2 | derived ROI ~640×640 from `fixtures/images/4d6e1830d211ad50.jpg` (inline-cropped by the test runner) | Tier-2 semantic confirmation over a single ROI | per-ROI latency | threshold_max | ≤ 200 ms | N/A |
+| L3 | `<DEFERRED: bounded ROI crop matching the deep-analysis input contract; ref services.md §7>` | Tier-3 deep-analysis (when enabled) local-IPC call | per-ROI call latency | threshold_max | ≤ 5000 ms | N/A |
+| L4 | `<DEFERRED: SITL or hardware-in-loop ViewPro A40 zoom command (medium→high); ref services.md §5>` | A40 physical zoom transition | wall-clock transition duration | threshold_max | ≤ 2000 ms | N/A |
+| L5 | `<DEFERRED: scripted scan decision event followed by camera physical motion; ref services.md §3, §5>` | Decision-to-movement latency end-to-end | wall-clock decision→motion duration | threshold_max | ≤ 500 ms | N/A |
+| L6 | `fixtures/movement/video01.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv; ref services.md §6>` | Movement candidate enqueue at the wide-area sweep | detection→enqueue duration | threshold_max | ≤ 1000 ms | N/A |
+| L7 | `fixtures/movement/video02.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv at zoomed-in band; ref services.md §6>` | Movement candidate enqueue during zoomed inspection | detection→enqueue duration | threshold_max | ≤ 1500 ms | N/A |
+| L8 | `<DEFERRED: full sweep → zoomed-inspection transition (POI detected → ROI fully zoomed); ref services.md §3, §5>` | Scan-mode transition including physical zoom | wall-clock transition | threshold_max | ≤ 2000 ms | N/A |
+| L9 | `<DEFERRED: scripted operator-click → outbound command emitted by the system (modem RTT excluded); ref services.md §3>` | Operator command → action latency | wall-clock click→outbound | threshold_max | ≤ 500 ms | N/A |
+
+## Throughput / Rate
+
+Source ACs: `acceptance_criteria.md → Throughput / Rate`.
+
+| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+|---|---|---|---|---|---|---|
+| T1 | `<DEFERRED: long synthetic POI feed sustained above the cap (e.g. 20 POIs/min); inline-authorable>` | Cap enforcement on POIs surfaced to operator | POI rate surfaced | threshold_max | ≤ 5 / min | N/A |
+| T2 | `<DEFERRED: airframe MAVLink telemetry replay over a 60 s window; ref services.md §4>` | Position telemetry consumed from the airframe link | reported position rate | range | 1 Hz ≤ rate ≤ 10 Hz (10 Hz target) | N/A |
+| T3 | `fixtures/videos/94d42580bd1ad6ff.mp4` replayed with throttled-decode + frame-drop injection to drop below 10 fps for ≥5 s | Frame-rate floor trigger | zoom-in transitions suppressed AND overall health surfaces yellow | exact (suppression bool) + exact (health = yellow) | N/A | N/A |
+
+## Detection Quality
+
+Source ACs: `acceptance_criteria.md → Detection Quality`. Evaluation runs against the Tier-1 detection pipeline that the system consumes; autopilot's role is correct consumption + re-emission of the normalised-box contract. Class catalogue (5 new Tier-1 classes + 3 Tier-2 attributes) is defined in `../data_parameters.md → "Class catalogue"`.
+
+| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+|---|---|---|---|---|---|---|
+| D1 | `<DEFERRED: new-class eval set across all four seasons (black entrances, branch piles, footpaths, roads, trees, tree blocks); ref services.md §1, annotation campaign in ../ai-training>` | Per-class precision/recall for added classes | per-class precision ≥ 0.80 AND recall ≥ 0.80 | threshold_min (both) | N/A | `<DEFERRED: expected_results/new_classes_pr.json>` |
+| D2 | `fixtures/images/{4d6e1830d211ad50,54f6459dbddb93d8,6dd601b7d2dc1b30,805bcf1e9f271a58,f997d0934726b555}.jpg` (5 frames) | Existing-class regression — must not degrade vs documented baseline P=0.816, R=0.852 | per-class precision + recall delta vs baseline | numeric_tolerance | ± 0.02 absolute | `<DEFERRED: expected_results/existing_classes_baseline.json — to be recorded against the pinned ../detections model>` |
+| D3 | `fixtures/semantic/semantic0[1-4].png` (4 starter frames — 1 winter, 3 unmarked season) + `<DEFERRED: full multi-season annotated concealed-position set; ref services.md §5>` | Concealed-position recall (initial gate, accepting high FP) | recall | threshold_min | ≥ 0.60 | `<DEFERRED: expected_results/concealed_positions.json>` |
+| D4 | Same as D3 | Concealed-position precision (operators filter) | precision | threshold_min | ≥ 0.20 | same as D3 |
+| D5 | `fixtures/semantic/semantic0[1-4].png` (all 4 feature footpaths leading to concealment — starter set) + `<DEFERRED: footpath sequences (fresh + stale, all four seasons), polyline-annotated; ref services.md §5>` | Footpath recall | recall | threshold_min | ≥ 0.70 | `<DEFERRED: expected_results/footpaths.json>` |
+| D6 | `fixtures/images/4d6e1830d211ad50.jpg` | Single-frame Tier-1 contract — system must consume the bbox stream and re-emit normalised-box format | output box stream conforms to the suite-level class catalogue (ids 0..18) + normalised coordinates ∈ [0,1] | schema_match + range | each coord ∈ [0,1] | `fixtures/schemas/expected_detections.schema.json` |
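
The range part of row D6 (class ids 0..18, every coordinate in [0, 1]) could be checked with something like the sketch below. The `Box` field names are hypothetical; the authoritative shape is whatever `fixtures/schemas/expected_detections.schema.json` defines, and this sketch does not replace schema validation.

```python
from dataclasses import dataclass

MAX_CLASS_ID = 18  # suite-level catalogue ids are 0..18


@dataclass(frozen=True)
class Box:
    # Hypothetical field names, chosen for illustration only.
    class_id: int
    confidence: float
    x: float
    y: float
    w: float
    h: float


def violations(box: Box) -> list:
    """Return human-readable problems; an empty list means the box passes the range checks."""
    problems = []
    if not 0 <= box.class_id <= MAX_CLASS_ID:
        problems.append(f"class_id {box.class_id} outside 0..{MAX_CLASS_ID}")
    for name in ("x", "y", "w", "h"):
        value = getattr(box, name)
        # NaN fails this comparison too, so it is reported as out of range.
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name}={value} outside [0, 1]")
    return problems
```

A D6 assertion would then require `violations(box) == []` for every box the system re-emits.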
+
+## Movement Detection Behaviour
+
+Source ACs: `acceptance_criteria.md → Movement Detection`. Latency aspects (L6, L7) live under Latency.
+
+**Note**: M1–M4 each have a visual-reference video on disk but NO paired `gimbal.csv` / `telemetry.csv`. Ego-motion compensation cannot be verified against these videos — the visual binding is provided so a smoke harness can run, but the assertions in this section require the deferred CSVs to be meaningful. User confirmed 2026-05-19: paired CSVs do not exist today.
+
+| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+|---|---|---|---|---|---|---|
+| M1 | `fixtures/movement/video01.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv; scene must contain 1 stable tree row + 1 moving vehicle; ref services.md §6>` | Ego-motion compensation: stable objects rejected | system emits exactly 1 movement candidate (the vehicle); does NOT emit a candidate for the tree row | set_contains | candidate set == {vehicle}; tree row ∉ candidate set | N/A |
+| M2 | `fixtures/movement/video02.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv at zoomed-in band; 1 small mover; ref services.md §6>` | Movement detection continues during zoomed-in hold | system enqueues 1 candidate while the camera is in the zoomed-in hold; current ROI is not preempted unless the candidate's priority exceeds it | exact | 1 candidate enqueued; ROI preempt decision matches priority rule | N/A |
+| M3 | `fixtures/movement/video03.mp4` (visual reference) + `<DEFERRED: paired gimbal.csv + telemetry.csv simulating per-zoom-band threshold edge (cluster persistence one frame below threshold); ref services.md §6>` | Per-zoom-band threshold honoured (no false candidate) | no candidate emitted | exact | count == 0 | N/A |
+| M4 | `fixtures/movement/video04.mp4` (visual reference) + `<DEFERRED: zoom-out + zoomed-in benchmark suite measuring false-positive rate at each band; ref services.md §6, Q14>` | Movement zoomed-in benchmark gate (Q14 fallback trigger) | false-positive rate per zoom band | threshold_max | ≤ per-zoom-band budget (configurable; default ≤ 0.5 / minute at zoomed-in) | `<DEFERRED: expected_results/movement_benchmark_caps.json>` |
+
+## Scan & Camera Control Behaviour
+
+Source ACs: `acceptance_criteria.md → Scan and Camera Control`.
+
+| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+|---|---|---|---|---|---|---|
+| S1 | `<DEFERRED: scripted mission with planned route + simulated POI detected mid-sweep; ref services.md §3, §4>` | Sweep → zoomed-inspection transition within 2 s (L8) AND POI properly enqueued | transition completes; ROI matches POI bbox; queue length increments | exact (multiple) | N/A | N/A |
+| S2 | `<DEFERRED: zoomed-inspection hold scenario with footpath polyline overlapping the ROI; ref services.md §5, §6>` | Camera lock + pan along footpath while airframe flies | camera commands keep the footpath in the centre 50% of frame for the duration of the hold | numeric_tolerance | centre offset ≤ 25% per frame | N/A |
+| S3 | `<DEFERRED: operator-confirmed target + 60 s follow window; ref services.md §3>` | Target-follow centre-window | target inside centre 25% of frame while visible | threshold_max | per-frame \|dx\|, \|dy\| ≤ 0.125 × frame_size | N/A |
+| S4 | `<DEFERRED: queue with 3 POIs at varied confidence × proximity scores; inline-authorable>` | POI queue ordering | system pops POIs in order of `confidence × proximity × age_factor` (relative order matches) | exact (order) | N/A | N/A |
+| S5 | `<DEFERRED: hold endpoint with deep-analysis enabled — assessment returns within 2 s; ref services.md §7>` | Zoomed-in hold timeout default 5 s/POI; deep-analysis hold capped at 2 s | hold ends at min(5 s, deep_analysis_complete) | exact | N/A | N/A |
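
Rows S2 and S3 both constrain the target's offset from the frame centre: a centre window covering 50% of each dimension allows an offset of up to 25%, and a 25% window allows up to 12.5%. A minimal sketch of that per-frame check, assuming pixel coordinates with the origin at the top-left corner (`within_centre_window` is a hypothetical helper):

```python
def within_centre_window(target_xy, frame_wh, window_fraction):
    """True when the target lies inside a centred window spanning
    window_fraction of the frame width and height."""
    (x, y), (width, height) = target_xy, frame_wh
    half = window_fraction / 2.0          # 0.25 window -> 0.125 allowed offset
    dx = abs(x - width / 2.0)
    dy = abs(y - height / 2.0)
    return dx <= half * width and dy <= half * height


frame = (1920, 1080)
# S3: centre 25% of a 1920x1080 frame allows |dx| <= 240 px and |dy| <= 135 px.
print(within_centre_window((1200, 540), frame, 0.25))
# S2: centre 50% of the same frame allows |dx| <= 480 px and |dy| <= 270 px.
print(within_centre_window((1300, 540), frame, 0.50))
```

The sketch interprets "centre 25% of frame" per dimension rather than by area, which is the reading implied by the `0.125 × frame_size` tolerance in S3.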
+
+## Operator Workflow
+
+Source ACs: `acceptance_criteria.md → Operator Workflow`.
+
+| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+|---|---|---|---|---|---|---|
+| O1 | `<DEFERRED: synthetic POI at confidence = 0.40; inline-authorable>` | Confidence-scaled decision window lower bound | window duration | exact | 30 s | N/A |
+| O2 | `<DEFERRED: synthetic POI at confidence = 1.00; inline-authorable>` | Confidence-scaled decision window upper bound | window duration | exact | 120 s | N/A |
+| O3 | `<DEFERRED: synthetic POI at confidence = 0.70; inline-authorable>` | Linear interpolation (40% → 30 s, 100% → 120 s) | window duration ≈ 30 + (0.70-0.40)/(1.00-0.40) × (120-30) = 75 s | numeric_tolerance | ± 0.5 s | N/A |
+| O4 | `<DEFERRED: synthetic POI at confidence = 0.39; inline-authorable>` | Below-threshold suppression | POI NOT surfaced to operator | exact | count surfaced == 0 | N/A |
+| O5 | `<DEFERRED: surfaced POI followed by operator decline event; inline-authorable>` | Decline → ignored-item entry persisted | ignored-item appended with `(MGRS, class_group)` matching the declined POI | exact (count delta +1) + schema_match | N/A | N/A |
+| O6 | `<DEFERRED: new detection whose (MGRS, class_group) matches an existing ignored-item; inline-authorable>` | Ignored-item suppression | POI NOT surfaced | exact | count surfaced == 0 | N/A |
+| O7 | `<DEFERRED: surfaced POI + no operator response for longer than the decision window; inline-authorable>` | Timeout = forget (NOT blacklisted) | POI removed from queue; no ignored-item written | exact (queue −1) + exact (ignored-item count unchanged) | N/A | N/A |
+| O8 | `<DEFERRED: operator confirm command — valid + signed + within sequence; ref services.md §3, §8 (Q9)>` | Confirm → middle waypoint inserted; mode transitions to target-follow | mission update POSTed; scan-mode reports target-follow | exact (HTTP 200) + exact (mode) | N/A | N/A |
+| O9 | `<DEFERRED: replayed operator command — same envelope a second time; ref services.md §8 (blocked on Q9)>` | Replay protection | command rejected; security WARN logged; no state change | exact (state unchanged) + substring (log contains "replay") | N/A | N/A |
+| O10 | `<DEFERRED: malformed / unsigned operator command; ref services.md §8 (blocked on Q9)>` | Signature validation | command rejected; security WARN logged | exact (state unchanged) + substring (log contains "invalid") | N/A | N/A |
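
Rows O1 to O4 describe one piecewise rule: no window below 40% confidence, then a linear ramp from 30 s at 40% to 120 s at 100%. A worked sketch of that arithmetic follows; `decision_window_s` is a hypothetical name, and clamping confidences above 1.00 is an assumption the table does not state.

```python
def decision_window_s(confidence, lo_conf=0.40, hi_conf=1.00, lo_s=30.0, hi_s=120.0):
    """Return the operator decision window in seconds, or None when the POI
    is below the surfacing threshold (row O4)."""
    if confidence < lo_conf:
        return None
    clamped = min(confidence, hi_conf)  # assumption: values above 1.00 are clamped
    fraction = (clamped - lo_conf) / (hi_conf - lo_conf)
    return lo_s + fraction * (hi_s - lo_s)


for c in (0.39, 0.40, 0.70, 1.00):
    print(c, decision_window_s(c))
```

For confidence 0.70 the ramp gives 30 + 0.5 × 90 = 75 s, matching row O3 within its ± 0.5 s tolerance.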
+
+## Reliability & Safety
+
+Source ACs: `acceptance_criteria.md → Reliability & Safety` + lost-link failsafe ladder.
+
+| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+|---|---|---|---|---|---|---|
+| R1 | `<DEFERRED: BIT scenario — every dependency healthy; inline-authorable>` | Pre-flight self-test passes | health endpoint returns all green; takeoff permitted | exact (state) + exact (health.all == "green") | N/A | N/A |
+| R2 | `<DEFERRED: BIT scenario — Tier-1 detection unreachable; inline-authorable>` | BIT fails the takeoff gate | takeoff NOT permitted; detection dependency reports red | exact (takeoff inhibited) | N/A | N/A |
+| R3 | `<DEFERRED: BIT scenario — persistent-store ≥95% full; inline-authorable>` | Storage floor BIT failure | takeoff NOT permitted; storage dependency reports red | exact (takeoff inhibited) | N/A | N/A |
+| R4 | `<DEFERRED: in-flight operator/Ground-Station modem-link loss + 30 s elapsed; ref services.md §3, §4>` | Lost-link failsafe ladder (default 30 s grace → RTL) | system issues RTL 30 s after link loss (within tolerance); operator-link dependency reports red | exact (RTL command at 30 s ± 1 s) | ± 1 s | N/A |
+| R5 | `<DEFERRED: mid-flight battery sample at RTL-floor (e.g. 25%); ref services.md §4>` | RTL trigger | system issues RTL; health → yellow | exact (RTL command) + exact (health == yellow) | N/A | N/A |
+| R6 | `<DEFERRED: mid-flight battery sample at hard-floor (e.g. 15%); ref services.md §4>` | Land-now trigger (only operator-overridable) | system issues land-now | exact (land_now command) | N/A | N/A |
+| R7 | `<DEFERRED: airframe link command + simulated bounded retry/backoff with peer not responding through max-retries; ref services.md §4>` | Watchdog flips health red on exhaustion | airframe-link dependency reports red after configured max-retry | exact (health == red) | N/A | N/A |
+| R8 | `<DEFERRED: wall-clock drift > 200 ms simulation (GPS lock present, NTP disabled); ref services.md §9>` | Drift alarm | time-source dependency reports yellow; `clock_source` + `last_sync_at` reflect the drift | exact (health == yellow) | N/A | N/A |
+| R9 | `<DEFERRED: geofence EXCLUSION polygon crossed by simulated waypoint; ref services.md §4>` | Symmetric geofence enforcement | waypoint refused; RTL triggered | exact (waypoint rejected) + exact (RTL) | N/A | N/A |
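
Rows R4 to R6 describe a small decision ladder, which the sketch below illustrates. The names `Action` and `failsafe_action` are hypothetical, and the default thresholds (25 %, 15 %, 30 s) are taken from the "e.g." values in the table; this is not the autopilot's real interface.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    RTL = "rtl"
    LAND_NOW = "land_now"


def failsafe_action(battery_pct, link_lost_s,
                    rtl_floor=25.0, hard_floor=15.0, link_grace_s=30.0):
    """Return the most severe action the inputs call for (hypothetical helper)."""
    # The hard battery floor wins over everything else (R6).
    if battery_pct <= hard_floor:
        return Action.LAND_NOW
    # RTL floor (R5) or an expired lost-link grace period (R4) both trigger RTL.
    if battery_pct <= rtl_floor or link_lost_s >= link_grace_s:
        return Action.RTL
    return Action.NONE
```

An inline-authored fixture for R5 could then be as small as one battery sample plus the expected `Action.RTL`.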
+
+## Resources & Data
+
+Source ACs: `acceptance_criteria.md → Resources & Data`.
+
+| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+|---|---|---|---|---|---|---|
+| Re1 | `<DEFERRED: long-running scenario — system's full onboard workload active for 5 min, monitored via process RSS; inline-authorable harness>` | Onboard memory budget (everything autopilot owns, excluding Tier 1) | combined RSS on the deployed compute device | threshold_max | ≤ 6 GB | N/A |
+| Re2 | Same as Re1 with concurrent Tier-1 traffic | Tier-1 non-degradation | Tier-1 ms/frame delta vs baseline (L1) | numeric_tolerance | ± 5 ms | N/A |
+
+## Map Reconciliation
+
+Source ACs: `acceptance_criteria.md → Map Reconciliation`.
+
+| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
+|---|---|---|---|---|---|---|
+| Mp1 | `<DEFERRED: mock central area-map service — 30 km × 30 km region, ~10000 map objects; ref services.md §2>` | Pre-flight pull | wall-clock GET → local copy hydrated | threshold_max | ≤ 30 s | N/A |
+| Mp2 | `<DEFERRED: same mock but unreachable (timeout); ref services.md §2>` | Cache-fallback path | system falls back to last-known cached state; reports `map_sync == "cached_fallback"`; operator MUST acknowledge before takeoff | exact (state) + exact (BIT requires explicit ack) | N/A | N/A |
+| Mp3 | `<DEFERRED: simulated 60-minute mission pass diff (~5000 NEW + ~2000 MOVED + ~500 REMOVED + ~10000 CONFIRMED-EXISTING); ref services.md §2>` | Post-flight push | wall-clock POST → 200 OK | threshold_max | ≤ 120 s | N/A |
+| Mp4 | `<DEFERRED: same as Mp3 but POST returns 5xx; ref services.md §2>` | Persist-on-disk + bounded retry | pending diff written to on-device storage; operator-visible warning surfaced; retry attempts logged | exact (file exists) + exact (warning surfaced) + threshold_max (retries ≤ configured cap) | N/A | N/A |
+| Mp5 | `<DEFERRED: two map updates with conflicting state for same (spatial-cell, class_group) — append-only log scenario; ref services.md §2, Q8>` | Conflict-resolution rule (Q8 placeholder) | append-only observation log + computed current view; conflict resolution per documented rule | json_diff | N/A | `<DEFERRED: expected_results/mapobjects_conflict_resolution.json — pending Q8>` |
+
+---
+
+## Coverage Status (auto-recomputed 2026-05-19)
+
+- **Total rows**: 56 (L1–L9, T1–T3, D1–D6, M1–M4, S1–S5, O1–O10, R1–R9, Re1–Re2, Mp1–Mp5).
+- **Fully bound to real fixtures**: L1, T3, D2, D6 = **4 rows (~7%)**.
+- **Bound to derived inline fixture** (no external acquisition needed): L2 = **+1 row (5 total, ~9%)**.
+- **Bound to starter/partial fixtures** (visual reference only — assertions need additional deferred inputs to be meaningful): D3, D4, D5, M1, M2, M3, M4 = **+7 rows (12 total partial, ~21%)**.
+- **Inline-authorable but not yet authored** (no external dependency — can be unblocked anytime by writing the fixture): T1, S4, O1–O7, R1–R3, R8, Re1, Re2 = **15 rows (~27%)**. Lifting these alone would bring effective coverage to ~48%.
+- **Blocked on external acquisition** (real recordings, SITL, annotated eval sets, mock services): L3–L9 (minus L6/L7 partial), T2, D1, M1–M4 (CSV pairs), S1, S2, S3, S5, R4–R7, R9, Mp1–Mp5 = **~24 rows (~43%)**.
+- **Blocked on architecture questions**: O9, O10 (Q9), M4 (Q14), Mp5 (Q8) = **4 rows**, plus O8, which depends on Q9 only partially.
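
The percentages in the list above are row counts divided by the 56-row total. The sketch below redoes that arithmetic; the `pct` helper and the bucket labels are illustrative only.

```python
TOTAL_ROWS = 56


def pct(rows, total=TOTAL_ROWS):
    """Share of the spec's rows, rounded to a whole percent."""
    return round(100 * rows / total)


# Row counts copied from the bullets above; cumulative where the bullet says so.
buckets = {
    "fully bound to real fixtures": 4,
    "plus derived inline fixture (cumulative)": 5,
    "plus starter/partial fixtures (cumulative)": 12,
    "inline-authorable, not yet authored": 15,
    "coverage after lifting inline-authorable rows": 12 + 15,
    "blocked on external acquisition (approx.)": 24,
}

for name, rows in buckets.items():
    print(f"{name}: {rows} rows (~{pct(rows)}%)")
```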
+
+**Decision (project policy)**: rather than block on the Phase 3 75% gate, each deferred row is now registered with a structured `<DEFERRED:>` tag and surfaces in `data_parameters.md → "Gaps that block /test-spec downstream"`. `/test-spec` Phase 2 can author scenarios for all 56 rows; deferred rows become **release-gate items**, not development-gate items. The `acceptance_criteria.md → "Acceptance Gates (project-level)"` hardware/replay benchmark requirement is preserved as the hard release gate — that one is NOT being deferred.
+
+## Notes on this spec
+
+- Every row carries a quantifiable comparison + tolerance — no row is "should work".
+- Where the AC depends on hardware (the deployed compute device, ViewPro A40), the test must run on representative hardware OR a benchmarked replay; pure-emulator runs are NOT acceptable for L1–L9, T1–T3, Re1–Re2.
+- Where the AC depends on an external service (`../detections`, `missions`, Ground Station), the test runs against either (a) the real service in the suite e2e (`../e2e/docker-compose.suite-e2e.yml`), or (b) a recorded replay fixture for isolation tests. Both modes are valid; the test scenario states which.
+- Q-tagged rows (M4 → Q14, Mp5 → Q8, O8–O10 → Q9) depend on open architecture questions. Their tolerance ranges may sharpen once those questions resolve; the existence of each row is non-negotiable.
+- M1–M4 visual-reference bindings (`fixtures/movement/video0[1-4].mp4`) are usable for harness smoke testing but DO NOT satisfy the assertion semantics — paired `gimbal.csv` + `telemetry.csv` are required for ego-motion compensation to be verifiable. This is the single highest-priority fixture gap.
@@ -0,0 +1,90 @@
+# Fixture manifest
+
+All fixtures live **inside this workspace** so the autopilot repo is self-sufficient — downstream test runners must never reach into a sibling repo at `../`. When you add or refresh a fixture, update the matching SHA-256 in this manifest AND the rows in `../expected_results/results_report.md` that consume it.
+
+Total on-disk size: ~57 MB.
+
+## Files
+
+### Still-image aerial frames — `images/`
+
+Used as Tier-1 input frames for detection-quality assertions.
+
+| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
+|---|---|---|---|---|
+| `images/4d6e1830d211ad50.jpg` | 152 KB | `4c396495af64aaf9aac5ecb92431bf0c75db42b0bdb8e4eec1937f9995acee42` | `../detections/data/images/` (re-copied 2026-05-19) | L1, D6 |
+| `images/54f6459dbddb93d8.jpg` | 6.7 MB | `cd65c76a080ef72ce3528031f003f067fca6091c067a86d527a1ae91cd78be59` | `../detections/data/images/` (re-copied 2026-05-19) | D2 |
+| `images/6dd601b7d2dc1b30.jpg` | 1.4 MB | `45edd83a357a9f852e14e5845265cd09c20b4b99b1828c160cb3298f0e160181` | `../detections/data/images/` (re-copied 2026-05-19) | D2 |
+| `images/805bcf1e9f271a58.jpg` | 176 KB | `fe696899225fc04f2335e87acf6a3ad8a00cd3950c5940d5e73e5ce438f36257` | `../detections/data/images/` (re-copied 2026-05-19) | D2 |
+| `images/f997d0934726b555.jpg` | 232 KB | `5d1c9c551c0680e5b3d0aab261bca71e724c78f6db3580da598c680b4f7d4d79` | `../detections/data/images/` (re-copied 2026-05-19) | D2 |
+
+### Reconnaissance video — `videos/`
+
+| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
+|---|---|---|---|---|
+| `videos/94d42580bd1ad6ff.mp4` | 12 MB | `602b22a42515a754313551847caa6d6a6d7b3cde1d857cbd08ebc5543fb8cf7c` | `../detections/data/videos/` (re-copied 2026-05-19) | T3 (frame-rate floor scenario) |
+
+### Movement-detection clips — `movement/`
+
+Wide-area reconnaissance clips intended for movement-detection visual baselines. **Important**: these clips DO NOT have paired `gimbal.csv` / `telemetry.csv` files — ego-motion compensation assertions (M1–M4) cannot run against them. They are useful for visual harness work, frame-count assertions, and as visual reference for the movement-detection scenarios.
+
+| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
+|---|---|---|---|---|
+| `movement/video01.mp4` | 5.3 MB | `6f37186f5e9be97109db8d0d220df96d21cac9ce5b50b576234c6f7ee369d2bb` | local; provenance pre-existing in workspace | M1 (visual reference only — no telemetry) |
+| `movement/video02.mp4` | 5.9 MB | `7de7981e511e21e1e72f506d44541b44a4c27a995c9505ef8e3b48e69b416367` | local; provenance pre-existing in workspace | M2 (visual reference only — no telemetry) |
+| `movement/video03.mp4` | 6.1 MB | `df441164da7f37d715968212b95e9bf53c8e37384f20ddfab61cd6d0d18b4f3a` | local; provenance pre-existing in workspace | M3 (visual reference only — no telemetry) |
+| `movement/video04.mp4` | 5.8 MB | `36445bf1c86c5afa524000b5b2da7fc9cb3d39c745f9ad830b3d60f6868948e7` | local; provenance pre-existing in workspace | M4 (visual reference only — no telemetry) |
+
+### Semantic reference frames — `semantic/`
+
+Annotated reference examples for concealed-position semantic targets. **Not a graded eval set** — these are 4 hand-picked examples of footpath-to-concealment patterns, intended as visual reference for what the system should recognise. Detection-quality gates (D1, D3, D4, D5) need a full annotated multi-season eval set; these 4 PNGs are insufficient for those gates and serve as starter reference only.
+
+| File | Size | SHA-256 | Description | `results_report.md` rows |
+|---|---|---|---|---|
+| `semantic/semantic01.png` | 3.1 MB | `339ad4d35ab36052828f05652ab7249801bcd5d7bb04522f0ab9cbf6f0ca008a` | Footpath leading to branch-pile hideout in winter forest | D3, D4, D5 (starter only — full multi-season set still required) |
+| `semantic/semantic02.png` | 5.1 MB | `ffe3c49f5f1833724ce46083d212e714422e664b635cdd48b63311adefcd7b1f` | Footpath to FPV launch clearing, branch mass at forest edge | D3, D4, D5 (starter only) |
+| `semantic/semantic03.png` | 1.0 MB | `ce89c139815e9a80679237008f7cfc3039bbd53f162d48017e840ff91e57b109` | Footpath to squared hideout structure | D3, D4, D5 (starter only) |
+| `semantic/semantic04.png` | 1.3 MB | `b25c689b7aa543ec15858e4b5edfa32387ced4930130eb280d952c555f104e69` | Footpath terminating at tree-branch concealment | D3, D4, D5 (starter only) |
+| `semantic/data_parameters.md` | 2 KB | n/a (text) | Description of the four reference examples + the new YOLO primitive classes that motivate them | reference only |
+
+### Detection contract schemas — `schemas/`
+
+| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
+|---|---|---|---|---|
+| `schemas/expected_detections.json` | 1.4 KB | `ce60c105d697efe0359d2e6b1b46fc63e53d3789b067d53501f9c76aad9bd1ae` | `../e2e/fixtures/` (re-copied 2026-05-19) | D6 (sample Tier-1 response) |
+| `schemas/expected_detections.schema.json` | 2.4 KB | `a7174e0b083dcbf42fa8672acd3e1807d11ea0629cc636ff958a4d77168733b9` | `../e2e/fixtures/` (re-copied 2026-05-19) | D6 (JSON-schema for the Tier-1 contract) |
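
The schema's `bbox_samples` entries are centre-based boxes (`center_x`, `center_y`, `width`, `height`), and the baseline's `tolerance.bbox_iou_min` sets the IoU floor. A minimal sketch of that comparison follows; the helper names are hypothetical and it assumes both boxes share one coordinate space.

```python
def to_corners(box):
    """Convert a centre-based box dict to (x1, y1, x2, y2)."""
    half_w, half_h = box["width"] / 2, box["height"] / 2
    return (box["center_x"] - half_w, box["center_y"] - half_h,
            box["center_x"] + half_w, box["center_y"] + half_h)


def iou(a, b):
    """Intersection-over-union of two centre-based boxes."""
    ax1, ay1, ax2, ay2 = to_corners(a)
    bx1, by1, bx2, by2 = to_corners(b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a["width"] * a["height"] + b["width"] * b["height"] - inter
    return inter / union if union > 0 else 0.0


def bbox_matches(expected, observed, bbox_iou_min=0.8):
    """True when the observed box overlaps the expected one enough."""
    return iou(expected, observed) >= bbox_iou_min
```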
+
+### Database init script — `sql/`
+
+| File | Size | SHA-256 | Upstream source | `results_report.md` rows |
+|---|---|---|---|---|
+| `sql/init.sql` | 3.7 KB | `b61e452c549f7b006db88d265f4346837e0a33d1abd4d977ebf3d48d8c943439` | `../e2e/fixtures/` (re-copied 2026-05-19) | suite-only reference; no autopilot AC row asserts against this |
+
+## Copy vs reference
+
+Fixtures were COPIED (not moved). The sibling repos still own the originals; keeping autopilot's copy in sync when an upstream fixture changes is a manual chore today (the `monorepo-e2e` skill at the suite root will eventually own this drift; see `_docs/_process_leftovers/` if a sync is pending).
+
+When an upstream fixture changes:
+
+1. Recompute the SHA-256 in the source repo.
+2. Re-copy into the matching `fixtures/` subdirectory here.
+3. Update this manifest's SHA-256 column.
+4. If the change invalidates an assertion in `../expected_results/results_report.md`, fix the row's expected result too — do not let assertions drift silently against new data.
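
The steps above depend on comparing on-disk hashes with the manifest. A minimal sketch of that check follows, assuming the manifest has already been parsed into a `{relative_path: sha256}` mapping; the function names and that mapping are hypothetical, not existing tooling in this repo.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large clips are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_drift(manifest, root):
    """Return relative paths that are missing or whose hash differs from the manifest."""
    drifted = []
    for rel_path, expected in manifest.items():
        target = Path(root) / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            drifted.append(rel_path)
    return drifted
```

Running `find_drift` with `root` set to the `fixtures/` directory would list every entry whose manifest row needs attention.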
+
+## Gaps still pending fixture acquisition
+
+The authoritative per-service acquisition catalogue lives in `../services.md`. Summary of the still-open gaps (each is also tagged on its row in `../expected_results/results_report.md` with a structured `<DEFERRED: ...>` marker, and a `_docs/_process_leftovers/` entry records the replay obligation):
+
+| Gap | What's missing | Blocks AC rows | Acquisition status |
+|---|---|---|---|
+| Paired gimbal+telemetry CSVs for the 4 movement clips | `gimbal.csv` + `telemetry.csv` aligned to each video's frame timestamps | M1–M4, tightens L6/L7 | **Confirmed unavailable today** (user 2026-05-19); requires re-flight or new recording with gimbal-feedback channel captured |
+| Annotated eval set across all four seasons | Hundreds–thousands of labelled images per season for concealed-position + footpath gates | D1, D3, D4, D5 | needs annotation campaign (1.5 months at 5 hrs/day target per `semantic/data_parameters.md`) |
+| Per-zoom-band frame sequences | Same kind of clip as `movement/` but recorded at light, medium, and high zoom bands | tightens M2, L7, S2 | needs flight time + zoom-band metadata in the recorder |
+| Mock `missions` HTTPS exchanges | Recorded JSON request/response pairs for mission GET/POST + mapobjects GET/POST | Mp1–Mp5 | inline-authorable against the `mission-schema`; not yet authored |
+| Mock Ground Station session traces | Scripted timing trace (connect / push / drop / reconnect / lost-link) | R4, O8 | inline-authorable; not yet authored |
+| ArduPilot SITL traces | Recorded MAVLink streams for waypoint upload, geofence INCLUSION + EXCLUSION, RTL on lost-link, RTL on battery floor | R4, R5, R6, R7, R9 + project SITL conformance gate | needs SITL run |
+| Operator-command envelopes | Valid / expired / replayed / malformed envelopes under the chosen Q9 auth scheme | O9, O10 | **blocked on Q9** (`_docs/02_document/architecture.md §8`) |
+| VLM I/O pairs | Bounded ROI in → structured `VlmAssessment` out + schema-violation cases | L3, S5 | inline-authorable against the assessment schema once the local model is pinned |
+| GPS / NTP drift scenarios | Scripted offset / lock-loss traces | R8 | inline-authorable |
+
+When a fixture from this list lands, copy it under `fixtures/<category>/`, add a row to the relevant subsection above, and bind the matching `<DEFERRED>` row in `../expected_results/results_report.md` to its new local path.
@@ -0,0 +1,32 @@
+{
+  "$schema": "./expected_detections.schema.json",
+  "_meta": {
+    "fixture_version": "0.1.0-placeholder",
+    "video": "sample.mp4",
+    "video_sha256": "TBD-after-fixture-recording",
+    "model": {
+      "_comment": "Pinned model + classes that detections must run when this baseline applies. Refresh this block (and counts/bboxes below) whenever detections ships a new model.",
+      "name": "TBD",
+      "revision": "TBD",
+      "classes_source": "annotations/src/Database/DatabaseMigrator.cs (ids 0..18)"
+    },
+    "tolerance": {
+      "_comment": "Spec asserts ranges, not exact values. INT8 calibration drift can move pixel positions by a few units; absolute count can drift by ±1 across re-runs of the same engine on the same Jetson.",
+      "count_delta": 1,
+      "bbox_iou_min": 0.8,
+      "confidence_delta": 0.1
+    }
+  },
+  "expected": {
+    "total_annotations": 0,
+    "by_class": [
+      {
+        "class_id": 0,
+        "class_name": "ArmorVehicle",
+        "count": 0,
+        "bbox_samples": []
+      }
+    ],
+    "_placeholder_note": "Replace this block with the real baseline once sample.mp4 is recorded. Each entry under `by_class` carries: class_id, class_name (must match detection_classes.name), count, and bbox_samples (an array of {time_sec, center_x, center_y, width, height, confidence} entries the spec uses for IoU comparison)."
+  }
+}
@@ -0,0 +1,66 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "title": "Suite e2e expected detections baseline",
+  "type": "object",
+  "required": ["_meta", "expected"],
+  "properties": {
+    "$schema": { "type": "string" },
+    "_meta": {
+      "type": "object",
+      "required": ["fixture_version", "video", "video_sha256", "model", "tolerance"],
+      "properties": {
+        "fixture_version": { "type": "string" },
+        "video": { "type": "string" },
+        "video_sha256": { "type": "string" },
+        "model": {
+          "type": "object",
+          "required": ["name", "revision", "classes_source"],
+          "additionalProperties": true
+        },
+        "tolerance": {
+          "type": "object",
+          "required": ["count_delta", "bbox_iou_min", "confidence_delta"],
+          "properties": {
+            "count_delta": { "type": "integer", "minimum": 0 },
+            "bbox_iou_min": { "type": "number", "minimum": 0, "maximum": 1 },
+            "confidence_delta": { "type": "number", "minimum": 0, "maximum": 1 }
+          }
+        }
+      }
+    },
+    "expected": {
+      "type": "object",
+      "required": ["total_annotations", "by_class"],
+      "properties": {
+        "total_annotations": { "type": "integer", "minimum": 0 },
+        "by_class": {
+          "type": "array",
+          "items": {
+            "type": "object",
+            "required": ["class_id", "class_name", "count"],
+            "properties": {
+              "class_id": { "type": "integer", "minimum": 0 },
+              "class_name": { "type": "string" },
+              "count": { "type": "integer", "minimum": 0 },
+              "bbox_samples": {
+                "type": "array",
+                "items": {
+                  "type": "object",
+                  "required": ["time_sec", "center_x", "center_y", "width", "height"],
+                  "properties": {
+                    "time_sec": { "type": "number", "minimum": 0 },
+                    "center_x": { "type": "number" },
+                    "center_y": { "type": "number" },
+                    "width": { "type": "number", "minimum": 0 },
+                    "height": { "type": "number", "minimum": 0 },
+                    "confidence": { "type": "number", "minimum": 0, "maximum": 1 }
+                  }
+                }
+              }
+            }
+          }
+        }
+      }
+    }
+  }
+}
@@ -0,0 +1,45 @@
+# Semantic And Movement Detection Training Data
+
+# Source
+- Aerial imagery from reconnaissance winged UAVs at 600–1000 m altitude
+- ViewPro A40 camera, 1080p resolution, various zoom levels
+- Extracted from video frames and still images
+- Movement detection requires frame sequences, not still images only; include camera/gimbal telemetry where available to separate target motion from UAV motion.
+
+# Target Classes
+- Footpaths / trails (linear features on snow, mud, forest floor)
+- Fresh footpaths (distinct edges, undisturbed surroundings, recent track marks)
+- Stale footpaths (partially covered by snow/vegetation, faded edges)
+- Concealed structures: branch pile hideouts, dugout entrances, squared/circular openings
+- Tree rows (potential concealment lines)
+- Open clearings connected to paths (FPV launch points)
+- Moving point/cluster candidates at wide or light/medium zoom
+
+# YOLO Primitive Classes (new)
+- Black entrances to hideouts (various sizes)
+- Piles of tree branches
+- Footpaths
+- Roads
+- Trees, tree blocks
+
+# Annotation Format
+- Managed by existing annotation tooling in separate repository
+- Expected: bounding boxes and/or segmentation masks depending on model architecture
+- Footpaths may require polyline or segmentation annotation rather than bounding boxes
+
+# Seasonal Coverage Required
+- Winter: snow-covered terrain (footpaths as dark lines on white)
+- Spring: mud season (footpaths as compressed/disturbed soil)
+- Summer: full vegetation (paths through grass/undergrowth)
+- Autumn: mixed leaf cover, partial snow
+
+# Volume
+- Target: hundreds to thousands of annotated images/sequences
+- Available effort: 1.5 months, 5 hours/day
+- Potential for annotation process automation
+
+# Reference Examples
+- semantic01.png — footpath leading to branch-pile hideout in winter forest
+- semantic02.png — footpath to FPV launch clearing, branch mass at forest edge
+- semantic03.png — footpath to squared hideout structure
+- semantic04.png — footpath terminating at tree-branch concealment
@@ -0,0 +1,104 @@
+-- Suite e2e database seed.
+--
+-- Loaded by the `db-seed` service in docker-compose.suite-e2e.yml after
+-- annotations has run its own DatabaseMigrator (which creates the schema +
+-- inserts the canonical detection_classes 0..18). This file therefore only
+-- adds rows that the e2e scenario depends on but the production runtime does
+-- NOT seed automatically.
+--
+-- Idempotency: every statement uses ON CONFLICT / IF NOT EXISTS so re-running
+-- the seed (e.g. on a `down -v` followed by `up`) lands the same final state.
+--
+-- Schema reference: annotations/src/Database/DatabaseMigrator.cs.
+
+\set ON_ERROR_STOP on
+
+-- Wait until annotations has populated its schema. The db-seed container starts
+-- only after postgres-local is healthy, but annotations may still be spinning
+-- up its tables. A bounded poll keeps the seed deterministic.
+DO $$
+DECLARE
+  attempt int := 0;
+BEGIN
+  WHILE attempt < 60 LOOP
+    PERFORM 1
+    FROM information_schema.tables
+    WHERE table_schema = 'public' AND table_name = 'detection_classes';
+    IF FOUND THEN
+      EXIT;
+    END IF;
+    PERFORM pg_sleep(1);
+    attempt := attempt + 1;
+  END LOOP;
+
+  IF attempt >= 60 THEN
+    RAISE EXCEPTION 'detection_classes table not found after 60s — annotations migration did not complete';
+  END IF;
+END $$;
+
+-- Default system_settings row. Annotations starts without one, but several
+-- spec assertions rely on `silent_detection = false` and known thumbnail dims
+-- so overlay rendering is reproducible.
+INSERT INTO system_settings (
+  id, name, military_unit,
+  default_camera_width, default_camera_fov,
+  thumbnail_width, thumbnail_height, thumbnail_border,
+  generate_annotated_image, silent_detection
+) VALUES (
+  '00000000-0000-0000-0000-00000000aaaa',
+  'azaion-suite-e2e',
+  'e2e-unit',
+  3840, 70,
+  240, 135, 10,
+  true, false
+) ON CONFLICT (id) DO NOTHING;
+
+-- Default directory_settings row. Annotations writes media files under the
+-- paths defined here; the e2e-runner doesn't read these directly but the
+-- service requires the row to exist on first hit.
+INSERT INTO directory_settings (
+  id, videos_dir, images_dir, labels_dir, results_dir,
+  thumbnails_dir, gps_sat_dir, gps_route_dir
+) VALUES (
+  '00000000-0000-0000-0000-00000000bbbb',
+  '/data/videos', '/data/images', '/data/labels', '/data/results',
+  '/data/thumbnails', '/data/gps_sat', '/data/gps_route'
+) ON CONFLICT (id) DO NOTHING;
+
+-- Default camera_settings row used by detections to size bbox-to-meters.
+INSERT INTO camera_settings (
+  id, altitude, focal_length, sensor_width
+) VALUES (
+  '00000000-0000-0000-0000-00000000cccc',
+  100, 50, 36
+) ON CONFLICT (id) DO NOTHING;
+
+-- Stable e2e user. The UUID is referenced by the spec when asserting
+-- annotation rows. Annotations does not own a `users` table — user identity
+-- is carried in JWTs minted with JWT_SECRET; the user_id here just needs to
+-- be deterministic and stable across runs.
+-- Stored in user_settings so the spec can `SELECT user_id` to confirm the
+-- seed ran.
+INSERT INTO user_settings (
+  id, user_id,
+  annotations_left_panel_width, annotations_right_panel_width,
+  dataset_left_panel_width,    dataset_right_panel_width
+) VALUES (
+  '00000000-0000-0000-0000-00000000dddd',
+  '00000000-0000-0000-0000-0000e2e2e2e2',
+  300, 400, 320, 320
+) ON CONFLICT (id) DO NOTHING;
+
+-- Sanity check — fail loudly if the canonical detection_classes are missing.
+-- annotations/src/Database/DatabaseMigrator.cs inserts ids 0..18 unconditionally.
+DO $$
+DECLARE
+  cnt int;
+BEGIN
+  SELECT COUNT(*) INTO cnt FROM detection_classes WHERE id BETWEEN 0 AND 18;
+  IF cnt < 19 THEN
+    RAISE EXCEPTION 'expected canonical detection_classes 0..18 (count=19), got %', cnt;
+  END IF;
+END $$;
+
+\echo 'suite-e2e seed complete'
@@ -0,0 +1,113 @@
+# External Services — Test-Mock Requirements
+
+Black-box catalogue of every external system autopilot depends on at runtime, with the **test-fixture / mock shape required for each**. Service-side design (protocols, component contracts, ownership boundaries) lives in `_docs/02_document/architecture.md` — this file owns ONLY the test-data dependency view (per `.cursor/rules/artifact-srp.mdc`, `_docs/00_problem/input_data/` is a test-data concern).
+
+Runtime input shapes (frame rates, message types) are described in `data_parameters.md`. This file extends them with the **acquisition status of the corresponding test fixture**.
+
+## Index
+
+| # | External system | Production role | Test-mock shape needed | Acquisition status |
+|---|---|---|---|---|
+| 1 | Tier-1 detection (`../detections`) | Primitive YOLO inference on every frame; returns class + bbox + confidence | Recorded bi-stream replay file (`request frame` → `response detections`) | **MISSING** — no replay recorded yet |
+| 2 | Mission planner (`missions` API) | Mission JSON pull at start; middle-waypoint POST on operator-confirm; pre-flight area-map pull + post-flight diff push | Mock HTTPS exchanges for GET/POST + sample mission + sample mapobjects state | **MISSING** — schema known (mission-schema), no fixture recorded |
+| 3 | Ground Station (modem) | Continuous push of camera + telemetry + bbox overlay; return path carries operator commands (confirm / decline / target-follow / abort) | Scripted session traces: nominal session, modem drop at T=N, reconnect at T=M, lost-link sustained ≥30 s | **MISSING** — authorable inline (no external dependency) |
+| 4 | Airframe autopilot (ArduPilot / PX4) | MAVLink v2 transport for the ~10–15 commands in `architecture.md §7.7`; battery + position telemetry; geofence enforcement | ArduPilot SITL traces: waypoint upload, geofence INCLUSION + EXCLUSION, RTL on lost-link, RTL on battery floor | **MISSING** — needs SITL run with scripted scenarios |
+| 5 | ViewPro A40 camera (frames) | H.264/265 1080p RTSP video feed at 30/60 fps | Recorded frame sequences (`.mp4`) — wide-zoom, light-zoom, medium-zoom, high-zoom variants | **PARTIAL** — 4 wide-zoom clips on disk (`fixtures/movement/video0[1-4].mp4`); zoom-band variants missing |
+| 6 | ViewPro A40 gimbal (control) | Vendor UDP control protocol; yaw / pitch / zoom telemetry per tick | Per-frame-sequence `gimbal.csv` paired with the matching video; per-tick yaw/pitch/zoom + timestamp | **MISSING** — no `gimbal.csv` paired with the 4 movement videos; ego-motion compensation (M1–M4) is unfalsifiable without this |
+| 7 | Deep-analysis VLM (local IPC) | Optional Tier-3 confirmation over bounded ROI; structured `VlmAssessment` response | Recorded I/O pairs (ROI in → `VlmAssessment` out) + schema-violation cases for fail-closed tests | **MISSING** — depends on the local model choice; can be authored against the assessment schema once the model is pinned |
+| 8 | Time source (GPS / NTP) | Wall-clock; drift triggers the R8 health-yellow gate | Scripted drift scenarios (no real GPS/NTP hardware needed) — clock offset, jump, source loss | **MISSING** — authorable inline |
+
+## Per-service detail — what acquisition would look like
+
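+The table above is the index; the subsections below explain the shape and acquisition path so the gaps can be planned out one at a time. Subsections 1–7 match index rows 1–7. Operator-command envelopes (§8) have no index row of their own because they arrive over the Ground Station return path (index row 3), so the time source is index row 8 but §9 below.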
+
+### 1. Tier-1 detection replay (`../detections`)
+
+- Production transport: bi-directional gRPC. The autopilot streams frames out; `../detections` streams `Detections` messages back.
+- Mock shape: a `.replay` file (one per scenario) recording timestamped frames + the exact `Detections` responses the model emitted. Used by `detection_client` integration tests in isolation — no need to boot the real Tier-1 service.
+- Acquisition path: record one replay against the currently pinned `../detections` model. Re-record when the upstream model changes (the `monorepo-e2e` skill should eventually own this drift; see the suite's leftovers).
+- Blocks AC rows: every row that needs a deterministic detection stream — practically L1, L2, D1, D2, D6 in isolation; in suite-e2e mode these run live against the real `../detections`.
+
+### 2. Mission + MapObjects mock (`missions` API)
+
+- Production transport: HTTPS REST.
+- Mock shape: JSON fixtures per endpoint + a small mock HTTP server (or replay-style fixtures consumed by a test double). Endpoints in scope:
+  - `GET /missions/{id}` — mission JSON, validated against `mission-schema`.
+  - `POST /missions/{id}` — middle-waypoint insertion (200 OK + updated mission).
+  - `GET /missions/{id}/mapobjects` — pre-flight area-map pull (response shape: map-object records keyed by spatial cell; volume target ~10000 objects for the 30×30 km gate Mp1).
+  - `POST /missions/{id}/mapobjects` — post-flight diff push (NEW / MOVED / REMOVED / CONFIRMED-EXISTING; volume target per Mp3 ~17500 records).
+- Acquisition path: author JSON fixtures against the known schema; record real exchanges once `missions` is reachable from the test bench.
+- Blocks AC rows: Mp1–Mp5 (all 5 map-reconciliation rows).
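
For the post-flight push, the Mp3 volume target can be reproduced with a synthetic generator before any real exchange is recorded. The sketch below only illustrates the volumes; the record fields (`object_id`, `change`) are hypothetical and not taken from the `missions` schema.

```python
# Volumes from the Mp3 scenario: ~5000 NEW + ~2000 MOVED + ~500 REMOVED + ~10000 CONFIRMED-EXISTING.
MP3_VOLUMES = {"NEW": 5000, "MOVED": 2000, "REMOVED": 500, "CONFIRMED-EXISTING": 10000}


def synthetic_diff(volumes=MP3_VOLUMES):
    """Yield placeholder diff records; the field names are assumptions."""
    for change, count in volumes.items():
        for i in range(count):
            yield {"object_id": f"{change.lower()}-{i}", "change": change}


records = list(synthetic_diff())
print(len(records))  # 17500, matching the Mp3 volume target
```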
+
+### 3. Ground Station session trace
+
+- Production transport: continuous push over modem (suite-level protocol).
+- Mock shape: scripted timing trace per scenario. Each scenario is a list of `(t, event)` pairs: connect, push frame, push telemetry, operator-click, modem-drop, reconnect, lost-link.
+- Acquisition path: authorable inline from `architecture.md §7` and `acceptance_criteria.md §Reliability & Safety`. No external dependency — just a fixture generator.
+- Blocks AC rows: R4 (lost-link → RTL at 30 s); O8, O9, O10 (operator command lifecycle on the return path, **but** O9/O10 also depend on Q9 for the auth scheme).
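
A scripted trace of `(t, event)` pairs can come from a small generator. The sketch below builds a sustained lost-link scenario; the snake_case event names and both function names are assumptions, since the real trace vocabulary is not fixed yet.

```python
def lost_link_trace(drop_at_s=10.0, lost_for_s=30.0, frame_period_s=1.0):
    """Build a (t, event) list: nominal pushes, then a modem drop that never recovers."""
    trace = [(0.0, "connect")]
    t = frame_period_s
    while t < drop_at_s:
        trace.append((t, "push_frame"))
        trace.append((t, "push_telemetry"))
        t += frame_period_s
    trace.append((drop_at_s, "modem_drop"))
    # The link is declared lost once the outage has lasted lost_for_s seconds.
    trace.append((drop_at_s + lost_for_s, "lost_link"))
    return trace


def seconds_until_lost_link(trace):
    """Elapsed time between the modem drop and the lost-link event."""
    drop = next(t for t, event in trace if event == "modem_drop")
    lost = next(t for t, event in trace if event == "lost_link")
    return lost - drop
```

A test for R4 would replay such a trace against the link monitor and check that RTL is issued within ± 1 s of the `lost_link` timestamp.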
+
+### 4. MAVLink SITL trace
+
+- Production transport: MAVLink v2 over UDP or serial.
+- Mock shape: ArduPilot SITL recording capturing the autopilot's command stream + the airframe's response stream. One trace per scenario: waypoint upload, geofence INCLUSION violation, geofence EXCLUSION violation, lost-link RTL, battery RTL-floor RTL, battery hard-floor land-now.
+- Acquisition path: run ArduPilot SITL with a scripted mission; capture the full MAVLink stream with mavlink-router or equivalent.
+- Blocks AC rows: R4 (RTL exact timing), R5, R6, R7, R9; plus the project-level "MAVLink command surface MUST pass SITL conformance" gate.
+
+### 5. Camera frame sequences (ViewPro A40)
+
+- Production transport: RTSP/RTP over TCP/UDP, 1080p H.264/265 at 30/60 fps.
+- Current local fixtures: `fixtures/movement/video0[1-4].mp4` (4 clips, ~5–6 MB each), `fixtures/videos/94d42580bd1ad6ff.mp4` (one reconnaissance clip used for T3 frame-rate floor).
+- Mock-shape gap: zoom-band coverage. Each AC scenario that names a zoom level (wide, light, medium, high) needs a representative clip at that zoom band. The 4 movement clips are listed as wide-zoom in the index but carry no per-clip zoom-band record; this needs documenting per clip OR re-recording with zoom-band labels.
+- Acquisition path: existing clips usable for movement-detection visual baselines; new recordings at each zoom band require flight time.
+
+### 6. Gimbal telemetry CSV (paired with frames)
+
+- Production transport: ViewPro A40 vendor protocol over UDP; per-tick yaw/pitch/zoom updates.
+- Mock shape: `gimbal.csv` with columns `(t, yaw_deg, pitch_deg, zoom_band, focal_mm)`, one CSV per video file, timestamps aligned to frame timestamps within ≤ 1 frame.
+- Acquisition path: requires re-flying the recording with the gimbal-feedback channel captured alongside. CANNOT be back-fitted to existing videos.
+- Blocks AC rows: M1, M2, M3, M4 (movement-detection ego-motion compensation); also tightens L6, L7 (movement candidate enqueue latency).
+- **Confirmed not available today (user-stated 2026-05-19).**
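
Once a paired `gimbal.csv` exists, the "aligned within ≤ 1 frame" requirement can be checked mechanically. The sketch below assumes the `t` column is in seconds on the same clock as the frame timestamps; the function names are hypothetical.

```python
import csv
import io


def max_misalignment(gimbal_csv_text, frame_times):
    """Largest gap (s) between any frame timestamp and its nearest gimbal sample."""
    rows = csv.DictReader(io.StringIO(gimbal_csv_text))
    sample_times = sorted(float(row["t"]) for row in rows)
    if not sample_times:
        raise ValueError("gimbal.csv has no samples")
    return max(min(abs(ft - st) for st in sample_times) for ft in frame_times)


def is_aligned(gimbal_csv_text, frame_times, fps):
    """True when every frame has a gimbal sample within one frame period."""
    return max_misalignment(gimbal_csv_text, frame_times) <= 1.0 / fps
```

The nearest-sample search is quadratic, which is fine for short clips; a longer recording would want a sorted merge instead.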
+
+### 7. VLM I/O pairs
+
+- Production transport: Unix-domain socket IPC to local-onboard VLM (NanoLLM / VILA1.5-3B per architecture §1).
+- Mock shape: paired `(roi.png, prompt.txt, vlm_response.json)` per scenario + a small set of schema-violation cases (truncated JSON, wrong field types, missing required fields) for fail-closed tests.
+- Acquisition path: depends on the local VLM model choice. Once pinned, capture real I/O during a flight or scripted run; schema-violation cases authored inline.
+- Blocks AC rows: L3 (Tier-3 ≤5 s latency on bounded ROI), S5 (deep-analysis hold-cap interaction).
+
+### 8. Operator-command envelopes
+
+- Production transport: operator commands return to the autopilot over the Ground Station modem return path.
+- Mock shape: per envelope, a `(scheme, payload, signature, sequence_id)` tuple. One fixture per case: valid, expired, replayed (same envelope sent twice), malformed (signature mismatch), unsigned.
+- Acquisition path: **blocked on Q9** (operator-command auth scheme — open in `_docs/02_document/architecture.md §8`). Once the scheme is chosen, envelopes are authorable inline.
+- Blocks AC rows: O9 (replay protection), O10 (signature validation); strengthens O8 (confirm pathway).
+
+### 9. GPS / NTP drift scripts
+
+- Production transport: kernel-level wall clock + GPS lock state.
+- Mock shape: scripted offset injection — bump the clock by N ms, drop GPS lock, change time source.
+- Acquisition path: authorable inline; no external dependency.
+- Blocks AC rows: R8.
+
+## Coverage summary by service
+
+| Service | Rows covered (real fixture) | Rows blocked on this service | Acquisition priority |
+|---|---|---|---|
+| Tier-1 replay | L1, D2, D6 (live; replay desirable for isolation) | none independently blocked | low (can use live `../detections` in suite-e2e) |
+| `missions` mock | none | Mp1–Mp5 (5 rows) | medium |
+| Ground Station trace | none | R4, O8 (2 rows) | low (inline-authorable) |
+| MAVLink SITL | none | R4, R5, R6, R7, R9 (5 rows) + project conformance gate | high |
+| Frame sequences | L1 (with image), T3 (with video) | enriches L6/L7 with telemetry | medium |
+| Gimbal CSV | none | M1–M4 (4 rows) + L6, L7 | **high — explicit user gap** |
+| VLM I/O pairs | none | L3, S5 (2 rows) | low (model-choice gated) |
+| Operator envelopes | none | O9, O10 (2 rows) | blocked on Q9 |
+| GPS/NTP drift | none | R8 | low (inline-authorable) |
+
+Per-row binding lives in `expected_results/results_report.md`. The status of each gap is mirrored in `_docs/_process_leftovers/` so the next `/autodev` run can replay the missing-fixture decision.
+
+## What this file does NOT own
+
+- Component design (how `detection_client` talks to Tier-1, how `mission_client` retries, etc.) — `_docs/02_document/architecture.md` and `_docs/02_document/components/*/description.md`.
+- Production data shapes (frame rate, MAVLink message types) — `data_parameters.md` already has these.
+- AC text — `_docs/00_problem/acceptance_criteria.md`.
+- The choice of which mocks to use during a given test run (live vs replay vs scripted) — `_docs/02_document/tests/` (test strategy doc, authored by `/test-spec` Phase 2).