autopilot/_docs/00_problem/input_data/data_parameters.md
Oleksandr Bezdieniezhnykh bc40ea7300 [AZ-626] Decompose complete: 47 tasks + docs + module layout
Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy
Qt/C++ to a Rust workspace.

- Remove legacy Qt/C++ tree (ai_controller, drone_controller,
  misc/camera, python_scaffold, root Dockerfile, autopilot.pro,
  legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria,
  security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model,
  glossary, decision-rationale, deployment, 13 component descriptions,
  tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one
  bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for
  canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed,
  ready for Step 7 Implement).

Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks
AZ-640..AZ-686. Total complexity 173 points across 12 epics.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-19 11:02:01 +03:00

Input Data Parameters

This file describes the categories of input data the system consumes at runtime and the categories of reference data that tests need. Internal component names, programming languages, IPC mechanisms, schema class names, and specific model choices are design decisions and live in _docs/02_document/architecture.md; they do not belong in this file (per .cursor/rules/artifact-srp.mdc).

Local fixtures live in fixtures/; see fixtures/README.md for the manifest. External-service test-mock requirements live in services.md; the per-row binding to AC criteria lives in expected_results/results_report.md.

Runtime inputs (what the system consumes when flying)

| Input | Source | Format | Cadence | Notes |
|---|---|---|---|---|
| Camera frames | ViewPro A40 (or alternative ViewPro Z40K) | H.264 / H.265 over RTSP, 1080p (1920×1080) | 30 / 60 fps | Frame timestamps are mandatory. |
| Primitive (Tier 1) detection responses | ../detections service over a bi-directional streaming RPC contract | Bounding boxes with class id, confidence, normalised coordinates | Per frame | Same boxes feed Tier-2 ROI selection and the operator overlay. |
| UAV telemetry | Airframe via MAVLink v2 (UDP or serial) | MAVLink messages: position, attitude, velocity, battery, link health, GPS fix | ≥1 Hz (10 Hz target) | Source-of-truth for ego-motion compensation. |
| Gimbal feedback | ViewPro A40 vendor protocol over UDP | Yaw / pitch / zoom angle telemetry | Per tick | Source-of-truth for camera-pose compensation. |
| Mission JSON | missions service via HTTPS REST | Shared mission-schema JSON | Once at mission start + middle-waypoint updates | Validated against the shared schema. |
| Area-level map state | missions service extension /missions/{id}/mapobjects (GET) | Map-object records keyed by spatial cell | Once at mission start | Hydrates the system's local copy of the area map; cache-fallback on timeout. |
| Operator commands | Ground Station via modem (return path of the outbound telemetry stream) | Authenticated + signed + replay-protected command envelope (scheme open per Q9) | Event-driven | confirm / decline / target-follow start / target-follow release / abort. |
| Deep-analysis responses (optional) | Local-onboard model accessed via local IPC | Structured assessment schema (validated) | Per zoomed-in endpoint hold (when deep-analysis is enabled) | Schema-violation fails closed. |
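
The cadence floor for UAV telemetry (≥1 Hz, with a 10 Hz target) can be checked against recorded message timestamps. The following is a minimal sketch only: the function name `meets_cadence_floor` and the representation of timestamps as seconds in a plain sequence are assumptions made for illustration, not part of any contract defined here.

```python
from typing import Sequence


def meets_cadence_floor(timestamps_s: Sequence[float], min_hz: float = 1.0) -> bool:
    """Return True if every gap between consecutive samples is at most 1/min_hz seconds.

    Hypothetical helper: timestamps are assumed to be in seconds and in arrival order.
    """
    if len(timestamps_s) < 2:
        return False  # a rate cannot be established from fewer than two samples
    max_gap_s = 1.0 / min_hz
    gaps = (later - earlier for earlier, later in zip(timestamps_s, timestamps_s[1:]))
    # A non-positive gap means duplicated or out-of-order timestamps; treat it as a failure.
    return all(0.0 < gap <= max_gap_s for gap in gaps)
```

Passing `min_hz=10.0` checks the same trace against the 10 Hz target rather than the 1 Hz floor.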

Class catalogue (Tier-1 + Tier-2)

Detection-quality acceptance criteria (acceptance_criteria.md → Detection Quality) are evaluated against a class catalogue that combines pre-existing suite-level classes with new autopilot-driven additions. Class IDs are governed at the suite level (../detections owns the catalogue); autopilot only consumes the IDs.

New Tier-1 (YOLO primitive) classes — to be added to the suite catalogue

| # | Class name | Annotation hint | Motivated by |
|---|---|---|---|
| 1 | Black entrances | Bounding box; various sizes (small hideout openings to dugout entrances) | Concealed-position detection (D3, D4) |
| 2 | Branch piles | Bounding box | Concealment material around hideouts (D3, D4) |
| 3 | Footpaths | Polyline / segmentation preferred over bbox for linear features | Footpath recall gate (D5) |
| 4 | Roads | Polyline / segmentation | Distinguishing roads from footpaths in the same scene |
| 5 | Trees / tree blocks | Bounding box; tree-block annotation may use larger box for clusters | Concealment-context anchor; reduces false positives around tree-rows in movement detection (M1) |

Tier-2 semantic attributes — composed by semantic_analyzer, NOT added to YOLO catalogue

| # | Attribute | Composed from | Used by |
|---|---|---|---|
| 1 | Footpath freshness (fresh / stale) | Footpath bbox + texture/edge analysis + seasonal context | Decision-window scoring, D5 partial coverage |
| 2 | Concealed-structure inference | Black-entrance + branch-piles + footpath-approach proximity | POI surfacing for D3/D4 (the structure itself is composed, not directly labelled) |
| 3 | Open clearing connected to path | Cleared-terrain texture + footpath endpoint | FPV-launch-point flagging |

Existing classes (already in the suite catalogue)

The existing-class baseline (P=0.816, R=0.852 per the AC) covers the suite's pre-autopilot class set (vehicles, military equipment, etc.). Autopilot must not degrade these — see D2.

Reference for IDs

The 19-id catalogue (0..18) is owned by ../detections. Autopilot's normalised-box output uses the same IDs. When ../detections ships a new model or renumbers IDs, the expected_detections.json baseline goes stale and D1, D2, D6 rows must be re-recorded.
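
A conformance check for a single normalised-box record could look like the sketch below. Only the ID range 0..18 comes from the text above. The `Detection` field names, the corner-based box layout, and the assumption that confidence lies in [0, 1] are hypothetical; the authoritative shape is whatever fixtures/schemas/expected_detections.schema.json defines.

```python
from dataclasses import dataclass

CATALOGUE_SIZE = 19  # class IDs 0..18, owned by ../detections


@dataclass(frozen=True)
class Detection:
    # Hypothetical field names; the real contract lives in the shared schema.
    class_id: int
    confidence: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def validation_errors(det: Detection) -> list[str]:
    """Collect human-readable problems; an empty list means the record conforms."""
    errors = []
    if not 0 <= det.class_id < CATALOGUE_SIZE:
        errors.append(f"class_id {det.class_id} is outside 0..{CATALOGUE_SIZE - 1}")
    if not 0.0 <= det.confidence <= 1.0:
        errors.append("confidence must lie in [0, 1]")
    coords = (det.x_min, det.y_min, det.x_max, det.y_max)
    if any(not 0.0 <= value <= 1.0 for value in coords):
        errors.append("coordinates must be normalised to [0, 1]")
    if det.x_min >= det.x_max or det.y_min >= det.y_max:
        errors.append("box must have positive width and height")
    return errors
```

If ../detections renumbers or extends the catalogue, `CATALOGUE_SIZE` in a check like this would go stale together with the expected_detections.json baseline.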

Reference data needed for testing

Local fixtures already on disk

See fixtures/README.md for the SHA-pinned manifest. Categorised summary:

| Local fixture category | Files | Purpose | Bound to AC rows |
|---|---|---|---|
| fixtures/images/*.jpg | 5 aerial frames | Tier-1 detection contract; existing-class regression; normalised-box conformance | L1, D2, D6 |
| fixtures/videos/94d42580bd1ad6ff.mp4 | 1 reconnaissance clip | Frame-rate floor scenario, reserved for future movement-sequence tests | T3 |
| fixtures/schemas/expected_detections.{json,schema.json} | 2 schema files | Detection-result contract shape reference | D6 |
| fixtures/sql/init.sql | 1 SQL file | Suite-e2e DB seed reference | (suite-only; no autopilot AC) |
| fixtures/movement/video0[1-4].mp4 | 4 wide-area clips | Visual reference for movement-detection scenarios; no paired telemetry CSVs, so ego-motion assertions are unfalsifiable until those land | M1..M4 (visual reference only) |
| fixtures/semantic/semantic0[1-4].png | 4 reference frames | Visual reference for concealed-position semantic targets; starter set only, not a graded eval set | D3, D4, D5 (starter only) |
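
Because the manifest is SHA-pinned, fixture drift can be detected by re-hashing files on disk. The sketch below assumes SHA-256 and a manifest already loaded as a mapping of relative path to hex digest. Both are assumptions: the actual digest algorithm and manifest layout are defined in fixtures/README.md, and `verify_manifest` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large video fixtures are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Return {relative_path: problem} for every entry that is missing or mismatched."""
    problems = {}
    for relative_path, expected_hex in manifest.items():
        target = root / relative_path
        if not target.is_file():
            problems[relative_path] = "missing"
        elif sha256_of(target) != expected_hex.lower():
            problems[relative_path] = "sha256 mismatch"
    return problems
```

An empty result means every pinned file is present and unchanged. Any other result names the entries to restore or re-pin.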

Reference shapes still needed but not yet on disk

The per-service mock catalogue is in services.md (authoritative). Summary of categories tests need:

| Reference shape | Why it's needed | See |
|---|---|---|
| Frame sequences with synchronised gimbal.csv + telemetry.csv | Ego-motion compensation at zoom-out AND zoomed-in inspection | services.md §6 Gimbal telemetry CSV |
| Concealed-position image set across all four seasons (annotated) | Concealed-position recall ≥60% and precision ≥20% | services.md §5 Camera frame sequences |
| Footpath sequences (fresh, stale, all four seasons, polyline-annotated) | Footpath recall ≥70% | services.md §5 |
| New-class evaluation set (5 new classes above) | New-class per-class P/R ≥80% without degrading existing-class performance | services.md §1 Tier-1 detection replay (plus annotation campaign owned by ../ai-training repo) |
| Mock Tier-1 streaming-RPC replays | Detection-consumer isolation tests | services.md §1 |
| Mock Ground Station session traces | Lost-link failsafe ladder + operator-link reconnect | services.md §3 |
| MAVLink SITL traces | MAVLink conformance + waypoint insertion + geofence enforcement | services.md §4 |
| Mock central area-map service responses | Pre-flight pull / post-flight push round-trip; conflict cases (Q8) | services.md §2 |
| Operator-command envelopes | Signature + replay-protection tests (once Q9 resolves) | services.md §8 |
| VLM I/O pairs | Bounded ROI inputs + structured assessment outputs + schema-violation cases | services.md §7 |
| GPS / NTP drift scenarios | Wall-clock drift health-yellow gate | services.md §9 |

Data volume targets

  • Training data: hundreds to thousands of annotated images/sequences total.
  • Seasonal coverage: winter (snow), spring (mud), summer (vegetation), autumn (mixed leaf + partial snow).
  • Available assembly effort: 1.5 months at 5 hours/day.
  • Movement detection requires frame sequences (not still images only) with synchronised camera + gimbal + UAV telemetry.
  • Footpaths require polyline or segmentation annotation rather than bounding boxes (see "Class catalogue" above).
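
To illustrate what "synchronised" implies for movement-detection fixtures, the sketch below pairs each frame timestamp with the nearest gimbal or telemetry sample and rejects pairs whose skew exceeds a tolerance. The function names, the use of seconds, and the tolerance value are all hypothetical; this document does not define a skew budget.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def nearest_index(sorted_ts: Sequence[float], t: float) -> int:
    """Index of the sample closest to t; sorted_ts must be ascending and non-empty."""
    i = bisect_left(sorted_ts, t)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    before, after = sorted_ts[i - 1], sorted_ts[i]
    return i if (after - t) < (t - before) else i - 1


def pair_frames(
    frame_ts: Sequence[float],
    sample_ts: Sequence[float],
    max_skew_s: float,
) -> list[Optional[int]]:
    """For each frame, return the index of its nearest sample, or None if the skew is too large."""
    if not sample_ts:
        return [None] * len(frame_ts)
    pairs: list[Optional[int]] = []
    for t in frame_ts:
        j = nearest_index(sample_ts, t)
        pairs.append(j if abs(sample_ts[j] - t) <= max_skew_s else None)
    return pairs
```

Frames that come back as `None` have no usable pose sample, so ego-motion assertions on them would not be meaningful.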

Gaps that block /test-spec downstream

/test-spec Phase 1 will pass on prerequisite existence (expected_results/results_report.md is non-empty). Phase 3 has a hard 75% coverage gate on rows with real input fixtures + real expected results.

Current coverage state (re-computed 2026-05-19 after fixture restoration):

  • Rows bound to real local fixtures: L1, D2, D6, T3 (~4 rows) — these are also the rows whose fixtures were restored on 2026-05-19 from sibling repos.
  • Rows bound to starter-only fixtures (insufficient on their own): D3, D4, D5 (semantic PNGs), M1..M4 (movement videos without paired telemetry CSVs).
  • Rows still deferred for fixture acquisition: see fixtures/README.md → "Gaps still pending fixture acquisition" and services.md for the authoritative list.

Project policy on the Phase 3 gate: rather than block /test-spec at the 75% gate, the autodev flow registers each deferred row with a structured <DEFERRED: needs <shape>; blocks AC <id>> tag in expected_results/results_report.md. Test-spec authoring proceeds; deferred rows become release-gate items, not development-gate items. The acceptance_criteria.md project-level gate ("MUST pass before product implementation begins") still applies for the hardware/replay benchmark — that remains a hard release blocker, not deferred.
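
The deferred-row tag and the 75% gate lend themselves to a small tooling check. The sketch below assumes that `<shape>` and `<id>` in the tag are placeholders filled with free text, that each row's expected-result cell is available as a string, and that any row without a DEFERRED tag counts as covered. These are simplifications made for illustration; the real /test-spec gate logic may differ, and the helper names are hypothetical.

```python
import re
from typing import Optional

# Matches e.g. "<DEFERRED: needs telemetry CSV; blocks AC X2>" (placeholder values are assumed).
DEFERRED_TAG = re.compile(r"<DEFERRED: needs (?P<shape>[^;>]+); blocks AC (?P<ac>[^>]+)>")
COVERAGE_GATE = 0.75  # Phase 3 hard gate stated above


def parse_deferred(cell: str) -> Optional[tuple[str, str]]:
    """Return (needed shape, blocked AC id) if the cell carries a DEFERRED tag, else None."""
    match = DEFERRED_TAG.search(cell)
    if match is None:
        return None
    return match.group("shape").strip(), match.group("ac").strip()


def coverage(rows: dict[str, str]) -> float:
    """Fraction of rows whose expected-result cell carries no DEFERRED tag."""
    if not rows:
        return 0.0
    covered = sum(1 for cell in rows.values() if parse_deferred(cell) is None)
    return covered / len(rows)


def release_gate_items(rows: dict[str, str]) -> dict[str, str]:
    """Map each deferred row id to the fixture shape it is still waiting for."""
    items = {}
    for row_id, cell in rows.items():
        parsed = parse_deferred(cell)
        if parsed is not None:
            items[row_id] = parsed[0]
    return items
```

Under the stated project policy, a coverage value below `COVERAGE_GATE` would not stop test-spec authoring. The output of `release_gate_items` would instead feed the release-gate checklist.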