Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy Qt/C++ to a Rust workspace.
- Remove legacy Qt/C++ tree (ai_controller, drone_controller, misc/camera, python_scaffold, root Dockerfile, autopilot.pro, legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria, security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model, glossary, decision-rationale, deployment, 13 component descriptions, tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed, ready for Step 7 Implement).
Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks AZ-640..AZ-686. Total complexity 173 points across 12 epics.
Co-authored-by: Cursor <cursoragent@cursor.com>
Solution
The solution for autopilot is captured in full in _docs/02_document/architecture.md, _docs/02_document/system-flows.md, _docs/02_document/data_model.md, _docs/02_document/decision-rationale.md, the 13 per-component specs under _docs/02_document/components/, and _docs/02_document/glossary.md. These were produced before the canonical greenfield Problem step and were confirmed by the user on 2026-05-17.
This file is the canonical greenfield Solution pointer. It exists so that downstream skills expecting _docs/01_solution/solution.md (test-spec, decompose, plan-resume) have a single entry point. It summarises the shape of the decisions and does not duplicate the architecture.
What is the solution
A single Rust binary on Jetson Orin Nano Super (aarch64) that runs the mission, drives the gimbal in a two-level scan loop, ingests RTSP, delegates Tier 1 detection to ../detections over bi-directional gRPC, runs Tier 2 + optional Tier 3 (VLM) locally, talks to a remote operator over modem via an always-on telemetry stream, and bracket-synchronises a local H3-indexed MapObjects store with the central missions API. The dominant pattern is a deterministic typed state machine — ZoomedOut, ZoomedIn { roi, hold_started_at }, TargetFollow { target_id, started_at } — coordinating a small set of Tokio actor components.
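The typed state machine described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: only the three state names and their field names come from this document, while the field types, the `Roi` struct, the `ScanEvent` enum, the `HOLD_TIMEOUT` value and the transition rules are assumptions made for the example.

```rust
use std::time::{Duration, Instant};

/// Hypothetical region of interest in normalised image coordinates.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Roi {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

/// The three states named in this document; the field types are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScanState {
    ZoomedOut,
    ZoomedIn { roi: Roi, hold_started_at: Instant },
    TargetFollow { target_id: u64, started_at: Instant },
}

/// Hypothetical events; the real event set is defined by the component specs.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScanEvent {
    PoiSelected { roi: Roi },
    TargetConfirmed { target_id: u64 },
    Tick,
    TargetLost,
}

/// Assumed hold time before a zoomed-in look falls back to the wide scan.
pub const HOLD_TIMEOUT: Duration = Duration::from_secs(5);

/// Pure transition function: the same (state, event, now) always yields
/// the same next state, which is what makes the machine deterministic.
pub fn step(state: ScanState, event: ScanEvent, now: Instant) -> ScanState {
    use ScanEvent::*;
    use ScanState::*;
    match (state, event) {
        (ZoomedOut, PoiSelected { roi }) => ZoomedIn { roi, hold_started_at: now },
        (ZoomedIn { .. }, TargetConfirmed { target_id }) => {
            TargetFollow { target_id, started_at: now }
        }
        (ZoomedIn { hold_started_at, .. }, Tick)
            if now.duration_since(hold_started_at) >= HOLD_TIMEOUT =>
        {
            ZoomedOut
        }
        (TargetFollow { .. }, TargetLost) => ZoomedOut,
        // Any other combination leaves the state untouched.
        (unchanged, _) => unchanged,
    }
}
```

Passing `now` in as an argument, rather than reading the clock inside `step`, keeps the function free of side effects, so transitions can be replayed in tests with fixed timestamps.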
Component breakdown
13 components organised into four planes (see architecture.md §2, §3 and per-component specs):
- Perception (data plane in): frame_ingest, detection_client, movement_detector, semantic_analyzer, vlm_client (optional).
- Decision + Memory: scan_controller, mapobjects_store.
- Action (data plane out): gimbal_controller, operator_bridge, mission_executor, mavlink_layer, mission_client.
- Telemetry plane (always-on, parallel): telemetry_stream.
Per-component design contracts (inputs, outputs, state, failure modes, NFRs) live in _docs/02_document/components/<name>/description.md.
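As an illustration of the actor shape these components follow, here is a minimal sketch of one actor that owns its state and is reached only through a cloneable handle. It is an assumption-laden stand-in: it uses `std::sync::mpsc` and an OS thread so that it runs without external crates, whereas the document selects Tokio channels; the message set, the `GimbalHandle` name and the pose representation are all hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical messages accepted by a gimbal-like actor.
pub enum GimbalMsg {
    PointAt { yaw_deg: f32, pitch_deg: f32 },
    /// Request/response: the caller supplies a channel for the reply.
    GetPose { reply: mpsc::Sender<(f32, f32)> },
}

/// Cloneable handle; the only way other components reach the actor.
#[derive(Clone)]
pub struct GimbalHandle {
    tx: mpsc::Sender<GimbalMsg>,
}

impl GimbalHandle {
    /// Returns false if the actor has already stopped.
    pub fn point_at(&self, yaw_deg: f32, pitch_deg: f32) -> bool {
        self.tx.send(GimbalMsg::PointAt { yaw_deg, pitch_deg }).is_ok()
    }

    /// Returns None if the actor has already stopped.
    pub fn pose(&self) -> Option<(f32, f32)> {
        let (reply, rx) = mpsc::channel();
        self.tx.send(GimbalMsg::GetPose { reply }).ok()?;
        rx.recv().ok()
    }
}

/// Spawns the actor; its state (`pose`) is owned by the actor loop alone.
pub fn spawn_gimbal_actor() -> (GimbalHandle, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel();
    let join = thread::spawn(move || {
        let mut pose = (0.0_f32, 0.0_f32);
        // The loop ends once every handle has been dropped.
        for msg in rx {
            match msg {
                GimbalMsg::PointAt { yaw_deg, pitch_deg } => pose = (yaw_deg, pitch_deg),
                GimbalMsg::GetPose { reply } => {
                    // The requester may have gone away; ignore that case.
                    let _ = reply.send(pose);
                }
            }
        }
    });
    (GimbalHandle { tx }, join)
}
```

Because messages from a single handle are processed in send order, a `point_at` followed by `pose` on the same handle observes the updated pose. A Tokio version would presumably swap the thread for a task and the channels for their async equivalents, keeping the same handle-plus-message-enum shape.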
Tech stack rationale (one-line summary per choice; full rationale in decision-rationale.md)
| Layer | Selection | Rationale |
|---|---|---|
| Language | Rust | Memory safety, performance, single-binary deployment, strong typing for the deterministic state machine. |
| Tier 1 detector | YOLO26 + YOLOE-26 FP16 TensorRT (in ../detections) | Best fit with acceptance criteria + existing export pipeline. Not owned by autopilot. |
| Tier 2 analyzer | Primitive graph + lightweight ROI CNN | Fast, explainable, data-efficient. |
| Movement detection | OpenCV optical flow + telemetry; learned-CV fallback per Q14 | Addresses moving-camera constraint directly; benchmark-gated. |
| VLM runtime | NanoLLM / VILA1.5-3B (optional, local IPC) | Local multimodal path that matches the no-cloud requirement. |
| MAVLink transport | Hand-rolled (Rust) | Eliminates the largest current dependency-risk item; command surface is small (architecture.md §7.7). |
| Gimbal protocol | ViewPro A40 vendor protocol over UDP | Matches the deployed camera. |
| Inter-component IPC | Tokio channels / actors | Idiomatic Rust async. |
| External IPC (VLM) | Unix-domain socket + peer-credential check | Local-only authorisation. |
| MapObjects engine | TBD (SQLite + H3 / KV / in-memory + snapshot) | Open question Q3; does not block decomposition of the rest of the system. |
| Observability | tracing + JSON logs to stdout | Scraped by the deployment's log-shipping stack. |
| Build | cargo cross-compile for aarch64-unknown-linux-gnu | See _docs/02_document/deployment/ci_cd_pipeline.md. |
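To give a sense of what the hand-rolled MAVLink row in the table above involves, the sketch below shows the frame checksum that any MAVLink implementation has to compute. MAVLink uses the CRC-16/MCRF4XX algorithm seeded with 0xFFFF, and accumulates a per-message CRC_EXTRA byte after the frame bytes. The function names are hypothetical and do not come from the autopilot code base; framing, sequence numbers and signing are left out.

```rust
/// Seed value for the MAVLink checksum (CRC-16/MCRF4XX).
const CRC_INIT: u16 = 0xFFFF;

/// Folds one byte into the running checksum.
pub fn crc_accumulate(crc: u16, byte: u8) -> u16 {
    let mut tmp = byte ^ (crc & 0x00FF) as u8;
    // Shifting a u8 left discards the high bits, which is intended here.
    tmp ^= tmp << 4;
    let t = tmp as u16;
    (crc >> 8) ^ (t << 8) ^ (t << 3) ^ (t >> 4)
}

/// CRC-16/MCRF4XX over a byte slice.
pub fn crc_mcrf4xx(data: &[u8]) -> u16 {
    data.iter().fold(CRC_INIT, |crc, &b| crc_accumulate(crc, b))
}

/// Checksum for one frame: `frame_without_magic` is the header (minus the
/// start-of-frame marker) followed by the payload; `crc_extra` is the
/// per-message constant taken from the message definition.
pub fn frame_checksum(frame_without_magic: &[u8], crc_extra: u8) -> u16 {
    crc_accumulate(crc_mcrf4xx(frame_without_magic), crc_extra)
}

/// The checksum goes on the wire low byte first.
pub fn checksum_bytes(crc: u16) -> [u8; 2] {
    crc.to_le_bytes()
}
```

A quick sanity check for this CRC variant is the conventional test string: `crc_mcrf4xx(b"123456789")` should equal `0x6F91`.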
Reading order for downstream skills
1. _docs/02_document/architecture.md: start with §0 Synopsis, then §3 Components, §5 Architectural Principles, §6 NFR Targets, §7 Detailed Design (in section order).
2. _docs/02_document/system-flows.md: flow-by-flow walkthroughs; cross-referenced from the architecture sections.
3. _docs/02_document/data_model.md: canonical entities (Frame, Detection, POI, VlmAssessment, MapObject, IgnoredItem, MissionItem, ...).
4. _docs/02_document/components/<name>/description.md: one per component; consumed by /decompose to map tasks to components.
5. _docs/02_document/glossary.md: project-specific terms (also user-confirmed 2026-05-17).
6. _docs/02_document/decision-rationale.md: load-bearing research and decision evidence (the equivalent of research/ Mode A + Mode B outputs).
Open questions / open decisions
Tracked in _docs/02_document/architecture.md §8 Open Questions (Q1–Q14). None of them block initial implementation decomposition; each component spec calls out the questions it depends on and what the temporary contract is until the question resolves.
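One way a temporary contract can keep an open question from blocking work is to hide the undecided choice behind a trait. The sketch below does this for the MapObjects engine (Q3), purely as an illustration: the `MapObjectsStore` trait, its methods, the `MapObject` fields and the in-memory implementation are all assumptions, not the contract defined in the component spec. Cell identifiers are passed in as plain `u64` values, since H3 cell indexes are 64-bit integers; computing them is out of scope here.

```rust
use std::collections::HashMap;

/// H3 cell indexes are 64-bit integers; this sketch does not compute them.
pub type CellId = u64;

/// Hypothetical, reduced view of a map object.
#[derive(Debug, Clone, PartialEq)]
pub struct MapObject {
    pub id: u64,
    pub cell: CellId,
    pub class: String,
}

/// Hypothetical temporary contract: callers depend on this trait only, so
/// the engine behind it (SQLite, KV, in-memory) can be chosen later.
pub trait MapObjectsStore {
    /// Inserts the object, replacing any existing object with the same id.
    fn upsert(&mut self, obj: MapObject);
    /// Returns the objects in the given cell, ordered by id.
    fn in_cell(&self, cell: CellId) -> Vec<MapObject>;
}

/// Simplest possible engine, useful until the real one is selected.
#[derive(Default)]
pub struct InMemoryStore {
    by_id: HashMap<u64, MapObject>,
}

impl MapObjectsStore for InMemoryStore {
    fn upsert(&mut self, obj: MapObject) {
        self.by_id.insert(obj.id, obj);
    }

    fn in_cell(&self, cell: CellId) -> Vec<MapObject> {
        let mut hits: Vec<MapObject> = self
            .by_id
            .values()
            .filter(|o| o.cell == cell)
            .cloned()
            .collect();
        hits.sort_by_key(|o| o.id);
        hits
    }
}

/// Example consumer written against the trait, not a concrete engine.
pub fn count_in_cell(store: &dyn MapObjectsStore, cell: CellId) -> usize {
    store.in_cell(cell).len()
}
```

Code such as `count_in_cell` compiles and can be tested today against `InMemoryStore`, and would not need to change if a persistent engine later implements the same trait.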