mirror of
https://github.com/azaion/autopilot.git
synced 2026-06-21 15:31:10 +00:00
Component — semantic_analyzer
Layer: Perception (data plane in)
Status: forward-looking design (Rust + ONNX/TensorRT bindings)
1. Purpose
Tier 2 of the perception pipeline. Reasons over zoom-in crops using a primitive graph plus a lightweight ROI CNN. Active only when scan_controller is in ZoomedIn. Owns path-freshness scoring, endpoint scoring, branch choice at intersections, and concealment-POI scoring. Operates on bounded ROIs only — never full frames.
2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| DetectionBatch (Tier 1 primitives) | detection_client | per zoom-in frame | Used for primitive-graph construction (paths, branches, entrances, trees). |
| Zoom-in frame + ROI selection | frame_ingest (frame), scan_controller (ROI bounds) | per zoom-in hold | Bounded crop only; full frame is not consumed. |
| Per-class config | startup config | once | Confidence floors, freshness thresholds, branch-priority rules. |
3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| Tier2Evidence | scan_controller | `{ roi_id, path_freshness, endpoint_score, concealment_score, recommended_next_action: PanFollowFootpath \| HoldEndpoint \| PanBroad \| ReturnToZoomOut, source_detections: Vec<DetectionId> }` |
| Pan plan | scan_controller (then gimbal_controller) | sequence of pan goals for footpath following |
| Health metric | health aggregator | tier2_latency_p50/p99, roi_size_bytes_p99, errors_total. |
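The Tier2Evidence shape in the table above could be expressed in Rust roughly as follows. This is a minimal sketch, not the canonical definition (that belongs to data_model.md): the `RoiId` and `DetectionId` aliases, the `f32` score type, the use of `Option` for `path_freshness`, and the `inconsistent_graph` helper with its zeroed scores are all assumptions made for illustration.

```rust
/// Hypothetical identifier types; the real ones are defined elsewhere.
pub type RoiId = u64;
pub type DetectionId = u64;

/// Next action recommended to scan_controller (variant names from the Outputs table).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NextAction {
    PanFollowFootpath,
    HoldEndpoint,
    PanBroad,
    ReturnToZoomOut,
}

/// Sketch of the Tier2Evidence record. Scores are assumed to be f32 values;
/// path_freshness is optional because the failure-mode table allows it to be undefined.
#[derive(Debug, Clone, PartialEq)]
pub struct Tier2Evidence {
    pub roi_id: RoiId,
    pub path_freshness: Option<f32>,
    pub endpoint_score: f32,
    pub concealment_score: f32,
    pub recommended_next_action: NextAction,
    pub source_detections: Vec<DetectionId>,
}

impl Tier2Evidence {
    /// Evidence for an ROI whose primitive graph failed validation:
    /// no freshness value and a recommendation to return to zoom-out.
    /// Zeroing the other scores is an assumption of this sketch.
    pub fn inconsistent_graph(roi_id: RoiId, source_detections: Vec<DetectionId>) -> Self {
        Self {
            roi_id,
            path_freshness: None,
            endpoint_score: 0.0,
            concealment_score: 0.0,
            recommended_next_action: NextAction::ReturnToZoomOut,
            source_detections,
        }
    }
}
```

Modelling `recommended_next_action` as an enum lets scan_controller match on it exhaustively, so adding a new action becomes a compile-time change in every consumer.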
4. Key Responsibilities
- Build a small primitive graph from Tier-1 detections inside the ROI: path nodes (footpaths, roads), endpoint nodes (branch piles, dark entrances, dugouts), context nodes (trees, tree blocks).
- Score path freshness using the freshness model (texture, edge clarity, undisturbed-surroundings cues).
- Score concealment for endpoint candidates.
- At intersections, recommend the freshest or most promising branch for gimbal_controller to pan toward; emit a follow plan that keeps the path centered while the UAV moves.
- Bound every inference call by a strict ROI size and timeout. Never run on a full frame.
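Branch choice at an intersection can be sketched as a filter-then-rank step over the candidate branches. The `Branch` type, the `choose_branch` and `score` functions, the [0, 1] score scale, and the 0.7/0.3 weighting are all hypothetical; the document does not specify how freshness and endpoint promise are combined.

```rust
/// Hypothetical branch candidate leaving an intersection node.
#[derive(Debug, Clone, PartialEq)]
pub struct Branch {
    pub id: u32,
    /// Path-freshness score, assumed to lie in [0, 1] (higher = fresher).
    pub freshness: f32,
    /// Best endpoint score reachable along this branch, assumed in [0, 1].
    pub endpoint_score: f32,
}

/// Combined ranking score. The weights are illustrative only.
fn score(b: &Branch) -> f32 {
    0.7 * b.freshness + 0.3 * b.endpoint_score
}

/// Return the highest-scoring branch whose freshness meets the configured floor,
/// or None when no branch qualifies.
pub fn choose_branch(branches: &[Branch], freshness_floor: f32) -> Option<&Branch> {
    branches
        .iter()
        .filter(|b| b.freshness >= freshness_floor)
        // total_cmp gives a total order on f32, so NaN cannot panic the comparison.
        .max_by(|a, b| score(a).total_cmp(&score(b)))
}
```

Returning `None` when nothing clears the floor leaves the fallback (for example, recommending ReturnToZoomOut) to the caller rather than hiding it in the ranking code.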
5. Internal State
- ROI-scoped primitive graphs (per-ROI lifetime; dropped on zoom-in exit).
- Lightweight CNN session (ONNX/TensorRT engine).
State is in-process only.
6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| ROI size exceeds limit | pre-decode size check | Reject the ROI; surface to scan_controller as tier2_oversize; do not decode. |
| Inference timeout (>200 ms) | wall-clock | Return Tier2Evidence with status: timeout; scan_controller decides to skip VLM and surface a low-evidence POI. |
| CNN session OOM or hardware error | inference call error | Health → red on sustained errors; scan_controller falls back to Tier-1-only POI surfacing. |
| Inconsistent primitive graph (e.g., disconnected paths) | graph validation step | Emit Tier2Evidence with recommended_next_action: ReturnToZoomOut and path_freshness: undefined. |
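The first two rows of the failure table could be guarded along these lines. This is a sketch under assumptions: `Tier2Failure`, `check_roi_size`, and `run_with_budget` are hypothetical names, the size limit is passed in because the document gives no concrete value, and the timeout is classified after the call returns rather than by cancelling a running inference.

```rust
use std::time::{Duration, Instant};

/// 200 ms wall-clock budget, taken from the failure-mode table.
pub const INFERENCE_BUDGET: Duration = Duration::from_millis(200);

/// Hypothetical failure type for the two guarded cases.
#[derive(Debug, PartialEq, Eq)]
pub enum Tier2Failure {
    /// ROI rejected before decode; would be surfaced as tier2_oversize.
    Oversize { bytes: usize, limit: usize },
    /// Inference exceeded the wall-clock budget.
    Timeout { elapsed: Duration },
}

/// Pre-decode size check: compares the encoded size against the limit
/// without touching the pixel data.
pub fn check_roi_size(bytes: usize, limit: usize) -> Result<(), Tier2Failure> {
    if bytes > limit {
        Err(Tier2Failure::Oversize { bytes, limit })
    } else {
        Ok(())
    }
}

/// Run a blocking inference closure and classify it against the budget.
/// This measures elapsed time after the fact; it does not interrupt the call.
pub fn run_with_budget<T>(
    budget: Duration,
    infer: impl FnOnce() -> T,
) -> Result<T, Tier2Failure> {
    let start = Instant::now();
    let out = infer();
    let elapsed = start.elapsed();
    if elapsed > budget {
        Err(Tier2Failure::Timeout { elapsed })
    } else {
        Ok(out)
    }
}
```

A caller would run `check_roi_size` first and only then wrap the CNN call in `run_with_budget(INFERENCE_BUDGET, ...)`. A production version would likely need true cancellation or a dedicated inference thread, which this sketch does not attempt.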
7. Dependencies
In-process: detection_client, frame_ingest, scan_controller.
External: ONNX Runtime / TensorRT (whichever the lightweight CNN ships with), OpenCV (preprocessing).
8. Non-Functional Targets
| Concern | Target |
|---|---|
| Per-ROI latency | ≤200 ms p99 |
| Concealed-position recall | ≥60 % |
| Concealed-position precision | ≥20 % (operators filter) |
| Footpath detection recall | ≥70 % |
| ROI memory footprint | bounded; no unbounded buffering |
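The targets above can be checked offline from recorded latencies and labelled counts. The sketch below assumes a nearest-rank percentile definition and standard precision/recall formulas; the function names and the choice of milliseconds as the latency unit are illustrative, not taken from the test specs.

```rust
/// Nearest-rank percentile over latency samples in milliseconds.
/// `p` must be in 1..=100; returns None for empty input or an invalid `p`.
pub fn percentile_ms(samples: &[u64], p: u32) -> Option<u64> {
    if samples.is_empty() || p == 0 || p > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // Integer ceil(p * n / 100); always in 1..=n for valid inputs.
    let rank = (p as usize * sorted.len() + 99) / 100;
    Some(sorted[rank - 1])
}

/// recall = TP / (TP + FN); None when there are no positives to find.
pub fn recall(true_pos: u32, false_neg: u32) -> Option<f64> {
    let total = true_pos + false_neg;
    (total > 0).then(|| f64::from(true_pos) / f64::from(total))
}

/// precision = TP / (TP + FP); None when nothing was reported.
pub fn precision(true_pos: u32, false_pos: u32) -> Option<f64> {
    let total = true_pos + false_pos;
    (total > 0).then(|| f64::from(true_pos) / f64::from(total))
}

/// True when the p99 latency is within the 200 ms per-ROI target.
pub fn meets_latency_target(samples_ms: &[u64]) -> bool {
    matches!(percentile_ms(samples_ms, 99), Some(p99) if p99 <= 200)
}
```

For example, 6 true positives with 4 misses gives a recall of 0.6, which sits exactly on the concealed-position recall floor, and 1 true positive among 5 reported positions gives a precision of 0.2.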
9. References
- architecture.md §3, §7.6 Tier 2 semantic analyzer, §7.5 Training Data.
- system-flows.md §F1 Frame pipeline, §F4 Scan controller behaviour tree.
- data_model.md §Tier2Evidence.