Oleksandr Bezdieniezhnykh bc40ea7300 [AZ-626] Decompose complete: 47 tasks + docs + module layout
2026-05-19 11:02:01 +03:00

Component — semantic_analyzer

Layer: Perception (data plane in)
Status: forward-looking design (Rust + ONNX/TensorRT bindings)

1. Purpose

Tier 2 of the perception pipeline. Reasons over zoom-in crops using a primitive graph plus a lightweight ROI CNN. Active only when scan_controller is in ZoomedIn. Owns path-freshness scoring, endpoint scoring, branch choice at intersections, and concealment-POI scoring. Operates on bounded ROIs only — never full frames.

2. Inputs

| Input | Source | Cadence | Notes |
|---|---|---|---|
| DetectionBatch (Tier 1 primitives) | detection_client | per zoom-in frame | Used for primitive-graph construction (paths, branches, entrances, trees). |
| Zoom-in frame + ROI selection | frame_ingest (frame), scan_controller (ROI bounds) | per zoom-in hold | Bounded crop only; full frame is not consumed. |
| Per-class config | startup config | once | Confidence floors, freshness thresholds, branch-priority rules. |
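
As an illustration of the per-class config input, the following is a minimal sketch of how such a config might be modelled and validated once at startup. The names `ClassConfig`, `ConfigError`, and `validate`, the field names, and the assumption that floors and thresholds lie in [0, 1] are all hypothetical; the source only states that the config carries confidence floors, freshness thresholds, and branch-priority rules.

```rust
use std::collections::HashMap;

/// Hypothetical per-class settings; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct ClassConfig {
    /// Detections below this confidence are ignored (assumed range 0..=1).
    pub confidence_floor: f32,
    /// Minimum freshness score for a path to count as fresh (assumed range 0..=1).
    pub freshness_threshold: f32,
    /// Assumed encoding of a branch-priority rule: higher wins at an intersection.
    pub branch_priority: u8,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    OutOfRange { class: String, field: &'static str, value: f32 },
}

/// Validate the whole config once, so the per-ROI path never re-checks ranges.
pub fn validate(configs: &HashMap<String, ClassConfig>) -> Result<(), ConfigError> {
    for (class, cfg) in configs {
        for (field, value) in [
            ("confidence_floor", cfg.confidence_floor),
            ("freshness_threshold", cfg.freshness_threshold),
        ] {
            // NaN fails this check too, because NaN is not contained in any range.
            if !(0.0..=1.0).contains(&value) {
                return Err(ConfigError::OutOfRange { class: class.clone(), field, value });
            }
        }
    }
    Ok(())
}
```

Rejecting a bad config at startup fits the "once" cadence in the table: a failure surfaces before any zoom-in begins rather than in the middle of a scan.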

3. Outputs

| Output | Consumer | Shape |
|---|---|---|
| Tier2Evidence | scan_controller | { roi_id, path_freshness, endpoint_score, concealment_score, recommended_next_action: PanFollowFootpath \| HoldEndpoint \| PanBroad \| ReturnToZoomOut, source_detections: Vec<DetectionId> } |
| Pan plan | scan_controller (then gimbal_controller) | sequence of pan goals for footpath following |
| Health metric | health aggregator | tier2_latency_p50/p99, roi_size_bytes_p99, errors_total. |
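
The Tier2Evidence shape could be expressed in Rust roughly as follows. This is a sketch under assumptions, not the canonical definition (data_model.md §Tier2Evidence is authoritative). The concrete field types, the `NextAction` enum name, the use of `Option<f32>` to model an undefined freshness, and the `inconsistent_graph` helper with its zeroed scores are all assumptions.

```rust
/// Identifier of a Tier-1 detection; the concrete type is an assumption.
pub type DetectionId = u64;

/// The four recommended actions named in the Outputs table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NextAction {
    PanFollowFootpath,
    HoldEndpoint,
    PanBroad,
    ReturnToZoomOut,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Tier2Evidence {
    pub roi_id: u32,
    /// `None` models an undefined freshness (assumed representation).
    pub path_freshness: Option<f32>,
    pub endpoint_score: f32,
    pub concealment_score: f32,
    pub recommended_next_action: NextAction,
    pub source_detections: Vec<DetectionId>,
}

impl Tier2Evidence {
    /// Hypothetical helper: evidence for an ROI whose primitive graph failed
    /// validation. Zeroed scores are an assumption, not part of the design.
    pub fn inconsistent_graph(roi_id: u32, source_detections: Vec<DetectionId>) -> Self {
        Self {
            roi_id,
            path_freshness: None,
            endpoint_score: 0.0,
            concealment_score: 0.0,
            recommended_next_action: NextAction::ReturnToZoomOut,
            source_detections,
        }
    }
}
```

Making the action a closed enum lets scan_controller match on it exhaustively, so adding a fifth action later becomes a compile-time change rather than a silent fall-through.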

4. Key Responsibilities

  • Build a small primitive graph from Tier-1 detections inside the ROI: path nodes (footpaths, roads), endpoint nodes (branch piles, dark entrances, dugouts), context nodes (trees, tree blocks).
  • Score path freshness using the freshness model (texture, edge clarity, undisturbed-surroundings cues).
  • Score concealment for endpoint candidates.
  • At intersections, recommend the freshest / most-promising branch for gimbal_controller to pan toward; emit a follow plan that keeps the path centered while the UAV moves.
  • Bound every inference call by a strict ROI size and timeout. Never run on a full frame.
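
The graph-building and branch-choice responsibilities above can be sketched as a small adjacency-list graph. Everything here is illustrative: `PrimitiveGraph`, `NodeKind`, and `freshest_branch` are hypothetical names, the freshness values are assumed to come from the freshness model as plain `f32` scores, and "freshest neighbouring path node" is one simple reading of "freshest / most-promising branch".

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeKind {
    Path,     // footpaths, roads
    Endpoint, // branch piles, dark entrances, dugouts
    Context,  // trees, tree blocks
}

#[derive(Debug, Clone)]
pub struct Node {
    pub kind: NodeKind,
    /// Freshness score; only meaningful for path nodes in this sketch.
    pub freshness: f32,
}

/// ROI-scoped graph; dropping the value discards all state for that ROI.
#[derive(Debug, Default)]
pub struct PrimitiveGraph {
    pub nodes: Vec<Node>,
    /// Undirected adjacency: `edges[i]` lists the neighbours of node `i`.
    pub edges: Vec<Vec<usize>>,
}

impl PrimitiveGraph {
    /// Adds a node and returns its index.
    pub fn add_node(&mut self, kind: NodeKind, freshness: f32) -> usize {
        self.nodes.push(Node { kind, freshness });
        self.edges.push(Vec::new());
        self.nodes.len() - 1
    }

    /// Connects two existing nodes in both directions.
    pub fn connect(&mut self, a: usize, b: usize) {
        self.edges[a].push(b);
        self.edges[b].push(a);
    }

    /// Among the path neighbours of `intersection`, skipping the segment we
    /// arrived on, return the one with the highest freshness score.
    pub fn freshest_branch(&self, intersection: usize, came_from: Option<usize>) -> Option<usize> {
        self.edges[intersection]
            .iter()
            .copied()
            .filter(|&n| Some(n) != came_from && self.nodes[n].kind == NodeKind::Path)
            .max_by(|&a, &b| self.nodes[a].freshness.total_cmp(&self.nodes[b].freshness))
    }
}
```

A real scorer would presumably also weigh endpoint and concealment scores reachable along each branch; this sketch ranks on freshness alone to keep the selection rule visible.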

5. Internal State

  • ROI-scoped primitive graphs (per-ROI lifetime; dropped on zoom-in exit).
  • Lightweight CNN session (ONNX/TensorRT engine).

State is in-process only.

6. Failure Modes

| Failure | Detection | Behaviour |
|---|---|---|
| ROI size exceeds limit | pre-decode size check | Reject the ROI; surface to scan_controller as tier2_oversize; do not decode. |
| Inference timeout (>200 ms) | wall-clock | Return Tier2Evidence with status: timeout; scan_controller decides to skip VLM and surface a low-evidence POI. |
| CNN session OOM or hardware error | inference call error | Health → red on sustained errors; scan_controller falls back to Tier-1-only POI surfacing. |
| Inconsistent primitive graph (e.g., disconnected paths) | graph validation step | Emit Tier2Evidence with recommended_next_action: ReturnToZoomOut and path_freshness: undefined. |
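
The first three rows of the table can be sketched as a single guard around the inference call. This is a minimal illustration under stated assumptions: `run_bounded`, `Tier2Status`, and the 2 MiB `MAX_ROI_BYTES` limit are hypothetical (the source gives only the 200 ms budget), and the inference call is stood in for by a closure. The sketch measures wall-clock time after the call returns and does not cancel a running inference; real cancellation would depend on the ONNX Runtime / TensorRT bindings in use.

```rust
use std::time::{Duration, Instant};

/// Assumed ROI size limit; the actual value is not given in the source.
pub const MAX_ROI_BYTES: usize = 2 * 1024 * 1024;
/// Inference budget taken from the failure-mode table (>200 ms is a timeout).
pub const INFERENCE_BUDGET: Duration = Duration::from_millis(200);

#[derive(Debug, PartialEq)]
pub enum Tier2Status<T> {
    Ok(T),
    /// Maps to `tier2_oversize`: rejected before any decode or inference.
    Oversize,
    /// Wall-clock budget exceeded.
    Timeout,
    /// Session OOM or hardware error reported by the inference call.
    InferenceError(String),
}

/// Runs `infer` only if the ROI passes the size check, then classifies the result.
pub fn run_bounded<T, F>(roi_bytes: usize, infer: F) -> Tier2Status<T>
where
    F: FnOnce() -> Result<T, String>,
{
    // Pre-decode size check: the closure is never invoked for oversize ROIs.
    if roi_bytes > MAX_ROI_BYTES {
        return Tier2Status::Oversize;
    }
    let started = Instant::now();
    let result = infer();
    // In this sketch a late result counts as a timeout even if it succeeded.
    if started.elapsed() > INFERENCE_BUDGET {
        return Tier2Status::Timeout;
    }
    match result {
        Ok(value) => Tier2Status::Ok(value),
        Err(message) => Tier2Status::InferenceError(message),
    }
}
```

Returning a status value instead of panicking leaves the fallback decision with the caller, which matches the table's division of labour between this component and scan_controller.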

7. Dependencies

In-process: detection_client, frame_ingest, scan_controller.

External: ONNX Runtime / TensorRT (whichever the lightweight CNN ships with), OpenCV (preprocessing).

8. Non-Functional Targets

| Concern | Target |
|---|---|
| Per-ROI latency | ≤200 ms p99 |
| Concealed-position recall | ≥60 % |
| Concealed-position precision | ≥20 % (operators filter) |
| Footpath detection recall | ≥70 % |
| ROI memory footprint | bounded; no unbounded buffering |
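
To make the latency target concrete, here is a sketch of checking a batch of per-ROI latency samples against the 200 ms p99 bound. It uses the nearest-rank percentile definition, which is an assumption: the source does not say how tier2_latency_p99 is computed, and a production health aggregator may well use a streaming histogram instead. The function names are hypothetical.

```rust
/// Nearest-rank percentile over latency samples in milliseconds.
/// Returns `None` for an empty slice or a percentile outside 1..=100.
pub fn percentile(samples: &[u64], p: usize) -> Option<u64> {
    if samples.is_empty() || p == 0 || p > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // 1-based rank = ceil(p * n / 100), computed in integers to avoid float rounding.
    let rank = (p * sorted.len() + 99) / 100;
    Some(sorted[rank - 1])
}

/// True when the p99 of the samples is within the 200 ms per-ROI target.
pub fn meets_latency_target(samples: &[u64]) -> bool {
    matches!(percentile(samples, 99), Some(p99) if p99 <= 200)
}
```

With 100 samples, nearest-rank p99 is the 99th smallest value, so a single slow outlier is tolerated but two are not.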

9. References

  • architecture.md §3, §7.6 Tier 2 semantic analyzer, §7.5 Training Data.
  • system-flows.md §F1 Frame pipeline, §F4 Scan controller behaviour tree.
  • data_model.md §Tier2Evidence.