[AZ-626] Decompose complete: 47 tasks + docs + module layout

Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy Qt/C++ to a Rust workspace.

- Remove legacy Qt/C++ tree (ai_controller, drone_controller, misc/camera, python_scaffold, root Dockerfile, autopilot.pro, legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria, security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model, glossary, decision-rationale, deployment, 13 component descriptions, tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed, ready for Step 7 Implement).

Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks AZ-640..AZ-686.
Total complexity 173 points across 12 epics.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-21 13:11:11 +00:00 · 2026-05-19 11:02:01 +03:00
parent f7d6cb4a3a
commit bc40ea7300
235 changed files with 12585 additions and 15097 deletions
@@ -0,0 +1,82 @@
+# Component — `vlm_client` (optional)
+
+**Layer**: Perception (data plane in)
+**Status**: forward-looking design (Rust); optional behind a feature flag and a runtime config flag
+
+## 1. Purpose
+
+Tier 3 of the perception pipeline. Asks a local NanoLLM/VILA1.5-3B process to confirm a zoom-in endpoint POI using one bounded ROI crop and a short prompt. Returns a structured `VlmAssessment`. The free-form VLM text is **not** a downstream API contract — only the validated structured output is.
+
+VLM is optional; the system MUST function correctly when VLM is disabled or absent.
+
+## 2. Inputs
+
+| Input | Source | Cadence | Notes |
+|---|---|---|---|
+| Zoom-in ROI crop + prompt | `scan_controller` | per zoom-in endpoint hold | One bounded crop, short prompt, short answer. |
+| `vlm_enabled` runtime flag | startup config | once at start (re-readable on SIGHUP if implemented) | Gates whether `scan_controller` calls this component at all. |
+| IPC socket path | startup config | once | Unix-domain socket to the NanoLLM process. |
+
+## 3. Outputs
+
+| Output | Consumer | Shape |
+|---|---|---|
+| `VlmAssessment` | `scan_controller` | `{ label, confidence, status: ok \| inconclusive \| timeout \| schema_invalid \| ipc_error \| disabled, source_roi_id, latency_ms, model_version }` |
+| Health metric | health aggregator | `enabled`, `vlm_latency_p50/p99`, `errors_by_kind`, `peer_cred_check_pass_rate`. |
+
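Reviewer sketch: the `VlmAssessment` shape from the Outputs table could be modelled roughly as below. The field types (`Option<String>`, `u64`, `u32`), the `degraded` constructor, and the `is_usable_evidence` helper are assumptions for illustration; `data_model.md` remains the authority on the real definition.

```rust
/// Outcome of one VLM confirmation attempt.
/// Variants mirror the `status` values listed in the Outputs table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VlmStatus {
    Ok,
    Inconclusive,
    Timeout,
    SchemaInvalid,
    IpcError,
    Disabled,
}

impl VlmStatus {
    /// Spelling used in the spec (and, presumably, in logs and metrics).
    pub fn as_str(self) -> &'static str {
        match self {
            VlmStatus::Ok => "ok",
            VlmStatus::Inconclusive => "inconclusive",
            VlmStatus::Timeout => "timeout",
            VlmStatus::SchemaInvalid => "schema_invalid",
            VlmStatus::IpcError => "ipc_error",
            VlmStatus::Disabled => "disabled",
        }
    }

    /// Fail-closed helper (hypothetical): only `Ok` may count as VLM evidence.
    pub fn is_usable_evidence(self) -> bool {
        matches!(self, VlmStatus::Ok)
    }
}

/// Structured result returned to `scan_controller`.
/// Field types are assumptions; the spec only names the fields.
#[derive(Debug, Clone, PartialEq)]
pub struct VlmAssessment {
    pub label: Option<String>,
    pub confidence: Option<f32>,
    pub status: VlmStatus,
    pub source_roi_id: u64,
    pub latency_ms: u32,
    pub model_version: Option<String>,
}

impl VlmAssessment {
    /// Builds an assessment for any non-`ok` path, carrying no label or confidence.
    pub fn degraded(status: VlmStatus, source_roi_id: u64, latency_ms: u32) -> Self {
        VlmAssessment {
            label: None,
            confidence: None,
            status,
            source_roi_id,
            latency_ms,
            model_version: None,
        }
    }
}
```

Keeping `label` and `confidence` optional means a degraded result cannot accidentally carry free-form VLM text downstream.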
+## 4. Key Responsibilities
+
+- Validate the ROI payload (size, format) **before** sending it across the IPC channel.
+- Maintain the Unix-domain-socket connection to the NanoLLM process; perform a peer-credential check on connect (where supported by the platform).
+- Send one bounded ROI and a short prompt; await one short response within the 5 s budget.
+- Validate the response against the `VlmAssessment` schema; on schema-invalid, return `status: schema_invalid` to `scan_controller` and surface to health.
+- Return `status: disabled` when the runtime flag is `false`; `scan_controller` treats this identically to "VLM not present" and proceeds with Tier 2 evidence alone.
+- Capture `model_version` (whatever the NanoLLM process reports for its loaded weights) on every assessment for forensic correlation; log the version on change.
+
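Reviewer sketch: two of the responsibilities above, the pre-send ROI check and the bounded wait for a response, might look like this using only the standard library. `MAX_ROI_BYTES`, the JPEG-only format check, `RoiError`, `CallOutcome`, and `call_with_budget` are all hypothetical; the spec says only "size, format" and a 5 s budget.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Assumed limit; the spec does not state a concrete size cap.
pub const MAX_ROI_BYTES: usize = 512 * 1024;
/// Assumed format: JPEG, identified by its start-of-image marker.
const JPEG_MAGIC: [u8; 3] = [0xFF, 0xD8, 0xFF];

#[derive(Debug, PartialEq, Eq)]
pub enum RoiError {
    Empty,
    TooLarge(usize),
    UnsupportedFormat,
}

/// Checks size first, then format, before anything crosses the IPC channel.
pub fn validate_roi(bytes: &[u8]) -> Result<(), RoiError> {
    if bytes.is_empty() {
        return Err(RoiError::Empty);
    }
    if bytes.len() > MAX_ROI_BYTES {
        return Err(RoiError::TooLarge(bytes.len()));
    }
    if !bytes.starts_with(&JPEG_MAGIC) {
        return Err(RoiError::UnsupportedFormat);
    }
    Ok(())
}

#[derive(Debug, PartialEq, Eq)]
pub enum CallOutcome {
    Response(String),
    Timeout,
    IpcError,
}

/// Runs `call` on a worker thread and waits at most `budget` for its reply,
/// so the caller is never blocked past the budget.
pub fn call_with_budget<F>(call: F, budget: Duration) -> CallOutcome
where
    F: FnOnce() -> Result<String, std::io::Error> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that.
        let _ = tx.send(call());
    });
    match rx.recv_timeout(budget) {
        Ok(Ok(text)) => CallOutcome::Response(text),
        Ok(Err(_)) => CallOutcome::IpcError,
        Err(RecvTimeoutError::Timeout) => CallOutcome::Timeout,
        // The worker died without replying (for example, it panicked).
        Err(RecvTimeoutError::Disconnected) => CallOutcome::IpcError,
    }
}
```

A real client would more likely set read/write timeouts on the socket or use an async runtime; the worker thread here only illustrates that the wall-clock budget is enforced on the caller's side.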
+## 5. Internal State
+
+- IPC socket handle and peer-credential cache.
+- In-flight request map (request id → caller).
+
+State is in-process only.
+
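Reviewer sketch: the in-flight request map (request id → caller) could be as small as the structure below. `InFlight`, its method names, and the use of `std::sync::mpsc` channels to reach the caller are assumptions; the spec does not prescribe a representation.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Maps each outstanding request id to the channel its caller is waiting on.
pub struct InFlight<T> {
    next_id: u64,
    pending: HashMap<u64, Sender<T>>,
}

impl<T> InFlight<T> {
    pub fn new() -> Self {
        InFlight { next_id: 0, pending: HashMap::new() }
    }

    /// Allocates a fresh request id and returns the receiving end for the caller.
    pub fn register(&mut self) -> (u64, Receiver<T>) {
        let id = self.next_id;
        self.next_id += 1;
        let (tx, rx) = channel();
        self.pending.insert(id, tx);
        (id, rx)
    }

    /// Delivers a response; returns false if the id is unknown
    /// (already completed or cancelled) or the caller has gone away.
    pub fn complete(&mut self, id: u64, value: T) -> bool {
        match self.pending.remove(&id) {
            Some(tx) => tx.send(value).is_ok(),
            None => false,
        }
    }

    /// Drops a request, for example after its timeout budget has expired.
    pub fn cancel(&mut self, id: u64) -> bool {
        self.pending.remove(&id).is_some()
    }

    pub fn len(&self) -> usize {
        self.pending.len()
    }
}
```

Removing the entry on both completion and cancellation keeps late responses from reaching a caller that has already been told `timeout`.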
+## 6. Failure Modes
+
+| Failure | Detection | Behaviour |
+|---|---|---|
+| VLM process not reachable | connect / send error | Return `status: ipc_error`; bounded-backoff reconnect; health → yellow then red. |
+| Peer-cred check fails | platform API | Hard-fail the connect; do not retry without operator intervention; health → red. |
+| Response timeout (>5 s) | wall-clock | Return `status: timeout`; do not block `scan_controller` past the budget. |
+| Schema-invalid response | response parser | Return `status: schema_invalid`; log the raw response (size-capped) for offline analysis. |
+| ROI payload too large | pre-send size check | Return `status: schema_invalid` synchronously; never send. |
+| Optional component absent at build time | feature flag off at compile | `scan_controller` depends only on the `VlmAssessmentProvider` trait; the default impl returns `status: disabled`. The binary builds and runs identically without `vlm_client`. |
+
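Reviewer sketch: the "bounded-backoff reconnect; health → yellow then red" row could be driven by a small policy object like the one below. The doubling schedule, the `red_after` threshold, and every name here are assumptions; the spec fixes neither the delays nor the point at which health turns red.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    Green,
    Yellow,
    Red,
}

/// Exponential backoff with a hard cap, plus a failure-count health signal.
pub struct ReconnectPolicy {
    base: Duration,
    cap: Duration,
    red_after: u32,
    failures: u32,
}

impl ReconnectPolicy {
    pub fn new(base: Duration, cap: Duration, red_after: u32) -> Self {
        ReconnectPolicy { base, cap, red_after, failures: 0 }
    }

    /// Records a failed connect/send and returns how long to wait before retrying.
    pub fn on_failure(&mut self) -> Duration {
        self.failures = self.failures.saturating_add(1);
        // Double the delay per consecutive failure; clamp the exponent so the
        // shift cannot overflow, then clamp the delay to the cap.
        let exp = (self.failures - 1).min(16);
        self.base.saturating_mul(1u32 << exp).min(self.cap)
    }

    /// Resets the failure streak after a successful exchange.
    pub fn on_success(&mut self) {
        self.failures = 0;
    }

    /// Green with no failures, yellow while retrying, red once the streak
    /// reaches `red_after`.
    pub fn health(&self) -> Health {
        if self.failures == 0 {
            Health::Green
        } else if self.failures < self.red_after {
            Health::Yellow
        } else {
            Health::Red
        }
    }
}
```

The peer-credential failure row is deliberately not covered: that path hard-fails and does not retry, so it should bypass any backoff policy.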
+## 7. Dependencies
+
+**In-process**: `scan_controller`.
+
+**External**: NanoLLM / VILA1.5-3B local process. IPC over Unix-domain socket. No network egress.
+
+## 8. Non-Functional Targets
+
+| Concern | Target |
+|---|---|
+| Per-ROI latency | ≤5 s p99 |
+| Memory budget | within the 6 GB shared budget after Tier 1 + Tier 2 |
+| Cloud egress | **none** (hard rule) |
+| Failure mode | fail-closed — never surface a POI with VLM evidence on a degraded VLM call |
+
+## 9. Optionality Model
+
+Two complementary mechanisms; the implementation chooses one or both:
+
+1. **Runtime flag (`vlm_enabled`)** gated by the benchmark-gate result. When `false`, `scan_controller` skips VLM confirmation; the zoom-in hold proceeds with Tier 2 evidence alone.
+2. **Build-time feature module.** `vlm_client` is a separate Cargo feature; the binary builds, links, and runs identically when the feature is off. `scan_controller` depends on a `VlmAssessmentProvider` trait whose default impl returns `status: disabled`.
+
+Both must yield the same observable behaviour: the system functions correctly with VLM absent, only losing the zoom-in confirmation step.
+
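Reviewer sketch: one way the two mechanisms could converge on the same observable behaviour. Only the `VlmAssessmentProvider` trait name and the `disabled` default come from the spec; the `assess` signature, `NoVlm`, `FixedVlm`, `VlmEvidence`, and `vlm_evidence` are hypothetical, and the prompt string is a placeholder. In a real crate the IPC-backed provider would presumably sit behind `#[cfg(feature = "vlm_client")]`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VlmStatus {
    Ok,
    Inconclusive,
    Timeout,
    SchemaInvalid,
    IpcError,
    Disabled,
}

/// The only VLM surface `scan_controller` depends on.
pub trait VlmAssessmentProvider {
    /// Default impl: what callers see when the feature is compiled out.
    fn assess(&self, _roi: &[u8], _prompt: &str) -> VlmStatus {
        VlmStatus::Disabled
    }
}

/// Provider used when the `vlm_client` feature is off; inherits the default.
pub struct NoVlm;
impl VlmAssessmentProvider for NoVlm {}

/// Stand-in for the real IPC client, returning a fixed status (illustration only).
pub struct FixedVlm(pub VlmStatus);
impl VlmAssessmentProvider for FixedVlm {
    fn assess(&self, _roi: &[u8], _prompt: &str) -> VlmStatus {
        self.0
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum VlmEvidence {
    Confirmed,
    Absent,
}

/// Caller-side view: the runtime flag and the compiled-out provider both end
/// in `Absent`, and any degraded status never counts as evidence (fail-closed).
pub fn vlm_evidence(
    provider: &dyn VlmAssessmentProvider,
    vlm_enabled: bool,
    roi: &[u8],
) -> VlmEvidence {
    if !vlm_enabled {
        // Runtime flag off: skip the call entirely.
        return VlmEvidence::Absent;
    }
    match provider.assess(roi, "placeholder prompt") {
        VlmStatus::Ok => VlmEvidence::Confirmed,
        _ => VlmEvidence::Absent,
    }
}
```

With this shape, `vlm_evidence(&NoVlm, true, roi)` and `vlm_evidence(&FixedVlm(VlmStatus::Ok), false, roi)` both yield `VlmEvidence::Absent`, which is the equivalence the section asks for.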
+## 10. References
+
+- `architecture.md §5 Architectural Principles` (no cloud egress, fail-closed), `§7.6 Local VLM confirmation`.
+- `system-flows.md §F3 VLM confirmation` (with explicit fail-closed and disabled branches).
+- `data_model.md §VlmAssessment`.