[AZ-643] [AZ-665] [AZ-672] mavlink+mapobjects+vlm batch 4
ci/woodpecker/push/build-arm Pipeline failed

AZ-643 mavlink_layer:
- ack demux on COMMAND_LONG/COMMAND_ACK with oneshot dispatch and
  configurable deadline; MavlinkHandle::send_command + SendCommandError
- MAVLink-2 signing: Signer/Verifier built on SHA-256, key + timestamp
  source, incompat-flag wiring in encoder, reject + counter in decoder
- new tests: tests/ack_demux.rs (3) + tests/signing.rs (5)

AZ-665 mapobjects_store:
- internal/h3_index.rs (h3o wrapper, cell_of, grid_disk, haversine)
- internal/store.rs (in-memory (cell -> Vec<MapObject>) hashmap with
  k-ring classify and class-group resolution)
- public API: MapObjectsStoreHandle::classify(ClassifyInput) ->
  Classification {New|Moved|Existing}
- AC1-4 in tests/classify.rs; AC5 perf gate (#[ignore], passes in
  --release)

AZ-672 vlm_client + autopilot:
- DisabledVlmProvider in shared::contracts; VlmProvider::name() for
  composition-root diagnostics
- vlm_client::VlmClient gated behind feature = "vlm"; placeholder
  until AZ-673 lands the real NanoLLM IPC
- autopilot: vlm_client is now optional = true under feature vlm;
  Runtime::select_vlm_provider picks DisabledVlmProvider when feature
  off OR config.vlm.enabled = false

Workspace deps: +sha2 (mavlink signing), +h3o (mapobjects index).
Batch report: _docs/03_implementation/batch_04_cycle1_report.md

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-19 13:31:42 +03:00
parent 0a87c0f716
commit 69c0629350
29 changed files with 2492 additions and 131 deletions
@@ -1,74 +0,0 @@
# MAVLink Ack Demux, Retry, and Signing
**Task**: AZ-643_mavlink_ack_demux_and_signing
**Name**: Command-ack demux + retry handle + optional MAVLink-2 signing
**Description**: Map outbound `COMMAND_LONG` requests to their `COMMAND_ACK` responses by `command_id`, enforce ack timeout, surface result to the originating caller; optionally enable MAVLink-2 message signing.
**Complexity**: 3 points
**Dependencies**: AZ-640_initial_structure, AZ-641_mavlink_transport_and_heartbeat, AZ-642_mavlink_codec
**Component**: mavlink_layer
**Tracker**: AZ-643
**Epic**: AZ-637
## Problem
Outbound MAVLink commands are asynchronous with respect to their acks. `mission_executor` (and other callers) need a synchronous-feeling `send_command(...) -> Result<CommandAck>` API that times out at a configurable wall-clock deadline (default 1 s). The retry decision then belongs to the caller, not to `mavlink_layer`. Separately, when the autopilot link supports it, MAVLink-2 message signing should be enabled for outbound frames and validated for inbound frames; frames with mismatched signatures are rejected.
## Outcome
- `MavlinkHandle::send_command(cmd) -> Result<CommandAck, AckTimeout>` resolves when a matching `COMMAND_ACK` arrives within the deadline, or returns `AckTimeout` otherwise.
- An in-flight command map (`command_id → (caller, deadline)`) is correctly populated and cleared on success and on timeout (no leaks).
- When `signing_enabled = true` at config time, outbound frames are signed; inbound frames with bad signatures are rejected and counted (`parse_errors_total{kind="signing_mismatch"}`).
- `signing_enabled` is reported in `health()`.
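
The in-flight map described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `AckDemux`, the trimmed `CommandAck` fields, and the blocking `mpsc` channel (standing in for an async oneshot) are all assumptions made so the example runs without dependencies.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::sync::{Arc, Mutex};
use std::time::Duration;

/// Hypothetical, trimmed-down ack payload (not the crate's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct CommandAck {
    pub command_id: u16,
    pub result: u8, // 0 = MAV_RESULT_ACCEPTED
}

#[derive(Debug, PartialEq)]
pub struct AckTimeout;

/// In-flight map: command_id -> channel back to the waiting caller.
/// Assumes at most one outstanding command per command_id.
#[derive(Clone, Default)]
pub struct AckDemux {
    in_flight: Arc<Mutex<HashMap<u16, Sender<CommandAck>>>>,
}

impl AckDemux {
    /// Register the command, hand it to `transmit`, then block until the
    /// matching ack arrives or the deadline passes.
    pub fn send_command(
        &self,
        command_id: u16,
        deadline: Duration,
        transmit: impl FnOnce(u16),
    ) -> Result<CommandAck, AckTimeout> {
        let (tx, rx) = mpsc::channel();
        self.in_flight.lock().unwrap().insert(command_id, tx);
        transmit(command_id);
        match rx.recv_timeout(deadline) {
            Ok(ack) => Ok(ack),
            Err(RecvTimeoutError::Timeout) => {
                // Evict on timeout so the map cannot leak entries.
                self.in_flight.lock().unwrap().remove(&command_id);
                Err(AckTimeout)
            }
            // Our sender was replaced by a newer command with the same id;
            // the map entry now belongs to that caller, so leave it alone.
            Err(RecvTimeoutError::Disconnected) => Err(AckTimeout),
        }
    }

    /// Called by the receive path for every inbound COMMAND_ACK.
    /// Returns true when a caller was waiting for this command_id.
    pub fn on_ack(&self, ack: CommandAck) -> bool {
        let waiter = self.in_flight.lock().unwrap().remove(&ack.command_id);
        match waiter {
            Some(tx) => tx.send(ack).is_ok(),
            None => false,
        }
    }

    pub fn commands_in_flight(&self) -> usize {
        self.in_flight.lock().unwrap().len()
    }
}
```

Both the success path and the timeout path remove the entry, which is what keeps `commands_in_flight` at 0 once every call has resolved. Lookup and removal are single hash-map operations, in line with the O(1) demux requirement.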
## Scope
### Included
- In-flight command map with deadline-driven eviction.
- Public `send_command(...) -> Result<CommandAck, AckTimeout>` API.
- MAVLink-2 outbound signature + inbound signature validation (off-by-default; on when configured).
- Health fields: `commands_in_flight`, `signing_enabled`.
### Excluded
- The decision to retry on `AckTimeout` (belongs to `mission_executor`).
- Encoding the new commands themselves (task 03).
## Acceptance Criteria
**AC-1: Command-ack happy path**
Given a healthy SITL link
When `send_command(MAV_CMD_NAV_RETURN_TO_LAUNCH)` is called
Then within 1 s the result resolves with `MAV_RESULT_ACCEPTED` and `commands_in_flight` returns to 0.
**AC-2: Ack timeout returns explicit error**
Given a SITL instance that is configured not to ack commands (or is paused)
When `send_command(...)` is called with the default 1 s deadline
Then the call resolves with `Err(AckTimeout)`; the in-flight map is cleared; the link stays open.
**AC-3: Signing rejection counted**
Given `signing_enabled = true` and an inbound frame whose signature does not match
When the decoder runs on the frame
Then the frame is rejected, `parse_errors_total{kind="signing_mismatch"}` increments by 1, and the link stays open.
**AC-4: Optional signing — disabled path**
Given `signing_enabled = false`
When inbound frames arrive (signed or unsigned)
Then the signature field is ignored and `parse_errors_total{kind="signing_mismatch"}` stays at 0.
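
AC-3 and AC-4 together describe a gate in the decoder. A minimal sketch of that gate follows, assuming the MAVLink-2 layout in which the signature is the first 48 bits of SHA-256 over the secret key, the frame bytes through the CRC, the link id, and the 48-bit timestamp. `SigningGate`, `Hash256`, and `toy_hash` are hypothetical names; the hash is injected so the example has no dependencies, and a real build would pass SHA-256 (the commit adds `sha2` for this). Timestamp replay checks are left out.

```rust
/// Hash function injected by the caller. Production code would pass SHA-256.
pub type Hash256 = fn(&[u8]) -> [u8; 32];

/// NOT cryptographic. Exists only so this sketch runs without external crates.
pub fn toy_hash(data: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, b) in data.iter().enumerate() {
        out[i % 32] = out[i % 32].wrapping_mul(31).wrapping_add(*b);
    }
    out
}

/// First 6 bytes of hash(key || frame_through_crc || link_id || timestamp[0..6]).
pub fn signature(
    hash: Hash256,
    key: &[u8; 32],
    frame_through_crc: &[u8],
    link_id: u8,
    timestamp: u64,
) -> [u8; 6] {
    let mut buf = Vec::with_capacity(32 + frame_through_crc.len() + 7);
    buf.extend_from_slice(key);
    buf.extend_from_slice(frame_through_crc);
    buf.push(link_id);
    // The wire format carries a 48-bit little-endian timestamp.
    buf.extend_from_slice(&timestamp.to_le_bytes()[..6]);
    let digest = hash(&buf);
    let mut sig = [0u8; 6];
    sig.copy_from_slice(&digest[..6]);
    sig
}

/// Hypothetical decoder-side gate holding the on/off switch and the counter.
pub struct SigningGate {
    pub enabled: bool,
    pub key: [u8; 32],
    pub signing_mismatch_total: u64,
}

impl SigningGate {
    /// Returns true when the frame may continue through the decoder.
    pub fn accept(
        &mut self,
        hash: Hash256,
        frame_through_crc: &[u8],
        link_id: u8,
        timestamp: u64,
        received: &[u8; 6],
    ) -> bool {
        if !self.enabled {
            return true; // AC-4: signature field ignored, counter untouched
        }
        let expected = signature(hash, &self.key, frame_through_crc, link_id, timestamp);
        if expected == *received {
            true
        } else {
            self.signing_mismatch_total += 1; // AC-3: reject and count
            false
        }
    }
}
```

Rejecting a frame only bumps the counter and returns `false`; nothing in the gate touches the transport, so the link stays open in both cases.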
## Non-Functional Requirements
**Performance**
- Ack demux lookup: O(1); does not contribute measurably to the ≤50 ms per-message round-trip target.
**Reliability**
- No leaked entries in the in-flight map; every `send_command` either resolves or times out.
## Constraints
- Signing scheme decision (Q6) lives elsewhere — this task only wires the on/off mechanism using the spec-defined MAVLink-2 signing.
## Runtime Completeness
- **Named capability**: MAVLink-2 message signing (when enabled) + COMMAND_ACK demux.
- **Production code that must exist**: real signature computation + verification; in-flight map keyed by `command_id`.
- **Allowed external stubs**: SITL with signing disabled is the default test fixture; a separate fixture exercises the signing path.
- **Unacceptable substitutes**: signature stub that always returns "valid" is not acceptable in production.
@@ -1,81 +0,0 @@
# H3 Indexing + Classify
**Task**: AZ-665_mapobjects_store_h3_classify
**Name**: H3 indexing + k-ring classify(detection) → new/moved/existing
**Description**: Compute the H3 cell for each detection at the configured resolution (default 10; average cell area ~15 000 m², average edge on the order of 70 m). Maintain an in-memory `(H3_cell + class) → MapObject` hashmap. Answer `classify(detection)` using a k-ring (k=2 default) lookup against the `(distance_threshold_m, move_threshold_m, similar_classes)` config.
**Complexity**: 5 points
**Dependencies**: AZ-640_initial_structure
**Component**: mapobjects_store
**Tracker**: AZ-665
**Epic**: AZ-633
## Problem
The H3 spatial index is the foundation of new-vs-existing detection (`architecture.md §7.12`). Each detection's MGRS position is converted to an H3 cell at the configured resolution; the composite key `(H3_cell, class)` keys an in-memory map of known MapObjects. Classification answers `new | moved | existing` by querying the k-ring of cells around the detection (so that objects just across a cell boundary are still found) and comparing the distance to each candidate against the configured thresholds.
## Outcome
- `H3Index::cell_of(mgrs, resolution) -> H3Cell`.
- `MapObjectsStore::classify(detection) -> MapObjectClassification ∈ {New, Moved { from_mgrs, to_mgrs }, Existing { existing_id }}`.
- k-ring lookup (default k=2) over the in-memory hashmap.
- `distance_threshold_m` (default 30 m), `move_threshold_m` (default 50 m), `similar_classes` (configured set per `data_model.md §IgnoredItem` class groups) read from config.
- `classify` is O(1) with respect to store size, with p99 ≤1 ms.
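
The distance comparison behind these thresholds can be sketched as follows. This is an illustration under stated assumptions: positions are taken as latitude/longitude in degrees rather than MGRS, and `DistanceBand` and `band` are hypothetical names. The source does not say how a distance between `distance_threshold_m` and `move_threshold_m` should be classified, so the sketch reports that band separately instead of guessing.

```rust
/// Mean Earth radius in metres.
const EARTH_RADIUS_M: f64 = 6_371_008.8;

/// Great-circle distance in metres between two lat/lon points given in degrees.
pub fn haversine_m(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (phi1, phi2) = (lat1.to_radians(), lat2.to_radians());
    let d_phi = (lat2 - lat1).to_radians();
    let d_lambda = (lon2 - lon1).to_radians();
    let a = (d_phi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (d_lambda / 2.0).sin().powi(2);
    // Clamp guards against rounding pushing `a` slightly above 1.
    2.0 * EARTH_RADIUS_M * a.min(1.0).sqrt().asin()
}

/// Where a candidate distance falls relative to the two configured thresholds.
#[derive(Debug, PartialEq)]
pub enum DistanceBand {
    /// d <= distance_threshold_m: same object (AC-2).
    WithinDistance,
    /// Between the thresholds: handling not specified in the source.
    Between,
    /// d > move_threshold_m: the object moved (AC-3).
    BeyondMove,
}

pub fn band(d_m: f64, distance_threshold_m: f64, move_threshold_m: f64) -> DistanceBand {
    if d_m <= distance_threshold_m {
        DistanceBand::WithinDistance
    } else if d_m > move_threshold_m {
        DistanceBand::BeyondMove
    } else {
        DistanceBand::Between
    }
}
```

With the defaults of 30 m and 50 m, a detection 5 m from a known object lands in `WithinDistance` and one 60 m away lands in `BeyondMove`, matching AC-2 and AC-3.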
## Scope
### Included
- H3 binding (Rust crate `h3o` or equivalent).
- `MapObjectsStore` struct + in-memory hashmap.
- `classify` API.
- Config-driven thresholds.
### Excluded
- IgnoredItem suppression (task 27).
- Pre-flight hydrate + sync_state machine (task 28).
- Persistence (task 29).
- End-of-pass removed-candidate sweep (task 27).
## Acceptance Criteria
**AC-1: New detection at unseen MGRS**
Given an empty store
When `classify(detection_at_M1, class=A)` is called
Then it returns `Classification::New`.
**AC-2: Existing detection at known MGRS within threshold**
Given the store has a MapObject at `M1, class=A`
When `classify(detection_at_M1+5m, class=A)` is called and `distance_threshold_m = 30`
Then it returns `Classification::Existing { existing_id: ... }`.
**AC-3: Moved detection beyond move threshold**
Given the store has a MapObject at `M1, class=A`
When `classify(detection_at_M1+60m, class=A)` is called and `move_threshold_m = 50`
Then it returns `Classification::Moved { from_mgrs: M1, to_mgrs: M1+60m }`.
**AC-4: k-ring boundary lookup**
Given the store has a MapObject in cell `C1`
When a new detection falls in cell `C2` (a cell adjacent to `C1`, just across the shared boundary)
Then with k=2 the lookup finds `C1` and returns `Existing` (not `New`).
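
The boundary case in AC-4 comes down to which cells are consulted. A minimal sketch, assuming a hypothetical `CellStore` and a `grid_disk` function supplied by the caller (in the real crate this would come from the H3 binding; here it is injected so the example runs on its own):

```rust
use std::collections::HashMap;

/// Stand-in for an H3 cell index.
pub type Cell = u64;

#[derive(Debug, Clone, PartialEq)]
pub struct MapObject {
    pub id: u64,
    pub class: String,
}

/// Hypothetical in-memory store: cell -> objects located in that cell.
#[derive(Default)]
pub struct CellStore {
    by_cell: HashMap<Cell, Vec<MapObject>>,
}

impl CellStore {
    pub fn insert(&mut self, cell: Cell, obj: MapObject) {
        self.by_cell.entry(cell).or_default().push(obj);
    }

    /// Collect same-class objects from every cell in the k-ring around `cell`.
    /// `grid_disk(cell, k)` must return all cells within k steps of `cell`.
    pub fn candidates<'a>(
        &'a self,
        cell: Cell,
        k: u32,
        class: &str,
        grid_disk: impl Fn(Cell, u32) -> Vec<Cell>,
    ) -> Vec<&'a MapObject> {
        grid_disk(cell, k)
            .into_iter()
            .filter_map(|c| self.by_cell.get(&c))
            .flatten()
            .filter(|o| o.class == class)
            .collect()
    }
}
```

With k=0 only the detection's own cell is read, so an object stored one cell over is missed and the detection would look new. With k=2 the neighbouring cells are included and the object becomes a candidate. The number of cells visited depends only on k, not on how many objects the store holds.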
**AC-5: Classify p99 ≤1 ms**
Given a store warmed with 10 000 MapObjects
When `classify` is called 1 000 times
Then p99 latency is ≤1 ms.
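
A p99 gate of this kind can be computed with the nearest-rank method. The helper names below (`percentile`, `measure`) are hypothetical, and the sketch assumes wall-clock timing with `std::time::Instant` in a `--release` build is precise enough for a 1 ms budget.

```rust
use std::time::{Duration, Instant};

/// Nearest-rank percentile: the smallest sample such that at least `pct`
/// percent of all samples are less than or equal to it. Sorts in place.
pub fn percentile(samples: &mut [Duration], pct: usize) -> Duration {
    assert!(!samples.is_empty(), "need at least one sample");
    assert!(pct <= 100, "percentile must be in 0..=100");
    samples.sort_unstable();
    // ceil(pct * n / 100) using integer arithmetic.
    let rank = (pct * samples.len() + 99) / 100;
    samples[rank.clamp(1, samples.len()) - 1]
}

/// Run `op` `iterations` times and record how long each call took.
pub fn measure<F: FnMut()>(iterations: usize, mut op: F) -> Vec<Duration> {
    (0..iterations)
        .map(|_| {
            let start = Instant::now();
            op();
            start.elapsed()
        })
        .collect()
}
```

A test for AC-5 would then warm the store, call `measure(1_000, || { /* classify */ })`, and assert that `percentile(&mut timings, 99)` does not exceed `Duration::from_millis(1)`.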
## Non-Functional Requirements
**Performance**
- `classify` is O(1) with respect to store size, with p99 ≤1 ms (per `description.md §9`).
**Reliability**
- k-ring boundary correctness guaranteed by default config.
## Contract
- Canonical typed model: `data_model.md §MapObject`, `§MapObjectClassification`.
## Runtime Completeness
- **Named capability**: H3 spatial index + k-ring queries — production new/moved/existing dispatch.
- **Production code that must exist**: real H3 crate; real k-ring lookup.
- **Unacceptable substitutes**: a naive search that only compares Euclidean distances against every stored object is unacceptable for production (it loses boundary correctness and O(1) latency).
@@ -1,70 +0,0 @@
# VLM Provider Trait + Disabled Default Impl + Feature Flag
**Task**: AZ-672_vlm_client_provider_trait
**Name**: VlmAssessmentProvider trait + default disabled impl + build-time feature gating
**Description**: Define `VlmAssessmentProvider` trait (in `shared::contracts`) and a default impl that always returns `status: disabled`. The `vlm_client` crate is behind a build-time feature flag; with the feature off the default impl is used and the binary builds + runs identically without `vlm_client`.
**Complexity**: 2 points
**Dependencies**: AZ-640_initial_structure
**Component**: vlm_client
**Tracker**: AZ-672
**Epic**: AZ-631
## Problem
VLM is optional in two ways: at runtime (`vlm_enabled` flag) and at build time (`vlm_client` Cargo feature). `scan_controller` depends only on the trait — never on the `vlm_client` crate directly — so the binary builds and runs with VLM absent. The default trait impl returns `status: disabled` so the call-site code path is identical whether VLM is enabled or absent.
## Outcome
- `VlmAssessmentProvider` trait in `shared::contracts::vlm`:
```text
trait VlmAssessmentProvider {
    async fn assess(&self, roi_crop: &RoiCrop, prompt: &str) -> VlmAssessment;
}
```
- Default impl `DisabledVlmProvider` returns `VlmAssessment { status: Disabled, .. }` for every call.
- `vlm_client` Cargo feature gates inclusion of the real `vlm_client` crate; with feature off, only `DisabledVlmProvider` is registered.
- Runtime flag `vlm_enabled = false` causes the composition root to install `DisabledVlmProvider` even when the feature is compiled in.
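
The selection rule in the last two bullets can be sketched as below. Several things here are assumptions made for a dependency-free example: the trait is synchronous rather than `async`, the `VlmAssessment` and `RoiCrop` fields are invented, and `feature_compiled` is passed as a plain `bool` where a real composition root would more likely use `cfg!(feature = "...")`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Disabled,
    Ok,
}

/// Hypothetical shape; the canonical model lives in data_model.md.
#[derive(Debug, Clone, PartialEq)]
pub struct VlmAssessment {
    pub status: Status,
    pub summary: Option<String>,
}

/// Hypothetical stand-in for the ROI crop type.
pub struct RoiCrop {
    pub bytes: Vec<u8>,
}

/// Synchronous simplification of the async trait quoted above.
pub trait VlmAssessmentProvider {
    /// Short label for composition-root diagnostics.
    fn name(&self) -> &'static str;
    fn assess(&self, roi_crop: &RoiCrop, prompt: &str) -> VlmAssessment;
}

/// Always answers "disabled" without doing any work.
pub struct DisabledVlmProvider;

impl VlmAssessmentProvider for DisabledVlmProvider {
    fn name(&self) -> &'static str {
        "disabled"
    }

    fn assess(&self, _roi_crop: &RoiCrop, _prompt: &str) -> VlmAssessment {
        VlmAssessment { status: Status::Disabled, summary: None }
    }
}

/// Install the real provider only when it is both compiled in and enabled
/// in config. `build_real` is never called otherwise, so the real client
/// is not constructed on the disabled path.
pub fn select_vlm_provider(
    feature_compiled: bool,
    vlm_enabled: bool,
    build_real: impl FnOnce() -> Box<dyn VlmAssessmentProvider>,
) -> Box<dyn VlmAssessmentProvider> {
    if feature_compiled && vlm_enabled {
        build_real()
    } else {
        Box::new(DisabledVlmProvider)
    }
}
```

Taking the real provider as a closure keeps its construction lazy, which is what AC-3 asks for: with `vlm_enabled = false` the closure is dropped without being run.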
## Scope
### Included
- Trait definition in `shared::contracts::vlm`.
- `DisabledVlmProvider` default impl (also in `shared` so it's available regardless of feature).
- Cargo feature flag wiring in `Cargo.toml` (workspace + binary).
- Runtime flag plumb from config.
### Excluded
- The real NanoLLM IPC client (task 34).
- Schema validation (task 35).
## Acceptance Criteria
**AC-1: Disabled default returns disabled status**
Given a `DisabledVlmProvider`
When `assess(roi, "...")` is called
Then it returns `VlmAssessment { status: Status::Disabled, .. }` immediately (≤1 ms).
**AC-2: Binary builds without vlm_client feature**
Given the binary is built with `--no-default-features` (or whichever flag turns the `vlm_client` feature off)
When the build runs
Then it succeeds; the `vlm_client` crate is NOT a build dependency.
**AC-3: Runtime vlm_enabled = false uses disabled impl**
Given the binary is built WITH the `vlm_client` feature but config sets `vlm_enabled = false`
When the composition root constructs the provider
Then `DisabledVlmProvider` is installed; the real NanoLLM client is NOT constructed.
## Non-Functional Requirements
**Performance**
- `DisabledVlmProvider::assess` ≤1 ms.
## Contract
- Canonical typed model: `data_model.md §VlmAssessment`.
## Runtime Completeness
- **Named capability**: optional-VLM trait + disabled default.
- **Production code that must exist**: real trait; real disabled impl; real feature-flag wiring.
- **Unacceptable substitutes**: hardcoding `vlm_client` as a non-optional dependency is unacceptable per `description.md §9 Optionality Model`.