[AZ-666] [AZ-673] [AZ-648] ignored set + UDS VLM + mission FSM batch 5
ci/woodpecker/push/build-arm Pipeline failed

AZ-666 mapobjects_store:
- internal/ignored.rs (HashSet<(mgrs, class_group)> for O(1) suppression)
- internal/passes.rs (per-region PassTracker with observed-id set and
  end-of-pass removed-candidate sweep)
- Classification::Ignored wired into classify; apply_decline +
  is_ignored + pass_start + end_of_pass on MapObjectsStoreHandle
- new tests/ignored_and_sweep.rs (3 AC + 2 supplementary)

AZ-673 vlm_client:
- internal/peer_cred.rs (Linux SO_PEERCRED via libc getsockopt;
  PeerCredOutcome::SkippedNonLinux on macOS dev hosts per
  description.md §8)
- internal/prompt.rs (pre-send ROI size + format + prompt
  non-emptiness validation)
- internal/wire.rs (length-prefixed JSON envelope with base64 ROI)
- internal/uds_client.rs (tokio UnixStream client; bounded
  reconnect; hard-stop on peer-cred mismatch; per-request deadline)
- VlmClient with both eager (open/connect) and lazy (new) ctor
- workspace Cargo.toml: base64 + libc as workspace deps

AZ-648 mission_executor:
- internal/types.rs (Variant, MissionState, TransitionKey,
  Telemetry, TransitionEvent, StepOutcome)
- internal/driver.rs (MissionDriver trait + DriverError +
  DriverAction)
- internal/fsm.rs (variant-agnostic Transition + FsmCore + step_one
  with per-transition retry budget keyed by TransitionKey)
- internal/multirotor.rs + internal/fixed_wing.rs (typed transition
  tables; multirotor has Armed/TakeOff, fixed-wing parks in
  WaitAuto for operator AUTO)
- public API: MissionExecutor::run spawns the FSM task and returns
  a clone-safe MissionExecutorHandle (state, health, subscribe,
  paused_reason, retry_count)
- new tests/state_machine.rs (AC-1..AC-4 via ScriptedDriver fake;
  SITL conformance lands with AZ-649 telemetry forwarding)

Workspace: cargo fmt + clippy -D warnings clean; full
cargo test --workspace --all-features green (1 ignored = AZ-665
perf gate). Tasks moved todo/ → done/, autodev state set to batch
6 selection.

Refs: _docs/03_implementation/batch_05_cycle1_report.md
Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-19 16:54:00 +03:00
parent 69c0629350
commit b5cc0c321c
30 changed files with 3343 additions and 111 deletions
+294 -32
@@ -1,54 +1,131 @@
//! Feature-gated entry point. Compiled only when `--features vlm` is on.
//!
//! AZ-672 installs the trait + a placeholder constructor; the real IPC
//! body lands in AZ-673 (`vlm_client_nanollm_ipc`). Until then `assess`
//! returns `VlmAssessment::disabled()` so the runtime can be wired
//! end-to-end without a working NanoLLM peer.
//! AZ-672 installed the trait + a placeholder constructor; AZ-673
//! replaces the placeholder with the real `NanoLlmClient` (UDS
//! connection, peer-cred check, pre-send validation, bounded request
//! deadline, bounded reconnect).
//!
//! Two construction paths are supported:
//!
//! - `VlmClient::new(path)` — synchronous, **lazy**. Composition-root
//! wiring in `crates/autopilot/src/runtime.rs` uses this so the
//! runtime can be built without requiring the NanoLLM peer to be
//! reachable yet. The UDS connection and peer-cred check happen on
//! the first `assess` call. A peer-cred mismatch on that first
//! call surfaces as `VlmAssessment { status: IpcError, .. }`. The
//! cell stays empty after a failed connect, so each later call
//! redials and fails the same way while the mismatch persists.
//!
//! - `VlmClient::open(path)` / `VlmClient::connect(options)` —
//! asynchronous, **eager**. Used by integration tests and by
//! startup code that wants peer-cred mismatch to hard-fail at
//! process boot.
use std::path::PathBuf;
use std::sync::Arc;
use async_trait::async_trait;
use tokio::sync::OnceCell;
use shared::contracts::VlmProvider;
use shared::error::Result;
use shared::health::ComponentHealth;
use shared::models::vlm::VlmAssessment;
use shared::models::vlm::{VlmAssessment, VlmLabel, VlmStatus};
use super::PROVIDER_NAME;
use crate::internal::uds_client::{ConnectError, NanoLlmClient, NanoLlmClientOptions};
#[derive(Debug, Clone)]
#[derive(Clone)]
pub struct VlmClient {
ipc_socket: String,
options: NanoLlmClientOptions,
inner: Arc<OnceCell<NanoLlmClient>>,
}
impl VlmClient {
/// Construct the feature-enabled client. Until AZ-673 lands, the
/// returned instance still answers `assess` with the disabled
/// no-op assessment — the difference vs `DisabledVlmProvider` is
/// that this socket address has been validated and the IPC
/// connection will be established here in AZ-673.
pub fn new(ipc_socket: impl Into<String>) -> Self {
/// Synchronous, lazy. The first `assess` call dials the UDS peer
/// and performs the SO_PEERCRED check. Use this when the
/// composition root must stay sync.
pub fn new(socket_path: impl Into<PathBuf>) -> Self {
Self {
ipc_socket: ipc_socket.into(),
options: NanoLlmClientOptions::new(socket_path),
inner: Arc::new(OnceCell::new()),
}
}
pub fn ipc_socket(&self) -> &str {
&self.ipc_socket
/// Asynchronous, eager. Opens the UDS connection and performs the
/// peer-cred check up front. Use this when startup must hard-fail
/// on peer-cred mismatch (AZ-673 AC-2).
pub async fn open(socket_path: impl Into<PathBuf>) -> std::result::Result<Self, ConnectError> {
Self::connect(NanoLlmClientOptions::new(socket_path)).await
}
/// Asynchronous, eager, with full options (peer-cred expectations,
/// timeouts, payload limits).
pub async fn connect(options: NanoLlmClientOptions) -> std::result::Result<Self, ConnectError> {
let inner_client = NanoLlmClient::connect(options.clone()).await?;
let cell = OnceCell::new();
cell.set(inner_client)
.ok()
.expect("freshly constructed OnceCell must be empty");
Ok(Self {
options,
inner: Arc::new(cell),
})
}
pub fn ipc_socket(&self) -> &std::path::Path {
&self.options.socket_path
}
pub fn health(&self) -> ComponentHealth {
// Until AZ-673 connects, we surface yellow with the configured
// socket so the operator sees the build *did* enable VLM but
// the IPC peer is not yet wired.
ComponentHealth::yellow(PROVIDER_NAME, format!("ipc_pending: {}", self.ipc_socket))
let connected = self.inner.initialized();
let level = if connected {
ComponentHealth::green(PROVIDER_NAME)
} else {
ComponentHealth::yellow(PROVIDER_NAME, "ipc connect deferred")
};
level.with_detail(format!("ipc_socket={}", self.options.socket_path.display()))
}
/// Reference to the lazily-initialised inner client (`None` if no
/// `assess` has been made yet on a `new()`-constructed instance).
pub fn inner(&self) -> Option<&NanoLlmClient> {
self.inner.get()
}
async fn ensure_connected(&self) -> std::result::Result<&NanoLlmClient, ConnectError> {
let options = self.options.clone();
self.inner
.get_or_try_init(|| async move { NanoLlmClient::connect(options).await })
.await
}
}
trait HealthDetail {
fn with_detail(self, detail: impl Into<String>) -> Self;
}
impl HealthDetail for ComponentHealth {
fn with_detail(mut self, detail: impl Into<String>) -> Self {
self.detail = Some(detail.into());
self
}
}
#[async_trait]
impl VlmProvider for VlmClient {
async fn assess(&self, _roi: Vec<u8>, _prompt: String) -> Result<VlmAssessment> {
// Real IPC call lands in AZ-673. Returning disabled keeps the
// runtime end-to-end exercisable until that task completes.
Ok(VlmAssessment::disabled())
async fn assess(&self, roi: Vec<u8>, prompt: String) -> Result<VlmAssessment> {
match self.ensure_connected().await {
Ok(c) => Ok(c.assess(roi, prompt).await),
Err(e) => Ok(VlmAssessment {
label: VlmLabel::Error,
confidence: 0.0,
evidence_spans: Vec::new(),
reason: format!("lazy connect: {e}"),
status: VlmStatus::IpcError,
latency_ms: 0,
model_version: String::new(),
}),
}
}
fn name(&self) -> &'static str {
@@ -59,20 +136,205 @@ impl VlmProvider for VlmClient {
#[cfg(test)]
mod tests {
use super::*;
#[cfg(target_os = "linux")]
use crate::internal::peer_cred::ExpectedPeer;
use crate::internal::prompt::Limits;
use shared::models::vlm::VlmStatus;
use tempfile::tempdir;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::UnixListener;
/// Spawn a tiny fixture NanoLLM that reads one request frame and
/// writes back the supplied assessment JSON (or just hangs if
/// `respond` is `None`).
async fn fixture(
path: std::path::PathBuf,
respond: Option<serde_json::Value>,
) -> tokio::task::JoinHandle<()> {
let listener = UnixListener::bind(&path).unwrap();
tokio::spawn(async move {
let (mut s, _) = listener.accept().await.unwrap();
let mut lenbuf = [0u8; 4];
if s.read_exact(&mut lenbuf).await.is_err() {
return;
}
let len = u32::from_be_bytes(lenbuf) as usize;
let mut req = vec![0u8; len];
if s.read_exact(&mut req).await.is_err() {
return;
}
let Some(body) = respond else {
std::future::pending::<()>().await;
return;
};
let bytes = serde_json::to_vec(&body).unwrap();
let len = (bytes.len() as u32).to_be_bytes();
let _ = s.write_all(&len).await;
let _ = s.write_all(&bytes).await;
let _ = s.flush().await;
})
}
fn ok_response_json() -> serde_json::Value {
serde_json::json!({
"label": "confirmed_concealed_position",
"confidence": 0.91,
"evidence_spans": ["thicket", "tarp"],
"reason": "high foliage + tarp edge",
"status": "ok",
"latency_ms": 42,
"model_version": "VILA1.5-3B-int4"
})
}
#[tokio::test]
async fn placeholder_assess_returns_disabled_until_az_673() {
async fn ac1_happy_path_round_trip() {
// Arrange
let c = VlmClient::new("/run/vila/ipc.sock");
let dir = tempdir().unwrap();
let path = dir.path().join("nanollm.sock");
let fixture_handle = fixture(path.clone(), Some(ok_response_json())).await;
let client = VlmClient::open(&path).await.expect("connect");
// Act
let r = c
.assess(Vec::new(), String::new())
let result = client
.assess(b"\xff\xd8\xff".to_vec(), "describe".into())
.await
.expect("placeholder path is infallible");
.expect("assess returns Ok envelope");
// Assert
assert_eq!(r.status, VlmStatus::Disabled);
assert_eq!(c.name(), "vlm_client");
assert_eq!(c.ipc_socket(), "/run/vila/ipc.sock");
assert_eq!(result.status, VlmStatus::Ok);
assert_eq!(result.confidence, 0.91);
assert_eq!(result.model_version, "VILA1.5-3B-int4");
assert_eq!(result.latency_ms, 42);
fixture_handle.abort();
}
#[tokio::test]
async fn ac3_oversize_roi_rejected_pre_send() {
// Arrange — fixture exists but should never see a request.
let dir = tempdir().unwrap();
let path = dir.path().join("nanollm.sock");
let _listener = UnixListener::bind(&path).unwrap();
let mut opts = NanoLlmClientOptions::new(&path);
opts.limits = Limits {
max_roi_bytes: 4,
max_prompt_bytes: 1024,
};
let client = VlmClient::connect(opts).await.expect("connect");
// Act
let result = client
.assess(vec![0u8; 5], "p".into())
.await
.expect("assess returns SchemaInvalid envelope, not Err");
// Assert
assert_eq!(result.status, VlmStatus::SchemaInvalid);
assert!(result.reason.contains("roi too large"));
}
#[tokio::test]
async fn ac4_response_timeout_returns_explicit_status() {
// Arrange — fixture accepts the connection but never responds.
let dir = tempdir().unwrap();
let path = dir.path().join("nanollm.sock");
let fixture_handle = fixture(path.clone(), None).await;
let mut opts = NanoLlmClientOptions::new(&path);
opts.request_deadline = std::time::Duration::from_millis(150);
let client = VlmClient::connect(opts).await.expect("connect");
// Act
let started = std::time::Instant::now();
let result = client
.assess(b"r".to_vec(), "p".into())
.await
.expect("assess returns Timeout envelope, not Err");
let elapsed = started.elapsed();
// Assert
assert_eq!(result.status, VlmStatus::Timeout);
assert!(
elapsed >= std::time::Duration::from_millis(150),
"timeout fired too early: {elapsed:?}",
);
assert!(
elapsed < std::time::Duration::from_secs(1),
"timeout overshoot: {elapsed:?}",
);
fixture_handle.abort();
}
#[cfg(target_os = "linux")]
#[tokio::test]
async fn ac2_peer_cred_mismatch_hard_fails_connect() {
// Arrange
let dir = tempdir().unwrap();
let path = dir.path().join("nanollm.sock");
let _listener = UnixListener::bind(&path).unwrap();
let our_uid = unsafe { libc::geteuid() };
let bogus_uid = if our_uid == 0 { 1 } else { 0 };
let mut opts = NanoLlmClientOptions::new(&path);
opts.expected_peer = ExpectedPeer {
uid: Some(bogus_uid),
gid: None,
};
// Act
let err = VlmClient::connect(opts).await.expect_err("must reject");
// Assert
match err {
ConnectError::PeerCredMismatch {
expected_uid,
actual_uid,
..
} => {
assert_eq!(expected_uid, Some(bogus_uid));
assert_eq!(actual_uid, our_uid);
}
other => panic!("expected PeerCredMismatch, got {other:?}"),
}
}
#[tokio::test]
async fn rejects_empty_prompt_and_empty_roi() {
// Arrange
let dir = tempdir().unwrap();
let path = dir.path().join("nanollm.sock");
let _listener = UnixListener::bind(&path).unwrap();
let client = VlmClient::open(&path).await.unwrap();
// Act + Assert — empty roi.
let r = client.assess(Vec::new(), "describe".into()).await.unwrap();
assert_eq!(r.status, VlmStatus::SchemaInvalid);
// Act + Assert — empty prompt.
let r = client.assess(vec![1u8, 2, 3], String::new()).await.unwrap();
assert_eq!(r.status, VlmStatus::SchemaInvalid);
}
#[tokio::test]
async fn lazy_new_connects_on_first_assess() {
// Arrange — fixture process binds the socket after the client
// is constructed; the lazy client must succeed because connect
// happens on demand, not at construction.
let dir = tempdir().unwrap();
let path = dir.path().join("nanollm.sock");
// Construct the client *before* the fixture exists. With the
// old eager constructor this would fail; with lazy it must
// succeed.
let client = VlmClient::new(&path);
assert!(client.inner().is_none(), "should not be connected yet");
// Bring the fixture up, then call assess.
let fixture_handle = fixture(path.clone(), Some(ok_response_json())).await;
let result = client
.assess(b"r".to_vec(), "p".into())
.await
.expect("lazy assess");
assert_eq!(result.status, VlmStatus::Ok);
assert!(client.inner().is_some(), "lazy connect should have run");
fixture_handle.abort();
}
}
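The lazy and eager construction paths in `VlmClient` follow a common once-cell pattern. The sketch below illustrates that pattern with the standard library only. `Conn`, `dial`, and `LazyClient` are hypothetical stand-ins (the commit itself uses `tokio::sync::OnceCell` and `NanoLlmClient::connect`), and `std::sync::OnceLock` is swapped in so the example needs no async runtime.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for the real connection type.
#[derive(Debug)]
struct Conn {
    path: String,
}

impl Conn {
    fn path(&self) -> &str {
        &self.path
    }
}

/// Hypothetical dial function; an empty path simulates a connect failure.
fn dial(path: &str) -> Result<Conn, String> {
    if path.is_empty() {
        Err("empty socket path".to_string())
    } else {
        Ok(Conn { path: path.to_string() })
    }
}

#[derive(Clone)]
struct LazyClient {
    path: String,
    inner: Arc<OnceLock<Conn>>,
}

impl LazyClient {
    /// Lazy: nothing is dialled at construction time.
    fn new(path: &str) -> Self {
        Self { path: path.to_string(), inner: Arc::new(OnceLock::new()) }
    }

    /// Eager: dial first, then store the connection in the cell.
    fn open(path: &str) -> Result<Self, String> {
        let conn = dial(path)?;
        let cell = OnceLock::new();
        cell.set(conn).expect("freshly constructed cell must be empty");
        Ok(Self { path: path.to_string(), inner: Arc::new(cell) })
    }

    /// Return the stored connection, dialling on first use.
    fn ensure_connected(&self) -> Result<&Conn, String> {
        if let Some(conn) = self.inner.get() {
            return Ok(conn);
        }
        let conn = dial(&self.path)?;
        // If another thread initialised the cell first, keep its connection.
        let _ = self.inner.set(conn);
        Ok(self.inner.get().expect("cell was just initialised"))
    }

    fn is_connected(&self) -> bool {
        self.inner.get().is_some()
    }
}
```

In this sketch, as with `get_or_try_init` in the diff above, a failed dial leaves the cell empty, so a later call dials again rather than caching the error.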
+6
@@ -0,0 +1,6 @@
//! Internal modules used only by the feature-gated `vlm` build.
pub mod peer_cred;
pub mod prompt;
pub mod uds_client;
pub mod wire;
+164
@@ -0,0 +1,164 @@
//! `SO_PEERCRED` peer credential check.
//!
//! Production target is Jetson Linux. On Linux we call `getsockopt`
//! with `SO_PEERCRED` and compare the peer's UID/GID against the
//! configured expected values; mismatch returns `PeerCredOutcome::Mismatch`.
//!
//! On macOS dev hosts there is no equivalent that returns both UID
//! and GID through `getsockopt` (LOCAL_PEERCRED returns a `xucred`
//! with up to NGROUPS, and `LOCAL_PEEREPID` returns only the PID).
//! Per the task brief we log a warning and return `SkippedNonLinux`
//! so dev workflows do not require sudo / matching service users.
#[cfg(target_os = "linux")]
use std::os::unix::io::AsRawFd;
use tokio::net::UnixStream;
#[derive(Debug, Clone, PartialEq, Eq)]
#[allow(dead_code)] // some variants only constructed on certain target_os builds
pub enum PeerCredOutcome {
/// Peer credentials were read and match the expected values. The
/// non-Linux skip path is reported separately as `SkippedNonLinux`.
Match { uid: u32, gid: u32 },
/// Peer credentials read but do not match the expected values.
/// Connect MUST fail with `ConnectError::PeerCredMismatch`.
Mismatch {
expected_uid: Option<u32>,
expected_gid: Option<u32>,
actual_uid: u32,
actual_gid: u32,
},
/// Non-Linux dev host: SO_PEERCRED is not available with the same
/// shape. The task brief explicitly allows proceeding here for
/// development purposes.
SkippedNonLinux,
/// `getsockopt` itself failed (kernel rejected the call or the
/// socket is not actually a UDS). Caller treats this as a hard
/// failure — the connection MUST NOT proceed.
SystemError(String),
}
/// Expected peer credentials. `None` means "accept any" for that field.
#[derive(Debug, Clone, Copy, Default)]
pub struct ExpectedPeer {
pub uid: Option<u32>,
pub gid: Option<u32>,
}
#[cfg(target_os = "linux")]
pub fn check(stream: &UnixStream, expected: ExpectedPeer) -> PeerCredOutcome {
let fd = stream.as_raw_fd();
let mut cred: libc::ucred = unsafe { std::mem::zeroed() };
let mut len = std::mem::size_of::<libc::ucred>() as libc::socklen_t;
let rc = unsafe {
libc::getsockopt(
fd,
libc::SOL_SOCKET,
libc::SO_PEERCRED,
&mut cred as *mut libc::ucred as *mut libc::c_void,
&mut len,
)
};
if rc != 0 {
let e = std::io::Error::last_os_error();
return PeerCredOutcome::SystemError(format!("SO_PEERCRED getsockopt: {e}"));
}
let actual_uid = cred.uid;
let actual_gid = cred.gid;
let uid_ok = expected.uid.map(|u| u == actual_uid).unwrap_or(true);
let gid_ok = expected.gid.map(|g| g == actual_gid).unwrap_or(true);
if uid_ok && gid_ok {
PeerCredOutcome::Match {
uid: actual_uid,
gid: actual_gid,
}
} else {
PeerCredOutcome::Mismatch {
expected_uid: expected.uid,
expected_gid: expected.gid,
actual_uid,
actual_gid,
}
}
}
#[cfg(not(target_os = "linux"))]
pub fn check(_stream: &UnixStream, _expected: ExpectedPeer) -> PeerCredOutcome {
tracing::warn!(
"SO_PEERCRED check skipped: non-Linux build (dev host). \
Production deployments MUST run on Linux."
);
PeerCredOutcome::SkippedNonLinux
}
#[cfg(test)]
mod tests {
use super::*;
#[tokio::test]
async fn peer_cred_check_on_self_socketpair() {
// Arrange — connect to ourselves via a tempfile UDS so we know
// the peer is the current process and its credentials are
// available.
let dir = tempfile::tempdir().unwrap();
let path = dir.path().join("peer.sock");
let listener = tokio::net::UnixListener::bind(&path).unwrap();
let server_task = tokio::spawn(async move {
let (s, _) = listener.accept().await.unwrap();
s
});
let client = tokio::net::UnixStream::connect(&path).await.unwrap();
let _server = server_task.await.unwrap();
// Act — accept any UID/GID; we just want to confirm the call
// returns Match (Linux) or SkippedNonLinux (macOS).
let outcome = check(&client, ExpectedPeer::default());
// Assert
match outcome {
PeerCredOutcome::Match { .. } => {}
PeerCredOutcome::SkippedNonLinux => {}
other => panic!("expected Match or SkippedNonLinux, got {other:?}"),
}
}
#[cfg(target_os = "linux")]
#[tokio::test]
async fn peer_cred_mismatch_when_uid_differs() {
// Arrange: connect to a fixture peer and expect a UID we know is
// wrong (0, i.e. root, unless the test itself runs as root, then 1).
let dir = tempfile::tempdir().unwrap();
let path = dir.path().join("peer-mismatch.sock");
let listener = tokio::net::UnixListener::bind(&path).unwrap();
let _server = tokio::spawn(async move {
let (s, _) = listener.accept().await.unwrap();
s
});
let client = tokio::net::UnixStream::connect(&path).await.unwrap();
// Act — pick the *opposite* of the current uid as the expected one.
let our_uid = unsafe { libc::geteuid() };
let bogus_uid = if our_uid == 0 { 1 } else { 0 };
let outcome = check(
&client,
ExpectedPeer {
uid: Some(bogus_uid),
gid: None,
},
);
// Assert
match outcome {
PeerCredOutcome::Mismatch {
expected_uid,
actual_uid,
..
} => {
assert_eq!(expected_uid, Some(bogus_uid));
assert_eq!(actual_uid, our_uid);
}
other => panic!("expected Mismatch, got {other:?}"),
}
}
}
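The "accept any" semantics of `ExpectedPeer` come down to a small comparison. The following is a minimal sketch of that rule in isolation, with no socket involved. `Expected` and `matches` are illustrative names and not part of the commit.

```rust
/// Expected credentials; `None` means "accept any" for that field.
/// (Illustrative mirror of `ExpectedPeer`, not the commit's type.)
#[derive(Debug, Clone, Copy, Default)]
struct Expected {
    uid: Option<u32>,
    gid: Option<u32>,
}

/// True when every configured field equals the observed value.
/// An unset field never causes a mismatch.
fn matches(expected: Expected, actual_uid: u32, actual_gid: u32) -> bool {
    let uid_ok = expected.uid.map_or(true, |u| u == actual_uid);
    let gid_ok = expected.gid.map_or(true, |g| g == actual_gid);
    uid_ok && gid_ok
}
```

One consequence worth noting: `NanoLlmClientOptions::new` starts from `ExpectedPeer::default()`, so neither field is enforced unless the caller sets `expected_peer` explicitly.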
+112
@@ -0,0 +1,112 @@
//! Pre-send ROI + prompt validation.
//!
//! Per AZ-673 §Scope and `description.md §8`: payload size is
//! validated BEFORE crossing the IPC boundary. We refuse oversize
//! ROIs synchronously rather than waste the 5 s deadline on a
//! request the peer will reject anyway.
#[derive(Debug, thiserror::Error)]
pub enum ValidateError {
#[error("roi too large: {size} bytes > max {max} bytes")]
OversizeRoi { size: usize, max: usize },
#[error("prompt too large: {size} bytes > max {max} bytes")]
OversizePrompt { size: usize, max: usize },
#[error("roi is empty")]
EmptyRoi,
#[error("prompt is empty")]
EmptyPrompt,
}
#[derive(Debug, Clone, Copy)]
pub struct Limits {
pub max_roi_bytes: usize,
pub max_prompt_bytes: usize,
}
impl Default for Limits {
fn default() -> Self {
// Defaults follow `description.md §8`: bounded ROI (≤ 1 MiB
// raw) and bounded prompt (≤ 4 KiB UTF-8).
Self {
max_roi_bytes: 1024 * 1024,
max_prompt_bytes: 4 * 1024,
}
}
}
pub fn validate(roi: &[u8], prompt: &str, limits: Limits) -> Result<(), ValidateError> {
if roi.is_empty() {
return Err(ValidateError::EmptyRoi);
}
if prompt.is_empty() {
return Err(ValidateError::EmptyPrompt);
}
if roi.len() > limits.max_roi_bytes {
return Err(ValidateError::OversizeRoi {
size: roi.len(),
max: limits.max_roi_bytes,
});
}
if prompt.len() > limits.max_prompt_bytes {
return Err(ValidateError::OversizePrompt {
size: prompt.len(),
max: limits.max_prompt_bytes,
});
}
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn accepts_payload_within_limits() {
// Arrange / Act / Assert
assert!(validate(b"hello", "describe", Limits::default()).is_ok());
}
#[test]
fn rejects_oversize_roi() {
// Arrange
let limits = Limits {
max_roi_bytes: 4,
max_prompt_bytes: 1024,
};
// Act
let err = validate(&[0u8; 5], "p", limits).unwrap_err();
// Assert
assert!(matches!(
err,
ValidateError::OversizeRoi { size: 5, max: 4 }
));
}
#[test]
fn rejects_oversize_prompt() {
// Arrange
let limits = Limits {
max_roi_bytes: 1024,
max_prompt_bytes: 4,
};
// Act
let err = validate(b"r", "hellos", limits).unwrap_err();
// Assert
assert!(matches!(err, ValidateError::OversizePrompt { .. }));
}
#[test]
fn rejects_empty_inputs() {
assert!(matches!(
validate(b"", "p", Limits::default()),
Err(ValidateError::EmptyRoi)
));
assert!(matches!(
validate(b"r", "", Limits::default()),
Err(ValidateError::EmptyPrompt)
));
}
}
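The default limits above can be checked against the 8 MiB `MAX_FRAME_BYTES` ceiling defined in the wire module. The sketch below is a rough upper bound under stated assumptions, not a measurement: it assumes padded standard base64 for the ROI and, pessimistically, that every prompt byte is JSON-escaped to a six-byte `\u00XX` sequence. `base64_len` and `worst_case_body` are hypothetical helpers.

```rust
const MAX_ROI_BYTES: usize = 1024 * 1024; // default `max_roi_bytes`
const MAX_PROMPT_BYTES: usize = 4 * 1024; // default `max_prompt_bytes`
const MAX_FRAME_BYTES: usize = 8 * 1024 * 1024; // wire-level frame ceiling

/// Length of padded standard base64 for `n` input bytes: 4 * ceil(n / 3).
fn base64_len(n: usize) -> usize {
    (n + 2) / 3 * 4
}

/// Pessimistic size of the JSON request body for the given payload sizes.
/// Assumes each prompt byte expands to a 6-byte escape sequence.
fn worst_case_body(roi_bytes: usize, prompt_bytes: usize) -> usize {
    // Fixed envelope with both string values empty.
    let envelope = r#"{"prompt":"","roi_b64":""}"#.len();
    envelope + prompt_bytes * 6 + base64_len(roi_bytes)
}
```

Under these assumptions a maximal default request comes to roughly 1.4 MiB, well below the frame ceiling, so the pre-send limits are the binding constraint rather than `FrameTooLarge`.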
@@ -0,0 +1,320 @@
//! Tokio-based UDS client for NanoLLM.
//!
//! State invariants:
//!
//! - At most one request in flight at a time. The client serialises
//! requests through a `tokio::sync::Mutex` around the connection.
//! - On transport loss, the client reconnects up to `reconnect_max`
//! times with exponential backoff.
//! - On `PeerCredMismatch`, the client refuses to reconnect — peer
//! credential failures are treated as security incidents that
//! require operator intervention (AZ-673 AC-2).
//! - Every `assess` call is bounded by `request_deadline`. A timeout
//! produces a `VlmAssessment { status: Timeout, .. }` and the
//! socket is dropped + reconnected so a slow response can't poison
//! the next request.
use std::path::{Path, PathBuf};
use std::sync::Arc;
use std::time::Duration;
use shared::models::vlm::{VlmAssessment, VlmLabel, VlmStatus};
use tokio::net::UnixStream;
use tokio::sync::Mutex;
use tokio::time::timeout;
use super::peer_cred::{check as check_peer, ExpectedPeer, PeerCredOutcome};
use super::prompt::{self, Limits};
use super::wire::{read_response, write_request, WireError};
/// Errors returned from `connect`.
#[derive(Debug, thiserror::Error)]
pub enum ConnectError {
/// Socket file could not be opened (no such file, permission, etc.).
#[error("uds connect: {0}")]
Io(#[from] std::io::Error),
/// `SO_PEERCRED` returned credentials that did not match the
/// configured expected uid/gid. No automatic retry — operator
/// intervention required.
#[error("peer credential mismatch: expected_uid={expected_uid:?} expected_gid={expected_gid:?} actual_uid={actual_uid} actual_gid={actual_gid}")]
PeerCredMismatch {
expected_uid: Option<u32>,
expected_gid: Option<u32>,
actual_uid: u32,
actual_gid: u32,
},
/// `getsockopt` itself failed — usually a kernel-level rejection.
/// Treated as a hard failure (no retry).
#[error("peer credential system error: {0}")]
PeerCredSystemError(String),
}
#[derive(Debug, Clone)]
pub struct NanoLlmClientOptions {
pub socket_path: PathBuf,
pub expected_peer: ExpectedPeer,
pub request_deadline: Duration,
pub reconnect_max: u32,
pub reconnect_base: Duration,
pub reconnect_cap: Duration,
pub limits: Limits,
}
impl NanoLlmClientOptions {
pub fn new(socket_path: impl Into<PathBuf>) -> Self {
Self {
socket_path: socket_path.into(),
expected_peer: ExpectedPeer::default(),
// Per `description.md §8` 5 s ceiling.
request_deadline: Duration::from_secs(5),
reconnect_max: 3,
reconnect_base: Duration::from_millis(100),
reconnect_cap: Duration::from_secs(2),
limits: Limits::default(),
}
}
}
/// Long-lived NanoLLM UDS client. Cloneable handle (the inner state
/// is an `Arc<Mutex<...>>`); a single backing connection is shared.
#[derive(Clone)]
pub struct NanoLlmClient {
inner: Arc<Mutex<Inner>>,
options: Arc<NanoLlmClientOptions>,
}
struct Inner {
/// `None` after an IO error or timeout has dropped the stream,
/// until the next `assess` call reconnects.
stream: Option<UnixStream>,
/// Set when `PeerCredMismatch` was observed. Hard-stops every
/// subsequent `assess`/connect attempt until the operator
/// rebuilds the client (i.e., restarts the process).
peer_cred_locked: bool,
/// Diagnostic counter for health surfaces.
peer_cred_check_pass: u64,
peer_cred_check_total: u64,
/// Latency samples for `p50` / `p99` surfaces. Kept ring-buffer
/// style to bound memory.
latency_samples: Vec<Duration>,
}
const LATENCY_RING_CAPACITY: usize = 128;
impl NanoLlmClient {
/// Open the UDS connection and verify the peer's credentials.
/// Caller-side mutex is initialised here.
pub async fn connect(options: NanoLlmClientOptions) -> Result<Self, ConnectError> {
let stream = open_and_check(&options.socket_path, options.expected_peer).await?;
let inner = Inner {
stream: Some(stream),
peer_cred_locked: false,
peer_cred_check_pass: 1,
peer_cred_check_total: 1,
latency_samples: Vec::with_capacity(LATENCY_RING_CAPACITY),
};
Ok(Self {
inner: Arc::new(Mutex::new(inner)),
options: Arc::new(options),
})
}
pub fn socket_path(&self) -> &Path {
&self.options.socket_path
}
/// Latency samples snapshot (cloned). Caller computes p50/p99.
pub async fn latency_samples(&self) -> Vec<Duration> {
self.inner.lock().await.latency_samples.clone()
}
/// `(passed, total)` peer-cred check counts since process start.
pub async fn peer_cred_stats(&self) -> (u64, u64) {
let g = self.inner.lock().await;
(g.peer_cred_check_pass, g.peer_cred_check_total)
}
/// True if a peer-cred mismatch ever occurred. Diagnostic only —
/// every public method already short-circuits on the lock.
pub async fn peer_cred_locked(&self) -> bool {
self.inner.lock().await.peer_cred_locked
}
/// Send a single ROI + prompt and await one assessment. The
/// method returns a plain `VlmAssessment`, not a `Result`:
/// validation failures, timeouts and IPC errors are all encoded
/// in the returned `VlmAssessment.status`. Hard failures (peer-cred
/// lock, exhausted reconnect budget) surface as
/// `VlmStatus::IpcError` with `label: Error`.
pub async fn assess(&self, roi: Vec<u8>, prompt: String) -> VlmAssessment {
// Pre-send validation — never spend IPC time on a known-bad
// payload (AZ-673 AC-3).
if let Err(e) = prompt::validate(&roi, &prompt, self.options.limits) {
return schema_invalid(format!("pre-send validate: {e}"));
}
// Hard-locked by peer-cred mismatch — refuse without IPC.
if self.inner.lock().await.peer_cred_locked {
return ipc_error("peer-cred mismatch lock active");
}
let started = std::time::Instant::now();
let mut guard = self.inner.lock().await;
// Lazy reconnect if the previous request dropped the stream.
if guard.stream.is_none() {
match reconnect_locked(&mut guard, &self.options).await {
Ok(()) => {}
Err(e) => return e,
}
}
// Single shot. On any IO error we drop the stream so the next
// call reconnects fresh.
let stream = guard
.stream
.as_mut()
.expect("stream present after reconnect");
match timeout(
self.options.request_deadline,
send_and_recv(stream, &prompt, &roi),
)
.await
{
Ok(Ok(mut assessment)) => {
let elapsed = started.elapsed();
push_latency(&mut guard.latency_samples, elapsed);
if assessment.latency_ms == 0 {
assessment.latency_ms = elapsed.as_millis().min(u32::MAX as u128) as u32;
}
assessment
}
Ok(Err(e)) => {
tracing::warn!(error = %e, "vlm_client uds io error; dropping connection");
guard.stream = None;
ipc_error(format!("ipc io: {e}"))
}
Err(_elapsed) => {
tracing::warn!(
deadline_ms = self.options.request_deadline.as_millis() as u64,
"vlm_client assess timeout"
);
// Drop the stream — a half-responded peer might still
// write bytes on the next call and corrupt the frame.
guard.stream = None;
timeout_status(self.options.request_deadline)
}
}
}
}
async fn open_and_check(path: &Path, expected: ExpectedPeer) -> Result<UnixStream, ConnectError> {
let stream = UnixStream::connect(path).await?;
match check_peer(&stream, expected) {
PeerCredOutcome::Match { uid, gid } => {
tracing::info!(uid, gid, "vlm_client uds peer credential check passed");
Ok(stream)
}
PeerCredOutcome::SkippedNonLinux => Ok(stream),
PeerCredOutcome::Mismatch {
expected_uid,
expected_gid,
actual_uid,
actual_gid,
} => Err(ConnectError::PeerCredMismatch {
expected_uid,
expected_gid,
actual_uid,
actual_gid,
}),
PeerCredOutcome::SystemError(s) => Err(ConnectError::PeerCredSystemError(s)),
}
}
async fn reconnect_locked(
guard: &mut Inner,
options: &NanoLlmClientOptions,
) -> Result<(), VlmAssessment> {
let mut delay = options.reconnect_base;
for attempt in 1..=options.reconnect_max {
match open_and_check(&options.socket_path, options.expected_peer).await {
Ok(s) => {
guard.stream = Some(s);
guard.peer_cred_check_pass = guard.peer_cred_check_pass.saturating_add(1);
guard.peer_cred_check_total = guard.peer_cred_check_total.saturating_add(1);
return Ok(());
}
Err(ConnectError::PeerCredMismatch { .. }) => {
guard.peer_cred_locked = true;
guard.peer_cred_check_total = guard.peer_cred_check_total.saturating_add(1);
return Err(ipc_error("peer-cred mismatch on reconnect"));
}
Err(e) => {
tracing::warn!(
error = %e,
attempt,
max = options.reconnect_max,
"vlm_client reconnect failed; backing off"
);
tokio::time::sleep(delay).await;
delay = (delay * 2).min(options.reconnect_cap);
}
}
}
Err(ipc_error("reconnect budget exhausted"))
}
async fn send_and_recv(
stream: &mut UnixStream,
prompt: &str,
roi: &[u8],
) -> Result<VlmAssessment, WireError> {
write_request(stream, prompt, roi).await?;
let resp = read_response(stream).await?;
Ok(resp)
}
fn push_latency(samples: &mut Vec<Duration>, d: Duration) {
if samples.len() == LATENCY_RING_CAPACITY {
samples.remove(0);
}
samples.push(d);
}
fn schema_invalid(reason: impl Into<String>) -> VlmAssessment {
VlmAssessment {
label: VlmLabel::Inconclusive,
confidence: 0.0,
evidence_spans: Vec::new(),
reason: reason.into(),
status: VlmStatus::SchemaInvalid,
latency_ms: 0,
model_version: String::new(),
}
}
fn ipc_error(reason: impl Into<String>) -> VlmAssessment {
VlmAssessment {
label: VlmLabel::Error,
confidence: 0.0,
evidence_spans: Vec::new(),
reason: reason.into(),
status: VlmStatus::IpcError,
latency_ms: 0,
model_version: String::new(),
}
}
fn timeout_status(deadline: Duration) -> VlmAssessment {
VlmAssessment {
label: VlmLabel::Inconclusive,
confidence: 0.0,
evidence_spans: Vec::new(),
reason: format!("ipc deadline {} ms elapsed", deadline.as_millis()),
status: VlmStatus::Timeout,
latency_ms: deadline.as_millis().min(u32::MAX as u128) as u32,
model_version: String::new(),
}
}
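The delays used by `reconnect_locked` are easier to reason about when listed out. The sketch below reproduces only the delay arithmetic (double after each failed attempt, clamp to the cap) with the standard library. `backoff_schedule` is a hypothetical helper and performs no sleeping or IO.

```rust
use std::time::Duration;

/// Delays slept after each failed attempt: start at `base`, double each
/// time, and clamp every subsequent delay to `cap`.
fn backoff_schedule(base: Duration, cap: Duration, attempts: u32) -> Vec<Duration> {
    let mut delays = Vec::with_capacity(attempts as usize);
    let mut delay = base;
    for _ in 0..attempts {
        delays.push(delay);
        delay = (delay * 2).min(cap);
    }
    delays
}
```

With the defaults from `NanoLlmClientOptions::new` (100 ms base, 2 s cap, 3 attempts) this gives 100, 200 and 400 ms, so a fully failed reconnect adds about 700 ms of sleep on top of the connect attempts themselves, all while the connection mutex is held.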
+156
@@ -0,0 +1,156 @@
//! Wire framing for NanoLLM UDS IPC.
//!
//! Single request → single response, length-prefixed JSON:
//!
//! ```text
//! uint32 BE length || JSON payload
//! ```
//!
//! The request payload is `{"prompt": "...", "roi_b64": "..."}`. The
//! response payload is a `shared::models::vlm::VlmAssessment` JSON
//! object — the same shape `VlmProvider::assess` returns. AZ-674 will
//! add schema-version validation on top of this; AZ-673 leaves the
//! body un-validated beyond `serde_json::from_slice`.
use base64::Engine;
use serde::{Deserialize, Serialize};
use shared::models::vlm::VlmAssessment;
use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};
/// Hard maximum on any single inbound frame. Defends against a peer
/// (or a corrupted peer) declaring an arbitrarily large length.
pub const MAX_FRAME_BYTES: u32 = 8 * 1024 * 1024;
#[derive(Debug, Serialize, Deserialize)]
pub struct AssessRequest {
pub prompt: String,
/// Base64-encoded ROI bytes. Kept inline in the JSON envelope so
/// the wire is one read/write per direction.
pub roi_b64: String,
}
#[derive(Debug, thiserror::Error)]
pub enum WireError {
#[error("io: {0}")]
Io(#[from] std::io::Error),
#[error("frame too large: {0} bytes (max {MAX_FRAME_BYTES})")]
FrameTooLarge(u32),
#[error("json: {0}")]
Json(#[from] serde_json::Error),
#[error("unexpected eof while reading frame body")]
UnexpectedEof,
}
pub async fn write_request<W: AsyncWrite + Unpin>(
w: &mut W,
prompt: &str,
roi: &[u8],
) -> Result<(), WireError> {
let req = AssessRequest {
prompt: prompt.to_string(),
roi_b64: base64::engine::general_purpose::STANDARD.encode(roi),
};
let body = serde_json::to_vec(&req)?;
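// Checked conversion: `as u32` would silently wrap for bodies over
// 4 GiB and slip past the size guard below.
let len = u32::try_from(body.len()).map_err(|_| WireError::FrameTooLarge(u32::MAX))?;
if len > MAX_FRAME_BYTES {
return Err(WireError::FrameTooLarge(len));
}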
w.write_all(&len.to_be_bytes()).await?;
w.write_all(&body).await?;
w.flush().await?;
Ok(())
}
pub async fn read_response<R: AsyncRead + Unpin>(r: &mut R) -> Result<VlmAssessment, WireError> {
let mut lenbuf = [0u8; 4];
r.read_exact(&mut lenbuf).await?;
let len = u32::from_be_bytes(lenbuf);
if len > MAX_FRAME_BYTES {
return Err(WireError::FrameTooLarge(len));
}
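let mut body = vec![0u8; len as usize];
// `read_exact` only returns `Ok` once the buffer is full; a short read
// surfaces as `ErrorKind::UnexpectedEof`, mapped here to its own variant.
r.read_exact(&mut body).await.map_err(|e| match e.kind() {
std::io::ErrorKind::UnexpectedEof => WireError::UnexpectedEof,
_ => WireError::Io(e),
})?;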
let assessment: VlmAssessment = serde_json::from_slice(&body)?;
Ok(assessment)
}
#[cfg(test)]
mod tests {
use super::*;
use shared::models::vlm::{VlmLabel, VlmStatus};
use tokio::io::duplex;
#[tokio::test]
async fn round_trip_request_and_response() {
// Arrange
let (mut a, mut b) = duplex(64 * 1024);
let prompt = "describe";
let roi = b"\xff\xd8\xff\xe0\x00\x10JFIF".to_vec();
// Act — client side writes the request, fixture side reads it
// and writes back a canned response.
let fixture = tokio::spawn(async move {
// Read request frame.
let mut lenbuf = [0u8; 4];
b.read_exact(&mut lenbuf).await.unwrap();
let len = u32::from_be_bytes(lenbuf) as usize;
let mut req_buf = vec![0u8; len];
b.read_exact(&mut req_buf).await.unwrap();
let req: AssessRequest = serde_json::from_slice(&req_buf).unwrap();
assert_eq!(req.prompt, "describe");
assert_eq!(
base64::engine::general_purpose::STANDARD
.decode(req.roi_b64)
.unwrap()
.as_slice(),
b"\xff\xd8\xff\xe0\x00\x10JFIF"
);
// Write canned response.
let response = VlmAssessment {
label: VlmLabel::ConfirmedConcealedPosition,
confidence: 0.91,
evidence_spans: vec!["foliage".into()],
reason: "match".into(),
status: VlmStatus::Ok,
latency_ms: 12,
model_version: "VILA1.5-3B-int4".into(),
};
let body = serde_json::to_vec(&response).unwrap();
let len = body.len() as u32;
b.write_all(&len.to_be_bytes()).await.unwrap();
b.write_all(&body).await.unwrap();
b.flush().await.unwrap();
});
write_request(&mut a, prompt, &roi).await.unwrap();
let resp = read_response(&mut a).await.unwrap();
fixture.await.unwrap();
// Assert
assert_eq!(resp.status, VlmStatus::Ok);
assert_eq!(resp.label, VlmLabel::ConfirmedConcealedPosition);
assert_eq!(resp.model_version, "VILA1.5-3B-int4");
}
#[tokio::test]
async fn rejects_oversized_inbound_frame() {
// Arrange
let (mut a, mut b) = duplex(64);
let huge = MAX_FRAME_BYTES + 1;
b.write_all(&huge.to_be_bytes()).await.unwrap();
b.flush().await.unwrap();
// Act
let err = read_response(&mut a).await.unwrap_err();
// Assert
assert!(matches!(err, WireError::FrameTooLarge(n) if n == huge));
}
}
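Review note on the framing above: the `uint32 BE length || payload` scheme can be exercised without tokio or serde. The following is a minimal synchronous sketch using only `std::io`, not the crate's implementation. `FrameError`, `write_frame`, `read_frame`, and `map_eof` are hypothetical names introduced for illustration, and the cap mirrors `MAX_FRAME_BYTES` from the diff.

```rust
use std::io::{self, Read, Write};

/// Illustrative cap, mirroring the 8 MiB limit in the diff.
const MAX_FRAME_BYTES: u32 = 8 * 1024 * 1024;

/// Hypothetical error type for this sketch only.
#[derive(Debug)]
enum FrameError {
    Io(io::Error),
    TooLarge(u64),
    UnexpectedEof,
}

/// Maps a short read to a dedicated variant; other errors pass through.
fn map_eof(e: io::Error) -> FrameError {
    if e.kind() == io::ErrorKind::UnexpectedEof {
        FrameError::UnexpectedEof
    } else {
        FrameError::Io(e)
    }
}

/// Writes `uint32 BE length || payload`.
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> Result<(), FrameError> {
    // Checked conversion first, then the cap, so an oversized payload
    // can never wrap around and pass the guard.
    let len = u32::try_from(payload.len())
        .ok()
        .filter(|&n| n <= MAX_FRAME_BYTES)
        .ok_or(FrameError::TooLarge(payload.len() as u64))?;
    w.write_all(&len.to_be_bytes()).map_err(FrameError::Io)?;
    w.write_all(payload).map_err(FrameError::Io)?;
    w.flush().map_err(FrameError::Io)
}

/// Reads one frame, rejecting a declared length above the cap before
/// allocating the body buffer.
fn read_frame<R: Read>(r: &mut R) -> Result<Vec<u8>, FrameError> {
    let mut lenbuf = [0u8; 4];
    r.read_exact(&mut lenbuf).map_err(map_eof)?;
    let len = u32::from_be_bytes(lenbuf);
    if len > MAX_FRAME_BYTES {
        return Err(FrameError::TooLarge(u64::from(len)));
    }
    let mut body = vec![0u8; len as usize];
    r.read_exact(&mut body).map_err(map_eof)?;
    Ok(body)
}
```

Checking the declared length before allocating is what keeps a hostile 4-byte header from forcing a multi-gigabyte `Vec`. The same ordering applies to the async `read_response`. A test that feeds a truncated body would be a natural third case next to `rejects_oversized_inbound_frame`.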
+14 -5
@@ -6,17 +6,26 @@
//! never references `vlm_client::VlmClient`.
//!
//! With the `vlm` feature **on**, `VlmClient` is the real NanoLLM IPC
-//! client. The IPC plumbing itself lands in:
-//! - AZ-673 `vlm_client_nanollm_ipc`
-//! - AZ-674 `vlm_client_schema_and_model_version`
-//!
-//! AZ-672 only wires the trait contract + feature flag.
+//! client:
+//! - AZ-672 wired the trait contract + feature flag.
+//! - AZ-673 (this revision) added the UDS connection, SO_PEERCRED
+//!   check, pre-send validation, bounded request deadline, bounded
+//!   reconnect.
+//! - AZ-674 will add `VlmAssessment` schema-version validation on top.
#[cfg(feature = "vlm")]
mod enabled;
#[cfg(feature = "vlm")]
mod internal;
#[cfg(feature = "vlm")]
pub use enabled::VlmClient;
#[cfg(feature = "vlm")]
pub use internal::peer_cred::ExpectedPeer;
#[cfg(feature = "vlm")]
pub use internal::prompt::Limits;
#[cfg(feature = "vlm")]
pub use internal::uds_client::{ConnectError, NanoLlmClient, NanoLlmClientOptions};
/// Stable name used by tracing + `/health` to identify this crate's
/// build-time configuration. Mirrors `VlmProvider::name()`.