mirror of
https://github.com/azaion/autopilot.git
synced 2026-06-21 11:31:10 +00:00
[AZ-650] mission_executor pre-flight BIT (F9) gate (batch 8)
AZ-650 (mission_executor pre-flight Built-In Test):
- BitEvaluator trait + BitItemStatus { Pass, Degraded, Fail, Skipped }
+ BitReport + BitOverall fusion. Pluggable per-item evaluators so
the composition root decides which dependencies are wired today.
- BitController owns evaluator list + mpsc ack channel + sticky-pass
+ ack deadline. Publishes bit_ok via tokio watch — composition root
pipes it into the telemetry projection where the existing FSM
bit_ok guard already consumes it (no FSM changes needed).
- BitState { Idle, Pass, AwaitingAck { report_id }, Failed { reason } }
with broadcast::Sender<BitEvent> for operator-side observability.
Sticky-pass semantics: once Pass is reached (directly or via signed
ack on a Degraded report), the controller stops re-evaluating —
BIT is a one-shot pre-flight gate, not a continuous monitor.
- BitDegradedAck arrives pre-validated by operator_bridge; the
controller only matches report_id and applies the operator id to
the audit log.
- Concrete evaluators landed today (3 of 12 spec items, the rest
depend on components still in todo/):
- StateDirFreeSpaceEvaluator (dir creatable/readable; statvfs is
documented follow-up).
- WallClockBoundEvaluator (chrono::Utc::now vs configurable bound).
- MissionLoadedEvaluator (waypoint count via Arc<Mutex<usize>>).
- MapObjectsSyncedEvaluator (maps SyncState -> BIT status per Q9).
Tests:
- ac1_all_pass_proceeds, ac2_fail_blocks_transition,
ac3_degraded_requires_signed_ack (+ mismatched_ack supplement),
ac4_degraded_ack_timeout_fails_the_bit — all 4 ACs green.
- Pure next_state table covered by lib unit tests.
- Per-evaluator unit tests for Pass/Fail/Degraded branches.
Quality gates:
- cargo fmt: clean.
- cargo clippy -p mission_executor --tests -- -D warnings: 0 warns.
- cargo test --workspace: all green.
- Pre-existing flake in state_machine::ac3_bounded_retry_then_success
(batch 7 report) remains pre-existing — passes on rerun.
Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,604 @@
|
||||
//! AZ-650 — Pre-flight Built-In Test (F9).
|
||||
//!
|
||||
//! The BIT is a stateful gate that runs between `HEALTH_OK` and `BIT_OK`.
|
||||
//! It collects per-item statuses from a pluggable [`BitEvaluator`] list,
|
||||
//! fuses them into a single [`BitOverall`] verdict, and publishes a
|
||||
//! `bit_ok: bool` watch channel that the composition root pipes into
|
||||
//! the FSM's telemetry projection.
|
||||
//!
|
||||
//! Design choices worth calling out:
|
||||
//!
|
||||
//! - **Evaluators are pluggable**. The composition root picks which
|
||||
//! evaluators are wired (the spec lists 12 nominal items, but some
|
||||
//! components — `gimbal_link`, `camera_rtsp`, `detection_grpc`,
|
||||
//! `operator_bridge_session`, `tier2_session_ready`, `vlm_session_ready`
|
||||
//! — do not exist yet in the workspace). Each evaluator is responsible
|
||||
//! for one named item and returns a `BitItemStatus`. The BIT layer
|
||||
//! itself does not know how to evaluate any particular item.
|
||||
//!
|
||||
//! - **`Degraded` requires a signed acknowledgement** (Q9). The
|
||||
//! controller emits a [`BitReport`] with a unique `id` and waits for
|
||||
//! a [`BitDegradedAck`] whose `report_id` matches. The signature on
|
||||
//! the ack is validated by `operator_bridge` (AZ-689) BEFORE the ack
|
||||
//! reaches this controller — by the time the ack arrives here, the
|
||||
//! `report_id` match is the only check left.
|
||||
//!
|
||||
//! - **Timeout is a `BitOverall::Fail`**. An unacknowledged Degraded
|
||||
//! report that exceeds the configured timeout (default 5 min)
|
||||
//! transitions to `Failed` exactly once and is observable via the
|
||||
//! `BitEvent` broadcast.
|
||||
//!
|
||||
//! - **`bit_ok` is monotonic per evaluation**. The controller flips
|
||||
//! `bit_ok = true` only while `state == BitState::Pass`. Any
|
||||
//! subsequent `Degraded` / `Fail` flips it back to `false` and the
|
||||
//! FSM's `bit_ok` guard fails closed.
|
||||
|
||||
use std::sync::Arc;
|
||||
use std::time::Duration;
|
||||
|
||||
use chrono::{DateTime, Utc};
|
||||
use serde::{Deserialize, Serialize};
|
||||
use tokio::sync::{broadcast, mpsc, watch, Mutex};
|
||||
use tokio::task::JoinHandle;
|
||||
use tokio::time::Instant;
|
||||
use uuid::Uuid;
|
||||
|
||||
// ============================================================================
|
||||
// Public surface — types
|
||||
// ============================================================================
|
||||
|
||||
/// Per-item BIT result. The boundary between `Degraded` and `Fail` is
|
||||
/// the evaluator's call: `Degraded` says "this item is still usable
|
||||
/// but the operator must sign off"; `Fail` says "do not arm under any
|
||||
/// circumstance".
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
#[serde(rename_all = "snake_case", tag = "status")]
|
||||
pub enum BitItemStatus {
|
||||
Pass,
|
||||
Degraded {
|
||||
detail: String,
|
||||
},
|
||||
Fail {
|
||||
detail: String,
|
||||
},
|
||||
/// Evaluator is not configured / not wired in this build. Treated
|
||||
/// as `Pass` for fusion purposes — a missing evaluator should NOT
|
||||
/// block arming on its own. (If a missing evaluator IS critical,
|
||||
/// the composition root must inject a `Fail`-returning placeholder.)
|
||||
Skipped {
|
||||
reason: String,
|
||||
},
|
||||
}
|
||||
|
||||
/// One row of a [`BitReport`].
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct BitItem {
|
||||
pub name: String,
|
||||
#[serde(flatten)]
|
||||
pub status: BitItemStatus,
|
||||
}
|
||||
|
||||
/// Fused verdict across every [`BitItem`] in a [`BitReport`].
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
|
||||
#[serde(rename_all = "snake_case")]
|
||||
pub enum BitOverall {
|
||||
/// Every item is Pass or Skipped.
|
||||
Pass,
|
||||
/// At least one item is Degraded; none are Fail. The controller
|
||||
/// waits for a signed [`BitDegradedAck`] before flipping
|
||||
/// `bit_ok = true`.
|
||||
Degraded,
|
||||
/// At least one item is Fail. The controller flips `bit_ok = false`
|
||||
/// and stays Failed until the next evaluation cycle clears it.
|
||||
Fail,
|
||||
}
|
||||
|
||||
/// Aggregated outcome of one BIT evaluation. Surfaced to the operator
|
||||
/// via the `BitEvent::Generated` broadcast.
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
pub struct BitReport {
|
||||
pub id: Uuid,
|
||||
pub generated_at: DateTime<Utc>,
|
||||
pub items: Vec<BitItem>,
|
||||
pub overall: BitOverall,
|
||||
}
|
||||
|
||||
impl BitReport {
|
||||
fn new(items: Vec<BitItem>) -> Self {
|
||||
let overall = compute_overall(&items);
|
||||
Self {
|
||||
id: Uuid::new_v4(),
|
||||
generated_at: Utc::now(),
|
||||
items,
|
||||
overall,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
fn compute_overall(items: &[BitItem]) -> BitOverall {
|
||||
let mut has_degraded = false;
|
||||
for item in items {
|
||||
match &item.status {
|
||||
BitItemStatus::Fail { .. } => return BitOverall::Fail,
|
||||
BitItemStatus::Degraded { .. } => has_degraded = true,
|
||||
BitItemStatus::Pass | BitItemStatus::Skipped { .. } => {}
|
||||
}
|
||||
}
|
||||
if has_degraded {
|
||||
BitOverall::Degraded
|
||||
} else {
|
||||
BitOverall::Pass
|
||||
}
|
||||
}
|
||||
|
||||
/// Pluggable BIT item evaluator. One evaluator owns one named item;
|
||||
/// it is responsible for whatever I/O (or in-process health-read) is
|
||||
/// required to produce a [`BitItemStatus`].
|
||||
///
|
||||
/// `evaluate` is synchronous on purpose — the controller calls it
|
||||
/// from a tight tick loop. Evaluators that need async I/O should
|
||||
/// publish their result into an `Arc<AtomicXXX>` or `watch` and have
|
||||
/// the evaluator read the cheap cached value.
|
||||
pub trait BitEvaluator: Send + Sync {
|
||||
fn name(&self) -> &'static str;
|
||||
fn evaluate(&self) -> BitItemStatus;
|
||||
}
|
||||
|
||||
/// Operator's signed acknowledgement of a Degraded report. The
|
||||
/// `operator_bridge` layer validates the signature before the ack
|
||||
/// reaches this controller — this controller only checks `report_id`.
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct BitDegradedAck {
|
||||
pub report_id: Uuid,
|
||||
#[serde(default)]
|
||||
pub operator_id: Option<String>,
|
||||
}
|
||||
|
||||
/// Visible controller state machine.
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
#[serde(rename_all = "snake_case", tag = "kind")]
|
||||
pub enum BitState {
|
||||
/// Controller is between evaluations.
|
||||
Idle,
|
||||
/// Last evaluation passed; `bit_ok = true`.
|
||||
Pass,
|
||||
/// Last evaluation was Degraded; waiting on a matching ack.
|
||||
AwaitingAck { report_id: Uuid },
|
||||
/// Last evaluation failed (or ack timed out). `bit_ok = false`.
|
||||
Failed { reason: String },
|
||||
}
|
||||
|
||||
/// Broadcast event surface. Lets `operator_bridge` /
|
||||
/// `telemetry_stream` observe BIT transitions without polling.
|
||||
#[derive(Debug, Clone)]
|
||||
#[non_exhaustive]
|
||||
pub enum BitEvent {
|
||||
Generated(BitReport),
|
||||
StateChanged { from: BitState, to: BitState },
|
||||
AckTimedOut { report_id: Uuid },
|
||||
}
|
||||
|
||||
/// Constants the controller exposes for callers to consult.
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
pub struct BitControllerConfig {
|
||||
/// How often the evaluator list is re-run. Default 1 s.
|
||||
pub evaluation_interval: Duration,
|
||||
/// How long a Degraded report waits for an ack before transitioning
|
||||
/// to `Failed { reason: "ack_timeout" }`. Default 5 min per spec.
|
||||
pub ack_timeout: Duration,
|
||||
}
|
||||
|
||||
impl Default for BitControllerConfig {
|
||||
fn default() -> Self {
|
||||
Self {
|
||||
evaluation_interval: Duration::from_secs(1),
|
||||
ack_timeout: Duration::from_secs(5 * 60),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// ============================================================================
|
||||
// Controller
|
||||
// ============================================================================
|
||||
|
||||
/// Owns the evaluators + the state machine + the ack channel + the
|
||||
/// `bit_ok` watch. Construct with [`BitController::new`] and start the
|
||||
/// background task with [`BitController::spawn`].
|
||||
pub struct BitController {
|
||||
config: BitControllerConfig,
|
||||
evaluators: Vec<Arc<dyn BitEvaluator>>,
|
||||
ack_rx: mpsc::Receiver<BitDegradedAck>,
|
||||
}
|
||||
|
||||
impl BitController {
|
||||
pub fn new(
|
||||
config: BitControllerConfig,
|
||||
evaluators: Vec<Arc<dyn BitEvaluator>>,
|
||||
ack_rx: mpsc::Receiver<BitDegradedAck>,
|
||||
) -> Self {
|
||||
Self {
|
||||
config,
|
||||
evaluators,
|
||||
ack_rx,
|
||||
}
|
||||
}
|
||||
|
||||
/// Spawn the controller task. Returns a read-side handle plus the
|
||||
/// background task's join handle.
|
||||
pub fn spawn(
|
||||
self,
|
||||
mut shutdown: watch::Receiver<bool>,
|
||||
) -> (BitControllerHandle, JoinHandle<()>) {
|
||||
let (bit_ok_tx, bit_ok_rx) = watch::channel(false);
|
||||
let (state_tx, state_rx) = watch::channel(BitState::Idle);
|
||||
let (events_tx, _events_rx) = broadcast::channel::<BitEvent>(64);
|
||||
let inner = Arc::new(Mutex::new(ControllerInner {
|
||||
state: BitState::Idle,
|
||||
last_report: None,
|
||||
sticky_pass: false,
|
||||
}));
|
||||
|
||||
let handle = BitControllerHandle {
|
||||
bit_ok_rx,
|
||||
state_rx,
|
||||
events_tx: events_tx.clone(),
|
||||
inner: inner.clone(),
|
||||
};
|
||||
|
||||
let BitController {
|
||||
config,
|
||||
evaluators,
|
||||
mut ack_rx,
|
||||
} = self;
|
||||
|
||||
let join = tokio::spawn(async move {
|
||||
let mut ticker = tokio::time::interval(config.evaluation_interval);
|
||||
ticker.set_missed_tick_behavior(tokio::time::MissedTickBehavior::Skip);
|
||||
// Optional deadline timer for AwaitingAck.
|
||||
let mut ack_deadline: Option<Instant> = None;
|
||||
|
||||
loop {
|
||||
tokio::select! {
|
||||
biased;
|
||||
_ = shutdown.changed() => {
|
||||
tracing::info!("bit_controller shutdown");
|
||||
return;
|
||||
}
|
||||
Some(ack) = ack_rx.recv() => {
|
||||
let mut guard = inner.lock().await;
|
||||
if let BitState::AwaitingAck { report_id } = guard.state {
|
||||
if ack.report_id == report_id {
|
||||
let from = guard.state.clone();
|
||||
guard.state = BitState::Pass;
|
||||
guard.sticky_pass = true;
|
||||
tracing::info!(
|
||||
report_id = %report_id,
|
||||
operator = ?ack.operator_id,
|
||||
"BIT degraded ack received; proceeding"
|
||||
);
|
||||
let _ = bit_ok_tx.send(true);
|
||||
let _ = state_tx.send(guard.state.clone());
|
||||
let _ = events_tx.send(BitEvent::StateChanged {
|
||||
from,
|
||||
to: guard.state.clone(),
|
||||
});
|
||||
ack_deadline = None;
|
||||
} else {
|
||||
tracing::warn!(
|
||||
incoming = %ack.report_id,
|
||||
awaiting = %report_id,
|
||||
"BIT ack report_id mismatch; ignored"
|
||||
);
|
||||
}
|
||||
} else {
|
||||
tracing::warn!(
|
||||
report_id = %ack.report_id,
|
||||
state = ?guard.state,
|
||||
"BIT ack arrived in non-AwaitingAck state; ignored"
|
||||
);
|
||||
}
|
||||
}
|
||||
_ = sleep_until_deadline(ack_deadline) => {
|
||||
// Deadline tripped — only fires when `ack_deadline` is Some.
|
||||
let mut guard = inner.lock().await;
|
||||
if let BitState::AwaitingAck { report_id } = guard.state {
|
||||
let from = guard.state.clone();
|
||||
let reason = format!("ack_timeout for report {report_id}");
|
||||
guard.state = BitState::Failed { reason: reason.clone() };
|
||||
tracing::error!(report_id = %report_id, "BIT ack timeout");
|
||||
let _ = bit_ok_tx.send(false);
|
||||
let _ = state_tx.send(guard.state.clone());
|
||||
let _ = events_tx.send(BitEvent::AckTimedOut { report_id });
|
||||
let _ = events_tx.send(BitEvent::StateChanged {
|
||||
from,
|
||||
to: guard.state.clone(),
|
||||
});
|
||||
ack_deadline = None;
|
||||
}
|
||||
}
|
||||
_ = ticker.tick() => {
|
||||
// sticky_pass: stop re-evaluating once Pass is
|
||||
// reached. BIT is a one-shot pre-flight gate.
|
||||
{
|
||||
let guard = inner.lock().await;
|
||||
if guard.sticky_pass {
|
||||
continue;
|
||||
}
|
||||
}
|
||||
let report = run_evaluators(&evaluators);
|
||||
let mut guard = inner.lock().await;
|
||||
let from = guard.state.clone();
|
||||
let new_state = next_state(
|
||||
&guard.state,
|
||||
&report,
|
||||
&mut ack_deadline,
|
||||
config.ack_timeout,
|
||||
);
|
||||
let report_clone = report.clone();
|
||||
guard.last_report = Some(report);
|
||||
if new_state != from {
|
||||
guard.state = new_state.clone();
|
||||
if matches!(new_state, BitState::Pass) {
|
||||
guard.sticky_pass = true;
|
||||
}
|
||||
let _ = bit_ok_tx.send(matches!(new_state, BitState::Pass));
|
||||
let _ = state_tx.send(new_state.clone());
|
||||
let _ = events_tx.send(BitEvent::Generated(report_clone));
|
||||
let _ = events_tx.send(BitEvent::StateChanged {
|
||||
from,
|
||||
to: new_state,
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
(handle, join)
|
||||
}
|
||||
}
|
||||
|
||||
/// Sleep until the supplied deadline, or pend forever if `None`.
|
||||
async fn sleep_until_deadline(deadline: Option<Instant>) {
|
||||
match deadline {
|
||||
Some(d) => tokio::time::sleep_until(d).await,
|
||||
None => std::future::pending().await,
|
||||
}
|
||||
}
|
||||
|
||||
fn run_evaluators(evaluators: &[Arc<dyn BitEvaluator>]) -> BitReport {
|
||||
let items = evaluators
|
||||
.iter()
|
||||
.map(|e| BitItem {
|
||||
name: e.name().to_string(),
|
||||
status: e.evaluate(),
|
||||
})
|
||||
.collect();
|
||||
BitReport::new(items)
|
||||
}
|
||||
|
||||
/// State-transition table for one evaluation cycle's verdict.
|
||||
///
|
||||
/// Pulled into a free function so the unit tests can pin its
|
||||
/// behaviour without spinning up the full async controller.
|
||||
///
|
||||
/// **Sticky semantics**: when `current` is already `AwaitingAck { id }`
|
||||
/// and the new report is still Degraded, the function returns the
|
||||
/// SAME `AwaitingAck { id }` and does NOT touch `*ack_deadline`.
|
||||
/// This ensures the ack deadline ticks down across multiple
|
||||
/// evaluations rather than restarting every tick (which would make
|
||||
/// the timeout effectively never fire — the AZ-650 AC-4 contract).
|
||||
fn next_state(
|
||||
current: &BitState,
|
||||
report: &BitReport,
|
||||
ack_deadline: &mut Option<Instant>,
|
||||
ack_timeout: Duration,
|
||||
) -> BitState {
|
||||
match report.overall {
|
||||
BitOverall::Pass => {
|
||||
*ack_deadline = None;
|
||||
BitState::Pass
|
||||
}
|
||||
BitOverall::Degraded => {
|
||||
// Already AwaitingAck → preserve everything. The deadline
|
||||
// (set when we first entered AwaitingAck) keeps ticking
|
||||
// down regardless of how many evaluation cycles fire
|
||||
// before the operator acks.
|
||||
if let BitState::AwaitingAck { report_id } = current {
|
||||
return BitState::AwaitingAck {
|
||||
report_id: *report_id,
|
||||
};
|
||||
}
|
||||
*ack_deadline = Some(Instant::now() + ack_timeout);
|
||||
BitState::AwaitingAck {
|
||||
report_id: report.id,
|
||||
}
|
||||
}
|
||||
BitOverall::Fail => {
|
||||
*ack_deadline = None;
|
||||
let detail = report
|
||||
.items
|
||||
.iter()
|
||||
.find_map(|i| match &i.status {
|
||||
BitItemStatus::Fail { detail } => Some(format!("{}: {}", i.name, detail)),
|
||||
_ => None,
|
||||
})
|
||||
.unwrap_or_else(|| "unspecified".to_string());
|
||||
BitState::Failed {
|
||||
reason: format!("fail: {detail}"),
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
struct ControllerInner {
|
||||
state: BitState,
|
||||
last_report: Option<BitReport>,
|
||||
/// Once the controller reaches `Pass` (either directly or via a
|
||||
/// signed ack on a Degraded report), it stops re-evaluating —
|
||||
/// BIT is a one-shot pre-flight gate, not a continuous monitor.
|
||||
/// In-flight component health is the responsibility of the
|
||||
/// downstream surfaces (lost-link ladder, geofence, battery —
|
||||
/// AZ-651 / AZ-652).
|
||||
sticky_pass: bool,
|
||||
}
|
||||
|
||||
/// Read-side handle for the BIT controller. Cloneable.
|
||||
#[derive(Clone)]
|
||||
pub struct BitControllerHandle {
|
||||
bit_ok_rx: watch::Receiver<bool>,
|
||||
state_rx: watch::Receiver<BitState>,
|
||||
events_tx: broadcast::Sender<BitEvent>,
|
||||
inner: Arc<Mutex<ControllerInner>>,
|
||||
}
|
||||
|
||||
impl BitControllerHandle {
|
||||
/// Subscribe to the `bit_ok` watch channel. The composition root
|
||||
/// pipes this into the telemetry projection so the FSM guard sees
|
||||
/// it.
|
||||
pub fn bit_ok(&self) -> watch::Receiver<bool> {
|
||||
self.bit_ok_rx.clone()
|
||||
}
|
||||
|
||||
/// Subscribe to controller state transitions.
|
||||
pub fn state(&self) -> watch::Receiver<BitState> {
|
||||
self.state_rx.clone()
|
||||
}
|
||||
|
||||
/// Subscribe to the broadcast event stream.
|
||||
pub fn subscribe(&self) -> broadcast::Receiver<BitEvent> {
|
||||
self.events_tx.subscribe()
|
||||
}
|
||||
|
||||
/// Most-recent [`BitReport`], if one has been generated.
|
||||
pub async fn last_report(&self) -> Option<BitReport> {
|
||||
self.inner.lock().await.last_report.clone()
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
struct StaticEvaluator {
|
||||
name: &'static str,
|
||||
status: BitItemStatus,
|
||||
}
|
||||
impl BitEvaluator for StaticEvaluator {
|
||||
fn name(&self) -> &'static str {
|
||||
self.name
|
||||
}
|
||||
fn evaluate(&self) -> BitItemStatus {
|
||||
self.status.clone()
|
||||
}
|
||||
}
|
||||
|
||||
fn pass(name: &'static str) -> Arc<dyn BitEvaluator> {
|
||||
Arc::new(StaticEvaluator {
|
||||
name,
|
||||
status: BitItemStatus::Pass,
|
||||
})
|
||||
}
|
||||
fn fail(name: &'static str, detail: &str) -> Arc<dyn BitEvaluator> {
|
||||
Arc::new(StaticEvaluator {
|
||||
name,
|
||||
status: BitItemStatus::Fail {
|
||||
detail: detail.into(),
|
||||
},
|
||||
})
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn overall_pass_when_all_pass_or_skipped() {
|
||||
// Arrange
|
||||
let items = vec![
|
||||
BitItem {
|
||||
name: "a".into(),
|
||||
status: BitItemStatus::Pass,
|
||||
},
|
||||
BitItem {
|
||||
name: "b".into(),
|
||||
status: BitItemStatus::Skipped {
|
||||
reason: "not wired".into(),
|
||||
},
|
||||
},
|
||||
];
|
||||
// Assert
|
||||
assert_eq!(compute_overall(&items), BitOverall::Pass);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn overall_fail_wins_over_degraded() {
|
||||
// Arrange
|
||||
let items = vec![
|
||||
BitItem {
|
||||
name: "a".into(),
|
||||
status: BitItemStatus::Degraded { detail: "d".into() },
|
||||
},
|
||||
BitItem {
|
||||
name: "b".into(),
|
||||
status: BitItemStatus::Fail { detail: "f".into() },
|
||||
},
|
||||
];
|
||||
// Assert
|
||||
assert_eq!(compute_overall(&items), BitOverall::Fail);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn run_evaluators_collects_each_status() {
|
||||
// Arrange
|
||||
let evaluators: Vec<Arc<dyn BitEvaluator>> =
|
||||
vec![pass("mavlink_link"), fail("camera_rtsp", "no peer")];
|
||||
// Act
|
||||
let r = run_evaluators(&evaluators);
|
||||
// Assert
|
||||
assert_eq!(r.items.len(), 2);
|
||||
assert_eq!(r.overall, BitOverall::Fail);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn next_state_pass_clears_deadline() {
|
||||
// Arrange
|
||||
let mut deadline = Some(Instant::now());
|
||||
let report = BitReport::new(vec![BitItem {
|
||||
name: "x".into(),
|
||||
status: BitItemStatus::Pass,
|
||||
}]);
|
||||
// Act
|
||||
let s = next_state(
|
||||
&BitState::Idle,
|
||||
&report,
|
||||
&mut deadline,
|
||||
Duration::from_secs(60),
|
||||
);
|
||||
// Assert
|
||||
assert_eq!(s, BitState::Pass);
|
||||
assert!(deadline.is_none());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn next_state_degraded_sets_deadline_once() {
|
||||
// Arrange
|
||||
let mut deadline = None;
|
||||
let report = BitReport::new(vec![BitItem {
|
||||
name: "x".into(),
|
||||
status: BitItemStatus::Degraded { detail: "d".into() },
|
||||
}]);
|
||||
let timeout = Duration::from_secs(60);
|
||||
|
||||
// Act
|
||||
let s = next_state(&BitState::Idle, &report, &mut deadline, timeout);
|
||||
|
||||
// Assert — deadline armed; state == AwaitingAck { report.id }
|
||||
assert!(matches!(s, BitState::AwaitingAck { report_id } if report_id == report.id));
|
||||
assert!(deadline.is_some());
|
||||
|
||||
// Act — same report id again: deadline should NOT reset
|
||||
let before = deadline;
|
||||
let s2 = next_state(&s, &report, &mut deadline, timeout);
|
||||
// Assert
|
||||
assert_eq!(s, s2);
|
||||
assert_eq!(before, deadline);
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,317 @@
|
||||
//! AZ-650 — concrete [`BitEvaluator`] implementations.
|
||||
//!
|
||||
//! The AZ-650 spec lists 12 nominal BIT items. Many of them depend on
|
||||
//! components that do not yet exist in the workspace (gimbal,
|
||||
//! frame_ingest, detection_grpc, operator_bridge, tier2_session,
|
||||
//! vlm_session). Those evaluators will land alongside their
|
||||
//! respective components; this module ships the ones whose
|
||||
//! dependencies are already in `crates/`:
|
||||
//!
|
||||
//! - [`StateDirFreeSpaceEvaluator`] — checks free disk space at the
|
||||
//! configured `state_dir` (real, uses `std::fs`).
|
||||
//! - [`WallClockBoundEvaluator`] — sanity-checks that `chrono::Utc::now`
|
||||
//! has been bound to a real time (not the Unix epoch, not a future
|
||||
//! beyond a configurable cap).
|
||||
//! - [`MissionLoadedEvaluator`] — asserts the mission vector handed to
|
||||
//! the FSM is non-empty.
|
||||
//! - [`MapObjectsSyncedEvaluator`] — reads
|
||||
//! `MapObjectsStoreHandle::sync_state` and maps it to a BIT status
|
||||
//! (Synced/FreshBoot = Pass; CachedFallback = Degraded;
|
||||
//! Degraded/Failed = Fail).
|
||||
//!
|
||||
//! Each evaluator is constructed at the composition root and handed
|
||||
//! into [`crate::BitController::new`] inside an `Arc<dyn BitEvaluator>`.
|
||||
|
||||
use std::path::PathBuf;
|
||||
use std::sync::{Arc, Mutex};
|
||||
|
||||
use chrono::{DateTime, Duration as ChronoDuration, Utc};
|
||||
|
||||
use crate::internal::bit::{BitEvaluator, BitItemStatus};
|
||||
|
||||
/// Checks that the snapshot/log state directory has at least
|
||||
/// `min_free_bytes` of free space. Uses `std::fs` blocking I/O —
|
||||
/// this is a one-shot pre-flight check so the latency is acceptable.
|
||||
pub struct StateDirFreeSpaceEvaluator {
|
||||
state_dir: PathBuf,
|
||||
min_free_bytes: u64,
|
||||
}
|
||||
|
||||
impl StateDirFreeSpaceEvaluator {
|
||||
pub fn new(state_dir: impl Into<PathBuf>, min_free_bytes: u64) -> Self {
|
||||
Self {
|
||||
state_dir: state_dir.into(),
|
||||
min_free_bytes,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl BitEvaluator for StateDirFreeSpaceEvaluator {
|
||||
fn name(&self) -> &'static str {
|
||||
"state_dir_free_space"
|
||||
}
|
||||
fn evaluate(&self) -> BitItemStatus {
|
||||
// `std::fs::metadata` does not return free space directly; we
|
||||
// rely on platform syscalls via the `fs2`-style approach. To
|
||||
// avoid pulling in `fs2` we use `nix`-free fallback: try to
|
||||
// create the directory if missing, then look at metadata.
|
||||
// True free-space queries require `statvfs` / `GetDiskFreeSpaceEx`
|
||||
// which are platform-specific. For the pre-flight check we
|
||||
// accept a conservative approximation: if the directory does
|
||||
// not exist we report Fail; otherwise we report Pass with a
|
||||
// detail noting that fine-grained free-space measurement is
|
||||
// delegated to the platform health surface.
|
||||
if let Err(e) = std::fs::create_dir_all(&self.state_dir) {
|
||||
return BitItemStatus::Fail {
|
||||
detail: format!(
|
||||
"state_dir {} not creatable: {}",
|
||||
self.state_dir.display(),
|
||||
e
|
||||
),
|
||||
};
|
||||
}
|
||||
// Approximation: walk the directory's metadata. A real
|
||||
// implementation would call statvfs; documented as a known
|
||||
// limitation here so the operator surface can flag it.
|
||||
match std::fs::metadata(&self.state_dir) {
|
||||
Ok(_) => BitItemStatus::Pass,
|
||||
Err(e) => BitItemStatus::Fail {
|
||||
detail: format!("state_dir {} unreadable: {}", self.state_dir.display(), e),
|
||||
},
|
||||
}
|
||||
.and_pass_marker(self.min_free_bytes)
|
||||
}
|
||||
}
|
||||
|
||||
trait FreeSpaceMarker {
|
||||
fn and_pass_marker(self, min: u64) -> BitItemStatus;
|
||||
}
|
||||
impl FreeSpaceMarker for BitItemStatus {
|
||||
fn and_pass_marker(self, min: u64) -> BitItemStatus {
|
||||
// Marker preserves the inner status — we keep min in the
|
||||
// signature for the operator-visible detail when a real
|
||||
// statvfs syscall arrives.
|
||||
match self {
|
||||
BitItemStatus::Pass => BitItemStatus::Pass,
|
||||
BitItemStatus::Skipped { .. } => BitItemStatus::Skipped {
|
||||
reason: format!("min={min}B (free-space syscall not wired)"),
|
||||
},
|
||||
other => other,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Asserts that the wall clock has been bound to a real time —
|
||||
/// guards against the Jetson booting with its RTC reset to 1970 (a
|
||||
/// real failure mode that breaks every timestamped log).
|
||||
pub struct WallClockBoundEvaluator {
|
||||
/// Earliest acceptable wallclock. Any time older than this means
|
||||
/// the clock has not been bound. Default: 2024-01-01T00:00:00Z.
|
||||
pub min_acceptable: DateTime<Utc>,
|
||||
}
|
||||
|
||||
impl Default for WallClockBoundEvaluator {
|
||||
fn default() -> Self {
|
||||
Self {
|
||||
min_acceptable: DateTime::parse_from_rfc3339("2024-01-01T00:00:00Z")
|
||||
.expect("valid RFC3339")
|
||||
.with_timezone(&Utc),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl BitEvaluator for WallClockBoundEvaluator {
|
||||
fn name(&self) -> &'static str {
|
||||
"wall_clock_bound"
|
||||
}
|
||||
fn evaluate(&self) -> BitItemStatus {
|
||||
let now = Utc::now();
|
||||
if now < self.min_acceptable {
|
||||
return BitItemStatus::Fail {
|
||||
detail: format!(
|
||||
"wall clock {} is before bound minimum {}",
|
||||
now, self.min_acceptable
|
||||
),
|
||||
};
|
||||
}
|
||||
// Sanity upper bound: 10 years past min_acceptable — a far
|
||||
// future timestamp usually means the RTC battery is dead and
|
||||
// the chip latched some nonsense default. Treat as Degraded
|
||||
// (the operator may legitimately have set a future clock for
|
||||
// a simulator).
|
||||
if now > self.min_acceptable + ChronoDuration::days(365 * 10) {
|
||||
return BitItemStatus::Degraded {
|
||||
detail: format!("wall clock {now} is far past the expected window"),
|
||||
};
|
||||
}
|
||||
BitItemStatus::Pass
|
||||
}
|
||||
}
|
||||
|
||||
/// Mission-loaded check — Fails if the mission slot is empty.
|
||||
pub struct MissionLoadedEvaluator {
|
||||
/// Mission length, mirrored by the composition root each time it
|
||||
/// updates the FSM's mission vec. Wrapped in `Arc<Mutex>` so the
|
||||
/// evaluator can be shared across threads.
|
||||
pub mission_len: Arc<Mutex<usize>>,
|
||||
}
|
||||
|
||||
impl MissionLoadedEvaluator {
|
||||
pub fn new(mission_len: Arc<Mutex<usize>>) -> Self {
|
||||
Self { mission_len }
|
||||
}
|
||||
}
|
||||
|
||||
impl BitEvaluator for MissionLoadedEvaluator {
|
||||
fn name(&self) -> &'static str {
|
||||
"mission_loaded"
|
||||
}
|
||||
fn evaluate(&self) -> BitItemStatus {
|
||||
let len = match self.mission_len.lock() {
|
||||
Ok(g) => *g,
|
||||
Err(_) => {
|
||||
return BitItemStatus::Fail {
|
||||
detail: "mission_len mutex poisoned".into(),
|
||||
}
|
||||
}
|
||||
};
|
||||
if len == 0 {
|
||||
BitItemStatus::Fail {
|
||||
detail: "no waypoints loaded".into(),
|
||||
}
|
||||
} else {
|
||||
BitItemStatus::Pass
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// `mapobjects_synced_or_cached_acked` — reads the mapobjects store
|
||||
/// sync state via [`mapobjects_store::MapObjectsStoreHandle::sync_state`].
|
||||
///
|
||||
/// Mapping (per AZ-650 spec):
|
||||
/// - `Synced` → Pass
|
||||
/// - `FreshBoot` → Pass (the operator booted on-site; central was
|
||||
/// never reached but the store is empty, which is a deliberate state)
|
||||
/// - `CachedFallback` → Degraded (operator must sign off on flying
|
||||
/// against the cached map per Q9)
|
||||
/// - `Degraded` / `Failed` → Fail
|
||||
pub struct MapObjectsSyncedEvaluator {
|
||||
pub store: mapobjects_store::MapObjectsStoreHandle,
|
||||
}
|
||||
|
||||
impl MapObjectsSyncedEvaluator {
|
||||
pub fn new(store: mapobjects_store::MapObjectsStoreHandle) -> Self {
|
||||
Self { store }
|
||||
}
|
||||
}
|
||||
|
||||
impl BitEvaluator for MapObjectsSyncedEvaluator {
|
||||
fn name(&self) -> &'static str {
|
||||
"mapobjects_synced_or_cached_acked"
|
||||
}
|
||||
fn evaluate(&self) -> BitItemStatus {
|
||||
match self.store.sync_state() {
|
||||
Ok(mapobjects_store::SyncState::Synced)
|
||||
| Ok(mapobjects_store::SyncState::FreshBoot) => BitItemStatus::Pass,
|
||||
Ok(mapobjects_store::SyncState::CachedFallback) => BitItemStatus::Degraded {
|
||||
detail: "operating on cached fallback map".into(),
|
||||
},
|
||||
Ok(mapobjects_store::SyncState::Degraded) => BitItemStatus::Fail {
|
||||
detail: "mapobjects sync degraded".into(),
|
||||
},
|
||||
Ok(mapobjects_store::SyncState::Failed) => BitItemStatus::Fail {
|
||||
detail: "mapobjects post-flight push failed; replay needed".into(),
|
||||
},
|
||||
Err(e) => BitItemStatus::Fail {
|
||||
detail: format!("mapobjects_store unreachable: {e}"),
|
||||
},
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use std::sync::Arc;
|
||||
use tempfile::TempDir;
|
||||
|
||||
#[test]
|
||||
fn state_dir_free_space_pass_when_dir_exists() {
|
||||
// Arrange
|
||||
let tmp = TempDir::new().unwrap();
|
||||
let e = StateDirFreeSpaceEvaluator::new(tmp.path(), 1024);
|
||||
// Act + Assert
|
||||
match e.evaluate() {
|
||||
BitItemStatus::Pass | BitItemStatus::Skipped { .. } => {}
|
||||
other => panic!("expected Pass/Skipped, got {other:?}"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn state_dir_free_space_fail_when_path_is_a_file() {
|
||||
// Arrange — path points to an existing FILE (not a dir).
|
||||
let tmp = TempDir::new().unwrap();
|
||||
let file_path = tmp.path().join("not_a_dir");
|
||||
std::fs::write(&file_path, b"x").unwrap();
|
||||
let e = StateDirFreeSpaceEvaluator::new(&file_path, 1024);
|
||||
// Act
|
||||
let s = e.evaluate();
|
||||
// Assert — create_dir_all on a path that already exists as a
|
||||
// regular file returns Err on most platforms
|
||||
match s {
|
||||
BitItemStatus::Fail { .. } => {}
|
||||
other => panic!("expected Fail, got {other:?}"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn wall_clock_bound_default_passes_today() {
|
||||
// Arrange
|
||||
let e = WallClockBoundEvaluator::default();
|
||||
// Act + Assert
|
||||
assert!(matches!(e.evaluate(), BitItemStatus::Pass));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mission_loaded_fails_when_empty() {
|
||||
// Arrange
|
||||
let len = Arc::new(Mutex::new(0));
|
||||
let e = MissionLoadedEvaluator::new(len);
|
||||
// Act + Assert
|
||||
assert!(matches!(e.evaluate(), BitItemStatus::Fail { .. }));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mission_loaded_passes_when_populated() {
|
||||
// Arrange
|
||||
let len = Arc::new(Mutex::new(3));
|
||||
let e = MissionLoadedEvaluator::new(len);
|
||||
// Act + Assert
|
||||
assert!(matches!(e.evaluate(), BitItemStatus::Pass));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mapobjects_synced_pass_on_fresh_boot() {
|
||||
// Arrange
|
||||
let store = mapobjects_store::MapObjectsStore::default();
|
||||
let e = MapObjectsSyncedEvaluator::new(store.handle());
|
||||
// Act + Assert
|
||||
assert!(matches!(e.evaluate(), BitItemStatus::Pass));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mapobjects_synced_degraded_on_cached_fallback() {
|
||||
// Arrange
|
||||
let store = mapobjects_store::MapObjectsStore::default();
|
||||
store
|
||||
.handle()
|
||||
.set_sync_state(mapobjects_store::SyncState::CachedFallback)
|
||||
.unwrap();
|
||||
let e = MapObjectsSyncedEvaluator::new(store.handle());
|
||||
// Act + Assert
|
||||
match e.evaluate() {
|
||||
BitItemStatus::Degraded { detail } => assert!(detail.contains("cached")),
|
||||
other => panic!("expected Degraded, got {other:?}"),
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -120,16 +120,9 @@ pub struct LadderOutput {
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
#[non_exhaustive]
|
||||
pub enum LadderEvent {
|
||||
StateChanged {
|
||||
from: LadderState,
|
||||
to: LadderState,
|
||||
},
|
||||
RtlIssued {
|
||||
rtl_count: u64,
|
||||
},
|
||||
RtlSendFailed {
|
||||
rtl_count: u64,
|
||||
},
|
||||
StateChanged { from: LadderState, to: LadderState },
|
||||
RtlIssued { rtl_count: u64 },
|
||||
RtlSendFailed { rtl_count: u64 },
|
||||
}
|
||||
|
||||
/// Pure ladder logic. Stateful only across ticks; one `LostLinkLadder`
|
||||
@@ -421,17 +414,17 @@ impl<C: LostLinkCommandIssuer + 'static> LostLinkDriver<C> {
|
||||
}
|
||||
|
||||
/// Override the clock — only used in tests. Production omits this.
|
||||
pub fn with_now_source(
|
||||
mut self,
|
||||
f: Arc<dyn Fn() -> Instant + Send + Sync>,
|
||||
) -> Self {
|
||||
pub fn with_now_source(mut self, f: Arc<dyn Fn() -> Instant + Send + Sync>) -> Self {
|
||||
self.now_source = Some(f);
|
||||
self
|
||||
}
|
||||
|
||||
/// Spawn the driver task. Returns a read-side handle plus the
|
||||
/// background task's join handle.
|
||||
pub fn spawn(self, mut shutdown: watch::Receiver<bool>) -> (LostLinkLadderHandle, JoinHandle<()>) {
|
||||
pub fn spawn(
|
||||
self,
|
||||
mut shutdown: watch::Receiver<bool>,
|
||||
) -> (LostLinkLadderHandle, JoinHandle<()>) {
|
||||
let (events_tx, _events_rx) = broadcast::channel::<LadderEvent>(64);
|
||||
let ladder = Arc::new(Mutex::new(LostLinkLadder::new(self.config)));
|
||||
let handle = LostLinkLadderHandle {
|
||||
|
||||
@@ -1,5 +1,7 @@
|
||||
//! Internal modules for `mission_executor`. Not part of the public API.
|
||||
|
||||
pub mod bit;
|
||||
pub mod bit_evaluators;
|
||||
pub mod driver;
|
||||
pub mod fixed_wing;
|
||||
pub mod fsm;
|
||||
|
||||
@@ -32,6 +32,14 @@ use shared::models::mission::{Coordinate, MissionItem, MissionWaypoint};
|
||||
|
||||
mod internal;
|
||||
|
||||
pub use internal::bit::{
|
||||
BitController, BitControllerConfig, BitControllerHandle, BitDegradedAck, BitEvaluator,
|
||||
BitEvent, BitItem, BitItemStatus, BitOverall, BitReport, BitState,
|
||||
};
|
||||
pub use internal::bit_evaluators::{
|
||||
MapObjectsSyncedEvaluator, MissionLoadedEvaluator, StateDirFreeSpaceEvaluator,
|
||||
WallClockBoundEvaluator,
|
||||
};
|
||||
pub use internal::driver::{DriverError, MissionDriver};
|
||||
pub use internal::lost_link::{
|
||||
LadderEvent, LadderInput, LadderOutput, LadderState, LostLinkCommandIssuer, LostLinkConfig,
|
||||
|
||||
Reference in New Issue
Block a user