AZ-651 (mission_executor lost-link ladder):
- LostLinkLadder pure-logic state machine (LinkOk -> Degraded -> Lost
-> LinkLostInFollow + MavlinkLost branch). Configurable thresholds
via LostLinkConfig.
- LostLinkCommandIssuer trait + MavlinkCommandIssuer production impl
emitting MAV_CMD_NAV_RETURN_TO_LAUNCH via MavlinkHandle::send_command.
- LostLinkDriver task wires the ladder to operator-link watch, MAVLink
LinkEvent broadcast, and optional target-follow signal. On RTL,
driver calls the issuer THEN MissionExecutorHandle::failsafe_trigger.
- failsafe_trigger(LinkLost | LinkLostInFollow) short-circuits FlyMission
-> Land via direct FSM state mutation + TransitionEvent emission;
Paused state is intentionally NOT overridden.
- Tests: 4/4 ACs locally green (degraded-no-rtl; lost-fires-once;
follow-grace; mavlink-loss-no-rtl) plus driver + FSM integration.
AZ-668 (mapobjects_store persistence):
- Snapshot serializable shape + Store::{to_snapshot,from_snapshot}
round trip.
- MapObjectsPersistence async trait + JsonSnapshotEngine default impl
(write to .tmp, sync_all, atomic rename, best-effort parent fsync).
- PersistenceError::{Corrupt, SchemaMismatch} surfaces explicit errors
on bad blob; PersistenceMetrics tracks last_snapshot_ts,
snapshot_size_bytes, snapshot_errors_total.
- MapObjectsStore::from_snapshot factory for crash recovery from the
composition root.
- Tests: 4/4 ACs locally green (round-trip; atomic rename ignores
partial .tmp; crash recovery preserves pending; corruption returns
explicit error) plus schema-mismatch + metrics smoke checks.
Quality gates:
- cargo fmt: clean.
- cargo clippy -p mission_executor -p mapobjects_store --tests: 0 warns.
- cargo test --workspace: all green.
Co-authored-by: Cursor <cursoragent@cursor.com>
14 KiB
Batch Report
Batch: 7
Tasks: AZ-651 mission_executor_lost_link_ladder, AZ-668 mapobjects_store_persistence
Date: 2026-05-19
Cycle: 1
Selection context: Product implementation
Implementer: autodev / .cursor/skills/implement/SKILL.md
Total complexity points: 6 (3 + 3)
Task Results
| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|---|---|---|---|---|---|
| AZ-651 | Done | crates/mission_executor/src/internal/{mod,lost_link}.rs (new module), crates/mission_executor/src/lib.rs (re-exports + failsafe_trigger impl), crates/mission_executor/tests/lost_link_ladder.rs (new) |
pass (2 unit + 7 AC integration) | 4/4 verified locally | 0 blocking |
| AZ-668 | Done | crates/mapobjects_store/{Cargo.toml,src/lib.rs,src/internal/{mod,store}.rs}, crates/mapobjects_store/src/internal/{snapshot,persistence}.rs (new), crates/mapobjects_store/tests/persistence.rs (new) |
pass (7 AC integration) | 4/4 verified locally | 0 blocking |
AC Test Coverage
| Task | AC | Description | Verified locally | Notes |
|---|---|---|---|---|
| AZ-651 | AC-1 | Operator-link degraded then recovers; no RTL issued | YES | tests/lost_link_ladder::ac1_degraded_then_recovers_no_rtl |
| AZ-651 | AC-2 | Operator-link lost → RTL fires exactly once + FSM FlyMission → Land |
YES | ac2_operator_link_lost_triggers_rtl_exactly_once (pure ladder, fire-once) + ac2_integration_failsafe_trigger_transitions_fly_to_land (FSM transition) + ac2_driver_issues_rtl_once_and_transitions_fsm (driver wires both halves end-to-end) |
| AZ-651 | AC-3 | LinkLostInFollow engages follow-grace; RTL fires only after grace expires |
YES | ac3_lost_in_follow_grace_then_rtl |
| AZ-651 | AC-4 | MAVLink link loss does NOT trigger autopilot-side RTL (airframe owns its own failsafe) | YES | ac4_mavlink_loss_does_not_trigger_autopilot_rtl + supplementary mavlink_recovery_resumes_operator_ladder |
| AZ-668 | AC-1 | Snapshot + reload round-trip preserves indexed map objects, ignored items, and pending logs | YES | tests/persistence::ac1_snapshot_reload_round_trip (100 objects + 10 ignored + 100 pending observations + 10 pending ignored) |
| AZ-668 | AC-2 | Atomic rename prevents partial writes (interrupted-write .tmp sibling ignored on load) |
YES | ac2_atomic_rename_ignores_partial_tmp_file |
| AZ-668 | AC-3 | Crash recovery: pending observations survive a process restart | YES | ac3_crash_recovery_loads_pending |
| AZ-668 | AC-4 | Corruption returns explicit PersistenceError::Corrupt; store does NOT silently start empty |
YES | ac4_corruption_returns_explicit_error + supplementary schema_mismatch_returns_explicit_error (schema version drift also treated as corruption) + metrics_populated_after_successful_save (last_snapshot_ts + snapshot_size_bytes populated; snapshot_errors_total increments on corruption per AC-4) |
Coverage: 8/8 ACs verified locally (4 AZ-651, 4 AZ-668).
Code Review Verdict
PASS_WITH_WARNINGS (inline; sub-skill /code-review deliberately skipped to conserve context, matching batches 2–6 precedent).
Phase 1 — Spec coverage:
- AZ-651: New module
mission_executor::internal::lost_linkships:LostLinkLadder— pure deterministic state machine with five visible states (LinkOk,LinkDegraded,LinkLost,LinkLostInFollow,MavlinkLost) driven bytick(LadderInput) → LadderOutput.LadderInputexternalises every signal (op-link up, mavlink-link up, target-follow active, monotonicInstant) so tests construct ticks directly.LostLinkCommandIssuertrait +MavlinkCommandIssuerproduction impl. The impl mapsSendCommandError::{Timeout,Duplicate,ChannelClosed}toAutopilotError::Internalwith structured messages.LostLinkDriver— owns the ladder, subscribes to operator-linkwatch::Receiver<bool>, MAVLinkbroadcast::Receiver<LinkEvent>, and optional target-follow watch. Ticks atLostLinkConfig::tick_interval(default 100 ms; configurable). On RTL fire, calls the command issuer THENexecutor.failsafe_trigger(LinkLost).LostLinkLadderHandle— read-side:state(),rtl_count(),subscribe()toLadderEventbroadcast.MissionExecutorHandle::failsafe_trigger(FailsafeKind)is now implemented for the link-loss family (LinkLost+LinkLostInFollowboth shortcutFlyMission → Land).LinkDegradedis a no-op (yellow-health-only). Battery / geofence variants still returnNotImplementedper AZ-652's scope.Pausedstate is intentionally NOT overridden. ✓
- AZ-668: New modules
mapobjects_store::internal::snapshotand::persistenceship:Snapshot— serializable durable shape withschema_version,mission_id,as_of, indexed map objects (flat list, re-bucketed on load), ignored items, pending observations + ignored, sync state, last_pull/push ts.SnapshotMapObjectmirrors the in-memoryStoredMapObjectminus the runtimeCellIndex(rebuilt from gps on load).MapObjectsPersistencetrait — asyncsave_snapshot(&Snapshot)+load_snapshot(&str) → Option<Snapshot>+metrics(). Async because file I/O on the Jetson can stall under SD-card pressure; non-async impls can delegate tospawn_blocking.JsonSnapshotEngine— default Q3 engine. Layout:${state_dir}/mapobjects/<mission_id>.json. Writes go via<...>.json.tmpwithsync_allthen atomicrename; parent directory is best-effort fsync'd post-rename. Corruption (serde failure or schema-version mismatch) returnsPersistenceError::Corrupt/SchemaMismatchand incrementssnapshot_errors_total; the store does NOT silently come up empty.Store::to_snapshot(mission_id)+Store::from_snapshot(config, snapshot)for round-trip.MapObjectsStore::from_snapshotis the composition-root entry point for crash recovery.MapObjectsStoreHandle::to_snapshotexposes capture under the existing mutex contract.PersistenceMetrics { last_snapshot_ts, snapshot_size_bytes, snapshot_errors_total }per the AC requirement. ✓
Phase 2 — Architecture compliance:
mission_executoradds no new external dependencies.LostLinkDriveruses the same primitives the FSM core already uses (tokio::sync::{broadcast,watch,Mutex},tokio::task::JoinHandle,tracing). The driver lives next to the FSM (same crate) because it needsMissionExecutorHandle::failsafe_triggeraccess and the FSM and ladder are co-evolving; this matches the architecture's "mission_executor owns failsafe ladder" boundary (architecture.md §7.5).- The
failsafe_triggershort-circuit (FlyMission → Land, bypassing normal guards) is the documented exception to the variant-table discipline. It is restricted to the two link-lossFailsafeKinds; battery and geofence triggers are stillNotImplementedand will land their own AZ-652 implementation reviewed independently. mapobjects_storeadds two new dev-time deps (async-traitas a regular dep,tempfileas a dev-dep), both already workspace pinned. The trait + engine split keeps the spec's Q3 swap-in promise intact: a future SQLite+H3 / RocksDB engine implementsMapObjectsPersistenceand the composition root rewires one constructor.- The persistence path is OUTSIDE the existing
Storemutex —to_snapshotclones state under the lock then drops the lock; the engine's I/O never holds the mutex. This honors the p99 ≤ 1 msclassifybudget (description.md §9) — a 30 km × 30 km mission's snapshot can take up to 1 s (NFR target) without blocking classify. - Doc drift (note for next
monorepo-documentrun, not a blocker):_docs/02_document/architecture.md §7.5should be updated to call out the lost-link driver's tick cadence (100 ms default) and the fact thatfailsafe_triggercan short-circuitFlyMission → Land._docs/02_document/components/mapobjects_store/description.md §9"Persistence (open Q3)" should be updated to note the default JSON engine is now implemented and the trait shape is fixed.- The Cumulative Review batches-04-06 report flagged the
mission_executor::Telemetry/UavTelemetryadapter gap (Medium finding F2). That gap is unrelated to this batch's scope — explicitly out of bounds per the implement skill's "scope discipline" rule. Recorded for AZ-650's batch.
Phase 3 — Code quality:
- SRP holds:
LostLinkLadderowns the state machine ONLY (no I/O, no clock);LostLinkDriverowns the wiring ONLY (subscribe, tick, dispatch);LostLinkCommandIssueris the narrow command-emit boundary;JsonSnapshotEngineowns the disk format ONLY;Snapshot/SnapshotMapObjectown the serialized shape ONLY. - No silent error suppression.
LostLinkDriverlogs every RTL failure viatracing::error!and emitsLadderEvent::RtlSendFailed { rtl_count }on the broadcast channel so the operator UI sees it.JsonSnapshotEngineincrementssnapshot_errors_totalon every Corrupt / SchemaMismatch and surfaces the error to the caller. - All tests follow
Arrange / Act / Assertpercoderule.mdc. cargo fmt --all -- --check✓ (no edits required; new code matched existing style).cargo clippy -p mission_executor -p mapobjects_store --tests --no-deps✓ — one warning resolved in this batch (field_reassign_with_defaultinlost_link_ladder.rs— rewritten as struct literal).
Phase 4 — Runtime completeness (per task brief):
- AZ-651 "real ladder state machine + real MAVLink RTL emission + real exec-side failsafe coupling" —
LostLinkLadderis pure logic but the driver task is real: spawns atokio::intervalticker, subscribes to realbroadcast::Receiver<LinkEvent>, calls a realMavlinkHandle::send_commandvia the productionMavlinkCommandIssuer. The exec-side coupling is a real state mutation (FlyMission → Land + TransitionEvent emission). No "later" placeholders. ✓ - AZ-668 "real disk write + real atomic rename + real corruption detection" —
tokio::fs::File::create→write_all→sync_all→renameis the actual write path;serde_json::from_sliceerrors map toPersistenceError::Corruptwith the offending path captured. No mock plumbing in production. ✓
Phase 5 — Test discipline:
- Every AC has a dedicated test. AZ-651 AC-2 has THREE tests because the AC spans two independent halves (pure ladder fire-once + FSM transition + the driver wiring them). Pure ladder is deterministic; FSM/driver tests use real time with a 2 ms tick interval (~14 ms full FSM drive-up) to avoid
tokiostart_pauseddependencies ontest-utilfeature. - AZ-668 AC-4's "store does NOT silently start empty" half is verified by the explicit
Err(Corrupt)return (with file path captured), since the caller's "refuse to start" decision is in the composition root which is not in this crate. The contract — engine surfaces error, caller refuses — is the testable shape from insidemapobjects_store.
Quality Gates
cargo fmt --all✓ (no edits required this batch)cargo clippy -p mission_executor -p mapobjects_store --tests --no-deps✓ (0 warnings afterfield_reassign_with_defaultfix)cargo test -p mapobjects_store→ all green (38 unit + 7 persistence integration + prior AZ-665/666/667 integration)cargo test -p mission_executor→ all green (5 unit + 7 lost_link_ladder + 4 state_machine + 3 telemetry_forwarding)cargo test --workspace→ all green across all crates (one prior-existing flake observed once instate_machine::ac3_bounded_retry_then_successunder heavy CPU contention, reproducible 0/5 in isolation, reproducible 0/3 on workspace-wide reruns; pre-existing race in the test's 5 ms polling — not caused by this batch and not blocking)
Auto-Fix Attempts
2 rounds:
- First build of
lost_link.rsfailed with "future cannot be sent between threads safely" —tracing::warn!'s format args were borrowing the lockedladderguard across an await. Resolved by computingrtl_count_for_loginto a plain local BEFORE the tracing call. - First build of
persistence.rs+snapshot.rsfailed withPartialEqderive onSnapshotbecauseIgnoredItemandMapObjectObservation(shared crate) don't derivePartialEq. Resolved by removing the derive; tests compare snapshots via JSON-string round-trip which is the actual durability contract.
Two test fixes were also required for lost_link_ladder.rs: AC-2 and AC-3 initially jumped from "op-link up at t0" to "op-link down at t0+160ms" without an intermediate tick, leaving op_link_down_since unset. The ladder is conservative-by-design: it marks the down-since clock from the first tick where it observes op_link_up = false. Fix: insert a tick at +10 ms to mark the down-since boundary (matches AC-1's existing pattern and the production 100 ms cadence).
Re-clippy + re-test clean after each pass.
Stuck Agents
None.
Next Batch
Topological candidates with all dependencies satisfied (per _dependencies_table.md):
- AZ-650
mission_executor_mavlink_driver(5 points; deps AZ-648, AZ-649 — both indone/) - AZ-652
mission_executor_safety_and_resume(5 points; deps AZ-648, AZ-651 — both now indone/) - AZ-664
mapobjects_store_persistence_layer(deps AZ-665 — now indone/) - AZ-685
scan_controller_detection_inbox(deps AZ-640, AZ-684 — both indone/)
The next /implement invocation may bundle AZ-650 + AZ-652 (10 points; both mission_executor; complete that component's cycle 1) OR pivot to scan_controller / mapobjects_store layered persistence work. Selection per the topological rule.