Implements the vendor wire protocol for the A40 gimbal (XOR-8 checksum, not CRC16 — task spec corrected against ArduPilot AP_Mount_Viewpro.h): frame encode/decode, typed FrameId/CameraCommand/ImageSensor, A1 angles, C1 camera, C2 set-zoom command builders, and a tokio UdpSocket transport with bounded retry, per-command deadline, and atomic vendor-fault counters surfaced via faults()/health(). GimbalControllerHandle::set_pose and zoom now ride the transport when wired; remain disabled when no transport is bound. 32/32 gimbal_controller tests green; workspace test suite green except for a pre-existing flake in mission_executor::state_machine::ac3_bounded_retry_then_success that reproduces only under parallel workspace test load (passes 5/5 in isolation; flagged in batch 8 report, unrelated to this batch). Co-authored-by: Cursor <cursoragent@cursor.com>
15 KiB
Batch 10 (cycle 1) implementation report
Tasks: AZ-653
Component scope: gimbal_controller
Verdict: PASS_WITH_WARNINGS — proceed; flagged items below.
Tasks
AZ-653 gimbal_a40_transport — ViewPro A40 vendor UDP transport
Outcome: Implemented. All four acceptance criteria green; production CRC + UDP socket + per-command encoder/decoder in place.
Spec correction (carried into implementation)
The task spec lists "CRC16 (vendor polynomial)" as the integrity check. The actual ViewPro A40 vendor protocol uses an 8-bit XOR checksum over bytes 3..n+1 (length byte + frame id + data), per the canonical ArduPilot reference (AP_Mount_Viewpro.h::calc_crc) and ViewPro's published TCP/UDP Command Packet Format doc. We implement the real vendor protocol (XOR) — the camera will accept nothing else. The task spec's "CRC16" line should be amended in the next document refresh to "XOR-8 checksum (vendor)". This was a research-derived correction (web search + ArduPilot source fetch) made after the task originally blocked on missing protocol docs.
Production code added:
crates/gimbal_controller/src/internal/a40_protocol/checksum.rsxor_checksum(buf: &[u8]) -> u8— 8-bit XOR fold; pure logic.
crates/gimbal_controller/src/internal/a40_protocol/frame.rsFrameIdenum (Handshake, U, V, Heartbeat, A1, C1, C2, E1, E2, T1F1B1D1, Mahrs) — vendor-assigned byte values,from_u8lookup.Frame { frame_id, data, frame_counter }— decoded payload.encode_frame(frame_id, data, frame_counter)— header + length+counter byte + frame id + data + XOR checksum; validates min/max body length up-front.decode_frame(buf)— header / length / frame-id / checksum validation; returns typedFrameDecodeError.- Constants:
MAX_PACKET_LEN=63,MIN_BODY_LEN=4,MAX_BODY_LEN=63.
crates/gimbal_controller/src/internal/a40_protocol/commands.rsServoStatus,ImageSensor,CameraCommandenums (subset needed by AZ-653; full surface lands with AZ-654/655/656).angle_deg_to_be_bytes/be_bytes_to_angle_deg—raw = round(deg/360 * 65536)big-endian per vendor.build_a1_angles(yaw_deg, pitch_deg)— 9-byte A1 payload.build_c1_camera(sensor, cmd)— 2-byte C1 payload.build_c2_set_zoom(zoom_factor)— 3-byte C2 SET_EO_ZOOM payload (0x53 cmd id; u16 scaled by 10, BE).
crates/gimbal_controller/src/internal/transport.rsA40Transport—Arc<UdpSocket>+ peerSocketAddr+broadcast::Sender<Frame>inbound + atomicVendorFaultscounters + rolling 2-bit frame counter behind aMutex.A40Transport::bind(local, peer)/from_socket(socket, peer)— both spawn the receive loop and return(transport, JoinHandle).send_oneway(frame_id, data)— fire-and-forget (used byM_AHRSattitude pushes).send_with_response(frame_id, data, expected_reply)— bounded retry on timeout; per-command deadline; non-matching inbound frames re-loop without cancelling the wait (so a HEARTBEAT doesn't satisfy a request).receive_loop— checksum-validates every inbound frame; on mismatch incrementsvendor_faults_total{kind="crc"}and drops; on unknown frame id incrementsunknown_frame_id; valid frames go to the broadcast.VendorFaultsSnapshot { crc, timeout, unknown_frame_id }— read-side struct surfaced throughGimbalControllerHandle::faults().- Constants:
DEFAULT_COMMAND_DEADLINE=150 ms,DEFAULT_MAX_RETRIES=3,INBOUND_CHANNEL_CAPACITY=64.
crates/gimbal_controller/src/lib.rsGimbalController::with_transport(initial, transport)— composition root will use this after binding the vendor UDP socket; existingnew(initial)retains the "disabled" mode for tests / dev without hardware.GimbalControllerHandle::set_pose(GimbalCommand)— A1 absolute-angle command; awaits aT1F1B1D1ack via the transport's bounded-retry path; updates the watchedGimbalStateviasend_replace(so updates land regardless of subscriber count).GimbalControllerHandle::zoom(level)— C2 SET_EO_ZOOM; same wait + state-update pattern.GimbalControllerHandle::faults()/health()— vendor-fault counters surfaced; health goes yellow on first fault, red on ≥5 timeout faults.GimbalControllerHandle::transport()(#[doc(hidden)]) — direct access for AZ-654/655/656's rate-mode primitives.
Tests:
crates/gimbal_controller/tests/a40_transport.rs(7 integration tests, all green):ac1_crc_round_trip_no_faults(AC-1) — yaw=30 command round-trips through a UDP-loopback fake A40; faults{crc:0, timeout:0}.ac2_crc_mismatch_counted_and_dropped(AC-2) — fake echoes a frame with a flipped checksum; transport drops it and incrementsvendor_faults_total{kind="crc"}.ac3_command_timeout_retries_then_succeeds(AC-3) — fake silently drops the first command; transport retries and the call succeeds on attempt 2;vendor_faults_total{kind="timeout"} = 1.ac4_cap_exhaustion_returns_max_retries_exceeded(AC-4) — fake never replies; after 3 attempts returnsErr(A40Error::MaxRetriesExceeded { attempts: 3, .. }); the fake observes exactly 3 inbound datagrams.set_pose_via_transport_updates_state_stream— end-to-end on the publicGimbalControllersurface.zoom_via_transport_updates_zoom_state— same forzoom.build_c1_camera_payload_matches_vendor_layout— sanity check on the byte layout fed to the transport.
- Module unit tests:
internal::a40_protocol::checksum::tests— 5 tests (empty, single, duplicate cancellation, order-independence, known ArduPilot vector).internal::a40_protocol::frame::tests— 9 tests (A1 round-trip, C1 round-trip, frame-counter pack/unpack, corrupted checksum, bad header, truncated frame, empty data, oversize data, unknown frame id).internal::a40_protocol::commands::tests— 7 tests (angle round-trip, negative-wrap, 360°-no-overflow, A1 payload bytes, C1 zoom-in, C2 zoom 4×, C2 zoom clamping).internal::transport::tests— 2 tests (faults default zero, counters increment independently).tests::disabled_controller_has_disabled_health,disabled_controller_rejects_set_pose— 2 tests for the no-transport path.
Total: 32 / 32 tests passing (cargo test -p gimbal_controller).
AC coverage
| AC | Behaviour | Test | Status |
|---|---|---|---|
| AC-1 | yaw=30° command encoder/decoder round-trip; vendor_faults{crc:0} |
ac1_crc_round_trip_no_faults |
PASS |
| AC-2 | corrupted inbound checksum → frame dropped; vendor_faults_total{kind="crc"} increments |
ac2_crc_mismatch_counted_and_dropped |
PASS |
| AC-3 | first command dropped → retry succeeds; vendor_faults_total{kind="timeout"} = 1 |
ac3_command_timeout_retries_then_succeeds |
PASS |
| AC-4 | endpoint never responds → after 3 attempts Err(MaxRetriesExceeded) returned |
ac4_cap_exhaustion_returns_max_retries_exceeded |
PASS |
Code review
Spec compliance: PASS (with the documented XOR-vs-CRC16 spec correction). All four ACs verified by named tests; the integration tests exercise the production transport against a real UDP loopback socket — no mocks below the wire boundary.
Architecture compliance: PASS.
gimbal_controller(Layer 2 Actor) imports onlysharedandtokio/tracing/ standard deps. No sibling Layer 2 imports.internal/a40_protocol/*matchesmodule-layout.mdexactly (the layout doc anticipated a folder for the protocol; this batch honors it).internal/transport.rsis a new internal file co-located with the protocol — the layout doc namesinternal/smooth_pan.rsandinternal/a40_protocol/*but doesn't yet listinternal/transport.rs. Recommended: addcrates/gimbal_controller/src/internal/transport.rsto thegimbal_controllerInternal bullet list inmodule-layout.mdduring the next document refresh. (Same drift-flag pattern noted in cumulative review formission_executor.)
SRP: PASS.
checksum.rs— pure XOR helper, no I/O.frame.rs— pure encode/decode, no I/O.commands.rs— pure typed payload builders, no I/O.transport.rs— owns UDP + retry policy + fault counters; everything async lives here.lib.rs— adapter from typedGimbalCommandtoA40Transportcalls.
Runtime completeness: PASS. Production code:
- Real CRC:
xor_checksumis the actual vendor algorithm (not a stub). - Real UDP socket:
tokio::net::UdpSocketin the transport (not a fake). - Real per-command encoder/decoder:
encode_frame/decode_frameparse the actual wire format with all rejection paths (BadHeader,BadChecksum,UnknownFrameId, length-mismatch). - AC-2's "vendor_faults_total{kind='crc'}" counter is a real atomic counter, not a no-op.
Test discipline: PASS. AAA pattern with // Arrange / Act / Assert comments. Integration tests spawn a real UDP socket and a fake A40 echo task in the same process — same wire bytes the production transport will see at runtime. No unsafe, no production unwrap/expect.
Security quick-scan: PASS. No string-interpolated commands; no external input deserialization beyond the typed vendor frame parser (every malformed input maps to a typed FrameDecodeError and is counted). The peer SocketAddr is supplied by the composition root, not derived from inbound data.
Performance scan: PASS.
- Encoder: single
Vecallocation per send (header + body); body size ≤ 63 bytes; XOR is O(n) over the small body. - Decoder: zero allocation except the
data: Vec<u8>clone (≤57 bytes). - Send path: one
Mutex<u8>lock per send for the counter — held microseconds. - Receive loop: stack buffer (128 bytes);
broadcast::sendis lock-free.
Cross-task consistency: N/A — single task in the batch.
Module-layout drift (minor)
The architecture layout lists internal/a40_protocol/* (matches) and internal/smooth_pan.rs (AZ-655). This batch additionally introduces internal/transport.rs which isn't yet enumerated. Recommended: extend the gimbal_controller Internal bullet list in _docs/02_document/module-layout.md at next document refresh.
Known limitations (warnings)
-
T1_F1_B1_D1ack semantics are coarse. Today every command awaits a genericT1_F1_B1_D1frame as ack. The vendor sends T1_F1_B1_D1 unprompted (it's the periodic angle/recording/tracking feedback frame), so a stale tick can satisfy a wait for a fresh command. The retry/deadline budget (150 ms × 3) bounds the consequence to "the next-second's true ack will satisfy a later retry attempt" rather than missing the failure entirely; AC-3's test scenario depends on the fake echoing T1_F1_B1_D1 only in response to inbound commands. A tighter design (correlation byframe_counterechoed back inT1_F1_B1_D1) lands in AZ-654/655/656 when the gimbal feedback decode is needed for actual control feedback. Documented intransport.rsdocstring. -
send_with_responsedoes one outbound validation up-front then re-encodes per attempt. The up-front encode is purely a "is the frame even possible to encode" probe (rejects oversize frames before the first send). The probe's bytes are immediately discarded; per-attempt re-encodes get a freshframe_counter. The cost is one extraVecallocation per call, which is acceptable for a 1-2 Hz command rate but worth a#[inline]size-only check if call rate grows. Documented in the function body. -
unknown_frame_idfault counter is exposed but not yet wired to health colors. Today onlycrcandtimeoutfaults flip health. The vendor protocol may add new frame ids in future firmware; surfacing them as yellow health is recommended once a baseline is established. Tracked as future work. -
gimbal-mockDocker service named intests/environment.mddoes not yet exist (e2e/mocks/gimbal-mock). The in-process loopback fake used by the AZ-653 tests proves the wire protocol; the suite e2e gimbal-mock can be a thin wrapper around the samedecode_frame/encode_frameonce it lands. Documented in the architecture compliance note above.
Auto-fix attempts during the batch
tokio::sync::watch::sendreturnsErrwhen no receivers are subscribed, which silently dropped astateupdate inzoom_via_transport_updates_zoom_state. Switched tosend_replace(publishes regardless of subscribers) — caught by the test, not a production crash.- Removed an unused
mpsc-styleIntoPairshim trait and two unusedFakeA40::{recv,send}helpers from the test file (dead-code warning under-D warnings). - Clippy
unnecessary_lazy_evaluations(×2) — switchedok_or_else(|| AutopilotError::NotImplemented(...))took_or(AutopilotError::NotImplemented(...))since the value is a string literal. - Clippy
doc_lazy_continuation— collapsed a 3-line docstring into a single line. - Removed an unused
use std::sync::Arcfromlib.rsafter refactoring.
Test reproduction
cargo build -p gimbal_controller --tests
cargo test -p gimbal_controller # 32 tests; 0 failed
cargo clippy -p gimbal_controller --tests -- -D warnings
cargo test --workspace # all green
Research provenance
The ViewPro A40 vendor protocol is documented externally:
- ArduPilot
libraries/AP_Mount/AP_Mount_Viewpro.h— canonical open-source reference (master branch). Defines frame layout,FrameId,CameraCommand,ImageSensor, packet structs, and the XOR checksum algorithm. This is the source for every constant ininternal/a40_protocol/. - ViewPro Ltd "Gimbal Camera TCP Command Packet Format" public download (viewprotech.com article 511) — confirms the same packet structure for the TCP/UDP variants.
- ViewPro A40 Pro spec sheet (viewprouav.com
A40-pro-Spec.pdf) — confirms UDP as a supported control channel.
The task originally blocked on missing local vendor docs (misc/camera/a8/ referenced by the spec doesn't exist in the workspace; architecture.md §7.7 only covers the MAVLink command surface). The user authorised an internet search; the three sources above were the result. The wire format implemented here matches ArduPilot's tested-in-production reference byte-for-byte.
Candidates for batch 11
- AZ-657
frame_ingest_rtsp_session— 3 pts. Deps only on AZ-640. Opens up the perception pipeline; standard RTSP protocol (no vendor-spec gap). - AZ-682
scan_controller_state_machine— 5 pts. DepsAZ-640, AZ-649(both done). Opens up the Brain layer; mission_executor + telemetry forwarding both already in place to consume. - AZ-654
gimbal_zoom_out_sweep— 3 pts. Now unblocked (deps on AZ-653 satisfied by this batch). Natural follow-on within the same component.
Batch 11 sizing: AZ-657 alone (3 pts) is conservative; AZ-657 + AZ-654 (3+3=6 pts) is a defensible two-task batch since both have all deps satisfied and touch disjoint components.