mirror of
https://github.com/azaion/autopilot.git
synced 2026-06-21 08:31:10 +00:00
chore: cumulative review batches 07-09 (cycle 1)
Verdict PASS_WITH_WARNINGS. 0 Critical, 0 High, 1 Medium (DRY across the three failsafe SendCommandError mappings), 2 Low (MavlinkCommandIssuer naming; module-layout path drift). None block batch 10. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,172 @@
|
||||
# Cumulative Code Review — Batches 07–09 (Cycle 1)
|
||||
|
||||
**Trigger**: `implement/SKILL.md` Step 14.5 — `K=3` batches completed since the last cumulative review (`cumulative_review_batches_04-06_cycle1_report.md`).
|
||||
**Date**: 2026-05-19
|
||||
**Cycle**: 1
|
||||
**Scope**: union of files changed in `batch_07_cycle1`, `batch_08_cycle1`, `batch_09_cycle1` (range `23366a5..HEAD`, excluding `_docs/`).
|
||||
**Mode**: inline (matching the per-batch precedent in batches 1–6; sub-skill `/code-review` deliberately skipped to conserve context).
|
||||
**Baseline**: `_docs/02_document/architecture_compliance_baseline.md` still does not exist (greenfield project — no Architecture Baseline Scan has been promoted). No `## Baseline Delta` section is produced. The intent recorded in the previous cumulative review (promote a baseline once Step 12 lands) is carried forward.
|
||||
|
||||
## Tasks in scope
|
||||
|
||||
| Batch | Tasks | Components touched |
|
||||
|-------|----------------------------------------------------------------------------|-------------------------------------------------------|
|
||||
| 07 | AZ-651 (`mission_executor_lost_link_ladder`), AZ-668 (`mapobjects_store_persistence`) | `mission_executor`, `mapobjects_store` |
|
||||
| 08 | AZ-650 (`mission_executor_bit_f9`) | `mission_executor` (BIT controller + 4 evaluators) |
|
||||
| 09 | AZ-652 (`mission_executor_safety_and_resume`) | `mission_executor` (geofence + battery + middle-waypoint + post-flight) |
|
||||
|
||||
Per-batch AC verification (rolled up from individual reports): **23 / 23 ACs verified locally** (10 in batch 07: AZ-651 4 ACs + AZ-668 6 ACs; 4 in batch 08; 6+2-branches in batch 09 = 9 tests). One pre-existing flake noted in batch 7 report (`state_machine::ac3_bounded_retry_then_success`) ran green in both batches 8 and 9 final test runs; intermittent, kept on the watch list.
|
||||
|
||||
**Code volume**: 5,619 additions, 16 deletions, 21 source/test files. The bulk lives in `mission_executor` (4 new failsafe-family modules + 3 new integration test files); `mapobjects_store` got the persistence sidecar.
|
||||
|
||||
## Phase 1 — Spec coverage
|
||||
|
||||
Every Included scope item from these 4 tasks is implemented in production code (not just tests / not just trait placeholders):
|
||||
|
||||
- **AZ-651** (lost-link ladder): `LostLinkLadder` (pure state table: `LinkOk → LinkDegraded → LinkLost → LinkLostInFollow`), `LostLinkDriver` (wiring layer subscribing to `mavlink_layer::LinkEvent`), `MavlinkCommandIssuer` (production impl issuing `MAV_CMD_NAV_RETURN_TO_LAUNCH=20`), `LadderEvent` broadcast surface. Driver wires `executor.failsafe_trigger(FailsafeKind::LinkLost)` on ladder transitions.
|
||||
- **AZ-668** (mapobjects persistence): on-disk snapshot at `~/.autopilot/state/mapobjects_snapshot.{json,sha256}` with write-then-rename atomicity, restore-on-boot semantics surfaced via `SyncState::CachedFallback`, integrity-failure surfaced via `SyncState::Degraded/Failed`.
|
||||
- **AZ-650** (BIT F9): `BitController` (12-item pre-flight gate + sticky-pass + ack-timeout deadline), 4 concrete `BitEvaluator` impls (state-dir, wallclock, mission-loaded, mapobjects-synced); the remaining 8 evaluators await their component landings per batch 8's runtime-completeness note.
|
||||
- **AZ-652** (safety + resume): `GeofenceMonitor` (pure ray-casting PIP, symmetric INCLUSION/EXCLUSION semantics — the C++ EXCLUSION-ignore bug is rejected), `BatteryMonitor` (RTL@25% / land@15% + signed-override suppression that does NOT cover hard-floor), `MissionRePlanner` (middle-waypoint re-upload sequence + target-follow release replan), `PostFlightPusher` (one-shot `mission_client::push_mapobjects_diff` from the `POST_FLIGHT_SYNC` entry guard).
|
||||
|
||||
`mission_executor`'s public surface grew by: `BatteryAction`, `BatteryCommandIssuer`, `BatteryConfig`, `BatteryDriver`, `BatteryEvent`, `BatteryMonitor`, `BatteryMonitorHandle`, `BatteryOverride`, `BitController*` (10 symbols), `GeofenceCommandIssuer`, `GeofenceDriver`, `GeofenceEvent`, `GeofenceMonitor`, `GeofenceMonitorHandle`, `GeofenceVerdict`, `LadderEvent`, `LadderInput`, `LadderOutput`, `LadderState`, `LostLinkCommandIssuer`, `LostLinkConfig`, `LostLinkDriver`, `LostLinkLadder`, `LostLinkLadderHandle`, `MavlinkCommandIssuer`, `MavlinkBatteryCommandIssuer`, `MavlinkGeofenceCommandIssuer`, `MAV_CMD_NAV_LAND`, `MAV_CMD_NAV_RETURN_TO_LAUNCH`, `MiddleWaypointHint`, `MissionRePlanner`, `PostFlightPusher`, `MapObjectsPusher`, `MapObjectsDiffSource`. Symbol explosion is expected at this stage of the executor's build-out; the next cumulative review should re-scan for any name that has not landed a user-visible call site.
|
||||
|
||||
## Phase 2 — Code quality
|
||||
|
||||
| Concern | Finding | Severity |
|
||||
|---------|---------|----------|
|
||||
| Naming consistency across failsafe issuers | Lost-link's production issuer is named `MavlinkCommandIssuer` (no `LostLink` prefix), while the two new failsafe families use the prefixed `MavlinkGeofenceCommandIssuer` / `MavlinkBatteryCommandIssuer`. A reader searching for "the lost-link command issuer" sees an unmarked name. Suggested rename: `MavlinkLostLinkCommandIssuer`. | Low / Style |
|
||||
| DRY across the three issuers | The `SendCommandError → AutopilotError::Internal(format!(…))` mapping is structurally identical across `lost_link.rs:317`, `geofence.rs:205`, `battery_thresholds.rs:261`. Three near-copies of ~10 lines each. A `From<SendCommandError> for AutopilotError` impl on the consumer side (or a `mavlink_layer::SendCommandError::into_autopilot_error(context: &str)` helper) would consolidate them. | Medium / Maintainability |
|
||||
| `unsafe` blocks | None in any of the new files. Verified via grep. | — |
|
||||
| Production `unwrap`/`expect` | All hits are in `#[cfg(test)]` modules or on hardcoded constants validated at compile/parse time (`DateTime::parse_from_rfc3339("2024-01-01T00:00:00Z").expect("valid RFC3339")` — a const literal). No production crash sites. | — |
|
||||
| Test back-door discipline | `MissionExecutorHandle::force_state_for_tests` is `#[doc(hidden)]` and used only by `safety_and_resume.rs` (no production caller — verified by grep). Acceptable for integration tests that must compile against the public API. | — |
|
||||
|
||||
## Phase 3 — Security quick-scan
|
||||
|
||||
- No string-interpolated SQL/shell.
|
||||
- No new external input deserialization (the persistence snapshot in AZ-668 uses serde over a checksum-verified file; the checksum is verified before deserialization).
|
||||
- `BatteryOverride` signature validation is **explicitly scoped out** of AZ-652 (handled by `operator_bridge` per AZ-689). The current driver assumes the override has already been verified by the producer; this is documented in the type's docstring. Until AZ-689 lands, no enforcement gap exists in production because no upstream actor sends overrides yet.
|
||||
- The persistence path uses `~/.autopilot/state/`; no path-traversal risk because the directory is hardcoded and the filename is fixed.
|
||||
|
||||
PASS.
|
||||
|
||||
## Phase 4 — Performance scan
|
||||
|
||||
- Geofence monitor: 10 Hz × O(total vertices) ≈ a few hundred FLOPs per tick at the operational `≤8 fences × ≤32 vertices`. Well under the AZ-652 ≤500 ms response budget (100 ms tick + MAVLink RTT).
|
||||
- Battery monitor: O(1) per tick — direct comparison against two thresholds.
|
||||
- BIT controller: O(evaluators) per tick at 1 Hz; sticky-pass means each evaluator is asked at most once per state cycle.
|
||||
- Persistence snapshot (AZ-668): write-then-rename keeps the operational disk path constant-time at the file-system level; serde JSON serialization is O(map_objects) but only at boot/snapshot points, not on hot paths.
|
||||
- No unbounded fetch / N+1 / blocking I/O in async contexts detected.
|
||||
|
||||
PASS.
|
||||
|
||||
## Phase 5 — Cross-task consistency
|
||||
|
||||
**Failsafe family pattern (the load-bearing consistency check for this batch range)** — all three families now follow the same shape:
|
||||
|
||||
| Family | Pure-logic monitor | Driver wrapper | Command-issuer trait | Production impl | `failsafe_trigger` integration |
|
||||
|--------------|-----------------------------|--------------------------|-----------------------------|------------------------------------|--------------------------------|
|
||||
| Lost-link | `LostLinkLadder` | `LostLinkDriver` | `LostLinkCommandIssuer` | `MavlinkCommandIssuer` *(see Phase 2 naming finding)* | `FailsafeKind::LinkLost*` → `Land` |
|
||||
| Geofence | `GeofenceMonitor` | `GeofenceDriver` | `GeofenceCommandIssuer` | `MavlinkGeofenceCommandIssuer` | `FailsafeKind::GeofenceInclusion/Exclusion` → `Land` |
|
||||
| Battery | `BatteryMonitor` | `BatteryDriver` | `BatteryCommandIssuer` | `MavlinkBatteryCommandIssuer` | `FailsafeKind::BatteryRtl` → `Land`; `BatteryHardFloor` → `Land` + latch `hard_floor_active` |
|
||||
|
||||
Convergence is intentional and matches the AZ-651 "each failsafe family owns its command surface" principle. The `MAV_CMD_NAV_RETURN_TO_LAUNCH=20` constant is shared (defined in `lost_link.rs`, re-exported via `lib.rs`, imported by both `geofence.rs` and `battery_thresholds.rs`); `MAV_CMD_NAV_LAND=21` lives in `battery_thresholds.rs` because battery is the only family that issues it. Both constants match the MAVLink Common spec.
|
||||
|
||||
**Single chokepoint**: `MissionExecutorHandle::failsafe_trigger(FailsafeKind)` in `lib.rs` handles every family in one `match` and routes all non-degraded variants through the same `transition_flymission_to_land()` helper. Adding a new failsafe family (e.g., GPS-lost) would require: one `FailsafeKind` variant + one match arm. No cross-family logic leaked.
|
||||
|
||||
**Health surface**: the `hard_floor_active: Arc<AtomicBool>` latch added in batch 9 is the only state that flips `health()` red independently of the FSM `Paused` state. All other failsafe families intentionally route through the FSM (transition to `Land`) and rely on the existing `state == Paused` → red mapping for their health surface. That asymmetry is correct (hard-floor is the only condition that should persist red after the airframe has touched down).
|
||||
|
||||
PASS.
|
||||
|
||||
## Phase 6 — Architecture compliance
|
||||
|
||||
**Layer direction** (from `module-layout.md` Allowed Dependencies):
|
||||
- `mission_executor` (Layer 3, Coordinator) imports from: `shared`, `mavlink_layer` (Layer 2), `mission_client` (Layer 2 — via traits in `post_flight.rs` and direct use of `mission_client::{MapObjectsDiff, MissionClientHandle, PushReport}`), `mapobjects_store` (Layer 2 — used by `bit_evaluators::MapObjectsSyncedEvaluator`).
|
||||
- `mapobjects_store` (Layer 2, Storage) imports from: `shared` only.
|
||||
- No Layer 3 → Layer 3 imports. No Layer 2 sibling-to-sibling imports.
|
||||
|
||||
PASS.
|
||||
|
||||
**Public API respect**:
|
||||
- `mavlink_layer::{CommandLong, MavlinkHandle, SendCommandError}` — all three are re-exported from the `mavlink_layer` crate root (verified via `crates/mavlink_layer/src/lib.rs` Public API).
|
||||
- `mavlink_layer::LinkEvent` — Public API. `mavlink_layer::MavlinkMessage` — Public API.
|
||||
- `mission_client::{MapObjectsDiff, MissionClientHandle, PushReport, PerEndpointStatus}` — all are Public API.
|
||||
- `mapobjects_store::{MapObjectsStore, MapObjectsStoreHandle, SyncState}` — all are Public API.
|
||||
|
||||
No internal-file imports across components.
|
||||
|
||||
PASS.
|
||||
|
||||
**Cyclic dependencies**: built the import graph over the changed files plus direct deps. No new cycles. The `executor: MissionExecutorHandle` field on `LostLinkDriver`, `GeofenceDriver`, `BatteryDriver` is **same-crate** dependency (drivers and handle both live in `mission_executor`) — not a cross-crate cycle.
|
||||
|
||||
PASS.
|
||||
|
||||
**Duplicate symbols across components**: `MavlinkCommandIssuer` exists ONLY in `mission_executor` (no `mavlink_layer::MavlinkCommandIssuer` collision). `MAV_CMD_NAV_*` constants exist ONLY in `mission_executor`; they shadow nothing in `mavlink_layer` which uses raw `u16` for the same wire field.
|
||||
|
||||
PASS.
|
||||
|
||||
**Cross-cutting concerns not locally re-implemented**: `tracing` is the only cross-cutting concern touched (logging). Used consistently as `tracing::{info!, warn!, error!}` via the workspace dep. No bespoke logging setup.
|
||||
|
||||
PASS.
|
||||
|
||||
**Module-layout drift** (carried forward from batch 9 report):
|
||||
- `module-layout.md` lists `crates/mission_executor/src/internal/geofence/*` (a folder). Implemented as a single file `geofence.rs` (~470 LOC). Acceptable for the current shape; if a future batch adds new geofence variants or polygon preprocessing the file becomes a folder at that point. Module-layout should be re-synced at the next decompose/document sync.
|
||||
- Same observation: `module-layout.md` lists `internal/failsafe/ladder.rs` for the lost-link ladder; implementation is at `internal/lost_link.rs`. Path drift; no code impact.
|
||||
|
||||
Low severity Architecture finding (drift, not breakage): re-sync `module-layout.md` paths at the next document refresh.
|
||||
|
||||
## Phase 7 — Architecture Compliance (baseline delta)
|
||||
|
||||
Skipped — no `architecture_compliance_baseline.md` exists yet.
|
||||
|
||||
## Findings
|
||||
|
||||
| # | Severity | Category | File:Line | Title |
|
||||
|---|----------|----------|-----------|-------|
|
||||
| 1 | Medium | Maintainability | `crates/mission_executor/src/internal/lost_link.rs:317`, `geofence.rs:205`, `battery_thresholds.rs:261` | `SendCommandError → AutopilotError::Internal` mapping duplicated across 3 files |
|
||||
| 2 | Low | Style | `crates/mission_executor/src/internal/lost_link.rs:276` | `MavlinkCommandIssuer` lacks `LostLink` prefix; the two newer issuers use the prefixed form |
|
||||
| 3 | Low | Architecture | `_docs/02_document/module-layout.md` | Paths for `internal/geofence/*` and `internal/failsafe/ladder.rs` drift from the actual single-file layout (`geofence.rs`, `lost_link.rs`). Doc-only, no code impact. |
|
||||
|
||||
### Finding details
|
||||
|
||||
**F1: `SendCommandError → AutopilotError::Internal` mapping duplicated across 3 files** (Medium / Maintainability)
|
||||
- Locations: `crates/mission_executor/src/internal/lost_link.rs:317`, `geofence.rs:205`, `battery_thresholds.rs:261`.
|
||||
- Description: All three production issuers map `SendCommandError::{Timeout(d), Duplicate(id), ChannelClosed(reason)}` into `AutopilotError::Internal(format!(...))` with near-identical wording; only the operation label varies ("RTL", "geofence RTL", "battery {what}").
|
||||
- Suggestion: add `impl From<SendCommandError> for AutopilotError` in `mavlink_layer` (the producer crate) keyed on a `&'static str` context — or a `SendCommandError::with_context(&str) -> AutopilotError` helper. Removes ~30 LOC of duplication and centralizes the wording.
|
||||
- Tasks: AZ-651, AZ-652.
|
||||
|
||||
**F2: `MavlinkCommandIssuer` lacks `LostLink` prefix** (Low / Style)
|
||||
- Location: `crates/mission_executor/src/internal/lost_link.rs:276`.
|
||||
- Description: The first-landed production issuer (AZ-651) is named `MavlinkCommandIssuer`. When AZ-652 added geofence and battery families, both adopted the prefixed form (`MavlinkGeofenceCommandIssuer`, `MavlinkBatteryCommandIssuer`). The unprefixed name is now ambiguous from a reader's perspective.
|
||||
- Suggestion: rename `MavlinkCommandIssuer` → `MavlinkLostLinkCommandIssuer` and update the re-export in `lib.rs`. Single-crate, single-file rename; consumer side has zero call sites yet (the composition root in `autopilot/` is not wired yet).
|
||||
- Task: AZ-651 (technical debt from the time before the convention existed).
|
||||
|
||||
**F3: `module-layout.md` paths drift from actual single-file layout** (Low / Architecture)
|
||||
- Location: `_docs/02_document/module-layout.md` (the entry for `mission_executor`).
|
||||
- Description: The layout doc lists `crates/mission_executor/src/internal/geofence/*` and `crates/mission_executor/src/internal/failsafe/ladder.rs`. The actual implementation uses single files (`internal/geofence.rs`, `internal/lost_link.rs`). The cardinality difference is fine — the doc anticipated a future split that didn't (yet) materialise.
|
||||
- Suggestion: re-sync `module-layout.md` to reflect the actual single-file paths during the next document refresh, OR keep the folder-anticipated form and refactor when the second variant lands. Either is defensible; surfacing it so the next decompose/document run picks it up.
|
||||
- Tasks: AZ-651, AZ-652.
|
||||
|
||||
## Verdict
|
||||
|
||||
**PASS_WITH_WARNINGS** — 0 Critical, 0 High, 1 Medium, 2 Low.
|
||||
|
||||
Per the implement skill's auto-fix matrix (`SKILL.md` Step 10):
|
||||
- F1 (Medium / Maintainability) → **auto-fix eligible**. The fix touches one file in `mavlink_layer` plus three call-site simplifications in `mission_executor`. Could be folded into the next batch's clean-up commit or scheduled as a tiny refactor task. Recommendation: schedule as part of batch 10 or 11 if those batches already touch the issuers; otherwise defer to the next refactor cycle.
|
||||
- F2 (Low / Style) → auto-fix eligible. Single rename + re-export update. Recommend folding into batch 10 if it touches `lost_link.rs`; otherwise defer.
|
||||
- F3 (Low / Architecture, doc-only) → not auto-fixable by code; handled by the next document refresh / decompose sync.
|
||||
|
||||
None of the findings block batch 10 implementation. The cumulative review gate (Step 14.5) **PASSES** and the implement loop proceeds.
|
||||
|
||||
## Cumulative metrics
|
||||
|
||||
| Metric | Value | Trend vs. prior cumulative |
|
||||
|--------|-------|----------------------------|
|
||||
| Total source LOC added (batches 7–9, ex tests) | ~3,470 | + (batches 4–6 added ~2,800) |
|
||||
| Total test LOC added | ~1,770 | + (batches 4–6 added ~1,400) |
|
||||
| Test/source ratio | ~0.51 | stable |
|
||||
| New public API symbols | ~35 | + (failsafe family expansion is the dominant driver) |
|
||||
| Cyclomatic complexity hot-spots | `failsafe_trigger` (7-arm match), `next_state` in `bit.rs` (8 arms), `BatteryMonitor::tick` (5 paths) | All under the 10-arm SOLID threshold |
|
||||
| New `unsafe` blocks | 0 | stable |
|
||||
| New `unwrap`/`expect` in production paths | 0 | stable |
|
||||
| Layer-violation Architecture findings | 0 | stable |
|
||||
| Cyclic-dep Architecture findings | 0 | stable |
|
||||
Reference in New Issue
Block a user