Oleksandr Bezdieniezhnykh 9fe0bbeac9
[AZ-683] scan_controller POI queue + 5/min cap + decision window
Adds the prioritized POI queue on top of the AZ-682 FSM substrate:
priority = confidence x proximity x age_factor; rolling 60s window
caps surfaces at 5; confidence-scaled decision window (40% -> 30s,
100% -> 120s, linear; <40% never surfaces); tick() runs the timeout
sweep and silently forgets expired POIs (no IgnoredItem per spec);
DeclinePoi via operator command returns a DeclineAction for AZ-685
to persist.

ScanControllerHandle gains submit_poi_candidate /
next_poi_for_surface / decline_poi / poi_queue_len /
pois_in_window. submit_operator_cmd return type widens from
Result<()> to Result<SubmitOutcome>. ScanMetrics and health()
surface queue depth and counters.

Tests: 26 unit + 11 integration in scan_controller (all AC1..AC5 +
DeclinePoi end-to-end). Workspace clippy on scan_controller clean.
Pre-existing autopilot::Runtime::vlm_provider_name dead-code error
from batch 4 still open (see cumulative C5).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-20 09:04:29 +03:00


Batch 13 / Cycle 1 — Implementation Report

Date: 2026-05-20
Tasks: AZ-683
Verdict: PASS_WITH_WARNINGS (pre-existing autopilot lint from batch 4 still open, see Findings §A1; unchanged by this batch)

1. Scope

| Ticket | Title | Crate | Complexity |
|---|---|---|---|
| AZ-683 | scan_controller POI queue + ≤5/min cap + decision-window mapping | scan_controller | 5 |

Batch 13 ships AZ-683 as a stand-alone unit. AZ-684 (evidence ladder) was considered for the same batch but pulled because its dependencies (AZ-660 detections wire, AZ-671 VLM provider runtime) have not yet landed; co-batching it would have created an artificial blocker. The POI queue is fully self-contained on top of the AZ-682 FSM substrate, so shipping it alone keeps the batch unblocked and the review tractable.

2. Approach

Per 02_tasks/done/AZ-683_scan_controller_poi_queue_and_window.md, the deliverable is the prioritized POI queue, the rolling 5/min surface cap, the confidence-scaled decision window, and the semantic split between timeout and decline. The evidence-ladder gate (AZ-684) and the mapobjects-store IgnoredItem persistence (AZ-685) are intentionally not in this batch: the queue surfaces POIs in priority order and returns dispatchable actions, but the actual gimbal slew (scan_controller issuing an ROI) and the IgnoredItem write live in their own tickets. The split is enforced by:

  • next_poi_for_surface returns the Poi once the cap allows it and the confidence is ≥ 40 % — but does not itself drive the gimbal or change FSM state; AZ-684 will plumb that.
  • decline_poi returns a DeclineAction { poi_id, mgrs, class_group, declined_at, source_detection_ids } — the caller (AZ-685 mapobjects-store dispatch) is responsible for the actual IgnoredItem persist. This keeps the queue free of mapobjects_store I/O.
  • tick()'s timeout sweep silently forgets expired POIs. No IgnoredItem is emitted for a timeout per spec §3 — only a positive operator decline creates an IgnoredItem.
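The timeout-versus-decline split above can be sketched in a few lines. This is a minimal illustration and not the shipped PoiQueue: SketchQueue, the u64 IDs (the report uses Uuid), the nanosecond deadlines, and the trimmed two-field DeclineAction are all assumptions made to keep the example dependency-free.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down stand-in for the report's DeclineAction; the
/// real struct also carries mgrs, class_group and source_detection_ids.
#[derive(Debug, Clone, PartialEq)]
pub struct DeclineAction {
    pub poi_id: u64,
    pub declined_at_ns: u64,
}

/// Illustrative queue that tracks only a decision deadline per POI.
#[derive(Default)]
pub struct SketchQueue {
    deadlines_ns: HashMap<u64, u64>,
}

impl SketchQueue {
    pub fn insert(&mut self, poi_id: u64, deadline_ns: u64) {
        self.deadlines_ns.insert(poi_id, deadline_ns);
    }

    /// Operator decline: remove the entry and hand back an action for the
    /// caller to persist. The queue itself performs no store I/O.
    pub fn decline(&mut self, poi_id: u64, now_ns: u64) -> Option<DeclineAction> {
        self.deadlines_ns.remove(&poi_id).map(|_| DeclineAction {
            poi_id,
            declined_at_ns: now_ns,
        })
    }

    /// Timeout sweep: drop expired entries and return only their IDs for
    /// metric accounting. No DeclineAction is produced on this path.
    pub fn timeout_sweep(&mut self, now_ns: u64) -> Vec<u64> {
        let mut expired = Vec::new();
        self.deadlines_ns.retain(|&id, deadline| {
            let keep = *deadline > now_ns;
            if !keep {
                expired.push(id);
            }
            keep
        });
        expired.sort_unstable();
        expired
    }

    pub fn len(&self) -> usize {
        self.deadlines_ns.len()
    }
}
```

The two removal paths differ only in their return type, which is what lets the caller tell "persist an IgnoredItem" apart from "bump a counter and move on".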

Component pieces shipped

  • internal/poi_queue/priority.rs — pure functions:
    • decision_window(confidence) -> Option<Duration> — linear 40 % → 30 s, 100 % → 120 s, None below floor.
    • age_factor(age_seconds) -> f32 — linear decay 1.0 → 0.1 over 300 s, clamped.
    • priority_score(confidence, proximity, age_seconds) -> f32 — c × p × age_factor.
  • internal/poi_queue/mod.rs — PoiQueue actor-private struct:
    • insert(poi, proximity, now_ns) — enqueues with stamped enqueued_at_ns.
    • next_for_surface(now_ns) -> Option<Poi> — picks the highest priority entry that clears the confidence floor and the rolling cap, removes it from the queue, records a surface timestamp.
    • decline(poi_id) -> Option<DeclineAction> — removes entry, returns the IgnoredItem payload data.
    • timeout_sweep(now_wallclock) -> Vec<Uuid> — drops expired entries, returns the removed IDs for metric accounting.
    • surfaces_in_window(now_ns) -> usize — number of POIs surfaced in the rolling 60 s window after trimming.
    • SURFACE_CAP_PER_WINDOW = 5.
  • crates/scan_controller/src/lib.rs — wiring:
    • Inner now owns poi_queue: PoiQueue and counters pois_surfaced_total, pois_forgotten_total, pois_declined_total.
    • ScanControllerHandle::submit_poi_candidate, next_poi_for_surface, decline_poi, poi_queue_len, pois_in_window — public async surface.
    • ScanControllerHandle::tick now also runs the timeout sweep.
    • ScanControllerHandle::submit_operator_cmd now handles DeclinePoi end-to-end — payload { poi_id } is parsed, decline_poi is called, and the result is returned as SubmitOutcome::Declined(DeclineAction) for the caller. The method's return type changed from Result<()> to Result<SubmitOutcome>.
    • ScanMetrics gained four POI fields: poi_queue_len, pois_surfaced_total, pois_forgotten_total, pois_declined_total.
    • health() detail now includes poi_queue=<len>.
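The pure priority functions listed above follow directly from the stated endpoints. The sketch below is one possible reading, not the contents of priority.rs: the constant names, the f32 parameter types, the treatment of NaN, and clamping age_factor at both ends are assumptions.

```rust
use std::time::Duration;

// Endpoints taken from the report; the constant names are hypothetical.
const CONFIDENCE_FLOOR: f32 = 0.40;
const WINDOW_AT_FLOOR_S: f32 = 30.0;
const WINDOW_AT_FULL_S: f32 = 120.0;
const AGE_DECAY_SPAN_S: f32 = 300.0;
const AGE_FACTOR_MIN: f32 = 0.1;

/// Linear map from confidence to decision window: 0.40 -> 30 s, 1.00 -> 120 s.
/// Returns None below the floor (and, as an assumption, for NaN).
pub fn decision_window(confidence: f32) -> Option<Duration> {
    if confidence.is_nan() || confidence < CONFIDENCE_FLOOR {
        return None;
    }
    let c = confidence.min(1.0);
    let t = (c - CONFIDENCE_FLOOR) / (1.0 - CONFIDENCE_FLOOR);
    let secs = WINDOW_AT_FLOOR_S + t * (WINDOW_AT_FULL_S - WINDOW_AT_FLOOR_S);
    Some(Duration::from_secs_f32(secs))
}

/// Linear decay from 1.0 at age 0 to 0.1 at 300 s, clamped outside that span.
pub fn age_factor(age_seconds: f32) -> f32 {
    let t = (age_seconds / AGE_DECAY_SPAN_S).clamp(0.0, 1.0);
    1.0 - t * (1.0 - AGE_FACTOR_MIN)
}

/// Product of confidence, proximity and the age factor.
pub fn priority_score(confidence: f32, proximity: f32, age_seconds: f32) -> f32 {
    confidence * proximity * age_factor(age_seconds)
}
```

Under this reading the slope of the window is 90 s over 0.6 of confidence, so a 70 % detection gets 30 + 0.5 × 90 = 75 s, and a POI that has waited 150 s has its score scaled by 0.55.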

3. Files touched

AZ-683

  • crates/scan_controller/Cargo.toml — added serde_json (for operator-command payload parsing) and chrono (for wallclock deadlines).
  • crates/scan_controller/src/lib.rs — wired POI queue into Inner, added submit_poi_candidate / next_poi_for_surface / decline_poi / poi_queue_len / pois_in_window, changed submit_operator_cmd return type and added DeclinePoi handling, extended ScanMetrics and health().
  • crates/scan_controller/src/internal/mod.rs — added pub mod poi_queue.
  • crates/scan_controller/src/internal/poi_queue/mod.rs — new (PoiQueue, DeclineAction, SURFACE_CAP_PER_WINDOW, 5 unit tests).
  • crates/scan_controller/src/internal/poi_queue/priority.rs — new (pure priority math + 8 unit tests).
  • crates/scan_controller/tests/poi_queue.rs — new (6 integration tests covering AC-1..AC-5 + DeclinePoi via operator command).

4. Test results

| Crate | Unit | Integration | Total |
|---|---|---|---|
| scan_controller | 26 | 11 (5 state_machine + 6 poi_queue) | 37 |

Workspace cargo test --workspace: all suites green. The single ignored test, mission_executor::state_machine::ac3_bounded_retry_then_success, carries over from batch 8 and is unchanged by this batch.

Clippy: cargo clippy -p scan_controller --all-targets -- -D warnings is clean. Workspace-wide clippy still hits the pre-existing autopilot::Runtime::vlm_provider_name dead-code error from batch 4 (see Findings §A1 / cumulative C5).

Acceptance criteria

| AC | Source | Test |
|---|---|---|
| AC-1 | priority ordering | tests/poi_queue.rs::ac1_priority_ordering_via_handle + internal/poi_queue/mod.rs::orders_by_priority_score |
| AC-2 | ≤5/min rolling cap | tests/poi_queue.rs::ac2_five_per_minute_cap_via_handle + internal/poi_queue/mod.rs::cap_blocks_after_five_surfaces |
| AC-3 | decision-window mapping | tests/poi_queue.rs::ac3_decision_window_public_mapping + internal/poi_queue/priority.rs::decision_window_* |
| AC-4 | confidence floor (no surface < 40 %) | tests/poi_queue.rs::ac4_below_floor_never_surfaces + internal/poi_queue/priority.rs::decision_window_below_floor |
| AC-5 | timeout sweep: silently forget | tests/poi_queue.rs::ac5_tick_sweep_forgets_expired_pois + internal/poi_queue/mod.rs::timeout_sweep_* |
| | Decline → IgnoredItem action | tests/poi_queue.rs::decline_poi_via_operator_command_emits_action |
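For readers unfamiliar with the rolling cap that AC-2 exercises, a minimal sketch follows. It assumes surface timestamps are kept in a deque and trimmed on each query; SurfaceCap and try_surface are hypothetical names, and treating a surface exactly 60 s old as already outside the window is an assumption the report does not settle.

```rust
use std::collections::VecDeque;

pub const SURFACE_CAP_PER_WINDOW: usize = 5;
const WINDOW_NS: u64 = 60 * 1_000_000_000;

/// Hypothetical helper that tracks when POIs were surfaced.
#[derive(Default)]
pub struct SurfaceCap {
    surfaced_at_ns: VecDeque<u64>,
}

impl SurfaceCap {
    /// Trims timestamps that have left the 60 s window, then counts the rest.
    pub fn surfaces_in_window(&mut self, now_ns: u64) -> usize {
        while let Some(&oldest) = self.surfaced_at_ns.front() {
            if now_ns.saturating_sub(oldest) >= WINDOW_NS {
                self.surfaced_at_ns.pop_front();
            } else {
                break;
            }
        }
        self.surfaced_at_ns.len()
    }

    /// Records a surface and returns true if the cap allows it;
    /// returns false and records nothing otherwise.
    pub fn try_surface(&mut self, now_ns: u64) -> bool {
        if self.surfaces_in_window(now_ns) >= SURFACE_CAP_PER_WINDOW {
            return false;
        }
        self.surfaced_at_ns.push_back(now_ns);
        true
    }
}
```

Because timestamps are appended in time order, trimming only ever needs to look at the front of the deque, which keeps each query cheap regardless of queue depth.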

5. Findings (this batch)

A1. Pre-existing dead-code error in autopilot::Runtime::vlm_provider_name

Severity: High (still blocks the workspace -D warnings clippy gate)
Category: Maintenance
Origin: Batch 4. Unchanged by this batch.

Tracked in _docs/_process_leftovers/2026-05-20_autopilot_clippy.md. Carried as cumulative finding C5 — see §6.

A2. submit_operator_cmd return type changed

Severity: Low (API)
Detail: The return type went from Result<()> to Result<SubmitOutcome> so that DeclinePoi can hand back the DeclineAction for AZ-685 to dispatch. No external caller exists yet (operator-bridge wiring is AZ-685), so this is not a breaking change in practice. The existing internal call sites (the tests/state_machine.rs suite from batch 12) call submit_operator_cmd only for MissionAbort / ReleaseTargetFollow and only via the public handle. Both commands now return SubmitOutcome::Accepted, and those tests discard the returned value with an .unwrap()-style call, so they continue to pass unchanged.
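A rough sketch of the widened return type is shown here. Only SubmitOutcome::Accepted, SubmitOutcome::Declined(DeclineAction), and the three command names come from the report; the OperatorCmd enum, the synchronous free function, the Vec<u64> queue, the String error, and returning Err for an unknown poi_id are simplifying assumptions (the real method is async on ScanControllerHandle and parses a { poi_id } payload).

```rust
/// Trimmed, hypothetical stand-in for the report's DeclineAction.
#[derive(Debug, Clone, PartialEq)]
pub struct DeclineAction {
    pub poi_id: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub enum SubmitOutcome {
    Accepted,
    Declined(DeclineAction),
}

/// Hypothetical command enum; the real payload shape is not shown in the report.
#[derive(Debug, Clone, PartialEq)]
pub enum OperatorCmd {
    MissionAbort,
    ReleaseTargetFollow,
    DeclinePoi { poi_id: u64 },
}

/// Synchronous sketch of the dispatch; `queued` stands in for the POI queue.
pub fn submit_operator_cmd(
    queued: &mut Vec<u64>,
    cmd: OperatorCmd,
) -> Result<SubmitOutcome, String> {
    match cmd {
        OperatorCmd::MissionAbort | OperatorCmd::ReleaseTargetFollow => {
            Ok(SubmitOutcome::Accepted)
        }
        OperatorCmd::DeclinePoi { poi_id } => {
            // Assumption: declining an unknown POI is reported as an error.
            let idx = queued
                .iter()
                .position(|&id| id == poi_id)
                .ok_or_else(|| format!("unknown poi_id {poi_id}"))?;
            queued.remove(idx);
            Ok(SubmitOutcome::Declined(DeclineAction { poi_id }))
        }
    }
}
```

A call site written against the old Result<()> shape, such as `submit_operator_cmd(&mut queued, OperatorCmd::MissionAbort).unwrap();`, still compiles in this sketch because the unwrapped value is simply dropped.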

A3. Poi.priority field is not mutated by the queue

Severity: Low (Architecture / clarification)
Detail: The canonical Poi.priority field stays whatever the producer set it to. The queue's internal Entry carries the proximity/age factors needed for ordering separately. This keeps the Poi model in shared::models::poi immutable from the queue's perspective and avoids racing producers/consumers on priority. Documented here in case AZ-684/685 expects to read a final priority score from the surfaced Poi.
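The arrangement in A3 can be illustrated as follows. This is a sketch under assumptions: the field sets of Poi and Entry, the pop_highest helper, and the inlined age factor are hypothetical, and only the idea that ordering data lives beside the Poi instead of inside it comes from the report.

```rust
/// Hypothetical, reduced Poi; the real model lives in shared::models::poi.
#[derive(Debug, Clone, PartialEq)]
pub struct Poi {
    pub id: u64,
    pub confidence: f32,
    /// Set by the producer and never rewritten by the queue.
    pub priority: f32,
}

/// Queue-internal wrapper holding the ordering inputs next to the Poi.
pub struct Entry {
    pub poi: Poi,
    pub proximity: f32,
    pub enqueued_at_ns: u64,
}

impl Entry {
    /// Ordering score computed on demand; Poi.priority is not an input.
    fn score(&self, now_ns: u64) -> f32 {
        let age_s = now_ns.saturating_sub(self.enqueued_at_ns) as f32 / 1e9;
        // Linear decay 1.0 -> 0.1 over 300 s, as described in the report.
        let age_factor = 1.0 - (age_s / 300.0).clamp(0.0, 1.0) * 0.9;
        self.poi.confidence * self.proximity * age_factor
    }
}

/// Removes and returns the Poi with the highest computed score, untouched.
pub fn pop_highest(entries: &mut Vec<Entry>, now_ns: u64) -> Option<Poi> {
    let best = entries
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.score(now_ns).total_cmp(&b.score(now_ns)))
        .map(|(idx, _)| idx)?;
    Some(entries.remove(best).poi)
}
```

In this sketch a consumer that needs the final score would have to receive it alongside the Poi, since the surfaced value still carries the producer's original priority.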

6. Cumulative findings — open carry-over

Batch 13 is the first batch of a new triplet (13 / 14 / 15); the cumulative review will land at the end of batch 15. Carry-over from the batch-12 cumulative review:

| ID | Severity | Category | Status |
|---|---|---|---|
| C1 | Medium | Maintainability | OPEN: duplicated SendCommandError mapping in gimbal_controller (batches 9-10) |
| C2 | Low | Style | OPEN: MavlinkCommandIssuer naming inconsistency (batch 9) |
| C3 | Low | Architecture | OPEN: module-layout.md drift: now also covers scan_controller/internal/poi_queue/{mod,priority}.rs |
| C4 | Low | Architecture | OPEN: data_model.md §PanPlan definition still missing (batch 11) |
| C5 | High | Maintenance | OPEN: pre-existing autopilot/runtime.rs::vlm_provider_name dead-code error blocking workspace -D warnings clippy (batch 4 origin) |

C3 grows by poi_queue/{mod,priority}.rs this batch. C5 is still the most pressing; the next opportunity to fix it is either a dedicated maintenance batch or a sweep before merging dev.

7. Next-batch candidates

  • AZ-684 — scan_controller evidence ladder + VLM hooks. Now unblocked by AZ-683 here, but still needs AZ-660 (detections wire) and AZ-671 (VLM provider runtime) for end-to-end value. Could be partially implemented as a "Tier-2 confirmation handler stub" today.
  • AZ-685 — mapobjects-store dispatch for confirmed POIs and IgnoredItem (consumes the DeclineAction this batch returns).
  • AZ-659 — frame_ingest publisher (slow-consumer drop policy).
  • AZ-658 — frame_ingest decoder (still pending the retina/ffmpeg pin decision).