ScanController

1. High-Level Overview

Purpose: Central orchestrator that drives the scan behavior tree — ticks the tree each cycle, which coordinates frame capture, inference dispatch, POI management, gimbal control, health monitoring, and L1/L2 scan transitions. Search behavior is data-driven via configurable Search Scenarios.

Architectural Pattern: Behavior Tree (py_trees) for high-level scan orchestration. Leaf nodes contain simple procedural logic that calls into other components. Search scenarios loaded from YAML config define what to look for and how to investigate.

Upstream dependencies: Tier1Detector, Tier2SpatialAnalyzer, VLMClient (optional), GimbalDriver, OutputManager, Config helper, Types helper

Downstream consumers: None — this is the top-level orchestrator. Exposes health API endpoint.

2. Search Scenarios (data-driven)

A SearchScenario defines what triggers a Level 2 investigation and how to investigate it. Multiple scenarios can be active simultaneously. Defined in YAML config:

search_scenarios:
  - name: winter_concealment
    enabled: true
    trigger:
      classes: [footpath_winter, branch_pile, dark_entrance]
      min_confidence: 0.5
    investigation:
      type: path_follow
      follow_class: footpath_winter
      target_classes: [concealed_position, branch_pile, dark_entrance, trash]
      use_vlm: true
    priority_boost: 1.0

  - name: autumn_concealment
    enabled: true
    trigger:
      classes: [footpath_autumn, branch_pile, dark_entrance]
      min_confidence: 0.5
    investigation:
      type: path_follow
      follow_class: footpath_autumn
      target_classes: [concealed_position, branch_pile, dark_entrance]
      use_vlm: true
    priority_boost: 1.0

  - name: building_area_search
    enabled: true
    trigger:
      classes: [building_block, road_with_traces, house_with_vehicle]
      min_confidence: 0.6
    investigation:
      type: area_sweep
      target_classes: [vehicle, military_vehicle, traces, dark_entrance]
      use_vlm: false
    priority_boost: 0.8

  - name: aa_defense_network
    enabled: false
    trigger:
      classes: [radar_dish, aa_launcher, military_truck]
      min_confidence: 0.4
      min_cluster_size: 2
    investigation:
      type: cluster_follow
      target_classes: [radar_dish, aa_launcher, military_truck, command_vehicle]
      cluster_radius_px: 300
      use_vlm: true
    priority_boost: 1.5
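The scenario blocks above could be parsed into typed objects roughly as follows. This is a minimal sketch: the class names echo the `SearchScenario` struct from the types helper, but the exact fields and the `parse_scenarios` helper are assumptions beyond what the YAML shows. Invalid blocks are skipped rather than aborting startup, matching the error-handling strategy described later in this document.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class Trigger:
    classes: list[str]
    min_confidence: float
    min_cluster_size: int = 1          # only used by cluster_follow scenarios

@dataclass
class Investigation:
    type: str
    target_classes: list[str]
    use_vlm: bool = False
    follow_class: str | None = None    # path_follow only
    cluster_radius_px: int | None = None  # cluster_follow only

@dataclass
class SearchScenario:
    name: str
    trigger: Trigger
    investigation: Investigation
    priority_boost: float = 1.0
    enabled: bool = True

VALID_TYPES = {"path_follow", "area_sweep", "zoom_classify", "cluster_follow"}

def parse_scenarios(raw: list[dict]) -> list[SearchScenario]:
    """Parse dicts (e.g. from yaml.safe_load) into SearchScenario objects.

    Invalid blocks are skipped so one bad scenario cannot take down startup.
    """
    scenarios = []
    for block in raw:
        try:
            if block["investigation"]["type"] not in VALID_TYPES:
                raise ValueError(f"unknown investigation type: {block['investigation']['type']}")
            scenario = SearchScenario(
                name=block["name"],
                enabled=block.get("enabled", True),
                trigger=Trigger(**block["trigger"]),
                investigation=Investigation(**block["investigation"]),
                priority_boost=block.get("priority_boost", 1.0),
            )
        except (KeyError, TypeError, ValueError) as exc:
            print(f"skipping invalid scenario: {exc}")  # real code would log.error
            continue
        if scenario.enabled:
            scenarios.append(scenario)
    return scenarios
```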

Investigation Types

| Type | Description | Subtree | Used When |
|------|-------------|---------|-----------|
| path_follow | Skeletonize footpath → PID follow → endpoint analysis | PathFollowSubtree | Footpath-based scenarios |
| area_sweep | Slow pan across POI area at high zoom, Tier 1 continuously | AreaSweepSubtree | Building blocks, tree rows, clearings |
| zoom_classify | Zoom to POI → run Tier 1 at high zoom → report | ZoomClassifySubtree | Single long-range targets |
| cluster_follow | Cluster nearby detections → visit each in order → classify per point | ClusterFollowSubtree | AA defense networks, radar clusters, vehicle groups |

Adding a new investigation type requires a new BT subtree. Adding a new scenario that uses existing investigation types requires only YAML config changes.

3. Behavior Tree Structure

Root (Selector — try highest priority first)
│
├── [1] HealthGuard (Decorator: checks capability flags)
│   └── FallbackBehavior
│       ├── If semantic_available=false → run existing YOLO only
│       └── If gimbal_available=false → fixed camera, Tier 1 detect only
│
├── [2] L2Investigation (Sequence — runs if POI queue non-empty)
│   ├── CheckPOIQueue (Condition: queue non-empty?)
│   ├── PickHighestPOI (Action: pop from priority queue)
│   ├── ZoomToPOI (Action: gimbal zoom + wait)
│   ├── L2DetectLoop (Repeat until timeout)
│   │   ├── CaptureFrame (Action)
│   │   ├── RunTier1 (Action: YOLOE on zoomed frame)
│   │   ├── RecordFrame (Action: L2 rate)
│   │   └── InvestigateByScenario (Selector — picks subtree based on POI's scenario)
│   │       ├── PathFollowSubtree (Sequence — if scenario.type == path_follow)
│   │       │   ├── TraceMask (Action: Tier2.trace_mask → SpatialAnalysisResult)
│   │       │   ├── PIDFollow (Action: gimbal PID along trajectory)
│   │       │   └── WaypointAnalysis (Selector — for each waypoint)
│   │       │       ├── HighConfidence (Condition: heuristic > threshold)
│   │       │       │   └── LogDetection (Action: tier=2)
│   │       │       └── AmbiguousWithVLM (Sequence — if scenario.use_vlm)
│   │       │           ├── CheckVLMAvailable (Condition)
│   │       │           ├── RunVLM (Action: VLMClient.analyze)
│   │       │           └── LogDetection (Action: tier=3)
│   │       ├── ClusterFollowSubtree (Sequence — if scenario.type == cluster_follow)
│   │       │   ├── TraceCluster (Action: Tier2.trace_cluster → SpatialAnalysisResult)
│   │       │   ├── VisitLoop (Repeat over waypoints)
│   │       │   │   ├── MoveToWaypoint (Action: gimbal to next waypoint position)
│   │       │   │   ├── CaptureFrame (Action)
│   │       │   │   ├── RunTier1 (Action: YOLOE at high zoom)
│   │       │   │   ├── ClassifyWaypoint (Selector — heuristic or VLM)
│   │       │   │   │   ├── HighConfidence (Condition)
│   │       │   │   │   │   └── LogDetection (Action: tier=2)
│   │       │   │   │   └── AmbiguousWithVLM (Sequence)
│   │       │   │   │       ├── CheckVLMAvailable (Condition)
│   │       │   │   │       ├── RunVLM (Action)
│   │       │   │   │       └── LogDetection (Action: tier=3)
│   │       │   │   └── RecordFrame (Action)
│   │       │   └── LogClusterSummary (Action: report cluster as a whole)
│   │       ├── AreaSweepSubtree (Sequence — if scenario.type == area_sweep)
│   │       │   ├── ComputeSweepPattern (Action: bounding box → pan/tilt waypoints)
│   │       │   ├── SweepLoop (Repeat over waypoints)
│   │       │   │   ├── SendGimbalCommand (Action)
│   │       │   │   ├── CaptureFrame (Action)
│   │       │   │   ├── RunTier1 (Action)
│   │       │   │   └── CheckTargets (Action: match against scenario.target_classes)
│   │       │   └── LogDetections (Action: all found targets)
│   │       └── ZoomClassifySubtree (Sequence — if scenario.type == zoom_classify)
│   │           ├── HoldZoom (Action: maintain zoom on POI)
│   │           ├── CaptureMultipleFrames (Action: 3-5 frames for confidence)
│   │           ├── RunTier1 (Action: on each frame)
│   │           ├── AggregateResults (Action: majority vote on target_classes)
│   │           └── LogDetection (Action)
│   ├── ReportToOperator (Action)
│   └── ReturnToSweep (Action: gimbal zoom out)
│
├── [3] L1Sweep (Sequence — default behavior)
│   ├── HealthCheck (Action: read tegrastats, update capability flags)
│   ├── AdvanceSweep (Action: compute next pan angle)
│   ├── SendGimbalCommand (Action: set sweep target)
│   ├── CaptureFrame (Action)
│   ├── QualityGate (Condition: Laplacian variance > threshold)
│   ├── RunTier1 (Action: YOLOE inference)
│   ├── EvaluatePOI (Action: match detections against ALL active scenarios' trigger_classes)
│   ├── RecordFrame (Action: L1 rate)
│   └── LogDetections (Action)
│
└── [4] Idle (AlwaysSucceeds — fallback)

EvaluatePOI Logic (scenario-aware)

for detection in detections:
    for scenario in active_scenarios:
        if (detection.label not in scenario.trigger.classes
                or detection.confidence < scenario.trigger.min_confidence):
            continue

        if scenario.investigation.type == "cluster_follow":
            # Aggregate nearby detections into a single cluster POI
            matching = [d for d in detections
                        if d.label in scenario.trigger.classes
                        and d.confidence >= scenario.trigger.min_confidence]
            clusters = spatial_cluster(matching, scenario.investigation.cluster_radius_px)
            for cluster in clusters:
                if len(cluster) >= scenario.trigger.min_cluster_size:
                    poi = POI(
                        trigger_class=cluster[0].label,
                        scenario_name=scenario.name,
                        investigation_type="cluster_follow",
                        cluster_detections=cluster,
                        priority=mean(d.confidence for d in cluster) * scenario.priority_boost,
                    )
                    enqueue_poi(poi)  # deduplicates by cluster overlap
            break  # cluster scenario fully evaluated for this detection;
                   # don't also create individual POIs for it
        else:
            poi = POI(
                trigger_class=detection.label,
                scenario_name=scenario.name,
                investigation_type=scenario.investigation.type,
                priority=detection.confidence * scenario.priority_boost * recency_factor,
            )
            enqueue_poi(poi)  # deduplicates by bbox overlap
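The `spatial_cluster` call above is not specified elsewhere in this document; one plausible sketch is greedy single-link clustering on bbox centers, assuming each detection carries a `.bbox = (x1, y1, x2, y2)` attribute:

```python
import math

def spatial_cluster(detections, radius_px):
    """Greedy single-link clustering: two detections land in the same cluster
    when their bbox centers are within radius_px of each other (directly or
    through a chain of neighbors). A sketch, not a tuned implementation.
    """
    def center(d):
        x1, y1, x2, y2 = d.bbox
        return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

    clusters: list[list] = []
    for det in detections:
        cx, cy = center(det)
        merged = None
        for cluster in clusters:
            if any(math.hypot(cx - center(o)[0], cy - center(o)[1]) <= radius_px
                   for o in cluster):
                if merged is None:
                    cluster.append(det)      # join the first matching cluster
                    merged = cluster
                else:
                    merged.extend(cluster)   # det links two clusters: merge them
                    cluster.clear()
        clusters = [c for c in clusters if c]  # drop emptied clusters
        if merged is None:
            clusters.append([det])
    return clusters
```

With `cluster_radius_px: 300` from the `aa_defense_network` scenario, two detections 50 px apart join one cluster while a detection 1000 px away forms its own.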

Blackboard Variables (py_trees shared state)

| Variable | Type | Written by | Read by |
|----------|------|------------|---------|
| frame | FrameContext | CaptureFrame | RunTier1, RecordFrame, QualityGate |
| detections | list[Detection] | RunTier1 | EvaluatePOI, InvestigateByScenario, LogDetections |
| poi_queue | list[POI] | EvaluatePOI | CheckPOIQueue, PickHighestPOI |
| current_poi | POI | PickHighestPOI | ZoomToPOI, InvestigateByScenario |
| active_scenarios | list[SearchScenario] | Config load | EvaluatePOI, InvestigateByScenario |
| spatial_result | SpatialAnalysisResult | TraceMask, TraceCluster | PIDFollow, WaypointAnalysis, VisitLoop |
| capability_flags | CapabilityFlags | HealthCheck | HealthGuard, CheckVLMAvailable |
| scan_angle | float | AdvanceSweep | SendGimbalCommand |

4. External API Specification

| Endpoint | Method | Auth | Rate Limit | Description |
|----------|--------|------|------------|-------------|
| /api/v1/health | GET | None | | Returns health status + metrics |
| /api/v1/detect | POST | None | Frame rate | Submit single frame for processing (dev/test mode) |

Health response:

{
  "status": "ok",
  "tier1_ready": true,
  "gimbal_alive": true,
  "vlm_alive": false,
  "t_junction_c": 68.5,
  "capabilities": {"vlm_available": true, "gimbal_available": true, "semantic_available": true},
  "active_behavior": "L2Investigation.PathFollowSubtree.PIDFollow",
  "active_scenarios": ["winter_concealment", "building_area_search"],
  "frames_processed": 12345,
  "detections_total": 89,
  "poi_queue_depth": 2
}
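The payload above could be assembled from blackboard state roughly like this. FastAPI wiring is omitted; the field names follow the example response, while `build_health_response` and its plain-dict inputs are assumptions (in the service they would come from capability_flags and the tree's counters):

```python
def build_health_response(flags, metrics, active_behavior, scenarios):
    """Assemble the /api/v1/health payload.

    flags    -- capability booleans (vlm/gimbal/semantic availability)
    metrics  -- counters and sensor readings gathered by HealthCheck
    """
    degraded = not (flags["semantic_available"] and flags["gimbal_available"])
    return {
        "status": "degraded" if degraded else "ok",
        "tier1_ready": metrics["tier1_ready"],
        "gimbal_alive": flags["gimbal_available"],
        "vlm_alive": flags["vlm_available"],
        "t_junction_c": metrics["t_junction_c"],
        "capabilities": flags,
        "active_behavior": active_behavior,
        "active_scenarios": scenarios,
        "frames_processed": metrics["frames_processed"],
        "detections_total": metrics["detections_total"],
        "poi_queue_depth": metrics["poi_queue_depth"],
    }
```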

5. Data Access Patterns

No database. State lives on the py_trees Blackboard:

  • POI queue: priority list on blackboard, max size from config (default 10)
  • Capability flags: 3 booleans on blackboard
  • Active scenarios: loaded from config at startup, stored on blackboard
  • Current frame: single FrameContext on blackboard (overwritten each tick)
  • Scan angle: float on blackboard (incremented each L1 tick)

6. Implementation Details

Main Loop:

scenarios = config.get("search_scenarios")
tree = create_scan_tree(config, components, scenarios)
while running:
    tree.tick()

Each tick() traverses the tree from root. The Selector tries HealthGuard first (preempts if degraded), then L2 (if POI queued), then L1 (default). Leaf nodes call into Tier1Detector, Tier2SpatialAnalyzer, VLMClient, GimbalDriver, OutputManager.
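The Selector fall-through described above can be illustrated with a stripped-down, py_trees-free sketch (the real tree uses py_trees composites; this only demonstrates the priority semantics):

```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

class Selector:
    """Tick children in priority order; return the first non-FAILURE status."""
    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.FAILURE:
                return status
        return Status.FAILURE

class Leaf:
    def __init__(self, fn):
        self.fn = fn

    def tick(self):
        return self.fn()

# With an empty POI queue, L2Investigation FAILs and the Selector
# falls through to the default L1Sweep behavior.
poi_queue = []
l2_investigation = Leaf(lambda: Status.RUNNING if poi_queue else Status.FAILURE)
l1_sweep = Leaf(lambda: Status.SUCCESS)
root = Selector([l2_investigation, l1_sweep])
```

Once EvaluatePOI enqueues something, the next tick returns RUNNING from the L2 branch and L1Sweep is never reached, which is exactly the preemption order listed for the root Selector.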

InvestigateByScenario dispatching: reads current_poi.investigation_type from blackboard, routes to the matching subtree (PathFollow / ClusterFollow / AreaSweep / ZoomClassify). Each subtree reads current_poi.scenario for target classes and VLM usage.

Leaf Node Pattern: Each leaf node is a simple py_trees.behaviour.Behaviour subclass. setup() gets component references. update() calls the component method and returns SUCCESS/FAILURE/RUNNING.

Key Dependencies:

| Library | Version | Purpose |
|---------|---------|---------|
| py_trees | 2.4.0 | Behavior tree framework |
| OpenCV | 4.x | Frame capture, Laplacian variance |
| FastAPI | existing | Health + detect endpoints |

Error Handling Strategy:

  • Leaf node exceptions → catch, log, return FAILURE → tree falls through to next Selector child
  • Component unavailable → Condition nodes gate access (CheckVLMAvailable, QualityGate)
  • Health degradation → HealthGuard decorator at root preempts all other behaviors
  • Invalid scenario config → log error at startup, skip invalid scenario, continue with valid ones

Frame Quality Gate: QualityGate is a Condition node in L1Sweep. If it returns FAILURE (blurry frame), the Sequence aborts and the tree ticks L1Sweep again next cycle (new frame).
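The gate's sharpness metric can be sketched without OpenCV (in the service it would be cv2.Laplacian(gray, cv2.CV_64F).var()); the 50.0 threshold follows the WARN example in section 10 and would really come from config:

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour Laplacian over a grayscale frame.

    Low values mean a uniform/blurry image; sharp edges push it up.
    Equivalent in spirit to cv2.Laplacian(gray, cv2.CV_64F).var().
    """
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def quality_gate(gray: np.ndarray, threshold: float = 50.0) -> bool:
    """Condition-node logic: SUCCESS when the frame is sharp enough."""
    return laplacian_variance(gray) >= threshold
```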

7. Extensions and Helpers

| Helper | Purpose | Used By |
|--------|---------|---------|
| config | YAML config loading + validation | All components |
| types | Shared structs (FrameContext, POI, SearchScenario, etc.) | All components |

Adding a New Search Scenario (config only)

  1. Define new detection classes in YOLOE (retrain if needed)
  2. Add YAML block under search_scenarios with trigger classes, investigation type, targets
  3. Restart service — new scenario is active

Adding a New Investigation Type (code + config)

  1. Create new BT subtree (e.g., SpiralSearchSubtree)
  2. Register it in InvestigateByScenario dispatcher
  3. Use type: spiral_search in scenario config
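Step 2's dispatcher could be a simple type-to-factory registry; a hedged sketch (the string stand-ins mark where real py_trees subtree builders would go, and the helper names are assumptions):

```python
# Maps scenario investigation type -> subtree factory. Registering a new
# investigation type is one entry here; everything else is YAML.
SUBTREE_FACTORIES = {
    "path_follow": lambda: "PathFollowSubtree",       # stand-ins for the real
    "area_sweep": lambda: "AreaSweepSubtree",         # py_trees subtree builders
    "zoom_classify": lambda: "ZoomClassifySubtree",
    "cluster_follow": lambda: "ClusterFollowSubtree",
}

def register_investigation_type(name, factory):
    if name in SUBTREE_FACTORIES:
        raise ValueError(f"investigation type already registered: {name}")
    SUBTREE_FACTORIES[name] = factory

def build_subtree(investigation_type):
    """What InvestigateByScenario does with current_poi.investigation_type."""
    try:
        return SUBTREE_FACTORIES[investigation_type]()
    except KeyError:
        raise ValueError(f"no subtree for investigation type: {investigation_type}")
```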

8. Caveats & Edge Cases

Known limitations:

  • Single-threaded tree ticking — throughput capped by slowest leaf per tick
  • py_trees Blackboard is not thread-safe (fine — single-threaded design)
  • POI queue doesn't persist across restarts
  • Scenario changes require service restart (no hot-reload)

Performance bottlenecks:

  • Tier 1 inference leaf is the bottleneck (~7-100ms)
  • Tree traversal overhead is negligible (<1ms)

9. Dependency Graph

  • Must be implemented after: Config helper, Types helper, ALL other components (02-06)
  • Can be implemented in parallel with: None (top-level orchestrator)
  • Blocks: Nothing (implemented last)

10. Logging Strategy

| Log Level | When | Example |
|-----------|------|---------|
| ERROR | Component crash, 3x retry exhausted, invalid scenario | Gimbal UART failed 3 times, disabling gimbal |
| WARN | Frame skipped, VLM timeout, leaf FAILURE | QualityGate FAILURE: Laplacian 12.3 < 50.0 |
| INFO | State transitions, POI created, scenario match | POI queued: winter_concealment triggered by footpath_winter (conf=0.72) |

py_trees built-in logging: Tree can render active path as ASCII for debugging:

[o] Root
    [-] HealthGuard
    [o] L2Investigation
        [o] L2DetectLoop
            [o] InvestigateByScenario
                [o] PathFollowSubtree
                    [*] PIDFollow (RUNNING)