detections-semantic/_docs/02_plans/system-flows.md
Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit
Made-with: Cursor
2026-03-26 00:20:30 +02:00


Semantic Detection System — System Flows

Flow Inventory

| # | Flow Name | Trigger | Primary Components | Criticality |
|---|---|---|---|---|
| F1 | Level 1 Wide-Area Scan | System startup / return from L2 | ScanController, GimbalDriver, Tier1Detector | High |
| F2 | Level 2 Detailed Investigation | POI queued for investigation | ScanController, GimbalDriver, Tier1Detector, Tier2SpatialAnalyzer, VLMProcess | High |
| F3 | Path / Cluster Following | Spatial pattern detected at L2 zoom | Tier2SpatialAnalyzer, GimbalDriver, ScanController | High |
| F4 | Health & Degradation | Continuous monitoring | HealthChecks (inline), ScanController | High |
| F5 | System Startup | Power on | All components | Medium |

Flow Dependencies

| Flow | Depends On | Shares Data With |
|---|---|---|
| F1 | F5 (startup complete) | F2 (POI queue) |
| F2 | F1 (POI available) | F3 (spatial analysis result) |
| F3 | F2 (spatial pattern detected at L2) | F2 (gimbal position, waypoint detections) |
| F4 | — (inline in main loop) | F1, F2 (capability flags) |
| F5 | | All flows |

Flow F1: Level 1 Wide-Area Scan

Description

The scan controller drives the gimbal in a left-right sweep perpendicular to the UAV flight path at medium zoom. Tier 1 (YOLOE) processes each frame. When a POI-class detection exceeds the confidence threshold, it is queued for Level 2 investigation. Frames are recorded at a configurable rate, and detections are logged and reported to the operator.

Preconditions

  • System startup complete (F5)
  • Gimbal responding
  • YOLOE TRT engine loaded

Sequence Diagram

sequenceDiagram
    participant SC as ScanController
    participant GD as GimbalDriver
    participant T1 as Tier1Detector
    participant Log as Logger/Recorder

    loop Every sweep position
        SC->>SC: health_check() — read T_junction, check gimbal, check VLM
        SC->>GD: set_sweep_target(pan_angle)
        GD->>GD: send ViewLink command
        Note over SC: capture frame from camera
        SC->>T1: process_frame(frame)
        T1-->>SC: detections[] (classes, masks, confidences)
        SC->>Log: record_frame(frame, level=1) + log_detections(detections)

        alt POI-class detected above threshold
            SC->>SC: queue_poi(detection, priority)
            alt High-priority POI ready
                Note over SC: Transition to F2
            end
        end

        SC->>SC: advance sweep angle
    end

POI Queueing (inline in F1)

When Tier 1 detects any class, EvaluatePOI checks it against ALL active search scenarios:

  1. For each detection, match against each active scenario's trigger_classes and min_confidence
  2. Check if duplicate of existing queued POI (bbox overlap > 0.5) → update confidence
  3. Otherwise create new POI entry with scenario_name and investigation_type, compute priority (confidence × scenario.priority_boost × recency)
  4. Insert into priority queue (max size configurable, default 10)
  5. If queue full, drop lowest-priority entry
  6. Transition to L2 when: current sweep position allows (not mid-transition) AND queue is non-empty
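
Steps 1 to 5 above can be sketched in a few lines of Python. This is an illustrative sketch, not the project's EvaluatePOI: the `POI` and `POIQueue` names, the dict-based scenario format, the use of intersection-over-union as the "bbox overlap" measure, and the "keep the best value seen" rule for duplicates are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class POI:
    bbox: tuple            # (x1, y1, x2, y2) in pixels
    confidence: float
    scenario_name: str
    investigation_type: str
    priority: float


def bbox_overlap(a, b):
    """Intersection-over-union of two boxes (assumed meaning of 'bbox overlap')."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class POIQueue:
    def __init__(self, max_size=10, overlap_threshold=0.5):
        self.max_size = max_size
        self.overlap_threshold = overlap_threshold
        self.items = []  # kept sorted, highest priority first

    def evaluate(self, cls, bbox, confidence, scenarios, recency=1.0):
        """Match scenarios, merge duplicates, insert, then trim to max_size."""
        for sc in scenarios:
            if cls not in sc["trigger_classes"] or confidence < sc["min_confidence"]:
                continue
            priority = confidence * sc["priority_boost"] * recency
            dup = next((p for p in self.items
                        if bbox_overlap(p.bbox, bbox) > self.overlap_threshold), None)
            if dup is not None:
                # Assumed update rule: keep the best confidence and priority seen.
                dup.confidence = max(dup.confidence, confidence)
                dup.priority = max(dup.priority, priority)
            else:
                self.items.append(POI(bbox, confidence, sc["name"],
                                      sc["investigation_type"], priority))
            self.items.sort(key=lambda p: p.priority, reverse=True)
            del self.items[self.max_size:]  # queue full: drop lowest priority

    def pop_highest(self):
        """Return the highest-priority POI, or None when the queue is empty."""
        return self.items.pop(0) if self.items else None
```

A sorted list is used instead of a heap because the default queue size is small (10) and duplicate lookup needs to scan every entry anyway.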

Data Flow

| Step | From | To | Data | Format |
|---|---|---|---|---|
| 1 | ScanController | GimbalDriver | target pan/tilt/zoom | GimbalCommand |
| 2 | Camera | ScanController | raw frame | 1920x1080 |
| 3 | ScanController | Tier1Detector | frame buffer | numpy array (HWC) |
| 4 | Tier1Detector | ScanController | detection array | list of dicts |
| 5 | ScanController | Logger/Recorder | frame + detections | JPEG + JSON-lines |

Error Scenarios

| Error | Where | Detection | Recovery |
|---|---|---|---|
| Gimbal timeout | GimbalDriver | No response within 2s | Retry 3x, then set gimbal_available=false, continue with fixed camera |
| YOLOE inference failure | Tier1Detector | Exception / timeout | Skip frame, log error, continue |
| Frame quality too low | ScanController | Laplacian variance < threshold | Skip frame, continue to next |
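
The gimbal-timeout recovery might look like the sketch below. The `send_gimbal_command` name, the `send(command, timeout_s)` callable, and the `flags` dict are hypothetical, and "Retry 3x" is read here as three attempts in total, which is an assumption.

```python
def send_gimbal_command(send, command, flags, attempts=3, timeout_s=2.0):
    """Try a gimbal command a few times, then degrade to fixed-camera mode.

    `send` is a hypothetical callable that raises TimeoutError when the
    gimbal does not respond within `timeout_s` seconds.
    """
    for _ in range(attempts):
        try:
            send(command, timeout_s)
            return True
        except TimeoutError:
            continue  # no response within the timeout: try again
    # All attempts timed out: continue with a fixed camera.
    flags["gimbal_available"] = False
    return False
```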

Performance Expectations

| Metric | Target | Notes |
|---|---|---|
| Sweep cycle time | 100-200ms per position | Tier 1 inference + gimbal command |
| Full sweep coverage | ≤10s per left-right cycle | Depends on sweep angle range and step size |
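
Taken together, the two targets bound how many sweep positions fit in one left-right cycle. The arithmetic below is a rough check only; it ignores gimbal settling time and any other overhead, and the function name is illustrative.

```python
def positions_per_cycle(cycle_budget_ms, per_position_ms):
    """Upper bound on sweep positions that fit inside one sweep cycle."""
    return cycle_budget_ms // per_position_ms


print(positions_per_cycle(10_000, 100))  # 100 positions at 100 ms each
print(positions_per_cycle(10_000, 200))  # 50 positions at 200 ms each
```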

Flow F2: Level 2 Detailed Investigation

Description

The camera zooms in on the highest-priority POI. The POI's search scenario determines the investigation type: path_follow, cluster_follow, area_sweep, or zoom_classify. For path_follow, F3 activates with mask tracing. For cluster_follow, F3 activates with cluster tracing, visiting each member in order. For area_sweep, the gimbal pans slowly at high zoom. For zoom_classify, the camera holds zoom and classifies. Tier 2/3 analysis runs as needed. After analysis completes or the investigation times out, the system returns to Level 1.

Preconditions

  • POI queue has at least 1 entry
  • gimbal_available == true
  • Tier 1 engine loaded

Sequence Diagram

sequenceDiagram
    participant SC as ScanController
    participant GD as GimbalDriver
    participant T1 as Tier1Detector
    participant T2 as Tier2SpatialAnalyzer
    participant VLM as VLMProcess
    participant Log as Logger/Recorder

    SC->>GD: zoom_to_poi(poi.coords, zoom=high)
    Note over GD: 1-2s zoom transition
    GD-->>SC: zoom_complete

    loop Until timeout or analysis complete
        SC->>SC: health_check()
        Note over SC: capture zoomed frame
        SC->>T1: process_frame(zoomed_frame)
        SC->>Log: record_frame(frame, level=2)
        T1-->>SC: detections[]

        alt Footpath detected
            SC->>T2: trace_path(footpath_mask)
            T2->>T2: skeletonize + find endpoints
            T2-->>SC: endpoints[], skeleton

            SC->>SC: evaluate endpoints (V1 heuristic: darkness + contrast)
            alt Endpoint is dark mass → HIGH confidence
                SC->>Log: log_detection(tier=2, class=concealed_position)
            else Ambiguous endpoint AND vlm_available
                SC->>VLM: analyze_roi(endpoint_crop, prompt)
                VLM-->>SC: vlm_response
                SC->>Log: log_detection(tier=3, vlm_result)
            else Ambiguous endpoint AND NOT vlm_available
                SC->>Log: log_detection(tier=2, class=uncertain)
            end

            SC->>GD: follow_path(skeleton.direction)
            Note over SC: Activates F3
        else No footpath, other POI type
            SC->>T2: analyze_roi(poi_region)
            T2-->>SC: classification
            SC->>Log: log_detection(tier=2)
        end
    end

    SC->>Log: report_detections_to_operator()
    SC->>GD: return_to_sweep(zoom=medium)
    Note over SC: Back to F1

Error Scenarios

| Error | Where | Detection | Recovery |
|---|---|---|---|
| Zoom transition timeout | GimbalDriver | No confirmation within 3s | Proceed with current zoom |
| VLM timeout | VLMProcess | No response within 5s | Skip Tier 3, report Tier 2 result only |
| VLM crash | VLMProcess | IPC connection refused | Set vlm_available=false, continue Tier 1+2 |
| Investigation timeout | ScanController | Timer exceeds limit (default 10s) | Return to L1, mark POI as "timeout" |
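
The endpoint branch of the sequence diagram, combined with the two VLM error rows, can be sketched as a single decision function. The `classify_endpoint` name, the `ask_vlm(timeout_s)` callable, and the result dicts are hypothetical. Treating a VLM timeout as "report the Tier 2 result `uncertain`" is an interpretation of the table, not something the source spells out.

```python
def classify_endpoint(is_dark_mass, flags, ask_vlm, vlm_timeout_s=5.0):
    """Decide which tier produces the result for one path endpoint.

    `ask_vlm` is a hypothetical callable that raises TimeoutError when the
    VLM is too slow and ConnectionRefusedError when the VLM process is down.
    """
    if is_dark_mass:
        # V1 heuristic is confident: no VLM call needed.
        return {"tier": 2, "class": "concealed_position"}
    if flags["vlm_available"]:
        try:
            return {"tier": 3, "vlm_result": ask_vlm(vlm_timeout_s)}
        except ConnectionRefusedError:
            flags["vlm_available"] = False  # VLM crash: continue with Tier 1+2
        except TimeoutError:
            pass  # VLM timeout: skip Tier 3 for this endpoint only
    return {"tier": 2, "class": "uncertain"}
```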

Performance Expectations

Metric Target Notes
L1→L2 transition ≤2s Including zoom
Per-POI investigation ≤10s (configurable) Including VLM if triggered
Return to L1 ≤2s Zoom-out + first sweep position

Flow F3: Path / Cluster Following

Description

Activated from F2 when the investigation type is path_follow or cluster_follow. The Tier2SpatialAnalyzer produces a SpatialAnalysisResult with ordered waypoints and a trajectory. The gimbal follows the trajectory, visiting each waypoint for analysis.

Mask trace mode (path_follow): the footpath skeleton provides a continuous trajectory. PID control keeps the path centered in the frame. At each waypoint (endpoint), the camera holds for analysis.

Cluster trace mode (cluster_follow): discrete detections provide point-to-point waypoints. The gimbal moves between points in nearest-neighbor order. At each waypoint, the camera zooms in for detailed Tier 1 + heuristic/VLM analysis.
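
A greedy nearest-neighbor visit order for cluster trace mode could be computed as follows. This is a minimal sketch assuming waypoints are 2D points in a shared coordinate frame (for example pan/tilt angles) and that the visit starts from the current gimbal position; the function name is illustrative.

```python
import math


def nearest_neighbor_order(start, points):
    """Greedy visit order: always move to the closest unvisited point."""
    remaining = list(points)
    order = []
    current = start
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        order.append(nxt)
        current = nxt
    return order


print(nearest_neighbor_order((0, 0), [(5, 0), (1, 0), (2, 0)]))
# [(1, 0), (2, 0), (5, 0)]
```

The greedy order is not guaranteed to be the shortest tour, but it is cheap to compute and keeps each gimbal move short.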

Preconditions

  • Level 2 active (F2)
  • SpatialAnalysisResult available with >= 1 waypoint

Flowchart (mask trace and cluster trace)

flowchart TD
    Start([SpatialAnalysisResult available]) --> CheckType{pattern_type?}
    CheckType -->|mask_trace| ComputeDir[Compute path direction from trajectory]
    ComputeDir --> SetTarget[Set gimbal PID target: path direction]
    SetTarget --> PanLoop{Path still visible in frame?}
    PanLoop -->|Yes| UpdatePID[PID update: adjust pan/tilt to center path]
    UpdatePID --> SendCmd[Send gimbal command]
    SendCmd --> CaptureFrame[Capture next frame]
    CaptureFrame --> RunT1[Tier 1 on new frame]
    RunT1 --> UpdateSkeleton[Update skeleton from new mask]
    UpdateSkeleton --> CheckWaypoint{Reached next waypoint?}
    CheckWaypoint -->|No| PanLoop
    CheckWaypoint -->|Yes| HoldCamera[Hold camera on waypoint]
    HoldCamera --> AnalyzeWaypoint([Tier 2/3 waypoint analysis in F2])
    PanLoop -->|No, path lost| FallbackCentroid[Use last known direction]
    FallbackCentroid --> RetryDetect{Re-detect within 3 frames?}
    RetryDetect -->|Yes| UpdateSkeleton
    RetryDetect -->|No| AbortFollow([Return to F2 main loop])
    CheckType -->|cluster_trace| InitVisit[Get first waypoint from visit order]
    InitVisit --> MoveToPoint[Move gimbal to waypoint position]
    MoveToPoint --> CaptureZoomed[Capture zoomed frame]
    CaptureZoomed --> RunT1Zoomed[Tier 1 on zoomed frame]
    RunT1Zoomed --> ClassifyPoint[Heuristic / VLM classify]
    ClassifyPoint --> LogPoint[Log detection for this waypoint]
    LogPoint --> NextPoint{More waypoints?}
    NextPoint -->|Yes| MoveToPoint
    NextPoint -->|No| ClusterDone([Log cluster summary, return to F2])

Error Scenarios

| Error | Where | Detection | Recovery |
|---|---|---|---|
| Path lost from frame | Tier1Detector | No footpath mask in 3 consecutive frames | Abort follow, return to F2 |
| Cluster member not visible at zoom | Tier1Detector | No target_class at waypoint position | Log as unconfirmed, proceed to next waypoint |
| PID oscillation | GimbalDriver | Direction changes >5 times in 1s | Hold position, re-acquire |
| Gimbal at physical limit | GimbalDriver | Pan/tilt at max angle | Analyze what's visible, return to F2 |
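
The PID oscillation rule (more than 5 direction changes in 1s) can be detected with a sliding window of timestamps. The sketch below is one possible approach; the `OscillationDetector` class and its `update(t, pan_delta)` interface are hypothetical, and only the pan axis is tracked.

```python
from collections import deque


class OscillationDetector:
    def __init__(self, max_changes=5, window_s=1.0):
        self.max_changes = max_changes
        self.window_s = window_s
        self._last_sign = 0
        self._changes = deque()  # timestamps of recent direction changes

    def update(self, t, pan_delta):
        """Record one pan command at time t (seconds); True means oscillating."""
        sign = (pan_delta > 0) - (pan_delta < 0)
        if sign != 0:
            if self._last_sign != 0 and sign != self._last_sign:
                self._changes.append(t)
            self._last_sign = sign
        # Forget direction changes that fell out of the time window.
        while self._changes and t - self._changes[0] > self.window_s:
            self._changes.popleft()
        return len(self._changes) > self.max_changes
```

When `update` returns True, the caller would hold position and re-acquire, as the table describes.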

Performance Expectations

| Metric | Target | Notes |
|---|---|---|
| PID update rate | 10 Hz | Gimbal command every 100ms (mask trace) |
| Path centering | Path within center 50% of frame | AC requirement (mask trace) |
| Follow duration | ≤5s per path segment | Part of total POI budget |
| Cluster visit | ≤3s per waypoint | Zoom + capture + classify (cluster trace) |
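
For mask trace mode, a textbook PID loop running at the 10 Hz target, plus a check for the "center 50% of frame" requirement, might look like this. It is a generic sketch: the class, the helper name, and any gain values are placeholders, not the project's tuning.

```python
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the control output for one step of length dt seconds."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history yet on the first step
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def in_center_band(path_x, frame_width, band=0.5):
    """True if path_x lies within the central `band` fraction of the frame."""
    return abs(path_x - frame_width / 2) <= band * frame_width / 2
```

At 10 Hz, `dt` would be 0.1, and the error could be the path's horizontal offset from the frame center, normalized by frame width.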

Flow F4: Health & Degradation

Description

Health checks are performed inline at the top of each main-loop iteration (not a separate thread). The scan controller reads sensor values and sets capability flags that control what features are available.

Capability Flags

| Flag | Default | Set to false when | Effect |
|---|---|---|---|
| vlm_available | true | VLM process crashed 3x, or T_junction > 75°C, or power > 80% budget | Tier 3 skipped; Tier 1+2 continue |
| gimbal_available | true | Gimbal UART failed 3x | Fixed camera; L2 zoom disabled; L1 sweep disabled |
| semantic_available | true | Semantic process crashed 3x, or T_junction > 80°C | Existing YOLO only |

Inline Health Check (runs each iteration)

1. Read T_junction from tegrastats
2. If T_junction > 80°C → semantic_available = false
3. If T_junction > 75°C → vlm_available = false
4. If last gimbal response > 4s ago → gimbal_available = false
5. If VLM IPC failed last 3 attempts → vlm_available = false
6. If all clear and T_junction < 70°C → restore flags to true

No separate monitoring thread. No formal state machine. Flags are checked wherever decisions depend on them.
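
The six inline steps translate into one function over a flags dict. In this sketch the sensor readings are passed in as arguments rather than read from tegrastats, and "all clear" is interpreted as "no gimbal or VLM fault is currently active"; both choices, and the function signature, are assumptions.

```python
def health_check(flags, t_junction_c, gimbal_silence_s, vlm_ipc_failures):
    """Update capability flags in place from the latest readings."""
    if t_junction_c > 80.0:
        flags["semantic_available"] = False
    if t_junction_c > 75.0:
        flags["vlm_available"] = False
    gimbal_fault = gimbal_silence_s > 4.0
    if gimbal_fault:
        flags["gimbal_available"] = False
    vlm_fault = vlm_ipc_failures >= 3
    if vlm_fault:
        flags["vlm_available"] = False
    # Restore only below 70 degrees C, so flags do not flap near a threshold.
    if not gimbal_fault and not vlm_fault and t_junction_c < 70.0:
        for name in flags:
            flags[name] = True
    return flags
```

The gap between the trip points (75°C and 80°C) and the restore point (70°C) gives hysteresis: a flag cleared at 78°C stays cleared at 72°C and returns only once the temperature drops below 70°C.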


Flow F5: System Startup

Description

Power-on sequence: load models, initialize gimbal, begin Level 1 scan.

Sequence Diagram

sequenceDiagram
    participant Boot as JetPack Boot
    participant Main as MainProcess
    participant T1 as Tier1Detector
    participant GD as GimbalDriver
    participant SC as ScanController

    Boot->>Main: OS ready
    Main->>T1: load YOLOE TRT engine
    T1-->>Main: engine ready (~10-20s)
    Main->>GD: initialize gimbal (UART open, handshake)
    GD-->>Main: gimbal ready (or gimbal_available=false)
    Main->>Main: initialize logger, recorder (NVMe check)
    Main->>SC: start Level 1 scan (F1)

Performance Expectations

| Metric | Target | Notes |
|---|---|---|
| Total startup | ≤60s | Power-on to first detection |
| TRT engine load | ≤20s | Model size + NVMe speed |
| Gimbal handshake | ≤2s | UART open + version check |
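
The startup order from the sequence diagram, including the degraded path where the gimbal handshake fails, could be sketched as below. The three callables are hypothetical stand-ins for the real initializers, and the assumption that a failed handshake surfaces as an `OSError` (which includes `TimeoutError`) is illustrative.

```python
import time


def startup(load_engine, init_gimbal, init_logger, clock=time.monotonic):
    """Run the startup steps in order; return (flags, elapsed_seconds)."""
    flags = {"gimbal_available": True}
    started = clock()
    load_engine()  # blocking; the source budgets up to 20s for this step
    try:
        init_gimbal()
    except OSError:
        # Handshake failed: start anyway in fixed-camera mode.
        flags["gimbal_available"] = False
    init_logger()
    return flags, clock() - started
```

The returned elapsed time can be compared against the 60s total startup target before the Level 1 scan begins.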