Initial commit

Made-with: Cursor
Oleksandr Bezdieniezhnykh
2026-03-26 00:20:30 +02:00
commit 8e2ecf50fd
144 changed files with 19781 additions and 0 deletions
# Semantic Detection System — System Flows
## Flow Inventory
| # | Flow Name | Trigger | Primary Components | Criticality |
|---|-----------|---------|-------------------|-------------|
| F1 | Level 1 Wide-Area Scan | System startup / return from L2 | ScanController, GimbalDriver, Tier1Detector | High |
| F2 | Level 2 Detailed Investigation | POI queued for investigation | ScanController, GimbalDriver, Tier1Detector, Tier2SpatialAnalyzer, VLMProcess | High |
| F3 | Path / Cluster Following | Spatial pattern detected at L2 zoom | Tier2SpatialAnalyzer, GimbalDriver, ScanController | High |
| F4 | Health & Degradation | Continuous monitoring | HealthChecks (inline), ScanController | High |
| F5 | System Startup | Power on | All components | Medium |
## Flow Dependencies
| Flow | Depends On | Shares Data With |
|------|-----------|-----------------|
| F1 | F5 (startup complete) | F2 (POI queue) |
| F2 | F1 (POI available) | F3 (spatial analysis result) |
| F3 | F2 (spatial pattern detected at L2) | F2 (gimbal position, waypoint detections) |
| F4 | — (inline in main loop) | F1, F2 (capability flags) |
| F5 | — | All flows |
---
## Flow F1: Level 1 Wide-Area Scan
### Description
The scan controller drives the gimbal in a left-right sweep perpendicular to the UAV flight path at medium zoom. Each frame is processed by Tier 1 (YOLOE). When a POI-class detection exceeds the confidence threshold, it is queued for Level 2 investigation. Frames are recorded at a configurable rate. Detections are logged and reported to the operator.
### Preconditions
- System startup complete (F5)
- Gimbal responding
- YOLOE TRT engine loaded
### Sequence Diagram
```mermaid
sequenceDiagram
participant SC as ScanController
participant GD as GimbalDriver
participant T1 as Tier1Detector
participant Log as Logger/Recorder
loop Every sweep position
SC->>SC: health_check() — read T_junction, check gimbal, check VLM
SC->>GD: set_sweep_target(pan_angle)
GD->>GD: send ViewLink command
Note over SC: capture frame from camera
SC->>T1: process_frame(frame)
T1-->>SC: detections[] (classes, masks, confidences)
SC->>Log: record_frame(frame, level=1) + log_detections(detections)
alt POI-class detected above threshold
SC->>SC: queue_poi(detection, priority)
alt High-priority POI ready
Note over SC: Transition to F2
end
end
SC->>SC: advance sweep angle
end
```
### POI Queueing (inline in F1)
When Tier 1 detects any class, EvaluatePOI checks it against ALL active search scenarios:
1. For each detection, match against each active scenario's trigger_classes and min_confidence
2. Check if duplicate of existing queued POI (bbox overlap > 0.5) → update confidence
3. Otherwise create new POI entry with scenario_name and investigation_type, compute priority (confidence × scenario.priority_boost × recency)
4. Insert into priority queue (max size configurable, default 10)
5. If queue full, drop lowest-priority entry
6. Transition to L2 when: current sweep position allows (not mid-transition) AND queue is non-empty
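
Steps 1-5 above can be sketched as a small bounded queue. This is a minimal illustration, not the project's `EvaluatePOI`: the `POI` and `POIQueue` names, the dict-shaped detections and scenarios, the use of IoU for "bbox overlap", the `max()` rule for updating a duplicate, and the `recency` argument are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class POI:
    bbox: tuple            # (x1, y1, x2, y2) in pixels (assumed layout)
    confidence: float
    scenario_name: str
    investigation_type: str
    priority: float


def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class POIQueue:
    def __init__(self, max_size=10, overlap_threshold=0.5):
        self.max_size = max_size
        self.overlap_threshold = overlap_threshold
        self.items = []  # kept sorted, highest priority first

    def _sort(self):
        self.items.sort(key=lambda p: p.priority, reverse=True)

    def offer(self, det, scenario, recency=1.0):
        """Apply steps 1-5 to one detection/scenario pair; return the POI or None."""
        # Step 1: match trigger_classes and min_confidence.
        if det["class"] not in scenario["trigger_classes"]:
            return None
        if det["confidence"] < scenario["min_confidence"]:
            return None
        priority = det["confidence"] * scenario["priority_boost"] * recency
        # Step 2: duplicate of a queued POI, so update it in place.
        for poi in self.items:
            if iou(poi.bbox, det["bbox"]) > self.overlap_threshold:
                poi.confidence = max(poi.confidence, det["confidence"])
                poi.priority = max(poi.priority, priority)
                self._sort()
                return poi
        # Steps 3-4: create the entry and insert by priority.
        poi = POI(det["bbox"], det["confidence"], scenario["name"],
                  scenario["investigation_type"], priority)
        self.items.append(poi)
        self._sort()
        # Step 5: drop the lowest-priority entries beyond max_size.
        dropped = self.items[self.max_size:]
        del self.items[self.max_size:]
        return None if any(p is poi for p in dropped) else poi
```

A caller would loop over every detection and every active scenario and call `offer()` for each pair; `items[0]` is then the POI that F2 investigates first.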
### Data Flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | ScanController | GimbalDriver | target pan/tilt/zoom | GimbalCommand |
| 2 | Camera | ScanController | raw frame | 1920x1080 |
| 3 | ScanController | Tier1Detector | frame buffer | numpy array (HWC) |
| 4 | Tier1Detector | ScanController | detection array | list of dicts |
| 5 | ScanController | Logger/Recorder | frame + detections | JPEG + JSON-lines |
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Gimbal timeout | GimbalDriver | No response within 2s | Retry 3x, then set gimbal_available=false, continue with fixed camera |
| YOLOE inference failure | Tier1Detector | Exception / timeout | Skip frame, log error, continue |
| Frame quality too low | ScanController | Laplacian variance < threshold | Skip frame, continue to next |
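
The "Laplacian variance < threshold" check in the last row can be illustrated as follows. This is a dependency-free sketch on a grayscale image stored as a list of rows; the function names and the default threshold of 100.0 are placeholders, since the document does not state the threshold. On the target hardware the same measure is commonly computed with OpenCV as `cv2.Laplacian(gray, cv2.CV_64F).var()`.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a list of rows of intensities and must be at least 3x3.
    Low variance means few sharp edges, i.e. a blurry frame.
    """
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def is_sharp_enough(gray, threshold=100.0):
    """Return False when the frame should be skipped (threshold is a placeholder)."""
    return laplacian_variance(gray) >= threshold
```

A uniformly gray frame yields a variance of zero and is skipped, while a frame with strong edges passes.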
### Performance Expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Sweep cycle time | 100-200ms per position | Tier 1 inference + gimbal command |
| Full sweep coverage | ≤10s per left-right cycle | Depends on sweep angle range and step size |
---
## Flow F2: Level 2 Detailed Investigation
### Description
The camera zooms in on the highest-priority POI and runs Tier 2 and Tier 3 analysis as needed. After analysis completes or the investigation times out, the system returns to Level 1. The investigation type is determined by the POI's search scenario:
- `path_follow`: F3 activates with mask tracing.
- `cluster_follow`: F3 activates with cluster tracing (visits each member in order).
- `area_sweep`: slow pan at high zoom.
- `zoom_classify`: hold zoom and classify.
### Preconditions
- POI queue has at least 1 entry
- gimbal_available == true
- Tier 1 engine loaded
### Sequence Diagram
```mermaid
sequenceDiagram
participant SC as ScanController
participant GD as GimbalDriver
participant T1 as Tier1Detector
participant T2 as Tier2SpatialAnalyzer
participant VLM as VLMProcess
participant Log as Logger/Recorder
SC->>GD: zoom_to_poi(poi.coords, zoom=high)
Note over GD: 1-2s zoom transition
GD-->>SC: zoom_complete
loop Until timeout or analysis complete
SC->>SC: health_check()
Note over SC: capture zoomed frame
SC->>T1: process_frame(zoomed_frame)
SC->>Log: record_frame(frame, level=2)
T1-->>SC: detections[]
alt Footpath detected
SC->>T2: trace_path(footpath_mask)
T2->>T2: skeletonize + find endpoints
T2-->>SC: endpoints[], skeleton
SC->>SC: evaluate endpoints (V1 heuristic: darkness + contrast)
alt Endpoint is dark mass → HIGH confidence
SC->>Log: log_detection(tier=2, class=concealed_position)
else Ambiguous endpoint AND vlm_available
SC->>VLM: analyze_roi(endpoint_crop, prompt)
VLM-->>SC: vlm_response
SC->>Log: log_detection(tier=3, vlm_result)
else Ambiguous endpoint AND NOT vlm_available
SC->>Log: log_detection(tier=2, class=uncertain)
end
SC->>GD: follow_path(skeleton.direction)
Note over SC: Activates F3
else No footpath, other POI type
SC->>T2: analyze_roi(poi_region)
T2-->>SC: classification
SC->>Log: log_detection(tier=2)
end
end
SC->>Log: report_detections_to_operator()
SC->>GD: return_to_sweep(zoom=medium)
Note over SC: Back to F1
```
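
The endpoint branch in the diagram reduces to a three-way decision. The sketch below assumes the V1 darkness-and-contrast heuristic has already produced a boolean, and that the VLM call is passed in as a callable that returns `None` on timeout; `evaluate_endpoint` and its parameters are illustrative names, not the project's API.

```python
def evaluate_endpoint(is_dark_mass, vlm_available, analyze_roi):
    """Return (tier, result) for one path endpoint.

    is_dark_mass  -- outcome of the V1 heuristic (computed elsewhere)
    vlm_available -- capability flag maintained by the health check
    analyze_roi   -- zero-argument callable wrapping the VLM request;
                     assumed to return None when the VLM times out
    """
    if is_dark_mass:
        return (2, "concealed_position")
    if vlm_available:
        response = analyze_roi()
        if response is not None:
            return (3, response)
    # Ambiguous endpoint with no usable Tier 3 answer.
    return (2, "uncertain")
```

Treating a VLM timeout the same as an unavailable VLM keeps the Tier 2 result as the fallback in both cases.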
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Zoom transition timeout | GimbalDriver | No confirmation within 3s | Proceed with current zoom |
| VLM timeout | VLMProcess | No response within 5s | Skip Tier 3, report Tier 2 result only |
| VLM crash | VLMProcess | IPC connection refused | Set vlm_available=false, continue Tier 1+2 |
| Investigation timeout | ScanController | Timer exceeds limit (default 10s) | Return to L1, mark POI as "timeout" |
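
The investigation timeout in the last row could be enforced with a monotonic deadline around the F2 loop. This is a sketch under assumptions: `investigate`, the `step` callable (one loop iteration that returns `True` when analysis is complete), and the injectable `clock` are hypothetical names introduced for the example.

```python
import time


def investigate(step, timeout_s=10.0, clock=time.monotonic):
    """Run `step()` until it reports completion or the deadline passes.

    Returns "complete" or "timeout"; the caller marks the POI accordingly
    and returns to Level 1 in both cases.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if step():
            return "complete"
    return "timeout"
```

Using `time.monotonic` rather than wall-clock time keeps the budget unaffected by system clock adjustments, and passing the clock in makes the loop testable without sleeping.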
### Performance Expectations
| Metric | Target | Notes |
|--------|--------|-------|
| L1→L2 transition | ≤2s | Including zoom |
| Per-POI investigation | ≤10s (configurable) | Including VLM if triggered |
| Return to L1 | ≤2s | Zoom-out + first sweep position |
---
## Flow F3: Path / Cluster Following
### Description
Activated from F2 when the investigation type is `path_follow` or `cluster_follow`. The Tier2SpatialAnalyzer produces a `SpatialAnalysisResult` with ordered waypoints and a trajectory. The gimbal follows the trajectory, visiting each waypoint for analysis.
**Mask trace mode** (path_follow): footpath skeleton provides a continuous trajectory. PID control keeps the path centered. At each waypoint (endpoint), camera holds for analysis.
**Cluster trace mode** (cluster_follow): discrete detections provide point-to-point waypoints. Gimbal moves between points in nearest-neighbor order. At each waypoint, camera zooms in for detailed Tier 1 + heuristic/VLM analysis.
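
For cluster trace mode, a nearest-neighbor visit order can be computed greedily. The sketch assumes waypoints are 2D tuples (for example pan/tilt angles or pixel coordinates) and that the current gimbal aim point is the start; `nearest_neighbor_order` is an illustrative name, not necessarily how `Tier2SpatialAnalyzer` builds its visit order.

```python
import math


def nearest_neighbor_order(start, points):
    """Greedy ordering: always move to the closest unvisited point."""
    remaining = list(points)
    order = []
    current = start
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        order.append(nxt)
        current = nxt
    return order
```

The greedy order is not the shortest possible tour, but it is cheap to compute and adequate for the handful of members a cluster typically contains.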
### Preconditions
- Level 2 active (F2)
- SpatialAnalysisResult available with >= 1 waypoint
### Flowchart (mask trace)
```mermaid
flowchart TD
Start([SpatialAnalysisResult available]) --> CheckType{pattern_type?}
CheckType -->|mask_trace| ComputeDir[Compute path direction from trajectory]
ComputeDir --> SetTarget[Set gimbal PID target: path direction]
SetTarget --> PanLoop{Path still visible in frame?}
PanLoop -->|Yes| UpdatePID[PID update: adjust pan/tilt to center path]
UpdatePID --> SendCmd[Send gimbal command]
SendCmd --> CaptureFrame[Capture next frame]
CaptureFrame --> RunT1[Tier 1 on new frame]
RunT1 --> UpdateSkeleton[Update skeleton from new mask]
UpdateSkeleton --> CheckWaypoint{Reached next waypoint?}
CheckWaypoint -->|No| PanLoop
CheckWaypoint -->|Yes| HoldCamera[Hold camera on waypoint]
HoldCamera --> AnalyzeWaypoint([Tier 2/3 waypoint analysis in F2])
PanLoop -->|No, path lost| FallbackCentroid[Use last known direction]
FallbackCentroid --> RetryDetect{Re-detect within 3 frames?}
RetryDetect -->|Yes| UpdateSkeleton
RetryDetect -->|No| AbortFollow([Return to F2 main loop])
CheckType -->|cluster_trace| InitVisit[Get first waypoint from visit order]
InitVisit --> MoveToPoint[Move gimbal to waypoint position]
MoveToPoint --> CaptureZoomed[Capture zoomed frame]
CaptureZoomed --> RunT1Zoomed[Tier 1 on zoomed frame]
RunT1Zoomed --> ClassifyPoint[Heuristic / VLM classify]
ClassifyPoint --> LogPoint[Log detection for this waypoint]
LogPoint --> NextPoint{More waypoints?}
NextPoint -->|Yes| MoveToPoint
NextPoint -->|No| ClusterDone([Log cluster summary, return to F2])
```
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Path lost from frame | Tier1Detector | No footpath mask in 3 consecutive frames | Abort follow, return to F2 |
| Cluster member not visible at zoom | Tier1Detector | No target_class at waypoint position | Log as unconfirmed, proceed to next waypoint |
| PID oscillation | GimbalDriver | Direction changes >5 times in 1s | Hold position, re-acquire |
| Gimbal at physical limit | GimbalDriver | Pan/tilt at max angle | Analyze what's visible, return to F2 |
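
The PID oscillation rule ("direction changes >5 times in 1s") can be checked with a sliding window of sign-change timestamps. The class below is a sketch for a single axis; `OscillationDetector` and its parameters are names chosen for illustration.

```python
from collections import deque


class OscillationDetector:
    def __init__(self, max_changes=5, window_s=1.0):
        self.max_changes = max_changes
        self.window_s = window_s
        self._change_times = deque()
        self._last_sign = 0

    def update(self, delta, now):
        """Feed one commanded axis delta; return True if oscillating.

        `delta` is the signed change in the commanded angle and `now`
        is a monotonic timestamp in seconds.
        """
        sign = (delta > 0) - (delta < 0)
        if sign != 0:
            if self._last_sign != 0 and sign != self._last_sign:
                self._change_times.append(now)
            self._last_sign = sign
        # Forget direction changes older than the window.
        while self._change_times and now - self._change_times[0] > self.window_s:
            self._change_times.popleft()
        return len(self._change_times) > self.max_changes
```

When `update()` returns `True`, the caller would hold position and re-acquire the path, as the recovery column describes.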
### Performance Expectations
| Metric | Target | Notes |
|--------|--------|-------|
| PID update rate | 10 Hz | Gimbal command every 100ms (mask trace) |
| Path centering | Path within center 50% of frame | AC requirement (mask trace) |
| Follow duration | ≤5s per path segment | Part of total POI budget |
| Cluster visit | ≤3s per waypoint | Zoom + capture + classify (cluster trace) |
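
The 10 Hz PID update used in mask trace mode can be sketched with a textbook single-axis controller. The gains and output limit are hypothetical tuning values, the error is assumed to be the path's normalized offset from frame center, and the class has no anti-windup; it illustrates the control step only, not the project's controller.

```python
class PID:
    def __init__(self, kp, ki, kd, output_limit):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.output_limit = output_limit
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return a clamped rate command for one axis.

        error -- path offset from frame center (assumed normalized)
        dt    -- time since the previous update, 0.1 s at 10 Hz
        """
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no derivative term on the first sample
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        out = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(-self.output_limit, min(self.output_limit, out))
```

One controller instance per axis (pan and tilt) would be updated every 100 ms, with the output sent as the gimbal command.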
---
## Flow F4: Health & Degradation
### Description
Health checks are performed inline at the top of each main-loop iteration (not a separate thread). The scan controller reads sensor values and sets capability flags that control what features are available.
### Capability Flags
| Flag | Default | Set to false when | Effect |
|------|---------|-------------------|--------|
| vlm_available | true | VLM process crashed 3x, or T_junction > 75°C, or power > 80% budget | Tier 3 skipped; Tier 1+2 continue |
| gimbal_available | true | Gimbal UART failed 3x | Fixed camera; L2 zoom disabled; L1 sweep disabled |
| semantic_available | true | Semantic process crashed 3x, or T_junction > 80°C | Existing YOLO only |
### Inline Health Check (runs each iteration)
```
1. Read T_junction from tegrastats
2. If T_junction > 80°C → semantic_available = false
3. If T_junction > 75°C → vlm_available = false
4. If last gimbal response > 4s ago → gimbal_available = false
5. If VLM IPC failed last 3 attempts → vlm_available = false
6. If all clear and T_junction < 70°C → restore flags to true
```
No separate monitoring thread. No formal state machine. Flags are checked wherever decisions depend on them.
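
The six inline steps can be written as one function over a small flag container. This is a sketch: the `Capabilities` dataclass and the `health_check` signature are assumptions, sensor reading is left to the caller, and only the six listed steps are covered (the crash-count and power-budget conditions from the Capability Flags table are not modeled).

```python
from dataclasses import dataclass


@dataclass
class Capabilities:
    vlm_available: bool = True
    gimbal_available: bool = True
    semantic_available: bool = True


def health_check(caps, t_junction_c, gimbal_silence_s, vlm_ipc_failures):
    """Update capability flags in place from the latest readings.

    t_junction_c     -- junction temperature in deg C (step 1, read by caller)
    gimbal_silence_s -- seconds since the last gimbal response
    vlm_ipc_failures -- consecutive failed VLM IPC attempts
    """
    all_clear = True
    if t_junction_c > 80.0:            # step 2
        caps.semantic_available = False
        all_clear = False
    if t_junction_c > 75.0:            # step 3
        caps.vlm_available = False
        all_clear = False
    if gimbal_silence_s > 4.0:         # step 4
        caps.gimbal_available = False
        all_clear = False
    if vlm_ipc_failures >= 3:          # step 5
        caps.vlm_available = False
        all_clear = False
    if all_clear and t_junction_c < 70.0:   # step 6
        caps.vlm_available = True
        caps.gimbal_available = True
        caps.semantic_available = True
    return caps
```

Because flags are restored only below 70°C, a reading between 70°C and 75°C leaves them unchanged, which gives the thermal thresholds a hysteresis band and avoids rapid toggling.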
---
## Flow F5: System Startup
### Description
Power-on sequence: load models, initialize gimbal, begin Level 1 scan.
### Sequence Diagram
```mermaid
sequenceDiagram
participant Boot as JetPack Boot
participant Main as MainProcess
participant T1 as Tier1Detector
participant GD as GimbalDriver
participant SC as ScanController
Boot->>Main: OS ready
Main->>T1: load YOLOE TRT engine
T1-->>Main: engine ready (~10-20s)
Main->>GD: initialize gimbal (UART open, handshake)
GD-->>Main: gimbal ready (or gimbal_available=false)
Main->>Main: initialize logger, recorder (NVMe check)
Main->>SC: start Level 1 scan (F1)
```
### Performance Expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Total startup | ≤60s | Power-on to first detection |
| TRT engine load | ≤20s | Model size + NVMe speed |
| Gimbal handshake | ≤2s | UART open + version check |