detections-semantic/_docs/02_plans/data_model.md
Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit
Made-with: Cursor
2026-03-26 00:20:30 +02:00


Semantic Detection System — Data Model

Design Principle

This is a real-time streaming pipeline on an edge device, not a CRUD application. There is no database. Data falls into two categories:

  1. Runtime structs: in-memory only and never persisted. Most exist for one processing cycle and are then discarded; POI queue entries and the single GimbalState instance live longer.
  2. Persistent logs: append-only flat files on NVMe SSD

Runtime Structs (in-memory only)

These are C/Cython structs or Python dataclasses. Most are created, consumed by the next pipeline stage, and garbage-collected; POI entries and the single GimbalState instance outlive a cycle. None of them is persisted.

FrameContext

Wraps a camera frame with metadata for the current processing cycle.

| Field | Type | Description |
|---|---|---|
| frame_id | uint64 | Sequential counter |
| timestamp | float64 | Capture time (epoch seconds) |
| image | numpy array (H,W,3) | Raw frame pixels |
| scan_level | uint8 | 1 or 2 |
| quality_score | float32 | Laplacian variance (computed on capture) |
| pan | float32 | Gimbal pan at capture |
| tilt | float32 | Gimbal tilt at capture |
| zoom | float32 | Zoom level at capture |
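
As an illustration of how quality_score could be filled in at capture time, here is a minimal sketch that builds a FrameContext as a Python dataclass. The 4-neighbour Laplacian kernel, the channel-average grayscale conversion, and the helper names `laplacian_variance` and `make_frame_context` are assumptions made for this example, not the project's confirmed implementation.

```python
import time
from dataclasses import dataclass

import numpy as np


def laplacian_variance(image: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian; higher values mean a sharper frame."""
    # Assumption: grayscale is a plain channel average of the (H, W, 3) frame.
    gray = image.astype(np.float32).mean(axis=2)
    lap = (
        gray[:-2, 1:-1] + gray[2:, 1:-1]      # pixels above and below
        + gray[1:-1, :-2] + gray[1:-1, 2:]    # pixels left and right
        - 4.0 * gray[1:-1, 1:-1]              # centre pixel
    )
    return float(lap.var())


@dataclass
class FrameContext:
    frame_id: int
    timestamp: float
    image: np.ndarray
    scan_level: int
    quality_score: float
    pan: float
    tilt: float
    zoom: float


def make_frame_context(frame_id, image, scan_level, pan, tilt, zoom) -> FrameContext:
    """Wrap a captured frame and compute its quality score once, on capture."""
    if scan_level not in (1, 2):
        raise ValueError("scan_level must be 1 or 2")
    return FrameContext(
        frame_id=frame_id,
        timestamp=time.time(),
        image=image,
        scan_level=scan_level,
        quality_score=laplacian_variance(image),
        pan=pan,
        tilt=tilt,
        zoom=zoom,
    )
```

A uniform frame scores 0.0 under this measure, so a downstream stage could skip frames below some configured sharpness threshold.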

YoloDetection (external input)

Received from existing YOLO pipeline. Consumed by semantic pipeline, not stored.

| Field | Type | Description |
|---|---|---|
| centerX | float32 | Normalized center X (0-1) |
| centerY | float32 | Normalized center Y (0-1) |
| width | float32 | Normalized width (0-1) |
| height | float32 | Normalized height (0-1) |
| classNum | int32 | Class index |
| label | string | Class label |
| confidence | float32 | 0-1 |
| mask | numpy array (H,W) | Segmentation mask (if seg model) |
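
Because the box is given as a normalized center and size, a consumer usually has to convert it to pixel corners before cropping. The sketch below mirrors the fields above in a dataclass and shows one way to do that conversion; the helper name `to_pixel_box` and the clamping behavior are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class YoloDetection:
    centerX: float
    centerY: float
    width: float
    height: float
    classNum: int
    label: str
    confidence: float
    mask: Optional[Any] = None  # numpy (H, W) array when a segmentation model is used


def to_pixel_box(det: YoloDetection, frame_w: int, frame_h: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in pixels, clamped to the frame."""
    left = (det.centerX - det.width / 2.0) * frame_w
    right = (det.centerX + det.width / 2.0) * frame_w
    top = (det.centerY - det.height / 2.0) * frame_h
    bottom = (det.centerY + det.height / 2.0) * frame_h

    def clamp(value: float, upper: int) -> int:
        return max(0, min(upper, int(round(value))))

    return clamp(left, frame_w), clamp(top, frame_h), clamp(right, frame_w), clamp(bottom, frame_h)
```

For example, a detection centered at (0.5, 0.5) with width 0.25 and height 0.5 on a 640x480 frame maps to the pixel box (240, 120, 400, 360).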

POI

In-memory queue entry. Created when Tier 1 detects a point of interest, removed after investigation or timeout. Max size configurable (default 10).

| Field | Type | Description |
|---|---|---|
| poi_id | uint64 | Counter |
| frame_id | uint64 | Frame that triggered this POI |
| trigger_class | string | Class that triggered (footpath_winter, branch_pile, etc.) |
| scenario_name | string | Which search scenario triggered this POI |
| investigation_type | string | "path_follow", "area_sweep", or "zoom_classify" |
| confidence | float32 | Trigger confidence |
| bbox | float32[4] | Bounding box in frame |
| priority | float32 | Computed: confidence × priority_boost × recency |
| status | enum | queued / investigating / done / timeout |
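
The priority formula and the bounded queue can be sketched as follows. Several details here are assumptions not stated in this document: recency is modeled as exponential decay with a hypothetical 30-second half-life, `created_at` is a helper field added to derive it, `priority_boost` is assumed to come from the scenario configuration, and a full queue is assumed to evict its lowest-priority entry.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class PoiStatus(Enum):
    QUEUED = "queued"
    INVESTIGATING = "investigating"
    DONE = "done"
    TIMEOUT = "timeout"


@dataclass
class POI:
    poi_id: int
    frame_id: int
    trigger_class: str
    scenario_name: str
    investigation_type: str
    confidence: float
    bbox: Tuple[float, float, float, float]
    created_at: float              # hypothetical helper field (epoch seconds)
    priority_boost: float = 1.0    # assumed to come from the scenario config
    status: PoiStatus = PoiStatus.QUEUED

    def priority(self, now: float, half_life_s: float = 30.0) -> float:
        """confidence x priority_boost x recency, with recency decaying over time."""
        recency = 0.5 ** ((now - self.created_at) / half_life_s)
        return self.confidence * self.priority_boost * recency


class PoiQueue:
    def __init__(self, max_size: int = 10) -> None:
        self.max_size = max_size
        self._items: List[POI] = []

    def __len__(self) -> int:
        return len(self._items)

    def push(self, poi: POI, now: float) -> None:
        """Add a POI; if the queue overflows, drop the lowest-priority entry."""
        self._items.append(poi)
        if len(self._items) > self.max_size:
            worst = min(self._items, key=lambda p: p.priority(now))
            self._items.remove(worst)

    def pop_next(self, now: float) -> Optional[POI]:
        """Remove and return the highest-priority POI, marking it as investigating."""
        if not self._items:
            return None
        best = max(self._items, key=lambda p: p.priority(now))
        self._items.remove(best)
        best.status = PoiStatus.INVESTIGATING
        return best
```

Priorities are recomputed on every push and pop rather than cached, because the recency term changes with time; with a queue of about 10 entries a linear scan is cheap.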

GimbalState

Current gimbal position. Single instance, updated on every gimbal feedback message.

| Field | Type | Description |
|---|---|---|
| pan | float32 | Current pan angle |
| tilt | float32 | Current tilt angle |
| zoom | float32 | Current zoom level |
| target_pan | float32 | Commanded pan |
| target_tilt | float32 | Commanded tilt |
| target_zoom | float32 | Commanded zoom |
| last_heartbeat | float64 | Last response timestamp |
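
A minimal sketch of the single GimbalState instance, showing how feedback messages might update it and how last_heartbeat could drive an availability check. The method names, the 2-second timeout, and the tolerances are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class GimbalState:
    pan: float = 0.0
    tilt: float = 0.0
    zoom: float = 1.0
    target_pan: float = 0.0
    target_tilt: float = 0.0
    target_zoom: float = 1.0
    last_heartbeat: float = 0.0  # epoch seconds of the last gimbal response

    def command(self, pan: float, tilt: float, zoom: float) -> None:
        """Record the commanded position."""
        self.target_pan, self.target_tilt, self.target_zoom = pan, tilt, zoom

    def on_feedback(self, pan: float, tilt: float, zoom: float, now: float) -> None:
        """Update the measured position from a feedback message."""
        self.pan, self.tilt, self.zoom = pan, tilt, zoom
        self.last_heartbeat = now

    def is_available(self, now: float, timeout_s: float = 2.0) -> bool:
        """True if the gimbal has responded within timeout_s (assumed threshold)."""
        return (now - self.last_heartbeat) <= timeout_s

    def on_target(self, angle_tol: float = 0.5, zoom_tol: float = 0.05) -> bool:
        """True if the measured position is within tolerance of the command."""
        return (
            abs(self.pan - self.target_pan) <= angle_tol
            and abs(self.tilt - self.target_tilt) <= angle_tol
            and abs(self.zoom - self.target_zoom) <= zoom_tol
        )
```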

Persistent Data (NVMe flat files)

DetectionLogEntry → detections.jsonl

One JSON line per confirmed detection. This is the primary system output.

| Field | Type | Required | Description |
|---|---|---|---|
| ts | string (ISO 8601) | Yes | Detection timestamp |
| frame_id | uint64 | Yes | Source frame |
| gps_denied_lat | float64 | No | GPS-denied latitude (null if unavailable) |
| gps_denied_lon | float64 | No | GPS-denied longitude |
| tier | uint8 | Yes | 1, 2, or 3 |
| class | string | Yes | Detection class label |
| confidence | float32 | Yes | 0-1 |
| bbox | float32[4] | Yes | centerX, centerY, width, height (normalized) |
| freshness | string | No | "high_contrast" / "low_contrast" (footpaths only) |
| tier2_result | string | No | Tier 2 classification |
| tier2_confidence | float32 | No | Tier 2 confidence |
| tier3_used | bool | Yes | Whether VLM was invoked |
| thumbnail_path | string | No | Saved ROI thumbnail path |
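
One way to produce these lines is sketched below: build a dict with the required fields, fill every optional field with null when it is absent, and append a single JSON line. The helper names and the choice to write null rather than omit missing optional keys are assumptions; the table only specifies null for the GPS-denied latitude.

```python
import json
from datetime import datetime, timezone

OPTIONAL_FIELDS = (
    "gps_denied_lat", "gps_denied_lon", "freshness",
    "tier2_result", "tier2_confidence", "thumbnail_path",
)


def build_detection_entry(frame_id, tier, cls, confidence, bbox, tier3_used, **optional):
    """Build one DetectionLogEntry dict; absent optional fields become null."""
    unknown = set(optional) - set(OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if tier not in (1, 2, 3):
        raise ValueError("tier must be 1, 2, or 3")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "frame_id": int(frame_id),
        "tier": int(tier),
        "class": cls,
        "confidence": float(confidence),
        "bbox": [float(v) for v in bbox],  # centerX, centerY, width, height
        "tier3_used": bool(tier3_used),
    }
    for name in OPTIONAL_FIELDS:
        entry[name] = optional.get(name)
    return entry


def append_jsonl(path, entry):
    """Append one entry as a single compact JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, separators=(",", ":")) + "\n")
```

Opening the file in append mode for each write keeps the log append-only; a long-running process might instead keep the handle open and flush periodically.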

HealthLogEntry → health.jsonl

One JSON line per second. System health snapshot.

| Field | Type | Description |
|---|---|---|
| ts | string (ISO 8601) | Timestamp |
| t_junction | float32 | Junction temperature °C |
| power_watts | float32 | Power draw |
| gpu_mem_mb | uint32 | GPU memory used |
| vlm_available | bool | VLM capability flag |
| gimbal_available | bool | Gimbal capability flag |
| semantic_available | bool | Semantic capability flag |

RecordedFrame → frames/{frame_id}.jpg

JPEG file per recorded frame. Metadata embedded in filename (frame_id). Correlation to detections via frame_id in detections.jsonl.
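
Post-flight tooling could join frames to detections through frame_id roughly as follows. This is a sketch under the layout described above (frames/{frame_id}.jpg plus detections.jsonl); the function names are hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path


def index_detections_by_frame(jsonl_path):
    """Map frame_id -> list of detection entries read from detections.jsonl."""
    index = defaultdict(list)
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            entry = json.loads(line)
            index[entry["frame_id"]].append(entry)
    return dict(index)


def frame_path(frames_dir, frame_id):
    """Path of the recorded JPEG for a given frame_id."""
    return Path(frames_dir) / f"{frame_id}.jpg"
```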

Config → config.yaml

Single YAML file with all runtime parameters. Versioned (version: 1 field). Updated via USB.

What Is NOT Stored

| Item | Why not |
|---|---|
| FootpathMask (segmentation mask) | Transient. Exists ~50ms during Tier 2 processing. Too large to log (HxW binary). |
| PathSkeleton | Transient. Derivative of mask. |
| EndpointROI crop | Thumbnail saved only if detection confirmed. Raw crop discarded. |
| YoloDetection input | External system's data. We consume it, don't archive it. |
| POI queue state | Runtime queue. Not useful after flight. Detections capture the outcomes. |
| Raw VLM response text | Optionally logged inside DetectionLogEntry if tier3_used=true. Not stored separately. |

Storage Budget (256GB NVMe)

| Data | Write Rate | Per Hour | 4-Hour Flight |
|---|---|---|---|
| detections.jsonl | ~1 KB/detection, ~100 detections/min | ~6 MB | ~24 MB |
| health.jsonl | ~200 bytes/s | ~720 KB | ~3 MB |
| frames/ (L1, 2 FPS) | ~100 KB/frame, 2 FPS | ~720 MB | ~2.9 GB |
| frames/ (L2, 30 FPS) | ~100 KB/frame, 30 FPS | ~10.8 GB | ~43 GB (if L2 100% of time) |
| gimbal.log | ~50 bytes/command, 10 Hz | ~1.8 MB | ~7 MB |
| Total (typical L1-heavy) | | ~1.5 GB | ~6 GB |
| Total (L2-heavy) | | ~11 GB | ~46 GB |

A 256 GB NVMe comfortably supports 5+ typical flights (~6 GB each) or 5+ hours of L2-heavy operation (~11 GB/hour) before the circular buffer kicks in.
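
The per-hour figures in the budget follow directly from the write rates. The short calculation below reproduces them, assuming decimal units (1 KB = 1000 bytes) and treating every number as the approximate estimate given in the table; the helper names are illustrative.

```python
KB, MB, GB = 1e3, 1e6, 1e9  # decimal units, assumed to match the table


def bytes_per_hour(bytes_per_event: float, events_per_second: float) -> float:
    return bytes_per_event * events_per_second * 3600


RATES = {
    "detections.jsonl": bytes_per_hour(1 * KB, 100 / 60),  # ~100 detections/min
    "health.jsonl": bytes_per_hour(200, 1),                # one line per second
    "frames_l1": bytes_per_hour(100 * KB, 2),              # 2 FPS
    "frames_l2": bytes_per_hour(100 * KB, 30),             # 30 FPS
    "gimbal.log": bytes_per_hour(50, 10),                  # 10 Hz commands
}


def hours_until_full(capacity_bytes: float, rate_bytes_per_hour: float) -> float:
    """Hours of recording before the given capacity is exhausted."""
    return capacity_bytes / rate_bytes_per_hour


if __name__ == "__main__":
    for name, rate in RATES.items():
        print(f"{name}: {rate / MB:.1f} MB/hour")
    print(f"L2 frames alone fill 256 GB in {hours_until_full(256 * GB, RATES['frames_l2']):.1f} hours")
```

Frame recording dominates the budget by several orders of magnitude, so the share of time spent in L2 is the main factor in how quickly the disk fills.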

Migration Strategy

Not applicable — no relational database. Config changes handled by YAML versioning:

  • version: 1 field in config.yaml
  • New fields get defaults (backward-compatible)
  • Breaking changes: bump version, include migration notes in USB update package
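
The versioning rules above could be enforced at load time along these lines. The sketch operates on a dict that has already been parsed from config.yaml; the key `poi_queue_max_size` and the function name are hypothetical, chosen only to illustrate defaulting (the default of 10 is the POI queue size mentioned earlier).

```python
SUPPORTED_VERSION = 1

# Hypothetical defaults for fields added after the first release.
DEFAULTS = {
    "poi_queue_max_size": 10,
}


def load_config(raw: dict) -> dict:
    """Validate the version field and fill in defaults for missing keys."""
    version = raw.get("version")
    if version != SUPPORTED_VERSION:
        # A breaking change bumps the version; refuse to guess at a migration.
        raise ValueError(
            f"config version {version!r} not supported (expected {SUPPORTED_VERSION})"
        )
    merged = dict(DEFAULTS)
    merged.update(raw)  # explicit values in the file override defaults
    return merged
```

The merge is shallow, so nested sections would need their own defaulting if the real config uses them.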