mirror of https://github.com/azaion/detections-semantic.git synced 2026-04-22 22:16:37 +00:00

Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit

Made-with: Cursor

2026-03-26 00:20:30 +02:00

Semantic Detection System — Data Model

Design Principle

This is a real-time streaming pipeline on an edge device, not a CRUD application. There is no database. Data falls into two categories:

Runtime structs: in-memory only, typically live for one processing cycle, then discarded
Persistent logs: append-only flat files on NVMe SSD

Runtime Structs (in-memory only)

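These are C/Cython structs or Python dataclasses. Most are created, consumed by the next pipeline stage, and garbage-collected within one processing cycle. POI entries and the single GimbalState instance live longer, but none of them is ever written to disk.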

FrameContext

Wraps a camera frame with metadata for the current processing cycle.

| Field | Type | Description |
|---|---|---|
| frame_id | uint64 | Sequential counter |
| timestamp | float64 | Capture time (epoch seconds) |
| image | numpy array (H,W,3) | Raw frame pixels |
| scan_level | uint8 | 1 or 2 |
| quality_score | float32 | Laplacian variance (computed on capture) |
| pan | float32 | Gimbal pan at capture |
| tilt | float32 | Gimbal tilt at capture |
| zoom | float32 | Zoom level at capture |
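
A minimal sketch of how a FrameContext might be built, assuming the Python dataclass variant. The `laplacian_variance` helper and the `make_frame_context` factory are hypothetical names, and the 4-neighbour Laplacian is a plain NumPy stand-in for whatever filter the capture stage actually uses.

```python
from dataclasses import dataclass

import numpy as np


def laplacian_variance(image: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian; higher means a sharper frame."""
    gray = image.astype(np.float32).mean(axis=2)  # (H, W, 3) -> (H, W)
    lap = (
        gray[:-2, 1:-1] + gray[2:, 1:-1]      # up + down neighbours
        + gray[1:-1, :-2] + gray[1:-1, 2:]    # left + right neighbours
        - 4.0 * gray[1:-1, 1:-1]              # centre pixel
    )
    return float(lap.var())


@dataclass
class FrameContext:
    frame_id: int          # uint64 sequential counter
    timestamp: float       # capture time, epoch seconds
    image: np.ndarray      # (H, W, 3) raw pixels
    scan_level: int        # 1 or 2
    quality_score: float   # Laplacian variance
    pan: float
    tilt: float
    zoom: float


def make_frame_context(frame_id, timestamp, image, scan_level, pan, tilt, zoom):
    """Hypothetical factory: scores the frame once, at capture time."""
    return FrameContext(
        frame_id=frame_id,
        timestamp=timestamp,
        image=image,
        scan_level=scan_level,
        quality_score=laplacian_variance(image),
        pan=pan,
        tilt=tilt,
        zoom=zoom,
    )
```

A perfectly flat frame scores 0.0, and any frame with edges scores higher, so a downstream stage could skip blurred frames by thresholding `quality_score`. The threshold itself is not specified in this document.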

YoloDetection (external input)

Received from existing YOLO pipeline. Consumed by semantic pipeline, not stored.

| Field | Type | Description |
|---|---|---|
| centerX | float32 | Normalized center X (0-1) |
| centerY | float32 | Normalized center Y (0-1) |
| width | float32 | Normalized width (0-1) |
| height | float32 | Normalized height (0-1) |
| classNum | int32 | Class index |
| label | string | Class label |
| confidence | float32 | 0-1 |
| mask | numpy array (H,W) | Segmentation mask (if seg model) |
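
Because the box is normalized, a consumer has to scale it by the frame size before cropping. The sketch below assumes the values are normalized to the full frame with the origin at the top-left. `to_pixel_box` is a hypothetical helper, and the optional `mask` field is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class YoloDetection:
    centerX: float
    centerY: float
    width: float
    height: float
    classNum: int
    label: str
    confidence: float


def to_pixel_box(det, frame_w, frame_h):
    """Return (x1, y1, x2, y2) in pixels, clamped to the frame bounds."""
    cx = det.centerX * frame_w
    cy = det.centerY * frame_h
    half_w = det.width * frame_w / 2
    half_h = det.height * frame_h / 2
    x1 = max(0, int(round(cx - half_w)))
    y1 = max(0, int(round(cy - half_h)))
    x2 = min(frame_w, int(round(cx + half_w)))
    y2 = min(frame_h, int(round(cy + half_h)))
    return x1, y1, x2, y2
```

Clamping matters for detections that touch the frame edge, where the normalized box can extend slightly outside the 0-1 range.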

POI

In-memory queue entry. Created when Tier 1 detects a point of interest, removed after investigation or timeout. Max size configurable (default 10).

| Field | Type | Description |
|---|---|---|
| poi_id | uint64 | Counter |
| frame_id | uint64 | Frame that triggered this POI |
| trigger_class | string | Class that triggered (footpath_winter, branch_pile, etc.) |
| scenario_name | string | Which search scenario triggered this POI |
| investigation_type | string | "path_follow", "area_sweep", or "zoom_classify" |
| confidence | float32 | Trigger confidence |
| bbox | float32[4] | Bounding box in frame |
| priority | float32 | Computed: confidence × priority_boost × recency |
| status | enum | queued / investigating / done / timeout |
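
The priority formula and the bounded queue can be sketched as follows. Several details are assumptions rather than facts from this document: `recency` is modelled as an exponential decay with a 30-second half-life, `created_at` and `priority_boost` are stored on the entry, and when the queue overflows the lowest-priority entry is dropped. Only a subset of the POI fields is shown.

```python
from dataclasses import dataclass


@dataclass
class POI:
    poi_id: int
    frame_id: int
    trigger_class: str
    confidence: float
    created_at: float            # assumed field, needed to compute recency
    priority_boost: float = 1.0  # assumed to come from the scenario config
    status: str = "queued"


def priority(poi, now, half_life_s=30.0):
    """confidence x priority_boost x recency, with recency halving every half_life_s."""
    age = max(0.0, now - poi.created_at)
    recency = 0.5 ** (age / half_life_s)
    return poi.confidence * poi.priority_boost * recency


class POIQueue:
    def __init__(self, max_size=10):
        self.max_size = max_size
        self._items = []

    def __len__(self):
        return len(self._items)

    def push(self, poi, now):
        """Add a POI; if the queue overflows, evict the lowest-priority entry."""
        self._items.append(poi)
        if len(self._items) > self.max_size:
            lowest = min(self._items, key=lambda p: priority(p, now))
            self._items.remove(lowest)

    def pop_next(self, now):
        """Remove and return the highest-priority POI, or None if empty."""
        if not self._items:
            return None
        best = max(self._items, key=lambda p: priority(p, now))
        self._items.remove(best)
        best.status = "investigating"
        return best
```

Priorities are recomputed at every push and pop instead of being cached, because the recency term changes with time. With a default size of 10 the linear scan is negligible.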

GimbalState

Current gimbal position. Single instance, updated on every gimbal feedback message.

| Field | Type | Description |
|---|---|---|
| pan | float32 | Current pan angle |
| tilt | float32 | Current tilt angle |
| zoom | float32 | Current zoom level |
| target_pan | float32 | Commanded pan |
| target_tilt | float32 | Commanded tilt |
| target_zoom | float32 | Commanded zoom |
| last_heartbeat | float64 | Last response timestamp |
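
One way the single GimbalState instance could be used is sketched below. The method names, the 2-second heartbeat timeout, and the tolerances are all assumptions for illustration. The document only defines the fields.

```python
from dataclasses import dataclass


@dataclass
class GimbalState:
    pan: float = 0.0
    tilt: float = 0.0
    zoom: float = 1.0
    target_pan: float = 0.0
    target_tilt: float = 0.0
    target_zoom: float = 1.0
    last_heartbeat: float = 0.0

    def on_feedback(self, pan, tilt, zoom, now):
        """Overwrite the measured position and refresh the heartbeat."""
        self.pan, self.tilt, self.zoom = pan, tilt, zoom
        self.last_heartbeat = now

    def is_alive(self, now, timeout_s=2.0):
        """True if a feedback message arrived within the (assumed) timeout."""
        return (now - self.last_heartbeat) <= timeout_s

    def on_target(self, angle_tol=0.5, zoom_tol=0.05):
        """True once the measured position is within tolerance of the command."""
        return (
            abs(self.pan - self.target_pan) <= angle_tol
            and abs(self.tilt - self.target_tilt) <= angle_tol
            and abs(self.zoom - self.target_zoom) <= zoom_tol
        )
```

A check like `is_alive` would be a natural source for the `gimbal_available` flag in the health log, though the document does not say how that flag is derived.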

Persistent Data (NVMe flat files)

DetectionLogEntry → `detections.jsonl`

One JSON line per confirmed detection. This is the primary system output.

| Field | Type | Required | Description |
|---|---|---|---|
| ts | string (ISO 8601) | Yes | Detection timestamp |
| frame_id | uint64 | Yes | Source frame |
| gps_denied_lat | float64 | No | GPS-denied latitude (null if unavailable) |
| gps_denied_lon | float64 | No | GPS-denied longitude |
| tier | uint8 | Yes | 1, 2, or 3 |
| class | string | Yes | Detection class label |
| confidence | float32 | Yes | 0-1 |
| bbox | float32[4] | Yes | centerX, centerY, width, height (normalized) |
| freshness | string | No | "high_contrast" / "low_contrast" (footpaths only) |
| tier2_result | string | No | Tier 2 classification |
| tier2_confidence | float32 | No | Tier 2 confidence |
| tier3_used | bool | Yes | Whether VLM was invoked |
| thumbnail_path | string | No | Saved ROI thumbnail path |
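
A sketch of an append-only writer that enforces the Required column above. `append_detection` is a hypothetical name, and the validation shown is the minimum implied by the table, not the project's actual checks.

```python
import json

REQUIRED_FIELDS = ("ts", "frame_id", "tier", "class", "confidence", "bbox", "tier3_used")


def append_detection(path, entry):
    """Append one detection as a single JSON line; existing lines are never rewritten."""
    missing = [name for name in REQUIRED_FIELDS if name not in entry]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if entry["tier"] not in (1, 2, 3):
        raise ValueError("tier must be 1, 2, or 3")
    line = json.dumps(entry, separators=(",", ":"))
    with open(path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
```

Opening in `"a"` mode for each write keeps the file valid line by line, so an interrupted flight corrupts at most the final line.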

HealthLogEntry → `health.jsonl`

One JSON line per second. System health snapshot.

| Field | Type | Description |
|---|---|---|
| ts | string (ISO 8601) | Timestamp |
| t_junction | float32 | Junction temperature °C |
| power_watts | float32 | Power draw |
| gpu_mem_mb | uint32 | GPU memory used |
| vlm_available | bool | VLM capability flag |
| gimbal_available | bool | Gimbal capability flag |
| semantic_available | bool | Semantic capability flag |
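
Since each line is a self-contained JSON object written once per second, post-flight analysis can stream the file without loading it whole. The summary below is a hypothetical example of such a reader, and the chosen metrics are illustrative.

```python
import json


def summarize_health(lines):
    """Return peak temperature, peak power, and seconds with any capability down."""
    peak_temp = float("-inf")
    peak_power = float("-inf")
    degraded_s = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        peak_temp = max(peak_temp, entry["t_junction"])
        peak_power = max(peak_power, entry["power_watts"])
        flags = (
            entry["vlm_available"],
            entry["gimbal_available"],
            entry["semantic_available"],
        )
        if not all(flags):
            degraded_s += 1  # one log line represents one second
    return {
        "peak_t_junction": peak_temp,
        "peak_power_watts": peak_power,
        "degraded_seconds": degraded_s,
    }
```

Passing an open file object works directly, for example `summarize_health(open("health.jsonl", encoding="utf-8"))`.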

RecordedFrame → `frames/{frame_id}.jpg`

JPEG file per recorded frame. Metadata embedded in filename (frame_id). Correlation to detections via frame_id in detections.jsonl.
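
The correlation by frame_id can be sketched as a simple path lookup. Both helper names are hypothetical. The document does not say that every detection's frame is recorded, so a caller should check that the file exists before opening it.

```python
import json
from pathlib import PurePosixPath


def frame_path(frame_id, frames_dir="frames"):
    """Path implied by the frames/{frame_id}.jpg naming scheme."""
    return PurePosixPath(frames_dir) / f"{frame_id}.jpg"


def frames_for_detections(lines, frames_dir="frames"):
    """Map each detections.jsonl line to the JPEG path its frame_id implies."""
    paths = []
    for line in lines:
        if line.strip():
            entry = json.loads(line)
            paths.append(str(frame_path(entry["frame_id"], frames_dir)))
    return paths
```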

Config → `config.yaml`

Single YAML file with all runtime parameters. Versioned (version: 1 field). Updated via USB.

What Is NOT Stored

| Item | Why not |
|---|---|
| FootpathMask (segmentation mask) | Transient. Exists ~50 ms during Tier 2 processing. Too large to log (H×W binary). |
| PathSkeleton | Transient. Derivative of mask. |
| EndpointROI crop | Thumbnail saved only if detection confirmed. Raw crop discarded. |
| YoloDetection input | External system's data. We consume it, don't archive it. |
| POI queue state | Runtime queue. Not useful after flight. Detections capture the outcomes. |
| Raw VLM response text | Optionally logged inside DetectionLogEntry if tier3_used=true. Not stored separately. |

Storage Budget (256 GB NVMe)

| Data | Write Rate | Per Hour | 4-Hour Flight |
|---|---|---|---|
| detections.jsonl | ~1 KB/detection, ~100 detections/min | ~6 MB | ~24 MB |
| health.jsonl | ~200 bytes/s | ~720 KB | ~3 MB |
| frames/ (L1, 2 FPS) | ~100 KB/frame, 2 FPS | ~720 MB | ~2.9 GB |
| frames/ (L2, 30 FPS) | ~100 KB/frame, 30 FPS | ~10.8 GB | ~43 GB (if L2 100% of time) |
| gimbal.log | ~50 bytes/command, 10 Hz | ~1.8 MB | ~7 MB |
| Total (typical L1-heavy) | | ~1.5 GB | ~6 GB |
| Total (L2-heavy) | | ~11 GB | ~46 GB |

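At these rates, a 256 GB NVMe holds roughly 40 typical 4-hour flights (~6 GB each) or about 5 L2-heavy flights (~46 GB each, roughly 23 hours of L2-heavy operation) before the circular buffer kicks in. These figures ignore space taken by the OS, models, and other files.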
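
The per-hour figures can be reproduced with a few lines of arithmetic. The sketch assumes decimal units (1 KB = 1000 bytes), that L1 and L2 frame recording are mutually exclusive, and that the detection rate is the same at both levels. `hourly_bytes` and `hours_until_full` are hypothetical names.

```python
SECONDS_PER_HOUR = 3600


def hourly_bytes(l2_fraction):
    """Bytes written per hour when the given fraction of time is spent at L2."""
    detections = 1_000 * 100 * 60                   # ~1 KB each, ~100 per minute
    health = 200 * SECONDS_PER_HOUR                 # ~200 bytes/s
    gimbal = 50 * 10 * SECONDS_PER_HOUR             # ~50 bytes/command at 10 Hz
    frames_l1 = 100_000 * 2 * SECONDS_PER_HOUR      # ~100 KB/frame at 2 FPS
    frames_l2 = 100_000 * 30 * SECONDS_PER_HOUR     # ~100 KB/frame at 30 FPS
    frames = (1 - l2_fraction) * frames_l1 + l2_fraction * frames_l2
    return detections + health + gimbal + frames


def hours_until_full(l2_fraction, capacity_bytes=256e9):
    """Hours of recording before the drive is full, ignoring other files."""
    return capacity_bytes / hourly_bytes(l2_fraction)
```

Under these assumptions, pure L1 operation writes about 0.73 GB per hour and pure L2 about 10.8 GB per hour. An L2 share of roughly 8% gives about 1.5 GB per hour, in line with the typical L1-heavy row.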

Migration Strategy

Not applicable — no relational database. Config changes handled by YAML versioning:

version: 1 field in config.yaml
New fields get defaults (backward-compatible)
Breaking changes: bump version, include migration notes in USB update package
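
The versioning rules above can be sketched as a small loader that runs on the already-parsed YAML mapping (for instance the result of PyYAML's `yaml.safe_load`, assuming that library is used). The parameter names in `DEFAULTS` are hypothetical. Only the POI queue default of 10 comes from this document.

```python
SUPPORTED_VERSION = 1

# Hypothetical parameter names, for illustration only.
DEFAULTS = {
    "poi_queue_max_size": 10,
    "health_log_interval_s": 1.0,
}


def load_config(raw):
    """Check the version field, then fill in defaults for any missing keys."""
    version = raw.get("version")
    if version != SUPPORTED_VERSION:
        raise ValueError(
            f"unsupported config version {version!r}; see the migration notes"
        )
    # Values from the file win; defaults cover fields added after it was written.
    return {**DEFAULTS, **raw}
```

An older config file that lacks a newly added field still loads, which is what makes new fields backward-compatible. A file with a different version is rejected outright.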
