mirror of
https://github.com/azaion/detections-semantic.git
synced 2026-04-23 10:46:37 +00:00
8e2ecf50fd
Made-with: Cursor
149 lines
6.1 KiB
Markdown
149 lines
6.1 KiB
Markdown
# Semantic Detection System — Data Model
|
||
|
||
## Design Principle
|
||
|
||
This is a real-time streaming pipeline on an edge device, not a CRUD application. There is no database. Data falls into two categories:
|
||
|
||
1. **Runtime structs** — in-memory only, exist for one processing cycle, then discarded
|
||
2. **Persistent logs** — append-only flat files on NVMe SSD
|
||
|
||
## Runtime Structs (in-memory only)
|
||
|
||
These are C/Cython structs or Python dataclasses. They are created, consumed by the next pipeline stage, and garbage-collected. No persistence.
|
||
|
||
### FrameContext
|
||
|
||
Wraps a camera frame with metadata for the current processing cycle.
|
||
|
||
| Field | Type | Description |
|
||
|-------|------|-------------|
|
||
| frame_id | uint64 | Sequential counter |
|
||
| timestamp | float64 | Capture time (epoch seconds) |
|
||
| image | numpy array (H,W,3) | Raw frame pixels |
|
||
| scan_level | uint8 | 1 or 2 |
|
||
| quality_score | float32 | Laplacian variance (computed on capture) |
|
||
| pan | float32 | Gimbal pan at capture |
|
||
| tilt | float32 | Gimbal tilt at capture |
|
||
| zoom | float32 | Zoom level at capture |
|
||
|
||
### YoloDetection (external input)
|
||
|
||
Received from existing YOLO pipeline. Consumed by semantic pipeline, not stored.
|
||
|
||
| Field | Type | Description |
|
||
|-------|------|-------------|
|
||
| centerX | float32 | Normalized center X (0-1) |
|
||
| centerY | float32 | Normalized center Y (0-1) |
|
||
| width | float32 | Normalized width (0-1) |
|
||
| height | float32 | Normalized height (0-1) |
|
||
| classNum | int32 | Class index |
|
||
| label | string | Class label |
|
||
| confidence | float32 | 0-1 |
|
||
| mask | numpy array (H,W) | Segmentation mask (if seg model) |
|
||
|
||
### POI
|
||
|
||
In-memory queue entry. Created when Tier 1 detects a point of interest, removed after investigation or timeout. Max size configurable (default 10).
|
||
|
||
| Field | Type | Description |
|
||
|-------|------|-------------|
|
||
| poi_id | uint64 | Counter |
|
||
| frame_id | uint64 | Frame that triggered this POI |
|
||
| trigger_class | string | Class that triggered (footpath_winter, branch_pile, etc.) |
|
||
| scenario_name | string | Which search scenario triggered this POI |
|
||
| investigation_type | string | "path_follow", "area_sweep", or "zoom_classify" |
|
||
| confidence | float32 | Trigger confidence |
|
||
| bbox | float32[4] | Bounding box in frame |
|
||
| priority | float32 | Computed: confidence × priority_boost × recency |
|
||
| status | enum | queued / investigating / done / timeout |
|
||
|
||
### GimbalState
|
||
|
||
Current gimbal position. Single instance, updated on every gimbal feedback message.
|
||
|
||
| Field | Type | Description |
|
||
|-------|------|-------------|
|
||
| pan | float32 | Current pan angle |
|
||
| tilt | float32 | Current tilt angle |
|
||
| zoom | float32 | Current zoom level |
|
||
| target_pan | float32 | Commanded pan |
|
||
| target_tilt | float32 | Commanded tilt |
|
||
| target_zoom | float32 | Commanded zoom |
|
||
| last_heartbeat | float64 | Last response timestamp |
|
||
|
||
## Persistent Data (NVMe flat files)
|
||
|
||
### DetectionLogEntry → `detections.jsonl`
|
||
|
||
One JSON line per confirmed detection. This is the primary system output.
|
||
|
||
| Field | Type | Required | Description |
|
||
|-------|------|----------|-------------|
|
||
| ts | string (ISO 8601) | Yes | Detection timestamp |
|
||
| frame_id | uint64 | Yes | Source frame |
|
||
| gps_denied_lat | float64 | No | GPS-denied latitude (null if unavailable) |
|
||
| gps_denied_lon | float64 | No | GPS-denied longitude |
|
||
| tier | uint8 | Yes | 1, 2, or 3 |
|
||
| class | string | Yes | Detection class label |
|
||
| confidence | float32 | Yes | 0-1 |
|
||
| bbox | float32[4] | Yes | centerX, centerY, width, height (normalized) |
|
||
| freshness | string | No | "high_contrast" / "low_contrast" (footpaths only) |
|
||
| tier2_result | string | No | Tier 2 classification |
|
||
| tier2_confidence | float32 | No | Tier 2 confidence |
|
||
| tier3_used | bool | Yes | Whether VLM was invoked |
|
||
| thumbnail_path | string | No | Saved ROI thumbnail path |
|
||
|
||
### HealthLogEntry → `health.jsonl`
|
||
|
||
One JSON line per second. System health snapshot.
|
||
|
||
| Field | Type | Description |
|
||
|-------|------|-------------|
|
||
| ts | string (ISO 8601) | Timestamp |
|
||
| t_junction | float32 | Junction temperature °C |
|
||
| power_watts | float32 | Power draw |
|
||
| gpu_mem_mb | uint32 | GPU memory used |
|
||
| vlm_available | bool | VLM capability flag |
|
||
| gimbal_available | bool | Gimbal capability flag |
|
||
| semantic_available | bool | Semantic capability flag |
|
||
|
||
### RecordedFrame → `frames/{frame_id}.jpg`
|
||
|
||
JPEG file per recorded frame. Metadata embedded in filename (frame_id). Correlation to detections via frame_id in `detections.jsonl`.
|
||
|
||
### Config → `config.yaml`
|
||
|
||
Single YAML file with all runtime parameters. Versioned (`version: 1` field). Updated via USB.
|
||
|
||
## What Is NOT Stored
|
||
|
||
| Item | Why not |
|
||
|------|---------|
|
||
| FootpathMask (segmentation mask) | Transient. Exists ~50ms during Tier 2 processing. Too large to log (HxW binary). |
|
||
| PathSkeleton | Transient. Derivative of mask. |
|
||
| EndpointROI crop | Thumbnail saved only if detection confirmed. Raw crop discarded. |
|
||
| YoloDetection input | External system's data. We consume it, don't archive it. |
|
||
| POI queue state | Runtime queue. Not useful after flight. Detections capture the outcomes. |
|
||
| Raw VLM response text | Optionally logged inside DetectionLogEntry if tier3_used=true. Not stored separately. |
|
||
|
||
## Storage Budget (256GB NVMe)
|
||
|
||
| Data | Write Rate | Per Hour | 4-Hour Flight |
|
||
|------|-----------|----------|---------------|
|
||
| detections.jsonl | ~1 KB/detection, ~100 detections/min | ~6 MB | ~24 MB |
|
||
| health.jsonl | ~200 bytes/s | ~720 KB | ~3 MB |
|
||
| frames/ (L1, 2 FPS) | ~100 KB/frame, 2 FPS | ~720 MB | ~2.9 GB |
|
||
| frames/ (L2, 30 FPS) | ~100 KB/frame, 30 FPS | ~10.8 GB | ~43 GB (if L2 100% of time) |
|
||
| gimbal.log | ~50 bytes/command, 10 Hz | ~1.8 MB | ~7 MB |
|
||
| **Total (typical L1-heavy)** | | **~1.5 GB** | **~6 GB** |
|
||
| **Total (L2-heavy)** | | **~11 GB** | **~46 GB** |
|
||
|
||
256GB NVMe comfortably supports 5+ typical flights or 5+ hours of L2-heavy operation before circular buffer kicks in.
|
||
|
||
## Migration Strategy
|
||
|
||
Not applicable — no relational database. Config changes handled by YAML versioning:
|
||
- `version: 1` field in config.yaml
|
||
- New fields get defaults (backward-compatible)
|
||
- Breaking changes: bump version, include migration notes in USB update package
|