Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit
Made-with: Cursor
2026-03-26 00:20:30 +02:00

# Semantic Detection System — Data Model
## Design Principle
This is a real-time streaming pipeline on an edge device, not a CRUD application. There is no database. Data falls into two categories:
1. **Runtime structs**: in-memory only. Most exist for one processing cycle and are then discarded; queue entries and state objects live until they are consumed or overwritten.
2. **Persistent logs**: append-only flat files on NVMe SSD
## Runtime Structs (in-memory only)
These are C/Cython structs or Python dataclasses. They are created, consumed by the next pipeline stage, and garbage-collected. No persistence.
### FrameContext
Wraps a camera frame with metadata for the current processing cycle.
| Field | Type | Description |
|-------|------|-------------|
| frame_id | uint64 | Sequential counter |
| timestamp | float64 | Capture time (epoch seconds) |
| image | numpy array (H,W,3) | Raw frame pixels |
| scan_level | uint8 | 1 or 2 |
| quality_score | float32 | Laplacian variance (computed on capture) |
| pan | float32 | Gimbal pan at capture |
| tilt | float32 | Gimbal tilt at capture |
| zoom | float32 | Zoom level at capture |
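
As a rough illustration, `FrameContext` could be written as a Python dataclass. The pure-Python `laplacian_variance` helper is an assumption made to keep the sketch dependency-free; a real pipeline would more likely compute the score with OpenCV or NumPy on the raw frame.

```python
from dataclasses import dataclass
from typing import Any


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior of a 2D grid.

    `gray` is a list of rows of grayscale values (hypothetical input format).
    """
    h, w = len(gray), len(gray[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3")
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


@dataclass
class FrameContext:
    frame_id: int         # sequential counter
    timestamp: float      # capture time, epoch seconds
    image: Any            # numpy array (H, W, 3) in the real pipeline
    scan_level: int       # 1 or 2
    quality_score: float  # Laplacian variance, computed on capture
    pan: float
    tilt: float
    zoom: float
```

A uniform image scores 0.0; images with more edge detail score higher, which is why the value works as a blur indicator.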
### YoloDetection (external input)
Received from existing YOLO pipeline. Consumed by semantic pipeline, not stored.
| Field | Type | Description |
|-------|------|-------------|
| centerX | float32 | Normalized center X (0-1) |
| centerY | float32 | Normalized center Y (0-1) |
| width | float32 | Normalized width (0-1) |
| height | float32 | Normalized height (0-1) |
| classNum | int32 | Class index |
| label | string | Class label |
| confidence | float32 | 0-1 |
| mask | numpy array (H,W) | Segmentation mask (if seg model) |
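
Because the box fields are normalized, a consumer has to scale them by the frame size before cropping. A minimal sketch of that conversion, where `to_pixel_box` is a hypothetical helper and not part of the YOLO pipeline:

```python
from dataclasses import dataclass


@dataclass
class YoloDetection:
    centerX: float
    centerY: float
    width: float
    height: float
    classNum: int
    label: str
    confidence: float


def to_pixel_box(det, frame_w, frame_h):
    """Convert a normalized center/size box to pixel corners (x1, y1, x2, y2)."""
    x1 = (det.centerX - det.width / 2) * frame_w
    y1 = (det.centerY - det.height / 2) * frame_h
    x2 = (det.centerX + det.width / 2) * frame_w
    y2 = (det.centerY + det.height / 2) * frame_h
    # Clamp to the frame so boxes near the border stay valid crop ranges.
    clamp = lambda v, hi: max(0, min(hi, round(v)))
    return (clamp(x1, frame_w), clamp(y1, frame_h),
            clamp(x2, frame_w), clamp(y2, frame_h))
```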
### POI
In-memory queue entry. Created when Tier 1 detects a point of interest, removed after investigation or timeout. Max size configurable (default 10).
| Field | Type | Description |
|-------|------|-------------|
| poi_id | uint64 | Counter |
| frame_id | uint64 | Frame that triggered this POI |
| trigger_class | string | Class that triggered (footpath_winter, branch_pile, etc.) |
| scenario_name | string | Which search scenario triggered this POI |
| investigation_type | string | "path_follow", "area_sweep", or "zoom_classify" |
| confidence | float32 | Trigger confidence |
| bbox | float32[4] | Bounding box in frame |
| priority | float32 | Computed: confidence × priority_boost × recency |
| status | enum | queued / investigating / done / timeout |
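
The priority formula and the bounded queue could be sketched as follows. Several details are assumptions: the exponential `recency` decay and its half-life, the `created_at` and `priority_boost` attributes (not fields in the table above), and the eviction rule that drops the lowest-priority entry when the queue is full.

```python
from dataclasses import dataclass


@dataclass
class POI:
    poi_id: int
    frame_id: int
    trigger_class: str
    confidence: float
    created_at: float            # assumed: epoch seconds when queued
    priority_boost: float = 1.0  # assumed: supplied by the search scenario
    status: str = "queued"


def priority(poi, now, half_life_s=10.0):
    """confidence x priority_boost x recency, with an assumed exponential decay."""
    recency = 0.5 ** ((now - poi.created_at) / half_life_s)
    return poi.confidence * poi.priority_boost * recency


class POIQueue:
    def __init__(self, max_size=10):
        self.max_size = max_size
        self.items = []

    def push(self, poi, now):
        self.items.append(poi)
        if len(self.items) > self.max_size:
            # Assumed policy: evict the entry with the lowest current priority.
            lowest = min(self.items, key=lambda p: priority(p, now))
            self.items.remove(lowest)

    def pop_best(self, now):
        """Remove and return the highest-priority POI, or None if empty."""
        if not self.items:
            return None
        best = max(self.items, key=lambda p: priority(p, now))
        self.items.remove(best)
        best.status = "investigating"
        return best
```

Recomputing priority at pop time, instead of storing it once, keeps the recency term current without a background refresh.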
### GimbalState
Current gimbal position. Single instance, updated on every gimbal feedback message.
| Field | Type | Description |
|-------|------|-------------|
| pan | float32 | Current pan angle |
| tilt | float32 | Current tilt angle |
| zoom | float32 | Current zoom level |
| target_pan | float32 | Commanded pan |
| target_tilt | float32 | Commanded tilt |
| target_zoom | float32 | Commanded zoom |
| last_heartbeat | float64 | Last response timestamp |
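
One possible shape for this single instance is a mutable dataclass updated in place on each feedback message. The method names, the 1-second heartbeat timeout, and the angular tolerance are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class GimbalState:
    pan: float = 0.0
    tilt: float = 0.0
    zoom: float = 1.0
    target_pan: float = 0.0
    target_tilt: float = 0.0
    target_zoom: float = 1.0
    last_heartbeat: float = 0.0

    def on_feedback(self, pan, tilt, zoom, now):
        """Record the reported position and refresh the heartbeat."""
        self.pan, self.tilt, self.zoom = pan, tilt, zoom
        self.last_heartbeat = now

    def is_alive(self, now, timeout_s=1.0):
        """True if feedback arrived within the (assumed) timeout."""
        return (now - self.last_heartbeat) <= timeout_s

    def on_target(self, tolerance=0.5):
        """True if pan and tilt are within tolerance of the commanded values."""
        return (abs(self.pan - self.target_pan) <= tolerance
                and abs(self.tilt - self.target_tilt) <= tolerance)
```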
## Persistent Data (NVMe flat files)
### DetectionLogEntry → `detections.jsonl`
One JSON line per confirmed detection. This is the primary system output.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| ts | string (ISO 8601) | Yes | Detection timestamp |
| frame_id | uint64 | Yes | Source frame |
| gps_denied_lat | float64 | No | GPS-denied latitude (null if unavailable) |
| gps_denied_lon | float64 | No | GPS-denied longitude (null if unavailable) |
| tier | uint8 | Yes | 1, 2, or 3 |
| class | string | Yes | Detection class label |
| confidence | float32 | Yes | 0-1 |
| bbox | float32[4] | Yes | centerX, centerY, width, height (normalized) |
| freshness | string | No | "high_contrast" / "low_contrast" (footpaths only) |
| tier2_result | string | No | Tier 2 classification |
| tier2_confidence | float32 | No | Tier 2 confidence |
| tier3_used | bool | Yes | Whether VLM was invoked |
| thumbnail_path | string | No | Saved ROI thumbnail path |
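
A minimal sketch of building and appending one such line with the standard `json` module. The function names are hypothetical, and writing absent optional fields as `null` (rather than omitting them) is an assumption based on the `gps_denied_lat` description.

```python
import json

OPTIONAL_FIELDS = (
    "gps_denied_lat", "gps_denied_lon", "freshness",
    "tier2_result", "tier2_confidence", "thumbnail_path",
)


def detection_line(ts, frame_id, tier, cls, confidence, bbox, tier3_used,
                   **optional):
    """Serialize one DetectionLogEntry as a single JSON line (no newline)."""
    unknown = set(optional) - set(OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    entry = {
        "ts": ts,
        "frame_id": frame_id,
        "tier": tier,
        "class": cls,
        "confidence": confidence,
        "bbox": list(bbox),  # centerX, centerY, width, height (normalized)
        "tier3_used": tier3_used,
    }
    for name in OPTIONAL_FIELDS:
        entry[name] = optional.get(name)  # None is written as JSON null
    return json.dumps(entry)


def append_detection(path, line):
    """Append one line to the log; the file is only ever opened in append mode."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
```

For example, `detection_line("2026-01-01T12:00:00+00:00", 42, 2, "footpath_winter", 0.87, (0.5, 0.5, 0.1, 0.2), False, freshness="high_contrast")` yields one self-contained record that any JSON Lines reader can parse.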
### HealthLogEntry → `health.jsonl`
One JSON line per second. System health snapshot.
| Field | Type | Description |
|-------|------|-------------|
| ts | string (ISO 8601) | Timestamp |
| t_junction | float32 | Junction temperature °C |
| power_watts | float32 | Power draw |
| gpu_mem_mb | uint32 | GPU memory used |
| vlm_available | bool | VLM capability flag |
| gimbal_available | bool | Gimbal capability flag |
| semantic_available | bool | Semantic capability flag |
### RecordedFrame → `frames/{frame_id}.jpg`
JPEG file per recorded frame. Metadata embedded in filename (frame_id). Correlation to detections via frame_id in `detections.jsonl`.
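
Post-flight, the correlation amounts to a lookup by `frame_id`. A small sketch, with hypothetical helper names:

```python
import json
from pathlib import Path


def frame_path(frames_dir, frame_id):
    """Path of the recorded JPEG for a frame, following frames/{frame_id}.jpg."""
    return Path(frames_dir) / f"{frame_id}.jpg"


def detections_by_frame(jsonl_lines):
    """Group parsed detection entries by their frame_id."""
    grouped = {}
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        grouped.setdefault(entry["frame_id"], []).append(entry)
    return grouped
```

Passing an open `detections.jsonl` file object to `detections_by_frame` works directly, since iterating a text file yields its lines.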
### Config → `config.yaml`
Single YAML file with all runtime parameters. Versioned (`version: 1` field). Updated via USB.
## What Is NOT Stored
| Item | Why not |
|------|---------|
| FootpathMask (segmentation mask) | Transient. Exists ~50ms during Tier 2 processing. Too large to log (HxW binary). |
| PathSkeleton | Transient. Derivative of mask. |
| EndpointROI crop | Thumbnail saved only if detection confirmed. Raw crop discarded. |
| YoloDetection input | External system's data. We consume it, don't archive it. |
| POI queue state | Runtime queue. Not useful after flight. Detections capture the outcomes. |
| Raw VLM response text | Optionally logged inside DetectionLogEntry if tier3_used=true. Not stored separately. |
## Storage Budget (256GB NVMe)
| Data | Write Rate | Per Hour | 4-Hour Flight |
|------|-----------|----------|---------------|
| detections.jsonl | ~1 KB/detection, ~100 detections/min | ~6 MB | ~24 MB |
| health.jsonl | ~200 bytes/s | ~720 KB | ~3 MB |
| frames/ (L1, 2 FPS) | ~100 KB/frame, 2 FPS | ~720 MB | ~2.9 GB |
| frames/ (L2, 30 FPS) | ~100 KB/frame, 30 FPS | ~10.8 GB | ~43 GB (if L2 100% of time) |
| gimbal.log | ~50 bytes/command, 10 Hz | ~1.8 MB | ~7 MB |
| **Total (typical L1-heavy)** | | **~1.5 GB** | **~6 GB** |
| **Total (L2-heavy)** | | **~11 GB** | **~46 GB** |
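By the table's figures, a 256 GB NVMe holds roughly 40 typical 4-hour flights (~6 GB each) or about 5 L2-heavy flights (~46 GB each, roughly 23 hours of L2-heavy operation) before the circular buffer kicks in, ignoring space taken by anything else on the drive.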
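
The per-hour figures follow directly from the write rates. The check below reproduces them, assuming decimal units (1 KB = 1000 bytes), which is what the table's rounding implies; the names are illustrative only.

```python
KB, MB, GB = 1e3, 1e6, 1e9


def per_hour_bytes(bytes_per_event, events_per_second):
    """Bytes written in one hour at a steady event rate."""
    return bytes_per_event * events_per_second * 3600


RATES = {
    "detections.jsonl": per_hour_bytes(1 * KB, 100 / 60),  # ~100 detections/min
    "health.jsonl": per_hour_bytes(200, 1),
    "frames_l1": per_hour_bytes(100 * KB, 2),
    "frames_l2": per_hour_bytes(100 * KB, 30),
    "gimbal.log": per_hour_bytes(50, 10),
}


def hours_until_full(capacity_bytes, bytes_per_hour):
    """Hours of recording before the given capacity is exhausted."""
    return capacity_bytes / bytes_per_hour


if __name__ == "__main__":
    for name, rate in RATES.items():
        print(f"{name}: {rate / MB:.1f} MB/hour")
    l2 = hours_until_full(256 * GB, RATES["frames_l2"])
    print(f"L2 frames alone fill 256 GB in about {l2:.1f} hours")
```

Frame storage dominates every scenario: the three text logs together stay under 10 MB per hour.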
## Migration Strategy
Not applicable — no relational database. Config changes handled by YAML versioning:
- `version: 1` field in config.yaml
- New fields get defaults (backward-compatible)
- Breaking changes: bump version, include migration notes in USB update package
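
The versioning rules above can be sketched as a loader that fills in defaults and rejects unknown versions. It operates on an already-parsed dict (for example the result of a YAML parser); the key names `poi_queue_max_size` and `health_log_interval_s` are hypothetical and not taken from the real `config.yaml`.

```python
SUPPORTED_VERSION = 1

# Hypothetical keys; defaults let older config files keep working
# when a new field is introduced.
DEFAULTS = {
    "poi_queue_max_size": 10,
    "health_log_interval_s": 1.0,
}


def load_config(raw):
    """Validate the version field and merge defaults under explicit values."""
    version = raw.get("version")
    if version != SUPPORTED_VERSION:
        raise ValueError(
            f"unsupported config version {version!r}; "
            f"expected {SUPPORTED_VERSION}"
        )
    merged = dict(DEFAULTS)
    merged.update(raw)  # explicit values win over defaults
    return merged
```

Failing loudly on a version mismatch forces a breaking change to go through the documented migration notes instead of being half-applied.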