mirror of
https://github.com/azaion/detections-semantic.git
synced 2026-04-22 22:26:39 +00:00
Initial commit
Made-with: Cursor
@@ -0,0 +1,80 @@
# Observability

## Logging

### Detection Log

| Field | Type | Description |
|-------|------|-------------|
| ts | ISO 8601 | Detection timestamp |
| frame_id | uint64 | Source frame |
| gps_denied_lat | float64 | GPS-denied latitude |
| gps_denied_lon | float64 | GPS-denied longitude |
| tier | uint8 | Tier that produced detection |
| class | string | Detection class label |
| confidence | float32 | Detection confidence |
| bbox | float32[4] | centerX, centerY, width, height (normalized) |
| freshness | string | Freshness tag (footpaths only) |
| tier2_result | string | Tier 2 classification |
| tier2_confidence | float32 | Tier 2 confidence |
| tier3_used | bool | Whether VLM was invoked |
| thumbnail_path | string | Path to ROI thumbnail |

**Format**: JSON-lines, append-only
**Location**: `/data/output/detections.jsonl`
**Rotation**: None (circular buffer at filesystem level for L1 frames)
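
As an illustration of the record layout above, the following is a minimal Python sketch that serializes one detection as a JSON-lines record. The `Detection` dataclass, `to_json_line`, and `append_detection` are hypothetical names, not part of the repository. Writing unset optional fields as `null` (rather than omitting them) is an assumption, and so are the sample values, including the `footpath` class label.

```python
import io
import json
from dataclasses import asdict, dataclass
from typing import Optional, TextIO, Tuple


@dataclass
class Detection:
    """One detection record; attribute names follow the field table."""
    ts: str                    # ISO 8601 timestamp
    frame_id: int
    gps_denied_lat: float
    gps_denied_lon: float
    tier: int
    class_label: str           # written as "class", which is a Python keyword
    confidence: float
    bbox: Tuple[float, float, float, float]  # centerX, centerY, width, height
    freshness: Optional[str] = None          # footpaths only
    tier2_result: Optional[str] = None
    tier2_confidence: Optional[float] = None
    tier3_used: bool = False
    thumbnail_path: Optional[str] = None


def to_json_line(det: Detection) -> str:
    """Serialize one detection as a single JSON object without a newline."""
    record = asdict(det)
    record["class"] = record.pop("class_label")
    record["bbox"] = list(record["bbox"])
    return json.dumps(record, separators=(",", ":"))


def append_detection(stream: TextIO, det: Detection) -> None:
    """Append one record and flush so the line is handed to the OS promptly."""
    stream.write(to_json_line(det) + "\n")
    stream.flush()


if __name__ == "__main__":
    # Illustrative values only; none of them come from a real flight.
    demo = Detection(
        ts="2026-01-01T12:00:00+00:00", frame_id=42,
        gps_denied_lat=0.0, gps_denied_lon=0.0, tier=1,
        class_label="footpath", confidence=0.9, bbox=(0.5, 0.5, 0.1, 0.2),
    )
    buffer = io.StringIO()
    append_detection(buffer, demo)
    print(buffer.getvalue(), end="")
```

On the device, the stream would presumably be the log file opened in append mode, for example `open("/data/output/detections.jsonl", "a", encoding="utf-8")`.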

### Gimbal Command Log

**Format**: Text, one line per command (timestamp, command type, target angles, CRC status, retry count)
**Location**: `/data/output/gimbal.log`

### System Health Log

**Format**: JSON-lines, 1 entry per second
**Fields**: `timestamp`, `t_junction`, `power_watts`, `gpu_mem_mb`, `cpu_mem_mb`, `degradation_level`, `gimbal_alive`, `semantic_alive`, `vlm_alive`, `nvme_free_pct`
**Location**: `/data/output/health.jsonl`
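
One health entry could be assembled roughly as in the sketch below. `HEALTH_FIELDS` and `make_health_entry` are hypothetical names. The ISO 8601 timestamp format is an assumption carried over from the detection log, since the health log's timestamp format is not specified here. How the readings are sampled is outside the scope of the sketch.

```python
import json
from datetime import datetime, timezone
from typing import Mapping, Optional

# Field order as listed for the System Health Log.
HEALTH_FIELDS = (
    "timestamp", "t_junction", "power_watts", "gpu_mem_mb", "cpu_mem_mb",
    "degradation_level", "gimbal_alive", "semantic_alive", "vlm_alive",
    "nvme_free_pct",
)


def make_health_entry(readings: Mapping[str, object],
                      now: Optional[datetime] = None) -> str:
    """Build one health.jsonl line from a mapping of current readings."""
    missing = [name for name in HEALTH_FIELDS[1:] if name not in readings]
    if missing:
        raise ValueError(f"missing health fields: {missing}")
    stamp = now if now is not None else datetime.now(timezone.utc)
    entry = {"timestamp": stamp.isoformat()}  # ISO 8601 is an assumption
    for name in HEALTH_FIELDS[1:]:
        entry[name] = readings[name]
    return json.dumps(entry)
```

Rejecting incomplete readings keeps every line in the file on the same schema, which simplifies post-flight parsing. A caller would invoke this once per second and append the returned line plus a newline to `/data/output/health.jsonl`.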

### Application Error Log

**Format**: Text with severity levels (ERROR, WARN, INFO)
**Location**: `/data/output/app.log`
**Content**: Exceptions, timeouts, CRC failures, frame skips, VLM errors
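
A text log with these severity labels could be produced with Python's standard `logging` module, as sketched below. The line layout (timestamp, severity, logger name, message), the logger name `semantic.app`, and the helper names are assumptions. The standard library names the level `WARNING`, so the formatter maps it to `WARN` to match the labels above.

```python
import io
import logging


class SeverityFormatter(logging.Formatter):
    """Render WARNING as WARN so lines use the labels ERROR, WARN, INFO."""

    def format(self, record: logging.LogRecord) -> str:
        record.severity = (
            "WARN" if record.levelno == logging.WARNING else record.levelname
        )
        return super().format(record)


def build_app_logger(handler: logging.Handler,
                     name: str = "semantic.app") -> logging.Logger:
    """Attach a single handler that writes INFO and above in text form."""
    handler.setFormatter(
        SeverityFormatter("%(asctime)s %(severity)s %(name)s: %(message)s")
    )
    logger = logging.getLogger(name)
    logger.handlers.clear()      # avoid duplicate lines on repeated setup
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False     # keep records out of the root logger
    return logger


if __name__ == "__main__":
    stream = io.StringIO()
    log = build_app_logger(logging.StreamHandler(stream))
    log.warning("CRC failure on gimbal UART")
    print(stream.getvalue(), end="")
```

For the on-device file, the handler would presumably be `logging.FileHandler("/data/output/app.log")` instead of the in-memory stream used in the demo.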

## Metrics (In-Memory)

There is no external metrics service because the system is air-gapped. Metrics are computed in memory and exposed through the health API endpoint:

| Metric | Type | Description |
|--------|------|-------------|
| frames_processed_total | Counter | Total frames through Tier 1 |
| frames_skipped_quality | Counter | Frames rejected by quality gate |
| detections_total | Counter | Total detections produced (all tiers) |
| tier1_latency_ms | Histogram | Tier 1 inference time |
| tier2_latency_ms | Histogram | Tier 2 processing time |
| tier3_latency_ms | Histogram | Tier 3 VLM time |
| poi_queue_depth | Gauge | Current POI queue size |
| degradation_level | Gauge | Current degradation level |
| t_junction_celsius | Gauge | Current junction temperature |
| power_draw_watts | Gauge | Current power draw |
| gpu_memory_used_mb | Gauge | Current GPU memory |
| gimbal_crc_failures | Counter | Total CRC failures on UART |
| vlm_crashes | Counter | VLM process crash count |

**Exposed via**: `GET /api/v1/health` (JSON response with all metrics)
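
The three metric types in the table can be held in a small thread-safe registry, sketched below. The `Metrics` class and the shape of its snapshot are assumptions, not the actual response schema of the health endpoint. The histogram is simplified to a running count, sum, and maximum rather than bucketed counts.

```python
import json
import threading
from typing import Dict


class Metrics:
    """In-memory counters, gauges, and simplified histograms."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counters: Dict[str, int] = {}
        self._gauges: Dict[str, float] = {}
        self._histograms: Dict[str, Dict[str, float]] = {}

    def inc(self, name: str, amount: int = 1) -> None:
        """Increase a monotonically growing counter."""
        with self._lock:
            self._counters[name] = self._counters.get(name, 0) + amount

    def set_gauge(self, name: str, value: float) -> None:
        """Overwrite a gauge with its current value."""
        with self._lock:
            self._gauges[name] = value

    def observe(self, name: str, value: float) -> None:
        """Fold one sample into the running histogram statistics."""
        with self._lock:
            stats = self._histograms.setdefault(
                name, {"count": 0, "sum": 0.0, "max": value}
            )
            stats["count"] += 1
            stats["sum"] += value
            stats["max"] = max(stats["max"], value)

    def snapshot(self) -> dict:
        """Return a JSON-serializable copy of all metrics."""
        with self._lock:
            return {
                "counters": dict(self._counters),
                "gauges": dict(self._gauges),
                "histograms": {k: dict(v) for k, v in self._histograms.items()},
            }


if __name__ == "__main__":
    metrics = Metrics()
    metrics.inc("frames_processed_total")
    metrics.observe("tier1_latency_ms", 12.5)
    print(json.dumps(metrics.snapshot()))
```

Keeping only running statistics bounds memory use regardless of flight duration. A handler for the health endpoint could return `snapshot()` serialized with `json.dumps`.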

## Alerting

There is no external alerting system. Alerts are surfaced in three ways:

1. Degradation level changes are logged to the health log and the detection log.
2. Critical events (VLM crash, gimbal loss, thermal critical) are logged with severity ERROR.
3. The operator display shows the current degradation level as a status indicator.
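
The first alert path, logging a degradation level change to more than one log, might look like the following sketch. `DegradationMonitor`, the integer level type, and the event record shape (`event`, `from`, `to`) are all assumptions. How change events are actually represented in the health and detection logs is not specified here.

```python
import io
import json
from typing import Iterable, TextIO


class DegradationMonitor:
    """Write one event line to every sink when the level changes."""

    def __init__(self, initial_level: int, sinks: Iterable[TextIO]) -> None:
        self._level = initial_level
        self._sinks = list(sinks)

    def update(self, level: int, ts: str) -> bool:
        """Record a new level; return True only if it differs from the last."""
        if level == self._level:
            return False
        event = {
            "ts": ts,
            "event": "degradation_change",  # hypothetical event name
            "from": self._level,
            "to": level,
        }
        line = json.dumps(event) + "\n"
        for sink in self._sinks:
            sink.write(line)
            sink.flush()
        self._level = level
        return True


if __name__ == "__main__":
    health_log, detection_log = io.StringIO(), io.StringIO()
    monitor = DegradationMonitor(0, [health_log, detection_log])
    monitor.update(2, "2026-01-01T12:00:00+00:00")
    print(health_log.getvalue(), end="")
```

Emitting only on transitions keeps the logs free of repeated entries while a degraded state persists.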

## Post-Flight Analysis

After landing, NVMe data is extracted via USB for offline analysis:

- `detections.jsonl` → import into annotation tool for TP/FP labeling
- `frames/` → source material for training dataset expansion
- `health.jsonl` → thermal/power profile for hardware optimization
- `gimbal.log` → PID tuning analysis
- `app.log` → debugging and issue diagnosis
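
As a first pass over an extracted `detections.jsonl`, a short script can summarize detections per class before the file goes into an annotation tool. The sketch below is an assumption about how such a summary might be computed. `summarize_detections` is a hypothetical helper that relies only on the `class`, `confidence`, and `tier3_used` fields, and it skips unparseable lines, such as a final line truncated at power-off.

```python
import json
from collections import Counter, defaultdict
from typing import Dict, Iterable


def summarize_detections(lines: Iterable[str]) -> dict:
    """Count detections and average confidence per class label."""
    counts: Counter = Counter()
    confidence_sum: Dict[str, float] = defaultdict(float)
    tier3_invocations = 0
    skipped = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
            label = record["class"]
            confidence = float(record["confidence"])
        except (KeyError, TypeError, ValueError):
            # json.JSONDecodeError is a ValueError subclass.
            skipped += 1
            continue
        counts[label] += 1
        confidence_sum[label] += confidence
        if record.get("tier3_used"):
            tier3_invocations += 1
    return {
        "count": dict(counts),
        "mean_confidence": {
            label: confidence_sum[label] / n for label, n in counts.items()
        },
        "tier3_invocations": tier3_invocations,
        "skipped_lines": skipped,
    }
```

Because the function accepts any iterable of lines, it can be fed an open file handle directly, for example `summarize_detections(open(path, encoding="utf-8"))`, where `path` points at the extracted copy of the log.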