mirror of
https://github.com/azaion/detections-semantic.git
synced 2026-04-23 09:06:38 +00:00
# Latency

- Tier 1 (fast probe / YOLO26 + YOLOE-26 detection): ≤100 ms per frame on Jetson Orin Nano Super
- Tier 2 (fast confirmation / custom CNN classifier): ≤200 ms per region of interest on Jetson Orin Nano Super
- Tier 3 (optional deep analysis / VLM): ≤5 seconds per region of interest on Jetson Orin Nano Super

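The tiered budgets above can be sketched as a cascade in which each stage only escalates work that passes the previous one. This is a minimal illustration: the detector functions are placeholders, not the real models, and only the tier numbers and millisecond budgets come from the spec.

```python
import time

# Per-tier latency budgets in milliseconds, as specified above.
TIER_BUDGET_MS = {1: 100, 2: 200, 3: 5000}

def run_tier(tier, fn, *args):
    """Run one tier's workload and report whether it met its budget."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms, elapsed_ms <= TIER_BUDGET_MS[tier]

# Placeholder stand-ins for YOLO, the CNN classifier, and the VLM.
def fast_probe(frame):     return ["roi-1"]          # Tier 1: find ROIs
def confirm(roi):          return 0.9                # Tier 2: score an ROI
def deep_analysis(roi):    return "concealed entry"  # Tier 3: VLM verdict

rois, _, ok1 = run_tier(1, fast_probe, "frame-0")
for roi in rois:
    score, _, ok2 = run_tier(2, confirm, roi)
    if score > 0.8:                     # only escalate confident ROIs
        verdict, _, ok3 = run_tier(3, deep_analysis, roi)
```

The escalation threshold (0.8) is an assumption for illustration; the real cascade would tune it per class.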
# YOLO Object Detection — New Classes

- New classes added to existing YOLO model: black entrances (various sizes), piles of tree branches, footpaths, roads, trees, tree blocks
- Detection performance for new classes: targets matching existing YOLO baseline (P ≥ 80%, R ≥ 80%) on validation set
- New classes must not degrade detection performance of existing classes

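The acceptance criterion above reduces to a per-class precision/recall check against the 80%/80% targets. A sketch of that check, with hypothetical TP/FP/FN counts standing in for real validation results:

```python
# Per-class acceptance check against the spec's P >= 80%, R >= 80% targets.
TARGET_P, TARGET_R = 0.80, 0.80

def precision_recall(tp, fp, fn):
    """Standard precision/recall from confusion counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r

# class -> (TP, FP, FN); the numbers here are illustrative only.
val_counts = {
    "black_entrance": (90, 15, 10),   # P ~0.86, R 0.90 -> passes
    "branch_pile":    (75, 25, 25),   # P 0.75 -> fails the target
}

report = {}
for cls, (tp, fp, fn) in val_counts.items():
    p, r = precision_recall(tp, fp, fn)
    report[cls] = p >= TARGET_P and r >= TARGET_R
```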
# Semantic Detection Performance (Initial Release)

- Recall on concealed positions: ≥60% (early releases may run a high false-positive rate; precision is improved iteratively)
- Precision on concealed positions: ≥20% initial target (an operator filters the candidate detections)
- Footpath detection recall: ≥70%
- Baseline reference: existing YOLO achieves P = 81.6%, R = 85.2% on non-masked objects

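The "start high on false positives, iterate down" strategy typically means raising the confidence threshold over successive releases, trading recall for precision. A toy illustration of that trade-off (the detections and counts are fabricated for the example, not real model output):

```python
# Each entry: (confidence score, whether the detection is a true positive).
detections = [
    (0.95, True), (0.9, True), (0.8, False), (0.7, True),
    (0.6, False), (0.5, False), (0.4, True), (0.3, False),
]
TOTAL_POSITIVES = 5  # ground-truth concealed positions in this toy set

def metrics_at(threshold):
    """Precision/recall when only detections >= threshold are kept."""
    kept = [is_tp for conf, is_tp in detections if conf >= threshold]
    tp = sum(kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / TOTAL_POSITIVES
    return precision, recall

low = metrics_at(0.3)    # permissive: high recall, low precision
high = metrics_at(0.85)  # strict: precision rises, recall falls
```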
# Scan Algorithm

- Level 1 wide-area scan covers the planned route with a left-right camera sweep at medium zoom
- Points of interest detected during Level 1: footpaths, tree rows, branch piles, black entrances, houses with vehicles/traces, roads on snow/terrain/forest
- Transition from Level 1 to Level 2 within 2 seconds of POI detection (including the physical zoom transition)
- Level 2 maintains camera lock on the POI while the UAV continues flight (gimbal compensates for aircraft motion)
- Path-following mode: camera pans along a detected footpath at a rate that keeps the path visible and centered
- Endpoint hold: camera maintains position on the path endpoint for the VLM analysis duration (up to 2 seconds)
- Return to Level 1 after analysis completes or a configurable timeout expires (default 5 seconds per POI)

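The Level 1 / Level 2 loop above is essentially a small state machine. A minimal sketch, with placeholder transition logic (only the two levels and the 5-second default dwell come from the spec; the tick interface is an assumption):

```python
from enum import Enum, auto

class ScanState(Enum):
    LEVEL_1_WIDE = auto()   # wide-area sweep at medium zoom
    LEVEL_2_LOCK = auto()   # camera locked on a POI at high zoom

POI_TIMEOUT_S = 5.0  # default per-POI dwell from the spec

def step(state, poi_detected, dwell_s):
    """One tick of the scan state machine."""
    if state is ScanState.LEVEL_1_WIDE and poi_detected:
        return ScanState.LEVEL_2_LOCK   # must complete within 2 s
    if state is ScanState.LEVEL_2_LOCK and dwell_s >= POI_TIMEOUT_S:
        return ScanState.LEVEL_1_WIDE   # timeout: resume the wide sweep
    return state

s = ScanState.LEVEL_1_WIDE
s = step(s, poi_detected=True, dwell_s=0.0)   # POI found -> lock on
s = step(s, poi_detected=False, dwell_s=6.0)  # timeout -> back to sweep
```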
# Camera Control

- Gimbal control module sends pan/tilt/zoom commands to ViewPro A40
- Gimbal command latency: ≤500 ms from decision to physical camera movement
- Zoom transitions: medium zoom (Level 1) to high zoom (Level 2) within 2 seconds (physical constraint)
- Path-following accuracy: detected footpath stays within center 50% of frame during pan
- Smooth gimbal transitions (no jerky movements that blur the image)
- Queue management: system maintains ordered list of POIs, prioritized by confidence and proximity to current camera position

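The POI queue can be sketched as a min-heap ordered by a cost that favours high-confidence POIs near the current camera azimuth. The spec only says "confidence and proximity"; the linear weighting and azimuth-based distance here are assumptions for illustration.

```python
import heapq

def poi_cost(confidence, poi_az_deg, camera_az_deg, w_dist=0.01):
    """Lower cost = higher priority. w_dist is an assumed weighting."""
    slew = abs(poi_az_deg - camera_az_deg) % 360
    slew = min(slew, 360 - slew)        # shortest angular distance
    return -confidence + w_dist * slew

camera_az = 10.0
# (name, confidence, azimuth in degrees) -- illustrative POIs.
pois = [("footpath", 0.9, 15.0), ("entrance", 0.7, 12.0), ("pile", 0.9, 170.0)]

heap = [(poi_cost(c, az, camera_az), name) for name, c, az in pois]
heapq.heapify(heap)
first = heapq.heappop(heap)[1]   # next POI the gimbal should visit
```

A confident POI only 5° away outranks an equally confident one 160° away, which avoids large slews that would violate the smooth-transition requirement.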
# Semantic Analysis Pipeline

- Consumes YOLO detections as input: uses detected footpaths, roads, branch piles, entrances, trees as primitives for reasoning
- Distinguishes fresh footpaths from stale ones (visual freshness assessment)
- Traces footpaths to endpoints and identifies concealed structures at those endpoints
- Handles path intersections by following the freshest / most promising branch

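Tracing footpaths to endpoints while following the freshest branch at intersections can be modelled as a greedy walk over a segment graph. A sketch under the assumption that freshness is a 0–1 score per segment (the graph data is illustrative):

```python
# node -> list of (next_node, freshness score); illustrative path graph.
graph = {
    "A": [("B", 0.8)],
    "B": [("C", 0.3), ("D", 0.9)],   # intersection: branch D is fresher
    "C": [],
    "D": [("E", 0.7)],
    "E": [],                          # endpoint -> candidate for VLM analysis
}

def trace(start):
    """Follow the freshest outgoing branch until an endpoint is reached."""
    node, visited = start, [start]
    while graph[node]:
        nxt = max(graph[node], key=lambda edge: edge[1])[0]
        if nxt in visited:            # guard against loops in the path graph
            break
        node = nxt
        visited.append(node)
    return visited

route = trace("A")   # greedy walk ending at an endpoint
```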
# Resource Constraints

- Total RAM usage (semantic module + VLM): ≤6 GB on Jetson Orin Nano Super
- Must coexist with running YOLO inference pipeline without degrading YOLO performance

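A runtime guard for the 6 GB budget might sample the process's peak RSS and compare it to the cap. This is a minimal Linux-only sketch of one process; a real deployment would also account for the VLM if it runs as a separate process.

```python
import resource

RAM_BUDGET_MB = 6 * 1024  # 6 GB cap from the spec

def within_ram_budget():
    """Check this process's peak RSS against the budget (Linux: KiB units)."""
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kib / 1024 <= RAM_BUDGET_MB

ok = within_ram_budget()
```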
# Data & Training

- Training dataset: hundreds to thousands of annotated examples across all seasons and terrain types
- New YOLO classes require dedicated annotation effort for: black entrances, branch piles, footpaths, roads, trees, tree blocks
- Dataset assembly timeline: 1.5 months, with 5 hours/day of manual annotation effort available

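A back-of-envelope check shows the timeline is consistent with a dataset in the "hundreds to thousands" range. The 1.5 months and 5 h/day come from the spec; the 30-day month and the 20 images/hour annotation rate are assumptions for illustration only.

```python
DAYS = int(1.5 * 30)       # ~45 calendar days in the 1.5-month window
HOURS = DAYS * 5           # 5 annotation hours per day -> 225 hours total
IMAGES_PER_HOUR = 20       # assumed manual annotation throughput
capacity = HOURS * IMAGES_PER_HOUR   # upper bound on annotated images
```

Even at this modest assumed rate the budget covers roughly 4,500 annotations, comfortably above the low end of the target range.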