# Latency

- Tier 1 (fast probe / YOLO26 + YOLOE-26 detection): ≤100ms per frame on Jetson Orin Nano Super
- Tier 2 (fast confirmation / custom CNN classifier): ≤200ms per region of interest on Jetson Orin Nano Super
- Tier 3 (optional deep analysis / VLM): ≤5 seconds per region of interest on Jetson Orin Nano Super

# YOLO Object Detection — New Classes

- New classes added to the existing YOLO model: black entrances (various sizes), piles of tree branches, footpaths, roads, trees, tree blocks
- Detection performance for new classes: targets matching the existing YOLO baseline (P≥80%, R≥80%) on the validation set
- New classes must not degrade detection performance on existing classes

# Semantic Detection Performance (Initial Release)

- Recall on concealed positions: ≥60% (start high on false positives, iterate down)
- Precision on concealed positions: ≥20% initial target (operator filters candidates)
- Footpath detection recall: ≥70%
- Baseline reference: the existing YOLO model achieves P=81.6%, R=85.2% on non-masked objects

# Scan Algorithm

- Level 1 wide-area scan covers the planned route with a left-right camera sweep at medium zoom
- Points of interest detected during Level 1: footpaths, tree rows, branch piles, black entrances, houses with vehicles/traces, roads on snow/terrain/forest
- Transition from Level 1 to Level 2 within 2 seconds of POI detection (includes the physical zoom transition)
- Level 2 maintains camera lock on the POI while the UAV continues flight (gimbal compensates for aircraft motion)
- Path-following mode: the camera pans along a detected footpath at a rate that keeps the path visible and centered
- Endpoint hold: the camera holds on the path endpoint for the duration of VLM analysis (up to 2 seconds)
- Return to Level 1 after analysis completes or after a configurable timeout (default 5 seconds per POI)

# Camera Control

- Gimbal control module sends pan/tilt/zoom commands to the ViewPro A40
- Gimbal command latency: ≤500ms from decision to physical camera movement
- Zoom transitions: medium zoom (Level 1) to high zoom (Level 2) within 2 seconds (physical constraint)
- Path-following accuracy: the detected footpath stays within the center 50% of the frame during a pan
- Smooth gimbal transitions (no jerky movements that blur the image)
- Queue management: the system maintains an ordered list of POIs, prioritized by confidence and proximity to the current camera position

# Semantic Analysis Pipeline

- Consumes YOLO detections as input: uses detected footpaths, roads, branch piles, entrances, and trees as primitives for reasoning
- Distinguishes fresh footpaths from stale ones (visual freshness assessment)
- Traces footpaths to their endpoints and identifies concealed structures at those endpoints
- Handles path intersections by following the freshest / most promising branch

# Resource Constraints

- Total RAM usage (semantic module + VLM): ≤6GB on the Jetson Orin Nano Super
- Must coexist with the running YOLO inference pipeline without degrading YOLO performance

# Data & Training

- Training dataset: hundreds to thousands of annotated examples across all seasons and terrain types
- New YOLO classes require dedicated annotation effort for: black entrances, branch piles, footpaths, roads, trees, tree blocks
- Dataset assembly timeline: 1.5 months, with 5 hours/day of manual annotation effort available
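The three-tier latency budget under "Latency" can be sketched as a simple per-tier check. This is an illustrative sketch only: the tier names, the `TierBudget` type, and the `violations` helper are assumptions introduced here, not part of any real pipeline API; the budget values come from the requirements above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TierBudget:
    name: str
    budget_s: float  # maximum allowed processing time per unit of work
    unit: str        # what the budget is measured against

# Budgets taken from the Latency requirements above.
TIERS = [
    TierBudget("tier1_yolo_probe", 0.100, "frame"),
    TierBudget("tier2_cnn_confirm", 0.200, "region of interest"),
    TierBudget("tier3_vlm_analysis", 5.000, "region of interest"),
]

def violations(measured_s: dict[str, float]) -> list[str]:
    """Return the names of tiers whose measured latency exceeds the budget."""
    return [t.name for t in TIERS if measured_s.get(t.name, 0.0) > t.budget_s]

# Example: Tier 2 is over budget, the other tiers are within limits.
over = violations({"tier1_yolo_probe": 0.080,
                   "tier2_cnn_confirm": 0.250,
                   "tier3_vlm_analysis": 4.2})
```

A check like this can run against measured per-frame and per-ROI timings in integration tests, so a regression in any tier is caught before deployment.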
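The Level 1 / Level 2 transitions in "Scan Algorithm" amount to a small state machine: escalate when a POI is detected, return when analysis finishes or the per-POI timeout expires. The state names and the `next_state` function below are assumptions for clarity; the 5-second default timeout comes from the requirements.

```python
LEVEL1 = "level1_wide_scan"   # wide-area sweep at medium zoom
LEVEL2 = "level2_poi_lock"    # camera locked on a POI at high zoom

def next_state(state: str, poi_detected: bool, analysis_done: bool,
               elapsed_s: float, timeout_s: float = 5.0) -> str:
    """One step of scan-level logic: escalate on a POI, return on completion or timeout."""
    if state == LEVEL1 and poi_detected:
        return LEVEL2  # the physical zoom transition must complete within 2 s
    if state == LEVEL2 and (analysis_done or elapsed_s >= timeout_s):
        return LEVEL1  # default timeout: 5 s per POI
    return state
```

Keeping the transition logic pure like this makes the timeout behavior easy to unit-test independently of the gimbal and inference code.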
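The POI queue in "Camera Control" orders candidates by confidence and proximity to the current camera position. One way to sketch that is a score that rewards confidence and penalizes angular distance from the current pan angle; the 0.7/0.3 weighting and the `ordered_pois` helper are assumptions introduced here, not a specified formula.

```python
import heapq

def poi_score(confidence: float, poi_pan_deg: float, camera_pan_deg: float) -> float:
    """Higher is better: combine confidence with closeness to the current pan angle."""
    # Shortest angular distance, normalized to [0, 1] over a 180-degree half-turn.
    dist = abs((poi_pan_deg - camera_pan_deg + 180.0) % 360.0 - 180.0) / 180.0
    return 0.7 * confidence + 0.3 * (1.0 - dist)  # assumed weighting

def ordered_pois(pois, camera_pan_deg):
    """pois: list of (poi_id, confidence, pan_deg). Returns ids, best first."""
    # heapq is a min-heap, so push negated scores to pop the best POI first.
    heap = [(-poi_score(c, p, camera_pan_deg), pid) for pid, c, p in pois]
    heapq.heapify(heap)
    order = []
    while heap:
        order.append(heapq.heappop(heap)[1])
    return order

# branch_pile_2 wins: high confidence and close to the current pan angle.
order = ordered_pois([("footpath_3", 0.60, 10.0),
                      ("entrance_1", 0.90, 170.0),
                      ("branch_pile_2", 0.85, 20.0)], camera_pan_deg=0.0)
```

Weighting proximity keeps gimbal travel short between POIs, which also helps the smoothness requirement; the weights would need tuning against real flight footage.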
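The path-following requirements (keep the footpath within the center 50% of the frame, no jerky movements) suggest a rate-limited proportional controller on the horizontal offset of the path centroid. The gain, frame width, and slew limit below are illustrative assumptions, not tuned ViewPro A40 values.

```python
FRAME_WIDTH_PX = 1920
MAX_PAN_RATE_DEG_S = 30.0  # assumed gimbal slew limit; keeps motion smooth
GAIN = 0.05                # deg/s of pan per pixel of horizontal error (assumed)

def pan_rate(path_centroid_x: float) -> float:
    """Pan rate (deg/s) that re-centers the detected path. Positive = pan right."""
    error_px = path_centroid_x - FRAME_WIDTH_PX / 2
    rate = GAIN * error_px
    # Clamp to the slew limit so transitions stay smooth and the image does not blur.
    return max(-MAX_PAN_RATE_DEG_S, min(MAX_PAN_RATE_DEG_S, rate))

def in_center_band(path_centroid_x: float) -> bool:
    """True if the path centroid lies inside the central 50% of the frame width."""
    return FRAME_WIDTH_PX * 0.25 <= path_centroid_x <= FRAME_WIDTH_PX * 0.75
```

`in_center_band` doubles as the acceptance check for the "center 50% of frame" accuracy requirement during flight-log evaluation.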
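A back-of-envelope capacity check for the annotation plan in "Data & Training": the timeline (1.5 months at 5 hours/day) comes from the requirement itself, while the per-hour annotation rate is an assumption that would need validating against the actual labeling tool and class mix.

```python
DAYS = 45                # ~1.5 months of calendar days
HOURS_PER_DAY = 5        # from the requirement
EXAMPLES_PER_HOUR = 20   # assumed rate for bounding-box annotation

total_hours = DAYS * HOURS_PER_DAY          # total annotation time in the budget
capacity = total_hours * EXAMPLES_PER_HOUR  # examples annotatable within it
```

At the assumed rate this budget covers roughly 4,500 examples, which is consistent with the "hundreds to thousands" dataset target, but leaves little slack if the rate drops for hard classes such as concealed entrances.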