detections-semantic/_docs/02_plans/FINAL_report.md
Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit
Made-with: Cursor
2026-03-26 00:20:30 +02:00


Semantic Detection System — Planning Report

Executive Summary

This report plans a three-tier semantic detection system for UAV reconnaissance that identifies camouflaged or concealed positions by detecting footpaths, tracing them to endpoints, and classifying concealment through heuristic and VLM analysis. The architecture decomposes into 6 components, 2 common helpers, and 8 Jira epics totaling an estimated 42-68 story points. A behavior tree orchestrates a two-level scan strategy (wide sweep plus detailed investigation) driven by data-configurable search scenarios.

Problem Statement

Existing YOLO-based UAV detection pipelines cannot identify camouflaged military positions — FPV operator hideouts, hidden artillery, branch-covered dugouts. A semantic layer is needed that detects terrain indicators (footpaths, branch piles, dark entrances), traces spatial patterns to potential concealment points, and optionally confirms via visual language model analysis, all while controlling a camera gimbal through a two-level scan strategy on a resource-constrained Jetson Orin Nano Super.

Architecture Overview

Three-tier inference pipeline (Tier 1: YOLOE TensorRT detection, Tier 2: heuristic spatial analysis, Tier 3: optional VLM deep analysis) orchestrated by a py_trees behavior tree. The scan controller manages L1 wide-area sweep and L2 detailed investigation with 4 investigation types (path_follow, cluster_follow, area_sweep, zoom_classify), each driven by YAML-configured search scenarios. Graceful degradation via capability flags handles VLM unavailability, gimbal failure, and thermal throttling.
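The degradation behavior described above can be sketched as a small decision function. This is a minimal illustration only: the `Capabilities` fields, the `classify_poi` name, and both thresholds (0.5 and 0.7) are assumptions made for the example and do not come from the component specs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Capabilities:
    """Hypothetical runtime flags; real flag names are not given in this report."""
    vlm_available: bool = True
    thermal_throttled: bool = False


@dataclass
class Verdict:
    label: str
    confidence: float
    source: str  # which tier produced the final score


def classify_poi(
    heuristic_score: float,
    caps: Capabilities,
    vlm: Optional[Callable[[], float]] = None,
    vlm_threshold: float = 0.5,      # assumed: only promising POIs reach Tier 3
    confirm_threshold: float = 0.7,  # assumed: score needed to confirm concealment
) -> Verdict:
    """Tier 2 always produces a score; Tier 3 refines it only when allowed."""
    use_vlm = (
        vlm is not None
        and caps.vlm_available
        and not caps.thermal_throttled
        and heuristic_score >= vlm_threshold
    )
    if use_vlm:
        score, source = vlm(), "tier3_vlm"
    else:
        # Fallback: keep the heuristic result instead of failing the scan.
        score, source = heuristic_score, "tier2_heuristic"
    label = "concealed_position" if score >= confirm_threshold else "unconfirmed"
    return Verdict(label, score, source)
```

With the VLM available, `classify_poi(0.6, Capabilities(), vlm=lambda: 0.9)` takes the Tier 3 path. With `vlm_available=False` the same call falls back to the Tier 2 score.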

Technology stack: Cython/Python 3.11, TensorRT FP16, NanoLLM (VILA1.5-3B), py_trees 2.4.0, OpenCV, scikit-image, pyserial, Docker on JetPack 6.2

Deployment: Development (x86 workstation with mock services) and Production (Jetson Orin Nano Super on UAV, NVMe SSD, air-gapped)

Component Summary

| # | Component | Purpose | Dependencies | Epic |
|---|-----------|---------|--------------|------|
| H1 | Config | YAML config loading, validation, typed access | | Bootstrap |
| H2 | Types | Shared dataclasses (FrameContext, Detection, POI, etc.) | | Bootstrap |
| 01 | ScanController | BT orchestrator for L1/L2 scan with scenario dispatch | All components | Epic 7 |
| 02 | Tier1Detector | YOLOE TensorRT FP16 detection + segmentation | Config, Types | Epic 2 |
| 03 | Tier2SpatialAnalyzer | Mask tracing (footpaths) + cluster tracing (defense systems) | Config, Types | Epic 3 |
| 04 | VLMClient | IPC client to NanoLLM Docker container (Unix socket) | Config, Types | Epic 4 |
| 05 | GimbalDriver | ViewLink serial protocol + PID path following | Config, Types | Epic 5 |
| 06 | OutputManager | Detection logging, frame recording, operator delivery | Config, Types | Epic 6 |
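To illustrate what "validation, typed access" in the Config helper might look like for a search scenario, here is a minimal sketch. The four investigation types come from this report. The `SearchScenario` class, its field names, and the sample values are hypothetical, and the input dict stands in for whatever a YAML loader would return.

```python
from dataclasses import dataclass

# Investigation types named in the architecture overview.
INVESTIGATION_TYPES = {"path_follow", "cluster_follow", "area_sweep", "zoom_classify"}


@dataclass(frozen=True)
class SearchScenario:
    """Hypothetical typed view of one scenario entry."""
    name: str
    investigation_type: str
    min_confidence: float
    use_vlm: bool = False

    @classmethod
    def from_dict(cls, raw: dict) -> "SearchScenario":
        kind = raw["investigation_type"]
        if kind not in INVESTIGATION_TYPES:
            raise ValueError(f"unknown investigation_type: {kind!r}")
        conf = float(raw.get("min_confidence", 0.5))
        if not 0.0 <= conf <= 1.0:
            raise ValueError("min_confidence must be within [0, 1]")
        return cls(str(raw["name"]), kind, conf, bool(raw.get("use_vlm", False)))


# Stand-in for one parsed YAML entry (all values are illustrative).
raw_scenario = {
    "name": "winter_footpath",
    "investigation_type": "path_follow",
    "min_confidence": 0.4,
    "use_vlm": True,
}
scenario = SearchScenario.from_dict(raw_scenario)
```

Rejecting unknown investigation types at load time keeps a malformed scenario file from reaching the scan controller.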

Implementation order:

  1. Phase 1: Bootstrap (Config + Types + project scaffold)
  2. Phase 2: Tier1Detector, Tier2SpatialAnalyzer, VLMClient, GimbalDriver, OutputManager (parallel)
  3. Phase 3: ScanController (integrates all components)
  4. Phase 4: Integration Tests
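The four phases follow directly from the dependency structure and can be derived mechanically. The sketch below uses the standard-library `graphlib` module. The dependency map is a reading of the component and epic dependencies in this report, and the node name `IntegrationTests` is a label chosen for the example.

```python
from graphlib import TopologicalSorter

# Work item -> set of items it depends on (assumed reading of the report's tables).
PARALLEL_COMPONENTS = {
    "Tier1Detector", "Tier2SpatialAnalyzer", "VLMClient", "GimbalDriver", "OutputManager",
}
DEPENDS_ON = {
    "Bootstrap": set(),
    **{name: {"Bootstrap"} for name in PARALLEL_COMPONENTS},
    "ScanController": {"Bootstrap"} | PARALLEL_COMPONENTS,
    "IntegrationTests": {"ScanController"},
}


def build_phases(depends_on: dict) -> list:
    """Group items into phases; everything within a phase can proceed in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    phases = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all items whose dependencies are done
        phases.append(ready)
        sorter.done(*ready)
    return phases


phases = build_phases(DEPENDS_ON)
```

Here `phases` has four entries: Bootstrap alone, the five parallel components, ScanController, and finally the integration tests.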

System Flows

| Flow | Description | Key Components |
|------|-------------|----------------|
| Main Pipeline | Frame → Tier1 → EvaluatePOI → queue | ScanController, Tier1Detector |
| L2 Path Follow | POI → zoom → trace mask → PID follow → waypoint analysis → VLM (optional) | ScanController, Tier2, GimbalDriver, VLMClient |
| L2 Cluster Follow | Cluster POI → trace cluster → visit waypoints → classify each | ScanController, Tier2, GimbalDriver |
| Health Degradation | Thermal/failure → disable capability → fallback behavior | ScanController |
| Recording | Frame/detection → NVMe write → circular buffer management | OutputManager |

See system-flows.md for full details.
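As a rough illustration of the Main Pipeline flow (Frame → Tier1 → EvaluatePOI → queue), the sketch below filters Tier 1 detections and queues the survivors for L2 investigation. It assumes the queue is ordered by confidence, which this report does not state. `QueuedPOI`, `evaluate_poi`, and the threshold are hypothetical names and values.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedPOI:
    priority: float                       # negated confidence: heapq pops the smallest first
    label: str = field(compare=False)
    frame_id: int = field(compare=False)


def evaluate_poi(detections, frame_id, queue, min_confidence=0.4):
    """Push detections that pass the (assumed) threshold onto the investigation queue."""
    for label, confidence in detections:
        if confidence >= min_confidence:
            heapq.heappush(queue, QueuedPOI(-confidence, label, frame_id))


# Stand-in Tier 1 output for one frame: (class label, confidence) pairs.
queue = []
evaluate_poi(
    [("footpath", 0.82), ("branch_pile", 0.35), ("dark_entrance", 0.91)],
    frame_id=1,
    queue=queue,
)
next_poi = heapq.heappop(queue)  # highest-confidence POI is investigated first
```

In this run the low-confidence `branch_pile` detection is dropped and `dark_entrance` is dequeued before `footpath`.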

Risk Summary

| Level | Count | Key Risks |
|-------|-------|-----------|
| Critical | 1 | R05: Seasonal model generalization (phased rollout mitigates) |
| High | 4 | R01 backbone accuracy, R02 VLM load latency, R03 heuristic FP rate, R04 GPU memory pressure |
| Medium | 4 | R06 config complexity, R07 fragmented masks, R08 ViewLink effort, R09 operator overload |
| Low | 4 | R10 GIL, R11 NVMe writes, R12 py_trees overhead, R13 scenario extensibility |

Iterations completed: 1

All Critical/High risks mitigated: Yes; all have documented mitigation strategies and contingency plans

See risk_mitigations.md for the full register.

Test Coverage

| Component | Integration | Performance | Security | Acceptance | AC Coverage |
|-----------|-------------|-------------|----------|------------|-------------|
| ScanController | 11 tests | 2 tests | 2 tests | 7 tests | 10 ACs |
| Tier1Detector | 5 tests | 3 tests | 1 test | 2 tests | 4 ACs |
| Tier2SpatialAnalyzer | 9 tests | 2 tests | 1 test | 4 tests | 7 ACs |
| VLMClient | 7 tests | 2 tests | 2 tests | 2 tests | 2 ACs |
| GimbalDriver | 9 tests | 3 tests | 1 test | 3 tests | 5 ACs |
| OutputManager | 9 tests | 2 tests | 2 tests | 2 tests | 2 ACs |
| Total | 50 | 14 | 9 | 20 | |

Overall acceptance criteria coverage: 27 / 28 ACs covered (96%)

  • AC-28 (training dataset requirements) not covered — data annotation scope, not runtime behavior
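The totals and the coverage percentage above can be cross-checked with a few lines of arithmetic. The counts are copied from the test coverage table, and the variable names are chosen for this example.

```python
# Per-component test counts: (integration, performance, security, acceptance).
TESTS = {
    "ScanController": (11, 2, 2, 7),
    "Tier1Detector": (5, 3, 1, 2),
    "Tier2SpatialAnalyzer": (9, 2, 1, 4),
    "VLMClient": (7, 2, 2, 2),
    "GimbalDriver": (9, 3, 1, 3),
    "OutputManager": (9, 2, 2, 2),
}

# Column-wise sums reproduce the "Total" row.
totals = tuple(sum(column) for column in zip(*TESTS.values()))

# 27 of 28 acceptance criteria are covered (AC-28 is out of runtime scope).
ac_covered, ac_total = 27, 28
coverage_pct = round(100 * ac_covered / ac_total)

print(totals, f"{coverage_pct}%")  # (50, 14, 9, 20) 96%
```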

Epic Roadmap

| Order | Jira ID | Epic | Component | Effort | Dependencies |
|-------|---------|------|-----------|--------|--------------|
| 1 | AZ-130 | Bootstrap & Initial Structure | Config, Types, scaffold | M (5-8 pts) | |
| 2 | AZ-131 | Tier1Detector | Tier1Detector | M (5-8 pts) | AZ-130 |
| 3 | AZ-132 | Tier2SpatialAnalyzer | Tier2SpatialAnalyzer | M (5-8 pts) | AZ-130 |
| 4 | AZ-133 | VLMClient | VLMClient | S (3-5 pts) | AZ-130 |
| 5 | AZ-134 | GimbalDriver | GimbalDriver | L (8-13 pts) | AZ-130 |
| 6 | AZ-135 | OutputManager | OutputManager | S (3-5 pts) | AZ-130 |
| 7 | AZ-136 | ScanController | ScanController | L (8-13 pts) | AZ-130 to AZ-135 |
| 8 | AZ-137 | Integration Tests | System-level | M (5-8 pts) | AZ-136 |

Total estimated effort: 42-68 story points
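The total is the sum of the per-epic size ranges in the roadmap, which the short check below reproduces. The T-shirt size ranges and epic sizes are taken from the table. The dictionary names are chosen for this example.

```python
# Story-point range for each T-shirt size, as given in the Effort column.
SIZE_POINTS = {"S": (3, 5), "M": (5, 8), "L": (8, 13)}

# Size assigned to each epic in the roadmap.
EPIC_SIZES = {
    "AZ-130": "M", "AZ-131": "M", "AZ-132": "M", "AZ-133": "S",
    "AZ-134": "L", "AZ-135": "S", "AZ-136": "L", "AZ-137": "M",
}

low = sum(SIZE_POINTS[size][0] for size in EPIC_SIZES.values())
high = sum(SIZE_POINTS[size][1] for size in EPIC_SIZES.values())

print(f"{low}-{high} story points")  # 42-68 story points
```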

Key Decisions Made

| # | Decision | Rationale | Alternatives Rejected |
|---|----------|-----------|-----------------------|
| 1 | Three-tier architecture (YOLOE + heuristic + VLM) | Graceful degradation; VLM optional | CNN-based Tier 2 (V2 CNN removed) |
| 2 | NanoLLM replaces vLLM | vLLM unstable on Jetson; NanoLLM purpose-built | vLLM, Ollama |
| 3 | py_trees Behavior Tree for orchestration | Preemptive, extensible, proven | State machine, custom loop |
| 4 | Data-driven YAML search scenarios | New scenarios without code changes | Hardcoded investigation logic |
| 5 | Tier2SpatialAnalyzer (unified mask + cluster) | Single component, two strategies, unified output | Separate PathTracer and ClusterTracer components |
| 6 | No traditional DB: runtime structs + NVMe flat files | Embedded edge device; no DB overhead | SQLite, Redis |
| 7 | Sequential GPU scheduling (YOLOE then VLM) | 8GB shared RAM constraint | Concurrent execution (impossible) |
| 8 | FP16 TRT only (INT8 deferred) | INT8 unstable on Jetson currently | INT8 quantization |
| 9 | Phased seasonal rollout (winter first) | Critical R05 mitigation | Multi-season from day 1 |
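Decision 7 (sequential GPU scheduling) could be enforced on the caller side with a single lock, as in the sketch below. It assumes the VLM request is a blocking call made from the same process that runs YOLOE, so holding the lock for the duration of the call keeps the two workloads from overlapping. `gpu_slot`, `run_yoloe`, `run_vlm`, and `GPU_TRACE` are hypothetical names, and the model calls are stubs.

```python
import threading
from contextlib import contextmanager

_gpu_lock = threading.Lock()
GPU_TRACE = []  # records start/end events so the ordering can be inspected


@contextmanager
def gpu_slot(owner: str):
    """Grant exclusive GPU use to one workload at a time."""
    with _gpu_lock:
        GPU_TRACE.append(f"{owner}:start")
        try:
            yield
        finally:
            GPU_TRACE.append(f"{owner}:end")


def run_yoloe(frame: str) -> str:
    with gpu_slot("yoloe"):
        return f"detections({frame})"  # stub for Tier 1 inference


def run_vlm(crop: str) -> str:
    with gpu_slot("vlm"):
        return f"analysis({crop})"  # stub for the blocking VLM request
```

Even if the two functions are called from different threads, every `start` event in `GPU_TRACE` is immediately followed by the matching `end` event.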

Open Questions

| # | Question | Impact | Assigned To |
|---|----------|--------|-------------|
| 1 | ViewLink protocol: native checksum or CRC-16 wrapper? | GimbalDriver implementation detail | Dev (during spike) |
| 2 | YOLOE-11 vs YOLOE-26: which backbone wins benchmark? | Tier1Detector engine selection | Dev (benchmark sprint) |
| 3 | VILA1.5-3B prompt optimization for concealment detection | VLM accuracy on target domain | Dev + domain expert |
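For question 1, the CRC-16 wrapper option might look like the sketch below. It uses the common CRC-16/CCITT-FALSE variant (polynomial 0x1021, initial value 0xFFFF) purely as an assumption. The checksum that ViewLink actually uses is exactly what the spike has to determine, and the trailing big-endian framing and function names are hypothetical.

```python
import struct


def crc16_ccitt_false(data: bytes, init: int = 0xFFFF) -> int:
    """Bitwise CRC-16/CCITT-FALSE: poly 0x1021, no reflection, no final XOR."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def wrap_frame(payload: bytes) -> bytes:
    """Append the CRC as two big-endian bytes (assumed framing)."""
    return payload + struct.pack(">H", crc16_ccitt_false(payload))


def unwrap_frame(frame: bytes) -> bytes:
    """Verify the trailing CRC and return the payload, or raise ValueError."""
    if len(frame) < 2:
        raise ValueError("frame too short")
    payload = frame[:-2]
    (received,) = struct.unpack(">H", frame[-2:])
    if crc16_ccitt_false(payload) != received:
        raise ValueError("CRC mismatch")
    return payload
```

The standard check input `b"123456789"` yields `0x29B1` for this variant, which gives a quick way to validate an implementation before testing against real hardware.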

Artifact Index

| File | Description |
|------|-------------|
| architecture.md | System architecture, tech stack, ADRs |
| system-flows.md | 5 system flows with sequence descriptions |
| data_model.md | Runtime structs + persistent file formats |
| deployment/containerization.md | Docker strategy |
| deployment/ci_cd_pipeline.md | CI/CD pipeline stages |
| deployment/environment_strategy.md | Dev vs production config |
| deployment/observability.md | Logging, metrics, alerting |
| deployment/deployment_procedures.md | Rollout, rollback, health checks |
| risk_mitigations.md | 13 risks with mitigations |
| components/01_scan_controller/description.md | ScanController spec |
| components/01_scan_controller/tests.md | ScanController test spec |
| components/02_tier1_detector/description.md | Tier1Detector spec |
| components/02_tier1_detector/tests.md | Tier1Detector test spec |
| components/03_tier2_spatial_analyzer/description.md | Tier2SpatialAnalyzer spec |
| components/03_tier2_spatial_analyzer/tests.md | Tier2SpatialAnalyzer test spec |
| components/04_vlm_client/description.md | VLMClient spec |
| components/04_vlm_client/tests.md | VLMClient test spec |
| components/05_gimbal_driver/description.md | GimbalDriver spec |
| components/05_gimbal_driver/tests.md | GimbalDriver test spec |
| components/06_output_manager/description.md | OutputManager spec |
| components/06_output_manager/tests.md | OutputManager test spec |
| common-helpers/01_helper_config.md | Config helper spec |
| common-helpers/02_helper_types.md | Types helper spec |
| integration_tests/environment.md | Test environment spec |
| integration_tests/test_data.md | Test data management |
| integration_tests/functional_tests.md | Functional test scenarios |
| integration_tests/non_functional_tests.md | Non-functional test scenarios |
| integration_tests/traceability_matrix.md | AC-to-test traceability |
| epics.md | Jira epic specifications (AZ-130 through AZ-137) |
| diagrams/components.md | Component diagram, dependency graph, data flow (Mermaid) |
| diagrams/flows/flow_main_pipeline.md | Main pipeline Mermaid diagram |