detections-semantic/_docs/00_research/03_comparison_framework.md
Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit
Made-with: Cursor
2026-03-26 00:20:30 +02:00

Comparison Framework

Selected Framework Type

Problem Diagnosis + Decision Support

Selected Dimensions

  1. Memory Budget Feasibility
  2. YOLO26/YOLOE-26 TRT Deployment Stability
  3. YOLOE-26 Zero-Shot Accuracy for Domain
  4. Path Tracing Algorithm Robustness
  5. VLM Runtime & Integration Viability
  6. Gimbal Control Adequacy
  7. Training Data Realism
  8. Security & Adversarial Resilience

Initial Population

| Dimension | Draft01 Assumption | Researched Reality | Risk Level | Factual Basis |
|---|---|---|---|---|
| Memory Budget | YOLO + YOLOE-26 + CNN + VLM coexist on 8GB | Only ~5.2GB usable VRAM. Single YOLO TRT engine ~2.6GB. Two engines + CNN ≈ 5-6GB. No room for VLM simultaneously. | CRITICAL | Fact #1, #2, #3, #14, #19 |
| YOLO26 TRT Stability | YOLO26-Seg TRT export assumed working | YOLO26 has confirmed confidence misalignment in TRT C++ and INT8 export crashes on Jetson. Active bugs unfixed. | HIGH | Fact #5, #6 |
| YOLOE-26 Zero-Shot | Text prompts "footpath", "branch pile" assumed effective | Trained on LVIS/COCO. Military concealment is far OOD. No published domain benchmarks. Generic prompts may work for "footpath" but not "dugout" or "camouflage netting". | HIGH | Fact #7, #8 |
| Path Tracing | Zhang-Suen skeletonization assumed robust | Classical skeletonization is noise-sensitive: noisy segmentation masks produce spurious branches. GraphMorph/learnable skeletons are more robust alternatives. | MEDIUM | Fact #15, #16 |
| VLM Runtime | vLLM or TRT-LLM assumed viable | TRT-LLM explicitly does not support edge devices. vLLM works but requires careful memory management. VLM cannot run concurrently with YOLO; it must be unloaded and reloaded. | HIGH | Fact #11, #12, #14 |
| VLM Speed | UAV-VL-R1 ≤5s assumed | Cosmos-Reason2-2B: 4.7 tok/s on Orin Nano Super. For 50-100 token response: 10-21s. Significantly exceeds 5s target. | HIGH | Fact #13 |
| Gimbal Control | PID assumed sufficient | PID works for stationary UAV. During flight, Kalman filter needed to compensate attitude/mounting errors. PID alone causes drift. | MEDIUM | Fact #17 |
| Training Data | 1500 images/class in 8 weeks assumed | Realistic for generic objects; challenging for military concealment (access, annotation complexity). Synthetic augmentation (GenCAMO, CamouflageAnything) can significantly help. | MEDIUM | Fact #18 |
| Security | No security measures in draft01 | Small edge YOLO models are more vulnerable to adversarial patches. Physical device capture risk (model weights, logs). PatchBlock defense available. | HIGH | Fact #9, #10 |
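
The Memory Budget and VLM Speed rows both rest on simple arithmetic, which the following minimal sketch reproduces so the figures can be re-checked if the inputs change. The constants are taken from the table above; the function names and the `cnn_gb` parameter are hypothetical, and the CNN footprint is not stated in the table, so it is left as an input. The latency estimate counts decode time only (tokens divided by throughput) and ignores prompt processing, so it should be read as a lower bound.

```python
# Back-of-envelope checks for the Memory Budget and VLM Speed rows.
# Constants come from the table; function names are illustrative only.

USABLE_VRAM_GB = 5.2          # usable VRAM reported in the Memory Budget row
YOLO_TRT_ENGINE_GB = 2.6      # footprint of a single YOLO TRT engine
VLM_TOKENS_PER_S = 4.7        # Cosmos-Reason2-2B on Orin Nano Super
VLM_LATENCY_TARGET_S = 5.0    # draft01 target for a VLM response


def headroom_gb(engine_count: int, cnn_gb: float) -> float:
    """Memory left after loading the detection engines and the CNN.

    cnn_gb is an assumption supplied by the caller; the table only
    states that two engines plus the CNN total roughly 5-6 GB.
    """
    return USABLE_VRAM_GB - (engine_count * YOLO_TRT_ENGINE_GB + cnn_gb)


def response_latency_s(tokens: int) -> float:
    """Decode-only latency for a response of the given length."""
    return tokens / VLM_TOKENS_PER_S


def max_tokens_within(target_s: float) -> int:
    """Longest response (whole tokens) that fits inside target_s."""
    return int(target_s * VLM_TOKENS_PER_S)


if __name__ == "__main__":
    # Two engines alone already consume the usable memory.
    print(f"headroom, 2 engines, no CNN: {headroom_gb(2, 0.0):.2f} GB")
    for n in (50, 100):
        print(f"{n} tokens: {response_latency_s(n):.1f} s")
    print(f"tokens within {VLM_LATENCY_TARGET_S:.0f} s: "
          f"{max_tokens_within(VLM_LATENCY_TARGET_S)}")
```

Under these inputs, two engines leave no headroom before the CNN is even loaded, and only about 23 tokens fit inside the 5s target, which is why both rows are rated at the top of the risk scale.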