mirror of
https://github.com/azaion/detections-semantic.git
synced 2026-04-22 10:26:38 +00:00
8e2ecf50fd
Comparison Framework
Selected Framework Type
Problem Diagnosis + Decision Support
Selected Dimensions
- Memory Budget Feasibility
- YOLO26/YOLOE-26 TRT Deployment Stability
- YOLOE-26 Zero-Shot Accuracy for Domain
- Path Tracing Algorithm Robustness
- VLM Runtime & Integration Viability
- Gimbal Control Adequacy
- Training Data Realism
- Security & Adversarial Resilience
Initial Population
| Dimension | Draft01 Assumption | Researched Reality | Risk Level | Factual Basis |
|---|---|---|---|---|
| Memory Budget | YOLO + YOLOE-26 + CNN + VLM coexist on 8GB | Only ~5.2GB usable VRAM. Single YOLO TRT engine ~2.6GB. Two engines + CNN ≈ 5-6GB. No room for VLM simultaneously. | CRITICAL | Fact #1, #2, #3, #14, #19 |
| YOLO26 TRT Stability | YOLO26-Seg TRT export assumed working | YOLO26 has confirmed confidence misalignment in TRT C++ and INT8 export crashes on Jetson. Active bugs unfixed. | HIGH | Fact #5, #6 |
| YOLOE-26 Zero-Shot | Text prompts "footpath", "branch pile" assumed effective | Trained on LVIS/COCO. Military concealment is far out of distribution (OOD). No published domain benchmarks. Generic prompts may work for "footpath" but not for "dugout" or "camouflage netting". | HIGH | Fact #7, #8 |
| Path Tracing | Zhang-Suen skeletonization assumed robust | Classical skeletonization is noise-sensitive: noisy segmentation masks produce spurious branches. GraphMorph and learnable skeletons are more robust alternatives. | MEDIUM | Fact #15, #16 |
| VLM Runtime | vLLM or TRT-LLM assumed viable | TRT-LLM explicitly does not support edge devices. vLLM works but requires careful memory management. VLM cannot run concurrently with YOLO — must unload/reload. | HIGH | Fact #11, #12, #14 |
| VLM Speed | UAV-VL-R1 ≤5s assumed | Cosmos-Reason2-2B: 4.7 tok/s on Orin Nano Super. For a 50-100 token response: roughly 11-21s. Significantly exceeds the 5s target. | HIGH | Fact #13 |
| Gimbal Control | PID assumed sufficient | PID is adequate for a stationary UAV. In flight, a Kalman filter is needed to compensate for attitude and mounting errors. PID alone causes drift. | MEDIUM | Fact #17 |
| Training Data | 1500 images/class in 8 weeks assumed | Realistic for generic objects; challenging for military concealment (access, annotation complexity). Synthetic augmentation (GenCAMO, CamouflageAnything) can significantly help. | MEDIUM | Fact #18 |
| Security | No security measures in draft01 | Small edge YOLO models are more vulnerable to adversarial patches. Physical device capture risk (model weights, logs). PatchBlock defense available. | HIGH | Fact #9, #10 |
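
The Memory Budget and VLM Speed rows rest on simple arithmetic, which can be reproduced as a minimal sketch. The numeric constants are the figures quoted in the table. The function names and the `cnn_gb` parameter are hypothetical, introduced only for illustration, and the CNN footprint is not stated in the table, so it is left as an input.

```python
# Figures quoted in the table above (Memory Budget and VLM Speed rows).
USABLE_MEMORY_GB = 5.2        # usable memory on the 8GB device
YOLO_TRT_ENGINE_GB = 2.6      # approximate footprint of one YOLO TRT engine
VLM_TOKENS_PER_SECOND = 4.7   # Cosmos-Reason2-2B on Orin Nano Super
VLM_LATENCY_TARGET_S = 5.0    # target assumed in draft01


def detection_footprint_gb(num_engines: int, cnn_gb: float) -> float:
    """Memory used by the TRT engines plus the CNN (cnn_gb is an assumed input)."""
    return num_engines * YOLO_TRT_ENGINE_GB + cnn_gb


def memory_headroom_gb(num_engines: int, cnn_gb: float) -> float:
    """Memory left for anything else, e.g. a VLM. Negative means over budget."""
    return USABLE_MEMORY_GB - detection_footprint_gb(num_engines, cnn_gb)


def vlm_latency_s(num_tokens: int) -> float:
    """Decode time for a response of num_tokens at the measured throughput."""
    return num_tokens / VLM_TOKENS_PER_SECOND


def max_tokens_within_target() -> int:
    """Longest response that still fits inside the latency target."""
    return int(VLM_TOKENS_PER_SECOND * VLM_LATENCY_TARGET_S)


if __name__ == "__main__":
    # Two engines alone already consume the whole usable budget.
    print(f"headroom, 2 engines, no CNN: {memory_headroom_gb(2, 0.0):.1f} GB")
    for tokens in (50, 100):
        print(f"{tokens} tokens: {vlm_latency_s(tokens):.1f} s")
    print(f"tokens within target: {max_tokens_within_target()}")
```

Under these figures, two engines leave no headroom even before the CNN is counted, and only a response of about 23 tokens would fit inside the 5s target. Both results are consistent with the CRITICAL and HIGH ratings in the table.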