mirror of
https://github.com/azaion/detections-semantic.git
synced 2026-04-22 19:26:38 +00:00
Initial commit
Made-with: Cursor
@@ -0,0 +1,28 @@
# Comparison Framework

## Selected Framework Type
Problem Diagnosis + Decision Support

## Selected Dimensions
1. Memory Budget Feasibility
2. YOLO26/YOLOE-26 TRT Deployment Stability
3. YOLOE-26 Zero-Shot Accuracy for Domain
4. Path Tracing Algorithm Robustness
5. VLM Runtime & Integration Viability
6. Gimbal Control Adequacy
7. Training Data Realism
8. Security & Adversarial Resilience
## Initial Population

| Dimension | Draft01 Assumption | Researched Reality | Risk Level | Factual Basis |
|-----------|--------------------|--------------------|------------|---------------|
| Memory Budget | YOLO + YOLOE-26 + CNN + VLM coexist on 8GB | Only ~5.2GB of VRAM is usable. A single YOLO TRT engine takes ~2.6GB, so two engines + CNN ≈ 5-6GB. That leaves no room to run the VLM simultaneously. | **CRITICAL** | Fact #1, #2, #3, #14, #19 |
| YOLO26 TRT Stability | YOLO26-Seg TRT export assumed working | YOLO26 has a confirmed confidence misalignment in TRT C++, and INT8 export crashes on Jetson. These are active bugs that remain unfixed. | **HIGH** | Fact #5, #6 |
| YOLOE-26 Zero-Shot | Text prompts "footpath", "branch pile" assumed effective | Trained on LVIS/COCO. Military concealment is far out of distribution (OOD), and no domain benchmarks have been published. Generic prompts may work for "footpath" but not for "dugout" or "camouflage netting". | **HIGH** | Fact #7, #8 |
| Path Tracing | Zhang-Suen skeletonization assumed robust | Classical skeletonization is noise-sensitive: noisy segmentation masks produce spurious branches. GraphMorph/learnable skeletons are more robust alternatives. | **MEDIUM** | Fact #15, #16 |
| VLM Runtime | vLLM or TRT-LLM assumed viable | TRT-LLM explicitly does not support edge devices. vLLM works but requires careful memory management. The VLM cannot run concurrently with YOLO; the models must be unloaded and reloaded. | **HIGH** | Fact #11, #12, #14 |
| VLM Speed | UAV-VL-R1 ≤5s assumed | Cosmos-Reason2-2B runs at 4.7 tok/s on Orin Nano Super, so a 50-100 token response takes 10-21s. This significantly exceeds the 5s target. | **HIGH** | Fact #13 |
| Gimbal Control | PID assumed sufficient | PID works for a stationary UAV. During flight, a Kalman filter is needed to compensate for attitude and mounting errors; PID alone causes drift. | **MEDIUM** | Fact #17 |
| Training Data | 1500 images/class in 8 weeks assumed | Realistic for generic objects, but challenging for military concealment (access, annotation complexity). Synthetic augmentation (GenCAMO, CamouflageAnything) can help significantly. | **MEDIUM** | Fact #18 |
| Security | No security measures in draft01 | Small edge YOLO models are more vulnerable to adversarial patches. Physical capture of the device exposes model weights and logs. The PatchBlock defense is available. | **HIGH** | Fact #9, #10 |
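
The Memory Budget row comes down to simple arithmetic, which the following minimal sketch makes explicit. The ~5.2GB and ~2.6GB figures come from the table (rounded to MB here); the helper names and the 2000MB VLM footprint are hypothetical, since the table gives no VLM size.

```python
from typing import Dict

USABLE_MB = 5200        # "~5.2GB usable" from the table, rounded to MB
YOLO_ENGINE_MB = 2600   # "~2.6GB" per YOLO TRT engine, from the table


def headroom_mb(resident: Dict[str, int], usable_mb: int = USABLE_MB) -> int:
    """Usable memory minus the sum of resident components (may be negative)."""
    return usable_mb - sum(resident.values())


def can_coload(resident: Dict[str, int], candidate_mb: int,
               usable_mb: int = USABLE_MB) -> bool:
    """True if a new component fits next to everything already resident."""
    return headroom_mb(resident, usable_mb) >= candidate_mb


# Both detection engines resident at once.
detectors = {"yolo26": YOLO_ENGINE_MB, "yoloe26": YOLO_ENGINE_MB}

# Hypothetical VLM footprint, used only for illustration.
ASSUMED_VLM_MB = 2000

print(headroom_mb(detectors))                 # 0
print(can_coload(detectors, ASSUMED_VLM_MB))  # False
print(can_coload({}, ASSUMED_VLM_MB))         # True
```

Under these assumptions the two engines alone consume the whole budget before the CNN is counted, so the VLM fits only after the detectors are unloaded. This is consistent with the unload/reload constraint in the VLM Runtime row.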
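
The 10-21s estimate in the VLM Speed row follows from dividing response length by decode throughput. The sketch below reproduces it under the simplifying assumption that latency is decode time only; prefill and image encoding are ignored, so real latency would be higher. The function names are illustrative.

```python
TOK_PER_S = 4.7   # Cosmos-Reason2-2B on Orin Nano Super, from the table
TARGET_S = 5.0    # latency target assumed in draft01


def decode_seconds(tokens: int, tok_per_s: float = TOK_PER_S) -> float:
    """Time to generate `tokens` output tokens at a fixed decode rate."""
    return tokens / tok_per_s


def max_tokens_within(target_s: float, tok_per_s: float = TOK_PER_S) -> int:
    """Largest whole number of tokens that can be decoded within target_s."""
    return int(target_s * tok_per_s)


print(round(decode_seconds(50), 1))   # 10.6
print(round(decode_seconds(100), 1))  # 21.3
print(max_tokens_within(TARGET_S))    # 23
```

Read the other way, a 5s budget at this throughput allows roughly 23 output tokens, so meeting the target would need a much shorter response or a faster model.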