Mirror of https://github.com/azaion/detections-semantic.git
Synced 2026-04-22 22:26:39 +00:00 · commit 8e2ecf50fd · Made-with: Cursor
# Fact Cards
## Fact #1

- **Statement**: Jetson Orin Nano Super has 7.6GB total unified memory, but only ~3.7GB free GPU memory after OS/system overhead in a Docker container.
- **Source**: Source #21
- **Phase**: Assessment
- **Target Audience**: Edge AI multi-model deployment on Orin Nano Super
- **Confidence**: ✅ High
- **Related Dimension**: Memory contention

## Fact #2

- **Statement**: A single TensorRT engine (YOLOv8-OBB) consumes ~2.6GB on Jetson Orin Nano. cuDNN/CUDA binary loading adds ~940MB-1.1GB overhead per engine initialization.
- **Source**: Source #20
- **Phase**: Assessment
- **Target Audience**: TRT multi-engine memory planning
- **Confidence**: ✅ High
- **Related Dimension**: Memory contention

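Facts #1 and #2 combine into a simple deployment budget check. A minimal sketch, assuming the ~3.7GB free figure from Fact #1 and a flat ~1GB per-engine initialization overhead from Fact #2; the function name and the idea of a flat overhead are illustrative assumptions, not a measured model:

```python
# Hypothetical memory-budget check for planning TensorRT engine deployment
# on a Jetson Orin Nano Super. Numbers are taken from Facts #1 and #2.

FREE_GPU_MEMORY_GB = 3.7      # ~free GPU memory after OS/Docker overhead (Fact #1)
INIT_OVERHEAD_GB = 1.0        # ~940MB-1.1GB cuDNN/CUDA load per engine (Fact #2)

def fits_in_budget(engine_sizes_gb, free_gb=FREE_GPU_MEMORY_GB,
                   overhead_gb=INIT_OVERHEAD_GB):
    """Return (fits, total_gb): do the engines plus per-engine
    initialization overhead fit within free GPU memory?"""
    total = sum(engine_sizes_gb) + overhead_gb * len(engine_sizes_gb)
    return total <= free_gb, round(total, 2)

# One 2.6GB YOLOv8-OBB engine only just fits; a second one is far over budget:
print(fits_in_budget([2.6]))        # (True, 3.6)
print(fits_in_budget([2.6, 2.6]))   # (False, 7.2)
```

On these numbers a single engine leaves essentially no headroom, which is why the multi-model facts below lean toward sharing one engine rather than stacking several.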
## Fact #3

- **Statement**: Running VLA + YOLO detection concurrently on Orin Nano Super is described as "mostly theoretical" in 2025 surveys. GPU sharing causes 10-40% latency jitter.
- **Source**: Source #18
- **Phase**: Assessment
- **Target Audience**: Multi-model concurrent inference
- **Confidence**: ⚠️ Medium (survey, not primary benchmark)
- **Related Dimension**: Memory contention, performance

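A jitter figure like the 10-40% above can be checked against your own latency traces with a spread-over-median metric. A minimal sketch; the p95-over-median convention and the sample latencies are assumptions for illustration, not the survey's methodology:

```python
def jitter_pct(latencies_ms):
    """Latency jitter as the p95-over-median spread, in percent."""
    s = sorted(latencies_ms)
    median = s[len(s) // 2]
    p95 = s[int(0.95 * (len(s) - 1))]
    return 100.0 * (p95 - median) / median

# 20 per-frame latencies with a contended tail (made-up values):
print(jitter_pct([50] * 18 + [65, 70]))  # 30.0
```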
## Fact #4

- **Statement**: NVIDIA recommends using a single TRT engine with async CUDA streams over multiple separate engines for GPU efficiency. Multiple engines need CUDA context push/pop management.
- **Source**: Source #19
- **Phase**: Assessment
- **Target Audience**: TRT engine management
- **Confidence**: ✅ High
- **Related Dimension**: Memory contention, architecture

## Fact #5

- **Statement**: YOLO26 exhibits bounding box drift and inaccurate confidence scores when deployed via TensorRT on Jetson Orin Nano in C++. This is an architecture-specific export issue not present in YOLOv8.
- **Source**: Source #22
- **Phase**: Assessment
- **Target Audience**: YOLO26/YOLOE-26 TRT deployment
- **Confidence**: ✅ High
- **Related Dimension**: YOLOE-26 viability, deployment risk

## Fact #6

- **Statement**: YOLO26n INT8 TensorRT export fails during calibration graph optimization on Jetson Orin with TensorRT 10.3.0 / JetPack 6. ONNX export succeeds but the TRT build crashes.
- **Source**: Source #23
- **Phase**: Assessment
- **Target Audience**: YOLO26 edge deployment
- **Confidence**: ✅ High
- **Related Dimension**: YOLOE-26 viability, deployment risk

## Fact #7

- **Statement**: YOLOE supports multimodal fusion of text + visual prompts with two modes: concat (zero overhead) and weighted-sum (fuse_alpha). This can improve robustness over text-only or visual-only prompts.
- **Source**: Source #30
- **Phase**: Assessment
- **Target Audience**: YOLOE prompt strategy
- **Confidence**: ✅ High
- **Related Dimension**: YOLOE-26 accuracy

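The two fusion modes above can be sketched in a few lines. This is an illustration of the concat vs. weighted-sum (`fuse_alpha`) idea on plain lists, not YOLOE's actual tensor implementation; the function names are assumptions:

```python
def fuse_concat(text_emb, visual_emb):
    """'concat' mode: place prompt embeddings side by side (no extra math)."""
    return text_emb + visual_emb  # list concatenation, zero arithmetic overhead

def fuse_weighted(text_emb, visual_emb, fuse_alpha=0.5):
    """'weighted-sum' mode: blend paired embeddings with fuse_alpha."""
    assert len(text_emb) == len(visual_emb)
    return [fuse_alpha * t + (1 - fuse_alpha) * v
            for t, v in zip(text_emb, visual_emb)]

print(fuse_concat([1.0, 0.0], [0.0, 1.0]))         # [1.0, 0.0, 0.0, 1.0]
print(fuse_weighted([1.0, 0.0], [0.0, 1.0], 0.5))  # [0.5, 0.5]
```

Concat doubles the number of prompt entries downstream; weighted-sum keeps the count fixed but introduces `fuse_alpha` as a tunable trade-off between the two modalities.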
## Fact #8

- **Statement**: YOLOE text prompts are trained on LVIS (1203 categories) and COCO. Military concealment classes (dugouts, branch camouflage, FPV hideouts) are far out-of-distribution from the training data. No published benchmarks exist for this domain.
- **Source**: Sources #2, #3 (inferred from training data descriptions)
- **Phase**: Assessment
- **Target Audience**: YOLOE-26 zero-shot accuracy
- **Confidence**: ⚠️ Medium (inference from known training data)
- **Related Dimension**: YOLOE-26 accuracy

## Fact #9

- **Statement**: Smaller YOLO models (commonly used on edge devices) are more vulnerable to adversarial patch attacks than larger counterparts, creating a latency-security trade-off.
- **Source**: Source #26
- **Phase**: Assessment
- **Target Audience**: Edge AI security
- **Confidence**: ✅ High
- **Related Dimension**: Security

## Fact #10

- **Statement**: PatchBlock is a lightweight CPU-based preprocessing module that recovers up to 77% of model accuracy under adversarial patch attacks with minimal clean-accuracy loss.
- **Source**: Source #24
- **Phase**: Assessment
- **Target Audience**: Edge AI adversarial defense
- **Confidence**: ✅ High
- **Related Dimension**: Security

## Fact #11

- **Statement**: TensorRT-LLM developers explicitly stated they "do not aim to support models on edge devices/platforms" when asked about VLM support on Orin NX.
- **Source**: Source #37
- **Phase**: Assessment
- **Target Audience**: VLM runtime selection
- **Confidence**: ✅ High
- **Related Dimension**: VLM integration

## Fact #12

- **Statement**: vLLM can deploy 2B models on Jetson Orin Nano 8GB. Shared memory must be increased to 8GB. Memory management is critical. The bottleneck is memory bandwidth (68 GB/s), not compute (67 TOPS).
- **Source**: Sources #35, #36
- **Phase**: Assessment
- **Target Audience**: VLM runtime on Jetson
- **Confidence**: ✅ High
- **Related Dimension**: VLM integration

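The bandwidth-bound claim above implies a back-of-envelope decode ceiling: each generated token must stream roughly the full weight set from DRAM, so tok/s ≤ bandwidth / model bytes. A minimal sketch; it deliberately ignores KV-cache reads, the vision encoder, and cache effects, so real throughput lands well below the bound:

```python
BANDWIDTH_GBPS = 68.0  # Jetson Orin Nano Super memory bandwidth (Fact #12)

def decode_ceiling_tok_s(params_billions, bytes_per_param):
    """Upper bound on decode tok/s if every token re-reads all weights."""
    model_gb = params_billions * bytes_per_param
    return BANDWIDTH_GBPS / model_gb

# A 2B model at W4A16 (~0.5 byte/param) streams ~1 GB per token:
print(round(decode_ceiling_tok_s(2.0, 0.5)))  # 68
```

This is an optimistic roofline, not a prediction; VLM figures such as the 4.7 tok/s in Fact #13 sit far below it because of prefill and vision-encoder overhead.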
## Fact #13

- **Statement**: Cosmos-Reason2-2B achieves 4.7 tok/s on Jetson Orin Nano Super with W4A16 quantization. Llama-3.1-8B W4A16 achieves 44.19 tok/s (text-only). VLMs are significantly slower due to vision encoder overhead.
- **Source**: Sources #5, #16
- **Phase**: Assessment
- **Target Audience**: VLM inference speed estimation
- **Confidence**: ✅ High
- **Related Dimension**: VLM integration, performance

## Fact #14

- **Statement**: A 1.5B Q4 model on Jetson Orin Nano Super failed to load because the KV cache temp buffer required 10.7GB while only 6.5GB was available. Model weights alone were only 876MB.
- **Source**: Source #21
- **Phase**: Assessment
- **Target Audience**: VLM memory management
- **Confidence**: ✅ High
- **Related Dimension**: Memory contention, VLM integration

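The failure above is worth sizing in advance: the standard per-token KV cost is 2 (K and V) × layers × kv_heads × head_dim × bytes-per-element, and runtimes often pre-allocate for the maximum context length. A minimal sketch; the example model dimensions are illustrative assumptions, not the failing model's actual configuration:

```python
def kv_cache_gb(layers, kv_heads, head_dim, seq_len, dtype_bytes=2):
    """K/V cache size in GB for one sequence of seq_len tokens (FP16 default)."""
    per_token = 2 * layers * kv_heads * head_dim * dtype_bytes
    return per_token * seq_len / 1e9

# e.g. 28 layers, 2 KV heads, head_dim 128, pre-allocated for 32k tokens:
print(round(kv_cache_gb(28, 2, 128, 32768), 2))  # 0.94
```

Capping the runtime's maximum context length (rather than shrinking the weights, which were only 876MB here) is the lever this formula points at.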
## Fact #15

- **Statement**: Morphological skeletonization suffers from noise-induced boundary variations causing spurious skeletal branches. Recent methods (2025) use scale-space hierarchical simplification for controllable robustness.
- **Source**: Source #31 (related search results)
- **Phase**: Assessment
- **Target Audience**: Path tracing robustness
- **Confidence**: ✅ High
- **Related Dimension**: Path tracing

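Spurious branches show up as short paths from a leaf pixel to the nearest junction, so a basic (non-scale-space) cleanup is to delete leaf branches shorter than a threshold. A minimal sketch under the assumption that the skeleton is given as an adjacency-set graph; this is simple spur pruning, not the hierarchical simplification the 2025 methods describe:

```python
def prune_spurs(adj, min_len):
    """adj: {node: set(neighbors)} for an undirected skeleton graph.
    Repeatedly delete leaf branches shorter than min_len edges."""
    adj = {n: set(nb) for n, nb in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for leaf in [n for n, nb in adj.items() if len(nb) == 1]:
            path = [leaf]
            # walk from the leaf until hitting a junction (degree > 2)
            while len(adj[path[-1]]) <= 2:
                nxt = [n for n in adj[path[-1]] if n not in path]
                if not nxt:
                    break
                path.append(nxt[0])
            if len(path) - 1 < min_len:   # short branch: treat as a spur
                for n in path[:-1]:       # keep the junction itself
                    for nb in adj.pop(n):
                        if nb in adj:
                            adj[nb].discard(n)
                changed = True
                break                     # degrees changed: rescan leaves
    return adj

# A 5-node path with a 1-edge spur ("f") hanging off its middle node:
skel = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d", "f"},
        "d": {"c", "e"}, "e": {"d"}, "f": {"c"}}
print(sorted(prune_spurs(skel, min_len=2)))  # ['a', 'b', 'c', 'd', 'e']
```

The threshold trades noise robustness against losing genuinely short side paths, which is exactly the lack of controllability the scale-space methods aim to fix.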
## Fact #16

- **Statement**: GraphMorph (2025) operates at branch level using a Graph Decoder + SkeletonDijkstra, producing topology-aware centerline masks. It reduces false positives versus pixel-level segmentation approaches.
- **Source**: Source #32
- **Phase**: Assessment
- **Target Audience**: Path extraction algorithms
- **Confidence**: ✅ High
- **Related Dimension**: Path tracing

## Fact #17

- **Statement**: Kalman filter + coordinate transformation in UAV gimbal systems eliminates attitude and mounting errors that PID controllers alone cannot compensate for during flight.
- **Source**: Source #34
- **Phase**: Assessment
- **Target Audience**: Gimbal control algorithm
- **Confidence**: ✅ High
- **Related Dimension**: Gimbal control

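The Kalman-filter half of the approach above reduces, in its simplest scalar form, to the standard predict/update recursion. A minimal sketch; all constants (noise variances, measurements) are illustrative assumptions, and the cited system additionally applies a coordinate transformation between airframe and gimbal frames, which is not shown here:

```python
def kalman_1d(measurements, q=1e-3, r=0.1, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a constant-state model: process noise q,
    measurement noise r. Returns the sequence of filtered estimates."""
    x, p = x0, p0
    out = []
    for z in measurements:
        p += q                # predict: state assumed constant, variance grows
        k = p / (p + r)       # Kalman gain
        x += k * (z - x)      # update with the measurement residual
        p *= (1 - k)          # posterior variance
        out.append(x)
    return out

# Noisy angle readings around a true value of 10 degrees settle toward 10:
est = kalman_1d([10.4, 9.7, 10.2, 9.9, 10.1])
print(est[-1])
```

The filter supplies a smoothed angle estimate; the PID loop then acts on that estimate rather than on raw, jittery gimbal feedback.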
## Fact #18

- **Statement**: Synthetic data generation for camouflage detection is a validated approach: GenCAMO (2026) uses scene graphs + generative models; CamouflageAnything (CVPR 2025) uses controlled out-painting. Both improve detection baselines.
- **Source**: Sources #28, #29
- **Phase**: Assessment
- **Target Audience**: Training data strategy
- **Confidence**: ✅ High
- **Related Dimension**: Training data

## Fact #19

- **Statement**: Usable VRAM on Jetson Orin Nano Super is approximately 5.2GB after OS overhead (not the advertised 8GB). The 8GB is shared between CPU and GPU.
- **Source**: Source #36
- **Phase**: Assessment
- **Target Audience**: Memory budget planning
- **Confidence**: ✅ High
- **Related Dimension**: Memory contention

## Fact #20

- **Statement**: FP8 quantization for Qwen2-VL-2B performs worse than FP16 on vLLM. INT8/W4A16 are the recommended quantization formats for 2B VLMs on constrained hardware.
- **Source**: vLLM Issue #9992
- **Phase**: Assessment
- **Target Audience**: VLM quantization strategy
- **Confidence**: ✅ High
- **Related Dimension**: VLM integration