mirror of
https://github.com/azaion/detections-semantic.git
synced 2026-04-22 22:26:39 +00:00
8e2ecf50fd
Made-with: Cursor
# Question Decomposition

## Original Question

Assess solution_draft01.md for weak points, performance bottlenecks, and security issues, and produce a revised solution draft.
## Active Mode

Mode B — Solution Assessment of draft01

Rationale: solution_draft01.md exists in OUTPUT_DIR. Assessing and improving.
## Problem Context Summary

- Three-tier semantic detection (YOLOE-26 → Spatial Reasoning + CNN → VLM) on Jetson Orin Nano Super (8GB, 67 TOPS)
- Two-level camera scan (wide sweep → detailed investigation) with ViewPro A40 gimbal
- Integration with existing Cython+TRT YOLO detection service
- YOLOE-26 zero-shot bootstrapping → custom YOLO26 fine-tuning transition
- VLM (UAV-VL-R1) as separate process via Unix socket IPC
- Winter-first seasonal rollout
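The three-tier flow above can be sketched as a confidence-gated cascade. Everything below is an illustrative assumption: the tier callables, gate thresholds, and result shape are hypothetical, not taken from the draft.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Detection:
    label: str
    confidence: float

def run_cascade(frame,
                tier1: Callable[[object], List[Detection]],          # YOLOE-26 proposals
                tier2: Callable[[object, Detection], float],         # spatial reasoning + CNN score
                tier3: Callable[[object, Detection], str],           # VLM verdict (expensive)
                t1_gate: float = 0.25,   # assumed: min Tier-1 confidence to proceed
                t2_gate: float = 0.60    # assumed: min Tier-2 score to invoke the VLM
                ) -> List[dict]:
    """Confidence-gated three-tier dispatch (illustrative thresholds)."""
    results = []
    for det in tier1(frame):
        if det.confidence < t1_gate:
            continue                                   # cheap reject at Tier 1
        score = tier2(frame, det)
        verdict = tier3(frame, det) if score >= t2_gate else None
        results.append({"label": det.label, "tier2": score, "vlm": verdict})
    return results
```

The point of the gating on an 8GB device is that the VLM only ever runs on candidates that survive the two cheap tiers.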
## Question Type

**Problem Diagnosis** — root cause analysis of weak points

Combined with **Decision Support** — weighing alternative solutions for identified issues
## Research Subject Boundary Definition
|
||
- **Population**: Edge AI semantic detection pipelines on Jetson-class hardware
|
||
- **Geography**: Deployment in Eastern European winter conditions (Ukraine conflict)
|
||
- **Timeframe**: 2025-2026 technology (YOLO26, YOLOE-26, VLMs, JetPack 6.2)
|
||
- **Level**: Single Jetson Orin Nano Super device (8GB unified memory, 67 TOPS INT8)
|
||
|
||
## Decomposed Sub-Questions
|
||
|
||
### Memory & Resource Contention

1. What is the actual GPU memory footprint of the YOLOE-26s-seg TRT engine + existing YOLO TRT engine + MobileNetV3-Small TRT + UAV-VL-R1 INT8 running on 8GB unified memory?
2. Can two TRT engines (existing YOLO + YOLOE-26) share the same GPU execution context, or do they need separate CUDA streams?
3. Is sequential VLM scheduling (pause YOLO → run VLM → resume) viable without dropping detection frames?
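Sub-question 1 can be framed as budget arithmetic before any on-device work. All footprint numbers below are placeholder guesses, to be replaced with tegrastats/trtexec measurements on the actual device.

```python
# Rough unified-memory budget check. Every figure here is a PLACEHOLDER
# assumption, not a measured value.
BUDGET_MIB = 8 * 1024        # Orin Nano Super unified memory
RESERVED_MIB = 2048          # assumed: OS + JetPack services + CUDA context overhead

footprints_mib = {
    "yoloe26s_seg_trt": 900,       # guess
    "existing_yolo_trt": 700,      # guess
    "mobilenetv3_small_trt": 80,   # guess
    "uav_vl_r1_int8": 2600,        # guess: ~2B params at INT8 + KV cache
}

total = RESERVED_MIB + sum(footprints_mib.values())
headroom = BUDGET_MIB - total
print(f"total={total} MiB, headroom={headroom} MiB")
```

If the measured headroom comes out negative or marginal, that directly forces the sequential-scheduling question (sub-question 3) rather than co-residency.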
### YOLOE-26 Zero-Shot Accuracy

4. How well do YOLOE text prompts perform on out-of-distribution domains (military concealment vs COCO/LVIS training data)?
5. Are visual prompts (SAVPE) more reliable than text prompts for this domain? What are the reference image requirements?
6. What is the fallback if YOLOE-26 zero-shot produces unacceptable false positive rates?
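Sub-question 6 implies a fallback ladder. A minimal sketch, assuming a three-step ladder (text prompts, then SAVPE visual prompts, then class-agnostic proposals verified by the VLM) and an illustrative threshold; the rate used here is the false-discovery fraction among detections, a practical proxy since true negatives are ill-defined for detectors.

```python
def pick_prompt_mode(false_pos: int, true_pos: int,
                     max_fp_rate: float = 0.3) -> str:
    """Choose a detection mode from labeled-holdout counts.

    Ladder (assumed): text prompts -> visual prompts (SAVPE) ->
    class-agnostic proposals verified by the VLM.
    """
    total = false_pos + true_pos
    if total == 0:
        return "insufficient-data"
    fp_rate = false_pos / total          # fraction of detections that are false
    if fp_rate <= max_fp_rate:
        return "text-prompts"
    if fp_rate <= 2 * max_fp_rate:
        return "visual-prompts"
    return "proposals-plus-vlm"
```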
### Path Tracing & Spatial Reasoning

7. How robust is morphological skeletonization on noisy aerial segmentation masks (partial paths, broken segments)?
8. What happens with dense path networks (villages, supply routes)? How to filter relevant paths?
9. Is a 128×128 ROI sufficient for endpoint classification, or does the CNN need more spatial context?
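For sub-questions 7 and 9, endpoint extraction from the skeleton is where mask noise bites first. A dependency-free sketch (the helper name is hypothetical; real code would run on the TRT segmentation output): a skeleton pixel with exactly one 8-connected neighbour is treated as a path endpoint, i.e. a candidate centre for the 128×128 ROI crop.

```python
def skeleton_endpoints(mask):
    """Return (row, col) of skeleton pixels with exactly one 8-neighbour set.

    `mask` is a binary grid (list of lists of 0/1). Spurious short branches
    from noisy masks will also produce endpoints, so a pruning pass is
    needed in practice.
    """
    h, w = len(mask), len(mask[0])
    ends = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            # count set pixels in the 8-neighbourhood, clipped at borders
            n = sum(mask[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, h))
                    for cc in range(max(c - 1, 0), min(c + 2, w))
                    if (rr, cc) != (r, c))
            if n == 1:
                ends.append((r, c))
    return ends
```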
### VLM Integration

10. What is the actual inference latency of a 2B-parameter VLM (INT8) on Jetson Orin Nano Super?
11. Is vLLM the right runtime for Jetson, or should we use TRT-LLM / llama.cpp / MLC-LLM?
12. What is the memory overhead of keeping a VLM loaded but idle vs loading on-demand?
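Sub-question 12 reduces to comparing expected reload cost against the value of the freed memory. A back-of-envelope helper; the load time and tolerance are assumptions, not measurements.

```python
def vlm_policy(calls_per_hour: float,
               load_seconds: float = 45.0,          # assumed: INT8 weight load + warmup
               max_wasted_s_per_hour: float = 120.0  # assumed tolerance for reload stalls
               ) -> str:
    """Return 'resident' or 'on-demand' from expected reload overhead."""
    reload_cost = calls_per_hour * load_seconds   # seconds/hour spent reloading
    return "resident" if reload_cost > max_wasted_s_per_hour else "on-demand"
```

The real decision also hinges on whether the resident-but-idle VLM's memory would otherwise buy larger batch sizes or engine workspace for the YOLO tiers.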
### Gimbal Control & Scan Strategy

13. Is PID control sufficient for path-following, or do we need a more sophisticated controller (Kalman filter, predictive)?
14. What happens when the UAV itself is moving during Level 2 detailed scan? How to compensate?
15. Is the POI queue strategy (20 max, 30s expiry) well-calibrated for typical mission profiles?
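The POI queue policy in sub-question 15 (20 max, 30 s expiry) is small enough to sketch directly; the injectable clock keeps the expiry logic testable. Priority ordering and deduplication are deliberately left out of this sketch.

```python
import time
from collections import deque

class POIQueue:
    """FIFO POI queue with the draft's stated policy: 20 entries, 30 s TTL."""

    def __init__(self, maxlen=20, ttl_s=30.0, clock=time.monotonic):
        self._q = deque(maxlen=maxlen)   # oldest POI is evicted when full
        self._ttl = ttl_s
        self._clock = clock

    def add(self, poi):
        self._q.append((self._clock(), poi))

    def pop(self):
        """Return the oldest non-expired POI, or None if all have expired."""
        now = self._clock()
        while self._q:
            t, poi = self._q.popleft()
            if now - t <= self._ttl:
                return poi               # still fresh
        return None
```

Whether 30 s is right depends on revisit time at Level 1 sweep speed; an expiry shorter than the sweep period silently discards every queued POI.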
### Training Data Strategy

16. Is 1500 images/class realistic for military concealment data? What are actual annotation throughput estimates?
17. Can synthetic data augmentation (cut-paste, style transfer) meaningfully boost concealment detection training?
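Sub-question 16 is answerable to first order with throughput arithmetic. The per-image time and hours-per-day figures below are assumptions to be replaced with a pilot annotation run; segmentation masks take far longer per image than boxes.

```python
def annotator_days(classes: int,
                   images_per_class: int = 1500,   # from the draft's target
                   sec_per_image: float = 90.0,    # assumed: mask + QA pass
                   hours_per_day: float = 6.0      # assumed effective work hours
                   ) -> float:
    """Single-annotator days to hit the per-class image target."""
    total_s = classes * images_per_class * sec_per_image
    return total_s / (hours_per_day * 3600)
```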
### Security

18. What adversarial attack vectors exist against edge-deployed YOLO models?
19. How to protect model weights and inference pipeline on a physical device that could be captured?
20. What operational security measures are needed for the data pipeline (captured imagery, detection logs)?
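For sub-question 19, one cheap concrete measure is an integrity tag over the weight files, so a tampered engine is rejected at load time. A stdlib `hmac` sketch; note this does not protect confidentiality on a captured device, which would need encryption at rest plus secure boot or a hardware keystore.

```python
import hashlib
import hmac

def weight_tag(weights: bytes, key: bytes) -> str:
    """HMAC-SHA256 tag over a serialized engine/weight blob."""
    return hmac.new(key, weights, hashlib.sha256).hexdigest()

def verify_weights(weights: bytes, key: bytes, expected_tag: str) -> bool:
    """Constant-time comparison against the expected tag."""
    return hmac.compare_digest(weight_tag(weights, key), expected_tag)
```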
## Timeliness Sensitivity Assessment

- **Research Topic**: Edge AI resource management, VLM deployment on Jetson, YOLOE accuracy assessment
- **Sensitivity Level**: Critical
- **Rationale**: The tooling (vLLM, TRT-LLM, MLC-LLM for Jetson) is actively evolving, JetPack 6.2 is the latest release, and YOLOE-26 is only weeks old.
- **Source Time Window**: 6 months (Sep 2025 — Mar 2026)
- **Priority official sources**:
  1. NVIDIA Jetson AI Lab (memory/performance benchmarks)
  2. Ultralytics docs (YOLOE-26 accuracy, TRT export)
  3. vLLM / TRT-LLM / MLC-LLM Jetson compatibility docs
  4. TensorRT 10.x memory management documentation