detections-semantic/_docs/00_research/00_question_decomposition.md
Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit
Made-with: Cursor
2026-03-26 00:20:30 +02:00

# Question Decomposition
## Original Question
Assess solution_draft01.md for weak points, performance bottlenecks, security issues, and produce a revised solution draft.
## Active Mode
Mode B — Solution Assessment of draft01
Rationale: solution_draft01.md exists in OUTPUT_DIR. Assessing and improving.
## Problem Context Summary
- Three-tier semantic detection (YOLOE-26 → Spatial Reasoning + CNN → VLM) on Jetson Orin Nano Super (8GB, 67 TOPS)
- Two-level camera scan (wide sweep → detailed investigation) with ViewPro A40 gimbal
- Integration with existing Cython+TRT YOLO detection service
- YOLOE-26 zero-shot bootstrapping → custom YOLO26 fine-tuning transition
- VLM (UAV-VL-R1) as separate process via Unix socket IPC
- Winter-first seasonal rollout
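The three-tier escalation described above can be sketched as a confidence-gated cascade. This is a minimal illustration, not the draft's implementation: the threshold values (`t1_accept`, `t1_reject`) and the tier callables are placeholders to be replaced by the real YOLOE, spatial-reasoning, and VLM stages.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Detection:
    label: str
    confidence: float
    tier: int  # which tier produced the final verdict

def cascade(frame,
            tier1_yoloe: Callable,    # fast zero-shot detector
            tier2_spatial: Callable,  # spatial reasoning + CNN refinement
            tier3_vlm: Callable,      # slow VLM verification
            t1_accept: float = 0.80,  # hypothetical thresholds
            t1_reject: float = 0.30) -> Optional[Detection]:
    """Escalate a frame through the tiers, stopping early when a
    tier is confident enough. Threshold values are placeholders."""
    label, conf = tier1_yoloe(frame)
    if conf >= t1_accept:
        return Detection(label, conf, tier=1)
    if conf < t1_reject:
        return None  # confidently empty: skip the expensive tiers
    label, conf = tier2_spatial(frame)
    if conf >= t1_accept:
        return Detection(label, conf, tier=2)
    # still ambiguous after tier 2: escalate to the VLM as final arbiter
    label, conf = tier3_vlm(frame)
    return Detection(label, conf, tier=3)
```

The early-reject branch matters on 8GB hardware: frames the fast tier is confident are empty never touch the VLM.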
## Question Type
**Problem Diagnosis** — root cause analysis of weak points
Combined with **Decision Support** — weighing alternative solutions for identified issues
## Research Subject Boundary Definition
- **Population**: Edge AI semantic detection pipelines on Jetson-class hardware
- **Geography**: Deployment in Eastern European winter conditions (Ukraine conflict)
- **Timeframe**: 2025-2026 technology (YOLO26, YOLOE-26, VLMs, JetPack 6.2)
- **Level**: Single Jetson Orin Nano Super device (8GB unified memory, 67 TOPS INT8)
## Decomposed Sub-Questions
### Memory & Resource Contention
1. What is the actual GPU memory footprint of YOLOE-26s-seg TRT engine + existing YOLO TRT engine + MobileNetV3-Small TRT + UAV-VL-R1 INT8 running on 8GB unified memory?
2. Can two TRT engines (existing YOLO + YOLOE-26) share the same GPU execution context, or do they need separate CUDA streams?
3. Is sequential VLM scheduling (pause YOLO → run VLM → resume) viable without dropping detection frames?
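Sub-question 1 reduces to budget arithmetic once per-engine footprints are measured. A sketch of that check, with every footprint figure below a placeholder guess to be replaced by values measured via `trtexec`/`tegrastats` on the device (only the 8GB total comes from the draft):

```python
# Hypothetical footprints in MiB -- placeholders, NOT measurements.
FOOTPRINTS_MIB = {
    "yolo_trt_existing": 450,
    "yoloe26s_seg_trt": 700,
    "mobilenetv3_small_trt": 60,
    "uav_vl_r1_int8": 2800,       # ~2B params @ INT8 + KV-cache guess
    "cuda_context_overhead": 600,
}

def fits_budget(footprints: dict, total_mib: int = 8192,
                os_reserved_mib: int = 2048) -> tuple[bool, int]:
    """Check whether concurrently resident engines fit in unified
    memory; returns (fits, remaining headroom in MiB). The OS/desktop
    reservation is an assumption for JetPack with GUI disabled."""
    used = sum(footprints.values())
    headroom = total_mib - os_reserved_mib - used
    return headroom >= 0, headroom
```

On unified memory the CPU side competes for the same pool, so the `os_reserved_mib` term deserves its own measurement, not just the GPU allocations.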
### YOLOE-26 Zero-Shot Accuracy
4. How well do YOLOE text prompts perform on out-of-distribution domains (military concealment vs COCO/LVIS training data)?
5. Are visual prompts (SAVPE) more reliable than text prompts for this domain? What are the reference image requirements?
6. What fallback if YOLOE-26 zero-shot produces unacceptable false positive rates?
### Path Tracing & Spatial Reasoning
7. How robust is morphological skeletonization on noisy aerial segmentation masks (partial paths, broken segments)?
8. What happens with dense path networks (villages, supply routes)? How to filter relevant paths?
9. Is 128×128 ROI sufficient for endpoint classification, or does the CNN need more spatial context?
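For sub-question 9, the ROI-extraction geometry itself is simple; the open question is whether 128×128 carries enough context. A sketch of edge-safe endpoint cropping (shift-at-border rather than shrink, so the CNN always sees a full-size input — a design assumption, not stated in the draft):

```python
def endpoint_roi(cx: int, cy: int, img_w: int, img_h: int,
                 roi: int = 128) -> tuple[int, int, int, int]:
    """Return an (x0, y0, x1, y1) crop of size roi x roi centred on a
    skeleton endpoint, shifted (not shrunk) when it hits the image
    edge, so the classifier input size stays constant."""
    half = roi // 2
    x0 = min(max(cx - half, 0), img_w - roi)
    y0 = min(max(cy - half, 0), img_h - roi)
    return x0, y0, x0 + roi, y0 + roi
```

If 128×128 proves too tight, the same function parameterises a context sweep (e.g. 192, 256) for an ablation.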
### VLM Integration
10. What is the actual inference latency of a 2B-parameter VLM (INT8) on Jetson Orin Nano Super?
11. Is vLLM the right runtime for Jetson, or should we use TRT-LLM / llama.cpp / MLC-LLM?
12. What is the memory overhead of keeping a VLM loaded but idle vs loading on-demand?
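The draft's VLM-as-separate-process design needs a framing convention on the Unix socket. One minimal option (an illustration, not the draft's protocol) is length-prefixed JSON:

```python
import json
import socket
import struct

def send_msg(sock: socket.socket, obj: dict) -> None:
    """Frame a JSON message with a 4-byte big-endian length prefix."""
    payload = json.dumps(obj).encode()
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_msg(sock: socket.socket) -> dict:
    """Read exactly one length-prefixed JSON message (blocking)."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length))

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return buf
```

For real image payloads, sending a shared-memory handle or file path instead of inline pixel data would avoid copying megabytes through the socket per request.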
### Gimbal Control & Scan Strategy
13. Is PID control sufficient for path-following, or do we need a more sophisticated controller (Kalman filter, predictive)?
14. What happens when the UAV itself is moving during Level 2 detailed scan? How to compensate?
15. Is the POI queue strategy (20 max, 30s expiry) well-calibrated for typical mission profiles?
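Sub-question 15 is easiest to probe with the queue mechanics factored out so `max_items` and `ttl_s` can be swept against recorded mission traces. A sketch using the draft's stated parameters (20 max, 30 s expiry); FIFO ordering and reject-when-full are assumptions:

```python
import time
from collections import OrderedDict

class POIQueue:
    """Bounded point-of-interest queue: at most max_items entries,
    each expiring ttl_s seconds after insertion (20 / 30 s per the
    draft). Injected clock makes the expiry logic testable."""
    def __init__(self, max_items: int = 20, ttl_s: float = 30.0,
                 clock=time.monotonic):
        self.max_items, self.ttl_s, self.clock = max_items, ttl_s, clock
        self._items: OrderedDict = OrderedDict()  # id -> (deadline, payload)

    def push(self, poi_id, payload) -> bool:
        self._expire()
        if len(self._items) >= self.max_items:
            return False  # policy choice: could instead evict lowest priority
        self._items[poi_id] = (self.clock() + self.ttl_s, payload)
        return True

    def pop(self):
        self._expire()
        if not self._items:
            return None
        poi_id, (_, payload) = self._items.popitem(last=False)  # FIFO
        return poi_id, payload

    def _expire(self) -> None:
        now = self.clock()
        self._items = OrderedDict(
            (k, v) for k, v in self._items.items() if v[0] > now)
```

Replaying logged POI arrival times through this class would show directly how often the 20-item cap rejects entries and how many POIs expire unvisited at 30 s.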
### Training Data Strategy
16. Is 1500 images/class realistic for military concealment data? What are actual annotation throughput estimates?
17. Can synthetic data augmentation (cut-paste, style transfer) meaningfully boost concealment detection training?
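Sub-question 16 can be bounded with back-of-envelope arithmetic. Only the 1500 images/class figure comes from the draft; class count, seconds per image, annotator count, and productive hours per day below are all assumptions to be replaced with measured pilot rates:

```python
def annotation_days(images_per_class: int = 1500, num_classes: int = 6,
                    sec_per_image: float = 90.0, annotators: int = 2,
                    hours_per_day: float = 6.0) -> float:
    """Rough calendar estimate for the annotation effort. Every rate
    parameter is an assumption; only 1500 images/class is from the
    draft. Segmentation masks typically run far slower than boxes."""
    total_sec = images_per_class * num_classes * sec_per_image
    return total_sec / (annotators * hours_per_day * 3600)
```

Under these placeholder rates the job is roughly 19 working days for two annotators, which makes the "is 1500/class realistic" question concrete: it depends mostly on `sec_per_image` for concealment masks.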
### Security
18. What adversarial attack vectors exist against edge-deployed YOLO models?
19. How to protect model weights and inference pipeline on a physical device that could be captured?
20. What operational security measures are needed for the data pipeline (captured imagery, detection logs)?
## Timeliness Sensitivity Assessment
- **Research Topic**: Edge AI resource management, VLM deployment on Jetson, YOLOE accuracy assessment
- **Sensitivity Level**: Critical
- **Rationale**: Runtimes (vLLM, TRT-LLM, MLC-LLM for Jetson) are actively evolving; JetPack 6.2 is the latest release; YOLOE-26 is only weeks old.
- **Source Time Window**: 6 months (Sep 2025 — Mar 2026)
- **Priority official sources**:
  1. NVIDIA Jetson AI Lab (memory/performance benchmarks)
  2. Ultralytics docs (YOLOE-26 accuracy, TRT export)
  3. vLLM / TRT-LLM / MLC-LLM Jetson compatibility docs
  4. TensorRT 10.x memory management documentation