# Question Decomposition
## Original Question

Assess solution_draft01.md for weak points, performance bottlenecks, and security issues, and produce a revised solution draft.
## Active Mode

Mode B — Solution Assessment of draft01

Rationale: solution_draft01.md exists in OUTPUT_DIR, so this pass assesses the existing draft and produces an improved revision.
## Problem Context Summary

- Three-tier semantic detection (YOLOE-26 → Spatial Reasoning + CNN → VLM) on Jetson Orin Nano Super (8GB, 67 TOPS)
- Two-level camera scan (wide sweep → detailed investigation) with a ViewPro A40 gimbal
- Integration with the existing Cython+TRT YOLO detection service
- Transition from YOLOE-26 zero-shot bootstrapping to custom YOLO26 fine-tuning
- VLM (UAV-VL-R1) running as a separate process via Unix-socket IPC
- Winter-first seasonal rollout
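The three-tier flow above can be sketched as a cascade in which each tier only runs when the cheaper tier before it flags a region. This is an illustrative sketch only; the function names, threshold, and escalation rule are invented stand-ins, not the project's actual API:

```python
# Minimal sketch of a three-tier detection cascade (illustrative only).
# Tier 1 (YOLOE-26) proposes regions; Tier 2 (spatial reasoning + CNN)
# scores them; Tier 3 (VLM) is invoked only for ambiguous survivors.

def run_cascade(frame, tier1, tier2, tier3, vlm_threshold=0.5):
    """Return confirmed detections; escalate ambiguous ones to the VLM."""
    confirmed = []
    for region in tier1(frame):          # cheap, runs every frame
        score = tier2(frame, region)     # mid-cost spatial/CNN check
        if score >= vlm_threshold:
            confirmed.append((region, score))
        elif score > 0.0:                # ambiguous: escalate to the VLM
            if tier3(frame, region):     # expensive, rarely invoked
                confirmed.append((region, score))
    return confirmed

# Toy stand-ins for the three tiers:
detections = run_cascade(
    frame="frame0",
    tier1=lambda f: ["roi_a", "roi_b", "roi_c"],
    tier2=lambda f, r: {"roi_a": 0.9, "roi_b": 0.2, "roi_c": 0.0}[r],
    tier3=lambda f, r: r == "roi_b",
)
print(detections)  # [('roi_a', 0.9), ('roi_b', 0.2)]
```

The key property to preserve in any real implementation is that Tier 3 load is bounded by Tier 2's ambiguity band, not by raw frame rate.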
## Question Type

**Problem Diagnosis** — root-cause analysis of weak points

Combined with **Decision Support** — weighing alternative solutions for the identified issues
## Research Subject Boundary Definition

- **Population**: Edge AI semantic detection pipelines on Jetson-class hardware
- **Geography**: Deployment in Eastern European winter conditions (Ukraine conflict)
- **Timeframe**: 2025-2026 technology (YOLO26, YOLOE-26, VLMs, JetPack 6.2)
- **Level**: Single Jetson Orin Nano Super device (8GB unified memory, 67 TOPS INT8)
## Decomposed Sub-Questions
### Memory & Resource Contention

1. What is the actual GPU memory footprint of YOLOE-26s-seg TRT engine + existing YOLO TRT engine + MobileNetV3-Small TRT + UAV-VL-R1 INT8 running on 8GB unified memory?
2. Can two TRT engines (existing YOLO + YOLOE-26) share the same GPU execution context, or do they need separate CUDA streams?
3. Is sequential VLM scheduling (pause YOLO → run VLM → resume) viable without dropping detection frames?
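Sub-question 1 can be framed before research with a back-of-envelope budget. Every figure below is a placeholder assumption (engine sizes vary with precision, workspace size, and activation memory), not a measurement:

```python
# Rough GPU-memory budget for the 8 GB unified-memory Jetson Orin Nano Super.
# All component sizes are illustrative guesses, NOT measured numbers.

BUDGET_MB = 8192          # total unified memory (shared with the OS and CPU!)
OS_AND_RUNTIME_MB = 2048  # assumed headroom for OS, CUDA context, buffers

components_mb = {
    "existing_yolo_trt_engine": 300,
    "yoloe26s_seg_trt_engine": 450,
    "mobilenetv3_small_trt": 50,
    "uav_vl_r1_int8_2b": 2600,   # ~2B params at INT8 plus KV-cache overhead
    "activation_workspaces": 800,
}

used = sum(components_mb.values()) + OS_AND_RUNTIME_MB
headroom = BUDGET_MB - used
print(f"used {used} MB, headroom {headroom} MB")
```

Even under these optimistic placeholders the headroom is under 2 GB, which is why sub-question 3 (sequential scheduling) matters: the budget likely does not allow all engines resident and active simultaneously.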
### YOLOE-26 Zero-Shot Accuracy

4. How well do YOLOE text prompts perform on out-of-distribution domains (military concealment vs COCO/LVIS training data)?
5. Are visual prompts (SAVPE) more reliable than text prompts for this domain? What are the reference image requirements?
6. What fallback if YOLOE-26 zero-shot produces unacceptable false positive rates?
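Sub-question 6 implies a measurable acceptance gate on a labeled holdout set. A minimal sketch of such a gate, with an invented threshold and invented evaluation interface:

```python
# Illustrative acceptance gate for zero-shot detection quality (Q6).
# The 0.3 false-positive-rate threshold is an invented placeholder.

def zero_shot_acceptable(true_pos, false_pos, max_fp_rate=0.3):
    """Return True if the false-positive rate on a labeled holdout is tolerable."""
    total = true_pos + false_pos
    if total == 0:
        return False  # no detections at all: cannot accept blindly
    return false_pos / total <= max_fp_rate

# e.g. 70 correct vs 30 spurious detections sits right at the threshold:
print(zero_shot_acceptable(70, 30))   # True
print(zero_shot_acceptable(50, 50))   # False
```

The real gate would need per-class rates and a recall floor as well; the point is that "unacceptable" in Q6 should be pinned to a number before the fallback decision is made.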
### Path Tracing & Spatial Reasoning

7. How robust is morphological skeletonization on noisy aerial segmentation masks (partial paths, broken segments)?
8. What happens with dense path networks (villages, supply routes)? How to filter relevant paths?
9. Is a 128×128 ROI sufficient for endpoint classification, or does the CNN need more spatial context?
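Sub-question 7's fragility concern follows from how skeleton endpoints are conventionally found: a skeleton pixel with exactly one 8-connected neighbour is an endpoint, so every break in a noisy mask spawns spurious endpoints. A stdlib-only sketch on a toy grid (assuming this endpoint definition is what the draft's skeletonization step uses):

```python
# Endpoint detection on a binary skeleton (illustrative, stdlib only).
# An endpoint is a skeleton pixel with exactly one 8-connected neighbour,
# so a single break in the mask immediately yields two extra endpoints.

def skeleton_endpoints(grid):
    rows, cols = len(grid), len(grid[0])
    ends = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            neighbours = sum(
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            )
            if neighbours == 1:
                ends.append((r, c))
    return ends

# An unbroken horizontal path has 2 endpoints...
intact = [[0, 0, 0, 0, 0],
          [1, 1, 1, 1, 1],
          [0, 0, 0, 0, 0]]
# ...but one missing pixel (noise) turns it into 4:
broken = [[0, 0, 0, 0, 0],
          [1, 1, 0, 1, 1],
          [0, 0, 0, 0, 0]]
print(len(skeleton_endpoints(intact)), len(skeleton_endpoints(broken)))  # 2 4
```

This is why Q7 should probably be researched together with gap-bridging (morphological closing before skeletonization) rather than in isolation.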
### VLM Integration

10. What is the actual inference latency of a 2B-parameter VLM (INT8) on Jetson Orin Nano Super?
11. Is vLLM the right runtime for Jetson, or should we use TRT-LLM / llama.cpp / MLC-LLM?
12. What is the memory overhead of keeping a VLM loaded but idle vs loading on-demand?
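The Unix-socket IPC from the problem context can be sketched as length-prefixed JSON between the detection process and a stand-in VLM process. The message schema (`roi`, `label`, `confidence`) is an invented placeholder, and `socketpair()` stands in for a named AF_UNIX socket:

```python
# Minimal sketch of length-prefixed JSON over a Unix socket pair, standing
# in for the planned YOLO-process <-> VLM-process IPC (schema is invented).
import json
import socket
import struct
import threading

def send_msg(sock, obj):
    payload = json.dumps(obj).encode()
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_msg(sock):
    header = sock.recv(4, socket.MSG_WAITALL)
    (length,) = struct.unpack("!I", header)
    return json.loads(sock.recv(length, socket.MSG_WAITALL))

def fake_vlm_server(sock):
    """Stand-in for the VLM process: answer one classification request."""
    request = recv_msg(sock)
    send_msg(sock, {"roi": request["roi"], "label": "vehicle", "confidence": 0.8})

client, server = socket.socketpair()   # AF_UNIX stream pair on POSIX
threading.Thread(target=fake_vlm_server, args=(server,)).start()
send_msg(client, {"roi": [10, 20, 64, 64]})
reply = recv_msg(client)
print(reply["label"])  # vehicle
```

Length-prefixed framing matters here because SOCK_STREAM gives no message boundaries; whatever runtime Q11 selects, the framing layer is independent of it.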
### Gimbal Control & Scan Strategy

13. Is PID control sufficient for path-following, or do we need a more sophisticated controller (Kalman filtering, predictive control)?
14. What happens when the UAV itself is moving during a Level 2 detailed scan? How to compensate?
15. Is the POI queue strategy (20 max, 30s expiry) well-calibrated for typical mission profiles?
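Sub-question 15 can be probed with a toy model of the stated policy (capacity 20, 30 s expiry). The priority field and the injected clock are assumptions added for illustration and testability:

```python
# Toy model of the draft's POI queue policy (capacity 20, 30 s expiry),
# with an injected clock so the expiry behaviour is deterministic.
import heapq

MAX_POIS = 20
EXPIRY_S = 30.0

class PoiQueue:
    def __init__(self):
        self._heap = []  # entries: (-priority, created_at, poi)

    def add(self, poi, priority, now):
        heapq.heappush(self._heap, (-priority, now, poi))
        if len(self._heap) > MAX_POIS:
            # Over capacity: drop the lowest-priority entry.
            self._heap.remove(max(self._heap))
            heapq.heapify(self._heap)

    def pop(self, now):
        """Return the highest-priority unexpired POI, or None."""
        while self._heap:
            _, created_at, poi = heapq.heappop(self._heap)
            if now - created_at <= EXPIRY_S:
                return poi
        return None

q = PoiQueue()
q.add("old_high_priority", priority=5, now=0.0)
q.add("new_low_priority", priority=1, now=25.0)
print(q.pop(now=40.0))  # the high-priority POI has expired -> new_low_priority
```

Running mission traces through a model like this is a cheap way to see whether the 30 s expiry silently discards high-priority POIs during long Level 2 investigations.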
### Training Data Strategy

16. Is 1500 images/class realistic for military concealment data? What are actual annotation throughput estimates?
17. Can synthetic data augmentation (cut-paste, style transfer) meaningfully boost concealment detection training?
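Sub-question 16 ultimately reduces to arithmetic once throughput is measured. A sketch with invented figures (every number below except the 1500 images/class target is a placeholder to be replaced with measured rates):

```python
# Back-of-envelope annotation effort for the 1500 images/class target.
# Class count and rates are invented placeholders, not measurements.

IMAGES_PER_CLASS = 1500   # from the draft
NUM_CLASSES = 4           # assumed number of concealment classes
SECONDS_PER_IMAGE = 90    # assumed: bbox + mask annotation per image
HOURS_PER_DAY = 6         # assumed productive annotation hours per day

total_images = IMAGES_PER_CLASS * NUM_CLASSES
total_hours = total_images * SECONDS_PER_IMAGE / 3600
annotator_days = total_hours / HOURS_PER_DAY
print(f"{total_images} images ≈ {total_hours:.0f} h ≈ {annotator_days:.0f} annotator-days")
```

Even these optimistic placeholders give roughly a person-month of annotation, which is the quantitative context in which Q17 (synthetic augmentation) should be weighed.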
### Security

18. What adversarial attack vectors exist against edge-deployed YOLO models?
19. How to protect model weights and inference pipeline on a physical device that could be captured?
20. What operational security measures are needed for the data pipeline (captured imagery, detection logs)?
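One small, standard building block relevant to sub-questions 19-20 is verifying model artifacts against pinned hashes at startup. This addresses tampering/substitution detection only, not weight confidentiality on a captured device; the file name below is hypothetical:

```python
# Startup integrity check: compare model artifacts against pinned SHA-256
# digests. Detects tampering/substitution; does NOT protect confidentiality.
import hashlib
from pathlib import Path

def sha256_file(path):
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifacts(pinned):
    """pinned: {path: expected_hex}. Return the list of failing paths."""
    return [p for p, expected in pinned.items() if sha256_file(p) != expected]

# Demo with a temporary stand-in for an engine file:
demo = Path("demo_engine.bin")
demo.write_bytes(b"weights")
pinned = {demo: hashlib.sha256(b"weights").hexdigest()}
print(verify_artifacts(pinned))  # [] -> all artifacts intact
demo.unlink()
```

Confidentiality of weights against physical capture (Q19) needs different mechanisms entirely (at-rest encryption, secure boot), which is what the research should focus on.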
## Timeliness Sensitivity Assessment

- **Research Topic**: Edge AI resource management, VLM deployment on Jetson, YOLOE accuracy assessment
- **Sensitivity Level**: Critical
- **Rationale**: Tools (vLLM, TRT-LLM, MLC-LLM for Jetson) are actively evolving; JetPack 6.2 is the latest release and YOLOE-26 is only weeks old.
- **Source Time Window**: 6 months (Sep 2025 — Mar 2026)
- **Priority official sources**:
  1. NVIDIA Jetson AI Lab (memory/performance benchmarks)
  2. Ultralytics docs (YOLOE-26 accuracy, TRT export)
  3. vLLM / TRT-LLM / MLC-LLM Jetson compatibility docs
  4. TensorRT 10.x memory management documentation