# Tech Stack Evaluation

## Requirements Analysis

### Functional Requirements

- Real-time open-vocabulary detection from UAV aerial imagery
- Footpath segmentation and path tracing with endpoint analysis
- Binary concealment classification on ROI crops
- On-demand VLM analysis for ambiguous detections
- Camera gimbal control with path-following
- Integration with existing Cython+TRT YOLO pipeline

### Non-Functional Requirements

- Tier 1 inference ≤15ms, Tier 2 ≤200ms
- 5.2GB usable VRAM budget (Jetson Orin Nano Super 8GB)
- Field-deployable: thermal resilience, tamper protection
- Offline operation (no cloud dependency)

### Constraints

- Jetson Orin Nano Super: 67 TOPS INT8, 8GB LPDDR5 unified, 102 GB/s bandwidth
- JetPack 6.2, CUDA 12.6, TensorRT 10.3
- Existing codebase: Cython + TensorRT (must extend, not replace)
- ViewPro A40 camera with ViewLink Serial Protocol V3.3.3

## Technology Evaluation

### Detection Framework

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **YOLOE-v8-seg (Ultralytics)** | 9/10 (open-vocab + segmentation) | 9/10 (YOLOv8 TRT proven) | 7/10 (PatchBlock compatible) | 9/10 (existing Cython+TRT expertise) | Free | 8/10 | **8.5** |
| YOLOE-26-seg (Ultralytics) | 10/10 (latest arch, NMS-free) | 4/10 (TRT bugs on Jetson) | 7/10 | 7/10 (new arch, less familiar) | Free | 9/10 | **6.5** |
| YOLO-World v2 | 7/10 (open-vocab, no seg) | 7/10 (stable but older) | 7/10 | 8/10 | Free | 7/10 | **7.0** |

**Selected**: YOLOE-v8-seg. Upgrade path to YOLOE-26 once its TRT issues are resolved.
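
Because the selected detector also produces segmentation masks, the path-tracing requirement (skeletonization followed by endpoint analysis) can operate on its footpath mask. The following is a minimal, dependency-free sketch of the endpoint step only. It assumes a one-pixel-wide skeleton has already been computed (the stack plans scikit-image for that part), and the function name `find_endpoints` and the sample grid are hypothetical, not taken from the existing codebase.

```python
from typing import List, Tuple

Grid = List[List[int]]


def find_endpoints(skeleton: Grid) -> List[Tuple[int, int]]:
    """Return (row, col) of skeleton pixels that have exactly one
    8-connected skeleton neighbour, i.e. the ends of a traced path."""
    rows = len(skeleton)
    cols = len(skeleton[0]) if rows else 0
    endpoints = []
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and skeleton[rr][cc]:
                        neighbours += 1
            if neighbours == 1:
                endpoints.append((r, c))
    return endpoints


# Hypothetical skeleton of an L-shaped footpath (1 = path pixel).
SKELETON = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

print(find_endpoints(SKELETON))
```

In a real pipeline the endpoints would likely become ROI centres for the concealment classifier; a vectorized version (for example a convolution that counts neighbours) would be preferable at full mask resolution.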
### CNN Classifier

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **MobileNetV3-Small** | 9/10 (binary classification, tiny) | 10/10 (battle-tested) | 8/10 | 9/10 | Free | 8/10 | **9.0** |
| EfficientNet-B0 | 8/10 (slightly more accurate) | 10/10 | 8/10 | 8/10 | Free | 7/10 (larger) | **8.0** |
| ResNet-18 | 7/10 (overkill for binary) | 10/10 | 8/10 | 9/10 | Free | 6/10 (44MB) | **7.5** |

**Selected**: MobileNetV3-Small. ~50MB TRT FP16. Best size/accuracy trade-off.

### VLM

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **Moondream 0.5B INT4** | 7/10 (detect()/point() APIs) | 7/10 (active development) | 8/10 (local only) | 7/10 (new, learning curve) | Free | 9/10 (816 MiB) | **7.5** |
| SmolVLM2-500M | 6/10 (no detect API) | 6/10 (newer) | 8/10 | 6/10 | Free | 8/10 (1.8GB) | **6.5** |
| UAV-VL-R1 2B | 9/10 (aerial-specialized) | 5/10 (not tested on Jetson) | 8/10 | 5/10 | Free | 4/10 (2.5GB) | **5.5** |
| No VLM (MVP) | 5/10 (no fallback) | 10/10 | 10/10 | 10/10 | Free | 10/10 | **8.0** |

**Selected**: Moondream 0.5B for the VLM tier. "No VLM" remains the MVP fallback if Moondream proves insufficient.
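
To make the tiering concrete, the sketch below shows one way the classifier output could be routed: confident results are accepted or rejected directly, and only the ambiguous band is escalated to the VLM. The thresholds, the `RoutingConfig` and `route` names, and the choice to flag ambiguous detections for the operator in the no-VLM MVP are all assumptions made for illustration; none of them come from the evaluation itself.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    ESCALATE_TO_VLM = "escalate_to_vlm"
    FLAG_FOR_OPERATOR = "flag_for_operator"


@dataclass(frozen=True)
class RoutingConfig:
    # Hypothetical thresholds; real values would come from validation data.
    reject_below: float = 0.30
    accept_above: float = 0.80
    # False models the "No VLM (MVP)" fallback.
    vlm_enabled: bool = True


def route(confidence: float, cfg: RoutingConfig = RoutingConfig()) -> Action:
    """Map a concealment-classifier confidence in [0, 1] to a next step."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within [0, 1]")
    if confidence >= cfg.accept_above:
        return Action.ACCEPT
    if confidence < cfg.reject_below:
        return Action.REJECT
    # Ambiguous band: use the VLM tier if present, otherwise defer to a human.
    return Action.ESCALATE_TO_VLM if cfg.vlm_enabled else Action.FLAG_FOR_OPERATOR


print(route(0.55))
print(route(0.55, RoutingConfig(vlm_enabled=False)))
```

Keeping the VLM behind a single switch means the MVP fallback is a configuration change rather than a code path that has to be removed.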
### VLM Runtime

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **ONNX Runtime** | 8/10 (lightweight, cross-platform) | 9/10 | 8/10 | 8/10 | Free | 9/10 | **8.5** |
| vLLM | 7/10 (server-oriented, overkill for 0.5B) | 8/10 (Jetson compatible) | 7/10 | 6/10 (complex setup) | Free | 7/10 | **7.0** |
| PyTorch direct | 7/10 (simplest integration) | 10/10 | 8/10 | 9/10 | Free | 6/10 (no optimization) | **7.5** |
| MLC-LLM | 6/10 (declining adoption) | 5/10 | 7/10 | 5/10 | Free | 7/10 | **5.5** |

**Selected**: ONNX Runtime for Moondream 0.5B. Lightweight, no server overhead.

### Gimbal Control

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **filterpy (Kalman) + servopilot (PID)** | 9/10 (cascade control) | 8/10 (proven libraries) | 8/10 | 7/10 (Kalman is new) | Free | 8/10 | **8.0** |
| Custom Kalman + PID from scratch | 8/10 | 5/10 (unproven) | 8/10 | 6/10 | Free | 7/10 | **6.5** |
| PID only (servopilot) | 6/10 (no drift compensation) | 9/10 | 8/10 | 9/10 | Free | 7/10 | **7.5** |

**Selected**: filterpy + servopilot cascade.

### Adversarial Defense

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **PatchBlock** | 9/10 (designed for edge YOLO) | 7/10 (2026 paper) | 9/10 | 7/10 | Free | 9/10 (CPU-based) | **8.0** |
| Custom input validation | 5/10 (ad-hoc) | 3/10 | 6/10 | 8/10 | Free | 7/10 | **5.5** |
| None | 0/10 | 10/10 | 0/10 | 10/10 | Free | 10/10 | **3.0** |

**Selected**: PatchBlock. Integrate as a CPU preprocessing step.
### Synthetic Data Generation

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **CamouflageAnything** | 8/10 (CVPR 2025, camouflage-specific) | 7/10 | 8/10 | 6/10 | Free | 8/10 | **7.5** |
| GenCAMO | 8/10 (environment-aware, 2026) | 6/10 (newer) | 8/10 | 6/10 | Free | 8/10 | **7.0** |
| Cut-paste augmentation | 6/10 (simple but effective) | 10/10 | 8/10 | 9/10 | Free | 7/10 | **7.5** |

**Selected**: CamouflageAnything (primary) + cut-paste (supplementary).

## Tech Stack Summary

| Layer | Technology | Version | Justification |
|-------|-----------|---------|---------------|
| Hardware | Jetson Orin Nano Super | 8GB | Existing constraint |
| OS / SDK | JetPack | 6.2 | Latest for Orin Nano Super |
| GPU Runtime | TensorRT | 10.3 (FP16) | Existing pipeline, proven stability |
| Detection | YOLOE-v8-seg | Ultralytics ≥8.4 | Stable TRT, open-vocab + segmentation |
| Classifier | MobileNetV3-Small | torchvision → TRT FP16 | Tiny footprint, binary classification |
| VLM | Moondream 0.5B | INT4, ONNX | 816 MiB, detect()/point() APIs |
| VLM Runtime | ONNX Runtime | ≥1.17 | Lightweight, no server overhead |
| Path Tracing | OpenCV + scikit-image | OpenCV 4.x, skimage 0.22+ | Preprocessing + skeletonization |
| Gimbal Kalman | filterpy | ≥1.4 | Kalman filter state estimation |
| Gimbal PID | servopilot | latest | Anti-windup PID, dual-axis |
| Serial | pyserial | ≥3.5 | ViewLink protocol communication |
| Adversarial Defense | PatchBlock | 2026 release | CPU-based, edge-optimized |
| Synthetic Data | CamouflageAnything | CVPR 2025 | Camouflage-specific generation |
| Encryption | LUKS / dm-crypt | Linux kernel | Model weight encryption at rest |
| Core Language | Cython + Python | 3.10+ | Existing codebase extension |

## Risk Assessment

| Technology | Risk | Mitigation |
|-----------|------|------------|
| YOLOE-v8-seg | Lower accuracy than YOLOE-26 | Monitor the YOLOE-26 TRT fix; upgrade when stable |
| Moondream 0.5B | Untested for aerial concealment | Empirical testing in Week 8; fall back to no-VLM MVP |
| PatchBlock | New (2026), limited field testing | Can be disabled if it causes false positives; low integration risk |
| filterpy Kalman | Team unfamiliar | Well-documented library; standard aerospace algorithm |
| CamouflageAnything | Synthetic-to-real domain gap | Supplement with real data; validate FP/FN rates |
| Demand-loaded VLM | 30-45s detection pause | Batch requests; operator-triggered only; async notification |
| ONNX Runtime on Jetson | Less optimized than TRT for vision models | For a 0.5B model, the ONNX overhead is acceptable |

## Learning Requirements

| Technology | Effort | Who |
|-----------|--------|-----|
| YOLOE visual prompts (SAVPE) | Low (API-based) | Detection engineer |
| Moondream detect()/caption() | Low (simple API) | ML engineer |
| filterpy Kalman filter | Medium (state estimation theory) | Controls engineer |
| PatchBlock integration | Low (preprocessing module) | Detection engineer |
| CamouflageAnything pipeline | Medium (generative model setup) | Data engineer |
| LUKS encryption + secure boot | Medium (Linux security) | DevOps / platform |
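
Since the Kalman filter is the main learning item for the gimbal work, the following single-axis sketch illustrates the cascade idea: a constant-velocity Kalman filter smooths a jittery target bearing, and a PID loop with anti-windup turns the smoothed error into a rate command. It is a dependency-free stand-in written for illustration; it does not use the filterpy or servopilot APIs, and the gains, noise values, rate limit, and class names are all hypothetical.

```python
class Kalman1D:
    """Constant-velocity Kalman filter over [angle, rate]; measures angle only."""

    def __init__(self, q: float = 1e-3, r: float = 0.25):
        self.x, self.v = 0.0, 0.0  # angle (deg), rate (deg/s)
        self.p00, self.p01, self.p10, self.p11 = 100.0, 0.0, 0.0, 1.0
        self.q, self.r = q, r      # process / measurement noise variances

    def step(self, z: float, dt: float) -> float:
        # Predict with F = [[1, dt], [0, 1]] and Q = diag(q, q).
        self.x += self.v * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        # Update with H = [1, 0].
        s = p00 + self.r
        k0, k1 = p00 / s, p10 / s
        y = z - self.x
        self.x += k0 * y
        self.v += k1 * y
        self.p00, self.p01 = (1 - k0) * p00, (1 - k0) * p01
        self.p10, self.p11 = p10 - k1 * p00, p11 - k1 * p01
        return self.x


class PID:
    """PID with output clamping and conditional-integration anti-windup."""

    def __init__(self, kp: float, ki: float, kd: float, out_limit: float):
        self.kp, self.ki, self.kd, self.out_limit = kp, ki, kd, out_limit
        self.integral, self.prev_error = 0.0, 0.0

    def step(self, error: float, dt: float) -> float:
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        candidate = self.integral + error * dt
        out = self.kp * error + self.ki * candidate + self.kd * derivative
        if abs(out) > self.out_limit:
            # Saturated: clamp and do not accumulate the integral.
            return self.out_limit if out > 0 else -self.out_limit
        self.integral = candidate
        return out


def simulate(target_deg: float = 10.0, dt: float = 0.05, steps: int = 200) -> float:
    """Drive a simulated gimbal axis toward a static, jittery target bearing."""
    kf = Kalman1D()
    pid = PID(kp=4.0, ki=4.0, kd=0.0, out_limit=20.0)  # limit in deg/s
    gimbal = 0.0
    for k in range(steps):
        jitter = 0.5 if k % 2 == 0 else -0.5   # deterministic detection noise
        estimate = kf.step(target_deg + jitter, dt)
        rate_cmd = pid.step(estimate - gimbal, dt)
        gimbal += rate_cmd * dt                # ideal rate-controlled axis
    return gimbal


print(f"final gimbal angle: {simulate():.2f} deg")
```

The filter sits in front of the controller so that frame-to-frame detection jitter is smoothed before it reaches the PID loop, which is the motivation the evaluation gives for preferring the cascade over PID alone.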