Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit
Made-with: Cursor
2026-03-26 00:20:30 +02:00

Tech Stack Evaluation

Requirements Analysis

Functional Requirements

  • Real-time open-vocabulary detection from UAV aerial imagery
  • Footpath segmentation and path tracing with endpoint analysis
  • Binary concealment classification on ROI crops
  • On-demand VLM analysis for ambiguous detections
  • Camera gimbal control with path-following
  • Integration with existing Cython+TRT YOLO pipeline
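The endpoint analysis in the path-tracing requirement can be illustrated with a small sketch. It assumes the footpath mask has already been reduced to a one-pixel-wide skeleton (for example by a skeletonization step) and is available as a nested list of 0/1 values; the function name `skeleton_endpoints` is hypothetical and not part of the existing pipeline. A pixel counts as an endpoint when it has exactly one 8-connected skeleton neighbour.

```python
from typing import List, Tuple

Grid = List[List[int]]  # 1 = skeleton pixel, 0 = background


def skeleton_endpoints(skel: Grid) -> List[Tuple[int, int]]:
    """Return (row, col) of skeleton pixels with exactly one 8-connected neighbour."""
    rows = len(skel)
    cols = len(skel[0]) if rows else 0
    endpoints = []
    for r in range(rows):
        for c in range(cols):
            if not skel[r][c]:
                continue
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the pixel itself
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and skel[rr][cc]:
                        neighbours += 1
            if neighbours == 1:
                endpoints.append((r, c))
    return endpoints


# Toy skeleton: a path that runs down, then bends diagonally to the right.
path = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(skeleton_endpoints(path))  # [(1, 1), (3, 3)]
```

On device, the same neighbour count would more likely be computed with a convolution over the skeleton array; the nested loops are used here only to keep the sketch dependency-free.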

Non-Functional Requirements

  • Tier 1 inference ≤15ms, Tier 2 ≤200ms
  • 5.2GB usable VRAM budget (Jetson Orin Nano Super 8GB)
  • Field-deployable: thermal resilience, tamper protection
  • Offline operation (no cloud dependency)
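The VRAM budget above lends itself to a simple bookkeeping check. The sketch below assumes the 5.2GB figure is meant in binary units (GiB) and uses the 816 MiB Moondream footprint quoted later in this document; every other number is a placeholder that would need to be measured on the device, and the helper name `vram_headroom_mib` is hypothetical.

```python
MIB_PER_GIB = 1024.0


def vram_headroom_mib(budget_gib: float, resident_mib: dict) -> float:
    """Return the MiB left over if every listed component is resident at once."""
    return budget_gib * MIB_PER_GIB - sum(resident_mib.values())


RESIDENT_MIB = {
    "moondream_0_5b_int4": 816.0,        # figure quoted in this document
    "yoloe_v8_seg_engine": 400.0,        # placeholder, measure on device
    "classifier_engine": 16.0,           # placeholder, measure on device
    "cuda_context_and_buffers": 600.0,   # placeholder, measure on device
}

headroom = vram_headroom_mib(5.2, RESIDENT_MIB)
fits = headroom >= 0
print(f"headroom: {headroom:.1f} MiB, fits: {fits}")
```

A check like this could run at startup so that a model swap which exceeds the budget fails loudly instead of degrading Tier 1 latency.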

Constraints

  • Jetson Orin Nano Super: 67 TOPS INT8 (sparse), 8GB LPDDR5 unified, 102 GB/s bandwidth in Super mode (68 GB/s is the pre-Super figure)
  • JetPack 6.2, CUDA 12.6, TensorRT 10.3
  • Existing codebase: Cython + TensorRT (must extend, not replace)
  • ViewPro A40 camera with ViewLink Serial Protocol V3.3.3

Technology Evaluation

Detection Framework

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOE-v8-seg (Ultralytics) | 9/10 (open-vocab + segmentation) | 9/10 (YOLOv8 TRT proven) | 7/10 (PatchBlock compatible) | 9/10 (existing Cython+TRT expertise) | Free | 8/10 | 8.5 |
| YOLOE-26-seg (Ultralytics) | 10/10 (latest arch, NMS-free) | 4/10 (TRT bugs on Jetson) | 7/10 | 7/10 (new arch, less familiar) | Free | 9/10 | 6.5 |
| YOLO-World v2 | 7/10 (open-vocab, no seg) | 7/10 (stable but older) | 7/10 | 8/10 | Free | 7/10 | 7.0 |

Selected: YOLOE-v8-seg. Upgrade to YOLOE-26 once its TensorRT issues on Jetson are resolved.
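This document does not state how the Score column is derived from the per-criterion ratings. As an illustration only, the sketch below computes a weighted mean over the five numeric criteria for the detection options, using equal weights as an assumption (Cost is "Free" for every option and is left out). The names `weighted_score` and `EQUAL_WEIGHTS` are hypothetical.

```python
from typing import Dict

CRITERIA = ("fitness", "maturity", "security", "team_fit", "scalability")


def weighted_score(ratings: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of the per-criterion ratings (each on a 0-10 scale)."""
    total_weight = sum(weights[c] for c in CRITERIA)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(ratings[c] * weights[c] for c in CRITERIA) / total_weight


EQUAL_WEIGHTS = {c: 1.0 for c in CRITERIA}  # assumption, not from the source

# Ratings copied from the Detection Framework table.
DETECTION_OPTIONS = {
    "YOLOE-v8-seg": {"fitness": 9, "maturity": 9, "security": 7, "team_fit": 9, "scalability": 8},
    "YOLOE-26-seg": {"fitness": 10, "maturity": 4, "security": 7, "team_fit": 7, "scalability": 9},
    "YOLO-World v2": {"fitness": 7, "maturity": 7, "security": 7, "team_fit": 8, "scalability": 7},
}

ranking = sorted(
    DETECTION_OPTIONS,
    key=lambda name: weighted_score(DETECTION_OPTIONS[name], EQUAL_WEIGHTS),
    reverse=True,
)
for name in ranking:
    print(name, round(weighted_score(DETECTION_OPTIONS[name], EQUAL_WEIGHTS), 2))
```

With equal weights YOLOE-v8-seg still ranks first, but the equal-weight means (8.4, 7.4, 7.2) differ from the published scores, most visibly for YOLOE-26-seg (7.4 versus 6.5). That suggests maturity was weighted more heavily or the scores were adjusted by judgment; recording the weights explicitly would make the tables reproducible.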

CNN Classifier

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MobileNetV3-Small | 9/10 (binary classification, tiny) | 10/10 (battle-tested) | 8/10 | 9/10 | Free | 8/10 | 9.0 |
| EfficientNet-B0 | 8/10 (slightly more accurate) | 10/10 | 8/10 | 8/10 | Free | 7/10 (larger) | 8.0 |
| ResNet-18 | 7/10 (overkill for binary) | 10/10 | 8/10 | 9/10 | Free | 6/10 (44MB) | 7.5 |

Selected: MobileNetV3-Small. ~5MB in TRT FP16 (about 2.5M parameters at 2 bytes each, excluding engine overhead). Best size/accuracy trade-off.

VLM

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Moondream 0.5B INT4 | 7/10 (detect()/point() APIs) | 7/10 (active development) | 8/10 (local only) | 7/10 (new, learning curve) | Free | 9/10 (816 MiB) | 7.5 |
| SmolVLM2-500M | 6/10 (no detect API) | 6/10 (newer) | 8/10 | 6/10 | Free | 8/10 (1.8GB) | 6.5 |
| UAV-VL-R1 2B | 9/10 (aerial-specialized) | 5/10 (not tested on Jetson) | 8/10 | 5/10 | Free | 4/10 (2.5GB) | 5.5 |
| No VLM (MVP) | 5/10 (no fallback) | 10/10 | 10/10 | 10/10 | Free | 10/10 | 8.0 |

Selected: Moondream 0.5B for the VLM tier, with "No VLM" as the MVP fallback if Moondream proves insufficient.

VLM Runtime

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ONNX Runtime | 8/10 (lightweight, cross-platform) | 9/10 | 8/10 | 8/10 | Free | 9/10 | 8.5 |
| vLLM | 7/10 (server-oriented, overkill for 0.5B) | 8/10 (Jetson compatible) | 7/10 | 6/10 (complex setup) | Free | 7/10 | 7.0 |
| PyTorch direct | 7/10 (simplest integration) | 10/10 | 8/10 | 9/10 | Free | 6/10 (no optimization) | 7.5 |
| MLC-LLM | 6/10 (declining adoption) | 5/10 | 7/10 | 5/10 | Free | 7/10 | 5.5 |

Selected: ONNX Runtime for Moondream 0.5B. Lightweight, no server overhead.

Gimbal Control

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| filterpy (Kalman) + servopilot (PID) | 9/10 (cascade control) | 8/10 (proven libraries) | 8/10 | 7/10 (Kalman is new) | Free | 8/10 | 8.0 |
| Custom Kalman + PID from scratch | 8/10 | 5/10 (unproven) | 8/10 | 6/10 | Free | 7/10 | 6.5 |
| PID only (servopilot) | 6/10 (no drift compensation) | 9/10 | 8/10 | 9/10 | Free | 7/10 | 7.5 |

Selected: filterpy + servopilot cascade.
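To make the cascade idea concrete, here is a dependency-free sketch for a single gimbal axis: a constant-velocity Kalman filter smooths the measured angular error, and a PID controller with a clamped output and conditional integration (a simple form of anti-windup) turns the smoothed error into a rate command. It illustrates the technique only and does not use the filterpy or servopilot APIs; the class names, gains, noise values, and the toy plant (the gimbal angle integrates the commanded rate) are all assumptions.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one axis (state: angle, rate)."""

    def __init__(self, process_var: float = 0.01, meas_var: float = 0.1) -> None:
        self.angle = 0.0
        self.rate = 0.0
        self.p = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q = process_var
        self.r = meas_var

    def step(self, measured_angle: float, dt: float) -> float:
        # Predict with F = [[1, dt], [0, 1]]: x = F x, P = F P F^T + Q.
        self.angle += self.rate * dt
        (p00, p01), (p10, p11) = self.p
        p00, p01, p10, p11 = (
            p00 + dt * (p01 + p10) + dt * dt * p11 + self.q,
            p01 + dt * p11,
            p10 + dt * p11,
            p11 + self.q,
        )
        # Update with H = [1, 0]: only the angle is measured.
        s = p00 + self.r
        k0, k1 = p00 / s, p10 / s
        residual = measured_angle - self.angle
        self.angle += k0 * residual
        self.rate += k1 * residual
        self.p = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.angle


class ClampedPid:
    """PID with output clamping; the integral is frozen while saturated."""

    def __init__(self, kp: float, ki: float, kd: float, limit: float) -> None:
        self.kp, self.ki, self.kd, self.limit = kp, ki, kd, limit
        self.integral = 0.0
        self.prev_error = None

    def step(self, error: float, dt: float) -> float:
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        trial_integral = self.integral + error * dt
        raw = self.kp * error + self.ki * trial_integral + self.kd * derivative
        if abs(raw) <= self.limit:
            self.integral = trial_integral  # commit only when not saturated
        return max(-self.limit, min(self.limit, raw))


def simulate(target_deg: float = 10.0, steps: int = 600, dt: float = 0.05) -> float:
    """Drive a toy gimbal axis toward target_deg; return the final angle."""
    kalman = AxisKalman()
    pid = ClampedPid(kp=1.0, ki=0.2, kd=0.0, limit=30.0)  # limit in deg/s
    gimbal_deg = 0.0
    for i in range(steps):
        noise = 0.2 if i % 2 == 0 else -0.2  # deterministic stand-in for sensor noise
        measured_error = (target_deg - gimbal_deg) + noise
        smoothed_error = kalman.step(measured_error, dt)
        gimbal_deg += pid.step(smoothed_error, dt) * dt
    return gimbal_deg


print(round(simulate(), 2))
```

In the selected stack the filter and controller would come from filterpy and servopilot respectively; the sketch only shows how the two stages connect and why the filter sits in front of the controller.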

Adversarial Defense

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PatchBlock | 9/10 (designed for edge YOLO) | 7/10 (2026 paper) | 9/10 | 7/10 | Free | 9/10 (CPU-based) | 8.0 |
| Custom input validation | 5/10 (ad-hoc) | 3/10 | 6/10 | 8/10 | Free | 7/10 | 5.5 |
| None | 0/10 | 10/10 | 0/10 | 10/10 | Free | 10/10 | 3.0 |

Selected: PatchBlock. Integrate as CPU preprocessing step.

Synthetic Data Generation

| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CamouflageAnything | 8/10 (CVPR 2025, camouflage-specific) | 7/10 | 8/10 | 6/10 | Free | 8/10 | 7.5 |
| GenCAMO | 8/10 (environment-aware, 2026) | 6/10 (newer) | 8/10 | 6/10 | Free | 8/10 | 7.0 |
| Cut-paste augmentation | 6/10 (simple but effective) | 10/10 | 8/10 | 9/10 | Free | 7/10 | 7.5 |

Selected: CamouflageAnything (primary) + cut-paste (supplementary).

Tech Stack Summary

| Layer | Technology | Version | Justification |
| --- | --- | --- | --- |
| Hardware | Jetson Orin Nano Super | 8GB | Existing constraint |
| OS / SDK | JetPack | 6.2 | Latest for Orin Nano Super |
| GPU Runtime | TensorRT | 10.3 (FP16) | Existing pipeline, proven stability |
| Detection | YOLOE-v8-seg | Ultralytics ≥8.4 | Stable TRT, open-vocab + segmentation |
| Classifier | MobileNetV3-Small | torchvision → TRT FP16 | Tiny footprint, binary classification |
| VLM | Moondream 0.5B | INT4, ONNX | 816 MiB, detect()/point() APIs |
| VLM Runtime | ONNX Runtime | ≥1.17 | Lightweight, no server overhead |
| Path Tracing | OpenCV + scikit-image | OpenCV 4.x, skimage 0.22+ | Preprocessing + skeletonization |
| Gimbal Kalman | filterpy | ≥1.4 | Kalman filter state estimation |
| Gimbal PID | servopilot | latest | Anti-windup PID, dual-axis |
| Serial | pyserial | ≥3.5 | ViewLink protocol communication |
| Adversarial Defense | PatchBlock | 2026 release | CPU-based, edge-optimized |
| Synthetic Data | CamouflageAnything | CVPR 2025 | Camouflage-specific generation |
| Encryption | LUKS / dm-crypt | Linux kernel | Model weight encryption at rest |
| Core Language | Cython + Python | 3.10+ | Existing codebase extension |

Risk Assessment

| Technology | Risk | Mitigation |
| --- | --- | --- |
| YOLOE-v8-seg | Lower accuracy than YOLOE-26 | Monitor YOLO26 TRT fix; upgrade when stable |
| Moondream 0.5B | Untested for aerial concealment | Empirical testing in Week 8; fall back to no-VLM MVP |
| PatchBlock | New (2026), limited field testing | Can be disabled if it causes false positives; low integration risk |
| filterpy Kalman | Team unfamiliar | Well-documented library; standard aerospace algorithm |
| CamouflageAnything | Synthetic-to-real domain gap | Supplement with real data; validate FP/FN rates |
| Demand-loaded VLM | 30-45s detection pause | Batch requests; operator-triggered only; async notification |
| ONNX Runtime on Jetson | Less optimized than TRT for vision models | For a 0.5B model, ONNX overhead is acceptable |
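The "batch requests" mitigation for the demand-loaded VLM can be sketched as a small queue: operator-triggered requests accumulate, and one flush serves them all so that the detection pause is paid once per batch rather than once per request. The class name `VlmBatcher` and the `run_batch` callable are hypothetical; the stand-in below only counts how often the model would be loaded.

```python
from typing import Callable, Dict, List


class VlmBatcher:
    """Queue operator-triggered ROI requests and serve them in one VLM load cycle."""

    def __init__(self, run_batch: Callable[[List[str]], Dict[str, str]]) -> None:
        self._run_batch = run_batch  # loads the VLM, analyses all ROIs, unloads
        self._pending: List[str] = []

    def request(self, roi_id: str) -> None:
        """Record a request; duplicates for the same ROI are ignored."""
        if roi_id not in self._pending:
            self._pending.append(roi_id)

    def flush(self) -> Dict[str, str]:
        """Run the VLM once for everything queued; skip the load when idle."""
        if not self._pending:
            return {}
        batch, self._pending = self._pending, []
        return self._run_batch(batch)


load_count = 0


def fake_vlm(roi_ids: List[str]) -> Dict[str, str]:
    """Stand-in for the real VLM call; counts how many load cycles occur."""
    global load_count
    load_count += 1
    return {roi_id: "needs operator review" for roi_id in roi_ids}


batcher = VlmBatcher(fake_vlm)
for roi in ("roi-1", "roi-2", "roi-1", "roi-3"):
    batcher.request(roi)
results = batcher.flush()
print(sorted(results), load_count)  # ['roi-1', 'roi-2', 'roi-3'] 1
```

The asynchronous notification mentioned in the table would sit on top of this: the flush could run in a worker thread and post results back to the operator once the batch completes.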

Learning Requirements

| Technology | Effort | Who |
| --- | --- | --- |
| YOLOE visual prompts (SAVPE) | Low (API-based) | Detection engineer |
| Moondream detect()/caption() | Low (simple API) | ML engineer |
| filterpy Kalman filter | Medium (state estimation theory) | Controls engineer |
| PatchBlock integration | Low (preprocessing module) | Detection engineer |
| CamouflageAnything pipeline | Medium (generative model setup) | Data engineer |
| LUKS encryption + secure boot | Medium (Linux security) | DevOps / platform |