Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit
Made-with: Cursor
2026-03-26 00:20:30 +02:00

# Tech Stack Evaluation
## Requirements Analysis
### Functional Requirements
- Real-time open-vocabulary detection from UAV aerial imagery
- Footpath segmentation and path tracing with endpoint analysis
- Binary concealment classification on ROI crops
- On-demand VLM analysis for ambiguous detections
- Camera gimbal control with path-following
- Integration with existing Cython+TRT YOLO pipeline
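
The endpoint-analysis requirement can be illustrated with a minimal, dependency-free sketch. It assumes the footpath mask has already been thinned to a one-pixel-wide skeleton (in the proposed stack that would presumably come from scikit-image skeletonization) and treats a skeleton pixel with exactly one 8-connected neighbour as a path endpoint. The function name and the sample grid are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[int]]


def skeleton_endpoints(skel: Grid) -> List[Tuple[int, int]]:
    """Return (row, col) of skeleton pixels with exactly one 8-connected neighbour."""
    rows, cols = len(skel), len(skel[0])
    endpoints = []
    for r in range(rows):
        for c in range(cols):
            if not skel[r][c]:
                continue
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the pixel itself
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and skel[rr][cc]:
                        neighbours += 1
            if neighbours == 1:
                endpoints.append((r, c))
    return endpoints


# Illustrative L-shaped, one-pixel-wide path (not real data).
path = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
endpoints = skeleton_endpoints(path)
```

Each endpoint is a candidate location for an ROI crop that the concealment classifier could inspect; how the real pipeline selects crops is not specified here.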
### Non-Functional Requirements
- Tier 1 inference ≤15ms, Tier 2 ≤200ms
- 5.2GB usable VRAM budget (Jetson Orin Nano Super 8GB)
- Field-deployable: thermal resilience, tamper protection
- Offline operation (no cloud dependency)
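
A quick way to sanity-check the VRAM budget is to sum the resident model footprints against the 5.2GB figure. The sketch below uses the two sizes quoted in this document (816 MiB for Moondream, ~50MB for the classifier); the detection-engine size is a placeholder, and reading "GB" and "MB" as GiB and MiB is an assumption.

```python
from typing import Dict

MIB_PER_GIB = 1024
# 5.2GB usable budget from the requirement, read as GiB (assumption).
VRAM_BUDGET_MIB = 5.2 * MIB_PER_GIB


def vram_headroom(components_mib: Dict[str, float],
                  budget_mib: float = VRAM_BUDGET_MIB) -> float:
    """Return the remaining MiB, raising if the resident set exceeds the budget."""
    total = sum(components_mib.values())
    if total > budget_mib:
        raise ValueError(
            f"VRAM budget exceeded: {total:.0f} MiB > {budget_mib:.0f} MiB"
        )
    return budget_mib - total


resident = {
    "moondream_0.5b_int4": 816.0,    # figure quoted for Moondream in this document
    "mobilenetv3_small_fp16": 50.0,  # ~50MB from the classifier selection, treated as MiB
    "yoloe_v8_seg_engine": 600.0,    # placeholder, NOT a measured value
}
headroom = vram_headroom(resident)
```

Replacing the placeholder with measured engine and activation sizes would show how much of the budget remains for buffers and the VLM's working memory.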
### Constraints
- Jetson Orin Nano Super: 67 TOPS (sparse INT8), 8GB LPDDR5 unified memory, 102 GB/s bandwidth
- JetPack 6.2, CUDA 12.6, TensorRT 10.3
- Existing codebase: Cython + TensorRT (must extend, not replace)
- ViewPro A40 camera with ViewLink Serial Protocol V3.3.3
## Technology Evaluation
### Detection Framework
| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **YOLOE-v8-seg (Ultralytics)** | 9/10 — open-vocab + segmentation | 9/10 — YOLOv8 TRT proven | 7/10 — PatchBlock compatible | 9/10 — existing Cython+TRT expertise | Free | 8/10 | **8.5** |
| YOLOE-26-seg (Ultralytics) | 10/10 — latest arch, NMS-free | 4/10 — TRT bugs on Jetson | 7/10 | 7/10 — new arch, less familiar | Free | 9/10 | **6.5** |
| YOLO-World v2 | 7/10 — open-vocab, no seg | 7/10 — stable but older | 7/10 | 8/10 | Free | 7/10 | **7.0** |

**Selected**: YOLOE-v8-seg, with an upgrade path to YOLOE-26 once its TensorRT issues on Jetson are resolved.
### CNN Classifier
| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **MobileNetV3-Small** | 9/10 — binary classification, tiny | 10/10 — battle-tested | 8/10 | 9/10 | Free | 8/10 | **9.0** |
| EfficientNet-B0 | 8/10 — slightly more accurate | 10/10 | 8/10 | 8/10 | Free | 7/10 — larger | **8.0** |
| ResNet-18 | 7/10 — overkill for binary | 10/10 | 8/10 | 9/10 | Free | 6/10 — 44MB | **7.5** |

**Selected**: MobileNetV3-Small (~50MB in TRT FP16). Best size/accuracy trade-off of the three options.
### VLM
| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **Moondream 0.5B INT4** | 7/10 — detect()/point() APIs | 7/10 — active development | 8/10 — local only | 7/10 — new, learning curve | Free | 9/10 — 816 MiB | **7.5** |
| SmolVLM2-500M | 6/10 — no detect API | 6/10 — newer | 8/10 | 6/10 | Free | 8/10 — 1.8GB | **6.5** |
| UAV-VL-R1 2B | 9/10 — aerial-specialized | 5/10 — not tested on Jetson | 8/10 | 5/10 | Free | 4/10 — 2.5GB | **5.5** |
| No VLM (MVP) | 5/10 — no fallback | 10/10 | 10/10 | 10/10 | Free | 10/10 | **8.0** |

**Selected**: Moondream 0.5B for the VLM tier, with "No VLM" as the MVP fallback if Moondream proves insufficient.
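
The relationship between the classifier, the on-demand VLM, and the no-VLM fallback can be sketched as a small routing function. This is an illustration under stated assumptions: the 0.35/0.65 ambiguity band, the `Decision` enum, and the choice to flag ambiguous crops for operator review when no VLM is loaded are all hypothetical, not part of the evaluated design.

```python
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    CONCEALED = "concealed"
    CLEAR = "clear"
    NEEDS_REVIEW = "needs_review"


def route(prob_concealed: float,
          vlm_check: Optional[Callable[[], bool]] = None,
          low: float = 0.35,
          high: float = 0.65) -> Decision:
    """Route one ROI crop using the classifier probability.

    Confident scores are decided by the classifier alone. Scores inside
    the (low, high) band are escalated to the VLM when one is available;
    otherwise they are flagged for review (assumed no-VLM behaviour).
    """
    if prob_concealed >= high:
        return Decision.CONCEALED
    if prob_concealed <= low:
        return Decision.CLEAR
    if vlm_check is None:
        return Decision.NEEDS_REVIEW
    return Decision.CONCEALED if vlm_check() else Decision.CLEAR


# Usage: the lambda stands in for a real VLM query on the crop.
confident = route(0.9)
ambiguous_no_vlm = route(0.5)
ambiguous_with_vlm = route(0.5, vlm_check=lambda: True)
```

Keeping the VLM behind an optional callable means the same routing code serves both the Moondream configuration and the no-VLM MVP.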
### VLM Runtime
| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **ONNX Runtime** | 8/10 — lightweight, cross-platform | 9/10 | 8/10 | 8/10 | Free | 9/10 | **8.5** |
| vLLM | 7/10 — server-oriented, overkill for 0.5B | 8/10 — Jetson compatible | 7/10 | 6/10 — complex setup | Free | 7/10 | **7.0** |
| PyTorch direct | 7/10 — simplest integration | 10/10 | 8/10 | 9/10 | Free | 6/10 — no optimization | **7.5** |
| MLC-LLM | 6/10 — declining adoption | 5/10 | 7/10 | 5/10 | Free | 7/10 | **5.5** |

**Selected**: ONNX Runtime for Moondream 0.5B: lightweight, with no server overhead.
### Gimbal Control
| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **filterpy (Kalman) + servopilot (PID)** | 9/10 — cascade control | 8/10 — proven libraries | 8/10 | 7/10 — Kalman is new | Free | 8/10 | **8.0** |
| Custom Kalman + PID from scratch | 8/10 | 5/10 — unproven | 8/10 | 6/10 | Free | 7/10 | **6.5** |
| PID only (servopilot) | 6/10 — no drift compensation | 9/10 | 8/10 | 9/10 | Free | 7/10 | **7.5** |

**Selected**: filterpy + servopilot cascade.
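
To make the cascade concrete, here is a dependency-free sketch of a Kalman estimate feeding a PID loop on a single axis. It does not use filterpy or servopilot; `Kalman1D`, `PID`, the gains, the noise values, and the idealized rate-controlled gimbal are all assumptions chosen for illustration. In the selected stack, filterpy's `KalmanFilter` class and servopilot's controller would presumably take the place of the hand-written classes.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Kalman1D:
    """Constant-velocity Kalman filter over [angle, rate], measuring angle only."""
    q: float = 1e-3   # process noise added to both variances per step (assumed)
    r: float = 0.25   # measurement noise variance in deg^2 (assumed)
    angle: float = 0.0
    rate: float = 0.0
    p: list = field(default_factory=lambda: [[100.0, 0.0], [0.0, 100.0]])

    def step(self, z: float, dt: float) -> float:
        # Predict: x = F x and P = F P F^T + Q, with F = [[1, dt], [0, 1]].
        self.angle += self.rate * dt
        (p00, p01), (p10, p11) = self.p
        p00 = p00 + dt * (p01 + p10) + dt * dt * p11 + self.q
        p01 = p01 + dt * p11
        p10 = p10 + dt * p11
        p11 = p11 + self.q
        # Update with the angle measurement z (H = [1, 0]).
        s = p00 + self.r
        k0, k1 = p00 / s, p10 / s
        y = z - self.angle
        self.angle += k0 * y
        self.rate += k1 * y
        self.p = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.angle


@dataclass
class PID:
    kp: float
    ki: float
    kd: float
    out_limit: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        candidate = self.integral + error * dt
        out = self.kp * error + self.ki * candidate + self.kd * derivative
        if abs(out) <= self.out_limit:
            self.integral = candidate  # anti-windup: integrate only when unsaturated
        else:
            out = max(-self.out_limit, min(self.out_limit, out))
        return out


def simulate(target: float = 10.0, noise_std: float = 0.5,
             steps: int = 500, dt: float = 0.02, seed: int = 0) -> float:
    """Track a fixed target angle (deg) and return the final gimbal angle."""
    rng = random.Random(seed)
    kf = Kalman1D()
    pid = PID(kp=4.0, ki=0.2, kd=0.0, out_limit=60.0)
    gimbal = 0.0
    for _ in range(steps):
        z = target + rng.gauss(0.0, noise_std)   # noisy target bearing
        estimate = kf.step(z, dt)                # filtered target angle
        rate_cmd = pid.step(estimate - gimbal, dt)
        gimbal += rate_cmd * dt                  # idealized rate-controlled gimbal
    return gimbal


final_angle = simulate()
```

The filter smooths the noisy target bearing before the PID sees it, which is the benefit the table attributes to the cascade over a PID-only design. A real gimbal would also need actuator dynamics and a second axis.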
### Adversarial Defense
| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **PatchBlock** | 9/10 — designed for edge YOLO | 7/10 — 2026 paper | 9/10 | 7/10 | Free | 9/10 — CPU-based | **8.0** |
| Custom input validation | 5/10 — ad-hoc | 3/10 | 6/10 | 8/10 | Free | 7/10 | **5.5** |
| None | 0/10 | 10/10 | 0/10 | 10/10 | Free | 10/10 | **3.0** |

**Selected**: PatchBlock. Integrate as a CPU preprocessing step.
### Synthetic Data Generation
| Option | Fitness | Maturity | Security | Team Fit | Cost | Scalability | Score |
|--------|---------|----------|----------|----------|------|-------------|-------|
| **CamouflageAnything** | 8/10 — CVPR 2025, camouflage-specific | 7/10 | 8/10 | 6/10 | Free | 8/10 | **7.5** |
| GenCAMO | 8/10 — environment-aware, 2026 | 6/10 — newer | 8/10 | 6/10 | Free | 8/10 | **7.0** |
| Cut-paste augmentation | 6/10 — simple but effective | 10/10 | 8/10 | 9/10 | Free | 7/10 | **7.5** |

**Selected**: CamouflageAnything (primary) + cut-paste (supplementary).
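
Cut-paste augmentation is simple enough to sketch directly. The example below pastes a masked object patch onto a background and returns the resulting bounding box for labelling. It uses plain nested lists as grayscale images to stay self-contained; the function name and the `(x_min, y_min, x_max, y_max)` box convention are assumptions, and a real pipeline would likely use NumPy or OpenCV arrays.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive


def cut_paste(background: Image, patch: Image, mask: Image,
              top: int, left: int) -> Tuple[Image, Box]:
    """Paste `patch` onto a copy of `background` wherever `mask` is non-zero."""
    out = [row[:] for row in background]  # leave the input untouched
    h, w = len(patch), len(patch[0])
    if top < 0 or left < 0 or top + h > len(out) or left + w > len(out[0]):
        raise ValueError("patch does not fit inside the background")
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                out[top + r][left + c] = patch[r][c]
    return out, (left, top, left + w, top + h)


# Illustrative data: a 4x4 empty background and a 2x2 object with one masked-out pixel.
background = [[0] * 4 for _ in range(4)]
patch = [[9, 9], [9, 9]]
mask = [[1, 0], [1, 1]]
augmented, box = cut_paste(background, patch, mask, top=1, left=2)
```

The returned box can be written straight into a detection label file, which is what makes cut-paste a cheap supplement to generated camouflage imagery.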
## Tech Stack Summary
| Layer | Technology | Version | Justification |
|-------|-----------|---------|---------------|
| Hardware | Jetson Orin Nano Super | 8GB | Existing constraint |
| OS / SDK | JetPack | 6.2 | Latest for Orin Nano Super |
| GPU Runtime | TensorRT | 10.3 (FP16) | Existing pipeline, proven stability |
| Detection | YOLOE-v8-seg | Ultralytics ≥8.4 | Stable TRT, open-vocab + segmentation |
| Classifier | MobileNetV3-Small | torchvision → TRT FP16 | Tiny footprint, binary classification |
| VLM | Moondream 0.5B | INT4, ONNX | 816 MiB, detect()/point() APIs |
| VLM Runtime | ONNX Runtime | ≥1.17 | Lightweight, no server overhead |
| Path Tracing | OpenCV + scikit-image | OpenCV 4.x, skimage 0.22+ | Preprocessing + skeletonization |
| Gimbal Kalman | filterpy | ≥1.4 | Kalman filter state estimation |
| Gimbal PID | servopilot | latest | Anti-windup PID, dual-axis |
| Serial | pyserial | ≥3.5 | ViewLink protocol communication |
| Adversarial Defense | PatchBlock | 2026 release | CPU-based, edge-optimized |
| Synthetic Data | CamouflageAnything | CVPR 2025 | Camouflage-specific generation |
| Encryption | LUKS / dm-crypt | Linux kernel | Model weight encryption at rest |
| Core Language | Cython + Python | 3.10+ | Existing codebase extension |
## Risk Assessment
| Technology | Risk | Mitigation |
|-----------|------|------------|
| YOLOE-v8-seg | Lower accuracy than YOLOE-26 | Monitor YOLOE-26 TRT fix; upgrade when stable |
| Moondream 0.5B | Untested for aerial concealment | Empirical testing in Week 8; fallback to no-VLM MVP |
| PatchBlock | New (2026), limited field testing | Can be disabled if it causes false positives; low integration risk |
| filterpy Kalman | Team unfamiliar | Well-documented library; standard aerospace algorithm |
| CamouflageAnything | Synthetic-to-real domain gap | Supplement with real data; validate FP/FN rates |
| Demand-loaded VLM | 30-45s detection pause | Batch requests; operator-triggered only; async notification |
| ONNX Runtime on Jetson | Less optimized than TRT for vision models | For a 0.5B model, the ONNX Runtime overhead is acceptable |
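
The mitigation for the demand-loaded VLM (batch requests, operator-triggered only, async notification) could look roughly like the sketch below. It is a simplified, synchronous illustration: the `DemandLoadedVLM` class, the `load` and `notify` callables, and the assumption that the model is loaded once per batch and released afterwards are all hypothetical. In practice `flush()` would run off the detection thread.

```python
from typing import Any, Callable, List


class DemandLoadedVLM:
    """Queue ambiguous crops and analyse them in one batch per operator trigger."""

    def __init__(self,
                 load: Callable[[], Callable[[Any], str]],
                 notify: Callable[[List[str]], None]) -> None:
        self._load = load        # hypothetical loader returning a callable model
        self._notify = notify    # callback used to report results to the operator
        self._pending: List[Any] = []
        self.load_count = 0

    def enqueue(self, crop: Any) -> None:
        """Record a crop for later analysis without touching the VLM."""
        self._pending.append(crop)

    def flush(self) -> None:
        """Operator-triggered: load once, process everything queued, then notify."""
        if not self._pending:
            return  # nothing queued, so avoid the loading cost entirely
        batch, self._pending = self._pending, []
        model = self._load()     # the expensive step, incurred once per batch
        self.load_count += 1
        try:
            results = [model(crop) for crop in batch]
        finally:
            del model            # drop the reference so memory can be reclaimed
        self._notify(results)


# Usage with stand-ins for the real model and the operator notification channel.
received: List[str] = []
vlm = DemandLoadedVLM(load=lambda: (lambda crop: f"analysed:{crop}"),
                      notify=received.extend)
for crop_id in ("roi-1", "roi-2", "roi-3"):
    vlm.enqueue(crop_id)
vlm.flush()
```

Batching this way means several queued detections share a single pause instead of each paying for its own model load.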
## Learning Requirements
| Technology | Effort | Who |
|-----------|--------|-----|
| YOLOE visual prompts (SAVPE) | Low — API-based | Detection engineer |
| Moondream detect()/caption() | Low — simple API | ML engineer |
| filterpy Kalman filter | Medium — state estimation theory | Controls engineer |
| PatchBlock integration | Low — preprocessing module | Detection engineer |
| CamouflageAnything pipeline | Medium — generative model setup | Data engineer |
| LUKS encryption + secure boot | Medium — Linux security | DevOps / platform |