Initial commit

Made-with: Cursor
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-03-26 00:20:30 +02:00
commit 8e2ecf50fd
144 changed files with 19781 additions and 0 deletions
# Validation Log
## Validation Scenario
Same scenario as draft01: winter reconnaissance flight at 700m altitude over a forested area, now accounting for memory constraints, TRT bugs, and revised VLM latency.
## Expected Based on Revised Conclusions
**Using the revised architecture (YOLOE-v8-seg, demand-loaded VLM, Kalman+PID gimbal):**
1. Level 1 sweep begins. Single TRT engine running YOLOE-v8-seg (re-parameterized with fixed classes) + existing YOLO detection in shared engine context. Memory: ~3-3.5GB for combined engine. Inference: ~13ms (s-size).
2. YOLOE-v8-seg detects footpaths via text prompt ("footpath", "trail") + visual prompt (reference images of paths). Also detects "road", "tree row". CNN-specific concealment classes handled by visual prompts only.
3. Path segmentation mask preprocessed: Gaussian blur → binary threshold → morphological closing → skeletonization → branch pruning. Endpoints extracted. 256×256 ROI crops.
4. MobileNetV3-Small CNN classifies endpoints. Memory: ~50MB TRT engine. Total pipeline (mask preprocessing + skeleton + CNN): ~150ms.
5. High-confidence detection → operator alert with coordinates. Ambiguous detection (CNN 30-70%) → queued for VLM analysis.
6. VLM analysis is **background/batch mode**: Scan controller continues Level 1 sweep. When a batch of 3-5 ambiguous detections accumulates or operator requests deep analysis: pause YOLO TRT → unload engine → load Moondream-0.5B (816 MiB) → analyze batch → unload → reload YOLO TRT. Total pause: ~20-40s. Operator receives delayed analysis results.
7. Gimbal: Kalman filter fuses IMU data for state estimation → PID corrects → gimbal actuates. Path-following during Level 2 is smoother, compensates for UAV drift.
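The endpoint-extraction step of the pipeline (steps 3-4) can be sketched compactly. A minimal pure-NumPy sketch, assuming the mask has already been skeletonized (e.g. by `skimage.morphology.skeletonize`): an endpoint is a skeleton pixel with exactly one 8-connected neighbor, and each endpoint anchors a 256×256 ROI crop for the CNN. Function names are illustrative, not the pipeline's actual code.

```python
import numpy as np

def skeleton_endpoints(skel: np.ndarray) -> list[tuple[int, int]]:
    """Endpoints of a binary skeleton: pixels with exactly one 8-neighbor."""
    s = (skel > 0).astype(np.uint8)
    p = np.pad(s, 1)  # zero border so edge pixels are handled uniformly
    # Sum the 8 shifted copies around each pixel (center excluded).
    neighbors = (p[:-2, :-2] + p[:-2, 1:-1] + p[:-2, 2:]
               + p[1:-1, :-2]               + p[1:-1, 2:]
               + p[2:, :-2]  + p[2:, 1:-1]  + p[2:, 2:])
    ys, xs = np.where((s == 1) & (neighbors == 1))
    return list(zip(ys.tolist(), xs.tolist()))

def crop_roi(frame: np.ndarray, y: int, x: int, size: int = 256) -> np.ndarray:
    """size x size crop centred on an endpoint, clamped to the frame border."""
    h, w = frame.shape[:2]
    y0 = min(max(y - size // 2, 0), max(h - size, 0))
    x0 = min(max(x - size // 2, 0), max(w - size, 0))
    return frame[y0:y0 + size, x0:x0 + size]
```

Branch pruning before this step matters: without it, short spurs from skeletonization noise produce spurious endpoints that inflate the CNN workload.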
## Actual Validation Results
Cannot validate against real-world data. Validation based on:
- YOLOE-v8-seg TRT deployment on Jetson is assumed stable, inferred from YOLOv8's proven TRT stability (unlike YOLO26)
- Memory budget: ~3.5GB (YOLO engine) + 0.8GB (Moondream) = 4.3GB worst-case peak if both are briefly resident during the swap, within 5.2GB usable
- Moondream 0.5B is confirmed to run on Raspberry Pi; Jetson will be faster
- Kalman+PID gimbal control is standard aerospace engineering
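The Kalman+PID claim above is easy to make concrete. A minimal 1-D sketch (a real gimbal is multi-axis with a richer state model): a scalar Kalman filter smooths noisy IMU angle readings, and a PID computes the correction toward the path-following setpoint. Gains and noise parameters are illustrative placeholders, not tuned values.

```python
class ScalarKalman:
    """1-D Kalman filter for gimbal angle (constant-position model)."""
    def __init__(self, q: float = 1e-3, r: float = 1e-1):
        self.x, self.p = 0.0, 1.0   # state estimate and its variance
        self.q, self.r = q, r       # process and measurement noise
    def update(self, z: float) -> float:
        self.p += self.q                 # predict: variance grows
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)       # correct toward measurement z
        self.p *= (1.0 - k)
        return self.x

class PID:
    """Textbook PID; output drives the gimbal actuator."""
    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.i, self.prev = 0.0, 0.0
    def step(self, setpoint: float, measured: float) -> float:
        e = setpoint - measured
        self.i += e * self.dt
        d = (e - self.prev) / self.dt
        self.prev = e
        return self.kp * e + self.ki * self.i + self.kd * d
```

Per frame: `angle = kf.update(imu_reading)`, then `command = pid.step(target_angle, angle)`; the filtered estimate is what lets the PID compensate for UAV drift without chasing IMU noise.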
## Counterexamples
1. **VLM delay unacceptable**: If 20-40s batch VLM delay is unacceptable, could use Moondream's detect() API for faster binary yes/no (~2-5s for 0.5B) instead of full text generation. Or skip VLM entirely and rely on CNN + operator judgment.
2. **YOLOE-v8-seg accuracy lower than YOLOE-26-seg**: YOLOE-v8 is older architecture. YOLOE-26 should have better accuracy. Mitigation: use YOLOE-v8 for stable deployment now, switch to YOLOE-26 once TRT bugs are fixed.
3. **Model switching latency**: Loading/unloading TRT engines adds 2-3s each direction. For frequent VLM requests, this overhead accumulates. Mitigation: batch VLM requests, implement predictive pre-loading.
4. **Single-engine approach limits flexibility**: Merging YOLOE + existing YOLO into one engine may require re-exporting when classes change. Mitigation: use YOLOE re-parameterization — when classes are fixed, YOLOE becomes standard YOLO with zero overhead.
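The batching mitigation in counterexample 3 can be sketched as a queue plus a pause estimator. The timing constants are assumptions taken from the numbers above (2-3s per swap direction, several seconds per Moondream analysis), and the class and function names are hypothetical.

```python
from collections import deque

SWAP_COST_S = 2.5      # assumed TRT engine load/unload cost per direction (2-3s)
VLM_PER_ITEM_S = 5.0   # assumed per-detection Moondream analysis time

class VlmBatcher:
    """Queue ambiguous detections; trigger a VLM pass only when the batch is
    full (or the operator asks), so the two engine swaps are amortized."""
    def __init__(self, batch_size: int = 4):
        self.batch_size = batch_size
        self.queue: deque = deque()
    def add(self, detection) -> bool:
        """Returns True when the batch is full and a VLM pass should run."""
        self.queue.append(detection)
        return len(self.queue) >= self.batch_size
    def drain(self) -> list:
        batch = list(self.queue)
        self.queue.clear()
        return batch

def pause_estimate(n_items: int) -> float:
    """Estimated Level-1 pause: unload YOLO + analyze batch + reload YOLO."""
    return 2 * SWAP_COST_S + n_items * VLM_PER_ITEM_S
```

With these assumed constants, a batch of one costs 10s (half of it swap overhead), while a batch of four costs 25s with the same 5s of overhead spread across four detections, which is why batching rather than per-detection swapping keeps the 20-40s budget.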
## Review Checklist
- [x] Draft conclusions consistent with fact cards
- [x] No important dimensions missed
- [x] No over-extrapolation
- [x] Conclusions actionable/verifiable
- [x] Memory budget calculated from documented values
- [x] TRT deployment risk based on documented bugs
- [ ] Note: YOLOE-v8-seg TRT stability on Jetson not directly tested (inferred from YOLOv8 stability)
- [ ] Note: Moondream 0.5B accuracy for aerial concealment analysis is unknown
## Conclusions Requiring Revision
- VLM latency target must change from ≤5s to "background batch" (20-40s)
- Consider dropping VLM entirely for MVP and adding later when hardware/software matures
- YOLOE-26 should be replaced with YOLOE-v8 for initial deployment
- Memory architecture needs explicit budget table in solution draft
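Until the solution draft gets its explicit budget table, the figures already in this log can be checked mechanically. A minimal sketch using only values stated above, with the MobileNetV3 engine (~50MB) included and co-residency treated as the worst case; the variable names are illustrative.

```python
USABLE_GB = 5.2        # usable device memory stated above
YOLO_ENGINE_GB = 3.5   # combined YOLOE-v8-seg + detection engine (upper bound)
MOONDREAM_GB = 0.8     # Moondream-0.5B (816 MiB)
CNN_GB = 0.05          # MobileNetV3-Small TRT engine (~50MB)

# Worst case assumes the YOLO engine and Moondream are briefly co-resident
# during the swap; steady state is lower because one of them is unloaded.
peak_gb = YOLO_ENGINE_GB + MOONDREAM_GB + CNN_GB
headroom_gb = USABLE_GB - peak_gb
print(f"peak {peak_gb:.2f} GB, headroom {headroom_gb:.2f} GB")
```

The check still passes with the CNN engine counted, leaving roughly 0.85GB of headroom; any new resident model larger than that forces another demand-load scheme like the Moondream one.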