Initial commit

Made-with: Cursor
2026-04-22 23:06:38 +00:00 · 2026-03-26 00:20:30 +02:00
commit 8e2ecf50fd
144 changed files with 19781 additions and 0 deletions
@@ -0,0 +1,272 @@
+# E2E Non-Functional Tests
+
+## Performance Tests
+
+### NFT-PERF-01: Tier 1 inference latency ≤100ms [HIL]
+
+**Summary**: Measure Tier 1 (YOLOE TRT FP16) inference latency on a Jetson Orin Nano Super with a real TensorRT engine.
+**Traces to**: AC-LATENCY-TIER1
+**Metric**: p95 inference latency per frame (ms)
+
+**Preconditions**:
+- Jetson Orin Nano Super with JetPack 6.2
+- YOLOE TRT FP16 engine loaded
+- Active cooling enabled, T_junction < 70°C
+
+**Steps**:
+
+| Step | Consumer Action | Measurement |
+|------|----------------|-------------|
+| 1 | Submit 100 frames (semantic01-04.png cycled) with 100ms interval | Record per-frame inference time from API response header |
+| 2 | Compute p50, p95, p99 latency | — |
+
+**Pass criteria**: p95 latency ≤ 100ms
+**Duration**: 15 seconds
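
The percentile step above could be computed along these lines. This is a minimal sketch, not the project's harness: it assumes the per-frame latencies have already been collected into a list of milliseconds, and it uses the nearest-rank percentile definition (other definitions interpolate and give slightly different values). The names `percentile` and `summarize_latency` are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer product first, so the division cannot drift above a whole rank.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_latency(samples_ms, threshold_ms=100.0):
    """Return p50/p95/p99 and whether p95 meets the threshold (assumed inclusive)."""
    p50, p95, p99 = (percentile(samples_ms, p) for p in (50, 95, 99))
    return {"p50": p50, "p95": p95, "p99": p99, "passed": p95 <= threshold_ms}
```

With 100 samples, nearest-rank p95 is the 95th smallest value, so up to five slow frames are tolerated before the criterion fails.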
+
+---
+
+### NFT-PERF-02: Tier 2 heuristic latency ≤50ms
+
+**Summary**: Measure the latency of the V1 heuristic endpoint analysis (skeletonization, endpoint, and darkness check).
+**Traces to**: AC-LATENCY-TIER2
+**Metric**: p95 processing latency per ROI (ms)
+
+**Preconditions**:
+- Footpath segmentation masks are available as Tier 1 output (mock YOLO masks are used in this test)
+
+**Steps**:
+
+| Step | Consumer Action | Measurement |
+|------|----------------|-------------|
+| 1 | Submit 50 frames with mock YOLO footpath masks | Record Tier 2 processing time from detection log |
+| 2 | Compute p50, p95 latency | — |
+
+**Pass criteria**: p95 latency ≤ 50ms for the V1 heuristic (< 200ms if the V2 CNN is used instead)
+**Duration**: 10 seconds
+
+---
+
+### NFT-PERF-03: Tier 3 VLM latency ≤5s
+
+**Summary**: Measure VLM inference latency including image encoding, prompt processing, and response generation.
+**Traces to**: AC-LATENCY-TIER3
+**Metric**: End-to-end VLM analysis time per ROI (ms)
+
+**Preconditions**:
+- NanoLLM with VILA1.5-3B loaded (or vlm-stub for Docker-based test)
+
+**Steps**:
+
+| Step | Consumer Action | Measurement |
+|------|----------------|-------------|
+| 1 | Trigger 10 Tier 3 analyses on different ROIs | Record time from VLM request to response via detection log |
+| 2 | Compute p50, p95 latency | — |
+
+**Pass criteria**: p95 latency ≤ 5000ms
+**Duration**: 60 seconds
+
+---
+
+### NFT-PERF-04: Full pipeline throughput under continuous frame input
+
+**Summary**: Submit frames at 10 FPS for 60 seconds; measure detection throughput and queue depth.
+**Traces to**: AC-LATENCY-TIER1, AC-SCAN-ALGORITHM
+**Metric**: Frames processed per second, max queue depth
+
+**Preconditions**:
+- All tiers active, mock services responding
+
+**Steps**:
+
+| Step | Consumer Action | Measurement |
+|------|----------------|-------------|
+| 1 | Submit 600 frames at 10 FPS (60s) | Count processed frames from detection log |
+| 2 | Poll the API status endpoint during the run | Record max queue depth, if the endpoint exposes it |
+
+**Pass criteria**: ≥8 FPS sustained processing rate; no frames silently dropped (all either processed or explicitly skipped with quality gate reason)
+**Duration**: 75 seconds
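
The pass criterion above combines a rate check with frame accounting. A minimal sketch of that accounting follows, assuming the detection log can be reduced to `(frame_id, status, reason)` tuples where `status` is `"processed"` or `"skipped"`. That record shape and the function name `check_throughput` are illustrative assumptions, not the actual log format.

```python
def check_throughput(submitted_ids, log_records, duration_s, min_fps=8.0):
    """Compare submitted frames against log records.

    A frame counts as silently dropped if it never appears in the log,
    or if it was skipped without a recorded reason.
    """
    outcome = {}
    for frame_id, status, reason in log_records:
        outcome[frame_id] = (status, reason)

    processed = sum(1 for status, _ in outcome.values() if status == "processed")
    silently_dropped = [
        fid for fid in submitted_ids
        if fid not in outcome
        or (outcome[fid][0] == "skipped" and not outcome[fid][1])
    ]
    fps = processed / duration_s
    return {
        "fps": fps,
        "silently_dropped": silently_dropped,
        "passed": fps >= min_fps and not silently_dropped,
    }
```

At 10 FPS input over 60 seconds, the 8 FPS floor means at least 480 of the 600 frames must be processed; the remaining frames must each carry a skip reason.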
+
+---
+
+## Resilience Tests
+
+### NFT-RES-01: Semantic process crash and recovery
+
+**Summary**: Kill the semantic detection process; verify watchdog restarts it within 10 seconds and processing resumes.
+**Traces to**: AC-SCAN-ALGORITHM (degradation)
+
+**Preconditions**:
+- Semantic detection running and processing frames
+
+**Fault injection**:
+- Kill semantic process via signal (SIGKILL)
+
+**Steps**:
+
+| Step | Action | Expected Behavior |
+|------|--------|------------------|
+| 1 | Submit 5 frames successfully | Detections returned |
+| 2 | Kill semantic process | Frame processing stops |
+| 3 | Wait up to 10 seconds | Watchdog detects crash, restarts process |
+| 4 | Submit 5 more frames | Detections returned again |
+
+**Pass criteria**: Recovery within 10 seconds; no data corruption in detection log; frames submitted during downtime are either queued or rejected (not silently dropped)
+
+---
+
+### NFT-RES-02: VLM load/unload cycle stability
+
+**Summary**: Load and unload VLM 10 times; verify no memory leak and successful inference after each reload.
+**Traces to**: AC-RESOURCE-CONSTRAINTS
+
+**Preconditions**:
+- VLM process manageable via API/signal
+
+**Fault injection**:
+- Alternating VLM load/unload commands
+
+**Steps**:
+
+| Step | Action | Expected Behavior |
+|------|--------|------------------|
+| 1 | Load VLM, run 1 inference | Success, record memory |
+| 2 | Unload VLM, record memory | Memory decreases |
+| 3 | Repeat steps 1-2 for a total of 10 cycles | All inferences succeed |
+| 4 | Compare post-unload memory at cycle 1 vs cycle 10 | Delta < 100MB |
+
+**Pass criteria**: No memory leak (delta < 100MB over 10 cycles); all inferences succeed
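
The leak criterion can be expressed as a comparison of the first and last post-unload readings. The sketch below assumes the readings are already collected in MB, one per cycle; how memory is sampled (for example from `tegrastats` or `/proc`) is left open, and `leak_check` is a hypothetical helper name.

```python
def leak_check(post_unload_mb, max_delta_mb=100.0):
    """Return (delta_mb, passed) for a series of post-unload memory readings.

    delta_mb is the last reading minus the first; the check passes when
    the growth over all cycles stays strictly below max_delta_mb.
    """
    if len(post_unload_mb) < 2:
        raise ValueError("need at least two cycles to compare")
    delta = post_unload_mb[-1] - post_unload_mb[0]
    return delta, delta < max_delta_mb
```

Comparing only the endpoints follows the test as written; a steady upward trend that stays under 100MB in 10 cycles would still pass, so a longer run may be worth considering if leaks are suspected.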
+
+---
+
+### NFT-RES-03: Gimbal CRC failure handling
+
+**Summary**: Inject corrupted gimbal command responses; verify CRC layer detects corruption and retries.
+**Traces to**: AC-CAMERA-CONTROL
+
+**Preconditions**:
+- Mock gimbal configured to return corrupted responses for first 2 attempts, valid on 3rd
+
+**Fault injection**:
+- Mock gimbal flips random bits in response CRC
+
+**Steps**:
+
+| Step | Action | Expected Behavior |
+|------|--------|------------------|
+| 1 | Issue pan command | First 2 responses rejected (bad CRC) |
+| 2 | Automatic retry | 3rd attempt succeeds |
+| 3 | Read gimbal command log | Log shows 2 CRC failures + 1 success |
+
+**Pass criteria**: Command succeeds after retries; CRC failures logged; no crash
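
The retry behaviour above can be illustrated with a small self-contained mock. Everything here is an assumption made for illustration: the real gimbal protocol, its CRC polynomial, and its frame layout are not specified in this document. The sketch uses CRC-CCITT via Python's `binascii.crc_hqx` and a frame of `payload + 2-byte big-endian CRC`.

```python
import binascii


def add_crc(payload: bytes) -> bytes:
    """Append a 16-bit CRC (CRC-CCITT, initial value 0) to the payload."""
    return payload + binascii.crc_hqx(payload, 0).to_bytes(2, "big")


def crc_ok(frame: bytes) -> bool:
    """Check that the trailing two bytes match the CRC of the payload."""
    if len(frame) < 3:
        return False
    payload, received = frame[:-2], int.from_bytes(frame[-2:], "big")
    return binascii.crc_hqx(payload, 0) == received


class MockGimbal:
    """Hypothetical mock: corrupts the CRC of its first N responses."""

    def __init__(self, corrupt_first=2):
        self.corrupt_remaining = corrupt_first

    def send(self, command: bytes) -> bytes:
        frame = bytearray(add_crc(b"ACK:" + command))
        if self.corrupt_remaining > 0:
            self.corrupt_remaining -= 1
            frame[-1] ^= 0x01  # flip one bit in the CRC field
        return bytes(frame)


def send_with_retry(gimbal, command: bytes, max_attempts=3):
    """Send a command, retrying on CRC failure; return (payload, log)."""
    log = []
    for attempt in range(1, max_attempts + 1):
        frame = gimbal.send(command)
        if crc_ok(frame):
            log.append((attempt, "ok"))
            return frame[:-2], log
        log.append((attempt, "crc_fail"))
    raise RuntimeError(f"no valid response after {max_attempts} attempts")
```

The returned log mirrors step 3 of the table: two `crc_fail` entries followed by one `ok` entry.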
+
+---
+
+## Security Tests
+
+### NFT-SEC-01: No external network access from semantic detection
+
+**Summary**: Verify the semantic detection service makes no outbound network connections outside the Docker network.
+**Traces to**: RESTRICT-SOFTWARE (local-only inference)
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | Run semantic detection pipeline on test frames | Detections produced |
+| 2 | Monitor network traffic from semantic-detection container (via tcpdump on e2e-net) | No packets to external IPs |
+
+**Pass criteria**: Zero outbound connections to external networks
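
Once destination addresses have been extracted from the capture, deciding which ones count as external is a subnet membership check. The sketch below uses Python's standard `ipaddress` module. The subnet `172.18.0.0/16` is a placeholder assumption; the actual range of `e2e-net` would have to be read from the Docker network configuration. Parsing of the `tcpdump` output is not shown.

```python
import ipaddress


def external_destinations(dest_ips, docker_subnet="172.18.0.0/16"):
    """Return the destination IPs that lie outside the Docker subnet.

    Loopback addresses are treated as internal. The default subnet is a
    placeholder, not the real e2e-net range.
    """
    network = ipaddress.ip_network(docker_subnet)
    external = []
    for ip in dest_ips:
        addr = ipaddress.ip_address(ip)
        if addr in network or addr.is_loopback:
            continue
        external.append(ip)
    return external
```

The test passes when the returned list is empty for all destinations seen during the run.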
+
+---
+
+### NFT-SEC-02: Model files are not accessible via API
+
+**Summary**: Verify TRT engine files and VLM model weights cannot be downloaded through the API.
+**Traces to**: RESTRICT-SOFTWARE
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | Attempt directory traversal via API: GET /api/v1/../models/ | 404 or 400 |
+| 2 | Attempt known model path: GET /api/v1/detect?path=/models/yoloe.engine | No model content returned |
+
+**Pass criteria**: Model files inaccessible via any API endpoint
+
+---
+
+## Resource Limit Tests
+
+### NFT-RES-LIM-01: Memory stays within 6GB budget [HIL]
+
+**Summary**: Run full pipeline (Tier 1+2+3 + recording + logging) for 30 minutes; verify peak memory stays below 6GB (semantic module allocation).
+**Traces to**: AC-RESOURCE-CONSTRAINTS, RESTRICT-HARDWARE
+**Metric**: Peak RSS memory of semantic detection + VLM processes
+
+**Preconditions**:
+- Jetson Orin Nano Super, 15W mode, active cooling
+- All components loaded
+
+**Monitoring**:
+- `tegrastats` logging at 1-second intervals: RAM (shared by CPU and GPU on Jetson) and swap
+
+**Duration**: 30 minutes
+**Pass criteria**: Peak (semantic + VLM) memory < 6GB; no OOM kills; no swap usage above 100MB
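
Peak RAM and swap can be pulled out of a `tegrastats` log with two regular expressions. This sketch assumes log lines contain fields of the form `RAM <used>/<total>MB` and `SWAP <used>/<total>MB`; the exact line layout varies between L4T releases, so the patterns should be checked against a real capture. Note that `tegrastats` reports system-wide usage, so attributing memory to the semantic and VLM processes alone needs a per-process source as well.

```python
import re

RAM_RE = re.compile(r"RAM (\d+)/(\d+)MB")
SWAP_RE = re.compile(r"SWAP (\d+)/(\d+)MB")


def peak_usage_mb(lines):
    """Return (peak_ram_mb, peak_swap_mb) over all log lines.

    Lines that do not contain a field are ignored for that field.
    """
    peak_ram = 0
    peak_swap = 0
    for line in lines:
        ram = RAM_RE.search(line)
        swap = SWAP_RE.search(line)
        if ram:
            peak_ram = max(peak_ram, int(ram.group(1)))
        if swap:
            peak_swap = max(peak_swap, int(swap.group(1)))
    return peak_ram, peak_swap
```

A 30-minute run at 1-second intervals produces about 1800 lines, so the whole log can be processed in one pass after the test ends.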
+
+---
+
+### NFT-RES-LIM-02: Thermal stability under sustained load [HIL]
+
+**Summary**: Run continuous inference for 60 minutes; verify T_junction stays below 75°C with active cooling.
+**Traces to**: RESTRICT-HARDWARE
+**Metric**: T_junction max, T_junction average
+
+**Preconditions**:
+- Jetson Orin Nano Super, 15W mode, active cooling fan running
+- Ambient temperature 20-25°C
+
+**Monitoring**:
+- Temperature sensors via `tegrastats` at 1-second intervals
+
+**Duration**: 60 minutes
+**Pass criteria**: T_junction max < 75°C; no thermal throttling events
+
+---
+
+### NFT-RES-LIM-03: NVMe recording endurance [HIL]
+
+**Summary**: Record frames to NVMe at Level 2 rate (30 FPS, 1080p JPEG) for 2 hours; verify no write errors.
+**Traces to**: AC-SCAN-ALGORITHM (recording)
+**Metric**: Frames written, write errors, NVMe health
+
+**Preconditions**:
+- NVMe SSD ≥256GB with ≥120GB free (the expected write volume for this run; 30% of a 256GB drive is only ~77GB)
+
+**Monitoring**:
+- Write errors via dmesg
+- NVMe SMART data before and after
+
+**Duration**: 2 hours
+**Pass criteria**: Zero write errors; SMART indicators nominal; storage usage matches expected (~120GB for 2h at 30FPS)
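
The ~120GB figure can be sanity-checked with simple arithmetic. The sketch below uses decimal units (1 KB = 1000 bytes, 1 GB = 10^9 bytes). The average JPEG size is not stated in this document; it is derived here from the stated total, and the function names are hypothetical.

```python
def expected_storage_gb(fps, duration_s, avg_frame_kb):
    """Return (frame_count, total_gb) for a recording run."""
    frames = fps * duration_s
    total_gb = frames * avg_frame_kb * 1000 / 1e9
    return frames, total_gb


def implied_frame_kb(total_gb, fps, duration_s):
    """Average frame size in KB implied by a total volume."""
    return total_gb * 1e9 / (fps * duration_s) / 1000


frames, _ = expected_storage_gb(30, 2 * 3600, 0)
print(frames)                                      # 216000 frames in 2 hours
print(round(implied_frame_kb(120, 30, 2 * 3600)))  # 556 KB per frame
```

So the 120GB estimate corresponds to an average of roughly 556 KB per 1080p JPEG; if actual frames are much smaller or larger, the "storage usage matches expected" criterion needs a tolerance band around the measured average.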
+
+---
+
+### NFT-RES-LIM-04: Cold start time ≤60 seconds [HIL]
+
+**Summary**: Power on the Jetson and measure the time from power-on to the first successful detection.
+**Traces to**: RESTRICT-OPERATIONAL
+**Metric**: Time from power-on to first detection result (seconds)
+
+**Preconditions**:
+- JetPack 6.2 on NVMe, all models pre-exported as TRT engines
+
+**Steps**:
+
+| Step | Consumer Action | Measurement |
+|------|----------------|-------------|
+| 1 | Power on Jetson | Start timer |
+| 2 | Poll /api/v1/health every 1s | — |
+| 3 | When health returns 200, submit test frame | Record time to first detection |
+
+**Pass criteria**: First detection within 60 seconds of power-on
+**Duration**: 90 seconds max
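
The polling loop in the steps above might look like the following. This is a sketch under stated assumptions: `check_health` and `submit_frame` are caller-supplied callables (for example thin wrappers around HTTP requests to `/api/v1/health` and the detection endpoint), and the timer starts when the function is called, so on real hardware the call must be synchronized with the power-on event, or an offset added.

```python
import time


def time_to_first_detection(check_health, submit_frame,
                            timeout_s=90.0, poll_s=1.0,
                            clock=time.monotonic, sleep=time.sleep):
    """Poll until health returns 200, submit one frame, return elapsed seconds.

    Returns None if the service is not healthy within timeout_s.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while clock() - start < timeout_s:
        if check_health() == 200:
            submit_frame()  # assumed to block until the detection result arrives
            return clock() - start
        sleep(poll_s)
    return None
```

A result of `None`, or any value above 60, fails the test. Using a monotonic clock avoids errors if the system time is adjusted during boot, for example by NTP.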