# E2E Non-Functional Tests

## Performance Tests

### NFT-PERF-01: Tier 1 inference latency ≤100ms [HIL]

**Summary**: Measure Tier 1 (YOLOE TRT FP16) inference latency on Jetson Orin Nano Super with real TensorRT engine.
**Traces to**: AC-LATENCY-TIER1
**Metric**: p95 inference latency per frame (ms)

**Preconditions**:
- Jetson Orin Nano Super with JetPack 6.2
- YOLOE TRT FP16 engine loaded
- Active cooling enabled, T_junction < 70°C

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 100 frames (semantic01-04.png cycled) with 100ms interval | Record per-frame inference time from API response header |
| 2 | Compute p50, p95, p99 latency | — |

**Pass criteria**: p95 latency < 100ms
**Duration**: 15 seconds

---

### NFT-PERF-02: Tier 2 heuristic latency ≤50ms

**Summary**: Measure V1 heuristic endpoint analysis (skeletonization + endpoint + darkness check) latency.
**Traces to**: AC-LATENCY-TIER2
**Metric**: p95 processing latency per ROI (ms)

**Preconditions**:
- Tier 1 has produced footpath segmentation masks

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 50 frames with mock YOLO footpath masks | Record Tier 2 processing time from detection log |
| 2 | Compute p50, p95 latency | — |

**Pass criteria**: p95 latency < 50ms (V1 heuristic), < 200ms (V2 CNN)
**Duration**: 10 seconds

---

### NFT-PERF-03: Tier 3 VLM latency ≤5s

**Summary**: Measure VLM inference latency including image encoding, prompt processing, and response generation.
**Traces to**: AC-LATENCY-TIER3
**Metric**: End-to-end VLM analysis time per ROI (ms)

**Preconditions**:
- NanoLLM with VILA1.5-3B loaded (or vlm-stub for Docker-based test)

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Trigger 10 Tier 3 analyses on different ROIs | Record time from VLM request to response via detection log |
| 2 | Compute p50, p95 latency | — |

**Pass criteria**: p95 latency < 5000ms
**Duration**: 60 seconds

---

### NFT-PERF-04: Full pipeline throughput under continuous frame input

**Summary**: Submit frames at 10 FPS for 60 seconds; measure detection throughput and queue depth.
**Traces to**: AC-LATENCY-TIER1, AC-SCAN-ALGORITHM
**Metric**: Frames processed per second, max queue depth

**Preconditions**:
- All tiers active, mock services responding

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 600 frames at 10 FPS (60s) | Count processed frames from detection log |
| 2 | Record queue depth if available from API status endpoint | — |

**Pass criteria**: ≥8 FPS sustained processing rate; no frames silently dropped (all either processed or explicitly skipped with quality gate reason)
**Duration**: 75 seconds

---

## Resilience Tests

### NFT-RES-01: Semantic process crash and recovery

**Summary**: Kill the semantic detection process; verify watchdog restarts it within 10 seconds and processing resumes.
**Traces to**: AC-SCAN-ALGORITHM (degradation)

**Preconditions**:
- Semantic detection running and processing frames

**Fault injection**:
- Kill semantic process via signal (SIGKILL)

**Steps**:

| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Submit 5 frames successfully | Detections returned |
| 2 | Kill semantic process | Frame processing stops |
| 3 | Wait up to 10 seconds | Watchdog detects crash, restarts process |
| 4 | Submit 5 more frames | Detections returned again |

**Pass criteria**: Recovery within 10 seconds; no data corruption in detection log; frames submitted during downtime are either queued or rejected (not silently dropped)

---

### NFT-RES-02: VLM load/unload cycle stability

**Summary**: Load and unload VLM 10 times; verify no memory leak and successful inference after each reload.
**Traces to**: AC-RESOURCE-CONSTRAINTS

**Preconditions**:
- VLM process manageable via API/signal

**Fault injection**:
- Alternating VLM load/unload commands

**Steps**:

| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Load VLM, run 1 inference | Success, record memory |
| 2 | Unload VLM, record memory | Memory decreases |
| 3 | Repeat 10 times | — |
| 4 | Compare memory at cycle 1 vs cycle 10 | Delta < 100MB |

**Pass criteria**: No memory leak (delta < 100MB over 10 cycles); all inferences succeed

---

### NFT-RES-03: Gimbal CRC failure handling

**Summary**: Inject corrupted gimbal command responses; verify CRC layer detects corruption and retries.
**Traces to**: AC-CAMERA-CONTROL

**Preconditions**:
- Mock gimbal configured to return corrupted responses for first 2 attempts, valid on 3rd

**Fault injection**:
- Mock gimbal flips random bits in response CRC

**Steps**:

| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Issue pan command | First 2 responses rejected (bad CRC) |
| 2 | Automatic retry | 3rd attempt succeeds |
| 3 | Read gimbal command log | Log shows 2 CRC failures + 1 success |

**Pass criteria**: Command succeeds after retries; CRC failures logged; no crash

---

## Security Tests

### NFT-SEC-01: No external network access from semantic detection

**Summary**: Verify the semantic detection service makes no outbound network connections outside the Docker network.
**Traces to**: RESTRICT-SOFTWARE (local-only inference)

**Steps**:

| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Run semantic detection pipeline on test frames | Detections produced |
| 2 | Monitor network traffic from semantic-detection container (via tcpdump on e2e-net) | No packets to external IPs |

**Pass criteria**: Zero outbound connections to external networks

---

### NFT-SEC-02: Model files are not accessible via API

**Summary**: Verify TRT engine files and VLM model weights cannot be downloaded through the API.
**Traces to**: RESTRICT-SOFTWARE

**Steps**:

| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Attempt directory traversal via API: `GET /api/v1/../models/` | 404 or 400 |
| 2 | Attempt known model path: `GET /api/v1/detect?path=/models/yoloe.engine` | No model content returned |

**Pass criteria**: Model files inaccessible via any API endpoint

---

## Resource Limit Tests

### NFT-RES-LIM-01: Memory stays within 6GB budget [HIL]

**Summary**: Run full pipeline (Tier 1+2+3 + recording + logging) for 30 minutes; verify peak memory stays below 6GB (semantic module allocation).
**Traces to**: AC-RESOURCE-CONSTRAINTS, RESTRICT-HARDWARE
**Metric**: Peak RSS memory of semantic detection + VLM processes

**Preconditions**:
- Jetson Orin Nano Super, 15W mode, active cooling
- All components loaded

**Monitoring**:
- `tegrastats` logging at 1-second intervals: GPU memory, CPU memory, swap

**Duration**: 30 minutes
**Pass criteria**: Peak (semantic + VLM) memory < 6GB; no OOM kills; no swap usage above 100MB

---

### NFT-RES-LIM-02: Thermal stability under sustained load [HIL]

**Summary**: Run continuous inference for 60 minutes; verify T_junction stays below 75°C with active cooling.
**Traces to**: RESTRICT-HARDWARE
**Metric**: T_junction max, T_junction average

**Preconditions**:
- Jetson Orin Nano Super, 15W mode, active cooling fan running
- Ambient temperature 20-25°C

**Monitoring**:
- Temperature sensors via `tegrastats` at 1-second intervals

**Duration**: 60 minutes
**Pass criteria**: T_junction max < 75°C; no thermal throttling events

---

### NFT-RES-LIM-03: NVMe recording endurance [HIL]

**Summary**: Record frames to NVMe at Level 2 rate (30 FPS, 1080p JPEG) for 2 hours; verify no write errors.
**Traces to**: AC-SCAN-ALGORITHM (recording)
**Metric**: Frames written, write errors, NVMe health

**Preconditions**:
- NVMe SSD ≥256GB, ≥30% free space

**Monitoring**:
- Write errors via dmesg
- NVMe SMART data before and after

**Duration**: 2 hours
**Pass criteria**: Zero write errors; SMART indicators nominal; storage usage matches expected (~120GB for 2h at 30FPS)

---

### NFT-RES-LIM-04: Cold start time ≤60 seconds [HIL]

**Summary**: Power on Jetson, measure time from boot to first successful detection.
**Traces to**: RESTRICT-OPERATIONAL
**Metric**: Time from power-on to first detection result (seconds)

**Preconditions**:
- JetPack 6.2 on NVMe, all models pre-exported as TRT engines

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Power on Jetson | Start timer |
| 2 | Poll `/api/v1/health` every 1s | — |
| 3 | When health returns 200, submit test frame | Record time to first detection |

**Pass criteria**: First detection within 60 seconds of power-on
**Duration**: 90 seconds max
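
The cold-start steps above could be automated from a separate test host along the following lines. This is a minimal sketch, not the project's harness: `measure_cold_start`, `passes`, and the injected `check_health` and `submit_frame` callables are hypothetical names, and it assumes the function is called at the moment power is applied so that its timer matches step 1.

```python
import time
from typing import Callable, Optional


def measure_cold_start(
    check_health: Callable[[], bool],
    submit_frame: Callable[[], bool],
    give_up_s: float = 90.0,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Return seconds from timer start to first detection, or None on timeout.

    check_health: hypothetical callable, True when the health endpoint returns 200.
    submit_frame: hypothetical callable, True when a test frame yields a detection.
    """
    start = clock()  # Step 1: start the timer at power-on.
    while clock() - start < give_up_s:
        # Step 2: poll health; step 3: submit a frame once the service is healthy.
        if check_health() and submit_frame():
            return clock() - start
        sleep(poll_interval_s)
    return None  # No detection within the maximum test duration.


def passes(elapsed_s: Optional[float], limit_s: float = 60.0) -> bool:
    """Apply the pass criterion: first detection within limit_s of power-on."""
    return elapsed_s is not None and elapsed_s <= limit_s
```

In a real run, `check_health` would wrap an HTTP GET against `/api/v1/health` and `submit_frame` would post a test frame; both are injected here so the timing logic can be exercised without hardware. A monotonic clock is used by default so that wall-clock adjustments on the test host do not distort the measurement.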