# E2E Non-Functional Tests

## Performance Tests

### NFT-PERF-01: Tier 1 inference latency ≤100ms [HIL]

Summary: Measure Tier 1 (YOLOE TRT FP16) inference latency on a Jetson Orin Nano Super with a real TensorRT engine.
Traces to: AC-LATENCY-TIER1
Metric: p95 inference latency per frame (ms)
Preconditions:
- Jetson Orin Nano Super with JetPack 6.2
- YOLOE TRT FP16 engine loaded
- Active cooling enabled, T_junction < 70°C
Steps:
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Submit 100 frames (semantic01-04.png cycled) with 100ms interval | Record per-frame inference time from API response header |
| 2 | Compute p50, p95, p99 latency | — |
Pass criteria: p95 latency < 100ms
Duration: 15 seconds
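The percentile computation in step 2 can be sketched as below. The nearest-rank method is one common convention (tools such as NumPy interpolate by default), so the harness should pin down which definition it uses; the `latencies` values here are illustrative, not measured data.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))  # 1-indexed rank
    return ordered[rank - 1]

# Per-frame inference times (ms) as read from the API response header.
latencies = [42.0, 48.0, 55.0, 61.0, 66.0, 70.0, 73.0, 95.0, 98.0, 110.0]
p50, p95, p99 = (percentile(latencies, p) for p in (50, 95, 99))
```

With 100 frames as in step 1, p95 is the 95th-ranked sample, so a single slow frame cannot by itself fail the test.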
### NFT-PERF-02: Tier 2 heuristic latency ≤50ms

Summary: Measure the latency of V1 heuristic endpoint analysis (skeletonization + endpoint + darkness check).
Traces to: AC-LATENCY-TIER2
Metric: p95 processing latency per ROI (ms)
Preconditions:
- Tier 1 has produced footpath segmentation masks
Steps:
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Submit 50 frames with mock YOLO footpath masks | Record Tier 2 processing time from detection log |
| 2 | Compute p50, p95 latency | — |
Pass criteria: p95 latency < 50ms (V1 heuristic), < 200ms (V2 CNN)
Duration: 10 seconds
### NFT-PERF-03: Tier 3 VLM latency ≤5s

Summary: Measure VLM inference latency, including image encoding, prompt processing, and response generation.
Traces to: AC-LATENCY-TIER3
Metric: End-to-end VLM analysis time per ROI (ms)
Preconditions:
- NanoLLM with VILA1.5-3B loaded (or vlm-stub for Docker-based test)
Steps:
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Trigger 10 Tier 3 analyses on different ROIs | Record time from VLM request to response via detection log |
| 2 | Compute p50, p95 latency | — |
Pass criteria: p95 latency < 5000ms
Duration: 60 seconds
### NFT-PERF-04: Full pipeline throughput under continuous frame input

Summary: Submit frames at 10 FPS for 60 seconds; measure detection throughput and queue depth.
Traces to: AC-LATENCY-TIER1, AC-SCAN-ALGORITHM
Metric: Frames processed per second, max queue depth
Preconditions:
- All tiers active, mock services responding
Steps:
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Submit 600 frames at 10 FPS (60s) | Count processed frames from detection log |
| 2 | Record queue depth if available from API status endpoint | — |
Pass criteria: ≥8 FPS sustained processing rate; no frames silently dropped (every frame is either processed or explicitly skipped with a quality-gate reason)
Duration: 75 seconds
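A minimal pacing loop for step 1, with `submit_frame` standing in for a hypothetical harness call. Scheduling against absolute deadlines keeps the effective rate at 10 FPS even when individual sleeps jitter:

```python
import time

def submit_at_fixed_rate(submit_frame, fps=10, duration_s=60):
    """Pace submissions against a monotonic deadline schedule so sleep
    jitter does not accumulate over the run."""
    interval = 1.0 / fps
    start = time.monotonic()
    total = int(fps * duration_s)
    for i in range(total):
        submit_frame(i)
        # Sleep until the next slot rather than for a fixed interval.
        delay = start + (i + 1) * interval - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return total
```

Sleeping a fixed `interval` instead would let per-frame overhead accumulate and quietly push the submission rate below 10 FPS, invalidating the throughput measurement.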
## Resilience Tests

### NFT-RES-01: Semantic process crash and recovery

Summary: Kill the semantic detection process; verify the watchdog restarts it within 10 seconds and processing resumes.
Traces to: AC-SCAN-ALGORITHM (degradation)
Preconditions:
- Semantic detection running and processing frames
Fault injection:
- Kill semantic process via signal (SIGKILL)
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Submit 5 frames successfully | Detections returned |
| 2 | Kill semantic process | Frame processing stops |
| 3 | Wait up to 10 seconds | Watchdog detects crash, restarts process |
| 4 | Submit 5 more frames | Detections returned again |
Pass criteria: Recovery within 10 seconds; no data corruption in detection log; frames submitted during downtime are either queued or rejected (not silently dropped)
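Step 3's recovery wait can be sketched as a deadline poll, with `is_healthy` standing in for whatever probe the harness uses (e.g. a health-endpoint call):

```python
import time

def wait_for_recovery(is_healthy, timeout_s=10.0, poll_interval_s=0.5):
    """Poll a health predicate until it passes; return the observed
    recovery time in seconds, or raise if the deadline expires."""
    start = time.monotonic()
    while True:
        if is_healthy():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            raise TimeoutError("watchdog did not restore the process in time")
        time.sleep(poll_interval_s)
```

Returning the elapsed time (rather than just pass/fail) lets the report show how close the watchdog came to the 10-second budget.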
### NFT-RES-02: VLM load/unload cycle stability

Summary: Load and unload the VLM 10 times; verify no memory leak and successful inference after each reload.
Traces to: AC-RESOURCE-CONSTRAINTS
Preconditions:
- VLM process manageable via API/signal
Fault injection:
- Alternating VLM load/unload commands
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Load VLM, run 1 inference | Success, record memory |
| 2 | Unload VLM, record memory | Memory decreases |
| 3 | Repeat 10 times | — |
| 4 | Compare memory at cycle 1 vs cycle 10 | Delta < 100MB |
Pass criteria: No memory leak (delta < 100MB over 10 cycles); all inferences succeed
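The cycle-1 vs cycle-10 comparison in step 4 reduces to a delta check. `post_unload_rss_mb` is the post-unload RSS recorded at each cycle in step 2 (the names and the sample values below are illustrative):

```python
def leak_delta(post_unload_rss_mb, budget_mb=100):
    """Growth in post-unload memory between the first and last cycle,
    plus a pass/fail verdict against the leak budget."""
    delta = post_unload_rss_mb[-1] - post_unload_rss_mb[0]
    return delta, delta < budget_mb

# Post-unload RSS (MB) recorded after each of the 10 cycles.
rss_per_cycle = [2048, 2050, 2053, 2055, 2056, 2058, 2059, 2060, 2060, 2061]
delta_mb, ok = leak_delta(rss_per_cycle)
```

Recording every cycle (not just the first and last) is worth the trouble: a monotonic climb across all 10 cycles indicates a leak even when the total delta stays under 100MB.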
### NFT-RES-03: Gimbal CRC failure handling

Summary: Inject corrupted gimbal command responses; verify the CRC layer detects corruption and retries.
Traces to: AC-CAMERA-CONTROL
Preconditions:
- Mock gimbal configured to return corrupted responses for first 2 attempts, valid on 3rd
Fault injection:
- Mock gimbal flips random bits in response CRC
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Issue pan command | First 2 responses rejected (bad CRC) |
| 2 | Automatic retry | 3rd attempt succeeds |
| 3 | Read gimbal command log | Log shows 2 CRC failures + 1 success |
Pass criteria: Command succeeds after retries; CRC failures logged; no crash
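A sketch of the retry behavior this test exercises, with CRC-32 standing in for the gimbal protocol's actual checksum (not specified here) and `transact` as a hypothetical send/receive call:

```python
import zlib

def send_with_retry(transact, command, max_attempts=3):
    """Send a gimbal command; reject any response whose trailing
    4-byte checksum does not match its body, retrying up to
    max_attempts. CRC-32 is used here purely for illustration."""
    crc_failures = 0
    for _ in range(max_attempts):
        resp = transact(command)
        body, crc = resp[:-4], int.from_bytes(resp[-4:], "big")
        if zlib.crc32(body) == crc:
            return body, crc_failures
        crc_failures += 1  # each rejection would be logged (step 3)
    raise IOError("command failed CRC check on every attempt")
```

With the mock configured as in the preconditions (two corrupted responses, then a valid one), the call returns after the third attempt with two logged CRC failures, matching step 3.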
## Security Tests

### NFT-SEC-01: No external network access from semantic detection

Summary: Verify the semantic detection service makes no outbound network connections outside the Docker network.
Traces to: RESTRICT-SOFTWARE (local-only inference)
Steps:
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Run semantic detection pipeline on test frames | Detections produced |
| 2 | Monitor network traffic from semantic-detection container (via tcpdump on e2e-net) | No packets to external IPs |
Pass criteria: Zero outbound connections to external networks
### NFT-SEC-02: Model files are not accessible via API

Summary: Verify TRT engine files and VLM model weights cannot be downloaded through the API.
Traces to: RESTRICT-SOFTWARE
Steps:
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Attempt directory traversal via API: GET /api/v1/../models/ | 404 or 400 |
| 2 | Attempt known model path: GET /api/v1/detect?path=/models/yoloe.engine | No model content returned |
Pass criteria: Model files inaccessible via any API endpoint
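Step 1 probes path traversal; the server-side defense it checks for is path normalization before routing, sketched below (function name is illustrative):

```python
import posixpath

def is_confined(request_path, api_root="/api/v1"):
    """Resolve '..' segments and check the result is still under the
    API root -- the normalization a server must apply before routing."""
    resolved = posixpath.normpath(request_path)
    return resolved == api_root or resolved.startswith(api_root + "/")
```

Note that step 2 is a different attack surface: a `path` query parameter never passes through URL normalization, so it needs its own allow-list check rather than this one.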
## Resource Limit Tests

### NFT-RES-LIM-01: Memory stays within 6GB budget [HIL]

Summary: Run the full pipeline (Tier 1+2+3 + recording + logging) for 30 minutes; verify peak memory stays below 6GB (the semantic module allocation).
Traces to: AC-RESOURCE-CONSTRAINTS, RESTRICT-HARDWARE
Metric: Peak RSS memory of semantic detection + VLM processes
Preconditions:
- Jetson Orin Nano Super, 15W mode, active cooling
- All components loaded
Monitoring:
- tegrastats logging at 1-second intervals: GPU memory, CPU memory, swap
Duration: 30 minutes
Pass criteria: Peak (semantic + VLM) memory < 6GB; no OOM kills; no swap usage above 100MB
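The tegrastats samples can be reduced to the pass criteria with a small parser. Two caveats: tegrastats reports board-wide totals, so per-process RSS for the semantic and VLM processes would have to come from /proc instead; and the line format varies across JetPack releases, so the regex below is an assumption to validate against real output:

```python
import re

def parse_tegrastats(line):
    """Extract RAM and swap usage (MB) from one tegrastats sample line.
    Assumed format: 'RAM used/totalMB ... SWAP used/totalMB ...'."""
    ram = re.search(r"RAM (\d+)/(\d+)MB", line)
    swap = re.search(r"SWAP (\d+)/(\d+)MB", line)
    return {
        "ram_used_mb": int(ram.group(1)),
        "ram_total_mb": int(ram.group(2)),
        "swap_used_mb": int(swap.group(1)),
    }

sample = "RAM 5210/7850MB (lfb 4x2MB) SWAP 64/3925MB (cached 0MB)"
stats = parse_tegrastats(sample)
within_budget = stats["ram_used_mb"] < 6 * 1024 and stats["swap_used_mb"] <= 100
```

Applying this over the full 30-minute log and taking the maximum of each field gives the peak values the pass criteria ask for.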
### NFT-RES-LIM-02: Thermal stability under sustained load [HIL]

Summary: Run continuous inference for 60 minutes; verify T_junction stays below 75°C with active cooling.
Traces to: RESTRICT-HARDWARE
Metric: T_junction max, T_junction average
Preconditions:
- Jetson Orin Nano Super, 15W mode, active cooling fan running
- Ambient temperature 20-25°C
Monitoring:
- Temperature sensors via tegrastats at 1-second intervals
Duration: 60 minutes
Pass criteria: T_junction max < 75°C; no thermal throttling events
### NFT-RES-LIM-03: NVMe recording endurance [HIL]

Summary: Record frames to NVMe at the Level 2 rate (30 FPS, 1080p JPEG) for 2 hours; verify no write errors.
Traces to: AC-SCAN-ALGORITHM (recording)
Metric: Frames written, write errors, NVMe health
Preconditions:
- NVMe SSD ≥256GB, ≥30% free space
Monitoring:
- Write errors via dmesg
- NVMe SMART data before and after
Duration: 2 hours
Pass criteria: Zero write errors; SMART indicators nominal; storage usage matches expected (~120GB for 2h at 30 FPS)
### NFT-RES-LIM-04: Cold start time ≤60 seconds [HIL]

Summary: Power on the Jetson; measure time from boot to first successful detection.
Traces to: RESTRICT-OPERATIONAL
Metric: Time from power-on to first detection result (seconds)
Preconditions:
- JetPack 6.2 on NVMe, all models pre-exported as TRT engines
Steps:
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Power on Jetson | Start timer |
| 2 | Poll /api/v1/health every 1s | — |
| 3 | When health returns 200, submit test frame | Record time to first detection |
Pass criteria: First detection within 60 seconds of power-on
Duration: 90 seconds max
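Steps 1-3 can be sketched as a single timing function, with `check_health` and `run_detection` as hypothetical wrappers around GET /api/v1/health and the test-frame submission:

```python
import time

def measure_cold_start(check_health, run_detection,
                       timeout_s=90.0, poll_interval_s=1.0):
    """Time from t0 (power-on) to the first successful detection:
    poll the health endpoint, then submit a test frame once it is up."""
    t0 = time.monotonic()
    while time.monotonic() - t0 < timeout_s:
        if check_health():       # e.g. health endpoint returns 200
            run_detection()      # submit test frame, wait for a result
            return time.monotonic() - t0
        time.sleep(poll_interval_s)
    raise TimeoutError("no detection within the cold-start window")
```

Starting the timer at test-script launch rather than at actual power-on understates boot time, so in the HIL setup t0 should be tied to the power event (e.g. a controllable power relay), not to this function's invocation.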