detections-semantic/_docs/02_plans/integration_tests/non_functional_tests.md
Oleksandr Bezdieniezhnykh 8e2ecf50fd Initial commit
Made-with: Cursor
2026-03-26 00:20:30 +02:00

E2E Non-Functional Tests

Performance Tests

NFT-PERF-01: Tier 1 inference latency ≤100ms [HIL]

Summary: Measure Tier 1 (YOLOE TRT FP16) inference latency on Jetson Orin Nano Super with a real TensorRT engine.

Traces to: AC-LATENCY-TIER1

Metric: p95 inference latency per frame (ms)

Preconditions:

  • Jetson Orin Nano Super with JetPack 6.2
  • YOLOE TRT FP16 engine loaded
  • Active cooling enabled, T_junction < 70°C

Steps:

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Submit 100 frames (semantic01-04.png cycled) with 100ms interval | Record per-frame inference time from API response header |
| 2 | | Compute p50, p95, p99 latency |

Pass criteria: p95 latency < 100ms

Duration: 15 seconds
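The percentile step can be sketched as follows. This is a minimal illustration, not the project's harness: it assumes the nearest-rank percentile definition (the plan does not say which definition is intended), and the names `percentile` and `summarize_latency` are hypothetical.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile: the smallest sample such that at least
    q percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_latency(latencies_ms, limit_ms=100.0):
    """Compute p50/p95/p99 and apply the 'p95 < limit' pass criterion."""
    p95 = percentile(latencies_ms, 95)
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": p95,
        "p99": percentile(latencies_ms, 99),
        "pass": p95 < limit_ms,
    }
```

The same helper would apply to NFT-PERF-02 and NFT-PERF-03 with `limit_ms` set to 50 (or 200 for V2) and 5000 respectively.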


NFT-PERF-02: Tier 2 heuristic latency ≤50ms

Summary: Measure V1 heuristic endpoint analysis (skeletonization + endpoint + darkness check) latency.

Traces to: AC-LATENCY-TIER2

Metric: p95 processing latency per ROI (ms)

Preconditions:

  • Tier 1 has produced footpath segmentation masks

Steps:

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Submit 50 frames with mock YOLO footpath masks | Record Tier 2 processing time from detection log |
| 2 | | Compute p50, p95 latency |

Pass criteria: p95 latency < 50ms (V1 heuristic), < 200ms (V2 CNN)

Duration: 10 seconds


NFT-PERF-03: Tier 3 VLM latency ≤5s

Summary: Measure VLM inference latency including image encoding, prompt processing, and response generation.

Traces to: AC-LATENCY-TIER3

Metric: End-to-end VLM analysis time per ROI (ms)

Preconditions:

  • NanoLLM with VILA1.5-3B loaded (or vlm-stub for Docker-based test)

Steps:

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Trigger 10 Tier 3 analyses on different ROIs | Record time from VLM request to response via detection log |
| 2 | | Compute p50, p95 latency |

Pass criteria: p95 latency < 5000ms

Duration: 60 seconds


NFT-PERF-04: Full pipeline throughput under continuous frame input

Summary: Submit frames at 10 FPS for 60 seconds; measure detection throughput and queue depth.

Traces to: AC-LATENCY-TIER1, AC-SCAN-ALGORITHM

Metric: Frames processed per second, max queue depth

Preconditions:

  • All tiers active, mock services responding

Steps:

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Submit 600 frames at 10 FPS (60s) | Count processed frames from detection log |
| 2 | | Record queue depth if available from API status endpoint |

Pass criteria: ≥8 FPS sustained processing rate; no frames silently dropped (all either processed or explicitly skipped with quality gate reason)

Duration: 75 seconds
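The two pass conditions reduce to simple frame accounting. The sketch below assumes the sustained rate is measured over the 60-second submission window and that explicitly skipped frames do not count toward the processed rate; both are interpretations, and `check_throughput` is a hypothetical name.

```python
def check_throughput(submitted, processed, skipped_with_reason,
                     window_s=60.0, min_fps=8.0):
    """Apply the NFT-PERF-04 criteria to counts taken from the detection log.

    A frame is 'unaccounted' (silently dropped) if it was submitted but is
    neither processed nor skipped with a logged quality gate reason.
    """
    unaccounted = submitted - processed - skipped_with_reason
    fps = processed / window_s
    return {
        "fps": fps,
        "unaccounted": unaccounted,
        "pass": fps >= min_fps and unaccounted == 0,
    }
```

For example, 540 processed and 60 skipped out of 600 submitted gives 9.0 FPS with nothing unaccounted, while 500 processed and 60 skipped leaves 40 frames unexplained and fails.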


Resilience Tests

NFT-RES-01: Semantic process crash and recovery

Summary: Kill the semantic detection process; verify watchdog restarts it within 10 seconds and processing resumes.

Traces to: AC-SCAN-ALGORITHM (degradation)

Preconditions:

  • Semantic detection running and processing frames

Fault injection:

  • Kill semantic process via signal (SIGKILL)

Steps:

| Step | Action | Expected Behavior |
|------|--------|-------------------|
| 1 | Submit 5 frames successfully | Detections returned |
| 2 | Kill semantic process | Frame processing stops |
| 3 | Wait up to 10 seconds | Watchdog detects crash, restarts process |
| 4 | Submit 5 more frames | Detections returned again |

Pass criteria: Recovery within 10 seconds; no data corruption in detection log; frames submitted during downtime are either queued or rejected (not silently dropped)
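One way to evaluate the 10-second recovery criterion is to probe the service at a fixed interval during the test and analyse the samples afterwards. The sketch below assumes the harness records `(timestamp_s, ok)` pairs and the time the kill signal was sent; the function names are hypothetical. It only counts a success as recovery once an outage has been observed, so a response that was in flight at kill time is not mistaken for a restart.

```python
def recovery_time_s(kill_time_s, probes):
    """Seconds from the kill to the first successful probe that follows an
    observed failure, or None if the service never came back.

    probes: iterable of (timestamp_s, ok) tuples in chronological order.
    """
    saw_outage = False
    for ts, ok in probes:
        if ts < kill_time_s:
            continue  # ignore samples taken before the fault was injected
        if not ok:
            saw_outage = True
        elif saw_outage:
            return ts - kill_time_s
    return None


def recovered_in_time(kill_time_s, probes, limit_s=10.0):
    """True if recovery was observed within limit_s of the kill."""
    elapsed = recovery_time_s(kill_time_s, probes)
    return elapsed is not None and elapsed <= limit_s
```

The measured value is bounded by the probe interval, so a short interval relative to the 10-second limit is advisable.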


NFT-RES-02: VLM load/unload cycle stability

Summary: Load and unload VLM 10 times; verify no memory leak and successful inference after each reload.

Traces to: AC-RESOURCE-CONSTRAINTS

Preconditions:

  • VLM process manageable via API/signal

Fault injection:

  • Alternating VLM load/unload commands

Steps:

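| Step | Action | Expected Behavior |
|------|--------|-------------------|
| 1 | Load VLM, run 1 inference | Success, record memory |
| 2 | Unload VLM, record memory | Memory decreases |
| 3 | Repeat steps 1-2 until 10 cycles are complete | |
| 4 | Compare memory at cycle 1 vs cycle 10 | Delta < 100MB |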

Pass criteria: No memory leak (delta < 100MB over 10 cycles); all inferences succeed
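The leak check can be expressed over the per-cycle readings. The plan does not state which reading is compared between cycle 1 and cycle 10; this sketch assumes the post-unload reading, since that is where retained memory would show up. `Cycle` and `evaluate_cycles` are hypothetical names.

```python
from typing import NamedTuple


class Cycle(NamedTuple):
    loaded_mb: float      # memory recorded after load + inference
    unloaded_mb: float    # memory recorded after unload
    inference_ok: bool    # whether the inference in this cycle succeeded


def evaluate_cycles(cycles, limit_mb=100.0):
    """Apply the NFT-RES-02 criteria to a list of per-cycle readings."""
    if len(cycles) < 2:
        raise ValueError("need at least two cycles to compute a delta")
    delta_mb = cycles[-1].unloaded_mb - cycles[0].unloaded_mb
    all_released = all(c.unloaded_mb < c.loaded_mb for c in cycles)
    all_ok = all(c.inference_ok for c in cycles)
    return {
        "delta_mb": delta_mb,
        "memory_decreases_on_unload": all_released,
        "all_inferences_ok": all_ok,
        "pass": delta_mb < limit_mb and all_ok,
    }
```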


NFT-RES-03: Gimbal CRC failure handling

Summary: Inject corrupted gimbal command responses; verify CRC layer detects corruption and retries.

Traces to: AC-CAMERA-CONTROL

Preconditions:

  • Mock gimbal configured to return corrupted responses for first 2 attempts, valid on 3rd

Fault injection:

  • Mock gimbal flips random bits in response CRC

Steps:

| Step | Action | Expected Behavior |
|------|--------|-------------------|
| 1 | Issue pan command | First 2 responses rejected (bad CRC) |
| 2 | Automatic retry | 3rd attempt succeeds |
| 3 | Read gimbal command log | Log shows 2 CRC failures + 1 success |

Pass criteria: Command succeeds after retries; CRC failures logged; no crash
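The scenario above can be modelled end to end with a small mock. Everything protocol-specific here is an assumption made for illustration: the plan does not name the gimbal's CRC algorithm or framing, so the sketch uses a 16-bit CRC-CCITT trailer (Python's `binascii.crc_hqx`), a made-up `ACK:` response, and a deterministic single-bit flip in place of random bit flips.

```python
import binascii
import struct


def frame(payload: bytes) -> bytes:
    """Append a little-endian CRC-16 trailer (assumed framing)."""
    return payload + struct.pack("<H", binascii.crc_hqx(payload, 0))


def crc_ok(raw: bytes) -> bool:
    """Check the trailer against a CRC recomputed over the payload."""
    if len(raw) < 2:
        return False
    (crc,) = struct.unpack("<H", raw[-2:])
    return binascii.crc_hqx(raw[:-2], 0) == crc


class MockGimbal:
    """Returns a corrupted response for the first `corrupt_first` commands."""

    def __init__(self, corrupt_first=2):
        self.corrupt_remaining = corrupt_first

    def send(self, command: bytes) -> bytes:
        response = bytearray(frame(b"ACK:" + command))
        if self.corrupt_remaining > 0:
            self.corrupt_remaining -= 1
            response[-1] ^= 0x01  # flip one bit inside the CRC trailer
        return bytes(response)


def send_with_retry(gimbal, command: bytes, max_attempts=3):
    """Send a command, retrying on CRC failure; returns (payload, log)."""
    log = []
    for attempt in range(1, max_attempts + 1):
        raw = gimbal.send(command)
        if crc_ok(raw):
            log.append((attempt, "ok"))
            return raw[:-2], log
        log.append((attempt, "crc_failure"))
    raise IOError(f"no valid response after {max_attempts} attempts")
```

With `MockGimbal(corrupt_first=2)` the returned log contains two `crc_failure` entries followed by one `ok`, which is the shape step 3 expects to find in the gimbal command log.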


Security Tests

NFT-SEC-01: No external network access from semantic detection

Summary: Verify the semantic detection service makes no outbound network connections outside the Docker network.

Traces to: RESTRICT-SOFTWARE (local-only inference)

Steps:

| Step | Consumer Action | Expected Response |
|------|-----------------|-------------------|
| 1 | Run semantic detection pipeline on test frames | Detections produced |
| 2 | Monitor network traffic from semantic-detection container (via tcpdump on e2e-net) | No packets to external IPs |

Pass criteria: Zero outbound connections to external networks
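Once destination addresses have been extracted from the capture, deciding what counts as external is a subnet membership check. The sketch below assumes the capture has already been reduced to a list of destination IP strings and that the subnet of e2e-net is known; the `172.28.0.0/16` value in the usage note is a placeholder, not the project's actual configuration.

```python
import ipaddress


def external_destinations(dst_ips, docker_subnet):
    """Return the sorted, de-duplicated destinations that are neither
    inside the Docker network nor loopback."""
    net = ipaddress.ip_network(docker_subnet)
    external = set()
    for raw in dst_ips:
        ip = ipaddress.ip_address(raw)
        if ip in net or ip.is_loopback:
            continue
        external.add(raw)
    return sorted(external)
```

Calling `external_destinations(captured, "172.28.0.0/16")` should return an empty list for the test to pass. Whether multicast or link-local traffic should also be tolerated is a policy decision this sketch leaves to the caller.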


NFT-SEC-02: Model files are not accessible via API

Summary: Verify TRT engine files and VLM model weights cannot be downloaded through the API.

Traces to: RESTRICT-SOFTWARE

Steps:

| Step | Consumer Action | Expected Response |
|------|-----------------|-------------------|
| 1 | Attempt directory traversal via API: GET /api/v1/../models/ | 404 or 400 |
| 2 | Attempt known model path: GET /api/v1/detect?path=/models/yoloe.engine | No model content returned |

Pass criteria: Model files inaccessible via any API endpoint


Resource Limit Tests

NFT-RES-LIM-01: Memory stays within 6GB budget [HIL]

Summary: Run full pipeline (Tier 1+2+3 + recording + logging) for 30 minutes; verify peak memory stays below 6GB (semantic module allocation).

Traces to: AC-RESOURCE-CONSTRAINTS, RESTRICT-HARDWARE

Metric: Peak RSS memory of semantic detection + VLM processes

Preconditions:

  • Jetson Orin Nano Super, 15W mode, active cooling
  • All components loaded

Monitoring:

  • tegrastats logging at 1-second intervals: GPU memory, CPU memory, swap

Duration: 30 minutes

Pass criteria: Peak (semantic + VLM) memory < 6GB; no OOM kills; no swap usage above 100MB
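Peak RAM and swap can be extracted from the tegrastats log with a small parser. This sketch assumes the usual `RAM used/totalMB ... SWAP used/totalMB` fields appear on each line; the exact line layout varies between JetPack releases and should be checked against a real capture. Note that tegrastats reports system-wide memory, so the figure is an upper bound on the semantic + VLM processes; the per-process RSS named in the metric would need a separate source.

```python
import re

# Matches e.g. "RAM 4120/7620MB (lfb 4x4MB) SWAP 12/3810MB (cached 0MB)"
_MEM_RE = re.compile(r"RAM (\d+)/(\d+)MB.*?SWAP (\d+)/(\d+)MB")


def parse_memory(line):
    """Return used RAM and swap in MB for one tegrastats line, or None."""
    m = _MEM_RE.search(line)
    if m is None:
        return None
    ram_used, _ram_total, swap_used, _swap_total = map(int, m.groups())
    return {"ram_used_mb": ram_used, "swap_used_mb": swap_used}


def peak_usage(lines):
    """Peak RAM and swap over all parseable lines of a tegrastats log."""
    samples = [s for s in map(parse_memory, lines) if s is not None]
    if not samples:
        raise ValueError("no tegrastats memory samples found")
    return {
        "ram_used_mb": max(s["ram_used_mb"] for s in samples),
        "swap_used_mb": max(s["swap_used_mb"] for s in samples),
    }
```

The resulting `swap_used_mb` can be compared directly with the 100MB limit; OOM kills are not visible in tegrastats and would have to come from the kernel log.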


NFT-RES-LIM-02: Thermal stability under sustained load [HIL]

Summary: Run continuous inference for 60 minutes; verify T_junction stays below 75°C with active cooling.

Traces to: RESTRICT-HARDWARE

Metric: T_junction max, T_junction average

Preconditions:

  • Jetson Orin Nano Super, 15W mode, active cooling fan running
  • Ambient temperature 20-25°C

Monitoring:

  • Temperature sensors via tegrastats at 1-second intervals

Duration: 60 minutes

Pass criteria: T_junction max < 75°C; no thermal throttling events
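The two temperature metrics can be derived from the same tegrastats log. The sketch assumes the junction temperature appears as a `tj@<value>C` field, which is how it is commonly printed on Orin-class modules; the field name is an assumption to confirm on the target JetPack version, and `tj_stats` is a hypothetical helper.

```python
import re

# Matches e.g. "tj@61.5C"; case-insensitive because field casing has
# differed between releases.
_TJ_RE = re.compile(r"\btj@(-?\d+(?:\.\d+)?)C", re.IGNORECASE)


def tj_stats(lines, limit_c=75.0):
    """Return max and average junction temperature and the pass verdict
    for the 'T_junction max < limit' criterion."""
    temps = []
    for line in lines:
        m = _TJ_RE.search(line)
        if m is not None:
            temps.append(float(m.group(1)))
    if not temps:
        raise ValueError("no tj@ samples found")
    t_max = max(temps)
    return {
        "max_c": t_max,
        "avg_c": sum(temps) / len(temps),
        "pass": t_max < limit_c,
    }
```

Thermal throttling events are not covered by this check and would need their own signal.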


NFT-RES-LIM-03: NVMe recording endurance [HIL]

Summary: Record frames to NVMe at Level 2 rate (30 FPS, 1080p JPEG) for 2 hours; verify no write errors.

Traces to: AC-SCAN-ALGORITHM (recording)

Metric: Frames written, write errors, NVMe health

Preconditions:

  • NVMe SSD ≥256GB, ≥30% free space

Monitoring:

  • Write errors via dmesg
  • NVMe SMART data before and after

Duration: 2 hours

Pass criteria: Zero write errors; SMART indicators nominal; storage usage matches expected (~120GB for 2h at 30FPS)
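The ~120GB figure implies a particular average frame size and write rate, which the short calculation below makes explicit. It assumes decimal gigabytes and treats 120GB as exact; the function names are illustrative only.

```python
def recording_budget(fps=30, hours=2.0, expected_total_gb=120.0):
    """Derive frame count, implied bytes per frame, and write rate."""
    seconds = hours * 3600
    frames = int(fps * seconds)
    total_bytes = expected_total_gb * 1e9
    return {
        "frames": frames,
        "bytes_per_frame": total_bytes / frames,
        "write_mb_per_s": total_bytes / seconds / 1e6,
    }


def min_free_gb(capacity_gb=256.0, free_fraction=0.30):
    """Free space guaranteed by the '>=256GB, >=30% free' precondition."""
    return capacity_gb * free_fraction


budget = recording_budget()
fits = min_free_gb() >= 120.0
```

Two hours at 30 FPS is 216,000 frames, so 120GB corresponds to roughly 556KB per frame and about 16.7MB/s of sustained writes. Taken at face value, the minimum precondition (30% of 256GB, about 76.8GB free) would not hold the expected 120GB, so the free-space precondition or the expected size may be worth revisiting.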


NFT-RES-LIM-04: Cold start time ≤60 seconds [HIL]

Summary: Power on Jetson, measure time from boot to first successful detection.

Traces to: RESTRICT-OPERATIONAL

Metric: Time from power-on to first detection result (seconds)

Preconditions:

  • JetPack 6.2 on NVMe, all models pre-exported as TRT engines

Steps:

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Power on Jetson | Start timer |
| 2 | Poll /api/v1/health every 1s | |
| 3 | When health returns 200, submit test frame | Record time to first detection |

Pass criteria: First detection within 60 seconds of power-on Duration: 90 seconds max
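The polling loop can be sketched as below, assuming the timer runs on a separate host that triggers power-on, since the Jetson cannot time its own boot. `check_health` and `submit_frame` are hypothetical callables standing in for the real HTTP calls; the clock and sleep functions are injectable so the loop can be exercised without waiting.

```python
import time


def time_to_first_detection(check_health, submit_frame,
                            timeout_s=90.0, poll_interval_s=1.0,
                            clock=time.monotonic, sleep=time.sleep):
    """Poll health until it returns 200, then submit a test frame.

    check_health: returns an HTTP status code, or None if unreachable.
    submit_frame: returns True once a detection result is received.
    Returns elapsed seconds since the call started, or None on timeout.
    """
    start = clock()
    while clock() - start < timeout_s:
        if check_health() == 200 and submit_frame():
            return clock() - start
        sleep(poll_interval_s)
    return None
```

The returned value is then compared with the 60-second limit; a `None` result means the 90-second maximum duration elapsed without a detection.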