mirror of
https://github.com/azaion/detections-semantic.git
synced 2026-04-22 23:06:38 +00:00
Initial commit
Made-with: Cursor
# E2E Non-Functional Tests

## Performance Tests
### NFT-PERF-01: Tier 1 inference latency ≤100ms [HIL]

**Summary**: Measure Tier 1 (YOLOE TRT FP16) inference latency on Jetson Orin Nano Super with real TensorRT engine.
**Traces to**: AC-LATENCY-TIER1
**Metric**: p95 inference latency per frame (ms)

**Preconditions**:
- Jetson Orin Nano Super with JetPack 6.2
- YOLOE TRT FP16 engine loaded
- Active cooling enabled, T_junction < 70°C

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 100 frames (semantic01-04.png cycled) with 100ms interval | Record per-frame inference time from API response header |
| 2 | Compute p50, p95, p99 latency | — |

**Pass criteria**: p95 latency < 100ms
**Duration**: 15 seconds

---
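Step 2 of NFT-PERF-01 (computing p50, p95 and p99) can be sketched as follows. This is a minimal sketch, assuming the per-frame latencies have already been read from the response headers into a list of milliseconds. The nearest-rank percentile definition and the names `percentile` and `summarize` are illustrative choices, not something the spec prescribes.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values <= it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # pct * n is computed first so integer percentiles stay exact before division.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, limit_ms=100.0):
    """Return p50/p95/p99 and whether p95 is strictly below the limit."""
    p50, p95, p99 = (percentile(latencies_ms, p) for p in (50, 95, 99))
    return {"p50": p50, "p95": p95, "p99": p99, "passed": p95 < limit_ms}
```

The same helper applies to NFT-PERF-02 and NFT-PERF-03 by passing `limit_ms=50.0` or `limit_ms=5000.0`. With only 10 samples, as in NFT-PERF-03, the nearest-rank p95 is simply the largest sample.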

### NFT-PERF-02: Tier 2 heuristic latency ≤50ms

**Summary**: Measure V1 heuristic endpoint analysis (skeletonization + endpoint + darkness check) latency.
**Traces to**: AC-LATENCY-TIER2
**Metric**: p95 processing latency per ROI (ms)

**Preconditions**:
- Tier 1 has produced footpath segmentation masks

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 50 frames with mock YOLO footpath masks | Record Tier 2 processing time from detection log |
| 2 | Compute p50, p95 latency | — |

**Pass criteria**: p95 latency < 50ms (V1 heuristic), < 200ms (V2 CNN)
**Duration**: 10 seconds

---
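To make the "endpoint" part of the NFT-PERF-02 summary concrete, here is a sketch of endpoint detection on an already-skeletonized mask. It assumes the skeleton is a binary grid (list of rows) and that an endpoint is a skeleton pixel with exactly one 8-connected skeleton neighbour. That is a common definition, but the source does not say how the V1 heuristic defines endpoints, and `find_endpoints` is a hypothetical name.

```python
def find_endpoints(skeleton):
    """Return (row, col) of skeleton pixels that have exactly one 8-connected neighbour."""
    rows = len(skeleton)
    cols = len(skeleton[0]) if rows else 0
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    endpoints = []
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            # Count set pixels among the eight neighbours that lie inside the grid.
            neighbours = sum(
                1
                for dr, dc in offsets
                if 0 <= r + dr < rows and 0 <= c + dc < cols and skeleton[r + dr][c + dc]
            )
            if neighbours == 1:
                endpoints.append((r, c))
    return endpoints
```

A mock footpath mask for the test could be as small as a single horizontal line of pixels, whose two ends are the only endpoints.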

### NFT-PERF-03: Tier 3 VLM latency ≤5s

**Summary**: Measure VLM inference latency including image encoding, prompt processing, and response generation.
**Traces to**: AC-LATENCY-TIER3
**Metric**: End-to-end VLM analysis time per ROI (ms)

**Preconditions**:
- NanoLLM with VILA1.5-3B loaded (or vlm-stub for Docker-based test)

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Trigger 10 Tier 3 analyses on different ROIs | Record time from VLM request to response via detection log |
| 2 | Compute p50, p95 latency | — |

**Pass criteria**: p95 latency < 5000ms
**Duration**: 60 seconds

---

### NFT-PERF-04: Full pipeline throughput under continuous frame input

**Summary**: Submit frames at 10 FPS for 60 seconds; measure detection throughput and queue depth.
**Traces to**: AC-LATENCY-TIER1, AC-SCAN-ALGORITHM
**Metric**: Frames processed per second, max queue depth

**Preconditions**:
- All tiers active, mock services responding

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 600 frames at 10 FPS (60s) | Count processed frames from detection log |
| 2 | Record queue depth if available from API status endpoint | — |

**Pass criteria**: ≥8 FPS sustained processing rate; no frames silently dropped (all either processed or explicitly skipped with quality gate reason)
**Duration**: 75 seconds

---
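The NFT-PERF-04 pass criteria combine a rate check with an accounting check: every submitted frame must end up either processed or skipped with a reason. A minimal sketch of that accounting follows. It assumes frame IDs can be recovered from the detection log and that the sustained rate is the processed count divided by the 60-second submission window; the function name and the shape of `skipped` are hypothetical.

```python
def check_throughput(submitted_ids, processed_ids, skipped, window_s=60.0, min_fps=8.0):
    """Account for every submitted frame and compute the sustained processing rate.

    skipped maps frame id -> quality-gate reason; an empty reason counts as a silent drop.
    """
    processed = set(processed_ids)
    explained = {fid for fid, reason in skipped.items() if reason}
    silently_dropped = sorted(set(submitted_ids) - processed - explained)
    fps = len(processed) / window_s
    return {
        "fps": fps,
        "silently_dropped": silently_dropped,
        "passed": fps >= min_fps and not silently_dropped,
    }
```

For example, 500 processed frames out of 600 gives about 8.3 FPS, which passes only if the remaining 100 frames all carry a quality-gate reason.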

## Resilience Tests

### NFT-RES-01: Semantic process crash and recovery

**Summary**: Kill the semantic detection process; verify watchdog restarts it within 10 seconds and processing resumes.
**Traces to**: AC-SCAN-ALGORITHM (degradation)

**Preconditions**:
- Semantic detection running and processing frames

**Fault injection**:
- Kill semantic process via signal (SIGKILL)

**Steps**:

| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Submit 5 frames successfully | Detections returned |
| 2 | Kill semantic process | Frame processing stops |
| 3 | Wait up to 10 seconds | Watchdog detects crash, restarts process |
| 4 | Submit 5 more frames | Detections returned again |

**Pass criteria**: Recovery within 10 seconds; no data corruption in detection log; frames submitted during downtime are either queued or rejected (not silently dropped)

---
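Step 3 of NFT-RES-01 ("wait up to 10 seconds") is a bounded polling loop. The sketch below assumes a caller-supplied `probe` callable that returns `True` once the service answers again, for instance by submitting a frame and checking for detections. The function name, the 0.5-second poll interval, and the injectable `clock` and `sleep` parameters are illustrative and not taken from the source.

```python
import time


def wait_for_recovery(probe, timeout_s=10.0, interval_s=0.5,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True; return elapsed seconds, or None on timeout."""
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)
```

Using a monotonic clock keeps the measurement immune to wall-clock adjustments, and passing the clock in makes the loop testable without real waiting. The same loop shape fits the health polling in NFT-RES-LIM-04 with a longer timeout.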

### NFT-RES-02: VLM load/unload cycle stability

**Summary**: Load and unload VLM 10 times; verify no memory leak and successful inference after each reload.
**Traces to**: AC-RESOURCE-CONSTRAINTS

**Preconditions**:
- VLM process manageable via API/signal

**Fault injection**:
- Alternating VLM load/unload commands

**Steps**:

| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Load VLM, run 1 inference | Success, record memory |
| 2 | Unload VLM, record memory | Memory decreases |
| 3 | Repeat steps 1 and 2 until 10 cycles are complete | — |
| 4 | Compare memory at cycle 1 vs cycle 10 | Delta < 100MB |

**Pass criteria**: No memory leak (delta < 100MB over 10 cycles); all inferences succeed

---
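The NFT-RES-02 comparison in step 4 can be sketched as below. The spec does not say which reading is compared between cycle 1 and cycle 10, so this sketch assumes the post-unload reading; the function name and the list-based inputs (one reading in MB per cycle) are hypothetical.

```python
def check_vlm_cycles(loaded_mb, unloaded_mb, max_delta_mb=100.0):
    """Compare memory after the first and last unload, and confirm each unload frees memory."""
    if not loaded_mb or len(loaded_mb) != len(unloaded_mb):
        raise ValueError("need one loaded and one unloaded reading per cycle")
    # Step 2 expects memory to decrease on every unload.
    frees_memory = all(after < before for before, after in zip(loaded_mb, unloaded_mb))
    # Assumption: leak is measured as growth of the post-unload baseline.
    delta = unloaded_mb[-1] - unloaded_mb[0]
    return {
        "delta_mb": delta,
        "frees_memory": frees_memory,
        "passed": frees_memory and delta < max_delta_mb,
    }
```

Comparing the unloaded baseline rather than the loaded peak avoids mistaking normal inference-time allocation for a leak.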

### NFT-RES-03: Gimbal CRC failure handling

**Summary**: Inject corrupted gimbal command responses; verify CRC layer detects corruption and retries.
**Traces to**: AC-CAMERA-CONTROL

**Preconditions**:
- Mock gimbal configured to return corrupted responses for first 2 attempts, valid on 3rd

**Fault injection**:
- Mock gimbal flips random bits in response CRC

**Steps**:

| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Issue pan command | First 2 responses rejected (bad CRC) |
| 2 | Automatic retry | 3rd attempt succeeds |
| 3 | Read gimbal command log | Log shows 2 CRC failures + 1 success |

**Pass criteria**: Command succeeds after retries; CRC failures logged; no crash

---
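The retry behaviour that NFT-RES-03 expects can be sketched as a verify-and-retry loop. The gimbal's real frame layout and CRC polynomial are not given in the source, so this sketch assumes a payload followed by a 4-byte big-endian CRC32 (`zlib.crc32`) purely for illustration. `frame`, `crc_ok`, `send_with_retry`, and the attempt limit of 3 are hypothetical.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a big-endian CRC32 of the payload (assumed framing, not the real protocol)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def crc_ok(data: bytes) -> bool:
    """True when the trailing 4 bytes match the CRC32 of the preceding bytes."""
    if len(data) < 4:
        return False
    return zlib.crc32(data[:-4]).to_bytes(4, "big") == data[-4:]


def send_with_retry(transport, command: bytes, max_attempts: int = 3):
    """Send the command until a response passes the CRC check; log every attempt."""
    log = []
    for attempt in range(1, max_attempts + 1):
        response = transport(frame(command))
        if crc_ok(response):
            log.append((attempt, "ok"))
            return response[:-4], log
        log.append((attempt, "crc_failure"))
    raise IOError(f"no valid response after {max_attempts} attempts: {log}")
```

A mock transport that flips one bit in the CRC bytes of its first two responses should yield a log of two `crc_failure` entries followed by one `ok`, matching step 3.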

## Security Tests

### NFT-SEC-01: No external network access from semantic detection

**Summary**: Verify the semantic detection service makes no outbound network connections outside the Docker network.
**Traces to**: RESTRICT-SOFTWARE (local-only inference)

**Steps**:

| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Run semantic detection pipeline on test frames | Detections produced |
| 2 | Monitor network traffic from semantic-detection container (via tcpdump on e2e-net) | No packets to external IPs |

**Pass criteria**: Zero outbound connections to external networks

---
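For NFT-SEC-01, once destination addresses have been extracted from the tcpdump capture, deciding what counts as "external" reduces to a subnet membership check. The sketch below assumes the capture has already been reduced to a list of destination IP strings and that the subnet of `e2e-net` is known. The function name and the example subnet `172.28.0.0/16` are hypothetical.

```python
import ipaddress


def external_destinations(dest_ips, allowed_cidrs):
    """Return destination IPs outside every allowed network; loopback counts as internal."""
    allowed = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    external = []
    for raw in dest_ips:
        ip = ipaddress.ip_address(raw)
        if ip.is_loopback or any(ip in net for net in allowed):
            continue
        external.append(raw)
    return external
```

The test passes when `external_destinations(captured, ["172.28.0.0/16"])` returns an empty list for the whole capture.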

### NFT-SEC-02: Model files are not accessible via API

**Summary**: Verify TRT engine files and VLM model weights cannot be downloaded through the API.
**Traces to**: RESTRICT-SOFTWARE

**Steps**:

| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Attempt directory traversal via API: GET /api/v1/../models/ | 404 or 400 |
| 2 | Attempt known model path: GET /api/v1/detect?path=/models/yoloe.engine | No model content returned |

**Pass criteria**: Model files inaccessible via any API endpoint

---
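The traversal request in step 1 of NFT-SEC-02 is worth checking on the client side, because some HTTP clients collapse `..` segments before sending (curl does so unless `--path-as-is` is given), in which case the server never sees the traversal. The sketch below shows what the path resolves to and whether it leaves the API prefix. `escapes_prefix` is a hypothetical helper, and the `/api/v1` prefix is taken from the paths in the table.

```python
import posixpath


def escapes_prefix(request_path, prefix="/api/v1"):
    """True if the path, after resolving '.' and '..' segments, leaves the given prefix."""
    resolved = posixpath.normpath(request_path)
    return not (resolved == prefix or resolved.startswith(prefix + "/"))
```

Here `/api/v1/../models/` resolves to `/api/models`, so the helper reports an escape, while `/api/v1/detect` stays inside the prefix.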

## Resource Limit Tests

### NFT-RES-LIM-01: Memory stays within 6GB budget [HIL]

**Summary**: Run full pipeline (Tier 1+2+3 + recording + logging) for 30 minutes; verify peak memory stays below 6GB (semantic module allocation).
**Traces to**: AC-RESOURCE-CONSTRAINTS, RESTRICT-HARDWARE
**Metric**: Peak RSS memory of semantic detection + VLM processes

**Preconditions**:
- Jetson Orin Nano Super, 15W mode, active cooling
- All components loaded

**Monitoring**:
- `tegrastats` logging at 1-second intervals: GPU memory, CPU memory, swap

**Duration**: 30 minutes
**Pass criteria**: Peak (semantic + VLM) memory < 6GB; no OOM kills; no swap usage above 100MB

---
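For NFT-RES-LIM-01, the peak values can be pulled out of a saved `tegrastats` log with two regular expressions. This sketch assumes each line contains fields of the form `RAM <used>/<total>MB` and `SWAP <used>/<total>MB`; the exact line layout varies between JetPack releases, so the patterns should be checked against a real log first. Note that `tegrastats` reports board-wide usage, so the per-process RSS named in the metric would need a separate source such as `/proc/<pid>/status`.

```python
import re

# Assumed field formats; verify against an actual tegrastats log.
RAM_RE = re.compile(r"RAM (\d+)/(\d+)MB")
SWAP_RE = re.compile(r"SWAP (\d+)/(\d+)MB")


def peak_memory(lines):
    """Return (peak RAM used, peak swap used) in MB over all parsable lines."""
    peak_ram = peak_swap = 0
    for line in lines:
        ram = RAM_RE.search(line)
        if ram:
            peak_ram = max(peak_ram, int(ram.group(1)))
        swap = SWAP_RE.search(line)
        if swap:
            peak_swap = max(peak_swap, int(swap.group(1)))
    return peak_ram, peak_swap
```

The swap criterion then becomes a check that the second value stays at or below 100.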

### NFT-RES-LIM-02: Thermal stability under sustained load [HIL]

**Summary**: Run continuous inference for 60 minutes; verify T_junction stays below 75°C with active cooling.
**Traces to**: RESTRICT-HARDWARE
**Metric**: T_junction max, T_junction average

**Preconditions**:
- Jetson Orin Nano Super, 15W mode, active cooling fan running
- Ambient temperature 20-25°C

**Monitoring**:
- Temperature sensors via `tegrastats` at 1-second intervals

**Duration**: 60 minutes
**Pass criteria**: T_junction max < 75°C; no thermal throttling events

---
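The two NFT-RES-LIM-02 metrics (T_junction max and average) can be computed from the same kind of log. This sketch assumes the junction temperature appears as a `tj@<value>C` field on each line, which should be confirmed against the actual `tegrastats` output on the target JetPack version. `tj_stats` is a hypothetical name, and throttling events are not covered here.

```python
import re

# Assumed field format for the junction temperature sensor.
TJ_RE = re.compile(r"\btj@(-?\d+(?:\.\d+)?)C", re.IGNORECASE)


def tj_stats(lines):
    """Return (max, mean) junction temperature in degrees C, or None if no reading was found."""
    readings = []
    for line in lines:
        match = TJ_RE.search(line)
        if match:
            readings.append(float(match.group(1)))
    if not readings:
        return None
    return max(readings), sum(readings) / len(readings)
```

The temperature criterion is met when the first returned value is below 75.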

### NFT-RES-LIM-03: NVMe recording endurance [HIL]

**Summary**: Record frames to NVMe at Level 2 rate (30 FPS, 1080p JPEG) for 2 hours; verify no write errors.
**Traces to**: AC-SCAN-ALGORITHM (recording)
**Metric**: Frames written, write errors, NVMe health

**Preconditions**:
- NVMe SSD ≥256GB, ≥30% free space

**Monitoring**:
- Write errors via dmesg
- NVMe SMART data before and after

**Duration**: 2 hours
**Pass criteria**: Zero write errors; SMART indicators nominal; storage usage matches expected (~120GB for 2h at 30FPS)

---
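The storage figure in the NFT-RES-LIM-03 pass criteria can be cross-checked with a little arithmetic. The sketch below derives the frame count, the per-frame size implied by ~120GB, and the free space guaranteed by the minimum precondition. It uses decimal units (1 GB = 10^6 kB), and the function name is hypothetical.

```python
def recording_budget(hours=2.0, fps=30, expected_total_gb=120.0,
                     disk_gb=256.0, free_fraction=0.30):
    """Derive frame count, implied per-frame size, and the minimum guaranteed free space."""
    frames = int(hours * 3600 * fps)
    implied_kb_per_frame = expected_total_gb * 1e6 / frames  # decimal units
    min_free_gb = disk_gb * free_fraction
    return {
        "frames": frames,
        "implied_kb_per_frame": implied_kb_per_frame,
        "min_free_gb": min_free_gb,
        "fits": min_free_gb >= expected_total_gb,
    }
```

With the defaults this gives 216,000 frames at roughly 556 kB each. A 256GB drive with 30% free has about 76.8GB available, which is less than the expected ~120GB, so a run on hardware that only just meets the stated precondition would presumably fill the drive before the 2 hours are up.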

### NFT-RES-LIM-04: Cold start time ≤60 seconds [HIL]

**Summary**: Power on Jetson, measure time from boot to first successful detection.
**Traces to**: RESTRICT-OPERATIONAL
**Metric**: Time from power-on to first detection result (seconds)

**Preconditions**:
- JetPack 6.2 on NVMe, all models pre-exported as TRT engines

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Power on Jetson | Start timer |
| 2 | Poll /api/v1/health every 1s | — |
| 3 | When health returns 200, submit test frame | Record time to first detection |

**Pass criteria**: First detection within 60 seconds of power-on
**Duration**: 90 seconds max