# E2E Non-Functional Tests

## Performance Tests

### NFT-PERF-01: Tier 1 inference latency ≤100ms [HIL]

**Summary**: Measure Tier 1 (YOLOE TRT FP16) inference latency on a Jetson Orin Nano Super using a real TensorRT engine.
**Traces to**: AC-LATENCY-TIER1
**Metric**: p95 inference latency per frame (ms)

**Preconditions**:
- Jetson Orin Nano Super with JetPack 6.2
- YOLOE TRT FP16 engine loaded
- Active cooling enabled, T_junction < 70°C

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 100 frames (semantic01-04.png cycled) with 100ms interval | Record per-frame inference time from API response header |
| 2 | Compute p50, p95, p99 latency | — |

**Pass criteria**: p95 latency < 100ms
**Duration**: 15 seconds
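
Step 2 can be sketched as follows. This is a minimal illustration that assumes the nearest-rank percentile definition (the spec does not name one) and that the per-frame times have already been collected into a list; `percentile` and `summarize` are hypothetical helper names, not part of the test harness.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it. (Assumed definition; the spec names none.)"""
    if not samples:
        raise ValueError("no latency samples recorded")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, limit_ms=100.0):
    """Return p50/p95/p99 and apply the strict '<' from the pass criteria."""
    p50, p95, p99 = (percentile(latencies_ms, p) for p in (50, 95, 99))
    return {"p50": p50, "p95": p95, "p99": p99, "pass": p95 < limit_ms}
```

With nearest-rank, small sample sets collapse toward the maximum: for 10 samples, p95 is simply the largest value, which is worth keeping in mind for the tests that collect fewer measurements.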

---

### NFT-PERF-02: Tier 2 heuristic latency ≤50ms

**Summary**: Measure the latency of the V1 heuristic endpoint analysis (skeletonization + endpoint + darkness check).
**Traces to**: AC-LATENCY-TIER2
**Metric**: p95 processing latency per ROI (ms)

**Preconditions**:
- Footpath segmentation masks are available as Tier 2 input (mock YOLO masks standing in for Tier 1 output)

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 50 frames with mock YOLO footpath masks | Record Tier 2 processing time from detection log |
| 2 | Compute p50, p95 latency | — |

**Pass criteria**: p95 latency < 50ms (V1 heuristic), < 200ms (V2 CNN)
**Duration**: 10 seconds

---

### NFT-PERF-03: Tier 3 VLM latency ≤5s

**Summary**: Measure VLM inference latency including image encoding, prompt processing, and response generation.
**Traces to**: AC-LATENCY-TIER3
**Metric**: End-to-end VLM analysis time per ROI (ms)

**Preconditions**:
- NanoLLM with VILA1.5-3B loaded (or vlm-stub for Docker-based test)

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Trigger 10 Tier 3 analyses on different ROIs | Record time from VLM request to response via detection log |
| 2 | Compute p50, p95 latency | — |

**Pass criteria**: p95 latency < 5000ms
**Duration**: 60 seconds

---

### NFT-PERF-04: Full pipeline throughput under continuous frame input

**Summary**: Submit frames at 10 FPS for 60 seconds; measure detection throughput and queue depth.
**Traces to**: AC-LATENCY-TIER1, AC-SCAN-ALGORITHM
**Metric**: Frames processed per second, max queue depth

**Preconditions**:
- All tiers active, mock services responding

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 600 frames at 10 FPS (60s) | Count processed frames from detection log |
| 2 | Poll the API status endpoint during the run, if it exposes queue depth | Record max queue depth |

**Pass criteria**: ≥8 FPS sustained processing rate; no frames silently dropped (all either processed or explicitly skipped with quality gate reason)
**Duration**: 75 seconds
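
The pass criteria combine a rate check with frame accounting. The sketch below shows one way to evaluate both. It assumes the detection log can be reduced to a set of processed frame IDs plus a mapping of skipped frame IDs to quality-gate reasons; `account_frames` and its arguments are hypothetical, and the rate is a simple average over the run rather than a windowed measure.

```python
def account_frames(submitted_ids, processed_ids, skipped, duration_s=60.0,
                   min_fps=8.0):
    """Check throughput and look for silently dropped frames.

    skipped maps frame_id -> quality-gate reason; an empty or missing
    reason does not count as an explicit skip.
    """
    submitted = set(submitted_ids)
    processed = set(processed_ids) & submitted
    explicit_skips = {fid for fid, reason in skipped.items() if reason}
    silently_dropped = sorted(submitted - processed - explicit_skips)
    fps = len(processed) / duration_s
    return {
        "fps": fps,
        "silently_dropped": silently_dropped,
        "pass": fps >= min_fps and not silently_dropped,
    }
```

A frame that appears in neither the processed set nor the skip list fails the test even when the average rate is above 8 FPS.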

---

## Resilience Tests

### NFT-RES-01: Semantic process crash and recovery

**Summary**: Kill the semantic detection process; verify watchdog restarts it within 10 seconds and processing resumes.
**Traces to**: AC-SCAN-ALGORITHM (degradation)

**Preconditions**:
- Semantic detection running and processing frames

**Fault injection**:
- Kill semantic process via signal (SIGKILL)

**Steps**:

| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Submit 5 frames successfully | Detections returned |
| 2 | Kill semantic process | Frame processing stops |
| 3 | Wait up to 10 seconds | Watchdog detects crash, restarts process |
| 4 | Submit 5 more frames | Detections returned again |

**Pass criteria**: Recovery within 10 seconds; no data corruption in detection log; frames submitted during downtime are either queued or rejected (not silently dropped)
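
Step 3 amounts to polling until the service responds again or the 10-second budget runs out. A minimal sketch, assuming the harness supplies a `probe` callable (hypothetical) that returns `True` once a frame submission succeeds; the clock and sleep functions are injectable so the logic can be exercised without real delays.

```python
import time


def wait_for_recovery(probe, timeout_s=10.0, interval_s=0.5,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True.

    Returns the elapsed seconds on recovery, or None if the deadline
    passes first. Connection errors raised by probe count as 'not yet'.
    """
    start = clock()
    deadline = start + timeout_s
    while True:
        try:
            healthy = bool(probe())
        except OSError:
            healthy = False
        if healthy:
            return clock() - start
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

The measured recovery time is only as precise as the polling interval, so the interval should be small relative to the 10-second limit.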

---

### NFT-RES-02: VLM load/unload cycle stability

**Summary**: Load and unload VLM 10 times; verify no memory leak and successful inference after each reload.
**Traces to**: AC-RESOURCE-CONSTRAINTS

**Preconditions**:
- VLM process can be loaded and unloaded via API or signal

**Fault injection**:
- Alternating VLM load/unload commands

**Steps**:

| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Load VLM, run 1 inference | Success, record memory |
| 2 | Unload VLM, record memory | Memory decreases |
| 3 | Repeat steps 1-2 for a total of 10 cycles | Each cycle: inference succeeds, memory decreases on unload |
| 4 | Compare memory at cycle 1 vs cycle 10 | Delta < 100MB |

**Pass criteria**: No memory leak (delta < 100MB over 10 cycles); all inferences succeed
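
The comparison in step 4 can be sketched as below. The spec does not say which reading is compared between cycle 1 and cycle 10; this example assumes the post-unload reading, since that is where leaked memory would remain visible. `check_cycles` is a hypothetical helper.

```python
def check_cycles(cycles, leak_limit_mb=100.0):
    """cycles: list of (loaded_mb, unloaded_mb) readings, one pair per
    load/unload cycle, in order.

    Assumption: the leak delta is taken between the post-unload readings
    of the first and last cycle.
    """
    if len(cycles) < 2:
        raise ValueError("need at least two cycles to compare")
    # Cycles (1-based) where unloading did not reduce memory at all.
    not_released = [i + 1 for i, (loaded, unloaded) in enumerate(cycles)
                    if unloaded >= loaded]
    delta = cycles[-1][1] - cycles[0][1]
    return {
        "delta_mb": delta,
        "cycles_without_release": not_released,
        "pass": delta < leak_limit_mb and not not_released,
    }
```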

---

### NFT-RES-03: Gimbal CRC failure handling

**Summary**: Inject corrupted gimbal command responses; verify CRC layer detects corruption and retries.
**Traces to**: AC-CAMERA-CONTROL

**Preconditions**:
- Mock gimbal configured to return corrupted responses for the first 2 attempts and a valid response on the 3rd

**Fault injection**:
- Mock gimbal flips random bits in response CRC

**Steps**:

| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Issue pan command | First 2 responses rejected (bad CRC) |
| 2 | Automatic retry | 3rd attempt succeeds |
| 3 | Read gimbal command log | Log shows 2 CRC failures + 1 success |

**Pass criteria**: Command succeeds after retries; CRC failures logged; no crash
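
The expected behavior can be illustrated with a small self-contained model. Everything here is an assumption for illustration: the gimbal protocol's actual CRC algorithm and framing are not given in this document, so the sketch uses CRC-32 from `zlib` appended as a 4-byte trailer, and the mock flips one fixed bit (rather than random bits) to keep the example deterministic.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a big-endian CRC-32 trailer (stand-in for the real CRC)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify(packet: bytes):
    """Return the payload if the trailer matches, else None."""
    payload, crc = packet[:-4], int.from_bytes(packet[-4:], "big")
    return payload if zlib.crc32(payload) == crc else None


class MockGimbal:
    """Corrupts the CRC of the first `corrupt_first` responses."""

    def __init__(self, corrupt_first=2):
        self.corrupt_remaining = corrupt_first

    def send(self, command: bytes) -> bytes:
        packet = bytearray(frame(b"ACK:" + command))
        if self.corrupt_remaining > 0:
            self.corrupt_remaining -= 1
            packet[-1] ^= 0x01  # flip one bit in the CRC trailer
        return bytes(packet)


def send_with_retry(gimbal, command: bytes, max_attempts=3):
    """Send a command, retrying on CRC failure; returns (payload, log)."""
    log = []
    for attempt in range(1, max_attempts + 1):
        payload = verify(gimbal.send(command))
        if payload is not None:
            log.append((attempt, "ok"))
            return payload, log
        log.append((attempt, "crc_failure"))
    return None, log
```

Running `send_with_retry(MockGimbal(), b"PAN")` yields a log with two `crc_failure` entries followed by one `ok`, which mirrors what step 3 expects to find in the gimbal command log.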

---

## Security Tests

### NFT-SEC-01: No external network access from semantic detection

**Summary**: Verify the semantic detection service makes no outbound network connections outside the Docker network.
**Traces to**: RESTRICT-SOFTWARE (local-only inference)

**Steps**:

| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Run semantic detection pipeline on test frames | Detections produced |
| 2 | Monitor network traffic from semantic-detection container (via tcpdump on e2e-net) | No packets to external IPs |

**Pass criteria**: Zero outbound connections to external networks
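
Once destination addresses have been extracted from the capture, step 2 reduces to a subnet membership check. A minimal sketch using the standard `ipaddress` module; the subnet `172.28.0.0/16` is a placeholder assumption for the e2e-net range and must be replaced with the real one, and treating loopback as internal is also an assumption.

```python
import ipaddress


def external_destinations(dest_ips, allowed_cidr="172.28.0.0/16"):
    """Return the sorted, de-duplicated destinations outside the allowed
    Docker subnet. allowed_cidr is a placeholder for the e2e-net range."""
    allowed = ipaddress.ip_network(allowed_cidr)
    external = set()
    for raw in dest_ips:
        ip = ipaddress.ip_address(raw)
        if ip in allowed or ip.is_loopback:
            continue
        external.add(str(ip))
    return sorted(external)
```

The test passes when the returned list is empty. Note that other private ranges (for example `10.0.0.0/8`) are reported as external here, because the criterion is "outside the Docker network", not "publicly routable".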

---

### NFT-SEC-02: Model files are not accessible via API

**Summary**: Verify TRT engine files and VLM model weights cannot be downloaded through the API.
**Traces to**: RESTRICT-SOFTWARE

**Steps**:

| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Attempt directory traversal via API: GET /api/v1/../models/ | 404 or 400 |
| 2 | Attempt known model path: GET /api/v1/detect?path=/models/yoloe.engine | No model content returned |

**Pass criteria**: Model files inaccessible via any API endpoint
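
One practical caveat for step 1: many HTTP clients and proxies collapse `..` segments before the request leaves the machine, in which case the server never sees the traversal attempt. The sketch below sends the path as written using `http.client` from the standard library. The host, port, and helper names are assumptions for illustration.

```python
import http.client


def probe_raw_path(host, port, raw_path, timeout=5.0):
    """Send GET with the request path exactly as written (no dot-segment
    normalization) and return (status, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", raw_path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def traversal_rejected(status):
    """Step 1 expects the API to answer 404 or 400."""
    return status in (400, 404)
```

A call such as `probe_raw_path("localhost", 8080, "/api/v1/../models/")` (address assumed) should return a status for which `traversal_rejected` is true. For step 2, the returned body can additionally be inspected to confirm it carries no model content.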

---

## Resource Limit Tests

### NFT-RES-LIM-01: Memory stays within 6GB budget [HIL]

**Summary**: Run full pipeline (Tier 1+2+3 + recording + logging) for 30 minutes; verify peak memory stays below 6GB (semantic module allocation).
**Traces to**: AC-RESOURCE-CONSTRAINTS, RESTRICT-HARDWARE
**Metric**: Peak RSS memory of semantic detection + VLM processes

**Preconditions**:
- Jetson Orin Nano Super, 15W mode, active cooling
- All components loaded

**Monitoring**:
- `tegrastats` logging at 1-second intervals: GPU memory, CPU memory, swap

**Duration**: 30 minutes
**Pass criteria**: Peak (semantic + VLM) memory < 6GB; no OOM kills; no swap usage above 100MB
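
The monitoring log can be reduced to peak values with a short parser. This sketch assumes `tegrastats` lines contain `RAM used/totalMB` and `SWAP used/totalMB` fields and that 6GB means 6144MB. Note that `tegrastats` reports system-wide memory, so the RAM figure is an upper bound on the (semantic + VLM) footprint rather than a per-process RSS; per-process readings would need a separate source such as `/proc/<pid>/status`.

```python
import re

RAM_RE = re.compile(r"\bRAM (\d+)/(\d+)MB")
SWAP_RE = re.compile(r"\bSWAP (\d+)/(\d+)MB")


def peak_memory(lines, ram_limit_mb=6 * 1024, swap_limit_mb=100):
    """Scan tegrastats-style lines for peak RAM and swap usage (MB).

    Limits follow the pass criteria: RAM strictly below the budget,
    swap not above 100MB. Lines without the fields are ignored.
    """
    peak_ram = peak_swap = 0
    for line in lines:
        ram = RAM_RE.search(line)
        swap = SWAP_RE.search(line)
        if ram:
            peak_ram = max(peak_ram, int(ram.group(1)))
        if swap:
            peak_swap = max(peak_swap, int(swap.group(1)))
    return {
        "peak_ram_mb": peak_ram,
        "peak_swap_mb": peak_swap,
        "pass": peak_ram < ram_limit_mb and peak_swap <= swap_limit_mb,
    }
```

OOM kills are not visible in this output and need a separate check, for example against the kernel log.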

---

### NFT-RES-LIM-02: Thermal stability under sustained load [HIL]

**Summary**: Run continuous inference for 60 minutes; verify T_junction stays below 75°C with active cooling.
**Traces to**: RESTRICT-HARDWARE
**Metric**: T_junction max, T_junction average

**Preconditions**:
- Jetson Orin Nano Super, 15W mode, active cooling fan running
- Ambient temperature 20-25°C

**Monitoring**:
- Temperature sensors via `tegrastats` at 1-second intervals

**Duration**: 60 minutes
**Pass criteria**: T_junction max < 75°C; no thermal throttling events
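
The two metrics can be extracted from the monitoring log as sketched below. The `tj@<value>C` field label is an assumption about the `tegrastats` output on this platform; sensor names differ between JetPack releases and should be checked against a real log line before relying on this pattern.

```python
import re

# Assumed field format, e.g. "tj@49.5C"; verify against actual output.
TJ_RE = re.compile(r"\btj@(-?\d+(?:\.\d+)?)C", re.IGNORECASE)


def tj_stats(lines, limit_c=75.0):
    """Return max and mean junction temperature from tegrastats-style lines."""
    temps = []
    for line in lines:
        match = TJ_RE.search(line)
        if match:
            temps.append(float(match.group(1)))
    if not temps:
        raise ValueError("no junction temperature readings found")
    t_max = max(temps)
    return {"max_c": t_max, "avg_c": sum(temps) / len(temps),
            "pass": t_max < limit_c}
```

This covers only the temperature limit. Thermal throttling events have to be detected separately, since a run can throttle briefly without the sampled maximum crossing 75°C.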

---

### NFT-RES-LIM-03: NVMe recording endurance [HIL]

**Summary**: Record frames to NVMe at Level 2 rate (30 FPS, 1080p JPEG) for 2 hours; verify no write errors.
**Traces to**: AC-SCAN-ALGORITHM (recording)
**Metric**: Frames written, write errors, NVMe health

**Preconditions**:
- NVMe SSD ≥256GB with ≥120GB free (the expected 2-hour write volume; 30% of a 256GB drive is only ~77GB)

**Monitoring**:
- Write errors via dmesg
- NVMe SMART data before and after

**Duration**: 2 hours
**Pass criteria**: Zero write errors; SMART indicators nominal; storage usage matches expected (~120GB for 2h at 30FPS)
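
The ~120GB figure implies a per-frame size and a sustained write rate that are useful for sanity-checking the result. A small worked calculation, assuming decimal gigabytes (1GB = 10^9 bytes); `recording_budget` is a hypothetical helper.

```python
def recording_budget(fps=30, hours=2, total_gb=120, drive_gb=256):
    """Derive frame count, average frame size, and write rate from the
    expected storage figure. Assumes decimal units (1GB = 1e9 bytes)."""
    seconds = hours * 3600
    frames = fps * seconds
    total_bytes = total_gb * 1e9
    return {
        "frames": frames,
        "kb_per_frame": total_bytes / frames / 1e3,
        "mb_per_s": total_bytes / seconds / 1e6,
        "fraction_of_drive": total_gb / drive_gb,
    }
```

For the defaults this gives 216,000 frames, an average of roughly 556KB per JPEG, and a sustained write rate of about 16.7MB/s. On a 256GB drive, 120GB corresponds to about 47% of total capacity.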

---

### NFT-RES-LIM-04: Cold start time ≤60 seconds [HIL]

**Summary**: Power on the Jetson and measure the time from power-on to the first successful detection.
**Traces to**: RESTRICT-OPERATIONAL
**Metric**: Time from power-on to first detection result (seconds)

**Preconditions**:
- JetPack 6.2 on NVMe, all models pre-exported as TRT engines

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Power on Jetson | Start timer |
| 2 | Poll /api/v1/health every 1s | — |
| 3 | When health returns 200, submit test frame | Record time to first detection |

**Pass criteria**: First detection within 60 seconds of power-on
**Duration**: 90 seconds max