Initial commit 8e2ecf50fd, Oleksandr Bezdieniezhnykh, 2026-03-26 00:20:30 +02:00 (144 files changed, 19781 insertions)
# E2E Test Environment
## Overview
**System under test**: Semantic Detection Service — a Cython + TensorRT module running within the existing FastAPI detections service on Jetson Orin Nano Super. Entry points: FastAPI REST API (image/video input), UART serial port (gimbal commands), Unix socket (VLM IPC).
**Consumer app purpose**: Standalone Python test runner that exercises the semantic detection pipeline through its public interfaces: submitting frames, injecting mock YOLO detections, capturing detection results, and monitoring gimbal command output. No access to internals.
## Docker Environment
### Services
| Service | Image / Build | Purpose | Ports |
|---------|--------------|---------|-------|
| semantic-detection | build: ./Dockerfile.test | Main semantic detection pipeline (Tier 1 + 2 + scan controller + gimbal driver + recorder) | 8080 (API) |
| mock-yolo | build: ./tests/mock_yolo/ | Provides deterministic YOLO detection output for test frames | 8081 (API) |
| mock-gimbal | build: ./tests/mock_gimbal/ | Simulates ViewPro A40 serial interface via TCP socket (replaces UART for testing) | 9090 (TCP) |
| vlm-stub | build: ./tests/vlm_stub/ | Deterministic VLM response stub via Unix socket | — (Unix socket) |
| e2e-consumer | build: ./tests/e2e/ | Black-box test runner (pytest) | — |
### Networks
| Network | Services | Purpose |
|---------|----------|---------|
| e2e-net | all | Isolated test network |
### Volumes
| Volume | Mounted to | Purpose |
|--------|-----------|---------|
| test-frames | semantic-detection:/data/frames, e2e-consumer:/data/frames | Shared test images (semantic01-04.png + synthetic frames) |
| test-output | semantic-detection:/data/output, e2e-consumer:/data/output | Detection logs, recorded frames, gimbal command log |
### docker-compose structure
```yaml
services:
  semantic-detection:
    build:
      context: .
      dockerfile: Dockerfile.test
    environment:
      - ENV=test
      - GIMBAL_HOST=mock-gimbal
      - GIMBAL_PORT=9090
      - VLM_SOCKET=/tmp/vlm.sock
      - YOLO_API=http://mock-yolo:8081
      - RECORD_PATH=/data/output/frames
      - LOG_PATH=/data/output/detections.jsonl
    volumes:
      - test-frames:/data/frames
      - test-output:/data/output
    depends_on:
      - mock-yolo
      - mock-gimbal
      - vlm-stub
  mock-yolo:
    build: ./tests/mock_yolo
  mock-gimbal:
    build: ./tests/mock_gimbal
  vlm-stub:
    build: ./tests/vlm_stub
  e2e-consumer:
    build: ./tests/e2e
    volumes:
      - test-frames:/data/frames
      - test-output:/data/output
    depends_on:
      - semantic-detection
```
## Consumer Application
**Tech stack**: Python 3.11, pytest, requests, struct (for gimbal protocol parsing)
**Entry point**: `pytest tests/e2e/ --junitxml=e2e-results/report.xml`
### Communication with system under test
| Interface | Protocol | Endpoint / Topic | Authentication |
|-----------|----------|-----------------|----------------|
| Frame submission | HTTP POST | http://semantic-detection:8080/api/v1/detect | None (internal network) |
| Detection results | HTTP GET | http://semantic-detection:8080/api/v1/results | None |
| Gimbal command log | File read | /data/output/gimbal_commands.log | None (shared volume) |
| Detection log | File read | /data/output/detections.jsonl | None (shared volume) |
| Recorded frames | File read | /data/output/frames/ | None (shared volume) |
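The submit-then-poll pattern used by most scenarios below can be sketched as follows. The endpoint paths come from the table above; the multipart field name, the helper names, and the injectable `fetch` callable are illustrative, not part of the service contract:

```python
import time
from typing import Callable, Optional

API = "http://semantic-detection:8080/api/v1"  # internal test network, no auth

def poll_until(fetch: Callable[[], list], timeout_s: float = 2.0,
               interval_s: float = 0.05) -> Optional[list]:
    """Poll a result fetcher until it returns a non-empty list or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        results = fetch()
        if results:
            return results
        time.sleep(interval_s)
    return None

def submit_and_wait(session, image_path: str, timeout_s: float = 2.0):
    """POST a frame, then poll GET /results; `session` is a requests.Session."""
    with open(image_path, "rb") as f:
        session.post(f"{API}/detect", files={"image": f}).raise_for_status()
    return poll_until(lambda: session.get(f"{API}/results").json(), timeout_s)
```

Keeping the fetcher injectable lets the polling logic be unit-tested without a live service.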
### What the consumer does NOT have access to
- No direct access to TensorRT engine internals
- No access to YOLOE model weights or inference state
- No access to VLM process memory or internal prompts
- No direct UART/serial access (reads gimbal command log only)
- No access to scan controller state machine internals
## CI/CD Integration
**When to run**: On every PR to `dev` branch; nightly on `dev`
**Pipeline stage**: After unit tests pass, before merge approval
**Gate behavior**: Block merge on any FAIL
**Timeout**: 10 minutes for the full suite (most tests < 1s each; VLM tests up to 30s)
## Reporting
**Format**: JUnit XML + CSV summary
**Columns**: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)
**Output path**: `./e2e-results/report.xml`, `./e2e-results/summary.csv`
## Hardware-in-the-Loop Test Track
Tests requiring actual Jetson Orin Nano Super hardware are marked with `[HIL]` in test IDs. These tests:
- Run on physical Jetson with real TensorRT engines
- Use real ViewPro A40 gimbal (or ViewPro simulator if available)
- Measure actual latency, memory, thermal, power
- Run separately from Docker-based E2E suite
- Triggered manually or on hardware CI runner (if available)
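One way to keep `[HIL]` tests out of the Docker-based suite is a collection hook in `conftest.py`. The `hil` marker name and the `HIL_RUNNER` environment variable here are assumptions, not existing project conventions:

```python
# conftest.py sketch: gate [HIL] tests behind an explicit opt-in.
import os

def hil_enabled(env) -> bool:
    """HIL tests run only when the runner explicitly opts in (hypothetical env var)."""
    return env.get("HIL_RUNNER", "0") == "1"

def pytest_collection_modifyitems(config, items):
    import pytest  # imported lazily so the helper above stays dependency-free
    if hil_enabled(os.environ):
        return
    skip_hil = pytest.mark.skip(reason="requires Jetson hardware; set HIL_RUNNER=1")
    for item in items:
        if "hil" in item.keywords:  # tests tagged with @pytest.mark.hil
            item.add_marker(skip_hil)
```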
@@ -0,0 +1,323 @@
# E2E Functional Tests
## Positive Scenarios
### FT-P-01: Tier 1 detects footpath from aerial image
**Summary**: Submit a winter aerial image containing a visible footpath; verify Tier 1 (YOLOE) returns a detection with class "footpath" and a segmentation mask.
**Traces to**: AC-YOLO-NEW-CLASSES, AC-SEMANTIC-PIPELINE
**Category**: YOLO Object Detection — New Classes
**Preconditions**:
- Semantic detection service is running
- Mock YOLO service returns pre-computed detections for semantic01.png including footpath class
**Input data**: semantic01.png + mock-yolo-detections (footpath detected)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png to /api/v1/detect | 200 OK, processing started |
| 2 | GET /api/v1/results after 200ms | Detection result array containing at least 1 detection with class="footpath", confidence > 0.5 |
| 3 | Verify detection bbox covers the known footpath region in semantic01.png | bbox overlaps with annotated ground truth footpath region (IoU > 0.3) |
**Expected outcome**: At least 1 footpath detection returned with confidence > 0.5
**Max execution time**: 2s
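The IoU check in step 3 can be computed directly from the normalized center-format boxes the service emits (see FT-P-03). A minimal sketch:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (cx, cy, w, h), all normalized 0-1."""
    def to_corners(box):
        cx, cy, w, h = box
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2
    ax1, ay1, ax2, ay2 = to_corners(a)
    bx1, by1, bx2, by2 = to_corners(b)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # overlap width, clamped at 0
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # overlap height, clamped at 0
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0
```

The test then asserts `iou(detected_bbox, ground_truth_bbox) > 0.3` against the annotated footpath region.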
---
### FT-P-02: Tier 2 traces footpath to endpoint and flags concealed position
**Summary**: Given a frame with detected footpath, verify Tier 2 performs path tracing (skeletonization → endpoint detection) and identifies a dark mass at the endpoint as a potential concealed position.
**Traces to**: AC-SEMANTIC-DETECTION, AC-SEMANTIC-PIPELINE
**Category**: Semantic Detection Performance
**Preconditions**:
- Tier 1 has detected a footpath in the input frame
- Mock YOLO provides footpath segmentation mask for semantic01.png
**Input data**: semantic01.png + mock-yolo-detections (footpath with mask)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png to /api/v1/detect | Processing started |
| 2 | Wait for Tier 2 processing (up to 500ms) | — |
| 3 | GET /api/v1/results | Detection result includes tier2_result="concealed_position" with tier2_confidence > 0 |
| 4 | Read detections.jsonl from output volume | Log entry exists with tier=2, class matches "concealed_position" or "branch_pile_endpoint" |
**Expected outcome**: Tier 2 produces at least 1 endpoint detection flagged as potential concealed position
**Max execution time**: 3s
---
### FT-P-03: Detection output format matches existing YOLO output schema
**Summary**: Verify semantic detection output uses the same bounding box format as existing YOLO pipeline (centerX, centerY, width, height, classNum, label, confidence — all normalized).
**Traces to**: AC-INTEGRATION
**Category**: Integration
**Preconditions**:
- At least 1 detection produced from semantic pipeline
**Input data**: semantic03.png + mock-yolo-detections
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic03.png to /api/v1/detect | Processing started |
| 2 | GET /api/v1/results | Detection JSON array |
| 3 | Validate each detection has fields: centerX (0-1), centerY (0-1), width (0-1), height (0-1), classNum (int), label (string), confidence (0-1) | All fields present, all values within valid ranges |
**Expected outcome**: All output detections conform to existing YOLO output schema
**Max execution time**: 2s
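Step 3's field-and-range validation might look like this; the `REQUIRED` mapping mirrors the schema fields listed above:

```python
REQUIRED = {
    "centerX": float, "centerY": float, "width": float, "height": float,
    "classNum": int, "label": str, "confidence": float,
}

def schema_errors(det: dict) -> list:
    """Return a list of schema violations for one detection dict (empty = valid)."""
    errors = []
    for field, typ in REQUIRED.items():
        if field not in det:
            errors.append(f"missing {field}")
        elif not isinstance(det[field], typ):
            errors.append(f"{field} has type {type(det[field]).__name__}")
    # Normalized fields must lie in [0, 1]; only range-check values that parsed as floats.
    for field in ("centerX", "centerY", "width", "height", "confidence"):
        value = det.get(field)
        if isinstance(value, float) and not 0.0 <= value <= 1.0:
            errors.append(f"{field} out of range: {value}")
    return errors
```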
---
### FT-P-04: Tier 3 VLM analysis triggered for ambiguous Tier 2 result
**Summary**: When Tier 2 confidence is below threshold (e.g., 0.3-0.6), verify Tier 3 VLM is invoked for deeper analysis and returns a structured response.
**Traces to**: AC-LATENCY-TIER3, AC-SEMANTIC-PIPELINE
**Category**: Semantic Analysis Pipeline
**Preconditions**:
- VLM stub is running and responds to IPC
- Mock YOLO returns detections with ambiguous endpoint (moderate confidence)
**Input data**: semantic02.png + mock-yolo-detections (footpath with ambiguous endpoint) + vlm-stub-responses
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic02.png to /api/v1/detect | Processing started |
| 2 | Wait for Tier 3 processing (up to 6s) | — |
| 3 | GET /api/v1/results | Detection result includes tier3_used=true |
| 4 | Read detections.jsonl | Log entry with tier=3 and VLM analysis text present |
**Expected outcome**: VLM was invoked, response is recorded in detection log, total latency ≤ 6s
**Max execution time**: 8s
---
### FT-P-05: Frame quality gate rejects blurry frame
**Summary**: Submit a blurred frame; verify the system rejects it via the frame quality gate and does not produce detections from it.
**Traces to**: AC-SCAN-ALGORITHM
**Category**: Scan Algorithm
**Preconditions**:
- Blurry test frames available in test data
**Input data**: blurry-frames (Gaussian blur applied to semantic01.png)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST blurry_semantic01.png to /api/v1/detect | 200 OK |
| 2 | GET /api/v1/results | Empty detection array or response indicating frame rejected (quality below threshold) |
**Expected outcome**: No detections produced from blurry frame; frame quality metric logged
**Max execution time**: 1s
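The system's quality gate is internal, but the consumer can sanity-check its own blurry fixtures with the same kind of metric. A variance-of-Laplacian sketch in pure Python (the gate threshold here is illustrative, not the system's actual value):

```python
def laplacian_variance(img):
    """Sharpness proxy: variance of the 4-neighbour Laplacian. Low values = blurry.
    `img` is a 2-D list of grayscale values."""
    h, w = len(img), len(img[0])
    responses = [
        4 * img[y][x] - img[y - 1][x] - img[y + 1][x] - img[y][x - 1] - img[y][x + 1]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

def passes_quality_gate(img, threshold=100.0):
    """Hypothetical gate: reject frames whose sharpness falls below threshold."""
    return laplacian_variance(img) >= threshold
```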
---
### FT-P-06: Scan controller transitions from Level 1 to Level 2
**Summary**: When Tier 1 detects a POI, verify the scan controller issues zoom-in gimbal commands and transitions to Level 2 state.
**Traces to**: AC-SCAN-L1-TO-L2, AC-CAMERA-ZOOM
**Category**: Scan Algorithm, Camera Control
**Preconditions**:
- Mock gimbal service is running and accepting commands
- Scan controller starts in Level 1 mode
**Input data**: synthetic-video-sequence (simulating Level 1 sweep) + mock-yolo-detections (POI detected mid-sequence)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST first 10 frames (Level 1 sweep, no POI) | Gimbal commands show pan sweep pattern |
| 2 | POST frame 11 with mock YOLO returning a footpath detection | Scan controller queues POI |
| 3 | POST frames 12-15 | Gimbal command log shows zoom-in command issued |
| 4 | Read gimbal command log | Transition from sweep commands to zoom + hold commands within 2s of POI detection |
**Expected outcome**: Gimbal transitions from Level 1 sweep to Level 2 zoom within 2 seconds
**Max execution time**: 5s
---
### FT-P-07: Detection logging writes complete JSON-lines entries
**Summary**: After processing multiple frames, verify the detection log contains properly formatted JSON-lines entries with all required fields.
**Traces to**: AC-INTEGRATION
**Category**: Recording, Logging & Telemetry
**Preconditions**:
- Multiple frames processed with detections
**Input data**: semantic01.png, semantic02.png + mock-yolo-detections
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png, then semantic02.png | Detections produced |
| 2 | Read /data/output/detections.jsonl | File exists, contains ≥1 JSON line |
| 3 | Parse each line as JSON | Valid JSON with fields: ts, frame_id, tier, class, confidence, bbox |
| 4 | Verify timestamps are ISO 8601, bbox values 0-1, confidence 0-1 | All values within valid ranges |
**Expected outcome**: All detection log entries are valid JSON with all required fields
**Max execution time**: 3s
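Steps 3-4 reduce to a per-line validator. A sketch assuming Python 3.11's `datetime.fromisoformat` and the field names listed above:

```python
import json
from datetime import datetime

REQUIRED_FIELDS = ("ts", "frame_id", "tier", "class", "confidence", "bbox")

def validate_log_line(line: str) -> dict:
    """Parse one detections.jsonl line and enforce the fields required by FT-P-07."""
    entry = json.loads(line)
    for field in REQUIRED_FIELDS:
        assert field in entry, f"missing field: {field}"
    datetime.fromisoformat(entry["ts"])  # raises ValueError if not ISO 8601
    assert 0.0 <= entry["confidence"] <= 1.0
    assert all(0.0 <= v <= 1.0 for v in entry["bbox"])
    return entry
```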
---
### FT-P-08: Freshness metadata attached to footpath detections
**Summary**: Verify that footpath detections include freshness metadata (contrast ratio) as "high_contrast" or "low_contrast" tag.
**Traces to**: AC-SEMANTIC-PIPELINE
**Category**: Semantic Analysis Pipeline
**Preconditions**:
- Footpath detected in Tier 1
**Input data**: semantic01.png + mock-yolo-detections (footpath)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png | Detections produced |
| 2 | GET /api/v1/results | Footpath detection includes freshness field |
| 3 | Verify freshness is one of: "high_contrast", "low_contrast" | Valid freshness tag present |
**Expected outcome**: Freshness metadata present on all footpath detections
**Max execution time**: 2s
---
## Negative Scenarios
### FT-N-01: No detections from empty scene
**Summary**: Submit a frame where YOLO returns zero detections; verify semantic pipeline returns empty results without errors.
**Traces to**: AC-SEMANTIC-PIPELINE (negative case)
**Category**: Semantic Analysis Pipeline
**Preconditions**:
- Mock YOLO returns empty detection array
**Input data**: semantic01.png + mock-yolo-empty
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png with mock YOLO returning zero detections | 200 OK |
| 2 | GET /api/v1/results | Empty detection array, no errors |
**Expected outcome**: System returns empty results gracefully
**Max execution time**: 1s
---
### FT-N-02: System handles high-volume false positive YOLO input
**Summary**: Submit a frame where YOLO returns 50+ random false positive bounding boxes; verify system processes without crash and Tier 2 filters most.
**Traces to**: AC-SEMANTIC-DETECTION, RESTRICT-RESOURCE
**Category**: Semantic Detection Performance
**Preconditions**:
- Mock YOLO returns 50 random detections
**Input data**: semantic01.png + mock-yolo-noise (50 random bboxes)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png with noisy YOLO output | 200 OK, processing started |
| 2 | Wait 2s, GET /api/v1/results | Results returned without crash |
| 3 | Verify result count < 50 | Tier 2 filtering reduces the candidate count below the 50 injected boxes |
**Expected outcome**: System handles noisy input without crash; processes within time budget
**Max execution time**: 5s
---
### FT-N-03: Invalid image format rejected
**Summary**: Submit a 0-byte file and a truncated image file; verify the system rejects each with an appropriate error.
**Traces to**: RESTRICT-SOFTWARE
**Category**: Software
**Preconditions**:
- Service is running
**Input data**: 0-byte file, truncated image (first 100 bytes of semantic01.png)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST 0-byte file to /api/v1/detect | 400 Bad Request or skip with warning |
| 2 | POST truncated image | 400 Bad Request or skip with warning |
**Expected outcome**: System rejects invalid input without crash
**Max execution time**: 1s
---
### FT-N-04: Gimbal communication failure triggers graceful degradation
**Summary**: When mock gimbal stops responding, verify system degrades to Level 3 (no gimbal) and continues YOLO-only detection.
**Traces to**: AC-SCAN-ALGORITHM, RESTRICT-HARDWARE
**Category**: Scan Algorithm, Resilience
**Preconditions**:
- Mock gimbal is initially running, then stopped mid-test
**Input data**: semantic01.png + mock-yolo-detections
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST frame, verify gimbal commands are sent | Gimbal commands in log |
| 2 | Stop mock-gimbal service | — |
| 3 | POST next frame | System detects gimbal timeout |
| 4 | POST 3 more frames | System enters degradation Level 3 (no gimbal), continues producing YOLO-only detections |
| 5 | GET /api/v1/results | Detections still returned (from existing YOLO pipeline) |
**Expected outcome**: System degrades gracefully to Level 3, continues detecting without gimbal
**Max execution time**: 15s
---
### FT-N-05: VLM process crash triggers Tier 3 unavailability
**Summary**: When VLM stub crashes, verify Tier 3 is marked unavailable and Tier 1+2 continue operating.
**Traces to**: AC-SEMANTIC-PIPELINE, RESTRICT-SOFTWARE
**Category**: Resilience
**Preconditions**:
- VLM stub initially running, then killed
**Input data**: semantic02.png + mock-yolo-detections (ambiguous endpoint that would trigger VLM)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Kill vlm-stub process | — |
| 2 | POST semantic02.png with ambiguous detection | Processing starts |
| 3 | GET /api/v1/results after 3s | Detection result with tier3_used=false (VLM unavailable), Tier 1+2 results still present |
| 4 | Read detection log | Log entry shows tier3 skipped with reason "vlm_unavailable" |
**Expected outcome**: Tier 1+2 results are returned; Tier 3 is gracefully skipped
**Max execution time**: 5s
# E2E Non-Functional Tests
## Performance Tests
### NFT-PERF-01: Tier 1 inference latency ≤100ms [HIL]
**Summary**: Measure Tier 1 (YOLOE TRT FP16) inference latency on Jetson Orin Nano Super with real TensorRT engine.
**Traces to**: AC-LATENCY-TIER1
**Metric**: p95 inference latency per frame (ms)
**Preconditions**:
- Jetson Orin Nano Super with JetPack 6.2
- YOLOE TRT FP16 engine loaded
- Active cooling enabled, T_junction < 70°C
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 100 frames (semantic01-04.png cycled) with 100ms interval | Record per-frame inference time from API response header |
| 2 | Compute p50, p95, p99 latency | — |
**Pass criteria**: p95 latency < 100ms
**Duration**: 15 seconds
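Step 2's percentile computation, here using the nearest-rank method (the method choice is ours; any consistent definition works against the p95 pass criterion):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]

def latency_report(latencies_ms):
    """p50/p95/p99 summary as required by the measurement steps."""
    return {q: percentile(latencies_ms, q) for q in (50, 95, 99)}
```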
---
### NFT-PERF-02: Tier 2 heuristic latency ≤50ms
**Summary**: Measure V1 heuristic endpoint analysis (skeletonization + endpoint + darkness check) latency.
**Traces to**: AC-LATENCY-TIER2
**Metric**: p95 processing latency per ROI (ms)
**Preconditions**:
- Tier 1 has produced footpath segmentation masks
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 50 frames with mock YOLO footpath masks | Record Tier 2 processing time from detection log |
| 2 | Compute p50, p95 latency | — |
**Pass criteria**: p95 latency < 50ms (V1 heuristic), < 200ms (V2 CNN)
**Duration**: 10 seconds
---
### NFT-PERF-03: Tier 3 VLM latency ≤5s
**Summary**: Measure VLM inference latency including image encoding, prompt processing, and response generation.
**Traces to**: AC-LATENCY-TIER3
**Metric**: End-to-end VLM analysis time per ROI (ms)
**Preconditions**:
- NanoLLM with VILA1.5-3B loaded (or vlm-stub for Docker-based test)
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Trigger 10 Tier 3 analyses on different ROIs | Record time from VLM request to response via detection log |
| 2 | Compute p50, p95 latency | — |
**Pass criteria**: p95 latency < 5000ms
**Duration**: 60 seconds
---
### NFT-PERF-04: Full pipeline throughput under continuous frame input
**Summary**: Submit frames at 10 FPS for 60 seconds; measure detection throughput and queue depth.
**Traces to**: AC-LATENCY-TIER1, AC-SCAN-ALGORITHM
**Metric**: Frames processed per second, max queue depth
**Preconditions**:
- All tiers active, mock services responding
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 600 frames at 10 FPS (60s) | Count processed frames from detection log |
| 2 | Record queue depth if available from API status endpoint | — |
**Pass criteria**: ≥8 FPS sustained processing rate; no frames silently dropped (all either processed or explicitly skipped with quality gate reason)
**Duration**: 75 seconds
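The 10 FPS submission loop should sleep to absolute target timestamps rather than fixed intervals, so per-frame overhead does not accumulate into drift over 600 frames. A sketch (`send` is any callable that submits frame `i`; the helper names are ours):

```python
import time

def frame_schedule(n_frames: int, fps: float, start: float = 0.0):
    """Target send offsets for a fixed-rate loop (e.g. 600 frames at 10 FPS)."""
    return [start + i / fps for i in range(n_frames)]

def run_paced(send, n_frames: int, fps: float) -> int:
    """Call send(i) on schedule; sleeping to absolute timestamps avoids drift."""
    t0 = time.monotonic()
    for i, offset in enumerate(frame_schedule(n_frames, fps)):
        delay = t0 + offset - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        send(i)
    return n_frames
```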
---
## Resilience Tests
### NFT-RES-01: Semantic process crash and recovery
**Summary**: Kill the semantic detection process; verify watchdog restarts it within 10 seconds and processing resumes.
**Traces to**: AC-SCAN-ALGORITHM (degradation)
**Preconditions**:
- Semantic detection running and processing frames
**Fault injection**:
- Kill semantic process via signal (SIGKILL)
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Submit 5 frames successfully | Detections returned |
| 2 | Kill semantic process | Frame processing stops |
| 3 | Wait up to 10 seconds | Watchdog detects crash, restarts process |
| 4 | Submit 5 more frames | Detections returned again |
**Pass criteria**: Recovery within 10 seconds; no data corruption in detection log; frames submitted during downtime are either queued or rejected (not silently dropped)
---
### NFT-RES-02: VLM load/unload cycle stability
**Summary**: Load and unload VLM 10 times; verify no memory leak and successful inference after each reload.
**Traces to**: AC-RESOURCE-CONSTRAINTS
**Preconditions**:
- VLM process manageable via API/signal
**Fault injection**:
- Alternating VLM load/unload commands
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Load VLM, run 1 inference | Success, record memory |
| 2 | Unload VLM, record memory | Memory decreases |
| 3 | Repeat 10 times | — |
| 4 | Compare memory at cycle 1 vs cycle 10 | Delta < 100MB |
**Pass criteria**: No memory leak (delta < 100MB over 10 cycles); all inferences succeed
---
### NFT-RES-03: Gimbal CRC failure handling
**Summary**: Inject corrupted gimbal command responses; verify CRC layer detects corruption and retries.
**Traces to**: AC-CAMERA-CONTROL
**Preconditions**:
- Mock gimbal configured to return corrupted responses for first 2 attempts, valid on 3rd
**Fault injection**:
- Mock gimbal flips random bits in response CRC
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Issue pan command | First 2 responses rejected (bad CRC) |
| 2 | Automatic retry | 3rd attempt succeeds |
| 3 | Read gimbal command log | Log shows 2 CRC failures + 1 success |
**Pass criteria**: Command succeeds after retries; CRC failures logged; no crash
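The consumer's CRC bookkeeping can follow the same shape. CRC-16/CCITT-FALSE is used here as a stand-in; the actual ViewLink polynomial, byte order, and frame layout are assumptions to confirm against the protocol spec:

```python
import struct

def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF) — a stand-in checksum."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021 if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

def frame_ok(frame: bytes) -> bool:
    """Check the trailing big-endian CRC-16 of a response frame (assumed layout)."""
    if len(frame) < 3:
        return False
    payload, (expected,) = frame[:-2], struct.unpack(">H", frame[-2:])
    return crc16_ccitt(payload) == expected

def send_with_retry(transport, command: bytes, retries: int = 3):
    """Resend until a response passes the CRC check, up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        response = transport(command)
        if frame_ok(response):
            return response, attempt
    raise IOError(f"gimbal command failed after {retries} CRC failures")
```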
---
## Security Tests
### NFT-SEC-01: No external network access from semantic detection
**Summary**: Verify the semantic detection service makes no outbound network connections outside the Docker network.
**Traces to**: RESTRICT-SOFTWARE (local-only inference)
**Steps**:
| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Run semantic detection pipeline on test frames | Detections produced |
| 2 | Monitor network traffic from semantic-detection container (via tcpdump on e2e-net) | No packets to external IPs |
**Pass criteria**: Zero outbound connections to external networks
---
### NFT-SEC-02: Model files are not accessible via API
**Summary**: Verify TRT engine files and VLM model weights cannot be downloaded through the API.
**Traces to**: RESTRICT-SOFTWARE
**Steps**:
| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Attempt directory traversal via API: GET /api/v1/../models/ | 404 or 400 |
| 2 | Attempt known model path: GET /api/v1/detect?path=/models/yoloe.engine | No model content returned |
**Pass criteria**: Model files inaccessible via any API endpoint
---
## Resource Limit Tests
### NFT-RES-LIM-01: Memory stays within 6GB budget [HIL]
**Summary**: Run full pipeline (Tier 1+2+3 + recording + logging) for 30 minutes; verify peak memory stays below 6GB (semantic module allocation).
**Traces to**: AC-RESOURCE-CONSTRAINTS, RESTRICT-HARDWARE
**Metric**: Peak RSS memory of semantic detection + VLM processes
**Preconditions**:
- Jetson Orin Nano Super, 15W mode, active cooling
- All components loaded
**Monitoring**:
- `tegrastats` logging at 1-second intervals: GPU memory, CPU memory, swap
**Duration**: 30 minutes
**Pass criteria**: Peak (semantic + VLM) memory < 6GB; no OOM kills; no swap usage above 100MB
---
### NFT-RES-LIM-02: Thermal stability under sustained load [HIL]
**Summary**: Run continuous inference for 60 minutes; verify T_junction stays below 75°C with active cooling.
**Traces to**: RESTRICT-HARDWARE
**Metric**: T_junction max, T_junction average
**Preconditions**:
- Jetson Orin Nano Super, 15W mode, active cooling fan running
- Ambient temperature 20-25°C
**Monitoring**:
- Temperature sensors via `tegrastats` at 1-second intervals
**Duration**: 60 minutes
**Pass criteria**: T_junction max < 75°C; no thermal throttling events
---
### NFT-RES-LIM-03: NVMe recording endurance [HIL]
**Summary**: Record frames to NVMe at Level 2 rate (30 FPS, 1080p JPEG) for 2 hours; verify no write errors.
**Traces to**: AC-SCAN-ALGORITHM (recording)
**Metric**: Frames written, write errors, NVMe health
**Preconditions**:
- NVMe SSD ≥256GB, ≥30% free space
**Monitoring**:
- Write errors via dmesg
- NVMe SMART data before and after
**Duration**: 2 hours
**Pass criteria**: Zero write errors; SMART indicators nominal; storage usage matches expected (~120GB for 2h at 30FPS)
---
### NFT-RES-LIM-04: Cold start time ≤60 seconds [HIL]
**Summary**: Power on Jetson, measure time from boot to first successful detection.
**Traces to**: RESTRICT-OPERATIONAL
**Metric**: Time from power-on to first detection result (seconds)
**Preconditions**:
- JetPack 6.2 on NVMe, all models pre-exported as TRT engines
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Power on Jetson | Start timer |
| 2 | Poll /api/v1/health every 1s | — |
| 3 | When health returns 200, submit test frame | Record time to first detection |
**Pass criteria**: First detection within 60 seconds of power-on
**Duration**: 90 seconds max
# E2E Test Data Management
## Seed Data Sets
| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|----------|-------------|---------------|-----------|---------|
| winter-footpath-images | semantic01-04.png — real aerial images with footpaths and concealed positions (winter) | FT-P-01 to FT-P-08, FT-N-01 to FT-N-05, NFT-PERF-01 to NFT-PERF-03 | Volume mount from test-frames | Persistent, read-only |
| mock-yolo-detections | Pre-computed YOLO detection JSONs for each test image (footpaths, roads, branch piles, entrances, trees) | FT-P-01 to FT-P-08, FT-N-04, FT-N-05 | Loaded by mock-yolo service from fixture files | Persistent, read-only |
| mock-yolo-empty | YOLO detection JSON with zero detections | FT-N-01 | Loaded by mock-yolo service | Persistent, read-only |
| mock-yolo-noise | YOLO detection JSON with high-confidence false positives (random bounding boxes) | FT-N-02 | Loaded by mock-yolo service | Persistent, read-only |
| blurry-frames | 5 synthetically blurred versions of semantic01.png (Gaussian blur, motion blur) | FT-P-05 | Volume mount | Persistent, read-only |
| synthetic-video-sequence | 30 frames panning across semantic01.png to simulate gimbal movement | FT-P-06, FT-P-07 | Volume mount | Persistent, read-only |
| vlm-stub-responses | Deterministic VLM text responses for each test image ROI | FT-P-04 | Loaded by vlm-stub service | Persistent, read-only |
| gimbal-protocol-fixtures | ViewLink protocol command/response byte sequences for known operations | FT-P-06, FT-N-04 | Loaded by mock-gimbal service | Persistent, read-only |
## Data Isolation Strategy
Each test run starts with a clean output directory. The semantic-detection service restarts between test groups (via Docker restart). Input data (images, mock detections) is read-only and shared across tests. Output data (detection logs, recorded frames, gimbal commands) is written to a fresh directory per test run.
## Input Data Mapping
| Input Data File | Source Location | Description | Covers Scenarios |
|-----------------|----------------|-------------|-----------------|
| semantic01.png | `_docs/00_problem/input_data/semantic01.png` | Footpath with arrows, leading to branch pile hideout | FT-P-01, FT-P-02, FT-P-07, FT-P-08, FT-N-01, FT-N-02, FT-N-04 |
| semantic02.png | `_docs/00_problem/input_data/semantic02.png` | Footpath to open space from forest, FPV pilot trail | FT-P-04, FT-P-07, FT-N-05 |
| semantic03.png | `_docs/00_problem/input_data/semantic03.png` | Footpath with squared hideout | FT-P-01, FT-P-03 |
| semantic04.png | `_docs/00_problem/input_data/semantic04.png` | Footpath ending at tree branches | FT-P-01, FT-P-02 |
| data_parameters.md | `_docs/00_problem/input_data/data_parameters.md` | Training data spec (not used in E2E tests directly) | — |
## External Dependency Mocks
| External Service | Mock/Stub | How Provided | Behavior |
|-----------------|-----------|-------------|----------|
| YOLO Detection Pipeline | mock-yolo Docker service | HTTP API returning deterministic JSON detection results per image hash | Returns pre-computed detection arrays matching expected YOLO output format (centerX, centerY, width, height, classNum, label, confidence) |
| ViewPro A40 Gimbal | mock-gimbal Docker service | TCP socket emulating UART serial interface | Accepts ViewLink protocol commands, responds with gimbal feedback (pan/tilt angles, status). Logs all received commands to file. Supports simulated delays (1-2s zoom transition). |
| VLM (NanoLLM/VILA) | vlm-stub Docker service | Unix socket responding to IPC messages | Returns deterministic text analysis per image ROI hash. Simulates ~2s latency. Returns configurable responses for positive/negative/ambiguous cases. |
| GPS-Denied System | Not mocked | Not needed — coordinates are passed as metadata input | System under test accepts coordinates as input parameters, does not compute them |
## Data Validation Rules
| Data Type | Validation | Invalid Examples | Expected System Behavior |
|-----------|-----------|-----------------|------------------------|
| Input frame | JPEG/PNG, 1920x1080, 3-channel RGB | 0-byte file, truncated JPEG, 640x480, grayscale | Reject with error, skip frame, continue processing |
| YOLO detection JSON | Array of objects with required fields (centerX, centerY, width, height, classNum, label, confidence) | Missing fields, confidence > 1.0, negative coordinates | Ignore malformed detections, process valid ones |
| Gimbal command | Valid ViewLink protocol packet with CRC-16 | Truncated packet, invalid CRC, unknown command code | Retry up to 3 times, log error, continue without gimbal |
| VLM IPC message | JSON with image_path and prompt fields | Missing image_path, empty prompt, non-existent file | Return error response, Tier 3 marked as failed for this ROI |
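The first validation row can be pre-checked cheaply from raw file bytes before POSTing: PNG files carry width and height big-endian at bytes 16-23, inside the IHDR chunk. A sketch (JPEG dimension parsing is omitted; full decode checks would still live in the service):

```python
import struct

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"

def validate_frame(data: bytes, size=(1920, 1080)) -> str:
    """Return 'ok' or a rejection reason; mirrors the frame-validation row above."""
    if not data:
        return "reject:empty"
    if data.startswith(PNG_MAGIC):
        if len(data) < 24:
            return "reject:truncated"
        # PNG stores width/height big-endian at bytes 16-23 of the IHDR chunk.
        width, height = struct.unpack(">II", data[16:24])
        return "ok" if (width, height) == size else f"reject:size_{width}x{height}"
    if data.startswith(JPEG_MAGIC):
        return "ok"  # JPEG dimensions need a marker scan; omitted in this sketch
    return "reject:unknown_format"
```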
# E2E Traceability Matrix
## Acceptance Criteria Coverage
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-LATENCY-TIER1 | Tier 1 ≤100ms per frame | NFT-PERF-01, NFT-PERF-04 | Covered |
| AC-LATENCY-TIER2 | Tier 2 ≤200ms per ROI | NFT-PERF-02 | Covered |
| AC-LATENCY-TIER3 | Tier 3 ≤5s per ROI | NFT-PERF-03, FT-P-04 | Covered |
| AC-YOLO-NEW-CLASSES | New YOLO classes P≥80% R≥80% | FT-P-01 | Partially covered — functional flow tested; statistical P/R requires annotated validation set (component-level test) |
| AC-SEMANTIC-DETECTION-R | Concealed position recall ≥60% | FT-P-02, FT-P-03 | Partially covered — functional detection tested; statistical recall requires larger dataset (component-level test) |
| AC-SEMANTIC-DETECTION-P | Concealed position precision ≥20% | FT-P-02, FT-N-02 | Partially covered — same as above |
| AC-FOOTPATH-RECALL | Footpath detection recall ≥70% | FT-P-01 | Partially covered — functional detection tested; statistical recall at component level |
| AC-SCAN-L1 | Level 1 covers route with sweep | FT-P-06 | Covered |
| AC-SCAN-L1-TO-L2 | L1→L2 transition within 2s | FT-P-06 | Covered |
| AC-SCAN-L2-LOCK | L2 maintains camera lock on POI | — | NOT COVERED — requires real gimbal + moving platform; covered in [HIL] test track |
| AC-SCAN-PATH-FOLLOW | Path-following keeps path in center 50% | — | NOT COVERED — requires real camera + gimbal; covered in [HIL] track |
| AC-SCAN-ENDPOINT-HOLD | Endpoint hold for VLM analysis | FT-P-04 | Partially covered — VLM trigger tested; physical hold requires [HIL] |
| AC-SCAN-RETURN | Return to L1 after analysis/timeout | FT-P-06 | Covered (within mock gimbal command sequence) |
| AC-CAMERA-LATENCY | Gimbal command ≤500ms | NFT-RES-03 | Covered (mock; [HIL] for real latency) |
| AC-CAMERA-ZOOM | Zoom M→H within 2s | FT-P-06 | Covered (mock acknowledges zoom; [HIL] for physical timing) |
| AC-CAMERA-PATH-ACCURACY | Footpath stays in center 50% during pan | — | NOT COVERED — requires real gimbal; [HIL] |
| AC-CAMERA-SMOOTH | Smooth gimbal transitions | — | NOT COVERED — requires real gimbal; [HIL] |
| AC-CAMERA-QUEUE | POI queue prioritized by confidence/proximity | FT-P-06 | Partially covered — queue existence tested; priority ordering at component level |
| AC-SEMANTIC-PIPELINE | Consumes YOLO input, traces paths, maintains detection freshness | FT-P-01, FT-P-02, FT-P-08 | Covered |
| AC-RESOURCE-CONSTRAINTS | ≤6GB RAM total | NFT-RES-LIM-01 | Covered [HIL] |
| AC-COEXIST-YOLO | Must not degrade existing YOLO | NFT-PERF-04 | Partially covered — throughput measured; real coexistence at [HIL] |
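One way to keep a matrix like this machine-checkable (a hedged sketch — the decorator name, registry, and test names below are assumptions, not the repo's actual convention) is to have each test register the AC IDs it covers, and derive the AC → test-ID mapping from the registry instead of maintaining it by hand:

```python
# Hypothetical traceability registry: each test function declares the
# acceptance criteria it covers, and the matrix is generated from
# AC_COVERAGE rather than edited manually.
from collections import defaultdict

AC_COVERAGE: dict[str, list[str]] = defaultdict(list)

def covers(*ac_ids: str):
    """Decorator: record that this test covers the given AC IDs."""
    def wrap(fn):
        for ac in ac_ids:
            AC_COVERAGE[ac].append(fn.__name__)
        return fn
    return wrap

@covers("AC-LATENCY-TIER1")
def test_nft_perf_01(): ...

@covers("AC-LATENCY-TIER1", "AC-COEXIST-YOLO")
def test_nft_perf_04(): ...
```

The same idea maps onto pytest custom markers if the suite already uses them; either way, uncovered ACs become a simple set difference against the registry keys.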
## Restrictions Coverage
| Restriction ID | Restriction | Test IDs | Coverage |
|---------------|-------------|----------|----------|
| RESTRICT-HW-JETSON | Jetson Orin Nano Super, 67 TOPS, 8GB | NFT-RES-LIM-01, NFT-RES-LIM-02 | Covered [HIL] |
| RESTRICT-HW-RAM | ~6GB available for semantic + VLM | NFT-RES-LIM-01 | Covered [HIL] |
| RESTRICT-CAM-VIEWPRO | ViewPro A40 1080p 40x zoom | FT-P-06, NFT-RES-03 | Covered (mock) |
| RESTRICT-CAM-ZOOM-TIME | Zoom transition 1-2s physical | FT-P-06 | Covered (mock with simulated delay) |
| RESTRICT-OP-ALTITUDE | 600-1000m altitude | — | NOT COVERED — operational parameter, not testable at E2E; affects GSD calculation tested at component level |
| RESTRICT-OP-SEASONS | All seasons, phased starting winter | FT-P-01 to FT-P-08 (winter images) | Partially covered — winter only; other seasons deferred to Phase 4 |
| RESTRICT-SW-CYTHON-TRT | Extend Cython + TRT codebase | — | NOT COVERED — architectural constraint verified by code review, not E2E test |
| RESTRICT-SW-TRT | TensorRT inference engine | NFT-PERF-01 | Covered [HIL] |
| RESTRICT-SW-VLM-LOCAL | VLM runs locally, no cloud | NFT-SEC-01 | Covered |
| RESTRICT-SW-VLM-SEPARATE | VLM as separate process with IPC | FT-P-04, FT-N-05 | Covered |
| RESTRICT-SW-SEQUENTIAL-GPU | YOLO and VLM scheduled sequentially | NFT-PERF-04, NFT-RES-LIM-01 | Covered (memory monitoring shows no concurrent GPU allocation) |
| RESTRICT-INT-FASTAPI | Existing FastAPI + Cython + Docker | FT-P-03 | Covered (output format) |
| RESTRICT-INT-YOLO-OUTPUT | Consume YOLO bounding box output | FT-P-01, FT-P-02 | Covered |
| RESTRICT-INT-OUTPUT-FORMAT | Output same bbox format | FT-P-03 | Covered |
| RESTRICT-SCOPE-ANNOTATION | Annotation tooling out of scope | — | N/A |
| RESTRICT-SCOPE-GPS | GPS-denied out of scope | — | N/A |
## Coverage Summary
| Category | Total Items | Covered | Partially Covered | Not Covered | N/A | Coverage % |
|----------|-------------|---------|-------------------|-------------|-----|-----------|
| Acceptance Criteria | 21 | 10 | 7 | 4 | 0 | 64% |
| Restrictions | 16 | 11 | 1 | 2 | 2 | 82% |
| **Total** | **37** | **21** | **8** | **6** | **2** | **71%** |

Coverage % = (Covered + 0.5 × Partially Covered) / (Total − N/A). Items marked "Covered [HIL]" in the detail tables are counted as covered, since they are exercised in the HIL test track.
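The coverage percentages follow the weighting stated in the table notes (partial counts as 0.5, N/A items excluded from the denominator). A minimal helper makes the arithmetic explicit (a sketch; the function name is an assumption):

```python
def coverage_pct(covered: int, partial: int, not_covered: int,
                 n_a: int = 0) -> float:
    """Weighted coverage: partial counts as 0.5; N/A items are
    excluded from the denominator entirely."""
    denom = covered + partial + not_covered  # total minus N/A
    return round(100 * (covered + 0.5 * partial) / denom, 1)
```

For example, 10 covered, 7 partial, and 4 not covered yields (10 + 3.5) / 21 ≈ 64%.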
## Uncovered Items Analysis
| Item | Reason Not Covered | Risk | Mitigation |
|------|-------------------|------|-----------|
| AC-SCAN-L2-LOCK | Requires real gimbal + moving UAV platform | Camera drifts off target during flight | [HIL] test with real hardware; PID tuning on bench first |
| AC-SCAN-PATH-FOLLOW | Requires real gimbal + camera | Path leaves frame during pan | [HIL] test; component-level PID unit tests with simulated feedback |
| AC-CAMERA-PATH-ACCURACY | Requires real gimbal | Path not centered | [HIL] test |
| AC-CAMERA-SMOOTH | Requires real gimbal | Jerky movement blurs frames | [HIL] test; PID tuning |
| RESTRICT-OP-ALTITUDE | Operational parameter, not testable | GSD calculation wrong | Component-level GSD unit test with known altitude |
| RESTRICT-SW-CYTHON-TRT | Architectural constraint | Wrong tech stack used | Code review gate in PR process |
| RESTRICT-OP-SEASONS (non-winter) | Only winter images available now | System fails on summer/spring terrain | Phase 4 seasonal expansion; deferred by design |
| RESTRICT-HW-JETSON (real perf) | Requires physical hardware | Docker perf doesn't match Jetson | [HIL] test track runs on real Jetson |