mirror of
https://github.com/azaion/detections-semantic.git
synced 2026-04-22 18:36:41 +00:00
Initial commit
Made-with: Cursor
@@ -0,0 +1,117 @@
|
||||
# E2E Test Environment
|
||||
|
||||
## Overview
|
||||
|
||||
**System under test**: Semantic Detection Service — a Cython + TensorRT module running within the existing FastAPI detections service on Jetson Orin Nano Super. Entry points: FastAPI REST API (image/video input), UART serial port (gimbal commands), Unix socket (VLM IPC).
|
||||
|
||||
**Consumer app purpose**: Standalone Python test runner that exercises the semantic detection pipeline through its public interfaces: submitting frames, injecting mock YOLO detections, capturing detection results, and monitoring gimbal command output. No access to internals.
|
||||
|
||||
## Docker Environment
|
||||
|
||||
### Services
|
||||
|
||||
| Service | Image / Build | Purpose | Ports |
|---------|--------------|---------|-------|
| semantic-detection | build: ./Dockerfile.test | Main semantic detection pipeline (Tier 1 + 2 + scan controller + gimbal driver + recorder) | 8080 (API) |
| mock-yolo | build: ./tests/mock_yolo/ | Provides deterministic YOLO detection output for test frames | 8081 (API) |
| mock-gimbal | build: ./tests/mock_gimbal/ | Simulates ViewPro A40 serial interface via TCP socket (replaces UART for testing) | 9090 (TCP) |
| vlm-stub | build: ./tests/vlm_stub/ | Deterministic VLM response stub via Unix socket | — (Unix socket) |
| e2e-consumer | build: ./tests/e2e/ | Black-box test runner (pytest) | — |
|
||||
|
||||
### Networks
|
||||
|
||||
| Network | Services | Purpose |
|---------|----------|---------|
| e2e-net | all | Isolated test network |
|
||||
|
||||
### Volumes
|
||||
|
||||
| Volume | Mounted to | Purpose |
|--------|-----------|---------|
| test-frames | semantic-detection:/data/frames, e2e-consumer:/data/frames | Shared test images (semantic01-04.png + synthetic frames) |
| test-output | semantic-detection:/data/output, e2e-consumer:/data/output | Detection logs, recorded frames, gimbal command log |
|
||||
|
||||
### docker-compose structure
|
||||
|
||||
```yaml
services:
  semantic-detection:
    build: .
    environment:
      - ENV=test
      - GIMBAL_HOST=mock-gimbal
      - GIMBAL_PORT=9090
      - VLM_SOCKET=/tmp/vlm.sock
      - YOLO_API=http://mock-yolo:8081
      - RECORD_PATH=/data/output/frames
      - LOG_PATH=/data/output/detections.jsonl
    volumes:
      - test-frames:/data/frames
      - test-output:/data/output
    depends_on:
      - mock-yolo
      - mock-gimbal
      - vlm-stub

  mock-yolo:
    build: ./tests/mock_yolo

  mock-gimbal:
    build: ./tests/mock_gimbal

  vlm-stub:
    build: ./tests/vlm_stub

  e2e-consumer:
    build: ./tests/e2e
    volumes:
      - test-frames:/data/frames
      - test-output:/data/output
    depends_on:
      - semantic-detection

# Named volumes used by the services must be declared at the top level.
volumes:
  test-frames:
  test-output:
```
|
||||
|
||||
## Consumer Application
|
||||
|
||||
**Tech stack**: Python 3.11, pytest, requests, struct (for gimbal protocol parsing)
|
||||
**Entry point**: `pytest tests/e2e/ --junitxml=e2e-results/report.xml`
|
||||
|
||||
### Communication with system under test
|
||||
|
||||
| Interface | Protocol | Endpoint / Topic | Authentication |
|-----------|----------|-----------------|----------------|
| Frame submission | HTTP POST | http://semantic-detection:8080/api/v1/detect | None (internal network) |
| Detection results | HTTP GET | http://semantic-detection:8080/api/v1/results | None |
| Gimbal command log | File read | /data/output/gimbal_commands.log | None (shared volume) |
| Detection log | File read | /data/output/detections.jsonl | None (shared volume) |
| Recorded frames | File read | /data/output/frames/ | None (shared volume) |
|
||||
|
||||
### What the consumer does NOT have access to
|
||||
|
||||
- No direct access to TensorRT engine internals
- No access to YOLOE model weights or inference state
- No access to VLM process memory or internal prompts
- No direct UART/serial access (reads gimbal command log only)
- No access to scan controller state machine internals
|
||||
|
||||
## CI/CD Integration
|
||||
|
||||
**When to run**: On every PR to `dev` branch; nightly on `dev`
|
||||
**Pipeline stage**: After unit tests pass, before merge approval
|
||||
**Gate behavior**: Block merge on any FAIL
|
||||
**Timeout**: 10 minutes total suite (most tests < 1s each; VLM tests up to 30s)
|
||||
|
||||
## Reporting
|
||||
|
||||
**Format**: JUnit XML + CSV summary
|
||||
**Columns**: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)
|
||||
**Output path**: `./e2e-results/report.xml`, `./e2e-results/summary.csv`
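
The CSV summary can be derived from the JUnit XML that pytest already writes. The sketch below is one possible conversion, not the project's actual reporter: the function names are hypothetical, the Test ID is assumed to be `classname::name`, and both `<failure>` and `<error>` are treated as FAIL.

```python
import csv
import io
import xml.etree.ElementTree as ET

COLUMNS = ["Test ID", "Test Name", "Execution Time (ms)", "Result", "Error Message"]


def junit_to_rows(xml_text):
    """Turn JUnit XML text into one summary row per <testcase>."""
    rows = []
    root = ET.fromstring(xml_text)
    for case in root.iter("testcase"):
        failure = case.find("failure")
        if failure is None:
            failure = case.find("error")
        if failure is not None:
            result, message = "FAIL", failure.get("message", "")
        elif case.find("skipped") is not None:
            result, message = "SKIP", ""
        else:
            result, message = "PASS", ""
        rows.append({
            # Assumption: classname::name is good enough as a test ID.
            "Test ID": f"{case.get('classname', '')}::{case.get('name', '')}",
            "Test Name": case.get("name", ""),
            # JUnit reports seconds; the summary wants milliseconds.
            "Execution Time (ms)": round(float(case.get("time", "0")) * 1000),
            "Result": result,
            "Error Message": message,
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows with the column order listed above."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Writing `rows_to_csv(junit_to_rows(open("e2e-results/report.xml").read()))` to `e2e-results/summary.csv` would produce both artifacts from a single pytest run.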
|
||||
|
||||
## Hardware-in-the-Loop Test Track
|
||||
|
||||
Tests requiring actual Jetson Orin Nano Super hardware are marked with `[HIL]` in test IDs. These tests:

- Run on a physical Jetson with real TensorRT engines
- Use a real ViewPro A40 gimbal (or a ViewPro simulator if available)
- Measure actual latency, memory, thermal, and power figures
- Run separately from the Docker-based E2E suite
- Are triggered manually or on a hardware CI runner (if available)
|
||||
@@ -0,0 +1,323 @@
|
||||
# E2E Functional Tests
|
||||
|
||||
## Positive Scenarios
|
||||
|
||||
### FT-P-01: Tier 1 detects footpath from aerial image
|
||||
|
||||
**Summary**: Submit a winter aerial image containing a visible footpath; verify Tier 1 (YOLOE) returns a detection with class "footpath" and a segmentation mask.
|
||||
**Traces to**: AC-YOLO-NEW-CLASSES, AC-SEMANTIC-PIPELINE
|
||||
**Category**: YOLO Object Detection — New Classes
|
||||
|
||||
**Preconditions**:
|
||||
- Semantic detection service is running
|
||||
- Mock YOLO service returns pre-computed detections for semantic01.png including footpath class
|
||||
|
||||
**Input data**: semantic01.png + mock-yolo-detections (footpath detected)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png to /api/v1/detect | 200 OK, processing started |
| 2 | GET /api/v1/results after 200ms | Detection result array containing at least 1 detection with class="footpath", confidence > 0.5 |
| 3 | Verify detection bbox covers the known footpath region in semantic01.png | bbox overlaps with annotated ground truth footpath region (IoU > 0.3) |
|
||||
|
||||
**Expected outcome**: At least 1 footpath detection returned with confidence > 0.5
|
||||
**Max execution time**: 2s
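
Step 3 relies on an intersection-over-union check. A minimal sketch follows, assuming both the detection and the ground-truth annotation are given as normalized `(centerX, centerY, width, height)` tuples; the function names are illustrative and not part of the service.

```python
def to_corners(box):
    """Convert (centerX, centerY, width, height) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(box_a, box_b):
    """Intersection over union of two center-format boxes."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    # Overlap extent per axis, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0
```

The test would then assert `iou(detected_bbox, ground_truth_bbox) > 0.3`.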
|
||||
|
||||
---
|
||||
|
||||
### FT-P-02: Tier 2 traces footpath to endpoint and flags concealed position
|
||||
|
||||
**Summary**: Given a frame with detected footpath, verify Tier 2 performs path tracing (skeletonization → endpoint detection) and identifies a dark mass at the endpoint as a potential concealed position.
|
||||
**Traces to**: AC-SEMANTIC-DETECTION, AC-SEMANTIC-PIPELINE
|
||||
**Category**: Semantic Detection Performance
|
||||
|
||||
**Preconditions**:
|
||||
- Tier 1 has detected a footpath in the input frame
|
||||
- Mock YOLO provides footpath segmentation mask for semantic01.png
|
||||
|
||||
**Input data**: semantic01.png + mock-yolo-detections (footpath with mask)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png to /api/v1/detect | Processing started |
| 2 | Wait for Tier 2 processing (up to 500ms) | — |
| 3 | GET /api/v1/results | Detection result includes tier2_result="concealed_position" with tier2_confidence > 0 |
| 4 | Read detections.jsonl from output volume | Log entry exists with tier=2, class matches "concealed_position" or "branch_pile_endpoint" |
|
||||
|
||||
**Expected outcome**: Tier 2 produces at least 1 endpoint detection flagged as potential concealed position
|
||||
**Max execution time**: 3s
|
||||
|
||||
---
|
||||
|
||||
### FT-P-03: Detection output format matches existing YOLO output schema
|
||||
|
||||
**Summary**: Verify semantic detection output uses the same bounding box format as existing YOLO pipeline (centerX, centerY, width, height, classNum, label, confidence — all normalized).
|
||||
**Traces to**: AC-INTEGRATION
|
||||
**Category**: Integration
|
||||
|
||||
**Preconditions**:
|
||||
- At least 1 detection produced from semantic pipeline
|
||||
|
||||
**Input data**: semantic03.png + mock-yolo-detections
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic03.png to /api/v1/detect | Processing started |
| 2 | GET /api/v1/results | Detection JSON array |
| 3 | Validate each detection has fields: centerX (0-1), centerY (0-1), width (0-1), height (0-1), classNum (int), label (string), confidence (0-1) | All fields present, all values within valid ranges |
|
||||
|
||||
**Expected outcome**: All output detections conform to existing YOLO output schema
|
||||
**Max execution time**: 2s
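
The field check in step 3 could look like the sketch below. The field names and ranges come from the table above; the helper name `schema_problems` is hypothetical, and booleans are rejected explicitly because Python treats `bool` as a subclass of `int`.

```python
UNIT_FIELDS = ("centerX", "centerY", "width", "height", "confidence")


def schema_problems(det):
    """Return a list of schema violations for one detection dict."""
    problems = []
    for name in UNIT_FIELDS:
        value = det.get(name)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: missing or not a number")
        elif not 0.0 <= value <= 1.0:
            problems.append(f"{name}: {value} outside 0-1")
    class_num = det.get("classNum")
    if isinstance(class_num, bool) or not isinstance(class_num, int):
        problems.append("classNum: missing or not an int")
    if not isinstance(det.get("label"), str):
        problems.append("label: missing or not a string")
    return problems
```

A test can assert `schema_problems(d) == []` for every detection in the results array, which gives a readable failure message when a field is wrong.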
|
||||
|
||||
---
|
||||
|
||||
### FT-P-04: Tier 3 VLM analysis triggered for ambiguous Tier 2 result
|
||||
|
||||
**Summary**: When Tier 2 confidence falls in the ambiguous range (e.g., 0.3-0.6), verify Tier 3 VLM is invoked for deeper analysis and returns a structured response.
|
||||
**Traces to**: AC-LATENCY-TIER3, AC-SEMANTIC-PIPELINE
|
||||
**Category**: Semantic Analysis Pipeline
|
||||
|
||||
**Preconditions**:
|
||||
- VLM stub is running and responds to IPC
|
||||
- Mock YOLO returns detections with ambiguous endpoint (moderate confidence)
|
||||
|
||||
**Input data**: semantic02.png + mock-yolo-detections (footpath with ambiguous endpoint) + vlm-stub-responses
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic02.png to /api/v1/detect | Processing started |
| 2 | Wait for Tier 3 processing (up to 6s) | — |
| 3 | GET /api/v1/results | Detection result includes tier3_used=true |
| 4 | Read detections.jsonl | Log entry with tier=3 and VLM analysis text present |
|
||||
|
||||
**Expected outcome**: VLM was invoked, response is recorded in detection log, total latency ≤ 6s
|
||||
**Max execution time**: 8s
|
||||
|
||||
---
|
||||
|
||||
### FT-P-05: Frame quality gate rejects blurry frame
|
||||
|
||||
**Summary**: Submit a blurred frame; verify the system rejects it via the frame quality gate and does not produce detections from it.
|
||||
**Traces to**: AC-SCAN-ALGORITHM
|
||||
**Category**: Scan Algorithm
|
||||
|
||||
**Preconditions**:
|
||||
- Blurry test frames available in test data
|
||||
|
||||
**Input data**: blurry-frames (Gaussian blur applied to semantic01.png)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST blurry_semantic01.png to /api/v1/detect | 200 OK |
| 2 | GET /api/v1/results | Empty detection array or response indicating frame rejected (quality below threshold) |
|
||||
|
||||
**Expected outcome**: No detections produced from blurry frame; frame quality metric logged
|
||||
**Max execution time**: 1s
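
This document does not say which metric the frame quality gate uses. As an illustration only, a common sharpness proxy is the variance of the Laplacian: blurred frames have weak edges and therefore a low variance. The sketch below assumes a grayscale frame given as a list of rows of pixel values; it is a stand-in, not the service's implementation.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels."""
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)
```

A gate built on this would reject frames whose score falls below a tuned threshold; the threshold itself would have to be calibrated on the blurry-frames data set.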
|
||||
|
||||
---
|
||||
|
||||
### FT-P-06: Scan controller transitions from Level 1 to Level 2
|
||||
|
||||
**Summary**: When Tier 1 detects a POI, verify the scan controller issues zoom-in gimbal commands and transitions to Level 2 state.
|
||||
**Traces to**: AC-SCAN-L1-TO-L2, AC-CAMERA-ZOOM
|
||||
**Category**: Scan Algorithm, Camera Control
|
||||
|
||||
**Preconditions**:
|
||||
- Mock gimbal service is running and accepting commands
|
||||
- Scan controller starts in Level 1 mode
|
||||
|
||||
**Input data**: synthetic-video-sequence (simulating Level 1 sweep) + mock-yolo-detections (POI detected mid-sequence)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST first 10 frames (Level 1 sweep, no POI) | Gimbal commands show pan sweep pattern |
| 2 | POST frame 11 with mock YOLO returning a footpath detection | Scan controller queues POI |
| 3 | POST frames 12-15 | Gimbal command log shows zoom-in command issued |
| 4 | Read gimbal command log | Transition from sweep commands to zoom + hold commands within 2s of POI detection |
|
||||
|
||||
**Expected outcome**: Gimbal transitions from Level 1 sweep to Level 2 zoom within 2 seconds
|
||||
**Max execution time**: 5s
|
||||
|
||||
---
|
||||
|
||||
### FT-P-07: Detection logging writes complete JSON-lines entries
|
||||
|
||||
**Summary**: After processing multiple frames, verify the detection log contains properly formatted JSON-lines entries with all required fields.
|
||||
**Traces to**: AC-INTEGRATION
|
||||
**Category**: Recording, Logging & Telemetry
|
||||
|
||||
**Preconditions**:
|
||||
- Multiple frames processed with detections
|
||||
|
||||
**Input data**: semantic01.png, semantic02.png + mock-yolo-detections
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png, then semantic02.png | Detections produced |
| 2 | Read /data/output/detections.jsonl | File exists, contains ≥1 JSON line |
| 3 | Parse each line as JSON | Valid JSON with fields: ts, frame_id, tier, class, confidence, bbox |
| 4 | Verify timestamps are ISO 8601, bbox values 0-1, confidence 0-1 | All values within valid ranges |
|
||||
|
||||
**Expected outcome**: All detection log entries are valid JSON with all required fields
|
||||
**Max execution time**: 3s
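
Steps 3 and 4 can be sketched as a per-line validator. The required field names come from the table above. Two points are assumptions: `bbox` is taken to be a list of four normalized numbers, and `datetime.fromisoformat` is used as the ISO 8601 check even though it accepts only a subset of the standard on older Python versions.

```python
import json
from datetime import datetime

REQUIRED = ("ts", "frame_id", "tier", "class", "confidence", "bbox")


def log_line_problems(line):
    """Return a list of problems found in one detections.jsonl line."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(entry, dict):
        return ["entry is not a JSON object"]
    problems = [f"missing field: {key}" for key in REQUIRED if key not in entry]
    if problems:
        return problems
    try:
        # Older Pythons reject a trailing "Z", so normalize it first.
        datetime.fromisoformat(str(entry["ts"]).replace("Z", "+00:00"))
    except ValueError:
        problems.append("ts is not ISO 8601")
    conf = entry["confidence"]
    if not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
        problems.append("confidence outside 0-1")
    bbox = entry["bbox"]
    # Assumption: bbox is a list of four normalized numbers.
    if not (isinstance(bbox, list) and len(bbox) == 4
            and all(isinstance(v, (int, float)) and 0 <= v <= 1 for v in bbox)):
        problems.append("bbox must be four values in 0-1")
    return problems
```

Running this over every line of `/data/output/detections.jsonl` and asserting that each result is empty covers the expected outcome below.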
|
||||
|
||||
---
|
||||
|
||||
### FT-P-08: Freshness metadata attached to footpath detections
|
||||
|
||||
**Summary**: Verify that footpath detections include freshness metadata (contrast ratio) as "high_contrast" or "low_contrast" tag.
|
||||
**Traces to**: AC-SEMANTIC-PIPELINE
|
||||
**Category**: Semantic Analysis Pipeline
|
||||
|
||||
**Preconditions**:
|
||||
- Footpath detected in Tier 1
|
||||
|
||||
**Input data**: semantic01.png + mock-yolo-detections (footpath)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png | Detections produced |
| 2 | GET /api/v1/results | Footpath detection includes freshness field |
| 3 | Verify freshness is one of: "high_contrast", "low_contrast" | Valid freshness tag present |
|
||||
|
||||
**Expected outcome**: Freshness metadata present on all footpath detections
|
||||
**Max execution time**: 2s
|
||||
|
||||
---
|
||||
|
||||
## Negative Scenarios
|
||||
|
||||
### FT-N-01: No detections from empty scene
|
||||
|
||||
**Summary**: Submit a frame where YOLO returns zero detections; verify semantic pipeline returns empty results without errors.
|
||||
**Traces to**: AC-SEMANTIC-PIPELINE (negative case)
|
||||
**Category**: Semantic Analysis Pipeline
|
||||
|
||||
**Preconditions**:
|
||||
- Mock YOLO returns empty detection array
|
||||
|
||||
**Input data**: semantic01.png + mock-yolo-empty
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png with mock YOLO returning zero detections | 200 OK |
| 2 | GET /api/v1/results | Empty detection array, no errors |
|
||||
|
||||
**Expected outcome**: System returns empty results gracefully
|
||||
**Max execution time**: 1s
|
||||
|
||||
---
|
||||
|
||||
### FT-N-02: System handles high-volume false positive YOLO input
|
||||
|
||||
**Summary**: Submit a frame where YOLO returns 50+ random false positive bounding boxes; verify system processes without crash and Tier 2 filters most.
|
||||
**Traces to**: AC-SEMANTIC-DETECTION, RESTRICT-RESOURCE
|
||||
**Category**: Semantic Detection Performance
|
||||
|
||||
**Preconditions**:
|
||||
- Mock YOLO returns 50 random detections
|
||||
|
||||
**Input data**: semantic01.png + mock-yolo-noise (50 random bboxes)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST semantic01.png with noisy YOLO output | 200 OK, processing started |
| 2 | Wait 2s, GET /api/v1/results | Results returned without crash |
| 3 | Verify result count ≤ 50 | Tier 2 filtering reduces candidate count |
|
||||
|
||||
**Expected outcome**: System handles noisy input without crash; processes within time budget
|
||||
**Max execution time**: 5s
|
||||
|
||||
---
|
||||
|
||||
### FT-N-03: Invalid image format rejected
|
||||
|
||||
**Summary**: Submit a 0-byte file and a truncated image file; verify the system rejects both with an appropriate error.
|
||||
**Traces to**: RESTRICT-SOFTWARE
|
||||
**Category**: Software
|
||||
|
||||
**Preconditions**:
|
||||
- Service is running
|
||||
|
||||
**Input data**: 0-byte file, truncated PNG (first 100 bytes of semantic01.png)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST 0-byte file to /api/v1/detect | 400 Bad Request or skip with warning |
| 2 | POST truncated image file to /api/v1/detect | 400 Bad Request or skip with warning |
|
||||
|
||||
**Expected outcome**: System rejects invalid input without crash
|
||||
**Max execution time**: 1s
|
||||
|
||||
---
|
||||
|
||||
### FT-N-04: Gimbal communication failure triggers graceful degradation
|
||||
|
||||
**Summary**: When mock gimbal stops responding, verify system degrades to Level 3 (no gimbal) and continues YOLO-only detection.
|
||||
**Traces to**: AC-SCAN-ALGORITHM, RESTRICT-HARDWARE
|
||||
**Category**: Scan Algorithm, Resilience
|
||||
|
||||
**Preconditions**:
|
||||
- Mock gimbal is initially running, then stopped mid-test
|
||||
|
||||
**Input data**: semantic01.png + mock-yolo-detections
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST frame, verify gimbal commands are sent | Gimbal commands in log |
| 2 | Stop mock-gimbal service | — |
| 3 | POST next frame | System detects gimbal timeout |
| 4 | POST 3 more frames | System enters degradation Level 3 (no gimbal), continues producing YOLO-only detections |
| 5 | GET /api/v1/results | Detections still returned (from existing YOLO pipeline) |
|
||||
|
||||
**Expected outcome**: System degrades gracefully to Level 3, continues detecting without gimbal
|
||||
**Max execution time**: 15s
|
||||
|
||||
---
|
||||
|
||||
### FT-N-05: VLM process crash triggers Tier 3 unavailability
|
||||
|
||||
**Summary**: When VLM stub crashes, verify Tier 3 is marked unavailable and Tier 1+2 continue operating.
|
||||
**Traces to**: AC-SEMANTIC-PIPELINE, RESTRICT-SOFTWARE
|
||||
**Category**: Resilience
|
||||
|
||||
**Preconditions**:
|
||||
- VLM stub initially running, then killed
|
||||
|
||||
**Input data**: semantic02.png + mock-yolo-detections (ambiguous endpoint that would trigger VLM)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Kill vlm-stub process | — |
| 2 | POST semantic02.png with ambiguous detection | Processing starts |
| 3 | GET /api/v1/results after 3s | Detection result with tier3_used=false (VLM unavailable), Tier 1+2 results still present |
| 4 | Read detection log | Log entry shows tier3 skipped with reason "vlm_unavailable" |
|
||||
|
||||
**Expected outcome**: Tier 1+2 results are returned; Tier 3 is gracefully skipped
|
||||
**Max execution time**: 5s
|
||||
@@ -0,0 +1,272 @@
|
||||
# E2E Non-Functional Tests
|
||||
|
||||
## Performance Tests
|
||||
|
||||
### NFT-PERF-01: Tier 1 inference latency ≤100ms [HIL]
|
||||
|
||||
**Summary**: Measure Tier 1 (YOLOE TRT FP16) inference latency on Jetson Orin Nano Super with real TensorRT engine.
|
||||
**Traces to**: AC-LATENCY-TIER1
|
||||
**Metric**: p95 inference latency per frame (ms)
|
||||
|
||||
**Preconditions**:
|
||||
- Jetson Orin Nano Super with JetPack 6.2
|
||||
- YOLOE TRT FP16 engine loaded
|
||||
- Active cooling enabled, T_junction < 70°C
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 100 frames (semantic01-04.png cycled) with 100ms interval | Record per-frame inference time from API response header |
| 2 | Compute p50, p95, p99 latency | — |
|
||||
|
||||
**Pass criteria**: p95 latency < 100ms
|
||||
**Duration**: 15 seconds
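
The percentile step can be sketched as follows. The document does not fix a percentile definition, so the nearest-rank method is assumed here; an interpolating definition would give slightly different values on small samples. The function names and the `limit_ms` parameter are illustrative.

```python
def percentile(samples, p):
    """Nearest-rank percentile; p is an integer from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def latency_report(samples_ms, limit_ms=100):
    """Summarize p50/p95/p99 and apply the p95 pass criterion."""
    report = {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
    report["pass"] = report["p95"] < limit_ms
    return report
```

With 100 samples, p95 is simply the 95th smallest latency, so five outliers can exceed the limit without failing the test.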
|
||||
|
||||
---
|
||||
|
||||
### NFT-PERF-02: Tier 2 heuristic latency ≤50ms
|
||||
|
||||
**Summary**: Measure V1 heuristic endpoint analysis (skeletonization + endpoint + darkness check) latency.
|
||||
**Traces to**: AC-LATENCY-TIER2
|
||||
**Metric**: p95 processing latency per ROI (ms)
|
||||
|
||||
**Preconditions**:
|
||||
- Tier 1 has produced footpath segmentation masks
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 50 frames with mock YOLO footpath masks | Record Tier 2 processing time from detection log |
| 2 | Compute p50, p95 latency | — |
|
||||
|
||||
**Pass criteria**: p95 latency < 50ms (V1 heuristic), < 200ms (V2 CNN)
|
||||
**Duration**: 10 seconds
|
||||
|
||||
---
|
||||
|
||||
### NFT-PERF-03: Tier 3 VLM latency ≤5s
|
||||
|
||||
**Summary**: Measure VLM inference latency including image encoding, prompt processing, and response generation.
|
||||
**Traces to**: AC-LATENCY-TIER3
|
||||
**Metric**: End-to-end VLM analysis time per ROI (ms)
|
||||
|
||||
**Preconditions**:
|
||||
- NanoLLM with VILA1.5-3B loaded (or vlm-stub for Docker-based test)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Trigger 10 Tier 3 analyses on different ROIs | Record time from VLM request to response via detection log |
| 2 | Compute p50, p95 latency | — |
|
||||
|
||||
**Pass criteria**: p95 latency < 5000ms
|
||||
**Duration**: 60 seconds
|
||||
|
||||
---
|
||||
|
||||
### NFT-PERF-04: Full pipeline throughput under continuous frame input
|
||||
|
||||
**Summary**: Submit frames at 10 FPS for 60 seconds; measure detection throughput and queue depth.
|
||||
**Traces to**: AC-LATENCY-TIER1, AC-SCAN-ALGORITHM
|
||||
**Metric**: Frames processed per second, max queue depth
|
||||
|
||||
**Preconditions**:
|
||||
- All tiers active, mock services responding
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Submit 600 frames at 10 FPS (60s) | Count processed frames from detection log |
| 2 | Record queue depth if available from API status endpoint | — |
|
||||
|
||||
**Pass criteria**: ≥8 FPS sustained processing rate; no frames silently dropped (all either processed or explicitly skipped with quality gate reason)
|
||||
**Duration**: 75 seconds
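
The pass criteria combine a rate check with an accounting check: every submitted frame must show up as processed or as explicitly skipped. A minimal sketch follows, assuming the test has already extracted frame IDs from the detection log; the function and its argument shapes are hypothetical.

```python
def throughput_report(submitted, processed, skipped, duration_s, min_fps=8):
    """Check sustained rate and look for silently dropped frames.

    submitted: iterable of frame ids sent by the consumer
    processed: iterable of frame ids found in the detection log
    skipped:   dict mapping frame id -> skip reason (e.g. quality gate)
    """
    submitted = set(submitted)
    processed = set(processed) & submitted
    accounted = processed | set(skipped)
    dropped = sorted(submitted - accounted)
    fps = len(processed) / duration_s
    return {
        "fps": fps,
        "silently_dropped": dropped,
        "pass": fps >= min_fps and not dropped,
    }
```

For the run described above, `submitted` would hold 600 IDs and `duration_s` would be 60.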
|
||||
|
||||
---
|
||||
|
||||
## Resilience Tests
|
||||
|
||||
### NFT-RES-01: Semantic process crash and recovery
|
||||
|
||||
**Summary**: Kill the semantic detection process; verify watchdog restarts it within 10 seconds and processing resumes.
|
||||
**Traces to**: AC-SCAN-ALGORITHM (degradation)
|
||||
|
||||
**Preconditions**:
|
||||
- Semantic detection running and processing frames
|
||||
|
||||
**Fault injection**:
|
||||
- Kill semantic process via signal (SIGKILL)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Submit 5 frames successfully | Detections returned |
| 2 | Kill semantic process | Frame processing stops |
| 3 | Wait up to 10 seconds | Watchdog detects crash, restarts process |
| 4 | Submit 5 more frames | Detections returned again |
|
||||
|
||||
**Pass criteria**: Recovery within 10 seconds; no data corruption in detection log; frames submitted during downtime are either queued or rejected (not silently dropped)
|
||||
|
||||
---
|
||||
|
||||
### NFT-RES-02: VLM load/unload cycle stability
|
||||
|
||||
**Summary**: Load and unload VLM 10 times; verify no memory leak and successful inference after each reload.
|
||||
**Traces to**: AC-RESOURCE-CONSTRAINTS
|
||||
|
||||
**Preconditions**:
|
||||
- VLM process manageable via API/signal
|
||||
|
||||
**Fault injection**:
|
||||
- Alternating VLM load/unload commands
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Load VLM, run 1 inference | Success, record memory |
| 2 | Unload VLM, record memory | Memory decreases |
| 3 | Repeat 10 times | — |
| 4 | Compare memory at cycle 1 vs cycle 10 | Delta < 100MB |
|
||||
|
||||
**Pass criteria**: No memory leak (delta < 100MB over 10 cycles); all inferences succeed
|
||||
|
||||
---
|
||||
|
||||
### NFT-RES-03: Gimbal CRC failure handling
|
||||
|
||||
**Summary**: Inject corrupted gimbal command responses; verify CRC layer detects corruption and retries.
|
||||
**Traces to**: AC-CAMERA-CONTROL
|
||||
|
||||
**Preconditions**:
|
||||
- Mock gimbal configured to return corrupted responses for first 2 attempts, valid on 3rd
|
||||
|
||||
**Fault injection**:
|
||||
- Mock gimbal flips random bits in response CRC
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Issue pan command | First 2 responses rejected (bad CRC) |
| 2 | Automatic retry | 3rd attempt succeeds |
| 3 | Read gimbal command log | Log shows 2 CRC failures + 1 success |
|
||||
|
||||
**Pass criteria**: Command succeeds after retries; CRC failures logged; no crash
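
The retry behavior under test can be sketched as below. The gimbal protocol's real checksum is not specified in this document, so `zlib.crc32` stands in for it; `request_with_retry` and `flaky_gimbal` are hypothetical names, and the mock mirrors the "corrupt the first two responses" precondition.

```python
import zlib


def frame_ok(payload, crc):
    """Stand-in integrity check; the real protocol checksum may differ."""
    return zlib.crc32(payload) == crc


def request_with_retry(exchange, max_attempts=3):
    """Call exchange() until a response passes the CRC check.

    exchange() returns (payload: bytes, crc: int).
    Returns (payload, number_of_crc_failures).
    """
    failures = 0
    for _ in range(max_attempts):
        payload, crc = exchange()
        if frame_ok(payload, crc):
            return payload, failures
        failures += 1
    raise IOError(f"no valid response after {max_attempts} attempts")


def flaky_gimbal(payload, bad_responses):
    """Hypothetical mock: corrupts the CRC of the first N replies."""
    state = {"calls": 0}

    def exchange():
        state["calls"] += 1
        crc = zlib.crc32(payload)
        if state["calls"] <= bad_responses:
            crc ^= 0x1  # flip one bit to simulate corruption
        return payload, crc

    return exchange
```

With `flaky_gimbal(b"PAN_OK", 2)` the third attempt succeeds and two failures are counted, matching the log expectation in step 3.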
|
||||
|
||||
---
|
||||
|
||||
## Security Tests
|
||||
|
||||
### NFT-SEC-01: No external network access from semantic detection
|
||||
|
||||
**Summary**: Verify the semantic detection service makes no outbound network connections outside the Docker network.
|
||||
**Traces to**: RESTRICT-SOFTWARE (local-only inference)
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Run semantic detection pipeline on test frames | Detections produced |
| 2 | Monitor network traffic from semantic-detection container (via tcpdump on e2e-net) | No packets to external IPs |
|
||||
|
||||
**Pass criteria**: Zero outbound connections to external networks
|
||||
|
||||
---
|
||||
|
||||
### NFT-SEC-02: Model files are not accessible via API
|
||||
|
||||
**Summary**: Verify TRT engine files and VLM model weights cannot be downloaded through the API.
|
||||
**Traces to**: RESTRICT-SOFTWARE
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Attempt directory traversal via API: GET /api/v1/../models/ | 404 or 400 |
| 2 | Attempt known model path: GET /api/v1/detect?path=/models/yoloe.engine | No model content returned |
|
||||
|
||||
**Pass criteria**: Model files inaccessible via any API endpoint
|
||||
|
||||
---
|
||||
|
||||
## Resource Limit Tests
|
||||
|
||||
### NFT-RES-LIM-01: Memory stays within 6GB budget [HIL]
|
||||
|
||||
**Summary**: Run full pipeline (Tier 1+2+3 + recording + logging) for 30 minutes; verify peak memory stays below 6GB (semantic module allocation).
|
||||
**Traces to**: AC-RESOURCE-CONSTRAINTS, RESTRICT-HARDWARE
|
||||
**Metric**: Peak RSS memory of semantic detection + VLM processes
|
||||
|
||||
**Preconditions**:
|
||||
- Jetson Orin Nano Super, 15W mode, active cooling
|
||||
- All components loaded
|
||||
|
||||
**Monitoring**:
|
||||
- `tegrastats` logging at 1-second intervals: GPU memory, CPU memory, swap
|
||||
|
||||
**Duration**: 30 minutes
|
||||
**Pass criteria**: Peak (semantic + VLM) memory < 6GB; no OOM kills; no swap usage above 100MB
|
||||
|
||||
---
|
||||
|
||||
### NFT-RES-LIM-02: Thermal stability under sustained load [HIL]
|
||||
|
||||
**Summary**: Run continuous inference for 60 minutes; verify T_junction stays below 75°C with active cooling.
|
||||
**Traces to**: RESTRICT-HARDWARE
|
||||
**Metric**: T_junction max, T_junction average
|
||||
|
||||
**Preconditions**:
|
||||
- Jetson Orin Nano Super, 15W mode, active cooling fan running
|
||||
- Ambient temperature 20-25°C
|
||||
|
||||
**Monitoring**:
|
||||
- Temperature sensors via `tegrastats` at 1-second intervals
|
||||
|
||||
**Duration**: 60 minutes
|
||||
**Pass criteria**: T_junction max < 75°C; no thermal throttling events
|
||||
|
||||
---
|
||||
|
||||
### NFT-RES-LIM-03: NVMe recording endurance [HIL]
|
||||
|
||||
**Summary**: Record frames to NVMe at Level 2 rate (30 FPS, 1080p JPEG) for 2 hours; verify no write errors.
|
||||
**Traces to**: AC-SCAN-ALGORITHM (recording)
|
||||
**Metric**: Frames written, write errors, NVMe health
|
||||
|
||||
**Preconditions**:
|
||||
- NVMe SSD ≥256GB, ≥30% free space
|
||||
|
||||
**Monitoring**:
|
||||
- Write errors via dmesg
|
||||
- NVMe SMART data before and after
|
||||
|
||||
**Duration**: 2 hours
|
||||
**Pass criteria**: Zero write errors; SMART indicators nominal; storage usage matches expected (~120GB for 2h at 30FPS)
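
The expected storage figure can be cross-checked with simple arithmetic. The sketch below assumes decimal gigabytes and uses only numbers stated in this test. It also shows that the "≥30% free" precondition on a 256GB drive guarantees about 77GB, which is less than the ~120GB the run is expected to write, so the drive would need more free space than the stated minimum.

```python
FPS = 30
DURATION_S = 2 * 3600
EXPECTED_BYTES = 120e9  # ~120GB, decimal gigabytes assumed

frames = FPS * DURATION_S                  # frames written in 2 hours
bytes_per_frame = EXPECTED_BYTES / frames  # implied average JPEG size
min_free_bytes = 0.30 * 256e9              # free space the precondition guarantees

print(f"{frames} frames, ~{bytes_per_frame / 1e3:.0f} kB per frame")
print(f"guaranteed free space: {min_free_bytes / 1e9:.1f} GB")
```

An implied average of roughly 556 kB per 1080p JPEG is what the ~120GB estimate corresponds to.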
|
||||
|
||||
---
|
||||
|
||||
### NFT-RES-LIM-04: Cold start time ≤60 seconds [HIL]
|
||||
|
||||
**Summary**: Power on Jetson, measure time from boot to first successful detection.
|
||||
**Traces to**: RESTRICT-OPERATIONAL
|
||||
**Metric**: Time from power-on to first detection result (seconds)
|
||||
|
||||
**Preconditions**:
|
||||
- JetPack 6.2 on NVMe, all models pre-exported as TRT engines
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Power on Jetson | Start timer |
| 2 | Poll /api/v1/health every 1s | — |
| 3 | When health returns 200, submit test frame | Record time to first detection |
|
||||
|
||||
**Pass criteria**: First detection within 60 seconds of power-on
|
||||
**Duration**: 90 seconds max
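
The polling loop in step 2 can be sketched with an injectable probe, so the same helper works against the real `/api/v1/health` endpoint and in a unit test with a fake clock. The helper name and parameters are hypothetical; `probe` is any callable that returns `True` once the health check answers 200.

```python
import time


def wait_for_health(probe, timeout_s=60.0, interval_s=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True.

    Returns the elapsed seconds on success, or None if another poll
    would exceed timeout_s.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start + interval_s > timeout_s:
            return None
        sleep(interval_s)
```

Note that the timer here starts when polling begins, so on real hardware it has to be started at power-on by whatever controls the Jetson's power.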
|
||||
@@ -0,0 +1,46 @@
|
||||
# E2E Test Data Management
|
||||
|
||||
## Seed Data Sets
|
||||
|
||||
| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|----------|-------------|---------------|-----------|---------|
| winter-footpath-images | semantic01-04.png — real aerial images with footpaths and concealed positions (winter) | FT-P-01 to FT-P-07, FT-N-01 to FT-N-04, NFT-PERF-01 to NFT-PERF-03 | Volume mount from test-frames | Persistent, read-only |
| mock-yolo-detections | Pre-computed YOLO detection JSONs for each test image (footpaths, roads, branch piles, entrances, trees) | FT-P-01 to FT-P-07 | Loaded by mock-yolo service from fixture files | Persistent, read-only |
| mock-yolo-empty | YOLO detection JSON with zero detections | FT-N-01 | Loaded by mock-yolo service | Persistent, read-only |
| mock-yolo-noise | YOLO detection JSON with high-confidence false positives (random bounding boxes) | FT-N-02 | Loaded by mock-yolo service | Persistent, read-only |
| blurry-frames | 5 synthetically blurred versions of semantic01.png (Gaussian blur, motion blur) | FT-N-03, FT-P-05 | Volume mount | Persistent, read-only |
| synthetic-video-sequence | 30 frames panning across semantic01.png to simulate gimbal movement | FT-P-06, FT-P-07 | Volume mount | Persistent, read-only |
| vlm-stub-responses | Deterministic VLM text responses for each test image ROI | FT-P-04 | Loaded by vlm-stub service | Persistent, read-only |
| gimbal-protocol-fixtures | ViewLink protocol command/response byte sequences for known operations | FT-P-06, FT-N-04 | Loaded by mock-gimbal service | Persistent, read-only |
|
||||
|
||||
## Data Isolation Strategy

Each test run starts with a clean output directory. The semantic-detection service restarts between test groups (via Docker restart). Input data (images, mock detections) is read-only and shared across tests. Output data (detection logs, recorded frames, gimbal commands) is written to a fresh directory per test run.

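
One way to get a fresh output directory per run is sketched below. It is illustrative only: the helper name `fresh_output_dir`, the `run-<id>` naming scheme, and the UTC timestamp default are assumptions, not part of the documented setup.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def fresh_output_dir(base: Path, run_id: Optional[str] = None) -> Path:
    """Create and return a new, empty output directory for one test run."""
    # Assumed naming scheme: run-<UTC timestamp> unless a run id is supplied.
    run_id = run_id or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out = base / f"run-{run_id}"
    # exist_ok=False makes reuse of an old directory fail loudly instead of
    # silently mixing outputs from two runs.
    out.mkdir(parents=True, exist_ok=False)
    return out
```

Failing on an existing directory, rather than clearing it, keeps earlier logs and recorded frames available for post-mortem inspection.
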
## Input Data Mapping

| Input Data File | Source Location | Description | Covers Scenarios |
|-----------------|----------------|-------------|-----------------|
| semantic01.png | `_docs/00_problem/input_data/semantic01.png` | Footpath with arrows, leading to branch pile hideout | FT-P-01, FT-P-02, FT-P-03, FT-P-04 |
| semantic02.png | `_docs/00_problem/input_data/semantic02.png` | Footpath to open space from forest, FPV pilot trail | FT-P-01, FT-P-02, FT-P-07 |
| semantic03.png | `_docs/00_problem/input_data/semantic03.png` | Footpath with squared hideout | FT-P-01, FT-P-03 |
| semantic04.png | `_docs/00_problem/input_data/semantic04.png` | Footpath ending at tree branches | FT-P-01, FT-P-02 |
| data_parameters.md | `_docs/00_problem/input_data/data_parameters.md` | Training data spec (not used in E2E tests directly) | — |

## External Dependency Mocks

| External Service | Mock/Stub | How Provided | Behavior |
|-----------------|-----------|-------------|----------|
| YOLO Detection Pipeline | mock-yolo Docker service | HTTP API returning deterministic JSON detection results per image hash | Returns pre-computed detection arrays matching expected YOLO output format (centerX, centerY, width, height, classNum, label, confidence) |
| ViewPro A40 Gimbal | mock-gimbal Docker service | TCP socket emulating UART serial interface | Accepts ViewLink protocol commands, responds with gimbal feedback (pan/tilt angles, status). Logs all received commands to file. Supports simulated delays (1-2s zoom transition). |
| VLM (NanoLLM/VILA) | vlm-stub Docker service | Unix socket responding to IPC messages | Returns deterministic text analysis per image ROI hash. Simulates ~2s latency. Returns configurable responses for positive/negative/ambiguous cases. |
| GPS-Denied System | Not mocked | Not needed — coordinates are passed as metadata input | System under test accepts coordinates as input parameters, does not compute them |

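
The "deterministic results per image hash" behavior of mock-yolo could look roughly like the sketch below. The source does not name the hash algorithm or the fixture layout, so SHA-256, the `<hash>.json` file naming, and returning an empty list for unknown images are all assumptions; the function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def image_hash(image_bytes: bytes) -> str:
    """Content hash used as the fixture key (SHA-256 is an assumption)."""
    return hashlib.sha256(image_bytes).hexdigest()


def load_fixtures(fixture_dir: Path) -> Dict[str, List[dict]]:
    """Load every '<hash>.json' fixture into a hash -> detections mapping."""
    return {p.stem: json.loads(p.read_text()) for p in fixture_dir.glob("*.json")}


def lookup_detections(fixtures: Dict[str, List[dict]], image_bytes: bytes) -> List[dict]:
    """Return the pre-computed detections for an image, or [] if it is unknown."""
    return fixtures.get(image_hash(image_bytes), [])
```

Keying on the image content rather than the file name means the same frame always yields the same detections, regardless of how the consumer names or transports it.
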
## Data Validation Rules

| Data Type | Validation | Invalid Examples | Expected System Behavior |
|-----------|-----------|-----------------|------------------------|
| Input frame | JPEG/PNG, 1920x1080, 3-channel RGB | 0-byte file, truncated JPEG, 640x480, grayscale | Reject with error, skip frame, continue processing |
| YOLO detection JSON | Array of objects with required fields (centerX, centerY, width, height, classNum, label, confidence) | Missing fields, confidence > 1.0, negative coordinates | Ignore malformed detections, process valid ones |
| Gimbal command | Valid ViewLink protocol packet with CRC-16 | Truncated packet, invalid CRC, unknown command code | Retry up to 3 times, log error, continue without gimbal |
| VLM IPC message | JSON with image_path and prompt fields | Missing image_path, empty prompt, non-existent file | Return error response, Tier 3 marked as failed for this ROI |

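
The YOLO detection rule ("ignore malformed detections, process valid ones") can be illustrated with a small filter. This is a sketch of the rule as written in the table, not the service's implementation; the function names are hypothetical, and the lower bound of 0.0 on confidence is an assumption (the table only lists `confidence > 1.0` as invalid).

```python
from typing import Any, Iterable, List

REQUIRED_FIELDS = ("centerX", "centerY", "width", "height",
                   "classNum", "label", "confidence")
NUMERIC_FIELDS = ("centerX", "centerY", "width", "height", "confidence")


def is_valid_detection(det: Any) -> bool:
    """Check one detection against the rules in the validation table."""
    if not isinstance(det, dict):
        return False
    if any(field not in det for field in REQUIRED_FIELDS):
        return False  # missing fields
    values = [det[field] for field in NUMERIC_FIELDS]
    if not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return False
    cx, cy, w, h, conf = values
    if min(cx, cy, w, h) < 0:
        return False  # negative coordinates
    # Upper bound is from the table; the lower bound is an assumption.
    return 0.0 <= conf <= 1.0


def filter_detections(detections: Iterable[Any]) -> List[dict]:
    """Drop malformed detections and keep the valid ones, preserving order."""
    return [d for d in detections if is_valid_detection(d)]
```

Filtering per detection, instead of rejecting the whole array, matches the expected behavior in the table: one bad entry does not discard the rest of the frame's detections.
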
@@ -0,0 +1,69 @@
# E2E Traceability Matrix

## Acceptance Criteria Coverage

| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-LATENCY-TIER1 | Tier 1 ≤100ms per frame | NFT-PERF-01, NFT-PERF-04 | Covered |
| AC-LATENCY-TIER2 | Tier 2 ≤200ms per ROI | NFT-PERF-02 | Covered |
| AC-LATENCY-TIER3 | Tier 3 ≤5s per ROI | NFT-PERF-03, FT-P-04 | Covered |
| AC-YOLO-NEW-CLASSES | New YOLO classes P≥80% R≥80% | FT-P-01 | Partially covered — functional flow tested; statistical P/R requires annotated validation set (component-level test) |
| AC-SEMANTIC-DETECTION-R | Concealed position recall ≥60% | FT-P-02, FT-P-03 | Partially covered — functional detection tested; statistical recall requires larger dataset (component-level test) |
| AC-SEMANTIC-DETECTION-P | Concealed position precision ≥20% | FT-P-02, FT-N-02 | Partially covered — same as above |
| AC-FOOTPATH-RECALL | Footpath detection recall ≥70% | FT-P-01 | Partially covered — functional detection tested; statistical recall at component level |
| AC-SCAN-L1 | Level 1 covers route with sweep | FT-P-06 | Covered |
| AC-SCAN-L1-TO-L2 | L1→L2 transition within 2s | FT-P-06 | Covered |
| AC-SCAN-L2-LOCK | L2 maintains camera lock on POI | — | NOT COVERED — requires real gimbal + moving platform; covered in [HIL] test track |
| AC-SCAN-PATH-FOLLOW | Path-following keeps path in center 50% | — | NOT COVERED — requires real camera + gimbal; covered in [HIL] track |
| AC-SCAN-ENDPOINT-HOLD | Endpoint hold for VLM analysis | FT-P-04 | Partially covered — VLM trigger tested; physical hold requires [HIL] |
| AC-SCAN-RETURN | Return to L1 after analysis/timeout | FT-P-06 | Covered (within mock gimbal command sequence) |
| AC-CAMERA-LATENCY | Gimbal command ≤500ms | NFT-RES-03 | Covered (mock; [HIL] for real latency) |
| AC-CAMERA-ZOOM | Zoom M→H within 2s | FT-P-06 | Covered (mock acknowledges zoom; [HIL] for physical timing) |
| AC-CAMERA-PATH-ACCURACY | Footpath stays in center 50% during pan | — | NOT COVERED — requires real gimbal; [HIL] |
| AC-CAMERA-SMOOTH | Smooth gimbal transitions | — | NOT COVERED — requires real gimbal; [HIL] |
| AC-CAMERA-QUEUE | POI queue prioritized by confidence/proximity | FT-P-06 | Partially covered — queue existence tested; priority ordering at component level |
| AC-SEMANTIC-PIPELINE | Consumes YOLO input, traces paths, freshness | FT-P-01, FT-P-02, FT-P-08 | Covered |
| AC-RESOURCE-CONSTRAINTS | ≤6GB RAM total | NFT-RES-LIM-01 | Covered [HIL] |
| AC-COEXIST-YOLO | Must not degrade existing YOLO | NFT-PERF-04 | Partially covered — throughput measured; real coexistence at [HIL] |

## Restrictions Coverage

| Restriction ID | Restriction | Test IDs | Coverage |
|---------------|-------------|----------|----------|
| RESTRICT-HW-JETSON | Jetson Orin Nano Super, 67 TOPS, 8GB | NFT-RES-LIM-01, NFT-RES-LIM-02 | Covered [HIL] |
| RESTRICT-HW-RAM | ~6GB available for semantic + VLM | NFT-RES-LIM-01 | Covered [HIL] |
| RESTRICT-CAM-VIEWPRO | ViewPro A40 1080p 40x zoom | FT-P-06, NFT-RES-03 | Covered (mock) |
| RESTRICT-CAM-ZOOM-TIME | Zoom transition 1-2s physical | FT-P-06 | Covered (mock with simulated delay) |
| RESTRICT-OP-ALTITUDE | 600-1000m altitude | — | NOT COVERED — operational parameter, not testable at E2E; affects GSD calculation tested at component level |
| RESTRICT-OP-SEASONS | All seasons, phased starting winter | FT-P-01 to FT-P-08 (winter images) | Partially covered — winter only; other seasons deferred to Phase 4 |
| RESTRICT-SW-CYTHON-TRT | Extend Cython + TRT codebase | — | NOT COVERED — architectural constraint verified by code review, not E2E test |
| RESTRICT-SW-TRT | TensorRT inference engine | NFT-PERF-01 | Covered [HIL] |
| RESTRICT-SW-VLM-LOCAL | VLM runs locally, no cloud | NFT-SEC-01 | Covered |
| RESTRICT-SW-VLM-SEPARATE | VLM as separate process with IPC | FT-P-04, FT-N-05 | Covered |
| RESTRICT-SW-SEQUENTIAL-GPU | YOLO and VLM scheduled sequentially | NFT-PERF-04, NFT-RES-LIM-01 | Covered (memory monitoring shows no concurrent GPU allocation) |
| RESTRICT-INT-FASTAPI | Existing FastAPI + Cython + Docker | FT-P-03 | Covered (output format) |
| RESTRICT-INT-YOLO-OUTPUT | Consume YOLO bounding box output | FT-P-01, FT-P-02 | Covered |
| RESTRICT-INT-OUTPUT-FORMAT | Output same bbox format | FT-P-03 | Covered |
| RESTRICT-SCOPE-ANNOTATION | Annotation tooling out of scope | — | N/A |
| RESTRICT-SCOPE-GPS | GPS-denied out of scope | — | N/A |

## Coverage Summary

| Category | Total Items | Covered | Partially Covered | Not Covered | Coverage % |
|----------|-----------|---------|-------------------|-------------|-----------|
| Acceptance Criteria | 21 | 10 | 7 | 4 | 64% (counting partial as 0.5) |
| Restrictions | 16 | 11 | 1 | 2 | 82% (2 N/A excluded) |
| **Total** | **37** | **21** | **8** | **6** | **71%** |

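
The coverage percentage follows from the tallies under the stated rule: a partially covered item counts as 0.5 and N/A items are left out of the denominator. The helper below is a hypothetical sketch of that arithmetic, not a tool from the repository.

```python
def coverage_percent(covered: int, partial: int, not_covered: int) -> float:
    """Coverage in percent, counting each partially covered item as 0.5.

    N/A items are excluded simply by not passing them in.
    """
    applicable = covered + partial + not_covered
    if applicable == 0:
        raise ValueError("no applicable items to compute coverage for")
    return 100.0 * (covered + 0.5 * partial) / applicable


# Acceptance Criteria tallies from the matrix: 10 covered, 7 partial, 4 not covered.
ac_coverage = coverage_percent(10, 7, 4)
```

Recomputing the figure from the matrix whenever a row changes keeps the summary consistent with the per-item Coverage column.
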
## Uncovered Items Analysis

| Item | Reason Not Covered | Risk | Mitigation |
|------|-------------------|------|-----------|
| AC-SCAN-L2-LOCK | Requires real gimbal + moving UAV platform | Camera drifts off target during flight | [HIL] test with real hardware; PID tuning on bench first |
| AC-SCAN-PATH-FOLLOW | Requires real gimbal + camera | Path leaves frame during pan | [HIL] test; component-level PID unit tests with simulated feedback |
| AC-CAMERA-PATH-ACCURACY | Requires real gimbal | Path not centered | [HIL] test |
| AC-CAMERA-SMOOTH | Requires real gimbal | Jerky movement blurs frames | [HIL] test; PID tuning |
| RESTRICT-OP-ALTITUDE | Operational parameter, not testable | GSD calculation wrong | Component-level GSD unit test with known altitude |
| RESTRICT-SW-CYTHON-TRT | Architectural constraint | Wrong tech stack used | Code review gate in PR process |
| RESTRICT-OP-SEASONS (non-winter) | Only winter images available now | System fails on summer/spring terrain | Phase 4 seasonal expansion; deferred by design |
| RESTRICT-HW-JETSON (real perf) | Requires physical hardware | Docker perf doesn't match Jetson | [HIL] test track runs on real Jetson |