# Blackbox Tests
## Positive Scenarios
### FT-P-01: End-to-End Position Accuracy — 50m Threshold
**Summary**: Validate that ≥80% of frame positions are within 50m of ground truth GPS across a full 60-frame flight sequence.
**Traces to**: AC-01 (80% within 50m)
**Category**: Position Accuracy
**Preconditions**:
- System running with SITL ArduPilot (GPS_TYPE=14)
- Camera replay serving flight-sequence-60 at 0.7fps
- Satellite tiles for test area loaded
- System has completed startup (first satellite match done)
**Input data**: flight-sequence-60 (60 frames), coordinates.csv (ground truth), position_accuracy.csv (thresholds)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Start session via POST /sessions | HTTP 201 with session ID |
| 2 | Subscribe to SSE stream GET /sessions/{id}/stream | SSE events begin at ~1Hz |
| 3 | Wait for camera-replay to complete all 60 frames (~86s at 0.7fps) | Position events for each processed frame |
| 4 | Collect all position events with lat/lon | At least one position estimate for each of the 60 frames (some frames may have multiple updates) |
| 5 | For each frame: compute haversine distance between estimated and ground truth position | Distance array |
| 6 | Count frames where distance < 50m, compute percentage | ≥80% |
**Expected outcome**: ≥48 of 60 frames have position error < 50m from ground truth in coordinates.csv
**Max execution time**: 120s
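
Steps 5 and 6 can be sketched as below. This is a minimal illustration, not the project's test harness: the function names, the dict-based inputs (frame id mapped to a `(lat, lon)` pair), and the spherical Earth radius are assumptions of the sketch. It also assumes that when a frame has several updates, the caller has already picked one estimate per frame.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical model is an assumption


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp guards against rounding pushing `a` marginally above 1.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def fraction_within(estimates, truth, threshold_m):
    """Share of ground-truth frames whose estimate is strictly within threshold_m.

    estimates, truth: hypothetical dicts mapping frame id -> (lat, lon).
    A frame with no estimate counts as a miss.
    """
    hits = 0
    for frame_id, (t_lat, t_lon) in truth.items():
        est = estimates.get(frame_id)
        if est is not None and haversine_m(est[0], est[1], t_lat, t_lon) < threshold_m:
            hits += 1
    return hits / len(truth)
```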
---
### FT-P-02: End-to-End Position Accuracy — 20m Threshold
**Summary**: Validate that ≥60% of frame positions are within 20m of ground truth GPS.
**Traces to**: AC-02 (60% within 20m)
**Category**: Position Accuracy
**Preconditions**: Same as FT-P-01
**Input data**: flight-sequence-60, coordinates.csv, position_accuracy.csv
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Reuse position data from FT-P-01 run (or re-run) | 60 position estimates |
| 2 | Count frames where distance < 20m, compute percentage | ≥60% |
**Expected outcome**: ≥36 of 60 frames have position error < 20m
**Max execution time**: 120s (shared with FT-P-01)
---
### FT-P-03: No Single Frame Exceeds Maximum Error
**Summary**: Validate that no individual frame position estimate exceeds 100m error.
**Traces to**: AC-01, AC-02 (implicit: no catastrophic outliers)
**Category**: Position Accuracy
**Preconditions**: Same as FT-P-01
**Input data**: flight-sequence-60, coordinates.csv, position_accuracy.csv
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Reuse position data from FT-P-01 | 60 position estimates |
| 2 | Find max error across all frames | max(distances) ≤ 100m |
**Expected outcome**: Maximum position error across all 60 frames ≤ 100m
**Max execution time**: 120s (shared with FT-P-01)
---
### FT-P-04: VO Drift Between Satellite Anchors
**Summary**: Validate cumulative VO drift stays below 100m between consecutive satellite correction events.
**Traces to**: AC-03 (drift < 100m between anchors)
**Category**: Position Accuracy
**Preconditions**: Same as FT-P-01; satellite matching active on keyframes
**Input data**: flight-sequence-60 SSE stream (includes drift_from_anchor field)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Subscribe to SSE stream | Events with drift_from_anchor field |
| 2 | Record drift_from_anchor values over the full sequence | Array of drift values |
| 3 | Find maximum drift_from_anchor value | max(drift) < 100m |
**Expected outcome**: drift_from_anchor stays below 100m throughout the 60-frame sequence
**Max execution time**: 120s
---
### FT-P-05: GPS_INPUT Message Correctness — Normal Tracking
**Summary**: Validate GPS_INPUT message fields are correctly populated during normal satellite-anchored tracking.
**Traces to**: AC-08 (GPS_INPUT to FC via MAVLink), AC-04 (confidence score)
**Category**: Flight Controller Integration
**Preconditions**: System tracking normally with recent satellite match (<30s)
**Input data**: Normal frame + satellite match; MAVLink capture from mavlink-inspector
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Read captured GPS_INPUT messages from mavlink-inspector | GPS_INPUT messages at 5-10Hz |
| 2 | Verify field: fix_type | fix_type == 3 |
| 3 | Verify field: horiz_accuracy | 1.0 ≤ horiz_accuracy ≤ 50.0 |
| 4 | Verify field: satellites_visible | satellites_visible == 10 |
| 5 | Verify fields: lat, lon | Non-zero, within operational area bounds |
| 6 | Verify fields: vn, ve, vd | Populated (non-NaN), magnitude consistent with ~50-70 km/h flight |
**Expected outcome**: All GPS_INPUT fields populated correctly per specification
**Max execution time**: 30s
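
The field checks in steps 2 to 6 could be expressed as a single validator, sketched below. Several things here are assumptions rather than part of this specification: the message is represented as a plain dict keyed by GPS_INPUT field names, `lat`/`lon` are taken in degE7 and `vn`/`ve`/`vd` in m/s as in the MAVLink common message set, the `bounds` tuple for the operational area is hypothetical, and the ±10 km/h tolerance around the 50-70 km/h range is an arbitrary choice for illustration.

```python
import math


def check_gps_input(msg, bounds, speed_range_kmh=(50.0, 70.0), speed_tol_kmh=10.0):
    """Return a list of failed checks for one GPS_INPUT message (empty list = pass).

    msg: dict with GPS_INPUT field names (lat/lon in degE7, velocities in m/s).
    bounds: hypothetical (lat_min, lat_max, lon_min, lon_max) in degrees.
    """
    failures = []
    if msg["fix_type"] != 3:
        failures.append("fix_type is not 3")
    if not 1.0 <= msg["horiz_accuracy"] <= 50.0:
        failures.append("horiz_accuracy outside [1.0, 50.0]")
    if msg["satellites_visible"] != 10:
        failures.append("satellites_visible is not 10")

    lat, lon = msg["lat"] / 1e7, msg["lon"] / 1e7  # degE7 -> degrees
    lat_min, lat_max, lon_min, lon_max = bounds
    if lat == 0 or lon == 0:
        failures.append("lat/lon is zero")
    elif not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
        failures.append("lat/lon outside operational area")

    velocity = (msg["vn"], msg["ve"], msg["vd"])
    if any(math.isnan(c) for c in velocity):
        failures.append("velocity contains NaN")
    else:
        speed_kmh = math.hypot(*velocity) * 3.6  # m/s -> km/h
        lo, hi = speed_range_kmh
        if not lo - speed_tol_kmh <= speed_kmh <= hi + speed_tol_kmh:
            failures.append("speed inconsistent with expected flight speed")
    return failures
```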
---
### FT-P-06: Image Registration Rate
**Summary**: Validate that ≥95% of frames in a normal flight are successfully registered by the VO pipeline.
**Traces to**: AC-05 (registration > 95%)
**Category**: Image Processing Quality
**Preconditions**: System running with full 60-frame sequence
**Input data**: flight-sequence-60 SSE stream (vo_status field)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Subscribe to SSE stream | Events with vo_status field |
| 2 | Count frames where vo_status == "tracking" | ≥57 of 60 |
| 3 | Compute registration rate | ≥95% |
**Expected outcome**: ≥57 of 60 frames report vo_status "tracking"
**Max execution time**: 120s
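
A minimal sketch of the counting in steps 2 and 3, assuming the consumer has reduced the SSE stream to one `vo_status` value per frame (the dict input and function names are hypothetical). The pass check uses integer arithmetic so that exactly 57 of 60 frames passes without floating-point rounding concerns.

```python
def registration_count(vo_status_by_frame):
    """Number of frames whose vo_status is "tracking".

    vo_status_by_frame: hypothetical dict mapping frame id -> vo_status string.
    """
    return sum(1 for status in vo_status_by_frame.values() if status == "tracking")


def meets_registration_target(vo_status_by_frame, total_frames=60, target_percent=95):
    """True when tracked / total_frames >= target_percent, compared as integers."""
    tracked = registration_count(vo_status_by_frame)
    return tracked * 100 >= target_percent * total_frames
```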
---
### FT-P-07: Confidence Tier — HIGH
**Summary**: Validate HIGH confidence tier when satellite match is recent and covariance is low.
**Traces to**: AC-04 (confidence score per estimate)
**Category**: Confidence Scoring
**Preconditions**: System running, satellite match completed <30s ago
**Input data**: SSE stream during normal tracking
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Read SSE event immediately after satellite match | confidence field |
| 2 | Verify confidence == "HIGH" | "HIGH" |
| 3 | Read GPS_INPUT fix_type from mavlink-inspector | fix_type == 3 |
**Expected outcome**: Confidence tier is HIGH, fix_type is 3
**Max execution time**: 30s
---
### FT-P-08: Confidence Tier — MEDIUM (VO-only, No Recent Satellite Match)
**Summary**: Validate MEDIUM confidence tier when VO is tracking but no satellite match in >30s.
**Traces to**: AC-04
**Category**: Confidence Scoring
**Preconditions**: System running; satellite tile server paused (returns 503) to prevent new matches; >30s since last match
**Input data**: SSE stream during VO-only tracking
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Pause satellite-tile-server (Docker pause) | No new satellite matches possible |
| 2 | Wait >30s after last satellite match | Confidence should transition |
| 3 | Read SSE event | confidence == "MEDIUM" |
| 4 | Read GPS_INPUT fix_type | fix_type == 3 |
**Expected outcome**: Confidence transitions to MEDIUM; fix_type remains 3
**Max execution time**: 60s
---
### FT-P-09: GPS_INPUT Output Rate
**Summary**: Validate GPS_INPUT messages are sent at 5-10Hz continuously.
**Traces to**: AC-08 (GPS_INPUT via MAVLink), AC-09 (frame-by-frame streaming)
**Category**: Flight Controller Integration
**Preconditions**: System running and producing position estimates
**Input data**: MAVLink capture from mavlink-inspector (10s window)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Capture GPS_INPUT messages for 10 seconds | N messages |
| 2 | Compute rate: N / 10 | 5 ≤ rate ≤ 10 |
| 3 | Verify no gaps > 300ms between consecutive messages | max gap ≤ 300ms |
**Expected outcome**: Rate is 5-10Hz, no gap exceeds 300ms
**Max execution time**: 15s
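
The rate and gap checks in steps 2 and 3 reduce to simple arithmetic on message receive times. The sketch below assumes the capture has been reduced to a list of timestamps in seconds; the function names are hypothetical, and treating a capture with fewer than two messages as an infinite gap is a choice made for this illustration.

```python
def rate_and_max_gap(timestamps_s, window_s=10.0):
    """Return (rate in Hz, largest gap in seconds) for one capture window."""
    ts = sorted(timestamps_s)
    rate_hz = len(ts) / window_s
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    max_gap_s = max(gaps) if gaps else float("inf")
    return rate_hz, max_gap_s


def passes_output_rate(timestamps_s, window_s=10.0):
    """Apply the FT-P-09 thresholds: 5 <= rate <= 10 Hz and max gap <= 300 ms."""
    rate_hz, max_gap_s = rate_and_max_gap(timestamps_s, window_s)
    return 5.0 <= rate_hz <= 10.0 and max_gap_s <= 0.300
```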
---
### FT-P-10: Object Localization
**Summary**: Validate object GPS localization from pixel coordinates via the FastAPI endpoint.
**Traces to**: AC-16 (object localization), AC-17 (trigonometric calculation)
**Category**: Object Localization
**Preconditions**: System running with known UAV position (from GPS-denied estimate); known object ground truth GPS
**Input data**: pixel_x, pixel_y (center of frame = nadir), gimbal_pan_deg=0, gimbal_tilt_deg=-90, zoom_factor=1.0
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST /objects/locate with pixel at frame center, gimbal pointing straight down | JSON: { lat, lon, alt, accuracy_m, confidence } |
| 2 | Compute haversine distance between response lat/lon and current UAV position | Should be < accuracy_m (nadir point ≈ UAV position) |
| 3 | Verify accuracy_m is consistent with current system accuracy | accuracy_m > 0, accuracy_m < 100m |
**Expected outcome**: Object location at nadir matches UAV position within accuracy_m
**Max execution time**: 5s
---
### FT-P-11: Coordinate Transform Round-Trip
**Summary**: Validate GPS→NED→pixel→GPS round-trip error is <0.1m.
**Traces to**: AC-18 (WGS84 output)
**Category**: Coordinate Transforms
**Preconditions**: System running, position known
**Input data**: Known GPS coordinate within operational area
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Query system for current position via SSE | lat, lon |
| 2 | POST /objects/locate with frame center pixel, straight-down gimbal | Returned lat, lon |
| 3 | Compute haversine distance between original UAV lat/lon and round-trip result | distance < 0.1m |
**Expected outcome**: Round-trip error < 0.1m
**Max execution time**: 5s
---
### FT-P-12: Startup — GPS_INPUT Within 60 Seconds
**Summary**: Validate the system begins outputting GPS_INPUT messages within 60s of boot.
**Traces to**: AC-11 (startup from last GPS)
**Category**: Startup & Failsafe
**Preconditions**: Fresh system start; SITL ArduPilot running with GLOBAL_POSITION_INT available
**Input data**: MAVLink capture from mavlink-inspector
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Start gps-denied-system container | System boots |
| 2 | Monitor mavlink-inspector for first GPS_INPUT message | Timestamp of first GPS_INPUT |
| 3 | Compute elapsed time from container start to first GPS_INPUT | ≤ 60s |
**Expected outcome**: First GPS_INPUT message arrives within 60s of system start
**Max execution time**: 90s
---
### FT-P-13: Telemetry Output Rate
**Summary**: Validate telemetry NAMED_VALUE_FLOAT messages are sent at 1Hz.
**Traces to**: AC-14 (telemetry to ground station)
**Category**: Telemetry
**Preconditions**: System running normally
**Input data**: MAVLink capture from mavlink-inspector (10s window)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Capture NAMED_VALUE_FLOAT messages for "gps_conf", "gps_drift", "gps_hacc" over 10s | N messages per name |
| 2 | Verify rate: ~1Hz per metric (8-12 messages per name in 10s) | 0.8-1.2 Hz |
**Expected outcome**: Each telemetry metric sent at ~1Hz
**Max execution time**: 15s
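
Step 2 amounts to counting messages per metric name and checking that each count falls between 8 and 12 for a 10s window. A minimal sketch, assuming the capture has been reduced to a list of NAMED_VALUE_FLOAT `name` strings (the function name and input shape are hypothetical):

```python
from collections import Counter

METRICS = ("gps_conf", "gps_drift", "gps_hacc")


def telemetry_rates_ok(names, min_count=8, max_count=12):
    """True when every expected metric appears 8-12 times in a 10 s capture.

    names: list of NAMED_VALUE_FLOAT name strings seen during the window.
    """
    counts = Counter(names)
    return all(min_count <= counts[metric] <= max_count for metric in METRICS)
```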
---
### FT-P-14: SSE Stream Schema
**Summary**: Validate SSE position events contain all required fields with correct types.
**Traces to**: AC-14 (streaming to ground station)
**Category**: API & Communication
**Preconditions**: Active session with SSE stream
**Input data**: SSE events from /sessions/{id}/stream
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Subscribe to SSE stream | Events at ~1Hz |
| 2 | Parse event JSON | Valid JSON |
| 3 | Verify fields: type (string), timestamp (ISO8601), lat (float), lon (float), alt (float), accuracy_h (float), confidence (string), drift_from_anchor (float), vo_status (string), last_satellite_match_age_s (float) | All present with correct types |
**Expected outcome**: Every SSE event conforms to the specified schema
**Max execution time**: 10s
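
Steps 2 and 3 can be illustrated with a small validator over the event payload. This is a sketch under stated assumptions: the event data arrives as a JSON object string, a JSON integer is accepted where a float is expected, and `datetime.fromisoformat` (which accepts only a subset of ISO8601) is used as the timestamp check, with a trailing `Z` rewritten to `+00:00` for older Python versions. The function name is hypothetical.

```python
import json
from datetime import datetime

# Field names and types as listed in step 3.
REQUIRED_FIELDS = {
    "type": str,
    "timestamp": str,
    "lat": float,
    "lon": float,
    "alt": float,
    "accuracy_h": float,
    "confidence": str,
    "drift_from_anchor": float,
    "vo_status": str,
    "last_satellite_match_age_s": float,
}


def schema_errors(event_json):
    """Return a list of schema violations for one SSE event payload."""
    try:
        event = json.loads(event_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(event, dict):
        return ["event is not a JSON object"]

    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in event:
            errors.append(f"missing field: {name}")
            continue
        value = event[name]
        if expected is float:
            # bool is a subclass of int in Python, so exclude it explicitly.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        else:
            ok = isinstance(value, str)
        if not ok:
            errors.append(f"wrong type for field: {name}")

    timestamp = event.get("timestamp")
    if isinstance(timestamp, str):
        try:
            datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        except ValueError:
            errors.append("timestamp is not ISO8601")
    return errors
```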
---
## Negative Scenarios
### FT-N-01: Trajectory Direction Change (Frames 32-43)
**Summary**: Validate system continues producing position estimates through a trajectory direction change.
**Traces to**: AC-07 (disconnected segments core to system)
**Category**: Resilience & Edge Cases
**Preconditions**: System running; camera-replay set to serve frames 32-43 (direction change area)
**Input data**: Frames AD000032-043.jpg, coordinates for frames 32-43
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Camera-replay serves frames 32-43 at 0.7fps | System processes frames |
| 2 | Collect SSE position events for each frame | ≥12 position estimates (one per frame minimum) |
| 3 | Verify no gap >5s without a position update | Continuous output |
**Expected outcome**: System produces position estimates for all frames in the direction-change segment; no prolonged output gap
**Max execution time**: 30s
---
### FT-N-02: Outlier Frame Handling (350m Gap)
**Summary**: Validate system handles a 350m outlier between consecutive photos without position corruption.
**Traces to**: AC-06 (350m outlier tolerance)
**Category**: Resilience & Edge Cases
**Preconditions**: System running with normal tracking established; fault injection: camera-replay skips frames to simulate 350m gap
**Input data**: Normal frames followed by a frame 350m away (simulated by frame skip in camera-replay)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Normal tracking for 10 frames | Position estimates with <50m error |
| 2 | Camera-replay jumps forward ~350m (skips multiple frames) | System detects discontinuity |
| 3 | Collect position estimates for next 5 frames after the gap | Recovery within 3-5 frames |
| 4 | Verify position error of recovered frames | Error < 100m for first valid frame after recovery |
**Expected outcome**: System recovers from 350m outlier; post-recovery position error < 100m
**Max execution time**: 30s
---
### FT-N-03: Invalid Object Localization Request
**Summary**: Validate API rejects invalid pixel coordinates with HTTP 422.
**Traces to**: AC-16 (object localization)
**Category**: API Error Handling
**Preconditions**: System running with active session
**Input data**: POST /objects/locate with pixel_x=-100, pixel_y=-100
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST /objects/locate with negative pixel coordinates | HTTP 422 |
| 2 | Verify response body contains error description | JSON with "error" or "detail" field |
**Expected outcome**: HTTP 422 with validation error
**Max execution time**: 2s
---
### FT-N-04: Unauthenticated API Access
**Summary**: Validate API rejects unauthenticated requests with HTTP 401.
**Traces to**: AC-14 (security — JWT auth)
**Category**: API Security
**Preconditions**: System running
**Input data**: POST /sessions with no Authorization header
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | POST /sessions without JWT token | HTTP 401 |
| 2 | GET /sessions/{id}/stream without JWT | HTTP 401 |
| 3 | POST /objects/locate without JWT | HTTP 401 |
| 4 | GET /health (no auth required) | HTTP 200 |
**Expected outcome**: Protected endpoints return 401; /health remains accessible
**Max execution time**: 5s
---
### FT-N-05: 3-Consecutive-Failure Re-Localization Request
**Summary**: Validate that after VO loss + 3 consecutive satellite match failures, the system sends a re-localization request to the ground station.
**Traces to**: AC-08 (3 consecutive failures → re-localization request)
**Category**: Resilience & Edge Cases
**Preconditions**: System running; camera-replay set to serve featureless frames (VO will fail); satellite-tile-server returning 404 (tile not found)
**Input data**: Featureless frames (e.g., blank/uniform images); satellite-tile-server returning 404 for every tile request
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Camera-replay serves featureless frames | VO tracking lost |
| 2 | Satellite-tile-server returns 404 | Satellite matching fails |
| 3 | Wait for 3 camera frames (3 × 1.43s ≈ 4.3s) | 3 consecutive failures |
| 4 | Check mavlink-inspector for STATUSTEXT | Message matches `RELOC_REQ: last_lat=.* last_lon=.* uncertainty=.*m` |
| 5 | Verify GPS_INPUT fix_type | fix_type == 0 |
| 6 | Verify GPS_INPUT horiz_accuracy | horiz_accuracy == 999.0 |
**Expected outcome**: RELOC_REQ sent via STATUSTEXT; GPS_INPUT reports no-fix with 999.0 accuracy
**Max execution time**: 15s
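
The wait time in step 3 and the pattern match in step 4 can be sketched as follows. The regular expression is the one given in step 4; anchoring it at the start of the STATUSTEXT text with `re.match` is an assumption of this sketch, as are the function names.

```python
import re

# Pattern from step 4, applied from the start of the STATUSTEXT text.
RELOC_REQ_PATTERN = re.compile(r"RELOC_REQ: last_lat=.* last_lon=.* uncertainty=.*m")


def is_reloc_request(text):
    """True when a STATUSTEXT text matches the RELOC_REQ pattern."""
    return RELOC_REQ_PATTERN.match(text) is not None


def failure_wait_s(frames=3, fps=0.7):
    """Time needed for `frames` camera frames at `fps` (3 frames at 0.7fps is about 4.3 s)."""
    return frames / fps
```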
---
### FT-N-06: IMU-Only Dead Reckoning (VO Lost, No Satellite)
**Summary**: Validate system degrades gracefully to IMU-only ESKF prediction when VO and satellite matching both fail.
**Traces to**: AC-06 (VO lost behavior), AC-04 (confidence score reflects state)
**Category**: Resilience & Edge Cases
**Preconditions**: System running; camera-replay paused (no frames); satellite-tile-server paused
**Input data**: No camera frames, no satellite tiles; only IMU from SITL
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Pause camera-replay and satellite-tile-server | System loses VO and satellite inputs |
| 2 | Read SSE events over 5s | confidence transitions from HIGH/MEDIUM to LOW |
| 3 | Read GPS_INPUT from mavlink-inspector | fix_type == 2 |
| 4 | Read horiz_accuracy over time | horiz_accuracy ≥ 50m and increasing |
| 5 | Verify GPS_INPUT continues at 5-10Hz | Messages continue (IMU-driven ESKF prediction) |
**Expected outcome**: System continues GPS_INPUT at 5-10Hz via IMU; confidence drops; accuracy degrades but output never stops
**Max execution time**: 15s
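
Step 4 leaves "increasing" open to interpretation. One possible reading, sketched below, is that every `horiz_accuracy` sample is at least 50m, the series never decreases, and the last sample is larger than the first. The function name and this interpretation are assumptions, not part of the specification.

```python
def accuracy_degrading(samples_m, floor_m=50.0):
    """True when horiz_accuracy samples stay >= floor_m and grow over time.

    samples_m: horiz_accuracy values in metres, in capture order.
    """
    if len(samples_m) < 2:
        return False  # a trend cannot be judged from a single sample
    non_decreasing = all(b >= a for a, b in zip(samples_m, samples_m[1:]))
    return min(samples_m) >= floor_m and non_decreasing and samples_m[-1] > samples_m[0]
```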
---
### FT-N-07: Operator Re-Localization Hint Accepted
**Summary**: Validate the system accepts an operator re-localization hint and recovers position.
**Traces to**: AC-08 (re-localization), AC-15 (ground station commands)
**Category**: Ground Station Integration
**Preconditions**: System in FAILED confidence state (3 consecutive failures); satellite-tile-server restored
**Input data**: Operator hint: approximate lat/lon (from coordinates.csv ground truth ± 200m offset)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Trigger 3-consecutive-failure state (FT-N-05 preconditions) | RELOC_REQ sent |
| 2 | Restore satellite-tile-server | Tiles available again |
| 3 | POST /sessions/{id}/anchor with approximate lat/lon | HTTP 200 |
| 4 | Wait for satellite match attempt (~3-5s) | System searches in new area |
| 5 | Read SSE events | confidence transitions back to HIGH/MEDIUM |
| 6 | Read GPS_INPUT fix_type | fix_type == 3 |
**Expected outcome**: System accepts operator hint, searches satellite tiles in new area, recovers position, confidence returns to HIGH/MEDIUM
**Max execution time**: 30s