# Resilience Tests
### NFT-RES-01: Mid-Flight Reboot Recovery
**Summary**: Validate the system recovers from a companion computer reboot within 70 seconds and restores position accuracy.
**Traces to**: AC-12 (mid-flight reboot recovery)
**Preconditions**:
- System running in steady state with good position accuracy
- SITL ArduPilot continues running (FC stays up during companion computer reboot)
**Fault injection**:
- Kill the gps-denied-system container (`docker stop`, or SIGKILL for an ungraceful stop)
- Restart after 5s delay (simulates Jetson reboot time)
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Record current position accuracy and confidence | Baseline metrics |
| 2 | Kill gps-denied-system container | GPS_INPUT messages stop |
| 3 | Verify SITL continues running (heartbeat present) | FC still alive, using IMU dead reckoning |
| 4 | Restart gps-denied-system container after 5s | System starts recovery sequence |
| 5 | Measure time from container restart to first GPS_INPUT message | First GPS_INPUT arrives within 70s of restart |
| 6 | Wait for first satellite match | Position accuracy restored |
| 7 | Verify position error after recovery | Error ≤ 50m after first satellite match |
**Pass criteria**: Recovery time ≤ 70s; post-recovery position error ≤ 50m after satellite match
**Duration**: 120s
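
The two pass criteria above reduce to numeric checks on recorded data. The following is a minimal sketch, assuming the harness has already captured the restart time, the arrival times of GPS_INPUT messages, and a ground-truth position from SITL. The function names and the spherical-Earth distance are illustrative assumptions, not part of the system under test.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


def recovery_time_s(restart_t, gps_input_times):
    """Seconds from container restart to the first GPS_INPUT at or after it, or None."""
    after = [t for t in gps_input_times if t >= restart_t]
    return min(after) - restart_t if after else None


def horizontal_error_m(lat1, lon1, lat2, lon2):
    """Haversine distance in metres between two lat/lon points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def passes_nft_res_01(restart_t, gps_input_times, estimate, truth,
                      max_recovery_s=70.0, max_error_m=50.0):
    """Apply both pass criteria; estimate and truth are (lat, lon) tuples in degrees."""
    rt = recovery_time_s(restart_t, gps_input_times)
    if rt is None or rt > max_recovery_s:
        return False
    return horizontal_error_m(*estimate, *truth) <= max_error_m
```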
---
### NFT-RES-02: Tracking Loss and Satellite Re-Localization
**Summary**: Validate the system recovers from cuVSLAM tracking loss via satellite-based re-localization.
**Traces to**: AC-07 (disconnected segments), AC-06 (sharp turn handling)
**Preconditions**:
- System in normal tracking (HIGH confidence)
- Satellite tiles available
**Fault injection**:
- Camera-replay sends featureless/blurred frames (simulates VO tracking loss from sharp turn)
- Then resumes normal frames
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Normal tracking established | confidence: HIGH, vo_status: tracking |
| 2 | Camera-replay serves 3 featureless frames | cuVSLAM reports tracking_lost |
| 3 | System enters TRACKING_LOST state | Satellite matching switches to every frame |
| 4 | Camera-replay resumes normal frames | Satellite match succeeds |
| 5 | Monitor SSE: vo_status returns to "tracking" | cuVSLAM restarted |
| 6 | Monitor SSE: confidence returns to HIGH | Position re-anchored |
| 7 | Verify position accuracy after recovery | Error ≤ 50m |
**Pass criteria**: Recovery within 5 frames after normal frames resume; position error ≤ 50m post-recovery
**Duration**: 30s
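
The "within 5 frames" criterion can be evaluated from the SSE stream once it has been reduced to one record per camera frame. The sketch below assumes each record is a dict with `vo_status` and `confidence` keys and that the harness knows the index of the first normal frame after the featureless ones. The record shape, the `"lost"` status value in the usage data, and the function names are hypothetical.

```python
def frames_to_recovery(frames, resume_index):
    """Number of frames, counting the frame at resume_index as 1, until the system
    reports vo_status 'tracking' and confidence 'HIGH'. Returns None if it never does."""
    for offset, frame in enumerate(frames[resume_index:], start=1):
        if frame["vo_status"] == "tracking" and frame["confidence"] == "HIGH":
            return offset
    return None


def recovers_in_time(frames, resume_index, max_frames=5):
    """True if recovery happens within max_frames after normal frames resume."""
    n = frames_to_recovery(frames, resume_index)
    return n is not None and n <= max_frames
```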
---
### NFT-RES-03: Sustained IMU-Only Operation
**Summary**: Validate the system continues producing position estimates during extended IMU-only periods without crashing.
**Traces to**: AC-08 (system continues during failure), AC-12 (failsafe)
**Preconditions**:
- System in normal tracking
**Fault injection**:
- Pause both camera-replay (no VO) and satellite-tile-server (no satellite matching)
- Duration: 30s
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Establish normal tracking baseline | GPS_INPUT at 5-10Hz, confidence HIGH |
| 2 | Pause camera-replay and satellite-tile-server | VO and satellite inputs stop |
| 3 | Monitor GPS_INPUT for 30s | Messages continue at 5-10Hz (IMU-driven ESKF prediction) |
| 4 | Verify horiz_accuracy grows over time | horiz_accuracy value increases monotonically (reported uncertainty grows) |
| 5 | Verify fix_type transitions to 2 | 2D fix: degraded but still present |
| 6 | Verify confidence transitions to LOW | Reflects IMU-only state |
| 7 | Resume camera-replay and satellite-tile-server | System recovers to normal tracking |
| 8 | Verify recovery to HIGH confidence | Satellite match re-anchors position |
**Pass criteria**: GPS_INPUT never stops during 30s IMU-only period; system recovers when inputs resume
**Duration**: 60s
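
Steps 4 and 5 can be checked offline against the GPS_INPUT messages captured during the outage. This is a minimal sketch, assuming each captured message has been converted to a dict carrying the `horiz_accuracy` and `fix_type` fields. The dict representation and function names are assumptions, and "monotonically" is read here as "never decreases".

```python
def accuracy_is_monotonic(samples):
    """True if horiz_accuracy never decreases between consecutive samples."""
    values = [s["horiz_accuracy"] for s in samples]
    return all(b >= a for a, b in zip(values, values[1:]))


def reaches_degraded_fix(samples, degraded_fix_type=2):
    """True if any sample captured during the outage reports the degraded fix_type."""
    return any(s["fix_type"] == degraded_fix_type for s in samples)
```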
---
### NFT-RES-04: Satellite Tile Server Failure
**Summary**: Validate the system continues operating when satellite tile server becomes unavailable, with graceful accuracy degradation.
**Traces to**: AC-07 (resilience), solution risk: Google Maps quality
**Preconditions**:
- System in normal tracking
**Fault injection**:
- Stop satellite-tile-server container (simulates tile unavailability)
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Normal tracking with satellite corrections | confidence: HIGH |
| 2 | Stop satellite-tile-server | Satellite matching returns errors |
| 3 | Monitor for 60s | System falls back to VO-only; confidence drops to MEDIUM after 30s |
| 4 | Verify GPS_INPUT continues | Messages at 5-10Hz, fix_type remains 3 (VO tracking OK) |
| 5 | Restart satellite-tile-server | Satellite matching resumes |
| 6 | Verify confidence returns to HIGH | Position re-anchored |
**Pass criteria**: No crash or hang; GPS_INPUT continues; confidence degrades gracefully and recovers when tiles return
**Duration**: 90s
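
"Degrades gracefully and recovers" can be made concrete by collapsing the sampled confidence values into the sequence of distinct levels the system passed through. The sketch below assumes the expected path for this test is exactly HIGH, then MEDIUM, then HIGH, with no other level appearing. That reading of the steps, and the function names, are assumptions.

```python
from itertools import groupby


def confidence_transitions(levels):
    """Collapse consecutive duplicates, e.g. H,H,M,M,H becomes H,M,H."""
    return [level for level, _ in groupby(levels)]


def degrades_and_recovers(levels, expected=("HIGH", "MEDIUM", "HIGH")):
    """True if the observed confidence path matches the expected one exactly."""
    return confidence_transitions(levels) == list(expected)
```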
---
### NFT-RES-05: Corrupted Camera Frame
**Summary**: Validate the system handles a corrupted camera frame without crashing.
**Traces to**: AC-06 (outlier tolerance)
**Preconditions**:
- System in normal tracking
**Fault injection**:
- Camera-replay injects a truncated/corrupted JPEG between normal frames
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Normal tracking for 5 frames | Baseline established |
| 2 | Camera-replay sends corrupted JPEG | System logs warning, skips frame |
| 3 | Camera-replay sends next normal frame | VO continues processing |
| 4 | Verify no crash, no hang | GPS_INPUT continues at 5-10Hz |
| 5 | Verify position accuracy on next valid frame | Error ≤ 50m |
**Pass criteria**: System skips corrupted frame gracefully; no crash; next frame processed normally
**Duration**: 15s
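
One way to produce the corrupted frame is to cut a valid JPEG short so that its end-of-image marker is lost. The following is a minimal sketch of that fault, using only the standard JPEG start (`FF D8`) and end (`FF D9`) markers. The helper names and the 50% cut are illustrative assumptions, and `looks_complete` is a cheap marker check, not a full decode.

```python
JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def truncate_frame(jpeg_bytes, keep_fraction=0.5):
    """Return only the leading part of a JPEG, dropping the tail and its EOI marker."""
    if not 0.0 < keep_fraction < 1.0:
        raise ValueError("keep_fraction must be strictly between 0 and 1")
    # Keep at least the SOI marker so the payload still looks like a JPEG at first glance.
    cut = max(len(JPEG_SOI), int(len(jpeg_bytes) * keep_fraction))
    return jpeg_bytes[:cut]


def looks_complete(jpeg_bytes):
    """Marker-level sanity check: starts with SOI and ends with EOI."""
    return jpeg_bytes.startswith(JPEG_SOI) and jpeg_bytes.endswith(JPEG_EOI)
```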
---
### NFT-RES-06: Camera Feed Interruption (No Frames for 10s)
**Summary**: Validate the system survives a 10-second camera feed interruption.
**Traces to**: AC-12 (failsafe: N seconds with no estimate), AC-08 (continued operation)
**Preconditions**:
- System in normal tracking
**Fault injection**:
- Camera-replay pauses for 10s (no frames delivered)
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Normal tracking baseline | GPS_INPUT at 5-10Hz |
| 2 | Pause camera-replay for 10s | No new camera frames |
| 3 | Monitor GPS_INPUT | Messages continue via IMU prediction |
| 4 | Monitor confidence | Transitions to LOW after VO timeout |
| 5 | Resume camera-replay | VO restarts, satellite matching resumes |
| 6 | Verify recovery | confidence returns to HIGH within 10 frames |
**Pass criteria**: GPS_INPUT never stops; recovery within 10 frames after camera feed resumes
**Duration**: 30s
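
"GPS_INPUT never stops" needs a concrete threshold to be testable. A minimal sketch, assuming the harness records an arrival timestamp for every GPS_INPUT message and treats any gap longer than 1.5 times the period at the 5 Hz lower bound as a stop. The tolerance factor and function names are assumptions.

```python
def max_gap_s(timestamps):
    """Largest interval in seconds between consecutive message arrivals."""
    ts = sorted(timestamps)
    return max((b - a for a, b in zip(ts, ts[1:])), default=0.0)


def never_stops(timestamps, min_rate_hz=5.0, tolerance=1.5):
    """True if no gap exceeds tolerance times the nominal period at min_rate_hz."""
    if len(timestamps) < 2:
        return False  # too few messages to claim a continuous stream
    return max_gap_s(timestamps) <= tolerance / min_rate_hz
```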