Resilience Tests
NFT-RES-01: Mid-Flight Reboot Recovery
Summary: Validate the system recovers from a companion computer reboot within 70 seconds and restores position accuracy.
Traces to: AC-12 (mid-flight reboot recovery)
Preconditions:
- System running in steady state with good position accuracy
- SITL ArduPilot continues running (FC stays up during companion computer reboot)
Fault injection:
- Kill gps-denied-system process (docker stop or SIGKILL)
- Restart after 5s delay (simulates Jetson reboot time)
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Record current position accuracy and confidence | Baseline metrics |
| 2 | Kill gps-denied-system container | GPS_INPUT messages stop |
| 3 | Verify SITL continues running (heartbeat present) | FC still alive, using IMU dead reckoning |
| 4 | Restart gps-denied-system container after 5s | System starts recovery sequence |
| 5 | Monitor time from restart to first GPS_INPUT | ≤ 70s |
| 6 | Wait for first satellite match | Position accuracy restored |
| 7 | Verify position error after recovery | Error ≤ 50m after first satellite match |
Pass criteria: Recovery time ≤ 70s; post-recovery position error ≤ 50m after satellite match
Duration: 120s
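The two pass criteria above could be evaluated offline from logged data roughly as follows. This is a minimal sketch, not the project's harness: the function names, the timestamp list, and the `(lat, lon)` tuples are hypothetical, and the position error is assumed to be the great-circle (haversine) distance between the estimate and ground truth.

```python
import math

RECOVERY_LIMIT_S = 70.0       # from the pass criteria
ERROR_LIMIT_M = 50.0          # from the pass criteria
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def recovery_time_s(restart_t, gps_input_times):
    """Seconds from container restart to the first GPS_INPUT at or after it, else None."""
    after = [t for t in gps_input_times if t >= restart_t]
    return min(after) - restart_t if after else None


def horizontal_error_m(lat1, lon1, lat2, lon2):
    """Haversine distance in metres between two lat/lon points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def res01_passes(restart_t, gps_input_times, estimate, truth):
    """estimate and truth are (lat, lon) tuples sampled after the first satellite match."""
    rt = recovery_time_s(restart_t, gps_input_times)
    if rt is None or rt > RECOVERY_LIMIT_S:
        return False
    return horizontal_error_m(*estimate, *truth) <= ERROR_LIMIT_M
```

For example, `res01_passes(105.0, [100.0, 147.5], (50.0, 30.0), (50.0002, 30.0))` measures a 42.5s recovery and an error of roughly 22m, so it would pass under these assumptions.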
NFT-RES-02: Tracking Loss and Satellite Re-Localization
Summary: Validate the system recovers from cuVSLAM tracking loss via satellite-based re-localization.
Traces to: AC-07 (disconnected segments), AC-06 (sharp turn handling)
Preconditions:
- System in normal tracking (HIGH confidence)
- Satellite tiles available
Fault injection:
- Camera-replay sends featureless/blurred frames (simulates VO tracking loss from sharp turn)
- Then resumes normal frames
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Normal tracking established | confidence: HIGH, vo_status: tracking |
| 2 | Camera-replay serves 3 featureless frames | cuVSLAM reports tracking_lost |
| 3 | System enters TRACKING_LOST state | Satellite matching switches to every frame |
| 4 | Camera-replay resumes normal frames | Satellite match succeeds |
| 5 | Monitor SSE: vo_status returns to "tracking" | cuVSLAM restarted |
| 6 | Monitor SSE: confidence returns to HIGH | Position re-anchored |
| 7 | Verify position accuracy after recovery | Error ≤ 50m |
Pass criteria: Recovery within 5 frames after normal frames resume; position error ≤ 50m post-recovery
Duration: 30s
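The "within 5 frames" criterion needs a counting convention. The sketch below assumes one SSE event per camera frame, represented as a dict with `vo_status` and `confidence` keys; that event shape and the function name are illustrative assumptions, not a documented schema. The first normal frame after the fault counts as frame 1.

```python
RECOVERY_LIMIT_FRAMES = 5  # from the pass criteria


def frames_to_recover(events, resume_index):
    """Frames from the first resumed normal frame until tracking and HIGH confidence.

    events: per-frame dicts with 'vo_status' and 'confidence' (assumed shape).
    resume_index: index of the first event produced from a normal frame.
    Returns 1 if that first frame is already recovered, None if never recovered.
    """
    for offset, event in enumerate(events[resume_index:], start=1):
        if event["vo_status"] == "tracking" and event["confidence"] == "HIGH":
            return offset
    return None


def res02_recovery_ok(events, resume_index):
    n = frames_to_recover(events, resume_index)
    return n is not None and n <= RECOVERY_LIMIT_FRAMES
```

Counting from the resumed frame rather than from the moment of tracking loss keeps the measurement independent of how many featureless frames were injected.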
NFT-RES-03: Sustained IMU-Only Operation
Summary: Validate the system continues producing position estimates during extended IMU-only periods without crashing.
Traces to: AC-08 (system continues during failure), AC-12 (failsafe)
Preconditions:
- System in normal tracking
Fault injection:
- Pause both camera-replay (no VO) and satellite-tile-server (no satellite matching)
- Duration: 30s
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Establish normal tracking baseline | GPS_INPUT at 5-10Hz, confidence HIGH |
| 2 | Pause camera-replay and satellite-tile-server | VO and satellite inputs stop |
| 3 | Monitor GPS_INPUT for 30s | Messages continue at 5-10Hz (IMU-driven ESKF prediction) |
| 4 | Verify horiz_accuracy grows over time | Reported horiz_accuracy value increases monotonically (uncertainty grows) |
| 5 | Verify fix_type transitions to 2 | Degraded but present |
| 6 | Verify confidence transitions to LOW | Reflects IMU-only state |
| 7 | Resume camera-replay and satellite-tile-server | System recovers to normal tracking |
| 8 | Verify recovery to HIGH confidence | Satellite match re-anchors position |
Pass criteria: GPS_INPUT never stops during 30s IMU-only period; system recovers when inputs resume
Duration: 60s
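Steps 3 and 4 can be checked from a log of GPS_INPUT samples captured during the IMU-only window. The following is a sketch under stated assumptions: samples are `(timestamp_s, horiz_accuracy_m)` pairs, the rate is averaged over the whole window, and "monotonically" is interpreted leniently as non-decreasing with net growth, so repeated identical values do not fail the check. All names are hypothetical.

```python
def message_rate_hz(timestamps):
    """Average message rate over the interval spanned by the timestamps."""
    if len(timestamps) < 2:
        return 0.0
    span = timestamps[-1] - timestamps[0]
    return (len(timestamps) - 1) / span if span > 0 else 0.0


def is_non_decreasing(values):
    """True if no value is smaller than the one before it."""
    return all(b >= a for a, b in zip(values, values[1:]))


def res03_imu_only_ok(samples, min_hz=5.0, max_hz=10.0):
    """samples: time-ordered (timestamp_s, horiz_accuracy_m) pairs from the IMU-only window."""
    times = [t for t, _ in samples]
    accuracy = [a for _, a in samples]
    rate = message_rate_hz(times)
    # The rate test fails first for empty input, so accuracy[-1] is never reached then.
    return (
        min_hz <= rate <= max_hz
        and is_non_decreasing(accuracy)
        and accuracy[-1] > accuracy[0]
    )
```

A strict test (every sample larger than the previous one) would be a one-character change in `is_non_decreasing`; which interpretation the acceptance criteria intend is not stated in this document.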
NFT-RES-04: Satellite Tile Server Failure
Summary: Validate the system continues operating when the satellite tile server becomes unavailable, with graceful accuracy degradation.
Traces to: AC-07 (resilience), solution risk: Google Maps quality
Preconditions:
- System in normal tracking
Fault injection:
- Stop satellite-tile-server container (simulates tile unavailability)
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Normal tracking with satellite corrections | confidence: HIGH |
| 2 | Stop satellite-tile-server | Satellite matching returns errors |
| 3 | Monitor for 60s | System falls back to VO-only; confidence drops to MEDIUM after 30s |
| 4 | Verify GPS_INPUT continues | Messages at 5-10Hz, fix_type remains 3 (VO tracking OK) |
| 5 | Restart satellite-tile-server | Satellite matching resumes |
| 6 | Verify confidence returns to HIGH | Position re-anchored |
Pass criteria: No crash or hang; GPS_INPUT continues; confidence degrades gracefully and recovers when tiles return
Duration: 90s
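The confidence behaviour in steps 3 and 6 can be expressed as a check over a recorded confidence timeline. This sketch assumes a time-ordered list of `(timestamp_s, confidence)` pairs and adds a small `slack_s` tolerance around the 30s threshold; the tolerance, the function name, and the data shape are assumptions made for illustration.

```python
def res04_confidence_ok(timeline, stop_t, restart_t, degrade_after_s=30.0, slack_s=2.0):
    """Check MEDIUM confidence during the outage and HIGH confidence after it.

    timeline: time-ordered (timestamp_s, confidence) pairs.
    stop_t / restart_t: when satellite-tile-server was stopped and restarted.
    """
    window_start = stop_t + degrade_after_s + slack_s
    # Samples late enough in the outage that confidence should already be MEDIUM.
    degraded_window = [c for t, c in timeline if window_start <= t < restart_t]
    degraded = bool(degraded_window) and all(c == "MEDIUM" for c in degraded_window)
    # Any HIGH sample after the restart counts as recovery.
    recovered = any(c == "HIGH" for t, c in timeline if t >= restart_t)
    return degraded and recovered
```

Requiring at least one sample in the degraded window prevents a sparse log from passing vacuously.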
NFT-RES-05: Corrupted Camera Frame
Summary: Validate the system handles a corrupted camera frame without crashing.
Traces to: AC-06 (outlier tolerance)
Preconditions:
- System in normal tracking
Fault injection:
- Camera-replay injects a truncated/corrupted JPEG between normal frames
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Normal tracking for 5 frames | Baseline established |
| 2 | Camera-replay sends corrupted JPEG | System logs warning, skips frame |
| 3 | Camera-replay sends next normal frame | VO continues processing |
| 4 | Verify no crash, no hang | GPS_INPUT continues at 5-10Hz |
| 5 | Verify position accuracy on next valid frame | Error ≤ 50m |
Pass criteria: System skips corrupted frame gracefully; no crash; next frame processed normally
Duration: 15s
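One way to produce the truncated frame for the fault injection, and to recognise it cheaply, relies on the JPEG start-of-image (`FF D8`) and end-of-image (`FF D9`) markers. The helpers below are a hypothetical sketch: they do not decode the image, and whether the system under test detects corruption this way is not stated in this document.

```python
JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def truncate_jpeg(data, keep_fraction=0.5):
    """Return a corrupted copy of a JPEG byte string by cutting off its tail."""
    cut = max(len(JPEG_SOI), int(len(data) * keep_fraction))
    return data[:cut]


def looks_complete_jpeg(data):
    """Structural check only: starts with SOI and ends with EOI."""
    return len(data) >= 4 and data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI)


def filter_frames(frames):
    """Drop frames that fail the structural check, keeping the rest in order."""
    return [f for f in frames if looks_complete_jpeg(f)]
```

Some encoders append bytes after the EOI marker, so a production check would more likely attempt a full decode and treat a decode error as "skip this frame".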
NFT-RES-06: Camera Feed Interruption (No Frames for 10s)
Summary: Validate the system survives a 10-second camera feed interruption.
Traces to: AC-12 (failsafe: N seconds no estimate), AC-08 (continued operation)
Preconditions:
- System in normal tracking
Fault injection:
- Camera-replay pauses for 10s (no frames delivered)
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Normal tracking baseline | GPS_INPUT at 5-10Hz |
| 2 | Pause camera-replay for 10s | No new camera frames |
| 3 | Monitor GPS_INPUT | Messages continue via IMU prediction |
| 4 | Monitor confidence | Transitions to LOW after VO timeout |
| 5 | Resume camera-replay | VO restarts, satellite matching resumes |
| 6 | Verify recovery | confidence returns to HIGH within 10 frames |
Pass criteria: GPS_INPUT never stops; recovery within 10 frames after camera feed resumes
Duration: 30s
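"GPS_INPUT never stops" becomes testable once a maximum tolerated silence is chosen. The sketch below assumes a threshold of 0.5s (a little over two message periods at the 5Hz lower bound); that value and the function names are assumptions, not figures from the acceptance criteria.

```python
def max_gap_s(timestamps):
    """Largest interval between consecutive timestamps; 0.0 for fewer than two."""
    return max((b - a for a, b in zip(timestamps, timestamps[1:])), default=0.0)


def output_never_stopped(timestamps, window_start, window_end, max_allowed_gap_s=0.5):
    """True if there is no silence longer than the threshold inside the window.

    The window edges are added as virtual samples so that silence at the very
    start or end of the interruption is also detected.
    """
    inside = [t for t in timestamps if window_start <= t <= window_end]
    padded = [window_start, *inside, window_end]
    return max_gap_s(padded) <= max_allowed_gap_s
```

Applied to the 10s interruption, `window_start` and `window_end` would be the moments camera-replay is paused and resumed.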