# Resilience Tests
> Each test defines fault injection + observable recovery + quantifiable pass/fail. All run through the public interfaces from `environment.md`.
---
### NFT-RES-01: Companion-computer process kill mid-flight (AC-5.3, AC-NEW-1)
**Summary**: SUT process killed mid-flight; SUT restarts and recovers from FC's IMU-extrapolated position within 30 s.
**Traces to**: AC-5.3, AC-NEW-1, F-T11, results_report row 25. Tier: T1.
**Preconditions**: SUT in steady-state tracking; FC continues to fly.
**Fault injection**:
- `docker kill -s SIGKILL <sut>` followed by `docker start <sut>`.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | SIGKILL SUT | SUT process exits non-gracefully; FC continues IMU-only DR per AC-5.2 |
| 2 | Restart SUT | container starts |
| 3 | Measure time from container start to first valid GPS_INPUT (`fix_type==3`) | t_recovery ≤ 30 s |
| 4 | Read `GLOBAL_POSITION_INT` from FC at SUT-start; assert pipeline seeds from it | source recovery via FC pose |
| 5 | After first satellite match, error ≤ 50 m | accuracy restored |
**Pass criteria**: 95th percentile of t_recovery ≤ 30 s over 50 trials; AC-5.2 fallback observable on FC during the gap; error ≤ 50 m after the first satellite match.
**Duration**: 60 s per trial; 50-trial campaign on T4.
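
The p95 pass criterion above can be evaluated from the recorded per-trial recovery times. The following is a minimal sketch, assuming the nearest-rank percentile definition (the spec does not say which definition applies); `percentile_nearest_rank` and `recovery_campaign_passes` are hypothetical helper names, not part of the test harness.

```python
def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct % of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def recovery_campaign_passes(t_recovery_s, budget_s=30.0, min_trials=50):
    """True when enough trials ran and p95 of t_recovery is within budget."""
    if len(t_recovery_s) < min_trials:
        return False
    return percentile_nearest_rank(t_recovery_s, 95) <= budget_s


# Example: 48 fast trials and 2 slow ones still pass; 3 slow ones do not.
print(recovery_campaign_passes([10.0] * 48 + [40.0] * 2))  # True
print(recovery_campaign_passes([10.0] * 47 + [40.0] * 3))  # False
```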
---
### NFT-RES-02: GPS spoofing — promotion within 3 s (AC-NEW-2)
**Summary**: FC GPS-loss / lane-switch event signalled → SUT promotes its estimate to primary within 3 s.
**Traces to**: AC-NEW-2, F-T12. Tier: T3 (`deferred-sitl`).
**Preconditions**: SITL + `gps-spoof-injector`.
**Fault injection**:
- Inject malicious `GPS_RAW_INT` with 1 km lat/lon offset starting at scripted t=0.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | t=0: inject spoof | FC observes anomaly; emits EKF lane-switch / fix-loss in `EKF_STATUS_REPORT` |
| 2 | SUT subscribes to `GPS_RAW_INT`, `EKF_STATUS_REPORT`, `SYS_STATUS` and maintains a "real-GPS health" rolling average | health drops below threshold |
| 3 | Within 3 s, SUT raises GPS_INPUT to primary mode + emits STATUSTEXT `PROMOTE` to GCS | promotion event observable |
**Pass criteria**: 95th percentile of t_promote ≤ 3 s over 50 trials.
**Duration**: 30 min campaign.
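
Step 2 describes a "real-GPS health" rolling average that triggers promotion once it drops below a threshold. The sketch below illustrates that idea only. The window length, the threshold value, the boolean health input, and the class name `GpsHealthMonitor` are all assumptions; the spec does not define how health is derived from `GPS_RAW_INT`, `EKF_STATUS_REPORT`, and `SYS_STATUS`.

```python
from collections import deque


class GpsHealthMonitor:
    """Rolling average over the last `window` boolean health samples."""

    def __init__(self, window=10, threshold=0.5):
        self.samples = deque(maxlen=window)
        self.threshold = threshold
        self.promoted_at_ms = None  # time of the first promotion, if any

    def update(self, t_ms, healthy):
        """Record one sample; latch promotion when the average drops."""
        self.samples.append(1.0 if healthy else 0.0)
        avg = sum(self.samples) / len(self.samples)
        window_full = len(self.samples) == self.samples.maxlen
        if self.promoted_at_ms is None and window_full and avg < self.threshold:
            self.promoted_at_ms = t_ms
        return avg


# Hypothetical 10 Hz feed: healthy until t=0, spoofed afterwards.
mon = GpsHealthMonitor(window=10, threshold=0.5)
for i in range(-10, 0):
    mon.update(i * 100, True)
for i in range(0, 30):
    mon.update(i * 100, False)
print(mon.promoted_at_ms)  # 500, i.e. t_promote = 0.5 s, inside the 3 s budget
```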
---
### NFT-RES-03: 3-s no-fix → FC fallback to IMU-only DR (AC-5.2)
**Summary**: Pipeline blackout for >3 s — FC falls back to IMU-only DR; SUT logs the failure.
**Traces to**: AC-5.2, restrictions §Failsafe. Tier: T3.
**Fault injection**: scripted scenario where SUT cannot produce any estimate for 3.5 s (e.g., cuVSLAM tracking loss + cache poisoned + matcher offline).
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Inject blackout | SUT publishes STATUSTEXT WARN within 1 s of blackout |
| 2 | At t=3 s of blackout, SUT emits a single STATUSTEXT FAILSAFE | recorded |
| 3 | Observe FC `EKF_STATUS_REPORT` | FC switches to IMU-only DR within 4 s of blackout start |
| 4 | After 5 s, restore pipeline | SUT re-emits valid GPS_INPUT; FC re-fuses |
**Pass criteria**: FC fallback observable within 4 s; SUT recovers within 30 s of pipeline restore (matches AC-NEW-1 budget).
**Duration**: 60 s per trial.
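
Steps 1 and 2 can be asserted against a recorded STATUSTEXT log. This is a minimal sketch, assuming events are already reduced to `(seconds_since_blackout, label)` pairs; the function name `check_blackout_log` and the 0.2 s timing tolerance on the FAILSAFE message are assumptions, not values from the spec.

```python
def check_blackout_log(events, warn_deadline_s=1.0, failsafe_at_s=3.0, tol_s=0.2):
    """Check the WARN and FAILSAFE timing of one blackout trial.

    events: iterable of (t_since_blackout_s, label) tuples.
    Passes when a WARN appears within the deadline and exactly one
    FAILSAFE appears close to the 3 s mark.
    """
    warns = [t for t, label in events if label == "WARN"]
    failsafes = [t for t, label in events if label == "FAILSAFE"]
    warn_ok = bool(warns) and min(warns) <= warn_deadline_s
    failsafe_ok = (
        len(failsafes) == 1 and abs(failsafes[0] - failsafe_at_s) <= tol_s
    )
    return warn_ok and failsafe_ok


print(check_blackout_log([(0.4, "WARN"), (3.0, "FAILSAFE")]))  # True
print(check_blackout_log([(0.4, "WARN"), (3.0, "FAILSAFE"), (3.5, "FAILSAFE")]))  # False
```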
---
### NFT-RES-04: 3-consecutive-failures → RELOC_REQ + waiting state (AC-3.4)
**Summary**: When SUT cannot determine position for ≥3 consecutive frames AND ≥2 s, it sends a re-localization request.
**Traces to**: AC-3.4, results_report rows 20, 21, 46. Tier: T1.
**Fault injection**: scripted 3 frames of failed satellite matching + cuVSLAM degraded.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Trigger 3 consecutive frame failures spanning ≥2 s | counter increments |
| 2 | Within 2 s of the third failure, STATUSTEXT `RELOC_REQ: last_lat=… last_lon=… uncertainty=…m` emitted | regex match |
| 3 | While waiting, SUT continues VO/IMU dead reckoning (`fix_type==0`, source `dead_reckoned`) and continues satellite-match attempts (counter increments) | observable |
| 4 | FC continues with last known position + IMU extrapolation | `EKF_STATUS_REPORT` consistent |
**Pass criteria**: regex matches; SUT continues emitting GPS_INPUT in waiting state; satellite-match counter increments.
**Duration**: 60 s.
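
The regex match in step 2 could look like the sketch below. The source elides the field values, so the numeric formats here (signed decimal degrees for `last_lat` and `last_lon`, an unsigned decimal for `uncertainty`) are assumptions, as are the names `RELOC_REQ_RE` and `parse_reloc_req`.

```python
import re

# Assumed value formats: signed decimals for lat/lon, unsigned for metres.
RELOC_REQ_RE = re.compile(
    r"^RELOC_REQ: last_lat=(?P<lat>-?\d+(?:\.\d+)?) "
    r"last_lon=(?P<lon>-?\d+(?:\.\d+)?) "
    r"uncertainty=(?P<unc>\d+(?:\.\d+)?)m$"
)


def parse_reloc_req(text):
    """Return (lat, lon, uncertainty_m) or None if the text does not match."""
    m = RELOC_REQ_RE.match(text)
    if m is None:
        return None
    return float(m["lat"]), float(m["lon"]), float(m["unc"])


print(parse_reloc_req("RELOC_REQ: last_lat=48.1 last_lon=37.6 uncertainty=120m"))
print(parse_reloc_req("RELOC_REQ: last_lat=48.1"))  # None
```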
---
### NFT-RES-05: Operator hint workflow (AC-3.4, AC-6.2)
**Summary**: Operator hint is consumed as a 500 m seed for VPR/cross-view re-loc.
**Traces to**: AC-3.4, AC-6.2, F-T10, results_report row 22. Tier: T1.
**Preconditions**: SUT in re-loc waiting (after NFT-RES-04).
**Fault injection** (cooperative): `qgc-mock` sends STATUSTEXT `RELOC_HINT: lat=… lon=… sigma=500m`.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Send hint | SUT consumes hint; STATUSTEXT `HINT_RECEIVED` echoed |
| 2 | First fix after hint | error ≤ 500 m |
| 3 | After next satellite match | error ≤ 50 m; `tracking_state == NORMAL` |
**Pass criteria**: as above.
**Duration**: 60 s.
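
The two error thresholds in steps 2 and 3 need a ground-distance measure between the SUT fix and ground truth. A minimal sketch, assuming the haversine great-circle distance on a spherical Earth is accurate enough at these scales; `haversine_m` and `hint_stage_ok` are hypothetical helper names.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def hint_stage_ok(truth, first_fix, anchored_fix):
    """First fix within 500 m of truth, post-match fix within 50 m."""
    return (
        haversine_m(*truth, *first_fix) <= 500.0
        and haversine_m(*truth, *anchored_fix) <= 50.0
    )


# 0.004 deg of latitude is roughly 445 m; 0.0004 deg is roughly 44 m.
print(hint_stage_ok((0.0, 0.0), (0.004, 0.0), (0.0004, 0.0)))  # True
```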
---
### NFT-RES-06: Sharp turn — VO-loss → satellite re-loc (AC-3.2)
**Summary**: A sharp turn (<5 % overlap, <70°, <200 m drift) triggers VO loss; satellite re-loc recovers within 3 frames.
**Traces to**: AC-3.2, F-T7. Tier: T1.
**Fault injection**: synthetic sharp-turn pair injected into `nav_cam_60_slice`.
**Steps**: see FT-P-14; resilience perspective: cuVSLAM tracking-loss event → matcher invocation via re-loc trigger → recovery.
**Pass criteria**: error ≤ 50 m within 3 frames of turn; cuVSLAM tracking-state returns to NORMAL.
**Duration**: 60 s.
---
### NFT-RES-07: Disconnected-segment recovery (AC-3.3)
**Summary**: ≥3 disconnected segments per flight; each segment connects to prior trajectory via global retrieval.
**Traces to**: AC-3.3, F-T8. Tier: T1.
**Fault injection**: `disconnected_segments_replay` with ≥3 large gaps.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Replay segment N (after gap) | VPR retrieves top-K candidate chunks; matcher relocalizes within 10 frames |
| 2 | After re-loc, trajectory continuity restored (no jump in EKF position beyond gap-expected) | `tracking_state == NORMAL` |
| 3 | Repeat for ≥3 segments | every replayed segment succeeds |
**Pass criteria**: every replayed segment (≥3) recovers within 10 frames; trajectory continuity maintained.
**Duration**: 5 min.
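
The 10-frame recovery criterion can be checked per segment from the sequence of `tracking_state` values recorded after each gap. A minimal sketch, assuming states are logged as plain strings, one per frame; the function names are hypothetical.

```python
def frames_to_recover(states):
    """1-based index of the first NORMAL frame after a gap, or None."""
    for i, state in enumerate(states, start=1):
        if state == "NORMAL":
            return i
    return None


def all_segments_recover(segments, max_frames=10, min_segments=3):
    """True when at least min_segments ran and each recovered in time."""
    if len(segments) < min_segments:
        return False
    for states in segments:
        n = frames_to_recover(states)
        if n is None or n > max_frames:
            return False
    return True


segments = [
    ["LOST"] * 4 + ["NORMAL"],
    ["LOST"] * 9 + ["NORMAL"],
    ["DEGRADED", "NORMAL"],
]
print(all_segments_recover(segments))  # True
```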
---
### NFT-RES-08: cuVSLAM-degraded fall-back path
**Summary**: If cuVSLAM underperforms (tracking lost repeatedly), SUT degrades gracefully and emits `dead_reckoned` source label rather than producing wild estimates.
**Traces to**: AC-1.4, AC-3.x, R8 reframed. Tier: T1.
**Fault injection**: scripted cuVSLAM tracking loss for 30 s.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Force cuVSLAM tracking-loss for 30 s | source label switches to `dead_reckoned`; horiz_accuracy grows |
| 2 | After 30 s, restore cuVSLAM | source label returns to `vo_extrapolated` or `satellite_anchored` |
| 3 | Verify GPS_INPUT during the 30 s window does not contain wild jumps | per-frame Δposition ≤ IMU integration bound |
**Pass criteria**: source label correctly transitions; no wild jumps; behaviour reversible.
**Duration**: 60 s.
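
Step 3 compares each per-frame position change with an "IMU integration bound" that this document does not define. The sketch below assumes a simple kinematic bound, `v_max * dt + 0.5 * a_max * dt**2`, and positions already projected into a local east/north frame in metres; both the bound and the helper names are assumptions for illustration.

```python
import math


def max_step_m(dt_s, v_max_mps, a_max_mps2):
    """Assumed upper bound on displacement over dt_s seconds."""
    return v_max_mps * dt_s + 0.5 * a_max_mps2 * dt_s ** 2


def find_wild_jumps(track, v_max_mps, a_max_mps2):
    """Return indices of frames whose step from the previous frame is too large.

    track: list of (t_s, east_m, north_m) tuples in time order.
    """
    bad = []
    for i in range(1, len(track)):
        t0, e0, n0 = track[i - 1]
        t1, e1, n1 = track[i]
        step = math.hypot(e1 - e0, n1 - n0)
        if step > max_step_m(t1 - t0, v_max_mps, a_max_mps2):
            bad.append(i)
    return bad


# With v_max = 30 m/s and a_max = 5 m/s^2 the 1 s bound is 32.5 m,
# so the 100 m step into frame 2 is flagged.
track = [(0, 0, 0), (1, 20, 0), (2, 120, 0), (3, 140, 0)]
print(find_wild_jumps(track, v_max_mps=30.0, a_max_mps2=5.0))  # [2]
```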
---
### NFT-RES-09: Tile-cache corruption — graceful degradation
**Summary**: Corrupted MBTiles entry triggers reject + WARN, not a crash.
**Traces to**: AC-8.3, AC-3.x. Tier: T1.
**Fault injection**: overwrite a tile sidecar JSON with garbage between SUT runs.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Inject corruption | SUT logs WARN at cache-load |
| 2 | Replay frames over the affected sector | matcher does not consume the corrupt tile; falls through to next candidate |
| 3 | Check SUT process liveness | process does NOT crash; `tracking_state` may go DEGRADED for affected frames, then NORMAL |
**Pass criteria**: process alive; corrupt tile never produces `satellite_anchored`; recovery on next valid sector.
**Duration**: 60 s.
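
The reject-and-fall-through behaviour in steps 1 and 2 can be illustrated with a defensive sidecar loader. This is a sketch of the pattern only: the sidecar schema, the logger name, and the functions `load_sidecar` and `first_valid_candidate` are assumptions, not the SUT's actual cache code.

```python
import json
import logging

log = logging.getLogger("tile_cache")  # hypothetical logger name


def load_sidecar(raw_bytes):
    """Parse a tile sidecar; return a dict, or None (with a WARN) if corrupt."""
    try:
        meta = json.loads(raw_bytes.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        log.warning("rejecting corrupt tile sidecar: %s", exc)
        return None
    if not isinstance(meta, dict):
        log.warning("rejecting tile sidecar: top-level value is not an object")
        return None
    return meta


def first_valid_candidate(candidates):
    """Fall through corrupt candidates; return (tile_id, meta) or None."""
    for tile_id, raw in candidates:
        meta = load_sidecar(raw)
        if meta is not None:
            return tile_id, meta
    return None


candidates = [
    ("tile_a", b"\xff\x00garbage"),            # corrupt: not valid UTF-8
    ("tile_b", b'{"captured": "2024-01-01"}'),  # valid JSON object
]
print(first_valid_candidate(candidates))
```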
---
### NFT-RES-10: SITL F-T9 source-switching (AC-4.3 Option A)
**Summary**: ArduPilot SITL fuses GPS_INPUT correctly; failover to `EK3_SRC2_*` when primary unavailable.
**Traces to**: AC-4.3, F-T9 Option A. Tier: T3.
**Fault injection**: temporarily stop SUT GPS_INPUT emission for 5 s; observe FC failover.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | SUT stops emitting | FC EKF3 detects loss; switches to `EK3_SRC2_*=GPS` |
| 2 | Resume SUT emission | EKF3 switches back; no double-fusion (no #30076 / #32506 symptoms) |
**Pass criteria**: clean switch in both directions; EKF3 logs show no double-fusion symptoms.
**Duration**: 15 min.
---
### NFT-RES-11: MAVLink2 signing failure — FC rejects, SUT logs
**Summary**: When the runner sends a deliberately mis-signed GPS_INPUT, FC rejects and SUT/FC log the rejection.
**Traces to**: M-7, S-T1, F-T9 signing assertion. Tier: T3.
**Fault injection**: send a GPS_INPUT with valid schema but invalid signing tag.
**Steps**: see FT-N-14.
**Pass criteria**: FC ARM-rejects the message; STATUSTEXT WARN observable; FC continues on prior valid source.
**Duration**: 30 s.
---
### NFT-RES-12: Stale-tile rejection (AC-NEW-6)
**Summary**: Tile beyond freshness budget (or grace zone) is rejected — `satellite_anchored` source label NEVER produced from it.
**Traces to**: AC-8.2, AC-NEW-6, NF-T6. Tier: T1.
**Fault injection**: `stale_tile_scenarios` with ages 7 / 11 / 13 / 18 months for active-conflict + stable-rear sectors.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | For each combination, replay frames over the affected sector | matcher invocation either skipped or scored 0 |
| 2 | Assert source label of resulting GPS_INPUT | NEVER `satellite_anchored` from stale tile |
| 3 | Inspect confidence weight on tiles in the 30-day grace zone | linearly decayed per spec |
**Pass criteria**: as above.
**Duration**: 5 min.
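
Step 3 expects a linear decay of the confidence weight across the 30-day grace zone. The sketch below assumes the weight is 1.0 up to the freshness budget, falls linearly to 0.0 at the end of the grace zone, and stays 0.0 afterwards. The endpoints and the per-sector budget values are defined in the referenced spec, not here, so `budget_days` is a parameter and `tile_confidence` is a hypothetical name.

```python
def tile_confidence(age_days, budget_days, grace_days=30.0):
    """Assumed linear decay from 1.0 to 0.0 across the grace zone."""
    if age_days <= budget_days:
        return 1.0  # within the freshness budget
    over = age_days - budget_days
    if over >= grace_days:
        return 0.0  # past the grace zone: tile must be rejected
    return 1.0 - over / grace_days


# Illustrative 180-day budget (an assumption, not a spec value).
for age in (100, 195, 210, 400):
    print(age, tile_confidence(age, budget_days=180))
```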
---
### NFT-RES-13: F-T16 cloud-occlusion injection
**Summary**: Synthetic cloud occlusion on a fraction of frames does not cause cascading failure.
**Traces to**: F-T16, AC-3.x. Tier: T2 (`deferred-corpus`).
**Fault injection**: 30 % of frames in AerialVL S03 replay overlaid with synthetic cloud cover.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Run replay | matcher fails on cloud-occluded frames; pipeline degrades to `vo_extrapolated` |
| 2 | After cloud passes, satellite re-loc resumes | source returns to `satellite_anchored` |
**Pass criteria**: AC-1.1 / AC-1.2 still met on the non-cloud-frame subset; pipeline does not enter unrecoverable state.
**Duration**: 90 min.
---
### NFT-RES-14: 8-hour soak — no FDR rollover loss (AC-NEW-3)
**Summary**: Sustained 8 h replay; FDR caps at 64 GB and rolls over without silently dropping a payload class.
**Traces to**: AC-NEW-3, NF-T5. Tier: T4 (`deferred-hil`).
**Fault injection**: replay `synthetic_8h_load` continuously for 8 h.
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Run replay | FDR populates |
| 2 | Inspect at every hour boundary | size monotonic up to cap; rollover events logged |
| 3 | After 8 h | FDR ≤ 64 GB; all payload classes present (positions, IMU, GPS_INPUT, tlog, system health, mid-flight tiles, failure-thumbnail log) |
**Pass criteria**: ≤ 64 GB; all classes present in the latest segment; rollover events logged for any class that hit cap.
**Duration**: 8 h.
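
The hourly inspection and the final payload-class check can be combined into one assertion helper. A minimal sketch, assuming 1 GB = 10^9 bytes and that each hourly sample records the FDR size plus the number of rollover events logged in that hour; the class identifiers in `REQUIRED_CLASSES` are hypothetical labels for the classes listed in step 3.

```python
FDR_CAP_BYTES = 64 * 10**9  # assumption: decimal gigabytes

# Hypothetical identifiers for the payload classes named in step 3.
REQUIRED_CLASSES = {
    "positions", "imu", "gps_input", "tlog",
    "system_health", "mid_flight_tiles", "failure_thumbnail_log",
}


def check_fdr(hourly, latest_classes, cap_bytes=FDR_CAP_BYTES):
    """Return a list of problems; an empty list means the soak passed.

    hourly: list of (size_bytes, rollover_events_logged), one per hour.
    latest_classes: payload classes found in the latest FDR segment.
    """
    problems = []
    for hour, (size, rollovers) in enumerate(hourly, start=1):
        if size > cap_bytes:
            problems.append(f"hour {hour}: size {size} exceeds cap")
        # A shrinking FDR is only acceptable if a rollover was logged.
        if hour > 1 and size < hourly[hour - 2][0] and rollovers == 0:
            problems.append(f"hour {hour}: size shrank with no rollover logged")
    missing = REQUIRED_CLASSES - set(latest_classes)
    if missing:
        problems.append("missing payload classes: " + ", ".join(sorted(missing)))
    return problems


GB = 10**9
print(check_fdr([(10 * GB, 0), (40 * GB, 0), (64 * GB, 0), (60 * GB, 2)], REQUIRED_CLASSES))  # []
```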
---
### NFT-RES-15: AC-NEW-7 cache-poisoning Service-side voting
**Summary**: Single-flight onboard tile is NOT promoted to trusted basemap until ≥2 voting flights confirm.
**Traces to**: AC-NEW-7, F-T3. Tier: T1 (with `service-stub`).
**Fault injection** (cooperative): submit a single-flight tile with deliberately deflated EKF covariance.
**Steps**: see FT-N-17.
**Pass criteria**: candidate stays `trust_level=candidate`; promotion only after N≥2 voting; for active sectors, single-flight promotion only when σ_xy ≤ 3 m AND OSM-road-overlap ≥ 70 %.
**Duration**: 5 min.
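
The pass criteria above amount to a small promotion predicate. The sketch below is one reading of that rule, not the Service implementation: `may_promote` and its parameters are hypothetical, and the OSM road overlap is assumed to be expressed as a fraction in 0..1.

```python
def may_promote(votes, active_sector=False, sigma_xy_m=None, osm_overlap=None):
    """True when a candidate tile may leave trust_level=candidate.

    votes: number of distinct flights that confirmed the tile.
    Single-flight promotion is allowed only in active sectors, and only
    with sigma_xy <= 3 m AND OSM road overlap >= 70 %.
    """
    if votes >= 2:
        return True
    return (
        votes == 1
        and active_sector
        and sigma_xy_m is not None and sigma_xy_m <= 3.0
        and osm_overlap is not None and osm_overlap >= 0.70
    )


print(may_promote(1))                                                  # False
print(may_promote(2))                                                  # True
print(may_promote(1, active_sector=True, sigma_xy_m=2.5, osm_overlap=0.8))  # True
print(may_promote(1, active_sector=True, sigma_xy_m=2.5, osm_overlap=0.5))  # False
```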
---
### NFT-RES-16: ROS 2 topic-rate sanity (F-T19)
**Summary**: Under simulated load, all expected ROS 2 contract topics meet expected publish rates.
**Traces to**: F-T19, Q6 → A. Tier: T1 (uses ROS 2 sniffer that subscribes only to documented contract topics, treating internal topics as opaque).
**Fault injection**: synthetic load (load generator publishes pseudo-image frames at 3 fps + IMU at 200 Hz).
**Steps**: subscribe to `nav_msgs/Odometry` (cuVSLAM output), `sensor_msgs/Image` (camera input), `mavros/global_position/global` (FC bridge), `mavros/imu/data` (FC bridge).
**Pass criteria**: each contract topic publishes at expected rate ± 10 % over a 5 min window.
**Duration**: 5 min.
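
The ± 10 % criterion reduces to comparing a measured rate with the expected one over the observation window. A minimal sketch, assuming the sniffer reports a message count per topic; the expected rates per topic are not listed in this document, so the 3 Hz and 200 Hz figures below simply reuse the load-generator input rates as illustrative values.

```python
def rate_within_tolerance(msg_count, window_s, expected_hz, tol=0.10):
    """True when the measured publish rate is within tol of expected_hz."""
    measured_hz = msg_count / window_s
    return abs(measured_hz - expected_hz) <= tol * expected_hz


WINDOW_S = 300  # 5 min observation window
print(rate_within_tolerance(900, WINDOW_S, expected_hz=3.0))      # True
print(rate_within_tolerance(800, WINDOW_S, expected_hz=3.0))      # False
print(rate_within_tolerance(60_000, WINDOW_S, expected_hz=200.0))  # True
```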