

Resilience Tests

NFT-RES-01: FC IMU-only fallback after >3 s without estimate

Summary: Validates AC-5.2 — on >3 s without an estimate, the FC falls back to IMU-only dead reckoning AND the SUT logs the failure. Traces to: AC-5.2

Preconditions:

  • SUT in satellite_anchored steady state on Derkachi replay.
  • 4 s outage injector primed (replay paused for 4 s of wall-clock).

Fault injection:

  • Pause frame source for 4 s of wall-clock while FC IMU stream continues.

Steps:

| Step | Action | Expected Behavior |
|------|--------|-------------------|
| 1 | Mid-replay, halt frame delivery for 4 s | SUT continues emitting dead_reckoned estimates from FC IMU/attitude propagation |
| 2 | After 3 s without an emit (i.e. SUT internally fails to update for >3 s), SUT logs NO_ESTIMATE_TIMEOUT | FDR contains the log entry |
| 3 | Observe FC EKF source-set transition | EKF source-set transitions to internal IMU-only on the FC side per the FC's own failsafe logic (AP EKF_FAILSAFE or equivalent on iNav) |
| 4 | Resume frame delivery | SUT recovers; FC EKF source-set returns to companion-GPS source |

Pass criteria:

  • NO_ESTIMATE_TIMEOUT logged within 200 ms of the 3 s mark.
  • FC EKF source-set switches to internal IMU-only during the outage.
  • Recovery on resume happens within 5 emit cycles.
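
The pass criteria above can be checked mechanically once a run has been reduced to a handful of timestamps. The following is a minimal sketch, not the project's harness: the `Res01Observation` record and `check_res01` function are hypothetical names, and the sketch assumes "within 200 ms of the 3 s mark" means the log entry appears at or after the 3 s mark and no more than 200 ms later.

```python
from dataclasses import dataclass
from typing import List, Optional

TIMEOUT_S = 3.0          # AC-5.2: outage length that triggers the fallback
LOG_TOLERANCE_S = 0.200  # NO_ESTIMATE_TIMEOUT must appear within 200 ms of the 3 s mark
MAX_RECOVERY_CYCLES = 5  # recovery must happen within 5 emit cycles of resume


@dataclass
class Res01Observation:
    """Hypothetical per-run summary; all timestamps are in seconds."""
    last_update_t: float             # last successful SUT update before the outage
    timeout_log_t: Optional[float]   # time NO_ESTIMATE_TIMEOUT appeared in the FDR, if ever
    fc_imu_only_seen: bool           # FC EKF source-set switched to internal IMU-only
    recovery_cycles: Optional[int]   # emit cycles between resume and recovery, if recovered


def check_res01(obs: Res01Observation) -> List[str]:
    """Return a list of failed criteria; an empty list means the run passes."""
    failures = []
    mark = obs.last_update_t + TIMEOUT_S  # the "3 s mark"
    if obs.timeout_log_t is None:
        failures.append("NO_ESTIMATE_TIMEOUT was never logged")
    elif not (mark <= obs.timeout_log_t <= mark + LOG_TOLERANCE_S):
        failures.append("NO_ESTIMATE_TIMEOUT logged outside the 200 ms window")
    if not obs.fc_imu_only_seen:
        failures.append("FC EKF did not switch to IMU-only")
    if obs.recovery_cycles is None or obs.recovery_cycles > MAX_RECOVERY_CYCLES:
        failures.append("recovery took more than 5 emit cycles")
    return failures
```

Returning the list of failed criteria, rather than a single boolean, keeps the report specific about which criterion was missed.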

NFT-RES-02: Companion mid-flight reboot

Summary: Validates AC-5.3 — on companion reboot mid-flight, SUT re-initializes from FC's current IMU-extrapolated position. Traces to: AC-5.3

Preconditions:

  • SUT in steady state on Derkachi replay.
  • FC SITL has been running long enough to have a stable IMU-extrapolated pose.

Fault injection:

  • docker compose restart gps-denied-onboard mid-replay (or systemctl restart on Tier-2).

Steps:

| Step | Action | Expected Behavior |
|------|--------|-------------------|
| 1 | At t=120 s of replay, restart SUT container | SUT goes down and back up |
| 2 | Wait for first post-restart GPS_INPUT / MSP2_SENSOR_GPS arrival | First emit lat/lon within ±100 m of FC's IMU-extrapolated pose at boot-complete time |
| 3 | Observe TTFF post-reboot | Within AC-NEW-1 budget (<30 s p95) |

Pass criteria:

  • First post-restart emit is within ±100 m of the FC's IMU-extrapolated pose at boot-complete.
  • Cold-restart TTFF < 30 s.
  • No FC-side EKF divergence event during the gap.
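
The ±100 m and 30 s criteria reduce to a great-circle distance and a time difference. A minimal sketch follows, assuming positions are available as (lat, lon) pairs in decimal degrees and that a spherical-Earth haversine distance is accurate enough at this scale; the function names and the sample coordinates are illustrative, not taken from the project.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation
POSITION_TOLERANCE_M = 100.0  # first post-restart emit must be within ±100 m
TTFF_BUDGET_S = 30.0          # cold-restart TTFF must stay below 30 s


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def check_res02(first_emit, fc_pose, restart_t, first_emit_t):
    """Return (position_ok, ttff_ok) for one restart.

    first_emit / fc_pose are (lat, lon) tuples; times are in seconds.
    """
    distance = haversine_m(*first_emit, *fc_pose)
    ttff = first_emit_t - restart_t
    return distance <= POSITION_TOLERANCE_M, ttff < TTFF_BUDGET_S
```

For example, `check_res02((50.0005, 36.0), (50.0, 36.0), 120.0, 141.5)` compares an emit roughly 56 m north of the FC pose, arriving 21.5 s after the restart.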

NFT-RES-03: False-position safety budget Monte Carlo

Summary: Validates AC-NEW-4 false-position safety budget (P(error > 500 m) < 0.1%, P(error > 1 km) < 0.01%) on the available data + synthesis. PARTIAL — multi-flight statistics constrained by single Derkachi flight + 60 stills (see traceability matrix flag). Traces to: AC-NEW-4 (PARTIAL)

Preconditions:

  • Tier-1 acceptable (statistical rather than hardware-bound).
  • Pull together: 60 still-image runs (60 frames) + Derkachi replay (~14,700 frames at 30 fps OR resampled to ~870 frames at 3 Hz target). Total ≥930 frames per Monte Carlo iteration.
  • Run M=50 Monte Carlo iterations with synthetic perturbations (camera-pose noise, IMU bias drift, randomized tile sub-selection).

Fault injection:

  • Add per-iteration synthetic perturbations to mimic a population of independent flights.

Steps:

| Step | Action | Expected Behavior |
|------|--------|-------------------|
| 1 | Run M iterations end-to-end | Per-iteration error distribution captured |
| 2 | Aggregate across all iterations × frames | Per-frame error CDF |
| 3 | Read off P(error > 500 m) and P(error > 1 km) from CDF | Both values |

Pass criteria (PARTIAL):

  • P(error > 500 m) < 0.1%.
  • P(error > 1 km) < 0.01%.
  • Test FAILS-OPEN with explicit "PARTIAL" annotation in CSV report when iteration count is below the AC-NEW-4-implied ≥100 flights — noted as reduced confidence pending D-PROJ-3 (AerialVL S03 + own multi-flight data).
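
Reading the two tail probabilities off the empirical distribution amounts to counting exceedances. The sketch below is illustrative only: `tail_probabilities` and `evaluate_res03` are hypothetical names, and treating the Monte Carlo iteration count as the "flight" count for the PARTIAL annotation is an assumption drawn from the criterion above.

```python
THRESHOLDS_M = (500.0, 1000.0)
BUDGET = {500.0: 0.001, 1000.0: 0.0001}  # AC-NEW-4: < 0.1% and < 0.01%
REQUIRED_FLIGHTS = 100                   # AC-NEW-4-implied minimum


def tail_probabilities(errors_m):
    """Empirical P(error > t) for each threshold, over all iterations x frames."""
    n = len(errors_m)
    if n == 0:
        raise ValueError("no error samples")
    return {t: sum(e > t for e in errors_m) / n for t in THRESHOLDS_M}


def evaluate_res03(errors_m, n_iterations):
    """Return the tail probabilities, the budget verdict, and the report annotation."""
    probs = tail_probabilities(errors_m)
    within_budget = all(probs[t] < BUDGET[t] for t in THRESHOLDS_M)
    # Below the required flight count the result is reported but flagged PARTIAL.
    annotation = "PARTIAL" if n_iterations < REQUIRED_FLIGHTS else ""
    return {"probs": probs, "within_budget": within_budget, "annotation": annotation}
```

At the minimum sample size of M=50 iterations × 930 frames = 46,500 per-frame errors, the 0.01% budget corresponds to 4.65 samples, so at most 4 frames may exceed 1 km, and at most 46 may exceed 500 m.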

NFT-RES-04: Visual blackout + spoof degraded-mode escalation

Summary: Validates the AC-NEW-8 escalation ladder (5 s, 15 s, 35 s blackouts paired with spoof) including the 100 m / 500 m covariance thresholds and the 10 s GPS-health gate before recovery. Traces to: AC-NEW-8 (twin of FT-N-04 with extended duration window and covariance assertions)

Preconditions: Same as FT-N-04; Tier-1 acceptable.

Fault injection: blackout-spoof-derkachi 5 s / 15 s / 35 s windows + spoofed FC GPS for the same windows.

Steps:

| Step | Action | Expected Behavior |
|------|--------|-------------------|
| 1 | Begin 5 s window | Mode transition ≤ 400 ms; covariance grows monotonically; spoofed GPS rejected |
| 2 | At end of 5 s window, attempt recovery | Recovery only after FC GPS-health stable + non-spoofed for ≥10 s AND visual/satellite consistency check succeeds (gate enforced) |
| 3 | Begin 15 s window | Same as step 1, plus: when 95% covariance crosses 100 m, outbound MAVLink fix-quality is degraded to "2D fix or worse" |
| 4 | Begin 35 s window | Same as step 3, plus: when 95% covariance crosses 500 m OR blackout exceeds 30 s, horiz_accuracy=999.0 + VISUAL_BLACKOUT_FAILSAFE STATUSTEXT emitted |
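
The thresholds in steps 3 and 4 form a small ladder that a test oracle can encode directly. This is a sketch under assumptions: the `Escalation` enum and `expected_escalation` function are hypothetical, and "crosses" is read as strictly greater than the threshold.

```python
from enum import Enum


class Escalation(Enum):
    NOMINAL = "nominal"            # below both covariance thresholds
    FIX_DEGRADED = "fix_degraded"  # fix-quality reported as "2D fix or worse"
    FAILSAFE = "failsafe"          # horiz_accuracy=999.0 + VISUAL_BLACKOUT_FAILSAFE


def expected_escalation(cov95_m: float, blackout_s: float) -> Escalation:
    """Expected ladder rung for a given 95% covariance (m) and blackout duration (s)."""
    # Highest rung first: either condition alone is sufficient.
    if cov95_m > 500.0 or blackout_s > 30.0:
        return Escalation.FAILSAFE
    if cov95_m > 100.0:
        return Escalation.FIX_DEGRADED
    return Escalation.NOMINAL
```

Checking the failsafe condition first makes the OR in step 4 explicit: a 35 s blackout reaches the top rung even if the covariance is still below 500 m.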

Pass criteria:

  • The assertions in all four steps fire at their stated thresholds.
  • Recovery gate is honored — early recovery attempts (FC GPS healthy for <10 s) MUST NOT promote spoofed GPS back into the estimator.
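
The recovery gate can be modelled as a timer that restarts whenever FC GPS is unhealthy or flagged as spoofed. The class below is a minimal sketch for use as a test oracle, not the SUT's implementation; `RecoveryGate` and its parameters are hypothetical names.

```python
class RecoveryGate:
    """Allow recovery only after GPS has been healthy and non-spoofed for hold_s seconds."""

    def __init__(self, hold_s: float = 10.0):
        self.hold_s = hold_s
        self._healthy_since = None  # start of the current healthy streak, if any

    def update(self, t_s: float, gps_healthy: bool, gps_spoofed: bool,
               visual_consistent: bool) -> bool:
        """Feed one sample; return True if recovery is permitted at time t_s."""
        if gps_healthy and not gps_spoofed:
            if self._healthy_since is None:
                self._healthy_since = t_s
        else:
            # Any unhealthy or spoofed sample restarts the 10 s clock.
            self._healthy_since = None
        held = (self._healthy_since is not None
                and t_s - self._healthy_since >= self.hold_s)
        # Both conditions are required: the hold time AND the consistency check.
        return held and visual_consistent
```

Feeding the gate one sample per second, a healthy streak that starts at t=0 first permits recovery at t=10; a single spoofed sample at t=5 pushes the earliest recovery to t=16.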

Duration: ~10 min total for three windows.