Mirror of https://github.com/azaion/gps-denied-onboard.git (synced 2026-06-22 18:31:13 +00:00)
[AZ-424] [AZ-425] [AZ-426] Implement negatives set (FT-N-01/03/04)

Adds three pure-logic evaluators + scenarios + unit tests covering the project's failure-mode robustness ladder (AC-3.1, AC-3.4, AC-3.5, AC-NEW-8):

* outlier_tolerance_evaluator (AZ-424 / FT-N-01): per-event 50 m drift bound + 3-frame covariance-monotonic window over the AZ-408 outlier injector's medium-density manifest.
* outage_request_evaluator (AZ-425 / FT-N-03): detects 3+ consecutive missing-frame windows; validates OPERATOR_RELOC_REQUEST STATUSTEXT arrives at 2 s ±500 ms, dead_reckoned label during outage, and no FC EKF divergence.
* blackout_spoof_evaluator (AZ-426 / FT-N-04): eight-AC ladder across the 5 s / 15 s / 35 s sub-windows — switch latency, spoof rejection, monotonic covariance, honest horiz_accuracy, STATUSTEXT 1-2 Hz, 35 s escalation thresholds, and recovery gate.

Each scenario is skip-gated on the AZ-441 / AZ-407 / AZ-416 replay / SITL / mavproxy helpers; unit tests (14 + 18 + 29 = 61) cover the AC logic today.

Full e2e unit-test suite: 527 passed (+67).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,68 @@
# FT-N-01 — 350 m outlier injection tolerance

**Task**: AZ-424_ft_n_01_outlier_tolerance
**Name**: Tolerate up to 350 m outliers between consecutive frames; tilt up to ±20° (AC-3.1, RESTRICT-CAM-1)
**Description**: Implement FT-N-01 — Derkachi replay with `outlier-injection-derkachi` injector (medium density); SUT detects outlier; rejects from anchor; estimate continues from prior valid state; covariance grows monotonically; per-frame error_after_outlier ≤ error_before_outlier + 50 m.
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407, AZ-408
**Component**: Blackbox Tests / Negative (epic AZ-262)
**Tracker**: AZ-424
**Epic**: AZ-262 (E-BBT)

## Problem

Outlier tolerance (AC-3.1) is the project's primary failure-mode robustness measurement. Without this scenario the matcher's outlier-rejection is unmeasured.

## Outcome

- pytest scenario at `e2e/tests/negative/test_ft_n_01_outlier_tolerance.py`.
- Replays Derkachi with the `outlier.py --density medium` injector (every 10th frame replaced).
- For each non-outlier frame: computes `error_per_frame = vincenty(estimate, GT)`; tracks `error_before_outlier` and `error_after_outlier`.
- For each outlier event: asserts `error_after_outlier ≤ error_before_outlier + 50 m`.
- Tracks `cov_semi_major_m` across outlier events; asserts monotonic growth across the event.

## Scope

### Included
- `medium`-density injection.
- Per-frame error computation against GT.
- Per-event drift bound assertion.
- Per-event covariance-monotonic assertion.

### Excluded
- `light` and `heavy` densities — `medium` is the AC-3.1 canonical envelope.
- Tilt envelope assertion (camera ±20°) — derived directly from `RESTRICT-CAM-1` and verified at fixture-validation time, NOT exercised in this scenario.

## Acceptance Criteria

**AC-1: medium-density injection active**
Given the injector is configured `--density medium`
Then ≥10 outlier frames are injected over the Derkachi 8-min replay (1 in 10 of ≈1470 frames, i.e. ≈147 expected).

**AC-2: drift bound per outlier**
Given each outlier event
Then `error_after_outlier ≤ error_before_outlier + 50 m`.
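
The per-event bound in AC-2 can be sketched as a pure function. This is a minimal illustration, not the repository's `outlier_tolerance_evaluator`: the `OutlierEvent` record and the `drift_violations` helper are hypothetical names, and only the 50 m margin is taken from the text above.

```python
from dataclasses import dataclass

DRIFT_MARGIN_M = 50.0  # margin stated in AC-2


@dataclass(frozen=True)
class OutlierEvent:
    """Hypothetical per-event record; field names are assumptions."""
    frame_index: int
    error_before_m: float  # error on the last valid frame before the outlier
    error_after_m: float   # error on the first valid frame after the outlier


def drift_violations(events, margin_m=DRIFT_MARGIN_M):
    """Return the events where error_after > error_before + margin."""
    return [e for e in events if e.error_after_m > e.error_before_m + margin_m]


events = [
    OutlierEvent(10, error_before_m=12.0, error_after_m=40.0),  # within bound
    OutlierEvent(20, error_before_m=15.0, error_after_m=80.0),  # 65 m jump
]
violating_frames = [e.frame_index for e in drift_violations(events)]
```

A scenario would then assert that the returned list is empty and report the offending frame indices otherwise.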

**AC-3: covariance monotonic**
Given the per-frame `cov_semi_major_m` stream
Then for every outlier event, `cov_semi_major_m` is non-decreasing across the 3-frame window centred on the outlier.
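
One way to read the 3-frame window in AC-3 is the frame before the outlier, the outlier frame itself, and the frame after. The sketch below assumes that reading, and also assumes the window is clipped at the ends of the stream; `window_is_non_decreasing` is a hypothetical helper name.

```python
def window_is_non_decreasing(cov_semi_major_m, outlier_index, half_width=1):
    """Check that covariance never shrinks inside the window around an outlier.

    cov_semi_major_m: per-frame semi-major axis values in metres.
    half_width=1 gives the 3-frame window (before, outlier, after).
    The window is clipped at the stream boundaries (an assumption).
    """
    lo = max(0, outlier_index - half_width)
    hi = min(len(cov_semi_major_m), outlier_index + half_width + 1)
    window = cov_semi_major_m[lo:hi]
    return all(a <= b for a, b in zip(window, window[1:]))


cov = [1.0, 1.2, 1.5, 1.4, 2.0]
ok_event = window_is_non_decreasing(cov, outlier_index=1)   # window 1.0, 1.2, 1.5
bad_event = window_is_non_decreasing(cov, outlier_index=3)  # window 1.5, 1.4, 2.0
```

Equal consecutive values pass, which matches "non-decreasing" rather than strictly increasing.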

**AC-4: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`.

## System Under Test Boundary

End-to-end through public boundaries.

- **Allowed**: outbound estimate stream, FDR per-frame records.
- **Forbidden**: importing C3 matcher state, stubbing the outlier-detection threshold.

## Constraints

- The injector's per-frame replacement is geo-located via the tile-cache GT; ≥99 % of replacements are >350 m from the original frame's GT centre (per AZ-408 AC-2).

## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-N-01
- `_docs/02_document/tests/test-data.md` § Resilience (FT-N-01 row)
@@ -0,0 +1,71 @@
# FT-N-03 — Extended outage triggers operator re-loc request

**Task**: AZ-425_ft_n_03_outage_reloc
**Name**: ≥3 consecutive frames AND ≥2 s without estimate → STATUSTEXT `OPERATOR_RELOC_REQUEST` + `dead_reckoned` propagation (AC-3.4)
**Description**: Implement FT-N-03 — Derkachi replay with synthetic 3-frame outage injector; SUT fails to produce estimates for 3+ frames; after ≥2 s, STATUSTEXT containing `OPERATOR_RELOC_REQUEST` is emitted to mavproxy listener; estimates labeled `dead_reckoned` continue; FC uses last-known + IMU extrapolation.
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407, AZ-408
**Component**: Blackbox Tests / Negative / Resilience (epic AZ-262)
**Tracker**: AZ-425
**Epic**: AZ-262 (E-BBT)

## Problem

The operator must be alerted when the SUT enters a sustained no-estimate state — without a STATUSTEXT-based signal the mission is silently degraded. AC-3.4 is the operator-experience cornerstone of the failure-mode contract.

## Outcome

- pytest scenario at `e2e/tests/negative/test_ft_n_03_outage_reloc.py`.
- Replays Derkachi with a 3-consecutive-frame failure injector (corrupt frames force C3 matcher to fail).
- After 2 s of no SUT estimate: assert STATUSTEXT containing `OPERATOR_RELOC_REQUEST` is captured in mavproxy `.tlog`.
- During outage: assert outbound estimates carry `source_label = dead_reckoned` (FC IMU-extrapolated propagation continues).

## Scope

### Included
- 3-frame failure injector (a thin extension of `injectors/outlier.py` that emits all-zero frames instead of crops).
- mavproxy listener regex match on `OPERATOR_RELOC_REQUEST`.
- `dead_reckoned`-label observation during outage.

### Excluded
- 5/15/35 s blackout-with-spoof — owned by FT-N-04 (AZ-426).
- Multi-segment re-loc — owned by FT-P-08 (AZ-415).

## Acceptance Criteria

**AC-1: outage onset**
Given the injector emits 3 consecutive corrupt frames
Then the SUT fails to produce estimates for ≥3 frames (observable via gap in outbound stream).
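
Detecting the gap in AC-1 amounts to finding runs of three or more frames with no outbound estimate. The following is a rough sketch under the assumption that the runner can build a per-frame presence flag from the outbound stream; `outage_windows` is a hypothetical name, not the repository's evaluator.

```python
def outage_windows(has_estimate, min_frames=3):
    """Return (start, end_exclusive) index pairs for runs of missing estimates.

    has_estimate: one boolean per replayed frame, True when the SUT emitted
    an estimate for that frame (how the flag is derived is an assumption).
    Only runs of at least min_frames consecutive misses are reported.
    """
    windows = []
    run_start = None
    for i, present in enumerate(has_estimate):
        if not present and run_start is None:
            run_start = i  # a new run of missing frames begins
        elif present and run_start is not None:
            if i - run_start >= min_frames:
                windows.append((run_start, i))
            run_start = None
    # A run that lasts until the end of the replay still counts.
    if run_start is not None and len(has_estimate) - run_start >= min_frames:
        windows.append((run_start, len(has_estimate)))
    return windows


flags = [True, False, False, True, False, False, False, True,
         False, False, False, False]
windows = outage_windows(flags)
```

The two-frame gap at indices 1 and 2 is ignored, while the three-frame and four-frame gaps are reported.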

**AC-2: STATUSTEXT emission**
Given ≥2 s elapse without an SUT estimate
Then mavproxy `.tlog` contains a STATUSTEXT with payload matching regex `OPERATOR_RELOC_REQUEST` within 500 ms of the 2 s mark.
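
The timing check in AC-2 can be illustrated on already-decoded messages. This sketch assumes the `.tlog` has been parsed elsewhere into `(timestamp_s, text)` pairs sorted by time; parsing itself is out of scope here, and `reloc_request_on_time` is a hypothetical helper. The first matching message after outage onset decides the result.

```python
import re

RELOC_PATTERN = re.compile(r"OPERATOR_RELOC_REQUEST")  # case-sensitive substring
THRESHOLD_S = 2.0   # AC-3.4 threshold
TOLERANCE_S = 0.5   # test tolerance around the threshold


def reloc_request_on_time(outage_start_s, statustexts):
    """Check that the first re-loc request lands at 2 s +/- 500 ms after onset.

    statustexts: iterable of (timestamp_s, text) pairs, sorted by timestamp
    (an assumed, pre-decoded form of the mavproxy capture).
    """
    for ts, text in statustexts:
        if ts >= outage_start_s and RELOC_PATTERN.search(text):
            return abs((ts - outage_start_s) - THRESHOLD_S) <= TOLERANCE_S
    return False  # no matching STATUSTEXT at all


on_time = reloc_request_on_time(10.0, [(12.4, "OPERATOR_RELOC_REQUEST seg=1")])
too_late = reloc_request_on_time(10.0, [(13.0, "OPERATOR_RELOC_REQUEST")])
wrong_case = reloc_request_on_time(10.0, [(12.0, "operator_reloc_request")])
```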

**AC-3: dead_reckoned during outage**
Given the outage window
Then outbound emissions during the window carry `source_label = dead_reckoned` (estimates continue, just IMU-extrapolated).

**AC-4: FC IMU-only continues**
Given the `dead_reckoned` emissions reach the FC
Then the FC's EKF uses last-known + IMU extrapolation (observable via SITL state read; no EKF divergence event).

**AC-5: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`.

## System Under Test Boundary

End-to-end through public boundaries.

- **Allowed**: mavproxy listener `.tlog`, outbound message stream, SITL state read.
- **Forbidden**: monkeypatching the `OPERATOR_RELOC_REQUEST` emitter, stubbing the no-estimate detector.

## Constraints

- The STATUSTEXT regex is `OPERATOR_RELOC_REQUEST` (exact substring match within the STATUSTEXT payload; case-sensitive).
- The 2 s threshold is per AC-3.4; the test's tolerance is ±500 ms around that threshold.

## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-N-03
- `_docs/02_document/tests/test-data.md` § Resilience (FT-N-03 row)
@@ -0,0 +1,94 @@
# FT-N-04 — Visual blackout + spoofed GPS combined failsafe

**Task**: AZ-426_ft_n_04_blackout_spoof
**Name**: AC-3.5 + AC-NEW-8 combined failsafe — switch label, reject spoof, propagate, monotonic covariance, STATUSTEXT (AC-3.5, AC-NEW-8)
**Description**: Implement FT-N-04 — three sub-cases at 5 s / 15 s / 35 s blackout windows paired with FC GPS spoof; assert mode transition ≤400 ms or ≤1 frame; spoofed GPS rejected; covariance monotonic; honest `horiz_accuracy`; `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT at 1-2 Hz; 35 s window adds covariance-threshold escalations.
**Complexity**: 5 points
**Dependencies**: AZ-406, AZ-407, AZ-408
**Component**: Blackbox Tests / Negative / Security (epic AZ-262)
**Tracker**: AZ-426
**Epic**: AZ-262 (E-BBT)

## Problem

The combined "visual-blackout + spoofed-GPS" failsafe is the project's most security-critical degradation mode. AC-3.5 + AC-NEW-8 prescribe a multi-step ladder (label switch, spoof rejection, monotonic covariance, honest reporting, escalation thresholds) that is genuinely hard to validate end-to-end, so it must be exercised with synthetic spoof injection.

## Outcome

- pytest scenario at `e2e/tests/negative/test_ft_n_04_blackout_spoof.py` with three sub-tests (5 s, 15 s, 35 s windows) using `blackout_spoof.py`.
- For every sub-case asserts:
  - Within ≤1 frame OR ≤400 ms of blackout-onset: `source_label = dead_reckoned`; spoofed GPS rejected; covariance grows monotonically.
  - `horiz_accuracy ≥ 0.95 × cov_semi_major_m` (no under-reporting).
  - GCS receives `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT at 1-2 Hz throughout the blackout.
- For the 35 s window only:
  - When 95 % covariance crosses 100 m → fix-quality degraded ("2D fix or worse" in MAVLink fix_type).
  - When 95 % covariance crosses 500 m OR blackout exceeds 30 s → `horiz_accuracy=999.0` AND `VISUAL_BLACKOUT_FAILSAFE` STATUSTEXT.
- After blackout end: recovery only after FC GPS-health stable + non-spoofed for ≥10 s AND a visual/satellite consistency check succeeds.

## Scope

### Included
- All three sub-cases (5 s, 15 s, 35 s).
- Multi-step ladder assertions per sub-case.
- 35 s window covariance-threshold escalations.
- Recovery-gate assertion (≥10 s of stable non-spoofed FC GPS before re-promotion).

### Excluded
- Pure-blackout (no spoof) — owned by NFT-RES-01 / FT-N-03.
- Cache-poisoning safety — owned by NFT-SEC-01.

## Acceptance Criteria

**AC-1: switch latency**
Given any blackout-onset event
Then within ≤1 frame OR ≤400 ms (whichever is shorter for the run's frame rate), `source_label = dead_reckoned`.
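
The "whichever is shorter" rule in AC-1 reduces to taking the minimum of one frame period and 400 ms. A small sketch, assuming the runner knows the replay frame rate and the timestamp of the first `dead_reckoned` emission; both helper names are hypothetical.

```python
def switch_budget_s(frame_rate_hz, cap_s=0.4):
    """Latency budget: one frame period or 400 ms, whichever is shorter."""
    return min(1.0 / frame_rate_hz, cap_s)


def switched_in_time(onset_s, first_dead_reckoned_s, frame_rate_hz):
    """True when the label switch happens inside the budget after onset."""
    latency_s = first_dead_reckoned_s - onset_s
    return 0.0 <= latency_s <= switch_budget_s(frame_rate_hz)


slow_camera = switch_budget_s(2.0)  # period 0.5 s, so the 400 ms cap applies
fast_camera = switch_budget_s(5.0)  # period 0.2 s, so one frame applies
```

At 2 Hz a switch 250 ms after onset passes; at 5 Hz the same 250 ms switch is already more than one frame late and fails.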

**AC-2: spoof rejection**
Given the FC inbound proxy injects spoofed GPS during the window
Then the SUT does NOT consume the spoofed GPS into the estimator (verifiable via FDR `spoof-rejected` events).

**AC-3: monotonic covariance**
Given the per-frame `cov_semi_major_m` stream within the blackout
Then `cov_semi_major_m` is non-decreasing across consecutive emissions.

**AC-4: honest horiz_accuracy**
Given the outbound `GPS_INPUT.horiz_accuracy` (AP) field
Then `horiz_accuracy ≥ 0.95 × cov_semi_major_m` for every emission within the blackout.
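
The honesty check in AC-4 is a per-emission comparison. The sketch below assumes each emission has already been paired with the covariance value from the same frame; `under_reported` is a hypothetical helper that returns the offending samples so a failure message can list them.

```python
HONESTY_RATIO = 0.95  # factor stated in AC-4


def under_reported(samples, ratio=HONESTY_RATIO):
    """Return samples where horiz_accuracy < ratio * cov_semi_major_m.

    samples: iterable of (horiz_accuracy_m, cov_semi_major_m) pairs taken
    from emissions inside the blackout window (pairing is an assumption).
    """
    return [(h, c) for h, c in samples if h < ratio * c]


samples = [
    (10.0, 10.0),  # reported equals covariance
    (9.6, 10.0),   # slightly below, still above the 0.95 floor
    (9.0, 10.0),   # under-reported
    (50.0, 40.0),  # conservative report
]
offenders = under_reported(samples)
```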

**AC-5: STATUSTEXT 1-2 Hz**
Given the GCS-side mavproxy capture
Then `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT messages are emitted at a rate in `[1, 2]` Hz throughout the blackout.
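
"A rate in `[1, 2]` Hz throughout the blackout" can be interpreted in more than one way. The sketch below takes the strict reading, in which every gap between consecutive messages must lie between 0.5 s and 1.0 s; an average-rate reading would be looser. The helper name and the timestamp-list input are assumptions.

```python
def rate_within_band(timestamps_s, min_hz=1.0, max_hz=2.0):
    """True when every inter-message gap corresponds to a rate in the band.

    timestamps_s: sorted arrival times of VISUAL_BLACKOUT_IMU_ONLY messages
    inside the blackout window. Fewer than two messages cannot establish a
    rate, so that case is treated as a failure (an assumption).
    """
    if len(timestamps_s) < 2:
        return False
    gaps = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    return all(1.0 / max_hz <= g <= 1.0 / min_hz for g in gaps)


steady = rate_within_band([0.0, 0.75, 1.5, 2.25])  # 0.75 s gaps
too_fast = rate_within_band([0.0, 0.25, 0.5])      # 4 Hz
too_slow = rate_within_band([0.0, 1.5, 3.0])       # about 0.67 Hz
```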

**AC-6 (35 s only): 100 m covariance escalation**
Given the 35 s window
When 95 % covariance crosses 100 m
Then outbound MAVLink reports fix-quality degraded (2D fix or worse).

**AC-7 (35 s only): 500 m / 30 s escalation**
When 95 % covariance crosses 500 m OR blackout exceeds 30 s
Then `horiz_accuracy=999.0` AND `VISUAL_BLACKOUT_FAILSAFE` STATUSTEXT are emitted within 500 ms of the threshold crossing.
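
AC-6 and AC-7 together describe a small decision table over the 95 % covariance and the elapsed blackout time. A sketch of the expected state, assuming "crosses" and "exceeds" mean strictly greater than; the `Escalation` enum and function name are illustrative only.

```python
from enum import Enum


class Escalation(Enum):
    NOMINAL = 0
    DEGRADED_FIX = 1  # AC-6: fix quality reported as 2D fix or worse
    FAILSAFE = 2      # AC-7: horiz_accuracy=999.0 + VISUAL_BLACKOUT_FAILSAFE


def expected_escalation(cov95_m, blackout_elapsed_s):
    """Map covariance and elapsed blackout time to the expected level."""
    if cov95_m > 500.0 or blackout_elapsed_s > 30.0:
        return Escalation.FAILSAFE
    if cov95_m > 100.0:
        return Escalation.DEGRADED_FIX
    return Escalation.NOMINAL


early = expected_escalation(cov95_m=60.0, blackout_elapsed_s=5.0)
degraded = expected_escalation(cov95_m=150.0, blackout_elapsed_s=20.0)
timed_out = expected_escalation(cov95_m=150.0, blackout_elapsed_s=31.0)
```

The failsafe branch is checked first so that the 30 s timeout wins even when covariance is still below 500 m.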

**AC-8: recovery gate**
Given blackout ends and FC GPS-health is restored
Then recovery to `satellite_anchored` only after: (a) FC GPS-health stable + non-spoofed for ≥10 s AND (b) a visual/satellite consistency check succeeds.
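
The two-part gate in AC-8 can be expressed as a single predicate. This is a sketch under assumed inputs: the runner is taken to track when FC GPS health last became stable, whether any spoof was detected since then, and the outcome of the consistency check. None of these parameter names come from the source.

```python
RECOVERY_HOLD_S = 10.0  # minimum stable, non-spoofed interval from AC-8


def recovery_allowed(gps_stable_since_s, now_s, spoof_detected,
                     consistency_ok, hold_s=RECOVERY_HOLD_S):
    """True only when both recovery conditions (a) and (b) hold.

    gps_stable_since_s: time FC GPS health last became stable, or None if
    it is not currently stable. spoof_detected: any spoof seen since then.
    """
    if gps_stable_since_s is None or spoof_detected:
        return False
    held_long_enough = (now_s - gps_stable_since_s) >= hold_s
    return held_long_enough and consistency_ok


too_early = recovery_allowed(100.0, 105.0, False, True)
no_consistency = recovery_allowed(100.0, 112.0, False, False)
recovered = recovery_allowed(100.0, 112.0, False, True)
```

A scenario would assert that no `satellite_anchored` emission appears while this predicate is false.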

**AC-9: parameterization**
Given conftest parameterization
Then all sub-cases run per `(fc_adapter, vio_strategy)` for AP (where signing is in play); `fc_adapter=inav` exercises the same ladder minus the AP-specific signing context.

## System Under Test Boundary

End-to-end through public boundaries.

- **Allowed**: blackout_spoof injector (a public-input fault), FDR `spoof-rejected` events, mavproxy STATUSTEXT capture, outbound `horiz_accuracy` field.
- **Forbidden**: stubbing the spoof detector, monkeypatching the source-label state machine.

## Constraints

- All AC numerical thresholds match the AC-3.5 / AC-NEW-8 text exactly; deviation is a real defect signal.
- The recovery-gate ≥10 s is wall-clock measured by the runner.

## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-N-04
- `_docs/02_document/tests/test-data.md` § Resilience (FT-N-04 row)