mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 06:01:12 +00:00
[AZ-424] [AZ-425] [AZ-426] Implement negatives set (FT-N-01/03/04)
Adds three pure-logic evaluators + scenarios + unit tests covering the
project's failure-mode robustness ladder (AC-3.1, AC-3.4, AC-3.5,
AC-NEW-8):

* outlier_tolerance_evaluator (AZ-424 / FT-N-01): per-event 50 m drift
  bound + 3-frame covariance-monotonic window over the AZ-408 outlier
  injector's medium-density manifest.
* outage_request_evaluator (AZ-425 / FT-N-03): detects 3+ consecutive
  missing-frame windows; validates OPERATOR_RELOC_REQUEST STATUSTEXT
  arrives at 2 s ±500 ms, dead_reckoned label during outage, and no FC
  EKF divergence.
* blackout_spoof_evaluator (AZ-426 / FT-N-04): eight-AC ladder across
  the 5 s / 15 s / 35 s sub-windows — switch latency, spoof rejection,
  monotonic covariance, honest horiz_accuracy, STATUSTEXT 1-2 Hz, 35 s
  escalation thresholds, and recovery gate.

Each scenario is skip-gated on the AZ-441 / AZ-407 / AZ-416 replay /
SITL / mavproxy helpers; unit tests (14 + 18 + 29 = 61) cover the AC
logic today.

Full e2e unit-test suite: 527 passed (+67).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,141 @@
# Batch 73 Report — Test Implementation (cycle 1, batch 7 of test phase)

**Batch**: 73
**Date**: 2026-05-17
**Context**: Test implementation (greenfield Step 10 — Implement Tests)
**Tasks**: AZ-424 (3pt), AZ-425 (3pt), AZ-426 (5pt) — 11 cp / 3 tasks
**Cycle**: 1
**Verdict**: COMPLETE — PASS (self-reviewed; see `reviews/batch_73_review.md`)

## Summary

This batch implements the negatives set (FT-N-01 / FT-N-03 / FT-N-04),
the project's failure-mode robustness suite (AC-3.1, AC-3.4, AC-3.5,
AC-NEW-8). It follows the same pattern as the prior batches in this phase:

* Pure-logic evaluator under `e2e/runner/helpers/` (everything the
  scenario can express without docker-bound SITL access).
* Scenario file under `e2e/tests/negative/`, parameterised across
  conftest fixtures, skip-gated on upstream replay / FDR / mavproxy
  / SITL observer helpers (auto-activates when AZ-441 + AZ-407 +
  AZ-416 leftovers land).
* Helper-driven unit test file under `e2e/_unit_tests/helpers/`.

### AZ-424 — FT-N-01 350 m outlier injection tolerance (3pt)

* **`runner/helpers/outlier_tolerance_evaluator.py`** — three
  invariants:
  - AC-1: count gate — at least `MIN_OUTLIER_COUNT = 10` outliers across
    the Derkachi 8-min `--density medium` replay (the AC-3.1 envelope).
  - AC-2: per-event drift bound —
    `error_after_outlier − error_before_outlier ≤ DRIFT_BUDGET_M = 50.0`.
    `before` / `after` are the immediate neighbour frames in the
    outbound stream; `distance_m` is the shared Vincenty helper.
  - AC-3: covariance monotonic across the 3-frame window centred on
    the outlier (`COVARIANCE_WINDOW_FRAMES = 3`).
  - Plus `load_outlier_manifest` (reads the AZ-408 injector's
    `manifest.csv`) and `write_csv_evidence`.
* **`tests/negative/test_ft_n_01_outlier_tolerance.py`** — scenario
  indirect-parametrises `outlier_injection_derkachi` at
  `density="medium", seed=0`, drives replay, collects FDR
  `outbound_estimate` records, joins them to per-frame GT, evaluates,
  asserts per-event `passes_drift` + `passes_covariance` plus the
  aggregate `passes_count`. Records NFR metrics
  `ft_n_01.total_outliers`, `ft_n_01.failed_event_count`, per-event
  `drift_m` + `cov_non_decreasing`.
* **14 unit tests** in `test_outlier_tolerance_evaluator.py`.
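
The AC-2 and AC-3 checks described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the module's actual code: `EventCheck` and `check_event` are hypothetical names, the inputs are assumed to be pre-computed horizontal errors in metres, and treating a missing neighbour as a drift failure is an assumption (the report only indicates that the drift is `None` in that case).

```python
from dataclasses import dataclass
from typing import Optional, Sequence

DRIFT_BUDGET_M = 50.0          # AC-2 budget quoted in the report
COVARIANCE_WINDOW_FRAMES = 3   # AC-3 window size quoted in the report


@dataclass(frozen=True)
class EventCheck:
    """Hypothetical per-event result; the real report dataclass may differ."""
    drift_m: Optional[float]
    passes_drift: bool
    passes_covariance: bool


def check_event(error_before_m: Optional[float],
                error_after_m: Optional[float],
                cov_window: Sequence[float]) -> EventCheck:
    # AC-2: drift is the change in horizontal error across the outlier.
    if error_before_m is None or error_after_m is None:
        drift = None
        passes_drift = False  # assumption: a missing neighbour cannot pass
    else:
        drift = error_after_m - error_before_m
        passes_drift = drift <= DRIFT_BUDGET_M

    # AC-3: covariance must be non-decreasing over the full window;
    # a flat window passes, any decrease fails.
    full_window = len(cov_window) == COVARIANCE_WINDOW_FRAMES
    non_decreasing = all(a <= b for a, b in zip(cov_window, cov_window[1:]))
    return EventCheck(drift, passes_drift, full_window and non_decreasing)


ok = check_event(10.0, 40.0, [1.0, 2.0, 3.0])
over_budget = check_event(10.0, 60.5, [1.0, 1.0, 1.0])
shrinking = check_event(10.0, 20.0, [3.0, 2.0, 4.0])
```

Keeping the check as a pure function over numbers is what lets the unit tests exercise the AC logic while the replay-driven scenario remains skip-gated.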

### AZ-425 — FT-N-03 Extended outage triggers operator re-loc request (3pt)

* **`runner/helpers/outage_request_evaluator.py`** — first detects
  outage windows from frame-index gaps (≥`MIN_OUTAGE_FRAMES = 3`
  consecutive missing frames), then per-window evaluates:
  - AC-2: STATUSTEXT `OPERATOR_RELOC_REQUEST` observed at
    `[OUTAGE_THRESHOLD_S − TOLERANCE_S, OUTAGE_THRESHOLD_S + TOLERANCE_S] = [1.5, 2.5] s`
    after outage onset.
  - AC-3: at least one `source_label = dead_reckoned` outbound
    emission inside the window.
  - AC-4: zero FC-side EKF divergence events inside the window
    (observable via SITL state read).
  - Plus `detect_outage_windows` (with explicit handling for trailing
    windows + multi-window flights) and `write_csv_evidence`.
* **`tests/negative/test_ft_n_03_outage_reloc.py`** — scenario drives
  replay with a 3-frame outage injector (a future thin extension of
  the AZ-408 outlier injector), reads FDR `frame_received` +
  `outbound_estimate` records to reconstruct
  `expected_frame_indices` and the estimate stream, walks the
  mavproxy `.tlog` for STATUSTEXT, and pulls EKF divergence events
  via `sitl_observer.read_ekf_divergence_events()`. Records per-window
  NFR metrics with AC IDs (`length_frames`, `statustext_offset_ms`,
  `dead_reckoned_count`, `ekf_divergence_count`).
* **18 unit tests** in `test_outage_request_evaluator.py`.
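
As a rough illustration of the gap detection and the AC-2 timing check described above, the sketch below finds runs of missing frame indices and tests a STATUSTEXT offset against the `[1.5, 2.5] s` band. The function signatures, the `(first_missing, last_missing)` tuple shape, and the name `statustext_in_tolerance` are assumptions made for the example; only the constant names and values come from the report.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

MIN_OUTAGE_FRAMES = 3      # quoted in the report
OUTAGE_THRESHOLD_S = 2.0   # quoted in the report
TOLERANCE_S = 0.5          # quoted in the report


def detect_outage_windows(expected: Sequence[int],
                          received: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (first_missing, last_missing) for each run of >= 3 missing frames."""
    got = set(received)
    windows: List[Tuple[int, int]] = []
    run: List[int] = []
    for idx in expected:
        if idx in got:
            if len(run) >= MIN_OUTAGE_FRAMES:
                windows.append((run[0], run[-1]))
            run = []
        else:
            run.append(idx)
    # A run that is still open when the flight ends is a trailing window.
    if len(run) >= MIN_OUTAGE_FRAMES:
        windows.append((run[0], run[-1]))
    return windows


def statustext_in_tolerance(offset_s: Optional[float]) -> bool:
    """AC-2: the request must arrive 2 s +/- 500 ms after outage onset."""
    if offset_s is None:  # no STATUSTEXT observed at all
        return False
    low = OUTAGE_THRESHOLD_S - TOLERANCE_S
    high = OUTAGE_THRESHOLD_S + TOLERANCE_S
    return low <= offset_s <= high


multi = detect_outage_windows(range(12), [0, 4, 5, 9])
trailing = detect_outage_windows(range(10), [0, 1, 2, 3, 4, 5, 6])
```

In this example `multi` contains two windows and ignores the two-frame gap at the end, while `trailing` shows the end-of-flight case that the report says is handled explicitly.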

### AZ-426 — FT-N-04 Visual blackout + spoofed GPS combined failsafe (5pt)

* **`runner/helpers/blackout_spoof_evaluator.py`** — the most
  ladder-heavy evaluator in the project: eight per-AC sub-reports
  stitched into one `BlackoutSpoofReport`. Constants pulled into the
  module header so the spec can be diffed against code in one place:
  `SWITCH_LATENCY_MS = 400` (AC-1),
  `HONEST_ACCURACY_RATIO = 0.95` (AC-4),
  `STATUSTEXT_RATE_MIN_HZ = 1.0` / `STATUSTEXT_RATE_MAX_HZ = 2.0` (AC-5),
  `ESCALATION_COV_2D_M = 100.0` (AC-6),
  `ESCALATION_COV_FAILSAFE_M = 500.0`, `ESCALATION_DURATION_FAILSAFE_S = 30.0`,
  `ESCALATION_LATENCY_MS = 500` (AC-7),
  `RECOVERY_STABLE_S = 10.0` (AC-8).
  Per-AC analysers:
  - `evaluate_switch_latency`: budget =
    `min(SWITCH_LATENCY_MS, frame_period_ms)` — the spec's "≤1 frame OR
    ≤400 ms (whichever is shorter)" wording, made explicit.
  - `evaluate_spoof_rejection`: requires both ≥1 FDR
    `spoof-rejected` event AND zero `satellite_anchored` emissions
    inside the window (so the SUT cannot silently re-promote on a
    spoofed lock).
  - `evaluate_covariance_monotonic`: reports the timestamp of the first
    violation of the non-decreasing invariant, plus a binary pass flag.
  - `evaluate_honest_accuracy`: per-sample
    `horiz_accuracy ≥ 0.95 × cov_semi_major_m`. Boundary test pins the
    spec budget.
  - `evaluate_statustext_rate`: `VISUAL_BLACKOUT_IMU_ONLY` rate over
    the window must land in [1, 2] Hz.
  - `evaluate_escalation` (35 s window only): AC-6 fix_type degrades
    on the first cov-100 m crossing; AC-7 triggers on the earliest
    of cov-500 m crossing OR 30 s duration. Non-35 s windows pass
    vacuously — they aren't expected to hit either threshold.
  - `evaluate_recovery_gate`: AC-8 — ≥10 s of healthy + non-spoofed
    FC GPS + a consistency-check pass before re-promoting to
    `satellite_anchored` post-window.
* **`tests/negative/test_ft_n_04_blackout_spoof.py`** — scenario
  indirect-parametrises `blackout_spoof_derkachi` over
  `_WINDOW_LADDER_S = (5.0, 15.0, 35.0)` with ids `["5s", "15s", "35s"]`.
  Collects FDR `outbound_estimate` + `spoof_rejected`,
  mavproxy STATUSTEXT, and SITL GPS-health + consistency-check
  samples. Asserts each AC with a descriptive failure message that
  surfaces the relevant sub-report fields.
* **29 unit tests** in `test_blackout_spoof_evaluator.py`.
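
Three of the rules above reduce to short arithmetic, sketched here for orientation. The function names and signatures are hypothetical, the rate is assumed to be message count divided by window length, and the use of `>=` at the 500 m and 30 s thresholds is an assumption; the constants are the ones listed in the module header.

```python
from typing import Optional, Sequence, Tuple

SWITCH_LATENCY_MS = 400                 # AC-1
STATUSTEXT_RATE_MIN_HZ = 1.0            # AC-5
STATUSTEXT_RATE_MAX_HZ = 2.0            # AC-5
ESCALATION_COV_FAILSAFE_M = 500.0       # AC-7
ESCALATION_DURATION_FAILSAFE_S = 30.0   # AC-7


def switch_latency_budget_ms(frame_period_ms: float) -> float:
    """AC-1: one frame or 400 ms, whichever is shorter."""
    return min(SWITCH_LATENCY_MS, frame_period_ms)


def statustext_rate_ok(message_count: int, window_s: float) -> bool:
    """AC-5: average rate over the window must land in [1, 2] Hz."""
    rate_hz = message_count / window_s
    return STATUSTEXT_RATE_MIN_HZ <= rate_hz <= STATUSTEXT_RATE_MAX_HZ


def failsafe_trigger_s(cov_samples: Sequence[Tuple[float, float]],
                       window_s: float) -> Optional[float]:
    """AC-7: earliest of the cov-500 m crossing or the 30 s duration mark.

    cov_samples holds (seconds since window start, covariance in metres).
    Returns None when neither condition is reached, which is the case for
    the 5 s and 15 s windows.
    """
    cov_cross = next(
        (t for t, cov in cov_samples if cov >= ESCALATION_COV_FAILSAFE_M),
        None,
    )
    duration = (ESCALATION_DURATION_FAILSAFE_S
                if window_s >= ESCALATION_DURATION_FAILSAFE_S else None)
    candidates = [c for c in (cov_cross, duration) if c is not None]
    return min(candidates) if candidates else None


fast_camera_budget = switch_latency_budget_ms(100.0)
slow_camera_budget = switch_latency_budget_ms(1000.0)
```

With a 100 ms frame period the budget is one frame; with a 1000 ms period the 400 ms cap applies instead.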

## Layout invariant

`e2e/_unit_tests/test_directory_layout.py` now lists the three new
evaluators and the three new scenario files.

## Test Results

* New unit tests: 14 + 18 + 29 = **61**.
* Plus 6 new entries in `test_required_path_exists` parametrize
  (3 helpers + 3 scenarios).
* Full `e2e/_unit_tests` suite: **527 passed in 130 s** (previous
  cumulative: 460 → +67 net).
* Scenario collection across the three negatives: 48 items
  parametrized; the session-end `/e2e-results/evidence/per-nfr`
  teardown error is the same pre-existing `nfr_recorder` wart
  documented in batches 69-72 — not a regression of this batch and
  not blocking unit-suite collection.

## State

* Specs moved: `_docs/02_tasks/todo/AZ-{424,425,426}_*.md` →
  `_docs/02_tasks/done/`.
* `_docs/_autodev_state.md` advanced to
  `last_completed_batch: 73`.
* Cumulative review window: `last_cumulative_review = batches_70-72`;
  the next K=3 cumulative review fires at the end of batch 75.
@@ -0,0 +1,173 @@
# Code Review Report

**Batch**: 73 — AZ-424, AZ-425, AZ-426
**Date**: 2026-05-17
**Verdict**: PASS

## Findings

(none)

## Findings Sweep

### Phase 1 — Context Loading

Loaded specs `AZ-424_ft_n_01_outlier_tolerance.md`,
`AZ-425_ft_n_03_outage_reloc.md`, `AZ-426_ft_n_04_blackout_spoof.md`.
Re-read injector surfaces touched by the new evaluators:
`e2e/fixtures/injectors/outlier.py` (manifest.csv schema +
`OutlierInjectionReport.out_root`), `e2e/fixtures/injectors/blackout_spoof.py`
(`BlackoutSpoofPlan`, `BlackoutSpoofSchedule.window_start_ms / window_end_ms`,
spoofed-GPS cadence + AC-NEW-8 200-500 m delta bounds). Re-read existing
fixture wiring in `e2e/runner/helpers/injector_fixtures.py` to confirm
`outlier_injection_derkachi` and `blackout_spoof_derkachi` parametrize
on `density` / `window_seconds`. Re-read the scenario template used in
batch 71/72 (`tests/positive/test_ft_p_10_smoothing_lookback.py`,
`tests/negative/test_ft_n_02_sharp_turn_failure.py`) for the
`_harness_helpers_implemented` gate pattern and the FDR / mavproxy /
sitl_observer access conventions.

### Phase 2 — Spec Compliance

**AZ-424 (FT-N-01)**

| AC | Coverage | Status |
|----|----------|--------|
| AC-1 (medium-density injection; ≥10 outliers) | `test_constants_match_spec`, `test_evaluate_count_below_minimum_fails`, `test_evaluate_count_at_minimum_passes_count_gate`, scenario assertion via `MIN_OUTLIER_COUNT` | Covered |
| AC-2 (drift bound ≤50 m per outlier) | `test_evaluate_event_drift_within_budget`, `test_evaluate_event_drift_exceeds_budget_fails`, `test_evaluate_event_missing_neighbour_drift_none`, scenario per-event assertion via `OutlierEventReport.passes_drift` | Covered |
| AC-3 (covariance monotonic across 3-frame window) | `test_evaluate_event_cov_monotonic_passes`, `test_evaluate_event_cov_decreasing_fails`, `test_evaluate_event_cov_flat_window_passes`, scenario assertion via `passes_covariance` | Covered |
| AC-4 (parameterization per fc_adapter × vio_strategy) | scenario uses conftest `fc_adapter`/`vio_strategy` fixtures + indirect `outlier_injection_derkachi` (density=medium, seed=0) | Covered |
| CSV evidence | `test_write_csv_evidence_round_trips`, scenario writes `ft-n-01-{fc_adapter}-{vio_strategy}.csv` | Covered |

**AZ-425 (FT-N-03)**

| AC | Coverage | Status |
|----|----------|--------|
| AC-1 (≥3 consecutive missing frames) | `test_detect_no_outage_returns_empty`, `test_detect_run_below_min_length_ignored`, `test_detect_single_outage_window`, `test_detect_multiple_windows`, `test_detect_trailing_outage_window`, scenario assertion via `passes_min_length` | Covered |
| AC-2 (STATUSTEXT `OPERATOR_RELOC_REQUEST` within 2 s ±500 ms of onset) | `test_statustext_within_tolerance_passes`, `test_statustext_within_tolerance_late_passes`, `test_statustext_too_early_fails`, `test_statustext_too_late_fails`, `test_statustext_missing_fails`, `test_statustext_payload_mismatch_fails`, scenario assertion via `passes_statustext` | Covered |
| AC-3 (dead_reckoned label during outage) | `test_dead_reckoned_during_window_passes`, `test_dead_reckoned_absent_fails`, scenario assertion via `passes_dead_reckoned` | Covered |
| AC-4 (no FC EKF divergence event during outage) | `test_ekf_divergence_during_window_fails`, `test_ekf_divergence_outside_window_ignored`, scenario assertion via `passes_ekf` | Covered |
| AC-5 (parameterization) | scenario uses conftest `fc_adapter`/`vio_strategy` fixtures | Covered |
| CSV evidence | `test_write_csv_evidence_round_trips`, scenario writes `ft-n-03-{fc_adapter}-{vio_strategy}.csv` | Covered |

**AZ-426 (FT-N-04)**

| AC | Coverage | Status |
|----|----------|--------|
| AC-1 (switch latency ≤1 frame OR ≤400 ms) | `test_switch_latency_within_400_ms_passes` (validates `min(400, frame_period_ms)` budget), `test_switch_latency_within_one_frame_passes`, `test_switch_latency_at_one_frame_boundary_passes`, `test_switch_latency_missing_dead_reckoned_fails`, scenario assertion via `switch_latency.passes` | Covered |
| AC-2 (spoof-rejected events AND no satellite re-anchor inside window) | `test_spoof_rejection_pass`, `test_spoof_rejection_no_events_fails`, `test_spoof_rejection_label_returns_to_satellite_fails`, scenario assertion via `spoof_rejection.passes` | Covered |
| AC-3 (covariance monotonic) | `test_covariance_monotonic_pass`, `test_covariance_monotonic_decreasing_fails`, scenario assertion via `covariance_monotonic.passes` | Covered |
| AC-4 (`horiz_accuracy ≥ 0.95 × cov_semi_major_m`) | `test_honest_accuracy_pass`, `test_honest_accuracy_boundary_pass`, `test_honest_accuracy_violation_fails`, scenario assertion via `honest_accuracy.passes` | Covered |
| AC-5 (`VISUAL_BLACKOUT_IMU_ONLY` rate ∈ [1, 2] Hz) | `test_statustext_rate_pass_at_1hz`, `test_statustext_rate_pass_at_2hz`, `test_statustext_rate_too_slow_fails`, `test_statustext_rate_too_fast_fails`, scenario assertion via `statustext_rate.passes` | Covered |
| AC-6 (35 s only: cov 100 m → fix_type ≤2D) | `test_escalation_non_35s_window_passes_vacuously`, `test_escalation_35s_ac6_fix_type_degraded_passes`, `test_escalation_35s_ac6_fix_type_not_degraded_fails`, scenario assertion gated on `is_35s` via `escalation.passes_ac6` | Covered |
| AC-7 (35 s only: cov 500 m OR 30 s duration → `horiz=999`, `VISUAL_BLACKOUT_FAILSAFE` within 500 ms) | `test_escalation_35s_no_crossings_passes` (vacuous on duration-only path), `test_escalation_35s_ac7_horiz_not_999_fails`, scenario assertion gated on `is_35s` via `escalation.passes_ac7` | Covered |
| AC-8 (recovery gate: ≥10 s stable + consistency check pass) | `test_recovery_gate_pass`, `test_recovery_gate_unstable_fails`, `test_recovery_gate_spoofed_fails`, `test_recovery_gate_no_consistency_check_fails`, `test_recovery_gate_no_recovery_attempt_vacuous_pass`, scenario assertion via `recovery_gate.passes` | Covered |
| AC-9 (parameterization × 3 windows) | scenario indirect-parametrizes `blackout_spoof_derkachi` over `_WINDOW_LADDER_S = (5.0, 15.0, 35.0)` with ids `["5s", "15s", "35s"]`; conftest `fc_adapter`/`vio_strategy` adds 6 variants = 18 collected items per fc_adapter pair | Covered |
| CSV evidence | `test_write_csv_evidence_round_trips`, scenario writes `ft-n-04-{window_s}s-{fc_adapter}-{vio_strategy}.csv` | Covered |

### Phase 3 — Code Quality

* **Single responsibility**: each evaluator is one module with one
  responsibility — `outlier_tolerance_evaluator` aggregates per-event
  AC-2/AC-3 reports; `outage_request_evaluator` detects outage windows
  and evaluates AC-1..AC-4 per window; `blackout_spoof_evaluator`
  evaluates the AC-1..AC-8 ladder against one `BlackoutWindow`. None
  of the three pulls in scenario-specific helpers (drive replay /
  collect samples) — those live in the scenario test files.
* **Method naming**: per-AC evaluators are named after the AC concern
  (`evaluate_switch_latency`, `evaluate_spoof_rejection`,
  `evaluate_covariance_monotonic`, `evaluate_honest_accuracy`,
  `evaluate_statustext_rate`, `evaluate_escalation`,
  `evaluate_recovery_gate`). The aggregate `evaluate(...)` in each
  module composes the per-AC reports into a single dataclass.
* **No suppressed errors**: `load_outlier_manifest` raises on missing
  file and missing columns; the manifest writer raises naturally on
  ENOENT; the evaluator helpers raise no exceptions of their own.
  No bare `except`, no `2>/dev/null`-equivalents.
* **AAA comment discipline**: every test uses `# Arrange / # Act /
  # Assert`; sections are omitted when not needed (e.g. constant
  invariant tests just have `# Assert`).
* **Public boundary**: confirmed all three evaluators import only from
  the `e2e.runner.helpers.geo` module (when needed) and dataclasses /
  stdlib. No `from gps_denied_onboard ...`. Confirmed via grep.

### Phase 4 — Security

* **No new secrets, credentials, or network paths**. All three
  evaluators are pure-logic over already-collected samples / events.
* **Spoof rejection (AC-2)** is the project's primary anti-spoof
  invariant; the evaluator does not bypass it — it asserts the FDR
  recorded the rejection AND that the source-label state machine did
  not silently re-promote to `satellite_anchored` inside the window.
* **Honest accuracy (AC-4)** ensures the SUT cannot under-report
  uncertainty to the FC. The evaluator's check is
  `horiz_accuracy ≥ 0.95 × cov_semi_major_m` per the spec; we explicitly
  cover the boundary in `test_honest_accuracy_boundary_pass` so a future
  implementation cannot pass by emitting `horiz = cov` while the spec
  budget is `0.95 × cov`.
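
The two security-relevant invariants above can be expressed as small predicates. The sketch below is illustrative only: the function names, the `(timestamp_ms, label)` and `(horiz_accuracy, cov_semi_major_m)` tuple shapes, and the choice to count rejection events only when they fall inside the window are assumptions, not the evaluator's confirmed interface.

```python
from typing import List, Sequence, Tuple

HONEST_ACCURACY_RATIO = 0.95  # AC-4 constant quoted in the report


def spoof_rejection_passes(rejection_times_ms: Sequence[int],
                           emissions: Sequence[Tuple[int, str]],
                           window_start_ms: int,
                           window_end_ms: int) -> bool:
    """AC-2: at least one rejection AND no satellite_anchored in the window."""
    def inside(t_ms: int) -> bool:
        return window_start_ms <= t_ms <= window_end_ms

    rejected = any(inside(t) for t in rejection_times_ms)
    re_promoted = any(inside(t) and label == "satellite_anchored"
                      for t, label in emissions)
    return rejected and not re_promoted


def honest_accuracy_violations(
        samples: Sequence[Tuple[float, float]]) -> List[int]:
    """AC-4: indices where horiz_accuracy < 0.95 * cov_semi_major_m."""
    return [i for i, (horiz, cov) in enumerate(samples)
            if horiz < HONEST_ACCURACY_RATIO * cov]


# Boundary sample (95 m against a 100 m covariance) passes; 94 m does not.
violations = honest_accuracy_violations(
    [(95.0, 100.0), (100.0, 100.0), (94.0, 100.0)]
)
```

Returning the violating indices rather than a bare boolean mirrors the report's emphasis on failure messages that surface sub-report fields.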

### Phase 5 — Performance

All three evaluators are O(N) over their input sequences (single
pass over estimates, single pass over events, single pass over
statustexts). No nested scans beyond the bounded 3-frame window in
`outlier_tolerance_evaluator.evaluate_event`. CSV writes use
buffered `csv.writer`. No file I/O at module import time.

### Phase 6 — Cross-Task Consistency

* **Shared `geo.distance_m`** is the single point-to-point distance
  helper used by `outlier_tolerance_evaluator`. Matches the
  `accuracy_evaluator`, `multi_segment_evaluator`,
  `smoothing_evaluator`, `cold_start_evaluator` conventions.
* **Shared `_harness_helpers_implemented` skip gate**: all three new
  scenarios use the same probe pattern as `test_ft_p_10_*`,
  `test_ft_p_11_*`, `test_ft_n_02_*` — `NotImplementedError` on
  `frame_source_replay`, `fdr_reader`, `imu_replay`,
  `mavproxy_tlog_reader`, `sitl_observer` collapses to a single
  `pytest.skip(...)` with a pointer to the relevant unit test.
* **Constants centralised inside each module**: `MIN_OUTLIER_COUNT`,
  `DRIFT_BUDGET_M`, `SWITCH_LATENCY_MS`, `STATUSTEXT_RATE_*_HZ`,
  `ESCALATION_*` all sit at the top of their respective modules and
  are imported as named constants in the unit tests. No magic numbers
  inline.
* **Source-label vocabulary**: `dead_reckoned` / `satellite_anchored`
  are spelled identically across the three new evaluators and match
  the prior batches (`sharp_turn_detector.ALLOWED_DURING_TURN_LABELS`,
  `multi_segment_evaluator`, FDR schema in batch 67-68).
* **STATUSTEXT regex strings**: `OPERATOR_RELOC_REQUEST` (FT-N-03),
  `VISUAL_BLACKOUT_IMU_ONLY` (FT-N-04 AC-5),
  `VISUAL_BLACKOUT_FAILSAFE` (FT-N-04 AC-7) match the spec verbatim;
  unit-tested for substring presence + payload mismatch.

### Phase 7 — Architecture Compliance

* **Module placement**: all three evaluators live in
  `e2e/runner/helpers/`; their unit tests in
  `e2e/_unit_tests/helpers/`; their scenarios in
  `e2e/tests/negative/`. Consistent with the AZ-406 layout and the
  directory-layout invariant test (which now lists the three new
  helpers + three new scenarios).
* **No `src/gps_denied_onboard` imports** anywhere in the new code.
  Verified by inspection — the evaluators only consume typed
  dataclasses populated by the scenario from public-boundary
  sources (FDR, mavproxy tlog, SITL state, injector manifests).
* **Scenario gating**: each new scenario file uses
  `pytest.skip(...)` with an explicit message pointing to the unit
  test that covers the gated AC logic. This is the established
  pattern from FT-P-07/08/09/10/11 and FT-N-02 — scenario coverage
  comes online once the AZ-441 / AZ-407 / AZ-416 leftovers ship.
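
The skip-gate idea described above can be illustrated with a small probe helper. This is a sketch under assumptions: `first_unimplemented_helper` and the probe dictionary are hypothetical stand-ins, and the project's actual `_harness_helpers_implemented` function may be shaped differently. The block avoids importing pytest so it runs on its own.

```python
from typing import Callable, Dict, Optional


def first_unimplemented_helper(
        probes: Dict[str, Callable[[], object]]) -> Optional[str]:
    """Return the name of the first helper whose probe raises
    NotImplementedError, or None when every helper is available."""
    for name, probe in probes.items():
        try:
            probe()
        except NotImplementedError:
            return name
    return None


def _stub_not_ready() -> None:
    # Stand-in for an upstream helper that has not landed yet.
    raise NotImplementedError("helper not landed yet")


probes = {
    "fdr_reader": lambda: None,        # pretend this helper is implemented
    "sitl_observer": _stub_not_ready,  # pretend this one is still a stub
}
missing = first_unimplemented_helper(probes)
```

Inside a scenario, a non-`None` result would be passed to `pytest.skip(...)` together with the name of the unit test that covers the gated AC logic, so the scenario activates automatically once every probe stops raising.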

## Test Results

* New unit tests: 14 (outlier) + 18 (outage) + 29 (blackout-spoof) = **61 new tests**
* Plus 6 new entries in the parametrized `test_required_path_exists`
  (3 evaluator paths + 3 scenario paths) — counted toward the suite
  total.
* Full `e2e/_unit_tests` suite: **527 passed in 130 s** (previous
  cumulative: 460 → +67 net).
* Scenario collection for the three negative tests: 48 items collect
  cleanly (parametrized across
  `fc_adapter × vio_strategy × {density | window_seconds}`). The
  session-end `/e2e-results/evidence/per-nfr`
  teardown error is the same pre-existing wart documented in batches
  69-72 (nfr_recorder hardcoded path; not introduced by this batch).