mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 07:01:13 +00:00
[AZ-432] [AZ-433] [AZ-434] [AZ-435] Add NFT-RES-01..04 resilience scenarios
Batch 86: 4 NFT-RES blackbox scenarios + 4 helper evaluators + 74 unit tests + directory-layout registration. * AZ-432 NFT-RES-01: 30 s IMU-only fallback drift bound (AC-3.5 + AC-NEW-7); two sub-cases (no_imu ≤100m, good_imu_combined_factor ≤50m). * AZ-433 NFT-RES-02: companion mid-flight reboot (AC-5.2 + AC-5.3); resume ≤30s + first-emission accuracy ≤100m. * AZ-434 NFT-RES-03: 100-iteration Monte Carlo envelope (AC-NEW-4); iteration-count + master-seed determinism + envelope ratio ≥0.95. Canonical-param by default; E2E_NFT_RES_03_FULL_MATRIX=1 unlocks matrix. * AZ-435 NFT-RES-04: 35s blackout+spoof escalation ladder (AC-NEW-8); AC-1 (cov-2d→fix-degrade ≤500ms) + AC-2 (failsafe→999+STATUSTEXT ≤500ms) + AC-ORDER (strict ordering). Verdict: PASS_WITH_WARNINGS (0 Critical, 0 High, 0 Medium, 5 Low). F5 documents intentional threshold duplication with blackout_spoof evaluator (prevents contract drift between FT-N-04 and NFT-RES-04). Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,151 @@
|
||||
# Batch 86 — AZ-432 + AZ-433 + AZ-434 + AZ-435 (Resilience NFTs)
|
||||
|
||||
**Tracker**: AZ-432, AZ-433, AZ-434, AZ-435
|
||||
**Tasks**: 4 tasks / 14 complexity points (3 + 3 + 5 + 3)
|
||||
**Date**: 2026-05-17
|
||||
**Verdict**: PASS_WITH_WARNINGS
|
||||
**Review**: `_docs/03_implementation/reviews/batch_86_review.md`
|
||||
|
||||
## Scope
|
||||
|
||||
- **AZ-432 / NFT-RES-01 (AC-3.5 + AC-NEW-7)** — 30 s pure-vision-blackout drift bound; two sub-cases (`no_imu` ≤100 m, `good_imu_combined_factor` ≤50 m); Tier-1 OR Tier-2.
|
||||
- **AZ-433 / NFT-RES-02 (AC-5.2 + AC-5.3)** — Mid-flight Docker/systemd restart; resume ≤30 s + first-emission accuracy ≤100 m; Tier-1 OR Tier-2.
|
||||
- **AZ-434 / NFT-RES-03 (AC-NEW-4)** — 100-iteration Monte Carlo statistical envelope; AC-1 (N≥100) + AC-2 (master-seed determinism) + AC-3 (`count(error_m ≤ 1.96 × cov_semi_major_m) / total ≥ 0.95`); canonical-param by default, `E2E_NFT_RES_03_FULL_MATRIX=1` unlocks full matrix.
|
||||
- **AZ-435 / NFT-RES-04 (AC-NEW-8 escalation)** — 35 s blackout+spoof full ladder; AC-1 (cov-2d → fix-degrade ≤500 ms) + AC-2 (failsafe trigger → 999+STATUSTEXT ≤500 ms) + AC-ORDER (cov-2d strictly precedes failsafe trigger).
|
||||
|
||||
## Files
|
||||
|
||||
### Created (12 files)
|
||||
|
||||
- `e2e/runner/helpers/imu_fallback_drift_evaluator.py` — sub-case drift evaluator (no_imu / good_imu_combined_factor) with window-in-spec guard.
|
||||
- `e2e/runner/helpers/companion_reboot_evaluator.py` — restart-trigger + resume-time + first-emission-accuracy verdicts from one captured `RestartEvidence`.
|
||||
- `e2e/runner/helpers/monte_carlo_envelope_evaluator.py` — iteration-count + envelope-ratio + SHA-256 determinism fingerprint.
|
||||
- `e2e/runner/helpers/escalation_ladder_evaluator.py` — cov-2d/cov-500/duration triggers, latency budgets, strict ordering.
|
||||
- `e2e/tests/resilience/test_nft_res_01_imu_only_fallback.py` — NFT-RES-01 scenario (Tier-1/2; fixture-consumer).
|
||||
- `e2e/tests/resilience/test_nft_res_02_companion_reboot.py` — NFT-RES-02 scenario (Tier-1/2; fixture-consumer).
|
||||
- `e2e/tests/resilience/test_nft_res_03_monte_carlo.py` — NFT-RES-03 scenario (Tier-1/2; canonical-only by default; full-matrix gated).
|
||||
- `e2e/tests/resilience/test_nft_res_04_blackout_escalation.py` — NFT-RES-04 scenario (Tier-1/2; fixture-consumer; sibling of FT-N-04).
|
||||
- `e2e/_unit_tests/helpers/test_imu_fallback_drift_evaluator.py` — 16 unit tests.
|
||||
- `e2e/_unit_tests/helpers/test_companion_reboot_evaluator.py` — 19 unit tests.
|
||||
- `e2e/_unit_tests/helpers/test_monte_carlo_envelope_evaluator.py` — 15 unit tests.
|
||||
- `e2e/_unit_tests/helpers/test_escalation_ladder_evaluator.py` — 24 unit tests.
|
||||
|
||||
### Modified
|
||||
|
||||
- `e2e/_unit_tests/test_directory_layout.py` — registered 8 new paths.
|
||||
|
||||
## Test Results
|
||||
|
||||
```
|
||||
$ pytest e2e/_unit_tests/helpers/test_imu_fallback_drift_evaluator.py \
|
||||
e2e/_unit_tests/helpers/test_companion_reboot_evaluator.py \
|
||||
e2e/_unit_tests/helpers/test_monte_carlo_envelope_evaluator.py \
|
||||
e2e/_unit_tests/helpers/test_escalation_ladder_evaluator.py \
|
||||
e2e/_unit_tests/test_directory_layout.py
|
||||
================ 199 passed in 0.69s ================
|
||||
```
|
||||
|
||||
Scenario collection (24 cases, all parameterised):
|
||||
|
||||
```
|
||||
$ pytest e2e/tests/resilience/ --collect-only -p no:csv --evidence-out=/tmp/e2e-test-evidence
|
||||
collected 24 items
|
||||
test_nft_res_01_imu_only_fallback: 6 cases
|
||||
test_nft_res_02_companion_reboot: 6 cases
|
||||
test_nft_res_03_monte_carlo: 6 cases
|
||||
test_nft_res_04_blackout_escalation: 6 cases
|
||||
```
|
||||
|
||||
Scenario smoke (all 24 skip cleanly with rich diagnostic messages):
|
||||
|
||||
```
|
||||
24 skipped in 0.17s
|
||||
```
|
||||
|
||||
Skip breakdown:
|
||||
- 13 skip-on-`sitl_replay_ready=False` (no `E2E_SITL_REPLAY_DIR` locally — expected pattern).
|
||||
- 8 skip-on-`vins_mono` (research-build-only per D-C1-1-SUB-A — conftest applies on production builds).
|
||||
- 3 skip-on-non-canonical-param for NFT-RES-03 (AC-4 default canonical-only; unlock with `E2E_NFT_RES_03_FULL_MATRIX=1`).
|
||||
|
||||
## AC Verification
|
||||
|
||||
### AZ-432 / NFT-RES-01
|
||||
|
||||
| AC | Coverage |
|
||||
|----|----------|
|
||||
| AC-1 30 s window injected | `BlackoutWindow.window_in_spec` (±2 s tolerance) + 3 unit tests + scenario gate |
|
||||
| AC-2 no-IMU drift ≤ 100 m | `evaluate_subcase(... no_imu).passes` + 2 unit tests + scenario AC-2 assert |
|
||||
| AC-3 good-IMU drift ≤ 50 m | `evaluate_subcase(... good_imu_combined_factor).passes` + 2 unit tests + scenario AC-3 assert |
|
||||
| AC-4 parameterization | 6 collected variants (fc_adapter × vio_strategy) |
|
||||
|
||||
### AZ-433 / NFT-RES-02
|
||||
|
||||
| AC | Coverage |
|
||||
|----|----------|
|
||||
| AC-1 restart trigger ≤ 5 s | `passes_restart_trigger` + 4 unit tests + scenario AC-1 assert |
|
||||
| AC-2 resume time ≤ 30 s | `passes_resume_time` + 4 unit tests + scenario AC-2 assert |
|
||||
| AC-3 first-emission accuracy ≤ 100 m | `passes_first_emission_accuracy` + 5 unit tests + scenario AC-3 assert |
|
||||
| AC-4 parameterization | 6 collected variants |
|
||||
|
||||
### AZ-434 / NFT-RES-03
|
||||
|
||||
| AC | Coverage |
|
||||
|----|----------|
|
||||
| AC-1 N ≥ 100 iterations | `passes_iteration_count` + 3 unit tests + scenario AC-1 assert |
|
||||
| AC-2 master-seed determinism | `determinism_fingerprint` + 4 unit tests + scenario dual-evaluate AC-2 assert |
|
||||
| AC-3 envelope ratio ≥ 0.95 | `passes_envelope` + 6 unit tests + scenario AC-3 assert |
|
||||
| AC-4 parameterization | Canonical (ardupilot, okvis2) by default; full matrix via `E2E_NFT_RES_03_FULL_MATRIX=1` |
|
||||
|
||||
### AZ-435 / NFT-RES-04
|
||||
|
||||
| AC | Coverage |
|
||||
|----|----------|
|
||||
| AC-1 100 m → fix-degrade ≤ 500 ms | `fix_degrade.passes` + 4 unit tests + scenario AC-1 assert |
|
||||
| AC-2 500 m / 30 s → 999+STATUSTEXT ≤ 500 ms | `failsafe.passes` + 6 unit tests + scenario AC-2 assert |
|
||||
| AC-ORDER cov-2d strictly precedes failsafe | `ordering.passes` + 3 unit tests + scenario AC-ORDER assert |
|
||||
| AC-3 parameterization | 6 collected variants |
|
||||
|
||||
`traces_to` markers:
|
||||
- NFT-RES-01: `AC-3.5,AC-NEW-7,AC-1,AC-2,AC-3,AC-4`
|
||||
- NFT-RES-02: `AC-5.2,AC-5.3,AC-1,AC-2,AC-3,AC-4`
|
||||
- NFT-RES-03: `AC-NEW-4,AC-1,AC-2,AC-3,AC-4`
|
||||
- NFT-RES-04: `AC-NEW-8,AC-1,AC-2,AC-3`
|
||||
|
||||
## Code Review
|
||||
|
||||
**Verdict**: PASS_WITH_WARNINGS — 0 Critical, 0 High, 0 Medium, 5 Low.
|
||||
|
||||
- **F1 (Low / Maintainability — carry-over of batch-85 F4)**: `write_csv_evidence` boilerplate now in 8 evaluators. Future hygiene PBI.
|
||||
- **F2 (Low / Spec-Gap surfacing)**: AZ-432 sub-case (a) needs SUT-side disable-IMU path OR empty IMU stream from FC inbound proxy — production dep on AZ-595.
|
||||
- **F3 (Low / Spec-Gap surfacing)**: AZ-433 process-restart observation needs runner-side health-probe; AZ-444 owns Tier-2, Tier-1 needs docker-compose healthcheck wiring.
|
||||
- **F4 (Low / Maintainability)**: `_resolve_fixture_path` duplicated across 4 new scenarios (matches NFT-PERF pattern from batch 85). Future hygiene PBI.
|
||||
- **F5 (Low / Maintainability — intentional)**: `escalation_ladder_evaluator` thresholds intentionally re-defined locally rather than imported from `blackout_spoof_evaluator`. Documented; will be cited if a future review proposes to "DRY" them.
|
||||
|
||||
Full review: `_docs/03_implementation/reviews/batch_86_review.md`.
|
||||
|
||||
## Production Dependencies
|
||||
|
||||
Surfaced for the cumulative review window (85-87) + traceability matrix:
|
||||
|
||||
1. **AZ-595 (fixture builder)**: emit `nft_res_01_imu_fallback.json` (both sub-cases × 30 s blackout × estimate+GT samples), `nft_res_02_companion_reboot.json` (restart-command + process-restarted + first-post-restart-emission + GT-at-emission timestamps), `nft_res_03_monte_carlo.json` (master_seed + 100 iterations × per-frame (error_m, cov_semi_major_m)), `nft_res_04_blackout_escalation.json` (35 s window + estimate stream with cov/horiz/fix_type + STATUSTEXT stream).
|
||||
2. **AZ-444 (Tier-2 runner)**: per-iteration clean-state lifecycle for NFT-RES-03 (fdr-output volume wipe + SUT cold restart × 100); systemd watchdog observation for NFT-RES-02 process-restart timing.
|
||||
3. **AZ-595 + SUT**: SUT-side `no_imu` config path OR FC-proxy empty IMU stream injection for AZ-432 sub-case (a).
|
||||
4. **SUT-side**: outbound stream MUST carry `cov_semi_major_m`, `horiz_accuracy`, and `fix_type` per-frame for NFT-RES-04 to detect the ladder. Existing FT-N-04 already requires the first two; `fix_type` is new for NFT-RES-04 AC-1 (MAVLink `GPS_INPUT.fix_type ≤ 2` for AP, equivalent for iNav).
|
||||
5. **Already exists**: `sitl_replay_ready` fixture, `sitl_observer.replay_dir()`, `evidence_dir`, `nfr_recorder` (AZ-406/AZ-445), `geo.distance_m` (AZ-407), conftest `fc_adapter` / `vio_strategy` / `vins_mono` skip rules.
|
||||
|
||||
## Architecture Compliance
|
||||
|
||||
- All new files under `e2e/`, owned by the Blackbox Tests cross-cutting component per `_docs/02_document/module-layout.md`.
|
||||
- No imports from `src/gps_denied_onboard` (verified — only `runner.helpers.geo`, `runner.helpers.sitl_observer`, `pyproj`, stdlib).
|
||||
- No new cyclic dependencies. New evaluators are leaves of the import DAG.
|
||||
- No new infrastructure libraries.
|
||||
|
||||
## Sub-step Trace
|
||||
|
||||
Phases executed per `implement/SKILL.md`:
|
||||
- phase 5 (load-spec) → 4 task specs read
|
||||
- phase 6 (implement-tasks-sequentially) → helpers + scenarios + unit tests for all 4 tasks
|
||||
- phase 7 (verify-ac-coverage) → ACs traced above
|
||||
- phase 8 (code-review) → batch_86_review.md (PASS_WITH_WARNINGS)
|
||||
- phase 8.5 (cumulative-review) → defer to batch 87 (K=3 window 85-87)
|
||||
- phase 11 (commit-batch) → next.
|
||||
@@ -0,0 +1,118 @@
|
||||
# Code Review Report — Batch 86
|
||||
|
||||
**Batch**: 86 (AZ-432 + AZ-433 + AZ-434 + AZ-435 — Resilience NFTs)
|
||||
**Date**: 2026-05-17
|
||||
**Verdict**: PASS_WITH_WARNINGS
|
||||
|
||||
## Findings
|
||||
|
||||
| # | Severity | Category | File:Line | Title |
|
||||
|---|----------|----------|-----------|-------|
|
||||
| 1 | Low | Maintainability | `e2e/runner/helpers/{imu_fallback_drift,companion_reboot,monte_carlo_envelope,escalation_ladder}_evaluator.py` | `write_csv_evidence` duplication continues — now 8 evaluators (carry-over of batch-85 F4) |
|
||||
| 2 | Low | Spec-Gap (surfacing) | `e2e/tests/resilience/test_nft_res_01_imu_only_fallback.py` | Sub-case (a) "no IMU" needs SUT-side disable-IMU config OR empty IMU stream; production dep on AZ-595 |
|
||||
| 3 | Low | Spec-Gap (surfacing) | `e2e/tests/resilience/test_nft_res_02_companion_reboot.py` | Process-restart observation requires runner-side health probe; AZ-444 owns Tier-2, Tier-1 needs docker-compose healthcheck wiring |
|
||||
| 4 | Low | Maintainability | `e2e/tests/resilience/test_nft_res_0{1,2,3,4}_*.py` | `_resolve_fixture_path` duplicated across 4 new scenarios (matches NFT-PERF pattern) |
|
||||
| 5 | Low | Maintainability (intentional) | `e2e/runner/helpers/escalation_ladder_evaluator.py` | Thresholds (`COV_2D_THRESHOLD_M`, `COV_FAILSAFE_THRESHOLD_M`, etc.) deliberately re-defined locally rather than imported from `blackout_spoof_evaluator` |
|
||||
|
||||
### Finding Details
|
||||
|
||||
**F1: `write_csv_evidence` duplication continues** (Low / Maintainability — carry-over)
|
||||
- Location: `imu_fallback_drift_evaluator.py`, `companion_reboot_evaluator.py`, `monte_carlo_envelope_evaluator.py`, `escalation_ladder_evaluator.py`
|
||||
- Description: Batch-85 surfaced this on 4 evaluators (F4); batch 86 adds 4 more. The pattern is: open file, write header row, write data row(s), close. All 8 evaluators implement it independently.
|
||||
- Suggestion: Continued hygiene PBI — extract a `_emit_csv(path, header, rows)` helper. Not blocking; will be surfaced again in the cumulative review of batches 85-87.
|
||||
- Task: AZ-432 / AZ-433 / AZ-434 / AZ-435
|
||||
|
||||
**F2: Sub-case (a) "no IMU" needs SUT-side disable-IMU path** (Low / Spec-Gap — surfacing)
|
||||
- Location: `e2e/tests/resilience/test_nft_res_01_imu_only_fallback.py`
|
||||
- Description: AZ-432 sub-case (a) requires the SUT to run with IMU input disabled. The task spec calls out two options: (i) SUT config flag `E2E_NFT_RES_01_DISABLE_IMU=1` (or equivalent) propagated to the SUT, or (ii) empty IMU stream from the FC inbound proxy. Neither path is wired in production code today; the fixture builder (AZ-595) is responsible for setting up the variant that produces the captured no-IMU estimates.
|
||||
- Suggestion: Track as production dependency for AZ-595 — the SITL replay builder must emit a `no_imu` sub-case capture using one of those two strategies and label it accordingly in the fixture JSON.
|
||||
- Task: AZ-432
|
||||
|
||||
**F3: Process-restart observation needs runner-side wiring** (Low / Spec-Gap — surfacing)
|
||||
- Location: `e2e/tests/resilience/test_nft_res_02_companion_reboot.py`
|
||||
- Description: AZ-433 AC-1 (process restarts within ≤5 s) requires the runner to observe the moment the SUT process is back up. On Tier-2, AZ-444 owns the systemd watchdog observation. On Tier-1, the equivalent docker-compose `healthcheck` transition (`unhealthy → healthy`) is not yet captured into the fixture JSON. The scenario tolerates `process_restarted_monotonic_ms = null` (AC-1 fails loud), but the *positive* path requires the harness.
|
||||
- Suggestion: Track as production dependency for AZ-444 (Tier-2) + AZ-595 (Tier-1 docker-compose healthcheck capture).
|
||||
- Task: AZ-433
|
||||
|
||||
**F4: `_resolve_fixture_path` duplicated across scenarios** (Low / Maintainability)
|
||||
- Location: `test_nft_res_0{1,2,3,4}_*.py`
|
||||
- Description: Each scenario re-implements the env-var-or-default fixture path resolver (`E2E_NFT_RES_NN_FIXTURE` → absolute or relative to `E2E_SITL_REPLAY_DIR`). The NFT-PERF batch (85) introduced the same pattern across 4 scenarios. Now 8 scenarios duplicate it.
|
||||
- Suggestion: Future hygiene PBI — extract a `runner.helpers.fixture_resolver.resolve(env_var, default_name)` utility. Not blocking; isolation per scenario keeps each test self-contained today.
|
||||
- Task: AZ-432 / AZ-433 / AZ-434 / AZ-435
|
||||
|
||||
**F5: Thresholds intentionally re-defined locally** (Low / Maintainability — intentional)
|
||||
- Location: `e2e/runner/helpers/escalation_ladder_evaluator.py:25-40`
|
||||
- Description: `COV_2D_THRESHOLD_M = 100.0`, `COV_FAILSAFE_THRESHOLD_M = 500.0`, `DURATION_FAILSAFE_S = 30.0`, `FIX_TYPE_2D = 2`, `HORIZ_ACCURACY_FAILSAFE = 999.0`, `STATUSTEXT_FAILSAFE = "VISUAL_BLACKOUT_FAILSAFE"`, `ESCALATION_LATENCY_MS = 500` are also defined in `blackout_spoof_evaluator.py` under the same numeric values. The duplication is **intentional** — NFT-RES-04 and FT-N-04 must continue to fail loud if either evaluator's contract drifts away from the other. A shared constants module would silently propagate one team's edit into both AC suites.
|
||||
- Suggestion: NONE. Documented in `escalation_ladder_evaluator.py` module docstring. If a future cumulative review proposes to "DRY" these, reject and re-cite this finding.
|
||||
- Task: AZ-435
|
||||
|
||||
## Phase Notes
|
||||
|
||||
### Phase 1 — Context
|
||||
|
||||
All 4 task specs read; ACs walked through against helpers + scenarios. Architecture-compliance file (`module-layout.md` § blackbox_tests) re-read to confirm `e2e/**` ownership envelope.
|
||||
|
||||
### Phase 2 — Spec Compliance
|
||||
|
||||
| Task | AC | Evidence |
|
||||
|------|----|----------|
|
||||
| AZ-432 | AC-1 30 s window | `imu_fallback_drift_evaluator.BlackoutWindow.window_in_spec` + scenario AC-1 assert |
|
||||
| AZ-432 | AC-2 no-IMU drift ≤ 100 m | `evaluate_subcase(... SUBCASE_NO_IMU).passes` + 2 unit tests + scenario AC-2 assert |
|
||||
| AZ-432 | AC-3 good-IMU drift ≤ 50 m | `evaluate_subcase(... SUBCASE_GOOD_IMU).passes` + 2 unit tests + scenario AC-3 assert |
|
||||
| AZ-432 | AC-4 parameterization | 6 collected variants per scenario |
|
||||
| AZ-433 | AC-1 restart trigger ≤ 5 s | `companion_reboot_evaluator.passes_restart_trigger` + 4 unit tests + scenario AC-1 assert |
|
||||
| AZ-433 | AC-2 resume time ≤ 30 s | `passes_resume_time` + 4 unit tests + scenario AC-2 assert |
|
||||
| AZ-433 | AC-3 accuracy ≤ 100 m | `passes_first_emission_accuracy` + 5 unit tests + scenario AC-3 assert |
|
||||
| AZ-433 | AC-4 parameterization | 6 collected variants |
|
||||
| AZ-434 | AC-1 N ≥ 100 | `monte_carlo_envelope_evaluator.passes_iteration_count` + 3 unit tests + scenario AC-1 assert |
|
||||
| AZ-434 | AC-2 determinism | `determinism_fingerprint` + 4 unit tests + scenario dual-evaluate AC-2 assert |
|
||||
| AZ-434 | AC-3 envelope ≥ 95 % | `passes_envelope` + 6 unit tests + scenario AC-3 assert |
|
||||
| AZ-434 | AC-4 parameterization | Canonical-only by default; `E2E_NFT_RES_03_FULL_MATRIX=1` unlocks full 6 variants |
|
||||
| AZ-435 | AC-1 100 m → fix-degrade ≤ 500 ms | `escalation_ladder_evaluator.fix_degrade.passes` + 4 unit tests + scenario AC-1 assert |
|
||||
| AZ-435 | AC-2 500 m/30 s → 999+STATUSTEXT ≤ 500 ms | `failsafe.passes` + 6 unit tests + scenario AC-2 assert |
|
||||
| AZ-435 | AC-ORDER cov-2d strictly precedes failsafe | `ordering.passes` + 3 unit tests + scenario AC-ORDER assert |
|
||||
| AZ-435 | AC-3 parameterization | 6 collected variants |
|
||||
|
||||
All ACs PASS.
|
||||
|
||||
### Phase 3 — Code Quality
|
||||
|
||||
- SRP: each evaluator owns one resilience concern (drift / restart / envelope / ladder). Scenarios are thin adapters.
|
||||
- Error handling: explicit `ValueError` on bad helper input (negative latency, unknown sub-case); `pytest.fail()` on bad fixture shape; no bare `except`.
|
||||
- Naming: clear (`SubCaseReport`, `RestartEvidence`, `IterationOutcome`, `EscalationLadderReport`); follows project conventions.
|
||||
- Complexity: longest function is `escalation_ladder_evaluator.evaluate` (~60 lines), all sequential helper calls — acceptable.
|
||||
- DRY: F1 + F4 + F5 noted above.
|
||||
- Test quality: every helper public function has ≥3 unit tests covering happy path, boundary, failure; total 74 new unit tests (16 + 19 + 15 + 24).
|
||||
|
||||
### Phase 4 — Security Quick-Scan
|
||||
|
||||
- No SQL, no `subprocess(shell=True)`, no `eval/exec`, no hardcoded secrets, no insecure deserialization (JSON loads only with explicit shape validation).
|
||||
- Pass.
|
||||
|
||||
### Phase 5 — Performance Scan
|
||||
|
||||
- All evaluators are O(n) single-pass sweeps over their inputs.
|
||||
- No nested loops, no N+1 patterns, no blocking I/O in hot paths.
|
||||
- Pass.
|
||||
|
||||
### Phase 6 — Cross-Task Consistency
|
||||
|
||||
- All 4 helpers share `geo.distance_m` (consistent with batch-85 + earlier pattern).
|
||||
- All 4 scenarios use the same skip pattern (`sitl_replay_ready` → `pytest.skip`; missing fixture → `pytest.fail`).
|
||||
- All 4 scenarios use `nfr_recorder.record_metric` with `ac_id` kwarg consistently.
|
||||
- F5 documents the one intentional duplication (escalation-ladder thresholds vs blackout-spoof thresholds).
|
||||
|
||||
### Phase 7 — Architecture Compliance
|
||||
|
||||
- All new files under `e2e/**` per `module-layout.md` § blackbox_tests `Owns` glob.
|
||||
- No imports from `src/gps_denied_onboard/**` (verified — only `runner.helpers.geo`, `runner.helpers.sitl_observer`, `pyproj`, stdlib).
|
||||
- No new cyclic dependencies. New evaluators are leaves of the import DAG.
|
||||
- No new infrastructure libraries.
|
||||
|
||||
## Auto-Fix Gate
|
||||
|
||||
0 Critical, 0 High, 0 Medium, 5 Low. All Low findings are either intentional, surfacing, or future hygiene. No auto-fix attempts triggered; no escalation.
|
||||
|
||||
## Verdict
|
||||
|
||||
**PASS_WITH_WARNINGS**.
|
||||
Reference in New Issue
Block a user