mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 12:31:13 +00:00
[AZ-436] [AZ-437] [AZ-438] [AZ-439] Add NFT-SEC-01..05 security scenarios
Batch 87: 6 NFT-SEC blackbox scenarios + 5 helper evaluators + 75 unit tests + cumulative review batches 85-87.

* AZ-436 NFT-SEC-01: cache-poisoning safety budget (AC-NEW-9); aggregate false_trust_count ≤ N×1e-6; zero-tolerance default. Canonical-only by default; E2E_NFT_SEC_01_RELEASE_GATE=1 unlocks full matrix.
* AZ-437 NFT-SEC-02 + NFT-SEC-05: shared egress-observation evaluator (AC-NEW-10); SEC-02 = 0 packets to non-e2e-net over 5min replay; SEC-05 = DNS-blackhole sidecar healthy + lookup fails + UDP-53 silent.
* AZ-438 NFT-SEC-03: AP-only signing rejection (AC-NEW-11); 3 sub-cases (unsigned/wrong-key/replayed) each reject ≤500ms + no position drift.
* AZ-439 NFT-SEC-04: probe (always-run) = no-crash + deterministic decode outcome; ASan-fuzz (release-gate) = 0 findings ≥4h; AC-3 corpus floor informational only per spec.

Verdict per-batch: PASS_WITH_WARNINGS (5 Low). Cumulative review for batches 85-87 (K=3 window) also PASS_WITH_WARNINGS with 5 cross-batch findings — recommends hygiene PBIs for write_csv_evidence duplication (13 helpers) and _resolve_fixture_path duplication (13 scenarios), plus new tickets for AZ-595 fixture builder + DNS-blackhole sidecar service.

Also adds _docs/LESSONS.md documenting the Jira transition-ID lesson (always call getTransitionsForJiraIssue first, never memorize numeric IDs across sessions).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,186 @@
# Batch 87 — AZ-436 + AZ-437 + AZ-438 + AZ-439 (Security NFTs)

**Tracker**: AZ-436, AZ-437, AZ-438, AZ-439
**Tasks**: 4 tasks / 16 complexity points (5 + 3 + 3 + 5)
**Date**: 2026-05-17
**Verdict**: PASS_WITH_WARNINGS
**Review**: `_docs/03_implementation/reviews/batch_87_review.md`
**Cumulative review**: `_docs/03_implementation/reviews/cumulative_review_batches_85_87.md`

## Scope

- **AZ-436 / NFT-SEC-01 (AC-NEW-9)** — N synthetic flights with 1-5 % poisoned tiles; aggregate `false_trust_count ≤ N × 1e-6`; zero-tolerance default. Canonical (ardupilot, okvis2) at N=1000 by default; `E2E_NFT_SEC_01_RELEASE_GATE=1` unlocks full N=10000 × matrix.
- **AZ-437 / NFT-SEC-02 + NFT-SEC-05 (AC-NEW-10)** — Two scenarios sharing the egress-observation pattern: NFT-SEC-02 verifies 0 packets to non-`e2e-net` over 5-min Derkachi replay; NFT-SEC-05 verifies DNS-blackhole sidecar absorbs probes + UDP-53 silence.
- **AZ-438 / NFT-SEC-03 (AC-NEW-11)** — AP-only; three sub-cases (unsigned / wrong-key / replayed-tlog) each yield `BAD_SIGNATURE` STATUSTEXT ≤500 ms + no position drift. iNav SKIPs.
- **AZ-439 / NFT-SEC-04 (RESTRICT-CVE-1)** — Probe scenario (always-run): cve-jpeg-fixture does not crash SUT + records deterministic decode-success or frame-decode-error. ASan-fuzz scenario (release-gate `E2E_NFT_SEC_04_RELEASE_GATE=1`): ≥4 h, 0 ASan findings, ≥1000 corpus inputs (informational).

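
The NFT-SEC-01 budget in the scope list can be read as a small arithmetic check. The sketch below is an illustration only, not the contents of `cache_poisoning_evaluator.py`: the `FlightResult` type and its fields are hypothetical, and it assumes that the zero-tolerance default means any false-trust event fails the run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlightResult:
    """Hypothetical per-flight record; the field names are assumptions."""
    total_tiles: int
    poisoned_tiles: int
    false_trust_count: int


def passes_ratio(flight: FlightResult, low: float = 0.01, high: float = 0.05) -> bool:
    """True when the poisoned-tile share lies in the 1-5 % band."""
    if flight.total_tiles <= 0:
        return False
    ratio = flight.poisoned_tiles / flight.total_tiles
    return low <= ratio <= high


def passes_budget(flights: list[FlightResult], zero_tolerance: bool = True) -> bool:
    """Aggregate check: false_trust_count <= N * 1e-6, or == 0 by default."""
    total = sum(f.false_trust_count for f in flights)
    if zero_tolerance:
        return total == 0
    return total <= len(flights) * 1e-6


flights = [FlightResult(total_tiles=200, poisoned_tiles=6, false_trust_count=0)] * 1000
print(all(passes_ratio(f) for f in flights), passes_budget(flights))
```

At N=1000 and N=10000 the budget N × 1e-6 is below one event, so both modes reject any false trust; they would only diverge for N on the order of 10^6.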
## Files

### Created (19 files)

- `e2e/runner/helpers/cache_poisoning_evaluator.py` — N-flight aggregate verdict + per-flight poison-ratio + defense-layer-coverage + rejection-reason vocabulary checks.
- `e2e/runner/helpers/egress_observer.py` — before/after counter snapshots, `NoEgressReport` + `DnsBlackholeReport` + 5-outcome DNS lookup classifier.
- `e2e/runner/helpers/mavlink_signing_evaluator.py` — per-sub-case rejection STATUSTEXT match (BAD_SIGNATURE + documented variants) + position-drift verdict + AC-1 iNav-SKIP companion logic.
- `e2e/runner/helpers/cve_probe_evaluator.py` — FDR-survival + deterministic-outcome classifier; rejects silent drops as defense-bypass.
- `e2e/runner/helpers/asan_fuzz_evaluator.py` — line-level ASan-finding classifier (8 categories + OTHER_FINDING fallback) + duration gate + corpus-floor informational check.
- `e2e/tests/security/test_nft_sec_01_cache_poisoning.py` — NFT-SEC-01 scenario.
- `e2e/tests/security/test_nft_sec_02_no_egress.py` — NFT-SEC-02 scenario.
- `e2e/tests/security/test_nft_sec_03_mavlink_signing.py` — NFT-SEC-03 scenario (AP-only).
- `e2e/tests/security/test_nft_sec_04_opencv_cve.py` — NFT-SEC-04 probe scenario (always-run).
- `e2e/tests/security/test_nft_sec_04_asan_fuzz.py` — NFT-SEC-04 fuzz scenario (release-gate).
- `e2e/tests/security/test_nft_sec_05_dns_blackhole.py` — NFT-SEC-05 scenario.
- `e2e/_unit_tests/helpers/test_cache_poisoning_evaluator.py` — 16 unit tests.
- `e2e/_unit_tests/helpers/test_egress_observer.py` — 14 unit tests.
- `e2e/_unit_tests/helpers/test_mavlink_signing_evaluator.py` — 18 unit tests.
- `e2e/_unit_tests/helpers/test_cve_probe_evaluator.py` — 11 unit tests.
- `e2e/_unit_tests/helpers/test_asan_fuzz_evaluator.py` — 16 unit tests.
- `_docs/03_implementation/reviews/batch_87_review.md` — per-batch code review.
- `_docs/03_implementation/reviews/cumulative_review_batches_85_87.md` — K=3 window cumulative review.
- `_docs/LESSONS.md` — agent-behaviour lesson (Jira transition IDs).

### Modified

- `e2e/_unit_tests/test_directory_layout.py` — registered 11 new paths (5 helpers + 6 scenarios).

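
The `asan_fuzz_evaluator.py` entry in the helper list describes a line-level classifier with a fallback category and a duration gate. A minimal sketch of that idea follows. The category subset, the regular expression, and the function names are assumptions made for illustration; this report does not list the helper's eight categories.

```python
import re
from enum import Enum
from typing import Optional


class Finding(Enum):
    # Illustrative subset only; the real helper's eight categories are not listed here.
    HEAP_BUFFER_OVERFLOW = "heap-buffer-overflow"
    STACK_BUFFER_OVERFLOW = "stack-buffer-overflow"
    HEAP_USE_AFTER_FREE = "heap-use-after-free"
    OTHER_FINDING = "other"


# ASan reports start with a line such as
# "==123==ERROR: AddressSanitizer: heap-buffer-overflow on address ...".
_ASAN_ERROR = re.compile(r"ERROR: AddressSanitizer: (\S+)")


def classify_line(line: str) -> Optional[Finding]:
    """Return a Finding for an ASan error line, or None for ordinary log output."""
    match = _ASAN_ERROR.search(line)
    if match is None:
        return None
    for finding in Finding:
        if finding.value == match.group(1):
            return finding
    return Finding.OTHER_FINDING


def passes(log_lines: list[str], duration_s: float, min_duration_s: float = 4 * 3600) -> bool:
    """Zero findings, and the run lasted at least the duration gate (4 h by default)."""
    findings = [f for f in map(classify_line, log_lines) if f is not None]
    return not findings and duration_s >= min_duration_s
```

Keeping unknown error kinds in a fallback bucket means a new sanitizer message still counts as a finding instead of being ignored.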
## Test Results

Per-batch unit tests:

```
$ pytest e2e/_unit_tests/helpers/test_cache_poisoning_evaluator.py \
         e2e/_unit_tests/helpers/test_egress_observer.py \
         e2e/_unit_tests/helpers/test_mavlink_signing_evaluator.py \
         e2e/_unit_tests/helpers/test_cve_probe_evaluator.py \
         e2e/_unit_tests/helpers/test_asan_fuzz_evaluator.py \
         e2e/_unit_tests/test_directory_layout.py
================ 215 passed in 0.25s ================
```

Full unit-test suite (regression check, run from workspace root):

```
$ pytest e2e/_unit_tests/
================ 1151 passed in 137.86s (0:02:17) ================
```

Scenario collection (36 cases — 6 scenarios × 6 (fc_adapter × vio_strategy) variants):

```
$ pytest e2e/tests/security/ --collect-only -p no:csv --evidence-out=/tmp/e2e-test-evidence
collected 36 items
```

Scenario smoke (all 36 skip cleanly with diagnostic messages):

```
36 skipped in 0.11s
```

Skip breakdown:
- 12 skip-on-`vins_mono` (conftest research-build-only rule from D-C1-1-SUB-A).
- 5 skip-on-canonical-only for NFT-SEC-01 (AC-4 default + the matching `vins_mono`-skipped vins variants).
- 6 skip-on-iNav for NFT-SEC-03 (AC-1).
- 4 skip-on-release-gate for NFT-SEC-04 ASan-fuzz.
- 9 skip-on-`sitl_replay_ready=False` (no `E2E_SITL_REPLAY_DIR` locally).

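
As a quick cross-check, the skip breakdown should account for every collected item. The tally below copies the counts from the list above; the dictionary keys are shorthand labels chosen for this sketch, not identifiers from the test suite.

```python
# Counts copied from the skip breakdown; the reason labels are shorthand for this sketch.
skip_counts = {
    "vins_mono": 12,
    "canonical_only_nft_sec_01": 5,
    "inav_nft_sec_03": 6,
    "release_gate_nft_sec_04_fuzz": 4,
    "sitl_replay_not_ready": 9,
}
collected = 6 * 6  # 6 scenarios x 6 (fc_adapter x vio_strategy) variants

total_skipped = sum(skip_counts.values())
assert total_skipped == collected, f"{collected - total_skipped} items unaccounted for"
print(total_skipped)
```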
## AC Verification

### AZ-436 / NFT-SEC-01

| AC | Coverage |
|----|----------|
| AC-1 N flights complete | `len(flights) < NFT_SEC_01_CI_MIN_FLIGHTS` gate + scenario `flight_count` NFR record |
| AC-2 poisoned-tile production | `passes_ratio` + `passes_layer_coverage` + `passes_rejection_reason_vocabulary` (3 sub-asserts) |
| AC-3 false-trust budget | `passes_budget` (zero-tolerance default — `count == 0`) + scenario `total_false_trust` / `budget` NFR records |
| AC-4 parameterization | canonical-only default + `E2E_NFT_SEC_01_RELEASE_GATE=1` for full matrix |

### AZ-437 / NFT-SEC-02 + NFT-SEC-05

| AC | Coverage |
|----|----------|
| NFT-SEC-02 AC-1 egress counter == 0 | `NoEgressReport.passes` + scenario AC-1 assert |
| NFT-SEC-05 AC-2 sidecar healthy | `DnsBlackholeReport.sidecar_healthy` + scenario AC-2 assert |
| NFT-SEC-05 AC-3a lookup fails | `passes_lookup` (NXDOMAIN / timeout / no-servers / other-failure) + scenario AC-3a assert |
| NFT-SEC-05 AC-3b UDP-53 silent | `passes_udp_silence` + scenario AC-3b assert |
| AC-4 parameterization | conftest matrix |

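
The checks in this table combine a before/after counter delta with a lookup-outcome classification. The sketch below shows one way these pieces could fit together. `DnsBlackholeCheck`, `LookupOutcome`, and `egress_delta` are hypothetical stand-ins rather than the API of `egress_observer.py`, and the sketch assumes that "UDP-53 silent" means the UDP-53 counter does not move between snapshots.

```python
from dataclasses import dataclass
from enum import Enum


class LookupOutcome(Enum):
    # Five outcomes, as in the helper description; the member names are assumptions.
    RESOLVED = "resolved"
    NXDOMAIN = "nxdomain"
    TIMEOUT = "timeout"
    NO_SERVERS = "no-servers"
    OTHER_FAILURE = "other-failure"


@dataclass(frozen=True)
class DnsBlackholeCheck:
    """Hypothetical stand-in for DnsBlackholeReport; the fields are assumptions."""
    sidecar_healthy: bool
    lookup_outcome: LookupOutcome
    udp53_before: int
    udp53_after: int

    @property
    def passes_lookup(self) -> bool:
        # Any failure mode is acceptable; only a successful resolution fails the check.
        return self.lookup_outcome is not LookupOutcome.RESOLVED

    @property
    def passes_udp_silence(self) -> bool:
        return self.udp53_after - self.udp53_before == 0

    @property
    def passes(self) -> bool:
        return self.sidecar_healthy and self.passes_lookup and self.passes_udp_silence


def egress_delta(before: dict[str, int], after: dict[str, int], allowed: str = "e2e-net") -> int:
    """Packets counted on any network other than `allowed` between two snapshots."""
    return sum(after.get(net, 0) - before.get(net, 0) for net in after if net != allowed)
```

With this shape, NFT-SEC-02 reduces to `egress_delta(before, after) == 0`, and NFT-SEC-05 to `DnsBlackholeCheck(...).passes`.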
### AZ-438 / NFT-SEC-03

| AC | Coverage |
|----|----------|
| AC-1 iNav SKIP | scenario-top guard on `fc_adapter == "inav"` |
| AC-2/3/4 per-sub-case rejection ≤500 ms + no position update | per-sub-case `passes_rejection` + `passes_no_position_update` (3 ACs × 2 sub-asserts) |
| AC-5 vio_strategy parameterization | conftest matrix |

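
The two sub-asserts per sub-case can be pictured as a latency check and a position comparison. The following sketch is an assumption-laden illustration: `SubCaseCapture` and its fields are hypothetical, only the `BAD_SIGNATURE` text is matched (the documented variants mentioned for the helper are not reproduced), and "no position update" is modelled as exact equality of the captured coordinates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubCaseCapture:
    """Hypothetical capture for one sub-case; all field names are assumptions."""
    name: str                          # "unsigned", "wrong-key" or "replayed"
    injected_at_ms: int
    statustext: Optional[str]          # first STATUSTEXT seen after injection
    statustext_at_ms: Optional[int]
    position_before: tuple[int, int]   # (lat, lon) as GLOBAL_POSITION_INT integers
    position_after: tuple[int, int]


def passes_rejection(capture: SubCaseCapture, deadline_ms: int = 500) -> bool:
    """BAD_SIGNATURE must appear within the deadline after injection."""
    if capture.statustext is None or capture.statustext_at_ms is None:
        return False
    latency = capture.statustext_at_ms - capture.injected_at_ms
    return "BAD_SIGNATURE" in capture.statustext and 0 <= latency <= deadline_ms


def passes_no_position_update(capture: SubCaseCapture) -> bool:
    """The injected message must not move the reported position."""
    return capture.position_before == capture.position_after
```

A missing STATUSTEXT is treated as a failure here, so a silently ignored injection cannot pass by default.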
### AZ-439 / NFT-SEC-04

| AC | Coverage |
|----|----------|
| AC-1a probe no crash | `passes_no_crash` + scenario AC-1a assert |
| AC-1b probe graceful outcome | `passes_graceful_outcome` + scenario AC-1b assert (rejects silent drops) |
| AC-2 ASan fuzz 0 findings ≥4 h | `passes_findings` + `passes_duration` + scenario AC-2 assert |
| AC-3 ASan fuzz ≥1000 corpus | `reached_corpus_floor` (informational only per spec; recorded in CSV, not asserted) |
| AC-4 parameterization | probe = full matrix; fuzz = ardupilot + per-vio only (justified inline to avoid duplicating a 4 h run) |

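
The probe rows (AC-1a and AC-1b) hinge on telling a graceful outcome apart from a silent drop. One possible shape for that classification is sketched below. The FDR record layout (`frame_id`, `status`) and the `ProbeOutcome` names are assumptions, and survival is approximated as "frames were still recorded after the probe frame".

```python
from enum import Enum


class ProbeOutcome(Enum):
    DECODE_SUCCESS = "decode-success"
    FRAME_DECODE_ERROR = "frame-decode-error"
    SILENT_DROP = "silent-drop"  # no FDR record at all for the probe frame


def classify_probe(fdr_records: list[dict], probe_frame_id: int) -> ProbeOutcome:
    """Classify the probe frame from FDR records (the record shape is an assumption)."""
    for record in fdr_records:
        if record.get("frame_id") == probe_frame_id:
            if record.get("status") == "ok":
                return ProbeOutcome.DECODE_SUCCESS
            return ProbeOutcome.FRAME_DECODE_ERROR
    return ProbeOutcome.SILENT_DROP


def passes_no_crash(fdr_records: list[dict], probe_frame_id: int) -> bool:
    """The SUT survived if it kept recording frames after the probe frame."""
    return any(r.get("frame_id", -1) > probe_frame_id for r in fdr_records)


def passes_graceful_outcome(outcome: ProbeOutcome) -> bool:
    """Either explicit outcome is fine; a silent drop counts as a defense bypass."""
    return outcome is not ProbeOutcome.SILENT_DROP
```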
`traces_to` markers:
- NFT-SEC-01: `AC-NEW-9,AC-1,AC-2,AC-3,AC-4`
- NFT-SEC-02: `AC-NEW-10,AC-1,AC-4`
- NFT-SEC-03: `AC-NEW-11,AC-1,AC-2,AC-3,AC-4,AC-5`
- NFT-SEC-04 probe: `RESTRICT-CVE-1,AC-1,AC-4`
- NFT-SEC-04 fuzz: `RESTRICT-CVE-1,AC-2,AC-3,AC-4`
- NFT-SEC-05: `AC-NEW-10,AC-2,AC-3,AC-4`

## Code Review

**Verdict**: PASS_WITH_WARNINGS — 0 Critical, 0 High, 0 Medium, 5 Low.

- **F1 (Low / Maintainability — carry-over)**: `write_csv_evidence` boilerplate continues to grow (13 helpers).
- **F2 (Low / Spec-Gap)**: DNS-blackhole sidecar referenced by NFT-SEC-05 but not deployed in `e2e/docker/docker-compose.test.yml`.
- **F3 (Low / Spec-Gap)**: AP MAVLink 2.0 signing handshake (AZ-416) must be triggered by AZ-595 fixture builder before NFT-SEC-03 replay can run end-to-end.
- **F4 (Low / Maintainability — carry-over)**: `_resolve_fixture_path` duplicated across 6 new scenarios.
- **F5 (Low / Design-aligned)**: NFT-SEC-04 ASan-fuzz AC-3 corpus floor is informational-only per task spec.

Full review: `_docs/03_implementation/reviews/batch_87_review.md`.

## Cumulative Review (Batches 85-87 — K=3 Window)

**Verdict**: PASS_WITH_WARNINGS. 5 cross-batch findings:

- **CR-F1 (Medium / Maintainability)**: 13 helpers each duplicate the `write_csv_evidence` pattern. Recommended PBI: shared `csv_evidence_writer.py` (3 pts).
- **CR-F2 (Medium / Maintainability)**: 13 scenarios each duplicate `_resolve_fixture_path`. Recommended PBI: shared `fixture_path.resolve()` (2 pts).
- **CR-F3 (Low / Spec-Gap)**: AZ-595 fixture builder doesn't exist as a tracked task; needs to materialize 13 JSON contracts. Recommended PBI: 5 pts.
- **CR-F4 (Low / Infrastructure-Gap)**: DNS-blackhole sidecar absent. Recommended PBI: 3 pts.
- **CR-F5 (Informational)**: full unit-test suite (1151 tests, ~138 s) runs green from workspace root.

Full cumulative review: `_docs/03_implementation/reviews/cumulative_review_batches_85_87.md`.

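
To make CR-F1 concrete, a shared writer could look roughly like the sketch below. This is a guess at the shape of the proposed `csv_evidence_writer.py`, not an existing module: the signature, the dataclass-row convention, and the `NfrRecord` example type are all assumptions. It uses only the standard library, in line with the batch's dependency constraints.

```python
import csv
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Iterable


def write_csv_evidence(rows: Iterable[object], out_path: Path) -> int:
    """Write dataclass instances as CSV rows; returns the number of data rows."""
    records = [asdict(row) for row in rows]
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        if not records:
            return 0  # an empty file is still created as evidence of the run
        writer = csv.DictWriter(handle, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)
    return len(records)


@dataclass(frozen=True)
class NfrRecord:
    """Hypothetical evidence row used only for this example."""
    scenario: str
    metric: str
    value: float
```

Each evaluator would then pass its own row dataclass to the shared function instead of carrying a private copy of the CSV boilerplate.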
## Production Dependencies

Surfaced for the traceability matrix + AZ-595:

1. **AZ-595 (fixture builder)**: emit:
   - `nft_sec_01_cache_poisoning.json` (per-flight cache + poisoned-tile slate + runner-collected `false_trust_events` + `rejection_reasons` counter);
   - `nft_sec_02_no_egress.json` (before/after Docker network stats snapshots);
   - `nft_sec_03_mavlink_signing.json` (3 injection timestamps + AP STATUSTEXT + GLOBAL_POSITION_INT captures);
   - `nft_sec_04_cve_probe.json` (probe_injected_at_ms + per-frame FDR record sequence);
   - `nft_sec_04_asan_fuzz.json` (ASan stderr log + duration + corpus size);
   - `nft_sec_05_dns_blackhole.json` (sidecar_healthy + lookup_outcome + UDP-53 before/after).
2. **AZ-444 (Tier-2 runner) — optional**: NFT-SEC-04 ASan-fuzz at Tier-2 (Jetson) per the same release-gate flag.
3. **e2e infrastructure**: DNS-blackhole sidecar service in `docker-compose.test.yml` per `environment.md`.
4. **AZ-416 (FT-P-09-AP) — already in `done/`**: AP MAVLink 2.0 signing handshake must run before AZ-595 generates the NFT-SEC-03 replay payload.
5. **SUT**: outbound `source_label` MUST carry `tile_id` for NFT-SEC-01 false-trust attribution; FDR MUST emit deterministic decode-success/error per frame for NFT-SEC-04 silent-drop detection.

## Architecture Compliance

- All new files under `e2e/`, owned by the Blackbox Tests cross-cutting component per `_docs/02_document/module-layout.md`.
- No imports from `src/gps_denied_onboard` (verified — only `runner.helpers.sitl_observer`, stdlib).
- No new cyclic dependencies. New evaluators are leaves of the import DAG.
- No new infrastructure libraries (stdlib `csv`, `dataclasses`, `enum`, `re`, `pathlib`, `math` only).

## Sub-step Trace

Phases executed per `implement/SKILL.md`:
- phase 5 (load-spec) → 4 task specs read
- phase 6 (implement-tasks-sequentially) → 5 helpers + 6 scenarios + 5 unit-test files for all 4 tasks
- phase 7 (verify-ac-coverage) → ACs traced above
- phase 8 (code-review) → batch_87_review.md (PASS_WITH_WARNINGS, 5 Low)
- phase 8.5 (cumulative-review) → cumulative_review_batches_85_87.md (PASS_WITH_WARNINGS, 5 cross-batch findings)
- phase 11 (commit-batch) → next.

## Notes on this batch

- A Jira transition mistake was made early in this batch: `id=31` was used for "In Progress", but in this workflow `id=31` means "Done". The mandatory read-back gate caught it, and the issue was re-transitioned with id `21` (verified correct via a `getTransitionsForJiraIssue` lookup). The lesson is recorded in `_docs/LESSONS.md`. No code or git artifacts were affected; only the tracker state changed, and it has been fully restored.
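
The lesson amounts to resolving a transition by name from a freshly fetched list instead of reusing a remembered numeric ID. A minimal sketch of that lookup is shown below. The payload shape (a list of dicts with `id` and `name`) and the function name are assumptions, and the fetch itself is left out.

```python
def resolve_transition_id(transitions: list[dict], target_name: str) -> str:
    """Find the transition ID for a status name in a freshly fetched list."""
    wanted = target_name.casefold()
    matches = [t["id"] for t in transitions if t["name"].casefold() == wanted]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one transition named {target_name!r}, found {len(matches)}"
        )
    return matches[0]


# Example payload mirroring the IDs mentioned in the note above.
fetched = [{"id": "21", "name": "In Progress"}, {"id": "31", "name": "Done"}]
print(resolve_transition_id(fetched, "In Progress"))
```

Failing loudly on zero or multiple matches keeps an ambiguous workflow from silently moving an issue to the wrong state.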