# Code Review — Batch 87 **Tasks**: AZ-436 + AZ-437 + AZ-438 + AZ-439 (NFT-SEC-01..05 security scenarios) **Total complexity**: 16 points (5 + 3 + 3 + 5) **Reviewer**: autodev / `implement` skill phase 8 **Date**: 2026-05-17 **Verdict**: PASS_WITH_WARNINGS (5 Low; 0 Medium / High / Critical) ## Phase 1 — Context Per `_dependencies_table.md` these four NFT-SEC tasks complete the security NFT slice of E-BBT (AZ-262). All five scenarios (cache poisoning, no-egress, MAVLink signing, OpenCV CVE probe, OpenCV CVE ASan fuzz, DNS blackhole) follow the established batch-85 / batch-86 "helper + fixture-consumer scenario + unit-test trio" pattern. Public-boundary discipline: every new helper module declares it in its module docstring; grep confirms none import `src/gps_denied_onboard`. ## Phase 2 — Spec Compliance | Task | AC | Coverage location | |------|----|-------------------| | AZ-436 | AC-1 (N flights complete) | `_parse_payload` + `len(flights) < NFT_SEC_01_CI_MIN_FLIGHTS` gate + scenario `flight_count` NFR record | | AZ-436 | AC-2 (poison ratio + layer coverage + rejection-reason vocabulary) | `passes_ratio` + `passes_layer_coverage` + `passes_rejection_reason_vocabulary` (3 sub-assertions) | | AZ-436 | AC-3 (false-trust budget zero-tolerance) | `passes_budget` + scenario `total_false_trust` + `budget` NFR records | | AZ-436 | AC-4 (canonical-only default; release-gate full matrix) | `_is_canonical_param` + `E2E_NFT_SEC_01_RELEASE_GATE` env var | | AZ-437 | AC-1 (NFT-SEC-02 0 packets to non-internal) | `evaluate_no_egress.passes` + scenario AC-1 assert | | AZ-437 | AC-2 (NFT-SEC-05 sidecar healthy) | `sidecar_healthy` field + scenario AC-2 assert | | AZ-437 | AC-3a (NFT-SEC-05 lookup fails) | `passes_lookup` + scenario AC-3a assert | | AZ-437 | AC-3b (NFT-SEC-05 UDP-53 silent) | `passes_udp_silence` + scenario AC-3b assert | | AZ-437 | AC-4 (parameterization) | conftest matrix unchanged; both scenarios run per (fc_adapter, vio_strategy) | | AZ-438 | AC-1 (iNav SKIP) | `if fc_adapter == "inav": pytest.skip(...)` | | AZ-438 | AC-2/3/4 (per-sub-case rejection ≤500ms + no position update) | per-sub-case `passes_rejection` + `passes_no_position_update` | | AZ-438 | AC-5 (vio_strategy parameterization) | conftest matrix | | AZ-439 | AC-1a (probe: no crash) | `passes_no_crash` (last FDR record ≥ probe injection) | | AZ-439 | AC-1b (probe: graceful outcome) | `passes_graceful_outcome` (decode-success OR frame-decode-error within ±50 ms tolerance) | | AZ-439 | AC-2 (fuzz: 0 ASan findings ≥4 h) | `passes_findings` + `passes_duration` + scenario AC-2 assert | | AZ-439 | AC-3 (fuzz: ≥1000 corpus) | `reached_corpus_floor` (informational only — recorded in CSV; not asserted, matches spec "informational only; no hard threshold") | | AZ-439 | AC-4 (parameterization) | probe = full matrix; fuzz = ardupilot + per-vio (avoids duplicating a 4 h run, justified inline) | Spec compliance: PASS. AC-3 on AZ-439 fuzz is correctly modeled as informational-only per the task spec. ## Phase 3 — Code Quality - All five helpers use frozen dataclasses for verdicts. Single-responsibility: evaluators do verdict logic only; scenarios do fixture I/O + assertions. - No `2>/dev/null`, bare `except`, or empty try-catch blocks anywhere. - Comment discipline upheld — only `# Arrange`/`# Act`/`# Assert` in tests; inline comments in helpers explain non-obvious intent (e.g. why ASan threshold patterns are explicitly enumerated rather than wildcard-matched). - All paths are typed; `Sequence` / `frozen=True` / `Enum` / `Path` annotations appear consistently. ## Phase 4 — Security Scan - No new credentials, secrets, or API keys introduced. - `cve_probe_evaluator` deliberately rejects "silent drop" outcomes as failure — preventing a regression where the SUT silently swallows malformed JPEGs. - `asan_fuzz_evaluator` classifies *any* unknown ASan match as `OTHER_FINDING` and still fails — preventing a regression where a future sanitizer category is silently accepted. - `egress_observer` treats SUCCESS DNS lookup as the only failing outcome. - `mavlink_signing_evaluator` requires BOTH rejection STATUSTEXT AND no-position-update for pass — catches signaling-only rejection bug class. - Test data (poisoned tile generators, BAD_SIGNATURE regex) does not contain exploit payloads, only string patterns + benign synthetic counters. Verdict: PASS. ## Phase 5 — Performance Scan - Evaluators are O(N) over their input collections; no quadratic loops or unbounded recursion. - No I/O in evaluator hot paths; CSV writing is one-shot at scenario end. - AZ-436 default N=1000 with single canonical param keeps the per-CI cost bounded; release-gate N=10000 × 6 params is correctly opt-in via env flag. - AZ-439 fuzz is correctly release-gated (≥4 h is too expensive for CI). Verdict: PASS. ## Phase 6 — Cross-Task Consistency - All six scenarios share an almost-identical structure: 1. tier-guard skip (where applicable), 2. `sitl_replay_ready` skip, 3. fixture-path resolution, 4. fixture-not-found → `pytest.fail` with explicit production-dep pointer, 5. payload parse → typed records (with `pytest.fail` on shape errors), 6. evaluator call → CSV evidence + NFR records, 7. AC assertions with diagnostic messages. - The five helpers share the `write_csv_evidence` pattern (one row per result + aggregate footer where applicable). Verdict: PASS (consistency upheld). ## Phase 7 — Architecture Compliance - All new files under `e2e/` — owned by the Blackbox Tests cross-cutting component per `_docs/02_document/module-layout.md`. - No `src/gps_denied_onboard` imports (verified by inspection of every new file's import section). - New evaluators are leaves of the import DAG — they import only stdlib + the existing `runner.helpers.sitl_observer` for fixture-path resolution. - No new infrastructure libraries; all helpers use stdlib `csv`, `dataclasses`, `enum`, `re`, `pathlib`, `math`. Verdict: PASS. ## Findings ### F1 — `write_csv_evidence` boilerplate continues to grow (Low / Maintainability — carry-over of batch-85 F4 + batch-86 F1) Each of the 5 new evaluators adds its own `write_csv_evidence(out_path, report) -> Path` with the same header-then-rows pattern. The duplication is now spread across 13 helpers (batches 85, 86, 87 combined). **Verdict**: defer to a future hygiene PBI; the per-evaluator schema variation makes a generic abstraction non-trivial (each report's row shape differs significantly — aggregate row vs per-sub-case, single row vs many). ### F2 — DNS-blackhole sidecar production dependency not yet realized (Low / Spec-Gap surfacing) The `NFT-SEC-05` scenario depends on a DNS-blackhole sidecar (per `environment.md`) that must be configured to absorb all UDP-53 probes. This sidecar does not exist in the e2e harness today (no service entry in `e2e/docker/docker-compose.test.yml` and no `dns-blackhole` directory under `e2e/fixtures/`). **Surfaced to AZ-595 + e2e infrastructure**: the DNS-blackhole sidecar must be wired before NFT-SEC-05 can run end-to-end with live captures. **Verdict**: not a code defect — surfaced as a production dependency in the batch report. ### F3 — AP MAVLink 2.0 signing handshake required for NFT-SEC-03 (Low / Spec-Gap surfacing) `NFT-SEC-03` assumes AP has the MAVLink 2.0 signing handshake completed (`SETUP_SIGNING` exchange + passkey installed) before the test runs. That handshake is owned by FT-P-09-AP (AZ-416 — already in `done/`); the NFT-SEC-03 fixture builder (AZ-595) must invoke that handshake first when generating its replay. **Verdict**: not a code defect — production dependency on AZ-595 noted in the scenario docstring. ### F4 — `_resolve_fixture_path` duplicated across 6 scenarios (Low / Maintainability) Carry-over of batch-85 F3 + batch-86 F4. The six new security scenarios each define their own `_resolve_fixture_path() -> Path` with identical logic differing only in env var name + default filename. **Verdict**: defer to a future hygiene PBI alongside the NFT-PERF and NFT-RES instances. ### F5 — ASan-fuzz AC-3 corpus floor is informational-only (Low / Spec-aligned) `AsanFuzzReport.reached_corpus_floor` is correctly computed but does NOT contribute to `passes` (and is NOT asserted in the scenario). This matches the task spec's "informational only; no hard threshold" wording exactly. **Verdict**: not a defect — flagged so a future reviewer doesn't mistake the read-only field for missing coverage. ## Auto-Fix Gate 5 Low findings; no Critical / High / Medium. Per the implement-skill auto-fix gate, no auto-fix actions are triggered. F1 + F4 are deferred to hygiene PBIs; F2 + F3 are production dependencies surfaced to the cumulative review + AZ-595 fixture builder; F5 is a documented design decision matching the spec. ## Final Verdict **PASS_WITH_WARNINGS** — proceed to commit + tracker transition.