Files
gps-denied-onboard/_docs/03_implementation/reviews/batch_87_review.md
T
Oleksandr Bezdieniezhnykh c56d4584e6 [AZ-436] [AZ-437] [AZ-438] [AZ-439] Add NFT-SEC-01..05 security scenarios
Batch 87: 6 NFT-SEC blackbox scenarios + 5 helper evaluators + 75 unit
tests + cumulative review batches 85-87.

* AZ-436 NFT-SEC-01: cache-poisoning safety budget (AC-NEW-9); aggregate
  false_trust_count ≤ N×1e-6; zero-tolerance default. Canonical-only by
  default; E2E_NFT_SEC_01_RELEASE_GATE=1 unlocks full matrix.
* AZ-437 NFT-SEC-02 + NFT-SEC-05: shared egress-observation evaluator
  (AC-NEW-10); SEC-02 = 0 packets to non-e2e-net over 5min replay;
  SEC-05 = DNS-blackhole sidecar healthy + lookup fails + UDP-53 silent.
* AZ-438 NFT-SEC-03: AP-only signing rejection (AC-NEW-11); 3 sub-cases
  (unsigned/wrong-key/replayed) each reject ≤500ms + no position drift.
* AZ-439 NFT-SEC-04: probe (always-run) = no-crash + deterministic
  decode outcome; ASan-fuzz (release-gate) = 0 findings ≥4h; AC-3
  corpus floor informational only per spec.

Verdict per-batch: PASS_WITH_WARNINGS (5 Low). Cumulative review for
batches 85-87 (K=3 window) also PASS_WITH_WARNINGS with 5 cross-batch
findings — recommends hygiene PBIs for write_csv_evidence duplication
(13 helpers) and _resolve_fixture_path duplication (13 scenarios), plus
new tickets for AZ-595 fixture builder + DNS-blackhole sidecar service.

Also adds _docs/LESSONS.md documenting the Jira transition-ID lesson
(always call getTransitionsForJiraIssue first, never memorize numeric
IDs across sessions).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-17 17:33:22 +03:00

8.8 KiB
Raw Blame History

Code Review — Batch 87

Tasks: AZ-436 + AZ-437 + AZ-438 + AZ-439 (NFT-SEC-01..05 security scenarios) Total complexity: 16 points (5 + 3 + 3 + 5) Reviewer: autodev / implement skill phase 8 Date: 2026-05-17 Verdict: PASS_WITH_WARNINGS (5 Low; 0 Medium / High / Critical)

Phase 1 — Context

Per _dependencies_table.md these four NFT-SEC tasks complete the security NFT slice of E-BBT (AZ-262). All five scenarios (cache poisoning, no-egress, MAVLink signing, OpenCV CVE probe, OpenCV CVE ASan fuzz, DNS blackhole) follow the established batch-85 / batch-86 "helper + fixture-consumer scenario + unit-test trio" pattern.

Public-boundary discipline: every new helper module declares it in its module docstring; grep confirms none import src/gps_denied_onboard.

Phase 2 — Spec Compliance

Task AC Coverage location
AZ-436 AC-1 (N flights complete) _parse_payload + len(flights) < NFT_SEC_01_CI_MIN_FLIGHTS gate + scenario flight_count NFR record
AZ-436 AC-2 (poison ratio + layer coverage + rejection-reason vocabulary) passes_ratio + passes_layer_coverage + passes_rejection_reason_vocabulary (3 sub-assertions)
AZ-436 AC-3 (false-trust budget zero-tolerance) passes_budget + scenario total_false_trust + budget NFR records
AZ-436 AC-4 (canonical-only default; release-gate full matrix) _is_canonical_param + E2E_NFT_SEC_01_RELEASE_GATE env var
AZ-437 AC-1 (NFT-SEC-02 0 packets to non-internal) evaluate_no_egress.passes + scenario AC-1 assert
AZ-437 AC-2 (NFT-SEC-05 sidecar healthy) sidecar_healthy field + scenario AC-2 assert
AZ-437 AC-3a (NFT-SEC-05 lookup fails) passes_lookup + scenario AC-3a assert
AZ-437 AC-3b (NFT-SEC-05 UDP-53 silent) passes_udp_silence + scenario AC-3b assert
AZ-437 AC-4 (parameterization) conftest matrix unchanged; both scenarios run per (fc_adapter, vio_strategy)
AZ-438 AC-1 (iNav SKIP) if fc_adapter == "inav": pytest.skip(...)
AZ-438 AC-2/3/4 (per-sub-case rejection ≤500ms + no position update) per-sub-case passes_rejection + passes_no_position_update
AZ-438 AC-5 (vio_strategy parameterization) conftest matrix
AZ-439 AC-1a (probe: no crash) passes_no_crash (last FDR record ≥ probe injection)
AZ-439 AC-1b (probe: graceful outcome) passes_graceful_outcome (decode-success OR frame-decode-error within ±50 ms tolerance)
AZ-439 AC-2 (fuzz: 0 ASan findings ≥4 h) passes_findings + passes_duration + scenario AC-2 assert
AZ-439 AC-3 (fuzz: ≥1000 corpus) reached_corpus_floor (informational only — recorded in CSV; not asserted, matches spec "informational only; no hard threshold")
AZ-439 AC-4 (parameterization) probe = full matrix; fuzz = ardupilot + per-vio (avoids duplicating a 4 h run, justified inline)

Spec compliance: PASS. AC-3 on AZ-439 fuzz is correctly modeled as informational-only per the task spec.

Phase 3 — Code Quality

  • All five helpers use frozen dataclasses for verdicts. Single-responsibility: evaluators do verdict logic only; scenarios do fixture I/O + assertions.
  • No 2>/dev/null, bare except, or empty try-catch blocks anywhere.
  • Comment discipline upheld — only # Arrange/# Act/# Assert in tests; inline comments in helpers explain non-obvious intent (e.g. why ASan threshold patterns are explicitly enumerated rather than wildcard-matched).
  • All paths are typed; Sequence / frozen=True / Enum / Path annotations appear consistently.

Phase 4 — Security Scan

  • No new credentials, secrets, or API keys introduced.
  • cve_probe_evaluator deliberately rejects "silent drop" outcomes as failure — preventing a regression where the SUT silently swallows malformed JPEGs.
  • asan_fuzz_evaluator classifies any unknown ASan match as OTHER_FINDING and still fails — preventing a regression where a future sanitizer category is silently accepted.
  • egress_observer treats SUCCESS DNS lookup as the only failing outcome.
  • mavlink_signing_evaluator requires BOTH rejection STATUSTEXT AND no-position-update for pass — catches signaling-only rejection bug class.
  • Test data (poisoned tile generators, BAD_SIGNATURE regex) does not contain exploit payloads, only string patterns + benign synthetic counters.

Verdict: PASS.

Phase 5 — Performance Scan

  • Evaluators are O(N) over their input collections; no quadratic loops or unbounded recursion.
  • No I/O in evaluator hot paths; CSV writing is one-shot at scenario end.
  • AZ-436 default N=1000 with single canonical param keeps the per-CI cost bounded; release-gate N=10000 × 6 params is correctly opt-in via env flag.
  • AZ-439 fuzz is correctly release-gated (≥4 h is too expensive for CI).

Verdict: PASS.

Phase 6 — Cross-Task Consistency

  • All six scenarios share an almost-identical structure:
    1. tier-guard skip (where applicable),
    2. sitl_replay_ready skip,
    3. fixture-path resolution,
    4. fixture-not-found → pytest.fail with explicit production-dep pointer,
    5. payload parse → typed records (with pytest.fail on shape errors),
    6. evaluator call → CSV evidence + NFR records,
    7. AC assertions with diagnostic messages.
  • The five helpers share the write_csv_evidence pattern (one row per result + aggregate footer where applicable).

Verdict: PASS (consistency upheld).

Phase 7 — Architecture Compliance

  • All new files under e2e/ — owned by the Blackbox Tests cross-cutting component per _docs/02_document/module-layout.md.
  • No src/gps_denied_onboard imports (verified by inspection of every new file's import section).
  • New evaluators are leaves of the import DAG — they import only stdlib + the existing runner.helpers.sitl_observer for fixture-path resolution.
  • No new infrastructure libraries; all helpers use stdlib csv, dataclasses, enum, re, pathlib, math.

Verdict: PASS.

Findings

F1 — write_csv_evidence boilerplate continues to grow (Low / Maintainability — carry-over of batch-85 F4 + batch-86 F1)

Each of the 5 new evaluators adds its own write_csv_evidence(out_path, report) -> Path with the same header-then-rows pattern. The duplication is now spread across 13 helpers (batches 85, 86, 87 combined).

Verdict: defer to a future hygiene PBI; the per-evaluator schema variation makes a generic abstraction non-trivial (each report's row shape differs significantly — aggregate row vs per-sub-case, single row vs many).

F2 — DNS-blackhole sidecar production dependency not yet realized (Low / Spec-Gap surfacing)

The NFT-SEC-05 scenario depends on a DNS-blackhole sidecar (per environment.md) that must be configured to absorb all UDP-53 probes. This sidecar does not exist in the e2e harness today (no service entry in e2e/docker/docker-compose.test.yml and no dns-blackhole directory under e2e/fixtures/).

Surfaced to AZ-595 + e2e infrastructure: the DNS-blackhole sidecar must be wired before NFT-SEC-05 can run end-to-end with live captures.

Verdict: not a code defect — surfaced as a production dependency in the batch report.

NFT-SEC-03 assumes AP has the MAVLink 2.0 signing handshake completed (SETUP_SIGNING exchange + passkey installed) before the test runs. That handshake is owned by FT-P-09-AP (AZ-416 — already in done/); the NFT-SEC-03 fixture builder (AZ-595) must invoke that handshake first when generating its replay.

Verdict: not a code defect — production dependency on AZ-595 noted in the scenario docstring.

F4 — _resolve_fixture_path duplicated across 6 scenarios (Low / Maintainability)

Carry-over of batch-85 F3 + batch-86 F4. The six new security scenarios each define their own _resolve_fixture_path() -> Path with identical logic differing only in env var name + default filename.

Verdict: defer to a future hygiene PBI alongside the NFT-PERF and NFT-RES instances.

F5 — ASan-fuzz AC-3 corpus floor is informational-only (Low / Spec-aligned)

AsanFuzzReport.reached_corpus_floor is correctly computed but does NOT contribute to passes (and is NOT asserted in the scenario). This matches the task spec's "informational only; no hard threshold" wording exactly.

Verdict: not a defect — flagged so a future reviewer doesn't mistake the read-only field for missing coverage.

Auto-Fix Gate

5 Low findings; no Critical / High / Medium. Per the implement-skill auto-fix gate, no auto-fix actions are triggered. F1 + F4 are deferred to hygiene PBIs; F2 + F3 are production dependencies surfaced to the cumulative review + AZ-595 fixture builder; F5 is a documented design decision matching the spec.

Final Verdict

PASS_WITH_WARNINGS — proceed to commit + tracker transition.