Security Tests
Authored by /test-spec Phase 2 (2026-05-19). Security tests validate blackbox-observable security properties derived from _docs/00_problem/security_approach.md and the AC operator-command rules. Code-level vulnerability scanning is out of scope at this layer (see deploy-time security audit Step 14 of the autodev flow).
Each scenario observes the SUT through its public surfaces only; pass criteria assert that an attack attempt produces no state change AND surfaces a structured-log entry / health signal — silent rejection is a test failure.
NFT-SEC-O9: Operator-command replay protection
Summary: An operator command envelope replayed within (or outside) the replay-protection window MUST be rejected; system state MUST NOT change; security WARN logged with reason replay.
Traces to: AC Operator Workflow — A replayed or unsigned operator command MUST be rejected with a logged security warning / O9, security principle Operator commands MUST be authenticated, signed, and replay-protected.
Tier: B + E.
Preconditions:
- SUT in steady state; a prior valid operator-confirm envelope already accepted.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Capture the valid envelope from the prior FT-P-016 run | envelope captured (sequence_id S, timestamp T) |
| 2 | Replay the exact same envelope a second time | SUT rejects at the boundary; no POST /missions/{id} observed; no mode change; counter operator_cmd_rejected_replay_total += 1; structured-log WARN with reason: "replay", sequence_id: S, originating_envelope_id recorded |
| 3 | Replay an envelope with sequence_id S but timestamp T+window+1s (outside replay window) | rejected as expired; counter operator_cmd_rejected_expired_total += 1; structured-log WARN reason expired |
Pass criteria: exact (state unchanged) AND substring (log contains "replay") for step 2; exact (state unchanged) AND substring (log contains "expired") for step 3.
Test status: DEFERRED — <DEFERRED: operator-envelopes (replayed) fixture; services.md §8 — blocked on Q9 operator-command auth scheme>. Until Q9 resolves, this scenario asserts only that a duplicate envelope at the byte level is rejected (placeholder behaviour); the full replay-window semantics land with Q9.
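The replay and expiry decisions in steps 2 and 3 can be sketched as a small filter. This is a minimal illustration only: the real scheme is blocked on Q9, and the `ReplayGuard` class, its `window_s` parameter, and the rule that freshness is checked before the duplicate check are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ReplayGuard:
    """Hypothetical replay filter: remembers accepted sequence ids and enforces a freshness window."""

    window_s: float
    seen: set = field(default_factory=set)

    def check(self, sequence_id: int, timestamp: float, now: float) -> str:
        """Return "accepted", "expired", or "replay" for one envelope."""
        # Assumption: an envelope whose timestamp is more than window_s away
        # from the receiver clock (in either direction) is reported as expired.
        if abs(now - timestamp) > self.window_s:
            return "expired"
        # Inside the window, a sequence id that was already accepted is a replay.
        if sequence_id in self.seen:
            return "replay"
        self.seen.add(sequence_id)
        return "accepted"
```

A production filter would also prune `seen` once entries fall outside the window; the sketch keeps every id for simplicity.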
NFT-SEC-O10: Operator-command signature validation
Summary: A malformed, unsigned, or wrongly signed operator command MUST be rejected with a logged reason (invalid_signature, unsigned, or unauthorised_signer); state MUST NOT change.
Traces to: AC O10, security principle Operator commands MUST be authenticated, signed, and replay-protected.
Tier: B + E.
Preconditions:
- SUT in steady state.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Send a malformed envelope (signature bytes flipped) | rejected; no state change; counter operator_cmd_rejected_signature_total += 1; structured-log WARN reason invalid_signature |
| 2 | Send an UNSIGNED envelope (signature field absent / zero) | rejected; counter increments; structured-log WARN reason unsigned |
| 3 | Send a well-formed envelope but signed with a key NOT in the operator's authorised set | rejected; counter increments; reason unauthorised_signer |
| 4 | Send a valid envelope (control case) | accepted; state changes as per the command type |
Pass criteria: steps 1–3 all exact (state unchanged) + substring (log contains "invalid"|"unsigned"|"unauthorised"); step 4 succeeds normally.
Test status: DEFERRED — <DEFERRED: operator-envelopes (malformed / unsigned / wrong-key); blocked on Q9>.
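The four steps above map onto an ordered set of checks. The sketch below uses HMAC-SHA256 from the Python standard library purely as a stand-in, because the actual operator-command auth scheme is undecided (Q9). The function name `classify_envelope`, the `key_id` field, and the `authorised_keys` mapping are hypothetical.

```python
import hashlib
import hmac


def classify_envelope(payload: bytes, key_id: str, signature, authorised_keys: dict) -> str:
    """Return "accepted" or the rejection reason used in the table above."""
    # Absent, empty, or all-zero signature bytes count as unsigned.
    if not signature or not any(signature):
        return "unsigned"
    # Assumption: the envelope names its signer and the SUT holds an allow-list of keys.
    key = authorised_keys.get(key_id)
    if key is None:
        return "unauthorised_signer"
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(expected, signature):
        return "invalid_signature"
    return "accepted"
```

Every non-"accepted" result is a rejection, so any input the checks cannot positively verify fails closed.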
NFT-SEC-CraftedFrame: Crafted RTSP frame → no decoder OOM / no crash
Summary: A crafted H.264/265 frame (oversize SPS, malformed NAL, truncated slice) MUST NOT crash or hang the SUT and MUST NOT consume unbounded memory. Frame is dropped with a counter increment.
Traces to: security principle Bounded input for any model call, RESTRICT On-device storage / RSS budgets.
Tier: B.
Preconditions:
- SUT in normal sweep mode; rtsp-loopback switched to a corpus of crafted clips.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Stream a fuzzed clip corpus (≥ 100 crafted frames) | each crafted frame dropped at decode; counter frame_decode_error_total increments per drop; structured-log WARN with reason: "decode_error" |
| 2 | Observe SUT process | RSS does NOT exceed 1.2 × baseline; no crash; no hang; gimbal & operator-stream still responsive within their normal latency budgets |
Pass criteria: exact (no crash); threshold_max (RSS ≤ 1.2 × baseline); counter consistent with crafted-frame count.
Test status: READY (crafted-clip corpus authorable inline using afl++ / honggfuzz output against a vanilla H.264 decoder; corpus stored in e2e/consumer/fixtures/fuzzed_clips/).
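The pass criteria for this scenario reduce to three checks on observed values. The helper below is a sketch of how a harness might evaluate them; `crafted_frame_failures` and its parameters are hypothetical, and "counter consistent with crafted-frame count" is interpreted here as strict equality, which is an assumption.

```python
def crafted_frame_failures(baseline_rss: int, peak_rss: int,
                           crafted_sent: int, decode_errors: int,
                           crashed: bool) -> list:
    """Return the list of violated pass criteria (empty list means pass)."""
    failures = []
    if crashed:
        failures.append("crash")
    # RSS <= 1.2 x baseline, written in integer arithmetic to avoid float rounding.
    if peak_rss * 10 > baseline_rss * 12:
        failures.append("rss_over_budget")
    # Assumption: one frame_decode_error_total increment per crafted frame.
    if decode_errors != crafted_sent:
        failures.append("counter_mismatch")
    return failures
```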
NFT-SEC-OversizeCrop: Bounded crop enforcement
Summary: An attempt to submit an oversize ROI crop (above the configured max bytes or outside the format allow-list) to any onboard model entry point MUST be rejected at the boundary; downstream models MUST NOT be invoked.
Traces to: security principle Bounded input for any model call.
Tier: B.
Preconditions:
- SUT with Tier-2 + Tier-3 enabled.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Submit a 5000 × 5000 PNG (above the configured 1024 × 1024 cap) to the Tier-2 ROI entry | rejected; Tier-2 inference NOT invoked (verified via tier2_inference_total counter unchanged); structured-log WARN reason: "roi_too_large" |
| 2 | Submit a BMP (not in the allow-list) | rejected; reason roi_format_not_allowed |
| 3 | Submit a well-formed 640×640 JPEG (control) | accepted; Tier-2 invoked normally |
Pass criteria: exact (downstream model not invoked) for steps 1–2; exact (downstream invoked) for step 3.
Test status: READY (oversize PNG + BMP generated inline).
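A boundary check of this kind can be sketched as a validation step that runs before the model is ever called. The 1024 × 1024 cap comes from step 1; the allow-list contents (JPEG and PNG) and the names `validate_roi` and `submit_roi` are assumptions for illustration.

```python
MAX_SIDE_PX = 1024  # cap quoted in step 1
ALLOWED_FORMATS = frozenset({"JPEG", "PNG"})  # assumed allow-list; BMP is excluded per step 2


def validate_roi(width: int, height: int, fmt: str):
    """Return None when the crop is acceptable, else the rejection reason."""
    if fmt.upper() not in ALLOWED_FORMATS:
        return "roi_format_not_allowed"
    if width > MAX_SIDE_PX or height > MAX_SIDE_PX:
        return "roi_too_large"
    return None


def submit_roi(width: int, height: int, fmt: str, infer):
    """Invoke the downstream model only after the boundary check passes."""
    reason = validate_roi(width, height, fmt)
    if reason is not None:
        return ("rejected", reason)
    return ("accepted", infer(width, height, fmt))
```

Because `infer` is only reachable on the accepted path, an unchanged invocation counter after a rejected submission is the observable the test relies on.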
NFT-SEC-VlmSchemaViolation: VLM schema-violation fails closed
Summary: When the Tier-3 VLM returns a response that fails schema validation (missing required field, wrong type, truncated JSON), the SUT MUST discard the assessment AND the POI MUST NOT receive the deep-analysis upgrade.
Traces to: security principle Schema validation for any non-deterministic model output … Schema violation MUST fail closed.
Tier: B.
Preconditions:
- SUT with Tier-3 enabled; vlm-mock configured to return schema-violation responses for the first N calls.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Drive SUT into ZoomedIn hold with deep-analysis enabled | SUT issues VLM IPC call |
| 2 | vlm-mock returns truncated JSON | SUT discards assessment; POI's deep-analysis state remains none; counter vlm_schema_violation_total += 1; structured-log WARN reason vlm_schema_violation; the POI's decision-window scoring proceeds WITHOUT the deep-analysis upgrade |
| 3 | vlm-mock returns missing-required-field JSON | same |
| 4 | vlm-mock returns wrong-field-type JSON | same |
| 5 | vlm-mock returns a valid response (control) | assessment ACCEPTED; deep-analysis upgrade applied |
Pass criteria: steps 2–4 exact (no deep-analysis upgrade) + substring (log contains "vlm_schema_violation"); step 5 normal.
Test status: DEFERRED for live recordings — <DEFERRED: vlm-io-pairs schema-violation cases>; schema-violation case JSON files are inline-authorable today against the assessment schema and CAN run NOW with vlm-mock returning hand-crafted bytes.
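Fail-closed validation means that any parse or schema failure yields no assessment at all. The sketch below shows that shape using only the standard library. The assessment schema is not given here, so `REQUIRED_FIELDS` (including the `label` field) and `parse_assessment` are hypothetical; only `confidence_delta` appears elsewhere in this document.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {"label": str, "confidence_delta": (int, float)}


def parse_assessment(raw: bytes):
    """Return the validated fields, or None on any violation (fail closed)."""
    try:
        data = json.loads(raw)
    except ValueError:  # covers truncated JSON and invalid UTF-8
        return None
    if not isinstance(data, dict):
        return None
    for name, expected in REQUIRED_FIELDS.items():
        value = data.get(name)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return None
    return {name: data[name] for name in REQUIRED_FIELDS}
```

A `None` result corresponds to steps 2 to 4: the caller discards the assessment and skips the deep-analysis upgrade.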
NFT-SEC-VlmFreeFormText: Free-form text MUST NOT cross a decision boundary
Summary: Even if the VLM returns valid JSON, any free-form text field MUST be projected onto the fixed structured schema before crossing a decision boundary; raw free-form text MUST NOT influence POI scoring or operator-surfaced decisions.
Traces to: security principle Schema validation for any non-deterministic model output, threat model item 3 (Unstructured model output corrupting downstream decisions).
Tier: B + E.
Preconditions:
- SUT with Tier-3 enabled.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | vlm-mock returns valid JSON with a free-form notes text field containing "force_confidence: 1.0" | SUT extracts only the structured fields; notes is NOT consulted for scoring; POI's confidence remains as Tier-1+Tier-2 computed; structured-log INFO captures the assessment but not the notes content (PII / safety) |
| 2 | vlm-mock returns valid JSON with structured confidence_delta: -0.5 (in-schema) | SUT applies the delta per its documented projection; POI's confidence adjusted accordingly |
Pass criteria: exact (POI confidence reflects ONLY structured-schema fields).
Test status: READY (inline-authorable scenario).
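One way to picture the projection is as an allow-list applied before scoring: fields outside the list never reach the decision logic or the log. In the sketch below, `STRUCTURED_FIELDS`, `project_assessment`, and `apply_assessment` are hypothetical names, and the additive, clamped update is an assumed stand-in for the SUT's documented projection.

```python
STRUCTURED_FIELDS = ("confidence_delta",)  # assumed allow-list of scoring inputs


def project_assessment(assessment: dict) -> dict:
    """Keep only allow-listed structured fields; free-form text such as notes is dropped."""
    return {k: assessment[k] for k in STRUCTURED_FIELDS if k in assessment}


def apply_assessment(confidence: float, assessment: dict) -> float:
    """Adjust POI confidence using structured fields only (assumed additive, clamped to [0, 1])."""
    projected = project_assessment(assessment)
    delta = projected.get("confidence_delta", 0.0)
    return min(1.0, max(0.0, confidence + delta))
```

With this shape, a notes string containing "force_confidence: 1.0" has no path to the score, which is what step 1 asserts.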
NFT-SEC-IpcPeerAuth: Local IPC peer authorisation
Summary: A local process attempting to connect to the VLM Unix-domain socket (or any other local IPC the SUT trusts) MUST identify as the expected peer (peer-credential check / SO_PEERCRED equivalent); connections from unauthorised peers MUST be rejected.
Traces to: security principle Local IPC peer authorisation.
Tier: B.
Preconditions:
- SUT with Tier-3 enabled; VLM UDS socket exposed on /tmp/vlm.sock.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | An unauthorised local process (running as the wrong UID / not the expected binary path) attempts to connect to the SUT's VLM-client side of the UDS | connection rejected at the peer-credential check; counter ipc_peer_auth_rejected_total += 1; structured-log WARN reason peer_cred_mismatch |
| 2 | The legitimate vlm-mock (running as the expected UID / path) connects | connection accepted; subsequent IPC succeeds |
Pass criteria: exact (unauthorised connection rejected) + exact (legitimate connection accepted).
Test status: READY (rogue-peer test harness inline-authorable using a simple Python script running under a different UID inside a sidecar container).
NFT-SEC-Tier1SchemaViolation: Tier-1 detection-stream schema violation
Summary: A Detections record from ../detections that violates the normalised-box schema (coord out of [0,1], invalid class_id) MUST cause the frame's detections to be dropped (not partially used); counter increments; structured-log WARN. SUT does not crash and continues with subsequent frames.
Traces to: security principle No silent error swallowing for security-relevant failures (extends to peer schema violations) + AC D6 (normalised-box conformance).
Tier: B.
Preconditions:
- SUT in normal sweep mode; detections-mock configured to emit schema-violating records interleaved with valid ones.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Mock emits Detections for frame N with bbox x2 = 1.5 (coord > 1.0) | frame N's detections dropped; counter tier1_invalid_frame_total += 1; structured-log WARN with field: "x2", value: 1.5 |
| 2 | Mock emits Detections for frame N with class_id = 99 (not in 0..18) | dropped; reason class_id_out_of_range |
| 3 | Mock emits valid Detections for frame N+1 | processed normally |
Pass criteria: exact (no operator-stream emission for frame N) + exact (counter incremented per dropped frame).
Test status: READY (inline-authorable injection by detections-mock).
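The all-or-nothing drop rule can be sketched as a validator that reports the first violation in a frame; any violation discards the whole frame. The box field names other than x2, the reason string `coord_out_of_range`, and the function name `validate_detections` are assumptions. The class range 0..18 comes from step 2.

```python
COORD_FIELDS = ("x1", "y1", "x2", "y2")  # only x2 is named in the scenario; the rest are assumed
CLASS_IDS = range(0, 19)  # 0..18 inclusive


def validate_detections(boxes: list):
    """Return None if every box conforms, else a dict describing the first violation."""
    for box in boxes:
        for name in COORD_FIELDS:
            value = box[name]
            # The chained comparison also rejects NaN, which compares false to everything.
            if not 0.0 <= value <= 1.0:
                return {"reason": "coord_out_of_range", "field": name, "value": value}
        if box["class_id"] not in CLASS_IDS:
            return {"reason": "class_id_out_of_range", "field": "class_id", "value": box["class_id"]}
    return None
```

A non-None result means the caller drops every detection for that frame, increments the counter once, and logs the returned field and value.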
NFT-SEC-MavlinkUnsigned: Optional MAVLink-2 signing enforcement
Summary: When MAVLink-2 message signing is configured ON (per Q6 once resolved), unsigned messages on the airframe link MUST be dropped with a security WARN; signed messages flow normally. When signing is OFF (current default until Q6), no signing assertion runs.
Traces to: security principle Airframe MAVLink integrity (Q6).
Tier: B + E.
Preconditions:
- SUT configured with MAVLink-2 signing ENABLED (test profile). mavlink-sitl configured to send a mix of signed and unsigned messages.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | mavlink-sitl sends a valid signed message | accepted; processed normally |
| 2 | mavlink-sitl sends an unsigned message | dropped; counter mavlink_unsigned_dropped_total += 1; structured-log WARN reason mavlink_unsigned; airframe-link health unaffected for an isolated drop |
| 3 | Sustained unsigned-only stream | airframe-link health flips red after the configured tolerance window (same threshold as R7 retry exhaustion) |
Pass criteria: exact (unsigned dropped) + exact (signed accepted); sustained-unsigned escalates per the documented threshold.
Test status: DEFERRED — <DEFERRED: Q6 (MAVLink-2 message signing decision)>. When Q6 lands and signing is mandated, this scenario becomes READY.
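The drop-and-escalate behaviour can be sketched with a gate that inspects the MAVLink 2 header. In MAVLink 2 the frame starts with 0xFD and the third byte carries the incompatibility flags, where bit 0x01 marks a signed frame. The sketch only inspects that flag and does not verify the signature itself. The `SigningGate` class, its time-based `tolerance_s`, and the "green"/"red" strings are assumptions.

```python
MAVLINK2_STX = 0xFD   # MAVLink 2 start-of-frame marker
IFLAG_SIGNED = 0x01   # incompat_flags bit indicating a signed frame


def is_signed_mavlink2(frame: bytes) -> bool:
    """True if the frame is MAVLink 2 and its signed flag is set (signature not verified here)."""
    return len(frame) >= 3 and frame[0] == MAVLINK2_STX and bool(frame[2] & IFLAG_SIGNED)


class SigningGate:
    """Hypothetical gate: drops unsigned frames and tracks sustained unsigned-only traffic."""

    def __init__(self, tolerance_s: float):
        self.tolerance_s = tolerance_s
        self.unsigned_dropped_total = 0
        self.unsigned_since = None  # start time of the current unsigned-only run

    def on_frame(self, frame: bytes, now: float) -> bool:
        """Return True if the frame is accepted, False if dropped."""
        if is_signed_mavlink2(frame):
            self.unsigned_since = None
            return True
        self.unsigned_dropped_total += 1
        if self.unsigned_since is None:
            self.unsigned_since = now
        return False

    def health(self, now: float) -> str:
        """Red once unsigned-only traffic has lasted for the tolerance window."""
        if self.unsigned_since is not None and now - self.unsigned_since >= self.tolerance_s:
            return "red"
        return "green"
```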
NFT-SEC-HealthExposesSecurity: Health endpoint surfaces security state
Summary: The /health endpoint MUST reflect security state — repeated operator-command signature failures, repeated peer-credential mismatches, repeated schema-violation rates all MUST be visible to ops.
Traces to: security principle Health endpoint MUST reflect security state.
Tier: B.
Preconditions:
- SUT in steady state; counters baselined.
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Drive sustained signature-failure rate (10 / s) for 10 s via the NFT-SEC-O10 flow | GET /health exposes a security sub-object that includes operator_cmd_rejected_signature_rate_60s non-zero; if rate exceeds the configured alert threshold, the security sub-object transitions to yellow |
| 2 | Drive sustained peer-credential-mismatch attempts (1 / s) for 60 s via NFT-SEC-IpcPeerAuth | security.ipc_peer_auth_rejected_rate_60s non-zero; transitions to yellow at threshold |
| 3 | Drive sustained Tier-1 schema-violation rate (1 / s) via NFT-SEC-Tier1SchemaViolation | security.tier1_invalid_rate_60s non-zero |
Pass criteria: exact (health.security exposes each rate) + exact (transition to yellow at threshold).
Test status: READY.
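The `*_rate_60s` fields suggest a sliding-window rate per counter. The sketch below shows one way such a field and its yellow transition could be computed; the `RollingRate` class, the events-per-second unit, and the threshold value are assumptions, since the alert threshold is configuration that is not specified here.

```python
from collections import deque


class RollingRate:
    """Hypothetical sliding-window event rate with a yellow alert threshold."""

    def __init__(self, window_s: float = 60.0, yellow_threshold: float = 1.0):
        self.window_s = window_s
        self.yellow_threshold = yellow_threshold  # assumed unit: events per second
        self.events = deque()

    def record(self, now: float) -> None:
        """Register one rejected event at time now."""
        self.events.append(now)

    def rate(self, now: float) -> float:
        """Events per second averaged over the trailing window."""
        while self.events and now - self.events[0] > self.window_s:
            self.events.popleft()
        return len(self.events) / self.window_s

    def status(self, now: float) -> str:
        return "yellow" if self.rate(now) >= self.yellow_threshold else "green"
```

Step 1 (10 rejections per second for 10 s) corresponds to 100 recorded events, which this sketch reports as a non-zero rate and, with the assumed threshold of 1.0, as yellow.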
Out of scope at this layer
Per security_approach.md → "Out of scope", the following are NOT covered by blackbox security tests because they are owned elsewhere in the suite:
- Modem-link encryption setup (radio layer below autopilot).
- Suite-wide TLS / certificate provisioning (suite-level deployment, ../_infra/).
- OTA update signing (Watchtower; autopilot consumes signed images only). Boot-time self-check + rollback is Q10; when it lands, it becomes a new scenario here.
- Annotation / training-data security (../ai-training repo).
- Operator browser UI auth (Ground Station owns it; only the modem-side handshake is jointly specified per Q9, covered by O8/O9/O10).
- Multi-operator session policy (Q11 — when it lands, becomes a new scenario here).
Common assertions
- No silent rejection. Every rejected security event MUST produce both a counter increment AND a structured-log entry at WARN+. A rejection that occurs silently is a TEST FAILURE.
- Fail-closed everywhere. When an authentication / signature / schema check is uncertain, the SUT MUST fail closed (reject) rather than fail open. Tests assert this by sending borderline / ambiguous inputs and checking for rejection.
- No information leak in error paths. Error responses (where the SUT exposes any to the operator-stream or health endpoint) MUST NOT leak the rejected payload contents beyond the minimum needed for ops to triage. Tests inspect log/health output for absence of crafted-payload byte sequences.
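The first and third common assertions can be combined into one reusable harness check. The sketch below assumes structured logs arrive as JSON lines with `level` and `reason` keys, which this document does not specify; `rejection_failures` and its parameters are hypothetical.

```python
import json

WARN_OR_ABOVE = {"WARN", "ERROR"}  # assumed level names


def rejection_failures(counter_before: int, counter_after: int,
                       log_lines: list, reason: str, crafted_marker: str) -> list:
    """Return the violated common assertions for one rejected event (empty list means pass)."""
    failures = []
    # No silent rejection, part 1: the counter must have moved.
    if counter_after <= counter_before:
        failures.append("counter_not_incremented")
    # No silent rejection, part 2: a WARN+ entry naming the reason must exist.
    entries = [json.loads(line) for line in log_lines]
    if not any(e.get("level") in WARN_OR_ABOVE and reason in str(e.get("reason", ""))
               for e in entries):
        failures.append("no_warn_log_entry")
    # No information leak: the crafted payload marker must not appear in any log line.
    if any(crafted_marker in line for line in log_lines):
        failures.append("payload_leaked")
    return failures
```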