Oleksandr Bezdieniezhnykh bc40ea7300 [AZ-626] Decompose complete: 47 tasks + docs + module layout
Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy
Qt/C++ to a Rust workspace.

- Remove legacy Qt/C++ tree (ai_controller, drone_controller,
  misc/camera, python_scaffold, root Dockerfile, autopilot.pro,
  legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria,
  security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model,
  glossary, decision-rationale, deployment, 13 component descriptions,
  tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one
  bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for
  canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed,
  ready for Step 7 Implement).

Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks
AZ-640..AZ-686. Total complexity 173 points across 12 epics.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-19 11:02:01 +03:00


Security Tests

Authored by /test-spec Phase 2 (2026-05-19). Security tests validate blackbox-observable security properties derived from _docs/00_problem/security_approach.md and the AC operator-command rules. Code-level vulnerability scanning is out of scope at this layer (see the deploy-time security audit, Step 14 of the autodev flow).

Each scenario observes the SUT through its public surfaces only; pass criteria assert that an attack attempt produces no state change AND surfaces a structured-log entry / health signal — silent rejection is a test failure.


NFT-SEC-O9: Operator-command replay protection

Summary: An operator command envelope replayed within (or outside) the replay-protection window MUST be rejected; system state MUST NOT change; security WARN logged with reason replay. Traces to: AC Operator Workflow — A replayed or unsigned operator command MUST be rejected with a logged security warning / O9, security principle Operator commands MUST be authenticated, signed, and replay-protected. Tier: B + E.

Preconditions:

  • SUT in steady state; a prior valid operator-confirm envelope already accepted.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Capture the valid envelope from the prior FT-P-016 run | envelope captured (sequence_id S, timestamp T) |
| 2 | Replay the exact same envelope a second time | SUT rejects at the boundary; no POST /missions/{id} observed; no mode change; counter operator_cmd_rejected_replay_total += 1; structured-log WARN with reason: "replay", sequence_id: S, originating_envelope_id recorded |
| 3 | Replay an envelope with sequence_id S but timestamp T+window+1s (outside replay window) | rejected as expired; counter operator_cmd_rejected_expired_total += 1; structured-log WARN reason expired |

Pass criteria: exact (state unchanged) AND substring (log contains "replay") for step 2; exact (state unchanged) AND substring (log contains "expired") for step 3. Test status: DEFERRED — <DEFERRED: operator-envelopes (replayed) fixture; services.md §8 — blocked on Q9 operator-command auth scheme>. Until Q9 resolves, this scenario asserts only that a duplicate envelope at the byte level is rejected (placeholder behaviour); the full replay-window semantics land with Q9.
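
The expected rejections in steps 2 and 3 can be expressed as a small reference model for the test harness. This is a minimal sketch, not the SUT's implementation: the byte-level duplicate check mirrors the placeholder behaviour described above, while `ReplayGuard`, `window_s`, and the order of the two checks are assumptions that Q9 may overturn.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ReplayGuard:
    """Reference model of the expected verdicts; hypothetical, pending Q9."""
    window_s: float                          # assumed replay-window length in seconds
    seen: set = field(default_factory=set)   # digests of envelopes already accepted

    def check(self, envelope: bytes, timestamp: float, now: float) -> str:
        """Return "accepted", "replay", or "expired" for one envelope."""
        # Assumed order: the window check runs before the duplicate lookup.
        if abs(now - timestamp) > self.window_s:
            return "expired"
        digest = hashlib.sha256(envelope).hexdigest()
        if digest in self.seen:
            return "replay"                  # byte-identical to an accepted envelope
        self.seen.add(digest)
        return "accepted"
```

A harness could feed the captured envelope through the model twice and compare its verdicts with the reasons found in the SUT's structured log.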


NFT-SEC-O10: Operator-command signature validation

Summary: A malformed, unsigned, or wrongly signed operator command MUST be rejected with a logged reason (invalid_signature, unsigned, or unauthorised_signer); state MUST NOT change. Traces to: AC O10, security principle Operator commands MUST be authenticated, signed, and replay-protected. Tier: B + E.

Preconditions:

  • SUT in steady state.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Send a malformed envelope (signature bytes flipped) | rejected; no state change; counter operator_cmd_rejected_signature_total += 1; structured-log WARN reason invalid_signature |
| 2 | Send an UNSIGNED envelope (signature field absent / zero) | rejected; counter increments; structured-log WARN reason unsigned |
| 3 | Send a well-formed envelope but signed with a key NOT in the operator's authorised set | rejected; counter increments; reason unauthorised_signer |
| 4 | Send a valid envelope (control case) | accepted; state changes as per the command type |

Pass criteria: steps 1-3 all exact (state unchanged) + substring (log contains "invalid"|"unsigned"|"unauthorised"); step 4 succeeds normally. Test status: DEFERRED - <DEFERRED: operator-envelopes (malformed / unsigned / wrong-key); blocked on Q9>.
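
Until the Q9 fixtures exist, the four cases can be prototyped against a stand-in scheme. The sketch below uses HMAC-SHA256 purely as a placeholder, because the real operator-command auth scheme is undecided. `AUTHORISED_KEYS`, `sign`, and `classify` are hypothetical names, and carrying a `key_id` alongside the envelope is an assumption.

```python
import hashlib
import hmac

# Hypothetical key ring: key_id -> shared secret. HMAC is a stand-in for the Q9 scheme.
AUTHORISED_KEYS = {"operator-1": b"demo-secret-1"}


def sign(key: bytes, payload: bytes) -> bytes:
    """Produce the placeholder signature for a payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def classify(payload: bytes, key_id: str, signature: bytes) -> str:
    """Return "accepted" or the rejection reason the scenario expects in the log."""
    if not signature or not any(signature):       # field absent or all-zero
        return "unsigned"
    key = AUTHORISED_KEYS.get(key_id)
    if key is None:                               # signer not in the authorised set
        return "unauthorised_signer"
    if not hmac.compare_digest(sign(key, payload), signature):
        return "invalid_signature"                # e.g. flipped signature bytes
    return "accepted"
```

Flipping every byte of a valid signature, as in `bytes(b ^ 0xFF for b in good)`, gives the step 1 fixture; an empty or zeroed field gives step 2.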


NFT-SEC-CraftedFrame: Crafted RTSP frame → no decoder OOM / no crash

Summary: A crafted H.264/265 frame (oversize SPS, malformed NAL, truncated slice) MUST NOT crash or hang the SUT and MUST NOT consume unbounded memory. Frame is dropped with a counter increment. Traces to: security principle Bounded input for any model call, RESTRICT On-device storage / RSS budgets. Tier: B.

Preconditions:

  • SUT in normal sweep mode; rtsp-loopback switched to a corpus of crafted clips.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Stream a fuzzed clip corpus (≥ 100 crafted frames) | each crafted frame dropped at decode; counter frame_decode_error_total increments per drop; structured-log WARN with reason: "decode_error" |
| 2 | Observe SUT process RSS | does NOT exceed 1.2 × baseline; no crash; no hang; gimbal & operator-stream still responsive within their normal latency budgets |

Pass criteria: exact (no crash); threshold_max (RSS ≤ 1.2 × baseline); counter consistent with crafted-frame count. Test status: READY (crafted-clip corpus authorable inline using afl++ / honggfuzz output against a vanilla H.264 decoder; corpus stored in e2e/consumer/fixtures/fuzzed_clips/).
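
The three pass criteria can be collapsed into one evaluator in the harness. This is a sketch under assumptions: the function and parameter names are hypothetical, and "counter consistent with crafted-frame count" is read here as strict equality, which the spec does not state outright.

```python
def evaluate_crafted_frame_run(baseline_rss: int, peak_rss: int,
                               crafted_frames: int, decode_errors: int,
                               process_alive: bool, rss_factor: float = 1.2) -> list:
    """Return the failed pass criteria; an empty list means the run passed."""
    failures = []
    if not process_alive:
        failures.append("crash")                  # exact: no crash
    if peak_rss > rss_factor * baseline_rss:
        failures.append("rss_exceeded")           # threshold_max: RSS <= 1.2 x baseline
    if decode_errors != crafted_frames:
        failures.append("counter_mismatch")       # assumed: one drop per crafted frame
    return failures
```

Returning every failed criterion, rather than stopping at the first, keeps a single fuzz run informative when more than one property breaks.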


NFT-SEC-OversizeCrop: Bounded crop enforcement

Summary: An attempt to submit an oversize ROI crop (above the configured max bytes or outside the format allow-list) to any onboard model entry point MUST be rejected at the boundary; downstream models MUST NOT be invoked. Traces to: security principle Bounded input for any model call. Tier: B.

Preconditions:

  • SUT with Tier-2 + Tier-3 enabled.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Submit a 5000 × 5000 PNG (above the configured 1024 × 1024 cap) to the Tier-2 ROI entry | rejected; Tier-2 inference NOT invoked (verified via tier2_inference_total counter unchanged); structured-log WARN reason: "roi_too_large" |
| 2 | Submit a BMP (not in the allow-list) | rejected; reason roi_format_not_allowed |
| 3 | Submit a well-formed 640×640 JPEG (control) | accepted; Tier-2 invoked normally |

Pass criteria: exact (downstream model not invoked) for steps 1-2; exact (downstream invoked) for step 3. Test status: READY (oversize PNG + BMP generated inline).
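
A boundary check of this kind can decide from the header alone, before any pixel data is decoded. The sketch below is illustrative only. The allow-list contents and the `MAX_BYTES` value are assumptions, since the spec only quotes the 1024 × 1024 cap and states that BMP is excluded. The `roi_unreadable` reason is hypothetical and covers headers that cannot be parsed.

```python
import struct

MAX_SIDE = 1024                     # cap quoted in step 1
MAX_BYTES = 4 * 1024 * 1024         # hypothetical byte budget
ALLOWED_FORMATS = {"png", "jpeg"}   # assumed allow-list


def sniff_format(data: bytes) -> str:
    """Identify the container from its magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"BM"):
        return "bmp"
    return "unknown"


def png_size(data: bytes):
    # IHDR is the first chunk; width and height are big-endian uint32 at bytes 16..24.
    if len(data) < 24 or data[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", data[16:24])


def jpeg_size(data: bytes):
    # Walk marker segments until a start-of-frame (SOF) segment is found.
    pos = 2
    while pos + 9 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if 0xC0 <= marker <= 0xCF and marker not in (0xC4, 0xC8, 0xCC):
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height
        pos += 2 + seg_len
    return None


def check_roi(data: bytes) -> str:
    """Return "accepted" or a rejection reason; unreadable headers fail closed."""
    fmt = sniff_format(data)
    if fmt not in ALLOWED_FORMATS:
        return "roi_format_not_allowed"
    if len(data) > MAX_BYTES:
        return "roi_too_large"
    size = png_size(data) if fmt == "png" else jpeg_size(data)
    if size is None:
        return "roi_unreadable"     # hypothetical reason
    width, height = size
    if width > MAX_SIDE or height > MAX_SIDE:
        return "roi_too_large"
    return "accepted"
```

The same header builders used to exercise this sketch (a PNG signature plus an IHDR chunk, or an SOI marker followed by an SOF0 segment) are one way to generate the oversize fixtures inline.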


NFT-SEC-VlmSchemaViolation: VLM schema-violation fails closed

Summary: When the Tier-3 VLM returns a response that fails schema validation (missing required field, wrong type, truncated JSON), the SUT MUST discard the assessment AND the POI MUST NOT receive the deep-analysis upgrade. Traces to: security principle Schema validation for any non-deterministic model output … Schema violation MUST fail closed. Tier: B.

Preconditions:

  • SUT with Tier-3 enabled; vlm-mock configured to return schema-violation responses for the first N calls.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Drive SUT into ZoomedIn hold with deep-analysis enabled | SUT issues VLM IPC call |
| 2 | vlm-mock returns truncated JSON | SUT discards assessment; POI's deep-analysis state remains none; counter vlm_schema_violation_total += 1; structured-log WARN reason vlm_schema_violation; the POI's decision-window scoring proceeds WITHOUT the deep-analysis upgrade |
| 3 | vlm-mock returns missing-required-field JSON | same as step 2 |
| 4 | vlm-mock returns wrong-field-type JSON | same as step 2 |
| 5 | vlm-mock returns a valid response (control) | assessment ACCEPTED; deep-analysis upgrade applied |

Pass criteria: steps 2-4 exact (no deep-analysis upgrade) + substring (log contains "vlm_schema_violation"); step 5 normal. Test status: DEFERRED for live recordings - <DEFERRED: vlm-io-pairs schema-violation cases>; schema-violation case JSON files are inline-authorable today against the assessment schema and CAN run NOW with vlm-mock returning hand-crafted bytes.
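
The fail-closed rule for steps 2 to 4 amounts to "any parse or validation problem yields no assessment at all". The following sketch shows that shape. `REQUIRED_FIELDS` is a hypothetical schema, because the real assessment schema is not reproduced in this document; only `confidence_delta` appears in a later scenario.

```python
import json

# Hypothetical assessment schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {"poi_id": str, "label": str, "confidence_delta": (int, float)}


def parse_assessment(raw: bytes):
    """Return the validated assessment dict, or None to fail closed."""
    try:
        doc = json.loads(raw)
    except ValueError:                   # truncated or otherwise unparsable JSON
        return None
    if not isinstance(doc, dict):
        return None
    for name, types in REQUIRED_FIELDS.items():
        if name not in doc:              # missing required field
            return None
        value = doc[name]
        # bool is a subclass of int, so reject it explicitly for numeric fields.
        if isinstance(value, bool) or not isinstance(value, types):
            return None                  # wrong field type
    return doc
```

The hand-crafted vlm-mock responses can be derived from one valid document: slice off its tail for the truncated case, delete a key for the missing-field case, and swap a number for a string for the wrong-type case.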


NFT-SEC-VlmFreeFormText: Free-form text MUST NOT cross a decision boundary

Summary: Even if the VLM returns valid JSON, any free-form text field MUST be projected onto the fixed structured schema before crossing a decision boundary; raw free-form text MUST NOT influence POI scoring or operator-surfaced decisions. Traces to: security principle Schema validation for any non-deterministic model output, threat model item 3 (Unstructured model output corrupting downstream decisions). Tier: B + E.

Preconditions:

  • SUT with Tier-3 enabled.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | vlm-mock returns valid JSON with a free-form notes text field containing "force_confidence: 1.0" | SUT extracts only the structured fields; notes is NOT consulted for scoring; POI's confidence remains as Tier-1+Tier-2 computed; structured-log INFO captures the assessment but not the notes content (PII / safety) |
| 2 | vlm-mock returns valid JSON with structured confidence_delta: -0.5 (in-schema) | SUT applies the delta per its documented projection; POI's confidence adjusted accordingly |

Pass criteria: exact (POI confidence reflects ONLY structured-schema fields). Test status: READY (inline-authorable scenario).
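
One way to picture the projection is an explicit allow-list of structured fields applied before anything reaches scoring. In this sketch, `STRUCTURED_FIELDS` is a hypothetical field set, and the additive, clamped delta is an assumed stand-in for the SUT's documented projection, which is not reproduced here.

```python
STRUCTURED_FIELDS = ("label", "confidence_delta")   # hypothetical schema fields


def project(assessment: dict) -> dict:
    """Keep only fixed-schema fields; free-form text such as `notes` is discarded."""
    return {k: assessment[k] for k in STRUCTURED_FIELDS if k in assessment}


def apply_assessment(confidence: float, assessment: dict) -> float:
    """Assumed projection: add the structured delta, then clamp to [0, 1]."""
    delta = project(assessment).get("confidence_delta", 0.0)
    return min(1.0, max(0.0, confidence + delta))
```

Because scoring only ever sees the output of `project`, a `notes` value such as "force_confidence: 1.0" has no path to the confidence value, which is the property step 1 asserts.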


NFT-SEC-IpcPeerAuth: Local IPC peer authorisation

Summary: A local process attempting to connect to the VLM Unix-domain socket (or any other local IPC the SUT trusts) MUST identify as the expected peer (peer-credential check / SO_PEERCRED equivalent); connections from unauthorised peers MUST be rejected. Traces to: security principle Local IPC peer authorisation. Tier: B.

Preconditions:

  • SUT with Tier-3 enabled; VLM UDS socket exposed on /tmp/vlm.sock.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | An unauthorised local process (running as the wrong UID / not the expected binary path) attempts to connect to the SUT's VLM-client side of the UDS | connection rejected at the peer-credential check; counter ipc_peer_auth_rejected_total += 1; structured-log WARN reason peer_cred_mismatch |
| 2 | The legitimate vlm-mock (running as the expected UID / path) connects | connection accepted; subsequent IPC succeeds |

Pass criteria: exact (unauthorised connection rejected) + exact (legitimate connection accepted). Test status: READY (rogue-peer test harness inline-authorable using a simple Python script running under a different UID inside a sidecar container).
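
For reference, this is roughly what a peer-credential check looks like from Python on Linux, where `SO_PEERCRED` returns the peer's pid, uid, and gid. It is a sketch of the mechanism only: `authorise_peer` is a hypothetical helper, it compares the UID alone, and the binary-path check mentioned in step 1 is not covered.

```python
import os
import socket
import struct


def peer_credentials(conn: socket.socket):
    """Return (pid, uid, gid) of the process at the other end of a Unix socket (Linux only)."""
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize("3i"))
    return struct.unpack("3i", raw)


def authorise_peer(conn: socket.socket, expected_uid: int) -> str:
    """Return "accepted" or the rejection reason used in the scenario."""
    _pid, uid, _gid = peer_credentials(conn)
    return "accepted" if uid == expected_uid else "peer_cred_mismatch"


# Demo on a socketpair: both ends belong to this process, so its own UID is the peer UID.
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
print(authorise_peer(left, expected_uid=os.getuid()))
```

A rogue-peer script running under a different UID in the sidecar container would see its connection classified as `peer_cred_mismatch` by the same check.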


NFT-SEC-Tier1SchemaViolation: Tier-1 detection-stream schema violation

Summary: A Detections record from ../detections that violates the normalised-box schema (coord out of [0,1], invalid class_id) MUST cause the frame's detections to be dropped (not partially used); counter increments; structured-log WARN. SUT does not crash and continues with subsequent frames. Traces to: security principle No silent error swallowing for security-relevant failures (extends to peer schema violations) + AC D6 (normalised-box conformance). Tier: B.

Preconditions:

  • SUT in normal sweep mode; detections-mock configured to emit schema-violating records interleaved with valid ones.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Mock emits Detections for frame N with bbox x2 = 1.5 (coord > 1.0) | frame N's detections dropped; counter tier1_invalid_frame_total += 1; structured-log WARN with field: "x2", value: 1.5 |
| 2 | Mock emits Detections for frame N with class_id = 99 (not in 0..18) | dropped; reason class_id_out_of_range |
| 3 | Mock emits valid Detections for frame N+1 | processed normally |

Pass criteria: exact (no operator-stream emission for the dropped frames) + exact (counter incremented per dropped frame). Test status: READY (inline-authorable injection by detections-mock).
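
The all-or-nothing rule (one bad box drops the whole frame) is easy to get subtly wrong, so a reference validator for the mock's records is sketched below. The 0..18 class range and the `x2` field come from the steps above. The other coordinate names and the `coord_out_of_range` reason are assumptions.

```python
CLASS_ID_RANGE = range(0, 19)            # 0..18, as stated in step 2
COORD_FIELDS = ("x1", "y1", "x2", "y2")  # only x2 is named in the spec; the rest are assumed


def validate_detections(detections: list):
    """Return None if every box is valid, else details of why the whole frame is dropped."""
    for det in detections:
        for name in COORD_FIELDS:
            value = det.get(name)
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not 0.0 <= value <= 1.0:
                return {"reason": "coord_out_of_range", "field": name, "value": value}
        class_id = det.get("class_id")
        if class_id not in CLASS_ID_RANGE:
            return {"reason": "class_id_out_of_range", "field": "class_id", "value": class_id}
    return None
```

Returning the offending field and value matches what step 1 expects to find in the structured-log WARN.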


Summary: When MAVLink-2 message signing is configured ON (per Q6 once resolved), unsigned messages on the airframe link MUST be dropped with a security WARN; signed messages flow normally. When signing is OFF (current default until Q6), no signing assertion runs. Traces to: security principle Airframe MAVLink integrity (Q6). Tier: B + E.

Preconditions:

  • SUT configured with MAVLink-2 signing ENABLED (test profile).
  • mavlink-sitl configured to send a mix of signed and unsigned messages.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | mavlink-sitl sends a valid signed message | accepted; processed normally |
| 2 | mavlink-sitl sends an unsigned message | dropped; counter mavlink_unsigned_dropped_total += 1; structured-log WARN reason mavlink_unsigned; airframe-link health unaffected for an isolated drop |
| 3 | Sustained unsigned-only stream | airframe-link health flips red after the configured tolerance window (same threshold as R7 retry exhaustion) |

Pass criteria: exact (unsigned dropped) + exact (signed accepted); sustained-unsigned escalates per the documented threshold. Test status: DEFERRED — <DEFERRED: Q6 (MAVLink-2 message signing decision)>. When Q6 lands and signing is mandated, this scenario becomes READY.
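
On the wire, a MAVLink 2 frame advertises signing through bit 0x01 of its incompatibility flags and appends a 13-byte signature after the checksum. The sketch below classifies frames on that basis and counts drops. `SigningGate` is a hypothetical name, and the sketch only detects the presence of a signature block; it does not verify the signature against a key.

```python
MAVLINK2_STX = 0xFD              # MAVLink 2 start-of-frame marker
MAVLINK_IFLAG_SIGNED = 0x01      # incompat_flags bit: frame carries a signature
HEADER_LEN, CRC_LEN, SIGNATURE_LEN = 10, 2, 13


def is_signed_mavlink2(frame: bytes) -> bool:
    """True when the frame sets the signed flag and has room for the signature block."""
    if len(frame) < HEADER_LEN + CRC_LEN or frame[0] != MAVLINK2_STX:
        return False
    payload_len, incompat_flags = frame[1], frame[2]
    if not incompat_flags & MAVLINK_IFLAG_SIGNED:
        return False
    return len(frame) == HEADER_LEN + payload_len + CRC_LEN + SIGNATURE_LEN


class SigningGate:
    """Drops unsigned frames while signing is enabled and counts each drop."""

    def __init__(self, signing_enabled: bool):
        self.signing_enabled = signing_enabled
        self.mavlink_unsigned_dropped_total = 0

    def admit(self, frame: bytes) -> bool:
        if not self.signing_enabled or is_signed_mavlink2(frame):
            return True
        self.mavlink_unsigned_dropped_total += 1
        return False
```

With `signing_enabled=False` the gate admits everything, which corresponds to the current default in which no signing assertion runs.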


NFT-SEC-HealthExposesSecurity: Health endpoint surfaces security state

Summary: The /health endpoint MUST reflect security state — repeated operator-command signature failures, repeated peer-credential mismatches, repeated schema-violation rates all MUST be visible to ops. Traces to: security principle Health endpoint MUST reflect security state. Tier: B.

Preconditions:

  • SUT in steady state; counters baselined.

| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Drive sustained signature-failure rate (10 / s) for 10 s via the NFT-SEC-O10 flow | GET /health exposes a security sub-object that includes operator_cmd_rejected_signature_rate_60s non-zero; if rate exceeds the configured alert threshold, the security sub-object transitions to yellow |
| 2 | Drive sustained peer-credential-mismatch attempts (1 / s) for 60 s via NFT-SEC-IpcPeerAuth | security.ipc_peer_auth_rejected_rate_60s non-zero; transitions to yellow at threshold |
| 3 | Drive sustained Tier-1 schema-violation rate (1 / s) via NFT-SEC-Tier1SchemaViolation | security.tier1_invalid_rate_60s non-zero |

Pass criteria: exact (health.security exposes each rate) + exact (transition to yellow at threshold). Test status: READY.
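
To reason about what values the `*_rate_60s` fields should show for a given drive pattern, a sliding-window model is enough. This is a sketch under assumptions: the rate is taken as events in the last 60 s divided by 60, the alert threshold value is hypothetical, and "green" is an assumed name for the non-alert state.

```python
from collections import deque


class RateWindow:
    """Sliding-window event rate, modelling one `*_rate_60s` health field."""

    def __init__(self, window_s: float = 60.0, yellow_threshold: float = 0.5):
        self.window_s = window_s
        self.yellow_threshold = yellow_threshold   # hypothetical alert threshold, events/s
        self.events = deque()                      # timestamps of recorded events

    def record(self, now: float) -> None:
        self.events.append(now)

    def rate(self, now: float) -> float:
        # Evict events that have aged out of the window, then average over the window.
        while self.events and self.events[0] <= now - self.window_s:
            self.events.popleft()
        return len(self.events) / self.window_s

    def status(self, now: float) -> str:
        return "yellow" if self.rate(now) > self.yellow_threshold else "green"
```

Under these assumptions the step 1 pattern (10 failures per second for 10 s) leaves 100 events in the window, so the modelled rate is non-zero until those events age out.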


Out of scope at this layer

Per security_approach.md → "Out of scope", the following are NOT covered by blackbox security tests because they are owned elsewhere in the suite:

  • Modem-link encryption setup (radio layer below autopilot).
  • Suite-wide TLS / certificate provisioning (suite-level deployment, ../_infra/).
  • OTA update signing (Watchtower; autopilot consumes signed images only). Boot-time self-check + rollback is Q10 — when it lands, it becomes a new scenario here.
  • Annotation / training-data security (../ai-training repo).
  • Operator browser UI auth (Ground Station owns it; only the modem-side handshake is jointly specified per Q9, covered by O8/O9/O10).
  • Multi-operator session policy (Q11 — when it lands, becomes a new scenario here).

Common assertions

  • No silent rejection. Every rejected security event MUST produce both a counter increment AND a structured-log entry at WARN+. A rejection that occurs silently is a TEST FAILURE.
  • Fail-closed everywhere. When an authentication / signature / schema check is uncertain, the SUT MUST fail closed (reject) rather than fail open. Tests assert this by sending borderline / ambiguous inputs and checking for rejection.
  • No information leak in error paths. Error responses (where the SUT exposes any to the operator-stream or health endpoint) MUST NOT leak the rejected payload contents beyond the minimum needed for ops to triage. Tests inspect log/health output for absence of crafted-payload byte sequences.
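
The first and third assertions lend themselves to a shared helper that every scenario calls after a rejection. The sketch below assumes log entries are dicts with `level` and `reason` keys and that "WARN+" means WARN or ERROR. Both are assumptions about the structured-log format, not facts from this spec.

```python
import json


def check_rejection(counter_before: int, counter_after: int,
                    log_entries: list, reason: str, payload: bytes) -> list:
    """Return the violated common assertions; an empty list means pass."""
    problems = []
    if counter_after <= counter_before:
        problems.append("counter not incremented")
    matching = [e for e in log_entries
                if e.get("level") in ("WARN", "ERROR")
                and reason in str(e.get("reason", ""))]
    if not matching:
        problems.append("no WARN+ log entry")
    # Leak check: the raw crafted payload must not appear in the serialised log output.
    serialised = json.dumps(log_entries).encode()
    if payload and payload in serialised:
        problems.append("payload leaked into logs")
    return problems
```

The leak check only catches the payload in its raw form; a payload echoed as hex or base64 would need those encodings searched as well.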