mirror of https://github.com/azaion/autopilot.git synced 2026-06-21 15:41:09 +00:00

Oleksandr Bezdieniezhnykh bc40ea7300 [AZ-626] Decompose complete: 47 tasks + docs + module layout

Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy
Qt/C++ to a Rust workspace.

- Remove legacy Qt/C++ tree (ai_controller, drone_controller,
  misc/camera, python_scaffold, root Dockerfile, autopilot.pro,
  legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria,
  security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model,
  glossary, decision-rationale, deployment, 13 component descriptions,
  tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one
  bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for
  canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed,
  ready for Step 7 Implement).

Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks
AZ-640..AZ-686. Total complexity 173 points across 12 epics.

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-19 11:02:01 +03:00

24 KiB

Test Environment

Authored by /test-spec Phase 2 (2026-05-19) against:

_docs/00_problem/problem.md, acceptance_criteria.md, restrictions.md, security_approach.md
_docs/01_solution/solution_draft01.md
_docs/02_document/architecture.md (incl. §6 NFR Targets, §7 Detailed Design)
_docs/00_problem/input_data/data_parameters.md, services.md, fixtures/README.md, expected_results/results_report.md

Per .cursor/rules/artifact-srp.mdc this artifact owns ONLY the test environment / harness shape — measurable thresholds belong in acceptance_criteria.md, fixture inventory belongs in test-data.md, and per-test specs belong in the sibling *-tests.md files.

Overview

System under test (SUT): autopilot, a single Rust binary that runs on the Jetson Orin Nano Super aboard a reconnaissance UAV. Its observable external surfaces:

Surface	Direction	Protocol	Source/Sink in production
Tier-1 detection RPC	autopilot ⇄ detector	bi-directional gRPC streaming (local)	`../detections`
MAVLink command/telemetry	autopilot ⇄ airframe	MAVLink v2 over UDP (or serial)	ArduPilot / PX4
Camera RTSP feed	camera → autopilot	H.264/265 1080p, 30/60 fps	ViewPro A40
Gimbal control + telemetry	autopilot ⇄ camera	ViewPro vendor UDP	ViewPro A40
Mission + MapObjects REST	autopilot ⇄ central	HTTPS JSON	`missions` service
Operator stream (telemetry out, commands in)	autopilot ⇄ GS	Suite-level modem protocol, signed commands	Ground Station
Deep-analysis VLM IPC (optional)	autopilot ⇄ VLM	Unix-domain socket	local-onboard VLM
Health endpoint	autopilot → ops	HTTP/JSON	scraped by ops
Structured logs	autopilot → ops	JSON to stdout	log shipper

The harness exercises every one of those surfaces from outside the SUT process. No test reaches inside the binary (no module imports, no direct DB peeks, no shared memory).

Consumer app purpose: a black-box test runner (e2e-consumer) that:

Brings up the SUT in a controlled topology (with mock or live peers).
Drives inputs through public surfaces.
Captures every observable: outbound network frames, MAVLink commands, gimbal UDP commands, REST calls, operator-stream messages, health-endpoint JSON, log lines, plus passive resource metrics (RSS, CPU, GPU).
Compares each observation against the expected result tagged in _docs/00_problem/input_data/expected_results/results_report.md and emits a CSV report.
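The comparison step in the last item can be sketched as follows. This is a minimal illustration, not the consumer's actual code: the three `CompareMethod` variants are hypothetical stand-ins, since the real method names and tolerance encoding live in the expected-results documentation, which is not reproduced here.

```rust
/// Hypothetical comparison methods; the real set is defined in the
/// expected-results documentation and may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum CompareMethod {
    /// Observation must equal the expected string exactly.
    Exact(String),
    /// Numeric observation must lie within `tolerance` of `expected`.
    WithinAbs { expected: f64, tolerance: f64 },
    /// Numeric observation must not exceed `limit` (e.g. a latency budget).
    AtMost { limit: f64 },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail,
}

/// Compare one captured observation against its expected-result rule.
/// Observations that cannot be parsed as numbers fail instead of panicking.
pub fn compare(actual: &str, method: &CompareMethod) -> Verdict {
    let ok = match method {
        CompareMethod::Exact(expected) => actual == expected.as_str(),
        CompareMethod::WithinAbs { expected, tolerance } => actual
            .trim()
            .parse::<f64>()
            .map(|v| (v - *expected).abs() <= *tolerance)
            .unwrap_or(false),
        CompareMethod::AtMost { limit } => actual
            .trim()
            .parse::<f64>()
            .map(|v| v <= *limit)
            .unwrap_or(false),
    };
    if ok {
        Verdict::Pass
    } else {
        Verdict::Fail
    }
}
```

Treating an unparseable observation as a failure keeps a malformed capture from being silently counted as a pass.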

Test execution tiers

Five execution tiers exist; each test scenario declares which tier(s) it must run in:

Tier	Purpose	What is real vs mocked	When it runs
U — unit	Pure in-process logic with no external surface (state-machine transitions, geometry helpers, schema validators)	Everything in-process	Per commit (cargo test)
I — component-integration	One autopilot component against mocks for every peer	SUT component real; all peers stubbed/replayed	Per commit; isolates contract drift
B — blackbox / harness	Full SUT binary against mock peers in containers	SUT binary real; every external peer mocked (HTTPS mock, gRPC replay, MAVLink SITL, scripted operator trace, RTSP loopback)	Per commit + nightly
E — suite-e2e	Full SUT against live siblings (`../detections`, `../missions`, ArduPilot SITL, Ground Station replay)	All real services in the suite-e2e compose	Nightly + pre-release
HW — hardware/replay benchmark	SUT binary on representative Jetson hardware OR on a benchmarked replay of that hardware	Real Jetson Orin Nano Super OR benchmarked replay	Pre-release; the only path that satisfies the `acceptance_criteria.md → Acceptance Gates (project-level)` hardware gate

Hardware-dependency analysis (which AC rows require HW vs replay vs commodity) is produced by the test-spec phases/hardware-assessment.md step before Phase 4 runner scripts are generated and is appended to this file as ## Hardware Execution Matrix.

Docker environment (Tier B + E)

The suite-e2e compose lives at the monorepo level (../e2e/docker-compose.suite-e2e.yml, owned by the monorepo-e2e skill — see _docs/00_problem/input_data/services.md). The autopilot-local harness lives at e2e/docker-compose.autopilot-e2e.yml (created by Phase 4) and brings up only the SUT + mocks needed for Tier-B runs.

Services (Tier B — autopilot-local harness)

Service	Image / Build	Purpose	Ports
`autopilot`	build: `.` (cross to `aarch64-unknown-linux-gnu` for HW, native for Tier B)	SUT	health: 9100/tcp; log: stdout; MAVLink: 14550/udp; gimbal: 9201/udp; operator: 9301/tcp
`detections-mock`	build: `e2e/mocks/detections-mock` (Python)	Bi-directional gRPC mock replaying recorded `Detections` streams	50051/tcp
`missions-mock`	build: `e2e/mocks/missions-mock` (Python FastAPI)	HTTPS REST mock — `GET/POST /missions/{id}` + `/mapobjects`	8443/tcp (TLS)
`rtsp-loopback`	image: `bluenviron/mediamtx`	RTSP server playing back recorded `.mp4` frame sequences at 30/60 fps	8554/tcp
`gimbal-mock`	build: `e2e/mocks/gimbal-mock` (Rust)	ViewPro UDP echo + scripted yaw/pitch/zoom telemetry replays	9200/udp
`mavlink-sitl`	image: `ardupilot/ardupilot-sitl`	ArduPilot SITL — MAVLink v2 endpoint for the autopilot to drive	14551/udp
`vlm-mock`	build: `e2e/mocks/vlm-mock` (Python, UDS)	Optional Tier-3 VLM IPC mock; replays recorded `VlmAssessment` JSON	(UDS only)
`operator-replay`	build: `e2e/mocks/operator-replay` (Python)	Scripted Ground Station session trace: connect / push frame / push telemetry / operator-click / modem-drop / reconnect / lost-link	9300/tcp
`time-injector`	build: `e2e/mocks/time-injector` (Rust)	Injects clock-drift / NTP-loss scenarios into the SUT container's clock via `faketime` LD_PRELOAD shim	—
`e2e-consumer`	build: `e2e/consumer` (Rust + assert crates)	The black-box test runner that drives scenarios + compares observables to expected results	—

Networks

Network	Services	Purpose
`autopilot-e2e`	all	Isolated test network; no egress

Volumes

Volume	Mounted to	Purpose
`fixtures-ro`	every mock service (read-only)	Mounts `_docs/00_problem/input_data/fixtures/` for replay sources
`expected-ro`	`e2e-consumer:/expected:ro`	Mounts `_docs/00_problem/input_data/expected_results/` for assertion comparison
`reports-rw`	`e2e-consumer:/reports`	CSV + JSON test output
`autopilot-state`	`autopilot:/var/lib/autopilot`	On-device persistent store (R3, Mp4) — wiped between runs

docker-compose structure (outline only — not runnable)

services:
  autopilot:
    build: .
    depends_on: [detections-mock, missions-mock, rtsp-loopback, gimbal-mock, mavlink-sitl, operator-replay]
    networks: [autopilot-e2e]
    environment:
      DETECTOR_GRPC: detections-mock:50051
      MISSIONS_URL: https://missions-mock:8443
      RTSP_URL: rtsp://rtsp-loopback:8554/feed
      GIMBAL_UDP: gimbal-mock:9200
      MAVLINK_UDP: mavlink-sitl:14551
      OPERATOR_TCP: operator-replay:9300
      VLM_SOCK: /tmp/vlm.sock
      AUTOPILOT_CONFIG: /etc/autopilot/test.toml
    volumes:
      - autopilot-state:/var/lib/autopilot
  detections-mock: { build: e2e/mocks/detections-mock, networks: [autopilot-e2e], volumes: [fixtures-ro:/fixtures:ro] }
  missions-mock:   { build: e2e/mocks/missions-mock,   networks: [autopilot-e2e], volumes: [fixtures-ro:/fixtures:ro] }
  rtsp-loopback:   { image: bluenviron/mediamtx,       networks: [autopilot-e2e], volumes: [fixtures-ro:/fixtures:ro] }
  gimbal-mock:     { build: e2e/mocks/gimbal-mock,     networks: [autopilot-e2e], volumes: [fixtures-ro:/fixtures:ro] }
  mavlink-sitl:    { image: ardupilot/ardupilot-sitl,  networks: [autopilot-e2e] }
  vlm-mock:        { build: e2e/mocks/vlm-mock,        networks: [autopilot-e2e], volumes: [fixtures-ro:/fixtures:ro] }
  operator-replay: { build: e2e/mocks/operator-replay, networks: [autopilot-e2e], volumes: [fixtures-ro:/fixtures:ro] }
  time-injector:   { build: e2e/mocks/time-injector,   networks: [autopilot-e2e] }
  e2e-consumer:
    build: e2e/consumer
    depends_on: [autopilot]
    networks: [autopilot-e2e]
    volumes: [expected-ro:/expected:ro, reports-rw:/reports]
networks:
  autopilot-e2e: { internal: true }   # isolated test network; no egress
volumes:
  fixtures-ro: { driver_opts: { type: none, o: bind, device: ${PWD}/_docs/00_problem/input_data/fixtures } }
  expected-ro: { driver_opts: { type: none, o: bind, device: ${PWD}/_docs/00_problem/input_data/expected_results } }
  reports-rw: {}
  autopilot-state: {}

Suite-e2e compose (Tier E) — referenced, not redefined

For Tier-E runs the harness uses ../e2e/docker-compose.suite-e2e.yml (owned by monorepo-e2e). It adds the real ../detections, real ../missions, and a richer mavlink-sitl configuration. Autopilot's Tier-E entries in this file MUST mirror the suite-e2e topology — drift is reconciled by the monorepo-e2e skill, not here.

Consumer application (`e2e-consumer`)

Tech stack: Rust + assert_cmd + testcontainers-rs + prost/tonic (for gRPC observation) + mavlink-rs (for MAVLink observation) + reqwest/hyper (for HTTPS observation) + tokio-tungstenite (for operator-stream observation). Tests are organised one-scenario-per-file under e2e/consumer/tests/scenarios/.

Entry point: cargo test --release --test scenarios (orchestrated by scripts/run-tests.sh, produced in Phase 4).

Communication with the system under test

Interface	Protocol	Endpoint / Topic	Authentication
Health endpoint	HTTP GET	`http://autopilot:9100/health`	none (loopback)
Structured log stream	line-delimited JSON on stdout	docker-compose log tail	none
MAVLink observed	MAVLink v2 / UDP	`mavlink-sitl:14551` (the harness records both sides of the link)	per Q6: MAVLink-2 message signing if configured
Gimbal observed	ViewPro UDP	`gimbal-mock:9200` (commands recorded + telemetry replayed)	none
RTSP delivered	RTSP	`rtsp://rtsp-loopback:8554/feed` (consumer schedules which clip plays per scenario)	none
Detection RPC observed	gRPC streaming	`detections-mock:50051` (consumer scripts the recorded replay served)	none
Mission REST observed	HTTPS	`missions-mock:8443` (consumer scripts JSON fixtures + asserts captured request bodies)	TLS cert (self-signed for test)
Operator stream observed	Suite modem protocol	`operator-replay:9300` (consumer scripts session traces + signed-command envelopes)	per Q9: signed envelope (HMAC / ed25519 / MAVLink-2-ext)
VLM IPC observed (when enabled)	Unix-domain socket	`/tmp/vlm.sock` shared with `vlm-mock`	peer-credential check (security_approach §"Local IPC peer authorisation")

What the consumer does NOT have access to

No direct database access to the autopilot's on-device persistent store (autopilot-state volume) — the consumer reads it only via the health endpoint, the operator telemetry stream, or as a post-run forensic check (the storage AC R3 is checked via the BIT health response, not by peeking at SQLite rows).
No internal Rust module imports — the consumer is a separate crate compiled against published public proto/schema files only.
No shared memory, no /proc/$pid/... inspection beyond passive resource metrics.
No direct reading of in-flight POI queue ordering — ordering is observed indirectly via the operator-stream emission order and the gimbal command stream.
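The passive resource metrics that the list above permits can be read from procfs without attaching to the SUT. The sketch below shows one way to sample RSS. It assumes a Linux host where the consumer can see the SUT's PID, and the function names are illustrative rather than taken from the harness.

```rust
/// Extract the resident set size in KiB from the text of `/proc/<pid>/status`.
/// Returns `None` if the `VmRSS:` line is missing or malformed.
pub fn parse_vm_rss_kib(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|value| value.parse::<u64>().ok())
}

/// Read the RSS of a running process passively (no ptrace, no signals).
/// Linux-only: relies on the procfs layout.
pub fn read_rss_kib(pid: u32) -> std::io::Result<Option<u64>> {
    let text = std::fs::read_to_string(format!("/proc/{}/status", pid))?;
    Ok(parse_vm_rss_kib(&text))
}
```

Splitting the parser from the file read lets the parsing logic be checked against captured status text on any machine.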

External dependency mocks

Dependency	Mock service	Determinism guarantee	Source fixture(s)
`../detections` Tier-1 RPC	`detections-mock`	Replays recorded `Detections` stream byte-for-byte; same input → same output	`<DEFERRED: tier1_replay/*.replay; services.md §1>` (live `../detections` used as fallback in Tier-E)
`missions` API	`missions-mock`	Static JSON responses per scenario; recorded round-trip captured for `POST`	`<DEFERRED: missions_fixtures/*.json; services.md §2>`
ViewPro A40 camera frames	`rtsp-loopback` (mediamtx)	Plays back `.mp4` at exact configured fps; frame timestamps deterministic	`fixtures/videos/94d42580bd1ad6ff.mp4`, `fixtures/movement/video0[1-4].mp4`
ViewPro A40 gimbal control	`gimbal-mock`	Replays `gimbal.csv` per scenario; echoes commands with bounded latency budget per scenario	`<DEFERRED: gimbal_csv/*.csv paired with movement videos; services.md §6>`
ArduPilot airframe	`mavlink-sitl` (ArduPilot SITL)	Deterministic seed + scripted mission	scripted per scenario; no fixture file required for Tier B (SITL is the fixture)
Ground Station modem session	`operator-replay`	Replays `(t, event)` script per scenario	`<DEFERRED: operator_sessions/*.script; services.md §3>`
Local VLM (Tier-3 optional)	`vlm-mock`	Returns paired `(roi.png → VlmAssessment)` from disk; schema-violation fixtures for fail-closed tests	`<DEFERRED: vlm_io_pairs/*.json; services.md §7>`
Wall-clock / GPS / NTP	`time-injector` (faketime LD_PRELOAD)	Scripted offset / jump / source-loss; injected at SUT process start	scripted per scenario; no fixture file required

Mocks that are marked <DEFERRED:> are bridged through _docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md. Scenarios that consume those mocks declare Test status: DEFERRED — input fixture not yet acquired (see leftover row N) in their entry under the relevant *-tests.md file.
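A runner could recognise the `<DEFERRED: ...>` placeholder mechanically, so that a scenario whose fixture is not yet available is reported as DEFERRED instead of FAIL. The following is a sketch under that assumption; `FixtureSource` and `classify_fixture` are hypothetical names, and the source does not say that the consumer parses these cells at all.

```rust
/// Where a scenario's replay input comes from.
#[derive(Debug, PartialEq, Eq)]
pub enum FixtureSource<'a> {
    /// A concrete fixture path or description that can be used as-is.
    Available(&'a str),
    /// A `<DEFERRED ...>` placeholder; the inner note is kept for the report.
    Deferred(&'a str),
}

/// Classify one "Source fixture(s)" cell from the mock table.
pub fn classify_fixture(cell: &str) -> FixtureSource<'_> {
    let trimmed = cell.trim();
    if let Some(rest) = trimmed.strip_prefix("<DEFERRED") {
        // Accept both `<DEFERRED>` and `<DEFERRED: note>`; text after the
        // closing `>` (such as a fallback remark) is ignored.
        let end = rest.find('>').unwrap_or(rest.len());
        let note = rest[..end].trim_start_matches(':').trim();
        FixtureSource::Deferred(note)
    } else {
        FixtureSource::Available(trimmed)
    }
}
```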

CI/CD integration

Stage	Tier(s)	When	Gate	Timeout
PR pipeline	U, I	on every PR push	block merge on FAIL	10 min
dev-branch nightly	U, I, B	nightly	warn on FAIL; report attached	60 min
weekly suite-e2e	U, I, B, E	weekly + on release branch	block release on FAIL	180 min
pre-release HW benchmark	HW	manual + pre-release	block release on FAIL	240 min

The pipeline definition is owned by _docs/02_document/deployment/ci_cd_pipeline.md. This file only declares which tier each scenario MUST run in; the pipeline orchestration is documented there.

Reporting

Format: CSV (one row per scenario per run).

Columns:

Column	Type	Notes
`test_id`	string	e.g. `FT-P-001`, `NFT-PERF-L1`, `NFT-SEC-O9`
`test_name`	string	short title from the scenario header
`tier`	enum	U / I / B / E / HW
`seed`	int	deterministic seed used (where applicable)
`start_ts_utc`	ISO 8601	scenario start
`duration_ms`	int	total execution time
`result`	enum	PASS / FAIL / SKIP / DEFERRED
`expected_result_ref`	string	row id in `expected_results/results_report.md` (e.g. `L1`, `Mp3`)
`actual_value`	string	quantitative observation (latency_ms, count, etc.)
`compare_method`	string	one of `expected-results.md` methods
`tolerance`	string	as declared in the expected-results row
`failure_reason`	string	populated only on FAIL or DEFERRED
`artifacts_path`	string	path under `/reports/<run-id>/` for captured logs / pcaps / mavlink dumps

Output path: e2e/consumer/reports/<run-id>/report.csv (mounted host-side to ./reports/<run-id>/report.csv).

Sidecar artifacts per scenario (one folder per test_id): stdout.log, stderr.log, mavlink.tlog (where applicable), pcap.bin (where applicable), health-trace.jsonl, actual-output.json.
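One way to model a report row is sketched below, using only the standard library. The field names follow the column table, but the Rust types are assumptions: `seed` as an optional integer, the enum-like columns as plain strings, and RFC 4180 quoting as the escaping rule.

```rust
/// One row of `report.csv`; field names follow the documented columns.
#[derive(Debug, Clone)]
pub struct ReportRow {
    pub test_id: String,
    pub test_name: String,
    pub tier: String,           // U / I / B / E / HW
    pub seed: Option<u64>,      // empty cell when no seed applies
    pub start_ts_utc: String,   // ISO 8601
    pub duration_ms: u64,
    pub result: String,         // PASS / FAIL / SKIP / DEFERRED
    pub expected_result_ref: String,
    pub actual_value: String,
    pub compare_method: String,
    pub tolerance: String,
    pub failure_reason: String, // empty unless FAIL or DEFERRED
    pub artifacts_path: String,
}

pub const HEADER: &str = "test_id,test_name,tier,seed,start_ts_utc,duration_ms,result,\
expected_result_ref,actual_value,compare_method,tolerance,failure_reason,artifacts_path";

/// Quote a field per RFC 4180 when it contains a comma, quote, or line break.
pub fn escape(field: &str) -> String {
    if field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

impl ReportRow {
    /// Serialise the row in the same column order as `HEADER`.
    pub fn to_csv_line(&self) -> String {
        let seed = self.seed.map(|s| s.to_string()).unwrap_or_default();
        let duration = self.duration_ms.to_string();
        let fields: [&str; 13] = [
            &self.test_id, &self.test_name, &self.tier, &seed, &self.start_ts_utc,
            &duration, &self.result, &self.expected_result_ref, &self.actual_value,
            &self.compare_method, &self.tolerance, &self.failure_reason,
            &self.artifacts_path,
        ];
        fields.iter().map(|f| escape(f)).collect::<Vec<_>>().join(",")
    }
}
```

Escaping matters mainly for `test_name` and `failure_reason`, which are free text and may contain commas.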

Test Execution

Decision (recorded 2026-05-19 by phases/hardware-assessment.md): local-only on Jetson Orin Nano Super. Every scenario — Tier B, Tier E, Tier HW — runs on representative Jetson hardware (the same hardware the airborne payload deploys to). Docker is used for service orchestration (mocks, sibling services) on the Jetson host, NOT for SUT execution on x86.

Hardware dependencies found

File	Dependency surfaced
`_docs/00_problem/restrictions.md → "Hardware"`	Jetson Orin Nano Super (aarch64), 8 GB shared LPDDR5, 67 TOPS INT8; ViewPro A40 (40× optical zoom + vendor UDP); ViewPro Z40K compatibility
`_docs/00_problem/restrictions.md → "Software environment"`	FP16 precision (INT8 rejected); no cloud egress; Tier 1 + local large models share Jetson GPU with mutual exclusion
`_docs/01_solution/solution_draft01.md`	"single Rust binary on Jetson Orin Nano Super (aarch64)"; TensorRT FP16; Tokio + Unix-domain-socket VLM IPC
`_docs/02_document/architecture.md §6` (NFR Targets) + `§7.6` (Solution Architecture) + `§7.14` (Tech Stack)	cross-compile target `aarch64-unknown-linux-gnu`; TensorRT engine; gimbal UDP; MAVLink-v2 transport
`_docs/02_document/components/*/description.md` (13 components)	physical UDP (gimbal_controller), RTSP capture (frame_ingest), MAVLink airframe link (mavlink_layer), local-onboard model (semantic_analyzer + vlm_client)

Why local-only on Jetson

The choice rejects two alternatives:

Docker-only on x86 would leave Tier-HW rows (L1–L9, Re1, Re2, NFT-RES-LIM-CPU, NFT-RES-LIM-GPU) SKIPPED-NO-HW. That defeats the project-level Acceptance Gate (acceptance_criteria.md → "Acceptance Gates (project-level)": every latency criterion MUST be measured on the deployed compute device).
Both x86 + Jetson would split the test surface and let Tier-B scenarios pass on x86 while masking real-hardware regressions (e.g. GPU contention is invisible on x86). The honest path is to exercise the actual hardware path uniformly.

Execution instructions (local on Jetson)

Prerequisites (one-time, per Jetson runner):

JetPack 6.x SDK + L4T r36.x (matches the airborne deployment image).
Rust toolchain pinned to the workspace's rust-toolchain.toml (added by Step 7 Implement); rustup target aarch64-unknown-linux-gnu already native here.
Docker + Docker Compose v2 (for orchestrating the mock services + sibling repos in Tier-E mode).
mavlink-router, tegrastats, iperf3, tc (network shaping).
ViewPro A40 (or Z40K for the Z40K-swap regression run) connected over Ethernet at the documented control endpoint.
ArduPilot SITL binary installed natively (the Docker image is x86-only; on Jetson aarch64 we run SITL natively or via Apptainer).
A representative ViewPro A40 RTSP feed source — either the physical camera or a recorded .mp4 looped through a local mediamtx.

How to start services: docker compose -f e2e/docker-compose.autopilot-e2e.yml up -d brings up detections-mock, missions-mock, rtsp-loopback, gimbal-mock, vlm-mock, operator-replay, time-injector on the Jetson host. The SUT (autopilot binary) runs outside the compose — cargo run --release on the Jetson directly, with env vars pointing at the compose-side mock endpoints. For Tier E, swap detections-mock → live ../detections and missions-mock → live missions per ../e2e/docker-compose.suite-e2e.yml.

How to run the test runner: scripts/run-tests.sh (to be created by a Decompose task per traceability-matrix.md → "Phase 4 SKIPPED" handoff) orchestrates: bring up compose → start SUT → run cargo test --release --test scenarios -p e2e-consumer → tear down. The runner reads RUN_TIER ∈ {B, E, HW} to decide which scenarios to execute.

Environment variables (consumed by both the SUT and the consumer):

RUN_TIER (B | E | HW) — selects scenario set per the matrix below.
AUTOPILOT_CONFIG — path to the test profile TOML (overrides per-scenario thresholds + Q-tagged defaults).
AUTOPILOT_RNG_SEED — deterministic-seed per scenario; captured in the CSV report.
JETSON_RUNNER_ID — identifier for the physical Jetson + camera+gimbal hardware combo; carried into every CSV row for forensic comparison across runners.
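Tier selection from `RUN_TIER` might look like the sketch below. The `Scenario` struct and the `select` function are hypothetical, and failing on a missing variable is an assumption rather than a documented rule.

```rust
use std::str::FromStr;

/// The tiers the runner accepts via RUN_TIER.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunTier {
    B,
    E,
    Hw,
}

impl FromStr for RunTier {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "B" => Ok(RunTier::B),
            "E" => Ok(RunTier::E),
            "HW" => Ok(RunTier::Hw),
            other => Err(format!("RUN_TIER must be B, E or HW, got {:?}", other)),
        }
    }
}

/// A scenario and the tiers it declares (e.g. FT-P-001 declares B and E).
pub struct Scenario {
    pub test_id: &'static str,
    pub tiers: &'static [RunTier],
}

/// Keep only the scenarios that declare the requested tier, in input order.
pub fn select(all: &[Scenario], tier: RunTier) -> Vec<&Scenario> {
    all.iter().filter(|s| s.tiers.contains(&tier)).collect()
}

/// Read RUN_TIER from the environment. A missing variable is reported as
/// an error rather than silently defaulted (an assumption of this sketch).
pub fn tier_from_env() -> Result<RunTier, String> {
    std::env::var("RUN_TIER")
        .map_err(|e| format!("RUN_TIER not set: {}", e))?
        .parse()
}
```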

CI/CD addendum (overrides the earlier `## CI/CD integration` table)

The earlier table assumed a Docker-on-x86 PR pipeline. Under this decision, every tier runs on a Jetson runner. Operationally that means:

Stage	Tier(s)	When	Gate	Timeout	Runner
PR pipeline	U, I	on every PR push	block merge on FAIL	10 min	Jetson runner (native cargo test for U + I)
dev-branch nightly	U, I, B	nightly	warn on FAIL; report attached	60 min	Jetson runner
weekly suite-e2e	U, I, B, E	weekly + on release branch	block release on FAIL	180 min	Jetson runner + live siblings reachable from it
pre-release HW benchmark	HW	manual + pre-release	block release on FAIL	240 min	Jetson runner + physical A40 + airframe SITL/HW

Capacity note: the PR pipeline running on Jetson trades x86 throughput for execution honesty. If PR latency becomes painful, the team's mitigation is to add more Jetson runners — NOT to fall back to x86 for Tier B (that would defeat the choice).

Hardware Execution Matrix

Per the local-only-on-Jetson decision, every tier runs on Jetson. The matrix below is collapsed accordingly: it records what each scenario actually exercises on the Jetson (which hardware surface is the load-bearing one) so that a runner-capacity planner can predict which scenarios contend for the same physical resource.

Scenario	Tier	Jetson surface exercised	Concurrent-with constraint
FT-P-001 (D6 Tier-1 contract)	B + E	GPU (Tier 1 inference)	conflicts with NFT-RES-LIM-Re2 / GPU
FT-P-002 — FT-P-006 (D1–D5)	E + HW	GPU (Tier 1 inference)	as above
FT-P-007 — FT-P-010 (M1–M4)	B + E	CPU (movement) + GPU (Tier 1 inputs)	as above
FT-P-011 — FT-P-015 (S1–S5)	B + E	CPU + gimbal UDP + GPU (Tier 3 in S5)	gimbal contention serialises S1/S2/S3
FT-P-016 — FT-P-022 (O1–O7, O8 happy)	B + E	CPU + operator-stream	low contention
FT-P-023 (R1 BIT pass)	B + E	every dep mocked	none
FT-N-001 — FT-N-002 (R2/R3)	B + E	none (storage seed manipulation)	none
FT-N-003 (Mp2 cache-fallback)	B + E	mock timeout on `missions-mock`	none
FT-N-004 (O4 below-threshold)	B	CPU only	none
FT-P-024 / FT-P-025 / FT-P-026 (Mp1/Mp3/Mp5)	B + E	network + persistent store	persistent-store contention serialises
NFT-PERF-L1	HW	GPU (Tier 1)	dedicate runner — measurement integrity
NFT-PERF-L2	HW + B	GPU (Tier 2)	conflicts with L1/L3/L8 — serialise
NFT-PERF-L3	HW + B (vlm-mock)	GPU (Tier 3 VLM)	conflicts with L1/L2 — serialise
NFT-PERF-L4	HW	A40 physical zoom motor	dedicate runner — physical motion
NFT-PERF-L5	HW + B	CPU + gimbal UDP	serialise with L4/L8
NFT-PERF-L6 / L7	B + E	CPU + ego-motion + GPU (Tier 1 inputs)	serialise with L1
NFT-PERF-L8	HW + B	A40 physical zoom + Tier 1 GPU	dedicate runner
NFT-PERF-L9	B + E	CPU + operator-stream	low contention
NFT-PERF-T1	B	CPU + queue	none
NFT-PERF-T2	B + E	airframe link	low
NFT-PERF-T3	B	RTSP throttling + health	none
NFT-RES-R4–R9	B + E	airframe link + persistent store	serialise per-mission
NFT-RES-Mp2 / Mp4	B + E	network + persistent store	low
NFT-SEC-O9 / O10	B + E	operator-stream + crypto path	low
NFT-SEC-CraftedFrame / OversizeCrop	B	decoder CPU	low
NFT-SEC-VlmSchemaViolation / FreeFormText	B (vlm-mock)	UDS IPC	low
NFT-SEC-IpcPeerAuth	B	UDS IPC + peer-cred	low
NFT-SEC-Tier1SchemaViolation	B	Tier-1 RPC	none
NFT-SEC-MavlinkUnsigned	B + E	airframe link (Q6 dep)	low
NFT-SEC-HealthExposesSecurity	B	counters + health	low
NFT-RES-LIM-Re1	HW	full Jetson workload (RSS)	dedicate runner — measurement integrity
NFT-RES-LIM-Re2	HW	Tier 1 + autopilot workload concurrent	runs back-to-back with NFT-PERF-L1 in same session
NFT-RES-LIM-Storage	B + HW	persistent store	low
NFT-RES-LIM-CPU	HW	full CPU	dedicate runner
NFT-RES-LIM-GPU	HW	GPU mutex (Tier 1 vs Tier 3)	dedicate runner
NFT-RES-LIM-FileHandles	B + HW	`/proc/<pid>/fd`	low

Tier values that include HW mark scenarios that REQUIRE physical Jetson + (sometimes) physical A40 to satisfy the project-level Acceptance Gate; surrogate replay does NOT count for those rows.

Capacity rule: scenarios marked dedicate runner MUST NOT run concurrently with any other scenario on the same Jetson — measurement integrity depends on the workload being exclusively the SUT.
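The capacity rule can be enforced when a run is planned. The sketch below groups scenarios into batches so that a dedicated-runner scenario is always alone in its batch. It models only the dedicated versus non-dedicated distinction; the finer "serialise with ..." constraints in the matrix are left out, and all names are illustrative.

```rust
/// Contention class derived from the "Concurrent-with constraint" column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Contention {
    /// "dedicate runner": nothing else may run alongside.
    Dedicated,
    /// Anything else; finer-grained serialisation rules are not modelled here.
    Shared,
}

/// Group scenarios into batches that may run concurrently on one Jetson.
/// Each dedicated scenario gets a batch of its own; consecutive shared
/// scenarios are kept together. Input order is preserved.
pub fn plan_batches<'a>(scenarios: &[(&'a str, Contention)]) -> Vec<Vec<&'a str>> {
    let mut batches: Vec<Vec<&'a str>> = Vec::new();
    // True while the last batch is a shared one that may still be extended.
    let mut open_shared = false;
    for &(id, class) in scenarios {
        match class {
            Contention::Dedicated => {
                batches.push(vec![id]);
                open_shared = false;
            }
            Contention::Shared => {
                if open_shared {
                    batches.last_mut().expect("open batch exists").push(id);
                } else {
                    batches.push(vec![id]);
                    open_shared = true;
                }
            }
        }
    }
    batches
}
```

With this shape a planner can run each inner list concurrently and the outer list sequentially, which keeps measurement scenarios such as NFT-PERF-L1 isolated.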

Open dependencies that affect the harness

Open Q	Affects	Default until resolved
Q6 (MAVLink-2 signing)	`mavlink-sitl` config + observed-MAVLink assertions	signing disabled; tests skip signing assertions until Q6 lands
Q8 (MapObjects conflict resolution)	Mp5 fixture shape	`<DEFERRED>`
Q9 (Operator-command auth scheme)	`operator-replay` envelope format + signature validator	`<DEFERRED>` for O9/O10; O8 runs the happy path only
Q11 (multi-operator session policy)	`operator-replay` session-id semantics	single-operator only
Q14 (movement-detection classical vs learned-CV)	M4 benchmark fixture shape	`<DEFERRED>`
