# Test Environment

Authored by `/test-spec` Phase 2 (2026-05-19) against:

- `_docs/00_problem/problem.md`, `acceptance_criteria.md`, `restrictions.md`, `security_approach.md`
- `_docs/01_solution/solution_draft01.md`
- `_docs/02_document/architecture.md` (incl. §6 NFR Targets, §7 Detailed Design)
- `_docs/00_problem/input_data/data_parameters.md`, `services.md`, `fixtures/README.md`, `expected_results/results_report.md`

Per `.cursor/rules/artifact-srp.mdc` this artifact owns ONLY the test environment / harness shape — measurable thresholds belong in `acceptance_criteria.md`, fixture inventory belongs in `test-data.md`, and per-test specs belong in the sibling `*-tests.md` files.

---

## Overview

**System under test (SUT)**: `autopilot` — a single Rust binary that runs on the Jetson Orin Nano Super of a reconnaissance UAV. Its observable external surfaces:

| Surface | Direction | Protocol | Source/Sink in production |
|---|---|---|---|
| Tier-1 detection RPC | autopilot ⇄ detector | bi-directional gRPC streaming (local) | `../detections` |
| MAVLink command/telemetry | autopilot ⇄ airframe | MAVLink v2 over UDP (or serial) | ArduPilot / PX4 |
| Camera RTSP feed | camera → autopilot | H.264/H.265 1080p, 30/60 fps | ViewPro A40 |
| Gimbal control + telemetry | autopilot ⇄ camera | ViewPro vendor UDP | ViewPro A40 |
| Mission + MapObjects REST | autopilot ⇄ central | HTTPS JSON | `missions` service |
| Operator stream (telemetry out, commands in) | autopilot ⇄ GS | Suite-level modem protocol, signed commands | Ground Station |
| Deep-analysis VLM IPC (optional) | autopilot ⇄ VLM | Unix-domain socket | local-onboard VLM |
| Health endpoint | autopilot → ops | HTTP/JSON | scraped by ops |
| Structured logs | autopilot → ops | JSON to stdout | log shipper |

The harness exercises every one of those surfaces from outside the SUT process. No test reaches inside the binary (no module imports, no direct DB peeks, no shared memory).


**Consumer app purpose**: a black-box test runner (`e2e-consumer`) that:

1. Brings up the SUT in a controlled topology (with mock or live peers).
2. Drives inputs through public surfaces.
3. Captures every observable: outbound network frames, MAVLink commands, gimbal UDP commands, REST calls, operator-stream messages, health-endpoint JSON, log lines, plus passive resource metrics (RSS, CPU, GPU).
4. Compares each observation against the expected result tagged in `_docs/00_problem/input_data/expected_results/results_report.md` and emits a CSV report.

## Test execution tiers

Five execution tiers exist; each test scenario declares which tier(s) it must run in:

| Tier | Purpose | What is real vs mocked | When it runs |
|---|---|---|---|
| **U** — unit | Pure in-process logic with no external surface (state-machine transitions, geometry helpers, schema validators) | Everything in-process | Per commit (cargo test) |
| **I** — component-integration | One autopilot component against mocks for every peer | SUT component real; all peers stubbed/replayed | Per commit; isolates contract drift |
| **B** — blackbox / harness | Full SUT binary against mock peers in containers | SUT binary real; every external peer mocked (HTTPS mock, gRPC replay, MAVLink SITL, scripted operator trace, RTSP loopback) | Per commit + nightly |
| **E** — suite-e2e | Full SUT against live siblings (`../detections`, `../missions`, ArduPilot SITL, Ground Station replay) | All real services in the suite-e2e compose | Nightly + pre-release |
| **HW** — hardware/replay benchmark | SUT binary on representative Jetson hardware OR on a benchmarked replay of that hardware | Real Jetson Orin Nano Super OR benchmarked replay | Pre-release; the only path that satisfies the `acceptance_criteria.md → Acceptance Gates (project-level)` hardware gate |

Hardware-dependency analysis (which AC rows require HW vs replay vs commodity) is produced by the test-spec `phases/hardware-assessment.md` step before Phase 4 runner
scripts are generated and is appended to this file as `## Hardware Execution Matrix`.

## Docker environment (Tier B + E)

The suite-e2e compose lives at the monorepo level (`../e2e/docker-compose.suite-e2e.yml`, owned by the `monorepo-e2e` skill — see `_docs/00_problem/input_data/services.md`). The autopilot-local harness lives at `e2e/docker-compose.autopilot-e2e.yml` (created by Phase 4) and brings up only the SUT + mocks needed for Tier-B runs.

### Services (Tier B — autopilot-local harness)

| Service | Image / Build | Purpose | Ports |
|---|---|---|---|
| `autopilot` | build: `.` (cross to `aarch64-unknown-linux-gnu` for HW, native for Tier B) | SUT | health: 9100/tcp; log: stdout; MAVLink: 14550/udp; gimbal: 9201/udp; operator: 9301/tcp |
| `detections-mock` | build: `e2e/mocks/detections-mock` (Python) | Bi-directional gRPC mock replaying recorded `Detections` streams | 50051/tcp |
| `missions-mock` | build: `e2e/mocks/missions-mock` (Python FastAPI) | HTTPS REST mock — `GET/POST /missions/{id}` + `/mapobjects` | 8443/tcp (TLS) |
| `rtsp-loopback` | image: `bluenviron/mediamtx` | RTSP server playing back recorded `.mp4` frame sequences at 30/60 fps | 8554/tcp |
| `gimbal-mock` | build: `e2e/mocks/gimbal-mock` (Rust) | ViewPro UDP echo + scripted yaw/pitch/zoom telemetry replays | 9200/udp |
| `mavlink-sitl` | image: `ardupilot/ardupilot-sitl` | ArduPilot SITL — MAVLink v2 endpoint for the autopilot to drive | 14551/udp |
| `vlm-mock` | build: `e2e/mocks/vlm-mock` (Python, UDS) | Optional Tier-3 VLM IPC mock; replays recorded `VlmAssessment` JSON | (UDS only) |
| `operator-replay` | build: `e2e/mocks/operator-replay` (Python) | Scripted Ground Station session trace: connect / push frame / push telemetry / operator-click / modem-drop / reconnect / lost-link | 9300/tcp |
| `time-injector` | build: `e2e/mocks/time-injector` (Rust) | Injects clock-drift / NTP-loss scenarios into the SUT container's clock via `faketime` LD_PRELOAD shim | — |
| `e2e-consumer` | build: `e2e/consumer` (Rust + assert crates) | The black-box test runner that drives scenarios + compares observables to expected results | — |

### Networks

| Network | Services | Purpose |
|---|---|---|
| `autopilot-e2e` | all | Isolated test network; no egress |

### Volumes

| Volume | Mounted to | Purpose |
|---|---|---|
| `fixtures-ro` | every mock service (read-only) | Mounts `_docs/00_problem/input_data/fixtures/` for replay sources |
| `expected-ro` | `e2e-consumer:/expected:ro` | Mounts `_docs/00_problem/input_data/expected_results/` for assertion comparison |
| `reports-rw` | `e2e-consumer:/reports` | CSV + JSON test output |
| `autopilot-state` | `autopilot:/var/lib/autopilot` | On-device persistent store (R3, Mp4) — wiped between runs |

### docker-compose structure (outline only — not runnable)

```yaml
services:
  autopilot:
    build: .
    depends_on: [detections-mock, missions-mock, rtsp-loopback, gimbal-mock, mavlink-sitl, operator-replay]
    networks: [autopilot-e2e]
    environment:
      DETECTOR_GRPC: detections-mock:50051
      MISSIONS_URL: https://missions-mock:8443
      RTSP_URL: rtsp://rtsp-loopback:8554/feed
      GIMBAL_UDP: gimbal-mock:9200
      MAVLINK_UDP: mavlink-sitl:14551
      OPERATOR_TCP: operator-replay:9300
      VLM_SOCK: /tmp/vlm.sock
      AUTOPILOT_CONFIG: /etc/autopilot/test.toml
    volumes:
      - autopilot-state:/var/lib/autopilot
  detections-mock: { build: e2e/mocks/detections-mock, volumes: [fixtures-ro:/fixtures:ro] }
  missions-mock: { build: e2e/mocks/missions-mock, volumes: [fixtures-ro:/fixtures:ro] }
  rtsp-loopback: { image: bluenviron/mediamtx, volumes: [fixtures-ro:/fixtures:ro] }
  gimbal-mock: { build: e2e/mocks/gimbal-mock, volumes: [fixtures-ro:/fixtures:ro] }
  mavlink-sitl: { image: ardupilot/ardupilot-sitl }
  vlm-mock: { build: e2e/mocks/vlm-mock, volumes: [fixtures-ro:/fixtures:ro] }
  operator-replay: { build: e2e/mocks/operator-replay, volumes: [fixtures-ro:/fixtures:ro] }
  time-injector: { build: e2e/mocks/time-injector }
  e2e-consumer:
    build: e2e/consumer
    depends_on: [autopilot]
    volumes: [expected-ro:/expected:ro, reports-rw:/reports]

networks:
  autopilot-e2e: {}

volumes:
  fixtures-ro: { driver_opts: { type: none, o: bind, device: ${PWD}/_docs/00_problem/input_data/fixtures } }
  expected-ro: { driver_opts: { type: none, o: bind, device: ${PWD}/_docs/00_problem/input_data/expected_results } }
  reports-rw: {}
  autopilot-state: {}
```

### Suite-e2e compose (Tier E) — referenced, not redefined

For Tier-E runs the harness uses `../e2e/docker-compose.suite-e2e.yml` (owned by `monorepo-e2e`). It adds the real `../detections`, real `../missions`, and a richer `mavlink-sitl` configuration. Autopilot's Tier-E entries in this file MUST mirror the suite-e2e topology — drift is reconciled by the `monorepo-e2e` skill, not here.

## Consumer application (`e2e-consumer`)

**Tech stack**: Rust + `assert_cmd` + `testcontainers-rs` + `prost`/`tonic` (for gRPC observation) + `mavlink-rs` (for MAVLink observation) + `reqwest`/`hyper` (for HTTPS observation) + `tokio-tungstenite` (for operator-stream observation). Tests are organised one-scenario-per-file under `e2e/consumer/tests/scenarios/`.

**Entry point**: `cargo test --release --test scenarios` (orchestrated by `scripts/run-tests.sh`, produced in Phase 4).

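
Each scenario declares the tier(s) it must run in, so the consumer needs some way to model that declaration and filter on it. The following is a minimal sketch using only the standard library; the `Tier`, `Scenario`, and `select` names are hypothetical and are not taken from the actual `e2e-consumer` crate.

```rust
use std::str::FromStr;

/// Execution tiers as listed in the "Test execution tiers" table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    U,
    I,
    B,
    E,
    Hw,
}

impl FromStr for Tier {
    type Err = String;

    /// Parses the tier labels used in this document ("U", "I", "B", "E", "HW").
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "U" => Ok(Tier::U),
            "I" => Ok(Tier::I),
            "B" => Ok(Tier::B),
            "E" => Ok(Tier::E),
            "HW" => Ok(Tier::Hw),
            other => Err(format!("unknown tier: {other:?}")),
        }
    }
}

/// Hypothetical scenario descriptor: a test id plus its declared tiers.
pub struct Scenario {
    pub test_id: &'static str,
    pub tiers: &'static [Tier],
}

/// Returns the ids of the scenarios that declare the requested tier,
/// preserving their original order.
pub fn select(scenarios: &[Scenario], tier: Tier) -> Vec<&'static str> {
    scenarios
        .iter()
        .filter(|s| s.tiers.contains(&tier))
        .map(|s| s.test_id)
        .collect()
}
```

With a declaration table like this, a runner could parse the requested tier once (for example `"HW".parse::<Tier>()`) and pass it to `select` to obtain the scenario set for that run.
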

### Communication with the system under test

| Interface | Protocol | Endpoint / Topic | Authentication |
|---|---|---|---|
| Health endpoint | HTTP GET | `http://autopilot:9100/health` | none (loopback) |
| Structured log stream | line-delimited JSON on stdout | docker-compose log tail | none |
| MAVLink observed | MAVLink v2 / UDP | `mavlink-sitl:14551` (the harness records both sides of the link) | per Q6: MAVLink-2 message signing if configured |
| Gimbal observed | ViewPro UDP | `gimbal-mock:9200` (commands recorded + telemetry replayed) | none |
| RTSP delivered | RTSP | `rtsp://rtsp-loopback:8554/feed` (consumer schedules which clip plays per scenario) | none |
| Detection RPC observed | gRPC streaming | `detections-mock:50051` (consumer scripts the recorded replay served) | none |
| Mission REST observed | HTTPS | `missions-mock:8443` (consumer scripts JSON fixtures + asserts captured request bodies) | TLS cert (self-signed for test) |
| Operator stream observed | Suite modem protocol | `operator-replay:9300` (consumer scripts session traces + signed-command envelopes) | per Q9: signed envelope (HMAC / ed25519 / MAVLink-2-ext) |
| VLM IPC observed (when enabled) | Unix-domain socket | `/tmp/vlm.sock` shared with `vlm-mock` | peer-credential check (security_approach §"Local IPC peer authorisation") |

### What the consumer does NOT have access to

- No direct database access to the autopilot's on-device persistent store (`autopilot-state` volume) — the consumer reads it only via the health endpoint, the operator telemetry stream, or as a post-run forensic check (the storage AC R3 is checked via the BIT health response, not by peeking at SQLite rows).
- No internal Rust module imports — the consumer is a separate crate compiled against published public proto/schema files only.
- No shared memory, no `/proc/$pid/...` inspection beyond passive resource metrics.
- No direct reading of in-flight POI queue ordering — ordering is observed indirectly via the operator-stream emission order and the gimbal command stream.

## External dependency mocks

| Dependency | Mock service | Determinism guarantee | Source fixture(s) |
|---|---|---|---|
| `../detections` Tier-1 RPC | `detections-mock` | Replays recorded `Detections` stream byte-for-byte; same input → same output | `` (live `../detections` used as fallback in Tier-E) |
| `missions` API | `missions-mock` | Static JSON responses per scenario; recorded round-trip captured for `POST` | `` |
| ViewPro A40 camera frames | `rtsp-loopback` (mediamtx) | Plays back `.mp4` at exact configured fps; frame timestamps deterministic | `fixtures/videos/94d42580bd1ad6ff.mp4`, `fixtures/movement/video0[1-4].mp4` |
| ViewPro A40 gimbal control | `gimbal-mock` | Replays `gimbal.csv` per scenario; echoes commands with bounded latency budget per scenario | `` |
| ArduPilot airframe | `mavlink-sitl` (ArduPilot SITL) | Deterministic seed + scripted mission | scripted per scenario; no fixture file required for Tier B (SITL is the fixture) |
| Ground Station modem session | `operator-replay` | Replays `(t, event)` script per scenario | `` |
| Local VLM (Tier-3 optional) | `vlm-mock` | Returns paired `(roi.png → VlmAssessment)` from disk; schema-violation fixtures for fail-closed tests | `` |
| Wall-clock / GPS / NTP | `time-injector` (faketime LD_PRELOAD) | Scripted offset / jump / source-loss; injected at SUT process start | scripted per scenario; no fixture file required |

Mocks that are marked `` are bridged through `_docs/_process_leftovers/2026-05-19_autopilot_test_fixtures.md`. Scenarios that consume those mocks declare `Test status: DEFERRED — input fixture not yet acquired (see leftover row N)` in their entry under the relevant `*-tests.md` file.

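
The deferral rule above is a simple function of fixture availability. The sketch below shows one way to derive the `Test status` line from the fixtures a scenario consumes. The `Fixture` type and `deferred_status` function are hypothetical names introduced for illustration, and the leftover row number is assumed to be known to the caller.

```rust
/// Hypothetical view of one fixture that a scenario depends on.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Fixture {
    /// The fixture exists and can be replayed by its mock.
    Acquired,
    /// Not yet acquired; tracked as row `leftover_row` in the leftovers file.
    Pending { leftover_row: u32 },
}

/// Returns the `Test status` line for a scenario, or `None` when every
/// fixture it consumes is available. The first pending fixture wins.
pub fn deferred_status(fixtures: &[Fixture]) -> Option<String> {
    fixtures.iter().find_map(|f| match f {
        Fixture::Pending { leftover_row } => Some(format!(
            "Test status: DEFERRED — input fixture not yet acquired (see leftover row {leftover_row})"
        )),
        Fixture::Acquired => None,
    })
}
```

A scenario whose fixtures are all `Acquired` yields `None` and runs normally; any `Pending` fixture produces the status string quoted in the paragraph above.
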

## CI/CD integration

| Stage | Tier(s) | When | Gate | Timeout |
|---|---|---|---|---|
| PR pipeline | U, I | on every PR push | block merge on FAIL | 10 min |
| dev-branch nightly | U, I, B | nightly | warn on FAIL; report attached | 60 min |
| weekly suite-e2e | U, I, B, E | weekly + on release branch | block release on FAIL | 180 min |
| pre-release HW benchmark | HW | manual + pre-release | block release on FAIL | 240 min |

Owned in `_docs/02_document/deployment/ci_cd_pipeline.md`. This file only declares which tier each scenario MUST run in; the pipeline orchestration is documented there.

## Reporting

**Format**: CSV (one row per scenario per run).

**Columns**:

| Column | Type | Notes |
|---|---|---|
| `test_id` | string | e.g. `FT-P-001`, `NFT-PERF-L1`, `NFT-SEC-O9` |
| `test_name` | string | short title from the scenario header |
| `tier` | enum | U / I / B / E / HW |
| `seed` | int | deterministic seed used (where applicable) |
| `start_ts_utc` | ISO 8601 | scenario start |
| `duration_ms` | int | total execution time |
| `result` | enum | PASS / FAIL / SKIP / DEFERRED |
| `expected_result_ref` | string | row id in `expected_results/results_report.md` (e.g. `L1`, `Mp3`) |
| `actual_value` | string | quantitative observation (latency_ms, count, etc.) |
| `compare_method` | string | one of `expected-results.md` methods |
| `tolerance` | string | as declared in the expected-results row |
| `failure_reason` | string | populated only on FAIL or DEFERRED |
| `artifacts_path` | string | path under `/reports//` for captured logs / pcaps / mavlink dumps |

**Output path**: `e2e/consumer/reports//report.csv` (mounted host-side to `./reports//report.csv`).

**Sidecar artifacts** per scenario (one folder per `test_id`): `stdout.log`, `stderr.log`, `mavlink.tlog` (where applicable), `pcap.bin` (where applicable), `health-trace.jsonl`, `actual-output.json`.

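
The column list above maps directly onto a row type. The following is a minimal sketch of such a row and its CSV serialisation using only the standard library. The `ReportRow`, `Outcome`, and `csv_field` names are hypothetical; the real consumer may well use a CSV crate instead. Quoting follows the usual RFC 4180 convention, which is an assumption since the document does not specify a CSV dialect.

```rust
/// Values of the `result` column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Pass,
    Fail,
    Skip,
    Deferred,
}

impl Outcome {
    pub fn as_str(self) -> &'static str {
        match self {
            Outcome::Pass => "PASS",
            Outcome::Fail => "FAIL",
            Outcome::Skip => "SKIP",
            Outcome::Deferred => "DEFERRED",
        }
    }
}

/// One report row; field names mirror the column table.
pub struct ReportRow {
    pub test_id: String,
    pub test_name: String,
    pub tier: String,
    pub seed: Option<u64>, // absent where no seed applies
    pub start_ts_utc: String,
    pub duration_ms: u64,
    pub result: Outcome,
    pub expected_result_ref: String,
    pub actual_value: String,
    pub compare_method: String,
    pub tolerance: String,
    pub failure_reason: Option<String>,
    pub artifacts_path: String,
}

pub const HEADER: &str = "test_id,test_name,tier,seed,start_ts_utc,duration_ms,result,\
expected_result_ref,actual_value,compare_method,tolerance,failure_reason,artifacts_path";

/// Quotes a field when it contains a comma, double quote, or line break.
fn csv_field(s: &str) -> String {
    if s.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", s.replace('"', "\"\""))
    } else {
        s.to_string()
    }
}

impl ReportRow {
    /// Serialises the row in column order, without a trailing newline.
    pub fn to_csv_line(&self) -> String {
        let seed = self.seed.map(|s| s.to_string()).unwrap_or_default();
        let duration = self.duration_ms.to_string();
        // `failure_reason` is emitted only on FAIL or DEFERRED.
        let reason = match self.result {
            Outcome::Fail | Outcome::Deferred => self.failure_reason.as_deref().unwrap_or(""),
            Outcome::Pass | Outcome::Skip => "",
        };
        let fields: [&str; 13] = [
            &self.test_id, &self.test_name, &self.tier, &seed, &self.start_ts_utc,
            &duration, self.result.as_str(), &self.expected_result_ref,
            &self.actual_value, &self.compare_method, &self.tolerance, reason,
            &self.artifacts_path,
        ];
        fields.iter().map(|f| csv_field(f)).collect::<Vec<_>>().join(",")
    }
}
```

Writing `HEADER` once followed by one `to_csv_line()` per scenario yields a file in the shape described above. Suppressing `failure_reason` on PASS and SKIP rows keeps the "populated only on FAIL or DEFERRED" rule in one place.
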

## Test Execution

**Decision** (recorded 2026-05-19 by `phases/hardware-assessment.md`): **local-only on Jetson Orin Nano Super**. Every scenario — Tier B, Tier E, Tier HW — runs on representative Jetson hardware (the same hardware the airborne payload deploys to). Docker is used for **service orchestration** (mocks, sibling services) on the Jetson host, NOT for SUT execution on x86.

### Hardware dependencies found

| File | Dependency surfaced |
|---|---|
| `_docs/00_problem/restrictions.md → "Hardware"` | Jetson Orin Nano Super (aarch64), 8 GB shared LPDDR5, 67 TOPS INT8; ViewPro A40 (40× optical zoom + vendor UDP); ViewPro Z40K compatibility |
| `_docs/00_problem/restrictions.md → "Software environment"` | FP16 precision (INT8 rejected); no cloud egress; Tier 1 + local large models share Jetson GPU with mutual exclusion |
| `_docs/01_solution/solution_draft01.md` | "single Rust binary on Jetson Orin Nano Super (aarch64)"; TensorRT FP16; Tokio + Unix-domain-socket VLM IPC |
| `_docs/02_document/architecture.md §6` (NFR Targets) + `§7.6` (Solution Architecture) + `§7.14` (Tech Stack) | cross-compile target `aarch64-unknown-linux-gnu`; TensorRT engine; gimbal UDP; MAVLink-v2 transport |
| `_docs/02_document/components/*/description.md` (13 components) | physical UDP (gimbal_controller), RTSP capture (frame_ingest), MAVLink airframe link (mavlink_layer), local-onboard model (semantic_analyzer + vlm_client) |

### Why local-only on Jetson

The choice rejects two alternatives:

- **Docker-only on x86** would leave Tier-HW rows (L1–L9, Re1, Re2, NFT-RES-LIM-CPU, NFT-RES-LIM-GPU) `SKIPPED-NO-HW`. That defeats the project-level Acceptance Gate (`acceptance_criteria.md → "Acceptance Gates (project-level)"`: every latency criterion MUST be measured on the deployed compute device).
- **Both x86 + Jetson** would split the test surface and let Tier-B scenarios pass on x86 while masking real-hardware regressions (e.g. GPU contention is invisible on x86).


The honest path is to exercise the actual hardware path uniformly.

### Execution instructions (local on Jetson)

**Prerequisites** (one-time, per Jetson runner):

- JetPack 6.x SDK + L4T r36.x (matches the airborne deployment image).
- Rust toolchain pinned to the workspace's `rust-toolchain.toml` (added by Step 7 Implement); rustup target `aarch64-unknown-linux-gnu` already native here.
- Docker + Docker Compose v2 (for orchestrating the mock services + sibling repos in Tier-E mode).
- `mavlink-router`, `tegrastats`, `iperf3`, `tc` (network shaping).
- ViewPro A40 (or Z40K for the Z40K-swap regression run) connected over Ethernet at the documented control endpoint.
- ArduPilot SITL binary installed natively (the Docker image is x86-only; on Jetson aarch64 we run SITL natively or via Apptainer).
- A representative ViewPro A40 RTSP feed source — either the physical camera or a recorded `.mp4` looped through a local `mediamtx`.

**How to start services**: `docker compose -f e2e/docker-compose.autopilot-e2e.yml up -d` brings up `detections-mock`, `missions-mock`, `rtsp-loopback`, `gimbal-mock`, `vlm-mock`, `operator-replay`, `time-injector` on the Jetson host. The SUT (`autopilot` binary) runs **outside** the compose — `cargo run --release` on the Jetson directly, with env vars pointing at the compose-side mock endpoints. For Tier E, swap `detections-mock` → live `../detections` and `missions-mock` → live `missions` per `../e2e/docker-compose.suite-e2e.yml`.

**How to run the test runner**: `scripts/run-tests.sh` (to be created by a Decompose task per `traceability-matrix.md → "Phase 4 SKIPPED"` handoff) orchestrates: bring up compose → start SUT → run `cargo test --release --test scenarios -p e2e-consumer` → tear down. The runner reads `RUN_TIER ∈ {B, E, HW}` to decide which scenarios to execute.

**Environment variables** (consumed by both the SUT and the consumer):

- `RUN_TIER` (`B` | `E` | `HW`) — selects scenario set per the matrix below.
- `AUTOPILOT_CONFIG` — path to the test profile TOML (overrides per-scenario thresholds + Q-tagged defaults).
- `AUTOPILOT_RNG_SEED` — deterministic-seed per scenario; captured in the CSV report.
- `JETSON_RUNNER_ID` — identifier for the physical Jetson + camera+gimbal hardware combo; carried into every CSV row for forensic comparison across runners.

### CI/CD addendum (overrides the earlier `## CI/CD integration` table)

The earlier table assumed a Docker-on-x86 PR pipeline. Under this decision, every tier runs on a Jetson runner. Operationally that means:

| Stage | Tier(s) | When | Gate | Timeout | Runner |
|---|---|---|---|---|---|
| PR pipeline | U, I | on every PR push | block merge on FAIL | 10 min | Jetson runner (native cargo test for U + I) |
| dev-branch nightly | U, I, B | nightly | warn on FAIL; report attached | 60 min | Jetson runner |
| weekly suite-e2e | U, I, B, E | weekly + on release branch | block release on FAIL | 180 min | Jetson runner + live siblings reachable from it |
| pre-release HW benchmark | HW | manual + pre-release | block release on FAIL | 240 min | Jetson runner + physical A40 + airframe SITL/HW |

Capacity note: the PR pipeline running on Jetson trades x86 throughput for execution honesty. If PR latency becomes painful, the team's mitigation is to add more Jetson runners — NOT to fall back to x86 for Tier B (that would defeat the choice).

## Hardware Execution Matrix

Per the local-only-on-Jetson decision, every tier runs on Jetson. The matrix below is collapsed accordingly: it records **what each scenario actually exercises on the Jetson** (which hardware surface is the load-bearing one) so that a runner-capacity planner can predict which scenarios contend for the same physical resource.


| Scenario | Tier | Jetson surface exercised | Concurrent-with constraint |
|---|---|---|---|
| FT-P-001 (D6 Tier-1 contract) | B + E | GPU (Tier 1 inference) | conflicts with NFT-RES-LIM-Re2 / GPU |
| FT-P-002 — FT-P-006 (D1–D5) | E + HW | GPU (Tier 1 inference) | as above |
| FT-P-007 — FT-P-010 (M1–M4) | B + E | CPU (movement) + GPU (Tier 1 inputs) | as above |
| FT-P-011 — FT-P-015 (S1–S5) | B + E | CPU + gimbal UDP + GPU (Tier 3 in S5) | gimbal contention serialises S1/S2/S3 |
| FT-P-016 — FT-P-022 (O1–O7, O8 happy) | B + E | CPU + operator-stream | low contention |
| FT-P-023 (R1 BIT pass) | B + E | every dep mocked | none |
| FT-N-001 — FT-N-002 (R2/R3) | B + E | none (storage seed manipulation) | none |
| FT-N-003 (Mp2 cache-fallback) | B + E | mock timeout on `missions-mock` | none |
| FT-N-004 (O4 below-threshold) | B | CPU only | none |
| FT-P-024 / FT-P-025 / FT-P-026 (Mp1/Mp3/Mp5) | B + E | network + persistent store | persistent-store contention serialises |
| NFT-PERF-L1 | **HW** | GPU (Tier 1) | dedicate runner — measurement integrity |
| NFT-PERF-L2 | HW + B | GPU (Tier 2) | conflicts with L1/L3/L8 — serialise |
| NFT-PERF-L3 | HW + B (vlm-mock) | GPU (Tier 3 VLM) | conflicts with L1/L2 — serialise |
| NFT-PERF-L4 | **HW** | A40 physical zoom motor | dedicate runner — physical motion |
| NFT-PERF-L5 | HW + B | CPU + gimbal UDP | serialise with L4/L8 |
| NFT-PERF-L6 / L7 | B + E | CPU + ego-motion + GPU (Tier 1 inputs) | serialise with L1 |
| NFT-PERF-L8 | HW + B | A40 physical zoom + Tier 1 GPU | dedicate runner |
| NFT-PERF-L9 | B + E | CPU + operator-stream | low contention |
| NFT-PERF-T1 | B | CPU + queue | none |
| NFT-PERF-T2 | B + E | airframe link | low |
| NFT-PERF-T3 | B | RTSP throttling + health | none |
| NFT-RES-R4–R9 | B + E | airframe link + persistent store | serialise per-mission |
| NFT-RES-Mp2 / Mp4 | B + E | network + persistent store | low |
| NFT-SEC-O9 / O10 | B + E | operator-stream + crypto path | low |
| NFT-SEC-CraftedFrame / OversizeCrop | B | decoder CPU | low |
| NFT-SEC-VlmSchemaViolation / FreeFormText | B (vlm-mock) | UDS IPC | low |
| NFT-SEC-IpcPeerAuth | B | UDS IPC + peer-cred | low |
| NFT-SEC-Tier1SchemaViolation | B | Tier-1 RPC | none |
| NFT-SEC-MavlinkUnsigned | B + E | airframe link (Q6 dep) | low |
| NFT-SEC-HealthExposesSecurity | B | counters + health | low |
| NFT-RES-LIM-Re1 | **HW** | full Jetson workload (RSS) | dedicate runner — measurement integrity |
| NFT-RES-LIM-Re2 | **HW** | Tier 1 + autopilot workload concurrent | runs back-to-back with NFT-PERF-L1 in same session |
| NFT-RES-LIM-Storage | B + HW | persistent store | low |
| NFT-RES-LIM-CPU | **HW** | full CPU | dedicate runner |
| NFT-RES-LIM-GPU | **HW** | GPU mutex (Tier 1 vs Tier 3) | dedicate runner |
| NFT-RES-LIM-FileHandles | B + HW | `/proc//fd` | low |

**Bold Tier values** mark scenarios that REQUIRE physical Jetson + (sometimes) physical A40 to satisfy the project-level Acceptance Gate; surrogate replay does NOT count for those rows.

**Capacity rule**: scenarios marked `dedicate runner` MUST NOT run concurrently with any other scenario on the same Jetson — measurement integrity depends on the workload being exclusively the SUT.

## Open dependencies that affect the harness

| Open Q | Affects | Default until resolved |
|---|---|---|
| Q6 (MAVLink-2 signing) | `mavlink-sitl` config + observed-MAVLink assertions | signing disabled; tests skip signing assertions until Q6 lands |
| Q8 (MapObjects conflict resolution) | Mp5 fixture shape | `` |
| Q9 (Operator-command auth scheme) | `operator-replay` envelope format + signature validator | `` for O9/O10; O8 runs the happy path only |
| Q11 (multi-operator session policy) | `operator-replay` session-id semantics | single-operator only |
| Q14 (movement-detection classical vs learned-CV) | M4 benchmark fixture shape | `` |