# Test Environment

> **Active policy — 2026-05-20 (refined)**: the canonical CI / release-gate
> test environment is the Jetson Orin Nano Super (or a Jetson-equivalent
> arm64 agent). **Unit tests** (`pytest tests/unit/`) MAY be run on a local
> developer workstation for fast iteration — they are hardware-agnostic by
> construction, the suite is fully synthetic, and Jetson SSH round-trips add
> latency without adding signal. **Blackbox / e2e / performance / resilience
> / security / resource-limit tests** (`tests/e2e/`, `e2e/tests/`,
> `tests/perf/`, etc.) MUST run on the Jetson — never on a local workstation
> — because their pass criteria are tied to Jetson wall-clock latency,
> thermal envelope, and the real-camera + real-FC SITL loop. Workstation x86
> Docker (the historical "Tier-1" path) is **deprecated** as a supported
> e2e environment; the Tier-1 sections below are retained as historical
> reference / traceability only. CI e2e pipelines target the colocated
> arm64 Jetson Woodpecker agent (see `_docs/04_deploy/ci_cd_pipeline.md`);
> local-development e2e runs SHOULD use `scripts/run-tests-jetson.sh`
> against the configured `jetson-e2e` SSH alias rather than
> `scripts/run-tests.sh`. This refinement supersedes the 2026-05-20 "all
> tiers on Jetson" wording and the 2026-05-09 "both" decision recorded in
> the § Test Execution section.

## Where each tier runs (active policy)

| Tier | Local workstation | Jetson (canonical) | When local is the only option |
|------|--------------------|--------------------|-------------------------------|
| Unit (`tests/unit/`) | ✅ allowed and encouraged for dev iteration | ✅ also run as part of the Jetson CI lane | always |
| Blackbox / e2e (`tests/e2e/`, `e2e/tests/`) | ❌ forbidden — placeholder fixtures + missing hardware = false-negative runs | ✅ required for any merge / release decision | never — if Jetson is unreachable, the e2e verdict is "not run" rather than a local result |
| Performance / resilience / security / resource-limit | ❌ forbidden | ✅ required | never |
| Thermal chamber (AC-NEW-5) | ❌ forbidden | ✅ chamber Jetson only | never |

Practical consequences:

- A PR may merge on green local unit tests + green Jetson e2e tests.
- A PR MAY NOT merge on green local unit tests alone — the Jetson e2e lane is the binding signal.
- When the Jetson agent is offline, the e2e verdict is "pending Jetson" — record the gap (e.g. via `_docs/_process_leftovers/`) rather than substituting a local run.
- Tests in `tests/e2e/` that gate on `RUN_REPLAY_E2E` or `@pytest.mark.tier2` will SKIP locally; this is correct behaviour, not a failure to investigate.

## Overview

**System under test (SUT)**: `gps-denied-onboard` companion-PC service that produces WGS84 position estimates from nav-camera frames + FC IMU/attitude and emits them to the FC over its native external-positioning interface.

Public boundaries (the only surfaces tests interact with):

- **Inbound — nav-camera frames**: V4L2 / GStreamer source (production: USB / MIPI-CSI / GigE per `restrictions.md`; tests: file-backed source replaying `_docs/00_problem/input_data/AD0000NN.jpg` or `flight_derkachi/flight_derkachi.mp4`).
- **Inbound — FC telemetry**: MAVLink (ArduPilot) or MSP2 (iNav) inbound stream carrying `SCALED_IMU2`, `ATTITUDE`, `GLOBAL_POSITION_INT` (or MSP equivalents).
  Tests replay `flight_derkachi/data_imu.csv` through a thin replayer.
- **Inbound — satellite tile cache**: filesystem + on-disk index (FAISS HNSW + tile manifest). Tests load a fixture cache mounted as a Docker volume.
- **Outbound — FC external-positioning**: MAVLink `GPS_INPUT` (ArduPilot Plane) OR MSP2 `MSP2_SENSOR_GPS` (iNav). Tests observe these by spinning up the corresponding open-source SITL and reading what reaches the FC.
- **Outbound — GCS telemetry**: MAVLink to QGroundControl (1-2 Hz downsample of estimates + STATUSTEXT). Tests subscribe via a passive MAVLink listener.
- **Outbound — Flight Data Recorder**: NVM filesystem (per AC-NEW-3). Tests read the resulting FDR archive after the run.

**Consumer app purpose**: The e2e harness drives the SUT through these public boundaries — replaying frames + telemetry, mounting tile-cache fixtures, observing FC-side acceptance via SITL, and parsing FDR output. It NEVER imports SUT modules, NEVER queries SUT internal state, and NEVER touches the SUT's filesystem outside the FDR output directory.

## Two-tier execution profile

> **SUPERSEDED — 2026-05-20**: the two-tier model below is retained for
> historical traceability. The active policy is **Jetson-only** for every
> tier above unit tests, which may also run locally (see banner at the top
> of this doc). Tier-1 (workstation Docker) is deprecated; only the Tier-2
> row continues to describe a supported environment.

This project originally specified two distinct test environments because the production target is Jetson hardware and AC-4.1/AC-4.2/AC-NEW-5 cannot be honestly validated on a generic x86 dev workstation.

| Tier | Hardware | What it covers | What it skips |
|------|----------|----------------|---------------|
| **Tier-1 (workstation Docker)** *(deprecated 2026-05-20)* | x86 dev workstation, optional NVIDIA dGPU for TensorRT validation | All `FT-*` correctness, schema, `NFT-RES-*` resilience scenarios, `NFT-SEC-*` security scenarios, `NFT-LIM-*` storage budgets | Any AC whose pass criterion is bound to Jetson Orin Nano Super wall-clock latency or thermal envelope: AC-4.1 / AC-4.2 / AC-NEW-1 / AC-NEW-5 |
| **Jetson (canonical, 2026-05-20)** *(formerly "Tier-2")* | Jetson Orin Nano Super (pinned hardware per `restrictions.md`), thermal chamber for AC-NEW-5 | Everything: `FT-*` correctness, schema, `NFT-RES-*`, `NFT-SEC-*`, `NFT-LIM-*`, `NFT-PERF-*` (AC-4.1 latency p95), AC-4.2 memory, AC-NEW-1 cold-start TTFF, AC-NEW-5 thermal envelope (chamber-only) | Nothing — anything that doesn't run here doesn't run at all |

CI runs the Jetson pipeline (`01-test.yml`) on the colocated arm64 Jetson agent. Chamber-only AC-NEW-5 runs on `self-hosted-jetson-orin-chamber` on the documented quarterly + pre-release cadence; results are recorded in the same CSV report format.

## Docker Environment (Tier-1)

### Services

| Service | Image / Build | Purpose | Ports |
|---------|--------------|---------|-------|
| `gps-denied-onboard` | local build (`docker/Dockerfile`) | The SUT. Production binary built with `BUILD_VINS_MONO=OFF` per locked sub-decision D-C1-1-SUB-A; research builds run a parallel job with `BUILD_VINS_MONO=ON` | 14550/udp (MAVLink to GCS), 5760/tcp (MSP2 to iNav SITL) |
| `ardupilot-plane-sitl` | `ardupilot/ardupilot-sitl:plane-stable` | ArduPilot Plane SITL. Receives `GPS_INPUT` from the SUT; we read its EKF source-set state to validate AC-4.3, AC-NEW-2, AC-5.x | 14550/udp (MAVLink) |
| `inav-sitl` | `inavflight/inav-sitl:9.0.0` | iNav SITL. Receives `MSP2_SENSOR_GPS` from the SUT; we read its GPS provider state | 5760/tcp (MSP2 over TCP per iNav SITL convention) |
| `mock-suite-sat-service` | local build (`e2e/fixtures/mock-suite-sat`) | Stubs the parent-suite Satellite Service tile-publish API (read-only ingest contract for AC-NEW-7 voting layer). Returns deterministic fixture tiles | 8080/tcp |
| `e2e-runner` | local build (`e2e/runner`) | Pytest-based harness. Drives all replays, reads FDR output, spins SITL scenarios. See § Harness Implementation Layout below for the per-evaluator inventory. | — |
| `mavproxy-listener` | `ardupilot/mavproxy:latest` | Passive MAVLink listener that captures the SUT → GCS stream into a per-run `.tlog` for assertions | 14551/udp |

### Networks

| Network | Services | Purpose |
|---------|----------|---------|
| `e2e-net` | all | Isolated test network. No host networking, no internet. Per RESTRICT-SAT-1, the SUT must NEVER reach an external satellite provider during a flight; a deny-all egress rule on `e2e-net` enforces this and is itself a security test (NFT-SEC-02). |

### Volumes

| Volume | Mounted to | Purpose |
|--------|-----------|---------|
| `tile-cache-fixture` | `gps-denied-onboard:/var/azaion/tile-cache:ro` | Pre-built FAISS HNSW index + tile filesystem. Built once per test run from `e2e/fixtures/tile-cache-builder/` from the 60 still-image satellite references and the Derkachi route bbox. Read-only mount mirrors AC-8.3 pre-flight load behavior. |
| `fdr-output` | `gps-denied-onboard:/var/azaion/fdr` | Per-flight FDR write target (AC-NEW-3 64 GB cap enforced via Docker `--storage-opt size=64g` on this volume) |
| `input-data` | `e2e-runner:/test-data:ro` | Bind mount of `_docs/00_problem/input_data/` for replay |
| `expected-results` | `e2e-runner:/expected:ro` | Bind mount of `_docs/00_problem/input_data/expected_results/` for assertions |

### docker-compose structure

```yaml
services:
  gps-denied-onboard:
    build:
      context: ../..
      dockerfile: docker/Dockerfile
      args:
        BUILD_VINS_MONO: "OFF"
    networks: [e2e-net]
    volumes:
      - tile-cache-fixture:/var/azaion/tile-cache:ro
      - fdr-output:/var/azaion/fdr
    environment:
      ONBOARD_FC_ADAPTER: ${FC_ADAPTER}        # ardupilot | inav, set per scenario
      ONBOARD_VIO_STRATEGY: ${VIO_STRATEGY}    # okvis2 | klt_ransac (production); vins_mono only in research build
      MAVLINK_SIGNING_PASSKEY_FILE: /run/secrets/mavlink_passkey
    depends_on:
      - mock-suite-sat-service

  ardupilot-plane-sitl:
    image: ardupilot/ardupilot-sitl:plane-stable
    networks: [e2e-net]
    command: ["--vehicle=ArduPlane", "--gps-type=14"]   # GPS_TYPE=14 = MAV per ArduPilot SITL_simulation_parameters.html

  inav-sitl:
    image: inavflight/inav-sitl:9.0.0
    networks: [e2e-net]
    # iNav SITL exposes MSP on TCP 5760 (UART1) per docs/SITL/SITL.md

  mock-suite-sat-service:
    build: ../fixtures/mock-suite-sat
    networks: [e2e-net]
    # Egress restriction enforced at network level, not service level

  e2e-runner:
    build: ../runner
    networks: [e2e-net]
    volumes:
      - input-data:/test-data:ro
      - expected-results:/expected:ro
      - fdr-output:/fdr:ro
    depends_on:
      - gps-denied-onboard
      - ardupilot-plane-sitl
      - inav-sitl
      - mavproxy-listener

  mavproxy-listener:
    image: ardupilot/mavproxy:latest
    networks: [e2e-net]

networks:
  e2e-net:
    driver: bridge
    internal: true   # NO external connectivity (enforces RESTRICT-SAT-1)

volumes:
  tile-cache-fixture: {}
  fdr-output: {}
```

## Consumer Application

**Tech stack**: Python 3.12, pytest 8.x, pymavlink (MAVLink ground side), `msp_gps_toy` (MSP2 ground side, Rust binary called via subprocess), OpenCV ≥4.11.0,<4.12 (frame source replay; see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md` — pin is held below 4.12 until gtsam ships numpy-2 wheels; D-CROSS-CVE-1 leftover remains open), numpy ≥1.26,<2.0 + scipy (geodesic-distance assertions in WGS84).

**Entry point**: `pytest e2e/tests/` from inside `e2e-runner`.
Each scenario is a parameterized pytest case keyed by FC adapter (`ardupilot` / `inav`) and VioStrategy (`okvis2` / `klt_ransac`) via the session-scoped conftest fixtures.

### Harness Implementation Layout

The blackbox harness implementation lives under `e2e/` (NOT the SUT source tree — public-boundary discipline enforced by `e2e/README.md`):

```
e2e/
├── docker/                              Tier-1 entrypoint
│   ├── docker-compose.test.yml          Compose stack (services from § Services above)
│   ├── docker-compose.tier2-bridge.yml  Compose override for paired-host Tier-2 SITL bridging
│   ├── run-tier1.sh                     AZ-444 selector-parity wrapper
│   └── secrets/                         Mounted Docker secrets (mavlink-passkey)
├── jetson/                              Tier-2 entrypoint
│   ├── run-tier2.sh                     AZ-444 selector-parity wrapper (control-host side)
│   ├── tier2-on-jetson.sh               SSH-orchestrated on-Jetson half
│   ├── tier2.service                    systemd unit template
│   ├── jtop_parser.py                   jetson_stats / jtop telemetry parser (NFT-LIM-01)
│   └── tegrastats_parser.py             tegrastats parser (NFT-LIM-04)
├── runner/                              e2e-runner image
│   ├── Dockerfile, conftest.py, pytest.ini, requirements.txt
│   ├── helpers/                         Per-AC evaluator + observer modules (47 evaluators
│   │                                    covering accuracy, AP/iNav contract, blackout-spoof,
│   │                                    cache poisoning, cold-start, companion reboot,
│   │                                    CVE probe, e2e latency, egress observer, escalation
│   │                                    ladder, FDR reader, frame-source replay, IMU replay,
│   │                                    injector fixtures, MAVLink signing, MAVProxy tlog,
│   │                                    memory budget, mid-flight tile, mock suite-sat audit,
│   │                                    Monte Carlo envelope, MRE, multi-segment, outage
│   │                                    request, outlier tolerance, registration classifier,
│   │                                    retrieval, sharp-turn, sitl_observer, smoothing,
│   │                                    spoof promotion, storage budget, streaming, thermal
│   │                                    envelope, tile-cache inspector, TTFF — see
│   │                                    `e2e/runner/helpers/` for the authoritative list)
│   └── reporting/                       CSV reporter + evidence bundler (AZ-445/446)
│       ├── csv_reporter.py              Emits `report.csv` per § Reporting
│       ├── evidence_bundler.py          Collects per-run `.tlog`, FDR, telemetry CSVs
│       └── nfr_recorder.py              NFR per-stage latency + budget recorder
├── fixtures/                            Fixture builders + captured fixtures
│   ├── tile-cache-builder/              `tile-cache-fixture` builder
│   ├── age-injector/                    `synth-age-tile-set` builder (FT-N-05)
│   ├── injectors/                       Runtime injectors:
│   │   ├── outlier.py                   `outlier-injection-derkachi` (FT-N-01)
│   │   ├── blackout_spoof.py            `blackout-spoof-derkachi` (FT-N-04, NFT-RES-04)
│   │   ├── multi_segment.py             `multi-segment-derkachi` (FT-P-08)
│   │   ├── cold_boot.py                 `cold-boot-fixture` (NFT-PERF-03)
│   │   └── fc_proxy.py                  FC-inbound blackout/spoof proxy (FT-N-04 driver)
│   ├── sitl_replay/                     Captured offline FDR-replay fixtures
│   │   └── p01/                         FT-P-01 capture set (see test-data.md)
│   ├── sitl_replay_builder/             Captured-fixture builder framework (AZ-598-600)
│   │   ├── builder.py                   VideoSource × TlogSource × FdrProjection strategies
│   │   ├── build_p01_fixtures.py        FT-P-01 still-image builder
│   │   └── build_p02_fixtures.py        FT-P-02 Derkachi builder
│   ├── mock-suite-sat/                  `mock-suite-sat-service` Docker image
│   ├── secrets/                         Test-only secrets (mavlink-test-passkey.txt)
│   └── security/                        Security fixtures (cve-2025-53644.jpg)
├── tests/                               Pytest target: positive/, negative/, performance/,
│                                        resilience/, security/, resource_limit/
└── _unit_tests/                         Out-of-container unit tests for harness internals
                                         (runs as part of project pytest, no Docker required)
```

### Replay-Mode Skip Gating

Several FT-* and FT-N-* scenarios rely on a pre-captured FDR-replay fixture instead of a live SITL run. When the `E2E_SITL_REPLAY_DIR` environment variable is unset, those scenarios skip cleanly via a `sitl_replay_ready` pytest marker (per AZ-594/595/598/599).
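
The gate described above can be sketched as a small decision function. This is a minimal illustration, not the harness's actual conftest code: the function name `replay_skip_reason` and the extra directory-existence check are assumptions. Only the `E2E_SITL_REPLAY_DIR` variable and the skip-when-unset behaviour come from this document.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def replay_skip_reason(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a skip reason for replay-gated scenarios, or None if they may run.

    Hypothetical helper: the name and the directory check are illustrative.
    """
    replay_dir = env.get("E2E_SITL_REPLAY_DIR", "").strip()
    if not replay_dir:
        # Documented behaviour: unset variable means the scenarios skip.
        return "E2E_SITL_REPLAY_DIR is unset; captured-fixture replay is disabled"
    if not Path(replay_dir).is_dir():
        # Assumption: a path that is not a directory is treated like "unset".
        return f"E2E_SITL_REPLAY_DIR={replay_dir!r} is not a directory"
    return None
```

A conftest collection hook could call such a function once per session and attach a skip marker, carrying the returned reason, to every collected item marked `sitl_replay_ready`.
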

To activate the replay-gated scenarios:

```bash
E2E_SITL_REPLAY_DIR=e2e/fixtures/sitl_replay/p01 \
  pytest e2e/tests/positive/test_ft_p_01_still_image_accuracy.py
```

The captured-fixture builder framework (`e2e/fixtures/sitl_replay_builder/`) regenerates these fixtures from `_docs/00_problem/input_data/` against a live compose stack; the captured artifacts are then committed under `e2e/fixtures/sitl_replay//`. See `e2e/fixtures/sitl_replay_builder/README.md` for the framework, supported scenarios, and per-scenario builder invocations.

### Communication with system under test

| Interface | Protocol | Endpoint / Topic | Authentication |
|-----------|----------|-----------------|----------------|
| Frame source | V4L2 / GStreamer file source | UNIX domain socket / shared `/test-data` mount | none (local) |
| FC telemetry inbound | MAVLink (AP) or MSP2 (iNav) | `udp:gps-denied-onboard:14550` (AP) or `tcp:gps-denied-onboard:5760` (iNav) | MAVLink 2.0 message signing on AP per D-C8-9 (passkey via Docker secret); iNav unsigned per accepted residual risk |
| Tile cache | Filesystem read | `/var/azaion/tile-cache` (read-only mount) | filesystem perms |
| FC external-pos outbound observation | Read SITL EKF source-set + GLOBAL_POSITION_INT replay back from SITL | `udp:ardupilot-plane-sitl:14550` or `tcp:inav-sitl:5760` | passive listener |
| GCS telemetry observation | MAVLink listener | `udp:mavproxy-listener:14551` (forwarded from SUT 14550) | none |
| FDR output | Filesystem read post-run | `/fdr` (read-only mount) | filesystem perms |
| Suite Sat Service mock | HTTP/JSON | `http://mock-suite-sat-service:8080` | none (test) |

### What the consumer does NOT have access to

- No direct access to the SUT's internal state (GTSAM iSAM2 graph, FAISS index in-memory, OpenCV intermediate buffers, VioStrategy implementation pointer).
- No internal Python/C++ module imports from the SUT.
- No shared memory or filesystem with the SUT outside the four explicit mounts (`tile-cache-fixture` r/o, `fdr-output` r/o from runner side, `input-data` r/o, `expected-results` r/o).
- No bypass of the FC-side acceptance check — every AC-4.3 assertion goes through SITL.

## CI/CD Integration

> **2026-05-20**: rewritten for the Jetson-only policy. Tier-1 references in the historical sub-sections below are no longer operative.

**When to run** (active policy):

- Jetson (colocated arm64 Woodpecker agent): on every PR to `dev` branch, nightly on `dev` HEAD, and as a hard gate before any release tag.
- AC-NEW-5 thermal envelope: quarterly on the chamber-attached Jetson runner; failures block release tags only.

**Pipeline stage**: a single Jetson workflow (`.woodpecker/01-test.yml`) on the `self-hosted-jetson-orin` runner exercises the full suite — there is no longer a parallel x86 lane.

**Gate behavior**: Jetson blocks PR merge on any test failure and blocks release tags on any test failure. Chamber tests are warning-only on PRs and blocking on release tags.

**Timeout**:

- Jetson: 4 hr per matrix entry (allows for full Derkachi 8 min replay × ~10 scenarios + cold-boot loops).
- Thermal chamber AC-NEW-5: 9 hr (8 h hot-soak + setup/teardown).

## Reporting

**Format**: CSV (one row per test).

**Columns**: `test_id, test_name, traces_to, fc_adapter, vio_strategy, tier, started_at_utc, execution_time_ms, result, error_message, evidence_paths`

- `traces_to`: comma-separated AC/RESTRICT IDs from the traceability matrix.
- `fc_adapter`: `ardupilot` | `inav` | `n/a`.
- `vio_strategy`: `okvis2` | `klt_ransac` | `vins_mono` | `n/a` (research-build only for `vins_mono`).
- `tier`: `tier1-docker` | `tier2-jetson` | `tier2-chamber`.
- `result`: `PASS` | `FAIL` | `SKIP` | `XFAIL` (XFAIL only allowed for AC explicitly marked NOT COVERED in the traceability matrix and not yet promoted to a real test).
- `evidence_paths`: comma-separated paths inside the run-output bundle (`.tlog` files, FDR archives, screenshots, profiler traces) supporting the verdict.

**Output path**: `e2e-results/run-${RUN_ID}/report.csv` plus a per-run bundle of evidence at `e2e-results/run-${RUN_ID}/evidence/`.

## Test Execution

**Decision (2026-05-20, refined later that day)** — **Jetson is the binding e2e environment; unit tests may run locally.** This refines the earlier "Jetson only for everything" wording. Rationale captured in `_docs/LESSONS.md` (2026-05-20 entries):

- The original "Jetson-only across all tiers" decision came from repeated workstation-vs-Jetson environment divergences in the e2e / build path (Dockerfile build order, missing `libgl1`, gtsam wheel availability, venv symlink resolution, lazy-import side-effect registration). Those divergences are real and continue to justify Jetson as the binding e2e environment.
- Forcing the unit-test suite over an SSH-orchestrated Jetson loop added 30–90 s per iteration without producing any signal the local interpreter doesn't already produce. The unit suite is fully synthetic — no camera, no SITL, no Jetson-specific runtime — so a local PASS is equivalent to a Jetson PASS for that tier.

**Operational entry points**:

| Tier | Entry point | Where it runs |
|------|-------------|---------------|
| Unit (`tests/unit/`) | `pytest tests/unit/ -q` directly, or `scripts/run-tests.sh` | local workstation (Python 3.10+ venv) |
| Blackbox / e2e (`tests/e2e/`, `e2e/tests/`) | `scripts/run-tests-jetson.sh` (local dev) / `.woodpecker/01-test.yml` (CI) | colocated arm64 Jetson Woodpecker agent — see `_docs/04_deploy/ci_cd_pipeline.md` |
| Performance / resilience / security / resource-limit | same as e2e | Jetson only |
| AC-NEW-5 thermal chamber | quarterly + pre-release | `self-hosted-jetson-orin-chamber` |

A green local unit-test run is necessary-but-not-sufficient for merge; the Jetson e2e lane is the binding signal.
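
The merge rule stated above can be written down as a two-input predicate. The sketch below is illustrative only: the `Verdict` enum and `may_merge` function are hypothetical names, not part of the CI configuration, and `NOT_RUN` stands in for the "pending Jetson" state used when the agent is offline.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_RUN = "not_run"  # e.g. Jetson agent offline: "pending Jetson"


def may_merge(unit: Verdict, jetson_e2e: Verdict) -> bool:
    """A PR may merge only when both the unit run and the Jetson e2e lane are green.

    A missing Jetson verdict never counts as a pass, so a green local
    unit run alone is not sufficient.
    """
    return unit is Verdict.PASS and jetson_e2e is Verdict.PASS


print(may_merge(Verdict.PASS, Verdict.NOT_RUN))  # False
```

Treating `NOT_RUN` as distinct from `FAIL` keeps the recorded gap visible instead of folding it into a failure or, worse, a local substitute result.
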

The remainder of this section preserves the original 2026-05-09 decision context for traceability.

---

**Decision (2026-05-09, SUPERSEDED)**: **both** — Tier-1 Docker + Tier-2 Jetson hardware loop. Confirmed at the Hardware-Dependency Assessment Step 4 gate.

### Hardware dependencies found (Phase 3 → Hardware Assessment scan)

| Category | Indicator | Source file |
|---|---|---|
| GPU / CUDA | TensorRT engines (`.engine`, SM 87, JetPack 6.2, TRT 10.3) | `_docs/01_solution/solution.md` PRE-FLIGHT block |
| GPU / CUDA | DISK+LightGlue FP16 inference | `_docs/01_solution/solution.md` RUNTIME block (C3) |
| GPU / CUDA pin | Jetson Orin Nano Super (67 TOPS sparse INT8, 8 GB shared LPDDR5, 25 W) | `_docs/00_problem/restrictions.md` § Onboard Hardware |
| Sensors / Cameras | ADTi 20MP 20L V1 nadir camera over USB / MIPI-CSI / GigE | `_docs/00_problem/restrictions.md` § Cameras |
| Sensors / Cameras | V4L2 / GStreamer frame source (production) | `_docs/02_document/tests/environment.md` § Overview |
| OS-specific services | High-rate IMU via UART/MAVLink to FC | `_docs/00_problem/restrictions.md` § Sensors & Integration |
| OS-specific services | Per-FC inbound (MAVLink GPS_INPUT for AP, MSP2 over UART for iNav) | `_docs/00_problem/restrictions.md` § Sensors & Integration |
| OS-specific services | tegrastats / jetson_stats for thermal telemetry | `_docs/02_document/tests/resource-limit-tests.md` NFT-LIM-04 |
| Thermal envelope | -20 °C to +50 °C operating envelope, 25 W TDP, 8 h duty cycle | `_docs/00_problem/restrictions.md` § Failsafe & Safety + AC-NEW-5 |

(Step 2 Code scan from the planning phase returned zero indicators because no source code existed yet.
Post-implementation: `pyproject.toml` confirms `tensorrt`, `pymavlink`, `gtsam==4.2.1`, `faiss-gpu`, `opencv-python>=4.11.0.86,<4.12` (cycle-1 relaxation per `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md` — the original `>=4.12.0` target replays once gtsam ships numpy-2 wheels), and `jetson-stats`. `pycuda` was NOT added — TensorRT EP is invoked via ONNX Runtime + the `onnx_trt_ep_runtime` factory, which uses TensorRT's Python bindings directly without `pycuda`.)

### Execution instructions — Tier-1 (Docker)

**Prerequisites**:

- Docker 24+ with Compose v2.
- NVIDIA Container Toolkit if the workstation has an NVIDIA dGPU (lets the SUT exercise the TensorRT path; otherwise falls back to CPU TensorRT).
- ≥16 GB host RAM, ≥80 GB free disk for `tile-cache-fixture` + `fdr-output` + image build cache.

**How to start** (preferred — selector-parity wrapper from AZ-444):

```bash
./e2e/docker/run-tier1.sh \
    --fc-adapter ardupilot \
    --vio-strategy okvis2 \
    [-k ] \
    [--build-kind production|asan] \
    [--enable-chamber]
```

`run-tier1.sh` and `e2e/jetson/run-tier2.sh` accept the same `-k ` flag and emit the same pytest invocation modulo the `TIER` env var (AZ-444 AC-1).

Raw-compose equivalent (when bypassing the wrapper for debugging):

```bash
cd e2e/docker
export FC_ADAPTER=ardupilot VIO_STRATEGY=okvis2
docker compose -f docker-compose.test.yml up --build --abort-on-container-exit e2e-runner
```

The run reports to `./e2e-results/run-${RUN_ID}/report.csv` (see § Reporting). Exit code matches the test verdict.

**Environment variables**:

- `FC_ADAPTER` ∈ `{ardupilot, inav}` — selects which SITL the SUT talks to.
- `VIO_STRATEGY` ∈ `{okvis2, klt_ransac}` for production binary; `vins_mono` only when the research binary `BUILD_VINS_MONO=ON` is the build.
- `MAVLINK_SIGNING_PASSKEY_FILE` — path to the Docker secret loaded with the test passkey for FT-P-09-AP / NFT-SEC-03.
- `E2E_SITL_REPLAY_DIR` — when set, activates captured-fixture FDR-replay mode for scenarios that gate on `sitl_replay_ready`; unset → those scenarios skip cleanly (see § Replay-Mode Skip Gating above).
- `RUN_ID` — per-invocation run identifier; defaults to `local-${USER}-${EPOCH}` in development, CI sets it from the workflow run id. Determines the `e2e-results/run-${RUN_ID}/` output directory.

**Skipped on Tier-1**: `NFT-PERF-01` (AC-4.1 latency p95 — Jetson-bound), `NFT-LIM-01` (AC-4.2 memory — Jetson-bound), `NFT-PERF-03` (AC-NEW-1 cold-start — Jetson-bound), `NFT-LIM-04` (AC-NEW-5 chamber baseline — Jetson-bound), AC-NEW-5 chamber portion (chamber-bound).

### Execution instructions — Tier-2 (Jetson hardware loop)

**Prerequisites**:

- Jetson Orin Nano Super (per `restrictions.md` § Onboard Hardware).
- JetPack 6.2 + CUDA + TensorRT 10.3 + cuDNN per D-C7-9.
- Workstation thermal-day environment for NFT-LIM-04 baseline. Chamber-attached runner for AC-NEW-5 chamber portion (separate quarterly job; not run in standard CI).
- ArduPilot Plane SITL + iNav SITL run on the same Jetson, OR on a paired x86 host on the same network — both are supported.
- Real ADTi 20MP 20L V1 camera connected via USB/MIPI-CSI/GigE; OR file-replay source if camera unavailable (in which case all `AC-2.x` cross-validation is `XFAIL` for that run).

**How to start** (AZ-444 selector-parity wrapper):

```bash
./e2e/jetson/run-tier2.sh \
    --fc-adapter ardupilot \
    --vio-strategy okvis2 \
    [-k ] \
    [--build-kind production|asan] \
    [--duration 5min|8h] \
    [--enable-chamber] \
    [--reflash]
```

The Tier-2 SITL stack runs on a paired x86 host via:

```bash
docker compose \
    -f e2e/docker/docker-compose.test.yml \
    -f e2e/docker/docker-compose.tier2-bridge.yml up ...
```

When invoked on a control host (typical), the script SSH-orchestrates the Jetson half (`tier2-on-jetson.sh`). When `TIER2_HOST=localhost` and the script runs on the Jetson itself, it delegates directly without SSH.
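
The local-versus-SSH dispatch described above can be illustrated with a short sketch. It is not the wrapper's actual logic (the wrapper is a shell script whose internals are not shown here). The function name `tier2_command`, the default of `localhost` when `TIER2_HOST` is unset, and the idea that `tier2-on-jetson.sh` receives the same arguments are all assumptions made for illustration.

```python
import shlex
from typing import List, Mapping

ON_JETSON_SCRIPT = "e2e/jetson/tier2-on-jetson.sh"


def tier2_command(env: Mapping[str, str], script_args: List[str]) -> List[str]:
    """Build the argv that would start the on-Jetson half of a Tier-2 run."""
    host = env.get("TIER2_HOST", "localhost")  # assumption: unset means local
    if host == "localhost":
        # Already on the Jetson: run the script directly, no SSH hop.
        return ["bash", ON_JETSON_SCRIPT, *script_args]
    # Remote case: TIER2_USER and TIER2_KEY_PATH are required (KeyError if absent).
    user = env["TIER2_USER"]
    key_path = env["TIER2_KEY_PATH"]
    # Quote the remote command so arguments survive the remote shell.
    remote = shlex.join(["bash", ON_JETSON_SCRIPT, *script_args])
    return ["ssh", "-i", key_path, f"{user}@{host}", remote]
```

Returning an argv list rather than a shell string keeps the local case free of quoting concerns; only the remote command, which an SSH server hands to a shell, needs `shlex.join`.
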

The run produces the same CSV format as Tier-1 (one `report.csv` per run) plus tegrastats + jtop CSVs in the evidence bundle.

**Environment variables**: same as Tier-1 plus:

- `TIER2_HOST` / `TIER2_USER` / `TIER2_KEY_PATH` — control-host → Jetson SSH wiring (required when `TIER2_HOST != localhost`).
- `TIER2_CHAMBER_AMBIENT_C` — ambient temperature for AC-NEW-5 chamber runs.
- `TIER2_CAMERA_DEVICE` — `/dev/video0` (production) or file path for replay mode.

`gps-denied-onboard.service` (or `gps-denied-onboard-asan.service` for `--build-kind=asan`) MUST be installed via systemd on the Jetson — `e2e/jetson/tier2.service` is the template. See `_docs/03_implementation/jetson_harness_setup.md` for the physical provisioning steps.

### CI runner mapping

**Active mapping (2026-05-20)**:

- `self-hosted-jetson-orin` (colocated arm64 Woodpecker agent) → all test runs, every PR + nightly + pre-release. ~4 hr per matrix entry. **This is the single canonical CI test runner.**
- `self-hosted-jetson-orin-chamber` → AC-NEW-5 hot-soak. Quarterly + before any release tag. ~9 hr.

**Removed (2026-05-20)**:

- ~~`ubuntu-24.04` (GitHub-hosted) → Tier-1 Docker, every PR + nightly. ~30-45 min per matrix entry.~~ — Tier-1 workstation Docker is deprecated; no x86 CI agent participates in the test path. CI build-push lanes that ship images may still run on amd64 if/when that matrix dimension is uncommented in `02-build-push.yml`, but the test lane is Jetson-only.

**Matrix dimensions**: `FC_ADAPTER × VIO_STRATEGY × build_kind` where `build_kind ∈ {production, research}`. Production `vins_mono` is excluded (D-C1-1-SUB-A locked); research includes all three VioStrategy values.
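
To make the matrix arithmetic concrete, the following sketch enumerates the entries under the exclusion rule stated above. The value sets are the ones named in this document; the function name `matrix_entries` is hypothetical and the actual CI matrix is defined in the Woodpecker configuration, not by this code.

```python
from itertools import product
from typing import Iterator, Tuple

FC_ADAPTERS = ("ardupilot", "inav")
VIO_STRATEGIES = ("okvis2", "klt_ransac", "vins_mono")
BUILD_KINDS = ("production", "research")


def matrix_entries() -> Iterator[Tuple[str, str, str]]:
    """Yield (fc_adapter, vio_strategy, build_kind) for every allowed combination."""
    for fc, vio, build in product(FC_ADAPTERS, VIO_STRATEGIES, BUILD_KINDS):
        if build == "production" and vio == "vins_mono":
            # Excluded: production builds lock BUILD_VINS_MONO=OFF (D-C1-1-SUB-A).
            continue
        yield fc, vio, build


entries = list(matrix_entries())
print(len(entries))  # 10: the 2 x 3 x 2 = 12 combinations minus 2 excluded
```

At roughly 4 hr per matrix entry, the entry count is the main driver of total Jetson lane time, which is why the production `vins_mono` exclusion matters for scheduling as well as correctness.
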