# Test Environment

## Overview

**System under test (SUT)**: `gps-denied-onboard` companion-PC service that produces WGS84 position estimates from nav-camera frames + FC IMU/attitude and emits them to the FC over its native external-positioning interface.

Public boundaries (the only surfaces tests interact with):

- **Inbound (nav-camera frames)**: V4L2 / GStreamer source (production: USB / MIPI-CSI / GigE per `restrictions.md`; tests: file-backed source replaying `_docs/00_problem/input_data/AD0000NN.jpg` or `flight_derkachi/flight_derkachi.mp4`).
- **Inbound (FC telemetry)**: MAVLink (ArduPilot) or MSP2 (iNav) inbound stream carrying `SCALED_IMU2`, `ATTITUDE`, `GLOBAL_POSITION_INT` (or MSP equivalents). Tests replay `flight_derkachi/data_imu.csv` through a thin replayer.
- **Inbound (satellite tile cache)**: filesystem + on-disk index (FAISS HNSW + tile manifest). Tests load a fixture cache mounted as a Docker volume.
- **Outbound (FC external-positioning)**: MAVLink `GPS_INPUT` (ArduPilot Plane) OR MSP2 `MSP2_SENSOR_GPS` (iNav). Tests observe these by spinning up the corresponding open-source SITL and reading what reaches the FC.
- **Outbound (GCS telemetry)**: MAVLink to QGroundControl (1-2 Hz downsample of estimates + STATUSTEXT). Tests subscribe via a passive MAVLink listener.
- **Outbound (Flight Data Recorder)**: NVM filesystem (per AC-NEW-3). Tests read the resulting FDR archive after the run.

**Consumer app purpose**: The e2e harness drives the SUT through these public boundaries: replaying frames + telemetry, mounting tile-cache fixtures, observing FC-side acceptance via SITL, and parsing FDR output. It NEVER imports SUT modules, NEVER queries SUT internal state, and NEVER touches the SUT's filesystem outside the FDR output directory.

## Two-tier execution profile

This project requires two distinct test environments because the production target is Jetson hardware and AC-4.1/AC-4.2/AC-NEW-5 cannot be honestly validated on a generic x86 dev workstation.

| Tier | Hardware | What it covers | What it skips |
|------|----------|----------------|---------------|
| **Tier-1 (workstation Docker)** | x86 dev workstation, optional NVIDIA dGPU for TensorRT validation | All `FT-*` correctness, schema, `NFT-RES-*` resilience scenarios, `NFT-SEC-*` security scenarios, `NFT-LIM-*` storage budgets | Any AC whose pass criterion is bound to Jetson Orin Nano Super wall-clock latency or thermal envelope: AC-4.1 / AC-4.2 / AC-NEW-1 / AC-NEW-5 |
| **Tier-2 (Jetson hardware loop)** | Jetson Orin Nano Super (pinned hardware per `restrictions.md`), thermal chamber for AC-NEW-5 | AC-4.1 latency p95, AC-4.2 memory, AC-NEW-1 cold-start TTFF, AC-NEW-5 thermal envelope (chamber-only) | Iteration speed (manual hardware time) |

CI runs Tier-1 on every PR. Tier-2 runs on hardware-attached runners on a nightly cadence and as a pre-release gate; results are imported into the same CSV report format as Tier-1.

## Docker Environment (Tier-1)

### Services

| Service | Image / Build | Purpose | Ports |
|---------|--------------|---------|-------|
| `gps-denied-onboard` | local build (`docker/Dockerfile`) | The SUT. Production binary built with `BUILD_VINS_MONO=OFF` per locked sub-decision D-C1-1-SUB-A; research builds run a parallel job with `BUILD_VINS_MONO=ON` | 14550/udp (MAVLink to GCS), 5760/tcp (MSP2 to iNav SITL) |
| `ardupilot-plane-sitl` | `ardupilot/ardupilot-sitl:plane-stable` | ArduPilot Plane SITL. Receives `GPS_INPUT` from the SUT; we read its EKF source-set state to validate AC-4.3, AC-NEW-2, AC-5.x | 14550/udp (MAVLink) |
| `inav-sitl` | `inavflight/inav-sitl:9.0.0` | iNav SITL. Receives `MSP2_SENSOR_GPS` from the SUT; we read its GPS provider state | 5760/tcp (MSP2 over TCP per iNav SITL convention) |
| `mock-suite-sat-service` | local build (`tests/fixtures/mock-suite-sat`) | Stubs the parent-suite Satellite Service tile-publish API (read-only ingest contract for AC-NEW-7 voting layer). Returns deterministic fixture tiles | 8080/tcp |
| `e2e-runner` | local build (`tests/runner`) | Pytest-based harness. Drives all replays, reads FDR output, spins SITL scenarios | none |
| `mavproxy-listener` | `ardupilot/mavproxy:latest` | Passive MAVLink listener that captures the SUT → GCS stream into a per-run `.tlog` for assertions | 14551/udp |

### Networks

| Network | Services | Purpose |
|---------|----------|---------|
| `e2e-net` | all | Isolated test network. No host networking, no internet. Per RESTRICT-SAT-1, the SUT must NEVER reach an external satellite provider during a flight; a deny-all egress rule on `e2e-net` enforces this and is itself a security test (NFT-SEC-02). |

### Volumes

| Volume | Mounted to | Purpose |
|--------|-----------|---------|
| `tile-cache-fixture` | `gps-denied-onboard:/var/azaion/tile-cache:ro` | Pre-built FAISS HNSW index + tile filesystem. Built once per test run by `tests/fixtures/tile-cache-builder/` from the 60 still-image satellite references and the Derkachi route bbox. Read-only mount mirrors AC-8.3 pre-flight load behavior. |
| `fdr-output` | `gps-denied-onboard:/var/azaion/fdr` | Per-flight FDR write target (AC-NEW-3 64 GB cap enforced via Docker `--storage-opt size=64g` on this volume) |
| `input-data` | `e2e-runner:/test-data:ro` | Bind mount of `_docs/00_problem/input_data/` for replay |
| `expected-results` | `e2e-runner:/expected:ro` | Bind mount of `_docs/00_problem/input_data/expected_results/` for assertions |

### docker-compose structure

```yaml
services:
  gps-denied-onboard:
    build:
      context: ../..
      dockerfile: docker/Dockerfile
      args:
        BUILD_VINS_MONO: "OFF"
    networks: [e2e-net]
    volumes:
      - tile-cache-fixture:/var/azaion/tile-cache:ro
      - fdr-output:/var/azaion/fdr
    environment:
      ONBOARD_FC_ADAPTER: ${FC_ADAPTER}      # ardupilot | inav, set per scenario
      ONBOARD_VIO_STRATEGY: ${VIO_STRATEGY}  # okvis2 | klt_ransac (production); vins_mono only in research build
      MAVLINK_SIGNING_PASSKEY_FILE: /run/secrets/mavlink_passkey
    depends_on:
      - mock-suite-sat-service

  ardupilot-plane-sitl:
    image: ardupilot/ardupilot-sitl:plane-stable
    networks: [e2e-net]
    command: ["--vehicle=ArduPlane", "--gps-type=14"]  # GPS_TYPE=14 = MAV per ArduPilot SITL_simulation_parameters.html

  inav-sitl:
    image: inavflight/inav-sitl:9.0.0
    networks: [e2e-net]
    # iNav SITL exposes MSP on TCP 5760 (UART1) per docs/SITL/SITL.md

  mock-suite-sat-service:
    build: ../fixtures/mock-suite-sat
    networks: [e2e-net]
    # Egress restriction enforced at network level, not service level

  e2e-runner:
    build: ../runner
    networks: [e2e-net]
    volumes:
      - input-data:/test-data:ro
      - expected-results:/expected:ro
      - fdr-output:/fdr:ro
    depends_on:
      - gps-denied-onboard
      - ardupilot-plane-sitl
      - inav-sitl
      - mavproxy-listener

  mavproxy-listener:
    image: ardupilot/mavproxy:latest
    networks: [e2e-net]

networks:
  e2e-net:
    driver: bridge
    internal: true  # NO external connectivity (enforces RESTRICT-SAT-1)

volumes:
  tile-cache-fixture: {}
  fdr-output: {}
```

## Consumer Application

**Tech stack**: Python 3.12, pytest 8.x, pymavlink (MAVLink ground side), `msp_gps_toy` (MSP2 ground side, Rust binary called via subprocess), OpenCV ≥4.12.0 (frame source replay), numpy + scipy (geodesic-distance assertions in WGS84).

**Entry point**: `pytest tests/e2e/` from inside `e2e-runner`. Each scenario is a parameterized pytest case keyed by FC adapter (`ardupilot` / `inav`).

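The geodesic-distance assertions mentioned in the tech stack could look roughly like the sketch below. It is illustrative only: it uses the standard-library `math` module and a spherical (haversine) approximation rather than numpy/scipy or a full WGS84 ellipsoid solution, and the names `haversine_m`, `assert_within_m` and the 20 m threshold in the usage note are hypothetical, not taken from the acceptance criteria.

```python
import math

# Mean Earth radius in metres (spherical approximation; an assumption of this
# sketch, the real harness may use an ellipsoidal WGS84 solver instead).
EARTH_MEAN_RADIUS_M = 6_371_008.8


def haversine_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1 = math.radians(lat1_deg)
    phi2 = math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2_deg - lon1_deg)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_MEAN_RADIUS_M * math.asin(math.sqrt(a))


def assert_within_m(estimate, truth, max_error_m):
    """Raise AssertionError if the estimate is farther than max_error_m from truth.

    `estimate` and `truth` are (lat_deg, lon_deg) tuples. Returns the error in
    metres so a test can also record it as evidence.
    """
    error_m = haversine_m(estimate[0], estimate[1], truth[0], truth[1])
    if error_m > max_error_m:
        raise AssertionError(
            f"position error {error_m:.1f} m exceeds limit {max_error_m:.1f} m"
        )
    return error_m
```

A scenario could call `assert_within_m(estimate, truth, 20.0)` for each replayed frame, with the limit taken from whichever AC the test traces to; the 20 m figure here is a placeholder.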
### Communication with system under test

| Interface | Protocol | Endpoint / Topic | Authentication |
|-----------|----------|-----------------|----------------|
| Frame source | V4L2 / GStreamer file source | UNIX domain socket / shared `/test-data` mount | none (local) |
| FC telemetry inbound | MAVLink (AP) or MSP2 (iNav) | `udp:gps-denied-onboard:14550` (AP) or `tcp:gps-denied-onboard:5760` (iNav) | MAVLink 2.0 message signing on AP per D-C8-9 (passkey via Docker secret); iNav unsigned per accepted residual risk |
| Tile cache | Filesystem read | `/var/azaion/tile-cache` (read-only mount) | filesystem perms |
| FC external-pos outbound observation | Read SITL EKF source-set + GLOBAL_POSITION_INT replay back from SITL | `udp:ardupilot-plane-sitl:14550` or `tcp:inav-sitl:5760` | passive listener |
| GCS telemetry observation | MAVLink listener | `udp:mavproxy-listener:14551` (forwarded from SUT 14550) | none |
| FDR output | Filesystem read post-run | `/fdr` (read-only mount) | filesystem perms |
| Suite Sat Service mock | HTTP/JSON | `http://mock-suite-sat-service:8080` | none (test) |

### What the consumer does NOT have access to

- No direct access to the SUT's internal state (GTSAM iSAM2 graph, FAISS index in-memory, OpenCV intermediate buffers, VioStrategy implementation pointer).
- No internal Python/C++ module imports from the SUT.
- No shared memory or filesystem with the SUT outside the four explicit mounts (`tile-cache-fixture` r/o, `fdr-output` r/o from runner side, `input-data` r/o, `expected-results` r/o).
- No bypass of the FC-side acceptance check: every AC-4.3 assertion goes through SITL.

## CI/CD Integration

**When to run**:

- Tier-1 (workstation Docker): on every PR to `dev` branch and nightly on `dev` HEAD.
- Tier-2 (Jetson hardware loop): nightly on `dev`, and as a hard gate before any release tag.
- AC-NEW-5 thermal envelope: quarterly on chamber-attached Jetson runner and before any release tag; failures block release tags only.

**Pipeline stage**:

- Tier-1 fits in the standard CI matrix as a single job (~30-45 min wall-clock for the full suite at first cut).
- Tier-2 is a separate workflow on `self-hosted-jetson-orin` runner.

**Gate behavior**: Tier-1 blocks PR merge on any test failure. Tier-2 blocks release tag on any test failure. Chamber tests are warning-only on PRs and blocking on release tags.

**Timeout**:

- Tier-1: 60 min per matrix entry.
- Tier-2: 4 hr per matrix entry (allows for full Derkachi 8 min replay × ~10 scenarios + cold-boot loops).
- Thermal chamber AC-NEW-5: 9 hr (8 h hot-soak + setup/teardown).

## Reporting

**Format**: CSV (one row per test).

**Columns**: `test_id, test_name, traces_to, fc_adapter, vio_strategy, tier, started_at_utc, execution_time_ms, result, error_message, evidence_paths`

- `traces_to`: comma-separated AC/RESTRICT IDs from the traceability matrix.
- `fc_adapter`: `ardupilot` | `inav` | `n/a`.
- `vio_strategy`: `okvis2` | `klt_ransac` | `vins_mono` | `n/a` (research-build only for `vins_mono`).
- `tier`: `tier1-docker` | `tier2-jetson` | `tier2-chamber`.
- `result`: `PASS` | `FAIL` | `SKIP` | `XFAIL` (XFAIL only allowed for AC explicitly marked NOT COVERED in the traceability matrix and not yet promoted to a real test).
- `evidence_paths`: comma-separated paths inside the run-output bundle (`.tlog` files, FDR archives, screenshots, profiler traces) supporting the verdict.

**Output path**: `e2e-results/run-${RUN_ID}/report.csv` plus a per-run bundle of evidence at `e2e-results/run-${RUN_ID}/evidence/`.

## Test Execution

**Decision (2026-05-09)**: **both** tiers, Tier-1 Docker + Tier-2 Jetson hardware loop. Confirmed at the Hardware-Dependency Assessment Step 4 gate.

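Because both tiers must emit the same report format, the row schema listed under Reporting lends itself to a small shared writer. The following is a minimal sketch, assuming a plain `csv.DictWriter` is acceptable; `validate_row`, `write_report` and every value in `EXAMPLE_ROW` are hypothetical placeholders, while the column names and allowed values are copied from the Reporting section.

```python
import csv
import io

REPORT_COLUMNS = [
    "test_id", "test_name", "traces_to", "fc_adapter", "vio_strategy", "tier",
    "started_at_utc", "execution_time_ms", "result", "error_message",
    "evidence_paths",
]

# Enumerated columns and their allowed values, as listed under Reporting.
ALLOWED_VALUES = {
    "fc_adapter": {"ardupilot", "inav", "n/a"},
    "vio_strategy": {"okvis2", "klt_ransac", "vins_mono", "n/a"},
    "tier": {"tier1-docker", "tier2-jetson", "tier2-chamber"},
    "result": {"PASS", "FAIL", "SKIP", "XFAIL"},
}


def validate_row(row):
    """Return the row unchanged, or raise ValueError if it breaks the schema."""
    missing = [col for col in REPORT_COLUMNS if col not in row]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for column, allowed in ALLOWED_VALUES.items():
        if row[column] not in allowed:
            raise ValueError(f"{column}={row[column]!r} not in {sorted(allowed)}")
    return row


def write_report(rows, stream):
    """Write a header plus one validated CSV row per test to a text stream."""
    writer = csv.DictWriter(stream, fieldnames=REPORT_COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(validate_row(row))


# Placeholder row for illustration only; IDs, paths and timings are invented.
EXAMPLE_ROW = {
    "test_id": "FT-EXAMPLE-01",
    "test_name": "example scenario",
    "traces_to": "AC-4.3,AC-NEW-2",
    "fc_adapter": "ardupilot",
    "vio_strategy": "okvis2",
    "tier": "tier1-docker",
    "started_at_utc": "2026-05-09T00:00:00Z",
    "execution_time_ms": 1234,
    "result": "PASS",
    "error_message": "",
    "evidence_paths": "evidence/example.tlog",
}

if __name__ == "__main__":
    buffer = io.StringIO()
    write_report([EXAMPLE_ROW], buffer)
    print(buffer.getvalue())
```

Comma-separated fields such as `traces_to` are quoted automatically by the `csv` module, so a single cell can carry several AC IDs without breaking the one-row-per-test layout.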
### Hardware dependencies found (Phase 3 → Hardware Assessment scan)

| Category | Indicator | Source file |
|---|---|---|
| GPU / CUDA | TensorRT engines (`.engine`, SM 87, JetPack 6.2, TRT 10.3) | `_docs/01_solution/solution.md` PRE-FLIGHT block |
| GPU / CUDA | DISK+LightGlue FP16 inference | `_docs/01_solution/solution.md` RUNTIME block (C3) |
| GPU / CUDA pin | Jetson Orin Nano Super (67 TOPS sparse INT8, 8 GB shared LPDDR5, 25 W) | `_docs/00_problem/restrictions.md` § Onboard Hardware |
| Sensors / Cameras | ADTi 20MP 20L V1 nadir camera over USB / MIPI-CSI / GigE | `_docs/00_problem/restrictions.md` § Cameras |
| Sensors / Cameras | V4L2 / GStreamer frame source (production) | `_docs/02_document/tests/environment.md` § Overview |
| OS-specific services | High-rate IMU via UART/MAVLink to FC | `_docs/00_problem/restrictions.md` § Sensors & Integration |
| OS-specific services | Per-FC inbound (MAVLink GPS_INPUT for AP, MSP2 over UART for iNav) | `_docs/00_problem/restrictions.md` § Sensors & Integration |
| OS-specific services | tegrastats / jetson_stats for thermal telemetry | `_docs/02_document/tests/resource-limit-tests.md` NFT-LIM-04 |
| Thermal envelope | -20 °C to +50 °C operating envelope, 25 W TDP, 8 h duty cycle | `_docs/00_problem/restrictions.md` § Failsafe & Safety + AC-NEW-5 |

(Step 2 Code scan returned zero indicators because no source code exists yet; this is the planning phase. Decompose → Implement will produce `requirements.txt` / `pyproject.toml` / Cargo.toml entries that confirm: `tensorrt`, `pycuda`, `pymavlink`, `gtsam`, `faiss-gpu`, `opencv-python>=4.12.0`, `jetson-stats`.)

### Execution instructions: Tier-1 (Docker)

**Prerequisites**:

- Docker 24+ with Compose v2.
- NVIDIA Container Toolkit if the workstation has an NVIDIA dGPU (lets the SUT exercise the TensorRT path; otherwise falls back to CPU TensorRT).
- ≥16 GB host RAM, ≥80 GB free disk for `tile-cache-fixture` + `fdr-output` + image build cache.

**How to start**:

```bash
cd e2e/docker
export FC_ADAPTER=ardupilot   # or: inav (parameterized per scenario in CI)
export VIO_STRATEGY=okvis2    # or: klt_ransac (production binary)
docker compose -f docker-compose.test.yml up --build --abort-on-container-exit e2e-runner
```

The run reports to `./e2e-results/run-${RUN_ID}/report.csv` (see § Reporting). Exit code matches the test verdict.

**Environment variables**:

- `FC_ADAPTER` ∈ `{ardupilot, inav}`: selects which SITL the SUT talks to.
- `VIO_STRATEGY` ∈ `{okvis2, klt_ransac}` for production binary; `vins_mono` only when the research binary `BUILD_VINS_MONO=ON` is the build.
- `MAVLINK_SIGNING_PASSKEY_FILE`: path to the Docker secret loaded with the test passkey for FT-P-09-AP / NFT-SEC-03.

**Skipped on Tier-1**: `NFT-PERF-01` (AC-4.1 latency p95, Jetson-bound), `NFT-LIM-01` (AC-4.2 memory, Jetson-bound), `NFT-PERF-03` (AC-NEW-1 cold-start, Jetson-bound), `NFT-LIM-04` (AC-NEW-5 chamber baseline, Jetson-bound), AC-NEW-5 chamber portion (chamber-bound).

### Execution instructions: Tier-2 (Jetson hardware loop)

**Prerequisites**:

- Jetson Orin Nano Super (per `restrictions.md` § Onboard Hardware).
- JetPack 6.2 + CUDA + TensorRT 10.3 + cuDNN per D-C7-9.
- Workstation thermal-day environment for NFT-LIM-04 baseline. Chamber-attached runner for AC-NEW-5 chamber portion (separate quarterly job; not run in standard CI).
- ArduPilot Plane SITL + iNav SITL run on the same Jetson, OR on a paired x86 host on the same network; both are supported.
- Real ADTi 20MP 20L V1 camera connected via USB/MIPI-CSI/GigE; OR file-replay source if camera unavailable (in which case all `AC-2.x` cross-validation is `XFAIL` for that run).

**How to start**:

```bash
cd e2e/jetson
sudo systemctl restart gps-denied-onboard.service
./run-tier2.sh --fc-adapter ardupilot --vio-strategy okvis2 --duration 8h
# or: ./run-tier2.sh --fc-adapter inav --vio-strategy klt_ransac --duration 5min
```

Outputs the same CSV format as Tier-1 (one report.csv per run).

**Environment variables**: same as Tier-1 plus:

- `TIER2_CHAMBER_AMBIENT_C`: ambient temperature for AC-NEW-5 chamber runs.
- `TIER2_CAMERA_DEVICE`: `/dev/video0` (production) or file path for replay mode.

### CI runner mapping

- `ubuntu-24.04` (GitHub-hosted) → Tier-1 Docker, every PR + nightly. ~30-45 min per matrix entry.
- `self-hosted-jetson-orin` → Tier-2 Jetson, nightly on `dev` HEAD + pre-release gate. ~4 hr per matrix entry.
- `self-hosted-jetson-orin-chamber` → AC-NEW-5 hot-soak. Quarterly + before any release tag. ~9 hr.

**Matrix dimensions**: `FC_ADAPTER × VIO_STRATEGY × build_kind` where `build_kind ∈ {production, research}`. Production `vins_mono` is excluded (D-C1-1-SUB-A locked); research includes all three VioStrategy values.
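
The matrix rule above (full cross product, minus `vins_mono` on production builds) can be enumerated with a few lines of Python. This is only a sketch for checking the expected number of matrix entries; the function name `ci_matrix` and the dictionary keys are hypothetical, and the real CI configuration may express the exclusion differently.

```python
from itertools import product

FC_ADAPTERS = ("ardupilot", "inav")
VIO_STRATEGIES = ("okvis2", "klt_ransac", "vins_mono")
BUILD_KINDS = ("production", "research")


def ci_matrix():
    """Enumerate FC_ADAPTER x VIO_STRATEGY x build_kind entries.

    Production builds never include vins_mono (D-C1-1-SUB-A); research
    builds keep all three VioStrategy values.
    """
    return [
        {"fc_adapter": fc, "vio_strategy": vio, "build_kind": build}
        for fc, vio, build in product(FC_ADAPTERS, VIO_STRATEGIES, BUILD_KINDS)
        if not (build == "production" and vio == "vins_mono")
    ]


if __name__ == "__main__":
    for entry in ci_matrix():
        print(entry)
```

Under these assumptions the full product has 2 × 3 × 2 = 12 combinations, and removing production `vins_mono` for both FC adapters leaves 10 matrix entries.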