mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 15:01:13 +00:00
c19c76481c
Enhanced the SKILL.md file to enforce conciseness rules for the state file, specifying acceptable content and file size limits. Updated the autodev state to reflect the transition to the planning phase, including changes to the current step and sub-step details. Revised acceptance criteria to clarify validation requirements and external dependencies, ensuring alignment with the latest research findings. Added a new overlay for Mode B revisions to track changes and decisions made during the assessment process.
249 lines
16 KiB
Markdown
# Test Environment

## Overview

**System under test (SUT)**: `gps-denied-onboard` companion-PC service that produces WGS84 position estimates from nav-camera frames + FC IMU/attitude and emits them to the FC over its native external-positioning interface. Public boundaries (the only surfaces tests interact with):

- **Inbound — nav-camera frames**: V4L2 / GStreamer source (production: USB / MIPI-CSI / GigE per `restrictions.md`; tests: file-backed source replaying `_docs/00_problem/input_data/AD0000NN.jpg` or `flight_derkachi/flight_derkachi.mp4`).
- **Inbound — FC telemetry**: MAVLink (ArduPilot) or MSP2 (iNav) inbound stream carrying `SCALED_IMU2`, `ATTITUDE`, `GLOBAL_POSITION_INT` (or MSP equivalents). Tests replay `flight_derkachi/data_imu.csv` through a thin replayer.
- **Inbound — satellite tile cache**: filesystem + on-disk index (FAISS HNSW + tile manifest). Tests load a fixture cache mounted as a Docker volume.
- **Outbound — FC external-positioning**: MAVLink `GPS_INPUT` (ArduPilot Plane) OR MSP2 `MSP2_SENSOR_GPS` (iNav). Tests observe these by spinning up the corresponding open-source SITL and reading what reaches the FC.
- **Outbound — GCS telemetry**: MAVLink to QGroundControl (1-2 Hz downsample of estimates + STATUSTEXT). Tests subscribe via a passive MAVLink listener.
- **Outbound — Flight Data Recorder**: NVM filesystem (per AC-NEW-3). Tests read the resulting FDR archive after the run.

**Consumer app purpose**: The e2e harness drives the SUT through these public boundaries — replaying frames + telemetry, mounting tile-cache fixtures, observing FC-side acceptance via SITL, and parsing FDR output. It NEVER imports SUT modules, NEVER queries SUT internal state, and NEVER touches the SUT's filesystem outside the FDR output directory.

## Two-tier execution profile

This project requires two distinct test environments because the production target is Jetson hardware, and AC-4.1 / AC-4.2 / AC-NEW-1 / AC-NEW-5 cannot be honestly validated on a generic x86 dev workstation.

| Tier | Hardware | What it covers | What it skips |
|------|----------|----------------|---------------|
| **Tier-1 (workstation Docker)** | x86 dev workstation, optional NVIDIA dGPU for TensorRT validation | All `FT-*` correctness, schema, `NFT-RES-*` resilience scenarios, `NFT-SEC-*` security scenarios, `NFT-LIM-*` storage budgets | Any AC whose pass criterion is bound to Jetson Orin Nano Super wall-clock latency or thermal envelope: AC-4.1 / AC-4.2 / AC-NEW-1 / AC-NEW-5 |
| **Tier-2 (Jetson hardware loop)** | Jetson Orin Nano Super (pinned hardware per `restrictions.md`), thermal chamber for AC-NEW-5 | AC-4.1 latency p95, AC-4.2 memory, AC-NEW-1 cold-start TTFF, AC-NEW-5 thermal envelope (chamber-only) | Iteration speed (manual hardware time) |

CI runs Tier-1 on every PR. Tier-2 runs on hardware-attached runners nightly and as a pre-release gate; results are imported into the same CSV report format as Tier-1.

## Docker Environment (Tier-1)

### Services

| Service | Image / Build | Purpose | Ports |
|---------|--------------|---------|-------|
| `gps-denied-onboard` | local build (`docker/Dockerfile`) | The SUT. Production binary built with `BUILD_VINS_MONO=OFF` per locked sub-decision D-C1-1-SUB-A; research builds run a parallel job with `BUILD_VINS_MONO=ON` | 14550/udp (MAVLink to GCS), 5760/tcp (MSP2 to iNav SITL) |
| `ardupilot-plane-sitl` | `ardupilot/ardupilot-sitl:plane-stable` | ArduPilot Plane SITL. Receives `GPS_INPUT` from the SUT; we read its EKF source-set state to validate AC-4.3, AC-NEW-2, AC-5.x | 14550/udp (MAVLink) |
| `inav-sitl` | `inavflight/inav-sitl:9.0.0` | iNav SITL. Receives `MSP2_SENSOR_GPS` from the SUT; we read its GPS provider state | 5760/tcp (MSP2 over TCP per iNav SITL convention) |
| `mock-suite-sat-service` | local build (`tests/fixtures/mock-suite-sat`) | Stubs the parent-suite Satellite Service tile-publish API (read-only ingest contract for AC-NEW-7 voting layer). Returns deterministic fixture tiles | 8080/tcp |
| `e2e-runner` | local build (`tests/runner`) | Pytest-based harness. Drives all replays, reads FDR output, spins SITL scenarios | — |
| `mavproxy-listener` | `ardupilot/mavproxy:latest` | Passive MAVLink listener that captures the SUT → GCS stream into a per-run `.tlog` for assertions | 14551/udp |

### Networks

| Network | Services | Purpose |
|---------|----------|---------|
| `e2e-net` | all | Isolated test network. No host networking, no internet. Per RESTRICT-SAT-1, the SUT must NEVER reach an external satellite provider during a flight; a deny-all egress rule on `e2e-net` enforces this and is itself a security test (NFT-SEC-02). |

### Volumes

| Volume | Mounted to | Purpose |
|--------|-----------|---------|
| `tile-cache-fixture` | `gps-denied-onboard:/var/azaion/tile-cache:ro` | Pre-built FAISS HNSW index + tile filesystem. Built once per test run by `tests/fixtures/tile-cache-builder/` from the 60 still-image satellite references and the Derkachi route bbox. Read-only mount mirrors AC-8.3 pre-flight load behavior. |
| `fdr-output` | `gps-denied-onboard:/var/azaion/fdr` | Per-flight FDR write target (AC-NEW-3 64 GB cap enforced via Docker `--storage-opt size=64g` on this volume) |
| `input-data` | `e2e-runner:/test-data:ro` | Bind mount of `_docs/00_problem/input_data/` for replay |
| `expected-results` | `e2e-runner:/expected:ro` | Bind mount of `_docs/00_problem/input_data/expected_results/` for assertions |

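
The 64 GB cap on `fdr-output` is enforced at the volume level, but the harness could also check the archive size after a run through its read-only `/fdr` mount. The following is a minimal sketch under stated assumptions: the helper names `dir_size_bytes` and `assert_fdr_within_cap` are hypothetical, and the cap is interpreted as 64 GiB, which may not match how AC-NEW-3 defines it.

```python
import os

# Assumption: "64 GB" is treated as 64 GiB here; adjust if AC-NEW-3 means decimal GB.
FDR_CAP_BYTES = 64 * 1024**3


def dir_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def assert_fdr_within_cap(root, cap_bytes=FDR_CAP_BYTES):
    """Fail if the FDR output directory exceeds the storage cap; return bytes used."""
    used = dir_size_bytes(root)
    assert used <= cap_bytes, f"FDR uses {used} bytes, cap is {cap_bytes} bytes"
    return used
```

A post-run check would then be a single call such as `assert_fdr_within_cap("/fdr")` from inside `e2e-runner`.
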
### docker-compose structure

```yaml
services:
  gps-denied-onboard:
    build:
      context: ../..
      dockerfile: docker/Dockerfile
      args:
        BUILD_VINS_MONO: "OFF"
    networks: [e2e-net]
    volumes:
      - tile-cache-fixture:/var/azaion/tile-cache:ro
      - fdr-output:/var/azaion/fdr
    environment:
      ONBOARD_FC_ADAPTER: ${FC_ADAPTER}        # ardupilot | inav, set per scenario
      ONBOARD_VIO_STRATEGY: ${VIO_STRATEGY}    # okvis2 | klt_ransac (production); vins_mono only in research build
      MAVLINK_SIGNING_PASSKEY_FILE: /run/secrets/mavlink_passkey
    depends_on:
      - mock-suite-sat-service

  ardupilot-plane-sitl:
    image: ardupilot/ardupilot-sitl:plane-stable
    networks: [e2e-net]
    command: ["--vehicle=ArduPlane", "--gps-type=14"]  # GPS_TYPE=14 = MAV per ArduPilot SITL_simulation_parameters.html

  inav-sitl:
    image: inavflight/inav-sitl:9.0.0
    networks: [e2e-net]
    # iNav SITL exposes MSP on TCP 5760 (UART1) per docs/SITL/SITL.md

  mock-suite-sat-service:
    build: ../fixtures/mock-suite-sat
    networks: [e2e-net]
    # Egress restriction enforced at network level, not service level

  e2e-runner:
    build: ../runner
    networks: [e2e-net]
    volumes:
      # Bind mounts per the Volumes table; paths assume the repo root is ../.. (same as the SUT build context)
      - ../../_docs/00_problem/input_data:/test-data:ro
      - ../../_docs/00_problem/input_data/expected_results:/expected:ro
      - fdr-output:/fdr:ro
    depends_on:
      - gps-denied-onboard
      - ardupilot-plane-sitl
      - inav-sitl
      - mavproxy-listener

  mavproxy-listener:
    image: ardupilot/mavproxy:latest
    networks: [e2e-net]

networks:
  e2e-net:
    driver: bridge
    internal: true  # NO external connectivity (enforces RESTRICT-SAT-1)

volumes:
  tile-cache-fixture: {}
  fdr-output: {}
```

## Consumer Application

**Tech stack**: Python 3.12, pytest 8.x, pymavlink (MAVLink ground side), `msp_gps_toy` (MSP2 ground side, Rust binary called via subprocess), OpenCV ≥4.12.0 (frame source replay), numpy + scipy (geodesic-distance assertions in WGS84).

**Entry point**: `pytest tests/e2e/` from inside `e2e-runner`. Each scenario is a parameterized pytest case keyed by FC adapter (`ardupilot` / `inav`).

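
The geodesic-distance assertions mentioned in the tech stack could look roughly like the sketch below. It uses the haversine formula on a spherical Earth, which only approximates true WGS84 ellipsoidal distance, and it uses the standard library instead of numpy/scipy to stay self-contained. The names `haversine_m` and `assert_within`, and any tolerance passed in, are hypothetical and not taken from the harness.

```python
import math

# Mean Earth radius in metres: a spherical approximation, not the WGS84 ellipsoid.
EARTH_MEAN_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_MEAN_RADIUS_M * math.asin(math.sqrt(a))


def assert_within(estimate, truth, tolerance_m):
    """Assert that an estimated (lat, lon) lies within tolerance_m of the truth."""
    err = haversine_m(*estimate, *truth)
    assert err <= tolerance_m, f"position error {err:.1f} m exceeds {tolerance_m} m"
    return err
```

For the short distances involved in per-frame position checks, the spherical error is small compared with typical tolerances, but an ellipsoidal solver would be the safer choice if the acceptance thresholds are tight.
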
### Communication with system under test

| Interface | Protocol | Endpoint / Topic | Authentication |
|-----------|----------|-----------------|----------------|
| Frame source | V4L2 / GStreamer file source | UNIX domain socket / shared `/test-data` mount | none (local) |
| FC telemetry inbound | MAVLink (AP) or MSP2 (iNav) | `udp:gps-denied-onboard:14550` (AP) or `tcp:gps-denied-onboard:5760` (iNav) | MAVLink 2.0 message signing on AP per D-C8-9 (passkey via Docker secret); iNav unsigned per accepted residual risk |
| Tile cache | Filesystem read | `/var/azaion/tile-cache` (read-only mount) | filesystem perms |
| FC external-pos outbound observation | Read SITL EKF source-set state + `GLOBAL_POSITION_INT` reported back by SITL | `udp:ardupilot-plane-sitl:14550` or `tcp:inav-sitl:5760` | passive listener |
| GCS telemetry observation | MAVLink listener | `udp:mavproxy-listener:14551` (forwarded from SUT 14550) | none |
| FDR output | Filesystem read post-run | `/fdr` (read-only mount) | filesystem perms |
| Suite Sat Service mock | HTTP/JSON | `http://mock-suite-sat-service:8080` | none (test) |

### What the consumer does NOT have access to

- No direct access to the SUT's internal state (GTSAM iSAM2 graph, FAISS index in-memory, OpenCV intermediate buffers, VioStrategy implementation pointer).
- No internal Python/C++ module imports from the SUT.
- No shared memory or filesystem with the SUT outside the four explicit mounts (`tile-cache-fixture` r/o, `fdr-output` r/o from runner side, `input-data` r/o, `expected-results` r/o).
- No bypass of the FC-side acceptance check — every AC-4.3 assertion goes through SITL.

## CI/CD Integration

**When to run**:
- Tier-1 (workstation Docker): on every PR to `dev` branch and nightly on `dev` HEAD.
- Tier-2 (Jetson hardware loop): nightly on `dev`, and as a hard gate before any release tag.
- AC-NEW-5 thermal envelope: quarterly on the chamber-attached Jetson runner and before any release tag; failures block release tags only.

**Pipeline stage**:
- Tier-1 fits in the standard CI matrix as a single job (~30-45 min wall-clock for the full suite at first cut).
- Tier-2 is a separate workflow on `self-hosted-jetson-orin` runner.

**Gate behavior**: Tier-1 blocks PR merge on any test failure. Tier-2 blocks release tag on any test failure. Chamber tests are warning-only on PRs and blocking on release tags.

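
The gate rules above can be written down as a small decision function. This is only an illustrative sketch: the function name `gate_outcome` and the event labels `pull_request` and `release_tag` are hypothetical, and tier/event combinations that the rules do not mention are rejected instead of guessed.

```python
# (tier, event) pairs for which any test failure blocks the pipeline.
BLOCKING = {
    ("tier1-docker", "pull_request"),
    ("tier2-jetson", "release_tag"),
    ("tier2-chamber", "release_tag"),
}
# (tier, event) pairs for which a failure is reported but does not block.
WARN_ONLY = {("tier2-chamber", "pull_request")}


def gate_outcome(tier, event, failed):
    """Return 'pass', 'warn', or 'block' for a finished run."""
    if not failed:
        return "pass"
    if (tier, event) in BLOCKING:
        return "block"
    if (tier, event) in WARN_ONLY:
        return "warn"
    # The gate rules do not cover this combination, so refuse to invent a policy.
    raise ValueError(f"gate behavior not specified for {tier!r} on {event!r}")
```

Keeping the policy in two explicit sets makes it easy to review against the prose and to extend once the remaining combinations are decided.
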
**Timeout**:
- Tier-1: 60 min per matrix entry.
- Tier-2: 4 hr per matrix entry (allows for full Derkachi 8 min replay × ~10 scenarios + cold-boot loops).
- Thermal chamber AC-NEW-5: 9 hr (8 h hot-soak + setup/teardown).

## Reporting

**Format**: CSV (one row per test).

**Columns**: `test_id, test_name, traces_to, fc_adapter, vio_strategy, tier, started_at_utc, execution_time_ms, result, error_message, evidence_paths`

- `traces_to`: comma-separated AC/RESTRICT IDs from the traceability matrix.
- `fc_adapter`: `ardupilot` | `inav` | `n/a`.
- `vio_strategy`: `okvis2` | `klt_ransac` | `vins_mono` | `n/a` (research-build only for `vins_mono`).
- `tier`: `tier1-docker` | `tier2-jetson` | `tier2-chamber`.
- `result`: `PASS` | `FAIL` | `SKIP` | `XFAIL` (XFAIL only allowed for AC explicitly marked NOT COVERED in the traceability matrix and not yet promoted to a real test).
- `evidence_paths`: comma-separated paths inside the run-output bundle (`.tlog` files, FDR archives, screenshots, profiler traces) supporting the verdict.

**Output path**: `e2e-results/run-${RUN_ID}/report.csv` plus a per-run bundle of evidence at `e2e-results/run-${RUN_ID}/evidence/`.

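
A report writer for this schema might look like the following sketch. The column list and the allowed values come from the definitions above; the function name `write_report` and the validation behavior are assumptions, not the harness's actual implementation. Python's `csv` module quotes fields that contain commas, so comma-separated `traces_to` and `evidence_paths` values survive a round trip.

```python
import csv
import io

COLUMNS = [
    "test_id", "test_name", "traces_to", "fc_adapter", "vio_strategy", "tier",
    "started_at_utc", "execution_time_ms", "result", "error_message", "evidence_paths",
]

# Enumerated values taken from the column definitions above.
ALLOWED = {
    "fc_adapter": {"ardupilot", "inav", "n/a"},
    "vio_strategy": {"okvis2", "klt_ransac", "vins_mono", "n/a"},
    "tier": {"tier1-docker", "tier2-jetson", "tier2-chamber"},
    "result": {"PASS", "FAIL", "SKIP", "XFAIL"},
}


def write_report(rows, stream):
    """Validate each row dict against the schema and write it as CSV to stream."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        missing = [col for col in COLUMNS if col not in row]
        if missing:
            raise ValueError(f"row is missing columns: {missing}")
        for field, allowed in ALLOWED.items():
            if row[field] not in allowed:
                raise ValueError(f"{field}={row[field]!r} is not one of {sorted(allowed)}")
        writer.writerow(row)
```

Rejecting unknown enum values at write time keeps Tier-1 and Tier-2 reports mergeable, since both tiers are expected to produce the same format.
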
## Test Execution

**Decision (2026-05-09)**: **both** — Tier-1 Docker + Tier-2 Jetson hardware loop. Confirmed at the Hardware-Dependency Assessment Step 4 gate.

### Hardware dependencies found (Phase 3 → Hardware Assessment scan)

| Category | Indicator | Source file |
|---|---|---|
| GPU / CUDA | TensorRT engines (`.engine`, SM 87, JetPack 6.2, TRT 10.3) | `_docs/01_solution/solution.md` PRE-FLIGHT block |
| GPU / CUDA | DISK+LightGlue FP16 inference | `_docs/01_solution/solution.md` RUNTIME block (C3) |
| GPU / CUDA pin | Jetson Orin Nano Super (67 TOPS sparse INT8, 8 GB shared LPDDR5, 25 W) | `_docs/00_problem/restrictions.md` § Onboard Hardware |
| Sensors / Cameras | ADTi 20MP 20L V1 nadir camera over USB / MIPI-CSI / GigE | `_docs/00_problem/restrictions.md` § Cameras |
| Sensors / Cameras | V4L2 / GStreamer frame source (production) | `_docs/02_document/tests/environment.md` § Overview |
| OS-specific services | High-rate IMU via UART/MAVLink to FC | `_docs/00_problem/restrictions.md` § Sensors & Integration |
| OS-specific services | Per-FC inbound (MAVLink GPS_INPUT for AP, MSP2 over UART for iNav) | `_docs/00_problem/restrictions.md` § Sensors & Integration |
| OS-specific services | tegrastats / jetson_stats for thermal telemetry | `_docs/02_document/tests/resource-limit-tests.md` NFT-LIM-04 |
| Thermal envelope | -20 °C to +50 °C operating envelope, 25 W TDP, 8 h duty cycle | `_docs/00_problem/restrictions.md` § Failsafe & Safety + AC-NEW-5 |

(Step 2 Code scan returned zero indicators because no source code exists yet — this is the planning phase. Decompose → Implement will produce `requirements.txt` / `pyproject.toml` / `Cargo.toml` entries that confirm: `tensorrt`, `pycuda`, `pymavlink`, `gtsam`, `faiss-gpu`, `opencv-python>=4.12.0`, `jetson-stats`.)

### Execution instructions — Tier-1 (Docker)

**Prerequisites**:
- Docker 24+ with Compose v2.
- NVIDIA Container Toolkit if the workstation has an NVIDIA dGPU (lets the SUT exercise the TensorRT path; otherwise the SUT falls back to a CPU inference path, since TensorRT itself needs an NVIDIA GPU).
- ≥16 GB host RAM, ≥80 GB free disk for `tile-cache-fixture` + `fdr-output` + image build cache.

**How to start**:
```bash
cd e2e/docker
export FC_ADAPTER=ardupilot      # or: inav (parameterized per scenario in CI)
export VIO_STRATEGY=okvis2       # or: klt_ransac (production binary)
docker compose -f docker-compose.test.yml up --build --abort-on-container-exit e2e-runner
```
The run writes its report to `./e2e-results/run-${RUN_ID}/report.csv` (see § Reporting). The exit code matches the test verdict.

**Environment variables**:
- `FC_ADAPTER` ∈ `{ardupilot, inav}` — selects which SITL the SUT talks to.
- `VIO_STRATEGY` ∈ `{okvis2, klt_ransac}` for the production binary; `vins_mono` only when the research binary (`BUILD_VINS_MONO=ON`) is built.
- `MAVLINK_SIGNING_PASSKEY_FILE` — path to the Docker secret loaded with the test passkey for FT-P-09-AP / NFT-SEC-03.

**Skipped on Tier-1**: `NFT-PERF-01` (AC-4.1 latency p95 — Jetson-bound), `NFT-LIM-01` (AC-4.2 memory — Jetson-bound), `NFT-PERF-03` (AC-NEW-1 cold-start — Jetson-bound), `NFT-LIM-04` (AC-NEW-5 chamber baseline — Jetson-bound), AC-NEW-5 chamber portion (chamber-bound).

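
One way the harness could encode this skip list is a lookup keyed by test ID, as in the sketch below. The IDs and reasons are taken from the list above; the names `TIER1_SKIPPED` and `skip_reason` are hypothetical. The AC-NEW-5 chamber portion has no test ID in this document, so it is left out instead of being given an invented one.

```python
# Tests whose pass criteria are bound to Jetson hardware and cannot run on Tier-1.
TIER1_SKIPPED = {
    "NFT-PERF-01": "AC-4.1 latency p95 is Jetson-bound",
    "NFT-LIM-01": "AC-4.2 memory is Jetson-bound",
    "NFT-PERF-03": "AC-NEW-1 cold-start is Jetson-bound",
    "NFT-LIM-04": "AC-NEW-5 chamber baseline is Jetson-bound",
}


def skip_reason(test_id, tier):
    """Return a skip reason for test_id on the given tier, or None if it should run."""
    if tier == "tier1-docker":
        return TIER1_SKIPPED.get(test_id)
    return None
```

A pytest fixture or collection hook could call `skip_reason` and pass a non-empty result to `pytest.skip`, so the skipped tests still appear in `report.csv` with `result` set to `SKIP`.
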
### Execution instructions — Tier-2 (Jetson hardware loop)

**Prerequisites**:
- Jetson Orin Nano Super (per `restrictions.md` § Onboard Hardware).
- JetPack 6.2 + CUDA + TensorRT 10.3 + cuDNN per D-C7-9.
- Workstation thermal-day environment for NFT-LIM-04 baseline. Chamber-attached runner for AC-NEW-5 chamber portion (separate quarterly job; not run in standard CI).
- ArduPilot Plane SITL + iNav SITL run on the same Jetson, OR on a paired x86 host on the same network — both are supported.
- Real ADTi 20MP 20L V1 camera connected via USB/MIPI-CSI/GigE; OR file-replay source if camera unavailable (in which case all `AC-2.x` cross-validation is `XFAIL` for that run).

**How to start**:
```bash
cd e2e/jetson
sudo systemctl restart gps-denied-onboard.service
./run-tier2.sh --fc-adapter ardupilot --vio-strategy okvis2 --duration 8h
# or:
./run-tier2.sh --fc-adapter inav --vio-strategy klt_ransac --duration 5min
```
The run outputs the same CSV format as Tier-1 (one `report.csv` per run).

**Environment variables**: same as Tier-1 plus:
- `TIER2_CHAMBER_AMBIENT_C` — ambient temperature for AC-NEW-5 chamber runs.
- `TIER2_CAMERA_DEVICE` — `/dev/video0` (production) or file path for replay mode.

### CI runner mapping

- `ubuntu-24.04` (GitHub-hosted) → Tier-1 Docker, every PR + nightly. ~30-45 min per matrix entry.
- `self-hosted-jetson-orin` → Tier-2 Jetson, nightly on `dev` HEAD + pre-release gate. ~4 hr per matrix entry.
- `self-hosted-jetson-orin-chamber` → AC-NEW-5 hot-soak. Quarterly + before any release tag. ~9 hr.

**Matrix dimensions**: `FC_ADAPTER × VIO_STRATEGY × build_kind` where `build_kind ∈ {production, research}`. Production `vins_mono` is excluded (D-C1-1-SUB-A locked); research includes all three VioStrategy values.

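
The matrix described above can be enumerated with a few lines of Python. This is a sketch of the stated rule only: the function name `ci_matrix` and the dictionary keys are hypothetical, and how the real CI workflow expresses the matrix is not specified here.

```python
from itertools import product

FC_ADAPTERS = ("ardupilot", "inav")
VIO_STRATEGIES = ("okvis2", "klt_ransac", "vins_mono")
BUILD_KINDS = ("production", "research")


def ci_matrix():
    """Enumerate FC_ADAPTER x VIO_STRATEGY x build_kind, minus production vins_mono."""
    return [
        {"fc_adapter": fc, "vio_strategy": vio, "build_kind": build}
        for fc, vio, build in product(FC_ADAPTERS, VIO_STRATEGIES, BUILD_KINDS)
        # D-C1-1-SUB-A: vins_mono is never part of the production build.
        if not (build == "production" and vio == "vins_mono")
    ]
```

Under these assumptions the full cross product has 12 entries and the exclusion removes 2, leaving 10 matrix entries: 4 production and 6 research.
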