Test Environment
Overview
System under test: Onboard GPS-denied localization service. Public interfaces are navigation-camera frame input, flight-controller telemetry input, offline satellite-cache input, GPS_INPUT MAVLink output, QGroundControl status output, and flight-data-recorder output.
Consumer app purpose: A black-box replay harness that feeds image frames, telemetry traces, cache manifests, and fault triggers into the service, then validates emitted coordinates, confidence fields, telemetry, and logs without importing internal modules.
Execution Environments
| Environment | Purpose | Required for |
|---|---|---|
| Local replay workstation | Fast still-image and dataset replay validation | Frame-center geolocation, Satellite Service local retrieval, stale-tile rejection |
| Jetson Orin Nano Super | Production-like latency, memory, thermal, and TensorRT/ONNX profiling | AC-4.1, AC-4.2, AC-NEW-1, AC-NEW-5 |
| ArduPilot Plane SITL + QGroundControl | MAVLink GPS_INPUT, spoofing, failsafe, and GCS status validation | AC-4.3, AC-5.2, AC-NEW-2, AC-NEW-8 |
| Representative flight/replay rig | Final acceptance evidence with synchronized nav camera, FC IMU/attitude/airspeed/altitude, MAVLink logs, and ground truth | Final AC signoff |
Docker / Compose Structure
| Service | Image / Build | Purpose | Ports |
|---|---|---|---|
| gps-denied-service | Project build image for JetPack-compatible target or replay-compatible host | System under test | MAVLink UDP/TCP and health/status endpoints TBD |
| replay-consumer | Python replay/test harness | Feeds images, telemetry, cache data, and fault triggers | none |
| satellite-cache-stub | Local COG/manifest/descriptor fixture volume | Provides offline tile cache and signed/unsigned manifests | none |
| ardupilot-plane-sitl | ArduPilot Plane SITL image or local process | Validates GPS_INPUT, spoofing/failsafe behavior | MAVLink SITL ports |
| qgc-observer | QGC/tlog-compatible observer or MAVLink log parser | Verifies GCS-visible status output | none |
Networks
| Network | Services | Purpose |
|---|---|---|
| replay-net | gps-denied-service, replay-consumer, satellite-cache-stub | Offline replay and black-box validation |
| sitl-net | gps-denied-service, ardupilot-plane-sitl, qgc-observer | MAVLink integration and failsafe validation |
Volumes
| Volume | Mounted to | Purpose |
|---|---|---|
| input-data | /data/input | _docs/00_problem/input_data/ and public dataset slices |
| expected-results | /data/expected | _docs/00_problem/input_data/expected_results/ |
| derkachi-replay | /data/input/flight_derkachi | Cropped nadir MP4 plus synchronized IMU and GLOBAL_POSITION_INT trajectory |
| satellite-cache | /cache/satellite | COG tiles, manifests, descriptor index fixtures |
| fdr-output | /fdr | Flight-data-recorder outputs for validation |
Consumer Application
Tech stack: Python replay harness with pytest-style assertions and MAVLink log parsing.
Entry point: run-blackbox-replay command to be created during implementation; this planning artifact defines required behavior, not code.
Communication With System Under Test
| Interface | Protocol | Endpoint / Topic | Authentication |
|---|---|---|---|
| Navigation frames | File/stream replay | Ordered image frames with timestamps | Local fixture access |
| FC telemetry | MAVLink replay or generated stream | IMU, attitude, airspeed, altitude, GPS health | Local MAVLink link |
| Satellite cache | Local filesystem contract | COG + manifest + descriptors | Signed manifest validation |
| GPS output | MAVLink | GPS_INPUT to ArduPilot Plane | MAVLink source/system ID allowlist |
| Status output | MAVLink/QGC | STATUSTEXT / status summary | MAVLink source/system ID allowlist |
| FDR | Filesystem output | Per-flight segmented logs | Local fixture access |
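The source/system ID allowlist named in the table can be exercised by the harness without any MAVLink library. The following is a minimal sketch, assuming messages have already been parsed into `(system_id, component_id, message_type)` tuples. The `MavSource` type, the `ALLOWED_SOURCES` values, the inclusion of a component ID, and the function names are all hypothetical and are not defined by this plan.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Message = Tuple[int, int, str]  # (system_id, component_id, message_type)


@dataclass(frozen=True)
class MavSource:
    """Identity of a MAVLink sender (hypothetical helper type)."""
    system_id: int
    component_id: int


# Hypothetical allowlist; the real IDs are a deployment decision.
ALLOWED_SOURCES = frozenset({MavSource(system_id=42, component_id=191)})


def is_allowed(system_id: int, component_id: int,
               allowlist: frozenset = ALLOWED_SOURCES) -> bool:
    """Return True when the sender identity is on the allowlist."""
    return MavSource(system_id, component_id) in allowlist


def split_by_allowlist(messages: Iterable[Message],
                       allowlist: frozenset = ALLOWED_SOURCES
                       ) -> Tuple[List[Message], List[Message]]:
    """Partition parsed messages into (accepted, rejected) lists."""
    accepted: List[Message] = []
    rejected: List[Message] = []
    for msg in messages:
        sysid, compid, _msg_type = msg
        target = accepted if is_allowed(sysid, compid, allowlist) else rejected
        target.append(msg)
    return accepted, rejected
```

A spoofing scenario could then assert that a GPS_INPUT message from an unlisted sender lands in the rejected list, while one from the expected sender is accepted.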
What The Consumer Does Not Access
- No internal estimator modules.
- No direct BASALT/OpenVINS/Kimera APIs.
- No direct mutation of internal state.
- No bypass of public cache, MAVLink, replay, or FDR interfaces.
CI/CD Integration
| Suite | When to run | Gate behavior | Timeout |
|---|---|---|---|
| Still-image geolocation smoke | Every PR after implementation exists | Block merge | <= 15 min |
| Public VIO dataset replay | Nightly and before release | Block release | Dataset-dependent |
| Jetson performance/resource | Before release and after runtime dependency changes | Block release | <= 8 h for endurance/thermal |
| Plane SITL failsafe/spoofing | Every release candidate | Block release | <= 60 min |
Reporting
Format: CSV and FDR validation summary.
Columns: Test ID, Test Name, Input Dataset, Execution Time (ms), Result, Error Distance (m), Source Label, Covariance 95% Semi-Major (m), GPS_INPUT.fix_type, Error Message.
Output path: ./test-results/blackbox-report.csv and ./test-results/fdr-validation-summary.md.
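As an illustration of the report format, the sketch below writes rows using the column list above and computes Error Distance (m) as the great-circle distance between an emitted and a ground-truth coordinate. It assumes a spherical Earth (haversine), which may not be the error metric the project finally adopts. The function names and the sample row values are hypothetical.

```python
import csv
import io
import math

# Column order taken from the Reporting section of this plan.
COLUMNS = [
    "Test ID", "Test Name", "Input Dataset", "Execution Time (ms)", "Result",
    "Error Distance (m)", "Source Label", "Covariance 95% Semi-Major (m)",
    "GPS_INPUT.fix_type", "Error Message",
]

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def write_report(rows, stream) -> None:
    """Write report rows; missing columns are left empty, unknown ones raise."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Illustrative row only; IDs, names, and coordinates are made up.
sample_row = {
    "Test ID": "BB-001",
    "Test Name": "still-image frame-center",
    "Result": "PASS",
    "Error Distance (m)": round(haversine_m(50.0, 36.0, 50.0001, 36.0), 2),
    "GPS_INPUT.fix_type": 3,
}
buffer = io.StringIO()
write_report([sample_row], buffer)
```

The example writes to an in-memory buffer; a real run would open `./test-results/blackbox-report.csv` with `newline=""` as the `csv` module recommends.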
Test Execution
Decision: Use both Docker/replay execution and local hardware execution.
Hardware dependencies found:
- Jetson Orin Nano Super with 8 GB shared LPDDR5 and 25 W power mode.
- CUDA/TensorRT/ONNX acceleration for DINOv2 and local-matcher profiling.
- Camera ingestion paths over USB, MIPI-CSI, or GigE.
- ArduPilot Plane SITL and MAVLink GPS_INPUT behavior.
- Thermal, power, FDR, and storage limits that require target-like execution.
Docker / Replay Mode
Use Docker or local host replay for deterministic, reproducible tests that do not require physical Jetson hardware:
- Still-image frame-center geolocation.
- Derkachi synchronized video/telemetry replay, including alignment and VIO smoke checks.
- Satellite-cache freshness and integrity fixtures.
- FAISS descriptor/index behavior.
- Public dataset replay where GPU/hardware timing is not the assertion.
- Plane SITL tests where SITL and MAVLink behavior are the target.
Docker/replay mode is suitable for PR checks and nightly validation, but it does not prove Jetson latency, memory, thermal, or camera-driver behavior.
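The satellite-cache freshness and integrity fixtures listed above could be checked along the following lines. This is only a sketch: the manifest entry fields (`sha256`, `captured_at`), the maximum-age policy, and the returned strings are hypothetical, and manifest signature verification is deliberately left out because this plan does not specify a signing scheme.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def tile_digest(data: bytes) -> str:
    """Hex SHA-256 digest of raw tile bytes."""
    return hashlib.sha256(data).hexdigest()


def check_tile(entry: dict, data: bytes, now: datetime,
               max_age: timedelta) -> str:
    """Classify a tile against its (hypothetical) manifest entry.

    entry["sha256"]      expected hex digest of the tile bytes
    entry["captured_at"] ISO 8601 timestamp with a UTC offset
    """
    if tile_digest(data) != entry["sha256"]:
        return "reject: digest mismatch"
    captured = datetime.fromisoformat(entry["captured_at"])
    if now - captured > max_age:
        return "reject: stale"
    return "accept"


# Illustrative fixture; the bytes and timestamp are made up.
tile_bytes = b"example-tile-bytes"
manifest_entry = {
    "sha256": tile_digest(tile_bytes),
    "captured_at": "2024-01-01T00:00:00+00:00",
}
fresh_now = datetime(2024, 1, 2, tzinfo=timezone.utc)
stale_now = datetime(2024, 6, 1, tzinfo=timezone.utc)
limit = timedelta(days=30)
```

Because the check takes `now` as an argument instead of reading the clock, a stale-tile fixture stays deterministic in PR and nightly runs.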
Local Hardware Mode
Use local Jetson hardware for release gates:
- BASALT + wrapper latency and memory profiling.
- DINOv2/ONNX/TensorRT descriptor-fidelity and runtime profiling.
- ALIKED/DISK + LightGlue runtime profiling.
- Cold-start time to first valid GPS_INPUT.
- 8-hour thermal and FDR endurance tests.
- Camera interface validation once the exact module interface is selected.
Gate Policy
- PR gate: Docker/replay smoke and deterministic fixture tests.
- Nightly gate: Docker/replay public dataset slices and SITL scenarios.
- Release gate: local Jetson hardware, Plane SITL, thermal/resource tests, and representative replay data.
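One way to encode the gate policy is a small lookup that a CI entry point consults before launching suites. The sketch below is an assumption about how that might look: the suite identifiers, `GATE_SUITES`, `HARDWARE_SUITES`, and `suites_for` are hypothetical names, and failing outright when Jetson hardware is missing (instead of skipping) is an illustrative choice, not a decision recorded in this plan.

```python
from typing import Dict, List

# Suite identifiers are hypothetical labels for the items in the gate policy.
GATE_SUITES: Dict[str, List[str]] = {
    "pr": ["docker-replay-smoke", "deterministic-fixtures"],
    "nightly": ["docker-replay-public-datasets", "sitl-scenarios"],
    "release": [
        "jetson-hardware",
        "plane-sitl",
        "thermal-resource",
        "representative-replay",
    ],
}

# Suites that cannot produce valid evidence without a physical Jetson.
HARDWARE_SUITES = frozenset({"jetson-hardware", "thermal-resource"})


def suites_for(gate: str, jetson_available: bool) -> List[str]:
    """Return the suites for a gate, refusing to run hardware gates off-target."""
    if gate not in GATE_SUITES:
        raise KeyError(f"unknown gate: {gate!r}")
    suites = GATE_SUITES[gate]
    missing = [s for s in suites if s in HARDWARE_SUITES and not jetson_available]
    if missing:
        # Skipping silently would let a release pass without hardware evidence.
        raise RuntimeError(f"gate {gate!r} needs Jetson hardware for: {missing}")
    return list(suites)
```

With this shape, a PR run on a hosted runner resolves to replay-only suites, while a release run on a machine without the Jetson attached stops before producing a misleading pass.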