# Test Data Management
## Seed Data Sets
| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|----------|-------------|---------------|-----------|---------|
| `still-image-set-60` | 60 nadir aerial images `AD000001.jpg` ... `AD000060.jpg` from `_docs/00_problem/input_data/`, with WGS84 frame-center ground truth (GT) in `coordinates.csv` and a per-image accuracy table in `expected_results/position_accuracy.csv`. Captured at 400 m AGL with an ADTi 20MP 20L V1 camera (per `data_parameters.md`). The capture cadence is slow (roughly one image every 2-3 s), so the set suits satellite-anchor frame-center tests but NOT frame-to-frame VIO. | FT-P-01, FT-P-03, FT-P-05, FT-P-06, FT-P-15, FT-P-19, NFT-RES-03 (Monte Carlo), NFT-PERF-04 | Bind-mounted from `_docs/00_problem/input_data/` to `/test-data` in `e2e-runner` (read-only) | None (read-only fixture) |
| `still-image-sat-refs-2` | Two paired Google Maps reference images `AD000001_gmaps.png`, `AD000002_gmaps.png`. Insufficient for full satellite-anchor coverage of the 60-image set; supplements the tile-cache fixture for AC-2.1b cross-validation only. | FT-P-05 (subset), FT-P-19 | Same as above | Same |
| `derkachi-fixture` | Cropped nadir flight footage `flight_derkachi/flight_derkachi.mp4` (H.264, 880×720, 30 fps, ~490.07 s = 14,700 frames) plus synchronized FC telemetry `flight_derkachi/data_imu.csv` (4,900 rows @ 10 Hz, columns `timestamp(ms)`, `Time`, `SCALED_IMU2.*`, `GLOBAL_POSITION_INT.*`). Three video frames per telemetry row. The `GLOBAL_POSITION_INT` columns are the trajectory ground truth. | FT-P-02, FT-P-04, FT-P-07, FT-P-10, FT-N-01 (synth on top), FT-N-02, FT-N-03 (synth), FT-N-04 (synth), NFT-PERF-01, NFT-PERF-02, NFT-RES-01, NFT-RES-02, NFT-RES-03 (Monte Carlo), NFT-RES-04, NFT-LIM-02 (8 h synth load loop) | Same bind mount as above | Same |
| `tile-cache-fixture` | Pre-built FAISS HNSW index + tile filesystem covering: (a) the 60 still-image footprints at 0.3-0.5 m/px, (b) the Derkachi route bbox at the same resolution. Built once per CI run by `e2e/fixtures/tile-cache-builder/` from the `_gmaps.png` references and from a curated public-data subset (when D-PROJ-3 is resolved — until then, stub-tile content for footprints not paired with `_gmaps.png`). Tile manifest schema per `restrictions.md` § Satellite Imagery. | FT-P-01, FT-P-05, FT-P-15, FT-P-16, FT-P-17, FT-P-19, FT-N-05, FT-N-06, NFT-LIM-03, NFT-PERF-01, NFT-PERF-04, NFT-SEC-01 (poisoning test), NFT-SEC-02 (egress) | Built into named Docker volume `tile-cache-fixture`; mounted read-only into SUT at `/var/azaion/tile-cache` | Volume removed at teardown |
| `synth-age-tile-set` | Two clones of the tile-cache-fixture with manifest `capture_date` field synthetically aged: `synth-age-7mo` (>6 mo, exceeds AC-8.2 active-conflict threshold) and `synth-age-13mo` (>12 mo, exceeds rear threshold). Tile pixels unchanged; only manifest dates differ. | FT-N-05, FT-N-06 | Built from `tile-cache-fixture` by date-mutating script in `e2e/fixtures/age-injector/` | Volume removed at teardown |
| `outlier-injection-derkachi` | Synthetic adversarial overlay on `derkachi-fixture`: every Nth frame replaced by a random crop from a far-away tile (>350 m offset, per AC-3.1) to inject a visual outlier. Three injection densities: `light` (1 in 100), `medium` (1 in 10), `heavy` (1 in 3). Generated at runtime by `e2e/fixtures/injectors/outlier.py`. | FT-N-01 | Generated at scenario start, written to `tmpfs` in `e2e-runner`, mounted into SUT as a derived frame source | Auto-cleared at teardown (tmpfs) |
| `blackout-spoof-derkachi` | Synthetic overlay on `derkachi-fixture`: pure-black frames inserted in 5 s / 15 s / 35 s windows AND simultaneous spoofed-GPS injection on the FC inbound stream. Spoof pattern: realistic-looking GPS jumps the trajectory 200-500 m in `north_east_random_direction`. Three windows produce three sub-scenarios per AC-NEW-8. Generated at runtime. | FT-N-04, NFT-RES-04 | Same | Same |
| `multi-segment-derkachi` | Synthetic overlay: 3+ blackout segments distributed across the Derkachi flight to exercise satellite-reference re-localization (AC-3.3) without spoofing. Generated at runtime. | FT-P-08 | Same | Same |
| `cold-boot-fixture` | The state needed to validate AC-NEW-1: a frozen FC pose (`GLOBAL_POSITION_INT` snapshot at flight-resume time) + the tile-cache-fixture + a blank FDR. Test cold-boots the SUT and measures TTFF. | NFT-PERF-03 (AC-NEW-1) | The frozen FC pose is a JSON fixture in `e2e/fixtures/cold-boot/`; SUT is restarted (`docker compose restart gps-denied-onboard`) and TTFF is measured from container-ready event to first valid `GPS_INPUT` / `MSP2_SENSOR_GPS` arrival at SITL | Container restart only |
| `mavlink-passkey` | A test-only MAVLink 2.0 signing passkey (32-byte hex). Used for D-C8-9 ArduPilot-track signing channel. NEVER reused outside test environment; checked-in as `e2e/fixtures/secrets/mavlink-test-passkey.txt` with explicit comment "TEST ONLY". | FT-P-09 (AP track), NFT-SEC-03 | Loaded via Docker secret into SUT environment | None — fixture file |
| `cve-jpeg-fixture` | Crafted JPEG that triggers CVE-2025-53644 (uninitialized stack pointer → heap buffer write) in OpenCV 4.10/4.11. The currently-pinned `opencv-python>=4.11.0.86,<4.12` must process it without crash and either decode safely or reject. NFT-SEC-04 also exercises ASan to confirm no buffer overflow. The original D-CROSS-CVE-1 spec required `>=4.12.0`; the pin is held below 4.12 because gtsam==4.2 ships only numpy-1 wheels (see `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md` — D-CROSS-CVE-1 leftover OPEN until upstream gtsam targets numpy>=2). | NFT-SEC-04 | Local-data-only fixture file at `e2e/fixtures/security/cve-2025-53644.jpg` (sourced from public PoC, license-checked) | None — fixture file |
| `sitl-replay-fixture-p01` | Pre-captured FT-P-01 FDR-replay set built by `e2e/fixtures/sitl_replay_builder/build_p01_fixtures.py` from the 60 still images. Contains `outbound_messages_<fc_kind>_<host>.json` (per-image lat/lon emitted by SUT; `null` entries encode timeouts), `observer_<fc_kind>_<host>.json` (sitl_observer config), `stills.mp4` (60-image stitched video), `stationary.tlog` (synthetic stationary IMU/ATTITUDE), `fdr.jsonl` (FDR archive). Activated by `E2E_SITL_REPLAY_DIR=e2e/fixtures/sitl_replay/p01` (see environment.md § Replay-Mode Skip Gating). | FT-P-01 | Pre-committed at `e2e/fixtures/sitl_replay/p01/`; rebuild via `python -m e2e.fixtures.sitl_replay_builder.build_p01_fixtures --input-dir _docs/00_problem/input_data --output-dir e2e/fixtures/sitl_replay/p01 --fc-kind ardupilot --host sitl-host` | None — committed fixture |
| `sitl-replay-fixture-p02` | Pre-captured FT-P-02 Derkachi drift FDR-replay set built by `e2e/fixtures/sitl_replay_builder/build_p02_fixtures.py` from `flight_derkachi.mp4` + `data_imu.csv`. Contains `derkachi.tlog`, `fdr/fdr.jsonl`, `observer_<fc_kind>_<host>.json`. iNav not supported by current builder — ArduPilot only. | FT-P-02 | Pre-committed at `e2e/fixtures/sitl_replay/p02/`; rebuild via `python -m e2e.fixtures.sitl_replay_builder.build_p02_fixtures --derkachi-dir _docs/00_problem/input_data/flight_derkachi --output-dir e2e/fixtures/sitl_replay/p02 --fc-kind ardupilot --host sitl-host` | None — committed fixture |
| `fc-proxy-schedule` | JSON schedule loaded by `e2e/fixtures/injectors/fc_proxy.BlackoutSpoofProxy` to drive FT-N-04 blackout + spoofed-GPS windows on the FC inbound stream. Schedule format: `window_start_ms`, `window_end_ms`, `spoof_pattern` per window. Loaded via `BlackoutSpoofProxy.from_schedule_file(schedule_path)` and replayed by `runner/helpers/fc_proxy_runtime.drive_fc_proxy(...)` (AZ-596). | FT-N-04, NFT-RES-04 | Generated alongside the scenario's `blackout-spoof-derkachi` overlay; written to per-test tmpfs OR pre-captured under `e2e/fixtures/sitl_replay/<scenario>/proxy_schedule.json` when in FDR-replay mode | Auto-cleared at teardown (tmpfs) or committed (FDR-replay) |
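
The `derkachi-fixture` row states three video frames per telemetry row (30 fps video against 10 Hz telemetry). A minimal sketch of that mapping follows. It assumes frame 0 and row 0 are aligned with no time offset, which the table does not confirm; the function names are hypothetical and not part of the repository.

```python
FPS = 30            # frame rate stated for flight_derkachi.mp4
TELEMETRY_HZ = 10   # row rate stated for data_imu.csv
FRAMES_PER_ROW = FPS // TELEMETRY_HZ  # 3 frames per telemetry row


def telemetry_row_for_frame(frame_index: int) -> int:
    """Zero-based data_imu.csv row that covers a zero-based video frame.

    Assumes frame 0 and row 0 start at the same instant (an assumption,
    not something the fixture documentation states).
    """
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    return frame_index // FRAMES_PER_ROW


def frames_for_row(row_index: int) -> range:
    """Zero-based video frames that fall inside one telemetry row."""
    if row_index < 0:
        raise ValueError("row_index must be non-negative")
    start = row_index * FRAMES_PER_ROW
    return range(start, start + FRAMES_PER_ROW)


# Consistency check on the figures quoted in the table:
# 4,900 rows x 3 frames per row = 14,700 frames.
assert 4_900 * FRAMES_PER_ROW == 14_700
```

With this alignment, the last frame (index 14,699) maps to the last telemetry row (index 4,899).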
## Data Isolation Strategy
Each `pytest` test case runs against a fresh `gps-denied-onboard` container: either `docker compose restart` between tests or, for hermetic-critical tests, the `--forked` pytest mode, which brings up a clean compose stack per case. The `tile-cache-fixture` and `input-data` mounts are read-only, so tests cannot cross-contaminate each other at the SUT-input layer. The `fdr-output` volume is reset between tests (`docker volume rm` + recreate) so each test sees a blank FDR.

For Tier-2 (Jetson hardware), the same isolation discipline applies at the systemd-service level: `systemctl restart gps-denied-onboard.service` runs between tests, and `/var/azaion/fdr` is wiped between tests.

Synthetic-injection fixtures (`outlier-injection-derkachi`, `blackout-spoof-derkachi`, `multi-segment-derkachi`, `synth-age-tile-set`) are generated into per-test tmpfs and never written back to a persistent volume.

`sitl-replay-fixture-*` and `fc-proxy-schedule` (when in FDR-replay mode) are committed under `e2e/fixtures/sitl_replay/<scenario>/` and opened read-only by the replay-mode scenarios. They are not regenerated per test; the builders under `e2e/fixtures/sitl_replay_builder/` are invoked manually (or by a fixture-refresh CI job) when the SUT replay contract changes. When `E2E_SITL_REPLAY_DIR` is unset, the gated scenarios skip cleanly via the `sitl_replay_ready` pytest marker (per AZ-594/595/598/599) and the harness falls back to live mode, which requires the full Docker compose stack.
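
The replay-mode gating described above can be pictured as a small readiness check. The sketch below is an illustration only: the real `sitl_replay_ready` marker implementation is not shown in this document, and `replay_skip_reason` and its `required` parameter are hypothetical names.

```python
from pathlib import Path
from typing import Mapping, Optional, Sequence


def replay_skip_reason(
    env: Mapping[str, str],
    required: Sequence[str] = (),
) -> Optional[str]:
    """Return a skip reason, or None when the replay fixtures look usable.

    `required` lists file names (relative to the replay directory) that the
    scenario needs; which files a given scenario needs is an assumption the
    caller supplies.
    """
    replay_dir = env.get("E2E_SITL_REPLAY_DIR", "").strip()
    if not replay_dir:
        return "E2E_SITL_REPLAY_DIR is unset"
    root = Path(replay_dir)
    if not root.is_dir():
        return f"replay directory not found: {root}"
    missing = [name for name in required if not (root / name).is_file()]
    if missing:
        return "missing replay files: " + ", ".join(missing)
    return None
```

A pytest fixture or collection hook could call `replay_skip_reason(os.environ, required=("fdr.jsonl",))` and pass any non-`None` result to `pytest.skip(...)`, so that a missing fixture directory yields a clean skip and not a failure.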
## Input Data Mapping
| Input Data File | Source Location | Description | Covers Scenarios |
|-----------------|----------------|-------------|-----------------|
| `AD000001.jpg` ... `AD000060.jpg` | `_docs/00_problem/input_data/` | 60 nadir still images, ADTi 20MP @ 400 m AGL | FT-P-01, FT-P-03, FT-P-05, FT-P-06, FT-P-15, FT-P-19, NFT-PERF-04, NFT-RES-03 |
| `coordinates.csv` | `_docs/00_problem/input_data/` | 60-row WGS84 frame-center GT (image, lat, lon) | Same as above |
| `AD000001_gmaps.png`, `AD000002_gmaps.png` | `_docs/00_problem/input_data/` | Google Maps satellite reference for images 1-2 | FT-P-05, FT-P-19 |
| `data_parameters.md` | `_docs/00_problem/input_data/` | AGL height (400 m) + camera model | All — global metadata |
| `flight_derkachi/flight_derkachi.mp4` | `_docs/00_problem/input_data/flight_derkachi/` | H.264 nadir video, 880×720 @ 30 fps, ~490 s | FT-P-02, FT-P-04, FT-P-07, FT-P-10, FT-N-01..04, NFT-PERF-01..04, NFT-RES-01..04, NFT-LIM-02 |
| `flight_derkachi/data_imu.csv` | `_docs/00_problem/input_data/flight_derkachi/` | 4,900 rows @ 10 Hz of `SCALED_IMU2` + `GLOBAL_POSITION_INT` | Same as above |
| `flight_derkachi/README.md` | `_docs/00_problem/input_data/flight_derkachi/` | Fixture metadata | Documentation only |
| `expected_results/results_report.md` | `_docs/00_problem/input_data/expected_results/` | Pass/fail rules + still-image and Derkachi mappings | All FT-P / FT-N scenarios that load this fixture |
| `expected_results/position_accuracy.csv` | `_docs/00_problem/input_data/expected_results/` | Per-image accuracy threshold flags | FT-P-01, NFT-RES-03 |
## Expected Results Mapping
The tables in this section map each test scenario to the quantifiable expected result it asserts on. Comparison methods follow `.cursor/skills/test-spec/templates/expected-results.md`. The `Expected Result Source` column points at the canonical source of truth for the assertion.
### Position accuracy
| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source |
|-----------------|------------|-----------------|-------------------|-----------|----------------------|
| FT-P-01 | `still-image-set-60` + `tile-cache-fixture` | `pass_count(error≤50m) ≥ 48` (≥80% of 60) AND `pass_count(error≤20m) ≥ 30` (≥50% of 60) | `threshold_min` on aggregate counts; per-image error via `numeric_tolerance` against Vincenty geodesic distance to GT in `coordinates.csv` | ±50 m / ±20 m | `expected_results/results_report.md` § Pass/Fail Rules + `expected_results/position_accuracy.csv` |
| FT-P-02 | `derkachi-fixture` | At each anchor frame, `‖propagated_centre − next_anchor_centre‖ < 100 m` (visual-only) AND `< 50 m` (IMU-fused). Drift binned by `last_satellite_anchor_age_ms`. | `threshold_max` per anchor pair, then aggregate rule `≥95% of anchor pairs satisfy` | < 100 m / < 50 m | AC-1.3 + Derkachi `GLOBAL_POSITION_INT` GT |
| FT-P-03 | `still-image-set-60` (any 1 image) | Estimate output schema fields present: `lat:float`, `lon:float`, `cov_semi_major_m:float`, `source_label ∈ {satellite_anchored, visual_propagated, dead_reckoned}`, `last_satellite_anchor_age_ms:int` | `schema_match` (presence + type) AND `set_contains` (label) | N/A | AC-1.4 + AC-4.3 |
| FT-P-19 | `tile-cache-fixture` + `still-image-sat-refs-2` | Scale-ratio: any UAV-frame footprint at 400 m AGL retrievable from cache (FAISS top-K=10 includes a tile with center within 100 m of true position). Scene-change subset (PARTIAL — flag-marked, see traceability matrix). | `set_contains` (top-K result includes correct tile) | top-K hit | AC-8.6 |
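
The FT-P-01 aggregate rule (at least 48 of 60 images within 50 m and at least 30 of 60 within 20 m) can be sketched as below. Two assumptions are made loudly here: the distance function is a spherical haversine, swapped in for the Vincenty geodesic the table names, purely to keep the example short; and a `None` error stands for a timeout and counts as a failure. The function names are hypothetical.

```python
import math
from typing import Optional, Sequence

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres (NOT Vincenty; a simpler stand-in)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def ft_p01_passes(errors_m: Sequence[Optional[float]]) -> bool:
    """Apply the two count thresholds to per-image errors.

    A None entry (assumed to mean "no estimate / timeout") never counts
    toward either threshold.
    """
    within_50 = sum(1 for e in errors_m if e is not None and e <= 50.0)
    within_20 = sum(1 for e in errors_m if e is not None and e <= 20.0)
    return within_50 >= 48 and within_20 >= 30
```

The thresholds are absolute counts over the 60-image set, so timeouts lower the pass count without shrinking the denominator.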
### Image processing quality
| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source |
|-----------------|------------|-----------------|-------------------|-----------|----------------------|
| FT-P-04 | `derkachi-fixture` | Frame-to-frame registration succeeds for `≥95%` of "normal" segments (defined per AC-2.1a: nadir ±10° bank/pitch from `data_imu.csv` `SCALED_IMU2` quaternion-derived attitude estimate, ≥40% inferred prior-frame overlap). Sharp-turn frames excluded from this denominator. | `threshold_min` on success ratio | ≥95% | AC-2.1a |
| FT-P-05 | `still-image-set-60` (with `_gmaps.png` subset for ground-truth match) | Satellite-anchor registration succeeds AND satisfies AC-1.1/1.2 accuracy AND MRE < 2.5 px | `threshold_max` MRE | < 2.5 px | AC-2.1b + AC-2.2 |
| FT-P-06 | `derkachi-fixture` (frame-to-frame) AND `still-image-set-60` (sat-anchor) | Mean Reprojection Error: `< 1.0 px` frame-to-frame, `< 2.5 px` satellite-anchored cross-domain | `threshold_max` per shape | < 1.0 / < 2.5 px | AC-2.2 |
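
For the MRE thresholds in FT-P-05 and FT-P-06, a minimal sketch of the check follows. It assumes the common definition of mean reprojection error (the mean Euclidean pixel distance between projected and observed keypoints); this document does not define how the SUT computes MRE, and the names below are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]

# Limits taken from the table: strict "<" comparisons, in pixels.
MRE_LIMIT_PX = {"frame_to_frame": 1.0, "satellite_anchored": 2.5}


def mean_reprojection_error_px(projected: Sequence[Point], observed: Sequence[Point]) -> float:
    """Mean Euclidean distance between matched projected/observed points."""
    if not projected or len(projected) != len(observed):
        raise ValueError("point lists must be non-empty and of equal length")
    total = sum(
        math.hypot(px - ox, py - oy)
        for (px, py), (ox, oy) in zip(projected, observed)
    )
    return total / len(projected)


def mre_within_limit(mre_px: float, kind: str) -> bool:
    """True when the MRE is strictly below the limit for this match kind."""
    return mre_px < MRE_LIMIT_PX[kind]
```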
### Resilience
| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source |
|-----------------|------------|-----------------|-------------------|-----------|----------------------|
| FT-N-01 | `outlier-injection-derkachi` | Up to 350 m offset in a single frame is rejected as outlier; estimate continues from prior valid state with grown covariance; airframe tilt up to ±20° handled | Per-injected-outlier: `error_after_outlier ≤ error_before_outlier + 50 m` AND `covariance_growth_monotonic` | ±50 m drift budget | AC-3.1 |
| FT-N-02 | `derkachi-fixture` (sharp-turn segment, identified via `SCALED_IMU2` gyro_z spikes) | Sharp-turn frames may fail frame-to-frame registration; recovery via satellite-reference re-localization within next 3 frames | Boolean recovery within 3 frames | N/A | AC-3.2 |
| FT-P-08 | `multi-segment-derkachi` | ≥3 disconnected segments handled; satellite-reference re-localization succeeds at each gap; trajectory remains continuous (no >100 m jump) | `threshold_max` discontinuity | < 100 m | AC-3.3 |
| FT-N-03 | `derkachi-fixture` + synthetic 3-frame outage injector | After ≥3 consecutive frames AND ≥2 s without estimate: STATUSTEXT containing `OPERATOR_RELOC_REQUEST` emitted to GCS via `mavproxy-listener`; estimates labeled `dead_reckoned` continue | `regex` on STATUSTEXT + `set_contains` on labels | regex | AC-3.4 |
| FT-N-04 | `blackout-spoof-derkachi` (5 s / 15 s / 35 s windows) | Within ≤1 frame OR ≤400 ms: label switches to `dead_reckoned`; spoofed GPS rejected; covariance grows monotonically; `horiz_accuracy` not under-reported; `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT at 1-2 Hz | `threshold_max` switch latency + `regex` STATUSTEXT + monotonic check | ≤400 ms | AC-3.5 |
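
The FT-N-01 comparison (`error_after_outlier ≤ error_before_outlier + 50 m` AND `covariance_growth_monotonic`) can be sketched per injected outlier as below. It assumes "monotonic" means non-decreasing and that the caller supplies the covariance samples spanning the outlier window; both are assumptions, and the function names are hypothetical.

```python
from typing import Sequence


def covariance_growth_monotonic(cov_semi_major_m: Sequence[float]) -> bool:
    """True when each covariance sample is >= the previous one."""
    return all(b >= a for a, b in zip(cov_semi_major_m, cov_semi_major_m[1:]))


def outlier_rejected(
    error_before_m: float,
    error_after_m: float,
    cov_semi_major_m: Sequence[float],
    drift_budget_m: float = 50.0,
) -> bool:
    """Per-outlier pass condition: bounded error growth and growing covariance."""
    within_budget = error_after_m <= error_before_m + drift_budget_m
    return within_budget and covariance_growth_monotonic(cov_semi_major_m)
```

The same monotonic helper would apply to the covariance check listed for FT-N-04.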
### FC contract & startup
| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source |
|-----------------|------------|-----------------|-------------------|-----------|----------------------|
| FT-P-09-AP | `derkachi-fixture` + `mavlink-passkey` + `ardupilot-plane-sitl` | `GPS_INPUT` messages reach AP SITL; AP EKF accepts them as `EK3_SRC1_POSXY=3` (GPS); MAVLink 2.0 signing handshake completes (D-C8-9); messages without valid signature are rejected | `exact` (AP source-set state via param read) + `boolean` (signing handshake success) + `exact` (rejection of unsigned in NFT-SEC-03) | N/A | AC-4.3 + D-C8-9 |
| FT-P-09-iNav | `derkachi-fixture` + `inav-sitl` | `MSP2_SENSOR_GPS` (ID 0x1F03) messages reach iNav SITL via TCP 5760; iNav GPS provider state shows `provider=MSP` and fix is acquired | `exact` on iNav GPS provider state via MSP read | N/A | AC-4.3 + Source #4 |
| FT-P-10 | `derkachi-fixture` | Per Mode B Fact #107: GTSAM iSAM2 smoothed past-keyframe pose estimates differ from raw single-shot estimates AND smoothed estimates are closer to `GLOBAL_POSITION_INT` GT than raw (IT-11). NOT validated as FC-side retroactive correction (out of scope per Mode B revision). | `numeric_tolerance` improvement check | smoothed_error < raw_error | AC-4.5 (revised) + Mode B Fact #107 |
| FT-P-11 | `cold-boot-fixture` + `ardupilot-plane-sitl` | On boot, SUT initializes from FC EKF's last valid GPS + IMU-extrapolated position | `numeric_tolerance` initial-pose-vs-FC-pose | ±50 m | AC-5.1 |
| NFT-RES-01 | `derkachi-fixture` + 4 s outage injector | After >3 s without estimate, FC falls back to IMU-only dead reckoning; SUT emits a `NO_ESTIMATE_TIMEOUT` failure log | `boolean` on FC EKF source-set transition + `regex` on log | N/A | AC-5.2 |
| NFT-RES-02 | `derkachi-fixture` + container restart mid-replay | After companion reboot, SUT re-initializes from FC's current IMU-extrapolated position; first emitted `GPS_INPUT` / `MSP2_SENSOR_GPS` is within ±100 m of FC's IMU-extrapolated pose at boot-complete time | `numeric_tolerance` pose at first emit | ±100 m | AC-5.3 |
### Performance
| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source |
|-----------------|------------|-----------------|-------------------|-----------|----------------------|
| NFT-PERF-01 (Tier-2 only) | `derkachi-fixture` resampled to 3 Hz on Jetson Orin Nano Super | End-to-end latency (camera capture → GPS to FC) | `threshold_max` p95 | ≤ 400 ms | AC-4.1 + D-CROSS-LATENCY-1 |
| NFT-PERF-02 (Tier-1+2) | `derkachi-fixture` | Estimates emitted frame-by-frame (no batching > 1 frame); inter-emit interval p95 ≤ inter-frame interval × 1.05 | `threshold_max` p95 inter-emit | ≤ 350 ms (at 3 Hz target) | AC-4.4 |
| NFT-PERF-03 (Tier-2 only) | `cold-boot-fixture` | Cold-start TTFF: from container-ready to first valid `GPS_INPUT` / `MSP2_SENSOR_GPS` | `threshold_max` p95 over 50 cold boots | < 30 s | AC-NEW-1 |
| NFT-PERF-04 | `still-image-set-60` + spoofed FC GPS injection in `ardupilot-plane-sitl` | Spoofing-promotion latency: from FC GPS-denial / spoof signal to SUT estimate becoming AP primary position source | `threshold_max` p95 over 50 trials per FC | < 3 s | AC-NEW-2 |
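
The NFT-PERF-02 tolerance follows from the stated rule: at a 3 Hz target the inter-frame interval is 1000 / 3 ≈ 333.3 ms, and 333.3 ms × 1.05 = 350 ms. A minimal sketch of the p95 check follows. The nearest-rank percentile method is an assumption (the document does not say which percentile estimator the harness uses), and the function names are hypothetical.

```python
from typing import Sequence


def p95_nearest_rank(samples: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method (integer arithmetic)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) without floats
    return ordered[rank - 1]


def inter_emit_limit_ms(frame_rate_hz: float, slack: float = 1.05) -> float:
    """Inter-frame interval in ms, widened by the 5% slack from the table."""
    return 1000.0 / frame_rate_hz * slack


def nft_perf_02_passes(emit_times_ms: Sequence[float], frame_rate_hz: float = 3.0) -> bool:
    """True when the p95 gap between consecutive emits is within the limit."""
    intervals = [b - a for a, b in zip(emit_times_ms, emit_times_ms[1:])]
    return p95_nearest_rank(intervals) <= inter_emit_limit_ms(frame_rate_hz)
```

The same percentile helper would apply to the p95 thresholds listed for NFT-PERF-01, NFT-PERF-03, and NFT-PERF-04.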
### Resource limits
| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source |
|-----------------|------------|-----------------|-------------------|-----------|----------------------|
| NFT-LIM-01 (Tier-2) | `derkachi-fixture` 8 h replay loop | Memory `< 8 GB shared` on Jetson Orin Nano Super throughout | `threshold_max` peak RSS over duration | ≤ 8 GB | AC-4.2 |
| NFT-LIM-02 (Tier-1) | 8 h Derkachi replay loop | FDR ≤ `64 GB`; no payload class silently dropped without a logged rollover | `threshold_max` total FDR size + `regex` on rollover-event presence | ≤ 64 GB | AC-NEW-3 |
| NFT-LIM-03 | `tile-cache-fixture` plus exercised manifests/overviews/indices | Cache budget `≤ 10 GB` for the ~400 km² operational area unless solution defines a separate descriptor budget | `threshold_max` total cache size | ≤ 10 GB | RESTRICT-SAT-2 + AC-8.3 |
| NFT-LIM-04 (Tier-2) | `derkachi-fixture` 8 h | CPU/GPU/temp/throttle telemetry recorded; no thermal throttling at 25 W TDP at the upper temp envelope (deferred to chamber for AC-NEW-5) | `threshold_max` throttle event count = 0 (workstation thermal-day) | 0 events | RESTRICT-HW-1 + AC-NEW-5 (Tier-2 partial) |
### Security
| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source |
|-----------------|------------|-----------------|-------------------|-----------|----------------------|
| NFT-SEC-01 | Synthetic over-confidence injection: deflate covariance ×1.5-3 in 3 trial flights, observe AC-NEW-7 cache-poisoning behavior at the `mock-suite-sat-service` ingest | Per flight: `P(geo-misalign > 30 m) < 1%`, `P(> 100 m) < 0.1%` of written tiles. PARTIAL — multi-flight Monte Carlo (≥100 flights per AC text) is reduced-confidence with current single Derkachi fixture; trace flag in matrix. | `threshold_max` on probability | < 1% / < 0.1% | AC-NEW-7 |
| NFT-SEC-02 | Network egress probe from SUT container | All non-`e2e-net` egress attempts blocked by Docker `internal: true`; per-attempt logged as security event in SUT log | `exact` (egress count = 0) + `regex` (security-event log emission) | N/A | RESTRICT-SAT-1 + AC-8.1 |
| NFT-SEC-03 | `ardupilot-plane-sitl` + un-signed MAVLink GPS_INPUT injection | AP SITL rejects unsigned messages on the signed channel; SUT-emitted (signed) messages pass; SBOM check confirms passkey configuration | `exact` (AP rejection of unsigned) + `boolean` (SBOM passkey present) | N/A | D-C8-9 + Mode B Fact #109 + AC-NEW-2 |
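| NFT-SEC-04 | `cve-jpeg-fixture` fed to SUT image pipeline (C1 + C4 paths) | The currently pinned OpenCV (`opencv-python>=4.11.0.86,<4.12`) either decodes the file safely or rejects it; no crash and no buffer overflow detected by AddressSanitizer. The original D-CROSS-CVE-1 spec required ≥4.12.0; that upgrade is deferred (see the `cve-jpeg-fixture` entry under Seed Data Sets). | `boolean` on no-crash + ASan clean | N/A | D-CROSS-CVE-1 + Mode B Fact #112 |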
## External Dependency Mocks
| External Service | Mock/Stub | How Provided | Behavior |
|-----------------|-----------|-------------|----------|
| Azaion Suite Satellite Service (ingest API for AC-NEW-7 voting layer) | `mock-suite-sat-service` Docker service | Local FastAPI stub returning canned tile-publish-acknowledgement responses with deterministic IDs; logs every received tile + per-tile quality metadata to a file the e2e-runner reads back | Returns 202 Accepted on every well-formed publish; returns 400 on malformed; never simulates real voting (the project's role is to publish, the Service's role is to vote per Mode B Fact #105 / D-PROJ-2) |
| ArduPilot Plane FC | `ardupilot-plane-sitl` Docker service | Open-source SITL build of ArduPilot Plane stable; configured with `GPS_TYPE=14` per Source #2 to accept MAVLink GPS_INPUT | Real ArduPilot EKF behavior; we observe but do not patch |
| iNav FC | `inav-sitl` Docker service | Open-source iNav SITL; GPS provider configured to MSP per `docs/SITL/SITL.md` | Real iNav GPS subsystem behavior; we observe but do not patch |
| QGroundControl GCS | `mavproxy-listener` Docker service | Passive MAVLink listener that forwards SUT → GCS stream into a `.tlog` file the e2e-runner parses | Captures all STATUSTEXT, NAMED_VALUE_FLOAT, downsampled position frames for assertions |
| AI camera (AC-7.x) | NOT MOCKED — out of scope per Phase 1 gate | N/A | NOT COVERED in current matrix — see traceability matrix |
## Data Validation Rules
| Data Type | Validation | Invalid Examples | Expected System Behavior |
|-----------|-----------|-----------------|------------------------|
| Nav-camera frame | Resolution within ADTi spec (~5472×3648 production, downscaled equivalents allowed in Tier-1 Docker) | 0×0 frame, corrupt JPEG (CVE fixture), wrong color depth | Reject frame, log invalid-input event, do NOT advance estimator state |
| FC IMU sample | `SCALED_IMU2` fields present; timestamp monotonic; non-zero accelerometer norm | Missing field, backwards timestamp, NaN | Reject sample, log invalid-input event, propagate estimator from prior valid state |
| Satellite tile manifest | Required fields per `restrictions.md`: CRS, tile matrix, dimension, lat-adjusted m/px, capture date, source, compression. m/px ≤ 0.5 (0.5 m/px or finer, matching the 0.3-0.5 m/px tile-cache fixture). `capture_date` within AC-8.2 freshness window. | Missing `capture_date`, m/px = 1.0 (coarser than the 0.5 m/px resolution floor), `capture_date` older than freshness threshold | Reject tile load OR downgrade to non-`satellite_anchored` source label per AC-NEW-6 |
| Spoofed FC GPS | (FC-side input the SUT detects) | GPS jump >200 m between consecutive 5 Hz frames; FC GPS-health flag toggled to spoofed | SUT switches estimator label to `dead_reckoned`, stops promoting FC GPS, continues per AC-NEW-8 |
| MAVLink GPS_INPUT outbound | Honest covariance — `horiz_accuracy` ≥ estimator's 95% covariance semi-major axis | Under-reported covariance | This is a defect (AC-NEW-4) — fail NFT-PERF-04 if observed |
| MAVLink message signature | MAVLink 2.0 signed on AP wired channel per D-C8-9 | Unsigned message on signed channel | AP-side rejection (NFT-SEC-03 expected behavior) |
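
The FC IMU sample rule in the table above (fields present, timestamp monotonic, non-zero accelerometer norm, no NaN) can be sketched as a single validation function. The dictionary keys are hypothetical stand-ins for the `SCALED_IMU2.*` and `timestamp(ms)` CSV columns, and treating only a strictly backwards timestamp as invalid is an assumption based on the "backwards timestamp" example.

```python
import math
from typing import Mapping, Optional

# Hypothetical key names; the real column names live in data_imu.csv.
REQUIRED_FIELDS = ("timestamp_ms", "xacc", "yacc", "zacc")


def imu_sample_error(
    sample: Mapping[str, float],
    prev_timestamp_ms: Optional[float],
) -> Optional[str]:
    """Return a rejection reason, or None when the sample is acceptable."""
    missing = [name for name in REQUIRED_FIELDS if name not in sample]
    if missing:
        return "missing field: " + ", ".join(missing)
    values = [float(sample[name]) for name in REQUIRED_FIELDS]
    if any(math.isnan(v) for v in values):
        return "NaN value"
    timestamp_ms, xacc, yacc, zacc = values
    if prev_timestamp_ms is not None and timestamp_ms < prev_timestamp_ms:
        return "backwards timestamp"
    if math.sqrt(xacc * xacc + yacc * yacc + zacc * zacc) == 0.0:
        return "zero accelerometer norm"
    return None
```

A caller that receives a non-`None` reason would log an invalid-input event and keep propagating the estimator from the prior valid state, as the table's last column describes.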