mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 00:41:13 +00:00
[AZ-408] [AZ-410] [AZ-411] Batch 69: synth injectors + FT-P-02/03/14
AZ-408 (3pt) — Replace AZ-406 injector scaffolds with concrete generators:
- outlier.py: deterministic stride + far-away tile replacement; AC-2 ≥350m offset
- blackout_spoof.py: paired video blackout + FC GPS spoof with ≤40ms alignment; AC-4 realistic fix_type/hdop; AC-NEW-8 200-500m inter-spoof deltas
- multi_segment.py: ≥3 disjoint windows, ≥30s gaps, ≤25% coverage
- fc_proxy.py: timed-splice runtime proxy with pre-activate RuntimeError guard
- _common.py: derive_rng + tile-manifest reader + tmpfs helpers
- injector_fixtures.py: pytest fixtures wired via runner conftest

AZ-410 (3pt) — FT-P-02 cumulative drift between satellite anchors:
- anchor_pair_detector.py: AC-1 detection, AC-2/3 pass-fraction, AC-4 monotonicity check, CSV evidence
- test_ft_p_02_derkachi_drift.py: scenario gated on upstream helper NotImplementedError (frame_source_replay / fdr_reader / imu_replay)

AZ-411 (2pt) — FT-P-03 + FT-P-14 schema + WGS84:
- estimate_schema.py: AC-1 schema completeness, AC-2 source-label set containment, AC-3 WGS84 range + int32 1e-7 decode
- test_ft_p_03_14_schema_wgs84.py: shared single-image-push scenario

Tests: 248 unit tests pass (+91 vs batch 68).
Reports: batch_69_report.md, batch_69_review.md (PASS), cumulative_review_batches_67-69_cycle1_report.md (PASS).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,82 @@
# Fixture Builders — Synthetic Injectors (outlier, blackout-spoof, multi-segment)

**Task**: AZ-408_fixture_builders_synth_injectors
**Name**: Runtime synthetic-injection fixture builders
**Description**: Implement runtime-generated synthetic fixtures: `outlier-injection-derkachi` (light/medium/heavy densities), `blackout-spoof-derkachi` (5 s / 15 s / 35 s windows + paired FC GPS spoof), `multi-segment-derkachi` (3+ blackout windows without spoof).
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407 (tile-cache-fixture for Derkachi route bbox)
**Component**: Blackbox Tests / Fixture builders (epic AZ-262 / E-BBT)
**Tracker**: AZ-408
**Epic**: AZ-262 (E-BBT)

## Problem

The negative-path scenarios FT-N-01, FT-N-04, FT-P-08, NFT-RES-04 all rely on programmatically derived overlays of the Derkachi fixture (visual outliers, blackout windows, simultaneous FC GPS spoof). These overlays must be deterministic, runtime-generated into per-test tmpfs (per `test-data.md` § Data Isolation), and — for the spoof case — coordinate with FC inbound stream patching.

## Outcome

- `tests/fixtures/injectors/outlier.py` overlays `derkachi-fixture` with random crops from far-away tiles (>350 m offset per AC-3.1) at three densities — `light` (1 in 100 frames), `medium` (1 in 10), `heavy` (1 in 3). The density is selected with the `--density` CLI flag; the same `--seed` produces identical overlays.
- `tests/fixtures/injectors/blackout_spoof.py` produces three sub-scenarios — 5 s, 15 s, 35 s pure-black-frame windows on the video stream AND simultaneous spoofed-GPS injection on the FC inbound stream. Spoof pattern: realistic-looking GPS that jumps 200-500 m in a `north_east_random_direction`. The injector is a coordinated pair (video overlay + FC inbound proxy patch) so that both start and stop within ≤40 ms wall-clock of each other (AC-3).
- `tests/fixtures/injectors/multi_segment.py` generates 3+ blackout segments distributed across the Derkachi flight (positions configurable; default = at 25 %, 50 %, 75 % of replay) WITHOUT spoof injection. Used to exercise satellite-reference re-localization without a security failsafe path.
- All injectors emit to per-test tmpfs (`/tmp/<run-id>/<scenario>/`) and are auto-cleared at teardown.

## Scope

### Included

- The three injector scripts + a shared library for deterministic random-seed handling.
- A small FC-inbound proxy patch for blackout_spoof — sits between the IMU/GPS replay and the SUT's FC inbound port; passes through everything except the spoofed-GPS bursts during the configured window(s).
- pytest fixtures wrapping each injector for use in the per-scenario test files.

### Excluded

- The static fixtures (tile-cache, age-injector, cold-boot, mavlink-passkey, cve-jpeg) — owned by AZ-407.
- Per-scenario test logic (FT-N-01, FT-N-04, FT-P-08, NFT-RES-04) — those tasks consume the injector outputs.
- Persistent storage of generated fixtures — explicitly per-test tmpfs; never written to a persistent volume.

## Acceptance Criteria

**AC-1: outlier injector is seed-deterministic**
Given the same `--seed` value
When `outlier.py --density medium` runs twice
Then both runs produce overlays with identical frame indices replaced and identical replacement crop selection.
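
A minimal sketch of how seed-determinism could be obtained, assuming a fixed stride per density and a seeded phase. The commit message names a `derive_rng` helper in `_common.py`, but its signature here is an assumption, and `plan_outliers` and its parameters are hypothetical names used only for illustration.

```python
import hashlib
import random

# Strides from the Outcome section: 1 replaced frame per N source frames.
DENSITY_STRIDE = {"light": 100, "medium": 10, "heavy": 3}


def derive_rng(seed, scope):
    """Assumed signature: derive a reproducible RNG from a seed and a scope string."""
    digest = hashlib.sha256(f"{seed}:{scope}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def plan_outliers(total_frames, density, seed, n_far_tiles):
    """Return [(frame_index, far_tile_index), ...] for one overlay (hypothetical helper)."""
    stride = DENSITY_STRIDE[density]
    rng = derive_rng(seed, f"outlier:{density}")
    start = rng.randrange(stride)  # seeded phase within the first stride
    return [(i, rng.randrange(n_far_tiles)) for i in range(start, total_frames, stride)]
```

Because every random draw flows through one RNG derived from `(seed, scope)`, two calls with the same arguments yield the same frame indices and the same crop choices, which is the property AC-1 asks for.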

**AC-2: outlier offsets exceed 350 m (AC-3.1 envelope)**
Given an `outlier-injection-derkachi` `medium`-density overlay
When the per-frame replacement crop is geo-located via the tile-cache GT
Then ≥99 % of replacement crops are >350 m from the original frame's GT centre.
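
The AC-2 check can be sketched as a distance computation followed by a fraction. The sketch below assumes a spherical-Earth haversine distance is accurate enough at the 350 m scale; `far_fraction` and the coordinate pairs in the usage note are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def far_fraction(pairs, min_offset_m=350.0):
    """pairs: [((lat, lon) of original GT centre, (lat, lon) of replacement crop), ...]."""
    pairs = list(pairs)
    far = sum(1 for a, b in pairs if haversine_m(*a, *b) > min_offset_m)
    return far / len(pairs)
```

A test would then assert `far_fraction(pairs) >= 0.99`, with `pairs` read from the tile-cache manifest rather than from the SUT.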

**AC-3: blackout_spoof produces synchronized video + FC events**
Given `blackout_spoof.py --window 15s`
When the test runs
Then within ≤40 ms wall-clock of the video stream's first all-black frame, the FC inbound proxy starts emitting spoofed GPS frames; both stop within ≤40 ms at the window's end.
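
One way to express the AC-3 alignment check, assuming the test has already recorded the start and end of both windows on a single shared clock in integer milliseconds. The function name and the window representation are hypothetical.

```python
ALIGN_TOL_MS = 40  # tolerance stated in AC-3


def windows_aligned(video_ms, spoof_ms, tol_ms=ALIGN_TOL_MS):
    """Each argument is a (start_ms, end_ms) pair measured on one shared clock."""
    (v_start, v_end), (s_start, s_end) = video_ms, spoof_ms
    starts_ok = abs(v_start - s_start) <= tol_ms
    ends_ok = abs(v_end - s_end) <= tol_ms
    return starts_ok and ends_ok
```

Integer milliseconds are used so that the boundary case of exactly 40 ms is not affected by floating-point rounding.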

**AC-4: blackout_spoof spoof pattern is realistic-looking**
Given the spoof injector emits GPS during the blackout
Then the spoofed `lat`/`lon`/`alt`/`fix_type`/`hdop` fields are within typical-flight ranges (no NaN, no obvious sentinel values, fix_type in {3, 4}, hdop in [0.5, 2.5]); the position deltas between consecutive spoofed frames are in [200 m, 500 m] per AC-NEW-8.
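
A hedged sketch of a spoof-track generator that satisfies the AC-4 ranges by construction. It assumes a flat-earth conversion is adequate for sub-kilometre steps and reads "north-east random direction" as a bearing between 0° (north) and 90° (east); `spoof_track`, `step_m`, and the field layout are hypothetical.

```python
import math
import random

M_PER_DEG_LAT = 111_320.0  # flat-earth approximation, adequate for sub-km steps


def spoof_track(start_lat, start_lon, alt_m, n_fixes, rng):
    """Return spoofed fixes whose consecutive steps are 200-500 m toward the north-east."""
    lat, lon = start_lat, start_lon
    fixes = []
    for _ in range(n_fixes):
        step = rng.uniform(200.0, 500.0)
        bearing = math.radians(rng.uniform(0.0, 90.0))  # 0 = north, 90 = east
        lat += step * math.cos(bearing) / M_PER_DEG_LAT
        lon += step * math.sin(bearing) / (M_PER_DEG_LAT * math.cos(math.radians(lat)))
        fixes.append({"lat": lat, "lon": lon, "alt": alt_m,
                      "fix_type": rng.choice((3, 4)),
                      "hdop": round(rng.uniform(0.5, 2.5), 2)})
    return fixes


def step_m(a, b):
    """Distance between two fixes using the same flat-earth convention as the generator."""
    dlat = (b["lat"] - a["lat"]) * M_PER_DEG_LAT
    dlon = (b["lon"] - a["lon"]) * M_PER_DEG_LAT * math.cos(math.radians(b["lat"]))
    return math.hypot(dlat, dlon)
```

Passing a seeded `random.Random` keeps the track reproducible, in line with the determinism constraint on all injectors.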

**AC-5: multi_segment produces ≥3 disjoint blackout windows**
Given `multi_segment.py`
Then the output contains ≥3 blackout windows; consecutive windows are separated by ≥30 s of normal frames; total blackout coverage ≤ 25 % of the source duration (so the rest of the flight still exercises satellite-anchor recovery).
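
The three AC-5 conditions can be checked together. The following is a sketch under the assumption that windows are `(start_s, end_s)` tuples in seconds; `default_windows` places windows at 25 %, 50 %, and 75 % of the replay as in the Outcome section, and both function names are hypothetical.

```python
def default_windows(duration_s, window_s, fractions=(0.25, 0.50, 0.75)):
    """Blackout windows starting at the given fractions of the replay."""
    return [(f * duration_s, f * duration_s + window_s) for f in fractions]


def window_violations(windows, duration_s, min_count=3, min_gap_s=30.0, max_coverage=0.25):
    """Return a list of human-readable violations; an empty list means AC-5 holds."""
    ordered = sorted(windows)
    problems = []
    if len(ordered) < min_count:
        problems.append(f"only {len(ordered)} windows, need >= {min_count}")
    for (_, end_prev), (start_next, _) in zip(ordered, ordered[1:]):
        if start_next - end_prev < min_gap_s:
            problems.append(f"gap {start_next - end_prev:.1f} s < {min_gap_s} s")
    covered = sum(end - start for start, end in ordered)
    if covered > max_coverage * duration_s:
        problems.append(f"coverage {covered / duration_s:.0%} > {max_coverage:.0%}")
    return problems
```

For example, three 15 s windows on a 480 s replay leave 105 s gaps and cover about 9 % of the source, so the list of violations is empty.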

**AC-6: tmpfs auto-cleared at teardown**
Given a test using any injector completes (PASS or FAIL)
Then the injector's tmpfs scratch directory is removed within ≤2 s of teardown; subsequent tests start with empty tmpfs.
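
A sketch of the cleanup guarantee as a context manager, assuming the scratch layout `<base>/<run-id>/<scenario>/` from the Outcome section. The name `injector_scratch` and the `base` parameter are hypothetical; in the real suite the same body would sit inside a pytest fixture, with the `yield` marking the teardown point.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path
from typing import Optional


@contextlib.contextmanager
def injector_scratch(run_id: str, scenario: str, base: Optional[str] = None):
    """Create <base>/<run_id>/<scenario>/ and remove it on exit, even if the test fails."""
    root = Path(base or tempfile.gettempdir()) / run_id / scenario
    root.mkdir(parents=True, exist_ok=False)
    try:
        yield root
    finally:
        shutil.rmtree(root, ignore_errors=True)
        # Drop the per-run parent too when this was its last scenario.
        with contextlib.suppress(OSError):
            root.parent.rmdir()
```

The `finally` clause is what covers the "PASS or FAIL" wording of AC-6: an exception raised inside the `with` body still triggers removal.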

## System Under Test Boundary

The injectors only produce input data; they never replace, stub, or fake any SUT module. The blackout_spoof FC-inbound proxy is a pure pass-through with timed splice — it does not implement any SUT logic.

- No internal SUT module is imported by any injector.
- The FC-inbound proxy operates at the protocol level (MAVLink frame routing); it does not interpret SUT output.
- Geo-location of replacement crops uses the tile-cache fixture's manifest only (a public artifact); it does not query the SUT's tile lookup.
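
The "pass-through with timed splice" behaviour can be sketched as below. This is an illustration under stated assumptions, not the project's `fc_proxy.py`: the class name, the `is_gps` flag supplied by the caller, and the `spoof_source` callable are hypothetical, and no MAVLink parsing is shown. The commit message mentions a pre-activate `RuntimeError` guard, which the sketch mirrors.

```python
class TimedSpliceProxy:
    """Forward frames unchanged, except GPS frames inside configured windows."""

    def __init__(self, windows_ms, spoof_source):
        self._windows = sorted(windows_ms)  # [(start_ms, end_ms), ...] relative to activation
        self._spoof = spoof_source          # callable(t_ms) -> bytes of a spoofed GPS frame
        self._t0_ms = None

    def activate(self, t0_ms):
        """Pin the replay origin; windows are measured from this instant."""
        self._t0_ms = t0_ms

    def forward(self, now_ms, frame, is_gps):
        if self._t0_ms is None:
            raise RuntimeError("proxy used before activate()")
        t = now_ms - self._t0_ms
        in_window = any(start <= t < end for start, end in self._windows)
        if is_gps and in_window:
            return self._spoof(t)  # splice: replace only the GPS frame
        return frame               # everything else passes through untouched
```

Keeping the decision to a time comparison plus a frame-type flag means the proxy carries no estimation or filtering logic, which is the boundary this section requires.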

## Constraints

- Determinism: same `--seed` → identical overlay across runs (this is required for the regression detector in 40_csv_reporter_refinements).
- Tmpfs-only: injectors never write to persistent volumes.
- Coordinated timing for blackout_spoof: video-overlay and FC-inbound spoof must fire within ≤40 ms of each other (AC-NEW-8 / FT-N-04 pass criterion is "within ≤1 frame OR ≤40 ms").

## Document Dependencies

- `_docs/02_document/tests/test-data.md` § Seed Data Sets (synthetic-injection rows)
- `_docs/02_document/tests/blackbox-tests.md` (FT-N-01, FT-N-04, FT-P-08)
- `_docs/02_document/tests/resilience-tests.md` (NFT-RES-04 reuses blackout_spoof)
@@ -0,0 +1,84 @@
# FT-P-02 — Derkachi VIO drift between satellite anchors

**Task**: AZ-410_ft_p_02_derkachi_drift
**Name**: FT-P-02 cumulative drift between consecutive satellite-anchored fixes (AC-1.3)
**Description**: Implement the FT-P-02 scenario — full Derkachi replay; at each anchor frame compute drift between the propagated visual-only centre and the new satellite anchor centre; bin by `last_satellite_anchor_age_ms`; assert ≥95 % of anchor pairs satisfy drift bounds.
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407
**Component**: Blackbox Tests / Positive (epic AZ-262)
**Tracker**: AZ-410
**Epic**: AZ-262 (E-BBT)

## Problem

Cumulative drift between consecutive satellite anchors is the most direct measure of the project's onboard-VIO + IMU-fusion behavior in flight. AC-1.3 must be measured on the Derkachi fixture (the only available real flight) — without this scenario the project has no closed-loop validation of the drift budget.

## Outcome

- pytest scenario at `e2e/tests/positive/test_ft_p_02_derkachi_drift.py`.
- Replays the full Derkachi fixture (video at 30 fps + IMU CSV at 10 Hz, 3 video frames per IMU row).
- For each frame whose outbound estimate carries `source_label = satellite_anchored`: records (a) the propagated centre estimate of the prior visual-only segment, (b) the new anchor centre.
- Per-anchor-pair drift = `‖propagated_centre − next_anchor_centre‖`; binned by `last_satellite_anchor_age_ms`.
- CSV evidence: `e2e-results/run-${RUN_ID}/ft-p-02.csv` (one row per anchor pair).
- Aggregate pass criteria (per AC-1.3): ≥95 % of anchor pairs satisfy `drift < 100 m` (visual-only) AND `drift < 50 m` when CombinedImuFactor IMU fusion is active in C5.

## Scope

### Included

- Full-replay test method (~8 min replay + parsing).
- Anchor-pair detection from the outbound estimate stream (`source_label` transitions).
- Drift binning + aggregate assertion.
- CSV evidence emission.

### Excluded

- Synthetic outage / spoof injection — owned by FT-N-01..04.
- Sharp-turn-segment-specific assertions — owned by FT-P-07 / FT-N-02.
- Frame-by-frame inter-emit latency — owned by NFT-PERF-02.

## Acceptance Criteria

**AC-1: anchor-pair detection**
Given a full Derkachi replay
Then the test identifies every transition from `visual_propagated` (or `dead_reckoned`) → `satellite_anchored` and records the pair.
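
Transition detection of this kind reduces to a single pass over the outbound estimate stream. The sketch below assumes each estimate has already been parsed into a small record; the `Estimate` dataclass and `detect_anchor_pairs` are hypothetical names, not the contents of `anchor_pair_detector.py`.

```python
from dataclasses import dataclass

NON_ANCHOR = frozenset({"visual_propagated", "dead_reckoned"})


@dataclass(frozen=True)
class Estimate:
    t_ms: int
    lat: float
    lon: float
    source_label: str


def detect_anchor_pairs(estimates):
    """Return [(last_propagated_estimate, anchor_estimate), ...] in stream order."""
    pairs = []
    prev = None
    for est in estimates:
        became_anchored = est.source_label == "satellite_anchored"
        if prev is not None and prev.source_label in NON_ANCHOR and became_anchored:
            pairs.append((prev, est))
        prev = est
    return pairs
```

The first element of each pair is the last estimate emitted before the anchor frame, which is the position the drift is measured from. Consecutive `satellite_anchored` frames do not produce a pair because no propagated segment lies between them.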

**AC-2: drift bound (visual-only)**
Given anchor pairs whose preceding segment was visual-only (no IMU fusion active in C5)
Then ≥95 % of those pairs satisfy `drift < 100 m`.

**AC-3: drift bound (IMU-fused)**
Given anchor pairs whose preceding segment had CombinedImuFactor IMU fusion active in C5
Then ≥95 % of those pairs satisfy `drift < 50 m`.

**AC-4: drift distribution monotonic with anchor age**
Given drift bins by `last_satellite_anchor_age_ms` (e.g. {<1 s, 1-3 s, 3-10 s, 10-30 s, >30 s})
Then the bin medians grow monotonically with age; no anomalous spike (>2× median jump) between adjacent bins.
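
A possible shape for the AC-4 check, using the example bin edges above. Two interpretations are assumed and should be confirmed against `blackbox-tests.md`: empty bins are skipped, and a "spike" means a bin median more than 2× the median of the previous non-empty bin. The function names are hypothetical.

```python
import bisect
import statistics

BIN_EDGES_MS = [1_000, 3_000, 10_000, 30_000]  # <1 s, 1-3 s, 3-10 s, 10-30 s, >30 s


def bin_medians(samples):
    """samples: iterable of (age_ms, drift_m). Medians of non-empty bins, in age order."""
    bins = [[] for _ in range(len(BIN_EDGES_MS) + 1)]
    for age_ms, drift_m in samples:
        bins[bisect.bisect_right(BIN_EDGES_MS, age_ms)].append(drift_m)
    return [statistics.median(b) for b in bins if b]


def medians_ok(medians, max_ratio=2.0):
    """True when medians never decrease and no adjacent jump exceeds max_ratio."""
    adjacent = list(zip(medians, medians[1:]))
    non_decreasing = all(b >= a for a, b in adjacent)
    no_spike = all(b <= max_ratio * a for a, b in adjacent if a > 0)
    return non_decreasing and no_spike
```

`bisect_right` places an age of exactly 1 000 ms into the 1-3 s bin, so the first bin holds strictly sub-second ages.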

**AC-5: parameterization**
Given the conftest's `(fc_adapter, vio_strategy)` parameterization
Then the scenario runs once per parameterization and emits one row per parameterization in `report.csv`.

## System Under Test Boundary

End-to-end scenario through public boundaries.

- **Allowed inputs**: frame-source replay, IMU CSV replay through FC inbound proxy (passive).
- **Allowed observation**: SITL-side outbound message stream; FDR-side `source_label` transitions (FDR is a public artifact post-flight); `GLOBAL_POSITION_INT` GT from `data_imu.csv`.
- **Forbidden**: querying SUT internal C5 graph state, internal `source_label` state machine, or stubbing C5 / C8.
- If C8 outbound emission is not implemented, the scenario MUST fail (no fallback to a stubbed emit).

## Constraints

- Replay synchrony: 3 video frames per IMU row (per `test-data.md`).
- Drift is computed in metres in WGS84 via Vincenty; `propagated_centre` is the SUT's last-emitted position immediately before the anchor frame.
- IMU-fused-vs-visual-only classification: derived from FDR `source_label` history of the segment preceding each anchor (visual-only = entire segment was `visual_propagated` since the prior anchor; IMU-fused = at least one frame was `satellite_anchored` with active CombinedImuFactor — readable from FDR per AC-NEW-3 schema).

## Risks & Mitigation

**Risk: Derkachi fixture has too few satellite-anchor opportunities for statistical power**
- *Mitigation*: ≥95 % is the AC-1.3 budget; for a fixture with N anchors, the required passing count is ⌈0.95 · N⌉ (rounded up). The test reports the actual N, the pass count, and the percentage in the CSV, and flags low statistical power when N < 20.
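
The rounding rule is easiest to get right with integer arithmetic. The following is a minimal sketch, assuming drift values in metres have already been computed per anchor pair; `required_passes`, `summarize`, and the dictionary keys are hypothetical.

```python
def required_passes(n_pairs, pct=95):
    """Smallest pass count that reaches pct % of n_pairs (integer ceiling, no floats)."""
    return -(-n_pairs * pct // 100)


def summarize(drifts_m, bound_m, min_n_for_power=20):
    """Aggregate one drift class (visual-only or IMU-fused) against its bound."""
    n = len(drifts_m)
    passed = sum(1 for d in drifts_m if d < bound_m)
    return {
        "n": n,
        "passed": passed,
        "required": required_passes(n),
        "pass_pct": 100.0 * passed / n if n else 0.0,
        "ok": n > 0 and passed >= required_passes(n),
        "low_power": n < min_n_for_power,
    }
```

With N = 19 every pair must pass, while N = 20 tolerates exactly one failure. This is why a small N is reported as low power rather than silently accepted.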

## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-P-02
- `_docs/02_document/tests/test-data.md` § Position accuracy (FT-P-02 row)
- `_docs/00_problem/input_data/flight_derkachi/data_imu.csv` (GT columns `GLOBAL_POSITION_INT.*`)
@@ -0,0 +1,66 @@
# FT-P-03 + FT-P-14 — Estimate schema + WGS84 output coordinate system

**Task**: AZ-411_ft_p_03_14_schema_wgs84
**Name**: Estimate output schema + source-label semantics + WGS84 coordinate validation (AC-1.4, AC-4.3, AC-6.3)
**Description**: Combined coverage for FT-P-03 (output schema + source-label) and FT-P-14 (WGS84 coordinate validation). Both are small format/contract checks on the same outbound message.
**Complexity**: 2 points
**Dependencies**: AZ-406, AZ-407
**Component**: Blackbox Tests / Positive (epic AZ-262)
**Tracker**: AZ-411
**Epic**: AZ-262 (E-BBT)

## Problem

Two thin contract checks on the SUT's outbound message — schema completeness (AC-1.4 / AC-4.3) and WGS84 coordinate-range validity (AC-6.3) — must be exercised, but each scenario alone is too small for an independent task. They share the fixture and only differ in their assertion set.

## Outcome

- pytest scenario at `e2e/tests/positive/test_ft_p_03_14_schema_wgs84.py` with two test methods: `test_schema_and_source_label` (FT-P-03) and `test_wgs84_coordinate_range` (FT-P-14).
- Both methods push a single image (default `AD000001.jpg`) and read the resulting outbound message + the out-of-band source-label channel (`STATUSTEXT` or `NAMED_VALUE_FLOAT` per AC-4.3).
- AC-1 implements the `schema_match` rule on field presence + types; AC-2 implements the `set_contains` rule on the source label.

## Scope

### Included

- Schema match: `lat:float`, `lon:float`, `cov_semi_major_m:float`, `last_satellite_anchor_age_ms:int` present and well-typed in the outbound message; per AC-4.3 these fields ride either inside the `GPS_INPUT` / `MSP2_SENSOR_GPS` payload OR on a paired side-channel.
- Source-label set containment: the out-of-band label channel emits one of `{satellite_anchored, visual_propagated, dead_reckoned}`.
- WGS84 range check: `lat ∈ [-90, 90]`, `lon ∈ [-180, 180]`, scaled per protocol convention (AP `GPS_INPUT.lat/lon` are 1e-7 scaled int32, iNav `MSP2_SENSOR_GPS.lat/lon` likewise — the test parses correctly and checks the decoded float is in WGS84 bounds).

### Excluded

- Per-image accuracy — owned by FT-P-01 (AZ-409).
- Honest-covariance-vs-95%-confidence cross-check — owned by FT-N-04 (AZ-426) and AC-NEW-4 in NFT-RES-03.
- Signing handshake — owned by FT-P-09-AP (AZ-416).

## Acceptance Criteria

**AC-1: schema completeness**
Given any single outbound message with paired source-label channel
Then all of `lat:float`, `lon:float`, `cov_semi_major_m:float`, `last_satellite_anchor_age_ms:int` are present and parse to the documented types.

**AC-2: source-label set containment**
Given the source-label channel emission
Then the label is exactly one of `{satellite_anchored, visual_propagated, dead_reckoned}`.
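
AC-1 and AC-2 can both be written as small pure functions over an already-decoded message. The sketch assumes the outbound message and side-channel have been merged into a plain `dict`; the helper names are hypothetical, and the strict `type(...) is` comparison is a design choice of this sketch, not a documented rule.

```python
EXPECTED_FIELDS = {
    "lat": float,
    "lon": float,
    "cov_semi_major_m": float,
    "last_satellite_anchor_age_ms": int,
}
ALLOWED_LABELS = frozenset({"satellite_anchored", "visual_propagated", "dead_reckoned"})


def schema_violations(msg):
    """Return a list of problems; an empty list means the schema_match rule holds."""
    problems = []
    for name, expected in EXPECTED_FIELDS.items():
        if name not in msg:
            problems.append(f"missing field: {name}")
        elif type(msg[name]) is not expected:  # strict: bool is not int, int is not float
            problems.append(f"{name}: expected {expected.__name__}, got {type(msg[name]).__name__}")
    return problems


def label_ok(label):
    """set_contains rule: the label must be exactly one member of the allowed set."""
    return label in ALLOWED_LABELS
```

Returning a list of problems rather than a boolean lets the test report every missing or mistyped field in one failure message.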

**AC-3: WGS84 coordinate range**
Given the decoded lat/lon
Then `lat ∈ [-90, 90]` AND `lon ∈ [-180, 180]` AND the scaling factor matches the protocol convention (AP: 1e-7 scaled int32; iNav: per `MSP2_SENSOR_GPS` schema in `docs/SITL/SITL.md`).
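
The decode-then-check step for the 1e-7 scaled int32 convention might look like the sketch below. It assumes the raw four bytes are little-endian, as MAVLink payloads are on the wire; the byte order for the iNav path should be taken from the schema referenced above. `decode_e7` and `in_wgs84` are hypothetical names.

```python
import struct

E7_SCALE = 1e-7  # degrees per integer unit


def decode_e7(raw):
    """Decode a little-endian signed int32 holding degrees * 1e7 into float degrees."""
    (value,) = struct.unpack("<i", raw)
    return value * E7_SCALE


def in_wgs84(lat, lon):
    """True when the decoded pair lies inside the WGS84 coordinate ranges."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0
```

The range check doubles as a scaling check: a value that skipped the 1e-7 factor lands far outside ±90 / ±180 and fails immediately.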

**AC-4: parameterization**
Given the conftest's `(fc_adapter, vio_strategy)` parameterization
Then both test methods run for each parameterization.

## System Under Test Boundary

End-to-end through public boundaries; SITL-observed.

- **Allowed**: SITL receipt, mavproxy listener for STATUSTEXT/NAMED_VALUE_FLOAT.
- **Forbidden**: parsing SUT internal logs for the schema fields; the schema must be visible at the outbound boundary (or it's a real defect).

## Constraints

- The source-label side-channel mechanism is MAVLink `STATUSTEXT` or `NAMED_VALUE_FLOAT`, per AC-4.3. Both encodings are accepted by this test as long as the label arrives within ≤500 ms of the paired `GPS_INPUT` / `MSP2_SENSOR_GPS`.
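
Pairing a label with its position message under the 500 ms budget can be sketched as a nearest-neighbour lookup on timestamps. The spec does not say whether the label must follow the position message or may precede it, so this sketch assumes an absolute skew in either direction; `pair_labels` and its argument shapes are hypothetical.

```python
import bisect


def pair_labels(position_ts_ms, label_events, max_skew_ms=500):
    """Map each position timestamp to the nearest label within max_skew_ms, else None.

    position_ts_ms: timestamps of GPS_INPUT / MSP2_SENSOR_GPS messages.
    label_events:   [(t_ms, label), ...] sorted by t_ms.
    """
    label_ts = [t for t, _ in label_events]
    paired = {}
    for ts in position_ts_ms:
        i = bisect.bisect_left(label_ts, ts)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(label_ts)]
        best = min(candidates, key=lambda j: abs(label_ts[j] - ts), default=None)
        within = best is not None and abs(label_ts[best] - ts) <= max_skew_ms
        paired[ts] = label_events[best][1] if within else None
    return paired
```

A `None` entry marks a position message whose label was missing or late, which the test can then report as a contract failure.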

## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-P-03, § FT-P-14
- `_docs/02_document/tests/test-data.md` § Position accuracy (FT-P-03 row)