mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 12:01:13 +00:00
[AZ-406] Blackbox test harness bootstrap (Tier-1 + Tier-2 scaffold)

Bootstraps the public-boundary blackbox test harness owned by epic
AZ-262 (E-BBT). Establishes the e2e/ directory tree at the repo root,
fully separated from src/gps_denied_onboard/** and from the in-process
tests/** tree, and commits to the contracts every subsequent test
ticket (AZ-407..AZ-446) builds against.

Tier-1 (workstation Docker):
  - docker/docker-compose.test.yml wires SUT + ArduPilot SITL + iNav SITL
    + mock Suite Sat Service + mavproxy listener + e2e-runner onto one
    e2e-net bridge with internal: true (enforces RESTRICT-SAT-1 /
    NFT-SEC-02 egress isolation at the network layer).
  - docker/docker-compose.tier2-bridge.yml override disables the
    in-compose SUT so Tier-2 pairs SITLs + mock + runner on an x86 host
    while the SUT runs natively on the Jetson under systemd.

Tier-2 (Jetson):
  - jetson/run-tier2.sh + tier2.service systemd unit + tegrastats /
    jtop parsers feed per-sample telemetry into the evidence bundle.

Runner image (e2e/runner/):
  - Dockerfile + requirements.txt install ONLY ground-side libs
    (pymavlink, opencv-python>=4.12, numpy/scipy/geopy/pyproj, httpx,
    orjson, pydantic, structlog, pytest 8.x). The runner deliberately
    does NOT install the SUT package.
  - conftest.py implements the AC-9 skip-rule mapping (tier2_only,
    chamber_only, vins_mono, deferred_ac) tied to environment.md
    parametrize axes.
  - reporting/csv_reporter.py is a pytest plugin emitting one row per
    test with the exact 11-column schema from environment.md §
    Reporting (test_id, test_name, traces_to, fc_adapter, vio_strategy,
    tier, started_at_utc, execution_time_ms, result, error_message,
    evidence_paths). XFAIL surfaced only when a test carries
    @pytest.mark.deferred_ac(verdict="xfail", reason=...).
  - reporting/evidence_bundler.py exposes the attach_evidence fixture
    that copies per-test artifacts (.tlog, FDR archives, screenshots,
    tegrastats / jtop CSVs) into the run bundle and records relative
    paths into the reporter's evidence_paths column.
  - helpers/{frame_source_replay,imu_replay,sitl_observer,
    mavproxy_tlog_reader,fdr_reader}.py declare the public surfaces
    (concrete implementations owned by AZ-407 / AZ-408 / AZ-416 /
    AZ-417 / AZ-441 per the dependency table); helpers/geo.py ships
    today (no downstream task dep) — WGS84 distance / forward-bearing
    / offset via pyproj with NaN rejection.
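
To make the reporting contract concrete, here is a minimal sketch of writing rows against the 11-column schema listed above. Only the column names and their order come from this commit message; the `write_report` function and its strict missing-column check are assumptions for illustration, not the actual csv_reporter.py plugin.

```python
import csv
import io
from typing import Iterable, Mapping, TextIO

# Column order copied from the commit message (environment.md, Reporting).
COLUMNS = (
    "test_id", "test_name", "traces_to", "fc_adapter", "vio_strategy",
    "tier", "started_at_utc", "execution_time_ms", "result",
    "error_message", "evidence_paths",
)


def write_report(rows: Iterable[Mapping[str, object]], stream: TextIO) -> int:
    """Write one CSV row per test result; hypothetical helper, not the real plugin."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, extrasaction="raise")
    writer.writeheader()
    written = 0
    for row in rows:
        # DictWriter silently fills missing keys, so reject them explicitly.
        missing = [name for name in COLUMNS if name not in row]
        if missing:
            raise ValueError(f"row is missing columns: {missing}")
        writer.writerow(row)
        written += 1
    return written


# Usage: an all-empty row still produces the exact header.
buffer = io.StringIO()
write_report([{name: "" for name in COLUMNS}], buffer)
```

Rejecting missing and extra keys keeps schema drift visible at write time instead of producing ragged rows.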

Mock Suite Sat Service (e2e/fixtures/mock-suite-sat/):
  - FastAPI app: POST /tiles (ingest contract from D-PROJ-2 follow-up),
    GET /tiles/audit + /mock/audit (per-run read-back), POST
    /mock/config (force-status, response delay), POST /mock/reset
    (clears audit between tests), GET /mock/health.

Fixture scaffolds (e2e/fixtures/{tile-cache-builder, age-injector,
injectors, cold-boot, secrets, security}/):
  - Public surfaces only. Concrete builders land in AZ-407 (static
    fixtures), AZ-408 (runtime synthetic injection), AZ-419 (cold-boot
    fixture), AZ-439 (CVE-2025-53644 JPEG generator).

Test tree (e2e/tests/{positive,negative,performance,resilience,
security,resource_limit}/):
  - Mirror of the test-spec category grouping in
    _docs/02_document/tests/*-tests.md.
  - tests/positive/test_smoke.py is the AC-1 harness-boot smoke test, run
    inside the e2e-runner image once Docker brings everything up.

Out-of-container unit tests (e2e/_unit_tests/):
  - Exercises the harness internals (CSV reporter plugin lifecycle,
    conftest skip rules, helper modules, parsers, mock app, compose
    YAML structural contract, public-boundary enforcement) without
    Docker / SITL. 97 unit tests, all passing.

Build / config:
  - pyproject.toml: testpaths extended with e2e/_unit_tests; pythonpath
    extended with e2e; fastapi>=0.111,<0.120 added to dev extras for the
    mock-app TestClient unit test.

AC coverage:
  - AC-1 (Tier-1 boot)              → compose YAML test + directory layout
                                      + smoke test (Docker-bound)
  - AC-2 (mock services)            → 6 FastAPI TestClient unit tests
  - AC-3 (SITLs accept output)      → contract present; concrete check
                                      deferred to AZ-416 / AZ-417
  - AC-4 (CSV columns)              → in-process plugin lifecycle test
                                      emits the exact 11-column schema
  - AC-5 (egress isolation)         → static config test + runtime probe
                                      in Docker-bound smoke
  - AC-6 (Tier-2 contract)          → tegrastats + jtop parser unit tests
                                      + jetson/* layout test; full Tier-2
                                      contract is AZ-444
  - AC-7 (fixture reproducibility)  → deferred to AZ-407 per task spec
  - AC-8 (parametrize matrix)       → vins_mono skip-rule cases +
                                      tests/positive/test_smoke
  - AC-9 (skip semantics)           → 9 conftest skip-rule unit tests
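
The AC-9 skip semantics can be pictured as a pure function from a test's markers and the current run axes to an optional skip reason. The sketch below is an assumption about how such a mapping might look: the marker names come from the commit message, but the axis values ("tier1", "tier2", "vins_mono"), the `chamber` flag, and the exact rules are hypothetical, and deferred_ac is left out because the message says it drives XFAIL rather than a skip.

```python
from typing import Optional, Set


def skip_reason(
    markers: Set[str],
    tier: str,
    vio_strategy: str,
    chamber: bool = False,
) -> Optional[str]:
    """Return why a test should be skipped, or None to run it (hypothetical rules)."""
    if "tier2_only" in markers and tier != "tier2":
        return "tier2_only: requires the Jetson Tier-2 run"
    if "chamber_only" in markers and not chamber:
        return "chamber_only: requires the environmental chamber"
    if "vins_mono" in markers and vio_strategy != "vins_mono":
        return "vins_mono: only meaningful for the vins_mono VIO strategy"
    return None


# Usage: a Tier-2-only test is skipped on a workstation run.
reason = skip_reason({"tier2_only"}, tier="tier1", vio_strategy="vins_mono")
```

Keeping the decision in a plain function makes it unit-testable without starting pytest; a conftest hook would then only translate a non-None result into a skip.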

The module layout entry for blackbox_tests was added in the 2026-05-16
preparatory commit d7a17a8 so this diff stays focused on the harness
scaffold. AZ-406 advances to In Testing on commit.

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,13 @@
"""Public-boundary helper modules used by every blackbox test.

Modules:
    frame_source_replay  — replay images/video to the SUT's V4L2 file source
    imu_replay           — replay `data_imu.csv` at 10 Hz to the FC inbound
    sitl_observer        — AP/iNav read-side observers (param reads, GPS_RAW_INT, MSP queries)
    mavproxy_tlog_reader — parse `.tlog` files emitted by `mavproxy-listener`
    fdr_reader           — post-run filesystem read of the FDR archive
    geo                  — Vincenty / WGS84 geodesic helpers

These modules MUST NOT import from `gps_denied_onboard.*`. Public-boundary
discipline is enforced by `e2e/_unit_tests/test_no_sut_imports.py`.
"""
@@ -0,0 +1,59 @@
"""Post-run filesystem read of the FDR archive.

The FDR archive is a line-delimited JSON record stream per AZ-272 / AZ-273.
Each line is an `FdrRecord` envelope (producer_id, type, monotonic_ms,
payload). The runner image must NEVER import the SUT's FdrRecord schema
directly — it parses the JSON bytes and validates against a duplicate
record-type allowlist baked into this module.

Public surface only; concrete parser + assertion helpers are owned by
AZ-441 (NFT-LIM-02 — FDR size budget) and the resilience scenario tasks
that need to crawl the archive (AZ-432, AZ-433, AZ-435).
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Iterator


@dataclass(frozen=True)
class FdrRecord:
    """Mirror of `gps_denied_onboard.fdr_client.records.FdrRecord` — public-boundary copy.

    The schema is duplicated intentionally; if the SUT's FDR schema evolves
    in a breaking way, this duplicate file fails to parse (visible drift)
    rather than silently following along.
    """

    producer_id: str
    monotonic_ms: int
    record_type: str
    payload: dict[str, object]


def iter_records(fdr_archive_root: Path) -> Iterator[FdrRecord]:
    """Iterate every FDR record in the archive root (ordered by monotonic_ms).

    Raises NotImplementedError until AZ-441 supplies the orjson-backed parser.
    """
    raise NotImplementedError(
        "fdr_reader.iter_records is owned by AZ-441 — AZ-406 supplies only "
        "the public surface."
    )


def archive_size_bytes(fdr_archive_root: Path) -> int:
    """Sum the size of every file under ``fdr_archive_root``.

    Concrete implementation here — it's a thin os.walk + stat loop that
    NFT-LIM-02 needs as soon as a real archive lands.
    """
    if not fdr_archive_root.exists():
        return 0
    total = 0
    for p in fdr_archive_root.rglob("*"):
        if p.is_file():
            total += p.stat().st_size
    return total
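
As an illustration of the contract that `iter_records` commits to, the following sketch parses line-delimited JSON envelopes with the standard-library `json` module, checks each record type against an allowlist, and yields records ordered by `monotonic_ms`. It is not the AZ-441 parser (which the docstring says will be orjson-backed); the `*.jsonl` file pattern, the allowlist contents, and the name `iter_records_sketch` are assumptions made for the example.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterator, List

# Hypothetical allowlist; the real one is "baked into" fdr_reader by AZ-441.
ALLOWED_TYPES = frozenset({"heartbeat", "position"})


@dataclass(frozen=True)
class FdrRecord:
    producer_id: str
    monotonic_ms: int
    record_type: str
    payload: Dict[str, object]


def iter_records_sketch(root: Path) -> Iterator[FdrRecord]:
    """Yield every record under root, ordered by monotonic_ms."""
    records: List[FdrRecord] = []
    # Assumption: the archive stores its streams as *.jsonl files.
    for path in sorted(root.rglob("*.jsonl")):
        with path.open("r", encoding="utf-8") as handle:
            for line_no, line in enumerate(handle, start=1):
                if not line.strip():
                    continue  # tolerate blank lines between records
                raw = json.loads(line)
                if raw["type"] not in ALLOWED_TYPES:
                    raise ValueError(
                        f"{path}:{line_no}: unknown record type {raw['type']!r}"
                    )
                records.append(
                    FdrRecord(
                        producer_id=str(raw["producer_id"]),
                        monotonic_ms=int(raw["monotonic_ms"]),
                        record_type=str(raw["type"]),  # envelope key is "type"
                        payload=dict(raw["payload"]),
                    )
                )
    records.sort(key=lambda record: record.monotonic_ms)
    yield from records
```

Failing loudly on an unknown record type is what makes schema drift visible, which is the stated reason for duplicating the schema on the runner side.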
@@ -0,0 +1,77 @@
"""Replay images / video to the SUT's V4L2 file frame source.

Two replay modes:
1. Image-set replay (FT-P-01, FT-P-05) — emit a sequence of JPEG / PNG
   still images at a configurable rate to the file frame source path the
   SUT polls.
2. Video replay (FT-P-02, FT-P-04, FT-N-01..04, NFT-PERF-*) — decode an
   MP4 with OpenCV and emit frames at the encoded FPS (or a user-supplied
   rate for fast-forward).

The actual frame-source path inside the SUT container is configured via the
``ONBOARD_FRAME_SOURCE_PATH`` environment variable on the SUT — the runner
writes to a shared tmpfs volume mounted at the same path inside both
containers.

This file currently provides the public surface used by per-scenario tests;
concrete implementations land alongside their consuming test tasks
(AZ-407 onward). The intent is that `FrameSourceReplayer` is a stable API
the test specs can rely on while the underlying replay strategy is filled
in incrementally.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


@dataclass(frozen=True)
class ReplayCadence:
    """Frame-rate / pace configuration for a replay session."""

    fps: float = 10.0
    realtime: bool = True


class FrameSink(Protocol):
    """Abstract destination for replayed frames (file path or memory queue)."""

    def write_frame(self, jpeg_bytes: bytes, timestamp_ms: int) -> None:
        ...


class FrameSourceReplayer:
    """Public surface for replaying frames into the SUT's frame-source path.

    AZ-407 (Static fixture builders) supplies the concrete still-image replay
    implementation; AZ-408 (Runtime synthetic-injection) supplies the video
    + injector variants. AZ-406 only commits to the contract.
    """

    def __init__(self, sink: FrameSink, cadence: ReplayCadence | None = None) -> None:
        self._sink = sink
        self._cadence = cadence or ReplayCadence()

    def replay_image_directory(self, directory: Path) -> int:
        """Replay every image in ``directory`` (sorted by name). Returns count emitted.

        Raises NotImplementedError until AZ-407 lands. Tests that need this
        path should mark themselves @pytest.mark.skip(reason="awaiting AZ-407")
        until then; AC-1 (smoke) does not depend on this surface.
        """
        raise NotImplementedError(
            "FrameSourceReplayer.replay_image_directory is owned by AZ-407 — "
            "AZ-406 supplies only the public surface."
        )

    def replay_video(self, video_path: Path) -> int:
        """Replay an MP4 / .h264 file frame-by-frame. Returns count emitted.

        Raises NotImplementedError until AZ-408 lands.
        """
        raise NotImplementedError(
            "FrameSourceReplayer.replay_video is owned by AZ-408 — "
            "AZ-406 supplies only the public surface."
        )
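
To show how the `FrameSink` protocol might be satisfied, here is a sketch of an in-memory sink plus a non-realtime image-directory replay. Both `ListSink` and `replay_jpeg_directory` are hypothetical names; the sketch passes JPEG bytes through untouched, ignores PNG inputs (which would need transcoding), and derives timestamps from the frame index rather than sleeping, so it is only an approximation of what AZ-407 may implement.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Tuple


@dataclass
class ListSink:
    """In-memory sink that structurally matches FrameSink.write_frame."""

    frames: List[Tuple[int, bytes]] = field(default_factory=list)

    def write_frame(self, jpeg_bytes: bytes, timestamp_ms: int) -> None:
        self.frames.append((timestamp_ms, jpeg_bytes))


def replay_jpeg_directory(directory: Path, sink: ListSink, fps: float = 10.0) -> int:
    """Emit every .jpg/.jpeg file in name order; returns the count emitted."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    paths = sorted(
        p for p in directory.iterdir() if p.suffix.lower() in {".jpg", ".jpeg"}
    )
    for index, path in enumerate(paths):
        # Synthetic timeline: frame N is stamped at N / fps seconds.
        sink.write_frame(path.read_bytes(), round(index * 1000.0 / fps))
    return len(paths)
```

An in-memory sink like this lets the replay ordering and cadence be unit-tested without the shared tmpfs volume.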
@@ -0,0 +1,54 @@
"""WGS84 geodesic helpers — Vincenty distance + bearing for accuracy assertions.

Wraps `pyproj.Geod` (WGS84 ellipsoid) for the few operations the blackbox
tests need. Kept deliberately small — broader geo math (UTM, MGRS, datum
conversions) is NOT in scope for the e2e harness.

All inputs are degrees lat / lon (WGS84); all distances are meters.
"""

from __future__ import annotations

from dataclasses import dataclass

from pyproj import Geod

_WGS84 = Geod(ellps="WGS84")


@dataclass(frozen=True)
class GeodeticDelta:
    """Bearing + distance + back-bearing between two WGS84 points."""

    distance_m: float
    forward_bearing_deg: float
    reverse_bearing_deg: float


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Vincenty distance in meters between two WGS84 points.

    Raises ValueError on NaN inputs (defensive — silent NaN propagation in
    a test assertion is the kind of bug this helper exists to prevent).
    """
    for name, value in (("lat1", lat1), ("lon1", lon1), ("lat2", lat2), ("lon2", lon2)):
        if value != value:  # NaN check
            raise ValueError(f"distance_m: {name} is NaN")
    _, _, d = _WGS84.inv(lon1, lat1, lon2, lat2)
    return float(d)


def delta(lat1: float, lon1: float, lat2: float, lon2: float) -> GeodeticDelta:
    """Full geodetic delta: distance + forward/reverse bearings."""
    fwd_az, rev_az, d = _WGS84.inv(lon1, lat1, lon2, lat2)
    return GeodeticDelta(
        distance_m=float(d),
        forward_bearing_deg=float(fwd_az),
        reverse_bearing_deg=float(rev_az),
    )


def offset(lat: float, lon: float, bearing_deg: float, distance_m: float) -> tuple[float, float]:
    """Project ``(lat, lon)`` by ``distance_m`` along ``bearing_deg`` (degrees CW from north)."""
    new_lon, new_lat, _ = _WGS84.fwd(lon, lat, bearing_deg, distance_m)
    return float(new_lat), float(new_lon)
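
A coarse, dependency-free cross-check for the helper above can be handy in unit tests. The sketch below uses the spherical haversine formula with the same NaN-rejection idea; it is not the module's method (`pyproj.Geod` solves the geodesic on the WGS84 ellipsoid through PROJ's geodesic routines), so the two should agree only to within a fraction of a percent. The function name and the mean-radius constant are choices made for this example.

```python
import math

MEAN_EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius, spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters on a sphere; inputs in degrees."""
    for name, value in (("lat1", lat1), ("lon1", lon1), ("lat2", lat2), ("lon2", lon2)):
        if math.isnan(value):
            raise ValueError(f"haversine_m: {name} is NaN")
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2.0) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2.0) ** 2
    )
    # Clamp guards against rounding pushing a marginally above 1.
    return 2.0 * MEAN_EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


# Usage: one degree of longitude along the equator.
one_degree_m = haversine_m(0.0, 0.0, 0.0, 1.0)
```

In a test, the spherical value can bound the ellipsoidal one loosely (for example, a relative tolerance of about one percent) to catch swapped lat/lon arguments.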
@@ -0,0 +1,53 @@
"""Replay `data_imu.csv` to the FC inbound at 10 Hz.

CSV schema (from `_docs/00_problem/input_data/flight_derkachi/data_imu.csv`):
    timestamp_ms,ax,ay,az,gx,gy,gz,roll_deg,pitch_deg,yaw_deg,baro_m

Owned by AZ-406 (public surface) + AZ-407 (concrete file-driver
implementation). This module commits to the type signatures the
per-scenario tests will import; the actual MAVLink / MSP2 emission is
wired up by the downstream task.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


@dataclass(frozen=True)
class ImuSample:
    """One row of `data_imu.csv` after parsing into native units."""

    timestamp_ms: int
    accel_mss: tuple[float, float, float]
    gyro_rps: tuple[float, float, float]
    attitude_rad: tuple[float, float, float]  # roll, pitch, yaw (radians)
    baro_alt_m: float


class FcInboundEmitter(Protocol):
    """Abstract emitter — concrete impls are MAVLink (AP) or MSP2 (iNav)."""

    def emit(self, sample: ImuSample) -> None:
        ...


class ImuReplayer:
    """Drives an `FcInboundEmitter` from a CSV file at the recorded cadence."""

    def __init__(self, emitter: FcInboundEmitter, rate_hz: float = 10.0) -> None:
        self._emitter = emitter
        self._rate_hz = rate_hz

    def replay(self, csv_path: Path) -> int:
        """Replay the CSV file. Returns the number of samples emitted.

        Concrete implementation is owned by AZ-407 (FT-P-02 derkachi-drift
        + FT-P-04 frame-to-frame registration are the first consumers).
        """
        raise NotImplementedError(
            "ImuReplayer.replay is owned by AZ-407 — AZ-406 supplies only "
            "the public surface."
        )
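
The CSV-to-`ImuSample` mapping implied by the schema can be sketched as below. The column names are taken from the docstring; the assumption that `ax..az` are already in m/s² and `gx..gz` in rad/s is not confirmed by the source, and `parse_imu_csv` is a hypothetical helper rather than the AZ-407 implementation. The only conversion performed is degrees to radians for the attitude columns.

```python
import csv
import io
import math
from dataclasses import dataclass
from typing import Iterator, TextIO, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ImuSample:
    timestamp_ms: int
    accel_mss: Vec3
    gyro_rps: Vec3
    attitude_rad: Vec3  # roll, pitch, yaw (radians)
    baro_alt_m: float


def parse_imu_csv(stream: TextIO) -> Iterator[ImuSample]:
    """Yield one ImuSample per CSV row, in file order."""
    for row in csv.DictReader(stream):
        yield ImuSample(
            timestamp_ms=int(row["timestamp_ms"]),
            # Assumption: accelerometer columns are already in m/s^2.
            accel_mss=(float(row["ax"]), float(row["ay"]), float(row["az"])),
            # Assumption: gyro columns are already in rad/s.
            gyro_rps=(float(row["gx"]), float(row["gy"]), float(row["gz"])),
            attitude_rad=(
                math.radians(float(row["roll_deg"])),
                math.radians(float(row["pitch_deg"])),
                math.radians(float(row["yaw_deg"])),
            ),
            baro_alt_m=float(row["baro_m"]),
        )


# Usage with an in-memory file.
_header = "timestamp_ms,ax,ay,az,gx,gy,gz,roll_deg,pitch_deg,yaw_deg,baro_m\n"
samples = list(parse_imu_csv(io.StringIO(_header + "100,0.1,0.2,9.8,0,0,0.01,180,0,-90,12.5\n")))
```

Separating parsing from emission keeps the replayer's timing loop independent of the file format.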
@@ -0,0 +1,48 @@
"""Parse `.tlog` files emitted by `mavproxy-listener`.

`.tlog` is the standard MAVLink telemetry-log dump format: each message is an
8-byte big-endian unix-microsecond timestamp followed by the wire bytes of
the MAVLink frame. pymavlink ships `mavlogfile` which knows how to iterate this.

This module exposes a small typed wrapper so per-scenario tests can:
1. Filter for the message types they care about.
2. Compute summary statistics (count per type, message-rate Hz, ratio
   of signed vs unsigned messages for NFT-SEC-03).
3. Attach the source `.tlog` path to the evidence bundler.

Concrete iteration logic is owned by AZ-416 (FT-P-09-AP); AZ-406 commits
to the public surface.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Iterator


@dataclass(frozen=True)
class TlogMessage:
    timestamp_us: int
    msg_type: str
    signed: bool
    fields: dict[str, object]


def iter_messages(tlog_path: Path) -> Iterator[TlogMessage]:
    """Iterate `.tlog` messages oldest-first.

    AZ-406 raises until AZ-416 fills in the pymavlink-backed iterator.
    """
    raise NotImplementedError(
        "mavproxy_tlog_reader.iter_messages is owned by AZ-416 — "
        "AZ-406 supplies only the public surface."
    )


def count_by_type(tlog_path: Path) -> dict[str, int]:
    """Return ``{msg_type: count}`` for every distinct message type."""
    counts: dict[str, int] = {}
    for msg in iter_messages(tlog_path):
        counts[msg.msg_type] = counts.get(msg.msg_type, 0) + 1
    return counts
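
The `.tlog` framing itself is simple enough to sketch without pymavlink, which helps when reasoning about the signed-versus-unsigned ratio. The example walks a byte buffer assuming an 8-byte big-endian microsecond timestamp before each frame, then reads the standard MAVLink v1 (0xFE) or v2 (0xFD) header to find the frame length, numeric message ID, and signing flag. It does not validate checksums or decode payloads, and `RawFrame` / `iter_raw_frames` are hypothetical names; the AZ-416 iterator is expected to rely on pymavlink instead.

```python
import struct
from typing import Iterator, NamedTuple


class RawFrame(NamedTuple):
    timestamp_us: int
    msg_id: int
    signed: bool


def iter_raw_frames(data: bytes) -> Iterator[RawFrame]:
    """Walk timestamp-prefixed MAVLink frames in a tlog byte buffer."""
    pos = 0
    while pos + 10 <= len(data):  # timestamp + magic + length byte at minimum
        (timestamp_us,) = struct.unpack_from(">Q", data, pos)
        frame = pos + 8
        magic, payload_len = data[frame], data[frame + 1]
        if magic == 0xFE:  # MAVLink v1: 6-byte header
            header_len = 6
        elif magic == 0xFD:  # MAVLink v2: 10-byte header
            header_len = 10
        else:
            raise ValueError(f"unexpected magic 0x{magic:02X} at offset {frame}")
        if frame + header_len > len(data):
            break  # truncated tail
        if magic == 0xFE:
            signed = False
            msg_id = data[frame + 5]
        else:
            signed = bool(data[frame + 2] & 0x01)  # incompat flag: signed
            msg_id = int.from_bytes(data[frame + 7 : frame + 10], "little")
        # header + payload + 2-byte checksum (+ 13-byte signature if signed)
        total = header_len + payload_len + 2 + (13 if signed else 0)
        if frame + total > len(data):
            break  # truncated tail
        yield RawFrame(timestamp_us, msg_id, signed)
        pos = frame + total
```

Counting `signed` over the yielded frames gives the kind of ratio NFT-SEC-03 refers to, under the assumptions stated above.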
@@ -0,0 +1,59 @@
"""ArduPilot Plane / iNav SITL state-read observers.

Reads what the SUT delivered to the FC over its external-positioning
interface, without ever bypassing the FC's own acceptance path. This is
the only legal way for blackbox tests to assert AC-4.3 (FC output contract):
every assertion goes through the SITL's state machine.

Public surface only; concrete pymavlink / yamspy / msp_gps_toy subprocess
plumbing is owned by AZ-416 (FT-P-09-AP) and AZ-417 (FT-P-09-iNav).
"""

from __future__ import annotations

from dataclasses import dataclass
from typing import Literal, Protocol

FcKind = Literal["ardupilot", "inav"]


@dataclass(frozen=True)
class FcGpsState:
    """The subset of FC state the e2e tests assert against.

    AP: assembled from EKF source-set + GLOBAL_POSITION_INT replay-back.
    iNav: assembled from MSP2 GPS-provider state + getRawGPS query.
    """

    primary_source: str  # "MAV" (AP gps_type=14) or "MSP" (iNav)
    last_position_lat_deg: float
    last_position_lon_deg: float
    last_position_alt_m: float
    fix_quality: int  # 0..6 per NMEA convention
    horizontal_accuracy_m: float
    last_update_age_ms: int


class FcSitlObserver(Protocol):
    """Common observer protocol — implemented by `ArduPilotObserver` + `InavObserver`."""

    fc_kind: FcKind

    def read_gps_state(self) -> FcGpsState:
        ...

    def read_parameter(self, name: str) -> float | int | str | None:
        ...


def get_observer(fc_kind: FcKind, host: str) -> FcSitlObserver:
    """Factory — returns the matching observer for the requested FC.

    AZ-416/417 own the concrete return types. AZ-406 raises until those
    tasks land so test authors can plumb the observer through their
    fixtures without yet running them.
    """
    raise NotImplementedError(
        f"sitl_observer.get_observer({fc_kind=}, {host=}) is owned by "
        "AZ-416 (AP) / AZ-417 (iNav) — AZ-406 supplies only the contract."
    )
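
Until the concrete observers land, test authors could plumb a stand-in through their fixtures. The sketch below shows one possible fake that structurally satisfies the observer protocol, together with a freshness assertion built on `last_update_age_ms`. `FakeObserver`, `assert_fresh_fix`, the parameter table, and all sample values are hypothetical; the dataclass is a local copy so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union

ParamValue = Union[float, int, str]


@dataclass(frozen=True)
class FcGpsState:
    primary_source: str
    last_position_lat_deg: float
    last_position_lon_deg: float
    last_position_alt_m: float
    fix_quality: int
    horizontal_accuracy_m: float
    last_update_age_ms: int


@dataclass
class FakeObserver:
    """Canned-answer observer; matches the protocol by shape, not inheritance."""

    fc_kind: str
    state: FcGpsState
    params: Dict[str, ParamValue] = field(default_factory=dict)

    def read_gps_state(self) -> FcGpsState:
        return self.state

    def read_parameter(self, name: str) -> Optional[ParamValue]:
        return self.params.get(name)  # None when the parameter is unknown


def assert_fresh_fix(observer: FakeObserver, max_age_ms: int) -> FcGpsState:
    """Fail if the FC has not accepted a position update recently enough."""
    state = observer.read_gps_state()
    if state.last_update_age_ms > max_age_ms:
        raise AssertionError(
            f"{observer.fc_kind}: last fix is {state.last_update_age_ms} ms old"
        )
    return state


# Usage with illustrative values only.
fake = FakeObserver(
    fc_kind="ardupilot",
    state=FcGpsState("MAV", 50.0, 36.0, 120.0, 3, 2.5, 80),
    params={"GPS_TYPE": 14},
)
```

Because the protocol is structural, swapping the fake for a real observer later should not require changes to tests that depend only on these two methods.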