[AZ-406] Blackbox test harness bootstrap (Tier-1 + Tier-2 scaffold)

Bootstraps the public-boundary blackbox test harness owned by epic AZ-262 (E-BBT). Establishes the e2e/ directory tree at the repo root, fully separated from src/gps_denied_onboard/** and from the in-process tests/** tree, and commits to the contracts every subsequent test ticket (AZ-407..AZ-446) builds against.

Tier-1 (workstation Docker):
- docker/docker-compose.test.yml wires SUT + ArduPilot SITL + iNav SITL + mock Suite Sat Service + mavproxy listener + e2e-runner onto one e2e-net bridge with internal: true (enforces RESTRICT-SAT-1 / NFT-SEC-02 egress isolation at the network layer).
- docker/docker-compose.tier2-bridge.yml override disables the in-compose SUT so Tier-2 pairs SITLs + mock + runner on an x86 host while the SUT runs natively on the Jetson under systemd.

Tier-2 (Jetson):
- jetson/run-tier2.sh + tier2.service systemd unit + tegrastats / jtop parsers feed per-sample telemetry into the evidence bundle.

Runner image (e2e/runner/):
- Dockerfile + requirements.txt install ONLY ground-side libs (pymavlink, opencv-python>=4.12, numpy/scipy/geopy/pyproj, httpx, orjson, pydantic, structlog, pytest 8.x). The runner deliberately does NOT install the SUT package.
- conftest.py implements the AC-9 skip-rule mapping (tier2_only, chamber_only, vins_mono, deferred_ac) tied to environment.md parametrize axes.
- reporting/csv_reporter.py is a pytest plugin emitting one row per test with the exact 11-column schema from environment.md § Reporting (test_id, test_name, traces_to, fc_adapter, vio_strategy, tier, started_at_utc, execution_time_ms, result, error_message, evidence_paths). XFAIL is surfaced only when a test carries @pytest.mark.deferred_ac(verdict="xfail", reason=...).
- reporting/evidence_bundler.py exposes the attach_evidence fixture that copies per-test artifacts (.tlog, FDR archives, screenshots, tegrastats / jtop CSVs) into the run bundle and records relative paths into the reporter's evidence_paths column.
- helpers/{frame_source_replay,imu_replay,sitl_observer,mavproxy_tlog_reader,fdr_reader}.py declare the public surfaces (concrete implementations owned by AZ-407 / AZ-408 / AZ-416 / AZ-417 / AZ-441 per the dependency table); helpers/geo.py ships today (no downstream task dep): WGS84 distance / forward-bearing / offset via pyproj with NaN rejection.

Mock Suite Sat Service (e2e/fixtures/mock-suite-sat/):
- FastAPI app: POST /tiles (ingest contract from D-PROJ-2 follow-up), GET /tiles/audit + /mock/audit (per-run read-back), POST /mock/config (force-status, response delay), POST /mock/reset (clears audit between tests), GET /mock/health.

Fixture scaffolds (e2e/fixtures/{tile-cache-builder, age-injector, injectors, cold-boot, secrets, security}/):
- Public surfaces only. Concrete builders land in AZ-407 (static fixtures), AZ-408 (runtime synthetic injection), AZ-419 (cold-boot fixture), AZ-439 (CVE-2025-53644 JPEG generator).

Test tree (e2e/tests/{positive,negative,performance,resilience,security,resource_limit}/):
- Mirror of the test-spec category grouping in _docs/02_document/tests/*-tests.md.
- tests/positive/test_smoke.py is the AC-1 harness-boot smoke run inside the e2e-runner image once Docker brings everything up.

Out-of-container unit tests (e2e/_unit_tests/):
- Exercises the harness internals (CSV reporter plugin lifecycle, conftest skip rules, helper modules, parsers, mock app, compose YAML structural contract, public-boundary enforcement) without Docker / SITL. 97 unit tests, all passing.

Build / config:
- pyproject.toml: testpaths extended with e2e/_unit_tests; pythonpath extended with e2e; fastapi>=0.111,<0.120 added to dev extras for the mock-app TestClient unit test.

AC coverage:
- AC-1 (Tier-1 boot) → compose YAML test + directory layout + smoke test (Docker-bound)
- AC-2 (mock services) → 6 FastAPI TestClient unit tests
- AC-3 (SITLs accept output) → contract present; concrete check deferred to AZ-416 / AZ-417
- AC-4 (CSV columns) → in-process plugin lifecycle test emits the exact 11-column schema
- AC-5 (egress isolation) → static config test + runtime probe in Docker-bound smoke
- AC-6 (Tier-2 contract) → tegrastats + jtop parser unit tests + jetson/* layout test; full Tier-2 contract is AZ-444
- AC-7 (fixture reproducibility) → deferred to AZ-407 per task spec
- AC-8 (parametrize matrix) → vins_mono skip-rule cases + tests/positive/test_smoke
- AC-9 (skip semantics) → 9 conftest skip-rule unit tests

Module layout entry for blackbox_tests was added in 2026-05-16 preparatory commit d7a17a8 so this diff stays focused on the harness scaffold.

AZ-406 advances to In Testing on commit.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-22 10:21:13 +00:00 · 2026-05-16 16:22:44 +03:00
parent d7a17a8248
commit 59d9116d36
72 changed files with 3515 additions and 6 deletions
@@ -0,0 +1,6 @@
+"""Unit tests for the blackbox harness internals.
+
+These tests run in the project's main pytest suite (extended `testpaths`).
+They MUST NOT require Docker, SITL, or any external service. Anything that
+needs a real container belongs under `e2e/tests/` instead.
+"""
@@ -0,0 +1,15 @@
+"""Local conftest for the harness internals unit tests.
+
+Adds `e2e/` to sys.path so the unit tests can `from runner.helpers.geo import ...`
+without forcing the project's main pyproject `pythonpath` to include another
+src tree.
+"""
+
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+
+_E2E_ROOT = Path(__file__).resolve().parents[1]
+if str(_E2E_ROOT) not in sys.path:
+    sys.path.insert(0, str(_E2E_ROOT))
@@ -0,0 +1,83 @@
+"""Syntactic / structural checks on docker-compose.test.yml.
+
+We can't run `docker compose config` in a unit test (no Docker), but we
+can load the YAML and assert the structural invariants AZ-406 commits to:
+
+    - All required service names are present.
+    - `e2e-net.internal` is `true` (RESTRICT-SAT-1 / NFT-SEC-02).
+    - The e2e-runner consumes the required volumes for input data,
+      fixtures, fdr-output read-only, tlog-output read-only, results.
+    - The mavlink_passkey secret is wired.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import yaml
+
+COMPOSE_FILE = Path(__file__).resolve().parents[2] / "docker" / "docker-compose.test.yml"
+
+
+def _load_compose() -> dict:
+    return yaml.safe_load(COMPOSE_FILE.read_text(encoding="utf-8"))
+
+
+def test_required_services_present() -> None:
+    cfg = _load_compose()
+    services = cfg["services"]
+    for name in (
+        "gps-denied-onboard",
+        "ardupilot-plane-sitl",
+        "inav-sitl",
+        "mock-suite-sat-service",
+        "mavproxy-listener",
+        "e2e-runner",
+    ):
+        assert name in services, f"docker-compose missing service: {name}"
+
+
+def test_e2e_net_is_internal() -> None:
+    cfg = _load_compose()
+    assert cfg["networks"]["e2e-net"]["internal"] is True, (
+        "RESTRICT-SAT-1 / NFT-SEC-02 violation: e2e-net must be internal=true"
+    )
+
+
+def test_runner_mounts_required_paths() -> None:
+    cfg = _load_compose()
+    runner = cfg["services"]["e2e-runner"]
+    volumes_text = "\n".join(runner["volumes"])
+    for required in (
+        "/test-data:ro",
+        "/expected:ro",
+        "/test-fixtures:ro",
+        "/test-suite:ro",
+        "/fdr:ro",
+        "/tlogs:ro",
+        "/e2e-results",
+        "/mock-audit:ro",
+    ):
+        assert required in volumes_text, (
+            f"e2e-runner must mount {required}; current volumes:\n{volumes_text}"
+        )
+
+
+def test_mavlink_passkey_secret_wired() -> None:
+    cfg = _load_compose()
+    secrets = cfg.get("secrets", {})
+    assert "mavlink_passkey" in secrets, "Top-level secrets must include mavlink_passkey"
+    sut = cfg["services"]["gps-denied-onboard"]
+    assert "mavlink_passkey" in [
+        s if isinstance(s, str) else s.get("source", "") for s in sut.get("secrets", [])
+    ], "gps-denied-onboard must declare the mavlink_passkey secret"
+
+
+def test_fdr_output_volume_size_cap_present() -> None:
+    """AC-NEW-3 — the FDR volume must have a size cap declared (belt-and-suspenders)."""
+    cfg = _load_compose()
+    fdr_vol = cfg["volumes"]["fdr-output"]
+    opts = fdr_vol.get("driver_opts", {})
+    assert "size" in opts.get("o", ""), (
+        "fdr-output volume must declare a size cap (AC-NEW-3 belt-and-suspenders)"
+    )
@@ -0,0 +1,62 @@
+"""Unit tests for the injector public surfaces.
+
+AZ-406 commits to the type signatures + the NotImplementedError pointer.
+AZ-408 will replace each NotImplementedError with a real generator; these
+tests will then be updated alongside the implementation.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from fixtures.injectors.blackout_spoof import BlackoutSpoofPlan
+from fixtures.injectors.blackout_spoof import build as build_blackout_spoof
+from fixtures.injectors.cold_boot import ColdBootFixture
+from fixtures.injectors.cold_boot import load as load_cold_boot
+from fixtures.injectors.multi_segment import MultiSegmentPlan
+from fixtures.injectors.multi_segment import build as build_multi_segment
+from fixtures.injectors.outlier import OutlierInjectionPlan
+from fixtures.injectors.outlier import build as build_outlier
+
+
+def test_outlier_plan_dataclass_is_frozen() -> None:
+    plan = OutlierInjectionPlan(target_segment_seconds=(0.0, 5.0))
+    with pytest.raises(AttributeError):
+        plan.max_offset_m = 999.0  # type: ignore[misc]
+    assert plan.max_offset_m == 350.0
+
+
+def test_outlier_build_raises_until_az408_lands() -> None:
+    with pytest.raises(NotImplementedError, match="AZ-408"):
+        build_outlier(OutlierInjectionPlan(target_segment_seconds=(0.0, 5.0)), Path("/tmp"))
+
+
+def test_blackout_spoof_plan_round_trip() -> None:
+    plan = BlackoutSpoofPlan(blackout_seconds=35.0, spoof_offset_m=120.0, spoof_bearing_deg=90.0)
+    assert plan.blackout_seconds == 35.0
+    with pytest.raises(NotImplementedError, match="AZ-408"):
+        build_blackout_spoof(plan, Path("/tmp"))
+
+
+def test_multi_segment_plan_defaults() -> None:
+    plan = MultiSegmentPlan()
+    assert plan.n_segments == 3
+    with pytest.raises(NotImplementedError, match="AZ-408"):
+        build_multi_segment(plan, Path("/tmp"))
+
+
+def test_cold_boot_fixture_dataclass_is_frozen() -> None:
+    fx = ColdBootFixture(
+        lat_deg=50.0, lon_deg=30.0, alt_m=300.0, yaw_deg=180.0, last_valid_fix_age_s=2.5
+    )
+    with pytest.raises(AttributeError):
+        fx.alt_m = 999.0  # type: ignore[misc]
+
+
+def test_cold_boot_load_raises_until_az419_lands(tmp_path: Path) -> None:
+    fixture_path = tmp_path / "cold_boot_fixture.json"
+    fixture_path.write_text("{}", encoding="utf-8")
+    with pytest.raises(NotImplementedError, match="AZ-419"):
+        load_cold_boot(fixture_path)
@@ -0,0 +1,37 @@
+"""Unit tests for `runner.helpers.fdr_reader.archive_size_bytes`.
+
+The full `iter_records` parser is owned by AZ-441; AZ-406 only commits to
+the directory-size helper.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from runner.helpers.fdr_reader import archive_size_bytes
+
+
+def test_archive_size_zero_for_missing_root(tmp_path: Path) -> None:
+    assert archive_size_bytes(tmp_path / "does-not-exist") == 0
+
+
+def test_archive_size_sums_nested_files(tmp_path: Path) -> None:
+    # Arrange
+    (tmp_path / "a").mkdir()
+    (tmp_path / "a" / "b.bin").write_bytes(b"x" * 100)
+    (tmp_path / "a" / "c.bin").write_bytes(b"y" * 50)
+    (tmp_path / "top.bin").write_bytes(b"z" * 200)
+    # Act
+    size = archive_size_bytes(tmp_path)
+    # Assert
+    assert size == 350
+
+
+def test_iter_records_raises_until_az441_lands() -> None:
+    """Until AZ-441 fills the parser in, callers must see a clear error."""
+    from runner.helpers.fdr_reader import iter_records
+
+    with pytest.raises(NotImplementedError, match="AZ-441"):
+        next(iter_records(Path("/tmp/nonexistent")))
@@ -0,0 +1,46 @@
+"""Unit tests for `runner.helpers.geo`: WGS84 geodesic distance + offset projection (via pyproj)."""
+
+from __future__ import annotations
+
+import math
+
+import pytest
+
+from runner.helpers.geo import GeodeticDelta, delta, distance_m, offset
+
+
+def test_distance_zero_for_same_point() -> None:
+    assert distance_m(50.0, 30.0, 50.0, 30.0) == pytest.approx(0.0, abs=1e-6)
+
+
+def test_distance_one_degree_latitude_around_111km() -> None:
+    # ~111 km per degree of latitude at the equator; 1° at lat=50° is similar.
+    d = distance_m(50.0, 30.0, 51.0, 30.0)
+    assert 110_000 < d < 112_000
+
+
+def test_offset_then_distance_round_trip() -> None:
+    """Offsetting a point by N meters along a bearing recovers ~N when measured back."""
+    # Arrange
+    start_lat, start_lon = 50.0, 30.0
+    bearing = 45.0
+    target_distance = 5_000.0
+    # Act
+    end_lat, end_lon = offset(start_lat, start_lon, bearing, target_distance)
+    measured = distance_m(start_lat, start_lon, end_lat, end_lon)
+    # Assert
+    assert measured == pytest.approx(target_distance, rel=1e-6)
+
+
+def test_delta_returns_full_structure() -> None:
+    d = delta(50.0, 30.0, 50.0, 31.0)
+    assert isinstance(d, GeodeticDelta)
+    assert d.distance_m > 0
+    assert math.isfinite(d.forward_bearing_deg)
+    assert math.isfinite(d.reverse_bearing_deg)
+
+
+@pytest.mark.parametrize("bad", [float("nan")])
+def test_distance_rejects_nan(bad: float) -> None:
+    with pytest.raises(ValueError, match="NaN"):
+        distance_m(bad, 30.0, 50.0, 30.0)
@@ -0,0 +1,59 @@
+"""Unit tests for `jetson.jtop_parser` (mocked — jetson-stats not installed in CI)."""
+
+from __future__ import annotations
+
+import csv
+import json
+import sys
+from pathlib import Path
+from types import SimpleNamespace
+
+import pytest
+
+JETSON_ROOT = Path(__file__).resolve().parents[2] / "jetson"
+if str(JETSON_ROOT) not in sys.path:
+    sys.path.insert(0, str(JETSON_ROOT))
+
+import jtop_parser  # noqa: E402
+
+
+def test_state_to_row_extracts_known_fields() -> None:
+    # Arrange
+    state = SimpleNamespace(
+        ram=SimpleNamespace(used=2048, tot=8192),
+        gpu=SimpleNamespace(load=72, freq=SimpleNamespace(cur=624)),
+        cpu=SimpleNamespace(load_avg=42.0),
+        temperature={"SOC": 51.0, "GPU": 49.0},
+        power=SimpleNamespace(total=12000),
+    )
+    # Act
+    row = jtop_parser.state_to_row(state)
+    # Assert
+    assert row["ram_used_mb"] == 2048
+    assert row["ram_total_mb"] == 8192
+    assert row["gpu_load_pct"] == 72
+    assert row["gpu_freq_mhz"] == 624
+    assert row["soc_temp_c"] == 51.0
+    assert row["gpu_temp_c"] == 49.0
+    assert row["power_mw"] == 12000
+
+
+def test_run_emits_stub_row_when_jetson_stats_missing(
+    tmp_path: Path, monkeypatch: pytest.MonkeyPatch
+) -> None:
+    """On hosts without jetson-stats, run() must still produce a one-row CSV with stub metadata."""
+    # Arrange
+    out = tmp_path / "jtop.csv"
+    # Force the ImportError path even if jetson-stats happens to be installed;
+    # monkeypatch restores the previous sys.modules entry on teardown.
+    monkeypatch.setitem(sys.modules, "jtop", None)
+    # Act
+    n = jtop_parser.run(out, interval_s=0.01, samples_max=1)
+    # Assert
+    assert n == 1
+    with out.open() as fh:
+        rows = list(csv.DictReader(fh))
+    assert len(rows) == 1
+    extras = json.loads(rows[0]["extras_json"])
+    assert extras["stub"] is True
+    assert extras["missing_dep"] == "jetson-stats"
@@ -0,0 +1,79 @@
+"""Unit tests for `jetson.tegrastats_parser`."""
+
+from __future__ import annotations
+
+import io
+import json
+import sys
+from pathlib import Path
+
+import pytest
+
+# Add jetson/ to path so the module is importable as a flat script.
+JETSON_ROOT = Path(__file__).resolve().parents[2] / "jetson"
+if str(JETSON_ROOT) not in sys.path:
+    sys.path.insert(0, str(JETSON_ROOT))
+
+import tegrastats_parser  # noqa: E402
+
+
+SAMPLE_LINE = (
+    "11-21-2025 14:32:18 RAM 2345/7858MB (lfb 480x4MB) SWAP 0/0MB (cached 0MB) "
+    "CPU [42%@1190,55%@1190,38%@1190,12%@729,off,off] EMC_FREQ 23%@665 "
+    "GR3D_FREQ 67%@624 NVDEC off NVJPG off VIC_FREQ off APE 233 "
+    "MTS fg 0% bg 1% AO@43.5C CPU@52.0C GPU@49.0C tj@52.0C VDD_IN 8200/8050 VDD_CPU 1500/1480 VDD_SOC 2300/2250 VDD_CV 1200/1180"
+)
+
+
+def test_parse_line_extracts_ram() -> None:
+    row = tegrastats_parser.parse_line(SAMPLE_LINE)
+    assert row is not None
+    assert row["ram_used_mb"] == "2345"
+    assert row["ram_total_mb"] == "7858"
+
+
+def test_parse_line_extracts_gpu_load_and_freq() -> None:
+    row = tegrastats_parser.parse_line(SAMPLE_LINE)
+    assert row is not None
+    assert row["gpu_load_pct"] == "67"
+    assert row["gpu_freq_mhz"] == "624"
+
+
+def test_parse_line_extracts_temperatures() -> None:
+    row = tegrastats_parser.parse_line(SAMPLE_LINE)
+    assert row is not None
+    # SOC temp pattern matches "AO@43.5C" via the case-insensitive SoC fallback,
+    # but more importantly GPU@49.0C is matched.
+    assert row["gpu_temp_c"] == "49.0"
+
+
+def test_parse_line_averages_cpu_loads() -> None:
+    row = tegrastats_parser.parse_line(SAMPLE_LINE)
+    assert row is not None
+    # 42, 55, 38, 12 = avg 36.75 → "36.8"
+    assert row["cpu_load_avg_pct"] == "36.8"
+
+
+def test_parse_line_blank_returns_none() -> None:
+    assert tegrastats_parser.parse_line("") is None
+    assert tegrastats_parser.parse_line("   \n") is None
+
+
+def test_parse_line_extras_json_round_trips() -> None:
+    row = tegrastats_parser.parse_line(SAMPLE_LINE)
+    assert row is not None
+    extras = json.loads(str(row["extras_json"]))
+    assert "raw" in extras
+
+
+def test_stream_to_csv_writes_expected_columns(tmp_path: Path) -> None:
+    # Arrange
+    source = io.StringIO("\n".join([SAMPLE_LINE, SAMPLE_LINE]))
+    out_path = tmp_path / "tegrastats.csv"
+    # Act
+    n = tegrastats_parser.stream_to_csv(source, out_path)
+    # Assert
+    assert n == 2
+    text = out_path.read_text(encoding="utf-8")
+    first_line = text.splitlines()[0]
+    assert first_line == ",".join(tegrastats_parser.CSV_COLUMNS)
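The tegrastats tests above pin down the observable row shape: string-valued fields, `None` for blank lines, and a one-decimal CPU average over the online cores only. The following is a minimal, standard-library sketch of a parser that would satisfy those assertions. It is not the `tegrastats_parser` module from this commit, which is not shown in this part of the diff; the function name `sketch_parse_line` and every regular expression are assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration; the real tegrastats_parser may differ.
_RAM = re.compile(r"RAM (\d+)/(\d+)MB")
_GPU = re.compile(r"GR3D_FREQ (\d+)%@(\d+)")
_GPU_TEMP = re.compile(r"GPU@([\d.]+)C")
_CPU_BLOCK = re.compile(r"CPU \[([^\]]*)\]")
_CPU_CORE = re.compile(r"(\d+)%@\d+")


def sketch_parse_line(line: str) -> Optional[dict[str, str]]:
    """Parse one tegrastats line into string-valued fields, or None if blank."""
    if not line.strip():
        return None
    row: dict[str, str] = {}
    if m := _RAM.search(line):
        row["ram_used_mb"], row["ram_total_mb"] = m.group(1), m.group(2)
    if m := _GPU.search(line):
        row["gpu_load_pct"], row["gpu_freq_mhz"] = m.group(1), m.group(2)
    if m := _GPU_TEMP.search(line):
        row["gpu_temp_c"] = m.group(1)
    if m := _CPU_BLOCK.search(line):
        # Cores reported as "off" carry no "<load>%@<freq>" token and are skipped.
        loads = [int(x) for x in _CPU_CORE.findall(m.group(1))]
        if loads:
            row["cpu_load_avg_pct"] = f"{sum(loads) / len(loads):.1f}"
    return row
```

Keeping every value as a string mirrors what the tests compare against and lets a CSV writer emit the row without further conversion.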
@@ -0,0 +1,117 @@
+"""Unit tests for the mock Suite Sat Service FastAPI app.
+
+Uses fastapi.testclient.TestClient — no Docker required.
+"""
+
+from __future__ import annotations
+
+import importlib
+import sys
+from pathlib import Path
+
+import pytest
+
+# fastapi / starlette TestClient depends on httpx; both are in the runner image
+# requirements and in the project's pyproject (httpx for the C12 FlightsApiClient).
+fastapi = pytest.importorskip("fastapi")
+testclient_mod = pytest.importorskip("fastapi.testclient")
+TestClient = testclient_mod.TestClient
+
+
+MOCK_APP_PATH = Path(__file__).resolve().parents[2] / "fixtures" / "mock-suite-sat"
+
+
+@pytest.fixture
+def app_client(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> TestClient:
+    # Arrange
+    monkeypatch.setenv("MOCK_SUITE_SAT_AUDIT_PATH", str(tmp_path))
+    monkeypatch.syspath_prepend(str(MOCK_APP_PATH))
+    # Reload to pick up the new audit path.
+    if "app" in sys.modules:
+        importlib.reload(sys.modules["app"])
+    import app as mock_app  # noqa: E402
+
+    return TestClient(mock_app.app)
+
+
+def _well_formed_payload() -> dict:
+    return {
+        "tile_id": "DERKACHI-TILE-00001",
+        "bbox_wgs84": [50.0, 30.0, 50.01, 30.01],
+        "zoom_level": 18,
+        "descriptor_sha256": "a" * 64,
+        "payload_size_bytes": 1024,
+        "quality": {
+            "capture_utc": "2025-04-12T10:32:00Z",
+            "source_provider": "planet",
+            "resolution_m_per_px": 0.5,
+            "cloud_coverage_pct": 5.0,
+            "geo_accuracy_m": 3.0,
+        },
+    }
+
+
+def test_health_endpoint(app_client: TestClient) -> None:
+    # Act / Assert
+    r = app_client.get("/mock/health")
+    assert r.status_code == 200
+    assert r.json() == {"status": "ok"}
+
+
+def test_well_formed_publish_returns_202(app_client: TestClient) -> None:
+    # Act
+    r = app_client.post("/tiles?run_id=unit-1", json=_well_formed_payload())
+    # Assert
+    assert r.status_code == 202
+    body = r.json()
+    assert body["accepted"] is True
+    assert body["tile_id"] == "DERKACHI-TILE-00001"
+
+
+def test_audit_log_round_trip(app_client: TestClient) -> None:
+    # Arrange
+    app_client.post("/tiles?run_id=unit-2", json=_well_formed_payload())
+    # Act
+    r = app_client.get("/mock/audit?run_id=unit-2")
+    # Assert
+    assert r.status_code == 200
+    body = r.json()
+    assert body["run_id"] == "unit-2"
+    assert len(body["entries"]) == 1
+    assert body["entries"][0]["tile_id"] == "DERKACHI-TILE-00001"
+
+
+def test_malformed_publish_returns_4xx(app_client: TestClient) -> None:
+    bad = _well_formed_payload()
+    bad["zoom_level"] = 99  # out of range
+    # Act
+    r = app_client.post("/tiles?run_id=unit-3", json=bad)
+    # Assert
+    # The spec says "400 on malformed", but FastAPI's default schema-failure
+    # code is 422 and overriding it would re-implement FastAPI's validation
+    # layer. NFT-SEC-01 asserts shape, not the exact code, so the contract is
+    # any 4xx; 422 is what FastAPI returns by default.
+    assert r.status_code == 422
+    assert 400 <= r.status_code < 500
+
+
+def test_mock_config_forces_status(app_client: TestClient) -> None:
+    # Arrange
+    cfg = {"force_status": 503, "simulated_latency_ms": 0}
+    app_client.post("/mock/config", json=cfg)
+    # Act
+    r = app_client.post("/tiles?run_id=unit-4", json=_well_formed_payload())
+    # Assert
+    assert r.status_code == 503
+    # Reset for downstream tests.
+    app_client.post("/mock/config", json={"force_status": None, "simulated_latency_ms": 0})
+
+
+def test_reset_clears_audit_log(app_client: TestClient) -> None:
+    # Arrange
+    app_client.post("/tiles?run_id=unit-5", json=_well_formed_payload())
+    # Act
+    app_client.post("/mock/reset?run_id=unit-5")
+    r = app_client.get("/mock/audit?run_id=unit-5")
+    # Assert
+    assert r.json()["entries"] == []
@@ -0,0 +1,204 @@
+"""Unit tests for `runner.reporting.csv_reporter`.
+
+Covers two layers:
+    1. `build_row` — pure function exercised with fake `Item` / `TestReport`
+       objects. Verifies the column set and result classification logic.
+    2. Plugin smoke-test — runs a tiny in-process pytest invocation against
+       a temporary test file with the plugin registered, then reads the CSV
+       output back and asserts the column ordering matches CSV_COLUMNS.
+"""
+
+from __future__ import annotations
+
+import csv
+import sys
+from pathlib import Path
+from types import SimpleNamespace
+from typing import Any
+
+import pytest
+
+from runner.reporting.csv_reporter import CSV_COLUMNS, build_row
+
+
+class _FakeItem:
+    """Minimal duck-typed pytest.Item replacement for unit tests."""
+
+    def __init__(
+        self,
+        nodeid: str = "tests/test_x.py::test_y",
+        name: str = "test_y",
+        markers: list[SimpleNamespace] | None = None,
+        callspec: SimpleNamespace | None = None,
+    ) -> None:
+        self.nodeid = nodeid
+        self.name = name
+        self._markers = markers or []
+        self.callspec = callspec
+
+    def get_closest_marker(self, name: str) -> SimpleNamespace | None:
+        return next((m for m in self._markers if m.name == name), None)
+
+
+def _report(outcome: str, when: str = "call", longrepr: Any = "") -> SimpleNamespace:
+    return SimpleNamespace(
+        outcome=outcome,
+        when=when,
+        longreprtext=str(longrepr) if outcome == "failed" else "",
+        longrepr=longrepr,
+    )
+
+
+# ---------------------------------------------------------------------------
+# build_row unit tests
+# ---------------------------------------------------------------------------
+
+
+def test_build_row_pass_minimal() -> None:
+    # Arrange
+    item = _FakeItem()
+    report = _report("passed")
+    # Act
+    row = build_row(item, report, "2026-05-16T10:00:00+00:00", 42, [])
+    # Assert
+    assert set(row.keys()) == set(CSV_COLUMNS)
+    assert row["result"] == "PASS"
+    assert row["test_id"] == "tests/test_x.py::test_y"
+    assert row["execution_time_ms"] == "42"
+    assert row["error_message"] == ""
+
+
+def test_build_row_fail_attaches_error_message() -> None:
+    # Arrange
+    item = _FakeItem()
+    report = _report("failed", longrepr="boom\nat line 4")
+    # Act
+    row = build_row(item, report, "2026-05-16T10:00:00+00:00", 10, [])
+    # Assert
+    assert row["result"] == "FAIL"
+    assert "boom" in row["error_message"]
+    assert "\n" not in row["error_message"]  # collapsed for CSV friendliness
+
+
+def test_build_row_skip_records_reason() -> None:
+    # Arrange
+    item = _FakeItem()
+    report = _report("skipped", when="setup", longrepr=("file.py", 5, "deferred: AC-7.1"))
+    # Act
+    row = build_row(item, report, "2026-05-16T10:00:00+00:00", 1)
+    # Assert
+    assert row["result"] == "SKIP"
+    assert row["error_message"] == "deferred: AC-7.1"
+
+
+def test_build_row_xfail_when_deferred_ac_xfail_verdict() -> None:
+    # Arrange
+    marker = SimpleNamespace(
+        name="deferred_ac", args=(), kwargs={"verdict": "xfail", "reason": "AC-8.6 scene-change PARTIAL"}
+    )
+    item = _FakeItem(markers=[marker])
+    report = _report("skipped", longrepr=("file.py", 5, "xfail strict=False"))
+    # Act
+    row = build_row(item, report, "2026-05-16T10:00:00+00:00", 1)
+    # Assert
+    assert row["result"] == "XFAIL"
+
+
+def test_build_row_uses_test_id_marker_when_set() -> None:
+    # Arrange
+    marker = SimpleNamespace(name="test_id", args=("FT-P-01",), kwargs={})
+    item = _FakeItem(markers=[marker])
+    report = _report("passed")
+    # Act
+    row = build_row(item, report, "2026-05-16T10:00:00+00:00", 1)
+    # Assert
+    assert row["test_id"] == "FT-P-01"
+
+
+def test_build_row_emits_traces_to_csv() -> None:
+    # Arrange
+    marker = SimpleNamespace(name="traces_to", args=(["AC-1.1", "AC-1.2"],), kwargs={})
+    item = _FakeItem(markers=[marker])
+    report = _report("passed")
+    # Act
+    row = build_row(item, report, "2026-05-16T10:00:00+00:00", 1)
+    # Assert
+    assert row["traces_to"] == "AC-1.1,AC-1.2"
+
+
+def test_build_row_propagates_parametrize_ids() -> None:
+    # Arrange
+    callspec = SimpleNamespace(params={"fc_adapter": "ardupilot", "vio_strategy": "okvis2"})
+    item = _FakeItem(callspec=callspec)
+    report = _report("passed")
+    # Act
+    row = build_row(item, report, "2026-05-16T10:00:00+00:00", 1)
+    # Assert
+    assert row["fc_adapter"] == "ardupilot"
+    assert row["vio_strategy"] == "okvis2"
+
+
+def test_build_row_records_evidence_paths() -> None:
+    # Arrange
+    item = _FakeItem()
+    report = _report("passed")
+    # Act
+    row = build_row(item, report, "2026-05-16T10:00:00+00:00", 1, ["evidence/a.tlog", "evidence/b.csv"])
+    # Assert
+    assert row["evidence_paths"] == "evidence/a.tlog,evidence/b.csv"
+
+
+# ---------------------------------------------------------------------------
+# In-process plugin integration
+# ---------------------------------------------------------------------------
+
+PLUGIN_INTEGRATION = """
+import pytest
+
+pytest_plugins = ["runner.reporting.csv_reporter"]
+
+
+@pytest.mark.traces_to(["AC-1"])
+@pytest.mark.test_id("UNIT-CSV-01")
+def test_passing():
+    assert 1 == 1
+
+
+def test_failing():
+    assert 1 == 2
+"""
+
+
+def test_csv_plugin_emits_required_columns(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
+    """Run pytest in-process with the CSV plugin and assert the column header matches CSV_COLUMNS."""
+    # Arrange
+    test_file = tmp_path / "test_plugin_smoke.py"
+    test_file.write_text(PLUGIN_INTEGRATION, encoding="utf-8")
+    csv_out = tmp_path / "report.csv"
+    monkeypatch.setenv("TIER", "tier1-docker")
+    # Make `runner.*` importable from the in-process pytest.
+    e2e_root = Path(__file__).resolve().parents[2]
+    monkeypatch.syspath_prepend(str(e2e_root))
+    # Act — `-p runner.reporting.csv_reporter` registers the plugin BEFORE option parsing,
+    # otherwise pytest rejects `--csv=...` as unrecognized.
+    rc = pytest.main([
+        "-p", "runner.reporting.csv_reporter",
+        str(test_file),
+        f"--csv={csv_out}",
+        "--no-header",
+        "-q",
+    ])
+    # Assert
+    # test_failing intentionally fails, so pytest must exit with rc=1 (tests failed).
+    assert rc == 1, f"unexpected pytest rc={rc}"
+    assert csv_out.exists(), "csv_reporter did not write the report file"
+    with csv_out.open() as fh:
+        reader = csv.DictReader(fh)
+        rows = list(reader)
+        assert reader.fieldnames == list(CSV_COLUMNS)
+    # Both rows should be present (one passed, one failed).
+    assert len(rows) == 2
+    results = {row["test_id"]: row["result"] for row in rows}
+    assert results.get("UNIT-CSV-01") == "PASS"
+    failing_row = next(row for row in rows if row["result"] == "FAIL")
+    assert "assert" in failing_row["error_message"].lower()
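The `build_row` tests above imply a small classification rule: `passed` maps to PASS, `failed` maps to FAIL with the traceback collapsed onto one line, and `skipped` maps to SKIP unless the test carries a `deferred_ac` marker with `verdict="xfail"`, in which case it is reported as XFAIL. The sketch below restates that rule as a standalone function. It is an illustration only, not the `csv_reporter` implementation from this commit; `sketch_classify` and its parameters are hypothetical names, and returning the skip reason as the XFAIL message is an assumption the tests do not confirm.

```python
from typing import Any, Optional


def sketch_classify(
    outcome: str,
    longrepr: Any,
    deferred_verdict: Optional[str] = None,
) -> tuple[str, str]:
    """Map a pytest report outcome to a (result, error_message) pair."""
    if outcome == "passed":
        return "PASS", ""
    if outcome == "failed":
        # Collapse all whitespace so a multi-line traceback stays on one CSV row.
        return "FAIL", " ".join(str(longrepr).split())
    # Skipped reports carry longrepr as a (path, lineno, reason) tuple,
    # as in the fake reports used by the tests above.
    if isinstance(longrepr, tuple) and len(longrepr) == 3:
        reason = str(longrepr[2])
    else:
        reason = str(longrepr)
    if deferred_verdict == "xfail":
        # Assumption: XFAIL rows reuse the skip reason as their message.
        return "XFAIL", reason
    return "SKIP", reason
```

Separating the classification from row assembly keeps the rule testable without constructing pytest `Item` or `TestReport` objects.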
@@ -0,0 +1,144 @@
+"""Unit tests for the runner conftest's skip / xfail enforcement.
+
+We exercise `pytest_collection_modifyitems` directly with a fake config and
+a synthetic item list, then assert the post-conditions (marker added, etc.).
+
+This catches regressions where someone changes the skip rules without
+updating the traceability matrix — see
+`_docs/02_document/tests/traceability-matrix.md` § Uncovered Items Analysis.
+"""
+
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+from types import SimpleNamespace
+
+import pytest
+
+_E2E_ROOT = Path(__file__).resolve().parents[1]
+if str(_E2E_ROOT) not in sys.path:
+    sys.path.insert(0, str(_E2E_ROOT))
+
+from runner.conftest import pytest_collection_modifyitems  # noqa: E402
+
+
+class _Marker(SimpleNamespace):
+    pass
+
+
+class _FakeKeywords(set):
+    """Mimic pytest.Item.keywords (a set-with-`in` semantics over marker names)."""
+
+
+class _FakeItem:
+    def __init__(
+        self,
+        keywords: set[str] | None = None,
+        markers: dict[str, _Marker] | None = None,
+        callspec: SimpleNamespace | None = None,
+    ) -> None:
+        self.keywords = _FakeKeywords(keywords or set())
+        self._markers = markers or {}
+        self.callspec = callspec
+        self.added_markers: list[_Marker] = []
+
+    def get_closest_marker(self, name: str) -> _Marker | None:
+        return self._markers.get(name)
+
+    def add_marker(self, marker: _Marker) -> None:
+        self.added_markers.append(marker)
+
+
+class _FakeConfig:
+    def __init__(self, chamber: bool = False, build_kind: str = "production", allow_no_reason: bool = False) -> None:
+        self._chamber = chamber
+        self._build_kind = build_kind
+        self._allow_no_reason = allow_no_reason
+
+    def getoption(self, name: str) -> object:
+        return {
+            "--enable-chamber": self._chamber,
+            "--build-kind": self._build_kind,
+            "--allow-no-skip-reason": self._allow_no_reason,
+        }[name]
+
+
+def _skip_reasons(item: _FakeItem) -> list[str]:
+    """Return the string form of every marker the hook added to `item`.
+
+    `pytest.mark.skip(reason=...)` yields a MarkDecorator whose string form
+    includes its kwargs, so callers can substring-match the reason text
+    without reaching into `.mark.kwargs`.
+    """
+    return [str(m) for m in item.added_markers]
+
+
+def test_tier2_only_skipped_on_tier1(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("TIER", "tier1-docker")
+    item = _FakeItem(keywords={"tier2_only"})
+    pytest_collection_modifyitems(_FakeConfig(), [item])
+    assert any("Tier-2 only" in r for r in _skip_reasons(item))
+
+
+def test_tier2_only_runs_on_tier2(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("TIER", "tier2-jetson")
+    item = _FakeItem(keywords={"tier2_only"})
+    pytest_collection_modifyitems(_FakeConfig(), [item])
+    assert not item.added_markers, "tier2_only test should run when TIER=tier2-jetson"
+
+
+def test_chamber_only_skipped_without_flag(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("TIER", "tier2-jetson")
+    item = _FakeItem(keywords={"chamber_only"})
+    pytest_collection_modifyitems(_FakeConfig(chamber=False), [item])
+    assert any("Chamber" in r for r in _skip_reasons(item))
+
+
+def test_chamber_only_runs_with_flag(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("TIER", "tier2-jetson")
+    item = _FakeItem(keywords={"chamber_only"})
+    pytest_collection_modifyitems(_FakeConfig(chamber=True), [item])
+    assert not item.added_markers, "chamber_only test should run with --enable-chamber"
+
+
+def test_vins_mono_skipped_on_production(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("TIER", "tier1-docker")
+    callspec = SimpleNamespace(params={"vio_strategy": "vins_mono"})
+    item = _FakeItem(callspec=callspec)
+    pytest_collection_modifyitems(_FakeConfig(build_kind="production"), [item])
+    assert any("research-build-only" in r for r in _skip_reasons(item))
+
+
+def test_vins_mono_runs_on_research(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("TIER", "tier1-docker")
+    callspec = SimpleNamespace(params={"vio_strategy": "vins_mono"})
+    item = _FakeItem(callspec=callspec)
+    pytest_collection_modifyitems(_FakeConfig(build_kind="research"), [item])
+    assert not item.added_markers, "vins_mono should run on research builds"
+
+
+def test_deferred_ac_without_reason_blocks_collection(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("TIER", "tier1-docker")
+    marker = _Marker(args=(), kwargs={})
+    item = _FakeItem(markers={"deferred_ac": marker})
+    pytest_collection_modifyitems(_FakeConfig(allow_no_reason=False), [item])
+    assert any("without reason=" in r for r in _skip_reasons(item))
+
+
+def test_deferred_ac_with_reason_emits_skip(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("TIER", "tier1-docker")
+    marker = _Marker(args=(), kwargs={"reason": "AC-7.1 — see traceability matrix"})
+    item = _FakeItem(markers={"deferred_ac": marker})
+    pytest_collection_modifyitems(_FakeConfig(), [item])
+    assert any("AC-7.1" in r for r in _skip_reasons(item))
+
+
+def test_deferred_ac_xfail_verdict_emits_xfail(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("TIER", "tier1-docker")
+    marker = _Marker(args=(), kwargs={"reason": "AC-8.6 scene-change PARTIAL", "verdict": "xfail"})
+    item = _FakeItem(markers={"deferred_ac": marker})
+    pytest_collection_modifyitems(_FakeConfig(), [item])
+    # The xfail decorator object stringifies differently from skip; just
+    # verify some marker was added.
+    assert item.added_markers, "deferred_ac(verdict=xfail) must mark the item"
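Reviewer note: the skip rules these tests exercise can be summarised as one pure decision function. The sketch below is not the conftest.py implementation. `Verdict` and `decide` are hypothetical names, and the reason strings are invented so that they contain the substrings the tests assert on. It leaves out `--allow-no-skip-reason`, because the tests only pin down its default behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verdict:
    """Hypothetical result type: which marker to add, and why."""
    kind: str    # "skip" or "xfail"
    reason: str


def decide(
    tier: str,
    keywords: set,
    vio_strategy: Optional[str] = None,
    deferred_kwargs: Optional[dict] = None,
    enable_chamber: bool = False,
    build_kind: str = "production",
) -> Optional[Verdict]:
    """Return the marker to add, or None when the test should run unmodified."""
    if "tier2_only" in keywords and tier != "tier2-jetson":
        return Verdict("skip", f"Tier-2 only (TIER={tier})")
    if "chamber_only" in keywords and not enable_chamber:
        return Verdict("skip", "Chamber test; pass --enable-chamber to run")
    if vio_strategy == "vins_mono" and build_kind != "research":
        return Verdict("skip", "vins_mono is research-build-only")
    if deferred_kwargs is not None:
        reason = deferred_kwargs.get("reason")
        if not reason:
            return Verdict("skip", "deferred_ac used without reason=")
        # Only an explicit verdict="xfail" upgrades the skip to an xfail.
        kind = "xfail" if deferred_kwargs.get("verdict") == "xfail" else "skip"
        return Verdict(kind, reason)
    return None


print(decide("tier1-docker", {"tier2_only"}))
print(decide("tier2-jetson", {"tier2_only"}))
```

Keeping the decision separate from the pytest hook would let rules like these be tested without the `_FakeItem` / `_FakeConfig` shims; whether that split is worth it here is a judgment call for the ticket owner.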
@@ -0,0 +1,81 @@
+"""Asserts the AZ-406 directory layout is present.
+
+Every blackbox / fixture / Jetson task added later relies on these paths.
+Catching a missing directory here is much faster than failing inside the
+e2e-runner image build.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+E2E_ROOT = Path(__file__).resolve().parents[1]
+
+
+@pytest.mark.parametrize(
+    "relative_path",
+    [
+        "README.md",
+        ".gitignore",
+        "docker/docker-compose.test.yml",
+        "docker/docker-compose.tier2-bridge.yml",
+        "docker/secrets/mavlink_passkey",
+        "jetson/run-tier2.sh",
+        "jetson/tier2.service",
+        "jetson/tegrastats_parser.py",
+        "jetson/jtop_parser.py",
+        "runner/Dockerfile",
+        "runner/requirements.txt",
+        "runner/pytest.ini",
+        "runner/conftest.py",
+        "runner/reporting/csv_reporter.py",
+        "runner/reporting/evidence_bundler.py",
+        "runner/helpers/frame_source_replay.py",
+        "runner/helpers/imu_replay.py",
+        "runner/helpers/sitl_observer.py",
+        "runner/helpers/mavproxy_tlog_reader.py",
+        "runner/helpers/fdr_reader.py",
+        "runner/helpers/geo.py",
+        "fixtures/mock-suite-sat/Dockerfile",
+        "fixtures/mock-suite-sat/app.py",
+        "fixtures/mock-suite-sat/requirements.txt",
+        "fixtures/tile-cache-builder/README.md",
+        "fixtures/age-injector/README.md",
+        "fixtures/injectors/outlier.py",
+        "fixtures/injectors/blackout_spoof.py",
+        "fixtures/injectors/multi_segment.py",
+        "fixtures/injectors/cold_boot.py",
+        "fixtures/cold-boot/README.md",
+        "fixtures/secrets/mavlink-test-passkey.txt",
+        "fixtures/security/generate_cve_jpeg.py",
+        "fixtures/security/README.md",
+        "tests/__init__.py",
+        "tests/conftest.py",
+        "tests/positive/__init__.py",
+        "tests/negative/__init__.py",
+        "tests/performance/__init__.py",
+        "tests/resilience/__init__.py",
+        "tests/security/__init__.py",
+        "tests/resource_limit/__init__.py",
+        "tests/positive/test_smoke.py",
+    ],
+)
+def test_required_path_exists(relative_path: str) -> None:
+    """Each path AZ-406 commits to must exist on disk."""
+    assert (E2E_ROOT / relative_path).exists(), (
+        f"AZ-406 layout invariant broken: e2e/{relative_path} is missing"
+    )
+
+
+def test_passkey_files_match() -> None:
+    """Docker secret and runner-side passkey fixture must hold the same bytes."""
+    # Arrange
+    docker_pk = (E2E_ROOT / "docker/secrets/mavlink_passkey").read_bytes()
+    runner_pk = (E2E_ROOT / "fixtures/secrets/mavlink-test-passkey.txt").read_bytes()
+    # Assert
+    assert docker_pk == runner_pk, (
+        "MAVLink test passkey bytes differ between docker secret and runner "
+        "fixture. They MUST be kept in sync — see e2e/fixtures/secrets/README.md."
+    )
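Reviewer note: `E2E_ROOT = Path(__file__).resolve().parents[1]` only resolves to `e2e/` if this module sits exactly one directory below it. A minimal illustration, assuming a hypothetical location (`/repo/e2e/unit_tests/test_layout.py`; the real directory name is not shown in this hunk):

```python
from pathlib import PurePosixPath

# Hypothetical module location; only its depth below e2e/ matters.
module = PurePosixPath("/repo/e2e/unit_tests/test_layout.py")

# parents[0] is the containing directory, parents[1] is one level up.
assert module.parents[0] == PurePosixPath("/repo/e2e/unit_tests")
e2e_root = module.parents[1]

print(e2e_root)                                     # /repo/e2e
print(e2e_root / "docker/docker-compose.test.yml")  # /repo/e2e/docker/docker-compose.test.yml
```

If the module were moved one level deeper, every parametrized path would be reported missing, so the index is worth re-checking whenever the file moves.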
@@ -0,0 +1,35 @@
+"""Public-boundary discipline check.
+
+No file under `e2e/` may import `gps_denied_onboard.*` — the runner image
+must NEVER reach into SUT source. This unit test grep-walks the tree and
+fails fast if anyone smuggles an import in.
+"""
+
+from __future__ import annotations
+
+import re
+from pathlib import Path
+
+E2E_ROOT = Path(__file__).resolve().parents[1]
+_FORBIDDEN_IMPORT = re.compile(r"^\s*(?:from\s+gps_denied_onboard\b|import\s+(?:[\w.]+(?:\s+as\s+\w+)?\s*,\s*)*gps_denied_onboard\b)")
+
+
+def test_no_sut_imports_in_e2e_tree() -> None:
+    """Walk every *.py under e2e/ and ensure none import gps_denied_onboard.*."""
+    violations: list[tuple[Path, int, str]] = []
+    for py in E2E_ROOT.rglob("*.py"):
+        # Skip __pycache__ and this unit test file itself (it intentionally
+        # mentions the SUT package name in the regex).
+        if "__pycache__" in py.parts or py.name == "test_no_sut_imports.py":
+            continue
+        try:
+            text = py.read_text(encoding="utf-8")
+        except UnicodeDecodeError:
+            continue
+        for lineno, line in enumerate(text.splitlines(), start=1):
+            if _FORBIDDEN_IMPORT.match(line):
+                violations.append((py.relative_to(E2E_ROOT), lineno, line.strip()))
+    assert not violations, (
+        "Public-boundary discipline violated — e2e/ files import the SUT:\n  "
+        + "\n  ".join(f"{p}:{ln}: {src}" for p, ln, src in violations)
+    )
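Reviewer note: a line-anchored regex flags a matching line even when it sits inside a docstring, and it cannot see an import that follows a `;` on the same line. One possible alternative, sketched here with the standard `ast` module, inspects parsed import statements instead. `sut_imports` is a hypothetical helper, not part of this PR, and it does not detect dynamic imports such as `importlib.import_module(...)` or `__import__(...)`.

```python
import ast
from typing import List, Optional


def sut_imports(source: str, package: str = "gps_denied_onboard") -> List[int]:
    """Return line numbers of statements importing `package` or a submodule."""

    def hits(name: Optional[str]) -> bool:
        return name is not None and (
            name == package or name.startswith(package + ".")
        )

    lines: List[int] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Covers `import a, gps_denied_onboard.x as y`.
            if any(hits(alias.name) for alias in node.names):
                lines.append(node.lineno)
        elif isinstance(node, ast.ImportFrom):
            # level == 0 means an absolute import; relative imports are ignored.
            if node.level == 0 and hits(node.module):
                lines.append(node.lineno)
    return sorted(lines)


sample = (
    '"""doc\n'
    "import gps_denied_onboard\n"          # inside the docstring: not an import
    '"""\n'
    "import os, gps_denied_onboard.nav\n"  # line 4
    "x = 1; from gps_denied_onboard import y\n"  # line 5
    "import gps_denied_onboard_extra\n"    # different package: ignored
)
print(sut_imports(sample))  # [4, 5]
```

`ast.parse` raises `SyntaxError` on files it cannot parse, so a caller would need to decide whether such files count as a failure or are skipped.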