[AZ-406] Blackbox test harness bootstrap (Tier-1 + Tier-2 scaffold)

Bootstraps the public-boundary blackbox test harness owned by epic
AZ-262 (E-BBT). Establishes the e2e/ directory tree at the repo root,
fully separated from src/gps_denied_onboard/** and from the in-process
tests/** tree, and commits to the contracts every subsequent test
ticket (AZ-407..AZ-446) builds against.

Tier-1 (workstation Docker):
- docker/docker-compose.test.yml wires SUT + ArduPilot SITL + iNav SITL
  + mock Suite Sat Service + mavproxy listener + e2e-runner onto one
  e2e-net bridge with internal: true (enforces RESTRICT-SAT-1 /
  NFT-SEC-02 egress isolation at the network layer).
- docker/docker-compose.tier2-bridge.yml override disables the in-
  compose SUT so Tier-2 pairs SITLs + mock + runner on an x86 host
  while the SUT runs natively on the Jetson under systemd.
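
A minimal sketch of the kind of static check the egress-isolation contract above implies. It runs against an already-parsed compose mapping; the helper name `egress_violations`, the sample service names, and the rule that every service attaches only to e2e-net are assumptions for illustration, not the harness's actual unit test.

```python
from typing import Any


def egress_violations(compose: dict[str, Any], network: str = "e2e-net") -> list[str]:
    """Return human-readable problems; an empty list means the contract holds."""
    problems: list[str] = []
    net = (compose.get("networks") or {}).get(network)
    if not isinstance(net, dict) or net.get("internal") is not True:
        problems.append(f"network {network!r} must set internal: true")
    for name, svc in (compose.get("services") or {}).items():
        attached = svc.get("networks") or []
        # Compose accepts either a list of names or a mapping keyed by name.
        names = set(attached.keys() if isinstance(attached, dict) else attached)
        if names != {network}:
            problems.append(f"service {name!r} must attach only to {network!r}")
    return problems


# Hand-written stand-in for the parsed compose file (service names are illustrative).
sample = {
    "networks": {"e2e-net": {"driver": "bridge", "internal": True}},
    "services": {
        "sut": {"networks": ["e2e-net"]},
        "e2e-runner": {"networks": {"e2e-net": {}}},
    },
}
print(egress_violations(sample))
```

A service with no `networks` key would land on the default compose network, so the sketch flags it as a violation as well.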

Tier-2 (Jetson):
- jetson/run-tier2.sh + tier2.service systemd unit + tegrastats /
  jtop parsers feed per-sample telemetry into the evidence bundle.
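
A rough sketch of what a per-sample tegrastats parser could look like. The sample line is illustrative only (the exact tegrastats output varies across JetPack releases), and `TegrastatsSample` / `parse_line` are hypothetical names, not the parser shipped under jetson/.

```python
import re
from dataclasses import dataclass
from typing import Optional

_RAM = re.compile(r"\bRAM (\d+)/(\d+)MB")
_GPU = re.compile(r"\bGR3D_FREQ (\d+)%")


@dataclass(frozen=True)
class TegrastatsSample:
    ram_used_mb: int
    ram_total_mb: int
    gpu_util_pct: Optional[int]  # None when the line carries no GR3D_FREQ field


def parse_line(line: str) -> Optional[TegrastatsSample]:
    """Parse one tegrastats line; return None if no RAM field is present."""
    ram = _RAM.search(line)
    if ram is None:
        return None
    gpu = _GPU.search(line)
    return TegrastatsSample(
        ram_used_mb=int(ram.group(1)),
        ram_total_mb=int(ram.group(2)),
        gpu_util_pct=int(gpu.group(1)) if gpu else None,
    )


# Illustrative line shape, not captured from a real device.
sample_line = (
    "RAM 2448/7772MB (lfb 1x2MB) SWAP 0/3886MB (cached 0MB) "
    "CPU [12%@1190,5%@1190] EMC_FREQ 0% GR3D_FREQ 35% CPU@41C GPU@39.5C"
)
print(parse_line(sample_line))
```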

Runner image (e2e/runner/):
- Dockerfile + requirements.txt install ONLY ground-side libs
  (pymavlink, opencv-python>=4.12, numpy/scipy/geopy/pyproj, httpx,
  orjson, pydantic, structlog, pytest 8.x). The runner deliberately
  does NOT install the SUT package.
- conftest.py implements the AC-9 skip-rule mapping (tier2_only,
  chamber_only, vins_mono, deferred_ac) tied to environment.md
  parametrize axes.
- reporting/csv_reporter.py is a pytest plugin emitting one row per
  test with the exact 11-column schema from environment.md §
  Reporting (test_id, test_name, traces_to, fc_adapter, vio_strategy,
  tier, started_at_utc, execution_time_ms, result, error_message,
  evidence_paths). XFAIL surfaced only when a test carries
  @pytest.mark.deferred_ac(verdict="xfail", reason=...).
- reporting/evidence_bundler.py exposes the attach_evidence fixture
  that copies per-test artifacts (.tlog, FDR archives, screenshots,
  tegrastats / jtop CSVs) into the run bundle and records relative
  paths into the reporter's evidence_paths column.
- helpers/{frame_source_replay,imu_replay,sitl_observer,
  mavproxy_tlog_reader,fdr_reader}.py declare the public surfaces
  (concrete implementations owned by AZ-407 / AZ-408 / AZ-416 /
  AZ-417 / AZ-441 per the dependency table); helpers/geo.py ships
  today (no downstream task dep) — WGS84 distance / forward-bearing
  / offset via pyproj with NaN rejection.
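
To make the 11-column reporter schema concrete, here is a small stdlib-only sketch that writes one row with exactly those columns. The column names come from the list above; `write_report`, the sample row values, and the rule that a missing column is an error are assumptions, not the csv_reporter.py plugin itself.

```python
import csv
import io

COLUMNS = (
    "test_id", "test_name", "traces_to", "fc_adapter", "vio_strategy", "tier",
    "started_at_utc", "execution_time_ms", "result", "error_message", "evidence_paths",
)


def write_report(rows, stream) -> None:
    """Write rows with the fixed header; reject rows with extra or missing columns."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, extrasaction="raise")
    writer.writeheader()
    for row in rows:
        missing = [c for c in COLUMNS if c not in row]
        if missing:
            raise ValueError(f"row is missing columns: {missing}")
        writer.writerow(row)


buf = io.StringIO()
write_report(
    [{
        # All values below are made up for illustration.
        "test_id": "smoke-0", "test_name": "test_smoke", "traces_to": "AC-1",
        "fc_adapter": "ardupilot", "vio_strategy": "vins_mono", "tier": "tier1",
        "started_at_utc": "2026-05-16T13:22:44Z", "execution_time_ms": 1234,
        "result": "PASS", "error_message": "", "evidence_paths": "evidence/smoke.tlog",
    }],
    buf,
)
lines = buf.getvalue().splitlines()
print(lines[0])
```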

Mock Suite Sat Service (e2e/fixtures/mock-suite-sat/):
- FastAPI app: POST /tiles (ingest contract from D-PROJ-2 follow-up),
  GET /tiles/audit + /mock/audit (per-run read-back), POST
  /mock/config (force-status, response delay), POST /mock/reset
  (clears audit between tests), GET /mock/health.
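
For orientation, a sketch of a request body that should satisfy the POST /tiles schema declared in the mock's app.py (`TilePublishRequest` / `TileQualityMetadata`). The field names and bounds mirror that schema; the helper `make_tile_request`, the coordinate values, and the bbox ordering are illustrative assumptions.

```python
import hashlib
import json


def make_tile_request(tile_id: str, descriptor: bytes) -> dict:
    """Build a body for POST /tiles; `descriptor` must be non-empty."""
    return {
        "tile_id": tile_id,                        # 8..128 characters
        "bbox_wgs84": [36.10, 50.05, 36.20, 50.12],  # four floats; ordering is an assumption
        "zoom_level": 17,                          # 10..22
        "descriptor_sha256": hashlib.sha256(descriptor).hexdigest(),  # 64 hex chars
        "payload_size_bytes": len(descriptor),     # must be > 0
        "quality": {
            "capture_utc": "2026-05-01T10:00:00Z",
            "source_provider": "operator-supplied",
            "resolution_m_per_px": 0.5,            # (0, 10]
            "cloud_coverage_pct": 3.0,             # 0..100
            "geo_accuracy_m": 5.0,                 # >= 0
        },
    }


payload = make_tile_request("derkachi-z17-0001", b"example-descriptor")
body = json.dumps(payload)
print(body)
```

A runner would send `body` to `/tiles?run_id=<id>` and later read the same entry back from `/mock/audit?run_id=<id>`.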

Fixture scaffolds (e2e/fixtures/{tile-cache-builder, age-injector,
injectors, cold-boot, secrets, security}/):
- Public surfaces only. Concrete builders land in AZ-407 (static
  fixtures), AZ-408 (runtime synthetic injection), AZ-419 (cold-boot
  fixture), AZ-439 (CVE-2025-53644 JPEG generator).

Test tree (e2e/tests/{positive,negative,performance,resilience,
security,resource_limit}/):
- Mirror of the test-spec category grouping in
  _docs/02_document/tests/*-tests.md.
- tests/positive/test_smoke.py is the AC-1 harness-boot smoke test,
  run inside the e2e-runner image once Docker brings everything up.

Out-of-container unit tests (e2e/_unit_tests/):
- Exercise the harness internals (CSV reporter plugin lifecycle,
  conftest skip rules, helper modules, parsers, mock app, compose
  YAML structural contract, public-boundary enforcement) without
  Docker / SITL. 97 unit tests, all passing.

Build / config:
- pyproject.toml: testpaths extended with e2e/_unit_tests; pythonpath
  extended with e2e; fastapi>=0.111,<0.120 added to dev extras for the
  mock-app TestClient unit test.

AC coverage:
- AC-1 (Tier-1 boot)         → compose YAML test + directory layout
                                + smoke test (Docker-bound)
- AC-2 (mock services)       → 6 FastAPI TestClient unit tests
- AC-3 (SITLs accept output) → contract present; concrete check
                                deferred to AZ-416 / AZ-417
- AC-4 (CSV columns)         → in-process plugin lifecycle test
                                emits the exact 11-column schema
- AC-5 (egress isolation)    → static config test + runtime probe
                                in Docker-bound smoke
- AC-6 (Tier-2 contract)     → tegrastats + jtop parser unit tests
                                + jetson/* layout test; full Tier-2
                                contract is AZ-444
- AC-7 (fixture reproducibility) → deferred to AZ-407 per task spec
- AC-8 (parametrize matrix)  → vins_mono skip-rule cases +
                                tests/positive/test_smoke
- AC-9 (skip semantics)      → 9 conftest skip-rule unit tests
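
The skip semantics behind AC-9 can be pictured as a pure decision function. This is only a sketch: the marker names come from the conftest description above, but the function `skip_reason`, the axis values ("tier1", "tier2"), and the exact meaning assigned to each marker are assumptions rather than the shipped conftest.py logic.

```python
from typing import Optional


def skip_reason(
    markers: set[str],
    tier: str,
    vio_strategy: str,
    in_chamber: bool = False,
) -> Optional[str]:
    """Return a skip reason for the current parametrize axes, or None to run."""
    if "tier2_only" in markers and tier != "tier2":
        return "requires Tier-2 (Jetson)"
    if "chamber_only" in markers and not in_chamber:
        return "requires the environmental chamber"
    # Assumed meaning: the test applies only to the vins_mono strategy axis.
    if "vins_mono" in markers and vio_strategy != "vins_mono":
        return "applies to vio_strategy=vins_mono only"
    # Assumed meaning: a deferred AC is skipped unless it opts into XFAIL elsewhere.
    if "deferred_ac" in markers:
        return "acceptance criterion deferred"
    return None


print(skip_reason({"tier2_only"}, tier="tier1", vio_strategy="vins_mono"))
print(skip_reason(set(), tier="tier1", vio_strategy="vins_mono"))
```

In a real conftest.py the returned reason would typically be attached via `pytest.mark.skip(reason=...)` during collection.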

Module layout entry for blackbox_tests was added in the 2026-05-16
preparatory commit d7a17a8 so this diff stays focused on the harness
scaffold. AZ-406 advances to In Testing on commit.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-16 16:22:44 +03:00
parent d7a17a8248
commit 59d9116d36
72 changed files with 3515 additions and 6 deletions
@@ -0,0 +1,7 @@
# age-injector
Mutates `tile-cache-fixture` manifest dates → `synth-age-tile-set` for
FT-N-05 / FT-N-06 (stale-tile rejection on freshness violation).
Delivered by **AZ-407** (Static fixture builders). AZ-406 commits to the
directory location + name only.
@@ -0,0 +1,8 @@
# cold-boot-fixture
Static JSON fixture loaded by FT-P-11 (cold-start init) and NFT-PERF-03
(cold-start TTFF). Schema mirror lives in
`e2e/fixtures/injectors/cold_boot.py` (`ColdBootFixture`).
AZ-419 produces `cold_boot_fixture.json` here. AZ-406 commits to the
directory location only.
@@ -0,0 +1,14 @@
"""Runtime synthetic-injection fixture builders.
Each module here generates a per-test tmpfs fixture for a specific
negative-path scenario:
- outlier.py — outlier-injection-derkachi (FT-N-01)
- blackout_spoof.py — blackout-spoof-derkachi (FT-N-04, NFT-RES-04)
- multi_segment.py — multi-segment-derkachi (FT-P-08)
- cold_boot.py — cold-boot-fixture (FT-P-11, NFT-PERF-03)
AZ-406 supplies the package layout + public function signatures; concrete
generators are delivered by **AZ-408** (Runtime synthetic-injection fixture
builders).
"""
@@ -0,0 +1,27 @@
"""blackout-spoof-derkachi — visual blackout + spoofed GPS combination (FT-N-04, NFT-RES-04).

Concrete generator is owned by AZ-408. AZ-406 commits to the public
signature.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class BlackoutSpoofPlan:
    """Configuration for the blackout-spoof-derkachi fixture.

    `blackout_seconds` corresponds to the 5 / 15 / 35 s window family from
    NFT-RES-04 (35 s escalation ladder) and FT-N-04 (blackout + spoof).
    """

    blackout_seconds: float
    spoof_offset_m: float
    spoof_bearing_deg: float


def build(plan: BlackoutSpoofPlan, out_root: Path) -> Path:
    raise NotImplementedError("Owned by AZ-408 — AZ-406 supplies only the contract.")
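
As a usage sketch for the contract above, a test could enumerate the 5 / 15 / 35 s window family as a list of plans. The dataclass is repeated so the snippet is self-contained; `WINDOWS_S`, `plan_family`, and the spoof values are hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class BlackoutSpoofPlan:  # same fields as the contract above
    blackout_seconds: float
    spoof_offset_m: float
    spoof_bearing_deg: float


WINDOWS_S = (5.0, 15.0, 35.0)  # window family named in the docstring


def plan_family(spoof_offset_m: float, spoof_bearing_deg: float) -> list[BlackoutSpoofPlan]:
    """One plan per blackout window, sharing the same spoof vector."""
    return [BlackoutSpoofPlan(w, spoof_offset_m, spoof_bearing_deg) for w in WINDOWS_S]


plans = plan_family(spoof_offset_m=200.0, spoof_bearing_deg=45.0)  # illustrative values
try:
    plans[0].blackout_seconds = 1.0  # frozen dataclass: mutation is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
print(plans, mutated)
```

Because the plan is frozen, a parametrized test can share the instances across cases without one case altering another's inputs.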
@@ -0,0 +1,26 @@
"""cold-boot-fixture — frozen FC pose snapshot (FT-P-11, NFT-PERF-03).

The cold-boot fixture is a static JSON file (not generated at runtime);
its concrete schema is owned by AZ-419 (FT-P-11) + AZ-430 (NFT-PERF-03 TTFF).
AZ-406 commits to the file location only.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ColdBootFixture:
    """Mirror of the JSON shape stored at ``cold-boot/cold_boot_fixture.json``."""

    lat_deg: float
    lon_deg: float
    alt_m: float
    yaw_deg: float
    last_valid_fix_age_s: float


def load(fixture_path: Path) -> ColdBootFixture:
    raise NotImplementedError("Owned by AZ-419 — AZ-406 commits to the location only.")
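
One plausible shape for the eventual loader, shown only as a sketch: AZ-419 owns the real implementation and schema, so `load_sketch`, the strict key check, and the sample values are all assumptions.

```python
import json
import tempfile
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass(frozen=True)
class ColdBootFixture:  # same fields as the mirror above
    lat_deg: float
    lon_deg: float
    alt_m: float
    yaw_deg: float
    last_valid_fix_age_s: float


def load_sketch(fixture_path: Path) -> ColdBootFixture:
    """Read the JSON file and require exactly the dataclass's field names."""
    raw = json.loads(fixture_path.read_text(encoding="utf-8"))
    expected = {f.name for f in fields(ColdBootFixture)}
    if set(raw) != expected:
        raise ValueError(f"unexpected or missing keys: {sorted(set(raw) ^ expected)}")
    return ColdBootFixture(**{key: float(value) for key, value in raw.items()})


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cold_boot_fixture.json"
    sample = {"lat_deg": 50.0, "lon_deg": 36.0, "alt_m": 120.0,
              "yaw_deg": 90.0, "last_valid_fix_age_s": 4.5}  # made-up values
    path.write_text(json.dumps(sample), encoding="utf-8")
    fixture = load_sketch(path)
print(fixture)
```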
@@ -0,0 +1,20 @@
"""multi-segment-derkachi — ≥3 disconnected segments via satellite re-loc (FT-P-08).

Concrete generator is owned by AZ-408. AZ-406 commits to the public
signature.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class MultiSegmentPlan:
    n_segments: int = 3
    gap_seconds: float = 12.0


def build(plan: MultiSegmentPlan, out_root: Path) -> Path:
    raise NotImplementedError("Owned by AZ-408 — AZ-406 supplies only the contract.")
@@ -0,0 +1,24 @@
"""outlier-injection-derkachi — injects up to 350 m position outliers (FT-N-01).

Concrete generator is owned by AZ-408. AZ-406 commits to the public
signature so test specs can plan against it.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class OutlierInjectionPlan:
    """Configuration for the outlier-injection-derkachi fixture."""

    target_segment_seconds: tuple[float, float]
    max_offset_m: float = 350.0
    n_outliers: int = 5


def build(plan: OutlierInjectionPlan, out_root: Path) -> Path:
    """Generate the fixture under ``out_root``. Returns the produced directory."""
    raise NotImplementedError("Owned by AZ-408 — AZ-406 supplies only the contract.")
@@ -0,0 +1,31 @@
# Mock Suite Satellite Service — stubs the parent-suite ingest API for blackbox tests.
#
# Behaviour spec: _docs/02_tasks/todo/AZ-406_test_infrastructure.md § Mock Services
# Contract sketch: _docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md
# NFT-SEC-01 cross-check: the accepted-fields shape MUST match the contract sketch.
FROM python:3.12-slim-bookworm
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PIP_NO_CACHE_DIR=1
WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends curl \
&& rm -rf /var/lib/apt/lists/*
COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r /app/requirements.txt
COPY app.py /app/app.py
ENV MOCK_SUITE_SAT_AUDIT_PATH=/audit
RUN mkdir -p /audit
EXPOSE 8080
HEALTHCHECK --interval=5s --timeout=2s --retries=12 \
CMD curl -fsS http://localhost:8080/mock/health || exit 1
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080", "--log-level", "info"]
@@ -0,0 +1,163 @@
"""Mock Suite Satellite Service — FastAPI ingest stub for blackbox tests.

Endpoints:
    POST /tiles         — main ingest. Returns 202 on well-formed tile,
                          400 on malformed; appends to the run audit log.
    GET  /tiles/audit   — read-back of the per-run audit log (JSONL).
    POST /mock/config   — test-time behaviour control (force an error status,
                          add response latency).
    GET  /mock/audit    — alias of /tiles/audit with optional ?run_id filter.
    POST /mock/reset    — clears the audit log between tests for isolation.
    GET  /mock/health   — Docker healthcheck.

The accepted ingest schema is the contract sketch from
`_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md`.
NFT-SEC-01 asserts the schema's accepted-fields match that sketch.
"""

from __future__ import annotations

import os
import time
import uuid
from pathlib import Path
from typing import Annotated, Literal

import orjson
from fastapi import FastAPI, HTTPException, Query
from fastapi.exceptions import RequestValidationError
from fastapi.responses import ORJSONResponse, PlainTextResponse
from pydantic import BaseModel, Field

AUDIT_ROOT = Path(os.environ.get("MOCK_SUITE_SAT_AUDIT_PATH", "/audit"))
AUDIT_ROOT.mkdir(parents=True, exist_ok=True)

app = FastAPI(
    title="mock-suite-sat-service",
    version="0.1.0",
    description="Deterministic stub of the parent Suite Satellite Service.",
    default_response_class=ORJSONResponse,
)


# ---------------------------------------------------------------------------
# Behaviour control (test-only)
# ---------------------------------------------------------------------------


class _MockConfig(BaseModel):
    force_status: int | None = Field(default=None, description="Force this status on every ingest.")
    simulated_latency_ms: int = 0


_config = _MockConfig()


# ---------------------------------------------------------------------------
# Ingest schema (mirror of the contract sketch — keep them in sync)
# ---------------------------------------------------------------------------


class TileQualityMetadata(BaseModel):
    capture_utc: str
    source_provider: Literal["maxar", "planet", "sentinel-2", "skywatch", "operator-supplied"]
    resolution_m_per_px: float = Field(gt=0, le=10.0)
    cloud_coverage_pct: float = Field(ge=0, le=100)
    geo_accuracy_m: float = Field(ge=0)


class TilePublishRequest(BaseModel):
    tile_id: str = Field(min_length=8, max_length=128)
    bbox_wgs84: tuple[float, float, float, float]
    zoom_level: int = Field(ge=10, le=22)
    descriptor_sha256: str = Field(min_length=64, max_length=64)
    payload_size_bytes: int = Field(gt=0)
    quality: TileQualityMetadata


# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------


def _run_audit_path(run_id: str) -> Path:
    # Keep only [alnum-_] so a caller-supplied run_id cannot escape AUDIT_ROOT.
    safe = "".join(c for c in run_id if c.isalnum() or c in "-_") or "default"
    return AUDIT_ROOT / f"{safe}.jsonl"


def _append_audit(run_id: str, entry: dict[str, object]) -> None:
    entry = {**entry, "received_at_unix": time.time(), "entry_id": str(uuid.uuid4())}
    path = _run_audit_path(run_id)
    with path.open("ab") as fh:
        fh.write(orjson.dumps(entry))
        fh.write(b"\n")


# ---------------------------------------------------------------------------
# Routes
# ---------------------------------------------------------------------------


@app.get("/mock/health")
def health() -> dict[str, str]:
    return {"status": "ok"}


@app.post("/tiles", status_code=202)
def publish_tile(
    request: TilePublishRequest,
    run_id: Annotated[str, Query(alias="run_id")] = "default",
) -> dict[str, object]:
    if _config.simulated_latency_ms > 0:
        time.sleep(_config.simulated_latency_ms / 1000.0)
    if _config.force_status is not None and _config.force_status >= 400:
        raise HTTPException(
            status_code=_config.force_status,
            detail=f"forced status by /mock/config (current force_status={_config.force_status})",
        )
    _append_audit(
        run_id,
        {
            "tile_id": request.tile_id,
            "bbox_wgs84": list(request.bbox_wgs84),
            "zoom_level": request.zoom_level,
            "descriptor_sha256": request.descriptor_sha256,
            "payload_size_bytes": request.payload_size_bytes,
            "quality": request.quality.model_dump(),
        },
    )
    return {"accepted": True, "tile_id": request.tile_id, "run_id": run_id}


# FastAPI reports request validation failures as RequestValidationError
# (422 by default); remap them to the 400 promised in the module docstring.
@app.exception_handler(RequestValidationError)
def on_validation_error(_request, exc: RequestValidationError) -> ORJSONResponse:  # type: ignore[no-untyped-def]
    return ORJSONResponse(status_code=400, content={"detail": exc.errors()})


@app.get("/tiles/audit")
@app.get("/mock/audit")
def get_audit(run_id: Annotated[str, Query(alias="run_id")] = "default") -> ORJSONResponse:
    path = _run_audit_path(run_id)
    if not path.exists():
        return ORJSONResponse(content={"run_id": run_id, "entries": []})
    entries = []
    with path.open("rb") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            entries.append(orjson.loads(line))
    return ORJSONResponse(content={"run_id": run_id, "entries": entries})


@app.post("/mock/config")
def update_config(config: _MockConfig) -> _MockConfig:
    global _config
    _config = config
    return _config


@app.post("/mock/reset")
def reset(run_id: Annotated[str, Query(alias="run_id")] = "default") -> PlainTextResponse:
    path = _run_audit_path(run_id)
    if path.exists():
        path.unlink()
    return PlainTextResponse("reset")
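
The run_id filter in `_run_audit_path` is what keeps a caller-supplied query parameter from escaping the audit directory. The standalone sketch below repeats that expression with a hard-coded root (an assumption; the app reads it from `MOCK_SUITE_SAT_AUDIT_PATH`) to show how a few inputs map to file names.

```python
from pathlib import Path

AUDIT_ROOT = Path("/audit")  # hard-coded here; the app takes it from the environment


def run_audit_path(run_id: str) -> Path:
    # Same filter as the mock: keep alphanumerics plus '-' and '_', else fall back.
    safe = "".join(c for c in run_id if c.isalnum() or c in "-_") or "default"
    return AUDIT_ROOT / f"{safe}.jsonl"


for candidate in ("nightly_2026-05-16", "../../etc/passwd", "///"):
    print(repr(candidate), "->", run_audit_path(candidate).name)
```

One consequence of dropping characters rather than rejecting them is that distinct run IDs such as `a/b` and `ab` share one audit file, so runners are better off using IDs that are already within the allowed alphabet.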
@@ -0,0 +1,4 @@
fastapi>=0.111,<0.120
uvicorn[standard]>=0.30,<0.40
pydantic>=2.5,<3.0
orjson>=3.9,<4.0
@@ -0,0 +1,11 @@
# Runner-side secrets fixtures (TEST ONLY)
These files are loaded by pymavlink / msp_gps_toy when the runner needs
to participate in a signed-message handshake (FT-P-09-AP, NFT-SEC-03).
The bytes here match the Docker-secret value at
`e2e/docker/secrets/mavlink_passkey`. **Both files MUST be kept in sync.**
Production deployments never see either file — the production passkey is
provisioned via a real secret store at deploy time per `environment.md`
§ Communication with system under test.
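
Since the two passkey copies must stay in sync, a small guard along these lines could catch drift early. This is a sketch under assumptions: `passkeys_in_sync` is a hypothetical helper, and ignoring surrounding whitespace (for example a trailing newline) is a choice made here, not something the harness specifies.

```python
import hmac
import tempfile
from pathlib import Path


def passkeys_in_sync(runner_copy: Path, docker_secret: Path) -> bool:
    """Compare both files byte-for-byte, ignoring leading/trailing whitespace."""
    a = runner_copy.read_bytes().strip()
    b = docker_secret.read_bytes().strip()
    return hmac.compare_digest(a, b)


# Demonstration with throwaway files rather than the real fixture paths.
with tempfile.TemporaryDirectory() as tmp:
    runner_copy = Path(tmp) / "runner_passkey"
    docker_secret = Path(tmp) / "docker_passkey"
    runner_copy.write_text("0123abcd\n")
    docker_secret.write_text("0123abcd")
    in_sync = passkeys_in_sync(runner_copy, docker_secret)
    docker_secret.write_text("deadbeef")
    drifted = passkeys_in_sync(runner_copy, docker_secret)
print(in_sync, drifted)
```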
@@ -0,0 +1 @@
0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
@@ -0,0 +1,5 @@
# Security fixtures
Hosts the crafted artifacts consumed by NFT-SEC-* scenarios. AZ-406
delivers the directory + generator scaffold; concrete fixture content is
delivered by the consuming security tasks (AZ-439 for the CVE JPEG).
@@ -0,0 +1,43 @@
"""Programmatically generate the crafted JPEG fixture for CVE-2025-53644.

Per AZ-406 § Risk 5 — the upstream PoC JPEG has unclear redistribution
terms, so the e2e harness generates a structurally equivalent file from
scratch rather than committing copyrighted bytes.

The fixture is consumed by NFT-SEC-04 (OpenCV CVE-2025-53644 +
AddressSanitizer fuzz). The intent is NOT to reproduce the exact RCE; it
is to provide a malformed JPEG with the structural features the CVE
exploits (oversized DHT segment, truncated SOS marker) so the SUT's
hardened OpenCV path (>= 4.12.0) rejects it.

AZ-406 commits to the generator's existence + signature; AZ-439
(NFT-SEC-04) supplies the byte-level details and validates the generated
file actually triggers the CVE code path against opencv 4.11.x (control)
vs 4.12+ (mitigated).
"""

from __future__ import annotations

from pathlib import Path


def generate(out_path: Path) -> Path:
    """Write a malformed JPEG to ``out_path``. Returns the path on success.

    Raises NotImplementedError until AZ-439 supplies the byte template.
    Tests that need the crafted fixture should mark themselves
    @pytest.mark.skip(reason="awaiting AZ-439") until then.
    """
    raise NotImplementedError(
        "generate_cve_jpeg.generate is owned by AZ-439 — AZ-406 commits "
        "to the public signature only."
    )


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="Generate CVE-2025-53644 fixture JPEG.")
    parser.add_argument("out", type=Path, nargs="?", default=Path("cve-2025-53644.jpg"))
    args = parser.parse_args()
    generate(args.out)
@@ -0,0 +1,15 @@
# tile-cache-builder
Builds the `tile-cache-fixture` Docker volume from the 60 still-image
satellite references in `_docs/00_problem/input_data/` plus the Derkachi
route bbox.
This directory currently contains only the structural placeholder; the
concrete builder (Dockerfile + build script + FAISS HNSW index emitter +
manifest writer + reproducibility assertion) is delivered by **AZ-407**
(Static fixture builders) — see AC-7 ("Fixture builders are reproducible")
in `_docs/02_tasks/todo/AZ-406_test_infrastructure.md`.
AZ-406 commits to the directory's location + name only. Do NOT delete this
README before AZ-407 lands; the `e2e_unit_test_directory_layout` unit test
asserts the placeholder is present.
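
To illustrate what a reproducibility assertion for a fixture builder could check, here is a sketch that hashes a directory tree in a deterministic order so two builds can be compared by digest. `tree_digest` and the sample file names are hypothetical; the real assertion is AZ-407's to define.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root: Path) -> str:
    """SHA-256 over (relative path, content) pairs, visited in sorted path order."""
    digest = hashlib.sha256()
    files = sorted(
        (p.relative_to(root).as_posix(), p) for p in root.rglob("*") if p.is_file()
    )
    for rel, path in files:
        data = path.read_bytes()
        # Length-prefix each field so adjacent fields cannot be confused.
        for chunk in (rel.encode("utf-8"), data):
            digest.update(len(chunk).to_bytes(8, "big"))
            digest.update(chunk)
    return digest.hexdigest()


def _populate(root: Path, files) -> None:
    for rel, data in files:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)


files = [("manifest.json", b"{}"), ("tiles/0001.bin", b"\x00\x01")]  # made-up content
with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    _populate(Path(a), files)
    _populate(Path(b), list(reversed(files)))  # different write order, same content
    same = tree_digest(Path(a)) == tree_digest(Path(b))
    (Path(b) / "manifest.json").write_bytes(b"{ }")
    differs = tree_digest(Path(a)) != tree_digest(Path(b))
print(same, differs)
```

The digest deliberately ignores timestamps and file modes, so it only detects differences in names and bytes.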