gps-denied-onboard/e2e
Oleksandr Bezdieniezhnykh 6599d828d2 [AZ-407] [AZ-444] [AZ-445] Batch 68: fixtures, Tier-2 harness, NFR reporter
Three blackbox-harness tasks landed together — all depend only on
AZ-406 and unblock the FT-* / NFT-* scenario tasks scheduled for
batches 69+.

AZ-407 — Static fixture builders (3pt):
  * tile-cache-builder/{builder.py, Dockerfile, build.sh} produces a
    deterministic tile-cache-fixture Docker volume from
    _docs/00_problem/input_data/. Reproducibility primitives: sorted
    iteration, frozen PIL JPEG settings, FAISS HNSW32 built single-
    threaded with seeded stub descriptors.
  * age-injector/{age_injector.py, inject.sh} clones the volume and
    shifts capture_date by N×30.44 days; tile JPEG bytes preserved
    bit-identical. Emits synth-age-7mo + synth-age-13mo volumes.
  * cold-boot/cold_boot_fixture.json: frozen FC pose snapshot at
    Derkachi sector centre, schema v1.
  * secrets/mavlink-test-passkey.txt: 64-hex with required
    `# TEST ONLY` header line per AC-5. Passkey-equality test now
    compares the secret line after stripping the header.
  * security/cve-2025-53644.jpg: synthetic 158-byte malformed JPEG
    (truncated SOS marker). OpenCV 4.11.x rejects gracefully with
    imdecode → None. AZ-439 will sharpen for ASan instrumentation.
  * Top-level Makefile with `make fixtures` / `make fixtures-*` /
    `make e2e-tier1*` / `make unit-tests` targets.
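
The age-injector's date shift described above can be illustrated with a minimal sketch. The function name and the direction of the shift (moving capture_date into the past so the tiles look older) are assumptions made for illustration; only the N×30.44-day step comes from the commit message.

```python
from datetime import datetime, timedelta, timezone

DAYS_PER_MONTH = 30.44  # mean month length quoted in the commit message


def shift_capture_date(capture_date: datetime, months: int) -> datetime:
    """Return capture_date moved `months` mean-months into the past.

    Hypothetical helper: the real age_injector.py may differ in name,
    signature, and shift direction.
    """
    return capture_date - timedelta(days=months * DAYS_PER_MONTH)


# Example capture date chosen arbitrarily for the sketch.
original = datetime(2025, 6, 1, tzinfo=timezone.utc)
aged_7mo = shift_capture_date(original, 7)    # cf. the synth-age-7mo volume
aged_13mo = shift_capture_date(original, 13)  # cf. the synth-age-13mo volume
```

Only the metadata field moves in this sketch; the tile JPEG bytes are never touched, which matches the bit-identical guarantee stated above.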

AZ-444 — Tier-2 Jetson harness wrapper (5pt):
  * run-tier2.sh rewritten as orchestrator. Detects local
    (aarch64 + TIER2_HOST=localhost) vs remote (ssh into TIER2_HOST).
    New flags: -k/--selector, --build-kind production|asan,
    --reflash (gated behind TIER2_REFLASH_ACK=1 two-key gate),
    --dry-run.
  * tier2-on-jetson.sh (new) — on-device delegate. Verifies
    gps-denied-onboard{,-asan}.service health; restarts with 5s
    tolerance; spawns tegrastats + jtop parallel samplers; tails
    ASan unit's journal in asan mode; drives docker compose with
    TIER=tier2-jetson; forwards SELECTOR to pytest -k.
  * docker/run-tier1.sh (new) — selector-parity sibling.
  * AC-1 (selector parity) and AC-6 (reflash gating) unit-tested via
    --dry-run output assertions. AC-2/AC-3/AC-4/AC-5 are hardware-
    loop ACs verified by the Tier-2 runtime smoke (no Jetson in the
    unit-test layer).
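
The local/remote detection and the two-key reflash gate can be sketched as two pure functions. run-tier2.sh itself is a shell script; this Python rendering, including both function names, is a hypothetical restatement of the rules listed above rather than the script's actual code.

```python
def resolve_mode(machine: str, tier2_host: str) -> str:
    """Local only when running on aarch64 with TIER2_HOST=localhost."""
    if machine == "aarch64" and tier2_host == "localhost":
        return "local"
    return "remote"  # the orchestrator would ssh into TIER2_HOST


def reflash_allowed(reflash_flag: bool, env: dict) -> bool:
    """Two-key gate: --reflash AND TIER2_REFLASH_ACK=1 must both be present."""
    return reflash_flag and env.get("TIER2_REFLASH_ACK") == "1"
```

Keeping the decisions free of side effects is what makes them checkable through --dry-run output, as AC-1 and AC-6 are. How the real script reacts to --reflash without the acknowledgement (error or silent skip) is not stated in the commit message.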

AZ-445 — CSV reporter + evidence bundler refinements (2pt):
  * reporting/nfr_recorder.py (new) — pytest plugin. Provides the
    `nfr_recorder` fixture with record_metric(name, value, ac_id)
    and partial(ac_id, reason). At session end emits:
      - per-nfr/<scenario_id>.json (AC-1)
      - traceability-status.json with every AC ID parsed from
        traceability-matrix.md, classified Covered/PARTIAL/NOT
        COVERED with source scenario IDs (AC-2)
      - regression-baseline.json with all numeric metrics (AC-3)
  * csv_reporter.py extended — `_outcome_to_result` consults the
    aggregator; rows flip PASS → PARTIAL when an AC was marked
    PARTIAL by nfr_recorder (AC-4). Graceful fallback when
    aggregator isn't registered (unit-test contexts).
  * conftest.py registers nfr_recorder in pytest_plugins.
  * New --traceability-matrix CLI flag seeds the NOT COVERED rows.
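
A minimal sketch of the classification behind traceability-status.json follows. The class name, the in-memory layout, and the precedence rule (PARTIAL wins over Covered; an AC counts as Covered once any metric is recorded against it) are assumptions for illustration; only the record_metric / partial signatures and the three status labels come from the commit message. Parsing traceability-matrix.md is left out, so the known AC IDs are passed in directly.

```python
from collections import defaultdict


class NfrAggregator:
    """Hypothetical stand-in for the aggregator behind nfr_recorder."""

    def __init__(self, known_ac_ids):
        self.known_ac_ids = list(known_ac_ids)  # would come from the matrix
        self.metrics = defaultdict(list)        # ac_id -> [(name, value)]
        self.partials = {}                      # ac_id -> reason

    def record_metric(self, name, value, ac_id):
        self.metrics[ac_id].append((name, value))

    def partial(self, ac_id, reason):
        self.partials[ac_id] = reason

    def status(self):
        """Classify every known AC ID, including ones no test touched."""
        result = {}
        for ac_id in self.known_ac_ids:
            if ac_id in self.partials:
                result[ac_id] = "PARTIAL"
            elif ac_id in self.metrics:
                result[ac_id] = "Covered"
            else:
                result[ac_id] = "NOT COVERED"
        return result
```

Seeding the known IDs up front is what lets untouched ACs appear as NOT COVERED instead of being silently absent from the report.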

Build / config:
  * pyproject.toml dev extras: added Pillow>=10.4,<13.0 for the
    tile-cache-builder unit test (broad enough to keep torchvision's
    Pillow 12 pin happy; the production builder runs inside its own
    Docker image with its own pin).
  * Updated test_directory_layout.py to cover 10 new files + replaced
    the byte-equal passkey assertion with the header-stripping
    variant.

Test results:
  * 157 focused tests pass (was 97 in batch 67; +60 new across this
    batch). No regressions.

Module-layout / spec drift:
  * AZ-407 spec text says `tests/fixtures/...`; module-layout
    blackbox_tests entry (commit d7a17a8) authoritatively places the
    harness under `e2e/`. Implementation followed the layout entry.
  * AZ-444 spec mentions `e2e/tier2/run-tier2.sh`; AZ-406 placed it
    at `e2e/jetson/run-tier2.sh`. Kept at `e2e/jetson/` for
    consistency.
  * Cold-boot README ownership: corrected from AZ-419 to AZ-407 per
    AZ-419's own Dependencies field.

Specs archived to _docs/02_tasks/done/. Jira tickets transitioned to
In Testing on commit.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-16 17:18:01 +03:00

Blackbox Test Harness (e2e/)

This directory is the public-boundary test harness for gps-denied-onboard. It is owned by the blackbox_tests cross-cutting entry in _docs/02_document/module-layout.md and implements task AZ-406 (Test Infrastructure Bootstrap) plus its downstream test-task siblings (AZ-407..AZ-446).

The harness runs in two execution tiers (environment.md § Two-tier execution profile):

  • Tier-1 (workstation Docker): cd e2e/docker && docker compose -f docker-compose.test.yml up --build --abort-on-container-exit e2e-runner
  • Tier-2 (Jetson Orin Nano Super hardware loop): ./e2e/jetson/run-tier2.sh --fc-adapter <ardupilot|inav> --vio-strategy <okvis2|klt_ransac>

Both tiers emit the same CSV report format (one row per test) per environment.md § Reporting.

Layout

e2e/
├── docker/        Tier-1 entrypoint (docker-compose.test.yml + Tier-2 bridge override + secrets mount)
├── jetson/        Tier-2 entrypoint (run-tier2.sh + systemd unit + tegrastats/jtop parsers)
├── runner/        e2e-runner image (Dockerfile, conftest, pytest plugins, helpers, requirements)
├── fixtures/      Fixture builders (tile-cache, age-injector, injectors/, mock-suite-sat, secrets, security)
├── tests/         Pytest target — `positive/`, `negative/`, `performance/`, `resilience/`, `security/`, `resource_limit/`
└── _unit_tests/   Out-of-container unit tests for the harness internals (run as part of the project test suite)

Public-Boundary Discipline (hard rule)

The e2e-runner image MUST NOT import any module from the SUT source tree (src/gps_denied_onboard/**). The only legal interaction surfaces are:

  • MAVLink (ArduPilot SITL — UDP 14550)
  • MSP2 (iNav SITL — TCP 5760)
  • HTTP/JSON (mock-suite-sat-service — port 8080)
  • Filesystem read of the FDR archive after a run (fdr-output volume)

This rule is enforced by:

  1. The runner Dockerfile building from a base image that does NOT install the SUT package.
  2. Layout discipline: no import gps_denied_onboard.* in any file under e2e/.
  3. Compose e2e-net.internal: true — no external network egress (RESTRICT-SAT-1, NFT-SEC-02).

See _docs/02_document/tests/environment.md for the full per-service spec.
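
The layout rule (no import of gps_denied_onboard anywhere under e2e/) lends itself to an automated check. The sketch below is one possible way to detect violations with the standard-library ast module; the function name is hypothetical and the harness's own enforcement may work differently.

```python
import ast

FORBIDDEN_ROOT = "gps_denied_onboard"


def forbidden_imports(source: str) -> list[str]:
    """Return the imported module names in `source` that reach into the SUT."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # Absolute "from X import ..." only; relative imports stay in e2e/.
            names = [node.module or ""]
        else:
            continue
        hits += [
            name for name in names
            if name == FORBIDDEN_ROOT or name.startswith(FORBIDDEN_ROOT + ".")
        ]
    return hits
```

A unit test could apply this to the text of every *.py file under e2e/ and fail on the first non-empty result. Parsing the syntax tree avoids false positives from comments or strings that merely mention the package name.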

RUN_ID and report paths

Each invocation is identified by RUN_ID. In development it defaults to local-${USER}-${EPOCH}; CI sets it from the workflow run id. Reports land at:

  • e2e-results/run-${RUN_ID}/report.csv
  • e2e-results/run-${RUN_ID}/evidence/ (per-run .tlog, FDR archives, screenshots, profiler traces, tegrastats CSV, jtop CSV)

The e2e-results/ directory is gitignored.
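
As a sketch of how the paths above could be derived, assuming that EPOCH means Unix seconds and that the helper names below are hypothetical:

```python
import getpass
import os
import time
from pathlib import Path


def resolve_run_id(env=os.environ) -> str:
    """Use RUN_ID when set, else the development default local-<user>-<epoch>."""
    run_id = env.get("RUN_ID")
    if run_id:
        return run_id
    user = env.get("USER") or getpass.getuser()
    return f"local-{user}-{int(time.time())}"  # assumption: EPOCH = Unix seconds


def report_paths(run_id: str, root: Path = Path("e2e-results")):
    """Return (report.csv path, evidence directory) for one run."""
    run_dir = root / f"run-{run_id}"
    return run_dir / "report.csv", run_dir / "evidence"
```

For example, report_paths("ci-42") yields e2e-results/run-ci-42/report.csv and e2e-results/run-ci-42/evidence.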

How to add a new blackbox scenario

  1. Decompose the scenario into a task spec under _docs/02_tasks/todo/.
  2. Implement the test under the appropriate e2e/tests/<category>/ folder.
  3. The conftest's session-scoped (fc_adapter, vio_strategy) parameterization applies automatically; to opt out, override it with @pytest.mark.parametrize.
  4. Trace the scenario to the AC/RESTRICT IDs it exercises via the traces_to pytest marker — the CSV reporter emits this verbatim.

How to add a new fixture builder

Fixture builders live under e2e/fixtures/ and may be standalone Python modules (for runtime injectors) or Dockerized helpers (for tile-cache / mock-suite-sat). Each builder must:

  • Be reproducible — given the same input, produce bit-identical output.
  • Document its output volume / path in _docs/02_document/tests/test-data.md.
  • Have a corresponding unit test under e2e/_unit_tests/fixtures/.
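
One way a unit test could check the bit-identical requirement is to run a builder twice into separate directories and compare a digest of each output tree. The helper below is a hypothetical sketch of that idea, not the harness's own code.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """SHA-256 over relative paths and file bytes, visited in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so that field boundaries are unambiguous.
        digest.update(len(rel).to_bytes(8, "big") + rel)
        digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()
```

Two builds of the same input are then reproducible when tree_digest(first_output) == tree_digest(second_output). Sorting the paths matters because directory iteration order is not guaranteed to be stable across filesystems.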

Out-of-container unit tests

The harness's internal Python — CSV reporter, helpers, parsers, mock app, conftest skip rules — is unit-tested under e2e/_unit_tests/. These tests do NOT require Docker, SITL, or any external service and run as part of the project's main pytest invocation (testpaths extension in pyproject.toml).