gps-denied-onboard/_docs/02_tasks/done/AZ-444_tier2_jetson_harness.md
Oleksandr Bezdieniezhnykh 6599d828d2 [AZ-407] [AZ-444] [AZ-445] Batch 68: fixtures, Tier-2 harness, NFR reporter
Three blackbox-harness tasks landed together — all depend only on
AZ-406 and unblock the FT-* / NFT-* scenario tasks scheduled for
batches 69+.

AZ-407 — Static fixture builders (3pt):
  * tile-cache-builder/{builder.py, Dockerfile, build.sh} produces a
    deterministic tile-cache-fixture Docker volume from
    _docs/00_problem/input_data/. Reproducibility primitives: sorted
    iteration, frozen PIL JPEG settings, FAISS HNSW32 built single-
    threaded with seeded stub descriptors.
  * age-injector/{age_injector.py, inject.sh} clones the volume and
    shifts capture_date by N×30.44 days; tile JPEG bytes preserved
    bit-identical. Emits synth-age-7mo + synth-age-13mo volumes.
  * cold-boot/cold_boot_fixture.json: frozen FC pose snapshot at
    Derkachi sector centre, schema v1.
  * secrets/mavlink-test-passkey.txt: 64-hex with required
    `# TEST ONLY` header line per AC-5. Passkey-equality test now
    compares the secret line after stripping the header.
  * security/cve-2025-53644.jpg: synthetic 158-byte malformed JPEG
    (truncated SOS marker). OpenCV 4.11.x rejects gracefully with
    imdecode → None. AZ-439 will sharpen for ASan instrumentation.
  * Top-level Makefile with `make fixtures` / `make fixtures-*` /
    `make e2e-tier1*` / `make unit-tests` targets.

AZ-444 — Tier-2 Jetson harness wrapper (5pt):
  * run-tier2.sh rewritten as orchestrator. Detects local
    (aarch64 + TIER2_HOST=localhost) vs remote (ssh into TIER2_HOST).
    New flags: -k/--selector, --build-kind production|asan,
    --reflash (gated behind TIER2_REFLASH_ACK=1 two-key gate),
    --dry-run.
  * tier2-on-jetson.sh (new) — on-device delegate. Verifies
    gps-denied-onboard{,-asan}.service health; restarts with 5s
    tolerance; spawns tegrastats + jtop parallel samplers; tails
    ASan unit's journal in asan mode; drives docker compose with
    TIER=tier2-jetson; forwards SELECTOR to pytest -k.
  * docker/run-tier1.sh (new) — selector-parity sibling.
  * AC-1 (selector parity) and AC-6 (reflash gating) unit-tested via
    --dry-run output assertions. AC-2/AC-3/AC-4/AC-5 are hardware-
    loop ACs verified by the Tier-2 runtime smoke (no Jetson in the
    unit-test layer).

AZ-445 — CSV reporter + evidence bundler refinements (2pt):
  * reporting/nfr_recorder.py (new) — pytest plugin. Provides the
    `nfr_recorder` fixture with record_metric(name, value, ac_id)
    and partial(ac_id, reason). At session end emits:
      - per-nfr/<scenario_id>.json (AC-1)
      - traceability-status.json with every AC ID parsed from
        traceability-matrix.md, classified Covered/PARTIAL/NOT
        COVERED with source scenario IDs (AC-2)
      - regression-baseline.json with all numeric metrics (AC-3)
  * csv_reporter.py extended — `_outcome_to_result` consults the
    aggregator; rows flip PASS → PARTIAL when an AC was marked
    PARTIAL by nfr_recorder (AC-4). Graceful fallback when
    aggregator isn't registered (unit-test contexts).
  * conftest.py registers nfr_recorder in pytest_plugins.
  * New --traceability-matrix CLI flag seeds the NOT COVERED rows.

Build / config:
  * pyproject.toml dev extras: added Pillow>=10.4,<13.0 for the
    tile-cache-builder unit test (broad enough to keep torchvision's
    Pillow 12 pin happy; the production builder runs inside its own
    Docker image with its own pin).
  * Updated test_directory_layout.py to cover 10 new files + replaced
    the byte-equal passkey assertion with the header-stripping
    variant.

Test results:
  * 157 focused tests pass (was 97 in batch 67; +60 new across this
    batch). No regressions.

Module-layout / spec drift:
  * AZ-407 spec text says `tests/fixtures/...`; module-layout
    blackbox_tests entry (commit d7a17a8) authoritatively places the
    harness under `e2e/`. Implementation followed the layout entry.
  * AZ-444 spec mentions `e2e/tier2/run-tier2.sh`; AZ-406 placed it
    at `e2e/jetson/run-tier2.sh`. Kept at `e2e/jetson/` for
    consistency.
  * Cold-boot README ownership: corrected from AZ-419 to AZ-407 per
    AZ-419's own Dependencies field.

Specs archived to _docs/02_tasks/done/. Jira tickets transitioned to
In Testing on commit.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-16 17:18:01 +03:00


Tier-2 Jetson harness wrapper

Task: AZ-444_tier2_jetson_harness
Name: Tier-2 hardware-loop runner: run-tier2.sh, ssh provisioning, systemd service install, ASan-fuzz mode, image-flash automation
Description: Implement the Tier-2 hardware-loop wrapper that AZ-406 stubs out: actual ssh-based runner, systemd service install for SUT on Jetson, tegrastats capture, ASan-fuzz launch path.
Complexity: 5 points
Dependencies: AZ-406
Component: Blackbox Tests / Tier-2 runner (epic AZ-262)
Tracker: AZ-444
Epic: AZ-262 (E-BBT)

Problem

AZ-406 scaffolds Tier-2 (run-tier2.sh exists as a stub), but the actual hardware-loop semantics (ssh provisioning, image-flash automation, systemd service lifecycle management, telemetry capture, ASan-fuzz launch) are non-trivial and need a dedicated task.

Outcome

  • e2e/tier2/run-tier2.sh accepts the same pytest -k <selector> selector as Tier-1 and runs the selector against a configured Jetson host (env var TIER2_HOST).
  • Provisioning: ssh-based; runs apt update && apt install -y ... for runner deps if not already present (idempotent).
  • SUT lifecycle: installs the SUT as a systemd service (gps-denied-onboard.service); restart command is systemctl restart gps-denied-onboard.
  • Telemetry capture: tegrastats runs as a parallel ssh stream during each test; output piped into the per-run evidence bundle.
  • ASan-fuzz: separate --build-kind asan mode that flashes the ASan image (or builds it remotely) and runs the fuzz binary with stderr captured.
  • Image-flash automation: --reflash flag (gated, OFF by default) re-flashes the Jetson via nvidia-sdkmanager-cli when needed.
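The flags named in the list above could be parsed along the following lines. This is a minimal sketch, not the actual run-tier2.sh: the function name `parse_args`, the result variables `SELECTOR`, `BUILD_KIND` and `REFLASH`, the `production` default and the error texts are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch of a flag parser for a Tier-2 wrapper. parse_args and the
# upper-case result variables are hypothetical names.
parse_args() {
    SELECTOR=""
    BUILD_KIND="production"   # assumed default build kind
    REFLASH=0                 # image-flash stays OFF unless --reflash is passed
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -k)
                if [ "$#" -lt 2 ]; then
                    echo "run-tier2.sh: -k needs a selector" >&2
                    return 2
                fi
                SELECTOR="$2"
                shift 2
                ;;
            --build-kind)
                case "${2:-}" in
                    production|asan) BUILD_KIND="$2" ;;
                    *)
                        echo "run-tier2.sh: --build-kind must be production or asan" >&2
                        return 2
                        ;;
                esac
                shift 2
                ;;
            --reflash)
                REFLASH=1
                shift
                ;;
            *)
                echo "run-tier2.sh: unknown argument: $1" >&2
                return 2
                ;;
        esac
    done
}
```

A caller would run `parse_args "$@"` and later pass the selector on as `pytest -k "$SELECTOR"`. Quoting the variable keeps selector expressions that contain spaces, such as `a and not b`, as a single argument.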

Scope

Included

  • run-tier2.sh runner.
  • ssh-based provisioning + systemd install/restart.
  • tegrastats parallel capture.
  • ASan-fuzz launch.
  • Image-flash automation (gated).

Excluded

  • The CSV reporter — owned by AZ-406.
  • Per-scenario test logic — owned by individual scenario tasks.
  • Chamber automation for +50 °C — out of scope.

Acceptance Criteria

AC-1: selector parity
Given the same pytest -k <selector> invocation
Then both run-tier1.sh and run-tier2.sh accept it; the resulting test selection on Tier-2 is the same as Tier-1 (modulo tier == tier2-jetson skip rules).

AC-2: idempotent provisioning
Given the Jetson host has the runner deps already installed
When run-tier2.sh runs
Then provisioning is a no-op (idempotent).
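One way to get the no-op behaviour is to probe for each package first and call apt only for what is missing. The sketch below assumes a Debian-based Jetson image where `dpkg-query` is available; the helper names `pkg_installed`, `missing_packages` and `provision` are hypothetical, `apt-get` is used as the script-friendly form of `apt`, and the package list is left to the caller because the spec does not name it.

```shell
#!/usr/bin/env bash
# Succeeds when the named package is fully installed according to dpkg.
pkg_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q "install ok installed"
}

# Prints, one per line, the packages from the argument list that are not installed.
missing_packages() {
    local pkg
    for pkg in "$@"; do
        pkg_installed "$pkg" || printf '%s\n' "$pkg"
    done
}

# Installs only the missing packages; does nothing when all are present.
provision() {
    local missing
    missing="$(missing_packages "$@")"
    if [ -z "$missing" ]; then
        echo "provisioning: nothing to do"
        return 0
    fi
    # $missing is left unquoted on purpose so each package name becomes its own argument.
    sudo apt-get update && sudo apt-get install -y $missing
}
```

In the remote case these functions would run on the Jetson side of the ssh session, so that the probe reflects the device rather than the CI host.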

AC-3: systemd lifecycle
Given a Tier-2 test triggers restart
Then systemctl restart gps-denied-onboard is issued; the SUT process restarts within ≤5 s.
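The ≤5 s bound can be checked by polling the unit state after the restart. The following is a sketch under assumptions: `wait_active` and `restart_sut` are hypothetical helper names, and the one-second polling step is a simplification that gives only coarse timing.

```shell
#!/usr/bin/env bash
# wait_active UNIT [TIMEOUT_S]
# Polls once per second until the unit reports active or the timeout elapses.
wait_active() {
    local unit="$1"
    local timeout_s="${2:-5}"
    local elapsed=0
    while [ "$elapsed" -lt "$timeout_s" ]; do
        if systemctl is-active --quiet "$unit"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # Final check at the deadline; its exit status is the function's result.
    systemctl is-active --quiet "$unit"
}

# Restarts the unit and fails if it is not active within 5 seconds.
restart_sut() {
    sudo systemctl restart "$1" && wait_active "$1" 5
}
```

A test hook would call `restart_sut gps-denied-onboard` and treat a non-zero exit status as an AC-3 failure.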

AC-4: tegrastats parallel capture
Given any Tier-2 test
Then tegrastats runs as a parallel ssh stream during the test; its output lands in e2e-results/run-${RUN_ID}/tegrastats-${tier2-host}-${test_id}.log.
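The parallel capture amounts to starting the sampler in the background, running the test, and stopping the sampler afterwards while preserving the test's exit status. The sketch below shows the on-device form; `run_with_telemetry` is a hypothetical name, the 1000 ms interval is an arbitrary choice, and the remote variant would wrap the `tegrastats` call in ssh.

```shell
#!/usr/bin/env bash
# run_with_telemetry LOG_PATH TEST_CMD [ARGS...]
# Runs TEST_CMD while tegrastats samples in the background into LOG_PATH.
run_with_telemetry() {
    local log="$1"
    local sampler_pid rc
    shift
    mkdir -p "$(dirname "$log")"
    # Start the sampler in the background; --interval takes milliseconds.
    tegrastats --interval 1000 >"$log" 2>&1 &
    sampler_pid=$!
    # Run the test command and remember its exit status.
    "$@"
    rc=$?
    # Stop the sampler and reap it so no background job is left behind.
    kill "$sampler_pid" 2>/dev/null
    wait "$sampler_pid" 2>/dev/null
    return "$rc"
}
```

A call might look like `run_with_telemetry "e2e-results/run-${RUN_ID}/tegrastats-${TIER2_HOST}-${test_id}.log" pytest -k "$SELECTOR"`, reading the `${tier2-host}` placeholder in the log path as the value of TIER2_HOST, which is an interpretation rather than something the spec states.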

AC-5: ASan-fuzz mode
Given --build-kind asan flag
Then the runner ensures the ASan SUT image is installed; the fuzz binary is launched; stderr is captured into e2e-results/run-${RUN_ID}/asan-fuzz-${test_id}.log.

AC-6: image-flash gating
Given the --reflash flag is OFF (default)
Then image-flash is NOT triggered; the runner errors out with a clear message if the on-Jetson SUT version does not match the test selection's expected version.
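The gating decision can be isolated in a small pure function, which also makes it easy to unit-test without hardware. This sketch follows the two-key gate described in the commit message (the `--reflash` flag plus `TIER2_REFLASH_ACK=1`); the function name `decide_flash`, its argument order, the printed tokens and the error text are assumptions, as is treating `--reflash` without the acknowledgement variable as an error.

```shell
#!/usr/bin/env bash
# decide_flash INSTALLED_VERSION EXPECTED_VERSION REFLASH_FLAG
# Prints "ok" when the versions match, "reflash" when a mismatch may be
# repaired by re-flashing, and fails with a message otherwise.
decide_flash() {
    local installed="$1" expected="$2" reflash_flag="$3"
    if [ "$installed" = "$expected" ]; then
        echo "ok"
        return 0
    fi
    # Two-key gate: the flag and the acknowledgement variable must both be set.
    if [ "$reflash_flag" = "1" ] && [ "${TIER2_REFLASH_ACK:-0}" = "1" ]; then
        echo "reflash"
        return 0
    fi
    echo "run-tier2.sh: SUT version '$installed' on the Jetson does not match expected '$expected'; re-run with --reflash and TIER2_REFLASH_ACK=1 to re-flash" >&2
    return 1
}
```

Keeping the decision separate from the flashing step means the default path never reaches the flashing tool at all.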

System Under Test Boundary

This task IS infrastructure — no SUT logic is exercised by it. The Tier-2 harness only orchestrates SUT lifecycle on real hardware.

Constraints

  • ssh-based: requires TIER2_HOST + TIER2_USER + TIER2_KEY_PATH env vars; fails fast if any are missing.
  • The reflash path uses NVIDIA's sdkmanager-cli and is environment-specific; it is gated OFF by default to prevent accidental hardware re-provisioning during routine CI.
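The fail-fast check on the ssh variables could look like the sketch below. The helper name `require_tier2_env`, the exit status 2 and the message wording are assumptions; only the three variable names come from the constraint above.

```shell
#!/usr/bin/env bash
# Fails with a single message listing every required variable that is
# unset or empty, so the operator can fix them all in one pass.
require_tier2_env() {
    local missing=""
    [ -n "${TIER2_HOST:-}" ] || missing="$missing TIER2_HOST"
    [ -n "${TIER2_USER:-}" ] || missing="$missing TIER2_USER"
    [ -n "${TIER2_KEY_PATH:-}" ] || missing="$missing TIER2_KEY_PATH"
    if [ -n "$missing" ]; then
        echo "run-tier2.sh: missing required env vars:$missing" >&2
        return 2
    fi
}
```

Calling `require_tier2_env || exit $?` before any ssh connection is attempted keeps a misconfigured CI job from hanging on a connection timeout.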

Document Dependencies

  • _docs/02_document/tests/environment.md § Tier-2 (Jetson hardware loop)
  • _docs/02_document/tests/test-data.md § Tier-2-only fixtures (none beyond shared)