[AZ-615] [AZ-617] Add Jetson e2e harness + tier2 marks

C7 inference (PytorchFp16Runtime / TensorRTRuntime / OnnxTrtEpRuntime) is
CUDA-only by design: `model.half().cuda()` is hard-wired with no CPU
fallback. The Colima/Tier-1 smoke harness can never exercise C3 matcher or
C7 inference. Once AZ-614 fixes the tlog time-base mismatch and the pipeline
reaches those stages, Colima runs would hard-fail at `.cuda()` instead of
cleanly skipping.

This commit lays down the Jetson companion harness and wires the existing
`tier2` auto-skip:

* tests/e2e/Dockerfile.jetson: l4t-pytorch:r36.4.0-pth2.3-py3 base, same
  /opt layout as the Colima image so AC-4 AST scan + bind mounts work
  identically. Built ON the Jetson via run-tests-jetson.sh.
* docker-compose.test.jetson.yml: mirrors docker-compose.test.yml but with
  `runtime: nvidia`, GPU device exposure, and GPS_DENIED_TIER=2 (turns OFF
  the tier2 auto-skip).
* scripts/run-tests-jetson.sh: rsync → ssh build → ssh up, exit-code-from
  e2e-runner so the local exit code reflects the remote test verdict. No
  credentials in the repo; uses `ssh jetson-e2e` alias resolved via
  ~/.ssh/config.
* _docs/03_implementation/jetson_harness_setup.md: one-time SSH key + alias
  + sshd hardening + GPU verification steps. Documents the smoke vs.
  Reality Gate split + the GPS_DENIED_TIER switch.

AZ-617 (mark heavy ACs with tier2): adds @pytest.mark.tier2 to AC-1, AC-2,
AC-3, AC-5, AC-6 in tests/e2e/replay/test_derkachi_1min.py. Reuses the
existing tier2 marker + auto-skip in tests/conftest.py (scope revision
documented as a comment on AZ-617). AC-4a/4b/AC-7/AC-9 stay unmarked: they
don't touch CUDA.

Defers to follow-up Jira:

* AZ-614: Derkachi tlog synth time-base mismatch (unblocks tier2 ACs
  actually reaching the GPU stage on the Jetson)
* AZ-616: replace mock-sat with real ../satellite-provider service

Not run yet: the harness needs operator-side SSH setup to come online before
scripts/run-tests-jetson.sh can be executed end-to-end. Setup steps
documented in jetson_harness_setup.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-21 10:31:13 +00:00 · 2026-05-18 01:57:23 +03:00
parent c2934b8686
commit 9c13ab3bd0
5 changed files with 477 additions and 0 deletions
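
The `scripts/run-tests-jetson.sh` flow named in the commit message (rsync → ssh build → ssh up, with `--exit-code-from e2e-runner`) is not shown in this excerpt. As a rough illustration only, the same three steps could be sketched in Python as below. The remote directory `gps-denied-e2e`, the use of `docker compose` (rather than `docker-compose`), and the rsync flags are assumptions, not taken from the script; only the `jetson-e2e` alias, the compose file name, and the `e2e-runner` service come from the commit message.

```python
"""Hypothetical sketch of the rsync -> ssh build -> ssh up flow (not the real script)."""
import shlex
import subprocess

SSH_ALIAS = "jetson-e2e"                         # from the commit message; resolved via ~/.ssh/config
COMPOSE_FILE = "docker-compose.test.jetson.yml"  # from the commit message
REMOTE_DIR = "gps-denied-e2e"                    # assumption: checkout directory on the Jetson


def build_steps(alias=SSH_ALIAS, remote_dir=REMOTE_DIR):
    """Return the three commands in order: sync, remote build, remote run."""
    compose = f"cd {shlex.quote(remote_dir)} && docker compose -f {COMPOSE_FILE}"
    return [
        # 1. Copy the working tree to the Jetson (flags are an assumption).
        ["rsync", "-az", "./", f"{alias}:{remote_dir}/"],
        # 2. Build the image ON the Jetson, as the Dockerfile requires.
        ["ssh", alias, f"{compose} build e2e-runner"],
        # 3. Run the tests; compose exits with the e2e-runner container's code.
        ["ssh", alias, f"{compose} up --exit-code-from e2e-runner"],
    ]


def run(steps, runner=subprocess.call):
    """Run steps in order; stop at the first non-zero exit code and return it."""
    for cmd in steps:
        code = runner(cmd)
        if code != 0:
            return code
    return 0
```

Because `ssh` exits with the status of the remote command, returning the last step's code from `run(build_steps())` is what lets the local exit code reflect the remote test verdict. The `runner` parameter exists only so the sequencing can be exercised without a Jetson attached.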
@@ -0,0 +1,83 @@
+# Tier-2 e2e-runner image — Jetson Orin Nano (JetPack 6.x, L4T R36.x).
+#
+# AZ-615: companion image to `tests/e2e/Dockerfile` (Colima/Tier-1 smoke
+# harness) that runs the full Reality Gate — including C3 matcher + C7
+# inference — against a CUDA-capable GPU.
+#
+# Hardware contract (operator-confirmed, 2026-05-17):
+#   * Jetson Orin Nano, JetPack 6.2.2+b24, L4T R36.5.0
+#   * nvidia-container-toolkit ≥ 1.16
+#   * `docker run --runtime=nvidia ... nvidia-smi` returns the GPU
+#
+# Image layout mirrors the Colima Dockerfile (so AC-4 AST scan + bind
+# mounts work the same way):
+#   /opt/pyproject.toml
+#   /opt/src/gps_denied_onboard/...     (SUT package, editable install)
+#   /opt/tests/...                      (bind-mounted from host)
+#   /opt/_docs/00_problem/input_data/   (bind-mounted from host)
+#
+# Build context is the repo root (see `docker-compose.test.jetson.yml`
+# → `services.e2e-runner.build.context`).
+#
+# BUILD HOST: this image MUST be built ON the Jetson — cross-building
+# from x86 macOS produces images that miss Tegra-specific shared libs
+# the nvidia-container-runtime later mounts at run time.
+
+# ---------------------------------------------------------------------------
+# Base — l4t-pytorch ships JetPack runtime + PyTorch wheel ready for `.cuda()`
+#
+# Tag selection: NGC publishes l4t-pytorch with a slight lag behind L4T BSP
+# releases. With BSP R36.5 on the device, the closest stable NGC tag at
+# author time is `r36.4.0-pth2.3-py3`. NVIDIA containers are
+# forward-compatible across one minor BSP (the container's userspace
+# can be slightly older than the host's L4T kernel). If a `r36.5.0-*`
+# tag is published, prefer it.
+#
+# To check that the tag exists: `docker manifest inspect nvcr.io/nvidia/l4t-pytorch:r36.4.0-pth2.3-py3`
+FROM nvcr.io/nvidia/l4t-pytorch:r36.4.0-pth2.3-py3 AS runtime
+
+ARG DEBIAN_FRONTEND=noninteractive
+# System deps mirror tests/e2e/Dockerfile + the Jetson runtime stack:
+#   * build-essential / libpq-dev / libspatialindex-dev — same as Colima
+#   * python3-pip / python3-venv — l4t-pytorch ships python but not always venv
+#   * libgl1 + libglib2.0-0 — OpenCV runtime libs (same reason as Colima)
+#   * libpq5 + libspatialindex-c6 — runtime side of psycopg + rtree
+# Note: CUDA / cuDNN / TensorRT come pre-baked in the base image — do NOT
+# attempt to apt-install them (would conflict with the Tegra-specific libs
+# the runtime mounts).
+RUN apt-get update && apt-get install -y --no-install-recommends \
+        ca-certificates \
+        build-essential \
+        libpq-dev \
+        libspatialindex-dev \
+        libpq5 \
+        libspatialindex-c6 \
+        libgl1 \
+        libglib2.0-0 \
+        python3-pip \
+        python3-venv \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /opt
+
+# Editable SUT install. Skipping the `[inference]` extra because PyTorch +
+# torchvision are already provided by the l4t-pytorch base image with
+# Tegra-specific CUDA builds; reinstalling them from PyPI would clobber
+# the Tegra wheels with generic PyPI builds that lack the CUDA / cuDNN /
+# cuBLAS linkage required by Orin.
+COPY pyproject.toml README.md ./
+COPY src ./src
+
+# `--break-system-packages` is needed because the l4t-pytorch base image
+# uses an externally-managed Python environment (PEP 668). The alternative
+# would be to layer a venv on top of the pre-installed torch, but that
+# would shadow the Tegra-tuned torch wheel and break `.cuda()`. The image
+# IS the environment; embracing system-pip is the path of least drift.
+RUN pip3 install --no-cache-dir --break-system-packages -e ".[dev]"
+
+# ENTRYPOINT mirrors the Colima Dockerfile — pytest discovers both
+# `tests/e2e/replay/` (heavy tier2 ACs run with GPS_DENIED_TIER=2) and
+# any future `tests/e2e/scenarios/` additions. Rootdir resolves to /opt
+# via the COPY'd pyproject.toml so `from tests.e2e.replay._helpers import ...`
+# works inside the test files.
+ENTRYPOINT ["pytest", "-q", "/opt/tests/e2e/"]
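
The hunks that follow add `@pytest.mark.tier2` to the heavy ACs and rely on the existing auto-skip in `tests/conftest.py`, which is not part of this diff. A minimal sketch of how such an auto-skip could work is given below. The default tier of `"1"` when the variable is unset and the exact comparison against `"2"` are assumptions; only the `tier2` marker name and the `GPS_DENIED_TIER` variable come from the commit.

```python
"""Hypothetical conftest.py fragment: skip tier2 tests unless GPS_DENIED_TIER=2."""
import os

TIER_ENV = "GPS_DENIED_TIER"  # variable name taken from the commit message


def tier2_enabled(environ=os.environ):
    """Return True when the environment opts in to tier2 (GPU) tests.

    Assumption: an unset variable means tier 1, i.e. tier2 tests are skipped.
    """
    return environ.get(TIER_ENV, "1").strip() == "2"


def pytest_collection_modifyitems(config, items):
    """Standard pytest hook: add a skip marker to every test marked `tier2`."""
    if tier2_enabled():
        return  # Jetson run: leave tier2 tests active.

    # Imported lazily so tier2_enabled() stays usable without pytest installed.
    import pytest

    skip = pytest.mark.skip(reason=f"tier2 test; set {TIER_ENV}=2 on a CUDA host")
    for item in items:
        if "tier2" in item.keywords:
            item.add_marker(skip)
```

With a hook of this shape, the Colima run reports the marked ACs as skipped rather than failing at `.cuda()`, while the Jetson compose file's `GPS_DENIED_TIER=2` lets them execute.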
@@ -56,6 +56,7 @@ _HEAVY_SKIP = pytest.mark.skipif(
 # AC-1: CLI exits 0; JSONL line count matches tlog GLOBAL_POSITION_INT count


+@pytest.mark.tier2
@_HEAVY_SKIP
 def test_ac1_exits_0_jsonl_count_match(replay_runner, derkachi_replay_inputs) -> None:
    # Act
@@ -97,6 +98,7 @@ _ESTIMATOR_OUTPUT_KEYS = frozenset(
 )


+@pytest.mark.tier2
@_HEAVY_SKIP
 def test_ac2_jsonl_schema_match(replay_runner) -> None:
    # Act
@@ -121,6 +123,7 @@ def test_ac2_jsonl_schema_match(replay_runner) -> None:
 # AC-3: ≥ 80 % of emissions within 100 m of ground truth


+@pytest.mark.tier2
@_HEAVY_SKIP
@pytest.mark.xfail(
    reason=(
@@ -350,6 +353,7 @@ def test_ac4_encoder_byte_equality_via_transport_seam() -> None:
 # AC-5: Determinism (two runs differ by ≤ 1e-6 in position fields)


+@pytest.mark.tier2
@_HEAVY_SKIP
 def test_ac5_determinism_two_runs_diff(replay_runner) -> None:
    # Act
@@ -378,6 +382,7 @@ def test_ac5_determinism_two_runs_diff(replay_runner) -> None:
 # AC-6: Pace timing


+@pytest.mark.tier2
@_HEAVY_SKIP
 def test_ac6_pace_realtime_60s_within_5pct(replay_runner) -> None:
    # Act
@@ -391,6 +396,7 @@ def test_ac6_pace_realtime_60s_within_5pct(replay_runner) -> None:
    )


+@pytest.mark.tier2
@_HEAVY_SKIP
 def test_ac6_pace_asap_under_30s(replay_runner) -> None:
    # Act