Two doc lessons learned from on-Jetson verification:
1. The `cat >> ~/.ssh/config <<'EOF'` heredoc needs a leading blank
line. Without it, the appended block fused onto the file's last
line (which had no trailing newline) and produced "unsupported
option yesHost" at parse time. Added an explicit blank line +
comment.
2. The smoke test for nvidia-container-runtime doesn't need a 5 GB
l4t-jetpack pull — nvidia-container-runtime mounts nvidia-smi
from the host into any container, so `ubuntu:22.04 nvidia-smi`
(80 MB) is sufficient. Switched the doc.
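A minimal reproduction of lesson 1, run against a scratch file rather than
the real `~/.ssh/config`. The `Host other` stanza, the `192.0.2.10` address,
and the `User jetson` line are placeholders assumed for illustration; only
the `jetson-e2e` alias and the leading blank line come from the notes above.

```shell
# Scratch file standing in for ~/.ssh/config (assumption: demo only).
cfg="$(mktemp)"

# Simulate an existing config whose last line has NO trailing newline.
printf 'Host other\n    StrictHostKeyChecking yes' > "$cfg"

# The heredoc body starts with a blank line, so the newline it contributes
# terminates the previous line before the new block begins.
cat >> "$cfg" <<'EOF'

# jetson-e2e alias (HostName below is a placeholder address)
Host jetson-e2e
    HostName 192.0.2.10
    User jetson
EOF

# Show the result: "Host jetson-e2e" sits on its own line.
cat "$cfg"
```

Dropping the blank first line of the heredoc body reproduces the fused
`yesHost` token on the second line of the scratch file.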
Operator verified end-to-end:
* `ssh jetson-e2e true` works from both terminal and Cursor Shell
* `jetson` user already in `docker` group (no sudo needed)
* `docker run --runtime=nvidia ubuntu:22.04 nvidia-smi` returns
Orin GPU info inside the container
Co-authored-by: Cursor <cursoragent@cursor.com>
Operator-reported: `nvcr.io/nvidia/l4t-base:r36.4.0` fails to pull.
Investigation against the live registries confirmed:
* `nvcr.io/nvidia/l4t-base` — deprecated in JetPack 6, no r36 tags
(forum thread "L4T Base docker image for Jetpack 6.2 (r36.4.3)",
GitHub dusty-nv/jetson-containers#883).
* `nvcr.io/nvidia/l4t-pytorch` — no r36 tags at all. Newest is
r35.2.1-pth2.0-py3 (too old for our torch>=2.2 floor).
* `nvcr.io/nvidia/l4t-jetpack:r36.4.0` — exists but ships no PyTorch.
* `dustynv/l4t-pytorch:r36.4.0` (Docker Hub) — exists, ~6.3 GB ARM64,
PyTorch + torchvision + opencv pre-baked, maintained by dusty-nv
(NVIDIA's Jetson containers maintainer).
Switched Dockerfile.jetson base to `dustynv/l4t-pytorch:r36.4.0`.
Forward-compatible with the host's R36.5 BSP (NVIDIA containers
tolerate a host BSP one minor version ahead of the image).
Setup doc fixes:
* smoke-test command now uses `l4t-jetpack:r36.4.0` (the official
replacement for the deprecated `l4t-base`)
* keygen step explicitly states it produces BOTH halves (private +
.pub) in one go
* ssh-copy-id + ssh config show how to specify a custom port
* troubleshooting table gets a new row for the `l4t-base not found`
case so the next dev hits the answer in 30 seconds
Co-authored-by: Cursor <cursoragent@cursor.com>
C7 inference (PytorchFp16Runtime / TensorRTRuntime / OnnxTrtEpRuntime)
is CUDA-only by design — `model.half().cuda()` is hard-wired with no
CPU fallback. The Colima/Tier-1 smoke harness has no CUDA device, so it
can never exercise C3 matcher or C7 inference. Once AZ-614 fixes the
tlog time-base mismatch and the pipeline reaches those stages, ungated
Colima runs would hard-fail at `.cuda()` instead of cleanly skipping.
This commit lays down the Jetson companion harness and wires the
existing `tier2` auto-skip:
* tests/e2e/Dockerfile.jetson — l4t-pytorch:r36.4.0-pth2.3-py3 base,
same /opt layout as the Colima image so AC-4 AST scan + bind mounts
work identically. Built ON the Jetson via run-tests-jetson.sh.
* docker-compose.test.jetson.yml — mirrors docker-compose.test.yml
but with `runtime: nvidia`, GPU device exposure, and
GPS_DENIED_TIER=2 (turns OFF the tier2 auto-skip).
* scripts/run-tests-jetson.sh — rsync → ssh build → ssh up,
with `--exit-code-from e2e-runner` so the local exit code reflects the
remote test verdict. No credentials in the repo; uses
`ssh jetson-e2e` alias resolved via ~/.ssh/config.
* _docs/03_implementation/jetson_harness_setup.md — one-time SSH
key + alias + sshd hardening + GPU verification steps. Documents
the smoke vs. Reality Gate split + the GPS_DENIED_TIER switch.
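The rsync → ssh build → ssh up flow of `scripts/run-tests-jetson.sh` could
look roughly like the sketch below. It is not the script from this commit:
the remote directory name, the rsync flags, the `docker compose` (v2 plugin)
invocation, the `DRY_RUN` switch, and the `run` / `jetson_e2e` helper names
are all assumptions made for illustration. Only the `jetson-e2e` alias, the
compose file name, and the `e2e-runner` service come from the notes above.
By default the sketch only prints the commands it would run.

```shell
set -eu

REMOTE="jetson-e2e"                            # ssh alias resolved via ~/.ssh/config
REMOTE_DIR="e2e-workspace"                     # hypothetical checkout dir on the Jetson
COMPOSE_FILE="docker-compose.test.jetson.yml"
DRY_RUN="${DRY_RUN:-1}"                        # assumption: print-only unless DRY_RUN=0

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

jetson_e2e() {
  # 1. Mirror the working tree to the Jetson.
  run rsync -az --delete --exclude .git ./ "$REMOTE:$REMOTE_DIR/"
  # 2. Build the image on the Jetson itself.
  run ssh "$REMOTE" "cd $REMOTE_DIR && docker compose -f $COMPOSE_FILE build"
  # 3. Run the suite; compose exits with the e2e-runner container's status,
  #    and ssh passes that status back to the local shell.
  run ssh "$REMOTE" "cd $REMOTE_DIR && docker compose -f $COMPOSE_FILE up --exit-code-from e2e-runner"
}

jetson_e2e
```

With `DRY_RUN=0` the commands execute for real. Because `ssh` returns the
remote command's exit status and `--exit-code-from` makes compose return the
`e2e-runner` status, the local exit code follows the remote test verdict.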
AZ-617 (mark heavy ACs with tier2): adds @pytest.mark.tier2 to AC-1,
AC-2, AC-3, AC-5, AC-6 in tests/e2e/replay/test_derkachi_1min.py.
Reuses the existing tier2 marker + auto-skip in tests/conftest.py
(scope revision documented as a comment on AZ-617). AC-4a/4b/AC-7/AC-9
stay unmarked — they don't touch CUDA.
Deferred to follow-up Jira tickets:
* AZ-614 — Derkachi tlog synth time-base mismatch (unblocks tier2 ACs
actually reaching the GPU stage on the Jetson)
* AZ-616 — replace mock-sat with real ../satellite-provider service
Not run yet: the harness needs operator-side SSH setup to come online
before scripts/run-tests-jetson.sh can be executed end-to-end. Setup
steps documented in jetson_harness_setup.md.
Co-authored-by: Cursor <cursoragent@cursor.com>