[AZ-615] Jetson setup doc: heredoc fix + cheaper smoke test

Two doc lessons learned from on-Jetson verification:

1. The `cat >> ~/.ssh/config <<'EOF'` heredoc needs a leading blank
   line. If the existing file does not end with a newline, the
   appended block fuses onto its last line, and ssh fails at parse
   time with "unsupported option yesHost". Added an explicit blank
   line + comment.
2. The smoke test for nvidia-container-runtime doesn't need a 5 GB
   l4t-jetpack pull. The runtime mounts nvidia-smi from the host
   into any container it starts, so `ubuntu:22.04 nvidia-smi`
   (80 MB) is sufficient. Switched the doc.
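
The fused-line failure in item 1 can be reproduced on a scratch file,
without touching the real `~/.ssh/config`. This is a minimal sketch:
the `Host other` entry and the `192.0.2.10` address are placeholders,
and the scratch file is deliberately written without a trailing
newline, which is the assumed trigger.

```shell
# Scratch file standing in for ~/.ssh/config (placeholder content).
cfg="$(mktemp)"

# Case 1: the file has no trailing newline and the heredoc has no leading blank line.
printf 'Host other\n  IdentitiesOnly yes' > "$cfg"
cat >> "$cfg" <<'EOF'
Host jetson-e2e
  HostName 192.0.2.10
EOF
# Count lines where the old last line and the new block fused together.
fused="$(grep -c 'yesHost' "$cfg" || true)"

# Case 2: same starting file, but the heredoc starts with a blank line.
printf 'Host other\n  IdentitiesOnly yes' > "$cfg"
cat >> "$cfg" <<'EOF'

Host jetson-e2e
  HostName 192.0.2.10
EOF
fixed="$(grep -c 'yesHost' "$cfg" || true)"

echo "fused=$fused fixed=$fixed"   # prints "fused=1 fixed=0"
rm -f "$cfg"
```

The leading blank line costs at most one empty line in the config,
and an empty line there does not change how the file is parsed.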

Operator verified end-to-end:
  * `ssh jetson-e2e true` works from both terminal and Cursor Shell
  * `jetson` user already in `docker` group (no sudo needed)
  * `docker run --runtime=nvidia ubuntu:22.04 nvidia-smi` returns
    Orin GPU info inside the container

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-18 07:39:31 +03:00
parent 6586208f83
commit 662327ce32
+16 -12
@@ -58,7 +58,12 @@ ssh-copy-id -i ~/.ssh/id_ed25519_jetson_e2e.pub <jetson-user>@<jetson-ip>
 # Wire up ~/.ssh/config (gitignored, never committed). Add `Port <port>`
 # if the Jetson's sshd listens on a non-default port.
+#
+# IMPORTANT: the leading blank line inside the heredoc is intentional.
+# Without it, the appended block can fuse onto the previous file line
+# (`IdentitiesOnly yesHost jetson-e2e` was a real failure mode).
 cat >> ~/.ssh/config <<'EOF'
+
 Host jetson-e2e
 HostName <jetson-ip>
 User <jetson-user>
@@ -67,7 +72,7 @@ Host jetson-e2e
 IdentitiesOnly yes
 AddKeysToAgent yes
 UseKeychain yes
-StrictHostKeyChecking yes
+StrictHostKeyChecking accept-new
 ServerAliveInterval 30
 ServerAliveCountMax 4
 EOF
@@ -103,24 +108,23 @@ Then `sudo systemctl reload ssh`.
 ### 4. Verify the Jetson Docker + GPU pipeline
-`nvcr.io/nvidia/l4t-base` was deprecated in JetPack 6 — use
-`l4t-jetpack` (the official replacement) for the smoke test:
+`nvidia-container-runtime` mounts `nvidia-smi` + CUDA libs from the
+host into the container at runtime, so a tiny base image works for the
+smoke test (no need to pull the 5 GB `l4t-jetpack` image just to check
+GPU exposure):
 ```bash
 ssh jetson-e2e 'docker run --rm --runtime=nvidia --gpus all \
-nvcr.io/nvidia/l4t-jetpack:r36.4.0 nvidia-smi'
+ubuntu:22.04 nvidia-smi'
 ```
 Expected output: an `nvidia-smi`-style table listing the Orin GPU. If
-this fails with "runtime not found" or "no GPU devices", install
-`nvidia-container-toolkit` and `sudo systemctl restart docker`. If it
-fails with `pull access denied`, run `docker login nvcr.io` once (NGC
-API key from developer.nvidia.com — most public images don't require
-auth, but the registry sometimes prompts).
+this fails with "could not select device driver \"nvidia\"" or "no GPU
+devices", reinstall `nvidia-container-toolkit` and
+`sudo systemctl restart docker`.
-If `nvidia-smi` works on the host directly (it does — driver 540.5.0,
-CUDA 12.6, Orin detected) but the container can't see the GPU, the
-problem is always nvidia-container-toolkit, not the driver.
+If `nvidia-smi` works on the host directly but not inside a container,
+the problem is always nvidia-container-toolkit, not the driver.
 ### 5. Confirm disk + swap