diff --git a/_docs/03_implementation/jetson_harness_setup.md b/_docs/03_implementation/jetson_harness_setup.md
index d31d416..5f9cafd 100644
--- a/_docs/03_implementation/jetson_harness_setup.md
+++ b/_docs/03_implementation/jetson_harness_setup.md
@@ -58,7 +58,12 @@ ssh-copy-id -i ~/.ssh/id_ed25519_jetson_e2e.pub @
 
 # Wire up ~/.ssh/config (gitignored, never committed). Add `Port `
 # if the Jetson's sshd listens on a non-default port.
+#
+# IMPORTANT: the leading blank line inside the heredoc is intentional.
+# Without it, the appended block can fuse onto the previous file line
+# (`IdentitiesOnly yesHost jetson-e2e` was a real failure mode).
 cat >> ~/.ssh/config <<'EOF'
+
 Host jetson-e2e
   HostName
   User
@@ -67,7 +72,7 @@ Host jetson-e2e
   IdentitiesOnly yes
   AddKeysToAgent yes
   UseKeychain yes
-  StrictHostKeyChecking yes
+  StrictHostKeyChecking accept-new
   ServerAliveInterval 30
   ServerAliveCountMax 4
 EOF
@@ -103,24 +108,23 @@ Then `sudo systemctl reload ssh`.
 
 ### 4. Verify the Jetson Docker + GPU pipeline
 
-`nvcr.io/nvidia/l4t-base` was deprecated in JetPack 6 — use
-`l4t-jetpack` (the official replacement) for the smoke test:
+`nvidia-container-runtime` mounts `nvidia-smi` + CUDA libs from the
+host into the container at runtime, so a tiny base image works for the
+smoke test (no need to pull the 5 GB `l4t-jetpack` image just to check
+GPU exposure):
 
 ```bash
 ssh jetson-e2e 'docker run --rm --runtime=nvidia --gpus all \
-  nvcr.io/nvidia/l4t-jetpack:r36.4.0 nvidia-smi'
+  ubuntu:22.04 nvidia-smi'
 ```
 
 Expected output: an `nvidia-smi`-style table listing the Orin GPU. If
-this fails with "runtime not found" or "no GPU devices", install
-`nvidia-container-toolkit` and `sudo systemctl restart docker`. If it
-fails with `pull access denied`, run `docker login nvcr.io` once (NGC
-API key from developer.nvidia.com — most public images don't require
-auth, but the registry sometimes prompts).
+this fails with "could not select device driver \"nvidia\"" or "no GPU
+devices", reinstall `nvidia-container-toolkit` and
+`sudo systemctl restart docker`.
 
-If `nvidia-smi` works on the host directly (it does — driver 540.5.0,
-CUDA 12.6, Orin detected) but the container can't see the GPU, the
-problem is always nvidia-container-toolkit, not the driver.
+If `nvidia-smi` works on the host directly but not inside a container,
+the problem is always nvidia-container-toolkit, not the driver.
 
 ### 5. Confirm disk + swap
 