mirror of https://github.com/azaion/gps-denied-onboard.git (synced 2026-06-21 07:11:13 +00:00)
[AZ-615] Jetson setup doc: heredoc fix + cheaper smoke test

Two doc lessons learned from on-Jetson verification:

1. The `cat >> ~/.ssh/config <<'EOF'` heredoc needs a leading blank
   line. Without it, when the existing file does not end in a newline,
   the appended block fused onto the previous file line and produced
   "unsupported option yesHost" at parse time. Added an explicit
   blank line + comment.
2. The smoke test for nvidia-container-runtime doesn't need a 5 GB
   l4t-jetpack pull: nvidia-container-runtime mounts nvidia-smi
   from the host into any container, so `ubuntu:22.04 nvidia-smi`
   (80 MB) is sufficient. Switched the doc.

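The fusion described in item 1 only occurs when the target file's last byte is not a newline. As an illustration, here is a minimal sketch of a guard that adds a newline only in that case, as an alternative to always emitting a blank line. The helper name `append_block` and the temporary file `demo_cfg` are hypothetical and not part of the repository; the demo writes to a temp file rather than the real `~/.ssh/config`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# append_block FILE: append stdin to FILE. If FILE is non-empty and its
# last byte is not a newline, write one first so the new block cannot
# fuse onto the previous line. (Hypothetical helper for illustration.)
append_block() {
  local file="$1"
  # $(tail -c 1 ...) is empty when the last byte is a newline, because
  # command substitution strips trailing newlines.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  cat >> "$file"
}

# Demo: a config whose last line lacks a trailing newline, which is the
# condition that produced `IdentitiesOnly yesHost jetson-e2e`.
demo_cfg="$(mktemp)"
printf 'Host other\nIdentitiesOnly yes' > "$demo_cfg"

append_block "$demo_cfg" <<'EOF'
Host jetson-e2e
HostName <jetson-ip>
EOF

cat "$demo_cfg"
```

With the guard in place, `Host jetson-e2e` starts on its own line whether or not the existing file ended with a newline. The leading blank line chosen in the commit reaches the same result with less machinery, at the cost of an extra empty line when the file was already newline-terminated.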
Operator verified end-to-end:
* `ssh jetson-e2e true` works from both terminal and Cursor Shell
* `jetson` user already in `docker` group (no sudo needed)
* `docker run --runtime=nvidia ubuntu:22.04 nvidia-smi` returns
Orin GPU info inside the container
Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -58,7 +58,12 @@ ssh-copy-id -i ~/.ssh/id_ed25519_jetson_e2e.pub <jetson-user>@<jetson-ip>
 
 # Wire up ~/.ssh/config (gitignored, never committed). Add `Port <port>`
 # if the Jetson's sshd listens on a non-default port.
+#
+# IMPORTANT: the leading blank line inside the heredoc is intentional.
+# Without it, the appended block can fuse onto the previous file line
+# (`IdentitiesOnly yesHost jetson-e2e` was a real failure mode).
 cat >> ~/.ssh/config <<'EOF'
+
 Host jetson-e2e
 HostName <jetson-ip>
 User <jetson-user>
@@ -67,7 +72,7 @@ Host jetson-e2e
 IdentitiesOnly yes
 AddKeysToAgent yes
 UseKeychain yes
-StrictHostKeyChecking yes
+StrictHostKeyChecking accept-new
 ServerAliveInterval 30
 ServerAliveCountMax 4
 EOF
@@ -103,24 +108,23 @@ Then `sudo systemctl reload ssh`.
 
 ### 4. Verify the Jetson Docker + GPU pipeline
 
-`nvcr.io/nvidia/l4t-base` was deprecated in JetPack 6 — use
-`l4t-jetpack` (the official replacement) for the smoke test:
+`nvidia-container-runtime` mounts `nvidia-smi` + CUDA libs from the
+host into the container at runtime, so a tiny base image works for the
+smoke test (no need to pull the 5 GB `l4t-jetpack` image just to check
+GPU exposure):
 
 ```bash
 ssh jetson-e2e 'docker run --rm --runtime=nvidia --gpus all \
-nvcr.io/nvidia/l4t-jetpack:r36.4.0 nvidia-smi'
+ubuntu:22.04 nvidia-smi'
 ```
 
 Expected output: an `nvidia-smi`-style table listing the Orin GPU. If
-this fails with "runtime not found" or "no GPU devices", install
-`nvidia-container-toolkit` and `sudo systemctl restart docker`. If it
-fails with `pull access denied`, run `docker login nvcr.io` once (NGC
-API key from developer.nvidia.com — most public images don't require
-auth, but the registry sometimes prompts).
+this fails with "could not select device driver \"nvidia\"" or "no GPU
+devices", reinstall `nvidia-container-toolkit` and
+`sudo systemctl restart docker`.
 
-If `nvidia-smi` works on the host directly (it does — driver 540.5.0,
-CUDA 12.6, Orin detected) but the container can't see the GPU, the
-problem is always nvidia-container-toolkit, not the driver.
+If `nvidia-smi` works on the host directly but not inside a container,
+the problem is always nvidia-container-toolkit, not the driver.
 
 ### 5. Confirm disk + swap
 
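Section 4 of the diff above maps two specific error strings to a single remedy. As a reviewer-style note, here is a minimal sketch of a wrapper that applies that mapping on the workstation side. The function names `classify_smoke_failure` and `run_smoke_test` are hypothetical and not part of the repository. The matched substrings are the ones quoted in the doc, and the sketch assumes the `jetson-e2e` SSH alias from the earlier hunk is already configured.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of the repository.

# classify_smoke_failure OUTPUT: print the doc's remedy when OUTPUT contains
# one of the error strings named in section 4, otherwise print "unknown".
classify_smoke_failure() {
  case "$1" in
    *'could not select device driver "nvidia"'*|*'no GPU devices'*)
      echo 'reinstall nvidia-container-toolkit, then: sudo systemctl restart docker' ;;
    *)
      echo 'unknown' ;;
  esac
}

# run_smoke_test: run the doc's smoke test over SSH. On success print the
# nvidia-smi table; on failure print the captured output and a hint to stderr.
run_smoke_test() {
  local out
  if out="$(ssh jetson-e2e 'docker run --rm --runtime=nvidia --gpus all ubuntu:22.04 nvidia-smi' 2>&1)"; then
    printf '%s\n' "$out"
  else
    printf '%s\n' "$out" >&2
    echo "hint: $(classify_smoke_failure "$out")" >&2
    return 1
  fi
}
```

Sourcing the file and calling `run_smoke_test` runs the same command as the doc. The classifier is kept separate so it can be exercised against captured error text without a Jetson attached.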