[AZ-187] Docker & hardening

Made-with: Cursor
Oleksandr Bezdieniezhnykh
2026-04-17 18:48:55 +03:00
parent 7d690e1fb4
commit cfed26ff8c
6 changed files with 784 additions and 56 deletions
@@ -1,102 +1,222 @@
# Jetson device provisioning runbook
This runbook describes the end-to-end flow to fuse, flash, provision a device identity, and reach a state where the Azaion Loader can authenticate against the admin/resource APIs. It targets a Jetson Orin Nano class device; adapt paths and NVIDIA bundle versions to your manufacturing image.
This runbook describes the end-to-end flow to fuse, flash, and provision device identities so the Azaion Loader can authenticate against the admin/resource APIs. It supports Jetson Orin Nano, Orin NX 8GB, and Orin NX 16GB devices. Board configuration is auto-detected from the USB product ID.
The `scripts/provision_devices.sh` script automates the entire flow: detecting connected Jetsons, auto-installing L4T if needed, setting up Docker with the Loader container, optionally hardening the OS, registering device identities via the admin API, writing credentials, fusing, and flashing.
After provisioning, each Jetson boots into a production-ready state with Docker Compose running the Loader container.
## Prerequisites
- Provisioning workstation with bash, curl, openssl, python3, and USB/network access to the Jetson in recovery or mass-storage mode as required by your flash tools.
- Admin API reachable from the workstation (base URL, for example `https://admin.internal.example.com`).
- NVIDIA Jetson Linux Driver Package (L4T) and flash scripts for your SKU (for example `odmfuse.sh`, `flash.sh` from the board support package).
- Root filesystem staging directory on the workstation that will be merged into the image before `flash.sh` (often a `Linux_for_Tegra/rootfs/` tree or an extracted sample rootfs overlay).
- Ubuntu amd64 provisioning workstation with bash, curl, jq, wget, lsusb.
- Admin API reachable from the workstation (base URL configured in `scripts/.env`).
- An ApiAdmin account on the admin API (email and password in `scripts/.env`).
- `sudo` access on the workstation.
- USB-C cables that support both power and data transfer.
- Physical label/sticker materials for serial numbers.
- Internet access on first run (to download L4T BSP if not already installed).
- Loader Docker image tar file (see [Preparing the Loader image](#preparing-the-loader-image)).
## Admin API contract (provisioning)
The NVIDIA L4T BSP and sample rootfs are downloaded and installed automatically to `/opt/nvidia/Linux_for_Tegra` if not already present. No manual L4T setup is required.
The `scripts/provision_device.sh` script expects:
## Configuration
1. **POST** `{admin_base}/users` with JSON body `{"email":"<string>","password":"<string>","role":"CompanionPC"}`
- **201** or **200**: user created.
- **409**: user with this email already exists (idempotent re-run).
Copy `scripts/.env.example` to `scripts/.env` and fill in values:
2. **PATCH** `{admin_base}/users/password` with JSON body `{"email":"<string>","password":"<string>"}`
- Used when POST returns **409** so the password in `device.conf` matches the account after re-provisioning.
- **200** or **204**: password updated.
```
ADMIN_EMAIL=admin@azaion.com
ADMIN_PASSWORD=<your ApiAdmin password>
API_URL=https://admin.azaion.com
LOADER_IMAGE_TAR=/path/to/loader-image.tar
```
Adjust URL paths or JSON field names in the script if your deployment uses a different but equivalent contract.
Optional overrides (auto-detected/defaulted if omitted):
```
L4T_VERSION=r36.4.4
L4T_DIR=/opt/nvidia/Linux_for_Tegra
ROOTFS_DIR=/opt/nvidia/Linux_for_Tegra/rootfs
RESOURCE_API_URL=https://admin.azaion.com
LOADER_DEV_STAGE=main
LOADER_IMAGE=localhost:5000/loader:arm
FLASH_TARGET=nvme0n1p1
HARDEN=true
```
The `.env` file is git-ignored and must not be committed.
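A pre-flight check for `scripts/.env` could look like the sketch below. It is an illustration only, not part of the committed scripts: the helper names `env_value` and `missing_env_keys` are hypothetical, and the parser assumes plain `KEY=value` lines with no quoting and no `export` prefix.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: report required keys that are absent or
# empty in an env file, without sourcing it. Assumes simple KEY=value lines.

# Print the value of KEY ($2) from FILE ($1); the last assignment wins.
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Print each required key that has no non-empty value in FILE ($1).
missing_env_keys() {
  file=$1
  shift
  for key in "$@"; do
    if [ -z "$(env_value "$file" "$key")" ]; then
      printf '%s\n' "$key"
    fi
  done
}

# Example call with the four keys the provisioning script requires:
# missing_env_keys scripts/.env ADMIN_EMAIL ADMIN_PASSWORD API_URL LOADER_IMAGE_TAR
```

Reading the file with `sed` instead of sourcing it keeps the check free of side effects, at the cost of not handling quoted values.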
## Preparing the Loader image
The provisioning script requires a Loader Docker image tar to pre-load onto each device. Options:
**From CI (recommended):** Download the `loader-image.tar` artifact from the Woodpecker CI pipeline for the target branch.
**Local build (requires arm64 builder or BuildKit cross-compilation):**
```bash
docker build -f Dockerfile -t localhost:5000/loader:arm .
docker save localhost:5000/loader:arm -o loader-image.tar
```
Set `LOADER_IMAGE_TAR` in `.env` to the absolute path of the resulting tar file.
## Supported devices
| USB Product ID | Model | Board Config (auto-detected) |
| --- | --- | --- |
| 0955:7523 | Jetson Orin Nano | jetson-orin-nano-devkit |
| 0955:7323 | Jetson Orin NX 16GB | jetson-orin-nx-devkit |
| 0955:7423 | Jetson Orin NX 8GB | jetson-orin-nx-devkit |
The script scans for all NVIDIA USB devices (`lsusb -d 0955:`), matches them against the table above, and displays the model name next to each detected device.
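The lookup described above can be sketched without bash associative arrays. The function names are hypothetical, and the sample `lsusb` line is an assumed example of what a recovery-mode Orin Nano reports.

```bash
#!/usr/bin/env bash
# Sketch of the product-ID lookup, mirroring the supported devices table.

# Extract the 4-hex-digit product ID from one `lsusb -d 0955:` output line.
pid_from_lsusb_line() {
  printf '%s\n' "$1" | sed -n 's/.*ID 0955:\([0-9a-fA-F]\{4\}\).*/\1/p'
}

# Map a product ID to its board config; returns non-zero for unknown IDs.
board_config_for_pid() {
  case "$1" in
    7523) echo "jetson-orin-nano-devkit" ;;
    7323|7423) echo "jetson-orin-nx-devkit" ;;
    *) return 1 ;;
  esac
}

line='Bus 001 Device 005: ID 0955:7523 NVIDIA Corp. APX'
pid="$(pid_from_lsusb_line "$line")"
echo "PID $pid -> $(board_config_for_pid "$pid")"
```

Returning a non-zero status for unknown IDs lets a caller skip unsupported NVIDIA devices instead of guessing a board config.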
## Admin API contract (device registration)
The script calls:
1. **POST** `{API_URL}/login` with `{"email":"<admin>","password":"<password>"}` to obtain a JWT.
2. **POST** `{API_URL}/devices` with `Authorization: Bearer <token>` and no request body.
- **200** or **201**: returns `{"serial":"azj-NNNN","email":"azj-NNNN@azaion.com","password":"<32-hex-chars>"}`.
- The server auto-assigns the next sequential serial number.
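The script captures the HTTP status by appending it to the response body with `curl -w`. A minimal sketch of splitting that combined text is shown below; `http_status`, `http_body`, and `register_device` are hypothetical names, and the canned response stands in for a real API call.

```bash
#!/usr/bin/env bash
# Sketch: split the "body, newline, status code" text produced by
# curl -w '\n%{http_code}' into its two parts.

# Last line of the combined response is the HTTP status code.
http_status() {
  printf '%s\n' "$1" | tail -n 1
}

# Everything except the last line is the response body.
http_body() {
  printf '%s\n' "$1" | sed '$d'
}

# Hypothetical wrapper around POST {API_URL}/devices (defined, not invoked).
# $1 = API base URL, $2 = JWT obtained from POST /login.
register_device() {
  curl -sS -w '\n%{http_code}' -X POST "$1/devices" \
    -H "Authorization: Bearer $2"
}

# Offline demonstration with a canned response.
response="$(printf '%s\n%s' '{"serial":"azj-0042"}' '201')"
echo "status=$(http_status "$response") body=$(http_body "$response")"
```

Fields such as `serial`, `email`, and `password` can then be read from the body with `jq -r`, as the provisioning script does.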
## Device identity and `device.conf`
For serial **AZJN-0042**, the script creates email **azaion-jetson-0042@azaion.com** (suffix is the segment after the last hyphen in the serial, lowercased). The password is 32 hexadecimal characters from `openssl rand -hex 16`.
For each registered device, the script writes:
The script writes:
`{ROOTFS_DIR}/etc/azaion/device.conf`
`{rootfs_staging}/etc/azaion/device.conf`
On the flashed device this becomes `/etc/azaion/device.conf` with:
On the flashed device this becomes **`/etc/azaion/device.conf`** with:
- `AZAION_DEVICE_EMAIL=azj-NNNN@azaion.com`
- `AZAION_DEVICE_PASSWORD=<32-hex-chars>`
- `AZAION_DEVICE_EMAIL=...`
- `AZAION_DEVICE_PASSWORD=...`
File permissions are set to **600**.
File permissions on the staging file are set to **600**. Ensure your image build preserves ownership and permissions appropriate for the service user that runs the Loader.
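As a rough sketch of this step, the helper below writes `device.conf` with owner-only permissions into an arbitrary directory tree. The function name `write_device_conf` and the placeholder credentials are assumptions; the committed script does the equivalent with `sudo tee` and `chmod 600` because the real rootfs is root-owned.

```bash
#!/usr/bin/env bash
# Sketch: write device.conf under a rootfs tree with owner-only permissions.
# The body runs in a subshell so the tightened umask does not leak out.
write_device_conf() (
  rootfs=$1
  email=$2
  password=$3
  conf_dir="$rootfs/etc/azaion"
  mkdir -p "$conf_dir"
  # With umask 077 the file is created as 600 and is never world-readable.
  umask 077
  printf 'AZAION_DEVICE_EMAIL=%s\nAZAION_DEVICE_PASSWORD=%s\n' \
    "$email" "$password" > "$conf_dir/device.conf"
  # Also covers the case where the file already existed with looser bits.
  chmod 600 "$conf_dir/device.conf"
)

# Demonstration against a throwaway directory (values are placeholders).
staging="$(mktemp -d)"
write_device_conf "$staging" "azj-0042@azaion.com" "0123456789abcdef0123456789abcdef"
ls -l "$staging/etc/azaion/device.conf"
rm -rf "$staging"
```

Setting the umask before the file is created avoids a short window in which the credentials would be readable by other users.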
## Docker and application setup
The `scripts/setup_rootfs_docker.sh` script prepares the rootfs before flashing. It runs automatically as part of `provision_devices.sh`. What it installs:
| Component | Details |
| --- | --- |
| Docker Engine + Compose plugin | Installed via apt in chroot from Docker's official repository |
| NVIDIA Container Toolkit | GPU passthrough for containers; nvidia set as default runtime |
| Production compose file | `/opt/azaion/docker-compose.yml` — defines the `loader` service |
| Loader image | Pre-loaded from `LOADER_IMAGE_TAR` at `/opt/azaion/loader-image.tar` |
| Boot service | `azaion-loader.service` — loads the image tar on first boot, starts compose |
### Device filesystem layout after flash
```
/etc/azaion/device.conf Per-device credentials
/etc/docker/daemon.json Docker config (NVIDIA default runtime)
/opt/azaion/docker-compose.yml Production compose file
/opt/azaion/boot.sh Boot startup script
/opt/azaion/loader-image.tar Initial Loader image (deleted after first boot)
/opt/azaion/models/ Model storage
/opt/azaion/state/ Update manager state
/etc/systemd/system/azaion-loader.service Systemd unit
```
### First boot sequence
1. systemd starts `docker.service`
2. `azaion-loader.service` runs `/opt/azaion/boot.sh`
3. `boot.sh` runs `docker load -i /opt/azaion/loader-image.tar` (first boot only), then deletes the tar
4. `boot.sh` runs `docker compose -f /opt/azaion/docker-compose.yml up -d`
5. The Loader container starts, reads `/etc/azaion/device.conf`, authenticates with the API
6. The update manager begins polling for updates
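The "first boot only" behaviour in steps 3 and 4 can be illustrated with a dry-run helper that prints the commands instead of executing them. `boot_plan` is a hypothetical name used only for this sketch; the real logic lives in `/opt/azaion/boot.sh`.

```bash
#!/usr/bin/env bash
# Sketch: print the commands the boot logic would run for a given
# application directory, without executing any of them.
boot_plan() {
  app_dir=$1
  if [ -f "$app_dir/loader-image.tar" ]; then
    # First boot only: the tar is loaded once and then removed.
    echo "docker load -i $app_dir/loader-image.tar"
    echo "rm -f $app_dir/loader-image.tar"
  fi
  # Every boot: bring the compose project up in the background.
  echo "docker compose -f $app_dir/docker-compose.yml up -d"
}

boot_plan /opt/azaion
```

Because the tar is deleted after a successful load, later boots skip straight to `docker compose up -d`.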
## Security hardening
The `scripts/harden_rootfs.sh` script applies production security hardening to the rootfs. It runs automatically unless `--no-harden` is passed.
| Measure | Details |
| --- | --- |
| SSH disabled | `sshd.service` and `ssh.service` masked; `sshd_config` removed |
| Getty masked | `getty@.service` and `serial-getty@.service` masked — no login prompt |
| Serial console disabled | `console=ttyTCU0` / `console=ttyS0` removed from `extlinux.conf` |
| Sysctl hardening | ptrace blocked, core dumps disabled, kernel pointers hidden, ICMP redirects off |
| Root locked | Root account password-locked in `/etc/shadow` |
To provision without hardening (e.g. for development devices):
```bash
./scripts/provision_devices.sh --no-harden
```
Or set `HARDEN=false` in `.env`.
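The measures in the table can be spot-checked offline against a rootfs tree before flashing. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and it checks only unit masking, the root password lock, and serial console removal.

```bash
#!/usr/bin/env bash
# Sketch: offline checks against a rootfs tree after hardening.

# Succeeds if UNIT ($2) is masked, i.e. symlinked to /dev/null under ROOTFS ($1).
is_masked() {
  [ "$(readlink "$1/etc/systemd/system/$2" 2>/dev/null)" = "/dev/null" ]
}

# Succeeds if the root entry in ROOTFS/etc/shadow has a locked password field.
root_is_locked() {
  grep -q '^root:!' "$1/etc/shadow"
}

# Remove serial console arguments from kernel command-line text on stdin,
# then collapse the runs of spaces left behind.
strip_serial_console() {
  sed -e 's/console=ttyTCU0[^ ]*//' \
      -e 's/console=ttyS0[^ ]*//' \
      -e 's/ \{2,\}/ /g'
}

echo 'root=/dev/nvme0n1p1 console=ttyTCU0,115200 quiet' | strip_serial_console
# prints: root=/dev/nvme0n1p1 quiet
```

A call such as `is_masked "$ROOTFS_DIR" ssh.service` returns a status that a wrapper script could use to refuse flashing an unhardened image.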
## Step-by-step flow
### 1. Unbox and record the serial
### 1. Connect Jetsons in recovery mode
Read the manufacturing label or use your factory barcode process. Example serial: `AZJN-0042`.
Connect one or more Jetson devices via USB-C. Put each device into recovery mode: hold the Force Recovery button, press and release Power, then release Force Recovery after 2 seconds.
### 2. Fuse (if your product requires it)
Verify with `lsusb -d 0955:` -- each recovery-mode Jetson appears as `NVIDIA Corp. APX`.
Run your approved **fuse** workflow (for example NVIDIA `odmfuse.sh` or internal wrapper). This task does not replace secure boot or fTPM scripts; complete them per your security phase checklist before or after provisioning, according to your process.
### 2. Run the provisioning script
### 3. Prepare the rootfs staging tree
Extract or sync the rootfs you will flash into a directory on the workstation, for example:
`/work/images/orin-nano/rootfs-staging/`
Ensure `etc/` exists or can be created under this tree.
### 4. Provision the CompanionPC user and embed credentials
From the Loader repository root (or using an absolute path to the script):
From the loader repository root:
```bash
./scripts/provision_device.sh \
--serial AZJN-0042 \
--api-url "https://admin.internal.example.com" \
--rootfs-dir "/work/images/orin-nano/rootfs-staging"
./scripts/provision_devices.sh
```
Confirm the script prints success and that `rootfs-staging/etc/azaion/device.conf` exists.
The script will:
Re-running the same command for the same serial must not create a duplicate user; the script updates the password via **PATCH** when POST returns **409**.
1. **Install dependencies** -- installs `usbutils` (which provides `lsusb`), curl, jq, and wget via apt; on non-aarch64 hosts it also installs `qemu-user-static` and `binfmt-support` for the cross-architecture chroot.
2. **Install L4T** -- if L4T BSP is not present at `L4T_DIR`, downloads the BSP and sample rootfs, extracts them, and runs `apply_binaries.sh`. This only happens on first run.
3. **Set up Docker** -- installs Docker Engine, NVIDIA Container Toolkit, compose file, and Loader image into the rootfs via chroot (`setup_rootfs_docker.sh`).
4. **Harden OS** (unless `--no-harden`) -- disables SSH, getty, serial console, applies sysctl hardening (`harden_rootfs.sh`).
5. **Authenticate** -- logs in to the admin API to get a JWT.
6. **Scan USB** -- detects all supported Jetson devices in recovery mode, displays model names.
7. **Display selection UI** -- lists detected devices with numbers and model type.
8. **Prompt for selection** -- enter device numbers (e.g. `1 3 4`), or `0` for all.
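The selection rule in step 8 can be sketched as a small function that turns the operator's input into zero-based device indices. `parse_selection` is a hypothetical name; the committed script performs the same validation inline.

```bash
#!/usr/bin/env bash
# Sketch: convert a selection such as "1 3 4" (or "0" for all) into
# zero-based indices, one per line. Runs in a subshell so `set -f` stays local.
# $1 = number of detected devices, $2 = operator input.
parse_selection() (
  count=$1
  selection=$2
  set -f   # disable globbing so input like "*" is treated as plain text
  if [ "$selection" = "0" ]; then
    i=0
    while [ "$i" -lt "$count" ]; do
      echo "$i"
      i=$((i + 1))
    done
    return 0
  fi
  for num in $selection; do
    case "$num" in
      ''|0*|*[!0-9]*)
        echo "invalid selection: $num" >&2
        return 1 ;;
    esac
    if [ "$num" -gt "$count" ]; then
      echo "invalid selection: $num (must be 1-$count or 0 for all)" >&2
      return 1
    fi
    echo $((num - 1))
  done
)

parse_selection 4 "1 3 4"
```

For four detected devices, the input `1 3 4` yields the indices 0, 2, and 3; any token outside `1..count` makes the function fail before provisioning starts.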
If the admin API requires authentication (Bearer token, mTLS), extend the script or shell wrapper to pass the required `curl` headers or use a local proxy; the stock script assumes network-restricted admin access without extra headers.
### 3. Per-device provisioning (automatic)
### 5. Flash the device
For each selected device, the script runs sequentially:
Run your normal **flash** procedure (for example `flash.sh` or SDK Manager) so the staged rootfs—including `etc/azaion/device.conf`—is written to the device storage.
1. **Register** -- calls `POST /devices` to get server-assigned serial, email, and password.
2. **Write device.conf** -- embeds credentials in the rootfs staging directory.
3. **Fuse** -- runs `odmfuse.sh` targeting the specific USB device instance. Board config is auto-detected from the USB product ID.
4. **Power-cycle prompt** -- asks the admin to power-cycle the device and re-enter recovery mode.
5. **Flash** -- runs `flash.sh` with the auto-detected board config to write the rootfs (including `device.conf`, Docker, and application files) to the device. Default target is `nvme0n1p1` (NVMe SSD); override with `FLASH_TARGET` in `.env` (e.g. `mmcblk0p1` for eMMC).
6. **Sticker prompt** -- displays the assigned serial and asks the admin to apply a physical label.
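To target one device among several, the script derives an identifier from the bus and device numbers in the `lsusb` line. The sketch below mirrors that derivation with a hypothetical helper name. Whether the fuse and flash tools in your L4T release accept an identifier in this form is an assumption to verify against NVIDIA's documentation.

```bash
#!/usr/bin/env bash
# Sketch: build a "<bus>-<device>" identifier from one lsusb output line,
# mirroring how the provisioning script composes its USB instance value.
usb_id_from_lsusb_line() {
  printf '%s\n' "$1" \
    | sed -n 's/^Bus \([0-9]\{1,\}\) Device \([0-9]\{1,\}\):.*/\1-\2/p'
}

usb_id_from_lsusb_line 'Bus 001 Device 005: ID 0955:7523 NVIDIA Corp. APX'
# prints: 001-005
```

Note that the device number changes each time a Jetson re-enumerates, so an identifier captured before the power-cycle prompt may no longer match the same device afterwards.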
### 6. First boot
### 4. Apply serial labels
Power the Jetson, complete first-boot configuration if any, and verify the Loader service starts. The Loader should read `AZAION_DEVICE_EMAIL` and `AZAION_DEVICE_PASSWORD` from `/etc/azaion/device.conf`, then use them when calling **POST /login** on the Loader HTTP API (which forwards credentials to the configured resource API per your deployment). After a successful login path, the device can request resources and unlock flows as designed.
After each device is flashed, the script prints the assigned serial (e.g. `azj-0042`). Apply a label/sticker with this serial to the device enclosure for physical identification.
### 7. Smoke verification
### 5. First boot
- From another host: Loader **GET /health** returns healthy.
- **POST /login** on the Loader with the same email and password as in `device.conf` returns success (for example `{"status":"ok"}` in the reference implementation).
- Optional: trigger your normal resource or unlock smoke test against a staging API.
Power the Jetson. `docker.service` starts automatically; `azaion-loader.service` then loads the Loader image (first boot only) and runs `docker compose up -d`. The Loader container reads `AZAION_DEVICE_EMAIL` and `AZAION_DEVICE_PASSWORD` from `/etc/azaion/device.conf` and uses them to authenticate with the admin API via `POST /login`. The update manager begins checking for updates.
### 6. Smoke verification
- From another host: Loader `GET /health` on port 8080 returns healthy.
- `docker ps` on the device (if unhardened) shows the loader container running.
- Optional: trigger a resource or unlock smoke test against a staging API.
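The health check can be wrapped in a retry loop, since the Loader may need some time after first boot before it answers. The helper below is a sketch: `wait_until` is a hypothetical name, and `DEVICE_IP` in the usage comment is a placeholder for the address of the device under test.

```bash
#!/usr/bin/env bash
# Sketch: retry a command until it succeeds or the attempts run out.
# $1 = max attempts, $2 = delay in seconds, remaining args = command to run.
wait_until() (
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
)

# Offline demonstration with a command that always succeeds.
wait_until 3 0 true && echo "probe succeeded"

# Usage against a device (DEVICE_IP is a placeholder):
# wait_until 30 2 curl -fsS "http://$DEVICE_IP:8080/health"
```

With `curl -f`, an HTTP error status makes the probe fail, so the loop keeps retrying until the endpoint returns a success status.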
## Troubleshooting
| Symptom | Check |
|--------|--------|
| curl fails to reach admin API | DNS, VPN, firewall, and `API_URL` trailing slash (script strips one trailing slash). |
| HTTP 4xx/5xx from POST /users | Admin logs; confirm role value **CompanionPC** and email uniqueness rules. |
| 409 then failure on PATCH | Implement or enable **PATCH /users/password** (or change script to match your upsert API). |
| Loader cannot log in | `device.conf` path, permissions, and that the password in the file matches the account after the last successful provision. |
| --- | --- |
| No devices found by script | USB cables, recovery mode entry sequence, `lsusb -d 0955:` |
| Unknown product ID warning | Device is an NVIDIA USB device but not in the supported models table. Check SKU. |
| L4T download fails | Internet access, NVIDIA download servers availability, `L4T_VERSION` value |
| Login fails (HTTP 401) | `ADMIN_EMAIL` and `ADMIN_PASSWORD` in `.env`; account must have ApiAdmin role |
| POST /devices fails | Admin API logs; ensure AZ-196 endpoint is deployed |
| Fuse fails | L4T version compatibility, USB connection stability, sudo access |
| Flash fails | Rootfs contents, USB device still in recovery mode after power-cycle, verify `FLASH_TARGET` matches your storage (NVMe vs eMMC) |
| Docker setup fails in chroot | Verify `qemu-user-static` was installed (auto-installed on x86 hosts); check internet in chroot |
| Loader container not starting | Check `docker logs` on device; verify `/etc/azaion/device.conf` exists and has correct permissions |
| Loader cannot log in after boot | `device.conf` path and permissions; password must match the account created by POST /devices |
| Cannot SSH to hardened device | Expected behavior. Use `--no-harden` for dev devices, or reflash with USB recovery mode |
## Security notes
- Treat `device.conf` as a secret at rest; restrict file permissions and disk encryption per your product policy.
- The `.env` file contains ApiAdmin credentials -- do not commit it. It is listed in `.gitignore`.
- Prefer short-lived credentials or key rotation if the admin API supports it; this runbook describes the baseline manufacturing flow.
- Hardened devices have no SSH, no serial console, and no interactive login. Field debug requires USB recovery mode reflash with `--no-harden`.
+23
@@ -0,0 +1,23 @@
API_URL=https://admin.azaion.com
ADMIN_EMAIL=admin@azaion.com
ADMIN_PASSWORD=
# Path to the Loader Docker image tar (required).
# Build with: docker save localhost:5000/loader:arm -o loader-image.tar
LOADER_IMAGE_TAR=
# Optional overrides (auto-detected if omitted):
# L4T_VERSION=r36.4.4
# L4T_DIR=/opt/nvidia/Linux_for_Tegra
# ROOTFS_DIR=/opt/nvidia/Linux_for_Tegra/rootfs
# Flash target (default: nvme0n1p1 for NVMe SSD, use mmcblk0p1 for eMMC):
# FLASH_TARGET=nvme0n1p1
# Loader runtime configuration (defaults shown):
# RESOURCE_API_URL=https://admin.azaion.com
# LOADER_DEV_STAGE=main
# LOADER_IMAGE=localhost:5000/loader:arm
# Security hardening (default: true). Set to false or use --no-harden flag.
# HARDEN=true
+62
@@ -0,0 +1,62 @@
#!/usr/bin/env bash
set -euo pipefail
L4T_VERSION="${L4T_VERSION:-r36.4.4}"
L4T_DIR="${L4T_DIR:-/opt/nvidia/Linux_for_Tegra}"
ROOTFS_DIR="${ROOTFS_DIR:-${L4T_DIR}/rootfs}"
l4t_download_url() {
local version="$1"
local major minor patch
IFS='.' read -r major minor patch <<< "${version#r}"
echo "https://developer.nvidia.com/downloads/embedded/l4t/r${major}_release_v${minor}.${patch}/release/Jetson_Linux_${version}_aarch64.tbz2"
}
rootfs_download_url() {
local version="$1"
local major minor patch
IFS='.' read -r major minor patch <<< "${version#r}"
echo "https://developer.nvidia.com/downloads/embedded/l4t/r${major}_release_v${minor}.${patch}/release/Tegra_Linux_Sample-Root-Filesystem_${version}_aarch64.tbz2"
}
if [[ -f "$L4T_DIR/flash.sh" ]]; then
echo "L4T BSP already installed at $L4T_DIR"
else
echo "L4T BSP not found at $L4T_DIR"
echo "Downloading and installing L4T $L4T_VERSION..."
bsp_url="$(l4t_download_url "$L4T_VERSION")"
rootfs_url="$(rootfs_download_url "$L4T_VERSION")"
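tmp_dir="$(mktemp -d)"
# Remove partial downloads even if wget or tar fails part-way through.
trap 'rm -rf "$tmp_dir"' EXIT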
echo " Downloading BSP from $bsp_url ..."
wget -q --show-progress -O "$tmp_dir/bsp.tbz2" "$bsp_url"
echo " Downloading Sample Root Filesystem from $rootfs_url ..."
wget -q --show-progress -O "$tmp_dir/rootfs.tbz2" "$rootfs_url"
echo " Extracting BSP to $(dirname "$L4T_DIR")/ ..."
sudo mkdir -p "$(dirname "$L4T_DIR")"
sudo tar -xjf "$tmp_dir/bsp.tbz2" -C "$(dirname "$L4T_DIR")"
echo " Extracting rootfs to $L4T_DIR/rootfs/ ..."
sudo tar -xjf "$tmp_dir/rootfs.tbz2" -C "$L4T_DIR/rootfs/"
echo " Running apply_binaries.sh ..."
sudo "$L4T_DIR/apply_binaries.sh"
rm -rf "$tmp_dir"
echo "L4T $L4T_VERSION installed to $L4T_DIR"
fi
for tool in flash.sh odmfuse.sh; do
if [[ ! -f "$L4T_DIR/$tool" ]]; then
echo "ERROR: $L4T_DIR/$tool not found after L4T setup" >&2
exit 1
fi
done
NV_RELEASE="$L4T_DIR/rootfs/etc/nv_tegra_release"
if [[ -f "$NV_RELEASE" ]]; then
echo "L4T release: $(head -1 "$NV_RELEASE")"
fi
+52
@@ -0,0 +1,52 @@
#!/usr/bin/env bash
set -euo pipefail
ROOTFS="${ROOTFS_DIR:-/opt/nvidia/Linux_for_Tegra/rootfs}"
if [[ ! -d "$ROOTFS" ]]; then
echo "ERROR: Rootfs directory not found: $ROOTFS" >&2
exit 1
fi
echo "=== Hardening rootfs: $ROOTFS ==="
echo "[1/5] Disabling SSH..."
for unit in sshd.service ssh.service; do
sudo ln -sf /dev/null "$ROOTFS/etc/systemd/system/$unit" 2>/dev/null || true
done
sudo rm -f "$ROOTFS/etc/ssh/sshd_config"
echo "[2/5] Masking getty and serial console services..."
for unit in "getty@.service" "serial-getty@.service"; do
sudo ln -sf /dev/null "$ROOTFS/etc/systemd/system/$unit"
done
echo "[3/5] Disabling serial console in bootloader config..."
EXTLINUX="$ROOTFS/boot/extlinux/extlinux.conf"
if [[ -f "$EXTLINUX" ]]; then
sudo sed -i 's/console=ttyTCU0[^ ]*//' "$EXTLINUX"
sudo sed -i 's/console=ttyS0[^ ]*//' "$EXTLINUX"
# Collapse runs of two or more spaces left behind by the removals above.
sudo sed -i 's/ \{2,\}/ /g' "$EXTLINUX"
fi
echo "[4/5] Applying sysctl hardening..."
sudo tee "$ROOTFS/etc/sysctl.d/99-azaion-hardening.conf" > /dev/null <<'EOF'
kernel.yama.ptrace_scope = 3
kernel.core_pattern = |/bin/false
kernel.kptr_restrict = 2
kernel.dmesg_restrict = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
EOF
echo "[5/5] Locking root account..."
if [[ -f "$ROOTFS/etc/shadow" ]]; then
sudo sed -i 's|^root:[^:]*:|root:!:|' "$ROOTFS/etc/shadow"
fi
echo ""
echo "Hardening complete."
+292
@@ -0,0 +1,292 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ENV_FILE="$SCRIPT_DIR/.env"
HARDEN="${HARDEN:-true}"
while [[ $# -gt 0 ]]; do
case "$1" in
--no-harden) HARDEN="false"; shift ;;
*) echo "ERROR: Unknown option: $1" >&2; exit 1 ;;
esac
done
NVIDIA_VENDOR="0955"
declare -A PID_TO_MODEL=(
["7523"]="Orin Nano"
["7323"]="Orin NX 16GB"
["7423"]="Orin NX 8GB"
)
declare -A PID_TO_BOARD_CONFIG=(
["7523"]="jetson-orin-nano-devkit"
["7323"]="jetson-orin-nx-devkit"
["7423"]="jetson-orin-nx-devkit"
)
require_env_var() {
local name="$1"
local val="${!name:-}"
if [[ -z "$val" ]]; then
echo "ERROR: $name is not set in $ENV_FILE" >&2
exit 1
fi
}
api_post() {
local url="$1"; shift
local response
response="$(curl -sS -w "\n%{http_code}" -X POST "$url" -H "Content-Type: application/json" "$@")"
echo "$response"
}
provision_single_device() {
local device_line="$1"
local usb_id="$2"
local board_config="$3"
echo "[Step 1/5] Registering device with admin API..."
local reg_response reg_http reg_body
reg_response="$(api_post "${API_URL}/devices" -H "Authorization: Bearer $TOKEN")"
reg_http="$(echo "$reg_response" | tail -1)"
reg_body="$(echo "$reg_response" | sed '$d')"
if [[ "$reg_http" != "200" && "$reg_http" != "201" ]]; then
echo "ERROR: Device registration failed (HTTP $reg_http)" >&2
echo "$reg_body" >&2
RESULTS+=("$usb_id | - | FAILED | registration error HTTP $reg_http")
return 1
fi
local serial dev_email dev_password
serial="$(echo "$reg_body" | jq -r '.serial // .Serial // empty')"
dev_email="$(echo "$reg_body" | jq -r '.email // .Email // empty')"
dev_password="$(echo "$reg_body" | jq -r '.password // .Password // empty')"
if [[ -z "$serial" || -z "$dev_email" || -z "$dev_password" ]]; then
echo "ERROR: Incomplete response from POST /devices" >&2
RESULTS+=("$usb_id | - | FAILED | incomplete API response")
return 1
fi
echo " Assigned serial: $serial"
echo " Email: $dev_email"
echo "[Step 2/5] Writing device.conf to rootfs staging..."
local conf_dir="${ROOTFS_DIR}/etc/azaion"
sudo mkdir -p "$conf_dir"
local conf_path="${conf_dir}/device.conf"
printf 'AZAION_DEVICE_EMAIL=%s\nAZAION_DEVICE_PASSWORD=%s\n' "$dev_email" "$dev_password" \
| sudo tee "$conf_path" > /dev/null
sudo chmod 600 "$conf_path"
echo " Written: $conf_path"
echo "[Step 3/5] Fusing device (odmfuse.sh)..."
if ! sudo "$L4T_DIR/odmfuse.sh" --instance "$usb_id" 2>&1; then
echo "ERROR: Fusing failed for $serial ($usb_id)" >&2
RESULTS+=("$usb_id | $serial | FAILED | fuse error")
return 1
fi
echo ""
echo "[$serial] Fuse complete."
read -rp " Power-cycle the device and put it back in recovery mode. Press Enter when ready..."
echo "[Step 4/5] Flashing device (flash.sh)..."
if ! sudo "$L4T_DIR/flash.sh" "$board_config" "$FLASH_TARGET" --instance "$usb_id" 2>&1; then
echo "ERROR: Flashing failed for $serial ($usb_id)" >&2
RESULTS+=("$usb_id | $serial | FAILED | flash error")
return 1
fi
echo ""
echo "[$serial] Flash complete."
echo " >>> Apply sticker with serial: $serial <<<"
read -rp " Power-cycle for first boot. Press Enter when done..."
echo "[Step 5/5] $serial provisioned successfully."
RESULTS+=("$usb_id | $serial | OK")
}
# --- main ---
if [[ ! -f "$ENV_FILE" ]]; then
echo "ERROR: $ENV_FILE not found. Copy .env.example to .env and fill in values." >&2
exit 1
fi
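# Remember HARDEN as set by the environment or --no-harden before .env is read.
CLI_HARDEN="$HARDEN"
set -a
source "$ENV_FILE"
set +a
# An explicit --no-harden must win over a HARDEN value in .env.
if [[ "$CLI_HARDEN" == "false" ]]; then
  HARDEN="false"
fi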
for var in ADMIN_EMAIL ADMIN_PASSWORD API_URL LOADER_IMAGE_TAR; do
require_env_var "$var"
done
API_URL="${API_URL%/}"
RESOURCE_API_URL="${RESOURCE_API_URL:-$API_URL}"
LOADER_DEV_STAGE="${LOADER_DEV_STAGE:-main}"
LOADER_IMAGE="${LOADER_IMAGE:-localhost:5000/loader:arm}"
FLASH_TARGET="${FLASH_TARGET:-nvme0n1p1}"
L4T_VERSION="${L4T_VERSION:-r36.4.4}"
L4T_DIR="${L4T_DIR:-/opt/nvidia/Linux_for_Tegra}"
ROOTFS_DIR="${ROOTFS_DIR:-$L4T_DIR/rootfs}"
export L4T_VERSION L4T_DIR ROOTFS_DIR RESOURCE_API_URL LOADER_DEV_STAGE LOADER_IMAGE LOADER_IMAGE_TAR
echo "=== Installing host dependencies ==="
sudo apt-get update -qq
sudo apt-get install -y usbutils curl jq wget
[[ "$(uname -m)" != "aarch64" ]] && sudo apt-get install -y qemu-user-static binfmt-support
echo ""
echo "=== L4T BSP setup ==="
"$SCRIPT_DIR/ensure_l4t.sh"
echo ""
echo "=== Setting up rootfs (Docker + application) ==="
"$SCRIPT_DIR/setup_rootfs_docker.sh"
echo ""
if [[ "$HARDEN" == "true" ]]; then
echo "=== Applying security hardening ==="
"$SCRIPT_DIR/harden_rootfs.sh"
echo ""
else
echo "=== Security hardening SKIPPED (--no-harden) ==="
echo ""
fi
echo "=== Authenticating with admin API ==="
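# Build the JSON with jq so quotes or backslashes in the password are escaped.
LOGIN_JSON="$(jq -nc --arg email "$ADMIN_EMAIL" --arg password "$ADMIN_PASSWORD" '{email: $email, password: $password}')"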
LOGIN_RESPONSE="$(api_post "${API_URL}/login" -d "$LOGIN_JSON")"
LOGIN_HTTP="$(echo "$LOGIN_RESPONSE" | tail -1)"
LOGIN_BODY="$(echo "$LOGIN_RESPONSE" | sed '$d')"
if [[ "$LOGIN_HTTP" != "200" ]]; then
echo "ERROR: Login failed (HTTP $LOGIN_HTTP)" >&2
echo "$LOGIN_BODY" >&2
exit 1
fi
TOKEN="$(echo "$LOGIN_BODY" | jq -r '.token // .Token // empty')"
if [[ -z "$TOKEN" ]]; then
echo "ERROR: No token in login response" >&2
echo "$LOGIN_BODY" >&2
exit 1
fi
echo "Authenticated successfully."
echo ""
echo "=== Scanning for Jetson devices in recovery mode ==="
LSUSB_OUTPUT="$(lsusb -d "${NVIDIA_VENDOR}:" 2>/dev/null || true)"
if [[ -z "$LSUSB_OUTPUT" ]]; then
echo "No Jetson devices found in recovery mode."
echo "Ensure devices are connected via USB and in recovery mode (hold Force Recovery, press Power)."
exit 0
fi
DEVICES=()
DEVICE_PIDS=()
DEVICE_MODELS=()
while IFS= read -r line; do
pid="$(echo "$line" | grep -oP "${NVIDIA_VENDOR}:\K[0-9a-fA-F]+")"
if [[ -n "${PID_TO_BOARD_CONFIG[$pid]:-}" ]]; then
DEVICES+=("$line")
DEVICE_PIDS+=("$pid")
DEVICE_MODELS+=("${PID_TO_MODEL[$pid]:-Unknown (PID $pid)}")
fi
done <<< "$LSUSB_OUTPUT"
DEVICE_COUNT="${#DEVICES[@]}"
if [[ "$DEVICE_COUNT" -eq 0 ]]; then
echo "No supported Jetson devices found."
echo "Detected NVIDIA USB devices but none matched known Jetson Orin product IDs."
exit 0
fi
echo ""
echo "Select device(s) to provision."
echo " one device, e.g. 1"
echo " some devices, e.g. 1 3 4"
echo " or all devices: 0"
echo ""
echo "--------------------------------------------"
echo "Connected Jetson devices (recovery mode):"
echo "--------------------------------------------"
for i in "${!DEVICES[@]}"; do
num=$((i + 1))
printf "%-3s %-16s %s\n" "$num" "[${DEVICE_MODELS[$i]}]" "${DEVICES[$i]}"
done
echo "--------------------------------------------"
echo "0 - provision all devices"
echo ""
read -rp "Your selection: " SELECTION
SELECTED_INDICES=()
if [[ "$SELECTION" == "0" ]]; then
for i in "${!DEVICES[@]}"; do
SELECTED_INDICES+=("$i")
done
else
for num in $SELECTION; do
if [[ "$num" =~ ^[0-9]+$ ]] && (( num >= 1 && num <= DEVICE_COUNT )); then
SELECTED_INDICES+=("$((num - 1))")
else
echo "ERROR: Invalid selection: $num (must be 1-$DEVICE_COUNT or 0 for all)" >&2
exit 1
fi
done
fi
if [[ ${#SELECTED_INDICES[@]} -eq 0 ]]; then
echo "No devices selected."
exit 0
fi
echo ""
echo "=== Provisioning ${#SELECTED_INDICES[@]} device(s) ==="
echo ""
RESULTS=()
for idx in "${SELECTED_INDICES[@]}"; do
DEVICE_LINE="${DEVICES[$idx]}"
USB_ID="$(echo "$DEVICE_LINE" | grep -oP 'Bus \K[0-9]+')-$(echo "$DEVICE_LINE" | grep -oP 'Device \K[0-9]+')"
BOARD_CONFIG="${PID_TO_BOARD_CONFIG[${DEVICE_PIDS[$idx]}]:-}"
if [[ -z "$BOARD_CONFIG" ]]; then
echo "ERROR: Unknown Jetson product ID: $NVIDIA_VENDOR:${DEVICE_PIDS[$idx]}" >&2
RESULTS+=("$USB_ID | - | FAILED | unknown product ID")
continue
fi
echo "--------------------------------------------"
echo "Device: ${DEVICE_MODELS[$idx]} - $DEVICE_LINE"
echo "USB instance: $USB_ID"
echo "Board config: $BOARD_CONFIG"
echo "--------------------------------------------"
provision_single_device "$DEVICE_LINE" "$USB_ID" "$BOARD_CONFIG" || true
echo ""
done
CONF_CLEANUP="${ROOTFS_DIR}/etc/azaion/device.conf"
if [[ -f "$CONF_CLEANUP" ]]; then
sudo rm -f "$CONF_CLEANUP"
fi
echo ""
echo "========================================"
echo " Provisioning Summary"
echo "========================================"
printf "%-12s | %-10s | %s\n" "USB ID" "Serial" "Status"
echo "----------------------------------------"
for r in "${RESULTS[@]}"; do
echo "$r"
done
echo "========================================"
+179
@@ -0,0 +1,179 @@
#!/usr/bin/env bash
set -euo pipefail
ROOTFS="${ROOTFS_DIR:-/opt/nvidia/Linux_for_Tegra/rootfs}"
LOADER_IMAGE_TAR="${LOADER_IMAGE_TAR:-}"
RESOURCE_API_URL="${RESOURCE_API_URL:-https://api.azaion.com}"
LOADER_DEV_STAGE="${LOADER_DEV_STAGE:-main}"
LOADER_IMAGE="${LOADER_IMAGE:-localhost:5000/loader:arm}"
if [[ ! -d "$ROOTFS" ]]; then
echo "ERROR: Rootfs directory not found: $ROOTFS" >&2
exit 1
fi
if [[ -z "$LOADER_IMAGE_TAR" ]]; then
echo "ERROR: LOADER_IMAGE_TAR not set. Set it in .env to the Loader Docker image tar path." >&2
exit 1
fi
if [[ ! -f "$LOADER_IMAGE_TAR" ]]; then
echo "ERROR: Loader image tar not found: $LOADER_IMAGE_TAR" >&2
exit 1
fi
cleanup_mounts() {
for mp in proc sys dev/pts dev; do
sudo umount "$ROOTFS/$mp" 2>/dev/null || true
done
if [[ -f "$ROOTFS/etc/resolv.conf.setup-bak" ]]; then
sudo mv "$ROOTFS/etc/resolv.conf.setup-bak" "$ROOTFS/etc/resolv.conf"
fi
}
setup_mounts() {
# Unmount dev/pts before dev; the parent mount is busy while the child is mounted.
for mp in proc sys dev/pts dev; do
mountpoint -q "$ROOTFS/$mp" 2>/dev/null && sudo umount "$ROOTFS/$mp" 2>/dev/null || true
done
sudo mount --bind /proc "$ROOTFS/proc"
sudo mount --bind /sys "$ROOTFS/sys"
sudo mount --bind /dev "$ROOTFS/dev"
sudo mount --bind /dev/pts "$ROOTFS/dev/pts"
if [[ -f "$ROOTFS/etc/resolv.conf" ]]; then
sudo cp "$ROOTFS/etc/resolv.conf" "$ROOTFS/etc/resolv.conf.setup-bak"
fi
sudo cp /etc/resolv.conf "$ROOTFS/etc/resolv.conf"
}
if [[ "$(uname -m)" != "aarch64" ]]; then
if [[ ! -f "$ROOTFS/usr/bin/qemu-aarch64-static" ]]; then
sudo cp /usr/bin/qemu-aarch64-static "$ROOTFS/usr/bin/"
fi
fi
trap cleanup_mounts EXIT
echo "=== Setting up Docker in rootfs ==="
echo " Rootfs: $ROOTFS"
echo " Image tar: $LOADER_IMAGE_TAR"
echo ""
setup_mounts
if sudo chroot "$ROOTFS" docker --version &>/dev/null; then
echo "[1/6] Docker already installed, skipping..."
else
echo "[1/6] Installing Docker Engine..."
# -e makes the chroot script stop at the first failing command.
sudo chroot "$ROOTFS" bash -e -c '
apt-get update
apt-get install -y ca-certificates curl gnupg
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
chmod a+r /etc/apt/keyrings/docker.asc
. /etc/os-release
echo "deb [arch=arm64 signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu $VERSION_CODENAME stable" > /etc/apt/sources.list.d/docker.list
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
apt-get clean
rm -rf /var/lib/apt/lists/*
'
fi
if sudo chroot "$ROOTFS" dpkg -l nvidia-container-toolkit 2>/dev/null | grep -q '^ii'; then
echo "[2/6] NVIDIA Container Toolkit already installed, skipping..."
else
echo "[2/6] Installing NVIDIA Container Toolkit..."
# -e and pipefail make a failed download or apt step abort the setup.
sudo chroot "$ROOTFS" bash -e -o pipefail -c '
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
| gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
| sed "s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g" \
> /etc/apt/sources.list.d/nvidia-container-toolkit.list
apt-get update
apt-get install -y nvidia-container-toolkit
apt-get clean
rm -rf /var/lib/apt/lists/*
'
fi
echo "[3/6] Configuring Docker daemon (NVIDIA default runtime)..."
sudo mkdir -p "$ROOTFS/etc/docker"
sudo tee "$ROOTFS/etc/docker/daemon.json" > /dev/null <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
EOF
echo "[4/6] Enabling Docker and containerd services..."
sudo mkdir -p "$ROOTFS/etc/systemd/system/multi-user.target.wants"
sudo ln -sf /lib/systemd/system/docker.service \
"$ROOTFS/etc/systemd/system/multi-user.target.wants/docker.service"
sudo ln -sf /lib/systemd/system/containerd.service \
"$ROOTFS/etc/systemd/system/multi-user.target.wants/containerd.service"
echo "[5/6] Creating Azaion application layout..."
sudo mkdir -p "$ROOTFS/opt/azaion/models"
sudo mkdir -p "$ROOTFS/opt/azaion/state"
sudo tee "$ROOTFS/opt/azaion/docker-compose.yml" > /dev/null <<EOF
services:
  loader:
    image: ${LOADER_IMAGE}
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /opt/azaion/docker-compose.yml:/app/docker-compose.yml:ro
      - /opt/azaion/models:/app/models
      - /opt/azaion/state:/app/state
      - /etc/azaion/device.conf:/etc/azaion/device.conf:ro
    environment:
      RESOURCE_API_URL: ${RESOURCE_API_URL}
      LOADER_COMPOSE_FILE: /app/docker-compose.yml
      LOADER_MODEL_DIR: /app/models
      LOADER_DOWNLOAD_STATE_DIR: /app/state
      LOADER_DEV_STAGE: ${LOADER_DEV_STAGE}
      LOADER_ARCH: arm64
EOF
echo "[6/6] Installing Loader image and boot service..."
sudo cp "$LOADER_IMAGE_TAR" "$ROOTFS/opt/azaion/loader-image.tar"
sudo tee "$ROOTFS/opt/azaion/boot.sh" > /dev/null <<'EOF'
#!/bin/bash
set -e
if [ -f /opt/azaion/loader-image.tar ]; then
docker load -i /opt/azaion/loader-image.tar
rm -f /opt/azaion/loader-image.tar
fi
docker compose -f /opt/azaion/docker-compose.yml up -d
EOF
sudo chmod 755 "$ROOTFS/opt/azaion/boot.sh"
sudo tee "$ROOTFS/etc/systemd/system/azaion-loader.service" > /dev/null <<'EOF'
[Unit]
Description=Azaion Loader
After=docker.service
Requires=docker.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/opt/azaion/boot.sh
[Install]
WantedBy=multi-user.target
EOF
sudo ln -sf /etc/systemd/system/azaion-loader.service \
"$ROOTFS/etc/systemd/system/multi-user.target.wants/azaion-loader.service"
echo ""
echo "Docker setup complete."