# Azaion AI Training — Deployment Scripts

## Overview

| Script | Purpose | Location |
|--------|---------|----------|
| `deploy.sh` | Main deployment orchestrator | `scripts/deploy.sh` |
| `generate-config.sh` | Generate `config.yaml` from environment variables | `scripts/generate-config.sh` |
| `pull-images.sh` | Pull Docker images from registry | `scripts/pull-images.sh` |
| `start-services.sh` | Start all services via Docker Compose | `scripts/start-services.sh` |
| `stop-services.sh` | Graceful shutdown with tag backup | `scripts/stop-services.sh` |
| `health-check.sh` | Verify deployment health | `scripts/health-check.sh` |

## Prerequisites

- Docker and Docker Compose installed on target machine
- NVIDIA driver + Docker GPU support (`nvidia-container-toolkit`)
- SSH access to target machine (for remote deployment)
- `.env` file with required environment variables (see `.env.example`)

## Environment Variables

All scripts source `.env` from the project root.

| Variable | Required By | Purpose |
|----------|------------|---------|
| `DEPLOY_HOST` | `deploy.sh` (remote) | SSH target for remote deployment |
| `DEPLOY_USER` | `deploy.sh` (remote) | SSH user (default: `deploy`) |
| `DOCKER_REGISTRY` | `pull-images.sh` | Container registry URL |
| `DOCKER_IMAGE_TAG` | `pull-images.sh` | Image version to deploy (default: `latest`) |
| `AZAION_API_URL` | `generate-config.sh` | Azaion REST API URL |
| `AZAION_API_EMAIL` | `generate-config.sh` | API login email |
| `AZAION_API_PASSWORD` | `generate-config.sh` | API login password |
| `RABBITMQ_HOST` | `generate-config.sh` | RabbitMQ host |
| `RABBITMQ_PORT` | `generate-config.sh` | RabbitMQ port |
| `RABBITMQ_USER` | `generate-config.sh` | RabbitMQ username |
| `RABBITMQ_PASSWORD` | `generate-config.sh` | RabbitMQ password |
| `RABBITMQ_QUEUE_NAME` | `generate-config.sh` | RabbitMQ queue name |
| `AZAION_ROOT_DIR` | `start-services.sh`, `health-check.sh` | Root data directory (default: `/azaion`) |

## Script Details

### deploy.sh

Main orchestrator: generates config, pulls images, stops old services, starts new ones, checks health.

```
./scripts/deploy.sh              # Deploy latest version (local)
./scripts/deploy.sh --rollback   # Rollback to previous version
./scripts/deploy.sh --local      # Force local mode (skip SSH)
./scripts/deploy.sh --help       # Show usage
```

Flow: `generate-config.sh` → `pull-images.sh` → `stop-services.sh` → `start-services.sh` → `health-check.sh`

When `DEPLOY_HOST` is set, commands execute over SSH on the remote server. Without it, runs locally.

### generate-config.sh

Generates `config.yaml` from environment variables, preserving the existing config format the codebase expects. Validates that all required variables are set before writing.

```
./scripts/generate-config.sh     # Generate config.yaml
```

### pull-images.sh

Pulls Docker images for both deployable components from the configured registry.

```
./scripts/pull-images.sh         # Pull images
```

Images pulled:
- `${DOCKER_REGISTRY}/azaion/training:${DOCKER_IMAGE_TAG}`
- `${DOCKER_REGISTRY}/azaion/annotation-queue:${DOCKER_IMAGE_TAG}`

### start-services.sh

Creates the `/azaion/` directory tree if needed, then runs `docker compose up -d`.

```
./scripts/start-services.sh      # Start services
```

### stop-services.sh

Saves current image tags to `scripts/.previous-tags` for rollback, then stops and removes containers with a 30-second grace period.

```
./scripts/stop-services.sh       # Stop services
```

### health-check.sh

Checks container status, GPU availability, disk usage, and queue offset. Returns exit code 0 (healthy) or 1 (unhealthy).

```
./scripts/health-check.sh        # Run health check
```

Checks performed:
- Annotation queue and RabbitMQ containers running
- GPU available and temperature < 90°C
- Disk usage < 95% (warning at 80%)
- Queue offset file exists

## Common Properties

All scripts:
- Use `#!/bin/bash` with `set -euo pipefail`
- Support `--help` flag
- Source `.env` from project root if present
- Are idempotent
- Support remote execution via SSH (`DEPLOY_HOST` + `DEPLOY_USER`)
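
The shared conventions above, together with the required-variable validation described for `generate-config.sh`, could look roughly like the following. This is a minimal sketch, not the actual script source: the helper names `usage`, `load_env`, and `require_vars` are hypothetical, and the real scripts may structure this differently.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not taken from the real scripts.
set -euo pipefail

# Print a usage line; the wording is illustrative.
usage() {
    echo "Usage: $(basename "$0") [--help]"
}

# Source an env file if it exists, exporting every variable it defines.
# A missing file is not an error, matching "source .env if present".
load_env() {
    local env_file="$1"
    if [ -f "$env_file" ]; then
        set -a                 # auto-export variables assigned while sourcing
        . "$env_file"
        set +a
    fi
}

# Return non-zero and name every listed variable that is unset or empty.
require_vars() {
    local missing="" name
    for name in "$@"; do
        # ${!name:-} is bash indirect expansion with an empty default,
        # so it stays safe under `set -u`.
        if [ -z "${!name:-}" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing required variables:$missing" >&2
        return 1
    fi
}
```

A script built this way might call `load_env "$PROJECT_ROOT/.env"` and then `require_vars RABBITMQ_HOST RABBITMQ_PORT` before doing any work, so a missing value stops the run before anything is written. `PROJECT_ROOT` is likewise an assumed name.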