Azaion AI Training — Deployment Scripts
Overview
| Script | Purpose | Location |
|---|---|---|
| `deploy.sh` | Main deployment orchestrator | `scripts/deploy.sh` |
| `generate-config.sh` | Generate `config.yaml` from environment variables | `scripts/generate-config.sh` |
| `pull-images.sh` | Pull Docker images from registry | `scripts/pull-images.sh` |
| `start-services.sh` | Start all services via Docker Compose | `scripts/start-services.sh` |
| `stop-services.sh` | Graceful shutdown with tag backup | `scripts/stop-services.sh` |
| `health-check.sh` | Verify deployment health | `scripts/health-check.sh` |
Prerequisites
- Docker and Docker Compose installed on the target machine
- NVIDIA driver + Docker GPU support (`nvidia-container-toolkit`)
- SSH access to the target machine (for remote deployment)
- `.env` file with required environment variables (see `.env.example`)
Environment Variables
All scripts source .env from the project root.
| Variable | Required By | Purpose |
|---|---|---|
| `DEPLOY_HOST` | `deploy.sh` (remote) | SSH target for remote deployment |
| `DEPLOY_USER` | `deploy.sh` (remote) | SSH user (default: `deploy`) |
| `DOCKER_REGISTRY` | `pull-images.sh` | Container registry URL |
| `DOCKER_IMAGE_TAG` | `pull-images.sh` | Image version to deploy (default: `latest`) |
| `AZAION_API_URL` | `generate-config.sh` | Azaion REST API URL |
| `AZAION_API_EMAIL` | `generate-config.sh` | API login email |
| `AZAION_API_PASSWORD` | `generate-config.sh` | API login password |
| `RABBITMQ_HOST` | `generate-config.sh` | RabbitMQ host |
| `RABBITMQ_PORT` | `generate-config.sh` | RabbitMQ port |
| `RABBITMQ_USER` | `generate-config.sh` | RabbitMQ username |
| `RABBITMQ_PASSWORD` | `generate-config.sh` | RabbitMQ password |
| `RABBITMQ_QUEUE_NAME` | `generate-config.sh` | RabbitMQ queue name |
| `AZAION_ROOT_DIR` | `start-services.sh`, `health-check.sh` | Root data directory (default: `/azaion`) |
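
As an illustration of how a script might load these variables, here is a minimal sketch. The function names `load_env` and `apply_defaults` are hypothetical and do not come from the actual scripts; only the three default values are taken from the table above.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: source KEY=VALUE lines from an env file, if it exists.
load_env() {
  local env_file="$1"
  if [[ -f "$env_file" ]]; then
    set -a                 # export every variable assigned while sourcing
    # shellcheck disable=SC1090
    source "$env_file"
    set +a
  fi
}

# Hypothetical helper: fall back to the defaults documented in the table.
apply_defaults() {
  : "${DEPLOY_USER:=deploy}"
  : "${DOCKER_IMAGE_TAG:=latest}"
  : "${AZAION_ROOT_DIR:=/azaion}"
  export DEPLOY_USER DOCKER_IMAGE_TAG AZAION_ROOT_DIR
}
```

A script would call `load_env` with the path to `.env` in the project root and then `apply_defaults`, so values in the file take precedence over the defaults.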
Script Details
deploy.sh
Main orchestrator: generates config, pulls images, stops old services, starts new ones, checks health.
./scripts/deploy.sh # Deploy latest version (remote if DEPLOY_HOST is set, otherwise local)
./scripts/deploy.sh --rollback # Rollback to previous version
./scripts/deploy.sh --local # Force local mode (skip SSH)
./scripts/deploy.sh --help # Show usage
Flow: generate-config.sh → pull-images.sh → stop-services.sh → start-services.sh → health-check.sh
When DEPLOY_HOST is set, commands execute over SSH on the remote server; otherwise they run locally.
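
The flow and the local/remote switch described above can be sketched as a dry-run planner that only prints the commands it would run. This is an assumption-laden illustration, not the real `deploy.sh`: the function names are hypothetical, and the exact form of the remote SSH command (including the working directory on the remote host) is not specified in this document.

```shell
#!/bin/bash
set -euo pipefail

# Steps in the order given by the documented flow.
DEPLOY_STEPS=(generate-config.sh pull-images.sh stop-services.sh start-services.sh health-check.sh)

# Print the command for one step. Remote form is used only when
# DEPLOY_HOST is set and the mode is not "local" (hypothetical layout).
build_step_command() {
  local step="$1" mode="${2:-auto}"
  if [[ "$mode" != "local" && -n "${DEPLOY_HOST:-}" ]]; then
    printf 'ssh %s@%s ./scripts/%s\n' "${DEPLOY_USER:-deploy}" "$DEPLOY_HOST" "$step"
  else
    printf './scripts/%s\n' "$step"
  fi
}

# Dry run: list the commands for a full deployment, one per line.
plan_deploy() {
  local mode="${1:-auto}" step
  for step in "${DEPLOY_STEPS[@]}"; do
    build_step_command "$step" "$mode"
  done
}
```

Calling `plan_deploy local` corresponds to the `--local` flag: it prints the five local commands even when `DEPLOY_HOST` is set.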
generate-config.sh
Generates config.yaml from environment variables, preserving the existing config format the codebase expects. Validates that all required variables are set before writing.
./scripts/generate-config.sh # Generate config.yaml
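
The validation step could look roughly like the sketch below. The function name `check_required_vars` is hypothetical; the variable list is the set marked as required by `generate-config.sh` in the environment table. The layout of `config.yaml` itself is not described here, so the sketch deliberately stops before writing any file.

```shell
#!/bin/bash
set -euo pipefail

# Variables listed as required by generate-config.sh.
REQUIRED_VARS=(
  AZAION_API_URL AZAION_API_EMAIL AZAION_API_PASSWORD
  RABBITMQ_HOST RABBITMQ_PORT RABBITMQ_USER RABBITMQ_PASSWORD RABBITMQ_QUEUE_NAME
)

# Hypothetical helper: report every unset or empty variable on stderr
# and return 1 if at least one is missing, 0 otherwise.
check_required_vars() {
  local name missing=0
  for name in "$@"; do
    if [[ -z "${!name:-}" ]]; then   # indirect expansion: value of the variable named $name
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_required_vars "${REQUIRED_VARS[@]}"` before generating the file reports all missing variables in one pass rather than stopping at the first.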
pull-images.sh
Pulls Docker images for both deployable components from the configured registry.
./scripts/pull-images.sh # Pull images
Images pulled:
- `${DOCKER_REGISTRY}/azaion/training:${DOCKER_IMAGE_TAG}`
- `${DOCKER_REGISTRY}/azaion/annotation-queue:${DOCKER_IMAGE_TAG}`
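
To make the naming scheme concrete, the following sketch builds the two image references and pulls them. The function names are hypothetical, and the actual `pull-images.sh` may be structured differently; only the reference pattern and the `latest` default come from this document.

```shell
#!/bin/bash
set -euo pipefail

# Print one image reference per line for the two deployable components.
image_refs() {
  local registry="${DOCKER_REGISTRY:?DOCKER_REGISTRY must be set}"
  local tag="${DOCKER_IMAGE_TAG:-latest}"
  local component
  for component in training annotation-queue; do
    printf '%s/azaion/%s:%s\n' "$registry" "$component" "$tag"
  done
}

# Pull every reference produced by image_refs (requires Docker).
pull_images() {
  local ref
  while IFS= read -r ref; do
    docker pull "$ref"
  done < <(image_refs)
}
```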
start-services.sh
Creates the data directory tree under AZAION_ROOT_DIR (default: /azaion) if needed, then runs docker compose up -d.
./scripts/start-services.sh # Start services
stop-services.sh
Saves current image tags to scripts/.previous-tags for rollback, then stops and removes containers with a 30-second grace period.
./scripts/stop-services.sh # Stop services
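
A rough sketch of the tag backup and shutdown is shown below. Several details are assumptions: the one-reference-per-line layout of `scripts/.previous-tags`, the use of `docker ps` (which lists images of all running containers, not only this project's), and mapping the 30-second grace period to the `--timeout` option of `docker compose down`. The function names are hypothetical.

```shell
#!/bin/bash
set -euo pipefail

# Read image references from stdin and store them, de-duplicated,
# one per line (assumed file layout).
save_tags() {
  local tags_file="$1"
  mkdir -p "$(dirname "$tags_file")"
  sort -u > "$tags_file"
}

# Record the images of running containers, then stop and remove the
# Compose services, allowing 30 seconds before containers are killed.
stop_services() {
  local tags_file="${1:-scripts/.previous-tags}"
  docker ps --format '{{.Image}}' | save_tags "$tags_file"
  docker compose down --timeout 30
}
```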
health-check.sh
Checks container status, GPU availability, disk usage, and queue offset. Returns exit code 0 (healthy) or 1 (unhealthy).
./scripts/health-check.sh # Run health check
Checks performed:
- Annotation queue and RabbitMQ containers running
- GPU available and temperature < 90°C
- Disk usage < 95% (warning at 80%)
- Queue offset file exists
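
The numeric thresholds above can be expressed as small helper functions, sketched here under stated assumptions. All function names are hypothetical. The container and queue offset checks are left out because the container names and the offset file path are not given in this document.

```shell
#!/bin/bash
set -euo pipefail

# Map a disk usage percentage to ok / warning / unhealthy
# (warning from 80%, unhealthy from 95%).
classify_disk_usage() {
  local pct="$1"
  if (( pct >= 95 )); then
    echo "unhealthy"
  elif (( pct >= 80 )); then
    echo "warning"
  else
    echo "ok"
  fi
}

# Succeed only when the GPU temperature in degrees Celsius is below 90.
gpu_temp_ok() {
  local temp_c="$1"
  (( temp_c < 90 ))
}

# Usage percentage of the filesystem that holds the given directory.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Temperature of the first GPU, as reported by nvidia-smi.
gpu_temperature() {
  nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | sed -n 1p
}
```

For example, `classify_disk_usage "$(disk_usage_percent "${AZAION_ROOT_DIR:-/azaion}")"` would classify the data directory, and `gpu_temp_ok "$(gpu_temperature)"` would gate on GPU temperature.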
Common Properties
All scripts:
- Use `#!/bin/bash` with `set -euo pipefail`
- Support the `--help` flag
- Source `.env` from project root if present
- Are idempotent
- Support remote execution via SSH (`DEPLOY_HOST` + `DEPLOY_USER`)
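
A script skeleton following the first two conventions might look like the sketch below. It is illustrative only: `example.sh`, `parse_args`, and `main` are hypothetical names, and the real scripts may handle arguments differently.

```shell
#!/bin/bash
set -euo pipefail

usage() {
  cat <<'EOF'
Usage: example.sh [--help]

Hypothetical script following the shared conventions.
EOF
}

# Set ACTION to "help" or "run"; return 2 on an unknown option.
parse_args() {
  ACTION=run
  local arg
  for arg in "$@"; do
    case "$arg" in
      --help) ACTION=help ;;
      *) echo "Unknown option: $arg" >&2; return 2 ;;
    esac
  done
}

main() {
  # Propagate the error explicitly: set -e is suspended when main
  # itself is called from an "if" or "||" context.
  parse_args "$@" || return $?
  if [[ "$ACTION" == help ]]; then
    usage
    return 0
  fi
  echo "running"   # placeholder for the script's real work
}

# A real script would end with: main "$@"
```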