Azaion Admin API — Deployment Scripts

Date: 2026-05-13 · Cycle: 1 · Status: shipped (this is the only doc that matches concrete files in scripts/ and secrets/).

1. Overview

| Script | Purpose | Location |
|---|---|---|
| deploy.sh | Main orchestrator (pull → stop → start → health) | scripts/deploy.sh |
| pull-images.sh | docker login + docker pull the target image | scripts/pull-images.sh |
| stop-services.sh | Graceful stop + record rollback target | scripts/stop-services.sh |
| start-services.sh | docker run with the materialized env file and bind mounts | scripts/start-services.sh |
| health-check.sh | Poll /health/ready until 200 or timeout | scripts/health-check.sh |
| smoke.sh | 6 critical-path checks against the public URL | scripts/smoke.sh |
| _lib.sh | Shared logging + env-overlay helpers | scripts/_lib.sh (sourced, not executed) |
| run-tests.sh | Existing; runs the docker-compose test suite locally | scripts/run-tests.sh |
| run-performance-tests.sh | Existing; runs k6 against the test compose stack | scripts/run-performance-tests.sh |

2. Prerequisites

On the deploy host:

| Requirement | Why |
|---|---|
| Docker 24+ | docker pull, docker run, --restart unless-stopped |
| sops (≥ 3.8) | Decrypt secrets/<env>.env |
| age (≥ 1.1) | Backing crypto for sops |
| curl | Used by health-check.sh and smoke.sh |
| jq | Used by smoke.sh for JSON parsing |
| /etc/azaion/age.key (mode 0400) | Per-host age private key (see secrets/README.md) |

On the operator's machine (only for smoke.sh):

| Requirement | Why |
|---|---|
| curl, jq | Same as host |
| Network access to the public URL | BASE_URL is the production / staging hostname |

3. Environment Variables

The load_env_overlay <env> function in scripts/_lib.sh resolves variables in this order (later sources override earlier ones):

  1. <repo>/.env (if present — local-dev convenience; harmless on a prod host that has no .env)
  2. secrets/<env>.public.env (committed plain text; loaded with set -a)
  3. secrets/<env>.env (sops-decrypted to a tempfile, sourced, tempfile deleted on exit)
  4. The shell environment that invoked deploy.sh (operator overrides)
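
The precedence order above can be sketched roughly as follows. This is a minimal illustration, not the real load_env_overlay: the function name load_env_overlay_sketch, the repo-root argument, and the snapshot-and-restore approach to operator overrides are all assumptions, and the sketch deletes the decrypted tempfile right after sourcing it rather than on exit.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch of the overlay order; the real load_env_overlay may differ.
load_env_overlay_sketch() {
  local overlay_env="$1" overlay_root="${2:-.}"
  local overlay_name overlay_file overlay_tmp
  local -A overlay_operator=()

  # Snapshot everything the invoking shell exported, so it can win at the end.
  for overlay_name in $(compgen -e); do
    overlay_operator[$overlay_name]="${!overlay_name}"
  done

  set -a  # auto-export every variable assigned while sourcing
  for overlay_file in "$overlay_root/.env" \
                      "$overlay_root/secrets/$overlay_env.public.env"; do
    if [ -f "$overlay_file" ]; then . "$overlay_file"; fi
  done

  # Decrypt only when both sops and the encrypted file are present.
  overlay_file="$overlay_root/secrets/$overlay_env.env"
  if [ -f "$overlay_file" ] && command -v sops >/dev/null 2>&1; then
    overlay_tmp="$(mktemp)"
    SOPS_AGE_KEY_FILE="${SOPS_AGE_KEY_FILE:-/etc/azaion/age.key}" \
      sops --decrypt "$overlay_file" > "$overlay_tmp"
    . "$overlay_tmp"
    rm -f "$overlay_tmp"  # plaintext secrets should not linger on disk
  fi
  set +a

  # Re-apply the operator's values last, giving them the highest precedence.
  for overlay_name in "${!overlay_operator[@]}"; do
    # Read-only shell variables cannot be reassigned and are skipped.
    export "$overlay_name=${overlay_operator[$overlay_name]}" 2>/dev/null || true
  done
}
```

With this shape, a value exported by the operator before calling the function survives even when both .env and the public env file define the same name.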

The complete variable inventory is .env.example at the repo root. Variables specifically consumed by these scripts:

| Variable | Required by | Source | Notes |
|---|---|---|---|
| ENV | deploy.sh | operator shell | staging or production |
| REGISTRY_HOST, REGISTRY_IMAGE, REGISTRY_TAG | pull / start | public env / operator | tag is the <sha12>-<arch> immutable tag from .woodpecker/02-build-push.yml |
| REGISTRY_USER, REGISTRY_TOKEN | pull | encrypted env | optional; if both missing, assumes docker login was done out-of-band |
| DEPLOY_CONTAINER_NAME, DEPLOY_HOST_PORT, DEPLOY_HOST_CONTENT_DIR, DEPLOY_HOST_LOGS_DIR | stop / start | public env | identical for staging and prod by default |
| ASPNETCORE_ConnectionStrings__AzaionDb, __AzaionDbAdmin, JwtConfig__Secret | start | encrypted env | the API fail-fast checks these on boot |
| ASPNETCORE_ResourcesConfig__*, JwtConfig__{Issuer,Audience,Lifetime} | start | public env (defaults from appsettings.json) | only override if the env value differs from the appsettings default |
| SOPS_AGE_KEY_FILE | _lib.sh | host | defaults to /etc/azaion/age.key if unset |
| SMOKE_ADMIN_EMAIL, SMOKE_ADMIN_PASSWORD | smoke.sh | operator shell | dedicated smoke-test admin user; rotate as a regular admin password |

4. Script details

deploy.sh

Usage:

ENV=staging   ./scripts/deploy.sh <sha-tag>
ENV=production ./scripts/deploy.sh <sha-tag>
ENV=staging   ./scripts/deploy.sh --rollback   # uses scripts/.previous_tags.env
./scripts/deploy.sh --help

Flow (matches _docs/04_deploy/deployment_procedures.md §3 / §4):

  1. Validate ENV and required commands.
  2. Load env overlay (public + sops-decrypted).
  3. If --rollback: read scripts/.previous_tags.env → set SHA_TAG to PREVIOUS_SHA_TAG.
  4. pull-images.sh (login + pull).
  5. stop-services.sh (records the SHA of whatever was running; graceful stop with docker stop -t 40; remove).
  6. start-services.sh (docker run --restart unless-stopped --env-file <materialized> --publish $DEPLOY_HOST_PORT:8080).
  7. health-check.sh (poll /health/ready with timeout).
  8. Print success line with the running revision.

Failure handling: deploy.sh runs under set -euo pipefail, so any non-zero exit from a sub-script aborts the deploy at that step. The SHA recorded in .previous_tags.env still identifies the version that was running before the failed attempt, so --rollback after a failed deploy restores that version.
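
The flow and its abort-on-first-failure behaviour could look roughly like the sketch below. It is illustrative only: resolve_sha_tag and run_deploy_steps are hypothetical names, and the sub-scripts are invoked through bash rather than executed directly to keep the example independent of file modes.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch of the orchestration order; not the real deploy.sh.

# Pick the tag to deploy: an explicit <sha-tag>, or the recorded one on --rollback.
resolve_sha_tag() {
  local arg="$1" state_file="$2"
  if [ "$arg" = "--rollback" ]; then
    [ -f "$state_file" ] || { echo "no rollback target recorded" >&2; return 1; }
    # The state file is plain KEY=VALUE, so a sed lookup avoids sourcing it.
    sed -n 's/^PREVIOUS_SHA_TAG=//p' "$state_file"
  else
    printf '%s\n' "$arg"
  fi
}

# Run the sub-scripts in order and stop at the first failure.
run_deploy_steps() {
  local scripts_dir="$1" step
  for step in pull-images.sh stop-services.sh start-services.sh health-check.sh; do
    echo "deploy: running $step" >&2
    if ! bash "$scripts_dir/$step"; then
      echo "deploy: $step failed, aborting" >&2
      return 1
    fi
  done
  echo "deploy: all steps succeeded" >&2
}
```

A caller would typically export REGISTRY_TAG from the result of resolve_sha_tag before invoking run_deploy_steps; that wiring is omitted here.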

pull-images.sh

  • docker login only when both REGISTRY_USER and REGISTRY_TOKEN are set; otherwise warns and continues (assumes pre-auth).
  • docker pull $REGISTRY_HOST/$REGISTRY_IMAGE:$REGISTRY_TAG.
  • Logs the resolved RepoDigests[0] to give the operator an immutable identifier in the deploy log.
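
A minimal sketch of the three bullets above, assuming the variables from section 3 are already loaded. The function name pull_image is hypothetical, and the real script may log differently.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch of the pull step; not the real pull-images.sh.
pull_image() {
  local ref="$REGISTRY_HOST/$REGISTRY_IMAGE:$REGISTRY_TAG"

  if [ -n "${REGISTRY_USER:-}" ] && [ -n "${REGISTRY_TOKEN:-}" ]; then
    # --password-stdin keeps the token out of the process list.
    printf '%s' "$REGISTRY_TOKEN" |
      docker login --username "$REGISTRY_USER" --password-stdin "$REGISTRY_HOST" >&2
  else
    echo "WARN: no registry credentials set, assuming an existing docker login" >&2
  fi

  docker pull "$ref" >&2

  # Print the content-addressed digest so the deploy log has an immutable identifier.
  docker image inspect --format '{{index .RepoDigests 0}}' "$ref"
}
```

Sending progress output to stderr and only the digest to stdout lets a caller capture the digest with command substitution.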

stop-services.sh

  • Reads org.opencontainers.image.revision from the running container (label set by the Dockerfile).
  • Writes scripts/.previous_tags.env:
    PREVIOUS_SHA_TAG=<sha12>-<arch>
    PREVIOUS_REVISION=<full sha>
    RECORDED_AT=<ISO 8601>
    
  • docker stop -t 40 then docker rm -f.
  • If the container does not exist, logs and exits 0 (idempotent — first deploy on a new host should succeed).
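
The record-then-stop sequence might be sketched as below. The function name stop_and_record is hypothetical, and deriving PREVIOUS_SHA_TAG from the text after the last ':' of the container's image reference is an assumption; the real script may obtain the tag another way.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch of the stop step; not the real stop-services.sh.
stop_and_record() {
  local name="$1" state_file="$2" image revision

  # First deploy on a new host: nothing to stop, nothing to record.
  if ! docker container inspect "$name" >/dev/null 2>&1; then
    echo "container $name not found, nothing to stop" >&2
    return 0
  fi

  # Assumption: the rollback tag is whatever follows the last ':' of the image reference.
  image="$(docker container inspect --format '{{.Config.Image}}' "$name")"
  revision="$(docker container inspect \
    --format '{{index .Config.Labels "org.opencontainers.image.revision"}}' "$name")"

  # Record the rollback target before touching the running container.
  {
    echo "PREVIOUS_SHA_TAG=${image##*:}"
    echo "PREVIOUS_REVISION=$revision"
    echo "RECORDED_AT=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  } > "$state_file"

  docker stop -t 40 "$name" >/dev/null   # allow up to 40 s for a graceful shutdown
  docker rm -f "$name" >/dev/null
}
```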

start-services.sh

  • Materializes a runtime env file by filtering the current shell environment with grep -E '^(ASPNETCORE_|AZAION_)' (the alternation requires extended-regex mode). Registry credentials and deploy-host plumbing variables stay on the host and never enter the container.
  • mkdir -p for the bind-mounted Content/ and logs/ dirs (idempotent).
  • docker run with --detach, --name, --restart unless-stopped, --env-file, --publish, and --volume.
  • Logs the container ID and the running revision.
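
The env-file materialization is the step most worth illustrating, since it decides which variables reach the container. A minimal sketch, assuming the overlay has already been exported into the shell; materialize_env_file is a hypothetical name, and values containing newlines are not handled.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch of the env-file step; not the real start-services.sh.
materialize_env_file() {
  local out
  out="$(mktemp)"
  chmod 600 "$out"   # the file holds secrets such as connection strings
  # -E enables the (a|b) alternation; "|| true" tolerates zero matches under pipefail.
  env | grep -E '^(ASPNETCORE_|AZAION_)' > "$out" || true
  printf '%s\n' "$out"
}
```

The printed path can then be passed to docker run --env-file. Docker reads each line as a literal NAME=value pair, which matches the unquoted output of env.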

health-check.sh

  • One-shot check on /health/live first (3 s timeout). If this fails the container is wedged — fail fast.
  • Polls /health/ready every HEALTH_INTERVAL (default 2 s) until 200 or HEALTH_TIMEOUT (default 60 s).
  • Returns 0 on first 200; non-zero on timeout.
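
The liveness gate followed by a readiness poll can be sketched as below. The function names http_status and wait_until_ready are hypothetical, and the 3 s per-request timeout on the readiness probe is an assumption (the document only states it for the liveness probe).

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch of the health gate; not the real health-check.sh.

# http_status <url> <max-seconds>: prints the HTTP status code.
# curl prints 000 when the connection itself fails.
http_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' --max-time "$2" "$1" || true
}

wait_until_ready() {
  local base_url="$1" timeout="${HEALTH_TIMEOUT:-60}" interval="${HEALTH_INTERVAL:-2}"

  # Liveness first: if the process cannot answer at all, polling is pointless.
  if [ "$(http_status "$base_url/health/live" 3)" != "200" ]; then
    echo "liveness check failed" >&2
    return 1
  fi

  SECONDS=0   # bash built-in counter, incremented once per second
  while [ "$SECONDS" -lt "$timeout" ]; do
    if [ "$(http_status "$base_url/health/ready" 3)" = "200" ]; then
      echo "ready after ${SECONDS}s" >&2
      return 0
    fi
    sleep "$interval"
  done

  echo "not ready after ${timeout}s" >&2
  return 1
}
```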

smoke.sh

Six checks, each ≤ 10 s, against the public BASE_URL:

  1. GET /health/live (200)
  2. GET /health/ready (200, best-effort — public URL may legitimately not expose this)
  3. POST /login — extract JWT
  4. GET /users/current (Bearer auth)
  5. GET /users — count rows
  6. GET /resources/list — sanity that filesystem-backed paths are reachable

Smoke is intentionally lightweight; it does NOT exercise CRUD or detection-class endpoints (those are covered by E2E in CI).
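
One way to structure such checks is a small runner that compares the HTTP status and treats best-effort checks as non-fatal. The sketch below is an assumption about shape only: check, SMOKE_FAILURES, and the required/best-effort argument are hypothetical names, and the login request body and JWT field name are deliberately left out because the document does not specify them.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch of a smoke-check runner; not the real smoke.sh.
SMOKE_FAILURES=0

# check <label> <expected-status> <required|best-effort> <curl args...>
check() {
  local label="$1" expected="$2" mode="$3" status
  shift 3
  # --max-time 10 enforces the per-check budget of 10 s.
  status="$(curl --silent --output /dev/null --write-out '%{http_code}' \
    --max-time 10 "$@" || true)"
  if [ "$status" = "$expected" ]; then
    echo "PASS $label" >&2
  elif [ "$mode" = "best-effort" ]; then
    echo "SKIP $label (got $status)" >&2
  else
    echo "FAIL $label (expected $expected, got $status)" >&2
    SMOKE_FAILURES=$((SMOKE_FAILURES + 1))
  fi
}

# Example calls (commented out; BASE_URL and TOKEN must be set first):
#   check live  200 required    "$BASE_URL/health/live"
#   check ready 200 best-effort "$BASE_URL/health/ready"
#   check me    200 required    -H "Authorization: Bearer $TOKEN" "$BASE_URL/users/current"
```

A script built this way would exit non-zero at the end when SMOKE_FAILURES is greater than zero.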

_lib.sh

Shared library, sourced via . "$SCRIPT_DIR/_lib.sh" from every script. It is not executable (scripts/_lib.sh has mode 0644). It contains:

  • log_info / log_warn / log_error / die
  • require_env <var…> / require_cmd <cmd…>
  • load_env_overlay <env> (the sops + age decryption pipeline)
  • container_exists, container_running, current_image_revision
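
For orientation, the logging and guard helpers could be as small as the sketch below. The function names come from the list above, but the bodies and the log-line format are assumptions, not the contents of the real _lib.sh.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch of the shared helpers; the real _lib.sh may differ.

# All log lines go to stderr so stdout stays free for data.
log_info()  { printf '[INFO] %s\n'  "$*" >&2; }
log_warn()  { printf '[WARN] %s\n'  "$*" >&2; }
log_error() { printf '[ERROR] %s\n' "$*" >&2; }
die()       { log_error "$*"; exit 1; }

# Fail unless every named variable is set and non-empty.
require_env() {
  local name
  for name in "$@"; do
    [ -n "${!name:-}" ] || die "missing required variable: $name"
  done
}

# Fail unless every named command is on PATH.
require_cmd() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || die "missing required command: $cmd"
  done
}
```

A script would call require_cmd docker curl and require_env ENV near the top, so that a misconfigured host fails before any container is touched.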

5. Examples

First-ever staging deploy

# On the staging host, as deploy operator:
cd /opt/azaion/admin   # or wherever the repo is checked out
ENV=staging ./scripts/deploy.sh a1b2c3d4e5f6-arm

Rolling back production after a bad deploy

# Same host, immediately after the failed deploy:
ENV=production ./scripts/deploy.sh --rollback

Running smoke from the operator workstation

export BASE_URL=https://stage.admin.azaion.com
export SMOKE_ADMIN_EMAIL=ops-smoke@azaion.com
export SMOKE_ADMIN_PASSWORD=...   # from the operator's password manager
./scripts/smoke.sh

Local development against the dockerized stack

The dev-time compose was deferred (Drift K-adjacent). Until it lands, run the API directly:

# Postgres on host port 4312 (per env/db/00_install.sh)
dotnet run --project Azaion.AdminApi

6. Common script properties

All scripts:

  • Use #!/usr/bin/env bash with set -euo pipefail.
  • Support --help / -h for usage.
  • Source _lib.sh for logging and env-overlay helpers.
  • Are idempotent where possible (running deploy.sh twice with the same SHA tag is a no-op for pull-images.sh, recreates the container in stop/start, and re-checks health).
  • Write log lines to stderr (so stdout from a sub-process can still be piped).

7. What is NOT shipped in cycle 1

  • Remote SSH wrapper. The deploy procedure assumes the operator runs the script on the target host. A --remote $DEPLOY_HOST mode is recorded as Drift O (carried forward).
  • Slack notifications from inside the scripts. Notifications happen out-of-band per _docs/04_deploy/observability.md §5.
  • Database migration step. Migrations are applied manually with psql per _docs/04_deploy/environment_strategy.md §4 (Drift J).

8. Related documents

  • Postmortem template: _docs/06_metrics/postmortem_template.md
  • Procedures: _docs/04_deploy/deployment_procedures.md
  • Environment strategy: _docs/04_deploy/environment_strategy.md
  • secrets/ folder onboarding: secrets/README.md