# Azaion Admin API — Deployment Scripts
**Date**: 2026-05-13 · **Cycle**: 1 · **Status**: shipped (this is the only doc that matches concrete files in `scripts/` and `secrets/`).
## 1. Overview
| Script | Purpose | Location |
|--------|---------|----------|
| `deploy.sh` | Main orchestrator (pull → stop → start → health) | `scripts/deploy.sh` |
| `pull-images.sh` | `docker login` + `docker pull` the target image | `scripts/pull-images.sh` |
| `stop-services.sh` | Graceful stop + record rollback target | `scripts/stop-services.sh` |
| `start-services.sh` | `docker run` with the materialized env file and bind mounts | `scripts/start-services.sh` |
| `health-check.sh` | Poll `/health/ready` until 200 or timeout | `scripts/health-check.sh` |
| `smoke.sh` | 6 critical-path checks against the **public** URL | `scripts/smoke.sh` |
| `_lib.sh` | Shared logging + env-overlay helpers | `scripts/_lib.sh` (sourced, not executed) |
| `run-tests.sh` | Existing — runs the docker-compose test suite locally | `scripts/run-tests.sh` |
| `run-performance-tests.sh` | Existing — runs k6 against the test compose stack | `scripts/run-performance-tests.sh` |
## 2. Prerequisites
On the **deploy host**:

| Requirement | Why |
|-------------|-----|
| Docker 24+ | `docker pull`, `docker run`, `--restart unless-stopped` |
| `sops` (≥ 3.8) | Decrypt `secrets/<env>.env` |
| `age` (≥ 1.1) | Backing crypto for sops |
| `curl` | Used by `health-check.sh` and `smoke.sh` |
| `jq` | Used by `smoke.sh` for JSON parsing |
| `/etc/azaion/age.key` (mode 0400) | Per-host age private key (see `secrets/README.md`) |

On the **operator's machine** (only for `smoke.sh`):

| Requirement | Why |
|-------------|-----|
| `curl`, `jq` | Same as host |
| Network access to the public URL | `BASE_URL` is the production / staging hostname |
## 3. Environment Variables
`load_env_overlay <env>` (defined in `scripts/_lib.sh`) resolves variables in this order; later sources override earlier ones:
1. `<repo>/.env` (if present — local-dev convenience; harmless on a prod host that has no `.env`)
2. `secrets/<env>.public.env` (committed plain text; loaded with `set -a`)
3. `secrets/<env>.env` (sops-decrypted to a tempfile, sourced, tempfile deleted on exit)
4. The shell environment that invoked `deploy.sh` (operator overrides)

The complete variable inventory is `.env.example` at the repo root. Variables specifically consumed by these scripts:

| Variable | Required by | Source | Notes |
|----------|-------------|--------|-------|
| `ENV` | `deploy.sh` | operator shell | `staging` or `production` |
| `REGISTRY_HOST`, `REGISTRY_IMAGE`, `REGISTRY_TAG` | pull / start | public env / operator | tag is the `<sha12>-<arch>` immutable tag from `.woodpecker/02-build-push.yml` |
| `REGISTRY_USER`, `REGISTRY_TOKEN` | pull | encrypted env | optional; if both missing, assumes `docker login` was done out-of-band |
| `DEPLOY_CONTAINER_NAME`, `DEPLOY_HOST_PORT`, `DEPLOY_HOST_CONTENT_DIR`, `DEPLOY_HOST_LOGS_DIR` | stop / start | public env | identical for staging and prod by default |
| `ASPNETCORE_ConnectionStrings__AzaionDb`, `__AzaionDbAdmin`, `JwtConfig__Secret` | start | encrypted env | the API validates these at boot and fails fast |
| `ASPNETCORE_ResourcesConfig__*`, `JwtConfig__{Issuer,Audience,Lifetime}` | start | public env (defaults from `appsettings.json`) | only override if the env value differs from the appsettings default |
| `SOPS_AGE_KEY_FILE` | `_lib.sh` | host | defaults to `/etc/azaion/age.key` if unset |
| `SMOKE_ADMIN_EMAIL`, `SMOKE_ADMIN_PASSWORD` | `smoke.sh` | operator shell | dedicated smoke-test admin user; rotate as a regular admin password |
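The resolution order described in this section can be sketched in a few lines of bash. This is a simplified illustration, not the contents of `_lib.sh`: the function name `load_overlay_sketch` is hypothetical, the sops decryption step is left out (the function only takes files that are already readable), and it parses plain `KEY=VALUE` lines instead of sourcing them with `set -a` as the real helper is described as doing.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch of the precedence rules; not the real load_env_overlay.
# Usage: load_overlay_sketch <file>...   (lowest-precedence file first)
load_overlay_sketch() {
  local -A operator_set=()
  local name file line key

  # Source 4: remember every name the invoking shell already exported,
  # so that no file can overwrite an operator override.
  while IFS= read -r name; do
    operator_set["$name"]=1
  done < <(compgen -e)

  for file in "$@"; do
    [[ -f "$file" ]] || continue          # e.g. no .env on a prod host
    while IFS= read -r line || [[ -n "$line" ]]; do
      # Only plain KEY=VALUE lines; comments and blank lines are skipped.
      # Values are taken literally (no quote or escape handling).
      [[ "$line" =~ ^[A-Za-z_][A-Za-z0-9_]*= ]] || continue
      key="${line%%=*}"
      [[ -n "${operator_set[$key]:-}" ]] && continue
      export "$key=${line#*=}"            # later files overwrite earlier ones
    done < "$file"
  done
}
```

A caller would pass the files lowest-precedence first, with the decrypted tempfile last, so that the encrypted values win over the public ones while anything exported by the operator wins over both.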
## 4. Script details
### `deploy.sh`
**Usage**:
```bash
ENV=staging ./scripts/deploy.sh <sha-tag>
ENV=production ./scripts/deploy.sh <sha-tag>
ENV=staging ./scripts/deploy.sh --rollback # uses scripts/.previous_tags.env
./scripts/deploy.sh --help
```
**Flow** (matches `_docs/04_deploy/deployment_procedures.md` §3 / §4):
1. Validate `ENV` and required commands.
2. Load env overlay (public + sops-decrypted).
3. If `--rollback`: read `scripts/.previous_tags.env` → set `SHA_TAG` to `PREVIOUS_SHA_TAG`.
4. `pull-images.sh` (login + pull).
5. `stop-services.sh` (records the SHA of whatever was running; graceful stop with `docker stop -t 40`; remove).
6. `start-services.sh` (`docker run --restart unless-stopped --env-file <materialized> --publish $DEPLOY_HOST_PORT:8080`).
7. `health-check.sh` (poll `/health/ready` with timeout).
8. Print a success line with the running revision.

**Failure handling**: any non-zero exit from a sub-script aborts `deploy.sh`, because `set -euo pipefail` stops the orchestrator at the first failing command. A failed step leaves `.previous_tags.env` as `stop-services.sh` last wrote it, so `--rollback` after a failed deploy targets the version that was running before the failed attempt.
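The rollback branch and the abort-on-first-failure behaviour can be illustrated with a small sketch. The function names `resolve_sha_tag` and `run_steps` are hypothetical and the real `deploy.sh` may be organised differently; only the `PREVIOUS_SHA_TAG` key and the `.previous_tags.env` file come from this document.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical: turn the CLI argument into the tag to deploy.
# $1 is either a <sha-tag> or --rollback; $2 is the rollback record file.
resolve_sha_tag() {
  local arg="$1" record="${2:-scripts/.previous_tags.env}" tag
  if [[ "$arg" != "--rollback" ]]; then
    printf '%s\n' "$arg"
    return 0
  fi
  [[ -f "$record" ]] || { echo "no rollback target recorded in $record" >&2; return 1; }
  # Read only the key we need instead of sourcing the whole file.
  tag="$(sed -n '/^PREVIOUS_SHA_TAG=/{s///p;q;}' "$record")"
  [[ -n "$tag" ]] || { echo "PREVIOUS_SHA_TAG missing in $record" >&2; return 1; }
  printf '%s\n' "$tag"
}

# Hypothetical: run each step in order and stop at the first failure.
# In a script running under `set -e` the explicit check is redundant;
# it is spelled out here to make the behaviour visible.
run_steps() {
  local step
  for step in "$@"; do
    echo "step: $step" >&2
    "$step" || { echo "step failed: $step" >&2; return 1; }
  done
}
```

Under these assumptions, a failure in any step means the later steps never run, and the rollback record is read only when `--rollback` is requested.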
### `pull-images.sh`
- `docker login` only when both `REGISTRY_USER` and `REGISTRY_TOKEN` are set; otherwise warns and continues (assumes pre-auth).
- `docker pull $REGISTRY_HOST/$REGISTRY_IMAGE:$REGISTRY_TAG`.
- Logs the resolved `RepoDigests[0]` to give the operator an immutable identifier in the deploy log.
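A minimal sketch of the three steps above, assuming the token is passed to `docker login` on stdin (the document does not say how the real script supplies it). The function name `pull_image_sketch` and the `DOCKER` override are hypothetical; the override exists only so the logic can be exercised with a stub instead of a real daemon.

```bash
#!/usr/bin/env bash
set -euo pipefail

# DOCKER is a hypothetical override so the sketch can be exercised with a stub.
DOCKER="${DOCKER:-docker}"

pull_image_sketch() {
  local ref="${REGISTRY_HOST:?}/${REGISTRY_IMAGE:?}:${REGISTRY_TAG:?}"

  if [[ -n "${REGISTRY_USER:-}" && -n "${REGISTRY_TOKEN:-}" ]]; then
    # Pass the token on stdin so it never shows up in the process list.
    printf '%s' "$REGISTRY_TOKEN" |
      "$DOCKER" login --username "$REGISTRY_USER" --password-stdin "$REGISTRY_HOST"
  else
    echo "WARN: no registry credentials set; assuming docker login was done out-of-band" >&2
  fi

  "$DOCKER" pull "$ref"
  # Print the content-addressed digest as an immutable identifier for the log.
  "$DOCKER" image inspect --format '{{index .RepoDigests 0}}' "$ref"
}
```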
### `stop-services.sh`
- Reads `org.opencontainers.image.revision` from the running container (label set by the Dockerfile).
- Writes `scripts/.previous_tags.env`:
```
PREVIOUS_SHA_TAG=<sha12>-<arch>
PREVIOUS_REVISION=<full sha>
RECORDED_AT=<ISO 8601>
```
- `docker stop -t 40` then `docker rm -f`.
- If the container does not exist, logs and exits 0 (idempotent — first deploy on a new host should succeed).
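The record-then-stop sequence might look roughly like the sketch below. Both function names are hypothetical, and deriving `PREVIOUS_SHA_TAG` from the tag part of `.Config.Image` is an assumption; the document only states that the revision comes from the `org.opencontainers.image.revision` label.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: persist the rollback target in the format shown above.
write_previous_tags() {
  local file="$1" sha_tag="$2" revision="$3"
  {
    printf 'PREVIOUS_SHA_TAG=%s\n' "$sha_tag"
    printf 'PREVIOUS_REVISION=%s\n' "$revision"
    printf 'RECORDED_AT=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  } > "$file"
}

# Hypothetical stop sequence: $1 = container name, $2 = record file.
stop_container_sketch() {
  local name="$1" record="$2" image revision
  if ! docker container inspect "$name" >/dev/null 2>&1; then
    # First deploy on a new host: nothing to record, nothing to stop.
    echo "container $name not found; nothing to stop" >&2
    return 0
  fi
  revision="$(docker inspect \
    --format '{{ index .Config.Labels "org.opencontainers.image.revision" }}' "$name")"
  # Assumption: the tag the container was started with is the rollback tag.
  image="$(docker inspect --format '{{ .Config.Image }}' "$name")"
  write_previous_tags "$record" "${image##*:}" "$revision"
  docker stop -t 40 "$name"   # 40 s grace period before SIGKILL
  docker rm -f "$name"
}
```

Writing the record before stopping means the rollback target is on disk even if the stop itself fails part-way.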
### `start-services.sh`
- Materializes a runtime env file by filtering the current shell environment with `grep -E '^(ASPNETCORE_|AZAION_)'`. Registry credentials and deploy-host plumbing variables stay on the host and never enter the container.
- `mkdir -p` for the bind-mounted `Content/` and `logs/` dirs (idempotent).
- Runs `docker run` with `--detach`, `--name`, `--restart unless-stopped`, `--env-file`, `--publish`, and `--volume`.
- Logs the container ID and the running revision.
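The env-file materialization can be sketched as follows. The function name `materialize_env_file` is hypothetical. The sketch passes `-E` to `grep` because the alternation in the pattern needs extended regular expressions, and it creates the file with a restrictive umask, which is an added precaution rather than something this document states about the real script.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write only the application variables to an env file.
materialize_env_file() {
  local out="$1"
  ( umask 077; : > "$out" )   # create the file readable by the current user only
  # Keep ASPNETCORE_* and AZAION_* lines; everything else stays on the host.
  # grep exits 1 when nothing matches; treat that as an empty file, not an error.
  # Limitation: a value containing newlines would be truncated to its first line.
  env | grep -E '^(ASPNETCORE_|AZAION_)' > "$out" || [[ $? -eq 1 ]]
}
```

The resulting file is what `--env-file` would point at; variables such as `REGISTRY_TOKEN` or `DEPLOY_HOST_PORT` do not match the prefixes and are therefore never written.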
### `health-check.sh`
- One-shot check on `/health/live` first (3 s timeout). If this fails the container is wedged — fail fast.
- Polls `/health/ready` every `HEALTH_INTERVAL` (default 2 s) until 200 or `HEALTH_TIMEOUT` (default 60 s).
- Returns 0 on first 200; non-zero on timeout.
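The liveness gate followed by a readiness poll can be sketched like this. The function names `http_status` and `wait_for_ready` are hypothetical, and applying the 3 s request timeout to the readiness polls as well is an assumption; the document only specifies it for the liveness check.

```bash
#!/usr/bin/env bash
set -euo pipefail

HEALTH_INTERVAL="${HEALTH_INTERVAL:-2}"   # seconds between polls
HEALTH_TIMEOUT="${HEALTH_TIMEOUT:-60}"    # overall budget in seconds

# Print the HTTP status code for a URL; curl prints 000 when it cannot connect.
http_status() {
  curl --silent --output /dev/null --max-time 3 --write-out '%{http_code}' "$1" || true
}

# Hypothetical equivalent of the one-shot liveness check plus readiness poll.
# $1 is the base URL of the container, e.g. http://localhost:<port>.
wait_for_ready() {
  local base_url="$1"
  local deadline=$(( SECONDS + HEALTH_TIMEOUT ))

  # Liveness first: if the process is wedged, polling readiness is pointless.
  [[ "$(http_status "$base_url/health/live")" == "200" ]] || return 1

  while (( SECONDS < deadline )); do
    [[ "$(http_status "$base_url/health/ready")" == "200" ]] && return 0
    sleep "$HEALTH_INTERVAL"
  done
  return 1   # timed out without ever seeing a 200
}
```

Using bash's `SECONDS` counter for the deadline keeps the timeout accurate even when individual requests are slow, which a simple attempt counter would not.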
### `smoke.sh`
Six checks, each ≤ 10 s, against the public `BASE_URL`:
1. `GET /health/live` (200)
2. `GET /health/ready` (200, best-effort — public URL may legitimately not expose this)
3. `POST /login` — extract JWT
4. `GET /users/current` (Bearer auth)
5. `GET /users` — count rows
6. `GET /resources/list` — sanity check that filesystem-backed paths are reachable

Smoke is intentionally lightweight; it does NOT exercise CRUD or detection-class endpoints (those are covered by E2E in CI).
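The difference between a required check and a best-effort one (check 2 above) can be shown with a small runner. The names `check`, `expect_200`, and `smoke_health_sketch` are hypothetical. Only the two health checks are sketched, because the request and response shapes of the login, user, and resource endpoints are not described in this document.

```bash
#!/usr/bin/env bash
set -euo pipefail

FAILURES=0

# Hypothetical runner: check <required|best-effort> <name> <command...>
check() {
  local mode="$1" name="$2"
  shift 2
  if "$@"; then
    echo "PASS  $name" >&2
  elif [[ "$mode" == "best-effort" ]]; then
    echo "WARN  $name (best-effort, not counted)" >&2
  else
    echo "FAIL  $name" >&2
    FAILURES=$(( FAILURES + 1 ))
  fi
}

# True when the URL answers 200 within the 10 s per-check budget.
expect_200() {
  local code
  code="$(curl --silent --output /dev/null --max-time 10 --write-out '%{http_code}' "$1" || true)"
  [[ "$code" == "200" ]]
}

# Checks 1 and 2 from the list above; expects BASE_URL to be exported.
smoke_health_sketch() {
  check required    "GET /health/live"  expect_200 "$BASE_URL/health/live"
  check best-effort "GET /health/ready" expect_200 "$BASE_URL/health/ready"
  (( FAILURES == 0 ))
}
```

Calling `smoke_health_sketch` with `BASE_URL` exported returns non-zero only if a required check failed, so a public URL that hides `/health/ready` does not fail the run.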
### `_lib.sh`
Shared library, sourced via `. "$SCRIPT_DIR/_lib.sh"` from every script and never executed directly (`scripts/_lib.sh` is mode 0644). It contains:
- `log_info` / `log_warn` / `log_error` / `die`
- `require_env <var…>` / `require_cmd <cmd…>`
- `load_env_overlay <env>` (the sops + age decryption pipeline)
- `container_exists`, `container_running`, `current_image_revision`
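For orientation, helpers of this kind are usually only a few lines each. The sketch below is an assumption about their shape, not a copy of `_lib.sh`: the log prefixes and error messages are invented for illustration, and `load_env_overlay` and `current_image_revision` are left out.

```bash
#!/usr/bin/env bash
# Sketch of a sourced helper library; the real _lib.sh may differ.

# All log lines go to stderr so stdout stays free for data.
log_info()  { printf '[INFO]  %s\n' "$*" >&2; }
log_warn()  { printf '[WARN]  %s\n' "$*" >&2; }
log_error() { printf '[ERROR] %s\n' "$*" >&2; }

# Log an error and terminate the calling script.
die() { log_error "$*"; exit 1; }

# Fail unless every named variable is set and non-empty.
require_env() {
  local var
  for var in "$@"; do
    [[ -n "${!var:-}" ]] || die "required environment variable is not set: $var"
  done
}

# Fail unless every named command is on PATH.
require_cmd() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || die "required command not found: $cmd"
  done
}

# True if a container with this name exists (running or stopped).
container_exists() { docker container inspect "$1" >/dev/null 2>&1; }

# True only if the container exists and is currently running.
container_running() {
  [[ "$(docker container inspect --format '{{ .State.Running }}' "$1" 2>/dev/null)" == "true" ]]
}
```

A script would typically start with something like `require_cmd docker curl` and `require_env ENV` so that missing prerequisites are reported before any container is touched.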
## 5. Examples
### First-ever staging deploy
```bash
# On the staging host, as deploy operator:
cd /opt/azaion/admin # or wherever the repo is checked out
ENV=staging ./scripts/deploy.sh a1b2c3d4e5f6-arm
```
### Rolling back production after a bad deploy
```bash
# Same host, immediately after the failed deploy:
ENV=production ./scripts/deploy.sh --rollback
```
### Running smoke from the operator workstation
```bash
export BASE_URL=https://stage.admin.azaion.com
export SMOKE_ADMIN_EMAIL=ops-smoke@azaion.com
export SMOKE_ADMIN_PASSWORD=... # from the operator's password manager
./scripts/smoke.sh
```
### Local development against the dockerized stack
The dev-time compose was deferred (Drift K-adjacent). Until it lands, run the API directly:
```bash
# Postgres on host port 4312 (per env/db/00_install.sh)
dotnet run --project Azaion.AdminApi
```
## 6. Common script properties
All scripts:
- Use `#!/usr/bin/env bash` with `set -euo pipefail`.
- Support `--help` / `-h` for usage.
- Source `_lib.sh` for logging and env-overlay helpers.
- Are idempotent where possible (running `deploy.sh` twice with the same SHA tag is a no-op for `pull-images.sh`, recreates the container in `stop`/`start`, and re-checks health).
- Write log lines to stderr (so stdout from a sub-process can still be piped).
## 7. What is NOT shipped in cycle 1
- Remote SSH wrapper. The deploy procedure assumes the operator runs the script on the target host. A `--remote $DEPLOY_HOST` mode is recorded as **Drift O** (carried forward).
- Slack notifications from inside the scripts. Notifications happen out-of-band per `_docs/04_deploy/observability.md` §5.
- Database migration step. Migrations are applied manually with `psql` per `_docs/04_deploy/environment_strategy.md` §4 (Drift J).
## 8. Related artifacts
- Postmortem template: `_docs/06_metrics/postmortem_template.md`
- Procedures: `_docs/04_deploy/deployment_procedures.md`
- Environment strategy: `_docs/04_deploy/environment_strategy.md`
- secrets/ folder onboarding: `secrets/README.md`