# Azaion Admin API — Deployment Scripts

**Date**: 2026-05-13 · **Cycle**: 1 · **Status**: shipped (this is the only doc that matches concrete files in `scripts/` and `secrets/`).

## 1. Overview

| Script | Purpose | Location |
|--------|---------|----------|
| `deploy.sh` | Main orchestrator (pull → stop → start → health) | `scripts/deploy.sh` |
| `pull-images.sh` | `docker login` + `docker pull` the target image | `scripts/pull-images.sh` |
| `stop-services.sh` | Graceful stop + record rollback target | `scripts/stop-services.sh` |
| `start-services.sh` | `docker run` with the materialized env file and bind mounts | `scripts/start-services.sh` |
| `health-check.sh` | Poll `/health/ready` until 200 or timeout | `scripts/health-check.sh` |
| `smoke.sh` | 6 critical-path checks against the **public** URL | `scripts/smoke.sh` |
| `_lib.sh` | Shared logging + env-overlay helpers | `scripts/_lib.sh` (sourced, not executed) |
| `run-tests.sh` | Existing: runs the docker-compose test suite locally | `scripts/run-tests.sh` |
| `run-performance-tests.sh` | Existing: runs k6 against the test compose stack | `scripts/run-performance-tests.sh` |

## 2. Prerequisites

On the **deploy host**:

| Requirement | Why |
|-------------|-----|
| Docker 24+ | `docker pull`, `docker run`, `--restart unless-stopped` |
| `sops` (≥ 3.8) | Decrypt `secrets/<env>.env` |
| `age` (≥ 1.1) | Backing crypto for sops |
| `curl` | Used by `health-check.sh` and `smoke.sh` |
| `jq` | Used by `smoke.sh` for JSON parsing |
| `/etc/azaion/age.key` (mode 0400) | Per-host age private key (see `secrets/README.md`) |

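The host requirements above could be verified before a deploy with a preflight check along these lines. This is a minimal sketch, not one of the shipped scripts: the function names `check_cmds` and `check_key_mode` are hypothetical, and `stat -c` assumes GNU coreutils on a Linux host.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helpers; names are illustrative, not taken from scripts/.

# Return non-zero (and list what is missing) if any command is absent from PATH.
check_cmds() {
  local cmd missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Return non-zero unless the key file exists with mode 0400 (owner read-only).
check_key_mode() {
  local key_file="${1:-/etc/azaion/age.key}" mode
  [ -f "$key_file" ] || { echo "missing key file: $key_file" >&2; return 1; }
  mode="$(stat -c '%a' "$key_file")"   # GNU stat; prints the octal mode, e.g. 400
  [ "$mode" = "400" ] || { echo "unexpected mode $mode on $key_file" >&2; return 1; }
}
```

A caller might run `check_cmds docker sops age curl jq && check_key_mode "${SOPS_AGE_KEY_FILE:-/etc/azaion/age.key}"` before doing anything else.
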
On the **operator's machine** (only for `smoke.sh`):

| Requirement | Why |
|-------------|-----|
| `curl`, `jq` | Same as host |
| Network access to the public URL | `BASE_URL` is the production / staging hostname |

## 3. Environment Variables

`load_env_overlay <env>` in `scripts/_lib.sh` resolves variables in this order (later sources override earlier ones):

1. `<repo>/.env` (if present; a local-dev convenience that is harmless on a prod host with no `.env`)
2. `secrets/<env>.public.env` (committed plain text; loaded with `set -a`)
3. `secrets/<env>.env` (sops-decrypted to a tempfile, sourced, tempfile deleted on exit)
4. The shell environment that invoked `deploy.sh` (operator overrides)

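The precedence order above can be sketched as follows. This is an illustration under stated assumptions, not the shipped `load_env_overlay`: the names `source_exported` and `load_env_overlay_sketch` are hypothetical, the operator's environment is made to win by snapshotting it up front and restoring it at the end, and the sketch needs bash 4+ for associative arrays.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the precedence rules; the real load_env_overlay
# in scripts/_lib.sh may be implemented differently.

# Source a file with auto-export on, so every assignment becomes an env var.
source_exported() {
  set -a
  . "$1"
  set +a
}

load_env_overlay_sketch() {
  local _ov_env="$1" _ov_root="${2:-.}" _ov_name _ov_tmp
  local -A _ov_saved=()

  # Snapshot what the invoking shell exported (source 4) before any file loads.
  for _ov_name in $(compgen -e); do
    _ov_saved["$_ov_name"]="${!_ov_name}"
  done

  # 1. Local-dev convenience file, if present.
  if [ -f "$_ov_root/.env" ]; then source_exported "$_ov_root/.env"; fi
  # 2. Committed plain-text values.
  if [ -f "$_ov_root/secrets/$_ov_env.public.env" ]; then
    source_exported "$_ov_root/secrets/$_ov_env.public.env"
  fi
  # 3. Encrypted values: decrypt to a tempfile, source it, delete it.
  if [ -f "$_ov_root/secrets/$_ov_env.env" ]; then
    _ov_tmp="$(mktemp)"
    if ! SOPS_AGE_KEY_FILE="${SOPS_AGE_KEY_FILE:-/etc/azaion/age.key}" \
         sops --decrypt "$_ov_root/secrets/$_ov_env.env" > "$_ov_tmp"; then
      rm -f "$_ov_tmp"
      return 1
    fi
    source_exported "$_ov_tmp"
    rm -f "$_ov_tmp"
  fi

  # 4. Operator overrides: put back any exported value that a file replaced.
  for _ov_name in "${!_ov_saved[@]}"; do
    if [ "${!_ov_name-}" != "${_ov_saved[$_ov_name]}" ]; then
      export "$_ov_name=${_ov_saved[$_ov_name]}"
    fi
  done
}
```

With this shape, a value exported by the operator (for example a one-off `DEPLOY_HOST_PORT`) survives even when all three files define the same variable.
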
The complete variable inventory is `.env.example` at the repo root. Variables specifically consumed by these scripts:

| Variable | Required by | Source | Notes |
|----------|-------------|--------|-------|
| `ENV` | `deploy.sh` | operator shell | `staging` or `production` |
| `REGISTRY_HOST`, `REGISTRY_IMAGE`, `REGISTRY_TAG` | pull / start | public env / operator | the tag is the immutable `<sha12>-<arch>` tag from `.woodpecker/02-build-push.yml` |
| `REGISTRY_USER`, `REGISTRY_TOKEN` | pull | encrypted env | optional; if both are missing, the script assumes `docker login` was done out-of-band |
| `DEPLOY_CONTAINER_NAME`, `DEPLOY_HOST_PORT`, `DEPLOY_HOST_CONTENT_DIR`, `DEPLOY_HOST_LOGS_DIR` | stop / start | public env | identical for staging and prod by default |
| `ASPNETCORE_ConnectionStrings__AzaionDb`, `__AzaionDbAdmin`, `JwtConfig__Secret` | start | encrypted env | the API checks these at startup and fails fast |
| `ASPNETCORE_ResourcesConfig__*`, `JwtConfig__{Issuer,Audience,Lifetime}` | start | public env (defaults from `appsettings.json`) | only override if the env value differs from the appsettings default |
| `SOPS_AGE_KEY_FILE` | `_lib.sh` | host | defaults to `/etc/azaion/age.key` if unset |
| `SMOKE_ADMIN_EMAIL`, `SMOKE_ADMIN_PASSWORD` | `smoke.sh` | operator shell | dedicated smoke-test admin user; rotate like a regular admin password |

## 4. Script details

### `deploy.sh`

**Usage**:

```bash
ENV=staging ./scripts/deploy.sh <sha-tag>
ENV=production ./scripts/deploy.sh <sha-tag>
ENV=staging ./scripts/deploy.sh --rollback   # uses scripts/.previous_tags.env
./scripts/deploy.sh --help
```

**Flow** (matches `_docs/04_deploy/deployment_procedures.md` §3 / §4):

1. Validate `ENV` and required commands.
2. Load env overlay (public + sops-decrypted).
3. If `--rollback`: read `scripts/.previous_tags.env` → set `SHA_TAG` to `PREVIOUS_SHA_TAG`.
4. `pull-images.sh` (login + pull).
5. `stop-services.sh` (records the SHA of whatever was running; graceful stop with `docker stop -t 40`; remove).
6. `start-services.sh` (`docker run --restart unless-stopped --env-file <materialized> --publish $DEPLOY_HOST_PORT:8080`).
7. `health-check.sh` (poll `/health/ready` with timeout).
8. Print success line with the running revision.

**Failure handling**: any non-zero exit from a sub-script aborts `deploy.sh`, because `set -euo pipefail` stops the orchestrator at the first failing command. The failure does not touch the SHA recorded in `.previous_tags.env`, so `--rollback` after a failed deploy targets the version that was running before the failed attempt.

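The flow and its abort-on-first-failure behaviour can be sketched as a single function. This is a hypothetical outline, not the shipped `deploy.sh`: the function name `run_deploy` is invented, validation and the env overlay (steps 1 and 2) are omitted, and passing the tag to the sub-scripts through exported `SHA_TAG` and `REGISTRY_TAG` is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical orchestration sketch; not the shipped scripts/deploy.sh.

run_deploy() {
  local scripts_dir="$1" target="$2" step

  if [ "$target" = "--rollback" ]; then
    # Step 3: read the tag that stop-services.sh recorded earlier.
    if [ ! -f "$scripts_dir/.previous_tags.env" ]; then
      echo "no rollback target recorded" >&2
      return 1
    fi
    . "$scripts_dir/.previous_tags.env"
    target="$PREVIOUS_SHA_TAG"
  fi

  # Assumption: sub-scripts read the tag from the environment.
  export SHA_TAG="$target" REGISTRY_TAG="$target"

  # Steps 4-7: run each sub-script in order and stop at the first failure.
  for step in pull-images stop-services start-services health-check; do
    if ! bash "$scripts_dir/$step.sh"; then
      echo "deploy aborted: $step.sh failed" >&2
      return 1
    fi
  done

  # Step 8: success line.
  echo "deployed $SHA_TAG"
}
```

The explicit `if !` check stands in for `set -euo pipefail` so that the sketch can live inside a function; in a top-level script the shell option gives the same stop-at-first-failure effect.
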
### `pull-images.sh`

- Runs `docker login` only when both `REGISTRY_USER` and `REGISTRY_TOKEN` are set; otherwise warns and continues (assumes pre-authentication).
- Runs `docker pull $REGISTRY_HOST/$REGISTRY_IMAGE:$REGISTRY_TAG`.
- Logs the resolved `RepoDigests[0]` to give the operator an immutable identifier in the deploy log.

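A minimal sketch of the three behaviours above, assuming a hypothetical function name `pull_image`; the shipped script may structure this differently. The token is piped through `--password-stdin` so it does not appear on the command line.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the pull step; the shipped pull-images.sh may differ.

pull_image() {
  local image="$REGISTRY_HOST/$REGISTRY_IMAGE:$REGISTRY_TAG" digest

  if [ -n "${REGISTRY_USER:-}" ] && [ -n "${REGISTRY_TOKEN:-}" ]; then
    # Feed the token on stdin instead of passing it as an argument.
    printf '%s' "$REGISTRY_TOKEN" |
      docker login "$REGISTRY_HOST" --username "$REGISTRY_USER" --password-stdin ||
      return 1
  else
    echo "WARN: no registry credentials; assuming docker login was done out-of-band" >&2
  fi

  docker pull "$image" || return 1

  # RepoDigests[0] is the content-addressed name of the image that was pulled.
  digest="$(docker image inspect --format '{{index .RepoDigests 0}}' "$image")"
  echo "pulled $image as $digest" >&2
}
```
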
### `stop-services.sh`

- Reads `org.opencontainers.image.revision` from the running container (label set by the Dockerfile).
- Writes `scripts/.previous_tags.env`:
  ```
  PREVIOUS_SHA_TAG=<sha12>-<arch>
  PREVIOUS_REVISION=<full sha>
  RECORDED_AT=<ISO 8601>
  ```
- `docker stop -t 40` then `docker rm -f`.
- If the container does not exist, logs and exits 0 (idempotent: the first deploy on a new host should succeed).

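The record-then-stop sequence could look roughly like this. It is a sketch under assumptions: the function name `stop_and_record` is hypothetical, and deriving `PREVIOUS_SHA_TAG` from the text after the last `:` of the container's image reference is a guess at how the tag is recovered.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the stop step; the shipped stop-services.sh may differ.

stop_and_record() {
  local name="$1" out_file="$2" image revision

  # Idempotent: nothing to stop on a first deploy.
  if ! docker container inspect "$name" >/dev/null 2>&1; then
    echo "container $name not found; nothing to stop" >&2
    return 0
  fi

  # The revision label set at build time, and the image the container runs.
  revision="$(docker inspect --format \
    '{{ index .Config.Labels "org.opencontainers.image.revision" }}' "$name")"
  image="$(docker inspect --format '{{ .Config.Image }}' "$name")"

  {
    echo "PREVIOUS_SHA_TAG=${image##*:}"   # assumption: tag follows the last ':'
    echo "PREVIOUS_REVISION=$revision"
    echo "RECORDED_AT=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  } > "$out_file"

  docker stop -t 40 "$name" >/dev/null   # SIGTERM, then SIGKILL after 40 s
  docker rm -f "$name" >/dev/null
}
```

Writing the file before stopping the container means the rollback target is on disk even if the stop itself fails.
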
### `start-services.sh`

- Materializes a runtime env file by filtering the current shell environment with `grep -E '^(ASPNETCORE_|AZAION_)'`. Registry credentials and deploy-host plumbing variables stay on the host and never enter the container.
- Runs `mkdir -p` for the bind-mounted `Content/` and `logs/` dirs (idempotent).
- Runs `docker run` with `--detach`, `--name`, `--restart unless-stopped`, `--env-file`, `--publish`, and `--volume`.
- Logs the container ID and the running revision.

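The env-file filtering and the `docker run` invocation might be sketched as below. The function names are hypothetical, and the container-side mount paths `/app/Content` and `/app/logs` are assumptions that this document does not confirm.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the start step; the shipped start-services.sh may differ.

# Write only application variables to a private env file and print its path.
materialize_env_file() {
  local env_file
  env_file="$(mktemp)"
  chmod 600 "$env_file"
  # grep exits 1 when nothing matches; tolerate that and leave the file empty.
  env | grep -E '^(ASPNETCORE_|AZAION_)' > "$env_file" || true
  echo "$env_file"
}

start_container() {
  local env_file
  env_file="$(materialize_env_file)"
  mkdir -p "$DEPLOY_HOST_CONTENT_DIR" "$DEPLOY_HOST_LOGS_DIR"
  # /app/Content and /app/logs are assumed container paths.
  docker run --detach \
    --name "$DEPLOY_CONTAINER_NAME" \
    --restart unless-stopped \
    --env-file "$env_file" \
    --publish "$DEPLOY_HOST_PORT:8080" \
    --volume "$DEPLOY_HOST_CONTENT_DIR:/app/Content" \
    --volume "$DEPLOY_HOST_LOGS_DIR:/app/logs" \
    "$REGISTRY_HOST/$REGISTRY_IMAGE:$REGISTRY_TAG"
}
```

Because the filter is an allow-list on the variable prefix, `REGISTRY_TOKEN` and the `DEPLOY_*` variables are excluded without needing to be named. Line-based filtering of `env` output only suits single-line values.
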
### `health-check.sh`

- One-shot check on `/health/live` first (3 s timeout). If this fails, the container is wedged, so the script fails fast.
- Polls `/health/ready` every `HEALTH_INTERVAL` (default 2 s) until 200 or `HEALTH_TIMEOUT` (default 60 s).
- Returns 0 on first 200; non-zero on timeout.

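A sketch of the liveness gate followed by the readiness poll, assuming hypothetical function names `http_status` and `wait_until_ready` and a 3 s per-request timeout for the readiness probe as well (the document only states it for liveness).

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the health gate; the shipped health-check.sh may differ.

# Print the HTTP status of a GET, or 000 if the request itself failed.
http_status() {
  local max_time="$1" url="$2"
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --max-time "$max_time" "$url" || true
}

wait_until_ready() {
  local base_url="$1"
  local interval="${HEALTH_INTERVAL:-2}" timeout="${HEALTH_TIMEOUT:-60}"
  local deadline=$(( SECONDS + timeout ))

  # Liveness is checked once: a wedged process will not recover by waiting.
  if [ "$(http_status 3 "$base_url/health/live")" != "200" ]; then
    echo "liveness check failed" >&2
    return 1
  fi

  # Readiness is polled until the first 200 or until the deadline passes.
  while true; do
    if [ "$(http_status 3 "$base_url/health/ready")" = "200" ]; then
      return 0
    fi
    if [ "$SECONDS" -ge "$deadline" ]; then
      echo "not ready after ${timeout}s" >&2
      return 2
    fi
    sleep "$interval"
  done
}
```

On the deploy host this might be called as `wait_until_ready "http://localhost:$DEPLOY_HOST_PORT"`; the localhost URL is an assumption.
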
### `smoke.sh`

Six checks, each ≤ 10 s, against the public `BASE_URL`:

1. `GET /health/live` (200)
2. `GET /health/ready` (200, best-effort; the public URL may legitimately not expose this)
3. `POST /login`, extracting the JWT
4. `GET /users/current` (Bearer auth)
5. `GET /users`, counting rows
6. `GET /resources/list`, a sanity check that filesystem-backed paths are reachable

Smoke is intentionally lightweight; it does NOT exercise CRUD or detection-class endpoints (those are covered by E2E in CI).

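The pattern behind these checks (status assertion, login, authenticated request) might look like the sketch below. Several details are assumptions not confirmed by this document: the function names, the login body fields `email` and `password`, and the JWT being returned in a `token` field. Only three of the six checks are shown.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the smoke pattern; the shipped smoke.sh may differ.
# Assumed: login body fields `email`/`password`, JWT in a `token` field.

failures=0

# Run one check: compare the HTTP status of a request with the expected one.
check_status() {
  local name="$1" expected="$2" code
  shift 2
  code="$(curl --silent --output /dev/null --write-out '%{http_code}' \
    --max-time 10 "$@" || true)"
  if [ "$code" = "$expected" ]; then
    echo "PASS $name" >&2
  else
    echo "FAIL $name (expected $expected, got $code)" >&2
    failures=$(( failures + 1 ))
  fi
}

# Log in and print the JWT on stdout (empty or "null" if login failed).
smoke_login() {
  jq -n --arg email "$SMOKE_ADMIN_EMAIL" --arg password "$SMOKE_ADMIN_PASSWORD" \
    '{email: $email, password: $password}' |
    curl --silent --fail --max-time 10 \
      --header 'Content-Type: application/json' \
      --data @- "$BASE_URL/login" |
    jq -r '.token'
}

run_smoke() {
  local token
  check_status "live" 200 "$BASE_URL/health/live"
  token="$(smoke_login)"
  if [ -z "$token" ] || [ "$token" = "null" ]; then
    echo "FAIL login (no token)" >&2
    failures=$(( failures + 1 ))
  fi
  check_status "current user" 200 \
    --header "Authorization: Bearer $token" "$BASE_URL/users/current"
  [ "$failures" -eq 0 ]
}
```

Counting failures instead of exiting on the first one lets a single run report every broken check.
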
### `_lib.sh`

Shared library, sourced via `. "$SCRIPT_DIR/_lib.sh"` from every script. It is not executable (`scripts/_lib.sh` has mode 0644). Contains:

- `log_info` / `log_warn` / `log_error` / `die`
- `require_env <var…>` / `require_cmd <cmd…>`
- `load_env_overlay <env>` (the sops + age decryption pipeline)
- `container_exists`, `container_running`, `current_image_revision`

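For orientation, helpers with these names could be as small as the following. The bodies are hypothetical; only the function names come from the list above, and the log prefixes are invented.

```bash
#!/usr/bin/env bash
# Hypothetical bodies for the helpers named above; the shipped _lib.sh may differ.

# Log lines go to stderr so stdout stays free for data.
log_info()  { echo "[INFO]  $*" >&2; }
log_warn()  { echo "[WARN]  $*" >&2; }
log_error() { echo "[ERROR] $*" >&2; }
die()       { log_error "$*"; exit 1; }

# Abort unless every named variable is set and non-empty.
require_env() {
  local var
  for var in "$@"; do
    [ -n "${!var:-}" ] || die "required variable not set: $var"
  done
}

# Abort unless every named command is on PATH.
require_cmd() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || die "required command not found: $cmd"
  done
}

# True if a container with this name exists, running or not.
container_exists() { docker container inspect "$1" >/dev/null 2>&1; }

# True only if the container exists and is currently running.
container_running() {
  [ "$(docker container inspect --format '{{ .State.Running }}' "$1" 2>/dev/null)" = "true" ]
}
```

A script would typically open with something like `require_cmd docker curl` and `require_env ENV REGISTRY_TAG` so that a misconfigured host fails before any container is touched.
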
## 5. Examples

### First-ever staging deploy

```bash
# On the staging host, as deploy operator:
cd /opt/azaion/admin   # or wherever the repo is checked out
ENV=staging ./scripts/deploy.sh a1b2c3d4e5f6-arm
```

### Rolling back production after a bad deploy

```bash
# Same host, immediately after the failed deploy:
ENV=production ./scripts/deploy.sh --rollback
```

### Running smoke from the operator workstation

```bash
export BASE_URL=https://stage.admin.azaion.com
export SMOKE_ADMIN_EMAIL=ops-smoke@azaion.com
export SMOKE_ADMIN_PASSWORD=...   # from the operator's password manager
./scripts/smoke.sh
```

### Local development against the dockerized stack

The dev-time compose was deferred (Drift K-adjacent). Until it lands, run the API directly:

```bash
# Postgres on host port 4312 (per env/db/00_install.sh)
dotnet run --project Azaion.AdminApi
```

## 6. Common script properties

All scripts:

- Use `#!/usr/bin/env bash` with `set -euo pipefail`.
- Support `--help` / `-h` for usage.
- Source `_lib.sh` for logging and env-overlay helpers.
- Are idempotent where possible (running `deploy.sh` twice with the same SHA tag is a no-op for `pull-images.sh`, recreates the container in `stop`/`start`, and re-checks health).
- Write log lines to stderr (so stdout from a sub-process can still be piped).

## 7. What is NOT shipped in cycle 1

- Remote SSH wrapper. The deploy procedure assumes the operator runs the script on the target host. A `--remote $DEPLOY_HOST` mode is recorded as **Drift O** (carried forward).
- Slack notifications from inside the scripts. Notifications happen out-of-band per `_docs/04_deploy/observability.md` §5.
- Database migration step. Migrations are applied manually with `psql` per `_docs/04_deploy/environment_strategy.md` §4 (Drift J).

## 8. Related artifacts

- Postmortem template: `_docs/06_metrics/postmortem_template.md`
- Procedures: `_docs/04_deploy/deployment_procedures.md`
- Environment strategy: `_docs/04_deploy/environment_strategy.md`
- `secrets/` folder onboarding: `secrets/README.md`