mirror of
https://github.com/azaion/admin.git
synced 2026-06-21 17:41:09 +00:00
c7b297de83
- Deleted the deploy.cmd script as it was no longer needed. - Updated Dockerfile to include curl for health checks and added a non-root user for improved security. - Modified health check command to use curl for better reliability. - Adjusted docker-compose.test.yml to reflect changes in health check configuration. - Cleaned up appsettings.json and removed unused configuration properties. - Removed Resource entity and related requests from the codebase as part of the architectural shift. - Updated documentation to reflect the removal of hardware binding and related endpoints. Co-authored-by: Cursor <cursoragent@cursor.com>
128 lines
11 KiB
Markdown
128 lines
11 KiB
Markdown
# Azaion Admin API — Environment Strategy
|
|
|
|
**Date**: 2026-05-13 · **Cycle**: 1 · **Status**: planning artifact (no scripts; concrete wiring lands in Step 7).
|
|
|
|
## 1. Environments
|
|
|
|
| Environment | Purpose | Infrastructure | Data Source |
|
|
|-------------|---------|----------------|-------------|
|
|
| **Development** | Local developer workflow on macOS / Linux. | Either bare `dotnet run` against host Postgres (port 4312) **or** the new `docker-compose.yml` planned in Step 2 (API + Postgres on a private Docker network). | Empty database; SQL files under `env/db/` create roles + schema; no fixtures. |
|
|
| **Test (CI)** | Black-box tests in CI and locally via `scripts/run-tests.sh`. | `docker-compose.test.yml` — API + Postgres + e2e-runner on a Docker network. | Functional fixtures from `e2e/db-init/00_run_all.sh` + `99_test_seed.sql`. |
|
|
| **Staging** | Pre-production validation. | Self-hosted Linux server, single Docker host, behind Nginx reverse proxy on `stage.admin.azaion.com`. Mirrors prod topology and Postgres major version. | Anonymized snapshot of production (PII scrubbed by an offline script before import). |
|
|
| **Production** | Live system. | Self-hosted Linux server, single Docker host, behind Nginx reverse proxy on `admin.azaion.com`. | Live data; daily off-host backups. |
|
|
|
|
> Test is added as a first-class environment because cycle 1 already exercises it (`docker-compose.test.yml`). The deploy template lists three; we list four to match reality.
|
|
|
|
## 2. Environment Variables
|
|
|
|
### Source of Truth
|
|
|
|
The complete variable inventory lives in `.env.example` at the repo root (Step 1, 24 entries). This document does NOT duplicate that table — it only specifies, per environment, **where each variable is sourced**.
|
|
|
|
### Per-environment sourcing
|
|
|
|
| Variable group | Development | Test (CI) | Staging | Production |
|
|
|----------------|-------------|-----------|---------|------------|
|
|
| `ASPNETCORE_ENVIRONMENT` | `.env` (`Development`) | docker-compose `environment:` (`Development`) | docker-compose / `--env-file` (`Staging`) | docker-compose / `--env-file` (`Production`) |
|
|
| `ASPNETCORE_URLS` | `.env` | compose | host `.env` (rendered from sops) | host `.env` (rendered from sops) |
|
|
| `ConnectionStrings__*` | `.env` (real local creds) | compose (literal — accepted F-10) | **sops-encrypted file in git** → decrypted on host at deploy time | same as staging |
|
|
| `JwtConfig__Secret` | `.env` (dev-only literal) | compose (literal — accepted F-10) | **sops-encrypted** | **sops-encrypted** |
|
|
| `JwtConfig__{Issuer,Audience,Lifetime}` | appsettings defaults | appsettings defaults | host `.env` if non-default | host `.env` if non-default |
|
|
| `ResourcesConfig__*` | appsettings defaults | compose | host `.env` if non-default | host `.env` if non-default |
|
|
| `DEPLOY_*`, `REGISTRY_TAG` | `.env` (developer machine) | n/a | passed to `scripts/deploy.sh` from operator's shell or CI manual trigger | same |
|
|
| `REGISTRY_USER`, `REGISTRY_TOKEN` | empty in dev `.env` | Woodpecker secrets `registry_user` / `registry_token` | Woodpecker secrets (CI deploy) or operator's shell (manual deploy) | same |
|
|
| `CI_COMMIT_SHA` | unset → image label `unknown` | Woodpecker built-in | Woodpecker built-in | Woodpecker built-in |
|
|
|
|
### Variable Validation (fail-fast)
|
|
|
|
The Admin API already does this for the most security-critical variable:
|
|
|
|
```csharp
|
|
var jwtConfig = builder.Configuration.GetSection(nameof(JwtConfig)).Get<JwtConfig>();
|
|
if (jwtConfig == null || string.IsNullOrEmpty(jwtConfig.Secret))
|
|
throw new Exception("Missing configuration section: JwtConfig");
|
|
```
|
|
|
|
The deploy plan **adds** the same fail-fast check for connection strings during Step 7 wiring (a one-time `_ = configuration.GetConnectionString("AzaionDb") ?? throw …` plus the same for `AzaionDbAdmin`, executed during `WebApplication` build). Without the check, a missing variable currently surfaces only on the first DB call, which is too late.
|
|
|
|
> Static / lookup-style variables (`ResourcesConfig__*`, `JwtConfig__{Issuer,Audience,Lifetime}`) keep their `appsettings.json` defaults in every environment unless an override is required. We do NOT add fail-fast checks for them.
|
|
|
|
## 3. Secrets Management
|
|
|
|
### Decision
|
|
|
|
| Environment | Method | Tool |
|
|
|-------------|--------|------|
|
|
| Development | `.env` file | committed `.env.example` + per-developer `.env` (git-ignored) |
|
|
| Test (CI) | docker-compose `environment:` literals | accepted as test-only (security audit F-10) |
|
|
| Staging | git-tracked encrypted file | **sops + age** |
|
|
| Production | git-tracked encrypted file | **sops + age** |
|
|
|
|
### Why sops + age (not Vault, not Woodpecker secrets, not hand-edited `.env`)
|
|
|
|
Constraints: self-hosted, no cloud account, single ops engineer, currently hand-editing `.env` on the host.
|
|
|
|
| Option | Pros | Cons | Verdict |
|
|
|--------|------|------|---------|
|
|
| sops + age (chosen) | Secrets versioned in git, encrypted at rest, decrypted on the host with a single age key. No new infra. Works offline. | Requires per-environment age keypair stored on the host outside git. Manual key rotation. | ✅ pragmatic for this team size and topology |
|
|
| HashiCorp Vault (self-hosted) | Dynamic DB creds, audit log, fine-grained ACL, KV v2. | Adds a service to operate, monitor, back up. Single-engineer ops budget cannot absorb it now. | ⏳ revisit in a future cycle when ops capacity grows |
|
|
| Woodpecker secrets exported into runtime container | Reuses existing secret store. | Couples runtime config to CI; secrets are not visible/auditable outside Woodpecker UI; cannot run the container outside CI without manually exporting them. | ❌ leaks the CI/runtime boundary |
|
|
| Hand-edited host `.env` (status quo) | Zero new tooling. | No version history, no encryption, no review trail. Single point of failure if the file is lost; security audit can't track changes. | ❌ status quo we are leaving behind (Drift B) |
|
|
|
|
### sops + age conventions for this repo
|
|
|
|
```
|
|
secrets/
|
|
├── .sops.yaml # routes secrets/staging.env / production.env to the right age recipients
|
|
├── staging.env # SOPS-encrypted; safe to commit
|
|
└── production.env # SOPS-encrypted; safe to commit
|
|
```
|
|
|
|
- `.sops.yaml` declares two age recipients: `recipient_staging` and `recipient_production` (public keys).
|
|
- The matching age **private** keys live on each host at `/etc/azaion/age.key`, mode `0400`, owned by root. They are NEVER committed.
|
|
- `scripts/deploy.sh` (Step 7) runs `SOPS_AGE_KEY_FILE=/etc/azaion/age.key sops -d secrets/${env}.env > /tmp/azaion.env` and feeds it to `docker run --env-file`.
|
|
- All staging/production env values that are NOT secret (e.g. `DEPLOY_HOST_PORT`, `REGISTRY_TAG`) live in plain-text `secrets/staging.public.env` / `secrets/production.public.env` next to the encrypted file, also git-tracked. Loaded before the decrypted overlay.
|
|
|
|
### Rotation policy
|
|
|
|
| Secret | Rotation cadence | Procedure |
|
|
|--------|------------------|-----------|
|
|
| Postgres `azaion_admin` / `azaion_reader` passwords | every 90 days, on operator schedule | `ALTER ROLE … WITH PASSWORD …` → re-encrypt `production.env` → `scripts/deploy.sh` |
|
|
| JWT `JwtConfig__Secret` | every 180 days, AND on any suspected leak | re-encrypt → deploy. **All issued tokens become invalid** — communicate maintenance window. |
|
|
| `azaion_superadmin` password | every 365 days, AND on owner change | manual; not used by the running app, only by DB migrations |
|
|
| Registry `REGISTRY_TOKEN` | every 90 days OR on CI compromise | rotate registry credential → update Woodpecker secret `registry_token` → re-encrypt `production.env` if also referenced there |
|
|
| age private key (`/etc/azaion/age.key`) | every 365 days OR on host compromise | generate new key → add public recipient to `.sops.yaml` → `sops updatekeys secrets/*.env` → distribute new private key out-of-band → remove old recipient |
|
|
|
|
## 4. Database Management
|
|
|
|
| Environment | Type | Migrations | Data | Backup |
|
|
|-------------|------|------------|------|--------|
|
|
| Development | Local Postgres on host (port 4312) **or** dockerized Postgres from `docker-compose.yml` | `env/db/*.sql` applied manually by developer the first time, then `*_users_email_unique.sql`-style additive scripts run with `psql` on demand | empty | none |
|
|
| Test (CI) | Postgres 16-alpine from `docker-compose.test.yml` | `env/db/*.sql` mounted into `/docker-entrypoint-initdb.d/sql/`, ordered by `00_run_all.sh` | `99_test_seed.sql` (functional) + 500 perf users injected by `scripts/run-performance-tests.sh` when needed | none — `down -v` between runs |
|
|
| Staging | Same Postgres major (16) on the staging server, port 4312, `azaion` database | `env/db/*.sql` applied **manually under change control** via `psql -U azaion_superadmin`. New migrations land in the same numeric-prefix sequence (`07_*.sql`, `08_*.sql`, …) | anonymized prod snapshot, refreshed on demand | nightly `pg_dump` snapshot retained 14 days |
|
|
| Production | Same Postgres 16 on prod server | Same as staging; **migration must be applied to staging first**, observed for ≥ 24 h, then promoted to prod with operator approval | live | nightly `pg_dump` retained 30 days; weekly snapshot retained 12 weeks; off-host copy via `rsync` |
|
|
|
|
### Migration rules (cycle 1)
|
|
|
|
The project does NOT use an ORM migration framework (linq2db; restrictions.md). The conventions below replace it:
|
|
|
|
1. **Numeric-prefix ordering** — every new migration is added as `env/db/NN_<description>.sql` where `NN` continues the existing sequence. The current sequence is `01..06`; the next is `07_*.sql`.
|
|
2. **Forward-only by default**. Reversibility is provided by the off-host backup, NOT by hand-written DOWN scripts. The existing files (`02_structure.sql`, `03_add_timestamp_columns.sql`, `04_detection_classes.sql`, `06_users_email_unique.sql`) follow this pattern; we keep it.
|
|
3. **Backward-compatible deploys** — every schema change must be safe to apply BEFORE the matching code is deployed (additive change → deploy code → cleanup change in a later release). The cycle 1 example: `06_users_email_unique.sql` was applied first; the `RegisterUser` change to translate `23505` came after. AZ-197's `User.Hardware` column was kept as a tombstone instead of dropped, for the same reason.
|
|
4. **Production migrations need approval** — operator manually runs the SQL on prod after staging soak. No automatic CI execution against prod in cycle 1 (Drift J — automation is a future cycle's work).
|
|
|
|
### Drifts logged here
|
|
|
|
| ID | Severity | Description | Resolved In |
|
|
|----|----------|-------------|-------------|
|
|
| B | Medium | No secret manager (status quo: hand-edited host `.env`) | **Resolved in spec** — sops + age (§3); concrete files + script in Step 7 |
|
|
| J | Low (NEW) | DB migrations applied manually on staging/prod; no automation | **Carried forward** to a future cycle |
|
|
|
|
## 5. Self-verification
|
|
|
|
- [x] Four environments (Dev, Test/CI, Staging, Production) defined with purpose, infrastructure, and data source.
|
|
- [x] Environment variable sourcing matrix references `.env.example` (Step 1) without duplicating it.
|
|
- [x] No literal secrets in this document — only variable names and tool names.
|
|
- [x] Secret manager chosen for staging/production (sops + age) with rotation policy.
|
|
- [x] Database strategy per environment, including the explicit no-ORM-migrations convention.
|