# Azaion Admin API — CI/CD Pipeline

**Date**: 2026-05-13 · **Cycle**: 1 · **Status**: planning artifact (current Woodpecker files audited; proposed changes land as concrete YAML in Step 7).

## 1. Platform & Constraints

| Constraint | Value | Source |
|------------|-------|--------|
| CI platform | **Woodpecker CI** | restrictions.md §Operational |
| Default agent label | `arm64` | `.woodpecker/01-test.yml`, `.woodpecker/02-build-push.yml` |
| Future agent label | `amd64` (matrix entry, currently commented out) | `.woodpecker/02-build-push.yml` |
| Two-workflow contract | `01-test.yml` → tests; `02-build-push.yml` (`depends_on: 01-test`) → image | Already in repo |
| Registry | `$REGISTRY_HOST/azaion/admin` | Woodpecker secret `registry_host` |
| Branches with full pipeline | `dev`, `stage`, `main` | both files' `when.branch` |

The reference contract from `.cursor/skills/deploy/templates/ci_cd_pipeline.md` is already partially adopted. This step closes the remaining gaps.

## 2. Current Pipeline (audited)

### `.woodpecker/01-test.yml` — what it does today

| Step | Image | Action | Quality gate |
|------|-------|--------|--------------|
| `unit-tests` | `mcr.microsoft.com/dotnet/sdk:10.0` | `dotnet restore` + `dotnet test Azaion.AdminApi.sln` (release, TRX logger) | All unit tests pass |
| `e2e-tests` | `mcr.microsoft.com/dotnet/sdk:10.0` | `dotnet restore` + `dotnet test e2e/Azaion.E2E/Azaion.E2E.csproj` | All E2E tests pass |

**Audit findings**:

1. ✅ Tests are gated before build (matches contract).
2. ❌ E2E test step runs `dotnet test` directly — but the project uses **Docker-orchestrated black-box tests** via `docker-compose.test.yml`. The pure `dotnet test` invocation cannot start `system-under-test` + `test-db` containers, so `e2e-tests` as written either skips integration scenarios or relies on undocumented agent state. The reference contract uses `docker compose … --abort-on-container-exit --exit-code-from e2e-runner` instead.
3. ❌ No coverage report.
4. ❌ No SAST / dependency scan / image scan stage. Security audit recommendation 13 explicitly asked for `dotnet list package --vulnerable` in CI (Drift F).
5. ❌ No artifact upload of TRX results — failures are visible only in console logs.

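Finding 4 asks for `dotnet list package --vulnerable` in CI. The CLI has historically exited 0 even when it lists vulnerable packages (worth confirming on the `sdk:10.0` image), so a gate probably has to inspect the report text. The following is a minimal sketch of such a gate, not the final Step 7 step. The function name `audit_gate` and the file name `deps-audit.txt` are hypothetical, and the sketch assumes the severity shows up as a whitespace-delimited `High` or `Critical` word in the human-readable report.

```shell
# Sketch of a deps-audit gate (hypothetical helper, not the Step 7 YAML).
# Assumption: the severity column of the report contains the word
# "High" or "Critical", surrounded by whitespace or at end of line.

# Reads an audit report on stdin. Returns 0 when no High/Critical
# severity is listed, 1 otherwise.
audit_gate() {
  if grep -Eq '[[:space:]](High|Critical)([[:space:]]|$)'; then
    echo "deps-audit: High/Critical vulnerabilities found" >&2
    return 1
  fi
  echo "deps-audit: no High/Critical vulnerabilities"
}

# Intended CI usage (not executed here):
#   dotnet restore Azaion.AdminApi.sln
#   dotnet list Azaion.AdminApi.sln package --vulnerable --include-transitive \
#     > deps-audit.txt
#   audit_gate < deps-audit.txt
```

Writing the report to a file first keeps it available for upload as an artifact whether the gate passes or fails.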
### `.woodpecker/02-build-push.yml` — what it does today

| Step | Image | Action | Quality gate |
|------|-------|--------|--------------|
| `build-push` | `docker` | `docker login` → `docker build` (with three OCI labels + `CI_COMMIT_SHA` build-arg) → `docker push $REGISTRY_HOST/azaion/admin:${CI_COMMIT_BRANCH}-${TAG_SUFFIX}` | Push succeeds |

**Audit findings**:

1. ✅ Multi-arch matrix scaffolding present (`PLATFORM` / `TAG_SUFFIX`) with amd64 commented for future use.
2. ✅ `depends_on: [01-test]` — gating is correct.
3. ✅ OCI labels (`revision`, `created`, `source`) injected as build-time labels.
4. ❌ Only branch-based mutable tag pushed. No immutable `<sha12>-<arch>` tag → host scripts cannot pin (Drift A).
5. ❌ No image scan (Trivy) before push.
6. ❌ Old documentation referenced `.woodpecker/build-arm.yml` which no longer exists (Drift D — fix in this doc, see §10).

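Finding 5 implies an ordering constraint: the scan has to run on the locally built image and block the push when it fails. A minimal sketch of that ordering follows. The function `scan_then_push` and the `DOCKER` / `TRIVY` override variables are hypothetical (they exist only so the logic can be exercised without either tool installed); the `trivy image` flags are the ones named in the stage map in §3.

```shell
# Sketch: scan first, push only on a clean scan (hypothetical helper).
# DOCKER and TRIVY are overridable so the ordering can be tested with stubs.
DOCKER="${DOCKER:-docker}"
TRIVY="${TRIVY:-trivy}"

scan_then_push() {
  image="$1"
  # --exit-code 1 makes trivy return non-zero when any HIGH or CRITICAL
  # finding is reported for the image.
  if ! $TRIVY image --severity HIGH,CRITICAL --exit-code 1 "$image"; then
    echo "image scan failed; not pushing $image" >&2
    return 1
  fi
  $DOCKER push "$image"
}

# Intended CI usage (not executed here):
#   scan_then_push "$REGISTRY_HOST/azaion/admin:${CI_COMMIT_BRANCH}-${TAG_SUFFIX}"
```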
## 3. Proposed Stage Map (target state for cycle 1)

| Stage | Trigger | Workflow file | Quality gate |
|-------|---------|---------------|--------------|
| Lint / format | every push & PR | `01-test.yml` (new step) | `dotnet format --verify-no-changes` returns 0 |
| Unit tests | every push & PR | `01-test.yml` | All `Azaion.*Tests` pass; TRX uploaded |
| Black-box E2E (Docker compose) | every push & PR | `01-test.yml` | `docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-consumer` returns 0; results uploaded |
| Security: dependency audit | every push & PR | `01-test.yml` (new step) | `dotnet list package --vulnerable --include-transitive` reports zero High/Critical CVEs |
| Security: image scan | post-build, pre-push | `02-build-push.yml` (new step) | `trivy image --severity HIGH,CRITICAL --exit-code 1` returns 0 |
| Build | push to `dev` / `stage` / `main` | `02-build-push.yml` | `docker build` succeeds |
| Push (branch tag + SHA tag) | push to `dev` / `stage` / `main` | `02-build-push.yml` | both `docker push` calls succeed |
| Performance smoke (optional) | manual on `stage` / `main` | `03-perf.yml` (new) | k6 thresholds in `scripts/perf-scenarios.js` all `ok: true` |
| Deploy staging | tag push or `stage` branch | `04-deploy.yml` (new) | health check returns 200 within timeout |
| Deploy production | manual approval | `04-deploy.yml` (new) | health check returns 200 within timeout |

> Note on coverage: the test infrastructure (cycle 1) does not yet collect or report coverage. The skill's 75% gate cannot be enforced this cycle. Recorded as **Drift I** (carried forward to a future cycle); does NOT block this deploy.

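The black-box E2E row combines two requirements: the step's exit status must be the test runner's, and the stack must be torn down with its volumes even when tests fail (see the `down -v` note in §4). A minimal sketch of a wrapper that does both is below. The function `run_e2e` and the `COMPOSE` / `COMPOSE_FILE_E2E` variables are hypothetical; the service name `e2e-consumer` is taken from the stage map above.

```shell
# Sketch of an E2E wrapper (hypothetical helper).
# COMPOSE is overridable so the control flow can be tested without Docker.
COMPOSE="${COMPOSE:-docker compose}"
COMPOSE_FILE_E2E="${COMPOSE_FILE_E2E:-docker-compose.test.yml}"

run_e2e() {
  status=0
  # --exit-code-from propagates the e2e-consumer container's exit status
  # as the status of "compose up".
  $COMPOSE -f "$COMPOSE_FILE_E2E" up \
    --abort-on-container-exit --exit-code-from e2e-consumer || status=$?
  # Always tear down, volumes included, so the next run re-initialises the DB.
  $COMPOSE -f "$COMPOSE_FILE_E2E" down -v
  return "$status"
}
```

Capturing the status before teardown matters: if `down -v` ran last without it, a successful teardown would hide a failed test run.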
## 4. Caching Strategy

| Cache | Key | Notes |
|-------|-----|-------|
| `nuget` packages | hash of `**/*.csproj` | Mounted on `/root/.nuget/packages`; restored before `dotnet restore`. Cache invalidates on any csproj change. |
| Docker layer cache | hash of `Dockerfile` + `**/*.csproj` | Pass `--cache-from` to `docker build`, pointing at the previous push of the same branch (e.g. `--cache-from $REGISTRY_HOST/azaion/admin:dev-arm`). Cheapest cache available without buildx. |
| E2E DB init scripts | none — re-init each run | A stale schema left over from a previous run could mask test failures. `down -v` between runs is intentional (mirrors `scripts/run-tests.sh`). |

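The NuGet key is described as a hash of `**/*.csproj`. One way to compute such a key is sketched below, assuming GNU `sha256sum` is available on the agent. The function name `nuget_cache_key` and the 16-character truncation are illustrative choices, and how the key is consumed depends on the cache mechanism selected in Step 7.

```shell
# Sketch: derive a cache key from every *.csproj under a directory.
# Hypothetical helper; the 16-character length is an arbitrary choice.
nuget_cache_key() {
  root="${1:-.}"
  # Sort the file list so the key does not depend on directory order,
  # hash each file, then hash the combined list of per-file hashes.
  find "$root" -name '*.csproj' -type f | LC_ALL=C sort \
    | while IFS= read -r f; do sha256sum "$f"; done \
    | sha256sum | cut -c1-16
}

# Intended CI usage (not executed here):
#   key="nuget-$(nuget_cache_key .)"
```

Because each per-file line includes the path as well as the digest, renaming or moving a project file also changes the key.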
## 5. Parallelization

```
01-test.yml (matrix: arm64 [+ amd64 future])
├── lint-format ─┐
├── unit-tests  ─┼── all run in parallel on the same agent;
├── e2e-tests   ─┤   the slowest (e2e) gates the workflow
└── deps-audit  ─┘

02-build-push.yml (matrix: arm64 [+ amd64 future])
├── build ─→ image-scan ─→ push (branch tag) ─→ push (sha tag)
│
└─→ artifact: image digest stored as Woodpecker artifact

03-perf.yml (manual; arm64 only)
└── k6-perf (uses the docker-compose.test.yml SUT)

04-deploy.yml (manual; per-environment)
└── pull → stop → start → health-check → smoke
```

Cross-workflow gates: `02 depends_on 01`; `04 depends_on 02` for the same SHA.

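The `health-check` step in `04-deploy.yml` has to retry until the service answers 200 or a timeout expires. A minimal sketch of that loop is below. The names `http_status` and `wait_healthy`, the default of 30 attempts and the 2-second delay are all hypothetical, and the health URL is supplied by the caller because this document does not name the endpoint.

```shell
# Sketch of a deploy health check (hypothetical helpers and defaults).

# Prints the HTTP status code for a URL; curl prints 000 when the
# connection fails.
http_status() {
  curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "$1"
}

# wait_healthy URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as the URL answers 200, 1 after ATTEMPTS failures.
wait_healthy() {
  url="$1"; attempts="${2:-30}"; delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if [ "$(http_status "$url")" = "200" ]; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}
```

Comparing against the literal status code, instead of relying on curl's exit status alone, keeps the check aligned with the "returns 200" gate in §3.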
## 6. Quality Gates (summary)

| Gate | Threshold | Action on breach |
|------|-----------|------------------|
| Lint | 0 violations | fail workflow |
| Unit tests | 100% pass | fail workflow |
| E2E tests | 100% pass | fail workflow |
| Dependency audit (High / Critical) | 0 CVEs | fail workflow (Drift F) |
| Image scan (High / Critical) | 0 CVEs | fail workflow |
| Coverage | not enforced this cycle (Drift I) | inform-only |
| Performance (k6) | thresholds in `perf-scenarios.js` | fail workflow when run |

## 7. Notifications

| Event | Channel | Recipients |
|-------|---------|------------|
| `01-test` failure | Woodpecker UI + Slack `#azaion-ci` | Backend team |
| `02-build-push` failure | Woodpecker UI + Slack `#azaion-ci` | Backend team |
| Image-scan High/Critical finding | Slack `#azaion-security` | Security + on-call |
| `04-deploy` failure | Slack `#azaion-ops` + email on-call | Ops on-call |
| Manual production deploy approval requested | Slack `#azaion-ops` | Approvers |

> Slack channel names are placeholders — swap to actual channel IDs in Step 7 when wiring `from_secret: slack_webhook_*`. Email/Pager wiring is deferred until those secrets exist.

## 8. Image Tags

Resolves Drift A:

| Push order | Tag | Stability | Used by |
|-----------|-----|-----------|---------|
| 1 | `${CI_COMMIT_BRANCH}-${TAG_SUFFIX}` | mutable (overwritten each push to the branch) | quick dev pulls (`docker pull …:dev-arm`) |
| 2 | `${CI_COMMIT_SHA:0:12}-${TAG_SUFFIX}` | immutable | host deploy scripts; rollback target |

Production deploys MUST reference the SHA tag, never the branch tag (Step 6 procedures will enforce this).

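The two tags can be derived from the same three inputs. The sketch below shows the derivation in portable shell, using `cut` for the 12-character prefix so it does not depend on the `${VAR:0:12}` substring form. The function name `image_tags` and the local tag `build` in the usage comment are hypothetical.

```shell
# Sketch: derive both tags for one build (hypothetical helper).
# image_tags BRANCH FULL_SHA SUFFIX
# Prints the mutable branch tag, then the immutable SHA tag, one per line.
image_tags() {
  branch="$1"; sha="$2"; suffix="$3"
  sha12="$(printf '%s' "$sha" | cut -c1-12)"
  printf '%s-%s\n' "$branch" "$suffix"
  printf '%s-%s\n' "$sha12" "$suffix"
}

# Intended CI usage (not executed here; "build" is a placeholder local tag):
#   repo="$REGISTRY_HOST/azaion/admin"
#   for tag in $(image_tags "$CI_COMMIT_BRANCH" "$CI_COMMIT_SHA" "$TAG_SUFFIX"); do
#     docker tag "$repo:build" "$repo:$tag" && docker push "$repo:$tag"
#   done
```

Emitting the branch tag first preserves the push order from the table above.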
## 9. Reproducibility & Audit

- Every pushed image carries `org.opencontainers.image.revision` = full `CI_COMMIT_SHA`. The 12-char prefix in the tag is for human reading; the label is the source of truth.
- `org.opencontainers.image.created` = ISO-8601 build start time (UTC).
- `org.opencontainers.image.source` = `$CI_REPO_URL`.
- Both image scan and dependency audit reports are uploaded as Woodpecker artifacts on every run (success and failure).

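Since the label is the source of truth, a host script can cross-check a SHA tag against the `revision` label before deploying. A minimal sketch of that comparison follows. The function name `tag_matches_revision` is hypothetical, and the sketch assumes tags follow the `<sha12>-<arch>` shape from §8.

```shell
# Sketch: verify that a SHA tag agrees with the image's revision label
# (hypothetical helper).
# tag_matches_revision TAG FULL_REVISION
# Succeeds when the text before the first '-' in TAG is a non-empty
# prefix of FULL_REVISION.
tag_matches_revision() {
  tag="$1"; revision="$2"
  prefix="${tag%%-*}"
  case "$revision" in
    "$prefix"*) [ -n "$prefix" ] ;;
    *) return 1 ;;
  esac
}

# Intended host usage (not executed here):
#   image="$REGISTRY_HOST/azaion/admin:$tag"
#   revision="$(docker inspect --format \
#     '{{ index .Config.Labels "org.opencontainers.image.revision" }}' "$image")"
#   tag_matches_revision "$tag" "$revision" || echo "tag/label mismatch" >&2
```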
## 10. Drifts Resolved Here / Carried Forward

| ID | Severity | Description | Status |
|----|----------|-------------|--------|
| A | Medium | Branch-tag-only push; host pulls `:latest` that CI never produces | **Resolved in spec** — add SHA-tag push (§8); script change in Step 7 |
| D | Low | Old docs referenced `.woodpecker/build-arm.yml` | **Resolved here** — corrected to `01-test.yml` + `02-build-push.yml` everywhere |
| E | Low | `scripts/run-performance-tests.sh` is run-on-demand only | **Spec** — `03-perf.yml` planned; manual trigger in cycle 1, automatic gate in a future cycle when threshold fluctuation is understood |
| F | Low | No vulnerable-dep gate in CI | **Resolved in spec** — `deps-audit` step in `01-test.yml`; concrete YAML in Step 7 |
| I | Low (NEW) | No coverage threshold enforced (no coverage collection wired) | **Carried forward** to a future cycle; recorded in the deploy plan, not blocking |

## 11. Self-verification

- [x] All pipeline stages defined with triggers and gates.
- [ ] Coverage threshold enforced — **deferred (Drift I)** with explicit justification.
- [x] Security scanning included (deps + image; SAST deferred to a future cycle when a SAST tool is selected).
- [x] Caching configured (NuGet + Docker layer).
- [x] Multi-environment deployment scaffold (staging → production manual).
- [x] Rollback referenced (SHA-tagged images make `docker run …:<previous-sha>-arm` a one-line rollback; details in Step 6).
- [x] Notification matrix defined.