
Azaion Admin API — CI/CD Pipeline

Date: 2026-05-13 · Cycle: 1 · Status: planning artifact (current Woodpecker files audited; proposed changes land as concrete YAML in Step 7).

1. Platform & Constraints

| Constraint | Value | Source |
|---|---|---|
| CI platform | Woodpecker CI | `restrictions.md` §Operational |
| Default agent label | `arm64` | `.woodpecker/01-test.yml`, `.woodpecker/02-build-push.yml` |
| Future agent label | `amd64` (matrix entry, currently commented out) | `.woodpecker/02-build-push.yml` |
| Two-workflow contract | `01-test.yml` → tests; `02-build-push.yml` (`depends_on: 01-test`) → image | Already in repo |
| Registry | `$REGISTRY_HOST/azaion/admin` | Woodpecker secret `registry_host` |
| Branches with full pipeline | `dev`, `stage`, `main` | both files' `when.branch` |

The reference contract from .cursor/skills/deploy/templates/ci_cd_pipeline.md is already partially adopted. This step closes the remaining gaps.

2. Current Pipeline (audited)

.woodpecker/01-test.yml — what it does today

| Step | Image | Action | Quality gate |
|---|---|---|---|
| unit-tests | `mcr.microsoft.com/dotnet/sdk:10.0` | `dotnet restore` + `dotnet test Azaion.AdminApi.sln` (release, TRX logger) | All unit tests pass |
| e2e-tests | `mcr.microsoft.com/dotnet/sdk:10.0` | `dotnet restore` + `dotnet test e2e/Azaion.E2E/Azaion.E2E.csproj` | All E2E tests pass |

Audit findings:

  1. Tests are gated before build (matches contract).
  2. The E2E step runs dotnet test directly, but the project's black-box tests are orchestrated by Docker through docker-compose.test.yml. A bare dotnet test invocation cannot start the system-under-test and test-db containers, so e2e-tests as written either skips the integration scenarios or relies on undocumented agent state. The reference contract uses docker compose … --abort-on-container-exit --exit-code-from e2e-runner instead.
  3. No coverage report.
  4. No SAST / dependency scan / image scan stage. Security audit recommendation 13 explicitly asked for dotnet list package --vulnerable in CI (Drift F).
  5. No artifact upload of TRX results — failures are visible only in console logs.
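
Finding 2 could be addressed with a compose-driven run along the following lines. This is a minimal sketch, not the final Step 7 YAML: it assumes the runner service is named e2e-consumer (as in the stage map in §3), that the agent has the Docker CLI with the compose plugin, and that the function name run_e2e is purely illustrative.

```shell
# Hypothetical defaults; override via environment if the compose file or service differs.
COMPOSE_FILE="${COMPOSE_FILE:-docker-compose.test.yml}"
E2E_SERVICE="${E2E_SERVICE:-e2e-consumer}"

# Start the stack, stop everything when any container exits, and return
# the exit code of the E2E runner service. Volumes are always removed
# afterwards so the next run starts from a freshly initialized database.
run_e2e() {
  local rc=0
  docker compose -f "$COMPOSE_FILE" up --build \
    --abort-on-container-exit --exit-code-from "$E2E_SERVICE" || rc=$?
  docker compose -f "$COMPOSE_FILE" down -v || true
  return "$rc"
}
```

A CI step would call run_e2e and let its exit code decide the step result; the teardown runs whether the tests pass or fail.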

.woodpecker/02-build-push.yml — what it does today

| Step | Image | Action | Quality gate |
|---|---|---|---|
| build-push | `docker` | `docker login` → `docker build` (with three OCI labels + `CI_COMMIT_SHA` build-arg) → `docker push $REGISTRY_HOST/azaion/admin:${CI_COMMIT_BRANCH}-${TAG_SUFFIX}` | Push succeeds |

Audit findings:

  1. Multi-arch matrix scaffolding is present (PLATFORM / TAG_SUFFIX), with the amd64 entry commented out for future use.
  2. depends_on: [01-test] — gating is correct.
  3. OCI labels (revision, created, source) injected as build-time labels.
  4. Only branch-based mutable tag pushed. No immutable <sha12>-<arch> tag → host scripts cannot pin (Drift A).
  5. No image scan (Trivy) before push.
  6. Old documentation referenced .woodpecker/build-arm.yml which no longer exists (Drift D — fix in this doc, see §10).
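
Findings 4 and 5 point at a build, scan, push ordering in which nothing reaches the registry unless the scan passes. The sketch below illustrates that ordering only; the function name build_scan_push is hypothetical, the OCI labels and build-arg are omitted for brevity, and it assumes the trivy CLI is available on the agent.

```shell
# Build once with both tags, scan the local image, and push only if the
# scan reports no HIGH or CRITICAL findings.
# $1 = mutable branch tag, $2 = immutable SHA tag (full image references).
build_scan_push() {
  local branch_tag="$1" sha_tag="$2"
  docker build -t "$branch_tag" -t "$sha_tag" . || return 1
  # --exit-code 1 makes trivy fail the step when matching findings exist.
  trivy image --severity HIGH,CRITICAL --exit-code 1 "$branch_tag" || return 1
  docker push "$branch_tag" || return 1
  docker push "$sha_tag"
}
```

Scanning the local image before either push keeps a vulnerable image from ever becoming pullable by tag.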

3. Proposed Stage Map (target state for cycle 1)

| Stage | Trigger | Workflow file | Quality gate |
|---|---|---|---|
| Lint / format | every push & PR | `01-test.yml` (new step) | `dotnet format --verify-no-changes` returns 0 |
| Unit tests | every push & PR | `01-test.yml` | All `Azaion.*Tests` pass; TRX uploaded |
| Black-box E2E (Docker compose) | every push & PR | `01-test.yml` | `docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-consumer` returns 0; results uploaded |
| Security: dependency audit | every push & PR | `01-test.yml` (new step) | `dotnet list package --vulnerable --include-transitive` reports zero High/Critical CVEs |
| Security: image scan | post-build, pre-push | `02-build-push.yml` (new step) | `trivy image --severity HIGH,CRITICAL --exit-code 1` returns 0 |
| Build | push to dev / stage / main | `02-build-push.yml` | `docker build` succeeds |
| Push (branch tag + SHA tag) | push to dev / stage / main | `02-build-push.yml` | both `docker push` calls succeed |
| Performance smoke (optional) | manual on stage / main | `03-perf.yml` (new) | k6 thresholds in `scripts/perf-scenarios.js` all `ok: true` |
| Deploy staging | tag push or stage branch | `04-deploy.yml` (new) | health check returns 200 within timeout |
| Deploy production | manual approval | `04-deploy.yml` (new) | health check returns 200 within timeout |

Note on coverage: the test infrastructure (cycle 1) does not yet collect or report coverage. The skill's 75% gate cannot be enforced this cycle. Recorded as Drift I (carried forward to a future cycle); does NOT block this deploy.

4. Caching Strategy

| Cache | Key | Notes |
|---|---|---|
| NuGet packages | hash of `**/*.csproj` | Mounted on `/root/.nuget/packages`; restored before `dotnet restore`. Cache invalidates on any csproj change. |
| Docker layer cache | hash of `Dockerfile` + `**/*.csproj` | Pass `--cache-from` to `docker build`, pointing at the previous push of the same branch (e.g. `--cache-from $REGISTRY_HOST/azaion/admin:dev-arm`). Cheapest cache available without buildx. |
| E2E DB init scripts | none (re-initialized each run) | Schema differences would mask test failures. `down -v` between runs is intentional (mirrors `scripts/run-tests.sh`). |
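
The two cache keys above could be derived and used roughly as follows. This is a sketch under assumptions: the function names are hypothetical, GNU find, sort, xargs and sha256sum are assumed on the agent, and the inline-cache build-arg is only needed when the image is built with BuildKit, which is not confirmed for this pipeline.

```shell
# Print a cache key derived from the path and content of every csproj
# under the given directory (default: current directory).
csproj_cache_key() {
  ( cd "${1:-.}" &&
    find . -name '*.csproj' -type f -print0 | sort -z |
      xargs -0 -r sha256sum | sha256sum | cut -d' ' -f1 )
}

# Build an image using its previously pushed version as a layer cache.
# $1 = full image reference of the branch tag.
build_with_cache() {
  local image="$1"
  # A missing previous image only means a cold build, so ignore pull errors.
  docker pull "$image" || true
  # BUILDKIT_INLINE_CACHE=1 embeds cache metadata so later builds can reuse it.
  docker build --cache-from "$image" \
    --build-arg BUILDKIT_INLINE_CACHE=1 -t "$image" .
}
```

Because the key covers file paths as well as content, renaming or moving a project file also invalidates the NuGet cache.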

5. Parallelization

01-test.yml (matrix: arm64 [+ amd64 future])
├── lint-format        ─┐
├── unit-tests         ─┼── all run in parallel on the same agent;
├── e2e-tests          ─┤   the slowest (e2e) gates the workflow
└── deps-audit         ─┘

02-build-push.yml  (matrix: arm64 [+ amd64 future])
├── build              ─→ image-scan ─→ push (branch tag) ─→ push (sha tag)
                                                                │
                                                                └─→ artifact: image digest stored as Woodpecker artifact

03-perf.yml         (manual; arm64 only)
└── k6-perf            (uses the docker-compose.test.yml SUT)

04-deploy.yml       (manual; per-environment)
└── pull → stop → start → health-check → smoke

Cross-workflow gates: 02 depends_on 01; 04 depends_on 02 for the same SHA.
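
The health-check stage of 04-deploy.yml needs a bounded retry loop rather than a single probe. A minimal sketch, assuming curl is available on the host and that the health endpoint URL (not specified here) is passed in by the caller; wait_healthy is a hypothetical helper name.

```shell
# Poll a URL until it returns HTTP 200 or the attempt budget runs out.
# $1 = URL, $2 = max attempts (default 30), $3 = delay in seconds (default 2).
wait_healthy() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=0 code
  while [ "$i" -lt "$attempts" ]; do
    # Connection failures are treated like any other non-200 response.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || code=000
    if [ "$code" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

With the defaults the effective timeout is about a minute plus request time; a deploy step would fail (and trigger the rollback procedure) when the function returns non-zero.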

6. Quality Gates (summary)

| Gate | Threshold | Action on breach |
|---|---|---|
| Lint | 0 violations | fail workflow |
| Unit tests | 100% pass | fail workflow |
| E2E tests | 100% pass | fail workflow |
| Dependency audit (High / Critical) | 0 CVEs | fail workflow (Drift F) |
| Image scan (High / Critical) | 0 CVEs | fail workflow |
| Coverage | not enforced this cycle (Drift I) | inform-only |
| Performance (k6) | thresholds in `perf-scenarios.js` | fail workflow when run |
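
The dependency-audit gate needs a step that turns the audit report into a pass/fail result. The sketch below does this by scanning the report text, on the assumption that the command's exit code alone may not signal findings and that severities appear as standalone words (Low, Moderate, High, Critical) in its output; both are worth confirming against the SDK version in use. Function names are hypothetical.

```shell
# Succeeds when the report on stdin mentions a High or Critical severity.
has_blocking_vulns() {
  grep -Eq '(^|[[:space:]])(High|Critical)([[:space:]]|$)'
}

# Audit a solution or project (after `dotnet restore`), echo the report
# for the CI log, and return 1 on blocking findings, 2 if the tool fails.
deps_audit() {
  local report
  report=$(dotnet list "$1" package --vulnerable --include-transitive) || return 2
  printf '%s\n' "$report"
  if printf '%s\n' "$report" | has_blocking_vulns; then
    return 1
  fi
  return 0
}
```

A step could run deps_audit Azaion.AdminApi.sln and redirect the printed report to a file for artifact upload.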

7. Notifications

| Event | Channel | Recipients |
|---|---|---|
| 01-test failure | Woodpecker UI + Slack `#azaion-ci` | Backend team |
| 02-build-push failure | Woodpecker UI + Slack `#azaion-ci` | Backend team |
| Image-scan High/Critical finding | Slack `#azaion-security` | Security + on-call |
| 04-deploy failure | Slack `#azaion-ops` + email on-call | Ops on-call |
| Manual production deploy approval requested | Slack `#azaion-ops` | Approvers |

Slack channel names are placeholders — swap to actual channel IDs in Step 7 when wiring from_secret: slack_webhook_*. Email/Pager wiring is deferred until those secrets exist.
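
The routing in the matrix can be expressed as a small lookup plus a webhook post. This is an illustrative sketch only: the event identifiers and function names are invented for the example, the channel names are the placeholders from the matrix, and the payload is the plain-text form accepted by Slack incoming webhooks (single-line messages assumed).

```shell
# Map a (hypothetical) event identifier to the placeholder channel name.
channel_for_event() {
  case "$1" in
    test-failure|build-push-failure) echo "#azaion-ci" ;;
    image-scan-finding)              echo "#azaion-security" ;;
    deploy-failure|deploy-approval)  echo "#azaion-ops" ;;
    *) return 1 ;;
  esac
}

# Post a single-line text message to a Slack incoming webhook.
# $1 = webhook URL (from a CI secret), $2 = message text.
notify_slack() {
  local webhook_url="$1" text="$2" escaped
  # Escape backslashes and double quotes so the JSON stays valid.
  escaped=$(printf '%s' "$text" | sed 's/\\/\\\\/g; s/"/\\"/g')
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "{\"text\": \"$escaped\"}" "$webhook_url"
}
```

Since an incoming webhook is normally bound to one channel, the lookup would in practice select which webhook secret to use rather than a channel field in the payload.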

8. Image Tags

Resolves Drift A:

| Push order | Tag | Stability | Used by |
|---|---|---|---|
| 1 | `${CI_COMMIT_BRANCH}-${TAG_SUFFIX}` | mutable (overwritten each push to the branch) | quick dev pulls (`docker pull …:dev-arm`) |
| 2 | `${CI_COMMIT_SHA:0:12}-${TAG_SUFFIX}` | immutable | host deploy scripts; rollback target |

Production deploys MUST reference the SHA tag, never the branch tag (Step 6 procedures will enforce this).
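
The two tags can be derived from the same inputs in one place so the branch tag and SHA tag never drift apart. A small sketch, with image_tags as a hypothetical helper and the repository passed in explicitly:

```shell
# Print the mutable branch tag and the immutable SHA tag, one per line.
# $1 = repository, $2 = branch, $3 = full commit SHA, $4 = arch suffix.
image_tags() {
  local repo="$1" branch="$2" sha="$3" suffix="$4" short
  # First 12 characters of the commit SHA, as in the tag table above.
  short=$(printf '%s' "$sha" | cut -c1-12)
  printf '%s:%s-%s\n' "$repo" "$branch" "$suffix"
  printf '%s:%s-%s\n' "$repo" "$short" "$suffix"
}
```

In a build step this might be called as image_tags "$REGISTRY_HOST/azaion/admin" "$CI_COMMIT_BRANCH" "$CI_COMMIT_SHA" "$TAG_SUFFIX", with the second line of output being the only reference production deploys should use.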

9. Reproducibility & Audit

  • Every pushed image carries org.opencontainers.image.revision = full CI_COMMIT_SHA. The 12-char prefix in the tag is for human reading; the label is the source of truth.
  • org.opencontainers.image.created = ISO-8601 build start time (UTC).
  • org.opencontainers.image.source = $CI_REPO_URL.
  • Both image scan and dependency audit reports are uploaded as Woodpecker artifacts on every run (success and failure).
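
Because the revision label is the source of truth, a deploy or audit script can check it against the expected commit before trusting a tag. A minimal sketch, assuming the image is already present locally and using a hypothetical helper name:

```shell
# Succeeds only when the image's OCI revision label equals the expected SHA.
# $1 = image reference, $2 = expected full commit SHA.
# Returns 2 if the image cannot be inspected, 1 on a mismatch.
verify_revision() {
  local image="$1" expected_sha="$2" actual
  actual=$(docker inspect --format \
    '{{ index .Config.Labels "org.opencontainers.image.revision" }}' \
    "$image") || return 2
  [ "$actual" = "$expected_sha" ]
}
```

Comparing the full SHA from the label avoids relying on the 12-character tag prefix, which exists for human reading only.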

10. Drifts Resolved Here / Carried Forward

| ID | Severity | Description | Status |
|---|---|---|---|
| A | Medium | Branch-tag-only push; host pulls `:latest` that CI never produces | Resolved in spec: add SHA-tag push (§8); script change in Step 7 |
| D | Low | Old docs referenced `.woodpecker/build-arm.yml` | Resolved here: corrected to `01-test.yml` + `02-build-push.yml` everywhere |
| E | Low | `scripts/run-performance-tests.sh` is run-on-demand only | Spec: `03-perf.yml` planned; manual trigger in cycle 1, automatic gate in a future cycle when threshold fluctuation is understood |
| F | Low | No vulnerable-dep gate in CI | Resolved in spec: `deps-audit` step in `01-test.yml`; concrete YAML in Step 7 |
| I | Low | (NEW) No coverage threshold enforced (no coverage collection wired) | Carried forward to a future cycle; recorded in the deploy plan, not blocking |

11. Self-verification

  • All pipeline stages defined with triggers and gates.
  • Coverage threshold enforced — deferred (Drift I) with explicit justification.
  • Security scanning included (deps + image; SAST deferred to a future cycle when a SAST tool is selected).
  • Caching configured (NuGet + Docker layer).
  • Multi-environment deployment scaffold (staging → production manual).
  • Rollback referenced (SHA-tagged images make docker run …:<previous-sha>-arm a one-line rollback; details in Step 6).
  • Notification matrix defined.