# CI/CD Pipeline Template
Save as `_docs/04_deploy/ci_cd_pipeline.md`.
# [System Name] — CI/CD Pipeline
## Pipeline Overview
| Stage | Trigger | Quality Gate |
|-------|---------|-------------|
| Lint | Every push | Zero lint errors |
| Test | Every push | 75%+ coverage, all tests pass |
| Security | Every push | Zero critical/high CVEs |
| Build | PR merge to dev | Docker build succeeds |
| Push | After build | Images pushed to registry |
| Deploy Staging | After push | Health checks pass |
| Smoke Tests | After staging deploy | Critical paths pass |
| Deploy Production | Manual approval | Health checks pass |
## Stage Details
### Lint
- [Language-specific linters and formatters]
- Runs in parallel per language
### Test
- Unit tests: [framework and command]
- Blackbox tests: [framework and command, uses docker-compose.test.yml]
- Coverage threshold: 75% overall, 90% critical-path floor (100% aim) — per `.cursor/rules/cursor-meta.mdc` Quality Thresholds
- Coverage report published as pipeline artifact
### Security
- Dependency audit: [tool, e.g., npm audit / pip-audit / dotnet list package --vulnerable]
- SAST scan: [tool, e.g., Semgrep / SonarQube]
- Image scan: Trivy on built Docker images
- Block on: critical or high severity findings
### Build
- Docker images built using multi-stage Dockerfiles
- Tagged with git SHA: `<registry>/<component>:<sha>`
- Build cache: Docker layer cache via CI cache action
### Push
- Registry: [container registry URL]
- Authentication: [method]
### Deploy Staging
- Deployment method: [docker compose / Kubernetes / cloud service]
- Pre-deploy: run database migrations
- Post-deploy: verify health check endpoints
- Automated rollback on health check failure
### Smoke Tests
- Subset of blackbox tests targeting staging environment
- Validates critical user flows
- Timeout: [maximum duration]
### Deploy Production
- Requires manual approval via [mechanism]
- Deployment strategy: [blue-green / rolling / canary]
- Pre-deploy: database migration review
- Post-deploy: health checks + monitoring for 15 min
## Caching Strategy
| Cache | Key | Restore Keys |
|-------|-----|-------------|
| Dependencies | [lockfile hash] | [partial match] |
| Docker layers | [Dockerfile hash] | [partial match] |
| Build artifacts | [source hash] | [partial match] |
## Parallelization
[Diagram or description of which stages run concurrently]
## Notifications
| Event | Channel | Recipients |
|-------|---------|-----------|
| Build failure | [Slack/email] | [team] |
| Security alert | [Slack/email] | [team + security] |
| Deploy success | [Slack] | [team] |
| Deploy failure | [Slack/email + PagerDuty] | [on-call] |
## Reference Implementation: Woodpecker CI two-workflow contract
Use this when the project's CI is Woodpecker and the test layout follows the autodev e2e contract from `../../decompose/templates/test-infrastructure-task.md` (an `e2e/` folder containing `Dockerfile`, `docker-compose.test.yml`, `conftest.py`, `requirements.txt`, `mocks/`, `fixtures/`, `tests/`).
The contract is two workflows in `.woodpecker/`, scheduled on the same agent label, with the build workflow gated on a successful test run:

- `.woodpecker/01-test.yml`: runs the e2e contract, publishes `results/report.csv` as an artifact, fails the pipeline on any test failure.
- `.woodpecker/02-build-push.yml`: `depends_on: [01-test]`. Builds the image, tags it `${CI_COMMIT_BRANCH}-${TAG_SUFFIX}`, pushes it to the registry. Skipped automatically if test failed.

The agent label is parameterized via `matrix:` so a single workflow file fans out across architectures: `labels: platform: ${PLATFORM}` routes each matrix entry to the matching agent. Both workflows for a repo must use the same matrix so test and build run on the same machine and share Docker layer cache. New architectures = new matrix entries; never new files.
### Multi-arch matrix conventions

| Variable | Meaning | Typical values |
|---|---|---|
| `PLATFORM` | Woodpecker agent label; selects which physical machine runs the entry. | `arm64`, `amd64` |
| `TAG_SUFFIX` | Image tag suffix appended after the branch name. | `arm`, `amd` |
| `DOCKERFILE` (only when arches need different Dockerfiles) | Path to the Dockerfile for this entry. | `Dockerfile`, `Dockerfile.jetson` |

Most repos use the same Dockerfile for both arches (multi-arch base images handle the rest), so `DOCKERFILE` can be omitted from the matrix and hardcoded in the build command. Repos with split per-arch Dockerfiles (e.g., `detections` uses `Dockerfile.jetson` on Jetson with TensorRT/CUDA-on-L4T) declare `DOCKERFILE` as a matrix var.

When only one architecture is currently in use, keep the `matrix` block with a single entry and the second entry commented out. Adding a new arch is then a one-line uncomment, not a structural change.
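
As an illustration of the conventions above, a two-architecture matrix with per-arch Dockerfiles could look like the following sketch. The pairing of `Dockerfile.jetson` with `arm64` and plain `Dockerfile` with `amd64` is an assumption based on the `detections` example; repos that share one Dockerfile would omit the `DOCKERFILE` lines.

```yaml
# Sketch only: both architectures enabled, each with its own Dockerfile.
# The DOCKERFILE-to-PLATFORM pairing below is an assumption, adjust per repo.
matrix:
  include:
    - PLATFORM: arm64
      TAG_SUFFIX: arm
      DOCKERFILE: Dockerfile.jetson
    - PLATFORM: amd64
      TAG_SUFFIX: amd
      DOCKERFILE: Dockerfile

# Each matrix entry is routed to the agent whose label matches PLATFORM.
labels:
  platform: ${PLATFORM}
```

With this matrix, a push to `dev` would yield the tags `dev-arm` and `dev-amd`, following the `${CI_COMMIT_BRANCH}-${TAG_SUFFIX}` pattern.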
### `.woodpecker/01-test.yml`
```yaml
when:
  event: [push, pull_request, manual]
  branch: [dev, stage, main]

matrix:
  include:
    - PLATFORM: arm64
      TAG_SUFFIX: arm
    # - PLATFORM: amd64
    #   TAG_SUFFIX: amd

labels:
  platform: ${PLATFORM}

steps:
  - name: e2e
    image: docker
    commands:
      - cd e2e
      - docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build
      - docker compose -f docker-compose.test.yml down -v
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

  - name: report
    image: docker
    when:
      status: [success, failure]
    commands:
      - test -f e2e/results/report.csv && cat e2e/results/report.csv || echo "no report"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

Notes:

- `--abort-on-container-exit` shuts the whole compose down as soon as ANY service exits, so a crashed dependency surfaces immediately instead of hanging the runner.
- `--exit-code-from e2e-runner` ensures the pipeline's exit code reflects the test runner's, not the SUT's.
- The `report` step runs on `[success, failure]` so the report is always published; without this the CSV is lost on red builds.
- `down -v` between runs drops mock state and DB volumes, so every test run starts clean.

### `.woodpecker/02-build-push.yml`
```yaml
when:
  event: [push, manual]
  branch: [dev, stage, main]

depends_on:
  - 01-test

matrix:
  include:
    - PLATFORM: arm64
      TAG_SUFFIX: arm
    # - PLATFORM: amd64
    #   TAG_SUFFIX: amd

labels:
  platform: ${PLATFORM}

steps:
  - name: build-push
    image: docker
    environment:
      REGISTRY_HOST:
        from_secret: registry_host
      REGISTRY_USER:
        from_secret: registry_user
      REGISTRY_TOKEN:
        from_secret: registry_token
    commands:
      - echo "$REGISTRY_TOKEN" | docker login "$REGISTRY_HOST" -u "$REGISTRY_USER" --password-stdin
      - export TAG=${CI_COMMIT_BRANCH}-${TAG_SUFFIX}
      - export BUILD_DATE=$(date -u +%Y-%m-%dT%H:%M:%SZ)
      - |
        docker build -f Dockerfile \
          --build-arg CI_COMMIT_SHA=$CI_COMMIT_SHA \
          --label org.opencontainers.image.revision=$CI_COMMIT_SHA \
          --label org.opencontainers.image.created=$BUILD_DATE \
          --label org.opencontainers.image.source=$CI_REPO_URL \
          -t $REGISTRY_HOST/azaion/<service>:$TAG .
      - docker push $REGISTRY_HOST/azaion/<service>:$TAG
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

Notes:

- `depends_on: [01-test]` is enforced by Woodpecker: a failed `01-test` (any matrix entry) skips this workflow.
- The build workflow does NOT trigger on `pull_request` events: PRs get test signal only; pushes to `dev`/`stage`/`main` produce images. This avoids polluting the registry with PR images.
- Replace `<service>` with the actual service name (matches the registry namespace pattern `azaion/<service>`).
- For repos with split per-arch Dockerfiles, add `DOCKERFILE: Dockerfile.jetson` (or similar) to the matrix entry and substitute `${DOCKERFILE}` for `Dockerfile` in the `docker build -f` line.
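
The split-Dockerfile substitution from the last note might look like the fragment below. It is a sketch rather than a complete workflow: it assumes every matrix entry defines `DOCKERFILE`, and it leaves out the secrets, login, image labels, and push, which stay as in the full workflow.

```yaml
# Fragment of .woodpecker/02-build-push.yml (sketch, not a complete workflow).
# Assumes each matrix entry sets DOCKERFILE, e.g. DOCKERFILE: Dockerfile.jetson.
steps:
  - name: build-push
    image: docker
    commands:
      - export TAG=${CI_COMMIT_BRANCH}-${TAG_SUFFIX}
      # ${DOCKERFILE} replaces the hardcoded "Dockerfile" path.
      - docker build -f ${DOCKERFILE} -t $REGISTRY_HOST/azaion/<service>:$TAG .
```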
### Variations by stack
The contract is language-agnostic because the runner is `docker compose`. The Dockerfile inside `e2e/` selects the test framework:

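| Stack | `e2e/Dockerfile` runs |
|---|---|
| Python | `pytest --csv=/results/report.csv -v` |
| .NET | `dotnet test --logger:"trx;LogFileName=/results/report.trx"` (convert to CSV in a final step if needed) |
| Node/UI | `JEST_JUNIT_OUTPUT_DIR=/results npm test -- --reporters=default --reporters=jest-junit` (jest-junit takes its output directory from the environment or reporter options, not from a Jest CLI flag) |
| Rust | `cargo test --no-fail-fast -- -Z unstable-options --format json > /results/report.json` (libtest's JSON output is unstable and needs a nightly toolchain) |
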
When the repo has only unit tests (no `e2e/docker-compose.test.yml`), drop the compose orchestration and run the native test command directly inside a stack-appropriate image. Keep the same two-workflow split: `01-test.yml` runs unit tests, `02-build-push.yml` is unchanged.
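
For a Python repo, a unit-test-only `01-test.yml` could be sketched as follows. The image tag, the `requirements.txt` location, and the `tests/` path are assumptions made for illustration; the `when`, `matrix`, and `labels` blocks are carried over from the e2e variant so both workflows keep the same matrix.

```yaml
# Sketch: .woodpecker/01-test.yml for a repo with unit tests only (Python shown).
# Image tag, requirements.txt and tests/ are hypothetical, adapt to the repo.
when:
  event: [push, pull_request, manual]
  branch: [dev, stage, main]

matrix:
  include:
    - PLATFORM: arm64
      TAG_SUFFIX: arm
    # - PLATFORM: amd64
    #   TAG_SUFFIX: amd

labels:
  platform: ${PLATFORM}

steps:
  - name: unit
    image: python:3.12-slim
    commands:
      - pip install -r requirements.txt
      # No compose orchestration: the native test command runs in the step image.
      - pytest -v tests/
```

Because the step no longer drives Docker itself, the `/var/run/docker.sock` mount is not needed here.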
### Manual-trigger override (test infrastructure not yet validated)
If a repo ships a complete `e2e/` layout but the test fixtures are not yet validated end-to-end (e.g., expected-results data is still being authored), gate `01-test.yml` on `event: [manual]` only and add a TODO comment pointing to the unblocking task. The `02-build-push.yml` workflow drops its `depends_on` clause for the manual-only window. This is an explicit and reversible exception, not a permanent split.
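
The override could be expressed roughly as below. Both fragments are sketches showing only the lines that change; `<task-id>` is a hypothetical placeholder for the unblocking task reference, and leaving the old `depends_on` commented out rather than deleted is one possible way to keep the exception easy to revert.

```yaml
# .woodpecker/01-test.yml during the manual-only window (sketch).
# TODO(<task-id>): restore [push, pull_request, manual] once the e2e fixtures are validated.
when:
  event: [manual]
  branch: [dev, stage, main]
```

```yaml
# .woodpecker/02-build-push.yml during the manual-only window (sketch).
when:
  event: [push, manual]
  branch: [dev, stage, main]

# TODO(<task-id>): re-enable once 01-test.yml triggers on push again.
# depends_on:
#   - 01-test
```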