---
name: deploy
description: |
  Comprehensive deployment skill covering status check, env setup, containerization, CI/CD pipeline, environment strategy, observability, deployment procedures, and deployment scripts.
  7-step workflow: Status & env check, Docker containerization, CI/CD pipeline definition, environment strategy, observability planning, deployment procedures, deployment scripts.
  Uses _docs/04_deploy/ structure.
  Trigger phrases:
  - "deploy", "deployment", "deployment strategy"
  - "CI/CD", "pipeline", "containerize"
  - "observability", "monitoring", "logging"
  - "dockerize", "docker compose"
category: ship
tags: [deployment, docker, ci-cd, observability, monitoring, containerization, scripts]
disable-model-invocation: true
---

# Deployment Planning

Plan and document the full deployment lifecycle: check deployment status and environment requirements, containerize the application, define CI/CD pipelines, configure environments, set up observability, document deployment procedures, and generate deployment scripts.

## Core Principles

- **Docker-first**: every component runs in a container; local dev, blackbox tests, and production all use Docker
- **Infrastructure as code**: all deployment configuration is version-controlled
- **Observability built-in**: logging, metrics, and tracing are part of the deployment plan, not afterthoughts
- **Environment parity**: dev, staging, and production environments mirror each other as closely as possible
- **Save immediately**: write artifacts to disk after each step; never accumulate unsaved work
- **Ask, don't assume**: when infrastructure constraints or preferences are unclear, ask the user
- **Plan, don't code**: this workflow produces deployment documents and specifications, not implementation code (except deployment scripts in Step 7)

## Context Resolution

Fixed paths:

- DOCUMENT_DIR: `_docs/02_document/`
- DEPLOY_DIR: `_docs/04_deploy/`
- REPORTS_DIR: `_docs/04_deploy/reports/`
- SCRIPTS_DIR: `scripts/`
- ARCHITECTURE: `_docs/02_document/architecture.md`
- COMPONENTS_DIR: `_docs/02_document/components/`

Announce the resolved paths to the user before proceeding.

## Input Specification

### Required Files

| File | Purpose | Required |
|------|---------|----------|
| `_docs/00_problem/problem.md` | Problem description and context | Greenfield only |
| `_docs/00_problem/restrictions.md` | Constraints and limitations | Greenfield only |
| `_docs/01_solution/solution.md` | Finalized solution | Greenfield only |
| `DOCUMENT_DIR/architecture.md` | Architecture (from plan or document skill) | Always |
| `DOCUMENT_DIR/components/` | Component specs | Always |

### Prerequisite Checks (BLOCKING)

1. `architecture.md` exists; **STOP if missing** and run `/plan` first
2. At least one component spec exists in `DOCUMENT_DIR/components/`; **STOP if missing**
3. Create DEPLOY_DIR, REPORTS_DIR, and SCRIPTS_DIR if they do not exist
4. If DEPLOY_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?**

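The blocking checks above could be scripted roughly as follows. This is a minimal sketch, assuming it is given the project root as its argument; the function name `check_prereqs` and its messages are hypothetical and not part of the skill itself.

```shell
# Sketch of the blocking prerequisite checks (hypothetical helper).
# Usage: check_prereqs [PROJECT_ROOT]
check_prereqs() {
  local root="${1:-.}"
  local document_dir="$root/_docs/02_document"
  local deploy_dir="$root/_docs/04_deploy"

  # Check 1: architecture.md must exist.
  if [ ! -f "$document_dir/architecture.md" ]; then
    echo "STOP: $document_dir/architecture.md is missing; run /plan first" >&2
    return 1
  fi

  # Check 2: at least one component spec must exist.
  if [ -z "$(find "$document_dir/components" -type f 2>/dev/null | head -n 1)" ]; then
    echo "STOP: no component specs in $document_dir/components/" >&2
    return 1
  fi

  # Check 3: create the output directories if they do not exist.
  mkdir -p "$deploy_dir/reports" "$root/scripts"

  # Check 4: report existing artifacts so the caller can ask
  # "resume from last checkpoint or start fresh?".
  if [ -n "$(find "$deploy_dir" -type f -name '*.md' | head -n 1)" ]; then
    echo "existing artifacts found in $deploy_dir"
  fi
}
```

A caller would run `check_prereqs .` from the project root and stop the workflow on a non-zero exit status.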
## Artifact Management

### Directory Structure

```
DEPLOY_DIR/
├── containerization.md
├── ci_cd_pipeline.md
├── environment_strategy.md
├── observability.md
├── deployment_procedures.md
├── deploy_scripts.md
└── reports/
    └── deploy_status_report.md

SCRIPTS_DIR/ (project root)
├── deploy.sh
├── pull-images.sh
├── start-services.sh
├── stop-services.sh
└── health-check.sh

.env (project root, git-ignored)
.env.example (project root, committed)
```

### Save Timing

| Step | Save immediately after | Filename |
|------|------------------------|----------|
| Step 1 | Status check & env setup complete | `reports/deploy_status_report.md` + `.env` + `.env.example` |
| Step 2 | Containerization plan complete | `containerization.md` |
| Step 3 | CI/CD pipeline defined | `ci_cd_pipeline.md` |
| Step 4 | Environment strategy documented | `environment_strategy.md` |
| Step 5 | Observability plan complete | `observability.md` |
| Step 6 | Deployment procedures documented | `deployment_procedures.md` |
| Step 7 | Deployment scripts created | `deploy_scripts.md` + scripts in `SCRIPTS_DIR/` |

### Resumability

If DEPLOY_DIR already contains artifacts:

1. List existing files and match to the save timing table
2. Identify the last completed step
3. Resume from the next incomplete step
4. Inform the user which steps are being skipped

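One way to find the resume point is to walk the save timing table in order and stop at the first missing artifact. The sketch below is a simplification: `next_step` is a hypothetical helper, and it only looks at the main document for each step, not at `.env`, `.env.example`, or the generated scripts.

```shell
# Print the number of the first step whose document is missing.
# Prints 8 when all seven step documents exist.
# Usage: next_step DEPLOY_DIR
next_step() {
  local deploy_dir="$1"
  local step=1
  local artifact
  for artifact in \
    reports/deploy_status_report.md \
    containerization.md \
    ci_cd_pipeline.md \
    environment_strategy.md \
    observability.md \
    deployment_procedures.md \
    deploy_scripts.md
  do
    if [ ! -f "$deploy_dir/$artifact" ]; then
      break
    fi
    step=$((step + 1))
  done
  echo "$step"
}
```

For example, `next_step _docs/04_deploy` prints `3` when only the status report and `containerization.md` exist, so Steps 1 and 2 would be reported as skipped.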
## Progress Tracking

At the start of execution, create a TodoWrite list with all steps (1 through 7). Update status as each step completes.

## Workflow

### Step 1: Deployment Status & Environment Setup

**Role**: DevOps / Platform engineer
**Goal**: Assess current deployment readiness, identify all required environment variables, and create `.env` files
**Constraints**: Must complete before any other step

1. Read `architecture.md`, all component specs, and `restrictions.md`
2. Assess deployment readiness:
   - List all components and their current state (planned / implemented / tested)
   - Identify external dependencies (databases, APIs, message queues, cloud services)
   - Identify infrastructure prerequisites (container registry, cloud accounts, DNS, SSL certificates)
   - Check if any deployment blockers exist
3. Identify all required environment variables by scanning:
   - Component specs for configuration needs
   - Database connection requirements
   - External API endpoints and credentials
   - Feature flags and runtime configuration
   - Container registry credentials
   - Cloud provider credentials
   - Monitoring/logging service endpoints
4. Generate `.env.example` in project root with all variables and placeholder values (committed to VCS)
5. Generate `.env` in project root with development defaults filled in where safe (git-ignored)
6. Ensure `.gitignore` includes `.env` (but NOT `.env.example`)
7. Produce a deployment status report summarizing readiness, blockers, and required setup

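Items 4 to 6 above lend themselves to small mechanical checks. The following is a sketch under the assumption that both env files use plain `KEY=value` lines; the helper names `ensure_env_ignored`, `env_keys`, and `env_files_in_sync` are hypothetical.

```shell
# Append ".env" to .gitignore unless an exact ".env" line already exists.
ensure_env_ignored() {
  local gitignore="$1"
  touch "$gitignore"
  if ! grep -qxF '.env' "$gitignore"; then
    # Add a newline first if the file does not already end with one.
    if [ -n "$(tail -c 1 "$gitignore")" ]; then
      echo >> "$gitignore"
    fi
    echo '.env' >> "$gitignore"
  fi
}

# Print the sorted variable names defined in an env file.
# Comment lines and blank lines are ignored.
env_keys() {
  sed -nE 's/^([A-Za-z_][A-Za-z0-9_]*)=.*/\1/p' "$1" | sort
}

# Succeed only when both files define exactly the same variable names.
env_files_in_sync() {
  [ "$(env_keys "$1")" = "$(env_keys "$2")" ]
}
```

Running `ensure_env_ignored .gitignore` twice leaves a single `.env` entry, and `env_files_in_sync .env .env.example` fails when a variable appears in one file but not the other.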

**Self-verification**:
- [ ] All components assessed for deployment readiness
- [ ] External dependencies catalogued
- [ ] Infrastructure prerequisites identified
- [ ] All required environment variables discovered
- [ ] `.env.example` created with placeholder values
- [ ] `.env` created with safe development defaults
- [ ] `.gitignore` updated to exclude `.env`
- [ ] Status report written to `reports/deploy_status_report.md`

**Save action**: Write `reports/deploy_status_report.md` using `templates/deploy_status_report.md`, create `.env` and `.env.example` in project root

**BLOCKING**: Present status report and environment variables to user. Do NOT proceed until confirmed.

---

### Step 2: Containerization

**Role**: DevOps / Platform engineer
**Goal**: Define Docker configuration for every component, local development, and blackbox test environments
**Constraints**: Plan only; no Dockerfile creation. Describe what each Dockerfile should contain.

1. Read `architecture.md` and all component specs
2. Read `restrictions.md` for infrastructure constraints
3. Research best Docker practices for the project's tech stack (multi-stage builds, base image selection, layer optimization)
4. For each component, define:
   - Base image (pinned version, prefer alpine/distroless for production)
   - Build stages (dependency install, build, production)
   - Non-root user configuration
   - Health check endpoint and command
   - Exposed ports
   - `.dockerignore` contents
5. Define `docker-compose.yml` for local development:
   - All application components
   - Database (Postgres) with named volume
   - Any message queues, caches, or external service mocks
   - Shared network
   - Environment variable files (`.env`)
6. Define `docker-compose.test.yml` for blackbox tests:
   - Application components under test
   - Test runner container (black-box, no internal imports)
   - Isolated database with seed data
   - All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit`
7. Define image tagging strategy: `<registry>/<project>/<component>:<git-sha>` for CI, `latest` for local dev only

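The tagging rule in item 7 can be captured in a one-line helper. This is only an illustrative sketch; `image_ref` and the example registry and project names below are hypothetical.

```shell
# Build an image reference of the form <registry>/<project>/<component>:<tag>.
# With a git SHA the tag is the SHA (CI); without one the tag falls back to
# "latest", which the plan reserves for local development.
# Usage: image_ref REGISTRY PROJECT COMPONENT [GIT_SHA]
image_ref() {
  local registry="$1" project="$2" component="$3" sha="${4:-}"
  echo "${registry}/${project}/${component}:${sha:-latest}"
}
```

In CI this might be called as `image_ref registry.example.com myproject api "$(git rev-parse HEAD)"`; omitting the last argument yields the `latest` tag.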

**Self-verification**:
- [ ] Every component has a Dockerfile specification
- [ ] Multi-stage builds specified for all production images
- [ ] Non-root user for all containers
- [ ] Health checks defined for every service
- [ ] `docker-compose.yml` covers all components + dependencies
- [ ] `docker-compose.test.yml` enables black-box testing
- [ ] `.dockerignore` defined

**Save action**: Write `containerization.md` using `templates/containerization.md`

**BLOCKING**: Present containerization plan to user. Do NOT proceed until confirmed.

---

### Step 3: CI/CD Pipeline

**Role**: DevOps engineer
**Goal**: Define the CI/CD pipeline with quality gates, security scanning, and multi-environment deployment
**Constraints**: Pipeline definition only; produce a YAML specification, not an implementation

1. Read `architecture.md` for tech stack and deployment targets
2. Read `restrictions.md` for CI/CD constraints (cloud provider, registry, etc.)
3. Research CI/CD best practices for the project's platform (GitHub Actions / Azure Pipelines)
4. Define pipeline stages:

   | Stage | Trigger | Steps | Quality Gate |
   |-------|---------|-------|-------------|
   | **Lint** | Every push | Run linters per language (black, rustfmt, prettier, dotnet format) | Zero errors |
   | **Test** | Every push | Unit tests, blackbox tests, coverage report | 75%+ coverage (see `.cursor/rules/cursor-meta.mdc` Quality Thresholds) |
   | **Security** | Every push | Dependency audit, SAST scan (Semgrep/SonarQube), image scan (Trivy) | Zero critical/high CVEs |
   | **Build** | PR merge to dev | Build Docker images, tag with git SHA | Build succeeds |
   | **Push** | After build | Push to container registry | Push succeeds |
   | **Deploy Staging** | After push | Deploy to staging environment | Health checks pass |
   | **Smoke Tests** | After staging deploy | Run critical path tests against staging | All pass |
   | **Deploy Production** | Manual approval | Deploy to production | Health checks pass |

5. Define caching strategy: dependency caches, Docker layer caches, build artifact caches
6. Define parallelization: which stages can run concurrently
7. Define notifications: build failures, deployment status, security alerts

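A quality gate such as the 75% coverage threshold in the Test stage is ultimately a command that exits non-zero when the gate fails. As a rough sketch, assuming the coverage tool can report a single percentage, a gate could look like this (`coverage_gate` is a hypothetical name):

```shell
# Succeed when the measured coverage meets the threshold (default 75).
# Both values are percentages and may be fractional; awk does the comparison
# because shell arithmetic is integer-only.
# Usage: coverage_gate MEASURED [THRESHOLD]
coverage_gate() {
  local measured="$1" threshold="${2:-75}"
  awk -v m="$measured" -v t="$threshold" 'BEGIN { exit !(m + 0 >= t + 0) }'
}
```

A pipeline step could then run `coverage_gate "$COVERAGE_PERCENT"` (the variable name is an assumption) and let the non-zero exit status fail the stage.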

**Self-verification**:
- [ ] All pipeline stages defined with triggers and gates
- [ ] Coverage threshold enforced (75%+)
- [ ] Security scanning included (dependencies + images + SAST)
- [ ] Caching configured for dependencies and Docker layers
- [ ] Multi-environment deployment (staging → production)
- [ ] Rollback procedure referenced
- [ ] Notifications configured

**Save action**: Write `ci_cd_pipeline.md` using `templates/ci_cd_pipeline.md`

---

### Step 4: Environment Strategy

**Role**: Platform engineer
**Goal**: Define environment configuration, secrets management, and environment parity
**Constraints**: Strategy document; no secrets or credentials in output

1. Define environments:

   | Environment | Purpose | Infrastructure | Data |
   |-------------|---------|---------------|------|
   | **Development** | Local developer workflow | docker-compose, local volumes | Seed data, mocks for external APIs |
   | **Staging** | Pre-production validation | Mirrors production topology | Anonymized production-like data |
   | **Production** | Live system | Full infrastructure | Real data |

2. Define environment variable management:
   - Reference `.env.example` created in Step 1
   - Per-environment variable sources (`.env` for dev, secret manager for staging/prod)
   - Validation: fail fast on missing required variables at startup
3. Define secrets management:
   - Never commit secrets to version control
   - Development: `.env` files (git-ignored)
   - Staging/Production: secret manager (AWS Secrets Manager / Azure Key Vault / Vault)
   - Rotation policy
4. Define database management per environment:
   - Development: Docker Postgres with named volume, seed data
   - Staging: managed Postgres, migrations applied via CI/CD
   - Production: managed Postgres, migrations require approval

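The "fail fast on missing required variables" rule from item 2 can be illustrated with a small startup check. This is a sketch only; `require_env` is a hypothetical helper and the variable names in the usage note are placeholders.

```shell
# Fail fast when any named variable is unset or empty, listing every missing
# name in a single message.
# Usage: require_env NAME [NAME...]
require_env() {
  local name value missing=""
  for name in "$@"; do
    # Look up the variable whose name is stored in $name. eval is used for
    # portability; the names come from the script itself, not from user input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing required environment variables:$missing" >&2
    return 1
  fi
}
```

A service entrypoint might call `require_env DATABASE_URL REGISTRY_USER || exit 1` before starting, so a misconfigured environment is reported once with all missing names.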

**Self-verification**:
- [ ] All three environments defined with clear purpose
- [ ] Environment variable documentation complete (references `.env.example` from Step 1)
- [ ] No secrets in any output document
- [ ] Secret manager specified for staging/production
- [ ] Database strategy per environment

**Save action**: Write `environment_strategy.md` using `templates/environment_strategy.md`

---

### Step 5: Observability

**Role**: Site Reliability Engineer (SRE)
**Goal**: Define logging, metrics, tracing, and alerting strategy
**Constraints**: Strategy document; describe what to implement, not how to wire it

1. Read `architecture.md` and component specs for service boundaries
2. Research observability best practices for the tech stack

**Logging**:
- Structured JSON to stdout/stderr (no file logging in containers)
- Fields: `timestamp` (ISO 8601), `level`, `service`, `correlation_id`, `message`, `context`
- Levels: ERROR (exceptions), WARN (degraded), INFO (business events), DEBUG (diagnostics, dev only)
- No PII in logs
- Retention: dev = console, staging = 7 days, production = 30 days

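To make the log format above concrete, here is a minimal sketch of a logger that writes one JSON object per line to stdout. The helpers `json_escape` and `log_json` are hypothetical, the optional `context` field is omitted for brevity, and the escaping is deliberately simplified.

```shell
# Escape backslashes and double quotes for embedding in a JSON string.
# Simplification: control characters such as newlines are not handled.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Emit one structured log line on stdout with an ISO 8601 UTC timestamp.
# Usage: log_json LEVEL SERVICE CORRELATION_ID MESSAGE
log_json() {
  local level="$1" service="$2" correlation_id="$3" message="$4"
  printf '{"timestamp":"%s","level":"%s","service":"%s","correlation_id":"%s","message":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "$(json_escape "$level")" \
    "$(json_escape "$service")" \
    "$(json_escape "$correlation_id")" \
    "$(json_escape "$message")"
}
```

For example, `log_json INFO api req-42 "order created"` prints a single line whose `level`, `service`, `correlation_id`, and `message` fields carry the given values. Real services would normally rely on their language's logging library rather than shell.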

**Metrics**:
- Expose Prometheus-compatible `/metrics` endpoint per service
- System metrics: CPU, memory, disk, network
- Application metrics: `request_count`, `request_duration` (histogram), `error_count`, `active_connections`
- Business metrics: derived from acceptance criteria
- Collection interval: 15s

**Distributed Tracing**:
- OpenTelemetry SDK integration
- Trace context propagation via HTTP headers and message queue metadata
- Span naming: `<service>.<operation>`
- Sampling: 100% in dev/staging, 10% in production (adjust based on volume)

**Alerting**:

| Severity | Response Time | Condition Examples |
|----------|---------------|-------------------|
| Critical | 5 min | Service down, data loss, health check failed |
| High | 30 min | Error rate > 5%, P95 latency > 2x baseline |
| Medium | 4 hours | Disk > 80%, elevated latency |
| Low | Next business day | Non-critical warnings |

**Dashboards**:
- Operations: service health, request rate, error rate, response time percentiles, resource utilization
- Business: key business metrics from acceptance criteria

**Self-verification**:
- [ ] Structured logging format defined with required fields
- [ ] Metrics endpoint specified per service
- [ ] OpenTelemetry tracing configured
- [ ] Alert severities with response times defined
- [ ] Dashboards cover operations and business metrics
- [ ] PII exclusion from logs addressed

**Save action**: Write `observability.md` using `templates/observability.md`

---

### Step 6: Deployment Procedures

**Role**: DevOps / Platform engineer
**Goal**: Define deployment strategy, rollback procedures, health checks, and deployment checklist
**Constraints**: Procedures document; no implementation

1. Define deployment strategy:
   - Preferred pattern: blue-green / rolling / canary (choose based on architecture)
   - Zero-downtime requirement for production
   - Graceful shutdown: 30-second grace period for in-flight requests
   - Database migration ordering: migrate before deploy, backward-compatible only

2. Define health checks:

   | Check | Type | Endpoint | Interval | Threshold |
   |-------|------|----------|----------|-----------|
   | Liveness | HTTP GET | `/health/live` | 10s | 3 failures → restart |
   | Readiness | HTTP GET | `/health/ready` | 5s | 3 failures → remove from LB |
   | Startup | HTTP GET | `/health/ready` | 5s | 30 attempts max |

3. Define rollback procedures:
   - Trigger criteria: health check failures, error rate spike, critical alert
   - Rollback steps: redeploy previous image tag, verify health, rollback database if needed
   - Communication: notify stakeholders during rollback
   - Post-mortem: required after every production rollback

4. Define deployment checklist:
   - [ ] All tests pass in CI
   - [ ] Security scan clean (zero critical/high CVEs)
   - [ ] Database migrations reviewed and tested
   - [ ] Environment variables configured
   - [ ] Health check endpoints responding
   - [ ] Monitoring alerts configured
   - [ ] Rollback plan documented and tested
   - [ ] Stakeholders notified

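The startup row of the health check table (probe every 5s, 30 attempts at most) amounts to a bounded retry loop. The sketch below shows that loop with the probe passed in as a command; `wait_until` is a hypothetical helper, and the URL in the usage note is a placeholder.

```shell
# Run a probe command up to MAX_ATTEMPTS times, sleeping INTERVAL seconds
# between attempts. Succeeds as soon as the probe succeeds.
# Usage: wait_until MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
  local max_attempts="$1" interval="$2"
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only when another attempt will follow.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$interval"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

A startup check matching the table could then be written as `wait_until 30 5 curl -fsS http://localhost:8080/health/ready`, where the host and port are assumptions about the target service.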

**Self-verification**:
- [ ] Deployment strategy chosen and justified
- [ ] Zero-downtime approach specified
- [ ] Health checks defined (liveness, readiness, startup)
- [ ] Rollback trigger criteria and steps documented
- [ ] Deployment checklist complete

**Save action**: Write `deployment_procedures.md` using `templates/deployment_procedures.md`

**BLOCKING**: Present deployment procedures to user. Do NOT proceed until confirmed.

---

### Step 7: Deployment Scripts

**Role**: DevOps / Platform engineer
**Goal**: Create executable deployment scripts for pulling Docker images and running services on the remote target machine
**Constraints**: Produce real, executable shell scripts. This is the ONLY step that creates implementation artifacts.

1. Read `containerization.md` and `deployment_procedures.md` from previous steps
2. Read `.env.example` for required variables
3. Create the following scripts in `SCRIPTS_DIR/`:

   **`deploy.sh`** (main deployment orchestrator):
   - Validates that required environment variables are set (sources `.env` if present)
   - Calls `pull-images.sh`, then `stop-services.sh`, then `start-services.sh`, then `health-check.sh`
   - Exits with non-zero code on any failure
   - Supports `--rollback` flag to redeploy previous image tags

   **`pull-images.sh`** (pull Docker images to target machine):
   - Reads image list and tags from environment or config
   - Authenticates with container registry
   - Pulls all required images
   - Verifies image integrity (digest check)

   **`start-services.sh`** (start services on target machine):
   - Runs `docker compose up -d` or individual `docker run` commands
   - Applies environment variables from `.env`
   - Configures networks and volumes
   - Waits for containers to reach healthy state

   **`stop-services.sh`** (graceful shutdown):
   - Stops services with graceful shutdown period
   - Saves current image tags for rollback reference
   - Cleans up orphaned containers/networks

   **`health-check.sh`** (verify deployment health):
   - Checks all health endpoints
   - Reports status per service
   - Returns non-zero if any service is unhealthy

4. All scripts must:
   - Use Bash with strict mode (`#!/bin/bash` with `set -euo pipefail`)
   - Source `.env` from project root or accept env vars from the environment
   - Include usage/help output (`--help` flag)
   - Be idempotent where possible
   - Handle SSH connection to remote target (configurable via `DEPLOY_HOST` env var)

5. Document all scripts in `deploy_scripts.md`

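The orchestration order described for `deploy.sh` can be sketched as a single function. This is an illustration rather than the generated script: `deploy_main` is a hypothetical name, environment validation and SSH handling are left out, and forwarding `--rollback` to each stage is an assumption about how previous tags would be selected.

```shell
# Sketch of the deploy.sh flow. SCRIPTS_DIR defaults to "scripts".
# Usage: deploy_main [--rollback] [--help]
deploy_main() {
  local scripts_dir="${SCRIPTS_DIR:-scripts}"
  local rollback=""
  local arg script
  for arg in "$@"; do
    case "$arg" in
      --help)
        echo "Usage: deploy.sh [--rollback] [--help]"
        return 0
        ;;
      --rollback)
        rollback="--rollback"
        ;;
      *)
        echo "unknown option: $arg" >&2
        return 2
        ;;
    esac
  done

  # Run the stages in the documented order and stop at the first failure.
  for script in pull-images.sh stop-services.sh start-services.sh health-check.sh; do
    echo "==> $script"
    # ${rollback:+...} adds the flag only when --rollback was given.
    if ! "$scripts_dir/$script" ${rollback:+"$rollback"}; then
      echo "deploy failed in $script" >&2
      return 1
    fi
  done
}
```

In a real `deploy.sh` the file would end with `deploy_main "$@"`, so the script's exit status reflects the first failing stage.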

**Self-verification**:
- [ ] All five scripts created and executable
- [ ] Scripts source environment variables correctly
- [ ] `deploy.sh` orchestrates the full flow
- [ ] `pull-images.sh` handles registry auth and image pull
- [ ] `start-services.sh` starts containers with correct config
- [ ] `stop-services.sh` handles graceful shutdown
- [ ] `health-check.sh` validates all endpoints
- [ ] Rollback supported via `deploy.sh --rollback`
- [ ] Scripts work for remote deployment via SSH (`DEPLOY_HOST`)
- [ ] `deploy_scripts.md` documents all scripts

**Save action**: Write scripts to `SCRIPTS_DIR/`, write `deploy_scripts.md` using `templates/deploy_scripts.md`

---

## Escalation Rules

| Situation | Action |
|-----------|--------|
| Unknown cloud provider or hosting | **ASK user** |
| Container registry not specified | **ASK user** |
| CI/CD platform preference unclear | **ASK user**; default to GitHub Actions |
| Secret manager not chosen | **ASK user** |
| Deployment pattern trade-offs | **ASK user** with recommendation |
| Missing `architecture.md` | **STOP**; run `/plan` first |
| Remote target machine details unknown | **ASK user** for SSH access, OS, and specs |

## Common Mistakes

- **Implementing during planning**: Steps 1–6 produce documents, not code (Step 7 is the exception: it creates scripts)
- **Hardcoding secrets**: never include real credentials in deployment documents or scripts
- **Ignoring blackbox test containerization**: the test environment must be containerized alongside the app
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Using `:latest` tags**: always pin base image versions
- **Forgetting observability**: logging, metrics, and tracing are deployment concerns, not post-deployment additions
- **Committing `.env`**: only `.env.example` goes to version control; `.env` must be in `.gitignore`
- **Non-portable scripts**: deployment scripts must work across environments; avoid hardcoded paths

## Methodology Quick Reference

```
┌────────────────────────────────────────────────────────────────┐
│              Deployment Planning (7-Step Method)               │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: architecture.md + component specs exist                │
│                                                                │
│ 1. Status & Env     → reports/deploy_status_report.md          │
│                       + .env + .env.example                    │
│    [BLOCKING: user confirms status & env vars]                 │
│ 2. Containerization → containerization.md                      │
│    [BLOCKING: user confirms Docker plan]                       │
│ 3. CI/CD Pipeline   → ci_cd_pipeline.md                        │
│ 4. Environment      → environment_strategy.md                  │
│ 5. Observability    → observability.md                         │
│ 6. Procedures       → deployment_procedures.md                 │
│    [BLOCKING: user confirms deployment plan]                   │
│ 7. Scripts          → deploy_scripts.md + scripts/             │
├────────────────────────────────────────────────────────────────┤
│ Principles: Docker-first · IaC · Observability built-in        │
│             Environment parity · Save immediately              │
└────────────────────────────────────────────────────────────────┘
```