| name | description | category | tags | disable-model-invocation |
|---|---|---|---|---|
| deploy | Comprehensive deployment skill covering containerization, CI/CD pipeline, environment strategy, observability, and deployment procedures. 5-step workflow: Docker containerization, CI/CD pipeline definition, environment strategy, observability planning, deployment procedures. Uses _docs/02_plans/deployment/ structure. Trigger phrases: "deploy", "deployment", "deployment strategy"; "CI/CD", "pipeline", "containerize"; "observability", "monitoring", "logging"; "dockerize", "docker compose". | ship | | true |
Deployment Planning
Plan and document the full deployment lifecycle: containerize the application, define CI/CD pipelines, configure environments, set up observability, and document deployment procedures.
Core Principles
- Docker-first: every component runs in a container; local dev, integration tests, and production all use Docker
- Infrastructure as code: all deployment configuration is version-controlled
- Observability built-in: logging, metrics, and tracing are part of the deployment plan, not afterthoughts
- Environment parity: dev, staging, and production environments mirror each other as closely as possible
- Save immediately: write artifacts to disk after each step; never accumulate unsaved work
- Ask, don't assume: when infrastructure constraints or preferences are unclear, ask the user
- Plan, don't code: this workflow produces deployment documents and specifications, not implementation code
Context Resolution
Fixed paths:
- PLANS_DIR: _docs/02_plans/
- DEPLOY_DIR: _docs/02_plans/deployment/
- ARCHITECTURE: _docs/02_plans/architecture.md
- COMPONENTS_DIR: _docs/02_plans/components/
Announce the resolved paths to the user before proceeding.
Input Specification
Required Files
| File | Purpose |
|---|---|
| _docs/00_problem/problem.md | Problem description and context |
| _docs/00_problem/restrictions.md | Constraints and limitations |
| _docs/01_solution/solution.md | Finalized solution |
| PLANS_DIR/architecture.md | Architecture from plan skill |
| PLANS_DIR/components/ | Component specs |
Prerequisite Checks (BLOCKING)
- architecture.md exists; STOP if missing and run /plan first
- At least one component spec exists in PLANS_DIR/components/; STOP if missing
- Create DEPLOY_DIR if it does not exist
- If DEPLOY_DIR already contains artifacts, ask user: resume from last checkpoint or start fresh?
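The blocking checks above can be sketched as a small function. This is an illustration only, not part of the skill's output: the function name `check_prerequisites` and the `root` parameter are hypothetical, and it assumes that any entry inside the components directory counts as a component spec.

```python
from pathlib import Path
from typing import List


def check_prerequisites(root: Path) -> List[str]:
    """Return a list of blocking problems; an empty list means planning may start."""
    plans_dir = root / "_docs" / "02_plans"
    deploy_dir = plans_dir / "deployment"
    components_dir = plans_dir / "components"
    problems = []

    # architecture.md must exist before any deployment planning.
    if not (plans_dir / "architecture.md").is_file():
        problems.append("architecture.md missing: run /plan first")

    # Assumption: any entry under components/ counts as a component spec.
    if not components_dir.is_dir() or not any(components_dir.iterdir()):
        problems.append("no component specs found in PLANS_DIR/components/")

    # Only create DEPLOY_DIR once nothing blocks the workflow.
    if not problems:
        deploy_dir.mkdir(parents=True, exist_ok=True)
    return problems
```

A non-empty return value corresponds to a STOP condition; the caller would report each message to the user instead of continuing.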
Artifact Management
Directory Structure
DEPLOY_DIR/
├── containerization.md
├── ci_cd_pipeline.md
├── environment_strategy.md
├── observability.md
└── deployment_procedures.md
Save Timing
| Step | Save immediately after | Filename |
|---|---|---|
| Step 1 | Containerization plan complete | containerization.md |
| Step 2 | CI/CD pipeline defined | ci_cd_pipeline.md |
| Step 3 | Environment strategy documented | environment_strategy.md |
| Step 4 | Observability plan complete | observability.md |
| Step 5 | Deployment procedures documented | deployment_procedures.md |
Resumability
If DEPLOY_DIR already contains artifacts:
- List existing files and match to the save timing table
- Identify the last completed step
- Resume from the next incomplete step
- Inform the user which steps are being skipped
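As a rough sketch of the resume logic, the save timing table can be treated as an ordered mapping from step to filename. The function name `next_incomplete_step` is hypothetical, and the sketch assumes a step counts as complete as soon as its file exists, which says nothing about whether the content is finished.

```python
from pathlib import Path
from typing import Optional

# Step number -> artifact filename, copied from the save timing table.
STEP_ARTIFACTS = {
    1: "containerization.md",
    2: "ci_cd_pipeline.md",
    3: "environment_strategy.md",
    4: "observability.md",
    5: "deployment_procedures.md",
}


def next_incomplete_step(deploy_dir: Path) -> Optional[int]:
    """Return the first step whose artifact is missing, or None if all exist."""
    for step in sorted(STEP_ARTIFACTS):
        if not (deploy_dir / STEP_ARTIFACTS[step]).is_file():
            return step
    return None
```

Every step before the returned number would be reported to the user as skipped.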
Progress Tracking
At the start of execution, create a TodoWrite with all steps (1 through 5). Update status as each step completes.
Workflow
Step 1: Containerization
Role: DevOps / Platform engineer
Goal: Define Docker configuration for every component, local development, and integration test environments
Constraints: Plan only; no Dockerfile creation. Describe what each Dockerfile should contain.
- Read architecture.md and all component specs
- Read restrictions.md for infrastructure constraints
- Research best Docker practices for the project's tech stack (multi-stage builds, base image selection, layer optimization)
- For each component, define:
  - Base image (pinned version, prefer alpine/distroless for production)
  - Build stages (dependency install, build, production)
  - Non-root user configuration
  - Health check endpoint and command
  - Exposed ports
  - .dockerignore contents
- Define docker-compose.yml for local development:
  - All application components
  - Database (Postgres) with named volume
  - Any message queues, caches, or external service mocks
  - Shared network
  - Environment variable files (.env.dev)
- Define docker-compose.test.yml for integration tests:
  - Application components under test
  - Test runner container (black-box, no internal imports)
  - Isolated database with seed data
  - All tests runnable via docker compose -f docker-compose.test.yml up --abort-on-container-exit
- Define image tagging strategy: <registry>/<project>/<component>:<git-sha> for CI, latest for local dev only
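To make the tagging rule concrete, here is a minimal sketch that builds an image reference. The function name `image_ref` and the example registry are hypothetical, and the check that a git SHA is 7 to 40 hexadecimal characters is an assumption rather than something the plan specifies.

```python
import re
from typing import Optional

# Assumption: a git SHA is 7 to 40 lowercase hexadecimal characters.
_SHA_PATTERN = re.compile(r"^[0-9a-f]{7,40}$")


def image_ref(registry: str, project: str, component: str,
              git_sha: Optional[str] = None) -> str:
    """Build <registry>/<project>/<component>:<tag>.

    With a git SHA the result suits CI; without one it falls back to
    "latest", which is meant for local development only.
    """
    if git_sha is None:
        tag = "latest"
    elif _SHA_PATTERN.match(git_sha):
        tag = git_sha
    else:
        raise ValueError(f"not a git SHA: {git_sha!r}")
    return f"{registry}/{project}/{component}:{tag}"
```

For example, `image_ref("registry.example.com", "shop", "api", "3f2a9c1")` yields `registry.example.com/shop/api:3f2a9c1`.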
Self-verification:
- Every component has a Dockerfile specification
- Multi-stage builds specified for all production images
- Non-root user for all containers
- Health checks defined for every service
- docker-compose.yml covers all components + dependencies
- docker-compose.test.yml enables black-box integration testing
- .dockerignore defined
Save action: Write containerization.md using templates/containerization.md
BLOCKING: Present containerization plan to user. Do NOT proceed until confirmed.
Step 2: CI/CD Pipeline
Role: DevOps engineer
Goal: Define the CI/CD pipeline with quality gates, security scanning, and multi-environment deployment
Constraints: Pipeline definition only; produce a YAML specification, not an implementation
- Read architecture.md for tech stack and deployment targets
- Read restrictions.md for CI/CD constraints (cloud provider, registry, etc.)
- Research CI/CD best practices for the project's platform (GitHub Actions / Azure Pipelines)
- Define pipeline stages:
| Stage | Trigger | Steps | Quality Gate |
|---|---|---|---|
| Lint | Every push | Run linters per language (black, rustfmt, prettier, dotnet format) | Zero errors |
| Test | Every push | Unit tests, integration tests, coverage report | 75%+ coverage |
| Security | Every push | Dependency audit, SAST scan (Semgrep/SonarQube), image scan (Trivy) | Zero critical/high CVEs |
| Build | PR merge to dev | Build Docker images, tag with git SHA | Build succeeds |
| Push | After build | Push to container registry | Push succeeds |
| Deploy Staging | After push | Deploy to staging environment | Health checks pass |
| Smoke Tests | After staging deploy | Run critical path tests against staging | All pass |
| Deploy Production | Manual approval | Deploy to production | Health checks pass |
- Define caching strategy: dependency caches, Docker layer caches, build artifact caches
- Define parallelization: which stages can run concurrently
- Define notifications: build failures, deployment status, security alerts
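The quality gates for the Lint, Test, and Security stages reduce to three numeric checks. The sketch below shows one way to evaluate them; `PipelineResults` and `failed_gates` are hypothetical names, and how a real pipeline collects these numbers depends on the CI platform and tools chosen.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PipelineResults:
    """Hypothetical summary of one pipeline run."""
    lint_errors: int
    coverage_percent: float
    critical_or_high_cves: int


def failed_gates(results: PipelineResults, min_coverage: float = 75.0) -> List[str]:
    """Return the names of the stages whose quality gate did not pass."""
    failures = []
    if results.lint_errors > 0:                  # Lint gate: zero errors
        failures.append("Lint")
    if results.coverage_percent < min_coverage:  # Test gate: 75%+ coverage
        failures.append("Test")
    if results.critical_or_high_cves > 0:        # Security gate: zero critical/high CVEs
        failures.append("Security")
    return failures
```

An empty list means the run may continue to the Build stage.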
Self-verification:
- All pipeline stages defined with triggers and gates
- Coverage threshold enforced (75%+)
- Security scanning included (dependencies + images + SAST)
- Caching configured for dependencies and Docker layers
- Multi-environment deployment (staging → production)
- Rollback procedure referenced
- Notifications configured
Save action: Write ci_cd_pipeline.md using templates/ci_cd_pipeline.md
Step 3: Environment Strategy
Role: Platform engineer
Goal: Define environment configuration, secrets management, and environment parity
Constraints: Strategy document; no secrets or credentials in output
- Define environments:
| Environment | Purpose | Infrastructure | Data |
|---|---|---|---|
| Development | Local developer workflow | docker-compose, local volumes | Seed data, mocks for external APIs |
| Staging | Pre-production validation | Mirrors production topology | Anonymized production-like data |
| Production | Live system | Full infrastructure | Real data |
- Define environment variable management:
  - .env.example with all required variables (no real values)
  - Per-environment variable sources (.env for dev, secret manager for staging/prod)
  - Validation: fail fast on missing required variables at startup
- Define secrets management:
  - Never commit secrets to version control
  - Development: .env files (git-ignored)
  - Staging/Production: secret manager (AWS Secrets Manager / Azure Key Vault / Vault)
  - Rotation policy
- Define database management per environment:
  - Development: Docker Postgres with named volume, seed data
  - Staging: managed Postgres, migrations applied via CI/CD
  - Production: managed Postgres, migrations require approval
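The fail-fast validation rule for environment variables can be illustrated with a short startup check. This is a sketch under assumptions: `require_env` and `MissingEnvError` are hypothetical names, the variable names in the usage note are placeholders, and an empty string is treated as missing.

```python
import os
from typing import Dict, Mapping, Optional, Sequence


class MissingEnvError(RuntimeError):
    """Raised at startup when required variables are absent."""


def require_env(names: Sequence[str],
                env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required variables, or fail listing every missing one."""
    source = os.environ if env is None else env
    # Assumption: an empty value is as unusable as an unset one.
    missing = [name for name in names if not source.get(name)]
    if missing:
        raise MissingEnvError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: source[name] for name in names}
```

Calling `require_env(["DATABASE_URL", "LOG_LEVEL"])` as the first statement of a service's entry point reports all missing names in one error instead of failing later on first use.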
Self-verification:
- All three environments defined with clear purpose
- Environment variable documentation complete (.env.example)
- No secrets in any output document
- Secret manager specified for staging/production
- Database strategy per environment
Save action: Write environment_strategy.md using templates/environment_strategy.md
Step 4: Observability
Role: Site Reliability Engineer (SRE)
Goal: Define logging, metrics, tracing, and alerting strategy
Constraints: Strategy document; describe what to implement, not how to wire it
- Read architecture.md and component specs for service boundaries
- Research observability best practices for the tech stack
Logging:
- Structured JSON to stdout/stderr (no file logging in containers)
- Fields: timestamp (ISO 8601), level, service, correlation_id, message, context
- Levels: ERROR (exceptions), WARN (degraded), INFO (business events), DEBUG (diagnostics, dev only)
- No PII in logs
- Retention: dev = console, staging = 7 days, production = 30 days
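A minimal sketch of the logging format above, using only the Python standard library. `JsonFormatter` and `build_logger` are hypothetical names, and passing `correlation_id` and `context` through the `extra=` argument of the logging calls is one possible convention, not something the plan prescribes.

```python
import json
import logging
import sys
from datetime import datetime, timezone

# Python names the level WARNING; the plan above names it WARN.
_LEVEL_NAMES = {"WARNING": "WARN"}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with the planned fields."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": created.isoformat(),  # ISO 8601, UTC
            "level": _LEVEL_NAMES.get(record.levelname, record.levelname),
            "service": self.service,
            # Both fields arrive via the `extra=` argument of logging calls.
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
            "context": getattr(record, "context", {}),
        }
        return json.dumps(payload, default=str)


def build_logger(service: str, level: int = logging.INFO) -> logging.Logger:
    """Hypothetical helper: a logger that writes JSON lines to stdout."""
    logger = logging.getLogger(service)
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter(service))
    logger.handlers = [handler]
    logger.setLevel(level)
    logger.propagate = False
    return logger
```

A call such as `build_logger("orders").info("order placed", extra={"correlation_id": "abc-123", "context": {"order_id": 42}})` writes a single JSON line to stdout. Keeping PII out of `message` and `context` remains the caller's responsibility.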
Metrics:
- Expose Prometheus-compatible /metrics endpoint per service
- System metrics: CPU, memory, disk, network
- Application metrics: request_count, request_duration (histogram), error_count, active_connections
- Business metrics: derived from acceptance criteria
- Collection interval: 15s
Distributed Tracing:
- OpenTelemetry SDK integration
- Trace context propagation via HTTP headers and message queue metadata
- Span naming: <service>.<operation>
- Sampling: 100% in dev/staging, 10% in production (adjust based on volume)
Alerting:
| Severity | Response Time | Condition Examples |
|---|---|---|
| Critical | 5 min | Service down, data loss, health check failed |
| High | 30 min | Error rate > 5%, P95 latency > 2x baseline |
| Medium | 4 hours | Disk > 80%, elevated latency |
| Low | Next business day | Non-critical warnings |
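The measurable conditions in the table can be written down as a small classifier. The sketch below covers only the rows with explicit thresholds; `ServiceSnapshot` and `classify` are hypothetical names, and vaguer conditions such as "elevated latency" and "non-critical warnings" are left out because the table gives no number for them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceSnapshot:
    """Hypothetical point-in-time measurements for one service."""
    healthy: bool             # result of the health check
    error_rate: float         # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float
    baseline_p95_ms: float
    disk_used: float          # fraction of disk in use, 0.0 to 1.0


def classify(snapshot: ServiceSnapshot) -> Optional[str]:
    """Return the highest matching severity, or None if nothing fires."""
    if not snapshot.healthy:
        return "Critical"     # service down or health check failed
    if (snapshot.error_rate > 0.05
            or snapshot.p95_latency_ms > 2 * snapshot.baseline_p95_ms):
        return "High"         # error rate > 5% or P95 > 2x baseline
    if snapshot.disk_used > 0.80:
        return "Medium"       # disk > 80%
    return None
```

Checking severities from most to least urgent means a snapshot that matches several rows is reported once, at the level with the shortest response time.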
Dashboards:
- Operations: service health, request rate, error rate, response time percentiles, resource utilization
- Business: key business metrics from acceptance criteria
Self-verification:
- Structured logging format defined with required fields
- Metrics endpoint specified per service
- OpenTelemetry tracing configured
- Alert severities with response times defined
- Dashboards cover operations and business metrics
- PII exclusion from logs addressed
Save action: Write observability.md using templates/observability.md
Step 5: Deployment Procedures
Role: DevOps / Platform engineer
Goal: Define deployment strategy, rollback procedures, health checks, and deployment checklist
Constraints: Procedures document; no implementation
- Define deployment strategy:
  - Preferred pattern: blue-green / rolling / canary (choose based on architecture)
  - Zero-downtime requirement for production
  - Graceful shutdown: 30-second grace period for in-flight requests
  - Database migration ordering: migrate before deploy, backward-compatible only
- Define health checks:
| Check | Type | Endpoint | Interval | Threshold |
|---|---|---|---|---|
| Liveness | HTTP GET | /health/live | 10s | 3 failures → restart |
| Readiness | HTTP GET | /health/ready | 5s | 3 failures → remove from LB |
| Startup | HTTP GET | /health/ready | 5s | 30 attempts max |
- Define rollback procedures:
  - Trigger criteria: health check failures, error rate spike, critical alert
  - Rollback steps: redeploy previous image tag, verify health, rollback database if needed
  - Communication: notify stakeholders during rollback
  - Post-mortem: required after every production rollback
- Define deployment checklist:
  - All tests pass in CI
  - Security scan clean (zero critical/high CVEs)
  - Database migrations reviewed and tested
  - Environment variables configured
  - Health check endpoints responding
  - Monitoring alerts configured
  - Rollback plan documented and tested
  - Stakeholders notified
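The "3 failures" thresholds in the health check table are easiest to reason about as a counter that resets on success. The sketch below assumes the failures must be consecutive, as orchestrator probes commonly work; the table does not state this, and the class name `FailureThreshold` is hypothetical.

```python
class FailureThreshold:
    """Track probe results and signal when the threshold is reached.

    Assumption: only consecutive failures count; one success resets
    the counter.
    """

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result; return True when action is due."""
        if probe_ok:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

For the liveness row a True result means "restart the container"; for readiness it means "remove from the load balancer". With a 10s interval and a threshold of 3, a dead service is restarted roughly 30 seconds after its first failed probe.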
Self-verification:
- Deployment strategy chosen and justified
- Zero-downtime approach specified
- Health checks defined (liveness, readiness, startup)
- Rollback trigger criteria and steps documented
- Deployment checklist complete
Save action: Write deployment_procedures.md using templates/deployment_procedures.md
BLOCKING: Present deployment procedures to user. Do NOT proceed until confirmed.
Escalation Rules
| Situation | Action |
|---|---|
| Unknown cloud provider or hosting | ASK user |
| Container registry not specified | ASK user |
| CI/CD platform preference unclear | ASK user — default to GitHub Actions |
| Secret manager not chosen | ASK user |
| Deployment pattern trade-offs | ASK user with recommendation |
| Missing architecture.md | STOP — run /plan first |
Common Mistakes
- Implementing during planning: this workflow produces documents, not Dockerfiles or pipeline YAML
- Hardcoding secrets: never include real credentials in deployment documents
- Ignoring integration test containerization: the test environment must be containerized alongside the app
- Skipping BLOCKING gates: never proceed past a BLOCKING marker without user confirmation
- Using :latest tags: always pin base image versions
- Forgetting observability: logging, metrics, and tracing are deployment concerns, not post-deployment additions
Methodology Quick Reference
┌────────────────────────────────────────────────────────────────┐
│ Deployment Planning (5-Step Method) │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: architecture.md + component specs exist │
│ │
│ 1. Containerization → containerization.md │
│ [BLOCKING: user confirms Docker plan] │
│ 2. CI/CD Pipeline → ci_cd_pipeline.md │
│ 3. Environment → environment_strategy.md │
│ 4. Observability → observability.md │
│ 5. Procedures → deployment_procedures.md │
│ [BLOCKING: user confirms deployment plan] │
├────────────────────────────────────────────────────────────────┤
│ Principles: Docker-first · IaC · Observability built-in │
│ Environment parity · Save immediately │
└────────────────────────────────────────────────────────────────┘