Refactor README and command documentation to streamline deployment and CI/CD processes. Consolidate deployment strategies and remove obsolete commands related to CI/CD and observability. Enhance task decomposition workflow by adding data model and deployment planning sections, and update directory structures for improved clarity.

2026-06-22 20:31:12 +00:00 · 2026-03-19 12:10:11 +02:00
parent 5b1739186e
commit cfd09c79e1
17 changed files with 1314 additions and 313 deletions
@@ -0,0 +1,363 @@
+---
+name: deploy
+description: |
+  Comprehensive deployment skill covering containerization, CI/CD pipeline, environment strategy, observability, and deployment procedures.
+  5-step workflow: Docker containerization, CI/CD pipeline definition, environment strategy, observability planning, deployment procedures.
+  Uses _docs/02_plans/deployment/ structure.
+  Trigger phrases:
+  - "deploy", "deployment", "deployment strategy"
+  - "CI/CD", "pipeline", "containerize"
+  - "observability", "monitoring", "logging"
+  - "dockerize", "docker compose"
+category: ship
+tags: [deployment, docker, ci-cd, observability, monitoring, containerization]
+disable-model-invocation: true
+---
+
+# Deployment Planning
+
+Plan and document the full deployment lifecycle: containerize the application, define CI/CD pipelines, configure environments, set up observability, and document deployment procedures.
+
+## Core Principles
+
+- **Docker-first**: every component runs in a container; local dev, integration tests, and production all use Docker
+- **Infrastructure as code**: all deployment configuration is version-controlled
+- **Observability built-in**: logging, metrics, and tracing are part of the deployment plan, not afterthoughts
+- **Environment parity**: dev, staging, and production environments mirror each other as closely as possible
+- **Save immediately**: write artifacts to disk after each step; never accumulate unsaved work
+- **Ask, don't assume**: when infrastructure constraints or preferences are unclear, ask the user
+- **Plan, don't code**: this workflow produces deployment documents and specifications, not implementation code
+
+## Context Resolution
+
+Fixed paths:
+
+- PLANS_DIR: `_docs/02_plans/`
+- DEPLOY_DIR: `_docs/02_plans/deployment/`
+- ARCHITECTURE: `_docs/02_plans/architecture.md`
+- COMPONENTS_DIR: `_docs/02_plans/components/`
+
+Announce the resolved paths to the user before proceeding.
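As an illustration of the fixed paths above, the following minimal sketch resolves them with `pathlib` and builds the announcement text. The function name `describe_paths` is hypothetical and not part of the skill; paths are assumed to be relative to the repository root.

```python
from pathlib import Path

# Fixed locations from the list above, relative to the repository root.
PLANS_DIR = Path("_docs/02_plans")
DEPLOY_DIR = PLANS_DIR / "deployment"
ARCHITECTURE = PLANS_DIR / "architecture.md"
COMPONENTS_DIR = PLANS_DIR / "components"


def describe_paths() -> str:
    """Return one 'NAME: path' line per resolved path (hypothetical helper)."""
    resolved = {
        "PLANS_DIR": PLANS_DIR,
        "DEPLOY_DIR": DEPLOY_DIR,
        "ARCHITECTURE": ARCHITECTURE,
        "COMPONENTS_DIR": COMPONENTS_DIR,
    }
    return "\n".join(f"{name}: {path.as_posix()}" for name, path in resolved.items())


if __name__ == "__main__":
    print(describe_paths())
```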
+
+## Input Specification
+
+### Required Files
+
+| File | Purpose |
+|------|---------|
+| `_docs/00_problem/problem.md` | Problem description and context |
+| `_docs/00_problem/restrictions.md` | Constraints and limitations |
+| `_docs/01_solution/solution.md` | Finalized solution |
+| `PLANS_DIR/architecture.md` | Architecture from plan skill |
+| `PLANS_DIR/components/` | Component specs |
+
+### Prerequisite Checks (BLOCKING)
+
+1. `architecture.md` exists — **STOP if missing**, run `/plan` first
+2. At least one component spec exists in `PLANS_DIR/components/` — **STOP if missing**
+3. Create DEPLOY_DIR if it does not exist
+4. If DEPLOY_DIR already contains artifacts, ask the user: **resume from last checkpoint or start fresh?**
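The blocking checks above could be sketched roughly as follows. `check_prerequisites` is a hypothetical helper, and creating DEPLOY_DIR only after both blocking checks pass is an assumption about ordering, not something the skill mandates.

```python
from pathlib import Path


def check_prerequisites(plans_dir: Path) -> list[str]:
    """Return blocking problems; an empty list means the workflow may start."""
    problems = []
    if not (plans_dir / "architecture.md").is_file():
        problems.append("architecture.md is missing: run /plan first")
    components = plans_dir / "components"
    if not components.is_dir() or not any(components.iterdir()):
        problems.append("no component specs found in components/")
    if not problems:
        # Assumption: create the output directory only once the checks pass.
        (plans_dir / "deployment").mkdir(parents=True, exist_ok=True)
    return problems
```

A caller would stop and report when the returned list is non-empty, and otherwise continue to the resume-or-restart question.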
+
+## Artifact Management
+
+### Directory Structure
+
+```
+DEPLOY_DIR/
+├── containerization.md
+├── ci_cd_pipeline.md
+├── environment_strategy.md
+├── observability.md
+└── deployment_procedures.md
+```
+
+### Save Timing
+
+| Step | Save immediately after | Filename |
+|------|------------------------|----------|
+| Step 1 | Containerization plan complete | `containerization.md` |
+| Step 2 | CI/CD pipeline defined | `ci_cd_pipeline.md` |
+| Step 3 | Environment strategy documented | `environment_strategy.md` |
+| Step 4 | Observability plan complete | `observability.md` |
+| Step 5 | Deployment procedures documented | `deployment_procedures.md` |
+
+### Resumability
+
+If DEPLOY_DIR already contains artifacts:
+
+1. List existing files and match to the save timing table
+2. Identify the last completed step
+3. Resume from the next incomplete step
+4. Inform the user which steps are being skipped
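A minimal sketch of the resume logic, assuming that the first missing artifact marks the step to resume from (the text does not say how gaps between existing artifacts should be treated). `STEP_ARTIFACTS` and `next_step` are hypothetical names.

```python
from pathlib import Path

# Artifact filenames in step order, taken from the save timing table.
STEP_ARTIFACTS = [
    "containerization.md",
    "ci_cd_pipeline.md",
    "environment_strategy.md",
    "observability.md",
    "deployment_procedures.md",
]


def next_step(deploy_dir: Path) -> int:
    """Return the 1-based step to resume from; 6 means every artifact exists."""
    for number, filename in enumerate(STEP_ARTIFACTS, start=1):
        if not (deploy_dir / filename).is_file():
            return number
    return len(STEP_ARTIFACTS) + 1
```

Steps numbered below the returned value are the ones to report to the user as skipped.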
+
+## Progress Tracking
+
+At the start of execution, create a TodoWrite list covering all five steps (1 through 5). Update each item's status as its step completes.
+
+## Workflow
+
+### Step 1: Containerization
+
+**Role**: DevOps / Platform engineer
+**Goal**: Define Docker configuration for every component, local development, and integration test environments
+**Constraints**: Plan only — no Dockerfile creation. Describe what each Dockerfile should contain.
+
+1. Read architecture.md and all component specs
+2. Read restrictions.md for infrastructure constraints
+3. Research best Docker practices for the project's tech stack (multi-stage builds, base image selection, layer optimization)
+4. For each component, define:
+   - Base image (pinned version, prefer alpine/distroless for production)
+   - Build stages (dependency install, build, production)
+   - Non-root user configuration
+   - Health check endpoint and command
+   - Exposed ports
+   - `.dockerignore` contents
+5. Define `docker-compose.yml` for local development:
+   - All application components
+   - Database (Postgres) with named volume
+   - Any message queues, caches, or external service mocks
+   - Shared network
+   - Environment variable files (`.env.dev`)
+6. Define `docker-compose.test.yml` for integration tests:
+   - Application components under test
+   - Test runner container (black-box, no internal imports)
+   - Isolated database with seed data
+   - All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit`
+7. Define image tagging strategy: `<registry>/<project>/<component>:<git-sha>` for CI, `latest` for local dev only
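To make the CI tagging scheme from item 7 concrete, here is a small sketch that assembles `<registry>/<project>/<component>:<git-sha>`. The helper name, the SHA validation rule (7 to 40 lowercase hex characters), and the example registry host are assumptions for illustration only.

```python
import re


def image_tag(registry: str, project: str, component: str, git_sha: str) -> str:
    """Build the CI image reference <registry>/<project>/<component>:<git-sha>."""
    # Assumption: accept abbreviated or full lowercase hex SHAs only,
    # which also rejects mutable tags such as "latest" in CI.
    if not re.fullmatch(r"[0-9a-f]{7,40}", git_sha):
        raise ValueError(f"not a git SHA: {git_sha!r}")
    return f"{registry}/{project}/{component}:{git_sha}"


if __name__ == "__main__":
    print(image_tag("registry.example.com", "shop", "api", "cfd09c79e1"))
```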
+
+**Self-verification**:
+- [ ] Every component has a Dockerfile specification
+- [ ] Multi-stage builds specified for all production images
+- [ ] Non-root user for all containers
+- [ ] Health checks defined for every service
+- [ ] docker-compose.yml covers all components + dependencies
+- [ ] docker-compose.test.yml enables black-box integration testing
+- [ ] `.dockerignore` defined
+
+**Save action**: Write `containerization.md` using `templates/containerization.md`
+
+**BLOCKING**: Present containerization plan to user. Do NOT proceed until confirmed.
+
+---
+
+### Step 2: CI/CD Pipeline
+
+**Role**: DevOps engineer
+**Goal**: Define the CI/CD pipeline with quality gates, security scanning, and multi-environment deployment
+**Constraints**: Pipeline definition only: specify stages, triggers, and quality gates in the document; do not write the pipeline YAML itself
+
+1. Read architecture.md for tech stack and deployment targets
+2. Read restrictions.md for CI/CD constraints (cloud provider, registry, etc.)
+3. Research CI/CD best practices for the project's platform (GitHub Actions / Azure Pipelines)
+4. Define pipeline stages:
+
+| Stage | Trigger | Steps | Quality Gate |
+|-------|---------|-------|-------------|
+| **Lint** | Every push | Run linters and formatters per language (black, rustfmt, prettier, dotnet format) | Zero errors |
+| **Test** | Every push | Unit tests, integration tests, coverage report | 75%+ coverage |
+| **Security** | Every push | Dependency audit, SAST scan (Semgrep/SonarQube), image scan (Trivy) | Zero critical/high CVEs |
+| **Build** | PR merge to dev | Build Docker images, tag with git SHA | Build succeeds |
+| **Push** | After build | Push to container registry | Push succeeds |
+| **Deploy Staging** | After push | Deploy to staging environment | Health checks pass |
+| **Smoke Tests** | After staging deploy | Run critical path tests against staging | All pass |
+| **Deploy Production** | Manual approval | Deploy to production | Health checks pass |
+
+5. Define caching strategy: dependency caches, Docker layer caches, build artifact caches
+6. Define parallelization: which stages can run concurrently
+7. Define notifications: build failures, deployment status, security alerts
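The quality gates for the stages that run on every push (Lint, Test, Security) can be expressed as a simple check. This is a sketch only: `PushResults` and `failed_gates` are hypothetical names, and a real pipeline would take these numbers from the tools named in the table.

```python
from dataclasses import dataclass


@dataclass
class PushResults:
    lint_errors: int
    coverage_percent: float
    critical_or_high_cves: int


def failed_gates(results: PushResults, min_coverage: float = 75.0) -> list[str]:
    """Return the names of the per-push stages whose quality gate failed."""
    failures = []
    if results.lint_errors > 0:  # gate: zero errors
        failures.append("Lint")
    if results.coverage_percent < min_coverage:  # gate: 75%+ coverage
        failures.append("Test")
    if results.critical_or_high_cves > 0:  # gate: zero critical/high CVEs
        failures.append("Security")
    return failures
```

An empty result means the Build stage may run once its own trigger (PR merge to dev) fires.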
+
+**Self-verification**:
+- [ ] All pipeline stages defined with triggers and gates
+- [ ] Coverage threshold enforced (75%+)
+- [ ] Security scanning included (dependencies + images + SAST)
+- [ ] Caching configured for dependencies and Docker layers
+- [ ] Multi-environment deployment (staging → production)
+- [ ] Rollback procedure referenced (defined in Step 5)
+- [ ] Notifications configured
+
+**Save action**: Write `ci_cd_pipeline.md` using `templates/ci_cd_pipeline.md`
+
+---
+
+### Step 3: Environment Strategy
+
+**Role**: Platform engineer
+**Goal**: Define environment configuration, secrets management, and environment parity
+**Constraints**: Strategy document — no secrets or credentials in output
+
+1. Define environments:
+
+| Environment | Purpose | Infrastructure | Data |
+|-------------|---------|---------------|------|
+| **Development** | Local developer workflow | docker-compose, local volumes | Seed data, mocks for external APIs |
+| **Staging** | Pre-production validation | Mirrors production topology | Anonymized production-like data |
+| **Production** | Live system | Full infrastructure | Real data |
+
+2. Define environment variable management:
+   - `.env.example` with all required variables (no real values)
+   - Per-environment variable sources (`.env` for dev, secret manager for staging/prod)
+   - Validation: fail fast on missing required variables at startup
+3. Define secrets management:
+   - Never commit secrets to version control
+   - Development: `.env` files (git-ignored)
+   - Staging/Production: secret manager (AWS Secrets Manager / Azure Key Vault / Vault)
+   - Rotation policy
+4. Define database management per environment:
+   - Development: Docker Postgres with named volume, seed data
+   - Staging: managed Postgres, migrations applied via CI/CD
+   - Production: managed Postgres, migrations require approval
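The fail-fast validation from item 2 might look like the sketch below. `MissingConfigError`, `load_required`, and the variable names in the usage are hypothetical; treating an empty string as missing is also an assumption.

```python
import os
from typing import Mapping, Sequence


class MissingConfigError(RuntimeError):
    """Raised at startup when required environment variables are absent."""


def load_required(names: Sequence[str],
                  environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required variables, or fail fast listing every missing one."""
    # Assumption: an empty value counts as missing.
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise MissingConfigError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in names}
```

Calling `load_required(["DATABASE_URL", "API_KEY"])` at process start reports all missing variables at once instead of failing on the first use of each.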
+
+**Self-verification**:
+- [ ] All three environments defined with clear purpose
+- [ ] Environment variable documentation complete (`.env.example`)
+- [ ] No secrets in any output document
+- [ ] Secret manager specified for staging/production
+- [ ] Database strategy per environment
+
+**Save action**: Write `environment_strategy.md` using `templates/environment_strategy.md`
+
+---
+
+### Step 4: Observability
+
+**Role**: Site Reliability Engineer (SRE)
+**Goal**: Define logging, metrics, tracing, and alerting strategy
+**Constraints**: Strategy document — describe what to implement, not how to wire it
+
+1. Read architecture.md and component specs for service boundaries
+2. Research observability best practices for the tech stack
+
+**Logging**:
+- Structured JSON to stdout/stderr (no file logging in containers)
+- Fields: `timestamp` (ISO 8601), `level`, `service`, `correlation_id`, `message`, `context`
+- Levels: ERROR (exceptions), WARN (degraded), INFO (business events), DEBUG (diagnostics, dev only)
+- No PII in logs
+- Retention: dev = console, staging = 7 days, production = 30 days
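One way to produce the log format above with only the standard library is sketched here. `JsonFormatter` and `build_logger` are hypothetical names, and mapping Python's `WARNING` level to `WARN` is an assumption made to match the level names listed above.

```python
import json
import logging
import sys
from datetime import datetime, timezone

# Assumption: rename Python's WARNING to WARN to match the levels listed above.
LEVEL_NAMES = {"WARNING": "WARN"}


class JsonFormatter(logging.Formatter):
    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": LEVEL_NAMES.get(record.levelname, record.levelname),
            "service": self.service,
            # Both values are supplied by callers through logging's `extra` argument.
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
            "context": getattr(record, "context", {}),
        }
        return json.dumps(entry)


def build_logger(service: str, stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes one JSON object per line to the stream."""
    logger = logging.getLogger(service)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

Usage: `build_logger("orders").info("order placed", extra={"correlation_id": "abc", "context": {"items": 2}})`. Keeping PII out of `message` and `context` remains the caller's responsibility.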
+
+**Metrics**:
+- Expose Prometheus-compatible `/metrics` endpoint per service
+- System metrics: CPU, memory, disk, network
+- Application metrics: `request_count`, `request_duration` (histogram), `error_count`, `active_connections`
+- Business metrics: derived from acceptance criteria
+- Collection interval: 15s
+
+**Distributed Tracing**:
+- OpenTelemetry SDK integration
+- Trace context propagation via HTTP headers and message queue metadata
+- Span naming: `<service>.<operation>`
+- Sampling: 100% in dev/staging, 10% in production (adjust based on volume)
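The per-environment sampling rates can be illustrated with a deterministic, trace-ID-based decision, similar in spirit to ratio-based samplers. This is a plain-Python sketch and not the OpenTelemetry SDK API; `SAMPLE_RATES` and `should_sample` are hypothetical names.

```python
# Sampling rates from the list above; production may be tuned based on volume.
SAMPLE_RATES = {"dev": 1.0, "staging": 1.0, "production": 0.1}

_MASK_64 = 2**64 - 1


def should_sample(trace_id: int, environment: str) -> bool:
    """Keep a trace when the low 64 bits of its ID fall under the rate bound.

    The decision depends only on the trace ID, so every service that sees
    the same trace makes the same choice.
    """
    bound = int(SAMPLE_RATES[environment] * 2**64)
    return (trace_id & _MASK_64) < bound
```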
+
+**Alerting**:
+
+| Severity | Response Time | Condition Examples |
+|----------|---------------|-------------------|
+| Critical | 5 min | Service down, data loss, health check failed |
+| High | 30 min | Error rate > 5%, P95 latency > 2x baseline |
+| Medium | 4 hours | Disk > 80%, elevated latency |
+| Low | Next business day | Non-critical warnings |
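The example conditions in the table translate into a small classification function. This is a sketch under assumptions: `classify_alert` is a hypothetical name, rates and disk usage are given as fractions, and only the conditions with explicit thresholds in the table are covered.

```python
from typing import Optional


def classify_alert(service_up: bool, error_rate: float, p95_ms: float,
                   baseline_p95_ms: float, disk_used: float) -> Optional[str]:
    """Return the highest matching severity, or None when nothing fires."""
    if not service_up:
        return "Critical"  # service down or health check failed
    if error_rate > 0.05 or p95_ms > 2 * baseline_p95_ms:
        return "High"      # error rate > 5% or P95 latency > 2x baseline
    if disk_used > 0.80:
        return "Medium"    # disk > 80%
    return None
```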
+
+**Dashboards**:
+- Operations: service health, request rate, error rate, response time percentiles, resource utilization
+- Business: key business metrics from acceptance criteria
+
+**Self-verification**:
+- [ ] Structured logging format defined with required fields
+- [ ] Metrics endpoint specified per service
+- [ ] OpenTelemetry tracing configured
+- [ ] Alert severities with response times defined
+- [ ] Dashboards cover operations and business metrics
+- [ ] PII exclusion from logs addressed
+
+**Save action**: Write `observability.md` using `templates/observability.md`
+
+---
+
+### Step 5: Deployment Procedures
+
+**Role**: DevOps / Platform engineer
+**Goal**: Define deployment strategy, rollback procedures, health checks, and deployment checklist
+**Constraints**: Procedures document — no implementation
+
+1. Define deployment strategy:
+   - Preferred pattern: blue-green / rolling / canary (choose based on architecture)
+   - Zero-downtime requirement for production
+   - Graceful shutdown: 30-second grace period for in-flight requests
+   - Database migration ordering: migrate before deploy, backward-compatible only
+
+2. Define health checks:
+
+| Check | Type | Endpoint | Interval | Threshold |
+|-------|------|----------|----------|-----------|
+| Liveness | HTTP GET | `/health/live` | 10s | 3 failures → restart |
+| Readiness | HTTP GET | `/health/ready` | 5s | 3 failures → remove from LB |
+| Startup | HTTP GET | `/health/ready` | 5s | 30 attempts max |
+
+3. Define rollback procedures:
+   - Trigger criteria: health check failures, error rate spike, critical alert
+   - Rollback steps: redeploy previous image tag, verify health, rollback database if needed
+   - Communication: notify stakeholders during rollback
+   - Post-mortem: required after every production rollback
+
+4. Define deployment checklist:
+   - [ ] All tests pass in CI
+   - [ ] Security scan clean (zero critical/high CVEs)
+   - [ ] Database migrations reviewed and tested
+   - [ ] Environment variables configured
+   - [ ] Health check endpoints responding
+   - [ ] Monitoring alerts configured
+   - [ ] Rollback plan documented and tested
+   - [ ] Stakeholders notified
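The failure thresholds in the health check table from item 2 can be modeled as a counter per probe. The sketch assumes that "3 failures" means three consecutive failures and that one success resets the count; `Probe` is a hypothetical type, not an orchestrator API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Probe:
    name: str
    failure_threshold: int
    action: str
    consecutive_failures: int = 0

    def record(self, healthy: bool) -> Optional[str]:
        """Record one check result; return the action once the threshold is hit."""
        if healthy:
            # Assumption: a single success resets the failure count.
            self.consecutive_failures = 0
            return None
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            return self.action
        return None


if __name__ == "__main__":
    liveness = Probe("liveness", failure_threshold=3, action="restart")
    for result in (False, False, False):
        print(liveness.record(result))
```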
+
+**Self-verification**:
+- [ ] Deployment strategy chosen and justified
+- [ ] Zero-downtime approach specified
+- [ ] Health checks defined (liveness, readiness, startup)
+- [ ] Rollback trigger criteria and steps documented
+- [ ] Deployment checklist complete
+
+**Save action**: Write `deployment_procedures.md` using `templates/deployment_procedures.md`
+
+**BLOCKING**: Present deployment procedures to user. Do NOT proceed until confirmed.
+
+---
+
+## Escalation Rules
+
+| Situation | Action |
+|-----------|--------|
+| Unknown cloud provider or hosting | **ASK user** |
+| Container registry not specified | **ASK user** |
+| CI/CD platform preference unclear | **ASK user** — default to GitHub Actions |
+| Secret manager not chosen | **ASK user** |
+| Deployment pattern trade-offs | **ASK user** with recommendation |
+| Missing architecture.md | **STOP** — run `/plan` first |
+
+## Common Mistakes
+
+- **Implementing during planning**: this workflow produces documents, not Dockerfiles or pipeline YAML
+- **Hardcoding secrets**: never include real credentials in deployment documents
+- **Ignoring integration test containerization**: the test environment must be containerized alongside the app
+- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
+- **Using `:latest` tags**: always pin base image versions and tag CI images with the git SHA; `latest` is acceptable for local development only
+- **Forgetting observability**: logging, metrics, and tracing are deployment concerns, not post-deployment additions
+
+## Methodology Quick Reference
+
+```
+┌────────────────────────────────────────────────────────────────┐
+│            Deployment Planning (5-Step Method)                   │
+├────────────────────────────────────────────────────────────────┤
+│ PREREQ: architecture.md + component specs exist                 │
+│                                                                │
+│ 1. Containerization  → containerization.md                      │
+│    [BLOCKING: user confirms Docker plan]                        │
+│ 2. CI/CD Pipeline    → ci_cd_pipeline.md                        │
+│ 3. Environment       → environment_strategy.md                  │
+│ 4. Observability     → observability.md                         │
+│ 5. Procedures        → deployment_procedures.md                 │
+│    [BLOCKING: user confirms deployment plan]                    │
+├────────────────────────────────────────────────────────────────┤
+│ Principles: Docker-first · IaC · Observability built-in         │
+│             Environment parity · Save immediately               │
+└────────────────────────────────────────────────────────────────┘
+```