mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-04-22 22:46:36 +00:00
Refactor README and command documentation to streamline deployment and CI/CD processes. Consolidate deployment strategies and remove obsolete commands related to CI/CD and observability. Enhance task decomposition workflow by adding data model and deployment planning sections, and update directory structures for improved clarity.
+40
-37
@@ -20,23 +20,20 @@
5. /implement — auto-orchestrates all tasks: batches by dependencies, launches parallel implementers, runs code review, loops until done
6. /implement-black-box-tests — E2E tests via Docker consumer app (after all tasks)
7. commit & push
6. commit & push
```

### SHIP (deploy and operate)

```
8. /implement-cicd — validate/enhance CI/CD pipeline
9. /deploy — deployment strategy per environment
10. /observability — monitoring, logging, alerting plan
7. /deploy — containerization, CI/CD, environment strategy, observability, deployment procedures (skill, 5-step workflow)
```

### EVOLVE (maintenance and improvement)

```
11. /refactor — structured refactoring (skill, 6-phase workflow)
8. /refactor — structured refactoring (skill, 6-phase workflow)
9. /retrospective — collect metrics, analyze trends, produce improvement report (skill)
```

## Implementation Flow

@@ -67,29 +64,25 @@ Multi-phase code review invoked after each implementation batch:

Produces structured findings with severity (Critical/High/Medium/Low) and verdict (PASS/FAIL/PASS_WITH_WARNINGS).

### `/implement-black-box-tests`

Reads `_docs/02_plans/integration_tests/` (produced by plan skill Step 1). Builds a separate Docker-based consumer app that exercises the system as a black box — no internal imports, no direct DB access. Runs E2E scenarios, produces a CSV test report.

Run after all tasks are done.

### `/implement-cicd`

Reviews existing CI/CD pipeline configuration, validates all stages work, optimizes performance (parallelization, caching), ensures quality gates are enforced (coverage, linting, security scanning).

Run after `/implement` or after all tasks.

### `/deploy`

Defines deployment strategy per environment: deployment procedures, rollback procedures, health checks, deployment checklist. Outputs `_docs/02_components/deployment_strategy.md`.
Comprehensive deployment skill (5-step workflow):

Run before first production release.
1. Containerization — Dockerfiles per component, docker-compose for dev and tests
2. CI/CD Pipeline — lint, test, security scan, build, deploy with quality gates
3. Environment Strategy — dev, staging, production with secrets management
4. Observability — structured logging, metrics, tracing, alerting, dashboards
5. Deployment Procedures — rollback, health checks, graceful shutdown

### `/observability`
Outputs to `_docs/02_plans/deployment/`. Run after `/implement` or before first production release.

Plans logging strategy, metrics collection, distributed tracing, alerting rules, and dashboards. Outputs `_docs/02_components/observability_plan.md`.

### `/retrospective`

Run before first production release.
Collects metrics from batch reports and code review findings, analyzes trends across implementation cycles, and produces improvement reports. Outputs to `_docs/05_metrics/`.

### `/rollback`

Reverts implementation to a specific batch checkpoint using git revert, resets Jira ticket statuses, and verifies rollback integrity with tests.

### Commit
@@ -106,6 +99,8 @@ After each confirmed batch, the `/implement` skill automatically commits and pus

| **code-review** | "code review", "review code" | 6-phase structured review with findings |
| **refactor** | "refactor", "refactoring", "improve code" | 6-phase structured refactoring workflow |
| **security** | "security audit", "OWASP" | OWASP-based security testing |
| **deploy** | "deploy", "CI/CD", "containerize", "observability" | Containerization, CI/CD, observability, deployment procedures |
| **retrospective** | "retrospective", "retro", "metrics review" | Collect metrics, analyze trends, produce improvement report |

## Project Folder Structure
@@ -134,6 +129,7 @@ _docs/
├── 02_plans/
│   ├── architecture.md
│   ├── system-flows.md
│   ├── data_model.md
│   ├── risk_mitigations.md
│   ├── components/
│   │   └── [##]_[name]/
@@ -146,6 +142,12 @@ _docs/
│   │   ├── functional_tests.md
│   │   ├── non_functional_tests.md
│   │   └── traceability_matrix.md
│   ├── deployment/
│   │   ├── containerization.md
│   │   ├── ci_cd_pipeline.md
│   │   ├── environment_strategy.md
│   │   ├── observability.md
│   │   └── deployment_procedures.md
│   ├── diagrams/
│   └── FINAL_report.md
├── 02_tasks/
@@ -158,15 +160,17 @@ _docs/
│   ├── batch_02_report.md
│   ├── ...
│   └── FINAL_implementation_report.md
└── 04_refactoring/
    ├── baseline_metrics.md
    ├── discovery/
    ├── analysis/
    ├── test_specs/
    ├── coupling_analysis.md
    ├── execution_log.md
    ├── hardening/
    └── FINAL_report.md
├── 04_refactoring/
│   ├── baseline_metrics.md
│   ├── discovery/
│   ├── analysis/
│   ├── test_specs/
│   ├── coupling_analysis.md
│   ├── execution_log.md
│   ├── hardening/
│   └── FINAL_report.md
└── 05_metrics/
    └── retro_[date].md
```
## Implementation Tools

@@ -176,10 +180,9 @@ _docs/

| `implementer` | Subagent | Implements a single task from its spec. Launched by /implement. |
| `/implement` | Skill | Orchestrates all tasks: dependency batching, parallel agents, code review. |
| `/code-review` | Skill | Multi-phase code review with structured findings. |
| `/implement-black-box-tests` | Command | E2E tests via Docker consumer app. After all tasks. |
| `/implement-cicd` | Command | Validate and enhance CI/CD pipeline. |
| `/deploy` | Command | Plan deployment strategy per environment. |
| `/observability` | Command | Plan logging, metrics, tracing, alerting. |
| `/deploy` | Skill | Containerization, CI/CD, observability, deployment procedures. |
| `/retrospective` | Skill | Collect metrics, analyze trends, produce improvement reports. |
| `/rollback` | Command | Revert to a batch checkpoint with Jira status reset. |

## Automations (Planned)
@@ -1,71 +0,0 @@
# Deployment Strategy Planning

## Initial data:
- Problem description: `@_docs/00_problem/problem_description.md`
- Restrictions: `@_docs/00_problem/restrictions.md`
- Full Solution Description: `@_docs/01_solution/solution.md`
- Components: `@_docs/02_components`
- Environment Strategy: `@_docs/00_templates/environment_strategy.md`

## Role
You are a DevOps/Platform engineer

## Task
- Define deployment strategy for each environment
- Plan deployment procedures and automation
- Define rollback procedures
- Establish deployment verification steps
- Document manual intervention points

## Output

### Deployment Architecture
- Infrastructure diagram (where components run)
- Network topology
- Load balancing strategy
- Container/VM configuration

### Deployment Procedures

#### Staging Deployment
- Trigger conditions
- Pre-deployment checks
- Deployment steps
- Post-deployment verification
- Smoke tests to run

#### Production Deployment
- Approval workflow
- Deployment window
- Pre-deployment checks
- Deployment steps (blue-green, rolling, canary)
- Post-deployment verification
- Smoke tests to run

### Rollback Procedures
- Rollback trigger criteria
- Rollback steps per environment
- Data rollback considerations
- Communication plan during rollback

### Health Checks
- Liveness probe configuration
- Readiness probe configuration
- Custom health endpoints

### Deployment Checklist
- [ ] All tests pass in CI
- [ ] Security scan clean
- [ ] Database migrations reviewed
- [ ] Feature flags configured
- [ ] Monitoring alerts configured
- [ ] Rollback plan documented
- [ ] Stakeholders notified

Store output to `_docs/02_components/deployment_strategy.md`

## Notes
- Prefer automated deployments over manual
- Zero-downtime deployments for production
- Always have a rollback plan
- Ask questions about infrastructure constraints
@@ -1,64 +0,0 @@
# CI/CD Pipeline Validation & Enhancement

## Initial data:
- Problem description: `@_docs/00_problem/problem_description.md`
- Restrictions: `@_docs/00_problem/restrictions.md`
- Full Solution Description: `@_docs/01_solution/solution.md`
- Components: `@_docs/02_components`
- Environment Strategy: `@_docs/00_templates/environment_strategy.md`

## Role
You are a DevOps engineer

## Task
- Review existing CI/CD pipeline configuration
- Validate all stages are working correctly
- Optimize pipeline performance (parallelization, caching)
- Ensure test coverage gates are enforced
- Verify security scanning is properly configured
- Add missing quality gates

## Checklist

### Pipeline Health
- [ ] All stages execute successfully
- [ ] Build time is acceptable (<10 min for most projects)
- [ ] Caching is properly configured (dependencies, build artifacts)
- [ ] Parallel execution where possible

### Quality Gates
- [ ] Code coverage threshold enforced (minimum 75%)
- [ ] Linting errors block merge
- [ ] Security vulnerabilities block merge (critical/high)
- [ ] All tests must pass

### Environment Deployments
- [ ] Staging deployment works on merge to stage branch
- [ ] Environment variables properly configured per environment
- [ ] Secrets are securely managed (not in code)
- [ ] Rollback procedure documented

### Monitoring
- [ ] Build notifications configured (Slack, email, etc.)
- [ ] Failed build alerts
- [ ] Deployment success/failure notifications

## Output

### Pipeline Status Report
- Current pipeline configuration summary
- Issues found and fixes applied
- Performance metrics (build times)

### Recommended Improvements
- Short-term improvements
- Long-term optimizations

### Quality Gate Configuration
- Thresholds configured
- Enforcement rules

## Notes
- Do not break existing functionality
- Test changes in separate branch first
- Document any manual steps required
@@ -1,122 +0,0 @@
# Observability Planning

## Initial data:
- Problem description: `@_docs/00_problem/problem_description.md`
- Full Solution Description: `@_docs/01_solution/solution.md`
- Components: `@_docs/02_components`
- Deployment Strategy: `@_docs/02_components/deployment_strategy.md`

## Role
You are a Site Reliability Engineer (SRE)

## Task
- Define logging strategy across all components
- Plan metrics collection and dashboards
- Design distributed tracing (if applicable)
- Establish alerting rules
- Document incident response procedures

## Output

### Logging Strategy

#### Log Levels
| Level | Usage | Example |
|-------|-------|---------|
| ERROR | Exceptions, failures requiring attention | Database connection failed |
| WARN | Potential issues, degraded performance | Retry attempt 2/3 |
| INFO | Significant business events | User registered, Order placed |
| DEBUG | Detailed diagnostic information | Request payload, Query params |

#### Log Format
```json
{
  "timestamp": "ISO8601",
  "level": "INFO",
  "service": "service-name",
  "correlation_id": "uuid",
  "message": "Event description",
  "context": {}
}
```
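
A minimal Python `logging` formatter that emits this shape is sketched below. This is an illustration, not part of the plan itself; how `correlation_id` and `context` get attached to records (here via `extra=`) is an assumption.

```python
import json
import logging
from datetime import datetime, timezone

class JsonLogFormatter(logging.Formatter):
    """Render log records as single-line JSON matching the log format above."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "service": self.service,
            # Assumed to be attached per-request, e.g. via extra= or a logging filter
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
            "context": getattr(record, "context", {}),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonLogFormatter(service="service-name"))
logger = logging.getLogger("service-name")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("User registered", extra={"correlation_id": "uuid", "context": {"user_id": 42}})
```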

#### Log Storage
- Development: Console/file
- Staging: Centralized (ELK, CloudWatch, etc.)
- Production: Centralized with retention policy

### Metrics

#### System Metrics
- CPU usage
- Memory usage
- Disk I/O
- Network I/O

#### Application Metrics
| Metric | Type | Description |
|--------|------|-------------|
| request_count | Counter | Total requests |
| request_duration | Histogram | Response time |
| error_count | Counter | Failed requests |
| active_connections | Gauge | Current connections |

#### Business Metrics
- [Define based on acceptance criteria]

### Distributed Tracing

#### Trace Context
- Correlation ID propagation
- Span naming conventions
- Sampling strategy

#### Integration Points
- HTTP headers
- Message queue metadata
- Database query tagging

### Alerting

#### Alert Categories
| Severity | Response Time | Examples |
|----------|---------------|----------|
| Critical | 5 min | Service down, Data loss |
| High | 30 min | High error rate, Performance degradation |
| Medium | 4 hours | Elevated latency, Disk usage high |
| Low | Next business day | Non-critical warnings |

#### Alert Rules
```yaml
alerts:
  - name: high_error_rate
    condition: error_rate > 5%
    duration: 5m
    severity: high

  - name: service_down
    condition: health_check_failed
    duration: 1m
    severity: critical
```
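
The `duration` field means the condition must hold continuously before the alert fires. A sketch of that sustained-threshold semantics (class and method names are illustrative, not from any alerting product):

```python
class SustainedAlert:
    """Fire only once a condition has been continuously true for `duration_s` seconds."""

    def __init__(self, name: str, duration_s: float):
        self.name = name
        self.duration_s = duration_s
        self.breach_started = None  # timestamp when the condition first became true

    def observe(self, condition_true: bool, now: float) -> bool:
        if not condition_true:
            # Condition cleared: reset the breach window
            self.breach_started = None
            return False
        if self.breach_started is None:
            self.breach_started = now
        return (now - self.breach_started) >= self.duration_s

# high_error_rate: error_rate > 5% sustained for 5 minutes (300 s)
alert = SustainedAlert("high_error_rate", duration_s=300)
samples = [(0, 0.08), (150, 0.09), (300, 0.07)]
fired = [alert.observe(rate > 0.05, now=t) for t, rate in samples]  # fires on the third sample
```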

### Dashboards

#### Operations Dashboard
- Service health status
- Request rate and error rate
- Response time percentiles
- Resource utilization

#### Business Dashboard
- Key business metrics
- User activity
- Transaction volumes

Store output to `_docs/02_components/observability_plan.md`

## Notes
- Follow the principle: "If it's not monitored, it's not in production"
- Balance verbosity with cost
- Ensure PII is not logged
- Plan for log rotation and retention
@@ -0,0 +1,54 @@
# Implementation Rollback

## Role
You are a DevOps engineer performing a controlled rollback of implementation batches.

## Input
- User specifies a target batch number or commit hash to roll back to
- If not specified, present the list of available batch checkpoints and ask

## Process

1. Read `_docs/03_implementation/batch_*_report.md` files to identify all batch checkpoints with their commit hashes
2. Present batch list to user with: batch number, date, tasks included, commit hash
3. Determine which commits need to be reverted (all commits after the target batch)
4. For each commit to revert (in reverse chronological order):
   - Run `git revert <commit-hash> --no-edit`
   - Verify no merge conflicts; if conflicts occur, ask user for resolution
5. Run the full test suite to verify rollback integrity
6. If tests fail, report failures and ask user how to proceed
7. Reset Jira ticket statuses for all reverted tasks back to "To Do" via Jira MCP
8. Commit the revert with message: `[ROLLBACK] Reverted to batch [N]: [task list]`
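
The revert loop in step 4 can be sketched as a small script. This is a hedged illustration of the `git revert` call and the stop-on-conflict rule, not the skill's actual implementation; the caller is assumed to supply commit hashes newest-first (from step 3):

```python
import subprocess

def revert_commits(commit_hashes: list[str]) -> list[str]:
    """Revert the given commits newest-first, stopping at the first conflict.

    Returns the hashes successfully reverted. A non-zero `git revert` exit
    (e.g. a merge conflict) halts the loop, leaving resolution to the user.
    """
    reverted = []
    for sha in commit_hashes:
        result = subprocess.run(
            ["git", "revert", sha, "--no-edit"],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            print(f"Conflict reverting {sha}: {result.stderr.strip()}")
            break
        reverted.append(sha)
    return reverted
```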

## Output

Write `_docs/03_implementation/rollback_report.md`:

```markdown
# Rollback Report

**Date**: [YYYY-MM-DD]
**Target**: Batch [N] (commit [hash])
**Reverted Batches**: [list]

## Reverted Tasks

| Task | Batch | Status Before | Status After |
|------|-------|--------------|-------------|
| [JIRA-ID] | [batch #] | In Testing | To Do |

## Test Results
- [pass/fail count]

## Jira Updates
- [list of ticket transitions]

## Notes
- [any conflicts, manual steps, or issues encountered]
```

## Safety Rules
- Never force-push; always use `git revert` to preserve history
- Always run tests after rollback
- Always update Jira statuses for reverted tasks
- If rollback fails midway, stop and report — do not leave the codebase in a partial state
@@ -0,0 +1,8 @@
---
description: "Git workflow: work on dev branch, commit message format with Jira IDs"
alwaysApply: true
---
# Git Workflow

- Work on the `dev` branch
- Commit message format: `[JIRA-ID-1] [JIRA-ID-2] Summary of changes`
@@ -124,15 +124,35 @@ At the start of execution, create a TodoWrite with all applicable steps. Update

**Goal**: Produce `01_initial_structure.md` — the first task describing the project skeleton
**Constraints**: This is a plan document, not code. The `/implement` skill executes it.

1. Read architecture.md, all component specs, and system-flows.md from PLANS_DIR
1. Read architecture.md, all component specs, system-flows.md, data_model.md, and `deployment/` from PLANS_DIR
2. Read problem, solution, and restrictions from `_docs/00_problem/` and `_docs/01_solution/`
3. Research best implementation patterns for the identified tech stack
4. Document the structure plan using `templates/initial-structure-task.md`

The bootstrap structure plan must include:
- Project folder layout with all component directories
- Shared models, interfaces, and DTOs
- Dockerfile per component (multi-stage, non-root, health checks, pinned base images)
- `docker-compose.yml` for local development (all components + database + dependencies)
- `docker-compose.test.yml` for integration test environment (black-box test runner)
- `.dockerignore`
- CI/CD pipeline file (`.github/workflows/ci.yml` or `azure-pipelines.yml`) with stages from `deployment/ci_cd_pipeline.md`
- Database migration setup and initial seed data scripts
- Observability configuration: structured logging setup, health check endpoints (`/health/live`, `/health/ready`), metrics endpoint (`/metrics`)
- Environment variable documentation (`.env.example`)
- Test structure with unit and integration test locations
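
The `/health/live` vs `/health/ready` split above: liveness says the process is up; readiness says its dependencies are reachable. A minimal stdlib sketch of that distinction (the handler and the dependency check are illustrative, not the plan's prescribed implementation):

```python
import json
from http.server import BaseHTTPRequestHandler

def check_dependencies() -> dict[str, bool]:
    # Illustrative: a real service would ping its database, queue, etc.
    return {"database": True, "queue": True}

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health/live":
            # Liveness: the process is running and able to serve this request
            body, status = {"status": "alive"}, 200
        elif self.path == "/health/ready":
            # Readiness: all downstream dependencies are reachable
            deps = check_dependencies()
            ok = all(deps.values())
            body = {"status": "ready" if ok else "not_ready", "checks": deps}
            status = 200 if ok else 503
        else:
            body, status = {"error": "not found"}, 404
        payload = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```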

**Self-verification**:
- [ ] All components have corresponding folders in the layout
- [ ] All inter-component interfaces have DTOs defined
- [ ] CI/CD stages cover build, lint, test, security, deploy
- [ ] Dockerfile defined for each component
- [ ] `docker-compose.yml` covers all components and dependencies
- [ ] `docker-compose.test.yml` enables black-box integration testing
- [ ] CI/CD pipeline file defined with lint, test, security, build, deploy stages
- [ ] Database migration setup included
- [ ] Health check endpoints specified for each service
- [ ] Structured logging configuration included
- [ ] `.env.example` with all required environment variables
- [ ] Environment strategy covers dev, staging, production
- [ ] Test structure includes unit and integration test locations
@@ -0,0 +1,363 @@
---
name: deploy
description: |
  Comprehensive deployment skill covering containerization, CI/CD pipeline, environment strategy, observability, and deployment procedures.
  5-step workflow: Docker containerization, CI/CD pipeline definition, environment strategy, observability planning, deployment procedures.
  Uses _docs/02_plans/deployment/ structure.
  Trigger phrases:
  - "deploy", "deployment", "deployment strategy"
  - "CI/CD", "pipeline", "containerize"
  - "observability", "monitoring", "logging"
  - "dockerize", "docker compose"
category: ship
tags: [deployment, docker, ci-cd, observability, monitoring, containerization]
disable-model-invocation: true
---

# Deployment Planning

Plan and document the full deployment lifecycle: containerize the application, define CI/CD pipelines, configure environments, set up observability, and document deployment procedures.

## Core Principles

- **Docker-first**: every component runs in a container; local dev, integration tests, and production all use Docker
- **Infrastructure as code**: all deployment configuration is version-controlled
- **Observability built-in**: logging, metrics, and tracing are part of the deployment plan, not afterthoughts
- **Environment parity**: dev, staging, and production environments mirror each other as closely as possible
- **Save immediately**: write artifacts to disk after each step; never accumulate unsaved work
- **Ask, don't assume**: when infrastructure constraints or preferences are unclear, ask the user
- **Plan, don't code**: this workflow produces deployment documents and specifications, not implementation code

## Context Resolution

Fixed paths:

- PLANS_DIR: `_docs/02_plans/`
- DEPLOY_DIR: `_docs/02_plans/deployment/`
- ARCHITECTURE: `_docs/02_plans/architecture.md`
- COMPONENTS_DIR: `_docs/02_plans/components/`

Announce the resolved paths to the user before proceeding.

## Input Specification

### Required Files

| File | Purpose |
|------|---------|
| `_docs/00_problem/problem.md` | Problem description and context |
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
| `_docs/01_solution/solution.md` | Finalized solution |
| `PLANS_DIR/architecture.md` | Architecture from plan skill |
| `PLANS_DIR/components/` | Component specs |

### Prerequisite Checks (BLOCKING)

1. `architecture.md` exists — **STOP if missing**, run `/plan` first
2. At least one component spec exists in `PLANS_DIR/components/` — **STOP if missing**
3. Create DEPLOY_DIR if it does not exist
4. If DEPLOY_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?**

## Artifact Management

### Directory Structure

```
DEPLOY_DIR/
├── containerization.md
├── ci_cd_pipeline.md
├── environment_strategy.md
├── observability.md
└── deployment_procedures.md
```

### Save Timing

| Step | Save immediately after | Filename |
|------|------------------------|----------|
| Step 1 | Containerization plan complete | `containerization.md` |
| Step 2 | CI/CD pipeline defined | `ci_cd_pipeline.md` |
| Step 3 | Environment strategy documented | `environment_strategy.md` |
| Step 4 | Observability plan complete | `observability.md` |
| Step 5 | Deployment procedures documented | `deployment_procedures.md` |

### Resumability

If DEPLOY_DIR already contains artifacts:

1. List existing files and match to the save timing table
2. Identify the last completed step
3. Resume from the next incomplete step
4. Inform the user which steps are being skipped

## Progress Tracking

At the start of execution, create a TodoWrite with all steps (1 through 5). Update status as each step completes.

## Workflow

### Step 1: Containerization

**Role**: DevOps / Platform engineer
**Goal**: Define Docker configuration for every component, local development, and integration test environments
**Constraints**: Plan only — no Dockerfile creation. Describe what each Dockerfile should contain.

1. Read architecture.md and all component specs
2. Read restrictions.md for infrastructure constraints
3. Research best Docker practices for the project's tech stack (multi-stage builds, base image selection, layer optimization)
4. For each component, define:
   - Base image (pinned version, prefer alpine/distroless for production)
   - Build stages (dependency install, build, production)
   - Non-root user configuration
   - Health check endpoint and command
   - Exposed ports
   - `.dockerignore` contents
5. Define `docker-compose.yml` for local development:
   - All application components
   - Database (Postgres) with named volume
   - Any message queues, caches, or external service mocks
   - Shared network
   - Environment variable files (`.env.dev`)
6. Define `docker-compose.test.yml` for integration tests:
   - Application components under test
   - Test runner container (black-box, no internal imports)
   - Isolated database with seed data
   - All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit`
7. Define image tagging strategy: `<registry>/<project>/<component>:<git-sha>` for CI, `latest` for local dev only

**Self-verification**:
- [ ] Every component has a Dockerfile specification
- [ ] Multi-stage builds specified for all production images
- [ ] Non-root user for all containers
- [ ] Health checks defined for every service
- [ ] docker-compose.yml covers all components + dependencies
- [ ] docker-compose.test.yml enables black-box integration testing
- [ ] `.dockerignore` defined

**Save action**: Write `containerization.md` using `templates/containerization.md`

**BLOCKING**: Present containerization plan to user. Do NOT proceed until confirmed.

---

### Step 2: CI/CD Pipeline

**Role**: DevOps engineer
**Goal**: Define the CI/CD pipeline with quality gates, security scanning, and multi-environment deployment
**Constraints**: Pipeline definition only — produce YAML specification, not implementation

1. Read architecture.md for tech stack and deployment targets
2. Read restrictions.md for CI/CD constraints (cloud provider, registry, etc.)
3. Research CI/CD best practices for the project's platform (GitHub Actions / Azure Pipelines)
4. Define pipeline stages:

| Stage | Trigger | Steps | Quality Gate |
|-------|---------|-------|-------------|
| **Lint** | Every push | Run linters per language (black, rustfmt, prettier, dotnet format) | Zero errors |
| **Test** | Every push | Unit tests, integration tests, coverage report | 75%+ coverage |
| **Security** | Every push | Dependency audit, SAST scan (Semgrep/SonarQube), image scan (Trivy) | Zero critical/high CVEs |
| **Build** | PR merge to dev | Build Docker images, tag with git SHA | Build succeeds |
| **Push** | After build | Push to container registry | Push succeeds |
| **Deploy Staging** | After push | Deploy to staging environment | Health checks pass |
| **Smoke Tests** | After staging deploy | Run critical path tests against staging | All pass |
| **Deploy Production** | Manual approval | Deploy to production | Health checks pass |

5. Define caching strategy: dependency caches, Docker layer caches, build artifact caches
6. Define parallelization: which stages can run concurrently
7. Define notifications: build failures, deployment status, security alerts

**Self-verification**:
- [ ] All pipeline stages defined with triggers and gates
- [ ] Coverage threshold enforced (75%+)
- [ ] Security scanning included (dependencies + images + SAST)
- [ ] Caching configured for dependencies and Docker layers
- [ ] Multi-environment deployment (staging → production)
- [ ] Rollback procedure referenced
- [ ] Notifications configured

**Save action**: Write `ci_cd_pipeline.md` using `templates/ci_cd_pipeline.md`

---

### Step 3: Environment Strategy

**Role**: Platform engineer
**Goal**: Define environment configuration, secrets management, and environment parity
**Constraints**: Strategy document — no secrets or credentials in output

1. Define environments:

| Environment | Purpose | Infrastructure | Data |
|-------------|---------|---------------|------|
| **Development** | Local developer workflow | docker-compose, local volumes | Seed data, mocks for external APIs |
| **Staging** | Pre-production validation | Mirrors production topology | Anonymized production-like data |
| **Production** | Live system | Full infrastructure | Real data |

2. Define environment variable management:
   - `.env.example` with all required variables (no real values)
   - Per-environment variable sources (`.env` for dev, secret manager for staging/prod)
   - Validation: fail fast on missing required variables at startup
3. Define secrets management:
   - Never commit secrets to version control
   - Development: `.env` files (git-ignored)
   - Staging/Production: secret manager (AWS Secrets Manager / Azure Key Vault / Vault)
   - Rotation policy
4. Define database management per environment:
   - Development: Docker Postgres with named volume, seed data
   - Staging: managed Postgres, migrations applied via CI/CD
   - Production: managed Postgres, migrations require approval
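
The fail-fast validation from step 2 can be sketched as below. The variable names are examples only; the real list would come from the project's `.env.example`:

```python
import os

# Example names; the authoritative list lives in .env.example
REQUIRED_VARS = ["DATABASE_URL", "LOG_LEVEL"]

def validate_env(environ=None) -> None:
    """Fail fast at startup if any required environment variable is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
```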

**Self-verification**:
- [ ] All three environments defined with clear purpose
- [ ] Environment variable documentation complete (`.env.example`)
- [ ] No secrets in any output document
- [ ] Secret manager specified for staging/production
- [ ] Database strategy per environment

**Save action**: Write `environment_strategy.md` using `templates/environment_strategy.md`

---

### Step 4: Observability

**Role**: Site Reliability Engineer (SRE)
**Goal**: Define logging, metrics, tracing, and alerting strategy
**Constraints**: Strategy document — describe what to implement, not how to wire it

1. Read architecture.md and component specs for service boundaries
2. Research observability best practices for the tech stack

**Logging**:
- Structured JSON to stdout/stderr (no file logging in containers)
- Fields: `timestamp` (ISO 8601), `level`, `service`, `correlation_id`, `message`, `context`
- Levels: ERROR (exceptions), WARN (degraded), INFO (business events), DEBUG (diagnostics, dev only)
- No PII in logs
- Retention: dev = console, staging = 7 days, production = 30 days

**Metrics**:
- Expose Prometheus-compatible `/metrics` endpoint per service
- System metrics: CPU, memory, disk, network
- Application metrics: `request_count`, `request_duration` (histogram), `error_count`, `active_connections`
- Business metrics: derived from acceptance criteria
- Collection interval: 15s
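
To make the `/metrics` bullet concrete, here is a toy, dependency-free rendering of `request_count` in the Prometheus text exposition format; a real service would use an official client library instead:

```python
from collections import Counter

# Toy in-memory counter keyed by (method, path, status) label sets
requests = Counter()

def observe(method: str, path: str, status: int) -> None:
    requests[(method, path, status)] += 1

def render_metrics() -> str:
    """Render request_count in the Prometheus text exposition format."""
    lines = ["# TYPE request_count counter"]
    for (method, path, status), value in sorted(requests.items()):
        lines.append(
            f'request_count{{method="{method}",path="{path}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"

observe("GET", "/users", 200)
observe("GET", "/users", 200)
observe("POST", "/users", 201)
```

Serving the output of `render_metrics()` at `/metrics` is all a scraper needs for counters; histograms and gauges add more line types but follow the same format.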

**Distributed Tracing**:
- OpenTelemetry SDK integration
- Trace context propagation via HTTP headers and message queue metadata
- Span naming: `<service>.<operation>`
- Sampling: 100% in dev/staging, 10% in production (adjust based on volume)
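
The span-naming and propagation bullets can be illustrated with a toy tracer — this is not the OpenTelemetry API, just a sketch of the `<service>.<operation>` convention and header-based context propagation (the `x-trace-id` header name is a stand-in for W3C Trace Context):

```python
import uuid
from contextlib import contextmanager

spans = []  # collected (name, trace_id) pairs — a stand-in for an exporter

@contextmanager
def span(service: str, operation: str, headers: dict):
    """Open a span named <service>.<operation>, reusing any propagated trace id."""
    trace_id = headers.get("x-trace-id") or uuid.uuid4().hex
    headers["x-trace-id"] = trace_id  # propagate downstream via HTTP headers
    spans.append((f"{service}.{operation}", trace_id))
    yield trace_id

# Simulate one request crossing two services with a shared trace id
carrier = {}
with span("api", "get_user", carrier):
    with span("db", "select_user", carrier):
        pass
```

Because both spans read the same carrier, they share one trace id, which is what lets a tracing backend stitch the cross-service call into a single trace.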

**Alerting**:

| Severity | Response Time | Condition Examples |
|----------|---------------|-------------------|
| Critical | 5 min | Service down, data loss, health check failed |
| High | 30 min | Error rate > 5%, P95 latency > 2x baseline |
| Medium | 4 hours | Disk > 80%, elevated latency |
| Low | Next business day | Non-critical warnings |
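
The severity table above can drive alert routing directly; a sketch follows, where the channel names are placeholders and "next business day" is approximated as one day:

```python
from datetime import timedelta

# Response-time budget and notification channel per severity, mirroring the table
SEVERITIES = {
    "critical": (timedelta(minutes=5), "pagerduty"),
    "high": (timedelta(minutes=30), "slack+email"),
    "medium": (timedelta(hours=4), "slack"),
    "low": (timedelta(days=1), "dashboard"),  # approximation of "next business day"
}

def route(severity: str) -> tuple:
    """Return the (response budget, channel) pair for an alert severity."""
    return SEVERITIES[severity]
```

Keeping the mapping in one table makes it easy to verify that every severity has both a budget and a channel.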

**Dashboards**:
- Operations: service health, request rate, error rate, response time percentiles, resource utilization
- Business: key business metrics from acceptance criteria

**Self-verification**:
- [ ] Structured logging format defined with required fields
- [ ] Metrics endpoint specified per service
- [ ] OpenTelemetry tracing configured
- [ ] Alert severities with response times defined
- [ ] Dashboards cover operations and business metrics
- [ ] PII exclusion from logs addressed

**Save action**: Write `observability.md` using `templates/observability.md`

---

### Step 5: Deployment Procedures

**Role**: DevOps / Platform engineer
**Goal**: Define deployment strategy, rollback procedures, health checks, and deployment checklist
**Constraints**: Procedures document — no implementation

1. Define deployment strategy:
   - Preferred pattern: blue-green / rolling / canary (choose based on architecture)
   - Zero-downtime requirement for production
   - Graceful shutdown: 30-second grace period for in-flight requests
   - Database migration ordering: migrate before deploy, backward-compatible only

2. Define health checks:

   | Check | Type | Endpoint | Interval | Threshold |
   |-------|------|----------|----------|-----------|
   | Liveness | HTTP GET | `/health/live` | 10s | 3 failures → restart |
   | Readiness | HTTP GET | `/health/ready` | 5s | 3 failures → remove from LB |
   | Startup | HTTP GET | `/health/ready` | 5s | 30 attempts max |

3. Define rollback procedures:
   - Trigger criteria: health check failures, error rate spike, critical alert
   - Rollback steps: redeploy previous image tag, verify health, rollback database if needed
   - Communication: notify stakeholders during rollback
   - Post-mortem: required after every production rollback

4. Define deployment checklist:
   - [ ] All tests pass in CI
   - [ ] Security scan clean (zero critical/high CVEs)
   - [ ] Database migrations reviewed and tested
   - [ ] Environment variables configured
   - [ ] Health check endpoints responding
   - [ ] Monitoring alerts configured
   - [ ] Rollback plan documented and tested
   - [ ] Stakeholders notified
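
The liveness/readiness split in item 2 above comes down to two handlers: liveness reports only that the process is up, while readiness also checks dependencies. A minimal sketch, with a hypothetical stub in place of a real dependency probe:

```python
def check_database() -> bool:
    """Stub dependency probe; a real service would ping Postgres here."""
    return True

def health_live() -> tuple:
    # Liveness: the process is up and able to answer — no dependency checks,
    # so a broken database never causes a restart loop
    return 200, "alive"

def health_ready(probes=(check_database,)) -> tuple:
    # Readiness: return 503 so the load balancer stops routing traffic
    # until every dependency probe passes again
    if all(probe() for probe in probes):
        return 200, "ready"
    return 503, "not ready"
```

Wiring these to `/health/live` and `/health/ready` gives the orchestrator the two distinct signals the table assumes.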

**Self-verification**:
- [ ] Deployment strategy chosen and justified
- [ ] Zero-downtime approach specified
- [ ] Health checks defined (liveness, readiness, startup)
- [ ] Rollback trigger criteria and steps documented
- [ ] Deployment checklist complete

**Save action**: Write `deployment_procedures.md` using `templates/deployment_procedures.md`

**BLOCKING**: Present deployment procedures to user. Do NOT proceed until confirmed.

---

## Escalation Rules

| Situation | Action |
|-----------|--------|
| Unknown cloud provider or hosting | **ASK user** |
| Container registry not specified | **ASK user** |
| CI/CD platform preference unclear | **ASK user** — default to GitHub Actions |
| Secret manager not chosen | **ASK user** |
| Deployment pattern trade-offs | **ASK user** with recommendation |
| Missing architecture.md | **STOP** — run `/plan` first |

## Common Mistakes

- **Implementing during planning**: this workflow produces documents, not Dockerfiles or pipeline YAML
- **Hardcoding secrets**: never include real credentials in deployment documents
- **Ignoring integration test containerization**: the test environment must be containerized alongside the app
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Using `:latest` tags**: always pin base image versions
- **Forgetting observability**: logging, metrics, and tracing are deployment concerns, not post-deployment additions

## Methodology Quick Reference

```
┌────────────────────────────────────────────────────────────────┐
│ Deployment Planning (5-Step Method)                            │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: architecture.md + component specs exist                │
│                                                                │
│ 1. Containerization → containerization.md                      │
│    [BLOCKING: user confirms Docker plan]                       │
│ 2. CI/CD Pipeline → ci_cd_pipeline.md                          │
│ 3. Environment → environment_strategy.md                       │
│ 4. Observability → observability.md                            │
│ 5. Procedures → deployment_procedures.md                       │
│    [BLOCKING: user confirms deployment plan]                   │
├────────────────────────────────────────────────────────────────┤
│ Principles: Docker-first · IaC · Observability built-in        │
│             Environment parity · Save immediately              │
└────────────────────────────────────────────────────────────────┘
```
@@ -0,0 +1,87 @@
# CI/CD Pipeline Template

Save as `_docs/02_plans/deployment/ci_cd_pipeline.md`.

---

```markdown
# [System Name] — CI/CD Pipeline

## Pipeline Overview

| Stage | Trigger | Quality Gate |
|-------|---------|-------------|
| Lint | Every push | Zero lint errors |
| Test | Every push | 75%+ coverage, all tests pass |
| Security | Every push | Zero critical/high CVEs |
| Build | PR merge to dev | Docker build succeeds |
| Push | After build | Images pushed to registry |
| Deploy Staging | After push | Health checks pass |
| Smoke Tests | After staging deploy | Critical paths pass |
| Deploy Production | Manual approval | Health checks pass |

## Stage Details

### Lint
- [Language-specific linters and formatters]
- Runs in parallel per language

### Test
- Unit tests: [framework and command]
- Integration tests: [framework and command, uses docker-compose.test.yml]
- Coverage threshold: 75% overall, 90% critical paths
- Coverage report published as pipeline artifact

### Security
- Dependency audit: [tool, e.g., npm audit / pip-audit / dotnet list package --vulnerable]
- SAST scan: [tool, e.g., Semgrep / SonarQube]
- Image scan: Trivy on built Docker images
- Block on: critical or high severity findings

### Build
- Docker images built using multi-stage Dockerfiles
- Tagged with git SHA: `<registry>/<component>:<sha>`
- Build cache: Docker layer cache via CI cache action

### Push
- Registry: [container registry URL]
- Authentication: [method]

### Deploy Staging
- Deployment method: [docker compose / Kubernetes / cloud service]
- Pre-deploy: run database migrations
- Post-deploy: verify health check endpoints
- Automated rollback on health check failure

### Smoke Tests
- Subset of integration tests targeting staging environment
- Validates critical user flows
- Timeout: [maximum duration]

### Deploy Production
- Requires manual approval via [mechanism]
- Deployment strategy: [blue-green / rolling / canary]
- Pre-deploy: database migration review
- Post-deploy: health checks + monitoring for 15 min

## Caching Strategy

| Cache | Key | Restore Keys |
|-------|-----|-------------|
| Dependencies | [lockfile hash] | [partial match] |
| Docker layers | [Dockerfile hash] | [partial match] |
| Build artifacts | [source hash] | [partial match] |

## Parallelization

[Diagram or description of which stages run concurrently]

## Notifications

| Event | Channel | Recipients |
|-------|---------|-----------|
| Build failure | [Slack/email] | [team] |
| Security alert | [Slack/email] | [team + security] |
| Deploy success | [Slack] | [team] |
| Deploy failure | [Slack/email + PagerDuty] | [on-call] |
```
@@ -0,0 +1,94 @@
# Containerization Plan Template

Save as `_docs/02_plans/deployment/containerization.md`.

---

```markdown
# [System Name] — Containerization

## Component Dockerfiles

### [Component Name]

| Property | Value |
|----------|-------|
| Base image | [e.g., mcr.microsoft.com/dotnet/aspnet:8.0-alpine] |
| Build image | [e.g., mcr.microsoft.com/dotnet/sdk:8.0-alpine] |
| Stages | [dependency install → build → production] |
| User | [non-root user name] |
| Health check | [endpoint and command] |
| Exposed ports | [port list] |
| Key build args | [if any] |

### [Repeat for each component]

## Docker Compose — Local Development

```yaml
# docker-compose.yml structure
services:
  [component]:
    build: ./[path]
    ports: ["host:container"]
    environment: [reference .env.dev]
    depends_on: [dependencies with health condition]
    healthcheck: [command, interval, timeout, retries]

  db:
    image: [postgres:version-alpine]
    volumes: [named volume]
    environment: [credentials from .env.dev]
    healthcheck: [pg_isready]

volumes:
  [named volumes]

networks:
  [shared network]
```

## Docker Compose — Integration Tests

```yaml
# docker-compose.test.yml structure
services:
  [app components under test]

  test-runner:
    build: ./tests/integration
    depends_on: [app components with health condition]
    environment: [test configuration]
    # Exit code determines test pass/fail

  db:
    image: [postgres:version-alpine]
    volumes: [seed data mount]
```

Run: `docker compose -f docker-compose.test.yml up --abort-on-container-exit`

## Image Tagging Strategy

| Context | Tag Format | Example |
|---------|-----------|---------|
| CI build | `<registry>/<project>/<component>:<git-sha>` | `ghcr.io/org/api:a1b2c3d` |
| Release | `<registry>/<project>/<component>:<semver>` | `ghcr.io/org/api:1.2.0` |
| Local dev | `<component>:latest` | `api:latest` |

## .dockerignore

```
.git
.cursor
_docs
_standalone
node_modules
**/bin
**/obj
**/__pycache__
*.md
.env*
docker-compose*.yml
```
```
@@ -0,0 +1,103 @@
# Deployment Procedures Template

Save as `_docs/02_plans/deployment/deployment_procedures.md`.

---

```markdown
# [System Name] — Deployment Procedures

## Deployment Strategy

**Pattern**: [blue-green / rolling / canary]
**Rationale**: [why this pattern fits the architecture]
**Zero-downtime**: required for production deployments

### Graceful Shutdown

- Grace period: 30 seconds for in-flight requests
- Sequence: stop accepting new requests → drain connections → shutdown
- Container orchestrator: `terminationGracePeriodSeconds: 40`

### Database Migration Ordering

- Migrations run **before** new code deploys
- All migrations must be backward-compatible (old code works with new schema)
- Irreversible migrations require explicit approval

## Health Checks

| Check | Type | Endpoint | Interval | Failure Threshold | Action |
|-------|------|----------|----------|-------------------|--------|
| Liveness | HTTP GET | `/health/live` | 10s | 3 failures | Restart container |
| Readiness | HTTP GET | `/health/ready` | 5s | 3 failures | Remove from load balancer |
| Startup | HTTP GET | `/health/ready` | 5s | 30 attempts | Kill and recreate |

### Health Check Responses

- `/health/live`: returns 200 if process is running (no dependency checks)
- `/health/ready`: returns 200 if all dependencies (DB, cache, queues) are reachable

## Staging Deployment

1. CI/CD builds and pushes Docker images tagged with git SHA
2. Run database migrations against staging
3. Deploy new images to staging environment
4. Wait for health checks to pass (readiness probe)
5. Run smoke tests against staging
6. If smoke tests fail: automatic rollback to previous image

## Production Deployment

1. **Approval**: manual approval required via [mechanism]
2. **Pre-deploy checks**:
   - [ ] Staging smoke tests passed
   - [ ] Security scan clean
   - [ ] Database migration reviewed
   - [ ] Monitoring alerts configured
   - [ ] Rollback plan confirmed
3. **Deploy**: apply deployment strategy (blue-green / rolling / canary)
4. **Verify**: health checks pass, error rate stable, latency within baseline
5. **Monitor**: observe dashboards for 15 minutes post-deploy
6. **Finalize**: mark deployment as successful or trigger rollback

## Rollback Procedures

### Trigger Criteria

- Health check failures persist after deploy
- Error rate exceeds 5% for more than 5 minutes
- Critical alert fires within 15 minutes of deploy
- Manual decision by on-call engineer

### Rollback Steps

1. Redeploy previous Docker image tag (from CI/CD artifact)
2. Verify health checks pass
3. If database migration was applied:
   - Run DOWN migration if reversible
   - If irreversible: assess data impact, escalate if needed
4. Notify stakeholders
5. Schedule post-mortem within 24 hours

### Post-Mortem

Required after every production rollback:
- Timeline of events
- Root cause
- What went wrong
- Prevention measures

## Deployment Checklist

- [ ] All tests pass in CI
- [ ] Security scan clean (zero critical/high CVEs)
- [ ] Docker images built and pushed
- [ ] Database migrations reviewed and tested
- [ ] Environment variables configured for target environment
- [ ] Health check endpoints verified
- [ ] Monitoring alerts configured
- [ ] Rollback plan documented and tested
- [ ] Stakeholders notified of deployment window
- [ ] On-call engineer available during deployment
```
@@ -0,0 +1,61 @@
# Environment Strategy Template

Save as `_docs/02_plans/deployment/environment_strategy.md`.

---

```markdown
# [System Name] — Environment Strategy

## Environments

| Environment | Purpose | Infrastructure | Data Source |
|-------------|---------|---------------|-------------|
| Development | Local developer workflow | docker-compose | Seed data, mocked externals |
| Staging | Pre-production validation | [mirrors production] | Anonymized production-like data |
| Production | Live system | [full infrastructure] | Real data |

## Environment Variables

### Required Variables

| Variable | Purpose | Dev Default | Staging/Prod Source |
|----------|---------|-------------|-------------------|
| `DATABASE_URL` | Postgres connection | `postgres://dev:dev@db:5432/app` | Secret manager |
| [add all required variables] | | | |

### `.env.example`

```env
# Copy to .env and fill in values
DATABASE_URL=postgres://user:pass@host:5432/dbname
# [all required variables with placeholder values]
```

### Variable Validation

All services validate required environment variables at startup and fail fast with a clear error message if any are missing.

## Secrets Management

| Environment | Method | Tool |
|-------------|--------|------|
| Development | `.env` file (git-ignored) | dotenv |
| Staging | Secret manager | [AWS Secrets Manager / Azure Key Vault / Vault] |
| Production | Secret manager | [AWS Secrets Manager / Azure Key Vault / Vault] |

Rotation policy: [frequency and procedure]

## Database Management

| Environment | Type | Migrations | Data |
|-------------|------|-----------|------|
| Development | Docker Postgres, named volume | Applied on container start | Seed data via init script |
| Staging | Managed Postgres | Applied via CI/CD pipeline | Anonymized production snapshot |
| Production | Managed Postgres | Applied via CI/CD with approval | Live data |

Migration rules:
- All migrations must be backward-compatible (support old and new code simultaneously)
- Reversible migrations required (DOWN/rollback script)
- Production migrations require review before apply
```
@@ -0,0 +1,132 @@
# Observability Template

Save as `_docs/02_plans/deployment/observability.md`.

---

```markdown
# [System Name] — Observability

## Logging

### Format

Structured JSON to stdout/stderr. No file-based logging in containers.

```json
{
  "timestamp": "ISO8601",
  "level": "INFO",
  "service": "service-name",
  "correlation_id": "uuid",
  "message": "Event description",
  "context": {}
}
```

### Log Levels

| Level | Usage | Example |
|-------|-------|---------|
| ERROR | Exceptions, failures requiring attention | Database connection failed |
| WARN | Potential issues, degraded performance | Retry attempt 2/3 |
| INFO | Significant business events | User registered, Order placed |
| DEBUG | Detailed diagnostics (dev/staging only) | Request payload, Query params |

### Retention

| Environment | Destination | Retention |
|-------------|-------------|-----------|
| Development | Console | Session |
| Staging | [log aggregator] | 7 days |
| Production | [log aggregator] | 30 days |

### PII Rules

- Never log passwords, tokens, or session IDs
- Mask email addresses and personal identifiers
- Log user IDs (opaque) instead of usernames

## Metrics

### Endpoints

Every service exposes Prometheus-compatible metrics at `/metrics`.

### Application Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `request_count` | Counter | Total HTTP requests by method, path, status |
| `request_duration_seconds` | Histogram | Response time by method, path |
| `error_count` | Counter | Failed requests by type |
| `active_connections` | Gauge | Current open connections |

### System Metrics

- CPU usage, Memory usage, Disk I/O, Network I/O

### Business Metrics

| Metric | Type | Description | Source |
|--------|------|-------------|--------|
| [from acceptance criteria] | | | |

Collection interval: 15 seconds

## Distributed Tracing

### Configuration

- SDK: OpenTelemetry
- Propagation: W3C Trace Context via HTTP headers
- Span naming: `<service>.<operation>`

### Sampling

| Environment | Rate | Rationale |
|-------------|------|-----------|
| Development | 100% | Full visibility |
| Staging | 100% | Full visibility |
| Production | 10% | Balance cost vs observability |

### Integration Points

- HTTP requests: automatic instrumentation
- Database queries: automatic instrumentation
- Message queues: manual span creation on publish/consume

## Alerting

| Severity | Response Time | Conditions |
|----------|---------------|-----------|
| Critical | 5 min | Service unreachable, health check failed for 1 min, data loss detected |
| High | 30 min | Error rate > 5% for 5 min, P95 latency > 2x baseline for 10 min |
| Medium | 4 hours | Disk usage > 80%, elevated latency, connection pool exhaustion |
| Low | Next business day | Non-critical warnings, deprecated API usage |

### Notification Channels

| Severity | Channel |
|----------|---------|
| Critical | [PagerDuty / phone] |
| High | [Slack + email] |
| Medium | [Slack] |
| Low | [Dashboard only] |

## Dashboards

### Operations Dashboard

- Service health status (up/down per component)
- Request rate and error rate
- Response time percentiles (P50, P95, P99)
- Resource utilization (CPU, memory per container)
- Active alerts

### Business Dashboard

- [Key business metrics from acceptance criteria]
- [User activity indicators]
- [Transaction volumes]
```
@@ -1,7 +1,7 @@
---
name: plan
description: |
  Decompose a solution into architecture, system flows, components, tests, and Jira epics.
  Decompose a solution into architecture, data model, deployment plan, system flows, components, tests, and Jira epics.
  Systematic 6-step planning workflow with BLOCKING gates, self-verification, and structured artifact management.
  Uses _docs/ + _docs/02_plans/ structure.
  Trigger phrases:
@@ -15,7 +15,7 @@ disable-model-invocation: true

# Solution Planning

Decompose a problem and solution into architecture, system flows, components, tests, and Jira epics through a systematic 6-step workflow.
Decompose a problem and solution into architecture, data model, deployment plan, system flows, components, tests, and Jira epics through a systematic 6-step workflow.

## Core Principles

@@ -91,6 +91,13 @@ PLANS_DIR/
│   └── traceability_matrix.md
├── architecture.md
├── system-flows.md
├── data_model.md
├── deployment/
│   ├── containerization.md
│   ├── ci_cd_pipeline.md
│   ├── environment_strategy.md
│   ├── observability.md
│   └── deployment_procedures.md
├── risk_mitigations.md
├── risk_mitigations_02.md (iterative, ## as sequence)
├── components/
@@ -124,6 +131,8 @@ PLANS_DIR/
| Step 1 | Integration traceability matrix | `integration_tests/traceability_matrix.md` |
| Step 2 | Architecture analysis complete | `architecture.md` |
| Step 2 | System flows documented | `system-flows.md` |
| Step 2 | Data model documented | `data_model.md` |
| Step 2 | Deployment plan complete | `deployment/` (5 files) |
| Step 3 | Each component analyzed | `components/[##]_[name]/description.md` |
| Step 3 | Common helpers generated | `common-helpers/[##]_helper_[name].md` |
| Step 3 | Diagrams generated | `diagrams/` |
@@ -210,9 +219,11 @@ Capture any new questions, findings, or insights that arise during test specific

### Step 2: Solution Analysis

**Role**: Professional software architect
**Goal**: Produce `architecture.md` and `system-flows.md` from the solution draft
**Goal**: Produce `architecture.md`, `system-flows.md`, `data_model.md`, and `deployment/` from the solution draft
**Constraints**: No code, no component-level detail yet; focus on system-level view

#### Phase 2a: Architecture & Flows

1. Read all input files thoroughly
2. Incorporate findings, questions, and insights discovered during Step 1 (integration tests)
3. Research unknown or questionable topics via internet; ask user about ambiguities
@@ -230,6 +241,56 @@ Capture any new questions, findings, or insights that arise during test specific

**BLOCKING**: Present architecture summary to user. Do NOT proceed until user confirms.

#### Phase 2b: Data Model

**Role**: Professional software architect
**Goal**: Produce a detailed data model document covering entities, relationships, and migration strategy

1. Extract core entities from architecture.md and solution.md
2. Define entity attributes, types, and constraints
3. Define relationships between entities (Mermaid ERD)
4. Define migration strategy: versioning tool (EF Core migrations / Alembic / sql-migrate), reversibility requirement, naming convention
5. Define seed data requirements per environment (dev, staging)
6. Define backward compatibility approach for schema changes (additive-only by default)

**Self-verification**:
- [ ] Every entity mentioned in architecture.md is defined
- [ ] Relationships are explicit with cardinality
- [ ] Migration strategy specifies reversibility requirement
- [ ] Seed data requirements defined
- [ ] Backward compatibility approach documented

**Save action**: Write `data_model.md`

#### Phase 2c: Deployment Planning

**Role**: DevOps / Platform engineer
**Goal**: Produce deployment plan covering containerization, CI/CD, environment strategy, observability, and deployment procedures

Use the `/deploy` skill's templates as structure for each artifact:

1. Read architecture.md and restrictions.md for infrastructure constraints
2. Research Docker best practices for the project's tech stack
3. Define containerization plan: Dockerfile per component, docker-compose for dev and tests
4. Define CI/CD pipeline: stages, quality gates, caching, parallelization
5. Define environment strategy: dev, staging, production with secrets management
6. Define observability: structured logging, metrics, tracing, alerting
7. Define deployment procedures: strategy, health checks, rollback, checklist

**Self-verification**:
- [ ] Every component has a Docker specification
- [ ] CI/CD pipeline covers lint, test, security, build, deploy
- [ ] Environment strategy covers dev, staging, production
- [ ] Observability covers logging, metrics, tracing, alerting
- [ ] Deployment procedures include rollback and health checks

**Save action**: Write all 5 files under `deployment/`:
- `containerization.md`
- `ci_cd_pipeline.md`
- `environment_strategy.md`
- `observability.md`
- `deployment_procedures.md`

---

### Step 3: Component Decomposition
@@ -375,6 +436,20 @@ Before writing the final report, verify ALL of the following:
- [ ] Deployment model is defined
- [ ] Integration test findings are reflected in architecture decisions

### Data Model
- [ ] Every entity from architecture.md is defined
- [ ] Relationships have explicit cardinality
- [ ] Migration strategy with reversibility requirement
- [ ] Seed data requirements defined
- [ ] Backward compatibility approach documented

### Deployment
- [ ] Containerization plan covers all components
- [ ] CI/CD pipeline includes lint, test, security, build, deploy stages
- [ ] Environment strategy covers dev, staging, production
- [ ] Observability covers logging, metrics, tracing, alerting
- [ ] Deployment procedures include rollback and health checks

### Components
- [ ] Every component follows SRP
- [ ] No circular dependencies
@@ -441,8 +516,10 @@ Before writing the final report, verify ALL of the following:
│                                                                │
│ 1. Integration Tests → integration_tests/ (5 files)            │
│    [BLOCKING: user confirms test coverage]                     │
│ 2. Solution Analysis → architecture.md, system-flows.md        │
│ 2a. Architecture → architecture.md, system-flows.md            │
│    [BLOCKING: user confirms architecture]                      │
│ 2b. Data Model → data_model.md                                 │
│ 2c. Deployment → deployment/ (5 files)                         │
│ 3. Component Decompose → components/[##]_[name]/description    │
│    [BLOCKING: user confirms decomposition]                     │
│ 4. Review & Risk → risk_mitigations.md                         │

@@ -0,0 +1,174 @@
|
||||
---
|
||||
name: retrospective
|
||||
description: |
|
||||
Collect metrics from implementation batch reports and code review findings, analyze trends across cycles,
|
||||
and produce improvement reports with actionable recommendations.
|
||||
3-step workflow: collect metrics, analyze trends, produce report.
|
||||
Outputs to _docs/05_metrics/.
|
||||
Trigger phrases:
|
||||
- "retrospective", "retro", "run retro"
|
||||
- "metrics review", "feedback loop"
|
||||
- "implementation metrics", "analyze trends"
|
||||
category: evolve
|
||||
tags: [retrospective, metrics, trends, improvement, feedback-loop]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Retrospective

Collect metrics from implementation artifacts, analyze trends across development cycles, and produce actionable improvement reports.

## Core Principles

- **Data-driven**: conclusions come from metrics, not impressions
- **Actionable**: every finding must have a concrete improvement suggestion
- **Cumulative**: each retrospective compares against previous ones to track progress
- **Save immediately**: write artifacts to disk after each step
- **Non-judgmental**: focus on process improvement, not blame

## Context Resolution

Fixed paths:

- IMPL_DIR: `_docs/03_implementation/`
- METRICS_DIR: `_docs/05_metrics/`
- TASKS_DIR: `_docs/02_tasks/`

Announce the resolved paths to the user before proceeding.

## Prerequisite Checks (BLOCKING)

1. `IMPL_DIR` exists and contains at least one `batch_*_report.md` — **STOP if missing** (nothing to analyze)
2. Create `METRICS_DIR` if it does not exist
3. Check for previous retrospective reports in `METRICS_DIR` to enable trend comparison
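The three checks above can be sketched in Python. The function shape and the paths-as-arguments are illustrative, not part of the skill definition:

```python
from pathlib import Path

def check_prerequisites(impl_dir: Path, metrics_dir: Path) -> list[Path]:
    """Blocking prerequisite checks: stop if there is nothing to analyze."""
    # Check 1: at least one batch report must exist
    batch_reports = sorted(impl_dir.glob("batch_*_report.md"))
    if not batch_reports:
        raise SystemExit(f"STOP: no batch reports in {impl_dir}, nothing to analyze")
    # Check 2: create the metrics directory if missing
    metrics_dir.mkdir(parents=True, exist_ok=True)
    # Check 3: previous retrospectives enable trend comparison in Step 2
    return sorted(metrics_dir.glob("retro_*.md"))
```

An empty return value means this is a baseline run with no previous retrospective to compare against.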
## Artifact Management

### Directory Structure

```
METRICS_DIR/
├── retro_[YYYY-MM-DD].md
├── retro_[YYYY-MM-DD].md
└── ...
```

## Progress Tracking

At the start of execution, create a TodoWrite with all steps (1 through 3). Update status as each step completes.

## Workflow
### Step 1: Collect Metrics

**Role**: Data analyst
**Goal**: Parse all implementation artifacts and extract quantitative metrics
**Constraints**: Collection only — no interpretation yet

#### Sources

| Source | Metrics Extracted |
|--------|------------------|
| `batch_*_report.md` | Tasks per batch, batch count, task statuses (Done/Blocked/Partial) |
| Code review sections in batch reports | PASS/FAIL/PASS_WITH_WARNINGS ratios, finding counts by severity and category |
| Task spec files in TASKS_DIR | Complexity points per task, dependency count |
| `FINAL_implementation_report.md` | Total tasks, total batches, overall duration |
| Git log (if available) | Commits per batch, files changed per batch |

#### Metrics to Compute

**Implementation Metrics**:
- Total tasks implemented
- Total batches executed
- Average tasks per batch
- Average complexity points per batch
- Total complexity points delivered

**Quality Metrics**:
- Code review pass rate (PASS / total reviews)
- Code review findings by severity: Critical, High, Medium, Low counts
- Code review findings by category: Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope
- FAIL count (batches that required user intervention)

**Efficiency Metrics**:
- Blocked task count and reasons
- Tasks completed on first attempt vs requiring fixes
- Batch with most findings (identify problem areas)
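A minimal sketch of the quality-metric aggregation, assuming the code review sections have already been parsed into a list of dicts (the structure shown is hypothetical, not a format the skill mandates):

```python
from collections import Counter

def quality_metrics(reviews: list[dict]) -> dict:
    """Aggregate review verdicts and finding counts across batches.

    `reviews` is an assumed parsed form, e.g.:
    [{"verdict": "PASS", "findings": [{"severity": "Low", "category": "Style"}]}, ...]
    """
    verdicts = Counter(r["verdict"] for r in reviews)
    severities = Counter(f["severity"] for r in reviews for f in r["findings"])
    categories = Counter(f["category"] for r in reviews for f in r["findings"])
    total = len(reviews)
    return {
        "pass_rate": verdicts["PASS"] / total if total else 0.0,  # PASS / total reviews
        "fail_count": verdicts["FAIL"],  # batches that required user intervention
        "by_severity": dict(severities),
        "by_category": dict(categories),
    }
```

The same Counter pattern extends to the implementation and efficiency metrics once the batch reports are parsed.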
**Self-verification**:
- [ ] All batch reports parsed
- [ ] All metric categories computed
- [ ] No batch reports missed

---
### Step 2: Analyze Trends

**Role**: Process improvement analyst
**Goal**: Identify patterns, recurring issues, and improvement opportunities
**Constraints**: Analysis must be grounded in the metrics from Step 1

1. If previous retrospective reports exist in METRICS_DIR, load the most recent one for comparison
2. Identify patterns:
   - **Recurring findings**: which code review categories appear most frequently?
   - **Problem components**: which components/files generate the most findings?
   - **Complexity accuracy**: do high-complexity tasks actually produce more issues?
   - **Blocker patterns**: what types of blockers occur, and can they be prevented?
3. Compare against the previous retrospective (if it exists):
   - Which metrics improved?
   - Which metrics degraded?
   - Were previous improvement actions effective?
4. Identify the top 3 improvement actions, ranked by impact
**Self-verification**:
- [ ] Patterns are grounded in specific metrics
- [ ] Comparison with previous retro included (if one exists)
- [ ] Top 3 actions are concrete and actionable

---
### Step 3: Produce Report

**Role**: Technical writer
**Goal**: Write a structured retrospective report with metrics, trends, and recommendations
**Constraints**: Concise, data-driven, actionable

Write `METRICS_DIR/retro_[YYYY-MM-DD].md` using `templates/retrospective-report.md` as structure.

**Self-verification**:
- [ ] All metrics from Step 1 included
- [ ] Trend analysis from Step 2 included
- [ ] Top 3 improvement actions clearly stated
- [ ] Suggested rule/skill updates are specific

**Save action**: Write `retro_[YYYY-MM-DD].md`

Present the report summary to the user.

---
## Escalation Rules

| Situation | Action |
|-----------|--------|
| No batch reports exist | **STOP** — nothing to analyze |
| Batch reports have inconsistent format | **WARN user**, extract what is available |
| No previous retrospective for comparison | PROCEED — report baseline metrics only |
| Metrics suggest systemic issue (>50% FAIL rate) | **WARN user** — suggest immediate process review |
## Methodology Quick Reference

```
┌────────────────────────────────────────────────────────────────┐
│ Retrospective (3-Step Method)                                  │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: batch reports exist in _docs/03_implementation/        │
│                                                                │
│ 1. Collect Metrics  → parse batch reports, compute metrics     │
│ 2. Analyze Trends   → patterns, comparison, improvement areas  │
│ 3. Produce Report   → _docs/05_metrics/retro_[date].md         │
├────────────────────────────────────────────────────────────────┤
│ Principles: Data-driven · Actionable · Cumulative              │
│             Non-judgmental · Save immediately                  │
└────────────────────────────────────────────────────────────────┘
```
@@ -0,0 +1,93 @@
# Retrospective Report Template

Save as `_docs/05_metrics/retro_[YYYY-MM-DD].md`.

---

```markdown
# Retrospective — [YYYY-MM-DD]

## Implementation Summary

| Metric | Value |
|--------|-------|
| Total tasks | [count] |
| Total batches | [count] |
| Total complexity points | [sum] |
| Avg tasks per batch | [value] |
| Avg complexity per batch | [value] |

## Quality Metrics

### Code Review Results

| Verdict | Count | Percentage |
|---------|-------|-----------|
| PASS | [count] | [%] |
| PASS_WITH_WARNINGS | [count] | [%] |
| FAIL | [count] | [%] |

### Findings by Severity

| Severity | Count |
|----------|-------|
| Critical | [count] |
| High | [count] |
| Medium | [count] |
| Low | [count] |

### Findings by Category

| Category | Count | Top Files |
|----------|-------|-----------|
| Bug | [count] | [most affected files] |
| Spec-Gap | [count] | [most affected files] |
| Security | [count] | [most affected files] |
| Performance | [count] | [most affected files] |
| Maintainability | [count] | [most affected files] |
| Style | [count] | [most affected files] |
| Scope | [count] | [most affected files] |

## Efficiency

| Metric | Value |
|--------|-------|
| Blocked tasks | [count] |
| Tasks requiring fixes after review | [count] |
| Batch with most findings | Batch [N] — [reason] |

### Blocker Analysis

| Blocker Type | Count | Prevention |
|-------------|-------|-----------|
| [type] | [count] | [suggested prevention] |

## Trend Comparison

| Metric | Previous | Current | Change |
|--------|----------|---------|--------|
| Pass rate | [%] | [%] | [+/-] |
| Avg findings per batch | [value] | [value] | [+/-] |
| Blocked tasks | [count] | [count] | [+/-] |

*Previous retrospective: [date or "N/A — first retro"]*

## Top 3 Improvement Actions

1. **[Action title]**: [specific, actionable description]
   - Impact: [expected improvement]
   - Effort: [low/medium/high]

2. **[Action title]**: [specific, actionable description]
   - Impact: [expected improvement]
   - Effort: [low/medium/high]

3. **[Action title]**: [specific, actionable description]
   - Impact: [expected improvement]
   - Effort: [low/medium/high]

## Suggested Rule/Skill Updates

| File | Change | Rationale |
|------|--------|-----------|
| [.cursor/rules/... or .cursor/skills/...] | [specific change] | [based on which metric] |
```
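When filling the [+/-] Change column of the Trend Comparison table, an explicit sign avoids ambiguity. A small formatting sketch; the helper name and format choices are illustrative:

```python
def fmt_change(previous: float, current: float, pct: bool = False) -> str:
    """Render a trend-table Change cell with an explicit sign, e.g. '+5%' or '-2'."""
    delta = current - previous
    # Percent metrics (pass rate) render as whole percentage points;
    # count metrics render as a plain signed number.
    return f"{delta:+.0%}" if pct else f"{delta:+.2g}"
```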
@@ -49,19 +49,8 @@ When testing security or conducting audits:
- Validating input sanitization
- Reviewing security configuration

### OWASP Top 10 (2021)
| # | Vulnerability | Key Test |
|---|---------------|----------|
| 1 | Broken Access Control | User A accessing User B's data |
| 2 | Cryptographic Failures | Plaintext passwords, HTTP |
| 3 | Injection | SQL/XSS/command injection |
| 4 | Insecure Design | Rate limiting, session timeout |
| 5 | Security Misconfiguration | Verbose errors, exposed /admin |
| 6 | Vulnerable Components | npm audit, outdated packages |
| 7 | Auth Failures | Weak passwords, no MFA |
| 8 | Integrity Failures | Unsigned updates, malware |
| 9 | Logging Failures | No audit trail for breaches |
| 10 | SSRF | Server fetching internal URLs |
### OWASP Top 10
Use the most recent **stable** version of the OWASP Top 10. At the start of each security audit, research the current version at https://owasp.org/www-project-top-ten/ and test against all listed categories. Do not rely on a hardcoded list — the OWASP Top 10 is updated periodically and the current version must be verified.

### Tools
| Type | Tool | Purpose |