mirror of
https://github.com/azaion/satellite-provider.git
synced 2026-06-22 18:41:14 +00:00
Enhance .cursor documentation and workflows
Updated the README.md to reflect new skill commands and improved descriptions for various workflows, including the addition of new skills like /test-spec, /code-review, and /release. Enhanced clarity on session boundaries and flow resolutions in the auto-chaining process. Removed the implementer agent file as it is no longer needed. Updated coderule and meta-rule documents to enforce stricter testing and implementation standards, ensuring real results are produced rather than simulated ones. Revised the autodev flow documentation to include a new release step and clarified the retrospective process. Adjusted the testing rules to specify coverage thresholds for critical paths. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,290 @@
|
||||
---
|
||||
name: release
|
||||
description: |
|
||||
Executes the deployment plan produced by /deploy against a target environment.
|
||||
Closes the loop between "we have a plan" and "the new version is running in production with a verdict on disk."
|
||||
6-phase workflow: pre-release gate, strategy select, execute, smoke test, watch window, commit-or-rollback.
|
||||
Outputs _docs/04_release/release_<version>.md with a definitive Released / Rolled-Back / Aborted verdict.
|
||||
Trigger phrases:
|
||||
- "release", "ship", "go live", "release this version"
|
||||
- "deploy to prod", "promote to staging", "roll out"
|
||||
- "rollback", "abort the release"
|
||||
category: ship
|
||||
tags: [release, deployment, rollback, smoke-test, observability, production]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Release Execution
|
||||
|
||||
The `/deploy` skill produces a plan and scripts. The `/release` skill **runs** them, verifies the live system, watches it for a defined window, and produces a definitive verdict on disk.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Real execution, not simulation**: every phase must actually run against the target environment. If a phase cannot be executed (missing scripts, no SSH access, disabled secrets, registry auth failure), STOP — do not pretend a step succeeded. See `meta-rule.mdc` § "Real Results, Not Simulated Ones".
|
||||
- **Verifiable rollback path**: the release does not start until rollback is proven viable for this version. "We can roll back" without evidence is not a rollback path.
|
||||
- **Quiet failure is a release failure**: a deploy script that exits 0 but emits no observable signal in the watch window is treated as a regression, not a success.
|
||||
- **One release per invocation**: a single `/release` execution targets exactly one version against exactly one environment. Multi-stage promotion (staging → prod) is two invocations, not one.
|
||||
- **Never skip the watch window**: even successful deploys can degrade after 5–60 minutes (cache warm-up, scheduled jobs, downstream backpressure). The watch window is mandatory.
|
||||
- **Autonomous rollback on hard regressions**: critical health-check failure, error-rate spike above threshold, or smoke-test failure → automatic rollback. Soft regressions (latency drift, capacity warnings) escalate to the user.
|
||||
|
||||
## Context Resolution
|
||||
|
||||
Fixed paths:
|
||||
|
||||
- DEPLOY_DIR: `_docs/04_deploy/`
|
||||
- RELEASE_DIR: `_docs/04_release/`
|
||||
- SCRIPTS_DIR: `scripts/`
|
||||
- DEPLOY_SCRIPT: `scripts/deploy.sh`
|
||||
- HEALTH_SCRIPT: `scripts/health-check.sh`
|
||||
- ENV_TEMPLATE: `.env.example`
|
||||
- OBSERVABILITY_DOC: `_docs/04_deploy/observability.md`
|
||||
- ENVIRONMENT_DOC: `_docs/04_deploy/environment_strategy.md`
|
||||
- PROCEDURES_DOC: `_docs/04_deploy/deployment_procedures.md`
|
||||
- ARCHITECTURE: `_docs/02_document/architecture.md`
|
||||
- RESTRICTIONS: `_docs/00_problem/restrictions.md`
|
||||
|
||||
Announce the resolved paths and the **target environment + version + strategy** to the user before any phase that touches the live system.
|
||||
|
||||
## Inputs (BLOCKING prerequisites)
|
||||
|
||||
| Input | Required | Source |
|
||||
|-------|----------|--------|
|
||||
| Target environment | Yes — ASK user | `environment_strategy.md` enumerates valid options |
|
||||
| Target version / image tag | Yes — ASK user | Must exist in the registry; verified in Phase 1 |
|
||||
| Rollback target version | Yes — ASK user | Defaults to currently-deployed version if discoverable |
|
||||
| `scripts/deploy.sh` | Yes | Produced by `/deploy` Step 7. STOP if missing → run `/deploy` first |
|
||||
| `scripts/health-check.sh` | Yes | Same |
|
||||
| `_docs/04_deploy/deployment_procedures.md` | Yes | Defines per-environment runbook, manual approval rules, change-window restrictions |
|
||||
| `_docs/04_deploy/observability.md` | Yes | Defines watch metrics, thresholds, and dashboards |
|
||||
| `_docs/04_deploy/environment_strategy.md` | Yes | Defines target hostnames, registries, secrets, deploy strategy per env |
|
||||
|
||||
## Outputs
|
||||
|
||||
```
|
||||
RELEASE_DIR/
|
||||
├── release_<version>_<env>_<YYYY-MM-DD-HHmm>.md (mandatory; one per invocation)
|
||||
├── rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md (only when rollback fires; pairs with the release file)
|
||||
└── manual_approvals/
|
||||
└── approval_<version>_<env>.md (when restrictions require manual approval, written before Phase 3)
|
||||
```
|
||||
|
||||
The release report (`templates/release-report.md`) is appended to as each phase completes — it is durable across phase failures and reflects partial progress so the next operator can resume or audit.
|
||||
|
||||
## Phases
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Release Execution (6-Phase Method) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ PREREQ: deploy artifacts on disk; tests green at HEAD │
|
||||
│ │
|
||||
│ 1. Pre-Release Gate → AC + change summary + readiness │
|
||||
│ [BLOCKING: user confirms or aborts] │
|
||||
│ 2. Strategy Select → all-at-once / blue-green / canary │
|
||||
│ [BLOCKING: user picks strategy] │
|
||||
│ 3. Execute → run deploy.sh, capture exit + logs │
|
||||
│ [AUTO-ROLLBACK on non-zero exit] │
|
||||
│ 4. Smoke Test → /test-run prod-smoke in target env │
|
||||
│ [AUTO-ROLLBACK on failure] │
|
||||
│ 5. Watch Window → poll observability for N minutes │
|
||||
│ [AUTO-ROLLBACK on hard threshold breach] │
|
||||
│ 6. Commit or Rollback → finalize verdict, update tracker │
|
||||
│ [BLOCKING: user confirms only if soft regression escalated] │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Verdicts: Released · Rolled-Back · Aborted │
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Phase 1: Pre-Release Gate
|
||||
|
||||
**Goal**: Refuse to start if the system is not ready for a real release.
|
||||
|
||||
1. **Acceptance criteria check**: read `_docs/00_problem/acceptance_criteria.md`. If any AC is marked unmet OR if any AC has no associated test marked `Passed` in the latest `test-run` report, STOP and surface the unmet items. Do not let the user override with "ship anyway" without a recorded reason in the release report.
|
||||
2. **Test status check**: read the most recent `_docs/06_metrics/perf_*.md` (if perf is required by restrictions) and the latest functional test report. Any failing or skipped test that maps to a critical-path AC blocks the release.
|
||||
3. **Change summary**: read the git log between the version-tag-of-last-release and HEAD (or, if no prior release exists, from the project root commit). Render a short list grouped by component: features, fixes, breaking changes, security fixes. Cross-reference against the latest implementation reports under `_docs/03_implementation/`.
|
||||
4. **Rollback readiness**:
|
||||
- Confirm the previous version's image is still pullable from the registry (do not deploy without this).
|
||||
- Confirm `scripts/deploy.sh --rollback` works as documented (read the script; if `--rollback` flag is missing, STOP — that is a deploy-skill bug).
|
||||
- Confirm a rollback target exists (e.g., previously-deployed image tag) and is recorded in the release report under `Rollback Plan`.
|
||||
5. **Restrictions**: read `_docs/00_problem/restrictions.md` for change-window rules, manual-approval rules, blackout windows, regulatory requirements (e.g., 4-eyes review, ITAR controls). If any apply, gate accordingly — write a `manual_approvals/approval_<version>_<env>.md` file once received.
|
||||
6. **Tracker check**: list tracker tickets in the release scope (per `tracker.mdc` rules). Any ticket still in `In Progress` or `Code Review` that maps to a change in the release scope blocks Phase 1. Move-and-deploy is not allowed.
|
||||
|
||||
**BLOCKING gate**: present the assembled summary to the user using Choose A/B/C:
|
||||
|
||||
```
|
||||
══════════════════════════════════════
|
||||
PRE-RELEASE GATE
|
||||
══════════════════════════════════════
|
||||
Target env: {env}
|
||||
Target version: {version} ({git-sha})
|
||||
Rollback target: {previous-version}
|
||||
Changes: N tickets, M components
|
||||
- {summary list}
|
||||
Open risks: {summary or "none"}
|
||||
Blocking issues: {summary or "none"}
|
||||
══════════════════════════════════════
|
||||
A) Proceed to Strategy Select
|
||||
B) Abort — fix blocking issue and re-invoke
|
||||
C) Edit release scope — exclude a ticket and reassemble
|
||||
══════════════════════════════════════
|
||||
```
|
||||
|
||||
If A → write Phase 1 section to release report, proceed. If B → write `Aborted` verdict to release report with reason, exit. If C → loop back into Phase 1 with edited scope.
|
||||
|
||||
### Phase 2: Strategy Select
|
||||
|
||||
**Goal**: Pick the deployment strategy that fits the change risk and environment capability.
|
||||
|
||||
Read `environment_strategy.md` and `deployment_procedures.md` to learn which strategies the target env supports. Strategies and when each is appropriate:
|
||||
|
||||
| Strategy | When to pick | Risk if wrong |
|
||||
|----------|--------------|---------------|
|
||||
| **all-at-once** | Internal tools, low traffic, well-rehearsed change, env supports nothing else | All users hit the new version simultaneously — bug blast radius is 100% |
|
||||
| **blue-green** | Stateless services with a load balancer, env has dual-stack capability | Cutover is binary — observability must be ready to detect issues fast |
|
||||
| **canary** | Customer-facing, traffic-tier load balancer in place, gradual rollout possible | Canary metric thresholds must be well-tuned or canary fails for harmless reasons |
|
||||
| **manual** | Non-automatable env (one-off VMs, regulated infrastructure, non-Docker host) | The whole release becomes a runbook and the watch window phases are operator-driven; the release skill records but does not execute |
|
||||
|
||||
Recommend a default based on:
|
||||
- Risk level inferred from change summary (any breaking change → bias toward canary or blue-green)
|
||||
- Restrictions (e.g., regulatory rules forcing manual approval at each step)
|
||||
- Environment capability (some envs may only support all-at-once)
|
||||
|
||||
**BLOCKING gate**: Choose A/B/C/D between strategies. Record the choice in the release report.
|
||||
|
||||
### Phase 3: Execute
|
||||
|
||||
**Goal**: Actually run the deploy. Capture exit code and full stdout/stderr.
|
||||
|
||||
1. Validate environment file (`.env`) exists, all required vars from `.env.example` are set, no placeholder secrets remain.
|
||||
2. Source the env file and run `scripts/deploy.sh` against the target host. The script produced by `/deploy` Step 7 is the point of execution; do NOT bypass it. If a strategy-specific flag is needed (e.g., `--canary 5%`), pass it through.
|
||||
3. Stream stdout/stderr to the release report, with timestamps, in a fenced code block under `## Phase 3: Execute`.
|
||||
4. Capture exit code.
|
||||
5. **AUTO-ROLLBACK trigger**: non-zero exit code → immediately invoke Phase 6 with verdict `Rolled-Back: deploy script failure`. Do NOT continue to Phase 4.
|
||||
|
||||
If `deploy.sh` emits no output for more than the configured idle threshold (default 5 minutes; check `deployment_procedures.md` for an explicit value), treat it as hung — capture a snapshot of what's running on the target, kill the script, and AUTO-ROLLBACK with reason `Deploy hung — manual investigation required`.
|
||||
|
||||
**Manual strategy**: if Phase 2 picked `manual`, write a checklist of operator steps from `deployment_procedures.md` to the release report and pause until the user types `done` or `failed`. Phase 3 then records the user's report verbatim.
|
||||
|
||||
### Phase 4: Smoke Test
|
||||
|
||||
**Goal**: Verify the new version is *actually serving traffic correctly* in the target environment.
|
||||
|
||||
1. Resolve the smoke-test command from `_docs/02_document/tests/blackbox-tests.md` § Production Smoke Tests, OR delegate to `/test-run` in `--prod-smoke` mode against the target environment.
|
||||
2. The smoke-test set must (a) hit each public endpoint of each component, (b) include at least one read AND one write per public endpoint where applicable, and (c) complete in under 5 minutes total.
|
||||
3. Capture pass/fail per case to the release report.
|
||||
4. **AUTO-ROLLBACK trigger**: any smoke-test failure → invoke Phase 6 with verdict `Rolled-Back: smoke test failure: <test-name>`.
|
||||
|
||||
If smoke tests are **missing** for the target environment (no production-mode test set), STOP — write a leftover entry to `_docs/_process_leftovers/` per `tracker.mdc`, do not proceed to watch window without smoke coverage. Write `Aborted: smoke tests missing for prod-mode target` and ASK the user.
|
||||
|
||||
### Phase 5: Watch Window
|
||||
|
||||
**Goal**: Observe the live system for a defined window to catch latent regressions.
|
||||
|
||||
1. Read `observability.md` for the project's metrics, dashboards, and threshold definitions. Required watch metrics for any production target (per cursor-meta convention) include error rate, request rate, p99 latency, and saturation (CPU/memory/queue-depth).
|
||||
2. Compute the watch-window duration from `deployment_procedures.md`. If unspecified, default to **15 minutes** for staging and **60 minutes** for production.
|
||||
3. Poll the observability backend at 1-minute intervals (or the configured cadence). For each interval, record metric snapshots to the release report.
|
||||
4. Threshold rules:
|
||||
- **Hard breach** (auto-rollback): error-rate ≥ 2× baseline, p99 latency ≥ 3× baseline, any health-check failure persisting for 2 consecutive intervals.
|
||||
- **Soft breach** (escalate): metric drift between 1.5× and 2× baseline, single-interval health blip, queue-depth steady but elevated.
|
||||
- **No data** (escalate): if metrics are not flowing within the first 3 minutes, treat the absence as a hard breach — observability is itself broken.
|
||||
5. **AUTO-ROLLBACK trigger**: hard breach at any interval. Move to Phase 6 with verdict `Rolled-Back: <metric> breached <multiplier>× baseline at T+<minutes>`.
|
||||
6. **ESCALATE trigger**: soft breach. Pause polling, surface the metric, and ask the user A/B/C:
|
||||
- A) Continue watch — accept current drift, keep polling
|
||||
- B) Roll back now — treat soft drift as hard
|
||||
- C) Extend watch window by N minutes
|
||||
7. End of watch window with no breach → proceed to Phase 6.
|
||||
|
||||
The watch window cannot be skipped. If the user explicitly demands skipping (e.g., emergency rollforward), record the override reason in the release report and continue, but mark the verdict as `Released-with-override` — this triggers an automatic incident retrospective per `retrospective/SKILL.md`.
|
||||
|
||||
### Phase 6: Commit or Rollback
|
||||
|
||||
**Goal**: Finalize the release with a definitive verdict on disk.
|
||||
|
||||
**Path A — Commit (clean release)**:
|
||||
1. Update tracker tickets: every ticket in scope moves to `Released` (or `Done`, per project convention defined in `tracker.mdc` / `_docs/_repo-config.yaml`).
|
||||
2. Tag the git HEAD with `release/<version>` (or the project's tag convention from `deployment_procedures.md`).
|
||||
3. Write the final `Released` verdict to the release report with a summary table.
|
||||
4. Trigger `/retrospective --cycle-end` with this release as the cycle terminus.
|
||||
5. Auto-chain to autodev's next step (Retrospective in greenfield, or feature-cycle loop start in existing-code).
|
||||
|
||||
**Path B — Rollback (auto-fired or user-elected)**:
|
||||
1. Run `scripts/deploy.sh --rollback` with the rollback target captured in Phase 1.
|
||||
2. Stream output to a new file `RELEASE_DIR/rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md` AND append a summary to the original release report under `## Rollback`.
|
||||
3. Re-run Phase 4 (smoke test) and a 5-minute mini watch window against the rolled-back version. If THAT also fails, escalate immediately — the system is in an unknown state and needs human takeover.
|
||||
4. Update tracker tickets back to `Ready for Release` (or the project's pre-release status).
|
||||
5. Write the final `Rolled-Back` verdict with full reason chain.
|
||||
6. Auto-trigger `/retrospective --incident` with this release as the incident anchor (per `retrospective/SKILL.md` incident mode).
|
||||
7. Do NOT auto-chain to anything else — the user owns the next step.
|
||||
|
||||
**Path C — Aborted**:
|
||||
Reached only via Phase 1 Choose B, Phase 4 smoke-tests-missing escalation, or any phase that detects a precondition violation. Write `Aborted: <reason>` to the release report. Do not auto-chain.
|
||||
|
||||
## Self-verification
|
||||
|
||||
- [ ] Release report exists at `RELEASE_DIR/release_<version>_<env>_<timestamp>.md` with verdict (Released / Rolled-Back / Aborted)
|
||||
- [ ] Every phase that ran has a section in the release report with timestamps and tool output
|
||||
- [ ] On Released: tracker tickets moved to release status; git tag pushed (if convention)
|
||||
- [ ] On Rolled-Back: rollback report exists at `RELEASE_DIR/rollback_<version>_<env>_<timestamp>.md`; tracker tickets moved back to pre-release status; incident retrospective scheduled
|
||||
- [ ] On Aborted: reason recorded; no live-system changes attempted; no tracker movement
|
||||
- [ ] No phase was skipped without an explicit reason recorded in the release report
|
||||
|
||||
## Escalation Rules
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| `scripts/deploy.sh` missing or `--rollback` unsupported | STOP — return to `/deploy` Step 7, do not patch the script in `/release` |
|
||||
| Registry auth failure during pre-release | STOP — fix credentials at infra layer (per `coderule.mdc`); do not embed creds in the script |
|
||||
| Smoke tests missing for prod target | STOP — write a leftover; do not improvise smoke tests in `/release` |
|
||||
| Observability backend unreachable | STOP — observability blindness is itself a release blocker |
|
||||
| User asks to skip the watch window | Record override, mark verdict `Released-with-override`, fire incident retro |
|
||||
| Rollback also fails its smoke test | ESCALATE to user — system is in unknown state; do not loop deploys |
|
||||
| Tracker MCP returns Unauthorized during ticket movement | Per `tracker.mdc`, write a leftover entry; do NOT silently continue without confirming the move |
|
||||
| Multiple environments named in user request | STOP — one release per invocation; ask user to pick one |
|
||||
| Production smoke test would touch real customer data | STOP — that is a `coderule.mdc` violation; ask user to define a smoke endpoint or test account |
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- **Skipping the watch window when "everything looks fine after deploy"** — a deploy that exited 0 is not a release that's stable. Watch is mandatory.
|
||||
- **Faking smoke tests** to pass the gate when the prod test set is incomplete. STOP and surface the gap; do not embed prod URLs into ad-hoc curl commands.
|
||||
- **Rolling forward through a failure** ("the next deploy will fix it"). Roll back first, fix the cause, then deploy a real fix.
|
||||
- **Treating the release report as optional** when only an internal tool changed. Every release writes a report — the audit trail is the value, not the prose volume.
|
||||
- **Approving manual gates yourself** without the user's input when restrictions require human approval. The release skill records, the human approves.
|
||||
- **Reusing `release_<version>` filenames** across attempted releases. Always include the timestamp in the filename so re-attempts are visible side-by-side.
|
||||
- **Letting tracker drift silently** between release attempts. If Phase 6 cannot move tickets, the release is not complete — write a leftover and stop.
|
||||
|
||||
## Project Mode vs Standalone
|
||||
|
||||
- **Project mode** (default): autodev invokes `/release` after `/deploy`. State writes occur under `_docs/_autodev_state.md`. Full integration with retrospective and feature-cycle loop.
|
||||
- **Standalone mode**: `/release` invoked directly with `@<artifact>` (rare; usually only for re-running a rollback against a specific version). All outputs still go to `RELEASE_DIR/`.
|
||||
|
||||
## Methodology Quick Reference
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Release (6 phases, 3 verdicts) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Phase 1 Pre-Release Gate │
|
||||
│ AC + tests + change summary + rollback path │
|
||||
│ [BLOCKING — user A/B/C] │
|
||||
│ Phase 2 Strategy Select │
|
||||
│ all-at-once · blue-green · canary · manual │
|
||||
│ [BLOCKING — user picks] │
|
||||
│ Phase 3 Execute │
|
||||
│ scripts/deploy.sh, capture exit code + logs │
|
||||
│ [AUTO-ROLLBACK on non-zero or hang] │
|
||||
│ Phase 4 Smoke Test │
|
||||
│ /test-run --prod-smoke against target │
|
||||
│ [AUTO-ROLLBACK on any failure] │
|
||||
│ Phase 5 Watch Window │
|
||||
│ Poll observability for N minutes │
|
||||
│ [AUTO-ROLLBACK on hard breach; escalate on soft] │
|
||||
│ Phase 6 Commit or Rollback │
|
||||
│ Released → tracker, tag, retrospective │
|
||||
│ Rolled-Back → tracker reset, incident retrospective │
|
||||
│ Aborted → no live-system change │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: real execution · verifiable rollback · │
|
||||
│ quiet failure = release failure · │
|
||||
│ watch window mandatory │
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,114 @@
|
||||
# Release Report — {version} → {env}
|
||||
|
||||
- **Date**: {YYYY-MM-DD HH:MM} {timezone}
|
||||
- **Operator**: {user}
|
||||
- **Strategy**: {all-at-once | blue-green | canary | manual}
|
||||
- **Verdict**: {Released | Released-with-override | Rolled-Back | Aborted}
|
||||
- **Verdict reason**: {one-line summary}
|
||||
|
||||
## Pre-Release Gate (Phase 1)
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
| AC ID | Status | Evidence |
|
||||
|-------|--------|----------|
|
||||
| AC-001 | Met / Unmet | path:section, test report, etc. |
|
||||
|
||||
### Test Status
|
||||
|
||||
| Suite | Pass | Fail | Skip | Source |
|
||||
|-------|------|------|------|--------|
|
||||
| Functional | N | N | N | _docs/03_implementation/{batch}.md |
|
||||
| Performance | N | N | N | _docs/06_metrics/perf_*.md |
|
||||
|
||||
### Change Summary
|
||||
|
||||
| Component | Tickets | Type |
|
||||
|-----------|---------|------|
|
||||
| {component} | TKT-001, TKT-002 | feature / fix / breaking / security |
|
||||
|
||||
### Rollback Plan
|
||||
|
||||
- Previous version: `{previous-version}` (registry digest: `{sha}`)
|
||||
- Rollback script: `scripts/deploy.sh --rollback`
|
||||
- Rollback target verified pullable: yes / no
|
||||
- Rollback target verified bootable in target env: yes / no
|
||||
|
||||
### Restrictions / Approvals
|
||||
|
||||
- Change-window restrictions: {none | description}
|
||||
- Manual approvals required: {none | reference to approval file}
|
||||
|
||||
### Tracker State at Gate
|
||||
|
||||
- Tickets in scope: {N}
|
||||
- Tickets blocking release: {0 — list any}
|
||||
|
||||
## Strategy Select (Phase 2)
|
||||
|
||||
- Recommended: {strategy} — reasoning
|
||||
- Chosen: {strategy} — reasoning (if differs from recommended)
|
||||
|
||||
## Execute (Phase 3)
|
||||
|
||||
- Start: {timestamp}
|
||||
- End: {timestamp}
|
||||
- Exit code: {0 / non-zero}
|
||||
|
||||
```
|
||||
<scripts/deploy.sh stdout/stderr stream, with timestamps>
|
||||
```
|
||||
|
||||
## Smoke Test (Phase 4)
|
||||
|
||||
- Mode: {/test-run --prod-smoke | manual smoke set}
|
||||
- Start: {timestamp}
|
||||
- End: {timestamp}
|
||||
|
||||
| Test | Result | Notes |
|
||||
|------|--------|-------|
|
||||
| {name} | Pass / Fail | response time, status, etc. |
|
||||
|
||||
## Watch Window (Phase 5)
|
||||
|
||||
- Duration: {minutes}
|
||||
- Cadence: {minutes per poll}
|
||||
- Backend: {observability source — Prometheus, CloudWatch, Datadog, etc.}
|
||||
|
||||
| T+min | error_rate | rps | p99_latency | saturation | health | notes |
|
||||
|-------|------------|-----|-------------|------------|--------|-------|
|
||||
| 0 | … | … | … | … | OK | … |
|
||||
| 1 | … | … | … | … | OK | … |
|
||||
| … | … | … | … | … | … | … |
|
||||
|
||||
### Threshold breaches
|
||||
|
||||
- {None | "p99 latency 1.7× baseline at T+8 — soft breach, user accepted continuation"}
|
||||
|
||||
## Commit or Rollback (Phase 6)
|
||||
|
||||
### If Released
|
||||
|
||||
- Tracker tickets moved: {list}
|
||||
- Git tag pushed: {tag} → {sha}
|
||||
- Retrospective scheduled: yes — {/retrospective --cycle-end output path}
|
||||
|
||||
### If Rolled-Back
|
||||
|
||||
- Trigger: {auto / user-elected}
|
||||
- Reason: {phase + one-line cause}
|
||||
- Rollback start: {timestamp}
|
||||
- Rollback end: {timestamp}
|
||||
- Post-rollback smoke: pass / fail
|
||||
- Tracker tickets moved back: {list}
|
||||
- Incident retrospective scheduled: yes — {/retrospective --incident output path}
|
||||
|
||||
### If Aborted
|
||||
|
||||
- Phase that aborted: {1 / 2 / 3 / 4 / 5}
|
||||
- Reason: {one-line cause}
|
||||
- No live-system changes attempted: yes / no (if live changes, document under Phase 3 above and treat as Rolled-Back instead)
|
||||
|
||||
## Lessons (one-liners; full incident retro if Rolled-Back / Released-with-override)
|
||||
|
||||
- {Optional: short one-liner observations the operator wants the next /retrospective to consider}
|
||||
Reference in New Issue
Block a user