Bundled hygiene commit before cycle-3 /implement (AZ-776, AZ-777). Mixes two concerns by user choice (autodev option B): - Cycle-3 autodev artifacts not yet committed by Step 9 (new-task): task specs for AZ-776 / AZ-777 under _docs/02_tasks/todo/ and the updated _docs/02_tasks/_dependencies_table.md. - Accumulated skill / rule tooling maintenance under .cursor/ (skills: autodev, code-review, decompose, deploy, implement, new-task, plan, refactor, retrospective, test-spec; rules: coderule, cursor-meta, meta-rule, testing; new release skill scaffolding). - Autodev bootstrap state: _docs/_autodev_state.md (step 10 in_progress) and _docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md (replay timestamp refreshed; gtsam 4.2 still numpy<2-only). Co-authored-by: Cursor <cursoragent@cursor.com>
22 KiB
name, description, category, tags, disable-model-invocation
| name | description | category | tags | disable-model-invocation | ||||||
|---|---|---|---|---|---|---|---|---|---|---|
| release | Executes the deployment plan produced by /deploy against a target environment. Closes the loop between "we have a plan" and "the new version is running in production with a verdict on disk." 6-phase workflow: pre-release gate, strategy select, execute, smoke test, watch window, commit-or-rollback. Outputs _docs/04_release/release_<version>.md with a definitive Released / Rolled-Back / Aborted verdict. Trigger phrases: - "release", "ship", "go live", "release this version" - "deploy to prod", "promote to staging", "roll out" - "rollback", "abort the release" | ship |
|
true |
Release Execution
The /deploy skill produces a plan and scripts. The /release skill runs them, verifies the live system, watches it for a defined window, and produces a definitive verdict on disk.
Core Principles
- Real execution, not simulation: every phase must actually run against the target environment. If a phase cannot be executed (missing scripts, no SSH access, disabled secrets, registry auth failure), STOP — do not pretend a step succeeded. See
meta-rule.mdc§ "Real Results, Not Simulated Ones". - Verifiable rollback path: the release does not start until rollback is proven viable for this version. "We can roll back" without evidence is not a rollback path.
- Quiet failure is a release failure: a deploy script that exits 0 but emits no observable signal in the watch window is treated as a regression, not a success.
- One release per invocation: a single
/releaseexecution targets exactly one version against exactly one environment. Multi-stage promotion (staging → prod) is two invocations, not one. - Never skip the watch window: even successful deploys can degrade after 5–60 minutes (cache warm-up, scheduled jobs, downstream backpressure). The watch window is mandatory.
- Autonomous rollback on hard regressions: critical health-check failure, error-rate spike above threshold, or smoke-test failure → automatic rollback. Soft regressions (latency drift, capacity warnings) escalate to the user.
Context Resolution
Fixed paths:
- DEPLOY_DIR:
_docs/04_deploy/ - RELEASE_DIR:
_docs/04_release/ - SCRIPTS_DIR:
scripts/ - DEPLOY_SCRIPT:
scripts/deploy.sh - HEALTH_SCRIPT:
scripts/health-check.sh - ENV_TEMPLATE:
.env.example - OBSERVABILITY_DOC:
_docs/04_deploy/observability.md - ENVIRONMENT_DOC:
_docs/04_deploy/environment_strategy.md - PROCEDURES_DOC:
_docs/04_deploy/deployment_procedures.md - ARCHITECTURE:
_docs/02_document/architecture.md - RESTRICTIONS:
_docs/00_problem/restrictions.md
Announce the resolved paths and the target environment + version + strategy to the user before any phase that touches the live system.
Inputs (BLOCKING prerequisites)
| Input | Required | Source |
|---|---|---|
| Target environment | Yes — ASK user | environment_strategy.md enumerates valid options |
| Target version / image tag | Yes — ASK user | Must exist in the registry; verified in Phase 1 |
| Rollback target version | Yes — ASK user | Defaults to currently-deployed version if discoverable |
scripts/deploy.sh |
Yes | Produced by /deploy Step 7. STOP if missing → run /deploy first |
scripts/health-check.sh |
Yes | Same |
_docs/04_deploy/deployment_procedures.md |
Yes | Defines per-environment runbook, manual approval rules, change-window restrictions |
_docs/04_deploy/observability.md |
Yes | Defines watch metrics, thresholds, and dashboards |
_docs/04_deploy/environment_strategy.md |
Yes | Defines target hostnames, registries, secrets, deploy strategy per env |
Outputs
RELEASE_DIR/
├── release_<version>_<env>_<YYYY-MM-DD-HHmm>.md (mandatory; one per invocation)
├── rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md (only when rollback fires; pairs with the release file)
└── manual_approvals/
└── approval_<version>_<env>.md (when restrictions require manual approval, written before Phase 3)
The release report (templates/release-report.md) is appended to as each phase completes — it is durable across phase failures and reflects partial progress so the next operator can resume or audit.
Phases
┌────────────────────────────────────────────────────────────────┐
│ Release Execution (6-Phase Method) │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: deploy artifacts on disk; tests green at HEAD │
│ │
│ 1. Pre-Release Gate → AC + change summary + readiness │
│ [BLOCKING: user confirms or aborts] │
│ 2. Strategy Select → all-at-once / blue-green / canary │
│ [BLOCKING: user picks strategy] │
│ 3. Execute → run deploy.sh, capture exit + logs │
│ [AUTO-ROLLBACK on non-zero exit] │
│ 4. Smoke Test → /test-run prod-smoke in target env │
│ [AUTO-ROLLBACK on failure] │
│ 5. Watch Window → poll observability for N minutes │
│ [AUTO-ROLLBACK on hard threshold breach] │
│ 6. Commit or Rollback → finalize verdict, update tracker │
│ [BLOCKING: user confirms only if soft regression escalated] │
├────────────────────────────────────────────────────────────────┤
│ Verdicts: Released · Rolled-Back · Aborted │
└────────────────────────────────────────────────────────────────┘
Phase 1: Pre-Release Gate
Goal: Refuse to start if the system is not ready for a real release.
- Acceptance criteria check: read
_docs/00_problem/acceptance_criteria.md. If any AC is marked unmet OR if any AC has no associated test markedPassedin the latesttest-runreport, STOP and surface the unmet items. Do not let the user override with "ship anyway" without a recorded reason in the release report. - Test status check: read the most recent
_docs/06_metrics/perf_*.md(if perf is required by restrictions) and the latest functional test report. Any failing or skipped test that maps to a critical-path AC blocks the release. - Change summary: read the git log between the version-tag-of-last-release and HEAD (or, if no prior release exists, from the project root commit). Render a short list grouped by component: features, fixes, breaking changes, security fixes. Cross-reference against the latest implementation reports under
_docs/03_implementation/. - Rollback readiness:
- Confirm the previous version's image is still pullable from the registry (do not deploy without this).
- Confirm
scripts/deploy.sh --rollbackworks as documented (read the script; if--rollbackflag is missing, STOP — that is a deploy-skill bug). - Confirm a rollback target exists (e.g., previously-deployed image tag) and is recorded in the release report under
Rollback Plan.
- Restrictions: read
_docs/00_problem/restrictions.mdfor change-window rules, manual-approval rules, blackout windows, regulatory requirements (e.g., 4-eyes review, ITAR controls). If any apply, gate accordingly — write amanual_approvals/approval_<version>_<env>.mdfile once received. - Tracker check: list tracker tickets in the release scope (per
tracker.mdcrules). Any ticket still inIn ProgressorCode Reviewthat maps to a change in the release scope blocks Phase 1. Move-and-deploy is not allowed.
BLOCKING gate: present the assembled summary to the user using Choose A/B/C:
══════════════════════════════════════
PRE-RELEASE GATE
══════════════════════════════════════
Target env: {env}
Target version: {version} ({git-sha})
Rollback target: {previous-version}
Changes: N tickets, M components
- {summary list}
Open risks: {summary or "none"}
Blocking issues: {summary or "none"}
══════════════════════════════════════
A) Proceed to Strategy Select
B) Abort — fix blocking issue and re-invoke
C) Edit release scope — exclude a ticket and reassemble
══════════════════════════════════════
If A → write Phase 1 section to release report, proceed. If B → write Aborted verdict to release report with reason, exit. If C → loop back into Phase 1 with edited scope.
Phase 2: Strategy Select
Goal: Pick the deployment strategy that fits the change risk and environment capability.
Read environment_strategy.md and deployment_procedures.md to learn which strategies the target env supports. Strategies and when each is appropriate:
| Strategy | When to pick | Risk if wrong |
|---|---|---|
| all-at-once | Internal tools, low traffic, well-rehearsed change, env supports nothing else | All users hit the new version simultaneously — bug blast radius is 100% |
| blue-green | Stateless services with a load balancer, env has dual-stack capability | Cutover is binary — observability must be ready to detect issues fast |
| canary | Customer-facing, traffic-tier load balancer in place, gradual rollout possible | Canary metric thresholds must be well-tuned or canary fails for harmless reasons |
| manual | Non-automatable env (one-off VMs, regulated infrastructure, non-Docker host) | The whole release becomes a runbook and the watch window phases are operator-driven; the release skill records but does not execute |
Recommend a default based on:
- Risk level inferred from change summary (any breaking change → bias toward canary or blue-green)
- Restrictions (e.g., regulatory rules forcing manual approval at each step)
- Environment capability (some envs may only support all-at-once)
BLOCKING gate: Choose A/B/C/D between strategies. Record the choice in the release report.
Phase 3: Execute
Goal: Actually run the deploy. Capture exit code and full stdout/stderr.
- Validate environment file (
.env) exists, all required vars from.env.exampleare set, no placeholder secrets remain. - Source the env file and run
scripts/deploy.shagainst the target host. The script produced by/deployStep 7 is the point of execution; do NOT bypass it. If a strategy-specific flag is needed (e.g.,--canary 5%), pass it through. - Stream stdout/stderr to the release report, with timestamps, in a fenced code block under
## Phase 3: Execute. - Capture exit code.
- AUTO-ROLLBACK trigger: non-zero exit code → immediately invoke Phase 6 with verdict
Rolled-Back: deploy script failure. Do NOT continue to Phase 4.
If deploy.sh emits no output for more than the configured idle threshold (default 5 minutes; check deployment_procedures.md for an explicit value), treat it as hung — capture a snapshot of what's running on the target, kill the script, and AUTO-ROLLBACK with reason Deploy hung — manual investigation required.
Manual strategy: if Phase 2 picked manual, write a checklist of operator steps from deployment_procedures.md to the release report and pause until the user types done or failed. Phase 3 then records the user's report verbatim.
Phase 4: Smoke Test
Goal: Verify the new version is actually serving traffic correctly in the target environment.
- Resolve the smoke-test command from
_docs/02_document/tests/blackbox-tests.md§ Production Smoke Tests, OR delegate to/test-runin--prod-smokemode against the target environment. - The smoke-test set must (a) hit each public endpoint of each component, (b) include at least one read AND one write per public endpoint where applicable, and (c) complete in under 5 minutes total.
- Capture pass/fail per case to the release report.
- AUTO-ROLLBACK trigger: any smoke-test failure → invoke Phase 6 with verdict
Rolled-Back: smoke test failure: <test-name>.
If smoke tests are missing for the target environment (no production-mode test set), STOP — write a leftover entry to _docs/_process_leftovers/ per tracker.mdc, do not proceed to watch window without smoke coverage. Write Aborted: smoke tests missing for prod-mode target and ASK the user.
Phase 5: Watch Window
Goal: Observe the live system for a defined window to catch latent regressions.
- Read
observability.mdfor the project's metrics, dashboards, and threshold definitions. Required watch metrics for any production target (per cursor-meta convention) include error rate, request rate, p99 latency, and saturation (CPU/memory/queue-depth). - Compute the watch-window duration from
deployment_procedures.md. If unspecified, default to 15 minutes for staging and 60 minutes for production. - Poll the observability backend at 1-minute intervals (or the configured cadence). For each interval, record metric snapshots to the release report.
- Threshold rules:
- Hard breach (auto-rollback): error-rate ≥ 2× baseline, p99 latency ≥ 3× baseline, any health-check failure persisting for 2 consecutive intervals.
- Soft breach (escalate): metric drift between 1.5× and 2× baseline, single-interval health blip, queue-depth steady but elevated.
- No data (escalate): if metrics are not flowing within the first 3 minutes, treat the absence as a hard breach — observability is itself broken.
- AUTO-ROLLBACK trigger: hard breach at any interval. Move to Phase 6 with verdict
Rolled-Back: <metric> breached <multiplier>× baseline at T+<minutes>. - ESCALATE trigger: soft breach. Pause polling, surface the metric, and ask the user A/B/C:
- A) Continue watch — accept current drift, keep polling
- B) Roll back now — treat soft drift as hard
- C) Extend watch window by N minutes
- End of watch window with no breach → proceed to Phase 6.
The watch window cannot be skipped. If the user explicitly demands skipping (e.g., emergency rollforward), record the override reason in the release report and continue, but mark the verdict as Released-with-override — this triggers an automatic incident retrospective per retrospective/SKILL.md.
Phase 6: Commit or Rollback
Goal: Finalize the release with a definitive verdict on disk.
Path A — Commit (clean release):
- Update tracker tickets: every ticket in scope moves to
Released(orDone, per project convention defined intracker.mdc/_docs/_repo-config.yaml). - Tag the git HEAD with
release/<version>(or the project's tag convention fromdeployment_procedures.md). - Write the final
Releasedverdict to the release report with a summary table. - Trigger
/retrospective --cycle-endwith this release as the cycle terminus. - Auto-chain to autodev's next step (Retrospective in greenfield, or feature-cycle loop start in existing-code).
Path B — Rollback (auto-fired or user-elected):
- Run
scripts/deploy.sh --rollbackwith the rollback target captured in Phase 1. - Stream output to a new file
RELEASE_DIR/rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.mdAND append a summary to the original release report under## Rollback. - Re-run Phase 4 (smoke test) and a 5-minute mini watch window against the rolled-back version. If THAT also fails, escalate immediately — the system is in an unknown state and needs human takeover.
- Update tracker tickets back to
Ready for Release(or the project's pre-release status). - Write the final
Rolled-Backverdict with full reason chain. - Auto-trigger
/retrospective --incidentwith this release as the incident anchor (perretrospective/SKILL.mdincident mode). - Do NOT auto-chain to anything else — the user owns the next step.
Path C — Aborted:
Reached only via Phase 1 Choose B, Phase 4 smoke-tests-missing escalation, or any phase that detects a precondition violation. Write Aborted: <reason> to the release report. Do not auto-chain.
Self-verification
- Release report exists at
RELEASE_DIR/release_<version>_<env>_<timestamp>.mdwith verdict (Released / Rolled-Back / Aborted) - Every phase that ran has a section in the release report with timestamps and tool output
- On Released: tracker tickets moved to release status; git tag pushed (if convention)
- On Rolled-Back: rollback report exists at
RELEASE_DIR/rollback_<version>_<env>_<timestamp>.md; tracker tickets moved back to pre-release status; incident retrospective scheduled - On Aborted: reason recorded; no live-system changes attempted; no tracker movement
- No phase was skipped without an explicit reason recorded in the release report
Escalation Rules
| Situation | Action |
|---|---|
scripts/deploy.sh missing or --rollback unsupported |
STOP — return to /deploy Step 7, do not patch the script in /release |
| Registry auth failure during pre-release | STOP — fix credentials at infra layer (per coderule.mdc); do not embed creds in the script |
| Smoke tests missing for prod target | STOP — write a leftover; do not improvise smoke tests in /release |
| Observability backend unreachable | STOP — observability blindness is itself a release blocker |
| User asks to skip the watch window | Record override, mark verdict Released-with-override, fire incident retro |
| Rollback also fails its smoke test | ESCALATE to user — system is in unknown state; do not loop deploys |
| Tracker MCP returns Unauthorized during ticket movement | Per tracker.mdc, write a leftover entry; do NOT silently continue without confirming the move |
| Multiple environments named in user request | STOP — one release per invocation; ask user to pick one |
| Production smoke test would touch real customer data | STOP — that is a coderule.mdc violation; ask user to define a smoke endpoint or test account |
Common Mistakes
- Skipping the watch window when "everything looks fine after deploy" — a deploy that exited 0 is not a release that's stable. Watch is mandatory.
- Faking smoke tests to pass the gate when the prod test set is incomplete. STOP and surface the gap; do not embed prod URLs into ad-hoc curl commands.
- Rolling forward through a failure ("the next deploy will fix it"). Roll back first, fix the cause, then deploy a real fix.
- Treating the release report as optional when only an internal tool changed. Every release writes a report — the audit trail is the value, not the prose volume.
- Approving manual gates yourself without the user's input when restrictions require human approval. The release skill records, the human approves.
- Reusing
release_<version>filenames across attempted releases. Always include the timestamp in the filename so re-attempts are visible side-by-side. - Letting tracker drift silently between release attempts. If Phase 6 cannot move tickets, the release is not complete — write a leftover and stop.
Project Mode vs Standalone
- Project mode (default): autodev invokes
/releaseafter/deploy. State writes occur under_docs/_autodev_state.md. Full integration with retrospective and feature-cycle loop. - Standalone mode:
/releaseinvoked directly with@<artifact>(rare; usually only for re-running a rollback against a specific version). All outputs still go toRELEASE_DIR/.
Methodology Quick Reference
┌────────────────────────────────────────────────────────────────┐
│ Release (6 phases, 3 verdicts) │
├────────────────────────────────────────────────────────────────┤
│ Phase 1 Pre-Release Gate │
│ AC + tests + change summary + rollback path │
│ [BLOCKING — user A/B/C] │
│ Phase 2 Strategy Select │
│ all-at-once · blue-green · canary · manual │
│ [BLOCKING — user picks] │
│ Phase 3 Execute │
│ scripts/deploy.sh, capture exit code + logs │
│ [AUTO-ROLLBACK on non-zero or hang] │
│ Phase 4 Smoke Test │
│ /test-run --prod-smoke against target │
│ [AUTO-ROLLBACK on any failure] │
│ Phase 5 Watch Window │
│ Poll observability for N minutes │
│ [AUTO-ROLLBACK on hard breach; escalate on soft] │
│ Phase 6 Commit or Rollback │
│ Released → tracker, tag, retrospective │
│ Rolled-Back → tracker reset, incident retrospective │
│ Aborted → no live-system change │
├────────────────────────────────────────────────────────────────┤
│ Principles: real execution · verifiable rollback · │
│ quiet failure = release failure · │
│ watch window mandatory │
└────────────────────────────────────────────────────────────────┘