gps-denied-onboard/.cursor/skills/release/SKILL.md

---
name: release
description: |
  Executes the deployment plan produced by /deploy against a target environment.
  Closes the loop between "we have a plan" and "the new version is running in production with a verdict on disk."
  6-phase workflow: pre-release gate, strategy select, execute, smoke test, watch window, commit-or-rollback.
  Outputs _docs/04_release/release_<version>.md with a definitive Released / Rolled-Back / Aborted verdict.
  Trigger phrases:
  - "release", "ship", "go live", "release this version"
  - "deploy to prod", "promote to staging", "roll out"
  - "rollback", "abort the release"
category: ship
tags: [release, deployment, rollback, smoke-test, observability, production]
disable-model-invocation: true
---

# Release Execution

The `/deploy` skill produces a plan and scripts. The `/release` skill **runs** them, verifies the live system, watches it for a defined window, and produces a definitive verdict on disk.

## Core Principles

- **Real execution, not simulation**: every phase must actually run against the target environment. If a phase cannot be executed (missing scripts, no SSH access, disabled secrets, registry auth failure), STOP — do not pretend a step succeeded. See `meta-rule.mdc` § "Real Results, Not Simulated Ones".
- **Verifiable rollback path**: the release does not start until rollback is proven viable for this version. "We can roll back" without evidence is not a rollback path.
- **Quiet failure is a release failure**: a deploy script that exits 0 but emits no observable signal in the watch window is treated as a regression, not a success.
- **One release per invocation**: a single `/release` execution targets exactly one version against exactly one environment. Multi-stage promotion (staging → prod) is two invocations, not one.
- **Never skip the watch window**: even successful deploys can degrade after 5–60 minutes (cache warm-up, scheduled jobs, downstream backpressure). The watch window is mandatory.
- **Autonomous rollback on hard regressions**: critical health-check failure, error-rate spike above threshold, or smoke-test failure → automatic rollback. Soft regressions (latency drift, capacity warnings) escalate to the user.

## Context Resolution

Fixed paths:

- DEPLOY_DIR: `_docs/04_deploy/`
- RELEASE_DIR: `_docs/04_release/`
- SCRIPTS_DIR: `scripts/`
- DEPLOY_SCRIPT: `scripts/deploy.sh`
- HEALTH_SCRIPT: `scripts/health-check.sh`
- ENV_TEMPLATE: `.env.example`
- OBSERVABILITY_DOC: `_docs/04_deploy/observability.md`
- ENVIRONMENT_DOC: `_docs/04_deploy/environment_strategy.md`
- PROCEDURES_DOC: `_docs/04_deploy/deployment_procedures.md`
- ARCHITECTURE: `_docs/02_document/architecture.md`
- RESTRICTIONS: `_docs/00_problem/restrictions.md`

Announce the resolved paths and the **target environment + version + strategy** to the user before any phase that touches the live system.

## Inputs (BLOCKING prerequisites)

| Input | Required | Source |
|-------|----------|--------|
| Target environment | Yes — ASK user | `environment_strategy.md` enumerates valid options |
| Target version / image tag | Yes — ASK user | Must exist in the registry; verified in Phase 1 |
| Rollback target version | Yes — ASK user | Defaults to currently-deployed version if discoverable |
| `scripts/deploy.sh` | Yes | Produced by `/deploy` Step 7. STOP if missing → run `/deploy` first |
| `scripts/health-check.sh` | Yes | Same |
| `_docs/04_deploy/deployment_procedures.md` | Yes | Defines per-environment runbook, manual approval rules, change-window restrictions |
| `_docs/04_deploy/observability.md` | Yes | Defines watch metrics, thresholds, and dashboards |
| `_docs/04_deploy/environment_strategy.md` | Yes | Defines target hostnames, registries, secrets, deploy strategy per env |

## Outputs

```
RELEASE_DIR/
├── release_<version>_<env>_<YYYY-MM-DD-HHmm>.md       (mandatory; one per invocation)
├── rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md      (only when rollback fires; pairs with the release file)
└── manual_approvals/
    └── approval_<version>_<env>.md                     (when restrictions require manual approval, written before Phase 3)
```

The release report (`templates/release-report.md`) is appended to as each phase completes — it is durable across phase failures and reflects partial progress so the next operator can resume or audit.

## Phases

```
┌────────────────────────────────────────────────────────────────┐
│             Release Execution (6-Phase Method)                  │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: deploy artifacts on disk; tests green at HEAD          │
│                                                                │
│ 1. Pre-Release Gate     → AC + change summary + readiness      │
│    [BLOCKING: user confirms or aborts]                         │
│ 2. Strategy Select      → all-at-once / blue-green / canary    │
│    [BLOCKING: user picks strategy]                             │
│ 3. Execute              → run deploy.sh, capture exit + logs   │
│    [AUTO-ROLLBACK on non-zero exit]                            │
│ 4. Smoke Test           → /test-run prod-smoke in target env   │
│    [AUTO-ROLLBACK on failure]                                  │
│ 5. Watch Window         → poll observability for N minutes     │
│    [AUTO-ROLLBACK on hard threshold breach]                    │
│ 6. Commit or Rollback   → finalize verdict, update tracker     │
│    [BLOCKING: user confirms only if soft regression escalated] │
├────────────────────────────────────────────────────────────────┤
│ Verdicts: Released · Rolled-Back · Aborted                     │
└────────────────────────────────────────────────────────────────┘
```

### Phase 1: Pre-Release Gate

**Goal**: Refuse to start if the system is not ready for a real release.

1. **Acceptance criteria check**: read `_docs/00_problem/acceptance_criteria.md`. If any AC is marked unmet OR if any AC has no associated test marked `Passed` in the latest `test-run` report, STOP and surface the unmet items. Do not let the user override with "ship anyway" without a recorded reason in the release report.
2. **Test status check**: read the most recent `_docs/06_metrics/perf_*.md` (if perf is required by restrictions) and the latest functional test report. Any failing or skipped test that maps to a critical-path AC blocks the release.
3. **Change summary**: read the git log between the version-tag-of-last-release and HEAD (or, if no prior release exists, from the project root commit). Render a short list grouped by component: features, fixes, breaking changes, security fixes. Cross-reference against the latest implementation reports under `_docs/03_implementation/`.
4. **Rollback readiness**:
   - Confirm the previous version's image is still pullable from the registry (do not deploy without this).
   - Confirm `scripts/deploy.sh --rollback` works as documented (read the script; if `--rollback` flag is missing, STOP — that is a deploy-skill bug).
   - Confirm a rollback target exists (e.g., previously-deployed image tag) and is recorded in the release report under `Rollback Plan`.
5. **Restrictions**: read `_docs/00_problem/restrictions.md` for change-window rules, manual-approval rules, blackout windows, regulatory requirements (e.g., 4-eyes review, ITAR controls). If any apply, gate accordingly — write a `manual_approvals/approval_<version>_<env>.md` file once received.
6. **Tracker check**: list tracker tickets in the release scope (per `tracker.mdc` rules). Any ticket still in `In Progress` or `Code Review` that maps to a change in the release scope blocks Phase 1. Move-and-deploy is not allowed.

**BLOCKING gate**: present the assembled summary to the user using Choose A/B/C:

```
══════════════════════════════════════
 PRE-RELEASE GATE
══════════════════════════════════════
 Target env:        {env}
 Target version:    {version} ({git-sha})
 Rollback target:   {previous-version}
 Changes:           N tickets, M components
   - {summary list}
 Open risks:        {summary or "none"}
 Blocking issues:   {summary or "none"}
══════════════════════════════════════
 A) Proceed to Strategy Select
 B) Abort — fix blocking issue and re-invoke
 C) Edit release scope — exclude a ticket and reassemble
══════════════════════════════════════
```

If A → write Phase 1 section to release report, proceed. If B → write `Aborted` verdict to release report with reason, exit. If C → loop back into Phase 1 with edited scope.

### Phase 2: Strategy Select

**Goal**: Pick the deployment strategy that fits the change risk and environment capability.

Read `environment_strategy.md` and `deployment_procedures.md` to learn which strategies the target env supports. Strategies and when each is appropriate:

| Strategy | When to pick | Risk if wrong |
|----------|--------------|---------------|
| **all-at-once** | Internal tools, low traffic, well-rehearsed change, env supports nothing else | All users hit the new version simultaneously — bug blast radius is 100% |
| **blue-green** | Stateless services with a load balancer, env has dual-stack capability | Cutover is binary — observability must be ready to detect issues fast |
| **canary** | Customer-facing, traffic-tier load balancer in place, gradual rollout possible | Canary metric thresholds must be well-tuned or canary fails for harmless reasons |
| **manual** | Non-automatable env (one-off VMs, regulated infrastructure, non-Docker host) | The whole release becomes a runbook and the watch window phases are operator-driven; the release skill records but does not execute |

Recommend a default based on:
- Risk level inferred from change summary (any breaking change → bias toward canary or blue-green)
- Restrictions (e.g., regulatory rules forcing manual approval at each step)
- Environment capability (some envs may only support all-at-once)

**BLOCKING gate**: Choose A/B/C/D between strategies. Record the choice in the release report.

### Phase 3: Execute

**Goal**: Actually run the deploy. Capture exit code and full stdout/stderr.

1. Validate environment file (`.env`) exists, all required vars from `.env.example` are set, no placeholder secrets remain.
2. Source the env file and run `scripts/deploy.sh` against the target host. The script produced by `/deploy` Step 7 is the point of execution; do NOT bypass it. If a strategy-specific flag is needed (e.g., `--canary 5%`), pass it through.
3. Stream stdout/stderr to the release report, with timestamps, in a fenced code block under `## Phase 3: Execute`.
4. Capture exit code.
5. **AUTO-ROLLBACK trigger**: non-zero exit code → immediately invoke Phase 6 with verdict `Rolled-Back: deploy script failure`. Do NOT continue to Phase 4.

If `deploy.sh` emits no output for more than the configured idle threshold (default 5 minutes; check `deployment_procedures.md` for an explicit value), treat it as hung — capture a snapshot of what's running on the target, kill the script, and AUTO-ROLLBACK with reason `Deploy hung — manual investigation required`.

**Manual strategy**: if Phase 2 picked `manual`, write a checklist of operator steps from `deployment_procedures.md` to the release report and pause until the user types `done` or `failed`. Phase 3 then records the user's report verbatim.

### Phase 4: Smoke Test

**Goal**: Verify the new version is *actually serving traffic correctly* in the target environment.

1. Resolve the smoke-test command from `_docs/02_document/tests/blackbox-tests.md` § Production Smoke Tests, OR delegate to `/test-run` in `--prod-smoke` mode against the target environment.
2. The smoke-test set must (a) hit each public endpoint of each component, (b) include at least one read AND one write per public endpoint where applicable, and (c) complete in under 5 minutes total.
3. Capture pass/fail per case to the release report.
4. **AUTO-ROLLBACK trigger**: any smoke-test failure → invoke Phase 6 with verdict `Rolled-Back: smoke test failure: <test-name>`.

If smoke tests are **missing** for the target environment (no production-mode test set), STOP — write a leftover entry to `_docs/_process_leftovers/` per `tracker.mdc`, do not proceed to watch window without smoke coverage. Write `Aborted: smoke tests missing for prod-mode target` and ASK the user.

### Phase 5: Watch Window

**Goal**: Observe the live system for a defined window to catch latent regressions.

1. Read `observability.md` for the project's metrics, dashboards, and threshold definitions. Required watch metrics for any production target (per cursor-meta convention) include error rate, request rate, p99 latency, and saturation (CPU/memory/queue-depth).
2. Compute the watch-window duration from `deployment_procedures.md`. If unspecified, default to **15 minutes** for staging and **60 minutes** for production.
3. Poll the observability backend at 1-minute intervals (or the configured cadence). For each interval, record metric snapshots to the release report.
4. Threshold rules:
   - **Hard breach** (auto-rollback): error-rate ≥ 2× baseline, p99 latency ≥ 3× baseline, any health-check failure persisting for 2 consecutive intervals.
   - **Soft breach** (escalate): metric drift between 1.5× and 2× baseline, single-interval health blip, queue-depth steady but elevated.
   - **No data** (escalate): if metrics are not flowing within the first 3 minutes, treat the absence as a hard breach — observability is itself broken.
5. **AUTO-ROLLBACK trigger**: hard breach at any interval. Move to Phase 6 with verdict `Rolled-Back: <metric> breached <multiplier>× baseline at T+<minutes>`.
6. **ESCALATE trigger**: soft breach. Pause polling, surface the metric, and ask the user A/B/C:
   - A) Continue watch — accept current drift, keep polling
   - B) Roll back now — treat soft drift as hard
   - C) Extend watch window by N minutes
7. End of watch window with no breach → proceed to Phase 6.

The watch window cannot be skipped. If the user explicitly demands skipping (e.g., emergency rollforward), record the override reason in the release report and continue, but mark the verdict as `Released-with-override` — this triggers an automatic incident retrospective per `retrospective/SKILL.md`.

### Phase 6: Commit or Rollback

**Goal**: Finalize the release with a definitive verdict on disk.

**Path A — Commit (clean release)**:
1. Update tracker tickets: every ticket in scope moves to `Released` (or `Done`, per project convention defined in `tracker.mdc` / `_docs/_repo-config.yaml`).
2. Tag the git HEAD with `release/<version>` (or the project's tag convention from `deployment_procedures.md`).
3. Write the final `Released` verdict to the release report with a summary table.
4. Trigger `/retrospective --cycle-end` with this release as the cycle terminus.
5. Auto-chain to autodev's next step (Retrospective in greenfield, or feature-cycle loop start in existing-code).

**Path B — Rollback (auto-fired or user-elected)**:
1. Run `scripts/deploy.sh --rollback` with the rollback target captured in Phase 1.
2. Stream output to a new file `RELEASE_DIR/rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md` AND append a summary to the original release report under `## Rollback`.
3. Re-run Phase 4 (smoke test) and a 5-minute mini watch window against the rolled-back version. If THAT also fails, escalate immediately — the system is in an unknown state and needs human takeover.
4. Update tracker tickets back to `Ready for Release` (or the project's pre-release status).
5. Write the final `Rolled-Back` verdict with full reason chain.
6. Auto-trigger `/retrospective --incident` with this release as the incident anchor (per `retrospective/SKILL.md` incident mode).
7. Do NOT auto-chain to anything else — the user owns the next step.

**Path C — Aborted**:
Reached only via Phase 1 Choose B, Phase 4 smoke-tests-missing escalation, or any phase that detects a precondition violation. Write `Aborted: <reason>` to the release report. Do not auto-chain.

## Self-verification

- [ ] Release report exists at `RELEASE_DIR/release_<version>_<env>_<timestamp>.md` with verdict (Released / Rolled-Back / Aborted)
- [ ] Every phase that ran has a section in the release report with timestamps and tool output
- [ ] On Released: tracker tickets moved to release status; git tag pushed (if convention)
- [ ] On Rolled-Back: rollback report exists at `RELEASE_DIR/rollback_<version>_<env>_<timestamp>.md`; tracker tickets moved back to pre-release status; incident retrospective scheduled
- [ ] On Aborted: reason recorded; no live-system changes attempted; no tracker movement
- [ ] No phase was skipped without an explicit reason recorded in the release report

## Escalation Rules

| Situation | Action |
|-----------|--------|
| `scripts/deploy.sh` missing or `--rollback` unsupported | STOP — return to `/deploy` Step 7, do not patch the script in `/release` |
| Registry auth failure during pre-release | STOP — fix credentials at infra layer (per `coderule.mdc`); do not embed creds in the script |
| Smoke tests missing for prod target | STOP — write a leftover; do not improvise smoke tests in `/release` |
| Observability backend unreachable | STOP — observability blindness is itself a release blocker |
| User asks to skip the watch window | Record override, mark verdict `Released-with-override`, fire incident retro |
| Rollback also fails its smoke test | ESCALATE to user — system is in unknown state; do not loop deploys |
| Tracker MCP returns Unauthorized during ticket movement | Per `tracker.mdc`, write a leftover entry; do NOT silently continue without confirming the move |
| Multiple environments named in user request | STOP — one release per invocation; ask user to pick one |
| Production smoke test would touch real customer data | STOP — that is a `coderule.mdc` violation; ask user to define a smoke endpoint or test account |

## Common Mistakes

- **Skipping the watch window when "everything looks fine after deploy"** — a deploy that exited 0 is not a release that's stable. Watch is mandatory.
- **Faking smoke tests** to pass the gate when the prod test set is incomplete. STOP and surface the gap; do not embed prod URLs into ad-hoc curl commands.
- **Rolling forward through a failure** ("the next deploy will fix it"). Roll back first, fix the cause, then deploy a real fix.
- **Treating the release report as optional** when only an internal tool changed. Every release writes a report — the audit trail is the value, not the prose volume.
- **Approving manual gates yourself** without the user's input when restrictions require human approval. The release skill records, the human approves.
- **Reusing `release_<version>` filenames** across attempted releases. Always include the timestamp in the filename so re-attempts are visible side-by-side.
- **Letting tracker drift silently** between release attempts. If Phase 6 cannot move tickets, the release is not complete — write a leftover and stop.

## Project Mode vs Standalone

- **Project mode** (default): autodev invokes `/release` after `/deploy`. State writes occur under `_docs/_autodev_state.md`. Full integration with retrospective and feature-cycle loop.
- **Standalone mode**: `/release` invoked directly with `@<artifact>` (rare; usually only for re-running a rollback against a specific version). All outputs still go to `RELEASE_DIR/`.

## Methodology Quick Reference

```
┌────────────────────────────────────────────────────────────────┐
│                Release (6 phases, 3 verdicts)                  │
├────────────────────────────────────────────────────────────────┤
│ Phase 1  Pre-Release Gate                                      │
│           AC + tests + change summary + rollback path          │
│           [BLOCKING — user A/B/C]                              │
│ Phase 2  Strategy Select                                       │
│           all-at-once · blue-green · canary · manual           │
│           [BLOCKING — user picks]                              │
│ Phase 3  Execute                                               │
│           scripts/deploy.sh, capture exit code + logs          │
│           [AUTO-ROLLBACK on non-zero or hang]                  │
│ Phase 4  Smoke Test                                            │
│           /test-run --prod-smoke against target                │
│           [AUTO-ROLLBACK on any failure]                       │
│ Phase 5  Watch Window                                          │
│           Poll observability for N minutes                     │
│           [AUTO-ROLLBACK on hard breach; escalate on soft]     │
│ Phase 6  Commit or Rollback                                    │
│           Released → tracker, tag, retrospective               │
│           Rolled-Back → tracker reset, incident retrospective  │
│           Aborted → no live-system change                      │
├────────────────────────────────────────────────────────────────┤
│ Principles: real execution · verifiable rollback ·             │
│             quiet failure = release failure ·                  │
│             watch window mandatory                             │
└────────────────────────────────────────────────────────────────┘
```