mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-06-21 08:51:12 +00:00

Files

T

Oleksandr Bezdieniezhnykh 6044a33197 chore: WIP pre-implement

Bundled hygiene commit before cycle-3 /implement (AZ-776, AZ-777). Mixes
two concerns by user choice (autodev option B):

- Cycle-3 autodev artifacts not yet committed by Step 9 (new-task):
  task specs for AZ-776 / AZ-777 under _docs/02_tasks/todo/ and the
  updated _docs/02_tasks/_dependencies_table.md.
- Accumulated skill / rule tooling maintenance under .cursor/ (skills:
  autodev, code-review, decompose, deploy, implement, new-task, plan,
  refactor, retrospective, test-spec; rules: coderule, cursor-meta,
  meta-rule, testing; new release skill scaffolding).
- Autodev bootstrap state: _docs/_autodev_state.md (step 10 in_progress)
  and _docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md
  (replay timestamp refreshed; gtsam 4.2 still numpy<2-only).

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-21 13:14:11 +03:00

22 KiB

Raw Blame History

name, description, category, tags, disable-model-invocation

name

description

Release Execution

The /deploy skill produces a plan and scripts. The /release skill runs them, verifies the live system, watches it for a defined window, and produces a definitive verdict on disk.

Core Principles

Real execution, not simulation: every phase must actually run against the target environment. If a phase cannot be executed (missing scripts, no SSH access, disabled secrets, registry auth failure), STOP — do not pretend a step succeeded. See meta-rule.mdc § "Real Results, Not Simulated Ones".
Verifiable rollback path: the release does not start until rollback is proven viable for this version. "We can roll back" without evidence is not a rollback path.
Quiet failure is a release failure: a deploy script that exits 0 but emits no observable signal in the watch window is treated as a regression, not a success.
One release per invocation: a single /release execution targets exactly one version against exactly one environment. Multi-stage promotion (staging → prod) is two invocations, not one.
Never skip the watch window: even successful deploys can degrade after 5–60 minutes (cache warm-up, scheduled jobs, downstream backpressure). The watch window is mandatory.
Autonomous rollback on hard regressions: critical health-check failure, error-rate spike above threshold, or smoke-test failure → automatic rollback. Soft regressions (latency drift, capacity warnings) escalate to the user.

Context Resolution

Fixed paths:

DEPLOY_DIR: _docs/04_deploy/
RELEASE_DIR: _docs/04_release/
SCRIPTS_DIR: scripts/
DEPLOY_SCRIPT: scripts/deploy.sh
HEALTH_SCRIPT: scripts/health-check.sh
ENV_TEMPLATE: .env.example
OBSERVABILITY_DOC: _docs/04_deploy/observability.md
ENVIRONMENT_DOC: _docs/04_deploy/environment_strategy.md
PROCEDURES_DOC: _docs/04_deploy/deployment_procedures.md
ARCHITECTURE: _docs/02_document/architecture.md
RESTRICTIONS: _docs/00_problem/restrictions.md

Announce the resolved paths and the target environment + version + strategy to the user before any phase that touches the live system.

Inputs (BLOCKING prerequisites)

Input	Required	Source
Target environment	Yes — ASK user	`environment_strategy.md` enumerates valid options
Target version / image tag	Yes — ASK user	Must exist in the registry; verified in Phase 1
Rollback target version	Yes — ASK user	Defaults to currently-deployed version if discoverable
`scripts/deploy.sh`	Yes	Produced by `/deploy` Step 7. STOP if missing → run `/deploy` first
`scripts/health-check.sh`	Yes	Same
`_docs/04_deploy/deployment_procedures.md`	Yes	Defines per-environment runbook, manual approval rules, change-window restrictions
`_docs/04_deploy/observability.md`	Yes	Defines watch metrics, thresholds, and dashboards
`_docs/04_deploy/environment_strategy.md`	Yes	Defines target hostnames, registries, secrets, deploy strategy per env

Outputs

RELEASE_DIR/
├── release_<version>_<env>_<YYYY-MM-DD-HHmm>.md       (mandatory; one per invocation)
├── rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md      (only when rollback fires; pairs with the release file)
└── manual_approvals/
    └── approval_<version>_<env>.md                     (when restrictions require manual approval, written before Phase 3)

The release report (templates/release-report.md) is appended to as each phase completes — it is durable across phase failures and reflects partial progress so the next operator can resume or audit.

Phases

┌────────────────────────────────────────────────────────────────┐
│             Release Execution (6-Phase Method)                  │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: deploy artifacts on disk; tests green at HEAD          │
│                                                                │
│ 1. Pre-Release Gate     → AC + change summary + readiness      │
│    [BLOCKING: user confirms or aborts]                         │
│ 2. Strategy Select      → all-at-once / blue-green / canary    │
│    [BLOCKING: user picks strategy]                             │
│ 3. Execute              → run deploy.sh, capture exit + logs   │
│    [AUTO-ROLLBACK on non-zero exit]                            │
│ 4. Smoke Test           → /test-run prod-smoke in target env   │
│    [AUTO-ROLLBACK on failure]                                  │
│ 5. Watch Window         → poll observability for N minutes     │
│    [AUTO-ROLLBACK on hard threshold breach]                    │
│ 6. Commit or Rollback   → finalize verdict, update tracker     │
│    [BLOCKING: user confirms only if soft regression escalated] │
├────────────────────────────────────────────────────────────────┤
│ Verdicts: Released · Rolled-Back · Aborted                     │
└────────────────────────────────────────────────────────────────┘

Phase 1: Pre-Release Gate

Goal: Refuse to start if the system is not ready for a real release.

Acceptance criteria check: read _docs/00_problem/acceptance_criteria.md. If any AC is marked unmet OR if any AC has no associated test marked Passed in the latest test-run report, STOP and surface the unmet items. Do not let the user override with "ship anyway" without a recorded reason in the release report.
Test status check: read the most recent _docs/06_metrics/perf_*.md (if perf is required by restrictions) and the latest functional test report. Any failing or skipped test that maps to a critical-path AC blocks the release.
Change summary: read the git log between the version-tag-of-last-release and HEAD (or, if no prior release exists, from the project root commit). Render a short list grouped by component: features, fixes, breaking changes, security fixes. Cross-reference against the latest implementation reports under _docs/03_implementation/.
Rollback readiness:
- Confirm the previous version's image is still pullable from the registry (do not deploy without this).
- Confirm scripts/deploy.sh --rollback works as documented (read the script; if --rollback flag is missing, STOP — that is a deploy-skill bug).
- Confirm a rollback target exists (e.g., previously-deployed image tag) and is recorded in the release report under Rollback Plan.
Restrictions: read _docs/00_problem/restrictions.md for change-window rules, manual-approval rules, blackout windows, regulatory requirements (e.g., 4-eyes review, ITAR controls). If any apply, gate accordingly — write a manual_approvals/approval_<version>_<env>.md file once received.
Tracker check: list tracker tickets in the release scope (per tracker.mdc rules). Any ticket still in In Progress or Code Review that maps to a change in the release scope blocks Phase 1. Move-and-deploy is not allowed.

BLOCKING gate: present the assembled summary to the user using Choose A/B/C:

══════════════════════════════════════
 PRE-RELEASE GATE
══════════════════════════════════════
 Target env:        {env}
 Target version:    {version} ({git-sha})
 Rollback target:   {previous-version}
 Changes:           N tickets, M components
   - {summary list}
 Open risks:        {summary or "none"}
 Blocking issues:   {summary or "none"}
══════════════════════════════════════
 A) Proceed to Strategy Select
 B) Abort — fix blocking issue and re-invoke
 C) Edit release scope — exclude a ticket and reassemble
══════════════════════════════════════

If A → write Phase 1 section to release report, proceed. If B → write Aborted verdict to release report with reason, exit. If C → loop back into Phase 1 with edited scope.

Phase 2: Strategy Select

Goal: Pick the deployment strategy that fits the change risk and environment capability.

Read environment_strategy.md and deployment_procedures.md to learn which strategies the target env supports. Strategies and when each is appropriate:

Strategy	When to pick	Risk if wrong
all-at-once	Internal tools, low traffic, well-rehearsed change, env supports nothing else	All users hit the new version simultaneously — bug blast radius is 100%
blue-green	Stateless services with a load balancer, env has dual-stack capability	Cutover is binary — observability must be ready to detect issues fast
canary	Customer-facing, traffic-tier load balancer in place, gradual rollout possible	Canary metric thresholds must be well-tuned or canary fails for harmless reasons
manual	Non-automatable env (one-off VMs, regulated infrastructure, non-Docker host)	The whole release becomes a runbook and the watch window phases are operator-driven; the release skill records but does not execute

Recommend a default based on:

Risk level inferred from change summary (any breaking change → bias toward canary or blue-green)
Restrictions (e.g., regulatory rules forcing manual approval at each step)
Environment capability (some envs may only support all-at-once)

BLOCKING gate: Choose A/B/C/D between strategies. Record the choice in the release report.

Phase 3: Execute

Goal: Actually run the deploy. Capture exit code and full stdout/stderr.

Validate environment file (.env) exists, all required vars from .env.example are set, no placeholder secrets remain.
Source the env file and run scripts/deploy.sh against the target host. The script produced by /deploy Step 7 is the point of execution; do NOT bypass it. If a strategy-specific flag is needed (e.g., --canary 5%), pass it through.
Stream stdout/stderr to the release report, with timestamps, in a fenced code block under ## Phase 3: Execute.
Capture exit code.
AUTO-ROLLBACK trigger: non-zero exit code → immediately invoke Phase 6 with verdict Rolled-Back: deploy script failure. Do NOT continue to Phase 4.

If deploy.sh emits no output for more than the configured idle threshold (default 5 minutes; check deployment_procedures.md for an explicit value), treat it as hung — capture a snapshot of what's running on the target, kill the script, and AUTO-ROLLBACK with reason Deploy hung — manual investigation required.

Manual strategy: if Phase 2 picked manual, write a checklist of operator steps from deployment_procedures.md to the release report and pause until the user types done or failed. Phase 3 then records the user's report verbatim.

Phase 4: Smoke Test

Goal: Verify the new version is actually serving traffic correctly in the target environment.

Resolve the smoke-test command from _docs/02_document/tests/blackbox-tests.md § Production Smoke Tests, OR delegate to /test-run in --prod-smoke mode against the target environment.
The smoke-test set must (a) hit each public endpoint of each component, (b) include at least one read AND one write per public endpoint where applicable, and (c) complete in under 5 minutes total.
Capture pass/fail per case to the release report.
AUTO-ROLLBACK trigger: any smoke-test failure → invoke Phase 6 with verdict Rolled-Back: smoke test failure: <test-name>.

If smoke tests are missing for the target environment (no production-mode test set), STOP — write a leftover entry to _docs/_process_leftovers/ per tracker.mdc, do not proceed to watch window without smoke coverage. Write Aborted: smoke tests missing for prod-mode target and ASK the user.

Phase 5: Watch Window

Goal: Observe the live system for a defined window to catch latent regressions.

Read observability.md for the project's metrics, dashboards, and threshold definitions. Required watch metrics for any production target (per cursor-meta convention) include error rate, request rate, p99 latency, and saturation (CPU/memory/queue-depth).
Compute the watch-window duration from deployment_procedures.md. If unspecified, default to 15 minutes for staging and 60 minutes for production.
Poll the observability backend at 1-minute intervals (or the configured cadence). For each interval, record metric snapshots to the release report.
Threshold rules:
- Hard breach (auto-rollback): error-rate ≥ 2× baseline, p99 latency ≥ 3× baseline, any health-check failure persisting for 2 consecutive intervals.
- Soft breach (escalate): metric drift between 1.5× and 2× baseline, single-interval health blip, queue-depth steady but elevated.
- No data (escalate): if metrics are not flowing within the first 3 minutes, treat the absence as a hard breach — observability is itself broken.
AUTO-ROLLBACK trigger: hard breach at any interval. Move to Phase 6 with verdict Rolled-Back: <metric> breached <multiplier>× baseline at T+<minutes>.
ESCALATE trigger: soft breach. Pause polling, surface the metric, and ask the user A/B/C:
- A) Continue watch — accept current drift, keep polling
- B) Roll back now — treat soft drift as hard
- C) Extend watch window by N minutes
End of watch window with no breach → proceed to Phase 6.

The watch window cannot be skipped. If the user explicitly demands skipping (e.g., emergency rollforward), record the override reason in the release report and continue, but mark the verdict as Released-with-override — this triggers an automatic incident retrospective per retrospective/SKILL.md.

Phase 6: Commit or Rollback

Goal: Finalize the release with a definitive verdict on disk.

Path A — Commit (clean release):

Update tracker tickets: every ticket in scope moves to Released (or Done, per project convention defined in tracker.mdc / _docs/_repo-config.yaml).
Tag the git HEAD with release/<version> (or the project's tag convention from deployment_procedures.md).
Write the final Released verdict to the release report with a summary table.
Trigger /retrospective --cycle-end with this release as the cycle terminus.
Auto-chain to autodev's next step (Retrospective in greenfield, or feature-cycle loop start in existing-code).

Path B — Rollback (auto-fired or user-elected):

Run scripts/deploy.sh --rollback with the rollback target captured in Phase 1.
Stream output to a new file RELEASE_DIR/rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md AND append a summary to the original release report under ## Rollback.
Re-run Phase 4 (smoke test) and a 5-minute mini watch window against the rolled-back version. If THAT also fails, escalate immediately — the system is in an unknown state and needs human takeover.
Update tracker tickets back to Ready for Release (or the project's pre-release status).
Write the final Rolled-Back verdict with full reason chain.
Auto-trigger /retrospective --incident with this release as the incident anchor (per retrospective/SKILL.md incident mode).
Do NOT auto-chain to anything else — the user owns the next step.

Path C — Aborted: Reached only via Phase 1 Choose B, Phase 4 smoke-tests-missing escalation, or any phase that detects a precondition violation. Write Aborted: <reason> to the release report. Do not auto-chain.

Self-verification

Release report exists at RELEASE_DIR/release_<version>_<env>_<timestamp>.md with verdict (Released / Rolled-Back / Aborted)
Every phase that ran has a section in the release report with timestamps and tool output
On Released: tracker tickets moved to release status; git tag pushed (if convention)
On Rolled-Back: rollback report exists at RELEASE_DIR/rollback_<version>_<env>_<timestamp>.md; tracker tickets moved back to pre-release status; incident retrospective scheduled
On Aborted: reason recorded; no live-system changes attempted; no tracker movement
No phase was skipped without an explicit reason recorded in the release report

Escalation Rules

Situation	Action
`scripts/deploy.sh` missing or `--rollback` unsupported	STOP — return to `/deploy` Step 7, do not patch the script in `/release`
Registry auth failure during pre-release	STOP — fix credentials at infra layer (per `coderule.mdc`); do not embed creds in the script
Smoke tests missing for prod target	STOP — write a leftover; do not improvise smoke tests in `/release`
Observability backend unreachable	STOP — observability blindness is itself a release blocker
User asks to skip the watch window	Record override, mark verdict `Released-with-override`, fire incident retro
Rollback also fails its smoke test	ESCALATE to user — system is in unknown state; do not loop deploys
Tracker MCP returns Unauthorized during ticket movement	Per `tracker.mdc`, write a leftover entry; do NOT silently continue without confirming the move
Multiple environments named in user request	STOP — one release per invocation; ask user to pick one
Production smoke test would touch real customer data	STOP — that is a `coderule.mdc` violation; ask user to define a smoke endpoint or test account

Common Mistakes

Skipping the watch window when "everything looks fine after deploy" — a deploy that exited 0 is not a release that's stable. Watch is mandatory.
Faking smoke tests to pass the gate when the prod test set is incomplete. STOP and surface the gap; do not embed prod URLs into ad-hoc curl commands.
Rolling forward through a failure ("the next deploy will fix it"). Roll back first, fix the cause, then deploy a real fix.
Treating the release report as optional when only an internal tool changed. Every release writes a report — the audit trail is the value, not the prose volume.
Approving manual gates yourself without the user's input when restrictions require human approval. The release skill records, the human approves.
Reusing release_<version> filenames across attempted releases. Always include the timestamp in the filename so re-attempts are visible side-by-side.
Letting tracker drift silently between release attempts. If Phase 6 cannot move tickets, the release is not complete — write a leftover and stop.

Project Mode vs Standalone

Project mode (default): autodev invokes /release after /deploy. State writes occur under _docs/_autodev_state.md. Full integration with retrospective and feature-cycle loop.
Standalone mode: /release invoked directly with @<artifact> (rare; usually only for re-running a rollback against a specific version). All outputs still go to RELEASE_DIR/.

Methodology Quick Reference

┌────────────────────────────────────────────────────────────────┐
│                Release (6 phases, 3 verdicts)                  │
├────────────────────────────────────────────────────────────────┤
│ Phase 1  Pre-Release Gate                                      │
│           AC + tests + change summary + rollback path          │
│           [BLOCKING — user A/B/C]                              │
│ Phase 2  Strategy Select                                       │
│           all-at-once · blue-green · canary · manual           │
│           [BLOCKING — user picks]                              │
│ Phase 3  Execute                                               │
│           scripts/deploy.sh, capture exit code + logs          │
│           [AUTO-ROLLBACK on non-zero or hang]                  │
│ Phase 4  Smoke Test                                            │
│           /test-run --prod-smoke against target                │
│           [AUTO-ROLLBACK on any failure]                       │
│ Phase 5  Watch Window                                          │
│           Poll observability for N minutes                     │
│           [AUTO-ROLLBACK on hard breach; escalate on soft]     │
│ Phase 6  Commit or Rollback                                    │
│           Released → tracker, tag, retrospective               │
│           Rolled-Back → tracker reset, incident retrospective  │
│           Aborted → no live-system change                      │
├────────────────────────────────────────────────────────────────┤
│ Principles: real execution · verifiable rollback ·             │
│             quiet failure = release failure ·                  │
│             watch window mandatory                             │
└────────────────────────────────────────────────────────────────┘

22 KiB Raw Blame History Unescape Escape