Update skills documentation to reflect changes in directory structure and terminology. Replace references to integration tests with blackbox tests across various SKILL.md files and templates. Revise paths in planning and deployment documentation to align with the updated _docs/02_document/ structure. Enhance clarity in task management processes and ensure consistency in terminology throughout the documentation.

2026-04-23 03:26:38 +00:00 · 2026-03-25 06:08:05 +02:00
parent e720a949a8
commit 1c6e8f47b1
67 changed files with 5624 additions and 3647 deletions
@@ -17,6 +17,17 @@ disable-model-invocation: true

 Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autopilot` once — the engine handles sequencing, transitions, and re-entry.

+## File Index
+
+| File | Purpose |
+|------|---------|
+| `flows/greenfield.md` | Detection rules, step table, and auto-chain rules for new projects |
+| `flows/existing-code.md` | Detection rules, step table, and auto-chain rules for existing codebases |
+| `state.md` | State file format, rules, re-entry protocol, session boundaries |
+| `protocols.md` | User interaction, Jira MCP auth, choice format, error handling, status summary |
+
+**On every invocation**: read all four files above before executing any logic.
+
 ## Core Principles

 - **Auto-chain**: when a skill completes, immediately start the next one — no pause between skills
@@ -24,250 +35,57 @@ Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Det
 - **State from disk**: all progress is persisted to `_docs/_autopilot_state.md` and cross-checked against `_docs/` folder structure
 - **Rich re-entry**: on every invocation, read the state file for full context before continuing
 - **Delegate, don't duplicate**: read and execute each sub-skill's SKILL.md; never inline their logic here
+- **Sound on pause**: follow `.cursor/rules/human-attention-sound.mdc` — play a notification sound before every pause that requires human input
+- **Minimize interruptions**: only ask the user when the decision genuinely cannot be resolved automatically
+- **Single project per workspace**: all `_docs/` paths are relative to workspace root; for monorepos, each service needs its own Cursor workspace

-## State File: `_docs/_autopilot_state.md`
+## Flow Resolution

-The autopilot persists its state to `_docs/_autopilot_state.md`. This file is the primary source of truth for re-entry. Folder scanning is the fallback when the state file doesn't exist.
+Determine which flow to use:

-### Format
+1. If workspace has source code files **and** `_docs/` does not exist → **existing-code flow** (Pre-Step detection)
+2. If `_docs/_autopilot_state.md` exists and records Document in `Completed Steps` → **existing-code flow**
+3. If `_docs/_autopilot_state.md` exists and `step: done` AND workspace contains source code → **existing-code flow** (completed project re-entry — loops to New Task)
+4. Otherwise → **greenfield flow**

-```markdown
-# Autopilot State
+After selecting the flow, apply its detection rules (first match wins) to determine the current step.

-## Current Step
-step: [0-5 or "done"]
-name: [Problem / Research / Plan / Decompose / Implement / Deploy / Done]
-status: [not_started / in_progress / completed]
-sub_step: [optional — sub-skill phase if interrupted mid-step, e.g. "Plan Step 3: Component Decomposition"]
+## Execution Loop

-## Completed Steps
-
-| Step | Name | Completed | Key Outcome |
-|------|------|-----------|-------------|
-| 0 | Problem | [date] | [one-line summary] |
-| 1 | Research | [date] | [N drafts, final approach summary] |
-| 2 | Plan | [date] | [N components, architecture summary] |
-| 3 | Decompose | [date] | [N tasks, total complexity points] |
-| 4 | Implement | [date] | [N batches, pass/fail summary] |
-| 5 | Deploy | [date] | [artifacts produced] |
-
-## Key Decisions
- [decision 1: e.g. "Tech stack: Python + Rust for perf-critical, Postgres DB"]
- [decision 2: e.g. "6 research rounds, final draft: solution_draft06.md"]
- [decision N]
-
-## Last Session
-date: [date]
-ended_at: [step name and phase]
-reason: [completed step / session boundary / user paused / context limit]
-notes: [any context for next session, e.g. "User asked to revisit risk assessment"]
-
-## Blockers
- [blocker 1, if any]
- [none]
-```
-
-### State File Rules
-
-1. **Create** the state file on the very first autopilot invocation (after state detection determines Step 0)
-2. **Update** the state file after every step completion, every session boundary, and every BLOCKING gate confirmation
-3. **Read** the state file as the first action on every invocation — before folder scanning
-4. **Cross-check**: after reading the state file, verify against actual `_docs/` folder contents. If they disagree (e.g., state file says Step 2 but `_docs/02_plans/architecture.md` already exists), trust the folder structure and update the state file to match
-5. **Never delete** the state file. It accumulates history across the entire project lifecycle
-
-## Execution Entry Point
-
-Every invocation of this skill follows the same sequence:
+Every invocation follows this sequence:

 ```
 1. Read _docs/_autopilot_state.md (if exists)
-2. Cross-check state file against _docs/ folder structure
-3. Resolve current step (state file + folder scan)
-4. Present Status Summary (from state file context)
-5. Enter Execution Loop:
-   a. Read and execute the current skill's SKILL.md
-   b. When skill completes → update state file
-   c. Re-detect next step
-   d. If next skill is ready → auto-chain (go to 5a with next skill)
-   e. If session boundary reached → update state file with session notes → suggest new conversation
-   f. If all steps done → update state file → report completion
+2. Read all File Index files above
+3. Cross-check state file against _docs/ folder structure (rules in state.md)
+4. Resolve flow (see Flow Resolution above)
+5. Resolve current step (detection rules from the active flow file)
+6. Present Status Summary (template in active flow file)
+7. Execute:
+   a. Delegate to current skill (see Skill Delegation below)
+   b. If skill returns FAILED → apply Skill Failure Retry Protocol (see protocols.md):
+      - Auto-retry the same skill (failure may be caused by missing user input or an environment issue)
+      - If 3 consecutive auto-retries fail → record in state file Blockers, warn user, stop auto-retry
+   c. When skill completes successfully → reset retry counter, update state file (rules in state.md)
+   d. Re-detect next step from the active flow's detection rules
+   e. If next skill is ready → auto-chain (go to 7a with next skill)
+   f. If session boundary reached → update state, suggest new conversation (rules in state.md)
+   g. If all steps done → update state → report completion
 ```

-## State Detection
-
-Read `_docs/_autopilot_state.md` first. If it exists and is consistent with the folder structure, use the `Current Step` from the state file. If the state file doesn't exist or is inconsistent, fall back to folder scanning.
-
-### Folder Scan Rules (fallback)
-
-Scan `_docs/` to determine the current workflow position. Check rules in order — first match wins.
-
-### Detection Rules
-
-**Step 0 — Problem Gathering**
-Condition: `_docs/00_problem/` does not exist, OR any of these are missing/empty:
- `problem.md`
- `restrictions.md`
- `acceptance_criteria.md`
- `input_data/` (must contain at least one file)
-
-Action: Read and execute `.cursor/skills/problem/SKILL.md`
-
---
-
-**Step 1 — Research (Initial)**
-Condition: `_docs/00_problem/` is complete AND `_docs/01_solution/` has no `solution_draft*.md` files
-
-Action: Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode A)
-
---
-
-**Step 1b — Research Decision**
-Condition: `_docs/01_solution/` contains `solution_draft*.md` files AND `_docs/01_solution/solution.md` does not exist AND `_docs/02_plans/architecture.md` does not exist
-
-Action: Present the current research state to the user:
- How many solution drafts exist
- Whether tech_stack.md and security_analysis.md exist
- One-line summary from the latest draft
-
-Then ask: **"Run another research round (Mode B assessment), or proceed to planning?"**
- If user wants another round → Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode B)
- If user wants to proceed → auto-chain to Step 2 (Plan)
-
---
-
-**Step 2 — Plan**
-Condition: `_docs/01_solution/` has `solution_draft*.md` files AND `_docs/02_plans/architecture.md` does not exist
-
-Action:
-1. The plan skill's Prereq 2 will rename the latest draft to `solution.md` — this is handled by the plan skill itself
-2. Read and execute `.cursor/skills/plan/SKILL.md`
-
-If `_docs/02_plans/` exists but is incomplete (has some artifacts but no `FINAL_report.md`), the plan skill's built-in resumability handles it.
-
---
-
-**Step 3 — Decompose**
-Condition: `_docs/02_plans/` contains `architecture.md` AND `_docs/02_plans/components/` has at least one component AND `_docs/02_tasks/` does not exist or has no task files (excluding `_dependencies_table.md`)
-
-Action: Read and execute `.cursor/skills/decompose/SKILL.md`
-
-If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it.
-
---
-
-**Step 4 — Implement**
-Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist
-
-Action: Read and execute `.cursor/skills/implement/SKILL.md`
-
-If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
-
---
-
-**Step 5 — Deploy**
-Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND `_docs/04_deploy/` does not exist or is incomplete
-
-Action: Read and execute `.cursor/skills/deploy/SKILL.md`
-
---
-
-**Done**
-Condition: `_docs/04_deploy/` contains all expected artifacts (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md)
-
-Action: Report project completion with summary.
-
-## Status Summary
-
-On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback).
-
-Format:
-
-```
-═══════════════════════════════════════════════════
- AUTOPILOT STATUS
-═══════════════════════════════════════════════════
- Step 0  Problem      [DONE / IN PROGRESS / NOT STARTED]
- Step 1  Research     [DONE (N drafts) / IN PROGRESS / NOT STARTED]
- Step 2  Plan         [DONE / IN PROGRESS / NOT STARTED]
- Step 3  Decompose    [DONE (N tasks) / IN PROGRESS / NOT STARTED]
- Step 4  Implement    [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED]
- Step 5  Deploy       [DONE / IN PROGRESS / NOT STARTED]
-═══════════════════════════════════════════════════
- Current step: [Step N — Name]
- Action: [what will happen next]
-═══════════════════════════════════════════════════
-```
-
-For re-entry (state file exists), also include:
- Key decisions from the state file's `Key Decisions` section
- Last session context from the `Last Session` section
- Any blockers from the `Blockers` section
-
-## Auto-Chain Rules
-
-After a skill completes, apply these rules:
-
-| Completed Step | Next Action |
-|---------------|-------------|
-| Problem Gathering | Auto-chain → Research (Mode A) |
-| Research (any round) | Auto-chain → Research Decision (ask user: another round or proceed?) |
-| Research Decision → proceed | Auto-chain → Plan |
-| Plan | Auto-chain → Decompose |
-| Decompose | **Session boundary** — suggest new conversation before Implement |
-| Implement | Auto-chain → Deploy |
-| Deploy | Report completion |
-
-### Session Boundary: Decompose → Implement
-
-After decompose completes, **do not auto-chain to implement**. Instead:
-
-1. Update state file: mark Decompose as completed, set current step to 4 (Implement) with status `not_started`
-2. Write `Last Session` section: `reason: session boundary`, `notes: Decompose complete, implementation ready`
-3. Present a summary: number of tasks, estimated batches, total complexity points
-4. Suggest: "Implementation is the longest phase and benefits from a fresh conversation context. Start a new conversation and type `/autopilot` to begin implementation."
-5. If the user insists on continuing in the same conversation, proceed.
-
-This is the only hard session boundary. All other transitions auto-chain.
-
 ## Skill Delegation

 For each step, the delegation pattern is:

-1. Update state file: set current step to `in_progress`, record `sub_step` if applicable
+1. Update state file: set `step` to the autopilot step number, set `status` to `in_progress`, set `sub_step` to the sub-skill's current internal step/phase, reset `retry_count: 0`
 2. Announce: "Starting [Skill Name]..."
 3. Read the skill file: `.cursor/skills/[name]/SKILL.md`
-4. Execute the skill's workflow exactly as written, including:
-   - All BLOCKING gates (present to user, wait for confirmation)
-   - All self-verification checklists
-   - All save actions
-   - All escalation rules
-5. When the skill's workflow is fully complete:
-   - Update state file: mark step as `completed`, record date, write one-line key outcome
-   - Add any key decisions made during this step to the `Key Decisions` section
-   - Return to the auto-chain rules
+4. Execute the skill's workflow exactly as written, including all BLOCKING gates, self-verification checklists, save actions, and escalation rules. Update `sub_step` in state each time the sub-skill advances.
+5. If the skill **fails**: follow the Skill Failure Retry Protocol in `protocols.md` — increment `retry_count`, auto-retry up to 3 times, then escalate.
+6. When complete (success): reset `retry_count: 0`, mark step `completed`, record date + key outcome, add key decisions to state file, return to auto-chain rules (from active flow file)

 Do NOT modify, skip, or abbreviate any part of the sub-skill's workflow. The autopilot is a sequencer, not an optimizer.

-## Re-Entry Protocol
-
-When the user invokes `/autopilot` and work already exists:
-
-1. Read `_docs/_autopilot_state.md`
-2. Cross-check against `_docs/` folder structure
-3. Present Status Summary with context from state file (key decisions, last session, blockers)
-4. If the detected step has a sub-skill with built-in resumability (plan, decompose, implement, deploy all do), the sub-skill handles mid-step recovery
-5. Continue execution from detected state
-
-## Error Handling
-
-| Situation | Action |
-|-----------|--------|
-| State detection is ambiguous (artifacts suggest two different steps) | Present findings to user, ask which step to execute |
-| Sub-skill fails or hits an unrecoverable blocker | Report the error, suggest the user fix it manually, then re-invoke `/autopilot` |
-| User wants to skip a step | Warn about downstream dependencies, proceed if user confirms |
-| User wants to go back to a previous step | Warn that re-running may overwrite artifacts, proceed if user confirms |
-| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution |
-
 ## Trigger Conditions

 This skill activates when the user wants to:
@@ -281,41 +99,9 @@ This skill activates when the user wants to:
 **Differentiation**:
 - User wants only research → use `/research` directly
 - User wants only planning → use `/plan` directly
+- User wants to document an existing codebase → use `/document` directly
 - User wants the full guided workflow → use `/autopilot`

-## Methodology Quick Reference
+## Flow Reference

-```
-┌────────────────────────────────────────────────────────────────┐
-│              Autopilot (Auto-Chain Orchestrator)                │
-├────────────────────────────────────────────────────────────────┤
-│ EVERY INVOCATION:                                              │
-│   1. State Detection (scan _docs/)                             │
-│   2. Status Summary (show progress)                            │
-│   3. Execute current skill                                     │
-│   4. Auto-chain to next skill (loop)                           │
-│                                                                │
-│ WORKFLOW:                                                       │
-│   Step 0  Problem    → .cursor/skills/problem/SKILL.md         │
-│     ↓ auto-chain                                               │
-│   Step 1  Research   → .cursor/skills/research/SKILL.md        │
-│     ↓ auto-chain (ask: another round?)                         │
-│   Step 2  Plan       → .cursor/skills/plan/SKILL.md            │
-│     ↓ auto-chain                                               │
-│   Step 3  Decompose  → .cursor/skills/decompose/SKILL.md       │
-│     ↓ SESSION BOUNDARY (suggest new conversation)              │
-│   Step 4  Implement  → .cursor/skills/implement/SKILL.md       │
-│     ↓ auto-chain                                               │
-│   Step 5  Deploy     → .cursor/skills/deploy/SKILL.md          │
-│     ↓                                                          │
-│   DONE                                                         │
-│                                                                │
-│ STATE FILE: _docs/_autopilot_state.md                          │
-│ FALLBACK: _docs/ folder structure scan                         │
-│ PAUSE POINTS: sub-skill BLOCKING gates only                    │
-│ SESSION BREAK: after Decompose (before Implement)              │
-├────────────────────────────────────────────────────────────────┤
-│ Principles: Auto-chain · State to file · Rich re-entry         │
-│             Delegate don't duplicate · Pause at decisions only  │
-└────────────────────────────────────────────────────────────────┘
-```
+See `flows/greenfield.md` and `flows/existing-code.md` for step tables, detection rules, auto-chain rules, and status summary templates.
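The Flow Resolution rules added to SKILL.md in the hunk above can be read as an ordered decision function. The following is a minimal sketch, not the skill's actual implementation: the `WorkspaceSnapshot` type and its field names are hypothetical stand-ins for what the autopilot would learn from scanning the workspace and reading `_docs/_autopilot_state.md`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkspaceSnapshot:
    """Hypothetical summary of the workspace; field names are assumptions."""
    has_source_code: bool
    docs_dir_exists: bool
    state_file_exists: bool
    document_completed: bool = False  # "Document" appears in Completed Steps
    step: Optional[str] = None        # value of `step:` in the state file, e.g. "done"


def resolve_flow(ws: WorkspaceSnapshot) -> str:
    """Return the flow name; rules are checked in the order the doc lists them."""
    # Rule 1: source code present and _docs/ missing (Pre-Step detection)
    if ws.has_source_code and not ws.docs_dir_exists:
        return "existing-code"
    # Rule 2: state file records Document as a completed step
    if ws.state_file_exists and ws.document_completed:
        return "existing-code"
    # Rule 3: completed project re-entry, which loops back to New Task
    if ws.state_file_exists and ws.step == "done" and ws.has_source_code:
        return "existing-code"
    # Rule 4: everything else is treated as a new project
    return "greenfield"
```

Under these assumptions, an empty workspace resolves to `greenfield`, while a workspace that has code but no `_docs/` folder resolves to `existing-code` before any state file is consulted.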
@@ -0,0 +1,234 @@
+# Existing Code Workflow
+
+Workflow for projects with an existing codebase. Starts with documentation, produces test specs, decomposes and implements tests, verifies them, refactors with that safety net, then adds new functionality and deploys.
+
+## Step Reference Table
+
+| Step | Name | Sub-Skill | Internal SubSteps |
+|------|------|-----------|-------------------|
+| 1 | Document | document/SKILL.md | Steps 1–8 |
+| 2 | Test Spec | test-spec/SKILL.md | Phases 1a–1b |
+| 3 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
+| 4 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 5 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 6 | Refactor | refactor/SKILL.md | Phases 0–5 (6-phase method) |
+| 7 | New Task | new-task/SKILL.md | Steps 1–8 (loop) |
+| 8 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 9 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 10 | Security Audit | security/SKILL.md | Phases 1–5 (optional) |
+| 11 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
+| 12 | Deploy | deploy/SKILL.md | Steps 1–7 |
+
+After Step 12, the existing-code workflow is complete.
+
+## Detection Rules
+
+Check rules in order — first match wins.
+
+---
+
+**Step 1 — Document**
+Condition: `_docs/` does not exist AND the workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`, `src/`, `Cargo.toml`, `*.csproj`, `package.json`)
+
+Action: An existing codebase without documentation was detected. Read and execute `.cursor/skills/document/SKILL.md`. After the document skill completes, re-detect state (the produced `_docs/` artifacts will place the project at Step 2 or later).
+
+---
+
+**Step 2 — Test Spec**
+Condition: `_docs/02_document/FINAL_report.md` exists AND workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`) AND `_docs/02_document/tests/traceability-matrix.md` does not exist AND the autopilot state shows Document was run (check `Completed Steps` for "Document" entry)
+
+Action: Read and execute `.cursor/skills/test-spec/SKILL.md`
+
+This step applies when the codebase was documented via the `/document` skill. Test specifications must be produced before refactoring or further development.
+
+---
+
+**Step 3 — Decompose Tests**
+Condition: `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND the autopilot state shows Document was run AND (`_docs/02_tasks/` does not exist or has no task files)
+
+Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
+1. Run Step 1t (test infrastructure bootstrap)
+2. Run Step 3 (blackbox test task decomposition)
+3. Run Step 4 (cross-verification against test coverage)
+
+If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it.
+
+---
+
+**Step 4 — Implement Tests**
+Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND the autopilot state shows Step 3 (Decompose Tests) is completed AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist
+
+Action: Read and execute `.cursor/skills/implement/SKILL.md`
+
+The implement skill reads test tasks from `_docs/02_tasks/` and implements them.
+
+If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
+
+---
+
+**Step 5 — Run Tests**
+Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state shows Step 4 (Implement Tests) is completed AND the autopilot state does NOT show Step 5 (Run Tests) as completed
+
+Action: Read and execute `.cursor/skills/test-run/SKILL.md`
+
+Verifies the implemented test suite passes before proceeding to refactoring. The tests form the safety net for all subsequent code changes.
+
+---
+
+**Step 6 — Refactor**
+Condition: the autopilot state shows Step 5 (Run Tests) is completed AND `_docs/04_refactoring/FINAL_report.md` does not exist
+
+Action: Read and execute `.cursor/skills/refactor/SKILL.md`
+
+The refactor skill runs the full 6-phase method using the implemented tests as a safety net.
+
+If `_docs/04_refactoring/` has phase reports, the refactor skill detects completed phases and continues.
+
+---
+
+**Step 7 — New Task**
+Condition: the autopilot state shows Step 6 (Refactor) is completed AND the autopilot state does NOT show Step 7 (New Task) as completed
+
+Action: Read and execute `.cursor/skills/new-task/SKILL.md`
+
+The new-task skill interactively guides the user through defining new functionality. It loops until the user is done adding tasks. New task files are written to `_docs/02_tasks/`.
+
+---
+
+**Step 8 — Implement**
+Condition: the autopilot state shows Step 7 (New Task) is completed AND `_docs/03_implementation/` does not contain a FINAL report covering the new tasks (check the state file to distinguish test implementation from feature implementation)
+
+Action: Read and execute `.cursor/skills/implement/SKILL.md`
+
+The implement skill reads the new tasks from `_docs/02_tasks/` and implements them. Tasks already implemented in Step 4 are skipped (the implement skill tracks completed tasks in batch reports).
+
+If `_docs/03_implementation/` has batch reports from this phase, the implement skill detects completed tasks and continues.
+
+---
+
+**Step 9 — Run Tests**
+Condition: the autopilot state shows Step 8 (Implement) is completed AND the autopilot state does NOT show Step 9 (Run Tests) as completed
+
+Action: Read and execute `.cursor/skills/test-run/SKILL.md`
+
+---
+
+**Step 10 — Security Audit (optional)**
+Condition: the autopilot state shows Step 9 (Run Tests) is completed AND the autopilot state does NOT show Step 10 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+
+Action: Present using Choose format:
+
+```
+══════════════════════════════════════
+ DECISION REQUIRED: Run security audit before deploy?
+══════════════════════════════════════
+ A) Run security audit (recommended for production deployments)
+ B) Skip — proceed directly to deploy
+══════════════════════════════════════
+ Recommendation: A — catches vulnerabilities before production
+══════════════════════════════════════
+```
+
+- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 11 (Performance Test).
+- If user picks B → Mark Step 10 as `skipped` in the state file, auto-chain to Step 11 (Performance Test).
+
+---
+
+**Step 11 — Performance Test (optional)**
+Condition: the autopilot state shows Step 10 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 11 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+
+Action: Present using Choose format:
+
+```
+══════════════════════════════════════
+ DECISION REQUIRED: Run performance/load tests before deploy?
+══════════════════════════════════════
+ A) Run performance tests (recommended for latency-sensitive or high-load systems)
+ B) Skip — proceed directly to deploy
+══════════════════════════════════════
+ Recommendation: [A or B — base on whether acceptance criteria
+ include latency, throughput, or load requirements]
+══════════════════════════════════════
+```
+
+- If user picks A → Run performance tests:
+  1. If `scripts/run-performance-tests.sh` exists (generated by the test-spec skill Phase 4), execute it
+  2. Otherwise, check whether `_docs/02_document/tests/performance-tests.md` exists and use it for test scenarios, detect an appropriate load-testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute the performance test scenarios against the running system
+  3. Present results vs acceptance criteria thresholds
+  4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort
+  5. After completion, auto-chain to Step 12 (Deploy)
+- If user picks B → Mark Step 11 as `skipped` in the state file, auto-chain to Step 12 (Deploy).
+
+---
+
+**Step 12 — Deploy**
+Condition: the autopilot state shows Step 9 (Run Tests) is completed AND (Step 10 is completed or skipped) AND (Step 11 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
+
+Action: Read and execute `.cursor/skills/deploy/SKILL.md`
+
+After deployment completes, the existing-code workflow is done.
+
+---
+
+**Re-Entry After Completion**
+Condition: the autopilot state shows `step: done` OR all steps through 12 (Deploy) are completed
+
+Action: The project completed a full cycle. Present status and loop back to New Task:
+
+```
+══════════════════════════════════════
+ PROJECT CYCLE COMPLETE
+══════════════════════════════════════
+ The previous cycle finished successfully.
+ You can now add new functionality.
+══════════════════════════════════════
+ A) Add new features (start New Task)
+ B) Done — no more changes needed
+══════════════════════════════════════
+```
+
+- If user picks A → set `step: 7`, `status: not_started` in the state file, then auto-chain to Step 7 (New Task). Previous cycle history stays in Completed Steps.
+- If user picks B → report final project status and exit.
+
+## Auto-Chain Rules
+
+| Completed Step | Next Action |
+|---------------|-------------|
+| Document (1) | Auto-chain → Test Spec (2) |
+| Test Spec (2) | Auto-chain → Decompose Tests (3) |
+| Decompose Tests (3) | **Session boundary** — suggest new conversation before Implement Tests |
+| Implement Tests (4) | Auto-chain → Run Tests (5) |
+| Run Tests (5, all pass) | Auto-chain → Refactor (6) |
+| Refactor (6) | Auto-chain → New Task (7) |
+| New Task (7) | **Session boundary** — suggest new conversation before Implement |
+| Implement (8) | Auto-chain → Run Tests (9) |
+| Run Tests (9, all pass) | Auto-chain → Security Audit choice (10) |
+| Security Audit (10, done or skipped) | Auto-chain → Performance Test choice (11) |
+| Performance Test (11, done or skipped) | Auto-chain → Deploy (12) |
+| Deploy (12) | **Workflow complete** — existing-code flow done |
+
+## Status Summary Template
+
+```
+═══════════════════════════════════════════════════
+ AUTOPILOT STATUS (existing-code)
+═══════════════════════════════════════════════════
+ Step 1   Document            [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 2   Test Spec           [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 3   Decompose Tests     [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 4   Implement Tests     [DONE / IN PROGRESS (batch M) / NOT STARTED / FAILED (retry N/3)]
+ Step 5   Run Tests           [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 6   Refactor            [DONE / IN PROGRESS (phase N) / NOT STARTED / FAILED (retry N/3)]
+ Step 7   New Task            [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 8   Implement           [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
+ Step 9   Run Tests           [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 10  Security Audit      [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 11  Performance Test    [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 12  Deploy              [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+═══════════════════════════════════════════════════
+ Current: Step N — Name
+ SubStep: M — [sub-skill internal step name]
+ Retry:   [N/3 if retrying, omit if 0]
+ Action:  [what will happen next]
+═══════════════════════════════════════════════════
+```
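The "first match wins" detection rules and the Auto-Chain Rules table in `flows/existing-code.md` can be modelled as an ordered rule list plus a lookup table. This is a sketch under stated assumptions: the state dictionary shape (`completed`, `skipped`, `deploy_complete`) is hypothetical, and only the rules for Steps 10 to 12 are modelled because they depend on state alone rather than on files on disk.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (kind, next_step) per completed step, transcribed from the Auto-Chain Rules table.
# The Run Tests entries (5 and 9) apply only when all tests pass.
AUTO_CHAIN: Dict[int, Tuple[str, Optional[int]]] = {
    1: ("chain", 2), 2: ("chain", 3),
    3: ("session_boundary", 4),
    4: ("chain", 5), 5: ("chain", 6), 6: ("chain", 7),
    7: ("session_boundary", 8),
    8: ("chain", 9), 9: ("chain", 10), 10: ("chain", 11), 11: ("chain", 12),
    12: ("complete", None),
}

Rule = Tuple[int, Callable[[dict], bool]]


def _settled(state: dict, step: int) -> bool:
    """True when an optional step was either completed or skipped."""
    return step in state["completed"] or step in state["skipped"]


# Hypothetical encoding of the Step 10-12 conditions, in document order.
TAIL_RULES: List[Rule] = [
    (10, lambda s: 9 in s["completed"] and not _settled(s, 10)
         and not s["deploy_complete"]),
    (11, lambda s: _settled(s, 10) and not _settled(s, 11)
         and not s["deploy_complete"]),
    (12, lambda s: 9 in s["completed"] and _settled(s, 10)
         and _settled(s, 11) and not s["deploy_complete"]),
]


def detect_step(rules: List[Rule], state: dict) -> Optional[int]:
    """Return the first step whose condition matches, or None if none do."""
    for step, matches in rules:
        if matches(state):
            return step
    return None
```

Keeping the rules in a list preserves the ordering guarantee the document relies on: an earlier rule always shadows a later one, so the Deploy condition only fires once both optional steps are settled.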
@@ -0,0 +1,235 @@
+# Greenfield Workflow
+
+Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy.
+
+## Step Reference Table
+
+| Step | Name | Sub-Skill | Internal SubSteps |
+|------|------|-----------|-------------------|
+| 1 | Problem | problem/SKILL.md | Phases 1–4 |
+| 2 | Research | research/SKILL.md | Mode A: Phases 1–4 · Mode B: Steps 0–8 |
+| 3 | Plan | plan/SKILL.md | Steps 1–6 + Final |
+| 4 | UI Design | ui-design/SKILL.md | Phases 0–8 (conditional; UI projects only) |
+| 5 | Decompose | decompose/SKILL.md | Steps 1–4 |
+| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 7 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 8 | Security Audit | security/SKILL.md | Phases 1–5 (optional) |
+| 9 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
+| 10 | Deploy | deploy/SKILL.md | Steps 1–7 |
+
+## Detection Rules
+
+Check rules in order — first match wins.
+
+---
+
+**Step 1 — Problem Gathering**
+Condition: `_docs/00_problem/` does not exist, OR any of these are missing/empty:
+- `problem.md`
+- `restrictions.md`
+- `acceptance_criteria.md`
+- `input_data/` (must contain at least one file)
+
+Action: Read and execute `.cursor/skills/problem/SKILL.md`
+
+---
+
+**Step 2 — Research (Initial)**
+Condition: `_docs/00_problem/` is complete AND `_docs/01_solution/` has no `solution_draft*.md` files
+
+Action: Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode A)
+
+---
+
+**Research Decision** (inline gate between Step 2 and Step 3)
+Condition: `_docs/01_solution/` contains `solution_draft*.md` files AND `_docs/01_solution/solution.md` does not exist AND `_docs/02_document/architecture.md` does not exist
+
+Action: Present the current research state to the user:
+- How many solution drafts exist
+- Whether tech_stack.md and security_analysis.md exist
+- One-line summary from the latest draft
+
+Then present using the **Choose format**:
+
+```
+══════════════════════════════════════
+ DECISION REQUIRED: Research complete — next action?
+══════════════════════════════════════
+ A) Run another research round (Mode B assessment)
+ B) Proceed to planning with current draft
+══════════════════════════════════════
+ Recommendation: [A or B] — [reason based on draft quality]
+══════════════════════════════════════
+```
+
+- If user picks A → Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode B)
+- If user picks B → auto-chain to Step 3 (Plan)
+
+---
+
+**Step 3 — Plan**
+Condition: `_docs/01_solution/` has `solution_draft*.md` files AND `_docs/02_document/architecture.md` does not exist
+
+Action:
+1. Leave the draft files as they are: the plan skill's Prereq 2 renames the latest draft to `solution.md`
+2. Read and execute `.cursor/skills/plan/SKILL.md`
+
+If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FINAL_report.md`), the plan skill's built-in resumability handles it.
+
+---
+
+**Step 4 — UI Design (conditional)**
+Condition: `_docs/02_document/architecture.md` exists AND the autopilot state does NOT show Step 4 (UI Design) as completed or skipped AND the project is a UI project
+
+**UI Project Detection** — the project is a UI project if ANY of the following are true:
+- `package.json` exists in the workspace root or any subdirectory
+- `*.html`, `*.jsx`, `*.tsx` files exist in the workspace
+- `_docs/02_document/components/` contains a component whose `description.md` mentions UI, frontend, page, screen, dashboard, form, or view
+- `_docs/02_document/architecture.md` mentions frontend, UI layer, SPA, or client-side rendering
+- `_docs/01_solution/solution.md` mentions frontend, web interface, or user-facing UI
+
+If the project is NOT a UI project → mark Step 4 as `skipped` in the state file and auto-chain to Step 5.
+
+If the project IS a UI project → present using Choose format:
+
+```
+══════════════════════════════════════
+ DECISION REQUIRED: UI project detected — generate mockups?
+══════════════════════════════════════
+ A) Generate UI mockups before decomposition (recommended)
+ B) Skip — proceed directly to decompose
+══════════════════════════════════════
+ Recommendation: A — mockups before decomposition
+ produce better task specs for frontend components
+══════════════════════════════════════
+```
+
+- If user picks A → Read and execute `.cursor/skills/ui-design/SKILL.md`. After completion, auto-chain to Step 5 (Decompose).
+- If user picks B → Mark Step 4 as `skipped` in the state file, auto-chain to Step 5 (Decompose).
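+
+The UI Project Detection rules above could be approximated as follows. This is a sketch under assumptions: `is_ui_project` is a hypothetical name, keyword matching is done with case-insensitive whole-word regexes, and each component is assumed to live at `components/<name>/description.md`.
+
```python
import re
from pathlib import Path

UI_SUFFIXES = {".html", ".jsx", ".tsx"}
# Keyword lists copied from the detection rules; word boundaries avoid
# false hits such as "spa" inside "space".
COMPONENT_WORDS = re.compile(r"\b(ui|frontend|page|screen|dashboard|form|view)\b", re.I)
ARCHITECTURE_WORDS = re.compile(r"\b(frontend|ui layer|spa|client-side rendering)\b", re.I)
SOLUTION_WORDS = re.compile(r"\b(frontend|web interface|user-facing ui)\b", re.I)


def _mentions(path: Path, pattern) -> bool:
    """True if the file exists and its text matches the pattern."""
    if not path.is_file():
        return False
    return bool(pattern.search(path.read_text(encoding="utf-8", errors="ignore")))


def is_ui_project(workspace: Path) -> bool:
    """Return True if ANY detection rule matches (hypothetical helper)."""
    # package.json anywhere in the workspace, or any HTML/JSX/TSX file.
    for p in workspace.rglob("*"):
        if p.is_file() and (p.name == "package.json" or p.suffix in UI_SUFFIXES):
            return True
    docs = workspace / "_docs"
    # Assumed layout: one description.md per component directory.
    components = docs / "02_document" / "components"
    if components.is_dir():
        for desc in components.glob("*/description.md"):
            if _mentions(desc, COMPONENT_WORDS):
                return True
    return (
        _mentions(docs / "02_document" / "architecture.md", ARCHITECTURE_WORDS)
        or _mentions(docs / "01_solution" / "solution.md", SOLUTION_WORDS)
    )
```
+
+A real implementation would likely skip vendored directories such as `node_modules/` during the scan; the sketch walks everything for simplicity.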
+
+---
+
+**Step 5 — Decompose**
+Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/` does not exist or has no task files (excluding `_dependencies_table.md`)
+
+Action: Read and execute `.cursor/skills/decompose/SKILL.md`
+
+If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it.
+
+---
+
+**Step 6 — Implement**
+Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist
+
+Action: Read and execute `.cursor/skills/implement/SKILL.md`
+
+If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
+
+---
+
+**Step 7 — Run Tests**
+Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state does NOT show Step 7 (Run Tests) as completed AND (`_docs/04_deploy/` does not exist or is incomplete)
+
+Action: Read and execute `.cursor/skills/test-run/SKILL.md`
+
+---
+
+**Step 8 — Security Audit (optional)**
+Condition: the autopilot state shows Step 7 (Run Tests) is completed AND the autopilot state does NOT show Step 8 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+
+Action: Present using Choose format:
+
+```
+══════════════════════════════════════
+ DECISION REQUIRED: Run security audit before deploy?
+══════════════════════════════════════
+ A) Run security audit (recommended for production deployments)
+ B) Skip — proceed directly to deploy
+══════════════════════════════════════
+ Recommendation: A — catches vulnerabilities before production
+══════════════════════════════════════
+```
+
+- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 9 (Performance Test).
+- If user picks B → Mark Step 8 as `skipped` in the state file, auto-chain to Step 9 (Performance Test).
+
+---
+
+**Step 9 — Performance Test (optional)**
+Condition: the autopilot state shows Step 8 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 9 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+
+Action: Present using Choose format:
+
+```
+══════════════════════════════════════
+ DECISION REQUIRED: Run performance/load tests before deploy?
+══════════════════════════════════════
+ A) Run performance tests (recommended for latency-sensitive or high-load systems)
+ B) Skip — proceed directly to deploy
+══════════════════════════════════════
+ Recommendation: [A or B — base on whether acceptance criteria
+ include latency, throughput, or load requirements]
+══════════════════════════════════════
+```
+
+- If user picks A → Run performance tests:
+  1. If `scripts/run-performance-tests.sh` exists (generated by the test-spec skill Phase 4), execute it
+  2. Otherwise, check whether `_docs/02_document/tests/performance-tests.md` exists and use it for test scenarios, detect an appropriate load-testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and run the scenarios against the running system
+  3. Present results vs acceptance criteria thresholds
+  4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort
+  5. After completion, auto-chain to Step 10 (Deploy)
+- If user picks B → Mark Step 9 as `skipped` in the state file, auto-chain to Step 10 (Deploy).
+
+---
+
+**Step 10 — Deploy**
+Condition: the autopilot state shows Step 7 (Run Tests) is completed AND (Step 8 is completed or skipped) AND (Step 9 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
+
+Action: Read and execute `.cursor/skills/deploy/SKILL.md`
+
+---
+
+**Done**
+Condition: `_docs/04_deploy/` contains all expected artifacts (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md)
+
+Action: Report project completion with summary. If the user runs autopilot again after greenfield completion, Flow Resolution rule 3 routes to the existing-code flow (re-entry after completion) so they can add new features.
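+
+The Done condition amounts to checking that every expected deploy artifact is present. A minimal sketch, assuming a hypothetical helper named `missing_deploy_artifacts` that receives the `_docs/` directory:
+
```python
from pathlib import Path
from typing import List

# Artifact names taken from the Done condition above.
EXPECTED_DEPLOY_ARTIFACTS = (
    "containerization.md",
    "ci_cd_pipeline.md",
    "environment_strategy.md",
    "observability.md",
    "deployment_procedures.md",
)


def missing_deploy_artifacts(docs_root: Path) -> List[str]:
    """List expected files absent from _docs/04_deploy/ (hypothetical helper)."""
    deploy_dir = docs_root / "04_deploy"
    return [name for name in EXPECTED_DEPLOY_ARTIFACTS if not (deploy_dir / name).is_file()]
```
+
+An empty list corresponds to Done; a non-empty list corresponds to "`_docs/04_deploy/` does not exist or is incomplete" in the earlier conditions.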
+
+## Auto-Chain Rules
+
+| Completed Step | Next Action |
+|---------------|-------------|
+| Problem (1) | Auto-chain → Research (2) |
+| Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
+| Research Decision → proceed | Auto-chain → Plan (3) |
+| Plan (3) | Auto-chain → UI Design detection (4) |
+| UI Design (4, done or skipped) | Auto-chain → Decompose (5) |
+| Decompose (5) | **Session boundary** — suggest new conversation before Implement |
+| Implement (6) | Auto-chain → Run Tests (7) |
+| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) |
+| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) |
+| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) |
+| Deploy (10) | Report completion |
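+
+The table above can be read as a lookup from completed step to transition. The sketch below is one possible encoding; the `Transition` type, the `kind` labels, and `next_transition` are hypothetical names introduced for illustration.
+
```python
from typing import NamedTuple, Optional


class Transition(NamedTuple):
    kind: str                 # "auto", "gate", "session_boundary", or "done"
    next_step: Optional[int]  # greenfield step number, None when finished


# One entry per row of the Auto-Chain Rules table.
GREENFIELD_CHAIN = {
    1: Transition("auto", 2),
    2: Transition("gate", 3),              # Research Decision: another round or proceed
    3: Transition("auto", 4),              # UI Design detection
    4: Transition("auto", 5),              # whether Step 4 was done or skipped
    5: Transition("session_boundary", 6),  # suggest a new conversation first
    6: Transition("auto", 7),
    7: Transition("auto", 8),              # only when all tests pass; Step 8 asks the user
    8: Transition("auto", 9),              # Step 9 asks the user
    9: Transition("auto", 10),
    10: Transition("done", None),
}


def next_transition(completed_step: int) -> Transition:
    """Look up what follows a completed greenfield step."""
    if completed_step not in GREENFIELD_CHAIN:
        raise ValueError(f"unknown greenfield step: {completed_step}")
    return GREENFIELD_CHAIN[completed_step]
```
+
+Keeping the chain as data rather than branching logic makes the single session boundary (after Step 5) easy to spot.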
+
+## Status Summary Template
+
+```
+═══════════════════════════════════════════════════
+ AUTOPILOT STATUS (greenfield)
+═══════════════════════════════════════════════════
+ Step 1   Problem             [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 2   Research            [DONE (N drafts) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 3   Plan                [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 4   UI Design           [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 5   Decompose           [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 6   Implement           [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
+ Step 7   Run Tests           [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 8   Security Audit      [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 9   Performance Test    [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 10  Deploy              [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+═══════════════════════════════════════════════════
+ Current: Step N — Name
+ SubStep: M — [sub-skill internal step name]
+ Retry:   [N/3 if retrying, omit if 0]
+ Action:  [what will happen next]
+═══════════════════════════════════════════════════
+```
@@ -0,0 +1,314 @@
+# Autopilot Protocols
+
+## User Interaction Protocol
+
+Every time the autopilot or a sub-skill needs a user decision, use the **Choose A / B / C / D** format. This applies to:
+
+- State transitions where multiple valid next actions exist
+- Sub-skill BLOCKING gates that require user judgment
+- Any fork where the autopilot cannot confidently pick the right path
+- Trade-off decisions (tech choices, scope, risk acceptance)
+
+### When to Ask (MUST ask)
+
+- The next action is ambiguous (e.g., "another research round or proceed?")
+- The decision has irreversible consequences (e.g., architecture choices, skipping a step)
+- The user's intent or preference cannot be inferred from existing artifacts
+- A sub-skill's BLOCKING gate explicitly requires user confirmation
+- Multiple valid approaches exist with meaningfully different trade-offs
+
+### When NOT to Ask (auto-transition)
+
+- Only one logical next step exists (e.g., Problem complete → Research is the only option)
+- The transition is deterministic from the state (e.g., Implement complete → Run Tests)
+- The decision is low-risk and reversible
+- Existing artifacts or prior decisions already imply the answer
+
+### Choice Format
+
+Always present decisions in this format:
+
+```
+══════════════════════════════════════
+ DECISION REQUIRED: [brief context]
+══════════════════════════════════════
+ A) [Option A — short description]
+ B) [Option B — short description]
+ C) [Option C — short description, if applicable]
+ D) [Option D — short description, if applicable]
+══════════════════════════════════════
+ Recommendation: [A/B/C/D] — [one-line reason]
+══════════════════════════════════════
+```
+
+Rules:
+1. Always provide 2–4 concrete options (never open-ended questions)
+2. Always include a recommendation with a brief justification
+3. Keep option descriptions to one line each
+4. If only 2 options make sense, use A/B only — do not pad with filler options
+5. Play the notification sound (per `human-attention-sound.mdc`) before presenting the choice
+6. Record every user decision in the state file's `Key Decisions` section
+7. After the user picks, proceed immediately — no follow-up confirmation unless the choice was destructive
+
+## Work Item Tracker Authentication
+
+Several workflow steps create work items (epics, tasks, links). The system supports **Jira MCP** and **Azure DevOps MCP** as interchangeable backends. Detect which is configured by listing available MCP servers.
+
+### Tracker Detection
+
+1. Check for available MCP servers: Jira MCP (`user-Jira-MCP-Server`) or Azure DevOps MCP (`user-AzureDevops`)
+2. If both are available, ask the user which to use (Choose format)
+3. Record the choice in the state file: `tracker: jira` or `tracker: ado`
+4. If neither is available, set `tracker: local` and proceed without external tracking
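+
+The detection steps above reduce to a small decision function. In this sketch the server names come from step 1, while `detect_tracker` and the `"ask"` return value (meaning "both are available, present a choice") are hypothetical.
+
```python
from typing import Iterable

JIRA_SERVER = "user-Jira-MCP-Server"
ADO_SERVER = "user-AzureDevops"


def detect_tracker(available_servers: Iterable[str]) -> str:
    """Return "jira", "ado", "local", or "ask" when the user must choose."""
    servers = set(available_servers)
    has_jira = JIRA_SERVER in servers
    has_ado = ADO_SERVER in servers
    if has_jira and has_ado:
        return "ask"  # both configured: use the Choose format
    if has_jira:
        return "jira"
    if has_ado:
        return "ado"
    return "local"    # neither configured: proceed without external tracking
```
+
+Every result except `"ask"` can be written straight to the state file as the `tracker:` value.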
+
+### Steps That Require Work Item Tracker
+
+| Flow | Step | Sub-Step | Tracker Action |
+|------|------|----------|----------------|
+| greenfield | 3 (Plan) | Step 6 — Epics | Create epics for each component |
+| greenfield | 5 (Decompose) | Step 1–3 — All tasks | Create ticket per task, link to epic |
+| existing-code | 3 (Decompose Tests) | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
+| existing-code | 7 (New Task) | Step 7 — Ticket | Create ticket per task, link to epic |
+
+### Authentication Gate
+
+Before entering a step that requires work item tracking (see table above) for the first time, the autopilot must:
+
+1. Call `mcp_auth` on the detected tracker's MCP server
+2. If authentication succeeds → proceed normally
+3. If the user **skips** or authentication fails → present using Choose format:
+
+```
+══════════════════════════════════════
+ Tracker authentication failed
+══════════════════════════════════════
+ A) Retry authentication (retry mcp_auth)
+ B) Continue without tracker (tasks saved locally only)
+══════════════════════════════════════
+ Recommendation: A — Tracker IDs drive task referencing,
+ dependency tracking, and implementation batching.
+ Without tracker, task files use numeric prefixes instead.
+══════════════════════════════════════
+```
+
+If user picks **B** (continue without tracker):
+- Set a flag in the state file: `tracker: local`
+- All skills that would create tickets instead save metadata locally in the task/epic files with `Tracker: pending` status
+- Task files keep numeric prefixes (e.g., `01_initial_structure.md`) instead of tracker ID prefixes
+- The workflow proceeds normally in all other respects
+
+### Re-Authentication
+
+If the tracker MCP was already authenticated in a previous invocation (verify by listing available tools beyond `mcp_auth`), skip the auth gate.
+
+## Error Handling
+
+All error situations that require user input MUST use the **Choose A / B / C / D** format.
+
+| Situation | Action |
+|-----------|--------|
+| State detection is ambiguous (artifacts suggest two different steps) | Present findings and use Choose format with the candidate steps as options |
+| Sub-skill fails or hits an unrecoverable blocker | Use Choose format: A) retry, B) skip with warning, C) abort and fix manually |
+| User wants to skip a step | Use Choose format: A) skip (with dependency warning), B) execute the step |
+| User wants to go back to a previous step | Use Choose format: A) re-run (with overwrite warning), B) stay on current step |
+| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution |
+
+## Skill Failure Retry Protocol
+
+Sub-skills can return a **failed** result. Failures are often caused by missing user input, environment issues, or transient errors that resolve on retry. The autopilot auto-retries before escalating.
+
+### Retry Flow
+
+```
+Skill execution → FAILED
+  │
+  ├─ retry_count < 3 ?
+  │    YES → increment retry_count in state file
+  │         → log failure reason in state file (Retry Log section)
+  │         → re-read the sub-skill's SKILL.md
+  │         → re-execute from the current sub_step
+  │         → (loop back to check result)
+  │
+  │    NO (retry_count = 3) →
+  │         → set status: failed in Current Step
+  │         → add entry to Blockers section:
+  │             "[Skill Name] failed 3 consecutive times at sub_step [M].
+  │              Last failure: [reason]. Auto-retry exhausted."
+  │         → present warning to user (see Escalation below)
+  │         → do NOT auto-retry again until user intervenes
+```
+
+### Retry Rules
+
+1. **Auto-retry immediately**: when a skill fails, retry it without asking the user, because the failure is often transient (missing user confirmation in a prior step, Docker not running, file lock, etc.)
+2. **Preserve sub_step**: retry from the last recorded `sub_step`, not from the beginning of the skill — unless the failure indicates corruption, in which case restart from sub_step 1
+3. **Increment `retry_count`**: update `retry_count` in the state file's `Current Step` section on each retry attempt
+4. **Log each failure**: append the failure reason and timestamp to the state file's `Retry Log` section
+5. **Reset on success**: when the skill eventually succeeds, reset `retry_count: 0` and clear the `Retry Log` for that step
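+
+The retry flow and rules can be sketched as a loop over an in-memory stand-in for the state file. All names here (`StepState`, `run_with_retry`, the `run_skill` callable) are hypothetical. The sketch follows the flow diagram literally, so a skill runs once and is then retried up to three times before escalation.
+
```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

MAX_RETRIES = 3


@dataclass
class StepState:
    """In-memory stand-in for the Current Step, Retry Log, and Blockers sections."""
    status: str = "in_progress"
    retry_count: int = 0
    retry_log: List[str] = field(default_factory=list)
    blockers: List[str] = field(default_factory=list)


def run_with_retry(skill_name: str,
                   run_skill: Callable[[], Tuple[bool, str]],
                   state: StepState) -> bool:
    """Run a skill, auto-retrying on failure; return True on success."""
    while True:
        succeeded, reason = run_skill()
        if succeeded:
            # Rule 5: reset the counter and clear the log on success.
            state.status = "completed"
            state.retry_count = 0
            state.retry_log.clear()
            return True
        state.retry_log.append(reason)          # rule 4: log each failure
        if state.retry_count < MAX_RETRIES:
            state.retry_count += 1              # rule 3: count the retry
            continue                            # rule 1: retry without asking
        # Retries exhausted: record the blocker and stop until the user intervenes.
        state.status = "failed"
        state.blockers.append(
            f"{skill_name} failed after {MAX_RETRIES} auto-retries. "
            f"Last failure: {reason}. Auto-retry exhausted."
        )
        return False
```
+
+A `False` return is the point at which the escalation prompt would be shown.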
+
+### Escalation (after 3 consecutive failures)
+
+After 3 failed auto-retries of the same skill, the failure is unlikely to be transient. Stop retrying and escalate:
+
+1. Update the state file:
+   - Set `status: failed` in `Current Step`
+   - Set `retry_count: 3`
+   - Add a blocker entry describing the repeated failure
+2. Play notification sound (per `human-attention-sound.mdc`)
+3. Present using Choose format:
+
+```
+══════════════════════════════════════
+ SKILL FAILED: [Skill Name] — 3 consecutive failures
+══════════════════════════════════════
+ Step: [N] — [Name]
+ SubStep: [M] — [sub-step name]
+ Last failure reason: [reason]
+══════════════════════════════════════
+ A) Retry with fresh context (new conversation)
+ B) Skip this step with warning
+ C) Abort — investigate and fix manually
+══════════════════════════════════════
+ Recommendation: A — fresh context often resolves
+ persistent failures
+══════════════════════════════════════
+```
+
+### Re-Entry After Failure
+
+On the next autopilot invocation (new conversation), if the state file shows `status: failed` and `retry_count: 3`:
+
+- Present the blocker to the user before attempting execution
+- If the user chooses to retry → reset `retry_count: 0`, set `status: in_progress`, and re-execute
+- If the user chooses to skip → mark step as `skipped`, proceed to next step
+- Do NOT silently auto-retry — the user must acknowledge the persistent failure first
+
+## Error Recovery Protocol
+
+### Stuck Detection
+
+When executing a sub-skill, monitor for these signals:
+
+- Same artifact overwritten 3+ times without meaningful change
+- Sub-skill repeatedly asks the same question after receiving an answer
+- No new artifacts saved for an extended period despite active execution
+
+### Recovery Actions (ordered)
+
+1. **Re-read state**: read `_docs/_autopilot_state.md` and cross-check against `_docs/` folders
+2. **Retry current sub-step**: re-read the sub-skill's SKILL.md and restart from the current sub-step
+3. **Escalate**: after 2 failed retries, present diagnostic summary to user using Choose format:
+
+```
+══════════════════════════════════════
+ RECOVERY: [skill name] stuck at [sub-step]
+══════════════════════════════════════
+ A) Retry with fresh context (new conversation)
+ B) Skip this sub-step with warning
+ C) Abort and fix manually
+══════════════════════════════════════
+ Recommendation: A — fresh context often resolves stuck loops
+══════════════════════════════════════
+```
+
+### Circuit Breaker
+
+If the same autopilot step fails 3 consecutive times across conversations:
+
+- Record the failure pattern in the state file's `Blockers` section
+- Do NOT auto-retry on next invocation
+- Present the blocker and ask user for guidance before attempting again
+
+## Context Management Protocol
+
+### Principle
+
+Disk is memory. Never rely on in-context accumulation — read from `_docs/` artifacts, not from conversation history.
+
+### Minimal Re-Read Set Per Skill
+
+When re-entering a skill (new conversation or context refresh):
+
+- Always read: `_docs/_autopilot_state.md`
+- Always read: the active skill's `SKILL.md`
+- Conditionally read: only the `_docs/` artifacts the current sub-step requires (listed in each skill's Context Resolution section)
+- Never bulk-read: do not load all `_docs/` files at once
+
+### Mid-Skill Interruption
+
+If context is filling up during a long skill (e.g., document, implement):
+
+1. Save current sub-step progress to the skill's artifact directory
+2. Update `_docs/_autopilot_state.md` with exact sub-step position
+3. Suggest a new conversation: "Context is getting long — recommend continuing in a fresh conversation for better results"
+4. On re-entry, the skill's resumability protocol picks up from the saved sub-step
+
+### Large Artifact Handling
+
+When a skill needs to read large files (e.g., full solution.md, architecture.md):
+
+- Read only the sections relevant to the current sub-step
+- Use search tools (Grep, SemanticSearch) to find specific sections rather than reading entire files
+- Summarize key decisions from prior steps in the state file so they don't need to be re-read
+
+### Context Budget Heuristic
+
+Agents cannot programmatically query context window usage. Use these heuristics to avoid degradation:
+
+| Zone | Indicators | Action |
+|------|-----------|--------|
+| **Safe** | State file + SKILL.md + 2–3 focused artifacts loaded | Continue normally |
+| **Caution** | 5+ artifacts loaded, or 3+ large files (architecture, solution, discovery), or conversation has 20+ tool calls | Complete current sub-step, then suggest session break |
+| **Danger** | Repeated truncation in tool output, tool calls failing unexpectedly, responses becoming shallow or repetitive | Save immediately, update state file, force session boundary |
+
+**Skill-specific guidelines**:
+
+| Skill | Recommended session breaks |
+|-------|---------------------------|
+| **document** | After every ~5 modules in Step 1; between Step 4 (Verification) and Step 5 (Solution Extraction) |
+| **implement** | Each batch is a natural checkpoint; if more than 2 batches completed in one session, suggest break |
+| **plan** | Between Step 5 (Test Specifications) and Step 6 (Epics) for projects with many components |
+| **research** | Between Mode A rounds; between Mode A and Mode B |
+
+**How to detect the caution or danger zone without an API**:
+
+1. Count the tool calls made so far: at around 20 or more, context is likely filling up
+2. If reading a file returns truncated content, context is under pressure
+3. If the agent starts producing shorter or less detailed responses than earlier in the conversation, context quality is degrading
+4. When in doubt, save and suggest a new conversation — re-entry is cheap thanks to the state file
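+
+The zone table can be expressed as a simple classifier. This is a rough sketch: `context_zone` and its parameters are hypothetical, the thresholds are the ones from the table, and the three boolean flags stand for the agent's own judgment that a Danger indicator has been observed.
+
```python
def context_zone(artifacts_loaded: int,
                 large_files_loaded: int,
                 tool_calls: int,
                 truncation_seen: bool = False,
                 unexpected_tool_failures: bool = False,
                 shallow_responses: bool = False) -> str:
    """Classify the session as "safe", "caution", or "danger"."""
    # Danger indicators take priority over the numeric thresholds.
    if truncation_seen or unexpected_tool_failures or shallow_responses:
        return "danger"
    if artifacts_loaded >= 5 or large_files_loaded >= 3 or tool_calls >= 20:
        return "caution"
    return "safe"
```
+
+In the caution zone the current sub-step is finished before suggesting a break; in the danger zone progress is saved immediately.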
+
+## Rollback Protocol
+
+### Implementation Steps (git-based)
+
+Handled by `/implement` skill — each batch commit is a rollback checkpoint via `git revert`.
+
+### Planning/Documentation Steps (artifact-based)
+
+For steps that produce `_docs/` artifacts (problem, research, plan, decompose, document):
+
+1. **Before overwriting**: if re-running a step that already has artifacts, the sub-skill's prerequisite check asks the user (resume/overwrite/skip)
+2. **Rollback to previous step**: use Choose format:
+
+```
+══════════════════════════════════════
+ ROLLBACK: Re-run [step name]?
+══════════════════════════════════════
+ A) Re-run the step (overwrites current artifacts)
+ B) Stay on current step
+══════════════════════════════════════
+ Warning: This will overwrite files in _docs/[folder]/
+══════════════════════════════════════
+```
+
+3. **Git safety net**: artifacts are committed with each autopilot step completion. To roll back: `git log --oneline _docs/` to find the commit, then `git checkout <commit> -- _docs/<folder>/`
+4. **State file rollback**: when rolling back artifacts, also update `_docs/_autopilot_state.md` to reflect the rolled-back step (set it to `in_progress`, clear completed date)
+
+## Status Summary
+
+On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback). Use the Status Summary Template from the active flow file (`flows/greenfield.md` or `flows/existing-code.md`).
+
+For re-entry (state file exists), also include:
+- Key decisions from the state file's `Key Decisions` section
+- Last session context from the `Last Session` section
+- Any blockers from the `Blockers` section
@@ -0,0 +1,122 @@
+# Autopilot State Management
+
+## State File: `_docs/_autopilot_state.md`
+
+The autopilot persists its state to `_docs/_autopilot_state.md`. This file is the primary source of truth for re-entry. Folder scanning is the fallback when the state file doesn't exist.
+
+### Format
+
+```markdown
+# Autopilot State
+
+## Current Step
+flow: [greenfield | existing-code]
+step: [1-10 for greenfield, 1-12 for existing-code, or "done"]
+name: [step name from the active flow's Step Reference Table]
+status: [not_started / in_progress / completed / skipped / failed]
+sub_step: [optional — sub-skill internal step number + name if interrupted mid-step]
+retry_count: [0-3 — number of consecutive auto-retry attempts for current step, reset to 0 on success]
+
+When updating `Current Step`, always write it as:
+  flow: existing-code   ← active flow
+  step: N               ← autopilot step (sequential integer)
+  sub_step: M           ← sub-skill's own internal step/phase number + name
+  retry_count: 0        ← reset on new step or success; increment on each failed retry
+Example:
+  flow: greenfield
+  step: 3
+  name: Plan
+  status: in_progress
+  sub_step: 4 — Architecture Review & Risk Assessment
+  retry_count: 0
+Example (failed after 3 retries):
+  flow: existing-code
+  step: 2
+  name: Test Spec
+  status: failed
+  sub_step: 1b — Test Case Generation
+  retry_count: 3
+
+## Completed Steps
+
+| Step | Name | Completed | Key Outcome |
+|------|------|-----------|-------------|
+| 1 | [name] | [date] | [one-line summary] |
+| 2 | [name] | [date] | [one-line summary] |
+| ... | ... | ... | ... |
+
+## Key Decisions
+- [decision 1: e.g. "Tech stack: Python + Rust for perf-critical, Postgres DB"]
+- [decision N]
+
+## Last Session
+date: [date]
+ended_at: Step [N] [Name] — SubStep [M] [sub-step name]
+reason: [completed step / session boundary / user paused / context limit]
+notes: [any context for next session]
+
+## Retry Log
+| Attempt | Step | Name | SubStep | Failure Reason | Timestamp |
+|---------|------|------|---------|----------------|-----------|
+| 1 | [step] | [name] | [sub_step] | [reason] | [date-time] |
+| ... | ... | ... | ... | ... | ... |
+
+(Clear this table when the step succeeds or user resets. Append a row on each failed auto-retry.)
+
+## Blockers
+- [blocker 1, if any]
+- [none]
+```
+
+### State File Rules
+
+1. **Create** the state file on the very first autopilot invocation (after state detection determines Step 1)
+2. **Update** the state file after every step completion, every session boundary, every BLOCKING gate confirmation, and every failed retry attempt
+3. **Read** the state file as the first action on every invocation — before folder scanning
+4. **Cross-check**: after reading the state file, verify against actual `_docs/` folder contents. If they disagree (e.g., state file says Step 2 but `_docs/02_document/architecture.md` already exists), trust the folder structure and update the state file to match
+5. **Never delete** the state file. It accumulates history across the entire project lifecycle
+6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` when the step succeeds or the user manually resets. If `retry_count` reaches 3, set `status: failed` and add an entry to `Blockers`
+7. **Failed state on re-entry**: if the state file shows `status: failed` with `retry_count: 3`, do NOT auto-retry — present the blocker to the user and wait for their decision before proceeding
+
+## State Detection
+
+Read `_docs/_autopilot_state.md` first. If it exists and is consistent with the folder structure, use the `Current Step` from the state file. If the state file doesn't exist or is inconsistent, fall back to folder scanning.
+
+### Folder Scan Rules (fallback)
+
+Scan `_docs/` to determine the current workflow position. The detection rules are defined in each flow file (`flows/greenfield.md` and `flows/existing-code.md`). Check the existing-code flow first (Step 1 detection), then greenfield flow rules. First match wins.
+
+## Re-Entry Protocol
+
+When the user invokes `/autopilot` and work already exists:
+
+1. Read `_docs/_autopilot_state.md`
+2. Cross-check against `_docs/` folder structure
+3. Present Status Summary with context from state file (key decisions, last session, blockers)
+4. If the detected step has a sub-skill with built-in resumability (plan, decompose, implement, deploy all do), the sub-skill handles mid-step recovery
+5. Continue execution from detected state
+
+## Session Boundaries
+
+After any decompose or task-planning step completes (the steps listed in item 1 below), **do not auto-chain to implement**. Instead:
+
+1. Update state file: mark the step as completed, set current step to the next implement step with status `not_started`
+   - Existing-code flow: After Step 3 (Decompose Tests) → set current step to 4 (Implement Tests)
+   - Existing-code flow: After Step 7 (New Task) → set current step to 8 (Implement)
+   - Greenfield flow: After Step 5 (Decompose) → set current step to 6 (Implement)
+2. Write `Last Session` section: `reason: session boundary`, `notes: Decompose complete, implementation ready`
+3. Present a summary: number of tasks, estimated batches, total complexity points
+4. Use Choose format:
+
+```
+══════════════════════════════════════
+ DECISION REQUIRED: Decompose complete — start implementation?
+══════════════════════════════════════
+ A) Start a new conversation for implementation (recommended for context freshness)
+ B) Continue implementation in this conversation
+══════════════════════════════════════
+ Recommendation: A — implementation is the longest phase, fresh context helps
+══════════════════════════════════════
+```
+
+These are the only hard session boundaries. All other transitions auto-chain.