Sync .cursor from detections

Oleksandr Bezdieniezhnykh
2026-04-12 05:05:11 +03:00
parent fddf1b8706
commit 09117f90b5
108 changed files with 11844 additions and 15 deletions
@@ -0,0 +1,123 @@
---
name: autopilot
description: |
Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
Detects current project state from _docs/ folder, resumes from where it left off, and flows through
problem → research → plan → decompose → implement → deploy without manual skill invocation.
Maximizes work per conversation by auto-transitioning between skills.
Trigger phrases:
- "autopilot", "auto", "start", "continue"
- "what's next", "where am I", "project status"
category: meta
tags: [orchestrator, workflow, auto-chain, state-machine, meta-skill]
disable-model-invocation: true
---
# Autopilot Orchestrator
Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autopilot` once — the engine handles sequencing, transitions, and re-entry.
## File Index
| File | Purpose |
|------|---------|
| `flows/greenfield.md` | Detection rules, step table, and auto-chain rules for new projects |
| `flows/existing-code.md` | Detection rules, step table, and auto-chain rules for existing codebases |
| `state.md` | State file format, rules, re-entry protocol, session boundaries |
| `protocols.md` | User interaction, tracker auth, choice format, error handling, status summary |
**On every invocation**: read all four files above before executing any logic.
## Core Principles
- **Auto-chain**: when a skill completes, immediately start the next one — no pause between skills
- **Only pause at decision points**: BLOCKING gates inside sub-skills are the natural pause points; do not add artificial stops between steps
- **State from disk**: current step is persisted to `_docs/_autopilot_state.md` and cross-checked against `_docs/` folder structure
- **Re-entry**: on every invocation, read the state file and cross-check against `_docs/` folders before continuing
- **Delegate, don't duplicate**: read and execute each sub-skill's SKILL.md; never inline their logic here
- **Sound on pause**: follow `.cursor/rules/human-attention-sound.mdc` — play a notification sound before every pause that requires human input (AskQuestion tool preferred for structured choices; fall back to plain text if unavailable)
- **Minimize interruptions**: only ask the user when the decision genuinely cannot be resolved automatically
- **Single project per workspace**: all `_docs/` paths are relative to workspace root; for monorepos, each service needs its own Cursor workspace
## Flow Resolution
Determine which flow to use:
1. If workspace has **no source code files** → **greenfield flow**
2. If workspace has source code files **and** `_docs/` does not exist → **existing-code flow**
3. If workspace has source code files **and** `_docs/` exists **and** `_docs/_autopilot_state.md` does not exist → **existing-code flow**
4. If workspace has source code files **and** `_docs/_autopilot_state.md` exists → read the `flow` field from the state file and use that flow
After selecting the flow, apply its detection rules (first match wins) to determine the current step.
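A minimal sketch of this resolution order, assuming a plain recursive scan for source files; the extension list and helper names are illustrative, not part of the skill:
```python
from pathlib import Path

# Illustrative patterns only; align with the detection rules in the flow files.
SOURCE_PATTERNS = ["*.py", "*.cs", "*.rs", "*.ts", "*.csproj", "Cargo.toml", "package.json"]

def has_source_files(root: Path) -> bool:
    """True if the workspace contains any recognizable source files (ignoring _docs/)."""
    for pattern in SOURCE_PATTERNS:
        if any("_docs" not in p.parts for p in root.rglob(pattern)):
            return True
    return False

def resolve_flow(root: Path) -> str:
    """Apply Flow Resolution rules 1-4 in order; first match wins."""
    docs = root / "_docs"
    state_file = docs / "_autopilot_state.md"
    if not has_source_files(root):
        return "greenfield"                # rule 1
    if not docs.exists():
        return "existing-code"             # rule 2
    if not state_file.exists():
        return "existing-code"             # rule 3
    for line in state_file.read_text().splitlines():
        if line.startswith("flow:"):
            return line.split(":", 1)[1].strip()   # rule 4
    return "existing-code"                 # missing field: fall back to folder-based detection
```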
## Execution Loop
Every invocation follows this sequence:
```
1. Read _docs/_autopilot_state.md (if exists)
2. Read all File Index files above
3. Cross-check state file against _docs/ folder structure (rules in state.md)
4. Resolve flow (see Flow Resolution above)
5. Resolve current step (detection rules from the active flow file)
6. Present Status Summary (template in active flow file)
7. Execute:
a. Delegate to current skill (see Skill Delegation below)
b. If skill returns FAILED → apply Skill Failure Retry Protocol (see protocols.md):
- Auto-retry the same skill (failure may be caused by missing user input or environment issue)
- If 3 consecutive auto-retries fail → set status: failed, warn user, stop auto-retry
c. When skill completes successfully → reset retry counter, update state file (rules in state.md)
d. Re-detect next step from the active flow's detection rules
e. If next skill is ready → auto-chain (go to 7a with next skill)
f. If session boundary reached → update state, suggest new conversation (rules in state.md)
g. If all steps done → update state → report completion
```
## Skill Delegation
For each step, the delegation pattern is:
1. Update state file: set `step` to the autopilot step number, status to `in_progress`, set `sub_step` to the sub-skill's current internal step/phase, reset `retry_count: 0`
2. Announce: "Starting [Skill Name]..."
3. Read the skill file: `.cursor/skills/[name]/SKILL.md`
4. Execute the skill's workflow exactly as written, including all BLOCKING gates, self-verification checklists, save actions, and escalation rules. Update `sub_step` in state each time the sub-skill advances.
5. If the skill **fails**: follow the Skill Failure Retry Protocol in `protocols.md` — increment `retry_count`, auto-retry up to 3 times, then escalate.
6. When complete (success): reset `retry_count: 0`, update state file to the next step with `status: not_started`, return to auto-chain rules (from active flow file)
Do NOT modify, skip, or abbreviate any part of the sub-skill's workflow. The autopilot is a sequencer, not an optimizer.
## State File Template
The state file (`_docs/_autopilot_state.md`) is a minimal pointer — only the current step. Full format rules are in `state.md`.
```markdown
# Autopilot State
## Current Step
flow: [greenfield | existing-code]
step: [number or "done"]
name: [step name]
status: [not_started / in_progress / completed / skipped / failed]
sub_step: [0 or N — sub-skill phase name]
retry_count: [0-3]
```
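A minimal sketch of reading and rewriting this pointer, assuming the flat `key: value` layout above; the field names match the template, everything else is illustrative:
```python
from pathlib import Path

STATE_PATH = Path("_docs/_autopilot_state.md")
FIELDS = ("flow", "step", "name", "status", "sub_step", "retry_count")

def read_state() -> dict:
    """Parse the flat key: value lines under '## Current Step'."""
    state = {}
    if STATE_PATH.exists():
        for line in STATE_PATH.read_text().splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip() in FIELDS:
                state[key.strip()] = value.strip()
    return state

def write_state(state: dict) -> None:
    """Rewrite the whole file; it is a minimal pointer, so no merging is needed."""
    lines = ["# Autopilot State", "", "## Current Step", ""]
    lines += [f"{field}: {state.get(field, '')}" for field in FIELDS]
    STATE_PATH.parent.mkdir(parents=True, exist_ok=True)
    STATE_PATH.write_text("\n".join(lines) + "\n")
```
Because the file is a single pointer, every state transition is a full rewrite rather than an append.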
## Trigger Conditions
This skill activates when the user wants to:
- Start a new project from scratch
- Continue an in-progress project
- Check project status
- Let the AI guide them through the full workflow
**Keywords**: "autopilot", "auto", "start", "continue", "what's next", "where am I", "project status"
**Differentiation**:
- User wants only research → use `/research` directly
- User wants only planning → use `/plan` directly
- User wants to document an existing codebase → use `/document` directly
- User wants the full guided workflow → use `/autopilot`
## Flow Reference
See `flows/greenfield.md` and `flows/existing-code.md` for step tables, detection rules, auto-chain rules, and status summary templates.
@@ -0,0 +1,297 @@
# Existing Code Workflow
Workflow for projects with an existing codebase. Starts with documentation, produces test specs, checks code testability (refactoring if needed), decomposes and implements tests, verifies them, refactors with that safety net, then adds new functionality and deploys.
## Step Reference Table
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 1 | Document | document/SKILL.md | Steps 1–8 |
| 2 | Test Spec | test-spec/SKILL.md | Phases 1a–1b |
| 3 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 0–7 (conditional) |
| 4 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
| 5 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 6 | Run Tests | test-run/SKILL.md | Steps 1–4 |
| 7 | Refactor | refactor/SKILL.md | Phases 0–7 (optional) |
| 8 | New Task | new-task/SKILL.md | Steps 1–8 (loop) |
| 9 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 10 | Run Tests | test-run/SKILL.md | Steps 1–4 |
| 11 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
| 12 | Security Audit | security/SKILL.md | Phases 1–5 (optional) |
| 13 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
| 14 | Deploy | deploy/SKILL.md | Steps 1–7 |
After Step 14, the existing-code workflow is complete.
## Detection Rules
Check rules in order — first match wins.
---
**Step 1 — Document**
Condition: `_docs/` does not exist AND the workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`, `src/`, `Cargo.toml`, `*.csproj`, `package.json`)
Action: An existing codebase without documentation was detected. Read and execute `.cursor/skills/document/SKILL.md`. After the document skill completes, re-detect state (the produced `_docs/` artifacts will place the project at Step 2 or later).
---
**Step 2 — Test Spec**
Condition: `_docs/02_document/FINAL_report.md` exists AND workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`) AND `_docs/02_document/tests/traceability-matrix.md` does not exist AND the autopilot state shows `step >= 2` (Document already ran)
Action: Read and execute `.cursor/skills/test-spec/SKILL.md`
This step applies when the codebase was documented via the `/document` skill. Test specifications must be produced before refactoring or further development.
---
**Step 3 — Code Testability Revision**
Condition: `_docs/02_document/tests/traceability-matrix.md` exists AND the autopilot state shows Test Spec (Step 2) is completed AND the autopilot state does NOT show Code Testability Revision (Step 3) as completed or skipped
Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`
2. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
- Hardcoded file paths or directory references
- Hardcoded configuration values (URLs, credentials, magic numbers)
- Global mutable state that cannot be overridden
- Tight coupling to external services without abstraction
- Missing dependency injection or non-configurable parameters
- Direct file system operations without path configurability
- Inline construction of heavy dependencies (models, clients)
3. If ALL scenarios are testable as-is:
- Mark Step 3 as `completed` with outcome "Code is testable — no changes needed"
- Auto-chain to Step 4 (Decompose Tests)
4. If testability issues are found:
- Create `_docs/04_refactoring/01-testability-refactoring/`
- Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
- **Mode**: `guided`
- **Source**: `autopilot-testability-analysis`
- One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies)
- Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
- The refactor skill will create RUN_DIR (`01-testability-refactoring`), create tasks in `_docs/02_tasks/todo/`, delegate to implement skill, and verify results
- Phase 3 (Safety Net) is automatically skipped by the refactor skill for testability runs
- After refactoring completes, mark Step 3 as `completed`
- Auto-chain to Step 4 (Decompose Tests)
---
**Step 4 — Decompose Tests**
Condition: `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND the autopilot state shows Step 3 (Code Testability Revision) is completed or skipped AND (`_docs/02_tasks/todo/` does not exist or has no test task files)
Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
1. Run Step 1t (test infrastructure bootstrap)
2. Run Step 3 (blackbox test task decomposition)
3. Run Step 4 (cross-verification against test coverage)
If `_docs/02_tasks/` subfolders have some task files already (e.g., refactoring tasks from Step 3), the decompose skill's resumability handles it — it appends test tasks alongside existing tasks.
---
**Step 5 — Implement Tests**
Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND the autopilot state shows Step 4 (Decompose Tests) is completed AND `_docs/03_implementation/implementation_report_tests.md` does not exist
Action: Read and execute `.cursor/skills/implement/SKILL.md`
The implement skill reads test tasks from `_docs/02_tasks/todo/` and implements them.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
---
**Step 6 — Run Tests**
Condition: `_docs/03_implementation/implementation_report_tests.md` exists AND the autopilot state shows Step 5 (Implement Tests) is completed AND the autopilot state does NOT show Step 6 (Run Tests) as completed
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
Verifies the implemented test suite passes before proceeding to refactoring. The tests form the safety net for all subsequent code changes.
---
**Step 7 — Refactor (optional)**
Condition: the autopilot state shows Step 6 (Run Tests) is completed AND the autopilot state does NOT show Step 7 (Refactor) as completed or skipped AND no `_docs/04_refactoring/` run folder contains a `FINAL_report.md` for a non-testability run
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Refactor codebase before adding new features?
══════════════════════════════════════
A) Run refactoring (recommended if code quality issues were noted during documentation)
B) Skip — proceed directly to New Task
══════════════════════════════════════
Recommendation: [A or B — base on whether documentation
flagged significant code smells, coupling issues, or
technical debt worth addressing before new development]
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/refactor/SKILL.md` in automatic mode. The refactor skill creates a new run folder in `_docs/04_refactoring/` (e.g., `02-coupling-refactoring`), runs the full method using the implemented tests as a safety net. After completion, auto-chain to Step 8 (New Task).
- If user picks B → Mark Step 7 as `skipped` in the state file, auto-chain to Step 8 (New Task).
---
**Step 8 — New Task**
Condition: the autopilot state shows Step 7 (Refactor) is completed or skipped AND the autopilot state does NOT show Step 8 (New Task) as completed
Action: Read and execute `.cursor/skills/new-task/SKILL.md`
The new-task skill interactively guides the user through defining new functionality. It loops until the user is done adding tasks. New task files are written to `_docs/02_tasks/todo/`.
---
**Step 9 — Implement**
Condition: the autopilot state shows Step 8 (New Task) is completed AND `_docs/03_implementation/` does not contain an `implementation_report_*.md` file other than `implementation_report_tests.md` (the tests report from Step 5 is excluded from this check)
Action: Read and execute `.cursor/skills/implement/SKILL.md`
The implement skill reads the new tasks from `_docs/02_tasks/todo/` and implements them. Tasks already implemented in Step 5 are skipped (completed tasks have been moved to `done/`).
If `_docs/03_implementation/` has batch reports from this phase, the implement skill detects completed tasks and continues.
---
**Step 10 — Run Tests**
Condition: the autopilot state shows Step 9 (Implement) is completed AND the autopilot state does NOT show Step 10 (Run Tests) as completed
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
---
**Step 11 — Update Docs**
Condition: the autopilot state shows Step 10 (Run Tests) is completed AND the autopilot state does NOT show Step 11 (Update Docs) as completed AND `_docs/02_document/` contains existing documentation (module or component docs)
Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all task spec files from `_docs/02_tasks/done/` that were implemented in the current cycle (i.e., tasks moved to `done/` during Steps 8–9 of this cycle).
The document skill in Task mode:
1. Reads each task spec to identify changed source files
2. Updates affected module docs, component docs, and system-level docs
3. Does NOT redo full discovery, verification, or problem extraction
If `_docs/02_document/` does not contain existing docs (e.g., documentation step was skipped), mark Step 11 as `skipped`.
After completion, auto-chain to Step 12 (Security Audit).
---
**Step 12 — Security Audit (optional)**
Condition: the autopilot state shows Step 11 (Update Docs) is completed or skipped AND the autopilot state does NOT show Step 12 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Run security audit before deploy?
══════════════════════════════════════
A) Run security audit (recommended for production deployments)
B) Skip — proceed directly to deploy
══════════════════════════════════════
Recommendation: A — catches vulnerabilities before production
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 13 (Performance Test).
- If user picks B → Mark Step 12 as `skipped` in the state file, auto-chain to Step 13 (Performance Test).
---
**Step 13 — Performance Test (optional)**
Condition: the autopilot state shows Step 12 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 13 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Run performance/load tests before deploy?
══════════════════════════════════════
A) Run performance tests (recommended for latency-sensitive or high-load systems)
B) Skip — proceed directly to deploy
══════════════════════════════════════
Recommendation: [A or B — base on whether acceptance criteria
include latency, throughput, or load requirements]
══════════════════════════════════════
```
- If user picks A → Run performance tests:
1. If `scripts/run-performance-tests.sh` exists (generated by the test-spec skill Phase 4), execute it
2. Otherwise, check if `_docs/02_document/tests/performance-tests.md` exists for test scenarios, detect an appropriate load-testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute the performance test scenarios against the running system (see the sketch after this step)
3. Present results vs acceptance criteria thresholds
4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort
5. After completion, auto-chain to Step 14 (Deploy)
- If user picks B → Mark Step 13 as `skipped` in the state file, auto-chain to Step 14 (Deploy).
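A sketch of option A's fallback (sub-step 2 above): prefer the generated script, otherwise run whichever supported load-testing tool is installed. The per-tool commands and scenario file paths are illustrative placeholders, not the skill's prescribed invocations:
```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical default invocations; real scenarios come from
# _docs/02_document/tests/performance-tests.md and the acceptance criteria.
TOOL_COMMANDS = {
    "k6": ["k6", "run", "perf/load-test.js"],
    "locust": ["locust", "--headless", "-f", "perf/locustfile.py", "--run-time", "2m"],
    "artillery": ["artillery", "run", "perf/load-test.yml"],
    "wrk": ["wrk", "-t4", "-c64", "-d60s", "http://localhost:8080/"],
}

def run_performance_tests() -> int:
    """Prefer scripts/run-performance-tests.sh; fall back to the first detected tool."""
    script = Path("scripts/run-performance-tests.sh")
    if script.exists():
        return subprocess.run(["bash", str(script)]).returncode
    if not Path("_docs/02_document/tests/performance-tests.md").exists():
        print("No performance test scenarios found; nothing to run.")
        return 0
    for tool, command in TOOL_COMMANDS.items():
        if shutil.which(tool):
            print(f"Detected {tool}; running an illustrative default scenario.")
            return subprocess.run(command).returncode
    print("No supported load-testing tool detected.")
    return 1
```
The exit code gives only a coarse pass/fail signal; results still need to be compared against the acceptance criteria thresholds in sub-step 3.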
---
**Step 14 — Deploy**
Condition: the autopilot state shows Step 10 (Run Tests) is completed AND (Step 11 is completed or skipped) AND (Step 12 is completed or skipped) AND (Step 13 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Read and execute `.cursor/skills/deploy/SKILL.md`
After deployment completes, the existing-code workflow is done.
---
**Re-Entry After Completion**
Condition: the autopilot state shows `step: done` OR all steps through 14 (Deploy) are completed
Action: The project completed a full cycle. Print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:
```
══════════════════════════════════════
PROJECT CYCLE COMPLETE
══════════════════════════════════════
The previous cycle finished successfully.
Starting new feature cycle…
══════════════════════════════════════
```
Set `step: 8`, `status: not_started` in the state file, then auto-chain to Step 8 (New Task).
Note: the loop (Steps 8 → 14 → 8) ensures every feature cycle includes: New Task → Implement → Run Tests → Update Docs → Security → Performance → Deploy.
## Auto-Chain Rules
| Completed Step | Next Action |
|---------------|-------------|
| Document (1) | Auto-chain → Test Spec (2) |
| Test Spec (2) | Auto-chain → Code Testability Revision (3) |
| Code Testability Revision (3) | Auto-chain → Decompose Tests (4) |
| Decompose Tests (4) | **Session boundary** — suggest new conversation before Implement Tests |
| Implement Tests (5) | Auto-chain → Run Tests (6) |
| Run Tests (6, all pass) | Auto-chain → Refactor choice (7) |
| Refactor (7, done or skipped) | Auto-chain → New Task (8) |
| New Task (8) | **Session boundary** — suggest new conversation before Implement |
| Implement (9) | Auto-chain → Run Tests (10) |
| Run Tests (10, all pass) | Auto-chain → Update Docs (11) |
| Update Docs (11) | Auto-chain → Security Audit choice (12) |
| Security Audit (12, done or skipped) | Auto-chain → Performance Test choice (13) |
| Performance Test (13, done or skipped) | Auto-chain → Deploy (14) |
| Deploy (14) | **Workflow complete** — existing-code flow done |
## Status Summary Template
```
═══════════════════════════════════════════════════
AUTOPILOT STATUS (existing-code)
═══════════════════════════════════════════════════
Step 1 Document [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 2 Test Spec [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 3 Code Testability Rev. [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 4 Decompose Tests [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 5 Implement Tests [DONE / IN PROGRESS (batch M) / NOT STARTED / FAILED (retry N/3)]
Step 6 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 7 Refactor [DONE / SKIPPED / IN PROGRESS (phase N) / NOT STARTED / FAILED (retry N/3)]
Step 8 New Task [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 9 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
Step 10 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 11 Update Docs [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 12 Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 13 Performance Test [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 14 Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
═══════════════════════════════════════════════════
Current: Step N — Name
SubStep: M — [sub-skill internal step name]
Retry: [N/3 if retrying, omit if 0]
Action: [what will happen next]
═══════════════════════════════════════════════════
```
@@ -0,0 +1,235 @@
# Greenfield Workflow
Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy.
## Step Reference Table
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 1 | Problem | problem/SKILL.md | Phases 1–4 |
| 2 | Research | research/SKILL.md | Mode A: Phases 1–4 · Mode B: Steps 0–8 |
| 3 | Plan | plan/SKILL.md | Steps 1–6 + Final |
| 4 | UI Design | ui-design/SKILL.md | Phases 0–8 (conditional — UI projects only) |
| 5 | Decompose | decompose/SKILL.md | Steps 1–4 |
| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 7 | Run Tests | test-run/SKILL.md | Steps 1–4 |
| 8 | Security Audit | security/SKILL.md | Phases 1–5 (optional) |
| 9 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
| 10 | Deploy | deploy/SKILL.md | Steps 1–7 |
## Detection Rules
Check rules in order — first match wins.
---
**Step 1 — Problem Gathering**
Condition: `_docs/00_problem/` does not exist, OR any of these are missing/empty:
- `problem.md`
- `restrictions.md`
- `acceptance_criteria.md`
- `input_data/` (must contain at least one file)
Action: Read and execute `.cursor/skills/problem/SKILL.md`
---
**Step 2 — Research (Initial)**
Condition: `_docs/00_problem/` is complete AND `_docs/01_solution/` has no `solution_draft*.md` files
Action: Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode A)
---
**Research Decision** (inline gate between Step 2 and Step 3)
Condition: `_docs/01_solution/` contains `solution_draft*.md` files AND `_docs/01_solution/solution.md` does not exist AND `_docs/02_document/architecture.md` does not exist
Action: Present the current research state to the user:
- How many solution drafts exist
- Whether tech_stack.md and security_analysis.md exist
- One-line summary from the latest draft
Then present using the **Choose format**:
```
══════════════════════════════════════
DECISION REQUIRED: Research complete — next action?
══════════════════════════════════════
A) Run another research round (Mode B assessment)
B) Proceed to planning with current draft
══════════════════════════════════════
Recommendation: [A or B] — [reason based on draft quality]
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode B)
- If user picks B → auto-chain to Step 3 (Plan)
---
**Step 3 — Plan**
Condition: `_docs/01_solution/` has `solution_draft*.md` files AND `_docs/02_document/architecture.md` does not exist
Action:
1. The plan skill's Prereq 2 will rename the latest draft to `solution.md` — this is handled by the plan skill itself
2. Read and execute `.cursor/skills/plan/SKILL.md`
If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FINAL_report.md`), the plan skill's built-in resumability handles it.
---
**Step 4 — UI Design (conditional)**
Condition: `_docs/02_document/architecture.md` exists AND the autopilot state does NOT show Step 4 (UI Design) as completed or skipped AND the project is a UI project
**UI Project Detection** — the project is a UI project if ANY of the following are true (a detection sketch follows this step):
- `package.json` exists in the workspace root or any subdirectory
- `*.html`, `*.jsx`, `*.tsx` files exist in the workspace
- `_docs/02_document/components/` contains a component whose `description.md` mentions UI, frontend, page, screen, dashboard, form, or view
- `_docs/02_document/architecture.md` mentions frontend, UI layer, SPA, or client-side rendering
- `_docs/01_solution/solution.md` mentions frontend, web interface, or user-facing UI
If the project is NOT a UI project → mark Step 4 as `skipped` in the state file and auto-chain to Step 5.
If the project IS a UI project → present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: UI project detected — generate mockups?
══════════════════════════════════════
A) Generate UI mockups before decomposition (recommended)
B) Skip — proceed directly to decompose
══════════════════════════════════════
Recommendation: A — mockups before decomposition
produce better task specs for frontend components
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/ui-design/SKILL.md`. After completion, auto-chain to Step 5 (Decompose).
- If user picks B → Mark Step 4 as `skipped` in the state file, auto-chain to Step 5 (Decompose).
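A sketch of the file- and keyword-based checks from the UI Project Detection list above; the keyword set mirrors the bullets, and the crude substring match is deliberate for brevity:
```python
from pathlib import Path

UI_FILE_PATTERNS = ["package.json", "*.html", "*.jsx", "*.tsx"]
UI_KEYWORDS = ("frontend", "ui layer", "spa", "client-side", "page", "screen",
               "dashboard", "form", "web interface", "user-facing ui")

def mentions_ui(path: Path) -> bool:
    """True if the docs file exists and mentions any UI keyword (crude substring match)."""
    if not path.exists():
        return False
    text = path.read_text(errors="ignore").lower()
    return any(keyword in text for keyword in UI_KEYWORDS)

def is_ui_project(root: Path) -> bool:
    # File-based signals: package.json anywhere, or HTML/JSX/TSX sources.
    if any(next(root.rglob(pattern), None) is not None for pattern in UI_FILE_PATTERNS):
        return True
    # Docs-based signals: component descriptions, architecture, solution.
    docs = root / "_docs"
    candidates = [docs / "02_document" / "architecture.md",
                  docs / "01_solution" / "solution.md"]
    components = docs / "02_document" / "components"
    if components.exists():
        candidates += list(components.glob("*/description.md"))
    return any(mentions_ui(path) for path in candidates)
```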
---
**Step 5 — Decompose**
Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/todo/` does not exist or has no task files
Action: Read and execute `.cursor/skills/decompose/SKILL.md`
If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it.
---
**Step 6 — Implement**
Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any `implementation_report_*.md` file
Action: Read and execute `.cursor/skills/implement/SKILL.md`
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.
---
**Step 7 — Run Tests**
Condition: `_docs/03_implementation/` contains an `implementation_report_*.md` file AND the autopilot state does NOT show Step 7 (Run Tests) as completed AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
---
**Step 8 — Security Audit (optional)**
Condition: the autopilot state shows Step 7 (Run Tests) is completed AND the autopilot state does NOT show Step 8 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Run security audit before deploy?
══════════════════════════════════════
A) Run security audit (recommended for production deployments)
B) Skip — proceed directly to deploy
══════════════════════════════════════
Recommendation: A — catches vulnerabilities before production
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 9 (Performance Test).
- If user picks B → Mark Step 8 as `skipped` in the state file, auto-chain to Step 9 (Performance Test).
---
**Step 9 — Performance Test (optional)**
Condition: the autopilot state shows Step 8 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 9 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Run performance/load tests before deploy?
══════════════════════════════════════
A) Run performance tests (recommended for latency-sensitive or high-load systems)
B) Skip — proceed directly to deploy
══════════════════════════════════════
Recommendation: [A or B — base on whether acceptance criteria
include latency, throughput, or load requirements]
══════════════════════════════════════
```
- If user picks A → Run performance tests:
1. If `scripts/run-performance-tests.sh` exists (generated by the test-spec skill Phase 4), execute it
2. Otherwise, check if `_docs/02_document/tests/performance-tests.md` exists for test scenarios, detect appropriate load testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute performance test scenarios against the running system
3. Present results vs acceptance criteria thresholds
4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort
5. After completion, auto-chain to Step 10 (Deploy)
- If user picks B → Mark Step 9 as `skipped` in the state file, auto-chain to Step 10 (Deploy).
---
**Step 10 — Deploy**
Condition: the autopilot state shows Step 7 (Run Tests) is completed AND (Step 8 is completed or skipped) AND (Step 9 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Read and execute `.cursor/skills/deploy/SKILL.md`
---
**Done**
Condition: `_docs/04_deploy/` contains all expected artifacts (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md)
Action: Report project completion with summary. If the user runs autopilot again after greenfield completion, Flow Resolution rule 3 routes to the existing-code flow (re-entry after completion) so they can add new features.
## Auto-Chain Rules
| Completed Step | Next Action |
|---------------|-------------|
| Problem (1) | Auto-chain → Research (2) |
| Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
| Research Decision → proceed | Auto-chain → Plan (3) |
| Plan (3) | Auto-chain → UI Design detection (4) |
| UI Design (4, done or skipped) | Auto-chain → Decompose (5) |
| Decompose (5) | **Session boundary** — suggest new conversation before Implement |
| Implement (6) | Auto-chain → Run Tests (7) |
| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) |
| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) |
| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) |
| Deploy (10) | Report completion |
## Status Summary Template
```
═══════════════════════════════════════════════════
AUTOPILOT STATUS (greenfield)
═══════════════════════════════════════════════════
Step 1 Problem [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 2 Research [DONE (N drafts) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 3 Plan [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 4 UI Design [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 5 Decompose [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 6 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
Step 7 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 8 Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 9 Performance Test [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 10 Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
═══════════════════════════════════════════════════
Current: Step N — Name
SubStep: M — [sub-skill internal step name]
Retry: [N/3 if retrying, omit if 0]
Action: [what will happen next]
═══════════════════════════════════════════════════
```
@@ -0,0 +1,365 @@
# Autopilot Protocols
## User Interaction Protocol
Every time the autopilot or a sub-skill needs a user decision, use the **Choose A / B / C / D** format. This applies to:
- State transitions where multiple valid next actions exist
- Sub-skill BLOCKING gates that require user judgment
- Any fork where the autopilot cannot confidently pick the right path
- Trade-off decisions (tech choices, scope, risk acceptance)
### When to Ask (MUST ask)
- The next action is ambiguous (e.g., "another research round or proceed?")
- The decision has irreversible consequences (e.g., architecture choices, skipping a step)
- The user's intent or preference cannot be inferred from existing artifacts
- A sub-skill's BLOCKING gate explicitly requires user confirmation
- Multiple valid approaches exist with meaningfully different trade-offs
### When NOT to Ask (auto-transition)
- Only one logical next step exists (e.g., Problem complete → Research is the only option)
- The transition is deterministic from the state (e.g., Plan complete → Decompose)
- The decision is low-risk and reversible
- Existing artifacts or prior decisions already imply the answer
### Choice Format
Always present decisions in this format:
```
══════════════════════════════════════
DECISION REQUIRED: [brief context]
══════════════════════════════════════
A) [Option A — short description]
B) [Option B — short description]
C) [Option C — short description, if applicable]
D) [Option D — short description, if applicable]
══════════════════════════════════════
Recommendation: [A/B/C/D] — [one-line reason]
══════════════════════════════════════
```
Rules:
1. Always provide 2–4 concrete options (never open-ended questions)
2. Always include a recommendation with a brief justification
3. Keep option descriptions to one line each
4. If only 2 options make sense, use A/B only — do not pad with filler options
5. Play the notification sound (per `.cursor/rules/human-attention-sound.mdc`) before presenting the choice
6. After the user picks, proceed immediately — no follow-up confirmation unless the choice was destructive
## Work Item Tracker Authentication
Several workflow steps create work items (epics, tasks, links). The system uses a task tracker MCP server as an interchangeable backend for this.
### Tracker Detection
1. If no task tracker MCP server is available or it is not authorized, ask the user which tracker to use
2. Record the choice in the state file: `tracker: jira` or `tracker: ado`
3. If neither tracker is available, set `tracker: local` and proceed without external tracking
### Steps That Require Work Item Tracker
| Flow | Step | Sub-Step | Tracker Action |
|------|------|----------|----------------|
| greenfield | 3 (Plan) | Step 6 — Epics | Create epics for each component |
| greenfield | 5 (Decompose) | Steps 1–3 — All tasks | Create ticket per task, link to epic |
| existing-code | 4 (Decompose Tests) | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
| existing-code | 8 (New Task) | Step 7 — Ticket | Create ticket per task, link to epic |
### Authentication Gate
Before entering a step that requires work item tracking (see table above) for the first time, the autopilot must:
1. Call `mcp_auth` on the detected tracker's MCP server
2. If authentication succeeds → proceed normally
3. If the user **skips** or authentication fails → present using Choose format:
```
══════════════════════════════════════
Tracker authentication failed
══════════════════════════════════════
A) Retry authentication (retry mcp_auth)
B) Continue without tracker (tasks saved locally only)
══════════════════════════════════════
Recommendation: A — Tracker IDs drive task referencing,
dependency tracking, and implementation batching.
Without tracker, task files use numeric prefixes instead.
══════════════════════════════════════
```
If user picks **B** (continue without tracker):
- Set a flag in the state file: `tracker: local`
- All skills that would create tickets instead save metadata locally in the task/epic files with `Tracker: pending` status
- Task files keep numeric prefixes (e.g., `01_initial_structure.md`) instead of tracker ID prefixes
- The workflow proceeds normally in all other respects
### Re-Authentication
If the tracker MCP was already authenticated in a previous invocation (verify by listing available tools beyond `mcp_auth`), skip the auth gate.
## Error Handling
All error situations that require user input MUST use the **Choose A / B / C / D** format.
| Situation | Action |
|-----------|--------|
| State detection is ambiguous (artifacts suggest two different steps) | Present findings and use Choose format with the candidate steps as options |
| Sub-skill fails or hits an unrecoverable blocker | Use Choose format: A) retry, B) skip with warning, C) abort and fix manually |
| User wants to skip a step | Use Choose format: A) skip (with dependency warning), B) execute the step |
| User wants to go back to a previous step | Use Choose format: A) re-run (with overwrite warning), B) stay on current step |
| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution |
## Skill Failure Retry Protocol
Sub-skills can return a **failed** result. Failures are often caused by missing user input, environment issues, or transient errors that resolve on retry. The autopilot auto-retries before escalating.
### Retry Flow
```
Skill execution → FAILED
├─ retry_count < 3 ?
│ YES → increment retry_count in state file
│ → re-read the sub-skill's SKILL.md
│ → re-execute from the current sub_step
│ → (loop back to check result)
│ NO (retry_count = 3) →
│ → set status: failed in Current Step
│ → present warning to user (see Escalation below)
│ → do NOT auto-retry again until user intervenes
```
### Retry Rules
1. **Auto-retry immediately**: when a skill fails, retry it without asking the user — the failure is often transient (missing user confirmation in a prior step, docker not running, file lock, etc.)
2. **Preserve sub_step**: retry from the last recorded `sub_step`, not from the beginning of the skill — unless the failure indicates corruption, in which case restart from sub_step 1
3. **Increment `retry_count`**: update `retry_count` in the state file's `Current Step` section on each retry attempt
4. **Reset on success**: when the skill eventually succeeds, reset `retry_count: 0`
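A compact sketch of this bookkeeping, assuming `execute_skill` resumes from the recorded `sub_step` and reports success or failure; the state dict uses the fields from the state file template:
```python
MAX_RETRIES = 3

def run_with_retries(state: dict, execute_skill) -> dict:
    """Auto-retry the sub-skill from its recorded sub_step, escalating after 3 failures."""
    while True:
        if execute_skill(state.get("sub_step", "0")):   # rule 2: resume, don't restart
            state["retry_count"] = "0"                   # rule 4: reset on success
            state["status"] = "completed"
            return state
        retries = int(state.get("retry_count", "0")) + 1
        state["retry_count"] = str(retries)              # rule 3: persist each attempt
        if retries >= MAX_RETRIES:
            state["status"] = "failed"                   # stop; escalate per the section below
            return state
        # rule 1: retry immediately, without asking the user
```
The caller is expected to rewrite the state file after every mutation so a new conversation can pick up from the same point.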
### Escalation (after 3 consecutive failures)
After 3 failed auto-retries of the same skill, the failure is likely not user-related. Stop retrying and escalate:
1. Update the state file: set `status: failed` and `retry_count: 3` in `Current Step`
2. Play notification sound (per `.cursor/rules/human-attention-sound.mdc`)
3. Present using Choose format:
```
══════════════════════════════════════
SKILL FAILED: [Skill Name] — 3 consecutive failures
══════════════════════════════════════
Step: [N] — [Name]
SubStep: [M] — [sub-step name]
Last failure reason: [reason]
══════════════════════════════════════
A) Retry with fresh context (new conversation)
B) Skip this step with warning
C) Abort — investigate and fix manually
══════════════════════════════════════
Recommendation: A — fresh context often resolves
persistent failures
══════════════════════════════════════
```
### Re-Entry After Failure
On the next autopilot invocation (new conversation), if the state file shows `status: failed` and `retry_count: 3`:
- Present the blocker to the user before attempting execution
- If the user chooses to retry → reset `retry_count: 0`, set `status: in_progress`, and re-execute
- If the user chooses to skip → mark step as `skipped`, proceed to next step
- Do NOT silently auto-retry — the user must acknowledge the persistent failure first
## Error Recovery Protocol
### Stuck Detection
When executing a sub-skill, monitor for these signals:
- Same artifact overwritten 3+ times without meaningful change
- Sub-skill repeatedly asks the same question after receiving an answer
- No new artifacts saved for an extended period despite active execution
### Recovery Actions (ordered)
1. **Re-read state**: read `_docs/_autopilot_state.md` and cross-check against `_docs/` folders
2. **Retry current sub-step**: re-read the sub-skill's SKILL.md and restart from the current sub-step
3. **Escalate**: after 2 failed retries, present diagnostic summary to user using Choose format:
```
══════════════════════════════════════
RECOVERY: [skill name] stuck at [sub-step]
══════════════════════════════════════
A) Retry with fresh context (new conversation)
B) Skip this sub-step with warning
C) Abort and fix manually
══════════════════════════════════════
Recommendation: A — fresh context often resolves stuck loops
══════════════════════════════════════
```
### Circuit Breaker
If the same autopilot step fails 3 consecutive times across conversations:
- Do NOT auto-retry on next invocation
- Present the failure pattern and ask user for guidance before attempting again
## Context Management Protocol
### Principle
Disk is memory. Never rely on in-context accumulation — read from `_docs/` artifacts, not from conversation history.
### Minimal Re-Read Set Per Skill
When re-entering a skill (new conversation or context refresh):
- Always read: `_docs/_autopilot_state.md`
- Always read: the active skill's `SKILL.md`
- Conditionally read: only the `_docs/` artifacts the current sub-step requires (listed in each skill's Context Resolution section)
- Never bulk-read: do not load all `_docs/` files at once
### Mid-Skill Interruption
If context is filling up during a long skill (e.g., document, implement):
1. Save current sub-step progress to the skill's artifact directory
2. Update `_docs/_autopilot_state.md` with exact sub-step position
3. Suggest a new conversation: "Context is getting long — recommend continuing in a fresh conversation for better results"
4. On re-entry, the skill's resumability protocol picks up from the saved sub-step
### Large Artifact Handling
When a skill needs to read large files (e.g., full solution.md, architecture.md):
- Read only the sections relevant to the current sub-step
- Use search tools (Grep, SemanticSearch) to find specific sections rather than reading entire files
- Summarize key decisions from prior steps in the state file so they don't need to be re-read
### Context Budget Heuristic
Agents cannot programmatically query context window usage. Use these heuristics to avoid degradation:
| Zone | Indicators | Action |
|------|-----------|--------|
| **Safe** | State file + SKILL.md + 23 focused artifacts loaded | Continue normally |
| **Caution** | 5+ artifacts loaded, or 3+ large files (architecture, solution, discovery), or conversation has 20+ tool calls | Complete current sub-step, then suggest session break |
| **Danger** | Repeated truncation in tool output, tool calls failing unexpectedly, responses becoming shallow or repetitive | Save immediately, update state file, force session boundary |
**Skill-specific guidelines**:
| Skill | Recommended session breaks |
|-------|---------------------------|
| **document** | After every ~5 modules in Step 1; between Step 4 (Verification) and Step 5 (Solution Extraction) |
| **implement** | Each batch is a natural checkpoint; if more than 2 batches completed in one session, suggest break |
| **plan** | Between Step 5 (Test Specifications) and Step 6 (Epics) for projects with many components |
| **research** | Between Mode A rounds; between Mode A and Mode B |
**How to detect caution/danger zone without API**:
1. Count tool calls made so far — if approaching 20+, context is likely filling up
2. If reading a file returns truncated content, context is under pressure
3. If the agent starts producing shorter or less detailed responses than earlier in the conversation, context quality is degrading
4. When in doubt, save and suggest a new conversation — re-entry is cheap thanks to the state file
## Rollback Protocol
### Implementation Steps (git-based)
Handled by `/implement` skill — each batch commit is a rollback checkpoint via `git revert`.
### Planning/Documentation Steps (artifact-based)
For steps that produce `_docs/` artifacts (problem, research, plan, decompose, document):
1. **Before overwriting**: if re-running a step that already has artifacts, the sub-skill's prerequisite check asks the user (resume/overwrite/skip)
2. **Rollback to previous step**: use Choose format:
```
══════════════════════════════════════
ROLLBACK: Re-run [step name]?
══════════════════════════════════════
A) Re-run the step (overwrites current artifacts)
B) Stay on current step
══════════════════════════════════════
Warning: This will overwrite files in _docs/[folder]/
══════════════════════════════════════
```
3. **Git safety net**: artifacts are committed with each autopilot step completion. To roll back: `git log --oneline _docs/` to find the commit, then `git checkout <commit> -- _docs/<folder>/`
4. **State file rollback**: when rolling back artifacts, also update `_docs/_autopilot_state.md` to reflect the rolled-back step (set it to `in_progress`, clear completed date)
## Debug / Error Recovery Protocol
When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or an implementer subagent reports a blocker, the user is asked to intervene. This protocol guides the recovery process.
### Structured Debugging Workflow
When escalated to the user after implementation failure:
1. **Classify the failure** — determine the category:
- **Missing dependency**: a package, service, or module the task needs but isn't available
- **Logic error**: code runs but produces wrong results (assertion failures, incorrect output)
- **Integration mismatch**: interfaces between components don't align (type errors, missing methods, wrong signatures)
- **Environment issue**: Docker, database, network, or configuration problem
- **Spec ambiguity**: the task spec is unclear or contradictory
2. **Reproduce** — isolate the failing behavior:
- Run the specific failing test(s) in isolation
- Check whether the failure is deterministic or intermittent
- Capture the exact error message, stack trace, and relevant file:line
3. **Narrow scope** — focus on the minimal reproduction:
- For logic errors: trace the data flow from input to the point of failure
- For integration mismatches: compare the caller's expectations against the callee's actual interface
- For environment issues: verify Docker services are running, DB is accessible, env vars are set
4. **Fix and verify** — apply the fix and confirm:
- Make the minimal change that fixes the root cause
- Re-run the failing test(s) to confirm the fix
- Run the full test suite to check for regressions
- If the fix changes a shared interface, check all consumers
5. **Report** — update the batch report with:
- Root cause category
- Fix applied (file:line, description)
- Tests that now pass
### Common Recovery Patterns
| Failure Pattern | Typical Root Cause | Recovery Action |
|----------------|-------------------|----------------|
| ImportError / ModuleNotFoundError | Missing dependency or wrong path | Install dependency or fix import path |
| TypeError on method call | Interface mismatch between tasks | Align caller with callee's actual signature |
| AssertionError in test | Logic bug or wrong expected value | Fix logic or update test expectations |
| ConnectionRefused | Service not running | Start Docker services, check docker-compose |
| Timeout | Blocking I/O or infinite loop | Add timeout, fix blocking call |
| FileNotFoundError | Hardcoded path or missing fixture | Make path configurable, add fixture |
### Escalation
If debugging does not resolve the issue after 2 focused attempts:
```
══════════════════════════════════════
DEBUG ESCALATION: [failure description]
══════════════════════════════════════
Root cause category: [category]
Attempted fixes: [list]
Current state: [what works, what doesn't]
══════════════════════════════════════
A) Continue debugging with more context
B) Revert this batch and skip the task (move to backlog)
C) Simplify the task scope and retry
══════════════════════════════════════
```
## Status Summary
On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback). Use the Status Summary Template from the active flow file (`flows/greenfield.md` or `flows/existing-code.md`).
For re-entry (state file exists), cross-check the current step against `_docs/` folder structure and present any `status: failed` state to the user before continuing.
@@ -0,0 +1,92 @@
# Autopilot State Management
## State File: `_docs/_autopilot_state.md`
The autopilot persists its position to `_docs/_autopilot_state.md`. This is a lightweight pointer — only the current step. All history lives in `_docs/` artifacts and git log. Folder scanning is the fallback when the state file doesn't exist.
### Template
```markdown
# Autopilot State
## Current Step
flow: [greenfield | existing-code]
step: [1-10 for greenfield, 1-14 for existing-code, or "done"]
name: [step name from the active flow's Step Reference Table]
status: [not_started / in_progress / completed / skipped / failed]
sub_step: [0, or sub-skill internal step number + name if interrupted mid-step]
retry_count: [0-3 — consecutive auto-retry attempts, reset to 0 on success]
```
### Examples
```
flow: greenfield
step: 3
name: Plan
status: in_progress
sub_step: 4 — Architecture Review & Risk Assessment
retry_count: 0
```
```
flow: existing-code
step: 2
name: Test Spec
status: failed
sub_step: 1b — Test Case Generation
retry_count: 3
```
### State File Rules
1. **Create** on the first autopilot invocation (after state detection determines Step 1)
2. **Update** after every change — this includes: batch completion, sub-step progress, step completion, session boundary, failed retry, or any meaningful state transition. The state file must always reflect the current reality.
3. **Read** as the first action on every invocation — before folder scanning
4. **Cross-check**: verify against actual `_docs/` folder contents. If they disagree, trust the folder structure and update the state file
5. **Never delete** the state file
6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` on success. If `retry_count` reaches 3, set `status: failed`
7. **Failed state on re-entry**: if `status: failed` with `retry_count: 3`, do NOT auto-retry — present the issue to the user first
8. **Skill-internal state**: when the active skill maintains its own state file (e.g., document skill's `_docs/02_document/state.json`), the autopilot's `sub_step` field should reflect the skill's internal progress. On re-entry, cross-check the skill's state file against the autopilot's `sub_step` for consistency.
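A sketch of rule 4's cross-check, using a few representative greenfield folder checks; the flow files define the authoritative detection rules and cover far more cases:
```python
from pathlib import Path

def detect_step_from_folders(root: Path) -> int:
    """Rough greenfield-style scan, a stand-in for the flow files' detection rules."""
    docs = root / "_docs"
    if not (docs / "00_problem").exists():
        return 1                                      # Problem
    if not list(docs.glob("01_solution/solution_draft*.md")):
        return 2                                      # Research
    if not (docs / "02_document" / "architecture.md").exists():
        return 3                                      # Plan
    if not list(docs.glob("02_tasks/todo/*.md")):
        return 5                                      # Decompose
    return 6                                          # Implement or later

def reconcile(state: dict, root: Path) -> dict:
    """Rule 4: when the pointer and the folders disagree, trust the folders."""
    detected = str(detect_step_from_folders(root))
    if state.get("step") in (detected, "done"):       # "done" cannot be inferred from folders alone
        return state
    state["step"] = detected
    state["status"] = "not_started"                   # drop stale status along with the stale step
    return state
```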
## State Detection
Read `_docs/_autopilot_state.md` first. If it exists and is consistent with the folder structure, use the `Current Step` from the state file. If the state file doesn't exist or is inconsistent, fall back to folder scanning.
### Folder Scan Rules (fallback)
Scan `_docs/` to determine the current workflow position. The detection rules are defined in each flow file (`flows/greenfield.md` and `flows/existing-code.md`). Check the existing-code flow first (Step 1 detection), then greenfield flow rules. First match wins.
## Re-Entry Protocol
When the user invokes `/autopilot` and work already exists:
1. Read `_docs/_autopilot_state.md`
2. Cross-check against `_docs/` folder structure
3. Present Status Summary (use the active flow's Status Summary Template)
4. If the detected step has a sub-skill with built-in resumability, the sub-skill handles mid-step recovery
5. Continue execution from detected state
## Session Boundaries
After any decompose/planning step completes, **do not auto-chain to implement**. Instead:
1. Update state file: mark the step as completed, set current step to the next implement step with status `not_started`
- Existing-code flow: After Step 4 (Decompose Tests) → set current step to 5 (Implement Tests)
- Existing-code flow: After Step 8 (New Task) → set current step to 9 (Implement)
- Greenfield flow: After Step 5 (Decompose) → set current step to 6 (Implement)
2. Present a summary: number of tasks, estimated batches, total complexity points
3. Use Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Decompose complete — start implementation?
══════════════════════════════════════
A) Start a new conversation for implementation (recommended for context freshness)
B) Continue implementation in this conversation
══════════════════════════════════════
Recommendation: A — implementation is the longest phase, fresh context helps
══════════════════════════════════════
```
These are the only hard session boundaries. All other transitions auto-chain.