Sync .cursor from detections

2026-04-23 04:36:34 +00:00 · 2026-04-12 05:05:08 +03:00
parent 884abf7006
commit 359bab3c92
50 changed files with 2091 additions and 1276 deletions
@@ -24,7 +24,7 @@ Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Det
 | `flows/greenfield.md` | Detection rules, step table, and auto-chain rules for new projects |
 | `flows/existing-code.md` | Detection rules, step table, and auto-chain rules for existing codebases |
 | `state.md` | State file format, rules, re-entry protocol, session boundaries |
-| `protocols.md` | User interaction, Jira MCP auth, choice format, error handling, status summary |
+| `protocols.md` | User interaction, tracker auth, choice format, error handling, status summary |

 **On every invocation**: read all four files above before executing any logic.

@@ -32,10 +32,10 @@ Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Det

 - **Auto-chain**: when a skill completes, immediately start the next one — no pause between skills
 - **Only pause at decision points**: BLOCKING gates inside sub-skills are the natural pause points; do not add artificial stops between steps
-- **State from disk**: all progress is persisted to `_docs/_autopilot_state.md` and cross-checked against `_docs/` folder structure
-- **Rich re-entry**: on every invocation, read the state file for full context before continuing
+- **State from disk**: current step is persisted to `_docs/_autopilot_state.md` and cross-checked against `_docs/` folder structure
+- **Re-entry**: on every invocation, read the state file and cross-check against `_docs/` folders before continuing
 - **Delegate, don't duplicate**: read and execute each sub-skill's SKILL.md; never inline their logic here
-- **Sound on pause**: follow `.cursor/rules/human-attention-sound.mdc` — play a notification sound before every pause that requires human input
+- **Sound on pause**: follow `.cursor/rules/human-attention-sound.mdc` — play a notification sound before every pause that requires human input (AskQuestion tool preferred for structured choices; fall back to plain text if unavailable)
 - **Minimize interruptions**: only ask the user when the decision genuinely cannot be resolved automatically
 - **Single project per workspace**: all `_docs/` paths are relative to workspace root; for monorepos, each service needs its own Cursor workspace

@@ -43,10 +43,10 @@ Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Det

 Determine which flow to use:

-1. If workspace has source code files **and** `_docs/` does not exist → **existing-code flow** (Pre-Step detection)
-2. If `_docs/_autopilot_state.md` exists and records Document in `Completed Steps` → **existing-code flow**
-3. If `_docs/_autopilot_state.md` exists and `step: done` AND workspace contains source code → **existing-code flow** (completed project re-entry — loops to New Task)
-4. Otherwise → **greenfield flow**
+1. If workspace has **no source code files** → **greenfield flow**
+2. If workspace has source code files **and** `_docs/` does not exist → **existing-code flow**
+3. If workspace has source code files **and** `_docs/` exists **and** `_docs/_autopilot_state.md` does not exist → **existing-code flow**
+4. If workspace has source code files **and** `_docs/_autopilot_state.md` exists → read the `flow` field from the state file and use that flow

 After selecting the flow, apply its detection rules (first match wins) to determine the current step.
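
The four resolution rules above can be read as a small decision function. The sketch below is illustrative only: `resolve_flow` and its parameters are hypothetical names, and it assumes the caller has already checked for source files and read the `flow` field from the state file.

```python
from typing import Optional


def resolve_flow(has_source: bool, state_flow: Optional[str]) -> str:
    """Pick a flow following the four Flow Resolution rules (hypothetical helper).

    state_flow is the `flow` value read from _docs/_autopilot_state.md,
    or None when the state file does not exist.
    """
    # Rule 1: no source code means a new project.
    if not has_source:
        return "greenfield"
    # Rule 4: a state file exists, so the flow it records is used.
    if state_flow is not None:
        return state_flow
    # Rules 2 and 3: source code without a state file,
    # whether or not the _docs/ folder exists.
    return "existing-code"
```

Rules 2 and 3 collapse into one branch here because both lead to the existing-code flow; they differ only in whether `_docs/` is present.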

@@ -65,7 +65,7 @@ Every invocation follows this sequence:
   a. Delegate to current skill (see Skill Delegation below)
   b. If skill returns FAILED → apply Skill Failure Retry Protocol (see protocols.md):
      - Auto-retry the same skill (failure may be caused by missing user input or environment issue)
-      - If 3 consecutive auto-retries fail → record in state file Blockers, warn user, stop auto-retry
+      - If 3 consecutive auto-retries fail → set status: failed, warn user, stop auto-retry
   c. When skill completes successfully → reset retry counter, update state file (rules in state.md)
   d. Re-detect next step from the active flow's detection rules
   e. If next skill is ready → auto-chain (go to 7a with next skill)
@@ -82,10 +82,26 @@ For each step, the delegation pattern is:
 3. Read the skill file: `.cursor/skills/[name]/SKILL.md`
 4. Execute the skill's workflow exactly as written, including all BLOCKING gates, self-verification checklists, save actions, and escalation rules. Update `sub_step` in state each time the sub-skill advances.
 5. If the skill **fails**: follow the Skill Failure Retry Protocol in `protocols.md` — increment `retry_count`, auto-retry up to 3 times, then escalate.
-6. When complete (success): reset `retry_count: 0`, mark step `completed`, record date + key outcome, add key decisions to state file, return to auto-chain rules (from active flow file)
+6. When complete (success): reset `retry_count: 0`, update state file to the next step with `status: not_started`, return to auto-chain rules (from active flow file)

 Do NOT modify, skip, or abbreviate any part of the sub-skill's workflow. The autopilot is a sequencer, not an optimizer.

+## State File Template
+
+The state file (`_docs/_autopilot_state.md`) is a minimal pointer — only the current step. Full format rules are in `state.md`.
+
+```markdown
+# Autopilot State
+
+## Current Step
+flow: [greenfield | existing-code]
+step: [number or "done"]
+name: [step name]
+status: [not_started / in_progress / completed / skipped / failed]
+sub_step: [0 or N — sub-skill phase name]
+retry_count: [0-3]
+```
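
As a rough illustration of how the template above might be read back, the following sketch parses the `key: value` lines under `## Current Step`. The function name `parse_state` and the sample values are assumptions for the example; the authoritative format rules remain in `state.md`.

```python
import re
from typing import Dict

# Field names taken from the State File Template.
STATE_KEYS = ("flow", "step", "name", "status", "sub_step", "retry_count")


def parse_state(text: str) -> Dict[str, str]:
    """Extract `key: value` pairs listed under '## Current Step' (hypothetical helper)."""
    fields: Dict[str, str] = {}
    in_section = False
    for line in text.splitlines():
        # Any level-2 heading opens or closes the section of interest.
        if line.startswith("## "):
            in_section = line.strip() == "## Current Step"
            continue
        match = re.match(r"^(\w+):\s*(.*)$", line)
        if in_section and match and match.group(1) in STATE_KEYS:
            fields[match.group(1)] = match.group(2).strip()
    return fields


# Sample state file content; the values are made up for illustration.
example = """# Autopilot State

## Current Step
flow: existing-code
step: 5
name: Implement Tests
status: in_progress
sub_step: 0
retry_count: 1
"""

state = parse_state(example)
```

All values come back as strings, so a caller would still need to convert `step` and `retry_count` and handle the literal `done`.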
+
 ## Trigger Conditions

 This skill activates when the user wants to:
@@ -1,6 +1,6 @@
 # Existing Code Workflow

-Workflow for projects with an existing codebase. Starts with documentation, produces test specs, decomposes and implements tests, verifies them, refactors with that safety net, then adds new functionality and deploys.
+Workflow for projects with an existing codebase. Starts with documentation, produces test specs, checks code testability (refactoring if needed), decomposes and implements tests, verifies them, refactors with that safety net, then adds new functionality and deploys.

 ## Step Reference Table

@@ -8,18 +8,20 @@ Workflow for projects with an existing codebase. Starts with documentation, prod
 |------|------|-----------|-------------------|
 | 1 | Document | document/SKILL.md | Steps 1–8 |
 | 2 | Test Spec | test-spec/SKILL.md | Phase 1a–1b |
-| 3 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
-| 4 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
-| 5 | Run Tests | test-run/SKILL.md | Steps 1–4 |
-| 6 | Refactor | refactor/SKILL.md | Phases 0–5 (6-phase method) |
-| 7 | New Task | new-task/SKILL.md | Steps 1–8 (loop) |
-| 8 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
-| 9 | Run Tests | test-run/SKILL.md | Steps 1–4 |
-| 10 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
-| 11 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
-| 12 | Deploy | deploy/SKILL.md | Step 1–7 |
+| 3 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 0–7 (conditional) |
+| 4 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
+| 5 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 6 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 7 | Refactor | refactor/SKILL.md | Phases 0–7 (optional) |
+| 8 | New Task | new-task/SKILL.md | Steps 1–8 (loop) |
+| 9 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 10 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 11 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
+| 12 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
+| 13 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
+| 14 | Deploy | deploy/SKILL.md | Step 1–7 |

-After Step 12, the existing-code workflow is complete.
+After Step 14, the existing-code workflow is complete.

 ## Detection Rules

@@ -35,7 +37,7 @@ Action: An existing codebase without documentation was detected. Read and execut
 ---

 **Step 2 — Test Spec**
-Condition: `_docs/02_document/FINAL_report.md` exists AND workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`) AND `_docs/02_document/tests/traceability-matrix.md` does not exist AND the autopilot state shows Document was run (check `Completed Steps` for "Document" entry)
+Condition: `_docs/02_document/FINAL_report.md` exists AND workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`) AND `_docs/02_document/tests/traceability-matrix.md` does not exist AND the autopilot state shows `step >= 2` (Document already ran)

 Action: Read and execute `.cursor/skills/test-spec/SKILL.md`

@@ -43,31 +45,62 @@ This step applies when the codebase was documented via the `/document` skill. Te

 ---

-**Step 3 — Decompose Tests**
-Condition: `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND the autopilot state shows Document was run AND (`_docs/02_tasks/` does not exist or has no task files)
+**Step 3 — Code Testability Revision**
+Condition: `_docs/02_document/tests/traceability-matrix.md` exists AND the autopilot state shows Test Spec (Step 2) is completed AND the autopilot state does NOT show Code Testability Revision (Step 3) as completed or skipped
+
+Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
+
+1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`
+2. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
+   - Hardcoded file paths or directory references
+   - Hardcoded configuration values (URLs, credentials, magic numbers)
+   - Global mutable state that cannot be overridden
+   - Tight coupling to external services without abstraction
+   - Missing dependency injection or non-configurable parameters
+   - Direct file system operations without path configurability
+   - Inline construction of heavy dependencies (models, clients)
+3. If ALL scenarios are testable as-is:
+   - Mark Step 3 as `completed` with outcome "Code is testable — no changes needed"
+   - Auto-chain to Step 4 (Decompose Tests)
+4. If testability issues are found:
+   - Create `_docs/04_refactoring/01-testability-refactoring/`
+   - Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
+     - **Mode**: `guided`
+     - **Source**: `autopilot-testability-analysis`
+     - One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies)
+   - Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
+   - The refactor skill will create RUN_DIR (`01-testability-refactoring`), create tasks in `_docs/02_tasks/todo/`, delegate to implement skill, and verify results
+   - Phase 3 (Safety Net) is automatically skipped by the refactor skill for testability runs
+   - After refactoring completes, mark Step 3 as `completed`
+   - Auto-chain to Step 4 (Decompose Tests)
+
+---
+
+**Step 4 — Decompose Tests**
+Condition: `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND the autopilot state shows Step 3 (Code Testability Revision) is completed or skipped AND (`_docs/02_tasks/todo/` does not exist or has no test task files)

 Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
 1. Run Step 1t (test infrastructure bootstrap)
 2. Run Step 3 (blackbox test task decomposition)
 3. Run Step 4 (cross-verification against test coverage)

-If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it.
+If `_docs/02_tasks/` subfolders have some task files already (e.g., refactoring tasks from Step 3), the decompose skill's resumability handles it — it appends test tasks alongside existing tasks.

 ---

-**Step 4 — Implement Tests**
-Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND the autopilot state shows Step 3 (Decompose Tests) is completed AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist
+**Step 5 — Implement Tests**
+Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND the autopilot state shows Step 4 (Decompose Tests) is completed AND `_docs/03_implementation/implementation_report_tests.md` does not exist

 Action: Read and execute `.cursor/skills/implement/SKILL.md`

-The implement skill reads test tasks from `_docs/02_tasks/` and implements them.
+The implement skill reads test tasks from `_docs/02_tasks/todo/` and implements them.

 If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.

 ---

-**Step 5 — Run Tests**
-Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state shows Step 4 (Implement Tests) is completed AND the autopilot state does NOT show Step 5 (Run Tests) as completed
+**Step 6 — Run Tests**
+Condition: `_docs/03_implementation/implementation_report_tests.md` exists AND the autopilot state shows Step 5 (Implement Tests) is completed AND the autopilot state does NOT show Step 6 (Run Tests) as completed

 Action: Read and execute `.cursor/skills/test-run/SKILL.md`

@@ -75,46 +108,74 @@ Verifies the implemented test suite passes before proceeding to refactoring. The

 ---

-**Step 6 — Refactor**
-Condition: the autopilot state shows Step 5 (Run Tests) is completed AND `_docs/04_refactoring/FINAL_report.md` does not exist
+**Step 7 — Refactor (optional)**
+Condition: the autopilot state shows Step 6 (Run Tests) is completed AND the autopilot state does NOT show Step 7 (Refactor) as completed or skipped AND no `_docs/04_refactoring/` run folder contains a `FINAL_report.md` for a non-testability run

-Action: Read and execute `.cursor/skills/refactor/SKILL.md`
+Action: Present using Choose format:

-The refactor skill runs the full 6-phase method using the implemented tests as a safety net.
+```
+══════════════════════════════════════
+ DECISION REQUIRED: Refactor codebase before adding new features?
+══════════════════════════════════════
+ A) Run refactoring (recommended if code quality issues were noted during documentation)
+ B) Skip — proceed directly to New Task
+══════════════════════════════════════
+ Recommendation: [A or B — base on whether documentation
+ flagged significant code smells, coupling issues, or
+ technical debt worth addressing before new development]
+══════════════════════════════════════
+```

-If `_docs/04_refactoring/` has phase reports, the refactor skill detects completed phases and continues.
+- If user picks A → Read and execute `.cursor/skills/refactor/SKILL.md` in automatic mode. The refactor skill creates a new run folder in `_docs/04_refactoring/` (e.g., `02-coupling-refactoring`) and runs the full method using the implemented tests as a safety net. After completion, auto-chain to Step 8 (New Task).
+- If user picks B → Mark Step 7 as `skipped` in the state file, auto-chain to Step 8 (New Task).

 ---

-**Step 7 — New Task**
-Condition: the autopilot state shows Step 6 (Refactor) is completed AND the autopilot state does NOT show Step 7 (New Task) as completed
+**Step 8 — New Task**
+Condition: the autopilot state shows Step 7 (Refactor) is completed or skipped AND the autopilot state does NOT show Step 8 (New Task) as completed

 Action: Read and execute `.cursor/skills/new-task/SKILL.md`

-The new-task skill interactively guides the user through defining new functionality. It loops until the user is done adding tasks. New task files are written to `_docs/02_tasks/`.
+The new-task skill interactively guides the user through defining new functionality. It loops until the user is done adding tasks. New task files are written to `_docs/02_tasks/todo/`.

 ---

-**Step 8 — Implement**
-Condition: the autopilot state shows Step 7 (New Task) is completed AND `_docs/03_implementation/` does not contain a FINAL report covering the new tasks (check state for distinction between test implementation and feature implementation)
+**Step 9 — Implement**
+Condition: the autopilot state shows Step 8 (New Task) is completed AND `_docs/03_implementation/` does not contain an `implementation_report_*.md` file other than `implementation_report_tests.md` (the tests report from Step 5 is excluded from this check)

 Action: Read and execute `.cursor/skills/implement/SKILL.md`

-The implement skill reads the new tasks from `_docs/02_tasks/` and implements them. Tasks already implemented in Step 4 are skipped (the implement skill tracks completed tasks in batch reports).
+The implement skill reads the new tasks from `_docs/02_tasks/todo/` and implements them. Tasks already implemented in Step 5 are skipped (completed tasks have been moved to `done/`).

 If `_docs/03_implementation/` has batch reports from this phase, the implement skill detects completed tasks and continues.
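
The file check in the Step 9 condition could look roughly like this. It is a minimal sketch: `has_feature_report` is a hypothetical helper, and it only covers the report-file part of the condition, not the state check for Step 8.

```python
from pathlib import Path


def has_feature_report(impl_dir: Path) -> bool:
    """Return True if a feature implementation report exists (hypothetical helper).

    The tests report written in Step 5 is excluded, as the condition requires.
    """
    if not impl_dir.is_dir():
        return False
    return any(
        report.name != "implementation_report_tests.md"
        for report in impl_dir.glob("implementation_report_*.md")
    )
```

Step 9 would apply when this returns `False` for `_docs/03_implementation/` and the state shows Step 8 as completed.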

 ---

-**Step 9 — Run Tests**
-Condition: the autopilot state shows Step 8 (Implement) is completed AND the autopilot state does NOT show Step 9 (Run Tests) as completed
+**Step 10 — Run Tests**
+Condition: the autopilot state shows Step 9 (Implement) is completed AND the autopilot state does NOT show Step 10 (Run Tests) as completed

 Action: Read and execute `.cursor/skills/test-run/SKILL.md`

 ---

-**Step 10 — Security Audit (optional)**
-Condition: the autopilot state shows Step 9 (Run Tests) is completed AND the autopilot state does NOT show Step 10 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+**Step 11 — Update Docs**
+Condition: the autopilot state shows Step 10 (Run Tests) is completed AND the autopilot state does NOT show Step 11 (Update Docs) as completed AND `_docs/02_document/` contains existing documentation (module or component docs)
+
+Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all task spec files from `_docs/02_tasks/done/` that were implemented in the current cycle (i.e., tasks moved to `done/` during Steps 8–9 of this cycle).
+
+The document skill in Task mode:
+1. Reads each task spec to identify changed source files
+2. Updates affected module docs, component docs, and system-level docs
+3. Does NOT redo full discovery, verification, or problem extraction
+
+If `_docs/02_document/` does not contain existing docs (e.g., documentation step was skipped), mark Step 11 as `skipped`.
+
+After completion, auto-chain to Step 12 (Security Audit).
+
+---
+
+**Step 12 — Security Audit (optional)**
+Condition: the autopilot state shows Step 11 (Update Docs) is completed or skipped AND the autopilot state does NOT show Step 12 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)

 Action: Present using Choose format:

@@ -129,13 +190,13 @@ Action: Present using Choose format:
 ══════════════════════════════════════
 ```

-- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 11 (Performance Test).
-- If user picks B → Mark Step 10 as `skipped` in the state file, auto-chain to Step 11 (Performance Test).
+- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 13 (Performance Test).
+- If user picks B → Mark Step 12 as `skipped` in the state file, auto-chain to Step 13 (Performance Test).

 ---

-**Step 11 — Performance Test (optional)**
-Condition: the autopilot state shows Step 10 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 11 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+**Step 13 — Performance Test (optional)**
+Condition: the autopilot state shows Step 12 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 13 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)

 Action: Present using Choose format:

@@ -156,13 +217,13 @@ Action: Present using Choose format:
  2. Otherwise, check if `_docs/02_document/tests/performance-tests.md` exists for test scenarios, detect appropriate load testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute performance test scenarios against the running system
  3. Present results vs acceptance criteria thresholds
  4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort
-  5. After completion, auto-chain to Step 12 (Deploy)
-- If user picks B → Mark Step 11 as `skipped` in the state file, auto-chain to Step 12 (Deploy).
+  5. After completion, auto-chain to Step 14 (Deploy)
+- If user picks B → Mark Step 13 as `skipped` in the state file, auto-chain to Step 14 (Deploy).

 ---

-**Step 12 — Deploy**
-Condition: the autopilot state shows Step 9 (Run Tests) is completed AND (Step 10 is completed or skipped) AND (Step 11 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
+**Step 14 — Deploy**
+Condition: the autopilot state shows Step 10 (Run Tests) is completed AND (Step 11 is completed or skipped) AND (Step 12 is completed or skipped) AND (Step 13 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)

 Action: Read and execute `.cursor/skills/deploy/SKILL.md`

@@ -171,41 +232,41 @@ After deployment completes, the existing-code workflow is done.
 ---

 **Re-Entry After Completion**
-Condition: the autopilot state shows `step: done` OR all steps through 12 (Deploy) are completed
+Condition: the autopilot state shows `step: done` OR all steps through 14 (Deploy) are completed

-Action: The project completed a full cycle. Present status and loop back to New Task:
+Action: The project completed a full cycle. Print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:

 ```
 ══════════════════════════════════════
 PROJECT CYCLE COMPLETE
 ══════════════════════════════════════
 The previous cycle finished successfully.
- You can now add new functionality.
-══════════════════════════════════════
- A) Add new features (start New Task)
- B) Done — no more changes needed
+ Starting new feature cycle…
 ══════════════════════════════════════
 ```

-- If user picks A → set `step: 7`, `status: not_started` in the state file, then auto-chain to Step 7 (New Task). Previous cycle history stays in Completed Steps.
-- If user picks B → report final project status and exit.
+Set `step: 8`, `status: not_started` in the state file, then auto-chain to Step 8 (New Task).
+
+Note: the loop (Steps 8 → 14 → 8) ensures every feature cycle includes: New Task → Implement → Run Tests → Update Docs → Security → Performance → Deploy.

 ## Auto-Chain Rules

 | Completed Step | Next Action |
 |---------------|-------------|
 | Document (1) | Auto-chain → Test Spec (2) |
-| Test Spec (2) | Auto-chain → Decompose Tests (3) |
-| Decompose Tests (3) | **Session boundary** — suggest new conversation before Implement Tests |
-| Implement Tests (4) | Auto-chain → Run Tests (5) |
-| Run Tests (5, all pass) | Auto-chain → Refactor (6) |
-| Refactor (6) | Auto-chain → New Task (7) |
-| New Task (7) | **Session boundary** — suggest new conversation before Implement |
-| Implement (8) | Auto-chain → Run Tests (9) |
-| Run Tests (9, all pass) | Auto-chain → Security Audit choice (10) |
-| Security Audit (10, done or skipped) | Auto-chain → Performance Test choice (11) |
-| Performance Test (11, done or skipped) | Auto-chain → Deploy (12) |
-| Deploy (12) | **Workflow complete** — existing-code flow done |
+| Test Spec (2) | Auto-chain → Code Testability Revision (3) |
+| Code Testability Revision (3) | Auto-chain → Decompose Tests (4) |
+| Decompose Tests (4) | **Session boundary** — suggest new conversation before Implement Tests |
+| Implement Tests (5) | Auto-chain → Run Tests (6) |
+| Run Tests (6, all pass) | Auto-chain → Refactor choice (7) |
+| Refactor (7, done or skipped) | Auto-chain → New Task (8) |
+| New Task (8) | **Session boundary** — suggest new conversation before Implement |
+| Implement (9) | Auto-chain → Run Tests (10) |
+| Run Tests (10, all pass) | Auto-chain → Update Docs (11) |
+| Update Docs (11) | Auto-chain → Security Audit choice (12) |
+| Security Audit (12, done or skipped) | Auto-chain → Performance Test choice (13) |
+| Performance Test (13, done or skipped) | Auto-chain → Deploy (14) |
+| Deploy (14) | **Workflow complete** — existing-code flow done |
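
One way to picture the updated table is as a lookup from a completed step to the next step, with a flag for session boundaries. This is a simplified sketch with hypothetical names (`AUTO_CHAIN`, `next_action`); it does not model the "all pass" test conditions or the optional-step choices.

```python
from typing import Dict, Optional, Tuple

# Completed step -> (next step, session boundary before it?).
# Mirrors the 14-step existing-code table; None means the workflow is complete.
AUTO_CHAIN: Dict[int, Tuple[Optional[int], bool]] = {
    1: (2, False),
    2: (3, False),
    3: (4, False),
    4: (5, True),    # suggest a new conversation before Implement Tests
    5: (6, False),
    6: (7, False),
    7: (8, False),
    8: (9, True),    # suggest a new conversation before Implement
    9: (10, False),
    10: (11, False),
    11: (12, False),
    12: (13, False),
    13: (14, False),
    14: (None, False),
}


def next_action(completed_step: int) -> Tuple[Optional[int], bool]:
    """Return the next step number and whether to pause at a session boundary."""
    return AUTO_CHAIN[completed_step]
```

For example, `next_action(4)` yields `(5, True)`: the next step is Implement Tests, reached only after the suggested session boundary.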

 ## Status Summary Template

@@ -213,18 +274,20 @@ Action: The project completed a full cycle. Present status and loop back to New
 ═══════════════════════════════════════════════════
 AUTOPILOT STATUS (existing-code)
 ═══════════════════════════════════════════════════
- Step 1   Document            [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
- Step 2   Test Spec           [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
- Step 3   Decompose Tests     [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
- Step 4   Implement Tests     [DONE / IN PROGRESS (batch M) / NOT STARTED / FAILED (retry N/3)]
- Step 5   Run Tests           [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
- Step 6   Refactor            [DONE / IN PROGRESS (phase N) / NOT STARTED / FAILED (retry N/3)]
- Step 7   New Task            [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
- Step 8   Implement           [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
- Step 9   Run Tests           [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
- Step 10  Security Audit      [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
- Step 11  Performance Test    [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
- Step 12  Deploy              [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 1   Document                 [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 2   Test Spec                [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 3   Code Testability Rev.    [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 4   Decompose Tests          [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 5   Implement Tests          [DONE / IN PROGRESS (batch M) / NOT STARTED / FAILED (retry N/3)]
+ Step 6   Run Tests                [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 7   Refactor                 [DONE / SKIPPED / IN PROGRESS (phase N) / NOT STARTED / FAILED (retry N/3)]
+ Step 8   New Task                 [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 9   Implement                [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
+ Step 10  Run Tests                [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 11  Update Docs              [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 12  Security Audit           [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 13  Performance Test         [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 14  Deploy                   [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
 ═══════════════════════════════════════════════════
 Current: Step N — Name
 SubStep: M — [sub-skill internal step name]
@@ -110,25 +110,25 @@ If the project IS a UI project → present using Choose format:
 ---

 **Step 5 — Decompose**
-Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/` does not exist or has no task files (excluding `_dependencies_table.md`)
+Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/todo/` does not exist or has no task files

 Action: Read and execute `.cursor/skills/decompose/SKILL.md`

-If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it.
+If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it.

 ---

 **Step 6 — Implement**
-Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist
+Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any `implementation_report_*.md` file

 Action: Read and execute `.cursor/skills/implement/SKILL.md`

-If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
+If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.

 ---

 **Step 7 — Run Tests**
-Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state does NOT show Step 7 (Run Tests) as completed AND (`_docs/04_deploy/` does not exist or is incomplete)
+Condition: `_docs/03_implementation/` contains an `implementation_report_*.md` file AND the autopilot state does NOT show Step 7 (Run Tests) as completed AND (`_docs/04_deploy/` does not exist or is incomplete)

 Action: Read and execute `.cursor/skills/test-run/SKILL.md`

@@ -190,7 +190,7 @@ Action: Read and execute `.cursor/skills/deploy/SKILL.md`
 ---

 **Done**
-Condition: `_docs/04_deploy/` contains all expected artifacts (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md)
+Condition: `_docs/04_deploy/` contains all expected artifacts (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md)

 Action: Report project completion with summary. If the user runs autopilot again after greenfield completion, Flow Resolution rule 3 routes to the existing-code flow (re-entry after completion) so they can add new features.
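
A possible way to evaluate the Done condition, including the newly added `deploy_scripts.md`, is sketched below. `missing_deploy_artifacts` is a hypothetical helper; it checks only that the files exist, not that their contents are complete.

```python
from pathlib import Path
from typing import List

# Artifact names as listed in the Done condition.
EXPECTED_DEPLOY_ARTIFACTS = (
    "containerization.md",
    "ci_cd_pipeline.md",
    "environment_strategy.md",
    "observability.md",
    "deployment_procedures.md",
    "deploy_scripts.md",
)


def missing_deploy_artifacts(deploy_dir: Path) -> List[str]:
    """List expected artifacts that are absent from the deploy folder."""
    return [
        name
        for name in EXPECTED_DEPLOY_ARTIFACTS
        if not (deploy_dir / name).is_file()
    ]
```

An empty result for `_docs/04_deploy/` would correspond to the Done condition being met.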

@@ -46,18 +46,16 @@ Rules:
 2. Always include a recommendation with a brief justification
 3. Keep option descriptions to one line each
 4. If only 2 options make sense, use A/B only — do not pad with filler options
-5. Play the notification sound (per `human-attention-sound.mdc`) before presenting the choice
-6. Record every user decision in the state file's `Key Decisions` section
-7. After the user picks, proceed immediately — no follow-up confirmation unless the choice was destructive
+5. Play the notification sound (per `.cursor/rules/human-attention-sound.mdc`) before presenting the choice
+6. After the user picks, proceed immediately — no follow-up confirmation unless the choice was destructive

 ## Work Item Tracker Authentication

-Several workflow steps create work items (epics, tasks, links). The system supports **Jira MCP** and **Azure DevOps MCP** as interchangeable backends. Detect which is configured by listing available MCP servers.
+Several workflow steps create work items (epics, tasks, links). The system requires a task tracker MCP as an interchangeable backend.

 ### Tracker Detection

-1. Check for available MCP servers: Jira MCP (`user-Jira-MCP-Server`) or Azure DevOps MCP (`user-AzureDevops`)
-2. If both are available, ask the user which to use (Choose format)
+1. If no task tracker MCP is available, or the available one is not authorized, ask the user how to proceed
 3. Record the choice in the state file: `tracker: jira` or `tracker: ado`
 4. If neither is available, set `tracker: local` and proceed without external tracking

@@ -124,16 +122,12 @@ Skill execution → FAILED
  │
  ├─ retry_count < 3 ?
  │    YES → increment retry_count in state file
-  │         → log failure reason in state file (Retry Log section)
  │         → re-read the sub-skill's SKILL.md
  │         → re-execute from the current sub_step
  │         → (loop back to check result)
  │
  │    NO (retry_count = 3) →
  │         → set status: failed in Current Step
-  │         → add entry to Blockers section:
-  │             "[Skill Name] failed 3 consecutive times at sub_step [M].
-  │              Last failure: [reason]. Auto-retry exhausted."
  │         → present warning to user (see Escalation below)
  │         → do NOT auto-retry again until user intervenes
 ```
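
The retry flow above can be sketched in code. This is a minimal illustration, not the autopilot's actual implementation: `CurrentStep`, `run_with_retry`, and `execute_skill` are hypothetical names, and persistence to `_docs/_autopilot_state.md` is reduced to in-memory field updates.

```python
from dataclasses import dataclass
from typing import Callable

MAX_RETRIES = 3  # mirrors the "retry_count < 3" check in the flow above


@dataclass
class CurrentStep:
    """Hypothetical in-memory mirror of the state file's Current Step section."""
    status: str = "in_progress"
    retry_count: int = 0


def run_with_retry(step: CurrentStep, execute_skill: Callable[[], bool]) -> bool:
    """Run a skill, auto-retrying until it succeeds or retries are exhausted.

    `execute_skill` stands in for "re-read the sub-skill's SKILL.md and
    re-execute from the current sub_step"; it returns True on success.
    """
    while True:
        if execute_skill():
            step.retry_count = 0  # reset on success
            step.status = "completed"
            return True
        if step.retry_count < MAX_RETRIES:
            step.retry_count += 1  # a real engine would persist this to disk
            continue
        # retry_count == 3: stop retrying and escalate to the user
        step.status = "failed"
        return False
```

Under these assumptions a skill that always fails is executed four times in total: the initial run plus three auto-retries, after which `status` is `failed` and `retry_count` stays at 3.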
@@ -143,18 +137,14 @@ Skill execution → FAILED
 1. **Auto-retry immediately**: when a skill fails, retry it without asking the user — the failure is often transient (missing user confirmation in a prior step, docker not running, file lock, etc.)
 2. **Preserve sub_step**: retry from the last recorded `sub_step`, not from the beginning of the skill — unless the failure indicates corruption, in which case restart from sub_step 1
 3. **Increment `retry_count`**: update `retry_count` in the state file's `Current Step` section on each retry attempt
-4. **Log each failure**: append the failure reason and timestamp to the state file's `Retry Log` section
-5. **Reset on success**: when the skill eventually succeeds, reset `retry_count: 0` and clear the `Retry Log` for that step
+4. **Reset on success**: when the skill eventually succeeds, reset `retry_count: 0`

 ### Escalation (after 3 consecutive failures)

 After 3 failed auto-retries of the same skill, the failure is likely not user-related. Stop retrying and escalate:

-1. Update the state file:
-   - Set `status: failed` in `Current Step`
-   - Set `retry_count: 3`
-   - Add a blocker entry describing the repeated failure
-2. Play notification sound (per `human-attention-sound.mdc`)
+1. Update the state file: set `status: failed` and `retry_count: 3` in `Current Step`
+2. Play notification sound (per `.cursor/rules/human-attention-sound.mdc`)
 3. Present using Choose format:

 ```
@@ -215,9 +205,8 @@ When executing a sub-skill, monitor for these signals:

 If the same autopilot step fails 3 consecutive times across conversations:

- Record the failure pattern in the state file's `Blockers` section
 - Do NOT auto-retry on next invocation
- Present the blocker and ask user for guidance before attempting again
+- Present the failure pattern and ask the user for guidance before attempting again

 ## Context Management Protocol

@@ -304,11 +293,73 @@ For steps that produce `_docs/` artifacts (problem, research, plan, decompose, d
 3. **Git safety net**: artifacts are committed with each autopilot step completion. To roll back: `git log --oneline _docs/` to find the commit, then `git checkout <commit> -- _docs/<folder>/`
 4. **State file rollback**: when rolling back artifacts, also update `_docs/_autopilot_state.md` to reflect the rolled-back step (set it to `in_progress`, clear completed date)

+## Debug / Error Recovery Protocol
+
+When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or an implementer subagent reports a blocker, the user is asked to intervene. This protocol guides the recovery process.
+
+### Structured Debugging Workflow
+
+When escalated to the user after implementation failure:
+
+1. **Classify the failure** — determine the category:
+   - **Missing dependency**: a package, service, or module that the task needs but that is not available
+   - **Logic error**: code runs but produces wrong results (assertion failures, incorrect output)
+   - **Integration mismatch**: interfaces between components don't align (type errors, missing methods, wrong signatures)
+   - **Environment issue**: Docker, database, network, or configuration problem
+   - **Spec ambiguity**: the task spec is unclear or contradictory
+
+2. **Reproduce** — isolate the failing behavior:
+   - Run the specific failing test(s) in isolation
+   - Check whether the failure is deterministic or intermittent
+   - Capture the exact error message, stack trace, and relevant file:line
+
+3. **Narrow scope** — focus on the minimal reproduction:
+   - For logic errors: trace the data flow from input to the point of failure
+   - For integration mismatches: compare the caller's expectations against the callee's actual interface
+   - For environment issues: verify Docker services are running, DB is accessible, env vars are set
+
+4. **Fix and verify** — apply the fix and confirm:
+   - Make the minimal change that fixes the root cause
+   - Re-run the failing test(s) to confirm the fix
+   - Run the full test suite to check for regressions
+   - If the fix changes a shared interface, check all consumers
+
+5. **Report** — update the batch report with:
+   - Root cause category
+   - Fix applied (file:line, description)
+   - Tests that now pass
+
+### Common Recovery Patterns
+
+| Failure Pattern | Typical Root Cause | Recovery Action |
+|----------------|-------------------|----------------|
+| ImportError / ModuleNotFoundError | Missing dependency or wrong path | Install dependency or fix import path |
+| TypeError on method call | Interface mismatch between tasks | Align caller with callee's actual signature |
+| AssertionError in test | Logic bug or wrong expected value | Fix logic or update test expectations |
+| ConnectionRefused | Service not running | Start Docker services, check docker-compose |
+| Timeout | Blocking I/O or infinite loop | Add timeout, fix blocking call |
+| FileNotFoundError | Hardcoded path or missing fixture | Make path configurable, add fixture |
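
As an illustration only, the table above could be encoded as a lookup from exception type to recovery action. This sketch assumes a Python codebase and maps the table's `ConnectionRefused` and `Timeout` rows onto the built-in `ConnectionRefusedError` and `TimeoutError`; `RECOVERY_PATTERNS` and `suggest_recovery` are hypothetical names.

```python
from typing import Optional

# Ordered (exception type, recovery action) pairs taken from the table above.
# ModuleNotFoundError is a subclass of ImportError, so one entry covers both.
RECOVERY_PATTERNS = [
    (ImportError, "Install dependency or fix import path"),
    (TypeError, "Align caller with callee's actual signature"),
    (AssertionError, "Fix logic or update test expectations"),
    (ConnectionRefusedError, "Start Docker services, check docker-compose"),
    (TimeoutError, "Add timeout, fix blocking call"),
    (FileNotFoundError, "Make path configurable, add fixture"),
]


def suggest_recovery(exc: BaseException) -> Optional[str]:
    """Return the table's recovery action for a known failure pattern, else None."""
    for exc_type, action in RECOVERY_PATTERNS:
        if isinstance(exc, exc_type):
            return action
    return None
```

A failure that matches no row returns `None`, which leaves classification to the Structured Debugging Workflow rather than guessing a category.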
+
+### Escalation
+
+If debugging does not resolve the issue after 2 focused attempts:
+
+```
+══════════════════════════════════════
+ DEBUG ESCALATION: [failure description]
+══════════════════════════════════════
+ Root cause category: [category]
+ Attempted fixes: [list]
+ Current state: [what works, what doesn't]
+══════════════════════════════════════
+ A) Continue debugging with more context
+ B) Revert this batch and skip the task (move to backlog)
+ C) Simplify the task scope and retry
+══════════════════════════════════════
+```
+
 ## Status Summary

 On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback). Use the Status Summary Template from the active flow file (`flows/greenfield.md` or `flows/existing-code.md`).

-For re-entry (state file exists), also include:
- Key decisions from the state file's `Key Decisions` section
- Last session context from the `Last Session` section
- Any blockers from the `Blockers` section
+For re-entry (state file exists), cross-check the current step against `_docs/` folder structure and present any `status: failed` state to the user before continuing.
@@ -2,81 +2,52 @@

 ## State File: `_docs/_autopilot_state.md`

-The autopilot persists its state to `_docs/_autopilot_state.md`. This file is the primary source of truth for re-entry. Folder scanning is the fallback when the state file doesn't exist.
+The autopilot persists its position to `_docs/_autopilot_state.md`. This is a lightweight pointer — only the current step. All history lives in `_docs/` artifacts and git log. Folder scanning is the fallback when the state file doesn't exist.

-### Format
+### Template

 ```markdown
 # Autopilot State

 ## Current Step
 flow: [greenfield | existing-code]
-step: [1-10 for greenfield, 1-12 for existing-code, or "done"]
+step: [1-10 for greenfield, 1-13 for existing-code, or "done"]
 name: [step name from the active flow's Step Reference Table]
 status: [not_started / in_progress / completed / skipped / failed]
-sub_step: [optional — sub-skill internal step number + name if interrupted mid-step]
-retry_count: [0-3 — number of consecutive auto-retry attempts for current step, reset to 0 on success]
+sub_step: [0, or sub-skill internal step number + name if interrupted mid-step]
+retry_count: [0-3 — consecutive auto-retry attempts, reset to 0 on success]
+```

-When updating `Current Step`, always write it as:
-  flow: existing-code   ← active flow
-  step: N               ← autopilot step (sequential integer)
-  sub_step: M           ← sub-skill's own internal step/phase number + name
-  retry_count: 0        ← reset on new step or success; increment on each failed retry
-Example:
-  flow: greenfield
-  step: 3
-  name: Plan
-  status: in_progress
-  sub_step: 4 — Architecture Review & Risk Assessment
-  retry_count: 0
-Example (failed after 3 retries):
-  flow: existing-code
-  step: 2
-  name: Test Spec
-  status: failed
-  sub_step: 1b — Test Case Generation
-  retry_count: 3
+### Examples

-## Completed Steps
+```
+flow: greenfield
+step: 3
+name: Plan
+status: in_progress
+sub_step: 4 — Architecture Review & Risk Assessment
+retry_count: 0
+```

-| Step | Name | Completed | Key Outcome |
-|------|------|-----------|-------------|
-| 1 | [name] | [date] | [one-line summary] |
-| 2 | [name] | [date] | [one-line summary] |
-| ... | ... | ... | ... |
-
-## Key Decisions
- [decision 1: e.g. "Tech stack: Python + Rust for perf-critical, Postgres DB"]
- [decision N]
-
-## Last Session
-date: [date]
-ended_at: Step [N] [Name] — SubStep [M] [sub-step name]
-reason: [completed step / session boundary / user paused / context limit]
-notes: [any context for next session]
-
-## Retry Log
-| Attempt | Step | Name | SubStep | Failure Reason | Timestamp |
-|---------|------|------|---------|----------------|-----------|
-| 1 | [step] | [name] | [sub_step] | [reason] | [date-time] |
-| ... | ... | ... | ... | ... | ... |
-
-(Clear this table when the step succeeds or user resets. Append a row on each failed auto-retry.)
-
-## Blockers
- [blocker 1, if any]
- [none]
+```
+flow: existing-code
+step: 2
+name: Test Spec
+status: failed
+sub_step: 1b — Test Case Generation
+retry_count: 3
 ```

 ### State File Rules

-1. **Create** the state file on the very first autopilot invocation (after state detection determines Step 1)
-2. **Update** the state file after every step completion, every session boundary, every BLOCKING gate confirmation, and every failed retry attempt
-3. **Read** the state file as the first action on every invocation — before folder scanning
-4. **Cross-check**: after reading the state file, verify against actual `_docs/` folder contents. If they disagree (e.g., state file says Step 3 but `_docs/02_document/architecture.md` already exists), trust the folder structure and update the state file to match
-5. **Never delete** the state file. It accumulates history across the entire project lifecycle
-6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` when the step succeeds or the user manually resets. If `retry_count` reaches 3, set `status: failed` and add an entry to `Blockers`
-7. **Failed state on re-entry**: if the state file shows `status: failed` with `retry_count: 3`, do NOT auto-retry — present the blocker to the user and wait for their decision before proceeding
+1. **Create** on the first autopilot invocation (after state detection determines Step 1)
+2. **Update** after every change: batch completion, sub-step progress, step completion, session boundary, failed retry, or any other meaningful state transition. The state file must always reflect the current reality.
+3. **Read** as the first action on every invocation — before folder scanning
+4. **Cross-check**: verify against actual `_docs/` folder contents. If they disagree, trust the folder structure and update the state file
+5. **Never delete** the state file
+6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` on success. If `retry_count` reaches 3, set `status: failed`
+7. **Failed state on re-entry**: if `status: failed` with `retry_count: 3`, do NOT auto-retry — present the issue to the user first
+8. **Skill-internal state**: when the active skill maintains its own state file (e.g., document skill's `_docs/02_document/state.json`), the autopilot's `sub_step` field should reflect the skill's internal progress. On re-entry, cross-check the skill's state file against the autopilot's `sub_step` for consistency.
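
A minimal sketch of how a reader of the state file might apply rules 3 and 7, assuming the `key: value` layout shown in the template. `parse_current_step` and `should_auto_retry` are hypothetical helpers, not part of the autopilot, and the cross-check against `_docs/` (rule 4) is left out.

```python
def parse_current_step(text: str) -> dict[str, str]:
    """Extract the `key: value` fields from the state file's Current Step section.

    Headings and blank lines are skipped; only the first colon splits a line,
    so values may themselves contain colons.
    """
    fields: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def should_auto_retry(fields: dict[str, str]) -> bool:
    """Rule 7: a failed step with retries exhausted goes to the user first."""
    exhausted = int(fields.get("retry_count", "0")) >= 3
    return not (fields.get("status") == "failed" and exhausted)
```

For the failed example above (`status: failed`, `retry_count: 3`), `should_auto_retry` returns `False`, so the issue is presented to the user instead of being retried.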

 ## State Detection

@@ -92,8 +63,8 @@ When the user invokes `/autopilot` and work already exists:

 1. Read `_docs/_autopilot_state.md`
 2. Cross-check against `_docs/` folder structure
-3. Present Status Summary with context from state file (key decisions, last session, blockers)
-4. If the detected step has a sub-skill with built-in resumability (plan, decompose, implement, deploy all do), the sub-skill handles mid-step recovery
+3. Present Status Summary (use the active flow's Status Summary Template)
+4. If the detected step has a sub-skill with built-in resumability, the sub-skill handles mid-step recovery
 5. Continue execution from detected state

 ## Session Boundaries
@@ -101,12 +72,11 @@ When the user invokes `/autopilot` and work already exists:
 After any decompose/planning step completes, **do not auto-chain to implement**. Instead:

 1. Update state file: mark the step as completed, set current step to the next implement step with status `not_started`
-   - Existing-code flow: After Step 3 (Decompose Tests) → set current step to 4 (Implement Tests)
-   - Existing-code flow: After Step 7 (New Task) → set current step to 8 (Implement)
+   - Existing-code flow: After Step 4 (Decompose Tests) → set current step to 5 (Implement Tests)
+   - Existing-code flow: After Step 8 (New Task) → set current step to 9 (Implement)
   - Greenfield flow: After Step 5 (Decompose) → set current step to 6 (Implement)
-2. Write `Last Session` section: `reason: session boundary`, `notes: Decompose complete, implementation ready`
-3. Present a summary: number of tasks, estimated batches, total complexity points
-4. Use Choose format:
+2. Present a summary: number of tasks, estimated batches, total complexity points
+3. Use Choose format:

 ```
 ══════════════════════════════════════