Refactor autopilot workflows and documentation: Update .gitignore to include binary and media file types, enhance agent command references in documentation, and modify annotation class for improved accessibility. Adjust inference processing to handle batch sizes and streamline test specifications for clarity and consistency across the system.

2026-04-22 08:56:32 +00:00 · 2026-03-25 05:26:19 +02:00
parent a5fc4fe073
commit 4afa1a4eec
29 changed files with 447 additions and 362 deletions
@@ -1,25 +1,25 @@
 # Existing Code Workflow

-Workflow for projects with an existing codebase. Starts with documentation, produces test specs, decomposes and implements tests, refactors with that safety net, then adds new functionality and deploys.
+Workflow for projects with an existing codebase. Starts with documentation, produces test specs, decomposes and implements tests, verifies them, refactors with that safety net, then adds new functionality and deploys.

 ## Step Reference Table

-| Step | Name                    | Sub-Skill                       | Internal SubSteps                     |
-|------|-------------------------|---------------------------------|---------------------------------------|
-| —    | Document (pre-step)     | document/SKILL.md               | Steps 1–8                             |
-| 2b   | Blackbox Test Spec      | test-spec/SKILL.md              | Phase 1a–1b                           |
-| 2c   | Decompose Tests         | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4             |
-| 2d   | Implement Tests         | implement/SKILL.md              | (batch-driven, no fixed sub-steps)    |
-| 2e   | Refactor                | refactor/SKILL.md               | Phases 0–5 (6-phase method)           |
-| 2ea  | UI Design               | ui-design/SKILL.md              | Phase 0–8 (conditional — UI projects only) |
-| 2f   | New Task                | new-task/SKILL.md               | Steps 1–8 (loop)                      |
-| 2g   | Implement               | implement/SKILL.md              | (batch-driven, no fixed sub-steps)    |
-| 2h   | Run Tests               | (autopilot-managed)             | Unit tests → Blackbox tests |
-| 2hb  | Security Audit          | security/SKILL.md               | Phase 1–5 (optional)                  |
-| 2hc  | Performance Test        | (autopilot-managed)             | Load/stress tests (optional)          |
-| 2i   | Deploy                  | deploy/SKILL.md                 | Steps 1–7                             |
+| Step | Name | Sub-Skill | Internal SubSteps |
+|------|------|-----------|-------------------|
+| 1 | Document | document/SKILL.md | Steps 1–8 |
+| 2 | Test Spec | test-spec/SKILL.md | Phase 1a–1b |
+| 3 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
+| 4 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 5 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 6 | Refactor | refactor/SKILL.md | Phases 0–5 (6-phase method) |
+| 7 | New Task | new-task/SKILL.md | Steps 1–8 (loop) |
+| 8 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 9 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 10 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
+| 11 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
+| 12 | Deploy | deploy/SKILL.md | Steps 1–7 |

-After Step 2i, the existing-code workflow is complete.
+After Step 12, the existing-code workflow is complete.

 ## Detection Rules

@@ -27,30 +27,14 @@ Check rules in order — first match wins.

 ---

-**Pre-Step — Existing Codebase Detection**
+**Step 1 — Document**
 Condition: `_docs/` does not exist AND the workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`, `src/`, `Cargo.toml`, `*.csproj`, `package.json`)

-Action: An existing codebase without documentation was detected. Present using Choose format:
-
-```
-══════════════════════════════════════
- DECISION REQUIRED: Existing codebase detected
-══════════════════════════════════════
- A) Start fresh — define the problem from scratch (greenfield workflow)
- B) Document existing codebase first — run /document to reverse-engineer docs, then continue
-══════════════════════════════════════
- Recommendation: B — the /document skill analyzes your code
- bottom-up and produces _docs/ artifacts automatically,
- then you can continue with test specs, refactor, and new features.
-══════════════════════════════════════
-```
-
-- If user picks A → proceed to Step 0 (Problem Gathering) in the greenfield flow
-- If user picks B → read and execute `.cursor/skills/document/SKILL.md`. After document skill completes, re-detect state (the produced `_docs/` artifacts will place the project at Step 2b or later).
+Action: An existing codebase without documentation was detected. Read and execute `.cursor/skills/document/SKILL.md`. After the document skill completes, re-detect state (the produced `_docs/` artifacts will place the project at Step 2 or later).

 ---

-**Step 2b — Blackbox Test Spec**
+**Step 2 — Test Spec**
 Condition: `_docs/02_document/FINAL_report.md` exists AND workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`) AND `_docs/02_document/tests/traceability-matrix.md` does not exist AND the autopilot state shows Document was run (check `Completed Steps` for "Document" entry)

 Action: Read and execute `.cursor/skills/test-spec/SKILL.md`
@@ -59,7 +43,7 @@ This step applies when the codebase was documented via the `/document` skill. Te

 ---

-**Step 2c — Decompose Tests**
+**Step 3 — Decompose Tests**
 Condition: `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND the autopilot state shows Document was run AND (`_docs/02_tasks/` does not exist or has no task files)

 Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
@@ -71,8 +55,8 @@ If `_docs/02_tasks/` has some task files already, the decompose skill's resumabi

 ---

-**Step 2d — Implement Tests**
-Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND the autopilot state shows Step 2c (Decompose Tests) is completed AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist
+**Step 4 — Implement Tests**
+Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND the autopilot state shows Step 3 (Decompose Tests) is completed AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist

 Action: Read and execute `.cursor/skills/implement/SKILL.md`

@@ -82,8 +66,17 @@ If `_docs/03_implementation/` has batch reports, the implement skill detects com

 ---

-**Step 2e — Refactor**
-Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state shows Step 2d (Implement Tests) is completed AND `_docs/04_refactoring/FINAL_report.md` does not exist
+**Step 5 — Run Tests**
+Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state shows Step 4 (Implement Tests) is completed AND the autopilot state does NOT show Step 5 (Run Tests) as completed
+
+Action: Read and execute `.cursor/skills/test-run/SKILL.md`
+
+This step verifies that the implemented test suite passes before refactoring begins. The tests form the safety net for all subsequent code changes.
+
+---
+
+**Step 6 — Refactor**
+Condition: the autopilot state shows Step 5 (Run Tests) is completed AND `_docs/04_refactoring/FINAL_report.md` does not exist

 Action: Read and execute `.cursor/skills/refactor/SKILL.md`

@@ -93,37 +86,8 @@ If `_docs/04_refactoring/` has phase reports, the refactor skill detects complet

 ---
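
Read as code, the detection rules for Steps 1 through 6 form an ordered list of predicates in which the first match wins. The sketch below is one possible illustration, not the autopilot's actual implementation. Its assumptions: `detect_step`, `has_source`, and `task_files` are hypothetical names; completed steps are modelled as a set of step numbers rather than the real `Completed Steps` entries; source detection is reduced to four file suffixes; and `_dependencies_table.md` is assumed to live in `_docs/02_tasks/`, which the text does not state.

```python
from pathlib import Path
from typing import List, Optional, Set

# Simplified assumption: only these suffixes count as source code.
SOURCE_SUFFIXES = {".py", ".cs", ".rs", ".ts"}


def has_source(root: Path) -> bool:
    """Return True if any file under root has a known source suffix."""
    return any(p.is_file() and p.suffix in SOURCE_SUFFIXES for p in root.rglob("*"))


def task_files(root: Path) -> List[Path]:
    """List files in _docs/02_tasks/, excluding the dependencies table."""
    tasks = root / "_docs" / "02_tasks"
    if not tasks.is_dir():
        return []
    return [p for p in tasks.iterdir()
            if p.is_file() and p.name != "_dependencies_table.md"]


def detect_step(root: Path, completed: Set[int]) -> Optional[int]:
    """Return the first step (1-6) whose condition holds, or None."""
    docs = root / "_docs"
    src = has_source(root)
    final_doc = docs / "02_document" / "FINAL_report.md"
    matrix = docs / "02_document" / "tests" / "traceability-matrix.md"
    # Assumed location of the dependencies table (not stated in the text).
    deps = docs / "02_tasks" / "_dependencies_table.md"
    impl_report = docs / "03_implementation" / "FINAL_implementation_report.md"
    refactor_report = docs / "04_refactoring" / "FINAL_report.md"

    # Order matters: rules are checked top to bottom, first match wins.
    rules = [
        (1, not docs.exists() and src),
        (2, final_doc.exists() and src and not matrix.exists() and 1 in completed),
        (3, matrix.exists() and src and 1 in completed and not task_files(root)),
        (4, bool(task_files(root)) and deps.exists()
            and 3 in completed and not impl_report.exists()),
        (5, impl_report.exists() and 4 in completed and 5 not in completed),
        (6, 5 in completed and not refactor_report.exists()),
    ]
    for step, matched in rules:
        if matched:
            return step
    return None
```

Keeping the rules in a single ordered list makes the "first match wins" contract visible in one place; a real implementation would likely read the state file instead of taking `completed` as an argument.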

-**Step 2ea — UI Design (conditional)**
-Condition: the autopilot state shows Step 2e (Refactor) is completed AND the autopilot state does NOT show Step 2ea (UI Design) as completed or skipped
-
-**UI Project Detection** — the project is a UI project if ANY of the following are true:
-- `package.json` exists in the workspace root or any subdirectory
-- `*.html`, `*.jsx`, `*.tsx` files exist in the workspace
-- `_docs/02_document/components/` contains a component whose `description.md` mentions UI, frontend, page, screen, dashboard, form, or view
-- `_docs/02_document/architecture.md` mentions frontend, UI layer, SPA, or client-side rendering
-
-If the project is NOT a UI project → mark Step 2ea as `skipped` in the state file and auto-chain to Step 2f.
-
-If the project IS a UI project → present using Choose format:
-
-```
-══════════════════════════════════════
- DECISION REQUIRED: UI project detected — generate/update mockups?
-══════════════════════════════════════
- A) Generate UI mockups before new task planning (recommended)
- B) Skip — proceed directly to new task
-══════════════════════════════════════
- Recommendation: A — mockups inform better frontend task specs
-══════════════════════════════════════
-```
-
-- If user picks A → Read and execute `.cursor/skills/ui-design/SKILL.md`. After completion, auto-chain to Step 2f (New Task).
-- If user picks B → Mark Step 2ea as `skipped` in the state file, auto-chain to Step 2f (New Task).
-
---
-
-**Step 2f — New Task**
-Condition: (the autopilot state shows Step 2ea (UI Design) is completed or skipped) AND the autopilot state does NOT show Step 2f (New Task) as completed
+**Step 7 — New Task**
+Condition: the autopilot state shows Step 6 (Refactor) is completed AND the autopilot state does NOT show Step 7 (New Task) as completed

 Action: Read and execute `.cursor/skills/new-task/SKILL.md`

@@ -131,46 +95,26 @@ The new-task skill interactively guides the user through defining new functional

 ---

-**Step 2g — Implement**
-Condition: the autopilot state shows Step 2f (New Task) is completed AND `_docs/03_implementation/` does not contain a FINAL report covering the new tasks (check state for distinction between test implementation and feature implementation)
+**Step 8 — Implement**
+Condition: the autopilot state shows Step 7 (New Task) is completed AND `_docs/03_implementation/` does not contain a FINAL report covering the new tasks (check state for distinction between test implementation and feature implementation)

 Action: Read and execute `.cursor/skills/implement/SKILL.md`

-The implement skill reads the new tasks from `_docs/02_tasks/` and implements them. Tasks already implemented in Step 2d are skipped (the implement skill tracks completed tasks in batch reports).
+The implement skill reads the new tasks from `_docs/02_tasks/` and implements them. Tasks already implemented in Step 4 are skipped (the implement skill tracks completed tasks in batch reports).

 If `_docs/03_implementation/` has batch reports from this phase, the implement skill detects completed tasks and continues.

 ---

-**Step 2h — Run Tests**
-Condition: the autopilot state shows Step 2g (Implement) is completed AND the autopilot state does NOT show Step 2h (Run Tests) as completed
+**Step 9 — Run Tests**
+Condition: the autopilot state shows Step 8 (Implement) is completed AND the autopilot state does NOT show Step 9 (Run Tests) as completed

-Action: Run the full test suite to verify the implementation before deployment.
-
-1. If `scripts/run-tests.sh` exists (generated by the test-spec skill Phase 4), execute it
-2. Otherwise, detect the project's test runner manually (e.g., `pytest`, `dotnet test`, `cargo test`, `npm test`) and run all unit tests; if `docker-compose.test.yml` or an equivalent test environment exists, spin it up and run the blackbox test suite
-3. **Report results**: present a summary of passed/failed/skipped tests
-
-If all tests pass → auto-chain to Step 2hb (Security Audit).
-
-If tests fail → present using Choose format:
-
-```
-══════════════════════════════════════
- TEST RESULTS: [N passed, M failed, K skipped]
-══════════════════════════════════════
- A) Fix failing tests and re-run
- B) Proceed to deploy anyway (not recommended)
- C) Abort — fix manually
-══════════════════════════════════════
- Recommendation: A — fix failures before deploying
-══════════════════════════════════════
-```
+Action: Read and execute `.cursor/skills/test-run/SKILL.md`

 ---

-**Step 2hb — Security Audit (optional)**
-Condition: the autopilot state shows Step 2h (Run Tests) is completed AND the autopilot state does NOT show Step 2hb (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+**Step 10 — Security Audit (optional)**
+Condition: the autopilot state shows Step 9 (Run Tests) is completed AND the autopilot state does NOT show Step 10 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)

 Action: Present using Choose format:

@@ -185,13 +129,13 @@ Action: Present using Choose format:
 ══════════════════════════════════════
 ```

-- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 2i (Deploy).
-- If user picks B → Mark Step 2hb as `skipped` in the state file, auto-chain to Step 2i (Deploy).
+- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 11 (Performance Test).
+- If user picks B → Mark Step 10 as `skipped` in the state file, auto-chain to Step 11 (Performance Test).

 ---

-**Step 2hc — Performance Test (optional)**
-Condition: the autopilot state shows Step 2hb (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 2hc (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+**Step 11 — Performance Test (optional)**
+Condition: the autopilot state shows Step 10 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 11 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)

 Action: Present using Choose format:

@@ -212,13 +156,13 @@ Action: Present using Choose format:
  2. Otherwise, check if `_docs/02_document/tests/performance-tests.md` exists for test scenarios, detect appropriate load testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute performance test scenarios against the running system
  3. Present results vs acceptance criteria thresholds
  4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort
-  5. After completion, auto-chain to Step 2i (Deploy)
-- If user picks B → Mark Step 2hc as `skipped` in the state file, auto-chain to Step 2i (Deploy).
+  5. After completion, auto-chain to Step 12 (Deploy)
+- If user picks B → Mark Step 11 as `skipped` in the state file, auto-chain to Step 12 (Deploy).

 ---

-**Step 2i — Deploy**
-Condition: the autopilot state shows Step 2h (Run Tests) is completed AND (Step 2hb is completed or skipped) AND (Step 2hc is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
+**Step 12 — Deploy**
+Condition: the autopilot state shows Step 9 (Run Tests) is completed AND (Step 10 is completed or skipped) AND (Step 11 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)

 Action: Read and execute `.cursor/skills/deploy/SKILL.md`

@@ -227,7 +171,7 @@ After deployment completes, the existing-code workflow is done.
 ---
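
The Step 12 condition combines one required step with two optional steps that may each be either completed or skipped. A minimal sketch of that gate follows. It assumes a hypothetical `Status` enum and `deploy_ready` function, a state modelled as a dict from step number to status, and a boolean supplied by the caller for whether `_docs/04_deploy/` is complete, since the text does not define "incomplete". Only `skipped` and `not_started` appear as literal status values in the text; the other two values are assumed.

```python
from enum import Enum
from typing import Dict


class Status(Enum):
    NOT_STARTED = "not_started"
    IN_PROGRESS = "in_progress"   # assumed value
    COMPLETED = "completed"       # assumed value
    SKIPPED = "skipped"


def deploy_ready(state: Dict[int, Status], deploy_docs_complete: bool) -> bool:
    """Return True when the Step 12 (Deploy) condition holds."""
    resolved = {Status.COMPLETED, Status.SKIPPED}
    return (
        state.get(9) is Status.COMPLETED      # Run Tests must be completed
        and state.get(10) in resolved         # Security Audit: done or skipped
        and state.get(11) in resolved         # Performance Test: done or skipped
        and not deploy_docs_complete          # _docs/04_deploy/ missing or incomplete
    )
```

Treating "completed or skipped" as one `resolved` set keeps the optional steps from blocking deployment while still requiring an explicit decision on each.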

 **Re-Entry After Completion**
-Condition: the autopilot state shows `step: done` OR all steps through 2i (Deploy) are completed
+Condition: the autopilot state shows `step: done` OR all steps through 12 (Deploy) are completed

 Action: The project completed a full cycle. Present status and loop back to New Task:

@@ -243,22 +187,48 @@ Action: The project completed a full cycle. Present status and loop back to New
 ══════════════════════════════════════
 ```

-- If user picks A → set `step: 2f`, `status: not_started` in the state file, then auto-chain to Step 2f (New Task). Previous cycle history stays in Completed Steps.
+- If user picks A → set `step: 7`, `status: not_started` in the state file, then auto-chain to Step 7 (New Task). Previous cycle history stays in Completed Steps.
 - If user picks B → report final project status and exit.
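
The re-entry choice amounts to a small state transition. The sketch below assumes the state is a plain dict; `handle_reentry` and the `completed_steps` key are hypothetical names, while `step` and `status` mirror the fields named in the rule.

```python
from typing import Optional


def handle_reentry(state: dict, choice: str) -> Optional[dict]:
    """Apply the re-entry decision to a copy of the state.

    Choice "A" resets the cursor to Step 7 (New Task) and keeps history.
    Choice "B" returns None, meaning: report final status and exit.
    """
    if choice == "A":
        new_state = dict(state)          # shallow copy, history list is shared
        new_state["step"] = 7
        new_state["status"] = "not_started"
        return new_state
    if choice == "B":
        return None
    raise ValueError(f"unknown choice: {choice!r}")
```

Only the cursor fields change on choice A, so the previous cycle's entries remain available for the status summary.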

 ## Auto-Chain Rules

 | Completed Step | Next Action |
 |---------------|-------------|
-| Document (existing code) | Auto-chain → Blackbox Test Spec (Step 2b) |
-| Blackbox Test Spec (Step 2b) | Auto-chain → Decompose Tests (Step 2c) |
-| Decompose Tests (Step 2c) | **Session boundary** — suggest new conversation before Implement Tests |
-| Implement Tests (Step 2d) | Auto-chain → Refactor (Step 2e) |
-| Refactor (Step 2e) | Auto-chain → UI Design detection (Step 2ea) |
-| UI Design (Step 2ea, done or skipped) | Auto-chain → New Task (Step 2f) |
-| New Task (Step 2f) | **Session boundary** — suggest new conversation before Implement |
-| Implement (Step 2g) | Auto-chain → Run Tests (Step 2h) |
-| Run Tests (Step 2h, all pass) | Auto-chain → Security Audit choice (Step 2hb) |
-| Security Audit (Step 2hb, done or skipped) | Auto-chain → Performance Test choice (Step 2hc) |
-| Performance Test (Step 2hc, done or skipped) | Auto-chain → Deploy (Step 2i) |
-| Deploy (Step 2i) | **Workflow complete** — existing-code flow done |
+| Document (1) | Auto-chain → Test Spec (2) |
+| Test Spec (2) | Auto-chain → Decompose Tests (3) |
+| Decompose Tests (3) | **Session boundary** — suggest new conversation before Implement Tests |
+| Implement Tests (4) | Auto-chain → Run Tests (5) |
+| Run Tests (5, all pass) | Auto-chain → Refactor (6) |
+| Refactor (6) | Auto-chain → New Task (7) |
+| New Task (7) | **Session boundary** — suggest new conversation before Implement |
+| Implement (8) | Auto-chain → Run Tests (9) |
+| Run Tests (9, all pass) | Auto-chain → Security Audit choice (10) |
+| Security Audit (10, done or skipped) | Auto-chain → Performance Test choice (11) |
+| Performance Test (11, done or skipped) | Auto-chain → Deploy (12) |
+| Deploy (12) | **Workflow complete** — existing-code flow done |
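
The renumbered auto-chain table maps directly onto a lookup keyed by step number. This is a sketch under stated assumptions: `AUTO_CHAIN`, `next_action`, and the two sentinel strings are hypothetical names, and the failing-test case returns `None` because the table only defines the all-pass transition.

```python
from typing import Optional, Union

SESSION_BOUNDARY = "session_boundary"    # suggest a new conversation
WORKFLOW_COMPLETE = "workflow_complete"  # existing-code flow done

# Completed step -> next step number, or a sentinel.
AUTO_CHAIN = {
    1: 2,
    2: 3,
    3: SESSION_BOUNDARY,
    4: 5,
    5: 6,
    6: 7,
    7: SESSION_BOUNDARY,
    8: 9,
    9: 10,
    10: 11,
    11: 12,
    12: WORKFLOW_COMPLETE,
}


def next_action(step: int, tests_passed: bool = True) -> Optional[Union[int, str]]:
    """Return the next action after a completed step.

    For the two Run Tests steps (5 and 9) the table covers only the
    all-pass case, so a failing run yields None here.
    """
    if step in (5, 9) and not tests_passed:
        return None
    return AUTO_CHAIN[step]
```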
+
+## Status Summary Template
+
+```
+═══════════════════════════════════════════════════
+ AUTOPILOT STATUS (existing-code)
+═══════════════════════════════════════════════════
+ Step 1   Document            [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 2   Test Spec           [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 3   Decompose Tests     [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 4   Implement Tests     [DONE / IN PROGRESS (batch M) / NOT STARTED / FAILED (retry N/3)]
+ Step 5   Run Tests           [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 6   Refactor            [DONE / IN PROGRESS (phase N) / NOT STARTED / FAILED (retry N/3)]
+ Step 7   New Task            [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 8   Implement           [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
+ Step 9   Run Tests           [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 10  Security Audit      [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 11  Performance Test    [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+ Step 12  Deploy              [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+═══════════════════════════════════════════════════
+ Current: Step N — Name
+ SubStep: M — [sub-skill internal step name]
+ Retry:   [N/3 if retrying, omit if 0]
+ Action:  [what will happen next]
+═══════════════════════════════════════════════════
+```
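
The step rows of the template can be produced with fixed-width formatting. The following sketch assumes hypothetical names `STEPS` and `render_rows`, and a dict from step number to an already formatted status string; it renders only the twelve step rows, not the banner or the `Current`/`SubStep`/`Retry`/`Action` footer.

```python
from typing import Dict

STEPS = [
    (1, "Document"), (2, "Test Spec"), (3, "Decompose Tests"),
    (4, "Implement Tests"), (5, "Run Tests"), (6, "Refactor"),
    (7, "New Task"), (8, "Implement"), (9, "Run Tests"),
    (10, "Security Audit"), (11, "Performance Test"), (12, "Deploy"),
]


def render_rows(statuses: Dict[int, str]) -> str:
    """Render one row per step; steps without an entry show NOT STARTED."""
    lines = []
    for number, name in STEPS:
        status = statuses.get(number, "NOT STARTED")
        # Step number padded to 4 columns, name padded to 20 columns.
        lines.append(f" Step {number:<4}{name:<20}[{status}]")
    return "\n".join(lines)
```

For example, `render_rows({1: "DONE", 4: "IN PROGRESS (batch 2)"})` marks Steps 1 and 4 and leaves every other step as `NOT STARTED`.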