1 Commits

Author SHA1 Message Date
Denys Zaitsev d7e1066c60 Initial commit 2026-04-03 23:25:54 +03:00
5305 changed files with 1572704 additions and 272401 deletions
-18
View File
@@ -1,18 +0,0 @@
---
BasedOnStyle: Google
ColumnLimit: 100
IndentWidth: 4
TabWidth: 4
UseTab: Never
AccessModifierOffset: -4
AllowShortFunctionsOnASingleLine: Empty
AllowShortIfStatementsOnASingleLine: false
AllowShortLoopsOnASingleLine: false
BinPackArguments: false
BinPackParameters: false
BreakBeforeBraces: Attach
DerivePointerAlignment: false
PointerAlignment: Left
SortIncludes: true
SpaceAfterCStyleCast: true
Standard: c++17
-18
View File
@@ -1,18 +0,0 @@
---
Checks: >
-*,
bugprone-*,
clang-analyzer-*,
cppcoreguidelines-*,
modernize-*,
performance-*,
readability-*,
-bugprone-easily-swappable-parameters,
-cppcoreguidelines-avoid-magic-numbers,
-cppcoreguidelines-non-private-member-variables-in-classes,
-modernize-use-trailing-return-type,
-readability-identifier-length,
-readability-magic-numbers
WarningsAsErrors: ''
HeaderFilterRegex: '.*'
FormatStyle: file
-7
View File
@@ -1,7 +0,0 @@
format:
line_width: 100
tab_size: 2
use_tabchars: false
separate_ctrl_name_with_space: false
separate_fn_name_with_space: false
dangle_parens: false
+86 -163
View File
@@ -1,9 +1,9 @@
## How to Use
Type `/autodev` to start or continue the full workflow. The orchestrator detects where your project is and picks up from there.
Type `/autopilot` to start or continue the full workflow. The orchestrator detects where your project is and picks up from there.
```
/autodev — start a new project or continue where you left off
/autopilot — start a new project or continue where you left off
```
If you want to run a specific skill directly (without the orchestrator), use the individual commands:
@@ -11,31 +11,21 @@ If you want to run a specific skill directly (without the orchestrator), use the
```
/problem — interactive problem gathering → _docs/00_problem/
/research — solution drafts → _docs/01_solution/
/plan — architecture, ADRs, components, tests, epics → _docs/02_document/
/test-specblackbox/perf/resilience/security test specs → _docs/02_document/tests/
/decompose — atomic task specs (multi-mode) → _docs/02_tasks/todo/
/implementsequential dependency-aware batches with code review and completeness gates → _docs/03_implementation/
/test-run — runs the test suite (functional / perf modes) with gating
/code-review — multi-phase review used by /implement
/refactor — 8-phase structured refactoring (incl. testability sub-mode) → _docs/04_refactoring/
/security — OWASP-driven audit → _docs/05_security/
/deploy — containerization, CI/CD, environments, observability, procedures, scripts → _docs/04_deploy/
/release — execute deploy artifacts in prod, smoke-test, watch, decide rollback → _docs/04_release/
/document — bottom-up reverse-engineering of an existing codebase → _docs/02_document/
/new-task — interactive feature planning for an existing codebase → _docs/02_tasks/todo/
/ui-design — HTML+CSS mockups + design system → _docs/02_document/ui_mockups/
/retrospective — metrics + lessons log → _docs/06_metrics/ + _docs/LESSONS.md
/plan — architecture, components, tests → _docs/02_plans/
/decomposeatomic task specs → _docs/02_tasks/
/implementbatched parallel implementation → _docs/03_implementation/
/deploy containerization, CI/CD, observability → _docs/04_deploy/
```
## How It Works
The autodev is a state machine that persists its state to `_docs/_autodev_state.md`. On every invocation it reads the state file, cross-checks against the `_docs/` folder structure, shows a status summary with context from prior sessions, and continues execution.
The autopilot is a state machine that persists its state to `_docs/_autopilot_state.md`. On every invocation it reads the state file, cross-checks against the `_docs/` folder structure, shows a status summary with context from prior sessions, and continues execution.
```
/autodev invoked
/autopilot invoked
Read _docs/_autodev_state.md → cross-check _docs/ folders
Read _docs/_autopilot_state.md → cross-check _docs/ folders
Show status summary (progress, key decisions, last session context)
@@ -47,221 +37,154 @@ Execute current skill (read its SKILL.md, follow its workflow)
Update state file → auto-chain to next skill → loop
```
The state file tracks completed steps, key decisions, blockers, and session context. This makes re-entry across conversations seamless — the autodev knows not just where you are, but what decisions were made and why.
The state file tracks completed steps, key decisions, blockers, and session context. This makes re-entry across conversations seamless — the autopilot knows not just where you are, but what decisions were made and why.
Skills auto-chain without pausing between them. The only pauses are:
- **BLOCKING gates** inside each skill (user must confirm before proceeding)
- **Session boundaries** declared in each flow's auto-chain rules (e.g., after `decompose`, after `decompose tests`) — suggested new-conversation breakpoints to keep context fresh
- **Session boundary** after decompose (suggests new conversation before implement)
There are three flows, resolved on every invocation (see `skills/autodev/SKILL.md` § Flow Resolution):
A typical project runs in 2-4 conversations:
- Session 1: Problem → Research → Research decision
- Session 2: Plan → Decompose
- Session 3: Implement (may span multiple sessions)
- Session 4: Deploy
| Flow | When | Steps |
|------|------|-------|
| **greenfield** | empty workspace, no source yet | 17 steps: Problem → Research → Plan → UI Design → Test Spec → Decompose → Implement → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (opt) → Performance Test (opt) → Deploy → Release → Retrospective |
| **existing-code** | source files present | one-time baseline (Document → Architecture Baseline Scan → Test Spec → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → optional Refactor) then a feature-cycle loop (New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security Audit (opt) → Performance Test (opt) → Deploy → Release → Retrospective → loops back to New Task) |
| **meta-repo** | `.gitmodules`, workspace manifest, or multi-component aggregator | uses `monorepo-*` skills + `_docs/_repo-config.yaml` instead of per-component BUILD-SHIP folders |
A typical greenfield project spans several conversations because of session boundaries. Re-entry is seamless: type `/autodev` in a new conversation and the orchestrator reads `_docs/_autodev_state.md` to pick up exactly where you left off.
Re-entry is seamless: type `/autopilot` in a new conversation and the orchestrator reads the state file to pick up exactly where you left off.
## Skill Descriptions
### autodev (meta-orchestrator)
### autopilot (meta-orchestrator)
Auto-chaining engine that sequences the full BUILD → SHIP → EVOLVE workflow. Persists state to `_docs/_autodev_state.md`, surfaces top-3 lessons from `_docs/LESSONS.md` at every invocation, replays any `_docs/_process_leftovers/` entries, tracks key decisions and session context, and flows through the active flow's steps without manual skill invocation. Maximizes work per conversation with seamless cross-session re-entry.
Auto-chaining engine that sequences the full BUILD → SHIP workflow. Persists state to `_docs/_autopilot_state.md`, tracks key decisions and session context, and flows through problem → research → plan → decompose → implement → deploy without manual skill invocation. Maximizes work per conversation with seamless cross-session re-entry.
### problem
Interactive 4-phase interview that builds `_docs/00_problem/`. Asks probing questions across 8 dimensions (problem & goals, scope, hardware & environment, software & tech, acceptance criteria, input data, security, operational) until all required files can be written with concrete, measurable, quantifiable content. Acceptance criteria must include numeric targets; input data must include `expected_results/` mappings.
Interactive interview that builds `_docs/00_problem/`. Asks probing questions across 8 dimensions (problem, scope, hardware, software, acceptance criteria, input data, security, operations) until all required files can be written with concrete, measurable content.
### research
8-step deep research methodology. Mode A produces initial solution drafts. Mode B assesses and revises existing drafts. Classifies output as **Technical-component selection** (full per-mode API verification gates apply) or **Non-technical investigation** (gates relaxed). Source tiering, fact extraction, comparison frameworks, validation, exact-fit component selection. Run multiple rounds until the solution is solid.
8-step deep research methodology. Mode A produces initial solution drafts. Mode B assesses and revises existing drafts. Includes AC assessment, source tiering, fact extraction, comparison frameworks, and validation. Run multiple rounds until the solution is solid.
### plan
6-step planning workflow with one half-step (4.5: Architecture Decision Records). Produces blackbox test specs (delegated to test-spec), glossary, architecture vision, architecture document, data model, deployment plan, component specs with interfaces, risk assessment, ADRs, test specifications, and work item epics. Heavy interaction at BLOCKING gates (glossary+vision, architecture, components, mitigations, ADRs).
### test-spec
4-phase test specification workflow. Phase 1 analyzes input data + expected-results completeness. Phase 2 emits 8 test artifacts (environment, test-data, blackbox, performance, resilience, security, resource-limit, traceability matrix). Phase 3 is the hard gate that requires every test to have quantifiable expected results. Phase 4 emits runner scripts. Cycle-update mode for incremental refresh.
6-step planning workflow. Produces integration test specs, architecture, system flows, data model, deployment plan, component specs with interfaces, risk assessment, test specifications, and Jira epics. Heavy interaction at BLOCKING gates.
### decompose
Multi-mode task decomposition with 6 internal step files. Implementation mode runs Step 1 (Bootstrap), 1.5 (Module Layout), 1.7 (System-Pipeline owner tasks), 2 (per-component tasks), 4 (Cross-Verification). Tests-only mode runs Step 1t (Test Infrastructure), 3 (Blackbox tasks), 4. Single-component mode runs Step 2 only. Each task is tracker-prefixed and capped at 5 complexity points. The 1.7 step exists specifically to prevent the GPS-passthrough class of failure (see `meta-rule.mdc`).
4-step task decomposition. Produces a bootstrap structure plan, atomic task specs per component, integration test tasks, and a cross-task dependency table. Each task gets a Jira ticket and is capped at 5 complexity points.
### implement
Orchestrator that reads task specs, computes dependency-aware execution batches via topological sort, **implements tasks sequentially within each batch** (no subagents, no parallel execution — see `.cursor/rules/no-subagents.mdc`), runs code review after each batch, runs cumulative code review every K batches, and commits per batch. Has a Product Implementation Completeness Gate (Step 15) that compares promises in task specs / architecture against actual production code, plus a System-Pipeline Audit (Step 15.b) that walks architecture-named pipelines and verifies a real production caller wires each adjacent component pair. Either gate's FAIL stops the cycle until remediation tasks are created.
### code-review
7-phase code review against task specs (Phase 7 is Architecture Compliance against `module-layout.md` and `architecture.md`). Produces structured findings with verdict: PASS, PASS_WITH_WARNINGS, or FAIL. Three modes: full (per batch), baseline (one-time architecture scan of an existing codebase), cumulative (mid-implementation across batches with `## Baseline Delta`).
### test-run
Runs the test suite. Functional mode (default): detects pytest/dotnet/cargo/npm or `scripts/run-tests.sh`, applies a System-Under-Test Reality Gate to refuse passes where internal product modules were stubbed, classifies failures and skips, gates on outcome. Perf mode: detects `scripts/run-performance-tests.sh` or k6/locust/artillery/wrk, captures latency/throughput/error metrics, compares against thresholds.
### refactor
8-phase structured refactoring: baseline → discovery → analysis → safety net → execution → test sync → verification → documentation. Two input modes (Automatic / Guided). Testability sub-mode skips Phase 3 by design and emits a `testability_changes_summary.md` for user review. Each run lives in its own `RUN_DIR` under `_docs/04_refactoring/NN-<run-name>/`.
### security
5-phase OWASP-based audit: dependency scan → static analysis → OWASP Top 10 review → infrastructure review → consolidated security report. Severity-ranked, evidence-based, actionable. Complementary to `code-review` Phase 4 (lightweight security quick-scan).
Orchestrator that reads task specs, computes dependency-aware execution batches, launches up to 4 parallel implementer subagents, runs code review after each batch, and commits per batch. Does not write code itself.
### deploy
7-step deployment planning. Produces documents for steps 16 (status & env, containerization, CI/CD pipeline, environment strategy, observability, deployment procedures) and executable scripts in step 7 (`deploy.sh`, `pull-images.sh`, `start-services.sh`, `stop-services.sh`, `health-check.sh`).
7-step deployment planning. Status check, containerization, CI/CD pipeline, environment strategy, observability, deployment procedures, and deployment scripts. Produces documents for steps 1-6 and executable scripts in step 7.
### release
### code-review
Executes the deployment plan produced by `/deploy` against a target environment. 6 phases: pre-release gate (AC + risk + rollback readiness), strategy select (all-at-once / blue-green / canary / manual), execute (run scripts, monitor exit codes), smoke test (delegate to test-run prod-smoke), watch window (read observability for the configured duration), commit-or-rollback. Outputs `_docs/04_release/release_<version>.md`. Produces a definitive Released / Rolled-Back / Aborted verdict; failure of any phase auto-triggers rollback unless the user opts to investigate.
Multi-phase code review against task specs. Produces structured findings with verdict: PASS, FAIL, or PASS_WITH_WARNINGS.
### refactor
6-phase structured refactoring: baseline, discovery, analysis, safety net, execution, hardening.
### security
OWASP-based security testing and audit.
### retrospective
4-step workflow: collect metrics → analyze trends → produce report → update lessons log (`_docs/LESSONS.md`, ring buffer of last 15 entries consumed by `new-task`, `plan`, `decompose`, and `autodev`). Cycle-end (default) and incident modes; incident mode is auto-invoked after a 3-strike failure.
Collects metrics from implementation batch reports, analyzes trends, produces improvement reports.
### document
### rollback
Bottom-up codebase documentation. Analyzes existing code from modules through components to architecture, then retrospectively derives problem/restrictions/acceptance criteria. Alternative entry point for existing codebases — produces the same `_docs/` artifacts as problem + plan, but from code analysis instead of user interview. Two workflow files: `workflows/full.md` (full / focus-area / resume) and `workflows/task.md` (incremental update for a single task).
### new-task
Existing-code feature planning loop. Walks the user through Step 1 (description) → Step 2 (complexity assessment, consults `LESSONS.md`) → Step 3 (research if needed) → Step 4 (codebase analysis incl. test-coverage gap) → Step 4.5 (contract & layout check) → Step 5 (validate assumptions) → Step 6 (write task spec) → Step 7 (tracker ticket) → Step 8 (loop or finalize).
### ui-design
End-to-end UI workflow. Phase 0 (complexity detection: full vs quick) → Phase 1 (context check) → Phase 2 (requirements) → Phase 3 (direction exploration) → Phase 4 (design system synthesis: `DESIGN.md`) → Phase 5 (HTML+Tailwind code generation) → Phase 6 (visual verification, optional MCP enhancements) → Phase 7 (user review) → Phase 8 (iteration). Has Applicability Check that refuses to run on non-UI projects.
### monorepo-* (suite-level)
Six skills for meta-repos: `monorepo-discover` (write/refresh `_docs/_repo-config.yaml`), `monorepo-document` (sync unified docs), `monorepo-cicd` (sync CI/compose/env templates), `monorepo-onboard` (atomic add-component), `monorepo-status` (read-only drift report), `monorepo-e2e` (sync suite-level integration harness). They never cross domains; each touches exactly one artifact class.
Reverts implementation to a specific batch checkpoint using git revert, verifies integrity.
## Developer TODO (Project Mode)
The numbered list below mirrors greenfield-flow ordering. Existing-code projects start at `/document`, then enter the feature-cycle loop at `/new-task`. See `skills/autodev/flows/{greenfield,existing-code,meta-repo}.md` for the authoritative step tables.
### BUILD (greenfield)
### BUILD
```
1. /problem — interactive 4-phase interview → _docs/00_problem/
required: problem.md, restrictions.md, acceptance_criteria.md, input_data/
optional: security_approach.md
2. /research — solution drafts (Mode A draft, Mode B assess) → _docs/01_solution/
3. /plan — glossary, architecture vision, architecture, data model, deployment, components,
risks, ADRs (Step 4.5), test specs, epics → _docs/02_document/
(Step 1 invokes /test-spec internally)
4. /ui-design — HTML+Tailwind mockups (UI projects only) → _docs/02_document/ui_mockups/
5. /test-spec — produces 8 test-spec artifacts + traceability matrix → _docs/02_document/tests/
(already invoked from /plan Step 1; Step 5 here is the explicit autodev step)
6. /decompose — implementation tasks + module-layout + system-pipeline owner tasks →
_docs/02_tasks/todo/
7. /implement — sequential dependency-aware batches; per-batch code-review;
Product Completeness Gate + System-Pipeline Audit → _docs/03_implementation/
8. (auto) Code Testability Revision — surgical refactor to make code runnable under tests
9. /decompose tests — test-only decomposition mode → _docs/02_tasks/todo/
10. /implement (tests) — implements test tasks
11. /test-run — full functional suite gate
12. /test-spec --cycle-update — append implementation-learned scenarios
13. /document --task — update affected component / module / architecture docs
14. /security — OWASP-based audit (optional gate)
15. /test-run --perf — perf/load tests (optional gate)
0. /problem — interactive interview → _docs/00_problem/
- problem.md (required)
- restrictions.md (required)
- acceptance_criteria.md (required)
- input_data/ (required)
- security_approach.md (optional)
1. /research — solution drafts → _docs/01_solution/
Run multiple times: Mode A → draft, Mode B → assess & revise
2. /plan — architecture, data model, deployment, components, risks, tests, Jira epics → _docs/02_plans/
3. /decompose — atomic task specs + dependency table → _docs/02_tasks/
4. /implement — batched parallel agents, code review, commit per batch → _docs/03_implementation/
```
### SHIP
```
16. /deploy — containerization, CI/CD, environments, observability, procedures, scripts → _docs/04_deploy/
17. /release — execute deploy artifacts in prod, smoke-test, watch, decide rollback → _docs/04_release/
5. /deploy — containerization, CI/CD, environments, observability, procedures → _docs/04_deploy/
```
### EVOLVE
```
18. /retrospective — metrics + trends + lessons-log update → _docs/06_metrics/ + _docs/LESSONS.md
(cycle-end mode after release; incident mode auto-fires after 3-strike failure)
After greenfield completes, the state file is rewritten to point at the existing-code flow's
feature-cycle loop, which begins with /new-task and ends with /retrospective. The loop runs once
per feature with state.cycle incremented.
Off-cycle:
/refactor — full 8-phase refactor → _docs/04_refactoring/NN-<run-name>/
/document — full reverse-engineering of an unfamiliar codebase
6. /refactor — structured refactoring → _docs/04_refactoring/
7. /retrospective — metrics, trends, improvement actions → _docs/05_metrics/
```
Or just use `/autodev` to run all the above automatically — the orchestrator chooses the right flow, sequences steps, surfaces lessons, processes leftovers, and pauses only at BLOCKING gates and declared session boundaries.
Or just use `/autopilot` to run steps 0-5 automatically.
## Available Skills
| Skill | Triggers | Output |
|-------|----------|--------|
| **autodev** | "autodev", "auto", "start", "continue", "what's next" | Orchestrates full workflow (3 flows) |
| **autopilot** | "autopilot", "auto", "start", "continue", "what's next" | Orchestrates full workflow |
| **problem** | "problem", "define problem", "new project" | `_docs/00_problem/` |
| **research** | "research", "investigate" | `_docs/01_solution/` |
| **plan** | "plan", "decompose solution" | `_docs/02_document/` (incl. ADRs) |
| **test-spec** | "test spec", "blackbox tests", "test scenarios" | `_docs/02_document/tests/` + `scripts/` |
| **decompose** | "decompose", "task decomposition", "decompose tests" | `_docs/02_tasks/todo/` + `_docs/02_document/module-layout.md` |
| **implement** | "implement", "start implementation" | `_docs/03_implementation/` (sequential — see `no-subagents.mdc`) |
| **test-run** | "run tests", "test suite", "verify tests", "perf test" | Test results + verdict |
| **code-review** | "code review", "review code" | Verdict: PASS / FAIL / PASS_WITH_WARNINGS (7 phases) |
| **new-task** | "new task", "add feature", "new functionality" | `_docs/02_tasks/todo/` |
| **ui-design** | "design a UI", "mockup", "design system" | `_docs/02_document/ui_mockups/` |
| **refactor** | "refactor", "improve code", "testability" | `_docs/04_refactoring/NN-<run-name>/` |
| **security** | "security audit", "OWASP", "vulnerability scan" | `_docs/05_security/` |
| **document** | "document", "document codebase", "reverse-engineer docs" | `_docs/02_document/` + `_docs/00_problem/` + `_docs/01_solution/` |
| **deploy** | "deploy", "CI/CD", "observability", "containerize" | `_docs/04_deploy/` (plans + scripts) |
| **release** | "release", "ship", "go live", "rollback" | `_docs/04_release/` (executed deploy + verdict) |
| **retrospective** | "retrospective", "retro", "metrics review" | `_docs/06_metrics/` + `_docs/LESSONS.md` |
| **monorepo-discover** | "discover monorepo", "scan submodules" | `_docs/_repo-config.yaml` |
| **monorepo-document** | "sync monorepo docs" | unified `_docs/*.md` |
| **monorepo-cicd** | "sync compose", "sync ci" | suite-level CI/compose/env templates |
| **monorepo-onboard** | "onboard component", "register submodule" | atomic component addition |
| **monorepo-status** | "monorepo status", "drift report" | read-only drift report |
| **monorepo-e2e** | "suite e2e", "integration harness" | `e2e/docker-compose.suite-e2e.yml` and fixtures |
| **plan** | "plan", "decompose solution" | `_docs/02_plans/` |
| **decompose** | "decompose", "task decomposition" | `_docs/02_tasks/` |
| **implement** | "implement", "start implementation" | `_docs/03_implementation/` |
| **code-review** | "code review", "review code" | Verdict: PASS / FAIL / PASS_WITH_WARNINGS |
| **refactor** | "refactor", "improve code" | `_docs/04_refactoring/` |
| **security** | "security audit", "OWASP" | Security findings report |
| **deploy** | "deploy", "CI/CD", "observability" | `_docs/04_deploy/` |
| **retrospective** | "retrospective", "retro" | `_docs/05_metrics/` |
| **rollback** | "rollback", "revert batch" | `_docs/03_implementation/rollback_report.md` |
> The `.cursor/agents/` directory is intentionally empty. Per `.cursor/rules/no-subagents.mdc` the main agent does not delegate to subagents in this workspace; `/implement` runs tasks sequentially.
## Tools
| Tool | Type | Purpose |
|------|------|---------|
| `implementer` | Subagent | Implements a single task. Launched by `/implement`. |
## Project Folder Structure
```
_docs/
├── _autodev_state.md — autodev orchestrator state (≤30 lines; pointer only)
├── _process_leftovers/deferred tracker writes replayed at next /autodev (per tracker.mdc)
├── _repo-config.yaml — meta-repo only; produced by monorepo-discover
├── LESSONS.md — ring buffer of last 15 actionable lessons (consumed by autodev/new-task/plan/decompose)
├── 00_problem/ — problem definition, restrictions, AC, input data + expected_results/
├── _autopilot_state.md — autopilot orchestrator state (progress, decisions, session context)
├── 00_problem/ problem definition, restrictions, AC, input data
├── 00_research/ — intermediate research artifacts
├── 01_solution/ — solution drafts, tech stack, security analysis
├── 02_document/
│ ├── architecture.md — includes ## Architecture Vision (user-confirmed)
│ ├── glossary.md — user-confirmed terminology
├── 02_plans/
│ ├── architecture.md
│ ├── system-flows.md
│ ├── data_model.md
│ ├── module-layout.md — per-component Owns/Imports-from/Public API (decompose Step 1.5)
│ ├── architecture_compliance_baseline.md — existing-code baseline scan output
│ ├── risk_mitigations.md
│ ├── adr/[NNN]_[decision_slug].md — Architectural Decision Records (plan Step 4.5)
│ ├── components/[##]_[name]/ — description.md + tests.md per component
│ ├── contracts/<component>/<name>.md — versioned public-API contracts
│ ├── common-helpers/
│ ├── tests/ — environment, test-data, blackbox, performance, resilience, security, resource-limit, traceability matrix
│ ├── ui_mockups/ — HTML+CSS mockups, DESIGN.md (ui-design skill)
│ ├── integration_tests/ — environment, test data, functional, non-functional, traceability
│ ├── deployment/ — containerization, CI/CD, environments, observability, procedures
│ ├── diagrams/
│ └── FINAL_report.md
├── 02_tasks/ — task lifecycle folders + _dependencies_table.md
│ ├── _dependencies_table.md
│ ├── todo/ tasks ready for implementation
│ ├── backlog/ parked tasks (not scheduled yet)
│ └── done/ completed/archived tasks
├── 02_task_plans/ — per-task research artifacts (new-task skill)
├── 03_implementation/ — batch_*_cycle*.md, implementation_report_*.md, implementation_completeness_cycle*.md, cumulative_review_*.md
│ └── reviews/ — code review reports per batch
├── 04_deploy/ — containerization, CI/CD, environments, observability, procedures, deploy_scripts.md, reports/
├── 04_refactoring/NN-<run-name>/ — baseline_metrics, discovery, analysis, test_specs, execution_log, test_sync, verification, FINAL_report (one folder per refactor run)
├── 04_release/ — release_<version>.md (one per /release invocation), rollback_<version>.md
├── 05_security/ — dependency_scan, static_analysis, owasp_review, infrastructure_review, security_report
└── 06_metrics/ — retro_<YYYY-MM-DD>.md, structure_<YYYY-MM-DD>.md, perf_<YYYY-MM-DD>_<run-label>.md, incident_<YYYY-MM-DD>_<skill>.md
├── 02_tasks/ — [JIRA-ID]_[name].md + _dependencies_table.md
├── 03_implementation/ — batch reports, rollback report, FINAL report
├── 04_deploy/containerization, CI/CD, environments, observability, procedures, scripts
├── 04_refactoring/baseline, discovery, analysis, execution, hardening
└── 05_metrics/retro_[YYYY-MM-DD].md
```
## Standalone Mode
@@ -276,7 +199,7 @@ _docs/
## Single Component Mode (Decompose)
```
/decompose @_docs/02_document/components/03_parser/description.md
/decompose @_docs/02_plans/components/03_parser/description.md
```
Appends tasks for that component to `_docs/02_tasks/todo/` without running bootstrap or cross-verification.
Appends tasks for that component to `_docs/02_tasks/` without running bootstrap or cross-verification.
+105
View File
@@ -0,0 +1,105 @@
---
name: implementer
description: |
Implements a single task from its spec file. Use when implementing tasks from _docs/02_tasks/.
Reads the task spec, analyzes the codebase, implements the feature with tests, and verifies acceptance criteria.
Launched by the /implement skill as a subagent.
---
You are a professional software developer implementing a single task.
## Input
You receive from the `/implement` orchestrator:
- Path to a task spec file (e.g., `_docs/02_tasks/[JIRA-ID]_[short_name].md`)
- Files OWNED (exclusive write access — only you may modify these)
- Files READ-ONLY (shared interfaces, types — read but do not modify)
- Files FORBIDDEN (other agents' owned files — do not touch)
## Context (progressive loading)
Load context in this order, stopping when you have enough:
1. Read the task spec thoroughly — acceptance criteria, scope, constraints, dependencies
2. Read `_docs/02_tasks/_dependencies_table.md` to understand where this task fits
3. Read project-level context:
- `_docs/00_problem/problem.md`
- `_docs/00_problem/restrictions.md`
- `_docs/01_solution/solution.md`
4. Analyze the specific codebase areas related to your OWNED files and task dependencies
## Boundaries
**Always:**
- Run tests before reporting done
- Follow existing code conventions and patterns
- Implement error handling per the project's strategy
- Stay within the task spec's Scope/Included section
**Ask first:**
- Adding new dependencies or libraries
- Creating files outside your OWNED directories
- Changing shared interfaces that other tasks depend on
**Never:**
- Modify files in the FORBIDDEN list
- Skip writing tests
- Change database schema unless the task spec explicitly requires it
- Commit secrets, API keys, or passwords
- Modify CI/CD configuration unless the task spec explicitly requires it
## Process
1. Read the task spec thoroughly — understand every acceptance criterion
2. Analyze the existing codebase: conventions, patterns, related code, shared interfaces
3. Research best implementation approaches for the tech stack if needed
4. If the task has a dependency on an unimplemented component, create a minimal interface mock
5. Implement the feature following existing code conventions
6. Implement error handling per the project's defined strategy
7. Implement unit tests (use //Arrange //Act //Assert comments)
8. Implement integration tests — analyze existing tests, add to them or create new
9. Run all tests, fix any failures
10. Verify every acceptance criterion is satisfied — trace each AC with evidence
## Stop Conditions
- If the same fix fails 3+ times with different approaches, stop and report as blocker
- If blocked on an unimplemented dependency, create a minimal interface mock and document it
- If the task scope is unclear, stop and ask rather than assume
## Completion Report
Report using this exact structure:
```
## Implementer Report: [task_name]
**Status**: Done | Blocked | Partial
**Task**: [JIRA-ID]_[short_name]
### Acceptance Criteria
| AC | Satisfied | Evidence |
|----|-----------|----------|
| AC-1 | Yes/No | [test name or description] |
| AC-2 | Yes/No | [test name or description] |
### Files Modified
- [path] (new/modified)
### Test Results
- Unit: [X/Y] passed
- Integration: [X/Y] passed
### Mocks Created
- [path and reason, or "None"]
### Blockers
- [description, or "None"]
```
## Principles
- Follow SOLID, KISS, DRY
- Dumb code, smart data
- No unnecessary comments or logs (only exceptions)
- Ask if requirements are ambiguous — do not assume
-10
View File
@@ -1,10 +0,0 @@
---
description: Rules for installation and provisioning scripts
globs: scripts/**/*.sh
alwaysApply: false
---
# Automation Scripts
- Automate repeatable setup steps in scripts. For dependencies with official package managers (apt, brew, pip, npm), automate installation. For binaries from external URLs, document the download but require user review before execution.
- Use sensible defaults for paths and configuration (e.g. `/opt/` for system-wide tools). Allow overrides via environment variables for users who need non-standard locations.
+10 -36
View File
@@ -1,49 +1,23 @@
---
description: "Enforces readable, environment-aware coding standards with scope discipline, meaningful comments, and test verification"
description: "Enforces concise, comment-free, environment-aware coding standards with strict scope discipline and test verification"
alwaysApply: true
---
# Coding preferences
- Prefer the simplest solution that satisfies all requirements, including maintainability. When in doubt between two approaches, choose the one with fewer moving parts — but never sacrifice correctness, error handling, or readability for brevity.
- Follow the Single Responsibility Principle — a class or method should have one reason to change:
- If a method is hard to name precisely from the caller's perspective, its responsibility is misplaced. Vague names like "candidate", "data", or "item" are a signal — fix the design, not just the name.
- Logic specific to a platform, variant, or environment belongs in the class that owns that variant, not in the general coordinator. Passing a dependency through is preferable to leaking variant-specific concepts into shared code.
- Only use static methods for pure, self-contained computations (constants, simple math, stateless lookups). If a static method involves resource access, side effects, OS interaction, or logic that varies across subclasses or environments — use an instance method or factory class instead. Before implementing a non-trivial static method, ask the user.
- Avoid boilerplate and unnecessary indirection, but never sacrifice readability for brevity.
- Never suppress errors silently — no `2>/dev/null`, empty `catch` blocks, bare `except: pass`, or discarded error returns. These hide the information you need most when something breaks. If an error is truly safe to ignore, log it or comment why.
- Do not add comments that merely narrate what the code does. Comments are appropriate for: non-obvious business rules, workarounds with references to issues/bugs, safety invariants, and public API contracts. Make comments as short and concise as possible. Exception: every test must use the Arrange / Act / Assert pattern with language-appropriate comment syntax (`# Arrange` for Python, `// Arrange` for C#/Rust/JS/TS). Omit any section that is not needed (e.g. if there is no setup, skip Arrange; if act and assert are the same line, keep only Assert)
- Do not add verbose debug/trace logs by default. Log exceptions, security events (auth failures, permission denials), and business-critical state transitions. Add debug-level logging only when asked.
- Always prefer simple solution
- Generate concise code
- Do not put comments in the code
- Do not put logs unless it is an exception, or was asked specifically
- Do not put code annotations unless it was asked specifically
- Write code that takes into account the different environments: development, production
- You are careful to make changes that are requested or you are confident the changes are well understood and related to the change being requested
- Mocking data is needed only for tests, never mock data for dev or prod env
- Make test environment (files, db and so on) as close as possible to the production environment
- When you add new libraries or dependencies make sure you are using the same version of it as other parts of the code
- When writing code that calls a library API, verify the API actually exists in the pinned version. Check the library's changelog or migration guide for breaking changes between major versions. Never assume an API works at a given version — test the actual call path before committing.
- When a test fails due to a missing dependency, install it — do not fake or stub the module system. For normal packages, add them to the project's dependency file (requirements-test.txt, package.json devDependencies, test csproj, etc.) and install. Only consider stubbing if the dependency is heavy (e.g. hardware-specific SDK, large native toolchain) — and even then, ask the user first before choosing to stub.
- Do not solve environment or infrastructure problems (dependency resolution, import paths, service discovery, connection config) by hardcoding workarounds in source code. Fix them at the environment/configuration level.
- Before writing new infrastructure or workaround code, check how the existing codebase already handles the same concern. Follow established project patterns.
- If a file, class, or function has no remaining usages — delete it. Dead code rots: its dependencies drift, it misleads readers, and it breaks when the code it depends on evolves. However, before deletion verify that the symbol is not used via any of the following. If any applies, do NOT delete — leave it or ASK the user:
- Public API surface exported from the package and potentially consumed outside the workspace (see `workspace-boundary.mdc`)
- Reflection, dependency injection, or service registration (scan DI container registrations, `appsettings.json` / equivalent config, attribute-based discovery, plugin manifests)
- Dynamic dispatch from config/data (YAML/JSON references, string-based class lookups, route tables, command dispatchers)
- Test fixtures used only by currently-skipped tests — temporary skips may become active again
- Cross-repo references — if this workspace is part of a multi-repo system, grep sibling repos for shared contracts before deleting
- **Scope discipline**: focus edits on the task scope. The "scope" is:
- Files the task explicitly names
- Files that define interfaces the task changes
- Files that directly call, implement, or test the changed code
- **Adjacent hygiene is permitted** without asking: fixing imports you caused to break, updating obvious stale references within a file you already modify, deleting code that became dead because of your change.
- **Unrelated issues elsewhere**: do not silently fix them as part of this task. Either note them to the user at end of turn and ASK before expanding scope, or record in `_docs/_process_leftovers/` for later handling.
- Always think about what other methods and areas of code might be affected by the code changes, and surface the list to the user before modifying.
- When you think you are done with changes, run the full test suite. Every failure in tests that cover code you modified or that depend on code you modified is a **blocking gate**. For pre-existing failures in unrelated areas, report them to the user but do not block on them. Never silently ignore or skip a failure without reporting it. On any blocking failure, stop and ask the user to choose one of:
- **Investigate and fix** the failing test or source code
- **Remove the test** if it is obsolete or no longer relevant
- **Iterative-skill exception**: when an iterative loop skill is active (e.g. autodev / `implement/SKILL.md` batch loop, `refactor/SKILL.md` batch loop), the skill governs full-suite cadence — typically focused tests per task/batch and a single full-suite gate at the very end of the implementation phase, NOT after each batch. "Done with changes" means done with the entire implementation phase the skill is running, not done with one batch. Do not run the full suite per batch unless the skill explicitly says to.
- Focus on the areas of code relevant to the task
- Do not touch code that is unrelated to the task
- Always think about what other methods and areas of code might be affected by the code changes
- When you think you are done with changes, run tests and make sure they are not broken
- Do not rename any databases or tables or table columns without confirmation. Avoid such renaming if possible.
- Do not create diagrams unless I ask explicitly
- Make sure we don't commit binaries, create and keep .gitignore up to date and delete binaries after you are done with the task
- Never force-push to main or dev branches
- For new projects, place source code under `src/` (this works for all stacks including .NET). For existing projects, follow the established directory structure. Keep project-level config, tests, and tooling at the repo root.
- **Never run e2e or CI tests in quiet mode (`-q`).** Always use `-v --tb=short` (or equivalent verbosity flags) in all Dockerfiles, compose files, and scripts that invoke pytest. Full test output must be visible so failures can be diagnosed without re-running. This applies to both Tier-1 (Colima) and Tier-2 (Jetson) harnesses.
- **Never substitute real algorithm execution with a data passthrough to make tests pass.** If a test is designed to validate output from a specific pipeline (e.g. VIO estimation, sensor fusion, inference), the implementation MUST actually run that pipeline — not bypass it by returning the input data directly as output. Tests that pass by skipping the component they are supposed to exercise create false confidence and hide the fact that the component is not integrated. If the real integration cannot be completed in this session, STOP and report the blocker to the user explicitly. A failing test with an honest explanation is always better than a passing test that proves nothing.
+1 -16
View File
@@ -19,22 +19,7 @@ globs: [".cursor/**"]
- Kebab-case filenames
## Agent Files (.cursor/agents/)
- The `.cursor/agents/` directory is intentionally empty. Per `.cursor/rules/no-subagents.mdc`, the main agent does not delegate to subagents in this workspace. Do not add agent files here without a corresponding rule change.
- Must have `name` and `description` in frontmatter
## Security
- All `.cursor/` files must be scanned for hidden Unicode before committing (see cursor-security.mdc)
## Quality Thresholds (canonical reference)
All rules and skills must reference the single source of truth below. Do NOT restate different numeric thresholds in individual rule or skill files.
| Concern | Threshold | Enforcement |
|---------|-----------|-------------|
| Test coverage on business logic | 75% | Aim (warn below); critical-path floor enforced separately (next row) |
| Test coverage on critical paths | 90% floor / 100% aim | **90% is the enforcement floor** in CI gates, refactor verification, and release pre-flight. **100% is the aim** — drift below 100% but at-or-above 90% is acceptable; drift below 90% blocks. Critical paths = code paths where a bug would cause data loss, security breach, financial error, or system outage; identify from `acceptance_criteria.md` (must-have) and `_docs/00_problem/security_approach.md`. |
| Test scenario coverage (vs AC + restrictions) | 75% | Blocking in test-spec Phase 1 and Phase 3 |
| CI coverage gate | 75% overall, 90% critical-path | Fail build below either threshold |
| Lint errors (Critical/High) | 0 | Blocking pre-commit |
| Code-review auto-fix | Low + Medium (Style/Maint/Perf) + High (Style/Scope) | Critical and Security always escalate. Full categorization: see `.cursor/skills/implement/SKILL.md` § "Auto-Fix eligibility matrix" |
When a skill or rule needs to cite a threshold, link to this table instead of hardcoding a different number. The full auto-fix eligibility matrix (severity × category) lives in `implement/SKILL.md`; cite that file rather than re-tabulating the matrix.
+1 -1
View File
@@ -5,7 +5,7 @@ globs: ["**/*.cs", "**/*.csproj", "**/*.sln"]
# .NET / C#
- PascalCase for classes, methods, properties, namespaces; camelCase for locals and parameters; prefix interfaces with `I`
- Use `async`/`await` for I/O-bound operations; the `Async` suffix on method names is optional — follow the project's existing convention
- Use `async`/`await` for I/O-bound operations, do not suffix async methods with Async
- Use dependency injection via constructor injection; register services in `Program.cs`
- Use linq2db for small projects, EF Core with migrations for big ones; avoid raw SQL unless performance-critical; prevent N+1 with `.Include()` or projection
- Use `Result<T, E>` pattern or custom error types over throwing exceptions for expected failures
+2 -5
View File
@@ -1,11 +1,8 @@
---
description: "Git workflow: work on dev branch, commit message format with tracker IDs"
description: "Git workflow: work on dev branch, commit message format with Jira IDs"
alwaysApply: true
---
# Git Workflow
- Work on the `dev` branch
- Commit message subject line format: `[TRACKER-ID-1] [TRACKER-ID-2] Summary of changes`
- Subject line must not exceed 72 characters (standard Git convention for the first line). The 72-char limit applies to the subject ONLY, not the full commit message.
- A commit message body is optional. Add one when the subject alone cannot convey the why of the change. Wrap the body at 72 chars per line.
- Do NOT push or merge unless the user explicitly asks you to. Always ask first if there is a need.
- Commit message format: `[JIRA-ID-1] [JIRA-ID-2] Summary of changes`
-46
View File
@@ -1,46 +0,0 @@
---
description: "Play a notification sound whenever the AI agent needs human input, confirmation, or approval"
alwaysApply: true
---
# Sound Notification on Human Input
## Sound commands per OS
Detect the OS from user system info or `uname -s`:
- **macOS**: `afplay /System/Library/Sounds/Glass.aiff &`
- **Linux**: `paplay /usr/share/sounds/freedesktop/stereo/bell.oga 2>/dev/null || aplay /usr/share/sounds/freedesktop/stereo/bell.oga 2>/dev/null || echo -e '\a' &`
- **Windows (PowerShell)**: `[System.Media.SystemSounds]::Exclamation.Play()`
## When to play (play exactly once per trigger)
Play the sound when your turn will end in one of these states:
1. You are about to call the AskQuestion tool — sound BEFORE the AskQuestion call
2. Your text ends with a direct question to the user that cannot be answered without their input (e.g., "Which option do you prefer?", "What is the database name?", "Confirm before I push?")
3. You are reporting that you are BLOCKED and cannot continue without user input (missing credentials, conflicting requirements, external approval required)
4. You have just completed a destructive or irreversible action the user asked to review (commit, push, deploy, data migration, file deletion)
## When NOT to play
- You are mid-execution and returning a progress update (the conversation is not stalling)
- You are answering a purely informational or factual question and no follow-up is required
- You have already played the sound once this turn for the same pause point
- Your response only contains text describing what you did or found, with no question, no block, no irreversible action
## "Trivial" definition
A response is trivial (no sound) when ALL of the following are true:
- No explicit question to the user
- No "I am blocked" report
- No destructive/irreversible action that needs review
If any one of those is present, the response is non-trivial — play the sound.
## Ordering
The sound command is a normal Shell tool call. Place it:
- **Immediately before an AskQuestion tool call** in the same message, or
- **As the last Shell call of the turn** if ending with a text-based question, block report, or post-destructive-action review
Do not play the sound as part of routine command execution — only at the pause points listed under "When to play".
-41
View File
@@ -1,41 +0,0 @@
---
description: "Use chunked writes (Write + StrReplace marker pattern) for large generated files, especially after a monolithic Write fails"
alwaysApply: true
---
# Large File Writes — Chunk on Failure
When a `Write` call to a single file fails (timeout, payload limit, "Invalid arguments", or any tool error) and the intended content is large (>~500 lines or >~50 KB), do NOT retry the same monolithic Write. Switch to chunked writes:
1. **First Write** — create the file with header + table of contents (if applicable) + an explicit append marker, e.g.
```
<!-- INSERTION_POINT do-not-remove-until-final-chunk -->
```
2. **Each subsequent chunk** — use `StrReplace` to replace the marker with `<new content>\n<marker>` so the marker stays at the end. This is idempotent: if a chunk fails, retry it without losing earlier chunks.
3. **Final chunk** — `StrReplace` removes the marker.
## Why
- Tool argument size limits and transient failures hit large monolithic writes hardest. Retrying the same large payload typically fails for the same reason.
- Chunked writes are recoverable per chunk. The earlier chunks are durable on disk.
- A unique marker is greppable, visible in diffs, and stops accidental insertion in the wrong place.
## Triggers
- Generated documentation that aggregates per-component content (epics, design docs, multi-section architecture summaries, traceability dumps).
- Large fixture or test-data files written from a template.
- Any single-file artifact you can pre-estimate at >~500 lines.
## Do NOT chunk
- Files under ~200 lines — a single `Write` is faster, clearer, and easier to review.
- Source code files where appending breaks module structure (functions, classes, imports). Split into multiple files instead.
- Files where ordering of sections is computed late and inserting in the middle is required — use a single `Write` once the full content is known.
## Anti-patterns
- Retrying the same failed monolithic `Write` more than once. Twice is the limit; on the second failure, switch strategies.
- Using `Shell` with heredoc (`cat <<EOF`) or `echo >>` to append — these bypass the editor diff view and break the StrReplace contract for the next chunk.
- Embedding the marker so deep inside structured content that a chunk's `StrReplace` becomes ambiguous. Place the marker on its own line at the very end of the file.
-92
View File
@@ -1,92 +0,0 @@
---
description: "Execution safety, user interaction, and self-improvement protocols for the AI agent"
alwaysApply: true
---
# Agent Meta Rules
## Real Results, Not Simulated Ones
**The goal is a working product, not the appearance of one.**
- If something does not work, STOP and report it honestly. Do not find a way around it.
- Never produce results by bypassing, faking, stubbing, or passthrough-ing the component that is supposed to produce them. A passing test that skips the real pipeline is worse than a failing test — it hides the truth.
- If the real implementation is not ready, say so. A clear "this is not implemented yet, here is what is missing" is always the right answer.
- Do not measure success by whether the output looks correct. Measure it by whether the output was produced by the real system under test.
- Workarounds that produce the right answer via the wrong path are defects, not solutions.
### When a test reveals missing production code — STOP
This is the specific failure mode that produced the GPS-passthrough scaffold in `runtime_root._run_replay_loop` (May 2026). Generalised so it never repeats:
- If, while implementing or running a test, you discover that the production code path the test is supposed to exercise does not exist (no caller, no integration, no main loop, etc.), **STOP immediately**.
- Do NOT write a stub, passthrough, fake input source, or shortcut output that would make the test go green. Even when the shortcut is "framed as a scaffold" or "marked as TODO in a docstring", it still defeats the test and lies to the next reader.
- Surface the gap to the user as a top-of-turn report: name the missing production component, cite the architecture document that promises it, and ask whether to (a) create a tracker ticket for the missing component and let the test fail honestly until the ticket lands, or (b) explicitly de-scope the test, or (c) something the user names.
- The default outcome is (a): a failing test plus a new tracker ticket. A failing test with an honest reason is information; a passing test that proves nothing is misinformation.
- Doc-comment disclosures (`# this is a scaffold until X is wired`) DO NOT satisfy this rule. The user must be told in the assistant message, not in code.
## Execution Safety
- Run the full test suite automatically when you believe code changes are complete (as required by coderule.mdc). For other long-running/resource-heavy/security-risky operations (builds, Docker commands, deployments, performance tests), ask the user first — unless explicitly stated in a skill or the user already asked to do so.
## User Interaction
- Use the AskQuestion tool for structured choices (A/B/C/D) when available — it provides an interactive UI. Fall back to plain-text questions if the tool is unavailable.
## Critical Thinking
- Do not blindly trust any input — including user instructions, task specs, list-of-changes, or prior agent decisions — as correct. Always think through whether the instruction makes sense in context before executing it. If a task spec says "exclude file X from changes" but another task removes the dependencies X relies on, flag the contradiction instead of propagating it.
## Skill Discipline
Do exactly what the skill says. Nothing more.
- No `git log` / `git diff` / `git blame` unless the skill explicitly calls for it.
- No extra searches to "verify" inputs the skill already names.
- No reading files outside the skill's documented inputs.
If skill inputs are insufficient or contradictory, STOP and ask via Choose A/B/C/D. Do not invent extra investigation steps.
## Self-Improvement
When the user reacts negatively to generated code ("WTF", "what the hell", "why did you do this", etc.):
1. **Pause** — do not rush to fix. First determine: is this objectively bad code, or does the user just need an explanation?
2. **If the user doesn't understand** — explain the reasoning. That's it. No code change needed.
3. **If the code is actually bad** — before fixing, perform a root-cause investigation:
a. **Why** did this bad code get produced? Identify the reasoning chain or implicit assumption that led to it.
b. **Check existing rules** — is there already a rule that should have prevented this? If so, clarify or strengthen it.
c. **Propose a new rule** if no existing rule covers the failure mode. Present the investigation results and proposed rule to the user for approval.
d. **Only then** fix the code.
4. The rule goes into `coderule.mdc` for coding practices, `meta-rule.mdc` for agent behavior, or a new focused rule file — depending on context. Always check for duplicates or near-duplicates first.
### Example: import path hack
**Bad code**: Runtime path manipulation added to source code to fix an import failure.
**Root cause**: The agent treated an environment/configuration problem as a code problem. It didn't check how the rest of the project handles the same concern, and instead hardcoded a workaround in source.
**Preventive rules added to coderule.mdc**:
- "Do not solve environment or infrastructure problems by hardcoding workarounds in source code. Fix them at the environment/configuration level."
- "Before writing new infrastructure or workaround code, check how the existing codebase already handles the same concern. Follow established project patterns."
## Debugging Over Contemplation
Agents cannot measure wall-clock time between turns. Use observable counts from your own transcript instead.
**Trigger: stop speculating and instrument.** When you've formed **3 or more distinct hypotheses** about a bug without confirming any against runtime evidence (logs, stderr, debugger state, actual test failure messages) — stop and add debugging output. Re-reading the same code hoping to "spot it this time" counts as a new hypothesis that still has zero evidence.
Steps:
1. Identify the last known-good boundary (e.g., "request enters handler") and the known-bad result (e.g., "callback never fires").
2. Add targeted `print(..., flush=True)`, `console.error`, or logger statements at each intermediate step to narrow the gap.
3. Run the instrumented code. Read the output. Let evidence drive the next hypothesis — not inference chains.
An instrumented run producing real output beats any amount of "could it be X? but then Y..." reasoning.
## Long Investigation Retrospective
Trigger a post-mortem when ANY of the following is true (all are observable in your own transcript):
- **10+ tool calls** were used to diagnose a single issue
- **Same file modified 3+ times** without tests going green
- **3+ distinct approaches** attempted before arriving at the fix
- Any phrase like "let me try X instead" appeared **more than twice**
- A fix was eventually found by reading docs/source the agent had dismissed earlier
Post-mortem steps:
1. **Identify the bottleneck**: wrong assumption? missing runtime visibility? incorrect mental model of a framework/language boundary? ignored evidence?
2. **Extract the general lesson**: what category of mistake was this? (e.g., "Python cannot call Cython `cdef` methods", "engine errors silently swallowed", "wrong layer to fix the problem")
3. **Propose a preventive rule**: short, actionable. Present to user for approval.
4. **Write it down**: add approved rule to the appropriate `.mdc` so it applies to future sessions.
-29
View File
@@ -1,29 +0,0 @@
---
description: "Forbid spawning subagents; the main agent must do the work directly"
alwaysApply: true
---
# No Subagents
Do NOT create or delegate to subagents. This includes:
- The `Task` tool with any `subagent_type` (e.g. `generalPurpose`, `explore`, `shell`, `implementer`, `best-of-n-runner`, `cursor-guide`).
- Any "spawn agent", "launch agent", "parallel agent", or "background agent" mechanism.
- Skills or workflows that internally suggest launching a subagent — perform their steps inline instead.
## Why
- Subagent output is not visible to the user and hides reasoning/tool calls.
- Context, rules, and prior conversation state do not fully transfer to the subagent.
- Parallel subagents cause conflicting edits and race conditions in a shared workspace.
- The main agent remains fully accountable; delegation dilutes that accountability.
## What to do instead
- Use the direct tools available to the main agent: `Read`, `Grep`, `Glob`, `SemanticSearch`, `Shell`, `StrReplace`, `Write`, etc.
- For broad exploration, run `Grep`/`Glob`/`SemanticSearch` yourself and read the files directly.
- For multi-step work, use `TodoWrite` to track progress inline.
- For isolated experiments the user explicitly asks for, use a git branch/worktree you manage directly — not a subagent runner.
## Exception
Only spawn a subagent if the user explicitly requests it in the current turn (e.g. "use a subagent to…", "launch an explore agent…"). Even then, confirm once before spawning.
+3 -7
View File
@@ -1,6 +1,6 @@
---
description: "Python coding conventions: PEP 8, type hints, pydantic, pytest, async patterns, project structure"
globs: ["**/*.py", "**/*.pyx", "**/*.pxd", "**/pyproject.toml", "**/requirements*.txt"]
globs: ["**/*.py", "**/pyproject.toml", "**/requirements*.txt"]
---
# Python
@@ -8,14 +8,10 @@ globs: ["**/*.py", "**/*.pyx", "**/*.pxd", "**/pyproject.toml", "**/requirements
- Use type hints on all function signatures; validate with `mypy` or `pyright`
- Use `pydantic` for data validation and serialization
- Import order: stdlib -> third-party -> local; use absolute imports
- Use `src/` layout to separate app code from project files
- Use context managers (`with`) for resource management
- Catch specific exceptions, never bare `except:`; use custom exception classes
- Use `async`/`await` with `asyncio` for I/O-bound concurrency
- Use `pytest` for testing (not `unittest`); fixtures for setup/teardown
- **NEVER install packages globally** (`pip install` / `pip3 install` without a venv). ALWAYS use a virtual environment (`venv`, `poetry`, or `conda env`). If no venv exists for the project, create one first (`python3 -m venv .venv && source .venv/bin/activate`) before installing anything. Pin dependencies.
- Use virtual environments (`venv` or `poetry`); pin dependencies
- Format with `black`; lint with `ruff` or `flake8`
## Cython
- In `cdef class` methods, prefer `cdef` over `cpdef` unless the method must be callable from Python. `cdef` = C-only (fastest), `cpdef` = C + Python, `def` = Python-only. Check all call sites before choosing.
- **Python cannot call `cdef` methods.** If a `.py` file needs to call a `cdef` method on a Cython object, there are exactly two options: (a) convert the calling file to `.pyx`, `cimport` the class, and use a typed parameter so Cython dispatches the call at the C level; or (b) change the method to `cpdef` if it genuinely needs to be callable from both Python and Cython. Never leave a bare `except Exception: pass` around such a call — it will silently swallow the `AttributeError` and make the failure invisible for a very long time.
- When converting a `.py` file to `.pyx` to gain access to `cdef` methods: add the new extension to `setup.py`, add a `cimport` of the relevant `.pxd`, type the parameter(s) that carry the Cython object, and delete the old `.py` file. This ensures the cross-language call is resolved at compile time, not at runtime.
+1 -1
View File
@@ -4,7 +4,7 @@ alwaysApply: true
---
# Quality Gates
- After any code edit that changes logic, adds/removes imports, or modifies function signatures, run `ReadLints` on modified files and fix introduced errors
- After substantive code edits, run `ReadLints` on modified files and fix introduced errors
- Before committing, run the project's formatter if one exists (black, rustfmt, prettier, dotnet format)
- Respect existing `.editorconfig`, `.prettierrc`, `pyproject.toml [tool.black]`, or `rustfmt.toml`
- Do not commit code with Critical or High severity lint errors
-46
View File
@@ -1,46 +0,0 @@
---
description: "Explanation length and reasoning depth calibration"
alwaysApply: true
---
# Response Calibration
Default to concise. Expand only when the content demands it.
## Length target
- **Default**: a direct answer in ~310 lines. Short paragraphs or a tight bullet list.
- **Expand when**: the question involves trade-offs across multiple options, a migration/architectural decision, a security/data-loss risk, or the user explicitly asks for depth ("explain in detail", "walk me through", "why").
- **Shrink when**: the user asks for "shorter", "simpler", "TL;DR", "one line", or similar. Do not re-inflate in later turns unless they ask a new deeper question.
## Completeness floor
Short ≠ incomplete. Every response must still:
- Answer the actual question asked (not a reframed version).
- State the key constraint or reason *once*, not repeatedly.
- Flag a real caveat if one exists (data loss, breaking change, wrong-OS, security). One sentence is enough.
- Not drop a step from an action sequence. If there are 5 steps, list 5 — but without narration between them.
If the honest answer truly needs more space (e.g. trade-off matrix, multi-option decision), write more — but lead with the recommendation or direct answer, then the detail.
## Structure
- One direct sentence first. Then supporting detail.
- Prefer bullets over prose for enumerations, comparisons, or step lists.
- Drop section headers for anything under ~15 lines.
- No "Summary" / "Conclusion" sections unless the response is genuinely long.
## Reasoning depth (internal)
- Match thinking to the problem, not the length of the answer.
- Factual / "where is X used" / single-file edit → minimal thinking, go straight to tools.
- Trade-off / refactor / debugging 3+ hypotheses deep → full thinking budget.
- Do not pad thinking to look thorough. Do not skip thinking on genuinely ambiguous problems to look fast.
## Anti-patterns to avoid
- Restating the question back to the user.
- Multi-paragraph preambles before the answer.
- Exhaustive "alternatives considered" sections when the user didn't ask for alternatives.
- Recapping what was just done at the end of every tool-using turn ("Done. I have edited the file…") — a one-line confirmation is enough.
- Speculative "you might also want to…" paragraphs. Offer follow-ups as a single short sentence, or not at all.
-38
View File
@@ -1,38 +0,0 @@
---
description: "Standards for creating and maintaining Cursor skills"
globs: [".cursor/skills/**"]
---
# Skill Building
## When To Create A Skill
- Create a skill for repeatable, bounded workflows that benefit from a reusable process.
- Do not create a skill for a one-off task, vague goal, or workflow that still needs product decisions.
- Start small; evolve the skill when repeated use reveals clearer steps, constraints, or checks.
## Skill Contract
- `SKILL.md` must define a clear `name` and a proactive `description` that explains when the skill should be used.
- State expected inputs, constraints, workflow steps, and final output shape.
- Make trigger conditions explicit enough that the agent can recognize intent without an exact command.
- Base instructions on observable project evidence; do not invite fabrication or unsupported assumptions.
## Keep The Core Lean
- Keep `SKILL.md` concise and under the repo's `.cursor/` size guidance.
- Move detailed standards, examples, and background knowledge into `references/`.
- Put reusable output shapes in `templates/` or other skill-local assets instead of embedding them in the main instructions.
- Keep one primary responsibility per skill; use an orchestrator skill only when multiple existing skills must run in a defined order.
## Deterministic Work
- Use scripts for mechanical steps that are repeatable, parameterized, and safer outside the model's reasoning.
- Scripts must expose explicit inputs, avoid hidden side effects, and fail loudly on errors.
- Do not use scripts to bypass review, hide destructive behavior, or hardcode secrets.
## Quality Proof
- Include realistic examples, checklists, or eval-style scenarios that define what good output looks like.
- Cover common failure cases such as missing sections, leftover placeholders, hallucinated facts, unsafe actions, or malformed output.
- Review skill changes against those checks before treating the skill as ready.
## Security Review
- Treat third-party skills like untrusted code until reviewed.
- Inspect scripts, dependencies, references, secret handling, network calls, and destructive commands before use.
- Prefer local, project-scoped assets and dependencies; document any external dependency the skill requires.
+1 -1
View File
@@ -4,6 +4,6 @@ alwaysApply: true
---
# Tech Stack
- Prefer Postgres database, but ask user
- For new backend projects: use .NET for structured enterprise/API services, Python for data/ML/scripting tasks, Rust for performance-critical components. For existing projects, use the language already established in that project.
- Depending on task, for backend prefer .Net or Python. Rust for performance-critical things.
- For the frontend, use React with Tailwind css (or even plain css, if it is a simple project)
- document api with OpenAPI
+2 -10
View File
@@ -4,20 +4,12 @@ globs: ["**/*test*", "**/*spec*", "**/*Test*", "**/tests/**", "**/test/**"]
---
# Testing
- Structure every test with Arrange / Act / Assert section comments using language-appropriate syntax (`# Arrange` for Python, `// Arrange` for C#/Rust/JS/TS)
- Structure every test with `//Arrange`, `//Act`, `//Assert` comments
- One assertion per test when practical; name tests descriptively: `MethodName_Scenario_ExpectedResult`
- Test boundary conditions, error paths, and happy paths
- Use mocks only for external dependencies; prefer real implementations for internal code
- Aim for 75%+ coverage on business logic; **90% floor / 100% aim on critical paths** (code paths where a bug would cause data loss, security breaches, financial errors, or system outages — identify from acceptance criteria marked as must-have or from `security_approach.md`). 90% is the enforcement floor (blocking in CI / refactor verification / release pre-flight); 100% is the aspirational aim — drift below 100% but at-or-above 90% is acceptable. Both numbers are canonical — see `cursor-meta.mdc` Quality Thresholds.
- Aim for 80%+ coverage on business logic; 100% on critical paths
- Integration tests use real database (Postgres testcontainers or dedicated test DB)
- Never use Thread Sleep or fixed delays in tests; use polling or async waits
- Keep test data factories/builders for reusable test setup
- Tests must be independent: no shared mutable state between tests
## Test environment (this project)
- **Unit tests** (`tests/unit/`): may run locally on the dev workstation (`pytest tests/unit/` in the project venv). Local PASS is equivalent to Jetson PASS for this tier because the suite is fully synthetic.
- **Blackbox / e2e / performance / resilience / security / resource-limit** tests (`tests/e2e/`, `e2e/tests/`, `tests/perf/`, …): MUST run on the Jetson Orin Nano Super (or a Jetson-equivalent arm64 agent). Use `scripts/run-tests-jetson.sh` for local dev; CI runs `.woodpecker/01-test.yml` on the colocated arm64 Jetson Woodpecker agent.
- Do NOT run e2e tests on the local workstation and report the result. If the Jetson is unreachable, the e2e verdict is "not run" — record the gap in `_docs/_process_leftovers/` rather than substituting a local result.
- Tests gated by `RUN_REPLAY_E2E` or `@pytest.mark.tier2` are expected to SKIP locally; that is correct behaviour, not a failure to investigate.
- Canonical source for this policy: `_docs/02_document/tests/environment.md` § Where each tier runs (active policy).
-56
View File
@@ -1,56 +0,0 @@
---
alwaysApply: true
---
# Work Item Tracker
- Use **Jira** as the sole work item tracker (MCP server: `user-Jira-MCP-Server`)
- **NEVER** use Azure DevOps (ADO) MCP for any purpose — no reads, no writes, no queries
- Before interacting with any tracker, read this rule file first
- Jira cloud ID: `denyspopov.atlassian.net`
- Project key: `AZ`
- Project name: AZAION
- All task IDs follow the format `AZ-<number>`
- Issue types: Epic, Story, Task, Bug, Subtask
## Tracker Availability Gate
- If Jira MCP returns **Unauthorized**, **errored**, **connection refused**, **timeout**, a non-2xx status code, an empty body, or any response shape that does not clearly confirm the requested change: **STOP IMMEDIATELY** — no automatic retry, no silent continuation. Surface the full raw error/response to the user verbatim and notify via the Choose A/B/C/D format documented in `.cursor/skills/autodev/protocols.md`.
- A minimal `{"success": true}` body with no echoed issue state is NOT a confirmed transition. When a transition's success matters (status moves, ticket creation, blocking link), follow it with a read-back call (`getJiraIssue` or equivalent) and confirm the new state matches what you asked for. If the read-back disagrees → STOP and ASK.
- Do NOT loop "retry up to N times before asking". One call, one verification. On failure, the user decides whether to retry.
- The user may choose to:
- **Retry the same operation** — once, after the user authorizes it. If it fails again, surface both responses.
- **Retry authentication** — preferred when the failure looks like an auth/credentials problem; the tracker remains the source of truth.
- **Continue in `tracker: local` mode** — only when the user explicitly accepts this option. In that mode all tasks keep numeric prefixes and a `Tracker: pending` marker is written into each task header. The state file records `tracker: local`. The mode is NOT silent — the user has been asked and has acknowledged the trade-off.
- Do NOT auto-fall-back to `tracker: local` without a user decision. Do not pretend a write succeeded. Do not paper over an opaque response by moving on. If the user is unreachable (e.g., non-interactive run), stop and wait.
- When the tracker becomes available again, any `Tracker: pending` tasks should be synced — this is done at the start of the next `/autodev` invocation via the Leftovers Mechanism below.
## Leftovers Mechanism (non-user-input blockers only)
When a **non-user** blocker prevents a tracker write (MCP down, network error, transient failure, ticket linkage recoverable later), record the deferred write in `_docs/_process_leftovers/<YYYY-MM-DD>_<topic>.md` and continue non-tracker work. Each entry must include:
- Timestamp (ISO 8601)
- What was blocked (ticket creation, status transition, comment, link)
- Full payload that would have been written (summary, description, story points, epic, target status) — so the write can be replayed later
- Reason for the blockage (MCP unavailable, auth expired, unknown epic ID pending user clarification, etc.)
### Hard gates that CANNOT be deferred to leftovers
Anything requiring user input MUST still block:
- Clarifications about requirements, scope, or priority
- Approval for destructive actions or irreversible changes
- Choice between alternatives (A/B/C decisions)
- Confirmation of assumptions that change task outcome
If a blocker of this kind appears, STOP and ASK — do not write to leftovers.
### Replay obligation
At the start of every `/autodev` invocation, and before any new tracker write in any skill, check `_docs/_process_leftovers/` for pending entries. For each entry:
1. Attempt to replay the deferred write against the tracker
2. If replay succeeds → delete the leftover entry
3. If replay still fails → update the entry's timestamp and reason, continue
4. If the blocker now requires user input (e.g., MCP still down after N retries) → surface to the user
Autodev must not progress past its own step 0 until all leftovers that CAN be replayed have been replayed.
-7
View File
@@ -1,7 +0,0 @@
# Workspace Boundary
- Only modify files within the current repository (workspace root).
- Never write, edit, or delete files in sibling repositories or parent directories outside the workspace.
- When a task requires changes in another repository (e.g., admin API, flights, UI), **document** the required changes in the task's implementation notes or a dedicated cross-repo doc — do not implement them.
- The mock API at `e2e/mocks/mock_api/` may be updated to reflect the expected contract of external services, but this is a test mock — not the real implementation.
- If a task is entirely scoped to another repository, mark it as out-of-scope for this workspace and note the target repository.
-145
View File
@@ -1,145 +0,0 @@
---
name: autodev
description: |
Auto-chaining orchestrator that drives the full BUILD → SHIP → EVOLVE workflow from problem gathering through release and retrospective.
Detects current project state from _docs/ folder, resumes from where it left off, and flows through
problem → research → plan (incl. ADRs) → test specs → decompose → implement → tests → docs sync → deploy → release → retrospective without manual skill invocation.
Maximizes work per conversation by auto-transitioning between skills.
Trigger phrases:
- "autodev", "auto", "start", "continue"
- "what's next", "where am I", "project status"
category: meta
tags: [orchestrator, workflow, auto-chain, state-machine, meta-skill]
disable-model-invocation: true
---
# Autodev Orchestrator
Auto-chaining execution engine that drives the full BUILD → SHIP → EVOLVE workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autodev` once — the engine handles sequencing, transitions, and re-entry.
## File Index
| File | Purpose |
|------|---------|
| `flows/greenfield.md` | Detection rules, step table, and auto-chain rules for new projects |
| `flows/existing-code.md` | Detection rules, step table, and auto-chain rules for existing codebases |
| `flows/meta-repo.md` | Detection rules, step table, and auto-chain rules for meta-repositories (submodule aggregators, workspace monorepos) |
| `state.md` | State file format, rules, re-entry protocol, session boundaries |
| `protocols.md` | User interaction, tracker auth, choice format, error handling, status summary |
**On every invocation**: read `state.md`, `protocols.md`, and the active flow file before executing any logic. You don't need to read flow files for flows you're not in.
## Core Principles
- **Auto-chain**: when a skill completes, immediately start the next one — no pause between skills
- **Only pause at decision points**: BLOCKING gates inside sub-skills are the natural pause points; do not add artificial stops between steps
- **State from disk**: current step is persisted to `_docs/_autodev_state.md` and cross-checked against `_docs/` folder structure
- **Re-entry**: on every invocation, read the state file and cross-check against `_docs/` folders before continuing
- **Delegate, don't duplicate**: read and execute each sub-skill's SKILL.md; never inline their logic here
- **Sound on pause**: follow `.cursor/rules/human-attention-sound.mdc` — play a notification sound before every pause that requires human input (AskQuestion tool preferred for structured choices; fall back to plain text if unavailable)
- **Minimize interruptions**: only ask the user when the decision genuinely cannot be resolved automatically
- **Single project per workspace**: all `_docs/` paths are relative to workspace root; for multi-component systems, each component needs its own Cursor workspace. **Exception**: a meta-repo workspace (git-submodule aggregator or monorepo workspace) uses the `meta-repo` flow and maintains cross-cutting artifacts via `monorepo-*` skills rather than per-component BUILD-SHIP flows.
## Flow Resolution
Determine which flow to use (check in order — first match wins):
1. If `_docs/_autodev_state.md` exists → read the `flow` field and use that flow. (When a greenfield project completes its final cycle, the Done step rewrites `flow: existing-code` in-band so the next invocation enters the feature-cycle loop — see greenfield "Done".)
2. If the workspace is a **meta-repo****meta-repo flow**. Detected by: presence of `.gitmodules` with ≥2 submodules, OR `package.json` with `workspaces` field, OR `pnpm-workspace.yaml`, OR `Cargo.toml` with `[workspace]` section, OR `go.work`, OR an ad-hoc structure with multiple top-level component folders each containing their own project manifests. Optional tiebreaker: the workspace has little or no source code of its own at the root (just registry + orchestration files).
3. If workspace has **no source code files****greenfield flow**
4. If workspace has source code files **and** `_docs/` does not exist → **existing-code flow**
5. If workspace has source code files **and** `_docs/` exists → **existing-code flow**
After selecting the flow, apply its detection rules (first match wins) to determine the current step.
**Note**: the meta-repo flow uses a different artifact layout — its source of truth is `_docs/_repo-config.yaml`, not `_docs/NN_*/` folders. After Step 2.5 it also produces `_docs/glossary.md` and a `## Architecture Vision` section in the cross-cutting architecture doc identified by `docs.cross_cutting`. Other detection rules assume the BUILD-SHIP artifact layout; they don't apply to meta-repos.
## Execution Loop
Every invocation has three phases: **Bootstrap** (runs once), **Resolve** (runs once), **Execute Loop** (runs per step). Exit conditions are explicit.
```
### Bootstrap (once per invocation)
B1. Process leftovers — delegate to `.cursor/rules/tracker.mdc` → Leftovers Mechanism
(authoritative spec: replay rules, escalation, blocker handling).
B2. Surface Recent Lessons — print top 3 entries from `_docs/LESSONS.md` if present; skip silently otherwise.
B3. Read state — `_docs/_autodev_state.md` (if it exists).
B4. Read File Index — `state.md`, `protocols.md`, and the active flow file.
### Resolve (once per invocation, after Bootstrap)
R1. Reconcile state — verify state file against `_docs/` contents; probe `<workspace-root>/../docs`
(parent suite `docs/` — see `state.md` → "State File Rules" #4); on disagreement,
trust the folders and update the state file (rules: `state.md` → "State File Rules" #4).
After this step, `state.step` / `state.status` are authoritative.
R2. Resolve flow — see §Flow Resolution above.
R3. Resolve current step — when a state file exists, `state.step` drives detection.
When no state file exists, walk the active flow's detection rules in order;
first folder-probe match wins.
R4. Present Status Summary — banner template in `protocols.md` + step-list fragment from the active flow file.
### Execute Loop (per step)
loop:
E1. Delegate to the current skill (see §Skill Delegation below).
E2. On FAILED
→ apply Failure Handling (`protocols.md`): increment retry_count, auto-retry up to 3.
→ if retry_count reaches 3 → set status: failed → EXIT (escalate on next invocation).
E3. On success
→ reset retry_count, update state file (rules: `state.md`).
E4. Re-detect next step from the active flow's detection rules.
E5. If the transition is marked as a session boundary in the flow's Auto-Chain Rules
→ update state, present boundary Choose block, suggest new conversation → EXIT.
E6. If all steps done
→ update state, report completion → EXIT.
E7. Else
→ continue loop (go to E1 with the next skill).
```
## Skill Delegation
For each step, the delegation pattern is:
1. Update state file: set `step` to the autodev step number, status to `in_progress`, set `sub_step` to the sub-skill's current internal phase using the structured `{phase, name, detail}` schema (see `state.md`), reset `retry_count: 0`
2. Announce: "Starting [Skill Name]..."
3. Read the skill file: `.cursor/skills/[name]/SKILL.md`
4. Execute the skill's workflow exactly as written, including all BLOCKING gates, self-verification checklists, save actions, and escalation rules. Update `sub_step.phase`, `sub_step.name`, and optional `sub_step.detail` in state each time the sub-skill advances to a new internal phase.
5. If the skill **fails**: follow Failure Handling in `protocols.md` — increment `retry_count`, auto-retry up to 3 times, then escalate.
6. When complete (success): reset `retry_count: 0`, update state file to the next step with `status: not_started` and `sub_step: {phase: 0, name: awaiting-invocation, detail: ""}`, return to auto-chain rules (from active flow file)
**sub_step read fallback**: when reading `sub_step`, parse the structured form. If parsing fails (legacy free-text value) OR the named phase is not recognized, log a warning and fall back to a folder scan of the sub-skill's artifact directory to infer progress. Do not silently treat a malformed sub_step as phase 0 — that would cause a sub-skill to restart from scratch after each resume.
Do NOT modify, skip, or abbreviate any part of the sub-skill's workflow. The autodev is a sequencer, not an optimizer.
## State File
The state file (`_docs/_autodev_state.md`) is a minimal pointer — only the current step. See `state.md` for the authoritative template, field semantics, update rules, and worked examples. Do not restate the schema here — `state.md` is the single source of truth.
**Conciseness rule (authoritative).** The state file MUST stay short. Acceptable content per field:
- `name` — the step title from the active flow's Step Reference Table. That's it.
- `sub_step.name` — kebab-case identifier from the active sub-skill. That's it.
- `sub_step.detail`**leave empty (`""`) by default.** Add a one-line note ONLY when the next-session resumer cannot infer where to pick up from `phase` + `name` + on-disk artifacts alone (e.g. `"batch 2 of 4"`, `"blocked on D-PROJ-2 reply"`, `"variant 1b"`). NEVER use `detail` as a changelog, recap, or summary of completed work — those facts belong in the relevant `_docs/` artifact (glossary, traceability matrix, leftovers folder, retro report, etc.) and in git history.
- **Total file size target: <30 lines.** If you're tempted to write more, you're using the wrong artifact — write in `_docs/` instead.
Multi-line `detail` blobs that recap what was just completed are a smell. The state file is a *pointer*, not a logbook.
## Trigger Conditions
This skill activates when the user wants to:
- Start a new project from scratch
- Continue an in-progress project
- Check project status
- Let the AI guide them through the full workflow
**Keywords**: "autodev", "auto", "start", "continue", "what's next", "where am I", "project status"
**Invocation model**: this skill is explicitly user-invoked only (`disable-model-invocation: true` in the front matter). The keywords above aid skill discovery and tooling (other skills / agents can reason about when `/autodev` is appropriate), but the model never auto-fires this skill from a keyword match. The user always types `/autodev`.
**Differentiation**:
- User wants only research → use `/research` directly
- User wants only planning → use `/plan` directly
- User wants to document an existing codebase → use `/document` directly
- User wants the full guided workflow → use `/autodev`
## Flow Reference
See `flows/greenfield.md`, `flows/existing-code.md`, and `flows/meta-repo.md` for step tables, detection rules, auto-chain rules, and each flow's Status Summary step-list fragment. The banner that wraps those fragments lives in `protocols.md` → "Banner Template (authoritative)".
@@ -1,440 +0,0 @@
# Existing Code Workflow
Workflow for projects with an existing codebase. Structurally it has **two phases**:
- **Phase A — One-time baseline setup (Steps 18)**: runs exactly once per codebase. Documents the code, produces test specs, makes the code testable, writes and runs the initial test suite, optionally refactors with that safety net.
- **Phase B — Feature cycle (Steps 917, loops)**: runs once per new feature. After Step 17 (Retrospective), the flow loops back to Step 9 (New Task) with `state.cycle` incremented. Step 16.5 (Release) sits between Deploy (16) and Retrospective (17).
A first-time run executes Phase A then Phase B; every subsequent invocation re-enters Phase B.
## Step Reference Table
### Phase A — One-time baseline setup
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 1 | Document | document/SKILL.md | Steps 07 incl. inline 2.5 (module-layout) and 4.5 (glossary + arch vision) |
| 2 | Architecture Baseline Scan | code-review/SKILL.md (baseline mode) | Phase 1 + Phase 7 |
| 3 | Test Spec | test-spec/SKILL.md | Phases 14 |
| 4 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 07 (conditional) |
| 5 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
| 6 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 7 | Run Tests | test-run/SKILL.md | Steps 14 |
| 8 | Refactor | refactor/SKILL.md | Phases 07 (optional) |
### Phase B — Feature cycle (loops back to Step 9 after Step 17)
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 9 | New Task | new-task/SKILL.md | Steps 18 (loop) |
| 10 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 11 | Run Tests | test-run/SKILL.md | Steps 14 |
| 12 | Test-Spec Sync | test-spec/SKILL.md (cycle-update mode) | Phase 2 + Phase 3 (scoped) |
| 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 05 |
| 14 | Security Audit | security/SKILL.md | Phase 15 (optional) |
| 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 15 (optional) |
| 16 | Deploy | deploy/SKILL.md | Step 17 |
| 16.5 | Release | release/SKILL.md | Phase 16 |
| 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 14 |
After Step 17, the feature cycle completes and the flow loops back to Step 9 with `state.cycle + 1` — see "Re-Entry After Completion" below.
## Detection Rules
**Resolution**: when a state file exists, `state.step` + `state.status` drive detection and the conditions below are not consulted. When no state file exists (cold start), walk the rules in order — first folder-probe match wins. Steps without a folder probe are state-driven only; they can only be reached by auto-chain from a prior step. Cycle-scoped steps (Step 10 onward) always read `state.cycle` to disambiguate current vs. prior cycle artifacts.
---
### Phase A — One-time baseline setup (Steps 18)
**Step 1 — Document**
Condition: `_docs/` does not exist AND the workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`, `src/`, `Cargo.toml`, `*.csproj`, `package.json`)
Action: An existing codebase without documentation was detected. Read and execute `.cursor/skills/document/SKILL.md`. After the document skill completes, re-detect state (the produced `_docs/` artifacts will place the project at Step 2 or later).
The document skill's Step 2.5 produces `_docs/02_document/module-layout.md`, which is required by every downstream step that assigns file ownership (`/implement` Step 4, `/code-review` Phase 7, `/refactor` discovery). If this file is missing after Step 1 completes (e.g., a pre-existing `_docs/` dir predates the 2.5 addition), re-invoke `/document` in resume mode — it will pick up at Step 2.5.
The document skill's Step 4.5 produces `_docs/02_document/glossary.md` and prepends a confirmed `## Architecture Vision` section to `architecture.md`. Both are user-confirmed artifacts; downstream skills (refactor, decompose, new-task) treat them as authoritative for terminology and structural intent. If `glossary.md` is missing after Step 1 (pre-existing `_docs/` dir from before the 4.5 addition), re-invoke `/document` in resume mode — it will pick up at Step 4.5 without redoing module/component analysis.
---
**Step 2 — Architecture Baseline Scan**
Condition: `_docs/02_document/FINAL_report.md` exists AND `_docs/02_document/architecture.md` exists AND `_docs/02_document/architecture_compliance_baseline.md` does not exist.
Action: Invoke `.cursor/skills/code-review/SKILL.md` in **baseline mode** (Phase 1 + Phase 7 only) against the full existing codebase. Phase 7 produces a structural map of the code vs. the just-documented `architecture.md`. Save the output to `_docs/02_document/architecture_compliance_baseline.md`.
Rationale: existing codebases often have pre-existing architecture violations (cycles, cross-component private imports, duplicate logic). Catching them here, before the Testability Revision (Step 4), gives the user a chance to fold structural fixes into the refactor scope.
After completion, if the baseline report contains **High or Critical** Architecture findings:
- Append them to the testability `list-of-changes.md` input in Step 4 (so testability refactor can address the most disruptive ones along with testability fixes), OR
- Surface them to the user via Choose format to defer to Step 8 (optional Refactor).
If the baseline report is clean (no High/Critical findings), auto-chain directly to Step 3.
---
**Step 3 — Test Spec**
Condition (folder fallback): `_docs/02_document/FINAL_report.md` exists AND workspace contains source code files AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
State-driven: reached by auto-chain from Step 2.
Action: Read and execute `.cursor/skills/test-spec/SKILL.md`
This step applies when the codebase was documented via the `/document` skill. Test specifications must be produced before refactoring or further development.
---
**Step 4 — Code Testability Revision**
Condition (folder fallback): `_docs/02_document/tests/traceability-matrix.md` exists AND no test tasks exist yet in `_docs/02_tasks/todo/`.
State-driven: reached by auto-chain from Step 3.
**Purpose**: enable tests to run at all. Without this step, hardcoded URLs, file paths, credentials, or global singletons can prevent the test suite from exercising the code against a controlled environment. The test authors need a testable surface before they can write tests that mean anything.
**Scope — MINIMAL, SURGICAL fixes**: this is not a profound refactor. It is the smallest set of changes (sometimes temporary hacks) required to make code runnable under tests. "Smallest" beats "elegant" here — deeper structural improvements belong in Step 8 (Refactor), not this step.
**Allowed changes** in this phase:
- Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
- Extract narrow interfaces for components that need stubbing in tests.
- Add optional constructor parameters for dependency injection; default to the existing hardcoded behavior so callers do not break.
- Wrap global singletons in thin accessors that tests can override (thread-local / context var / setter gate).
- Split a huge function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
**NOT allowed** in this phase (defer to Step 8 Refactor):
- Renaming public APIs (breaks consumers without a safety net).
- Moving code between files unless strictly required for isolation.
- Changing algorithms or business logic.
- Restructuring module boundaries or rewriting layers.
**Safety**: Phase 3 (Safety Net) of the refactor skill is skipped here **by design** — no tests exist yet to form the safety net. Compensating controls:
- Every change is bounded by the allowed/not-allowed lists above.
- `list-of-changes.md` must be reviewed by the user BEFORE execution (refactor skill enforces this gate).
- After execution, the refactor skill produces `RUN_DIR/testability_changes_summary.md` — a plain-language list of every applied change and why. Present this to the user before auto-chaining to Step 5.
Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`.
2. Read `_docs/02_document/architecture_compliance_baseline.md` (produced in Step 2). If it contains High/Critical Architecture findings that overlap with testability issues, consider including the lightest structural fixes inline; leave the rest for Step 8.
3. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
- Hardcoded file paths or directory references
- Hardcoded configuration values (URLs, credentials, magic numbers)
- Global mutable state that cannot be overridden
- Tight coupling to external services without abstraction
- Missing dependency injection or non-configurable parameters
- Direct file system operations without path configurability
- Inline construction of heavy dependencies (models, clients)
4. If ALL scenarios are testable as-is:
- Mark Step 4 as `completed` with outcome "Code is testable — no changes needed"
- Auto-chain to Step 5 (Decompose Tests)
5. If testability issues are found:
- Create `_docs/04_refactoring/01-testability-refactoring/`
- Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
- **Mode**: `guided`
- **Source**: `autodev-testability-analysis`
- One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred to Step 8 Refactor" instead.
- Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
- The refactor skill will create RUN_DIR (`01-testability-refactoring`), create tasks in `_docs/02_tasks/todo/`, delegate to implement skill, and verify results
- Phase 3 (Safety Net) is automatically skipped by the refactor skill for testability runs
- After execution, the refactor skill produces `RUN_DIR/testability_changes_summary.md`. Surface this summary to the user via the Choose format (accept / request follow-up) before auto-chaining.
- Mark Step 4 as `completed`
- Auto-chain to Step 5 (Decompose Tests)
---
**Step 5 — Decompose Tests**
Condition (folder fallback): `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND (`_docs/02_tasks/todo/` does not exist or has no test task files).
State-driven: reached by auto-chain from Step 4 (completed or skipped).
Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
1. Run Step 1t (test infrastructure bootstrap)
2. Run Step 3 (blackbox test task decomposition)
3. Run Step 4 (cross-verification against test coverage)
If `_docs/02_tasks/` subfolders have some task files already (e.g., refactoring tasks from Step 4), the decompose skill's resumability handles it — it appends test tasks alongside existing tasks.
---
**Step 6 — Implement Tests**
Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
State-driven: reached by auto-chain from Step 5.
Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.
The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
---
**Step 7 — Run Tests**
Condition (folder fallback): `_docs/03_implementation/implementation_report_tests.md` exists.
State-driven: reached by auto-chain from Step 6.
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
Verifies the implemented test suite passes before proceeding to refactoring. The tests form the safety net for all subsequent code changes.
---
**Step 8 — Refactor (optional)**
State-driven: reached by auto-chain from Step 7. (Sanity check: no `_docs/04_refactoring/` run folder should contain a `FINAL_report.md` for a non-testability run when entering this step for the first time.)
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Refactor codebase before adding new features?
══════════════════════════════════════
A) Run refactoring (recommended if code quality issues were noted during documentation)
B) Skip — proceed directly to New Task
══════════════════════════════════════
Recommendation: [A or B — base on whether documentation
flagged significant code smells, coupling issues, or
technical debt worth addressing before new development]
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/refactor/SKILL.md` in automatic mode. The refactor skill creates a new run folder in `_docs/04_refactoring/` (e.g., `02-coupling-refactoring`), runs the full method using the implemented tests as a safety net. After completion, auto-chain to Step 9 (New Task).
- If user picks B → Mark Step 8 as `skipped` in the state file, auto-chain to Step 9 (New Task).
---
### Phase B — Feature cycle (Steps 917, loops)
**Step 9 — New Task**
State-driven: reached by auto-chain from Step 8 (completed or skipped). This is also the re-entry point after a completed cycle — see "Re-Entry After Completion" below.
Action: Read and execute `.cursor/skills/new-task/SKILL.md`
The new-task skill interactively guides the user through defining new functionality. It loops until the user is done adding tasks. New task files are written to `_docs/02_tasks/todo/`.
---
**Step 10 — Implement**
State-driven: reached by auto-chain from Step 9 in the CURRENT cycle (matching `state.cycle`). Detection is purely state-driven — prior cycles will have left `implementation_report_{feature_slug}_cycle{N-1}.md` artifacts that must not block new cycles.
Action: Read and execute `.cursor/skills/implement/SKILL.md`
The implement skill reads the new tasks from `_docs/02_tasks/todo/` and implements them. Tasks already implemented in Step 6 or prior cycles are skipped (completed tasks have been moved to `done/`).
**Implementation report naming**: the final report for this cycle must be named `implementation_report_{feature_slug}_cycle{N}.md` where `{N}` is `state.cycle`. Batch reports are named `batch_{NN}_cycle{M}_report.md` so the cycle counter survives folder scans.
If `_docs/03_implementation/` has batch reports from the current cycle, the implement skill detects completed tasks and continues.
---
**Step 11 — Run Tests**
State-driven: reached by auto-chain from Step 10.
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
---
**Step 12 — Test-Spec Sync**
State-driven: reached by auto-chain from Step 11. Requires `_docs/02_document/tests/traceability-matrix.md` to exist — if missing, mark Step 12 `skipped` (see Action below).
Action: Read and execute `.cursor/skills/test-spec/SKILL.md` in **cycle-update mode**. Pass the cycle's completed task specs (files in `_docs/02_tasks/done/` moved during this cycle) and the implementation report `_docs/03_implementation/implementation_report_{feature_slug}_cycle{N}.md` as inputs.
The skill appends new ACs, scenarios, and NFRs to the existing test-spec files without rewriting unaffected sections. If `traceability-matrix.md` is missing (e.g., cycle added after a greenfield-only project), mark Step 12 as `skipped` — the next `/test-spec` full run will regenerate it.
After completion, auto-chain to Step 13 (Update Docs).
---
**Step 13 — Update Docs**
State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires `_docs/02_document/` to contain existing documentation — if missing, mark Step 13 `skipped` (see Action below).
Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all task spec files from `_docs/02_tasks/done/` that were implemented in the current cycle (i.e., tasks moved to `done/` during Steps 910 of this cycle).
The document skill in Task mode:
1. Reads each task spec to identify changed source files
2. Updates affected module docs, component docs, and system-level docs
3. Does NOT redo full discovery, verification, or problem extraction
If `_docs/02_document/` does not contain existing docs (e.g., documentation step was skipped), mark Step 13 as `skipped`.
After completion, auto-chain to Step 14 (Security Audit).
---
**Step 14 — Security Audit (optional)**
State-driven: reached by auto-chain from Step 13 (completed or skipped).
Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
- question: `Run security audit before deploy?`
- option-a-label: `Run security audit (recommended for production deployments)`
- option-b-label: `Skip — proceed directly to deploy`
- recommendation: `A — catches vulnerabilities before production`
- target-skill: `.cursor/skills/security/SKILL.md`
- next-step: Step 15 (Performance Test)
---
**Step 15 — Performance Test (optional)**
State-driven: reached by auto-chain from Step 14 (completed or skipped).
Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
- question: `Run performance/load tests before deploy?`
- option-a-label: `Run performance tests (recommended for latency-sensitive or high-load systems)`
- option-b-label: `Skip — proceed directly to deploy`
- recommendation: `A or B — base on whether acceptance criteria include latency, throughput, or load requirements`
- target-skill: `.cursor/skills/test-run/SKILL.md` in **perf mode** (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
- next-step: Step 16 (Deploy)
---
**Step 16 — Deploy**
State-driven: reached by auto-chain from Step 15 (completed or skipped).
Action: Read and execute `.cursor/skills/deploy/SKILL.md`.
After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 16.5 (Release).
---
**Step 16.5 — Release**
State-driven: reached by auto-chain from Step 16, for the current `state.cycle`.
Action: Read and execute `.cursor/skills/release/SKILL.md`. The release skill owns its own user interaction (Phase 1 pre-release gate, Phase 2 strategy select, Phase 6 escalation). Autodev does NOT add a wrapping A/B/C gate. Pass cycle context (`cycle: state.cycle`).
After the release skill exits, route on the verdict:
- **Verdict `Released`** → mark Step 16.5 `completed` and auto-chain to Step 17 (Retrospective in cycle-end mode).
- **Verdict `Released-with-override`** → mark Step 16.5 `completed` AND auto-chain to Step 17 (Retrospective in **incident mode**).
- **Verdict `Rolled-Back`** → mark Step 16.5 `failed`. Auto-chain to Step 17 (Retrospective in **incident mode**). The cycle does NOT loop back to Step 9.
- **Verdict `Aborted`** → mark Step 16.5 `not_started` (no live-system change) OR `failed` (live-system touched before abort). Surface the abort reason and STOP. Next `/autodev` invocation re-evaluates Phase B from the failed step.
---
**Step 17 — Retrospective**
State-driven: reached by auto-chain from Step 16.5 with a `Released`, `Released-with-override`, or `Rolled-Back` verdict, for the current `state.cycle`.
Action: Read and execute `.cursor/skills/retrospective/SKILL.md`. Mode selection:
- Step 16.5 verdict `Released` → cycle-end mode
- Step 16.5 verdict `Released-with-override` or `Rolled-Back` → incident mode
Pass cycle context (`cycle: state.cycle`) so the retro report and LESSONS.md entries record which feature cycle they came from.
After retrospective completes:
- If Step 16.5 verdict was `Released` or `Released-with-override` → mark Step 17 as `completed` and enter "Re-Entry After Completion" evaluation (loop back to Step 9 for cycle N+1).
- If Step 16.5 verdict was `Rolled-Back` → mark Step 17 as `completed` but do NOT loop back. Surface the incident retro path and STOP.
---
**Re-Entry After Completion**
State-driven: `state.step == done` OR Step 17 (Retrospective) is completed for `state.cycle` AND Step 16.5 verdict was `Released` or `Released-with-override`. A `Rolled-Back` cycle does NOT trigger Re-Entry — the user must explicitly invoke `/autodev` again.
Action: The project completed a full cycle. Print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:
```
══════════════════════════════════════
PROJECT CYCLE COMPLETE
══════════════════════════════════════
The previous cycle finished successfully.
Starting new feature cycle…
══════════════════════════════════════
```
Set `step: 9`, `status: not_started`, and **increment `cycle`** (`cycle: state.cycle + 1`) in the state file, then auto-chain to Step 9 (New Task). Reset `sub_step` to `phase: 0, name: awaiting-invocation, detail: ""` and `retry_count: 0`.
Note: the loop (Steps 9 → 17 → 9) ensures every feature cycle includes: New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security → Performance → Deploy → Release → Retrospective. The cycle only completes (and loops back to Step 9) on a `Released` or `Released-with-override` verdict; rolled-back or aborted releases stop the cycle.
## Auto-Chain Rules
### Phase A — One-time baseline setup
| Completed Step | Next Action |
|---------------|-------------|
| Document (1) | Auto-chain → Architecture Baseline Scan (2) |
| Architecture Baseline Scan (2) | Auto-chain → Test Spec (3). If baseline has High/Critical Architecture findings, surface them as inputs to Step 4 (testability) or defer to Step 8 (refactor). |
| Test Spec (3) | Auto-chain → Code Testability Revision (4) |
| Code Testability Revision (4) | Auto-chain → Decompose Tests (5) |
| Decompose Tests (5) | **Session boundary** — suggest new conversation before Implement Tests |
| Implement Tests (6) | Auto-chain → Run Tests (7) |
| Run Tests (7, all pass) | Auto-chain → Refactor choice (8) |
| Refactor (8, done or skipped) | Auto-chain → New Task (9) — enters Phase B |
### Phase B — Feature cycle (loops)
| Completed Step | Next Action |
|---------------|-------------|
| New Task (9) | **Session boundary** — suggest new conversation before Implement |
| Implement (10) | Auto-chain → Run Tests (11) |
| Run Tests (11, all pass) | Auto-chain → Test-Spec Sync (12) |
| Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
| Update Docs (13) | Auto-chain → Security Audit choice (14) |
| Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
| Deploy (16) | Auto-chain → Release (16.5) |
| Release (16.5, verdict Released) | Auto-chain → Retrospective (17, cycle-end mode) |
| Release (16.5, verdict Released-with-override) | Auto-chain → Retrospective (17, **incident mode**) |
| Release (16.5, verdict Rolled-Back) | Auto-chain → Retrospective (17, **incident mode**); cycle does NOT loop back |
| Release (16.5, verdict Aborted) | STOP — surface abort reason; do not auto-chain |
| Retrospective (17, after Released / Released-with-override) | **Cycle complete** — loop back to New Task (9) with incremented cycle counter |
| Retrospective (17, after Rolled-Back) | Cycle remains incomplete — STOP and surface incident retro path |
## Status Summary — Step List
Flow name: `existing-code`. Render using the banner template in `protocols.md` → "Banner Template (authoritative)".
Flow-specific slot values:
- `<header-suffix>`: ` — Cycle <N>` when `state.cycle > 1`; otherwise empty.
- `<current-suffix>`: ` (cycle <N>)` when `state.cycle > 1`; otherwise empty.
- `<footer-extras>`: empty.
**Phase A — One-time baseline setup**
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|-----------------------------|--------------------------------------------|
| 1 | Document | — |
| 2 | Architecture Baseline | — |
| 3 | Test Spec | — |
| 4 | Code Testability Revision | — |
| 5 | Decompose Tests | `DONE (N tasks)` |
| 6 | Implement Tests | `IN PROGRESS (batch M)` |
| 7 | Run Tests | `DONE (N passed, M failed)` |
| 8 | Refactor | `IN PROGRESS (phase N)` |
**Phase B — Feature cycle (loops)**
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|-----------------------------|--------------------------------------------|
| 9 | New Task | `DONE (N tasks)` |
| 10 | Implement | `IN PROGRESS (batch M of ~N)` |
| 11 | Run Tests | `DONE (N passed, M failed)` |
| 12 | Test-Spec Sync | — |
| 13 | Update Docs | — |
| 14 | Security Audit | — |
| 15 | Performance Test | — |
| 16 | Deploy | — |
| 16.5 | Release | `DONE (Released | Released-with-override | Rolled-Back | Aborted)` |
| 17 | Retrospective | — |
All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2, 4, 8, 12, 13, 14, 15 additionally accept `SKIPPED`.
Row rendering format (renders with a phase separator between Step 8 and Step 9):
```
── Phase A: One-time baseline setup ──
Step 1 Document [<state token>]
Step 2 Architecture Baseline [<state token>]
Step 3 Test Spec [<state token>]
Step 4 Code Testability Rev. [<state token>]
Step 5 Decompose Tests [<state token>]
Step 6 Implement Tests [<state token>]
Step 7 Run Tests [<state token>]
Step 8 Refactor [<state token>]
── Phase B: Feature cycle (loops) ──
Step 9 New Task [<state token>]
Step 10 Implement [<state token>]
Step 11 Run Tests [<state token>]
Step 12 Test-Spec Sync [<state token>]
Step 13 Update Docs [<state token>]
Step 14 Security Audit [<state token>]
Step 15 Performance Test [<state token>]
Step 16 Deploy [<state token>]
Step 16.5 Release [<state token>]
Step 17 Retrospective [<state token>]
```
-417
View File
@@ -1,417 +0,0 @@
# Greenfield Workflow
Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement + Product Completeness Gate → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy → Release → Retrospective.
## Step Reference Table
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 1 | Problem | problem/SKILL.md | Phase 14 |
| 2 | Research | research/SKILL.md | Mode A: Phase 14 · Mode B: Step 08 |
| 3 | Plan | plan/SKILL.md | Step 1, 2, 3, 4, 4.5 (ADR Capture), 5, 6 + Final |
| 4 | UI Design | ui-design/SKILL.md | Phase 08 (conditional — UI projects only) |
| 5 | Test Spec | test-spec/SKILL.md | Phases 14 |
| 6 | Decompose | decompose/SKILL.md (implementation task decomposition) | Step 1 + Step 1.5 + Step 2 + Step 4 |
| 7 | Implement | implement/SKILL.md | Batch loop + Product Implementation Completeness Gate |
| 8 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 07 (conditional) |
| 9 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
| 10 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 11 | Run Tests | test-run/SKILL.md | Steps 14 |
| 12 | Test-Spec Sync | test-spec/SKILL.md (cycle-update mode) | Phase 2 + Phase 3 (scoped) |
| 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 05 |
| 14 | Security Audit | security/SKILL.md | Phase 15 (optional) |
| 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 15 (optional) |
| 16 | Deploy | deploy/SKILL.md | Step 17 |
| 16.5 | Release | release/SKILL.md | Phase 16 |
| 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 14 |
## Detection Rules
**Resolution**: when a state file exists, `state.step` + `state.status` drive detection and the conditions below are not consulted. When no state file exists (cold start), walk the rules in order — first folder-probe match wins. Steps without a folder probe are state-driven only; they can only be reached by auto-chain from a prior step.
---
**Step 1 — Problem Gathering**
Condition: `_docs/00_problem/` does not exist, OR any of these are missing/empty:
- `problem.md`
- `restrictions.md`
- `acceptance_criteria.md`
- `input_data/` (must contain at least one file)
Action: Read and execute `.cursor/skills/problem/SKILL.md`
---
**Step 2 — Research (Initial)**
Condition: `_docs/00_problem/` is complete AND `_docs/01_solution/` has no `solution_draft*.md` files
Action: Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode A)
---
**Research Decision** (inline gate between Step 2 and Step 3)
Condition: `_docs/01_solution/` contains `solution_draft*.md` files AND `_docs/01_solution/solution.md` does not exist AND `_docs/02_document/architecture.md` does not exist
Action: Present the current research state to the user:
- How many solution drafts exist
- Whether tech_stack.md and security_analysis.md exist
- One-line summary from the latest draft
Then present using the **Choose format**:
```
══════════════════════════════════════
DECISION REQUIRED: Research complete — next action?
══════════════════════════════════════
A) Run another research round (Mode B assessment)
B) Proceed to planning with current draft
══════════════════════════════════════
Recommendation: [A or B] — [reason based on draft quality]
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode B)
- If user picks B → auto-chain to Step 3 (Plan)
---
**Step 3 — Plan**
Condition: `_docs/01_solution/` has `solution_draft*.md` files AND `_docs/02_document/architecture.md` does not exist
Action:
1. The plan skill's Prereq 2 will rename the latest draft to `solution.md` — this is handled by the plan skill itself
2. Read and execute `.cursor/skills/plan/SKILL.md`
If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FINAL_report.md`), the plan skill's built-in resumability handles it.
---
**Step 4 — UI Design (conditional)**
Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
State-driven: reached by auto-chain from Step 3.
Action: Read and execute `.cursor/skills/ui-design/SKILL.md`. The skill runs its own **Applicability Check**, which handles UI project detection and the user's A/B choice. It returns one of:
- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Test Spec).
- `outcome: skipped, reason: not-a-ui-project` → mark Step 4 as `skipped`, auto-chain to Step 5.
- `outcome: skipped, reason: user-declined` → mark Step 4 as `skipped`, auto-chain to Step 5.
The autodev no longer inlines UI detection heuristics — they live in `ui-design/SKILL.md` under "Applicability Check".
---
**Step 5 — Test Spec**
Condition (folder fallback): `_docs/02_document/FINAL_report.md` exists AND `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
State-driven: reached by auto-chain from Step 4 (completed or skipped).
Action: Read and execute `.cursor/skills/test-spec/SKILL.md`.
This step converts the greenfield problem statement, acceptance criteria, solution, architecture, component docs, and UI design artifacts (if any) into test specifications before implementation begins. The test spec should cover unit, integration, blackbox, and e2e scenarios where those levels are applicable to the project.
---
**Step 6 — Decompose**
Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_document/tests/traceability-matrix.md` exists AND `_docs/02_tasks/todo/` does not exist or has no implementation task files.
Action: Invoke `.cursor/skills/decompose/SKILL.md` for **implementation task decomposition**. The greenfield flow selects the implementation entrypoint before handing off: Bootstrap Structure, Module Layout, Component Task Decomposition, and Cross-Task Verification.
Do not invoke Blackbox Test Task Decomposition from Step 6. Test tasks are intentionally deferred to Step 9 (Decompose Tests) so the first implementation batch stays focused on product functionality and Step 8 can revise testability before test task files exist.
If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it.
---
**Step 7 — Implement**
Condition: `_docs/02_tasks/todo/` contains implementation task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain a valid product implementation report.
Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Product implementation**.
The implement skill must run its **Product Implementation Completeness Gate** before it writes any final product implementation report. This gate compares completed product task specs, architecture/component promises, and actual source code so scaffold-only implementations cannot advance to Step 8. A final product implementation report without `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` is incomplete and must not be treated as Step 7 completion.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.
For folder fallback, **implementation task files** means task specs that are not test-only specs: exclude `*_test_infrastructure.md` and task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
For folder fallback, a **product implementation report** is any `_docs/03_implementation/implementation_report_*.md` file except `_docs/03_implementation/implementation_report_tests.md` and refactor reports. It is valid for greenfield progression only when:
- the matching `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists,
- that completeness report does not contain unresolved `FAIL` classifications, and
- `_docs/02_tasks/todo/` contains no pending implementation task files.
If a product report exists but any of those validity checks fail, treat product implementation as incomplete and stay in Step 7.
---
**Step 8 — Code Testability Revision**
Condition (folder fallback): `_docs/03_implementation/` contains a valid product implementation report, `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications, `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` does not exist, `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` does not exist, `_docs/03_implementation/implementation_report_tests.md` does not exist, and `_docs/02_tasks/todo/` does not contain test task files.
State-driven: reached by auto-chain from Step 7.
**Purpose**: verify the newly built code can be exercised by the planned tests before writing the test suite. Greenfield code should be testable by design; this step catches accidental hardcoded paths, singletons, direct external service construction, or other implementation choices that would make meaningful tests impossible.
**Scope — MINIMAL, SURGICAL fixes**: this is not a general refactor. It is the smallest set of changes required to make the implemented code runnable under tests.
**Allowed changes** in this phase:
- Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
- Extract narrow interfaces for components that need stubbing in tests.
- Add optional constructor parameters for dependency injection; default to the existing behavior so callers do not break.
- Wrap global singletons in thin accessors that tests can override.
- Split a function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
**NOT allowed** in this phase (defer to a later refactor task):
- Renaming public APIs.
- Moving code between files unless strictly required for isolation.
- Changing algorithms or business logic.
- Restructuring module boundaries or rewriting layers.
Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`.
2. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
- Hardcoded file paths or directory references
- Hardcoded configuration values (URLs, credentials, magic numbers)
- Global mutable state that cannot be overridden
- Tight coupling to external services without abstraction
- Missing dependency injection or non-configurable parameters
- Direct file system operations without path configurability
- Inline construction of heavy dependencies (models, clients)
3. If ALL scenarios are testable as-is:
- Create `_docs/04_refactoring/01-testability-refactoring/`
- Write `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` with the scenarios reviewed and outcome "Code is testable — no changes needed"
- Mark Step 8 as `completed` with outcome "Code is testable — no changes needed"
- Auto-chain to Step 9 (Decompose Tests)
4. If testability issues are found:
- Create `_docs/04_refactoring/01-testability-refactoring/`
- Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
- **Mode**: `guided`
- **Source**: `autodev-greenfield-testability-analysis`
- One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred refactor candidates" instead.
- Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
- Phase 3 (Safety Net) is skipped for this testability run because the test suite has not been implemented yet
- After execution, surface `RUN_DIR/testability_changes_summary.md` to the user via the Choose format (accept / request follow-up) before auto-chaining
- Copy or save the accepted summary as `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` so folder fallback can detect Step 8 completion
- Mark Step 8 as `completed`
- Auto-chain to Step 9 (Decompose Tests)
---
**Step 9 — Decompose Tests**
Condition (folder fallback): `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND `_docs/03_implementation/` contains a valid product implementation report AND `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications AND (`_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` exists OR `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` exists) AND (`_docs/02_tasks/todo/` does not exist or has no test task files) AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
State-driven: reached by auto-chain from Step 8.
Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
1. Run Step 1t (test infrastructure bootstrap)
2. Run Step 3 (blackbox/e2e-capable test task decomposition)
3. Run Step 4 (cross-verification against test coverage)
If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it — it appends test tasks alongside existing completed implementation tasks.
---
**Step 10 — Implement Tests**
Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
State-driven: reached by auto-chain from Step 9.
Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.
The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed test tasks and continues.
For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
---
**Step 11 — Run Tests**
Condition (folder fallback): `_docs/03_implementation/implementation_report_tests.md` exists.
State-driven: reached by auto-chain from Step 10.
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
Verifies the implemented unit, integration, blackbox, and e2e tests pass before proceeding to spec and documentation sync. This is a hard product gate, not a harness-smoke gate: e2e/blackbox tests must exercise the actual implemented system through public runtime boundaries and compare actual outputs against `_docs/00_problem/input_data/expected_results/results_report.md` or referenced machine-readable expected-result files. Stubs are allowed only for external systems outside the product boundary; missing internal product implementation must fail or block the gate and send the flow back to Implement.
---
**Step 12 — Test-Spec Sync**
State-driven: reached by auto-chain from Step 11. Requires `_docs/02_document/tests/traceability-matrix.md` to exist — if missing, mark Step 12 `skipped` (see Action below).
Action: Read and execute `.cursor/skills/test-spec/SKILL.md` in **cycle-update mode**. Pass the completed implementation task specs, completed test task specs, and implementation reports as inputs.
The skill appends implementation-learned acceptance criteria, scenarios, and NFR updates to the existing test-spec files without rewriting unaffected sections. If `traceability-matrix.md` is missing, mark Step 12 as `skipped` — the next `/test-spec` full run will regenerate it.
After completion, auto-chain to Step 13 (Update Docs).
---
**Step 13 — Update Docs**
State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires `_docs/02_document/` to contain existing documentation — if missing, mark Step 13 `skipped` (see Action below).
Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all completed implementation and test task spec files plus the implementation reports.
The document skill in Task mode updates affected module docs, component docs, system-level docs, and test documentation without redoing full discovery, verification, or problem extraction.
If `_docs/02_document/` does not contain existing docs, mark Step 13 as `skipped`.
After completion, auto-chain to Step 14 (Security Audit).
---
**Step 14 — Security Audit (optional)**
State-driven: reached by auto-chain from Step 13 (completed or skipped).
Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
- question: `Run security audit before deploy?`
- option-a-label: `Run security audit (recommended for production deployments)`
- option-b-label: `Skip — proceed directly to deploy`
- recommendation: `A — catches vulnerabilities before production`
- target-skill: `.cursor/skills/security/SKILL.md`
- next-step: Step 15 (Performance Test)
---
**Step 15 — Performance Test (optional)**
State-driven: reached by auto-chain from Step 14 (completed or skipped).
Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
- question: `Run performance/load tests before deploy?`
- option-a-label: `Run performance tests (recommended for latency-sensitive or high-load systems)`
- option-b-label: `Skip — proceed directly to deploy`
- recommendation: `A or B — base on whether acceptance criteria include latency, throughput, or load requirements`
- target-skill: `.cursor/skills/test-run/SKILL.md` in **perf mode** (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
- next-step: Step 16 (Deploy)
---
**Step 16 — Deploy**
State-driven: reached by auto-chain from Step 15 (after Step 15 is completed or skipped).
Action: Read and execute `.cursor/skills/deploy/SKILL.md`.
After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 16.5 (Release).
---
**Step 16.5 — Release**
State-driven: reached by auto-chain from Step 16.
Action: Read and execute `.cursor/skills/release/SKILL.md`. The release skill is responsible for selecting the target environment, executing the deploy artifacts, smoke-testing, watching the rollout, and producing a definitive verdict (`Released`, `Released-with-override`, `Rolled-Back`, or `Aborted`).
The release skill has its own internal BLOCKING gates (Phase 1 pre-release gate, Phase 2 strategy select, Phase 6 user confirmation when soft regression escalates). Autodev does NOT add a wrapping A/B/C gate — the release skill owns its own user interaction.
After the release skill exits:
- **Verdict `Released`** → mark Step 16.5 `completed` and auto-chain to Step 17 (Retrospective in cycle-end mode).
- **Verdict `Released-with-override`** → mark Step 16.5 `completed` AND auto-chain to Step 17 (Retrospective in **incident mode**) — the override is itself an incident the retrospective must analyze.
- **Verdict `Rolled-Back`** → mark Step 16.5 `failed`. Auto-chain to Step 17 (Retrospective in **incident mode**). Do NOT consider the project "Done" — the user owns the next move (re-run /implement on a fix branch, re-run /deploy, re-run /release).
- **Verdict `Aborted`** → mark Step 16.5 `not_started` (the release was never started) OR `failed` if the abort came after Phase 3 had already touched the live system. Surface the abort reason and STOP — do not auto-chain to retrospective.
---
**Step 17 — Retrospective**
State-driven: reached by auto-chain from Step 16.5 with a `Released` or `Released-with-override` verdict, OR from a `Rolled-Back` verdict (in incident mode).
Action: Read and execute `.cursor/skills/retrospective/SKILL.md`. Mode selection:
- Step 16.5 verdict `Released` → cycle-end mode
- Step 16.5 verdict `Released-with-override` or `Rolled-Back` → incident mode
The retrospective closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` (or `incident_<date>_release.md` in incident mode) and appending the top-3 lessons to `_docs/LESSONS.md`.
After retrospective completes, mark Step 17 as `completed` and enter "Done" evaluation.
---
**Done**
State-driven: reached by auto-chain from Step 17. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md. `_docs/04_release/` should contain at least one `release_<version>_<env>_<timestamp>.md` with a `Released` verdict — or the user has explicitly chosen to handle release outside autodev.)
Action: Report project completion with summary. Then **rewrite the state file** so the next `/autodev` invocation enters the feature-cycle loop in the existing-code flow:
```
flow: existing-code
step: 9
name: New Task
status: not_started
sub_step:
phase: 0
name: awaiting-invocation
detail: ""
retry_count: 0
cycle: 1
```
On the next invocation, Flow Resolution rule 1 reads `flow: existing-code` and re-entry flows directly into existing-code Step 9 (New Task).
## Auto-Chain Rules
| Completed Step | Next Action |
|---------------|-------------|
| Problem (1) | Auto-chain → Research (2) |
| Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
| Research Decision → proceed | Auto-chain → Plan (3) |
| Plan (3) | Auto-chain → UI Design detection (4) |
| UI Design (4, done or skipped) | Auto-chain → Test Spec (5) |
| Test Spec (5) | Auto-chain → Decompose (6) |
| Decompose (6) | **Session boundary** — suggest new conversation before Implement |
| Implement (7) | Auto-chain only after Product Implementation Completeness Gate passes → Code Testability Revision (8) |
| Code Testability Revision (8) | Auto-chain → Decompose Tests (9) |
| Decompose Tests (9) | **Session boundary** — suggest new conversation before Implement Tests |
| Implement Tests (10) | Auto-chain → Run Tests (11) |
| Run Tests (11, all pass) | Auto-chain → Test-Spec Sync (12) |
| Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
| Update Docs (13, done or skipped) | Auto-chain → Security Audit choice (14) |
| Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
| Deploy (16) | Auto-chain → Release (16.5) |
| Release (16.5, verdict Released) | Auto-chain → Retrospective (17, cycle-end mode) |
| Release (16.5, verdict Released-with-override) | Auto-chain → Retrospective (17, **incident mode**) |
| Release (16.5, verdict Rolled-Back) | Auto-chain → Retrospective (17, **incident mode**); do NOT enter Done |
| Release (16.5, verdict Aborted) | STOP — surface abort reason; do not auto-chain |
| Retrospective (17) | Report completion; rewrite state to existing-code flow, step 9 |
## Status Summary — Step List
Flow name: `greenfield`. Render using the banner template in `protocols.md` → "Banner Template (authoritative)". No header-suffix, current-suffix, or footer-extras — all empty for this flow.
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|-----------------------------|--------------------------------------------|
| 1 | Problem | — |
| 2 | Research | `DONE (N drafts)` |
| 3 | Plan | — |
| 4 | UI Design | — |
| 5 | Test Spec | — |
| 6 | Decompose | `DONE (N tasks)` |
| 7 | Implement | `IN PROGRESS (batch M of ~N)` |
| 8 | Code Testability Revision | — |
| 9 | Decompose Tests | `DONE (N tasks)` |
| 10 | Implement Tests | `IN PROGRESS (batch M)` |
| 11 | Run Tests | `DONE (N passed, M failed)` |
| 12 | Test-Spec Sync | — |
| 13 | Update Docs | — |
| 14 | Security Audit | — |
| 15 | Performance Test | — |
| 16 | Deploy | — |
| 16.5 | Release | `DONE (Released | Released-with-override | Rolled-Back | Aborted)` |
| 17 | Retrospective | — |
All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 12, 13, 14, 15 additionally accept `SKIPPED`.
Row rendering format (step-number column is right-padded to 2 characters for alignment):
```
Step 1 Problem [<state token>]
Step 2 Research [<state token>]
Step 3 Plan [<state token>]
Step 4 UI Design [<state token>]
Step 5 Test Spec [<state token>]
Step 6 Decompose [<state token>]
Step 7 Implement [<state token>]
Step 8 Code Testability Rev. [<state token>]
Step 9 Decompose Tests [<state token>]
Step 10 Implement Tests [<state token>]
Step 11 Run Tests [<state token>]
Step 12 Test-Spec Sync [<state token>]
Step 13 Update Docs [<state token>]
Step 14 Security Audit [<state token>]
Step 15 Performance Test [<state token>]
Step 16 Deploy [<state token>]
Step 16.5 Release [<state token>]
Step 17 Retrospective [<state token>]
```
-489
View File
@@ -1,489 +0,0 @@
# Meta-Repo Workflow
Workflow for **meta-repositories** — repos that aggregate multiple components via git submodules, npm/cargo/pnpm/go workspaces, or ad-hoc conventions. The meta-repo itself has little or no source code of its own; it orchestrates cross-cutting documentation, CI/CD, and component registration.
This flow differs fundamentally from `greenfield` and `existing-code`:
- **No problem/research/plan phases** — meta-repos don't build features, they coordinate existing ones
- **No test spec / run tests** — the meta-repo has no code to test
- **`implement` is scoped to suite-level work only** — cross-repo concerns, repo/folder renames, suite-root infra additions (e.g., `.gitmodules`, `_infra/`, suite `e2e/`). Per-component implementation lives in each component's own workspace `/autodev` cycle. The meta-repo's implement step (Step 3.5) executes only when `_docs/tasks/todo/` is non-empty AND the user explicitly opts in; placement is **before** the sync skills so subsequent Doc/E2E/CICD sync propagates the post-implementation state.
- **No `_docs/00_problem/` artifacts** — documentation target is `_docs/*.md` unified docs, not per-feature `_docs/NN_feature/` folders
- **Primary artifact is `_docs/_repo-config.yaml`** — generated by `monorepo-discover`, read by every other step
## Step Reference Table
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 1 | Discover | monorepo-discover/SKILL.md | Phase 110 |
| 2 | Config Review | (human checkpoint, no sub-skill) | — |
| 2.5 | Glossary & Architecture Vision | (inline, no sub-skill) | Steps 15 |
| 3 | Status | monorepo-status/SKILL.md | Sections 15 |
| 3.5 | Suite Implement | implement/SKILL.md (suite-level invocation context) | Steps 114 + 16 (Step 14.5 + Step 15 skipped); conditional on `_docs/tasks/todo/` non-empty AND user opt-in |
| 4 | Document Sync | monorepo-document/SKILL.md | Phase 17 (conditional on doc drift) |
| 4.5 | Integration Test Sync | monorepo-e2e/SKILL.md | Phase 16 (conditional on suite-e2e drift; skipped if `suite_e2e:` block absent in config) |
| 5 | CICD Sync | monorepo-cicd/SKILL.md | Phase 17 (conditional on CI drift) |
| 6 | Loop | (auto-return to Step 3 on next invocation) | — |
**Onboarding is NOT in the auto-chain.** Onboarding a new component is always user-initiated (`monorepo-onboard` directly, or answering "yes" to the optional onboard branch at end of Step 5). The autodev does NOT silently onboard components it discovers.
## Detection Rules
**Resolution**: when a state file exists, `state.step` + `state.status` drive detection and the conditions below are not consulted. When no state file exists (cold start), walk the rules in order — first match wins. Meta-repo uses `_docs/_repo-config.yaml` (and its `confirmed_by_user` flag) as its primary folder-probe signal rather than per-step artifact folders.
---
**Step 1 — Discover**
Condition: `_docs/_repo-config.yaml` does NOT exist
Action: Read and execute `.cursor/skills/monorepo-discover/SKILL.md`. After completion, auto-chain to **Step 2 (Config Review)**.
---
**Step 2 — Config Review** (session boundary)
Condition: `_docs/_repo-config.yaml` exists AND top-level `confirmed_by_user: false`
Action: This is a **hard session boundary**. The skill cannot proceed until a human reviews the generated config and sets `confirmed_by_user: true`. Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Config review pending
══════════════════════════════════════
_docs/_repo-config.yaml was generated by monorepo-discover
but has confirmed_by_user: false.
A) I've reviewed — proceed to Status
B) Pause — I'll review the config and come back later
══════════════════════════════════════
Recommendation: B — review the inferred mappings (tagged
`confirmed: false`), unresolved questions, and assumptions
before flipping confirmed_by_user: true.
══════════════════════════════════════
```
- If user picks A → verify `confirmed_by_user: true` is now set in the config. If still `false`, re-ask. If true, auto-chain to **Step 2.5 (Glossary & Architecture Vision)**.
- If user picks B → mark Step 2 as `in_progress`, update state file, end the session. Tell the user to invoke `/autodev` again after reviewing.
**Do NOT auto-flip `confirmed_by_user`.** Only the human does that.
---
**Step 2.5 — Glossary & Architecture Vision** (one-shot)
Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true` AND (`_docs/glossary.md` does NOT exist OR the cross-cutting architecture doc identified in `docs.cross_cutting` does NOT contain a `## Architecture Vision` section).
State-driven: reached by auto-chain from Step 2 (user picked A).
**Goal**: Capture meta-repo-wide terminology and the user's architecture vision **once**, after the config is confirmed but before any sync skill runs. Without this, `monorepo-document` will faithfully propagate per-component changes but never surface a unified mental model of the meta-repo to the user, and the AI will keep re-inferring the same project terminology on every invocation.
**Why inline (no sub-skill)**: `monorepo-discover` is hard-guarded to write only `_repo-config.yaml`; `monorepo-document` only edits *existing* docs. Glossary and architecture-vision creation is a first-time, user-confirmed write that crosses both guarantees, so it lives directly in the flow.
**Inputs**:
- `_docs/_repo-config.yaml` (component list, doc map, conventions, assumptions log)
- Cross-cutting docs listed under `docs.cross_cutting` (existing architecture doc, if any)
- Each component's `primary_doc` (read-only, for terminology + responsibility extraction)
- Root `README.md` if `repo.root_readme` is referenced
**Outputs**:
- `_docs/glossary.md` (or `<docs.root>/glossary.md` if `docs.root``_docs/`) — NEW
- The cross-cutting architecture doc updated in place: a `## Architecture Vision` section is prepended (or merged into an existing "Vision" / "Overview" heading)
- One new entry appended to `_docs/_repo-config.yaml` under `assumptions_log:` recording the run
- A new top-level config entry: `glossary_doc: <path>` so future `monorepo-status` and `monorepo-document` runs treat the glossary as a known cross-cutting doc
**Procedure**:
1. **Draft glossary** from `_repo-config.yaml` + each component's primary doc. Include:
- Component codenames as they appear in the config (`name` field) and any rename pairs the user noted in `unresolved:` resolutions
- Domain terms that recur across ≥2 component docs
- Acronyms / abbreviations
- Convention names from `conventions:` (e.g., commit prefix, deployment tier names)
- Stakeholder personas if cross-cutting docs reference them
Each entry: one-line definition + source (`source: components.<name>.primary_doc` or `source: _repo-config.yaml conventions`). Skip generic terms.
2. **Draft architecture vision** from the meta-repo perspective:
- **One paragraph**: what the system as a whole is, what each component contributes, the runtime topology (one binary / N services / N clients + 1 server / hybrid), how components communicate (REST / gRPC / queue / DB-shared / file-shared)
- **Components & responsibilities** (one-line each), pulled directly from `_repo-config.yaml` `components:` list
- **Cross-cutting concerns ownership**: which doc owns which concern (auth, schema, deployment, etc.) — pulled from `docs.cross_cutting[].owns`
- **Architectural principles / non-negotiables** the user has implied across components (e.g., "all components share a single Postgres", "submodules own their own CI", "deployment is per-tier, not per-component")
- **Open questions / structural drift signals**: components missing from `docs.cross_cutting`, components in registry but not in config (registry mismatch), or contradictions between component primary docs
3. **Present condensed view** to the user (NOT the full draft files):
```
══════════════════════════════════════
REVIEW: Meta-Repo Glossary + Architecture Vision
══════════════════════════════════════
Glossary (N terms drafted from config + component docs):
- <Term>: <one-line definition>
- ...
Architecture Vision — meta-repo level:
<one-paragraph synopsis>
Components / responsibilities:
- <component>: <one-line>
- ...
Cross-cutting ownership:
- <concern> → <doc>
- ...
Principles / non-negotiables:
- <principle>
- ...
Open questions / drift signals:
- <q1>
- <q2>
══════════════════════════════════════
A) Looks correct — write the files
B) Add / correct entries (provide diffs)
C) Resolve open questions / drift signals first
══════════════════════════════════════
Recommendation: pick C if drift signals exist;
otherwise B if components or principles
don't match your intent; A only when
the inferred vision is exactly right.
══════════════════════════════════════
```
4. **Iterate**:
- On B → integrate the user's diffs/additions, re-present, loop until A.
- On C → ask the listed open questions in one batch, integrate answers, re-present.
- **Do NOT proceed to step 5 until the user picks A.**
5. **Save**:
- Write `_docs/glossary.md` (alphabetical) with `**Status**: confirmed-by-user` + date.
- Update the cross-cutting architecture doc identified in `docs.cross_cutting` (or create one at `_docs/00_architecture.md` if none exists and the user's option-B input named one): prepend `## Architecture Vision` with the confirmed paragraph + components + ownership + principles. Preserve every existing H2 below verbatim.
- Append to `_docs/_repo-config.yaml`:
- Top-level `glossary_doc: <path-relative-to-repo-root>` (sibling of `docs.root`)
- New `assumptions_log:` entry: `{ date: <today>, skill: autodev-meta-repo Step 2.5, run_notes: "Captured glossary + architecture vision", assumptions: [...] }`
- Do NOT flip any `confirmed: false` → `confirmed: true` in the config; this step writes its own confirmed artifact, it does not retroactively confirm config inferences.
**Self-verification**:
- [ ] Every glossary entry traces to either the config or a component primary doc
- [ ] Every component listed in the vision matches a `components:` entry in the config
- [ ] All open questions are answered or explicitly deferred (with the user's acknowledgement)
- [ ] The cross-cutting architecture doc still contains every H2 it had before this step
- [ ] User picked option A on the latest condensed view
**Idempotency**: if both `_docs/glossary.md` exists AND the architecture doc already has a `## Architecture Vision` section, this step is **skipped on re-invocation**. To refresh, the user invokes `/autodev` after deleting `glossary.md` (or running `monorepo-discover` with structural changes that justify a re-confirmation).
After completion, auto-chain to **Step 3 (Status)**.
---
**Step 3 — Status**
Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true` AND (`_docs/glossary.md` exists OR `glossary_doc:` is recorded in the config).
State-driven: reached by auto-chain from Step 2.5, or entered on any re-invocation after a completed cycle.
Action: Read and execute `.cursor/skills/monorepo-status/SKILL.md`.
The status report identifies:
- Components with doc drift (commits newer than their mapped docs)
- Components with CI coverage gaps
- Registry/config mismatches
- Unresolved questions
Based on the report, auto-chain branches in this evaluation order (first match wins):
1. **Registry mismatch** (new components not in config, or config component not in registry) → present the Choose format below FIRST. After the user resolves it (A: refresh discover, B: onboard, C: continue with mismatch acknowledged), proceed to the next rule. This rule has priority because a stale config would mislead Step 3.5's ownership-envelope synthesis and any sync skill's component scope.
2. **Pre-routing gate (Step 3.5 detection)** — check `_docs/tasks/todo/` for suite-level task files (`*.md` excluding files starting with `_`). If ≥1 task is present, auto-chain to **Step 3.5 (Suite Implement)**. After Step 3.5 returns (regardless of A/B outcome), the post-implement re-status applies rules 36 below to the post-implementation state.
3. If **doc drift** found → auto-chain to **Step 4 (Document Sync)**
4. Else if **CI drift** (only) found → auto-chain to **Step 5 (CICD Sync)**
5. Else if **suite-e2e drift** (only) found → auto-chain to **Step 4.5 (Integration Test Sync)** (only when `suite_e2e:` block exists in config)
6. Else → **workflow done for this cycle**.
**Registry mismatch Choose format** (rule 1):
```
══════════════════════════════════════
DECISION REQUIRED: Registry drift detected
══════════════════════════════════════
Components in registry but not in config: <list>
Components in config but not in registry: <list>
A) Run monorepo-discover to refresh config
B) Run monorepo-onboard for each new component (interactive)
C) Ignore for now — continue
══════════════════════════════════════
Recommendation: A — safest; re-detect everything, human reviews
══════════════════════════════════════
```
When rule 6 fires (no drift, no todo tasks), report "No drift. Meta-repo is in sync." and end the cycle. Loop waits for next invocation.
---
**Step 3.5 — Suite Implement**
Condition (folder fallback): `_docs/tasks/todo/` exists AND contains ≥1 file matching `*.md` excluding files starting with `_` (e.g., `_dependencies_table.md` is excluded by convention).
State-driven: reached by auto-chain from Step 3 when the pre-routing gate detected todo tasks. Inserted **before** the sync skills (Step 4 / 4.5 / 5) by deliberate design: implementing renames + cross-repo edits first means the subsequent sync skills propagate the actual landed state rather than the pre-change state, avoiding a second cycle to fix downstream drift.
**Skip condition**: `_docs/tasks/todo/` is empty, missing, or contains only `_*` files. In that case Step 3.5 is skipped entirely and the cycle proceeds with Step 3's existing drift-based routing.
**Goal**: Execute suite-level implementation tasks — cross-repo concerns (e.g., `autopilot` + `ui` + suite `e2e/` cutover in a coordinated change-set), folder renames (e.g., `git mv flights missions` + `.gitmodules` edit + `_infra/` path refs), and suite-root infrastructure additions (e.g., `_infra/dev/docker-compose.dev.yml`). Per-component implementation work stays in each component's own workspace `/autodev` cycle.
**Why this exists**: the meta-repo's existing sync skills (`monorepo-document`, `monorepo-cicd`, `monorepo-e2e`) only **propagate** changes that already landed. They cannot **execute** a task spec. Without Step 3.5, suite-level tickets like AZ-543 (B4 repo rename) or AZ-506 (new dev compose) have no flow path forward — they require operator action outside autodev.
**Inputs**:
- `_docs/tasks/todo/*.md` (excluding `_*`) — task specs in the existing format (`Task` / `Component` / `Dependencies` / `Acceptance criteria` headers)
- `_docs/_repo-config.yaml` — `components[].path` list, used to compute the suite-level OWNED envelope (workspace root EXCLUDING any path under a component's folder)
- `_docs/tasks/_dependencies_table.md` — synthesized by this step if missing (see Procedure)
- `_docs/tasks/_suite_module_layout.md` — synthesized by this step if missing (see Procedure)
**Procedure**:
1. **Detection (already done by Step 3 pre-routing gate)**. List task files in `_docs/tasks/todo/` (excluding `_*`). If 0 → skip Step 3.5. If ≥1 → continue.
2. **Present Choose**:
```
══════════════════════════════════════
DECISION REQUIRED: <N> suite-level task(s) in _docs/tasks/todo/
══════════════════════════════════════
Task(s) detected:
- AZ-XXX: <title> (deps: <list or "—">)
- AZ-YYY: <title> (deps: <list or "—">)
...
A) Run implement skill on these task(s) now (then continue to Doc / E2E / CICD sync)
B) Skip implement this cycle — continue to Doc / E2E / CICD sync without executing tasks
C) Pause — review the tasks before deciding (end session, no state changes)
══════════════════════════════════════
Recommendation: A — running implement BEFORE syncs means subsequent
sync skills propagate the post-implementation state.
B is appropriate when tasks are blocked on user input
or external coordination. C when the tasks themselves
need owner clarification before execution.
══════════════════════════════════════
```
3. **On user A — Pre-flight**:
a. **Working tree clean check**. Run `git status --porcelain`. If non-empty, surface to the user with a Choose A/B/C identical to the implement skill's prerequisite gate (commit/stash manually; agent commits as `chore: WIP pre-implement`; abort).
b. **Synthesize `_docs/tasks/_dependencies_table.md`** if missing. Parse each in-scope task's `Dependencies:` field. Write a minimal table of the form:
```markdown
# Suite-Level Task Dependencies
| Task ID | Depends on | Notes |
|---------|------------|-------|
| AZ-XXX | (none) | — |
| AZ-YYY | AZ-XXX | — |
```
If a task lists a dependency that is neither in `todo/` nor `done/`, log a warning in the synthesized file but do not block — implement skill's Step 1 (Parse) will surface the issue if it actually blocks execution.
c. **Synthesize `_docs/tasks/_suite_module_layout.md`** if missing. Default content:
```markdown
# Suite-Level Module Layout (synthetic)
Generated by autodev meta-repo Step 3.5. The suite root has no per-feature decomposition; ownership is defined at the component-boundary level only.
## Per-Component Mapping
| Component | Owns | Imports from |
|-----------|----------------------------------|--------------|
| suite | (workspace root) excluding any path listed under `_repo-config.yaml.components[].path` | (read-only) every component's primary doc + `_docs/*.md` |
Suite-level tasks operate on: `.gitmodules`, `_infra/**`, `_docs/**` (excluding `_docs/tasks/_*` regenerated files), root `README.md`, `e2e/**` (suite e2e harness only).
Forbidden paths for suite-level tasks: `<component>/**` for every component listed in `_repo-config.yaml.components[].path` — those edits live in the component's own workspace `/autodev` cycle.
```
d. **Prepare invocation context**:
```
suite_level: true
TASKS_DIR: _docs/tasks/
module_layout_path: _docs/tasks/_suite_module_layout.md
```
4. **Invoke implement skill**. Read and execute `.cursor/skills/implement/SKILL.md` with the prepared context. The skill's "Suite-level invocation context" subsection (added in tandem with this flow change) honors the three flags above and skips:
- Step 14.5 (cumulative code review) — no `architecture_compliance_baseline.md` exists at the suite level; cross-task drift is captured by the next `monorepo-status` cycle instead.
- Step 15 (Product Implementation Completeness Gate) — the gate's inputs (`_docs/02_document/architecture.md`, `system-flows.md`, `components/*/description.md`) do not exist in the meta-repo artifact layout. Suite tasks are infrastructure / coordination work, not feature implementation.
All other implement skill steps (114, 16) execute unchanged. Tracker integration (Step 5: In Progress, Step 12: In Testing) runs normally.
5. **Post-implement re-status**. After the implement skill completes (last batch committed, all originally-todo tasks moved to `_docs/tasks/done/`), silently re-run Step 3's drift detection logic — do NOT re-render the full Status report; just re-evaluate the drift signals against the post-implementation tree. Then auto-chain per the post-implementation drift findings:
- Doc drift → Step 4 (Document Sync)
- Suite-e2e drift only → Step 4.5
- CI drift only → Step 5
- No drift → cycle complete
Note: the post-implement re-status is exactly why Step 3.5 is placed before sync. A repo rename will typically introduce doc + CI drift; the next invocation of Step 4 / Step 5 catches it on the same cycle.
6. **On user B (skip)** → mark Step 3.5 `skipped` in state file. Apply Step 3's original drift-based routing (compute from the pre-Step-3.5 Status report).
7. **On user C (pause)** → end session. Update state to `step: 3.5, status: in_progress, sub_step: {phase: 0, name: awaiting-task-review, detail: "<N> tasks pending review"}`. Tell the user to invoke `/autodev` again after deciding. **Do NOT modify any files** — pre-flight has not run yet.
**Self-verification** (executed before invoking implement):
- [ ] Working tree is clean (or user explicitly chose B in the WIP-stash sub-Choose)
- [ ] `_docs/tasks/_dependencies_table.md` exists (synthesized if it didn't)
- [ ] `_docs/tasks/_suite_module_layout.md` exists (synthesized if it didn't)
- [ ] All in-scope task files have a `Component:` field (skip + report any that don't — don't guess ownership)
- [ ] Tracker availability gate satisfied per `protocols.md` (or `tracker: local` previously chosen)
**Failure handling**:
- If implement returns FAILED → standard Failure Handling (`protocols.md`): retry up to 3 times, then escalate.
- If implement is interrupted mid-batch → next invocation re-detects via the implement skill's resumability protocol (read latest `_docs/03_implementation/suite_batch_*.md`). Step 3.5 itself is reentrant: on re-entry, if `todo/` still has tasks, it presents the Choose again with the remaining set.
- **Half-applied state risk** (acknowledged): if implement is interrupted between commits, the working tree is clean at the last commit boundary but the in-flight batch is lost. The user is responsible for inspecting and re-invoking. This is intentional — automated rollback of suite-level renames + `.gitmodules` edits is more dangerous than a human-driven recovery.
**Idempotency**: if `_docs/tasks/todo/` becomes empty after this step (all tasks moved to `done/`), the next `/autodev` invocation skips Step 3.5 entirely and proceeds with normal Status → sync flow.
---
**Step 4 — Document Sync**
State-driven: reached by auto-chain from Step 3 when the status report flagged doc drift.
Action: Read and execute `.cursor/skills/monorepo-document/SKILL.md` with scope = components flagged by status.
The skill:
1. Runs its own drift check (M7)
2. Asks user to confirm scope (components it will touch)
3. Applies doc edits
4. Skips any component with unconfirmed mapping (M5), reports
After completion:
- If the status report ALSO flagged suite-e2e drift → auto-chain to **Step 4.5 (Integration Test Sync)**
- Else if the status report ALSO flagged CI drift → auto-chain to **Step 5 (CICD Sync)**
- Else → end cycle, report done
---
**Step 4.5 — Integration Test Sync**
State-driven: reached by auto-chain from Step 3 (when status report flagged suite-e2e drift and no doc drift) or from Step 4 (when both doc and suite-e2e drift were flagged).
**Skip condition**: if `_docs/_repo-config.yaml` has no `suite_e2e:` block, this step is skipped entirely — there's no harness to sync. The status report should not flag suite-e2e drift in that case; if it does, that's a status-skill bug.
Action: Read and execute `.cursor/skills/monorepo-e2e/SKILL.md` with scope = components flagged by status.
The skill:
1. Verifies every path under `suite_e2e.*` exists (binary fixtures excepted — see the skill's Phase 1)
2. Classifies each flagged change against the suite-e2e impact table
3. Applies edits to `e2e/docker-compose.suite-e2e.yml`, `e2e/fixtures/init.sql`, `e2e/fixtures/expected_detections.json` metadata, and `e2e/runner/tests/*.spec.ts` selectors as needed
4. Bumps baseline `fixture_version` with a `-stale` suffix and appends a `_docs/_process_leftovers/` entry whenever the detection model revision changes (binary fixture cannot be regenerated automatically)
5. Reports synced files; does not run the suite e2e itself
After completion:
- If the status report ALSO flagged CI drift → auto-chain to **Step 5 (CICD Sync)**
- Else → end cycle, report done
---
**Step 5 — CICD Sync**
State-driven: reached by auto-chain from Step 3 (when status report flagged CI drift and no doc/suite-e2e drift), Step 4, or Step 4.5.
Action: Read and execute `.cursor/skills/monorepo-cicd/SKILL.md` with scope = components flagged by status.
After completion, end cycle. Report files updated across doc, suite-e2e, and CI sync.
---
**Step 6 — Loop (re-entry on next invocation)**
State-driven: all triggered steps completed; the meta-repo cycle has finished.
Action: Update state file to `step: 3, status: not_started` so that next `/autodev` invocation starts from Status. The meta-repo flow is cyclical — there's no terminal "done" state, because drift can appear at any time as submodules evolve.
On re-invocation:
- If config was updated externally and `confirmed_by_user` flipped back to `false` → go back to Step 2
- Otherwise → Step 3 (Status)
## Explicit Onboarding Branch (user-initiated)
Onboarding is not auto-chained. Two ways to invoke:
**1. During Step 3 registry-mismatch handling** — if user picks option B in the registry-mismatch Choose format, launch `monorepo-onboard` interactively for each new component.
**2. Direct user request** — if the user says "onboard <name>" during any step, pause the current step, save state, run `monorepo-onboard`, then resume.
After onboarding completes, the config is updated. Auto-chain back to **Step 3 (Status)** to catch any remaining drift the new component introduced.
## Auto-Chain Rules
| Completed Step | Next Action |
|---------------|-------------|
| Discover (1) | Auto-chain → Config Review (2) |
| Config Review (2, user picked A, confirmed_by_user: true) | Auto-chain → Glossary & Architecture Vision (2.5) |
| Config Review (2, user picked B) | **Session boundary** — end session, await re-invocation |
| Glossary & Architecture Vision (2.5) | Auto-chain → Status (3) |
| Status (3, todo tasks present) | Auto-chain → Suite Implement (3.5) — pre-routing gate fires before drift-based routing |
| Status (3, no todo tasks, doc drift) | Auto-chain → Document Sync (4) |
| Status (3, no todo tasks, suite-e2e drift only) | Auto-chain → Integration Test Sync (4.5) |
| Status (3, no todo tasks, CI drift only) | Auto-chain → CICD Sync (5) |
| Status (3, no todo tasks, no drift) | **Cycle complete** — end session, await re-invocation |
| Status (3, registry mismatch) | Ask user (A: discover, B: onboard, C: continue) |
| Suite Implement (3.5, user picked A, success) | Silent re-status; auto-chain per post-implementation drift (Step 4 / 4.5 / 5 / cycle complete) |
| Suite Implement (3.5, user picked B) | Mark `skipped`; auto-chain per Step 3's original drift findings |
| Suite Implement (3.5, user picked C) | **Session boundary** — end session, await re-invocation |
| Suite Implement (3.5, FAILED ×3) | Standard Failure Handling escalation (`protocols.md`) |
| Document Sync (4) + suite-e2e drift pending | Auto-chain → Integration Test Sync (4.5) |
| Document Sync (4) + CI drift only pending | Auto-chain → CICD Sync (5) |
| Document Sync (4) + no further drift | **Cycle complete** |
| Integration Test Sync (4.5) + CI drift pending | Auto-chain → CICD Sync (5) |
| Integration Test Sync (4.5) + no CI drift | **Cycle complete** |
| CICD Sync (5) | **Cycle complete** |
## Status Summary — Step List
Flow name: `meta-repo`. Render using the banner template in `protocols.md` → "Banner Template (authoritative)".
Flow-specific slot values:
- `<header-suffix>`: empty.
- `<current-suffix>`: empty.
- `<footer-extras>`: add a single line:
```
Config: _docs/_repo-config.yaml [confirmed_by_user: <true|false>, last_updated: <date>]
```
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|------------------------------------|--------------------------------------------|
| 1 | Discover | — |
| 2 | Config Review | `IN PROGRESS (awaiting human)` |
| 2.5 | Glossary & Architecture Vision | `SKIPPED (already captured)` |
| 3 | Status | `DONE (no drift)`, `DONE (N drifts)` |
| 3.5 | Suite Implement | `DONE (N tasks)`, `SKIPPED (no todo tasks)`, `SKIPPED (user picked B)`, `IN PROGRESS (batch M of ~N)`, `IN PROGRESS (awaiting-task-review)` |
| 4 | Document Sync | `DONE (N docs)`, `SKIPPED (no doc drift)` |
| 4.5 | Integration Test Sync | `DONE (N files)`, `SKIPPED (no suite-e2e drift)`, `SKIPPED (no suite_e2e config block)` |
| 5 | CICD Sync | `DONE (N files)`, `SKIPPED (no CI drift)` |
All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2.5, 3.5, 4, 4.5, and 5 additionally accept `SKIPPED`.
Row rendering format:
```
Step 1 Discover [<state token>]
Step 2 Config Review [<state token>]
Step 2.5 Glossary & Architecture Vision [<state token>]
Step 3 Status [<state token>]
Step 3.5 Suite Implement [<state token>]
Step 4 Document Sync [<state token>]
Step 4.5 Integration Test Sync [<state token>]
Step 5 CICD Sync [<state token>]
```
## Notes for the meta-repo flow
- **Session boundaries**: Step 2 (Config Review pending), Step 2.5 (one-shot glossary/vision review), and Step 3.5 (when user picks C "Pause"). Step 3.5's A/B picks do NOT cross a session boundary — they auto-chain to syncs in the same session.
- **Cyclical, not terminal**: no "done forever" state. Each invocation completes a drift cycle; next invocation starts fresh.
- **Tracker integration scope**: this flow does NOT create Jira/ADO tickets in its sync skills (Status / Document Sync / E2E / CICD). Step 3.5 (Suite Implement) IS tracker-integrated — it transitions existing tickets In Progress → In Testing per the implement skill's standard tracker handling. Suite-level tickets are authored manually by the operator (typically as children of an Epic that spans multiple components, like AZ-539); the flow doesn't auto-create them.
- **Per-component vs. suite-level work**:
- Tickets that touch component source code (`<component>/src/**`) belong in that component's own workspace `/autodev` cycle. The meta-repo flow does NOT execute them.
- Tickets that touch suite-root paths only (`.gitmodules`, `_infra/**`, suite `e2e/**`, root `README.md`, suite `_docs/**` outside `tasks/_*`) are eligible for Step 3.5.
- Tickets that span both (e.g., AZ-550 B11 consumer cutover, which touches `autopilot/`, `ui/`, AND suite `e2e/`) are NOT executable from a single workspace by design — split the ticket so the suite-level slice can run in Step 3.5 and the component slices run in their owning workspaces.
- **Onboarding is opt-in**: never auto-onboarded. User must explicitly request.
- **Failure handling**: uses the same retry/escalation protocol as other flows (see `protocols.md`).
-396
View File
@@ -1,396 +0,0 @@
# Autodev Protocols
## User Interaction Protocol
Every time the autodev or a sub-skill needs a user decision, use the **Choose A / B / C / D** format. This applies to:
- State transitions where multiple valid next actions exist
- Sub-skill BLOCKING gates that require user judgment
- Any fork where the autodev cannot confidently pick the right path
- Trade-off decisions (tech choices, scope, risk acceptance)
### When to Ask (MUST ask)
- The next action is ambiguous (e.g., "another research round or proceed?")
- The decision has irreversible consequences (e.g., architecture choices, skipping a step)
- The user's intent or preference cannot be inferred from existing artifacts
- A sub-skill's BLOCKING gate explicitly requires user confirmation
- Multiple valid approaches exist with meaningfully different trade-offs
### When NOT to Ask (auto-transition)
- Only one logical next step exists (e.g., Problem complete → Research is the only option)
- The transition is deterministic from the state (e.g., Plan complete → Decompose)
- The decision is low-risk and reversible
- Existing artifacts or prior decisions already imply the answer
### Choice Format
Always present decisions in this format:
```
══════════════════════════════════════
DECISION REQUIRED: [brief context]
══════════════════════════════════════
A) [Option A — short description]
B) [Option B — short description]
C) [Option C — short description, if applicable]
D) [Option D — short description, if applicable]
══════════════════════════════════════
Recommendation: [A/B/C/D] — [one-line reason]
══════════════════════════════════════
```
Rules:
1. Always provide 24 concrete options (never open-ended questions)
2. Always include a recommendation with a brief justification
3. Keep option descriptions to one line each
4. If only 2 options make sense, use A/B only — do not pad with filler options
5. Play the notification sound (per `.cursor/rules/human-attention-sound.mdc`) before presenting the choice
6. After the user picks, proceed immediately — no follow-up confirmation unless the choice was destructive
## Optional Skill Gate (reusable template)
Several flow steps ask the user whether to run an optional skill (security audit, performance test, etc.) before auto-chaining. Instead of re-stating the Choose block and skip semantics at each such step, flow files invoke this shared template.
### Template shape
```
══════════════════════════════════════
DECISION REQUIRED: <question>
══════════════════════════════════════
A) <option-a-label>
B) <option-b-label>
══════════════════════════════════════
Recommendation: <A|B> — <reason>
══════════════════════════════════════
```
### Semantics (same for every invocation)
- **On A** → read and execute the target skill's `SKILL.md`; after it completes, auto-chain to `<next-step>`.
- **On B** → mark the current step `skipped` in the state file; auto-chain to `<next-step>`.
- **On skill failure** → standard Failure Handling (§Failure Handling) — retry ladder, then escalate via Choose block.
- **Sound before the prompt** — follow `.cursor/rules/human-attention-sound.mdc`.
### How flow files invoke it
Each flow-file step that needs this gate supplies only the variable parts:
```
Action: Apply the **Optional Skill Gate** (protocols.md → "Optional Skill Gate") with:
- question: <Choose-block header>
- option-a-label: <one-line A description>
- option-b-label: <one-line B description>
- recommendation: <A|B> — <short reason, may be dynamic>
- target-skill: <.cursor/skills/<name>/SKILL.md, plus any mode hint>
- next-step: Step <N> (<name>)
```
The resolved Choose block (shape above) is then rendered verbatim by substituting these variables. Do NOT reword the shared scaffolding — reword only the variable parts. If a step needs different semantics (e.g., "re-run same skill" rather than "skip to next step"), it MUST NOT use this template; it writes the Choose block inline with its own semantics.
### When NOT to use this template
- The user choice has **more than two options** (A/B/C/D).
- The choice is **not "run-or-skip-this-skill"** (e.g., "another round of the same skill", "pick tech stack", "proceed vs. rollback").
- The skipped path needs special bookkeeping beyond `status: skipped` (e.g., must also move artifacts, notify tracker, trigger a different skill).
For those cases, write the Choose block inline using the base format in §User Interaction Protocol.
## Work Item Tracker Authentication
All tracker detection, authentication, availability gating, `tracker: local` fallback semantics, and leftovers handling are defined in `.cursor/rules/tracker.mdc`. Follow that rule — do not restate its logic here.
Autodev-specific additions on top of the rule:
### Steps That Require Work Item Tracker
Before entering a step from this table for the first time in a session, verify tracker availability per `.cursor/rules/tracker.mdc`. If the user has already chosen `tracker: local`, skip the gate and proceed.
| Flow | Step | Sub-Step | Tracker Action |
|------|------|----------|----------------|
| greenfield | Plan | Step 6 — Epics | Create epics for each component |
| greenfield | Decompose | Implementation decomposition Step 1 + Step 2 — Product tasks | Create ticket per product task, link to epic |
| greenfield | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
| existing-code | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
| existing-code | New Task | Step 7 — Ticket | Create ticket per task, link to epic |
| meta-repo | Suite Implement | Step 3.5 — implement skill Step 5 / Step 12 | Transition existing tickets In Progress → In Testing per implement skill (does NOT create new tickets — operator authors them) |
### State File Marker
Record the resolved choice in the state file once per session: `tracker: jira` or `tracker: local`. Subsequent steps read this marker instead of re-running the gate.
## Error Handling
All error situations that require user input MUST use the **Choose A / B / C / D** format.
| Situation | Action |
|-----------|--------|
| State detection is ambiguous (artifacts suggest two different steps) | Present findings and use Choose format with the candidate steps as options |
| Sub-skill fails or hits an unrecoverable blocker | Use Choose format: A) retry, B) skip with warning, C) abort and fix manually |
| User wants to skip a step | Use Choose format: A) skip (with dependency warning), B) execute the step |
| User wants to go back to a previous step | Use Choose format: A) re-run (with overwrite warning), B) stay on current step |
| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution |
## Failure Handling
One retry ladder covers all failure modes: explicit failure returned by a sub-skill, stuck loops detected while monitoring, and persistent failures across conversations. The single counter is `retry_count` in the state file; the single escalation is the Choose block below.
### Failure signals
Treat the sub-skill as **failed** when ANY of the following is observed:
- The sub-skill explicitly returns a failed result (including blocked tasks, auto-fix loop exhaustion, prerequisite violations).
- **Stuck signals**: the same artifact is rewritten 3+ times without meaningful change; the sub-skill re-asks a question that was already answered; no new artifact has been saved despite active execution.
### Retry ladder
```
Failure observed
├─ retry_count < 3 ?
│ YES → increment retry_count in state file
│ → re-read the sub-skill's SKILL.md and _docs/_autodev_state.md
│ → resume from the last recorded sub_step (restart from sub_step 1 only if corruption is suspected)
│ → loop
│ NO (retry_count = 3) →
│ → set status: failed and retry_count: 3 in Current Step
│ → play notification sound (.cursor/rules/human-attention-sound.mdc)
│ → escalate (Choose block below)
│ → do NOT auto-retry until the user intervenes
```
Rules:
1. **Auto-retry is immediate** — do not ask before retrying.
2. **Preserve `sub_step`** across retries unless the failure indicates artifact corruption.
3. **Reset `retry_count: 0` on success.**
4. The counter is **per step, per cycle**. It is not cleared by crossing a session boundary — persistence across conversations is intentional; it IS the circuit breaker.
### Escalation
```
══════════════════════════════════════
SKILL FAILED: [Skill Name] — 3 consecutive failures
══════════════════════════════════════
Step: [N] — [Name]
SubStep: [M] — [sub-step name]
Last failure reason: [reason]
══════════════════════════════════════
A) Retry with fresh context (new conversation)
B) Skip this step with warning
C) Abort — investigate and fix manually
══════════════════════════════════════
Recommendation: A — fresh context often resolves
persistent failures
══════════════════════════════════════
```
### Re-entry after escalation
On the next invocation, if the state file shows `status: failed` AND `retry_count: 3`, do NOT auto-retry. Present the escalation block above first:
- User picks A → reset `retry_count: 0`, set `status: in_progress`, re-execute.
- User picks B → mark step `skipped`, proceed to the next step.
- User picks C → stop; return control to the user.
### Incident retrospective
Immediately after the user has made their A/B/C choice, invoke `.cursor/skills/retrospective/SKILL.md` in **incident mode**:
```
mode: incident
failing_skill: <skill name>
failure_summary: <last failure reason string>
```
This produces `_docs/06_metrics/incident_<YYYY-MM-DD>_<skill>.md` and appends 13 lessons to `_docs/LESSONS.md` under `process` or `tooling`. The retro runs even if the user picked Abort — the goal is to capture the pattern while it is fresh. If the retrospective skill itself fails, log the failure to `_docs/_process_leftovers/` but do NOT block the user's recovery choice from completing.
## Context Management Protocol
### Principle
Disk is memory. Never rely on in-context accumulation — read from `_docs/` artifacts, not from conversation history.
### Minimal Re-Read Set Per Skill
When re-entering a skill (new conversation or context refresh):
- Always read: `_docs/_autodev_state.md`
- Always read: the active skill's `SKILL.md`
- Conditionally read: only the `_docs/` artifacts the current sub-step requires (listed in each skill's Context Resolution section)
- Never bulk-read: do not load all `_docs/` files at once
### Mid-Skill Interruption
If context is filling up during a long skill (e.g., document, implement):
1. Save current sub-step progress to the skill's artifact directory
2. Update `_docs/_autodev_state.md` with exact sub-step position
3. Suggest a new conversation: "Context is getting long — recommend continuing in a fresh conversation for better results"
4. On re-entry, the skill's resumability protocol picks up from the saved sub-step
### Large Artifact Handling
When a skill needs to read large files (e.g., full solution.md, architecture.md):
- Read only the sections relevant to the current sub-step
- Use search tools (Grep, SemanticSearch) to find specific sections rather than reading entire files
- Summarize key decisions from prior steps in the state file so they don't need to be re-read
### Context Budget Heuristic
Agents cannot programmatically query context window usage. Use these heuristics to avoid degradation:
| Zone | Indicators | Action |
|------|-----------|--------|
| **Safe** | State file + SKILL.md + 23 focused artifacts loaded | Continue normally |
| **Caution** | 5+ artifacts loaded, or 3+ large files (architecture, solution, discovery), or conversation has 20+ tool calls | Complete current sub-step, then suggest session break |
| **Danger** | Repeated truncation in tool output, tool calls failing unexpectedly, responses becoming shallow or repetitive | Save immediately, update state file, force session boundary |
**Skill-specific guidelines**:
| Skill | Recommended session breaks |
|-------|---------------------------|
| **document** | After every ~5 modules in Step 1; between Step 4 (Verification) and Step 5 (Solution Extraction) |
| **implement** | Each batch is a natural checkpoint; if more than 2 batches completed in one session, suggest break |
| **plan** | Between Step 5 (Test Specifications) and Step 6 (Epics) for projects with many components |
| **research** | Between Mode A rounds; between Mode A and Mode B |
**How to detect caution/danger zone without API**:
1. Count tool calls made so far — if approaching 20+, context is likely filling up
2. If reading a file returns truncated content, context is under pressure
3. If the agent starts producing shorter or less detailed responses than earlier in the conversation, context quality is degrading
4. When in doubt, save and suggest a new conversation — re-entry is cheap thanks to the state file
## Rollback Protocol
### Implementation Steps (git-based)
Handled by `/implement` skill — each batch commit is a rollback checkpoint via `git revert`.
### Planning/Documentation Steps (artifact-based)
For steps that produce `_docs/` artifacts (problem, research, plan, decompose, document):
1. **Before overwriting**: if re-running a step that already has artifacts, the sub-skill's prerequisite check asks the user (resume/overwrite/skip)
2. **Rollback to previous step**: use Choose format:
```
══════════════════════════════════════
ROLLBACK: Re-run [step name]?
══════════════════════════════════════
A) Re-run the step (overwrites current artifacts)
B) Stay on current step
══════════════════════════════════════
Warning: This will overwrite files in _docs/[folder]/
══════════════════════════════════════
```
3. **Git safety net**: artifacts are committed with each autodev step completion. To roll back: `git log --oneline _docs/` to find the commit, then `git checkout <commit> -- _docs/<folder>/`
4. **State file rollback**: when rolling back artifacts, also update `_docs/_autodev_state.md` to reflect the rolled-back step (set it to `in_progress`, clear completed date)
## Debug Protocol
When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or a task reports a blocker, the user is asked to intervene. This protocol guides the debugging process. (Retry budget and escalation are covered by Failure Handling above; this section is about *how* to diagnose once the user has been looped in.)
### Structured Debugging Workflow
When escalated to the user after implementation failure:
1. **Classify the failure** — determine the category:
- **Missing dependency**: a package, service, or module the task needs but isn't available
- **Logic error**: code runs but produces wrong results (assertion failures, incorrect output)
- **Integration mismatch**: interfaces between components don't align (type errors, missing methods, wrong signatures)
- **Environment issue**: Docker, database, network, or configuration problem
- **Spec ambiguity**: the task spec is unclear or contradictory
2. **Reproduce** — isolate the failing behavior:
- Run the specific failing test(s) in isolation
- Check whether the failure is deterministic or intermittent
- Capture the exact error message, stack trace, and relevant file:line
3. **Narrow scope** — focus on the minimal reproduction:
- For logic errors: trace the data flow from input to the point of failure
- For integration mismatches: compare the caller's expectations against the callee's actual interface
- For environment issues: verify Docker services are running, DB is accessible, env vars are set
4. **Fix and verify** — apply the fix and confirm:
- Make the minimal change that fixes the root cause
- Re-run the failing test(s) to confirm the fix
- Run the full test suite to check for regressions
- If the fix changes a shared interface, check all consumers
5. **Report** — update the batch report with:
- Root cause category
- Fix applied (file:line, description)
- Tests that now pass
### Common Recovery Patterns
| Failure Pattern | Typical Root Cause | Recovery Action |
|----------------|-------------------|----------------|
| ImportError / ModuleNotFoundError | Missing dependency or wrong path | Install dependency or fix import path |
| TypeError on method call | Interface mismatch between tasks | Align caller with callee's actual signature |
| AssertionError in test | Logic bug or wrong expected value | Fix logic or update test expectations |
| ConnectionRefused | Service not running | Start Docker services, check docker-compose |
| Timeout | Blocking I/O or infinite loop | Add timeout, fix blocking call |
| FileNotFoundError | Hardcoded path or missing fixture | Make path configurable, add fixture |
### Escalation
If debugging does not resolve the issue after 2 focused attempts:
```
══════════════════════════════════════
DEBUG ESCALATION: [failure description]
══════════════════════════════════════
Root cause category: [category]
Attempted fixes: [list]
Current state: [what works, what doesn't]
══════════════════════════════════════
A) Continue debugging with more context
B) Revert this batch and skip the task (move to backlog)
C) Simplify the task scope and retry
══════════════════════════════════════
```
## Status Summary
On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback). For re-entry (state file exists), cross-check the current step against `_docs/` folder structure and present any `status: failed` state to the user before continuing.
### Banner Template (authoritative)
The banner shell is defined here once. Each flow file contributes only its step-list fragment and any flow-specific header/footer extras. Do not inline a full banner in flow files.
```
═══════════════════════════════════════════════════
AUTODEV STATUS (<flow-name>)<header-suffix>
═══════════════════════════════════════════════════
<step-list from the active flow file>
═══════════════════════════════════════════════════
Current: Step <N> — <Name><current-suffix>
SubStep: <M> — <sub-skill internal step name>
Retry: <N/3> ← omit row if retry_count is 0
Action: <what will happen next>
<footer-extras from the active flow file>
═══════════════════════════════════════════════════
```
### Slot rules
- `<flow-name>``greenfield`, `existing-code`, or `meta-repo`.
- `<header-suffix>` — optional, flow-specific. The existing-code flow appends ` — Cycle <N>` when `state.cycle > 1`; other flows leave it empty.
- `<step-list>` — a fixed-width table supplied by the active flow file (see that file's "Status Summary — Step List" section). Row format is standardized:
```
Step <N> <Step Name> [<state token>]
```
where `<state token>` comes from the state-token set defined per row in the flow's step-list table.
- `<current-suffix>` — optional, flow-specific. The existing-code flow appends ` (cycle <N>)` when `state.cycle > 1`; other flows leave it empty.
- `Retry:` row — omit entirely when `retry_count` is 0. Include it with `<N>/3` otherwise.
- `<footer-extras>` — optional, flow-specific. The meta-repo flow adds a `Config:` line with `_docs/_repo-config.yaml` state; other flows leave it empty unless **parent suite docs** apply: if `<workspace-root>/../docs` exists and is a directory, append `Suite docs (parent): <absolute path>` on its own line (or `Suite docs (parent): absent` is **not** required — omit when missing). This line is orthogonal to flow-specific footer lines; both may appear.
### State token set (shared)
The common tokens all flows may emit are: `DONE`, `IN PROGRESS`, `NOT STARTED`, `SKIPPED`, `FAILED (retry N/3)`. Specific step rows may extend this with parenthetical detail (e.g., `DONE (N drafts)`, `DONE (N tasks)`, `IN PROGRESS (batch M of ~N)`, `DONE (N passed, M failed)`). The flow's step-list table declares which extensions each step supports.
-171
View File
@@ -1,171 +0,0 @@
# Autodev State Management
## State File: `_docs/_autodev_state.md`
The autodev persists its position to `_docs/_autodev_state.md`. This is a lightweight pointer — only the current step. All history lives in `_docs/` artifacts and git log. Folder scanning is the fallback when the state file doesn't exist.
### Template
**Saved at:** `_docs/_autodev_state.md` (workspace-relative, one file per project). Created on the first `/autodev` invocation; updated in place on every state transition; never deleted.
```markdown
# Autodev State
## Current Step
flow: [greenfield | existing-code | meta-repo]
step: [1-17 for greenfield (incl. fractional 16.5), 1-17 for existing-code (incl. fractional 16.5), 1-6 for meta-repo (incl. fractional 2.5 and 3.5), or "done"]
name: [step name from the active flow's Step Reference Table]
status: [not_started / in_progress / completed / skipped / failed]
sub_step:
phase: [integer — sub-skill internal phase/step number, or 0 if not started]
name: [kebab-case short identifier from the sub-skill, or "awaiting-invocation"]
detail: [optional free-text note, may be empty]
retry_count: [0-3 — consecutive auto-retry attempts, reset to 0 on success]
cycle: [1-N — feature cycle counter for existing-code flow; increments on each "Re-Entry After Completion" loop; always 1 for greenfield and meta-repo]
```
The `sub_step` field is structured. Every sub-skill must save both `phase` (integer) and `name` (kebab-case token matching the skill's documented phase names). `detail` is optional human-readable context. On re-entry the orchestrator parses `phase` and `name` to resume; if parsing fails, fall back to folder scan and log the parse failure.
### Sub-Skill Phase Persistence — Rules (not a registry)
Each sub-skill is authoritative for its own phase list. Phase names and numbers live inside the sub-skill's own SKILL.md (and any `steps/` / `phases/` files). The orchestrator does not maintain a central phase table — it reads whatever `phase` / `name` the sub-skill last wrote.
Every sub-skill MUST follow these rules when persisting `sub_step`:
1. **`phase`** — a strictly monotonic integer per invocation, starting at 0 (`awaiting-invocation`) and incrementing by 1 at each internal save point. No fractional values are ever persisted. If the skill's own docs use half-step numbering (e.g., "Phase 4.5", decompose's "Step 1.5"), the persisted integer is simply the next integer, and all subsequent phases shift up by one in that skill's own file.
2. **`name`** — a kebab-case short identifier unique within that sub-skill. Use the phase's heading or step title in kebab-case (e.g., `component-decomposition`, `auto-fix-gate`, `cross-task-consistency`). Different modes of the same skill may reuse a `phase` integer with distinct `name` values (e.g., `decompose` phase 1 is `bootstrap-structure` in default mode, `test-infrastructure-bootstrap` in tests-only mode).
3. **`detail`** — optional free-text note (batch index, mode flag, retry hint); may be empty.
4. **Reserved name**`name: awaiting-invocation` with `phase: 0` is the universal "skill was chained but has not started" marker. Every sub-skill implicitly supports it; no sub-skill should reuse the token for anything else.
On re-entry, the orchestrator parses the structured field and resumes at `(phase, name)`. If parsing fails, it falls back to folder scan and logs the parse error — it does NOT guess a phase.
The `cycle` counter is used by existing-code flow Step 10 (Implement) detection and by implementation report naming (`implementation_report_{feature_slug}_cycle{N}.md`). It starts at 1 when a project enters existing-code flow (either by routing from greenfield's Done branch, or by first invocation on an existing codebase). It increments on each completed Retrospective → New Task loop.
### Examples
```
flow: greenfield
step: 3
name: Plan
status: in_progress
sub_step:
phase: 4
name: architecture-review-risk-assessment
detail: ""
retry_count: 0
cycle: 1
```
```
flow: existing-code
step: 3
name: Test Spec
status: failed
sub_step:
phase: 1
name: test-case-generation
detail: "variant 1b"
retry_count: 3
cycle: 1
```
```
flow: meta-repo
step: 2
name: Config Review
status: in_progress
sub_step:
phase: 0
name: awaiting-human-review
detail: "awaiting review of _docs/_repo-config.yaml"
retry_count: 0
cycle: 1
```
```
flow: meta-repo
step: 3.5
name: Suite Implement
status: in_progress
sub_step:
phase: 7
name: batch-loop
detail: "AZ-543 batch 1 of 1; suite-level"
retry_count: 0
cycle: 1
```
```
flow: existing-code
step: 10
name: Implement
status: in_progress
sub_step:
phase: 7
name: batch-loop
detail: "batch 2 of ~4"
retry_count: 0
cycle: 3
```
### State File Rules
1. **Create** on the first autodev invocation (after state detection determines Step 1)
2. **Update** after every change — this includes: batch completion, sub-step progress, step completion, session boundary, failed retry, or any meaningful state transition. The state file must always reflect the current reality.
3. **Read** as the first action on every invocation — before folder scanning
4. **Cross-check**: verify against actual `_docs/` folder contents. If they disagree, trust the folder structure and update the state file. **Parent suite `docs/`**: on every invocation, also probe `<workspace-root>/../docs` (the parent directorys `docs` folder — typical suite-level shared documentation next to a component repo). If it exists, mention it in the Status Summary footer per `protocols.md`; use it only as supplemental reading context unless a flow step explicitly ties detection to it. It never replaces workspace `_docs/` for step detection by default.
5. **Never delete** the state file
6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` on success. If `retry_count` reaches 3, set `status: failed`
7. **Failed state on re-entry**: if `status: failed` with `retry_count: 3`, do NOT auto-retry — present the issue to the user first
8. **Skill-internal state**: when the active skill maintains its own state file (e.g., document skill's `_docs/02_document/state.json`), the autodev's `sub_step` field should reflect the skill's internal progress. On re-entry, cross-check the skill's state file against the autodev's `sub_step` for consistency.
## State Detection
Read `_docs/_autodev_state.md` first. If it exists and is consistent with the folder structure, use the `Current Step` from the state file. If the state file doesn't exist or is inconsistent, fall back to folder scanning.
### Folder Scan Rules (fallback)
Scan the workspace and `_docs/` to determine the current workflow position. The detection rules are defined in each flow file (`flows/greenfield.md`, `flows/existing-code.md`, `flows/meta-repo.md`). Resolution order:
1. Apply the Flow Resolution rules in `SKILL.md` to pick the flow first (meta-repo detection takes priority over greenfield/existing-code).
2. Within the selected flow, check its detection rules in order — first match wins.
## Re-Entry Protocol
When the user invokes `/autodev` and work already exists:
1. Read `_docs/_autodev_state.md`
2. Cross-check against `_docs/` folder structure
3. Present Status Summary (render using the banner template in `protocols.md` → "Banner Template", filled in with the active flow's "Status Summary — Step List" fragment)
4. If the detected step has a sub-skill with built-in resumability, the sub-skill handles mid-step recovery
5. Continue execution from detected state
## Session Boundaries
A **session boundary** is a transition that explicitly breaks auto-chain. Which transitions are boundaries is declared **in each flow file's Auto-Chain Rules table** — rows marked `**Session boundary**`. The details live with the steps they apply to; this section defines only the shared mechanism.
**Invariant**: a flow row without the `Session boundary` marker auto-chains unconditionally. Missing marker = missing boundary.
### Orchestrator mechanism at a boundary
1. Update the state file: mark the current step `completed`; set the next step with `status: not_started`; reset `sub_step: {phase: 0, name: awaiting-invocation, detail: ""}`; keep `retry_count: 0`.
2. Present a brief summary of what just finished (tasks produced, batches expected, etc., as relevant to the boundary).
3. Present the shared Choose block (template below) — or a flow-specific override if the flow file supplies one.
4. End the session — do not start the next skill in the same conversation.
### Shared Choose template
```
══════════════════════════════════════
DECISION REQUIRED: <what just completed> — start <next phase>?
══════════════════════════════════════
A) Start a new conversation for <next phase> (recommended for context freshness)
B) Continue in this conversation (NOT recommended — context may degrade)
Warning: if context fills mid-<next phase>, state will be saved and you will
still be asked to resume in a new conversation — option B only delays that.
══════════════════════════════════════
Recommendation: A — <next phase> is long; fresh context helps
══════════════════════════════════════
```
Individual boundaries MAY override this template with a flow-specific Choose block when the pause has different semantics (e.g., `meta-repo.md` Step 2 Config Review pauses for human review of a config flag, not for context freshness). The flow file is authoritative for any such override.
+107
View File
@@ -0,0 +1,107 @@
---
name: autopilot
description: |
Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
Detects current project state from _docs/ folder, resumes from where it left off, and flows through
problem → research → plan → decompose → implement → deploy without manual skill invocation.
Maximizes work per conversation by auto-transitioning between skills.
Trigger phrases:
- "autopilot", "auto", "start", "continue"
- "what's next", "where am I", "project status"
category: meta
tags: [orchestrator, workflow, auto-chain, state-machine, meta-skill]
disable-model-invocation: true
---
# Autopilot Orchestrator
Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autopilot` once — the engine handles sequencing, transitions, and re-entry.
## File Index
| File | Purpose |
|------|---------|
| `flows/greenfield.md` | Detection rules, step table, and auto-chain rules for new projects |
| `flows/existing-code.md` | Detection rules, step table, and auto-chain rules for existing codebases |
| `state.md` | State file format, rules, re-entry protocol, session boundaries |
| `protocols.md` | User interaction, Jira MCP auth, choice format, error handling, status summary |
**On every invocation**: read all four files above before executing any logic.
## Core Principles
- **Auto-chain**: when a skill completes, immediately start the next one — no pause between skills
- **Only pause at decision points**: BLOCKING gates inside sub-skills are the natural pause points; do not add artificial stops between steps
- **State from disk**: all progress is persisted to `_docs/_autopilot_state.md` and cross-checked against `_docs/` folder structure
- **Rich re-entry**: on every invocation, read the state file for full context before continuing
- **Delegate, don't duplicate**: read and execute each sub-skill's SKILL.md; never inline their logic here
- **Sound on pause**: follow `.cursor/rules/human-attention-sound.mdc` — play a notification sound before every pause that requires human input
- **Minimize interruptions**: only ask the user when the decision genuinely cannot be resolved automatically
- **Single project per workspace**: all `_docs/` paths are relative to workspace root; for monorepos, each service needs its own Cursor workspace
## Flow Resolution
Determine which flow to use:
1. If workspace has source code files **and** `_docs/` does not exist → **existing-code flow** (Pre-Step detection)
2. If `_docs/_autopilot_state.md` exists and records Document in `Completed Steps`**existing-code flow**
3. If `_docs/_autopilot_state.md` exists and `step: done` AND workspace contains source code → **existing-code flow** (completed project re-entry — loops to New Task)
4. Otherwise → **greenfield flow**
After selecting the flow, apply its detection rules (first match wins) to determine the current step.
## Execution Loop
Every invocation follows this sequence:
```
1. Read _docs/_autopilot_state.md (if exists)
2. Read all File Index files above
3. Cross-check state file against _docs/ folder structure (rules in state.md)
4. Resolve flow (see Flow Resolution above)
5. Resolve current step (detection rules from the active flow file)
6. Present Status Summary (template in active flow file)
7. Execute:
a. Delegate to current skill (see Skill Delegation below)
b. If skill returns FAILED → apply Skill Failure Retry Protocol (see protocols.md):
- Auto-retry the same skill (failure may be caused by missing user input or environment issue)
- If 3 consecutive auto-retries fail → record in state file Blockers, warn user, stop auto-retry
c. When skill completes successfully → reset retry counter, update state file (rules in state.md)
d. Re-detect next step from the active flow's detection rules
e. If next skill is ready → auto-chain (go to 7a with next skill)
f. If session boundary reached → update state, suggest new conversation (rules in state.md)
g. If all steps done → update state → report completion
```
## Skill Delegation
For each step, the delegation pattern is:
1. Update state file: set `step` to the autopilot step number, status to `in_progress`, set `sub_step` to the sub-skill's current internal step/phase, reset `retry_count: 0`
2. Announce: "Starting [Skill Name]..."
3. Read the skill file: `.cursor/skills/[name]/SKILL.md`
4. Execute the skill's workflow exactly as written, including all BLOCKING gates, self-verification checklists, save actions, and escalation rules. Update `sub_step` in state each time the sub-skill advances.
5. If the skill **fails**: follow the Skill Failure Retry Protocol in `protocols.md` — increment `retry_count`, auto-retry up to 3 times, then escalate.
6. When complete (success): reset `retry_count: 0`, mark step `completed`, record date + key outcome, add key decisions to state file, return to auto-chain rules (from active flow file)
Do NOT modify, skip, or abbreviate any part of the sub-skill's workflow. The autopilot is a sequencer, not an optimizer.
## Trigger Conditions
This skill activates when the user wants to:
- Start a new project from scratch
- Continue an in-progress project
- Check project status
- Let the AI guide them through the full workflow
**Keywords**: "autopilot", "auto", "start", "continue", "what's next", "where am I", "project status"
**Differentiation**:
- User wants only research → use `/research` directly
- User wants only planning → use `/plan` directly
- User wants to document an existing codebase → use `/document` directly
- User wants the full guided workflow → use `/autopilot`
## Flow Reference
See `flows/greenfield.md` and `flows/existing-code.md` for step tables, detection rules, auto-chain rules, and status summary templates.
@@ -0,0 +1,234 @@
# Existing Code Workflow
Workflow for projects with an existing codebase. Starts with documentation, produces test specs, decomposes and implements tests, verifies them, refactors with that safety net, then adds new functionality and deploys.
## Step Reference Table
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 1 | Document | document/SKILL.md | Steps 18 |
| 2 | Test Spec | test-spec/SKILL.md | Phase 1a1b |
| 3 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
| 4 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 5 | Run Tests | test-run/SKILL.md | Steps 14 |
| 6 | Refactor | refactor/SKILL.md | Phases 05 (6-phase method) |
| 7 | New Task | new-task/SKILL.md | Steps 18 (loop) |
| 8 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 9 | Run Tests | test-run/SKILL.md | Steps 14 |
| 10 | Security Audit | security/SKILL.md | Phase 15 (optional) |
| 11 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
| 12 | Deploy | deploy/SKILL.md | Step 17 |
After Step 12, the existing-code workflow is complete.
## Detection Rules
Check rules in order — first match wins.
---
**Step 1 — Document**
Condition: `_docs/` does not exist AND the workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`, `src/`, `Cargo.toml`, `*.csproj`, `package.json`)
Action: An existing codebase without documentation was detected. Read and execute `.cursor/skills/document/SKILL.md`. After the document skill completes, re-detect state (the produced `_docs/` artifacts will place the project at Step 2 or later).
---
**Step 2 — Test Spec**
Condition: `_docs/02_document/FINAL_report.md` exists AND workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`) AND `_docs/02_document/tests/traceability-matrix.md` does not exist AND the autopilot state shows Document was run (check `Completed Steps` for "Document" entry)
Action: Read and execute `.cursor/skills/test-spec/SKILL.md`
This step applies when the codebase was documented via the `/document` skill. Test specifications must be produced before refactoring or further development.
---
**Step 3 — Decompose Tests**
Condition: `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND the autopilot state shows Document was run AND (`_docs/02_tasks/` does not exist or has no task files)
Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
1. Run Step 1t (test infrastructure bootstrap)
2. Run Step 3 (blackbox test task decomposition)
3. Run Step 4 (cross-verification against test coverage)
If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it.
---
**Step 4 — Implement Tests**
Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND the autopilot state shows Step 3 (Decompose Tests) is completed AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist
Action: Read and execute `.cursor/skills/implement/SKILL.md`
The implement skill reads test tasks from `_docs/02_tasks/` and implements them.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
---
**Step 5 — Run Tests**
Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state shows Step 4 (Implement Tests) is completed AND the autopilot state does NOT show Step 5 (Run Tests) as completed
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
Verifies the implemented test suite passes before proceeding to refactoring. The tests form the safety net for all subsequent code changes.
---
**Step 6 — Refactor**
Condition: the autopilot state shows Step 5 (Run Tests) is completed AND `_docs/04_refactoring/FINAL_report.md` does not exist
Action: Read and execute `.cursor/skills/refactor/SKILL.md`
The refactor skill runs the full 6-phase method using the implemented tests as a safety net.
If `_docs/04_refactoring/` has phase reports, the refactor skill detects completed phases and continues.
---
**Step 7 — New Task**
Condition: the autopilot state shows Step 6 (Refactor) is completed AND the autopilot state does NOT show Step 7 (New Task) as completed
Action: Read and execute `.cursor/skills/new-task/SKILL.md`
The new-task skill interactively guides the user through defining new functionality. It loops until the user is done adding tasks. New task files are written to `_docs/02_tasks/`.
---
**Step 8 — Implement**
Condition: the autopilot state shows Step 7 (New Task) is completed AND `_docs/03_implementation/` does not contain a FINAL report covering the new tasks (check state for distinction between test implementation and feature implementation)
Action: Read and execute `.cursor/skills/implement/SKILL.md`
The implement skill reads the new tasks from `_docs/02_tasks/` and implements them. Tasks already implemented in Step 4 are skipped (the implement skill tracks completed tasks in batch reports).
If `_docs/03_implementation/` has batch reports from this phase, the implement skill detects completed tasks and continues.
---
**Step 9 — Run Tests**
Condition: the autopilot state shows Step 8 (Implement) is completed AND the autopilot state does NOT show Step 9 (Run Tests) as completed
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
---
**Step 10 — Security Audit (optional)**
Condition: the autopilot state shows Step 9 (Run Tests) is completed AND the autopilot state does NOT show Step 10 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Run security audit before deploy?
══════════════════════════════════════
A) Run security audit (recommended for production deployments)
B) Skip — proceed directly to deploy
══════════════════════════════════════
Recommendation: A — catches vulnerabilities before production
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 11 (Performance Test).
- If user picks B → Mark Step 10 as `skipped` in the state file, auto-chain to Step 11 (Performance Test).
---
**Step 11 — Performance Test (optional)**
Condition: the autopilot state shows Step 10 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 11 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Run performance/load tests before deploy?
══════════════════════════════════════
A) Run performance tests (recommended for latency-sensitive or high-load systems)
B) Skip — proceed directly to deploy
══════════════════════════════════════
Recommendation: [A or B — base on whether acceptance criteria
include latency, throughput, or load requirements]
══════════════════════════════════════
```
- If user picks A → Run performance tests:
1. If `scripts/run-performance-tests.sh` exists (generated by the test-spec skill Phase 4), execute it
2. Otherwise, check if `_docs/02_document/tests/performance-tests.md` exists for test scenarios, detect appropriate load testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute performance test scenarios against the running system
3. Present results vs acceptance criteria thresholds
4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort
5. After completion, auto-chain to Step 12 (Deploy)
- If user picks B → Mark Step 11 as `skipped` in the state file, auto-chain to Step 12 (Deploy).
---
**Step 12 — Deploy**
Condition: the autopilot state shows Step 9 (Run Tests) is completed AND (Step 10 is completed or skipped) AND (Step 11 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Read and execute `.cursor/skills/deploy/SKILL.md`
After deployment completes, the existing-code workflow is done.
---
**Re-Entry After Completion**
Condition: the autopilot state shows `step: done` OR all steps through 12 (Deploy) are completed
Action: The project completed a full cycle. Present status and loop back to New Task:
```
══════════════════════════════════════
PROJECT CYCLE COMPLETE
══════════════════════════════════════
The previous cycle finished successfully.
You can now add new functionality.
══════════════════════════════════════
A) Add new features (start New Task)
B) Done — no more changes needed
══════════════════════════════════════
```
- If user picks A → set `step: 7`, `status: not_started` in the state file, then auto-chain to Step 7 (New Task). Previous cycle history stays in Completed Steps.
- If user picks B → report final project status and exit.
## Auto-Chain Rules
| Completed Step | Next Action |
|---------------|-------------|
| Document (1) | Auto-chain → Test Spec (2) |
| Test Spec (2) | Auto-chain → Decompose Tests (3) |
| Decompose Tests (3) | **Session boundary** — suggest new conversation before Implement Tests |
| Implement Tests (4) | Auto-chain → Run Tests (5) |
| Run Tests (5, all pass) | Auto-chain → Refactor (6) |
| Refactor (6) | Auto-chain → New Task (7) |
| New Task (7) | **Session boundary** — suggest new conversation before Implement |
| Implement (8) | Auto-chain → Run Tests (9) |
| Run Tests (9, all pass) | Auto-chain → Security Audit choice (10) |
| Security Audit (10, done or skipped) | Auto-chain → Performance Test choice (11) |
| Performance Test (11, done or skipped) | Auto-chain → Deploy (12) |
| Deploy (12) | **Workflow complete** — existing-code flow done |
## Status Summary Template
```
═══════════════════════════════════════════════════
AUTOPILOT STATUS (existing-code)
═══════════════════════════════════════════════════
Step 1 Document [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 2 Test Spec [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 3 Decompose Tests [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 4 Implement Tests [DONE / IN PROGRESS (batch M) / NOT STARTED / FAILED (retry N/3)]
Step 5 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 6 Refactor [DONE / IN PROGRESS (phase N) / NOT STARTED / FAILED (retry N/3)]
Step 7 New Task [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 8 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
Step 9 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 10 Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 11 Performance Test [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 12 Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
═══════════════════════════════════════════════════
Current: Step N — Name
SubStep: M — [sub-skill internal step name]
Retry: [N/3 if retrying, omit if 0]
Action: [what will happen next]
═══════════════════════════════════════════════════
```
@@ -0,0 +1,235 @@
# Greenfield Workflow
Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy.
## Step Reference Table
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 1 | Problem | problem/SKILL.md | Phase 14 |
| 2 | Research | research/SKILL.md | Mode A: Phase 14 · Mode B: Step 08 |
| 3 | Plan | plan/SKILL.md | Step 16 + Final |
| 4 | UI Design | ui-design/SKILL.md | Phase 08 (conditional — UI projects only) |
| 5 | Decompose | decompose/SKILL.md | Step 14 |
| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 7 | Run Tests | test-run/SKILL.md | Steps 14 |
| 8 | Security Audit | security/SKILL.md | Phase 15 (optional) |
| 9 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
| 10 | Deploy | deploy/SKILL.md | Step 17 |
## Detection Rules
Check rules in order — first match wins.
---
**Step 1 — Problem Gathering**
Condition: `_docs/00_problem/` does not exist, OR any of these are missing/empty:
- `problem.md`
- `restrictions.md`
- `acceptance_criteria.md`
- `input_data/` (must contain at least one file)
Action: Read and execute `.cursor/skills/problem/SKILL.md`
---
**Step 2 — Research (Initial)**
Condition: `_docs/00_problem/` is complete AND `_docs/01_solution/` has no `solution_draft*.md` files
Action: Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode A)
---
**Research Decision** (inline gate between Step 2 and Step 3)
Condition: `_docs/01_solution/` contains `solution_draft*.md` files AND `_docs/01_solution/solution.md` does not exist AND `_docs/02_document/architecture.md` does not exist
Action: Present the current research state to the user:
- How many solution drafts exist
- Whether tech_stack.md and security_analysis.md exist
- One-line summary from the latest draft
Then present using the **Choose format**:
```
══════════════════════════════════════
DECISION REQUIRED: Research complete — next action?
══════════════════════════════════════
A) Run another research round (Mode B assessment)
B) Proceed to planning with current draft
══════════════════════════════════════
Recommendation: [A or B] — [reason based on draft quality]
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode B)
- If user picks B → auto-chain to Step 3 (Plan)
---
**Step 3 — Plan**
Condition: `_docs/01_solution/` has `solution_draft*.md` files AND `_docs/02_document/architecture.md` does not exist
Action:
1. The plan skill's Prereq 2 will rename the latest draft to `solution.md` — this is handled by the plan skill itself
2. Read and execute `.cursor/skills/plan/SKILL.md`
If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FINAL_report.md`), the plan skill's built-in resumability handles it.
---
**Step 4 — UI Design (conditional)**
Condition: `_docs/02_document/architecture.md` exists AND the autopilot state does NOT show Step 4 (UI Design) as completed or skipped AND the project is a UI project
**UI Project Detection** — the project is a UI project if ANY of the following are true:
- `package.json` exists in the workspace root or any subdirectory
- `*.html`, `*.jsx`, `*.tsx` files exist in the workspace
- `_docs/02_document/components/` contains a component whose `description.md` mentions UI, frontend, page, screen, dashboard, form, or view
- `_docs/02_document/architecture.md` mentions frontend, UI layer, SPA, or client-side rendering
- `_docs/01_solution/solution.md` mentions frontend, web interface, or user-facing UI
If the project is NOT a UI project → mark Step 4 as `skipped` in the state file and auto-chain to Step 5.
If the project IS a UI project → present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: UI project detected — generate mockups?
══════════════════════════════════════
A) Generate UI mockups before decomposition (recommended)
B) Skip — proceed directly to decompose
══════════════════════════════════════
Recommendation: A — mockups before decomposition
produce better task specs for frontend components
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/ui-design/SKILL.md`. After completion, auto-chain to Step 5 (Decompose).
- If user picks B → Mark Step 4 as `skipped` in the state file, auto-chain to Step 5 (Decompose).
---
**Step 5 — Decompose**
Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/` does not exist or has no task files (excluding `_dependencies_table.md`)
Action: Read and execute `.cursor/skills/decompose/SKILL.md`
If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it.
---
**Step 6 — Implement**
Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist
Action: Read and execute `.cursor/skills/implement/SKILL.md`
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
---
**Step 7 — Run Tests**
Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state does NOT show Step 7 (Run Tests) as completed AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
---
**Step 8 — Security Audit (optional)**
Condition: the autopilot state shows Step 7 (Run Tests) is completed AND the autopilot state does NOT show Step 8 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Run security audit before deploy?
══════════════════════════════════════
A) Run security audit (recommended for production deployments)
B) Skip — proceed directly to deploy
══════════════════════════════════════
Recommendation: A — catches vulnerabilities before production
══════════════════════════════════════
```
- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 9 (Performance Test).
- If user picks B → Mark Step 8 as `skipped` in the state file, auto-chain to Step 9 (Performance Test).
---
**Step 9 — Performance Test (optional)**
Condition: the autopilot state shows Step 8 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 9 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Present using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Run performance/load tests before deploy?
══════════════════════════════════════
A) Run performance tests (recommended for latency-sensitive or high-load systems)
B) Skip — proceed directly to deploy
══════════════════════════════════════
Recommendation: [A or B — base on whether acceptance criteria
include latency, throughput, or load requirements]
══════════════════════════════════════
```
- If user picks A → Run performance tests:
1. If `scripts/run-performance-tests.sh` exists (generated by the test-spec skill Phase 4), execute it
2. Otherwise, check if `_docs/02_document/tests/performance-tests.md` exists for test scenarios, detect appropriate load testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute performance test scenarios against the running system
3. Present results vs acceptance criteria thresholds
4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort
5. After completion, auto-chain to Step 10 (Deploy)
- If user picks B → Mark Step 9 as `skipped` in the state file, auto-chain to Step 10 (Deploy).
---
**Step 10 — Deploy**
Condition: the autopilot state shows Step 7 (Run Tests) is completed AND (Step 8 is completed or skipped) AND (Step 9 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
Action: Read and execute `.cursor/skills/deploy/SKILL.md`
---
**Done**
Condition: `_docs/04_deploy/` contains all expected artifacts (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md)
Action: Report project completion with summary. If the user runs autopilot again after greenfield completion, Flow Resolution rule 3 routes to the existing-code flow (re-entry after completion) so they can add new features.
## Auto-Chain Rules
| Completed Step | Next Action |
|---------------|-------------|
| Problem (1) | Auto-chain → Research (2) |
| Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
| Research Decision → proceed | Auto-chain → Plan (3) |
| Plan (3) | Auto-chain → UI Design detection (4) |
| UI Design (4, done or skipped) | Auto-chain → Decompose (5) |
| Decompose (5) | **Session boundary** — suggest new conversation before Implement |
| Implement (6) | Auto-chain → Run Tests (7) |
| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) |
| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) |
| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) |
| Deploy (10) | Report completion |
## Status Summary Template
```
═══════════════════════════════════════════════════
AUTOPILOT STATUS (greenfield)
═══════════════════════════════════════════════════
Step 1 Problem [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 2 Research [DONE (N drafts) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 3 Plan [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 4 UI Design [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 5 Decompose [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 6 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
Step 7 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 8 Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 9 Performance Test [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 10 Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
═══════════════════════════════════════════════════
Current: Step N — Name
SubStep: M — [sub-skill internal step name]
Retry: [N/3 if retrying, omit if 0]
Action: [what will happen next]
═══════════════════════════════════════════════════
```
+314
View File
@@ -0,0 +1,314 @@
# Autopilot Protocols
## User Interaction Protocol
Every time the autopilot or a sub-skill needs a user decision, use the **Choose A / B / C / D** format. This applies to:
- State transitions where multiple valid next actions exist
- Sub-skill BLOCKING gates that require user judgment
- Any fork where the autopilot cannot confidently pick the right path
- Trade-off decisions (tech choices, scope, risk acceptance)
### When to Ask (MUST ask)
- The next action is ambiguous (e.g., "another research round or proceed?")
- The decision has irreversible consequences (e.g., architecture choices, skipping a step)
- The user's intent or preference cannot be inferred from existing artifacts
- A sub-skill's BLOCKING gate explicitly requires user confirmation
- Multiple valid approaches exist with meaningfully different trade-offs
### When NOT to Ask (auto-transition)
- Only one logical next step exists (e.g., Problem complete → Research is the only option)
- The transition is deterministic from the state (e.g., Plan complete → Decompose)
- The decision is low-risk and reversible
- Existing artifacts or prior decisions already imply the answer
### Choice Format
Always present decisions in this format:
```
══════════════════════════════════════
DECISION REQUIRED: [brief context]
══════════════════════════════════════
A) [Option A — short description]
B) [Option B — short description]
C) [Option C — short description, if applicable]
D) [Option D — short description, if applicable]
══════════════════════════════════════
Recommendation: [A/B/C/D] — [one-line reason]
══════════════════════════════════════
```
Rules:
1. Always provide 24 concrete options (never open-ended questions)
2. Always include a recommendation with a brief justification
3. Keep option descriptions to one line each
4. If only 2 options make sense, use A/B only — do not pad with filler options
5. Play the notification sound (per `human-attention-sound.mdc`) before presenting the choice
6. Record every user decision in the state file's `Key Decisions` section
7. After the user picks, proceed immediately — no follow-up confirmation unless the choice was destructive
## Work Item Tracker Authentication
Several workflow steps create work items (epics, tasks, links). The system supports **Jira MCP** and **Azure DevOps MCP** as interchangeable backends. Detect which is configured by listing available MCP servers.
### Tracker Detection
1. Check for available MCP servers: Jira MCP (`user-Jira-MCP-Server`) or Azure DevOps MCP (`user-AzureDevops`)
2. If both are available, ask the user which to use (Choose format)
3. Record the choice in the state file: `tracker: jira` or `tracker: ado`
4. If neither is available, set `tracker: local` and proceed without external tracking
### Steps That Require Work Item Tracker
| Flow | Step | Sub-Step | Tracker Action |
|------|------|----------|----------------|
| greenfield | 3 (Plan) | Step 6 — Epics | Create epics for each component |
| greenfield | 5 (Decompose) | Step 13 — All tasks | Create ticket per task, link to epic |
| existing-code | 3 (Decompose Tests) | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
| existing-code | 7 (New Task) | Step 7 — Ticket | Create ticket per task, link to epic |
### Authentication Gate
Before entering a step that requires work item tracking (see table above) for the first time, the autopilot must:
1. Call `mcp_auth` on the detected tracker's MCP server
2. If authentication succeeds → proceed normally
3. If the user **skips** or authentication fails → present using Choose format:
```
══════════════════════════════════════
Tracker authentication failed
══════════════════════════════════════
A) Retry authentication (retry mcp_auth)
B) Continue without tracker (tasks saved locally only)
══════════════════════════════════════
Recommendation: A — Tracker IDs drive task referencing,
dependency tracking, and implementation batching.
Without tracker, task files use numeric prefixes instead.
══════════════════════════════════════
```
If user picks **B** (continue without tracker):
- Set a flag in the state file: `tracker: local`
- All skills that would create tickets instead save metadata locally in the task/epic files with `Tracker: pending` status
- Task files keep numeric prefixes (e.g., `01_initial_structure.md`) instead of tracker ID prefixes
- The workflow proceeds normally in all other respects
### Re-Authentication
If the tracker MCP was already authenticated in a previous invocation (verify by listing available tools beyond `mcp_auth`), skip the auth gate.
## Error Handling
All error situations that require user input MUST use the **Choose A / B / C / D** format.
| Situation | Action |
|-----------|--------|
| State detection is ambiguous (artifacts suggest two different steps) | Present findings and use Choose format with the candidate steps as options |
| Sub-skill fails or hits an unrecoverable blocker | Use Choose format: A) retry, B) skip with warning, C) abort and fix manually |
| User wants to skip a step | Use Choose format: A) skip (with dependency warning), B) execute the step |
| User wants to go back to a previous step | Use Choose format: A) re-run (with overwrite warning), B) stay on current step |
| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution |
## Skill Failure Retry Protocol
Sub-skills can return a **failed** result. Failures are often caused by missing user input, environment issues, or transient errors that resolve on retry. The autopilot auto-retries before escalating.
### Retry Flow
```
Skill execution → FAILED
├─ retry_count < 3 ?
│ YES → increment retry_count in state file
│ → log failure reason in state file (Retry Log section)
│ → re-read the sub-skill's SKILL.md
│ → re-execute from the current sub_step
│ → (loop back to check result)
│ NO (retry_count = 3) →
│ → set status: failed in Current Step
│ → add entry to Blockers section:
│ "[Skill Name] failed 3 consecutive times at sub_step [M].
│ Last failure: [reason]. Auto-retry exhausted."
│ → present warning to user (see Escalation below)
│ → do NOT auto-retry again until user intervenes
```
### Retry Rules
1. **Auto-retry immediately**: when a skill fails, retry it without asking the user — the failure is often transient (missing user confirmation in a prior step, docker not running, file lock, etc.)
2. **Preserve sub_step**: retry from the last recorded `sub_step`, not from the beginning of the skill — unless the failure indicates corruption, in which case restart from sub_step 1
3. **Increment `retry_count`**: update `retry_count` in the state file's `Current Step` section on each retry attempt
4. **Log each failure**: append the failure reason and timestamp to the state file's `Retry Log` section
5. **Reset on success**: when the skill eventually succeeds, reset `retry_count: 0` and clear the `Retry Log` for that step
### Escalation (after 3 consecutive failures)
After 3 failed auto-retries of the same skill, the failure is likely not user-related. Stop retrying and escalate:
1. Update the state file:
- Set `status: failed` in `Current Step`
- Set `retry_count: 3`
- Add a blocker entry describing the repeated failure
2. Play notification sound (per `human-attention-sound.mdc`)
3. Present using Choose format:
```
══════════════════════════════════════
SKILL FAILED: [Skill Name] — 3 consecutive failures
══════════════════════════════════════
Step: [N] — [Name]
SubStep: [M] — [sub-step name]
Last failure reason: [reason]
══════════════════════════════════════
A) Retry with fresh context (new conversation)
B) Skip this step with warning
C) Abort — investigate and fix manually
══════════════════════════════════════
Recommendation: A — fresh context often resolves
persistent failures
══════════════════════════════════════
```
### Re-Entry After Failure
On the next autopilot invocation (new conversation), if the state file shows `status: failed` and `retry_count: 3`:
- Present the blocker to the user before attempting execution
- If the user chooses to retry → reset `retry_count: 0`, set `status: in_progress`, and re-execute
- If the user chooses to skip → mark step as `skipped`, proceed to next step
- Do NOT silently auto-retry — the user must acknowledge the persistent failure first
## Error Recovery Protocol
### Stuck Detection
When executing a sub-skill, monitor for these signals:
- Same artifact overwritten 3+ times without meaningful change
- Sub-skill repeatedly asks the same question after receiving an answer
- No new artifacts saved for an extended period despite active execution
### Recovery Actions (ordered)
1. **Re-read state**: read `_docs/_autopilot_state.md` and cross-check against `_docs/` folders
2. **Retry current sub-step**: re-read the sub-skill's SKILL.md and restart from the current sub-step
3. **Escalate**: after 2 failed retries, present diagnostic summary to user using Choose format:
```
══════════════════════════════════════
RECOVERY: [skill name] stuck at [sub-step]
══════════════════════════════════════
A) Retry with fresh context (new conversation)
B) Skip this sub-step with warning
C) Abort and fix manually
══════════════════════════════════════
Recommendation: A — fresh context often resolves stuck loops
══════════════════════════════════════
```
### Circuit Breaker
If the same autopilot step fails 3 consecutive times across conversations:
- Record the failure pattern in the state file's `Blockers` section
- Do NOT auto-retry on next invocation
- Present the blocker and ask user for guidance before attempting again
## Context Management Protocol
### Principle
Disk is memory. Never rely on in-context accumulation — read from `_docs/` artifacts, not from conversation history.
### Minimal Re-Read Set Per Skill
When re-entering a skill (new conversation or context refresh):
- Always read: `_docs/_autopilot_state.md`
- Always read: the active skill's `SKILL.md`
- Conditionally read: only the `_docs/` artifacts the current sub-step requires (listed in each skill's Context Resolution section)
- Never bulk-read: do not load all `_docs/` files at once
### Mid-Skill Interruption
If context is filling up during a long skill (e.g., document, implement):
1. Save current sub-step progress to the skill's artifact directory
2. Update `_docs/_autopilot_state.md` with exact sub-step position
3. Suggest a new conversation: "Context is getting long — recommend continuing in a fresh conversation for better results"
4. On re-entry, the skill's resumability protocol picks up from the saved sub-step
### Large Artifact Handling
When a skill needs to read large files (e.g., full solution.md, architecture.md):
- Read only the sections relevant to the current sub-step
- Use search tools (Grep, SemanticSearch) to find specific sections rather than reading entire files
- Summarize key decisions from prior steps in the state file so they don't need to be re-read
### Context Budget Heuristic
Agents cannot programmatically query context window usage. Use these heuristics to avoid degradation:
| Zone | Indicators | Action |
|------|-----------|--------|
| **Safe** | State file + SKILL.md + 23 focused artifacts loaded | Continue normally |
| **Caution** | 5+ artifacts loaded, or 3+ large files (architecture, solution, discovery), or conversation has 20+ tool calls | Complete current sub-step, then suggest session break |
| **Danger** | Repeated truncation in tool output, tool calls failing unexpectedly, responses becoming shallow or repetitive | Save immediately, update state file, force session boundary |
**Skill-specific guidelines**:
| Skill | Recommended session breaks |
|-------|---------------------------|
| **document** | After every ~5 modules in Step 1; between Step 4 (Verification) and Step 5 (Solution Extraction) |
| **implement** | Each batch is a natural checkpoint; if more than 2 batches completed in one session, suggest break |
| **plan** | Between Step 5 (Test Specifications) and Step 6 (Epics) for projects with many components |
| **research** | Between Mode A rounds; between Mode A and Mode B |
**How to detect caution/danger zone without API**:
1. Count tool calls made so far — if approaching 20+, context is likely filling up
2. If reading a file returns truncated content, context is under pressure
3. If the agent starts producing shorter or less detailed responses than earlier in the conversation, context quality is degrading
4. When in doubt, save and suggest a new conversation — re-entry is cheap thanks to the state file
## Rollback Protocol
### Implementation Steps (git-based)
Handled by `/implement` skill — each batch commit is a rollback checkpoint via `git revert`.
### Planning/Documentation Steps (artifact-based)
For steps that produce `_docs/` artifacts (problem, research, plan, decompose, document):
1. **Before overwriting**: if re-running a step that already has artifacts, the sub-skill's prerequisite check asks the user (resume/overwrite/skip)
2. **Rollback to previous step**: use Choose format:
```
══════════════════════════════════════
ROLLBACK: Re-run [step name]?
══════════════════════════════════════
A) Re-run the step (overwrites current artifacts)
B) Stay on current step
══════════════════════════════════════
Warning: This will overwrite files in _docs/[folder]/
══════════════════════════════════════
```
3. **Git safety net**: artifacts are committed with each autopilot step completion. To roll back: `git log --oneline _docs/` to find the commit, then `git checkout <commit> -- _docs/<folder>/`
4. **State file rollback**: when rolling back artifacts, also update `_docs/_autopilot_state.md` to reflect the rolled-back step (set it to `in_progress`, clear completed date)
## Status Summary
On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback). Use the Status Summary Template from the active flow file (`flows/greenfield.md` or `flows/existing-code.md`).
For re-entry (state file exists), also include:
- Key decisions from the state file's `Key Decisions` section
- Last session context from the `Last Session` section
- Any blockers from the `Blockers` section
+122
View File
@@ -0,0 +1,122 @@
# Autopilot State Management
## State File: `_docs/_autopilot_state.md`
The autopilot persists its state to `_docs/_autopilot_state.md`. This file is the primary source of truth for re-entry. Folder scanning is the fallback when the state file doesn't exist.
### Format
```markdown
# Autopilot State
## Current Step
flow: [greenfield | existing-code]
step: [1-10 for greenfield, 1-12 for existing-code, or "done"]
name: [step name from the active flow's Step Reference Table]
status: [not_started / in_progress / completed / skipped / failed]
sub_step: [optional — sub-skill internal step number + name if interrupted mid-step]
retry_count: [0-3 — number of consecutive auto-retry attempts for current step, reset to 0 on success]
When updating `Current Step`, always write it as:
flow: existing-code ← active flow
step: N ← autopilot step (sequential integer)
sub_step: M ← sub-skill's own internal step/phase number + name
retry_count: 0 ← reset on new step or success; increment on each failed retry
Example:
flow: greenfield
step: 3
name: Plan
status: in_progress
sub_step: 4 — Architecture Review & Risk Assessment
retry_count: 0
Example (failed after 3 retries):
flow: existing-code
step: 2
name: Test Spec
status: failed
sub_step: 1b — Test Case Generation
retry_count: 3
## Completed Steps
| Step | Name | Completed | Key Outcome |
|------|------|-----------|-------------|
| 1 | [name] | [date] | [one-line summary] |
| 2 | [name] | [date] | [one-line summary] |
| ... | ... | ... | ... |
## Key Decisions
- [decision 1: e.g. "Tech stack: Python + Rust for perf-critical, Postgres DB"]
- [decision N]
## Last Session
date: [date]
ended_at: Step [N] [Name] — SubStep [M] [sub-step name]
reason: [completed step / session boundary / user paused / context limit]
notes: [any context for next session]
## Retry Log
| Attempt | Step | Name | SubStep | Failure Reason | Timestamp |
|---------|------|------|---------|----------------|-----------|
| 1 | [step] | [name] | [sub_step] | [reason] | [date-time] |
| ... | ... | ... | ... | ... | ... |
(Clear this table when the step succeeds or user resets. Append a row on each failed auto-retry.)
## Blockers
- [blocker 1, if any]
- [none]
```
### State File Rules
1. **Create** the state file on the very first autopilot invocation (after state detection determines Step 1)
2. **Update** the state file after every step completion, every session boundary, every BLOCKING gate confirmation, and every failed retry attempt
3. **Read** the state file as the first action on every invocation — before folder scanning
4. **Cross-check**: after reading the state file, verify against actual `_docs/` folder contents. If they disagree (e.g., state file says Step 3 but `_docs/02_document/architecture.md` already exists), trust the folder structure and update the state file to match
5. **Never delete** the state file. It accumulates history across the entire project lifecycle
6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` when the step succeeds or the user manually resets. If `retry_count` reaches 3, set `status: failed` and add an entry to `Blockers`
7. **Failed state on re-entry**: if the state file shows `status: failed` with `retry_count: 3`, do NOT auto-retry — present the blocker to the user and wait for their decision before proceeding
## State Detection
Read `_docs/_autopilot_state.md` first. If it exists and is consistent with the folder structure, use the `Current Step` from the state file. If the state file doesn't exist or is inconsistent, fall back to folder scanning.
### Folder Scan Rules (fallback)
Scan `_docs/` to determine the current workflow position. The detection rules are defined in each flow file (`flows/greenfield.md` and `flows/existing-code.md`). Check the existing-code flow first (Step 1 detection), then greenfield flow rules. First match wins.
## Re-Entry Protocol
When the user invokes `/autopilot` and work already exists:
1. Read `_docs/_autopilot_state.md`
2. Cross-check against `_docs/` folder structure
3. Present Status Summary with context from state file (key decisions, last session, blockers)
4. If the detected step has a sub-skill with built-in resumability (plan, decompose, implement, deploy all do), the sub-skill handles mid-step recovery
5. Continue execution from detected state
## Session Boundaries
After any decompose/planning step completes, **do not auto-chain to implement**. Instead:
1. Update state file: mark the step as completed, set current step to the next implement step with status `not_started`
- Existing-code flow: After Step 3 (Decompose Tests) → set current step to 4 (Implement Tests)
- Existing-code flow: After Step 7 (New Task) → set current step to 8 (Implement)
- Greenfield flow: After Step 5 (Decompose) → set current step to 6 (Implement)
2. Write `Last Session` section: `reason: session boundary`, `notes: Decompose complete, implementation ready`
3. Present a summary: number of tasks, estimated batches, total complexity points
4. Use Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Decompose complete — start implementation?
══════════════════════════════════════
A) Start a new conversation for implementation (recommended for context freshness)
B) Continue implementation in this conversation
══════════════════════════════════════
Recommendation: A — implementation is the longest phase, fresh context helps
══════════════════════════════════════
```
These are the only hard session boundaries. All other transitions auto-chain.
+7 -74
View File
@@ -2,7 +2,7 @@
name: code-review
description: |
Multi-phase code review against task specs with structured findings output.
7-phase workflow: context loading, spec compliance, code quality, security quick-scan, performance scan, cross-task consistency, architecture compliance.
6-phase workflow: context loading, spec compliance, code quality, security quick-scan, performance scan, cross-task consistency.
Produces a structured report with severity-ranked findings and a PASS/FAIL/PASS_WITH_WARNINGS verdict.
Invoked by /implement skill after each batch, or manually.
Trigger phrases:
@@ -27,7 +27,7 @@ Multi-phase code review that verifies implementation against task specs, checks
## Input
- List of task spec files that were just implemented (paths to `[TRACKER-ID]_[short_name].md`)
- List of task spec files that were just implemented (paths to `[JIRA-ID]_[short_name].md`)
- Changed files (detected via `git diff` or provided by the `/implement` skill)
- Project context: `_docs/00_problem/restrictions.md`, `_docs/01_solution/solution.md`
@@ -50,18 +50,6 @@ For each task, verify implementation satisfies every acceptance criterion:
- Flag any AC that is not demonstrably satisfied as a **Spec-Gap** finding (severity: High)
- Flag any scope creep (implementation beyond what the spec asked for) as a **Scope** finding (severity: Low)
**Contract verification** (for shared-models / shared-API tasks — any task with a `## Contract` section):
- Verify the referenced contract file exists at the stated path under `_docs/02_document/contracts/`.
- Verify the implementation's public signatures (types, method shapes, endpoint paths, error variants) match the contract's **Shape** section.
- Verify invariants from the contract's **Invariants** section are enforced in code (either structurally via types or via runtime checks with tests).
- If the implementation and the contract disagree, emit a **Spec-Gap** finding (High severity) and note which side is drifting.
**Consumer-side contract verification** (for tasks whose Dependencies list a contract file):
- Verify the consumer's imports and call sites match the contract's Shape.
- If they diverge, emit a **Spec-Gap** finding (High severity) with a hint that the consumer, the contract, or the producer is drifting.
## Phase 3: Code Quality Review
Check implemented code against quality standards:
@@ -104,59 +92,6 @@ When multiple tasks were implemented in the same batch:
- Shared code is not duplicated across task implementations
- Dependencies declared in task specs are properly wired
## Phase 7: Architecture Compliance
Verify the implemented code respects the architecture documented in `_docs/02_document/architecture.md`, the component boundaries declared in `_docs/02_document/module-layout.md`, and the **accepted Architectural Decision Records** under `_docs/02_document/adr/`.
**Inputs**:
- `_docs/02_document/architecture.md` — layering, allowed dependencies, patterns
- `_docs/02_document/module-layout.md` — per-component directories, Public API surface, `Imports from` lists, Allowed Dependencies table
- `_docs/02_document/adr/` — every `Status: Accepted` ADR is an enforceable structural rule. `Status: Proposed`, `Status: Deprecated`, and `Status: Superseded` ADRs are NOT enforced (Proposed = not yet ratified; Deprecated/Superseded = a later ADR overturned it). If the directory does not exist or has only the index file, ADRs are skipped — log this skip in the report so the absence is visible.
- The cumulative list of changed files (for per-batch invocation) or the full codebase (for baseline invocation)
**Checks**:
1. **Layer direction**: for each import in a changed file, resolve the importer's layer (from the Allowed Dependencies table) and the importee's layer. Flag any import where the importee's layer is strictly higher than the importer's. Severity: High. Category: Architecture.
2. **Public API respect**: for each cross-component import, verify the imported symbol lives in the target component's Public API file list (from `module-layout.md`). Importing an internal file of another component is an Architecture finding. Severity: High.
3. **No new cyclic module dependencies**: build a module-level import graph of the changed files plus their direct dependencies. Flag any new cycle introduced by this batch. Severity: Critical (cycles are structurally hard to undo once wired). Category: Architecture.
4. **Duplicate symbols across components**: scan changed files for class, function, or constant names that also appear in another component's code AND do not share an interface. If a shared abstraction was expected (via cross-cutting epic or shared/*), flag it. Severity: High. Category: Architecture.
5. **Cross-cutting concerns not locally re-implemented**: if a file under a component directory contains logic that should live in `shared/<concern>/` (e.g., custom logging setup, config loader, error envelope), flag it. Severity: Medium. Category: Architecture.
6. **ADR compliance**: for each `Status: Accepted` ADR, confirm the changed code does not contradict the ADR's `Decision`. Two failure modes are flagged:
- **ADR-Violation**: the changed code does the opposite of an Accepted ADR's `Decision`. Example: ADR-002 says "We will use Postgres for transactional data" and the changed code introduces a SQLite dependency for a transactional path. Severity: **Critical**. Category: Architecture. The finding cites the ADR by `NNN_<slug>` and the offending file/line.
- **ADR-Drift**: the changed code does something the ADR did not anticipate AND that materially affects the ADR's `Consequences` (positive or negative). Example: ADR-004 says "Event-driven cross-component comms" and a changed file introduces a new synchronous HTTP call between two components. Severity: **High**. Category: Architecture. The finding either proposes "Update ADR-NNN to acknowledge the new pattern" or "Remove the drift to align with ADR-NNN" — never silently accepts.
The check skips ADRs that are explicitly out of scope of the changed batch (e.g., ADR-001 about deployment pipeline when the batch only touches business-logic files). Use the ADR's `Evidence` section to determine scope: if no Evidence path overlaps with any changed file, skip the ADR for this batch.
**Detection approach (per language)**:
- Python: parse `import` / `from ... import` statements; optionally AST with `ast` module for reliable symbol resolution.
- TypeScript / JavaScript: parse `import ... from '...'` and `require('...')`; resolve via `tsconfig.json` paths.
- C#: parse `using` directives and fully-qualified type references; respect `.csproj` ProjectReference layering.
- Rust: parse `use <crate>::` and `mod` declarations; respect `Cargo.toml` workspace members.
- Go: parse `import` blocks; respect module path ownership.
If a static analyzer tool is available on the project (ArchUnit, NsDepCop, tach, eslint-plugin-boundaries, etc.), prefer invoking it and parsing its output over hand-rolled analysis.
**Invocation modes**:
- **Full mode** (default when invoked by the implement skill per batch): all 7 phases run.
- **Baseline mode**: Phase 1 + Phase 7 only. Used for one-time architecture scan of an existing codebase (see existing-code flow Step 2 — Architecture Baseline Scan). Produces `_docs/02_document/architecture_compliance_baseline.md` instead of a batch review report.
- **Cumulative mode**: all 7 phases on the union of changed files since the last cumulative review. Used mid-implementation (see implement skill Step 14.5).
**Baseline delta** (cumulative mode + full mode, when `_docs/02_document/architecture_compliance_baseline.md` exists):
After the seven phases produce the current Architecture findings list, partition those findings against the baseline:
- **Carried over**: a finding whose `(file, category, rule)` triple matches an entry in the baseline. Not new; still present.
- **Resolved**: a baseline entry whose `(file, category, rule)` triple is NOT in the current findings AND whose target file is in scope of this review. The team fixed it.
- **Newly introduced**: a current finding that was not in the baseline. The review cycle created this.
Emit a `## Baseline Delta` section in the report with three tables (Carried over, Resolved, Newly introduced) and per-category counts. The verdict logic does not change — Critical / High still drive FAIL. The delta is additional signal for the user and feeds the retrospective's structural metrics.
## Output Format
Produce a structured report with findings deduplicated and sorted by severity:
@@ -201,9 +136,7 @@ Produce a structured report with findings deduplicated and sorted by severity:
## Category Values
Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope, Architecture
`Architecture` findings come from Phase 7. They indicate layering violations, Public API bypasses, new cyclic dependencies, duplicate symbols, cross-cutting concerns re-implemented locally, **ADR-Violation** (changed code contradicts an `Accepted` ADR's Decision — Critical), or **ADR-Drift** (changed code introduces a pattern that materially affects an `Accepted` ADR's Consequences without superseding it — High).
Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope
## Verdict Logic
@@ -215,7 +148,7 @@ Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope, Architectur
The `/implement` skill invokes this skill after each batch completes:
1. Collects changed files from all tasks implemented in the batch
1. Collects changed files from all implementer agents in the batch
2. Passes task spec paths + changed files to this skill
3. If verdict is FAIL — presents findings to user (BLOCKING), user fixes or confirms
4. If verdict is PASS or PASS_WITH_WARNINGS — proceeds automatically (findings shown as info)
@@ -226,8 +159,8 @@ The `/implement` skill invokes this skill after each batch completes:
| Input | Type | Source | Required |
|-------|------|--------|----------|
| `task_specs` | list of file paths | Task `.md` files from `_docs/02_tasks/todo/` for the current batch | Yes |
| `changed_files` | list of file paths | Files modified by the tasks in the batch (from `git diff`) | Yes |
| `task_specs` | list of file paths | Task `.md` files from `_docs/02_tasks/` for the current batch | Yes |
| `changed_files` | list of file paths | Files modified by implementer agents (from `git diff` or agent reports) | Yes |
| `batch_number` | integer | Current batch number (for report naming) | Yes |
| `project_restrictions` | file path | `_docs/00_problem/restrictions.md` | If exists |
| `solution_overview` | file path | `_docs/01_solution/solution.md` | If exists |
@@ -238,7 +171,7 @@ The implement skill invokes code-review by:
1. Reading `.cursor/skills/code-review/SKILL.md`
2. Providing the inputs above as context (read the files, pass content to the review phases)
3. Executing all 7 phases sequentially
3. Executing all 6 phases sequentially
4. Consuming the verdict from the output
### Outputs (returned to the implement skill)
+220 -111
View File
@@ -2,79 +2,64 @@
name: decompose
description: |
Decompose planned components into atomic implementable tasks with bootstrap structure plan.
Workflow entrypoints: implementation task decomposition, single component decomposition, and tests-only decomposition.
The invoking flow decides which entrypoint to run; this skill executes that selected sequence.
4-step workflow: bootstrap structure plan, component task decomposition, blackbox test task decomposition, and cross-task verification.
Supports full decomposition (_docs/ structure), single component mode, and tests-only mode.
Trigger phrases:
- "decompose", "decompose features", "feature decomposition"
- "task decomposition", "break down components"
- "prepare for implementation"
- "decompose tests", "test decomposition"
category: build
tags: [decomposition, tasks, dependencies, work-items, implementation-prep]
tags: [decomposition, tasks, dependencies, jira, implementation-prep]
disable-model-invocation: true
---
# Task Decomposition
Decompose planned components into atomic, implementable task specs with a bootstrap structure plan through a systematic workflow. All tasks are named with their work item tracker ID prefix in a flat directory.
Decompose planned components into atomic, implementable task specs with a bootstrap structure plan through a systematic workflow. All tasks are named with their Jira ticket ID prefix in a flat directory.
## Core Principles
- **Atomic tasks**: each task does one thing; if it exceeds 5 complexity points, split it
- **Behavioral specs, not implementation plans**: describe what the system should do, not how to build it
- **Flat structure**: all tasks are tracker-ID-prefixed files in TASKS_DIR — no component subdirectories
- **Flat structure**: all tasks are Jira-ID-prefixed files in TASKS_DIR — no component subdirectories
- **Save immediately**: write artifacts to disk after each task; never accumulate unsaved work
- **Tracker inline**: create work item ticket immediately after writing each task file
- **Jira inline**: create Jira ticket immediately after writing each task file
- **Ask, don't assume**: when requirements are ambiguous, ask the user before proceeding
- **Plan, don't code**: this workflow produces documents and work item tickets, never implementation code
- **Plan, don't code**: this workflow produces documents and Jira tasks, never implementation code
## Context Resolution
Resolve the selected entrypoint from the invocation context before any other logic runs. The caller decides whether this is implementation, single component, or tests-only decomposition; this skill only executes the selected sequence.
**Implementation task decomposition** (default; selected by flows before invoking this skill):
Determine the operating mode based on invocation before any other logic runs.
**Default** (no explicit input file provided):
- DOCUMENT_DIR: `_docs/02_document/`
- TASKS_DIR: `_docs/02_tasks/`
- TASKS_TODO: `_docs/02_tasks/todo/`
- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, DOCUMENT_DIR
- Produces only implementation tasks. Blackbox/e2e test task files are produced only when the invoking flow selects tests-only decomposition.
- Runs Step 1 (bootstrap) + Step 2 (all components) + Step 3 (blackbox tests) + Step 4 (cross-verification)
**Single component mode** (provided file is within `_docs/02_document/` and inside a `components/` subdirectory):
- DOCUMENT_DIR: `_docs/02_document/`
- TASKS_DIR: `_docs/02_tasks/`
- TASKS_TODO: `_docs/02_tasks/todo/`
- Derive component number and component name from the file path
- Ask user for the parent Epic ID
- Runs Step 2 (that component only, appending to existing task numbering)
**Tests-only mode** (provided file/directory is within `tests/`, or `DOCUMENT_DIR/tests/` exists and input explicitly requests test decomposition):
- DOCUMENT_DIR: `_docs/02_document/`
- TASKS_DIR: `_docs/02_tasks/`
- TASKS_TODO: `_docs/02_tasks/todo/`
- TESTS_DIR: `DOCUMENT_DIR/tests/`
- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, TESTS_DIR
- Runs Step 1t (test infrastructure bootstrap) + Step 3 (blackbox test decomposition) + Step 4 (cross-verification against test coverage)
- Skips Step 1 (project bootstrap) and Step 2 (component decomposition) — the codebase already exists
Announce the selected entrypoint and resolved paths to the user before proceeding.
### Step Applicability by Mode
| Step | File | Implementation | Single | Tests-only |
|------|------|:--------------:|:------:|:----------:|
| 1 Bootstrap Structure | `steps/01_bootstrap-structure.md` | ✓ | — | — |
| 1t Test Infrastructure | `steps/01t_test-infrastructure.md` | — | — | ✓ |
| 1.5 Module Layout | `steps/01-5_module-layout.md` | ✓ | — | — |
| 1.7 System-Pipeline Tasks | `steps/01-7_system-pipeline-tasks.md` | ✓ | — | — |
| 2 Task Decomposition | `steps/02_task-decomposition.md` | ✓ | ✓ | — |
| 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | — | — | ✓ |
| 4 Cross-Verification | `steps/04_cross-verification.md` | ✓ | — | ✓ |
Announce the detected mode and resolved paths to the user before proceeding.
## Input Specification
### Required Files
**Implementation task decomposition:**
**Default:**
| File | Purpose |
|------|---------|
@@ -82,11 +67,10 @@ Announce the selected entrypoint and resolved paths to the user before proceedin
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
| `_docs/01_solution/solution.md` | Finalized solution |
| `DOCUMENT_DIR/architecture.md` | Architecture from plan/document skill (must contain a `## Architecture Vision` H2 — confirmed user intent) |
| `DOCUMENT_DIR/glossary.md` | Project terminology (confirmed by user in plan Phase 2a.0 or document Step 4.5). Use it to keep task names, component references, and AC wording consistent with the user's vocabulary |
| `DOCUMENT_DIR/architecture.md` | Architecture from plan skill |
| `DOCUMENT_DIR/system-flows.md` | System flows from plan skill |
| `DOCUMENT_DIR/components/[##]_[name]/description.md` | Component specs from plan skill |
| `DOCUMENT_DIR/tests/` | Optional product acceptance context from test-spec skill; do not create test task files from it in this entrypoint |
| `DOCUMENT_DIR/tests/` | Blackbox test specs from plan skill |
**Single component mode:**
@@ -113,22 +97,19 @@ Announce the selected entrypoint and resolved paths to the user before proceedin
### Prerequisite Checks (BLOCKING)
**Implementation task decomposition:**
**Default:**
1. DOCUMENT_DIR contains `architecture.md` and `components/`**STOP if missing**
2. Create TASKS_DIR and TASKS_TODO if they do not exist
3. If TASKS_DIR subfolders (`todo/`, `backlog/`, `done/`) already contain task files, ask user: **resume from last checkpoint or start fresh?**
2. Create TASKS_DIR if it does not exist
3. If TASKS_DIR already contains task files, ask user: **resume from last checkpoint or start fresh?**
**Single component mode:**
1. The provided component file exists and is non-empty — **STOP if missing**
**Tests-only mode:**
1. `TESTS_DIR/blackbox-tests.md` exists and is non-empty — **STOP if missing**
2. `TESTS_DIR/environment.md` exists — **STOP if missing**
3. Create TASKS_DIR and TASKS_TODO if they do not exist
4. If TASKS_DIR subfolders (`todo/`, `backlog/`, `done/`) already contain task files, ask user: **resume from last checkpoint or start fresh?**
3. Create TASKS_DIR if it does not exist
4. If TASKS_DIR already contains task files, ask user: **resume from last checkpoint or start fresh?**
## Artifact Management
@@ -136,92 +117,223 @@ Announce the selected entrypoint and resolved paths to the user before proceedin
```
TASKS_DIR/
├── _dependencies_table.md
├── todo/
├── [TRACKER-ID]_initial_structure.md
│ ├── [TRACKER-ID]_[short_name].md
│ └── ...
├── backlog/
└── done/
├── [JIRA-ID]_initial_structure.md
├── [JIRA-ID]_[short_name].md
├── [JIRA-ID]_[short_name].md
├── ...
└── _dependencies_table.md
```
**Naming convention**: Each task file is initially saved in `TASKS_TODO/` with a temporary numeric prefix (`[##]_[short_name].md`). After creating the work item ticket, rename the file to use the work item ticket ID as prefix (`[TRACKER-ID]_[short_name].md`). For example: `todo/01_initial_structure.md``todo/AZ-42_initial_structure.md`.
If tracker availability fails, follow `.cursor/rules/tracker.mdc` before continuing. Only when the user explicitly chooses `tracker: local` may the numeric prefix remain; in that mode set `Tracker: pending` and `Epic: pending` in the task header and keep the task eligible for later tracker sync.
**Naming convention**: Each task file is initially saved with a temporary numeric prefix (`[##]_[short_name].md`). After creating the Jira ticket, rename the file to use the Jira ticket ID as prefix (`[JIRA-ID]_[short_name].md`). For example: `01_initial_structure.md``AZ-42_initial_structure.md`.
### Save Timing
| Step | Save immediately after | Filename |
|------|------------------------|----------|
| Step 1 | Bootstrap structure plan complete + work item ticket created + file renamed | `todo/[TRACKER-ID]_initial_structure.md` |
| Step 1.5 | Module layout written | `_docs/02_document/module-layout.md` |
| Step 1t | Test infrastructure bootstrap complete + work item ticket created + file renamed | `todo/[TRACKER-ID]_test_infrastructure.md` |
| Step 2 | Each component task decomposed + work item ticket created + file renamed | `todo/[TRACKER-ID]_[short_name].md` |
| Step 3 | Each blackbox test task decomposed + work item ticket created + file renamed | `todo/[TRACKER-ID]_[short_name].md` |
| Step 1 | Bootstrap structure plan complete + Jira ticket created + file renamed | `[JIRA-ID]_initial_structure.md` |
| Step 1t | Test infrastructure bootstrap complete + Jira ticket created + file renamed | `[JIRA-ID]_test_infrastructure.md` |
| Step 2 | Each component task decomposed + Jira ticket created + file renamed | `[JIRA-ID]_[short_name].md` |
| Step 3 | Each blackbox test task decomposed + Jira ticket created + file renamed | `[JIRA-ID]_[short_name].md` |
| Step 4 | Cross-task verification complete | `_dependencies_table.md` |
### Resumability
If TASKS_DIR subfolders already contain task files:
If TASKS_DIR already contains task files:
1. List existing `*_*.md` files across `todo/`, `backlog/`, and `done/` (excluding `_dependencies_table.md`) and count them
2. Resume numbering from the next number (for temporary numeric prefix before tracker rename)
1. List existing `*_*.md` files (excluding `_dependencies_table.md`) and count them
2. Resume numbering from the next number (for temporary numeric prefix before Jira rename)
3. Inform the user which tasks already exist and are being skipped
## Progress Tracking
At the start of execution, create a TodoWrite with all applicable steps for the selected entrypoint (see Step Applicability table). Update status as each step/component completes.
At the start of execution, create a TodoWrite with all applicable steps. Update status as each step/component completes.
## Workflow
### Step 1: Bootstrap Structure Plan (implementation mode only)
Read and follow `steps/01_bootstrap-structure.md`.
---
### Step 1t: Test Infrastructure Bootstrap (tests-only mode only)
Read and follow `steps/01t_test-infrastructure.md`.
**Role**: Professional Quality Assurance Engineer
**Goal**: Produce `01_test_infrastructure.md` — the first task describing the test project scaffold
**Constraints**: This is a plan document, not code. The `/implement` skill executes it.
1. Read `TESTS_DIR/environment.md` and `TESTS_DIR/test-data.md`
2. Read problem.md, restrictions.md, acceptance_criteria.md for domain context
3. Document the test infrastructure plan using `templates/test-infrastructure-task.md`
The test infrastructure bootstrap must include:
- Test project folder layout (`e2e/` directory structure)
- Mock/stub service definitions for each external dependency
- `docker-compose.test.yml` structure from environment.md
- Test runner configuration (framework, plugins, fixtures)
- Test data fixture setup from test-data.md seed data sets
- Test reporting configuration (format, output path)
- Data isolation strategy
**Self-verification**:
- [ ] Every external dependency from environment.md has a mock service defined
- [ ] Docker Compose structure covers all services from environment.md
- [ ] Test data fixtures cover all seed data sets from test-data.md
- [ ] Test runner configuration matches the consumer app tech stack from environment.md
- [ ] Data isolation strategy is defined
**Save action**: Write `01_test_infrastructure.md` (temporary numeric name)
**Jira action**: Create a Jira ticket for this task under the "Blackbox Tests" epic. Write the Jira ticket ID and Epic ID back into the task header.
**Rename action**: Rename the file from `01_test_infrastructure.md` to `[JIRA-ID]_test_infrastructure.md`. Update the **Task** field inside the file to match the new filename.
**BLOCKING**: Present test infrastructure plan summary to user. Do NOT proceed until user confirms.
---
### Step 1.5: Module Layout (implementation mode only)
### Step 1: Bootstrap Structure Plan (default mode only)
Read and follow `steps/01-5_module-layout.md`.
**Role**: Professional software architect
**Goal**: Produce `01_initial_structure.md` — the first task describing the project skeleton
**Constraints**: This is a plan document, not code. The `/implement` skill executes it.
1. Read architecture.md, all component specs, system-flows.md, data_model.md, and `deployment/` from DOCUMENT_DIR
2. Read problem, solution, and restrictions from `_docs/00_problem/` and `_docs/01_solution/`
3. Research best implementation patterns for the identified tech stack
4. Document the structure plan using `templates/initial-structure-task.md`
The bootstrap structure plan must include:
- Project folder layout with all component directories
- Shared models, interfaces, and DTOs
- Dockerfile per component (multi-stage, non-root, health checks, pinned base images)
- `docker-compose.yml` for local development (all components + database + dependencies)
- `docker-compose.test.yml` for blackbox test environment (blackbox test runner)
- `.dockerignore`
- CI/CD pipeline file (`.github/workflows/ci.yml` or `azure-pipelines.yml`) with stages from `deployment/ci_cd_pipeline.md`
- Database migration setup and initial seed data scripts
- Observability configuration: structured logging setup, health check endpoints (`/health/live`, `/health/ready`), metrics endpoint (`/metrics`)
- Environment variable documentation (`.env.example`)
- Test structure with unit and blackbox test locations
**Self-verification**:
- [ ] All components have corresponding folders in the layout
- [ ] All inter-component interfaces have DTOs defined
- [ ] Dockerfile defined for each component
- [ ] `docker-compose.yml` covers all components and dependencies
- [ ] `docker-compose.test.yml` enables blackbox testing
- [ ] CI/CD pipeline file defined with lint, test, security, build, deploy stages
- [ ] Database migration setup included
- [ ] Health check endpoints specified for each service
- [ ] Structured logging configuration included
- [ ] `.env.example` with all required environment variables
- [ ] Environment strategy covers dev, staging, production
- [ ] Test structure includes unit and blackbox test locations
**Save action**: Write `01_initial_structure.md` (temporary numeric name)
**Jira action**: Create a Jira ticket for this task under the "Bootstrap & Initial Structure" epic. Write the Jira ticket ID and Epic ID back into the task header.
**Rename action**: Rename the file from `01_initial_structure.md` to `[JIRA-ID]_initial_structure.md` (e.g., `AZ-42_initial_structure.md`). Update the **Task** field inside the file to match the new filename.
**BLOCKING**: Present structure plan summary to user. Do NOT proceed until user confirms.
---
### Step 1.7: System-Pipeline Tasks (implementation mode only)
### Step 2: Task Decomposition (default and single component modes)
Read and follow `steps/01-7_system-pipeline-tasks.md`.
**Role**: Professional software architect
**Goal**: Decompose each component into atomic, implementable task specs — numbered sequentially starting from 02
**Constraints**: Behavioral specs only — describe what, not how. No implementation code.
This step exists because per-component task decomposition (Step 2)
produces one task per component but NEVER produces a task whose
deliverable is "the production code that drives the end-to-end
pipeline by calling each component in order against real inputs".
The architecture document describes the loop; nobody owns it. The
GPS-passthrough incident (May 2026) is the canonical failure this
step prevents.
**Numbering**: Tasks are numbered sequentially across all components in dependency order. Start from 02 (01 is initial_structure). In single component mode, start from the next available number in TASKS_DIR.
**Component ordering**: Process components in dependency order — foundational components first (shared models, database), then components that depend on them.
For each component (or the single provided component):
1. Read the component's `description.md` and `tests.md` (if available)
2. Decompose into atomic tasks; create only 1 task if the component is simple or atomic
3. Split into multiple tasks only when it is necessary and would be easier to implement
4. Do not create tasks for other components — only tasks for the current component
5. Each task should be atomic, containing 0 APIs or a list of semantically connected APIs
6. Write each task spec using `templates/task.md`
7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
8. Note task dependencies (referencing Jira IDs of already-created dependency tasks, e.g., `AZ-42_initial_structure`)
9. **Immediately after writing each task file**: create a Jira ticket, link it to the component's epic, write the Jira ticket ID and Epic ID back into the task header, then rename the file from `[##]_[short_name].md` to `[JIRA-ID]_[short_name].md`.
**Self-verification** (per component):
- [ ] Every task is atomic (single concern)
- [ ] No task exceeds 5 complexity points
- [ ] Task dependencies reference correct Jira IDs
- [ ] Tasks cover all interfaces defined in the component spec
- [ ] No tasks duplicate work from other components
- [ ] Every task has a Jira ticket linked to the correct epic
**Save action**: Write each `[##]_[short_name].md` (temporary numeric name), create Jira ticket inline, then rename the file to `[JIRA-ID]_[short_name].md`. Update the **Task** field inside the file to match the new filename. Update **Dependencies** references in the file to use Jira IDs of the dependency tasks.
---
### Step 2: Task Decomposition (implementation and single component modes)
### Step 3: Blackbox Test Task Decomposition (default and tests-only modes)
Read and follow `steps/02_task-decomposition.md`.
**Role**: Professional Quality Assurance Engineer
**Goal**: Decompose blackbox test specs into atomic, implementable task specs
**Constraints**: Behavioral specs only — describe what, not how. No test code.
**Numbering**:
- In default mode: continue sequential numbering from where Step 2 left off.
- In tests-only mode: start from 02 (01 is the test infrastructure bootstrap from Step 1t).
1. Read all test specs from `DOCUMENT_DIR/tests/` (`blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, `resource-limit-tests.md`)
2. Group related test scenarios into atomic tasks (e.g., one task per test category or per component under test)
3. Each task should reference the specific test scenarios it implements and the environment/test-data specs
4. Dependencies:
- In default mode: blackbox test tasks depend on the component implementation tasks they exercise
- In tests-only mode: blackbox test tasks depend on the test infrastructure bootstrap task (Step 1t)
5. Write each task spec using `templates/task.md`
6. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
7. Note task dependencies (referencing Jira IDs of already-created dependency tasks)
8. **Immediately after writing each task file**: create a Jira ticket under the "Blackbox Tests" epic, write the Jira ticket ID and Epic ID back into the task header, then rename the file from `[##]_[short_name].md` to `[JIRA-ID]_[short_name].md`.
**Self-verification**:
- [ ] Every scenario from `tests/blackbox-tests.md` is covered by a task
- [ ] Every scenario from `tests/performance-tests.md`, `tests/resilience-tests.md`, `tests/security-tests.md`, and `tests/resource-limit-tests.md` is covered by a task
- [ ] No task exceeds 5 complexity points
- [ ] Dependencies correctly reference the dependency tasks (component tasks in default mode, test infrastructure in tests-only mode)
- [ ] Every task has a Jira ticket linked to the "Blackbox Tests" epic
**Save action**: Write each `[##]_[short_name].md` (temporary numeric name), create Jira ticket inline, then rename to `[JIRA-ID]_[short_name].md`.
---
### Step 3: Blackbox Test Task Decomposition (tests-only mode only)
### Step 4: Cross-Task Verification (default and tests-only modes)
Read and follow `steps/03_blackbox-test-decomposition.md`.
**Role**: Professional software architect and analyst
**Goal**: Verify task consistency and produce `_dependencies_table.md`
**Constraints**: Review step — fix gaps found, do not add new tasks
1. Verify task dependencies across all tasks are consistent
2. Check no gaps:
- In default mode: every interface in architecture.md has tasks covering it
- In tests-only mode: every test scenario in `traceability-matrix.md` is covered by a task
3. Check no overlaps: tasks don't duplicate work
4. Check no circular dependencies in the task graph
5. Produce `_dependencies_table.md` using `templates/dependencies-table.md`
**Self-verification**:
Default mode:
- [ ] Every architecture interface is covered by at least one task
- [ ] No circular dependencies in the task graph
- [ ] Cross-component dependencies are explicitly noted in affected task specs
- [ ] `_dependencies_table.md` contains every task with correct dependencies
Tests-only mode:
- [ ] Every test scenario from traceability-matrix.md "Covered" entries has a corresponding task
- [ ] No circular dependencies in the task graph
- [ ] Test task dependencies reference the test infrastructure bootstrap
- [ ] `_dependencies_table.md` contains every task with correct dependencies
**Save action**: Write `_dependencies_table.md`
**BLOCKING**: Present dependency summary to user. Do NOT proceed until user confirms.
---
### Step 4: Cross-Task Verification (implementation and tests-only modes)
Read and follow `steps/04_cross-verification.md`.
## Common Mistakes
- **Coding during decomposition**: this workflow produces specs, never code
@@ -230,9 +342,9 @@ Read and follow `steps/04_cross-verification.md`.
- **Cross-component tasks**: each task belongs to exactly one component
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Creating git branches**: branch creation is an implementation concern, not a decomposition one
- **Creating component subdirectories**: all tasks go flat in `TASKS_TODO/`
- **Forgetting tracker**: every task must have a work item ticket created inline — do not defer to a separate step
- **Forgetting to rename**: after work item ticket creation, always rename the file from numeric prefix to tracker ID prefix
- **Creating component subdirectories**: all tasks go flat in TASKS_DIR
- **Forgetting Jira**: every task must have a Jira ticket created inline — do not defer to a separate step
- **Forgetting to rename**: after Jira ticket creation, always rename the file from numeric prefix to Jira ID prefix
## Escalation Rules
@@ -242,7 +354,7 @@ Read and follow `steps/04_cross-verification.md`.
| Task complexity exceeds 5 points after splitting | ASK user |
| Missing component specs in DOCUMENT_DIR | ASK user |
| Cross-component dependency conflict | ASK user |
| Tracker epic not found for a component | ASK user for Epic ID |
| Jira epic not found for a component | ASK user for Epic ID |
| Task naming | PROCEED, confirm at next BLOCKING gate |
## Methodology Quick Reference
@@ -251,30 +363,27 @@ Read and follow `steps/04_cross-verification.md`.
┌────────────────────────────────────────────────────────────────┐
│ Task Decomposition (Multi-Mode) │
├────────────────────────────────────────────────────────────────┤
│ CONTEXT: Invoke the selected entrypoint (implementation / single / tests-only)
IMPLEMENTATION TASK DECOMPOSITION:
│ 1. Bootstrap Structure → steps/01_bootstrap-structure.md │
[BLOCKING: user confirms structure] │
1.5 Module Layout → steps/01-5_module-layout.md
[BLOCKING: user confirms layout]
1.7 System-Pipeline → steps/01-7_system-pipeline-tasks.md
[BLOCKING: user confirms pipeline owners]
2. Component Tasks → steps/02_task-decomposition.md
│ 4. Cross-Verification → steps/04_cross-verification.md │
│ [BLOCKING: user confirms dependencies] │
│ │
│ CONTEXT: Resolve mode (default / single component / tests-only)│
│ │
DEFAULT MODE:
│ 1. Bootstrap Structure [JIRA-ID]_initial_structure.md │
│ [BLOCKING: user confirms structure] │
2. Component Tasks → [JIRA-ID]_[short_name].md each
3. Blackbox Tests → [JIRA-ID]_[short_name].md each
4. Cross-Verification → _dependencies_table.md
│ [BLOCKING: user confirms dependencies]
│ TESTS-ONLY MODE: │
│ 1t. Test Infrastructure → steps/01t_test-infrastructure.md
[BLOCKING: user confirms test scaffold] │
│ 3. Blackbox Tests → steps/03_blackbox-test-decomposition.md
│ 4. Cross-Verification → steps/04_cross-verification.md
[BLOCKING: user confirms dependencies] │
│ 1t. Test Infrastructure [JIRA-ID]_test_infrastructure.md │
│ [BLOCKING: user confirms test scaffold] │
│ 3. Blackbox Tests [JIRA-ID]_[short_name].md each
│ 4. Cross-Verification _dependencies_table.md
│ [BLOCKING: user confirms dependencies] │
│ │
│ SINGLE COMPONENT MODE: │
│ 2. Component Tasks → steps/02_task-decomposition.md
│ 2. Component Tasks [JIRA-ID]_[short_name].md each
├────────────────────────────────────────────────────────────────┤
│ Principles: Atomic tasks · Behavioral specs · Flat structure │
Tracker inline · Rename to tracker ID · Save now · Ask don't assume│
Jira inline · Rename to Jira ID · Save now · Ask don't assume│
└────────────────────────────────────────────────────────────────┘
```
@@ -1,39 +0,0 @@
# Step 1.5: Module Layout (default mode only)
**Role**: Professional software architect
**Goal**: Produce `_docs/02_document/module-layout.md` — the authoritative file-ownership map used by the implement skill. Separates **behavioral** task specs (no file paths) from **structural** file mapping (no behavior).
**Constraints**: Follow the target language's standard project-layout conventions. Do not invent non-standard directory structures.
## Steps
1. Detect the target language from `DOCUMENT_DIR/architecture.md` and the bootstrap structure plan produced in Step 1.
2. Apply the language's conventional layout (see table in `templates/module-layout.md`):
- Python → `src/<pkg>/<component>/`
- C# → `src/<Component>/`
- Rust → `crates/<component>/`
- TypeScript / React → `src/<component>/` with `index.ts` barrel
- Go → `internal/<component>/` or `pkg/<component>/`
3. Each component owns ONE top-level directory. Shared code goes under `<root>/shared/` (or language equivalent).
4. Public API surface = files in the layout's `public:` list for each component; everything else is internal and MUST NOT be imported from other components.
5. Cross-cutting concerns (logging, error handling, config, telemetry, auth middleware, feature flags, i18n) each get ONE entry under Shared / Cross-Cutting; per-component tasks consume them (see Step 2 cross-cutting rule).
6. **ADR cross-check**: if `_docs/02_document/adr/` exists, read every `Status: Accepted` ADR. For each, confirm the proposed module layout does not contradict the ADR's `Decision` (e.g., an ADR mandating an event-bus boundary between two components must show up as a `Imports from` exclusion in the layout; an ADR locking a layering style must show up in the Layering table). If an ADR conflicts with the language-conventional layout from step 2, the ADR wins — record the conflict in a `## ADR-driven exceptions to the conventional layout` section of `module-layout.md` with `See ADR NNN_<slug>` references. If the ADR conflict is irreconcilable (the ADR demands something the language genuinely cannot express), STOP and ask the user A/B/C: (A) update the ADR via plan Step 4.5 supersede flow, (B) accept a layered exception with documented rationale, (C) re-open architecture.
7. Write `_docs/02_document/module-layout.md` using `templates/module-layout.md` format. Each Per-Component Mapping entry that is governed by an ADR includes a trailing `> See ADR NNN_<slug>` line.
## Self-verification
- [ ] Every component in `DOCUMENT_DIR/components/` has a Per-Component Mapping entry
- [ ] Every shared / cross-cutting concern has a Shared section entry
- [ ] Layering table covers every component (shared at the bottom)
- [ ] No component's `Imports from` list points at a higher layer
- [ ] Paths follow the detected language's convention
- [ ] No two components own overlapping paths
- [ ] If `_docs/02_document/adr/` exists with Accepted ADRs, every layout decision that an ADR governs has a trailing `> See ADR NNN_<slug>` reference
- [ ] No Accepted ADR is contradicted by the layout without a documented exception
## Save action
Write `_docs/02_document/module-layout.md`.
## Blocking
**BLOCKING**: Present layout summary to user. Do NOT proceed to Step 2 until user confirms. The implement skill depends on this file; inconsistencies here cause file-ownership conflicts at batch time.
@@ -1,72 +0,0 @@
# Step 1.7: System-Pipeline Tasks (implementation mode only)
**Role**: Professional software architect, integration-focused.
**Goal**: For every end-to-end pipeline named in `_docs/02_document/architecture.md` and `_docs/02_document/system-flows.md`, ensure there is exactly ONE explicit task that owns the production code that drives that pipeline against real inputs. This step prevents the failure mode where every individual component is "complete" but no production code wires them together (May 2026 GPS-passthrough incident — see `meta-rule.mdc` "When a test reveals missing production code").
**Constraints**:
- This step produces *integration* tasks, not per-component tasks. Per-component tasks come from Step 2.
- An integration task's owner is typically the composition root, runtime root, main loop, or whichever component the module layout (Step 1.5) names as the "system spine". It is NEVER a leaf component.
- Each integration task must be sized at 5 points or fewer. If the pipeline is too large for one task, split it into per-stage integration tasks (e.g. "wire ingress → C1", then "wire C1 → C5") rather than one giant task.
## Inputs
| File | Purpose |
|------|---------|
| `_docs/02_document/architecture.md` | Source of named end-to-end pipelines and their component sequences |
| `_docs/02_document/system-flows.md` | Source of operational flows (per-frame loop, request lifecycle, batch job, etc.) |
| `_docs/02_document/module-layout.md` | Produced by Step 1.5. Names the "system spine" component(s) — typically `runtime_root`, `app`, `main`, `composition`, or equivalent. |
| `_docs/02_document/components/*/description.md` | Per-component contracts so you can tell which side of a seam each method lives on |
## Steps
1. **Enumerate end-to-end pipelines.** Read `architecture.md` and `system-flows.md`. For each named pipeline / flow that spans 2+ components, record:
- The pipeline name (e.g. "per-frame nav loop", "tile-cache build", "operator pre-flight verification").
- The ordered sequence of components it touches (e.g. `frame_source → c1_vio → c2_vpr → ... → c5_state → replay_sink`).
- The trigger (per-frame, per-request, scheduled, manual).
- The output (what the pipeline emits and to whom).
2. **For each pipeline, locate the owner.** Use `module-layout.md` to find the component that owns the orchestration (the "spine"). If `module-layout.md` does not name one, STOP and ASK the user which component owns the pipeline. Do NOT silently default to the bootstrap structure task — bootstrap is about project skeleton, not behavior.
3. **Check whether the pipeline is already covered by an existing task spec or by the bootstrap-structure task.** A pipeline is "covered" only if:
- A task spec's `Outcome` or `Acceptance Criteria` section explicitly names "drives the {pipeline_name} end-to-end against real production components", AND
- That task's owned files include the orchestration code (typically the spine component's main loop / entrypoint).
4. **For every uncovered pipeline, create a system-integration task spec** in `_docs/02_tasks/todo/` using `.cursor/skills/decompose/templates/task.md`:
- **Component**: the spine component from step 2 (e.g. `runtime_root`).
- **Outcome**: the production callsite that drives the pipeline exists and runs end-to-end on real inputs.
- **Scope / Included**: the orchestration code (loop body, dispatcher, scheduler, entrypoint); explicit list of every component it must call in order; the data type at each seam.
- **Acceptance Criteria** (write each as testable):
- At least one production caller of every component method in the pipeline can be found by grep — name the methods explicitly.
- The orchestration runs against the real production component instances (NOT mocks, NOT a passthrough that bypasses them).
- At least one integration test exercises the orchestration end-to-end against real inputs.
- **Dependencies**: every per-component task whose component appears in the pipeline.
- **Complexity points**: ≤5; split the pipeline if it doesn't fit.
- **Tracker**: create a ticket immediately (per `decompose/SKILL.md` "Tracker inline" principle); rename the file to `[TRACKER-ID]_pipeline_<name>.md`.
5. **Mark the integration task as `Dependencies` for the integration test task.** If `tests-only` decomposition has already produced an e2e/integration test task for this pipeline, append the new integration task to its `Dependencies` field so the test cannot be "made green" before the integration ships.
## Anti-patterns this step explicitly blocks
- **"compose_root returns a wired runtime"** prose interpreted as "the loop exists". Composition assembles the graph; it is NOT the loop. The loop is the code that pulls inputs, drives each node, and emits outputs. If grep finds zero callers of the leaf components, the loop does not exist regardless of what compose_root does.
- **Treating the bootstrap-structure task as the home of the main loop.** Bootstrap is project skeleton (package layout, CLI scaffold, build files). It is NOT the main loop. Main loop is its own task.
- **Per-component tasks claiming integration scope.** A C1 VIO task's deliverable is "C1 works in isolation against unit tests". A C1 task's acceptance criteria MUST NOT include "C1 is wired into the runtime" — that's the integration task's job.
## Self-verification
- [ ] Every pipeline named in `architecture.md` / `system-flows.md` is listed in your enumeration.
- [ ] Every enumerated pipeline either (a) has an existing covered task, or (b) has a new integration task in `todo/`.
- [ ] No integration task exceeds 5 complexity points.
- [ ] Every integration task names every component in the pipeline as a `Dependencies` entry.
- [ ] No integration task is owned by a leaf component — every owner is named in `module-layout.md` as a spine / orchestrator.
- [ ] Every integration task has a tracker ticket created and the filename renamed to `[TRACKER-ID]_pipeline_<name>.md`.
## Save action
Write the new integration task files into `_docs/02_tasks/todo/`. They will be picked up by Step 2 (Task Decomposition's dependency-table writer) and by Step 4 (Cross-Verification).
## Blocking
**BLOCKING**: Present the pipeline enumeration + the list of new integration tasks to the user. Do NOT proceed to Step 2 until the user confirms:
- The enumeration matches what they expect from the architecture documents.
- Every uncovered pipeline now has an integration task.
- The chosen spine owners are correct.
If the user identifies a pipeline you missed, add it before proceeding. If the user names a different spine owner, update the task and re-run self-verification.
@@ -1,57 +0,0 @@
# Step 1: Bootstrap Structure Plan (default mode only)
**Role**: Professional software architect
**Goal**: Produce `01_initial_structure.md` — the first task describing the project skeleton.
**Constraints**: This is a plan document, not code. The `/implement` skill executes it.
## Steps
1. Read `architecture.md`, all component specs, `system-flows.md`, `data_model.md`, and `deployment/` from DOCUMENT_DIR
2. Read problem, solution, and restrictions from `_docs/00_problem/` and `_docs/01_solution/`
3. Research best implementation patterns for the identified tech stack
4. Document the structure plan using `templates/initial-structure-task.md`
The bootstrap structure plan must include:
- Project folder layout with all component directories
- Shared models, interfaces, and DTOs
- Dockerfile per component (multi-stage, non-root, health checks, pinned base images)
- `docker-compose.yml` for local development (all components + database + dependencies)
- `docker-compose.test.yml` for blackbox test environment (blackbox test runner)
- `.dockerignore`
- CI/CD pipeline file (`.github/workflows/ci.yml` or `azure-pipelines.yml`) with stages from `deployment/ci_cd_pipeline.md`
- Database migration setup and initial seed data scripts
- Observability configuration: structured logging setup, health check endpoints (`/health/live`, `/health/ready`), metrics endpoint (`/metrics`)
- Environment variable documentation (`.env.example`)
- Test structure with unit and blackbox test locations
## Self-verification
- [ ] All components have corresponding folders in the layout
- [ ] All inter-component interfaces have DTOs defined
- [ ] Dockerfile defined for each component
- [ ] `docker-compose.yml` covers all components and dependencies
- [ ] `docker-compose.test.yml` enables blackbox testing
- [ ] CI/CD pipeline file defined with lint, test, security, build, deploy stages
- [ ] Database migration setup included
- [ ] Health check endpoints specified for each service
- [ ] Structured logging configuration included
- [ ] `.env.example` with all required environment variables
- [ ] Environment strategy covers dev, staging, production
- [ ] Test structure includes unit and blackbox test locations
## Save action
Write `todo/01_initial_structure.md` (temporary numeric name).
## Tracker action
Create a work item ticket for this task under the "Bootstrap & Initial Structure" epic. Write the work item ticket ID and Epic ID back into the task header.
## Rename action
Rename the file from `todo/01_initial_structure.md` to `todo/[TRACKER-ID]_initial_structure.md` (e.g., `todo/AZ-42_initial_structure.md`). Update the **Task** field inside the file to match the new filename.
## Blocking
**BLOCKING**: Present structure plan summary to user. Do NOT proceed until user confirms.
@@ -1,45 +0,0 @@
# Step 1t: Test Infrastructure Bootstrap (tests-only mode only)
**Role**: Professional Quality Assurance Engineer
**Goal**: Produce `01_test_infrastructure.md` — the first task describing the test project scaffold.
**Constraints**: This is a plan document, not code. The `/implement` skill executes it.
## Steps
1. Read `TESTS_DIR/environment.md` and `TESTS_DIR/test-data.md`
2. Read `problem.md`, `restrictions.md`, `acceptance_criteria.md` for domain context
3. Document the test infrastructure plan using `templates/test-infrastructure-task.md`
The test infrastructure bootstrap must include:
- Test project folder layout (`e2e/` directory structure)
- Mock/stub service definitions for each external dependency
- `docker-compose.test.yml` structure from `environment.md`
- Test runner configuration (framework, plugins, fixtures)
- Test data fixture setup from `test-data.md` seed data sets
- Test reporting configuration (format, output path)
- Data isolation strategy
## Self-verification
- [ ] Every external dependency from `environment.md` has a mock service defined
- [ ] Docker Compose structure covers all services from `environment.md`
- [ ] Test data fixtures cover all seed data sets from `test-data.md`
- [ ] Test runner configuration matches the consumer app tech stack from `environment.md`
- [ ] Data isolation strategy is defined
## Save action
Write `todo/01_test_infrastructure.md` (temporary numeric name).
## Tracker action
Create a work item ticket for this task under the "Blackbox Tests" epic. Write the work item ticket ID and Epic ID back into the task header.
## Rename action
Rename the file from `todo/01_test_infrastructure.md` to `todo/[TRACKER-ID]_test_infrastructure.md`. Update the **Task** field inside the file to match the new filename.
## Blocking
**BLOCKING**: Present test infrastructure plan summary to user. Do NOT proceed until user confirms.
@@ -1,75 +0,0 @@
# Step 2: Task Decomposition (default and single component modes)
**Role**: Professional software architect
**Goal**: Decompose each component into atomic, implementable task specs — numbered sequentially starting from 02.
**Constraints**: Behavioral specs only — describe what, not how. No implementation code.
## Numbering
Tasks are numbered sequentially across all components in dependency order. Start from 02 (01 is `initial_structure`). In single component mode, start from the next available number in TASKS_DIR.
## Component ordering
Process components in dependency order — foundational components first (shared models, database), then components that depend on them.
## Consult LESSONS.md once at the start of Step 2
If `_docs/LESSONS.md` exists, read it and note `estimation`, `architecture`, or `dependencies` lessons that may bias task sizing in this pass (e.g., "auth-related changes historically take 2x estimate" → bump any auth task up one complexity tier). Apply the bias when filling the Complexity field in step 7 below. Record which lessons informed estimation in a comment in `_dependencies_table.md` (Step 4).
## Steps
For each component (or the single provided component):
1. Read the component's `description.md` and `tests.md` (if available)
2. Decompose into atomic tasks; create only 1 task if the component is simple or atomic
3. Split into multiple tasks only when it is necessary and would be easier to implement
4. Do not create tasks for other components — only tasks for the current component
5. Each task should be atomic, containing 1 API or a list of semantically connected APIs
6. Write each task spec using `templates/task.md`
7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
8. Note task dependencies (referencing tracker IDs of already-created dependency tasks, e.g., `AZ-42_initial_structure`)
9. **Cross-cutting rule**: if a concern spans ≥2 components (logging, config loading, auth/authZ, error envelope, telemetry, feature flags, i18n), create ONE shared task under the cross-cutting epic. Per-component tasks declare it as a dependency and consume it; they MUST NOT re-implement it locally. Duplicate local implementations are an `Architecture` finding (High) in code-review Phase 7 and a `Maintainability` finding in Phase 6.
10. **Shared-models / shared-API rule**: classify the task as shared if ANY of the following is true:
- The component is listed under `shared/*` in `module-layout.md`.
- The task's Scope.Included mentions "public interface", "DTO", "schema", "event", "contract", "API endpoint", or "shared model".
- The task is parented to a cross-cutting epic.
- The task is depended on by ≥2 other tasks across different components.
For every shared task:
- Produce a contract file at `_docs/02_document/contracts/<component>/<name>.md` using `templates/api-contract.md`. Fill Shape, Invariants, Non-Goals, Versioning Rules, and at least 3 Test Cases.
- Add a mandatory `## Contract` section to the task spec pointing at the contract file.
- For every consuming task, add the contract path to its `## Dependencies` section as a document dependency (separate from task dependencies).
Consumers read the contract file, not the producer's task spec. This prevents interface drift when the producer's implementation detail leaks into consumers.
11. **Immediately after writing each task file**: create a work item ticket, link it to the component's epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.
## Runtime Completeness Decomposition Gate
Before Step 2 is considered complete, scan `architecture.md`, `system-flows.md`, component descriptions, and the solution for named internal runtime capabilities and dependencies. Examples include BASALT/OpenVINS/Kimera, FAISS, DINOv2, ONNX/TensorRT, ALIKED/DISK, LightGlue, RANSAC, PostGIS, MAVLink emission, FDR rollover, and any "A-Z" user-visible pipeline.
For every named internal capability:
1. Ensure at least one implementation task explicitly owns the production integration or production algorithm.
2. Do not treat "define protocol", "create adapter boundary", "add deterministic fallback", "create scaffold", or "prepare native bridge" as implementation of the capability unless the architecture explicitly says the real capability is out of scope.
3. If a capability needs external hardware/data to verify, still create the production implementation task. Verification may be hardware-gated later; implementation must not be omitted.
4. Add a `## Runtime Completeness` section to any affected task with:
- named capability/dependency,
- production code that must exist,
- allowed external stubs, if any,
- unacceptable substitutes such as fake/deterministic/internal stubs.
## Self-verification (per component)
- [ ] Every task is atomic (single concern)
- [ ] No task exceeds 5 complexity points
- [ ] Task dependencies reference correct tracker IDs
- [ ] Tasks cover all interfaces defined in the component spec
- [ ] No tasks duplicate work from other components
- [ ] Every task has a work item ticket linked to the correct epic
- [ ] Every shared-models / shared-API task has a contract file at `_docs/02_document/contracts/<component>/<name>.md` and a `## Contract` section linking to it
- [ ] Every cross-cutting concern appears exactly once as a shared task, not N per-component copies
- [ ] Every named internal runtime capability has a production implementation task, not only an interface/scaffold/fallback task
## Save action
Write each `todo/[##]_[short_name].md` (temporary numeric name), create work item ticket inline, then rename to `todo/[TRACKER-ID]_[short_name].md`. Update the **Task** field inside the file to match the new filename. Update **Dependencies** references in the file to use tracker IDs of the dependency tasks.
@@ -1,39 +0,0 @@
# Step 3: Blackbox Test Task Decomposition (tests-only mode only)
**Role**: Professional Quality Assurance Engineer
**Goal**: Decompose blackbox test specs into atomic, implementable task specs.
**Constraints**: Behavioral specs only — describe what, not how. No test code.
## Numbering
- In tests-only mode: start from 02 (01 is the test infrastructure bootstrap from Step 1t).
## Steps
1. Read all test specs from `DOCUMENT_DIR/tests/` (`blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, `resource-limit-tests.md`)
2. Group related test scenarios into atomic tasks (e.g., one task per test category or per component under test)
3. Each task should reference the specific test scenarios it implements and the environment/test-data specs
4. Add a **System Under Test Boundary** section to every e2e/blackbox test task:
- The test must drive the product through public runtime boundaries and compare actual outputs to `_docs/00_problem/input_data/expected_results/results_report.md` and any referenced machine-readable expected-result files.
- Stubs are allowed only for external systems outside the product boundary: flight controller/SITL, QGC observer, satellite-provider/Suite service, physical Jetson hardware, physical camera, licensed public datasets, and network services.
- Stubs, fakes, deterministic fallbacks, monkeypatches, or direct imports are not allowed for internal product modules that the scenario is meant to validate, such as VIO, safety/anchor wrapper, satellite retrieval, anchor verification, tile manager, MAVLink output adapter, or FDR.
- If an internal module is not implemented, the test must fail/block as missing product implementation; it must not pass by replacing that module with a test stub.
5. Dependencies:
- In tests-only mode: blackbox test tasks depend on the test infrastructure bootstrap task (Step 1t)
6. Write each task spec using `templates/task.md`
7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
8. Note task dependencies (referencing tracker IDs of already-created dependency tasks)
9. **Immediately after writing each task file**: create a work item ticket under the "Blackbox Tests" epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.
## Self-verification
- [ ] Every scenario from `tests/blackbox-tests.md` is covered by a task
- [ ] Every scenario from `tests/performance-tests.md`, `tests/resilience-tests.md`, `tests/security-tests.md`, and `tests/resource-limit-tests.md` is covered by a task
- [ ] No task exceeds 5 complexity points
- [ ] Dependencies correctly reference the test infrastructure task
- [ ] Every task has a work item ticket linked to the "Blackbox Tests" epic
- [ ] Every e2e/blackbox task forbids internal product stubs/fakes and requires comparison against expected-results artifacts
## Save action
Write each `todo/[##]_[short_name].md` (temporary numeric name), create work item ticket inline, then rename to `todo/[TRACKER-ID]_[short_name].md`.
@@ -1,43 +0,0 @@
# Step 4: Cross-Task Verification (implementation and tests-only modes)
**Role**: Professional software architect and analyst
**Goal**: Verify task consistency and produce `_dependencies_table.md`.
**Constraints**: Review step — fix gaps found, do not add new tasks.
## Steps
1. Verify task dependencies across all tasks are consistent
2. Check no gaps:
- In implementation mode: every product interface in `architecture.md` has implementation task coverage
- In tests-only mode: every test scenario in `traceability-matrix.md` is covered by a task
- In implementation mode: every named internal runtime capability/dependency from architecture, solution, system flows, and component descriptions has a production implementation task, not only an interface/scaffold/fallback task
- In tests-only mode: every e2e/blackbox task has a System Under Test Boundary section that forbids stubbing internal product modules and requires comparison to expected-results artifacts
3. Check no overlaps: tasks don't duplicate work
4. Check no circular dependencies in the task graph
5. Produce `_dependencies_table.md` using `templates/dependencies-table.md`
## Self-verification
### Implementation mode
- [ ] Every product interface in `architecture.md` is covered by at least one implementation task
- [ ] Every named internal runtime capability has a production implementation task
- [ ] No circular dependencies in the task graph
- [ ] Cross-component dependencies are explicitly noted in affected task specs
- [ ] `_dependencies_table.md` contains every task with correct dependencies
### Tests-only mode
- [ ] Every test scenario from `traceability-matrix.md` "Covered" entries has a corresponding task
- [ ] Every e2e/blackbox task validates actual product behavior and allows stubs only for external systems
- [ ] No circular dependencies in the task graph
- [ ] Test task dependencies reference the test infrastructure bootstrap
- [ ] `_dependencies_table.md` contains every task with correct dependencies
## Save action
Write `_dependencies_table.md`.
## Blocking
**BLOCKING**: Present dependency summary to user. Do NOT proceed until user confirms.
@@ -1,133 +0,0 @@
# API Contract Template
A contract is the **frozen, reviewed interface** between two or more components. When task A produces a shared model, DTO, schema, event payload, or public API, and task B consumes it, they must not reverse-engineer each other's implementation — they must read the contract.
Save the filled contract at `_docs/02_document/contracts/<component>/<name>.md`. Reference it from the producing task's `## Contract` section and from every consuming task's `## Dependencies` section.
---
```markdown
# Contract: [contract-name]
**Component**: [component-name]
**Producer task**: [TRACKER-ID] — [task filename]
**Consumer tasks**: [list of TRACKER-IDs or "TBD at decompose time"]
**Version**: 1.0.0
**Status**: [draft | frozen | deprecated]
**Last Updated**: [YYYY-MM-DD]
## Purpose
Short statement of what this contract represents and why it is shared (13 sentences).
## Shape
Choose ONE of the following shape forms per the contract type:
### For data models (DTO / schema / event)
```[language]
// language-native type definitions — e.g., Python dataclass, C# record, TypeScript interface, Rust struct, JSON Schema
```
For each field:
| Field | Type | Required | Description | Constraints |
|-------|------|----------|-------------|-------------|
| `id` | `string` (UUID) | yes | Unique identifier | RFC 4122 v4 |
| `created_at` | `datetime` (ISO 8601 UTC) | yes | Creation timestamp | |
| `...` | ... | ... | ... | ... |
### For function / method APIs
| Name | Signature | Throws / Errors | Blocking? |
|------|-----------|-----------------|-----------|
| `do_x` | `(input: InputDto) -> Result<OutputDto, XError>` | `XError::NotFound`, `XError::Invalid` | sync |
| ... | ... | ... | ... |
### For HTTP / RPC endpoints
| Method | Path | Request body | Response | Status codes |
|--------|------|--------------|----------|--------------|
| `POST` | `/api/v1/resource` | `CreateResource` | `Resource` | 201, 400, 409 |
| ... | ... | ... | ... | ... |
## Invariants
Properties that MUST hold for every valid instance or every allowed interaction. These survive refactors.
- Invariant 1: [statement]
- Invariant 2: [statement]
## Non-Goals
Things this contract intentionally does NOT cover. Helps prevent scope creep.
- Not covered: [statement]
## Versioning Rules
- **Breaking changes** (field renamed/removed, type changed, required→optional flipped) require a new major version and a deprecation path for consumers.
- **Non-breaking additions** (new optional field, new error variant consumers already tolerate) require a minor version bump.
## Test Cases
Representative cases that both producer and consumer tests must cover. Keep short — this is the contract test surface, not an exhaustive suite.
| Case | Input | Expected | Notes |
|------|-------|----------|-------|
| valid-minimal | minimal valid instance | accepted | |
| invalid-missing-required | missing `id` | rejected with specific error | |
| edge-case-x | ... | ... | |
## Change Log
| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | YYYY-MM-DD | Initial contract | [agent/user] |
```
---
## Decompose-skill rules for emitting contracts
A task is a **shared-models / shared-API task** when ANY of the following is true:
- The component spec lists it as a shared component (under `shared/*` in `module-layout.md`).
- The task's **Scope.Included** mentions any of: "public interface", "DTO", "schema", "event", "contract", "API endpoint", "shared model".
- The task is parented to a cross-cutting epic (`epic_type: cross-cutting`).
- The task is depended on by ≥2 other tasks across different components.
For every shared-models / shared-API task:
1. Create a contract file at `_docs/02_document/contracts/<component>/<name>.md` using this template.
2. Fill in Shape, Invariants, Non-Goals, Versioning Rules, and at least 3 Test Cases.
3. Add a mandatory `## Contract` section to the task spec that links to the contract file:
```markdown
## Contract
This task produces/implements the contract at `_docs/02_document/contracts/<component>/<name>.md`.
Consumers MUST read that file — not this task spec — to discover the interface.
```
4. For every consuming task, add the contract path to its `## Dependencies` section as a document dependency (not a task dependency):
```markdown
### Document Dependencies
- `_docs/02_document/contracts/<component>/<name>.md` — API contract produced by [TRACKER-ID].
```
5. If the contract changes after it was frozen, the producer task must bump the `Version` and note the change in `Change Log`. Consumers referenced in the contract header must be notified (surface to user via Choose format).
## Code-review-skill rules for verifying contracts
Phase 2 (Spec Compliance) adds a check:
- For every task with a `## Contract` section:
- Verify the referenced contract file exists at the stated path.
- Verify the implementation's public signatures (types, method shapes, endpoint paths) match the contract's Shape section.
- If they diverge, emit a `Spec-Gap` finding with High severity.
- For every consuming task's Document Dependencies that reference a contract:
- Verify the consumer's imports / calls match the contract's Shape.
- If they diverge, emit a `Spec-Gap` finding with High severity and a hint that either the contract or the consumer is drifting.
@@ -13,10 +13,10 @@ Use this template after cross-task verification. Save as `TASKS_DIR/_dependencie
| Task | Name | Complexity | Dependencies | Epic |
|------|------|-----------|-------------|------|
| [TRACKER-ID] | initial_structure | [points] | None | [EPIC-ID] |
| [TRACKER-ID] | [short_name] | [points] | [TRACKER-ID] | [EPIC-ID] |
| [TRACKER-ID] | [short_name] | [points] | [TRACKER-ID] | [EPIC-ID] |
| [TRACKER-ID] | [short_name] | [points] | [TRACKER-ID], [TRACKER-ID] | [EPIC-ID] |
| [JIRA-ID] | initial_structure | [points] | None | [EPIC-ID] |
| [JIRA-ID] | [short_name] | [points] | [JIRA-ID] | [EPIC-ID] |
| [JIRA-ID] | [short_name] | [points] | [JIRA-ID] | [EPIC-ID] |
| [JIRA-ID] | [short_name] | [points] | [JIRA-ID], [JIRA-ID] | [EPIC-ID] |
| ... | ... | ... | ... | ... |
```
@@ -25,7 +25,7 @@ Use this template after cross-task verification. Save as `TASKS_DIR/_dependencie
## Guidelines
- Every task from TASKS_DIR must appear in this table
- Dependencies column lists tracker IDs (e.g., "AZ-43, AZ-44") or "None"
- Dependencies column lists Jira IDs (e.g., "AZ-43, AZ-44") or "None"
- No circular dependencies allowed
- Tasks should be listed in recommended execution order
- The `/implement` skill reads this table to compute dependency-aware batches; task execution remains sequential
- The `/implement` skill reads this table to compute parallel batches
@@ -1,19 +1,19 @@
# Initial Structure Task Template
Use this template for the bootstrap structure plan. Save as `TASKS_DIR/01_initial_structure.md` initially, then rename to `TASKS_DIR/[TRACKER-ID]_initial_structure.md` after work item ticket creation.
Use this template for the bootstrap structure plan. Save as `TASKS_DIR/01_initial_structure.md` initially, then rename to `TASKS_DIR/[JIRA-ID]_initial_structure.md` after Jira ticket creation.
---
```markdown
# Initial Project Structure
**Task**: [TRACKER-ID]_initial_structure
**Task**: [JIRA-ID]_initial_structure
**Name**: Initial Structure
**Description**: Scaffold the project skeleton — folders, shared models, interfaces, stubs, CI/CD, DB migrations, test structure
**Complexity**: [3|5] points
**Dependencies**: None
**Component**: Bootstrap
**Tracker**: [TASK-ID]
**Jira**: [TASK-ID]
**Epic**: [EPIC-ID]
## Project Folder Layout
@@ -1,107 +0,0 @@
# Module Layout Template
The module layout is the **authoritative file-ownership map** used by the `/implement` skill to assign OWNED / READ-ONLY / FORBIDDEN files to each task. It is derived from `_docs/02_document/architecture.md` and the component specs at `_docs/02_document/components/`, and it follows the target language's standard project-layout conventions.
Save as `_docs/02_document/module-layout.md`. This file is produced by the decompose skill (Step 1.5 module layout) and consumed by the implement skill (Step 4 file ownership). Task specs remain purely behavioral — they do NOT carry file paths. The layout is the single place where component → filesystem mapping lives.
---
```markdown
# Module Layout
**Language**: [python | csharp | rust | typescript | go | mixed]
**Layout Convention**: [src-layout | crates-workspace | packages-workspace | custom]
**Root**: [src/ | crates/ | packages/ | ./]
**Last Updated**: [YYYY-MM-DD]
## Layout Rules
1. Each component owns ONE top-level directory under the root.
2. Shared code lives under `<root>/shared/` (or language equivalent: `src/shared/`, `crates/shared/`, `packages/shared/`).
3. Cross-cutting concerns (logging, config, error handling, telemetry) live under `<root>/shared/<concern>/`.
4. Public API surface per component = files listed in `public:` below. Everything else is internal — other components MUST NOT import it directly.
5. Tests live outside the component tree in a separate `tests/` or `<component>/tests/` directory per the language's test convention.
## Per-Component Mapping
### Component: [component-name]
- **Epic**: [TRACKER-ID]
- **Directory**: `src/<path>/`
- **Public API**: files in this list are importable by other components
- `src/<path>/public_api.py` (or `mod.rs`, `index.ts`, `PublicApi.cs`, etc.)
- `src/<path>/types.py`
- **Internal (do NOT import from other components)**:
- `src/<path>/internal/*`
- `src/<path>/_helpers.py`
- **Owns (exclusive write during implementation)**: `src/<path>/**`
- **Imports from**: [list of other components whose Public API this component may use]
- **Consumed by**: [list of components that depend on this component's Public API]
### Component: [next-component]
...
## Shared / Cross-Cutting
### shared/models
- **Directory**: `src/shared/models/`
- **Purpose**: DTOs, value types, schemas shared across components
- **Owned by**: whoever implements task `[TRACKER-ID]_shared_models`
- **Consumed by**: all components
### shared/logging
- **Directory**: `src/shared/logging/`
- **Purpose**: structured logging setup
- **Owned by**: cross-cutting task `[TRACKER-ID]_logging`
- **Consumed by**: all components
### shared/[other concern]
...
## Allowed Dependencies (layering)
Read top-to-bottom; an upper layer may import from a lower layer but NEVER the reverse.
| Layer | Components | May import from |
|-------|------------|-----------------|
| 4. API / Entry | [list] | 1, 2, 3 |
| 3. Application | [list] | 1, 2 |
| 2. Domain | [list] | 1 |
| 1. Shared / Foundation | shared/* | (none) |
Violations of this table are **Architecture** findings in code-review Phase 7 and are High severity.
## Layout Conventions (reference)
| Language | Root | Per-component path | Public API file | Test path |
|----------|------|-------------------|-----------------|-----------|
| Python | `src/<pkg>/` | `src/<pkg>/<component>/` | `src/<pkg>/<component>/__init__.py` (re-exports) | `tests/<component>/` |
| C# (.NET) | `src/` | `src/<Component>/` | `src/<Component>/<Component>.cs` (namespace root) | `tests/<Component>.Tests/` |
| Rust | `crates/` | `crates/<component>/` | `crates/<component>/src/lib.rs` | `crates/<component>/tests/` |
| TypeScript / React | `packages/` or `src/` | `src/<component>/` | `src/<component>/index.ts` (barrel) | `src/<component>/__tests__/` or `tests/<component>/` |
| Go | `./` | `internal/<component>/` or `pkg/<component>/` | `internal/<component>/doc.go` + exported symbols | `internal/<component>/*_test.go` |
```
---
## Self-verification for the decompose skill
When writing `_docs/02_document/module-layout.md`, verify:
- [ ] Every component in `_docs/02_document/components/` has a Per-Component Mapping entry.
- [ ] Every shared / cross-cutting epic has an entry in the Shared section.
- [ ] Layering table rows cover every component.
- [ ] No component's `Imports from` list contains a component at a higher layer.
- [ ] Paths follow the detected language's convention.
- [ ] No two components own overlapping paths.
## How the implement skill consumes this
The implement skill's Step 4 (File Ownership) reads this file and, for each task in the batch:
1. Resolve the task's Component field to a Per-Component Mapping entry.
2. Set OWNED = the component's `Owns` glob.
3. Set READ-ONLY = the Public API files of every component listed in `Imports from`, plus `shared/*` Public API files.
4. Set FORBIDDEN = every other component's Owns glob.
Execution inside a batch is already sequential (one task at a time). This mapping is still required because it enforces scope discipline per task — preventing a task from drifting into files that belong to another component.
+5 -16
View File
@@ -1,20 +1,20 @@
# Task Specification Template
Create a focused behavioral specification that describes **what** the system should do, not **how** it should be built.
Save as `TASKS_DIR/[##]_[short_name].md` initially, then rename to `TASKS_DIR/[TRACKER-ID]_[short_name].md` after work item ticket creation.
Save as `TASKS_DIR/[##]_[short_name].md` initially, then rename to `TASKS_DIR/[JIRA-ID]_[short_name].md` after Jira ticket creation.
---
```markdown
# [Feature Name]
**Task**: [TRACKER-ID]_[short_name]
**Task**: [JIRA-ID]_[short_name]
**Name**: [short human name]
**Description**: [one-line description of what this task delivers]
**Complexity**: [1|2|3|5] points
**Dependencies**: [AZ-43_shared_models, AZ-44_db_migrations] or "None"
**Component**: [component name for context]
**Tracker**: [TASK-ID]
**Jira**: [TASK-ID]
**Epic**: [EPIC-ID]
## Problem
@@ -81,17 +81,6 @@ Then [expected result]
**Risk 1: [Title]**
- *Risk*: [Description]
- *Mitigation*: [Approach]
## Contract
<!--
OMIT this section for behavioral-only tasks.
INCLUDE this section ONLY for shared-models / shared-API / contract tasks.
See decompose/SKILL.md Step 2 shared-models rule and decompose/templates/api-contract.md.
-->
This task produces/implements the contract at `_docs/02_document/contracts/<component>/<name>.md`.
Consumers MUST read that file — not this task spec — to discover the interface.
```
---
@@ -102,7 +91,7 @@ Consumers MUST read that file — not this task spec — to discover the interfa
- 2 points: Non-trivial, low complexity, minimal coordination
- 3 points: Multi-step, moderate complexity, potential alignment needed
- 5 points: Difficult, interconnected logic, medium-high risk
- 8+ points: Too complex — split into smaller tasks
- 8 points: Too complex — split into smaller tasks
## Output Guidelines
@@ -113,7 +102,7 @@ Consumers MUST read that file — not this task spec — to discover the interfa
- Include realistic scope boundaries
- Write from the user's perspective
- Include complexity estimation
- Reference dependencies by tracker ID (e.g., AZ-43_shared_models)
- Reference dependencies by Jira ID (e.g., AZ-43_shared_models)
**DON'T:**
- Include implementation details (file paths, classes, methods)
@@ -1,19 +1,19 @@
# Test Infrastructure Task Template
Use this template for the test infrastructure bootstrap (Step 1t in tests-only mode). Save as `TASKS_DIR/01_test_infrastructure.md` initially, then rename to `TASKS_DIR/[TRACKER-ID]_test_infrastructure.md` after work item ticket creation.
Use this template for the test infrastructure bootstrap (Step 1t in tests-only mode). Save as `TASKS_DIR/01_test_infrastructure.md` initially, then rename to `TASKS_DIR/[JIRA-ID]_test_infrastructure.md` after Jira ticket creation.
---
```markdown
# Test Infrastructure
**Task**: [TRACKER-ID]_test_infrastructure
**Task**: [JIRA-ID]_test_infrastructure
**Name**: Test Infrastructure
**Description**: Scaffold the Blackbox test project — test runner, mock services, Docker test environment, test data fixtures, reporting
**Complexity**: [3|5] points
**Dependencies**: None
**Component**: Blackbox Tests
**Tracker**: [TASK-ID]
**Jira**: [TASK-ID]
**Epic**: [EPIC-ID]
## Test Project Folder Layout
+303 -21
View File
@@ -115,43 +115,332 @@ At the start of execution, create a TodoWrite with all steps (1 through 7). Upda
### Step 1: Deployment Status & Environment Setup
Read and follow `steps/01_status-env.md`.
**Role**: DevOps / Platform engineer
**Goal**: Assess current deployment readiness, identify all required environment variables, and create `.env` files
**Constraints**: Must complete before any other step
1. Read architecture.md, all component specs, and restrictions.md
2. Assess deployment readiness:
- List all components and their current state (planned / implemented / tested)
- Identify external dependencies (databases, APIs, message queues, cloud services)
- Identify infrastructure prerequisites (container registry, cloud accounts, DNS, SSL certificates)
- Check if any deployment blockers exist
3. Identify all required environment variables by scanning:
- Component specs for configuration needs
- Database connection requirements
- External API endpoints and credentials
- Feature flags and runtime configuration
- Container registry credentials
- Cloud provider credentials
- Monitoring/logging service endpoints
4. Generate `.env.example` in project root with all variables and placeholder values (committed to VCS)
5. Generate `.env` in project root with development defaults filled in where safe (git-ignored)
6. Ensure `.gitignore` includes `.env` (but NOT `.env.example`)
7. Produce a deployment status report summarizing readiness, blockers, and required setup
**Self-verification**:
- [ ] All components assessed for deployment readiness
- [ ] External dependencies catalogued
- [ ] Infrastructure prerequisites identified
- [ ] All required environment variables discovered
- [ ] `.env.example` created with placeholder values
- [ ] `.env` created with safe development defaults
- [ ] `.gitignore` updated to exclude `.env`
- [ ] Status report written to `reports/deploy_status_report.md`
**Save action**: Write `reports/deploy_status_report.md` using `templates/deploy_status_report.md`, create `.env` and `.env.example` in project root
**BLOCKING**: Present status report and environment variables to user. Do NOT proceed until confirmed.
---
### Step 2: Containerization
Read and follow `steps/02_containerization.md`.
**Role**: DevOps / Platform engineer
**Goal**: Define Docker configuration for every component, local development, and blackbox test environments
**Constraints**: Plan only — no Dockerfile creation. Describe what each Dockerfile should contain.
1. Read architecture.md and all component specs
2. Read restrictions.md for infrastructure constraints
3. Research best Docker practices for the project's tech stack (multi-stage builds, base image selection, layer optimization)
4. For each component, define:
- Base image (pinned version, prefer alpine/distroless for production)
- Build stages (dependency install, build, production)
- Non-root user configuration
- Health check endpoint and command
- Exposed ports
- `.dockerignore` contents
5. Define `docker-compose.yml` for local development:
- All application components
- Database (Postgres) with named volume
- Any message queues, caches, or external service mocks
- Shared network
- Environment variable files (`.env`)
6. Define `docker-compose.test.yml` for blackbox tests:
- Application components under test
- Test runner container (black-box, no internal imports)
- Isolated database with seed data
- All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit`
7. Define image tagging strategy: `<registry>/<project>/<component>:<git-sha>` for CI, `latest` for local dev only
**Self-verification**:
- [ ] Every component has a Dockerfile specification
- [ ] Multi-stage builds specified for all production images
- [ ] Non-root user for all containers
- [ ] Health checks defined for every service
- [ ] docker-compose.yml covers all components + dependencies
- [ ] docker-compose.test.yml enables black-box testing
- [ ] `.dockerignore` defined
**Save action**: Write `containerization.md` using `templates/containerization.md`
**BLOCKING**: Present containerization plan to user. Do NOT proceed until confirmed.
---
### Step 3: CI/CD Pipeline
Read and follow `steps/03_ci-cd-pipeline.md`.
**Role**: DevOps engineer
**Goal**: Define the CI/CD pipeline with quality gates, security scanning, and multi-environment deployment
**Constraints**: Pipeline definition only — produce YAML specification, not implementation
1. Read architecture.md for tech stack and deployment targets
2. Read restrictions.md for CI/CD constraints (cloud provider, registry, etc.)
3. Research CI/CD best practices for the project's platform (GitHub Actions / Azure Pipelines)
4. Define pipeline stages:
| Stage | Trigger | Steps | Quality Gate |
|-------|---------|-------|-------------|
| **Lint** | Every push | Run linters per language (black, rustfmt, prettier, dotnet format) | Zero errors |
| **Test** | Every push | Unit tests, blackbox tests, coverage report | 75%+ coverage (see `.cursor/rules/cursor-meta.mdc` Quality Thresholds) |
| **Security** | Every push | Dependency audit, SAST scan (Semgrep/SonarQube), image scan (Trivy) | Zero critical/high CVEs |
| **Build** | PR merge to dev | Build Docker images, tag with git SHA | Build succeeds |
| **Push** | After build | Push to container registry | Push succeeds |
| **Deploy Staging** | After push | Deploy to staging environment | Health checks pass |
| **Smoke Tests** | After staging deploy | Run critical path tests against staging | All pass |
| **Deploy Production** | Manual approval | Deploy to production | Health checks pass |
5. Define caching strategy: dependency caches, Docker layer caches, build artifact caches
6. Define parallelization: which stages can run concurrently
7. Define notifications: build failures, deployment status, security alerts
**Self-verification**:
- [ ] All pipeline stages defined with triggers and gates
- [ ] Coverage threshold enforced (75%+)
- [ ] Security scanning included (dependencies + images + SAST)
- [ ] Caching configured for dependencies and Docker layers
- [ ] Multi-environment deployment (staging → production)
- [ ] Rollback procedure referenced
- [ ] Notifications configured
**Save action**: Write `ci_cd_pipeline.md` using `templates/ci_cd_pipeline.md`
---
### Step 4: Environment Strategy
Read and follow `steps/04_environment-strategy.md`.
**Role**: Platform engineer
**Goal**: Define environment configuration, secrets management, and environment parity
**Constraints**: Strategy document — no secrets or credentials in output
1. Define environments:
| Environment | Purpose | Infrastructure | Data |
|-------------|---------|---------------|------|
| **Development** | Local developer workflow | docker-compose, local volumes | Seed data, mocks for external APIs |
| **Staging** | Pre-production validation | Mirrors production topology | Anonymized production-like data |
| **Production** | Live system | Full infrastructure | Real data |
2. Define environment variable management:
- Reference `.env.example` created in Step 1
- Per-environment variable sources (`.env` for dev, secret manager for staging/prod)
- Validation: fail fast on missing required variables at startup
3. Define secrets management:
- Never commit secrets to version control
- Development: `.env` files (git-ignored)
- Staging/Production: secret manager (AWS Secrets Manager / Azure Key Vault / Vault)
- Rotation policy
4. Define database management per environment:
- Development: Docker Postgres with named volume, seed data
- Staging: managed Postgres, migrations applied via CI/CD
- Production: managed Postgres, migrations require approval
**Self-verification**:
- [ ] All three environments defined with clear purpose
- [ ] Environment variable documentation complete (references `.env.example` from Step 1)
- [ ] No secrets in any output document
- [ ] Secret manager specified for staging/production
- [ ] Database strategy per environment
**Save action**: Write `environment_strategy.md` using `templates/environment_strategy.md`
---
### Step 5: Observability
Read and follow `steps/05_observability.md`.
**Role**: Site Reliability Engineer (SRE)
**Goal**: Define logging, metrics, tracing, and alerting strategy
**Constraints**: Strategy document — describe what to implement, not how to wire it
1. Read architecture.md and component specs for service boundaries
2. Research observability best practices for the tech stack
**Logging**:
- Structured JSON to stdout/stderr (no file logging in containers)
- Fields: `timestamp` (ISO 8601), `level`, `service`, `correlation_id`, `message`, `context`
- Levels: ERROR (exceptions), WARN (degraded), INFO (business events), DEBUG (diagnostics, dev only)
- No PII in logs
- Retention: dev = console, staging = 7 days, production = 30 days
**Metrics**:
- Expose Prometheus-compatible `/metrics` endpoint per service
- System metrics: CPU, memory, disk, network
- Application metrics: `request_count`, `request_duration` (histogram), `error_count`, `active_connections`
- Business metrics: derived from acceptance criteria
- Collection interval: 15s
**Distributed Tracing**:
- OpenTelemetry SDK integration
- Trace context propagation via HTTP headers and message queue metadata
- Span naming: `<service>.<operation>`
- Sampling: 100% in dev/staging, 10% in production (adjust based on volume)
**Alerting**:
| Severity | Response Time | Condition Examples |
|----------|---------------|-------------------|
| Critical | 5 min | Service down, data loss, health check failed |
| High | 30 min | Error rate > 5%, P95 latency > 2x baseline |
| Medium | 4 hours | Disk > 80%, elevated latency |
| Low | Next business day | Non-critical warnings |
**Dashboards**:
- Operations: service health, request rate, error rate, response time percentiles, resource utilization
- Business: key business metrics from acceptance criteria
**Self-verification**:
- [ ] Structured logging format defined with required fields
- [ ] Metrics endpoint specified per service
- [ ] OpenTelemetry tracing configured
- [ ] Alert severities with response times defined
- [ ] Dashboards cover operations and business metrics
- [ ] PII exclusion from logs addressed
**Save action**: Write `observability.md` using `templates/observability.md`
---
### Step 6: Deployment Procedures
Read and follow `steps/06_procedures.md`.
**Role**: DevOps / Platform engineer
**Goal**: Define deployment strategy, rollback procedures, health checks, and deployment checklist
**Constraints**: Procedures document — no implementation
1. Define deployment strategy:
- Preferred pattern: blue-green / rolling / canary (choose based on architecture)
- Zero-downtime requirement for production
- Graceful shutdown: 30-second grace period for in-flight requests
- Database migration ordering: migrate before deploy, backward-compatible only
2. Define health checks:
| Check | Type | Endpoint | Interval | Threshold |
|-------|------|----------|----------|-----------|
| Liveness | HTTP GET | `/health/live` | 10s | 3 failures → restart |
| Readiness | HTTP GET | `/health/ready` | 5s | 3 failures → remove from LB |
| Startup | HTTP GET | `/health/ready` | 5s | 30 attempts max |
3. Define rollback procedures:
- Trigger criteria: health check failures, error rate spike, critical alert
- Rollback steps: redeploy previous image tag, verify health, rollback database if needed
- Communication: notify stakeholders during rollback
- Post-mortem: required after every production rollback
4. Define deployment checklist:
- [ ] All tests pass in CI
- [ ] Security scan clean (zero critical/high CVEs)
- [ ] Database migrations reviewed and tested
- [ ] Environment variables configured
- [ ] Health check endpoints responding
- [ ] Monitoring alerts configured
- [ ] Rollback plan documented and tested
- [ ] Stakeholders notified
**Self-verification**:
- [ ] Deployment strategy chosen and justified
- [ ] Zero-downtime approach specified
- [ ] Health checks defined (liveness, readiness, startup)
- [ ] Rollback trigger criteria and steps documented
- [ ] Deployment checklist complete
**Save action**: Write `deployment_procedures.md` using `templates/deployment_procedures.md`
**BLOCKING**: Present deployment procedures to user. Do NOT proceed until confirmed.
---
### Step 7: Deployment Scripts
Read and follow `steps/07_scripts.md`.
**Role**: DevOps / Platform engineer
**Goal**: Create executable deployment scripts for pulling Docker images and running services on the remote target machine
**Constraints**: Produce real, executable shell scripts. This is the ONLY step that creates implementation artifacts.
1. Read containerization.md and deployment_procedures.md from previous steps
2. Read `.env.example` for required variables
3. Create the following scripts in `SCRIPTS_DIR/`:
**`deploy.sh`** — Main deployment orchestrator:
- Validates that required environment variables are set (sources `.env` if present)
- Calls `pull-images.sh`, then `stop-services.sh`, then `start-services.sh`, then `health-check.sh`
- Exits with non-zero code on any failure
- Supports `--rollback` flag to redeploy previous image tags
**`pull-images.sh`** — Pull Docker images to target machine:
- Reads image list and tags from environment or config
- Authenticates with container registry
- Pulls all required images
- Verifies image integrity (digest check)
**`start-services.sh`** — Start services on target machine:
- Runs `docker compose up -d` or individual `docker run` commands
- Applies environment variables from `.env`
- Configures networks and volumes
- Waits for containers to reach healthy state
**`stop-services.sh`** — Graceful shutdown:
- Stops services with graceful shutdown period
- Saves current image tags for rollback reference
- Cleans up orphaned containers/networks
**`health-check.sh`** — Verify deployment health:
- Checks all health endpoints
- Reports status per service
- Returns non-zero if any service is unhealthy
4. All scripts must:
- Be POSIX-compatible (#!/bin/bash with set -euo pipefail)
- Source `.env` from project root or accept env vars from the environment
- Include usage/help output (`--help` flag)
- Be idempotent where possible
- Handle SSH connection to remote target (configurable via `DEPLOY_HOST` env var)
5. Document all scripts in `deploy_scripts.md`
**Self-verification**:
- [ ] All five scripts created and executable
- [ ] Scripts source environment variables correctly
- [ ] `deploy.sh` orchestrates the full flow
- [ ] `pull-images.sh` handles registry auth and image pull
- [ ] `start-services.sh` starts containers with correct config
- [ ] `stop-services.sh` handles graceful shutdown
- [ ] `health-check.sh` validates all endpoints
- [ ] Rollback supported via `deploy.sh --rollback`
- [ ] Scripts work for remote deployment via SSH (DEPLOY_HOST)
- [ ] `deploy_scripts.md` documents all scripts
**Save action**: Write scripts to `SCRIPTS_DIR/`, write `deploy_scripts.md` using `templates/deploy_scripts.md`
---
## Escalation Rules
@@ -184,24 +473,17 @@ Read and follow `steps/07_scripts.md`.
├────────────────────────────────────────────────────────────────┤
│ PREREQ: architecture.md + component specs exist │
│ │
│ 1. Status & Env → steps/01_status-env.md
│ → reports/deploy_status_report.md │
│ 1. Status & Env → reports/deploy_status_report.md
│ + .env + .env.example │
│ [BLOCKING: user confirms status & env vars] │
│ 2. Containerization → steps/02_containerization.md │
│ → containerization.md │
│ 2. Containerization → containerization.md
│ [BLOCKING: user confirms Docker plan] │
│ 3. CI/CD Pipeline → steps/03_ci-cd-pipeline.md │
→ ci_cd_pipeline.md
4. Environment → steps/04_environment-strategy.md
→ environment_strategy.md │
│ 5. Observability → steps/05_observability.md │
│ → observability.md │
│ 6. Procedures → steps/06_procedures.md │
│ → deployment_procedures.md │
│ 3. CI/CD Pipeline → ci_cd_pipeline.md
4. Environment → environment_strategy.md
5. Observability → observability.md
6. Procedures → deployment_procedures.md │
│ [BLOCKING: user confirms deployment plan] │
│ 7. Scripts → steps/07_scripts.md
│ → deploy_scripts.md + scripts/ │
│ 7. Scripts deploy_scripts.md + scripts/
├────────────────────────────────────────────────────────────────┤
│ Principles: Docker-first · IaC · Observability built-in │
│ Environment parity · Save immediately │
@@ -1,45 +0,0 @@
# Step 1: Deployment Status & Environment Setup
**Role**: DevOps / Platform engineer
**Goal**: Assess current deployment readiness, identify all required environment variables, and create `.env` files.
**Constraints**: Must complete before any other step.
## Steps
1. Read `architecture.md`, all component specs, and `restrictions.md`
2. Assess deployment readiness:
- List all components and their current state (planned / implemented / tested)
- Identify external dependencies (databases, APIs, message queues, cloud services)
- Identify infrastructure prerequisites (container registry, cloud accounts, DNS, SSL certificates)
- Check if any deployment blockers exist
3. Identify all required environment variables by scanning:
- Component specs for configuration needs
- Database connection requirements
- External API endpoints and credentials
- Feature flags and runtime configuration
- Container registry credentials
- Cloud provider credentials
- Monitoring/logging service endpoints
4. Generate `.env.example` in project root with all variables and placeholder values (committed to VCS)
5. Generate `.env` in project root with development defaults filled in where safe (git-ignored)
6. Ensure `.gitignore` includes `.env` (but NOT `.env.example`)
7. Produce a deployment status report summarizing readiness, blockers, and required setup
## Self-verification
- [ ] All components assessed for deployment readiness
- [ ] External dependencies catalogued
- [ ] Infrastructure prerequisites identified
- [ ] All required environment variables discovered
- [ ] `.env.example` created with placeholder values
- [ ] `.env` created with safe development defaults
- [ ] `.gitignore` updated to exclude `.env`
- [ ] Status report written to `reports/deploy_status_report.md`
## Save action
Write `reports/deploy_status_report.md` using `templates/deploy_status_report.md`. Create `.env` and `.env.example` in project root.
## Blocking
**BLOCKING**: Present status report and environment variables to user. Do NOT proceed until confirmed.
@@ -1,49 +0,0 @@
# Step 2: Containerization
**Role**: DevOps / Platform engineer
**Goal**: Define Docker configuration for every component, local development, and blackbox test environments.
**Constraints**: Plan only — no Dockerfile creation. Describe what each Dockerfile should contain.
## Steps
1. Read `architecture.md` and all component specs
2. Read `restrictions.md` for infrastructure constraints
3. Research best Docker practices for the project's tech stack (multi-stage builds, base image selection, layer optimization)
4. For each component, define:
- Base image (pinned version, prefer alpine/distroless for production)
- Build stages (dependency install, build, production)
- Non-root user configuration
- Health check endpoint and command
- Exposed ports
- `.dockerignore` contents
5. Define `docker-compose.yml` for local development:
- All application components
- Database (Postgres) with named volume
- Any message queues, caches, or external service mocks
- Shared network
- Environment variable files (`.env`)
6. Define `docker-compose.test.yml` for blackbox tests:
- Application components under test
- Test runner container (black-box, no internal imports)
- Isolated database with seed data
- All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner`
- See the Woodpecker two-workflow contract in [`../templates/ci_cd_pipeline.md`](../templates/ci_cd_pipeline.md) — the test runner entry point defined here becomes the first step of `.woodpecker/01-test.yml`.
7. Define image tagging strategy: `<registry>/<project>/<component>:<git-sha>` for CI, `latest` for local dev only
## Self-verification
- [ ] Every component has a Dockerfile specification
- [ ] Multi-stage builds specified for all production images
- [ ] Non-root user for all containers
- [ ] Health checks defined for every service
- [ ] `docker-compose.yml` covers all components + dependencies
- [ ] `docker-compose.test.yml` enables black-box testing
- [ ] `.dockerignore` defined
## Save action
Write `containerization.md` using `templates/containerization.md`.
## Blocking
**BLOCKING**: Present containerization plan to user. Do NOT proceed until confirmed.
@@ -1,41 +0,0 @@
# Step 3: CI/CD Pipeline
**Role**: DevOps engineer
**Goal**: Define the CI/CD pipeline with quality gates, security scanning, and multi-environment deployment.
**Constraints**: Pipeline definition only — produce YAML specification, not implementation.
## Steps
1. Read `architecture.md` for tech stack and deployment targets
2. Read `restrictions.md` for CI/CD constraints (cloud provider, registry, etc.)
3. Research CI/CD best practices for the project's platform (GitHub Actions / Azure Pipelines)
4. Define pipeline stages:
| Stage | Trigger | Steps | Quality Gate |
|-------|---------|-------|-------------|
| **Lint** | Every push | Run linters per language (black, rustfmt, prettier, dotnet format) | Zero errors |
| **Test** | Every push | Unit tests, blackbox tests, coverage report | 75%+ coverage (see `.cursor/rules/cursor-meta.mdc` Quality Thresholds) |
| **Security** | Every push | Dependency audit, SAST scan (Semgrep/SonarQube), image scan (Trivy) | Zero critical/high CVEs |
| **Build** | PR merge to dev | Build Docker images, tag with git SHA | Build succeeds |
| **Push** | After build | Push to container registry | Push succeeds |
| **Deploy Staging** | After push | Deploy to staging environment | Health checks pass |
| **Smoke Tests** | After staging deploy | Run critical path tests against staging | All pass |
| **Deploy Production** | Manual approval | Deploy to production | Health checks pass |
5. Define caching strategy: dependency caches, Docker layer caches, build artifact caches
6. Define parallelization: which stages can run concurrently
7. Define notifications: build failures, deployment status, security alerts
## Self-verification
- [ ] All pipeline stages defined with triggers and gates
- [ ] Coverage threshold enforced (75%+)
- [ ] Security scanning included (dependencies + images + SAST)
- [ ] Caching configured for dependencies and Docker layers
- [ ] Multi-environment deployment (staging → production)
- [ ] Rollback procedure referenced
- [ ] Notifications configured
## Save action
Write `ci_cd_pipeline.md` using `templates/ci_cd_pipeline.md`.
@@ -1,41 +0,0 @@
# Step 4: Environment Strategy
**Role**: Platform engineer
**Goal**: Define environment configuration, secrets management, and environment parity.
**Constraints**: Strategy document — no secrets or credentials in output.
## Steps
1. Define environments:
| Environment | Purpose | Infrastructure | Data |
|-------------|---------|---------------|------|
| **Development** | Local developer workflow | docker-compose, local volumes | Seed data, mocks for external APIs |
| **Staging** | Pre-production validation | Mirrors production topology | Anonymized production-like data |
| **Production** | Live system | Full infrastructure | Real data |
2. Define environment variable management:
- Reference `.env.example` created in Step 1
- Per-environment variable sources (`.env` for dev, secret manager for staging/prod)
- Validation: fail fast on missing required variables at startup
3. Define secrets management:
- Never commit secrets to version control
- Development: `.env` files (git-ignored)
- Staging/Production: secret manager (AWS Secrets Manager / Azure Key Vault / Vault)
- Rotation policy
4. Define database management per environment:
- Development: Docker Postgres with named volume, seed data
- Staging: managed Postgres, migrations applied via CI/CD
- Production: managed Postgres, migrations require approval
## Self-verification
- [ ] All three environments defined with clear purpose
- [ ] Environment variable documentation complete (references `.env.example` from Step 1)
- [ ] No secrets in any output document
- [ ] Secret manager specified for staging/production
- [ ] Database strategy per environment
## Save action
Write `environment_strategy.md` using `templates/environment_strategy.md`.
@@ -1,60 +0,0 @@
# Step 5: Observability
**Role**: Site Reliability Engineer (SRE)
**Goal**: Define logging, metrics, tracing, and alerting strategy.
**Constraints**: Strategy document — describe what to implement, not how to wire it.
## Steps
1. Read `architecture.md` and component specs for service boundaries
2. Research observability best practices for the tech stack
## Logging
- Structured JSON to stdout/stderr (no file logging in containers)
- Fields: `timestamp` (ISO 8601), `level`, `service`, `correlation_id`, `message`, `context`
- Levels: ERROR (exceptions), WARN (degraded), INFO (business events), DEBUG (diagnostics, dev only)
- No PII in logs
- Retention: dev = console, staging = 7 days, production = 30 days
## Metrics
- Expose Prometheus-compatible `/metrics` endpoint per service
- System metrics: CPU, memory, disk, network
- Application metrics: `request_count`, `request_duration` (histogram), `error_count`, `active_connections`
- Business metrics: derived from acceptance criteria
- Collection interval: 15s
## Distributed Tracing
- OpenTelemetry SDK integration
- Trace context propagation via HTTP headers and message queue metadata
- Span naming: `<service>.<operation>`
- Sampling: 100% in dev/staging, 10% in production (adjust based on volume)
## Alerting
| Severity | Response Time | Condition Examples |
|----------|---------------|-------------------|
| Critical | 5 min | Service down, data loss, health check failed |
| High | 30 min | Error rate > 5%, P95 latency > 2x baseline |
| Medium | 4 hours | Disk > 80%, elevated latency |
| Low | Next business day | Non-critical warnings |
## Dashboards
- Operations: service health, request rate, error rate, response time percentiles, resource utilization
- Business: key business metrics from acceptance criteria
## Self-verification
- [ ] Structured logging format defined with required fields
- [ ] Metrics endpoint specified per service
- [ ] OpenTelemetry tracing configured
- [ ] Alert severities with response times defined
- [ ] Dashboards cover operations and business metrics
- [ ] PII exclusion from logs addressed
## Save action
Write `observability.md` using `templates/observability.md`.
@@ -1,53 +0,0 @@
# Step 6: Deployment Procedures
**Role**: DevOps / Platform engineer
**Goal**: Define deployment strategy, rollback procedures, health checks, and deployment checklist.
**Constraints**: Procedures document — no implementation.
## Steps
1. Define deployment strategy:
- Preferred pattern: blue-green / rolling / canary (choose based on architecture)
- Zero-downtime requirement for production
- Graceful shutdown: 30-second grace period for in-flight requests
- Database migration ordering: migrate before deploy, backward-compatible only
2. Define health checks:
| Check | Type | Endpoint | Interval | Threshold |
|-------|------|----------|----------|-----------|
| Liveness | HTTP GET | `/health/live` | 10s | 3 failures → restart |
| Readiness | HTTP GET | `/health/ready` | 5s | 3 failures → remove from LB |
| Startup | HTTP GET | `/health/ready` | 5s | 30 attempts max |
3. Define rollback procedures:
- Trigger criteria: health check failures, error rate spike, critical alert
- Rollback steps: redeploy previous image tag, verify health, rollback database if needed
- Communication: notify stakeholders during rollback
- Post-mortem: required after every production rollback
4. Define deployment checklist:
- [ ] All tests pass in CI
- [ ] Security scan clean (zero critical/high CVEs)
- [ ] Database migrations reviewed and tested
- [ ] Environment variables configured
- [ ] Health check endpoints responding
- [ ] Monitoring alerts configured
- [ ] Rollback plan documented and tested
- [ ] Stakeholders notified
## Self-verification
- [ ] Deployment strategy chosen and justified
- [ ] Zero-downtime approach specified
- [ ] Health checks defined (liveness, readiness, startup)
- [ ] Rollback trigger criteria and steps documented
- [ ] Deployment checklist complete
## Save action
Write `deployment_procedures.md` using `templates/deployment_procedures.md`.
## Blocking
**BLOCKING**: Present deployment procedures to user. Do NOT proceed until confirmed.
-70
View File
@@ -1,70 +0,0 @@
# Step 7: Deployment Scripts
**Role**: DevOps / Platform engineer
**Goal**: Create executable deployment scripts for pulling Docker images and running services on the remote target machine.
**Constraints**: Produce real, executable shell scripts. This is the ONLY step that creates implementation artifacts.
## Steps
1. Read `containerization.md` and `deployment_procedures.md` from previous steps
2. Read `.env.example` for required variables
3. Create the following scripts in `SCRIPTS_DIR/`:
### `deploy.sh` — Main deployment orchestrator
- Validates that required environment variables are set (sources `.env` if present)
- Calls `pull-images.sh`, then `stop-services.sh`, then `start-services.sh`, then `health-check.sh`
- Exits with non-zero code on any failure
- Supports `--rollback` flag to redeploy previous image tags
### `pull-images.sh` — Pull Docker images to target machine
- Reads image list and tags from environment or config
- Authenticates with container registry
- Pulls all required images
- Verifies image integrity (digest check)
### `start-services.sh` — Start services on target machine
- Runs `docker compose up -d` or individual `docker run` commands
- Applies environment variables from `.env`
- Configures networks and volumes
- Waits for containers to reach healthy state
### `stop-services.sh` — Graceful shutdown
- Stops services with graceful shutdown period
- Saves current image tags for rollback reference
- Cleans up orphaned containers/networks
### `health-check.sh` — Verify deployment health
- Checks all health endpoints
- Reports status per service
- Returns non-zero if any service is unhealthy
4. All scripts must:
- Be POSIX-compatible (`#!/bin/bash` with `set -euo pipefail`)
- Source `.env` from project root or accept env vars from the environment
- Include usage/help output (`--help` flag)
- Be idempotent where possible
- Handle SSH connection to remote target (configurable via `DEPLOY_HOST` env var)
5. Document all scripts in `deploy_scripts.md`
## Self-verification
- [ ] All five scripts created and executable
- [ ] Scripts source environment variables correctly
- [ ] `deploy.sh` orchestrates the full flow
- [ ] `pull-images.sh` handles registry auth and image pull
- [ ] `start-services.sh` starts containers with correct config
- [ ] `stop-services.sh` handles graceful shutdown
- [ ] `health-check.sh` validates all endpoints
- [ ] Rollback supported via `deploy.sh --rollback`
- [ ] Scripts work for remote deployment via SSH (`DEPLOY_HOST`)
- [ ] `deploy_scripts.md` documents all scripts
## Save action
Write scripts to `SCRIPTS_DIR/`. Write `deploy_scripts.md` using `templates/deploy_scripts.md`.
@@ -29,7 +29,7 @@ Save as `_docs/04_deploy/ci_cd_pipeline.md`.
### Test
- Unit tests: [framework and command]
- Blackbox tests: [framework and command, uses docker-compose.test.yml]
- Coverage threshold: 75% overall, 90% critical-path floor (100% aim) — per `.cursor/rules/cursor-meta.mdc` Quality Thresholds
- Coverage threshold: 75% overall, 90% critical paths
- Coverage report published as pipeline artifact
### Security
@@ -85,140 +85,3 @@ Save as `_docs/04_deploy/ci_cd_pipeline.md`.
| Deploy success | [Slack] | [team] |
| Deploy failure | [Slack/email + PagerDuty] | [on-call] |
```
---
## Reference Implementation: Woodpecker CI two-workflow contract
Use this when the project's CI is **Woodpecker** and the test layout follows the autodev e2e contract from [`../../decompose/templates/test-infrastructure-task.md`](../../decompose/templates/test-infrastructure-task.md) (an `e2e/` folder containing `Dockerfile`, `docker-compose.test.yml`, `conftest.py`, `requirements.txt`, `mocks/`, `fixtures/`, `tests/`).
The contract is **two workflows in `.woodpecker/`**, scheduled on the same agent label, with the build workflow gated on a successful test run:
- `.woodpecker/01-test.yml` — runs the e2e contract, publishes `results/report.csv` as an artifact, fails the pipeline on any test failure.
- `.woodpecker/02-build-push.yml``depends_on: [01-test]`. Builds the image, tags it `${CI_COMMIT_BRANCH}-${TAG_SUFFIX}`, pushes it to the registry. Skipped automatically if test failed.
The agent label is parameterized via `matrix:` so a single workflow file fans out across architectures: `labels: platform: ${PLATFORM}` routes each matrix entry to the matching agent. Both workflows for a repo must use the same matrix so test and build run on the same machine and share Docker layer cache. New architectures = new matrix entries; never new files.
### Multi-arch matrix conventions
| Variable | Meaning | Typical values |
|----------|---------|----------------|
| `PLATFORM` | Woodpecker agent label — selects which physical machine runs the entry. | `arm64`, `amd64` |
| `TAG_SUFFIX` | Image tag suffix appended after the branch name. | `arm`, `amd` |
| `DOCKERFILE` *(only when arches need different Dockerfiles)* | Path to the Dockerfile for this entry. | `Dockerfile`, `Dockerfile.jetson` |
Most repos use the same `Dockerfile` for both arches (multi-arch base images handle the rest), so `DOCKERFILE` can be omitted from the matrix and hardcoded in the build command. Repos with split per-arch Dockerfiles (e.g., `detections` uses `Dockerfile.jetson` on Jetson with TensorRT/CUDA-on-L4T) declare `DOCKERFILE` as a matrix var.
When only one architecture is currently in use, keep the matrix block with a single entry and the second entry commented out — adding a new arch is then a one-line uncomment, not a structural change.
### `.woodpecker/01-test.yml`
```yaml
when:
event: [push, pull_request, manual]
branch: [dev, stage, main]
matrix:
include:
- PLATFORM: arm64
TAG_SUFFIX: arm
# - PLATFORM: amd64
# TAG_SUFFIX: amd
labels:
platform: ${PLATFORM}
steps:
- name: e2e
image: docker
commands:
- cd e2e
- docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build
- docker compose -f docker-compose.test.yml down -v
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- name: report
image: docker
when:
status: [success, failure]
commands:
- test -f e2e/results/report.csv && cat e2e/results/report.csv || echo "no report"
volumes:
- /var/run/docker.sock:/var/run/docker.sock
```
Notes:
- `--abort-on-container-exit` shuts the whole compose down as soon as ANY service exits, so a crashed dependency surfaces immediately instead of hanging the runner.
- `--exit-code-from e2e-runner` ensures the pipeline's exit code reflects the test runner's, not the SUT's.
- The `report` step runs on `[success, failure]` so the report is always published; without this the CSV is lost on red builds.
- `down -v` between runs drops mock state and DB volumes — every test run starts clean.
### `.woodpecker/02-build-push.yml`
```yaml
when:
event: [push, manual]
branch: [dev, stage, main]
depends_on:
- 01-test
matrix:
include:
- PLATFORM: arm64
TAG_SUFFIX: arm
# - PLATFORM: amd64
# TAG_SUFFIX: amd
labels:
platform: ${PLATFORM}
steps:
- name: build-push
image: docker
environment:
REGISTRY_HOST:
from_secret: registry_host
REGISTRY_USER:
from_secret: registry_user
REGISTRY_TOKEN:
from_secret: registry_token
commands:
- echo "$REGISTRY_TOKEN" | docker login "$REGISTRY_HOST" -u "$REGISTRY_USER" --password-stdin
- export TAG=${CI_COMMIT_BRANCH}-${TAG_SUFFIX}
- export BUILD_DATE=$(date -u +%Y-%m-%dT%H:%M:%SZ)
- |
docker build -f Dockerfile \
--build-arg CI_COMMIT_SHA=$CI_COMMIT_SHA \
--label org.opencontainers.image.revision=$CI_COMMIT_SHA \
--label org.opencontainers.image.created=$BUILD_DATE \
--label org.opencontainers.image.source=$CI_REPO_URL \
-t $REGISTRY_HOST/azaion/<service>:$TAG .
- docker push $REGISTRY_HOST/azaion/<service>:$TAG
volumes:
- /var/run/docker.sock:/var/run/docker.sock
```
Notes:
- `depends_on: [01-test]` is enforced by Woodpecker — a failed `01-test` (any matrix entry) skips this workflow.
- The build workflow does NOT trigger on `pull_request` events: PRs get test signal only; pushes to `dev`/`stage`/`main` produce images. Avoids polluting the registry with PR images.
- Replace `<service>` with the actual service name (matches the registry namespace pattern `azaion/<service>`).
- For repos with split per-arch Dockerfiles, add `DOCKERFILE: Dockerfile.jetson` (or similar) to the matrix entry and substitute `${DOCKERFILE}` for `Dockerfile` in the `docker build -f` line.
### Variations by stack
The contract is language-agnostic because the runner is `docker compose`. The Dockerfile inside `e2e/` selects the test framework:
| Stack | `e2e/Dockerfile` runs |
|-------|----------------------|
| Python | `pytest --csv=/results/report.csv -v` |
| .NET | `dotnet test --logger:"trx;LogFileName=/results/report.trx"` (convert to CSV in a final step if needed) |
| Node/UI | `npm test -- --reporters=default --reporters=jest-junit --outputDirectory=/results` |
| Rust | `cargo test --no-fail-fast -- --format json > /results/report.json` |
When the repo has **only unit tests** (no `e2e/docker-compose.test.yml`), drop the compose orchestration and run the native test command directly inside a stack-appropriate image. Keep the same two-workflow split — `01-test.yml` runs unit tests, `02-build-push.yml` is unchanged.
### Manual-trigger override (test infrastructure not yet validated)
If a repo ships a complete `e2e/` layout but the test fixtures are not yet validated end-to-end (e.g., expected-results data is still being authored), gate `01-test.yml` on `event: [manual]` only and add a TODO comment pointing to the unblocking task. The `02-build-push.yml` workflow drops its `depends_on` clause for the manual-only window — an explicit and reversible exception, not a permanent split.
+465 -21
View File
@@ -19,19 +19,9 @@ disable-model-invocation: true
Analyze an existing codebase from the bottom up — individual modules first, then components, then system-level architecture — and produce the same `_docs/` artifacts that the `problem` and `plan` skills generate, without requiring user interview.
## File Index
| File | Purpose |
|------|---------|
| `workflows/full.md` | Full / Focus Area / Resume modes — Steps 07 (discovery through final report) |
| `workflows/task.md` | Task mode — lightweight incremental doc update triggered by task spec files |
| `references/artifacts.md` | Directory structure, state.json format, resumability, save principles |
**On every invocation**: read the appropriate workflow file based on mode detection below.
## Core Principles
- **Bottom-up always**: module docs component specs architecture/flows solution problem extraction. Every higher level is synthesized from the level below.
- **Bottom-up always**: module docs -> component specs -> architecture/flows -> solution -> problem extraction. Every higher level is synthesized from the level below.
- **Dependencies first**: process modules in topological order (leaves first). When documenting module X, all of X's dependencies already have docs.
- **Incremental context**: each module's doc uses already-written dependency docs as context — no ever-growing chain.
- **Verify against code**: cross-reference every entity in generated docs against actual codebase. Catch hallucinations.
@@ -56,16 +46,470 @@ Announce resolved paths (and FOCUS_DIR if set) to user before proceeding.
Determine the execution mode before any other logic:
| Mode | Trigger | Scope | Workflow File |
|------|---------|-------|---------------|
| **Full** | No input file, no existing state | Entire codebase | `workflows/full.md` |
| **Focus Area** | User provides a directory path (e.g., `@src/api/`) | Only the specified subtree + transitive dependencies | `workflows/full.md` |
| **Resume** | `state.json` exists in DOCUMENT_DIR | Continue from last checkpoint | `workflows/full.md` |
| **Task** | User provides a task spec file AND `_docs/02_document/` has existing docs | Targeted update of docs affected by the task | `workflows/task.md` |
| Mode | Trigger | Scope |
|------|---------|-------|
| **Full** | No input file, no existing state | Entire codebase |
| **Focus Area** | User provides a directory path (e.g., `@src/api/`) | Only the specified subtree + transitive dependencies |
| **Resume** | `state.json` exists in DOCUMENT_DIR | Continue from last checkpoint |
After detecting the mode, read and follow the corresponding workflow file.
Focus Area mode produces module + component docs for the targeted area only. It can be run repeatedly for different areas — each run appends to the existing module and component docs without overwriting other areas.
- **Full / Focus Area / Resume** → read `workflows/full.md`
- **Task** → read `workflows/task.md`
## Prerequisite Checks
For artifact directory structure and state.json format, see `references/artifacts.md`.
1. If `_docs/` already exists and contains files AND mode is **Full**, ASK user: **overwrite, merge, or write to `_docs_generated/` instead?**
2. Create DOCUMENT_DIR, SOLUTION_DIR, and PROBLEM_DIR if they don't exist
3. If DOCUMENT_DIR contains a `state.json`, offer to **resume from last checkpoint or start fresh**
4. If FOCUS_DIR is set, verify the directory exists and contains source files — **STOP if missing**
## Progress Tracking
Create a TodoWrite with all steps (0 through 7). Update status as each step completes.
## Workflow
### Step 0: Codebase Discovery
**Role**: Code analyst
**Goal**: Build a complete map of the codebase (or targeted subtree) before analyzing any code.
**Focus Area scoping**: if FOCUS_DIR is set, limit the scan to that directory subtree. Still identify transitive dependencies outside FOCUS_DIR (modules that FOCUS_DIR imports) and include them in the processing order, but skip modules that are neither inside FOCUS_DIR nor dependencies of it.
Scan and catalog:
1. Directory tree (ignore `node_modules`, `.git`, `__pycache__`, `bin/`, `obj/`, build artifacts)
2. Language detection from file extensions and config files
3. Package manifests: `package.json`, `requirements.txt`, `pyproject.toml`, `*.csproj`, `Cargo.toml`, `go.mod`
4. Config files: `Dockerfile`, `docker-compose.yml`, `.env.example`, CI/CD configs (`.github/workflows/`, `.gitlab-ci.yml`, `azure-pipelines.yml`)
5. Entry points: `main.*`, `app.*`, `index.*`, `Program.*`, startup scripts
6. Test structure: test directories, test frameworks, test runner configs
7. Existing documentation: README, `docs/`, wiki references, inline doc coverage
8. **Dependency graph**: build a module-level dependency graph by analyzing imports/references. Identify:
- Leaf modules (no internal dependencies)
- Entry points (no internal dependents)
- Cycles (mark for grouped analysis)
- Topological processing order
- If FOCUS_DIR: mark which modules are in-scope vs dependency-only
**Save**: `DOCUMENT_DIR/00_discovery.md` containing:
- Directory tree (concise, relevant directories only)
- Tech stack summary table (language, framework, database, infra)
- Dependency graph (textual list + Mermaid diagram)
- Topological processing order
- Entry points and leaf modules
**Save**: `DOCUMENT_DIR/state.json` with initial state:
```json
{
"current_step": "module-analysis",
"completed_steps": ["discovery"],
"focus_dir": null,
"modules_total": 0,
"modules_documented": [],
"modules_remaining": [],
"module_batch": 0,
"components_written": [],
"last_updated": ""
}
```
Set `focus_dir` to the FOCUS_DIR path if in Focus Area mode, or `null` for Full mode.
---
### Step 1: Module-Level Documentation
**Role**: Code analyst
**Goal**: Document every identified module individually, processing in topological order (leaves first).
**Batched processing**: process modules in batches of ~5 (sorted by topological order). After each batch: save all module docs, update `state.json`, present a progress summary. Between batches, evaluate whether to suggest a session break.
For each module in topological order:
1. **Read**: read the module's source code. Assess complexity and what context is needed.
2. **Gather context**: collect already-written docs of this module's dependencies (available because of bottom-up order). Note external library usage.
3. **Write module doc** with these sections:
- **Purpose**: one-sentence responsibility
- **Public interface**: exported functions/classes/methods with signatures, input/output types
- **Internal logic**: key algorithms, patterns, non-obvious behavior
- **Dependencies**: what it imports internally and why
- **Consumers**: what uses this module (from the dependency graph)
- **Data models**: entities/types defined in this module
- **Configuration**: env vars, config keys consumed
- **External integrations**: HTTP calls, DB queries, queue operations, file I/O
- **Security**: auth checks, encryption, input validation, secrets access
- **Tests**: what tests exist for this module, what they cover
4. **Verify**: cross-check that every entity referenced in the doc exists in the codebase. Flag uncertainties.
**Cycle handling**: modules in a dependency cycle are analyzed together as a group, producing a single combined doc.
**Large modules**: if a module exceeds comfortable analysis size, split into logical sub-sections and analyze each part, then combine.
**Save**: `DOCUMENT_DIR/modules/[module_name].md` for each module.
**State**: update `state.json` after each module completes (move from `modules_remaining` to `modules_documented`). Increment `module_batch` after each batch of ~5.
**Session break heuristic**: after each batch, if more than 10 modules remain AND 2+ batches have already completed in this session, suggest a session break:
```
══════════════════════════════════════
SESSION BREAK SUGGESTED
══════════════════════════════════════
Modules documented: [X] of [Y]
Batches completed this session: [N]
══════════════════════════════════════
A) Continue in this conversation
B) Save and continue in a fresh conversation (recommended)
══════════════════════════════════════
Recommendation: B — fresh context improves
analysis quality for remaining modules
══════════════════════════════════════
```
Re-entry is seamless: `state.json` tracks exactly which modules are done.
---
### Step 2: Component Assembly
**Role**: Software architect
**Goal**: Group related modules into logical components and produce component specs.
1. Analyze module docs from Step 1 to identify natural groupings:
- By directory structure (most common)
- By shared data models or common purpose
- By dependency clusters (tightly coupled modules)
2. For each identified component, synthesize its module docs into a single component specification using `templates/component-spec.md` as structure:
- High-level overview: purpose, pattern, upstream/downstream
- Internal interfaces: method signatures, DTOs (from actual module code)
- External API specification (if the component exposes HTTP/gRPC endpoints)
- Data access patterns: queries, caching, storage estimates
- Implementation details: algorithmic complexity, state management, key libraries
- Extensions and helpers: shared utilities needed
- Caveats and edge cases: limitations, race conditions, bottlenecks
- Dependency graph: implementation order relative to other components
- Logging strategy
3. Identify common helpers shared across multiple components -> document in `common-helpers/`
4. Generate component relationship diagram (Mermaid)
**Self-verification**:
- [ ] Every module from Step 1 is covered by exactly one component
- [ ] No component has overlapping responsibility with another
- [ ] Inter-component interfaces are explicit (who calls whom, with what)
- [ ] Component dependency graph has no circular dependencies
**Save**:
- `DOCUMENT_DIR/components/[##]_[name]/description.md` per component
- `DOCUMENT_DIR/common-helpers/[##]_helper_[name].md` per shared helper
- `DOCUMENT_DIR/diagrams/components.md` (Mermaid component diagram)
**BLOCKING**: Present component list with one-line summaries to user. Do NOT proceed until user confirms the component breakdown is correct.
---
### Step 3: System-Level Synthesis
**Role**: Software architect
**Goal**: From component docs, synthesize system-level documents.
All documents here are derived from component docs (Step 2) + module docs (Step 1). No new code reading should be needed. If it is, that indicates a gap in Steps 1-2 — go back and fill it.
#### 3a. Architecture
Using `templates/architecture.md` as structure:
- System context and boundaries from entry points and external integrations
- Tech stack table from discovery (Step 0) + component specs
- Deployment model from Dockerfiles, CI configs, environment strategies
- Data model overview from per-component data access sections
- Integration points from inter-component interfaces
- NFRs from test thresholds, config limits, health checks
- Security architecture from per-module security observations
- Key ADRs inferred from technology choices and patterns
**Save**: `DOCUMENT_DIR/architecture.md`
#### 3b. System Flows
Using `templates/system-flows.md` as structure:
- Trace main flows through the component interaction graph
- Entry point -> component chain -> output for each major flow
- Mermaid sequence diagrams and flowcharts
- Error scenarios from exception handling patterns
- Data flow tables per flow
**Save**: `DOCUMENT_DIR/system-flows.md` and `DOCUMENT_DIR/diagrams/flows/flow_[name].md`
#### 3c. Data Model
- Consolidate all data models from module docs
- Entity-relationship diagram (Mermaid ERD)
- Migration strategy (if ORM/migration tooling detected)
- Seed data observations
- Backward compatibility approach (if versioning found)
**Save**: `DOCUMENT_DIR/data_model.md`
#### 3d. Deployment (if Dockerfile/CI configs exist)
- Containerization summary
- CI/CD pipeline structure
- Environment strategy (dev, staging, production)
- Observability (logging patterns, metrics, health checks found in code)
**Save**: `DOCUMENT_DIR/deployment/` (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md — only files for which sufficient code evidence exists)
---
### Step 4: Verification Pass
**Role**: Quality verifier
**Goal**: Compare every generated document against actual code. Fix hallucinations, fill gaps, correct inaccuracies.
For each document generated in Steps 1-3:
1. **Entity verification**: extract all code entities (class names, function names, module names, endpoints) mentioned in the doc. Cross-reference each against the actual codebase. Flag any that don't exist.
2. **Interface accuracy**: for every method signature, DTO, or API endpoint in component specs, verify it matches actual code.
3. **Flow correctness**: for each system flow diagram, trace the actual code path and verify the sequence matches.
4. **Completeness check**: are there modules or components discovered in Step 0 that aren't covered by any document? Flag gaps.
5. **Consistency check**: do component docs agree with architecture doc? Do flow diagrams match component interfaces?
Apply corrections inline to the documents that need them.
**Save**: `DOCUMENT_DIR/04_verification_log.md` with:
- Total entities verified vs flagged
- Corrections applied (which document, what changed)
- Remaining gaps or uncertainties
- Completeness score (modules covered / total modules)
**BLOCKING**: Present verification summary to user. Do NOT proceed until user confirms corrections are acceptable or requests additional fixes.
**Session boundary**: After verification is confirmed, suggest a session break before proceeding to the synthesis steps (57). These steps produce different artifact types and benefit from fresh context:
```
══════════════════════════════════════
VERIFICATION COMPLETE — session break?
══════════════════════════════════════
Steps 04 (analysis + verification) are done.
Steps 57 (solution + problem extraction + report)
can run in a fresh conversation.
══════════════════════════════════════
A) Continue in this conversation
B) Save and continue in a new conversation (recommended)
══════════════════════════════════════
```
If **Focus Area mode**: Steps 57 are skipped (they require full codebase coverage). Present a summary of modules and components documented for this area. The user can run `/document` again for another area, or run without FOCUS_DIR once all areas are covered to produce the full synthesis.
---
### Step 5: Solution Extraction (Retrospective)
**Role**: Software architect
**Goal**: From all verified technical documentation, retrospectively create `solution.md` — the same artifact the research skill produces. This makes downstream skills (`plan`, `deploy`, `decompose`) compatible with the documented codebase.
Synthesize from architecture (Step 3) + component specs (Step 2) + system flows (Step 3) + verification findings (Step 4):
1. **Product Solution Description**: what the system is, brief component interaction diagram (Mermaid)
2. **Architecture**: the architecture that is implemented, with per-component solution tables:
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|-----------|-------------|-------------|----------|------|-----|
| [actual implementation] | [libs/platforms used] | [observed strengths] | [observed limitations] | [requirements met] | [security approach] | [cost indicators] | [fitness assessment] |
3. **Testing Strategy**: summarize integration/functional tests and non-functional tests found in the codebase
4. **References**: links to key config files, Dockerfiles, CI configs that evidence the solution choices
**Save**: `SOLUTION_DIR/solution.md` (`_docs/01_solution/solution.md`)
---
### Step 6: Problem Extraction (Retrospective)
**Role**: Business analyst
**Goal**: From all verified technical docs, retrospectively derive the high-level problem definition — producing the same documents the `problem` skill creates through interview.
This is the inverse of normal workflow: instead of problem -> solution -> code, we go code -> technical docs -> problem understanding.
#### 6a. `problem.md`
- Synthesize from architecture overview + component purposes + system flows
- What is this system? What problem does it solve? Who are the users? How does it work at a high level?
- Cross-reference with README if one exists
- Free-form text, concise, readable by someone unfamiliar with the project
#### 6b. `restrictions.md`
- Extract from: tech stack choices, Dockerfile specs (OS, base images), CI configs (platform constraints), dependency versions, environment configs
- Categorize with headers: Hardware, Software, Environment, Operational
- Each restriction should be specific and testable
#### 6c. `acceptance_criteria.md`
- Derive from: test assertions (expected values, thresholds), performance configs (timeouts, rate limits, batch sizes), health check endpoints, validation rules in code
- Categorize with headers by domain
- Every criterion must have a measurable value — if only implied, note the source
#### 6d. `input_data/`
- Document data schemas found (DB schemas, API request/response types, config file formats)
- Create `data_parameters.md` describing what data the system consumes, formats, volumes, update patterns
#### 6e. `security_approach.md` (only if security code found)
- Authentication mechanisms, authorization patterns, encryption, secrets handling, CORS, rate limiting, input sanitization — all from code observations
- If no security-relevant code found, skip this file
**Save**: all files to `PROBLEM_DIR/` (`_docs/00_problem/`)
**BLOCKING**: Present all problem documents to user. These are the most abstracted and therefore most prone to interpretation error. Do NOT proceed until user confirms or requests corrections.
---
### Step 7: Final Report
**Role**: Technical writer
**Goal**: Produce `FINAL_report.md` integrating all generated documentation.
Using `templates/final-report.md` as structure:
- Executive summary from architecture + problem docs
- Problem statement (transformed from problem.md, not copy-pasted)
- Architecture overview with tech stack one-liner
- Component summary table (number, name, purpose, dependencies)
- System flows summary table
- Risk observations from verification log (Step 4)
- Open questions (uncertainties flagged during analysis)
- Artifact index listing all generated documents with paths
**Save**: `DOCUMENT_DIR/FINAL_report.md`
**State**: update `state.json` with `current_step: "complete"`.
---
## Artifact Management
### Directory Structure
```
_docs/
├── 00_problem/ # Step 6 (retrospective)
│ ├── problem.md
│ ├── restrictions.md
│ ├── acceptance_criteria.md
│ ├── input_data/
│ │ └── data_parameters.md
│ └── security_approach.md
├── 01_solution/ # Step 5 (retrospective)
│ └── solution.md
└── 02_document/ # DOCUMENT_DIR
├── 00_discovery.md # Step 0
├── modules/ # Step 1
│ ├── [module_name].md
│ └── ...
├── components/ # Step 2
│ ├── 01_[name]/description.md
│ ├── 02_[name]/description.md
│ └── ...
├── common-helpers/ # Step 2
├── architecture.md # Step 3
├── system-flows.md # Step 3
├── data_model.md # Step 3
├── deployment/ # Step 3
├── diagrams/ # Steps 2-3
│ ├── components.md
│ └── flows/
├── 04_verification_log.md # Step 4
├── FINAL_report.md # Step 7
└── state.json # Resumability
```
### Resumability
Maintain `DOCUMENT_DIR/state.json`:
```json
{
"current_step": "module-analysis",
"completed_steps": ["discovery"],
"focus_dir": null,
"modules_total": 12,
"modules_documented": ["utils/helpers", "models/user"],
"modules_remaining": ["services/auth", "api/endpoints"],
"module_batch": 1,
"components_written": [],
"last_updated": "2026-03-21T14:00:00Z"
}
```
Update after each module/component completes. If interrupted, resume from next undocumented module.
When resuming:
1. Read `state.json`
2. Cross-check against actual files in DOCUMENT_DIR (trust files over state if they disagree)
3. Continue from the next incomplete item
4. Inform user which steps are being skipped
### Save Principles
1. **Save immediately**: write each module doc as soon as analysis completes
2. **Incremental context**: each subsequent module uses already-written docs as context
3. **Preserve intermediates**: keep all module docs even after synthesis into component docs
4. **Enable recovery**: state file tracks exact progress for resume
## Escalation Rules
| Situation | Action |
|-----------|--------|
| Minified/obfuscated code detected | WARN user, skip module, note in verification log |
| Module too large for context window | Split into sub-sections, analyze parts separately, combine |
| Cycle in dependency graph | Group cycled modules, analyze together as one doc |
| Generated code (protobuf, swagger-gen) | Note as generated, document the source spec instead |
| No tests found in codebase | Note gap in acceptance_criteria.md, derive AC from validation rules and config limits only |
| Contradictions between code and README | Flag in verification log, ASK user |
| Binary files or non-code assets | Skip, note in discovery |
| `_docs/` already exists | ASK user: overwrite, merge, or use `_docs_generated/` |
| Code intent is ambiguous | ASK user, do not guess |
## Common Mistakes
- **Top-down guessing**: never infer architecture before documenting modules. Build up, don't assume down.
- **Hallucinating entities**: always verify that referenced classes/functions/endpoints actually exist in code.
- **Skipping modules**: every source module must appear in exactly one module doc and one component.
- **Monolithic analysis**: don't try to analyze the entire codebase in one pass. Module by module, in order.
- **Inventing restrictions**: only document constraints actually evidenced in code, configs, or Dockerfiles.
- **Vague acceptance criteria**: "should be fast" is not a criterion. Extract actual numeric thresholds from code.
- **Writing code**: this skill produces documents, never implementation code.
## Methodology Quick Reference
```
┌──────────────────────────────────────────────────────────────────┐
│ Bottom-Up Codebase Documentation (8-Step) │
├──────────────────────────────────────────────────────────────────┤
│ MODE: Full / Focus Area (@dir) / Resume (state.json) │
│ PREREQ: Check _docs/ exists (overwrite/merge/new?) │
│ PREREQ: Check state.json for resume │
│ │
│ 0. Discovery → dependency graph, tech stack, topo order │
│ (Focus Area: scoped to FOCUS_DIR + transitive deps) │
│ 1. Module Docs → per-module analysis (leaves first) │
│ (batched ~5 modules; session break between batches) │
│ 2. Component Assembly → group modules, write component specs │
│ [BLOCKING: user confirms components] │
│ 3. System Synthesis → architecture, flows, data model, deploy │
│ 4. Verification → compare all docs vs code, fix errors │
│ [BLOCKING: user reviews corrections] │
│ [SESSION BREAK suggested before Steps 57] │
│ ── Focus Area mode stops here ── │
│ 5. Solution Extraction → retrospective solution.md │
│ 6. Problem Extraction → retrospective problem, restrictions, AC │
│ [BLOCKING: user confirms problem docs] │
│ 7. Final Report → FINAL_report.md │
├──────────────────────────────────────────────────────────────────┤
│ Principles: Bottom-up always · Dependencies first │
│ Incremental context · Verify against code │
│ Save immediately · Resume from checkpoint │
│ Batch modules · Session breaks for large codebases │
└──────────────────────────────────────────────────────────────────┘
```
@@ -1,72 +0,0 @@
# Document Skill — Artifact Management
## Directory Structure
```
_docs/
├── 00_problem/ # Step 6 (retrospective)
│ ├── problem.md
│ ├── restrictions.md
│ ├── acceptance_criteria.md
│ ├── input_data/
│ │ └── data_parameters.md
│ └── security_approach.md
├── 01_solution/ # Step 5 (retrospective)
│ └── solution.md
└── 02_document/ # DOCUMENT_DIR
├── 00_discovery.md # Step 0
├── modules/ # Step 1
│ ├── [module_name].md
│ └── ...
├── components/ # Step 2
│ ├── 01_[name]/description.md
│ ├── 02_[name]/description.md
│ └── ...
├── common-helpers/ # Step 2
├── architecture.md # Step 3
├── system-flows.md # Step 3
├── data_model.md # Step 3
├── deployment/ # Step 3
├── diagrams/ # Steps 2-3
│ ├── components.md
│ └── flows/
├── 04_verification_log.md # Step 4
├── glossary.md # Step 4.5 (confirmed-by-user)
├── FINAL_report.md # Step 7
└── state.json # Resumability
```
## State File (state.json)
Maintained in `DOCUMENT_DIR/state.json` for resumability:
```json
{
"current_step": "module-analysis",
"completed_steps": ["discovery"],
"focus_dir": null,
"modules_total": 12,
"modules_documented": ["utils/helpers", "models/user"],
"modules_remaining": ["services/auth", "api/endpoints"],
"module_batch": 1,
"components_written": [],
"step_4_5_glossary_vision": "not_started",
"last_updated": "2026-03-21T14:00:00Z"
}
```
Update after each module/component completes. If interrupted, resume from next undocumented module.
### Resume Protocol
1. Read `state.json`
2. Cross-check against actual files in DOCUMENT_DIR (trust files over state if they disagree)
3. Continue from the next incomplete item
4. Inform user which steps are being skipped
## Save Principles
1. **Save immediately**: write each module doc as soon as analysis completes
2. **Incremental context**: each subsequent module uses already-written docs as context
3. **Preserve intermediates**: keep all module docs even after synthesis into component docs
4. **Enable recovery**: state file tracks exact progress for resume
-509
View File
@@ -1,509 +0,0 @@
# Document Skill — Full / Focus Area / Resume Workflow
Covers three related modes that share the same 8-step pipeline:
- **Full**: entire codebase, no prior state
- **Focus Area**: scoped to a directory subtree + transitive dependencies
- **Resume**: continue from `state.json` checkpoint
## Prerequisite Checks
1. If `_docs/` already exists and contains files AND mode is **Full**, ASK user: **overwrite, merge, or write to `_docs_generated/` instead?**
2. Create DOCUMENT_DIR, SOLUTION_DIR, and PROBLEM_DIR if they don't exist
3. If DOCUMENT_DIR contains a `state.json`, offer to **resume from last checkpoint or start fresh**
4. If FOCUS_DIR is set, verify the directory exists and contains source files — **STOP if missing**
## Progress Tracking
Create a TodoWrite with all steps (0 through 7, including the inline Step 2.5 Module Layout Derivation and Step 4.5 Glossary & Architecture Vision). Update status as each step completes.
## Steps
### Step 0: Codebase Discovery
**Role**: Code analyst
**Goal**: Build a complete map of the codebase (or targeted subtree) before analyzing any code.
**Focus Area scoping**: if FOCUS_DIR is set, limit the scan to that directory subtree. Still identify transitive dependencies outside FOCUS_DIR (modules that FOCUS_DIR imports) and include them in the processing order, but skip modules that are neither inside FOCUS_DIR nor dependencies of it.
Scan and catalog:
1. Directory tree (ignore `node_modules`, `.git`, `__pycache__`, `bin/`, `obj/`, build artifacts)
2. Language detection from file extensions and config files
3. Package manifests: `package.json`, `requirements.txt`, `pyproject.toml`, `*.csproj`, `Cargo.toml`, `go.mod`
4. Config files: `Dockerfile`, `docker-compose.yml`, `.env.example`, CI/CD configs (`.github/workflows/`, `.gitlab-ci.yml`, `azure-pipelines.yml`)
5. Entry points: `main.*`, `app.*`, `index.*`, `Program.*`, startup scripts
6. Test structure: test directories, test frameworks, test runner configs
7. Existing documentation: README, `docs/`, wiki references, inline doc coverage
8. **Dependency graph**: build a module-level dependency graph by analyzing imports/references. Identify:
- Leaf modules (no internal dependencies)
- Entry points (no internal dependents)
- Cycles (mark for grouped analysis)
- Topological processing order
- If FOCUS_DIR: mark which modules are in-scope vs dependency-only
**Save**: `DOCUMENT_DIR/00_discovery.md` containing:
- Directory tree (concise, relevant directories only)
- Tech stack summary table (language, framework, database, infra)
- Dependency graph (textual list + Mermaid diagram)
- Topological processing order
- Entry points and leaf modules
**Save**: `DOCUMENT_DIR/state.json` with initial state (see `references/artifacts.md` for format).
---
### Step 1: Module-Level Documentation
**Role**: Code analyst
**Goal**: Document every identified module individually, processing in topological order (leaves first).
**Batched processing**: process modules in batches of ~5 (sorted by topological order). After each batch: save all module docs, update `state.json`, present a progress summary. Between batches, evaluate whether to suggest a session break.
For each module in topological order:
1. **Read**: read the module's source code. Assess complexity and what context is needed.
2. **Gather context**: collect already-written docs of this module's dependencies (available because of bottom-up order). Note external library usage.
3. **Write module doc** with these sections:
- **Purpose**: one-sentence responsibility
- **Public interface**: exported functions/classes/methods with signatures, input/output types
- **Internal logic**: key algorithms, patterns, non-obvious behavior
- **Dependencies**: what it imports internally and why
- **Consumers**: what uses this module (from the dependency graph)
- **Data models**: entities/types defined in this module
- **Configuration**: env vars, config keys consumed
- **External integrations**: HTTP calls, DB queries, queue operations, file I/O
- **Security**: auth checks, encryption, input validation, secrets access
- **Tests**: what tests exist for this module, what they cover
4. **Verify**: cross-check that every entity referenced in the doc exists in the codebase. Flag uncertainties.
**Cycle handling**: modules in a dependency cycle are analyzed together as a group, producing a single combined doc.
**Large modules**: if a module exceeds comfortable analysis size, split into logical sub-sections and analyze each part, then combine.
**Save**: `DOCUMENT_DIR/modules/[module_name].md` for each module.
**State**: update `state.json` after each module completes (move from `modules_remaining` to `modules_documented`). Increment `module_batch` after each batch of ~5.
**Session break heuristic**: after each batch, if more than 10 modules remain AND 2+ batches have already completed in this session, suggest a session break:
```
══════════════════════════════════════
SESSION BREAK SUGGESTED
══════════════════════════════════════
Modules documented: [X] of [Y]
Batches completed this session: [N]
══════════════════════════════════════
A) Continue in this conversation
B) Save and continue in a fresh conversation (recommended)
══════════════════════════════════════
Recommendation: B — fresh context improves
analysis quality for remaining modules
══════════════════════════════════════
```
Re-entry is seamless: `state.json` tracks exactly which modules are done.
---
### Step 2: Component Assembly
**Role**: Software architect
**Goal**: Group related modules into logical components and produce component specs.
1. Analyze module docs from Step 1 to identify natural groupings:
- By directory structure (most common)
- By shared data models or common purpose
- By dependency clusters (tightly coupled modules)
2. For each identified component, synthesize its module docs into a single component specification using `.cursor/skills/plan/templates/component-spec.md` as structure:
- High-level overview: purpose, pattern, upstream/downstream
- Internal interfaces: method signatures, DTOs (from actual module code)
- External API specification (if the component exposes HTTP/gRPC endpoints)
- Data access patterns: queries, caching, storage estimates
- Implementation details: algorithmic complexity, state management, key libraries
- Extensions and helpers: shared utilities needed
- Caveats and edge cases: limitations, race conditions, bottlenecks
- Dependency graph: implementation order relative to other components
- Logging strategy
3. Identify common helpers shared across multiple components → document in `common-helpers/`
4. Generate component relationship diagram (Mermaid)
**Self-verification**:
- [ ] Every module from Step 1 is covered by exactly one component
- [ ] No component has overlapping responsibility with another
- [ ] Inter-component interfaces are explicit (who calls whom, with what)
- [ ] Component dependency graph has no circular dependencies
**Save**:
- `DOCUMENT_DIR/components/[##]_[name]/description.md` per component
- `DOCUMENT_DIR/common-helpers/[##]_helper_[name].md` per shared helper
- `DOCUMENT_DIR/diagrams/components.md` (Mermaid component diagram)
**BLOCKING**: Present component list with one-line summaries to user. Do NOT proceed until user confirms the component breakdown is correct.
---
### Step 2.5: Module Layout Derivation
**Role**: Software architect
**Goal**: Produce `_docs/02_document/module-layout.md` — the authoritative file-ownership map read by `/implement` Step 4, `/code-review` Phase 7, and `/refactor` discovery. Required for any downstream skill that assigns file ownership or checks architectural layering.
This step derives the layout from the **existing** codebase rather than from a plan. Decompose Step 1.5 is the greenfield counterpart and uses the same template; this step uses the same output shape so downstream consumers don't branch on origin.
1. For each component identified in Step 2, resolve its owning directory from module docs (Step 1) and from directory groupings used in Step 2.
2. For each component, compute:
- **Public API**: exported symbols. Language-specific: Python — `__init__.py` re-exports + non-underscore root-level symbols; TypeScript — `index.ts` / barrel exports; C# — `public` types in the namespace root; Rust — `pub` items in `lib.rs` / `mod.rs`; Go — exported (capitalized) identifiers in the package root.
- **Internal**: everything else under the component's directory.
- **Owns**: the component's directory glob.
- **Imports from**: other components whose Public API this one references (parse imports; reuse tooling from Step 0's dependency graph).
- **Consumed by**: reverse of Imports from across all components.
3. Identify `shared/*` directories already present in the code (or infer candidates: modules imported by ≥2 components and owning no domain logic). Create a Shared / Cross-Cutting entry per concern.
4. Infer the Allowed Dependencies layering table by topologically sorting the import graph built in step 2. Components that import only from `shared/*` go to Layer 1; each successive layer imports only from lower layers.
5. Write `_docs/02_document/module-layout.md` using `.cursor/skills/decompose/templates/module-layout.md`. At the top of the file add `**Status**: derived-from-code` and a `## Verification Needed` block listing any inference that was not clean (detected cycles, ambiguous ownership, components not cleanly assignable to a layer).
**Self-verification**:
- [ ] Every component from Step 2 has a Per-Component Mapping entry
- [ ] Every Public API list is grounded in an actual exported symbol (no guesses)
- [ ] No component's `Imports from` points at a component in a higher layer
- [ ] Shared directories detected in code are listed under Shared / Cross-Cutting
- [ ] Cycles from Step 0 that span components are surfaced in `## Verification Needed`
**Save**: `_docs/02_document/module-layout.md`
**BLOCKING**: Present the layering table and the `## Verification Needed` block to the user. Do NOT proceed until the user confirms (or patches) the derived layout. Downstream skills assume this file is accurate.
---
### Step 3: System-Level Synthesis
**Role**: Software architect
**Goal**: From component docs, synthesize system-level documents.
All documents here are derived from component docs (Step 2) + module docs (Step 1). No new code reading should be needed. If it is, that indicates a gap in Steps 1-2 — go back and fill it.
#### 3a. Architecture
Using `.cursor/skills/plan/templates/architecture.md` as structure:
- System context and boundaries from entry points and external integrations
- Tech stack table from discovery (Step 0) + component specs
- Deployment model from Dockerfiles, CI configs, environment strategies
- Data model overview from per-component data access sections
- Integration points from inter-component interfaces
- NFRs from test thresholds, config limits, health checks
- Security architecture from per-module security observations
- Key ADRs inferred from technology choices and patterns
**Save**: `DOCUMENT_DIR/architecture.md`
#### 3b. System Flows
Using `.cursor/skills/plan/templates/system-flows.md` as structure:
- Trace main flows through the component interaction graph
- Entry point → component chain → output for each major flow
- Mermaid sequence diagrams and flowcharts
- Error scenarios from exception handling patterns
- Data flow tables per flow
**Save**: `DOCUMENT_DIR/system-flows.md` and `DOCUMENT_DIR/diagrams/flows/flow_[name].md`
#### 3c. Data Model
- Consolidate all data models from module docs
- Entity-relationship diagram (Mermaid ERD)
- Migration strategy (if ORM/migration tooling detected)
- Seed data observations
- Backward compatibility approach (if versioning found)
**Save**: `DOCUMENT_DIR/data_model.md`
#### 3d. Deployment (if Dockerfile/CI configs exist)
- Containerization summary
- CI/CD pipeline structure
- Environment strategy (dev, staging, production)
- Observability (logging patterns, metrics, health checks found in code)
**Save**: `DOCUMENT_DIR/deployment/` (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md — only files for which sufficient code evidence exists)
---
### Step 4: Verification Pass
**Role**: Quality verifier
**Goal**: Compare every generated document against actual code. Fix hallucinations, fill gaps, correct inaccuracies.
For each document generated in Steps 1-3:
1. **Entity verification**: extract all code entities (class names, function names, module names, endpoints) mentioned in the doc. Cross-reference each against the actual codebase. Flag any that don't exist.
2. **Interface accuracy**: for every method signature, DTO, or API endpoint in component specs, verify it matches actual code.
3. **Flow correctness**: for each system flow diagram, trace the actual code path and verify the sequence matches.
4. **Completeness check**: are there modules or components discovered in Step 0 that aren't covered by any document? Flag gaps.
5. **Consistency check**: do component docs agree with architecture doc? Do flow diagrams match component interfaces?
Apply corrections inline to the documents that need them.
**Save**: `DOCUMENT_DIR/04_verification_log.md` with:
- Total entities verified vs flagged
- Corrections applied (which document, what changed)
- Remaining gaps or uncertainties
- Completeness score (modules covered / total modules)
**BLOCKING**: Present verification summary to user. Do NOT proceed until user confirms corrections are acceptable or requests additional fixes.
---
### Step 4.5: Glossary & Architecture Vision (BLOCKING)
**Role**: Software architect + business analyst
**Goal**: Reconcile the AI's verified understanding of the codebase with the user's intended terminology and architecture vision. Existing-code projects often carry domain language and structural intent that is invisible from code alone (synonyms, deprecated names, modules that are "supposed to" be split, components the user thinks of as one logical unit even though they live in two folders). This step makes that intent explicit before any downstream skill (refactor, decompose, new-task) acts on the docs.
**When this step runs**:
- Always, after Step 4 (Verification Pass) — for Full and Resume modes.
- **Skipped** in Focus Area mode (the glossary/vision is system-wide; running it on a partial scan would produce a partial glossary). Resume the user once a full pass exists.
**Inputs** (already on disk after Step 4):
- `DOCUMENT_DIR/architecture.md`, `system-flows.md`, `data_model.md`, `deployment/*`
- `DOCUMENT_DIR/components/*/description.md`
- `DOCUMENT_DIR/modules/*.md`
- `DOCUMENT_DIR/04_verification_log.md` (so the AI knows which doc parts are confirmed vs. flagged)
**Outputs**:
- `DOCUMENT_DIR/glossary.md` (NEW)
- `DOCUMENT_DIR/architecture.md` updated in place: a new `## Architecture Vision` section is prepended (or merged into an existing "Overview" / "Vision" heading if already present); existing technical sections are preserved verbatim
**Procedure**:
1. **Draft glossary** from verified docs:
- Domain entities, processes, roles named in module/component docs
- Acronyms / abbreviations
- Internal codenames (project, service, model names) that recur in the codebase
- Synonym pairs the AI noticed (e.g., the codebase uses "flight" but module comments say "mission")
- Stakeholder personas if any docs reference them
Each entry: one-line definition + source reference (`source: components/03_flights/description.md`). Skip generic CS/industry terms.
2. **Draft architecture vision** as the AI currently understands the codebase:
- **One paragraph**: what the system is, who runs it, the runtime topology shape (monolith / services / pipeline / library / hybrid), and the dominant pattern (e.g., "submodule-based meta-repo with REST + SSE between UI and backend").
- **Components & responsibilities** (one-line each), pulled from `components/*/description.md`.
- **Major data flows** (one or two sentences each), pulled from `system-flows.md`.
- **Architectural principles / non-negotiables** the AI inferred from the code (e.g., "DB-driven config", "all UI traffic via REST + SSE only", "no per-component shared state"). Mark each with `inferred-from: <source>`.
- **Open questions / drift signals**: places where the code disagrees with itself, or where the AI cannot tell intent from implementation (e.g., two components doing similar work — is that legacy duplication or deliberate?).
3. **Present condensed view** to the user (NOT the full draft files — a synopsis only):
```
══════════════════════════════════════
REVIEW: Glossary + Architecture Vision (existing code)
══════════════════════════════════════
Glossary (N terms drafted from verified docs):
- <Term>: <one-line definition>
- ...
Architecture Vision — as inferred from the codebase:
<one-paragraph synopsis>
Components / responsibilities:
- <component>: <one-line>
- ...
Principles / non-negotiables (inferred):
- <principle> [inferred-from: <source>]
- ...
Open questions / drift signals:
- <q1>
- <q2>
══════════════════════════════════════
A) Inferred vision matches my intent — write the files
B) Add / correct entries (provide diffs — terms, components,
principles, or rename pairs)
C) Resolve the open questions / drift signals first
══════════════════════════════════════
Recommendation: pick C if any drift signals exist;
otherwise B if the vision misses
project-specific intent; A only when
the inferred vision is exactly right.
══════════════════════════════════════
```
4. **Iterate**:
- On B → integrate the user's diffs/additions, re-present, loop until A.
- On C → ask the listed open questions in one batch (M4-style), integrate answers, re-present.
- **Do NOT proceed to step 5 until the user picks A.**
5. **Save**:
- Write `DOCUMENT_DIR/glossary.md`, alphabetical, with a top-line `**Status**: confirmed-by-user` and the date.
- Update `DOCUMENT_DIR/architecture.md`:
- If a `## Architecture Vision` (or `## Vision` / `## Overview`) section already exists at the top, replace its body with the confirmed paragraph + components + principles.
- Otherwise, insert `## Architecture Vision` as the first H2 after the title; preserve every existing H2 below.
- Do NOT delete or re-order existing technical sections (Tech Stack, Deployment Model, Data Model, NFRs, ADRs).
6. **Update `state.json`**: mark `step_4_5_glossary_vision: confirmed`. Resume on rerun must skip this step unless the user explicitly invokes `/document --refresh-vision`.
**Self-verification**:
- [ ] Every glossary entry traces to at least one file under `DOCUMENT_DIR/`
- [ ] Every component listed in the vision matches a folder under `DOCUMENT_DIR/components/`
- [ ] All open questions are answered or explicitly deferred (with the user's acknowledgement)
- [ ] `architecture.md` still contains all H2 sections it had before this step
- [ ] User picked option A on the latest condensed view
**BLOCKING**: Do NOT proceed to the session boundary / Step 5 until both files are saved and the user has picked A.
---
**Session boundary**: After Step 4.5 is confirmed, suggest a session break before proceeding to the synthesis steps (57). These steps produce different artifact types and benefit from fresh context:
```
══════════════════════════════════════
VERIFICATION COMPLETE — session break?
══════════════════════════════════════
Steps 04 (analysis + verification) are done.
Steps 57 (solution + problem extraction + report)
can run in a fresh conversation.
══════════════════════════════════════
A) Continue in this conversation
B) Save and continue in a new conversation (recommended)
══════════════════════════════════════
```
If **Focus Area mode**: Steps 57 are skipped (they require full codebase coverage). Present a summary of modules and components documented for this area. The user can run `/document` again for another area, or run without FOCUS_DIR once all areas are covered to produce the full synthesis.
---
### Step 5: Solution Extraction (Retrospective)
**Role**: Software architect
**Goal**: From all verified technical documentation, retrospectively create `solution.md` — the same artifact the research skill produces.
Synthesize from architecture (Step 3) + component specs (Step 2) + system flows (Step 3) + verification findings (Step 4):
1. **Product Solution Description**: what the system is, brief component interaction diagram (Mermaid)
2. **Architecture**: the architecture that is implemented, with per-component solution tables:
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|-----------|-------------|-------------|----------|------|-----|
| [actual implementation] | [libs/platforms used] | [observed strengths] | [observed limitations] | [requirements met] | [security approach] | [cost indicators] | [fitness assessment] |
3. **Testing Strategy**: summarize integration/functional tests and non-functional tests found in the codebase
4. **References**: links to key config files, Dockerfiles, CI configs that evidence the solution choices
**Save**: `SOLUTION_DIR/solution.md` (`_docs/01_solution/solution.md`)
---
### Step 6: Problem Extraction (Retrospective)
**Role**: Business analyst
**Goal**: From all verified technical docs, retrospectively derive the high-level problem definition.
#### 6a. `problem.md`
- Synthesize from architecture overview + component purposes + system flows
- What is this system? What problem does it solve? Who are the users? How does it work at a high level?
- Cross-reference with README if one exists
#### 6b. `restrictions.md`
- Extract from: tech stack choices, Dockerfile specs, CI configs, dependency versions, environment configs
- Categorize: Hardware, Software, Environment, Operational
#### 6c. `acceptance_criteria.md`
- Derive from: test assertions, performance configs, health check endpoints, validation rules
- Every criterion must have a measurable value
#### 6d. `input_data/`
- Document data schemas (DB schemas, API request/response types, config file formats)
- Create `data_parameters.md` describing what data the system consumes
#### 6e. `security_approach.md` (only if security code found)
- Authentication, authorization, encryption, secrets handling, CORS, rate limiting, input sanitization
**Save**: all files to `PROBLEM_DIR/` (`_docs/00_problem/`)
**BLOCKING**: Present all problem documents to user. Do NOT proceed until user confirms or requests corrections.
---
### Step 7: Final Report
**Role**: Technical writer
**Goal**: Produce `FINAL_report.md` integrating all generated documentation.
Using `.cursor/skills/plan/templates/final-report.md` as structure:
- Executive summary from architecture + problem docs
- Problem statement (transformed from problem.md, not copy-pasted)
- Architecture overview with tech stack one-liner
- Component summary table (number, name, purpose, dependencies)
- System flows summary table
- Risk observations from verification log (Step 4)
- Open questions (uncertainties flagged during analysis)
- Artifact index listing all generated documents with paths
**Save**: `DOCUMENT_DIR/FINAL_report.md`
**State**: update `state.json` with `current_step: "complete"`.
---
## Escalation Rules
| Situation | Action |
|-----------|--------|
| Minified/obfuscated code detected | WARN user, skip module, note in verification log |
| Module too large for context window | Split into sub-sections, analyze parts separately, combine |
| Cycle in dependency graph | Group cycled modules, analyze together as one doc |
| Generated code (protobuf, swagger-gen) | Note as generated, document the source spec instead |
| No tests found in codebase | Note gap in acceptance_criteria.md, derive AC from validation rules and config limits only |
| Contradictions between code and README | Flag in verification log, ASK user |
| Binary files or non-code assets | Skip, note in discovery |
| `_docs/` already exists | ASK user: overwrite, merge, or use `_docs_generated/` |
| Code intent is ambiguous | ASK user, do not guess |
## Common Mistakes
- **Top-down guessing**: never infer architecture before documenting modules. Build up, don't assume down.
- **Hallucinating entities**: always verify that referenced classes/functions/endpoints actually exist in code.
- **Skipping modules**: every source module must appear in exactly one module doc and one component.
- **Monolithic analysis**: don't try to analyze the entire codebase in one pass. Module by module, in order.
- **Inventing restrictions**: only document constraints actually evidenced in code, configs, or Dockerfiles.
- **Vague acceptance criteria**: "should be fast" is not a criterion. Extract actual numeric thresholds from code.
- **Writing code**: this skill produces documents, never implementation code.
## Quick Reference
```
┌──────────────────────────────────────────────────────────────────┐
│ Bottom-Up Codebase Documentation (8-Step) │
├──────────────────────────────────────────────────────────────────┤
│ MODE: Full / Focus Area (@dir) / Resume (state.json) │
│ PREREQ: Check _docs/ exists (overwrite/merge/new?) │
│ PREREQ: Check state.json for resume │
│ │
│ 0. Discovery → dependency graph, tech stack, topo order │
│ (Focus Area: scoped to FOCUS_DIR + transitive deps) │
│ 1. Module Docs → per-module analysis (leaves first) │
│ (batched ~5 modules; session break between batches) │
│ 2. Component Assembly → group modules, write component specs │
│ [BLOCKING: user confirms components] │
│ 2.5 Module Layout → derive module-layout.md from code │
│ [BLOCKING: user confirms layout] │
│ 3. System Synthesis → architecture, flows, data model, deploy │
│ 4. Verification → compare all docs vs code, fix errors │
│ [BLOCKING: user reviews corrections] │
│ [SESSION BREAK suggested before Steps 57] │
│ ── Focus Area mode stops here ── │
│ 5. Solution Extraction → retrospective solution.md │
│ 6. Problem Extraction → retrospective problem, restrictions, AC │
│ [BLOCKING: user confirms problem docs] │
│ 7. Final Report → FINAL_report.md │
├──────────────────────────────────────────────────────────────────┤
│ Principles: Bottom-up always · Dependencies first │
│ Incremental context · Verify against code │
│ Save immediately · Resume from checkpoint │
│ Batch modules · Session breaks for large codebases │
└──────────────────────────────────────────────────────────────────┘
```
-112
View File
@@ -1,112 +0,0 @@
# Document Skill — Task Mode Workflow
Lightweight, incremental documentation update triggered by task spec files. Updates only the docs affected by implemented tasks — does NOT redo full discovery, verification, or problem extraction.
## Trigger
- User provides one or more task spec files (e.g., `@_docs/02_tasks/done/AZ-173_*.md`)
- AND `_docs/02_document/` already contains module/component docs
## Accepts
One or more task spec files from `_docs/02_tasks/todo/` or `_docs/02_tasks/done/`.
## Steps
### Task Step 0: Scope Analysis
1. Read each task spec — extract the "Files Modified" or "Scope / Included" section to identify which source files were changed
2. Map changed source files to existing module docs in `DOCUMENT_DIR/modules/`
3. Map affected modules to their parent components in `DOCUMENT_DIR/components/`
4. Identify which higher-level docs might be affected (system-flows, data_model, data_parameters)
**Output**: a list of docs to update, organized by level:
- Module docs (direct matches)
- Component docs (parents of affected modules)
- System-level docs (only if the task changed API endpoints, data models, or external integrations)
- Problem-level docs (only if the task changed input parameters, acceptance criteria, or restrictions)
### Task Step 0.5: Import-Graph Ripple
A module that changed may be imported by other modules whose docs are now stale even though those other modules themselves were not directly edited. Compute the reverse-dependency set and fold it into the update list.
1. For each source file in the set of changed files from Step 0, build its module-level identifier (Python module path, C# namespace, Rust module path, TS import-specifier, Go package path — depending on the project language).
2. Search the codebase for files that import from any of those identifiers. Preferred tooling per language:
- **Python**: `rg -e "^(from|import) <module>"` then parse with `ast` to confirm actual symbol use.
- **TypeScript / JavaScript**: `rg "from ['\"].*<path>"` then resolve via `tsconfig.json` paths / `jsconfig.json` if present.
- **C#**: `rg "^using <namespace>"` plus `.csproj` `ProjectReference` graph.
- **Rust**: `rg "use <crate>::"` plus `Cargo.toml` workspace members.
- **Go**: `rg "\"<module-path>\""` plus `go.mod` requires.
If a static analyzer is available for the project (e.g., `pydeps`, `madge`, `depcruise`, `NDepend`, `cargo modules`, `go list -deps`), prefer its output — it is more reliable than regex.
3. For each importing file found, look up the component it belongs to via `_docs/02_document/module-layout.md` (if present) or by directory match against `DOCUMENT_DIR/components/`.
4. Add every such component and module to the update list, even if it was not in the current cycle's task spec.
5. Produce `_docs/02_document/ripple_log_cycle<N>.md` (where `<N>` is `state.cycle` from `_docs/_autodev_state.md`, default `1`) listing each downstream doc that was added to the refresh set and the reason (which changed file triggered it). Example line:
```
- docs/components/02_ingestor.md — refreshed because src/ingestor/queue.py imports src/shared/serializer.py (changed by AZ-173)
```
6. When parsing imports fails (missing tooling, unsupported language), log the parse failure in the ripple log and fall back to a directory-proximity heuristic: any component whose source directory contains files matching the changed-file basenames. Note: heuristic mode is explicitly marked in the log so the user can request a manual pass.
### Task Step 1: Module Doc Updates
For each affected module:
1. Read the current source file
2. Read the existing module doc
3. Diff the module doc against current code — identify:
- New functions/methods/classes not in the doc
- Removed functions/methods/classes still in the doc
- Changed signatures or behavior
- New/removed dependencies
- New/removed external integrations
4. Update the module doc in-place, preserving the existing structure and style
5. If a module is entirely new (no existing doc), create a new module doc following the standard template from `workflows/full.md` Step 1
### Task Step 2: Component Doc Updates
For each affected component:
1. Read all module docs belonging to this component (including freshly updated ones)
2. Read the existing component doc
3. Update internal interfaces, dependency graphs, implementation details, and caveats sections
4. Do NOT change the component's purpose, pattern, or high-level overview unless the task fundamentally changed it
### Task Step 3: System-Level Doc Updates (conditional)
Only if the task changed API endpoints, system flows, data models, or external integrations:
1. Update `system-flows.md` — modify affected flow diagrams and data flow tables
2. Update `data_model.md` — if entities changed
3. Update `architecture.md` — only if new external integrations or architectural patterns were added
### Task Step 4: Problem-Level Doc Updates (conditional)
Only if the task changed API input parameters, configuration, or acceptance criteria:
1. Update `_docs/00_problem/input_data/data_parameters.md`
2. Update `_docs/00_problem/acceptance_criteria.md` — if new testable criteria emerged
### Task Step 5: Summary
Present a summary of all docs updated:
```
══════════════════════════════════════
DOCUMENTATION UPDATE COMPLETE
══════════════════════════════════════
Task(s): [task IDs]
Module docs updated: [count]
Component docs updated: [count]
System-level docs updated: [list or "none"]
Problem-level docs updated: [list or "none"]
Ripple-refreshed docs (imports changed indirectly): [count, see ripple_log_cycle<N>.md]
══════════════════════════════════════
```
## Principles
- **Minimal changes**: only update what the task actually changed. Do not rewrite unaffected sections.
- **Preserve style**: match the existing doc's structure, tone, and level of detail.
- **Verify against code**: for every entity added or changed in a doc, confirm it exists in the current source.
- **New modules**: if the task introduced an entirely new source file, create a new module doc from the standard template.
- **Dead references**: if the task removed code, remove the corresponding doc entries. Do not keep stale references.
+74 -299
View File
@@ -1,111 +1,54 @@
---
name: implement
description: |
Implement tasks sequentially with dependency-aware batching and integrated code review.
Orchestrate task implementation with dependency-aware batching, parallel subagents, and integrated code review.
Reads flat task files and _dependencies_table.md from TASKS_DIR, computes execution batches via topological sort,
implements tasks one at a time in dependency order, runs code-review skill after each batch, and loops until done.
launches up to 4 implementer subagents in parallel, runs code-review skill after each batch, and loops until done.
Use after /decompose has produced task files.
Trigger phrases:
- "implement", "start implementation", "implement tasks"
- "execute tasks"
- "run implementers", "execute tasks"
category: build
tags: [implementation, batching, code-review]
tags: [implementation, orchestration, batching, parallel, code-review]
disable-model-invocation: true
---
# Implementation Runner
# Implementation Orchestrator
Implement all tasks produced by the `/decompose` skill. This skill reads task specs, computes execution order, writes the code and tests for each task **sequentially** (no subagents, no parallel execution), validates results via the `/code-review` skill, and escalates issues.
Orchestrate the implementation of all tasks produced by the `/decompose` skill. This skill is a **pure orchestrator** — it does NOT write implementation code itself. It reads task specs, computes execution order, delegates to `implementer` subagents, validates results via the `/code-review` skill, and escalates issues.
For each task the main agent receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria — then moves on to the next task.
The `implementer` agent is the specialist that writes all the code — it receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria.
## Core Principles
- **Sequential execution**: implement one task at a time. Do NOT spawn subagents and do NOT run tasks in parallel. (See `.cursor/rules/no-subagents.mdc`.)
- **Dependency-aware ordering**: tasks run only when all their dependencies are satisfied
- **Batching for review, not parallelism**: tasks are grouped into batches so `/code-review` and commits operate on a coherent unit of work — all tasks inside a batch are still implemented one after the other
- **Orchestrate, don't implement**: this skill delegates all coding to `implementer` subagents
- **Dependency-aware batching**: tasks run only when all their dependencies are satisfied
- **Max 4 parallel agents**: never launch more than 4 implementer subagents simultaneously
- **File isolation**: no two parallel agents may write to the same file
- **Integrated review**: `/code-review` skill runs automatically after each batch
- **Completeness before testing**: product implementation is not done until code is checked against task outcomes, included scope, architecture/component promises, named runtime dependencies, and unresolved scaffold/native placeholders — not just task AC tests
- **Runtime dependency reality**: production code cannot satisfy a task by exposing only a protocol, fake runner, deterministic fallback, or "native bridge" placeholder when the task/architecture promises a concrete internal capability such as BASALT VIO, FAISS retrieval, LightGlue matching, or a full A-Z localization pipeline. Stubs are allowed only for external systems and tests.
- **Auto-start**: batches start immediately — no user confirmation before a batch
- **Auto-start**: batches launch immediately — no user confirmation before a batch
- **Gate on failure**: user confirmation is required only when code review returns FAIL
- **Commit per batch**: after each batch is confirmed, commit. Ask the user whether to push to remote unless the user previously opted into auto-push for this session.
- **Commit and push per batch**: after each batch is confirmed, commit and push to remote
## Context Resolution
- TASKS_DIR: `_docs/02_tasks/`
- Task files: selected `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
- Task files: all `*.md` files in TASKS_DIR (excluding files starting with `_`)
- Dependency table: `TASKS_DIR/_dependencies_table.md`
### Task Selection Context
The invoking flow decides which task category this run should execute. The implement skill must honor that selected context instead of consuming every file in `todo/`.
| Context | Selected task files |
|---------|---------------------|
| Product implementation | Task specs that are not test-only and not refactoring specs |
| Test implementation | `*_test_infrastructure.md` plus task specs whose `Component` or `Epic` identifies `Blackbox Tests` |
| Refactoring | Task specs whose filename or task ID includes `_refactor_` |
If no explicit context is provided, infer it from the active autodev step:
- greenfield Step 7 or existing-code Step 10 → Product implementation
- greenfield Step 10 or existing-code Step 6 → Test implementation
- refactor Phase 4 → Refactoring
Unselected task files remain in `TASKS_DIR/todo/` for their later flow step.
### Task Lifecycle Folders
```
TASKS_DIR/
├── _dependencies_table.md
├── todo/ ← tasks ready for implementation (this skill reads from here)
├── backlog/ ← parked tasks (not scheduled yet, ignored by this skill)
└── done/ ← completed tasks (moved here after implementation)
```
### Suite-level invocation context (meta-repo flow)
When invoked from `.cursor/skills/autodev/flows/meta-repo.md` Step 3.5 (or any caller that supplies the same context envelope), the skill receives:
```
suite_level: true
TASKS_DIR: <override> # e.g., _docs/tasks/ (vs. default _docs/02_tasks/)
module_layout_path: <override> # e.g., _docs/tasks/_suite_module_layout.md
```
When `suite_level: true` is present, the following gate adjustments apply — and ONLY these. All other steps (114, 16) execute unchanged:
1. **TASKS_DIR override** is honored throughout the skill (Step 1 Parse, Step 13 Archive, Step 15 input paths if it ran). Default `_docs/02_tasks/` is replaced by the supplied path.
2. **module_layout_path override** is read instead of the hardcoded `_docs/02_document/module-layout.md` in Step 4 (Assign File Ownership). The supplied file uses the same `Per-Component Mapping` schema. If both the override and the hardcoded path are missing, behavior is unchanged from default mode (STOP and instruct).
3. **Step 14.5 (Cumulative Code Review) — SKIPPED**. The meta-repo has no `_docs/02_document/architecture_compliance_baseline.md`; cross-task drift is captured by the next `monorepo-status` cycle instead.
4. **Step 15 (Product Implementation Completeness Gate) — SKIPPED**. The gate's hard inputs (`_docs/02_document/architecture.md`, `system-flows.md`, `components/*/description.md`) do not exist in the meta-repo artifact layout. Suite-level tasks are infrastructure / coordination work (renames, cross-repo edits, suite-root infra additions), not feature implementation; the equivalent completeness signal is the next `monorepo-status` drift report (which the meta-repo flow re-runs immediately after Step 3.5 returns).
5. **Final report filename**: `_docs/03_implementation/suite_implementation_report_{run_name}.md` (in addition to the existing feature/test/refactor variants). Batch reports follow `_docs/03_implementation/suite_batch_{NN}_report.md`.
6. **Tracker integration** (Step 5: In Progress, Step 12: In Testing) runs unchanged — suite-level tickets follow the same tracker rules as any other.
Without `suite_level: true`, none of these adjustments apply and the skill runs exactly as documented in default mode.
## Prerequisite Checks (BLOCKING)
1. `TASKS_DIR/todo/` exists and contains at least one task file for the selected context **STOP if missing**
- Exception for Product implementation re-entry: if no selected product tasks remain in `todo/`, but the active autodev state is Step 7 or the latest product completeness report is missing/invalid/contains `FAIL`, skip directly to Step 15 (Product Implementation Completeness Gate). This gate may create remediation tasks and return to Step 1. Do not write a final implementation report from this state.
1. TASKS_DIR exists and contains at least one task file — **STOP if missing**
2. `_dependencies_table.md` exists — **STOP if missing**
3. At least one task is not yet completed — **STOP if all done**
4. **Working tree is clean** — run `git status --porcelain`; the output must be empty.
- If dirty, STOP and present the list of changed files to the user via the Choose format:
- A) Commit or stash stray changes manually, then re-invoke `/implement`
- B) Agent commits stray changes as a single `chore: WIP pre-implement` commit and proceeds
- C) Abort
- Rationale: each batch ends with a commit. Unrelated uncommitted changes would get silently folded into batch commits otherwise.
- This check is repeated at the start of each batch iteration (see step 6 / step 14 Loop).
## Algorithm
### 1. Parse
- Read selected task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
- Read all task `*.md` files from TASKS_DIR (excluding files starting with `_`)
- Read `_dependencies_table.md` — parse into a dependency graph (DAG)
- Validate: no circular dependencies in the selected task graph, all referenced selected-task dependencies exist or are already completed in `TASKS_DIR/done/`
- Validate: no circular dependencies, all referenced dependencies exist
### 2. Detect Progress
@@ -118,252 +61,87 @@ Without `suite_level: true`, none of these adjustments apply and the skill runs
- Topological sort remaining tasks
- Select tasks whose dependencies are ALL satisfied (completed)
- A batch is simply a coherent group of tasks for review + commit. Within the batch, tasks are implemented sequentially in topological order.
- Cap the batch size at a reasonable review scope (default: 4 tasks)
- If a ready task depends on any task currently being worked on in this batch, it must wait for the next batch
- Cap the batch at 4 parallel agents
- If the batch would exceed 20 total complexity points, suggest splitting and let the user decide
### 4. Assign File Ownership
The authoritative file-ownership map is `_docs/02_document/module-layout.md` (produced by the decompose skill's Step 1.5), unless `suite_level: true` was supplied in the invocation context — in which case the `module_layout_path` override is read instead (see "Suite-level invocation context" above). Task specs are purely behavioral — they do NOT carry file paths. Derive ownership from the layout, not from the task spec's prose.
For each task in the batch:
- Read the task spec's **Component** field.
- Look up the component in `_docs/02_document/module-layout.md` → Per-Component Mapping.
- Set **OWNED** = the component's `Owns` glob (the files this task is allowed to write).
- Set **READ-ONLY** = Public API files of every component in the component's `Imports from` list, plus all `shared/*` Public API files.
- Set **FORBIDDEN** = every other component's `Owns` glob, and every other component's internal (non-Public API) files.
- If the task is a shared / cross-cutting task (lives under `shared/*`), OWNED = that shared directory; READ-ONLY = nothing; FORBIDDEN = every component directory.
Since execution is sequential, there is no parallel-write conflict to resolve; ownership here is a **scope discipline** check — it stops a task from drifting into unrelated components even when alone.
If `_docs/02_document/module-layout.md` is missing or the component is not found:
- STOP the batch.
- Instruct the user to run `/decompose` Step 1.5 or to manually add the component entry to `module-layout.md`.
- Do NOT guess file paths from the task spec — that is exactly the drift this file exists to prevent.
- Parse the task spec's Component field and Scope section
- Map the component to directories/files in the project
- Determine: files OWNED (exclusive write), files READ-ONLY (shared interfaces, types), files FORBIDDEN (other agents' owned files)
- If two tasks in the same batch would modify the same file, schedule them sequentially instead of in parallel
### 5. Update Tracker Status → In Progress
For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before starting work. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.
For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (Jira MCP or Azure DevOps MCP — see `protocols.md` for detection) before launching the implementer. If `tracker: local`, skip this step.
### 6. Implement Tasks Sequentially
### 6. Launch Implementer Subagents
**Per-batch dirty-tree re-check**: before starting the batch, run `git status --porcelain`. On the first batch this is guaranteed clean by the prerequisite check. On subsequent batches, the previous batch ended with a commit so the tree should still be clean. If the tree is dirty at this point, STOP and surface the dirty files to the user using the same A/B/C choice as the prerequisite check. The most likely causes are a failed commit in the previous batch, a user who edited files mid-loop, or a pre-commit hook that re-wrote files and was not captured.
For each task in the batch, launch an `implementer` subagent with:
- Path to the task spec file
- List of files OWNED (exclusive write access)
- List of files READ-ONLY
- List of files FORBIDDEN
For each task in the batch **in topological order, one at a time**:
1. Read the task spec file.
2. Respect the file-ownership envelope computed in Step 4 (OWNED / READ-ONLY / FORBIDDEN).
3. Implement the feature and write/update tests for every acceptance criterion in the spec. Tests for internal product behavior must exercise the production implementation path. If a test cannot run in the current environment (e.g., TensorRT requires GPU), the test must still exist and skip/block with a clear prerequisite reason, but that skip does not make missing production code complete.
4. Run the relevant tests locally before moving on to the next task in the batch. If tests fail, fix in-place — do not defer.
5. Capture a short per-task status line (files changed, tests pass/fail, any blockers) for the batch report.
Launch all subagents immediately — no user confirmation.
Do NOT spawn subagents and do NOT attempt to implement two tasks simultaneously, even if they touch disjoint files. See `.cursor/rules/no-subagents.mdc`.
### 7. Monitor
### 7. Collect Status
- Wait for all subagents to complete
- Collect structured status reports from each implementer
- If any implementer reports "Blocked", log the blocker and continue with others
- After all tasks in the batch are finished, aggregate the per-task status lines into a structured batch status.
- If any task reported "Blocked", log the blocker with the failing task's ID and continue — the batch report will surface it.
**Stuck detection** — while monitoring, watch for these signals per subagent:
- Same file modified 3+ times without test pass rate improving → flag as stuck, stop the subagent, report as Blocked
- Subagent has not produced new output for an extended period → flag as potentially hung
- If a subagent is flagged as stuck, do NOT let it continue looping — stop it and record the blocker in the batch report
**Stuck detection** — while implementing a task, watch for these signals in your own progress:
- The same file has been rewritten 3+ times without tests going green → stop, mark the task Blocked, and move to the next task in the batch (the user will be asked at the end of the batch).
- You have tried 3+ distinct approaches without evidence-driven progress → stop, mark Blocked, move on.
- Do NOT loop indefinitely on a single task. Record the blocker and proceed.
### 8. AC Test Coverage Verification
Before code review, verify that every acceptance criterion in each task spec has at least one test that validates it. For each task in the batch:
1. Read the task spec's **Acceptance Criteria** section
2. Search the test files (new and existing) for tests that cover each AC
3. Classify each AC as:
- **Covered**: a test directly validates this AC (running or skipped-with-reason)
- **Not covered**: no test exists for this AC
If any AC is **Not covered**:
- This is a **BLOCKING** failure — the missing test must be written before proceeding
- Go back to the offending task, add tests for the specific ACs that lack coverage, then re-run this check
- If the test cannot run in the current environment (GPU required, platform-specific, external service), the test must still exist and skip with `pytest.mark.skipif` or `pytest.skip()` explaining the prerequisite
- A skipped test counts as **Covered** — the test exists and will run when the environment allows
Only proceed to Step 9 when every AC has a corresponding test.
### 9. Code Review
### 8. Code Review
- Run `/code-review` skill on the batch's changed files + corresponding task specs
- The code-review skill produces a verdict: PASS, PASS_WITH_WARNINGS, or FAIL
### 10. Auto-Fix Gate
### 9. Auto-Fix Gate
Bounded auto-fix loop — only applies to **mechanical** findings. Critical and Security findings are never auto-fixed.
Auto-fix loop with bounded retries (max 2 attempts) before escalating to user:
**Auto-fix eligibility matrix:**
1. If verdict is **PASS** or **PASS_WITH_WARNINGS**: show findings as info, continue automatically to step 10
2. If verdict is **FAIL** (attempt 1 or 2):
- Parse the code review findings (Critical and High severity items)
- For each finding, attempt an automated fix using the finding's location, description, and suggestion
- Re-run `/code-review` on the modified files
- If now PASS or PASS_WITH_WARNINGS → continue to step 10
- If still FAIL → increment retry counter, repeat from (2) up to max 2 attempts
3. If still **FAIL** after 2 auto-fix attempts: present all findings to user (**BLOCKING**). User must confirm fixes or accept before proceeding.
| Severity | Category | Auto-fix? |
|----------|----------|-----------|
| Low | any | yes |
| Medium | Style, Maintainability, Performance | yes |
| Medium | Bug, Spec-Gap, Security, Architecture | escalate |
| High | Style, Scope | yes |
| High | Bug, Spec-Gap, Performance, Maintainability, Architecture | escalate |
| Critical | any | escalate |
| any | Security | escalate |
| any | Architecture (cyclic deps) | escalate |
Track `auto_fix_attempts` count in the batch report for retrospective analysis.
Flow:
### 10. Test
1. If verdict is **PASS** or **PASS_WITH_WARNINGS**: show findings as info, continue to step 11
2. If verdict is **FAIL**:
- Partition findings into auto-fix-eligible and escalate (using the matrix above)
- For eligible findings, attempt fixes using location/description/suggestion, then re-run `/code-review` on modified files (max 2 rounds)
- If all remaining findings are auto-fix-eligible and re-review now passes → continue to step 11
- If any non-eligible finding exists at any point → stop auto-fixing, present the full list to the user (**BLOCKING**)
3. User must explicitly approve each non-auto-fix finding (accept, request manual fix, mark as out-of-scope) before proceeding.
- Run the full test suite
- If failures: report to user with details
Track `auto_fix_attempts` and `escalated_findings` in the batch report for retrospective analysis.
### 11. Commit (and optionally Push)
### 11. Commit and Push
- After user confirms the batch (explicitly for FAIL, implicitly for PASS/PASS_WITH_WARNINGS):
- `git add` all changed files from the batch
- `git commit` with a message that includes ALL task IDs (tracker IDs or numeric prefixes) of tasks implemented in the batch, followed by a summary of what was implemented. Format: `[TASK-ID-1] [TASK-ID-2] ... Summary of changes`
- Ask the user whether to push to remote, unless the user previously opted into auto-push for this session
- `git commit` with a message that includes ALL task IDs (Jira IDs, ADO IDs, or numeric prefixes) of tasks implemented in the batch, followed by a summary of what was implemented. Format: `[TASK-ID-1] [TASK-ID-2] ... Summary of changes`
- `git push` to the remote branch
### 12. Update Tracker Status → In Testing
After the batch is committed (and pushed if the user approved pushing), transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.
After the batch is committed and pushed, transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step.
### 13. Archive Completed Tasks
### 13. Loop
Move each completed task file from `TASKS_DIR/todo/` to `TASKS_DIR/done/`.
For product implementation, this archive means "batch implementation accepted." The Product Implementation Completeness Gate can still require follow-up remediation tasks before the feature is complete; it does not move original task files back to `todo/`.
### 14. Loop
- Go back to step 2 until all tasks in `todo/` are done
### 14.5. Cumulative Code Review (every K batches)
**Skipped entirely when `suite_level: true`** (see "Suite-level invocation context" above) — the meta-repo has no `architecture_compliance_baseline.md` to evaluate against; cross-task drift is captured by the next `monorepo-status` cycle.
- **Trigger**: every K completed batches (default `K = 3`; configurable per run via a `cumulative_review_interval` knob in the invocation context)
- **Purpose**: per-batch review (Step 9) catches batch-local issues; cumulative review catches issues that only appear when tasks are combined — architecture drift, cross-task inconsistency, duplicate symbols introduced across different batches, contracts that drifted across producer/consumer batches
- **Scope**: the union of files changed since the **last** cumulative review (or since the start of the run if this is the first)
- **Action**: invoke `.cursor/skills/code-review/SKILL.md` in **cumulative mode**. All 7 phases run, with emphasis on Phase 6 (Cross-Task Consistency), Phase 7 (Architecture Compliance), and duplicate-symbol detection across the accumulated code
- **Output**: write the report to `_docs/03_implementation/cumulative_review_batches_[NN-MM]_cycle[N]_report.md` where `[NN-MM]` is the batch range covered and `[N]` is the current `state.cycle`. When `_docs/02_document/architecture_compliance_baseline.md` exists, the report includes the `## Baseline Delta` section (carried over / resolved / newly introduced) per `code-review/SKILL.md` "Baseline delta".
- **Gate**:
- `PASS` or `PASS_WITH_WARNINGS` → continue to next batch (step 14 loop)
- `FAIL` → STOP. Present the report to the user via the Choose format:
- A) Auto-fix findings using the Auto-Fix Gate matrix in step 10, then re-run cumulative review
- B) Open a targeted refactor run (invoke refactor skill in guided mode with the findings as `list-of-changes.md`)
- C) Manually fix, then re-invoke `/implement`
- Do NOT loop to the next batch on `FAIL` — the whole point is to stop drift before it compounds
- **Interaction with Auto-Fix Gate**: Architecture findings (new category from code-review Phase 7) always escalate per the implement auto-fix matrix; they cannot silently auto-fix
- **Resumability**: if interrupted, the next invocation checks for the latest `cumulative_review_batches_*.md` and computes the changed-file set from batch reports produced after that review
### 15. Product Implementation Completeness Gate
Run this gate after all **product implementation** tasks are complete and before writing any final product implementation report or allowing autodev to proceed to testability/test decomposition. Skip this gate when (a) the remaining context is explicitly test implementation or refactoring (as determined by the task files and report filename rules), OR (b) `suite_level: true` was supplied in the invocation context (the gate's inputs do not exist in the meta-repo artifact layout — see "Suite-level invocation context" above).
**Goal**: catch the failure mode where narrow tests validate scaffold behavior while the task's actual outcome, included scope, architecture promise, or named integration remains unimplemented.
Inputs:
- Completed product task specs from `_docs/02_tasks/done/` for the current cycle
- `_docs/02_document/architecture.md`
- `_docs/02_document/system-flows.md`
- Relevant `_docs/02_document/components/*/description.md` files
- Current source code under each completed task's ownership envelope
- Batch reports and code-review reports for the current cycle
For each completed product task:
1. Read these sections from the task spec: `Description`, `Outcome`, `Scope / Included`, `Acceptance Criteria`, `Non-Functional Requirements`, `Constraints`, and explicit named technologies or integrations.
2. Compare those promises against actual source code, not only tests or report prose.
3. Search the task's owned component files for unresolved implementation markers: `placeholder`, `stub`, `reserved`, `TODO`, `NotImplemented`, `pass`, `deterministic`, `fake`, `mock`, `scaffold`, `native bridge`, and empty native/readme-only integration directories. Ignore test fixtures/mocks only when they are under test-owned paths and not used as production behavior.
4. Verify that each named runtime dependency in the task promise is integrated as production behavior, not merely represented by an interface. Examples: if a task promises FAISS, DINOv2, BASALT, LightGlue, OpenCV, RANSAC, a database, cloud service, or hardware SDK, the production code must either call that dependency or contain an adapter that loads and executes the real dependency package. A deterministic fallback, fake runner, empty `native/` package, or "bridge to be supplied later" is **FAIL** unless the task itself explicitly scoped the dependency out before implementation started.
5. Distinguish internal implementation from external prerequisites:
- Internal product capabilities (VIO, anchor verification, cache retrieval, safety wrapper, FDR, MAVLink emission) must be implemented in production code before the task can pass.
- External systems/hardware/data (Jetson device, physical camera, ArduPilot process, QGC, third-party service credentials, unavailable licensed dataset) may be `BLOCKED` only when production code exists and the missing prerequisite is outside the product boundary.
6. Verify tests exercise the real implementation path where local prerequisites exist. Environment-gated tests may skip only with an explicit prerequisite reason; they do not make missing production code complete.
7. For any architecture promise that describes an end-to-end user outcome, verify there is an executable production pipeline connecting the relevant components. Isolated component contracts and test-only harness orchestration are not enough.
8. Classify each task:
- **PASS**: task promises are implemented or explicitly out of scope in the task itself.
- **BLOCKED**: production code exists but cannot be fully verified due to external hardware/data/license/runtime prerequisites; the blocker is explicit and tests report blocked/skipped with reason.
- **FAIL**: promised production behavior is missing, only scaffolded, or only represented in tests/reports.
#### 15.b System-Pipeline Check (runs ONCE per gate invocation, after per-task classification)
The per-task classification above (steps 18) operates on `_docs/02_tasks/done/`. It catches missing component-local behavior but it CANNOT catch a missing *integration* — there is no task to fail if no task ever owned the integration in the first place. The GPS-passthrough incident (May 2026) escaped this gate because every per-component task in `done/` was honestly complete; the missing piece was the cross-component loop, which had no owning task.
The system-pipeline check fixes that by walking the architecture documents directly, independent of `done/`.
**Inputs**:
- `_docs/02_document/architecture.md`
- `_docs/02_document/system-flows.md`
- Full source tree under the project's production directory (e.g. `src/`).
**Procedure**:
1. **Enumerate end-to-end pipelines.** Read `architecture.md` and `system-flows.md`. For each named pipeline / operational flow that spans 2+ components, record the ordered component sequence and the trigger (per-frame, per-request, scheduled, manual).
2. **Grep for production callers of each seam method.** For each adjacent pair `A → B` in a pipeline, find a production source file (not under `tests/`, not under a `bench/` package, not a doc) that calls `A`'s public output method AND passes the result into `B`'s public input method.
3. **Classify the pipeline**:
- **WIRED**: a production caller exists and the chain is complete from the first to the last component in the sequence.
- **PARTIALLY WIRED**: some adjacent pairs have callers but at least one seam is missing.
- **NOT WIRED**: no production code calls the pipeline's components in order. Bench tools, unit tests, and microbenchmarks do NOT count as "wiring".
4. **Distinguish "wired but stubbed" from "wired with real components"**: a caller that invokes a passthrough / GPS-from-tlog / mock-output-generator instead of the real component is `NOT WIRED` for the purposes of this gate. The seam exists in the source file but the production behavior is faked. Grep for the same scaffold markers Step 15 already enumerates (`placeholder`, `stub`, `passthrough`, `scaffold until`, etc.) inside the caller's body.
5. **Output**: append a `## System Pipeline Audit` section to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md`. Per-pipeline row: name, sequence, classification, evidence file (the caller, or "NONE FOUND"), remediation suggestion if not `WIRED`.
**Pipeline classification feeds the combined gate below.** Any pipeline that is not `WIRED` is a system-level FAIL that the per-task gate cannot rescue.
**Why this is here and not only in decompose**: decompose Step 1.7 creates integration tasks up front; this check verifies the integration tasks actually got implemented (or, if they were never created, surfaces the gap before the cycle closes). The two layers are belt-and-suspenders by design.
Save the audit to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` with:
- Per-task classification
- Evidence files/symbols checked
- Any unresolved scaffold/native placeholders
- Any named promised technologies not integrated
- **System Pipeline Audit table** (per pipeline: name, sequence, WIRED / PARTIALLY WIRED / NOT WIRED, evidence file, remediation suggestion)
- Required remediation task suggestions, each sized to 5 points or less
Gate:
- If every product task is `PASS` or `BLOCKED` with explicit prerequisite evidence, AND every enumerated pipeline is `WIRED`, continue to Final Test Run.
- If any product task is `FAIL` OR any pipeline is `PARTIALLY WIRED` / `NOT WIRED`, STOP. Do not write the final product implementation report and do not proceed to any downstream autodev step. Completed original task files remain in `done/`; the missing work is represented by remediation tasks. Present a Choose block:
- A) Create remediation tasks now and return to implementation. (For pipeline FAILs the remediation task is a NEW integration task owned by the spine component per `_docs/02_document/module-layout.md`; it is NOT a test task and NOT a doc task; its deliverable is production code that drives the pipeline against real components.)
- B) Mark the missing behavior explicitly out of scope in task/docs, then re-run this gate
- C) Abort for manual correction
- Recommendation must normally be A unless the user deliberately accepts reduced scope.
Remediation task creation:
1. For each `FAIL`, create one or more task specs using `.cursor/skills/decompose/templates/task.md`; each remediation task must be sized at 5 points or less.
2. Save each task to `_docs/02_tasks/todo/` with a short name prefixed by `remediate_`.
3. Set **Component** to the failed task's component and set **Dependencies** to the failed task ID plus any remediation prerequisites.
4. Create or defer tracker tickets using the same tracker rules as decompose/new-task: if tracker is available, create tickets immediately; if the user explicitly chose `tracker: local`, keep numeric prefixes with `Tracker: pending` / `Epic: pending`.
5. Append the remediation tasks to `_docs/02_tasks/_dependencies_table.md`.
6. Return to Step 1 (Parse) in **Product implementation** context. The final product implementation report can be written only after remediation tasks complete and this gate reruns without `FAIL`.
### 16. Final Test Run
- After all batches are complete, run the full test suite once unless the invoking flow's immediate next step is `Run Tests`.
- If the next flow step is `Run Tests`, record a handoff in the final implementation report and let `.cursor/skills/test-run/SKILL.md` own the full-suite gate to avoid duplicate full runs.
- When this step does run, read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices).
- Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision.
- When tests pass, report final summary.
- Go back to step 2 until all tasks are done
- When all tasks are complete, report final summary
## Batch Report Persistence
After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. For product implementation, produce the FINAL implementation report only after the Product Implementation Completeness Gate passes. For test and refactor implementation, produce the FINAL report after all selected tasks complete and the full-suite gate is either run or handed off per Step 16. The filename depends on context:
- **Test implementation** (tasks from test decomposition): `_docs/03_implementation/implementation_report_tests.md`
- **Feature implementation**: `_docs/03_implementation/implementation_report_{feature_slug}_cycle{N}.md` where `{feature_slug}` is derived from the batch task names (e.g., `implementation_report_core_api_cycle2.md`) and `{N}` is the current `state.cycle` from `_docs/_autodev_state.md`. If `state.cycle` is absent (pre-migration), default to `cycle1`.
- **Refactoring**: `_docs/03_implementation/implementation_report_refactor_{run_name}.md`
- **Suite-level** (when `suite_level: true` was supplied — see "Suite-level invocation context" above): `_docs/03_implementation/suite_implementation_report_{run_name}.md`. Batch reports use `_docs/03_implementation/suite_batch_{NN}_report.md`. `{run_name}` is derived from the batch task IDs (e.g., `suite_implementation_report_az543_az549_az550.md`).
Determine the context from the task files being implemented: if all tasks have test-related names or belong to a test epic, use the tests filename; if `suite_level: true` was supplied, use the suite filename; otherwise derive the feature slug from the component names and append the cycle suffix.
Batch report filenames must also include the cycle counter when running feature implementation: `_docs/03_implementation/batch_{NN}_cycle{N}_report.md` (test and refactor runs may use the plain `batch_{NN}_report.md` form since they are not cycle-scoped).
After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_report.md`. Create the directory if it doesn't exist. When all tasks are complete, produce `_docs/03_implementation/FINAL_implementation_report.md` with a summary of all batches.
## Batch Report
@@ -378,11 +156,10 @@ After each batch, produce a structured report:
## Task Results
| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|------|--------|---------------|-------|-------------|--------|
| [TRACKER-ID]_[name] | Done | [count] files | [pass/fail] | [N/N ACs covered] | [count or None] |
| Task | Status | Files Modified | Tests | Issues |
|------|--------|---------------|-------|--------|
| [JIRA-ID]_[name] | Done | [count] files | [pass/fail] | [count or None] |
## AC Test Coverage: [All covered / X of Y covered]
## Code Review Verdict: [PASS/FAIL/PASS_WITH_WARNINGS]
## Auto-Fix Attempts: [0/1/2]
## Stuck Agents: [count or None]
@@ -394,11 +171,10 @@ After each batch, produce a structured report:
| Situation | Action |
|-----------|--------|
| Same task rewritten 3+ times without green tests | Mark Blocked, continue batch, escalate at batch end |
| Implementer fails same approach 3+ times | Stop it, escalate to user |
| Task blocked on external dependency (not in task list) | Report and skip |
| File ownership violated (task wrote outside OWNED) | ASK user |
| Product completeness gate finds missing promised implementation | STOP — create remediation tasks or get explicit user scope reduction |
| Test failure after final test run | Delegate to test-run skill — blocking gate |
| File ownership conflict unresolvable | ASK user |
| Test failures exceed 50% of suite after a batch | Stop and escalate |
| All tasks complete | Report final summary, suggest final commit |
| `_dependencies_table.md` missing | STOP — run `/decompose` first |
@@ -406,14 +182,13 @@ After each batch, produce a structured report:
Each batch commit serves as a rollback checkpoint. If recovery is needed:
- **Tests fail after final test run**: `git revert <batch-commit-hash>` using hashes from the batch reports in `_docs/03_implementation/`
- **Resuming after interruption**: Read `_docs/03_implementation/batch_*_report.md` files (filtered by current `state.cycle` for feature implementation) to determine which batches completed, then continue from the next batch
- **Tests fail after a batch commit**: `git revert <batch-commit-hash>` using the hash from the batch report in `_docs/03_implementation/`
- **Resuming after interruption**: Read `_docs/03_implementation/batch_*_report.md` files to determine which batches completed, then continue from the next batch
- **Multiple consecutive batches fail**: Stop and escalate to user with links to batch reports and commit hashes
## Safety Rules
- Never start a task whose dependencies are not yet completed
- Never run tasks in parallel and never spawn subagents — see `.cursor/rules/no-subagents.mdc`
- If a task is flagged as stuck, stop working on it and report — do not let it loop indefinitely
- Always run the Product Implementation Completeness Gate before final product reports
- Always run or hand off the full test suite after all batches complete (step 16)
- Never launch tasks whose dependencies are not yet completed
- Never allow two parallel agents to write to the same file
- If a subagent fails or is flagged as stuck, stop it and report — do not let it loop indefinitely
- Always run tests after each batch completes
@@ -3,31 +3,29 @@
## Topological Sort with Batch Grouping
The `/implement` skill uses a topological sort to determine execution order,
then groups tasks into batches for code review and commit. Execution within a
batch is **sequential** — see `.cursor/rules/no-subagents.mdc`.
then groups tasks into batches for parallel execution.
## Algorithm
1. Build adjacency list from `_dependencies_table.md`
2. Compute in-degree for each task node
3. Initialize the ready set with all nodes that have in-degree 0
3. Initialize batch 0 with all nodes that have in-degree 0
4. For each batch:
a. Select up to 4 tasks from the ready set (default batch size cap)
b. Implement the selected tasks one at a time in topological order
c. When all tasks in the batch complete, remove them from the graph and
decrement in-degrees of dependents
d. Add newly zero-in-degree nodes to the ready set
a. Select up to 4 tasks from the ready set
b. Check file ownership — if two tasks would write the same file, defer one to the next batch
c. Launch selected tasks as parallel implementer subagents
d. When all complete, remove them from the graph and decrement in-degrees of dependents
e. Add newly zero-in-degree nodes to the next batch's ready set
5. Repeat until the graph is empty
## Ordering Inside a Batch
## File Ownership Conflict Resolution
Tasks inside a batch are executed in topological order — a task is only
started after every task it depends on (inside the batch or in a previous
batch) is done. When two tasks have the same topological rank, prefer the
lower-numbered (more foundational) task first.
When two tasks in the same batch map to overlapping files:
- Prefer to run the lower-numbered task first (it's more foundational)
- Defer the higher-numbered task to the next batch
- If both have equal priority, ask the user
## Complexity Budget
Each batch should not exceed 20 total complexity points.
If it does, split the batch and let the user choose which tasks to include.
The budget exists to keep the per-batch code review scope reviewable.
@@ -15,7 +15,7 @@ Use this template after each implementation batch completes.
| Task | Status | Files Modified | Tests | Issues |
|------|--------|---------------|-------|--------|
| [TRACKER-ID]_[name] | Done/Blocked/Partial | [count] files | [X/Y pass] | [count or None] |
| [JIRA-ID]_[name] | Done/Blocked/Partial | [count] files | [X/Y pass] | [count or None] |
## Code Review Verdict: [PASS / FAIL / PASS_WITH_WARNINGS]
-164
View File
@@ -1,164 +0,0 @@
---
name: monorepo-cicd
description: Syncs CI/CD and infrastructure configuration at the monorepo root (compose files, install scripts, env templates, CI service tables) after one or more components changed. Reads `_docs/_repo-config.yaml` (produced by monorepo-discover) to know which CI artifacts are in play and how they're structured. Touches ONLY CI/infra files — never documentation, component directories, or per-component CI configs. Use when a component added/changed a Dockerfile path, port, env var, image tag format, or runtime dependency.
---
# Monorepo CI/CD
Propagates component changes into the repo-level CI/CD and infrastructure artifacts. Strictly scoped — never edits docs, component internals, or per-component CI configs.
## Scope — explicit
| In scope | Out of scope |
| -------- | ------------ |
| `docker-compose.*.yml` at repo root | Unified docs in `_docs/*.md` → use `monorepo-document` |
| `.env.example` / `.env.template` | Root `README.md` documentation → `monorepo-document` |
| Install scripts (`ci-*.sh`, `setup.sh`, etc.) | Per-component CI configs (`<component>/.woodpecker/*`, `<component>/.github/*`) |
| CI service-registry docs (`ci_steps.md` or similar — the human-readable index of pipelines; in scope only if the config says so under `ci.service_registry_doc`) | Component source code, Dockerfiles, or internal docs |
| Kustomization / Helm manifests at repo root | `_docs/_repo-config.yaml` itself (only `monorepo-discover` and `monorepo-onboard` write it) |
If a component change needs doc updates too, tell the user to also run `monorepo-document`.
**Special case**: `ci.service_registry_doc` (e.g., `ci_steps.md`) is a **CI artifact that happens to be markdown**. It's in this skill's scope, not `monorepo-document`'s, because it describes the pipeline/service topology — not user-facing feature docs.
## Preconditions (hard gates)
1. `_docs/_repo-config.yaml` exists.
2. Top-level `confirmed_by_user: true`.
3. `ci.*` section is populated in config (not empty).
4. Components-in-scope have confirmed CI mappings, OR user explicitly approves inferred ones.
If any gate fails, redirect to `monorepo-discover` or ask for confirmation.
## Mitigations (M1M7)
- **M1** Separation: this skill only touches CI/infra files; no docs, no component internals.
- **M3** Factual vs. interpretive: image tag format, port numbers, env var names — FACTUAL, read from code. Doc cross-references — out of scope entirely (belongs to `monorepo-document`).
- **M4** Batch questions at checkpoints.
- **M5** Skip over guess: component with no CI mapping → skip and report.
- **M6** Assumptions footer + append to `_repo-config.yaml` `assumptions_log`.
- **M7** Drift detection: verify every file in `ci.orchestration_files`, `ci.install_scripts`, `ci.env_template` exists; stop if not.
## Workflow
### Phase 1: Drift check (M7)
Verify every CI file listed in config exists on disk. Missing file → stop, ask user:
- Run `monorepo-discover` to refresh, OR
- Skip the missing file (recorded in report)
Do NOT recreate missing infra files automatically.
### Phase 2: Determine scope
Ask the user (unless specified):
> Which components changed? (a) list them, (b) auto-detect, (c) skip detection (I'll apply specific changes).
For **auto-detect**, for each component:
```bash
git -C <path> log --oneline -20 # submodule
# or
git log --oneline -20 -- <path> # monorepo subfolder
```
Flag commits that touch CI-relevant concerns:
- Dockerfile additions, renames, or path changes
- CI pipeline files (`<component>/.woodpecker/*`, `<component>/.github/workflows/*`, etc.)
- New exposed ports
- New environment variables consumed by the component
- Changes to image name / tag format
- New dependency on another service (e.g., new DB, new broker)
Present the flagged list; confirm.
### Phase 3: Classify changes per component
| Change type | Target CI files |
| ----------- | --------------- |
| Dockerfile path moved/renamed | `ci.service_registry_doc` service table; per-component CI is OUT OF SCOPE (tell user to update it) |
| New port exposed | `ci.service_registry_doc` ports section (if infra port); component's service block in orchestration file |
| Registry URL changed | `ci.install_scripts` (all of them); `ci.env_template`; `ci.service_registry_doc` |
| Branch naming convention changed | All `ci.install_scripts`; all `ci.orchestration_files` referencing the branch; `ci.service_registry_doc` |
| New runtime env var | `ci.env_template`; component's service block in orchestration file |
| New infrastructure component (DB, cache, broker) | Relevant `ci.orchestration_files`; `ci.service_registry_doc` architecture section |
| New image tag format | All `ci.orchestration_files`; `ci.install_scripts`; `ci.service_registry_doc` |
| Watchtower/polling config change | Specific `ci.orchestration_files`; `ci.service_registry_doc` |
If a change type isn't covered here or in the config, add to an unresolved list and skip (M5).
### Phase 4: Apply edits
For each (change → target file) pair:
1. Read the target file.
2. Locate the service block / table row / section.
3. Edit carefully:
- **Orchestration files (compose/kustomize/helm)**: YAML; preserve indentation, anchors, and references exactly. Match existing service-block structure. Never reformat unchanged lines.
- **Install scripts (`*.sh`)**: shell; any edit must remain **idempotent**. Re-running the script on an already-configured host must not break it. If an edit cannot be made idempotent, flag for the user and skip.
- **`.env.example`**: append new vars at the appropriate section; never remove user's local customizations (file is in git, so comments may be significant).
- **`ci.service_registry_doc`** (markdown): preserve column widths, ordering (alphabetical or compose-order — whichever existed), ASCII diagrams.
### Phase 5: Skip-and-report (M5)
Skip a component if:
- No `ci_config` in its config entry AND no entry in config's CI mappings
- `confirmed: false` on its mapping and user didn't approve
- Component's Dockerfile path declared in config doesn't exist on disk — surface contradiction
- Change type unrecognized — skip, report for manual handling
### Phase 6: Idempotency / lint check
- Shell: if `shellcheck` available, run on any edited `*.sh`.
- YAML: if `yamllint` or `prettier` available, run on edited `*.yml` / `*.yaml`.
- For edited install scripts, **mentally re-run** the logic: would a second invocation crash, duplicate, or corrupt? Flag anything that might.
Skip linters silently if none configured — don't install tools.
### Phase 7: Report + assumptions footer (M6)
```
monorepo-cicd run complete.
CI files updated (N):
- docker-compose.run.yml — added `loader` service block
- .env.example — added LOADER_BUCKET_NAME placeholder
- ci_steps.md — added `loader` row in service table
Skipped (K):
- satellite-provider: no ci_config in repo-config.yaml
- detections: Dockerfile path in config (admin/src/Dockerfile) does not exist on disk
Manual actions needed (M):
- Update `<submodule>/.woodpecker/*.yml` inside the submodule's own workspace
(per-component CI is not maintained by this skill)
Assumptions used this run:
- image tag format: ${REGISTRY}/${NAME}:${BRANCH}-${ARCH_TAG} (confirmed in config)
- target branch for triggers: [stage, main] (confirmed in config)
Next step: review the diff, then commit with
`<commit_prefix> Sync CI after <components>` (or your own message).
```
Append run entry to `_docs/_repo-config.yaml` `assumptions_log:`.
## What this skill will NEVER do
- Modify files inside component directories
- Edit unified docs under `docs.root`
- Edit per-component CI configs (`.woodpecker/*`, `.github/*`, etc.)
- Auto-generate CI pipeline YAML for components (only provide template guidance)
- Set `confirmed_by_user` or `confirmed:` flags
- Auto-commit
- Install tools (shellcheck, yamllint, etc.) — use if present, skip if absent
## Edge cases
- **Compose file has service blocks for components NOT in config**: note in report; ask user whether to rediscover (`monorepo-discover`) or leave them alone.
- **`.env.example` has entries for removed components**: don't auto-remove; flag to user.
- **Install script edit cannot be made idempotent**: don't save; ask user to handle manually.
- **Branch trigger vs. runtime branch mismatch**: if config says triggers are `[stage, main]` but a compose file references a branch tag `develop`, stop and ask.
-183
View File
@@ -1,183 +0,0 @@
---
name: monorepo-discover
description: Scans a monorepo or meta-repo (git-submodule aggregators, npm/cargo workspaces, etc.) and generates a human-reviewable `_docs/_repo-config.yaml` that other `monorepo-*` skills (document, cicd, onboard, status) read. Produces inferred mappings tagged with evidence; never writes to the config's `confirmed_by_user` flag — the human does that. Use on first setup in a new monorepo, or to refresh the config after structural changes.
---
# Monorepo Discover
Writes or refreshes `_docs/_repo-config.yaml` — the shared config file that every other `monorepo-*` skill depends on. Does NOT modify any other files.
## Core principle
**Discovery is a suggestion, not a commitment.** The skill infers repo structure, but every inferred entry is tagged with `confirmed: false` + evidence. Action skills (`monorepo-document`, `monorepo-cicd`, `monorepo-onboard`) refuse to run until the human reviews the config and sets `confirmed_by_user: true`.
## Mitigations against LLM inference errors (applies throughout)
| Rule | What it means |
| ---- | ------------- |
| **M1** Separation | This skill never triggers other skills. It stops after writing config. |
| **M2** Evidence thresholds | No mapping gets recorded without at least one signal (name match, textual reference, directory convention, explicit statement). Zero-signal candidates go under `unresolved:` with a question. |
| **M3** Factual vs. interpretive | Resolve factual questions alone (file exists? line says what?). Ask for interpretive ones (does A feed into B?) unless M2 evidence is present. Ask for conventional ones always (commit prefix? target branch?). |
| **M4** Batch questions | Accumulate all `unresolved:` questions. Present at end of discovery, not drip-wise. |
| **M5** Skip over guess | Never record a zero-evidence mapping under `components:` or `docs:` — always put it in `unresolved:` with a question. |
| **M6** Assumptions footer | Every run ends with an explicit list of assumptions used. Also append to `assumptions_log:` in the config. |
| **M7** Structural drift | If the config already exists, produce a diff of what would change and ask for approval before overwriting. Never silently regenerate. |
## Guardrail
**This skill writes ONLY `_docs/_repo-config.yaml`.** It never edits unified docs, CI files, or component directories. If the workflow ever pushes you to modify anything else, stop.
## Workflow
### Phase 1: Detect repo type
Check which of these exists (first match wins):
1. `.gitmodules`**git-submodules meta-repo**
2. `package.json` with `workspaces` field → **npm/yarn/pnpm workspace**
3. `pnpm-workspace.yaml`**pnpm workspace**
4. `Cargo.toml` with `[workspace]` section → **cargo workspace**
5. `go.work`**go workspace**
6. Multiple top-level subfolders each with their own `package.json` / `Cargo.toml` / `pyproject.toml` / `*.csproj`**ad-hoc monorepo**
If none match → **ask the user** what kind of monorepo this is. Don't guess.
Record in `repo.type` and `repo.component_registry`.
### Phase 2: Enumerate components
Based on repo type, parse the registry and list components. For each collect:
- `name`, `path`
- `stack` — infer from files present (`.csproj` → .NET, `pyproject.toml` → Python, `Cargo.toml` → Rust, `package.json` → Node/TS, `go.mod` → Go). Multiple signals → pick dominant one. No signals → `stack: unknown` and add to `unresolved:`.
- `evidence` — list of signals used (e.g., `[gitmodules_entry, csproj_present]`)
Do NOT yet populate `primary_doc`, `secondary_docs`, `ci_config`, or `deployment_tier` — those come in Phases 4 and 5.
### Phase 3: Locate docs root
Probe in order: `_docs/`, `docs/`, `documentation/`, or a root-level README with links to sub-docs.
- Multiple candidates → ask user which is canonical
- None → `docs.root: null` + flag under `unresolved:`
Once located, classify each `*.md`:
- **Primary doc** — filename or H1 names a component/feature
- **Cross-cutting doc** — describes repo-wide concerns (architecture, schema, auth, index)
- **Index** — `README.md`, `index.md`, or `_index.md`
Detect filename convention (e.g., `NN_<name>.md`) and next unused prefix.
### Phase 4: Map components to docs (inference, M2-gated)
For each component, attempt to find its **primary doc** using the evidence rules. A mapping qualifies for `components:` (with `confirmed: false`) if at least ONE of these holds:
- **Name match** — component name appears in the doc filename OR H1
- **Textual reference** — doc body explicitly names the component path or git URL
- **Directory convention** — doc lives inside the component's folder
- **Explicit statement** — README, index, or comment asserts the mapping
No signal → entry goes under `unresolved:` with an A/B/C question, NOT under `components:` as a guess.
Cross-cutting docs go in `docs.cross_cutting:` with an `owns:` list describing what triggers updates to them. If you can't classify a doc, add an `unresolved:` entry asking the user.
### Phase 5: Detect CI tooling
Probe at repo root AND per-component for CI configs:
- `.github/workflows/*.yml` → GitHub Actions
- `.gitlab-ci.yml` → GitLab CI
- `.woodpecker/` or `.woodpecker.yml` → Woodpecker
- `.drone.yml` → Drone
- `Jenkinsfile` → Jenkins
- `bitbucket-pipelines.yml` → Bitbucket
- `azure-pipelines.yml` → Azure Pipelines
- `.circleci/config.yml` → CircleCI
Probe for orchestration/infra at root:
- `docker-compose*.yml`
- `kustomization.yaml`, `helm/`
- `Makefile` with build/deploy targets
- `*-install.sh`, `*-setup.sh`
- `.env.example`, `.env.template`
Record under `ci:`. For image tag formats, grep compose files for `image:` lines and record the pattern (e.g., `${REGISTRY}/${NAME}:${BRANCH}-${ARCH}`).
Anything ambiguous → `unresolved:` entry.
### Phase 6: Detect conventions
- **Commit prefix**: `git log --format=%s -50` → look for `[PREFIX]` consistency
- **Target/work branch**: check CI config trigger branches; fall back to `git remote show origin`
- **Ticket ID pattern**: grep commits and docs for regex like `[A-Z]+-\d+`
- **Image tag format**: see Phase 5
- **Deployment tiers**: scan root README and architecture docs for named tiers/environments
Record inferred conventions with `confirmed: false`.
### Phase 7: Read existing config (if any) and produce diff
If `_docs/_repo-config.yaml` already exists:
1. Parse it.
2. Compare against what Phases 16 discovered.
3. Produce a **diff report**:
- Entries added (new components, new docs)
- Entries changed (e.g., `primary_doc` changed due to doc renaming)
- Entries removed (component removed from registry)
4. **Ask the user** whether to apply the diff.
5. If applied, **preserve `confirmed: true` flags** for entries that still match — don't reset human-approved mappings.
6. **Preserve user-owned top-level keys verbatim**: `glossary_doc:` (written by autodev meta-repo Step 2.5) and any `assumptions_log:` entries are NEVER edited or removed by this skill. Carry them through unchanged. If the file referenced by `glossary_doc:` no longer exists on disk, surface as an `unresolved:` question — do not auto-clear the field.
7. If user declines, stop — leave config untouched.
### Phase 8: Batch question checkpoint (M4)
Present ALL accumulated `unresolved:` questions in one round. For each offer options when possible (A/B/C), open-ended only when no options exist.
After answers, update the draft config with the resolutions.
### Phase 9: Write config file
Write `_docs/_repo-config.yaml` using the schema in [templates/repo-config.example.yaml](templates/repo-config.example.yaml).
- Top-level `confirmed_by_user: false` ALWAYS — only the human flips this
- Every entry has `confirmed: <bool>` and (when `false`) `evidence: [...]`
- Append to `assumptions_log:` a new entry for this run
### Phase 10: Review handoff + assumptions footer (M6)
Output:
```
Generated/refreshed _docs/_repo-config.yaml:
- N components discovered (X confirmed, Y inferred, Z unresolved)
- M docs located (K primary, L cross-cutting)
- CI tooling: <detected>
- P unresolved questions resolved this run; Q still open — see config
- Assumptions made during discovery:
- Treated <path> as unified-docs root (only candidate found)
- Inferred `<component>` primary doc = `<doc>` (name match)
- Commit prefix `<prefix>` seen in N of last 20 commits
Next step: please review _docs/_repo-config.yaml, correct any wrong inferences,
and set `confirmed_by_user: true` at the top. After that, monorepo-document,
monorepo-cicd, monorepo-status, and monorepo-onboard will run.
```
Then stop.
## What this skill will NEVER do
- Modify any file other than `_docs/_repo-config.yaml`
- Set `confirmed_by_user: true`
- Record a mapping with zero evidence
- Chain to another skill automatically
- Commit the generated config
## Failure / ambiguity handling
- Internal contradictions in a component (README references files not in code) → surface to user, stop, do NOT silently reconcile
- Docs root cannot be located → record `docs.root: null` and list unresolved question; do not create a new `_docs/` folder
- Parsing fails on `_docs/_repo-config.yaml` (existing file is corrupt) → surface to user, stop; never overwrite silently
@@ -1,172 +0,0 @@
# _docs/_repo-config.yaml — schema and example
#
# Generated by monorepo-discover. Reviewed by a human. Consumed by:
# - monorepo-document (reads docs.* and components.*.primary_doc/secondary_docs)
# - monorepo-cicd (reads ci.* and components.*.ci_config)
# - monorepo-onboard (reads all sections; writes new component entries)
# - monorepo-status (reads all sections; writes nothing)
#
# Every entry has a `confirmed:` flag:
# true = human reviewed and approved
# false = inferred by monorepo-discover; needs review
# And an `evidence:` list documenting why discovery made the inference.
# ---------------------------------------------------------------------------
# Metadata
# ---------------------------------------------------------------------------
version: 1
last_updated: 2026-04-17
confirmed_by_user: false # HUMAN ONLY: flip to true after reviewing
# ---------------------------------------------------------------------------
# Repo identity
# ---------------------------------------------------------------------------
repo:
name: example-monorepo
type: git-submodules # git-submodules | npm-workspaces | cargo-workspace | pnpm-workspace | go-workspace | adhoc
component_registry: .gitmodules
root_readme: README.md
work_branch: dev
# ---------------------------------------------------------------------------
# Components
# ---------------------------------------------------------------------------
components:
- name: annotations
path: annotations/
stack: .NET 10
confirmed: true
evidence: [gitmodules_entry, csproj_present]
primary_doc: _docs/01_annotations.md
secondary_docs:
- _docs/00_database_schema.md
- _docs/00_roles_permissions.md
ci_config: annotations/.woodpecker/
deployment_tier: api-layer
ports:
- "5001/http"
depends_on: []
env_vars:
- ANNOTATIONS_DB_URL
- name: loader
path: loader/
stack: Python 3.12
confirmed: false # inferred, needs review
evidence: [gitmodules_entry, pyproject_present]
primary_doc: _docs/07_admin.md
primary_doc_section: "Model delivery"
secondary_docs:
- _docs/00_top_level_architecture.md
ci_config: loader/.woodpecker/
deployment_tier: edge
ports: []
depends_on: [admin]
env_vars: []
# ---------------------------------------------------------------------------
# Documentation
# ---------------------------------------------------------------------------
docs:
root: _docs/
index: _docs/README.md
file_convention: "NN_<name>.md"
next_unused_prefix: "13"
cross_cutting:
- path: _docs/00_top_level_architecture.md
owns:
- deployment topology
- component communication
- infrastructure inventory
confirmed: true
- path: _docs/00_database_schema.md
owns:
- database schema changes
- ER diagram
confirmed: true
- path: _docs/00_roles_permissions.md
owns:
- permission codes
- role-to-feature mapping
confirmed: true
# ---------------------------------------------------------------------------
# CI/CD
# ---------------------------------------------------------------------------
ci:
tooling: Woodpecker # GitHub Actions | GitLab CI | Woodpecker | Drone | Jenkins | ...
service_registry_doc: ci_steps.md
orchestration_files:
- docker-compose.ci.yml
- docker-compose.run.yml
- docker-compose.ci-agent-amd64.yml
install_scripts:
- ci-server-install.sh
- ci-client-install.sh
- ci-agent-amd64-install.sh
env_template: .env.example
image_tag_format: "${REGISTRY}/${NAME}:${BRANCH}-${ARCH_TAG}"
branch_triggers: [stage, main]
expected_files_per_component:
- path_glob: "<component>/.woodpecker/build-*.yml"
required: at-least-one
pipeline_template: |
when:
branch: [stage, main]
labels:
platform: arm64
steps:
- name: build-push
image: docker
commands:
- docker build -f Dockerfile -t localhost:5000/<service>:${CI_COMMIT_BRANCH}-arm .
- docker push localhost:5000/<service>:${CI_COMMIT_BRANCH}-arm
volumes:
- /var/run/docker.sock:/var/run/docker.sock
confirmed: false
# ---------------------------------------------------------------------------
# Conventions
# ---------------------------------------------------------------------------
conventions:
commit_prefix: "[suite]"
meta_commit_fallback: "[meta]"
ticket_id_pattern: "AZ-\\d+"
component_naming: lowercase-hyphen
deployment_tiers:
- edge
- remote
- operator-station
- api-layer
confirmed: false
# ---------------------------------------------------------------------------
# Unresolved questions (populated by monorepo-discover)
# ---------------------------------------------------------------------------
# Every question discovery couldn't resolve goes here. Action skills refuse
# to touch entries that map to `unresolved:` items until the human resolves them.
unresolved:
- id: satellite-provider-doc-slot
question: "Component `satellite-provider` has no matching doc. Create new file or extend an existing doc?"
options:
- "new _docs/13_satellite_provider.md"
- "extend _docs/11_gps_denied.md with a Satellite section"
- "no doc needed (internal utility)"
# ---------------------------------------------------------------------------
# Assumptions log (append-only, audit trail)
# ---------------------------------------------------------------------------
# monorepo-discover appends a new entry each run.
# monorepo-document, monorepo-cicd, monorepo-onboard also append their
# per-run assumptions here so the user can audit what was taken on faith.
assumptions_log:
- date: 2026-04-17
skill: monorepo-discover
run_notes: "Initial discovery"
assumptions:
- "Treated _docs/ as unified-docs root (only candidate found)"
- "Inferred component→doc mappings via name matching for 9/11 components"
- "Commit prefix [suite] observed in 14 of last 20 commits"
-179
View File
@@ -1,179 +0,0 @@
---
name: monorepo-document
description: Syncs unified documentation (`_docs/*.md` and equivalent) in a monorepo after one or more components changed. Reads `_docs/_repo-config.yaml` (produced by monorepo-discover) to know which doc files each component feeds into and which cross-cutting docs own which concerns. Touches ONLY documentation files — never CI, compose, env templates, or component directories. Use when a submodule/package added/changed an API, schema, permission, event, or dependency and the unified docs need to catch up.
---
# Monorepo Document
Propagates component changes into the unified documentation set. Strictly scoped to `*.md` files under `docs.root` (and `repo.root_readme` if referenced as cross-cutting).
## Scope — explicit
| In scope | Out of scope |
| -------- | ------------ |
| `_docs/*.md` (primary and cross-cutting) | `.env.example`, `docker-compose.*.yml` → use `monorepo-cicd` |
| Root `README.md` **only** if `_repo-config.yaml` lists it as a doc target (e.g., services table) | Install scripts (`ci-*.sh`) → use `monorepo-cicd` |
| Docs index (`_docs/README.md` or similar) cross-reference tables | Component-internal docs (`<component>/README.md`, `<component>/docs/*`) |
| Cross-cutting docs listed in `docs.cross_cutting` | `_docs/_repo-config.yaml` itself (only `monorepo-discover` and `monorepo-onboard` write it) |
| Body of cross-cutting docs **except** the `## Architecture Vision` section (preserved verbatim — owned by autodev meta-repo Step 2.5) | The file at `glossary_doc:` (user-confirmed; only autodev meta-repo Step 2.5 rewrites it). New project terms surfaced during sync are reported back to the user, not silently appended |
| `## Architecture Vision` body — read-only, may be referenced for terminology consistency but never edited | — |
If a component change requires CI/env updates too, tell the user to also run `monorepo-cicd`. This skill does NOT cross domains.
## Preconditions (hard gates)
1. `_docs/_repo-config.yaml` exists.
2. Top-level `confirmed_by_user: true` in the config.
3. `docs.root` is set (non-null) in the config.
4. Components-in-scope have `confirmed: true` mappings, OR the user explicitly approves an inferred mapping for this run.
If any gate fails:
- Config missing → redirect: "Run `monorepo-discover` first."
- `confirmed_by_user: false` → "Please review the config and set `confirmed_by_user: true`."
- `docs.root: null` → "Config has no docs root. Run `monorepo-discover` to re-detect, or edit the config."
- Component inferred but not confirmed → ask user: "Mapping `<component>``<doc>` is inferred. Use it this run? (y/n/edit config first)"
## Mitigations (same M1M7 spirit)
- **M1** Separation: this skill only syncs docs; never touches CI or config.
- **M3** Factual vs. interpretive: don't guess mappings. Use config. If config has an `unresolved:` entry for a component in scope, SKIP it (M5) and report.
- **M4** Batch questions at checkpoints: end of scope determination, end of drift check.
- **M5** Skip over guess: missing/ambiguous mapping → skip and report, never pick a default.
- **M6** Assumptions footer every run; append to config's `assumptions_log:`.
- **M7** Drift detection before action: re-scan `docs.root` to verify config-listed docs still exist; if not, stop and ask.
## Workflow
### Phase 1: Drift check (M7)
Before editing anything:
1. For each component in scope, verify its `primary_doc` and each `secondary_docs` file exists on disk.
2. For each entry in `docs.cross_cutting`, verify the file exists.
3. If any expected file is missing → **stop**, ask user whether to:
- Run `monorepo-discover` to refresh the config, OR
- Skip the missing file for this run (recorded as skipped in report)
Do NOT silently create missing docs. That's onboarding territory.
### Phase 2: Determine scope
If the user hasn't specified which components changed, ask:
> Which components changed? (a) list them, (b) auto-detect from recent commits, (c) skip to review changes you've already made.
For **auto-detect**, for each component in config:
```bash
git -C <path> log --oneline -20 # submodule
# or
git log --oneline -20 -- <path> # monorepo subfolder
```
Flag components whose recent commits touch doc-relevant concerns:
- API/route files (controllers, handlers, OpenAPI specs, route definitions)
- Schema/migration files
- Auth/permission files (attributes, middleware, policies)
- Streaming/SSE/websocket event definitions
- Public exports (`index.ts`, `mod.rs`, `__init__.py`)
- Component's own README if it documents API
- Environment variable additions (only impact docs if a Configuration section exists)
Present the flagged list; ask for confirmation before proceeding.
### Phase 3: Classify changes per component
For each in-scope component, read recent diffs and classify changes:
| Change type | Target doc concern |
| ----------- | ------------------ |
| New/changed REST endpoint | Primary doc API section; cross-cutting arch doc if pattern changes |
| Schema/migration | Cross-cutting schema doc; primary doc if entity documented there |
| New permission/role | Cross-cutting roles/permissions doc; index permission-matrix table |
| New streaming/SSE event | Primary doc events section; cross-cutting arch doc |
| New inter-component dependency | Cross-cutting arch doc; primary doc dependencies section |
| New env variable (affects docs) | Primary doc Configuration section only — `.env.example` is out of scope |
Match concerns to docs via `docs.cross_cutting[].owns`. If a concern has no owner, add to an in-memory unresolved list and skip it (M5) — tell the user at the end.
### Phase 4: Apply edits
For each mapping (component change → target doc):
1. Read the target doc.
2. Locate the relevant section (heading match, anchor, or `primary_doc_section` from config).
3. Edit only that section. Preserve:
- Heading structure and anchors (inbound links depend on them)
- Table column widths / alignment style
- ASCII diagrams (characters, indentation, widths)
- Cross-reference wording style
4. Update cross-references when needed: if a renamed doc is linked elsewhere, fix links too.
### Phase 5: Skip-and-report (M5)
Skip a component, don't guess, if:
- No mapping in config (the component itself isn't listed)
- Mapping tagged `confirmed: false` and user didn't approve it in Phase 2
- Component internally inconsistent (README asserts endpoints not in code) — surface contradiction
Each skip gets a line in the report with the reason.
### Phase 6: Lint / format
Run markdown linter or formatter if the project has one (check for `.markdownlintrc`, `prettier`, or similar at repo root). Skip if none.
### Phase 7: Report + assumptions footer (M6)
Output:
```
monorepo-document run complete.
Docs updated (N):
- _docs/01_flights.md — added endpoint POST /flights/gps-denied-start
- _docs/00_roles_permissions.md — added permission `FLIGHTS.GPS_DENIED.OPERATE`
- _docs/README.md — permission-matrix row updated
Skipped (K):
- satellite-provider: no confirmed mapping (config has unresolved entry)
- detections-semantic: internal README references endpoints not in code — needs reconciliation
Assumptions used this run:
- component `flights` → `_docs/02_flights.md` (user-confirmed in config)
- roles doc = `_docs/00_roles_permissions.md` (user-confirmed cross-cutting)
- target branch: `dev` (from conventions.work_branch)
Next step: review the diff in your editor, then commit with
`<commit_prefix> Sync docs after <components>` (or your own message).
```
Append to `_docs/_repo-config.yaml` under `assumptions_log:`:
```yaml
- date: 2026-04-17
skill: monorepo-document
run_notes: "Synced <components>"
assumptions:
- "<list>"
```
## What this skill will NEVER do
- Modify files inside component directories
- Edit CI files, compose files, install scripts, or env templates
- Create new doc files (that's `monorepo-onboard`)
- Change `confirmed_by_user` or any `confirmed: <bool>` flag
- Auto-commit or push
- Guess a mapping not in the config
- Edit `glossary_doc:` (the file recorded under the config's `glossary_doc:` key)
- Edit the `## Architecture Vision` section of any cross-cutting doc; if a sync would conflict with that section, surface the conflict to the user and skip — do not silently rewrite user-confirmed content
## Edge cases
- **Component has no primary doc** (UI component that spans all feature docs): if config has `primary_doc: null` or similar marker, iterate through `docs.cross_cutting` where the component is referenced. Don't invent a doc.
- **Multiple components touch the same cross-cutting doc in one run**: apply sequentially; after each edit re-read to get updated line numbers.
- **Cosmetic-only changes** (whitespace renames, internal refactors without API changes): inform user, ask whether to sync or skip.
- **Large gap** (doc untouched for months, component has dozens of commits): ask user which commits matter — don't reconstruct full history.
-152
View File
@@ -1,152 +0,0 @@
---
name: monorepo-e2e
description: Syncs the suite-level integration e2e harness (`e2e/docker-compose.suite-e2e.yml`, fixtures, Playwright runner) when component contracts drift in ways that affect the cross-service scenario. Reads `_docs/_repo-config.yaml` to know which suite-e2e artifacts are in play. Touches ONLY suite-e2e files — never per-component CI, docs, or component internals. Use when a component changes a port, env var, public API endpoint, DB schema column, or detection model that the suite e2e exercises.
---
# Monorepo Suite-E2E
Propagates component changes into the suite-level integration e2e harness. Strictly scoped — never edits docs, component internals, per-component CI configs, or the production deploy compose.
## Scope — explicit
| In scope | Out of scope |
| -------- | ------------ |
| `e2e/docker-compose.suite-e2e.yml` (overlay, healthchecks, seed services) | Production `_infra/deploy/<target>/docker-compose.yml``monorepo-cicd` owns it |
| `e2e/fixtures/init.sql` (seeded rows that the spec depends on) | Component DB migrations — owned by each component |
| `e2e/fixtures/expected_detections.json` (detection baseline) | Detection model itself — owned by `detections/` |
| `e2e/runner/tests/*.spec.ts` selector / contract-driven edits | New scenarios (user-driven, not drift-driven) |
| `e2e/runner/Dockerfile` / `package.json` Playwright version bumps | Net-new e2e infrastructure (use `monorepo-onboard` or initial scaffolding) |
| `.woodpecker/suite-e2e.yml` (suite-level pipeline) | Per-component `.woodpecker/01-test.yml` / `02-build-push.yml``monorepo-cicd` owns those |
| Suite-e2e leftover entries under `_docs/_process_leftovers/` | Per-component leftovers — owned by each component |
If a component change needs doc updates too, tell the user to also run `monorepo-document`. If it needs production-deploy or per-component CI updates, run `monorepo-cicd`. This skill **only** updates the suite-e2e surface.
## Preconditions (hard gates)
1. `_docs/_repo-config.yaml` exists.
2. Top-level `confirmed_by_user: true`.
3. `suite_e2e.*` section is populated in config (see "Required config block" below). If absent, abort and ask the user to extend the config via `monorepo-discover`.
4. Components-in-scope have confirmed contract mappings (port, public API path, DB tables touched), OR user explicitly approves inferred ones.
## Required config block
This skill expects `_docs/_repo-config.yaml` to carry:
```yaml
suite_e2e:
overlay: e2e/docker-compose.suite-e2e.yml
fixtures:
init_sql: e2e/fixtures/init.sql
baseline_json: e2e/fixtures/expected_detections.json
binary_fixtures:
- e2e/fixtures/sample.mp4
- e2e/fixtures/model.tar.gz
runner:
dockerfile: e2e/runner/Dockerfile
package_json: e2e/runner/package.json
spec_dir: e2e/runner/tests
pipeline: .woodpecker/suite-e2e.yml
scenario:
description: "Upload video → detect → overlays → dataset → DB persistence"
components_exercised:
- ui
- annotations
- detections
- postgres-local
api_contracts:
- component: ui
path: /api/admin/auth/login
- component: annotations
path: /api/annotations/media/batch
- component: annotations
path: /api/annotations/media/{id}/annotations
db_tables:
- media
- annotations
- detection
- detection_classes
model_pin:
detections_repo_path: <path-to-model-config-or-classes-source>
classes_source: annotations/src/Database/DatabaseMigrator.cs
```
If `suite_e2e:` is missing the skill **stops** — it does not invent a default mapping.
## Mitigations (M1M7)
- **M1** Separation: this skill only touches suite-e2e files; no production deploy compose, no per-component CI, no docs, no component internals.
- **M3** Factual vs. interpretive: port, env var, API path, DB column — FACTUAL, read from the components' code. Whether a baseline still matches the model — DEFERRED to the user (the skill flags drift, never silently re-records).
- **M4** Batch questions at checkpoints.
- **M5** Skip over guess: a component change that doesn't map cleanly to one of the in-scope artifacts → skip and report.
- **M6** Assumptions footer + append to `_repo-config.yaml` `assumptions_log`.
- **M7** Drift detection: verify every path under `suite_e2e.*` exists on disk; stop if not.
## Workflow
### Phase 1: Drift check (M7)
Verify every file listed under `suite_e2e.*` (excluding `binary_fixtures`, which are gitignored) exists on disk. Missing file → stop and ask:
- Run `monorepo-discover` to refresh, OR
- Skip the missing artifact (recorded in report)
For `binary_fixtures` paths that are absent (expected — they live in S3/LFS), check whether `expected_detections.json._meta.video_sha256` is still a `TBD-...` placeholder. If yes, surface this as a known leftover (`_docs/_process_leftovers/2026-04-22_suite-e2e-binary-fixtures.md`) and continue.
### Phase 2: Determine scope
Same as `monorepo-cicd` Phase 2 — ask the user, or auto-detect. For **auto-detect**, flag commits that touch suite-e2e-relevant concerns:
| Commit pattern | Suite-e2e impact |
| -------------- | ---------------- |
| New port exposed by `<component>` | Healthcheck override may change in `e2e/docker-compose.suite-e2e.yml` |
| New required env var on `<component>` | `e2e/docker-compose.suite-e2e.yml` `e2e-runner` env block + `init.sql` seed |
| Public API path renamed / removed | Spec selector / API call path in `e2e/runner/tests/*.spec.ts` |
| DB schema column renamed in a `db_tables` entry | `init.sql` column reference + spec `pg.query` text |
| New required DB table referenced by spec | `init.sql` insert block (skip if owned by component migration) |
| Detection model rev change in `detections/` | `expected_detections.json` `_meta.model.revision` + flag baseline as stale |
| New canonical detection class added | `expected_detections.json._meta` annotation |
Present the flagged list; confirm.
### Phase 3: Classify changes per component
| Change type | Target suite-e2e files |
| ----------- | ---------------------- |
| Port / env var change | `e2e/docker-compose.suite-e2e.yml` |
| API path / contract change | `e2e/runner/tests/*.spec.ts` |
| DB schema reference change | `e2e/fixtures/init.sql` and spec SQL queries |
| Model / class catalog change | `e2e/fixtures/expected_detections.json` (mark `_meta.fixture_version` bump + leftover entry for binary refresh) |
| Playwright dependency drift | `e2e/runner/package.json` + `e2e/runner/Dockerfile` |
| Suite scenario steps gone stale | **Stop and ask** — scenario edits are user-driven, not drift-driven |
### Phase 4: Apply edits
Edit each in-scope file. After each batch, run `ReadLints` on touched files. Do NOT run the suite e2e itself — that's a downstream pipeline operation, not a sync-skill responsibility.
For `expected_detections.json`: when the model revision changes, the skill **does not** re-record the baseline — the binary fixture cannot be regenerated from the dev environment. Instead:
1. Set `_meta.model.revision` to the new revision.
2. Set `_meta.fixture_version` to a new bumped version with a `-stale` suffix (e.g., `0.2.0-stale`).
3. Append a new entry to `_docs/_process_leftovers/` describing the required re-record.
4. Leave `expected.by_class` untouched — the spec's tolerance check will fail loudly until the binary refresh lands.
### Phase 5: Update assumptions log
Append a new `assumptions_log:` entry to `_docs/_repo-config.yaml` recording:
- Date, components in scope, which suite-e2e files were touched
- Any inferred contract mappings still tagged `confirmed: false`
- Any leftover entries created
### Phase 6: Report
Render a Choose-format summary of the synced files, surface any `_process_leftovers/` entries created, and end. Do NOT auto-commit.
## Self-verification
- [ ] No file outside `e2e/`, `.woodpecker/suite-e2e.yml`, or `_docs/_process_leftovers/` was edited
- [ ] `_docs/_repo-config.yaml` `suite_e2e:` block was not silently mutated except for `assumptions_log` append
- [ ] `expected_detections.json` was not re-recorded (only metadata bumped + leftover added)
- [ ] Every spec edit traces to a flagged commit pattern in Phase 2
- [ ] `ReadLints` clean on every touched file
## Failure handling
Same retry / escalation protocol as `monorepo-cicd` — see `protocols.md`. The most common failure mode is the binary-fixture leftover (sample.mp4 missing or SHA-mismatched); this skill does not attempt to resolve it, only surfaces it.
-248
View File
@@ -1,248 +0,0 @@
---
name: monorepo-onboard
description: Adds a new component (submodule / package / workspace member) to a monorepo as a single atomic operation. Updates the component registry (`.gitmodules` / `package.json` workspaces / `Cargo.toml` / etc.), places or extends unified docs, updates CI/compose/env artifacts, and appends an entry to `_docs/_repo-config.yaml`. Intentionally monolithic — onboarding is one user intent that spans multiple artifact domains. Use when the user says "onboard X", "add service Y to the monorepo", "register new repo".
---
# Monorepo Onboard
Onboards a new component atomically. Spans registry + docs + CI + env + config in one coordinated run — because onboarding is a single user intent, and splitting it across multiple skills would fragment the user experience, cause duplicate input collection, and create inconsistent intermediate states in the config file.
## Why this skill is monolithic
Onboarding ONE component requires updating ~8 artifacts. If the user had to invoke `monorepo-document`, `monorepo-cicd`, and a registry skill separately, they would answer overlapping questions 23 times, and the config file would pass through invalid states between runs. Monolithic preserves atomicity and consistency.
Sync operations (after onboarding is done) ARE split by artifact — see `monorepo-document` and `monorepo-cicd`.
## Preconditions (hard gates)
1. `_docs/_repo-config.yaml` exists.
2. Top-level `confirmed_by_user: true`.
3. The component is NOT already in `components:` — if it is, redirect to `monorepo-document` or `monorepo-cicd` (it's an update, not an onboarding).
## Mitigations (M1M7)
- **M1** Separation: this skill does not invoke `monorepo-discover` automatically. If `_repo-config.yaml` needs regeneration first, tell the user.
- **M3** Factual vs. interpretive vs. conventional: all user inputs below are CONVENTIONAL (project choices) — always ASK, never infer.
- **M4** Batch inputs in one question round.
- **M5** Skip over guess: if the user's answer doesn't match enumerable options in config (e.g., unknown deployment tier), stop and ask whether to extend config or adjust answer.
- **M6** Assumptions footer + config `assumptions_log` append.
- **M7** Drift detection: before writing anything, verify every artifact path that will be touched exists (or will be created) — stop on unexpected conditions.
## Required inputs (batch-ask, M4)
Collect ALL of these upfront. If any missing, stop and ask. Offer choices from config when the input has a constrained domain (e.g., `conventions.deployment_tiers`).
| Input | Example | Enumerable? |
| ----- | ------- | ----------- |
| `name` | `satellite-provider` | No — open-ended, follow `conventions.component_naming` |
| `location` | git URL / path | No |
| `stack` | `.NET 10`, `Python 3.12` | No — open-ended |
| `purpose` (one line) | "Fetches satellite imagery" | No |
| `doc_placement` | "extend `_docs/07_admin.md`" OR "new `_docs/NN_satellite.md`" | Yes — offer options based on `docs.*` |
| `ci_required` | Which pipelines (or "none") | Yes — infer from `ci.tooling` |
| `deployment_tier` | `edge` | Yes — `conventions.deployment_tiers` |
| `ports` | "5010/http" or "none" | No |
| `depends_on` | Other components called | Yes — list from `components:` names |
| `env_vars` | Name + placeholder value | No (never real secrets) |
If the user provides an answer outside the enumerable set (e.g., deployment tier not in config), **stop** and ask whether to extend the config or pick from the existing set (M5).
## Workflow
### Phase 1: Drift check (M7)
Before writing:
1. Verify `repo.component_registry` exists on disk.
2. Verify `docs.root` exists.
3. If `doc_placement` = extend existing doc, verify that doc exists.
4. Verify every file in `ci.orchestration_files` and `ci.env_template` exists.
5. Verify `ci.service_registry_doc` exists (if set).
Any missing → stop, ask whether to run `monorepo-discover` first or proceed skipping that artifact.
### Phase 2: Register in component registry
Based on `repo.type`:
| Registry | Action |
| -------- | ------ |
| `git-submodules` | Append `[submodule "<name>"]` stanza to `.gitmodules`. Preserve existing indentation style exactly. |
| `npm-workspaces` | Add path to `workspaces` array in `package.json`. Preserve JSON formatting. |
| `pnpm-workspace` | Add to `packages:` in `pnpm-workspace.yaml`. |
| `cargo-workspace` | Add to `members:` in `Cargo.toml`. |
| `go-workspace` | Add to `use (...)` block in `go.work`. |
| `adhoc` | Update the registry file that config points to. |
**Do NOT run** `git submodule add`, `npm install`, or equivalent commands. Produce the text diff; the user runs the actual registration command after review.
### Phase 3: Root README update
If the root README contains a component/services table (check `repo.root_readme`):
1. Insert a new row following existing ordering (alphabetical or deployment-order — match what's there).
2. Match column widths and punctuation exactly.
If there's an ASCII architecture diagram and `deployment_tier` implies new runtime presence, **ask** the user where to place the new box — don't invent a position.
### Phase 4: Unified docs placement
**If extending an existing doc**:
1. Read the target file.
2. Add a new H2 section at the appropriate position. If ambiguous (the file has multiple possible sections), ask.
3. Update file's internal TOC if present.
4. Update `docs.index` ONLY if that index has a cross-reference table that includes sub-sections (check the file).
**If creating a new doc file**:
1. Determine the filename via `docs.file_convention` and `docs.next_unused_prefix` (e.g., `13_satellite_provider.md`).
2. Create using this template:
```markdown
# <Component Name>
## Overview
<expanded purpose from user input>
## API
<endpoints or "None">
## Data model
<if applicable, else "None">
## Configuration
<env vars from user input>
```
3. Update `docs.index` (`_docs/README.md` or equivalent):
- Add row to docs table, matching existing format
- If the component introduces a permission AND the index has a permission → feature matrix, update that too
4. After creating, update `docs.next_unused_prefix` in `_docs/_repo-config.yaml`.
### Phase 5: Cross-cutting docs
For each `docs.cross_cutting` entry whose `owns:` matches a fact provided by the user, update that doc:
- `depends_on` non-empty → architecture/communication doc
- New schema/tables → schema doc (ask user for schema details if not provided)
- New permission/role → permissions doc
If a cross-cutting concern is implied by inputs but has no owner in config → add to `unresolved:` in config and ask.
### Phase 6: CI/CD integration
Update:
- **`ci.service_registry_doc`**: add new row to the service table in that file (if set). Match existing format.
- **Orchestration files** (`ci.orchestration_files`): add service block if component is a runtime service. Use `ci.image_tag_format` for the image string. Include `depends_on`, `ports`, `environment`, `volumes` based on user inputs and existing service-block structure.
- **`ci.env_template`**: append new env vars with placeholder values. NEVER real secrets.
### Phase 7: Per-component CI — guidance ONLY
For `<component>/.woodpecker/*.yml`, `<component>/.github/workflows/*`, etc.:
**Do NOT create these files.** They live inside the component's own repo/workspace.
Instead, output the `ci.pipeline_template` (from config) customized for this component, so the user can copy it into the component's workspace themselves.
### Phase 8: Update `_docs/_repo-config.yaml`
Append new entry to `components:`:
```yaml
- name: <name>
path: <path>/
stack: <stack>
confirmed: true # user explicitly onboarded = confirmed
evidence: [user_onboarded]
primary_doc: <new doc path>
secondary_docs: [...]
ci_config: <component>/.<ci_tool>/ # expected location
deployment_tier: <tier>
ports: [...]
depends_on: [...]
env_vars: [...]
```
If `docs.next_unused_prefix` was consumed, increment it.
Append to `assumptions_log:`:
```yaml
- date: <date>
skill: monorepo-onboard
run_notes: "Onboarded <name>"
assumptions:
- "<list>"
```
Do NOT change `confirmed_by_user` — only human sets that.
### Phase 9: Verification report (M6 footer)
```
monorepo-onboard run complete — onboarded `<name>`.
Files modified (N):
- .gitmodules — added submodule entry
- README.md — added row in Services table
- _docs/NN_<name>.md — created
- _docs/README.md — added index row + permission-matrix row
- _docs/00_top_level_architecture.md — added to Communication section
- docker-compose.run.yml — added service block
- .env.example — added <NAME>_API_KEY placeholder
- ci_steps.md — added service-table row
- _docs/_repo-config.yaml — recorded component + updated next_unused_prefix
Files NOT modified but the user must handle:
- <component>/.woodpecker/build-*.yml — create inside the component's own workspace
(template below)
- CI system UI — activate the new repo
Next manual actions:
1. Actually add the component: `git submodule add <url> <path>` (or equivalent)
2. Create per-component CI config using the template
3. Activate the repo in your CI system
4. Review the full diff, then commit with `<commit_prefix> Onboard <name>`
Pipeline template for <name>:
<rendered ci.pipeline_template with <service> replaced>
Assumptions used this run:
- Doc filename convention: <from config>
- Image tag format: <from config>
- Alphabetical ordering in Services table (observed)
```
## What this skill will NEVER do
- Run `git submodule add`, `npm install`, or any network/install-touching command
- Create per-component CI configs inside component directories
- Invent env vars, ports, permissions, or ticket IDs — all from user
- Auto-commit
- Reorder existing table rows beyond inserting the new one
- Set `confirmed_by_user: true` in config
- Touch a file outside the explicit scope
## Rollback (pre-commit)
Before the user commits, revert is straightforward:
```bash
git checkout -- <every file listed in the report>
```
For the new doc file, remove it explicitly:
```bash
rm _docs/NN_<name>.md
```
The component itself (if already registered via `git submodule add` or workspace install) requires manual cleanup — outside this skill's scope.
## Edge cases
- **Component already in config** (not registry) or vice versa → state mismatch. Redirect to `monorepo-discover` to reconcile.
- **User input contradicts config convention** (e.g., new deployment tier not in `conventions.deployment_tiers`): stop, ask — extend config, or choose from existing.
- **`docs.next_unused_prefix` collides with an existing file** (race condition): bump and retry once; if still colliding, stop.
- **No `docs.root` in config**: cannot place a doc. Ask user to run `monorepo-discover` or manually set it in the config first.
-160
View File
@@ -1,160 +0,0 @@
---
name: monorepo-status
description: Read-only drift/coverage report for a monorepo. Reads `_docs/_repo-config.yaml` and compares live repo state (component commits, doc files, CI artifacts) against it. Surfaces which components have unsynced docs, missing CI coverage, unresolved questions, or structural drift. Writes nothing. Use before releases, during audits, or whenever the user asks "what's out of sync?".
---
# Monorepo Status
Read-only. Reports drift between the live repo and `_docs/_repo-config.yaml`. Writes **nothing** — not even `assumptions_log`. Its only deliverable is a text report.
## Preconditions (soft gates)
1. `_docs/_repo-config.yaml` exists — if not, redirect: "Run `monorepo-discover` first."
2. `confirmed_by_user: true` is NOT required — this skill can run on an unconfirmed config, but will flag it prominently.
## Mitigations (M1M7)
- **M1/M7** This skill IS M7 — it is the drift-detection mechanism other skills invoke conceptually. It surfaces drift, never "fixes" it.
- **M3** All checks are FACTUAL (file exists? commit date? referenced in config?). No interpretive work.
- **M6** Assumptions footer included; but this skill does NOT append to `assumptions_log` in config (writes nothing).
## What the report covers
### Section 1: Config health
- Is `confirmed_by_user: true`? (If false, flag prominently — other skills won't run)
- How many entries have `confirmed: false` (inferred)?
- Count of `unresolved:` entries + their IDs
- Age of config (`last_updated`) — flag if > 60 days old
### Section 2: Component drift
For each component in `components:`:
1. Last commit date of component:
```bash
git -C <path> log -1 --format=%cI # submodule
# or
git log -1 --format=%cI -- <path> # subfolder
```
2. Last commit date of `primary_doc` (and each `secondary_docs` entry):
```bash
git log -1 --format=%cI -- <doc_path>
```
3. Flag as drift if ANY doc's last commit is older than the component's last commit by more than a threshold (default: 0 days — any ordering difference is drift, but annotate magnitude).
### Section 3: CI coverage
For each component:
- Does it have files matching `ci.expected_files_per_component[*].path_glob`?
- Is it present in each `ci.orchestration_files` that's expected to include it (heuristic: check if the compose file mentions the component name or image)?
- Is it listed in `ci.service_registry_doc` if that file has a service table?
Mark each as `complete` / `partial` / `missing` and explain.
### Section 4: Registry vs. config consistency
- Every component in the registry (`.gitmodules`, workspaces, etc.) appears in `components:` — flag mismatches
- Every component in `components:` appears in the registry — flag mismatches
- Every `docs.root` file cross-referenced in config exists on disk — flag missing
- Every `ci.orchestration_files` and `ci.install_scripts` exists — flag missing
- `glossary_doc:` (if recorded in config) points to a file that exists on disk — flag missing
- The cross-cutting architecture doc identified by `docs.cross_cutting` contains a `## Architecture Vision` section — flag missing (signals the meta-repo flow's Step 2.5 was skipped or the section was removed)
### Section 5: Unresolved questions
List every `unresolved:` entry in config with its ID and question — so the user knows what's blocking full confirmation.
## Workflow
1. Read `_docs/_repo-config.yaml`. If missing or unparseable, STOP with a redirect to `monorepo-discover`.
2. Run all checks above (purely read-only).
3. Render the single summary table and supporting sections.
4. Include the assumptions footer.
5. STOP. Do not edit any file.
## Report template
```
═══════════════════════════════════════════════════
MONOREPO STATUS
═══════════════════════════════════════════════════
Config: _docs/_repo-config.yaml
confirmed_by_user: <true|false> [FLAG if false]
last_updated: <date> [FLAG if > 60 days]
inferred entries: <count> of <total>
unresolved: <count> open
═══════════════════════════════════════════════════
Component drift
═══════════════════════════════════════════════════
Component | Last commit | Primary doc age | Secondary docs | Status
-------------------- | ----------- | --------------- | -------------- | ------
annotations | 2d ago | 2d ago | OK | in-sync
flights | 1d ago | 12d ago | 1 stale (schema)| drift
satellite-provider | 3d ago | N/A | N/A | no mapping
═══════════════════════════════════════════════════
CI coverage
═══════════════════════════════════════════════════
Component | CI configs | Orchestration | Service registry
-------------------- | ---------- | ------------- | ----------------
annotations | complete | yes | yes
flights | complete | yes | yes
satellite-provider | missing | no | no
═══════════════════════════════════════════════════
Registry vs. config
═══════════════════════════════════════════════════
In registry, not in config: [list or "(none)"]
In config, not in registry: [list or "(none)"]
Config-referenced docs missing: [list or "(none)"]
Config-referenced CI files missing: [list or "(none)"]
glossary_doc: [path or "not recorded — run /autodev to capture"]
Architecture Vision section: [present | missing in <doc>]
═══════════════════════════════════════════════════
Unresolved questions
═══════════════════════════════════════════════════
- <id>: <question>
- <id>: <question>
═══════════════════════════════════════════════════
Recommendations
═══════════════════════════════════════════════════
- Run monorepo-document for: flights (docs drift)
- Run monorepo-cicd for: satellite-provider (no CI coverage)
- Run monorepo-onboard for: satellite-provider (no mapping)
- Run monorepo-discover to refresh config (if drift is widespread or config is stale)
═══════════════════════════════════════════════════
Assumptions used this run
═══════════════════════════════════════════════════
- Drift threshold: any ordering difference counts as drift
- CI coverage heuristic: component name or image appears in compose file
- Component last-commit measured via `git log` against the component path
Report only. No files modified.
```
## What this skill will NEVER do
- Modify any file (including the config `assumptions_log`)
- Run `monorepo-discover`, `monorepo-document`, `monorepo-cicd`, or `monorepo-onboard` automatically — only recommend them
- Block on unresolved entries (it just lists them)
- Install tools
## Edge cases
- **Git not available / shallow clone**: commit dates may be inaccurate — note in the assumptions footer.
- **Config has `confirmed: false` but no unresolved entries**: this is a sign discovery ran but the human never reviewed. Flag in Section 1.
- **Component in registry but no entry in config** (or vice versa): flag in Section 4 — don't guess what the mapping should be; just report the mismatch.
- **Very large monorepos (100+ components)**: don't truncate tables; tell the user if the report will be long, offer to scope to a subset.
+33 -132
View File
@@ -4,19 +4,19 @@ description: |
Interactive skill for adding new functionality to an existing codebase.
Guides the user through describing the feature, assessing complexity,
optionally running research, analyzing the codebase for insertion points,
validating assumptions with the user, and producing a task spec with work item ticket.
validating assumptions with the user, and producing a task spec with Jira ticket.
Supports a loop — the user can add multiple tasks in one session.
Trigger phrases:
- "new task", "add feature", "new functionality"
- "I want to add", "new component", "extend"
category: build
tags: [task, feature, interactive, planning, work-items]
tags: [task, feature, interactive, planning, jira]
disable-model-invocation: true
---
# New Task (Interactive Feature Planning)
Guide the user through defining new functionality for an existing codebase. Produces one or more task specifications with work item tickets, optionally running deep research for complex features.
Guide the user through defining new functionality for an existing codebase. Produces one or more task specifications with Jira tickets, optionally running deep research for complex features.
## Core Principles
@@ -31,14 +31,13 @@ Guide the user through defining new functionality for an existing codebase. Prod
Fixed paths:
- TASKS_DIR: `_docs/02_tasks/`
- TASKS_TODO: `_docs/02_tasks/todo/`
- PLANS_DIR: `_docs/02_task_plans/`
- DOCUMENT_DIR: `_docs/02_document/`
- DEPENDENCIES_TABLE: `_docs/02_tasks/_dependencies_table.md`
Create TASKS_DIR, TASKS_TODO, and PLANS_DIR if they don't exist.
Create TASKS_DIR and PLANS_DIR if they don't exist.
If TASKS_DIR already contains task files (scan `todo/`, `backlog/`, and `done/`), use them to determine the next numeric prefix for temporary file naming.
If TASKS_DIR already contains task files, scan them to determine the next numeric prefix for temporary file naming.
## Workflow
@@ -75,75 +74,36 @@ Record the description verbatim for use in subsequent steps.
**Role**: Technical analyst
**Goal**: Determine whether deep research is needed.
Read the user's description and the existing codebase documentation from DOCUMENT_DIR (architecture.md including its `## Architecture Vision` section, glossary.md, components/, system-flows.md). Use `glossary.md` to keep the new task's name, acceptance-criteria wording, and component references aligned with the user's confirmed vocabulary; flag the task to the user if the request appears to violate an Architecture Vision principle, do not silently allow it.
**Consult LESSONS.md**: if `_docs/LESSONS.md` exists, read it and look for entries in categories `estimation`, `architecture`, `dependencies` that might apply to the task under consideration. If a relevant lesson exists (e.g., "estimation: auth-related changes historically take 2x estimate"), bias the classification and recommendation accordingly. Note in the output which lessons (if any) were applied.
Read the user's description and the existing codebase documentation from DOCUMENT_DIR (architecture.md, components/, system-flows.md).
Assess the change along these dimensions:
- **Scope**: how many components/files are affected?
- **Novelty**: does it involve libraries, protocols, or patterns not already in the codebase?
- **Risk**: could it break existing functionality or require architectural changes?
### 2a. Complexity-Points Estimate
Project policy (per the workspace user-rule on ADO points): aim for tasks at 23 points (rarely 5). Tasks at 8 points are high risk; tasks at 13 are too complex and MUST be broken down. The new-task skill enforces this here, before producing a single-file task spec.
Map the Scope/Novelty/Risk profile to a points estimate using this table:
| Profile | Points | Examples |
|---------|--------|----------|
| All three low | **12** | One-line config change; trivial CRUD field addition |
| Two low + one medium | **3** | Localized refactor; add one well-understood endpoint |
| One low + two medium, OR all medium | **5** | New small feature touching 23 components; integration with a known library |
| Any high, OR two medium + one high | **8** | Cross-cutting concern across 4+ components; integration with an unfamiliar protocol; significant architectural change |
| Two or three high | **13** | New subsystem; unfamiliar tech across the stack; multiple unknown unknowns |
If a relevant LESSONS.md entry biases the estimate (e.g., "auth-related changes historically take 2× estimate"), apply the multiplier and round up to the next discrete point on the scale (1, 2, 3, 5, 8, 13).
### 2b. Routing by Complexity
| Estimate | Default routing | Override path |
|----------|-----------------|---------------|
| **15** | Continue this skill at Step 3 (Research) or Step 4 (Codebase Analysis) — see classification below | — |
| **8** | **STOP this skill and recommend handoff to `/decompose @<feature_description>`** (single-component decompose mode if the affected scope fits inside one component, default mode if it does not). The user may override and proceed in `/new-task`, but the override must be explicitly chosen. | C) Proceed in /new-task anyway with the user's acknowledgement that the resulting task is high-risk and may need to be re-decomposed mid-implementation |
| **13** | **STOP this skill — auto-handoff is mandatory.** A 13-point feature cannot be a single task spec. Invoke `/decompose @<feature_description>` (default mode) before writing any task file. Surface the handoff to the user with no override path; this is a hard policy gate. | None — must decompose |
For the auto-handoff path:
1. Render a one-paragraph description of the feature suitable to feed `/decompose` (combine Step 1's verbatim user description with the complexity-points reasoning).
2. Save it to `_docs/02_task_plans/<feature_slug>/feature-description.md` so the decompose skill has a stable input file.
3. Either (a) directly auto-chain into `.cursor/skills/decompose/SKILL.md` in default mode with this file as input, or (b) report the handoff to the user along with the exact `/decompose` invocation and stop. Pick (a) only if the user has explicitly enabled auto-chain across skills (e.g., we are inside an `/autodev` invocation); otherwise pick (b).
### 2c. Research vs Skip Research (only for ≤5 estimates)
Classification (independent of points; runs only when points ≤ 5 and Step 2b chose Continue):
Classification:
| Category | Criteria | Action |
|----------|----------|--------|
| **Needs research** | New libraries/frameworks, unfamiliar protocols, multiple unknowns | Proceed to Step 3 (Research) |
| **Needs research** | New libraries/frameworks, unfamiliar protocols, significant architectural change, multiple unknowns | Proceed to Step 3 (Research) |
| **Skip research** | Extends existing functionality, uses patterns already in codebase, straightforward new component with known tech | Skip to Step 4 (Codebase Analysis) |
Present the full assessment to the user:
Present the assessment to the user:
```
══════════════════════════════════════
COMPLEXITY ASSESSMENT
══════════════════════════════════════
Scope: [low / medium / high]
Novelty: [low / medium / high]
Risk: [low / medium / high]
Points: [1 / 2 / 3 / 5 / 8 / 13] (project aim: 23, rarely 5)
Routing: [Continue in /new-task | Hand off to /decompose]
Scope: [low / medium / high]
Novelty: [low / medium / high]
Risk: [low / medium / high]
══════════════════════════════════════
Recommendation: [Research needed | Skip research | Decompose required]
Reason: [one-line justification, including any LESSONS.md influence]
Recommendation: [Research needed / Skip research]
Reason: [one-line justification]
══════════════════════════════════════
```
**BLOCKING**:
- If points ≤ 5 → ask the user to confirm or override the research recommendation before proceeding.
- If points = 8 → ask the user to choose between hand-off to /decompose (recommended) and continuing in /new-task with explicit risk acknowledgement.
- If points = 13 → STOP and present the handoff plan; do not offer a continue-anyway override.
**BLOCKING**: Ask the user to confirm or override the recommendation before proceeding.
---
@@ -158,7 +118,7 @@ This step only runs if Step 2 determined research is needed.
2. Invoke `.cursor/skills/research/SKILL.md` in standalone mode:
- INPUT_FILE: `PLANS_DIR/<task_slug>/problem.md`
- BASE_DIR: `PLANS_DIR/<task_slug>/`
3. After research completes, read the latest solution draft from `PLANS_DIR/<task_slug>/01_solution/` (highest-numbered `solution_draft*.md`)
3. After research completes, read the solution draft from `PLANS_DIR/<task_slug>/01_solution/solution_draft01.md`
4. Extract the key findings relevant to the task specification
The `<task_slug>` is a short kebab-case name derived from the feature description (e.g., `auth-provider-integration`, `real-time-notifications`).
@@ -168,11 +128,10 @@ The `<task_slug>` is a short kebab-case name derived from the feature descriptio
### Step 4: Codebase Analysis
**Role**: Software architect
**Goal**: Determine where and how to insert the new functionality, and whether existing tests cover the new requirements.
**Goal**: Determine where and how to insert the new functionality.
1. Read the codebase documentation from DOCUMENT_DIR:
- `architecture.md` — overall structure (the `## Architecture Vision` H2 is user-confirmed intent and must not be violated by the new task without explicit approval)
- `glossary.md` — project terminology; reuse the user's vocabulary in task names, AC, and component references
- `architecture.md` — overall structure
- `components/` — component specs
- `system-flows.md` — data flows (if exists)
- `data_model.md` — data model (if exists)
@@ -184,10 +143,6 @@ The `<task_slug>` is a short kebab-case name derived from the feature descriptio
- What new interfaces or models are needed
- How data flows through the change
4. If the change is complex enough, read the actual source files (not just docs) to verify insertion points
5. **Test coverage gap analysis**: Read existing test files that cover the affected components. For each acceptance criterion from Step 1, determine whether an existing test already validates it. Classify each AC as:
- **Covered**: an existing test directly validates this behavior
- **Partially covered**: an existing test exercises the code path but doesn't assert the new requirement
- **Not covered**: no existing test validates this behavior — a new test is required
Present the analysis:
@@ -200,56 +155,9 @@ Present the analysis:
Interface changes: [list or "None"]
New interfaces: [list or "None"]
Data flow impact: [summary]
─────────────────────────────────────
TEST COVERAGE GAP ANALYSIS
─────────────────────────────────────
AC-1: [Covered / Partially covered / Not covered]
[existing test name or "needs new test"]
AC-2: [Covered / Partially covered / Not covered]
[existing test name or "needs new test"]
...
─────────────────────────────────────
New tests needed: [count]
Existing tests to update: [count or "None"]
══════════════════════════════════════
```
When gaps are found, the task spec (Step 6) MUST include the missing tests in the Scope (Included) section and the Unit/Blackbox Tests tables. Tests are not optional — if an AC is not covered by an existing test, the task must deliver a test for it.
---
### Step 4.5: Contract & Layout Check
**Role**: Architect
**Goal**: Prevent silent public-API drift and keep `module-layout.md` consistent before implementation locks file ownership.
Apply the four shared-task triggers from `.cursor/skills/decompose/SKILL.md` Step 2 rule #10 (shared/*, Scope mentions interface/DTO/schema/event/contract/API/shared-model, parent epic is cross-cutting, ≥2 consumers) and classify the task:
- **Producer** — any trigger fires, OR the task changes a public signature / invariant / serialization / error variant of an existing symbol:
1. Check for an existing contract at `_docs/02_document/contracts/<component>/<name>.md`.
2. If present → decide version bump (patch / minor / major per the contract's Versioning Rules) and add the Change Log entry to the task's deliverables.
3. If absent → add creation of the contract file (using `.cursor/skills/decompose/templates/api-contract.md`) to the task's Scope.Included; add a `## Contract` section to the task spec.
4. List every currently-known consumer (from Codebase Analysis Step 4) and add them to the contract's Consumer tasks field.
- **Consumer** — the task imports or calls a public API belonging to another component:
1. Resolve the component's contract file; add it to the task's `### Document Dependencies` section.
2. If the cross-component interface has no contract file, Choose: **A)** create a retroactive contract now as a prerequisite task, **B)** proceed without (logs an explicit coupling risk in the task's Risks & Mitigation).
- **Layout delta** — the task introduces a new component OR changes an existing component's Public API surface:
1. Draft the Per-Component Mapping entry (or the Public API diff) against `_docs/02_document/module-layout.md` using `.cursor/skills/decompose/templates/module-layout.md` format.
2. Add the layout edit to the task's deliverables; the implementer writes it alongside the code change.
3. If `module-layout.md` does not exist, STOP and instruct the user to run `/document` first (existing-code flow) or `/decompose` default mode (greenfield). Do not guess.
- **ADR cross-check** — runs unconditionally for every new-task in any of the three classifications above:
1. If `_docs/02_document/adr/` exists, scan every `Status: Accepted` ADR. For each, ask: "would the proposed task either contradict this ADR's `Decision` or materially affect its `Consequences`?"
2. **Conflict** (task contradicts an Accepted ADR) → STOP and Choose A/B/C: **A)** Re-scope the task to comply with the ADR, **B)** Propose superseding the ADR — the task spec then includes a deliverable to invoke `/plan --adr-only` (or the next `/plan` cycle's Step 4.5) with `Supersedes: ADR-NNN`, and the new task does NOT proceed until that supersede ADR is `Accepted`, **C)** Park the task in `backlog/` with a `Blocked-By: ADR-NNN review` note. Do not silently approve a contradictory task.
3. **Drift** (task changes assumptions an ADR depends on but does not directly contradict it) → record the affected ADR(s) under a new `### ADR Impact` section in the task spec with `> Affects ADR NNN_<slug>: <one-line summary>`. The implementer surfaces this at code-review Phase 7 (which then classifies it as ADR-Drift if not addressed).
4. **Aligned** (task implements something an Accepted ADR mandates) → cite the ADR(s) under `### ADR Compliance` in the task spec with `> Implements ADR NNN_<slug>`. Code-review Phase 7 then expects matching evidence in the implemented code.
Record the classification, any contract/layout deliverables, and any ADR cross-check outcomes in the working notes; they feed Step 5 (Validate Assumptions) and Step 6 (Create Task).
**BLOCKING**: none — this step surfaces findings; the user confirms them in Step 5.
---
### Step 5: Validate Assumptions
@@ -287,28 +195,21 @@ Present using the Choose format for each decision that has meaningful alternativ
**Role**: Technical writer
**Goal**: Produce the task specification file.
1. Determine the next numeric prefix by scanning all TASKS_DIR subfolders (`todo/`, `backlog/`, `done/`) for existing files
2. If research was performed (Step 3), the research artifacts live in `PLANS_DIR/<task_slug>/` — reference them from the task spec where relevant
3. Write the task file using `.cursor/skills/decompose/templates/task.md`:
1. Determine the next numeric prefix by scanning TASKS_DIR for existing files
2. Write the task file using `.cursor/skills/decompose/templates/task.md`:
- Fill all fields from the gathered information
- Set **Complexity** based on the assessment from Step 2
- Set **Dependencies** by cross-referencing existing tasks in TASKS_DIR subfolders
- Set **Tracker** and **Epic** to `pending` (filled in Step 7)
3. Save as `TASKS_TODO/[##]_[short_name].md`
- Set **Dependencies** by cross-referencing existing tasks in TASKS_DIR
- Set **Jira** and **Epic** to `pending` (filled in Step 7)
3. Save as `TASKS_DIR/[##]_[short_name].md`
**Self-verification**:
- [ ] Problem section clearly describes the user need
- [ ] Acceptance criteria are testable (Gherkin format)
- [ ] Scope boundaries are explicit
- [ ] Complexity points match the assessment
- [ ] Dependencies reference existing task tracker IDs where applicable
- [ ] Dependencies reference existing task Jira IDs where applicable
- [ ] No implementation details leaked into the spec
- [ ] If Step 4.5 classified the task as producer, the `## Contract` section exists and points at a contract file
- [ ] If Step 4.5 classified the task as consumer, `### Document Dependencies` lists the relevant contract file
- [ ] If Step 4.5 flagged a layout delta, the task's Scope.Included names the `module-layout.md` edit
- [ ] If Step 4.5 flagged an ADR conflict, the task is either re-scoped (A), explicitly blocked on a supersede ADR (B), or parked in backlog (C) — never silently bypassed
- [ ] If Step 4.5 flagged ADR drift, the task spec has an `### ADR Impact` section listing the affected ADR(s)
- [ ] If Step 4.5 flagged ADR alignment, the task spec has an `### ADR Compliance` section citing the implemented ADR(s)
---
@@ -317,20 +218,20 @@ Present using the Choose format for each decision that has meaningful alternativ
**Role**: Project coordinator
**Goal**: Create a work item ticket and link it to the task file.
1. Create a ticket via the configured work item tracker (see `autodev/protocols.md` for tracker detection):
1. Create a ticket via the configured work item tracker (Jira MCP or Azure DevOps MCP — see `autopilot/protocols.md` for detection):
- Summary: the task's **Name** field
- Description: the task's **Problem** and **Acceptance Criteria** sections
- Story points: the task's **Complexity** value
- Link to the appropriate epic (ask user if unclear which epic)
2. Write the ticket ID and Epic ID back into the task file header:
- Update **Task** field: `[TICKET-ID]_[short_name]`
- Update **Tracker** field: `[TICKET-ID]`
- Update **Jira** field: `[TICKET-ID]`
- Update **Epic** field: `[EPIC-ID]`
3. Rename the file from `[##]_[short_name].md` to `[TICKET-ID]_[short_name].md`
If the work item tracker is not authenticated or unavailable, follow `.cursor/rules/tracker.mdc` before continuing. Only if the user explicitly chooses `tracker: local`:
If the work item tracker is not authenticated or unavailable (`tracker: local`):
- Keep the numeric prefix
- Set **Tracker** to `pending`
- Set **Jira** to `pending`
- Set **Epic** to `pending`
- The task is still valid and can be implemented; tracker sync happens later
@@ -342,7 +243,7 @@ Ask the user:
```
══════════════════════════════════════
Task created: [TRACKER-ID or ##] — [task name]
Task created: [JIRA-ID or ##] — [task name]
══════════════════════════════════════
A) Add another task
B) Done — finish and update dependencies
@@ -358,7 +259,7 @@ Ask the user:
After the user chooses **Done**:
1. Update (or create) `DEPENDENCIES_TABLE` — add all newly created tasks to the dependencies table
1. Update (or create) `TASKS_DIR/_dependencies_table.md` — add all newly created tasks to the dependencies table
2. Present a summary of all tasks created in this session:
```
@@ -368,8 +269,8 @@ After the user chooses **Done**:
Tasks created: N
Total complexity: M points
─────────────────────────────────────
[TRACKER-ID] [name] ([complexity] pts)
[TRACKER-ID] [name] ([complexity] pts)
[JIRA-ID] [name] ([complexity] pts)
[JIRA-ID] [name] ([complexity] pts)
...
══════════════════════════════════════
```
@@ -383,7 +284,7 @@ After the user chooses **Done**:
| Research skill hits a blocker | Follow research skill's own escalation rules |
| Codebase analysis reveals conflicting architectures | **ASK** user which pattern to follow |
| Complexity exceeds 5 points | **WARN** user and suggest splitting into multiple tasks |
| Work item tracker MCP unavailable | Follow `.cursor/rules/tracker.mdc`; do not continue in local mode unless the user explicitly chooses it |
| Jira MCP unavailable | **WARN**, continue with local-only task files |
## Trigger Conditions
+13 -28
View File
@@ -1,21 +1,21 @@
---
name: plan
description: |
Decompose a solution into architecture, data model, deployment plan, system flows, components, tests, and work item epics.
Systematic planning workflow with BLOCKING gates, self-verification, and structured artifact management.
Decompose a solution into architecture, data model, deployment plan, system flows, components, tests, and Jira epics.
Systematic 6-step planning workflow with BLOCKING gates, self-verification, and structured artifact management.
Uses _docs/ + _docs/02_document/ structure.
Trigger phrases:
- "plan", "decompose solution", "architecture planning"
- "break down the solution", "create planning documents"
- "component decomposition", "solution analysis"
category: build
tags: [planning, architecture, components, testing, work-items, epics]
tags: [planning, architecture, components, testing, jira, epics]
disable-model-invocation: true
---
# Solution Planning
Decompose a problem and solution into architecture, data model, deployment plan, system flows, components, ADRs, tests, and work item epics through a systematic workflow with seven step files (1, 2, 3, 4, 4.5, 5, 6) plus a Final quality checklist.
Decompose a problem and solution into architecture, data model, deployment plan, system flows, components, tests, and Jira epics through a systematic 6-step workflow.
## Core Principles
@@ -55,13 +55,13 @@ Read `steps/01_artifact-management.md` for directory structure, save timing, sav
## Progress Tracking
At the start of execution, create a TodoWrite with all steps (1, 2, 3, 4, 4.5, 5, 6 plus Final). Update status as each step completes. The fractional Step 4.5 (ADR Capture) sits between Architecture Review (Step 4) and Test Specifications (Step 5).
At the start of execution, create a TodoWrite with all steps (1 through 6 plus Final). Update status as each step completes.
## Workflow
### Step 1: Blackbox Tests
Read and execute `.cursor/skills/test-spec/SKILL.md`. This is a planning context — no source code exists yet, so test-spec Phase 4 (script generation) is skipped. Script creation is handled later by the decompose skill as a task.
Read and execute `.cursor/skills/test-spec/SKILL.md`.
Capture any new questions, findings, or insights that arise during test specification — these feed forward into Steps 2 and 3.
@@ -69,7 +69,7 @@ Capture any new questions, findings, or insights that arise during test specific
### Step 2: Solution Analysis
Read and follow `steps/02_solution-analysis.md`. The step opens with **Phase 2a.0: Glossary & Architecture Vision** (BLOCKING) — drafts `_docs/02_document/glossary.md` and a one-paragraph architecture vision, presents the condensed view to the user, iterates until confirmed, then proceeds into the architecture, data-model, and deployment phases. The confirmed vision becomes the first `## Architecture Vision` H2 of `architecture.md`.
Read and follow `steps/02_solution-analysis.md`.
---
@@ -85,25 +85,15 @@ Read and follow `steps/04_review-risk.md`.
---
### Step 4.5: Architecture Decision Records (ADRs)
Read and follow `steps/04-5_adr-capture.md`.
This step captures the architecture and tech-stack decisions that were made (or revised) in Steps 24 as durable, dated, immutable records under `_docs/02_document/adr/`. ADRs are the single thing in `_docs/` that explain the **why** of each major decision after the conversation history is gone. They are consumed by `decompose` (when bootstrapping module layout), `new-task` (when assessing a new feature against existing decisions), `refactor` (when proposing replacements), and any future code-review cycle that needs to confirm a structural choice was deliberate.
This step is **BLOCKING**: the ADR set must be reviewed and confirmed by the user before Step 5 begins.
---
### Step 5: Test Specifications
Read and follow `steps/05_test-specifications.md`.
---
### Step 6: Work Item Epics
### Step 6: Jira Epics
Read and follow `steps/06_work-item-epics.md`.
Read and follow `steps/06_jira-epics.md`.
---
@@ -117,7 +107,6 @@ Read and follow `steps/07_quality-checklist.md`.
- **Coding during planning**: this workflow produces documents, never code
- **Multi-responsibility components**: if a component does two things, split it
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Skipping the glossary/vision gate (Phase 2a.0)**: drafting `architecture.md` from raw `solution.md` without confirming terminology and vision means the AI's mental model is not aligned with the user's; every downstream artifact will inherit that drift
- **Diagrams without data**: generate diagrams only after the underlying structure is documented
- **Copy-pasting problem.md**: the architecture doc should analyze and transform, not repeat the input
- **Vague interfaces**: "component A talks to component B" is not enough; define the method, input, output
@@ -130,7 +119,7 @@ Read and follow `steps/07_quality-checklist.md`.
|-----------|--------|
| Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — planning cannot proceed |
| Ambiguous requirements | ASK user |
| Input data coverage below the canonical threshold (`cursor-meta.mdc` Quality Thresholds) | Search internet for supplementary data, ASK user to validate |
| Input data coverage below 70% | Search internet for supplementary data, ASK user to validate |
| Technology choice with multiple valid options | ASK user |
| Component naming | PROCEED, confirm at next BLOCKING gate |
| File structure within templates | PROCEED |
@@ -148,18 +137,14 @@ Read and follow `steps/07_quality-checklist.md`.
│ │
│ 1. Blackbox Tests → test-spec/SKILL.md │
│ [BLOCKING: user confirms test coverage] │
│ 2. Solution Analysis → glossary + vision, architecture,
data model, deployment
│ [BLOCKING 2a.0: user confirms glossary + vision] │
│ [BLOCKING 2a: user confirms architecture] │
│ 2. Solution Analysis → architecture, data model, deployment
[BLOCKING: user confirms architecture]
│ 3. Component Decomp → component specs + interfaces │
│ [BLOCKING: user confirms components] │
│ 4. Review & Risk → risk register, iterations │
│ [BLOCKING: user confirms mitigations] │
│ 4.5 ADR Capture → _docs/02_document/adr/NNN_*.md │
│ [BLOCKING: user confirms ADR set] │
│ 5. Test Specifications → per-component test specs │
│ 6. Work Item Epics → epic per component + bootstrap │
│ 6. Jira Epics → epic per component + bootstrap │
│ ───────────────────────────────────────────────── │
│ Final: Quality Checklist → FINAL_report.md │
├────────────────────────────────────────────────────────────────┤
@@ -26,10 +26,6 @@ DOCUMENT_DIR/
│ └── deployment_procedures.md
├── risk_mitigations.md
├── risk_mitigations_02.md (iterative, ## as sequence)
├── adr/
│ ├── 001_[decision_slug].md
│ ├── 002_[decision_slug].md
│ └── ...
├── components/
│ ├── 01_[name]/
│ │ ├── description.md
@@ -70,10 +66,8 @@ DOCUMENT_DIR/
| Step 3 | Common helpers generated | `common-helpers/[##]_helper_[name].md` |
| Step 3 | Diagrams generated | `diagrams/` |
| Step 4 | Risk assessment complete | `risk_mitigations.md` |
| Step 4.5 | Each ADR captured | `adr/NNN_[decision_slug].md` |
| Step 4.5 | ADR index updated | `adr/README.md` |
| Step 5 | Tests written per component | `components/[##]_[name]/tests.md` |
| Step 6 | Epics created in work item tracker | Tracker via MCP |
| Step 6 | Epics created in Jira | Jira via MCP |
| Final | All steps complete | `FINAL_report.md` |
### Save Principles
@@ -91,15 +85,3 @@ If DOCUMENT_DIR already contains artifacts:
2. Identify the last completed step based on which artifacts exist
3. Resume from the next incomplete step
4. Inform the user which steps are being skipped
#### Step 4.5 (ADR Capture) resumption rule
ADR files have a `Status` field that disambiguates "step in progress" from "step done":
- `Status: Proposed` → Step 4.5 is **in progress**. The user has not yet hit the BLOCKING gate (or hit it and chose B/C/D, which kept files at `Proposed`). Resume Step 4.5 at Phase 4.5f and re-present the BLOCKING Choose to the user. Do NOT skip to Step 5.
- `Status: Accepted` AND `adr/README.md` index exists AND every Accepted ADR is referenced in the index → Step 4.5 is **done**. Skip to Step 5.
- `Status: Accepted` but `adr/README.md` is missing or out of date → Step 4.5 is **partially complete**. Resume at Phase 4.5d (Maintain the ADR Index) before moving on.
- Mixed `Proposed` + `Accepted` files in the same directory → Step 4.5 is **in progress** with prior partial confirmations. Resume at Phase 4.5f and re-present only the still-`Proposed` ADRs.
- Empty `adr/` directory or no `adr/` directory → Step 4.5 has not started yet. Begin at Phase 4.5a.
The `Date` field on every Accepted ADR is the date the user confirmed it; do not regenerate it during resumption.
@@ -4,105 +4,20 @@
**Goal**: Produce `architecture.md`, `system-flows.md`, `data_model.md`, and `deployment/` from the solution draft
**Constraints**: No code, no component-level detail yet; focus on system-level view
### Phase 2a.0: Glossary & Architecture Vision (BLOCKING)
**Role**: Software architect + business analyst
**Goal**: Align the AI's mental model of the project with the user's intent BEFORE drafting `architecture.md`. Capture domain terminology and the user's high-level architecture vision so every downstream artifact (architecture, components, flows, tests, epics) is grounded in confirmed user intent — not in AI inference.
**Inputs**:
- `_docs/00_problem/problem.md`, `acceptance_criteria.md`, `restrictions.md`
- `_docs/00_problem/input_data/*`
- `_docs/01_solution/solution.md` (and any earlier `solution_draft*.md` siblings)
- Any blackbox-test findings produced in Step 1
**Outputs**:
- `_docs/02_document/glossary.md` (NEW)
- A confirmed "Architecture Vision" paragraph + bullet list held in working memory and used as the spine of Phase 2a's `architecture.md`
**Procedure**:
1. **Draft glossary** — extract project-specific terminology from inputs (NOT generic software terms). Include:
- Domain entities, processes, and roles
- Acronyms / abbreviations
- Internal codenames or product names
- Synonym pairs in active use (e.g., "flight" vs. "mission")
- Stakeholder personas referenced in problem.md
Each entry: one-line definition, plus a parenthetical source (`source: problem.md`, `source: solution.md §3`).
Skip terms that have a single well-known industry meaning (REST, JSON, etc.).
2. **Draft architecture vision** — synthesize from inputs:
- **One paragraph**: what the system is, who uses it, the shape of the runtime topology (monolith / services / pipeline / library / hybrid).
- **Components & responsibilities** (one-line each). At this stage these are *intent-level*, not the formal decomposition that Step 3 produces.
- **Major data flows** (one or two sentences each).
- **Architectural principles / non-negotiables** the user has implied (e.g., "DB-driven config", "no per-component state outside Redis", "all UI traffic via REST + SSE only").
- **Open architectural questions** the AI cannot resolve from inputs alone.
3. **Present condensed view** to the user (NOT the full draft files — a synopsis only):
```
══════════════════════════════════════
REVIEW: Glossary + Architecture Vision
══════════════════════════════════════
Glossary (N terms drafted):
- <Term>: <one-line definition>
- ...
Architecture Vision:
<one-paragraph synopsis>
Components / responsibilities:
- <component>: <one-line>
- ...
Principles / non-negotiables:
- <principle>
- ...
Open questions (AI could not resolve):
- <q1>
- <q2>
══════════════════════════════════════
A) Looks correct — write glossary.md, use vision for Phase 2a
B) I want to add / correct entries (provide diffs)
C) Answer the open questions first, then re-present
══════════════════════════════════════
Recommendation: pick C if open questions exist, otherwise A
══════════════════════════════════════
```
4. **Iterate**:
- On B → integrate the user's diffs/additions, re-present the condensed view, loop until A.
- On C → ask the listed open questions one round (M4-style batch), integrate answers, re-present.
- **Do NOT proceed to step 5 until the user picks A.**
5. **Save**:
- Write `_docs/02_document/glossary.md` with terms in alphabetical order. Include a top-line `**Status**: confirmed-by-user` and the date.
- Hold the confirmed vision (paragraph + components + principles) in working memory; Phase 2a will materialize it into `architecture.md` and **must** preserve every confirmed principle and component intent verbatim.
**Self-verification**:
- [ ] Every glossary entry traces to at least one input file (no invented terms)
- [ ] Every component listed in the vision is one the inputs reference
- [ ] All open questions are either answered or explicitly deferred (with the user's acknowledgement)
- [ ] User picked option A on the latest condensed view
**BLOCKING**: Do NOT proceed to Phase 2a until `glossary.md` is saved and the user has confirmed the architecture vision.
### Phase 2a: Architecture & Flows
1. Read all input files thoroughly
2. Incorporate findings, questions, and insights discovered during Step 1 (blackbox tests)
3. **Apply confirmed vision from Phase 2a.0**: the architecture document must include a top-level `## Architecture Vision` section that contains the user-confirmed paragraph, components, and principles verbatim. The rest of `architecture.md` (tech stack, deployment model, NFRs, ADRs) builds on top of that section, never contradicts it
4. Research unknown or questionable topics via internet; ask user about ambiguities
5. Document architecture using `templates/architecture.md` as structure
6. Document system flows using `templates/system-flows.md` as structure
3. Research unknown or questionable topics via internet; ask user about ambiguities
4. Document architecture using `templates/architecture.md` as structure
5. Document system flows using `templates/system-flows.md` as structure
**Self-verification**:
- [ ] `architecture.md` opens with a `## Architecture Vision` section matching Phase 2a.0
- [ ] Architecture covers all capabilities mentioned in solution.md
- [ ] System flows cover all main user/system interactions
- [ ] No contradictions with problem.md, restrictions.md, or the confirmed vision
- [ ] No contradictions with problem.md or restrictions.md
- [ ] Technology choices are justified
- [ ] Blackbox test findings are reflected in architecture decisions
- [ ] Every term used in `architecture.md` that is project-specific appears in `glossary.md`
**Save action**: Write `architecture.md` and `system-flows.md`
@@ -1,187 +0,0 @@
# Step 4.5: Architecture Decision Records (ADRs)
**Role**: Architect / technical writer
**Goal**: Capture every major architecture, tech-stack, data-model, and integration decision made during Steps 24 as a durable, dated, immutable record under `_docs/02_document/adr/`.
**Constraints**: ADRs only — do not re-open architecture; do not make new decisions in this step. Document what has been decided, not what is still open.
ADRs are the single thing in `_docs/` that explains the **why** of each major decision after the conversation history is gone. They are consumed by:
- `decompose` Step 1.5 (`steps/01-5_module-layout.md`) — every Accepted ADR is cross-checked against the module-layout proposal; conflicts trigger an explicit Choose between supersede / exception / re-open.
- `new-task` Step 4.5 (`SKILL.md` § "Step 4.5: Contract & Layout Check") — every new task is classified against Accepted ADRs as Conflict / Drift / Aligned; conflicts STOP the task with a Choose A/B/C; drift adds an `### ADR Impact` section; alignment adds an `### ADR Compliance` section.
- `refactor` Phase 2b.1 (`phases/02-analysis.md`) — every Accepted ADR is diffed against the proposed roadmap; Violations trigger a BLOCKING supersede gate that produces a `supersede_adr_NNN.md` task before any refactor task is created.
- `code-review` Phase 7 (`SKILL.md` § "Phase 7: Architecture Compliance") — every changed-files batch is checked against Accepted ADRs; ADR-Violation findings are Critical, ADR-Drift findings are High.
Discipline that still relies on the human: when a downstream skill detects a Drift case, the resulting task spec MUST land its `## ADR Impact` / `## ADR Compliance` section; the implementer must address it; the next code-review batch then has the context it needs. Drift left undocumented is the silent-failure path — every consumer hook above is designed to make it visible.
## Inputs
- `_docs/02_document/architecture.md` (incl. confirmed `## Architecture Vision`)
- `_docs/02_document/glossary.md`
- `_docs/02_document/data_model.md`
- `_docs/02_document/system-flows.md`
- `_docs/02_document/risk_mitigations.md` (and any `risk_mitigations_NN.md` iterations from Step 4)
- `_docs/02_document/components/[##]_[name]/description.md`
- `_docs/02_document/deployment/` (CI/CD, environments, observability)
- `_docs/00_problem/restrictions.md` and `_docs/00_problem/acceptance_criteria.md` (each ADR must reference relevant constraints / AC by ID)
- Optional: `_docs/01_solution/solution.md` and `_docs/01_solution/tech_stack.md` (research output)
- Optional: `_docs/LESSONS.md` — surface any lesson categories of `architecture` / `dependencies` that bias the recommendation
## What is an ADR (and what is not)
Capture an ADR when **all** of the following hold:
1. The decision picks between two or more genuinely valid approaches with meaningful trade-offs.
2. The decision has **downstream consequences** that other decisions, code, or tasks inherit from.
3. The decision is **non-obvious** to a future reader who only sees the final code — they would ask "why was it built this way?" rather than discovering the answer by reading the source.
Do NOT create an ADR for:
- Naming, formatting, or purely cosmetic choices.
- A choice that is fully implied by a single explicit restriction (`restrictions.md` is itself the record — link to it from the architecture doc instead).
- A choice the team has not actually made yet — open questions live in `risk_mitigations.md` or `_docs/_process_leftovers/`, not in ADRs.
- A technology selection where research already produced an exact-fit selection with one viable option (the research doc is the record — link to the relevant `solution_draft*.md` section).
## Process
### Phase 4.5a: Decision Inventory
Walk the inputs and list candidate decisions. For each candidate, record a one-liner:
```
- [decision] — [trade-off summary] — [downstream consumers] — [evidence file:section]
```
Inspect at minimum:
| Inspection target | Typical decisions surfaced |
|-------------------|----------------------------|
| `architecture.md` § layering | Layering style (clean vs hex vs n-tier), which layer owns transactions, how cross-cutting concerns enter |
| `architecture.md` § Architecture Vision | The North Star principle (e.g., "edge-first, sync-second"); ADR captures the implication for one specific subsystem |
| `data_model.md` | Datastore choice (Postgres vs Mongo), partitioning, soft vs hard deletes, schema evolution strategy |
| `system-flows.md` | Sync vs async boundaries, idempotency strategy, retry policy ownership, error envelope shape |
| `components/*/description.md` § interfaces | Public-API style (REST vs RPC vs event), versioning strategy, auth/authorization placement |
| `deployment/containerization.md` | Single container vs sidecar vs init container, base image lineage |
| `deployment/ci_cd_pipeline.md` | Trunk-based vs feature-branch, gate ordering, deploy strategy (blue-green / canary / all-at-once) |
| `deployment/observability.md` | Logging stack, metric backend, sampling rate decisions, retention |
| `risk_mitigations.md` | Risk-acceptance trade-offs (e.g., "we accept N% data loss in exchange for sub-100ms p99") |
| Tech-stack from `_docs/01_solution/tech_stack.md` | Anything where research recorded ≥2 candidates and a winner |
Drop any candidate that fails the three "what is an ADR" criteria above. Keep the rest.
### Phase 4.5b: Numbering and Slugs
ADRs are numbered globally per project, monotonically, never re-used.
1. List existing files under `_docs/02_document/adr/` matching `^[0-9]{3}_.+\.md$`.
2. The next ADR number is `max(existing) + 1`, zero-padded to 3 digits.
3. The slug is kebab-case, ≤6 words, derived from the decision summary. Example: `001_use-postgres-for-transactional-data.md`, `004_event-driven-cross-component-comms.md`.
### Phase 4.5c: Render One ADR Per Decision
For each kept candidate, render the ADR using `templates/adr.md`. Required sections (do NOT omit any):
| Section | Content |
|---------|---------|
| **Number** | `NNN` |
| **Title** | One-line decision statement (matches slug) |
| **Status** | `Proposed` (only during Step 4.5 iteration) → `Accepted` (after user confirmation at the BLOCKING gate) |
| **Date** | YYYY-MM-DD (the date the user confirmed) |
| **Deciders** | The user (project owner) — the AI is not a decider |
| **Context** | The problem this decision addresses, including links to AC IDs, restriction IDs, risks, and (where relevant) the research draft section |
| **Decision** | The chosen approach in one sentence, then the supporting detail |
| **Alternatives Considered** | Each alternative with a one-line "rejected because…" |
| **Consequences** | Positive (what becomes easier / cheaper / faster) and negative (what becomes harder / locked in / costly to undo). Be honest — every decision has a downside. |
| **Supersedes / Superseded by** | Empty initially; updated when a future ADR overturns this one |
| **Evidence** | File-and-section pointers into `_docs/` showing where the decision is reflected (architecture.md § layering, components/02_*/description.md § interface, etc.) |
After rendering, write each file to `_docs/02_document/adr/NNN_<slug>.md`. Keep `Status: Proposed` until the BLOCKING gate.
### Phase 4.5d: Maintain the ADR Index
Write or update `_docs/02_document/adr/README.md` with this exact shape:
```markdown
# Architecture Decision Records
This index lists every ADR for this project, in number order. ADRs are immutable once `Accepted`
new decisions that overturn a prior ADR are recorded as new ADRs whose `Supersedes` field points
back, and the original ADR's `Superseded by` field is updated.
| # | Title | Status | Date | Supersedes |
|---|-------|--------|------|------------|
| 001 | Use Postgres for transactional data | Accepted | 2026-05-21 | — |
| 002 | Event-driven cross-component comms | Accepted | 2026-05-21 | — |
| ... | ... | ... | ... | ... |
```
Sort by `#` ascending. Include all ADRs ever written, even superseded ones — the audit trail is the point.
### Phase 4.5e: Cross-Link from architecture.md
In `architecture.md`, every section that reflects an ADR decision gets a one-line trailing reference:
```markdown
> See ADR 001 (Use Postgres for transactional data), ADR 003 (Event-driven cross-component comms).
```
Place the reference at the end of the section, after the prose. This lets a future reader of `architecture.md` jump straight to the rationale.
### Phase 4.5f: BLOCKING Gate — User Confirmation
Present the ADR set to the user using the Choose format from `.cursor/skills/autodev/protocols.md` (or plain text if AskQuestion is unavailable):
```
══════════════════════════════════════
DECISION REQUIRED: ADR set captured (N records)
══════════════════════════════════════
001 — [title]
002 — [title]
...
══════════════════════════════════════
A) Accept all ADRs as written
B) Edit specific ADRs (numbers and edits)
C) Add a missed decision (description)
D) Remove an ADR (number and reason)
══════════════════════════════════════
Recommendation: A — review the rendered set and confirm; corrections are quick on Round 2
══════════════════════════════════════
```
Loop:
- **A** → flip every ADR's `Status` from `Proposed` to `Accepted`, set `Date` to today's date, save, exit step.
- **B** → apply edits, re-present the modified ADRs, loop.
- **C** → run Phase 4.5a4.5e for the missed decision only, append to the set, re-present, loop.
- **D** → confirm with the user that the candidate fails the three "what is an ADR" criteria, remove the file, update the index, loop.
Do NOT mark `Accepted` without an explicit user A.
## Self-verification
- [ ] Every kept candidate from Phase 4.5a has a corresponding file under `adr/`
- [ ] Every ADR has all required sections (none empty except `Supersedes` / `Superseded by`)
- [ ] `Decision` sections are one-sentence-then-detail, not "we'll figure it out"
- [ ] `Alternatives Considered` lists at least one rejected alternative per ADR
- [ ] `Consequences` lists both positive AND negative consequences (an ADR with no negatives is suspect)
- [ ] `Evidence` points at real `_docs/` sections that exist on disk
- [ ] `adr/README.md` index lists every file in the directory and matches their `Status` / `Date`
- [ ] `architecture.md` has a trailing `See ADR …` reference at every section that an ADR reflects
- [ ] The user confirmed the set via Choose A; every ADR is `Accepted` with today's date
## Common mistakes
- **Re-opening architecture**: Step 4.5 records, it does not decide. If a candidate decision turns out to be unsettled, that's a Step 2 / Step 4 gap — return there, do not paper over it with a wishy-washy ADR.
- **Decision-of-the-week**: do not write an ADR for every minor pattern choice. The bar is "non-obvious to a future reader". 515 ADRs is typical for a planning round; 40+ is over-capture.
- **Negative consequences left empty**: every real decision has costs. If you cannot name one, the decision was not actually weighed.
- **Vague evidence**: `architecture.md` is not enough — point at the specific section. `architecture.md § Layering``architecture.md`.
- **Numbering reuse**: never recycle a number from a deleted ADR. The audit trail is more important than tidy numbering.
- **Superseding without recording**: when a later cycle overturns an ADR, the new ADR must point at the old one via `Supersedes`, AND the old ADR's `Superseded by` field must be updated. Index reflects both. (This is enforced when `decompose` or `refactor` later updates ADRs.)
## Escalation
| Situation | Action |
|-----------|--------|
| Candidate decision is unsettled (the team has not actually decided) | Return to the originating step (2 / 3 / 4); do NOT write a placeholder ADR |
| Two candidates in Phase 4.5a turn out to be the same decision phrased differently | Merge into one ADR, list both phrasings in `Context` |
| User picks D (remove an ADR) and the AI judges the decision is genuinely worth recording | Surface the disagreement, ASK why the user wants it removed, defer to user |
| Existing `adr/` directory has files but `adr/README.md` is missing or stale | Rebuild the index from the directory before adding new ADRs |
@@ -2,7 +2,7 @@
**Role**: Professional Quality Assurance Engineer
**Goal**: Write test specs for each component achieving the canonical minimum acceptance-criteria coverage (currently 75% — see `.cursor/rules/cursor-meta.mdc` Quality Thresholds; do not restate a different number here)
**Goal**: Write test specs for each component achieving minimum 75% acceptance criteria coverage
**Constraints**: Test specs only — no test code. Each test must trace to an acceptance criterion.
@@ -0,0 +1,48 @@
## Step 6: Work Item Epics
**Role**: Professional product manager
**Goal**: Create epics from components, ordered by dependency
**Constraints**: Epic descriptions must be **comprehensive and self-contained** — a developer reading only the epic should understand the full context without needing to open separate files.
1. **Create "Bootstrap & Initial Structure" epic first** — this epic will parent the `01_initial_structure` task created by the decompose skill. It covers project scaffolding: folder structure, shared models, interfaces, stubs, CI/CD config, DB migrations setup, test structure.
2. Generate epics for each component using the configured work item tracker (Jira MCP or Azure DevOps MCP — see `autopilot/protocols.md`), structured per `templates/epic-spec.md`
3. Order epics by dependency (Bootstrap epic is always first, then components based on their dependency graph)
4. Include effort estimation per epic (T-shirt size or story points range)
5. Ensure each epic has clear acceptance criteria cross-referenced with component specs
6. Generate Mermaid diagrams showing component-to-epic mapping and component relationships
**CRITICAL — Epic description richness requirements**:
Each epic description MUST include ALL of the following sections with substantial content:
- **System context**: where this component fits in the overall architecture (include Mermaid diagram showing this component's position and connections)
- **Problem / Context**: what problem this component solves, why it exists, current pain points
- **Scope**: detailed in-scope and out-of-scope lists
- **Architecture notes**: relevant ADRs, technology choices, patterns used, key design decisions
- **Interface specification**: full method signatures, input/output types, error types (from component description.md)
- **Data flow**: how data enters and exits this component (include Mermaid sequence or flowchart diagram)
- **Dependencies**: epic dependencies (with Jira IDs) and external dependencies (libraries, hardware, services)
- **Acceptance criteria**: measurable criteria with specific thresholds (from component tests.md)
- **Non-functional requirements**: latency, memory, throughput targets with failure thresholds
- **Risks & mitigations**: relevant risks from risk_mitigations.md with concrete mitigation strategies
- **Effort estimation**: T-shirt size and story points range
- **Child issues**: planned task breakdown with complexity points
- **Key constraints**: from restrictions.md that affect this component
- **Testing strategy**: summary of test types and coverage from tests.md
Do NOT create minimal epics with just a summary and short description. The epic is the primary reference document for the implementation team.
**Self-verification**:
- [ ] "Bootstrap & Initial Structure" epic exists and is first in order
- [ ] "Blackbox Tests" epic exists
- [ ] Every component maps to exactly one epic
- [ ] Dependency order is respected (no epic depends on a later one)
- [ ] Acceptance criteria are measurable
- [ ] Effort estimates are realistic
- [ ] Every epic description includes architecture diagram, interface spec, data flow, risks, and NFRs
- [ ] Epic descriptions are self-contained — readable without opening other files
7. **Create "Blackbox Tests" epic** — this epic will parent the blackbox test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `tests/`.
**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If `tracker: local`, save locally only.
@@ -1,61 +0,0 @@
## Step 6: Work Item Epics
**Role**: Professional product manager
**Goal**: Create epics from components, ordered by dependency
**Constraints**: Epic descriptions must be **comprehensive and self-contained** — a developer reading only the epic should understand the full context without needing to open separate files.
0. **Consult LESSONS.md** — if `_docs/LESSONS.md` exists, read it and factor any `estimation` / `architecture` / `dependencies` entries into epic sizing, scope, and dependency ordering. This closes the retrospective feedback loop; lessons from prior cycles directly inform current epic shape. Note in the Step 6 output which lessons were applied (or that none were relevant).
1. **Create "Bootstrap & Initial Structure" epic first** — this epic will parent the `01_initial_structure` task created by the decompose skill. It covers project scaffolding: folder structure, shared models, interfaces, stubs, CI/CD config, DB migrations setup, test structure.
2. **Identify cross-cutting concerns from architecture.md and restrictions.md**. Default candidates to consider (include only if architecture/restrictions reference them):
- Logging / observability (structured logging, correlation IDs, metrics)
- Error handling / envelope / result types
- Configuration loading (env vars, config files, secrets)
- Authentication / authorization middleware
- Feature flags / toggles
- Telemetry / tracing
- i18n / localization
For each identified concern, create ONE epic named `Cross-Cutting: <name>` with `epic_type: cross-cutting`. Each cross-cutting epic will parent exactly ONE shared implementation task (placed under `src/shared/<concern>/` by decompose skill). All component-level tasks that consume the concern declare the shared task as a dependency — they do NOT re-implement the concern locally. This rule is enforced by code-review Phase 6 (Cross-Task Consistency) and Phase 7 (Architecture Compliance).
3. Generate epics for each component using the configured work item tracker (see `autodev/protocols.md` for tracker detection), structured per `templates/epic-spec.md`
4. Order epics by dependency: Bootstrap epic first, then Cross-Cutting epics (they underlie everything), then component epics in dependency order
5. Include effort estimation per epic (T-shirt size or story points range). Use LESSONS.md estimation entries as a calibration hint — if a lesson says "component X was underestimated by 2x last time" and the current plan has a comparable component, widen that epic's estimate.
6. Ensure each epic has clear acceptance criteria cross-referenced with component specs
7. Generate Mermaid diagrams showing component-to-epic mapping and component relationships; include cross-cutting epics as horizontal dependencies of every consuming component epic
**CRITICAL — Epic description richness requirements**:
Each epic description MUST include ALL of the following sections with substantial content:
- **System context**: where this component fits in the overall architecture (include Mermaid diagram showing this component's position and connections)
- **Problem / Context**: what problem this component solves, why it exists, current pain points
- **Scope**: detailed in-scope and out-of-scope lists
- **Architecture notes**: relevant ADRs, technology choices, patterns used, key design decisions
- **Interface specification**: full method signatures, input/output types, error types (from component description.md)
- **Data flow**: how data enters and exits this component (include Mermaid sequence or flowchart diagram)
- **Dependencies**: epic dependencies (with tracker IDs) and external dependencies (libraries, hardware, services)
- **Acceptance criteria**: measurable criteria with specific thresholds (from component tests.md)
- **Non-functional requirements**: latency, memory, throughput targets with failure thresholds
- **Risks & mitigations**: relevant risks from risk_mitigations.md with concrete mitigation strategies
- **Effort estimation**: T-shirt size and story points range
- **Child issues**: planned task breakdown with complexity points
- **Key constraints**: from restrictions.md that affect this component
- **Testing strategy**: summary of test types and coverage from tests.md
Do NOT create minimal epics with just a summary and short description. The epic is the primary reference document for the implementation team.
**Self-verification**:
- [ ] "Bootstrap & Initial Structure" epic exists and is first in order
- [ ] Every identified cross-cutting concern has exactly one `Cross-Cutting: <name>` epic
- [ ] No two epics own the same cross-cutting concern
- [ ] "Blackbox Tests" epic exists
- [ ] Every component maps to exactly one component epic
- [ ] Dependency order is respected (no epic depends on a later one)
- [ ] Cross-Cutting epics precede every consuming component epic
- [ ] Acceptance criteria are measurable
- [ ] Effort estimates are realistic and reflect LESSONS.md calibration hints (if any applied)
- [ ] Every epic description includes architecture diagram, interface spec, data flow, risks, and NFRs
- [ ] Epic descriptions are self-contained — readable without opening other files
8. **Create "Blackbox Tests" epic** — this epic will parent the blackbox test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `tests/`.
**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If tracker availability fails, follow `.cursor/rules/tracker.mdc`; only if the user explicitly chooses `tracker: local`, save locally only with pending tracker markers.
-67
View File
@@ -1,67 +0,0 @@
# ADR-{NNN}: {decision-title}
- **Status**: {Proposed | Accepted | Deprecated | Superseded}
- **Date**: {YYYY-MM-DD}
- **Deciders**: {user / project owner}
- **Supersedes**: {ADR-NNN | —}
- **Superseded by**: {ADR-NNN | —}
## Context
What problem does this decision address? Cite the relevant constraint(s), acceptance criterion / criteria, and risk(s) by ID.
- Acceptance criteria addressed: AC-{ID-1}, AC-{ID-2}
- Restrictions addressed: R-{ID-1}, R-{ID-2}
- Risks addressed: RISK-{ID-1}
- Research source (if any): `_docs/01_solution/solution_draftN.md` § {section}
A short paragraph (36 sentences) explaining why a choice is required now and what makes it non-trivial. Do not pre-announce the decision here — that goes in `Decision`. Focus on the forces at play (load, scale, team familiarity, hardware constraints, regulatory drivers, third-party limits).
## Decision
One declarative sentence: **"We will …"** Then 13 paragraphs of supporting detail explaining how the decision will be implemented at the boundaries between components.
Be specific. "We will use Postgres" is too thin; "We will use Postgres 16 with logical replication for read scaling, restricting JSONB columns to top-level metadata only, with all transactional data in normalized tables" is the right resolution.
## Alternatives Considered
| Alternative | Rejected because |
|-------------|------------------|
| {Alt 1 — short label} | {one line: the cost / mismatch / risk that ruled it out, ideally referencing a measurable criterion} |
| {Alt 2 — short label} | {one line} |
| {Alt 3 — short label} | {one line} |
At least one rejected alternative is mandatory. If only one option was ever considered, this is not an ADR — link to the source restriction or research selection from the parent doc instead.
## Consequences
### Positive
- {What becomes easier / cheaper / faster, with concrete examples where possible}
- {…}
### Negative
- {What becomes harder / locked in / costly to undo}
- {…}
Every real decision has both. If the negatives section is hard to fill, the alternatives were probably not weighed seriously — return to the prior step.
### Neutral / Open
- {What is unchanged but worth flagging for future readers (e.g., "this does not change the auth boundary; auth remains in component 02_user_management as decided in ADR-003")}
## Evidence
Where this decision is reflected on disk. Use `file:section` links so future readers can jump.
- `_docs/02_document/architecture.md` § {section}
- `_docs/02_document/data_model.md` § {section}
- `_docs/02_document/components/{##_name}/description.md` § {section}
- `_docs/02_document/system-flows.md` § {flow name}
- `_docs/02_document/deployment/{file}.md` § {section}
- {add more as needed}
## Notes
Optional. Use for caveats that did not fit above, links to external research, or follow-ups that the team agreed to revisit on a known trigger ("re-evaluate after 6 months in production" / "re-evaluate when load exceeds 10× baseline").
+3 -12
View File
@@ -1,6 +1,6 @@
# Epic Template
Use this template for each epic. Create epics via the configured work item tracker (see `autodev/protocols.md` for tracker detection).
Use this template for each epic. Create epics via the configured work item tracker (Jira MCP or Azure DevOps MCP).
---
@@ -9,9 +9,6 @@ Use this template for each epic. Create epics via the configured work item track
**Example**: Data Ingestion — Near-real-time pipeline
**epic_type**: [component | bootstrap | cross-cutting | tests]
**concern** (cross-cutting only): [logging | error-handling | config | authn | authz | feature-flags | telemetry | i18n | other-named-concern]
### Epic Summary
[1-2 sentences: what we are building + why it matters]
@@ -126,11 +123,5 @@ Link to architecture.md and relevant component spec.]
- Be concise. Fewer words with the same meaning = better epic.
- Capabilities in scope are "what", not "how" — avoid describing implementation details.
- Dependency order matters: epics that must be done first should be listed earlier in the backlog.
- Every `component` epic maps to exactly one component. If a component is too large for one epic, split the component first.
- A `cross-cutting` epic maps to exactly one shared concern and parents exactly one shared implementation task. Component epics that consume the concern declare the cross-cutting epic as a dependency.
- Valid `epic_type` values:
- `bootstrap` — the initial-structure epic (always exactly one per project)
- `component` — a normal per-component epic
- `cross-cutting` — a shared concern that spans ≥2 components
- `tests` — the blackbox-tests epic (always exactly one)
- Complexity points for child issues follow the project standard: 1, 2, 3, 5. Do not create issues above 5 points — split them.
- Every epic maps to exactly one component. If a component is too large for one epic, split the component first.
- Complexity points for child issues follow the project standard: 1, 2, 3, 5, 8. Do not create issues above 5 points — split them.
@@ -1,6 +1,6 @@
# Final Planning Report Template
Use this template after completing all steps (1, 2, 3, 4, 4.5, 5, 6) and the quality checklist. Save as `_docs/02_document/FINAL_report.md`.
Use this template after completing all 6 steps and the quality checklist. Save as `_docs/02_document/FINAL_report.md`.
---
@@ -27,8 +27,8 @@ Use this template after completing all steps (1, 2, 3, 4, 4.5, 5, 6) and the qua
| # | Component | Purpose | Dependencies | Epic |
|---|-----------|---------|-------------|------|
| 01 | [name] | [one-line purpose] | — | [Tracker ID] |
| 02 | [name] | [one-line purpose] | 01 | [Tracker ID] |
| 01 | [name] | [one-line purpose] | — | [Jira ID] |
| 02 | [name] | [one-line purpose] | 01 | [Jira ID] |
| ... | | | | |
**Implementation order** (based on dependency graph):
@@ -71,8 +71,8 @@ Use this template after completing all steps (1, 2, 3, 4, 4.5, 5, 6) and the qua
| Order | Epic | Component | Effort | Dependencies |
|-------|------|-----------|--------|-------------|
| 1 | [Tracker ID]: [name] | [component] | [S/M/L/XL] | — |
| 2 | [Tracker ID]: [name] | [component] | [S/M/L/XL] | Epic 1 |
| 1 | [Jira ID]: [name] | [component] | [S/M/L/XL] | — |
| 2 | [Jira ID]: [name] | [component] | [S/M/L/XL] | Epic 1 |
| ... | | | | |
**Total estimated effort**: [sum or range]
-2
View File
@@ -181,8 +181,6 @@ Categorized measurable criteria with markdown headers and bullet points:
Every criterion must have a measurable value. Vague criteria like "should be fast" are not acceptable — push for "less than 400ms end-to-end".
**AC must be design-independent**: describe testable outcomes only — no libraries, algorithms, params, or design choices. Implementation follows AC, never reverse. (IEEE 830 / Atlassian / GitScrum)
### input_data/
At least one file. Options:
+423 -96
View File
@@ -1,144 +1,471 @@
---
name: refactor
description: |
Structured 8-phase refactoring workflow with two input modes:
Automatic (skill discovers issues) and Guided (input file with change list).
Each run gets its own subfolder in _docs/04_refactoring/.
Delegates code execution to the implement skill via task files in _docs/02_tasks/.
Additional workflow modes: Targeted (skip discovery), Quick Assessment (phases 0-2 only).
Structured refactoring workflow (6-phase method) with three execution modes:
- Full Refactoring: all 6 phases — baseline, discovery, analysis, safety net, execution, hardening
- Targeted Refactoring: skip discovery if docs exist, focus on a specific component/area
- Quick Assessment: phases 0-2 only, outputs a refactoring plan without execution
Supports project mode (_docs/ structure) and standalone mode (@file.md).
Trigger phrases:
- "refactor", "refactoring", "improve code"
- "analyze coupling", "decoupling", "technical debt"
- "refactoring assessment", "code quality improvement"
category: evolve
tags: [refactoring, coupling, technical-debt, performance, testability]
trigger_phrases: ["refactor", "refactoring", "improve code", "analyze coupling", "decoupling", "technical debt", "code quality"]
tags: [refactoring, coupling, technical-debt, performance, hardening]
disable-model-invocation: true
---
# Structured Refactoring
# Structured Refactoring (6-Phase Method)
Phase details live in `phases/` — read the relevant file before executing each phase.
Transform existing codebases through a systematic refactoring workflow: capture baseline, document current state, research improvements, build safety net, execute changes, and harden.
## Core Principles
- **Preserve behavior first**: never refactor without a passing test suite (exception: testability runs, where the goal is making code testable)
- **Preserve behavior first**: never refactor without a passing test suite
- **Measure before and after**: every change must be justified by metrics
- **Small incremental changes**: commit frequently, never break tests
- **Save immediately**: write artifacts to disk after each phase
- **Delegate execution**: all code changes go through the implement skill via task files
- **Save immediately**: write artifacts to disk after each phase; never accumulate unsaved work
- **Ask, don't assume**: when scope or priorities are unclear, STOP and ask the user
- **Exact-fit recommendations**: do not recommend a replacement pattern, library, service, architecture, algorithm, or "modern approach" merely because it improves structure or solves a similar class of problem. It must fit confirmed product constraints, acceptance criteria, operating context, integration boundaries, and current code realities. Otherwise reject it, mark it experimental, or ask the user before adding it to the roadmap.
- **Per-mode API capability verification on replacements**: when a refactor proposes replacing or adding a library/SDK/framework/service that exposes multiple modes or configurations, pin the exact mode the refactored code will use (inputs, outputs, runtime) and verify *that mode* via mandatory `context7` lookup plus a saved Minimum Viable Example before promoting the recommendation to `Selected`. Capability claims at the category level ("supports A, B, C modes") must be cross-checked against the literal mode enumeration — `A, B → A+B` style conflations are the recurring silent-failure path.
## Context Resolution
Announce detected paths and input mode to user before proceeding.
Determine the operating mode based on invocation before any other logic runs.
**Fixed paths:**
**Project mode** (no explicit input file provided):
- PROBLEM_DIR: `_docs/00_problem/`
- SOLUTION_DIR: `_docs/01_solution/`
- COMPONENTS_DIR: `_docs/02_document/components/`
- DOCUMENT_DIR: `_docs/02_document/`
- REFACTOR_DIR: `_docs/04_refactoring/`
- All existing guardrails apply.
| Path | Location |
|------|----------|
| PROBLEM_DIR | `_docs/00_problem/` |
| SOLUTION_DIR | `_docs/01_solution/` |
| COMPONENTS_DIR | `_docs/02_document/components/` |
| DOCUMENT_DIR | `_docs/02_document/` |
| TASKS_DIR | `_docs/02_tasks/` |
| TASKS_TODO | `_docs/02_tasks/todo/` |
| REFACTOR_DIR | `_docs/04_refactoring/` |
| RUN_DIR | `REFACTOR_DIR/NN-[run-name]/` |
**Standalone mode** (explicit input file provided, e.g. `/refactor @some_component.md`):
- INPUT_FILE: the provided file (treated as component/area description)
- REFACTOR_DIR: `_standalone/refactoring/`
- Guardrails relaxed: only INPUT_FILE must exist and be non-empty
- `acceptance_criteria.md` is optional — warn if absent
**Prereqs**: `problem.md` required, `acceptance_criteria.md` warn if absent.
Announce the detected mode and resolved paths to the user before proceeding.
**RUN_DIR resolution**: on start, scan REFACTOR_DIR for existing `NN-*` folders. Auto-increment the numeric prefix for the new run. The run name is derived from the invocation context (e.g., `01-testability-refactoring`, `02-coupling-refactoring`). If invoked with a guided input file, derive the name from the input file name or ask the user.
## Mode Detection
Create REFACTOR_DIR and RUN_DIR if missing. If a RUN_DIR with the same name already exists, ask user: **resume or start fresh?**
After context resolution, determine the execution mode:
## Input Modes
1. **User explicitly says** "quick assessment" or "just assess" → **Quick Assessment**
2. **User explicitly says** "refactor [component/file/area]" with a specific target → **Targeted Refactoring**
3. **Default** → **Full Refactoring**
| Mode | Trigger | Discovery source |
|------|---------|-----------------|
| Automatic | Default, no input file | Skill discovers issues from code analysis |
| Guided | Input file provided (e.g., `/refactor @list-of-changes.md`) | Reads input file + scans code to form validated change list |
| Mode | Phases Executed | When to Use |
|------|----------------|-------------|
| **Full Refactoring** | 0 → 1 → 2 → 3 → 4 → 5 | Complete refactoring of a system or major area |
| **Targeted Refactoring** | 0 → (skip 1 if docs exist) → 2 → 3 → 4 → 5 | Refactor a specific component; docs already exist |
| **Quick Assessment** | 0 → 1 → 2 | Produce a refactoring roadmap without executing changes |
Both modes produce `RUN_DIR/list-of-changes.md` (template: `templates/list-of-changes.md`). Both modes then convert that file into task files in TASKS_DIR during Phase 2.
Inform the user which mode was detected and confirm before proceeding.
**Guided mode cleanup**: after `RUN_DIR/list-of-changes.md` is created from the input file, delete the original input file only if it lives outside `RUN_DIR`. If the provided file is already the canonical `RUN_DIR/list-of-changes.md`, keep it as the audit record.
## Prerequisite Checks (BLOCKING)
**Project mode:**
1. PROBLEM_DIR exists with `problem.md` (or `problem_description.md`) — **STOP if missing**, ask user to create it
2. If `acceptance_criteria.md` is missing: **warn** and ask whether to proceed
3. Create REFACTOR_DIR if it does not exist
4. If REFACTOR_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?**
**Standalone mode:**
1. INPUT_FILE exists and is non-empty — **STOP if missing**
2. Warn if no `acceptance_criteria.md` provided
3. Create REFACTOR_DIR if it does not exist
## Artifact Management
### Directory Structure
```
REFACTOR_DIR/
├── baseline_metrics.md (Phase 0)
├── discovery/
│ ├── components/
│ │ └── [##]_[name].md (Phase 1)
│ ├── solution.md (Phase 1)
│ └── system_flows.md (Phase 1)
├── analysis/
│ ├── research_findings.md (Phase 2)
│ └── refactoring_roadmap.md (Phase 2)
├── test_specs/
│ └── [##]_[test_name].md (Phase 3)
├── coupling_analysis.md (Phase 4)
├── execution_log.md (Phase 4)
├── hardening/
│ ├── technical_debt.md (Phase 5)
│ ├── performance.md (Phase 5)
│ └── security.md (Phase 5)
└── FINAL_report.md (after all phases)
```
### Save Timing
| Phase | Save immediately after | Filename |
|-------|------------------------|----------|
| Phase 0 | Baseline captured | `baseline_metrics.md` |
| Phase 1 | Each component documented | `discovery/components/[##]_[name].md` |
| Phase 1 | Solution synthesized | `discovery/solution.md`, `discovery/system_flows.md` |
| Phase 2 | Research complete | `analysis/research_findings.md` |
| Phase 2 | Roadmap produced | `analysis/refactoring_roadmap.md` |
| Phase 3 | Test specs written | `test_specs/[##]_[test_name].md` |
| Phase 4 | Coupling analyzed | `coupling_analysis.md` |
| Phase 4 | Execution complete | `execution_log.md` |
| Phase 5 | Each hardening track | `hardening/<track>.md` |
| Final | All phases done | `FINAL_report.md` |
### Resumability
If REFACTOR_DIR already contains artifacts:
1. List existing files and match to the save timing table
2. Identify the last completed phase based on which artifacts exist
3. Resume from the next incomplete phase
4. Inform the user which phases are being skipped
## Progress Tracking
At the start of execution, create a TodoWrite with all applicable phases. Update status as each phase completes.
## Workflow
| Phase | File | Summary | Gate |
|-------|------|---------|------|
| 0 | `phases/00-baseline.md` | Collect goals, create RUN_DIR, capture baseline metrics | BLOCKING: user confirms |
| 1 | `phases/01-discovery.md` | Document components (scoped for guided mode), produce list-of-changes.md | BLOCKING: user confirms |
| 2 | `phases/02-analysis.md` | Research improvements, produce roadmap, create epic, decompose into tasks in TASKS_DIR | BLOCKING: user confirms |
| | | *Quick Assessment stops here* | |
| 3 | `phases/03-safety-net.md` | Check existing tests or implement pre-refactoring tests (skip for testability runs) | GATE: all tests pass |
| 4 | `phases/04-execution.md` | Delegate task execution to implement skill | GATE: implement completes |
| 4.5 | (inline, testability runs only) | Produce `testability_changes_summary.md` listing every applied change in plain language, surface to user | GATE: user acknowledges summary |
| 5 | `phases/05-test-sync.md` | Remove obsolete, update broken, add new tests | GATE: all tests pass |
| 6 | `phases/06-verification.md` | Run full suite, compare metrics vs baseline | GATE: all pass, no regressions |
| 7 | `phases/07-documentation.md` | Update `_docs/` to reflect refactored state | Skip if `_docs/02_document/` absent |
### Phase 0: Context & Baseline
**Workflow mode detection:**
- "quick assessment" / "just assess" → phases 02
- "refactor [specific target]" → skip phase 1 if docs exist
- Default → all phases
**Role**: Software engineer preparing for refactoring
**Goal**: Collect refactoring goals and capture baseline metrics
**Constraints**: Measurement only — no code changes
**Testability-run specifics** (guided mode invoked by autodev existing-code Step 4 or greenfield Step 8):
- Run name is `01-testability-refactoring`.
- Phase 3 (Safety Net) is skipped by design — no tests exist yet. Compensating control: the `list-of-changes.md` gate in Phase 1 must be reviewed and approved by the user before Phase 4 runs.
- Scope is MINIMAL and surgical; reject change entries that drift into full refactor territory (see the invoking flow's testability step for allowed/disallowed lists). Flagged entries go to `RUN_DIR/deferred_to_refactor.md` for the next optional full-refactor step or backlog consideration.
- After Phase 4 (Execution) completes, write `RUN_DIR/testability_changes_summary.md` as Phase 4.5. Format: one bullet per applied change.
```markdown
# Testability Changes Summary ({{run_name}})
#### 0a. Collect Goals
Applied {{N}} change(s):
If PROBLEM_DIR files do not yet exist, help the user create them:
- **{{change_id}}**changed {{symbol}} in `{{file}}`: {{plain-language reason}}. Risk: {{low|medium|high}}.
```
Group bullets by category (config extraction, DI insertion, singleton wrapping, interface extraction, function split). Present the summary to the user via the Choose format before proceeding to Phase 5.
1. `problem.md`what the system currently does, what changes are needed, pain points
2. `acceptance_criteria.md` — success criteria for the refactoring
3. `security_approach.md` — security requirements (if applicable)
At the start of execution, create a TodoWrite with all applicable phases.
Store in PROBLEM_DIR.
## Artifact Structure
#### 0b. Capture Baseline
All artifacts are written to RUN_DIR:
1. Read problem description and acceptance criteria
2. Measure current system metrics using project-appropriate tools:
```
baseline_metrics.md Phase 0
discovery/components/[##]_[name].md Phase 1
discovery/solution.md Phase 1
discovery/system_flows.md Phase 1
list-of-changes.md Phase 1
analysis/research_findings.md Phase 2
analysis/refactoring_roadmap.md Phase 2
test_specs/[##]_[test_name].md Phase 3
execution_log.md Phase 4
testability_changes_summary.md Phase 4.5 (testability runs only)
test_sync/{obsolete_tests,updated_tests,new_tests}.md Phase 5
verification_report.md Phase 6
doc_update_log.md Phase 7
FINAL_report.md after all phases
```
| Metric Category | What to Capture |
|----------------|-----------------|
| **Coverage** | Overall, unit, blackbox, critical paths |
| **Complexity** | Cyclomatic complexity (avg + top 5 functions), LOC, tech debt ratio |
| **Code Smells** | Total, critical, major |
| **Performance** | Response times (P50/P95/P99), CPU/memory, throughput |
| **Dependencies** | Total count, outdated, security vulnerabilities |
| **Build** | Build time, test execution time, deployment time |
Task files produced during Phase 2 go to TASKS_TODO (not RUN_DIR):
```
TASKS_TODO/[TRACKER-ID]_refactor_[short_name].md
TASKS_DIR/_dependencies_table.md (appended)
```
3. Create functionality inventory: all features/endpoints with status and coverage
**Resumability**: match existing artifacts to phases above, resume from next incomplete phase.
**Self-verification**:
- [ ] All metric categories measured (or noted as N/A with reason)
- [ ] Functionality inventory is complete
- [ ] Measurements are reproducible
**Save action**: Write `REFACTOR_DIR/baseline_metrics.md`
**BLOCKING**: Present baseline summary to user. Do NOT proceed until user confirms.
---
### Phase 1: Discovery
**Role**: Principal software architect
**Goal**: Generate documentation from existing code and form solution description
**Constraints**: Document what exists, not what should be. No code changes.
**Skip condition** (Targeted mode): If `COMPONENTS_DIR` and `SOLUTION_DIR` already contain documentation for the target area, skip to Phase 2. Ask user to confirm skip.
#### 1a. Document Components
For each component in the codebase:
1. Analyze project structure, directories, files
2. Go file by file, analyze each method
3. Analyze connections between components
Write per component to `REFACTOR_DIR/discovery/components/[##]_[name].md`:
- Purpose and architectural patterns
- Mermaid diagrams for logic flows
- API reference table (name, description, input, output)
- Implementation details: algorithmic complexity, state management, dependencies
- Caveats, edge cases, known limitations
#### 1b. Synthesize Solution & Flows
1. Review all generated component documentation
2. Synthesize into a cohesive solution description
3. Create flow diagrams showing component interactions
Write:
- `REFACTOR_DIR/discovery/solution.md` — product description, component overview, interaction diagram
- `REFACTOR_DIR/discovery/system_flows.md` — Mermaid flowcharts per major use case
Also copy to project standard locations if in project mode:
- `SOLUTION_DIR/solution.md`
- `DOCUMENT_DIR/system_flows.md`
**Self-verification**:
- [ ] Every component in the codebase is documented
- [ ] Solution description covers all components
- [ ] Flow diagrams cover all major use cases
- [ ] Mermaid diagrams are syntactically correct
**Save action**: Write discovery artifacts
**BLOCKING**: Present discovery summary to user. Do NOT proceed until user confirms documentation accuracy.
---
### Phase 2: Analysis
**Role**: Researcher and software architect
**Goal**: Research improvements and produce a refactoring roadmap
**Constraints**: Analysis only — no code changes
#### 2a. Deep Research
1. Analyze current implementation patterns
2. Research modern approaches for similar systems
3. Identify what could be done differently
4. Suggest improvements based on state-of-the-art practices
Write `REFACTOR_DIR/analysis/research_findings.md`:
- Current state analysis: patterns used, strengths, weaknesses
- Alternative approaches per component: current vs alternative, pros/cons, migration effort
- Prioritized recommendations: quick wins + strategic improvements
#### 2b. Solution Assessment
1. Assess current implementation against acceptance criteria
2. Identify weak points in codebase, map to specific code areas
3. Perform gap analysis: acceptance criteria vs current state
4. Prioritize changes by impact and effort
Write `REFACTOR_DIR/analysis/refactoring_roadmap.md`:
- Weak points assessment: location, description, impact, proposed solution
- Gap analysis: what's missing, what needs improvement
- Phased roadmap: Phase 1 (critical fixes), Phase 2 (major improvements), Phase 3 (enhancements)
**Self-verification**:
- [ ] All acceptance criteria are addressed in gap analysis
- [ ] Recommendations are grounded in actual code, not abstract
- [ ] Roadmap phases are prioritized by impact
- [ ] Quick wins are identified separately
**Save action**: Write analysis artifacts
**BLOCKING**: Present refactoring roadmap to user. Do NOT proceed until user confirms.
**Quick Assessment mode stops here.** Present final summary and write `FINAL_report.md` with phases 0-2 content.
---
### Phase 3: Safety Net
**Role**: QA engineer and developer
**Goal**: Design and implement tests that capture current behavior before refactoring
**Constraints**: Tests must all pass on the current codebase before proceeding
#### 3a. Design Test Specs
Coverage requirements (must meet before refactoring — see `.cursor/rules/cursor-meta.mdc` Quality Thresholds):
- Minimum overall coverage: 75%
- Critical path coverage: 90%
- All public APIs must have blackbox tests
- All error handling paths must be tested
For each critical area, write test specs to `REFACTOR_DIR/test_specs/[##]_[test_name].md`:
- Blackbox tests: summary, current behavior, input data, expected result, max expected time
- Acceptance tests: summary, preconditions, steps with expected results
- Coverage analysis: current %, target %, uncovered critical paths
#### 3b. Implement Tests
1. Set up test environment and infrastructure if not exists
2. Implement each test from specs
3. Run tests, verify all pass on current codebase
4. Document any discovered issues
**Self-verification**:
- [ ] Coverage requirements met (75% overall, 90% critical paths)
- [ ] All tests pass on current codebase
- [ ] All public APIs have blackbox tests
- [ ] Test data fixtures are configured
**Save action**: Write test specs; implemented tests go into the project's test folder
**GATE (BLOCKING)**: ALL tests must pass before proceeding to Phase 4. If tests fail, fix the tests (not the code) or ask user for guidance. Do NOT proceed to Phase 4 with failing tests.
---
### Phase 4: Execution
**Role**: Software architect and developer
**Goal**: Analyze coupling and execute decoupling changes
**Constraints**: Small incremental changes; tests must stay green after every change
#### 4a. Analyze Coupling
1. Analyze coupling between components/modules
2. Map dependencies (direct and transitive)
3. Identify circular dependencies
4. Form decoupling strategy
Write `REFACTOR_DIR/coupling_analysis.md`:
- Dependency graph (Mermaid)
- Coupling metrics per component
- Problem areas: components involved, coupling type, severity, impact
- Decoupling strategy: priority order, proposed interfaces/abstractions, effort estimates
**BLOCKING**: Present coupling analysis to user. Do NOT proceed until user confirms strategy.
#### 4b. Execute Decoupling
For each change in the decoupling strategy:
1. Implement the change
2. Run blackbox tests
3. Fix any failures
4. Commit with descriptive message
Address code smells encountered: long methods, large classes, duplicate code, dead code, magic numbers.
Write `REFACTOR_DIR/execution_log.md`:
- Change description, files affected, test status per change
- Before/after metrics comparison against baseline
**Self-verification**:
- [ ] All tests still pass after execution
- [ ] No circular dependencies remain (or reduced per plan)
- [ ] Code smells addressed
- [ ] Metrics improved compared to baseline
**Save action**: Write execution artifacts
**BLOCKING**: Present execution summary to user. Do NOT proceed until user confirms.
---
### Phase 5: Hardening (Optional, Parallel Tracks)
**Role**: Varies per track
**Goal**: Address technical debt, performance, and security
**Constraints**: Each track is optional; user picks which to run
Present the three tracks and let user choose which to execute:
#### Track A: Technical Debt
**Role**: Technical debt analyst
1. Identify and categorize debt items: design, code, test, documentation
2. Assess each: location, description, impact, effort, interest (cost of not fixing)
3. Prioritize: quick wins → strategic debt → tolerable debt
4. Create actionable plan with prevention measures
Write `REFACTOR_DIR/hardening/technical_debt.md`
#### Track B: Performance Optimization
**Role**: Performance engineer
1. Profile current performance, identify bottlenecks
2. For each bottleneck: location, symptom, root cause, impact
3. Propose optimizations with expected improvement and risk
4. Implement one at a time, benchmark after each change
5. Verify tests still pass
Write `REFACTOR_DIR/hardening/performance.md` with before/after benchmarks
#### Track C: Security Review
**Role**: Security engineer
1. Review code against OWASP Top 10
2. Verify security requirements from `security_approach.md` are met
3. Check: authentication, authorization, input validation, output encoding, encryption, logging
Write `REFACTOR_DIR/hardening/security.md`:
- Vulnerability assessment: location, type, severity, exploit scenario, fix
- Security controls review
- Compliance check against `security_approach.md`
- Recommendations: critical fixes, improvements, hardening
**Self-verification** (per track):
- [ ] All findings are grounded in actual code
- [ ] Recommendations are actionable with effort estimates
- [ ] All tests still pass after any changes
**Save action**: Write hardening artifacts
---
## Final Report
After all phases complete, write `RUN_DIR/FINAL_report.md`:
mode used (automatic/guided), input mode, phases executed, baseline vs final metrics, changes summary, remaining items, lessons learned.
After all executed phases complete, write `REFACTOR_DIR/FINAL_report.md`:
- Refactoring mode used and phases executed
- Baseline metrics vs final metrics comparison
- Changes made summary
- Remaining items (deferred to future)
- Lessons learned
## Escalation Rules
| Situation | Action |
|-----------|--------|
| Unclear scope or ambiguous criteria | **ASK user** |
| Unclear refactoring scope | **ASK user** |
| Ambiguous acceptance criteria | **ASK user** |
| Tests failing before refactoring | **ASK user** — fix tests or fix code? |
| Risk of breaking external contracts | **ASK user** |
| Performance vs readability trade-off | **ASK user** |
| No test suite or CI exists | **WARN user**, suggest safety net first |
| Security vulnerability found | **WARN user** immediately |
| Implement skill reports failures | **ASK user** — review batch reports |
| Coupling change risks breaking external contracts | **ASK user** |
| Performance optimization vs readability trade-off | **ASK user** |
| Missing baseline metrics (no test suite, no CI) | **WARN user**, suggest building safety net first |
| Security vulnerability found during refactoring | **WARN user** immediately, don't defer |
## Trigger Conditions
When the user wants to:
- Improve existing code structure or quality
- Reduce technical debt or coupling
- Prepare codebase for new features
- Assess code health before major changes
**Keywords**: "refactor", "refactoring", "improve code", "reduce coupling", "technical debt", "code quality", "decoupling"
## Methodology Quick Reference
```
┌────────────────────────────────────────────────────────────────┐
│ Structured Refactoring (6-Phase Method) │
├────────────────────────────────────────────────────────────────┤
│ CONTEXT: Resolve mode (project vs standalone) + set paths │
│ MODE: Full / Targeted / Quick Assessment │
│ │
│ 0. Context & Baseline → baseline_metrics.md │
│ [BLOCKING: user confirms baseline] │
│ 1. Discovery → discovery/ (components, solution) │
│ [BLOCKING: user confirms documentation] │
│ 2. Analysis → analysis/ (research, roadmap) │
│ [BLOCKING: user confirms roadmap] │
│ ── Quick Assessment stops here ── │
│ 3. Safety Net → test_specs/ + implemented tests │
│ [GATE: all tests must pass] │
│ 4. Execution → coupling_analysis, execution_log │
│ [BLOCKING: user confirms changes] │
│ 5. Hardening → hardening/ (debt, perf, security) │
│ [optional, user picks tracks] │
│ ───────────────────────────────────────────────── │
│ FINAL_report.md │
├────────────────────────────────────────────────────────────────┤
│ Principles: Preserve behavior · Measure before/after │
│ Small changes · Save immediately · Ask don't assume│
└────────────────────────────────────────────────────────────────┘
```
@@ -1,52 +0,0 @@
# Phase 0: Context & Baseline
**Role**: Software engineer preparing for refactoring
**Goal**: Collect refactoring goals, create run directory, capture baseline metrics
**Constraints**: Measurement only — no code changes
## 0a. Collect Goals
If PROBLEM_DIR files do not yet exist, help the user create them:
1. `problem.md` — what the system currently does, what changes are needed, pain points
2. `acceptance_criteria.md` — success criteria for the refactoring
3. `security_approach.md` — security requirements (if applicable)
Store in PROBLEM_DIR.
## 0b. Create RUN_DIR
1. Scan REFACTOR_DIR for existing `NN-*` folders
2. Auto-increment the numeric prefix (e.g., if `01-testability-refactoring` exists, next is `02-...`)
3. Determine the run name:
- If guided mode with input file: derive from input file name or context (e.g., `01-testability-refactoring`)
- If automatic mode: ask user for a short run name, or derive from goals (e.g., `01-coupling-refactoring`)
4. Create `REFACTOR_DIR/NN-[run-name]/` — this is RUN_DIR for the rest of the workflow
Announce RUN_DIR path to user.
## 0c. Capture Baseline
1. Read problem description and acceptance criteria
2. Measure current system metrics using project-appropriate tools:
| Metric Category | What to Capture |
|----------------|-----------------|
| **Coverage** | Overall, unit, blackbox, critical paths |
| **Complexity** | Cyclomatic complexity (avg + top 5 functions), LOC, tech debt ratio |
| **Code Smells** | Total, critical, major |
| **Performance** | Response times (P50/P95/P99), CPU/memory, throughput |
| **Dependencies** | Total count, outdated, security vulnerabilities |
| **Build** | Build time, test execution time, deployment time |
3. Create functionality inventory: all features/endpoints with status and coverage
**Self-verification**:
- [ ] RUN_DIR created with correct auto-incremented prefix
- [ ] All metric categories measured (or noted as N/A with reason)
- [ ] Functionality inventory is complete
- [ ] Measurements are reproducible
**Save action**: Write `RUN_DIR/baseline_metrics.md`
**BLOCKING**: Present baseline summary to user. Do NOT proceed until user confirms.
@@ -1,159 +0,0 @@
# Phase 1: Discovery
**Role**: Principal software architect
**Goal**: Analyze existing code and produce `RUN_DIR/list-of-changes.md`
**Constraints**: Document what exists, identify what needs to change. No code changes.
**Skip condition** (Targeted mode): If `COMPONENTS_DIR` and `SOLUTION_DIR` already contain documentation for the target area, skip to Phase 2. Ask user to confirm skip.
## Mode Branch
Determine the input mode set during Context Resolution (see SKILL.md):
- **Guided mode**: input file provided → start with 1g below
- **Automatic mode**: no input file → start with 1a below
---
## Guided Mode
### 1g. Read and Validate Input File
1. Read the provided input file (e.g., `list-of-changes.md` from the autodev testability revision step or user-provided file)
2. Extract file paths, problem descriptions, and proposed changes from each entry
3. For each entry, verify against actual codebase:
- Referenced files exist
- Described problems are accurate (read the code, confirm the issue)
- Proposed changes are feasible
4. Flag any entries that reference nonexistent files or describe inaccurate problems — ASK user
### 1h. Scoped Component Analysis
For each file/area referenced in the input file:
1. Analyze the specific modules and their immediate dependencies
2. Document component structure, interfaces, and coupling points relevant to the proposed changes
3. Identify additional issues not in the input file but discovered during analysis of the same areas
Write per-component to `RUN_DIR/discovery/components/[##]_[name].md` (same format as automatic mode, but scoped to affected areas only).
### 1i. Logical Flow Analysis (guided mode)
Even in guided mode, perform the logical flow analysis from step 1c (automatic mode) — scoped to the areas affected by the input file. Cross-reference documented flows against actual implementation for the affected components. This catches issues the input file author may have missed.
Write findings to `RUN_DIR/discovery/logical_flow_analysis.md`.
### 1j. Produce List of Changes
1. Start from the validated input file entries
2. Enrich each entry with:
- Exact file paths confirmed from code
- Risk assessment (low/medium/high)
- Dependencies between changes
3. Add any additional issues discovered during scoped analysis (1h)
4. **Add any logical flow contradictions** discovered during step 1i
5. Write `RUN_DIR/list-of-changes.md` using `templates/list-of-changes.md` format
- Set **Mode**: `guided`
- Set **Source**: path to the original input file
Skip to **Save action** below.
---
## Automatic Mode
### 1a. Document Components
For each component in the codebase:
1. Analyze project structure, directories, files
2. Go file by file, analyze each method
3. Analyze connections between components
Write per component to `RUN_DIR/discovery/components/[##]_[name].md`:
- Purpose and architectural patterns
- Mermaid diagrams for logic flows
- API reference table (name, description, input, output)
- Implementation details: algorithmic complexity, state management, dependencies
- Caveats, edge cases, known limitations
### 1b. Synthesize Solution & Flows
1. Review all generated component documentation
2. Synthesize into a cohesive solution description
3. Create flow diagrams showing component interactions
Write:
- `RUN_DIR/discovery/solution.md` — product description, component overview, interaction diagram
- `RUN_DIR/discovery/system_flows.md` — Mermaid flowcharts per major use case
Also copy to project standard locations:
- `SOLUTION_DIR/solution.md`
- `DOCUMENT_DIR/system_flows.md`
### 1c. Logical Flow Analysis
**Critical step — do not skip.** Before producing the change list, cross-reference documented business flows against actual implementation. This catches issues that static code inspection alone misses.
1. **Read documented flows**: Load `DOCUMENT_DIR/system-flows.md`, `DOCUMENT_DIR/architecture.md` (paying special attention to its `## Architecture Vision` section — that's the user-confirmed structural intent), `DOCUMENT_DIR/glossary.md`, `DOCUMENT_DIR/module-layout.md`, every file under `DOCUMENT_DIR/contracts/`, and `SOLUTION_DIR/solution.md` (whichever exist). Extract every documented business flow, data path, architectural decision, module ownership boundary, and contract shape. Any refactor change that contradicts a confirmed Architecture Vision principle must either be rejected or surfaced to the user before being added to `list-of-changes.md` — those principles are not refactor targets without explicit user approval.
2. **Trace each flow through code**: For every documented flow (e.g., "video batch processing", "image tiling", "engine initialization"), walk the actual code path line by line. At each decision point ask:
- Does the code match the documented/intended behavior?
- Are there edge cases where the flow silently drops data, double-processes, or deadlocks?
- Do loop boundaries handle partial batches, empty inputs, and last-iteration cleanup?
- Are assumptions from one component (e.g., "batch size is dynamic") honored by all consumers?
3. **Check for logical contradictions**: Specifically look for:
- **Fixed-size assumptions vs dynamic-size reality**: Does the code require exact batch alignment when the engine supports variable sizes? Does it pad, truncate, or drop data to fit a fixed size?
- **Loop scoping bugs**: Are accumulators (lists, counters) reset at the right point? Does the last iteration flush remaining data? Are results from inside the loop duplicated outside?
- **Wasted computation**: Is the system doing redundant work (e.g., duplicating frames to fill a batch, processing the same data twice)?
- **Silent data loss**: Are partial batches, remaining frames, or edge-case inputs silently dropped instead of processed?
- **Documentation drift**: Does the architecture doc describe components or patterns (e.g., "msgpack serialization") that are actually dead in the code?
4. **Classify each finding** as:
- **Logic bug**: Incorrect behavior (data loss, double-processing)
- **Performance waste**: Correct but inefficient (unnecessary padding, redundant inference)
- **Design contradiction**: Code assumes X but system needs Y (fixed vs dynamic batch)
- **Documentation drift**: Docs describe something the code doesn't do
Write findings to `RUN_DIR/discovery/logical_flow_analysis.md`.
### 1d. Produce List of Changes
From the component analysis, solution synthesis, and **logical flow analysis**, identify all issues that need refactoring:
1. Hardcoded values (paths, config, magic numbers)
2. Tight coupling between components
3. Missing dependency injection / non-configurable parameters
4. Global mutable state
5. Code duplication
6. Missing error handling
7. Testability blockers (code that cannot be exercised in isolation)
8. Security concerns
9. Performance bottlenecks
10. **Logical flow contradictions** (from step 1c)
11. **Silent data loss or wasted computation** (from step 1c)
12. **Module ownership violations** — code that lives under one component's directory but implements another component's concern, or imports another component's internal (non-Public API) file. Cross-check against `DOCUMENT_DIR/module-layout.md` if present.
13. **Contract drift** — shared-models / shared-API implementations whose public shape has drifted from the contract file in `DOCUMENT_DIR/contracts/`. Include both producer drift and consumer drift.
Write `RUN_DIR/list-of-changes.md` using `templates/list-of-changes.md` format:
- Set **Mode**: `automatic`
- Set **Source**: `self-discovered`
---
## Save action (both modes)
Write all discovery artifacts to RUN_DIR.
**Self-verification**:
- [ ] Every referenced file in list-of-changes.md exists in the codebase
- [ ] Each change entry has file paths, problem, change description, risk, and dependencies
- [ ] Component documentation covers all areas affected by the changes
- [ ] **Logical flow analysis completed**: every documented business flow traced through code, contradictions identified
- [ ] **No silent data loss**: loop boundaries, partial batches, and edge cases checked for all processing flows
- [ ] In guided mode: all input file entries are validated or flagged
- [ ] In automatic mode: solution description covers all components
- [ ] Mermaid diagrams are syntactically correct
**BLOCKING**: Present discovery summary and list-of-changes.md to user. Do NOT proceed until user confirms documentation accuracy and change list completeness.
@@ -1,163 +0,0 @@
# Phase 2: Analysis & Task Decomposition
**Role**: Researcher, software architect, and task planner
**Goal**: Research improvements, produce a refactoring roadmap, and decompose into implementable tasks
**Constraints**: Analysis and planning only — no code changes
## 2a. Deep Research
1. Analyze current implementation patterns
2. Extract the **Project Constraint Matrix** from `problem.md`, `restrictions.md`, `acceptance_criteria.md`, current architecture/docs, and actual code constraints. Include required inputs/outputs, operating context, lifecycle assumptions, integration boundaries, non-functional targets, and hard disqualifiers.
3. Research modern approaches for similar systems
4. For each alternative pattern/library/service/architecture/algorithm, research intrinsic implementation constraints: required inputs/outputs, runtime assumptions, supported deployment modes, resource needs, operational limits, licensing/security constraints, and known failure reports.
**API Capability Verification — Per-Mode (MANDATORY, BLOCKING for proposed replacements)**
When a refactor recommendation replaces (or adds) a library/SDK/framework/service, the same per-mode verification used by `/research` Step 2 applies — selecting a replacement on category fit alone is the same silent-failure path. For every replacement candidate that has multiple modes or configurations:
1. **Pin the exact mode/configuration** the refactored code will use, in one explicit sentence. Inputs (data shapes, sensor counts, payloads, rates), outputs (per `acceptance_criteria.md` and contract files), runtime (matching the project's deployment).
2. **Run `context7` (or equivalent docs lookup)** for the candidate. **Mandatory for every replacement library/SDK/framework candidate**, not optional. Minimum three queries per candidate: mode enumeration, project's exact mode (with input/output shapes), disqualifier probe ("does this mode produce the required output? are there published limitations on this runtime?"). Append URLs to `RUN_DIR/analysis/research_findings.md` references section.
3. **Save a Minimum Viable Example (MVE)** for the pinned mode under `RUN_DIR/analysis/mve_evidence.md` with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌. If no official example covers the project's exact configuration, the recommendation cannot be `Selected` based on category fit alone — it must be `Experimental only` (with required-evidence note) or `Rejected`.
4. **Treat "the same library in a different mode" as a different recommendation.** If the project's pinned mode is `<X>` but the only documented evidence covers `<Y>`, do not silently soften the description. Open a separate recommendation row, with its own MVE, fit assessment, and disqualifiers.
5. **Common silent-failure pattern**: a fact summary paraphrases docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes" — no `A+B` combination exists. Cross-check paraphrased capability claims against the literal mode enumeration.
5. Identify what could be done differently
6. Suggest improvements only when they fit the Project Constraint Matrix. A cleaner or more modern approach that violates product constraints must be marked `Rejected` or `Experimental only`, not added as a roadmap recommendation.
Write `RUN_DIR/analysis/research_findings.md`:
- Current state analysis: patterns used, strengths, weaknesses
- Alternative approaches per component: current vs alternative, pros/cons, migration effort
- Prioritized recommendations: quick wins + strategic improvements
- Constraint-fit table: recommendation, **pinned mode/config**, constraints checked, **API capability evidence (MVE link)**, evidence, mismatches/disqualifiers, status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
- For every recommendation that replaces or adds a library/SDK/framework, append a **Restrictions × Candidate-Mode sub-matrix** that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's pinned mode, marking each cell ✅ Pass / ❌ Fail / ❓ Verify / N/A with cited evidence. A recommendation cannot be `Selected` while any cell is ❌ or ❓.
## 2b. Solution Assessment & Hardening Tracks
1. Assess current implementation against acceptance criteria
2. Identify weak points in codebase, map to specific code areas
3. Perform gap analysis: acceptance criteria vs current state
4. Prioritize changes by impact and effort
5. Reject or escalate any proposed refactor that improves code structure while weakening required behavior, integration contracts, runtime constraints, safety/security posture, or acceptance criteria
### 2b.1. ADR Superseding Gate (BLOCKING)
A refactor that improves code structure while overturning a documented architecture decision is the silent-drift class the project repeatedly burns on (see `meta-rule.mdc` § GPS-passthrough postmortem and the auto-lessons it produced). This gate makes drift visible and forces a deliberate ADR update.
1. **List candidate ADRs**: read every `Status: Accepted` file in `_docs/02_document/adr/`. If the directory does not exist or contains only the index, log `No ADRs in scope` to `RUN_DIR/analysis/adr_impact.md` and skip the rest of this gate.
2. **Diff each candidate against the proposed refactor roadmap**: for each ADR, ask the same two questions as code-review Phase 7:
- **Violation**: does any roadmap item do the *opposite* of the ADR's `Decision`?
- **Drift**: does any roadmap item materially affect the ADR's `Consequences` (positive or negative) without contradicting the Decision outright?
3. **Classify each impacted ADR** in `RUN_DIR/analysis/adr_impact.md`:
| ADR | Roadmap item | Impact | Required action |
|-----|--------------|--------|-----------------|
| NNN | `roadmap-item-NN` | Violation / Drift / Aligned | (filled by Choose A/B/C below) |
4. **For every Violation row, present a BLOCKING Choose**:
```
══════════════════════════════════════
DECISION REQUIRED: Refactor would violate ADR-NNN (<title>)
══════════════════════════════════════
A) Update the ADR via supersede: the refactor produces a NEW ADR
(`Supersedes: NNN`) capturing the new Decision, and ADR-NNN's
`Superseded by` field is updated. The supersede ADR is itself a
deliverable of this refactor run (added to RUN_DIR/analysis/adr_impact.md
and to TASKS_DIR as a task) and must be `Accepted` before Phase 4.
B) Reduce the refactor scope to NOT violate ADR-NNN
C) Re-evaluate ADR-NNN: keep the refactor but only after ADR-NNN is
formally re-opened in a new /plan Step 4.5 round
══════════════════════════════════════
Recommendation: A — supersede is the only path that keeps the audit
trail intact while letting the refactor land
══════════════════════════════════════
```
5. **For every Drift row**: do not block, but the roadmap item must include a `## ADR Impact` section in its task spec citing the affected ADR(s). The implementer surfaces this at code-review Phase 7, which would otherwise classify the change as ADR-Drift (High) without context.
6. **For every Aligned row**: cite the ADR in the roadmap item's task spec under `## ADR Compliance`. No further action.
7. **Self-supersede deliverable**: any Choose A path adds a `[##]_supersede_adr_NNN.md` task file to the refactor run's TASKS_DIR with the new ADR text drafted (using `.cursor/skills/plan/templates/adr.md`). The task's only Acceptance Criterion is "ADR file exists at `_docs/02_document/adr/<next>_<slug>.md` with `Status: Accepted`, ADR-NNN's `Superseded by` field updated, and `_docs/02_document/adr/README.md` index reflects both."
Present optional hardening tracks for user to include in the roadmap:
```
══════════════════════════════════════
DECISION REQUIRED: Include hardening tracks?
══════════════════════════════════════
A) Technical Debt — identify and address design/code/test debt
B) Performance Optimization — profile, identify bottlenecks, optimize
C) Security Review — OWASP Top 10, auth, encryption, input validation
D) All of the above
E) None — proceed with structural refactoring only
══════════════════════════════════════
```
For each selected track, add entries to `RUN_DIR/list-of-changes.md` (append to the file produced in Phase 1):
- **Track A**: tech debt items with location, impact, effort
- **Track B**: performance bottlenecks with profiling data
- **Track C**: security findings with severity and fix description
Write `RUN_DIR/analysis/refactoring_roadmap.md`:
- Weak points assessment: location, description, impact, proposed solution
- Gap analysis: what's missing, what needs improvement
- Phased roadmap: Phase 1 (critical fixes), Phase 2 (major improvements), Phase 3 (enhancements)
- Selected hardening tracks and their items
- Applicability gate: each roadmap item must state constraint fit, mismatches, required evidence, and status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
**BLOCKING applicability gate**: Before 2c and 2d, every recommendation in the roadmap must be `Selected`. Items marked `Rejected` are excluded. Items marked `Experimental only` or `Needs user decision` require a user decision before task creation.
**BLOCKING ADR-supersede gate**: Before 2c and 2d, every Violation row in `RUN_DIR/analysis/adr_impact.md` (from 2b.1) must be resolved via Choose A, B, or C. A Violation row with no chosen path blocks task creation.
## 2c. Create Epic
Create a work item tracker epic for this refactoring run:
1. Epic name: the RUN_DIR name (e.g., `01-testability-refactoring`)
2. Create the epic via configured tracker MCP
3. Record the Epic ID — all tasks in 2d will be linked under this epic
4. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; only use `PENDING` placeholders if the user explicitly chooses `tracker: local`
## 2d. Task Decomposition
Convert the finalized `RUN_DIR/list-of-changes.md` into implementable task files.
1. Read `RUN_DIR/list-of-changes.md`
2. For each change entry (or group of related entries), create an atomic task file in TASKS_DIR:
- Use the standard task template format (`.cursor/skills/decompose/templates/task.md`)
- File naming: `[##]_refactor_[short_name].md` (temporary numeric prefix)
- **Task**: `PENDING_refactor_[short_name]`
- **Description**: derived from the change entry's Problem + Change fields
- **Complexity**: estimate 1-5 points; split into multiple tasks if >5
- **Dependencies**: map change-level dependencies (C01, C02) to task-level tracker IDs
- **Component**: from the change entry's File(s) field
- **Epic**: the epic created in 2c
- **Acceptance Criteria**: derived from the change entry — verify the problem is resolved
3. Create work item ticket for each task under the epic from 2c
4. Rename each file to `[TRACKER-ID]_refactor_[short_name].md` after ticket creation
5. Update or append to `TASKS_DIR/_dependencies_table.md` with the refactoring tasks
**Self-verification**:
- [ ] All acceptance criteria are addressed in gap analysis
- [ ] Recommendations are grounded in actual code, not abstract
- [ ] Every recommendation has been checked against the Project Constraint Matrix
- [ ] No recommendation violates product restrictions, acceptance criteria, documented architecture decisions, or actual code integration boundaries
- [ ] Every replacement library/SDK/framework recommendation has a pinned mode/config, a saved MVE in `mve_evidence.md`, and a Restrictions × Candidate-Mode sub-matrix with no ❌ or ❓ cells
- [ ] `context7` (or equivalent) was consulted for every replacement library/SDK/framework recommendation
- [ ] Paraphrased capability claims have been cross-checked against the literal mode-enumeration evidence (no `A, B → A+B` style conflation)
- [ ] Rejected and experimental approaches are documented but not converted into implementation tasks without user approval
- [ ] Roadmap phases are prioritized by impact
- [ ] Epic created and all tasks linked to it
- [ ] Every entry in list-of-changes.md has a corresponding task file in TASKS_DIR
- [ ] No task exceeds 5 complexity points
- [ ] Task dependencies are consistent (no circular dependencies)
- [ ] `_dependencies_table.md` includes all refactoring tasks
- [ ] Every task has a work item ticket (or PENDING placeholder)
- [ ] If `_docs/02_document/adr/` exists with Accepted ADRs, `RUN_DIR/analysis/adr_impact.md` has been written and every Violation row is resolved (A/B/C) — no implicit overrides
- [ ] For every Violation resolved via Choose A, a `[##]_supersede_adr_NNN.md` task exists in TASKS_DIR with the drafted supersede ADR
- [ ] For every Drift row, the corresponding roadmap-item task spec has a `## ADR Impact` section
- [ ] For every Aligned row, the corresponding roadmap-item task spec has a `## ADR Compliance` section
**Save action**: Write analysis artifacts to RUN_DIR, task files to TASKS_DIR
**BLOCKING**: Present refactoring roadmap and task list to user. Do NOT proceed until user confirms.
**Quick Assessment mode stops here.** Present final summary and write `FINAL_report.md` with phases 0-2 content.
@@ -1,57 +0,0 @@
# Phase 3: Safety Net
**Role**: QA engineer and developer
**Goal**: Ensure tests exist that capture current behavior before refactoring
**Constraints**: Tests must all pass on the current codebase before proceeding
## Skip Condition: Testability Refactoring
If the current run name contains `testability` (e.g., `01-testability-refactoring`), **skip Phase 3 entirely**. The purpose of a testability run is to make the code testable so that tests can be written afterward. Announce the skip and proceed to Phase 4.
## 3a. Check Existing Tests
Before designing or implementing any new tests, check what already exists:
1. Scan the project for existing test files (unit tests, integration tests, blackbox tests)
2. Run the existing test suite — record pass/fail counts
3. Measure current coverage against the areas being refactored (from `RUN_DIR/list-of-changes.md` file paths)
4. Assess coverage against thresholds (canonical: see `.cursor/rules/cursor-meta.mdc` Quality Thresholds — never hardcode a different number):
- Minimum overall coverage: 75%
- Critical path coverage: **90% floor / 100% aim** — 90% is the enforcement floor (blocks Phase 4 if not met); 100% is the aspirational target. Refactors are NOT permitted to drop below 90% on the critical paths covered by the in-scope changes.
- All public APIs must have blackbox tests
- All error handling paths must be tested
If existing tests meet all thresholds for the refactoring areas:
- Document the existing coverage in `RUN_DIR/test_specs/existing_coverage.md`
- Skip to the GATE check below
If existing tests partially cover the refactoring areas:
- Document what is covered and what gaps remain
- Proceed to 3b only for the uncovered areas
If no relevant tests exist:
- Proceed to 3b for full test design
## 3b. Design Test Specs (for uncovered areas only)
For each uncovered critical area, write test specs to `RUN_DIR/test_specs/[##]_[test_name].md`:
- Blackbox tests: summary, current behavior, input data, expected result, max expected time
- Acceptance tests: summary, preconditions, steps with expected results
- Coverage analysis: current %, target %, uncovered critical paths
## 3c. Implement Tests (for uncovered areas only)
1. Set up test environment and infrastructure if not exists
2. Implement each test from specs
3. Run tests, verify all pass on current codebase
4. Document any discovered issues
**Self-verification**:
- [ ] Coverage requirements met (75% overall, 90% critical-path floor — 100% aim — per canonical `cursor-meta.mdc` Quality Thresholds) across existing + new tests
- [ ] All tests pass on current codebase
- [ ] All public APIs in refactoring scope have blackbox tests
- [ ] Test data fixtures are configured
**Save action**: Write test specs to RUN_DIR; implemented tests go into the project's test folder
**GATE (BLOCKING)**: ALL tests must pass before proceeding to Phase 4. If tests fail, fix the tests (not the code) or ask user for guidance. Do NOT proceed to Phase 4 with failing tests.
@@ -1,63 +0,0 @@
# Phase 4: Execution
**Role**: Orchestrator
**Goal**: Execute all refactoring tasks by delegating to the implement skill
**Constraints**: No inline code changes — all implementation goes through the implement skill's batching and review pipeline
## 4a. Pre-Flight Checks
1. Verify refactoring task files exist in TASKS_DIR (created during Phase 2d):
- All `[TRACKER-ID]_refactor_*.md` files are present
- Each task file has valid header fields (Task, Name, Description, Complexity, Dependencies)
2. Verify `TASKS_DIR/_dependencies_table.md` includes the refactoring tasks
3. Verify all tests pass (safety net from Phase 3 is green), unless this is a testability run where Phase 3 was intentionally skipped
4. If any check fails, go back to the relevant phase to fix
## 4b. Delegate to Implement Skill
Read and execute `.cursor/skills/implement/SKILL.md`.
The implement skill will:
1. Parse task files and dependency graph from TASKS_DIR
2. Detect already-completed tasks (skip non-refactoring tasks from prior workflow steps)
3. Compute execution batches for the refactoring tasks
4. Implement tasks sequentially in topological order (no subagents, no parallelism)
5. Run code review after each batch
6. Commit per batch and push only when the user approved pushing
7. Update work item ticket status
Do NOT modify, skip, or abbreviate any part of the implement skill's workflow. The refactor skill is delegating execution, not optimizing it.
## 4c. Capture Results
After the implement skill completes:
1. Read batch reports from `_docs/03_implementation/batch_*_report.md`
2. Read the latest `_docs/03_implementation/implementation_report_*.md` file
3. Write `RUN_DIR/execution_log.md` summarizing:
- Total tasks executed
- Batches completed
- Code review verdicts per batch
- Files modified (aggregate list)
- Any blocked or failed tasks
- Links to batch reports
## 4d. Update Task Statuses
For each successfully completed refactoring task:
1. Transition the work item ticket status to **Done** via the configured tracker MCP
2. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; if the user explicitly chose `tracker: local`, note the pending status transitions in `RUN_DIR/execution_log.md`
For any failed or blocked tasks, leave their status as-is (the implement skill already set them to In Testing or blocked).
**Self-verification**:
- [ ] All refactoring tasks show as completed in batch reports
- [ ] All completed tasks have work item tracker status set to Done
- [ ] All tests still pass after execution
- [ ] No tasks remain in blocked or failed state (or user has acknowledged them)
- [ ] `RUN_DIR/execution_log.md` written with links to batch reports
**Save action**: Write `RUN_DIR/execution_log.md`
**GATE**: All refactoring tasks must be implemented. If any tasks failed, present the failures to the user and ask for guidance before proceeding to Phase 5.
@@ -1,53 +0,0 @@
# Phase 5: Test Synchronization
**Role**: QA engineer and developer
**Goal**: Reconcile the test suite with the refactored codebase — remove obsolete tests, update broken tests, add tests for new code
**Constraints**: All tests must pass at the end of this phase. Do not change production code here — only tests.
**Skip condition**: If the run name contains `testability`, skip Phase 5 entirely — no test suite exists yet to synchronize. Proceed directly to Phase 6.
## 5a. Identify Obsolete Tests
1. Compare the pre-refactoring codebase structure (from Phase 0 inventory) with the current state
2. Find tests that reference removed functions, classes, modules, or endpoints
3. Find tests that duplicate coverage due to merged/consolidated code
4. Decide per test: **delete** (functionality removed) or **merge** (duplicates)
Write `RUN_DIR/test_sync/obsolete_tests.md`:
- Test file, test name, reason (target removed / target merged / duplicate coverage), action taken (deleted / merged into)
## 5b. Update Existing Tests
1. Run the full test suite — collect failures and errors
2. For each failing test, determine the cause:
- Renamed/moved function or module → update import paths and references
- Changed function signature → update call sites and assertions
- Changed behavior (intentional per refactoring plan) → update expected values
- Changed data structures → update fixtures and assertions
3. Fix each test, re-run to confirm it passes
Write `RUN_DIR/test_sync/updated_tests.md`:
- Test file, test name, change type (import path / signature / assertion / fixture), description of update
## 5c. Add New Tests
1. Identify new code introduced during Phase 4 that lacks test coverage:
- New public functions, classes, or modules
- New interfaces or abstractions introduced during decoupling
- New error handling paths
2. Write tests following the same patterns and conventions as the existing test suite
3. Ensure coverage targets from Phase 3 are maintained or improved
Write `RUN_DIR/test_sync/new_tests.md`:
- Test file, test name, target function/module, coverage type (unit / integration / blackbox)
**Self-verification**:
- [ ] All obsolete tests removed or merged
- [ ] All pre-existing tests pass after updates
- [ ] New code from Phase 4 has test coverage
- [ ] Overall coverage meets or exceeds Phase 3 baseline (75% overall, 90% critical-path floor / 100% aim — per `.cursor/rules/cursor-meta.mdc` Quality Thresholds)
- [ ] No tests reference removed or renamed code
**Save action**: Write test_sync artifacts; implemented tests go into the project's test folder
**GATE (BLOCKING)**: ALL tests must pass before proceeding to Phase 6. If tests fail, fix the tests or ask user for guidance.
@@ -1,53 +0,0 @@
# Phase 6: Final Verification
**Role**: QA engineer
**Goal**: Run all tests end-to-end, compare final metrics against baseline, and confirm the refactoring succeeded
**Constraints**: No code changes. If failures are found, go back to the appropriate phase (4/5) to fix before retrying.
**Skip condition**: If the run name contains `testability`, skip Phase 6 entirely — no test suite exists yet to verify against. Proceed directly to Phase 7.
## 6a. Run Full Test Suite
1. Run unit tests, integration tests, and blackbox tests
2. Run acceptance tests derived from `acceptance_criteria.md`
3. Record pass/fail counts and any failures
If any test fails:
- Determine whether the failure is a test issue (→ return to Phase 5) or a code issue (→ return to Phase 4)
- Do NOT proceed until all tests pass
## 6b. Capture Final Metrics
Re-measure all metrics from Phase 0 baseline using the same tools:
| Metric Category | What to Capture |
|----------------|-----------------|
| **Coverage** | Overall, unit, blackbox, critical paths |
| **Complexity** | Cyclomatic complexity (avg + top 5 functions), LOC, tech debt ratio |
| **Code Smells** | Total, critical, major |
| **Performance** | Response times (P50/P95/P99), CPU/memory, throughput |
| **Dependencies** | Total count, outdated, security vulnerabilities |
| **Build** | Build time, test execution time, deployment time |
## 6c. Compare Against Baseline
1. Read `RUN_DIR/baseline_metrics.md`
2. Produce a side-by-side comparison: baseline vs final for every metric
3. Flag any regressions (metrics that got worse)
4. Verify acceptance criteria are met
Write `RUN_DIR/verification_report.md`:
- Test results summary: total, passed, failed, skipped
- Metric comparison table: metric, baseline value, final value, delta, status (improved / unchanged / regressed)
- Acceptance criteria checklist: criterion, status (met / not met), evidence
- Regressions (if any): metric, severity, explanation
**Self-verification**:
- [ ] All tests pass (zero failures)
- [ ] All acceptance criteria are met
- [ ] No critical metric regressions
- [ ] Metrics are captured with the same tools/methodology as Phase 0
**Save action**: Write `RUN_DIR/verification_report.md`
**GATE (BLOCKING)**: All tests must pass and no critical regressions. Present verification report to user. Do NOT proceed to Phase 7 until user confirms.
@@ -1,45 +0,0 @@
# Phase 7: Documentation Update
**Role**: Technical writer
**Goal**: Update existing `_docs/` artifacts to reflect all changes made during refactoring
**Constraints**: Documentation only — no code changes. Only update docs that are affected by refactoring changes.
**Skip condition**: If no `_docs/02_document/` directory exists, skip this phase entirely.
## 7a. Identify Affected Documentation
1. Review `RUN_DIR/execution_log.md` to list all files changed during Phase 4
2. Review test changes from Phase 5
3. Map changed files to their corresponding module docs in `_docs/02_document/modules/`
4. Map changed modules to their parent component docs in `_docs/02_document/components/`
5. Determine if system-level docs need updates (`architecture.md`, `system-flows.md`, `data_model.md`)
6. Determine if test documentation needs updates (`_docs/02_document/tests/`)
## 7b. Update Module Documentation
For each module doc affected by refactoring changes:
1. Re-read the current source file
2. Update the module doc to reflect new/changed interfaces, dependencies, internal logic
3. Remove documentation for deleted code; add documentation for new code
## 7c. Update Component Documentation
For each component doc affected:
1. Re-read the updated module docs within the component
2. Update inter-module interfaces, dependency graphs, caveats
3. Update the component relationship diagram if component boundaries changed
## 7d. Update System-Level Documentation
If structural changes were made (new modules, removed modules, changed interfaces):
1. Update `_docs/02_document/architecture.md` if architecture changed — but **never edit the `## Architecture Vision` section**. That section is user-confirmed (plan Phase 2a.0 / document Step 4.5); if a refactor invalidates a vision principle, surface it to the user and let them update the vision themselves before continuing. Update only the technical sections below the Vision H2.
2. Update `_docs/02_document/system-flows.md` if flow sequences changed
3. Update `_docs/02_document/diagrams/components.md` if component relationships changed
**Self-verification**:
- [ ] Every changed source file has an up-to-date module doc
- [ ] Component docs reflect the refactored structure
- [ ] No stale references to removed code in any doc
- [ ] Dependency graphs in docs match actual imports
**Save action**: Updated docs written in-place to `_docs/02_document/`
@@ -1,53 +0,0 @@
# List of Changes Template
Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
---
```markdown
# List of Changes
**Run**: [NN-run-name]
**Mode**: [automatic | guided]
**Source**: [self-discovered | path/to/input-file.md]
**Date**: [YYYY-MM-DD]
## Summary
[1-2 sentence overview of what this refactoring run addresses]
## Changes
### C01: [Short Title]
- **File(s)**: [file paths, comma-separated]
- **Problem**: [what makes this problematic / untestable / coupled]
- **Change**: [what to do — behavioral description, not implementation steps]
- **Rationale**: [why this change is needed]
- **Constraint Fit**: [which product constraints / acceptance criteria / integration boundaries this preserves; or "Rejected — violates ..."]
- **Risk**: [low | medium | high]
- **Dependencies**: [other change IDs this depends on, or "None"]
### C02: [Short Title]
- **File(s)**: [file paths]
- **Problem**: [description]
- **Change**: [description]
- **Rationale**: [description]
- **Constraint Fit**: [description]
- **Risk**: [low | medium | high]
- **Dependencies**: [C01, or "None"]
```
---
## Guidelines
- **Change IDs** use format `C##` (C01, C02, ...) — sequential within the run
- Each change should map to one atomic task (1-5 complexity points); split if larger
- **File(s)** must reference actual files verified to exist in the codebase
- **Problem** describes the current state, not the desired state
- **Change** describes what the system should do differently — behavioral, not prescriptive
- **Constraint Fit** proves the change preserves confirmed product requirements, restrictions, acceptance criteria, architecture decisions, and integration contracts
- Do not include changes whose only benefit is structural cleanliness if they weaken required behavior or violate constraints; record those as rejected in analysis instead
- **Dependencies** reference other change IDs within this list; cross-run dependencies use tracker IDs
- In guided mode, the input file entries are validated against actual code and enriched with file paths, risk, and dependencies before writing
- In automatic mode, entries are derived from Phase 1 component analysis and Phase 2 research findings
-290
View File
@@ -1,290 +0,0 @@
---
name: release
description: |
Executes the deployment plan produced by /deploy against a target environment.
Closes the loop between "we have a plan" and "the new version is running in production with a verdict on disk."
6-phase workflow: pre-release gate, strategy select, execute, smoke test, watch window, commit-or-rollback.
Outputs _docs/04_release/release_<version>.md with a definitive Released / Rolled-Back / Aborted verdict.
Trigger phrases:
- "release", "ship", "go live", "release this version"
- "deploy to prod", "promote to staging", "roll out"
- "rollback", "abort the release"
category: ship
tags: [release, deployment, rollback, smoke-test, observability, production]
disable-model-invocation: true
---
# Release Execution
The `/deploy` skill produces a plan and scripts. The `/release` skill **runs** them, verifies the live system, watches it for a defined window, and produces a definitive verdict on disk.
## Core Principles
- **Real execution, not simulation**: every phase must actually run against the target environment. If a phase cannot be executed (missing scripts, no SSH access, disabled secrets, registry auth failure), STOP — do not pretend a step succeeded. See `meta-rule.mdc` § "Real Results, Not Simulated Ones".
- **Verifiable rollback path**: the release does not start until rollback is proven viable for this version. "We can roll back" without evidence is not a rollback path.
- **Quiet failure is a release failure**: a deploy script that exits 0 but emits no observable signal in the watch window is treated as a regression, not a success.
- **One release per invocation**: a single `/release` execution targets exactly one version against exactly one environment. Multi-stage promotion (staging → prod) is two invocations, not one.
- **Never skip the watch window**: even successful deploys can degrade after 560 minutes (cache warm-up, scheduled jobs, downstream backpressure). The watch window is mandatory.
- **Autonomous rollback on hard regressions**: critical health-check failure, error-rate spike above threshold, or smoke-test failure → automatic rollback. Soft regressions (latency drift, capacity warnings) escalate to the user.
## Context Resolution
Fixed paths:
- DEPLOY_DIR: `_docs/04_deploy/`
- RELEASE_DIR: `_docs/04_release/`
- SCRIPTS_DIR: `scripts/`
- DEPLOY_SCRIPT: `scripts/deploy.sh`
- HEALTH_SCRIPT: `scripts/health-check.sh`
- ENV_TEMPLATE: `.env.example`
- OBSERVABILITY_DOC: `_docs/04_deploy/observability.md`
- ENVIRONMENT_DOC: `_docs/04_deploy/environment_strategy.md`
- PROCEDURES_DOC: `_docs/04_deploy/deployment_procedures.md`
- ARCHITECTURE: `_docs/02_document/architecture.md`
- RESTRICTIONS: `_docs/00_problem/restrictions.md`
Announce the resolved paths and the **target environment + version + strategy** to the user before any phase that touches the live system.
## Inputs (BLOCKING prerequisites)
| Input | Required | Source |
|-------|----------|--------|
| Target environment | Yes — ASK user | `environment_strategy.md` enumerates valid options |
| Target version / image tag | Yes — ASK user | Must exist in the registry; verified in Phase 1 |
| Rollback target version | Yes — ASK user | Defaults to currently-deployed version if discoverable |
| `scripts/deploy.sh` | Yes | Produced by `/deploy` Step 7. STOP if missing → run `/deploy` first |
| `scripts/health-check.sh` | Yes | Same |
| `_docs/04_deploy/deployment_procedures.md` | Yes | Defines per-environment runbook, manual approval rules, change-window restrictions |
| `_docs/04_deploy/observability.md` | Yes | Defines watch metrics, thresholds, and dashboards |
| `_docs/04_deploy/environment_strategy.md` | Yes | Defines target hostnames, registries, secrets, deploy strategy per env |
## Outputs
```
RELEASE_DIR/
├── release_<version>_<env>_<YYYY-MM-DD-HHmm>.md (mandatory; one per invocation)
├── rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md (only when rollback fires; pairs with the release file)
└── manual_approvals/
└── approval_<version>_<env>.md (when restrictions require manual approval, written before Phase 3)
```
The release report (`templates/release-report.md`) is appended to as each phase completes — it is durable across phase failures and reflects partial progress so the next operator can resume or audit.
## Phases
```
┌────────────────────────────────────────────────────────────────┐
│ Release Execution (6-Phase Method) │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: deploy artifacts on disk; tests green at HEAD │
│ │
│ 1. Pre-Release Gate → AC + change summary + readiness │
│ [BLOCKING: user confirms or aborts] │
│ 2. Strategy Select → all-at-once / blue-green / canary │
│ [BLOCKING: user picks strategy] │
│ 3. Execute → run deploy.sh, capture exit + logs │
│ [AUTO-ROLLBACK on non-zero exit] │
│ 4. Smoke Test → /test-run prod-smoke in target env │
│ [AUTO-ROLLBACK on failure] │
│ 5. Watch Window → poll observability for N minutes │
│ [AUTO-ROLLBACK on hard threshold breach] │
│ 6. Commit or Rollback → finalize verdict, update tracker │
│ [BLOCKING: user confirms only if soft regression escalated] │
├────────────────────────────────────────────────────────────────┤
│ Verdicts: Released · Rolled-Back · Aborted │
└────────────────────────────────────────────────────────────────┘
```
### Phase 1: Pre-Release Gate
**Goal**: Refuse to start if the system is not ready for a real release.
1. **Acceptance criteria check**: read `_docs/00_problem/acceptance_criteria.md`. If any AC is marked unmet OR if any AC has no associated test marked `Passed` in the latest `test-run` report, STOP and surface the unmet items. Do not let the user override with "ship anyway" without a recorded reason in the release report.
2. **Test status check**: read the most recent `_docs/06_metrics/perf_*.md` (if perf is required by restrictions) and the latest functional test report. Any failing or skipped test that maps to a critical-path AC blocks the release.
3. **Change summary**: read the git log between the version-tag-of-last-release and HEAD (or, if no prior release exists, from the project root commit). Render a short list grouped by component: features, fixes, breaking changes, security fixes. Cross-reference against the latest implementation reports under `_docs/03_implementation/`.
4. **Rollback readiness**:
- Confirm the previous version's image is still pullable from the registry (do not deploy without this).
- Confirm `scripts/deploy.sh --rollback` works as documented (read the script; if `--rollback` flag is missing, STOP — that is a deploy-skill bug).
- Confirm a rollback target exists (e.g., previously-deployed image tag) and is recorded in the release report under `Rollback Plan`.
5. **Restrictions**: read `_docs/00_problem/restrictions.md` for change-window rules, manual-approval rules, blackout windows, regulatory requirements (e.g., 4-eyes review, ITAR controls). If any apply, gate accordingly — write a `manual_approvals/approval_<version>_<env>.md` file once received.
6. **Tracker check**: list tracker tickets in the release scope (per `tracker.mdc` rules). Any ticket still in `In Progress` or `Code Review` that maps to a change in the release scope blocks Phase 1. Move-and-deploy is not allowed.
**BLOCKING gate**: present the assembled summary to the user using Choose A/B/C:
```
══════════════════════════════════════
PRE-RELEASE GATE
══════════════════════════════════════
Target env: {env}
Target version: {version} ({git-sha})
Rollback target: {previous-version}
Changes: N tickets, M components
- {summary list}
Open risks: {summary or "none"}
Blocking issues: {summary or "none"}
══════════════════════════════════════
A) Proceed to Strategy Select
B) Abort — fix blocking issue and re-invoke
C) Edit release scope — exclude a ticket and reassemble
══════════════════════════════════════
```
If A → write Phase 1 section to release report, proceed. If B → write `Aborted` verdict to release report with reason, exit. If C → loop back into Phase 1 with edited scope.
### Phase 2: Strategy Select
**Goal**: Pick the deployment strategy that fits the change risk and environment capability.
Read `environment_strategy.md` and `deployment_procedures.md` to learn which strategies the target env supports. Strategies and when each is appropriate:
| Strategy | When to pick | Risk if wrong |
|----------|--------------|---------------|
| **all-at-once** | Internal tools, low traffic, well-rehearsed change, env supports nothing else | All users hit the new version simultaneously — bug blast radius is 100% |
| **blue-green** | Stateless services with a load balancer, env has dual-stack capability | Cutover is binary — observability must be ready to detect issues fast |
| **canary** | Customer-facing, traffic-tier load balancer in place, gradual rollout possible | Canary metric thresholds must be well-tuned or canary fails for harmless reasons |
| **manual** | Non-automatable env (one-off VMs, regulated infrastructure, non-Docker host) | The whole release becomes a runbook and the watch window phases are operator-driven; the release skill records but does not execute |
Recommend a default based on:
- Risk level inferred from change summary (any breaking change → bias toward canary or blue-green)
- Restrictions (e.g., regulatory rules forcing manual approval at each step)
- Environment capability (some envs may only support all-at-once)
**BLOCKING gate**: Choose A/B/C/D between strategies. Record the choice in the release report.
### Phase 3: Execute
**Goal**: Actually run the deploy. Capture exit code and full stdout/stderr.
1. Validate environment file (`.env`) exists, all required vars from `.env.example` are set, no placeholder secrets remain.
2. Source the env file and run `scripts/deploy.sh` against the target host. The script produced by `/deploy` Step 7 is the point of execution; do NOT bypass it. If a strategy-specific flag is needed (e.g., `--canary 5%`), pass it through.
3. Stream stdout/stderr to the release report, with timestamps, in a fenced code block under `## Phase 3: Execute`.
4. Capture exit code.
5. **AUTO-ROLLBACK trigger**: non-zero exit code → immediately invoke Phase 6 with verdict `Rolled-Back: deploy script failure`. Do NOT continue to Phase 4.
If `deploy.sh` emits no output for more than the configured idle threshold (default 5 minutes; check `deployment_procedures.md` for an explicit value), treat it as hung — capture a snapshot of what's running on the target, kill the script, and AUTO-ROLLBACK with reason `Deploy hung — manual investigation required`.
**Manual strategy**: if Phase 2 picked `manual`, write a checklist of operator steps from `deployment_procedures.md` to the release report and pause until the user types `done` or `failed`. Phase 3 then records the user's report verbatim.
### Phase 4: Smoke Test
**Goal**: Verify the new version is *actually serving traffic correctly* in the target environment.
1. Resolve the smoke-test command from `_docs/02_document/tests/blackbox-tests.md` § Production Smoke Tests, OR delegate to `/test-run` in `--prod-smoke` mode against the target environment.
2. The smoke-test set must (a) hit each public endpoint of each component, (b) include at least one read AND one write per public endpoint where applicable, and (c) complete in under 5 minutes total.
3. Capture pass/fail per case to the release report.
4. **AUTO-ROLLBACK trigger**: any smoke-test failure → invoke Phase 6 with verdict `Rolled-Back: smoke test failure: <test-name>`.
If smoke tests are **missing** for the target environment (no production-mode test set), STOP — write a leftover entry to `_docs/_process_leftovers/` per `tracker.mdc`, do not proceed to watch window without smoke coverage. Write `Aborted: smoke tests missing for prod-mode target` and ASK the user.
### Phase 5: Watch Window
**Goal**: Observe the live system for a defined window to catch latent regressions.
1. Read `observability.md` for the project's metrics, dashboards, and threshold definitions. Required watch metrics for any production target (per cursor-meta convention) include error rate, request rate, p99 latency, and saturation (CPU/memory/queue-depth).
2. Compute the watch-window duration from `deployment_procedures.md`. If unspecified, default to **15 minutes** for staging and **60 minutes** for production.
3. Poll the observability backend at 1-minute intervals (or the configured cadence). For each interval, record metric snapshots to the release report.
4. Threshold rules:
- **Hard breach** (auto-rollback): error-rate ≥ 2× baseline, p99 latency ≥ 3× baseline, any health-check failure persisting for 2 consecutive intervals.
- **Soft breach** (escalate): metric drift between 1.5× and 2× baseline, single-interval health blip, queue-depth steady but elevated.
- **No data** (escalate): if metrics are not flowing within the first 3 minutes, treat the absence as a hard breach — observability is itself broken.
5. **AUTO-ROLLBACK trigger**: hard breach at any interval. Move to Phase 6 with verdict `Rolled-Back: <metric> breached <multiplier>× baseline at T+<minutes>`.
6. **ESCALATE trigger**: soft breach. Pause polling, surface the metric, and ask the user A/B/C:
- A) Continue watch — accept current drift, keep polling
- B) Roll back now — treat soft drift as hard
- C) Extend watch window by N minutes
7. End of watch window with no breach → proceed to Phase 6.
The watch window cannot be skipped. If the user explicitly demands skipping (e.g., emergency rollforward), record the override reason in the release report and continue, but mark the verdict as `Released-with-override` — this triggers an automatic incident retrospective per `retrospective/SKILL.md`.
### Phase 6: Commit or Rollback
**Goal**: Finalize the release with a definitive verdict on disk.
**Path A — Commit (clean release)**:
1. Update tracker tickets: every ticket in scope moves to `Released` (or `Done`, per project convention defined in `tracker.mdc` / `_docs/_repo-config.yaml`).
2. Tag the git HEAD with `release/<version>` (or the project's tag convention from `deployment_procedures.md`).
3. Write the final `Released` verdict to the release report with a summary table.
4. Trigger `/retrospective --cycle-end` with this release as the cycle terminus.
5. Auto-chain to autodev's next step (Retrospective in greenfield, or feature-cycle loop start in existing-code).
**Path B — Rollback (auto-fired or user-elected)**:
1. Run `scripts/deploy.sh --rollback` with the rollback target captured in Phase 1.
2. Stream output to a new file `RELEASE_DIR/rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md` AND append a summary to the original release report under `## Rollback`.
3. Re-run Phase 4 (smoke test) and a 5-minute mini watch window against the rolled-back version. If THAT also fails, escalate immediately — the system is in an unknown state and needs human takeover.
4. Update tracker tickets back to `Ready for Release` (or the project's pre-release status).
5. Write the final `Rolled-Back` verdict with full reason chain.
6. Auto-trigger `/retrospective --incident` with this release as the incident anchor (per `retrospective/SKILL.md` incident mode).
7. Do NOT auto-chain to anything else — the user owns the next step.
**Path C — Aborted**:
Reached only via Phase 1 Choose B, Phase 4 smoke-tests-missing escalation, or any phase that detects a precondition violation. Write `Aborted: <reason>` to the release report. Do not auto-chain.
## Self-verification
- [ ] Release report exists at `RELEASE_DIR/release_<version>_<env>_<timestamp>.md` with verdict (Released / Rolled-Back / Aborted)
- [ ] Every phase that ran has a section in the release report with timestamps and tool output
- [ ] On Released: tracker tickets moved to release status; git tag pushed (if convention)
- [ ] On Rolled-Back: rollback report exists at `RELEASE_DIR/rollback_<version>_<env>_<timestamp>.md`; tracker tickets moved back to pre-release status; incident retrospective scheduled
- [ ] On Aborted: reason recorded; no live-system changes attempted; no tracker movement
- [ ] No phase was skipped without an explicit reason recorded in the release report
## Escalation Rules
| Situation | Action |
|-----------|--------|
| `scripts/deploy.sh` missing or `--rollback` unsupported | STOP — return to `/deploy` Step 7, do not patch the script in `/release` |
| Registry auth failure during pre-release | STOP — fix credentials at infra layer (per `coderule.mdc`); do not embed creds in the script |
| Smoke tests missing for prod target | STOP — write a leftover; do not improvise smoke tests in `/release` |
| Observability backend unreachable | STOP — observability blindness is itself a release blocker |
| User asks to skip the watch window | Record override, mark verdict `Released-with-override`, fire incident retro |
| Rollback also fails its smoke test | ESCALATE to user — system is in unknown state; do not loop deploys |
| Tracker MCP returns Unauthorized during ticket movement | Per `tracker.mdc`, write a leftover entry; do NOT silently continue without confirming the move |
| Multiple environments named in user request | STOP — one release per invocation; ask user to pick one |
| Production smoke test would touch real customer data | STOP — that is a `coderule.mdc` violation; ask user to define a smoke endpoint or test account |
## Common Mistakes
- **Skipping the watch window when "everything looks fine after deploy"** — a deploy that exited 0 is not a release that's stable. Watch is mandatory.
- **Faking smoke tests** to pass the gate when the prod test set is incomplete. STOP and surface the gap; do not embed prod URLs into ad-hoc curl commands.
- **Rolling forward through a failure** ("the next deploy will fix it"). Roll back first, fix the cause, then deploy a real fix.
- **Treating the release report as optional** when only an internal tool changed. Every release writes a report — the audit trail is the value, not the prose volume.
- **Approving manual gates yourself** without the user's input when restrictions require human approval. The release skill records, the human approves.
- **Reusing `release_<version>` filenames** across attempted releases. Always include the timestamp in the filename so re-attempts are visible side-by-side.
- **Letting tracker drift silently** between release attempts. If Phase 6 cannot move tickets, the release is not complete — write a leftover and stop.
## Project Mode vs Standalone
- **Project mode** (default): autodev invokes `/release` after `/deploy`. State writes occur under `_docs/_autodev_state.md`. Full integration with retrospective and feature-cycle loop.
- **Standalone mode**: `/release` invoked directly with `@<artifact>` (rare; usually only for re-running a rollback against a specific version). All outputs still go to `RELEASE_DIR/`.
## Methodology Quick Reference
```
┌────────────────────────────────────────────────────────────────┐
│ Release (6 phases, 3 verdicts) │
├────────────────────────────────────────────────────────────────┤
│ Phase 1 Pre-Release Gate │
│ AC + tests + change summary + rollback path │
│ [BLOCKING — user A/B/C] │
│ Phase 2 Strategy Select │
│ all-at-once · blue-green · canary · manual │
│ [BLOCKING — user picks] │
│ Phase 3 Execute │
│ scripts/deploy.sh, capture exit code + logs │
│ [AUTO-ROLLBACK on non-zero or hang] │
│ Phase 4 Smoke Test │
│ /test-run --prod-smoke against target │
│ [AUTO-ROLLBACK on any failure] │
│ Phase 5 Watch Window │
│ Poll observability for N minutes │
│ [AUTO-ROLLBACK on hard breach; escalate on soft] │
│ Phase 6 Commit or Rollback │
│ Released → tracker, tag, retrospective │
│ Rolled-Back → tracker reset, incident retrospective │
│ Aborted → no live-system change │
├────────────────────────────────────────────────────────────────┤
│ Principles: real execution · verifiable rollback · │
│ quiet failure = release failure · │
│ watch window mandatory │
└────────────────────────────────────────────────────────────────┘
```
@@ -1,114 +0,0 @@
# Release Report — {version} → {env}
- **Date**: {YYYY-MM-DD HH:MM} {timezone}
- **Operator**: {user}
- **Strategy**: {all-at-once | blue-green | canary | manual}
- **Verdict**: {Released | Released-with-override | Rolled-Back | Aborted}
- **Verdict reason**: {one-line summary}
## Pre-Release Gate (Phase 1)
### Acceptance Criteria
| AC ID | Status | Evidence |
|-------|--------|----------|
| AC-001 | Met / Unmet | path:section, test report, etc. |
### Test Status
| Suite | Pass | Fail | Skip | Source |
|-------|------|------|------|--------|
| Functional | N | N | N | _docs/03_implementation/{batch}.md |
| Performance | N | N | N | _docs/06_metrics/perf_*.md |
### Change Summary
| Component | Tickets | Type |
|-----------|---------|------|
| {component} | TKT-001, TKT-002 | feature / fix / breaking / security |
### Rollback Plan
- Previous version: `{previous-version}` (registry digest: `{sha}`)
- Rollback script: `scripts/deploy.sh --rollback`
- Rollback target verified pullable: yes / no
- Rollback target verified bootable in target env: yes / no
### Restrictions / Approvals
- Change-window restrictions: {none | description}
- Manual approvals required: {none | reference to approval file}
### Tracker State at Gate
- Tickets in scope: {N}
- Tickets blocking release: {0 — list any}
## Strategy Select (Phase 2)
- Recommended: {strategy} — reasoning
- Chosen: {strategy} — reasoning (if differs from recommended)
## Execute (Phase 3)
- Start: {timestamp}
- End: {timestamp}
- Exit code: {0 / non-zero}
```
<scripts/deploy.sh stdout/stderr stream, with timestamps>
```
## Smoke Test (Phase 4)
- Mode: {/test-run --prod-smoke | manual smoke set}
- Start: {timestamp}
- End: {timestamp}
| Test | Result | Notes |
|------|--------|-------|
| {name} | Pass / Fail | response time, status, etc. |
## Watch Window (Phase 5)
- Duration: {minutes}
- Cadence: {minutes per poll}
- Backend: {observability source — Prometheus, CloudWatch, Datadog, etc.}
| T+min | error_rate | rps | p99_latency | saturation | health | notes |
|-------|------------|-----|-------------|------------|--------|-------|
| 0 | … | … | … | … | OK | … |
| 1 | … | … | … | … | OK | … |
| … | … | … | … | … | … | … |
### Threshold breaches
- {None | "p99 latency 1.7× baseline at T+8 — soft breach, user accepted continuation"}
## Commit or Rollback (Phase 6)
### If Released
- Tracker tickets moved: {list}
- Git tag pushed: {tag} → {sha}
- Retrospective scheduled: yes — {/retrospective --cycle-end output path}
### If Rolled-Back
- Trigger: {auto / user-elected}
- Reason: {phase + one-line cause}
- Rollback start: {timestamp}
- Rollback end: {timestamp}
- Post-rollback smoke: pass / fail
- Tracker tickets moved back: {list}
- Incident retrospective scheduled: yes — {/retrospective --incident output path}
### If Aborted
- Phase that aborted: {1 / 2 / 3 / 4 / 5}
- Reason: {one-line cause}
- No live-system changes attempted: yes / no (if live changes, document under Phase 3 above and treat as Rolled-Back instead)
## Lessons (one-liners; full incident retro if Rolled-Back / Released-with-override)
- {Optional: short one-liner observations the operator wants the next /retrospective to consider}
+3 -21
View File
@@ -30,27 +30,6 @@ Transform vague topics raised by users into high-quality, deliverable research r
- **Internet-first investigation** — do not rely on training data for factual claims; search the web extensively for every sub-question, rephrase queries when results are thin, and keep searching until you have converging evidence from multiple independent sources
- **Multi-perspective analysis** — examine every problem from at least 3 different viewpoints (e.g., end-user, implementer, business decision-maker, contrarian, domain expert, field practitioner); each perspective should generate its own search queries
- **Question multiplication** — for each sub-question, generate multiple reformulated search queries (synonyms, related terms, negations, "what can go wrong" variants, practitioner-focused variants) to maximize coverage and uncover blind spots
- **Component option breadth** — for every component area, build a broad option landscape before selecting. Search direct candidates, adjacent-domain alternatives, commercial/open-source variants, classical/simple baselines, current SOTA, and "do not use" failure cases. A component may not be narrowed to one candidate until alternatives have been searched and rejected with evidence.
- **Component research depth** — for every serious component candidate, go beyond discovery pages. Read official docs, repository/license files, issue discussions, benchmarks, deployment guides, version/platform requirements, security notes, maintenance signals, and real-world failure reports. Extract evidence for inputs/outputs, lifecycle assumptions, runtime/storage/latency fit, integration boundaries, licensing, operational risks, and unsupported scenarios before assigning any selection status.
- **Exact-fit component selection** — never select a component, tool, library, service, architecture pattern, or algorithm merely because it solves a similar class of problem. It must be proven compatible with the project's explicit operating context, constraints, required inputs/outputs, non-functional requirements, lifecycle assumptions, and acceptance criteria. If fit is unproven or mismatched, mark it `Rejected`, `Experimental only`, or escalate for user decision before it can shape the solution.
- **Per-mode API capability verification** *(applies only to technical-component selection — see Research Output Class below)* — when a candidate library/SDK/framework/service exposes multiple modes or configurations, *the candidate is not a single thing*. Pin the exact mode the project will use (one explicit sentence: inputs, outputs, runtime), and verify *that mode* against the project's required inputs/outputs via official docs (mandatory `context7` lookup) plus a saved Minimum Viable Example. Capability claims at the category level ("supports X, Y, Z modes") must be cross-checked against the literal mode enumeration before being treated as project-applicable. Two modes of one library are two distinct candidates for the purposes of the Component Applicability Gate. Does not apply to non-technical research (concept comparison, market/policy investigation, knowledge organization, etc.).
## Research Output Class (BLOCKING — set in Step 1)
Before applying any of the technical-component gates (per-mode API capability verification, Component Applicability Gate, Restrictions × Candidate-Mode sub-matrix, MVE evidence, mandatory `context7` lookup), classify the research output into one of two classes. Record the decision in `00_question_decomposition.md` once, near the top, so every downstream step honors it.
| Class | What the output recommends or selects | Examples | Technical-component gates apply? |
|-------|---------------------------------------|----------|----------------------------------|
| **Technical-component selection** | One or more libraries, SDKs, frameworks, services, protocols, data formats, infrastructure patterns, algorithms, or APIs that will be implemented or operated against | "Pick a vector database", "Compare auth-token strategies for our API", "Should we use Kafka or RabbitMQ?", architecture / tech-stack / migration drafts (Mode A, Mode B) | **Yes — all gates active** |
| **Non-technical investigation** | Concept comparisons, knowledge organization, root-cause investigation of an event, market/policy/regulatory/social analysis, literature review, decision support without committing to specific tooling | "Why did adoption stall in Q3?", "Compare phenomenology vs constructivism", "Map regulatory landscape for X", "What do practitioners say about onboarding under remote-first orgs?" | **No — skip API/MVE/sub-matrix gates; the rest of the 8-step engine still applies** |
How to decide:
1. Inspect the question and the input files (`problem.md`, `restrictions.md`, `acceptance_criteria.md`, or the standalone input file).
2. If the deliverable will name specific software/services/protocols that someone will then build with or operate, it is **Technical-component selection**.
3. If the deliverable is a report, comparison, or recommendation that does not commit to specific tooling, it is **Non-technical investigation**.
4. **Mixed runs are valid.** Some research questions have a non-technical core but include one technical sub-question (or vice versa). In that case classify per component area within the run, not the run as a whole, and note in `00_question_decomposition.md` which component areas trigger the technical-component gates.
When the run is purely **Non-technical investigation**, the rest of the research engine — question decomposition, perspective rotation, exhaustive web search, fact extraction, comparison framework, reasoning chain, validation, deliverable formatting — still applies in full. The sections that get skipped are explicitly the technical gates listed in the table above.
## Context Resolution
@@ -133,6 +112,9 @@ When the user wants to:
- Assess or improve an existing solution draft
**Differentiation from other Skills**:
- Needs a **visual knowledge graph** → use `research-to-diagram`
- Needs **written output** (articles/tutorials) → use `wsy-writer`
- Needs **material organization** → use `material-to-markdown`
- Needs **research + solution draft** → use this Skill
## Stakeholder Perspectives
@@ -32,17 +32,3 @@
6. Applicable scenarios
7. Team capability requirements
8. Migration difficulty
## Decomposition Completeness Probes (Completeness Audit Reference)
Used during Step 1's Decomposition Completeness Audit. After generating sub-questions, ask each probe against the current decomposition. If a probe reveals an uncovered area, add a sub-question for it.
| Probe | What it catches |
|-------|-----------------|
| **What does this cost — in money, time, resources, or trade-offs?** | Budget, pricing, licensing, tax, opportunity cost, maintenance burden |
| **What are the hard constraints — physical, legal, regulatory, environmental?** | Regulations, certifications, spectrum/frequency rules, export controls, physics limits, IP restrictions |
| **What are the dependencies and assumptions that could break?** | Supply chain, vendor lock-in, API stability, single points of failure, standards evolution |
| **What does the operating environment actually look like?** | Terrain, weather, connectivity, infrastructure, power, latency, user skill level |
| **What failure modes exist and what happens when they trigger?** | Degraded operation, fallback, safety margins, blast radius, recovery time |
| **What do practitioners who solved similar problems say matters most?** | Field-tested priorities that don't appear in specs or papers |
| **What changes over time — and what looks stable now but isn't?** | Technology roadmaps, regulatory shifts, deprecation risk, scaling effects |
@@ -10,12 +10,6 @@
- [ ] Every citation can be directly verified by the user (source verifiability)
- [ ] Structure hierarchy is clear; executives can quickly locate information
## Decomposition Completeness
- [ ] Domain discovery search executed: searched "key factors when [problem domain]" before starting research
- [ ] Completeness probes applied: every probe from `references/comparison-frameworks.md` checked against sub-questions
- [ ] No uncovered areas remain: all gaps filled with sub-questions or justified as not applicable
## Internet Search Depth
- [ ] Every sub-question was searched with at least 3-5 different query variants
@@ -27,26 +21,13 @@
- [ ] Iterative deepening completed: follow-up questions from initial findings were searched
- [ ] No sub-question relies solely on training data without web verification
## Component Option Breadth
- [ ] `00_question_decomposition.md` contains a Component Option Search Plan
- [ ] Every component area was searched across simple baseline, established production, open-source, commercial/vendor, current SOTA, adjacent-domain, no-build/defer, and known-bad options where applicable
- [ ] Every component area has at least 3 realistic candidates, or a documented explanation of why broad searches found fewer
- [ ] Each lead candidate has official/source-of-truth evidence plus independent validation when available
- [ ] Each component area includes at least one baseline/fallback option and at least one rejected or experimental option when possible
- [ ] Alternative names, synonyms, and neighboring-domain terms were searched before declaring the option landscape complete
- [ ] Licensing, runtime, platform, maintenance, and unsupported-scenario searches were performed for every lead, fallback, and rejected candidate
## Mode A Specific
- [ ] Phase 1 completed: AC assessment was presented to and confirmed by user
- [ ] AC assessment consistent: Solution draft respects the (possibly adjusted) acceptance criteria and restrictions
- [ ] Competitor analysis included: Existing solutions were researched
- [ ] All components have comparison tables: Each component lists alternatives with tools, advantages, limitations, security, cost
- [ ] Component options are broad: component tables include baseline, production, open-source, commercial/vendor, SOTA/research, adjacent-domain, defer/no-build, and disqualified options where applicable
- [ ] Tools/libraries verified: Suggested tools actually exist and work as described
- [ ] Component fit matrix completed: `06_component_fit_matrix.md` (or `06_component_fit_matrix/` if split) exists and every selected component/tool/pattern is marked `Selected`
- [ ] No field-adjacent substitution: no selected candidate is chosen only because it solves a similar class of problem while failing the project's explicit constraints
- [ ] Testing strategy covers AC: Tests map to acceptance criteria
- [ ] Tech stack documented (if Phase 3 ran): `tech_stack.md` has evaluation tables, risk assessment, and learning requirements
- [ ] Security analysis documented (if Phase 4 ran): `security_analysis.md` has threat model and per-component controls
@@ -58,9 +39,6 @@
- [ ] New draft is self-contained: Written as if from scratch, no "updated" markers
- [ ] Performance column included: Mode B comparison tables include performance characteristics
- [ ] Previous draft issues addressed: Every finding in the table is resolved in the new draft
- [ ] Existing selected components were challenged against a broad alternative landscape before being kept
- [ ] Existing component fit audited: every old and new component/tool/pattern was checked against `restrictions.md`, `acceptance_criteria.md`, and the Project Constraint Matrix
- [ ] Rejected/experimental candidates are not lead recommendations unless the user explicitly accepted the risk
## Timeliness Check (High-Sensitivity Domain BLOCKING)
@@ -80,7 +58,7 @@ When the research topic has Critical or High sensitivity level:
## Target Audience Consistency Check (BLOCKING)
- [ ] Research boundary clearly defined: `00_question_decomposition.md` has clear population/geography/timeframe/level boundaries
- [ ] Every source has target audience annotated in `01_source_registry.md` (or category files under `01_source_registry/` if split)
- [ ] Every source has target audience annotated in `01_source_registry.md`
- [ ] Mismatched sources properly handled (excluded, annotated, or marked reference-only)
- [ ] No audience confusion in fact cards: Every fact has target audience consistent with research boundary
- [ ] No audience confusion in the report: Policies/research/data cited have consistent target audiences
@@ -92,33 +70,3 @@ When the research topic has Critical or High sensitivity level:
- [ ] Cited facts have corresponding statements in the original text (no over-interpretation)
- [ ] Source publication/update dates annotated; technical docs include version numbers
- [ ] Unverifiable information annotated `[limited source]` and not sole support for core conclusions
## Exact-Fit Validation (BLOCKING)
- [ ] Project Constraint Matrix extracted from problem context before component selection
- [ ] Component fit matrix includes `Component Area`, `Option Family`, and `Pinned Mode/Config` columns
- [ ] Every selected component/tool/library/service/pattern/algorithm has evidence for required inputs/outputs and integration boundaries
- [ ] Every selected candidate has evidence for the operating context and lifecycle assumptions it must support
- [ ] Every selected candidate has evidence for non-functional targets that are binding for the project
- [ ] Known unsupported scenarios and failure reports were searched for every selected candidate
- [ ] Mismatches are recorded as disqualifiers, not softened into generic limitations
- [ ] Any candidate with unproven fit is marked `Experimental only` or escalated for user decision
- [ ] Any candidate with documented constraint conflict is marked `Rejected`
## API Capability Verification (BLOCKING)
**Applicability**: this checklist applies only when the run is classified as **Technical-component selection** (see SKILL.md → Research Output Class). For non-technical research (concept comparison, market/policy investigation, root-cause analysis, knowledge organization), skip this checklist entirely and note the skip in `05_validation_log.md`. For mixed runs, apply only to technical component areas.
For every lead candidate that is a library/SDK/framework/service:
- [ ] The exact mode/configuration the project will use is pinned in one explicit sentence (inputs, outputs, runtime); no vague "supports X" language
- [ ] `context7` (or equivalent docs lookup) was run for the candidate, with at least 3 queries: mode enumeration, project's exact mode, disqualifier probe
- [ ] All consulted URLs from context7 / official docs are appended to `01_source_registry.md` (or files under `01_source_registry/` if split)
- [ ] A Minimum Viable Example (MVE) was saved for the pinned mode in `02_fact_cards.md` / `02_fact_cards/` (or `02_mve_evidence.md`) with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌
- [ ] When the MVE inputs or outputs do not exactly match the project's, the mismatch is cited from the official docs (not inferred), and the candidate is `Experimental only` or `Rejected`
- [ ] When a library has multiple modes, each project-relevant mode appears as its own candidate row (not a single library row that softens across modes)
- [ ] Restrictions × Candidate-Modes sub-matrix in `06_component_fit_matrix.md` (or files under `06_component_fit_matrix/` if split) is filled for every lead candidate, with one row per numbered restriction and per numbered acceptance criterion
- [ ] Sub-matrix uses ✅ / ❌ / ❓ / N/A only — no free-form prose substitutes
- [ ] No `Selected` candidate has any ❌ or ❓ cell in its sub-matrix
- [ ] "Validation gate required" footnotes are explicitly classified as either *API capability* (must be resolved here) or *runtime quality* (may be carried forward)
- [ ] Paraphrased capability claims in fact cards have been cross-checked against the literal mode-enumeration evidence (no `mono, inertial → mono-inertial` style conflation)
@@ -89,7 +89,7 @@ Value Translation:
## Source Registry Entry Template
For each source consulted, immediately append to `01_source_registry.md` (or the appropriate category file under `01_source_registry/` if the artifact has been split — see splittable-artifacts convention in `steps/00_project-integration.md`):
For each source consulted, immediately append to `01_source_registry.md`:
```markdown
## Source #[number]
- **Title**: [source title]
@@ -57,49 +57,22 @@ RESEARCH_DIR/
├── 03_comparison_framework.md # Step 4 output: selected framework and populated data
├── 04_reasoning_chain.md # Step 6 output: fact → conclusion reasoning
├── 05_validation_log.md # Step 7 output: use-case validation results
├── 06_component_fit_matrix.md # Step 7.5 output: component exact-fit gate
└── raw/ # Raw source archive (optional)
├── source_1.md
└── source_2.md
```
#### Splittable artifacts — Layout convention
The following three artifacts MAY equivalently be a **folder** of the same base name when the single-file form has grown unwieldy (typically ≳ 1000 lines or ≳ 200 KB):
- `01_source_registry.md``01_source_registry/`
- `02_fact_cards.md``02_fact_cards/`
- `06_component_fit_matrix.md``06_component_fit_matrix/`
When using the folder form:
- Place a `00_summary.md` index file at the folder root with a short common summary table and the cross-cutting status the single-file form would have carried in its preamble.
- Split per-entry content into category files (e.g. one file per sub-question or per component): `SQ1_*.md`, `C1_*.md`, etc. Keep entry numbering global across the folder so cross-references like "Source #42" still resolve to exactly one place.
- Cross-references from outside the folder may point at either `01_source_registry/00_summary.md` (for the index) or directly at the relevant category file.
```
RESEARCH_DIR/01_source_registry/ # split form (when single-file is too large)
├── 00_summary.md # index + investigation status + compact source table
├── SQ1_existing_systems.md # category file
├── SQ2_canonical_pipeline.md # category file
├── C1_vio.md # per-component file
└── ...
```
Throughout the rest of this skill (other steps, references, templates), the singular `XX.md` form is used as a logical name; treat each occurrence as applying equally to the folder form when the artifact has been split.
### Save Timing & Content
| Step | Save immediately after completion | Filename |
|------|-----------------------------------|----------|
| Mode A Phase 1 | AC & restrictions assessment tables | `00_ac_assessment.md` |
| Step 0-1 | Question type classification + sub-question list | `00_question_decomposition.md` |
| Step 2 | Each consulted source link, tier, summary | `01_source_registry.md` *(splittable, see convention)* |
| Step 3 | Each fact card (statement + source + confidence) | `02_fact_cards.md` *(splittable, see convention)* |
| Step 2 | Each consulted source link, tier, summary | `01_source_registry.md` |
| Step 3 | Each fact card (statement + source + confidence) | `02_fact_cards.md` |
| Step 4 | Selected comparison framework + initial population | `03_comparison_framework.md` |
| Step 6 | Reasoning process for each dimension | `04_reasoning_chain.md` |
| Step 7 | Validation scenarios + results + review checklist | `05_validation_log.md` |
| Step 7.5 | Component exact-fit gate and selection status | `06_component_fit_matrix.md` *(splittable, see convention)* |
| Step 8 | Complete solution draft | `OUTPUT_DIR/solution_draft##.md` |
### Save Principles
@@ -117,12 +90,11 @@ Throughout the rest of this skill (other steps, references, templates), the sing
|------|---------|----------------|
| `00_ac_assessment.md` | AC & restrictions assessment (Mode A only) | After Phase 1 completion |
| `00_question_decomposition.md` | Question type, sub-question list | After Step 0-1 completion |
| `01_source_registry.md` *(splittable)* | All source links and summaries | Continuously updated during Step 2 |
| `02_fact_cards.md` *(splittable)* | Extracted facts and sources | Continuously updated during Step 3 |
| `01_source_registry.md` | All source links and summaries | Continuously updated during Step 2 |
| `02_fact_cards.md` | Extracted facts and sources | Continuously updated during Step 3 |
| `03_comparison_framework.md` | Selected framework and populated data | After Step 4 completion |
| `04_reasoning_chain.md` | Fact → conclusion reasoning | After Step 6 completion |
| `05_validation_log.md` | Use-case validation and review | After Step 7 completion |
| `06_component_fit_matrix.md` *(splittable)* | Exact-fit matrix for every proposed component/tool/pattern with status `Selected` / `Rejected` / `Experimental only` / `Needs user decision` | Before Step 8 deliverable formatting |
| `OUTPUT_DIR/solution_draft##.md` | Complete solution draft | After Step 8 completion |
| `OUTPUT_DIR/tech_stack.md` | Tech stack evaluation and decisions | After Phase 3 (optional) |
| `OUTPUT_DIR/security_analysis.md` | Threat model and security controls | After Phase 4 (optional) |

Some files were not shown because too many files have changed in this diff Show More