diff --git a/.cursor/skills/autopilot/SKILL.md b/.cursor/skills/autopilot/SKILL.md index db02045..8cec5a5 100644 --- a/.cursor/skills/autopilot/SKILL.md +++ b/.cursor/skills/autopilot/SKILL.md @@ -17,6 +17,17 @@ disable-model-invocation: true Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autopilot` once — the engine handles sequencing, transitions, and re-entry. +## File Index + +| File | Purpose | +|------|---------| +| `flows/greenfield.md` | Detection rules, step table, and auto-chain rules for new projects | +| `flows/existing-code.md` | Detection rules, step table, and auto-chain rules for existing codebases | +| `state.md` | State file format, rules, re-entry protocol, session boundaries | +| `protocols.md` | User interaction, Jira MCP auth, choice format, error handling, status summary | + +**On every invocation**: read all four files above before executing any logic. + ## Core Principles - **Auto-chain**: when a skill completes, immediately start the next one — no pause between skills @@ -24,250 +35,57 @@ Auto-chaining execution engine that drives the full BUILD → SHIP workflow. 
Det - **State from disk**: all progress is persisted to `_docs/_autopilot_state.md` and cross-checked against `_docs/` folder structure - **Rich re-entry**: on every invocation, read the state file for full context before continuing - **Delegate, don't duplicate**: read and execute each sub-skill's SKILL.md; never inline their logic here +- **Sound on pause**: follow `.cursor/rules/human-attention-sound.mdc` — play a notification sound before every pause that requires human input +- **Minimize interruptions**: only ask the user when the decision genuinely cannot be resolved automatically +- **Single project per workspace**: all `_docs/` paths are relative to workspace root; for monorepos, each service needs its own Cursor workspace -## State File: `_docs/_autopilot_state.md` +## Flow Resolution -The autopilot persists its state to `_docs/_autopilot_state.md`. This file is the primary source of truth for re-entry. Folder scanning is the fallback when the state file doesn't exist. +Determine which flow to use: -### Format +1. If workspace has source code files **and** `_docs/` does not exist → **existing-code flow** (Pre-Step detection) +2. If `_docs/_autopilot_state.md` exists and records Document in `Completed Steps` → **existing-code flow** +3. If `_docs/_autopilot_state.md` exists and `step: done` AND workspace contains source code → **existing-code flow** (completed project re-entry — loops to New Task) +4. Otherwise → **greenfield flow** -```markdown -# Autopilot State +After selecting the flow, apply its detection rules (first match wins) to determine the current step. -## Current Step -step: [0-5 or "done"] -name: [Problem / Research / Plan / Decompose / Implement / Deploy / Done] -status: [not_started / in_progress / completed] -sub_step: [optional — sub-skill phase if interrupted mid-step, e.g. 
"Plan Step 3: Component Decomposition"] +## Execution Loop -## Completed Steps - -| Step | Name | Completed | Key Outcome | -|------|------|-----------|-------------| -| 0 | Problem | [date] | [one-line summary] | -| 1 | Research | [date] | [N drafts, final approach summary] | -| 2 | Plan | [date] | [N components, architecture summary] | -| 3 | Decompose | [date] | [N tasks, total complexity points] | -| 4 | Implement | [date] | [N batches, pass/fail summary] | -| 5 | Deploy | [date] | [artifacts produced] | - -## Key Decisions -- [decision 1: e.g. "Tech stack: Python + Rust for perf-critical, Postgres DB"] -- [decision 2: e.g. "6 research rounds, final draft: solution_draft06.md"] -- [decision N] - -## Last Session -date: [date] -ended_at: [step name and phase] -reason: [completed step / session boundary / user paused / context limit] -notes: [any context for next session, e.g. "User asked to revisit risk assessment"] - -## Blockers -- [blocker 1, if any] -- [none] -``` - -### State File Rules - -1. **Create** the state file on the very first autopilot invocation (after state detection determines Step 0) -2. **Update** the state file after every step completion, every session boundary, and every BLOCKING gate confirmation -3. **Read** the state file as the first action on every invocation — before folder scanning -4. **Cross-check**: after reading the state file, verify against actual `_docs/` folder contents. If they disagree (e.g., state file says Step 2 but `_docs/02_plans/architecture.md` already exists), trust the folder structure and update the state file to match -5. **Never delete** the state file. It accumulates history across the entire project lifecycle - -## Execution Entry Point - -Every invocation of this skill follows the same sequence: +Every invocation follows this sequence: ``` 1. Read _docs/_autopilot_state.md (if exists) -2. Cross-check state file against _docs/ folder structure -3. Resolve current step (state file + folder scan) -4. 
Present Status Summary (from state file context) -5. Enter Execution Loop: - a. Read and execute the current skill's SKILL.md - b. When skill completes → update state file - c. Re-detect next step - d. If next skill is ready → auto-chain (go to 5a with next skill) - e. If session boundary reached → update state file with session notes → suggest new conversation - f. If all steps done → update state file → report completion +2. Read all File Index files above +3. Cross-check state file against _docs/ folder structure (rules in state.md) +4. Resolve flow (see Flow Resolution above) +5. Resolve current step (detection rules from the active flow file) +6. Present Status Summary (template in active flow file) +7. Execute: + a. Delegate to current skill (see Skill Delegation below) + b. If skill returns FAILED → apply Skill Failure Retry Protocol (see protocols.md): + - Auto-retry the same skill (failure may be caused by missing user input or environment issue) + - If 3 consecutive auto-retries fail → record in state file Blockers, warn user, stop auto-retry + c. When skill completes successfully → reset retry counter, update state file (rules in state.md) + d. Re-detect next step from the active flow's detection rules + e. If next skill is ready → auto-chain (go to 7a with next skill) + f. If session boundary reached → update state, suggest new conversation (rules in state.md) + g. If all steps done → update state → report completion ``` -## State Detection - -Read `_docs/_autopilot_state.md` first. If it exists and is consistent with the folder structure, use the `Current Step` from the state file. If the state file doesn't exist or is inconsistent, fall back to folder scanning. - -### Folder Scan Rules (fallback) - -Scan `_docs/` to determine the current workflow position. Check rules in order — first match wins. 
- -### Detection Rules - -**Step 0 — Problem Gathering** -Condition: `_docs/00_problem/` does not exist, OR any of these are missing/empty: -- `problem.md` -- `restrictions.md` -- `acceptance_criteria.md` -- `input_data/` (must contain at least one file) - -Action: Read and execute `.cursor/skills/problem/SKILL.md` - ---- - -**Step 1 — Research (Initial)** -Condition: `_docs/00_problem/` is complete AND `_docs/01_solution/` has no `solution_draft*.md` files - -Action: Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode A) - ---- - -**Step 1b — Research Decision** -Condition: `_docs/01_solution/` contains `solution_draft*.md` files AND `_docs/01_solution/solution.md` does not exist AND `_docs/02_plans/architecture.md` does not exist - -Action: Present the current research state to the user: -- How many solution drafts exist -- Whether tech_stack.md and security_analysis.md exist -- One-line summary from the latest draft - -Then ask: **"Run another research round (Mode B assessment), or proceed to planning?"** -- If user wants another round → Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode B) -- If user wants to proceed → auto-chain to Step 2 (Plan) - ---- - -**Step 2 — Plan** -Condition: `_docs/01_solution/` has `solution_draft*.md` files AND `_docs/02_plans/architecture.md` does not exist - -Action: -1. The plan skill's Prereq 2 will rename the latest draft to `solution.md` — this is handled by the plan skill itself -2. Read and execute `.cursor/skills/plan/SKILL.md` - -If `_docs/02_plans/` exists but is incomplete (has some artifacts but no `FINAL_report.md`), the plan skill's built-in resumability handles it. 
- ---- - -**Step 3 — Decompose** -Condition: `_docs/02_plans/` contains `architecture.md` AND `_docs/02_plans/components/` has at least one component AND `_docs/02_tasks/` does not exist or has no task files (excluding `_dependencies_table.md`) - -Action: Read and execute `.cursor/skills/decompose/SKILL.md` - -If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it. - ---- - -**Step 4 — Implement** -Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist - -Action: Read and execute `.cursor/skills/implement/SKILL.md` - -If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. - ---- - -**Step 5 — Deploy** -Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND `_docs/04_deploy/` does not exist or is incomplete - -Action: Read and execute `.cursor/skills/deploy/SKILL.md` - ---- - -**Done** -Condition: `_docs/04_deploy/` contains all expected artifacts (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md) - -Action: Report project completion with summary. - -## Status Summary - -On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback). 
- -Format: - -``` -═══════════════════════════════════════════════════ - AUTOPILOT STATUS -═══════════════════════════════════════════════════ - Step 0 Problem [DONE / IN PROGRESS / NOT STARTED] - Step 1 Research [DONE (N drafts) / IN PROGRESS / NOT STARTED] - Step 2 Plan [DONE / IN PROGRESS / NOT STARTED] - Step 3 Decompose [DONE (N tasks) / IN PROGRESS / NOT STARTED] - Step 4 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED] - Step 5 Deploy [DONE / IN PROGRESS / NOT STARTED] -═══════════════════════════════════════════════════ - Current step: [Step N — Name] - Action: [what will happen next] -═══════════════════════════════════════════════════ -``` - -For re-entry (state file exists), also include: -- Key decisions from the state file's `Key Decisions` section -- Last session context from the `Last Session` section -- Any blockers from the `Blockers` section - -## Auto-Chain Rules - -After a skill completes, apply these rules: - -| Completed Step | Next Action | -|---------------|-------------| -| Problem Gathering | Auto-chain → Research (Mode A) | -| Research (any round) | Auto-chain → Research Decision (ask user: another round or proceed?) | -| Research Decision → proceed | Auto-chain → Plan | -| Plan | Auto-chain → Decompose | -| Decompose | **Session boundary** — suggest new conversation before Implement | -| Implement | Auto-chain → Deploy | -| Deploy | Report completion | - -### Session Boundary: Decompose → Implement - -After decompose completes, **do not auto-chain to implement**. Instead: - -1. Update state file: mark Decompose as completed, set current step to 4 (Implement) with status `not_started` -2. Write `Last Session` section: `reason: session boundary`, `notes: Decompose complete, implementation ready` -3. Present a summary: number of tasks, estimated batches, total complexity points -4. Suggest: "Implementation is the longest phase and benefits from a fresh conversation context. 
Start a new conversation and type `/autopilot` to begin implementation." -5. If the user insists on continuing in the same conversation, proceed. - -This is the only hard session boundary. All other transitions auto-chain. - ## Skill Delegation For each step, the delegation pattern is: -1. Update state file: set current step to `in_progress`, record `sub_step` if applicable +1. Update state file: set `step` to the autopilot step number, status to `in_progress`, set `sub_step` to the sub-skill's current internal step/phase, reset `retry_count: 0` 2. Announce: "Starting [Skill Name]..." 3. Read the skill file: `.cursor/skills/[name]/SKILL.md` -4. Execute the skill's workflow exactly as written, including: - - All BLOCKING gates (present to user, wait for confirmation) - - All self-verification checklists - - All save actions - - All escalation rules -5. When the skill's workflow is fully complete: - - Update state file: mark step as `completed`, record date, write one-line key outcome - - Add any key decisions made during this step to the `Key Decisions` section - - Return to the auto-chain rules +4. Execute the skill's workflow exactly as written, including all BLOCKING gates, self-verification checklists, save actions, and escalation rules. Update `sub_step` in state each time the sub-skill advances. +5. If the skill **fails**: follow the Skill Failure Retry Protocol in `protocols.md` — increment `retry_count`, auto-retry up to 3 times, then escalate. +6. When complete (success): reset `retry_count: 0`, mark step `completed`, record date + key outcome, add key decisions to state file, return to auto-chain rules (from active flow file) Do NOT modify, skip, or abbreviate any part of the sub-skill's workflow. The autopilot is a sequencer, not an optimizer. -## Re-Entry Protocol - -When the user invokes `/autopilot` and work already exists: - -1. Read `_docs/_autopilot_state.md` -2. Cross-check against `_docs/` folder structure -3. 
Present Status Summary with context from state file (key decisions, last session, blockers) -4. If the detected step has a sub-skill with built-in resumability (plan, decompose, implement, deploy all do), the sub-skill handles mid-step recovery -5. Continue execution from detected state - -## Error Handling - -| Situation | Action | -|-----------|--------| -| State detection is ambiguous (artifacts suggest two different steps) | Present findings to user, ask which step to execute | -| Sub-skill fails or hits an unrecoverable blocker | Report the error, suggest the user fix it manually, then re-invoke `/autopilot` | -| User wants to skip a step | Warn about downstream dependencies, proceed if user confirms | -| User wants to go back to a previous step | Warn that re-running may overwrite artifacts, proceed if user confirms | -| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution | - ## Trigger Conditions This skill activates when the user wants to: @@ -281,41 +99,9 @@ This skill activates when the user wants to: **Differentiation**: - User wants only research → use `/research` directly - User wants only planning → use `/plan` directly +- User wants to document an existing codebase → use `/document` directly - User wants the full guided workflow → use `/autopilot` -## Methodology Quick Reference +## Flow Reference -``` -┌────────────────────────────────────────────────────────────────┐ -│ Autopilot (Auto-Chain Orchestrator) │ -├────────────────────────────────────────────────────────────────┤ -│ EVERY INVOCATION: │ -│ 1. State Detection (scan _docs/) │ -│ 2. Status Summary (show progress) │ -│ 3. Execute current skill │ -│ 4. Auto-chain to next skill (loop) │ -│ │ -│ WORKFLOW: │ -│ Step 0 Problem → .cursor/skills/problem/SKILL.md │ -│ ↓ auto-chain │ -│ Step 1 Research → .cursor/skills/research/SKILL.md │ -│ ↓ auto-chain (ask: another round?) 
│ -│ Step 2 Plan → .cursor/skills/plan/SKILL.md │ -│ ↓ auto-chain │ -│ Step 3 Decompose → .cursor/skills/decompose/SKILL.md │ -│ ↓ SESSION BOUNDARY (suggest new conversation) │ -│ Step 4 Implement → .cursor/skills/implement/SKILL.md │ -│ ↓ auto-chain │ -│ Step 5 Deploy → .cursor/skills/deploy/SKILL.md │ -│ ↓ │ -│ DONE │ -│ │ -│ STATE FILE: _docs/_autopilot_state.md │ -│ FALLBACK: _docs/ folder structure scan │ -│ PAUSE POINTS: sub-skill BLOCKING gates only │ -│ SESSION BREAK: after Decompose (before Implement) │ -├────────────────────────────────────────────────────────────────┤ -│ Principles: Auto-chain · State to file · Rich re-entry │ -│ Delegate don't duplicate · Pause at decisions only │ -└────────────────────────────────────────────────────────────────┘ -``` +See `flows/greenfield.md` and `flows/existing-code.md` for step tables, detection rules, auto-chain rules, and status summary templates. diff --git a/.cursor/skills/autopilot/flows/existing-code.md b/.cursor/skills/autopilot/flows/existing-code.md new file mode 100644 index 0000000..ff31c36 --- /dev/null +++ b/.cursor/skills/autopilot/flows/existing-code.md @@ -0,0 +1,234 @@ +# Existing Code Workflow + +Workflow for projects with an existing codebase. Starts with documentation, produces test specs, decomposes and implements tests, verifies them, refactors with that safety net, then adds new functionality and deploys. 
+ +## Step Reference Table + +| Step | Name | Sub-Skill | Internal SubSteps | +|------|------|-----------|-------------------| +| 1 | Document | document/SKILL.md | Steps 1–8 | +| 2 | Test Spec | test-spec/SKILL.md | Phase 1a–1b | +| 3 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 | +| 4 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) | +| 5 | Run Tests | test-run/SKILL.md | Steps 1–4 | +| 6 | Refactor | refactor/SKILL.md | Phases 0–5 (6-phase method) | +| 7 | New Task | new-task/SKILL.md | Steps 1–8 (loop) | +| 8 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) | +| 9 | Run Tests | test-run/SKILL.md | Steps 1–4 | +| 10 | Security Audit | security/SKILL.md | Phase 1–5 (optional) | +| 11 | Performance Test | (autopilot-managed) | Load/stress tests (optional) | +| 12 | Deploy | deploy/SKILL.md | Step 1–7 | + +After Step 12, the existing-code workflow is complete. + +## Detection Rules + +Check rules in order — first match wins. + +--- + +**Step 1 — Document** +Condition: `_docs/` does not exist AND the workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`, `src/`, `Cargo.toml`, `*.csproj`, `package.json`) + +Action: An existing codebase without documentation was detected. Read and execute `.cursor/skills/document/SKILL.md`. After the document skill completes, re-detect state (the produced `_docs/` artifacts will place the project at Step 2 or later). + +--- + +**Step 2 — Test Spec** +Condition: `_docs/02_document/FINAL_report.md` exists AND workspace contains source code files (e.g., `*.py`, `*.cs`, `*.rs`, `*.ts`) AND `_docs/02_document/tests/traceability-matrix.md` does not exist AND the autopilot state shows Document was run (check `Completed Steps` for "Document" entry) + +Action: Read and execute `.cursor/skills/test-spec/SKILL.md` + +This step applies when the codebase was documented via the `/document` skill. 
Test specifications must be produced before refactoring or further development. + +--- + +**Step 3 — Decompose Tests** +Condition: `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND the autopilot state shows Document was run AND (`_docs/02_tasks/` does not exist or has no task files) + +Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will: +1. Run Step 1t (test infrastructure bootstrap) +2. Run Step 3 (blackbox test task decomposition) +3. Run Step 4 (cross-verification against test coverage) + +If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it. + +--- + +**Step 4 — Implement Tests** +Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND the autopilot state shows Step 3 (Decompose Tests) is completed AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist + +Action: Read and execute `.cursor/skills/implement/SKILL.md` + +The implement skill reads test tasks from `_docs/02_tasks/` and implements them. + +If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. + +--- + +**Step 5 — Run Tests** +Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state shows Step 4 (Implement Tests) is completed AND the autopilot state does NOT show Step 5 (Run Tests) as completed + +Action: Read and execute `.cursor/skills/test-run/SKILL.md` + +Verifies the implemented test suite passes before proceeding to refactoring. The tests form the safety net for all subsequent code changes. 
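
The Step 1–5 conditions above are ordinary first-match-wins checks over `_docs/` artifacts and the state file. A minimal Python sketch of that rule shape — assuming the filesystem is summarized as a set of paths relative to `_docs/` and completed step names come from the state file's Completed Steps table (the function and helper names here are illustrative, not part of the skill):

```python
def has_task_files(existing: set[str]) -> bool:
    """True if _docs/02_tasks/ holds task files other than _dependencies_table.md."""
    return any(p.startswith("02_tasks/") and not p.endswith("_dependencies_table.md")
               for p in existing)

def detect_step(existing: set[str], completed: set[str]) -> int:
    """Existing-code Steps 1-5: rules checked in order, first match wins.
    `existing`: paths under _docs/ that exist; `completed`: step names from the
    state file's Completed Steps table."""
    final_impl = "03_implementation/FINAL_implementation_report.md"
    if not existing:
        return 1  # Document: source code present but no _docs/ yet
    if ("02_document/FINAL_report.md" in existing
            and "02_document/tests/traceability-matrix.md" not in existing
            and "Document" in completed):
        return 2  # Test Spec
    if ("02_document/tests/traceability-matrix.md" in existing
            and "Document" in completed
            and not has_task_files(existing)):
        return 3  # Decompose Tests
    if (has_task_files(existing)
            and "02_tasks/_dependencies_table.md" in existing
            and "Decompose Tests" in completed
            and final_impl not in existing):
        return 4  # Implement Tests
    if (final_impl in existing and "Implement Tests" in completed
            and "Run Tests" not in completed):
        return 5  # Run Tests
    return 6  # Steps 6+ are resolved purely from the state file
```

This sketches the rule shape only; the authoritative conditions are the prose rules above (Step 1, for example, additionally requires detecting source code files in the workspace).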
+ +--- + +**Step 6 — Refactor** +Condition: the autopilot state shows Step 5 (Run Tests) is completed AND `_docs/04_refactoring/FINAL_report.md` does not exist + +Action: Read and execute `.cursor/skills/refactor/SKILL.md` + +The refactor skill runs the full 6-phase method using the implemented tests as a safety net. + +If `_docs/04_refactoring/` has phase reports, the refactor skill detects completed phases and continues. + +--- + +**Step 7 — New Task** +Condition: the autopilot state shows Step 6 (Refactor) is completed AND the autopilot state does NOT show Step 7 (New Task) as completed + +Action: Read and execute `.cursor/skills/new-task/SKILL.md` + +The new-task skill interactively guides the user through defining new functionality. It loops until the user is done adding tasks. New task files are written to `_docs/02_tasks/`. + +--- + +**Step 8 — Implement** +Condition: the autopilot state shows Step 7 (New Task) is completed AND `_docs/03_implementation/` does not contain a FINAL report covering the new tasks (check state for distinction between test implementation and feature implementation) + +Action: Read and execute `.cursor/skills/implement/SKILL.md` + +The implement skill reads the new tasks from `_docs/02_tasks/` and implements them. Tasks already implemented in Step 4 are skipped (the implement skill tracks completed tasks in batch reports). + +If `_docs/03_implementation/` has batch reports from this phase, the implement skill detects completed tasks and continues. 
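
From Step 5 onward, the conditions above lean on the state file rather than folder artifacts. A minimal sketch of extracting the Completed Steps table from `_docs/_autopilot_state.md` — the exact file format is owned by state.md, so the column layout assumed here (`| step | name | date | outcome |`) is illustrative:

```python
import re

def completed_steps(state_md: str) -> dict[int, str]:
    """Map step number -> step name from the '## Completed Steps' section.
    Keying by number matters: Run Tests appears twice (Steps 5 and 9)."""
    steps: dict[int, str] = {}
    in_section = False
    for line in state_md.splitlines():
        if line.strip() == "## Completed Steps":
            in_section = True
            continue
        if in_section and line.startswith("## "):
            break  # reached the next section
        m = re.match(r"\|\s*(\d+)\s*\|\s*([^|]+?)\s*\|", line)
        if in_section and m:
            steps[int(m.group(1))] = m.group(2)
    return steps
```

A condition like "the autopilot state shows Step 8 (Implement) is completed" then reduces to a membership check such as `8 in completed_steps(text)`.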
+ +--- + +**Step 9 — Run Tests** +Condition: the autopilot state shows Step 8 (Implement) is completed AND the autopilot state does NOT show Step 9 (Run Tests) as completed + +Action: Read and execute `.cursor/skills/test-run/SKILL.md` + +--- + +**Step 10 — Security Audit (optional)** +Condition: the autopilot state shows Step 9 (Run Tests) is completed AND the autopilot state does NOT show Step 10 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete) + +Action: Present using Choose format: + +``` +══════════════════════════════════════ + DECISION REQUIRED: Run security audit before deploy? +══════════════════════════════════════ + A) Run security audit (recommended for production deployments) + B) Skip — proceed directly to deploy +══════════════════════════════════════ + Recommendation: A — catches vulnerabilities before production +══════════════════════════════════════ +``` + +- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 11 (Performance Test). +- If user picks B → Mark Step 10 as `skipped` in the state file, auto-chain to Step 11 (Performance Test). + +--- + +**Step 11 — Performance Test (optional)** +Condition: the autopilot state shows Step 10 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 11 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete) + +Action: Present using Choose format: + +``` +══════════════════════════════════════ + DECISION REQUIRED: Run performance/load tests before deploy? 
+══════════════════════════════════════ + A) Run performance tests (recommended for latency-sensitive or high-load systems) + B) Skip — proceed directly to deploy +══════════════════════════════════════ + Recommendation: [A or B — base on whether acceptance criteria + include latency, throughput, or load requirements] +══════════════════════════════════════ +``` + +- If user picks A → Run performance tests: + 1. If `scripts/run-performance-tests.sh` exists (generated by the test-spec skill Phase 4), execute it + 2. Otherwise, check if `_docs/02_document/tests/performance-tests.md` exists for test scenarios, detect appropriate load testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute performance test scenarios against the running system + 3. Present results vs acceptance criteria thresholds + 4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort + 5. After completion, auto-chain to Step 12 (Deploy) +- If user picks B → Mark Step 11 as `skipped` in the state file, auto-chain to Step 12 (Deploy). + +--- + +**Step 12 — Deploy** +Condition: the autopilot state shows Step 9 (Run Tests) is completed AND (Step 10 is completed or skipped) AND (Step 11 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete) + +Action: Read and execute `.cursor/skills/deploy/SKILL.md` + +After deployment completes, the existing-code workflow is done. + +--- + +**Re-Entry After Completion** +Condition: the autopilot state shows `step: done` OR all steps through 12 (Deploy) are completed + +Action: The project completed a full cycle. Present status and loop back to New Task: + +``` +══════════════════════════════════════ + PROJECT CYCLE COMPLETE +══════════════════════════════════════ + The previous cycle finished successfully. + You can now add new functionality. 
+══════════════════════════════════════ + A) Add new features (start New Task) + B) Done — no more changes needed +══════════════════════════════════════ +``` + +- If user picks A → set `step: 7`, `status: not_started` in the state file, then auto-chain to Step 7 (New Task). Previous cycle history stays in Completed Steps. +- If user picks B → report final project status and exit. + +## Auto-Chain Rules + +| Completed Step | Next Action | +|---------------|-------------| +| Document (1) | Auto-chain → Test Spec (2) | +| Test Spec (2) | Auto-chain → Decompose Tests (3) | +| Decompose Tests (3) | **Session boundary** — suggest new conversation before Implement Tests | +| Implement Tests (4) | Auto-chain → Run Tests (5) | +| Run Tests (5, all pass) | Auto-chain → Refactor (6) | +| Refactor (6) | Auto-chain → New Task (7) | +| New Task (7) | **Session boundary** — suggest new conversation before Implement | +| Implement (8) | Auto-chain → Run Tests (9) | +| Run Tests (9, all pass) | Auto-chain → Security Audit choice (10) | +| Security Audit (10, done or skipped) | Auto-chain → Performance Test choice (11) | +| Performance Test (11, done or skipped) | Auto-chain → Deploy (12) | +| Deploy (12) | **Workflow complete** — existing-code flow done | + +## Status Summary Template + +``` +═══════════════════════════════════════════════════ + AUTOPILOT STATUS (existing-code) +═══════════════════════════════════════════════════ + Step 1 Document [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 2 Test Spec [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 3 Decompose Tests [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 4 Implement Tests [DONE / IN PROGRESS (batch M) / NOT STARTED / FAILED (retry N/3)] + Step 5 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 6 Refactor [DONE / IN PROGRESS (phase N) / NOT STARTED / FAILED (retry N/3)] + Step 7 New Task [DONE (N tasks) / IN PROGRESS / 
NOT STARTED / FAILED (retry N/3)] + Step 8 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)] + Step 9 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 10 Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 11 Performance Test [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 12 Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] +═══════════════════════════════════════════════════ + Current: Step N — Name + SubStep: M — [sub-skill internal step name] + Retry: [N/3 if retrying, omit if 0] + Action: [what will happen next] +═══════════════════════════════════════════════════ +``` diff --git a/.cursor/skills/autopilot/flows/greenfield.md b/.cursor/skills/autopilot/flows/greenfield.md new file mode 100644 index 0000000..04bf16f --- /dev/null +++ b/.cursor/skills/autopilot/flows/greenfield.md @@ -0,0 +1,235 @@ +# Greenfield Workflow + +Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy. 
+ +## Step Reference Table + +| Step | Name | Sub-Skill | Internal SubSteps | +|------|------|-----------|-------------------| +| 1 | Problem | problem/SKILL.md | Phase 1–4 | +| 2 | Research | research/SKILL.md | Mode A: Phase 1–4 · Mode B: Step 0–8 | +| 3 | Plan | plan/SKILL.md | Step 1–6 + Final | +| 4 | UI Design | ui-design/SKILL.md | Phase 0–8 (conditional — UI projects only) | +| 5 | Decompose | decompose/SKILL.md | Step 1–4 | +| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) | +| 7 | Run Tests | test-run/SKILL.md | Steps 1–4 | +| 8 | Security Audit | security/SKILL.md | Phase 1–5 (optional) | +| 9 | Performance Test | (autopilot-managed) | Load/stress tests (optional) | +| 10 | Deploy | deploy/SKILL.md | Step 1–7 | + +## Detection Rules + +Check rules in order — first match wins. + +--- + +**Step 1 — Problem Gathering** +Condition: `_docs/00_problem/` does not exist, OR any of these are missing/empty: +- `problem.md` +- `restrictions.md` +- `acceptance_criteria.md` +- `input_data/` (must contain at least one file) + +Action: Read and execute `.cursor/skills/problem/SKILL.md` + +--- + +**Step 2 — Research (Initial)** +Condition: `_docs/00_problem/` is complete AND `_docs/01_solution/` has no `solution_draft*.md` files + +Action: Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode A) + +--- + +**Research Decision** (inline gate between Step 2 and Step 3) +Condition: `_docs/01_solution/` contains `solution_draft*.md` files AND `_docs/01_solution/solution.md` does not exist AND `_docs/02_document/architecture.md` does not exist + +Action: Present the current research state to the user: +- How many solution drafts exist +- Whether tech_stack.md and security_analysis.md exist +- One-line summary from the latest draft + +Then present using the **Choose format**: + +``` +══════════════════════════════════════ + DECISION REQUIRED: Research complete — next action? 
+══════════════════════════════════════ + A) Run another research round (Mode B assessment) + B) Proceed to planning with current draft +══════════════════════════════════════ + Recommendation: [A or B] — [reason based on draft quality] +══════════════════════════════════════ +``` + +- If user picks A → Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode B) +- If user picks B → auto-chain to Step 3 (Plan) + +--- + +**Step 3 — Plan** +Condition: `_docs/01_solution/` has `solution_draft*.md` files AND `_docs/02_document/architecture.md` does not exist + +Action: +1. The plan skill's Prereq 2 will rename the latest draft to `solution.md` — this is handled by the plan skill itself +2. Read and execute `.cursor/skills/plan/SKILL.md` + +If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FINAL_report.md`), the plan skill's built-in resumability handles it. + +--- + +**Step 4 — UI Design (conditional)** +Condition: `_docs/02_document/architecture.md` exists AND the autopilot state does NOT show Step 4 (UI Design) as completed or skipped AND the project is a UI project + +**UI Project Detection** — the project is a UI project if ANY of the following are true: +- `package.json` exists in the workspace root or any subdirectory +- `*.html`, `*.jsx`, `*.tsx` files exist in the workspace +- `_docs/02_document/components/` contains a component whose `description.md` mentions UI, frontend, page, screen, dashboard, form, or view +- `_docs/02_document/architecture.md` mentions frontend, UI layer, SPA, or client-side rendering +- `_docs/01_solution/solution.md` mentions frontend, web interface, or user-facing UI + +If the project is NOT a UI project → mark Step 4 as `skipped` in the state file and auto-chain to Step 5. + +If the project IS a UI project → present using Choose format: + +``` +══════════════════════════════════════ + DECISION REQUIRED: UI project detected — generate mockups? 
+══════════════════════════════════════ + A) Generate UI mockups before decomposition (recommended) + B) Skip — proceed directly to decompose +══════════════════════════════════════ + Recommendation: A — mockups before decomposition + produce better task specs for frontend components +══════════════════════════════════════ +``` + +- If user picks A → Read and execute `.cursor/skills/ui-design/SKILL.md`. After completion, auto-chain to Step 5 (Decompose). +- If user picks B → Mark Step 4 as `skipped` in the state file, auto-chain to Step 5 (Decompose). + +--- + +**Step 5 — Decompose** +Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/` does not exist or has no task files (excluding `_dependencies_table.md`) + +Action: Read and execute `.cursor/skills/decompose/SKILL.md` + +If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it. + +--- + +**Step 6 — Implement** +Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist + +Action: Read and execute `.cursor/skills/implement/SKILL.md` + +If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. 
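The Step 5 and Step 6 conditions both hinge on what counts as a task file in `_docs/02_tasks/`. A minimal sketch of that check, assuming task files are `.md` files other than the dependencies table (the function name and glob are illustrative, not part of the skill contract):

```python
from pathlib import Path

def has_task_files(tasks_dir: str = "_docs/02_tasks") -> bool:
    """True if the tasks directory holds at least one real task file.

    _dependencies_table.md is metadata, not a task, so it is excluded
    when deciding whether Step 5 (Decompose) still needs to run.
    """
    d = Path(tasks_dir)
    if not d.is_dir():
        return False
    return any(p.name != "_dependencies_table.md" for p in d.glob("*.md"))
```

Step 5 fires when this returns `False`; Step 6 additionally requires that `_dependencies_table.md` itself exists.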
+ +--- + +**Step 7 — Run Tests** +Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND the autopilot state does NOT show Step 7 (Run Tests) as completed AND (`_docs/04_deploy/` does not exist or is incomplete) + +Action: Read and execute `.cursor/skills/test-run/SKILL.md` + +--- + +**Step 8 — Security Audit (optional)** +Condition: the autopilot state shows Step 7 (Run Tests) is completed AND the autopilot state does NOT show Step 8 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete) + +Action: Present using Choose format: + +``` +══════════════════════════════════════ + DECISION REQUIRED: Run security audit before deploy? +══════════════════════════════════════ + A) Run security audit (recommended for production deployments) + B) Skip — proceed directly to deploy +══════════════════════════════════════ + Recommendation: A — catches vulnerabilities before production +══════════════════════════════════════ +``` + +- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 9 (Performance Test). +- If user picks B → Mark Step 8 as `skipped` in the state file, auto-chain to Step 9 (Performance Test). + +--- + +**Step 9 — Performance Test (optional)** +Condition: the autopilot state shows Step 8 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 9 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete) + +Action: Present using Choose format: + +``` +══════════════════════════════════════ + DECISION REQUIRED: Run performance/load tests before deploy? 
+══════════════════════════════════════ + A) Run performance tests (recommended for latency-sensitive or high-load systems) + B) Skip — proceed directly to deploy +══════════════════════════════════════ + Recommendation: [A or B — base on whether acceptance criteria + include latency, throughput, or load requirements] +══════════════════════════════════════ +``` + +- If user picks A → Run performance tests: + 1. If `scripts/run-performance-tests.sh` exists (generated by the test-spec skill Phase 4), execute it + 2. Otherwise, check if `_docs/02_document/tests/performance-tests.md` exists for test scenarios, detect appropriate load testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute performance test scenarios against the running system + 3. Present results vs acceptance criteria thresholds + 4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort + 5. After completion, auto-chain to Step 10 (Deploy) +- If user picks B → Mark Step 9 as `skipped` in the state file, auto-chain to Step 10 (Deploy). + +--- + +**Step 10 — Deploy** +Condition: the autopilot state shows Step 7 (Run Tests) is completed AND (Step 8 is completed or skipped) AND (Step 9 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete) + +Action: Read and execute `.cursor/skills/deploy/SKILL.md` + +--- + +**Done** +Condition: `_docs/04_deploy/` contains all expected artifacts (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md) + +Action: Report project completion with summary. If the user runs autopilot again after greenfield completion, Flow Resolution rule 3 routes to the existing-code flow (re-entry after completion) so they can add new features. 
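Taken together, the rules above form one first-match-wins scan over `_docs/`. A condensed sketch of the first few rules (conditions are simplified and the interactive gates such as the Research Decision are omitted; treat this as illustration, not the contract):

```python
from pathlib import Path

def detect_greenfield_step(root: str = ".") -> str:
    """Return the first matching greenfield detection rule."""
    docs = Path(root, "_docs")
    problem = docs / "00_problem"
    required = ["problem.md", "restrictions.md", "acceptance_criteria.md"]
    if not problem.is_dir() or any(not (problem / f).is_file() for f in required):
        return "Step 1: Problem"
    solution = docs / "01_solution"
    drafts = list(solution.glob("solution_draft*.md")) if solution.is_dir() else []
    if not drafts:
        return "Step 2: Research"
    if not (docs / "02_document" / "architecture.md").is_file():
        return "Step 3: Plan"
    # Steps 4-10 and Done continue the same pattern, consulting the
    # state file for completed/skipped markers where folders alone
    # cannot decide (UI Design, Run Tests, Security, Performance).
    return "Step 4+: see remaining rules"
```

In the real flow, the state file, not just the folder scan, decides the optional steps and re-entry after failure.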
+ +## Auto-Chain Rules + +| Completed Step | Next Action | +|---------------|-------------| +| Problem (1) | Auto-chain → Research (2) | +| Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) | +| Research Decision → proceed | Auto-chain → Plan (3) | +| Plan (3) | Auto-chain → UI Design detection (4) | +| UI Design (4, done or skipped) | Auto-chain → Decompose (5) | +| Decompose (5) | **Session boundary** — suggest new conversation before Implement | +| Implement (6) | Auto-chain → Run Tests (7) | +| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) | +| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) | +| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) | +| Deploy (10) | Report completion | + +## Status Summary Template + +``` +═══════════════════════════════════════════════════ + AUTOPILOT STATUS (greenfield) +═══════════════════════════════════════════════════ + Step 1 Problem [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 2 Research [DONE (N drafts) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 3 Plan [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 4 UI Design [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 5 Decompose [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 6 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)] + Step 7 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 8 Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 9 Performance Test [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] + Step 10 Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)] +═══════════════════════════════════════════════════ + Current: Step N — Name + SubStep: M — [sub-skill internal step name] + Retry: [N/3 if retrying, omit if 0] + Action: [what will 
happen next] +═══════════════════════════════════════════════════ +``` diff --git a/.cursor/skills/autopilot/protocols.md b/.cursor/skills/autopilot/protocols.md new file mode 100644 index 0000000..406bf72 --- /dev/null +++ b/.cursor/skills/autopilot/protocols.md @@ -0,0 +1,314 @@ +# Autopilot Protocols + +## User Interaction Protocol + +Every time the autopilot or a sub-skill needs a user decision, use the **Choose A / B / C / D** format. This applies to: + +- State transitions where multiple valid next actions exist +- Sub-skill BLOCKING gates that require user judgment +- Any fork where the autopilot cannot confidently pick the right path +- Trade-off decisions (tech choices, scope, risk acceptance) + +### When to Ask (MUST ask) + +- The next action is ambiguous (e.g., "another research round or proceed?") +- The decision has irreversible consequences (e.g., architecture choices, skipping a step) +- The user's intent or preference cannot be inferred from existing artifacts +- A sub-skill's BLOCKING gate explicitly requires user confirmation +- Multiple valid approaches exist with meaningfully different trade-offs + +### When NOT to Ask (auto-transition) + +- Only one logical next step exists (e.g., Problem complete → Research is the only option) +- The transition is deterministic from the state (e.g., Plan complete → Decompose) +- The decision is low-risk and reversible +- Existing artifacts or prior decisions already imply the answer + +### Choice Format + +Always present decisions in this format: + +``` +══════════════════════════════════════ + DECISION REQUIRED: [brief context] +══════════════════════════════════════ + A) [Option A — short description] + B) [Option B — short description] + C) [Option C — short description, if applicable] + D) [Option D — short description, if applicable] +══════════════════════════════════════ + Recommendation: [A/B/C/D] — [one-line reason] +══════════════════════════════════════ +``` + +Rules: +1. 
Always provide 2–4 concrete options (never open-ended questions) +2. Always include a recommendation with a brief justification +3. Keep option descriptions to one line each +4. If only 2 options make sense, use A/B only — do not pad with filler options +5. Play the notification sound (per `human-attention-sound.mdc`) before presenting the choice +6. Record every user decision in the state file's `Key Decisions` section +7. After the user picks, proceed immediately — no follow-up confirmation unless the choice was destructive + +## Work Item Tracker Authentication + +Several workflow steps create work items (epics, tasks, links). The system supports **Jira MCP** and **Azure DevOps MCP** as interchangeable backends. Detect which is configured by listing available MCP servers. + +### Tracker Detection + +1. Check for available MCP servers: Jira MCP (`user-Jira-MCP-Server`) or Azure DevOps MCP (`user-AzureDevops`) +2. If both are available, ask the user which to use (Choose format) +3. Record the choice in the state file: `tracker: jira` or `tracker: ado` +4. If neither is available, set `tracker: local` and proceed without external tracking + +### Steps That Require Work Item Tracker + +| Flow | Step | Sub-Step | Tracker Action | +|------|------|----------|----------------| +| greenfield | 3 (Plan) | Step 6 — Epics | Create epics for each component | +| greenfield | 5 (Decompose) | Step 1–3 — All tasks | Create ticket per task, link to epic | +| existing-code | 3 (Decompose Tests) | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic | +| existing-code | 7 (New Task) | Step 7 — Ticket | Create ticket per task, link to epic | + +### Authentication Gate + +Before entering a step that requires work item tracking (see table above) for the first time, the autopilot must: + +1. Call `mcp_auth` on the detected tracker's MCP server +2. If authentication succeeds → proceed normally +3. 
If the user **skips** or authentication fails → present using Choose format: + +``` +══════════════════════════════════════ + Tracker authentication failed +══════════════════════════════════════ + A) Retry authentication (retry mcp_auth) + B) Continue without tracker (tasks saved locally only) +══════════════════════════════════════ + Recommendation: A — Tracker IDs drive task referencing, + dependency tracking, and implementation batching. + Without tracker, task files use numeric prefixes instead. +══════════════════════════════════════ +``` + +If user picks **B** (continue without tracker): +- Set a flag in the state file: `tracker: local` +- All skills that would create tickets instead save metadata locally in the task/epic files with `Tracker: pending` status +- Task files keep numeric prefixes (e.g., `01_initial_structure.md`) instead of tracker ID prefixes +- The workflow proceeds normally in all other respects + +### Re-Authentication + +If the tracker MCP was already authenticated in a previous invocation (verify by listing available tools beyond `mcp_auth`), skip the auth gate. + +## Error Handling + +All error situations that require user input MUST use the **Choose A / B / C / D** format. + +| Situation | Action | +|-----------|--------| +| State detection is ambiguous (artifacts suggest two different steps) | Present findings and use Choose format with the candidate steps as options | +| Sub-skill fails or hits an unrecoverable blocker | Use Choose format: A) retry, B) skip with warning, C) abort and fix manually | +| User wants to skip a step | Use Choose format: A) skip (with dependency warning), B) execute the step | +| User wants to go back to a previous step | Use Choose format: A) re-run (with overwrite warning), B) stay on current step | +| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution | + +## Skill Failure Retry Protocol + +Sub-skills can return a **failed** result. 
Failures are often caused by missing user input, environment issues, or transient errors that resolve on retry. The autopilot auto-retries before escalating. + +### Retry Flow + +``` +Skill execution → FAILED + │ + ├─ retry_count < 3 ? + │ YES → increment retry_count in state file + │ → log failure reason in state file (Retry Log section) + │ → re-read the sub-skill's SKILL.md + │ → re-execute from the current sub_step + │ → (loop back to check result) + │ + │ NO (retry_count = 3) → + │ → set status: failed in Current Step + │ → add entry to Blockers section: + │ "[Skill Name] failed 3 consecutive times at sub_step [M]. + │ Last failure: [reason]. Auto-retry exhausted." + │ → present warning to user (see Escalation below) + │ → do NOT auto-retry again until user intervenes +``` + +### Retry Rules + +1. **Auto-retry immediately**: when a skill fails, retry it without asking the user — the failure is often transient (missing user confirmation in a prior step, docker not running, file lock, etc.) +2. **Preserve sub_step**: retry from the last recorded `sub_step`, not from the beginning of the skill — unless the failure indicates corruption, in which case restart from sub_step 1 +3. **Increment `retry_count`**: update `retry_count` in the state file's `Current Step` section on each retry attempt +4. **Log each failure**: append the failure reason and timestamp to the state file's `Retry Log` section +5. **Reset on success**: when the skill eventually succeeds, reset `retry_count: 0` and clear the `Retry Log` for that step + +### Escalation (after 3 consecutive failures) + +After 3 failed auto-retries of the same skill, the failure is likely not user-related. Stop retrying and escalate: + +1. Update the state file: + - Set `status: failed` in `Current Step` + - Set `retry_count: 3` + - Add a blocker entry describing the repeated failure +2. Play notification sound (per `human-attention-sound.mdc`) +3. 
Present using Choose format: + +``` +══════════════════════════════════════ + SKILL FAILED: [Skill Name] — 3 consecutive failures +══════════════════════════════════════ + Step: [N] — [Name] + SubStep: [M] — [sub-step name] + Last failure reason: [reason] +══════════════════════════════════════ + A) Retry with fresh context (new conversation) + B) Skip this step with warning + C) Abort — investigate and fix manually +══════════════════════════════════════ + Recommendation: A — fresh context often resolves + persistent failures +══════════════════════════════════════ +``` + +### Re-Entry After Failure + +On the next autopilot invocation (new conversation), if the state file shows `status: failed` and `retry_count: 3`: + +- Present the blocker to the user before attempting execution +- If the user chooses to retry → reset `retry_count: 0`, set `status: in_progress`, and re-execute +- If the user chooses to skip → mark step as `skipped`, proceed to next step +- Do NOT silently auto-retry — the user must acknowledge the persistent failure first + +## Error Recovery Protocol + +### Stuck Detection + +When executing a sub-skill, monitor for these signals: + +- Same artifact overwritten 3+ times without meaningful change +- Sub-skill repeatedly asks the same question after receiving an answer +- No new artifacts saved for an extended period despite active execution + +### Recovery Actions (ordered) + +1. **Re-read state**: read `_docs/_autopilot_state.md` and cross-check against `_docs/` folders +2. **Retry current sub-step**: re-read the sub-skill's SKILL.md and restart from the current sub-step +3. 
**Escalate**: after 2 failed retries, present diagnostic summary to user using Choose format: + +``` +══════════════════════════════════════ + RECOVERY: [skill name] stuck at [sub-step] +══════════════════════════════════════ + A) Retry with fresh context (new conversation) + B) Skip this sub-step with warning + C) Abort and fix manually +══════════════════════════════════════ + Recommendation: A — fresh context often resolves stuck loops +══════════════════════════════════════ +``` + +### Circuit Breaker + +If the same autopilot step fails 3 consecutive times across conversations: + +- Record the failure pattern in the state file's `Blockers` section +- Do NOT auto-retry on next invocation +- Present the blocker and ask user for guidance before attempting again + +## Context Management Protocol + +### Principle + +Disk is memory. Never rely on in-context accumulation — read from `_docs/` artifacts, not from conversation history. + +### Minimal Re-Read Set Per Skill + +When re-entering a skill (new conversation or context refresh): + +- Always read: `_docs/_autopilot_state.md` +- Always read: the active skill's `SKILL.md` +- Conditionally read: only the `_docs/` artifacts the current sub-step requires (listed in each skill's Context Resolution section) +- Never bulk-read: do not load all `_docs/` files at once + +### Mid-Skill Interruption + +If context is filling up during a long skill (e.g., document, implement): + +1. Save current sub-step progress to the skill's artifact directory +2. Update `_docs/_autopilot_state.md` with exact sub-step position +3. Suggest a new conversation: "Context is getting long — recommend continuing in a fresh conversation for better results" +4. 
On re-entry, the skill's resumability protocol picks up from the saved sub-step + +### Large Artifact Handling + +When a skill needs to read large files (e.g., full solution.md, architecture.md): + +- Read only the sections relevant to the current sub-step +- Use search tools (Grep, SemanticSearch) to find specific sections rather than reading entire files +- Summarize key decisions from prior steps in the state file so they don't need to be re-read + +### Context Budget Heuristic + +Agents cannot programmatically query context window usage. Use these heuristics to avoid degradation: + +| Zone | Indicators | Action | +|------|-----------|--------| +| **Safe** | State file + SKILL.md + 2–3 focused artifacts loaded | Continue normally | +| **Caution** | 5+ artifacts loaded, or 3+ large files (architecture, solution, discovery), or conversation has 20+ tool calls | Complete current sub-step, then suggest session break | +| **Danger** | Repeated truncation in tool output, tool calls failing unexpectedly, responses becoming shallow or repetitive | Save immediately, update state file, force session boundary | + +**Skill-specific guidelines**: + +| Skill | Recommended session breaks | +|-------|---------------------------| +| **document** | After every ~5 modules in Step 1; between Step 4 (Verification) and Step 5 (Solution Extraction) | +| **implement** | Each batch is a natural checkpoint; if more than 2 batches completed in one session, suggest break | +| **plan** | Between Step 5 (Test Specifications) and Step 6 (Epics) for projects with many components | +| **research** | Between Mode A rounds; between Mode A and Mode B | + +**How to detect caution/danger zone without API**: + +1. Count tool calls made so far — if approaching 20+, context is likely filling up +2. If reading a file returns truncated content, context is under pressure +3. If the agent starts producing shorter or less detailed responses than earlier in the conversation, context quality is degrading +4. 
+ When in doubt, save and suggest a new conversation — re-entry is cheap thanks to the state file
+
+## Rollback Protocol
+
+### Implementation Steps (git-based)
+
+Handled by `/implement` skill — each batch commit is a rollback checkpoint via `git revert`.
+
+### Planning/Documentation Steps (artifact-based)
+
+For steps that produce `_docs/` artifacts (problem, research, plan, decompose, document):
+
+1. **Before overwriting**: if re-running a step that already has artifacts, the sub-skill's prerequisite check asks the user (resume/overwrite/skip)
+2. **Rollback to previous step**: use Choose format:
+
+```
+══════════════════════════════════════
+ ROLLBACK: Re-run [step name]?
+══════════════════════════════════════
+ A) Re-run the step (overwrites current artifacts)
+ B) Stay on current step
+══════════════════════════════════════
+ Warning: This will overwrite files in _docs/[folder]/
+══════════════════════════════════════
+```
+
+3. **Git safety net**: artifacts are committed with each autopilot step completion. To roll back: `git log --oneline _docs/` to find the commit, then `git checkout [commit] -- _docs/` to restore the artifacts from it
+4. **State file rollback**: when rolling back artifacts, also update `_docs/_autopilot_state.md` to reflect the rolled-back step (set it to `in_progress`, clear completed date)
+
+## Status Summary
+
+On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback). Use the Status Summary Template from the active flow file (`flows/greenfield.md` or `flows/existing-code.md`).
+ +For re-entry (state file exists), also include: +- Key decisions from the state file's `Key Decisions` section +- Last session context from the `Last Session` section +- Any blockers from the `Blockers` section diff --git a/.cursor/skills/autopilot/state.md b/.cursor/skills/autopilot/state.md new file mode 100644 index 0000000..57e6444 --- /dev/null +++ b/.cursor/skills/autopilot/state.md @@ -0,0 +1,122 @@ +# Autopilot State Management + +## State File: `_docs/_autopilot_state.md` + +The autopilot persists its state to `_docs/_autopilot_state.md`. This file is the primary source of truth for re-entry. Folder scanning is the fallback when the state file doesn't exist. + +### Format + +```markdown +# Autopilot State + +## Current Step +flow: [greenfield | existing-code] +step: [1-10 for greenfield, 1-12 for existing-code, or "done"] +name: [step name from the active flow's Step Reference Table] +status: [not_started / in_progress / completed / skipped / failed] +sub_step: [optional — sub-skill internal step number + name if interrupted mid-step] +retry_count: [0-3 — number of consecutive auto-retry attempts for current step, reset to 0 on success] + +When updating `Current Step`, always write it as: + flow: existing-code ← active flow + step: N ← autopilot step (sequential integer) + sub_step: M ← sub-skill's own internal step/phase number + name + retry_count: 0 ← reset on new step or success; increment on each failed retry +Example: + flow: greenfield + step: 3 + name: Plan + status: in_progress + sub_step: 4 — Architecture Review & Risk Assessment + retry_count: 0 +Example (failed after 3 retries): + flow: existing-code + step: 2 + name: Test Spec + status: failed + sub_step: 1b — Test Case Generation + retry_count: 3 + +## Completed Steps + +| Step | Name | Completed | Key Outcome | +|------|------|-----------|-------------| +| 1 | [name] | [date] | [one-line summary] | +| 2 | [name] | [date] | [one-line summary] | +| ... | ... | ... | ... 
| + +## Key Decisions +- [decision 1: e.g. "Tech stack: Python + Rust for perf-critical, Postgres DB"] +- [decision N] + +## Last Session +date: [date] +ended_at: Step [N] [Name] — SubStep [M] [sub-step name] +reason: [completed step / session boundary / user paused / context limit] +notes: [any context for next session] + +## Retry Log +| Attempt | Step | Name | SubStep | Failure Reason | Timestamp | +|---------|------|------|---------|----------------|-----------| +| 1 | [step] | [name] | [sub_step] | [reason] | [date-time] | +| ... | ... | ... | ... | ... | ... | + +(Clear this table when the step succeeds or user resets. Append a row on each failed auto-retry.) + +## Blockers +- [blocker 1, if any] +- [none] +``` + +### State File Rules + +1. **Create** the state file on the very first autopilot invocation (after state detection determines Step 1) +2. **Update** the state file after every step completion, every session boundary, every BLOCKING gate confirmation, and every failed retry attempt +3. **Read** the state file as the first action on every invocation — before folder scanning +4. **Cross-check**: after reading the state file, verify against actual `_docs/` folder contents. If they disagree (e.g., state file says Step 3 but `_docs/02_document/architecture.md` already exists), trust the folder structure and update the state file to match +5. **Never delete** the state file. It accumulates history across the entire project lifecycle +6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` when the step succeeds or the user manually resets. If `retry_count` reaches 3, set `status: failed` and add an entry to `Blockers` +7. **Failed state on re-entry**: if the state file shows `status: failed` with `retry_count: 3`, do NOT auto-retry — present the blocker to the user and wait for their decision before proceeding + +## State Detection + +Read `_docs/_autopilot_state.md` first. 
If it exists and is consistent with the folder structure, use the `Current Step` from the state file. If the state file doesn't exist or is inconsistent, fall back to folder scanning. + +### Folder Scan Rules (fallback) + +Scan `_docs/` to determine the current workflow position. The detection rules are defined in each flow file (`flows/greenfield.md` and `flows/existing-code.md`). Check the existing-code flow first (Step 1 detection), then greenfield flow rules. First match wins. + +## Re-Entry Protocol + +When the user invokes `/autopilot` and work already exists: + +1. Read `_docs/_autopilot_state.md` +2. Cross-check against `_docs/` folder structure +3. Present Status Summary with context from state file (key decisions, last session, blockers) +4. If the detected step has a sub-skill with built-in resumability (plan, decompose, implement, deploy all do), the sub-skill handles mid-step recovery +5. Continue execution from detected state + +## Session Boundaries + +After any decompose/planning step completes, **do not auto-chain to implement**. Instead: + +1. Update state file: mark the step as completed, set current step to the next implement step with status `not_started` + - Existing-code flow: After Step 3 (Decompose Tests) → set current step to 4 (Implement Tests) + - Existing-code flow: After Step 7 (New Task) → set current step to 8 (Implement) + - Greenfield flow: After Step 5 (Decompose) → set current step to 6 (Implement) +2. Write `Last Session` section: `reason: session boundary`, `notes: Decompose complete, implementation ready` +3. Present a summary: number of tasks, estimated batches, total complexity points +4. Use Choose format: + +``` +══════════════════════════════════════ + DECISION REQUIRED: Decompose complete — start implementation? 
+══════════════════════════════════════ + A) Start a new conversation for implementation (recommended for context freshness) + B) Continue implementation in this conversation +══════════════════════════════════════ + Recommendation: A — implementation is the longest phase, fresh context helps +══════════════════════════════════════ +``` + +These are the only hard session boundaries. All other transitions auto-chain. diff --git a/.cursor/skills/code-review/SKILL.md b/.cursor/skills/code-review/SKILL.md index 1c5bd4f..041013a 100644 --- a/.cursor/skills/code-review/SKILL.md +++ b/.cursor/skills/code-review/SKILL.md @@ -46,7 +46,7 @@ For each task, verify implementation satisfies every acceptance criterion: - Walk through each AC (Given/When/Then) and trace it in the code - Check that unit tests cover each AC -- Check that integration tests exist where specified in the task spec +- Check that blackbox tests exist where specified in the task spec - Flag any AC that is not demonstrably satisfied as a **Spec-Gap** finding (severity: High) - Flag any scope creep (implementation beyond what the spec asked for) as a **Scope** finding (severity: Low) @@ -152,3 +152,42 @@ The `/implement` skill invokes this skill after each batch completes: 2. Passes task spec paths + changed files to this skill 3. If verdict is FAIL — presents findings to user (BLOCKING), user fixes or confirms 4. 
If verdict is PASS or PASS_WITH_WARNINGS — proceeds automatically (findings shown as info) + +## Integration Contract + +### Inputs (provided by the implement skill) + +| Input | Type | Source | Required | +|-------|------|--------|----------| +| `task_specs` | list of file paths | Task `.md` files from `_docs/02_tasks/` for the current batch | Yes | +| `changed_files` | list of file paths | Files modified by implementer agents (from `git diff` or agent reports) | Yes | +| `batch_number` | integer | Current batch number (for report naming) | Yes | +| `project_restrictions` | file path | `_docs/00_problem/restrictions.md` | If exists | +| `solution_overview` | file path | `_docs/01_solution/solution.md` | If exists | + +### Invocation Pattern + +The implement skill invokes code-review by: + +1. Reading `.cursor/skills/code-review/SKILL.md` +2. Providing the inputs above as context (read the files, pass content to the review phases) +3. Executing all 6 phases sequentially +4. Consuming the verdict from the output + +### Outputs (returned to the implement skill) + +| Output | Type | Description | +|--------|------|-------------| +| `verdict` | `PASS` / `PASS_WITH_WARNINGS` / `FAIL` | Drives the implement skill's auto-fix gate | +| `findings` | structured list | Each finding has: severity, category, file:line, title, description, suggestion, task reference | +| `critical_count` | integer | Number of Critical findings | +| `high_count` | integer | Number of High findings | +| `report_path` | file path | `_docs/03_implementation/reviews/batch_[NN]_review.md` | + +### Report Persistence + +Save the review report to `_docs/03_implementation/reviews/batch_[NN]_review.md` (create the `reviews/` directory if it does not exist). The report uses the Output Format defined above. 
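The `verdict` output drives the implement skill's auto-fix gate (see the Outputs table). A sketch of that gate as a pure function; the names and the two-attempt limit here are illustrative assumptions:

```python
def next_action(verdict: str, fix_attempts: int, max_fix_attempts: int = 2) -> str:
    """Map a code-review verdict to the implement skill's next move."""
    if verdict in ("PASS", "PASS_WITH_WARNINGS"):
        return "commit"        # warnings surface as info, not blockers
    if verdict == "FAIL" and fix_attempts < max_fix_attempts:
        return "auto_fix"      # fix findings, then re-run the review
    return "escalate"          # blocking: present findings to the user
```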
+ +The implement skill uses `verdict` to decide: +- `PASS` / `PASS_WITH_WARNINGS` → proceed to commit +- `FAIL` → enter auto-fix loop (up to 2 attempts), then escalate to user diff --git a/.cursor/skills/decompose/SKILL.md b/.cursor/skills/decompose/SKILL.md index 8fac9a3..ac1cb2c 100644 --- a/.cursor/skills/decompose/SKILL.md +++ b/.cursor/skills/decompose/SKILL.md @@ -2,12 +2,13 @@ name: decompose description: | Decompose planned components into atomic implementable tasks with bootstrap structure plan. - 4-step workflow: bootstrap structure plan, component task decomposition, integration test task decomposition, and cross-task verification. - Supports full decomposition (_docs/ structure) and single component mode. + 4-step workflow: bootstrap structure plan, component task decomposition, blackbox test task decomposition, and cross-task verification. + Supports full decomposition (_docs/ structure), single component mode, and tests-only mode. Trigger phrases: - "decompose", "decompose features", "feature decomposition" - "task decomposition", "break down components" - "prepare for implementation" + - "decompose tests", "test decomposition" category: build tags: [decomposition, tasks, dependencies, jira, implementation-prep] disable-model-invocation: true @@ -32,18 +33,26 @@ Decompose planned components into atomic, implementable task specs with a bootst Determine the operating mode based on invocation before any other logic runs. 
**Default** (no explicit input file provided): -- PLANS_DIR: `_docs/02_plans/` +- DOCUMENT_DIR: `_docs/02_document/` - TASKS_DIR: `_docs/02_tasks/` -- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, PLANS_DIR -- Runs Step 1 (bootstrap) + Step 2 (all components) + Step 3 (integration tests) + Step 4 (cross-verification) +- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, DOCUMENT_DIR +- Runs Step 1 (bootstrap) + Step 2 (all components) + Step 3 (blackbox tests) + Step 4 (cross-verification) -**Single component mode** (provided file is within `_docs/02_plans/` and inside a `components/` subdirectory): -- PLANS_DIR: `_docs/02_plans/` +**Single component mode** (provided file is within `_docs/02_document/` and inside a `components/` subdirectory): +- DOCUMENT_DIR: `_docs/02_document/` - TASKS_DIR: `_docs/02_tasks/` - Derive component number and component name from the file path - Ask user for the parent Epic ID - Runs Step 2 (that component only, appending to existing task numbering) +**Tests-only mode** (provided file/directory is within `tests/`, or `DOCUMENT_DIR/tests/` exists and input explicitly requests test decomposition): +- DOCUMENT_DIR: `_docs/02_document/` +- TASKS_DIR: `_docs/02_tasks/` +- TESTS_DIR: `DOCUMENT_DIR/tests/` +- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, TESTS_DIR +- Runs Step 1t (test infrastructure bootstrap) + Step 3 (blackbox test decomposition) + Step 4 (cross-verification against test coverage) +- Skips Step 1 (project bootstrap) and Step 2 (component decomposition) — the codebase already exists + Announce the detected mode and resolved paths to the user before proceeding. ## Input Specification @@ -58,10 +67,10 @@ Announce the detected mode and resolved paths to the user before proceeding. 
| `_docs/00_problem/restrictions.md` | Constraints and limitations | | `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria | | `_docs/01_solution/solution.md` | Finalized solution | -| `PLANS_DIR/architecture.md` | Architecture from plan skill | -| `PLANS_DIR/system-flows.md` | System flows from plan skill | -| `PLANS_DIR/components/[##]_[name]/description.md` | Component specs from plan skill | -| `PLANS_DIR/integration_tests/` | Integration test specs from plan skill | +| `DOCUMENT_DIR/architecture.md` | Architecture from plan skill | +| `DOCUMENT_DIR/system-flows.md` | System flows from plan skill | +| `DOCUMENT_DIR/components/[##]_[name]/description.md` | Component specs from plan skill | +| `DOCUMENT_DIR/tests/` | Blackbox test specs from plan skill | **Single component mode:** @@ -70,16 +79,38 @@ Announce the detected mode and resolved paths to the user before proceeding. | The provided component `description.md` | Component spec to decompose | | Corresponding `tests.md` in the same directory (if available) | Test specs for context | +**Tests-only mode:** + +| File | Purpose | +|------|---------| +| `TESTS_DIR/environment.md` | Test environment specification (Docker services, networks, volumes) | +| `TESTS_DIR/test-data.md` | Test data management (seed data, mocks, isolation) | +| `TESTS_DIR/blackbox-tests.md` | Blackbox functional scenarios (positive + negative) | +| `TESTS_DIR/performance-tests.md` | Performance test scenarios | +| `TESTS_DIR/resilience-tests.md` | Resilience test scenarios | +| `TESTS_DIR/security-tests.md` | Security test scenarios | +| `TESTS_DIR/resource-limit-tests.md` | Resource limit test scenarios | +| `TESTS_DIR/traceability-matrix.md` | AC/restriction coverage mapping | +| `_docs/00_problem/problem.md` | Problem context | +| `_docs/00_problem/restrictions.md` | Constraints for test design | +| `_docs/00_problem/acceptance_criteria.md` | Acceptance criteria being verified | + ### Prerequisite Checks 
(BLOCKING) **Default:** -1. PLANS_DIR contains `architecture.md` and `components/` — **STOP if missing** +1. DOCUMENT_DIR contains `architecture.md` and `components/` — **STOP if missing** 2. Create TASKS_DIR if it does not exist 3. If TASKS_DIR already contains task files, ask user: **resume from last checkpoint or start fresh?** **Single component mode:** 1. The provided component file exists and is non-empty — **STOP if missing** +**Tests-only mode:** +1. `TESTS_DIR/blackbox-tests.md` exists and is non-empty — **STOP if missing** +2. `TESTS_DIR/environment.md` exists — **STOP if missing** +3. Create TASKS_DIR if it does not exist +4. If TASKS_DIR already contains task files, ask user: **resume from last checkpoint or start fresh?** + ## Artifact Management ### Directory Structure @@ -100,8 +131,9 @@ TASKS_DIR/ | Step | Save immediately after | Filename | |------|------------------------|----------| | Step 1 | Bootstrap structure plan complete + Jira ticket created + file renamed | `[JIRA-ID]_initial_structure.md` | +| Step 1t | Test infrastructure bootstrap complete + Jira ticket created + file renamed | `[JIRA-ID]_test_infrastructure.md` | | Step 2 | Each component task decomposed + Jira ticket created + file renamed | `[JIRA-ID]_[short_name].md` | -| Step 3 | Each integration test task decomposed + Jira ticket created + file renamed | `[JIRA-ID]_[short_name].md` | +| Step 3 | Each blackbox test task decomposed + Jira ticket created + file renamed | `[JIRA-ID]_[short_name].md` | | Step 4 | Cross-task verification complete | `_dependencies_table.md` | ### Resumability @@ -118,13 +150,49 @@ At the start of execution, create a TodoWrite with all applicable steps. Update ## Workflow +### Step 1t: Test Infrastructure Bootstrap (tests-only mode only) + +**Role**: Professional Quality Assurance Engineer +**Goal**: Produce `01_test_infrastructure.md` — the first task describing the test project scaffold +**Constraints**: This is a plan document, not code. 
The `/implement` skill executes it. + +1. Read `TESTS_DIR/environment.md` and `TESTS_DIR/test-data.md` +2. Read problem.md, restrictions.md, acceptance_criteria.md for domain context +3. Document the test infrastructure plan using `templates/test-infrastructure-task.md` + +The test infrastructure bootstrap must include: +- Test project folder layout (`e2e/` directory structure) +- Mock/stub service definitions for each external dependency +- `docker-compose.test.yml` structure from environment.md +- Test runner configuration (framework, plugins, fixtures) +- Test data fixture setup from test-data.md seed data sets +- Test reporting configuration (format, output path) +- Data isolation strategy + +**Self-verification**: +- [ ] Every external dependency from environment.md has a mock service defined +- [ ] Docker Compose structure covers all services from environment.md +- [ ] Test data fixtures cover all seed data sets from test-data.md +- [ ] Test runner configuration matches the consumer app tech stack from environment.md +- [ ] Data isolation strategy is defined + +**Save action**: Write `01_test_infrastructure.md` (temporary numeric name) + +**Jira action**: Create a Jira ticket for this task under the "Blackbox Tests" epic. Write the Jira ticket ID and Epic ID back into the task header. + +**Rename action**: Rename the file from `01_test_infrastructure.md` to `[JIRA-ID]_test_infrastructure.md`. Update the **Task** field inside the file to match the new filename. + +**BLOCKING**: Present test infrastructure plan summary to user. Do NOT proceed until user confirms. + +--- + ### Step 1: Bootstrap Structure Plan (default mode only) **Role**: Professional software architect **Goal**: Produce `01_initial_structure.md` — the first task describing the project skeleton **Constraints**: This is a plan document, not code. The `/implement` skill executes it. -1. Read architecture.md, all component specs, system-flows.md, data_model.md, and `deployment/` from PLANS_DIR +1. 
Read architecture.md, all component specs, system-flows.md, data_model.md, and `deployment/` from DOCUMENT_DIR 2. Read problem, solution, and restrictions from `_docs/00_problem/` and `_docs/01_solution/` 3. Research best implementation patterns for the identified tech stack 4. Document the structure plan using `templates/initial-structure-task.md` @@ -134,27 +202,27 @@ The bootstrap structure plan must include: - Shared models, interfaces, and DTOs - Dockerfile per component (multi-stage, non-root, health checks, pinned base images) - `docker-compose.yml` for local development (all components + database + dependencies) -- `docker-compose.test.yml` for integration test environment (black-box test runner) +- `docker-compose.test.yml` for blackbox test environment (blackbox test runner) - `.dockerignore` - CI/CD pipeline file (`.github/workflows/ci.yml` or `azure-pipelines.yml`) with stages from `deployment/ci_cd_pipeline.md` - Database migration setup and initial seed data scripts - Observability configuration: structured logging setup, health check endpoints (`/health/live`, `/health/ready`), metrics endpoint (`/metrics`) - Environment variable documentation (`.env.example`) -- Test structure with unit and integration test locations +- Test structure with unit and blackbox test locations **Self-verification**: - [ ] All components have corresponding folders in the layout - [ ] All inter-component interfaces have DTOs defined - [ ] Dockerfile defined for each component - [ ] `docker-compose.yml` covers all components and dependencies -- [ ] `docker-compose.test.yml` enables black-box integration testing +- [ ] `docker-compose.test.yml` enables blackbox testing - [ ] CI/CD pipeline file defined with lint, test, security, build, deploy stages - [ ] Database migration setup included - [ ] Health check endpoints specified for each service - [ ] Structured logging configuration included - [ ] `.env.example` with all required environment variables - [ ] Environment 
strategy covers dev, staging, production -- [ ] Test structure includes unit and integration test locations +- [ ] Test structure includes unit and blackbox test locations **Save action**: Write `01_initial_structure.md` (temporary numeric name) @@ -166,7 +234,7 @@ The bootstrap structure plan must include: --- -### Step 2: Task Decomposition (all modes) +### Step 2: Task Decomposition (default and single component modes) **Role**: Professional software architect **Goal**: Decompose each component into atomic, implementable task specs — numbered sequentially starting from 02 @@ -200,52 +268,66 @@ For each component (or the single provided component): --- -### Step 3: Integration Test Task Decomposition (default mode only) +### Step 3: Blackbox Test Task Decomposition (default and tests-only modes) **Role**: Professional Quality Assurance Engineer -**Goal**: Decompose integration test specs into atomic, implementable task specs +**Goal**: Decompose blackbox test specs into atomic, implementable task specs **Constraints**: Behavioral specs only — describe what, not how. No test code. -**Numbering**: Continue sequential numbering from where Step 2 left off. +**Numbering**: +- In default mode: continue sequential numbering from where Step 2 left off. +- In tests-only mode: start from 02 (01 is the test infrastructure bootstrap from Step 1t). -1. Read all test specs from `PLANS_DIR/integration_tests/` (functional_tests.md, non_functional_tests.md) +1. Read all test specs from `DOCUMENT_DIR/tests/` (`blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, `resource-limit-tests.md`) 2. Group related test scenarios into atomic tasks (e.g., one task per test category or per component under test) -3. Each task should reference the specific test scenarios it implements and the environment/test_data specs -4. Dependencies: integration test tasks depend on the component implementation tasks they exercise +3. 
Each task should reference the specific test scenarios it implements and the environment/test-data specs +4. Dependencies: + - In default mode: blackbox test tasks depend on the component implementation tasks they exercise + - In tests-only mode: blackbox test tasks depend on the test infrastructure bootstrap task (Step 1t) 5. Write each task spec using `templates/task.md` 6. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does 7. Note task dependencies (referencing Jira IDs of already-created dependency tasks) -8. **Immediately after writing each task file**: create a Jira ticket under the "Integration Tests" epic, write the Jira ticket ID and Epic ID back into the task header, then rename the file from `[##]_[short_name].md` to `[JIRA-ID]_[short_name].md`. +8. **Immediately after writing each task file**: create a Jira ticket under the "Blackbox Tests" epic, write the Jira ticket ID and Epic ID back into the task header, then rename the file from `[##]_[short_name].md` to `[JIRA-ID]_[short_name].md`. 
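The write-then-rename convention in step 8 can be sketched as a small helper. This is an illustrative sketch, not part of the skill: the `[##]_[short_name].md` filename pattern and the `**Task**` header field come from this document, while the function itself and the example Jira ID are assumptions.

```python
import re
from pathlib import Path

def rename_task_file(task_path: Path, jira_id: str) -> Path:
    """Rename '[##]_[short_name].md' to '[JIRA-ID]_[short_name].md' and
    update the '**Task**:' header inside the file to match the new name."""
    match = re.fullmatch(r"\d{2}_(?P<short>.+)\.md", task_path.name)
    if match is None:
        raise ValueError(f"unexpected task filename: {task_path.name}")
    new_path = task_path.with_name(f"{jira_id}_{match['short']}.md")
    text = task_path.read_text(encoding="utf-8")
    # Keep the Task field in sync with the filename, as step 8 requires.
    text = re.sub(
        r"^\*\*Task\*\*: .*$",
        f"**Task**: {new_path.stem}",
        text,
        count=1,
        flags=re.MULTILINE,
    )
    new_path.write_text(text, encoding="utf-8")
    task_path.unlink()
    return new_path
```

The rename happens only after the Jira ticket exists, so a crash between the two leaves a numeric filename behind, which is exactly what the resumability checkpoint detects.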
**Self-verification**: -- [ ] Every functional test scenario from `integration_tests/functional_tests.md` is covered by a task -- [ ] Every non-functional test scenario from `integration_tests/non_functional_tests.md` is covered by a task +- [ ] Every scenario from `tests/blackbox-tests.md` is covered by a task +- [ ] Every scenario from `tests/performance-tests.md`, `tests/resilience-tests.md`, `tests/security-tests.md`, and `tests/resource-limit-tests.md` is covered by a task - [ ] No task exceeds 5 complexity points -- [ ] Dependencies correctly reference the component tasks being tested -- [ ] Every task has a Jira ticket linked to the "Integration Tests" epic +- [ ] Dependencies correctly reference the dependency tasks (component tasks in default mode, test infrastructure in tests-only mode) +- [ ] Every task has a Jira ticket linked to the "Blackbox Tests" epic **Save action**: Write each `[##]_[short_name].md` (temporary numeric name), create Jira ticket inline, then rename to `[JIRA-ID]_[short_name].md`. --- -### Step 4: Cross-Task Verification (default mode only) +### Step 4: Cross-Task Verification (default and tests-only modes) **Role**: Professional software architect and analyst **Goal**: Verify task consistency and produce `_dependencies_table.md` **Constraints**: Review step — fix gaps found, do not add new tasks 1. Verify task dependencies across all tasks are consistent -2. Check no gaps: every interface in architecture.md has tasks covering it -3. Check no overlaps: tasks don't duplicate work across components +2. Check no gaps: + - In default mode: every interface in architecture.md has tasks covering it + - In tests-only mode: every test scenario in `traceability-matrix.md` is covered by a task +3. Check no overlaps: tasks don't duplicate work 4. Check no circular dependencies in the task graph 5. 
Produce `_dependencies_table.md` using `templates/dependencies-table.md` **Self-verification**: + +Default mode: - [ ] Every architecture interface is covered by at least one task - [ ] No circular dependencies in the task graph - [ ] Cross-component dependencies are explicitly noted in affected task specs - [ ] `_dependencies_table.md` contains every task with correct dependencies +Tests-only mode: +- [ ] Every test scenario from traceability-matrix.md "Covered" entries has a corresponding task +- [ ] No circular dependencies in the task graph +- [ ] Test task dependencies reference the test infrastructure bootstrap +- [ ] `_dependencies_table.md` contains every task with correct dependencies + **Save action**: Write `_dependencies_table.md` **BLOCKING**: Present dependency summary to user. Do NOT proceed until user confirms. @@ -270,7 +352,7 @@ For each component (or the single provided component): |-----------|--------| | Ambiguous component boundaries | ASK user | | Task complexity exceeds 5 points after splitting | ASK user | -| Missing component specs in PLANS_DIR | ASK user | +| Missing component specs in DOCUMENT_DIR | ASK user | | Cross-component dependency conflict | ASK user | | Jira epic not found for a component | ASK user for Epic ID | | Task naming | PROCEED, confirm at next BLOCKING gate | @@ -279,15 +361,27 @@ For each component (or the single provided component): ``` ┌────────────────────────────────────────────────────────────────┐ -│ Task Decomposition (4-Step Method) │ +│ Task Decomposition (Multi-Mode) │ ├────────────────────────────────────────────────────────────────┤ -│ CONTEXT: Resolve mode (default / single component) │ -│ 1. Bootstrap Structure → [JIRA-ID]_initial_structure.md │ -│ [BLOCKING: user confirms structure] │ -│ 2. Component Tasks → [JIRA-ID]_[short_name].md each │ -│ 3. Integration Tests → [JIRA-ID]_[short_name].md each │ -│ 4. 
Cross-Verification → _dependencies_table.md │ -│ [BLOCKING: user confirms dependencies] │ +│ CONTEXT: Resolve mode (default / single component / tests-only)│ +│ │ +│ DEFAULT MODE: │ +│ 1. Bootstrap Structure → [JIRA-ID]_initial_structure.md │ +│ [BLOCKING: user confirms structure] │ +│ 2. Component Tasks → [JIRA-ID]_[short_name].md each │ +│ 3. Blackbox Tests → [JIRA-ID]_[short_name].md each │ +│ 4. Cross-Verification → _dependencies_table.md │ +│ [BLOCKING: user confirms dependencies] │ +│ │ +│ TESTS-ONLY MODE: │ +│ 1t. Test Infrastructure → [JIRA-ID]_test_infrastructure.md │ +│ [BLOCKING: user confirms test scaffold] │ +│ 3. Blackbox Tests → [JIRA-ID]_[short_name].md each │ +│ 4. Cross-Verification → _dependencies_table.md │ +│ [BLOCKING: user confirms dependencies] │ +│ │ +│ SINGLE COMPONENT MODE: │ +│ 2. Component Tasks → [JIRA-ID]_[short_name].md each │ ├────────────────────────────────────────────────────────────────┤ │ Principles: Atomic tasks · Behavioral specs · Flat structure │ │ Jira inline · Rename to Jira ID · Save now · Ask don't assume│ diff --git a/.cursor/skills/decompose/templates/initial-structure-task.md b/.cursor/skills/decompose/templates/initial-structure-task.md index 9642f65..371e5e0 100644 --- a/.cursor/skills/decompose/templates/initial-structure-task.md +++ b/.cursor/skills/decompose/templates/initial-structure-task.md @@ -49,7 +49,7 @@ project-root/ | Build | Compile/bundle the application | Every push | | Lint / Static Analysis | Code quality and style checks | Every push | | Unit Tests | Run unit test suite | Every push | -| Integration Tests | Run integration test suite | Every push | +| Blackbox Tests | Run blackbox test suite | Every push | | Security Scan | SAST / dependency check | Every push | | Deploy to Staging | Deploy to staging environment | Merge to staging branch | diff --git a/.cursor/skills/decompose/templates/task.md b/.cursor/skills/decompose/templates/task.md index d8547a9..f36ea38 100644 --- 
a/.cursor/skills/decompose/templates/task.md +++ b/.cursor/skills/decompose/templates/task.md @@ -64,7 +64,7 @@ Then [expected result] |--------|-------------|-----------------| | AC-1 | [test subject] | [expected result] | -## Integration Tests +## Blackbox Tests | AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References | |--------|------------------------|-------------|-------------------|----------------| diff --git a/.cursor/skills/decompose/templates/test-infrastructure-task.md b/.cursor/skills/decompose/templates/test-infrastructure-task.md new file mode 100644 index 0000000..a07cb42 --- /dev/null +++ b/.cursor/skills/decompose/templates/test-infrastructure-task.md @@ -0,0 +1,129 @@ +# Test Infrastructure Task Template + +Use this template for the test infrastructure bootstrap (Step 1t in tests-only mode). Save as `TASKS_DIR/01_test_infrastructure.md` initially, then rename to `TASKS_DIR/[JIRA-ID]_test_infrastructure.md` after Jira ticket creation. + +--- + +```markdown +# Test Infrastructure + +**Task**: [JIRA-ID]_test_infrastructure +**Name**: Test Infrastructure +**Description**: Scaffold the Blackbox test project — test runner, mock services, Docker test environment, test data fixtures, reporting +**Complexity**: [3|5] points +**Dependencies**: None +**Component**: Blackbox Tests +**Jira**: [TASK-ID] +**Epic**: [EPIC-ID] + +## Test Project Folder Layout + +``` +e2e/ +├── conftest.py +├── requirements.txt +├── Dockerfile +├── mocks/ +│ ├── [mock_service_1]/ +│ │ ├── Dockerfile +│ │ └── [entrypoint file] +│ └── [mock_service_2]/ +│ ├── Dockerfile +│ └── [entrypoint file] +├── fixtures/ +│ └── [test data files] +├── tests/ +│ ├── test_[category_1].py +│ ├── test_[category_2].py +│ └── ... 
+└── docker-compose.test.yml +``` + +### Layout Rationale + +[Brief explanation of directory structure choices — framework conventions, separation of mocks from tests, fixture management] + +## Mock Services + +| Mock Service | Replaces | Endpoints | Behavior | +|-------------|----------|-----------|----------| +| [name] | [external service] | [endpoints it serves] | [response behavior, configurable via control API] | + +### Mock Control API + +Each mock service exposes a `POST /mock/config` endpoint for test-time behavior control (e.g., simulate downtime, inject errors). A `GET /mock/[resource]` endpoint returns recorded interactions for assertion. + +## Docker Test Environment + +### docker-compose.test.yml Structure + +| Service | Image / Build | Purpose | Depends On | +|---------|--------------|---------|------------| +| [system-under-test] | [build context] | Main system being tested | [mock services] | +| [mock-1] | [build context] | Mock for [external service] | — | +| [e2e-consumer] | [build from e2e/] | Test runner | [system-under-test] | + +### Networks and Volumes + +[Isolated test network, volume mounts for test data, model files, results output] + +## Test Runner Configuration + +**Framework**: [e.g., pytest] +**Plugins**: [e.g., pytest-csv, sseclient-py, requests] +**Entry point**: [e.g., pytest --csv=/results/report.csv] + +### Fixture Strategy + +| Fixture | Scope | Purpose | +|---------|-------|---------| +| [name] | [session/module/function] | [what it provides] | + +## Test Data Fixtures + +| Data Set | Source | Format | Used By | +|----------|--------|--------|---------| +| [name] | [volume mount / generated / API seed] | [format] | [test categories] | + +### Data Isolation + +[Strategy: fresh containers per run, volume cleanup, mock state reset] + +## Test Reporting + +**Format**: [e.g., CSV] +**Columns**: [e.g., Test ID, Test Name, Execution Time (ms), Result, Error Message] +**Output path**: [e.g., /results/report.csv → mounted to host] + +## 
Acceptance Criteria + +**AC-1: Test environment starts** +Given the docker-compose.test.yml +When `docker compose -f docker-compose.test.yml up` is executed +Then all services start and the system-under-test is reachable + +**AC-2: Mock services respond** +Given the test environment is running +When the e2e-consumer sends requests to mock services +Then mock services respond with configured behavior + +**AC-3: Test runner executes** +Given the test environment is running +When the e2e-consumer starts +Then the test runner discovers and executes test files + +**AC-4: Test report generated** +Given tests have been executed +When the test run completes +Then a report file exists at the configured output path with correct columns +``` + +--- + +## Guidance Notes + +- This is a PLAN document, not code. The `/implement` skill executes it. +- Focus on test infrastructure decisions, not individual test implementations. +- Reference environment.md and test-data.md from the test specs — don't repeat everything. +- Mock services must be deterministic: same input always produces same output. +- The Docker environment must be self-contained: `docker compose up` sufficient. 
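The mock-service contract in this template (a `POST /mock/config` control endpoint, recorded interactions for assertions, deterministic responses) can be sketched as plain state, independent of any HTTP framework. Only the endpoint semantics come from the template; the class name, the `mode` field, and the example behaviors are assumptions, and a real mock would wrap this state in a small HTTP server inside the mock's Dockerfile.

```python
class MockServiceState:
    """In-memory state behind a hypothetical mock service: the control
    API sets behavior, every call is recorded for later assertion, and
    a given configuration always produces the same response."""

    def __init__(self):
        self.config = {"mode": "ok", "status": 200}
        self.interactions = []

    def configure(self, new_config: dict) -> None:
        # Body of POST /mock/config, e.g. {"mode": "error", "status": 503}
        self.config.update(new_config)

    def handle(self, method: str, path: str, body=None) -> tuple[int, dict]:
        # Record the call so tests can assert on it afterwards.
        self.interactions.append({"method": method, "path": path, "body": body})
        if self.config["mode"] == "error":
            # Injected failure for resilience scenarios.
            return self.config["status"], {"error": "injected failure"}
        if self.config["mode"] == "down":
            # Simulated downtime.
            return 503, {"error": "simulated downtime"}
        # Deterministic: same input always produces the same output.
        return 200, {"echo": path}

    def recorded(self) -> list[dict]:
        # Body of GET /mock/[resource]-style interaction replay.
        return list(self.interactions)
```

Keeping the behavior table in pure state like this is one way to satisfy the determinism requirement: the only hidden input is the last configuration posted by the test.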
diff --git a/.cursor/skills/deploy/SKILL.md b/.cursor/skills/deploy/SKILL.md index 8767761..d325667 100644 --- a/.cursor/skills/deploy/SKILL.md +++ b/.cursor/skills/deploy/SKILL.md @@ -20,7 +20,7 @@ Plan and document the full deployment lifecycle: check deployment status and env ## Core Principles -- **Docker-first**: every component runs in a container; local dev, integration tests, and production all use Docker +- **Docker-first**: every component runs in a container; local dev, blackbox tests, and production all use Docker - **Infrastructure as code**: all deployment configuration is version-controlled - **Observability built-in**: logging, metrics, and tracing are part of the deployment plan, not afterthoughts - **Environment parity**: dev, staging, and production environments mirror each other as closely as possible @@ -32,12 +32,12 @@ Plan and document the full deployment lifecycle: check deployment status and env Fixed paths: -- PLANS_DIR: `_docs/02_plans/` +- DOCUMENT_DIR: `_docs/02_document/` - DEPLOY_DIR: `_docs/04_deploy/` - REPORTS_DIR: `_docs/04_deploy/reports/` - SCRIPTS_DIR: `scripts/` -- ARCHITECTURE: `_docs/02_plans/architecture.md` -- COMPONENTS_DIR: `_docs/02_plans/components/` +- ARCHITECTURE: `_docs/02_document/architecture.md` +- COMPONENTS_DIR: `_docs/02_document/components/` Announce the resolved paths to the user before proceeding. @@ -45,18 +45,18 @@ Announce the resolved paths to the user before proceeding. 
### Required Files -| File | Purpose | -|------|---------| -| `_docs/00_problem/problem.md` | Problem description and context | -| `_docs/00_problem/restrictions.md` | Constraints and limitations | -| `_docs/01_solution/solution.md` | Finalized solution | -| `PLANS_DIR/architecture.md` | Architecture from plan skill | -| `PLANS_DIR/components/` | Component specs | +| File | Purpose | Required | +|------|---------|----------| +| `_docs/00_problem/problem.md` | Problem description and context | Greenfield only | +| `_docs/00_problem/restrictions.md` | Constraints and limitations | Greenfield only | +| `_docs/01_solution/solution.md` | Finalized solution | Greenfield only | +| `DOCUMENT_DIR/architecture.md` | Architecture (from plan or document skill) | Always | +| `DOCUMENT_DIR/components/` | Component specs | Always | ### Prerequisite Checks (BLOCKING) 1. `architecture.md` exists — **STOP if missing**, run `/plan` first -2. At least one component spec exists in `PLANS_DIR/components/` — **STOP if missing** +2. At least one component spec exists in `DOCUMENT_DIR/components/` — **STOP if missing** 3. Create DEPLOY_DIR, REPORTS_DIR, and SCRIPTS_DIR if they do not exist 4. If DEPLOY_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?** @@ -157,7 +157,7 @@ At the start of execution, create a TodoWrite with all steps (1 through 7). Upda ### Step 2: Containerization **Role**: DevOps / Platform engineer -**Goal**: Define Docker configuration for every component, local development, and integration test environments +**Goal**: Define Docker configuration for every component, local development, and blackbox test environments **Constraints**: Plan only — no Dockerfile creation. Describe what each Dockerfile should contain. 1. Read architecture.md and all component specs @@ -176,7 +176,7 @@ At the start of execution, create a TodoWrite with all steps (1 through 7). 
Upda - Any message queues, caches, or external service mocks - Shared network - Environment variable files (`.env`) -6. Define `docker-compose.test.yml` for integration tests: +6. Define `docker-compose.test.yml` for blackbox tests: - Application components under test - Test runner container (black-box, no internal imports) - Isolated database with seed data @@ -189,7 +189,7 @@ At the start of execution, create a TodoWrite with all steps (1 through 7). Upda - [ ] Non-root user for all containers - [ ] Health checks defined for every service - [ ] docker-compose.yml covers all components + dependencies -- [ ] docker-compose.test.yml enables black-box integration testing +- [ ] docker-compose.test.yml enables black-box testing - [ ] `.dockerignore` defined **Save action**: Write `containerization.md` using `templates/containerization.md` @@ -212,7 +212,7 @@ At the start of execution, create a TodoWrite with all steps (1 through 7). Upda | Stage | Trigger | Steps | Quality Gate | |-------|---------|-------|-------------| | **Lint** | Every push | Run linters per language (black, rustfmt, prettier, dotnet format) | Zero errors | -| **Test** | Every push | Unit tests, integration tests, coverage report | 75%+ coverage | +| **Test** | Every push | Unit tests, blackbox tests, coverage report | 75%+ coverage (see `.cursor/rules/cursor-meta.mdc` Quality Thresholds) | | **Security** | Every push | Dependency audit, SAST scan (Semgrep/SonarQube), image scan (Trivy) | Zero critical/high CVEs | | **Build** | PR merge to dev | Build Docker images, tag with git SHA | Build succeeds | | **Push** | After build | Push to container registry | Push succeeds | @@ -458,7 +458,7 @@ At the start of execution, create a TodoWrite with all steps (1 through 7). 
Upda - **Implementing during planning**: Steps 1–6 produce documents, not code (Step 7 is the exception — it creates scripts) - **Hardcoding secrets**: never include real credentials in deployment documents or scripts -- **Ignoring integration test containerization**: the test environment must be containerized alongside the app +- **Ignoring blackbox test containerization**: the test environment must be containerized alongside the app - **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation - **Using `:latest` tags**: always pin base image versions - **Forgetting observability**: logging, metrics, and tracing are deployment concerns, not post-deployment additions diff --git a/.cursor/skills/deploy/templates/ci_cd_pipeline.md b/.cursor/skills/deploy/templates/ci_cd_pipeline.md index 57b8b41..16102e3 100644 --- a/.cursor/skills/deploy/templates/ci_cd_pipeline.md +++ b/.cursor/skills/deploy/templates/ci_cd_pipeline.md @@ -28,7 +28,7 @@ Save as `_docs/04_deploy/ci_cd_pipeline.md`. ### Test - Unit tests: [framework and command] -- Integration tests: [framework and command, uses docker-compose.test.yml] +- Blackbox tests: [framework and command, uses docker-compose.test.yml] - Coverage threshold: 75% overall, 90% critical paths - Coverage report published as pipeline artifact @@ -54,7 +54,7 @@ Save as `_docs/04_deploy/ci_cd_pipeline.md`. 
- Automated rollback on health check failure ### Smoke Tests -- Subset of integration tests targeting staging environment +- Subset of blackbox tests targeting staging environment - Validates critical user flows - Timeout: [maximum duration] diff --git a/.cursor/skills/deploy/templates/containerization.md b/.cursor/skills/deploy/templates/containerization.md index d1025be..d6c7073 100644 --- a/.cursor/skills/deploy/templates/containerization.md +++ b/.cursor/skills/deploy/templates/containerization.md @@ -48,7 +48,7 @@ networks: [shared network] ``` -## Docker Compose — Integration Tests +## Docker Compose — Blackbox Tests ```yaml # docker-compose.test.yml structure diff --git a/.cursor/skills/deploy/templates/deploy_scripts.md b/.cursor/skills/deploy/templates/deploy_scripts.md new file mode 100644 index 0000000..24e915c --- /dev/null +++ b/.cursor/skills/deploy/templates/deploy_scripts.md @@ -0,0 +1,114 @@ +# Deployment Scripts Documentation Template + +Save as `_docs/04_deploy/deploy_scripts.md`. + +--- + +```markdown +# [System Name] — Deployment Scripts + +## Overview + +| Script | Purpose | Location | +|--------|---------|----------| +| `deploy.sh` | Main deployment orchestrator | `scripts/deploy.sh` | +| `pull-images.sh` | Pull Docker images from registry | `scripts/pull-images.sh` | +| `start-services.sh` | Start all services | `scripts/start-services.sh` | +| `stop-services.sh` | Graceful shutdown | `scripts/stop-services.sh` | +| `health-check.sh` | Verify deployment health | `scripts/health-check.sh` | + +## Prerequisites + +- Docker and Docker Compose installed on target machine +- SSH access to target machine (configured via `DEPLOY_HOST`) +- Container registry credentials configured +- `.env` file with required environment variables (see `.env.example`) + +## Environment Variables + +All scripts source `.env` from the project root or accept variables from the environment. 
+ +| Variable | Required By | Purpose | +|----------|------------|---------| +| `DEPLOY_HOST` | All (remote mode) | SSH target for remote deployment | +| `REGISTRY_URL` | `pull-images.sh` | Container registry URL | +| `REGISTRY_USER` | `pull-images.sh` | Registry authentication | +| `REGISTRY_PASS` | `pull-images.sh` | Registry authentication | +| `IMAGE_TAG` | `pull-images.sh`, `start-services.sh` | Image version to deploy (default: latest git SHA) | +| [add project-specific variables] | | | + +## Script Details + +### deploy.sh + +Main orchestrator that runs the full deployment flow. + +**Usage**: +- `./scripts/deploy.sh` — Deploy latest version +- `./scripts/deploy.sh --rollback` — Rollback to previous version +- `./scripts/deploy.sh --help` — Show usage + +**Flow**: +1. Validate required environment variables +2. Call `pull-images.sh` +3. Call `stop-services.sh` +4. Call `start-services.sh` +5. Call `health-check.sh` +6. Report success or failure + +**Rollback**: When `--rollback` is passed, reads the previous image tags saved by `stop-services.sh` and redeploys those versions. + +### pull-images.sh + +**Usage**: `./scripts/pull-images.sh [--help]` + +**Steps**: +1. Authenticate with container registry (`REGISTRY_URL`) +2. Pull all required images with specified `IMAGE_TAG` +3. Verify image integrity via digest check +4. Report pull results per image + +### start-services.sh + +**Usage**: `./scripts/start-services.sh [--help]` + +**Steps**: +1. Run `docker compose up -d` with the correct env file +2. Configure networks and volumes +3. Wait for all containers to report healthy state +4. Report startup status per service + +### stop-services.sh + +**Usage**: `./scripts/stop-services.sh [--help]` + +**Steps**: +1. Save current image tags to `previous_tags.env` (for rollback) +2. Stop services with graceful shutdown period (30s) +3. 
Clean up orphaned containers and networks + +### health-check.sh + +**Usage**: `./scripts/health-check.sh [--help]` + +**Checks**: + +| Service | Endpoint | Expected | +|---------|----------|----------| +| [Component 1] | `http://localhost:[port]/health/live` | HTTP 200 | +| [Component 2] | `http://localhost:[port]/health/ready` | HTTP 200 | +| [add all services] | | | + +**Exit codes**: +- `0` — All services healthy +- `1` — One or more services unhealthy + +## Common Script Properties + +All scripts: +- Use `#!/bin/bash` with `set -euo pipefail` +- Support `--help` flag for usage information +- Source `.env` from project root if present +- Are idempotent where possible +- Support remote execution via SSH when `DEPLOY_HOST` is set +``` diff --git a/.cursor/skills/document/SKILL.md b/.cursor/skills/document/SKILL.md new file mode 100644 index 0000000..c920555 --- /dev/null +++ b/.cursor/skills/document/SKILL.md @@ -0,0 +1,515 @@ +--- +name: document +description: | + Bottom-up codebase documentation skill. Analyzes existing code from modules up through components + to architecture, then retrospectively derives problem/restrictions/acceptance criteria. + Produces the same _docs/ artifacts as the problem, research, and plan skills, but from code + analysis instead of user interview. + Trigger phrases: + - "document", "document codebase", "document this project" + - "documentation", "generate documentation", "create documentation" + - "reverse-engineer docs", "code to docs" + - "analyze and document" +category: build +tags: [documentation, code-analysis, reverse-engineering, architecture, bottom-up] +disable-model-invocation: true +--- + +# Bottom-Up Codebase Documentation + +Analyze an existing codebase from the bottom up — individual modules first, then components, then system-level architecture — and produce the same `_docs/` artifacts that the `problem` and `plan` skills generate, without requiring user interview. 
+ +## Core Principles + +- **Bottom-up always**: module docs -> component specs -> architecture/flows -> solution -> problem extraction. Every higher level is synthesized from the level below. +- **Dependencies first**: process modules in topological order (leaves first). When documenting module X, all of X's dependencies already have docs. +- **Incremental context**: each module's doc uses already-written dependency docs as context — no ever-growing chain. +- **Verify against code**: cross-reference every entity in generated docs against actual codebase. Catch hallucinations. +- **Save immediately**: write each artifact as soon as its step completes. Enable resume from any checkpoint. +- **Ask, don't assume**: when code intent is ambiguous, ASK the user before proceeding. + +## Context Resolution + +Fixed paths: + +- DOCUMENT_DIR: `_docs/02_document/` +- SOLUTION_DIR: `_docs/01_solution/` +- PROBLEM_DIR: `_docs/00_problem/` + +Optional input: + +- FOCUS_DIR: a specific directory subtree provided by the user (e.g., `/document @src/api/`). When set, only this subtree and its transitive dependencies are analyzed. + +Announce resolved paths (and FOCUS_DIR if set) to user before proceeding. + +## Mode Detection + +Determine the execution mode before any other logic: + +| Mode | Trigger | Scope | +|------|---------|-------| +| **Full** | No input file, no existing state | Entire codebase | +| **Focus Area** | User provides a directory path (e.g., `@src/api/`) | Only the specified subtree + transitive dependencies | +| **Resume** | `state.json` exists in DOCUMENT_DIR | Continue from last checkpoint | + +Focus Area mode produces module + component docs for the targeted area only. It can be run repeatedly for different areas — each run appends to the existing module and component docs without overwriting other areas. + +## Prerequisite Checks + +1. 
If `_docs/` already exists and contains files AND mode is **Full**, ASK user: **overwrite, merge, or write to `_docs_generated/` instead?** +2. Create DOCUMENT_DIR, SOLUTION_DIR, and PROBLEM_DIR if they don't exist +3. If DOCUMENT_DIR contains a `state.json`, offer to **resume from last checkpoint or start fresh** +4. If FOCUS_DIR is set, verify the directory exists and contains source files — **STOP if missing** + +## Progress Tracking + +Create a TodoWrite with all steps (0 through 7). Update status as each step completes. + +## Workflow + +### Step 0: Codebase Discovery + +**Role**: Code analyst +**Goal**: Build a complete map of the codebase (or targeted subtree) before analyzing any code. + +**Focus Area scoping**: if FOCUS_DIR is set, limit the scan to that directory subtree. Still identify transitive dependencies outside FOCUS_DIR (modules that FOCUS_DIR imports) and include them in the processing order, but skip modules that are neither inside FOCUS_DIR nor dependencies of it. + +Scan and catalog: + +1. Directory tree (ignore `node_modules`, `.git`, `__pycache__`, `bin/`, `obj/`, build artifacts) +2. Language detection from file extensions and config files +3. Package manifests: `package.json`, `requirements.txt`, `pyproject.toml`, `*.csproj`, `Cargo.toml`, `go.mod` +4. Config files: `Dockerfile`, `docker-compose.yml`, `.env.example`, CI/CD configs (`.github/workflows/`, `.gitlab-ci.yml`, `azure-pipelines.yml`) +5. Entry points: `main.*`, `app.*`, `index.*`, `Program.*`, startup scripts +6. Test structure: test directories, test frameworks, test runner configs +7. Existing documentation: README, `docs/`, wiki references, inline doc coverage +8. **Dependency graph**: build a module-level dependency graph by analyzing imports/references. 
Identify: + - Leaf modules (no internal dependencies) + - Entry points (no internal dependents) + - Cycles (mark for grouped analysis) + - Topological processing order + - If FOCUS_DIR: mark which modules are in-scope vs dependency-only + +**Save**: `DOCUMENT_DIR/00_discovery.md` containing: +- Directory tree (concise, relevant directories only) +- Tech stack summary table (language, framework, database, infra) +- Dependency graph (textual list + Mermaid diagram) +- Topological processing order +- Entry points and leaf modules + +**Save**: `DOCUMENT_DIR/state.json` with initial state: +```json +{ + "current_step": "module-analysis", + "completed_steps": ["discovery"], + "focus_dir": null, + "modules_total": 0, + "modules_documented": [], + "modules_remaining": [], + "module_batch": 0, + "components_written": [], + "last_updated": "" +} +``` + +Set `focus_dir` to the FOCUS_DIR path if in Focus Area mode, or `null` for Full mode. + +--- + +### Step 1: Module-Level Documentation + +**Role**: Code analyst +**Goal**: Document every identified module individually, processing in topological order (leaves first). + +**Batched processing**: process modules in batches of ~5 (sorted by topological order). After each batch: save all module docs, update `state.json`, present a progress summary. Between batches, evaluate whether to suggest a session break. + +For each module in topological order: + +1. **Read**: read the module's source code. Assess complexity and what context is needed. +2. **Gather context**: collect already-written docs of this module's dependencies (available because of bottom-up order). Note external library usage. +3. 
**Write module doc** with these sections: + - **Purpose**: one-sentence responsibility + - **Public interface**: exported functions/classes/methods with signatures, input/output types + - **Internal logic**: key algorithms, patterns, non-obvious behavior + - **Dependencies**: what it imports internally and why + - **Consumers**: what uses this module (from the dependency graph) + - **Data models**: entities/types defined in this module + - **Configuration**: env vars, config keys consumed + - **External integrations**: HTTP calls, DB queries, queue operations, file I/O + - **Security**: auth checks, encryption, input validation, secrets access + - **Tests**: what tests exist for this module, what they cover +4. **Verify**: cross-check that every entity referenced in the doc exists in the codebase. Flag uncertainties. + +**Cycle handling**: modules in a dependency cycle are analyzed together as a group, producing a single combined doc. + +**Large modules**: if a module exceeds comfortable analysis size, split into logical sub-sections and analyze each part, then combine. + +**Save**: `DOCUMENT_DIR/modules/[module_name].md` for each module. +**State**: update `state.json` after each module completes (move from `modules_remaining` to `modules_documented`). Increment `module_batch` after each batch of ~5. 
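The batched, checkpointed loop above can be sketched as follows. This is a minimal illustration only — `document_module`, `run_module_batches`, and the `/`→`_` filename convention are hypothetical names, not part of the skill; the directory layout, batch size, and `state.json` fields mirror the schema from Step 0:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

BATCH_SIZE = 5
DOCUMENT_DIR = Path("_docs/02_document")

def document_module(name: str) -> str:
    """Stand-in for the per-module analysis; returns the module doc text."""
    return f"# Module: {name}\n"

def run_module_batches(state: dict) -> dict:
    """Process remaining modules in topological batches of ~5,
    checkpointing state.json after every completed module."""
    while state["modules_remaining"]:
        # Slice copies the batch, so removing from the list below is safe.
        for module in state["modules_remaining"][:BATCH_SIZE]:
            doc_path = DOCUMENT_DIR / "modules" / (module.replace("/", "_") + ".md")
            doc_path.parent.mkdir(parents=True, exist_ok=True)
            doc_path.write_text(document_module(module))
            # Move the module from remaining to documented, then checkpoint.
            state["modules_remaining"].remove(module)
            state["modules_documented"].append(module)
            state["last_updated"] = datetime.now(timezone.utc).isoformat()
            (DOCUMENT_DIR / "state.json").write_text(json.dumps(state, indent=2))
        state["module_batch"] += 1  # one batch of ~5 finished
    return state
```

Because the state file is rewritten after every module (not every batch), an interruption at any point loses at most the module currently being analyzed.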
+ +**Session break heuristic**: after each batch, if more than 10 modules remain AND 2+ batches have already completed in this session, suggest a session break: + +``` +══════════════════════════════════════ + SESSION BREAK SUGGESTED +══════════════════════════════════════ + Modules documented: [X] of [Y] + Batches completed this session: [N] +══════════════════════════════════════ + A) Continue in this conversation + B) Save and continue in a fresh conversation (recommended) +══════════════════════════════════════ + Recommendation: B — fresh context improves + analysis quality for remaining modules +══════════════════════════════════════ +``` + +Re-entry is seamless: `state.json` tracks exactly which modules are done. + +--- + +### Step 2: Component Assembly + +**Role**: Software architect +**Goal**: Group related modules into logical components and produce component specs. + +1. Analyze module docs from Step 1 to identify natural groupings: + - By directory structure (most common) + - By shared data models or common purpose + - By dependency clusters (tightly coupled modules) +2. For each identified component, synthesize its module docs into a single component specification using `templates/component-spec.md` as structure: + - High-level overview: purpose, pattern, upstream/downstream + - Internal interfaces: method signatures, DTOs (from actual module code) + - External API specification (if the component exposes HTTP/gRPC endpoints) + - Data access patterns: queries, caching, storage estimates + - Implementation details: algorithmic complexity, state management, key libraries + - Extensions and helpers: shared utilities needed + - Caveats and edge cases: limitations, race conditions, bottlenecks + - Dependency graph: implementation order relative to other components + - Logging strategy +3. Identify common helpers shared across multiple components -> document in `common-helpers/` +4. 
Generate component relationship diagram (Mermaid) + +**Self-verification**: +- [ ] Every module from Step 1 is covered by exactly one component +- [ ] No component has overlapping responsibility with another +- [ ] Inter-component interfaces are explicit (who calls whom, with what) +- [ ] Component dependency graph has no circular dependencies + +**Save**: +- `DOCUMENT_DIR/components/[##]_[name]/description.md` per component +- `DOCUMENT_DIR/common-helpers/[##]_helper_[name].md` per shared helper +- `DOCUMENT_DIR/diagrams/components.md` (Mermaid component diagram) + +**BLOCKING**: Present component list with one-line summaries to user. Do NOT proceed until user confirms the component breakdown is correct. + +--- + +### Step 3: System-Level Synthesis + +**Role**: Software architect +**Goal**: From component docs, synthesize system-level documents. + +All documents here are derived from component docs (Step 2) + module docs (Step 1). No new code reading should be needed. If it is, that indicates a gap in Steps 1-2 — go back and fill it. + +#### 3a. Architecture + +Using `templates/architecture.md` as structure: + +- System context and boundaries from entry points and external integrations +- Tech stack table from discovery (Step 0) + component specs +- Deployment model from Dockerfiles, CI configs, environment strategies +- Data model overview from per-component data access sections +- Integration points from inter-component interfaces +- NFRs from test thresholds, config limits, health checks +- Security architecture from per-module security observations +- Key ADRs inferred from technology choices and patterns + +**Save**: `DOCUMENT_DIR/architecture.md` + +#### 3b. 
System Flows + +Using `templates/system-flows.md` as structure: + +- Trace main flows through the component interaction graph +- Entry point -> component chain -> output for each major flow +- Mermaid sequence diagrams and flowcharts +- Error scenarios from exception handling patterns +- Data flow tables per flow + +**Save**: `DOCUMENT_DIR/system-flows.md` and `DOCUMENT_DIR/diagrams/flows/flow_[name].md` + +#### 3c. Data Model + +- Consolidate all data models from module docs +- Entity-relationship diagram (Mermaid ERD) +- Migration strategy (if ORM/migration tooling detected) +- Seed data observations +- Backward compatibility approach (if versioning found) + +**Save**: `DOCUMENT_DIR/data_model.md` + +#### 3d. Deployment (if Dockerfile/CI configs exist) + +- Containerization summary +- CI/CD pipeline structure +- Environment strategy (dev, staging, production) +- Observability (logging patterns, metrics, health checks found in code) + +**Save**: `DOCUMENT_DIR/deployment/` (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md — only files for which sufficient code evidence exists) + +--- + +### Step 4: Verification Pass + +**Role**: Quality verifier +**Goal**: Compare every generated document against actual code. Fix hallucinations, fill gaps, correct inaccuracies. + +For each document generated in Steps 1-3: + +1. **Entity verification**: extract all code entities (class names, function names, module names, endpoints) mentioned in the doc. Cross-reference each against the actual codebase. Flag any that don't exist. +2. **Interface accuracy**: for every method signature, DTO, or API endpoint in component specs, verify it matches actual code. +3. **Flow correctness**: for each system flow diagram, trace the actual code path and verify the sequence matches. +4. **Completeness check**: are there modules or components discovered in Step 0 that aren't covered by any document? Flag gaps. +5. 
**Consistency check**: do component docs agree with architecture doc? Do flow diagrams match component interfaces? + +Apply corrections inline to the documents that need them. + +**Save**: `DOCUMENT_DIR/04_verification_log.md` with: +- Total entities verified vs flagged +- Corrections applied (which document, what changed) +- Remaining gaps or uncertainties +- Completeness score (modules covered / total modules) + +**BLOCKING**: Present verification summary to user. Do NOT proceed until user confirms corrections are acceptable or requests additional fixes. + +**Session boundary**: After verification is confirmed, suggest a session break before proceeding to the synthesis steps (5–7). These steps produce different artifact types and benefit from fresh context: + +``` +══════════════════════════════════════ + VERIFICATION COMPLETE — session break? +══════════════════════════════════════ + Steps 0–4 (analysis + verification) are done. + Steps 5–7 (solution + problem extraction + report) + can run in a fresh conversation. +══════════════════════════════════════ + A) Continue in this conversation + B) Save and continue in a new conversation (recommended) +══════════════════════════════════════ +``` + +If **Focus Area mode**: Steps 5–7 are skipped (they require full codebase coverage). Present a summary of modules and components documented for this area. The user can run `/document` again for another area, or run without FOCUS_DIR once all areas are covered to produce the full synthesis. + +--- + +### Step 5: Solution Extraction (Retrospective) + +**Role**: Software architect +**Goal**: From all verified technical documentation, retrospectively create `solution.md` — the same artifact the research skill produces. This makes downstream skills (`plan`, `deploy`, `decompose`) compatible with the documented codebase. + +Synthesize from architecture (Step 3) + component specs (Step 2) + system flows (Step 3) + verification findings (Step 4): + +1. 
**Product Solution Description**: what the system is, brief component interaction diagram (Mermaid) +2. **Architecture**: the architecture that is implemented, with per-component solution tables: + +| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit | +|----------|-------|-----------|-------------|-------------|----------|------|-----| +| [actual implementation] | [libs/platforms used] | [observed strengths] | [observed limitations] | [requirements met] | [security approach] | [cost indicators] | [fitness assessment] | + +3. **Testing Strategy**: summarize integration/functional tests and non-functional tests found in the codebase +4. **References**: links to key config files, Dockerfiles, CI configs that evidence the solution choices + +**Save**: `SOLUTION_DIR/solution.md` (`_docs/01_solution/solution.md`) + +--- + +### Step 6: Problem Extraction (Retrospective) + +**Role**: Business analyst +**Goal**: From all verified technical docs, retrospectively derive the high-level problem definition — producing the same documents the `problem` skill creates through interview. + +This is the inverse of normal workflow: instead of problem -> solution -> code, we go code -> technical docs -> problem understanding. + +#### 6a. `problem.md` + +- Synthesize from architecture overview + component purposes + system flows +- What is this system? What problem does it solve? Who are the users? How does it work at a high level? +- Cross-reference with README if one exists +- Free-form text, concise, readable by someone unfamiliar with the project + +#### 6b. `restrictions.md` + +- Extract from: tech stack choices, Dockerfile specs (OS, base images), CI configs (platform constraints), dependency versions, environment configs +- Categorize with headers: Hardware, Software, Environment, Operational +- Each restriction should be specific and testable + +#### 6c. 
`acceptance_criteria.md` + +- Derive from: test assertions (expected values, thresholds), performance configs (timeouts, rate limits, batch sizes), health check endpoints, validation rules in code +- Categorize with headers by domain +- Every criterion must have a measurable value — if only implied, note the source + +#### 6d. `input_data/` + +- Document data schemas found (DB schemas, API request/response types, config file formats) +- Create `data_parameters.md` describing what data the system consumes, formats, volumes, update patterns + +#### 6e. `security_approach.md` (only if security code found) + +- Authentication mechanisms, authorization patterns, encryption, secrets handling, CORS, rate limiting, input sanitization — all from code observations +- If no security-relevant code found, skip this file + +**Save**: all files to `PROBLEM_DIR/` (`_docs/00_problem/`) + +**BLOCKING**: Present all problem documents to user. These are the most abstracted and therefore most prone to interpretation error. Do NOT proceed until user confirms or requests corrections. + +--- + +### Step 7: Final Report + +**Role**: Technical writer +**Goal**: Produce `FINAL_report.md` integrating all generated documentation. + +Using `templates/final-report.md` as structure: + +- Executive summary from architecture + problem docs +- Problem statement (transformed from problem.md, not copy-pasted) +- Architecture overview with tech stack one-liner +- Component summary table (number, name, purpose, dependencies) +- System flows summary table +- Risk observations from verification log (Step 4) +- Open questions (uncertainties flagged during analysis) +- Artifact index listing all generated documents with paths + +**Save**: `DOCUMENT_DIR/FINAL_report.md` + +**State**: update `state.json` with `current_step: "complete"`. 
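Because `state.json` anchors every checkpoint above, re-entry reconciles it against the files actually on disk, trusting files when the two disagree. A minimal sketch — the function name and the `/`→`_` filename convention are assumptions for illustration, not part of the skill:

```python
import json
from pathlib import Path

def reconcile_state(document_dir: Path) -> dict:
    """Cross-check state.json against the module docs already on disk.
    If a doc file exists but state.json still lists its module as
    remaining, the file wins: mark the module as documented."""
    state = json.loads((document_dir / "state.json").read_text())
    on_disk = {p.stem for p in (document_dir / "modules").glob("*.md")}
    for module in list(state["modules_remaining"]):
        if module.replace("/", "_") in on_disk:  # assumed filename convention
            state["modules_remaining"].remove(module)
            state["modules_documented"].append(module)
    return state
```

Running this before choosing the next module makes resume idempotent: a doc written just before an interruption is never analyzed twice.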
+ +--- + +## Artifact Management + +### Directory Structure + +``` +_docs/ +├── 00_problem/ # Step 6 (retrospective) +│ ├── problem.md +│ ├── restrictions.md +│ ├── acceptance_criteria.md +│ ├── input_data/ +│ │ └── data_parameters.md +│ └── security_approach.md +├── 01_solution/ # Step 5 (retrospective) +│ └── solution.md +└── 02_document/ # DOCUMENT_DIR + ├── 00_discovery.md # Step 0 + ├── modules/ # Step 1 + │ ├── [module_name].md + │ └── ... + ├── components/ # Step 2 + │ ├── 01_[name]/description.md + │ ├── 02_[name]/description.md + │ └── ... + ├── common-helpers/ # Step 2 + ├── architecture.md # Step 3 + ├── system-flows.md # Step 3 + ├── data_model.md # Step 3 + ├── deployment/ # Step 3 + ├── diagrams/ # Steps 2-3 + │ ├── components.md + │ └── flows/ + ├── 04_verification_log.md # Step 4 + ├── FINAL_report.md # Step 7 + └── state.json # Resumability +``` + +### Resumability + +Maintain `DOCUMENT_DIR/state.json`: + +```json +{ + "current_step": "module-analysis", + "completed_steps": ["discovery"], + "focus_dir": null, + "modules_total": 12, + "modules_documented": ["utils/helpers", "models/user"], + "modules_remaining": ["services/auth", "api/endpoints"], + "module_batch": 1, + "components_written": [], + "last_updated": "2026-03-21T14:00:00Z" +} +``` + +Update after each module/component completes. If interrupted, resume from next undocumented module. + +When resuming: +1. Read `state.json` +2. Cross-check against actual files in DOCUMENT_DIR (trust files over state if they disagree) +3. Continue from the next incomplete item +4. Inform user which steps are being skipped + +### Save Principles + +1. **Save immediately**: write each module doc as soon as analysis completes +2. **Incremental context**: each subsequent module uses already-written docs as context +3. **Preserve intermediates**: keep all module docs even after synthesis into component docs +4. 
**Enable recovery**: state file tracks exact progress for resume + +## Escalation Rules + +| Situation | Action | +|-----------|--------| +| Minified/obfuscated code detected | WARN user, skip module, note in verification log | +| Module too large for context window | Split into sub-sections, analyze parts separately, combine | +| Cycle in dependency graph | Group cycled modules, analyze together as one doc | +| Generated code (protobuf, swagger-gen) | Note as generated, document the source spec instead | +| No tests found in codebase | Note gap in acceptance_criteria.md, derive AC from validation rules and config limits only | +| Contradictions between code and README | Flag in verification log, ASK user | +| Binary files or non-code assets | Skip, note in discovery | +| `_docs/` already exists | ASK user: overwrite, merge, or use `_docs_generated/` | +| Code intent is ambiguous | ASK user, do not guess | + +## Common Mistakes + +- **Top-down guessing**: never infer architecture before documenting modules. Build up, don't assume down. +- **Hallucinating entities**: always verify that referenced classes/functions/endpoints actually exist in code. +- **Skipping modules**: every source module must appear in exactly one module doc and one component. +- **Monolithic analysis**: don't try to analyze the entire codebase in one pass. Module by module, in order. +- **Inventing restrictions**: only document constraints actually evidenced in code, configs, or Dockerfiles. +- **Vague acceptance criteria**: "should be fast" is not a criterion. Extract actual numeric thresholds from code. +- **Writing code**: this skill produces documents, never implementation code. 
+ +## Methodology Quick Reference + +``` +┌──────────────────────────────────────────────────────────────────┐ +│ Bottom-Up Codebase Documentation (8-Step) │ +├──────────────────────────────────────────────────────────────────┤ +│ MODE: Full / Focus Area (@dir) / Resume (state.json) │ +│ PREREQ: Check _docs/ exists (overwrite/merge/new?) │ +│ PREREQ: Check state.json for resume │ +│ │ +│ 0. Discovery → dependency graph, tech stack, topo order │ +│ (Focus Area: scoped to FOCUS_DIR + transitive deps) │ +│ 1. Module Docs → per-module analysis (leaves first) │ +│ (batched ~5 modules; session break between batches) │ +│ 2. Component Assembly → group modules, write component specs │ +│ [BLOCKING: user confirms components] │ +│ 3. System Synthesis → architecture, flows, data model, deploy │ +│ 4. Verification → compare all docs vs code, fix errors │ +│ [BLOCKING: user reviews corrections] │ +│ [SESSION BREAK suggested before Steps 5–7] │ +│ ── Focus Area mode stops here ── │ +│ 5. Solution Extraction → retrospective solution.md │ +│ 6. Problem Extraction → retrospective problem, restrictions, AC │ +│ [BLOCKING: user confirms problem docs] │ +│ 7. 
Final Report → FINAL_report.md │ +├──────────────────────────────────────────────────────────────────┤ +│ Principles: Bottom-up always · Dependencies first │ +│ Incremental context · Verify against code │ +│ Save immediately · Resume from checkpoint │ +│ Batch modules · Session breaks for large codebases │ +└──────────────────────────────────────────────────────────────────┘ +``` diff --git a/.cursor/skills/implement/SKILL.md b/.cursor/skills/implement/SKILL.md index fb24044..cf44a57 100644 --- a/.cursor/skills/implement/SKILL.md +++ b/.cursor/skills/implement/SKILL.md @@ -73,9 +73,9 @@ For each task in the batch: - Determine: files OWNED (exclusive write), files READ-ONLY (shared interfaces, types), files FORBIDDEN (other agents' owned files) - If two tasks in the same batch would modify the same file, schedule them sequentially instead of in parallel -### 5. Update Jira Status → In Progress +### 5. Update Tracker Status → In Progress -For each task in the batch, transition its Jira ticket status to **In Progress** via Jira MCP before launching the implementer. +For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (Jira MCP or Azure DevOps MCP — see `protocols.md` for detection) before launching the implementer. If `tracker: local`, skip this step. ### 6. Launch Implementer Subagents @@ -93,15 +93,30 @@ Launch all subagents immediately — no user confirmation. - Collect structured status reports from each implementer - If any implementer reports "Blocked", log the blocker and continue with others +**Stuck detection** — while monitoring, watch for these signals per subagent: +- Same file modified 3+ times without test pass rate improving → flag as stuck, stop the subagent, report as Blocked +- Subagent has not produced new output for an extended period → flag as potentially hung +- If a subagent is flagged as stuck, do NOT let it continue looping — stop it and record the blocker in the batch report + ### 8. 
Code Review - Run `/code-review` skill on the batch's changed files + corresponding task specs - The code-review skill produces a verdict: PASS, PASS_WITH_WARNINGS, or FAIL -### 9. Gate +### 9. Auto-Fix Gate -- If verdict is **FAIL**: present findings to user (**BLOCKING**). User must confirm fixes or accept before proceeding. -- If verdict is **PASS** or **PASS_WITH_WARNINGS**: show findings as info, continue automatically. +Auto-fix loop with bounded retries (max 2 attempts) before escalating to user: + +1. If verdict is **PASS** or **PASS_WITH_WARNINGS**: show findings as info, continue automatically to step 10 +2. If verdict is **FAIL** (attempt 1 or 2): + - Parse the code review findings (Critical and High severity items) + - For each finding, attempt an automated fix using the finding's location, description, and suggestion + - Re-run `/code-review` on the modified files + - If now PASS or PASS_WITH_WARNINGS → continue to step 10 + - If still FAIL → increment retry counter, repeat from (2) up to max 2 attempts +3. If still **FAIL** after 2 auto-fix attempts: present all findings to user (**BLOCKING**). User must confirm fixes or accept before proceeding. + +Track `auto_fix_attempts` count in the batch report for retrospective analysis. ### 10. Test @@ -112,12 +127,12 @@ Launch all subagents immediately — no user confirmation. - After user confirms the batch (explicitly for FAIL, implicitly for PASS/PASS_WITH_WARNINGS): - `git add` all changed files from the batch - - `git commit` with a message that includes ALL JIRA-IDs of tasks implemented in the batch, followed by a summary of what was implemented. Format: `[JIRA-ID-1] [JIRA-ID-2] ... Summary of changes` + - `git commit` with a message that includes ALL task IDs (Jira IDs, ADO IDs, or numeric prefixes) of tasks implemented in the batch, followed by a summary of what was implemented. Format: `[TASK-ID-1] [TASK-ID-2] ... Summary of changes` - `git push` to the remote branch -### 12. 
Update Jira Status → In Testing +### 12. Update Tracker Status → In Testing -After the batch is committed and pushed, transition the Jira ticket status of each task in the batch to **In Testing** via Jira MCP. +After the batch is committed and pushed, transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step. ### 13. Loop @@ -146,6 +161,8 @@ After each batch, produce a structured report: | [JIRA-ID]_[name] | Done | [count] files | [pass/fail] | [count or None] | ## Code Review Verdict: [PASS/FAIL/PASS_WITH_WARNINGS] +## Auto-Fix Attempts: [0/1/2] +## Stuck Agents: [count or None] ## Next Batch: [task list] or "All tasks complete" ``` @@ -173,5 +190,5 @@ Each batch commit serves as a rollback checkpoint. If recovery is needed: - Never launch tasks whose dependencies are not yet completed - Never allow two parallel agents to write to the same file -- If a subagent fails, do NOT retry automatically — report and let user decide +- If a subagent fails or is flagged as stuck, stop it and report — do not let it loop indefinitely - Always run tests after each batch completes diff --git a/.cursor/skills/new-task/SKILL.md b/.cursor/skills/new-task/SKILL.md new file mode 100644 index 0000000..e68ff4c --- /dev/null +++ b/.cursor/skills/new-task/SKILL.md @@ -0,0 +1,302 @@ +--- +name: new-task +description: | + Interactive skill for adding new functionality to an existing codebase. + Guides the user through describing the feature, assessing complexity, + optionally running research, analyzing the codebase for insertion points, + validating assumptions with the user, and producing a task spec with Jira ticket. + Supports a loop — the user can add multiple tasks in one session. 
+ Trigger phrases: + - "new task", "add feature", "new functionality" + - "I want to add", "new component", "extend" +category: build +tags: [task, feature, interactive, planning, jira] +disable-model-invocation: true +--- + +# New Task (Interactive Feature Planning) + +Guide the user through defining new functionality for an existing codebase. Produces one or more task specifications with Jira tickets, optionally running deep research for complex features. + +## Core Principles + +- **User-driven**: every task starts with the user's description; never invent requirements +- **Right-size research**: only invoke the research skill when the change is big enough to warrant it +- **Validate before committing**: surface all assumptions and uncertainties to the user before writing the task file +- **Save immediately**: write task files to disk as soon as they are ready; never accumulate unsaved work +- **Ask, don't assume**: when scope, insertion point, or approach is unclear, STOP and ask the user + +## Context Resolution + +Fixed paths: + +- TASKS_DIR: `_docs/02_tasks/` +- PLANS_DIR: `_docs/02_task_plans/` +- DOCUMENT_DIR: `_docs/02_document/` +- DEPENDENCIES_TABLE: `_docs/02_tasks/_dependencies_table.md` + +Create TASKS_DIR and PLANS_DIR if they don't exist. + +If TASKS_DIR already contains task files, scan them to determine the next numeric prefix for temporary file naming. + +## Workflow + +The skill runs as a loop. Each iteration produces one task. After each task the user chooses to add another or finish. + +--- + +### Step 1: Gather Feature Description + +**Role**: Product analyst +**Goal**: Get a clear, detailed description of the new functionality from the user. + +Ask the user: + +``` +══════════════════════════════════════ + NEW TASK: Describe the functionality +══════════════════════════════════════ + Please describe in detail the new functionality you want to add: + - What should it do? + - Who is it for? + - Any specific requirements or constraints? 
+══════════════════════════════════════
+```
+
+**BLOCKING**: Do NOT proceed until the user provides a description.
+
+Record the description verbatim for use in subsequent steps.
+
+---
+
+### Step 2: Analyze Complexity
+
+**Role**: Technical analyst
+**Goal**: Determine whether deep research is needed.
+
+Read the user's description and the existing codebase documentation from DOCUMENT_DIR (architecture.md, components/, system-flows.md).
+
+Assess the change along these dimensions:
+- **Scope**: how many components/files are affected?
+- **Novelty**: does it involve libraries, protocols, or patterns not already in the codebase?
+- **Risk**: could it break existing functionality or require architectural changes?
+
+Classification:
+
+| Category | Criteria | Action |
+|----------|----------|--------|
+| **Needs research** | New libraries/frameworks, unfamiliar protocols, significant architectural change, multiple unknowns | Proceed to Step 3 (Research) |
+| **Skip research** | Extends existing functionality, uses patterns already in codebase, straightforward new component with known tech | Skip to Step 4 (Codebase Analysis) |
+
+Present the assessment to the user:
+
+```
+══════════════════════════════════════
+ COMPLEXITY ASSESSMENT
+══════════════════════════════════════
+ Scope: [low / medium / high]
+ Novelty: [low / medium / high]
+ Risk: [low / medium / high]
+══════════════════════════════════════
+ Recommendation: [Research needed / Skip research]
+ Reason: [one-line justification]
+══════════════════════════════════════
+```
+
+**BLOCKING**: Ask the user to confirm or override the recommendation before proceeding.
+
+---
+
+### Step 3: Research (conditional)
+
+**Role**: Researcher
+**Goal**: Investigate unknowns before task specification.
+
+This step only runs if Step 2 determined research is needed.
+
+1. Create a problem description file at `PLANS_DIR/[feature-slug]/problem.md` summarizing the feature request and the specific unknowns to investigate
+2. Invoke `.cursor/skills/research/SKILL.md` in standalone mode:
+   - INPUT_FILE: `PLANS_DIR/[feature-slug]/problem.md`
+   - BASE_DIR: `PLANS_DIR/[feature-slug]/`
+3. After research completes, read the solution draft from `PLANS_DIR/[feature-slug]/01_solution/solution_draft01.md`
+4. Extract the key findings relevant to the task specification
+
+The `[feature-slug]` is a short kebab-case name derived from the feature description (e.g., `auth-provider-integration`, `real-time-notifications`).
+
+---
+
+### Step 4: Codebase Analysis
+
+**Role**: Software architect
+**Goal**: Determine where and how to insert the new functionality.
+
+1. Read the codebase documentation from DOCUMENT_DIR:
+   - `architecture.md` — overall structure
+   - `components/` — component specs
+   - `system-flows.md` — data flows (if exists)
+   - `data_model.md` — data model (if exists)
+2. If research was performed (Step 3), incorporate findings
+3. Analyze and determine:
+   - Which existing components are affected
+   - Where new code should be inserted (which layers, modules, files)
+   - What interfaces need to change
+   - What new interfaces or models are needed
+   - How data flows through the change
+4. If the change is complex enough, read the actual source files (not just docs) to verify insertion points
+
+Present the analysis:
+
+```
+══════════════════════════════════════
+ CODEBASE ANALYSIS
+══════════════════════════════════════
+ Affected components: [list]
+ Insertion points: [list of modules/layers]
+ Interface changes: [list or "None"]
+ New interfaces: [list or "None"]
+ Data flow impact: [summary]
+══════════════════════════════════════
+```
+
+---
+
+### Step 5: Validate Assumptions
+
+**Role**: Quality gate
+**Goal**: Surface every uncertainty and get user confirmation.
+
+Review all decisions and assumptions made in Steps 2–4. For each uncertainty:
+1. State the assumption clearly
+2. Propose a solution or approach
+3. 
List alternatives if they exist + +Present using the Choose format for each decision that has meaningful alternatives: + +``` +══════════════════════════════════════ + ASSUMPTION VALIDATION +══════════════════════════════════════ + 1. [Assumption]: [proposed approach] + Alternative: [other option, if any] + 2. [Assumption]: [proposed approach] + Alternative: [other option, if any] + ... +══════════════════════════════════════ + Please confirm or correct these assumptions. +══════════════════════════════════════ +``` + +**BLOCKING**: Do NOT proceed until the user confirms or corrects all assumptions. + +--- + +### Step 6: Create Task + +**Role**: Technical writer +**Goal**: Produce the task specification file. + +1. Determine the next numeric prefix by scanning TASKS_DIR for existing files +2. Write the task file using `.cursor/skills/decompose/templates/task.md`: + - Fill all fields from the gathered information + - Set **Complexity** based on the assessment from Step 2 + - Set **Dependencies** by cross-referencing existing tasks in TASKS_DIR + - Set **Jira** and **Epic** to `pending` (filled in Step 7) +3. Save as `TASKS_DIR/[##]_[short_name].md` + +**Self-verification**: +- [ ] Problem section clearly describes the user need +- [ ] Acceptance criteria are testable (Gherkin format) +- [ ] Scope boundaries are explicit +- [ ] Complexity points match the assessment +- [ ] Dependencies reference existing task Jira IDs where applicable +- [ ] No implementation details leaked into the spec + +--- + +### Step 7: Work Item Ticket + +**Role**: Project coordinator +**Goal**: Create a work item ticket and link it to the task file. + +1. 
Create a ticket via the configured work item tracker (Jira MCP or Azure DevOps MCP — see `autopilot/protocols.md` for detection): + - Summary: the task's **Name** field + - Description: the task's **Problem** and **Acceptance Criteria** sections + - Story points: the task's **Complexity** value + - Link to the appropriate epic (ask user if unclear which epic) +2. Write the ticket ID and Epic ID back into the task file header: + - Update **Task** field: `[TICKET-ID]_[short_name]` + - Update **Jira** field: `[TICKET-ID]` + - Update **Epic** field: `[EPIC-ID]` +3. Rename the file from `[##]_[short_name].md` to `[TICKET-ID]_[short_name].md` + +If the work item tracker is not authenticated or unavailable (`tracker: local`): +- Keep the numeric prefix +- Set **Jira** to `pending` +- Set **Epic** to `pending` +- The task is still valid and can be implemented; tracker sync happens later + +--- + +### Step 8: Loop Gate + +Ask the user: + +``` +══════════════════════════════════════ + Task created: [JIRA-ID or ##] — [task name] +══════════════════════════════════════ + A) Add another task + B) Done — finish and update dependencies +══════════════════════════════════════ +``` + +- If **A** → loop back to Step 1 +- If **B** → proceed to Finalize + +--- + +### Finalize + +After the user chooses **Done**: + +1. Update (or create) `TASKS_DIR/_dependencies_table.md` — add all newly created tasks to the dependencies table +2. Present a summary of all tasks created in this session: + +``` +══════════════════════════════════════ + NEW TASK SUMMARY +══════════════════════════════════════ + Tasks created: N + Total complexity: M points + ───────────────────────────────────── + [JIRA-ID] [name] ([complexity] pts) + [JIRA-ID] [name] ([complexity] pts) + ... 
+══════════════════════════════════════ +``` + +## Escalation Rules + +| Situation | Action | +|-----------|--------| +| User description is vague or incomplete | **ASK** for more detail — do not guess | +| Unclear which epic to link to | **ASK** user for the epic | +| Research skill hits a blocker | Follow research skill's own escalation rules | +| Codebase analysis reveals conflicting architectures | **ASK** user which pattern to follow | +| Complexity exceeds 5 points | **WARN** user and suggest splitting into multiple tasks | +| Jira MCP unavailable | **WARN**, continue with local-only task files | + +## Trigger Conditions + +When the user wants to: +- Add new functionality to an existing codebase +- Plan a new feature or component +- Create task specifications for upcoming work + +**Keywords**: "new task", "add feature", "new functionality", "extend", "I want to add" + +**Differentiation**: +- User wants to decompose an existing plan into tasks → use `/decompose` +- User wants to research a topic without creating tasks → use `/research` +- User wants to refactor existing code → use `/refactor` +- User wants to define and plan a new feature → use this skill diff --git a/.cursor/skills/new-task/templates/task.md b/.cursor/skills/new-task/templates/task.md new file mode 100644 index 0000000..3a52cf9 --- /dev/null +++ b/.cursor/skills/new-task/templates/task.md @@ -0,0 +1,2 @@ + + diff --git a/.cursor/skills/plan/SKILL.md b/.cursor/skills/plan/SKILL.md index 5ee6222..b1cc48d 100644 --- a/.cursor/skills/plan/SKILL.md +++ b/.cursor/skills/plan/SKILL.md @@ -3,7 +3,7 @@ name: plan description: | Decompose a solution into architecture, data model, deployment plan, system flows, components, tests, and Jira epics. Systematic 6-step planning workflow with BLOCKING gates, self-verification, and structured artifact management. - Uses _docs/ + _docs/02_plans/ structure. + Uses _docs/ + _docs/02_document/ structure. 
Trigger phrases: - "plan", "decompose solution", "architecture planning" - "break down the solution", "create planning documents" @@ -31,13 +31,11 @@ Fixed paths — no mode detection needed: - PROBLEM_FILE: `_docs/00_problem/problem.md` - SOLUTION_FILE: `_docs/01_solution/solution.md` -- PLANS_DIR: `_docs/02_plans/` +- DOCUMENT_DIR: `_docs/02_document/` Announce the resolved paths to the user before proceeding. -## Input Specification - -### Required Files +## Required Files | File | Purpose | |------|---------| @@ -47,170 +45,23 @@ Announce the resolved paths to the user before proceeding. | `_docs/00_problem/input_data/` | Reference data examples | | `_docs/01_solution/solution.md` | Finalized solution to decompose | -### Prerequisite Checks (BLOCKING) +## Prerequisites -Run sequentially before any planning step: - -**Prereq 1: Data Gate** - -1. `_docs/00_problem/acceptance_criteria.md` exists and is non-empty — **STOP if missing** -2. `_docs/00_problem/restrictions.md` exists and is non-empty — **STOP if missing** -3. `_docs/00_problem/input_data/` exists and contains at least one data file — **STOP if missing** -4. `_docs/00_problem/problem.md` exists and is non-empty — **STOP if missing** - -All four are mandatory. If any is missing or empty, STOP and ask the user to provide them. If the user cannot provide the required data, planning cannot proceed — just stop. - -**Prereq 2: Finalize Solution Draft** - -Only runs after the Data Gate passes: - -1. Scan `_docs/01_solution/` for files matching `solution_draft*.md` -2. Identify the highest-numbered draft (e.g. `solution_draft06.md`) -3. **Rename** it to `_docs/01_solution/solution.md` -4. If `solution.md` already exists, ask the user whether to overwrite or keep existing -5. Verify `solution.md` is non-empty — **STOP if missing or empty** - -**Prereq 3: Workspace Setup** - -1. Create PLANS_DIR if it does not exist -2. 
If PLANS_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?** +Read and follow `steps/00_prerequisites.md`. All three prerequisite checks are **BLOCKING** — do not start the workflow until they pass. ## Artifact Management -### Directory Structure - -All artifacts are written directly under PLANS_DIR: - -``` -PLANS_DIR/ -├── integration_tests/ -│ ├── environment.md -│ ├── test_data.md -│ ├── functional_tests.md -│ ├── non_functional_tests.md -│ └── traceability_matrix.md -├── architecture.md -├── system-flows.md -├── data_model.md -├── deployment/ -│ ├── containerization.md -│ ├── ci_cd_pipeline.md -│ ├── environment_strategy.md -│ ├── observability.md -│ └── deployment_procedures.md -├── risk_mitigations.md -├── risk_mitigations_02.md (iterative, ## as sequence) -├── components/ -│ ├── 01_[name]/ -│ │ ├── description.md -│ │ └── tests.md -│ ├── 02_[name]/ -│ │ ├── description.md -│ │ └── tests.md -│ └── ... -├── common-helpers/ -│ ├── 01_helper_[name]/ -│ ├── 02_helper_[name]/ -│ └── ... -├── diagrams/ -│ ├── components.drawio -│ └── flows/ -│ ├── flow_[name].md (Mermaid) -│ └── ... 
-└── FINAL_report.md -``` - -### Save Timing - -| Step | Save immediately after | Filename | -|------|------------------------|----------| -| Step 1 | Integration test environment spec | `integration_tests/environment.md` | -| Step 1 | Integration test data spec | `integration_tests/test_data.md` | -| Step 1 | Integration functional tests | `integration_tests/functional_tests.md` | -| Step 1 | Integration non-functional tests | `integration_tests/non_functional_tests.md` | -| Step 1 | Integration traceability matrix | `integration_tests/traceability_matrix.md` | -| Step 2 | Architecture analysis complete | `architecture.md` | -| Step 2 | System flows documented | `system-flows.md` | -| Step 2 | Data model documented | `data_model.md` | -| Step 2 | Deployment plan complete | `deployment/` (5 files) | -| Step 3 | Each component analyzed | `components/[##]_[name]/description.md` | -| Step 3 | Common helpers generated | `common-helpers/[##]_helper_[name].md` | -| Step 3 | Diagrams generated | `diagrams/` | -| Step 4 | Risk assessment complete | `risk_mitigations.md` | -| Step 5 | Tests written per component | `components/[##]_[name]/tests.md` | -| Step 6 | Epics created in Jira | Jira via MCP | -| Final | All steps complete | `FINAL_report.md` | - -### Save Principles - -1. **Save immediately**: write to disk as soon as a step completes; do not wait until the end -2. **Incremental updates**: same file can be updated multiple times; append or replace -3. **Preserve process**: keep all intermediate files even after integration into final report -4. **Enable recovery**: if interrupted, resume from the last saved artifact (see Resumability) - -### Resumability - -If PLANS_DIR already contains artifacts: - -1. List existing files and match them to the save timing table above -2. Identify the last completed step based on which artifacts exist -3. Resume from the next incomplete step -4. 
Inform the user which steps are being skipped +Read `steps/01_artifact-management.md` for directory structure, save timing, save principles, and resumability rules. Refer to it throughout the workflow. ## Progress Tracking -At the start of execution, create a TodoWrite with all steps (1 through 6). Update status as each step completes. +At the start of execution, create a TodoWrite with all steps (1 through 6 plus Final). Update status as each step completes. ## Workflow -### Step 1: Integration Tests +### Step 1: Blackbox Tests -**Role**: Professional Quality Assurance Engineer -**Goal**: Analyze input data completeness and produce detailed black-box integration test specifications -**Constraints**: Spec only — no test code. Tests describe what the system should do given specific inputs, not how the system is built. - -#### Phase 1a: Input Data Completeness Analysis - -1. Read `_docs/01_solution/solution.md` (finalized in Prereq 2) -2. Read `acceptance_criteria.md`, `restrictions.md` -3. Read testing strategy from solution.md -4. Analyze `input_data/` contents against: - - Coverage of acceptance criteria scenarios - - Coverage of restriction edge cases - - Coverage of testing strategy requirements -5. Threshold: at least 70% coverage of the scenarios -6. If coverage is low, search the internet for supplementary data, assess quality with user, and if user agrees, add to `input_data/` -7. Present coverage assessment to user - -**BLOCKING**: Do NOT proceed until user confirms the input data coverage is sufficient. - -#### Phase 1b: Black-Box Test Scenario Specification - -Based on all acquired data, acceptance_criteria, and restrictions, form detailed test scenarios: - -1. Define test environment using `templates/integration-environment.md` as structure -2. Define test data management using `templates/integration-test-data.md` as structure -3. Write functional test scenarios (positive + negative) using `templates/integration-functional-tests.md` as structure -4. 
Write non-functional test scenarios (performance, resilience, security, edge cases) using `templates/integration-non-functional-tests.md` as structure -5. Build traceability matrix using `templates/integration-traceability-matrix.md` as structure - -**Self-verification**: -- [ ] Every acceptance criterion is covered by at least one test scenario -- [ ] Every restriction is verified by at least one test scenario -- [ ] Positive and negative scenarios are balanced -- [ ] Consumer app has no direct access to system internals -- [ ] Docker environment is self-contained (`docker compose up` sufficient) -- [ ] External dependencies have mock/stub services defined -- [ ] Traceability matrix has no uncovered AC or restrictions - -**Save action**: Write all files under `integration_tests/`: -- `environment.md` -- `test_data.md` -- `functional_tests.md` -- `non_functional_tests.md` -- `traceability_matrix.md` - -**BLOCKING**: Present test coverage summary (from traceability_matrix.md) to user. Do NOT proceed until confirmed. +Read and execute `.cursor/skills/test-spec/SKILL.md`. Capture any new questions, findings, or insights that arise during test specification — these feed forward into Steps 2 and 3. @@ -218,263 +69,37 @@ Capture any new questions, findings, or insights that arise during test specific ### Step 2: Solution Analysis -**Role**: Professional software architect -**Goal**: Produce `architecture.md`, `system-flows.md`, `data_model.md`, and `deployment/` from the solution draft -**Constraints**: No code, no component-level detail yet; focus on system-level view - -#### Phase 2a: Architecture & Flows - -1. Read all input files thoroughly -2. Incorporate findings, questions, and insights discovered during Step 1 (integration tests) -3. Research unknown or questionable topics via internet; ask user about ambiguities -4. Document architecture using `templates/architecture.md` as structure -5. 
Document system flows using `templates/system-flows.md` as structure - -**Self-verification**: -- [ ] Architecture covers all capabilities mentioned in solution.md -- [ ] System flows cover all main user/system interactions -- [ ] No contradictions with problem.md or restrictions.md -- [ ] Technology choices are justified -- [ ] Integration test findings are reflected in architecture decisions - -**Save action**: Write `architecture.md` and `system-flows.md` - -**BLOCKING**: Present architecture summary to user. Do NOT proceed until user confirms. - -#### Phase 2b: Data Model - -**Role**: Professional software architect -**Goal**: Produce a detailed data model document covering entities, relationships, and migration strategy - -1. Extract core entities from architecture.md and solution.md -2. Define entity attributes, types, and constraints -3. Define relationships between entities (Mermaid ERD) -4. Define migration strategy: versioning tool (EF Core migrations / Alembic / sql-migrate), reversibility requirement, naming convention -5. Define seed data requirements per environment (dev, staging) -6. Define backward compatibility approach for schema changes (additive-only by default) - -**Self-verification**: -- [ ] Every entity mentioned in architecture.md is defined -- [ ] Relationships are explicit with cardinality -- [ ] Migration strategy specifies reversibility requirement -- [ ] Seed data requirements defined -- [ ] Backward compatibility approach documented - -**Save action**: Write `data_model.md` - -#### Phase 2c: Deployment Planning - -**Role**: DevOps / Platform engineer -**Goal**: Produce deployment plan covering containerization, CI/CD, environment strategy, observability, and deployment procedures - -Use the `/deploy` skill's templates as structure for each artifact: - -1. Read architecture.md and restrictions.md for infrastructure constraints -2. Research Docker best practices for the project's tech stack -3. 
Define containerization plan: Dockerfile per component, docker-compose for dev and tests -4. Define CI/CD pipeline: stages, quality gates, caching, parallelization -5. Define environment strategy: dev, staging, production with secrets management -6. Define observability: structured logging, metrics, tracing, alerting -7. Define deployment procedures: strategy, health checks, rollback, checklist - -**Self-verification**: -- [ ] Every component has a Docker specification -- [ ] CI/CD pipeline covers lint, test, security, build, deploy -- [ ] Environment strategy covers dev, staging, production -- [ ] Observability covers logging, metrics, tracing, alerting -- [ ] Deployment procedures include rollback and health checks - -**Save action**: Write all 5 files under `deployment/`: -- `containerization.md` -- `ci_cd_pipeline.md` -- `environment_strategy.md` -- `observability.md` -- `deployment_procedures.md` +Read and follow `steps/02_solution-analysis.md`. --- ### Step 3: Component Decomposition -**Role**: Professional software architect -**Goal**: Decompose the architecture into components with detailed specs -**Constraints**: No code; only names, interfaces, inputs/outputs. Follow SRP strictly. - -1. Identify components from the architecture; think about separation, reusability, and communication patterns -2. Use integration test scenarios from Step 1 to validate component boundaries -3. If additional components are needed (data preparation, shared helpers), create them -4. For each component, write a spec using `templates/component-spec.md` as structure -5. Generate diagrams: - - draw.io component diagram showing relations (minimize line intersections, group semantically coherent components, place external users near their components) - - Mermaid flowchart per main control flow -6. Components can share and reuse common logic, same for multiple components. Hence for such occurences common-helpers folder is specified. 
- -**Self-verification**: -- [ ] Each component has a single, clear responsibility -- [ ] No functionality is spread across multiple components -- [ ] All inter-component interfaces are defined (who calls whom, with what) -- [ ] Component dependency graph has no circular dependencies -- [ ] All components from architecture.md are accounted for -- [ ] Every integration test scenario can be traced through component interactions - -**Save action**: Write: - - each component `components/[##]_[name]/description.md` - - common helper `common-helpers/[##]_helper_[name].md` - - diagrams `diagrams/` - -**BLOCKING**: Present component list with one-line summaries to user. Do NOT proceed until user confirms. +Read and follow `steps/03_component-decomposition.md`. --- ### Step 4: Architecture Review & Risk Assessment -**Role**: Professional software architect and analyst -**Goal**: Validate all artifacts for consistency, then identify and mitigate risks -**Constraints**: This is a review step — fix problems found, do not add new features - -#### 4a. Evaluator Pass (re-read ALL artifacts) - -Review checklist: -- [ ] All components follow Single Responsibility Principle -- [ ] All components follow dumb code / smart data principle -- [ ] Inter-component interfaces are consistent (caller's output matches callee's input) -- [ ] No circular dependencies in the dependency graph -- [ ] No missing interactions between components -- [ ] No over-engineering — is there a simpler decomposition? -- [ ] Security considerations addressed in component design -- [ ] Performance bottlenecks identified -- [ ] API contracts are consistent across components - -Fix any issues found before proceeding to risk identification. - -#### 4b. Risk Identification - -1. Identify technical and project risks -2. Assess probability and impact using `templates/risk-register.md` -3. Define mitigation strategies -4. 
Apply mitigations to architecture, flows, and component documents where applicable - -**Self-verification**: -- [ ] Every High/Critical risk has a concrete mitigation strategy -- [ ] Mitigations are reflected in the relevant component or architecture docs -- [ ] No new risks introduced by the mitigations themselves - -**Save action**: Write `risk_mitigations.md` - -**BLOCKING**: Present risk summary to user. Ask whether assessment is sufficient. - -**Iterative**: If user requests another round, repeat Step 4 and write `risk_mitigations_##.md` (## as sequence number). Continue until user confirms. +Read and follow `steps/04_review-risk.md`. --- ### Step 5: Test Specifications -**Role**: Professional Quality Assurance Engineer - -**Goal**: Write test specs for each component achieving minimum 75% acceptance criteria coverage - -**Constraints**: Test specs only — no test code. Each test must trace to an acceptance criterion. - -1. For each component, write tests using `templates/test-spec.md` as structure -2. Cover all 4 types: integration, performance, security, acceptance -3. Include test data management (setup, teardown, isolation) -4. Verify traceability: every acceptance criterion from `acceptance_criteria.md` must be covered by at least one test - -**Self-verification**: -- [ ] Every acceptance criterion has at least one test covering it -- [ ] Test inputs are realistic and well-defined -- [ ] Expected results are specific and measurable -- [ ] No component is left without tests - -**Save action**: Write each `components/[##]_[name]/tests.md` +Read and follow `steps/05_test-specifications.md`. --- ### Step 6: Jira Epics -**Role**: Professional product manager - -**Goal**: Create Jira epics from components, ordered by dependency - -**Constraints**: Be concise — fewer words with the same meaning is better - -1. **Create "Bootstrap & Initial Structure" epic first** — this epic will parent the `01_initial_structure` task created by the decompose skill. 
It covers project scaffolding: folder structure, shared models, interfaces, stubs, CI/CD config, DB migrations setup, test structure. -2. Generate Jira Epics for each component using Jira MCP, structured per `templates/epic-spec.md` -3. Order epics by dependency (Bootstrap epic is always first, then components based on their dependency graph) -4. Include effort estimation per epic (T-shirt size or story points range) -5. Ensure each epic has clear acceptance criteria cross-referenced with component specs -6. Generate updated draw.io diagram showing component-to-epic mapping - -**Self-verification**: -- [ ] "Bootstrap & Initial Structure" epic exists and is first in order -- [ ] "Integration Tests" epic exists -- [ ] Every component maps to exactly one epic -- [ ] Dependency order is respected (no epic depends on a later one) -- [ ] Acceptance criteria are measurable -- [ ] Effort estimates are realistic - -7. **Create "Integration Tests" epic** — this epic will parent the integration test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `integration_tests/`. - -**Save action**: Epics created in Jira via MCP +Read and follow `steps/06_jira-epics.md`. 
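Step 6's rule that epics follow the component dependency graph (Bootstrap always first, no epic before the epics it depends on) amounts to a topological sort. A minimal sketch using Python's standard-library `graphlib` (3.9+); the component names and dependency map below are hypothetical, not taken from any real plan:

```python
from graphlib import TopologicalSorter

# Hypothetical component dependency graph: each key depends on its values.
deps = {
    "api_gateway": {"auth_service", "task_store"},
    "auth_service": {"task_store"},
    "task_store": set(),
}

# Bootstrap epic is pinned first; the rest follow dependency order,
# so every epic appears after all epics it depends on.
epic_order = ["bootstrap_initial_structure"]
epic_order += list(TopologicalSorter(deps).static_order())
print(epic_order)
# ['bootstrap_initial_structure', 'task_store', 'auth_service', 'api_gateway']
```

As a side benefit, `static_order()` raises `graphlib.CycleError` on a circular dependency, which doubles as the "no epic depends on a later one" check.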
--- -## Quality Checklist (before FINAL_report.md) +### Final: Quality Checklist -Before writing the final report, verify ALL of the following: - -### Integration Tests -- [ ] Every acceptance criterion is covered in traceability_matrix.md -- [ ] Every restriction is verified by at least one test -- [ ] Positive and negative scenarios are balanced -- [ ] Docker environment is self-contained -- [ ] Consumer app treats main system as black box -- [ ] CI/CD integration and reporting defined - -### Architecture -- [ ] Covers all capabilities from solution.md -- [ ] Technology choices are justified -- [ ] Deployment model is defined -- [ ] Integration test findings are reflected in architecture decisions - -### Data Model -- [ ] Every entity from architecture.md is defined -- [ ] Relationships have explicit cardinality -- [ ] Migration strategy with reversibility requirement -- [ ] Seed data requirements defined -- [ ] Backward compatibility approach documented - -### Deployment -- [ ] Containerization plan covers all components -- [ ] CI/CD pipeline includes lint, test, security, build, deploy stages -- [ ] Environment strategy covers dev, staging, production -- [ ] Observability covers logging, metrics, tracing, alerting -- [ ] Deployment procedures include rollback and health checks - -### Components -- [ ] Every component follows SRP -- [ ] No circular dependencies -- [ ] All inter-component interfaces are defined and consistent -- [ ] No orphan components (unused by any flow) -- [ ] Every integration test scenario can be traced through component interactions - -### Risks -- [ ] All High/Critical risks have mitigations -- [ ] Mitigations are reflected in component/architecture docs -- [ ] User has confirmed risk assessment is sufficient - -### Tests -- [ ] Every acceptance criterion is covered by at least one test -- [ ] All 4 test types are represented per component (where applicable) -- [ ] Test data management is defined - -### Epics -- [ ] "Bootstrap & Initial 
Structure" epic exists -- [ ] "Integration Tests" epic exists -- [ ] Every component maps to an epic -- [ ] Dependency order is correct -- [ ] Acceptance criteria are measurable - -**Save action**: Write `FINAL_report.md` using `templates/final-report.md` as structure +Read and follow `steps/07_quality-checklist.md`. ## Common Mistakes @@ -486,7 +111,7 @@ Before writing the final report, verify ALL of the following: - **Copy-pasting problem.md**: the architecture doc should analyze and transform, not repeat the input - **Vague interfaces**: "component A talks to component B" is not enough; define the method, input, output - **Ignoring restrictions.md**: every constraint must be traceable in the architecture or risk register -- **Ignoring integration test findings**: insights from Step 1 must feed into architecture (Step 2) and component decomposition (Step 3) +- **Ignoring blackbox test findings**: insights from Step 1 must feed into architecture (Step 2) and component decomposition (Step 3) ## Escalation Rules @@ -505,31 +130,26 @@ Before writing the final report, verify ALL of the following: ``` ┌────────────────────────────────────────────────────────────────┐ -│ Solution Planning (6-Step Method) │ +│ Solution Planning (6-Step + Final) │ ├────────────────────────────────────────────────────────────────┤ -│ PREREQ 1: Data Gate (BLOCKING) │ -│ → verify AC, restrictions, input_data exist — STOP if not │ -│ PREREQ 2: Finalize solution draft │ -│ → rename highest solution_draft##.md to solution.md │ -│ PREREQ 3: Workspace setup │ -│ → create PLANS_DIR/ if needed │ +│ PREREQ: Data Gate (BLOCKING) │ +│ → verify AC, restrictions, input_data, solution exist │ │ │ -│ 1. Integration Tests → integration_tests/ (5 files) │ +│ 1. Blackbox Tests → test-spec/SKILL.md │ │ [BLOCKING: user confirms test coverage] │ -│ 2a. Architecture → architecture.md, system-flows.md │ +│ 2. 
Solution Analysis → architecture, data model, deployment │ │ [BLOCKING: user confirms architecture] │ -│ 2b. Data Model → data_model.md │ -│ 2c. Deployment → deployment/ (5 files) │ -│ 3. Component Decompose → components/[##]_[name]/description │ -│ [BLOCKING: user confirms decomposition] │ -│ 4. Review & Risk → risk_mitigations.md │ -│ [BLOCKING: user confirms risks, iterative] │ -│ 5. Test Specifications → components/[##]_[name]/tests.md │ -│ 6. Jira Epics → Jira via MCP │ +│ 3. Component Decomp → component specs + interfaces │ +│ [BLOCKING: user confirms components] │ +│ 4. Review & Risk → risk register, iterations │ +│ [BLOCKING: user confirms mitigations] │ +│ 5. Test Specifications → per-component test specs │ +│ 6. Jira Epics → epic per component + bootstrap │ │ ───────────────────────────────────────────────── │ -│ Quality Checklist → FINAL_report.md │ +│ Final: Quality Checklist → FINAL_report.md │ ├────────────────────────────────────────────────────────────────┤ -│ Principles: SRP · Dumb code/smart data · Save immediately │ -│ Ask don't assume · Plan don't code │ +│ Principles: Single Responsibility · Dumb code, smart data │ +│ Save immediately · Ask don't assume │ +│ Plan don't code │ └────────────────────────────────────────────────────────────────┘ ``` diff --git a/.cursor/skills/plan/steps/00_prerequisites.md b/.cursor/skills/plan/steps/00_prerequisites.md new file mode 100644 index 0000000..3eccbc8 --- /dev/null +++ b/.cursor/skills/plan/steps/00_prerequisites.md @@ -0,0 +1,27 @@ +## Prerequisite Checks (BLOCKING) + +Run sequentially before any planning step: + +### Prereq 1: Data Gate + +1. `_docs/00_problem/acceptance_criteria.md` exists and is non-empty — **STOP if missing** +2. `_docs/00_problem/restrictions.md` exists and is non-empty — **STOP if missing** +3. `_docs/00_problem/input_data/` exists and contains at least one data file — **STOP if missing** +4. 
`_docs/00_problem/problem.md` exists and is non-empty — **STOP if missing** + +All four are mandatory. If any is missing or empty, STOP and ask the user to provide them. If the user cannot provide the required data, planning cannot proceed — just stop. + +### Prereq 2: Finalize Solution Draft + +Only runs after the Data Gate passes: + +1. Scan `_docs/01_solution/` for files matching `solution_draft*.md` +2. Identify the highest-numbered draft (e.g. `solution_draft06.md`) +3. **Rename** it to `_docs/01_solution/solution.md` +4. If `solution.md` already exists, ask the user whether to overwrite or keep existing +5. Verify `solution.md` is non-empty — **STOP if missing or empty** + +### Prereq 3: Workspace Setup + +1. Create DOCUMENT_DIR if it does not exist +2. If DOCUMENT_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?** diff --git a/.cursor/skills/plan/steps/01_artifact-management.md b/.cursor/skills/plan/steps/01_artifact-management.md new file mode 100644 index 0000000..95af1d0 --- /dev/null +++ b/.cursor/skills/plan/steps/01_artifact-management.md @@ -0,0 +1,87 @@ +## Artifact Management + +### Directory Structure + +All artifacts are written directly under DOCUMENT_DIR: + +``` +DOCUMENT_DIR/ +├── tests/ +│ ├── environment.md +│ ├── test-data.md +│ ├── blackbox-tests.md +│ ├── performance-tests.md +│ ├── resilience-tests.md +│ ├── security-tests.md +│ ├── resource-limit-tests.md +│ └── traceability-matrix.md +├── architecture.md +├── system-flows.md +├── data_model.md +├── deployment/ +│ ├── containerization.md +│ ├── ci_cd_pipeline.md +│ ├── environment_strategy.md +│ ├── observability.md +│ └── deployment_procedures.md +├── risk_mitigations.md +├── risk_mitigations_02.md (iterative, ## as sequence) +├── components/ +│ ├── 01_[name]/ +│ │ ├── description.md +│ │ └── tests.md +│ ├── 02_[name]/ +│ │ ├── description.md +│ │ └── tests.md +│ └── ... 
+├── common-helpers/ +│ ├── 01_helper_[name]/ +│ ├── 02_helper_[name]/ +│ └── ... +├── diagrams/ +│ ├── components.drawio +│ └── flows/ +│ ├── flow_[name].md (Mermaid) +│ └── ... +└── FINAL_report.md +``` + +### Save Timing + +| Step | Save immediately after | Filename | +|------|------------------------|----------| +| Step 1 | Blackbox test environment spec | `tests/environment.md` | +| Step 1 | Blackbox test data spec | `tests/test-data.md` | +| Step 1 | Blackbox tests | `tests/blackbox-tests.md` | +| Step 1 | Blackbox performance tests | `tests/performance-tests.md` | +| Step 1 | Blackbox resilience tests | `tests/resilience-tests.md` | +| Step 1 | Blackbox security tests | `tests/security-tests.md` | +| Step 1 | Blackbox resource limit tests | `tests/resource-limit-tests.md` | +| Step 1 | Blackbox traceability matrix | `tests/traceability-matrix.md` | +| Step 2 | Architecture analysis complete | `architecture.md` | +| Step 2 | System flows documented | `system-flows.md` | +| Step 2 | Data model documented | `data_model.md` | +| Step 2 | Deployment plan complete | `deployment/` (5 files) | +| Step 3 | Each component analyzed | `components/[##]_[name]/description.md` | +| Step 3 | Common helpers generated | `common-helpers/[##]_helper_[name].md` | +| Step 3 | Diagrams generated | `diagrams/` | +| Step 4 | Risk assessment complete | `risk_mitigations.md` | +| Step 5 | Tests written per component | `components/[##]_[name]/tests.md` | +| Step 6 | Epics created in Jira | Jira via MCP | +| Final | All steps complete | `FINAL_report.md` | + +### Save Principles + +1. **Save immediately**: write to disk as soon as a step completes; do not wait until the end +2. **Incremental updates**: same file can be updated multiple times; append or replace +3. **Preserve process**: keep all intermediate files even after integration into final report +4. 
**Enable recovery**: if interrupted, resume from the last saved artifact (see Resumability) + +### Resumability + +If DOCUMENT_DIR already contains artifacts: + +1. List existing files and match them to the save timing table above +2. Identify the last completed step based on which artifacts exist +3. Resume from the next incomplete step +4. Inform the user which steps are being skipped diff --git a/.cursor/skills/plan/steps/02_solution-analysis.md b/.cursor/skills/plan/steps/02_solution-analysis.md new file mode 100644 index 0000000..701f409 --- /dev/null +++ b/.cursor/skills/plan/steps/02_solution-analysis.md @@ -0,0 +1,74 @@ +## Step 2: Solution Analysis + +**Role**: Professional software architect +**Goal**: Produce `architecture.md`, `system-flows.md`, `data_model.md`, and `deployment/` from the solution draft +**Constraints**: No code, no component-level detail yet; focus on system-level view + +### Phase 2a: Architecture & Flows + +1. Read all input files thoroughly +2. Incorporate findings, questions, and insights discovered during Step 1 (blackbox tests) +3. Research unknown or questionable topics via internet; ask user about ambiguities +4. Document architecture using `templates/architecture.md` as structure +5. Document system flows using `templates/system-flows.md` as structure + +**Self-verification**: +- [ ] Architecture covers all capabilities mentioned in solution.md +- [ ] System flows cover all main user/system interactions +- [ ] No contradictions with problem.md or restrictions.md +- [ ] Technology choices are justified +- [ ] Blackbox test findings are reflected in architecture decisions + +**Save action**: Write `architecture.md` and `system-flows.md` + +**BLOCKING**: Present architecture summary to user. Do NOT proceed until user confirms. + +### Phase 2b: Data Model + +**Role**: Professional software architect +**Goal**: Produce a detailed data model document covering entities, relationships, and migration strategy + +1. 
Extract core entities from architecture.md and solution.md +2. Define entity attributes, types, and constraints +3. Define relationships between entities (Mermaid ERD) +4. Define migration strategy: versioning tool (EF Core migrations / Alembic / sql-migrate), reversibility requirement, naming convention +5. Define seed data requirements per environment (dev, staging) +6. Define backward compatibility approach for schema changes (additive-only by default) + +**Self-verification**: +- [ ] Every entity mentioned in architecture.md is defined +- [ ] Relationships are explicit with cardinality +- [ ] Migration strategy specifies reversibility requirement +- [ ] Seed data requirements defined +- [ ] Backward compatibility approach documented + +**Save action**: Write `data_model.md` + +### Phase 2c: Deployment Planning + +**Role**: DevOps / Platform engineer +**Goal**: Produce deployment plan covering containerization, CI/CD, environment strategy, observability, and deployment procedures + +Use the `/deploy` skill's templates as structure for each artifact: + +1. Read architecture.md and restrictions.md for infrastructure constraints +2. Research Docker best practices for the project's tech stack +3. Define containerization plan: Dockerfile per component, docker-compose for dev and tests +4. Define CI/CD pipeline: stages, quality gates, caching, parallelization +5. Define environment strategy: dev, staging, production with secrets management +6. Define observability: structured logging, metrics, tracing, alerting +7. 
Define deployment procedures: strategy, health checks, rollback, checklist
+
+**Self-verification**:
+- [ ] Every component has a Docker specification
+- [ ] CI/CD pipeline covers lint, test, security, build, deploy
+- [ ] Environment strategy covers dev, staging, production
+- [ ] Observability covers logging, metrics, tracing, alerting
+- [ ] Deployment procedures include rollback and health checks
+
+**Save action**: Write all 5 files under `deployment/`:
+- `containerization.md`
+- `ci_cd_pipeline.md`
+- `environment_strategy.md`
+- `observability.md`
+- `deployment_procedures.md`
diff --git a/.cursor/skills/plan/steps/03_component-decomposition.md b/.cursor/skills/plan/steps/03_component-decomposition.md
new file mode 100644
index 0000000..c026e65
--- /dev/null
+++ b/.cursor/skills/plan/steps/03_component-decomposition.md
@@ -0,0 +1,29 @@
+## Step 3: Component Decomposition
+
+**Role**: Professional software architect
+**Goal**: Decompose the architecture into components with detailed specs
+**Constraints**: No code; only names, interfaces, inputs/outputs. Follow SRP strictly.
+
+1. Identify components from the architecture; think about separation, reusability, and communication patterns
+2. Use blackbox test scenarios from Step 1 to validate component boundaries
+3. If additional components are needed (data preparation, shared helpers), create them
+4. For each component, write a spec using `templates/component-spec.md` as structure
+5. Generate diagrams:
+   - draw.io component diagram showing relations (minimize line intersections, group semantically coherent components, place external users near their components)
+   - Mermaid flowchart per main control flow
+6. Components can share and reuse common logic. When the same logic is needed by multiple components, extract it into the `common-helpers/` folder.
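
For illustration, a minimal Mermaid flowchart of a hypothetical decomposition — the component names (`ingest`, `processor`, `reporter`) and the shared validation helper are invented examples, not prescribed names:

```mermaid
flowchart LR
    ingest["01_ingest"] --> processor["02_processor"]
    processor --> reporter["03_reporter"]
    helper["common-helpers/01_helper_validation"]
    ingest -. uses .-> helper
    processor -. uses .-> helper
```

Dotted edges mark shared-helper usage; solid edges mark the main data flow between components.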
+ +**Self-verification**: +- [ ] Each component has a single, clear responsibility +- [ ] No functionality is spread across multiple components +- [ ] All inter-component interfaces are defined (who calls whom, with what) +- [ ] Component dependency graph has no circular dependencies +- [ ] All components from architecture.md are accounted for +- [ ] Every blackbox test scenario can be traced through component interactions + +**Save action**: Write: + - each component `components/[##]_[name]/description.md` + - common helper `common-helpers/[##]_helper_[name].md` + - diagrams `diagrams/` + +**BLOCKING**: Present component list with one-line summaries to user. Do NOT proceed until user confirms. diff --git a/.cursor/skills/plan/steps/04_review-risk.md b/.cursor/skills/plan/steps/04_review-risk.md new file mode 100644 index 0000000..747b7cf --- /dev/null +++ b/.cursor/skills/plan/steps/04_review-risk.md @@ -0,0 +1,38 @@ +## Step 4: Architecture Review & Risk Assessment + +**Role**: Professional software architect and analyst +**Goal**: Validate all artifacts for consistency, then identify and mitigate risks +**Constraints**: This is a review step — fix problems found, do not add new features + +### 4a. Evaluator Pass (re-read ALL artifacts) + +Review checklist: +- [ ] All components follow Single Responsibility Principle +- [ ] All components follow dumb code / smart data principle +- [ ] Inter-component interfaces are consistent (caller's output matches callee's input) +- [ ] No circular dependencies in the dependency graph +- [ ] No missing interactions between components +- [ ] No over-engineering — is there a simpler decomposition? +- [ ] Security considerations addressed in component design +- [ ] Performance bottlenecks identified +- [ ] API contracts are consistent across components + +Fix any issues found before proceeding to risk identification. + +### 4b. Risk Identification + +1. Identify technical and project risks +2. 
Assess probability and impact using `templates/risk-register.md` +3. Define mitigation strategies +4. Apply mitigations to architecture, flows, and component documents where applicable + +**Self-verification**: +- [ ] Every High/Critical risk has a concrete mitigation strategy +- [ ] Mitigations are reflected in the relevant component or architecture docs +- [ ] No new risks introduced by the mitigations themselves + +**Save action**: Write `risk_mitigations.md` + +**BLOCKING**: Present risk summary to user. Ask whether assessment is sufficient. + +**Iterative**: If user requests another round, repeat Step 4 and write `risk_mitigations_##.md` (## as sequence number). Continue until user confirms. diff --git a/.cursor/skills/plan/steps/05_test-specifications.md b/.cursor/skills/plan/steps/05_test-specifications.md new file mode 100644 index 0000000..9657359 --- /dev/null +++ b/.cursor/skills/plan/steps/05_test-specifications.md @@ -0,0 +1,20 @@ +## Step 5: Test Specifications + +**Role**: Professional Quality Assurance Engineer + +**Goal**: Write test specs for each component achieving minimum 75% acceptance criteria coverage + +**Constraints**: Test specs only — no test code. Each test must trace to an acceptance criterion. + +1. For each component, write tests using `templates/test-spec.md` as structure +2. Cover all 4 types: integration, performance, security, acceptance +3. Include test data management (setup, teardown, isolation) +4. 
Verify traceability: every acceptance criterion from `acceptance_criteria.md` must be covered by at least one test + +**Self-verification**: +- [ ] Every acceptance criterion has at least one test covering it +- [ ] Test inputs are realistic and well-defined +- [ ] Expected results are specific and measurable +- [ ] No component is left without tests + +**Save action**: Write each `components/[##]_[name]/tests.md` diff --git a/.cursor/skills/plan/steps/06_jira-epics.md b/.cursor/skills/plan/steps/06_jira-epics.md new file mode 100644 index 0000000..e93d95e --- /dev/null +++ b/.cursor/skills/plan/steps/06_jira-epics.md @@ -0,0 +1,48 @@ +## Step 6: Work Item Epics + +**Role**: Professional product manager + +**Goal**: Create epics from components, ordered by dependency + +**Constraints**: Epic descriptions must be **comprehensive and self-contained** — a developer reading only the epic should understand the full context without needing to open separate files. + +1. **Create "Bootstrap & Initial Structure" epic first** — this epic will parent the `01_initial_structure` task created by the decompose skill. It covers project scaffolding: folder structure, shared models, interfaces, stubs, CI/CD config, DB migrations setup, test structure. +2. Generate epics for each component using the configured work item tracker (Jira MCP or Azure DevOps MCP — see `autopilot/protocols.md`), structured per `templates/epic-spec.md` +3. Order epics by dependency (Bootstrap epic is always first, then components based on their dependency graph) +4. Include effort estimation per epic (T-shirt size or story points range) +5. Ensure each epic has clear acceptance criteria cross-referenced with component specs +6. 
Generate Mermaid diagrams showing component-to-epic mapping and component relationships + +**CRITICAL — Epic description richness requirements**: + +Each epic description MUST include ALL of the following sections with substantial content: +- **System context**: where this component fits in the overall architecture (include Mermaid diagram showing this component's position and connections) +- **Problem / Context**: what problem this component solves, why it exists, current pain points +- **Scope**: detailed in-scope and out-of-scope lists +- **Architecture notes**: relevant ADRs, technology choices, patterns used, key design decisions +- **Interface specification**: full method signatures, input/output types, error types (from component description.md) +- **Data flow**: how data enters and exits this component (include Mermaid sequence or flowchart diagram) +- **Dependencies**: epic dependencies (with Jira IDs) and external dependencies (libraries, hardware, services) +- **Acceptance criteria**: measurable criteria with specific thresholds (from component tests.md) +- **Non-functional requirements**: latency, memory, throughput targets with failure thresholds +- **Risks & mitigations**: relevant risks from risk_mitigations.md with concrete mitigation strategies +- **Effort estimation**: T-shirt size and story points range +- **Child issues**: planned task breakdown with complexity points +- **Key constraints**: from restrictions.md that affect this component +- **Testing strategy**: summary of test types and coverage from tests.md + +Do NOT create minimal epics with just a summary and short description. The epic is the primary reference document for the implementation team. 
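
As an illustration of the expected depth for the "Interface specification" section, a sketch in Python type-hint style — the `PoseEstimator` name, its method, and the error type are invented examples, not part of any real component spec:

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Input image frame with capture timestamp (illustrative type)."""
    pixels: bytes
    timestamp_ms: int


@dataclass
class Pose:
    """Estimated position output (illustrative type)."""
    x: float
    y: float
    confidence: float


class PoseEstimationError(Exception):
    """Raised when a frame cannot be processed."""


class PoseEstimator:
    """Example component interface: one public method, explicit error type."""

    def estimate(self, frame: Frame) -> Pose:
        """Return the estimated pose for a frame; raises PoseEstimationError on empty input."""
        if not frame.pixels:
            raise PoseEstimationError("empty frame")
        # Placeholder logic standing in for the real algorithm
        return Pose(x=0.0, y=0.0, confidence=1.0)
```

An epic's interface section should spell out signatures at roughly this level of detail, so the implementing developer never has to guess input/output types or error behavior.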
+
+7. **Create "Blackbox Tests" epic** — this epic will parent the blackbox test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `tests/`.
+
+**Self-verification**:
+- [ ] "Bootstrap & Initial Structure" epic exists and is first in order
+- [ ] "Blackbox Tests" epic exists
+- [ ] Every component maps to exactly one epic
+- [ ] Dependency order is respected (no epic depends on a later one)
+- [ ] Acceptance criteria are measurable
+- [ ] Effort estimates are realistic
+- [ ] Every epic description includes architecture diagram, interface spec, data flow, risks, and NFRs
+- [ ] Epic descriptions are self-contained — readable without opening other files
+
+**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If `tracker: local`, save locally only.
diff --git a/.cursor/skills/plan/steps/07_quality-checklist.md b/.cursor/skills/plan/steps/07_quality-checklist.md
new file mode 100644
index 0000000..f883e88
--- /dev/null
+++ b/.cursor/skills/plan/steps/07_quality-checklist.md
@@ -0,0 +1,57 @@
+## Quality Checklist (before FINAL_report.md)
+
+Before writing the final report, verify ALL of the following:
+
+### Blackbox Tests
+- [ ] Every acceptance criterion is covered in traceability-matrix.md
+- [ ] Every restriction is verified by at least one test
+- [ ] Positive and negative scenarios are balanced
+- [ ] Docker environment is self-contained
+- [ ] Consumer app treats main system as black box
+- [ ] CI/CD integration and reporting defined
+
+### Architecture
+- [ ] Covers all capabilities from solution.md
+- [ ] Technology choices are justified
+- [ ] Deployment model is defined
+- [ ] Blackbox test findings are reflected in architecture decisions
+
+### Data Model
+- [ ] Every entity from architecture.md is defined
+- [ ] Relationships have explicit cardinality
+- [ ] Migration strategy with reversibility requirement
+- [ ] Seed data requirements defined
+- [ ] 
Backward compatibility approach documented + +### Deployment +- [ ] Containerization plan covers all components +- [ ] CI/CD pipeline includes lint, test, security, build, deploy stages +- [ ] Environment strategy covers dev, staging, production +- [ ] Observability covers logging, metrics, tracing, alerting +- [ ] Deployment procedures include rollback and health checks + +### Components +- [ ] Every component follows SRP +- [ ] No circular dependencies +- [ ] All inter-component interfaces are defined and consistent +- [ ] No orphan components (unused by any flow) +- [ ] Every blackbox test scenario can be traced through component interactions + +### Risks +- [ ] All High/Critical risks have mitigations +- [ ] Mitigations are reflected in component/architecture docs +- [ ] User has confirmed risk assessment is sufficient + +### Tests +- [ ] Every acceptance criterion is covered by at least one test +- [ ] All 4 test types are represented per component (where applicable) +- [ ] Test data management is defined + +### Epics +- [ ] "Bootstrap & Initial Structure" epic exists +- [ ] "Blackbox Tests" epic exists +- [ ] Every component maps to an epic +- [ ] Dependency order is correct +- [ ] Acceptance criteria are measurable + +**Save action**: Write `FINAL_report.md` using `templates/final-report.md` as structure diff --git a/.cursor/skills/plan/templates/architecture.md b/.cursor/skills/plan/templates/architecture.md index 0884500..1d381cc 100644 --- a/.cursor/skills/plan/templates/architecture.md +++ b/.cursor/skills/plan/templates/architecture.md @@ -1,6 +1,6 @@ # Architecture Document Template -Use this template for the architecture document. Save as `_docs/02_plans/architecture.md`. +Use this template for the architecture document. Save as `_docs/02_document/architecture.md`. 
--- diff --git a/.cursor/skills/plan/templates/integration-functional-tests.md b/.cursor/skills/plan/templates/blackbox-tests.md similarity index 83% rename from .cursor/skills/plan/templates/integration-functional-tests.md rename to .cursor/skills/plan/templates/blackbox-tests.md index 9bb3eff..d522698 100644 --- a/.cursor/skills/plan/templates/integration-functional-tests.md +++ b/.cursor/skills/plan/templates/blackbox-tests.md @@ -1,24 +1,24 @@ -# E2E Functional Tests Template +# Blackbox Tests Template -Save as `PLANS_DIR/integration_tests/functional_tests.md`. +Save as `DOCUMENT_DIR/tests/blackbox-tests.md`. --- ```markdown -# E2E Functional Tests +# Blackbox Tests ## Positive Scenarios ### FT-P-01: [Scenario Name] -**Summary**: [One sentence: what end-to-end use case this validates] +**Summary**: [One sentence: what black-box use case this validates] **Traces to**: AC-[ID], AC-[ID] **Category**: [which AC category — e.g., Position Accuracy, Image Processing, etc.] **Preconditions**: - [System state required before test] -**Input data**: [reference to specific data set or file from test_data.md] +**Input data**: [reference to specific data set or file from test-data.md] **Steps**: @@ -71,8 +71,8 @@ Save as `PLANS_DIR/integration_tests/functional_tests.md`. ## Guidance Notes -- Functional tests should typically trace to at least one acceptance criterion or restriction. Tests without a trace are allowed but should have a clear justification. +- Blackbox tests should typically trace to at least one acceptance criterion or restriction. Tests without a trace are allowed but should have a clear justification. - Positive scenarios validate the system does what it should. - Negative scenarios validate the system rejects or handles gracefully what it shouldn't accept. - Expected outcomes must be specific and measurable — not "works correctly" but "returns position within 50m of ground truth." -- Input data references should point to specific entries in test_data.md. 
+- Input data references should point to specific entries in test-data.md. diff --git a/.cursor/skills/plan/templates/epic-spec.md b/.cursor/skills/plan/templates/epic-spec.md index f8ebcfc..6cb60e6 100644 --- a/.cursor/skills/plan/templates/epic-spec.md +++ b/.cursor/skills/plan/templates/epic-spec.md @@ -1,6 +1,6 @@ -# Jira Epic Template +# Epic Template -Use this template for each Jira epic. Create epics via Jira MCP. +Use this template for each epic. Create epics via the configured work item tracker (Jira MCP or Azure DevOps MCP). --- @@ -73,14 +73,14 @@ Link to architecture.md and relevant component spec.] ### Design & Architecture -- Architecture doc: `_docs/02_plans/architecture.md` -- Component spec: `_docs/02_plans/components/[##]_[name]/description.md` -- System flows: `_docs/02_plans/system-flows.md` +- Architecture doc: `_docs/02_document/architecture.md` +- Component spec: `_docs/02_document/components/[##]_[name]/description.md` +- System flows: `_docs/02_document/system-flows.md` ### Definition of Done - [ ] All in-scope capabilities implemented -- [ ] Automated tests pass (unit + integration + e2e) +- [ ] Automated tests pass (unit + blackbox) - [ ] Minimum coverage threshold met (75%) - [ ] Runbooks written (if applicable) - [ ] Documentation updated diff --git a/.cursor/skills/plan/templates/final-report.md b/.cursor/skills/plan/templates/final-report.md index db0828b..0e27016 100644 --- a/.cursor/skills/plan/templates/final-report.md +++ b/.cursor/skills/plan/templates/final-report.md @@ -1,6 +1,6 @@ # Final Planning Report Template -Use this template after completing all 5 steps and the quality checklist. Save as `_docs/02_plans/FINAL_report.md`. +Use this template after completing all 6 steps and the quality checklist. Save as `_docs/02_document/FINAL_report.md`. 
--- diff --git a/.cursor/skills/plan/templates/integration-non-functional-tests.md b/.cursor/skills/plan/templates/integration-non-functional-tests.md deleted file mode 100644 index d1b5f3a..0000000 --- a/.cursor/skills/plan/templates/integration-non-functional-tests.md +++ /dev/null @@ -1,97 +0,0 @@ -# E2E Non-Functional Tests Template - -Save as `PLANS_DIR/integration_tests/non_functional_tests.md`. - ---- - -```markdown -# E2E Non-Functional Tests - -## Performance Tests - -### NFT-PERF-01: [Test Name] - -**Summary**: [What performance characteristic this validates] -**Traces to**: AC-[ID] -**Metric**: [what is measured — latency, throughput, frame rate, etc.] - -**Preconditions**: -- [System state, load profile, data volume] - -**Steps**: - -| Step | Consumer Action | Measurement | -|------|----------------|-------------| -| 1 | [action] | [what to measure and how] | - -**Pass criteria**: [specific threshold — e.g., p95 latency < 400ms] -**Duration**: [how long the test runs] - ---- - -## Resilience Tests - -### NFT-RES-01: [Test Name] - -**Summary**: [What failure/recovery scenario this validates] -**Traces to**: AC-[ID] - -**Preconditions**: -- [System state before fault injection] - -**Fault injection**: -- [What fault is introduced — process kill, network partition, invalid input sequence, etc.] - -**Steps**: - -| Step | Action | Expected Behavior | -|------|--------|------------------| -| 1 | [inject fault] | [system behavior during fault] | -| 2 | [observe recovery] | [system behavior after recovery] | - -**Pass criteria**: [recovery time, data integrity, continued operation] - ---- - -## Security Tests - -### NFT-SEC-01: [Test Name] - -**Summary**: [What security property this validates] -**Traces to**: AC-[ID], RESTRICT-[ID] - -**Steps**: - -| Step | Consumer Action | Expected Response | -|------|----------------|------------------| -| 1 | [attempt unauthorized access / injection / etc.] | [rejection / no data leak / etc.] 
| - -**Pass criteria**: [specific security outcome] - ---- - -## Resource Limit Tests - -### NFT-RES-LIM-01: [Test Name] - -**Summary**: [What resource constraint this validates] -**Traces to**: AC-[ID], RESTRICT-[ID] - -**Preconditions**: -- [System running under specified constraints] - -**Monitoring**: -- [What resources to monitor — memory, CPU, GPU, disk, temperature] - -**Duration**: [how long to run] -**Pass criteria**: [resource stays within limit — e.g., memory < 8GB throughout] -``` - ---- - -## Guidance Notes - -- Performance tests should run long enough to capture steady-state behavior, not just cold-start. -- Resilience tests must define both the fault and the expected recovery — not just "system should recover." -- Security tests at E2E level focus on black-box attacks (unauthorized API calls, malformed input), not code-level vulnerabilities. -- Resource limit tests must specify monitoring duration — short bursts don't prove sustained compliance. diff --git a/.cursor/skills/plan/templates/performance-tests.md b/.cursor/skills/plan/templates/performance-tests.md new file mode 100644 index 0000000..dfbcd14 --- /dev/null +++ b/.cursor/skills/plan/templates/performance-tests.md @@ -0,0 +1,35 @@ +# Performance Tests Template + +Save as `DOCUMENT_DIR/tests/performance-tests.md`. + +--- + +```markdown +# Performance Tests + +### NFT-PERF-01: [Test Name] + +**Summary**: [What performance characteristic this validates] +**Traces to**: AC-[ID] +**Metric**: [what is measured — latency, throughput, frame rate, etc.] 
+ +**Preconditions**: +- [System state, load profile, data volume] + +**Steps**: + +| Step | Consumer Action | Measurement | +|------|----------------|-------------| +| 1 | [action] | [what to measure and how] | + +**Pass criteria**: [specific threshold — e.g., p95 latency < 400ms] +**Duration**: [how long the test runs] +``` + +--- + +## Guidance Notes + +- Performance tests should run long enough to capture steady-state behavior, not just cold-start. +- Define clear pass/fail thresholds with specific metrics (p50, p95, p99 latency, throughput, etc.). +- Include warm-up preconditions to separate initialization cost from steady-state performance. diff --git a/.cursor/skills/plan/templates/resilience-tests.md b/.cursor/skills/plan/templates/resilience-tests.md new file mode 100644 index 0000000..72890ae --- /dev/null +++ b/.cursor/skills/plan/templates/resilience-tests.md @@ -0,0 +1,37 @@ +# Resilience Tests Template + +Save as `DOCUMENT_DIR/tests/resilience-tests.md`. + +--- + +```markdown +# Resilience Tests + +### NFT-RES-01: [Test Name] + +**Summary**: [What failure/recovery scenario this validates] +**Traces to**: AC-[ID] + +**Preconditions**: +- [System state before fault injection] + +**Fault injection**: +- [What fault is introduced — process kill, network partition, invalid input sequence, etc.] + +**Steps**: + +| Step | Action | Expected Behavior | +|------|--------|------------------| +| 1 | [inject fault] | [system behavior during fault] | +| 2 | [observe recovery] | [system behavior after recovery] | + +**Pass criteria**: [recovery time, data integrity, continued operation] +``` + +--- + +## Guidance Notes + +- Resilience tests must define both the fault and the expected recovery — not just "system should recover." +- Include specific recovery time expectations and data integrity checks. +- Test both graceful degradation (partial failure) and full recovery scenarios. 
diff --git a/.cursor/skills/plan/templates/resource-limit-tests.md b/.cursor/skills/plan/templates/resource-limit-tests.md new file mode 100644 index 0000000..53779e3 --- /dev/null +++ b/.cursor/skills/plan/templates/resource-limit-tests.md @@ -0,0 +1,31 @@ +# Resource Limit Tests Template + +Save as `DOCUMENT_DIR/tests/resource-limit-tests.md`. + +--- + +```markdown +# Resource Limit Tests + +### NFT-RES-LIM-01: [Test Name] + +**Summary**: [What resource constraint this validates] +**Traces to**: AC-[ID], RESTRICT-[ID] + +**Preconditions**: +- [System running under specified constraints] + +**Monitoring**: +- [What resources to monitor — memory, CPU, GPU, disk, temperature] + +**Duration**: [how long to run] +**Pass criteria**: [resource stays within limit — e.g., memory < 8GB throughout] +``` + +--- + +## Guidance Notes + +- Resource limit tests must specify monitoring duration — short bursts don't prove sustained compliance. +- Define specific numeric limits that can be programmatically checked. +- Include both the monitoring method and the threshold in the pass criteria. diff --git a/.cursor/skills/plan/templates/risk-register.md b/.cursor/skills/plan/templates/risk-register.md index 0983d7f..786aec9 100644 --- a/.cursor/skills/plan/templates/risk-register.md +++ b/.cursor/skills/plan/templates/risk-register.md @@ -1,6 +1,6 @@ # Risk Register Template -Use this template for risk assessment. Save as `_docs/02_plans/risk_mitigations.md`. +Use this template for risk assessment. Save as `_docs/02_document/risk_mitigations.md`. Subsequent iterations: `risk_mitigations_02.md`, `risk_mitigations_03.md`, etc. --- diff --git a/.cursor/skills/plan/templates/security-tests.md b/.cursor/skills/plan/templates/security-tests.md new file mode 100644 index 0000000..b243404 --- /dev/null +++ b/.cursor/skills/plan/templates/security-tests.md @@ -0,0 +1,30 @@ +# Security Tests Template + +Save as `DOCUMENT_DIR/tests/security-tests.md`. 
+ +--- + +```markdown +# Security Tests + +### NFT-SEC-01: [Test Name] + +**Summary**: [What security property this validates] +**Traces to**: AC-[ID], RESTRICT-[ID] + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | [attempt unauthorized access / injection / etc.] | [rejection / no data leak / etc.] | + +**Pass criteria**: [specific security outcome] +``` + +--- + +## Guidance Notes + +- Security tests at blackbox level focus on black-box attacks (unauthorized API calls, malformed input), not code-level vulnerabilities. +- Verify the system remains operational after security-related edge cases (no crash, no hang). +- Test authentication/authorization boundaries from the consumer's perspective. diff --git a/.cursor/skills/plan/templates/system-flows.md b/.cursor/skills/plan/templates/system-flows.md index 4d5656f..6c887a8 100644 --- a/.cursor/skills/plan/templates/system-flows.md +++ b/.cursor/skills/plan/templates/system-flows.md @@ -1,7 +1,7 @@ # System Flows Template -Use this template for the system flows document. Save as `_docs/02_plans/system-flows.md`. -Individual flow diagrams go in `_docs/02_plans/diagrams/flows/flow_[name].md`. +Use this template for the system flows document. Save as `_docs/02_document/system-flows.md`. +Individual flow diagrams go in `_docs/02_document/diagrams/flows/flow_[name].md`. --- diff --git a/.cursor/skills/plan/templates/integration-test-data.md b/.cursor/skills/plan/templates/test-data.md similarity index 62% rename from .cursor/skills/plan/templates/integration-test-data.md rename to .cursor/skills/plan/templates/test-data.md index 041c963..0cee7fa 100644 --- a/.cursor/skills/plan/templates/integration-test-data.md +++ b/.cursor/skills/plan/templates/test-data.md @@ -1,11 +1,11 @@ -# E2E Test Data Template +# Test Data Template -Save as `PLANS_DIR/integration_tests/test_data.md`. +Save as `DOCUMENT_DIR/tests/test-data.md`. 
--- ```markdown -# E2E Test Data Management +# Test Data Management ## Seed Data Sets @@ -23,6 +23,12 @@ Save as `PLANS_DIR/integration_tests/test_data.md`. |-----------------|----------------|-------------|-----------------| | [filename] | `_docs/00_problem/input_data/[filename]` | [what it contains] | [test IDs that use this data] | +## Expected Results Mapping + +| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Expected Result Source | +|-----------------|------------|-----------------|-------------------|-----------|----------------------| +| [test ID] | `input_data/[filename]` | [quantifiable expected output] | [exact / tolerance / pattern / threshold / file-diff] | [± value or N/A] | `input_data/expected_results/[filename]` or inline | + ## External Dependency Mocks | External Service | Mock/Stub | How Provided | Behavior | @@ -42,5 +48,8 @@ Save as `PLANS_DIR/integration_tests/test_data.md`. - Every seed data set should be traceable to specific test scenarios. - Input data from `_docs/00_problem/input_data/` should be mapped to test scenarios that use it. +- Every input data item MUST have a corresponding expected result in the Expected Results Mapping table. +- Expected results MUST be quantifiable: exact values, numeric tolerances, pattern matches, thresholds, or reference files. "Works correctly" is never acceptable. +- For complex expected outputs, provide machine-readable reference files (JSON, CSV) in `_docs/00_problem/input_data/expected_results/` and reference them in the mapping. - External mocks must be deterministic — same input always produces same output. - Data isolation must guarantee no test can affect another test's outcome. 
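
The comparison methods named in the Expected Results Mapping table (exact / tolerance / pattern / threshold) can be sketched as a small checker. This is a hedged illustration, not part of the skill; the `compare` function name is invented, and the file-diff method is omitted since it depends on reference-file layout:

```python
import math
import re


def compare(actual, expected, method: str, tolerance: float = 0.0) -> bool:
    """Illustrative dispatcher for the comparison methods in the mapping table."""
    if method == "exact":
        # Exact equality of values
        return actual == expected
    if method == "tolerance":
        # Numeric comparison within an absolute ± tolerance
        return math.isclose(actual, expected, abs_tol=tolerance)
    if method == "pattern":
        # Full regex match against a string output
        return re.fullmatch(expected, actual) is not None
    if method == "threshold":
        # Actual value must not exceed the expected limit
        return actual <= expected
    raise ValueError(f"unknown comparison method: {method}")
```

For example, `compare(49.2, 50.0, "tolerance", tolerance=1.0)` passes while `compare(48.0, 50.0, "tolerance", tolerance=1.0)` fails — the point being that every row in the table reduces to a programmatic check, never a subjective "works correctly".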
diff --git a/.cursor/skills/plan/templates/integration-environment.md b/.cursor/skills/plan/templates/test-environment.md similarity index 92% rename from .cursor/skills/plan/templates/integration-environment.md rename to .cursor/skills/plan/templates/test-environment.md index 6d8a0ac..b5d74fa 100644 --- a/.cursor/skills/plan/templates/integration-environment.md +++ b/.cursor/skills/plan/templates/test-environment.md @@ -1,16 +1,16 @@ -# E2E Test Environment Template +# Test Environment Template -Save as `PLANS_DIR/integration_tests/environment.md`. +Save as `DOCUMENT_DIR/tests/environment.md`. --- ```markdown -# E2E Test Environment +# Test Environment ## Overview **System under test**: [main system name and entry points — API URLs, message queues, serial ports, etc.] -**Consumer app purpose**: Standalone application that exercises the main system through its public interfaces, validating end-to-end use cases without access to internals. +**Consumer app purpose**: Standalone application that exercises the main system through its public interfaces, validating black-box use cases without access to internals. ## Docker Environment diff --git a/.cursor/skills/plan/templates/test-spec.md b/.cursor/skills/plan/templates/test-spec.md index 2b6ee44..5b7b83e 100644 --- a/.cursor/skills/plan/templates/test-spec.md +++ b/.cursor/skills/plan/templates/test-spec.md @@ -17,7 +17,7 @@ Use this template for each component's test spec. Save as `components/[##]_[name --- -## Integration Tests +## Blackbox Tests ### IT-01: [Test Name] @@ -169,4 +169,4 @@ Use this template for each component's test spec. Save as `components/[##]_[name - If an acceptance criterion has no test covering it, mark it as NOT COVERED and explain why (e.g., "requires manual verification", "deferred to phase 2"). - Performance test targets should come from the NFR section in `architecture.md`. 
- Security tests should cover at minimum: authentication bypass, authorization escalation, injection attacks relevant to this component. -- Not every component needs all 4 test types. A stateless utility component may only need integration tests. +- Not every component needs all 4 test types. A stateless utility component may only need blackbox tests. diff --git a/.cursor/skills/plan/templates/integration-traceability-matrix.md b/.cursor/skills/plan/templates/traceability-matrix.md similarity index 82% rename from .cursor/skills/plan/templates/integration-traceability-matrix.md rename to .cursor/skills/plan/templates/traceability-matrix.md index 05ccafa..e0192ac 100644 --- a/.cursor/skills/plan/templates/integration-traceability-matrix.md +++ b/.cursor/skills/plan/templates/traceability-matrix.md @@ -1,11 +1,11 @@ -# E2E Traceability Matrix Template +# Traceability Matrix Template -Save as `PLANS_DIR/integration_tests/traceability_matrix.md`. +Save as `DOCUMENT_DIR/tests/traceability-matrix.md`. --- ```markdown -# E2E Traceability Matrix +# Traceability Matrix ## Acceptance Criteria Coverage @@ -34,7 +34,7 @@ Save as `PLANS_DIR/integration_tests/traceability_matrix.md`. | Item | Reason Not Covered | Risk | Mitigation | |------|-------------------|------|-----------| -| [AC/Restriction ID] | [why it cannot be tested at E2E level] | [what could go wrong] | [how risk is addressed — e.g., covered by component tests in Step 5] | +| [AC/Restriction ID] | [why it cannot be tested at blackbox level] | [what could go wrong] | [how risk is addressed — e.g., covered by component tests in Step 5] | ``` --- @@ -44,4 +44,4 @@ Save as `PLANS_DIR/integration_tests/traceability_matrix.md`. - Every acceptance criterion must appear in the matrix — either covered or explicitly marked as not covered with a reason. - Every restriction must appear in the matrix. 
- NOT COVERED items must have a reason and a mitigation strategy (e.g., "covered at component test level" or "requires real hardware"). -- Coverage percentage should be at least 75% for acceptance criteria at the E2E level. +- Coverage percentage should be at least 75% for acceptance criteria at the blackbox test level. diff --git a/.cursor/skills/problem/SKILL.md b/.cursor/skills/problem/SKILL.md index 030a2a1..570fa1e 100644 --- a/.cursor/skills/problem/SKILL.md +++ b/.cursor/skills/problem/SKILL.md @@ -46,7 +46,7 @@ The interview is complete when the AI can write ALL of these: | `problem.md` | Clear problem statement: what is being built, why, for whom, what it does | | `restrictions.md` | All constraints identified: hardware, software, environment, operational, regulatory, budget, timeline | | `acceptance_criteria.md` | Measurable success criteria with specific numeric targets grouped by category | -| `input_data/` | At least one reference data file or detailed data description document | +| `input_data/` | At least one reference data file or detailed data description document. Must include `expected_results.md` with input→output pairs for downstream test specification | | `security_approach.md` | (optional) Security requirements identified, or explicitly marked as not applicable | ## Interview Protocol @@ -187,6 +187,7 @@ At least one file. Options: - User provides actual data files (CSV, JSON, images, etc.) — save as-is - User describes data parameters — save as `data_parameters.md` - User provides URLs to data — save as `data_sources.md` with links and descriptions +- `expected_results.md` — expected outputs for given inputs (required by downstream test-spec skill). During the Acceptance Criteria dimension, probe for concrete input→output pairs and save them here. Format: use the template from `.cursor/skills/test-spec/templates/expected-results.md`. 
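The input→output pairs captured in `expected_results.md` become useful downstream precisely because they are structured: the test-spec skill can check that every acceptance criterion has at least one pair tracing to it. A hedged Python sketch — the keys, filenames, and IDs are invented for illustration:

```python
# Hypothetical records for the input→output pairs in expected_results.md.
# Field names and values are illustrative, not the template's actual schema.
pairs = [
    {"input": "sample_001.csv",
     "expected_output": "12 rows parsed, 0 errors",
     "traces_to": "AC-01"},
    {"input": "sample_corrupt.csv",
     "expected_output": "rejected with validation error",
     "traces_to": "AC-04"},
]

def uncovered_criteria(acceptance_ids, pairs):
    """Acceptance criteria that no recorded input→output pair traces to."""
    covered = {p["traces_to"] for p in pairs}
    return sorted(set(acceptance_ids) - covered)
```

A gap reported here during the interview is a prompt to probe the user for another concrete input→output pair.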
### security_approach.md (optional) diff --git a/.cursor/skills/refactor/SKILL.md b/.cursor/skills/refactor/SKILL.md index 7fe59b8..3acea10 100644 --- a/.cursor/skills/refactor/SKILL.md +++ b/.cursor/skills/refactor/SKILL.md @@ -34,8 +34,8 @@ Determine the operating mode based on invocation before any other logic runs. **Project mode** (no explicit input file provided): - PROBLEM_DIR: `_docs/00_problem/` - SOLUTION_DIR: `_docs/01_solution/` -- COMPONENTS_DIR: `_docs/02_components/` -- TESTS_DIR: `_docs/02_tests/` +- COMPONENTS_DIR: `_docs/02_document/components/` +- DOCUMENT_DIR: `_docs/02_document/` - REFACTOR_DIR: `_docs/04_refactoring/` - All existing guardrails apply. @@ -155,7 +155,7 @@ Store in PROBLEM_DIR. | Metric Category | What to Capture | |----------------|-----------------| -| **Coverage** | Overall, unit, integration, critical paths | +| **Coverage** | Overall, unit, blackbox, critical paths | | **Complexity** | Cyclomatic complexity (avg + top 5 functions), LOC, tech debt ratio | | **Code Smells** | Total, critical, major | | **Performance** | Response times (P50/P95/P99), CPU/memory, throughput | @@ -210,7 +210,7 @@ Write: Also copy to project standard locations if in project mode: - `SOLUTION_DIR/solution.md` -- `COMPONENTS_DIR/system_flows.md` +- `DOCUMENT_DIR/system_flows.md` **Self-verification**: - [ ] Every component in the codebase is documented @@ -276,14 +276,14 @@ Write `REFACTOR_DIR/analysis/refactoring_roadmap.md`: #### 3a. 
Design Test Specs
 
-Coverage requirements (must meet before refactoring):
+Coverage requirements (must be met before refactoring — see `.cursor/rules/cursor-meta.mdc` Quality Thresholds):
 - Minimum overall coverage: 75%
 - Critical path coverage: 90%
-- All public APIs must have integration tests
+- All public APIs must have blackbox tests
 - All error handling paths must be tested
 
 For each critical area, write test specs to `REFACTOR_DIR/test_specs/[##]_[test_name].md`:
-- Integration tests: summary, current behavior, input data, expected result, max expected time
+- Blackbox tests: summary, current behavior, input data, expected result, max expected time
 - Acceptance tests: summary, preconditions, steps with expected results
 - Coverage analysis: current %, target %, uncovered critical paths
 
@@ -297,7 +297,7 @@ For each critical area, write test specs to `REFACTOR_DIR/test_specs/[##]_[test_
 **Self-verification**:
 - [ ] Coverage requirements met (75% overall, 90% critical paths)
 - [ ] All tests pass on current codebase
-- [ ] All public APIs have integration tests
+- [ ] All public APIs have blackbox tests
 - [ ] Test data fixtures are configured
 
 **Save action**: Write test specs; implemented tests go into the project's test folder
 
@@ -332,7 +332,7 @@ Write `REFACTOR_DIR/coupling_analysis.md`:
 For each change in the decoupling strategy:
 1. Implement the change
-2. Run integration tests
+2. Run blackbox tests
 3. Fix any failures
 4.
Commit with descriptive message diff --git a/.cursor/skills/research/SKILL.md b/.cursor/skills/research/SKILL.md index 1b4c159..85fd5d7 100644 --- a/.cursor/skills/research/SKILL.md +++ b/.cursor/skills/research/SKILL.md @@ -1,5 +1,5 @@ --- -name: deep-research +name: research description: | Deep Research Methodology (8-Step Method) with two execution modes: - Mode A (Initial Research): Assess acceptance criteria, then research problem and produce solution draft @@ -13,6 +13,7 @@ description: | - "comparative analysis", "concept comparison", "technical comparison" category: build tags: [research, analysis, solution-design, comparison, decision-support] +disable-model-invocation: true --- # Deep Research (8-Step Method) @@ -42,257 +43,51 @@ Determine the operating mode based on invocation before any other logic runs. **Standalone mode** (explicit input file provided, e.g. `/research @some_doc.md`): - INPUT_FILE: the provided file (treated as problem description) -- OUTPUT_DIR: `_standalone/01_solution/` -- RESEARCH_DIR: `_standalone/00_research/` +- BASE_DIR: if specified by the caller, use it; otherwise default to `_standalone/` +- OUTPUT_DIR: `BASE_DIR/01_solution/` +- RESEARCH_DIR: `BASE_DIR/00_research/` - Guardrails relaxed: only INPUT_FILE must exist and be non-empty - `restrictions.md` and `acceptance_criteria.md` are optional — warn if absent, proceed if user confirms - Mode detection uses OUTPUT_DIR for `solution_draft*.md` scanning - Draft numbering works the same, scoped to OUTPUT_DIR -- **Final step**: after all research is complete, move INPUT_FILE into `_standalone/` +- **Final step**: after all research is complete, move INPUT_FILE into BASE_DIR Announce the detected mode and resolved paths to the user before proceeding. ## Project Integration -### Prerequisite Guardrails (BLOCKING) - -Before any research begins, verify the input context exists. **Do not proceed if guardrails fail.** - -**Project mode:** -1. 
Check INPUT_DIR exists — **STOP if missing**, ask user to create it and provide problem files -2. Check `problem.md` in INPUT_DIR exists and is non-empty — **STOP if missing** -3. Check `restrictions.md` in INPUT_DIR exists and is non-empty — **STOP if missing** -4. Check `acceptance_criteria.md` in INPUT_DIR exists and is non-empty — **STOP if missing** -5. Check `input_data/` in INPUT_DIR exists and contains at least one file — **STOP if missing** -6. Read **all** files in INPUT_DIR to ground the investigation in the project context -7. Create OUTPUT_DIR and RESEARCH_DIR if they don't exist - -**Standalone mode:** -1. Check INPUT_FILE exists and is non-empty — **STOP if missing** -2. Warn if no `restrictions.md` or `acceptance_criteria.md` were provided alongside INPUT_FILE — proceed if user confirms -3. Create OUTPUT_DIR and RESEARCH_DIR if they don't exist - -### Mode Detection - -After guardrails pass, determine the execution mode: - -1. Scan OUTPUT_DIR for files matching `solution_draft*.md` -2. **No matches found** → **Mode A: Initial Research** -3. **Matches found** → **Mode B: Solution Assessment** (use the highest-numbered draft as input) -4. **User override**: if the user explicitly says "research from scratch" or "initial research", force Mode A regardless of existing drafts - -Inform the user which mode was detected and confirm before proceeding. - -### Solution Draft Numbering - -All final output is saved as `OUTPUT_DIR/solution_draft##.md` with a 2-digit zero-padded number: - -1. Scan existing files in OUTPUT_DIR matching `solution_draft*.md` -2. Extract the highest existing number -3. Increment by 1 -4. Zero-pad to 2 digits (e.g., `01`, `02`, ..., `10`, `11`) - -Example: if `solution_draft01.md` through `solution_draft10.md` exist, the next output is `solution_draft11.md`. 
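The draft-numbering rule above is simple enough to sketch directly. A minimal Python illustration — the skill itself operates on the filenames in OUTPUT_DIR; the function name is an assumption:

```python
import re

def next_draft_name(existing):
    """Next solution_draft##.md name, zero-padded to 2 digits.

    `existing` is the list of filenames already present in OUTPUT_DIR;
    non-matching files are ignored.
    """
    numbers = [
        int(m.group(1))
        for name in existing
        if (m := re.fullmatch(r"solution_draft(\d+)\.md", name))
    ]
    return f"solution_draft{max(numbers, default=0) + 1:02d}.md"
```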
- -### Working Directory & Intermediate Artifact Management - -#### Directory Structure - -At the start of research, **must** create a working directory under RESEARCH_DIR: - -``` -RESEARCH_DIR/ -├── 00_ac_assessment.md # Mode A Phase 1 output: AC & restrictions assessment -├── 00_question_decomposition.md # Step 0-1 output -├── 01_source_registry.md # Step 2 output: all consulted source links -├── 02_fact_cards.md # Step 3 output: extracted facts -├── 03_comparison_framework.md # Step 4 output: selected framework and populated data -├── 04_reasoning_chain.md # Step 6 output: fact → conclusion reasoning -├── 05_validation_log.md # Step 7 output: use-case validation results -└── raw/ # Raw source archive (optional) - ├── source_1.md - └── source_2.md -``` - -### Save Timing & Content - -| Step | Save immediately after completion | Filename | -|------|-----------------------------------|----------| -| Mode A Phase 1 | AC & restrictions assessment tables | `00_ac_assessment.md` | -| Step 0-1 | Question type classification + sub-question list | `00_question_decomposition.md` | -| Step 2 | Each consulted source link, tier, summary | `01_source_registry.md` | -| Step 3 | Each fact card (statement + source + confidence) | `02_fact_cards.md` | -| Step 4 | Selected comparison framework + initial population | `03_comparison_framework.md` | -| Step 6 | Reasoning process for each dimension | `04_reasoning_chain.md` | -| Step 7 | Validation scenarios + results + review checklist | `05_validation_log.md` | -| Step 8 | Complete solution draft | `OUTPUT_DIR/solution_draft##.md` | - -### Save Principles - -1. **Save immediately**: Write to the corresponding file as soon as a step is completed; don't wait until the end -2. **Incremental updates**: Same file can be updated multiple times; append or replace new content -3. **Preserve process**: Keep intermediate files even after their content is integrated into the final report -4. 
**Enable recovery**: If research is interrupted, progress can be recovered from intermediate files +Read and follow `steps/00_project-integration.md` for prerequisite guardrails, mode detection, draft numbering, working directory setup, save timing, and output file inventory. ## Execution Flow ### Mode A: Initial Research -Triggered when no `solution_draft*.md` files exist in OUTPUT_DIR, or when the user explicitly requests initial research. +Read and follow `steps/01_mode-a-initial-research.md`. -#### Phase 1: AC & Restrictions Assessment (BLOCKING) - -**Role**: Professional software architect - -A focused preliminary research pass **before** the main solution research. The goal is to validate that the acceptance criteria and restrictions are realistic before designing a solution around them. - -**Input**: All files from INPUT_DIR (or INPUT_FILE in standalone mode) - -**Task**: -1. Read all problem context files thoroughly -2. **ASK the user about every unclear aspect** — do not assume: - - Unclear problem boundaries → ask - - Ambiguous acceptance criteria values → ask - - Missing context (no `security_approach.md`, no `input_data/`) → ask what they have - - Conflicting restrictions → ask which takes priority -3. Research in internet **extensively** — use multiple search queries per question, rephrase, and search from different angles: - - How realistic are the acceptance criteria for this specific domain? Search for industry benchmarks, standards, and typical values - - How critical is each criterion? Search for case studies where criteria were relaxed or tightened - - What domain-specific acceptance criteria are we missing? 
Search for industry standards, regulatory requirements, and best practices in the specific domain - - Impact of each criterion value on the whole system quality — search for research papers and engineering reports - - Cost/budget implications of each criterion — search for pricing, total cost of ownership analyses, and comparable project budgets - - Timeline implications — search for project timelines, development velocity reports, and comparable implementations - - What do practitioners in this domain consider the most important criteria? Search forums, conference talks, and experience reports -4. Research restrictions from multiple perspectives: - - Are the restrictions realistic? Search for comparable projects that operated under similar constraints - - Should any be tightened or relaxed? Search for what constraints similar projects actually ended up with - - Are there additional restrictions we should add? Search for regulatory, compliance, and safety requirements in this domain - - What restrictions do practitioners wish they had defined earlier? Search for post-mortem reports and lessons learned -5. Verify findings with authoritative sources (official docs, papers, benchmarks) — each key finding must have at least 2 independent sources - -**Uses Steps 0-3 of the 8-step engine** (question classification, decomposition, source tiering, fact extraction) scoped to AC and restrictions assessment. 
- -**📁 Save action**: Write `RESEARCH_DIR/00_ac_assessment.md` with format: - -```markdown -# Acceptance Criteria Assessment - -## Acceptance Criteria - -| Criterion | Our Values | Researched Values | Cost/Timeline Impact | Status | -|-----------|-----------|-------------------|---------------------|--------| -| [name] | [current] | [researched range] | [impact] | Added / Modified / Removed | - -## Restrictions Assessment - -| Restriction | Our Values | Researched Values | Cost/Timeline Impact | Status | -|-------------|-----------|-------------------|---------------------|--------| -| [name] | [current] | [researched range] | [impact] | Added / Modified / Removed | - -## Key Findings -[Summary of critical findings] - -## Sources -[Key references used] -``` - -**BLOCKING**: Present the AC assessment tables to the user. Wait for confirmation or adjustments before proceeding to Phase 2. The user may update `acceptance_criteria.md` or `restrictions.md` based on findings. - ---- - -#### Phase 2: Problem Research & Solution Draft - -**Role**: Professional researcher and software architect - -Full 8-step research methodology. Produces the first solution draft. - -**Input**: All files from INPUT_DIR (possibly updated after Phase 1) + Phase 1 artifacts - -**Task** (drives the 8-step engine): -1. Research existing/competitor solutions for similar problems — search broadly across industries and adjacent domains, not just the obvious competitors -2. Research the problem thoroughly — all possible ways to solve it, split into components; search for how different fields approach analogous problems -3. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1 -4. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions -5. 
Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist? -6. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues) -7. Include security considerations in each component analysis -8. Provide rough cost estimates for proposed solutions - -Be concise in formulating. The fewer words, the better, but do not miss any important details. - -**📁 Save action**: Write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md` - ---- - -#### Phase 3: Tech Stack Consolidation (OPTIONAL) - -**Role**: Software architect evaluating technology choices - -Focused synthesis step — no new 8-step cycle. Uses research already gathered in Phase 2 to make concrete technology decisions. - -**Input**: Latest `solution_draft##.md` from OUTPUT_DIR + all files from INPUT_DIR - -**Task**: -1. Extract technology options from the solution draft's component comparison tables -2. Score each option against: fitness for purpose, maturity, security track record, team expertise, cost, scalability -3. Produce a tech stack summary with selection rationale -4. Assess risks and learning requirements per technology choice - -**📁 Save action**: Write `OUTPUT_DIR/tech_stack.md` with: -- Requirements analysis (functional, non-functional, constraints) -- Technology evaluation tables (language, framework, database, infrastructure, key libraries) with scores -- Tech stack summary block -- Risk assessment and learning requirements tables - ---- - -#### Phase 4: Security Deep Dive (OPTIONAL) - -**Role**: Security architect - -Focused analysis step — deepens the security column from the solution draft into a proper threat model and controls specification. - -**Input**: Latest `solution_draft##.md` from OUTPUT_DIR + `security_approach.md` from INPUT_DIR + problem context - -**Task**: -1. 
Build threat model: asset inventory, threat actors, attack vectors -2. Define security requirements and proposed controls per component (with risk level) -3. Summarize authentication/authorization, data protection, secure communication, and logging/monitoring approach - -**📁 Save action**: Write `OUTPUT_DIR/security_analysis.md` with: -- Threat model (assets, actors, vectors) -- Per-component security requirements and controls table -- Security controls summary +Phases: AC Assessment (BLOCKING) → Problem Research → Tech Stack (optional) → Security (optional). --- ### Mode B: Solution Assessment -Triggered when `solution_draft*.md` files exist in OUTPUT_DIR. +Read and follow `steps/02_mode-b-solution-assessment.md`. -**Role**: Professional software architect +--- -Full 8-step research methodology applied to assessing and improving an existing solution draft. +## Research Engine (8-Step Method) -**Input**: All files from INPUT_DIR + the latest (highest-numbered) `solution_draft##.md` from OUTPUT_DIR +The 8-step method is the core research engine used by both modes. Steps 0-1 and Step 8 have mode-specific behavior; Steps 2-7 are identical regardless of mode. -**Task** (drives the 8-step engine): -1. Read the existing solution draft thoroughly -2. Research in internet extensively — for each component/decision in the draft, search for: - - Known problems and limitations of the chosen approach - - What practitioners say about using it in production - - Better alternatives that may have emerged recently - - Common failure modes and edge cases - - How competitors/similar projects solve the same problem differently -3. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure" -4. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft -5. 
Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports -6. For each identified weak point, search for multiple solution approaches and compare them -7. Based on findings, form a new solution draft in the same format +**Investigation phase** (Steps 0–3.5): Read and follow `steps/03_engine-investigation.md`. +Covers: question classification, novelty sensitivity, question decomposition, perspective rotation, exhaustive web search, fact extraction, iterative deepening. -**📁 Save action**: Write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md` +**Analysis phase** (Steps 4–8): Read and follow `steps/04_engine-analysis.md`. +Covers: comparison framework, baseline alignment, reasoning chain, use-case validation, deliverable formatting. -**Optional follow-up**: After Mode B completes, the user can request Phase 3 (Tech Stack Consolidation) or Phase 4 (Security Deep Dive) using the revised draft. These phases work identically to their Mode A descriptions above. +## Solution Draft Output Templates + +- Mode A: `templates/solution_draft_mode_a.md` +- Mode B: `templates/solution_draft_mode_b.md` ## Escalation Rules @@ -316,389 +111,12 @@ When the user wants to: - Gather information and evidence for a decision - Assess or improve an existing solution draft -**Keywords**: -- "deep research", "deep dive", "in-depth analysis" -- "research this", "investigate", "look into" -- "assess solution", "review draft", "improve solution" -- "comparative analysis", "concept comparison", "technical comparison" - **Differentiation from other Skills**: - Needs a **visual knowledge graph** → use `research-to-diagram` - Needs **written output** (articles/tutorials) → use `wsy-writer` - Needs **material organization** → use `material-to-markdown` - Needs **research + solution draft** → use this Skill -## Research Engine (8-Step Method) - -The 8-step method is the core research engine used by both modes. 
Steps 0-1 and Step 8 have mode-specific behavior; Steps 2-7 are identical regardless of mode. - -### Step 0: Question Type Classification - -First, classify the research question type and select the corresponding strategy: - -| Question Type | Core Task | Focus Dimensions | -|---------------|-----------|------------------| -| **Concept Comparison** | Build comparison framework | Mechanism differences, applicability boundaries | -| **Decision Support** | Weigh trade-offs | Cost, risk, benefit | -| **Trend Analysis** | Map evolution trajectory | History, driving factors, predictions | -| **Problem Diagnosis** | Root cause analysis | Symptoms, causes, evidence chain | -| **Knowledge Organization** | Systematic structuring | Definitions, classifications, relationships | - -**Mode-specific classification**: - -| Mode / Phase | Typical Question Type | -|--------------|----------------------| -| Mode A Phase 1 | Knowledge Organization + Decision Support | -| Mode A Phase 2 | Decision Support | -| Mode B | Problem Diagnosis + Decision Support | - -### Step 0.5: Novelty Sensitivity Assessment (BLOCKING) - -Before starting research, assess the novelty sensitivity of the question (Critical/High/Medium/Low). This determines source time windows and filtering strategy. - -**For full classification table, critical-domain rules, trigger words, and assessment template**: Read `references/novelty-sensitivity.md` - -Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources within 6 months, mandatory version annotations, cross-validation from 2+ sources, and direct verification of official download pages. - -**📁 Save action**: Append timeliness assessment to the end of `00_question_decomposition.md` - ---- - -### Step 1: Question Decomposition & Boundary Definition - -**Mode-specific sub-questions**: - -**Mode A Phase 2** (Initial Research — Problem & Solution): -- "What existing/competitor solutions address this problem?" 
-- "What are the component parts of this problem?" -- "For each component, what are the state-of-the-art solutions?" -- "What are the security considerations per component?" -- "What are the cost implications of each approach?" - -**Mode B** (Solution Assessment): -- "What are the weak points and potential problems in the existing draft?" -- "What are the security vulnerabilities in the proposed architecture?" -- "Where are the performance bottlenecks?" -- "What solutions exist for each identified issue?" - -**General sub-question patterns** (use when applicable): -- **Sub-question A**: "What is X and how does it work?" (Definition & mechanism) -- **Sub-question B**: "What are the dimensions of relationship/difference between X and Y?" (Comparative analysis) -- **Sub-question C**: "In what scenarios is X applicable/inapplicable?" (Boundary conditions) -- **Sub-question D**: "What are X's development trends/best practices?" (Extended analysis) - -#### Perspective Rotation (MANDATORY) - -For each research problem, examine it from **at least 3 different perspectives**. Each perspective generates its own sub-questions and search queries. - -| Perspective | What it asks | Example queries | -|-------------|-------------|-----------------| -| **End-user / Consumer** | What problems do real users encounter? What do they wish were different? | "X problems", "X frustrations reddit", "X user complaints" | -| **Implementer / Engineer** | What are the technical challenges, gotchas, hidden complexities? | "X implementation challenges", "X pitfalls", "X lessons learned" | -| **Business / Decision-maker** | What are the costs, ROI, strategic implications? | "X total cost of ownership", "X ROI case study", "X vs Y business comparison" | -| **Contrarian / Devil's advocate** | What could go wrong? Why might this fail? What are critics saying? 
| "X criticism", "why not X", "X failures", "X disadvantages real world" | -| **Domain expert / Academic** | What does peer-reviewed research say? What are theoretical limits? | "X research paper", "X systematic review", "X benchmarks academic" | -| **Practitioner / Field** | What do people who actually use this daily say? What works in practice vs theory? | "X in production", "X experience report", "X after 1 year" | - -Select at least 3 perspectives relevant to the problem. Document the chosen perspectives in `00_question_decomposition.md`. - -#### Question Explosion (MANDATORY) - -For **each sub-question**, generate **at least 3-5 search query variants** before searching. This ensures broad coverage and avoids missing relevant information due to terminology differences. - -**Query variant strategies**: -- **Specificity ladder**: broad ("indoor navigation systems") → narrow ("UWB-based indoor drone navigation accuracy") -- **Negation/failure**: "X limitations", "X failure modes", "when X doesn't work" -- **Comparison framing**: "X vs Y for Z", "X alternative for Z", "X or Y which is better for Z" -- **Practitioner voice**: "X in production experience", "X real-world results", "X lessons learned" -- **Temporal**: "X 2025", "X latest developments", "X roadmap" -- **Geographic/domain**: "X in Europe", "X for defense applications", "X in agriculture" - -Record all planned queries in `00_question_decomposition.md` alongside each sub-question. - -**⚠️ Research Subject Boundary Definition (BLOCKING - must be explicit)**: - -When decomposing questions, you must explicitly define the **boundaries of the research subject**: - -| Dimension | Boundary to define | Example | -|-----------|--------------------|---------| -| **Population** | Which group is being studied? | University students vs K-12 vs vocational students vs all students | -| **Geography** | Which region is being studied? 
| Chinese universities vs US universities vs global | -| **Timeframe** | Which period is being studied? | Post-2020 vs full historical picture | -| **Level** | Which level is being studied? | Undergraduate vs graduate vs vocational | - -**Common mistake**: User asks about "university classroom issues" but sources include policies targeting "K-12 students" — mismatched target populations will invalidate the entire research. - -**📁 Save action**: -1. Read all files from INPUT_DIR to ground the research in the project context -2. Create working directory `RESEARCH_DIR/` -3. Write `00_question_decomposition.md`, including: - - Original question - - Active mode (A Phase 2 or B) and rationale - - Summary of relevant problem context from INPUT_DIR - - Classified question type and rationale - - **Research subject boundary definition** (population, geography, timeframe, level) - - List of decomposed sub-questions - - **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale - - **Search query variants** for each sub-question (at least 3-5 per sub-question) -4. Write TodoWrite to track progress - -### Step 2: Source Tiering & Exhaustive Web Investigation - -Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). Conclusions must be traceable to L1/L2; L3/L4 serve as supplementary and validation. 
- -**For full tier definitions, search strategies, community mining steps, and source registry templates**: Read `references/source-tiering.md` - -**Tool Usage**: -- Use `WebSearch` for broad searches; `WebFetch` to read specific pages -- Use the `context7` MCP server (`resolve-library-id` then `get-library-docs`) for up-to-date library/framework documentation -- Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories) -- When citing web sources, include the URL and date accessed - -#### Exhaustive Search Requirements (MANDATORY) - -Do not stop at the first few results. The goal is to build a comprehensive evidence base. - -**Minimum search effort per sub-question**: -- Execute **all** query variants generated in Step 1's Question Explosion (at least 3-5 per sub-question) -- Consult at least **2 different source tiers** per sub-question (e.g., L1 official docs + L4 community discussion) -- If initial searches yield fewer than 3 relevant sources for a sub-question, **broaden the search** with alternative terms, related domains, or analogous problems - -**Search broadening strategies** (use when results are thin): -- Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation" -- Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums -- Try different geographies: search in English + search for European/Asian approaches if relevant -- Try historical evolution: "history of X", "evolution of X approaches", "X state of the art 2024 2025" -- Try failure analysis: "X project failure", "X post-mortem", "X recall", "X incident report" - -**Search saturation rule**: Continue searching until new queries stop producing substantially new information. If the last 3 searches only repeat previously found facts, the sub-question is saturated. 
- -**📁 Save action**: -For each source consulted, **immediately** append to `01_source_registry.md` using the entry template from `references/source-tiering.md`. - -### Step 3: Fact Extraction & Evidence Cards - -Transform sources into **verifiable fact cards**: - -```markdown -## Fact Cards - -### Fact 1 -- **Statement**: [specific fact description] -- **Source**: [link/document section] -- **Confidence**: High/Medium/Low - -### Fact 2 -... -``` - -**Key discipline**: -- Pin down facts first, then reason -- Distinguish "what officials said" from "what I infer" -- When conflicting information is found, annotate and preserve both sides -- Annotate confidence level: - - ✅ High: Explicitly stated in official documentation - - ⚠️ Medium: Mentioned in official blog but not formally documented - - ❓ Low: Inference or from unofficial sources - -**📁 Save action**: -For each extracted fact, **immediately** append to `02_fact_cards.md`: -```markdown -## Fact #[number] -- **Statement**: [specific fact description] -- **Source**: [Source #number] [link] -- **Phase**: [Phase 1 / Phase 2 / Assessment] -- **Target Audience**: [which group this fact applies to, inherited from source or further refined] -- **Confidence**: ✅/⚠️/❓ -- **Related Dimension**: [corresponding comparison dimension] -``` - -**⚠️ Target audience in fact statements**: -- If a fact comes from a "partially overlapping" or "reference only" source, the statement **must explicitly annotate the applicable scope** -- Wrong: "The Ministry of Education banned phones in classrooms" (doesn't specify who) -- Correct: "The Ministry of Education banned K-12 students from bringing phones into classrooms (does not apply to university students)" - -### Step 3.5: Iterative Deepening — Follow-Up Investigation - -After initial fact extraction, review what you have found and identify **knowledge gaps and new questions** that emerged from the initial research. This step ensures the research doesn't stop at surface-level findings. 
- -**Process**: - -1. **Gap analysis**: Review fact cards and identify: - - Sub-questions with fewer than 3 high-confidence facts → need more searching - - Contradictions between sources → need tie-breaking evidence - - Perspectives (from Step 1) that have no or weak coverage → need targeted search - - Claims that rely only on L3/L4 sources → need L1/L2 verification - -2. **Follow-up question generation**: Based on initial findings, generate new questions: - - "Source X claims [fact] — is this consistent with other evidence?" - - "If [approach A] has [limitation], how do practitioners work around it?" - - "What are the second-order effects of [finding]?" - - "Who disagrees with [common finding] and why?" - - "What happened when [solution] was deployed at scale?" - -3. **Targeted deep-dive searches**: Execute follow-up searches focusing on: - - Specific claims that need verification - - Alternative viewpoints not yet represented - - Real-world case studies and experience reports - - Failure cases and edge conditions - - Recent developments that may change the picture - -4. **Update artifacts**: Append new sources to `01_source_registry.md`, new facts to `02_fact_cards.md` - -**Exit criteria**: Proceed to Step 4 when: -- Every sub-question has at least 3 facts with at least one from L1/L2 -- At least 3 perspectives from Step 1 have supporting evidence -- No unresolved contradictions remain (or they are explicitly documented as open questions) -- Follow-up searches are no longer producing new substantive information - -### Step 4: Build Comparison/Analysis Framework - -Based on the question type, select fixed analysis dimensions. **For dimension lists** (General, Concept Comparison, Decision Support): Read `references/comparison-frameworks.md` - -**📁 Save action**: -Write to `03_comparison_framework.md`: -```markdown -# Comparison Framework - -## Selected Framework Type -[Concept Comparison / Decision Support / ...] - -## Selected Dimensions -1. [Dimension 1] -2. 
[Dimension 2] -... - -## Initial Population -| Dimension | X | Y | Factual Basis | -|-----------|---|---|---------------| -| [Dimension 1] | [description] | [description] | Fact #1, #3 | -| ... | | | | -``` - -### Step 5: Reference Point Baseline Alignment - -Ensure all compared parties have clear, consistent definitions: - -**Checklist**: -- [ ] Is the reference point's definition stable/widely accepted? -- [ ] Does it need verification, or can domain common knowledge be used? -- [ ] Does the reader's understanding of the reference point match mine? -- [ ] Are there ambiguities that need to be clarified first? - -### Step 6: Fact-to-Conclusion Reasoning Chain - -Explicitly write out the "fact → comparison → conclusion" reasoning process: - -```markdown -## Reasoning Process - -### Regarding [Dimension Name] - -1. **Fact confirmation**: According to [source], X's mechanism is... -2. **Compare with reference**: While Y's mechanism is... -3. **Conclusion**: Therefore, the difference between X and Y on this dimension is... -``` - -**Key discipline**: -- Conclusions come from mechanism comparison, not "gut feelings" -- Every conclusion must be traceable to specific facts -- Uncertain conclusions must be annotated - -**📁 Save action**: -Write to `04_reasoning_chain.md`: -```markdown -# Reasoning Chain - -## Dimension 1: [Dimension Name] - -### Fact Confirmation -According to [Fact #X], X's mechanism is... - -### Reference Comparison -While Y's mechanism is... (Source: [Fact #Y]) - -### Conclusion -Therefore, the difference between X and Y on this dimension is... - -### Confidence -✅/⚠️/❓ + rationale - ---- -## Dimension 2: [Dimension Name] -... -``` - -### Step 7: Use-Case Validation (Sanity Check) - -Validate conclusions against a typical scenario: - -**Validation questions**: -- Based on my conclusions, how should this scenario be handled? -- Is that actually the case? -- Are there counterexamples that need to be addressed? 
- -**Review checklist**: -- [ ] Are draft conclusions consistent with Step 3 fact cards? -- [ ] Are there any important dimensions missed? -- [ ] Is there any over-extrapolation? -- [ ] Are conclusions actionable/verifiable? - -**📁 Save action**: -Write to `05_validation_log.md`: -```markdown -# Validation Log - -## Validation Scenario -[Scenario description] - -## Expected Based on Conclusions -If using X: [expected behavior] -If using Y: [expected behavior] - -## Actual Validation Results -[actual situation] - -## Counterexamples -[yes/no, describe if yes] - -## Review Checklist -- [x] Draft conclusions consistent with fact cards -- [x] No important dimensions missed -- [x] No over-extrapolation -- [ ] Issue found: [if any] - -## Conclusions Requiring Revision -[if any] -``` - -### Step 8: Deliverable Formatting - -Make the output **readable, traceable, and actionable**. - -**📁 Save action**: -Integrate all intermediate artifacts. Write to `OUTPUT_DIR/solution_draft##.md` using the appropriate output template based on active mode: -- Mode A: `templates/solution_draft_mode_a.md` -- Mode B: `templates/solution_draft_mode_b.md` - -Sources to integrate: -- Extract background from `00_question_decomposition.md` -- Reference key facts from `02_fact_cards.md` -- Organize conclusions from `04_reasoning_chain.md` -- Generate references from `01_source_registry.md` -- Supplement with use cases from `05_validation_log.md` -- For Mode A: include AC assessment from `00_ac_assessment.md` - -## Solution Draft Output Templates - -### Mode A: Initial Research Output - -Use template: `templates/solution_draft_mode_a.md` - -### Mode B: Solution Assessment Output - -Use template: `templates/solution_draft_mode_b.md` - ## Stakeholder Perspectives Adjust content depth based on audience: @@ -709,75 +127,6 @@ Adjust content depth based on audience: | **Implementers** | Specific mechanisms, how-to | Detailed, emphasize how to do it | | **Technical experts** | Details, boundary 
conditions, limitations | In-depth, emphasize accuracy | -## Output Files - -Default intermediate artifacts location: `RESEARCH_DIR/` - -**Required files** (automatically generated through the process): - -| File | Content | When Generated | -|------|---------|----------------| -| `00_ac_assessment.md` | AC & restrictions assessment (Mode A only) | After Phase 1 completion | -| `00_question_decomposition.md` | Question type, sub-question list | After Step 0-1 completion | -| `01_source_registry.md` | All source links and summaries | Continuously updated during Step 2 | -| `02_fact_cards.md` | Extracted facts and sources | Continuously updated during Step 3 | -| `03_comparison_framework.md` | Selected framework and populated data | After Step 4 completion | -| `04_reasoning_chain.md` | Fact → conclusion reasoning | After Step 6 completion | -| `05_validation_log.md` | Use-case validation and review | After Step 7 completion | -| `OUTPUT_DIR/solution_draft##.md` | Complete solution draft | After Step 8 completion | -| `OUTPUT_DIR/tech_stack.md` | Tech stack evaluation and decisions | After Phase 3 (optional) | -| `OUTPUT_DIR/security_analysis.md` | Threat model and security controls | After Phase 4 (optional) | - -**Optional files**: -- `raw/*.md` - Raw source archives (saved when content is lengthy) - -## Methodology Quick Reference Card - -``` -┌──────────────────────────────────────────────────────────────────┐ -│ Deep Research — Mode-Aware 8-Step Method │ -├──────────────────────────────────────────────────────────────────┤ -│ CONTEXT: Resolve mode (project vs standalone) + set paths │ -│ GUARDRAILS: Check INPUT_DIR/INPUT_FILE exists + required files │ -│ MODE DETECT: solution_draft*.md in 01_solution? 
→ A or B │ -│ │ -│ MODE A: Initial Research │ -│ Phase 1: AC & Restrictions Assessment (BLOCKING) │ -│ Phase 2: Full 8-step → solution_draft##.md │ -│ Phase 3: Tech Stack Consolidation (OPTIONAL) → tech_stack.md │ -│ Phase 4: Security Deep Dive (OPTIONAL) → security_analysis.md │ -│ │ -│ MODE B: Solution Assessment │ -│ Read latest draft → Full 8-step → solution_draft##.md (N+1) │ -│ Optional: Phase 3 / Phase 4 on revised draft │ -│ │ -│ 8-STEP ENGINE: │ -│ 0. Classify question type → Select framework template │ -│ 0.5 Novelty sensitivity → Time windows for sources │ -│ 1. Decompose question → sub-questions + perspectives + queries │ -│ → Perspective Rotation (3+ viewpoints, MANDATORY) │ -│ → Question Explosion (3-5 query variants per sub-Q) │ -│ 2. Exhaustive web search → L1 > L2 > L3 > L4, broad coverage │ -│ → Execute ALL query variants, search until saturation │ -│ 3. Extract facts → Each with source, confidence level │ -│ 3.5 Iterative deepening → gaps, contradictions, follow-ups │ -│ → Keep searching until exit criteria met │ -│ 4. Build framework → Fixed dimensions, structured compare │ -│ 5. Align references → Ensure unified definitions │ -│ 6. Reasoning chain → Fact→Compare→Conclude, explicit │ -│ 7. Use-case validation → Sanity check, prevent armchairing │ -│ 8. Deliverable → solution_draft##.md (mode-specific format) │ -├──────────────────────────────────────────────────────────────────┤ -│ Key discipline: Ask don't assume · Facts before reasoning │ -│ Conclusions from mechanism, not gut feelings │ -│ Search broadly, from multiple perspectives, until saturation │ -└──────────────────────────────────────────────────────────────────┘ -``` - -## Usage Examples - -For detailed execution flow examples (Mode A initial, Mode B assessment, standalone, force override): Read `references/usage-examples.md` - ## Source Verifiability Requirements Every cited piece of external information must be directly verifiable by the user. 
All links must be publicly accessible (annotate `[login required]` if not), citations must include exact section/page/timestamp, and unverifiable information must be annotated `[limited source]`. Full checklist in `references/quality-checklists.md`. @@ -795,7 +144,7 @@ Before completing the solution draft, run through the checklists in `references/ When replying to the user after research is complete: -**✅ Should include**: +**Should include**: - Active mode used (A or B) and which optional phases were executed - One-sentence core conclusion - Key findings summary (3-5 points) @@ -803,7 +152,7 @@ When replying to the user after research is complete: - Paths to optional artifacts if produced: `tech_stack.md`, `security_analysis.md` - If there are significant uncertainties, annotate points requiring further verification -**❌ Must not include**: +**Must not include**: - Process file listings (e.g., `00_question_decomposition.md`, `01_source_registry.md`, etc.) - Detailed research step descriptions - Working directory structure display diff --git a/.cursor/skills/research/steps/00_project-integration.md b/.cursor/skills/research/steps/00_project-integration.md new file mode 100644 index 0000000..f94ef4f --- /dev/null +++ b/.cursor/skills/research/steps/00_project-integration.md @@ -0,0 +1,103 @@ +## Project Integration + +### Prerequisite Guardrails (BLOCKING) + +Before any research begins, verify the input context exists. **Do not proceed if guardrails fail.** + +**Project mode:** +1. Check INPUT_DIR exists — **STOP if missing**, ask user to create it and provide problem files +2. Check `problem.md` in INPUT_DIR exists and is non-empty — **STOP if missing** +3. Check `restrictions.md` in INPUT_DIR exists and is non-empty — **STOP if missing** +4. Check `acceptance_criteria.md` in INPUT_DIR exists and is non-empty — **STOP if missing** +5. Check `input_data/` in INPUT_DIR exists and contains at least one file — **STOP if missing** +6. 
Read **all** files in INPUT_DIR to ground the investigation in the project context +7. Create OUTPUT_DIR and RESEARCH_DIR if they don't exist + +**Standalone mode:** +1. Check INPUT_FILE exists and is non-empty — **STOP if missing** +2. Resolve BASE_DIR: use the caller-specified directory if provided; otherwise default to `_standalone/` +3. Resolve OUTPUT_DIR (`BASE_DIR/01_solution/`) and RESEARCH_DIR (`BASE_DIR/00_research/`) +4. Warn if no `restrictions.md` or `acceptance_criteria.md` were provided alongside INPUT_FILE — proceed if user confirms +5. Create BASE_DIR, OUTPUT_DIR, and RESEARCH_DIR if they don't exist + +### Mode Detection + +After guardrails pass, determine the execution mode: + +1. Scan OUTPUT_DIR for files matching `solution_draft*.md` +2. **No matches found** → **Mode A: Initial Research** +3. **Matches found** → **Mode B: Solution Assessment** (use the highest-numbered draft as input) +4. **User override**: if the user explicitly says "research from scratch" or "initial research", force Mode A regardless of existing drafts + +Inform the user which mode was detected and confirm before proceeding. + +### Solution Draft Numbering + +All final output is saved as `OUTPUT_DIR/solution_draft##.md` with a 2-digit zero-padded number: + +1. Scan existing files in OUTPUT_DIR matching `solution_draft*.md` +2. Extract the highest existing number +3. Increment by 1 +4. Zero-pad to 2 digits (e.g., `01`, `02`, ..., `10`, `11`) + +Example: if `solution_draft01.md` through `solution_draft10.md` exist, the next output is `solution_draft11.md`. 
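The numbering rule above can be sketched as a small helper. This is a sketch only — the skill prescribes the behavior, not an implementation; Python, the function name, and the directory argument are assumptions:

```python
import re
from pathlib import Path

def next_draft_name(output_dir: str) -> str:
    """Return the next zero-padded solution draft filename for OUTPUT_DIR."""
    pattern = re.compile(r"solution_draft(\d+)\.md")
    numbers = [
        int(m.group(1))
        for p in Path(output_dir).glob("solution_draft*.md")
        if (m := pattern.fullmatch(p.name))
    ]
    highest = max(numbers, default=0)  # no existing drafts -> next is 01
    return f"solution_draft{highest + 1:02d}.md"
```

Note that zero-padding to two digits keeps lexicographic filename order matching numeric order up to draft 99, which is what makes "use the highest-numbered draft" cheap to compute.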
+ +### Working Directory & Intermediate Artifact Management + +#### Directory Structure + +At the start of research, **must** create a working directory under RESEARCH_DIR: + +``` +RESEARCH_DIR/ +├── 00_ac_assessment.md # Mode A Phase 1 output: AC & restrictions assessment +├── 00_question_decomposition.md # Step 0-1 output +├── 01_source_registry.md # Step 2 output: all consulted source links +├── 02_fact_cards.md # Step 3 output: extracted facts +├── 03_comparison_framework.md # Step 4 output: selected framework and populated data +├── 04_reasoning_chain.md # Step 6 output: fact → conclusion reasoning +├── 05_validation_log.md # Step 7 output: use-case validation results +└── raw/ # Raw source archive (optional) + ├── source_1.md + └── source_2.md +``` + +### Save Timing & Content + +| Step | Save immediately after completion | Filename | +|------|-----------------------------------|----------| +| Mode A Phase 1 | AC & restrictions assessment tables | `00_ac_assessment.md` | +| Step 0-1 | Question type classification + sub-question list | `00_question_decomposition.md` | +| Step 2 | Each consulted source link, tier, summary | `01_source_registry.md` | +| Step 3 | Each fact card (statement + source + confidence) | `02_fact_cards.md` | +| Step 4 | Selected comparison framework + initial population | `03_comparison_framework.md` | +| Step 6 | Reasoning process for each dimension | `04_reasoning_chain.md` | +| Step 7 | Validation scenarios + results + review checklist | `05_validation_log.md` | +| Step 8 | Complete solution draft | `OUTPUT_DIR/solution_draft##.md` | + +### Save Principles + +1. **Save immediately**: Write to the corresponding file as soon as a step is completed; don't wait until the end +2. **Incremental updates**: Same file can be updated multiple times; append or replace new content +3. **Preserve process**: Keep intermediate files even after their content is integrated into the final report +4. 
**Enable recovery**: If research is interrupted, progress can be recovered from intermediate files + +### Output Files + +**Required files** (automatically generated through the process): + +| File | Content | When Generated | +|------|---------|----------------| +| `00_ac_assessment.md` | AC & restrictions assessment (Mode A only) | After Phase 1 completion | +| `00_question_decomposition.md` | Question type, sub-question list | After Step 0-1 completion | +| `01_source_registry.md` | All source links and summaries | Continuously updated during Step 2 | +| `02_fact_cards.md` | Extracted facts and sources | Continuously updated during Step 3 | +| `03_comparison_framework.md` | Selected framework and populated data | After Step 4 completion | +| `04_reasoning_chain.md` | Fact → conclusion reasoning | After Step 6 completion | +| `05_validation_log.md` | Use-case validation and review | After Step 7 completion | +| `OUTPUT_DIR/solution_draft##.md` | Complete solution draft | After Step 8 completion | +| `OUTPUT_DIR/tech_stack.md` | Tech stack evaluation and decisions | After Phase 3 (optional) | +| `OUTPUT_DIR/security_analysis.md` | Threat model and security controls | After Phase 4 (optional) | + +**Optional files**: +- `raw/*.md` - Raw source archives (saved when content is lengthy) diff --git a/.cursor/skills/research/steps/01_mode-a-initial-research.md b/.cursor/skills/research/steps/01_mode-a-initial-research.md new file mode 100644 index 0000000..88404cd --- /dev/null +++ b/.cursor/skills/research/steps/01_mode-a-initial-research.md @@ -0,0 +1,127 @@ +## Mode A: Initial Research + +Triggered when no `solution_draft*.md` files exist in OUTPUT_DIR, or when the user explicitly requests initial research. + +### Phase 1: AC & Restrictions Assessment (BLOCKING) + +**Role**: Professional software architect + +A focused preliminary research pass **before** the main solution research. 
The goal is to validate that the acceptance criteria and restrictions are realistic before designing a solution around them.
+
+**Input**: All files from INPUT_DIR (or INPUT_FILE in standalone mode)
+
+**Task**:
+1. Read all problem context files thoroughly
+2. **ASK the user about every unclear aspect** — do not assume:
+   - Unclear problem boundaries → ask
+   - Ambiguous acceptance criteria values → ask
+   - Missing context (no `security_approach.md`, no `input_data/`) → ask what they have
+   - Conflicting restrictions → ask which takes priority
+3. Research the internet **extensively** — use multiple search queries per question, rephrase, and search from different angles:
+   - How realistic are the acceptance criteria for this specific domain? Search for industry benchmarks, standards, and typical values
+   - How critical is each criterion? Search for case studies where criteria were relaxed or tightened
+   - What domain-specific acceptance criteria are we missing? Search for industry standards, regulatory requirements, and best practices in the specific domain
+   - Impact of each criterion value on the whole system quality — search for research papers and engineering reports
+   - Cost/budget implications of each criterion — search for pricing, total cost of ownership analyses, and comparable project budgets
+   - Timeline implications — search for project timelines, development velocity reports, and comparable implementations
+   - What do practitioners in this domain consider the most important criteria? Search forums, conference talks, and experience reports
+4. Research restrictions from multiple perspectives:
+   - Are the restrictions realistic? Search for comparable projects that operated under similar constraints
+   - Should any be tightened or relaxed? Search for what constraints similar projects actually ended up with
+   - Are there additional restrictions we should add?
Search for regulatory, compliance, and safety requirements in this domain + - What restrictions do practitioners wish they had defined earlier? Search for post-mortem reports and lessons learned +5. Verify findings with authoritative sources (official docs, papers, benchmarks) — each key finding must have at least 2 independent sources + +**Uses Steps 0-3 of the 8-step engine** (question classification, decomposition, source tiering, fact extraction) scoped to AC and restrictions assessment. + +**Save action**: Write `RESEARCH_DIR/00_ac_assessment.md` with format: + +```markdown +# Acceptance Criteria Assessment + +## Acceptance Criteria + +| Criterion | Our Values | Researched Values | Cost/Timeline Impact | Status | +|-----------|-----------|-------------------|---------------------|--------| +| [name] | [current] | [researched range] | [impact] | Added / Modified / Removed | + +## Restrictions Assessment + +| Restriction | Our Values | Researched Values | Cost/Timeline Impact | Status | +|-------------|-----------|-------------------|---------------------|--------| +| [name] | [current] | [researched range] | [impact] | Added / Modified / Removed | + +## Key Findings +[Summary of critical findings] + +## Sources +[Key references used] +``` + +**BLOCKING**: Present the AC assessment tables to the user. Wait for confirmation or adjustments before proceeding to Phase 2. The user may update `acceptance_criteria.md` or `restrictions.md` based on findings. + +--- + +### Phase 2: Problem Research & Solution Draft + +**Role**: Professional researcher and software architect + +Full 8-step research methodology. Produces the first solution draft. + +**Input**: All files from INPUT_DIR (possibly updated after Phase 1) + Phase 1 artifacts + +**Task** (drives the 8-step engine): +1. Research existing/competitor solutions for similar problems — search broadly across industries and adjacent domains, not just the obvious competitors +2. 
Research the problem thoroughly — all possible ways to solve it, split into components; search for how different fields approach analogous problems
+3. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1
+4. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions
+5. Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist?
+6. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues)
+7. Include security considerations in each component analysis
+8. Provide rough cost estimates for proposed solutions
+
+Be concise in your formulation: the fewer words the better, but do not omit any important detail.
+
+**Save action**: Write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md`
+
+---
+
+### Phase 3: Tech Stack Consolidation (OPTIONAL)
+
+**Role**: Software architect evaluating technology choices
+
+Focused synthesis step — no new 8-step cycle. Uses research already gathered in Phase 2 to make concrete technology decisions.
+
+**Input**: Latest `solution_draft##.md` from OUTPUT_DIR + all files from INPUT_DIR
+
+**Task**:
+1. Extract technology options from the solution draft's component comparison tables
+2. Score each option against: fitness for purpose, maturity, security track record, team expertise, cost, scalability
+3. Produce a tech stack summary with selection rationale
+4.
Assess risks and learning requirements per technology choice + +**Save action**: Write `OUTPUT_DIR/tech_stack.md` with: +- Requirements analysis (functional, non-functional, constraints) +- Technology evaluation tables (language, framework, database, infrastructure, key libraries) with scores +- Tech stack summary block +- Risk assessment and learning requirements tables + +--- + +### Phase 4: Security Deep Dive (OPTIONAL) + +**Role**: Security architect + +Focused analysis step — deepens the security column from the solution draft into a proper threat model and controls specification. + +**Input**: Latest `solution_draft##.md` from OUTPUT_DIR + `security_approach.md` from INPUT_DIR + problem context + +**Task**: +1. Build threat model: asset inventory, threat actors, attack vectors +2. Define security requirements and proposed controls per component (with risk level) +3. Summarize authentication/authorization, data protection, secure communication, and logging/monitoring approach + +**Save action**: Write `OUTPUT_DIR/security_analysis.md` with: +- Threat model (assets, actors, vectors) +- Per-component security requirements and controls table +- Security controls summary diff --git a/.cursor/skills/research/steps/02_mode-b-solution-assessment.md b/.cursor/skills/research/steps/02_mode-b-solution-assessment.md new file mode 100644 index 0000000..d14d031 --- /dev/null +++ b/.cursor/skills/research/steps/02_mode-b-solution-assessment.md @@ -0,0 +1,27 @@ +## Mode B: Solution Assessment + +Triggered when `solution_draft*.md` files exist in OUTPUT_DIR. + +**Role**: Professional software architect + +Full 8-step research methodology applied to assessing and improving an existing solution draft. + +**Input**: All files from INPUT_DIR + the latest (highest-numbered) `solution_draft##.md` from OUTPUT_DIR + +**Task** (drives the 8-step engine): +1. Read the existing solution draft thoroughly +2. 
Research the internet extensively — for each component/decision in the draft, search for:
+   - Known problems and limitations of the chosen approach
+   - What practitioners say about using it in production
+   - Better alternatives that may have emerged recently
+   - Common failure modes and edge cases
+   - How competitors/similar projects solve the same problem differently
+3. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure"
+4. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft
+5. Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports
+6. For each identified weak point, search for multiple solution approaches and compare them
+7. Based on findings, produce a new solution draft in the same format
+
+**Save action**: Write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`
+
+**Optional follow-up**: After Mode B completes, the user can request Phase 3 (Tech Stack Consolidation) or Phase 4 (Security Deep Dive) using the revised draft. These phases work identically to their Mode A descriptions in `steps/01_mode-a-initial-research.md`.
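The Mode A / Mode B trigger and the "use the highest-numbered draft" rule can be sketched together. A sketch under stated assumptions — the skill defines the rules in prose, so Python, the function name, and the return shape are illustrative only:

```python
from pathlib import Path

def detect_mode(output_dir: str, force_initial: bool = False):
    """Return ("A", None) or ("B", latest_draft) per the detection rules."""
    if force_initial:  # user explicitly asked to "research from scratch"
        return ("A", None)
    drafts = sorted(Path(output_dir).glob("solution_draft*.md"))
    if not drafts:
        return ("A", None)  # no existing drafts -> initial research
    return ("B", drafts[-1])  # highest-numbered draft feeds the assessment
```

Because draft numbers are zero-padded, the plain lexicographic sort is enough to pick the latest draft.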
diff --git a/.cursor/skills/research/steps/03_engine-investigation.md b/.cursor/skills/research/steps/03_engine-investigation.md new file mode 100644 index 0000000..733905d --- /dev/null +++ b/.cursor/skills/research/steps/03_engine-investigation.md @@ -0,0 +1,227 @@ +## Research Engine — Investigation Phase (Steps 0–3.5) + +### Step 0: Question Type Classification + +First, classify the research question type and select the corresponding strategy: + +| Question Type | Core Task | Focus Dimensions | +|---------------|-----------|------------------| +| **Concept Comparison** | Build comparison framework | Mechanism differences, applicability boundaries | +| **Decision Support** | Weigh trade-offs | Cost, risk, benefit | +| **Trend Analysis** | Map evolution trajectory | History, driving factors, predictions | +| **Problem Diagnosis** | Root cause analysis | Symptoms, causes, evidence chain | +| **Knowledge Organization** | Systematic structuring | Definitions, classifications, relationships | + +**Mode-specific classification**: + +| Mode / Phase | Typical Question Type | +|--------------|----------------------| +| Mode A Phase 1 | Knowledge Organization + Decision Support | +| Mode A Phase 2 | Decision Support | +| Mode B | Problem Diagnosis + Decision Support | + +### Step 0.5: Novelty Sensitivity Assessment (BLOCKING) + +Before starting research, assess the novelty sensitivity of the question (Critical/High/Medium/Low). This determines source time windows and filtering strategy. + +**For full classification table, critical-domain rules, trigger words, and assessment template**: Read `references/novelty-sensitivity.md` + +Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources within 6 months, mandatory version annotations, cross-validation from 2+ sources, and direct verification of official download pages. 
+ +**Save action**: Append timeliness assessment to the end of `00_question_decomposition.md` + +--- + +### Step 1: Question Decomposition & Boundary Definition + +**Mode-specific sub-questions**: + +**Mode A Phase 2** (Initial Research — Problem & Solution): +- "What existing/competitor solutions address this problem?" +- "What are the component parts of this problem?" +- "For each component, what are the state-of-the-art solutions?" +- "What are the security considerations per component?" +- "What are the cost implications of each approach?" + +**Mode B** (Solution Assessment): +- "What are the weak points and potential problems in the existing draft?" +- "What are the security vulnerabilities in the proposed architecture?" +- "Where are the performance bottlenecks?" +- "What solutions exist for each identified issue?" + +**General sub-question patterns** (use when applicable): +- **Sub-question A**: "What is X and how does it work?" (Definition & mechanism) +- **Sub-question B**: "What are the dimensions of relationship/difference between X and Y?" (Comparative analysis) +- **Sub-question C**: "In what scenarios is X applicable/inapplicable?" (Boundary conditions) +- **Sub-question D**: "What are X's development trends/best practices?" (Extended analysis) + +#### Perspective Rotation (MANDATORY) + +For each research problem, examine it from **at least 3 different perspectives**. Each perspective generates its own sub-questions and search queries. + +| Perspective | What it asks | Example queries | +|-------------|-------------|-----------------| +| **End-user / Consumer** | What problems do real users encounter? What do they wish were different? | "X problems", "X frustrations reddit", "X user complaints" | +| **Implementer / Engineer** | What are the technical challenges, gotchas, hidden complexities? | "X implementation challenges", "X pitfalls", "X lessons learned" | +| **Business / Decision-maker** | What are the costs, ROI, strategic implications? 
| "X total cost of ownership", "X ROI case study", "X vs Y business comparison" | +| **Contrarian / Devil's advocate** | What could go wrong? Why might this fail? What are critics saying? | "X criticism", "why not X", "X failures", "X disadvantages real world" | +| **Domain expert / Academic** | What does peer-reviewed research say? What are theoretical limits? | "X research paper", "X systematic review", "X benchmarks academic" | +| **Practitioner / Field** | What do people who actually use this daily say? What works in practice vs theory? | "X in production", "X experience report", "X after 1 year" | + +Select at least 3 perspectives relevant to the problem. Document the chosen perspectives in `00_question_decomposition.md`. + +#### Question Explosion (MANDATORY) + +For **each sub-question**, generate **at least 3-5 search query variants** before searching. This ensures broad coverage and avoids missing relevant information due to terminology differences. + +**Query variant strategies**: +- **Specificity ladder**: broad ("indoor navigation systems") → narrow ("UWB-based indoor drone navigation accuracy") +- **Negation/failure**: "X limitations", "X failure modes", "when X doesn't work" +- **Comparison framing**: "X vs Y for Z", "X alternative for Z", "X or Y which is better for Z" +- **Practitioner voice**: "X in production experience", "X real-world results", "X lessons learned" +- **Temporal**: "X 2025", "X latest developments", "X roadmap" +- **Geographic/domain**: "X in Europe", "X for defense applications", "X in agriculture" + +Record all planned queries in `00_question_decomposition.md` alongside each sub-question. + +**Research Subject Boundary Definition (BLOCKING - must be explicit)**: + +When decomposing questions, you must explicitly define the **boundaries of the research subject**: + +| Dimension | Boundary to define | Example | +|-----------|--------------------|---------| +| **Population** | Which group is being studied? 
| University students vs K-12 vs vocational students vs all students |
+| **Geography** | Which region is being studied? | Chinese universities vs US universities vs global |
+| **Timeframe** | Which period is being studied? | Post-2020 vs full historical picture |
+| **Level** | Which level is being studied? | Undergraduate vs graduate vs vocational |
+
+**Common mistake**: User asks about "university classroom issues" but sources include policies targeting "K-12 students" — mismatched target populations will invalidate the entire research.
+
+**Save action**:
+1. Read all files from INPUT_DIR to ground the research in the project context
+2. Create working directory `RESEARCH_DIR/`
+3. Write `00_question_decomposition.md`, including:
+   - Original question
+   - Active mode (A Phase 2 or B) and rationale
+   - Summary of relevant problem context from INPUT_DIR
+   - Classified question type and rationale
+   - **Research subject boundary definition** (population, geography, timeframe, level)
+   - List of decomposed sub-questions
+   - **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale
+   - **Search query variants** for each sub-question (at least 3-5 per sub-question)
+4. Write TodoWrite to track progress
+
+---
+
+### Step 2: Source Tiering & Exhaustive Web Investigation
+
+Tier sources by authority and **prioritize primary sources** (L1 > L2 > L3 > L4). Conclusions must be traceable to L1/L2; L3/L4 serve as supplementary evidence and validation.
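The traceability rule lends itself to a mechanical pass over draft conclusions. A minimal sketch, using hypothetical structures (the skill itself operates on markdown artifacts, not code):

```python
def traceable(conclusion: dict) -> bool:
    """A conclusion is traceable if at least one cited source is tier L1 or L2."""
    return any(tier in ("L1", "L2") for tier in conclusion["source_tiers"])

# Hypothetical draft conclusions with the tiers of their cited sources.
draft_conclusions = [
    {"claim": "X supports feature F", "source_tiers": ["L1", "L4"]},
    {"claim": "Y is slower than X", "source_tiers": ["L3", "L4"]},
]

# Anything resting only on L3/L4 evidence needs L1/L2 verification
# before it can survive into the reasoning chain.
needs_verification = [c["claim"] for c in draft_conclusions if not traceable(c)]
```

Here the second claim would be flagged, since it cites only supplementary tiers.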
+ +**For full tier definitions, search strategies, community mining steps, and source registry templates**: Read `references/source-tiering.md` + +**Tool Usage**: +- Use `WebSearch` for broad searches; `WebFetch` to read specific pages +- Use the `context7` MCP server (`resolve-library-id` then `get-library-docs`) for up-to-date library/framework documentation +- Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories) +- When citing web sources, include the URL and date accessed + +#### Exhaustive Search Requirements (MANDATORY) + +Do not stop at the first few results. The goal is to build a comprehensive evidence base. + +**Minimum search effort per sub-question**: +- Execute **all** query variants generated in Step 1's Question Explosion (at least 3-5 per sub-question) +- Consult at least **2 different source tiers** per sub-question (e.g., L1 official docs + L4 community discussion) +- If initial searches yield fewer than 3 relevant sources for a sub-question, **broaden the search** with alternative terms, related domains, or analogous problems + +**Search broadening strategies** (use when results are thin): +- Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation" +- Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums +- Try different geographies: search in English + search for European/Asian approaches if relevant +- Try historical evolution: "history of X", "evolution of X approaches", "X state of the art 2024 2025" +- Try failure analysis: "X project failure", "X post-mortem", "X recall", "X incident report" + +**Search saturation rule**: Continue searching until new queries stop producing substantially new information. If the last 3 searches only repeat previously found facts, the sub-question is saturated. 
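The saturation rule above reduces to a simple check — a sketch where each list entry records how many genuinely new facts one search produced:

```python
def is_saturated(new_fact_counts: list[int], window: int = 3) -> bool:
    """Saturated when the last `window` searches produced no new information."""
    if len(new_fact_counts) < window:
        return False
    return all(count == 0 for count in new_fact_counts[-window:])
```

With fewer than three searches executed, the sub-question is never considered saturated; the rule only fires once three consecutive searches add nothing new.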
+ +**Save action**: +For each source consulted, **immediately** append to `01_source_registry.md` using the entry template from `references/source-tiering.md`. + +--- + +### Step 3: Fact Extraction & Evidence Cards + +Transform sources into **verifiable fact cards**: + +```markdown +## Fact Cards + +### Fact 1 +- **Statement**: [specific fact description] +- **Source**: [link/document section] +- **Confidence**: High/Medium/Low + +### Fact 2 +... +``` + +**Key discipline**: +- Pin down facts first, then reason +- Distinguish "what officials said" from "what I infer" +- When conflicting information is found, annotate and preserve both sides +- Annotate confidence level: + - ✅ High: Explicitly stated in official documentation + - ⚠️ Medium: Mentioned in official blog but not formally documented + - ❓ Low: Inference or from unofficial sources + +**Save action**: +For each extracted fact, **immediately** append to `02_fact_cards.md`: +```markdown +## Fact #[number] +- **Statement**: [specific fact description] +- **Source**: [Source #number] [link] +- **Phase**: [Phase 1 / Phase 2 / Assessment] +- **Target Audience**: [which group this fact applies to, inherited from source or further refined] +- **Confidence**: ✅/⚠️/❓ +- **Related Dimension**: [corresponding comparison dimension] +``` + +**Target audience in fact statements**: +- If a fact comes from a "partially overlapping" or "reference only" source, the statement **must explicitly annotate the applicable scope** +- Wrong: "The Ministry of Education banned phones in classrooms" (doesn't specify who) +- Correct: "The Ministry of Education banned K-12 students from bringing phones into classrooms (does not apply to university students)" + +--- + +### Step 3.5: Iterative Deepening — Follow-Up Investigation + +After initial fact extraction, review what you have found and identify **knowledge gaps and new questions** that emerged from the initial research. 
This step ensures the research doesn't stop at surface-level findings. + +**Process**: + +1. **Gap analysis**: Review fact cards and identify: + - Sub-questions with fewer than 3 high-confidence facts → need more searching + - Contradictions between sources → need tie-breaking evidence + - Perspectives (from Step 1) that have no or weak coverage → need targeted search + - Claims that rely only on L3/L4 sources → need L1/L2 verification + +2. **Follow-up question generation**: Based on initial findings, generate new questions: + - "Source X claims [fact] — is this consistent with other evidence?" + - "If [approach A] has [limitation], how do practitioners work around it?" + - "What are the second-order effects of [finding]?" + - "Who disagrees with [common finding] and why?" + - "What happened when [solution] was deployed at scale?" + +3. **Targeted deep-dive searches**: Execute follow-up searches focusing on: + - Specific claims that need verification + - Alternative viewpoints not yet represented + - Real-world case studies and experience reports + - Failure cases and edge conditions + - Recent developments that may change the picture + +4. 
**Update artifacts**: Append new sources to `01_source_registry.md`, new facts to `02_fact_cards.md` + +**Exit criteria**: Proceed to Step 4 when: +- Every sub-question has at least 3 facts with at least one from L1/L2 +- At least 3 perspectives from Step 1 have supporting evidence +- No unresolved contradictions remain (or they are explicitly documented as open questions) +- Follow-up searches are no longer producing new substantive information diff --git a/.cursor/skills/research/steps/04_engine-analysis.md b/.cursor/skills/research/steps/04_engine-analysis.md new file mode 100644 index 0000000..b06f7cd --- /dev/null +++ b/.cursor/skills/research/steps/04_engine-analysis.md @@ -0,0 +1,146 @@ +## Research Engine — Analysis Phase (Steps 4–8) + +### Step 4: Build Comparison/Analysis Framework + +Based on the question type, select fixed analysis dimensions. **For dimension lists** (General, Concept Comparison, Decision Support): Read `references/comparison-frameworks.md` + +**Save action**: +Write to `03_comparison_framework.md`: +```markdown +# Comparison Framework + +## Selected Framework Type +[Concept Comparison / Decision Support / ...] + +## Selected Dimensions +1. [Dimension 1] +2. [Dimension 2] +... + +## Initial Population +| Dimension | X | Y | Factual Basis | +|-----------|---|---|---------------| +| [Dimension 1] | [description] | [description] | Fact #1, #3 | +| ... | | | | +``` + +--- + +### Step 5: Reference Point Baseline Alignment + +Ensure all compared parties have clear, consistent definitions: + +**Checklist**: +- [ ] Is the reference point's definition stable/widely accepted? +- [ ] Does it need verification, or can domain common knowledge be used? +- [ ] Does the reader's understanding of the reference point match mine? +- [ ] Are there ambiguities that need to be clarified first? 
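Steps 4–5 can be made mechanical: every framework cell should cite fact cards, and every compared party should have a pinned definition before comparison begins. A sketch with hypothetical structures:

```python
# A framework cell pairs a short description with the fact cards backing it.
framework = {
    "Latency": {
        "X": {"summary": "sub-millisecond lookups", "facts": [1, 3]},
        "Y": {"summary": "millisecond-level lookups", "facts": [2]},
    },
}

# Step 5: every compared party needs a pinned, stable definition first.
definitions = {"X": "in-memory store, per v7 docs", "Y": "disk-backed store, per v2 docs"}

def framework_is_grounded(framework: dict, definitions: dict) -> bool:
    """Every cell must cite at least one fact; every party must be defined."""
    for cells in framework.values():
        for party, cell in cells.items():
            if party not in definitions or not cell["facts"]:
                return False
    return True
```

An empty `facts` list or an undefined party is exactly the failure mode the checklist above is guarding against.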
+ +--- + +### Step 6: Fact-to-Conclusion Reasoning Chain + +Explicitly write out the "fact → comparison → conclusion" reasoning process: + +```markdown +## Reasoning Process + +### Regarding [Dimension Name] + +1. **Fact confirmation**: According to [source], X's mechanism is... +2. **Compare with reference**: While Y's mechanism is... +3. **Conclusion**: Therefore, the difference between X and Y on this dimension is... +``` + +**Key discipline**: +- Conclusions come from mechanism comparison, not "gut feelings" +- Every conclusion must be traceable to specific facts +- Uncertain conclusions must be annotated + +**Save action**: +Write to `04_reasoning_chain.md`: +```markdown +# Reasoning Chain + +## Dimension 1: [Dimension Name] + +### Fact Confirmation +According to [Fact #X], X's mechanism is... + +### Reference Comparison +While Y's mechanism is... (Source: [Fact #Y]) + +### Conclusion +Therefore, the difference between X and Y on this dimension is... + +### Confidence +✅/⚠️/❓ + rationale + +--- +## Dimension 2: [Dimension Name] +... +``` + +--- + +### Step 7: Use-Case Validation (Sanity Check) + +Validate conclusions against a typical scenario: + +**Validation questions**: +- Based on my conclusions, how should this scenario be handled? +- Is that actually the case? +- Are there counterexamples that need to be addressed? + +**Review checklist**: +- [ ] Are draft conclusions consistent with Step 3 fact cards? +- [ ] Are there any important dimensions missed? +- [ ] Is there any over-extrapolation? +- [ ] Are conclusions actionable/verifiable? 
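The Step 6 discipline — every conclusion traceable to specific facts, with an annotated confidence mark — can be sketched as an invariant over reasoning-chain entries (hypothetical structure, since the skill writes markdown):

```python
CONFIDENCE_MARKS = {"✅", "⚠️", "❓"}

def chain_entry_ok(entry: dict, known_fact_ids: set[int]) -> bool:
    """An entry must cite at least one known fact and carry a confidence mark."""
    return (
        bool(entry["fact_ids"])
        and all(fid in known_fact_ids for fid in entry["fact_ids"])
        and entry["confidence"] in CONFIDENCE_MARKS
    )
```

An entry citing no facts, or a fact card that does not exist in `02_fact_cards.md`, fails the check — the "gut feeling" conclusion this step forbids.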
+ +**Save action**: +Write to `05_validation_log.md`: +```markdown +# Validation Log + +## Validation Scenario +[Scenario description] + +## Expected Based on Conclusions +If using X: [expected behavior] +If using Y: [expected behavior] + +## Actual Validation Results +[actual situation] + +## Counterexamples +[yes/no, describe if yes] + +## Review Checklist +- [x] Draft conclusions consistent with fact cards +- [x] No important dimensions missed +- [x] No over-extrapolation +- [ ] Issue found: [if any] + +## Conclusions Requiring Revision +[if any] +``` + +--- + +### Step 8: Deliverable Formatting + +Make the output **readable, traceable, and actionable**. + +**Save action**: +Integrate all intermediate artifacts. Write to `OUTPUT_DIR/solution_draft##.md` using the appropriate output template based on active mode: +- Mode A: `templates/solution_draft_mode_a.md` +- Mode B: `templates/solution_draft_mode_b.md` + +Sources to integrate: +- Extract background from `00_question_decomposition.md` +- Reference key facts from `02_fact_cards.md` +- Organize conclusions from `04_reasoning_chain.md` +- Generate references from `01_source_registry.md` +- Supplement with use cases from `05_validation_log.md` +- For Mode A: include AC assessment from `00_ac_assessment.md` diff --git a/.cursor/skills/retrospective/SKILL.md b/.cursor/skills/retrospective/SKILL.md index 0f04f25..3b5191a 100644 --- a/.cursor/skills/retrospective/SKILL.md +++ b/.cursor/skills/retrospective/SKILL.md @@ -4,7 +4,7 @@ description: | Collect metrics from implementation batch reports and code review findings, analyze trends across cycles, and produce improvement reports with actionable recommendations. 3-step workflow: collect metrics, analyze trends, produce report. - Outputs to _docs/05_metrics/. + Outputs to _docs/06_metrics/. 
Trigger phrases: - "retrospective", "retro", "run retro" - "metrics review", "feedback loop" @@ -31,7 +31,7 @@ Collect metrics from implementation artifacts, analyze trends across development Fixed paths: - IMPL_DIR: `_docs/03_implementation/` -- METRICS_DIR: `_docs/05_metrics/` +- METRICS_DIR: `_docs/06_metrics/` - TASKS_DIR: `_docs/02_tasks/` Announce the resolved paths to the user before proceeding. @@ -166,7 +166,7 @@ Present the report summary to the user. │ │ │ 1. Collect Metrics → parse batch reports, compute metrics │ │ 2. Analyze Trends → patterns, comparison, improvement areas │ -│ 3. Produce Report → _docs/05_metrics/retro_[date].md │ +│ 3. Produce Report → _docs/06_metrics/retro_[date].md │ ├────────────────────────────────────────────────────────────────┤ │ Principles: Data-driven · Actionable · Cumulative │ │ Non-judgmental · Save immediately │ diff --git a/.cursor/skills/rollback/SKILL.md b/.cursor/skills/rollback/SKILL.md deleted file mode 100644 index 064ef58..0000000 --- a/.cursor/skills/rollback/SKILL.md +++ /dev/null @@ -1,130 +0,0 @@ ---- -name: rollback -description: | - Revert implementation to a specific batch checkpoint using git revert, reset Jira ticket statuses, - verify rollback integrity with tests, and produce a rollback report. - Trigger phrases: - - "rollback", "revert", "revert batch" - - "undo implementation", "roll back to batch" -category: build -tags: [rollback, revert, recovery, implementation] -disable-model-invocation: true ---- - -# Implementation Rollback - -Revert the codebase to a specific batch checkpoint, reset Jira statuses for reverted tasks, and verify integrity. 
- -## Core Principles - -- **Preserve history**: always use `git revert`, never force-push -- **Verify after revert**: run the full test suite after every rollback -- **Update tracking**: reset Jira ticket statuses for all reverted tasks -- **Atomic rollback**: if rollback fails midway, stop and report — do not leave the codebase in a partial state -- **Ask, don't assume**: if the target batch is ambiguous, present options and ask - -## Context Resolution - -- IMPL_DIR: `_docs/03_implementation/` -- Batch reports: `IMPL_DIR/batch_*_report.md` - -## Prerequisite Checks (BLOCKING) - -1. IMPL_DIR exists and contains at least one `batch_*_report.md` — **STOP if missing** -2. Git working tree is clean (no uncommitted changes) — **STOP if dirty**, ask user to commit or stash - -## Input - -- User specifies a target batch number or commit hash -- If not specified, present the list of available batch checkpoints and ask - -## Workflow - -### Step 1: Identify Checkpoints - -1. Read all `batch_*_report.md` files from IMPL_DIR -2. Extract: batch number, date, tasks included, commit hash, code review verdict -3. Present batch list to user - -**BLOCKING**: User must confirm which batch to roll back to. - -### Step 2: Revert Commits - -1. Determine which commits need to be reverted (all commits after the target batch) -2. For each commit in reverse chronological order: - - Run `git revert --no-edit` - - If merge conflicts occur: present conflicts and ask user for resolution -3. If any revert fails and cannot be resolved, abort the rollback sequence with `git revert --abort` and report - -### Step 3: Verify Integrity - -1. Run the full test suite -2. If tests fail: report failures to user, ask how to proceed (fix or abort) -3. If tests pass: continue - -### Step 4: Update Jira - -1. Identify all tasks from reverted batches -2. Reset each task's Jira ticket status to "To Do" via Jira MCP - -### Step 5: Finalize - -1. 
Commit with message: `[ROLLBACK] Reverted to batch [N]: [task list]` -2. Write rollback report to `IMPL_DIR/rollback_report.md` - -## Output - -Write `_docs/03_implementation/rollback_report.md`: - -```markdown -# Rollback Report - -**Date**: [YYYY-MM-DD] -**Target**: Batch [N] (commit [hash]) -**Reverted Batches**: [list] - -## Reverted Tasks - -| Task | Batch | Status Before | Status After | -|------|-------|--------------|-------------| -| [JIRA-ID] | [batch #] | In Testing | To Do | - -## Test Results -- [pass/fail count] - -## Jira Updates -- [list of ticket transitions] - -## Notes -- [any conflicts, manual steps, or issues encountered] -``` - -## Escalation Rules - -| Situation | Action | -|-----------|--------| -| No batch reports exist | **STOP** — nothing to roll back | -| Uncommitted changes in working tree | **STOP** — ask user to commit or stash | -| Merge conflicts during revert | **ASK user** for resolution | -| Tests fail after rollback | **ASK user** — fix or abort | -| Rollback fails midway | Abort with `git revert --abort`, report to user | - -## Methodology Quick Reference - -``` -┌────────────────────────────────────────────────────────────────┐ -│ Rollback (5-Step Method) │ -├────────────────────────────────────────────────────────────────┤ -│ PREREQ: batch reports exist, clean working tree │ -│ │ -│ 1. Identify Checkpoints → present batch list │ -│ [BLOCKING: user confirms target batch] │ -│ 2. Revert Commits → git revert per commit │ -│ 3. Verify Integrity → run full test suite │ -│ 4. Update Jira → reset statuses to "To Do" │ -│ 5. 
Finalize → commit + rollback_report.md │ -├────────────────────────────────────────────────────────────────┤ -│ Principles: Preserve history · Verify after revert │ -│ Atomic rollback · Ask don't assume │ -└────────────────────────────────────────────────────────────────┘ -``` diff --git a/.cursor/skills/security/SKILL.md b/.cursor/skills/security/SKILL.md index 5be5701..1e35084 100644 --- a/.cursor/skills/security/SKILL.md +++ b/.cursor/skills/security/SKILL.md @@ -1,300 +1,347 @@ --- -name: security-testing -description: "Test for security vulnerabilities using OWASP principles. Use when conducting security audits, testing auth, or implementing security practices." -category: specialized-testing -priority: critical -tokenEstimate: 1200 -agents: [qe-security-scanner, qe-api-contract-validator, qe-quality-analyzer] -implementation_status: optimized -optimization_version: 1.0 -last_optimized: 2025-12-02 -dependencies: [] -quick_reference_card: true -tags: [security, owasp, sast, dast, vulnerabilities, auth, injection] -trust_tier: 3 -validation: - schema_path: schemas/output.json - validator_path: scripts/validate-config.json - eval_path: evals/security-testing.yaml +name: security +description: | + OWASP-based security audit skill. Analyzes codebase for vulnerabilities across dependency scanning, + static analysis, OWASP Top 10 review, and secrets detection. Produces a structured security report + with severity-ranked findings and remediation guidance. + Can be invoked standalone or as part of the autopilot flow (optional step before deploy). + Trigger phrases: + - "security audit", "security scan", "OWASP review" + - "vulnerability scan", "security check" + - "check for vulnerabilities", "pentest" +category: review +tags: [security, owasp, sast, vulnerabilities, auth, injection, secrets] +disable-model-invocation: true --- -# Security Testing +# Security Audit - -When testing security or conducting audits: -1. TEST OWASP Top 10 vulnerabilities systematically -2. 
VALIDATE authentication and authorization on every endpoint -3. SCAN dependencies for known vulnerabilities (npm audit) -4. CHECK for injection attacks (SQL, XSS, command) -5. VERIFY secrets aren't exposed in code/logs +Analyze the codebase for security vulnerabilities using OWASP principles. Produces a structured report with severity-ranked findings, remediation suggestions, and a security checklist verdict. -**Quick Security Checks:** -- Access control → Test horizontal/vertical privilege escalation -- Crypto → Verify password hashing, HTTPS, no sensitive data exposed -- Injection → Test SQL injection, XSS, command injection -- Auth → Test weak passwords, session fixation, MFA enforcement -- Config → Check error messages don't leak info +## Core Principles -**Critical Success Factors:** -- Think like an attacker, build like a defender -- Security is built in, not added at the end -- Test continuously in CI/CD, not just before release - +- **OWASP-driven**: use the current OWASP Top 10 as the primary framework — verify the latest version at https://owasp.org/www-project-top-ten/ at audit start +- **Evidence-based**: every finding must reference a specific file, line, or configuration +- **Severity-ranked**: findings sorted Critical > High > Medium > Low +- **Actionable**: every finding includes a concrete remediation suggestion +- **Save immediately**: write artifacts to disk after each phase; never accumulate unsaved work +- **Complement, don't duplicate**: the `/code-review` skill does a lightweight security quick-scan; this skill goes deeper -## Quick Reference Card +## Context Resolution -### When to Use -- Security audits and penetration testing -- Testing authentication/authorization -- Validating input sanitization -- Reviewing security configuration +**Project mode** (default): +- PROBLEM_DIR: `_docs/00_problem/` +- SOLUTION_DIR: `_docs/01_solution/` +- DOCUMENT_DIR: `_docs/02_document/` +- SECURITY_DIR: `_docs/05_security/` -### OWASP Top 10 -Use the most 
recent **stable** version of the OWASP Top 10. At the start of each security audit, research the current version at https://owasp.org/www-project-top-ten/ and test against all listed categories. Do not rely on a hardcoded list — the OWASP Top 10 is updated periodically and the current version must be verified.
+**Standalone mode** (explicit target provided, e.g. `/security @src/api/`):
+- TARGET: the provided path
+- SECURITY_DIR: `_standalone/security/`
-### Tools
-| Type | Tool | Purpose |
-|------|------|---------|
-| SAST | SonarQube, Semgrep | Static code analysis |
-| DAST | OWASP ZAP, Burp | Dynamic scanning |
-| Deps | npm audit, Snyk | Dependency vulnerabilities |
-| Secrets | git-secrets, TruffleHog | Secret scanning |
+Announce the detected mode and resolved paths to the user before proceeding.
-### Agent Coordination
-- `qe-security-scanner`: Multi-layer SAST/DAST scanning
-- `qe-api-contract-validator`: API security testing
-- `qe-quality-analyzer`: Security code review
+## Prerequisite Checks
+
+1. Codebase must contain source code files — **STOP if empty**
+2. Create SECURITY_DIR if it does not exist
+3. If SECURITY_DIR already contains artifacts, ask user: **resume, overwrite, or skip?**
+4. If `_docs/00_problem/security_approach.md` exists, read it for project-specific security requirements
+
+## Progress Tracking
+
+At the start of execution, create a TodoWrite with all phases (1 through 5). Update status as each phase completes.
+
+## Workflow
+
+### Phase 1: Dependency Scan
+
+**Role**: Security analyst
+**Goal**: Identify known vulnerabilities in project dependencies
+**Constraints**: Scan only — no code changes
+
+1. Detect the project's package manager(s): `requirements.txt`, `package.json`, `Cargo.toml`, `*.csproj`, `go.mod`
+2. Run the appropriate audit tool:
+   - Python: `pip-audit` or `safety check`
+   - Node: `npm audit`
+   - Rust: `cargo audit`
+   - .NET: `dotnet list package --vulnerable`
+   - Go: `govulncheck`
+3.
If no audit tool is available, manually inspect dependency files for known CVEs using WebSearch +4. Record findings with CVE IDs, affected packages, severity, and recommended upgrade versions + +**Self-verification**: +- [ ] All package manifests scanned +- [ ] Each finding has a CVE ID or advisory reference +- [ ] Upgrade paths identified for Critical/High findings + +**Save action**: Write `SECURITY_DIR/dependency_scan.md` --- -## Key Vulnerability Tests +### Phase 2: Static Analysis (SAST) -### 1. Broken Access Control -```javascript -// Horizontal escalation - User A accessing User B's data -test('user cannot access another user\'s order', async () => { - const userAToken = await login('userA'); - const userBOrder = await createOrder('userB'); +**Role**: Security engineer +**Goal**: Identify code-level vulnerabilities through static analysis +**Constraints**: Analysis only — no code changes - const response = await api.get(`/orders/${userBOrder.id}`, { - headers: { Authorization: `Bearer ${userAToken}` } - }); - expect(response.status).toBe(403); -}); +Scan the codebase for these vulnerability patterns: -// Vertical escalation - Regular user accessing admin -test('regular user cannot access admin', async () => { - const userToken = await login('regularUser'); - expect((await api.get('/admin/users', { - headers: { Authorization: `Bearer ${userToken}` } - })).status).toBe(403); -}); -``` +**Injection**: +- SQL injection via string interpolation or concatenation +- Command injection (subprocess with shell=True, exec, eval, os.system) +- XSS via unsanitized user input in HTML output +- Template injection -### 2. 
Injection Attacks -```javascript -// SQL Injection -test('prevents SQL injection', async () => { - const malicious = "' OR '1'='1"; - const response = await api.get(`/products?search=${malicious}`); - expect(response.body.length).toBeLessThan(100); // Not all products -}); +**Authentication & Authorization**: +- Hardcoded credentials, API keys, passwords, tokens +- Missing authentication checks on endpoints +- Missing authorization checks (horizontal/vertical escalation paths) +- Weak password validation rules -// XSS -test('sanitizes HTML output', async () => { - const xss = ''; - await api.post('/comments', { text: xss }); +**Cryptographic Failures**: +- Plaintext password storage (no hashing) +- Weak hashing algorithms (MD5, SHA1 for passwords) +- Hardcoded encryption keys or salts +- Missing TLS/HTTPS enforcement - const html = (await api.get('/comments')).body; - expect(html).toContain('<script>'); - expect(html).not.toContain('` for Tailwind +- `