mirror of
https://github.com/azaion/detections.git
synced 2026-04-22 22:26:33 +00:00
Update .gitignore to include additional file types and directories for Python projects, enhancing environment management and build artifacts exclusion.
This commit is contained in:
@@ -0,0 +1,460 @@
|
||||
---
|
||||
name: autopilot
|
||||
description: |
|
||||
Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
|
||||
Detects current project state from _docs/ folder, resumes from where it left off, and flows through
|
||||
problem → research → plan → decompose → implement → deploy without manual skill invocation.
|
||||
Maximizes work per conversation by auto-transitioning between skills.
|
||||
Trigger phrases:
|
||||
- "autopilot", "auto", "start", "continue"
|
||||
- "what's next", "where am I", "project status"
|
||||
category: meta
|
||||
tags: [orchestrator, workflow, auto-chain, state-machine, meta-skill]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Autopilot Orchestrator
|
||||
|
||||
Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autopilot` once — the engine handles sequencing, transitions, and re-entry.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Auto-chain**: when a skill completes, immediately start the next one — no pause between skills
|
||||
- **Only pause at decision points**: BLOCKING gates inside sub-skills are the natural pause points; do not add artificial stops between steps
|
||||
- **State from disk**: all progress is persisted to `_docs/_autopilot_state.md` and cross-checked against `_docs/` folder structure
|
||||
- **Rich re-entry**: on every invocation, read the state file for full context before continuing
|
||||
- **Delegate, don't duplicate**: read and execute each sub-skill's SKILL.md; never inline their logic here
|
||||
- **Sound on pause**: follow `.cursor/rules/human-input-sound.mdc` — play a notification sound before every pause that requires human input
|
||||
- **Minimize interruptions**: only ask the user when the decision genuinely cannot be resolved automatically
|
||||
- **Jira MCP required**: steps that create Jira artifacts (Plan Step 6, Decompose) must have authenticated Jira MCP — never skip or substitute with local files
|
||||
|
||||
## Jira MCP Authentication
|
||||
|
||||
Several workflow steps create Jira artifacts (epics, tasks, links). The Jira MCP server must be authenticated **before** any step that writes to Jira.
|
||||
|
||||
### Steps That Require Jira MCP
|
||||
|
||||
| Step | Sub-Step | Jira Action |
|
||||
|------|----------|-------------|
|
||||
| 2 (Plan) | Step 6 — Jira Epics | Create epics for each component |
|
||||
| 3 (Decompose) | Step 1–3 — All tasks | Create Jira ticket per task, link to epic |
|
||||
|
||||
### Authentication Gate
|
||||
|
||||
Before entering **Step 2 (Plan)** or **Step 3 (Decompose)** for the first time, the autopilot must:
|
||||
|
||||
1. Call `mcp_auth` on the Jira MCP server
|
||||
2. If authentication succeeds → proceed normally
|
||||
3. If the user **skips** authentication → **STOP**. Present using Choose format:
|
||||
|
||||
```
|
||||
══════════════════════════════════════
|
||||
BLOCKER: Jira MCP authentication required
|
||||
══════════════════════════════════════
|
||||
A) Authenticate now (retry mcp_auth)
|
||||
B) Pause autopilot — resume after configuring Jira MCP
|
||||
══════════════════════════════════════
|
||||
Note: Jira integration is mandatory. Plan and Decompose
|
||||
steps create epics and tasks that drive implementation.
|
||||
Local-only workarounds are not acceptable.
|
||||
══════════════════════════════════════
|
||||
```
|
||||
|
||||
Do NOT offer a "skip Jira" or "save locally" option. The workflow depends on Jira IDs for task referencing, dependency tracking, and implementation batching.
|
||||
|
||||
### Re-Authentication
|
||||
|
||||
If Jira MCP was already authenticated in a previous invocation (verify by listing available Jira tools beyond `mcp_auth`), skip the auth gate.
|
||||
|
||||
## User Interaction Protocol
|
||||
|
||||
Every time the autopilot or a sub-skill needs a user decision, use the **Choose A / B / C / D** format. This applies to:
|
||||
|
||||
- State transitions where multiple valid next actions exist
|
||||
- Sub-skill BLOCKING gates that require user judgment
|
||||
- Any fork where the autopilot cannot confidently pick the right path
|
||||
- Trade-off decisions (tech choices, scope, risk acceptance)
|
||||
|
||||
### When to Ask (MUST ask)
|
||||
|
||||
- The next action is ambiguous (e.g., "another research round or proceed?")
|
||||
- The decision has irreversible consequences (e.g., architecture choices, skipping a step)
|
||||
- The user's intent or preference cannot be inferred from existing artifacts
|
||||
- A sub-skill's BLOCKING gate explicitly requires user confirmation
|
||||
- Multiple valid approaches exist with meaningfully different trade-offs
|
||||
|
||||
### When NOT to Ask (auto-transition)
|
||||
|
||||
- Only one logical next step exists (e.g., Problem complete → Research is the only option)
|
||||
- The transition is deterministic from the state (e.g., Plan complete → Decompose)
|
||||
- The decision is low-risk and reversible
|
||||
- Existing artifacts or prior decisions already imply the answer
|
||||
|
||||
### Choice Format
|
||||
|
||||
Always present decisions in this format:
|
||||
|
||||
```
|
||||
══════════════════════════════════════
|
||||
DECISION REQUIRED: [brief context]
|
||||
══════════════════════════════════════
|
||||
A) [Option A — short description]
|
||||
B) [Option B — short description]
|
||||
C) [Option C — short description, if applicable]
|
||||
D) [Option D — short description, if applicable]
|
||||
══════════════════════════════════════
|
||||
Recommendation: [A/B/C/D] — [one-line reason]
|
||||
══════════════════════════════════════
|
||||
```
|
||||
|
||||
Rules:
|
||||
1. Always provide 2–4 concrete options (never open-ended questions)
|
||||
2. Always include a recommendation with a brief justification
|
||||
3. Keep option descriptions to one line each
|
||||
4. If only 2 options make sense, use A/B only — do not pad with filler options
|
||||
5. Play the notification sound (per `human-input-sound.mdc`) before presenting the choice
|
||||
6. Record every user decision in the state file's `Key Decisions` section
|
||||
7. After the user picks, proceed immediately — no follow-up confirmation unless the choice was destructive
|
||||
|
||||
## State File: `_docs/_autopilot_state.md`
|
||||
|
||||
The autopilot persists its state to `_docs/_autopilot_state.md`. This file is the primary source of truth for re-entry. Folder scanning is the fallback when the state file doesn't exist.
|
||||
|
||||
### Format
|
||||
|
||||
```markdown
|
||||
# Autopilot State
|
||||
|
||||
## Current Step
|
||||
step: [0-5 or "done"]
|
||||
name: [Problem / Research / Plan / Decompose / Implement / Deploy / Done]
|
||||
status: [not_started / in_progress / completed]
|
||||
sub_step: [optional — sub-skill internal step number + name if interrupted mid-step]
|
||||
|
||||
## Step ↔ SubStep Reference
|
||||
| Step | Name | Sub-Skill | Internal SubSteps |
|
||||
|------|------------|------------------------|------------------------------------------|
|
||||
| 0 | Problem | problem/SKILL.md | Phase 1–4 |
|
||||
| 1 | Research | research/SKILL.md | Mode A: Phase 1–4 · Mode B: Step 0–8 |
|
||||
| 2 | Plan | plan/SKILL.md | Step 1–6 |
|
||||
| 3 | Decompose | decompose/SKILL.md | Step 1–4 |
|
||||
| 4 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
|
||||
| 5 | Deploy | deploy/SKILL.md | Step 1–7 |
|
||||
|
||||
When updating `Current Step`, always write it as:
|
||||
step: N ← autopilot step (0–5)
|
||||
sub_step: M ← sub-skill's own internal step/phase number + name
|
||||
Example:
|
||||
step: 2
|
||||
name: Plan
|
||||
status: in_progress
|
||||
sub_step: 4 — Architecture Review & Risk Assessment
|
||||
|
||||
## Completed Steps
|
||||
|
||||
| Step | Name | Completed | Key Outcome |
|
||||
|------|------|-----------|-------------|
|
||||
| 0 | Problem | [date] | [one-line summary] |
|
||||
| 1 | Research | [date] | [N drafts, final approach summary] |
|
||||
| 2 | Plan | [date] | [N components, architecture summary] |
|
||||
| 3 | Decompose | [date] | [N tasks, total complexity points] |
|
||||
| 4 | Implement | [date] | [N batches, pass/fail summary] |
|
||||
| 5 | Deploy | [date] | [artifacts produced] |
|
||||
|
||||
## Key Decisions
|
||||
- [decision 1: e.g. "Tech stack: Python + Rust for perf-critical, Postgres DB"]
|
||||
- [decision 2: e.g. "6 research rounds, final draft: solution_draft06.md"]
|
||||
- [decision N]
|
||||
|
||||
## Last Session
|
||||
date: [date]
|
||||
ended_at: Step [N] [Name] — SubStep [M] [sub-step name]
|
||||
reason: [completed step / session boundary / user paused / context limit]
|
||||
notes: [any context for next session, e.g. "User asked to revisit risk assessment"]
|
||||
|
||||
## Blockers
|
||||
- [blocker 1, if any]
|
||||
- [none]
|
||||
```
|
||||
|
||||
### State File Rules
|
||||
|
||||
1. **Create** the state file on the very first autopilot invocation (after state detection determines Step 0)
|
||||
2. **Update** the state file after every step completion, every session boundary, and every BLOCKING gate confirmation
|
||||
3. **Read** the state file as the first action on every invocation — before folder scanning
|
||||
4. **Cross-check**: after reading the state file, verify against actual `_docs/` folder contents. If they disagree (e.g., state file says Step 2 but `_docs/02_plans/architecture.md` already exists), trust the folder structure and update the state file to match
|
||||
5. **Never delete** the state file. It accumulates history across the entire project lifecycle
|
||||
|
||||
## Execution Entry Point
|
||||
|
||||
Every invocation of this skill follows the same sequence:
|
||||
|
||||
```
|
||||
1. Read _docs/_autopilot_state.md (if exists)
|
||||
2. Cross-check state file against _docs/ folder structure
|
||||
3. Resolve current step (state file + folder scan)
|
||||
4. Present Status Summary (from state file context)
|
||||
5. Enter Execution Loop:
|
||||
a. Read and execute the current skill's SKILL.md
|
||||
b. When skill completes → update state file
|
||||
c. Re-detect next step
|
||||
d. If next skill is ready → auto-chain (go to 5a with next skill)
|
||||
e. If session boundary reached → update state file with session notes → suggest new conversation
|
||||
f. If all steps done → update state file → report completion
|
||||
```
|
||||
|
||||
## State Detection
|
||||
|
||||
Read `_docs/_autopilot_state.md` first. If it exists and is consistent with the folder structure, use the `Current Step` from the state file. If the state file doesn't exist or is inconsistent, fall back to folder scanning.
|
||||
|
||||
### Folder Scan Rules (fallback)
|
||||
|
||||
Scan `_docs/` to determine the current workflow position. Check rules in order — first match wins.
|
||||
|
||||
### Detection Rules
|
||||
|
||||
**Step 0 — Problem Gathering**
|
||||
Condition: `_docs/00_problem/` does not exist, OR any of these are missing/empty:
|
||||
- `problem.md`
|
||||
- `restrictions.md`
|
||||
- `acceptance_criteria.md`
|
||||
- `input_data/` (must contain at least one file)
|
||||
|
||||
Action: Read and execute `.cursor/skills/problem/SKILL.md`
|
||||
|
||||
---
|
||||
|
||||
**Step 1 — Research (Initial)**
|
||||
Condition: `_docs/00_problem/` is complete AND `_docs/01_solution/` has no `solution_draft*.md` files
|
||||
|
||||
Action: Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode A)
|
||||
|
||||
---
|
||||
|
||||
**Step 1b — Research Decision**
|
||||
Condition: `_docs/01_solution/` contains `solution_draft*.md` files AND `_docs/01_solution/solution.md` does not exist AND `_docs/02_plans/architecture.md` does not exist
|
||||
|
||||
Action: Present the current research state to the user:
|
||||
- How many solution drafts exist
|
||||
- Whether tech_stack.md and security_analysis.md exist
|
||||
- One-line summary from the latest draft
|
||||
|
||||
Then present using the **Choose format**:
|
||||
|
||||
```
|
||||
══════════════════════════════════════
|
||||
DECISION REQUIRED: Research complete — next action?
|
||||
══════════════════════════════════════
|
||||
A) Run another research round (Mode B assessment)
|
||||
B) Proceed to planning with current draft
|
||||
══════════════════════════════════════
|
||||
Recommendation: [A or B] — [reason based on draft quality]
|
||||
══════════════════════════════════════
|
||||
```
|
||||
|
||||
- If user picks A → Read and execute `.cursor/skills/research/SKILL.md` (will auto-detect Mode B)
|
||||
- If user picks B → auto-chain to Step 2 (Plan)
|
||||
|
||||
---
|
||||
|
||||
**Step 2 — Plan**
|
||||
Condition: `_docs/01_solution/` has `solution_draft*.md` files AND `_docs/02_plans/architecture.md` does not exist
|
||||
|
||||
Action:
|
||||
1. The plan skill's Prereq 2 will rename the latest draft to `solution.md` — this is handled by the plan skill itself
|
||||
2. Read and execute `.cursor/skills/plan/SKILL.md`
|
||||
|
||||
If `_docs/02_plans/` exists but is incomplete (has some artifacts but no `FINAL_report.md`), the plan skill's built-in resumability handles it.
|
||||
|
||||
---
|
||||
|
||||
**Step 3 — Decompose**
|
||||
Condition: `_docs/02_plans/` contains `architecture.md` AND `_docs/02_plans/components/` has at least one component AND `_docs/02_tasks/` does not exist or has no task files (excluding `_dependencies_table.md`)
|
||||
|
||||
Action: Read and execute `.cursor/skills/decompose/SKILL.md`
|
||||
|
||||
If `_docs/02_tasks/` has some task files already, the decompose skill's resumability handles it.
|
||||
|
||||
---
|
||||
|
||||
**Step 4 — Implement**
|
||||
Condition: `_docs/02_tasks/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/FINAL_implementation_report.md` does not exist
|
||||
|
||||
Action: Read and execute `.cursor/skills/implement/SKILL.md`
|
||||
|
||||
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
|
||||
|
||||
---
|
||||
|
||||
**Step 5 — Deploy**
|
||||
Condition: `_docs/03_implementation/FINAL_implementation_report.md` exists AND `_docs/04_deploy/` does not exist or is incomplete
|
||||
|
||||
Action: Read and execute `.cursor/skills/deploy/SKILL.md`
|
||||
|
||||
---
|
||||
|
||||
**Done**
|
||||
Condition: `_docs/04_deploy/` contains all expected artifacts (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md)
|
||||
|
||||
Action: Report project completion with summary.
|
||||
|
||||
## Status Summary
|
||||
|
||||
On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback).
|
||||
|
||||
Format:
|
||||
|
||||
```
|
||||
═══════════════════════════════════════════════════
|
||||
AUTOPILOT STATUS
|
||||
═══════════════════════════════════════════════════
|
||||
Step 0 Problem [DONE / IN PROGRESS / NOT STARTED]
|
||||
Step 1 Research [DONE (N drafts) / IN PROGRESS / NOT STARTED]
|
||||
Step 2 Plan [DONE / IN PROGRESS / NOT STARTED]
|
||||
Step 3 Decompose [DONE (N tasks) / IN PROGRESS / NOT STARTED]
|
||||
Step 4 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED]
|
||||
Step 5 Deploy [DONE / IN PROGRESS / NOT STARTED]
|
||||
═══════════════════════════════════════════════════
|
||||
Current: Step N — Name
|
||||
SubStep: M — [sub-skill internal step name]
|
||||
Action: [what will happen next]
|
||||
═══════════════════════════════════════════════════
|
||||
```
|
||||
|
||||
For re-entry (state file exists), also include:
|
||||
- Key decisions from the state file's `Key Decisions` section
|
||||
- Last session context from the `Last Session` section
|
||||
- Any blockers from the `Blockers` section
|
||||
|
||||
## Auto-Chain Rules
|
||||
|
||||
After a skill completes, apply these rules:
|
||||
|
||||
| Completed Step | Next Action |
|
||||
|---------------|-------------|
|
||||
| Problem Gathering | Auto-chain → Research (Mode A) |
|
||||
| Research (any round) | Auto-chain → Research Decision (ask user: another round or proceed?) |
|
||||
| Research Decision → proceed | Auto-chain → Plan |
|
||||
| Plan | Auto-chain → Decompose |
|
||||
| Decompose | **Session boundary** — suggest new conversation before Implement |
|
||||
| Implement | Auto-chain → Deploy |
|
||||
| Deploy | Report completion |
|
||||
|
||||
### Session Boundary: Decompose → Implement
|
||||
|
||||
After decompose completes, **do not auto-chain to implement**. Instead:
|
||||
|
||||
1. Update state file: mark Decompose as completed, set current step to 4 (Implement) with status `not_started`
|
||||
2. Write `Last Session` section: `reason: session boundary`, `notes: Decompose complete, implementation ready`
|
||||
3. Present a summary: number of tasks, estimated batches, total complexity points
|
||||
4. Use Choose format:
|
||||
|
||||
```
|
||||
══════════════════════════════════════
|
||||
DECISION REQUIRED: Decompose complete — start implementation?
|
||||
══════════════════════════════════════
|
||||
A) Start a new conversation for implementation (recommended for context freshness)
|
||||
B) Continue implementation in this conversation
|
||||
══════════════════════════════════════
|
||||
Recommendation: A — implementation is the longest phase, fresh context helps
|
||||
══════════════════════════════════════
|
||||
```
|
||||
|
||||
This is the only hard session boundary. All other transitions auto-chain.
|
||||
|
||||
## Skill Delegation
|
||||
|
||||
For each step, the delegation pattern is:
|
||||
|
||||
1. Update state file: set `step` to the autopilot step number (0–5), status to `in_progress`, set `sub_step` to the sub-skill's current internal step/phase number and name
|
||||
2. Announce: "Starting [Skill Name]..."
|
||||
3. Read the skill file: `.cursor/skills/[name]/SKILL.md`
|
||||
4. Execute the skill's workflow exactly as written, including:
|
||||
- All BLOCKING gates (present to user, wait for confirmation)
|
||||
- All self-verification checklists
|
||||
- All save actions
|
||||
- All escalation rules
|
||||
- Update `sub_step` in the state file each time the sub-skill advances to a new internal step/phase
|
||||
5. When the skill's workflow is fully complete:
|
||||
- Update state file: mark step as `completed`, record date, write one-line key outcome
|
||||
- Add any key decisions made during this step to the `Key Decisions` section
|
||||
- Return to the auto-chain rules
|
||||
|
||||
Do NOT modify, skip, or abbreviate any part of the sub-skill's workflow. The autopilot is a sequencer, not an optimizer.
|
||||
|
||||
## Re-Entry Protocol
|
||||
|
||||
When the user invokes `/autopilot` and work already exists:
|
||||
|
||||
1. Read `_docs/_autopilot_state.md`
|
||||
2. Cross-check against `_docs/` folder structure
|
||||
3. Present Status Summary with context from state file (key decisions, last session, blockers)
|
||||
4. If the detected step has a sub-skill with built-in resumability (plan, decompose, implement, deploy all do), the sub-skill handles mid-step recovery
|
||||
5. Continue execution from detected state
|
||||
|
||||
## Error Handling
|
||||
|
||||
All error situations that require user input MUST use the **Choose A / B / C / D** format.
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| State detection is ambiguous (artifacts suggest two different steps) | Present findings and use Choose format with the candidate steps as options |
|
||||
| Sub-skill fails or hits an unrecoverable blocker | Use Choose format: A) retry, B) skip with warning, C) abort and fix manually |
|
||||
| User wants to skip a step | Use Choose format: A) skip (with dependency warning), B) execute the step |
|
||||
| User wants to go back to a previous step | Use Choose format: A) re-run (with overwrite warning), B) stay on current step |
|
||||
| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution |
|
||||
|
||||
## Trigger Conditions
|
||||
|
||||
This skill activates when the user wants to:
|
||||
- Start a new project from scratch
|
||||
- Continue an in-progress project
|
||||
- Check project status
|
||||
- Let the AI guide them through the full workflow
|
||||
|
||||
**Keywords**: "autopilot", "auto", "start", "continue", "what's next", "where am I", "project status"
|
||||
|
||||
**Differentiation**:
|
||||
- User wants only research → use `/research` directly
|
||||
- User wants only planning → use `/plan` directly
|
||||
- User wants the full guided workflow → use `/autopilot`
|
||||
|
||||
## Methodology Quick Reference
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Autopilot (Auto-Chain Orchestrator) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ EVERY INVOCATION: │
|
||||
│ 1. State Detection (scan _docs/) │
|
||||
│ 2. Status Summary (show progress) │
|
||||
│ 3. Execute current skill │
|
||||
│ 4. Auto-chain to next skill (loop) │
|
||||
│ │
|
||||
│ WORKFLOW: │
|
||||
│ Step 0 Problem → .cursor/skills/problem/SKILL.md │
|
||||
│ ↓ auto-chain │
|
||||
│ Step 1 Research → .cursor/skills/research/SKILL.md │
|
||||
│ ↓ auto-chain (ask: another round?) │
|
||||
│ Step 2 Plan → .cursor/skills/plan/SKILL.md │
|
||||
│ ↓ auto-chain │
|
||||
│ Step 3 Decompose → .cursor/skills/decompose/SKILL.md │
|
||||
│ ↓ SESSION BOUNDARY (suggest new conversation) │
|
||||
│ Step 4 Implement → .cursor/skills/implement/SKILL.md │
|
||||
│ ↓ auto-chain │
|
||||
│ Step 5 Deploy → .cursor/skills/deploy/SKILL.md │
|
||||
│ ↓ │
|
||||
│ DONE │
|
||||
│ │
|
||||
│ STATE FILE: _docs/_autopilot_state.md │
|
||||
│ FALLBACK: _docs/ folder structure scan │
|
||||
│ PAUSE POINTS: sub-skill BLOCKING gates only │
|
||||
│ SESSION BREAK: after Decompose (before Implement) │
|
||||
│ USER INPUT: Choose A/B/C/D format at genuine decisions only │
|
||||
│ AUTO-TRANSITION: when path is unambiguous, don't ask │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: Auto-chain · State to file · Rich re-entry │
|
||||
│ Delegate don't duplicate · Pause at decisions only │
|
||||
│ Minimize interruptions · Choose format for decisions │
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,154 @@
|
||||
---
|
||||
name: code-review
|
||||
description: |
|
||||
Multi-phase code review against task specs with structured findings output.
|
||||
6-phase workflow: context loading, spec compliance, code quality, security quick-scan, performance scan, cross-task consistency.
|
||||
Produces a structured report with severity-ranked findings and a PASS/FAIL/PASS_WITH_WARNINGS verdict.
|
||||
Invoked by /implement skill after each batch, or manually.
|
||||
Trigger phrases:
|
||||
- "code review", "review code", "review implementation"
|
||||
- "check code quality", "review against specs"
|
||||
category: review
|
||||
tags: [code-review, quality, security-scan, performance, SOLID]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Code Review
|
||||
|
||||
Multi-phase code review that verifies implementation against task specs, checks code quality, and produces structured findings.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Understand intent first**: read the task specs before reviewing code — know what it should do before judging how
|
||||
- **Structured output**: every finding has severity, category, location, description, and suggestion
|
||||
- **Deduplicate**: same issue at the same location is reported once using `{file}:{line}:{title}` as key
|
||||
- **Severity-ranked**: findings sorted Critical > High > Medium > Low
|
||||
- **Verdict-driven**: clear PASS/FAIL/PASS_WITH_WARNINGS drives automation decisions
|
||||
|
||||
## Input
|
||||
|
||||
- List of task spec files that were just implemented (paths to `[JIRA-ID]_[short_name].md`)
|
||||
- Changed files (detected via `git diff` or provided by the `/implement` skill)
|
||||
- Project context: `_docs/00_problem/restrictions.md`, `_docs/01_solution/solution.md`
|
||||
|
||||
## Phase 1: Context Loading
|
||||
|
||||
Before reviewing code, build understanding of intent:
|
||||
|
||||
1. Read each task spec — acceptance criteria, scope, constraints, dependencies
|
||||
2. Read project restrictions and solution overview
|
||||
3. Map which changed files correspond to which task specs
|
||||
4. Understand what the code is supposed to do before judging how it does it
|
||||
|
||||
## Phase 2: Spec Compliance Review
|
||||
|
||||
For each task, verify implementation satisfies every acceptance criterion:
|
||||
|
||||
- Walk through each AC (Given/When/Then) and trace it in the code
|
||||
- Check that unit tests cover each AC
|
||||
- Check that integration tests exist where specified in the task spec
|
||||
- Flag any AC that is not demonstrably satisfied as a **Spec-Gap** finding (severity: High)
|
||||
- Flag any scope creep (implementation beyond what the spec asked for) as a **Scope** finding (severity: Low)
|
||||
|
||||
## Phase 3: Code Quality Review
|
||||
|
||||
Check implemented code against quality standards:
|
||||
|
||||
- **SOLID principles** — single responsibility, open/closed, Liskov, interface segregation, dependency inversion
|
||||
- **Error handling** — consistent strategy, no bare catch/except, meaningful error messages
|
||||
- **Naming** — clear intent, follows project conventions
|
||||
- **Complexity** — functions longer than 50 lines or cyclomatic complexity > 10
|
||||
- **DRY** — duplicated logic across files
|
||||
- **Test quality** — tests assert meaningful behavior, not just "no error thrown"
|
||||
- **Dead code** — unused imports, unreachable branches
|
||||
|
||||
## Phase 4: Security Quick-Scan
|
||||
|
||||
Lightweight security checks (defer deep analysis to the `/security` skill):
|
||||
|
||||
- SQL injection via string interpolation
|
||||
- Command injection (subprocess with shell=True, exec, eval)
|
||||
- Hardcoded secrets, API keys, passwords
|
||||
- Missing input validation on external inputs
|
||||
- Sensitive data in logs or error messages
|
||||
- Insecure deserialization
|
||||
|
||||
## Phase 5: Performance Scan
|
||||
|
||||
Check for common performance anti-patterns:
|
||||
|
||||
- O(n^2) or worse algorithms where O(n) is possible
|
||||
- N+1 query patterns
|
||||
- Unbounded data fetching (missing pagination/limits)
|
||||
- Blocking I/O in async contexts
|
||||
- Unnecessary memory copies or allocations in hot paths
|
||||
|
||||
## Phase 6: Cross-Task Consistency
|
||||
|
||||
When multiple tasks were implemented in the same batch:
|
||||
|
||||
- Interfaces between tasks are compatible (method signatures, DTOs match)
|
||||
- No conflicting patterns (e.g., one task uses repository pattern, another does raw SQL)
|
||||
- Shared code is not duplicated across task implementations
|
||||
- Dependencies declared in task specs are properly wired
|
||||
|
||||
## Output Format
|
||||
|
||||
Produce a structured report with findings deduplicated and sorted by severity:
|
||||
|
||||
```markdown
|
||||
# Code Review Report
|
||||
|
||||
**Batch**: [task list]
|
||||
**Date**: [YYYY-MM-DD]
|
||||
**Verdict**: PASS | PASS_WITH_WARNINGS | FAIL
|
||||
|
||||
## Findings
|
||||
|
||||
| # | Severity | Category | File:Line | Title |
|
||||
|---|----------|----------|-----------|-------|
|
||||
| 1 | Critical | Security | src/api/auth.py:42 | SQL injection via f-string |
|
||||
| 2 | High | Spec-Gap | src/service/orders.py | AC-3 not satisfied |
|
||||
|
||||
### Finding Details
|
||||
|
||||
**F1: SQL injection via f-string** (Critical / Security)
|
||||
- Location: `src/api/auth.py:42`
|
||||
- Description: User input interpolated directly into SQL query
|
||||
- Suggestion: Use parameterized query via bind parameters
|
||||
- Task: 04_auth_service
|
||||
|
||||
**F2: AC-3 not satisfied** (High / Spec-Gap)
|
||||
- Location: `src/service/orders.py`
|
||||
- Description: AC-3 requires order total recalculation on item removal, but no such logic exists
|
||||
- Suggestion: Add recalculation in remove_item() method
|
||||
- Task: 07_order_processing
|
||||
```
|
||||
|
||||
## Severity Definitions
|
||||
|
||||
| Severity | Meaning | Blocks? |
|
||||
|----------|---------|---------|
|
||||
| Critical | Security vulnerability, data loss, crash | Yes — verdict FAIL |
|
||||
| High | Spec gap, logic bug, broken test | Yes — verdict FAIL |
|
||||
| Medium | Performance issue, maintainability concern, missing validation | No — verdict PASS_WITH_WARNINGS |
|
||||
| Low | Style, minor improvement, scope creep | No — verdict PASS_WITH_WARNINGS |
|
||||
|
||||
## Category Values
|
||||
|
||||
Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope
|
||||
|
||||
## Verdict Logic
|
||||
|
||||
- **FAIL**: any Critical or High finding exists
|
||||
- **PASS_WITH_WARNINGS**: only Medium or Low findings
|
||||
- **PASS**: no findings
|
||||
|
||||
## Integration with /implement
|
||||
|
||||
The `/implement` skill invokes this skill after each batch completes:
|
||||
|
||||
1. Collects changed files from all implementer agents in the batch
|
||||
2. Passes task spec paths + changed files to this skill
|
||||
3. If verdict is FAIL — presents findings to user (BLOCKING), user fixes or confirms
|
||||
4. If verdict is PASS or PASS_WITH_WARNINGS — proceeds automatically (findings shown as info)
|
||||
@@ -0,0 +1,295 @@
|
||||
---
|
||||
name: decompose
|
||||
description: |
|
||||
Decompose planned components into atomic implementable tasks with bootstrap structure plan.
|
||||
4-step workflow: bootstrap structure plan, component task decomposition, integration test task decomposition, and cross-task verification.
|
||||
Supports full decomposition (_docs/ structure) and single component mode.
|
||||
Trigger phrases:
|
||||
- "decompose", "decompose features", "feature decomposition"
|
||||
- "task decomposition", "break down components"
|
||||
- "prepare for implementation"
|
||||
category: build
|
||||
tags: [decomposition, tasks, dependencies, jira, implementation-prep]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Task Decomposition
|
||||
|
||||
Decompose planned components into atomic, implementable task specs with a bootstrap structure plan through a systematic workflow. All tasks are named with their Jira ticket ID prefix in a flat directory.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Atomic tasks**: each task does one thing; if it exceeds 5 complexity points, split it
|
||||
- **Behavioral specs, not implementation plans**: describe what the system should do, not how to build it
|
||||
- **Flat structure**: all tasks are Jira-ID-prefixed files in TASKS_DIR — no component subdirectories
|
||||
- **Save immediately**: write artifacts to disk after each task; never accumulate unsaved work
|
||||
- **Jira inline**: create Jira ticket immediately after writing each task file
|
||||
- **Ask, don't assume**: when requirements are ambiguous, ask the user before proceeding
|
||||
- **Plan, don't code**: this workflow produces documents and Jira tasks, never implementation code
|
||||
|
||||
## Context Resolution
|
||||
|
||||
Determine the operating mode based on invocation before any other logic runs.
|
||||
|
||||
**Default** (no explicit input file provided):
|
||||
- PLANS_DIR: `_docs/02_plans/`
|
||||
- TASKS_DIR: `_docs/02_tasks/`
|
||||
- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, PLANS_DIR
|
||||
- Runs Step 1 (bootstrap) + Step 2 (all components) + Step 3 (integration tests) + Step 4 (cross-verification)
|
||||
|
||||
**Single component mode** (provided file is within `_docs/02_plans/` and inside a `components/` subdirectory):
|
||||
- PLANS_DIR: `_docs/02_plans/`
|
||||
- TASKS_DIR: `_docs/02_tasks/`
|
||||
- Derive component number and component name from the file path
|
||||
- Ask user for the parent Epic ID
|
||||
- Runs Step 2 (that component only, appending to existing task numbering)
|
||||
|
||||
Announce the detected mode and resolved paths to the user before proceeding.
|
||||
|
||||
## Input Specification
|
||||
|
||||
### Required Files
|
||||
|
||||
**Default:**
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `_docs/00_problem/problem.md` | Problem description and context |
|
||||
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
|
||||
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
|
||||
| `_docs/01_solution/solution.md` | Finalized solution |
|
||||
| `PLANS_DIR/architecture.md` | Architecture from plan skill |
|
||||
| `PLANS_DIR/system-flows.md` | System flows from plan skill |
|
||||
| `PLANS_DIR/components/[##]_[name]/description.md` | Component specs from plan skill |
|
||||
| `PLANS_DIR/integration_tests/` | Integration test specs from plan skill |
|
||||
|
||||
**Single component mode:**
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| The provided component `description.md` | Component spec to decompose |
|
||||
| Corresponding `tests.md` in the same directory (if available) | Test specs for context |
|
||||
|
||||
### Prerequisite Checks (BLOCKING)
|
||||
|
||||
**Default:**
|
||||
1. PLANS_DIR contains `architecture.md` and `components/` — **STOP if missing**
|
||||
2. Create TASKS_DIR if it does not exist
|
||||
3. If TASKS_DIR already contains task files, ask user: **resume from last checkpoint or start fresh?**
|
||||
|
||||
**Single component mode:**
|
||||
1. The provided component file exists and is non-empty — **STOP if missing**
|
||||
|
||||
## Artifact Management
|
||||
|
||||
### Directory Structure
|
||||
|
||||
```
|
||||
TASKS_DIR/
|
||||
├── [JIRA-ID]_initial_structure.md
|
||||
├── [JIRA-ID]_[short_name].md
|
||||
├── [JIRA-ID]_[short_name].md
|
||||
├── ...
|
||||
└── _dependencies_table.md
|
||||
```
|
||||
|
||||
**Naming convention**: Each task file is initially saved with a temporary numeric prefix (`[##]_[short_name].md`). After creating the Jira ticket, rename the file to use the Jira ticket ID as prefix (`[JIRA-ID]_[short_name].md`). For example: `01_initial_structure.md` → `AZ-42_initial_structure.md`.
|
||||
|
||||
### Save Timing
|
||||
|
||||
| Step | Save immediately after | Filename |
|
||||
|------|------------------------|----------|
|
||||
| Step 1 | Bootstrap structure plan complete + Jira ticket created + file renamed | `[JIRA-ID]_initial_structure.md` |
|
||||
| Step 2 | Each component task decomposed + Jira ticket created + file renamed | `[JIRA-ID]_[short_name].md` |
|
||||
| Step 3 | Each integration test task decomposed + Jira ticket created + file renamed | `[JIRA-ID]_[short_name].md` |
|
||||
| Step 4 | Cross-task verification complete | `_dependencies_table.md` |
|
||||
|
||||
### Resumability
|
||||
|
||||
If TASKS_DIR already contains task files:
|
||||
|
||||
1. List existing `*_*.md` files (excluding `_dependencies_table.md`) and count them
|
||||
2. Resume numbering from the next number (for temporary numeric prefix before Jira rename)
|
||||
3. Inform the user which tasks already exist and are being skipped
|
||||
|
||||
## Progress Tracking
|
||||
|
||||
At the start of execution, create a TodoWrite with all applicable steps. Update status as each step/component completes.
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: Bootstrap Structure Plan (default mode only)
|
||||
|
||||
**Role**: Professional software architect
|
||||
**Goal**: Produce `01_initial_structure.md` — the first task describing the project skeleton
|
||||
**Constraints**: This is a plan document, not code. The `/implement` skill executes it.
|
||||
|
||||
1. Read architecture.md, all component specs, system-flows.md, data_model.md, and `deployment/` from PLANS_DIR
|
||||
2. Read problem, solution, and restrictions from `_docs/00_problem/` and `_docs/01_solution/`
|
||||
3. Research best implementation patterns for the identified tech stack
|
||||
4. Document the structure plan using `templates/initial-structure-task.md`
|
||||
|
||||
The bootstrap structure plan must include:
|
||||
- Project folder layout with all component directories
|
||||
- Shared models, interfaces, and DTOs
|
||||
- Dockerfile per component (multi-stage, non-root, health checks, pinned base images)
|
||||
- `docker-compose.yml` for local development (all components + database + dependencies)
|
||||
- `docker-compose.test.yml` for integration test environment (black-box test runner)
|
||||
- `.dockerignore`
|
||||
- CI/CD pipeline file (`.github/workflows/ci.yml` or `azure-pipelines.yml`) with stages from `deployment/ci_cd_pipeline.md`
|
||||
- Database migration setup and initial seed data scripts
|
||||
- Observability configuration: structured logging setup, health check endpoints (`/health/live`, `/health/ready`), metrics endpoint (`/metrics`)
|
||||
- Environment variable documentation (`.env.example`)
|
||||
- Test structure with unit and integration test locations
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All components have corresponding folders in the layout
|
||||
- [ ] All inter-component interfaces have DTOs defined
|
||||
- [ ] Dockerfile defined for each component
|
||||
- [ ] `docker-compose.yml` covers all components and dependencies
|
||||
- [ ] `docker-compose.test.yml` enables black-box integration testing
|
||||
- [ ] CI/CD pipeline file defined with lint, test, security, build, deploy stages
|
||||
- [ ] Database migration setup included
|
||||
- [ ] Health check endpoints specified for each service
|
||||
- [ ] Structured logging configuration included
|
||||
- [ ] `.env.example` with all required environment variables
|
||||
- [ ] Environment strategy covers dev, staging, production
|
||||
- [ ] Test structure includes unit and integration test locations
|
||||
|
||||
**Save action**: Write `01_initial_structure.md` (temporary numeric name)
|
||||
|
||||
**Jira action**: Create a Jira ticket for this task under the "Bootstrap & Initial Structure" epic. Write the Jira ticket ID and Epic ID back into the task header.
|
||||
|
||||
**Rename action**: Rename the file from `01_initial_structure.md` to `[JIRA-ID]_initial_structure.md` (e.g., `AZ-42_initial_structure.md`). Update the **Task** field inside the file to match the new filename.
|
||||
|
||||
**BLOCKING**: Present structure plan summary to user. Do NOT proceed until user confirms.
|
||||
|
||||
---
|
||||
|
||||
### Step 2: Task Decomposition (all modes)
|
||||
|
||||
**Role**: Professional software architect
|
||||
**Goal**: Decompose each component into atomic, implementable task specs — numbered sequentially starting from 02
|
||||
**Constraints**: Behavioral specs only — describe what, not how. No implementation code.
|
||||
|
||||
**Numbering**: Tasks are numbered sequentially across all components in dependency order. Start from 02 (01 is initial_structure). In single component mode, start from the next available number in TASKS_DIR.
|
||||
|
||||
**Component ordering**: Process components in dependency order — foundational components first (shared models, database), then components that depend on them.
|
||||
|
||||
For each component (or the single provided component):
|
||||
|
||||
1. Read the component's `description.md` and `tests.md` (if available)
|
||||
2. Decompose into atomic tasks; create only 1 task if the component is simple or atomic
|
||||
3. Split into multiple tasks only when it is necessary and would be easier to implement
|
||||
4. Do not create tasks for other components — only tasks for the current component
|
||||
5. Each task should be atomic, containing 0 APIs or a list of semantically connected APIs
|
||||
6. Write each task spec using `templates/task.md`
|
||||
7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
|
||||
8. Note task dependencies (referencing Jira IDs of already-created dependency tasks, e.g., `AZ-42_initial_structure`)
|
||||
9. **Immediately after writing each task file**: create a Jira ticket, link it to the component's epic, write the Jira ticket ID and Epic ID back into the task header, then rename the file from `[##]_[short_name].md` to `[JIRA-ID]_[short_name].md`.
|
||||
|
||||
**Self-verification** (per component):
|
||||
- [ ] Every task is atomic (single concern)
|
||||
- [ ] No task exceeds 5 complexity points
|
||||
- [ ] Task dependencies reference correct Jira IDs
|
||||
- [ ] Tasks cover all interfaces defined in the component spec
|
||||
- [ ] No tasks duplicate work from other components
|
||||
- [ ] Every task has a Jira ticket linked to the correct epic
|
||||
|
||||
**Save action**: Write each `[##]_[short_name].md` (temporary numeric name), create Jira ticket inline, then rename the file to `[JIRA-ID]_[short_name].md`. Update the **Task** field inside the file to match the new filename. Update **Dependencies** references in the file to use Jira IDs of the dependency tasks.
|
||||
|
||||
---
|
||||
|
||||
### Step 3: Integration Test Task Decomposition (default mode only)
|
||||
|
||||
**Role**: Professional Quality Assurance Engineer
|
||||
**Goal**: Decompose integration test specs into atomic, implementable task specs
|
||||
**Constraints**: Behavioral specs only — describe what, not how. No test code.
|
||||
|
||||
**Numbering**: Continue sequential numbering from where Step 2 left off.
|
||||
|
||||
1. Read all test specs from `PLANS_DIR/integration_tests/` (functional_tests.md, non_functional_tests.md)
|
||||
2. Group related test scenarios into atomic tasks (e.g., one task per test category or per component under test)
|
||||
3. Each task should reference the specific test scenarios it implements and the environment/test_data specs
|
||||
4. Dependencies: integration test tasks depend on the component implementation tasks they exercise
|
||||
5. Write each task spec using `templates/task.md`
|
||||
6. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
|
||||
7. Note task dependencies (referencing Jira IDs of already-created dependency tasks)
|
||||
8. **Immediately after writing each task file**: create a Jira ticket under the "Integration Tests" epic, write the Jira ticket ID and Epic ID back into the task header, then rename the file from `[##]_[short_name].md` to `[JIRA-ID]_[short_name].md`.
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Every functional test scenario from `integration_tests/functional_tests.md` is covered by a task
|
||||
- [ ] Every non-functional test scenario from `integration_tests/non_functional_tests.md` is covered by a task
|
||||
- [ ] No task exceeds 5 complexity points
|
||||
- [ ] Dependencies correctly reference the component tasks being tested
|
||||
- [ ] Every task has a Jira ticket linked to the "Integration Tests" epic
|
||||
|
||||
**Save action**: Write each `[##]_[short_name].md` (temporary numeric name), create Jira ticket inline, then rename to `[JIRA-ID]_[short_name].md`.
|
||||
|
||||
---
|
||||
|
||||
### Step 4: Cross-Task Verification (default mode only)
|
||||
|
||||
**Role**: Professional software architect and analyst
|
||||
**Goal**: Verify task consistency and produce `_dependencies_table.md`
|
||||
**Constraints**: Review step — fix gaps found, do not add new tasks
|
||||
|
||||
1. Verify task dependencies across all tasks are consistent
|
||||
2. Check no gaps: every interface in architecture.md has tasks covering it
|
||||
3. Check no overlaps: tasks don't duplicate work across components
|
||||
4. Check no circular dependencies in the task graph
|
||||
5. Produce `_dependencies_table.md` using `templates/dependencies-table.md`
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Every architecture interface is covered by at least one task
|
||||
- [ ] No circular dependencies in the task graph
|
||||
- [ ] Cross-component dependencies are explicitly noted in affected task specs
|
||||
- [ ] `_dependencies_table.md` contains every task with correct dependencies
|
||||
|
||||
**Save action**: Write `_dependencies_table.md`
|
||||
|
||||
**BLOCKING**: Present dependency summary to user. Do NOT proceed until user confirms.
|
||||
|
||||
---
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- **Coding during decomposition**: this workflow produces specs, never code
|
||||
- **Over-splitting**: don't create many tasks if the component is simple — 1 task is fine
|
||||
- **Tasks exceeding 5 points**: split them; no task should be too complex for a single implementer
|
||||
- **Cross-component tasks**: each task belongs to exactly one component
|
||||
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
|
||||
- **Creating git branches**: branch creation is an implementation concern, not a decomposition one
|
||||
- **Creating component subdirectories**: all tasks go flat in TASKS_DIR
|
||||
- **Forgetting Jira**: every task must have a Jira ticket created inline — do not defer to a separate step
|
||||
- **Forgetting to rename**: after Jira ticket creation, always rename the file from numeric prefix to Jira ID prefix
|
||||
|
||||
## Escalation Rules
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| Ambiguous component boundaries | ASK user |
|
||||
| Task complexity exceeds 5 points after splitting | ASK user |
|
||||
| Missing component specs in PLANS_DIR | ASK user |
|
||||
| Cross-component dependency conflict | ASK user |
|
||||
| Jira epic not found for a component | ASK user for Epic ID |
|
||||
| Task naming | PROCEED, confirm at next BLOCKING gate |
|
||||
|
||||
## Methodology Quick Reference
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Task Decomposition (4-Step Method) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ CONTEXT: Resolve mode (default / single component) │
|
||||
│ 1. Bootstrap Structure → [JIRA-ID]_initial_structure.md │
|
||||
│ [BLOCKING: user confirms structure] │
|
||||
│ 2. Component Tasks → [JIRA-ID]_[short_name].md each │
|
||||
│ 3. Integration Tests → [JIRA-ID]_[short_name].md each │
|
||||
│ 4. Cross-Verification → _dependencies_table.md │
|
||||
│ [BLOCKING: user confirms dependencies] │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: Atomic tasks · Behavioral specs · Flat structure │
|
||||
│ Jira inline · Rename to Jira ID · Save now · Ask don't assume│
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,31 @@
|
||||
# Dependencies Table Template
|
||||
|
||||
Use this template after cross-task verification. Save as `TASKS_DIR/_dependencies_table.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# Dependencies Table
|
||||
|
||||
**Date**: [YYYY-MM-DD]
|
||||
**Total Tasks**: [N]
|
||||
**Total Complexity Points**: [N]
|
||||
|
||||
| Task | Name | Complexity | Dependencies | Epic |
|
||||
|------|------|-----------|-------------|------|
|
||||
| [JIRA-ID] | initial_structure | [points] | None | [EPIC-ID] |
|
||||
| [JIRA-ID] | [short_name] | [points] | [JIRA-ID] | [EPIC-ID] |
|
||||
| [JIRA-ID] | [short_name] | [points] | [JIRA-ID] | [EPIC-ID] |
|
||||
| [JIRA-ID] | [short_name] | [points] | [JIRA-ID], [JIRA-ID] | [EPIC-ID] |
|
||||
| ... | ... | ... | ... | ... |
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidelines
|
||||
|
||||
- Every task from TASKS_DIR must appear in this table
|
||||
- Dependencies column lists Jira IDs (e.g., "AZ-43, AZ-44") or "None"
|
||||
- No circular dependencies allowed
|
||||
- Tasks should be listed in recommended execution order
|
||||
- The `/implement` skill reads this table to compute parallel batches
|
||||
@@ -0,0 +1,135 @@
|
||||
# Initial Structure Task Template
|
||||
|
||||
Use this template for the bootstrap structure plan. Save as `TASKS_DIR/01_initial_structure.md` initially, then rename to `TASKS_DIR/[JIRA-ID]_initial_structure.md` after Jira ticket creation.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# Initial Project Structure
|
||||
|
||||
**Task**: [JIRA-ID]_initial_structure
|
||||
**Name**: Initial Structure
|
||||
**Description**: Scaffold the project skeleton — folders, shared models, interfaces, stubs, CI/CD, DB migrations, test structure
|
||||
**Complexity**: [3|5] points
|
||||
**Dependencies**: None
|
||||
**Component**: Bootstrap
|
||||
**Jira**: [TASK-ID]
|
||||
**Epic**: [EPIC-ID]
|
||||
|
||||
## Project Folder Layout
|
||||
|
||||
```
|
||||
project-root/
|
||||
├── [folder structure based on tech stack and components]
|
||||
└── ...
|
||||
```
|
||||
|
||||
### Layout Rationale
|
||||
|
||||
[Brief explanation of why this structure was chosen — language conventions, framework patterns, etc.]
|
||||
|
||||
## DTOs and Interfaces
|
||||
|
||||
### Shared DTOs
|
||||
|
||||
| DTO Name | Used By Components | Fields Summary |
|
||||
|----------|-------------------|---------------|
|
||||
| [name] | [component list] | [key fields] |
|
||||
|
||||
### Component Interfaces
|
||||
|
||||
| Component | Interface | Methods | Exposed To |
|
||||
|-----------|-----------|---------|-----------|
|
||||
| [name] | [InterfaceName] | [method list] | [consumers] |
|
||||
|
||||
## CI/CD Pipeline
|
||||
|
||||
| Stage | Purpose | Trigger |
|
||||
|-------|---------|---------|
|
||||
| Build | Compile/bundle the application | Every push |
|
||||
| Lint / Static Analysis | Code quality and style checks | Every push |
|
||||
| Unit Tests | Run unit test suite | Every push |
|
||||
| Integration Tests | Run integration test suite | Every push |
|
||||
| Security Scan | SAST / dependency check | Every push |
|
||||
| Deploy to Staging | Deploy to staging environment | Merge to staging branch |
|
||||
|
||||
### Pipeline Configuration Notes
|
||||
|
||||
[Framework-specific notes: CI tool, runners, caching, parallelism, etc.]
|
||||
|
||||
## Environment Strategy
|
||||
|
||||
| Environment | Purpose | Configuration Notes |
|
||||
|-------------|---------|-------------------|
|
||||
| Development | Local development | [local DB, mock services, debug flags] |
|
||||
| Staging | Pre-production testing | [staging DB, staging services, production-like config] |
|
||||
| Production | Live system | [production DB, real services, optimized config] |
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Dev | Staging | Production | Description |
|
||||
|----------|-----|---------|------------|-------------|
|
||||
| [VAR_NAME] | [value/source] | [value/source] | [value/source] | [purpose] |
|
||||
|
||||
## Database Migration Approach
|
||||
|
||||
**Migration tool**: [tool name]
|
||||
**Strategy**: [migration strategy — e.g., versioned scripts, ORM migrations]
|
||||
|
||||
### Initial Schema
|
||||
|
||||
[Key tables/collections that need to be created, referencing component data access patterns]
|
||||
|
||||
## Test Structure
|
||||
|
||||
```
|
||||
tests/
|
||||
├── unit/
|
||||
│ ├── [component_1]/
|
||||
│ ├── [component_2]/
|
||||
│ └── ...
|
||||
├── integration/
|
||||
│ ├── test_data/
|
||||
│ └── [test files]
|
||||
└── ...
|
||||
```
|
||||
|
||||
### Test Configuration Notes
|
||||
|
||||
[Test runner, fixtures, test data management, isolation strategy]
|
||||
|
||||
## Implementation Order
|
||||
|
||||
| Order | Component | Reason |
|
||||
|-------|-----------|--------|
|
||||
| 1 | [name] | [why first — foundational, no dependencies] |
|
||||
| 2 | [name] | [depends on #1] |
|
||||
| ... | ... | ... |
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: Project scaffolded**
|
||||
Given the structure plan above
|
||||
When the implementer executes this task
|
||||
Then all folders, stubs, and configuration files exist
|
||||
|
||||
**AC-2: Tests runnable**
|
||||
Given the scaffolded project
|
||||
When the test suite is executed
|
||||
Then all stub tests pass (even if they only assert true)
|
||||
|
||||
**AC-3: CI/CD configured**
|
||||
Given the scaffolded project
|
||||
When CI pipeline runs
|
||||
Then build, lint, and test stages complete successfully
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidance Notes
|
||||
|
||||
- This is a PLAN document, not code. The `/implement` skill executes it.
|
||||
- Focus on structure and organization decisions, not implementation details.
|
||||
- Reference component specs for interface and DTO details — don't repeat everything.
|
||||
- The folder layout should follow conventions of the identified tech stack.
|
||||
- Environment strategy should account for secrets management and configuration.
|
||||
@@ -0,0 +1,113 @@
|
||||
# Task Specification Template
|
||||
|
||||
Create a focused behavioral specification that describes **what** the system should do, not **how** it should be built.
|
||||
Save as `TASKS_DIR/[##]_[short_name].md` initially, then rename to `TASKS_DIR/[JIRA-ID]_[short_name].md` after Jira ticket creation.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [Feature Name]
|
||||
|
||||
**Task**: [JIRA-ID]_[short_name]
|
||||
**Name**: [short human name]
|
||||
**Description**: [one-line description of what this task delivers]
|
||||
**Complexity**: [1|2|3|5] points
|
||||
**Dependencies**: [AZ-43_shared_models, AZ-44_db_migrations] or "None"
|
||||
**Component**: [component name for context]
|
||||
**Jira**: [TASK-ID]
|
||||
**Epic**: [EPIC-ID]
|
||||
|
||||
## Problem
|
||||
|
||||
Clear, concise statement of the problem users are facing.
|
||||
|
||||
## Outcome
|
||||
|
||||
- Measurable or observable goal 1
|
||||
- Measurable or observable goal 2
|
||||
- ...
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
- What's in scope for this task
|
||||
|
||||
### Excluded
|
||||
- Explicitly what's NOT in scope
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: [Title]**
|
||||
Given [precondition]
|
||||
When [action]
|
||||
Then [expected result]
|
||||
|
||||
**AC-2: [Title]**
|
||||
Given [precondition]
|
||||
When [action]
|
||||
Then [expected result]
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Performance**
|
||||
- [requirement if relevant]
|
||||
|
||||
**Compatibility**
|
||||
- [requirement if relevant]
|
||||
|
||||
**Reliability**
|
||||
- [requirement if relevant]
|
||||
|
||||
## Unit Tests
|
||||
|
||||
| AC Ref | What to Test | Required Outcome |
|
||||
|--------|-------------|-----------------|
|
||||
| AC-1 | [test subject] | [expected result] |
|
||||
|
||||
## Integration Tests
|
||||
|
||||
| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|
||||
|--------|------------------------|-------------|-------------------|----------------|
|
||||
| AC-1 | [setup] | [test subject] | [expected behavior] | [NFR if any] |
|
||||
|
||||
## Constraints
|
||||
|
||||
- [Architectural pattern constraint if critical]
|
||||
- [Technical limitation]
|
||||
- [Integration requirement]
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1: [Title]**
|
||||
- *Risk*: [Description]
|
||||
- *Mitigation*: [Approach]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Complexity Points Guide
|
||||
|
||||
- 1 point: Trivial, self-contained, no dependencies
|
||||
- 2 points: Non-trivial, low complexity, minimal coordination
|
||||
- 3 points: Multi-step, moderate complexity, potential alignment needed
|
||||
- 5 points: Difficult, interconnected logic, medium-high risk
|
||||
- 8 points: Too complex — split into smaller tasks
|
||||
|
||||
## Output Guidelines
|
||||
|
||||
**DO:**
|
||||
- Focus on behavior and user experience
|
||||
- Use clear, simple language
|
||||
- Keep acceptance criteria testable (Gherkin format)
|
||||
- Include realistic scope boundaries
|
||||
- Write from the user's perspective
|
||||
- Include complexity estimation
|
||||
- Reference dependencies by Jira ID (e.g., AZ-43_shared_models)
|
||||
|
||||
**DON'T:**
|
||||
- Include implementation details (file paths, classes, methods)
|
||||
- Prescribe technical solutions or libraries
|
||||
- Add architectural diagrams or code examples
|
||||
- Specify exact API endpoints or data structures
|
||||
- Include step-by-step implementation instructions
|
||||
- Add "how to build" guidance
|
||||
@@ -0,0 +1,491 @@
|
||||
---
|
||||
name: deploy
|
||||
description: |
|
||||
Comprehensive deployment skill covering status check, env setup, containerization, CI/CD pipeline, environment strategy, observability, deployment procedures, and deployment scripts.
|
||||
7-step workflow: Status & env check, Docker containerization, CI/CD pipeline definition, environment strategy, observability planning, deployment procedures, deployment scripts.
|
||||
Uses _docs/04_deploy/ structure.
|
||||
Trigger phrases:
|
||||
- "deploy", "deployment", "deployment strategy"
|
||||
- "CI/CD", "pipeline", "containerize"
|
||||
- "observability", "monitoring", "logging"
|
||||
- "dockerize", "docker compose"
|
||||
category: ship
|
||||
tags: [deployment, docker, ci-cd, observability, monitoring, containerization, scripts]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Deployment Planning
|
||||
|
||||
Plan and document the full deployment lifecycle: check deployment status and environment requirements, containerize the application, define CI/CD pipelines, configure environments, set up observability, document deployment procedures, and generate deployment scripts.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Docker-first**: every component runs in a container; local dev, integration tests, and production all use Docker
|
||||
- **Infrastructure as code**: all deployment configuration is version-controlled
|
||||
- **Observability built-in**: logging, metrics, and tracing are part of the deployment plan, not afterthoughts
|
||||
- **Environment parity**: dev, staging, and production environments mirror each other as closely as possible
|
||||
- **Save immediately**: write artifacts to disk after each step; never accumulate unsaved work
|
||||
- **Ask, don't assume**: when infrastructure constraints or preferences are unclear, ask the user
|
||||
- **Plan, don't code**: this workflow produces deployment documents and specifications, not implementation code (except deployment scripts in Step 7)
|
||||
|
||||
## Context Resolution
|
||||
|
||||
Fixed paths:
|
||||
|
||||
- PLANS_DIR: `_docs/02_plans/`
|
||||
- DEPLOY_DIR: `_docs/04_deploy/`
|
||||
- REPORTS_DIR: `_docs/04_deploy/reports/`
|
||||
- SCRIPTS_DIR: `scripts/`
|
||||
- ARCHITECTURE: `_docs/02_plans/architecture.md`
|
||||
- COMPONENTS_DIR: `_docs/02_plans/components/`
|
||||
|
||||
Announce the resolved paths to the user before proceeding.
|
||||
|
||||
## Input Specification
|
||||
|
||||
### Required Files
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `_docs/00_problem/problem.md` | Problem description and context |
|
||||
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
|
||||
| `_docs/01_solution/solution.md` | Finalized solution |
|
||||
| `PLANS_DIR/architecture.md` | Architecture from plan skill |
|
||||
| `PLANS_DIR/components/` | Component specs |
|
||||
|
||||
### Prerequisite Checks (BLOCKING)
|
||||
|
||||
1. `architecture.md` exists — **STOP if missing**, run `/plan` first
|
||||
2. At least one component spec exists in `PLANS_DIR/components/` — **STOP if missing**
|
||||
3. Create DEPLOY_DIR, REPORTS_DIR, and SCRIPTS_DIR if they do not exist
|
||||
4. If DEPLOY_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?**
|
||||
|
||||
## Artifact Management
|
||||
|
||||
### Directory Structure
|
||||
|
||||
```
|
||||
DEPLOY_DIR/
|
||||
├── containerization.md
|
||||
├── ci_cd_pipeline.md
|
||||
├── environment_strategy.md
|
||||
├── observability.md
|
||||
├── deployment_procedures.md
|
||||
├── deploy_scripts.md
|
||||
└── reports/
|
||||
└── deploy_status_report.md
|
||||
|
||||
SCRIPTS_DIR/ (project root)
|
||||
├── deploy.sh
|
||||
├── pull-images.sh
|
||||
├── start-services.sh
|
||||
├── stop-services.sh
|
||||
└── health-check.sh
|
||||
|
||||
.env (project root, git-ignored)
|
||||
.env.example (project root, committed)
|
||||
```
|
||||
|
||||
### Save Timing
|
||||
|
||||
| Step | Save immediately after | Filename |
|
||||
|------|------------------------|----------|
|
||||
| Step 1 | Status check & env setup complete | `reports/deploy_status_report.md` + `.env` + `.env.example` |
|
||||
| Step 2 | Containerization plan complete | `containerization.md` |
|
||||
| Step 3 | CI/CD pipeline defined | `ci_cd_pipeline.md` |
|
||||
| Step 4 | Environment strategy documented | `environment_strategy.md` |
|
||||
| Step 5 | Observability plan complete | `observability.md` |
|
||||
| Step 6 | Deployment procedures documented | `deployment_procedures.md` |
|
||||
| Step 7 | Deployment scripts created | `deploy_scripts.md` + scripts in `SCRIPTS_DIR/` |
|
||||
|
||||
### Resumability
|
||||
|
||||
If DEPLOY_DIR already contains artifacts:
|
||||
|
||||
1. List existing files and match to the save timing table
|
||||
2. Identify the last completed step
|
||||
3. Resume from the next incomplete step
|
||||
4. Inform the user which steps are being skipped
|
||||
|
||||
## Progress Tracking
|
||||
|
||||
At the start of execution, create a TodoWrite with all steps (1 through 7). Update status as each step completes.
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: Deployment Status & Environment Setup
|
||||
|
||||
**Role**: DevOps / Platform engineer
|
||||
**Goal**: Assess current deployment readiness, identify all required environment variables, and create `.env` files
|
||||
**Constraints**: Must complete before any other step
|
||||
|
||||
1. Read architecture.md, all component specs, and restrictions.md
|
||||
2. Assess deployment readiness:
|
||||
- List all components and their current state (planned / implemented / tested)
|
||||
- Identify external dependencies (databases, APIs, message queues, cloud services)
|
||||
- Identify infrastructure prerequisites (container registry, cloud accounts, DNS, SSL certificates)
|
||||
- Check if any deployment blockers exist
|
||||
3. Identify all required environment variables by scanning:
|
||||
- Component specs for configuration needs
|
||||
- Database connection requirements
|
||||
- External API endpoints and credentials
|
||||
- Feature flags and runtime configuration
|
||||
- Container registry credentials
|
||||
- Cloud provider credentials
|
||||
- Monitoring/logging service endpoints
|
||||
4. Generate `.env.example` in project root with all variables and placeholder values (committed to VCS)
|
||||
5. Generate `.env` in project root with development defaults filled in where safe (git-ignored)
|
||||
6. Ensure `.gitignore` includes `.env` (but NOT `.env.example`)
|
||||
7. Produce a deployment status report summarizing readiness, blockers, and required setup
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All components assessed for deployment readiness
|
||||
- [ ] External dependencies catalogued
|
||||
- [ ] Infrastructure prerequisites identified
|
||||
- [ ] All required environment variables discovered
|
||||
- [ ] `.env.example` created with placeholder values
|
||||
- [ ] `.env` created with safe development defaults
|
||||
- [ ] `.gitignore` updated to exclude `.env`
|
||||
- [ ] Status report written to `reports/deploy_status_report.md`
|
||||
|
||||
**Save action**: Write `reports/deploy_status_report.md` using `templates/deploy_status_report.md`, create `.env` and `.env.example` in project root
|
||||
|
||||
**BLOCKING**: Present status report and environment variables to user. Do NOT proceed until confirmed.
|
||||
|
||||
---
|
||||
|
||||
### Step 2: Containerization
|
||||
|
||||
**Role**: DevOps / Platform engineer
|
||||
**Goal**: Define Docker configuration for every component, local development, and integration test environments
|
||||
**Constraints**: Plan only — no Dockerfile creation. Describe what each Dockerfile should contain.
|
||||
|
||||
1. Read architecture.md and all component specs
|
||||
2. Read restrictions.md for infrastructure constraints
|
||||
3. Research best Docker practices for the project's tech stack (multi-stage builds, base image selection, layer optimization)
|
||||
4. For each component, define:
|
||||
- Base image (pinned version, prefer alpine/distroless for production)
|
||||
- Build stages (dependency install, build, production)
|
||||
- Non-root user configuration
|
||||
- Health check endpoint and command
|
||||
- Exposed ports
|
||||
- `.dockerignore` contents
|
||||
5. Define `docker-compose.yml` for local development:
|
||||
- All application components
|
||||
- Database (Postgres) with named volume
|
||||
- Any message queues, caches, or external service mocks
|
||||
- Shared network
|
||||
- Environment variable files (`.env`)
|
||||
6. Define `docker-compose.test.yml` for integration tests:
|
||||
- Application components under test
|
||||
- Test runner container (black-box, no internal imports)
|
||||
- Isolated database with seed data
|
||||
- All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit`
|
||||
7. Define image tagging strategy: `<registry>/<project>/<component>:<git-sha>` for CI, `latest` for local dev only
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Every component has a Dockerfile specification
|
||||
- [ ] Multi-stage builds specified for all production images
|
||||
- [ ] Non-root user for all containers
|
||||
- [ ] Health checks defined for every service
|
||||
- [ ] docker-compose.yml covers all components + dependencies
|
||||
- [ ] docker-compose.test.yml enables black-box integration testing
|
||||
- [ ] `.dockerignore` defined
|
||||
|
||||
**Save action**: Write `containerization.md` using `templates/containerization.md`
|
||||
|
||||
**BLOCKING**: Present containerization plan to user. Do NOT proceed until confirmed.
|
||||
|
||||
---
|
||||
|
||||
### Step 3: CI/CD Pipeline
|
||||
|
||||
**Role**: DevOps engineer
|
||||
**Goal**: Define the CI/CD pipeline with quality gates, security scanning, and multi-environment deployment
|
||||
**Constraints**: Pipeline definition only — produce YAML specification, not implementation
|
||||
|
||||
1. Read architecture.md for tech stack and deployment targets
|
||||
2. Read restrictions.md for CI/CD constraints (cloud provider, registry, etc.)
|
||||
3. Research CI/CD best practices for the project's platform (GitHub Actions / Azure Pipelines)
|
||||
4. Define pipeline stages:
|
||||
|
||||
| Stage | Trigger | Steps | Quality Gate |
|
||||
|-------|---------|-------|-------------|
|
||||
| **Lint** | Every push | Run linters per language (black, rustfmt, prettier, dotnet format) | Zero errors |
|
||||
| **Test** | Every push | Unit tests, integration tests, coverage report | 75%+ coverage |
|
||||
| **Security** | Every push | Dependency audit, SAST scan (Semgrep/SonarQube), image scan (Trivy) | Zero critical/high CVEs |
|
||||
| **Build** | PR merge to dev | Build Docker images, tag with git SHA | Build succeeds |
|
||||
| **Push** | After build | Push to container registry | Push succeeds |
|
||||
| **Deploy Staging** | After push | Deploy to staging environment | Health checks pass |
|
||||
| **Smoke Tests** | After staging deploy | Run critical path tests against staging | All pass |
|
||||
| **Deploy Production** | Manual approval | Deploy to production | Health checks pass |
|
||||
|
||||
5. Define caching strategy: dependency caches, Docker layer caches, build artifact caches
|
||||
6. Define parallelization: which stages can run concurrently
|
||||
7. Define notifications: build failures, deployment status, security alerts
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All pipeline stages defined with triggers and gates
|
||||
- [ ] Coverage threshold enforced (75%+)
|
||||
- [ ] Security scanning included (dependencies + images + SAST)
|
||||
- [ ] Caching configured for dependencies and Docker layers
|
||||
- [ ] Multi-environment deployment (staging → production)
|
||||
- [ ] Rollback procedure referenced
|
||||
- [ ] Notifications configured
|
||||
|
||||
**Save action**: Write `ci_cd_pipeline.md` using `templates/ci_cd_pipeline.md`
|
||||
|
||||
---
|
||||
|
||||
### Step 4: Environment Strategy
|
||||
|
||||
**Role**: Platform engineer
|
||||
**Goal**: Define environment configuration, secrets management, and environment parity
|
||||
**Constraints**: Strategy document — no secrets or credentials in output
|
||||
|
||||
1. Define environments:
|
||||
|
||||
| Environment | Purpose | Infrastructure | Data |
|
||||
|-------------|---------|---------------|------|
|
||||
| **Development** | Local developer workflow | docker-compose, local volumes | Seed data, mocks for external APIs |
|
||||
| **Staging** | Pre-production validation | Mirrors production topology | Anonymized production-like data |
|
||||
| **Production** | Live system | Full infrastructure | Real data |
|
||||
|
||||
2. Define environment variable management:
|
||||
- Reference `.env.example` created in Step 1
|
||||
- Per-environment variable sources (`.env` for dev, secret manager for staging/prod)
|
||||
- Validation: fail fast on missing required variables at startup
|
||||
3. Define secrets management:
|
||||
- Never commit secrets to version control
|
||||
- Development: `.env` files (git-ignored)
|
||||
- Staging/Production: secret manager (AWS Secrets Manager / Azure Key Vault / Vault)
|
||||
- Rotation policy
|
||||
4. Define database management per environment:
|
||||
- Development: Docker Postgres with named volume, seed data
|
||||
- Staging: managed Postgres, migrations applied via CI/CD
|
||||
- Production: managed Postgres, migrations require approval
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All three environments defined with clear purpose
|
||||
- [ ] Environment variable documentation complete (references `.env.example` from Step 1)
|
||||
- [ ] No secrets in any output document
|
||||
- [ ] Secret manager specified for staging/production
|
||||
- [ ] Database strategy per environment
|
||||
|
||||
**Save action**: Write `environment_strategy.md` using `templates/environment_strategy.md`
|
||||
|
||||
---
|
||||
|
||||
### Step 5: Observability
|
||||
|
||||
**Role**: Site Reliability Engineer (SRE)
|
||||
**Goal**: Define logging, metrics, tracing, and alerting strategy
|
||||
**Constraints**: Strategy document — describe what to implement, not how to wire it
|
||||
|
||||
1. Read architecture.md and component specs for service boundaries
|
||||
2. Research observability best practices for the tech stack
|
||||
|
||||
**Logging**:
|
||||
- Structured JSON to stdout/stderr (no file logging in containers)
|
||||
- Fields: `timestamp` (ISO 8601), `level`, `service`, `correlation_id`, `message`, `context`
|
||||
- Levels: ERROR (exceptions), WARN (degraded), INFO (business events), DEBUG (diagnostics, dev only)
|
||||
- No PII in logs
|
||||
- Retention: dev = console, staging = 7 days, production = 30 days
|
||||
|
||||
**Metrics**:
|
||||
- Expose Prometheus-compatible `/metrics` endpoint per service
|
||||
- System metrics: CPU, memory, disk, network
|
||||
- Application metrics: `request_count`, `request_duration` (histogram), `error_count`, `active_connections`
|
||||
- Business metrics: derived from acceptance criteria
|
||||
- Collection interval: 15s
|
||||
|
||||
**Distributed Tracing**:
|
||||
- OpenTelemetry SDK integration
|
||||
- Trace context propagation via HTTP headers and message queue metadata
|
||||
- Span naming: `<service>.<operation>`
|
||||
- Sampling: 100% in dev/staging, 10% in production (adjust based on volume)
|
||||
|
||||
**Alerting**:
|
||||
|
||||
| Severity | Response Time | Condition Examples |
|
||||
|----------|---------------|-------------------|
|
||||
| Critical | 5 min | Service down, data loss, health check failed |
|
||||
| High | 30 min | Error rate > 5%, P95 latency > 2x baseline |
|
||||
| Medium | 4 hours | Disk > 80%, elevated latency |
|
||||
| Low | Next business day | Non-critical warnings |
|
||||
|
||||
**Dashboards**:
|
||||
- Operations: service health, request rate, error rate, response time percentiles, resource utilization
|
||||
- Business: key business metrics from acceptance criteria
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Structured logging format defined with required fields
|
||||
- [ ] Metrics endpoint specified per service
|
||||
- [ ] OpenTelemetry tracing configured
|
||||
- [ ] Alert severities with response times defined
|
||||
- [ ] Dashboards cover operations and business metrics
|
||||
- [ ] PII exclusion from logs addressed
|
||||
|
||||
**Save action**: Write `observability.md` using `templates/observability.md`
|
||||
|
||||
---
|
||||
|
||||
### Step 6: Deployment Procedures
|
||||
|
||||
**Role**: DevOps / Platform engineer
|
||||
**Goal**: Define deployment strategy, rollback procedures, health checks, and deployment checklist
|
||||
**Constraints**: Procedures document — no implementation
|
||||
|
||||
1. Define deployment strategy:
|
||||
- Preferred pattern: blue-green / rolling / canary (choose based on architecture)
|
||||
- Zero-downtime requirement for production
|
||||
- Graceful shutdown: 30-second grace period for in-flight requests
|
||||
- Database migration ordering: migrate before deploy, backward-compatible only
|
||||
|
||||
2. Define health checks:
|
||||
|
||||
| Check | Type | Endpoint | Interval | Threshold |
|
||||
|-------|------|----------|----------|-----------|
|
||||
| Liveness | HTTP GET | `/health/live` | 10s | 3 failures → restart |
|
||||
| Readiness | HTTP GET | `/health/ready` | 5s | 3 failures → remove from LB |
|
||||
| Startup | HTTP GET | `/health/ready` | 5s | 30 attempts max |
|
||||
|
||||
3. Define rollback procedures:
|
||||
- Trigger criteria: health check failures, error rate spike, critical alert
|
||||
- Rollback steps: redeploy previous image tag, verify health, rollback database if needed
|
||||
- Communication: notify stakeholders during rollback
|
||||
- Post-mortem: required after every production rollback
|
||||
|
||||
4. Define deployment checklist:
|
||||
- [ ] All tests pass in CI
|
||||
- [ ] Security scan clean (zero critical/high CVEs)
|
||||
- [ ] Database migrations reviewed and tested
|
||||
- [ ] Environment variables configured
|
||||
- [ ] Health check endpoints responding
|
||||
- [ ] Monitoring alerts configured
|
||||
- [ ] Rollback plan documented and tested
|
||||
- [ ] Stakeholders notified
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Deployment strategy chosen and justified
|
||||
- [ ] Zero-downtime approach specified
|
||||
- [ ] Health checks defined (liveness, readiness, startup)
|
||||
- [ ] Rollback trigger criteria and steps documented
|
||||
- [ ] Deployment checklist complete
|
||||
|
||||
**Save action**: Write `deployment_procedures.md` using `templates/deployment_procedures.md`
|
||||
|
||||
**BLOCKING**: Present deployment procedures to user. Do NOT proceed until confirmed.
|
||||
|
||||
---
|
||||
|
||||
### Step 7: Deployment Scripts
|
||||
|
||||
**Role**: DevOps / Platform engineer
|
||||
**Goal**: Create executable deployment scripts for pulling Docker images and running services on the remote target machine
|
||||
**Constraints**: Produce real, executable shell scripts. This is the ONLY step that creates implementation artifacts.
|
||||
|
||||
1. Read containerization.md and deployment_procedures.md from previous steps
|
||||
2. Read `.env.example` for required variables
|
||||
3. Create the following scripts in `SCRIPTS_DIR/`:
|
||||
|
||||
**`deploy.sh`** — Main deployment orchestrator:
|
||||
- Validates that required environment variables are set (sources `.env` if present)
|
||||
- Calls `pull-images.sh`, then `stop-services.sh`, then `start-services.sh`, then `health-check.sh`
|
||||
- Exits with non-zero code on any failure
|
||||
- Supports `--rollback` flag to redeploy previous image tags
|
||||
|
||||
**`pull-images.sh`** — Pull Docker images to target machine:
|
||||
- Reads image list and tags from environment or config
|
||||
- Authenticates with container registry
|
||||
- Pulls all required images
|
||||
- Verifies image integrity (digest check)
|
||||
|
||||
**`start-services.sh`** — Start services on target machine:
|
||||
- Runs `docker compose up -d` or individual `docker run` commands
|
||||
- Applies environment variables from `.env`
|
||||
- Configures networks and volumes
|
||||
- Waits for containers to reach healthy state
|
||||
|
||||
**`stop-services.sh`** — Graceful shutdown:
|
||||
- Stops services with graceful shutdown period
|
||||
- Saves current image tags for rollback reference
|
||||
- Cleans up orphaned containers/networks
|
||||
|
||||
**`health-check.sh`** — Verify deployment health:
|
||||
- Checks all health endpoints
|
||||
- Reports status per service
|
||||
- Returns non-zero if any service is unhealthy
|
||||
|
||||
4. All scripts must:
|
||||
- Be POSIX-compatible (#!/bin/bash with set -euo pipefail)
|
||||
- Source `.env` from project root or accept env vars from the environment
|
||||
- Include usage/help output (`--help` flag)
|
||||
- Be idempotent where possible
|
||||
- Handle SSH connection to remote target (configurable via `DEPLOY_HOST` env var)
|
||||
|
||||
5. Document all scripts in `deploy_scripts.md`
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All five scripts created and executable
|
||||
- [ ] Scripts source environment variables correctly
|
||||
- [ ] `deploy.sh` orchestrates the full flow
|
||||
- [ ] `pull-images.sh` handles registry auth and image pull
|
||||
- [ ] `start-services.sh` starts containers with correct config
|
||||
- [ ] `stop-services.sh` handles graceful shutdown
|
||||
- [ ] `health-check.sh` validates all endpoints
|
||||
- [ ] Rollback supported via `deploy.sh --rollback`
|
||||
- [ ] Scripts work for remote deployment via SSH (DEPLOY_HOST)
|
||||
- [ ] `deploy_scripts.md` documents all scripts
|
||||
|
||||
**Save action**: Write scripts to `SCRIPTS_DIR/`, write `deploy_scripts.md` using `templates/deploy_scripts.md`
|
||||
|
||||
---
|
||||
|
||||
## Escalation Rules
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| Unknown cloud provider or hosting | **ASK user** |
|
||||
| Container registry not specified | **ASK user** |
|
||||
| CI/CD platform preference unclear | **ASK user** — default to GitHub Actions |
|
||||
| Secret manager not chosen | **ASK user** |
|
||||
| Deployment pattern trade-offs | **ASK user** with recommendation |
|
||||
| Missing architecture.md | **STOP** — run `/plan` first |
|
||||
| Remote target machine details unknown | **ASK user** for SSH access, OS, and specs |
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- **Implementing during planning**: Steps 1–6 produce documents, not code (Step 7 is the exception — it creates scripts)
|
||||
- **Hardcoding secrets**: never include real credentials in deployment documents or scripts
|
||||
- **Ignoring integration test containerization**: the test environment must be containerized alongside the app
|
||||
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
|
||||
- **Using `:latest` tags**: always pin base image versions
|
||||
- **Forgetting observability**: logging, metrics, and tracing are deployment concerns, not post-deployment additions
|
||||
- **Committing `.env`**: only `.env.example` goes to version control; `.env` must be in `.gitignore`
|
||||
- **Non-portable scripts**: deployment scripts must work across environments; avoid hardcoded paths
|
||||
|
||||
## Methodology Quick Reference
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Deployment Planning (7-Step Method) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ PREREQ: architecture.md + component specs exist │
|
||||
│ │
|
||||
│ 1. Status & Env → reports/deploy_status_report.md │
|
||||
│ + .env + .env.example │
|
||||
│ [BLOCKING: user confirms status & env vars] │
|
||||
│ 2. Containerization → containerization.md │
|
||||
│ [BLOCKING: user confirms Docker plan] │
|
||||
│ 3. CI/CD Pipeline → ci_cd_pipeline.md │
|
||||
│ 4. Environment → environment_strategy.md │
|
||||
│ 5. Observability → observability.md │
|
||||
│ 6. Procedures → deployment_procedures.md │
|
||||
│ [BLOCKING: user confirms deployment plan] │
|
||||
│ 7. Scripts → deploy_scripts.md + scripts/ │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: Docker-first · IaC · Observability built-in │
|
||||
│ Environment parity · Save immediately │
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,87 @@
|
||||
# CI/CD Pipeline Template
|
||||
|
||||
Save as `_docs/04_deploy/ci_cd_pipeline.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [System Name] — CI/CD Pipeline
|
||||
|
||||
## Pipeline Overview
|
||||
|
||||
| Stage | Trigger | Quality Gate |
|
||||
|-------|---------|-------------|
|
||||
| Lint | Every push | Zero lint errors |
|
||||
| Test | Every push | 75%+ coverage, all tests pass |
|
||||
| Security | Every push | Zero critical/high CVEs |
|
||||
| Build | PR merge to dev | Docker build succeeds |
|
||||
| Push | After build | Images pushed to registry |
|
||||
| Deploy Staging | After push | Health checks pass |
|
||||
| Smoke Tests | After staging deploy | Critical paths pass |
|
||||
| Deploy Production | Manual approval | Health checks pass |
|
||||
|
||||
## Stage Details
|
||||
|
||||
### Lint
|
||||
- [Language-specific linters and formatters]
|
||||
- Runs in parallel per language
|
||||
|
||||
### Test
|
||||
- Unit tests: [framework and command]
|
||||
- Integration tests: [framework and command, uses docker-compose.test.yml]
|
||||
- Coverage threshold: 75% overall, 90% critical paths
|
||||
- Coverage report published as pipeline artifact
|
||||
|
||||
### Security
|
||||
- Dependency audit: [tool, e.g., npm audit / pip-audit / dotnet list package --vulnerable]
|
||||
- SAST scan: [tool, e.g., Semgrep / SonarQube]
|
||||
- Image scan: Trivy on built Docker images
|
||||
- Block on: critical or high severity findings
|
||||
|
||||
### Build
|
||||
- Docker images built using multi-stage Dockerfiles
|
||||
- Tagged with git SHA: `<registry>/<component>:<sha>`
|
||||
- Build cache: Docker layer cache via CI cache action
|
||||
|
||||
### Push
|
||||
- Registry: [container registry URL]
|
||||
- Authentication: [method]
|
||||
|
||||
### Deploy Staging
|
||||
- Deployment method: [docker compose / Kubernetes / cloud service]
|
||||
- Pre-deploy: run database migrations
|
||||
- Post-deploy: verify health check endpoints
|
||||
- Automated rollback on health check failure
|
||||
|
||||
### Smoke Tests
|
||||
- Subset of integration tests targeting staging environment
|
||||
- Validates critical user flows
|
||||
- Timeout: [maximum duration]
|
||||
|
||||
### Deploy Production
|
||||
- Requires manual approval via [mechanism]
|
||||
- Deployment strategy: [blue-green / rolling / canary]
|
||||
- Pre-deploy: database migration review
|
||||
- Post-deploy: health checks + monitoring for 15 min
|
||||
|
||||
## Caching Strategy
|
||||
|
||||
| Cache | Key | Restore Keys |
|
||||
|-------|-----|-------------|
|
||||
| Dependencies | [lockfile hash] | [partial match] |
|
||||
| Docker layers | [Dockerfile hash] | [partial match] |
|
||||
| Build artifacts | [source hash] | [partial match] |
|
||||
|
||||
## Parallelization
|
||||
|
||||
[Diagram or description of which stages run concurrently]
|
||||
|
||||
## Notifications
|
||||
|
||||
| Event | Channel | Recipients |
|
||||
|-------|---------|-----------|
|
||||
| Build failure | [Slack/email] | [team] |
|
||||
| Security alert | [Slack/email] | [team + security] |
|
||||
| Deploy success | [Slack] | [team] |
|
||||
| Deploy failure | [Slack/email + PagerDuty] | [on-call] |
|
||||
```
|
||||
@@ -0,0 +1,94 @@
|
||||
# Containerization Plan Template
|
||||
|
||||
Save as `_docs/04_deploy/containerization.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [System Name] — Containerization
|
||||
|
||||
## Component Dockerfiles
|
||||
|
||||
### [Component Name]
|
||||
|
||||
| Property | Value |
|
||||
|----------|-------|
|
||||
| Base image | [e.g., mcr.microsoft.com/dotnet/aspnet:8.0-alpine] |
|
||||
| Build image | [e.g., mcr.microsoft.com/dotnet/sdk:8.0-alpine] |
|
||||
| Stages | [dependency install → build → production] |
|
||||
| User | [non-root user name] |
|
||||
| Health check | [endpoint and command] |
|
||||
| Exposed ports | [port list] |
|
||||
| Key build args | [if any] |
|
||||
|
||||
### [Repeat for each component]
|
||||
|
||||
## Docker Compose — Local Development
|
||||
|
||||
```yaml
|
||||
# docker-compose.yml structure
|
||||
services:
|
||||
[component]:
|
||||
build: ./[path]
|
||||
ports: ["host:container"]
|
||||
environment: [reference .env.dev]
|
||||
depends_on: [dependencies with health condition]
|
||||
healthcheck: [command, interval, timeout, retries]
|
||||
|
||||
db:
|
||||
image: [postgres:version-alpine]
|
||||
volumes: [named volume]
|
||||
environment: [credentials from .env.dev]
|
||||
healthcheck: [pg_isready]
|
||||
|
||||
volumes:
|
||||
[named volumes]
|
||||
|
||||
networks:
|
||||
[shared network]
|
||||
```
|
||||
|
||||
## Docker Compose — Integration Tests
|
||||
|
||||
```yaml
|
||||
# docker-compose.test.yml structure
|
||||
services:
|
||||
[app components under test]
|
||||
|
||||
test-runner:
|
||||
build: ./tests/integration
|
||||
depends_on: [app components with health condition]
|
||||
environment: [test configuration]
|
||||
# Exit code determines test pass/fail
|
||||
|
||||
db:
|
||||
image: [postgres:version-alpine]
|
||||
volumes: [seed data mount]
|
||||
```
|
||||
|
||||
Run: `docker compose -f docker-compose.test.yml up --abort-on-container-exit`
|
||||
|
||||
## Image Tagging Strategy
|
||||
|
||||
| Context | Tag Format | Example |
|
||||
|---------|-----------|---------|
|
||||
| CI build | `<registry>/<project>/<component>:<git-sha>` | `ghcr.io/org/api:a1b2c3d` |
|
||||
| Release | `<registry>/<project>/<component>:<semver>` | `ghcr.io/org/api:1.2.0` |
|
||||
| Local dev | `<component>:latest` | `api:latest` |
|
||||
|
||||
## .dockerignore
|
||||
|
||||
```
|
||||
.git
|
||||
.cursor
|
||||
_docs
|
||||
_standalone
|
||||
node_modules
|
||||
**/bin
|
||||
**/obj
|
||||
**/__pycache__
|
||||
*.md
|
||||
.env*
|
||||
docker-compose*.yml
|
||||
```
|
||||
```
|
||||
@@ -0,0 +1,73 @@
|
||||
# Deployment Status Report Template
|
||||
|
||||
Save as `_docs/04_deploy/reports/deploy_status_report.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [System Name] — Deployment Status Report
|
||||
|
||||
## Deployment Readiness Summary
|
||||
|
||||
| Aspect | Status | Notes |
|
||||
|--------|--------|-------|
|
||||
| Architecture defined | ✅ / ❌ | |
|
||||
| Component specs complete | ✅ / ❌ | |
|
||||
| Infrastructure prerequisites met | ✅ / ❌ | |
|
||||
| External dependencies identified | ✅ / ❌ | |
|
||||
| Blockers | [count] | [summary] |
|
||||
|
||||
## Component Status
|
||||
|
||||
| Component | State | Docker-ready | Notes |
|
||||
|-----------|-------|-------------|-------|
|
||||
| [Component 1] | planned / implemented / tested | yes / no | |
|
||||
| [Component 2] | planned / implemented / tested | yes / no | |
|
||||
|
||||
## External Dependencies
|
||||
|
||||
| Dependency | Type | Required For | Status |
|
||||
|------------|------|-------------|--------|
|
||||
| [e.g., PostgreSQL] | Database | Data persistence | [available / needs setup] |
|
||||
| [e.g., Redis] | Cache | Session management | [available / needs setup] |
|
||||
| [e.g., External API] | API | [purpose] | [available / needs setup] |
|
||||
|
||||
## Infrastructure Prerequisites
|
||||
|
||||
| Prerequisite | Status | Action Needed |
|
||||
|-------------|--------|--------------|
|
||||
| Container registry | [ready / not set up] | [action] |
|
||||
| Cloud account | [ready / not set up] | [action] |
|
||||
| DNS configuration | [ready / not set up] | [action] |
|
||||
| SSL certificates | [ready / not set up] | [action] |
|
||||
| CI/CD platform | [ready / not set up] | [action] |
|
||||
| Secret manager | [ready / not set up] | [action] |
|
||||
|
||||
## Deployment Blockers
|
||||
|
||||
| Blocker | Severity | Resolution |
|
||||
|---------|----------|-----------|
|
||||
| [blocker description] | critical / high / medium | [resolution steps] |
|
||||
|
||||
## Required Environment Variables
|
||||
|
||||
| Variable | Purpose | Required In | Default (Dev) | Source (Staging/Prod) |
|
||||
|----------|---------|------------|---------------|----------------------|
|
||||
| `DATABASE_URL` | Postgres connection string | All components | `postgres://dev:dev@db:5432/app` | Secret manager |
|
||||
| `DEPLOY_HOST` | Remote target machine | Deployment scripts | `localhost` | Environment |
|
||||
| `REGISTRY_URL` | Container registry URL | CI/CD, deploy scripts | `localhost:5000` | Environment |
|
||||
| `REGISTRY_USER` | Registry username | CI/CD, deploy scripts | — | Secret manager |
|
||||
| `REGISTRY_PASS` | Registry password | CI/CD, deploy scripts | — | Secret manager |
|
||||
| [add all required variables] | | | | |
|
||||
|
||||
## .env Files Created
|
||||
|
||||
- `.env.example` — committed to VCS, contains all variable names with placeholder values
|
||||
- `.env` — git-ignored, contains development defaults
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. [Resolve any blockers listed above]
|
||||
2. [Set up missing infrastructure prerequisites]
|
||||
3. [Proceed to containerization planning]
|
||||
```
|
||||
@@ -0,0 +1,103 @@
|
||||
# Deployment Procedures Template
|
||||
|
||||
Save as `_docs/04_deploy/deployment_procedures.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [System Name] — Deployment Procedures
|
||||
|
||||
## Deployment Strategy
|
||||
|
||||
**Pattern**: [blue-green / rolling / canary]
|
||||
**Rationale**: [why this pattern fits the architecture]
|
||||
**Zero-downtime**: required for production deployments
|
||||
|
||||
### Graceful Shutdown
|
||||
|
||||
- Grace period: 30 seconds for in-flight requests
|
||||
- Sequence: stop accepting new requests → drain connections → shutdown
|
||||
- Container orchestrator: `terminationGracePeriodSeconds: 40`
|
||||
|
||||
### Database Migration Ordering
|
||||
|
||||
- Migrations run **before** new code deploys
|
||||
- All migrations must be backward-compatible (old code works with new schema)
|
||||
- Irreversible migrations require explicit approval
|
||||
|
||||
## Health Checks
|
||||
|
||||
| Check | Type | Endpoint | Interval | Failure Threshold | Action |
|
||||
|-------|------|----------|----------|-------------------|--------|
|
||||
| Liveness | HTTP GET | `/health/live` | 10s | 3 failures | Restart container |
|
||||
| Readiness | HTTP GET | `/health/ready` | 5s | 3 failures | Remove from load balancer |
|
||||
| Startup | HTTP GET | `/health/ready` | 5s | 30 attempts | Kill and recreate |
|
||||
|
||||
### Health Check Responses
|
||||
|
||||
- `/health/live`: returns 200 if process is running (no dependency checks)
|
||||
- `/health/ready`: returns 200 if all dependencies (DB, cache, queues) are reachable
|
||||
|
||||
## Staging Deployment
|
||||
|
||||
1. CI/CD builds and pushes Docker images tagged with git SHA
|
||||
2. Run database migrations against staging
|
||||
3. Deploy new images to staging environment
|
||||
4. Wait for health checks to pass (readiness probe)
|
||||
5. Run smoke tests against staging
|
||||
6. If smoke tests fail: automatic rollback to previous image
|
||||
|
||||
## Production Deployment
|
||||
|
||||
1. **Approval**: manual approval required via [mechanism]
|
||||
2. **Pre-deploy checks**:
|
||||
- [ ] Staging smoke tests passed
|
||||
- [ ] Security scan clean
|
||||
- [ ] Database migration reviewed
|
||||
- [ ] Monitoring alerts configured
|
||||
- [ ] Rollback plan confirmed
|
||||
3. **Deploy**: apply deployment strategy (blue-green / rolling / canary)
|
||||
4. **Verify**: health checks pass, error rate stable, latency within baseline
|
||||
5. **Monitor**: observe dashboards for 15 minutes post-deploy
|
||||
6. **Finalize**: mark deployment as successful or trigger rollback
|
||||
|
||||
## Rollback Procedures
|
||||
|
||||
### Trigger Criteria
|
||||
|
||||
- Health check failures persist after deploy
|
||||
- Error rate exceeds 5% for more than 5 minutes
|
||||
- Critical alert fires within 15 minutes of deploy
|
||||
- Manual decision by on-call engineer
|
||||
|
||||
### Rollback Steps
|
||||
|
||||
1. Redeploy previous Docker image tag (from CI/CD artifact)
|
||||
2. Verify health checks pass
|
||||
3. If database migration was applied:
|
||||
- Run DOWN migration if reversible
|
||||
- If irreversible: assess data impact, escalate if needed
|
||||
4. Notify stakeholders
|
||||
5. Schedule post-mortem within 24 hours
|
||||
|
||||
### Post-Mortem
|
||||
|
||||
Required after every production rollback:
|
||||
- Timeline of events
|
||||
- Root cause
|
||||
- What went wrong
|
||||
- Prevention measures
|
||||
|
||||
## Deployment Checklist
|
||||
|
||||
- [ ] All tests pass in CI
|
||||
- [ ] Security scan clean (zero critical/high CVEs)
|
||||
- [ ] Docker images built and pushed
|
||||
- [ ] Database migrations reviewed and tested
|
||||
- [ ] Environment variables configured for target environment
|
||||
- [ ] Health check endpoints verified
|
||||
- [ ] Monitoring alerts configured
|
||||
- [ ] Rollback plan documented and tested
|
||||
- [ ] Stakeholders notified of deployment window
|
||||
- [ ] On-call engineer available during deployment
|
||||
```
|
||||
@@ -0,0 +1,61 @@
|
||||
# Environment Strategy Template
|
||||
|
||||
Save as `_docs/04_deploy/environment_strategy.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [System Name] — Environment Strategy
|
||||
|
||||
## Environments
|
||||
|
||||
| Environment | Purpose | Infrastructure | Data Source |
|
||||
|-------------|---------|---------------|-------------|
|
||||
| Development | Local developer workflow | docker-compose | Seed data, mocked externals |
|
||||
| Staging | Pre-production validation | [mirrors production] | Anonymized production-like data |
|
||||
| Production | Live system | [full infrastructure] | Real data |
|
||||
|
||||
## Environment Variables
|
||||
|
||||
### Required Variables
|
||||
|
||||
| Variable | Purpose | Dev Default | Staging/Prod Source |
|
||||
|----------|---------|-------------|-------------------|
|
||||
| `DATABASE_URL` | Postgres connection | `postgres://dev:dev@db:5432/app` | Secret manager |
|
||||
| [add all required variables] | | | |
|
||||
|
||||
### `.env.example`
|
||||
|
||||
```env
|
||||
# Copy to .env and fill in values
|
||||
DATABASE_URL=postgres://user:pass@host:5432/dbname
|
||||
# [all required variables with placeholder values]
|
||||
```
|
||||
|
||||
### Variable Validation
|
||||
|
||||
All services validate required environment variables at startup and fail fast with a clear error message if any are missing.
|
||||
|
||||
## Secrets Management
|
||||
|
||||
| Environment | Method | Tool |
|
||||
|-------------|--------|------|
|
||||
| Development | `.env` file (git-ignored) | dotenv |
|
||||
| Staging | Secret manager | [AWS Secrets Manager / Azure Key Vault / Vault] |
|
||||
| Production | Secret manager | [AWS Secrets Manager / Azure Key Vault / Vault] |
|
||||
|
||||
Rotation policy: [frequency and procedure]
|
||||
|
||||
## Database Management
|
||||
|
||||
| Environment | Type | Migrations | Data |
|
||||
|-------------|------|-----------|------|
|
||||
| Development | Docker Postgres, named volume | Applied on container start | Seed data via init script |
|
||||
| Staging | Managed Postgres | Applied via CI/CD pipeline | Anonymized production snapshot |
|
||||
| Production | Managed Postgres | Applied via CI/CD with approval | Live data |
|
||||
|
||||
Migration rules:
|
||||
- All migrations must be backward-compatible (support old and new code simultaneously)
|
||||
- Reversible migrations required (DOWN/rollback script)
|
||||
- Production migrations require review before apply
|
||||
```
|
||||
@@ -0,0 +1,132 @@
|
||||
# Observability Template
|
||||
|
||||
Save as `_docs/04_deploy/observability.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [System Name] — Observability
|
||||
|
||||
## Logging
|
||||
|
||||
### Format
|
||||
|
||||
Structured JSON to stdout/stderr. No file-based logging in containers.
|
||||
|
||||
```json
|
||||
{
|
||||
"timestamp": "ISO8601",
|
||||
"level": "INFO",
|
||||
"service": "service-name",
|
||||
"correlation_id": "uuid",
|
||||
"message": "Event description",
|
||||
"context": {}
|
||||
}
|
||||
```
|
||||
|
||||
### Log Levels
|
||||
|
||||
| Level | Usage | Example |
|
||||
|-------|-------|---------|
|
||||
| ERROR | Exceptions, failures requiring attention | Database connection failed |
|
||||
| WARN | Potential issues, degraded performance | Retry attempt 2/3 |
|
||||
| INFO | Significant business events | User registered, Order placed |
|
||||
| DEBUG | Detailed diagnostics (dev/staging only) | Request payload, Query params |
|
||||
|
||||
### Retention
|
||||
|
||||
| Environment | Destination | Retention |
|
||||
|-------------|-------------|-----------|
|
||||
| Development | Console | Session |
|
||||
| Staging | [log aggregator] | 7 days |
|
||||
| Production | [log aggregator] | 30 days |
|
||||
|
||||
### PII Rules
|
||||
|
||||
- Never log passwords, tokens, or session IDs
|
||||
- Mask email addresses and personal identifiers
|
||||
- Log user IDs (opaque) instead of usernames
|
||||
|
||||
## Metrics
|
||||
|
||||
### Endpoints
|
||||
|
||||
Every service exposes Prometheus-compatible metrics at `/metrics`.
|
||||
|
||||
### Application Metrics
|
||||
|
||||
| Metric | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `request_count` | Counter | Total HTTP requests by method, path, status |
|
||||
| `request_duration_seconds` | Histogram | Response time by method, path |
|
||||
| `error_count` | Counter | Failed requests by type |
|
||||
| `active_connections` | Gauge | Current open connections |
|
||||
|
||||
### System Metrics
|
||||
|
||||
- CPU usage, Memory usage, Disk I/O, Network I/O
|
||||
|
||||
### Business Metrics
|
||||
|
||||
| Metric | Type | Description | Source |
|
||||
|--------|------|-------------|--------|
|
||||
| [from acceptance criteria] | | | |
|
||||
|
||||
Collection interval: 15 seconds
|
||||
|
||||
## Distributed Tracing
|
||||
|
||||
### Configuration
|
||||
|
||||
- SDK: OpenTelemetry
|
||||
- Propagation: W3C Trace Context via HTTP headers
|
||||
- Span naming: `<service>.<operation>`
|
||||
|
||||
### Sampling
|
||||
|
||||
| Environment | Rate | Rationale |
|
||||
|-------------|------|-----------|
|
||||
| Development | 100% | Full visibility |
|
||||
| Staging | 100% | Full visibility |
|
||||
| Production | 10% | Balance cost vs observability |
|
||||
|
||||
### Integration Points
|
||||
|
||||
- HTTP requests: automatic instrumentation
|
||||
- Database queries: automatic instrumentation
|
||||
- Message queues: manual span creation on publish/consume
|
||||
|
||||
## Alerting
|
||||
|
||||
| Severity | Response Time | Conditions |
|
||||
|----------|---------------|-----------|
|
||||
| Critical | 5 min | Service unreachable, health check failed for 1 min, data loss detected |
|
||||
| High | 30 min | Error rate > 5% for 5 min, P95 latency > 2x baseline for 10 min |
|
||||
| Medium | 4 hours | Disk usage > 80%, elevated latency, connection pool exhaustion |
|
||||
| Low | Next business day | Non-critical warnings, deprecated API usage |
|
||||
|
||||
### Notification Channels
|
||||
|
||||
| Severity | Channel |
|
||||
|----------|---------|
|
||||
| Critical | [PagerDuty / phone] |
|
||||
| High | [Slack + email] |
|
||||
| Medium | [Slack] |
|
||||
| Low | [Dashboard only] |
|
||||
|
||||
## Dashboards
|
||||
|
||||
### Operations Dashboard
|
||||
|
||||
- Service health status (up/down per component)
|
||||
- Request rate and error rate
|
||||
- Response time percentiles (P50, P95, P99)
|
||||
- Resource utilization (CPU, memory per container)
|
||||
- Active alerts
|
||||
|
||||
### Business Dashboard
|
||||
|
||||
- [Key business metrics from acceptance criteria]
|
||||
- [User activity indicators]
|
||||
- [Transaction volumes]
|
||||
```
|
||||
@@ -0,0 +1,177 @@
|
||||
---
|
||||
name: implement
|
||||
description: |
|
||||
Orchestrate task implementation with dependency-aware batching, parallel subagents, and integrated code review.
|
||||
Reads flat task files and _dependencies_table.md from TASKS_DIR, computes execution batches via topological sort,
|
||||
launches up to 4 implementer subagents in parallel, runs code-review skill after each batch, and loops until done.
|
||||
Use after /decompose has produced task files.
|
||||
Trigger phrases:
|
||||
- "implement", "start implementation", "implement tasks"
|
||||
- "run implementers", "execute tasks"
|
||||
category: build
|
||||
tags: [implementation, orchestration, batching, parallel, code-review]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Implementation Orchestrator
|
||||
|
||||
Orchestrate the implementation of all tasks produced by the `/decompose` skill. This skill is a **pure orchestrator** — it does NOT write implementation code itself. It reads task specs, computes execution order, delegates to `implementer` subagents, validates results via the `/code-review` skill, and escalates issues.
|
||||
|
||||
The `implementer` agent is the specialist that writes all the code — it receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Orchestrate, don't implement**: this skill delegates all coding to `implementer` subagents
|
||||
- **Dependency-aware batching**: tasks run only when all their dependencies are satisfied
|
||||
- **Max 4 parallel agents**: never launch more than 4 implementer subagents simultaneously
|
||||
- **File isolation**: no two parallel agents may write to the same file
|
||||
- **Integrated review**: `/code-review` skill runs automatically after each batch
|
||||
- **Auto-start**: batches launch immediately — no user confirmation before a batch
|
||||
- **Gate on failure**: user confirmation is required only when code review returns FAIL
|
||||
- **Commit and push per batch**: after each batch is confirmed, commit and push to remote
|
||||
|
||||
## Context Resolution
|
||||
|
||||
- TASKS_DIR: `_docs/02_tasks/`
|
||||
- Task files: all `*.md` files in TASKS_DIR (excluding files starting with `_`)
|
||||
- Dependency table: `TASKS_DIR/_dependencies_table.md`
|
||||
|
||||
## Prerequisite Checks (BLOCKING)
|
||||
|
||||
1. TASKS_DIR exists and contains at least one task file — **STOP if missing**
|
||||
2. `_dependencies_table.md` exists — **STOP if missing**
|
||||
3. At least one task is not yet completed — **STOP if all done**
|
||||
|
||||
## Algorithm
|
||||
|
||||
### 1. Parse
|
||||
|
||||
- Read all task `*.md` files from TASKS_DIR (excluding files starting with `_`)
|
||||
- Read `_dependencies_table.md` — parse into a dependency graph (DAG)
|
||||
- Validate: no circular dependencies, all referenced dependencies exist
|
||||
|
||||
### 2. Detect Progress
|
||||
|
||||
- Scan the codebase to determine which tasks are already completed
|
||||
- Match implemented code against task acceptance criteria
|
||||
- Mark completed tasks as done in the DAG
|
||||
- Report progress to user: "X of Y tasks completed"
|
||||
|
||||
### 3. Compute Next Batch
|
||||
|
||||
- Topological sort remaining tasks
|
||||
- Select tasks whose dependencies are ALL satisfied (completed)
|
||||
- If a ready task depends on any task currently being worked on in this batch, it must wait for the next batch
|
||||
- Cap the batch at 4 parallel agents
|
||||
- If the batch would exceed 20 total complexity points, suggest splitting and let the user decide
|
||||
|
||||
### 4. Assign File Ownership
|
||||
|
||||
For each task in the batch:
|
||||
- Parse the task spec's Component field and Scope section
|
||||
- Map the component to directories/files in the project
|
||||
- Determine: files OWNED (exclusive write), files READ-ONLY (shared interfaces, types), files FORBIDDEN (other agents' owned files)
|
||||
- If two tasks in the same batch would modify the same file, schedule them sequentially instead of in parallel
|
||||
|
||||
### 5. Update Jira Status → In Progress
|
||||
|
||||
For each task in the batch, transition its Jira ticket status to **In Progress** via Jira MCP before launching the implementer.
|
||||
|
||||
### 6. Launch Implementer Subagents
|
||||
|
||||
For each task in the batch, launch an `implementer` subagent with:
|
||||
- Path to the task spec file
|
||||
- List of files OWNED (exclusive write access)
|
||||
- List of files READ-ONLY
|
||||
- List of files FORBIDDEN
|
||||
|
||||
Launch all subagents immediately — no user confirmation.
|
||||
|
||||
### 7. Monitor
|
||||
|
||||
- Wait for all subagents to complete
|
||||
- Collect structured status reports from each implementer
|
||||
- If any implementer reports "Blocked", log the blocker and continue with others
|
||||
|
||||
### 8. Code Review
|
||||
|
||||
- Run `/code-review` skill on the batch's changed files + corresponding task specs
|
||||
- The code-review skill produces a verdict: PASS, PASS_WITH_WARNINGS, or FAIL
|
||||
|
||||
### 9. Gate
|
||||
|
||||
- If verdict is **FAIL**: present findings to user (**BLOCKING**). User must confirm fixes or accept before proceeding.
|
||||
- If verdict is **PASS** or **PASS_WITH_WARNINGS**: show findings as info, continue automatically.
|
||||
|
||||
### 10. Test
|
||||
|
||||
- Run the full test suite
|
||||
- If failures: report to user with details
|
||||
|
||||
### 11. Commit and Push
|
||||
|
||||
- After user confirms the batch (explicitly for FAIL, implicitly for PASS/PASS_WITH_WARNINGS):
|
||||
- `git add` all changed files from the batch
|
||||
- `git commit` with a message that includes ALL JIRA-IDs of tasks implemented in the batch, followed by a summary of what was implemented. Format: `[JIRA-ID-1] [JIRA-ID-2] ... Summary of changes`
|
||||
- `git push` to the remote branch
|
||||
|
||||
### 12. Update Jira Status → In Testing
|
||||
|
||||
After the batch is committed and pushed, transition the Jira ticket status of each task in the batch to **In Testing** via Jira MCP.
|
||||
|
||||
### 13. Loop
|
||||
|
||||
- Go back to step 2 until all tasks are done
|
||||
- When all tasks are complete, report final summary
|
||||
|
||||
## Batch Report Persistence
|
||||
|
||||
After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_report.md`. Create the directory if it doesn't exist. When all tasks are complete, produce `_docs/03_implementation/FINAL_implementation_report.md` with a summary of all batches.
|
||||
|
||||
## Batch Report
|
||||
|
||||
After each batch, produce a structured report:
|
||||
|
||||
```markdown
|
||||
# Batch Report
|
||||
|
||||
**Batch**: [N]
|
||||
**Tasks**: [list]
|
||||
**Date**: [YYYY-MM-DD]
|
||||
|
||||
## Task Results
|
||||
|
||||
| Task | Status | Files Modified | Tests | Issues |
|
||||
|------|--------|---------------|-------|--------|
|
||||
| [JIRA-ID]_[name] | Done | [count] files | [pass/fail] | [count or None] |
|
||||
|
||||
## Code Review Verdict: [PASS/FAIL/PASS_WITH_WARNINGS]
|
||||
|
||||
## Next Batch: [task list] or "All tasks complete"
|
||||
```
|
||||
|
||||
## Stop Conditions and Escalation
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| Implementer fails same approach 3+ times | Stop it, escalate to user |
|
||||
| Task blocked on external dependency (not in task list) | Report and skip |
|
||||
| File ownership conflict unresolvable | ASK user |
|
||||
| Test failures exceed 50% of suite after a batch | Stop and escalate |
|
||||
| All tasks complete | Report final summary, suggest final commit |
|
||||
| `_dependencies_table.md` missing | STOP — run `/decompose` first |
|
||||
|
||||
## Recovery
|
||||
|
||||
Each batch commit serves as a rollback checkpoint. If recovery is needed:
|
||||
|
||||
- **Tests fail after a batch commit**: `git revert <batch-commit-hash>` using the hash from the batch report in `_docs/03_implementation/`
|
||||
- **Resuming after interruption**: Read `_docs/03_implementation/batch_*_report.md` files to determine which batches completed, then continue from the next batch
|
||||
- **Multiple consecutive batches fail**: Stop and escalate to user with links to batch reports and commit hashes
|
||||
|
||||
## Safety Rules
|
||||
|
||||
- Never launch tasks whose dependencies are not yet completed
|
||||
- Never allow two parallel agents to write to the same file
|
||||
- If a subagent fails, do NOT retry automatically — report and let user decide
|
||||
- Always run tests after each batch completes
|
||||
@@ -0,0 +1,31 @@
|
||||
# Batching Algorithm Reference
|
||||
|
||||
## Topological Sort with Batch Grouping
|
||||
|
||||
The `/implement` skill uses a topological sort to determine execution order,
|
||||
then groups tasks into batches for parallel execution.
|
||||
|
||||
## Algorithm
|
||||
|
||||
1. Build adjacency list from `_dependencies_table.md`
|
||||
2. Compute in-degree for each task node
|
||||
3. Initialize batch 0 with all nodes that have in-degree 0
|
||||
4. For each batch:
|
||||
a. Select up to 4 tasks from the ready set
|
||||
b. Check file ownership — if two tasks would write the same file, defer one to the next batch
|
||||
c. Launch selected tasks as parallel implementer subagents
|
||||
d. When all complete, remove them from the graph and decrement in-degrees of dependents
|
||||
e. Add newly zero-in-degree nodes to the next batch's ready set
|
||||
5. Repeat until the graph is empty
|
||||
|
||||
## File Ownership Conflict Resolution
|
||||
|
||||
When two tasks in the same batch map to overlapping files:
|
||||
- Prefer to run the lower-numbered task first (it's more foundational)
|
||||
- Defer the higher-numbered task to the next batch
|
||||
- If both have equal priority, ask the user
|
||||
|
||||
## Complexity Budget
|
||||
|
||||
Each batch should not exceed 20 total complexity points.
|
||||
If it does, split the batch and let the user choose which tasks to include.
|
||||
@@ -0,0 +1,36 @@
|
||||
# Batch Report Template
|
||||
|
||||
Use this template after each implementation batch completes.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# Batch Report
|
||||
|
||||
**Batch**: [N]
|
||||
**Tasks**: [list of task names]
|
||||
**Date**: [YYYY-MM-DD]
|
||||
|
||||
## Task Results
|
||||
|
||||
| Task | Status | Files Modified | Tests | Issues |
|
||||
|------|--------|---------------|-------|--------|
|
||||
| [JIRA-ID]_[name] | Done/Blocked/Partial | [count] files | [X/Y pass] | [count or None] |
|
||||
|
||||
## Code Review Verdict: [PASS / FAIL / PASS_WITH_WARNINGS]
|
||||
|
||||
[Link to code review report if FAIL or PASS_WITH_WARNINGS]
|
||||
|
||||
## Test Suite
|
||||
|
||||
- Total: [N] tests
|
||||
- Passed: [N]
|
||||
- Failed: [N]
|
||||
- Skipped: [N]
|
||||
|
||||
## Commit
|
||||
|
||||
[Suggested commit message]
|
||||
|
||||
## Next Batch: [task list] or "All tasks complete"
|
||||
```
|
||||
@@ -0,0 +1,557 @@
|
||||
---
|
||||
name: plan
|
||||
description: |
|
||||
Decompose a solution into architecture, data model, deployment plan, system flows, components, tests, and Jira epics.
|
||||
Systematic 6-step planning workflow with BLOCKING gates, self-verification, and structured artifact management.
|
||||
Uses _docs/ + _docs/02_plans/ structure.
|
||||
Trigger phrases:
|
||||
- "plan", "decompose solution", "architecture planning"
|
||||
- "break down the solution", "create planning documents"
|
||||
- "component decomposition", "solution analysis"
|
||||
category: build
|
||||
tags: [planning, architecture, components, testing, jira, epics]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Solution Planning
|
||||
|
||||
Decompose a problem and solution into architecture, data model, deployment plan, system flows, components, tests, and Jira epics through a systematic 6-step workflow.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Single Responsibility**: each component does one thing well; do not spread related logic across components
|
||||
- **Dumb code, smart data**: keep logic simple, push complexity into data structures and configuration
|
||||
- **Save immediately**: write artifacts to disk after each step; never accumulate unsaved work
|
||||
- **Ask, don't assume**: when requirements are ambiguous, ask the user before proceeding
|
||||
- **Plan, don't code**: this workflow produces documents and specs, never implementation code
|
||||
|
||||
## Context Resolution
|
||||
|
||||
Fixed paths — no mode detection needed:
|
||||
|
||||
- PROBLEM_FILE: `_docs/00_problem/problem.md`
|
||||
- SOLUTION_FILE: `_docs/01_solution/solution.md`
|
||||
- PLANS_DIR: `_docs/02_plans/`
|
||||
|
||||
Announce the resolved paths to the user before proceeding.
|
||||
|
||||
## Input Specification
|
||||
|
||||
### Required Files
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `_docs/00_problem/problem.md` | Problem description and context |
|
||||
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
|
||||
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
|
||||
| `_docs/00_problem/input_data/` | Reference data examples |
|
||||
| `_docs/01_solution/solution.md` | Finalized solution to decompose |
|
||||
|
||||
### Prerequisite Checks (BLOCKING)
|
||||
|
||||
Run sequentially before any planning step:
|
||||
|
||||
**Prereq 1: Data Gate**
|
||||
|
||||
1. `_docs/00_problem/acceptance_criteria.md` exists and is non-empty — **STOP if missing**
|
||||
2. `_docs/00_problem/restrictions.md` exists and is non-empty — **STOP if missing**
|
||||
3. `_docs/00_problem/input_data/` exists and contains at least one data file — **STOP if missing**
|
||||
4. `_docs/00_problem/problem.md` exists and is non-empty — **STOP if missing**
|
||||
|
||||
All four are mandatory. If any is missing or empty, STOP and ask the user to provide them. If the user cannot provide the required data, planning cannot proceed — just stop.
|
||||
|
||||
**Prereq 2: Finalize Solution Draft**
|
||||
|
||||
Only runs after the Data Gate passes:
|
||||
|
||||
1. Scan `_docs/01_solution/` for files matching `solution_draft*.md`
|
||||
2. Identify the highest-numbered draft (e.g. `solution_draft06.md`)
|
||||
3. **Rename** it to `_docs/01_solution/solution.md`
|
||||
4. If `solution.md` already exists, ask the user whether to overwrite or keep existing
|
||||
5. Verify `solution.md` is non-empty — **STOP if missing or empty**
|
||||
|
||||
**Prereq 3: Workspace Setup**
|
||||
|
||||
1. Create PLANS_DIR if it does not exist
|
||||
2. If PLANS_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?**
|
||||
|
||||
## Artifact Management
|
||||
|
||||
### Directory Structure
|
||||
|
||||
All artifacts are written directly under PLANS_DIR:
|
||||
|
||||
```
|
||||
PLANS_DIR/
|
||||
├── integration_tests/
|
||||
│ ├── environment.md
|
||||
│ ├── test_data.md
|
||||
│ ├── functional_tests.md
|
||||
│ ├── non_functional_tests.md
|
||||
│ └── traceability_matrix.md
|
||||
├── architecture.md
|
||||
├── system-flows.md
|
||||
├── data_model.md
|
||||
├── deployment/
|
||||
│ ├── containerization.md
|
||||
│ ├── ci_cd_pipeline.md
|
||||
│ ├── environment_strategy.md
|
||||
│ ├── observability.md
|
||||
│ └── deployment_procedures.md
|
||||
├── risk_mitigations.md
|
||||
├── risk_mitigations_02.md (iterative, ## as sequence)
|
||||
├── components/
|
||||
│ ├── 01_[name]/
|
||||
│ │ ├── description.md
|
||||
│ │ └── tests.md
|
||||
│ ├── 02_[name]/
|
||||
│ │ ├── description.md
|
||||
│ │ └── tests.md
|
||||
│ └── ...
|
||||
├── common-helpers/
|
||||
│ ├── 01_helper_[name]/
|
||||
│ ├── 02_helper_[name]/
|
||||
│ └── ...
|
||||
├── diagrams/
|
||||
│ ├── components.drawio
|
||||
│ └── flows/
|
||||
│ ├── flow_[name].md (Mermaid)
|
||||
│ └── ...
|
||||
└── FINAL_report.md
|
||||
```
|
||||
|
||||
### Save Timing
|
||||
|
||||
| Step | Save immediately after | Filename |
|
||||
|------|------------------------|----------|
|
||||
| Step 1 | Integration test environment spec | `integration_tests/environment.md` |
|
||||
| Step 1 | Integration test data spec | `integration_tests/test_data.md` |
|
||||
| Step 1 | Integration functional tests | `integration_tests/functional_tests.md` |
|
||||
| Step 1 | Integration non-functional tests | `integration_tests/non_functional_tests.md` |
|
||||
| Step 1 | Integration traceability matrix | `integration_tests/traceability_matrix.md` |
|
||||
| Step 2 | Architecture analysis complete | `architecture.md` |
|
||||
| Step 2 | System flows documented | `system-flows.md` |
|
||||
| Step 2 | Data model documented | `data_model.md` |
|
||||
| Step 2 | Deployment plan complete | `deployment/` (5 files) |
|
||||
| Step 3 | Each component analyzed | `components/[##]_[name]/description.md` |
|
||||
| Step 3 | Common helpers generated | `common-helpers/[##]_helper_[name].md` |
|
||||
| Step 3 | Diagrams generated | `diagrams/` |
|
||||
| Step 4 | Risk assessment complete | `risk_mitigations.md` |
|
||||
| Step 5 | Tests written per component | `components/[##]_[name]/tests.md` |
|
||||
| Step 6 | Epics created in Jira | Jira via MCP |
|
||||
| Final | All steps complete | `FINAL_report.md` |
|
||||
|
||||
### Save Principles
|
||||
|
||||
1. **Save immediately**: write to disk as soon as a step completes; do not wait until the end
|
||||
2. **Incremental updates**: same file can be updated multiple times; append or replace
|
||||
3. **Preserve process**: keep all intermediate files even after integration into final report
|
||||
4. **Enable recovery**: if interrupted, resume from the last saved artifact (see Resumability)
|
||||
|
||||
### Resumability
|
||||
|
||||
If PLANS_DIR already contains artifacts:
|
||||
|
||||
1. List existing files and match them to the save timing table above
|
||||
2. Identify the last completed step based on which artifacts exist
|
||||
3. Resume from the next incomplete step
|
||||
4. Inform the user which steps are being skipped
|
||||
|
||||
## Progress Tracking
|
||||
|
||||
At the start of execution, create a TodoWrite with all steps (1 through 6). Update status as each step completes.
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: Integration Tests
|
||||
|
||||
**Role**: Professional Quality Assurance Engineer
|
||||
**Goal**: Analyze input data completeness and produce detailed black-box integration test specifications
|
||||
**Constraints**: Spec only — no test code. Tests describe what the system should do given specific inputs, not how the system is built.
|
||||
|
||||
#### Phase 1a: Input Data Completeness Analysis
|
||||
|
||||
1. Read `_docs/01_solution/solution.md` (finalized in Prereq 2)
|
||||
2. Read `acceptance_criteria.md`, `restrictions.md`
|
||||
3. Read testing strategy from solution.md
|
||||
4. Analyze `input_data/` contents against:
|
||||
- Coverage of acceptance criteria scenarios
|
||||
- Coverage of restriction edge cases
|
||||
- Coverage of testing strategy requirements
|
||||
5. Threshold: at least 70% coverage of the scenarios
|
||||
6. If coverage is low, search the internet for supplementary data, assess quality with user, and if user agrees, add to `input_data/`
|
||||
7. Present coverage assessment to user
|
||||
|
||||
**BLOCKING**: Do NOT proceed until user confirms the input data coverage is sufficient.
|
||||
|
||||
#### Phase 1b: Black-Box Test Scenario Specification
|
||||
|
||||
Based on all acquired data, acceptance_criteria, and restrictions, form detailed test scenarios:
|
||||
|
||||
1. Define test environment using `templates/integration-environment.md` as structure
|
||||
2. Define test data management using `templates/integration-test-data.md` as structure
|
||||
3. Write functional test scenarios (positive + negative) using `templates/integration-functional-tests.md` as structure
|
||||
4. Write non-functional test scenarios (performance, resilience, security, edge cases) using `templates/integration-non-functional-tests.md` as structure
|
||||
5. Build traceability matrix using `templates/integration-traceability-matrix.md` as structure
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Every acceptance criterion is covered by at least one test scenario
|
||||
- [ ] Every restriction is verified by at least one test scenario
|
||||
- [ ] Positive and negative scenarios are balanced
|
||||
- [ ] Consumer app has no direct access to system internals
|
||||
- [ ] Docker environment is self-contained (`docker compose up` sufficient)
|
||||
- [ ] External dependencies have mock/stub services defined
|
||||
- [ ] Traceability matrix has no uncovered AC or restrictions
|
||||
|
||||
**Save action**: Write all files under `integration_tests/`:
|
||||
- `environment.md`
|
||||
- `test_data.md`
|
||||
- `functional_tests.md`
|
||||
- `non_functional_tests.md`
|
||||
- `traceability_matrix.md`
|
||||
|
||||
**BLOCKING**: Present test coverage summary (from traceability_matrix.md) to user. Do NOT proceed until confirmed.
|
||||
|
||||
Capture any new questions, findings, or insights that arise during test specification — these feed forward into Steps 2 and 3.
|
||||
|
||||
---
|
||||
|
||||
### Step 2: Solution Analysis
|
||||
|
||||
**Role**: Professional software architect
|
||||
**Goal**: Produce `architecture.md`, `system-flows.md`, `data_model.md`, and `deployment/` from the solution draft
|
||||
**Constraints**: No code, no component-level detail yet; focus on system-level view
|
||||
|
||||
#### Phase 2a: Architecture & Flows
|
||||
|
||||
1. Read all input files thoroughly
|
||||
2. Incorporate findings, questions, and insights discovered during Step 1 (integration tests)
|
||||
3. Research unknown or questionable topics via internet; ask user about ambiguities
|
||||
4. Document architecture using `templates/architecture.md` as structure
|
||||
5. Document system flows using `templates/system-flows.md` as structure
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Architecture covers all capabilities mentioned in solution.md
|
||||
- [ ] System flows cover all main user/system interactions
|
||||
- [ ] No contradictions with problem.md or restrictions.md
|
||||
- [ ] Technology choices are justified
|
||||
- [ ] Integration test findings are reflected in architecture decisions
|
||||
|
||||
**Save action**: Write `architecture.md` and `system-flows.md`
|
||||
|
||||
**BLOCKING**: Present architecture summary to user. Do NOT proceed until user confirms.
|
||||
|
||||
#### Phase 2b: Data Model
|
||||
|
||||
**Role**: Professional software architect
|
||||
**Goal**: Produce a detailed data model document covering entities, relationships, and migration strategy
|
||||
|
||||
1. Extract core entities from architecture.md and solution.md
|
||||
2. Define entity attributes, types, and constraints
|
||||
3. Define relationships between entities (Mermaid ERD)
|
||||
4. Define migration strategy: versioning tool (EF Core migrations / Alembic / sql-migrate), reversibility requirement, naming convention
|
||||
5. Define seed data requirements per environment (dev, staging)
|
||||
6. Define backward compatibility approach for schema changes (additive-only by default)
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Every entity mentioned in architecture.md is defined
|
||||
- [ ] Relationships are explicit with cardinality
|
||||
- [ ] Migration strategy specifies reversibility requirement
|
||||
- [ ] Seed data requirements defined
|
||||
- [ ] Backward compatibility approach documented
|
||||
|
||||
**Save action**: Write `data_model.md`
|
||||
|
||||
#### Phase 2c: Deployment Planning
|
||||
|
||||
**Role**: DevOps / Platform engineer
|
||||
**Goal**: Produce deployment plan covering containerization, CI/CD, environment strategy, observability, and deployment procedures
|
||||
|
||||
Use the `/deploy` skill's templates as structure for each artifact:
|
||||
|
||||
1. Read architecture.md and restrictions.md for infrastructure constraints
|
||||
2. Research Docker best practices for the project's tech stack
|
||||
3. Define containerization plan: Dockerfile per component, docker-compose for dev and tests
|
||||
4. Define CI/CD pipeline: stages, quality gates, caching, parallelization
|
||||
5. Define environment strategy: dev, staging, production with secrets management
|
||||
6. Define observability: structured logging, metrics, tracing, alerting
|
||||
7. Define deployment procedures: strategy, health checks, rollback, checklist
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Every component has a Docker specification
|
||||
- [ ] CI/CD pipeline covers lint, test, security, build, deploy
|
||||
- [ ] Environment strategy covers dev, staging, production
|
||||
- [ ] Observability covers logging, metrics, tracing, alerting
|
||||
- [ ] Deployment procedures include rollback and health checks
|
||||
|
||||
**Save action**: Write all 5 files under `deployment/`:
|
||||
- `containerization.md`
|
||||
- `ci_cd_pipeline.md`
|
||||
- `environment_strategy.md`
|
||||
- `observability.md`
|
||||
- `deployment_procedures.md`
|
||||
|
||||
---
|
||||
|
||||
### Step 3: Component Decomposition
|
||||
|
||||
**Role**: Professional software architect
|
||||
**Goal**: Decompose the architecture into components with detailed specs
|
||||
**Constraints**: No code; only names, interfaces, inputs/outputs. Follow SRP strictly.
|
||||
|
||||
1. Identify components from the architecture; think about separation, reusability, and communication patterns
|
||||
2. Use integration test scenarios from Step 1 to validate component boundaries
|
||||
3. If additional components are needed (data preparation, shared helpers), create them
|
||||
4. For each component, write a spec using `templates/component-spec.md` as structure
|
||||
5. Generate diagrams:
|
||||
- draw.io component diagram showing relations (minimize line intersections, group semantically coherent components, place external users near their components)
|
||||
- Mermaid flowchart per main control flow
|
||||
6. Components can share and reuse common logic, same for multiple components. Hence for such occurences common-helpers folder is specified.
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Each component has a single, clear responsibility
|
||||
- [ ] No functionality is spread across multiple components
|
||||
- [ ] All inter-component interfaces are defined (who calls whom, with what)
|
||||
- [ ] Component dependency graph has no circular dependencies
|
||||
- [ ] All components from architecture.md are accounted for
|
||||
- [ ] Every integration test scenario can be traced through component interactions
|
||||
|
||||
**Save action**: Write:
|
||||
- each component `components/[##]_[name]/description.md`
|
||||
- common helper `common-helpers/[##]_helper_[name].md`
|
||||
- diagrams `diagrams/`
|
||||
|
||||
**BLOCKING**: Present component list with one-line summaries to user. Do NOT proceed until user confirms.
|
||||
|
||||
---
|
||||
|
||||
### Step 4: Architecture Review & Risk Assessment
|
||||
|
||||
**Role**: Professional software architect and analyst
|
||||
**Goal**: Validate all artifacts for consistency, then identify and mitigate risks
|
||||
**Constraints**: This is a review step — fix problems found, do not add new features
|
||||
|
||||
#### 4a. Evaluator Pass (re-read ALL artifacts)
|
||||
|
||||
Review checklist:
|
||||
- [ ] All components follow Single Responsibility Principle
|
||||
- [ ] All components follow dumb code / smart data principle
|
||||
- [ ] Inter-component interfaces are consistent (caller's output matches callee's input)
|
||||
- [ ] No circular dependencies in the dependency graph
|
||||
- [ ] No missing interactions between components
|
||||
- [ ] No over-engineering — is there a simpler decomposition?
|
||||
- [ ] Security considerations addressed in component design
|
||||
- [ ] Performance bottlenecks identified
|
||||
- [ ] API contracts are consistent across components
|
||||
|
||||
Fix any issues found before proceeding to risk identification.
|
||||
|
||||
#### 4b. Risk Identification
|
||||
|
||||
1. Identify technical and project risks
|
||||
2. Assess probability and impact using `templates/risk-register.md`
|
||||
3. Define mitigation strategies
|
||||
4. Apply mitigations to architecture, flows, and component documents where applicable
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Every High/Critical risk has a concrete mitigation strategy
|
||||
- [ ] Mitigations are reflected in the relevant component or architecture docs
|
||||
- [ ] No new risks introduced by the mitigations themselves
|
||||
|
||||
**Save action**: Write `risk_mitigations.md`
|
||||
|
||||
**BLOCKING**: Present risk summary to user. Ask whether assessment is sufficient.
|
||||
|
||||
**Iterative**: If user requests another round, repeat Step 4 and write `risk_mitigations_##.md` (## as sequence number). Continue until user confirms.
|
||||
|
||||
---
|
||||
|
||||
### Step 5: Test Specifications
|
||||
|
||||
**Role**: Professional Quality Assurance Engineer
|
||||
|
||||
**Goal**: Write test specs for each component achieving minimum 75% acceptance criteria coverage
|
||||
|
||||
**Constraints**: Test specs only — no test code. Each test must trace to an acceptance criterion.
|
||||
|
||||
1. For each component, write tests using `templates/test-spec.md` as structure
|
||||
2. Cover all 4 types: integration, performance, security, acceptance
|
||||
3. Include test data management (setup, teardown, isolation)
|
||||
4. Verify traceability: every acceptance criterion from `acceptance_criteria.md` must be covered by at least one test
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Every acceptance criterion has at least one test covering it
|
||||
- [ ] Test inputs are realistic and well-defined
|
||||
- [ ] Expected results are specific and measurable
|
||||
- [ ] No component is left without tests
|
||||
|
||||
**Save action**: Write each `components/[##]_[name]/tests.md`
|
||||
|
||||
---
|
||||
|
||||
### Step 6: Jira Epics
|
||||
|
||||
**Role**: Professional product manager
|
||||
|
||||
**Goal**: Create Jira epics from components, ordered by dependency
|
||||
|
||||
**Constraints**: Epic descriptions must be **comprehensive and self-contained** — a developer reading only the Jira epic should understand the full context without needing to open separate files.
|
||||
|
||||
1. **Create "Bootstrap & Initial Structure" epic first** — this epic will parent the `01_initial_structure` task created by the decompose skill. It covers project scaffolding: folder structure, shared models, interfaces, stubs, CI/CD config, DB migrations setup, test structure.
|
||||
2. Generate Jira Epics for each component using Jira MCP, structured per `templates/epic-spec.md`
|
||||
3. Order epics by dependency (Bootstrap epic is always first, then components based on their dependency graph)
|
||||
4. Include effort estimation per epic (T-shirt size or story points range)
|
||||
5. Ensure each epic has clear acceptance criteria cross-referenced with component specs
|
||||
6. Generate Mermaid diagrams showing component-to-epic mapping and component relationships
|
||||
|
||||
**CRITICAL — Epic description richness requirements**:
|
||||
|
||||
Each epic description in Jira MUST include ALL of the following sections with substantial content:
|
||||
- **System context**: where this component fits in the overall architecture (include Mermaid diagram showing this component's position and connections)
|
||||
- **Problem / Context**: what problem this component solves, why it exists, current pain points
|
||||
- **Scope**: detailed in-scope and out-of-scope lists
|
||||
- **Architecture notes**: relevant ADRs, technology choices, patterns used, key design decisions
|
||||
- **Interface specification**: full method signatures, input/output types, error types (from component description.md)
|
||||
- **Data flow**: how data enters and exits this component (include Mermaid sequence or flowchart diagram)
|
||||
- **Dependencies**: epic dependencies (with Jira IDs) and external dependencies (libraries, hardware, services)
|
||||
- **Acceptance criteria**: measurable criteria with specific thresholds (from component tests.md)
|
||||
- **Non-functional requirements**: latency, memory, throughput targets with failure thresholds
|
||||
- **Risks & mitigations**: relevant risks from risk_mitigations.md with concrete mitigation strategies
|
||||
- **Effort estimation**: T-shirt size and story points range
|
||||
- **Child issues**: planned task breakdown with complexity points
|
||||
- **Key constraints**: from restrictions.md that affect this component
|
||||
- **Testing strategy**: summary of test types and coverage from tests.md
|
||||
|
||||
Do NOT create minimal epics with just a summary and short description. The Jira epic is the primary reference document for the implementation team.
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] "Bootstrap & Initial Structure" epic exists and is first in order
|
||||
- [ ] "Integration Tests" epic exists
|
||||
- [ ] Every component maps to exactly one epic
|
||||
- [ ] Dependency order is respected (no epic depends on a later one)
|
||||
- [ ] Acceptance criteria are measurable
|
||||
- [ ] Effort estimates are realistic
|
||||
- [ ] Every epic description includes architecture diagram, interface spec, data flow, risks, and NFRs
|
||||
- [ ] Epic descriptions are self-contained — readable without opening other files
|
||||
|
||||
7. **Create "Integration Tests" epic** — this epic will parent the integration test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `integration_tests/`.
|
||||
|
||||
**Save action**: Epics created in Jira via MCP. Also saved locally in `epics.md` with Jira IDs.
|
||||
|
||||
---
|
||||
|
||||
## Quality Checklist (before FINAL_report.md)
|
||||
|
||||
Before writing the final report, verify ALL of the following:
|
||||
|
||||
### Integration Tests
|
||||
- [ ] Every acceptance criterion is covered in traceability_matrix.md
|
||||
- [ ] Every restriction is verified by at least one test
|
||||
- [ ] Positive and negative scenarios are balanced
|
||||
- [ ] Docker environment is self-contained
|
||||
- [ ] Consumer app treats main system as black box
|
||||
- [ ] CI/CD integration and reporting defined
|
||||
|
||||
### Architecture
|
||||
- [ ] Covers all capabilities from solution.md
|
||||
- [ ] Technology choices are justified
|
||||
- [ ] Deployment model is defined
|
||||
- [ ] Integration test findings are reflected in architecture decisions
|
||||
|
||||
### Data Model
|
||||
- [ ] Every entity from architecture.md is defined
|
||||
- [ ] Relationships have explicit cardinality
|
||||
- [ ] Migration strategy with reversibility requirement
|
||||
- [ ] Seed data requirements defined
|
||||
- [ ] Backward compatibility approach documented
|
||||
|
||||
### Deployment
|
||||
- [ ] Containerization plan covers all components
|
||||
- [ ] CI/CD pipeline includes lint, test, security, build, deploy stages
|
||||
- [ ] Environment strategy covers dev, staging, production
|
||||
- [ ] Observability covers logging, metrics, tracing, alerting
|
||||
- [ ] Deployment procedures include rollback and health checks
|
||||
|
||||
### Components
|
||||
- [ ] Every component follows SRP
|
||||
- [ ] No circular dependencies
|
||||
- [ ] All inter-component interfaces are defined and consistent
|
||||
- [ ] No orphan components (unused by any flow)
|
||||
- [ ] Every integration test scenario can be traced through component interactions
|
||||
|
||||
### Risks
|
||||
- [ ] All High/Critical risks have mitigations
|
||||
- [ ] Mitigations are reflected in component/architecture docs
|
||||
- [ ] User has confirmed risk assessment is sufficient
|
||||
|
||||
### Tests
|
||||
- [ ] Every acceptance criterion is covered by at least one test
|
||||
- [ ] All 4 test types are represented per component (where applicable)
|
||||
- [ ] Test data management is defined
|
||||
|
||||
### Epics
|
||||
- [ ] "Bootstrap & Initial Structure" epic exists
|
||||
- [ ] "Integration Tests" epic exists
|
||||
- [ ] Every component maps to an epic
|
||||
- [ ] Dependency order is correct
|
||||
- [ ] Acceptance criteria are measurable
|
||||
|
||||
**Save action**: Write `FINAL_report.md` using `templates/final-report.md` as structure
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- **Proceeding without input data**: all three data gate items (acceptance_criteria, restrictions, input_data) must be present before any planning begins
|
||||
- **Coding during planning**: this workflow produces documents, never code
|
||||
- **Multi-responsibility components**: if a component does two things, split it
|
||||
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
|
||||
- **Diagrams without data**: generate diagrams only after the underlying structure is documented
|
||||
- **Copy-pasting problem.md**: the architecture doc should analyze and transform, not repeat the input
|
||||
- **Vague interfaces**: "component A talks to component B" is not enough; define the method, input, output
|
||||
- **Ignoring restrictions.md**: every constraint must be traceable in the architecture or risk register
|
||||
- **Ignoring integration test findings**: insights from Step 1 must feed into architecture (Step 2) and component decomposition (Step 3)
|
||||
|
||||
## Escalation Rules
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — planning cannot proceed |
|
||||
| Ambiguous requirements | ASK user |
|
||||
| Input data coverage below 70% | Search internet for supplementary data, ASK user to validate |
|
||||
| Technology choice with multiple valid options | ASK user |
|
||||
| Component naming | PROCEED, confirm at next BLOCKING gate |
|
||||
| File structure within templates | PROCEED |
|
||||
| Contradictions between input files | ASK user |
|
||||
| Risk mitigation requires architecture change | ASK user |
|
||||
|
||||
## Methodology Quick Reference
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Solution Planning (6-Step Method) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ PREREQ 1: Data Gate (BLOCKING) │
|
||||
│ → verify AC, restrictions, input_data exist — STOP if not │
|
||||
│ PREREQ 2: Finalize solution draft │
|
||||
│ → rename highest solution_draft##.md to solution.md │
|
||||
│ PREREQ 3: Workspace setup │
|
||||
│ → create PLANS_DIR/ if needed │
|
||||
│ │
|
||||
│ 1. Integration Tests → integration_tests/ (5 files) │
|
||||
│ [BLOCKING: user confirms test coverage] │
|
||||
│ 2a. Architecture → architecture.md, system-flows.md │
|
||||
│ [BLOCKING: user confirms architecture] │
|
||||
│ 2b. Data Model → data_model.md │
|
||||
│ 2c. Deployment → deployment/ (5 files) │
|
||||
│ 3. Component Decompose → components/[##]_[name]/description │
|
||||
│ [BLOCKING: user confirms decomposition] │
|
||||
│ 4. Review & Risk → risk_mitigations.md │
|
||||
│ [BLOCKING: user confirms risks, iterative] │
|
||||
│ 5. Test Specifications → components/[##]_[name]/tests.md │
|
||||
│ 6. Jira Epics → Jira via MCP │
|
||||
│ ───────────────────────────────────────────────── │
|
||||
│ Quality Checklist → FINAL_report.md │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: SRP · Dumb code/smart data · Save immediately │
|
||||
│ Ask don't assume · Plan don't code │
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,128 @@
|
||||
# Architecture Document Template
|
||||
|
||||
Use this template for the architecture document. Save as `_docs/02_plans/architecture.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [System Name] — Architecture
|
||||
|
||||
## 1. System Context
|
||||
|
||||
**Problem being solved**: [One paragraph summarizing the problem from problem.md]
|
||||
|
||||
**System boundaries**: [What is inside the system vs. external]
|
||||
|
||||
**External systems**:
|
||||
|
||||
| System | Integration Type | Direction | Purpose |
|
||||
|--------|-----------------|-----------|---------|
|
||||
| [name] | REST / Queue / DB / File | Inbound / Outbound / Both | [why] |
|
||||
|
||||
## 2. Technology Stack
|
||||
|
||||
| Layer | Technology | Version | Rationale |
|
||||
|-------|-----------|---------|-----------|
|
||||
| Language | | | |
|
||||
| Framework | | | |
|
||||
| Database | | | |
|
||||
| Cache | | | |
|
||||
| Message Queue | | | |
|
||||
| Hosting | | | |
|
||||
| CI/CD | | | |
|
||||
|
||||
**Key constraints from restrictions.md**:
|
||||
- [Constraint 1 and how it affects technology choices]
|
||||
- [Constraint 2]
|
||||
|
||||
## 3. Deployment Model
|
||||
|
||||
**Environments**: Development, Staging, Production
|
||||
|
||||
**Infrastructure**:
|
||||
- [Cloud provider / On-prem / Hybrid]
|
||||
- [Container orchestration if applicable]
|
||||
- [Scaling strategy: horizontal / vertical / auto]
|
||||
|
||||
**Environment-specific configuration**:
|
||||
|
||||
| Config | Development | Production |
|
||||
|--------|-------------|------------|
|
||||
| Database | [local/docker] | [managed service] |
|
||||
| Secrets | [.env file] | [secret manager] |
|
||||
| Logging | [console] | [centralized] |
|
||||
|
||||
## 4. Data Model Overview
|
||||
|
||||
> High-level data model covering the entire system. Detailed per-component models go in component specs.
|
||||
|
||||
**Core entities**:
|
||||
|
||||
| Entity | Description | Owned By Component |
|
||||
|--------|-------------|--------------------|
|
||||
| [entity] | [what it represents] | [component ##] |
|
||||
|
||||
**Key relationships**:
|
||||
- [Entity A] → [Entity B]: [relationship description]
|
||||
|
||||
**Data flow summary**:
|
||||
- [Source] → [Transform] → [Destination]: [what data and why]
|
||||
|
||||
## 5. Integration Points
|
||||
|
||||
### Internal Communication
|
||||
|
||||
| From | To | Protocol | Pattern | Notes |
|
||||
|------|----|----------|---------|-------|
|
||||
| [component] | [component] | Sync REST / Async Queue / Direct call | Request-Response / Event / Command | |
|
||||
|
||||
### External Integrations
|
||||
|
||||
| External System | Protocol | Auth | Rate Limits | Failure Mode |
|
||||
|----------------|----------|------|-------------|--------------|
|
||||
| [system] | [REST/gRPC/etc] | [API key/OAuth/etc] | [limits] | [retry/circuit breaker/fallback] |
|
||||
|
||||
## 6. Non-Functional Requirements
|
||||
|
||||
| Requirement | Target | Measurement | Priority |
|
||||
|------------|--------|-------------|----------|
|
||||
| Availability | [e.g., 99.9%] | [how measured] | High/Medium/Low |
|
||||
| Latency (p95) | [e.g., <200ms] | [endpoint/operation] | |
|
||||
| Throughput | [e.g., 1000 req/s] | [peak/sustained] | |
|
||||
| Data retention | [e.g., 90 days] | [which data] | |
|
||||
| Recovery (RPO/RTO) | [e.g., RPO 1hr, RTO 4hr] | | |
|
||||
| Scalability | [e.g., 10x current load] | [timeline] | |
|
||||
|
||||
## 7. Security Architecture
|
||||
|
||||
**Authentication**: [mechanism — JWT / session / API key]
|
||||
|
||||
**Authorization**: [RBAC / ABAC / per-resource]
|
||||
|
||||
**Data protection**:
|
||||
- At rest: [encryption method]
|
||||
- In transit: [TLS version]
|
||||
- Secrets management: [tool/approach]
|
||||
|
||||
**Audit logging**: [what is logged, where, retention]
|
||||
|
||||
## 8. Key Architectural Decisions
|
||||
|
||||
Record significant decisions that shaped the architecture.
|
||||
|
||||
### ADR-001: [Decision Title]
|
||||
|
||||
**Context**: [Why this decision was needed]
|
||||
|
||||
**Decision**: [What was decided]
|
||||
|
||||
**Alternatives considered**:
|
||||
1. [Alternative 1] — rejected because [reason]
|
||||
2. [Alternative 2] — rejected because [reason]
|
||||
|
||||
**Consequences**: [Trade-offs accepted]
|
||||
|
||||
### ADR-002: [Decision Title]
|
||||
|
||||
...
|
||||
```
|
||||
@@ -0,0 +1,156 @@
|
||||
# Component Specification Template
|
||||
|
||||
Use this template for each component. Save as `components/[##]_[name]/description.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [Component Name]
|
||||
|
||||
## 1. High-Level Overview
|
||||
|
||||
**Purpose**: [One sentence: what this component does and its role in the system]
|
||||
|
||||
**Architectural Pattern**: [e.g., Repository, Event-driven, Pipeline, Facade, etc.]
|
||||
|
||||
**Upstream dependencies**: [Components that this component calls or consumes from]
|
||||
|
||||
**Downstream consumers**: [Components that call or consume from this component]
|
||||
|
||||
## 2. Internal Interfaces
|
||||
|
||||
For each interface this component exposes internally:
|
||||
|
||||
### Interface: [InterfaceName]
|
||||
|
||||
| Method | Input | Output | Async | Error Types |
|
||||
|--------|-------|--------|-------|-------------|
|
||||
| `method_name` | `InputDTO` | `OutputDTO` | Yes/No | `ErrorType1`, `ErrorType2` |
|
||||
|
||||
**Input DTOs**:
|
||||
```
|
||||
[DTO name]:
|
||||
field_1: type (required/optional) — description
|
||||
field_2: type (required/optional) — description
|
||||
```
|
||||
|
||||
**Output DTOs**:
|
||||
```
|
||||
[DTO name]:
|
||||
field_1: type — description
|
||||
field_2: type — description
|
||||
```
|
||||
|
||||
## 3. External API Specification
|
||||
|
||||
> Include this section only if the component exposes an external HTTP/gRPC API.
|
||||
> Skip if the component is internal-only.
|
||||
|
||||
| Endpoint | Method | Auth | Rate Limit | Description |
|
||||
|----------|--------|------|------------|-------------|
|
||||
| `/api/v1/...` | GET/POST/PUT/DELETE | Required/Public | X req/min | Brief description |
|
||||
|
||||
**Request/Response schemas**: define per endpoint using OpenAPI-style notation.
|
||||
|
||||
**Example request/response**:
|
||||
```json
|
||||
// Request
|
||||
{ }
|
||||
|
||||
// Response
|
||||
{ }
|
||||
```
|
||||
|
||||
## 4. Data Access Patterns
|
||||
|
||||
### Queries
|
||||
|
||||
| Query | Frequency | Hot Path | Index Needed |
|
||||
|-------|-----------|----------|--------------|
|
||||
| [describe query] | High/Medium/Low | Yes/No | Yes/No |
|
||||
|
||||
### Caching Strategy
|
||||
|
||||
| Data | Cache Type | TTL | Invalidation |
|
||||
|------|-----------|-----|-------------|
|
||||
| [data item] | In-memory / Redis / None | [duration] | [trigger] |
|
||||
|
||||
### Storage Estimates
|
||||
|
||||
| Table/Collection | Est. Row Count (1yr) | Row Size | Total Size | Growth Rate |
|
||||
|-----------------|---------------------|----------|------------|-------------|
|
||||
| [table_name] | | | | /month |
|
||||
|
||||
### Data Management
|
||||
|
||||
**Seed data**: [Required seed data and how to load it]
|
||||
|
||||
**Rollback**: [Rollback procedure for this component's data changes]
|
||||
|
||||
## 5. Implementation Details
|
||||
|
||||
**Algorithmic Complexity**: [Big O for critical methods — only if non-trivial]
|
||||
|
||||
**State Management**: [Local state / Global state / Stateless — explain how state is handled]
|
||||
|
||||
**Key Dependencies**: [External libraries and their purpose]
|
||||
|
||||
| Library | Version | Purpose |
|
||||
|---------|---------|---------|
|
||||
| [name] | [version] | [why needed] |
|
||||
|
||||
**Error Handling Strategy**:
|
||||
- [How errors are caught, propagated, and reported]
|
||||
- [Retry policy if applicable]
|
||||
- [Circuit breaker if applicable]
|
||||
|
||||
## 6. Extensions and Helpers
|
||||
|
||||
> List any shared utilities this component needs that should live in a `helpers/` folder.
|
||||
|
||||
| Helper | Purpose | Used By |
|
||||
|--------|---------|---------|
|
||||
| [helper_name] | [what it does] | [list of components] |
|
||||
|
||||
## 7. Caveats & Edge Cases
|
||||
|
||||
**Known limitations**:
|
||||
- [Limitation 1]
|
||||
|
||||
**Potential race conditions**:
|
||||
- [Race condition scenario, if any]
|
||||
|
||||
**Performance bottlenecks**:
|
||||
- [Bottleneck description and mitigation approach]
|
||||
|
||||
## 8. Dependency Graph
|
||||
|
||||
**Must be implemented after**: [list of component numbers/names]
|
||||
|
||||
**Can be implemented in parallel with**: [list of component numbers/names]
|
||||
|
||||
**Blocks**: [list of components that depend on this one]
|
||||
|
||||
## 9. Logging Strategy
|
||||
|
||||
| Log Level | When | Example |
|
||||
|-----------|------|---------|
|
||||
| ERROR | Unrecoverable failures | `Failed to process order {id}: {error}` |
|
||||
| WARN | Recoverable issues | `Retry attempt {n} for {operation}` |
|
||||
| INFO | Key business events | `Order {id} created by user {uid}` |
|
||||
| DEBUG | Development diagnostics | `Query returned {n} rows in {ms}ms` |
|
||||
|
||||
**Log format**: [structured JSON / plaintext — match system standard]
|
||||
|
||||
**Log storage**: [stdout / file / centralized logging service]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidance Notes
|
||||
|
||||
- **Section 3 (External API)**: skip entirely for internal-only components. Include for any component that exposes HTTP endpoints, WebSocket connections, or gRPC services.
|
||||
- **Section 4 (Storage Estimates)**: critical for components that manage persistent data. Skip for stateless components.
|
||||
- **Section 5 (Algorithmic Complexity)**: only document if the algorithm is non-trivial (O(n^2) or worse, recursive, etc.). Simple CRUD operations don't need this.
|
||||
- **Section 6 (Helpers)**: if the helper is used by only one component, keep it inside that component. Only extract to `helpers/` if shared by 2+ components.
|
||||
- **Section 8 (Dependency Graph)**: this is essential for determining implementation order. Be precise about what "depends on" means — data dependency, API dependency, or shared infrastructure.
|
||||
@@ -0,0 +1,127 @@
|
||||
# Jira Epic Template
|
||||
|
||||
Use this template for each Jira epic. Create epics via Jira MCP.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
## Epic: [Component Name] — [Outcome]
|
||||
|
||||
**Example**: Data Ingestion — Near-real-time pipeline
|
||||
|
||||
### Epic Summary
|
||||
|
||||
[1-2 sentences: what we are building + why it matters]
|
||||
|
||||
### Problem / Context
|
||||
|
||||
[Current state, pain points, constraints, business opportunities.
|
||||
Link to architecture.md and relevant component spec.]
|
||||
|
||||
### Scope
|
||||
|
||||
**In Scope**:
|
||||
- [Capability 1 — describe what, not how]
|
||||
- [Capability 2]
|
||||
- [Capability 3]
|
||||
|
||||
**Out of Scope**:
|
||||
- [Explicit exclusion 1 — prevents scope creep]
|
||||
- [Explicit exclusion 2]
|
||||
|
||||
### Assumptions
|
||||
|
||||
- [System design assumption]
|
||||
- [Data structure assumption]
|
||||
- [Infrastructure assumption]
|
||||
|
||||
### Dependencies
|
||||
|
||||
**Epic dependencies** (must be completed first):
|
||||
- [Epic name / ID]
|
||||
|
||||
**External dependencies**:
|
||||
- [Services, hardware, environments, certificates, data sources]
|
||||
|
||||
### Effort Estimation
|
||||
|
||||
**T-shirt size**: S / M / L / XL
|
||||
**Story points range**: [min]-[max]
|
||||
|
||||
### Users / Consumers
|
||||
|
||||
| Type | Who | Key Use Cases |
|
||||
|------|-----|--------------|
|
||||
| Internal | [team/role] | [use case] |
|
||||
| External | [user type] | [use case] |
|
||||
| System | [service name] | [integration point] |
|
||||
|
||||
### Requirements
|
||||
|
||||
**Functional**:
|
||||
- [API expectations, events, data handling]
|
||||
- [Idempotency, retry behavior]
|
||||
|
||||
**Non-functional**:
|
||||
- [Availability, latency, throughput targets]
|
||||
- [Scalability, processing limits, data retention]
|
||||
|
||||
**Security / Compliance**:
|
||||
- [Authentication, encryption, secrets management]
|
||||
- [Logging, audit trail]
|
||||
- [SOC2 / ISO / GDPR if applicable]
|
||||
|
||||
### Design & Architecture
|
||||
|
||||
- Architecture doc: `_docs/02_plans/architecture.md`
|
||||
- Component spec: `_docs/02_plans/components/[##]_[name]/description.md`
|
||||
- System flows: `_docs/02_plans/system-flows.md`
|
||||
|
||||
### Definition of Done
|
||||
|
||||
- [ ] All in-scope capabilities implemented
|
||||
- [ ] Automated tests pass (unit + integration + e2e)
|
||||
- [ ] Minimum coverage threshold met (75%)
|
||||
- [ ] Runbooks written (if applicable)
|
||||
- [ ] Documentation updated
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
| # | Criterion | Measurable Condition |
|
||||
|---|-----------|---------------------|
|
||||
| 1 | [criterion] | [how to verify] |
|
||||
| 2 | [criterion] | [how to verify] |
|
||||
|
||||
### Risks & Mitigations
|
||||
|
||||
| # | Risk | Mitigation | Owner |
|
||||
|---|------|------------|-------|
|
||||
| 1 | [top risk] | [mitigation] | [owner] |
|
||||
| 2 | | | |
|
||||
| 3 | | | |
|
||||
|
||||
### Labels
|
||||
|
||||
- `component:[name]`
|
||||
- `env:prod` / `env:stg`
|
||||
- `type:platform` / `type:data` / `type:integration`
|
||||
|
||||
### Child Issues
|
||||
|
||||
| Type | Title | Points |
|
||||
|------|-------|--------|
|
||||
| Spike | [research/investigation task] | [1-3] |
|
||||
| Task | [implementation task] | [1-5] |
|
||||
| Task | [implementation task] | [1-5] |
|
||||
| Enabler | [infrastructure/setup task] | [1-3] |
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidance Notes
|
||||
|
||||
- Be concise. Fewer words with the same meaning = better epic.
|
||||
- Capabilities in scope are "what", not "how" — avoid describing implementation details.
|
||||
- Dependency order matters: epics that must be done first should be listed earlier in the backlog.
|
||||
- Every epic maps to exactly one component. If a component is too large for one epic, split the component first.
|
||||
- Complexity points for child issues follow the project standard: 1, 2, 3, 5, 8. Do not create issues above 5 points — split them.
|
||||
@@ -0,0 +1,104 @@
|
||||
# Final Planning Report Template
|
||||
|
||||
Use this template after completing all 5 steps and the quality checklist. Save as `_docs/02_plans/FINAL_report.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [System Name] — Planning Report
|
||||
|
||||
## Executive Summary
|
||||
|
||||
[2-3 sentences: what was planned, the core architectural approach, and the key outcome (number of components, epics, estimated effort)]
|
||||
|
||||
## Problem Statement
|
||||
|
||||
[Brief restatement from problem.md — transformed, not copy-pasted]
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
[Key architectural decisions and technology stack summary. Reference `architecture.md` for full details.]
|
||||
|
||||
**Technology stack**: [language, framework, database, hosting — one line]
|
||||
|
||||
**Deployment**: [environment strategy — one line]
|
||||
|
||||
## Component Summary
|
||||
|
||||
| # | Component | Purpose | Dependencies | Epic |
|
||||
|---|-----------|---------|-------------|------|
|
||||
| 01 | [name] | [one-line purpose] | — | [Jira ID] |
|
||||
| 02 | [name] | [one-line purpose] | 01 | [Jira ID] |
|
||||
| ... | | | | |
|
||||
|
||||
**Implementation order** (based on dependency graph):
|
||||
1. [Phase 1: components that can start immediately]
|
||||
2. [Phase 2: components that depend on Phase 1]
|
||||
3. [Phase 3: ...]
|
||||
|
||||
## System Flows
|
||||
|
||||
| Flow | Description | Key Components |
|
||||
|------|-------------|---------------|
|
||||
| [name] | [one-line summary] | [component list] |
|
||||
|
||||
[Reference `system-flows.md` for full diagrams and details.]
|
||||
|
||||
## Risk Summary
|
||||
|
||||
| Level | Count | Key Risks |
|
||||
|-------|-------|-----------|
|
||||
| Critical | [N] | [brief list] |
|
||||
| High | [N] | [brief list] |
|
||||
| Medium | [N] | — |
|
||||
| Low | [N] | — |
|
||||
|
||||
**Iterations completed**: [N]
|
||||
**All Critical/High risks mitigated**: Yes / No — [details if No]
|
||||
|
||||
[Reference `risk_mitigations.md` for full register.]
|
||||
|
||||
## Test Coverage
|
||||
|
||||
| Component | Integration | Performance | Security | Acceptance | AC Coverage |
|
||||
|-----------|-------------|-------------|----------|------------|-------------|
|
||||
| [name] | [N tests] | [N tests] | [N tests] | [N tests] | [X/Y ACs] |
|
||||
| ... | | | | | |
|
||||
|
||||
**Overall acceptance criteria coverage**: [X / Y total ACs covered] ([percentage]%)
|
||||
|
||||
## Epic Roadmap
|
||||
|
||||
| Order | Epic | Component | Effort | Dependencies |
|
||||
|-------|------|-----------|--------|-------------|
|
||||
| 1 | [Jira ID]: [name] | [component] | [S/M/L/XL] | — |
|
||||
| 2 | [Jira ID]: [name] | [component] | [S/M/L/XL] | Epic 1 |
|
||||
| ... | | | | |
|
||||
|
||||
**Total estimated effort**: [sum or range]
|
||||
|
||||
## Key Decisions Made
|
||||
|
||||
| # | Decision | Rationale | Alternatives Rejected |
|
||||
|---|----------|-----------|----------------------|
|
||||
| 1 | [decision] | [why] | [what was rejected] |
|
||||
| 2 | | | |
|
||||
|
||||
## Open Questions
|
||||
|
||||
| # | Question | Impact | Assigned To |
|
||||
|---|----------|--------|-------------|
|
||||
| 1 | [unresolved question] | [what it blocks or affects] | [who should answer] |
|
||||
|
||||
## Artifact Index
|
||||
|
||||
| File | Description |
|
||||
|------|-------------|
|
||||
| `architecture.md` | System architecture |
|
||||
| `system-flows.md` | System flows and diagrams |
|
||||
| `components/01_[name]/description.md` | Component spec |
|
||||
| `components/01_[name]/tests.md` | Test spec |
|
||||
| `risk_mitigations.md` | Risk register |
|
||||
| `diagrams/components.drawio` | Component diagram |
|
||||
| `diagrams/flows/flow_[name].md` | Flow diagrams |
|
||||
```
|
||||
@@ -0,0 +1,90 @@
|
||||
# E2E Test Environment Template
|
||||
|
||||
Save as `PLANS_DIR/integration_tests/environment.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# E2E Test Environment
|
||||
|
||||
## Overview
|
||||
|
||||
**System under test**: [main system name and entry points — API URLs, message queues, serial ports, etc.]
|
||||
**Consumer app purpose**: Standalone application that exercises the main system through its public interfaces, validating end-to-end use cases without access to internals.
|
||||
|
||||
## Docker Environment
|
||||
|
||||
### Services
|
||||
|
||||
| Service | Image / Build | Purpose | Ports |
|
||||
|---------|--------------|---------|-------|
|
||||
| system-under-test | [main app image or build context] | The main system being tested | [ports] |
|
||||
| test-db | [postgres/mysql/etc.] | Database for the main system | [ports] |
|
||||
| e2e-consumer | [build context for consumer app] | Black-box test runner | — |
|
||||
| [dependency] | [image] | [purpose — cache, queue, mock, etc.] | [ports] |
|
||||
|
||||
### Networks
|
||||
|
||||
| Network | Services | Purpose |
|
||||
|---------|----------|---------|
|
||||
| e2e-net | all | Isolated test network |
|
||||
|
||||
### Volumes
|
||||
|
||||
| Volume | Mounted to | Purpose |
|
||||
|--------|-----------|---------|
|
||||
| [name] | [service:path] | [test data, DB persistence, etc.] |
|
||||
|
||||
### docker-compose structure
|
||||
|
||||
```yaml
|
||||
# Outline only — not runnable code
|
||||
services:
|
||||
system-under-test:
|
||||
# main system
|
||||
test-db:
|
||||
# database
|
||||
e2e-consumer:
|
||||
# consumer test app
|
||||
depends_on:
|
||||
- system-under-test
|
||||
```
|
||||
|
||||
## Consumer Application
|
||||
|
||||
**Tech stack**: [language, framework, test runner]
|
||||
**Entry point**: [how it starts — e.g., pytest, jest, custom runner]
|
||||
|
||||
### Communication with system under test
|
||||
|
||||
| Interface | Protocol | Endpoint / Topic | Authentication |
|
||||
|-----------|----------|-----------------|----------------|
|
||||
| [API name] | [HTTP/gRPC/AMQP/etc.] | [URL or topic] | [method] |
|
||||
|
||||
### What the consumer does NOT have access to
|
||||
|
||||
- No direct database access to the main system
|
||||
- No internal module imports
|
||||
- No shared memory or file system with the main system
|
||||
|
||||
## CI/CD Integration
|
||||
|
||||
**When to run**: [e.g., on PR merge to dev, nightly, before production deploy]
|
||||
**Pipeline stage**: [where in the CI pipeline this fits]
|
||||
**Gate behavior**: [block merge / warning only / manual approval]
|
||||
**Timeout**: [max total suite duration before considered failed]
|
||||
|
||||
## Reporting
|
||||
|
||||
**Format**: CSV
|
||||
**Columns**: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)
|
||||
**Output path**: [where the CSV is written — e.g., ./e2e-results/report.csv]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidance Notes
|
||||
|
||||
- The consumer app must treat the main system as a true black box — no internal imports, no direct DB queries against the main system's database.
|
||||
- Docker environment should be self-contained — `docker compose up` must be sufficient to run the full suite.
|
||||
- If the main system requires external services (payment gateways, third-party APIs), define mock/stub services in the Docker environment.
|
||||
@@ -0,0 +1,78 @@
|
||||
# E2E Functional Tests Template
|
||||
|
||||
Save as `PLANS_DIR/integration_tests/functional_tests.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# E2E Functional Tests
|
||||
|
||||
## Positive Scenarios
|
||||
|
||||
### FT-P-01: [Scenario Name]
|
||||
|
||||
**Summary**: [One sentence: what end-to-end use case this validates]
|
||||
**Traces to**: AC-[ID], AC-[ID]
|
||||
**Category**: [which AC category — e.g., Position Accuracy, Image Processing, etc.]
|
||||
|
||||
**Preconditions**:
|
||||
- [System state required before test]
|
||||
|
||||
**Input data**: [reference to specific data set or file from test_data.md]
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|
||||
|------|----------------|------------------------|
|
||||
| 1 | [call / send / provide input] | [response / event / output] |
|
||||
| 2 | [call / send / provide input] | [response / event / output] |
|
||||
|
||||
**Expected outcome**: [specific, measurable result]
|
||||
**Max execution time**: [e.g., 10s]
|
||||
|
||||
---
|
||||
|
||||
### FT-P-02: [Scenario Name]
|
||||
|
||||
(repeat structure)
|
||||
|
||||
---
|
||||
|
||||
## Negative Scenarios
|
||||
|
||||
### FT-N-01: [Scenario Name]
|
||||
|
||||
**Summary**: [One sentence: what invalid/edge input this tests]
|
||||
**Traces to**: AC-[ID] (negative case), RESTRICT-[ID]
|
||||
**Category**: [which AC/restriction category]
|
||||
|
||||
**Preconditions**:
|
||||
- [System state required before test]
|
||||
|
||||
**Input data**: [reference to specific invalid data or edge case]
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected System Response |
|
||||
|------|----------------|------------------------|
|
||||
| 1 | [provide invalid input / trigger edge case] | [error response / graceful degradation / fallback behavior] |
|
||||
|
||||
**Expected outcome**: [system rejects gracefully / falls back to X / returns error Y]
|
||||
**Max execution time**: [e.g., 5s]
|
||||
|
||||
---
|
||||
|
||||
### FT-N-02: [Scenario Name]
|
||||
|
||||
(repeat structure)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidance Notes
|
||||
|
||||
- Functional tests should typically trace to at least one acceptance criterion or restriction. Tests without a trace are allowed but should have a clear justification.
|
||||
- Positive scenarios validate the system does what it should.
|
||||
- Negative scenarios validate the system rejects or handles gracefully what it shouldn't accept.
|
||||
- Expected outcomes must be specific and measurable — not "works correctly" but "returns position within 50m of ground truth."
|
||||
- Input data references should point to specific entries in test_data.md.
|
||||
@@ -0,0 +1,97 @@
|
||||
# E2E Non-Functional Tests Template
|
||||
|
||||
Save as `PLANS_DIR/integration_tests/non_functional_tests.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# E2E Non-Functional Tests
|
||||
|
||||
## Performance Tests
|
||||
|
||||
### NFT-PERF-01: [Test Name]
|
||||
|
||||
**Summary**: [What performance characteristic this validates]
|
||||
**Traces to**: AC-[ID]
|
||||
**Metric**: [what is measured — latency, throughput, frame rate, etc.]
|
||||
|
||||
**Preconditions**:
|
||||
- [System state, load profile, data volume]
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Measurement |
|
||||
|------|----------------|-------------|
|
||||
| 1 | [action] | [what to measure and how] |
|
||||
|
||||
**Pass criteria**: [specific threshold — e.g., p95 latency < 400ms]
|
||||
**Duration**: [how long the test runs]
|
||||
|
||||
---
|
||||
|
||||
## Resilience Tests
|
||||
|
||||
### NFT-RES-01: [Test Name]
|
||||
|
||||
**Summary**: [What failure/recovery scenario this validates]
|
||||
**Traces to**: AC-[ID]
|
||||
|
||||
**Preconditions**:
|
||||
- [System state before fault injection]
|
||||
|
||||
**Fault injection**:
|
||||
- [What fault is introduced — process kill, network partition, invalid input sequence, etc.]
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Action | Expected Behavior |
|
||||
|------|--------|------------------|
|
||||
| 1 | [inject fault] | [system behavior during fault] |
|
||||
| 2 | [observe recovery] | [system behavior after recovery] |
|
||||
|
||||
**Pass criteria**: [recovery time, data integrity, continued operation]
|
||||
|
||||
---
|
||||
|
||||
## Security Tests
|
||||
|
||||
### NFT-SEC-01: [Test Name]
|
||||
|
||||
**Summary**: [What security property this validates]
|
||||
**Traces to**: AC-[ID], RESTRICT-[ID]
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Consumer Action | Expected Response |
|
||||
|------|----------------|------------------|
|
||||
| 1 | [attempt unauthorized access / injection / etc.] | [rejection / no data leak / etc.] |
|
||||
|
||||
**Pass criteria**: [specific security outcome]
|
||||
|
||||
---
|
||||
|
||||
## Resource Limit Tests
|
||||
|
||||
### NFT-RES-LIM-01: [Test Name]
|
||||
|
||||
**Summary**: [What resource constraint this validates]
|
||||
**Traces to**: AC-[ID], RESTRICT-[ID]
|
||||
|
||||
**Preconditions**:
|
||||
- [System running under specified constraints]
|
||||
|
||||
**Monitoring**:
|
||||
- [What resources to monitor — memory, CPU, GPU, disk, temperature]
|
||||
|
||||
**Duration**: [how long to run]
|
||||
**Pass criteria**: [resource stays within limit — e.g., memory < 8GB throughout]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidance Notes
|
||||
|
||||
- Performance tests should run long enough to capture steady-state behavior, not just cold-start.
|
||||
- Resilience tests must define both the fault and the expected recovery — not just "system should recover."
|
||||
- Security tests at E2E level focus on black-box attacks (unauthorized API calls, malformed input), not code-level vulnerabilities.
|
||||
- Resource limit tests must specify monitoring duration — short bursts don't prove sustained compliance.
|
||||
@@ -0,0 +1,46 @@
|
||||
# E2E Test Data Template
|
||||
|
||||
Save as `PLANS_DIR/integration_tests/test_data.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# E2E Test Data Management
|
||||
|
||||
## Seed Data Sets
|
||||
|
||||
| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|
||||
|----------|-------------|---------------|-----------|---------|
|
||||
| [name] | [what it contains] | [test IDs] | [SQL script / API call / fixture file / volume mount] | [how removed after test] |
|
||||
|
||||
## Data Isolation Strategy
|
||||
|
||||
[e.g., each test run gets a fresh container restart, or transactions are rolled back, or namespaced data, or separate DB per test group]
|
||||
|
||||
## Input Data Mapping
|
||||
|
||||
| Input Data File | Source Location | Description | Covers Scenarios |
|
||||
|-----------------|----------------|-------------|-----------------|
|
||||
| [filename] | `_docs/00_problem/input_data/[filename]` | [what it contains] | [test IDs that use this data] |
|
||||
|
||||
## External Dependency Mocks
|
||||
|
||||
| External Service | Mock/Stub | How Provided | Behavior |
|
||||
|-----------------|-----------|-------------|----------|
|
||||
| [service name] | [mock type] | [Docker service / in-process stub / recorded responses] | [what it returns / simulates] |
|
||||
|
||||
## Data Validation Rules
|
||||
|
||||
| Data Type | Validation | Invalid Examples | Expected System Behavior |
|
||||
|-----------|-----------|-----------------|------------------------|
|
||||
| [type] | [rules] | [invalid input examples] | [how system should respond] |
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidance Notes
|
||||
|
||||
- Every seed data set should be traceable to specific test scenarios.
|
||||
- Input data from `_docs/00_problem/input_data/` should be mapped to test scenarios that use it.
|
||||
- External mocks must be deterministic — same input always produces same output.
|
||||
- Data isolation must guarantee no test can affect another test's outcome.
|
||||
@@ -0,0 +1,47 @@
|
||||
# E2E Traceability Matrix Template
|
||||
|
||||
Save as `PLANS_DIR/integration_tests/traceability_matrix.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# E2E Traceability Matrix
|
||||
|
||||
## Acceptance Criteria Coverage
|
||||
|
||||
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|
||||
|-------|---------------------|----------|----------|
|
||||
| AC-01 | [criterion text] | FT-P-01, NFT-PERF-01 | Covered |
|
||||
| AC-02 | [criterion text] | FT-P-02, FT-N-01 | Covered |
|
||||
| AC-03 | [criterion text] | — | NOT COVERED — [reason and mitigation] |
|
||||
|
||||
## Restrictions Coverage
|
||||
|
||||
| Restriction ID | Restriction | Test IDs | Coverage |
|
||||
|---------------|-------------|----------|----------|
|
||||
| RESTRICT-01 | [restriction text] | FT-N-02, NFT-RES-LIM-01 | Covered |
|
||||
| RESTRICT-02 | [restriction text] | — | NOT COVERED — [reason and mitigation] |
|
||||
|
||||
## Coverage Summary
|
||||
|
||||
| Category | Total Items | Covered | Not Covered | Coverage % |
|
||||
|----------|-----------|---------|-------------|-----------|
|
||||
| Acceptance Criteria | [N] | [N] | [N] | [%] |
|
||||
| Restrictions | [N] | [N] | [N] | [%] |
|
||||
| **Total** | [N] | [N] | [N] | [%] |
|
||||
|
||||
## Uncovered Items Analysis
|
||||
|
||||
| Item | Reason Not Covered | Risk | Mitigation |
|
||||
|------|-------------------|------|-----------|
|
||||
| [AC/Restriction ID] | [why it cannot be tested at E2E level] | [what could go wrong] | [how risk is addressed — e.g., covered by component tests in Step 5] |
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidance Notes
|
||||
|
||||
- Every acceptance criterion must appear in the matrix — either covered or explicitly marked as not covered with a reason.
|
||||
- Every restriction must appear in the matrix.
|
||||
- NOT COVERED items must have a reason and a mitigation strategy (e.g., "covered at component test level" or "requires real hardware").
|
||||
- Coverage percentage should be at least 75% for acceptance criteria at the E2E level.
|
||||
@@ -0,0 +1,99 @@
|
||||
# Risk Register Template
|
||||
|
||||
Use this template for risk assessment. Save as `_docs/02_plans/risk_mitigations.md`.
|
||||
Subsequent iterations: `risk_mitigations_02.md`, `risk_mitigations_03.md`, etc.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# Risk Assessment — [Topic] — Iteration [##]
|
||||
|
||||
## Risk Scoring Matrix
|
||||
|
||||
| | Low Impact | Medium Impact | High Impact |
|
||||
|--|------------|---------------|-------------|
|
||||
| **High Probability** | Medium | High | Critical |
|
||||
| **Medium Probability** | Low | Medium | High |
|
||||
| **Low Probability** | Low | Low | Medium |
|
||||
|
||||
## Acceptance Criteria by Risk Level
|
||||
|
||||
| Level | Action Required |
|
||||
|-------|----------------|
|
||||
| Low | Accepted, monitored quarterly |
|
||||
| Medium | Mitigation plan required before implementation |
|
||||
| High | Mitigation + contingency plan required, reviewed weekly |
|
||||
| Critical | Must be resolved before proceeding to next planning step |
|
||||
|
||||
## Risk Register
|
||||
|
||||
| ID | Risk | Category | Probability | Impact | Score | Mitigation | Owner | Status |
|
||||
|----|------|----------|-------------|--------|-------|------------|-------|--------|
|
||||
| R01 | [risk description] | [category] | High/Med/Low | High/Med/Low | Critical/High/Med/Low | [mitigation strategy] | [owner] | Open/Mitigated/Accepted |
|
||||
| R02 | | | | | | | | |
|
||||
|
||||
## Risk Categories
|
||||
|
||||
### Technical Risks
|
||||
- Technology choices may not meet requirements
|
||||
- Integration complexity underestimated
|
||||
- Performance targets unachievable
|
||||
- Security vulnerabilities in design
|
||||
- Data model cannot support future requirements
|
||||
|
||||
### Schedule Risks
|
||||
- Dependencies delayed
|
||||
- Scope creep from ambiguous requirements
|
||||
- Underestimated complexity
|
||||
|
||||
### Resource Risks
|
||||
- Key person dependency
|
||||
- Team lacks experience with chosen technology
|
||||
- Infrastructure not available in time
|
||||
|
||||
### External Risks
|
||||
- Third-party API changes or deprecation
|
||||
- Vendor reliability or pricing changes
|
||||
- Regulatory or compliance changes
|
||||
- Data source availability
|
||||
|
||||
## Detailed Risk Analysis
|
||||
|
||||
### R01: [Risk Title]
|
||||
|
||||
**Description**: [Detailed description of the risk]
|
||||
|
||||
**Trigger conditions**: [What would cause this risk to materialize]
|
||||
|
||||
**Affected components**: [List of components impacted]
|
||||
|
||||
**Mitigation strategy**:
|
||||
1. [Action 1]
|
||||
2. [Action 2]
|
||||
|
||||
**Contingency plan**: [What to do if mitigation fails]
|
||||
|
||||
**Residual risk after mitigation**: [Low/Medium/High]
|
||||
|
||||
**Documents updated**: [List architecture/component docs that were updated to reflect this mitigation]
|
||||
|
||||
---
|
||||
|
||||
### R02: [Risk Title]
|
||||
|
||||
(repeat structure above)
|
||||
|
||||
## Architecture/Component Changes Applied
|
||||
|
||||
| Risk ID | Document Modified | Change Description |
|
||||
|---------|------------------|--------------------|
|
||||
| R01 | `architecture.md` §3 | [what changed] |
|
||||
| R01 | `components/02_[name]/description.md` §5 | [what changed] |
|
||||
|
||||
## Summary
|
||||
|
||||
**Total risks identified**: [N]
|
||||
**Critical**: [N] | **High**: [N] | **Medium**: [N] | **Low**: [N]
|
||||
**Risks mitigated this iteration**: [N]
|
||||
**Risks requiring user decision**: [list]
|
||||
```
|
||||
@@ -0,0 +1,108 @@
|
||||
# System Flows Template
|
||||
|
||||
Use this template for the system flows document. Save as `_docs/02_plans/system-flows.md`.
|
||||
Individual flow diagrams go in `_docs/02_plans/diagrams/flows/flow_[name].md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# [System Name] — System Flows
|
||||
|
||||
## Flow Inventory
|
||||
|
||||
| # | Flow Name | Trigger | Primary Components | Criticality |
|
||||
|---|-----------|---------|-------------------|-------------|
|
||||
| F1 | [name] | [user action / scheduled / event] | [component list] | High/Medium/Low |
|
||||
| F2 | [name] | | | |
|
||||
| ... | | | | |
|
||||
|
||||
## Flow Dependencies
|
||||
|
||||
| Flow | Depends On | Shares Data With |
|
||||
|------|-----------|-----------------|
|
||||
| F1 | — | F2 (via [entity]) |
|
||||
| F2 | F1 must complete first | F3 |
|
||||
|
||||
---
|
||||
|
||||
## Flow F1: [Flow Name]
|
||||
|
||||
### Description
|
||||
|
||||
[1-2 sentences: what this flow does, who triggers it, what the outcome is]
|
||||
|
||||
### Preconditions
|
||||
|
||||
- [Condition 1]
|
||||
- [Condition 2]
|
||||
|
||||
### Sequence Diagram
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant User
|
||||
participant ComponentA
|
||||
participant ComponentB
|
||||
participant Database
|
||||
|
||||
User->>ComponentA: [action]
|
||||
ComponentA->>ComponentB: [call with params]
|
||||
ComponentB->>Database: [query/write]
|
||||
Database-->>ComponentB: [result]
|
||||
ComponentB-->>ComponentA: [response]
|
||||
ComponentA-->>User: [result]
|
||||
```
|
||||
|
||||
### Flowchart
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
Start([Trigger]) --> Step1[Step description]
|
||||
Step1 --> Decision{Condition?}
|
||||
Decision -->|Yes| Step2[Step description]
|
||||
Decision -->|No| Step3[Step description]
|
||||
Step2 --> EndNode([Result])
|
||||
Step3 --> EndNode
|
||||
```
|
||||
|
||||
### Data Flow
|
||||
|
||||
| Step | From | To | Data | Format |
|
||||
|------|------|----|------|--------|
|
||||
| 1 | [source] | [destination] | [what data] | [DTO/event/etc] |
|
||||
| 2 | | | | |
|
||||
|
||||
### Error Scenarios
|
||||
|
||||
| Error | Where | Detection | Recovery |
|
||||
|-------|-------|-----------|----------|
|
||||
| [error type] | [which step] | [how detected] | [what happens] |
|
||||
|
||||
### Performance Expectations
|
||||
|
||||
| Metric | Target | Notes |
|
||||
|--------|--------|-------|
|
||||
| End-to-end latency | [target] | [conditions] |
|
||||
| Throughput | [target] | [peak/sustained] |
|
||||
|
||||
---
|
||||
|
||||
## Flow F2: [Flow Name]
|
||||
|
||||
(repeat structure above)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Mermaid Diagram Conventions
|
||||
|
||||
Follow these conventions for consistency across all flow diagrams:
|
||||
|
||||
- **Participants**: use component names matching `components/[##]_[name]`
|
||||
- **Node IDs**: camelCase, no spaces (e.g., `validateInput`, `saveOrder`)
|
||||
- **Decision nodes**: use `{Question?}` format
|
||||
- **Start/End**: use `([label])` stadium shape
|
||||
- **External systems**: use `[[label]]` subroutine shape
|
||||
- **Subgraphs**: group by component or bounded context
|
||||
- **No styling**: do not add colors or CSS classes — let the renderer theme handle it
|
||||
- **Edge labels**: wrap special characters in quotes (e.g., `-->|"O(n) check"|`)
|
||||
@@ -0,0 +1,172 @@
|
||||
# Test Specification Template
|
||||
|
||||
Use this template for each component's test spec. Save as `components/[##]_[name]/tests.md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# Test Specification — [Component Name]
|
||||
|
||||
## Acceptance Criteria Traceability
|
||||
|
||||
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|
||||
|-------|---------------------|----------|----------|
|
||||
| AC-01 | [criterion from acceptance_criteria.md] | IT-01, AT-01 | Covered |
|
||||
| AC-02 | [criterion] | PT-01 | Covered |
|
||||
| AC-03 | [criterion] | — | NOT COVERED — [reason] |
|
||||
|
||||
---
|
||||
|
||||
## Integration Tests
|
||||
|
||||
### IT-01: [Test Name]
|
||||
|
||||
**Summary**: [One sentence: what this test verifies]
|
||||
|
||||
**Traces to**: AC-01, AC-03
|
||||
|
||||
**Description**: [Detailed test scenario]
|
||||
|
||||
**Input data**:
|
||||
```
|
||||
[specific input data for this test]
|
||||
```
|
||||
|
||||
**Expected result**:
|
||||
```
|
||||
[specific expected output or state]
|
||||
```
|
||||
|
||||
**Max execution time**: [e.g., 5s]
|
||||
|
||||
**Dependencies**: [other components/services that must be running]
|
||||
|
||||
---
|
||||
|
||||
### IT-02: [Test Name]
|
||||
|
||||
(repeat structure)
|
||||
|
||||
---
|
||||
|
||||
## Performance Tests
|
||||
|
||||
### PT-01: [Test Name]
|
||||
|
||||
**Summary**: [One sentence: what performance aspect is tested]
|
||||
|
||||
**Traces to**: AC-02
|
||||
|
||||
**Load scenario**:
|
||||
- Concurrent users: [N]
|
||||
- Request rate: [N req/s]
|
||||
- Duration: [N minutes]
|
||||
- Ramp-up: [strategy]
|
||||
|
||||
**Expected results**:
|
||||
|
||||
| Metric | Target | Failure Threshold |
|
||||
|--------|--------|-------------------|
|
||||
| Latency (p50) | [target] | [max] |
|
||||
| Latency (p95) | [target] | [max] |
|
||||
| Latency (p99) | [target] | [max] |
|
||||
| Throughput | [target req/s] | [min req/s] |
|
||||
| Error rate | [target %] | [max %] |
|
||||
|
||||
**Resource limits**:
|
||||
- CPU: [max %]
|
||||
- Memory: [max MB/GB]
|
||||
- Database connections: [max pool size]
|
||||
|
||||
---
|
||||
|
||||
### PT-02: [Test Name]
|
||||
|
||||
(repeat structure)
|
||||
|
||||
---
|
||||
|
||||
## Security Tests
|
||||
|
||||
### ST-01: [Test Name]
|
||||
|
||||
**Summary**: [One sentence: what security aspect is tested]
|
||||
|
||||
**Traces to**: AC-04
|
||||
|
||||
**Attack vector**: [e.g., SQL injection on search endpoint, privilege escalation via direct ID access]
|
||||
|
||||
**Test procedure**:
|
||||
1. [Step 1]
|
||||
2. [Step 2]
|
||||
|
||||
**Expected behavior**: [what the system should do — reject, sanitize, log, etc.]
|
||||
|
||||
**Pass criteria**: [specific measurable condition]
|
||||
|
||||
**Fail criteria**: [what constitutes a failure]
|
||||
|
||||
---
|
||||
|
||||
### ST-02: [Test Name]
|
||||
|
||||
(repeat structure)
|
||||
|
||||
---
|
||||
|
||||
## Acceptance Tests
|
||||
|
||||
### AT-01: [Test Name]
|
||||
|
||||
**Summary**: [One sentence: what user-facing behavior is verified]
|
||||
|
||||
**Traces to**: AC-01
|
||||
|
||||
**Preconditions**:
|
||||
- [Precondition 1]
|
||||
- [Precondition 2]
|
||||
|
||||
**Steps**:
|
||||
|
||||
| Step | Action | Expected Result |
|
||||
|------|--------|-----------------|
|
||||
| 1 | [user action] | [expected outcome] |
|
||||
| 2 | [user action] | [expected outcome] |
|
||||
| 3 | [user action] | [expected outcome] |
|
||||
|
||||
---
|
||||
|
||||
### AT-02: [Test Name]
|
||||
|
||||
(repeat structure)
|
||||
|
||||
---
|
||||
|
||||
## Test Data Management
|
||||
|
||||
**Required test data**:
|
||||
|
||||
| Data Set | Description | Source | Size |
|
||||
|----------|-------------|--------|------|
|
||||
| [name] | [what it contains] | [generated / fixture / copy of prod subset] | [approx size] |
|
||||
|
||||
**Setup procedure**:
|
||||
1. [How to prepare the test environment]
|
||||
2. [How to load test data]
|
||||
|
||||
**Teardown procedure**:
|
||||
1. [How to clean up after tests]
|
||||
2. [How to restore initial state]
|
||||
|
||||
**Data isolation strategy**: [How tests are isolated from each other — separate DB, transactions, namespacing]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Guidance Notes
|
||||
|
||||
- Every test MUST trace back to at least one acceptance criterion (AC-XX). If a test doesn't trace to any, question whether it's needed.
|
||||
- If an acceptance criterion has no test covering it, mark it as NOT COVERED and explain why (e.g., "requires manual verification", "deferred to phase 2").
|
||||
- Performance test targets should come from the NFR section in `architecture.md`.
|
||||
- Security tests should cover at minimum: authentication bypass, authorization escalation, injection attacks relevant to this component.
|
||||
- Not every component needs all 4 test types. A stateless utility component may only need integration tests.
|
||||
@@ -0,0 +1,240 @@
|
||||
---
|
||||
name: problem
|
||||
description: |
|
||||
Interactive problem gathering skill that builds _docs/00_problem/ through structured interview.
|
||||
Iteratively asks probing questions until the problem, restrictions, acceptance criteria, and input data
|
||||
are fully understood. Produces all required files for downstream skills (research, plan, etc.).
|
||||
Trigger phrases:
|
||||
- "problem", "define problem", "problem gathering"
|
||||
- "what am I building", "describe problem"
|
||||
- "start project", "new project"
|
||||
category: build
|
||||
tags: [problem, gathering, interview, requirements, acceptance-criteria]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Problem Gathering
|
||||
|
||||
Build a complete problem definition through structured, interactive interview with the user. Produces all required files in `_docs/00_problem/` that downstream skills (research, plan, decompose, implement, deploy) depend on.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Ask, don't assume**: never infer requirements the user hasn't stated
|
||||
- **Exhaust before writing**: keep asking until all dimensions are covered; do not write files prematurely
|
||||
- **Concrete over vague**: push for measurable values, specific constraints, real numbers
|
||||
- **Save immediately**: once the user confirms, write all files at once
|
||||
- **User is the authority**: the AI suggests, the user decides
|
||||
|
||||
## Context Resolution
|
||||
|
||||
Fixed paths:
|
||||
|
||||
- OUTPUT_DIR: `_docs/00_problem/`
|
||||
- INPUT_DATA_DIR: `_docs/00_problem/input_data/`
|
||||
|
||||
## Prerequisite Checks
|
||||
|
||||
1. If OUTPUT_DIR already exists and contains files, present what exists and ask user: **resume and fill gaps, overwrite, or skip?**
|
||||
2. If overwrite or fresh start, create OUTPUT_DIR and INPUT_DATA_DIR
|
||||
|
||||
## Completeness Criteria
|
||||
|
||||
The interview is complete when the AI can write ALL of these:
|
||||
|
||||
| File | Complete when |
|
||||
|------|--------------|
|
||||
| `problem.md` | Clear problem statement: what is being built, why, for whom, what it does |
|
||||
| `restrictions.md` | All constraints identified: hardware, software, environment, operational, regulatory, budget, timeline |
|
||||
| `acceptance_criteria.md` | Measurable success criteria with specific numeric targets grouped by category |
|
||||
| `input_data/` | At least one reference data file or detailed data description document |
|
||||
| `security_approach.md` | (optional) Security requirements identified, or explicitly marked as not applicable |
|
||||
|
||||
## Interview Protocol
|
||||
|
||||
### Phase 1: Open Discovery
|
||||
|
||||
Start with broad, open questions. Let the user describe the problem in their own words.
|
||||
|
||||
**Opening**: Ask the user to describe what they are building and what problem it solves. Do not interrupt or narrow down yet.
|
||||
|
||||
After the user responds, summarize what you understood and ask: "Did I get this right? What did I miss?"
|
||||
|
||||
### Phase 2: Structured Probing
|
||||
|
||||
Work through each dimension systematically. For each dimension, ask only what the user hasn't already covered. Skip dimensions that were fully answered in Phase 1.
|
||||
|
||||
**Dimension checklist:**
|
||||
|
||||
1. **Problem & Goals**
|
||||
- What exactly does the system do?
|
||||
- What problem does it solve? Why does it need to exist?
|
||||
- Who are the users / operators / stakeholders?
|
||||
- What is the expected usage pattern (frequency, load, environment)?
|
||||
|
||||
2. **Scope & Boundaries**
|
||||
- What is explicitly IN scope?
|
||||
- What is explicitly OUT of scope?
|
||||
- Are there related systems this integrates with?
|
||||
- What does the system NOT do (common misconceptions)?
|
||||
|
||||
3. **Hardware & Environment**
|
||||
- What hardware does it run on? (CPU, GPU, memory, storage)
|
||||
- What operating system / platform?
|
||||
- What is the deployment environment? (cloud, edge, embedded, on-prem)
|
||||
- Any physical constraints? (power, thermal, size, connectivity)
|
||||
|
||||
4. **Software & Tech Constraints**
|
||||
- Required programming languages or frameworks?
|
||||
- Required protocols or interfaces?
|
||||
- Existing systems it must integrate with?
|
||||
- Libraries or tools that must or must not be used?
|
||||
|
||||
5. **Acceptance Criteria**
|
||||
- What does "done" look like?
|
||||
- Performance targets: latency, throughput, accuracy, error rates?
|
||||
- Quality bars: reliability, availability, recovery time?
|
||||
- Push for specific numbers: "less than Xms", "above Y%", "within Z meters"
|
||||
- Edge cases: what happens when things go wrong?
|
||||
- Startup and shutdown behavior?
|
||||
|
||||
6. **Input Data**
|
||||
- What data does the system consume?
|
||||
- Formats, schemas, volumes, update frequency?
|
||||
- Does the user have sample/reference data to provide?
|
||||
- If no data exists yet, what would representative data look like?
|
||||
|
||||
7. **Security** (optional, probe gently)
|
||||
- Authentication / authorization requirements?
|
||||
- Data sensitivity (PII, classified, proprietary)?
|
||||
- Communication security (encryption, TLS)?
|
||||
- If the user says "not a concern", mark as N/A and move on
|
||||
|
||||
8. **Operational Constraints**
|
||||
- Budget constraints?
|
||||
- Timeline constraints?
|
||||
- Team size / expertise constraints?
|
||||
- Regulatory or compliance requirements?
|
||||
- Geographic restrictions?
|
||||
|
||||
### Phase 3: Gap Analysis
|
||||
|
||||
After all dimensions are covered:
|
||||
|
||||
1. Internally assess completeness against the Completeness Criteria table
|
||||
2. Present a completeness summary to the user:
|
||||
|
||||
```
|
||||
Completeness Check:
|
||||
- problem.md: READY / GAPS: [list missing aspects]
|
||||
- restrictions.md: READY / GAPS: [list missing aspects]
|
||||
- acceptance_criteria.md: READY / GAPS: [list missing aspects]
|
||||
- input_data/: READY / GAPS: [list missing aspects]
|
||||
- security_approach.md: READY / N/A / GAPS: [list missing aspects]
|
||||
```
|
||||
|
||||
3. If gaps exist, ask targeted follow-up questions for each gap
|
||||
4. Repeat until all required files show READY
|
||||
|
||||
### Phase 4: Draft & Confirm
|
||||
|
||||
1. Draft all files in the conversation (show the user what will be written)
|
||||
2. Present each file's content for review
|
||||
3. Ask: "Should I save these files? Any changes needed?"
|
||||
4. Apply any requested changes
|
||||
5. Save all files to OUTPUT_DIR
|
||||
|
||||
## Output File Formats
|
||||
|
||||
### problem.md
|
||||
|
||||
Free-form text. Clear, concise description of:
|
||||
- What is being built
|
||||
- What problem it solves
|
||||
- How it works at a high level
|
||||
- Key context the reader needs to understand the problem
|
||||
|
||||
No headers required. Paragraph format. Should be readable by someone unfamiliar with the project.
|
||||
|
||||
### restrictions.md
|
||||
|
||||
Categorized constraints with markdown headers and bullet points:
|
||||
|
||||
```markdown
|
||||
# [Category Name]
|
||||
|
||||
- Constraint description with specific values where applicable
|
||||
- Another constraint
|
||||
```
|
||||
|
||||
Categories are derived from the interview (hardware, software, environment, operational, etc.). Each restriction should be specific and testable.
|
||||
|
||||
### acceptance_criteria.md
|
||||
|
||||
Categorized measurable criteria with markdown headers and bullet points:
|
||||
|
||||
```markdown
|
||||
# [Category Name]
|
||||
|
||||
- Criterion with specific numeric target
|
||||
- Another criterion with measurable threshold
|
||||
```
|
||||
|
||||
Every criterion must have a measurable value. Vague criteria like "should be fast" are not acceptable — push for "less than 400ms end-to-end".
|
||||
|
||||
### input_data/
|
||||
|
||||
At least one file. Options:
|
||||
- User provides actual data files (CSV, JSON, images, etc.) — save as-is
|
||||
- User describes data parameters — save as `data_parameters.md`
|
||||
- User provides URLs to data — save as `data_sources.md` with links and descriptions
|
||||
|
||||
### security_approach.md (optional)
|
||||
|
||||
If security requirements exist, document them. If the user says security is not a concern for this project, skip this file entirely.
|
||||
|
||||
## Progress Tracking
|
||||
|
||||
Create a TodoWrite with phases 1-4. Update as each phase completes.
|
||||
|
||||
## Escalation Rules
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| User cannot provide acceptance criteria numbers | Suggest industry benchmarks, ASK user to confirm or adjust |
|
||||
| User has no input data at all | ASK what representative data would look like, create a `data_parameters.md` describing expected data |
|
||||
| User says "I don't know" to a critical dimension | Research the domain briefly, suggest reasonable defaults, ASK user to confirm |
|
||||
| Conflicting requirements discovered | Present the conflict, ASK user which takes priority |
|
||||
| User wants to skip a required file | Explain why downstream skills need it, ASK if they want a minimal placeholder |
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- **Writing files before the interview is complete**: gather everything first, then write
|
||||
- **Accepting vague criteria**: "fast", "accurate", "reliable" are not acceptance criteria without numbers
|
||||
- **Assuming technical choices**: do not suggest specific technologies unless the user constrains them
|
||||
- **Over-engineering the problem statement**: problem.md should be concise, not a dissertation
|
||||
- **Inventing restrictions**: only document what the user actually states as a constraint
|
||||
- **Skipping input data**: downstream skills (especially research and plan) need concrete data context
|
||||
|
||||
## Methodology Quick Reference
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Problem Gathering (4-Phase Interview) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ PREREQ: Check if _docs/00_problem/ exists (resume/overwrite?) │
|
||||
│ │
|
||||
│ Phase 1: Open Discovery │
|
||||
│ → "What are you building?" → summarize → confirm │
|
||||
│ Phase 2: Structured Probing │
|
||||
│ → 8 dimensions: problem, scope, hardware, software, │
|
||||
│ acceptance criteria, input data, security, operations │
|
||||
│ → skip what Phase 1 already covered │
|
||||
│ Phase 3: Gap Analysis │
|
||||
│ → assess completeness per file → fill gaps iteratively │
|
||||
│ Phase 4: Draft & Confirm │
|
||||
│ → show all files → user confirms → save to _docs/00_problem/ │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: Ask don't assume · Concrete over vague │
|
||||
│ Exhaust before writing · User is authority │
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,471 @@
|
||||
---
|
||||
name: refactor
|
||||
description: |
|
||||
Structured refactoring workflow (6-phase method) with three execution modes:
|
||||
- Full Refactoring: all 6 phases — baseline, discovery, analysis, safety net, execution, hardening
|
||||
- Targeted Refactoring: skip discovery if docs exist, focus on a specific component/area
|
||||
- Quick Assessment: phases 0-2 only, outputs a refactoring plan without execution
|
||||
Supports project mode (_docs/ structure) and standalone mode (@file.md).
|
||||
Trigger phrases:
|
||||
- "refactor", "refactoring", "improve code"
|
||||
- "analyze coupling", "decoupling", "technical debt"
|
||||
- "refactoring assessment", "code quality improvement"
|
||||
category: evolve
|
||||
tags: [refactoring, coupling, technical-debt, performance, hardening]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Structured Refactoring (6-Phase Method)
|
||||
|
||||
Transform existing codebases through a systematic refactoring workflow: capture baseline, document current state, research improvements, build safety net, execute changes, and harden.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Preserve behavior first**: never refactor without a passing test suite
|
||||
- **Measure before and after**: every change must be justified by metrics
|
||||
- **Small incremental changes**: commit frequently, never break tests
|
||||
- **Save immediately**: write artifacts to disk after each phase; never accumulate unsaved work
|
||||
- **Ask, don't assume**: when scope or priorities are unclear, STOP and ask the user
|
||||
|
||||
## Context Resolution
|
||||
|
||||
Determine the operating mode based on invocation before any other logic runs.
|
||||
|
||||
**Project mode** (no explicit input file provided):
|
||||
- PROBLEM_DIR: `_docs/00_problem/`
|
||||
- SOLUTION_DIR: `_docs/01_solution/`
|
||||
- COMPONENTS_DIR: `_docs/02_components/`
|
||||
- TESTS_DIR: `_docs/02_tests/`
|
||||
- REFACTOR_DIR: `_docs/04_refactoring/`
|
||||
- All existing guardrails apply.
|
||||
|
||||
**Standalone mode** (explicit input file provided, e.g. `/refactor @some_component.md`):
|
||||
- INPUT_FILE: the provided file (treated as component/area description)
|
||||
- REFACTOR_DIR: `_standalone/refactoring/`
|
||||
- Guardrails relaxed: only INPUT_FILE must exist and be non-empty
|
||||
- `acceptance_criteria.md` is optional — warn if absent
|
||||
|
||||
Announce the detected mode and resolved paths to the user before proceeding.
|
||||
|
||||
## Mode Detection
|
||||
|
||||
After context resolution, determine the execution mode:
|
||||
|
||||
1. **User explicitly says** "quick assessment" or "just assess" → **Quick Assessment**
|
||||
2. **User explicitly says** "refactor [component/file/area]" with a specific target → **Targeted Refactoring**
|
||||
3. **Default** → **Full Refactoring**
|
||||
|
||||
| Mode | Phases Executed | When to Use |
|
||||
|------|----------------|-------------|
|
||||
| **Full Refactoring** | 0 → 1 → 2 → 3 → 4 → 5 | Complete refactoring of a system or major area |
|
||||
| **Targeted Refactoring** | 0 → (skip 1 if docs exist) → 2 → 3 → 4 → 5 | Refactor a specific component; docs already exist |
|
||||
| **Quick Assessment** | 0 → 1 → 2 | Produce a refactoring roadmap without executing changes |
|
||||
|
||||
Inform the user which mode was detected and confirm before proceeding.
|
||||
|
||||
## Prerequisite Checks (BLOCKING)
|
||||
|
||||
**Project mode:**
|
||||
1. PROBLEM_DIR exists with `problem.md` (or `problem_description.md`) — **STOP if missing**, ask user to create it
|
||||
2. If `acceptance_criteria.md` is missing: **warn** and ask whether to proceed
|
||||
3. Create REFACTOR_DIR if it does not exist
|
||||
4. If REFACTOR_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?**
|
||||
|
||||
**Standalone mode:**
|
||||
1. INPUT_FILE exists and is non-empty — **STOP if missing**
|
||||
2. Warn if no `acceptance_criteria.md` provided
|
||||
3. Create REFACTOR_DIR if it does not exist
|
||||
|
||||
## Artifact Management
|
||||
|
||||
### Directory Structure
|
||||
|
||||
```
|
||||
REFACTOR_DIR/
|
||||
├── baseline_metrics.md (Phase 0)
|
||||
├── discovery/
|
||||
│ ├── components/
|
||||
│ │ └── [##]_[name].md (Phase 1)
|
||||
│ ├── solution.md (Phase 1)
|
||||
│ └── system_flows.md (Phase 1)
|
||||
├── analysis/
|
||||
│ ├── research_findings.md (Phase 2)
|
||||
│ └── refactoring_roadmap.md (Phase 2)
|
||||
├── test_specs/
|
||||
│ └── [##]_[test_name].md (Phase 3)
|
||||
├── coupling_analysis.md (Phase 4)
|
||||
├── execution_log.md (Phase 4)
|
||||
├── hardening/
|
||||
│ ├── technical_debt.md (Phase 5)
|
||||
│ ├── performance.md (Phase 5)
|
||||
│ └── security.md (Phase 5)
|
||||
└── FINAL_report.md (after all phases)
|
||||
```
|
||||
|
||||
### Save Timing
|
||||
|
||||
| Phase | Save immediately after | Filename |
|
||||
|-------|------------------------|----------|
|
||||
| Phase 0 | Baseline captured | `baseline_metrics.md` |
|
||||
| Phase 1 | Each component documented | `discovery/components/[##]_[name].md` |
|
||||
| Phase 1 | Solution synthesized | `discovery/solution.md`, `discovery/system_flows.md` |
|
||||
| Phase 2 | Research complete | `analysis/research_findings.md` |
|
||||
| Phase 2 | Roadmap produced | `analysis/refactoring_roadmap.md` |
|
||||
| Phase 3 | Test specs written | `test_specs/[##]_[test_name].md` |
|
||||
| Phase 4 | Coupling analyzed | `coupling_analysis.md` |
|
||||
| Phase 4 | Execution complete | `execution_log.md` |
|
||||
| Phase 5 | Each hardening track | `hardening/<track>.md` |
|
||||
| Final | All phases done | `FINAL_report.md` |
|
||||
|
||||
### Resumability
|
||||
|
||||
If REFACTOR_DIR already contains artifacts:
|
||||
|
||||
1. List existing files and match to the save timing table
|
||||
2. Identify the last completed phase based on which artifacts exist
|
||||
3. Resume from the next incomplete phase
|
||||
4. Inform the user which phases are being skipped
|
||||
|
||||
## Progress Tracking
|
||||
|
||||
At the start of execution, create a TodoWrite with all applicable phases. Update status as each phase completes.
|
||||
|
||||
## Workflow
|
||||
|
||||
### Phase 0: Context & Baseline
|
||||
|
||||
**Role**: Software engineer preparing for refactoring
|
||||
**Goal**: Collect refactoring goals and capture baseline metrics
|
||||
**Constraints**: Measurement only — no code changes
|
||||
|
||||
#### 0a. Collect Goals
|
||||
|
||||
If PROBLEM_DIR files do not yet exist, help the user create them:
|
||||
|
||||
1. `problem.md` — what the system currently does, what changes are needed, pain points
|
||||
2. `acceptance_criteria.md` — success criteria for the refactoring
|
||||
3. `security_approach.md` — security requirements (if applicable)
|
||||
|
||||
Store in PROBLEM_DIR.
|
||||
|
||||
#### 0b. Capture Baseline
|
||||
|
||||
1. Read problem description and acceptance criteria
|
||||
2. Measure current system metrics using project-appropriate tools:
|
||||
|
||||
| Metric Category | What to Capture |
|
||||
|----------------|-----------------|
|
||||
| **Coverage** | Overall, unit, integration, critical paths |
|
||||
| **Complexity** | Cyclomatic complexity (avg + top 5 functions), LOC, tech debt ratio |
|
||||
| **Code Smells** | Total, critical, major |
|
||||
| **Performance** | Response times (P50/P95/P99), CPU/memory, throughput |
|
||||
| **Dependencies** | Total count, outdated, security vulnerabilities |
|
||||
| **Build** | Build time, test execution time, deployment time |
|
||||
|
||||
3. Create functionality inventory: all features/endpoints with status and coverage
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All metric categories measured (or noted as N/A with reason)
|
||||
- [ ] Functionality inventory is complete
|
||||
- [ ] Measurements are reproducible
|
||||
|
||||
**Save action**: Write `REFACTOR_DIR/baseline_metrics.md`
|
||||
|
||||
**BLOCKING**: Present baseline summary to user. Do NOT proceed until user confirms.
|
||||
|
||||
---
|
||||
|
||||
### Phase 1: Discovery
|
||||
|
||||
**Role**: Principal software architect
|
||||
**Goal**: Generate documentation from existing code and form solution description
|
||||
**Constraints**: Document what exists, not what should be. No code changes.
|
||||
|
||||
**Skip condition** (Targeted mode): If `COMPONENTS_DIR` and `SOLUTION_DIR` already contain documentation for the target area, skip to Phase 2. Ask user to confirm skip.
|
||||
|
||||
#### 1a. Document Components
|
||||
|
||||
For each component in the codebase:
|
||||
|
||||
1. Analyze project structure, directories, files
|
||||
2. Go file by file, analyze each method
|
||||
3. Analyze connections between components
|
||||
|
||||
Write per component to `REFACTOR_DIR/discovery/components/[##]_[name].md`:
|
||||
- Purpose and architectural patterns
|
||||
- Mermaid diagrams for logic flows
|
||||
- API reference table (name, description, input, output)
|
||||
- Implementation details: algorithmic complexity, state management, dependencies
|
||||
- Caveats, edge cases, known limitations
|
||||
|
||||
#### 1b. Synthesize Solution & Flows
|
||||
|
||||
1. Review all generated component documentation
|
||||
2. Synthesize into a cohesive solution description
|
||||
3. Create flow diagrams showing component interactions
|
||||
|
||||
Write:
|
||||
- `REFACTOR_DIR/discovery/solution.md` — product description, component overview, interaction diagram
|
||||
- `REFACTOR_DIR/discovery/system_flows.md` — Mermaid flowcharts per major use case
|
||||
|
||||
Also copy to project standard locations if in project mode:
|
||||
- `SOLUTION_DIR/solution.md`
|
||||
- `COMPONENTS_DIR/system_flows.md`
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Every component in the codebase is documented
|
||||
- [ ] Solution description covers all components
|
||||
- [ ] Flow diagrams cover all major use cases
|
||||
- [ ] Mermaid diagrams are syntactically correct
|
||||
|
||||
**Save action**: Write discovery artifacts
|
||||
|
||||
**BLOCKING**: Present discovery summary to user. Do NOT proceed until user confirms documentation accuracy.
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Analysis
|
||||
|
||||
**Role**: Researcher and software architect
|
||||
**Goal**: Research improvements and produce a refactoring roadmap
|
||||
**Constraints**: Analysis only — no code changes
|
||||
|
||||
#### 2a. Deep Research
|
||||
|
||||
1. Analyze current implementation patterns
|
||||
2. Research modern approaches for similar systems
|
||||
3. Identify what could be done differently
|
||||
4. Suggest improvements based on state-of-the-art practices
|
||||
|
||||
Write `REFACTOR_DIR/analysis/research_findings.md`:
|
||||
- Current state analysis: patterns used, strengths, weaknesses
|
||||
- Alternative approaches per component: current vs alternative, pros/cons, migration effort
|
||||
- Prioritized recommendations: quick wins + strategic improvements
|
||||
|
||||
#### 2b. Solution Assessment
|
||||
|
||||
1. Assess current implementation against acceptance criteria
|
||||
2. Identify weak points in codebase, map to specific code areas
|
||||
3. Perform gap analysis: acceptance criteria vs current state
|
||||
4. Prioritize changes by impact and effort
|
||||
|
||||
Write `REFACTOR_DIR/analysis/refactoring_roadmap.md`:
|
||||
- Weak points assessment: location, description, impact, proposed solution
|
||||
- Gap analysis: what's missing, what needs improvement
|
||||
- Phased roadmap: Phase 1 (critical fixes), Phase 2 (major improvements), Phase 3 (enhancements)
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All acceptance criteria are addressed in gap analysis
|
||||
- [ ] Recommendations are grounded in actual code, not abstract
|
||||
- [ ] Roadmap phases are prioritized by impact
|
||||
- [ ] Quick wins are identified separately
|
||||
|
||||
**Save action**: Write analysis artifacts
|
||||
|
||||
**BLOCKING**: Present refactoring roadmap to user. Do NOT proceed until user confirms.
|
||||
|
||||
**Quick Assessment mode stops here.** Present final summary and write `FINAL_report.md` with phases 0-2 content.
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: Safety Net
|
||||
|
||||
**Role**: QA engineer and developer
|
||||
**Goal**: Design and implement tests that capture current behavior before refactoring
|
||||
**Constraints**: Tests must all pass on the current codebase before proceeding
|
||||
|
||||
#### 3a. Design Test Specs
|
||||
|
||||
Coverage requirements (must meet before refactoring):
|
||||
- Minimum overall coverage: 75%
|
||||
- Critical path coverage: 90%
|
||||
- All public APIs must have integration tests
|
||||
- All error handling paths must be tested
|
||||
|
||||
For each critical area, write test specs to `REFACTOR_DIR/test_specs/[##]_[test_name].md`:
|
||||
- Integration tests: summary, current behavior, input data, expected result, max expected time
|
||||
- Acceptance tests: summary, preconditions, steps with expected results
|
||||
- Coverage analysis: current %, target %, uncovered critical paths
|
||||
|
||||
#### 3b. Implement Tests
|
||||
|
||||
1. Set up test environment and infrastructure if not exists
|
||||
2. Implement each test from specs
|
||||
3. Run tests, verify all pass on current codebase
|
||||
4. Document any discovered issues
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Coverage requirements met (75% overall, 90% critical paths)
|
||||
- [ ] All tests pass on current codebase
|
||||
- [ ] All public APIs have integration tests
|
||||
- [ ] Test data fixtures are configured
|
||||
|
||||
**Save action**: Write test specs; implemented tests go into the project's test folder
|
||||
|
||||
**GATE (BLOCKING)**: ALL tests must pass before proceeding to Phase 4. If tests fail, fix the tests (not the code) or ask user for guidance. Do NOT proceed to Phase 4 with failing tests.
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Execution
|
||||
|
||||
**Role**: Software architect and developer
|
||||
**Goal**: Analyze coupling and execute decoupling changes
|
||||
**Constraints**: Small incremental changes; tests must stay green after every change
|
||||
|
||||
#### 4a. Analyze Coupling
|
||||
|
||||
1. Analyze coupling between components/modules
|
||||
2. Map dependencies (direct and transitive)
|
||||
3. Identify circular dependencies
|
||||
4. Form decoupling strategy
|
||||
|
||||
Write `REFACTOR_DIR/coupling_analysis.md`:
|
||||
- Dependency graph (Mermaid)
|
||||
- Coupling metrics per component
|
||||
- Problem areas: components involved, coupling type, severity, impact
|
||||
- Decoupling strategy: priority order, proposed interfaces/abstractions, effort estimates
|
||||
|
||||
**BLOCKING**: Present coupling analysis to user. Do NOT proceed until user confirms strategy.
|
||||
|
||||
#### 4b. Execute Decoupling
|
||||
|
||||
For each change in the decoupling strategy:
|
||||
|
||||
1. Implement the change
|
||||
2. Run integration tests
|
||||
3. Fix any failures
|
||||
4. Commit with descriptive message
|
||||
|
||||
Address code smells encountered: long methods, large classes, duplicate code, dead code, magic numbers.
|
||||
|
||||
Write `REFACTOR_DIR/execution_log.md`:
|
||||
- Change description, files affected, test status per change
|
||||
- Before/after metrics comparison against baseline
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All tests still pass after execution
|
||||
- [ ] No circular dependencies remain (or reduced per plan)
|
||||
- [ ] Code smells addressed
|
||||
- [ ] Metrics improved compared to baseline
|
||||
|
||||
**Save action**: Write execution artifacts
|
||||
|
||||
**BLOCKING**: Present execution summary to user. Do NOT proceed until user confirms.
|
||||
|
||||
---
|
||||
|
||||
### Phase 5: Hardening (Optional, Parallel Tracks)
|
||||
|
||||
**Role**: Varies per track
|
||||
**Goal**: Address technical debt, performance, and security
|
||||
**Constraints**: Each track is optional; user picks which to run
|
||||
|
||||
Present the three tracks and let user choose which to execute:
|
||||
|
||||
#### Track A: Technical Debt
|
||||
|
||||
**Role**: Technical debt analyst
|
||||
|
||||
1. Identify and categorize debt items: design, code, test, documentation
|
||||
2. Assess each: location, description, impact, effort, interest (cost of not fixing)
|
||||
3. Prioritize: quick wins → strategic debt → tolerable debt
|
||||
4. Create actionable plan with prevention measures
|
||||
|
||||
Write `REFACTOR_DIR/hardening/technical_debt.md`
|
||||
|
||||
#### Track B: Performance Optimization
|
||||
|
||||
**Role**: Performance engineer
|
||||
|
||||
1. Profile current performance, identify bottlenecks
|
||||
2. For each bottleneck: location, symptom, root cause, impact
|
||||
3. Propose optimizations with expected improvement and risk
|
||||
4. Implement one at a time, benchmark after each change
|
||||
5. Verify tests still pass
|
||||
|
||||
Write `REFACTOR_DIR/hardening/performance.md` with before/after benchmarks
|
||||
|
||||
#### Track C: Security Review
|
||||
|
||||
**Role**: Security engineer
|
||||
|
||||
1. Review code against OWASP Top 10
|
||||
2. Verify security requirements from `security_approach.md` are met
|
||||
3. Check: authentication, authorization, input validation, output encoding, encryption, logging
|
||||
|
||||
Write `REFACTOR_DIR/hardening/security.md`:
|
||||
- Vulnerability assessment: location, type, severity, exploit scenario, fix
|
||||
- Security controls review
|
||||
- Compliance check against `security_approach.md`
|
||||
- Recommendations: critical fixes, improvements, hardening
|
||||
|
||||
**Self-verification** (per track):
|
||||
- [ ] All findings are grounded in actual code
|
||||
- [ ] Recommendations are actionable with effort estimates
|
||||
- [ ] All tests still pass after any changes
|
||||
|
||||
**Save action**: Write hardening artifacts
|
||||
|
||||
---
|
||||
|
||||
## Final Report
|
||||
|
||||
After all executed phases complete, write `REFACTOR_DIR/FINAL_report.md`:
|
||||
|
||||
- Refactoring mode used and phases executed
|
||||
- Baseline metrics vs final metrics comparison
|
||||
- Changes made summary
|
||||
- Remaining items (deferred to future)
|
||||
- Lessons learned
|
||||
|
||||
## Escalation Rules
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| Unclear refactoring scope | **ASK user** |
|
||||
| Ambiguous acceptance criteria | **ASK user** |
|
||||
| Tests failing before refactoring | **ASK user** — fix tests or fix code? |
|
||||
| Coupling change risks breaking external contracts | **ASK user** |
|
||||
| Performance optimization vs readability trade-off | **ASK user** |
|
||||
| Missing baseline metrics (no test suite, no CI) | **WARN user**, suggest building safety net first |
|
||||
| Security vulnerability found during refactoring | **WARN user** immediately, don't defer |
|
||||
|
||||
## Trigger Conditions
|
||||
|
||||
When the user wants to:
|
||||
- Improve existing code structure or quality
|
||||
- Reduce technical debt or coupling
|
||||
- Prepare codebase for new features
|
||||
- Assess code health before major changes
|
||||
|
||||
**Keywords**: "refactor", "refactoring", "improve code", "reduce coupling", "technical debt", "code quality", "decoupling"
|
||||
|
||||
## Methodology Quick Reference
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Structured Refactoring (6-Phase Method) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ CONTEXT: Resolve mode (project vs standalone) + set paths │
|
||||
│ MODE: Full / Targeted / Quick Assessment │
|
||||
│ │
|
||||
│ 0. Context & Baseline → baseline_metrics.md │
|
||||
│ [BLOCKING: user confirms baseline] │
|
||||
│ 1. Discovery → discovery/ (components, solution) │
|
||||
│ [BLOCKING: user confirms documentation] │
|
||||
│ 2. Analysis → analysis/ (research, roadmap) │
|
||||
│ [BLOCKING: user confirms roadmap] │
|
||||
│ ── Quick Assessment stops here ── │
|
||||
│ 3. Safety Net → test_specs/ + implemented tests │
|
||||
│ [GATE: all tests must pass] │
|
||||
│ 4. Execution → coupling_analysis, execution_log │
|
||||
│ [BLOCKING: user confirms changes] │
|
||||
│ 5. Hardening → hardening/ (debt, perf, security) │
|
||||
│ [optional, user picks tracks] │
|
||||
│ ───────────────────────────────────────────────── │
|
||||
│ FINAL_report.md │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: Preserve behavior · Measure before/after │
|
||||
│ Small changes · Save immediately · Ask don't assume│
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,708 @@
|
||||
---
|
||||
name: deep-research
|
||||
description: |
|
||||
Deep Research Methodology (8-Step Method) with two execution modes:
|
||||
- Mode A (Initial Research): Assess acceptance criteria, then research problem and produce solution draft
|
||||
- Mode B (Solution Assessment): Assess existing solution draft for weak points and produce revised draft
|
||||
Supports project mode (_docs/ structure) and standalone mode (@file.md).
|
||||
Auto-detects research mode based on existing solution_draft files.
|
||||
Trigger phrases:
|
||||
- "research", "deep research", "deep dive", "in-depth analysis"
|
||||
- "research this", "investigate", "look into"
|
||||
- "assess solution", "review solution draft"
|
||||
- "comparative analysis", "concept comparison", "technical comparison"
|
||||
category: build
|
||||
tags: [research, analysis, solution-design, comparison, decision-support]
|
||||
---
|
||||
|
||||
# Deep Research (8-Step Method)
|
||||
|
||||
Transform vague topics raised by users into high-quality, deliverable research reports through a systematic methodology. Operates in two modes: **Initial Research** (produce new solution draft) and **Solution Assessment** (assess and revise existing draft).
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Conclusions come from mechanism comparison, not "gut feelings"**
|
||||
- **Pin down the facts first, then reason**
|
||||
- **Prioritize authoritative sources: L1 > L2 > L3 > L4**
|
||||
- **Intermediate results must be saved for traceability and reuse**
|
||||
- **Ask, don't assume** — when any aspect of the problem, criteria, or restrictions is unclear, STOP and ask the user before proceeding
|
||||
|
||||
## Context Resolution
|
||||
|
||||
Determine the operating mode based on invocation before any other logic runs.
|
||||
|
||||
**Project mode** (no explicit input file provided):
|
||||
- INPUT_DIR: `_docs/00_problem/`
|
||||
- OUTPUT_DIR: `_docs/01_solution/`
|
||||
- RESEARCH_DIR: `_docs/00_research/`
|
||||
- All existing guardrails, mode detection, and draft numbering apply as-is.
|
||||
|
||||
**Standalone mode** (explicit input file provided, e.g. `/research @some_doc.md`):
|
||||
- INPUT_FILE: the provided file (treated as problem description)
|
||||
- OUTPUT_DIR: `_standalone/01_solution/`
|
||||
- RESEARCH_DIR: `_standalone/00_research/`
|
||||
- Guardrails relaxed: only INPUT_FILE must exist and be non-empty
|
||||
- `restrictions.md` and `acceptance_criteria.md` are optional — warn if absent, proceed if user confirms
|
||||
- Mode detection uses OUTPUT_DIR for `solution_draft*.md` scanning
|
||||
- Draft numbering works the same, scoped to OUTPUT_DIR
|
||||
- **Final step**: after all research is complete, move INPUT_FILE into `_standalone/`
|
||||
|
||||
Announce the detected mode and resolved paths to the user before proceeding.
|
||||
|
||||
## Project Integration
|
||||
|
||||
### Prerequisite Guardrails (BLOCKING)
|
||||
|
||||
Before any research begins, verify the input context exists. **Do not proceed if guardrails fail.**
|
||||
|
||||
**Project mode:**
|
||||
1. Check INPUT_DIR exists — **STOP if missing**, ask user to create it and provide problem files
|
||||
2. Check `problem.md` in INPUT_DIR exists and is non-empty — **STOP if missing**
|
||||
3. Check `restrictions.md` in INPUT_DIR exists and is non-empty — **STOP if missing**
|
||||
4. Check `acceptance_criteria.md` in INPUT_DIR exists and is non-empty — **STOP if missing**
|
||||
5. Check `input_data/` in INPUT_DIR exists and contains at least one file — **STOP if missing**
|
||||
6. Read **all** files in INPUT_DIR to ground the investigation in the project context
|
||||
7. Create OUTPUT_DIR and RESEARCH_DIR if they don't exist
|
||||
|
||||
**Standalone mode:**
|
||||
1. Check INPUT_FILE exists and is non-empty — **STOP if missing**
|
||||
2. Warn if no `restrictions.md` or `acceptance_criteria.md` were provided alongside INPUT_FILE — proceed if user confirms
|
||||
3. Create OUTPUT_DIR and RESEARCH_DIR if they don't exist
|
||||
|
||||
### Mode Detection
|
||||
|
||||
After guardrails pass, determine the execution mode:
|
||||
|
||||
1. Scan OUTPUT_DIR for files matching `solution_draft*.md`
|
||||
2. **No matches found** → **Mode A: Initial Research**
|
||||
3. **Matches found** → **Mode B: Solution Assessment** (use the highest-numbered draft as input)
|
||||
4. **User override**: if the user explicitly says "research from scratch" or "initial research", force Mode A regardless of existing drafts
|
||||
|
||||
Inform the user which mode was detected and confirm before proceeding.
|
||||
|
||||
### Solution Draft Numbering
|
||||
|
||||
All final output is saved as `OUTPUT_DIR/solution_draft##.md` with a 2-digit zero-padded number:
|
||||
|
||||
1. Scan existing files in OUTPUT_DIR matching `solution_draft*.md`
|
||||
2. Extract the highest existing number
|
||||
3. Increment by 1
|
||||
4. Zero-pad to 2 digits (e.g., `01`, `02`, ..., `10`, `11`)
|
||||
|
||||
Example: if `solution_draft01.md` through `solution_draft10.md` exist, the next output is `solution_draft11.md`.
|
||||
|
||||
### Working Directory & Intermediate Artifact Management
|
||||
|
||||
#### Directory Structure
|
||||
|
||||
At the start of research, **must** create a working directory under RESEARCH_DIR:
|
||||
|
||||
```
|
||||
RESEARCH_DIR/
|
||||
├── 00_ac_assessment.md # Mode A Phase 1 output: AC & restrictions assessment
|
||||
├── 00_question_decomposition.md # Step 0-1 output
|
||||
├── 01_source_registry.md # Step 2 output: all consulted source links
|
||||
├── 02_fact_cards.md # Step 3 output: extracted facts
|
||||
├── 03_comparison_framework.md # Step 4 output: selected framework and populated data
|
||||
├── 04_reasoning_chain.md # Step 6 output: fact → conclusion reasoning
|
||||
├── 05_validation_log.md # Step 7 output: use-case validation results
|
||||
└── raw/ # Raw source archive (optional)
|
||||
├── source_1.md
|
||||
└── source_2.md
|
||||
```
|
||||
|
||||
### Save Timing & Content
|
||||
|
||||
| Step | Save immediately after completion | Filename |
|
||||
|------|-----------------------------------|----------|
|
||||
| Mode A Phase 1 | AC & restrictions assessment tables | `00_ac_assessment.md` |
|
||||
| Step 0-1 | Question type classification + sub-question list | `00_question_decomposition.md` |
|
||||
| Step 2 | Each consulted source link, tier, summary | `01_source_registry.md` |
|
||||
| Step 3 | Each fact card (statement + source + confidence) | `02_fact_cards.md` |
|
||||
| Step 4 | Selected comparison framework + initial population | `03_comparison_framework.md` |
|
||||
| Step 6 | Reasoning process for each dimension | `04_reasoning_chain.md` |
|
||||
| Step 7 | Validation scenarios + results + review checklist | `05_validation_log.md` |
|
||||
| Step 8 | Complete solution draft | `OUTPUT_DIR/solution_draft##.md` |
|
||||
|
||||
### Save Principles
|
||||
|
||||
1. **Save immediately**: Write to the corresponding file as soon as a step is completed; don't wait until the end
|
||||
2. **Incremental updates**: Same file can be updated multiple times; append or replace new content
|
||||
3. **Preserve process**: Keep intermediate files even after their content is integrated into the final report
|
||||
4. **Enable recovery**: If research is interrupted, progress can be recovered from intermediate files
|
||||
|
||||
## Execution Flow
|
||||
|
||||
### Mode A: Initial Research
|
||||
|
||||
Triggered when no `solution_draft*.md` files exist in OUTPUT_DIR, or when the user explicitly requests initial research.
|
||||
|
||||
#### Phase 1: AC & Restrictions Assessment (BLOCKING)
|
||||
|
||||
**Role**: Professional software architect
|
||||
|
||||
A focused preliminary research pass **before** the main solution research. The goal is to validate that the acceptance criteria and restrictions are realistic before designing a solution around them.
|
||||
|
||||
**Input**: All files from INPUT_DIR (or INPUT_FILE in standalone mode)
|
||||
|
||||
**Task**:
|
||||
1. Read all problem context files thoroughly
|
||||
2. **ASK the user about every unclear aspect** — do not assume:
|
||||
- Unclear problem boundaries → ask
|
||||
- Ambiguous acceptance criteria values → ask
|
||||
- Missing context (no `security_approach.md`, no `input_data/`) → ask what they have
|
||||
- Conflicting restrictions → ask which takes priority
|
||||
3. Research in internet:
|
||||
- How realistic are the acceptance criteria for this specific domain?
|
||||
- How critical is each criterion?
|
||||
- What domain-specific acceptance criteria are we missing?
|
||||
- Impact of each criterion value on the whole system quality
|
||||
- Cost/budget implications of each criterion
|
||||
- Timeline implications — how long would it take to meet each criterion
|
||||
4. Research restrictions:
|
||||
- Are the restrictions realistic?
|
||||
- Should any be tightened or relaxed?
|
||||
- Are there additional restrictions we should add?
|
||||
5. Verify findings with authoritative sources (official docs, papers, benchmarks)
|
||||
|
||||
**Uses Steps 0-3 of the 8-step engine** (question classification, decomposition, source tiering, fact extraction) scoped to AC and restrictions assessment.
|
||||
|
||||
**📁 Save action**: Write `RESEARCH_DIR/00_ac_assessment.md` with format:
|
||||
|
||||
```markdown
|
||||
# Acceptance Criteria Assessment
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
| Criterion | Our Values | Researched Values | Cost/Timeline Impact | Status |
|
||||
|-----------|-----------|-------------------|---------------------|--------|
|
||||
| [name] | [current] | [researched range] | [impact] | Added / Modified / Removed |
|
||||
|
||||
## Restrictions Assessment
|
||||
|
||||
| Restriction | Our Values | Researched Values | Cost/Timeline Impact | Status |
|
||||
|-------------|-----------|-------------------|---------------------|--------|
|
||||
| [name] | [current] | [researched range] | [impact] | Added / Modified / Removed |
|
||||
|
||||
## Key Findings
|
||||
[Summary of critical findings]
|
||||
|
||||
## Sources
|
||||
[Key references used]
|
||||
```
|
||||
|
||||
**BLOCKING**: Present the AC assessment tables to the user. Wait for confirmation or adjustments before proceeding to Phase 2. The user may update `acceptance_criteria.md` or `restrictions.md` based on findings.
|
||||
|
||||
---
|
||||
|
||||
#### Phase 2: Problem Research & Solution Draft
|
||||
|
||||
**Role**: Professional researcher and software architect
|
||||
|
||||
Full 8-step research methodology. Produces the first solution draft.
|
||||
|
||||
**Input**: All files from INPUT_DIR (possibly updated after Phase 1) + Phase 1 artifacts
|
||||
|
||||
**Task** (drives the 8-step engine):
|
||||
1. Research existing/competitor solutions for similar problems
|
||||
2. Research the problem thoroughly — all possible ways to solve it, split into components
|
||||
3. For each component, research all possible solutions and find the most efficient state-of-the-art approaches
|
||||
4. Verify that suggested tools/libraries actually exist and work as described
|
||||
5. Include security considerations in each component analysis
|
||||
6. Provide rough cost estimates for proposed solutions
|
||||
|
||||
Be concise in formulating. The fewer words, the better, but do not miss any important details.
|
||||
|
||||
**📁 Save action**: Write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md`
|
||||
|
||||
---
|
||||
|
||||
#### Phase 3: Tech Stack Consolidation (OPTIONAL)
|
||||
|
||||
**Role**: Software architect evaluating technology choices
|
||||
|
||||
Focused synthesis step — no new 8-step cycle. Uses research already gathered in Phase 2 to make concrete technology decisions.
|
||||
|
||||
**Input**: Latest `solution_draft##.md` from OUTPUT_DIR + all files from INPUT_DIR
|
||||
|
||||
**Task**:
|
||||
1. Extract technology options from the solution draft's component comparison tables
|
||||
2. Score each option against: fitness for purpose, maturity, security track record, team expertise, cost, scalability
|
||||
3. Produce a tech stack summary with selection rationale
|
||||
4. Assess risks and learning requirements per technology choice
|
||||
|
||||
**📁 Save action**: Write `OUTPUT_DIR/tech_stack.md` with:
|
||||
- Requirements analysis (functional, non-functional, constraints)
|
||||
- Technology evaluation tables (language, framework, database, infrastructure, key libraries) with scores
|
||||
- Tech stack summary block
|
||||
- Risk assessment and learning requirements tables
|
||||
|
||||
---
|
||||
|
||||
#### Phase 4: Security Deep Dive (OPTIONAL)
|
||||
|
||||
**Role**: Security architect
|
||||
|
||||
Focused analysis step — deepens the security column from the solution draft into a proper threat model and controls specification.
|
||||
|
||||
**Input**: Latest `solution_draft##.md` from OUTPUT_DIR + `security_approach.md` from INPUT_DIR + problem context
|
||||
|
||||
**Task**:
|
||||
1. Build threat model: asset inventory, threat actors, attack vectors
|
||||
2. Define security requirements and proposed controls per component (with risk level)
|
||||
3. Summarize authentication/authorization, data protection, secure communication, and logging/monitoring approach
|
||||
|
||||
**📁 Save action**: Write `OUTPUT_DIR/security_analysis.md` with:
|
||||
- Threat model (assets, actors, vectors)
|
||||
- Per-component security requirements and controls table
|
||||
- Security controls summary
|
||||
|
||||
---
|
||||
|
||||
### Mode B: Solution Assessment
|
||||
|
||||
Triggered when `solution_draft*.md` files exist in OUTPUT_DIR.
|
||||
|
||||
**Role**: Professional software architect
|
||||
|
||||
Full 8-step research methodology applied to assessing and improving an existing solution draft.
|
||||
|
||||
**Input**: All files from INPUT_DIR + the latest (highest-numbered) `solution_draft##.md` from OUTPUT_DIR
|
||||
|
||||
**Task** (drives the 8-step engine):
|
||||
1. Read the existing solution draft thoroughly
|
||||
2. Research in internet — identify all potential weak points and problems
|
||||
3. Identify security weak points and vulnerabilities
|
||||
4. Identify performance bottlenecks
|
||||
5. Address these problems and find ways to solve them
|
||||
6. Based on findings, form a new solution draft in the same format
|
||||
|
||||
**📁 Save action**: Write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`
|
||||
|
||||
**Optional follow-up**: After Mode B completes, the user can request Phase 3 (Tech Stack Consolidation) or Phase 4 (Security Deep Dive) using the revised draft. These phases work identically to their Mode A descriptions above.
|
||||
|
||||
## Escalation Rules
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| Unclear problem boundaries | **ASK user** |
|
||||
| Ambiguous acceptance criteria values | **ASK user** |
|
||||
| Missing context files (`security_approach.md`, `input_data/`) | **ASK user** what they have |
|
||||
| Conflicting restrictions | **ASK user** which takes priority |
|
||||
| Technology choice with multiple valid options | **ASK user** |
|
||||
| Contradictions between input files | **ASK user** |
|
||||
| Missing acceptance criteria or restrictions files | **WARN user**, ask whether to proceed |
|
||||
| File naming within research artifacts | PROCEED |
|
||||
| Source tier classification | PROCEED |
|
||||
|
||||
## Trigger Conditions
|
||||
|
||||
When the user wants to:
|
||||
- Deeply understand a concept/technology/phenomenon
|
||||
- Compare similarities and differences between two or more things
|
||||
- Gather information and evidence for a decision
|
||||
- Assess or improve an existing solution draft
|
||||
|
||||
**Keywords**:
|
||||
- "deep research", "deep dive", "in-depth analysis"
|
||||
- "research this", "investigate", "look into"
|
||||
- "assess solution", "review draft", "improve solution"
|
||||
- "comparative analysis", "concept comparison", "technical comparison"
|
||||
|
||||
**Differentiation from other Skills**:
|
||||
- Needs a **visual knowledge graph** → use `research-to-diagram`
|
||||
- Needs **written output** (articles/tutorials) → use `wsy-writer`
|
||||
- Needs **material organization** → use `material-to-markdown`
|
||||
- Needs **research + solution draft** → use this Skill
|
||||
|
||||
## Research Engine (8-Step Method)
|
||||
|
||||
The 8-step method is the core research engine used by both modes. Steps 0-1 and Step 8 have mode-specific behavior; Steps 2-7 are identical regardless of mode.
|
||||
|
||||
### Step 0: Question Type Classification
|
||||
|
||||
First, classify the research question type and select the corresponding strategy:
|
||||
|
||||
| Question Type | Core Task | Focus Dimensions |
|
||||
|---------------|-----------|------------------|
|
||||
| **Concept Comparison** | Build comparison framework | Mechanism differences, applicability boundaries |
|
||||
| **Decision Support** | Weigh trade-offs | Cost, risk, benefit |
|
||||
| **Trend Analysis** | Map evolution trajectory | History, driving factors, predictions |
|
||||
| **Problem Diagnosis** | Root cause analysis | Symptoms, causes, evidence chain |
|
||||
| **Knowledge Organization** | Systematic structuring | Definitions, classifications, relationships |
|
||||
|
||||
**Mode-specific classification**:
|
||||
|
||||
| Mode / Phase | Typical Question Type |
|
||||
|--------------|----------------------|
|
||||
| Mode A Phase 1 | Knowledge Organization + Decision Support |
|
||||
| Mode A Phase 2 | Decision Support |
|
||||
| Mode B | Problem Diagnosis + Decision Support |
|
||||
|
||||
### Step 0.5: Novelty Sensitivity Assessment (BLOCKING)
|
||||
|
||||
Before starting research, assess the novelty sensitivity of the question (Critical/High/Medium/Low). This determines source time windows and filtering strategy.
|
||||
|
||||
**For full classification table, critical-domain rules, trigger words, and assessment template**: Read `references/novelty-sensitivity.md`
|
||||
|
||||
Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources within 6 months, mandatory version annotations, cross-validation from 2+ sources, and direct verification of official download pages.
|
||||
|
||||
**📁 Save action**: Append timeliness assessment to the end of `00_question_decomposition.md`
|
||||
|
||||
---
|
||||
|
||||
### Step 1: Question Decomposition & Boundary Definition
|
||||
|
||||
**Mode-specific sub-questions**:
|
||||
|
||||
**Mode A Phase 2** (Initial Research — Problem & Solution):
|
||||
- "What existing/competitor solutions address this problem?"
|
||||
- "What are the component parts of this problem?"
|
||||
- "For each component, what are the state-of-the-art solutions?"
|
||||
- "What are the security considerations per component?"
|
||||
- "What are the cost implications of each approach?"
|
||||
|
||||
**Mode B** (Solution Assessment):
|
||||
- "What are the weak points and potential problems in the existing draft?"
|
||||
- "What are the security vulnerabilities in the proposed architecture?"
|
||||
- "Where are the performance bottlenecks?"
|
||||
- "What solutions exist for each identified issue?"
|
||||
|
||||
**General sub-question patterns** (use when applicable):
|
||||
- **Sub-question A**: "What is X and how does it work?" (Definition & mechanism)
|
||||
- **Sub-question B**: "What are the dimensions of relationship/difference between X and Y?" (Comparative analysis)
|
||||
- **Sub-question C**: "In what scenarios is X applicable/inapplicable?" (Boundary conditions)
|
||||
- **Sub-question D**: "What are X's development trends/best practices?" (Extended analysis)
|
||||
|
||||
**⚠️ Research Subject Boundary Definition (BLOCKING - must be explicit)**:
|
||||
|
||||
When decomposing questions, you must explicitly define the **boundaries of the research subject**:
|
||||
|
||||
| Dimension | Boundary to define | Example |
|
||||
|-----------|--------------------|---------|
|
||||
| **Population** | Which group is being studied? | University students vs K-12 vs vocational students vs all students |
|
||||
| **Geography** | Which region is being studied? | Chinese universities vs US universities vs global |
|
||||
| **Timeframe** | Which period is being studied? | Post-2020 vs full historical picture |
|
||||
| **Level** | Which level is being studied? | Undergraduate vs graduate vs vocational |
|
||||
|
||||
**Common mistake**: User asks about "university classroom issues" but sources include policies targeting "K-12 students" — mismatched target populations will invalidate the entire research.
|
||||
|
||||
**📁 Save action**:
|
||||
1. Read all files from INPUT_DIR to ground the research in the project context
|
||||
2. Create working directory `RESEARCH_DIR/`
|
||||
3. Write `00_question_decomposition.md`, including:
|
||||
- Original question
|
||||
- Active mode (A Phase 2 or B) and rationale
|
||||
- Summary of relevant problem context from INPUT_DIR
|
||||
- Classified question type and rationale
|
||||
- **Research subject boundary definition** (population, geography, timeframe, level)
|
||||
- List of decomposed sub-questions
|
||||
4. Write TodoWrite to track progress
|
||||
|
||||
### Step 2: Source Tiering & Authority Anchoring
|
||||
|
||||
Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). Conclusions must be traceable to L1/L2; L3/L4 serve as supplementary and validation.
|
||||
|
||||
**For full tier definitions, search strategies, community mining steps, and source registry templates**: Read `references/source-tiering.md`
|
||||
|
||||
**Tool Usage**:
|
||||
- Use `WebSearch` for broad searches; `WebFetch` to read specific pages
|
||||
- Use the `context7` MCP server (`resolve-library-id` then `get-library-docs`) for up-to-date library/framework documentation
|
||||
- Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories)
|
||||
- When citing web sources, include the URL and date accessed
|
||||
|
||||
**📁 Save action**:
|
||||
For each source consulted, **immediately** append to `01_source_registry.md` using the entry template from `references/source-tiering.md`.
|
||||
|
||||
### Step 3: Fact Extraction & Evidence Cards
|
||||
|
||||
Transform sources into **verifiable fact cards**:
|
||||
|
||||
```markdown
|
||||
## Fact Cards
|
||||
|
||||
### Fact 1
|
||||
- **Statement**: [specific fact description]
|
||||
- **Source**: [link/document section]
|
||||
- **Confidence**: High/Medium/Low
|
||||
|
||||
### Fact 2
|
||||
...
|
||||
```
|
||||
|
||||
**Key discipline**:
|
||||
- Pin down facts first, then reason
|
||||
- Distinguish "what officials said" from "what I infer"
|
||||
- When conflicting information is found, annotate and preserve both sides
|
||||
- Annotate confidence level:
|
||||
- ✅ High: Explicitly stated in official documentation
|
||||
- ⚠️ Medium: Mentioned in official blog but not formally documented
|
||||
- ❓ Low: Inference or from unofficial sources
|
||||
|
||||
**📁 Save action**:
|
||||
For each extracted fact, **immediately** append to `02_fact_cards.md`:
|
||||
```markdown
|
||||
## Fact #[number]
|
||||
- **Statement**: [specific fact description]
|
||||
- **Source**: [Source #number] [link]
|
||||
- **Phase**: [Phase 1 / Phase 2 / Assessment]
|
||||
- **Target Audience**: [which group this fact applies to, inherited from source or further refined]
|
||||
- **Confidence**: ✅/⚠️/❓
|
||||
- **Related Dimension**: [corresponding comparison dimension]
|
||||
```
|
||||
|
||||
**⚠️ Target audience in fact statements**:
|
||||
- If a fact comes from a "partially overlapping" or "reference only" source, the statement **must explicitly annotate the applicable scope**
|
||||
- Wrong: "The Ministry of Education banned phones in classrooms" (doesn't specify who)
|
||||
- Correct: "The Ministry of Education banned K-12 students from bringing phones into classrooms (does not apply to university students)"
|
||||
|
||||
### Step 4: Build Comparison/Analysis Framework
|
||||
|
||||
Based on the question type, select fixed analysis dimensions. **For dimension lists** (General, Concept Comparison, Decision Support): Read `references/comparison-frameworks.md`
|
||||
|
||||
**📁 Save action**:
|
||||
Write to `03_comparison_framework.md`:
|
||||
```markdown
|
||||
# Comparison Framework
|
||||
|
||||
## Selected Framework Type
|
||||
[Concept Comparison / Decision Support / ...]
|
||||
|
||||
## Selected Dimensions
|
||||
1. [Dimension 1]
|
||||
2. [Dimension 2]
|
||||
...
|
||||
|
||||
## Initial Population
|
||||
| Dimension | X | Y | Factual Basis |
|
||||
|-----------|---|---|---------------|
|
||||
| [Dimension 1] | [description] | [description] | Fact #1, #3 |
|
||||
| ... | | | |
|
||||
```
|
||||
|
||||
### Step 5: Reference Point Baseline Alignment
|
||||
|
||||
Ensure all compared parties have clear, consistent definitions:
|
||||
|
||||
**Checklist**:
|
||||
- [ ] Is the reference point's definition stable/widely accepted?
|
||||
- [ ] Does it need verification, or can domain common knowledge be used?
|
||||
- [ ] Does the reader's understanding of the reference point match mine?
|
||||
- [ ] Are there ambiguities that need to be clarified first?
|
||||
|
||||
### Step 6: Fact-to-Conclusion Reasoning Chain
|
||||
|
||||
Explicitly write out the "fact → comparison → conclusion" reasoning process:
|
||||
|
||||
```markdown
|
||||
## Reasoning Process
|
||||
|
||||
### Regarding [Dimension Name]
|
||||
|
||||
1. **Fact confirmation**: According to [source], X's mechanism is...
|
||||
2. **Compare with reference**: While Y's mechanism is...
|
||||
3. **Conclusion**: Therefore, the difference between X and Y on this dimension is...
|
||||
```
|
||||
|
||||
**Key discipline**:
|
||||
- Conclusions come from mechanism comparison, not "gut feelings"
|
||||
- Every conclusion must be traceable to specific facts
|
||||
- Uncertain conclusions must be annotated
|
||||
|
||||
**📁 Save action**:
|
||||
Write to `04_reasoning_chain.md`:
|
||||
```markdown
|
||||
# Reasoning Chain
|
||||
|
||||
## Dimension 1: [Dimension Name]
|
||||
|
||||
### Fact Confirmation
|
||||
According to [Fact #X], X's mechanism is...
|
||||
|
||||
### Reference Comparison
|
||||
While Y's mechanism is... (Source: [Fact #Y])
|
||||
|
||||
### Conclusion
|
||||
Therefore, the difference between X and Y on this dimension is...
|
||||
|
||||
### Confidence
|
||||
✅/⚠️/❓ + rationale
|
||||
|
||||
---
|
||||
## Dimension 2: [Dimension Name]
|
||||
...
|
||||
```
|
||||
|
||||
### Step 7: Use-Case Validation (Sanity Check)
|
||||
|
||||
Validate conclusions against a typical scenario:
|
||||
|
||||
**Validation questions**:
|
||||
- Based on my conclusions, how should this scenario be handled?
|
||||
- Is that actually the case?
|
||||
- Are there counterexamples that need to be addressed?
|
||||
|
||||
**Review checklist**:
|
||||
- [ ] Are draft conclusions consistent with Step 3 fact cards?
|
||||
- [ ] Are there any important dimensions missed?
|
||||
- [ ] Is there any over-extrapolation?
|
||||
- [ ] Are conclusions actionable/verifiable?
|
||||
|
||||
**📁 Save action**:
|
||||
Write to `05_validation_log.md`:
|
||||
```markdown
|
||||
# Validation Log
|
||||
|
||||
## Validation Scenario
|
||||
[Scenario description]
|
||||
|
||||
## Expected Based on Conclusions
|
||||
If using X: [expected behavior]
|
||||
If using Y: [expected behavior]
|
||||
|
||||
## Actual Validation Results
|
||||
[actual situation]
|
||||
|
||||
## Counterexamples
|
||||
[yes/no, describe if yes]
|
||||
|
||||
## Review Checklist
|
||||
- [x] Draft conclusions consistent with fact cards
|
||||
- [x] No important dimensions missed
|
||||
- [x] No over-extrapolation
|
||||
- [ ] Issue found: [if any]
|
||||
|
||||
## Conclusions Requiring Revision
|
||||
[if any]
|
||||
```
|
||||
|
||||
### Step 8: Deliverable Formatting
|
||||
|
||||
Make the output **readable, traceable, and actionable**.
|
||||
|
||||
**📁 Save action**:
|
||||
Integrate all intermediate artifacts. Write to `OUTPUT_DIR/solution_draft##.md` using the appropriate output template based on active mode:
|
||||
- Mode A: `templates/solution_draft_mode_a.md`
|
||||
- Mode B: `templates/solution_draft_mode_b.md`
|
||||
|
||||
Sources to integrate:
|
||||
- Extract background from `00_question_decomposition.md`
|
||||
- Reference key facts from `02_fact_cards.md`
|
||||
- Organize conclusions from `04_reasoning_chain.md`
|
||||
- Generate references from `01_source_registry.md`
|
||||
- Supplement with use cases from `05_validation_log.md`
|
||||
- For Mode A: include AC assessment from `00_ac_assessment.md`
|
||||
|
||||
## Solution Draft Output Templates
|
||||
|
||||
### Mode A: Initial Research Output
|
||||
|
||||
Use template: `templates/solution_draft_mode_a.md`
|
||||
|
||||
### Mode B: Solution Assessment Output
|
||||
|
||||
Use template: `templates/solution_draft_mode_b.md`
|
||||
|
||||
## Stakeholder Perspectives
|
||||
|
||||
Adjust content depth based on audience:
|
||||
|
||||
| Audience | Focus | Detail Level |
|
||||
|----------|-------|--------------|
|
||||
| **Decision-makers** | Conclusions, risks, recommendations | Concise, emphasize actionability |
|
||||
| **Implementers** | Specific mechanisms, how-to | Detailed, emphasize how to do it |
|
||||
| **Technical experts** | Details, boundary conditions, limitations | In-depth, emphasize accuracy |
|
||||
|
||||
## Output Files
|
||||
|
||||
Default intermediate artifacts location: `RESEARCH_DIR/`
|
||||
|
||||
**Required files** (automatically generated through the process):
|
||||
|
||||
| File | Content | When Generated |
|
||||
|------|---------|----------------|
|
||||
| `00_ac_assessment.md` | AC & restrictions assessment (Mode A only) | After Phase 1 completion |
|
||||
| `00_question_decomposition.md` | Question type, sub-question list | After Step 0-1 completion |
|
||||
| `01_source_registry.md` | All source links and summaries | Continuously updated during Step 2 |
|
||||
| `02_fact_cards.md` | Extracted facts and sources | Continuously updated during Step 3 |
|
||||
| `03_comparison_framework.md` | Selected framework and populated data | After Step 4 completion |
|
||||
| `04_reasoning_chain.md` | Fact → conclusion reasoning | After Step 6 completion |
|
||||
| `05_validation_log.md` | Use-case validation and review | After Step 7 completion |
|
||||
| `OUTPUT_DIR/solution_draft##.md` | Complete solution draft | After Step 8 completion |
|
||||
| `OUTPUT_DIR/tech_stack.md` | Tech stack evaluation and decisions | After Phase 3 (optional) |
|
||||
| `OUTPUT_DIR/security_analysis.md` | Threat model and security controls | After Phase 4 (optional) |
|
||||
|
||||
**Optional files**:
|
||||
- `raw/*.md` - Raw source archives (saved when content is lengthy)
|
||||
|
||||
## Methodology Quick Reference Card
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────────────────────┐
|
||||
│ Deep Research — Mode-Aware 8-Step Method │
|
||||
├──────────────────────────────────────────────────────────────────┤
|
||||
│ CONTEXT: Resolve mode (project vs standalone) + set paths │
|
||||
│ GUARDRAILS: Check INPUT_DIR/INPUT_FILE exists + required files │
|
||||
│ MODE DETECT: solution_draft*.md in 01_solution? → A or B │
|
||||
│ │
|
||||
│ MODE A: Initial Research │
|
||||
│ Phase 1: AC & Restrictions Assessment (BLOCKING) │
|
||||
│ Phase 2: Full 8-step → solution_draft##.md │
|
||||
│ Phase 3: Tech Stack Consolidation (OPTIONAL) → tech_stack.md │
|
||||
│ Phase 4: Security Deep Dive (OPTIONAL) → security_analysis.md │
|
||||
│ │
|
||||
│ MODE B: Solution Assessment │
|
||||
│ Read latest draft → Full 8-step → solution_draft##.md (N+1) │
|
||||
│ Optional: Phase 3 / Phase 4 on revised draft │
|
||||
│ │
|
||||
│ 8-STEP ENGINE: │
|
||||
│ 0. Classify question type → Select framework template │
|
||||
│ 1. Decompose question → mode-specific sub-questions │
|
||||
│ 2. Tier sources → L1 Official > L2 Blog > L3 Media > L4 │
|
||||
│ 3. Extract facts → Each with source, confidence level │
|
||||
│ 4. Build framework → Fixed dimensions, structured compare │
|
||||
│ 5. Align references → Ensure unified definitions │
|
||||
│ 6. Reasoning chain → Fact→Compare→Conclude, explicit │
|
||||
│ 7. Use-case validation → Sanity check, prevent armchairing │
|
||||
│ 8. Deliverable → solution_draft##.md (mode-specific format) │
|
||||
├──────────────────────────────────────────────────────────────────┤
|
||||
│ Key discipline: Ask don't assume · Facts before reasoning │
|
||||
│ Conclusions from mechanism, not gut feelings │
|
||||
└──────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Usage Examples
|
||||
|
||||
For detailed execution flow examples (Mode A initial, Mode B assessment, standalone, force override): Read `references/usage-examples.md`
|
||||
|
||||
## Source Verifiability Requirements
|
||||
|
||||
Every cited piece of external information must be directly verifiable by the user. All links must be publicly accessible (annotate `[login required]` if not), citations must include exact section/page/timestamp, and unverifiable information must be annotated `[limited source]`. Full checklist in `references/quality-checklists.md`.
|
||||
|
||||
## Quality Checklist
|
||||
|
||||
Before completing the solution draft, run through the checklists in `references/quality-checklists.md`. This covers:
|
||||
- General quality (L1/L2 support, verifiability, actionability)
|
||||
- Mode A specific (AC assessment, competitor analysis, component tables, tech stack)
|
||||
- Mode B specific (findings table, self-contained draft, performance column)
|
||||
- Timeliness check for high-sensitivity domains (version annotations, cross-validation, community mining)
|
||||
- Target audience consistency (boundary definition, source matching, fact card audience)
|
||||
|
||||
## Final Reply Guidelines
|
||||
|
||||
When replying to the user after research is complete:
|
||||
|
||||
**✅ Should include**:
|
||||
- Active mode used (A or B) and which optional phases were executed
|
||||
- One-sentence core conclusion
|
||||
- Key findings summary (3-5 points)
|
||||
- Path to the solution draft: `OUTPUT_DIR/solution_draft##.md`
|
||||
- Paths to optional artifacts if produced: `tech_stack.md`, `security_analysis.md`
|
||||
- If there are significant uncertainties, annotate points requiring further verification
|
||||
|
||||
**❌ Must not include**:
|
||||
- Process file listings (e.g., `00_question_decomposition.md`, `01_source_registry.md`, etc.)
|
||||
- Detailed research step descriptions
|
||||
- Working directory structure display
|
||||
|
||||
**Reason**: Process files are for retrospective review, not for the user. The user cares about conclusions, not the process.
|
||||
@@ -0,0 +1,34 @@
|
||||
# Comparison & Analysis Frameworks — Reference
|
||||
|
||||
## General Dimensions (select as needed)
|
||||
|
||||
1. Goal / What problem does it solve
|
||||
2. Working mechanism / Process
|
||||
3. Input / Output / Boundaries
|
||||
4. Advantages / Disadvantages / Trade-offs
|
||||
5. Applicable scenarios / Boundary conditions
|
||||
6. Cost / Benefit / Risk
|
||||
7. Historical evolution / Future trends
|
||||
8. Security / Permissions / Controllability
|
||||
|
||||
## Concept Comparison Specific Dimensions
|
||||
|
||||
1. Definition & essence
|
||||
2. Trigger / invocation method
|
||||
3. Execution agent
|
||||
4. Input/output & type constraints
|
||||
5. Determinism & repeatability
|
||||
6. Resource & context management
|
||||
7. Composition & reuse patterns
|
||||
8. Security boundaries & permission control
|
||||
|
||||
## Decision Support Specific Dimensions
|
||||
|
||||
1. Solution overview
|
||||
2. Implementation cost
|
||||
3. Maintenance cost
|
||||
4. Risk assessment
|
||||
5. Expected benefit
|
||||
6. Applicable scenarios
|
||||
7. Team capability requirements
|
||||
8. Migration difficulty
|
||||
@@ -0,0 +1,75 @@
|
||||
# Novelty Sensitivity Assessment — Reference
|
||||
|
||||
## Novelty Sensitivity Classification
|
||||
|
||||
| Sensitivity Level | Typical Domains | Source Time Window | Description |
|
||||
|-------------------|-----------------|-------------------|-------------|
|
||||
| **Critical** | AI/LLMs, blockchain, cryptocurrency | 3-6 months | Technology iterates extremely fast; info from months ago may be completely outdated |
|
||||
| **High** | Cloud services, frontend frameworks, API interfaces | 6-12 months | Frequent version updates; must confirm current version |
|
||||
| **Medium** | Programming languages, databases, operating systems | 1-2 years | Relatively stable but still evolving |
|
||||
| **Low** | Algorithm fundamentals, design patterns, theoretical concepts | No limit | Core principles change slowly |
|
||||
|
||||
## Critical Sensitivity Domain Special Rules
|
||||
|
||||
When the research topic involves the following domains, special rules must be enforced:
|
||||
|
||||
**Trigger word identification**:
|
||||
- AI-related: LLM, GPT, Claude, Gemini, AI Agent, RAG, vector database, prompt engineering
|
||||
- Cloud-native: Kubernetes new versions, Serverless, container runtimes
|
||||
- Cutting-edge tech: Web3, quantum computing, AR/VR
|
||||
|
||||
**Mandatory rules**:
|
||||
|
||||
1. **Search with time constraints**:
|
||||
- Use `time_range: "month"` or `time_range: "week"` to limit search results
|
||||
- Prefer `start_date: "YYYY-MM-DD"` set to within the last 3 months
|
||||
|
||||
2. **Elevate official source priority**:
|
||||
- Must first consult official documentation, official blogs, official Changelogs
|
||||
- GitHub Release Notes, official X/Twitter announcements
|
||||
- Academic papers (arXiv and other preprint platforms)
|
||||
|
||||
3. **Mandatory version number annotation**:
|
||||
- Any technical description must annotate the current version number
|
||||
- Example: "Claude 3.5 Sonnet (claude-3-5-sonnet-20241022) supports..."
|
||||
- Prohibit vague statements like "the latest version supports..."
|
||||
|
||||
4. **Outdated information handling**:
|
||||
- Technical blogs/tutorials older than 6 months -> historical reference only, cannot serve as factual evidence
|
||||
- Version inconsistency found -> must verify current version before using
|
||||
- Obviously outdated descriptions (e.g., "will support in the future" but now already supported) -> discard directly
|
||||
|
||||
5. **Cross-validation**:
|
||||
- Highly sensitive information must be confirmed from at least 2 independent sources
|
||||
- Priority: Official docs > Official blogs > Authoritative tech media > Personal blogs
|
||||
|
||||
6. **Official download/release page direct verification (BLOCKING)**:
|
||||
- Must directly visit official download pages to verify platform support (don't rely on search engine caches)
|
||||
- Use `WebFetch` to directly extract download page content
|
||||
- Search results about "coming soon" or "planned support" may be outdated; must verify in real time
|
||||
- Platform support is frequently changing information; cannot infer from old sources
|
||||
|
||||
7. **Product-specific protocol/feature name search (BLOCKING)**:
|
||||
- Beyond searching the product name, must additionally search protocol/standard names the product supports
|
||||
- Common protocols/standards to search:
|
||||
- AI tools: MCP, ACP (Agent Client Protocol), LSP, DAP
|
||||
- Cloud services: OAuth, OIDC, SAML
|
||||
- Data exchange: GraphQL, gRPC, REST
|
||||
- Search format: `"<product_name> <protocol_name> support"` or `"<product_name> <protocol_name> integration"`
|
||||
|
||||
## Timeliness Assessment Output Template
|
||||
|
||||
```markdown
|
||||
## Timeliness Sensitivity Assessment
|
||||
|
||||
- **Research Topic**: [topic]
|
||||
- **Sensitivity Level**: Critical / High / Medium / Low
|
||||
- **Rationale**: [why this level]
|
||||
- **Source Time Window**: [X months/years]
|
||||
- **Priority official sources to consult**:
|
||||
1. [Official source 1]
|
||||
2. [Official source 2]
|
||||
- **Key version information to verify**:
|
||||
- [Product/technology 1]: Current version ____
|
||||
- [Product/technology 2]: Current version ____
|
||||
```
|
||||
@@ -0,0 +1,61 @@
|
||||
# Quality Checklists — Reference
|
||||
|
||||
## General Quality
|
||||
|
||||
- [ ] All core conclusions have L1/L2 tier factual support
|
||||
- [ ] No use of vague words like "possibly", "probably" without annotating uncertainty
|
||||
- [ ] Comparison dimensions are complete with no key differences missed
|
||||
- [ ] At least one real use case validates conclusions
|
||||
- [ ] References are complete with accessible links
|
||||
- [ ] Every citation can be directly verified by the user (source verifiability)
|
||||
- [ ] Structure hierarchy is clear; executives can quickly locate information
|
||||
|
||||
## Mode A Specific
|
||||
|
||||
- [ ] Phase 1 completed: AC assessment was presented to and confirmed by user
|
||||
- [ ] AC assessment consistent: Solution draft respects the (possibly adjusted) acceptance criteria and restrictions
|
||||
- [ ] Competitor analysis included: Existing solutions were researched
|
||||
- [ ] All components have comparison tables: Each component lists alternatives with tools, advantages, limitations, security, cost
|
||||
- [ ] Tools/libraries verified: Suggested tools actually exist and work as described
|
||||
- [ ] Testing strategy covers AC: Tests map to acceptance criteria
|
||||
- [ ] Tech stack documented (if Phase 3 ran): `tech_stack.md` has evaluation tables, risk assessment, and learning requirements
|
||||
- [ ] Security analysis documented (if Phase 4 ran): `security_analysis.md` has threat model and per-component controls
|
||||
|
||||
## Mode B Specific
|
||||
|
||||
- [ ] Findings table complete: All identified weak points documented with solutions
|
||||
- [ ] Weak point categories covered: Functional, security, and performance assessed
|
||||
- [ ] New draft is self-contained: Written as if from scratch, no "updated" markers
|
||||
- [ ] Performance column included: Mode B comparison tables include performance characteristics
|
||||
- [ ] Previous draft issues addressed: Every finding in the table is resolved in the new draft
|
||||
|
||||
## Timeliness Check (High-Sensitivity Domain BLOCKING)
|
||||
|
||||
When the research topic has Critical or High sensitivity level:
|
||||
|
||||
- [ ] Timeliness sensitivity assessment completed: `00_question_decomposition.md` contains a timeliness assessment section
|
||||
- [ ] Source timeliness annotated: Every source has publication date, timeliness status, version info
|
||||
- [ ] No outdated sources used as factual evidence (Critical: within 6 months; High: within 1 year)
|
||||
- [ ] Version numbers explicitly annotated for all technical products/APIs/SDKs
|
||||
- [ ] Official sources prioritized: Core conclusions have support from official documentation/blogs
|
||||
- [ ] Cross-validation completed: Key technical information confirmed from at least 2 independent sources
|
||||
- [ ] Download page directly verified: Platform support info comes from real-time extraction of official download pages
|
||||
- [ ] Protocol/feature names searched: Searched for product-supported protocol names (MCP, ACP, etc.)
|
||||
- [ ] GitHub Issues mined: Reviewed product's GitHub Issues popular discussions
|
||||
- [ ] Community hotspots identified: Identified and recorded feature points users care most about
|
||||
|
||||
## Target Audience Consistency Check (BLOCKING)
|
||||
|
||||
- [ ] Research boundary clearly defined: `00_question_decomposition.md` has clear population/geography/timeframe/level boundaries
|
||||
- [ ] Every source has target audience annotated in `01_source_registry.md`
|
||||
- [ ] Mismatched sources properly handled (excluded, annotated, or marked reference-only)
|
||||
- [ ] No audience confusion in fact cards: Every fact has target audience consistent with research boundary
|
||||
- [ ] No audience confusion in the report: Policies/research/data cited have consistent target audiences
|
||||
|
||||
## Source Verifiability
|
||||
|
||||
- [ ] All cited links are publicly accessible (annotate `[login required]` if not)
|
||||
- [ ] Citations include exact section/page/timestamp for long documents
|
||||
- [ ] Cited facts have corresponding statements in the original text (no over-interpretation)
|
||||
- [ ] Source publication/update dates annotated; technical docs include version numbers
|
||||
- [ ] Unverifiable information annotated `[limited source]` and not sole support for core conclusions
|
||||
@@ -0,0 +1,118 @@
|
||||
# Source Tiering & Authority Anchoring — Reference
|
||||
|
||||
## Source Tiers
|
||||
|
||||
| Tier | Source Type | Purpose | Credibility |
|
||||
|------|------------|---------|-------------|
|
||||
| **L1** | Official docs, papers, specs, RFCs | Definitions, mechanisms, verifiable facts | High |
|
||||
| **L2** | Official blogs, tech talks, white papers | Design intent, architectural thinking | High |
|
||||
| **L3** | Authoritative media, expert commentary, tutorials | Supplementary intuition, case studies | Medium |
|
||||
| **L4** | Community discussions, personal blogs, forums | Discover blind spots, validate understanding | Low |
|
||||
|
||||
## L4 Community Source Specifics (mandatory for product comparison research)
|
||||
|
||||
| Source Type | Access Method | Value |
|
||||
|------------|---------------|-------|
|
||||
| **GitHub Issues** | Visit `github.com/<org>/<repo>/issues` | Real user pain points, feature requests, bug reports |
|
||||
| **GitHub Discussions** | Visit `github.com/<org>/<repo>/discussions` | Feature discussions, usage insights, community consensus |
|
||||
| **Reddit** | Search `site:reddit.com "<product_name>"` | Authentic user reviews, comparison discussions |
|
||||
| **Hacker News** | Search `site:news.ycombinator.com "<product_name>"` | In-depth technical community discussions |
|
||||
| **Discord/Telegram** | Product's official community channels | Active user feedback (must annotate [limited source]) |
|
||||
|
||||
## Principles
|
||||
|
||||
- Conclusions must be traceable to L1/L2
|
||||
- L3/L4 serve only as supplementary and validation
|
||||
- L4 community discussions are used to discover "what users truly care about"
|
||||
- Record all information sources
|
||||
|
||||
## Timeliness Filtering Rules (execute based on Step 0.5 sensitivity level)
|
||||
|
||||
| Sensitivity Level | Source Filtering Rule | Suggested Search Parameters |
|
||||
|-------------------|----------------------|-----------------------------|
|
||||
| Critical | Only accept sources within 6 months as factual evidence | `time_range: "month"` or `start_date` set to last 3 months |
|
||||
| High | Prefer sources within 1 year; annotate if older than 1 year | `time_range: "year"` |
|
||||
| Medium | Sources within 2 years used normally; older ones need validity check | Default search |
|
||||
| Low | No time limit | Default search |
|
||||
|
||||
## High-Sensitivity Domain Search Strategy
|
||||
|
||||
```
|
||||
1. Round 1: Targeted official source search
|
||||
- Use include_domains to restrict to official domains
|
||||
- Example: include_domains: ["anthropic.com", "openai.com", "docs.xxx.com"]
|
||||
|
||||
2. Round 2: Official download/release page direct verification (BLOCKING)
|
||||
- Directly visit official download pages; don't rely on search caches
|
||||
- Use tavily-extract or WebFetch to extract page content
|
||||
- Verify: platform support, current version number, release date
|
||||
|
||||
3. Round 3: Product-specific protocol/feature search (BLOCKING)
|
||||
- Search protocol names the product supports (MCP, ACP, LSP, etc.)
|
||||
- Format: "<product_name> <protocol_name>" site:official_domain
|
||||
|
||||
4. Round 4: Time-limited broad search
|
||||
- time_range: "month" or start_date set to recent
|
||||
- Exclude obviously outdated sources
|
||||
|
||||
5. Round 5: Version verification
|
||||
- Cross-validate version numbers from search results
|
||||
- If inconsistency found, immediately consult official Changelog
|
||||
|
||||
6. Round 6: Community voice mining (BLOCKING - mandatory for product comparison research)
|
||||
- Visit the product's GitHub Issues page, review popular/pinned issues
|
||||
- Search Issues for key feature terms (e.g., "MCP", "plugin", "integration")
|
||||
- Review discussion trends from the last 3-6 months
|
||||
- Identify the feature points and differentiating characteristics users care most about
|
||||
```
|
||||
|
||||
## Community Voice Mining Detailed Steps
|
||||
|
||||
```
|
||||
GitHub Issues Mining Steps:
|
||||
1. Visit github.com/<org>/<repo>/issues
|
||||
2. Sort by "Most commented" to view popular discussions
|
||||
3. Search keywords:
|
||||
- Feature-related: feature request, enhancement, MCP, plugin, API
|
||||
- Comparison-related: vs, compared to, alternative, migrate from
|
||||
4. Review issue labels: enhancement, feature, discussion
|
||||
5. Record frequently occurring feature demands and user pain points
|
||||
|
||||
Value Translation:
|
||||
- Frequently discussed features -> likely differentiating highlights
|
||||
- User complaints/requests -> likely product weaknesses
|
||||
- Comparison discussions -> directly obtain user-perspective difference analysis
|
||||
```
|
||||
|
||||
## Source Registry Entry Template
|
||||
|
||||
For each source consulted, immediately append to `01_source_registry.md`:
|
||||
```markdown
|
||||
## Source #[number]
|
||||
- **Title**: [source title]
|
||||
- **Link**: [URL]
|
||||
- **Tier**: L1/L2/L3/L4
|
||||
- **Publication Date**: [YYYY-MM-DD]
|
||||
- **Timeliness Status**: Currently valid / Needs verification / Outdated (reference only)
|
||||
- **Version Info**: [If involving a specific version, must annotate]
|
||||
- **Target Audience**: [Explicitly annotate the group/geography/level this source targets]
|
||||
- **Research Boundary Match**: Full match / Partial overlap / Reference only
|
||||
- **Summary**: [1-2 sentence key content]
|
||||
- **Related Sub-question**: [which sub-question this corresponds to]
|
||||
```
|
||||
|
||||
## Target Audience Verification (BLOCKING)
|
||||
|
||||
Before including each source, verify that its target audience matches the research boundary:
|
||||
|
||||
| Source Type | Target audience to verify | Verification method |
|
||||
|------------|---------------------------|---------------------|
|
||||
| **Policy/Regulation** | Who is it for? (K-12/university/all) | Check document title, scope clauses |
|
||||
| **Academic Research** | Who are the subjects? (vocational/undergraduate/graduate) | Check methodology/sample description sections |
|
||||
| **Statistical Data** | Which population is measured? | Check data source description |
|
||||
| **Case Reports** | What type of institution is involved? | Confirm institution type |
|
||||
|
||||
Handling mismatched sources:
|
||||
- Target audience completely mismatched -> do not include
|
||||
- Partially overlapping -> include but annotate applicable scope
|
||||
- Usable as analogous reference -> include but explicitly annotate "reference only"
|
||||
@@ -0,0 +1,56 @@
|
||||
# Usage Examples — Reference
|
||||
|
||||
## Example 1: Initial Research (Mode A)
|
||||
|
||||
```
|
||||
User: Research this problem and find the best solution
|
||||
```
|
||||
|
||||
Execution flow:
|
||||
1. Context resolution: no explicit file -> project mode (INPUT_DIR=`_docs/00_problem/`, OUTPUT_DIR=`_docs/01_solution/`)
|
||||
2. Guardrails: verify INPUT_DIR exists with required files
|
||||
3. Mode detection: no `solution_draft*.md` -> Mode A
|
||||
4. Phase 1: Assess acceptance criteria and restrictions, ask user about unclear parts
|
||||
5. BLOCKING: present AC assessment, wait for user confirmation
|
||||
6. Phase 2: Full 8-step research — competitors, components, state-of-the-art solutions
|
||||
7. Output: `OUTPUT_DIR/solution_draft01.md`
|
||||
8. (Optional) Phase 3: Tech stack consolidation -> `tech_stack.md`
|
||||
9. (Optional) Phase 4: Security deep dive -> `security_analysis.md`
|
||||
|
||||
## Example 2: Solution Assessment (Mode B)
|
||||
|
||||
```
|
||||
User: Assess the current solution draft
|
||||
```
|
||||
|
||||
Execution flow:
|
||||
1. Context resolution: no explicit file -> project mode
|
||||
2. Guardrails: verify INPUT_DIR exists
|
||||
3. Mode detection: `solution_draft03.md` found in OUTPUT_DIR -> Mode B, read it as input
|
||||
4. Full 8-step research — weak points, security, performance, solutions
|
||||
5. Output: `OUTPUT_DIR/solution_draft04.md` with findings table + revised draft
|
||||
|
||||
## Example 3: Standalone Research
|
||||
|
||||
```
|
||||
User: /research @my_problem.md
|
||||
```
|
||||
|
||||
Execution flow:
|
||||
1. Context resolution: explicit file -> standalone mode (INPUT_FILE=`my_problem.md`, OUTPUT_DIR=`_standalone/my_problem/01_solution/`)
|
||||
2. Guardrails: verify INPUT_FILE exists and is non-empty, warn about missing restrictions/AC
|
||||
3. Mode detection + full research flow as in Example 1, scoped to standalone paths
|
||||
4. Output: `_standalone/my_problem/01_solution/solution_draft01.md`
|
||||
5. Move `my_problem.md` into `_standalone/my_problem/`
|
||||
|
||||
## Example 4: Force Initial Research (Override)
|
||||
|
||||
```
|
||||
User: Research from scratch, ignore existing drafts
|
||||
```
|
||||
|
||||
Execution flow:
|
||||
1. Context resolution: no explicit file -> project mode
|
||||
2. Mode detection: drafts exist, but user explicitly requested initial research -> Mode A
|
||||
3. Phase 1 + Phase 2 as in Example 1
|
||||
4. Output: `OUTPUT_DIR/solution_draft##.md` (incremented from highest existing)
|
||||
@@ -0,0 +1,37 @@
|
||||
# Solution Draft
|
||||
|
||||
## Product Solution Description
|
||||
[Short description of the proposed solution. Brief component interaction diagram.]
|
||||
|
||||
## Existing/Competitor Solutions Analysis
|
||||
[Analysis of existing solutions for similar problems, if any.]
|
||||
|
||||
## Architecture
|
||||
|
||||
[Architecture solution that meets restrictions and acceptance criteria.]
|
||||
|
||||
### Component: [Component Name]
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|
||||
|----------|-------|-----------|-------------|-------------|----------|------|-----|
|
||||
| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
|
||||
| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
|
||||
|
||||
[Repeat per component]
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Integration / Functional Tests
|
||||
- [Test 1]
|
||||
- [Test 2]
|
||||
|
||||
### Non-Functional Tests
|
||||
- [Performance test 1]
|
||||
- [Security test 1]
|
||||
|
||||
## References
|
||||
[All cited source links]
|
||||
|
||||
## Related Artifacts
|
||||
- Tech stack evaluation: `_docs/01_solution/tech_stack.md` (if Phase 3 was executed)
|
||||
- Security analysis: `_docs/01_solution/security_analysis.md` (if Phase 4 was executed)
|
||||
@@ -0,0 +1,40 @@
|
||||
# Solution Draft
|
||||
|
||||
## Assessment Findings
|
||||
|
||||
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
|
||||
|------------------------|----------------------------------------------|-------------|
|
||||
| [old] | [weak point] | [new] |
|
||||
|
||||
## Product Solution Description
|
||||
[Short description. Brief component interaction diagram. Written as if from scratch — no "updated" markers.]
|
||||
|
||||
## Architecture
|
||||
|
||||
[Architecture solution that meets restrictions and acceptance criteria.]
|
||||
|
||||
### Component: [Component Name]
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|-----------|-------------|-------------|----------|------------|-----|
|
||||
| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
|
||||
| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
|
||||
|
||||
[Repeat per component]
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Integration / Functional Tests
|
||||
- [Test 1]
|
||||
- [Test 2]
|
||||
|
||||
### Non-Functional Tests
|
||||
- [Performance test 1]
|
||||
- [Security test 1]
|
||||
|
||||
## References
|
||||
[All cited source links]
|
||||
|
||||
## Related Artifacts
|
||||
- Tech stack evaluation: `_docs/01_solution/tech_stack.md` (if Phase 3 was executed)
|
||||
- Security analysis: `_docs/01_solution/security_analysis.md` (if Phase 4 was executed)
|
||||
@@ -0,0 +1,174 @@
|
||||
---
|
||||
name: retrospective
|
||||
description: |
|
||||
Collect metrics from implementation batch reports and code review findings, analyze trends across cycles,
|
||||
and produce improvement reports with actionable recommendations.
|
||||
3-step workflow: collect metrics, analyze trends, produce report.
|
||||
Outputs to _docs/05_metrics/.
|
||||
Trigger phrases:
|
||||
- "retrospective", "retro", "run retro"
|
||||
- "metrics review", "feedback loop"
|
||||
- "implementation metrics", "analyze trends"
|
||||
category: evolve
|
||||
tags: [retrospective, metrics, trends, improvement, feedback-loop]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Retrospective
|
||||
|
||||
Collect metrics from implementation artifacts, analyze trends across development cycles, and produce actionable improvement reports.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Data-driven**: conclusions come from metrics, not impressions
|
||||
- **Actionable**: every finding must have a concrete improvement suggestion
|
||||
- **Cumulative**: each retrospective compares against previous ones to track progress
|
||||
- **Save immediately**: write artifacts to disk after each step
|
||||
- **Non-judgmental**: focus on process improvement, not blame
|
||||
|
||||
## Context Resolution
|
||||
|
||||
Fixed paths:
|
||||
|
||||
- IMPL_DIR: `_docs/03_implementation/`
|
||||
- METRICS_DIR: `_docs/05_metrics/`
|
||||
- TASKS_DIR: `_docs/02_tasks/`
|
||||
|
||||
Announce the resolved paths to the user before proceeding.
|
||||
|
||||
## Prerequisite Checks (BLOCKING)
|
||||
|
||||
1. `IMPL_DIR` exists and contains at least one `batch_*_report.md` — **STOP if missing** (nothing to analyze)
|
||||
2. Create METRICS_DIR if it does not exist
|
||||
3. Check for previous retrospective reports in METRICS_DIR to enable trend comparison
|
||||
|
||||
## Artifact Management
|
||||
|
||||
### Directory Structure
|
||||
|
||||
```
|
||||
METRICS_DIR/
|
||||
├── retro_[YYYY-MM-DD].md
|
||||
├── retro_[YYYY-MM-DD].md
|
||||
└── ...
|
||||
```
|
||||
|
||||
## Progress Tracking
|
||||
|
||||
At the start of execution, create a TodoWrite with all steps (1 through 3). Update status as each step completes.
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: Collect Metrics
|
||||
|
||||
**Role**: Data analyst
|
||||
**Goal**: Parse all implementation artifacts and extract quantitative metrics
|
||||
**Constraints**: Collection only — no interpretation yet
|
||||
|
||||
#### Sources
|
||||
|
||||
| Source | Metrics Extracted |
|
||||
|--------|------------------|
|
||||
| `batch_*_report.md` | Tasks per batch, batch count, task statuses (Done/Blocked/Partial) |
|
||||
| Code review sections in batch reports | PASS/FAIL/PASS_WITH_WARNINGS ratios, finding counts by severity and category |
|
||||
| Task spec files in TASKS_DIR | Complexity points per task, dependency count |
|
||||
| `FINAL_implementation_report.md` | Total tasks, total batches, overall duration |
|
||||
| Git log (if available) | Commits per batch, files changed per batch |
|
||||
|
||||
#### Metrics to Compute
|
||||
|
||||
**Implementation Metrics**:
|
||||
- Total tasks implemented
|
||||
- Total batches executed
|
||||
- Average tasks per batch
|
||||
- Average complexity points per batch
|
||||
- Total complexity points delivered
|
||||
|
||||
**Quality Metrics**:
|
||||
- Code review pass rate (PASS / total reviews)
|
||||
- Code review findings by severity: Critical, High, Medium, Low counts
|
||||
- Code review findings by category: Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope
|
||||
- FAIL count (batches that required user intervention)
|
||||
|
||||
**Efficiency Metrics**:
|
||||
- Blocked task count and reasons
|
||||
- Tasks completed on first attempt vs requiring fixes
|
||||
- Batch with most findings (identify problem areas)
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All batch reports parsed
|
||||
- [ ] All metric categories computed
|
||||
- [ ] No batch reports missed
|
||||
|
||||
---
|
||||
|
||||
### Step 2: Analyze Trends
|
||||
|
||||
**Role**: Process improvement analyst
|
||||
**Goal**: Identify patterns, recurring issues, and improvement opportunities
|
||||
**Constraints**: Analysis must be grounded in the metrics from Step 1
|
||||
|
||||
1. If previous retrospective reports exist in METRICS_DIR, load the most recent one for comparison
|
||||
2. Identify patterns:
|
||||
- **Recurring findings**: which code review categories appear most frequently?
|
||||
- **Problem components**: which components/files generate the most findings?
|
||||
- **Complexity accuracy**: do high-complexity tasks actually produce more issues?
|
||||
- **Blocker patterns**: what types of blockers occur and can they be prevented?
|
||||
3. Compare against previous retrospective (if exists):
|
||||
- Which metrics improved?
|
||||
- Which metrics degraded?
|
||||
- Were previous improvement actions effective?
|
||||
4. Identify top 3 improvement actions ranked by impact
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] Patterns are grounded in specific metrics
|
||||
- [ ] Comparison with previous retro included (if exists)
|
||||
- [ ] Top 3 actions are concrete and actionable
|
||||
|
||||
---
|
||||
|
||||
### Step 3: Produce Report
|
||||
|
||||
**Role**: Technical writer
|
||||
**Goal**: Write a structured retrospective report with metrics, trends, and recommendations
|
||||
**Constraints**: Concise, data-driven, actionable
|
||||
|
||||
Write `METRICS_DIR/retro_[YYYY-MM-DD].md` using `templates/retrospective-report.md` as structure.
|
||||
|
||||
**Self-verification**:
|
||||
- [ ] All metrics from Step 1 included
|
||||
- [ ] Trend analysis from Step 2 included
|
||||
- [ ] Top 3 improvement actions clearly stated
|
||||
- [ ] Suggested rule/skill updates are specific
|
||||
|
||||
**Save action**: Write `retro_[YYYY-MM-DD].md`
|
||||
|
||||
Present the report summary to the user.
|
||||
|
||||
---
|
||||
|
||||
## Escalation Rules
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| No batch reports exist | **STOP** — nothing to analyze |
|
||||
| Batch reports have inconsistent format | **WARN user**, extract what is available |
|
||||
| No previous retrospective for comparison | PROCEED — report baseline metrics only |
|
||||
| Metrics suggest systemic issue (>50% FAIL rate) | **WARN user** — suggest immediate process review |
|
||||
|
||||
## Methodology Quick Reference
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Retrospective (3-Step Method) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ PREREQ: batch reports exist in _docs/03_implementation/ │
|
||||
│ │
|
||||
│ 1. Collect Metrics → parse batch reports, compute metrics │
|
||||
│ 2. Analyze Trends → patterns, comparison, improvement areas │
|
||||
│ 3. Produce Report → _docs/05_metrics/retro_[date].md │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: Data-driven · Actionable · Cumulative │
|
||||
│ Non-judgmental · Save immediately │
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,93 @@
|
||||
# Retrospective Report Template
|
||||
|
||||
Save as `_docs/05_metrics/retro_[YYYY-MM-DD].md`.
|
||||
|
||||
---
|
||||
|
||||
```markdown
|
||||
# Retrospective — [YYYY-MM-DD]
|
||||
|
||||
## Implementation Summary
|
||||
|
||||
| Metric | Value |
|
||||
|--------|-------|
|
||||
| Total tasks | [count] |
|
||||
| Total batches | [count] |
|
||||
| Total complexity points | [sum] |
|
||||
| Avg tasks per batch | [value] |
|
||||
| Avg complexity per batch | [value] |
|
||||
|
||||
## Quality Metrics
|
||||
|
||||
### Code Review Results
|
||||
|
||||
| Verdict | Count | Percentage |
|
||||
|---------|-------|-----------|
|
||||
| PASS | [count] | [%] |
|
||||
| PASS_WITH_WARNINGS | [count] | [%] |
|
||||
| FAIL | [count] | [%] |
|
||||
|
||||
### Findings by Severity
|
||||
|
||||
| Severity | Count |
|
||||
|----------|-------|
|
||||
| Critical | [count] |
|
||||
| High | [count] |
|
||||
| Medium | [count] |
|
||||
| Low | [count] |
|
||||
|
||||
### Findings by Category
|
||||
|
||||
| Category | Count | Top Files |
|
||||
|----------|-------|-----------|
|
||||
| Bug | [count] | [most affected files] |
|
||||
| Spec-Gap | [count] | [most affected files] |
|
||||
| Security | [count] | [most affected files] |
|
||||
| Performance | [count] | [most affected files] |
|
||||
| Maintainability | [count] | [most affected files] |
|
||||
| Style | [count] | [most affected files] |
|
||||
|
||||
## Efficiency
|
||||
|
||||
| Metric | Value |
|
||||
|--------|-------|
|
||||
| Blocked tasks | [count] |
|
||||
| Tasks requiring fixes after review | [count] |
|
||||
| Batch with most findings | Batch [N] — [reason] |
|
||||
|
||||
### Blocker Analysis
|
||||
|
||||
| Blocker Type | Count | Prevention |
|
||||
|-------------|-------|-----------|
|
||||
| [type] | [count] | [suggested prevention] |
|
||||
|
||||
## Trend Comparison
|
||||
|
||||
| Metric | Previous | Current | Change |
|
||||
|--------|----------|---------|--------|
|
||||
| Pass rate | [%] | [%] | [+/-] |
|
||||
| Avg findings per batch | [value] | [value] | [+/-] |
|
||||
| Blocked tasks | [count] | [count] | [+/-] |
|
||||
|
||||
*Previous retrospective: [date or "N/A — first retro"]*
|
||||
|
||||
## Top 3 Improvement Actions
|
||||
|
||||
1. **[Action title]**: [specific, actionable description]
|
||||
- Impact: [expected improvement]
|
||||
- Effort: [low/medium/high]
|
||||
|
||||
2. **[Action title]**: [specific, actionable description]
|
||||
- Impact: [expected improvement]
|
||||
- Effort: [low/medium/high]
|
||||
|
||||
3. **[Action title]**: [specific, actionable description]
|
||||
- Impact: [expected improvement]
|
||||
- Effort: [low/medium/high]
|
||||
|
||||
## Suggested Rule/Skill Updates
|
||||
|
||||
| File | Change | Rationale |
|
||||
|------|--------|-----------|
|
||||
| [.cursor/rules/... or .cursor/skills/...] | [specific change] | [based on which metric] |
|
||||
```
|
||||
@@ -0,0 +1,130 @@
|
||||
---
|
||||
name: rollback
|
||||
description: |
|
||||
Revert implementation to a specific batch checkpoint using git revert, reset Jira ticket statuses,
|
||||
verify rollback integrity with tests, and produce a rollback report.
|
||||
Trigger phrases:
|
||||
- "rollback", "revert", "revert batch"
|
||||
- "undo implementation", "roll back to batch"
|
||||
category: build
|
||||
tags: [rollback, revert, recovery, implementation]
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Implementation Rollback
|
||||
|
||||
Revert the codebase to a specific batch checkpoint, reset Jira statuses for reverted tasks, and verify integrity.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- **Preserve history**: always use `git revert`, never force-push
|
||||
- **Verify after revert**: run the full test suite after every rollback
|
||||
- **Update tracking**: reset Jira ticket statuses for all reverted tasks
|
||||
- **Atomic rollback**: if rollback fails midway, stop and report — do not leave the codebase in a partial state
|
||||
- **Ask, don't assume**: if the target batch is ambiguous, present options and ask
|
||||
|
||||
## Context Resolution
|
||||
|
||||
- IMPL_DIR: `_docs/03_implementation/`
|
||||
- Batch reports: `IMPL_DIR/batch_*_report.md`
|
||||
|
||||
## Prerequisite Checks (BLOCKING)
|
||||
|
||||
1. IMPL_DIR exists and contains at least one `batch_*_report.md` — **STOP if missing**
|
||||
2. Git working tree is clean (no uncommitted changes) — **STOP if dirty**, ask user to commit or stash
|
||||
|
||||
## Input
|
||||
|
||||
- User specifies a target batch number or commit hash
|
||||
- If not specified, present the list of available batch checkpoints and ask
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: Identify Checkpoints
|
||||
|
||||
1. Read all `batch_*_report.md` files from IMPL_DIR
|
||||
2. Extract: batch number, date, tasks included, commit hash, code review verdict
|
||||
3. Present batch list to user
|
||||
|
||||
**BLOCKING**: User must confirm which batch to roll back to.
|
||||
|
||||
### Step 2: Revert Commits
|
||||
|
||||
1. Determine which commits need to be reverted (all commits after the target batch)
|
||||
2. For each commit in reverse chronological order:
|
||||
- Run `git revert <commit-hash> --no-edit`
|
||||
- If merge conflicts occur: present conflicts and ask user for resolution
|
||||
3. If any revert fails and cannot be resolved, abort the rollback sequence with `git revert --abort` and report
|
||||
|
||||
### Step 3: Verify Integrity
|
||||
|
||||
1. Run the full test suite
|
||||
2. If tests fail: report failures to user, ask how to proceed (fix or abort)
|
||||
3. If tests pass: continue
|
||||
|
||||
### Step 4: Update Jira
|
||||
|
||||
1. Identify all tasks from reverted batches
|
||||
2. Reset each task's Jira ticket status to "To Do" via Jira MCP
|
||||
|
||||
### Step 5: Finalize
|
||||
|
||||
1. Commit with message: `[ROLLBACK] Reverted to batch [N]: [task list]`
|
||||
2. Write rollback report to `IMPL_DIR/rollback_report.md`
|
||||
|
||||
## Output
|
||||
|
||||
Write `_docs/03_implementation/rollback_report.md`:
|
||||
|
||||
```markdown
|
||||
# Rollback Report
|
||||
|
||||
**Date**: [YYYY-MM-DD]
|
||||
**Target**: Batch [N] (commit [hash])
|
||||
**Reverted Batches**: [list]
|
||||
|
||||
## Reverted Tasks
|
||||
|
||||
| Task | Batch | Status Before | Status After |
|
||||
|------|-------|--------------|-------------|
|
||||
| [JIRA-ID] | [batch #] | In Testing | To Do |
|
||||
|
||||
## Test Results
|
||||
- [pass/fail count]
|
||||
|
||||
## Jira Updates
|
||||
- [list of ticket transitions]
|
||||
|
||||
## Notes
|
||||
- [any conflicts, manual steps, or issues encountered]
|
||||
```
|
||||
|
||||
## Escalation Rules
|
||||
|
||||
| Situation | Action |
|
||||
|-----------|--------|
|
||||
| No batch reports exist | **STOP** — nothing to roll back |
|
||||
| Uncommitted changes in working tree | **STOP** — ask user to commit or stash |
|
||||
| Merge conflicts during revert | **ASK user** for resolution |
|
||||
| Tests fail after rollback | **ASK user** — fix or abort |
|
||||
| Rollback fails midway | Abort with `git revert --abort`, report to user |
|
||||
|
||||
## Methodology Quick Reference
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────┐
|
||||
│ Rollback (5-Step Method) │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ PREREQ: batch reports exist, clean working tree │
|
||||
│ │
|
||||
│ 1. Identify Checkpoints → present batch list │
|
||||
│ [BLOCKING: user confirms target batch] │
|
||||
│ 2. Revert Commits → git revert per commit │
|
||||
│ 3. Verify Integrity → run full test suite │
|
||||
│ 4. Update Jira → reset statuses to "To Do" │
|
||||
│ 5. Finalize → commit + rollback_report.md │
|
||||
├────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: Preserve history · Verify after revert │
|
||||
│ Atomic rollback · Ask don't assume │
|
||||
└────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,300 @@
|
||||
---
|
||||
name: security-testing
|
||||
description: "Test for security vulnerabilities using OWASP principles. Use when conducting security audits, testing auth, or implementing security practices."
|
||||
category: specialized-testing
|
||||
priority: critical
|
||||
tokenEstimate: 1200
|
||||
agents: [qe-security-scanner, qe-api-contract-validator, qe-quality-analyzer]
|
||||
implementation_status: optimized
|
||||
optimization_version: 1.0
|
||||
last_optimized: 2025-12-02
|
||||
dependencies: []
|
||||
quick_reference_card: true
|
||||
tags: [security, owasp, sast, dast, vulnerabilities, auth, injection]
|
||||
trust_tier: 3
|
||||
validation:
|
||||
schema_path: schemas/output.json
|
||||
validator_path: scripts/validate-config.json
|
||||
eval_path: evals/security-testing.yaml
|
||||
---
|
||||
|
||||
# Security Testing
|
||||
|
||||
<default_to_action>
|
||||
When testing security or conducting audits:
|
||||
1. TEST OWASP Top 10 vulnerabilities systematically
|
||||
2. VALIDATE authentication and authorization on every endpoint
|
||||
3. SCAN dependencies for known vulnerabilities (npm audit)
|
||||
4. CHECK for injection attacks (SQL, XSS, command)
|
||||
5. VERIFY secrets aren't exposed in code/logs
|
||||
|
||||
**Quick Security Checks:**
|
||||
- Access control → Test horizontal/vertical privilege escalation
|
||||
- Crypto → Verify password hashing, HTTPS, no sensitive data exposed
|
||||
- Injection → Test SQL injection, XSS, command injection
|
||||
- Auth → Test weak passwords, session fixation, MFA enforcement
|
||||
- Config → Check error messages don't leak info
|
||||
|
||||
**Critical Success Factors:**
|
||||
- Think like an attacker, build like a defender
|
||||
- Security is built in, not added at the end
|
||||
- Test continuously in CI/CD, not just before release
|
||||
</default_to_action>
|
||||
|
||||
## Quick Reference Card
|
||||
|
||||
### When to Use
|
||||
- Security audits and penetration testing
|
||||
- Testing authentication/authorization
|
||||
- Validating input sanitization
|
||||
- Reviewing security configuration
|
||||
|
||||
### OWASP Top 10
|
||||
Use the most recent **stable** version of the OWASP Top 10. At the start of each security audit, research the current version at https://owasp.org/www-project-top-ten/ and test against all listed categories. Do not rely on a hardcoded list — the OWASP Top 10 is updated periodically and the current version must be verified.
|
||||
|
||||
### Tools
|
||||
| Type | Tool | Purpose |
|
||||
|------|------|---------|
|
||||
| SAST | SonarQube, Semgrep | Static code analysis |
|
||||
| DAST | OWASP ZAP, Burp | Dynamic scanning |
|
||||
| Deps | npm audit, Snyk | Dependency vulnerabilities |
|
||||
| Secrets | git-secrets, TruffleHog | Secret scanning |
|
||||
|
||||
### Agent Coordination
|
||||
- `qe-security-scanner`: Multi-layer SAST/DAST scanning
|
||||
- `qe-api-contract-validator`: API security testing
|
||||
- `qe-quality-analyzer`: Security code review
|
||||
|
||||
---
|
||||
|
||||
## Key Vulnerability Tests
|
||||
|
||||
### 1. Broken Access Control
|
||||
```javascript
|
||||
// Horizontal escalation - User A accessing User B's data
|
||||
test('user cannot access another user\'s order', async () => {
|
||||
const userAToken = await login('userA');
|
||||
const userBOrder = await createOrder('userB');
|
||||
|
||||
const response = await api.get(`/orders/${userBOrder.id}`, {
|
||||
headers: { Authorization: `Bearer ${userAToken}` }
|
||||
});
|
||||
expect(response.status).toBe(403);
|
||||
});
|
||||
|
||||
// Vertical escalation - Regular user accessing admin
|
||||
test('regular user cannot access admin', async () => {
|
||||
const userToken = await login('regularUser');
|
||||
expect((await api.get('/admin/users', {
|
||||
headers: { Authorization: `Bearer ${userToken}` }
|
||||
})).status).toBe(403);
|
||||
});
|
||||
```
|
||||
|
||||
### 2. Injection Attacks
|
||||
```javascript
|
||||
// SQL Injection
|
||||
test('prevents SQL injection', async () => {
|
||||
const malicious = "' OR '1'='1";
|
||||
const response = await api.get(`/products?search=${malicious}`);
|
||||
expect(response.body.length).toBeLessThan(100); // Not all products
|
||||
});
|
||||
|
||||
// XSS
|
||||
test('sanitizes HTML output', async () => {
|
||||
const xss = '<script>alert("XSS")</script>';
|
||||
await api.post('/comments', { text: xss });
|
||||
|
||||
const html = (await api.get('/comments')).body;
|
||||
expect(html).toContain('<script>');
|
||||
expect(html).not.toContain('<script>');
|
||||
});
|
||||
```
|
||||
|
||||
### 3. Cryptographic Failures
|
||||
```javascript
|
||||
test('passwords are hashed', async () => {
|
||||
await db.users.create({ email: 'test@example.com', password: 'MyPassword123' });
|
||||
const user = await db.users.findByEmail('test@example.com');
|
||||
|
||||
expect(user.password).not.toBe('MyPassword123');
|
||||
expect(user.password).toMatch(/^\$2[aby]\$\d{2}\$/); // bcrypt
|
||||
});
|
||||
|
||||
test('no sensitive data in API response', async () => {
|
||||
const response = await api.get('/users/me');
|
||||
expect(response.body).not.toHaveProperty('password');
|
||||
expect(response.body).not.toHaveProperty('ssn');
|
||||
});
|
||||
```
|
||||
|
||||
### 4. Security Misconfiguration
|
||||
```javascript
|
||||
test('errors don\'t leak sensitive info', async () => {
|
||||
const response = await api.post('/login', { email: 'nonexistent@test.com', password: 'wrong' });
|
||||
expect(response.body.error).toBe('Invalid credentials'); // Generic message
|
||||
});
|
||||
|
||||
test('sensitive endpoints not exposed', async () => {
|
||||
const endpoints = ['/debug', '/.env', '/.git', '/admin'];
|
||||
for (let ep of endpoints) {
|
||||
expect((await fetch(`https://example.com${ep}`)).status).not.toBe(200);
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
### 5. Rate Limiting
|
||||
```javascript
|
||||
test('rate limiting prevents brute force', async () => {
|
||||
const responses = [];
|
||||
for (let i = 0; i < 20; i++) {
|
||||
responses.push(await api.post('/login', { email: 'test@example.com', password: 'wrong' }));
|
||||
}
|
||||
expect(responses.filter(r => r.status === 429).length).toBeGreaterThan(0);
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Security Checklist
|
||||
|
||||
### Authentication
|
||||
- [ ] Strong password requirements (12+ chars)
|
||||
- [ ] Password hashing (bcrypt, scrypt, Argon2)
|
||||
- [ ] MFA for sensitive operations
|
||||
- [ ] Account lockout after failed attempts
|
||||
- [ ] Session ID changes after login
|
||||
- [ ] Session timeout
|
||||
|
||||
### Authorization
|
||||
- [ ] Check authorization on every request
|
||||
- [ ] Least privilege principle
|
||||
- [ ] No horizontal escalation
|
||||
- [ ] No vertical escalation
|
||||
|
||||
### Data Protection
|
||||
- [ ] HTTPS everywhere
|
||||
- [ ] Encrypted at rest
|
||||
- [ ] Secrets not in code/logs
|
||||
- [ ] PII compliance (GDPR)
|
||||
|
||||
### Input Validation
|
||||
- [ ] Server-side validation
|
||||
- [ ] Parameterized queries (no SQL injection)
|
||||
- [ ] Output encoding (no XSS)
|
||||
- [ ] Rate limiting
|
||||
|
||||
---
|
||||
|
||||
## CI/CD Integration
|
||||
|
||||
```yaml
|
||||
# GitHub Actions
|
||||
security-checks:
|
||||
steps:
|
||||
- name: Dependency audit
|
||||
run: npm audit --audit-level=high
|
||||
|
||||
- name: SAST scan
|
||||
run: npm run sast
|
||||
|
||||
- name: Secret scan
|
||||
uses: trufflesecurity/trufflehog@main
|
||||
|
||||
- name: DAST scan
|
||||
if: github.ref == 'refs/heads/main'
|
||||
run: docker run owasp/zap2docker-stable zap-baseline.py -t https://staging.example.com
|
||||
```
|
||||
|
||||
**Pre-commit hooks:**
|
||||
```bash
|
||||
#!/bin/sh
|
||||
git-secrets --scan
|
||||
npm run lint:security
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Agent-Assisted Security Testing
|
||||
|
||||
```typescript
|
||||
// Comprehensive multi-layer scan
|
||||
await Task("Security Scan", {
|
||||
target: 'src/',
|
||||
layers: { sast: true, dast: true, dependencies: true, secrets: true },
|
||||
severity: ['critical', 'high', 'medium']
|
||||
}, "qe-security-scanner");
|
||||
|
||||
// OWASP Top 10 testing
|
||||
await Task("OWASP Scan", {
|
||||
categories: ['broken-access-control', 'injection', 'cryptographic-failures'],
|
||||
depth: 'comprehensive'
|
||||
}, "qe-security-scanner");
|
||||
|
||||
// Validate fix
|
||||
await Task("Validate Fix", {
|
||||
vulnerability: 'CVE-2024-12345',
|
||||
expectedResolution: 'upgrade package to v2.0.0',
|
||||
retestAfterFix: true
|
||||
}, "qe-security-scanner");
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Agent Coordination Hints
|
||||
|
||||
### Memory Namespace
|
||||
```
|
||||
aqe/security/
|
||||
├── scans/* - Scan results
|
||||
├── vulnerabilities/* - Found vulnerabilities
|
||||
├── fixes/* - Remediation tracking
|
||||
└── compliance/* - Compliance status
|
||||
```
|
||||
|
||||
### Fleet Coordination
|
||||
```typescript
|
||||
const securityFleet = await FleetManager.coordinate({
|
||||
strategy: 'security-testing',
|
||||
agents: [
|
||||
'qe-security-scanner',
|
||||
'qe-api-contract-validator',
|
||||
'qe-quality-analyzer',
|
||||
'qe-deployment-readiness'
|
||||
],
|
||||
topology: 'parallel'
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
### ❌ Security by Obscurity
|
||||
Hiding admin at `/super-secret-admin` → **Use proper auth**
|
||||
|
||||
### ❌ Client-Side Validation Only
|
||||
JavaScript validation can be bypassed → **Always validate server-side**
|
||||
|
||||
### ❌ Trusting User Input
|
||||
Assuming input is safe → **Sanitize, validate, escape all input**
|
||||
|
||||
### ❌ Hardcoded Secrets
|
||||
API keys in code → **Environment variables, secret management**
|
||||
|
||||
---
|
||||
|
||||
## Related Skills
|
||||
- [agentic-quality-engineering](../agentic-quality-engineering/) - Security with agents
|
||||
- [api-testing-patterns](../api-testing-patterns/) - API security testing
|
||||
- [compliance-testing](../compliance-testing/) - GDPR, HIPAA, SOC2
|
||||
|
||||
---
|
||||
|
||||
## Remember
|
||||
|
||||
**Think like an attacker:** What would you try to break? Test that.
|
||||
**Build like a defender:** Assume input is malicious until proven otherwise.
|
||||
**Test continuously:** Security testing is ongoing, not one-time.
|
||||
|
||||
**With Agents:** Agents automate vulnerability scanning, track remediation, and validate fixes. Use agents to maintain security posture at scale.
|
||||
@@ -0,0 +1,789 @@
|
||||
# =============================================================================
|
||||
# AQE Skill Evaluation Test Suite: Security Testing v1.0.0
|
||||
# =============================================================================
|
||||
#
|
||||
# Comprehensive evaluation suite for the security-testing skill per ADR-056.
|
||||
# Tests OWASP Top 10 2021 detection, severity classification, remediation
|
||||
# quality, and cross-model consistency.
|
||||
#
|
||||
# Schema: .claude/skills/.validation/schemas/skill-eval.schema.json
|
||||
# Validator: .claude/skills/security-testing/scripts/validate-config.json
|
||||
#
|
||||
# Coverage:
|
||||
# - OWASP A01:2021 - Broken Access Control
|
||||
# - OWASP A02:2021 - Cryptographic Failures
|
||||
# - OWASP A03:2021 - Injection (SQL, XSS, Command)
|
||||
# - OWASP A07:2021 - Identification and Authentication Failures
|
||||
# - Negative tests (no false positives on secure code)
|
||||
#
|
||||
# =============================================================================
|
||||
|
||||
skill: security-testing
|
||||
version: 1.0.0
|
||||
description: >
|
||||
Comprehensive evaluation suite for the security-testing skill.
|
||||
Tests OWASP Top 10 2021 detection capabilities, CWE classification accuracy,
|
||||
CVSS scoring, severity classification, and remediation quality.
|
||||
Supports multi-model testing and integrates with ReasoningBank for
|
||||
continuous improvement.
|
||||
|
||||
# =============================================================================
|
||||
# Multi-Model Configuration
|
||||
# =============================================================================
|
||||
|
||||
models_to_test:
|
||||
- claude-3.5-sonnet # Primary model (high accuracy expected)
|
||||
- claude-3-haiku # Fast model (minimum quality threshold)
|
||||
- gpt-4o # Cross-vendor validation
|
||||
|
||||
# =============================================================================
|
||||
# MCP Integration Configuration
|
||||
# =============================================================================
|
||||
|
||||
mcp_integration:
|
||||
enabled: true
|
||||
namespace: skill-validation
|
||||
|
||||
# Query existing security patterns before running evals
|
||||
query_patterns: true
|
||||
|
||||
# Track each test outcome for learning feedback loop
|
||||
track_outcomes: true
|
||||
|
||||
# Store successful patterns after evals complete
|
||||
store_patterns: true
|
||||
|
||||
# Share learning with fleet coordinator agents
|
||||
share_learning: true
|
||||
|
||||
# Update quality gate with validation metrics
|
||||
update_quality_gate: true
|
||||
|
||||
# Target agents for learning distribution
|
||||
target_agents:
|
||||
- qe-learning-coordinator
|
||||
- qe-queen-coordinator
|
||||
- qe-security-scanner
|
||||
- qe-security-auditor
|
||||
|
||||
# =============================================================================
|
||||
# ReasoningBank Learning Configuration
|
||||
# =============================================================================
|
||||
|
||||
learning:
|
||||
store_success_patterns: true
|
||||
store_failure_patterns: true
|
||||
pattern_ttl_days: 90
|
||||
min_confidence_to_store: 0.7
|
||||
cross_model_comparison: true
|
||||
|
||||
# =============================================================================
|
||||
# Result Format Configuration
|
||||
# =============================================================================
|
||||
|
||||
result_format:
|
||||
json_output: true
|
||||
markdown_report: true
|
||||
include_raw_output: false
|
||||
include_timing: true
|
||||
include_token_usage: true
|
||||
|
||||
# =============================================================================
|
||||
# Environment Setup
|
||||
# =============================================================================
|
||||
|
||||
setup:
|
||||
required_tools:
|
||||
- jq # JSON parsing (required)
|
||||
- npm # Dependency audit (optional but recommended)
|
||||
|
||||
environment_variables:
|
||||
SECURITY_SCAN_DEPTH: "deep"
|
||||
OWASP_ENABLED: "true"
|
||||
SEVERITY_THRESHOLD: "medium"
|
||||
|
||||
fixtures:
|
||||
- name: vulnerable_express_app
|
||||
path: fixtures/vulnerable-express-app.js
|
||||
content: |
|
||||
const express = require('express');
|
||||
const app = express();
|
||||
|
||||
// SQL Injection vulnerability
|
||||
app.get('/user', (req, res) => {
|
||||
const query = `SELECT * FROM users WHERE id = ${req.params.id}`;
|
||||
db.query(query);
|
||||
});
|
||||
|
||||
// XSS vulnerability
|
||||
app.get('/profile', (req, res) => {
|
||||
res.send(`<h1>Hello ${req.query.name}</h1>`);
|
||||
});
|
||||
|
||||
// Path Traversal vulnerability
|
||||
app.get('/file', (req, res) => {
|
||||
const path = './uploads/' + req.query.filename;
|
||||
res.sendFile(path);
|
||||
});
|
||||
|
||||
# =============================================================================
|
||||
# TEST CASES
|
||||
# =============================================================================
|
||||
|
||||
test_cases:
|
||||
# ---------------------------------------------------------------------------
|
||||
# CATEGORY: SQL Injection (OWASP A03:2021)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
- id: tc001_sql_injection_string_concat
|
||||
description: "Detect SQL injection via string concatenation in Node.js"
|
||||
category: injection
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
const express = require('express');
|
||||
const mysql = require('mysql');
|
||||
const app = express();
|
||||
|
||||
app.get('/api/users', (req, res) => {
|
||||
const userId = req.params.id;
|
||||
const query = `SELECT * FROM users WHERE id = ${userId}`;
|
||||
db.query(query, (err, results) => {
|
||||
res.json(results);
|
||||
});
|
||||
});
|
||||
context:
|
||||
language: javascript
|
||||
framework: express
|
||||
environment: production
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "SQL injection"
|
||||
- "parameterized"
|
||||
must_not_contain:
|
||||
- "no vulnerabilities"
|
||||
- "secure"
|
||||
must_match_regex:
|
||||
- "CWE-89|CWE-564"
|
||||
- "A03:20[21][0-9]"
|
||||
severity_classification: critical
|
||||
finding_count:
|
||||
min: 1
|
||||
max: 3
|
||||
recommendation_count:
|
||||
min: 1
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.8
|
||||
reasoning_quality_min: 0.7
|
||||
grading_rubric:
|
||||
completeness: 0.3
|
||||
accuracy: 0.5
|
||||
actionability: 0.2
|
||||
|
||||
timeout_ms: 30000
|
||||
|
||||
- id: tc002_sql_injection_parameterized_safe
|
||||
description: "Verify parameterized queries are NOT flagged as vulnerable"
|
||||
category: injection
|
||||
priority: high
|
||||
|
||||
input:
|
||||
code: |
|
||||
app.get('/api/users', (req, res) => {
|
||||
const userId = parseInt(req.params.id, 10);
|
||||
db.query('SELECT * FROM users WHERE id = ?', [userId], (err, results) => {
|
||||
res.json(results);
|
||||
});
|
||||
});
|
||||
context:
|
||||
language: javascript
|
||||
framework: express
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "parameterized"
|
||||
- "secure"
|
||||
must_not_contain:
|
||||
- "SQL injection"
|
||||
- "critical"
|
||||
- "vulnerable"
|
||||
severity_classification: info
|
||||
finding_count:
|
||||
max: 1
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.7
|
||||
allow_partial: true
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CATEGORY: Cross-Site Scripting (OWASP A03:2021)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
- id: tc003_xss_reflected_html_output
|
||||
description: "Detect reflected XSS in unescaped HTML output"
|
||||
category: injection
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
app.get('/profile', (req, res) => {
|
||||
const name = req.query.name;
|
||||
res.send(`
|
||||
<html>
|
||||
<body>
|
||||
<h1>Welcome, ${name}!</h1>
|
||||
<p>Your profile has been loaded.</p>
|
||||
</body>
|
||||
</html>
|
||||
`);
|
||||
});
|
||||
context:
|
||||
language: javascript
|
||||
framework: express
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "XSS"
|
||||
- "cross-site scripting"
|
||||
- "sanitize"
|
||||
- "escape"
|
||||
must_match_regex:
|
||||
- "CWE-79"
|
||||
severity_classification: high
|
||||
finding_count:
|
||||
min: 1
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.8
|
||||
reasoning_quality_min: 0.75
|
||||
|
||||
- id: tc004_xss_dom_based_innerhtml
|
||||
description: "Detect DOM-based XSS via innerHTML assignment"
|
||||
category: injection
|
||||
priority: high
|
||||
|
||||
input:
|
||||
code: |
|
||||
// Client-side JavaScript
|
||||
const params = new URLSearchParams(window.location.search);
|
||||
const message = params.get('msg');
|
||||
document.getElementById('output').innerHTML = message;
|
||||
context:
|
||||
language: javascript
|
||||
framework: vanilla
|
||||
environment: production
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "DOM"
|
||||
- "XSS"
|
||||
- "innerHTML"
|
||||
- "textContent"
|
||||
must_match_regex:
|
||||
- "CWE-79"
|
||||
severity_classification: high
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.7
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CATEGORY: Authentication Failures (OWASP A07:2021)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
- id: tc005_hardcoded_credentials
|
||||
description: "Detect hardcoded credentials and API keys"
|
||||
category: authentication
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
const ADMIN_PASSWORD = 'admin123';
|
||||
const API_KEY = 'sk-1234567890abcdef';
|
||||
const DATABASE_URL = 'postgres://admin:password123@localhost/db';
|
||||
|
||||
app.post('/login', (req, res) => {
|
||||
if (req.body.password === ADMIN_PASSWORD) {
|
||||
req.session.isAdmin = true;
|
||||
res.send('Login successful');
|
||||
}
|
||||
});
|
||||
context:
|
||||
language: javascript
|
||||
framework: express
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "hardcoded"
|
||||
- "credentials"
|
||||
- "secret"
|
||||
- "environment variable"
|
||||
must_match_regex:
|
||||
- "CWE-798|CWE-259"
|
||||
severity_classification: critical
|
||||
finding_count:
|
||||
min: 2
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.8
|
||||
reasoning_quality_min: 0.8
|
||||
|
||||
- id: tc006_weak_password_hashing
|
||||
description: "Detect weak password hashing algorithms (MD5, SHA1)"
|
||||
category: authentication
|
||||
priority: high
|
||||
|
||||
input:
|
||||
code: |
|
||||
const crypto = require('crypto');
|
||||
|
||||
function hashPassword(password) {
|
||||
return crypto.createHash('md5').update(password).digest('hex');
|
||||
}
|
||||
|
||||
function verifyPassword(password, hash) {
|
||||
return hashPassword(password) === hash;
|
||||
}
|
||||
context:
|
||||
language: javascript
|
||||
framework: nodejs
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "MD5"
|
||||
- "weak"
|
||||
- "bcrypt"
|
||||
- "argon2"
|
||||
must_match_regex:
|
||||
- "CWE-327|CWE-328|CWE-916"
|
||||
severity_classification: high
|
||||
finding_count:
|
||||
min: 1
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.8
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CATEGORY: Broken Access Control (OWASP A01:2021)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
- id: tc007_idor_missing_authorization
|
||||
description: "Detect IDOR vulnerability with missing authorization check"
|
||||
category: authorization
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
app.get('/api/users/:id/profile', (req, res) => {
|
||||
// No authorization check - any user can access any profile
|
||||
const userId = req.params.id;
|
||||
db.query('SELECT * FROM profiles WHERE user_id = ?', [userId])
|
||||
.then(profile => res.json(profile));
|
||||
});
|
||||
|
||||
app.delete('/api/users/:id', (req, res) => {
|
||||
// No check if requesting user owns this account
|
||||
db.query('DELETE FROM users WHERE id = ?', [req.params.id]);
|
||||
res.send('User deleted');
|
||||
});
|
||||
context:
|
||||
language: javascript
|
||||
framework: express
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "authorization"
|
||||
- "access control"
|
||||
- "IDOR"
|
||||
- "ownership"
|
||||
must_match_regex:
|
||||
- "CWE-639|CWE-284|CWE-862"
|
||||
- "A01:2021"
|
||||
severity_classification: critical
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.7
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CATEGORY: Cryptographic Failures (OWASP A02:2021)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
- id: tc008_weak_encryption_des
|
||||
description: "Detect use of weak encryption algorithms (DES, RC4)"
|
||||
category: cryptography
|
||||
priority: high
|
||||
|
||||
input:
|
||||
code: |
|
||||
const crypto = require('crypto');
|
||||
|
||||
function encryptData(data, key) {
|
||||
const cipher = crypto.createCipher('des', key);
|
||||
return cipher.update(data, 'utf8', 'hex') + cipher.final('hex');
|
||||
}
|
||||
|
||||
function decryptData(data, key) {
|
||||
const decipher = crypto.createDecipher('des', key);
|
||||
return decipher.update(data, 'hex', 'utf8') + decipher.final('utf8');
|
||||
}
|
||||
context:
|
||||
language: javascript
|
||||
framework: nodejs
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "DES"
|
||||
- "weak"
|
||||
- "deprecated"
|
||||
- "AES"
|
||||
must_match_regex:
|
||||
- "CWE-327|CWE-328"
|
||||
- "A02:2021"
|
||||
severity_classification: high
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.7
|
||||
|
||||
- id: tc009_plaintext_password_storage
|
||||
description: "Detect plaintext password storage"
|
||||
category: cryptography
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
class User {
|
||||
constructor(email, password) {
|
||||
this.email = email;
|
||||
this.password = password; // Stored in plaintext!
|
||||
}
|
||||
|
||||
save() {
|
||||
db.query('INSERT INTO users (email, password) VALUES (?, ?)',
|
||||
[this.email, this.password]);
|
||||
}
|
||||
}
|
||||
context:
|
||||
language: javascript
|
||||
framework: nodejs
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "plaintext"
|
||||
- "password"
|
||||
- "hash"
|
||||
- "bcrypt"
|
||||
must_match_regex:
|
||||
- "CWE-256|CWE-312"
|
||||
- "A02:2021"
|
||||
severity_classification: critical
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.8
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CATEGORY: Path Traversal (Related to A01:2021)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
- id: tc010_path_traversal_file_access
|
||||
description: "Detect path traversal vulnerability in file access"
|
||||
category: injection
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
const fs = require('fs');
|
||||
|
||||
app.get('/download', (req, res) => {
|
||||
const filename = req.query.file;
|
||||
const filepath = './uploads/' + filename;
|
||||
res.sendFile(filepath);
|
||||
});
|
||||
|
||||
app.get('/read', (req, res) => {
|
||||
const content = fs.readFileSync('./data/' + req.params.name);
|
||||
res.send(content);
|
||||
});
|
||||
context:
|
||||
language: javascript
|
||||
framework: express
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "path traversal"
|
||||
- "directory traversal"
|
||||
- "../"
|
||||
- "sanitize"
|
||||
must_match_regex:
|
||||
- "CWE-22|CWE-23"
|
||||
severity_classification: critical
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.7
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CATEGORY: Negative Tests (No False Positives)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
- id: tc011_secure_code_no_false_positives
|
||||
description: "Verify secure code is NOT flagged as vulnerable"
|
||||
category: negative
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
const express = require('express');
|
||||
const helmet = require('helmet');
|
||||
const rateLimit = require('express-rate-limit');
|
||||
const bcrypt = require('bcrypt');
|
||||
const validator = require('validator');
|
||||
|
||||
const app = express();
|
||||
app.use(helmet());
|
||||
app.use(rateLimit({ windowMs: 15 * 60 * 1000, max: 100 }));
|
||||
|
||||
app.post('/api/users', async (req, res) => {
|
||||
const { email, password } = req.body;
|
||||
|
||||
// Input validation
|
||||
if (!validator.isEmail(email)) {
|
||||
return res.status(400).json({ error: 'Invalid email' });
|
||||
}
|
||||
|
||||
// Secure password hashing
|
||||
const hashedPassword = await bcrypt.hash(password, 12);
|
||||
|
||||
// Parameterized query
|
||||
await db.query(
|
||||
'INSERT INTO users (email, password) VALUES ($1, $2)',
|
||||
[email, hashedPassword]
|
||||
);
|
||||
|
||||
res.status(201).json({ message: 'User created' });
|
||||
});
|
||||
context:
|
||||
language: javascript
|
||||
framework: express
|
||||
environment: production
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "secure"
|
||||
- "best practice"
|
||||
must_not_contain:
|
||||
- "SQL injection"
|
||||
- "XSS"
|
||||
- "critical vulnerability"
|
||||
- "high severity"
|
||||
finding_count:
|
||||
max: 2 # Allow informational findings only
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.6
|
||||
allow_partial: true
|
||||
|
||||
- id: tc012_secure_auth_implementation
|
||||
description: "Verify secure authentication is recognized as safe"
|
||||
category: negative
|
||||
priority: high
|
||||
|
||||
input:
|
||||
code: |
|
||||
const bcrypt = require('bcrypt');
|
||||
const jwt = require('jsonwebtoken');
|
||||
|
||||
async function login(email, password) {
|
||||
const user = await User.findByEmail(email);
|
||||
if (!user) {
|
||||
return { error: 'Invalid credentials' };
|
||||
}
|
||||
|
||||
const match = await bcrypt.compare(password, user.passwordHash);
|
||||
if (!match) {
|
||||
return { error: 'Invalid credentials' };
|
||||
}
|
||||
|
||||
const token = jwt.sign(
|
||||
{ userId: user.id },
|
||||
process.env.JWT_SECRET,
|
||||
{ expiresIn: '1h' }
|
||||
);
|
||||
|
||||
return { token };
|
||||
}
|
||||
context:
|
||||
language: javascript
|
||||
framework: nodejs
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "bcrypt"
|
||||
- "jwt"
|
||||
- "secure"
|
||||
must_not_contain:
|
||||
- "vulnerable"
|
||||
- "critical"
|
||||
- "hardcoded"
|
||||
severity_classification: info
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
allow_partial: true
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CATEGORY: Python Security (Multi-language Support)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
- id: tc013_python_sql_injection
|
||||
description: "Detect SQL injection in Python Flask application"
|
||||
category: injection
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
from flask import Flask, request
|
||||
import sqlite3
|
||||
|
||||
app = Flask(__name__)
|
||||
|
||||
@app.route('/user')
|
||||
def get_user():
|
||||
user_id = request.args.get('id')
|
||||
conn = sqlite3.connect('users.db')
|
||||
cursor = conn.cursor()
|
||||
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
|
||||
return str(cursor.fetchone())
|
||||
context:
|
||||
language: python
|
||||
framework: flask
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "SQL injection"
|
||||
- "parameterized"
|
||||
- "f-string"
|
||||
must_match_regex:
|
||||
- "CWE-89"
|
||||
severity_classification: critical
|
||||
finding_count:
|
||||
min: 1
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.7
|
||||
|
||||
- id: tc014_python_ssti_jinja
|
||||
description: "Detect Server-Side Template Injection in Jinja2"
|
||||
category: injection
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
from flask import Flask, request, render_template_string
|
||||
|
||||
app = Flask(__name__)
|
||||
|
||||
@app.route('/render')
|
||||
def render():
|
||||
template = request.args.get('template')
|
||||
return render_template_string(template)
|
||||
context:
|
||||
language: python
|
||||
framework: flask
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "SSTI"
|
||||
- "template injection"
|
||||
- "render_template_string"
|
||||
- "Jinja2"
|
||||
must_match_regex:
|
||||
- "CWE-94|CWE-1336"
|
||||
severity_classification: critical
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.7
|
||||
|
||||
- id: tc015_python_pickle_deserialization
|
||||
description: "Detect insecure deserialization with pickle"
|
||||
category: injection
|
||||
priority: critical
|
||||
|
||||
input:
|
||||
code: |
|
||||
import pickle
|
||||
from flask import Flask, request
|
||||
|
||||
app = Flask(__name__)
|
||||
|
||||
@app.route('/load')
|
||||
def load_data():
|
||||
data = request.get_data()
|
||||
obj = pickle.loads(data)
|
||||
return str(obj)
|
||||
context:
|
||||
language: python
|
||||
framework: flask
|
||||
|
||||
expected_output:
|
||||
must_contain:
|
||||
- "pickle"
|
||||
- "deserialization"
|
||||
- "untrusted"
|
||||
- "RCE"
|
||||
must_match_regex:
|
||||
- "CWE-502"
|
||||
- "A08:2021"
|
||||
severity_classification: critical
|
||||
|
||||
validation:
|
||||
schema_check: true
|
||||
keyword_match_threshold: 0.7
|
||||
|
||||
# =============================================================================
|
||||
# SUCCESS CRITERIA
|
||||
# =============================================================================
|
||||
|
||||
success_criteria:
|
||||
# Overall pass rate (90% of tests must pass)
|
||||
pass_rate: 0.9
|
||||
|
||||
# Critical tests must ALL pass (100%)
|
||||
critical_pass_rate: 1.0
|
||||
|
||||
# Average reasoning quality score
|
||||
avg_reasoning_quality: 0.75
|
||||
|
||||
# Maximum suite execution time (5 minutes)
|
||||
max_execution_time_ms: 300000
|
||||
|
||||
# Maximum variance between model results (15%)
|
||||
cross_model_variance: 0.15
|
||||
|
||||
# =============================================================================
|
||||
# METADATA
|
||||
# =============================================================================
|
||||
|
||||
metadata:
|
||||
author: "qe-security-auditor"
|
||||
created: "2026-02-02"
|
||||
last_updated: "2026-02-02"
|
||||
coverage_target: >
|
||||
OWASP Top 10 2021: A01 (Broken Access Control), A02 (Cryptographic Failures),
|
||||
A03 (Injection - SQL, XSS, SSTI, Command), A07 (Authentication Failures),
|
||||
A08 (Software Integrity - Deserialization). Covers JavaScript/Node.js
|
||||
Express apps and Python Flask apps. 15 test cases with 90% pass rate
|
||||
requirement and 100% critical pass rate.
|
||||
@@ -0,0 +1,879 @@
|
||||
{
|
||||
"$schema": "https://json-schema.org/draft/2020-12/schema",
|
||||
"$id": "https://agentic-qe.dev/schemas/security-testing-output.json",
|
||||
"title": "AQE Security Testing Skill Output Schema",
|
||||
"description": "Schema for security-testing skill output validation. Extends the base skill-output template with OWASP Top 10 categories, CWE identifiers, and CVSS scoring.",
|
||||
"type": "object",
|
||||
"required": ["skillName", "version", "timestamp", "status", "trustTier", "output"],
|
||||
"properties": {
|
||||
"skillName": {
|
||||
"type": "string",
|
||||
"const": "security-testing",
|
||||
"description": "Must be 'security-testing'"
|
||||
},
|
||||
"version": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+\\.\\d+\\.\\d+(-[a-zA-Z0-9]+)?$",
|
||||
"description": "Semantic version of the skill"
|
||||
},
|
||||
"timestamp": {
|
||||
"type": "string",
|
||||
"format": "date-time",
|
||||
"description": "ISO 8601 timestamp of output generation"
|
||||
},
|
||||
"status": {
|
||||
"type": "string",
|
||||
"enum": ["success", "partial", "failed", "skipped"],
|
||||
"description": "Overall execution status"
|
||||
},
|
||||
"trustTier": {
|
||||
"type": "integer",
|
||||
"const": 3,
|
||||
"description": "Trust tier 3 indicates full validation with eval suite"
|
||||
},
|
||||
"output": {
|
||||
"type": "object",
|
||||
"required": ["summary", "findings", "owaspCategories"],
|
||||
"properties": {
|
||||
"summary": {
|
||||
"type": "string",
|
||||
"minLength": 50,
|
||||
"maxLength": 2000,
|
||||
"description": "Human-readable summary of security findings"
|
||||
},
|
||||
"score": {
|
||||
"$ref": "#/$defs/securityScore",
|
||||
"description": "Overall security score"
|
||||
},
|
||||
"findings": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"$ref": "#/$defs/securityFinding"
|
||||
},
|
||||
"maxItems": 500,
|
||||
"description": "List of security vulnerabilities discovered"
|
||||
},
|
||||
"recommendations": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"$ref": "#/$defs/securityRecommendation"
|
||||
},
|
||||
"maxItems": 100,
|
||||
"description": "Prioritized remediation recommendations with code examples"
|
||||
},
|
||||
"metrics": {
|
||||
"$ref": "#/$defs/securityMetrics",
|
||||
"description": "Security scan metrics and statistics"
|
||||
},
|
||||
"owaspCategories": {
|
||||
"$ref": "#/$defs/owaspCategoryBreakdown",
|
||||
"description": "OWASP Top 10 2021 category breakdown"
|
||||
},
|
||||
"artifacts": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"$ref": "#/$defs/artifact"
|
||||
},
|
||||
"maxItems": 50,
|
||||
"description": "Generated security reports and scan artifacts"
|
||||
},
|
||||
"timeline": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"$ref": "#/$defs/timelineEvent"
|
||||
},
|
||||
"description": "Scan execution timeline"
|
||||
},
|
||||
"scanConfiguration": {
|
||||
"$ref": "#/$defs/scanConfiguration",
|
||||
"description": "Configuration used for the security scan"
|
||||
}
|
||||
}
|
||||
},
|
||||
"metadata": {
|
||||
"$ref": "#/$defs/metadata"
|
||||
},
|
||||
"validation": {
|
||||
"$ref": "#/$defs/validationResult"
|
||||
},
|
||||
"learning": {
|
||||
"$ref": "#/$defs/learningData"
|
||||
}
|
||||
},
|
||||
"$defs": {
|
||||
"securityScore": {
|
||||
"type": "object",
|
||||
"required": ["value", "max"],
|
||||
"properties": {
|
||||
"value": {
|
||||
"type": "number",
|
||||
"minimum": 0,
|
||||
"maximum": 100,
|
||||
"description": "Security score (0=critical issues, 100=no issues)"
|
||||
},
|
||||
"max": {
|
||||
"type": "number",
|
||||
"const": 100,
|
||||
"description": "Maximum score is always 100"
|
||||
},
|
||||
"grade": {
|
||||
"type": "string",
|
||||
"pattern": "^[A-F][+-]?$",
|
||||
"description": "Letter grade: A (90-100), B (80-89), C (70-79), D (60-69), F (<60)"
|
||||
},
|
||||
"trend": {
|
||||
"type": "string",
|
||||
"enum": ["improving", "stable", "declining", "unknown"],
|
||||
"description": "Trend compared to previous scans"
|
||||
},
|
||||
"riskLevel": {
|
||||
"type": "string",
|
||||
"enum": ["critical", "high", "medium", "low", "minimal"],
|
||||
"description": "Overall risk level assessment"
|
||||
}
|
||||
}
|
||||
},
|
||||
"securityFinding": {
|
||||
"type": "object",
|
||||
"required": ["id", "title", "severity", "owasp"],
|
||||
"properties": {
|
||||
"id": {
|
||||
"type": "string",
|
||||
"pattern": "^SEC-\\d{3,6}$",
|
||||
"description": "Unique finding identifier (e.g., SEC-001)"
|
||||
},
|
||||
"title": {
|
||||
"type": "string",
|
||||
"minLength": 10,
|
||||
"maxLength": 200,
|
||||
"description": "Finding title describing the vulnerability"
|
||||
},
|
||||
"description": {
|
||||
"type": "string",
|
||||
"maxLength": 2000,
|
||||
"description": "Detailed description of the vulnerability"
|
||||
},
|
||||
"severity": {
|
||||
"type": "string",
|
||||
"enum": ["critical", "high", "medium", "low", "info"],
|
||||
"description": "Severity: critical (CVSS 9.0-10.0), high (7.0-8.9), medium (4.0-6.9), low (0.1-3.9), info (0)"
|
||||
},
|
||||
"owasp": {
|
||||
"type": "string",
|
||||
"pattern": "^A(0[1-9]|10):20(21|25)$",
|
||||
"description": "OWASP Top 10 category (e.g., A01:2021, A03:2025)"
|
||||
},
|
||||
"owaspCategory": {
|
||||
"type": "string",
|
||||
"enum": [
|
||||
"A01:2021-Broken-Access-Control",
|
||||
"A02:2021-Cryptographic-Failures",
|
||||
"A03:2021-Injection",
|
||||
"A04:2021-Insecure-Design",
|
||||
"A05:2021-Security-Misconfiguration",
|
||||
"A06:2021-Vulnerable-Components",
|
||||
"A07:2021-Identification-Authentication-Failures",
|
||||
"A08:2021-Software-Data-Integrity-Failures",
|
||||
"A09:2021-Security-Logging-Monitoring-Failures",
|
||||
"A10:2021-Server-Side-Request-Forgery"
|
||||
],
|
||||
"description": "Full OWASP category name"
|
||||
},
|
||||
"cwe": {
|
||||
"type": "string",
|
||||
"pattern": "^CWE-\\d{1,4}$",
|
||||
"description": "CWE identifier (e.g., CWE-79 for XSS, CWE-89 for SQLi)"
|
||||
},
|
||||
"cvss": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"score": {
|
||||
"type": "number",
|
||||
"minimum": 0,
|
||||
"maximum": 10,
|
||||
"description": "CVSS v3.1 base score"
|
||||
},
|
||||
"vector": {
|
||||
"type": "string",
|
||||
"pattern": "^CVSS:3\\.1/AV:[NALP]/AC:[LH]/PR:[NLH]/UI:[NR]/S:[UC]/C:[NLH]/I:[NLH]/A:[NLH]$",
|
||||
"description": "CVSS v3.1 vector string"
|
||||
},
|
||||
"severity": {
|
||||
"type": "string",
|
||||
"enum": ["None", "Low", "Medium", "High", "Critical"],
|
||||
"description": "CVSS severity rating"
|
||||
}
|
||||
}
|
||||
},
|
||||
"location": {
|
||||
"$ref": "#/$defs/location",
|
||||
"description": "Location of the vulnerability"
|
||||
},
|
||||
"evidence": {
|
||||
"type": "string",
|
||||
"maxLength": 5000,
|
||||
"description": "Evidence: code snippet, request/response, or PoC"
|
||||
},
|
||||
"remediation": {
|
||||
"type": "string",
|
||||
"maxLength": 2000,
|
||||
"description": "Specific fix instructions for this finding"
|
||||
},
|
||||
"references": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"required": ["title", "url"],
|
||||
"properties": {
|
||||
"title": { "type": "string" },
|
||||
"url": { "type": "string", "format": "uri" }
|
||||
}
|
||||
},
|
||||
"maxItems": 10,
|
||||
"description": "External references (OWASP, CWE, CVE, etc.)"
|
||||
},
|
||||
"falsePositive": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Potential false positive flag"
|
||||
},
|
||||
"confidence": {
|
||||
"type": "number",
|
||||
"minimum": 0,
|
||||
"maximum": 1,
|
||||
"description": "Confidence in finding accuracy (0.0-1.0)"
|
||||
},
|
||||
"exploitability": {
|
||||
"type": "string",
|
||||
"enum": ["trivial", "easy", "moderate", "difficult", "theoretical"],
|
||||
"description": "How easy is it to exploit this vulnerability"
|
||||
},
|
||||
"affectedVersions": {
|
||||
"type": "array",
|
||||
"items": { "type": "string" },
|
||||
"description": "Affected package/library versions for dependency vulnerabilities"
|
||||
},
|
||||
"cve": {
|
||||
"type": "string",
|
||||
"pattern": "^CVE-\\d{4}-\\d{4,}$",
|
||||
"description": "CVE identifier if applicable"
|
||||
}
|
||||
}
|
||||
},
|
||||
"securityRecommendation": {
|
||||
"type": "object",
|
||||
"required": ["id", "title", "priority", "owaspCategories"],
|
||||
"properties": {
|
||||
"id": {
|
||||
"type": "string",
|
||||
"pattern": "^REC-\\d{3,6}$",
|
||||
"description": "Unique recommendation identifier"
|
||||
},
|
||||
"title": {
|
||||
"type": "string",
|
||||
"minLength": 10,
|
||||
"maxLength": 200,
|
||||
"description": "Recommendation title"
|
||||
},
|
||||
"description": {
|
||||
"type": "string",
|
||||
"maxLength": 2000,
|
||||
"description": "Detailed recommendation description"
|
||||
},
|
||||
"priority": {
|
||||
"type": "string",
|
||||
"enum": ["critical", "high", "medium", "low"],
|
||||
"description": "Remediation priority"
|
||||
},
|
||||
"effort": {
|
||||
"type": "string",
|
||||
"enum": ["trivial", "low", "medium", "high", "major"],
|
||||
"description": "Estimated effort: trivial(<1hr), low(1-4hr), medium(1-3d), high(1-2wk), major(>2wk)"
|
||||
},
|
||||
"impact": {
|
||||
"type": "integer",
|
||||
"minimum": 1,
|
||||
"maximum": 10,
|
||||
"description": "Security impact if implemented (1-10)"
|
||||
},
|
||||
"relatedFindings": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^SEC-\\d{3,6}$"
|
||||
},
|
||||
"description": "IDs of findings this addresses"
|
||||
},
|
||||
"owaspCategories": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^A(0[1-9]|10):20(21|25)$"
|
||||
},
|
||||
"description": "OWASP categories this recommendation addresses"
|
||||
},
|
||||
"codeExample": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"before": {
|
||||
"type": "string",
|
||||
"maxLength": 2000,
|
||||
"description": "Vulnerable code example"
|
||||
},
|
||||
"after": {
|
||||
"type": "string",
|
||||
"maxLength": 2000,
|
||||
"description": "Secure code example"
|
||||
},
|
||||
"language": {
|
||||
"type": "string",
|
||||
"description": "Programming language"
|
||||
}
|
||||
},
|
||||
"description": "Before/after code examples for remediation"
|
||||
},
|
||||
"resources": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"required": ["title", "url"],
|
||||
"properties": {
|
||||
"title": { "type": "string" },
|
||||
"url": { "type": "string", "format": "uri" }
|
||||
}
|
||||
},
|
||||
"maxItems": 10,
|
||||
"description": "External resources and documentation"
|
||||
},
|
||||
"automatable": {
|
||||
"type": "boolean",
|
||||
"description": "Can this fix be automated?"
|
||||
},
|
||||
"fixCommand": {
|
||||
"type": "string",
|
||||
"description": "CLI command to apply fix if automatable"
|
||||
}
|
||||
}
|
||||
},
|
||||
"owaspCategoryBreakdown": {
|
||||
"type": "object",
|
||||
"description": "OWASP Top 10 2021 category scores and findings",
|
||||
"properties": {
|
||||
"A01:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A01:2021 - Broken Access Control"
|
||||
},
|
||||
"A02:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A02:2021 - Cryptographic Failures"
|
||||
},
|
||||
"A03:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A03:2021 - Injection"
|
||||
},
|
||||
"A04:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A04:2021 - Insecure Design"
|
||||
},
|
||||
"A05:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A05:2021 - Security Misconfiguration"
|
||||
},
|
||||
"A06:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A06:2021 - Vulnerable and Outdated Components"
|
||||
},
|
||||
"A07:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A07:2021 - Identification and Authentication Failures"
|
||||
},
|
||||
"A08:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A08:2021 - Software and Data Integrity Failures"
|
||||
},
|
||||
"A09:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A09:2021 - Security Logging and Monitoring Failures"
|
||||
},
|
||||
"A10:2021": {
|
||||
"$ref": "#/$defs/owaspCategoryScore",
|
||||
"description": "A10:2021 - Server-Side Request Forgery (SSRF)"
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
},
|
||||
"owaspCategoryScore": {
|
||||
"type": "object",
|
||||
"required": ["tested", "score"],
|
||||
"properties": {
|
||||
"tested": {
|
||||
"type": "boolean",
|
||||
"description": "Whether this category was tested"
|
||||
},
|
||||
"score": {
|
||||
"type": "number",
|
||||
"minimum": 0,
|
||||
"maximum": 100,
|
||||
"description": "Category score (100 = no issues, 0 = critical)"
|
||||
},
|
||||
"grade": {
|
||||
"type": "string",
|
||||
"pattern": "^[A-F][+-]?$",
|
||||
"description": "Letter grade for this category"
|
||||
},
|
||||
"findingCount": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Number of findings in this category"
|
||||
},
|
||||
"criticalCount": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Number of critical findings"
|
||||
},
|
||||
"highCount": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Number of high severity findings"
|
||||
},
|
||||
"status": {
|
||||
"type": "string",
|
||||
"enum": ["pass", "fail", "warn", "skip"],
|
||||
"description": "Category status"
|
||||
},
|
||||
"description": {
|
||||
"type": "string",
|
||||
"description": "Category description and context"
|
||||
},
|
||||
"cwes": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^CWE-\\d{1,4}$"
|
||||
},
|
||||
"description": "CWEs found in this category"
|
||||
}
|
||||
}
|
||||
},
|
||||
"securityMetrics": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"totalFindings": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Total vulnerabilities found"
|
||||
},
|
||||
"criticalCount": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Critical severity findings"
|
||||
},
|
||||
"highCount": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "High severity findings"
|
||||
},
|
||||
"mediumCount": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Medium severity findings"
|
||||
},
|
||||
"lowCount": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Low severity findings"
|
||||
},
|
||||
"infoCount": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Informational findings"
|
||||
},
|
||||
"filesScanned": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Number of files analyzed"
|
||||
},
|
||||
"linesOfCode": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Lines of code scanned"
|
||||
},
|
||||
"dependenciesChecked": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Number of dependencies checked"
|
||||
},
|
||||
"owaspCategoriesTested": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"maximum": 10,
|
||||
"description": "OWASP Top 10 categories tested"
|
||||
},
|
||||
"owaspCategoriesPassed": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"maximum": 10,
|
||||
"description": "OWASP Top 10 categories with no findings"
|
||||
},
|
||||
"uniqueCwes": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Unique CWE identifiers found"
|
||||
},
|
||||
"falsePositiveRate": {
|
||||
"type": "number",
|
||||
"minimum": 0,
|
||||
"maximum": 1,
|
||||
"description": "Estimated false positive rate"
|
||||
},
|
||||
"scanDurationMs": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Total scan duration in milliseconds"
|
||||
},
|
||||
"coverage": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"sast": {
|
||||
"type": "boolean",
|
||||
"description": "Static analysis performed"
|
||||
},
|
||||
"dast": {
|
||||
"type": "boolean",
|
||||
"description": "Dynamic analysis performed"
|
||||
},
|
||||
"dependencies": {
|
||||
"type": "boolean",
|
||||
"description": "Dependency scan performed"
|
||||
},
|
||||
"secrets": {
|
||||
"type": "boolean",
|
||||
"description": "Secret scanning performed"
|
||||
},
|
||||
"configuration": {
|
||||
"type": "boolean",
|
||||
"description": "Configuration review performed"
|
||||
}
|
||||
},
|
||||
"description": "Scan coverage indicators"
|
||||
}
|
||||
}
|
||||
},
|
||||
"scanConfiguration": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"target": {
|
||||
"type": "string",
|
||||
"description": "Scan target (file path, URL, or package)"
|
||||
},
|
||||
"targetType": {
|
||||
"type": "string",
|
||||
"enum": ["source", "url", "package", "container", "infrastructure"],
|
||||
"description": "Type of target being scanned"
|
||||
},
|
||||
"scanTypes": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"enum": ["sast", "dast", "dependency", "secret", "configuration", "container", "iac"]
|
||||
},
|
||||
"description": "Types of scans performed"
|
||||
},
|
||||
"severity": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"enum": ["critical", "high", "medium", "low", "info"]
|
||||
},
|
||||
"description": "Severity levels included in scan"
|
||||
},
|
||||
"owaspCategories": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^A(0[1-9]|10):20(21|25)$"
|
||||
},
|
||||
"description": "OWASP categories tested"
|
||||
},
|
||||
"tools": {
|
||||
"type": "array",
|
||||
"items": { "type": "string" },
|
||||
"description": "Security tools used"
|
||||
},
|
||||
"excludePatterns": {
|
||||
"type": "array",
|
||||
"items": { "type": "string" },
|
||||
"description": "File patterns excluded from scan"
|
||||
},
|
||||
"rulesets": {
|
||||
"type": "array",
|
||||
"items": { "type": "string" },
|
||||
"description": "Security rulesets applied"
|
||||
}
|
||||
}
|
||||
},
|
||||
"location": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"file": {
|
||||
"type": "string",
|
||||
"maxLength": 500,
|
||||
"description": "File path relative to project root"
|
||||
},
|
||||
"line": {
|
||||
"type": "integer",
|
||||
"minimum": 1,
|
||||
"description": "Line number"
|
||||
},
|
||||
"column": {
|
||||
"type": "integer",
|
||||
"minimum": 1,
|
||||
"description": "Column number"
|
||||
},
|
||||
"endLine": {
|
||||
"type": "integer",
|
||||
"minimum": 1,
|
||||
"description": "End line for multi-line findings"
|
||||
},
|
||||
"endColumn": {
|
||||
"type": "integer",
|
||||
"minimum": 1,
|
||||
"description": "End column"
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"format": "uri",
|
||||
"description": "URL for web-based findings"
|
||||
},
|
||||
"endpoint": {
|
||||
"type": "string",
|
||||
"description": "API endpoint path"
|
||||
},
|
||||
"method": {
|
||||
"type": "string",
|
||||
"enum": ["GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS"],
|
||||
"description": "HTTP method for API findings"
|
||||
},
|
||||
"parameter": {
|
||||
"type": "string",
|
||||
"description": "Vulnerable parameter name"
|
||||
},
|
||||
"component": {
|
||||
"type": "string",
|
||||
"description": "Affected component or module"
|
||||
}
|
||||
}
|
||||
},
|
||||
"artifact": {
|
||||
"type": "object",
|
||||
"required": ["type", "path"],
|
||||
"properties": {
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": ["report", "sarif", "data", "log", "evidence"],
|
||||
"description": "Artifact type"
|
||||
},
|
||||
"path": {
|
||||
"type": "string",
|
||||
"maxLength": 500,
|
||||
"description": "Path to artifact"
|
||||
},
|
||||
"format": {
|
||||
"type": "string",
|
||||
"enum": ["json", "sarif", "html", "md", "txt", "xml", "csv"],
|
||||
"description": "Artifact format"
|
||||
},
|
||||
"description": {
|
||||
"type": "string",
|
||||
"maxLength": 500,
|
||||
"description": "Artifact description"
|
||||
},
|
||||
"sizeBytes": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "File size in bytes"
|
||||
},
|
||||
"checksum": {
|
||||
"type": "string",
|
||||
"pattern": "^sha256:[a-f0-9]{64}$",
|
||||
"description": "SHA-256 checksum"
|
||||
}
|
||||
}
|
||||
},
|
||||
"timelineEvent": {
|
||||
"type": "object",
|
||||
"required": ["timestamp", "event"],
|
||||
"properties": {
|
||||
"timestamp": {
|
||||
"type": "string",
|
||||
"format": "date-time",
|
||||
"description": "Event timestamp"
|
||||
},
|
||||
"event": {
|
||||
"type": "string",
|
||||
"maxLength": 200,
|
||||
"description": "Event description"
|
||||
},
|
||||
"type": {
|
||||
"type": "string",
|
||||
"enum": ["start", "checkpoint", "warning", "error", "complete"],
|
||||
"description": "Event type"
|
||||
},
|
||||
"durationMs": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"description": "Duration since previous event"
|
||||
},
|
||||
"phase": {
|
||||
"type": "string",
|
||||
"enum": ["initialization", "sast", "dast", "dependency", "secret", "reporting"],
|
||||
"description": "Scan phase"
|
||||
}
|
||||
}
|
||||
},
|
||||
"metadata": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"executionTimeMs": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"maximum": 3600000,
|
||||
"description": "Execution time in milliseconds"
|
||||
},
|
||||
"toolsUsed": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"enum": ["semgrep", "npm-audit", "trivy", "owasp-zap", "bandit", "gosec", "eslint-security", "snyk", "gitleaks", "trufflehog", "bearer"]
|
||||
},
|
||||
"uniqueItems": true,
|
||||
"description": "Security tools used"
|
||||
},
|
||||
"agentId": {
|
||||
"type": "string",
|
||||
"pattern": "^qe-[a-z][a-z0-9-]*$",
|
||||
"description": "Agent ID (e.g., qe-security-scanner)"
|
||||
},
|
||||
"modelUsed": {
|
||||
"type": "string",
|
||||
"description": "LLM model used for analysis"
|
||||
},
|
||||
"inputHash": {
|
||||
"type": "string",
|
||||
"pattern": "^[a-f0-9]{64}$",
|
||||
"description": "SHA-256 hash of input"
|
||||
},
|
||||
"targetUrl": {
|
||||
"type": "string",
|
||||
"format": "uri",
|
||||
"description": "Target URL if applicable"
|
||||
},
|
||||
"targetPath": {
|
||||
"type": "string",
|
||||
"description": "Target path if applicable"
|
||||
},
|
||||
"environment": {
|
||||
"type": "string",
|
||||
"enum": ["development", "staging", "production", "ci"],
|
||||
"description": "Execution environment"
|
||||
},
|
||||
"retryCount": {
|
||||
"type": "integer",
|
||||
"minimum": 0,
|
||||
"maximum": 10,
|
||||
"description": "Number of retries"
|
||||
}
|
||||
}
|
||||
},
|
||||
"validationResult": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"schemaValid": {
|
||||
"type": "boolean",
|
||||
"description": "Passes JSON schema validation"
|
||||
},
|
||||
"contentValid": {
|
||||
"type": "boolean",
|
||||
"description": "Passes content validation"
|
||||
},
|
||||
"confidence": {
|
||||
"type": "number",
|
||||
"minimum": 0,
|
||||
"maximum": 1,
|
||||
"description": "Confidence score"
|
||||
},
|
||||
"warnings": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"maxLength": 500
|
||||
},
|
||||
"maxItems": 20,
|
||||
"description": "Validation warnings"
|
||||
},
|
||||
"errors": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"maxLength": 500
|
||||
},
|
||||
"maxItems": 20,
|
||||
"description": "Validation errors"
|
||||
},
|
||||
"validatorVersion": {
|
||||
"type": "string",
|
||||
"pattern": "^\\d+\\.\\d+\\.\\d+$",
|
||||
"description": "Validator version"
|
||||
}
|
||||
}
|
||||
},
|
||||
"learningData": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"patternsDetected": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"maxLength": 200
|
||||
},
|
||||
"maxItems": 20,
|
||||
"description": "Security patterns detected (e.g., sql-injection-string-concat)"
|
||||
},
|
||||
"reward": {
|
||||
"type": "number",
|
||||
"minimum": 0,
|
||||
"maximum": 1,
|
||||
"description": "Reward signal for learning (0.0-1.0)"
|
||||
},
|
||||
"feedbackLoop": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"previousRunId": {
|
||||
"type": "string",
|
||||
"format": "uuid",
|
||||
"description": "Previous run ID for comparison"
|
||||
},
|
||||
"improvement": {
|
||||
"type": "number",
|
||||
"minimum": -1,
|
||||
"maximum": 1,
|
||||
"description": "Improvement over previous run"
|
||||
}
|
||||
}
|
||||
},
|
||||
"newVulnerabilityPatterns": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"pattern": { "type": "string" },
|
||||
"cwe": { "type": "string" },
|
||||
"confidence": { "type": "number" }
|
||||
}
|
||||
},
|
||||
"description": "New vulnerability patterns learned"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,45 @@
|
||||
{
|
||||
"skillName": "security-testing",
|
||||
"skillVersion": "1.0.0",
|
||||
"requiredTools": [
|
||||
"jq"
|
||||
],
|
||||
"optionalTools": [
|
||||
"npm",
|
||||
"semgrep",
|
||||
"trivy",
|
||||
"ajv",
|
||||
"jsonschema",
|
||||
"python3"
|
||||
],
|
||||
"schemaPath": "schemas/output.json",
|
||||
"requiredFields": [
|
||||
"skillName",
|
||||
"status",
|
||||
"output",
|
||||
"output.summary",
|
||||
"output.findings",
|
||||
"output.owaspCategories"
|
||||
],
|
||||
"requiredNonEmptyFields": [
|
||||
"output.summary"
|
||||
],
|
||||
"mustContainTerms": [
|
||||
"OWASP",
|
||||
"security",
|
||||
"vulnerability"
|
||||
],
|
||||
"mustNotContainTerms": [
|
||||
"TODO",
|
||||
"placeholder",
|
||||
"FIXME"
|
||||
],
|
||||
"enumValidations": {
|
||||
".status": [
|
||||
"success",
|
||||
"partial",
|
||||
"failed",
|
||||
"skipped"
|
||||
]
|
||||
}
|
||||
}
|
||||
Reference in New Issue
Block a user