remove the current solution, add skills

Oleksandr Bezdieniezhnykh
2026-03-14 18:37:48 +02:00
parent fd75243a84
commit 767874cb90
363 changed files with 6057 additions and 36380 deletions
@@ -0,0 +1,281 @@
---
name: decompose
description: |
Decompose planned components into atomic implementable features with bootstrap structure plan.
4-step workflow: bootstrap structure plan, feature decomposition, cross-component verification, and Jira task creation.
Supports project mode (_docs/ structure), single component mode, and standalone mode (@file.md).
Trigger phrases:
- "decompose", "decompose features", "feature decomposition"
- "task decomposition", "break down components"
- "prepare for implementation"
disable-model-invocation: true
---
# Feature Decomposition
Decompose planned components into atomic, implementable feature specs with a bootstrap structure plan through a systematic workflow.
## Core Principles
- **Atomic features**: each feature does one thing; if it exceeds 5 complexity points, split it
- **Behavioral specs, not implementation plans**: describe what the system should do, not how to build it
- **Save immediately**: write artifacts to disk after each component; never accumulate unsaved work
- **Ask, don't assume**: when requirements are ambiguous, ask the user before proceeding
- **Plan, don't code**: this workflow produces documents and Jira tasks, never implementation code
## Context Resolution
Determine the operating mode based on invocation before any other logic runs.
**Full project mode** (no explicit input file provided):
- PLANS_DIR: `_docs/02_plans/`
- TASKS_DIR: `_docs/02_tasks/`
- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, PLANS_DIR
- Runs Step 1 (bootstrap) + Step 2 (all components) + Step 3 (cross-verification) + Step 4 (Jira)
**Single component mode** (provided file is within `_docs/02_plans/` and inside a `components/` subdirectory):
- PLANS_DIR: `_docs/02_plans/`
- TASKS_DIR: `_docs/02_tasks/`
- Derive `<topic>`, component number, and component name from the file path
- Ask user for the parent Epic ID
- Runs Step 2 (that component only) + Step 4 (Jira)
- Overwrites existing feature files in that component's TASKS_DIR subdirectory
**Standalone mode** (explicit input file provided, not within `_docs/02_plans/`):
- INPUT_FILE: the provided file (treated as a component spec)
- Derive `<topic>` from the input filename (without extension)
- TASKS_DIR: `_standalone/<topic>/tasks/`
- Guardrails relaxed: only INPUT_FILE must exist and be non-empty
- Ask user for the parent Epic ID
- Runs Step 2 (that component only) + Step 4 (Jira)
Announce the detected mode and resolved paths to the user before proceeding.
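The mode rules above can be sketched as a small resolver. This is only an illustration under stated assumptions: the `Context` type and the `resolve_context` function are hypothetical names, and the source does not say what to do with a file that lies inside `_docs/02_plans/` but outside a `components/` subdirectory, so the sketch rejects that case.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

PLANS_DIR = PurePosixPath("_docs/02_plans")
PROJECT_TASKS_DIR = PurePosixPath("_docs/02_tasks")


@dataclass(frozen=True)
class Context:
    """Hypothetical container for the resolved mode and paths."""
    mode: str  # "full", "single_component", or "standalone"
    tasks_dir: PurePosixPath
    topic: Optional[str] = None


def resolve_context(input_file: Optional[str]) -> Context:
    """Pick the operating mode from the optional input file path."""
    if input_file is None:
        # No explicit input file: full project mode.
        return Context("full", PROJECT_TASKS_DIR)

    path = PurePosixPath(input_file)
    prefix = PLANS_DIR.parts
    if path.parts[: len(prefix)] == prefix:
        rel = path.parts[len(prefix):]  # e.g. (topic, "components", "01_x", "description.md")
        if "components" in rel[1:-1]:
            return Context("single_component", PROJECT_TASKS_DIR, rel[0])
        # Assumption: this case is undefined in the skill, so refuse it.
        raise ValueError(f"{input_file} is inside PLANS_DIR but not under components/")

    # Any other explicit file is treated as a standalone component spec.
    topic = path.stem
    return Context("standalone", PurePosixPath("_standalone") / topic / "tasks", topic)
```

For example, `resolve_context("notes/billing.md")` resolves to standalone mode with tasks under `_standalone/billing/tasks`.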
## Input Specification
### Required Files
**Full project mode:**
| File | Purpose |
|------|---------|
| `_docs/00_problem/problem.md` | Problem description and context |
| `_docs/00_problem/restrictions.md` | Constraints and limitations (if available) |
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria (if available) |
| `_docs/01_solution/solution.md` | Finalized solution |
| `PLANS_DIR/<topic>/architecture.md` | Architecture from plan skill |
| `PLANS_DIR/<topic>/system-flows.md` | System flows from plan skill |
| `PLANS_DIR/<topic>/components/[##]_[name]/description.md` | Component specs from plan skill |
**Single component mode:**
| File | Purpose |
|------|---------|
| The provided component `description.md` | Component spec to decompose |
| Corresponding `tests.md` in the same directory (if available) | Test specs for context |
**Standalone mode:**
| File | Purpose |
|------|---------|
| INPUT_FILE (the provided file) | Component spec to decompose |
### Prerequisite Checks (BLOCKING)
**Full project mode:**
1. At least one `<topic>/` directory exists under PLANS_DIR with `architecture.md` and `components/`. **STOP if missing**
2. If multiple topics exist, ask user which one to decompose
3. Create TASKS_DIR if it does not exist
4. If `TASKS_DIR/<topic>/` already exists, ask user: **resume from last checkpoint or start fresh?**
**Single component mode:**
1. The provided component file exists and is non-empty — **STOP if missing**
2. Create the component's subdirectory under TASKS_DIR if it does not exist
**Standalone mode:**
1. INPUT_FILE exists and is non-empty — **STOP if missing**
2. Create TASKS_DIR if it does not exist
## Artifact Management
### Directory Structure
```
TASKS_DIR/<topic>/
├── initial_structure.md (Step 1, full mode only)
├── cross_dependencies.md (Step 3, full mode only)
├── SUMMARY.md (final)
├── [##]_[component_name]/
│ ├── [##].[##]_feature_[feature_name].md
│ ├── [##].[##]_feature_[feature_name].md
│ └── ...
├── [##]_[component_name]/
│ └── ...
└── ...
```
### Save Timing
| Step | Save immediately after | Filename |
|------|------------------------|----------|
| Step 1 | Bootstrap structure plan complete | `initial_structure.md` |
| Step 2 | Each component decomposed | `[##]_[name]/[##].[##]_feature_[feature_name].md` |
| Step 3 | Cross-component verification complete | `cross_dependencies.md` |
| Step 4 | Jira tasks created | Jira via MCP |
| Final | All steps complete | `SUMMARY.md` |
### Resumability
If `TASKS_DIR/<topic>/` already contains artifacts:
1. List existing files and match them to the save timing table
2. Identify the last completed component based on which feature files exist
3. Resume from the next incomplete component
4. Inform the user which components are being skipped
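A minimal sketch of the resume check, assuming artifact paths are given relative to `TASKS_DIR/<topic>/` and that a component counts as completed once at least one feature file exists for it (the skill does not define a stricter marker). The function name and regular expressions are illustrative, not part of the skill.

```python
import re
from pathlib import PurePosixPath
from typing import Iterable, List

# Patterns derived from the directory structure: [##]_[component_name]/
# and [##].[##]_feature_[feature_name].md
COMPONENT_DIR = re.compile(r"^\d{2}_.+$")
FEATURE_FILE = re.compile(r"^\d{2}\.\d{2}_feature_.+\.md$")


def completed_components(artifact_paths: Iterable[str]) -> List[str]:
    """Return component directories that already contain feature files."""
    done = set()
    for raw in artifact_paths:
        parts = PurePosixPath(raw).parts
        if (
            len(parts) == 2
            and COMPONENT_DIR.match(parts[0])
            and FEATURE_FILE.match(parts[1])
        ):
            done.add(parts[0])
    return sorted(done)
```

Components missing from the returned list are the ones to resume from; the skipped ones are those reported to the user.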
## Progress Tracking
At the start of execution, create a TodoWrite with all applicable steps. Update status as each step/component completes.
## Workflow
### Step 1: Bootstrap Structure Plan (full project mode only)
**Role**: Professional software architect
**Goal**: Produce `initial_structure.md` describing the project skeleton for implementation
**Constraints**: This is a plan document, not code. The `implement-initial` command executes it.
1. Read architecture.md, all component specs, and system-flows.md from PLANS_DIR
2. Read problem, solution, and restrictions from `_docs/00_problem/` and `_docs/01_solution/`
3. Research best implementation patterns for the identified tech stack
4. Document the structure plan using `templates/initial-structure.md`
**Self-verification**:
- [ ] All components have corresponding folders in the layout
- [ ] All inter-component interfaces have DTOs defined
- [ ] CI/CD stages cover build, lint, test, security, deploy
- [ ] Environment strategy covers dev, staging, production
- [ ] Test structure includes unit and integration test locations
**Save action**: Write `initial_structure.md`
**BLOCKING**: Present structure plan summary to user. Do NOT proceed until user confirms.
---
### Step 2: Feature Decomposition (all modes)
**Role**: Professional software architect
**Goal**: Decompose each component into atomic, implementable feature specs
**Constraints**: Behavioral specs only — describe what, not how. No implementation code.
For each component (or the single provided component):
1. Read the component's `description.md` and `tests.md` (if available)
2. Decompose into atomic features; create only 1 feature if the component is simple or atomic
3. Split into multiple features only when necessary and when the split makes implementation easier
4. Do not create features of other components — only features of the current component
5. Each feature should be atomic, covering either no APIs or a set of semantically related APIs
6. Write each feature spec using `templates/feature-spec.md`
7. Estimate complexity per feature (1, 2, 3, 5 points); no feature should exceed 5 points — split if it does
8. Note feature dependencies (within component and cross-component)
**Self-verification** (per component):
- [ ] Every feature is atomic (single concern)
- [ ] No feature exceeds 5 complexity points
- [ ] Feature dependencies are noted
- [ ] Features cover all interfaces defined in the component spec
- [ ] No features duplicate work from other components
**Save action**: Write each `[##]_[name]/[##].[##]_feature_[feature_name].md`
---
### Step 3: Cross-Component Verification (full project mode only)
**Role**: Professional software architect and analyst
**Goal**: Verify feature consistency across all components
**Constraints**: Review step — fix gaps found, do not add new features
1. Verify feature dependencies across all components are consistent
2. Check no gaps: every interface in architecture.md has features covering it
3. Check no overlaps: features don't duplicate work across components
4. Produce dependency matrix showing cross-component feature dependencies
5. Determine recommended implementation order based on dependencies
**Self-verification**:
- [ ] Every architecture interface is covered by at least one feature
- [ ] No circular feature dependencies across components
- [ ] Cross-component dependencies are explicitly noted in affected feature specs
**Save action**: Write `cross_dependencies.md`
**BLOCKING**: Present cross-component summary to user. Do NOT proceed until user confirms.
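The circular-dependency check and the recommended implementation order can both be derived from a topological sort. The sketch below uses Python's standard `graphlib` module; the `implementation_order` name and the feature identifiers in the usage example are hypothetical, and the mapping format (feature to the set of features it depends on) is an assumption.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def implementation_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Order features so that each one follows the features it depends on.

    `deps` maps a feature to the set of features it depends on.
    Raises ValueError when the dependencies are circular.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle.
        cycle = " -> ".join(exc.args[1])
        raise ValueError(f"circular dependency: {cycle}") from exc


# Usage with made-up feature names:
order = implementation_order({
    "02.01_feature_checkout": {"01.01_feature_login"},
    "01.01_feature_login": set(),
})
```

A `ValueError` here corresponds to a failed "no circular feature dependencies" check and would be escalated to the user.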
---
### Step 4: Jira Tasks (all modes)
**Role**: Professional product manager
**Goal**: Create Jira tasks from feature specs under the appropriate parent epics
**Constraints**: Be concise — fewer words with the same meaning is better
1. For each feature spec, create a Jira task following the parsing rules and field mapping from `gen_jira_task_and_branch.md`
2. In full mode: search Jira for epics matching component names/labels to find parent epic IDs
3. In single component and standalone modes: use the Epic ID obtained during context resolution
4. Do NOT create git branches or rename files; both happen during implementation
**Self-verification**:
- [ ] Every feature has a corresponding Jira task
- [ ] Every task is linked to the correct parent epic
- [ ] Task descriptions match feature spec content
**Save action**: Jira tasks created via MCP
---
## Summary Report
After all steps complete, write `SUMMARY.md` using `templates/summary.md` as structure.
## Common Mistakes
- **Coding during decomposition**: this workflow produces specs, never code
- **Over-splitting**: don't create many features if the component is simple — 1 feature is fine
- **Features exceeding 5 points**: split them; no feature should be too complex for a single task
- **Cross-component features**: each feature belongs to exactly one component
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Creating git branches**: branch creation is an implementation concern, not a decomposition one
## Escalation Rules
| Situation | Action |
|-----------|--------|
| Ambiguous component boundaries | ASK user |
| Feature complexity exceeds 5 points after splitting | ASK user |
| Missing component specs in PLANS_DIR | ASK user |
| Cross-component dependency conflict | ASK user |
| Jira epic not found for a component | ASK user for Epic ID |
| Component naming | PROCEED, confirm at next BLOCKING gate |
## Methodology Quick Reference
```
┌────────────────────────────────────────────────────────────────┐
│ Feature Decomposition (4-Step Method) │
├────────────────────────────────────────────────────────────────┤
│ CONTEXT: Resolve mode (full / single component / standalone) │
│ 1. Bootstrap Structure → initial_structure.md (full only) │
│ [BLOCKING: user confirms structure] │
│ 2. Feature Decompose → [##]_[name]/[##].[##]_feature_* │
│ 3. Cross-Verification → cross_dependencies.md (full only) │
│ [BLOCKING: user confirms dependencies] │
│ 4. Jira Tasks → Jira via MCP │
│ ───────────────────────────────────────────────── │
│ Summary → SUMMARY.md │
├────────────────────────────────────────────────────────────────┤
│ Principles: Atomic features · Behavioral specs · Save now │
│ Ask don't assume · Plan don't code │
└────────────────────────────────────────────────────────────────┘
```
@@ -0,0 +1,108 @@
# Feature Specification Template
Create a focused behavioral specification that describes **what** the system should do, not **how** it should be built.
Save as `TASKS_DIR/<topic>/[##]_[component_name]/[##].[##]_feature_[feature_name].md`.
---
```markdown
# [Feature Name]
**Status**: Draft | **Date**: [YYYY-MM-DD] | **Feature**: [Brief Feature Description]
**Complexity**: [1|2|3|5] points
**Dependencies**: [List dependent features or "None"]
**Component**: [##]_[component_name]
## Problem
Clear, concise statement of the problem users are facing.
## Outcome
- Measurable or observable goal 1
- Measurable or observable goal 2
## Scope
### Included
- What's in scope for this feature
### Excluded
- Explicitly what's NOT in scope
## Acceptance Criteria
**AC-1: [Title]**
Given [precondition]
When [action]
Then [expected result]
**AC-2: [Title]**
Given [precondition]
When [action]
Then [expected result]
## Non-Functional Requirements
**Performance**
- [requirement if relevant]
**Compatibility**
- [requirement if relevant]
**Reliability**
- [requirement if relevant]
## Unit Tests
| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | [test subject] | [expected result] |
## Integration Tests
| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | [setup] | [test subject] | [expected behavior] | [NFR if any] |
## Constraints
- [Architectural pattern constraint if critical]
- [Technical limitation]
- [Integration requirement]
## Risks & Mitigation
**Risk 1: [Title]**
- *Risk*: [Description]
- *Mitigation*: [Approach]
```
---
## Complexity Points Guide
- 1 point: Trivial, self-contained, no dependencies
- 2 points: Non-trivial, low complexity, minimal coordination
- 3 points: Multi-step, moderate complexity, potential alignment needed
- 5 points: Difficult, interconnected logic, medium-high risk
- 8 points: Too complex — split into smaller features
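The scale above can be checked mechanically. A minimal sketch, assuming estimates are plain integers; the `needs_split` helper is a hypothetical name, and treating values such as 4 as invalid is an assumption based on the scale listing only 1, 2, 3, 5, and 8.

```python
ALLOWED_POINTS = (1, 2, 3, 5)  # valid estimates for a single feature
SPLIT_THRESHOLD = 5            # anything above this must be split


def needs_split(points: int) -> bool:
    """Return True when a feature is too complex and must be split.

    Raises ValueError for estimates that are neither on the scale
    nor above the split threshold.
    """
    if points in ALLOWED_POINTS:
        return False
    if points > SPLIT_THRESHOLD:
        return True
    raise ValueError(f"{points} is not on the complexity scale {ALLOWED_POINTS}")
```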
## Output Guidelines
**DO:**
- Focus on behavior and user experience
- Use clear, simple language
- Keep acceptance criteria testable (Gherkin format)
- Include realistic scope boundaries
- Write from the user's perspective
- Include complexity estimation
- Note dependencies on other features
**DON'T:**
- Include implementation details (file paths, classes, methods)
- Prescribe technical solutions or libraries
- Add architectural diagrams or code examples
- Specify exact API endpoints or data structures
- Include step-by-step implementation instructions
- Add "how to build" guidance
@@ -0,0 +1,113 @@
# Initial Structure Plan Template
Use this template for the bootstrap structure plan. Save as `TASKS_DIR/<topic>/initial_structure.md`.
---
```markdown
# Initial Project Structure Plan
**Date**: [YYYY-MM-DD]
**Tech Stack**: [language, framework, database, etc.]
**Source**: architecture.md, component specs from _docs/02_plans/<topic>/
## Project Folder Layout
~~~
project-root/
├── [folder structure based on tech stack and components]
└── ...
~~~
### Layout Rationale
[Brief explanation of why this structure was chosen — language conventions, framework patterns, etc.]
## DTOs and Interfaces
### Shared DTOs
| DTO Name | Used By Components | Fields Summary |
|----------|-------------------|---------------|
| [name] | [component list] | [key fields] |
### Component Interfaces
| Component | Interface | Methods | Exposed To |
|-----------|-----------|---------|-----------|
| [##]_[name] | [InterfaceName] | [method list] | [consumers] |
## CI/CD Pipeline
| Stage | Purpose | Trigger |
|-------|---------|---------|
| Build | Compile/bundle the application | Every push |
| Lint / Static Analysis | Code quality and style checks | Every push |
| Unit Tests | Run unit test suite | Every push |
| Integration Tests | Run integration test suite | Every push |
| Security Scan | SAST / dependency check | Every push |
| Deploy to Staging | Deploy to staging environment | Merge to staging branch |
### Pipeline Configuration Notes
[Framework-specific notes: CI tool, runners, caching, parallelism, etc.]
## Environment Strategy
| Environment | Purpose | Configuration Notes |
|-------------|---------|-------------------|
| Development | Local development | [local DB, mock services, debug flags] |
| Staging | Pre-production testing | [staging DB, staging services, production-like config] |
| Production | Live system | [production DB, real services, optimized config] |
### Environment Variables
| Variable | Dev | Staging | Production | Description |
|----------|-----|---------|------------|-------------|
| [VAR_NAME] | [value/source] | [value/source] | [value/source] | [purpose] |
## Database Migration Approach
**Migration tool**: [tool name]
**Strategy**: [migration strategy — e.g., versioned scripts, ORM migrations]
### Initial Schema
[Key tables/collections that need to be created, referencing component data access patterns]
## Test Structure
~~~
tests/
├── unit/
│ ├── [component_1]/
│ ├── [component_2]/
│ └── ...
├── integration/
│ ├── test_data/
│ └── [test files]
└── ...
~~~
### Test Configuration Notes
[Test runner, fixtures, test data management, isolation strategy]
## Implementation Order
| Order | Component | Reason |
|-------|-----------|--------|
| 1 | [##]_[name] | [why first — foundational, no dependencies] |
| 2 | [##]_[name] | [depends on #1] |
| ... | ... | ... |
```
---
## Guidance Notes
- This is a PLAN document, not code. The `3.05_implement_initial_structure` command executes it.
- Focus on structure and organization decisions, not implementation details.
- Reference component specs for interface and DTO details — don't repeat everything.
- The folder layout should follow conventions of the identified tech stack.
- Environment strategy should account for secrets management and configuration.
@@ -0,0 +1,59 @@
# Decomposition Summary Template
Use this template after all steps complete. Save as `TASKS_DIR/<topic>/SUMMARY.md`.
---
```markdown
# Decomposition Summary
**Date**: [YYYY-MM-DD]
**Topic**: [topic name]
**Total Components**: [N]
**Total Features**: [N]
**Total Complexity Points**: [N]
## Component Breakdown
| # | Component | Features | Total Points | Jira Epic |
|---|-----------|----------|-------------|-----------|
| 01 | [name] | [count] | [sum] | [EPIC-ID] |
| 02 | [name] | [count] | [sum] | [EPIC-ID] |
| ... | ... | ... | ... | ... |
## Feature List
| Component | Feature | Complexity | Jira Task | Dependencies |
|-----------|---------|-----------|-----------|-------------|
| [##]_[name] | [##].[##]_feature_[name] | [points] | [TASK-ID] | [deps or "None"] |
| ... | ... | ... | ... | ... |
## Implementation Order
Recommended sequence based on dependency analysis:
| Phase | Components / Features | Rationale |
|-------|----------------------|-----------|
| 1 | [list] | [foundational, no dependencies] |
| 2 | [list] | [depends on phase 1] |
| 3 | [list] | [depends on phase 1-2] |
| ... | ... | ... |
### Parallelization Opportunities
[Features/components that can be implemented concurrently within each phase]
## Cross-Component Dependencies
| From (Feature) | To (Feature) | Dependency Type |
|----------------|-------------|-----------------|
| [comp.feature] | [comp.feature] | [data / API / event] |
| ... | ... | ... |
## Artifacts Produced
- `initial_structure.md` — project skeleton plan
- `cross_dependencies.md` — dependency matrix
- `[##]_[name]/[##].[##]_feature_*.md` — feature specs per component
- Jira tasks created under respective epics
```
@@ -0,0 +1,393 @@
---
name: plan
description: |
Decompose a solution into architecture, system flows, components, tests, and Jira epics.
Systematic 5-step planning workflow with BLOCKING gates, self-verification, and structured artifact management.
Supports project mode (_docs/ + _docs/02_plans/ structure) and standalone mode (@file.md).
Trigger phrases:
- "plan", "decompose solution", "architecture planning"
- "break down the solution", "create planning documents"
- "component decomposition", "solution analysis"
disable-model-invocation: true
---
# Solution Planning
Decompose a problem and solution into architecture, system flows, components, tests, and Jira epics through a systematic 5-step workflow.
## Core Principles
- **Single Responsibility**: each component does one thing well; do not spread related logic across components
- **Dumb code, smart data**: keep logic simple, push complexity into data structures and configuration
- **Save immediately**: write artifacts to disk after each step; never accumulate unsaved work
- **Ask, don't assume**: when requirements are ambiguous, ask the user before proceeding
- **Plan, don't code**: this workflow produces documents and specs, never implementation code
## Context Resolution
Determine the operating mode based on invocation before any other logic runs.
**Project mode** (no explicit input file provided):
- PROBLEM_FILE: `_docs/00_problem/problem.md`
- SOLUTION_FILE: `_docs/01_solution/solution.md`
- PLANS_DIR: `_docs/02_plans/`
- All existing guardrails apply as-is.
**Standalone mode** (explicit input file provided, e.g. `/plan @some_doc.md`):
- INPUT_FILE: the provided file (treated as combined problem + solution context)
- Derive `<topic>` from the input filename (without extension)
- PLANS_DIR: `_standalone/<topic>/plans/`
- Guardrails relaxed: only INPUT_FILE must exist and be non-empty
- `acceptance_criteria.md` and `restrictions.md` are optional — warn if absent
Announce the detected mode and resolved paths to the user before proceeding.
## Input Specification
### Required Files
**Project mode:**
| File | Purpose |
|------|---------|
| PROBLEM_FILE (`_docs/00_problem/problem.md`) | Problem description and context |
| `_docs/00_problem/input_data/` | Reference data examples (if available) |
| `_docs/00_problem/restrictions.md` | Constraints and limitations (if available) |
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria (if available) |
| SOLUTION_FILE (`_docs/01_solution/solution.md`) | Solution draft to decompose |
**Standalone mode:**
| File | Purpose |
|------|---------|
| INPUT_FILE (the provided file) | Combined problem + solution context |
### Prerequisite Checks (BLOCKING)
**Project mode:**
1. PROBLEM_FILE exists and is non-empty — **STOP if missing**
2. SOLUTION_FILE exists and is non-empty — **STOP if missing**
3. Create PLANS_DIR if it does not exist
4. If `PLANS_DIR/<topic>/` already exists, ask user: **resume from last checkpoint or start fresh?**
**Standalone mode:**
1. INPUT_FILE exists and is non-empty — **STOP if missing**
2. Warn if no `restrictions.md` or `acceptance_criteria.md` provided alongside INPUT_FILE
3. Create PLANS_DIR if it does not exist
4. If `PLANS_DIR/<topic>/` already exists, ask user: **resume from last checkpoint or start fresh?**
## Artifact Management
### Directory Structure
At the start of planning, create a topic-named working directory under PLANS_DIR:
```
PLANS_DIR/<topic>/
├── architecture.md
├── system-flows.md
├── risk_mitigations.md
├── risk_mitigations_02.md (iterative, ## as sequence)
├── components/
│ ├── 01_[name]/
│ │ ├── description.md
│ │ └── tests.md
│ ├── 02_[name]/
│ │ ├── description.md
│ │ └── tests.md
│ └── ...
├── common-helpers/
│ ├── 01_helper_[name].md
│ ├── 02_helper_[name].md
│ └── ...
├── e2e_test_infrastructure.md
├── diagrams/
│ ├── components.drawio
│ └── flows/
│ ├── flow_[name].md (Mermaid)
│ └── ...
└── FINAL_report.md
```
### Save Timing
| Step | Save immediately after | Filename |
|------|------------------------|----------|
| Step 1 | Architecture analysis complete | `architecture.md` |
| Step 1 | System flows documented | `system-flows.md` |
| Step 2 | Each component analyzed | `components/[##]_[name]/description.md` |
| Step 2 | Common helpers generated | `common-helpers/[##]_helper_[name].md` |
| Step 2 | Diagrams generated | `diagrams/` |
| Step 3 | Risk assessment complete | `risk_mitigations.md` |
| Step 4 | Tests written per component | `components/[##]_[name]/tests.md` |
| Step 4b | E2E test infrastructure spec | `e2e_test_infrastructure.md` |
| Step 5 | Epics created in Jira | Jira via MCP |
| Final | All steps complete | `FINAL_report.md` |
### Save Principles
1. **Save immediately**: write to disk as soon as a step completes; do not wait until the end
2. **Incremental updates**: the same file may be updated multiple times, either by appending to it or by replacing its content
3. **Preserve process**: keep all intermediate files even after integration into final report
4. **Enable recovery**: if interrupted, resume from the last saved artifact (see Resumability)
### Resumability
If `PLANS_DIR/<topic>/` already contains artifacts:
1. List existing files and match them to the save timing table above
2. Identify the last completed step based on which artifacts exist
3. Resume from the next incomplete step
4. Inform the user which steps are being skipped
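One way to map existing artifacts to the next incomplete step is sketched below. It is a simplification under stated assumptions: paths are relative to `PLANS_DIR/<topic>/`, a single `description.md` or `tests.md` is taken as evidence that Step 2 or Step 4 finished, and Step 5 is returned last because it leaves no file artifact. The `next_step` name is hypothetical.

```python
from typing import Iterable


def next_step(existing: Iterable[str]) -> str:
    """Return the first step whose artifacts are missing."""
    files = set(existing)

    def has_suffix(suffix: str) -> bool:
        return any(f.endswith(suffix) for f in files)

    # Ordered to follow the save timing table.
    checks = [
        ("Step 1", {"architecture.md", "system-flows.md"} <= files),
        ("Step 2", has_suffix("/description.md")),
        ("Step 3", "risk_mitigations.md" in files),
        ("Step 4", has_suffix("/tests.md")),
        ("Step 4b", "e2e_test_infrastructure.md" in files),
    ]
    for step, done in checks:
        if not done:
            return step
    # Jira epics are not stored on disk, so Step 5 cannot be detected here.
    return "Step 5"
```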
## Progress Tracking
At the start of execution, create a TodoWrite with all steps (1 through 5, including 4b). Update status as each step completes.
## Workflow
### Step 1: Solution Analysis
**Role**: Professional software architect
**Goal**: Produce `architecture.md` and `system-flows.md` from the solution draft
**Constraints**: No code, no component-level detail yet; focus on system-level view
1. Read all input files thoroughly
2. Research unknown or questionable topics via internet; ask user about ambiguities
3. Document architecture using `templates/architecture.md` as structure
4. Document system flows using `templates/system-flows.md` as structure
**Self-verification**:
- [ ] Architecture covers all capabilities mentioned in solution.md
- [ ] System flows cover all main user/system interactions
- [ ] No contradictions with problem.md or restrictions.md
- [ ] Technology choices are justified
**Save action**: Write `architecture.md` and `system-flows.md`
**BLOCKING**: Present architecture summary to user. Do NOT proceed until user confirms.
---
### Step 2: Component Decomposition
**Role**: Professional software architect
**Goal**: Decompose the architecture into components with detailed specs
**Constraints**: No code; only names, interfaces, inputs/outputs. Follow SRP strictly.
1. Identify components from the architecture; think about separation, reusability, and communication patterns
2. If additional components are needed (data preparation, shared helpers), create them
3. For each component, write a spec using `templates/component-spec.md` as structure
4. Generate diagrams:
- draw.io component diagram showing relations (minimize line intersections, group semantically coherent components, place external users near their components)
- Mermaid flowchart per main control flow
5. When several components share the same logic, extract it into a helper spec under the `common-helpers/` folder instead of duplicating it in each component
**Self-verification**:
- [ ] Each component has a single, clear responsibility
- [ ] No functionality is spread across multiple components
- [ ] All inter-component interfaces are defined (who calls whom, with what)
- [ ] Component dependency graph has no circular dependencies
- [ ] All components from architecture.md are accounted for
**Save action**: Write:
- each component `components/[##]_[name]/description.md`
- each common helper `common-helpers/[##]_helper_[name].md`
- diagrams `diagrams/`
**BLOCKING**: Present component list with one-line summaries to user. Do NOT proceed until user confirms.
---
### Step 3: Architecture Review & Risk Assessment
**Role**: Professional software architect and analyst
**Goal**: Validate all artifacts for consistency, then identify and mitigate risks
**Constraints**: This is a review step — fix problems found, do not add new features
#### 3a. Evaluator Pass (re-read ALL artifacts)
Review checklist:
- [ ] All components follow Single Responsibility Principle
- [ ] All components follow dumb code / smart data principle
- [ ] Inter-component interfaces are consistent (caller's output matches callee's input)
- [ ] No circular dependencies in the dependency graph
- [ ] No missing interactions between components
- [ ] No over-engineering — is there a simpler decomposition?
- [ ] Security considerations addressed in component design
- [ ] Performance bottlenecks identified
- [ ] API contracts are consistent across components
Fix any issues found before proceeding to risk identification.
#### 3b. Risk Identification
1. Identify technical and project risks
2. Assess probability and impact using `templates/risk-register.md`
3. Define mitigation strategies
4. Apply mitigations to architecture, flows, and component documents where applicable
**Self-verification**:
- [ ] Every High/Critical risk has a concrete mitigation strategy
- [ ] Mitigations are reflected in the relevant component or architecture docs
- [ ] No new risks introduced by the mitigations themselves
**Save action**: Write `risk_mitigations.md`
**BLOCKING**: Present risk summary to user. Ask whether assessment is sufficient.
**Iterative**: If user requests another round, repeat Step 3 and write `risk_mitigations_##.md` (## as sequence number). Continue until user confirms.
---
### Step 4: Test Specifications
**Role**: Professional Quality Assurance Engineer
**Goal**: Write test specs for each component achieving minimum 75% acceptance criteria coverage
**Constraints**: Test specs only — no test code. Each test must trace to an acceptance criterion.
1. For each component, write tests using `templates/test-spec.md` as structure
2. Cover all 4 types: integration, performance, security, acceptance
3. Include test data management (setup, teardown, isolation)
4. Verify traceability: every acceptance criterion from `acceptance_criteria.md` must be covered by at least one test
**Self-verification**:
- [ ] Every acceptance criterion has at least one test covering it
- [ ] Test inputs are realistic and well-defined
- [ ] Expected results are specific and measurable
- [ ] No component is left without tests
**Save action**: Write each `components/[##]_[name]/tests.md`
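The traceability check can be expressed as a small coverage calculation. This sketch assumes each test spec lists the acceptance criteria it traces to by ID; the `coverage` function and the `AC-n` / `T-n` identifiers are illustrative only.

```python
from typing import Dict, List, Sequence, Tuple


def coverage(
    criteria: Sequence[str], tests: Dict[str, Sequence[str]]
) -> Tuple[float, List[str]]:
    """Return (covered ratio, uncovered criteria).

    `tests` maps a test ID to the acceptance criteria it traces to.
    """
    wanted = set(criteria)
    referenced = {ac for refs in tests.values() for ac in refs}
    covered = wanted & referenced
    uncovered = sorted(wanted - covered)
    ratio = len(covered) / len(wanted) if wanted else 1.0
    return ratio, uncovered


ratio, missing = coverage(
    ["AC-1", "AC-2", "AC-3", "AC-4"],
    {"T-1": ["AC-1", "AC-2"], "T-2": ["AC-2", "AC-3"]},
)
# ratio == 0.75, missing == ["AC-4"]
```

A non-empty `missing` list means the self-verification item "every acceptance criterion has at least one test covering it" is not yet satisfied.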
---
### Step 4b: E2E Black-Box Test Infrastructure
**Role**: Professional Quality Assurance Engineer
**Goal**: Specify a separate consumer application and Docker environment for black-box end-to-end testing of the main system
**Constraints**: Spec only — no test code. Consumer must treat the main system as a black box (no internal imports, no direct DB access).
1. Define Docker environment: services (system under test, test DB, consumer app, dependencies), networks, volumes
2. Specify consumer application: tech stack, entry point, communication interfaces with the main system
3. Define E2E test scenarios from acceptance criteria — focus on critical end-to-end use cases that cross component boundaries
4. Specify test data management: seed data, isolation strategy, external dependency mocks
5. Define CI/CD integration: when to run, gate behavior, timeout
6. Define reporting format (CSV: test ID, name, execution time, result, error message)
Use `templates/e2e-test-infrastructure.md` as structure.
**Self-verification**:
- [ ] Critical acceptance criteria are covered by at least one E2E scenario
- [ ] Consumer app has no direct access to system internals
- [ ] Docker environment is self-contained (`docker compose up` sufficient)
- [ ] External dependencies have mock/stub services defined
**Save action**: Write `e2e_test_infrastructure.md`
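The CSV reporting format from item 6 could look like the sketch below, which uses only the standard `csv` module. The `E2EResult` type, the exact column header spellings, and reporting execution time in seconds with three decimals are assumptions; the skill only names the five columns.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable

COLUMNS = ["test ID", "name", "execution time", "result", "error message"]


@dataclass
class E2EResult:
    """Hypothetical record for one E2E scenario run."""
    test_id: str
    name: str
    execution_time_s: float  # assumption: seconds
    result: str              # e.g. "PASS" or "FAIL"
    error_message: str = ""


def render_report(results: Iterable[E2EResult]) -> str:
    """Render results as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    for r in results:
        writer.writerow(
            [r.test_id, r.name, f"{r.execution_time_s:.3f}", r.result, r.error_message]
        )
    return buf.getvalue()
```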
---
### Step 5: Jira Epics
**Role**: Professional product manager
**Goal**: Create Jira epics from components, ordered by dependency
**Constraints**: Be concise — fewer words with the same meaning is better
1. Generate Jira Epics from components using Jira MCP, structured per `templates/epic-spec.md`
2. Order epics by dependency (which must be done first)
3. Include effort estimation per epic (T-shirt size or story points range)
4. Ensure each epic has clear acceptance criteria cross-referenced with component specs
5. Generate updated draw.io diagram showing component-to-epic mapping
**Self-verification**:
- [ ] Every component maps to exactly one epic
- [ ] Dependency order is respected (no epic depends on a later one)
- [ ] Acceptance criteria are measurable
- [ ] Effort estimates are realistic
**Save action**: Epics created in Jira via MCP
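
Ordering epics by dependency amounts to a topological sort. The following is a minimal Python sketch of that check, not part of the workflow itself. It assumes each epic is identified by a hypothetical name and maps to the set of epics that must be completed first; `order_epics` and the sample names are illustrative only.

```python
from graphlib import CycleError, TopologicalSorter


def order_epics(depends_on: dict[str, set[str]]) -> list[str]:
    """Return epic names so that every epic appears after its prerequisites."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle
        raise ValueError(f"circular epic dependency: {exc.args[1]}") from exc


# Hypothetical epics: each key depends on the epics in its value set
epics = {
    "storage": set(),
    "ingestion": {"storage"},
    "reporting": {"ingestion", "storage"},
}
ordered = order_epics(epics)
```

A cycle raises `ValueError`, which corresponds to the "no epic depends on a later one" item in the self-verification list.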
---
## Quality Checklist (before FINAL_report.md)
Before writing the final report, verify ALL of the following:
### Architecture
- [ ] Covers all capabilities from solution.md
- [ ] Technology choices are justified
- [ ] Deployment model is defined
### Components
- [ ] Every component follows SRP
- [ ] No circular dependencies
- [ ] All inter-component interfaces are defined and consistent
- [ ] No orphan components (unused by any flow)
### Risks
- [ ] All High/Critical risks have mitigations
- [ ] Mitigations are reflected in component/architecture docs
- [ ] User has confirmed risk assessment is sufficient
### Tests
- [ ] Every acceptance criterion is covered by at least one test
- [ ] All 4 test types are represented per component (where applicable)
- [ ] Test data management is defined
### E2E Test Infrastructure
- [ ] Critical use cases covered by E2E scenarios
- [ ] Docker environment is self-contained
- [ ] Consumer app treats main system as black box
- [ ] CI/CD integration and reporting defined
### Epics
- [ ] Every component maps to an epic
- [ ] Dependency order is correct
- [ ] Acceptance criteria are measurable
**Save action**: Write `FINAL_report.md` using `templates/final-report.md` as structure
## Common Mistakes
- **Coding during planning**: this workflow produces documents, never code
- **Multi-responsibility components**: if a component does two things, split it
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Diagrams without data**: generate diagrams only after the underlying structure is documented
- **Copy-pasting problem.md**: the architecture doc should analyze and transform, not repeat the input
- **Vague interfaces**: "component A talks to component B" is not enough; define the method, input, and output
- **Ignoring restrictions.md**: every constraint must be traceable in the architecture or risk register
## Escalation Rules
| Situation | Action |
|-----------|--------|
| Ambiguous requirements | ASK user |
| Missing acceptance criteria | ASK user |
| Technology choice with multiple valid options | ASK user |
| Component naming | PROCEED, confirm at next BLOCKING gate |
| File structure within templates | PROCEED |
| Contradictions between input files | ASK user |
| Risk mitigation requires architecture change | ASK user |
## Methodology Quick Reference
```
┌────────────────────────────────────────────────────────────────┐
│               Solution Planning (5-Step Method)                │
├────────────────────────────────────────────────────────────────┤
│ CONTEXT: Resolve mode (project vs standalone) + set paths      │
│ 1. Solution Analysis    → architecture.md, system-flows.md     │
│    [BLOCKING: user confirms architecture]                      │
│ 2. Component Decompose  → components/[##]_[name]/description   │
│    [BLOCKING: user confirms decomposition]                     │
│ 3. Review & Risk Assess → risk_mitigations.md                  │
│    [BLOCKING: user confirms risks, iterative]                  │
│ 4. Test Specifications  → components/[##]_[name]/tests.md      │
│ 4b. E2E Test Infra      → e2e_test_infrastructure.md           │
│ 5. Jira Epics           → Jira via MCP                         │
│ ────────────────────────────────────────────────────────────── │
│ Quality Checklist       → FINAL_report.md                      │
├────────────────────────────────────────────────────────────────┤
│ Principles: SRP · Dumb code/smart data · Save immediately      │
│             Ask don't assume · Plan don't code                 │
└────────────────────────────────────────────────────────────────┘
```
@@ -0,0 +1,128 @@
# Architecture Document Template
Use this template for the architecture document. Save as `_docs/02_plans/<topic>/architecture.md`.
---
```markdown
# [System Name] — Architecture
## 1. System Context
**Problem being solved**: [One paragraph summarizing the problem from problem.md]
**System boundaries**: [What is inside the system vs. external]
**External systems**:
| System | Integration Type | Direction | Purpose |
|--------|-----------------|-----------|---------|
| [name] | REST / Queue / DB / File | Inbound / Outbound / Both | [why] |
## 2. Technology Stack
| Layer | Technology | Version | Rationale |
|-------|-----------|---------|-----------|
| Language | | | |
| Framework | | | |
| Database | | | |
| Cache | | | |
| Message Queue | | | |
| Hosting | | | |
| CI/CD | | | |
**Key constraints from restrictions.md**:
- [Constraint 1 and how it affects technology choices]
- [Constraint 2]
## 3. Deployment Model
**Environments**: Development, Staging, Production
**Infrastructure**:
- [Cloud provider / On-prem / Hybrid]
- [Container orchestration if applicable]
- [Scaling strategy: horizontal / vertical / auto]
**Environment-specific configuration**:
| Config | Development | Production |
|--------|-------------|------------|
| Database | [local/docker] | [managed service] |
| Secrets | [.env file] | [secret manager] |
| Logging | [console] | [centralized] |
## 4. Data Model Overview
> High-level data model covering the entire system. Detailed per-component models go in component specs.
**Core entities**:
| Entity | Description | Owned By Component |
|--------|-------------|--------------------|
| [entity] | [what it represents] | [component ##] |
**Key relationships**:
- [Entity A] → [Entity B]: [relationship description]
**Data flow summary**:
- [Source] → [Transform] → [Destination]: [what data and why]
## 5. Integration Points
### Internal Communication
| From | To | Protocol | Pattern | Notes |
|------|----|----------|---------|-------|
| [component] | [component] | Sync REST / Async Queue / Direct call | Request-Response / Event / Command | |
### External Integrations
| External System | Protocol | Auth | Rate Limits | Failure Mode |
|----------------|----------|------|-------------|--------------|
| [system] | [REST/gRPC/etc] | [API key/OAuth/etc] | [limits] | [retry/circuit breaker/fallback] |
## 6. Non-Functional Requirements
| Requirement | Target | Measurement | Priority |
|------------|--------|-------------|----------|
| Availability | [e.g., 99.9%] | [how measured] | High/Medium/Low |
| Latency (p95) | [e.g., <200ms] | [endpoint/operation] | |
| Throughput | [e.g., 1000 req/s] | [peak/sustained] | |
| Data retention | [e.g., 90 days] | [which data] | |
| Recovery (RPO/RTO) | [e.g., RPO 1hr, RTO 4hr] | | |
| Scalability | [e.g., 10x current load] | [timeline] | |
## 7. Security Architecture
**Authentication**: [mechanism — JWT / session / API key]
**Authorization**: [RBAC / ABAC / per-resource]
**Data protection**:
- At rest: [encryption method]
- In transit: [TLS version]
- Secrets management: [tool/approach]
**Audit logging**: [what is logged, where, retention]
## 8. Key Architectural Decisions
Record significant decisions that shaped the architecture.
### ADR-001: [Decision Title]
**Context**: [Why this decision was needed]
**Decision**: [What was decided]
**Alternatives considered**:
1. [Alternative 1] — rejected because [reason]
2. [Alternative 2] — rejected because [reason]
**Consequences**: [Trade-offs accepted]
### ADR-002: [Decision Title]
...
```
@@ -0,0 +1,156 @@
# Component Specification Template
Use this template for each component. Save as `components/[##]_[name]/description.md`.
---
```markdown
# [Component Name]
## 1. High-Level Overview
**Purpose**: [One sentence: what this component does and its role in the system]
**Architectural Pattern**: [e.g., Repository, Event-driven, Pipeline, Facade, etc.]
**Upstream dependencies**: [Components that this component calls or consumes from]
**Downstream consumers**: [Components that call or consume from this component]
## 2. Internal Interfaces
For each interface this component exposes internally:
### Interface: [InterfaceName]
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `method_name` | `InputDTO` | `OutputDTO` | Yes/No | `ErrorType1`, `ErrorType2` |
**Input DTOs**:
```
[DTO name]:
  field_1: type (required/optional) — description
  field_2: type (required/optional) — description
```
**Output DTOs**:
```
[DTO name]:
  field_1: type — description
  field_2: type — description
```
## 3. External API Specification
> Include this section only if the component exposes an external HTTP/gRPC API.
> Skip if the component is internal-only.
| Endpoint | Method | Auth | Rate Limit | Description |
|----------|--------|------|------------|-------------|
| `/api/v1/...` | GET/POST/PUT/DELETE | Required/Public | X req/min | Brief description |
**Request/Response schemas**: define per endpoint using OpenAPI-style notation.
**Example request/response**:
```json
// Request
{ }
// Response
{ }
```
## 4. Data Access Patterns
### Queries
| Query | Frequency | Hot Path | Index Needed |
|-------|-----------|----------|--------------|
| [describe query] | High/Medium/Low | Yes/No | Yes/No |
### Caching Strategy
| Data | Cache Type | TTL | Invalidation |
|------|-----------|-----|-------------|
| [data item] | In-memory / Redis / None | [duration] | [trigger] |
### Storage Estimates
| Table/Collection | Est. Row Count (1yr) | Row Size | Total Size | Growth Rate |
|-----------------|---------------------|----------|------------|-------------|
| [table_name] | | | | /month |
### Data Management
**Seed data**: [Required seed data and how to load it]
**Rollback**: [Rollback procedure for this component's data changes]
## 5. Implementation Details
**Algorithmic Complexity**: [Big O for critical methods — only if non-trivial]
**State Management**: [Local state / Global state / Stateless — explain how state is handled]
**Key Dependencies**: [External libraries and their purpose]
| Library | Version | Purpose |
|---------|---------|---------|
| [name] | [version] | [why needed] |
**Error Handling Strategy**:
- [How errors are caught, propagated, and reported]
- [Retry policy if applicable]
- [Circuit breaker if applicable]
## 6. Extensions and Helpers
> List any shared utilities this component needs that should live in a `helpers/` folder.
| Helper | Purpose | Used By |
|--------|---------|---------|
| [helper_name] | [what it does] | [list of components] |
## 7. Caveats & Edge Cases
**Known limitations**:
- [Limitation 1]
**Potential race conditions**:
- [Race condition scenario, if any]
**Performance bottlenecks**:
- [Bottleneck description and mitigation approach]
## 8. Dependency Graph
**Must be implemented after**: [list of component numbers/names]
**Can be implemented in parallel with**: [list of component numbers/names]
**Blocks**: [list of components that depend on this one]
## 9. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | Unrecoverable failures | `Failed to process order {id}: {error}` |
| WARN | Recoverable issues | `Retry attempt {n} for {operation}` |
| INFO | Key business events | `Order {id} created by user {uid}` |
| DEBUG | Development diagnostics | `Query returned {n} rows in {ms}ms` |
**Log format**: [structured JSON / plaintext — match system standard]
**Log storage**: [stdout / file / centralized logging service]
```
---
## Guidance Notes
- **Section 3 (External API)**: skip entirely for internal-only components. Include for any component that exposes HTTP endpoints, WebSocket connections, or gRPC services.
- **Section 4 (Storage Estimates)**: critical for components that manage persistent data. Skip for stateless components.
- **Section 5 (Algorithmic Complexity)**: only document if the algorithm is non-trivial (O(n^2) or worse, recursive, etc.). Simple CRUD operations don't need this.
- **Section 6 (Helpers)**: if the helper is used by only one component, keep it inside that component. Only extract to `helpers/` if shared by 2+ components.
- **Section 8 (Dependency Graph)**: this is essential for determining implementation order. Be precise about what "depends on" means — data dependency, API dependency, or shared infrastructure.
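
The Storage Estimates table in Section 4 reduces to simple arithmetic: row count times row size. Below is a rough Python sketch, assuming a constant insert rate and ignoring indexes and storage-engine overhead. The function name and the sample figures are hypothetical.

```python
def estimate_storage(rows_per_month: int, row_size_bytes: int, months: int = 12) -> dict[str, float]:
    """Estimate row count and raw data size for one table over a period."""
    rows = rows_per_month * months
    mib = 1024 ** 2
    return {
        "rows": rows,
        "total_mib": rows * row_size_bytes / mib,
        "growth_mib_per_month": rows_per_month * row_size_bytes / mib,
    }


# Hypothetical table: 500,000 new rows per month at roughly 512 bytes each
estimate = estimate_storage(rows_per_month=500_000, row_size_bytes=512)
```

The three values fill the "Est. Row Count (1yr)", "Total Size", and "Growth Rate" columns; real sizes are usually larger once indexes are included.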
@@ -0,0 +1,141 @@
# E2E Black-Box Test Infrastructure Template
Describes a separate consumer application that tests the main system as a black box.
Save as `PLANS_DIR/<topic>/e2e_test_infrastructure.md`.
---
```markdown
# E2E Test Infrastructure
## Overview
**System under test**: [main system name and entry points — API URLs, message queues, etc.]
**Consumer app purpose**: Standalone application that exercises the main system through its public interfaces, validating end-to-end use cases without access to internals.
## Docker Environment
### Services
| Service | Image / Build | Purpose | Ports |
|---------|--------------|---------|-------|
| system-under-test | [main app image or build context] | The main system being tested | [ports] |
| test-db | [postgres/mysql/etc.] | Database for the main system | [ports] |
| e2e-consumer | [build context for consumer app] | Black-box test runner | — |
| [dependency] | [image] | [purpose — cache, queue, etc.] | [ports] |
### Networks
| Network | Services | Purpose |
|---------|----------|---------|
| e2e-net | all | Isolated test network |
### Volumes
| Volume | Mounted to | Purpose |
|--------|-----------|---------|
| [name] | [service:path] | [test data, DB persistence, etc.] |
### docker-compose structure
```yaml
# Outline only — not runnable code
services:
  system-under-test:
    # main system
  test-db:
    # database
  e2e-consumer:
    # consumer test app
    depends_on:
      - system-under-test
```
## Consumer Application
**Tech stack**: [language, framework, test runner]
**Entry point**: [how it starts — e.g., pytest, jest, custom runner]
### Communication with system under test
| Interface | Protocol | Endpoint / Topic | Authentication |
|-----------|----------|-----------------|----------------|
| [API name] | [HTTP/gRPC/AMQP/etc.] | [URL or topic] | [method] |
### What the consumer does NOT have access to
- No direct database access to the main system
- No internal module imports
- No shared memory or file system with the main system
## E2E Test Scenarios
### Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | E2E Test IDs | Coverage |
|-------|---------------------|-------------|----------|
| AC-01 | [criterion] | E2E-01 | Covered |
| AC-02 | [criterion] | E2E-02, E2E-03 | Covered |
| AC-03 | [criterion] | — | NOT COVERED — [reason] |
### E2E-01: [Scenario Name]
**Summary**: [One sentence: what end-to-end use case this validates]
**Traces to**: AC-01
**Preconditions**:
- [System state required before test]
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | [call / send] | [response / event] |
| 2 | [call / send] | [response / event] |
**Max execution time**: [e.g., 10s]
---
### E2E-02: [Scenario Name]
(repeat structure)
---
## Test Data Management
**Seed data**:
| Data Set | Description | How Loaded | Cleanup |
|----------|-------------|-----------|---------|
| [name] | [what it contains] | [SQL script / API call / fixture file] | [how removed after test] |
**Isolation strategy**: [e.g., each test run gets a fresh DB via container restart, or transactions are rolled back, or namespaced data]
**External dependencies**: [any external APIs that need mocking or sandbox environments]
## CI/CD Integration
**When to run**: [e.g., on PR merge to dev, nightly, before production deploy]
**Pipeline stage**: [where in the CI pipeline this fits]
**Gate behavior**: [block merge / warning only / manual approval]
**Timeout**: [max total suite duration before considered failed]
## Reporting
**Format**: CSV
**Columns**: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)
**Output path**: [where the CSV is written — e.g., ./e2e-results/report.csv]
```
---
## Guidance Notes
- Every E2E test MUST trace to at least one acceptance criterion. If it doesn't, question whether it's needed.
- The consumer app must treat the main system as a true black box — no internal imports, no direct DB queries against the main system's database.
- Keep the number of E2E tests focused on critical use cases. Exhaustive testing belongs in per-component tests (Step 4).
- Docker environment should be self-contained — `docker compose up` must be sufficient to run the full suite.
- If the main system requires external services (payment gateways, third-party APIs), define mock/stub services in the Docker environment.
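
The Reporting section fixes the CSV columns but not how the file is produced. One possible shape, sketched in Python with the standard `csv` module, is shown here for illustration only: the actual consumer app's tech stack is chosen in the spec, and `write_report` and the sample scenarios are hypothetical.

```python
import csv
import io

COLUMNS = ["Test ID", "Test Name", "Execution Time (ms)", "Result", "Error Message"]


def write_report(results: list[dict], stream) -> None:
    """Write E2E results as CSV using the column order from the Reporting section."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    for row in results:
        # Error Message is only meaningful for FAIL rows; leave it empty otherwise
        message = row.get("Error Message", "") if row["Result"] == "FAIL" else ""
        writer.writerow({**row, "Error Message": message})


# Hypothetical results for two scenarios
results = [
    {"Test ID": "E2E-01", "Test Name": "Create order", "Execution Time (ms)": 840, "Result": "PASS"},
    {"Test ID": "E2E-02", "Test Name": "Cancel order", "Execution Time (ms)": 1210,
     "Result": "FAIL", "Error Message": "expected 200, got 500"},
]
buffer = io.StringIO()
write_report(results, buffer)
report_csv = buffer.getvalue()
```

Writing to a stream rather than a fixed path keeps the output location (the "Output path" field) a configuration detail.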
@@ -0,0 +1,127 @@
# Jira Epic Template
Use this template for each Jira epic. Create epics via Jira MCP.
---
```markdown
## Epic: [Component Name] — [Outcome]
**Example**: Data Ingestion — Near-real-time pipeline
### Epic Summary
[1-2 sentences: what we are building + why it matters]
### Problem / Context
[Current state, pain points, constraints, business opportunities.
Link to architecture.md and relevant component spec.]
### Scope
**In Scope**:
- [Capability 1 — describe what, not how]
- [Capability 2]
- [Capability 3]
**Out of Scope**:
- [Explicit exclusion 1 — prevents scope creep]
- [Explicit exclusion 2]
### Assumptions
- [System design assumption]
- [Data structure assumption]
- [Infrastructure assumption]
### Dependencies
**Epic dependencies** (must be completed first):
- [Epic name / ID]
**External dependencies**:
- [Services, hardware, environments, certificates, data sources]
### Effort Estimation
**T-shirt size**: S / M / L / XL
**Story points range**: [min]-[max]
### Users / Consumers
| Type | Who | Key Use Cases |
|------|-----|--------------|
| Internal | [team/role] | [use case] |
| External | [user type] | [use case] |
| System | [service name] | [integration point] |
### Requirements
**Functional**:
- [API expectations, events, data handling]
- [Idempotency, retry behavior]
**Non-functional**:
- [Availability, latency, throughput targets]
- [Scalability, processing limits, data retention]
**Security / Compliance**:
- [Authentication, encryption, secrets management]
- [Logging, audit trail]
- [SOC2 / ISO / GDPR if applicable]
### Design & Architecture
- Architecture doc: `_docs/02_plans/<topic>/architecture.md`
- Component spec: `_docs/02_plans/<topic>/components/[##]_[name]/description.md`
- System flows: `_docs/02_plans/<topic>/system-flows.md`
### Definition of Done
- [ ] All in-scope capabilities implemented
- [ ] Automated tests pass (unit + integration + e2e)
- [ ] Minimum coverage threshold met (75%)
- [ ] Runbooks written (if applicable)
- [ ] Documentation updated
### Acceptance Criteria
| # | Criterion | Measurable Condition |
|---|-----------|---------------------|
| 1 | [criterion] | [how to verify] |
| 2 | [criterion] | [how to verify] |
### Risks & Mitigations
| # | Risk | Mitigation | Owner |
|---|------|------------|-------|
| 1 | [top risk] | [mitigation] | [owner] |
| 2 | | | |
| 3 | | | |
### Labels
- `component:[name]`
- `env:prod` / `env:stg`
- `type:platform` / `type:data` / `type:integration`
### Child Issues
| Type | Title | Points |
|------|-------|--------|
| Spike | [research/investigation task] | [1-3] |
| Task | [implementation task] | [1-5] |
| Task | [implementation task] | [1-5] |
| Enabler | [infrastructure/setup task] | [1-3] |
```
---
## Guidance Notes
- Be concise. Fewer words with the same meaning = better epic.
- Capabilities in scope are "what", not "how" — avoid describing implementation details.
- Dependency order matters: epics that must be done first should be listed earlier in the backlog.
- Every epic maps to exactly one component. If a component is too large for one epic, split the component first.
- Complexity points for child issues use the project scale: 1, 2, 3, 5, 8. Treat an 8 as a signal to split: do not create issues above 5 points.
@@ -0,0 +1,104 @@
# Final Planning Report Template
Use this template after completing all 5 steps and the quality checklist. Save as `_docs/02_plans/<topic>/FINAL_report.md`.
---
```markdown
# [System Name] — Planning Report
## Executive Summary
[2-3 sentences: what was planned, the core architectural approach, and the key outcome (number of components, epics, estimated effort)]
## Problem Statement
[Brief restatement from problem.md — transformed, not copy-pasted]
## Architecture Overview
[Key architectural decisions and technology stack summary. Reference `architecture.md` for full details.]
**Technology stack**: [language, framework, database, hosting — one line]
**Deployment**: [environment strategy — one line]
## Component Summary
| # | Component | Purpose | Dependencies | Epic |
|---|-----------|---------|-------------|------|
| 01 | [name] | [one-line purpose] | — | [Jira ID] |
| 02 | [name] | [one-line purpose] | 01 | [Jira ID] |
| ... | | | | |
**Implementation order** (based on dependency graph):
1. [Phase 1: components that can start immediately]
2. [Phase 2: components that depend on Phase 1]
3. [Phase 3: ...]
## System Flows
| Flow | Description | Key Components |
|------|-------------|---------------|
| [name] | [one-line summary] | [component list] |
[Reference `system-flows.md` for full diagrams and details.]
## Risk Summary
| Level | Count | Key Risks |
|-------|-------|-----------|
| Critical | [N] | [brief list] |
| High | [N] | [brief list] |
| Medium | [N] | — |
| Low | [N] | — |
**Iterations completed**: [N]
**All Critical/High risks mitigated**: Yes / No — [details if No]
[Reference `risk_mitigations.md` for full register.]
## Test Coverage
| Component | Integration | Performance | Security | Acceptance | AC Coverage |
|-----------|-------------|-------------|----------|------------|-------------|
| [name] | [N tests] | [N tests] | [N tests] | [N tests] | [X/Y ACs] |
| ... | | | | | |
**Overall acceptance criteria coverage**: [X / Y total ACs covered] ([percentage]%)
## Epic Roadmap
| Order | Epic | Component | Effort | Dependencies |
|-------|------|-----------|--------|-------------|
| 1 | [Jira ID]: [name] | [component] | [S/M/L/XL] | — |
| 2 | [Jira ID]: [name] | [component] | [S/M/L/XL] | Epic 1 |
| ... | | | | |
**Total estimated effort**: [sum or range]
## Key Decisions Made
| # | Decision | Rationale | Alternatives Rejected |
|---|----------|-----------|----------------------|
| 1 | [decision] | [why] | [what was rejected] |
| 2 | | | |
## Open Questions
| # | Question | Impact | Assigned To |
|---|----------|--------|-------------|
| 1 | [unresolved question] | [what it blocks or affects] | [who should answer] |
## Artifact Index
| File | Description |
|------|-------------|
| `architecture.md` | System architecture |
| `system-flows.md` | System flows and diagrams |
| `components/01_[name]/description.md` | Component spec |
| `components/01_[name]/tests.md` | Test spec |
| `risk_mitigations.md` | Risk register |
| `e2e_test_infrastructure.md` | E2E black-box test infrastructure |
| `diagrams/components.drawio` | Component diagram |
| `diagrams/flows/flow_[name].md` | Flow diagrams |
```
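
The "Implementation order" phases in the Component Summary can be derived from each component's "Must be implemented after" list. Here is a minimal Python sketch, assuming that list is available as a mapping from component to prerequisites; `implementation_phases` and the component names are hypothetical.

```python
from graphlib import TopologicalSorter


def implementation_phases(must_follow: dict[str, set[str]]) -> list[list[str]]:
    """Group components into phases; each phase depends only on earlier phases."""
    sorter = TopologicalSorter(must_follow)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    phases = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        phases.append(ready)
        sorter.done(*ready)
    return phases


# Hypothetical components: each key must be implemented after its values
components = {
    "01_storage": set(),
    "02_auth": set(),
    "03_ingestion": {"01_storage"},
    "04_api": {"02_auth", "03_ingestion"},
}
phases = implementation_phases(components)
```

Components in the same phase correspond to the "Can be implemented in parallel with" field of the component spec.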
@@ -0,0 +1,99 @@
# Risk Register Template
Use this template for risk assessment. Save as `_docs/02_plans/<topic>/risk_mitigations.md`.
Subsequent iterations: `risk_mitigations_02.md`, `risk_mitigations_03.md`, etc.
---
```markdown
# Risk Assessment — [Topic] — Iteration [##]
## Risk Scoring Matrix
| | Low Impact | Medium Impact | High Impact |
|--|------------|---------------|-------------|
| **High Probability** | Medium | High | Critical |
| **Medium Probability** | Low | Medium | High |
| **Low Probability** | Low | Low | Medium |
## Acceptance Criteria by Risk Level
| Level | Action Required |
|-------|----------------|
| Low | Accepted, monitored quarterly |
| Medium | Mitigation plan required before implementation |
| High | Mitigation + contingency plan required, reviewed weekly |
| Critical | Must be resolved before proceeding to next planning step |
## Risk Register
| ID | Risk | Category | Probability | Impact | Score | Mitigation | Owner | Status |
|----|------|----------|-------------|--------|-------|------------|-------|--------|
| R01 | [risk description] | [category] | High/Med/Low | High/Med/Low | Critical/High/Med/Low | [mitigation strategy] | [owner] | Open/Mitigated/Accepted |
| R02 | | | | | | | | |
## Risk Categories
### Technical Risks
- Technology choices may not meet requirements
- Integration complexity underestimated
- Performance targets unachievable
- Security vulnerabilities in design
- Data model cannot support future requirements
### Schedule Risks
- Dependencies delayed
- Scope creep from ambiguous requirements
- Underestimated complexity
### Resource Risks
- Key person dependency
- Team lacks experience with chosen technology
- Infrastructure not available in time
### External Risks
- Third-party API changes or deprecation
- Vendor reliability or pricing changes
- Regulatory or compliance changes
- Data source availability
## Detailed Risk Analysis
### R01: [Risk Title]
**Description**: [Detailed description of the risk]
**Trigger conditions**: [What would cause this risk to materialize]
**Affected components**: [List of components impacted]
**Mitigation strategy**:
1. [Action 1]
2. [Action 2]
**Contingency plan**: [What to do if mitigation fails]
**Residual risk after mitigation**: [Low/Medium/High]
**Documents updated**: [List architecture/component docs that were updated to reflect this mitigation]
---
### R02: [Risk Title]
(repeat structure above)
## Architecture/Component Changes Applied
| Risk ID | Document Modified | Change Description |
|---------|------------------|--------------------|
| R01 | `architecture.md` §3 | [what changed] |
| R01 | `components/02_[name]/description.md` §5 | [what changed] |
## Summary
**Total risks identified**: [N]
**Critical**: [N] | **High**: [N] | **Medium**: [N] | **Low**: [N]
**Risks mitigated this iteration**: [N]
**Risks requiring user decision**: [list]
```
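
The Risk Scoring Matrix is a plain lookup from probability and impact to a level. As an illustration only, it can be encoded in Python as follows; the names `MATRIX` and `risk_score` are hypothetical, and the values are copied from the matrix in the template.

```python
LEVELS = ("Low", "Medium", "High")

# Outer key: probability, inner key: impact, as in the Risk Scoring Matrix
MATRIX = {
    "High": {"Low": "Medium", "Medium": "High", "High": "Critical"},
    "Medium": {"Low": "Low", "Medium": "Medium", "High": "High"},
    "Low": {"Low": "Low", "Medium": "Low", "High": "Medium"},
}


def risk_score(probability: str, impact: str) -> str:
    """Return the risk level for a probability/impact pair."""
    if probability not in LEVELS or impact not in LEVELS:
        raise ValueError(f"expected one of {LEVELS}, got {probability!r} and {impact!r}")
    return MATRIX[probability][impact]


# Hypothetical risk: likely to happen, moderate consequences
score = risk_score("High", "Medium")
```

Keeping the matrix as data makes it easy to check that the Score column of the Risk Register agrees with the Probability and Impact columns.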
@@ -0,0 +1,108 @@
# System Flows Template
Use this template for the system flows document. Save as `_docs/02_plans/<topic>/system-flows.md`.
Individual flow diagrams go in `_docs/02_plans/<topic>/diagrams/flows/flow_[name].md`.
---
```markdown
# [System Name] — System Flows
## Flow Inventory
| # | Flow Name | Trigger | Primary Components | Criticality |
|---|-----------|---------|-------------------|-------------|
| F1 | [name] | [user action / scheduled / event] | [component list] | High/Medium/Low |
| F2 | [name] | | | |
| ... | | | | |
## Flow Dependencies
| Flow | Depends On | Shares Data With |
|------|-----------|-----------------|
| F1 | — | F2 (via [entity]) |
| F2 | F1 must complete first | F3 |
---
## Flow F1: [Flow Name]
### Description
[1-2 sentences: what this flow does, who triggers it, what the outcome is]
### Preconditions
- [Condition 1]
- [Condition 2]
### Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant ComponentA
    participant ComponentB
    participant Database
    User->>ComponentA: [action]
    ComponentA->>ComponentB: [call with params]
    ComponentB->>Database: [query/write]
    Database-->>ComponentB: [result]
    ComponentB-->>ComponentA: [response]
    ComponentA-->>User: [result]
```
### Flowchart
```mermaid
flowchart TD
    Start([Trigger]) --> Step1[Step description]
    Step1 --> Decision{Condition?}
    Decision -->|Yes| Step2[Step description]
    Decision -->|No| Step3[Step description]
    Step2 --> EndNode([Result])
    Step3 --> EndNode
```
### Data Flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | [source] | [destination] | [what data] | [DTO/event/etc] |
| 2 | | | | |
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| [error type] | [which step] | [how detected] | [what happens] |
### Performance Expectations
| Metric | Target | Notes |
|--------|--------|-------|
| End-to-end latency | [target] | [conditions] |
| Throughput | [target] | [peak/sustained] |
---
## Flow F2: [Flow Name]
(repeat structure above)
```
---
## Mermaid Diagram Conventions
Follow these conventions for consistency across all flow diagrams:
- **Participants**: use component names matching `components/[##]_[name]`
- **Node IDs**: camelCase, no spaces (e.g., `validateInput`, `saveOrder`)
- **Decision nodes**: use `{Question?}` format
- **Start/End**: use `([label])` stadium shape
- **External systems**: use `[[label]]` subroutine shape
- **Subgraphs**: group by component or bounded context
- **No styling**: do not add colors or CSS classes — let the renderer theme handle it
- **Edge labels**: wrap special characters in quotes (e.g., `-->|"O(n) check"|`)
@@ -0,0 +1,172 @@
# Test Specification Template
Use this template for each component's test spec. Save as `components/[##]_[name]/tests.md`.
---
```markdown
# Test Specification — [Component Name]
## Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-01 | [criterion from acceptance_criteria.md] | IT-01, AT-01 | Covered |
| AC-02 | [criterion] | PT-01 | Covered |
| AC-03 | [criterion] | — | NOT COVERED — [reason] |
---
## Integration Tests
### IT-01: [Test Name]
**Summary**: [One sentence: what this test verifies]
**Traces to**: AC-01
**Description**: [Detailed test scenario]
**Input data**:
```
[specific input data for this test]
```
**Expected result**:
```
[specific expected output or state]
```
**Max execution time**: [e.g., 5s]
**Dependencies**: [other components/services that must be running]
---
### IT-02: [Test Name]
(repeat structure)
---
## Performance Tests
### PT-01: [Test Name]
**Summary**: [One sentence: what performance aspect is tested]
**Traces to**: AC-02
**Load scenario**:
- Concurrent users: [N]
- Request rate: [N req/s]
- Duration: [N minutes]
- Ramp-up: [strategy]
**Expected results**:
| Metric | Target | Failure Threshold |
|--------|--------|-------------------|
| Latency (p50) | [target] | [max] |
| Latency (p95) | [target] | [max] |
| Latency (p99) | [target] | [max] |
| Throughput | [target req/s] | [min req/s] |
| Error rate | [target %] | [max %] |
**Resource limits**:
- CPU: [max %]
- Memory: [max MB/GB]
- Database connections: [max pool size]
---
### PT-02: [Test Name]
(repeat structure)
---
## Security Tests
### ST-01: [Test Name]
**Summary**: [One sentence: what security aspect is tested]
**Traces to**: AC-04
**Attack vector**: [e.g., SQL injection on search endpoint, privilege escalation via direct ID access]
**Test procedure**:
1. [Step 1]
2. [Step 2]
**Expected behavior**: [what the system should do — reject, sanitize, log, etc.]
**Pass criteria**: [specific measurable condition]
**Fail criteria**: [what constitutes a failure]
---
### ST-02: [Test Name]
(repeat structure)
---
## Acceptance Tests
### AT-01: [Test Name]
**Summary**: [One sentence: what user-facing behavior is verified]
**Traces to**: AC-01
**Preconditions**:
- [Precondition 1]
- [Precondition 2]
**Steps**:
| Step | Action | Expected Result |
|------|--------|-----------------|
| 1 | [user action] | [expected outcome] |
| 2 | [user action] | [expected outcome] |
| 3 | [user action] | [expected outcome] |
---
### AT-02: [Test Name]
(repeat structure)
---
## Test Data Management
**Required test data**:
| Data Set | Description | Source | Size |
|----------|-------------|--------|------|
| [name] | [what it contains] | [generated / fixture / copy of prod subset] | [approx size] |
**Setup procedure**:
1. [How to prepare the test environment]
2. [How to load test data]
**Teardown procedure**:
1. [How to clean up after tests]
2. [How to restore initial state]
**Data isolation strategy**: [How tests are isolated from each other — separate DB, transactions, namespacing]
```
---
## Guidance Notes
- Every test MUST trace back to at least one acceptance criterion (AC-XX). If a test doesn't trace to any, question whether it's needed.
- If an acceptance criterion has no test covering it, mark it as NOT COVERED and explain why (e.g., "requires manual verification", "deferred to phase 2").
- Performance test targets should come from the NFR section in `architecture.md`.
- Security tests should cover at minimum: authentication bypass, authorization escalation, injection attacks relevant to this component.
- Not every component needs all 4 test types. A stateless utility component may only need integration tests.
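
The traceability rules above can be checked mechanically. The following is a minimal sketch, assuming tests and acceptance criteria are available as plain objects; `findCoverageGaps` and the `tracesTo` field are hypothetical names chosen for illustration and are not part of the template.

```javascript
// Hypothetical helper: reports tests that trace to no known criterion and
// acceptance criteria that no test covers.
function findCoverageGaps(criteriaIds, tests) {
  const known = new Set(criteriaIds);
  const covered = new Set();
  const untraced = [];

  for (const test of tests) {
    // Only count trace targets that are real acceptance criteria.
    const targets = (test.tracesTo || []).filter((id) => known.has(id));
    if (targets.length === 0) untraced.push(test.id);
    targets.forEach((id) => covered.add(id));
  }

  // Criteria left here should be marked NOT COVERED with a reason.
  const notCovered = criteriaIds.filter((id) => !covered.has(id));
  return { untraced, notCovered };
}

const gaps = findCoverageGaps(
  ['AC-01', 'AC-02', 'AC-04'],
  [
    { id: 'AT-01', tracesTo: ['AC-01'] },
    { id: 'ST-01', tracesTo: ['AC-04'] },
    { id: 'IT-09', tracesTo: [] },
  ]
);
console.log(gaps);
```

In this example `IT-09` would be questioned as untraced, and `AC-02` would be listed as NOT COVERED.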
@@ -0,0 +1,470 @@
---
name: refactor
description: |
  Structured refactoring workflow (6-phase method) with three execution modes:
  - Full Refactoring: all 6 phases — baseline, discovery, analysis, safety net, execution, hardening
  - Targeted Refactoring: skip discovery if docs exist, focus on a specific component/area
  - Quick Assessment: phases 0-2 only, outputs a refactoring plan without execution
  Supports project mode (_docs/ structure) and standalone mode (@file.md).
  Trigger phrases:
  - "refactor", "refactoring", "improve code"
  - "analyze coupling", "decoupling", "technical debt"
  - "refactoring assessment", "code quality improvement"
disable-model-invocation: true
---
# Structured Refactoring (6-Phase Method)
Transform existing codebases through a systematic refactoring workflow: capture baseline, document current state, research improvements, build safety net, execute changes, and harden.
## Core Principles
- **Preserve behavior first**: never refactor without a passing test suite
- **Measure before and after**: every change must be justified by metrics
- **Small incremental changes**: commit frequently, never break tests
- **Save immediately**: write artifacts to disk after each phase; never accumulate unsaved work
- **Ask, don't assume**: when scope or priorities are unclear, STOP and ask the user
## Context Resolution
Determine the operating mode based on invocation before any other logic runs.
**Project mode** (no explicit input file provided):
- PROBLEM_DIR: `_docs/00_problem/`
- SOLUTION_DIR: `_docs/01_solution/`
- COMPONENTS_DIR: `_docs/02_components/`
- TESTS_DIR: `_docs/02_tests/`
- REFACTOR_DIR: `_docs/04_refactoring/`
- All existing guardrails apply.
**Standalone mode** (explicit input file provided, e.g. `/refactor @some_component.md`):
- INPUT_FILE: the provided file (treated as component/area description)
- Derive `<topic>` from the input filename (without extension)
- REFACTOR_DIR: `_standalone/<topic>/refactoring/`
- Guardrails relaxed: only INPUT_FILE must exist and be non-empty
- `acceptance_criteria.md` is optional — warn if absent
Announce the detected mode and resolved paths to the user before proceeding.
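
The path resolution described above can be sketched as follows. This is an illustration only: `resolveContext` is a hypothetical helper, and the skill itself does not prescribe any implementation.

```javascript
const path = require('node:path');

// Hypothetical sketch: returns the mode and resolved paths for an invocation.
function resolveContext(inputFile) {
  if (!inputFile) {
    return {
      mode: 'project',
      PROBLEM_DIR: '_docs/00_problem/',
      SOLUTION_DIR: '_docs/01_solution/',
      COMPONENTS_DIR: '_docs/02_components/',
      TESTS_DIR: '_docs/02_tests/',
      REFACTOR_DIR: '_docs/04_refactoring/',
    };
  }
  // <topic> is the input filename without its extension.
  const topic = path.basename(inputFile, path.extname(inputFile));
  return {
    mode: 'standalone',
    INPUT_FILE: inputFile,
    REFACTOR_DIR: `_standalone/${topic}/refactoring/`,
  };
}

const standalone = resolveContext('notes/some_component.md');
console.log(standalone.REFACTOR_DIR);
// prints "_standalone/some_component/refactoring/"
```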
## Mode Detection
After context resolution, determine the execution mode:
1. **User explicitly says** "quick assessment" or "just assess" → **Quick Assessment**
2. **User explicitly says** "refactor [component/file/area]" with a specific target → **Targeted Refactoring**
3. **Default** → **Full Refactoring**
| Mode | Phases Executed | When to Use |
|------|----------------|-------------|
| **Full Refactoring** | 0 → 1 → 2 → 3 → 4 → 5 | Complete refactoring of a system or major area |
| **Targeted Refactoring** | 0 → (skip 1 if docs exist) → 2 → 3 → 4 → 5 | Refactor a specific component; docs already exist |
| **Quick Assessment** | 0 → 1 → 2 | Produce a refactoring roadmap without executing changes |
Inform the user which mode was detected and confirm before proceeding.
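
The detection order and the phase table above can be expressed as a small function. This is a sketch under stated assumptions: `detectMode`, `target`, and `docsExist` are hypothetical names, and the user must still confirm any Phase 1 skip.

```javascript
// Hypothetical sketch: `request` is the user's raw text; `target` is set when
// the user named a specific component/file/area; `docsExist` reflects whether
// documentation for that target is already present.
function detectMode(request, { target = null, docsExist = false } = {}) {
  const text = request.toLowerCase();
  if (text.includes('quick assessment') || text.includes('just assess')) {
    return { mode: 'Quick Assessment', phases: [0, 1, 2] };
  }
  if (target) {
    // Phase 1 is skipped only when docs already exist for the target.
    const phases = docsExist ? [0, 2, 3, 4, 5] : [0, 1, 2, 3, 4, 5];
    return { mode: 'Targeted Refactoring', phases };
  }
  return { mode: 'Full Refactoring', phases: [0, 1, 2, 3, 4, 5] };
}

const quick = detectMode('Just assess the billing module');
const targeted = detectMode('refactor the auth service', { target: 'auth', docsExist: true });
const full = detectMode('refactor');
console.log(quick.mode, targeted.mode, full.mode);
```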
## Prerequisite Checks (BLOCKING)
**Project mode:**
1. PROBLEM_DIR exists with `problem.md` (or `problem_description.md`) — **STOP if missing**, ask user to create it
2. If `acceptance_criteria.md` is missing: **warn** and ask whether to proceed
3. Create REFACTOR_DIR if it does not exist
4. If REFACTOR_DIR already contains artifacts, ask user: **resume from last checkpoint or start fresh?**
**Standalone mode:**
1. INPUT_FILE exists and is non-empty — **STOP if missing**
2. Warn if no `acceptance_criteria.md` provided
3. Create REFACTOR_DIR if it does not exist
## Artifact Management
### Directory Structure
```
REFACTOR_DIR/
├── baseline_metrics.md (Phase 0)
├── discovery/
│ ├── components/
│ │ └── [##]_[name].md (Phase 1)
│ ├── solution.md (Phase 1)
│ └── system_flows.md (Phase 1)
├── analysis/
│ ├── research_findings.md (Phase 2)
│ └── refactoring_roadmap.md (Phase 2)
├── test_specs/
│ └── [##]_[test_name].md (Phase 3)
├── coupling_analysis.md (Phase 4)
├── execution_log.md (Phase 4)
├── hardening/
│ ├── technical_debt.md (Phase 5)
│ ├── performance.md (Phase 5)
│ └── security.md (Phase 5)
└── FINAL_report.md (after all phases)
```
### Save Timing
| Phase | Save immediately after | Filename |
|-------|------------------------|----------|
| Phase 0 | Baseline captured | `baseline_metrics.md` |
| Phase 1 | Each component documented | `discovery/components/[##]_[name].md` |
| Phase 1 | Solution synthesized | `discovery/solution.md`, `discovery/system_flows.md` |
| Phase 2 | Research complete | `analysis/research_findings.md` |
| Phase 2 | Roadmap produced | `analysis/refactoring_roadmap.md` |
| Phase 3 | Test specs written | `test_specs/[##]_[test_name].md` |
| Phase 4 | Coupling analyzed | `coupling_analysis.md` |
| Phase 4 | Execution complete | `execution_log.md` |
| Phase 5 | Each hardening track | `hardening/<track>.md` |
| Final | All phases done | `FINAL_report.md` |
### Resumability
If REFACTOR_DIR already contains artifacts:
1. List existing files and match to the save timing table
2. Identify the last completed phase based on which artifacts exist
3. Resume from the next incomplete phase
4. Inform the user which phases are being skipped
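
One way to identify the resume point is to map existing files onto the save timing table. The sketch below assumes a phase counts as complete only when all of its listed artifacts exist, which is an interpretation rather than something the table states; `nextPhase` is a hypothetical helper, and optional Phase 5 tracks are not checked.

```javascript
// One completion check per phase (0-4), derived from the save timing table.
// `files` is a list of paths relative to REFACTOR_DIR.
const PHASE_CHECKS = [
  (files) => files.includes('baseline_metrics.md'),
  (files) =>
    files.includes('discovery/solution.md') &&
    files.includes('discovery/system_flows.md'),
  (files) =>
    files.includes('analysis/research_findings.md') &&
    files.includes('analysis/refactoring_roadmap.md'),
  (files) => files.some((file) => file.startsWith('test_specs/')),
  (files) =>
    files.includes('coupling_analysis.md') &&
    files.includes('execution_log.md'),
];

// Returns the first phase whose artifacts are missing, or 5 when
// phases 0-4 are all complete.
function nextPhase(existingFiles) {
  const index = PHASE_CHECKS.findIndex((isDone) => !isDone(existingFiles));
  return index === -1 ? PHASE_CHECKS.length : index;
}

const resumeAt = nextPhase(['baseline_metrics.md', 'discovery/solution.md']);
console.log(resumeAt); // prints 1, because system_flows.md is missing
```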
## Progress Tracking
At the start of execution, create a TodoWrite with all applicable phases. Update status as each phase completes.
## Workflow
### Phase 0: Context & Baseline
**Role**: Software engineer preparing for refactoring
**Goal**: Collect refactoring goals and capture baseline metrics
**Constraints**: Measurement only — no code changes
#### 0a. Collect Goals
If PROBLEM_DIR files do not yet exist, help the user create them:
1. `problem.md` — what the system currently does, what changes are needed, pain points
2. `acceptance_criteria.md` — success criteria for the refactoring
3. `security_approach.md` — security requirements (if applicable)
Store in PROBLEM_DIR.
#### 0b. Capture Baseline
1. Read problem description and acceptance criteria
2. Measure current system metrics using project-appropriate tools:
| Metric Category | What to Capture |
|----------------|-----------------|
| **Coverage** | Overall, unit, integration, critical paths |
| **Complexity** | Cyclomatic complexity (avg + top 5 functions), LOC, tech debt ratio |
| **Code Smells** | Total, critical, major |
| **Performance** | Response times (P50/P95/P99), CPU/memory, throughput |
| **Dependencies** | Total count, outdated, security vulnerabilities |
| **Build** | Build time, test execution time, deployment time |
3. Create functionality inventory: all features/endpoints with status and coverage
**Self-verification**:
- [ ] All metric categories measured (or noted as N/A with reason)
- [ ] Functionality inventory is complete
- [ ] Measurements are reproducible
**Save action**: Write `REFACTOR_DIR/baseline_metrics.md`
**BLOCKING**: Present baseline summary to user. Do NOT proceed until user confirms.
---
### Phase 1: Discovery
**Role**: Principal software architect
**Goal**: Generate documentation from existing code and form solution description
**Constraints**: Document what exists, not what should be. No code changes.
**Skip condition** (Targeted mode): If `COMPONENTS_DIR` and `SOLUTION_DIR` already contain documentation for the target area, skip to Phase 2. Ask user to confirm skip.
#### 1a. Document Components
For each component in the codebase:
1. Analyze project structure, directories, files
2. Go file by file, analyze each method
3. Analyze connections between components
Write per component to `REFACTOR_DIR/discovery/components/[##]_[name].md`:
- Purpose and architectural patterns
- Mermaid diagrams for logic flows
- API reference table (name, description, input, output)
- Implementation details: algorithmic complexity, state management, dependencies
- Caveats, edge cases, known limitations
#### 1b. Synthesize Solution & Flows
1. Review all generated component documentation
2. Synthesize into a cohesive solution description
3. Create flow diagrams showing component interactions
Write:
- `REFACTOR_DIR/discovery/solution.md` — product description, component overview, interaction diagram
- `REFACTOR_DIR/discovery/system_flows.md` — Mermaid flowcharts per major use case
Also copy to project standard locations if in project mode:
- `SOLUTION_DIR/solution.md`
- `COMPONENTS_DIR/system_flows.md`
**Self-verification**:
- [ ] Every component in the codebase is documented
- [ ] Solution description covers all components
- [ ] Flow diagrams cover all major use cases
- [ ] Mermaid diagrams are syntactically correct
**Save action**: Write discovery artifacts
**BLOCKING**: Present discovery summary to user. Do NOT proceed until user confirms documentation accuracy.
---
### Phase 2: Analysis
**Role**: Researcher and software architect
**Goal**: Research improvements and produce a refactoring roadmap
**Constraints**: Analysis only — no code changes
#### 2a. Deep Research
1. Analyze current implementation patterns
2. Research modern approaches for similar systems
3. Identify what could be done differently
4. Suggest improvements based on state-of-the-art practices
Write `REFACTOR_DIR/analysis/research_findings.md`:
- Current state analysis: patterns used, strengths, weaknesses
- Alternative approaches per component: current vs alternative, pros/cons, migration effort
- Prioritized recommendations: quick wins + strategic improvements
#### 2b. Solution Assessment
1. Assess current implementation against acceptance criteria
2. Identify weak points in codebase, map to specific code areas
3. Perform gap analysis: acceptance criteria vs current state
4. Prioritize changes by impact and effort
Write `REFACTOR_DIR/analysis/refactoring_roadmap.md`:
- Weak points assessment: location, description, impact, proposed solution
- Gap analysis: what's missing, what needs improvement
- Phased roadmap: Phase 1 (critical fixes), Phase 2 (major improvements), Phase 3 (enhancements)
**Self-verification**:
- [ ] All acceptance criteria are addressed in gap analysis
- [ ] Recommendations are grounded in actual code, not abstract
- [ ] Roadmap phases are prioritized by impact
- [ ] Quick wins are identified separately
**Save action**: Write analysis artifacts
**BLOCKING**: Present refactoring roadmap to user. Do NOT proceed until user confirms.
**Quick Assessment mode stops here.** Present final summary and write `FINAL_report.md` with phases 0-2 content.
---
### Phase 3: Safety Net
**Role**: QA engineer and developer
**Goal**: Design and implement tests that capture current behavior before refactoring
**Constraints**: Tests must all pass on the current codebase before proceeding
#### 3a. Design Test Specs
Coverage requirements (must meet before refactoring):
- Minimum overall coverage: 75%
- Critical path coverage: 90%
- All public APIs must have integration tests
- All error handling paths must be tested
For each critical area, write test specs to `REFACTOR_DIR/test_specs/[##]_[test_name].md`:
- Integration tests: summary, current behavior, input data, expected result, max expected time
- Acceptance tests: summary, preconditions, steps with expected results
- Coverage analysis: current %, target %, uncovered critical paths
#### 3b. Implement Tests
1. Set up the test environment and infrastructure if they do not already exist
2. Implement each test from specs
3. Run tests, verify all pass on current codebase
4. Document any discovered issues
**Self-verification**:
- [ ] Coverage requirements met (75% overall, 90% critical paths)
- [ ] All tests pass on current codebase
- [ ] All public APIs have integration tests
- [ ] Test data fixtures are configured
**Save action**: Write test specs; implemented tests go into the project's test folder
**GATE (BLOCKING)**: ALL tests must pass before proceeding to Phase 4. If tests fail, fix the tests (not the code) or ask user for guidance. Do NOT proceed to Phase 4 with failing tests.
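
The gate can be reduced to a simple check against the coverage requirements from step 3a. This is a minimal sketch; the report shape (`overall`, `criticalPath`, `failingTests`) is hypothetical and would come from whatever coverage tool the project uses.

```javascript
// Thresholds taken from the coverage requirements in step 3a.
const THRESHOLDS = { overall: 75, criticalPath: 90 };

// Hypothetical report shape: coverage percentages plus a failing-test count.
function checkSafetyNet(report) {
  const problems = [];
  if (report.failingTests > 0) {
    problems.push(`${report.failingTests} failing test(s)`);
  }
  if (report.overall < THRESHOLDS.overall) {
    problems.push(`overall coverage ${report.overall}% is below ${THRESHOLDS.overall}%`);
  }
  if (report.criticalPath < THRESHOLDS.criticalPath) {
    problems.push(
      `critical path coverage ${report.criticalPath}% is below ${THRESHOLDS.criticalPath}%`
    );
  }
  return { pass: problems.length === 0, problems };
}

const gate = checkSafetyNet({ overall: 81, criticalPath: 86, failingTests: 0 });
console.log(gate);
```

Here the gate would fail on critical path coverage alone, so Phase 4 would stay blocked.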
---
### Phase 4: Execution
**Role**: Software architect and developer
**Goal**: Analyze coupling and execute decoupling changes
**Constraints**: Small incremental changes; tests must stay green after every change
#### 4a. Analyze Coupling
1. Analyze coupling between components/modules
2. Map dependencies (direct and transitive)
3. Identify circular dependencies
4. Form decoupling strategy
Write `REFACTOR_DIR/coupling_analysis.md`:
- Dependency graph (Mermaid)
- Coupling metrics per component
- Problem areas: components involved, coupling type, severity, impact
- Decoupling strategy: priority order, proposed interfaces/abstractions, effort estimates
**BLOCKING**: Present coupling analysis to user. Do NOT proceed until user confirms strategy.
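
Circular dependencies (step 3 of the analysis) can be found with a depth-first search over the dependency map. The sketch below assumes the graph is available as an adjacency list of module names; `findCycle` and the module names are hypothetical, and real projects would more likely extract the graph with a language-specific tool.

```javascript
// Hypothetical input: each key is a module, each value its direct dependencies.
// Returns one cycle as a path (first node repeated at the end), or null.
function findCycle(graph) {
  const state = new Map(); // absent = unvisited, 1 = in progress, 2 = done
  const stack = [];

  function visit(node) {
    state.set(node, 1);
    stack.push(node);
    for (const dep of graph[node] || []) {
      if (state.get(dep) === 1) {
        // Back edge: the cycle runs from `dep` up to the current node.
        return [...stack.slice(stack.indexOf(dep)), dep];
      }
      if (!state.has(dep)) {
        const cycle = visit(dep);
        if (cycle) return cycle;
      }
    }
    stack.pop();
    state.set(node, 2);
    return null;
  }

  for (const node of Object.keys(graph)) {
    if (!state.has(node)) {
      const cycle = visit(node);
      if (cycle) return cycle;
    }
  }
  return null;
}

const cycle = findCycle({
  api: ['orders'],
  orders: ['billing'],
  billing: ['orders'],
  utils: [],
});
console.log(cycle); // prints [ 'orders', 'billing', 'orders' ]
```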
#### 4b. Execute Decoupling
For each change in the decoupling strategy:
1. Implement the change
2. Run integration tests
3. Fix any failures
4. Commit with descriptive message
Address code smells encountered: long methods, large classes, duplicate code, dead code, magic numbers.
Write `REFACTOR_DIR/execution_log.md`:
- Change description, files affected, test status per change
- Before/after metrics comparison against baseline
**Self-verification**:
- [ ] All tests still pass after execution
- [ ] No circular dependencies remain (or reduced per plan)
- [ ] Code smells addressed
- [ ] Metrics improved compared to baseline
**Save action**: Write execution artifacts
**BLOCKING**: Present execution summary to user. Do NOT proceed until user confirms.
---
### Phase 5: Hardening (Optional, Parallel Tracks)
**Role**: Varies per track
**Goal**: Address technical debt, performance, and security
**Constraints**: Each track is optional; user picks which to run
Present the three tracks and let user choose which to execute:
#### Track A: Technical Debt
**Role**: Technical debt analyst
1. Identify and categorize debt items: design, code, test, documentation
2. Assess each: location, description, impact, effort, interest (cost of not fixing)
3. Prioritize: quick wins → strategic debt → tolerable debt
4. Create actionable plan with prevention measures
Write `REFACTOR_DIR/hardening/technical_debt.md`
#### Track B: Performance Optimization
**Role**: Performance engineer
1. Profile current performance, identify bottlenecks
2. For each bottleneck: location, symptom, root cause, impact
3. Propose optimizations with expected improvement and risk
4. Implement one at a time, benchmark after each change
5. Verify tests still pass
Write `REFACTOR_DIR/hardening/performance.md` with before/after benchmarks
#### Track C: Security Review
**Role**: Security engineer
1. Review code against OWASP Top 10
2. Verify security requirements from `security_approach.md` are met
3. Check: authentication, authorization, input validation, output encoding, encryption, logging
Write `REFACTOR_DIR/hardening/security.md`:
- Vulnerability assessment: location, type, severity, exploit scenario, fix
- Security controls review
- Compliance check against `security_approach.md`
- Recommendations: critical fixes, improvements, hardening
**Self-verification** (per track):
- [ ] All findings are grounded in actual code
- [ ] Recommendations are actionable with effort estimates
- [ ] All tests still pass after any changes
**Save action**: Write hardening artifacts
---
## Final Report
After all executed phases complete, write `REFACTOR_DIR/FINAL_report.md`:
- Refactoring mode used and phases executed
- Baseline metrics vs final metrics comparison
- Changes made summary
- Remaining items (deferred to future)
- Lessons learned
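
The baseline-versus-final comparison can be produced with a small helper. This is a sketch only: the metric names and the `HIGHER_IS_BETTER` directions are assumptions made for illustration, and real values would come from `baseline_metrics.md` and a re-measurement after the last executed phase.

```javascript
// Assumed direction per metric: true means a larger value is an improvement.
const HIGHER_IS_BETTER = { coverage: true, complexity: false, buildSeconds: false };

// Builds one comparison row per baseline metric.
function compareMetrics(baseline, final) {
  return Object.keys(baseline).map((name) => {
    const delta = final[name] - baseline[name];
    // Unchanged metrics are never reported as improved.
    const improved = delta !== 0 && (delta > 0) === HIGHER_IS_BETTER[name];
    return { name, baseline: baseline[name], final: final[name], delta, improved };
  });
}

const rows = compareMetrics(
  { coverage: 62, complexity: 14, buildSeconds: 180 },
  { coverage: 78, complexity: 9, buildSeconds: 180 }
);
console.log(rows);
```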
## Escalation Rules
| Situation | Action |
|-----------|--------|
| Unclear refactoring scope | **ASK user** |
| Ambiguous acceptance criteria | **ASK user** |
| Tests failing before refactoring | **ASK user** — fix tests or fix code? |
| Coupling change risks breaking external contracts | **ASK user** |
| Performance optimization vs readability trade-off | **ASK user** |
| Missing baseline metrics (no test suite, no CI) | **WARN user**, suggest building safety net first |
| Security vulnerability found during refactoring | **WARN user** immediately, don't defer |
## Trigger Conditions
When the user wants to:
- Improve existing code structure or quality
- Reduce technical debt or coupling
- Prepare codebase for new features
- Assess code health before major changes
**Keywords**: "refactor", "refactoring", "improve code", "reduce coupling", "technical debt", "code quality", "decoupling"
## Methodology Quick Reference
```
┌────────────────────────────────────────────────────────────────┐
│ Structured Refactoring (6-Phase Method) │
├────────────────────────────────────────────────────────────────┤
│ CONTEXT: Resolve mode (project vs standalone) + set paths │
│ MODE: Full / Targeted / Quick Assessment │
│ │
│ 0. Context & Baseline → baseline_metrics.md │
│ [BLOCKING: user confirms baseline] │
│ 1. Discovery → discovery/ (components, solution) │
│ [BLOCKING: user confirms documentation] │
│ 2. Analysis → analysis/ (research, roadmap) │
│ [BLOCKING: user confirms roadmap] │
│ ── Quick Assessment stops here ── │
│ 3. Safety Net → test_specs/ + implemented tests │
│ [GATE: all tests must pass] │
│ 4. Execution → coupling_analysis, execution_log │
│ [BLOCKING: user confirms changes] │
│ 5. Hardening → hardening/ (debt, perf, security) │
│ [optional, user picks tracks] │
│ ───────────────────────────────────────────────── │
│ FINAL_report.md │
├────────────────────────────────────────────────────────────────┤
│ Principles: Preserve behavior · Measure before/after │
│ Small changes · Save immediately · Ask don't assume│
└────────────────────────────────────────────────────────────────┘
```
File diff suppressed because it is too large
@@ -0,0 +1,37 @@
# Solution Draft
## Product Solution Description
[Short description of the proposed solution. Brief component interaction diagram.]
## Existing/Competitor Solutions Analysis
[Analysis of existing solutions for similar problems, if any.]
## Architecture
[Architecture solution that meets restrictions and acceptance criteria.]
### Component: [Component Name]
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|-----------|-------------|-------------|----------|------|-----|
| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
[Repeat per component]
## Testing Strategy
### Integration / Functional Tests
- [Test 1]
- [Test 2]
### Non-Functional Tests
- [Performance test 1]
- [Security test 1]
## References
[All cited source links]
## Related Artifacts
- Tech stack evaluation: `_docs/01_solution/tech_stack.md` (if Phase 3 was executed)
- Security analysis: `_docs/01_solution/security_analysis.md` (if Phase 4 was executed)
@@ -0,0 +1,40 @@
# Solution Draft
## Assessment Findings
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
|------------------------|----------------------------------------------|-------------|
| [old] | [weak point] | [new] |
## Product Solution Description
[Short description. Brief component interaction diagram. Written as if from scratch — no "updated" markers.]
## Architecture
[Architecture solution that meets restrictions and acceptance criteria.]
### Component: [Component Name]
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|-----------|-------------|-------------|----------|------------|-----|
| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
[Repeat per component]
## Testing Strategy
### Integration / Functional Tests
- [Test 1]
- [Test 2]
### Non-Functional Tests
- [Performance test 1]
- [Security test 1]
## References
[All cited source links]
## Related Artifacts
- Tech stack evaluation: `_docs/01_solution/tech_stack.md` (if Phase 3 was executed)
- Security analysis: `_docs/01_solution/security_analysis.md` (if Phase 4 was executed)
@@ -0,0 +1,311 @@
---
name: security-testing
description: "Test for security vulnerabilities using OWASP principles. Use when conducting security audits, testing auth, or implementing security practices."
category: specialized-testing
priority: critical
tokenEstimate: 1200
agents: [qe-security-scanner, qe-api-contract-validator, qe-quality-analyzer]
implementation_status: optimized
optimization_version: 1.0
last_optimized: 2025-12-02
dependencies: []
quick_reference_card: true
tags: [security, owasp, sast, dast, vulnerabilities, auth, injection]
trust_tier: 3
validation:
  schema_path: schemas/output.json
  validator_path: scripts/validate-config.json
  eval_path: evals/security-testing.yaml
---
# Security Testing
<default_to_action>
When testing security or conducting audits:
1. TEST OWASP Top 10 vulnerabilities systematically
2. VALIDATE authentication and authorization on every endpoint
3. SCAN dependencies for known vulnerabilities (npm audit)
4. CHECK for injection attacks (SQL, XSS, command)
5. VERIFY secrets aren't exposed in code/logs
**Quick Security Checks:**
- Access control → Test horizontal/vertical privilege escalation
- Crypto → Verify password hashing, HTTPS, no sensitive data exposed
- Injection → Test SQL injection, XSS, command injection
- Auth → Test weak passwords, session fixation, MFA enforcement
- Config → Check error messages don't leak info
**Critical Success Factors:**
- Think like an attacker, build like a defender
- Security is built in, not added at the end
- Test continuously in CI/CD, not just before release
</default_to_action>
## Quick Reference Card
### When to Use
- Security audits and penetration testing
- Testing authentication/authorization
- Validating input sanitization
- Reviewing security configuration
### OWASP Top 10 (2021)
| # | Vulnerability | Key Test |
|---|---------------|----------|
| 1 | Broken Access Control | User A accessing User B's data |
| 2 | Cryptographic Failures | Plaintext passwords, HTTP |
| 3 | Injection | SQL/XSS/command injection |
| 4 | Insecure Design | Rate limiting, session timeout |
| 5 | Security Misconfiguration | Verbose errors, exposed /admin |
| 6 | Vulnerable Components | npm audit, outdated packages |
| 7 | Auth Failures | Weak passwords, no MFA |
| 8 | Integrity Failures | Unsigned updates, malware |
| 9 | Logging Failures | No audit trail for breaches |
| 10 | SSRF | Server fetching internal URLs |
### Tools
| Type | Tool | Purpose |
|------|------|---------|
| SAST | SonarQube, Semgrep | Static code analysis |
| DAST | OWASP ZAP, Burp | Dynamic scanning |
| Deps | npm audit, Snyk | Dependency vulnerabilities |
| Secrets | git-secrets, TruffleHog | Secret scanning |
### Agent Coordination
- `qe-security-scanner`: Multi-layer SAST/DAST scanning
- `qe-api-contract-validator`: API security testing
- `qe-quality-analyzer`: Security code review
---
## Key Vulnerability Tests
### 1. Broken Access Control
```javascript
// Horizontal escalation - User A accessing User B's data
test('user cannot access another user\'s order', async () => {
  const userAToken = await login('userA');
  const userBOrder = await createOrder('userB');
  const response = await api.get(`/orders/${userBOrder.id}`, {
    headers: { Authorization: `Bearer ${userAToken}` }
  });
  expect(response.status).toBe(403);
});

// Vertical escalation - Regular user accessing admin
test('regular user cannot access admin', async () => {
  const userToken = await login('regularUser');
  expect((await api.get('/admin/users', {
    headers: { Authorization: `Bearer ${userToken}` }
  })).status).toBe(403);
});
```
### 2. Injection Attacks
```javascript
// SQL Injection
test('prevents SQL injection', async () => {
  const malicious = "' OR '1'='1";
  // Encode the payload so quotes and spaces reach the server intact
  const response = await api.get(`/products?search=${encodeURIComponent(malicious)}`);
  expect(response.body.length).toBeLessThan(100); // Not all products
});

// XSS
test('sanitizes HTML output', async () => {
  const xss = '<script>alert("XSS")</script>';
  await api.post('/comments', { text: xss });
  const html = (await api.get('/comments')).body;
  expect(html).toContain('&lt;script&gt;');
  expect(html).not.toContain('<script>');
});
```
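
For context, the XSS test above passes when the server encodes user text before writing it into HTML. The following is a minimal sketch of such output encoding, assuming the application builds HTML strings itself; `escapeHtml` is a hypothetical helper, and most template engines already escape by default.

```javascript
// Characters that are significant in HTML text and attribute contexts.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Replaces each significant character with its HTML entity.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const encoded = escapeHtml('<script>alert("XSS")</script>');
console.log(encoded);
// prints "&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;"
```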
### 3. Cryptographic Failures
```javascript
test('passwords are hashed', async () => {
  await db.users.create({ email: 'test@example.com', password: 'MyPassword123' });
  const user = await db.users.findByEmail('test@example.com');
  expect(user.password).not.toBe('MyPassword123');
  expect(user.password).toMatch(/^\$2[aby]\$\d{2}\$/); // bcrypt
});

test('no sensitive data in API response', async () => {
  const response = await api.get('/users/me');
  expect(response.body).not.toHaveProperty('password');
  expect(response.body).not.toHaveProperty('ssn');
});
```
### 4. Security Misconfiguration
```javascript
test('errors don\'t leak sensitive info', async () => {
  const response = await api.post('/login', { email: 'nonexistent@test.com', password: 'wrong' });
  expect(response.body.error).toBe('Invalid credentials'); // Generic message
});

test('sensitive endpoints not exposed', async () => {
  const endpoints = ['/debug', '/.env', '/.git', '/admin'];
  for (const ep of endpoints) {
    expect((await fetch(`https://example.com${ep}`)).status).not.toBe(200);
  }
});
```
### 5. Rate Limiting
```javascript
test('rate limiting prevents brute force', async () => {
  const responses = [];
  for (let i = 0; i < 20; i++) {
    responses.push(await api.post('/login', { email: 'test@example.com', password: 'wrong' }));
  }
  expect(responses.filter(r => r.status === 429).length).toBeGreaterThan(0);
});
```
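
The behavior this test expects can be illustrated with a fixed-window limiter. This is a sketch under stated assumptions, not the application's actual mechanism: `createRateLimiter` is hypothetical, state is kept in memory for a single process, and the limit of 5 attempts per minute is an arbitrary example value.

```javascript
// Hypothetical fixed-window limiter. `now` is injectable so tests can
// control the clock.
function createRateLimiter({ limit, windowMs, now = Date.now }) {
  const windows = new Map(); // key -> { start, count }

  // Returns true when the request is allowed, false when it should get a 429.
  return function isAllowed(key) {
    const time = now();
    const entry = windows.get(key);
    if (!entry || time - entry.start >= windowMs) {
      // First request in a new window.
      windows.set(key, { start: time, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

let clock = 0;
const isAllowed = createRateLimiter({ limit: 5, windowMs: 60000, now: () => clock });
const results = Array.from({ length: 20 }, () => isAllowed('test@example.com'));
const blocked = results.filter((allowed) => !allowed).length;
console.log(blocked); // prints 15
```

A production setup would normally key on IP address as well as account and keep counters in a shared store, so that limits hold across multiple server instances.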
---
## Security Checklist
### Authentication
- [ ] Strong password requirements (12+ chars)
- [ ] Password hashing (bcrypt, scrypt, Argon2)
- [ ] MFA for sensitive operations
- [ ] Account lockout after failed attempts
- [ ] Session ID changes after login
- [ ] Session timeout
### Authorization
- [ ] Check authorization on every request
- [ ] Least privilege principle
- [ ] No horizontal escalation
- [ ] No vertical escalation
### Data Protection
- [ ] HTTPS everywhere
- [ ] Encrypted at rest
- [ ] Secrets not in code/logs
- [ ] PII compliance (GDPR)
### Input Validation
- [ ] Server-side validation
- [ ] Parameterized queries (no SQL injection)
- [ ] Output encoding (no XSS)
- [ ] Rate limiting
---
## CI/CD Integration
```yaml
# GitHub Actions
security-checks:
  steps:
    - name: Dependency audit
      run: npm audit --audit-level=high
    - name: SAST scan
      run: npm run sast
    - name: Secret scan
      uses: trufflesecurity/trufflehog@main
    - name: DAST scan
      if: github.ref == 'refs/heads/main'
      run: docker run owasp/zap2docker-stable zap-baseline.py -t https://staging.example.com
```
**Pre-commit hooks:**
```bash
#!/bin/sh
git-secrets --scan
npm run lint:security
```
---
## Agent-Assisted Security Testing
```typescript
// Comprehensive multi-layer scan
await Task("Security Scan", {
  target: 'src/',
  layers: { sast: true, dast: true, dependencies: true, secrets: true },
  severity: ['critical', 'high', 'medium']
}, "qe-security-scanner");

// OWASP Top 10 testing
await Task("OWASP Scan", {
  categories: ['broken-access-control', 'injection', 'cryptographic-failures'],
  depth: 'comprehensive'
}, "qe-security-scanner");

// Validate fix
await Task("Validate Fix", {
  vulnerability: 'CVE-2024-12345',
  expectedResolution: 'upgrade package to v2.0.0',
  retestAfterFix: true
}, "qe-security-scanner");
```
---
## Agent Coordination Hints
### Memory Namespace
```
aqe/security/
├── scans/* - Scan results
├── vulnerabilities/* - Found vulnerabilities
├── fixes/* - Remediation tracking
└── compliance/* - Compliance status
```
### Fleet Coordination
```typescript
const securityFleet = await FleetManager.coordinate({
  strategy: 'security-testing',
  agents: [
    'qe-security-scanner',
    'qe-api-contract-validator',
    'qe-quality-analyzer',
    'qe-deployment-readiness'
  ],
  topology: 'parallel'
});
```
---
## Common Mistakes
### ❌ Security by Obscurity
Hiding admin at `/super-secret-admin` → **Use proper auth**
### ❌ Client-Side Validation Only
JavaScript validation can be bypassed → **Always validate server-side**
### ❌ Trusting User Input
Assuming input is safe → **Sanitize, validate, escape all input**
### ❌ Hardcoded Secrets
API keys in code → **Environment variables, secret management**
---
## Related Skills
- [agentic-quality-engineering](../agentic-quality-engineering/) - Security with agents
- [api-testing-patterns](../api-testing-patterns/) - API security testing
- [compliance-testing](../compliance-testing/) - GDPR, HIPAA, SOC2
---
## Remember
**Think like an attacker:** What would you try to break? Test that.
**Build like a defender:** Assume input is malicious until proven otherwise.
**Test continuously:** Security testing is ongoing, not one-time.
**With Agents:** Agents automate vulnerability scanning, track remediation, and validate fixes. Use agents to maintain security posture at scale.
@@ -0,0 +1,789 @@
# =============================================================================
# AQE Skill Evaluation Test Suite: Security Testing v1.0.0
# =============================================================================
#
# Comprehensive evaluation suite for the security-testing skill per ADR-056.
# Tests OWASP Top 10 2021 detection, severity classification, remediation
# quality, and cross-model consistency.
#
# Schema: .claude/skills/.validation/schemas/skill-eval.schema.json
# Validator: .claude/skills/security-testing/scripts/validate-config.json
#
# Coverage:
# - OWASP A01:2021 - Broken Access Control
# - OWASP A02:2021 - Cryptographic Failures
# - OWASP A03:2021 - Injection (SQL, XSS, Command)
# - OWASP A07:2021 - Identification and Authentication Failures
# - Negative tests (no false positives on secure code)
#
# =============================================================================
skill: security-testing
version: 1.0.0
description: >
  Comprehensive evaluation suite for the security-testing skill.
  Tests OWASP Top 10 2021 detection capabilities, CWE classification accuracy,
  CVSS scoring, severity classification, and remediation quality.
  Supports multi-model testing and integrates with ReasoningBank for
  continuous improvement.
# =============================================================================
# Multi-Model Configuration
# =============================================================================
models_to_test:
  - claude-3.5-sonnet  # Primary model (high accuracy expected)
  - claude-3-haiku     # Fast model (minimum quality threshold)
  - gpt-4o             # Cross-vendor validation
# =============================================================================
# MCP Integration Configuration
# =============================================================================
mcp_integration:
  enabled: true
  namespace: skill-validation
  # Query existing security patterns before running evals
  query_patterns: true
  # Track each test outcome for learning feedback loop
  track_outcomes: true
  # Store successful patterns after evals complete
  store_patterns: true
  # Share learning with fleet coordinator agents
  share_learning: true
  # Update quality gate with validation metrics
  update_quality_gate: true
  # Target agents for learning distribution
  target_agents:
    - qe-learning-coordinator
    - qe-queen-coordinator
    - qe-security-scanner
    - qe-security-auditor
# =============================================================================
# ReasoningBank Learning Configuration
# =============================================================================
learning:
  store_success_patterns: true
  store_failure_patterns: true
  pattern_ttl_days: 90
  min_confidence_to_store: 0.7
  cross_model_comparison: true
# =============================================================================
# Result Format Configuration
# =============================================================================
result_format:
  json_output: true
  markdown_report: true
  include_raw_output: false
  include_timing: true
  include_token_usage: true
# =============================================================================
# Environment Setup
# =============================================================================
setup:
required_tools:
- jq # JSON parsing (required)
- npm # Dependency audit (optional but recommended)
environment_variables:
SECURITY_SCAN_DEPTH: "deep"
OWASP_ENABLED: "true"
SEVERITY_THRESHOLD: "medium"
fixtures:
- name: vulnerable_express_app
path: fixtures/vulnerable-express-app.js
content: |
const express = require('express');
const app = express();
// SQL Injection vulnerability
app.get('/user', (req, res) => {
const query = `SELECT * FROM users WHERE id = ${req.query.id}`;
db.query(query);
});
// XSS vulnerability
app.get('/profile', (req, res) => {
res.send(`<h1>Hello ${req.query.name}</h1>`);
});
// Path Traversal vulnerability
app.get('/file', (req, res) => {
const path = './uploads/' + req.query.filename;
res.sendFile(path);
});
# =============================================================================
# TEST CASES
# =============================================================================
test_cases:
# ---------------------------------------------------------------------------
# CATEGORY: SQL Injection (OWASP A03:2021)
# ---------------------------------------------------------------------------
- id: tc001_sql_injection_string_concat
description: "Detect SQL injection via string concatenation in Node.js"
category: injection
priority: critical
input:
code: |
const express = require('express');
const mysql = require('mysql');
const app = express();
app.get('/api/users/:id', (req, res) => {
const userId = req.params.id;
const query = `SELECT * FROM users WHERE id = ${userId}`;
db.query(query, (err, results) => {
res.json(results);
});
});
context:
language: javascript
framework: express
environment: production
expected_output:
must_contain:
- "SQL injection"
- "parameterized"
must_not_contain:
- "no vulnerabilities"
- "secure"
must_match_regex:
- "CWE-89|CWE-564"
- "A03:2021"
severity_classification: critical
finding_count:
min: 1
max: 3
recommendation_count:
min: 1
validation:
schema_check: true
keyword_match_threshold: 0.8
reasoning_quality_min: 0.7
grading_rubric:
completeness: 0.3
accuracy: 0.5
actionability: 0.2
timeout_ms: 30000
- id: tc002_sql_injection_parameterized_safe
description: "Verify parameterized queries are NOT flagged as vulnerable"
category: injection
priority: high
input:
code: |
app.get('/api/users/:id', (req, res) => {
const userId = parseInt(req.params.id, 10);
db.query('SELECT * FROM users WHERE id = ?', [userId], (err, results) => {
res.json(results);
});
});
context:
language: javascript
framework: express
expected_output:
must_contain:
- "parameterized"
- "secure"
must_not_contain:
- "SQL injection"
- "critical"
- "vulnerable"
severity_classification: info
finding_count:
max: 1
validation:
schema_check: true
keyword_match_threshold: 0.7
allow_partial: true
# ---------------------------------------------------------------------------
# CATEGORY: Cross-Site Scripting (OWASP A03:2021)
# ---------------------------------------------------------------------------
- id: tc003_xss_reflected_html_output
description: "Detect reflected XSS in unescaped HTML output"
category: injection
priority: critical
input:
code: |
app.get('/profile', (req, res) => {
const name = req.query.name;
res.send(`
<html>
<body>
<h1>Welcome, ${name}!</h1>
<p>Your profile has been loaded.</p>
</body>
</html>
`);
});
context:
language: javascript
framework: express
expected_output:
must_contain:
- "XSS"
- "cross-site scripting"
- "sanitize"
- "escape"
must_match_regex:
- "CWE-79"
severity_classification: high
finding_count:
min: 1
validation:
schema_check: true
keyword_match_threshold: 0.8
reasoning_quality_min: 0.75
- id: tc004_xss_dom_based_innerhtml
description: "Detect DOM-based XSS via innerHTML assignment"
category: injection
priority: high
input:
code: |
// Client-side JavaScript
const params = new URLSearchParams(window.location.search);
const message = params.get('msg');
document.getElementById('output').innerHTML = message;
context:
language: javascript
framework: vanilla
environment: production
expected_output:
must_contain:
- "DOM"
- "XSS"
- "innerHTML"
- "textContent"
must_match_regex:
- "CWE-79"
severity_classification: high
validation:
schema_check: true
keyword_match_threshold: 0.7
# ---------------------------------------------------------------------------
# CATEGORY: Authentication Failures (OWASP A07:2021)
# ---------------------------------------------------------------------------
- id: tc005_hardcoded_credentials
description: "Detect hardcoded credentials and API keys"
category: authentication
priority: critical
input:
code: |
const ADMIN_PASSWORD = 'admin123';
const API_KEY = 'sk-1234567890abcdef';
const DATABASE_URL = 'postgres://admin:password123@localhost/db';
app.post('/login', (req, res) => {
if (req.body.password === ADMIN_PASSWORD) {
req.session.isAdmin = true;
res.send('Login successful');
}
});
context:
language: javascript
framework: express
expected_output:
must_contain:
- "hardcoded"
- "credentials"
- "secret"
- "environment variable"
must_match_regex:
- "CWE-798|CWE-259"
severity_classification: critical
finding_count:
min: 2
validation:
schema_check: true
keyword_match_threshold: 0.8
reasoning_quality_min: 0.8
- id: tc006_weak_password_hashing
description: "Detect weak password hashing algorithms (MD5, SHA1)"
category: authentication
priority: high
input:
code: |
const crypto = require('crypto');
function hashPassword(password) {
return crypto.createHash('md5').update(password).digest('hex');
}
function verifyPassword(password, hash) {
return hashPassword(password) === hash;
}
context:
language: javascript
framework: nodejs
expected_output:
must_contain:
- "MD5"
- "weak"
- "bcrypt"
- "argon2"
must_match_regex:
- "CWE-327|CWE-328|CWE-916"
severity_classification: high
finding_count:
min: 1
validation:
schema_check: true
keyword_match_threshold: 0.8
# ---------------------------------------------------------------------------
# CATEGORY: Broken Access Control (OWASP A01:2021)
# ---------------------------------------------------------------------------
- id: tc007_idor_missing_authorization
description: "Detect IDOR vulnerability with missing authorization check"
category: authorization
priority: critical
input:
code: |
app.get('/api/users/:id/profile', (req, res) => {
// No authorization check - any user can access any profile
const userId = req.params.id;
db.query('SELECT * FROM profiles WHERE user_id = ?', [userId])
.then(profile => res.json(profile));
});
app.delete('/api/users/:id', (req, res) => {
// No check if requesting user owns this account
db.query('DELETE FROM users WHERE id = ?', [req.params.id]);
res.send('User deleted');
});
context:
language: javascript
framework: express
expected_output:
must_contain:
- "authorization"
- "access control"
- "IDOR"
- "ownership"
must_match_regex:
- "CWE-639|CWE-284|CWE-862"
- "A01:2021"
severity_classification: critical
validation:
schema_check: true
keyword_match_threshold: 0.7
# ---------------------------------------------------------------------------
# CATEGORY: Cryptographic Failures (OWASP A02:2021)
# ---------------------------------------------------------------------------
- id: tc008_weak_encryption_des
description: "Detect use of weak encryption algorithms (DES, RC4)"
category: cryptography
priority: high
input:
code: |
const crypto = require('crypto');
function encryptData(data, key) {
const cipher = crypto.createCipher('des', key);
return cipher.update(data, 'utf8', 'hex') + cipher.final('hex');
}
function decryptData(data, key) {
const decipher = crypto.createDecipher('des', key);
return decipher.update(data, 'hex', 'utf8') + decipher.final('utf8');
}
context:
language: javascript
framework: nodejs
expected_output:
must_contain:
- "DES"
- "weak"
- "deprecated"
- "AES"
must_match_regex:
- "CWE-327|CWE-328"
- "A02:2021"
severity_classification: high
validation:
schema_check: true
keyword_match_threshold: 0.7
- id: tc009_plaintext_password_storage
description: "Detect plaintext password storage"
category: cryptography
priority: critical
input:
code: |
class User {
constructor(email, password) {
this.email = email;
this.password = password; // Stored in plaintext!
}
save() {
db.query('INSERT INTO users (email, password) VALUES (?, ?)',
[this.email, this.password]);
}
}
context:
language: javascript
framework: nodejs
expected_output:
must_contain:
- "plaintext"
- "password"
- "hash"
- "bcrypt"
must_match_regex:
- "CWE-256|CWE-312"
- "A02:2021"
severity_classification: critical
validation:
schema_check: true
keyword_match_threshold: 0.8
# ---------------------------------------------------------------------------
# CATEGORY: Path Traversal (Related to A01:2021)
# ---------------------------------------------------------------------------
- id: tc010_path_traversal_file_access
description: "Detect path traversal vulnerability in file access"
category: injection
priority: critical
input:
code: |
const fs = require('fs');
app.get('/download', (req, res) => {
const filename = req.query.file;
const filepath = './uploads/' + filename;
res.sendFile(filepath);
});
app.get('/read', (req, res) => {
const content = fs.readFileSync('./data/' + req.query.name);
res.send(content);
});
context:
language: javascript
framework: express
expected_output:
must_contain:
- "path traversal"
- "directory traversal"
- "../"
- "sanitize"
must_match_regex:
- "CWE-22|CWE-23"
severity_classification: critical
validation:
schema_check: true
keyword_match_threshold: 0.7
# ---------------------------------------------------------------------------
# CATEGORY: Negative Tests (No False Positives)
# ---------------------------------------------------------------------------
- id: tc011_secure_code_no_false_positives
description: "Verify secure code is NOT flagged as vulnerable"
category: negative
priority: critical
input:
code: |
const express = require('express');
const helmet = require('helmet');
const rateLimit = require('express-rate-limit');
const bcrypt = require('bcrypt');
const validator = require('validator');
const app = express();
app.use(helmet());
app.use(rateLimit({ windowMs: 15 * 60 * 1000, max: 100 }));
app.post('/api/users', async (req, res) => {
const { email, password } = req.body;
// Input validation
if (!validator.isEmail(email)) {
return res.status(400).json({ error: 'Invalid email' });
}
// Secure password hashing
const hashedPassword = await bcrypt.hash(password, 12);
// Parameterized query
await db.query(
'INSERT INTO users (email, password) VALUES ($1, $2)',
[email, hashedPassword]
);
res.status(201).json({ message: 'User created' });
});
context:
language: javascript
framework: express
environment: production
expected_output:
must_contain:
- "secure"
- "best practice"
must_not_contain:
- "SQL injection"
- "XSS"
- "critical vulnerability"
- "high severity"
finding_count:
max: 2 # Allow informational findings only
validation:
schema_check: true
keyword_match_threshold: 0.6
allow_partial: true
- id: tc012_secure_auth_implementation
description: "Verify secure authentication is recognized as safe"
category: negative
priority: high
input:
code: |
const bcrypt = require('bcrypt');
const jwt = require('jsonwebtoken');
async function login(email, password) {
const user = await User.findByEmail(email);
if (!user) {
return { error: 'Invalid credentials' };
}
const match = await bcrypt.compare(password, user.passwordHash);
if (!match) {
return { error: 'Invalid credentials' };
}
const token = jwt.sign(
{ userId: user.id },
process.env.JWT_SECRET,
{ expiresIn: '1h' }
);
return { token };
}
context:
language: javascript
framework: nodejs
expected_output:
must_contain:
- "bcrypt"
- "jwt"
- "secure"
must_not_contain:
- "vulnerable"
- "critical"
- "hardcoded"
severity_classification: info
validation:
schema_check: true
allow_partial: true
# ---------------------------------------------------------------------------
# CATEGORY: Python Security (Multi-language Support)
# ---------------------------------------------------------------------------
- id: tc013_python_sql_injection
description: "Detect SQL injection in Python Flask application"
category: injection
priority: critical
input:
code: |
from flask import Flask, request
import sqlite3
app = Flask(__name__)
@app.route('/user')
def get_user():
user_id = request.args.get('id')
conn = sqlite3.connect('users.db')
cursor = conn.cursor()
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
return str(cursor.fetchone())
context:
language: python
framework: flask
expected_output:
must_contain:
- "SQL injection"
- "parameterized"
- "f-string"
must_match_regex:
- "CWE-89"
severity_classification: critical
finding_count:
min: 1
validation:
schema_check: true
keyword_match_threshold: 0.7
- id: tc014_python_ssti_jinja
description: "Detect Server-Side Template Injection in Jinja2"
category: injection
priority: critical
input:
code: |
from flask import Flask, request, render_template_string
app = Flask(__name__)
@app.route('/render')
def render():
template = request.args.get('template')
return render_template_string(template)
context:
language: python
framework: flask
expected_output:
must_contain:
- "SSTI"
- "template injection"
- "render_template_string"
- "Jinja2"
must_match_regex:
- "CWE-94|CWE-1336"
severity_classification: critical
validation:
schema_check: true
keyword_match_threshold: 0.7
- id: tc015_python_pickle_deserialization
description: "Detect insecure deserialization with pickle"
category: injection
priority: critical
input:
code: |
import pickle
from flask import Flask, request
app = Flask(__name__)
@app.route('/load')
def load_data():
data = request.get_data()
obj = pickle.loads(data)
return str(obj)
context:
language: python
framework: flask
expected_output:
must_contain:
- "pickle"
- "deserialization"
- "untrusted"
- "RCE"
must_match_regex:
- "CWE-502"
- "A08:2021"
severity_classification: critical
validation:
schema_check: true
keyword_match_threshold: 0.7
# =============================================================================
# SUCCESS CRITERIA
# =============================================================================
success_criteria:
# Overall pass rate (90% of tests must pass)
pass_rate: 0.9
# Critical tests must ALL pass (100%)
critical_pass_rate: 1.0
# Average reasoning quality score
avg_reasoning_quality: 0.75
# Maximum suite execution time (5 minutes)
max_execution_time_ms: 300000
# Maximum variance between model results (15%)
cross_model_variance: 0.15
# =============================================================================
# METADATA
# =============================================================================
metadata:
author: "qe-security-auditor"
created: "2026-02-02"
last_updated: "2026-02-02"
coverage_target: >
OWASP Top 10 2021: A01 (Broken Access Control), A02 (Cryptographic Failures),
A03 (Injection - SQL, XSS, SSTI), A07 (Identification and Authentication Failures),
A08 (Software and Data Integrity Failures - Deserialization). Covers JavaScript/Node.js
Express apps and Python Flask apps. 15 test cases with 90% pass rate
requirement and 100% critical pass rate.
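The `expected_output` blocks in the suite above combine keyword lists, forbidden phrases, and regexes with a `keyword_match_threshold`. The suite does not say how a harness applies them, so the following is only a minimal sketch under stated assumptions: keywords are matched as case-insensitive substrings, the threshold is the minimum fraction of `must_contain` keywords that must appear, and every regex must match. The function name `check_expected_output` and its return shape are hypothetical.

```python
import re


def check_expected_output(response, expected, threshold=1.0):
    """Grade one response against an expected_output block.

    Assumed semantics (not taken from the real harness): keywords are
    case-insensitive substrings, `threshold` is the minimum fraction of
    must_contain keywords present, and every regex must match.
    """
    text = response.lower()
    required = expected.get("must_contain", [])
    hits = [kw for kw in required if kw.lower() in text]
    keyword_ratio = len(hits) / len(required) if required else 1.0
    forbidden = [kw for kw in expected.get("must_not_contain", []) if kw.lower() in text]
    unmatched = [p for p in expected.get("must_match_regex", []) if not re.search(p, response)]
    return {
        "passed": keyword_ratio >= threshold and not forbidden and not unmatched,
        "keyword_ratio": keyword_ratio,
        "forbidden_hits": forbidden,
        "unmatched_regex": unmatched,
    }


# expected_output of tc013_python_sql_injection (threshold 0.7 in the suite).
tc013 = {
    "must_contain": ["SQL injection", "parameterized", "f-string"],
    "must_match_regex": ["CWE-89"],
}
good = check_expected_output(
    "SQL injection (CWE-89): the f-string interpolates user input. "
    "Use a parameterized query instead.",
    tc013,
    threshold=0.7,
)
thin = check_expected_output("Possible SQL injection (CWE-89).", tc013, threshold=0.7)

# Abridged expected_output of tc001: the forbidden keyword "secure" is also
# a substring of "insecure", so plain substring matching rejects this answer.
tc001 = {
    "must_contain": ["SQL injection", "parameterized"],
    "must_not_contain": ["no vulnerabilities", "secure"],
    "must_match_regex": ["CWE-89|CWE-564"],
}
tripped = check_expected_output(
    "SQL injection (CWE-89) from insecure string interpolation; "
    "switch to parameterized queries.",
    tc001,
    threshold=0.8,
)
print(good["passed"], thin["passed"], tripped["forbidden_hits"])
```

Under these assumptions the second response fails because only one of three keywords appears, and the third fails on the forbidden keyword even though it correctly reports the vulnerability. A harness that matches on word boundaries instead of raw substrings would avoid that kind of false failure.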
+879
@@ -0,0 +1,879 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://agentic-qe.dev/schemas/security-testing-output.json",
"title": "AQE Security Testing Skill Output Schema",
"description": "Schema for security-testing skill output validation. Extends the base skill-output template with OWASP Top 10 categories, CWE identifiers, and CVSS scoring.",
"type": "object",
"required": ["skillName", "version", "timestamp", "status", "trustTier", "output"],
"properties": {
"skillName": {
"type": "string",
"const": "security-testing",
"description": "Must be 'security-testing'"
},
"version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+(-[a-zA-Z0-9]+)?$",
"description": "Semantic version of the skill"
},
"timestamp": {
"type": "string",
"format": "date-time",
"description": "ISO 8601 timestamp of output generation"
},
"status": {
"type": "string",
"enum": ["success", "partial", "failed", "skipped"],
"description": "Overall execution status"
},
"trustTier": {
"type": "integer",
"const": 3,
"description": "Trust tier 3 indicates full validation with eval suite"
},
"output": {
"type": "object",
"required": ["summary", "findings", "owaspCategories"],
"properties": {
"summary": {
"type": "string",
"minLength": 50,
"maxLength": 2000,
"description": "Human-readable summary of security findings"
},
"score": {
"$ref": "#/$defs/securityScore",
"description": "Overall security score"
},
"findings": {
"type": "array",
"items": {
"$ref": "#/$defs/securityFinding"
},
"maxItems": 500,
"description": "List of security vulnerabilities discovered"
},
"recommendations": {
"type": "array",
"items": {
"$ref": "#/$defs/securityRecommendation"
},
"maxItems": 100,
"description": "Prioritized remediation recommendations with code examples"
},
"metrics": {
"$ref": "#/$defs/securityMetrics",
"description": "Security scan metrics and statistics"
},
"owaspCategories": {
"$ref": "#/$defs/owaspCategoryBreakdown",
"description": "OWASP Top 10 2021 category breakdown"
},
"artifacts": {
"type": "array",
"items": {
"$ref": "#/$defs/artifact"
},
"maxItems": 50,
"description": "Generated security reports and scan artifacts"
},
"timeline": {
"type": "array",
"items": {
"$ref": "#/$defs/timelineEvent"
},
"description": "Scan execution timeline"
},
"scanConfiguration": {
"$ref": "#/$defs/scanConfiguration",
"description": "Configuration used for the security scan"
}
}
},
"metadata": {
"$ref": "#/$defs/metadata"
},
"validation": {
"$ref": "#/$defs/validationResult"
},
"learning": {
"$ref": "#/$defs/learningData"
}
},
"$defs": {
"securityScore": {
"type": "object",
"required": ["value", "max"],
"properties": {
"value": {
"type": "number",
"minimum": 0,
"maximum": 100,
"description": "Security score (0=critical issues, 100=no issues)"
},
"max": {
"type": "number",
"const": 100,
"description": "Maximum score is always 100"
},
"grade": {
"type": "string",
"pattern": "^[A-DF][+-]?$",
"description": "Letter grade: A (90-100), B (80-89), C (70-79), D (60-69), F (<60)"
},
"trend": {
"type": "string",
"enum": ["improving", "stable", "declining", "unknown"],
"description": "Trend compared to previous scans"
},
"riskLevel": {
"type": "string",
"enum": ["critical", "high", "medium", "low", "minimal"],
"description": "Overall risk level assessment"
}
}
},
"securityFinding": {
"type": "object",
"required": ["id", "title", "severity", "owasp"],
"properties": {
"id": {
"type": "string",
"pattern": "^SEC-\\d{3,6}$",
"description": "Unique finding identifier (e.g., SEC-001)"
},
"title": {
"type": "string",
"minLength": 10,
"maxLength": 200,
"description": "Finding title describing the vulnerability"
},
"description": {
"type": "string",
"maxLength": 2000,
"description": "Detailed description of the vulnerability"
},
"severity": {
"type": "string",
"enum": ["critical", "high", "medium", "low", "info"],
"description": "Severity: critical (CVSS 9.0-10.0), high (7.0-8.9), medium (4.0-6.9), low (0.1-3.9), info (0)"
},
"owasp": {
"type": "string",
"pattern": "^A(0[1-9]|10):20(21|25)$",
"description": "OWASP Top 10 category (e.g., A01:2021, A03:2025)"
},
"owaspCategory": {
"type": "string",
"enum": [
"A01:2021-Broken-Access-Control",
"A02:2021-Cryptographic-Failures",
"A03:2021-Injection",
"A04:2021-Insecure-Design",
"A05:2021-Security-Misconfiguration",
"A06:2021-Vulnerable-Components",
"A07:2021-Identification-Authentication-Failures",
"A08:2021-Software-Data-Integrity-Failures",
"A09:2021-Security-Logging-Monitoring-Failures",
"A10:2021-Server-Side-Request-Forgery"
],
"description": "Full OWASP category name"
},
"cwe": {
"type": "string",
"pattern": "^CWE-\\d{1,4}$",
"description": "CWE identifier (e.g., CWE-79 for XSS, CWE-89 for SQLi)"
},
"cvss": {
"type": "object",
"properties": {
"score": {
"type": "number",
"minimum": 0,
"maximum": 10,
"description": "CVSS v3.1 base score"
},
"vector": {
"type": "string",
"pattern": "^CVSS:3\\.1/AV:[NALP]/AC:[LH]/PR:[NLH]/UI:[NR]/S:[UC]/C:[NLH]/I:[NLH]/A:[NLH]$",
"description": "CVSS v3.1 vector string"
},
"severity": {
"type": "string",
"enum": ["None", "Low", "Medium", "High", "Critical"],
"description": "CVSS severity rating"
}
}
},
"location": {
"$ref": "#/$defs/location",
"description": "Location of the vulnerability"
},
"evidence": {
"type": "string",
"maxLength": 5000,
"description": "Evidence: code snippet, request/response, or PoC"
},
"remediation": {
"type": "string",
"maxLength": 2000,
"description": "Specific fix instructions for this finding"
},
"references": {
"type": "array",
"items": {
"type": "object",
"required": ["title", "url"],
"properties": {
"title": { "type": "string" },
"url": { "type": "string", "format": "uri" }
}
},
"maxItems": 10,
"description": "External references (OWASP, CWE, CVE, etc.)"
},
"falsePositive": {
"type": "boolean",
"default": false,
"description": "Potential false positive flag"
},
"confidence": {
"type": "number",
"minimum": 0,
"maximum": 1,
"description": "Confidence in finding accuracy (0.0-1.0)"
},
"exploitability": {
"type": "string",
"enum": ["trivial", "easy", "moderate", "difficult", "theoretical"],
"description": "How easy this vulnerability is to exploit"
},
"affectedVersions": {
"type": "array",
"items": { "type": "string" },
"description": "Affected package/library versions for dependency vulnerabilities"
},
"cve": {
"type": "string",
"pattern": "^CVE-\\d{4}-\\d{4,}$",
"description": "CVE identifier if applicable"
}
}
},
"securityRecommendation": {
"type": "object",
"required": ["id", "title", "priority", "owaspCategories"],
"properties": {
"id": {
"type": "string",
"pattern": "^REC-\\d{3,6}$",
"description": "Unique recommendation identifier"
},
"title": {
"type": "string",
"minLength": 10,
"maxLength": 200,
"description": "Recommendation title"
},
"description": {
"type": "string",
"maxLength": 2000,
"description": "Detailed recommendation description"
},
"priority": {
"type": "string",
"enum": ["critical", "high", "medium", "low"],
"description": "Remediation priority"
},
"effort": {
"type": "string",
"enum": ["trivial", "low", "medium", "high", "major"],
"description": "Estimated effort: trivial(<1hr), low(1-4hr), medium(1-3d), high(1-2wk), major(>2wk)"
},
"impact": {
"type": "integer",
"minimum": 1,
"maximum": 10,
"description": "Security impact if implemented (1-10)"
},
"relatedFindings": {
"type": "array",
"items": {
"type": "string",
"pattern": "^SEC-\\d{3,6}$"
},
"description": "IDs of findings this addresses"
},
"owaspCategories": {
"type": "array",
"items": {
"type": "string",
"pattern": "^A(0[1-9]|10):20(21|25)$"
},
"description": "OWASP categories this recommendation addresses"
},
"codeExample": {
"type": "object",
"properties": {
"before": {
"type": "string",
"maxLength": 2000,
"description": "Vulnerable code example"
},
"after": {
"type": "string",
"maxLength": 2000,
"description": "Secure code example"
},
"language": {
"type": "string",
"description": "Programming language"
}
},
"description": "Before/after code examples for remediation"
},
"resources": {
"type": "array",
"items": {
"type": "object",
"required": ["title", "url"],
"properties": {
"title": { "type": "string" },
"url": { "type": "string", "format": "uri" }
}
},
"maxItems": 10,
"description": "External resources and documentation"
},
"automatable": {
"type": "boolean",
"description": "Can this fix be automated?"
},
"fixCommand": {
"type": "string",
"description": "CLI command to apply fix if automatable"
}
}
},
"owaspCategoryBreakdown": {
"type": "object",
"description": "OWASP Top 10 2021 category scores and findings",
"properties": {
"A01:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A01:2021 - Broken Access Control"
},
"A02:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A02:2021 - Cryptographic Failures"
},
"A03:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A03:2021 - Injection"
},
"A04:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A04:2021 - Insecure Design"
},
"A05:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A05:2021 - Security Misconfiguration"
},
"A06:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A06:2021 - Vulnerable and Outdated Components"
},
"A07:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A07:2021 - Identification and Authentication Failures"
},
"A08:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A08:2021 - Software and Data Integrity Failures"
},
"A09:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A09:2021 - Security Logging and Monitoring Failures"
},
"A10:2021": {
"$ref": "#/$defs/owaspCategoryScore",
"description": "A10:2021 - Server-Side Request Forgery (SSRF)"
}
},
"additionalProperties": false
},
"owaspCategoryScore": {
"type": "object",
"required": ["tested", "score"],
"properties": {
"tested": {
"type": "boolean",
"description": "Whether this category was tested"
},
"score": {
"type": "number",
"minimum": 0,
"maximum": 100,
"description": "Category score (100 = no issues, 0 = critical)"
},
"grade": {
"type": "string",
"pattern": "^[A-DF][+-]?$",
"description": "Letter grade for this category"
},
"findingCount": {
"type": "integer",
"minimum": 0,
"description": "Number of findings in this category"
},
"criticalCount": {
"type": "integer",
"minimum": 0,
"description": "Number of critical findings"
},
"highCount": {
"type": "integer",
"minimum": 0,
"description": "Number of high severity findings"
},
"status": {
"type": "string",
"enum": ["pass", "fail", "warn", "skip"],
"description": "Category status"
},
"description": {
"type": "string",
"description": "Category description and context"
},
"cwes": {
"type": "array",
"items": {
"type": "string",
"pattern": "^CWE-\\d{1,4}$"
},
"description": "CWEs found in this category"
}
}
},
"securityMetrics": {
"type": "object",
"properties": {
"totalFindings": {
"type": "integer",
"minimum": 0,
"description": "Total vulnerabilities found"
},
"criticalCount": {
"type": "integer",
"minimum": 0,
"description": "Critical severity findings"
},
"highCount": {
"type": "integer",
"minimum": 0,
"description": "High severity findings"
},
"mediumCount": {
"type": "integer",
"minimum": 0,
"description": "Medium severity findings"
},
"lowCount": {
"type": "integer",
"minimum": 0,
"description": "Low severity findings"
},
"infoCount": {
"type": "integer",
"minimum": 0,
"description": "Informational findings"
},
"filesScanned": {
"type": "integer",
"minimum": 0,
"description": "Number of files analyzed"
},
"linesOfCode": {
"type": "integer",
"minimum": 0,
"description": "Lines of code scanned"
},
"dependenciesChecked": {
"type": "integer",
"minimum": 0,
"description": "Number of dependencies checked"
},
"owaspCategoriesTested": {
"type": "integer",
"minimum": 0,
"maximum": 10,
"description": "OWASP Top 10 categories tested"
},
"owaspCategoriesPassed": {
"type": "integer",
"minimum": 0,
"maximum": 10,
"description": "OWASP Top 10 categories with no findings"
},
"uniqueCwes": {
"type": "integer",
"minimum": 0,
"description": "Unique CWE identifiers found"
},
"falsePositiveRate": {
"type": "number",
"minimum": 0,
"maximum": 1,
"description": "Estimated false positive rate"
},
"scanDurationMs": {
"type": "integer",
"minimum": 0,
"description": "Total scan duration in milliseconds"
},
"coverage": {
"type": "object",
"properties": {
"sast": {
"type": "boolean",
"description": "Static analysis performed"
},
"dast": {
"type": "boolean",
"description": "Dynamic analysis performed"
},
"dependencies": {
"type": "boolean",
"description": "Dependency scan performed"
},
"secrets": {
"type": "boolean",
"description": "Secret scanning performed"
},
"configuration": {
"type": "boolean",
"description": "Configuration review performed"
}
},
"description": "Scan coverage indicators"
}
}
},
"scanConfiguration": {
"type": "object",
"properties": {
"target": {
"type": "string",
"description": "Scan target (file path, URL, or package)"
},
"targetType": {
"type": "string",
"enum": ["source", "url", "package", "container", "infrastructure"],
"description": "Type of target being scanned"
},
"scanTypes": {
"type": "array",
"items": {
"type": "string",
"enum": ["sast", "dast", "dependency", "secret", "configuration", "container", "iac"]
},
"description": "Types of scans performed"
},
"severity": {
"type": "array",
"items": {
"type": "string",
"enum": ["critical", "high", "medium", "low", "info"]
},
"description": "Severity levels included in scan"
},
"owaspCategories": {
"type": "array",
"items": {
"type": "string",
"pattern": "^A(0[1-9]|10):20(21|25)$"
},
"description": "OWASP categories tested"
},
"tools": {
"type": "array",
"items": { "type": "string" },
"description": "Security tools used"
},
"excludePatterns": {
"type": "array",
"items": { "type": "string" },
"description": "File patterns excluded from scan"
},
"rulesets": {
"type": "array",
"items": { "type": "string" },
"description": "Security rulesets applied"
}
}
},
"location": {
"type": "object",
"properties": {
"file": {
"type": "string",
"maxLength": 500,
"description": "File path relative to project root"
},
"line": {
"type": "integer",
"minimum": 1,
"description": "Line number"
},
"column": {
"type": "integer",
"minimum": 1,
"description": "Column number"
},
"endLine": {
"type": "integer",
"minimum": 1,
"description": "End line for multi-line findings"
},
"endColumn": {
"type": "integer",
"minimum": 1,
"description": "End column"
},
"url": {
"type": "string",
"format": "uri",
"description": "URL for web-based findings"
},
"endpoint": {
"type": "string",
"description": "API endpoint path"
},
"method": {
"type": "string",
"enum": ["GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS"],
"description": "HTTP method for API findings"
},
"parameter": {
"type": "string",
"description": "Vulnerable parameter name"
},
"component": {
"type": "string",
"description": "Affected component or module"
}
}
},
"artifact": {
"type": "object",
"required": ["type", "path"],
"properties": {
"type": {
"type": "string",
"enum": ["report", "sarif", "data", "log", "evidence"],
"description": "Artifact type"
},
"path": {
"type": "string",
"maxLength": 500,
"description": "Path to artifact"
},
"format": {
"type": "string",
"enum": ["json", "sarif", "html", "md", "txt", "xml", "csv"],
"description": "Artifact format"
},
"description": {
"type": "string",
"maxLength": 500,
"description": "Artifact description"
},
"sizeBytes": {
"type": "integer",
"minimum": 0,
"description": "File size in bytes"
},
"checksum": {
"type": "string",
"pattern": "^sha256:[a-f0-9]{64}$",
"description": "SHA-256 checksum"
}
}
},
"timelineEvent": {
"type": "object",
"required": ["timestamp", "event"],
"properties": {
"timestamp": {
"type": "string",
"format": "date-time",
"description": "Event timestamp"
},
"event": {
"type": "string",
"maxLength": 200,
"description": "Event description"
},
"type": {
"type": "string",
"enum": ["start", "checkpoint", "warning", "error", "complete"],
"description": "Event type"
},
"durationMs": {
"type": "integer",
"minimum": 0,
"description": "Duration since previous event"
},
"phase": {
"type": "string",
"enum": ["initialization", "sast", "dast", "dependency", "secret", "reporting"],
"description": "Scan phase"
}
}
},
"metadata": {
"type": "object",
"properties": {
"executionTimeMs": {
"type": "integer",
"minimum": 0,
"maximum": 3600000,
"description": "Execution time in milliseconds"
},
"toolsUsed": {
"type": "array",
"items": {
"type": "string",
"enum": ["semgrep", "npm-audit", "trivy", "owasp-zap", "bandit", "gosec", "eslint-security", "snyk", "gitleaks", "trufflehog", "bearer"]
},
"uniqueItems": true,
"description": "Security tools used"
},
"agentId": {
"type": "string",
"pattern": "^qe-[a-z][a-z0-9-]*$",
"description": "Agent ID (e.g., qe-security-scanner)"
},
"modelUsed": {
"type": "string",
"description": "LLM model used for analysis"
},
"inputHash": {
"type": "string",
"pattern": "^[a-f0-9]{64}$",
"description": "SHA-256 hash of input"
},
"targetUrl": {
"type": "string",
"format": "uri",
"description": "Target URL if applicable"
},
"targetPath": {
"type": "string",
"description": "Target path if applicable"
},
"environment": {
"type": "string",
"enum": ["development", "staging", "production", "ci"],
"description": "Execution environment"
},
"retryCount": {
"type": "integer",
"minimum": 0,
"maximum": 10,
"description": "Number of retries"
}
}
},
"validationResult": {
"type": "object",
"properties": {
"schemaValid": {
"type": "boolean",
"description": "Passes JSON schema validation"
},
"contentValid": {
"type": "boolean",
"description": "Passes content validation"
},
"confidence": {
"type": "number",
"minimum": 0,
"maximum": 1,
"description": "Confidence score"
},
"warnings": {
"type": "array",
"items": {
"type": "string",
"maxLength": 500
},
"maxItems": 20,
"description": "Validation warnings"
},
"errors": {
"type": "array",
"items": {
"type": "string",
"maxLength": 500
},
"maxItems": 20,
"description": "Validation errors"
},
"validatorVersion": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$",
"description": "Validator version"
}
}
},
"learningData": {
"type": "object",
"properties": {
"patternsDetected": {
"type": "array",
"items": {
"type": "string",
"maxLength": 200
},
"maxItems": 20,
"description": "Security patterns detected (e.g., sql-injection-string-concat)"
},
"reward": {
"type": "number",
"minimum": 0,
"maximum": 1,
"description": "Reward signal for learning (0.0-1.0)"
},
"feedbackLoop": {
"type": "object",
"properties": {
"previousRunId": {
"type": "string",
"format": "uuid",
"description": "Previous run ID for comparison"
},
"improvement": {
"type": "number",
"minimum": -1,
"maximum": 1,
"description": "Improvement over previous run"
}
}
},
"newVulnerabilityPatterns": {
"type": "array",
"items": {
"type": "object",
"properties": {
"pattern": { "type": "string" },
"cwe": { "type": "string" },
"confidence": { "type": "number" }
}
},
"description": "New vulnerability patterns learned"
}
}
}
}
}
@@ -0,0 +1,45 @@
{
"skillName": "security-testing",
"skillVersion": "1.0.0",
"requiredTools": [
"jq"
],
"optionalTools": [
"npm",
"semgrep",
"trivy",
"ajv",
"jsonschema",
"python3"
],
"schemaPath": "schemas/output.json",
"requiredFields": [
"skillName",
"status",
"output",
"output.summary",
"output.findings",
"output.owaspCategories"
],
"requiredNonEmptyFields": [
"output.summary"
],
"mustContainTerms": [
"OWASP",
"security",
"vulnerability"
],
"mustNotContainTerms": [
"TODO",
"placeholder",
"FIXME"
],
"enumValidations": {
".status": [
"success",
"partial",
"failed",
"skipped"
]
}
}