Enhance security auditing capabilities by introducing a comprehensive 5-phase OWASP-based security audit process, including dependency scanning, static analysis, and a consolidated report with severity-ranked findings. Update autopilot workflows to incorporate an optional security audit step before deployment, and refine documentation across related skills for clarity and usability.

2026-06-21 07:21:08 +00:00 · 2026-03-22 18:03:47 +02:00
parent 3165a88f0b
commit 091d9a8fb0
13 changed files with 482 additions and 1976 deletions
@@ -91,7 +91,7 @@ Multi-phase code review against task specs. Produces structured findings with ve
 ### security
-OWASP-based security testing and audit.
+5-phase OWASP-based security audit: dependency scan, static analysis, OWASP Top 10 review, infrastructure review, consolidated report with severity-ranked findings. Integrated into autopilot as an optional step before deploy.
 ### retrospective
@@ -150,7 +150,7 @@ Or just use `/autopilot` to run steps 0-5 automatically.
 | **implement** | "implement", "start implementation" | `_docs/03_implementation/` |
 | **code-review** | "code review", "review code" | Verdict: PASS / FAIL / PASS_WITH_WARNINGS |
 | **refactor** | "refactor", "improve code" | `_docs/04_refactoring/` |
-| **security** | "security audit", "OWASP" | Security findings report |
+| **security** | "security audit", "OWASP", "vulnerability scan" | `_docs/05_security/` |
 | **document** | "document", "document codebase", "reverse-engineer docs" | `_docs/02_document/` + `_docs/00_problem/` + `_docs/01_solution/` |
 | **deploy** | "deploy", "CI/CD", "observability" | `_docs/04_deploy/` |
 | **retrospective** | "retrospective", "retro" | `_docs/05_metrics/` |
@@ -184,6 +184,7 @@ _docs/
 ├── 03_implementation/                   — batch reports, FINAL report
 ├── 04_deploy/                           — containerization, CI/CD, environments, observability, procedures, scripts
 ├── 04_refactoring/                      — baseline, discovery, analysis, execution, hardening
 ├── 05_security/                         — dependency scan, static analysis, OWASP review, infrastructure, report
 └── 05_metrics/                          — retro_[YYYY-MM-DD].md
 ```
@@ -1,7 +1,6 @@
 ---
 description: "OpenAPI/Swagger API documentation standards — applied when editing API spec files"
 globs: ["**/openapi*", "**/swagger*"]
 alwaysApply: false
 ---
 # OpenAPI
@@ -111,13 +111,14 @@ This skill activates when the user wants to:
 │ GREENFIELD FLOW (flows/greenfield.md):                         │
 │   Step 0 Problem → Step 1 Research → Step 2 Plan              │
 │   → Step 3 Decompose → [SESSION] → Step 4 Implement           │
-│   → Step 5 Run Tests → Step 6 Deploy → DONE                   │
+│   → Step 5 Run Tests → 5b Security (opt) → Step 6 Deploy     │
 │   → DONE                                                      │
 │                                                                │
 │ EXISTING CODE FLOW (flows/existing-code.md):                   │
 │   Pre-Step Document → 2b Test Spec → 2c Decompose Tests      │
 │   → [SESSION] → 2d Implement Tests → 2e Refactor             │
 │   → 2f New Task → [SESSION] → 2g Implement                   │
-│   → 2h Run Tests → 2i Deploy → DONE                          │
+│   → 2h Run Tests → 2hb Security (opt) → 2i Deploy → DONE    │
 │                                                                │
 │ STATE: _docs/_autopilot_state.md (see state.md)                │
 │ PROTOCOLS: choice format, Jira auth, errors (see protocols.md) │
@@ -14,6 +14,7 @@ Workflow for projects with an existing codebase. Starts with documentation, prod
 | 2f   | New Task                | new-task/SKILL.md               | Steps 1–8 (loop)                      |
 | 2g   | Implement               | implement/SKILL.md              | (batch-driven, no fixed sub-steps)    |
 | 2h   | Run Tests               | (autopilot-managed)             | Unit tests → Integration/blackbox tests |
 | 2hb  | Security Audit          | security/SKILL.md               | Phase 1–5 (optional)                  |
 | 2i   | Deploy                  | deploy/SKILL.md                 | Steps 1–7                             |
 After Step 2i, the existing-code workflow is complete.
@@ -119,7 +120,7 @@ Action: Run the full test suite to verify the implementation before deployment.
 2. **Integration / blackbox tests**: if `docker-compose.test.yml` or an equivalent test environment exists, spin it up and run the integration test suite
 3. **Report results**: present a summary of passed/failed/skipped tests
-If all tests pass → auto-chain to Step 2i (Deploy).
+If all tests pass → auto-chain to Step 2hb (Security Audit).
 If tests fail → present using Choose format:
@@ -137,8 +138,29 @@ If tests fail → present using Choose format:
 ---
 **Step 2hb — Security Audit (optional)**
 Condition: the autopilot state shows Step 2h (Run Tests) is completed AND the autopilot state does NOT show Step 2hb (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
 Action: Present using Choose format:
 ```
 ══════════════════════════════════════
 DECISION REQUIRED: Run security audit before deploy?
 ══════════════════════════════════════
 A) Run security audit (recommended for production deployments)
 B) Skip — proceed directly to deploy
 ══════════════════════════════════════
 Recommendation: A — catches vulnerabilities before production
 ══════════════════════════════════════
 ```
 - If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 2i (Deploy).
 - If user picks B → Mark Step 2hb as `skipped` in the state file, auto-chain to Step 2i (Deploy).
 ---
 **Step 2i — Deploy**
-Condition: the autopilot state shows Step 2h (Run Tests) is completed AND (`_docs/04_deploy/` does not exist or is incomplete)
+Condition: the autopilot state shows Step 2h (Run Tests) is completed AND (Step 2hb is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
 Action: Read and execute `.cursor/skills/deploy/SKILL.md`
@@ -177,5 +199,6 @@ Action: The project completed a full cycle. Present status and loop back to New
 | Refactor (Step 2e) | Auto-chain → New Task (Step 2f) |
 | New Task (Step 2f) | **Session boundary** — suggest new conversation before Implement |
 | Implement (Step 2g) | Auto-chain → Run Tests (Step 2h) |
-| Run Tests (Step 2h, all pass) | Auto-chain → Deploy (Step 2i) |
+| Run Tests (Step 2h, all pass) | Auto-chain → Security Audit choice (Step 2hb) |
 | Security Audit (Step 2hb, done or skipped) | Auto-chain → Deploy (Step 2i) |
 | Deploy (Step 2i) | **Workflow complete** — existing-code flow done |
@@ -1,6 +1,6 @@
 # Greenfield Workflow
-Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → Decompose → Implement → Run Tests → Deploy.
+Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → Decompose → Implement → Run Tests → Security Audit (optional) → Deploy.
 ## Step Reference Table
@@ -8,10 +8,11 @@ Workflow for new projects built from scratch. Flows linearly: Problem → Resear
 |------|-----------|------------------------|---------------------------------------|
 | 0    | Problem   | problem/SKILL.md       | Phase 1–4                             |
 | 1    | Research  | research/SKILL.md      | Mode A: Phase 1–4 · Mode B: Step 0–8 |
-| 2    | Plan      | plan/SKILL.md          | Step 1–6                              |
+| 2    | Plan      | plan/SKILL.md          | Step 1–6 + Final                      |
 | 3    | Decompose | decompose/SKILL.md     | Step 1–4                              |
 | 4    | Implement | implement/SKILL.md     | (batch-driven, no fixed sub-steps)    |
 | 5    | Run Tests | (autopilot-managed)    | Unit tests → Integration/blackbox tests |
 | 5b   | Security Audit | security/SKILL.md | Phase 1–5 (optional)                  |
 | 6    | Deploy    | deploy/SKILL.md        | Step 1–7                              |
 ## Detection Rules
@@ -76,7 +77,7 @@ If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FIN
 ---
 **Step 3 — Decompose**
-Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/` does not exist or has no task files (excluding `_dependencies_table.md`) AND (workspace has no source code files OR the user explicitly chose normal workflow in Step 2c)
+Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/` does not exist or has no task files (excluding `_dependencies_table.md`)
 Action: Read and execute `.cursor/skills/decompose/SKILL.md`
@@ -102,7 +103,7 @@ Action: Run the full test suite to verify the implementation before deployment.
 2. **Integration / blackbox tests**: if `docker-compose.test.yml` or an equivalent test environment exists, spin it up and run the integration test suite
 3. **Report results**: present a summary of passed/failed/skipped tests
-If all tests pass → auto-chain to Step 6 (Deploy).
+If all tests pass → auto-chain to Step 5b (Security Audit).
 If tests fail → present using Choose format:
@@ -120,8 +121,29 @@ If tests fail → present using Choose format:
 ---
 **Step 5b — Security Audit (optional)**
 Condition: the autopilot state shows Step 5 (Run Tests) is completed AND the autopilot state does NOT show Step 5b (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
 Action: Present using Choose format:
 ```
 ══════════════════════════════════════
 DECISION REQUIRED: Run security audit before deploy?
 ══════════════════════════════════════
 A) Run security audit (recommended for production deployments)
 B) Skip — proceed directly to deploy
 ══════════════════════════════════════
 Recommendation: A — catches vulnerabilities before production
 ══════════════════════════════════════
 ```
 - If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 6 (Deploy).
 - If user picks B → Mark Step 5b as `skipped` in the state file, auto-chain to Step 6 (Deploy).
 ---
 **Step 6 — Deploy**
-Condition: the autopilot state shows Step 5 (Run Tests) is completed AND (`_docs/04_deploy/` does not exist or is incomplete)
+Condition: the autopilot state shows Step 5 (Run Tests) is completed AND (Step 5b is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
 Action: Read and execute `.cursor/skills/deploy/SKILL.md`
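The gating between Run Tests, the optional Security Audit, and Deploy can be sketched as a small resolver. This is an illustration only: the `State` mapping and the `next_greenfield_step` function are hypothetical names, and the real autopilot reads `_docs/_autopilot_state.md` rather than an in-memory dict.

```python
from typing import Optional

# Hypothetical in-memory view of the state file: step id -> status.
# Status values follow the state template:
# not_started / in_progress / completed / skipped.
State = dict[str, str]


def next_greenfield_step(state: State, deploy_complete: bool) -> Optional[str]:
    """Return the step to run after Step 5, or None if nothing is pending."""
    # Neither 5b nor 6 may start until Run Tests is completed,
    # and nothing is pending once the deploy artifacts are complete.
    if state.get("5") != "completed" or deploy_complete:
        return None
    # Step 5b must be resolved (run or explicitly skipped) before Step 6.
    if state.get("5b") not in ("completed", "skipped"):
        return "5b"
    return "6"
```

Treating `skipped` as a resolved status is what lets Deploy proceed without re-prompting the user on the next invocation.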
@@ -142,5 +164,6 @@ Action: Report project completion with summary. If the user runs autopilot again
 | Plan | Auto-chain → Decompose |
 | Decompose | **Session boundary** — suggest new conversation before Implement |
 | Implement | Auto-chain → Run Tests (Step 5) |
-| Run Tests (all pass) | Auto-chain → Deploy (Step 6) |
+| Run Tests (all pass) | Auto-chain → Security Audit choice (Step 5b) |
 | Security Audit (done or skipped) | Auto-chain → Deploy (Step 6) |
 | Deploy | Report completion |
@@ -106,6 +106,101 @@ All error situations that require user input MUST use the **Choose A / B / C / D
 | User wants to go back to a previous step | Use Choose format: A) re-run (with overwrite warning), B) stay on current step |
 | User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution |
 ## Error Recovery Protocol
 ### Stuck Detection
 When executing a sub-skill, monitor for these signals:
 - Same artifact overwritten 3+ times without meaningful change
 - Sub-skill repeatedly asks the same question after receiving an answer
 - No new artifacts saved for an extended period despite active execution
 ### Recovery Actions (ordered)
 1. **Re-read state**: read `_docs/_autopilot_state.md` and cross-check against `_docs/` folders
 2. **Retry current sub-step**: re-read the sub-skill's SKILL.md and restart from the current sub-step
 3. **Escalate**: after 2 failed retries, present diagnostic summary to user using Choose format:
 ```
 ══════════════════════════════════════
 RECOVERY: [skill name] stuck at [sub-step]
 ══════════════════════════════════════
 A) Retry with fresh context (new conversation)
 B) Skip this sub-step with warning
 C) Abort and fix manually
 ══════════════════════════════════════
 Recommendation: A — fresh context often resolves stuck loops
 ══════════════════════════════════════
 ```
 ### Circuit Breaker
 If the same autopilot step fails 3 consecutive times across conversations:
 - Record the failure pattern in the state file's `Blockers` section
 - Do NOT auto-retry on next invocation
 - Present the blocker and ask user for guidance before attempting again
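A minimal sketch of the circuit breaker rule, assuming failures are counted per autopilot step and persisted between conversations. `StepHealth` and its fields are hypothetical names, not part of the state file format.

```python
from dataclasses import dataclass, field

MAX_CONSECUTIVE_FAILURES = 3  # threshold taken from the Circuit Breaker rule


@dataclass
class StepHealth:
    """Hypothetical per-step failure tracker."""

    consecutive_failures: int = 0
    blockers: list[str] = field(default_factory=list)

    def record(self, succeeded: bool, detail: str = "") -> None:
        """Record one attempt; a success resets the failure streak."""
        if succeeded:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        # Record the failure pattern once, when the breaker trips.
        if self.consecutive_failures == MAX_CONSECUTIVE_FAILURES:
            self.blockers.append(detail or "repeated failure")

    def may_auto_retry(self) -> bool:
        """False once the breaker has tripped; the user must be asked first."""
        return self.consecutive_failures < MAX_CONSECUTIVE_FAILURES
```

In this sketch a later success closes the breaker again but leaves the recorded blocker in place for the user to review.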
 ## Context Management Protocol
 ### Principle
 Disk is memory. Never rely on in-context accumulation — read from `_docs/` artifacts, not from conversation history.
 ### Minimal Re-Read Set Per Skill
 When re-entering a skill (new conversation or context refresh):
 - Always read: `_docs/_autopilot_state.md`
 - Always read: the active skill's `SKILL.md`
 - Conditionally read: only the `_docs/` artifacts the current sub-step requires (listed in each skill's Context Resolution section)
 - Never bulk-read: do not load all `_docs/` files at once
 ### Mid-Skill Interruption
 If context is filling up during a long skill (e.g., document, implement):
 1. Save current sub-step progress to the skill's artifact directory
 2. Update `_docs/_autopilot_state.md` with exact sub-step position
 3. Suggest a new conversation: "Context is getting long — recommend continuing in a fresh conversation for better results"
 4. On re-entry, the skill's resumability protocol picks up from the saved sub-step
 ### Large Artifact Handling
 When a skill needs to read large files (e.g., full solution.md, architecture.md):
 - Read only the sections relevant to the current sub-step
 - Use search tools (Grep, SemanticSearch) to find specific sections rather than reading entire files
 - Summarize key decisions from prior steps in the state file so they don't need to be re-read
 ## Rollback Protocol
 ### Implementation Steps (git-based)
 Handled by `/implement` skill — each batch commit is a rollback checkpoint via `git revert`.
 ### Planning/Documentation Steps (artifact-based)
 For steps that produce `_docs/` artifacts (problem, research, plan, decompose, document):
 1. **Before overwriting**: if re-running a step that already has artifacts, the sub-skill's prerequisite check asks the user (resume/overwrite/skip)
 2. **Rollback to previous step**: use Choose format:
 ```
 ══════════════════════════════════════
 ROLLBACK: Re-run [step name]?
 ══════════════════════════════════════
 A) Re-run the step (overwrites current artifacts)
 B) Stay on current step
 ══════════════════════════════════════
 Warning: This will overwrite files in _docs/[folder]/
 ══════════════════════════════════════
 ```
 3. **Git safety net**: artifacts are committed with each autopilot step completion. To roll back: `git log --oneline _docs/` to find the commit, then `git checkout <commit> -- _docs/<folder>/`
 4. **State file rollback**: when rolling back artifacts, also update `_docs/_autopilot_state.md` to reflect the rolled-back step (set it to `in_progress`, clear completed date)
 ## Status Summary
 On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback). Use the template matching the active flow (see Flow Resolution in SKILL.md).
@@ -122,6 +217,7 @@ On every invocation, before executing any skill, present a status summary built
 Step 3   Decompose           [DONE (N tasks) / IN PROGRESS / NOT STARTED]
 Step 4   Implement           [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED]
 Step 5   Run Tests           [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED]
 Step 5b  Security Audit      [DONE / SKIPPED / IN PROGRESS / NOT STARTED]
 Step 6   Deploy              [DONE / IN PROGRESS / NOT STARTED]
 ═══════════════════════════════════════════════════
 Current: Step N — Name
@@ -144,6 +240,7 @@ On every invocation, before executing any skill, present a status summary built
 Step 2f  New Task            [DONE (N tasks) / IN PROGRESS / NOT STARTED]
 Step 2g  Implement           [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED]
 Step 2h  Run Tests           [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED]
 Step 2hb Security Audit      [DONE / SKIPPED / IN PROGRESS / NOT STARTED]
 Step 2i  Deploy              [DONE / IN PROGRESS / NOT STARTED]
 ═══════════════════════════════════════════════════
 Current: Step N — Name
@@ -10,16 +10,16 @@ The autopilot persists its state to `_docs/_autopilot_state.md`. This file is th
 # Autopilot State
 ## Current Step
-step: [0-6 or "2b" / "2c" / "2d" / "2e" / "2f" / "2g" / "2h" / "2i" or "done"]
+step: [0-6 or "2b" / "2c" / "2d" / "2e" / "2f" / "2g" / "2h" / "2hb" / "2i" or "5b" or "done"]
-name: [Problem / Research / Plan / Blackbox Test Spec / Decompose Tests / Implement Tests / Refactor / New Task / Implement / Run Tests / Deploy / Decompose / Done]
+name: [Problem / Research / Plan / Blackbox Test Spec / Decompose Tests / Implement Tests / Refactor / New Task / Implement / Run Tests / Security Audit / Deploy / Decompose / Done]
-status: [not_started / in_progress / completed]
+status: [not_started / in_progress / completed / skipped]
 sub_step: [optional — sub-skill internal step number + name if interrupted mid-step]
 ## Step ↔ SubStep Reference
 (include the step reference table from the active flow file)
 When updating `Current Step`, always write it as:
-  step: N          ← autopilot step (0–6 or 2b/2c/2d/2e/2f/2g/2h/2i)
+  step: N          ← autopilot step (0–6 or 2b/2c/2d/2e/2f/2g/2h/2hb/2i or 5b)
  sub_step: M      ← sub-skill's own internal step/phase number + name
 Example:
  step: 2
@@ -55,7 +55,7 @@ Read `steps/01_artifact-management.md` for directory structure, save timing, sav
 ## Progress Tracking
-At the start of execution, create a TodoWrite with all steps (1 through 6). Update status as each step completes.
+At the start of execution, create a TodoWrite with all steps (1 through 6 plus Final). Update status as each step completes.
 ## Workflow
@@ -125,3 +125,31 @@ Read and follow `steps/07_quality-checklist.md`.
 | File structure within templates | PROCEED |
 | Contradictions between input files | ASK user |
 | Risk mitigation requires architecture change | ASK user |
 ## Methodology Quick Reference
 ```
 ┌────────────────────────────────────────────────────────────────┐
 │              Solution Planning (6-Step + Final)                  │
 ├────────────────────────────────────────────────────────────────┤
 │ PREREQ: Data Gate (BLOCKING)                                    │
 │   → verify AC, restrictions, input_data, solution exist         │
 │                                                                │
 │ 1. Integration Tests   → blackbox-test-spec/SKILL.md            │
 │    [BLOCKING: user confirms test coverage]                     │
 │ 2. Solution Analysis   → architecture, data model, deployment   │
 │    [BLOCKING: user confirms architecture]                      │
 │ 3. Component Decomp    → component specs + interfaces           │
 │    [BLOCKING: user confirms components]                        │
 │ 4. Review & Risk       → risk register, iterations              │
 │    [BLOCKING: user confirms mitigations]                       │
 │ 5. Test Specifications → per-component test specs               │
 │ 6. Jira Epics          → epic per component + bootstrap         │
 │    ─────────────────────────────────────────────────           │
 │ Final: Quality Checklist → FINAL_report.md                      │
 ├────────────────────────────────────────────────────────────────┤
 │ Principles: Single Responsibility · Dumb code, smart data       │
 │             Save immediately · Ask don't assume                │
 │             Plan don't code                                    │
 └────────────────────────────────────────────────────────────────┘
 ```
@@ -1,6 +1,6 @@
 # Final Planning Report Template
-Use this template after completing all 5 steps and the quality checklist. Save as `_docs/02_document/FINAL_report.md`.
+Use this template after completing all 6 steps and the quality checklist. Save as `_docs/02_document/FINAL_report.md`.
 ---
@@ -1,300 +1,347 @@
 ---
-name: security-testing
+name: security
-description: "Test for security vulnerabilities using OWASP principles. Use when conducting security audits, testing auth, or implementing security practices."
+description: |
-category: specialized-testing
+  OWASP-based security audit skill. Analyzes codebase for vulnerabilities across dependency scanning,
-priority: critical
+  static analysis, OWASP Top 10 review, and secrets detection. Produces a structured security report
-tokenEstimate: 1200
+  with severity-ranked findings and remediation guidance.
-agents: [qe-security-scanner, qe-api-contract-validator, qe-quality-analyzer]
+  Can be invoked standalone or as part of the autopilot flow (optional step before deploy).
-implementation_status: optimized
+  Trigger phrases:
-optimization_version: 1.0
+  - "security audit", "security scan", "OWASP review"
-last_optimized: 2025-12-02
+  - "vulnerability scan", "security check"
-dependencies: []
+  - "check for vulnerabilities", "pentest"
-quick_reference_card: true
+category: review
-tags: [security, owasp, sast, dast, vulnerabilities, auth, injection]
+tags: [security, owasp, sast, vulnerabilities, auth, injection, secrets]
-trust_tier: 3
+disable-model-invocation: true
 validation:
  schema_path: schemas/output.json
  validator_path: scripts/validate-config.json
  eval_path: evals/security-testing.yaml
 ---
-# Security Testing
+# Security Audit
-<default_to_action>
+Analyze the codebase for security vulnerabilities using OWASP principles. Produces a structured report with severity-ranked findings, remediation suggestions, and a security checklist verdict.
 When testing security or conducting audits:
 1. TEST OWASP Top 10 vulnerabilities systematically
 2. VALIDATE authentication and authorization on every endpoint
 3. SCAN dependencies for known vulnerabilities (npm audit)
 4. CHECK for injection attacks (SQL, XSS, command)
 5. VERIFY secrets aren't exposed in code/logs
-**Quick Security Checks:**
+## Core Principles
 - Access control → Test horizontal/vertical privilege escalation
 - Crypto → Verify password hashing, HTTPS, no sensitive data exposed
 - Injection → Test SQL injection, XSS, command injection
 - Auth → Test weak passwords, session fixation, MFA enforcement
 - Config → Check error messages don't leak info
-**Critical Success Factors:**
+- **OWASP-driven**: use the current OWASP Top 10 as the primary framework — verify the latest version at https://owasp.org/www-project-top-ten/ at audit start
- Think like an attacker, build like a defender
+- **Evidence-based**: every finding must reference a specific file, line, or configuration
- Security is built in, not added at the end
+- **Severity-ranked**: findings sorted Critical > High > Medium > Low
- Test continuously in CI/CD, not just before release
+- **Actionable**: every finding includes a concrete remediation suggestion
-</default_to_action>
+- **Save immediately**: write artifacts to disk after each phase; never accumulate unsaved work
 - **Complement, don't duplicate**: the `/code-review` skill does a lightweight security quick-scan; this skill goes deeper
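The evidence-based, severity-ranked, and actionable principles can be illustrated with a small sketch. The `Finding` record and `rank` function are hypothetical; the skill itself only prescribes the Critical > High > Medium > Low ordering and that each finding carries a location and a remediation.

```python
from dataclasses import dataclass

# Ordering taken from the "Severity-ranked" principle.
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record used only for this illustration."""

    severity: str
    location: str      # file, line, or configuration reference
    title: str
    remediation: str   # concrete suggested fix


def rank(findings: list[Finding]) -> list[Finding]:
    """Validate findings, then sort them from Critical down to Low."""
    for f in findings:
        if f.severity not in SEVERITY_ORDER:
            raise ValueError(f"unknown severity: {f.severity!r}")
        if not f.location or not f.remediation:
            raise ValueError(f"finding lacks evidence or remediation: {f.title!r}")
    # sorted() is stable, so findings of equal severity keep their input order.
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
```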
-## Quick Reference Card
+## Context Resolution
-### When to Use
+**Project mode** (default):
- Security audits and penetration testing
+- PROBLEM_DIR: `_docs/00_problem/`
- Testing authentication/authorization
+- SOLUTION_DIR: `_docs/01_solution/`
- Validating input sanitization
+- DOCUMENT_DIR: `_docs/02_document/`
- Reviewing security configuration
+- SECURITY_DIR: `_docs/05_security/`
-### OWASP Top 10
+**Standalone mode** (explicit target provided, e.g. `/security @src/api/`):
-Use the most recent **stable** version of the OWASP Top 10. At the start of each security audit, research the current version at https://owasp.org/www-project-top-ten/ and test against all listed categories. Do not rely on a hardcoded list — the OWASP Top 10 is updated periodically and the current version must be verified.
+- TARGET: the provided path
 - SECURITY_DIR: `_standalone/security/`
-### Tools
+Announce the detected mode and resolved paths to the user before proceeding.
 | Type | Tool | Purpose |
 |------|------|---------|
 | SAST | SonarQube, Semgrep | Static code analysis |
 | DAST | OWASP ZAP, Burp | Dynamic scanning |
 | Deps | npm audit, Snyk | Dependency vulnerabilities |
 | Secrets | git-secrets, TruffleHog | Secret scanning |
-### Agent Coordination
+## Prerequisite Checks
- `qe-security-scanner`: Multi-layer SAST/DAST scanning
+
- `qe-api-contract-validator`: API security testing
+1. Codebase must contain source code files — **STOP if empty**
- `qe-quality-analyzer`: Security code review
+2. Create SECURITY_DIR if it does not exist
 3. If SECURITY_DIR already contains artifacts, ask user: **resume, overwrite, or skip?**
 4. If `_docs/00_problem/security_approach.md` exists, read it for project-specific security requirements
 ## Progress Tracking
 At the start of execution, create a TodoWrite with all phases (1 through 5). Update status as each phase completes.
 ## Workflow
 ### Phase 1: Dependency Scan
 **Role**: Security analyst
 **Goal**: Identify known vulnerabilities in project dependencies
 **Constraints**: Scan only — no code changes
 1. Detect the project's package manager(s): `requirements.txt`, `package.json`, `Cargo.toml`, `*.csproj`, `go.mod`
 2. Run the appropriate audit tool:
   - Python: `pip-audit` or `safety check`
   - Node: `npm audit`
   - Rust: `cargo audit`
   - .NET: `dotnet list package --vulnerable`
   - Go: `govulncheck`
 3. If no audit tool is available, manually inspect dependency files for known CVEs using WebSearch
 4. Record findings with CVE IDs, affected packages, severity, and recommended upgrade versions
 **Self-verification**:
 - [ ] All package manifests scanned
 - [ ] Each finding has a CVE ID or advisory reference
 - [ ] Upgrade paths identified for Critical/High findings
 **Save action**: Write `SECURITY_DIR/dependency_scan.md`
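Steps 1 and 2 of this phase could be sketched as a manifest-to-command lookup. This is a simplified illustration: `plan_audits` is a hypothetical helper, it only inspects the top-level directory, and it returns commands without running them. The PyPA auditing tool is invoked as `pip-audit`; the other commands are the ones listed above.

```python
from pathlib import Path

# Manifest file name -> audit command (argv form).
AUDIT_COMMANDS = {
    "requirements.txt": ["pip-audit", "-r", "requirements.txt"],
    "package.json": ["npm", "audit"],
    "Cargo.toml": ["cargo", "audit"],
    "go.mod": ["govulncheck", "./..."],
}
DOTNET_COMMAND = ["dotnet", "list", "package", "--vulnerable"]


def plan_audits(root: Path) -> list[list[str]]:
    """Return the audit commands that apply to the manifests found in root."""
    commands = [
        cmd for name, cmd in AUDIT_COMMANDS.items() if (root / name).is_file()
    ]
    # .NET projects are matched by extension rather than a fixed file name.
    if any(root.glob("*.csproj")):
        commands.append(DOTNET_COMMAND)
    return commands
```

An empty result corresponds to step 3: no supported manifest was found at the top level, so the dependency files need manual inspection.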
 ---
-## Key Vulnerability Tests
+### Phase 2: Static Analysis (SAST)
-### 1. Broken Access Control
+**Role**: Security engineer
-```javascript
+**Goal**: Identify code-level vulnerabilities through static analysis
-// Horizontal escalation - User A accessing User B's data
+**Constraints**: Analysis only — no code changes
 test('user cannot access another user\'s order', async () => {
  const userAToken = await login('userA');
  const userBOrder = await createOrder('userB');
-  const response = await api.get(`/orders/${userBOrder.id}`, {
+Scan the codebase for these vulnerability patterns:
    headers: { Authorization: `Bearer ${userAToken}` }
  });
  expect(response.status).toBe(403);
 });
-// Vertical escalation - Regular user accessing admin
+**Injection**:
-test('regular user cannot access admin', async () => {
+- SQL injection via string interpolation or concatenation
-  const userToken = await login('regularUser');
+- Command injection (subprocess with shell=True, exec, eval, os.system)
-  expect((await api.get('/admin/users', {
+- XSS via unsanitized user input in HTML output
-    headers: { Authorization: `Bearer ${userToken}` }
+- Template injection
  })).status).toBe(403);
 });
 ```
-### 2. Injection Attacks
+**Authentication & Authorization**:
-```javascript
+- Hardcoded credentials, API keys, passwords, tokens
-// SQL Injection
+- Missing authentication checks on endpoints
-test('prevents SQL injection', async () => {
+- Missing authorization checks (horizontal/vertical escalation paths)
-  const malicious = "' OR '1'='1";
+- Weak password validation rules
  const response = await api.get(`/products?search=${malicious}`);
  expect(response.body.length).toBeLessThan(100); // Not all products
 });
-// XSS
+**Cryptographic Failures**:
-test('sanitizes HTML output', async () => {
+- Plaintext password storage (no hashing)
-  const xss = '<script>alert("XSS")</script>';
+- Weak hashing algorithms (MD5, SHA1 for passwords)
-  await api.post('/comments', { text: xss });
+- Hardcoded encryption keys or salts
 - Missing TLS/HTTPS enforcement
-  const html = (await api.get('/comments')).body;
+**Data Exposure**:
-  expect(html).toContain('&lt;script&gt;');
+- Sensitive data in logs or error messages (passwords, tokens, PII)
-  expect(html).not.toContain('<script>');
+- Sensitive fields in API responses (password hashes, SSNs)
-});
+- Debug endpoints or verbose error messages in production configs
-```
+- Secrets in version control (.env files, config with credentials)
-### 3. Cryptographic Failures
+**Insecure Deserialization**:
-```javascript
+- Pickle/marshal deserialization of untrusted data
-test('passwords are hashed', async () => {
+- JSON/XML parsing without size limits
  await db.users.create({ email: 'test@example.com', password: 'MyPassword123' });
  const user = await db.users.findByEmail('test@example.com');
-  expect(user.password).not.toBe('MyPassword123');
+**Self-verification**:
-  expect(user.password).toMatch(/^\$2[aby]\$\d{2}\$/); // bcrypt
+- [ ] All source directories scanned
-});
+- [ ] Each finding has file path and line number
 - [ ] No false positives from test files or comments
-test('no sensitive data in API response', async () => {
+**Save action**: Write `SECURITY_DIR/static_analysis.md`
  const response = await api.get('/users/me');
  expect(response.body).not.toHaveProperty('password');
  expect(response.body).not.toHaveProperty('ssn');
 });
 ```
 ### 4. Security Misconfiguration
 ```javascript
 test('errors don\'t leak sensitive info', async () => {
  const response = await api.post('/login', { email: 'nonexistent@test.com', password: 'wrong' });
  expect(response.body.error).toBe('Invalid credentials'); // Generic message
 });
 test('sensitive endpoints not exposed', async () => {
  const endpoints = ['/debug', '/.env', '/.git', '/admin'];
  for (let ep of endpoints) {
    expect((await fetch(`https://example.com${ep}`)).status).not.toBe(200);
  }
 });
 ```
 ### 5. Rate Limiting
 ```javascript
 test('rate limiting prevents brute force', async () => {
  const responses = [];
  for (let i = 0; i < 20; i++) {
    responses.push(await api.post('/login', { email: 'test@example.com', password: 'wrong' }));
  }
  expect(responses.filter(r => r.status === 429).length).toBeGreaterThan(0);
 });
 ```
 ---
-## Security Checklist
+### Phase 3: OWASP Top 10 Review
 **Role**: Penetration tester
 **Goal**: Systematically review the codebase against current OWASP Top 10 categories
 **Constraints**: Review and document — no code changes
 1. Research the current OWASP Top 10 version at https://owasp.org/www-project-top-ten/
 2. For each OWASP category, assess the codebase:
 | Check | What to Look For |
 |-------|-----------------|
 | Broken Access Control | Missing auth middleware, IDOR vulnerabilities, CORS misconfiguration, directory traversal |
 | Cryptographic Failures | Weak algorithms, plaintext transmission, missing encryption at rest |
 | Injection | SQL, NoSQL, OS command, LDAP injection paths |
 | Insecure Design | Missing rate limiting, no input validation strategy, trust boundary violations |
 | Security Misconfiguration | Default credentials, unnecessary features enabled, missing security headers |
 | Vulnerable Components | Outdated dependencies (from Phase 1), unpatched frameworks |
 | Auth Failures | Brute force paths, weak session management, missing MFA |
 | Data Integrity Failures | Missing signature verification, insecure CI/CD, auto-update without verification |
 | Logging Failures | Missing audit logs, sensitive data in logs, no alerting for security events |
 | SSRF | Unvalidated URL inputs, internal network access from user-controlled URLs |
 3. Rate each category: PASS / FAIL / NOT_APPLICABLE
 4. If `security_approach.md` exists, cross-reference its requirements against findings
 **Self-verification**:
 - [ ] All current OWASP Top 10 categories assessed
 - [ ] Each FAIL has at least one specific finding with evidence
 - [ ] NOT_APPLICABLE categories have justification
 **Save action**: Write `SECURITY_DIR/owasp_review.md`
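The Phase 3 self-verification list can be checked mechanically before the review is saved. The sketch below is illustrative only: `validateOwaspReview`, `ALLOWED_STATUSES`, and the entry shape (`category`, `status`, `findings`, `justification`) are assumptions made for this example, not part of the skill.

```javascript
// Hypothetical helper mirroring the Phase 3 self-verification rules.
// Assumed entry shape: { category, status, findings?, justification? }.
const ALLOWED_STATUSES = ['PASS', 'FAIL', 'NOT_APPLICABLE'];

function validateOwaspReview(entries) {
  const problems = [];
  for (const entry of entries) {
    if (!ALLOWED_STATUSES.includes(entry.status)) {
      problems.push(`${entry.category}: unknown status "${entry.status}"`);
    } else if (entry.status === 'FAIL' && (entry.findings ?? []).length === 0) {
      // Each FAIL needs at least one specific finding with evidence
      problems.push(`${entry.category}: FAIL without a finding`);
    } else if (entry.status === 'NOT_APPLICABLE' && !entry.justification) {
      // NOT_APPLICABLE categories need a justification
      problems.push(`${entry.category}: NOT_APPLICABLE without justification`);
    }
  }
  return problems;
}

// Sample data invented for illustration
const review = [
  { category: 'Injection', status: 'FAIL', findings: ['src/api.py:42 SQL injection via f-string'] },
  { category: 'SSRF', status: 'NOT_APPLICABLE', justification: 'No outbound requests built from user input' },
  { category: 'Insecure Design', status: 'FAIL', findings: [] },
];
console.log(validateOwaspReview(review));
```

An empty result means the review satisfies the last two checklist items; whether every current OWASP category was assessed still has to be confirmed against the published list.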
 ---
 ### Phase 4: Configuration & Infrastructure Review
 **Role**: DevSecOps engineer
 **Goal**: Review deployment configuration for security issues
 **Constraints**: Review only — no changes
 If Dockerfiles, CI/CD configs, or deployment configs exist:
 1. **Container security**: non-root user, minimal base images, no secrets in build args, health checks
 2. **CI/CD security**: secrets management, no credentials in pipeline files, artifact signing
 3. **Environment configuration**: .env handling, secrets injection method, environment separation
 4. **Network security**: exposed ports, TLS configuration, CORS settings, security headers
 If no deployment configs exist, skip this phase and note it in the report.
 **Self-verification**:
 - [ ] All Dockerfiles reviewed
 - [ ] All CI/CD configs reviewed
 - [ ] All environment/config files reviewed
 **Save action**: Write `SECURITY_DIR/infrastructure_review.md`
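Two of the container checks above (non-root user, no secrets in build args) lend themselves to a quick static pass. This is a minimal sketch under stated assumptions: `checkDockerfile` is a hypothetical helper, the secret-name pattern is a rough heuristic, and line continuations are ignored. A missing `USER` is only a prompt for manual review, since the base image may already set a non-root user.

```javascript
// Hypothetical static check for a Dockerfile's text content.
// Flags secret-looking ARG names and a final stage that runs as root
// (or never sets USER). Heuristic only; not a substitute for review.
function checkDockerfile(text) {
  const issues = [];
  let stageUser = null; // USER in effect for the current build stage
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const [instruction, ...rest] = line.split(/\s+/);
    const arg = rest.join(' ');
    switch (instruction.toUpperCase()) {
      case 'FROM':
        stageUser = null; // a new stage starts from the base image's default user
        break;
      case 'USER':
        stageUser = arg;
        break;
      case 'ARG': {
        const name = arg.split('=')[0]; // report the name only, never the value
        if (/(secret|password|passwd|token|api_?key)/i.test(name)) {
          issues.push(`build arg may carry a secret: ${name}`);
        }
        break;
      }
    }
  }
  const uid = stageUser === null ? null : stageUser.split(':')[0];
  if (uid === null || uid === 'root' || uid === '0') {
    issues.push('final stage sets no USER or runs as root');
  }
  return issues;
}

// Sample Dockerfile invented for illustration
const sample = ['FROM node:20-slim', 'ARG NPM_TOKEN', 'RUN npm ci'].join('\n');
console.log(checkDockerfile(sample));
```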
 ---
 ### Phase 5: Security Report
 **Role**: Security analyst
 **Goal**: Produce a consolidated security audit report
 **Constraints**: Concise, actionable, severity-ranked
 Consolidate findings from Phases 1-4 into a structured report:
 ```markdown
 # Security Audit Report
 **Date**: [YYYY-MM-DD]
 **Scope**: [project name / target path]
 **Verdict**: PASS | PASS_WITH_WARNINGS | FAIL
 ## Summary
 | Severity | Count |
 |----------|-------|
 | Critical | [N] |
 | High     | [N] |
 | Medium   | [N] |
 | Low      | [N] |
 ## OWASP Top 10 Assessment
 | Category | Status | Findings |
 |----------|--------|----------|
 | [category] | PASS / FAIL / N/A | [count or —] |
 ## Findings
 | # | Severity | Category | Location | Title |
 |---|----------|----------|----------|-------|
 | 1 | Critical | Injection | src/api.py:42 | SQL injection via f-string |
 ### Finding Details
 **F1: [title]** (Severity / Category)
 - Location: `[file:line]`
 - Description: [what is vulnerable]
 - Impact: [what an attacker could do]
 - Remediation: [specific fix]
 ## Dependency Vulnerabilities
 | Package | CVE | Severity | Fix Version |
 |---------|-----|----------|-------------|
 | [name] | [CVE-ID] | [sev] | [version] |
 ## Recommendations
 ### Immediate (Critical/High)
 - [action items]
 ### Short-term (Medium)
 - [action items]
 ### Long-term (Low / Hardening)
 - [action items]
 ```
 **Self-verification**:
 - [ ] All findings from Phases 1-4 included
 - [ ] No duplicate findings
 - [ ] Every finding has remediation guidance
 - [ ] Verdict matches severity logic
 **Save action**: Write `SECURITY_DIR/security_report.md`
 **BLOCKING**: Present report summary to user.
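Consolidating Phases 1-4 mostly means merging the per-phase finding lists, dropping duplicates, and ordering by severity. The sketch below shows one way to do that. `consolidateFindings` and `SEVERITY_RANK` are hypothetical names, and treating two findings as duplicates when `location` and `title` match (keeping the more severe one) is an assumption, not a rule stated by the skill.

```javascript
// Lower rank = more severe; levels match the report template.
const SEVERITY_RANK = { Critical: 0, High: 1, Medium: 2, Low: 3 };

// Hypothetical helper: merge per-phase finding arrays, de-duplicate on
// location + title (keeping the most severe copy), then sort by severity.
function consolidateFindings(phaseResults) {
  const byKey = new Map();
  for (const finding of phaseResults.flat()) {
    const key = `${finding.location}|${finding.title}`;
    const existing = byKey.get(key);
    if (!existing || SEVERITY_RANK[finding.severity] < SEVERITY_RANK[existing.severity]) {
      byKey.set(key, finding);
    }
  }
  return [...byKey.values()].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]
  );
}

// Sample data invented for illustration
const phases = [
  [{ severity: 'Medium', location: 'package.json', title: 'Outdated dependency' }],
  [
    { severity: 'Critical', location: 'src/api.py:42', title: 'SQL injection via f-string' },
    { severity: 'High', location: 'package.json', title: 'Outdated dependency' },
  ],
];
console.log(consolidateFindings(phases));
```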
 ## Verdict Logic
 - **FAIL**: any Critical or High finding exists
 - **PASS_WITH_WARNINGS**: only Medium or Low findings
 - **PASS**: no findings
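The three rules above reduce to a small pure function. A minimal sketch, assuming each finding is an object with a `severity` field that uses the report's four levels; `computeVerdict` is a hypothetical name chosen for this example.

```javascript
// Hypothetical helper: derive the report verdict from a list of findings.
// Assumes `severity` is one of 'Critical', 'High', 'Medium', 'Low'.
function computeVerdict(findings) {
  if (findings.length === 0) return 'PASS';
  const blocking = new Set(['critical', 'high']);
  const hasBlocking = findings.some((f) =>
    blocking.has(String(f.severity).toLowerCase())
  );
  return hasBlocking ? 'FAIL' : 'PASS_WITH_WARNINGS';
}

console.log(computeVerdict([{ severity: 'Medium' }, { severity: 'Low' }]));
```

Running the same function during Phase 5 self-verification is one way to confirm that the verdict written in the report matches the severity counts in its summary table.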
 ## Security Checklist (Quick Reference)
 ### Authentication
 - [ ] Strong password requirements (12+ chars)
 - [ ] Password hashing (bcrypt, scrypt, Argon2)
 - [ ] MFA for sensitive operations
 - [ ] Account lockout after failed attempts
-- [ ] Session ID changes after login
-- [ ] Session timeout
+- [ ] Session timeout and rotation
 ### Authorization
 - [ ] Check authorization on every request
 - [ ] Least privilege principle
-- [ ] No horizontal escalation
-- [ ] No vertical escalation
+- [ ] No horizontal/vertical escalation paths
 ### Data Protection
 - [ ] HTTPS everywhere
 - [ ] Encrypted at rest
- [ ] Secrets not in code/logs
+- [ ] Secrets not in code/logs/version control
 - [ ] PII compliance (GDPR)
 ### Input Validation
- [ ] Server-side validation
+- [ ] Server-side validation on all inputs
 - [ ] Parameterized queries (no SQL injection)
 - [ ] Output encoding (no XSS)
- [ ] Rate limiting
+- [ ] Rate limiting on sensitive endpoints
---
+### CI/CD Security
 - [ ] Dependency audit in pipeline
 - [ ] Secret scanning (git-secrets, TruffleHog)
 - [ ] SAST in pipeline (Semgrep, SonarQube)
 - [ ] No secrets in pipeline config files
+## Escalation Rules
+| Situation | Action |
+|-----------|--------|
+| Critical vulnerability found | **WARN user immediately** — do not defer to report |
+| No audit tools available | Use manual code review + WebSearch for CVEs |
+| Codebase too large for full scan | **ASK user** to prioritize areas (API endpoints, auth, data access) |
+| Finding requires runtime testing (DAST) | Note as "requires DAST verification" — this skill does static analysis only |
+| Conflicting security requirements | **ASK user** to prioritize |
-## CI/CD Integration
-```yaml
-# GitHub Actions
-security-checks:
-  steps:
-    - name: Dependency audit
-      run: npm audit --audit-level=high
-
    - name: SAST scan
      run: npm run sast
    - name: Secret scan
      uses: trufflesecurity/trufflehog@main
    - name: DAST scan
      if: github.ref == 'refs/heads/main'
      run: docker run owasp/zap2docker-stable zap-baseline.py -t https://staging.example.com
 ```
 **Pre-commit hooks:**
 ```bash
 #!/bin/sh
 git-secrets --scan
 npm run lint:security
 ```
 ---
 ## Agent-Assisted Security Testing
 ```typescript
 // Comprehensive multi-layer scan
 await Task("Security Scan", {
  target: 'src/',
  layers: { sast: true, dast: true, dependencies: true, secrets: true },
  severity: ['critical', 'high', 'medium']
 }, "qe-security-scanner");
 // OWASP Top 10 testing
 await Task("OWASP Scan", {
  categories: ['broken-access-control', 'injection', 'cryptographic-failures'],
  depth: 'comprehensive'
 }, "qe-security-scanner");
 // Validate fix
 await Task("Validate Fix", {
  vulnerability: 'CVE-2024-12345',
  expectedResolution: 'upgrade package to v2.0.0',
  retestAfterFix: true
 }, "qe-security-scanner");
 ```
 ---
 ## Agent Coordination Hints
 ### Memory Namespace
 ```
 aqe/security/
 ├── scans/*           - Scan results
 ├── vulnerabilities/* - Found vulnerabilities
 ├── fixes/*           - Remediation tracking
 └── compliance/*      - Compliance status
 ```
 ### Fleet Coordination
 ```typescript
 const securityFleet = await FleetManager.coordinate({
  strategy: 'security-testing',
  agents: [
    'qe-security-scanner',
    'qe-api-contract-validator',
    'qe-quality-analyzer',
    'qe-deployment-readiness'
  ],
  topology: 'parallel'
 });
 ```
 ---
 ## Common Mistakes
-### ❌ Security by Obscurity
-Hiding admin at `/super-secret-admin` → **Use proper auth**
-### ❌ Client-Side Validation Only
-JavaScript validation can be bypassed → **Always validate server-side**
-### ❌ Trusting User Input
-Assuming input is safe → **Sanitize, validate, escape all input**
-### ❌ Hardcoded Secrets
-API keys in code → **Environment variables, secret management**
----
+- **Security by obscurity**: hiding admin at secret URLs instead of proper auth
+- **Client-side validation only**: JavaScript validation can be bypassed; always validate server-side
+- **Trusting user input**: assume all input is malicious until proven otherwise
+- **Hardcoded secrets**: use environment variables and secret management, never code
+- **Skipping dependency scan**: known CVEs in dependencies are the lowest-hanging fruit for attackers
+## Trigger Conditions
+When the user wants to:
+- Conduct a security audit of the codebase
+- Check for vulnerabilities before deployment
+- Review security posture after implementation
+- Validate security requirements from `security_approach.md`
+**Keywords**: "security audit", "security scan", "OWASP", "vulnerability scan", "security check", "pentest"
+**Differentiation**:
+- Lightweight security checks during implementation → handled by `/code-review` Phase 4
+- Full security audit → use this skill
+- Security requirements gathering → handled by `/problem` (security dimension)
-## Related Skills
-- [agentic-quality-engineering](../agentic-quality-engineering/) - Security with agents
-- [api-testing-patterns](../api-testing-patterns/) - API security testing
-- [compliance-testing](../compliance-testing/) - GDPR, HIPAA, SOC2
----
-
-## Remember
-
-**Think like an attacker:** What would you try to break? Test that.
-**Build like a defender:** Assume input is malicious until proven otherwise.
-**Test continuously:** Security testing is ongoing, not one-time.
-
-**With Agents:** Agents automate vulnerability scanning, track remediation, and validate fixes. Use agents to maintain security posture at scale.
+## Methodology Quick Reference
+```
+┌────────────────────────────────────────────────────────────────┐
+│              Security Audit (5-Phase Method)                    │
+├────────────────────────────────────────────────────────────────┤
+│ PREREQ: Source code exists, SECURITY_DIR created               │
+│                                                                │
+│ 1. Dependency Scan    → dependency_scan.md                     │
+│ 2. Static Analysis    → static_analysis.md                     │
+│ 3. OWASP Top 10       → owasp_review.md                        │
+│ 4. Infrastructure     → infrastructure_review.md               │
+│ 5. Security Report    → security_report.md                     │
+│    [BLOCKING: user reviews report]                             │
+├────────────────────────────────────────────────────────────────┤
+│ Verdict: PASS / PASS_WITH_WARNINGS / FAIL                      │
+│ Principles: OWASP-driven · Evidence-based · Severity-ranked    │
+│             Actionable · Save immediately                      │
+└────────────────────────────────────────────────────────────────┘
+```
@@ -1,789 +0,0 @@
 # =============================================================================
 # AQE Skill Evaluation Test Suite: Security Testing v1.0.0
 # =============================================================================
 #
 # Comprehensive evaluation suite for the security-testing skill per ADR-056.
 # Tests OWASP Top 10 2021 detection, severity classification, remediation
 # quality, and cross-model consistency.
 #
 # Schema: .claude/skills/.validation/schemas/skill-eval.schema.json
 # Validator: .claude/skills/security-testing/scripts/validate-config.json
 #
 # Coverage:
 # - OWASP A01:2021 - Broken Access Control
 # - OWASP A02:2021 - Cryptographic Failures
 # - OWASP A03:2021 - Injection (SQL, XSS, Command)
 # - OWASP A07:2021 - Identification and Authentication Failures
 # - Negative tests (no false positives on secure code)
 #
 # =============================================================================
 skill: security-testing
 version: 1.0.0
 description: >
  Comprehensive evaluation suite for the security-testing skill.
  Tests OWASP Top 10 2021 detection capabilities, CWE classification accuracy,
  CVSS scoring, severity classification, and remediation quality.
  Supports multi-model testing and integrates with ReasoningBank for
  continuous improvement.
 # =============================================================================
 # Multi-Model Configuration
 # =============================================================================
 models_to_test:
  - claude-3.5-sonnet    # Primary model (high accuracy expected)
  - claude-3-haiku       # Fast model (minimum quality threshold)
  - gpt-4o               # Cross-vendor validation
 # =============================================================================
 # MCP Integration Configuration
 # =============================================================================
 mcp_integration:
  enabled: true
  namespace: skill-validation
  # Query existing security patterns before running evals
  query_patterns: true
  # Track each test outcome for learning feedback loop
  track_outcomes: true
  # Store successful patterns after evals complete
  store_patterns: true
  # Share learning with fleet coordinator agents
  share_learning: true
  # Update quality gate with validation metrics
  update_quality_gate: true
  # Target agents for learning distribution
  target_agents:
    - qe-learning-coordinator
    - qe-queen-coordinator
    - qe-security-scanner
    - qe-security-auditor
 # =============================================================================
 # ReasoningBank Learning Configuration
 # =============================================================================
 learning:
  store_success_patterns: true
  store_failure_patterns: true
  pattern_ttl_days: 90
  min_confidence_to_store: 0.7
  cross_model_comparison: true
 # =============================================================================
 # Result Format Configuration
 # =============================================================================
 result_format:
  json_output: true
  markdown_report: true
  include_raw_output: false
  include_timing: true
  include_token_usage: true
 # =============================================================================
 # Environment Setup
 # =============================================================================
 setup:
  required_tools:
    - jq       # JSON parsing (required)
    - npm      # Dependency audit (optional but recommended)
  environment_variables:
    SECURITY_SCAN_DEPTH: "deep"
    OWASP_ENABLED: "true"
    SEVERITY_THRESHOLD: "medium"
  fixtures:
    - name: vulnerable_express_app
      path: fixtures/vulnerable-express-app.js
      content: |
        const express = require('express');
        const app = express();
        // SQL Injection vulnerability
        app.get('/user', (req, res) => {
          const query = `SELECT * FROM users WHERE id = ${req.params.id}`;
          db.query(query);
        });
        // XSS vulnerability
        app.get('/profile', (req, res) => {
          res.send(`<h1>Hello ${req.query.name}</h1>`);
        });
        // Path Traversal vulnerability
        app.get('/file', (req, res) => {
          const path = './uploads/' + req.query.filename;
          res.sendFile(path);
        });
 # =============================================================================
 # TEST CASES
 # =============================================================================
 test_cases:
  # ---------------------------------------------------------------------------
  # CATEGORY: SQL Injection (OWASP A03:2021)
  # ---------------------------------------------------------------------------
  - id: tc001_sql_injection_string_concat
    description: "Detect SQL injection via string concatenation in Node.js"
    category: injection
    priority: critical
    input:
      code: |
        const express = require('express');
        const mysql = require('mysql');
        const app = express();
        app.get('/api/users', (req, res) => {
          const userId = req.params.id;
          const query = `SELECT * FROM users WHERE id = ${userId}`;
          db.query(query, (err, results) => {
            res.json(results);
          });
        });
      context:
        language: javascript
        framework: express
        environment: production
    expected_output:
      must_contain:
        - "SQL injection"
        - "parameterized"
      must_not_contain:
        - "no vulnerabilities"
        - "secure"
      must_match_regex:
        - "CWE-89|CWE-564"
        - "A03:20[21][0-9]"
      severity_classification: critical
      finding_count:
        min: 1
        max: 3
      recommendation_count:
        min: 1
    validation:
      schema_check: true
      keyword_match_threshold: 0.8
      reasoning_quality_min: 0.7
      grading_rubric:
        completeness: 0.3
        accuracy: 0.5
        actionability: 0.2
    timeout_ms: 30000
  - id: tc002_sql_injection_parameterized_safe
    description: "Verify parameterized queries are NOT flagged as vulnerable"
    category: injection
    priority: high
    input:
      code: |
        app.get('/api/users', (req, res) => {
          const userId = parseInt(req.params.id, 10);
          db.query('SELECT * FROM users WHERE id = ?', [userId], (err, results) => {
            res.json(results);
          });
        });
      context:
        language: javascript
        framework: express
    expected_output:
      must_contain:
        - "parameterized"
        - "secure"
      must_not_contain:
        - "SQL injection"
        - "critical"
        - "vulnerable"
      severity_classification: info
      finding_count:
        max: 1
    validation:
      schema_check: true
      keyword_match_threshold: 0.7
      allow_partial: true
  # ---------------------------------------------------------------------------
  # CATEGORY: Cross-Site Scripting (OWASP A03:2021)
  # ---------------------------------------------------------------------------
  - id: tc003_xss_reflected_html_output
    description: "Detect reflected XSS in unescaped HTML output"
    category: injection
    priority: critical
    input:
      code: |
        app.get('/profile', (req, res) => {
          const name = req.query.name;
          res.send(`
            <html>
              <body>
                <h1>Welcome, ${name}!</h1>
                <p>Your profile has been loaded.</p>
              </body>
            </html>
          `);
        });
      context:
        language: javascript
        framework: express
    expected_output:
      must_contain:
        - "XSS"
        - "cross-site scripting"
        - "sanitize"
        - "escape"
      must_match_regex:
        - "CWE-79"
      severity_classification: high
      finding_count:
        min: 1
    validation:
      schema_check: true
      keyword_match_threshold: 0.8
      reasoning_quality_min: 0.75
  - id: tc004_xss_dom_based_innerhtml
    description: "Detect DOM-based XSS via innerHTML assignment"
    category: injection
    priority: high
    input:
      code: |
        // Client-side JavaScript
        const params = new URLSearchParams(window.location.search);
        const message = params.get('msg');
        document.getElementById('output').innerHTML = message;
      context:
        language: javascript
        framework: vanilla
        environment: production
    expected_output:
      must_contain:
        - "DOM"
        - "XSS"
        - "innerHTML"
        - "textContent"
      must_match_regex:
        - "CWE-79"
      severity_classification: high
    validation:
      schema_check: true
      keyword_match_threshold: 0.7
  # ---------------------------------------------------------------------------
  # CATEGORY: Authentication Failures (OWASP A07:2021)
  # ---------------------------------------------------------------------------
  - id: tc005_hardcoded_credentials
    description: "Detect hardcoded credentials and API keys"
    category: authentication
    priority: critical
    input:
      code: |
        const ADMIN_PASSWORD = 'admin123';
        const API_KEY = 'sk-1234567890abcdef';
        const DATABASE_URL = 'postgres://admin:password123@localhost/db';
        app.post('/login', (req, res) => {
          if (req.body.password === ADMIN_PASSWORD) {
            req.session.isAdmin = true;
            res.send('Login successful');
          }
        });
      context:
        language: javascript
        framework: express
    expected_output:
      must_contain:
        - "hardcoded"
        - "credentials"
        - "secret"
        - "environment variable"
      must_match_regex:
        - "CWE-798|CWE-259"
      severity_classification: critical
      finding_count:
        min: 2
    validation:
      schema_check: true
      keyword_match_threshold: 0.8
      reasoning_quality_min: 0.8
  - id: tc006_weak_password_hashing
    description: "Detect weak password hashing algorithms (MD5, SHA1)"
    category: authentication
    priority: high
    input:
      code: |
        const crypto = require('crypto');
        function hashPassword(password) {
          return crypto.createHash('md5').update(password).digest('hex');
        }
        function verifyPassword(password, hash) {
          return hashPassword(password) === hash;
        }
      context:
        language: javascript
        framework: nodejs
    expected_output:
      must_contain:
        - "MD5"
        - "weak"
        - "bcrypt"
        - "argon2"
      must_match_regex:
        - "CWE-327|CWE-328|CWE-916"
      severity_classification: high
      finding_count:
        min: 1
    validation:
      schema_check: true
      keyword_match_threshold: 0.8
  # ---------------------------------------------------------------------------
  # CATEGORY: Broken Access Control (OWASP A01:2021)
  # ---------------------------------------------------------------------------
  - id: tc007_idor_missing_authorization
    description: "Detect IDOR vulnerability with missing authorization check"
    category: authorization
    priority: critical
    input:
      code: |
        app.get('/api/users/:id/profile', (req, res) => {
          // No authorization check - any user can access any profile
          const userId = req.params.id;
          db.query('SELECT * FROM profiles WHERE user_id = ?', [userId])
            .then(profile => res.json(profile));
        });
        app.delete('/api/users/:id', (req, res) => {
          // No check if requesting user owns this account
          db.query('DELETE FROM users WHERE id = ?', [req.params.id]);
          res.send('User deleted');
        });
      context:
        language: javascript
        framework: express
    expected_output:
      must_contain:
        - "authorization"
        - "access control"
        - "IDOR"
        - "ownership"
      must_match_regex:
        - "CWE-639|CWE-284|CWE-862"
        - "A01:2021"
      severity_classification: critical
    validation:
      schema_check: true
      keyword_match_threshold: 0.7
  # ---------------------------------------------------------------------------
  # CATEGORY: Cryptographic Failures (OWASP A02:2021)
  # ---------------------------------------------------------------------------
  - id: tc008_weak_encryption_des
    description: "Detect use of weak encryption algorithms (DES, RC4)"
    category: cryptography
    priority: high
    input:
      code: |
        const crypto = require('crypto');
        function encryptData(data, key) {
          const cipher = crypto.createCipher('des', key);
          return cipher.update(data, 'utf8', 'hex') + cipher.final('hex');
        }
        function decryptData(data, key) {
          const decipher = crypto.createDecipher('des', key);
          return decipher.update(data, 'hex', 'utf8') + decipher.final('utf8');
        }
      context:
        language: javascript
        framework: nodejs
    expected_output:
      must_contain:
        - "DES"
        - "weak"
        - "deprecated"
        - "AES"
      must_match_regex:
        - "CWE-327|CWE-328"
        - "A02:2021"
      severity_classification: high
    validation:
      schema_check: true
      keyword_match_threshold: 0.7
  - id: tc009_plaintext_password_storage
    description: "Detect plaintext password storage"
    category: cryptography
    priority: critical
    input:
      code: |
        class User {
          constructor(email, password) {
            this.email = email;
            this.password = password;  // Stored in plaintext!
          }
          save() {
            db.query('INSERT INTO users (email, password) VALUES (?, ?)',
                     [this.email, this.password]);
          }
        }
      context:
        language: javascript
        framework: nodejs
    expected_output:
      must_contain:
        - "plaintext"
        - "password"
        - "hash"
        - "bcrypt"
      must_match_regex:
        - "CWE-256|CWE-312"
        - "A02:2021"
      severity_classification: critical
    validation:
      schema_check: true
      keyword_match_threshold: 0.8
  # ---------------------------------------------------------------------------
  # CATEGORY: Path Traversal (Related to A01:2021)
  # ---------------------------------------------------------------------------
  - id: tc010_path_traversal_file_access
    description: "Detect path traversal vulnerability in file access"
    category: injection
    priority: critical
    input:
      code: |
        const fs = require('fs');
        app.get('/download', (req, res) => {
          const filename = req.query.file;
          const filepath = './uploads/' + filename;
          res.sendFile(filepath);
        });
        app.get('/read', (req, res) => {
          const content = fs.readFileSync('./data/' + req.params.name);
          res.send(content);
        });
      context:
        language: javascript
        framework: express
    expected_output:
      must_contain:
        - "path traversal"
        - "directory traversal"
        - "../"
        - "sanitize"
      must_match_regex:
        - "CWE-22|CWE-23"
      severity_classification: critical
    validation:
      schema_check: true
      keyword_match_threshold: 0.7
  # ---------------------------------------------------------------------------
  # CATEGORY: Negative Tests (No False Positives)
  # ---------------------------------------------------------------------------
  - id: tc011_secure_code_no_false_positives
    description: "Verify secure code is NOT flagged as vulnerable"
    category: negative
    priority: critical
    input:
      code: |
        const express = require('express');
        const helmet = require('helmet');
        const rateLimit = require('express-rate-limit');
        const bcrypt = require('bcrypt');
        const validator = require('validator');
        const app = express();
        app.use(helmet());
        app.use(rateLimit({ windowMs: 15 * 60 * 1000, max: 100 }));
        app.post('/api/users', async (req, res) => {
          const { email, password } = req.body;
          // Input validation
          if (!validator.isEmail(email)) {
            return res.status(400).json({ error: 'Invalid email' });
          }
          // Secure password hashing
          const hashedPassword = await bcrypt.hash(password, 12);
          // Parameterized query
          await db.query(
            'INSERT INTO users (email, password) VALUES ($1, $2)',
            [email, hashedPassword]
          );
          res.status(201).json({ message: 'User created' });
        });
      context:
        language: javascript
        framework: express
        environment: production
    expected_output:
      must_contain:
        - "secure"
        - "best practice"
      must_not_contain:
        - "SQL injection"
        - "XSS"
        - "critical vulnerability"
        - "high severity"
      finding_count:
        max: 2  # Allow informational findings only
    validation:
      schema_check: true
      keyword_match_threshold: 0.6
      allow_partial: true
  - id: tc012_secure_auth_implementation
    description: "Verify secure authentication is recognized as safe"
    category: negative
    priority: high
    input:
      code: |
        const bcrypt = require('bcrypt');
        const jwt = require('jsonwebtoken');
        async function login(email, password) {
          const user = await User.findByEmail(email);
          if (!user) {
            return { error: 'Invalid credentials' };
          }
          const match = await bcrypt.compare(password, user.passwordHash);
          if (!match) {
            return { error: 'Invalid credentials' };
          }
          const token = jwt.sign(
            { userId: user.id },
            process.env.JWT_SECRET,
            { expiresIn: '1h' }
          );
          return { token };
        }
      context:
        language: javascript
        framework: nodejs
    expected_output:
      must_contain:
        - "bcrypt"
        - "jwt"
        - "secure"
      must_not_contain:
        - "vulnerable"
        - "critical"
        - "hardcoded"
      severity_classification: info
    validation:
      schema_check: true
      allow_partial: true
  # ---------------------------------------------------------------------------
  # CATEGORY: Python Security (Multi-language Support)
  # ---------------------------------------------------------------------------
  - id: tc013_python_sql_injection
    description: "Detect SQL injection in Python Flask application"
    category: injection
    priority: critical
    input:
      code: |
        from flask import Flask, request
        import sqlite3
        app = Flask(__name__)
        @app.route('/user')
        def get_user():
            user_id = request.args.get('id')
            conn = sqlite3.connect('users.db')
            cursor = conn.cursor()
            cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
            return str(cursor.fetchone())
      context:
        language: python
        framework: flask
    expected_output:
      must_contain:
        - "SQL injection"
        - "parameterized"
        - "f-string"
      must_match_regex:
        - "CWE-89"
      severity_classification: critical
      finding_count:
        min: 1
    validation:
      schema_check: true
      keyword_match_threshold: 0.7
  - id: tc014_python_ssti_jinja
    description: "Detect Server-Side Template Injection in Jinja2"
    category: injection
    priority: critical
    input:
      code: |
        from flask import Flask, request, render_template_string
        app = Flask(__name__)
        @app.route('/render')
        def render():
            template = request.args.get('template')
            return render_template_string(template)
      context:
        language: python
        framework: flask
    expected_output:
      must_contain:
        - "SSTI"
        - "template injection"
        - "render_template_string"
        - "Jinja2"
      must_match_regex:
        - "CWE-94|CWE-1336"
      severity_classification: critical
    validation:
      schema_check: true
      keyword_match_threshold: 0.7
  - id: tc015_python_pickle_deserialization
    description: "Detect insecure deserialization with pickle"
    category: injection
    priority: critical
    input:
      code: |
        import pickle
        from flask import Flask, request
        app = Flask(__name__)
        @app.route('/load')
        def load_data():
            data = request.get_data()
            obj = pickle.loads(data)
            return str(obj)
      context:
        language: python
        framework: flask
    expected_output:
      must_contain:
        - "pickle"
        - "deserialization"
        - "untrusted"
        - "RCE"
      must_match_regex:
        - "CWE-502"
        - "A08:2021"
      severity_classification: critical
    validation:
      schema_check: true
      keyword_match_threshold: 0.7
 # =============================================================================
 # SUCCESS CRITERIA
 # =============================================================================
 success_criteria:
  # Overall pass rate (90% of tests must pass)
  pass_rate: 0.9
  # Critical tests must ALL pass (100%)
  critical_pass_rate: 1.0
  # Average reasoning quality score
  avg_reasoning_quality: 0.75
  # Maximum suite execution time (5 minutes)
  max_execution_time_ms: 300000
  # Maximum variance between model results (15%)
  cross_model_variance: 0.15
 # =============================================================================
 # METADATA
 # =============================================================================
 metadata:
  author: "qe-security-auditor"
  created: "2026-02-02"
  last_updated: "2026-02-02"
  coverage_target: >
    OWASP Top 10 2021: A01 (Broken Access Control), A02 (Cryptographic Failures),
    A03 (Injection - SQL, XSS, SSTI, Command), A07 (Identification and
    Authentication Failures), A08 (Software and Data Integrity Failures -
    Deserialization). Covers JavaScript/Node.js Express apps and Python Flask
    apps. 15 test cases; requires a 90% overall pass rate and a 100% pass rate
    on critical tests.
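The thresholds in the `success_criteria` block above could be checked roughly as follows. This is a minimal sketch, not the suite's actual runner: the function name `evaluate_suite` and the per-test result keys (`priority`, `passed`, `reasoning_quality`, `execution_time_ms`) are assumptions made for illustration, and suite time is assumed to be the sum of per-test times. The `cross_model_variance` criterion is left out because the source does not say how variance is measured.

```python
from statistics import mean

# Thresholds copied from the success_criteria block of the eval suite.
SUCCESS_CRITERIA = {
    "pass_rate": 0.9,
    "critical_pass_rate": 1.0,
    "avg_reasoning_quality": 0.75,
    "max_execution_time_ms": 300000,
}


def evaluate_suite(results, criteria=SUCCESS_CRITERIA):
    """Return (ok, details) for a list of hypothetical per-test result dicts."""
    if not results:
        return False, {"error": "no results"}

    critical = [r for r in results if r["priority"] == "critical"]
    pass_rate = sum(r["passed"] for r in results) / len(results)
    # With no critical tests there is nothing to fail, so treat the rate as 1.0.
    critical_pass_rate = (
        sum(r["passed"] for r in critical) / len(critical) if critical else 1.0
    )
    avg_quality = mean(r["reasoning_quality"] for r in results)
    # Assumption: tests run sequentially, so suite time is the sum of test times.
    total_ms = sum(r["execution_time_ms"] for r in results)

    details = {
        "pass_rate": pass_rate,
        "critical_pass_rate": critical_pass_rate,
        "avg_reasoning_quality": avg_quality,
        "execution_time_ms": total_ms,
    }
    ok = (
        pass_rate >= criteria["pass_rate"]
        and critical_pass_rate >= criteria["critical_pass_rate"]
        and avg_quality >= criteria["avg_reasoning_quality"]
        and total_ms <= criteria["max_execution_time_ms"]
    )
    return ok, details
```

Because `critical_pass_rate` is 1.0, a single failing critical test fails the whole suite even when the overall pass rate is still above 90%.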
@@ -1,879 +0,0 @@
 {
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://agentic-qe.dev/schemas/security-testing-output.json",
  "title": "AQE Security Testing Skill Output Schema",
  "description": "Schema for security-testing skill output validation. Extends the base skill-output template with OWASP Top 10 categories, CWE identifiers, and CVSS scoring.",
  "type": "object",
  "required": ["skillName", "version", "timestamp", "status", "trustTier", "output"],
  "properties": {
    "skillName": {
      "type": "string",
      "const": "security-testing",
      "description": "Must be 'security-testing'"
    },
    "version": {
      "type": "string",
      "pattern": "^\\d+\\.\\d+\\.\\d+(-[a-zA-Z0-9]+)?$",
      "description": "Semantic version of the skill"
    },
    "timestamp": {
      "type": "string",
      "format": "date-time",
      "description": "ISO 8601 timestamp of output generation"
    },
    "status": {
      "type": "string",
      "enum": ["success", "partial", "failed", "skipped"],
      "description": "Overall execution status"
    },
    "trustTier": {
      "type": "integer",
      "const": 3,
      "description": "Trust tier 3 indicates full validation with eval suite"
    },
    "output": {
      "type": "object",
      "required": ["summary", "findings", "owaspCategories"],
      "properties": {
        "summary": {
          "type": "string",
          "minLength": 50,
          "maxLength": 2000,
          "description": "Human-readable summary of security findings"
        },
        "score": {
          "$ref": "#/$defs/securityScore",
          "description": "Overall security score"
        },
        "findings": {
          "type": "array",
          "items": {
            "$ref": "#/$defs/securityFinding"
          },
          "maxItems": 500,
          "description": "List of security vulnerabilities discovered"
        },
        "recommendations": {
          "type": "array",
          "items": {
            "$ref": "#/$defs/securityRecommendation"
          },
          "maxItems": 100,
          "description": "Prioritized remediation recommendations with code examples"
        },
        "metrics": {
          "$ref": "#/$defs/securityMetrics",
          "description": "Security scan metrics and statistics"
        },
        "owaspCategories": {
          "$ref": "#/$defs/owaspCategoryBreakdown",
          "description": "OWASP Top 10 2021 category breakdown"
        },
        "artifacts": {
          "type": "array",
          "items": {
            "$ref": "#/$defs/artifact"
          },
          "maxItems": 50,
          "description": "Generated security reports and scan artifacts"
        },
        "timeline": {
          "type": "array",
          "items": {
            "$ref": "#/$defs/timelineEvent"
          },
          "description": "Scan execution timeline"
        },
        "scanConfiguration": {
          "$ref": "#/$defs/scanConfiguration",
          "description": "Configuration used for the security scan"
        }
      }
    },
    "metadata": {
      "$ref": "#/$defs/metadata"
    },
    "validation": {
      "$ref": "#/$defs/validationResult"
    },
    "learning": {
      "$ref": "#/$defs/learningData"
    }
  },
  "$defs": {
    "securityScore": {
      "type": "object",
      "required": ["value", "max"],
      "properties": {
        "value": {
          "type": "number",
          "minimum": 0,
          "maximum": 100,
          "description": "Security score (0=critical issues, 100=no issues)"
        },
        "max": {
          "type": "number",
          "const": 100,
          "description": "Maximum score is always 100"
        },
        "grade": {
          "type": "string",
          "pattern": "^[A-DF][+-]?$",
          "description": "Letter grade: A (90-100), B (80-89), C (70-79), D (60-69), F (<60)"
        },
        "trend": {
          "type": "string",
          "enum": ["improving", "stable", "declining", "unknown"],
          "description": "Trend compared to previous scans"
        },
        "riskLevel": {
          "type": "string",
          "enum": ["critical", "high", "medium", "low", "minimal"],
          "description": "Overall risk level assessment"
        }
      }
    },
    "securityFinding": {
      "type": "object",
      "required": ["id", "title", "severity", "owasp"],
      "properties": {
        "id": {
          "type": "string",
          "pattern": "^SEC-\\d{3,6}$",
          "description": "Unique finding identifier (e.g., SEC-001)"
        },
        "title": {
          "type": "string",
          "minLength": 10,
          "maxLength": 200,
          "description": "Finding title describing the vulnerability"
        },
        "description": {
          "type": "string",
          "maxLength": 2000,
          "description": "Detailed description of the vulnerability"
        },
        "severity": {
          "type": "string",
          "enum": ["critical", "high", "medium", "low", "info"],
          "description": "Severity: critical (CVSS 9.0-10.0), high (7.0-8.9), medium (4.0-6.9), low (0.1-3.9), info (0)"
        },
        "owasp": {
          "type": "string",
          "pattern": "^A(0[1-9]|10):20(21|25)$",
          "description": "OWASP Top 10 category (e.g., A01:2021, A03:2025)"
        },
        "owaspCategory": {
          "type": "string",
          "enum": [
            "A01:2021-Broken-Access-Control",
            "A02:2021-Cryptographic-Failures",
            "A03:2021-Injection",
            "A04:2021-Insecure-Design",
            "A05:2021-Security-Misconfiguration",
            "A06:2021-Vulnerable-Components",
            "A07:2021-Identification-Authentication-Failures",
            "A08:2021-Software-Data-Integrity-Failures",
            "A09:2021-Security-Logging-Monitoring-Failures",
            "A10:2021-Server-Side-Request-Forgery"
          ],
          "description": "Full OWASP category name"
        },
        "cwe": {
          "type": "string",
          "pattern": "^CWE-\\d{1,4}$",
          "description": "CWE identifier (e.g., CWE-79 for XSS, CWE-89 for SQLi)"
        },
        "cvss": {
          "type": "object",
          "properties": {
            "score": {
              "type": "number",
              "minimum": 0,
              "maximum": 10,
              "description": "CVSS v3.1 base score"
            },
            "vector": {
              "type": "string",
              "pattern": "^CVSS:3\\.1/AV:[NALP]/AC:[LH]/PR:[NLH]/UI:[NR]/S:[UC]/C:[NLH]/I:[NLH]/A:[NLH]$",
              "description": "CVSS v3.1 vector string"
            },
            "severity": {
              "type": "string",
              "enum": ["None", "Low", "Medium", "High", "Critical"],
              "description": "CVSS severity rating"
            }
          }
        },
        "location": {
          "$ref": "#/$defs/location",
          "description": "Location of the vulnerability"
        },
        "evidence": {
          "type": "string",
          "maxLength": 5000,
          "description": "Evidence: code snippet, request/response, or PoC"
        },
        "remediation": {
          "type": "string",
          "maxLength": 2000,
          "description": "Specific fix instructions for this finding"
        },
        "references": {
          "type": "array",
          "items": {
            "type": "object",
            "required": ["title", "url"],
            "properties": {
              "title": { "type": "string" },
              "url": { "type": "string", "format": "uri" }
            }
          },
          "maxItems": 10,
          "description": "External references (OWASP, CWE, CVE, etc.)"
        },
        "falsePositive": {
          "type": "boolean",
          "default": false,
          "description": "Potential false positive flag"
        },
        "confidence": {
          "type": "number",
          "minimum": 0,
          "maximum": 1,
          "description": "Confidence in finding accuracy (0.0-1.0)"
        },
        "exploitability": {
          "type": "string",
          "enum": ["trivial", "easy", "moderate", "difficult", "theoretical"],
          "description": "How easy this vulnerability is to exploit"
        },
        "affectedVersions": {
          "type": "array",
          "items": { "type": "string" },
          "description": "Affected package/library versions for dependency vulnerabilities"
        },
        "cve": {
          "type": "string",
          "pattern": "^CVE-\\d{4}-\\d{4,}$",
          "description": "CVE identifier if applicable"
        }
      }
    },
    "securityRecommendation": {
      "type": "object",
      "required": ["id", "title", "priority", "owaspCategories"],
      "properties": {
        "id": {
          "type": "string",
          "pattern": "^REC-\\d{3,6}$",
          "description": "Unique recommendation identifier"
        },
        "title": {
          "type": "string",
          "minLength": 10,
          "maxLength": 200,
          "description": "Recommendation title"
        },
        "description": {
          "type": "string",
          "maxLength": 2000,
          "description": "Detailed recommendation description"
        },
        "priority": {
          "type": "string",
          "enum": ["critical", "high", "medium", "low"],
          "description": "Remediation priority"
        },
        "effort": {
          "type": "string",
          "enum": ["trivial", "low", "medium", "high", "major"],
          "description": "Estimated effort: trivial(<1hr), low(1-4hr), medium(1-3d), high(1-2wk), major(>2wk)"
        },
        "impact": {
          "type": "integer",
          "minimum": 1,
          "maximum": 10,
          "description": "Security impact if implemented (1-10)"
        },
        "relatedFindings": {
          "type": "array",
          "items": {
            "type": "string",
            "pattern": "^SEC-\\d{3,6}$"
          },
          "description": "IDs of findings this addresses"
        },
        "owaspCategories": {
          "type": "array",
          "items": {
            "type": "string",
            "pattern": "^A(0[1-9]|10):20(21|25)$"
          },
          "description": "OWASP categories this recommendation addresses"
        },
        "codeExample": {
          "type": "object",
          "properties": {
            "before": {
              "type": "string",
              "maxLength": 2000,
              "description": "Vulnerable code example"
            },
            "after": {
              "type": "string",
              "maxLength": 2000,
              "description": "Secure code example"
            },
            "language": {
              "type": "string",
              "description": "Programming language"
            }
          },
          "description": "Before/after code examples for remediation"
        },
        "resources": {
          "type": "array",
          "items": {
            "type": "object",
            "required": ["title", "url"],
            "properties": {
              "title": { "type": "string" },
              "url": { "type": "string", "format": "uri" }
            }
          },
          "maxItems": 10,
          "description": "External resources and documentation"
        },
        "automatable": {
          "type": "boolean",
          "description": "Can this fix be automated?"
        },
        "fixCommand": {
          "type": "string",
          "description": "CLI command to apply fix if automatable"
        }
      }
    },
    "owaspCategoryBreakdown": {
      "type": "object",
      "description": "OWASP Top 10 2021 category scores and findings",
      "properties": {
        "A01:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A01:2021 - Broken Access Control"
        },
        "A02:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A02:2021 - Cryptographic Failures"
        },
        "A03:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A03:2021 - Injection"
        },
        "A04:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A04:2021 - Insecure Design"
        },
        "A05:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A05:2021 - Security Misconfiguration"
        },
        "A06:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A06:2021 - Vulnerable and Outdated Components"
        },
        "A07:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A07:2021 - Identification and Authentication Failures"
        },
        "A08:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A08:2021 - Software and Data Integrity Failures"
        },
        "A09:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A09:2021 - Security Logging and Monitoring Failures"
        },
        "A10:2021": {
          "$ref": "#/$defs/owaspCategoryScore",
          "description": "A10:2021 - Server-Side Request Forgery (SSRF)"
        }
      },
      "additionalProperties": false
    },
    "owaspCategoryScore": {
      "type": "object",
      "required": ["tested", "score"],
      "properties": {
        "tested": {
          "type": "boolean",
          "description": "Whether this category was tested"
        },
        "score": {
          "type": "number",
          "minimum": 0,
          "maximum": 100,
          "description": "Category score (100 = no issues, 0 = critical)"
        },
        "grade": {
          "type": "string",
          "pattern": "^[A-DF][+-]?$",
          "description": "Letter grade for this category"
        },
        "findingCount": {
          "type": "integer",
          "minimum": 0,
          "description": "Number of findings in this category"
        },
        "criticalCount": {
          "type": "integer",
          "minimum": 0,
          "description": "Number of critical findings"
        },
        "highCount": {
          "type": "integer",
          "minimum": 0,
          "description": "Number of high severity findings"
        },
        "status": {
          "type": "string",
          "enum": ["pass", "fail", "warn", "skip"],
          "description": "Category status"
        },
        "description": {
          "type": "string",
          "description": "Category description and context"
        },
        "cwes": {
          "type": "array",
          "items": {
            "type": "string",
            "pattern": "^CWE-\\d{1,4}$"
          },
          "description": "CWEs found in this category"
        }
      }
    },
    "securityMetrics": {
      "type": "object",
      "properties": {
        "totalFindings": {
          "type": "integer",
          "minimum": 0,
          "description": "Total vulnerabilities found"
        },
        "criticalCount": {
          "type": "integer",
          "minimum": 0,
          "description": "Critical severity findings"
        },
        "highCount": {
          "type": "integer",
          "minimum": 0,
          "description": "High severity findings"
        },
        "mediumCount": {
          "type": "integer",
          "minimum": 0,
          "description": "Medium severity findings"
        },
        "lowCount": {
          "type": "integer",
          "minimum": 0,
          "description": "Low severity findings"
        },
        "infoCount": {
          "type": "integer",
          "minimum": 0,
          "description": "Informational findings"
        },
        "filesScanned": {
          "type": "integer",
          "minimum": 0,
          "description": "Number of files analyzed"
        },
        "linesOfCode": {
          "type": "integer",
          "minimum": 0,
          "description": "Lines of code scanned"
        },
        "dependenciesChecked": {
          "type": "integer",
          "minimum": 0,
          "description": "Number of dependencies checked"
        },
        "owaspCategoriesTested": {
          "type": "integer",
          "minimum": 0,
          "maximum": 10,
          "description": "OWASP Top 10 categories tested"
        },
        "owaspCategoriesPassed": {
          "type": "integer",
          "minimum": 0,
          "maximum": 10,
          "description": "OWASP Top 10 categories with no findings"
        },
        "uniqueCwes": {
          "type": "integer",
          "minimum": 0,
          "description": "Unique CWE identifiers found"
        },
        "falsePositiveRate": {
          "type": "number",
          "minimum": 0,
          "maximum": 1,
          "description": "Estimated false positive rate"
        },
        "scanDurationMs": {
          "type": "integer",
          "minimum": 0,
          "description": "Total scan duration in milliseconds"
        },
        "coverage": {
          "type": "object",
          "properties": {
            "sast": {
              "type": "boolean",
              "description": "Static analysis performed"
            },
            "dast": {
              "type": "boolean",
              "description": "Dynamic analysis performed"
            },
            "dependencies": {
              "type": "boolean",
              "description": "Dependency scan performed"
            },
            "secrets": {
              "type": "boolean",
              "description": "Secret scanning performed"
            },
            "configuration": {
              "type": "boolean",
              "description": "Configuration review performed"
            }
          },
          "description": "Scan coverage indicators"
        }
      }
    },
    "scanConfiguration": {
      "type": "object",
      "properties": {
        "target": {
          "type": "string",
          "description": "Scan target (file path, URL, or package)"
        },
        "targetType": {
          "type": "string",
          "enum": ["source", "url", "package", "container", "infrastructure"],
          "description": "Type of target being scanned"
        },
        "scanTypes": {
          "type": "array",
          "items": {
            "type": "string",
            "enum": ["sast", "dast", "dependency", "secret", "configuration", "container", "iac"]
          },
          "description": "Types of scans performed"
        },
        "severity": {
          "type": "array",
          "items": {
            "type": "string",
            "enum": ["critical", "high", "medium", "low", "info"]
          },
          "description": "Severity levels included in scan"
        },
        "owaspCategories": {
          "type": "array",
          "items": {
            "type": "string",
            "pattern": "^A(0[1-9]|10):20(21|25)$"
          },
          "description": "OWASP categories tested"
        },
        "tools": {
          "type": "array",
          "items": { "type": "string" },
          "description": "Security tools used"
        },
        "excludePatterns": {
          "type": "array",
          "items": { "type": "string" },
          "description": "File patterns excluded from scan"
        },
        "rulesets": {
          "type": "array",
          "items": { "type": "string" },
          "description": "Security rulesets applied"
        }
      }
    },
    "location": {
      "type": "object",
      "properties": {
        "file": {
          "type": "string",
          "maxLength": 500,
          "description": "File path relative to project root"
        },
        "line": {
          "type": "integer",
          "minimum": 1,
          "description": "Line number"
        },
        "column": {
          "type": "integer",
          "minimum": 1,
          "description": "Column number"
        },
        "endLine": {
          "type": "integer",
          "minimum": 1,
          "description": "End line for multi-line findings"
        },
        "endColumn": {
          "type": "integer",
          "minimum": 1,
          "description": "End column"
        },
        "url": {
          "type": "string",
          "format": "uri",
          "description": "URL for web-based findings"
        },
        "endpoint": {
          "type": "string",
          "description": "API endpoint path"
        },
        "method": {
          "type": "string",
          "enum": ["GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS"],
          "description": "HTTP method for API findings"
        },
        "parameter": {
          "type": "string",
          "description": "Vulnerable parameter name"
        },
        "component": {
          "type": "string",
          "description": "Affected component or module"
        }
      }
    },
    "artifact": {
      "type": "object",
      "required": ["type", "path"],
      "properties": {
        "type": {
          "type": "string",
          "enum": ["report", "sarif", "data", "log", "evidence"],
          "description": "Artifact type"
        },
        "path": {
          "type": "string",
          "maxLength": 500,
          "description": "Path to artifact"
        },
        "format": {
          "type": "string",
          "enum": ["json", "sarif", "html", "md", "txt", "xml", "csv"],
          "description": "Artifact format"
        },
        "description": {
          "type": "string",
          "maxLength": 500,
          "description": "Artifact description"
        },
        "sizeBytes": {
          "type": "integer",
          "minimum": 0,
          "description": "File size in bytes"
        },
        "checksum": {
          "type": "string",
          "pattern": "^sha256:[a-f0-9]{64}$",
          "description": "SHA-256 checksum"
        }
      }
    },
    "timelineEvent": {
      "type": "object",
      "required": ["timestamp", "event"],
      "properties": {
        "timestamp": {
          "type": "string",
          "format": "date-time",
          "description": "Event timestamp"
        },
        "event": {
          "type": "string",
          "maxLength": 200,
          "description": "Event description"
        },
        "type": {
          "type": "string",
          "enum": ["start", "checkpoint", "warning", "error", "complete"],
          "description": "Event type"
        },
        "durationMs": {
          "type": "integer",
          "minimum": 0,
          "description": "Duration since previous event"
        },
        "phase": {
          "type": "string",
          "enum": ["initialization", "sast", "dast", "dependency", "secret", "reporting"],
          "description": "Scan phase"
        }
      }
    },
    "metadata": {
      "type": "object",
      "properties": {
        "executionTimeMs": {
          "type": "integer",
          "minimum": 0,
          "maximum": 3600000,
          "description": "Execution time in milliseconds"
        },
        "toolsUsed": {
          "type": "array",
          "items": {
            "type": "string",
            "enum": ["semgrep", "npm-audit", "trivy", "owasp-zap", "bandit", "gosec", "eslint-security", "snyk", "gitleaks", "trufflehog", "bearer"]
          },
          "uniqueItems": true,
          "description": "Security tools used"
        },
        "agentId": {
          "type": "string",
          "pattern": "^qe-[a-z][a-z0-9-]*$",
          "description": "Agent ID (e.g., qe-security-scanner)"
        },
        "modelUsed": {
          "type": "string",
          "description": "LLM model used for analysis"
        },
        "inputHash": {
          "type": "string",
          "pattern": "^[a-f0-9]{64}$",
          "description": "SHA-256 hash of input"
        },
        "targetUrl": {
          "type": "string",
          "format": "uri",
          "description": "Target URL if applicable"
        },
        "targetPath": {
          "type": "string",
          "description": "Target path if applicable"
        },
        "environment": {
          "type": "string",
          "enum": ["development", "staging", "production", "ci"],
          "description": "Execution environment"
        },
        "retryCount": {
          "type": "integer",
          "minimum": 0,
          "maximum": 10,
          "description": "Number of retries"
        }
      }
    },
    "validationResult": {
      "type": "object",
      "properties": {
        "schemaValid": {
          "type": "boolean",
          "description": "Passes JSON schema validation"
        },
        "contentValid": {
          "type": "boolean",
          "description": "Passes content validation"
        },
        "confidence": {
          "type": "number",
          "minimum": 0,
          "maximum": 1,
          "description": "Confidence score"
        },
        "warnings": {
          "type": "array",
          "items": {
            "type": "string",
            "maxLength": 500
          },
          "maxItems": 20,
          "description": "Validation warnings"
        },
        "errors": {
          "type": "array",
          "items": {
            "type": "string",
            "maxLength": 500
          },
          "maxItems": 20,
          "description": "Validation errors"
        },
        "validatorVersion": {
          "type": "string",
          "pattern": "^\\d+\\.\\d+\\.\\d+$",
          "description": "Validator version"
        }
      }
    },
    "learningData": {
      "type": "object",
      "properties": {
        "patternsDetected": {
          "type": "array",
          "items": {
            "type": "string",
            "maxLength": 200
          },
          "maxItems": 20,
          "description": "Security patterns detected (e.g., sql-injection-string-concat)"
        },
        "reward": {
          "type": "number",
          "minimum": 0,
          "maximum": 1,
          "description": "Reward signal for learning (0.0-1.0)"
        },
        "feedbackLoop": {
          "type": "object",
          "properties": {
            "previousRunId": {
              "type": "string",
              "format": "uuid",
              "description": "Previous run ID for comparison"
            },
            "improvement": {
              "type": "number",
              "minimum": -1,
              "maximum": 1,
              "description": "Improvement over previous run"
            }
          }
        },
        "newVulnerabilityPatterns": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "pattern": { "type": "string" },
              "cwe": { "type": "string" },
              "confidence": { "type": "number" }
            }
          },
          "description": "New vulnerability patterns learned"
        }
      }
    }
  }
 }
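The identifier patterns and required fields that the schema declares for `securityFinding` can be exercised with the standard library alone. The sketch below is illustrative only: `check_finding` is a hypothetical helper, it covers just the `id`, `owasp`, `cwe`, `cve`, `severity`, and `title` constraints copied from the schema, and it is not a substitute for validating against the full schema with a JSON Schema validator.

```python
import re

# Patterns copied from the securityFinding definition in the schema.
# re.ASCII keeps \d limited to 0-9, closer to the ECMA-262 semantics
# that JSON Schema's "pattern" keyword is defined in terms of.
PATTERNS = {
    "id": re.compile(r"^SEC-\d{3,6}$", re.ASCII),
    "owasp": re.compile(r"^A(0[1-9]|10):20(21|25)$", re.ASCII),
    "cwe": re.compile(r"^CWE-\d{1,4}$", re.ASCII),
    "cve": re.compile(r"^CVE-\d{4}-\d{4,}$", re.ASCII),
}
SEVERITIES = {"critical", "high", "medium", "low", "info"}
REQUIRED = ("id", "title", "severity", "owasp")


def check_finding(finding):
    """Return a list of error strings; an empty list means no problems found."""
    errors = []
    for field in REQUIRED:
        if field not in finding:
            errors.append(f"missing required field: {field}")
    for field, pattern in PATTERNS.items():
        value = finding.get(field)
        if value is None:
            continue  # optional or already reported as missing
        if not isinstance(value, str) or not pattern.fullmatch(value):
            errors.append(f"{field} does not match {pattern.pattern}")
    if "severity" in finding and finding["severity"] not in SEVERITIES:
        errors.append("severity is not one of the allowed values")
    title = finding.get("title")
    if isinstance(title, str) and not 10 <= len(title) <= 200:
        errors.append("title must be 10-200 characters long")
    return errors
```

For example, a finding for the pickle test case with `id` `SEC-001`, `owasp` `A08:2021`, and `cwe` `CWE-502` passes, while `owasp` set to `A11:2021` is rejected because the pattern only admits A01 through A10.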
@@ -1,45 +0,0 @@
 {
  "skillName": "security-testing",
  "skillVersion": "1.0.0",
  "requiredTools": [
    "jq"
  ],
  "optionalTools": [
    "npm",
    "semgrep",
    "trivy",
    "ajv",
    "jsonschema",
    "python3"
  ],
  "schemaPath": "schemas/output.json",
  "requiredFields": [
    "skillName",
    "status",
    "output",
    "output.summary",
    "output.findings",
    "output.owaspCategories"
  ],
  "requiredNonEmptyFields": [
    "output.summary"
  ],
  "mustContainTerms": [
    "OWASP",
    "security",
    "vulnerability"
  ],
  "mustNotContainTerms": [
    "TODO",
    "placeholder",
    "FIXME"
  ],
  "enumValidations": {
    ".status": [
      "success",
      "partial",
      "failed",
      "skipped"
    ]
  }
 }
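A validator driven by the configuration above might apply its rules along these lines. This is a sketch under stated assumptions, not the skill's actual validator: `get_path` and `validate_output` are hypothetical names, and the source does not specify how terms are matched, so the sketch assumes a case-insensitive substring search over the JSON-serialized output (which includes key names and values).

```python
import json


def get_path(doc, dotted):
    """Resolve a dotted path such as 'output.summary'; return None if absent."""
    node = doc
    for key in dotted.strip(".").split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def validate_output(doc, config):
    """Return a list of error strings for a skill output and a validator config."""
    errors = []
    for path in config.get("requiredFields", []):
        if get_path(doc, path) is None:
            errors.append(f"missing required field: {path}")
    for path in config.get("requiredNonEmptyFields", []):
        if not get_path(doc, path):
            errors.append(f"field must not be empty: {path}")

    # Assumption: terms are matched case-insensitively against the whole output.
    text = json.dumps(doc).lower()
    for term in config.get("mustContainTerms", []):
        if term.lower() not in text:
            errors.append(f"missing required term: {term}")
    for term in config.get("mustNotContainTerms", []):
        if term.lower() in text:
            errors.append(f"forbidden term present: {term}")

    # Keys look like ".status"; the leading dot is stripped by get_path.
    for path, allowed in config.get("enumValidations", {}).items():
        value = get_path(doc, path)
        if value is not None and value not in allowed:
            errors.append(f"{path} has value outside the allowed set: {value}")
    return errors
```

An empty `output.findings` list still counts as present under this sketch, since only `output.summary` is listed under `requiredNonEmptyFields`.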