---
name: test-spec
description: |
Test specification skill. Analyzes input data and expected results completeness,
then produces detailed test scenarios (blackbox, performance, resilience, security, resource limits)
that treat the system as a black box. Every test pairs input data with quantifiable expected results
so tests can verify correctness, not just execution.
4-phase workflow: input data + expected results analysis, test scenario specification, data + results validation gate,
test runner script generation. Produces 8 artifacts under tests/ and 2 shell scripts under scripts/.
Trigger phrases:
- "test spec", "test specification", "test scenarios"
- "blackbox test spec", "black box tests", "blackbox tests"
- "performance tests", "resilience tests", "security tests"
category: build
tags: [testing, black-box, blackbox-tests, test-specification, qa]
disable-model-invocation: true
---
# Test Scenario Specification
Analyze input data completeness and produce detailed black-box test specifications. Tests describe what the system should do given specific inputs — they never reference internals.
## Core Principles
- **Black-box only**: tests describe observable behavior through public interfaces; no internal implementation details
- **Traceability**: every test traces to at least one acceptance criterion or restriction
- **Save immediately**: write artifacts to disk after each phase; never accumulate unsaved work
- **Ask, don't assume**: when requirements are ambiguous, ask the user before proceeding
- **Spec, don't code**: this workflow produces test specifications, never test implementation code
- **Every test must have a pass/fail criterion**. Two acceptable shapes:
  - **Input/output shape**: concrete input data paired with a quantifiable expected result (exact value, tolerance, threshold, pattern, reference file). Typical for functional blackbox tests, performance tests with load data, data-processing pipelines.
  - **Behavioral shape**: a trigger condition + observable system behavior + quantifiable pass/fail criterion, with no input data required. Typical for startup/shutdown tests, retry/backoff policies, state transitions, logging/metrics emission, resilience scenarios. Example criteria: "startup logs `service ready` within 5s", "retry emits 3 attempts with exponential backoff (base 100ms ± 20ms)", "on SIGTERM, service drains in-flight requests within 30s grace period", "health endpoint returns 503 while migrations run".
  - For behavioral tests the observable (log line, metric value, state transition, emitted event, elapsed time) must still be quantifiable: the test must programmatically decide pass/fail.
  - A test that cannot produce a pass/fail verdict through either shape is not verifiable and must be removed.
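To make the two shapes concrete, here is a minimal Python sketch of what "programmatically decide pass/fail" could look like for each. It is illustrative only and is not something this skill generates. The `Verdict` type and both function names are hypothetical, and reading "base 100ms ± 20ms" as gaps of 100ms, 200ms, ... each within ±20ms is an assumption about the retry example above.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    passed: bool
    detail: str


def check_within_tolerance(actual: float, expected: float, tolerance: float) -> Verdict:
    """Input/output shape: a measured value compared against expected ± tolerance."""
    delta = abs(actual - expected)
    return Verdict(delta <= tolerance, f"delta {delta:g} against tolerance {tolerance:g}")


def check_retry_backoff(attempt_times_ms: list[float], base_ms: float = 100.0,
                        tolerance_ms: float = 20.0, attempts: int = 3) -> Verdict:
    """Behavioral shape: no input data, only observed attempt timestamps."""
    if len(attempt_times_ms) != attempts:
        return Verdict(False, f"expected {attempts} attempts, observed {len(attempt_times_ms)}")
    # Gaps between consecutive attempts; assumed to double each time (100ms, 200ms, ...).
    gaps = [later - earlier for earlier, later in zip(attempt_times_ms, attempt_times_ms[1:])]
    for index, gap in enumerate(gaps):
        expected_gap = base_ms * 2 ** index
        if abs(gap - expected_gap) > tolerance_ms:
            return Verdict(False, f"gap {index + 1} was {gap:g}ms, expected {expected_gap:g}ms")
    return Verdict(True, "all gaps within tolerance")
```

Both functions return a verdict from observable values alone, which is the property a specified test needs regardless of its shape.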
## Context Resolution
Fixed paths:
- PROBLEM_DIR: `_docs/00_problem/`
- SOLUTION_DIR: `_docs/01_solution/`
- DOCUMENT_DIR: `_docs/02_document/`
- TESTS_OUTPUT_DIR: `_docs/02_document/tests/`
Announce the resolved paths and the detected invocation mode (below) to the user before proceeding.
### Invocation Modes
- **full** (default): runs all 4 phases against the whole `PROBLEM_DIR` + `DOCUMENT_DIR`. Used in greenfield Plan Step 1 and existing-code Step 3.
- **cycle-update**: runs only a scoped refresh of the existing test-spec artifacts against the current feature cycle's completed tasks. Used by the existing-code flow's per-cycle sync step. See `modes/cycle-update.md` for the narrowed workflow.
## Input Specification
### Required Files
| File | Purpose |
|------|---------|
| `_docs/00_problem/problem.md` | Problem description and context |
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
| `_docs/00_problem/input_data/` | Reference data examples, expected results, and optional reference files |
| `_docs/01_solution/solution.md` | Finalized solution |
### Expected Results Specification
Every input data item MUST have a corresponding expected result that defines what the system should produce. Expected results MUST be **quantifiable** — the test must be able to programmatically compare actual system output against the expected result and produce a pass/fail verdict.
Expected results live inside `_docs/00_problem/input_data/expected_results/` as a required mapping file, optionally backed by reference files:
1. **Mapping file** (`input_data/expected_results/results_report.md`): a table pairing each input with its quantifiable expected output, using the format defined in `templates/expected-results.md`
2. **Reference files folder** (`input_data/expected_results/`): machine-readable files (JSON, CSV, etc.) containing full expected outputs for complex cases, referenced from the mapping file
```
input_data/
├── expected_results/ ← required: expected results folder
│ ├── results_report.md ← required: input→expected result mapping
│ ├── image_01_expected.csv ← per-file expected detections
│ └── video_01_expected.csv
├── image_01.jpg
├── empty_scene.jpg
└── data_parameters.md
```
**Quantifiability requirements** (see `templates/expected-results.md` for full format and examples):
- Numeric values: exact value or value ± tolerance (e.g., `confidence ≥ 0.85`, `position ± 10px`)
- Structured data: exact JSON/CSV values, or a reference file in `expected_results/`
- Counts: exact counts (e.g., "3 detections", "0 errors")
- Text/patterns: exact string or regex pattern to match
- Timing: threshold (e.g., "response ≤ 500ms")
- Error cases: expected error code, message pattern, or HTTP status
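As a rough illustration of why these forms count as quantifiable, each one reduces to a comparison that code can evaluate. The helper names below are hypothetical and the sketch is not part of `templates/expected-results.md`; it only covers the threshold, pattern, and structured-data cases.

```python
import csv
import io
import re


def meets_minimum(actual: float, minimum: float) -> bool:
    """Lower-bound threshold, e.g. confidence >= 0.85."""
    return actual >= minimum


def within_limit(actual: float, limit: float) -> bool:
    """Upper-bound threshold, e.g. response time <= 500 (ms)."""
    return actual <= limit


def matches_pattern(actual: str, pattern: str) -> bool:
    """Text expectation expressed as a regex that must match the whole string."""
    return re.fullmatch(pattern, actual) is not None


def csv_rows_equal(actual_csv: str, expected_csv: str) -> bool:
    """Structured data: exact row-by-row comparison against a reference CSV."""
    def parse(text: str) -> list[list[str]]:
        return list(csv.reader(io.StringIO(text.strip())))
    return parse(actual_csv) == parse(expected_csv)
```

An expectation such as "works correctly" has no equivalent function, which is a quick way to spot a non-quantifiable result.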
### Optional Files (used when available)
| File | Purpose |
|------|---------|
| `DOCUMENT_DIR/architecture.md` | System architecture for environment design |
| `DOCUMENT_DIR/system-flows.md` | System flows for test scenario coverage |
| `DOCUMENT_DIR/components/` | Component specs for interface identification |
### Prerequisite Checks (BLOCKING)
1. `acceptance_criteria.md` exists and is non-empty — **STOP if missing**
2. `restrictions.md` exists and is non-empty — **STOP if missing**
3. `input_data/` exists and contains at least one file — **STOP if missing**
4. `input_data/expected_results/results_report.md` exists and is non-empty — **STOP if missing**. Prompt the user: *"Expected results mapping is required. Please create `_docs/00_problem/input_data/expected_results/results_report.md` pairing each input with its quantifiable expected output. Use `templates/expected-results.md` as the format reference."*
5. `problem.md` exists and is non-empty — **STOP if missing**
6. `solution.md` exists and is non-empty — **STOP if missing**
7. Create TESTS_OUTPUT_DIR if it does not exist
8. If TESTS_OUTPUT_DIR already contains files, ask user: **resume from last checkpoint or start fresh?**
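The blocking checks (items 1 to 6) amount to a presence and non-emptiness scan. The following Python sketch shows one way to picture it; `find_blockers` is a hypothetical helper, "non-empty" is simplified to a file size above zero, and counting files anywhere under `input_data/` is an assumption about what "contains at least one file" means.

```python
from pathlib import Path

# Paths taken from the Required Files table and the prerequisite list.
REQUIRED_FILES = [
    "_docs/00_problem/acceptance_criteria.md",
    "_docs/00_problem/restrictions.md",
    "_docs/00_problem/input_data/expected_results/results_report.md",
    "_docs/00_problem/problem.md",
    "_docs/01_solution/solution.md",
]
INPUT_DATA_DIR = "_docs/00_problem/input_data"


def find_blockers(root: Path) -> list[str]:
    """Return one message per failed blocking check; an empty list means proceed."""
    blockers = []
    for rel in REQUIRED_FILES:
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            blockers.append(f"missing or empty: {rel}")
    data_dir = root / INPUT_DATA_DIR
    # Assumption: any file below input_data/ satisfies "at least one file".
    if not data_dir.is_dir() or not any(p.is_file() for p in data_dir.rglob("*")):
        blockers.append(f"no files found under: {INPUT_DATA_DIR}")
    return blockers
```

Any non-empty result corresponds to a STOP condition; items 7 and 8 (creating the output directory and asking about resume) are deliberately left out.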
## Artifact Management
### Directory Structure
```
TESTS_OUTPUT_DIR/
├── environment.md
├── test-data.md
├── blackbox-tests.md
├── performance-tests.md
├── resilience-tests.md
├── security-tests.md
├── resource-limit-tests.md
└── traceability-matrix.md
```
### Save Timing
| Phase | Save immediately after | Filename |
|-------|------------------------|----------|
| Phase 1 | Input data analysis (no file — findings feed Phase 2) | — |
| Phase 2 | Environment spec | `environment.md` |
| Phase 2 | Test data spec | `test-data.md` |
| Phase 2 | Blackbox tests | `blackbox-tests.md` |
| Phase 2 | Performance tests | `performance-tests.md` |
| Phase 2 | Resilience tests | `resilience-tests.md` |
| Phase 2 | Security tests | `security-tests.md` |
| Phase 2 | Resource limit tests | `resource-limit-tests.md` |
| Phase 2 | Traceability matrix | `traceability-matrix.md` |
| Phase 3 | Updated test data spec (if data added) | `test-data.md` |
| Phase 3 | Updated test files (if tests removed) | respective test file |
| Phase 3 | Updated traceability matrix (if tests removed) | `traceability-matrix.md` |
| Hardware Assessment | Test Execution section | `environment.md` (updated) |
| Phase 4 | Test runner script | `scripts/run-tests.sh` |
| Phase 4 | Performance test runner script | `scripts/run-performance-tests.sh` |
### Resumability
If TESTS_OUTPUT_DIR already contains files:
1. List existing files and match them to the save timing table above
2. Identify which phase/artifacts are complete
3. Resume from the next incomplete artifact
4. Inform the user which artifacts are being skipped
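A minimal sketch of the resume decision, assuming that a file's presence means the artifact is complete (the skill itself may apply a stricter test) and that artifacts are produced in the order of the save timing table. `resume_point` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional

# Phase 2 artifacts in the order listed in the save timing table.
ARTIFACT_ORDER = [
    "environment.md",
    "test-data.md",
    "blackbox-tests.md",
    "performance-tests.md",
    "resilience-tests.md",
    "security-tests.md",
    "resource-limit-tests.md",
    "traceability-matrix.md",
]


def resume_point(tests_output_dir: Path) -> tuple[Optional[str], list[str]]:
    """Return (next artifact to produce, artifacts that will be skipped)."""
    existing = set()
    if tests_output_dir.is_dir():
        existing = {p.name for p in tests_output_dir.iterdir() if p.is_file()}
    skipped = []
    for name in ARTIFACT_ORDER:
        if name not in existing:
            return name, skipped
        skipped.append(name)
    # All Phase 2 artifacts exist; later phases cannot be judged from filenames alone.
    return None, skipped
```

The second element of the result is what step 4 reports to the user. When the first element is `None`, progress through Phase 3 and beyond has to be established some other way, for example by asking the user.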
## Progress Tracking
At the start of execution, create a TodoWrite list covering all four phases plus the hardware assessment that runs between Phase 3 and Phase 4. Update each item's status as its phase completes.
## Workflow
### Phase 1: Input Data & Expected Results Completeness Analysis
Read and follow `phases/01-input-data-analysis.md`.

---
### Phase 2: Test Scenario Specification
Read and follow `phases/02-test-scenarios.md`.

---
### Phase 3: Test Data Validation Gate (HARD GATE)
Read and follow `phases/03-data-validation-gate.md`.

---
### Hardware-Dependency & Execution Environment Assessment (BLOCKING — runs between Phase 3 and Phase 4)
Read and follow `phases/hardware-assessment.md`.

---
### Phase 4: Test Runner Script Generation
Read and follow `phases/04-runner-scripts.md`.

---
### cycle-update mode
If invoked in `cycle-update` mode (see "Invocation Modes" above), read and follow `modes/cycle-update.md` instead of the full 4-phase workflow.
## Escalation Rules
| Situation | Action |
|-----------|--------|
| Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — specification cannot proceed |
| Missing input_data/expected_results/results_report.md | **STOP** — ask user to provide expected results mapping using the template |
| Ambiguous requirements | ASK user |
| Input data coverage below 75% (Phase 1) | Search the internet for supplementary data, then ASK user to validate it |
| Expected results missing or not quantifiable (Phase 1) | ASK user to provide quantifiable expected results before proceeding |
| Test scenario conflicts with restrictions | ASK user to clarify intent |
| System interfaces unclear (no architecture.md) | ASK user or derive from solution.md |
| Test data or expected result not provided for a test scenario (Phase 3) | WARN user and REMOVE the test |
| Final coverage below 75% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |
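The 75% thresholds in the table can be pictured as a simple ratio check. How coverage is actually computed is defined in the phase files, which are not reproduced here, so the sketch below assumes one plausible definition: the share of acceptance criteria and restrictions traced by at least one remaining test. The function names and the IDs in the usage note are hypothetical.

```python
COVERAGE_THRESHOLD = 0.75  # threshold stated in the escalation rules


def coverage_ratio(criteria_ids: list[str], test_traces: dict[str, list[str]]) -> float:
    """Fraction of criteria traced by at least one test (assumed definition)."""
    criteria = set(criteria_ids)
    if not criteria:
        return 0.0
    covered = {cid for traced in test_traces.values() for cid in traced}
    return len(criteria & covered) / len(criteria)


def passes_gate(criteria_ids: list[str], test_traces: dict[str, list[str]]) -> bool:
    """True when coverage meets the threshold; False maps to the BLOCK action."""
    return coverage_ratio(criteria_ids, test_traces) >= COVERAGE_THRESHOLD
```

For example, with criteria `AC-1` to `AC-4` and tests tracing `AC-1`, `AC-2`, and `AC-3`, the ratio is 0.75 and the gate passes. Removing the only test that traces `AC-3` drops it to 0.5, which corresponds to the BLOCK row.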
## Common Mistakes
- **Referencing internals**: tests must be black-box — no internal module names, no direct DB queries against the system under test
- **Vague expected outcomes**: "works correctly" is not a test outcome; use specific measurable values
- **Missing pass/fail criterion**: input/output tests without an expected result, OR behavioral tests without a measurable observable — both are unverifiable and must be removed
- **Non-quantifiable criteria**: "should return good results", "works correctly", "behaves properly" — not verifiable. Use exact values, tolerances, thresholds, pattern matches, or timing bounds that code can evaluate.
- **Forcing the wrong shape**: do not invent fake input data for a behavioral test (e.g., "input: SIGTERM signal") just to fit the input/output shape. Classify the test correctly and use the matching checklist.
- **Missing negative scenarios**: every positive scenario category should have corresponding negative/edge-case tests
- **Untraceable tests**: every test should trace to at least one AC or restriction
- **Writing test code**: this skill produces specifications, never implementation code
## Trigger Conditions
When the user wants to:
- Specify blackbox tests before implementation or refactoring
- Analyze input data completeness for test coverage
- Produce test scenarios from acceptance criteria
**Keywords**: "test spec", "test specification", "blackbox test spec", "black box tests", "blackbox tests", "test scenarios"
## Methodology Quick Reference
```
┌──────────────────────────────────────────────────────────────────────┐
│ Test Scenario Specification (4-Phase) │
├──────────────────────────────────────────────────────────────────────┤
│ PREREQ: Data Gate (BLOCKING) │
│ → verify AC, restrictions, input_data (incl. results_report.md) │
│ │
│ Phase 1: Input Data & Expected Results Completeness Analysis │
│ → phases/01-input-data-analysis.md │
│ [BLOCKING: user confirms input data + expected results coverage] │
│ │
│ Phase 2: Test Scenario Specification │
│ → phases/02-test-scenarios.md │
│ → environment.md · test-data.md · blackbox-tests.md │
│ → performance-tests.md · resilience-tests.md · security-tests.md │
│ → resource-limit-tests.md · traceability-matrix.md │
│ [BLOCKING: user confirms test coverage] │
│ │
│ Phase 3: Test Data & Expected Results Validation Gate (HARD GATE) │
│ → phases/03-data-validation-gate.md │
│ [BLOCKING: coverage ≥ 75% required to pass] │
│ │
│ Hardware-Dependency Assessment (BLOCKING, pre-Phase-4) │
│ → phases/hardware-assessment.md │
│ │
│ Phase 4: Test Runner Script Generation │
│ → phases/04-runner-scripts.md │
│ → scripts/run-tests.sh (unit + blackbox) │
│ → scripts/run-performance-tests.sh (load/perf scenarios) │
│ │
│ cycle-update mode (scoped refresh) │
│ → modes/cycle-update.md │
├──────────────────────────────────────────────────────────────────────┤
│ Principles: Black-box only · Traceability · Save immediately │
│ Ask don't assume · Spec don't code │
│ No test without a pass/fail criterion │
└──────────────────────────────────────────────────────────────────────┘
```