ui/.cursor/skills/test-spec/SKILL.md

name: test-spec
description: Test specification skill. Analyzes input data and expected results completeness, then produces detailed test scenarios (blackbox, performance, resilience, security, resource limits) that treat the system as a black box. Every test pairs input data with quantifiable expected results so tests can verify correctness, not just execution. 4-phase workflow: input data + expected results analysis, test scenario specification, data + results validation gate, test runner script generation. Produces 8 artifacts under tests/ and 2 shell scripts under scripts/. Trigger phrases: "test spec", "test specification", "test scenarios", "blackbox test spec", "black box tests", "blackbox tests", "performance tests", "resilience tests", "security tests"
category: build
tags: testing, black-box, blackbox-tests, test-specification, qa
disable-model-invocation: true

Test Scenario Specification

Analyze input data completeness and produce detailed black-box test specifications. Tests describe what the system should do given specific inputs — they never reference internals.

Core Principles

  • Black-box only: tests describe observable behavior through public interfaces; no internal implementation details
  • Traceability: every test traces to at least one acceptance criterion or restriction
  • Save immediately: write artifacts to disk after each phase; never accumulate unsaved work
  • Ask, don't assume: when requirements are ambiguous, ask the user before proceeding
  • Spec, don't code: this workflow produces test specifications, never test implementation code
  • Every test must have a pass/fail criterion. Two acceptable shapes:
    • Input/output shape: concrete input data paired with a quantifiable expected result (exact value, tolerance, threshold, pattern, reference file). Typical for functional blackbox tests, performance tests with load data, data-processing pipelines.
    • Behavioral shape: a trigger condition + observable system behavior + quantifiable pass/fail criterion, with no input data required. Typical for startup/shutdown tests, retry/backoff policies, state transitions, logging/metrics emission, resilience scenarios. Example criteria: "startup logs service ready within 5s", "retry emits 3 attempts with exponential backoff (base 100ms ± 20ms)", "on SIGTERM, service drains in-flight requests within 30s grace period", "health endpoint returns 503 while migrations run".
  • For behavioral tests the observable (log line, metric value, state transition, emitted event, elapsed time) must still be quantifiable — the test must programmatically decide pass/fail.
  • A test that cannot produce a pass/fail verdict through either shape is not verifiable and must be removed.
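
A behavioral-shape check can be sketched as a polling loop: trigger the system, then wait for the observable within the quantified deadline. The function name, log path, and pattern below are illustrative assumptions, not part of any template:

```shell
# Sketch: behavioral-shape pass/fail with no input data — only a trigger,
# an observable (a log line), and a quantifiable deadline.
# wait_for_log_line is a hypothetical helper, not a real CLI.
wait_for_log_line() {
  log_file=$1; pattern=$2; deadline_s=$3
  start=$(date +%s)
  while :; do
    # Observable seen within the deadline -> PASS
    grep -q "$pattern" "$log_file" 2>/dev/null && return 0
    now=$(date +%s)
    # Deadline exceeded without the observable -> FAIL
    [ $((now - start)) -ge "$deadline_s" ] && return 1
    sleep 0.2
  done
}

# Example criterion: "startup logs service ready within 5s"
# wait_for_log_line service.log "service ready" 5
```

The same shape covers retry counts, state transitions, and drain timeouts: the loop body changes, but the trigger/observable/deadline triple stays.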

Context Resolution

Fixed paths:

  • PROBLEM_DIR: _docs/00_problem/
  • SOLUTION_DIR: _docs/01_solution/
  • DOCUMENT_DIR: _docs/02_document/
  • TESTS_OUTPUT_DIR: _docs/02_document/tests/

Announce the resolved paths and the detected invocation mode (see Invocation Modes below) to the user before proceeding.

Invocation Modes

  • full (default): runs all 4 phases against the whole PROBLEM_DIR + DOCUMENT_DIR. Used in greenfield Plan Step 1 and existing-code Step 3.
  • cycle-update: runs only a scoped refresh of the existing test-spec artifacts against the current feature cycle's completed tasks. Used by the existing-code flow's per-cycle sync step. See modes/cycle-update.md for the narrowed workflow.

Input Specification

Required Files

| File | Purpose |
| --- | --- |
| _docs/00_problem/problem.md | Problem description and context |
| _docs/00_problem/acceptance_criteria.md | Measurable acceptance criteria |
| _docs/00_problem/restrictions.md | Constraints and limitations |
| _docs/00_problem/input_data/ | Reference data examples, expected results, and optional reference files |
| _docs/01_solution/solution.md | Finalized solution |

Expected Results Specification

Every input data item MUST have a corresponding expected result that defines what the system should produce. Expected results MUST be quantifiable — the test must be able to programmatically compare actual system output against the expected result and produce a pass/fail verdict.

Expected results live inside _docs/00_problem/input_data/ in one or both of:

  1. Mapping file (input_data/expected_results/results_report.md): a table pairing each input with its quantifiable expected output, using the format defined in templates/expected-results.md

  2. Reference files folder (input_data/expected_results/): machine-readable files (JSON, CSV, etc.) containing full expected outputs for complex cases, referenced from the mapping file

input_data/
├── expected_results/            ← required: expected results folder
│   ├── results_report.md        ← required: input→expected result mapping
│   ├── image_01_expected.csv    ← per-file expected detections
│   └── video_01_expected.csv
├── image_01.jpg
├── empty_scene.jpg
└── data_parameters.md

Quantifiability requirements (see templates/expected-results.md for full format and examples):

  • Numeric values: exact value or value ± tolerance (e.g., confidence ≥ 0.85, position ± 10px)
  • Structured data: exact JSON/CSV values, or a reference file in expected_results/
  • Counts: exact counts (e.g., "3 detections", "0 errors")
  • Text/patterns: exact string or regex pattern to match
  • Timing: threshold (e.g., "response ≤ 500ms")
  • Error cases: expected error code, message pattern, or HTTP status
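
The numeric cases above reduce to two comparisons a runner script can make: "actual within tolerance of expected" and "actual meets threshold". A minimal sketch, with hypothetical function names:

```shell
# Sketch: programmatic pass/fail for numeric expected results.
# within_tolerance and meets_threshold are illustrative helpers,
# not part of templates/expected-results.md.

# PASS (exit 0) iff |actual - expected| <= tolerance, e.g. position ± 10px
within_tolerance() {
  actual=$1; expected=$2; tol=$3
  awk -v a="$actual" -v e="$expected" -v t="$tol" \
    'BEGIN { d = a - e; if (d < 0) d = -d; exit (d <= t ? 0 : 1) }'
}

# PASS (exit 0) iff actual >= minimum, e.g. confidence >= 0.85
meets_threshold() {
  actual=$1; min=$2
  awk -v a="$actual" -v m="$min" 'BEGIN { exit (a >= m ? 0 : 1) }'
}
```

awk is used so thresholds like 0.85 compare as floats; plain shell arithmetic is integer-only.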

Optional Files (used when available)

| File | Purpose |
| --- | --- |
| DOCUMENT_DIR/architecture.md | System architecture for environment design |
| DOCUMENT_DIR/system-flows.md | System flows for test scenario coverage |
| DOCUMENT_DIR/components/ | Component specs for interface identification |

Prerequisite Checks (BLOCKING)

  1. acceptance_criteria.md exists and is non-empty — STOP if missing
  2. restrictions.md exists and is non-empty — STOP if missing
  3. input_data/ exists and contains at least one file — STOP if missing
  4. input_data/expected_results/results_report.md exists and is non-empty — STOP if missing. Prompt the user: "Expected results mapping is required. Please create _docs/00_problem/input_data/expected_results/results_report.md pairing each input with its quantifiable expected output. Use templates/expected-results.md as the format reference."
  5. problem.md exists and is non-empty — STOP if missing
  6. solution.md exists and is non-empty — STOP if missing
  7. Create TESTS_OUTPUT_DIR if it does not exist
  8. If TESTS_OUTPUT_DIR already contains files, ask user: resume from last checkpoint or start fresh?
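
The blocking checks above can be sketched as a guard function that stops on the first missing or empty artifact. Paths follow the fixed paths in Context Resolution; the function names are illustrative:

```shell
# Sketch of the blocking prerequisite checks (steps 1-7).
# Step 8 (resume vs. fresh) needs user interaction and is omitted.
PROBLEM_DIR=_docs/00_problem
TESTS_OUTPUT_DIR=_docs/02_document/tests

# File must exist and be non-empty, otherwise STOP
require_nonempty() {
  [ -s "$1" ] || { echo "STOP: $1 missing or empty" >&2; return 1; }
}

check_prerequisites() {
  require_nonempty "$PROBLEM_DIR/acceptance_criteria.md" || return 1
  require_nonempty "$PROBLEM_DIR/restrictions.md" || return 1
  # input_data/ must contain at least one file
  [ -n "$(ls -A "$PROBLEM_DIR/input_data" 2>/dev/null)" ] \
    || { echo "STOP: input_data/ missing or empty" >&2; return 1; }
  require_nonempty "$PROBLEM_DIR/input_data/expected_results/results_report.md" || return 1
  require_nonempty "$PROBLEM_DIR/problem.md" || return 1
  require_nonempty "_docs/01_solution/solution.md" || return 1
  mkdir -p "$TESTS_OUTPUT_DIR"   # step 7: create output dir if absent
}
```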

Artifact Management

Directory Structure

TESTS_OUTPUT_DIR/
├── environment.md
├── test-data.md
├── blackbox-tests.md
├── performance-tests.md
├── resilience-tests.md
├── security-tests.md
├── resource-limit-tests.md
└── traceability-matrix.md

Save Timing

| Phase | Save immediately after | Filename |
| --- | --- | --- |
| Phase 1 | Input data analysis | (no file — findings feed Phase 2) |
| Phase 2 | Environment spec | environment.md |
| Phase 2 | Test data spec | test-data.md |
| Phase 2 | Blackbox tests | blackbox-tests.md |
| Phase 2 | Performance tests | performance-tests.md |
| Phase 2 | Resilience tests | resilience-tests.md |
| Phase 2 | Security tests | security-tests.md |
| Phase 2 | Resource limit tests | resource-limit-tests.md |
| Phase 3 | Updated test data spec (if data added) | test-data.md |
| Phase 3 | Updated test files (if tests removed) | respective test file |
| Phase 3 | Updated traceability matrix (if tests removed) | traceability-matrix.md |
| Hardware Assessment | Test Execution section | environment.md (updated) |
| Phase 4 | Test runner script | scripts/run-tests.sh |
| Phase 4 | Performance test runner script | scripts/run-performance-tests.sh |

Resumability

If TESTS_OUTPUT_DIR already contains files:

  1. List existing files and match them to the save timing table above
  2. Identify which phase/artifacts are complete
  3. Resume from the next incomplete artifact
  4. Inform the user which artifacts are being skipped
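
Because the artifacts are saved in a fixed order, the resume point is simply the first artifact in that order that is missing or empty. A sketch (function name illustrative):

```shell
# Sketch: find the first incomplete Phase 2 artifact in save-timing
# order. Emits the filename to resume from, or "all-complete".
next_incomplete_artifact() {
  dir=$1
  for f in environment.md test-data.md blackbox-tests.md \
           performance-tests.md resilience-tests.md security-tests.md \
           resource-limit-tests.md traceability-matrix.md; do
    if [ ! -s "$dir/$f" ]; then
      echo "$f"      # missing or empty -> resume here
      return 0
    fi
  done
  echo "all-complete"
}
```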

Progress Tracking

At the start of execution, create a TodoWrite with all four phases (plus the hardware assessment between Phase 3 and Phase 4). Update status as each phase completes.

Workflow

Phase 1: Input Data & Expected Results Completeness Analysis

Read and follow phases/01-input-data-analysis.md.


Phase 2: Test Scenario Specification

Read and follow phases/02-test-scenarios.md.


Phase 3: Test Data Validation Gate (HARD GATE)

Read and follow phases/03-data-validation-gate.md.


Hardware-Dependency & Execution Environment Assessment (BLOCKING — runs between Phase 3 and Phase 4)

Read and follow phases/hardware-assessment.md.


Phase 4: Test Runner Script Generation

Read and follow phases/04-runner-scripts.md.


cycle-update mode

If invoked in cycle-update mode (see "Invocation Modes" above), read and follow modes/cycle-update.md instead of the full 4-phase workflow.

Escalation Rules

| Situation | Action |
| --- | --- |
| Missing acceptance_criteria.md, restrictions.md, or input_data/ | STOP — specification cannot proceed |
| Missing input_data/expected_results/results_report.md | STOP — ask user to provide expected results mapping using the template |
| Ambiguous requirements | ASK user |
| Input data coverage below 75% (Phase 1) | Search internet for supplementary data, ASK user to validate |
| Expected results missing or not quantifiable (Phase 1) | ASK user to provide quantifiable expected results before proceeding |
| Test scenario conflicts with restrictions | ASK user to clarify intent |
| System interfaces unclear (no architecture.md) | ASK user or derive from solution.md |
| Test data or expected result not provided for a test scenario (Phase 3) | WARN user and REMOVE the test |
| Final coverage below 75% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |
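
The two 75% rows reduce to one arithmetic gate: covered tests divided by total tests against the threshold. A sketch using integer percentage math (function name illustrative):

```shell
# Sketch: the 75% coverage gate from Phases 1 and 3.
# PASS (exit 0) iff covered/total >= threshold percent; an empty
# test set (total = 0) is treated as a failure.
coverage_ok() {
  covered=$1; total=$2; threshold=${3:-75}
  [ "$total" -gt 0 ] || return 1
  pct=$(( covered * 100 / total ))   # integer percent, truncated
  [ "$pct" -ge "$threshold" ]
}
```

Truncating integer division is deliberately conservative here: 8 of 12 tests covered computes as 66%, which fails the gate.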

Common Mistakes

  • Referencing internals: tests must be black-box — no internal module names, no direct DB queries against the system under test
  • Vague expected outcomes: "works correctly" is not a test outcome; use specific measurable values
  • Missing pass/fail criterion: input/output tests without an expected result, OR behavioral tests without a measurable observable — both are unverifiable and must be removed
  • Non-quantifiable criteria: "should return good results", "works correctly", "behaves properly" — not verifiable. Use exact values, tolerances, thresholds, pattern matches, or timing bounds that code can evaluate.
  • Forcing the wrong shape: do not invent fake input data for a behavioral test (e.g., "input: SIGTERM signal") just to fit the input/output shape. Classify the test correctly and use the matching checklist.
  • Missing negative scenarios: every positive scenario category should have corresponding negative/edge-case tests
  • Untraceable tests: every test should trace to at least one AC or restriction
  • Writing test code: this skill produces specifications, never implementation code

Trigger Conditions

When the user wants to:

  • Specify blackbox tests before implementation or refactoring
  • Analyze input data completeness for test coverage
  • Produce test scenarios from acceptance criteria

Keywords: "test spec", "test specification", "blackbox test spec", "black box tests", "blackbox tests", "test scenarios"

Methodology Quick Reference

┌──────────────────────────────────────────────────────────────────────┐
│              Test Scenario Specification (4-Phase)                   │
├──────────────────────────────────────────────────────────────────────┤
│ PREREQ: Data Gate (BLOCKING)                                         │
│   → verify AC, restrictions, input_data (incl. expected_results/)    │
│                                                                      │
│ Phase 1: Input Data & Expected Results Completeness Analysis         │
│   → phases/01-input-data-analysis.md                                 │
│   [BLOCKING: user confirms input data + expected results coverage]   │
│                                                                      │
│ Phase 2: Test Scenario Specification                                 │
│   → phases/02-test-scenarios.md                                      │
│   → environment.md · test-data.md · blackbox-tests.md                │
│   → performance-tests.md · resilience-tests.md · security-tests.md   │
│   → resource-limit-tests.md · traceability-matrix.md                 │
│   [BLOCKING: user confirms test coverage]                            │
│                                                                      │
│ Phase 3: Test Data & Expected Results Validation Gate (HARD GATE)    │
│   → phases/03-data-validation-gate.md                                │
│   [BLOCKING: coverage ≥ 75% required to pass]                        │
│                                                                      │
│ Hardware-Dependency Assessment (BLOCKING, pre-Phase-4)               │
│   → phases/hardware-assessment.md                                    │
│                                                                      │
│ Phase 4: Test Runner Script Generation                               │
│   → phases/04-runner-scripts.md                                      │
│   → scripts/run-tests.sh (unit + blackbox)                           │
│   → scripts/run-performance-tests.sh (load/perf scenarios)           │
│                                                                      │
│ cycle-update mode (scoped refresh)                                   │
│   → modes/cycle-update.md                                            │
├──────────────────────────────────────────────────────────────────────┤
│ Principles: Black-box only · Traceability · Save immediately         │
│             Ask don't assume · Spec don't code                       │
│             No test without data · No test without expected result   │
└──────────────────────────────────────────────────────────────────────┘