mirror of https://github.com/azaion/loader.git (synced 2026-04-22 22:46:32 +00:00)
---
name: test-spec
description: |
  Test specification skill. Analyzes input data and expected results completeness,
  then produces detailed test scenarios (blackbox, performance, resilience, security, resource limits)
  that treat the system as a black box. Every test pairs input data with quantifiable expected results
  so tests can verify correctness, not just execution.
  4-phase workflow: input data + expected results analysis, test scenario specification, data + results validation gate,
  test runner script generation. Produces 8 artifacts under tests/ and 2 shell scripts under scripts/.
  Trigger phrases:
  - "test spec", "test specification", "test scenarios"
  - "blackbox test spec", "black box tests", "blackbox tests"
  - "performance tests", "resilience tests", "security tests"
category: build
tags: [testing, black-box, blackbox-tests, test-specification, qa]
disable-model-invocation: true
---

# Test Scenario Specification

Analyze input data completeness and produce detailed black-box test specifications. Tests describe what the system should do given specific inputs — they never reference internals.

## Core Principles

- **Black-box only**: tests describe observable behavior through public interfaces; no internal implementation details
- **Traceability**: every test traces to at least one acceptance criterion or restriction
- **Save immediately**: write artifacts to disk after each phase; never accumulate unsaved work
- **Ask, don't assume**: when requirements are ambiguous, ask the user before proceeding
- **Spec, don't code**: this workflow produces test specifications, never test implementation code
- **Every test must have a pass/fail criterion**. Two acceptable shapes:
  - **Input/output shape**: concrete input data paired with a quantifiable expected result (exact value, tolerance, threshold, pattern, reference file). Typical for functional blackbox tests, performance tests with load data, and data-processing pipelines.
  - **Behavioral shape**: a trigger condition + observable system behavior + quantifiable pass/fail criterion, with no input data required. Typical for startup/shutdown tests, retry/backoff policies, state transitions, logging/metrics emission, and resilience scenarios. Example criteria: "startup logs `service ready` within 5s", "retry emits 3 attempts with exponential backoff (base 100ms ± 20ms)", "on SIGTERM, service drains in-flight requests within 30s grace period", "health endpoint returns 503 while migrations run".
  - For behavioral tests the observable (log line, metric value, state transition, emitted event, elapsed time) must still be quantifiable — the test must programmatically decide pass/fail.
  - A test that cannot produce a pass/fail verdict through either shape is not verifiable and must be removed.
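A behavioral-shape criterion such as "startup logs `service ready` within 5s" still reduces to a computed verdict. A minimal sketch in Python (the log path, pattern, and polling interval are illustrative, not prescribed by this skill):

```python
import re
import time
from pathlib import Path


def wait_for_log_line(log_path: Path, pattern: str, timeout_s: float) -> bool:
    """Poll a log file until `pattern` appears or `timeout_s` elapses.

    Returns True (pass) if the observable showed up in time, False (fail)
    otherwise -- a binary verdict, as the behavioral shape requires.
    """
    deadline = time.monotonic() + timeout_s
    compiled = re.compile(pattern)
    while time.monotonic() < deadline:
        if log_path.exists() and any(
            compiled.search(line) for line in log_path.read_text().splitlines()
        ):
            return True
        time.sleep(0.05)  # polling interval: an arbitrary choice for the sketch
    return False
```

The point is that the observable is decided by code, never eyeballed: the function returns a boolean, so the test runner can report pass/fail directly.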

## Context Resolution

Fixed paths:

- PROBLEM_DIR: `_docs/00_problem/`
- SOLUTION_DIR: `_docs/01_solution/`
- DOCUMENT_DIR: `_docs/02_document/`
- TESTS_OUTPUT_DIR: `_docs/02_document/tests/`

Announce the resolved paths and the detected invocation mode (below) to the user before proceeding.

### Invocation Modes

- **full** (default): runs all 4 phases against the whole `PROBLEM_DIR` + `DOCUMENT_DIR`. Used in greenfield Plan Step 1 and existing-code Step 3.
- **cycle-update**: runs only a scoped refresh of the existing test-spec artifacts against the current feature cycle's completed tasks. Used by the existing-code flow's per-cycle sync step. See `modes/cycle-update.md` for the narrowed workflow.

## Input Specification

### Required Files

| File | Purpose |
|------|---------|
| `_docs/00_problem/problem.md` | Problem description and context |
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
| `_docs/00_problem/input_data/` | Reference data examples, expected results, and optional reference files |
| `_docs/01_solution/solution.md` | Finalized solution |

### Expected Results Specification

Every input data item MUST have a corresponding expected result that defines what the system should produce. Expected results MUST be **quantifiable** — the test must be able to programmatically compare actual system output against the expected result and produce a pass/fail verdict.

Expected results live inside `_docs/00_problem/input_data/` in one or both of:

1. **Mapping file** (`input_data/expected_results/results_report.md`): a table pairing each input with its quantifiable expected output, using the format defined in `templates/expected-results.md`
2. **Reference files folder** (`input_data/expected_results/`): machine-readable files (JSON, CSV, etc.) containing full expected outputs for complex cases, referenced from the mapping file

```
input_data/
├── expected_results/           ← required: expected results folder
│   ├── results_report.md       ← required: input→expected result mapping
│   ├── image_01_expected.csv   ← per-file expected detections
│   └── video_01_expected.csv
├── image_01.jpg
├── empty_scene.jpg
└── data_parameters.md
```
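For the tree above, a mapping table in `results_report.md` might look like this. This is illustrative only: the values are hypothetical and the authoritative column format is defined in `templates/expected-results.md`.

```
| Input           | Expected result                                   | Reference file        |
|-----------------|---------------------------------------------------|-----------------------|
| image_01.jpg    | 3 detections, confidence ≥ 0.85, position ± 10px  | image_01_expected.csv |
| empty_scene.jpg | 0 detections, 0 errors                            | (none)                |
```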

**Quantifiability requirements** (see `templates/expected-results.md` for full format and examples):

- Numeric values: exact value or value ± tolerance (e.g., `confidence ≥ 0.85`, `position ± 10px`)
- Structured data: exact JSON/CSV values, or a reference file in `expected_results/`
- Counts: exact counts (e.g., "3 detections", "0 errors")
- Text/patterns: exact string or regex pattern to match
- Timing: threshold (e.g., "response ≤ 500ms")
- Error cases: expected error code, message pattern, or HTTP status
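Each requirement type above reduces to a comparison that code can evaluate. A Python sketch with a hypothetical "actual" output for one test case (field names and values are invented for illustration):

```python
import re

# Hypothetical actual system output for one test case
actual = {
    "x": 118,                  # detected position, px
    "confidence": 0.91,
    "detections": 3,
    "status": "OK: 3 objects",
    "latency_ms": 220,
    "error_code": None,
}

assert abs(actual["x"] - 120) <= 10                        # numeric: value ± tolerance
assert actual["confidence"] >= 0.85                        # numeric: threshold
assert actual["detections"] == 3                           # count: exact
assert re.fullmatch(r"OK: \d+ objects", actual["status"])  # text: regex pattern
assert actual["latency_ms"] <= 500                         # timing: threshold
assert actual["error_code"] is None                        # error case: none expected
```

Every line yields a hard pass/fail; none of them require human judgment.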

### Optional Files (used when available)

| File | Purpose |
|------|---------|
| `DOCUMENT_DIR/architecture.md` | System architecture for environment design |
| `DOCUMENT_DIR/system-flows.md` | System flows for test scenario coverage |
| `DOCUMENT_DIR/components/` | Component specs for interface identification |

### Prerequisite Checks (BLOCKING)

1. `acceptance_criteria.md` exists and is non-empty — **STOP if missing**
2. `restrictions.md` exists and is non-empty — **STOP if missing**
3. `input_data/` exists and contains at least one file — **STOP if missing**
4. `input_data/expected_results/results_report.md` exists and is non-empty — **STOP if missing**. Prompt the user: *"Expected results mapping is required. Please create `_docs/00_problem/input_data/expected_results/results_report.md` pairing each input with its quantifiable expected output. Use `templates/expected-results.md` as the format reference."*
5. `problem.md` exists and is non-empty — **STOP if missing**
6. `solution.md` exists and is non-empty — **STOP if missing**
7. Create TESTS_OUTPUT_DIR if it does not exist
8. If TESTS_OUTPUT_DIR already contains files, ask the user: **resume from last checkpoint or start fresh?**
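Checks 1 through 6 can be sketched as a small gate function (Python; the failure messages are illustrative, and steps 7 and 8 are omitted because they are actions and a user prompt, not file checks):

```python
from pathlib import Path


def prerequisite_failures(problem_dir: Path, solution_dir: Path) -> list[str]:
    """Return the list of BLOCKING failures; an empty list means the gate passes."""
    failures: list[str] = []

    def nonempty(p: Path, label: str) -> None:
        # "exists and is non-empty" check shared by steps 1, 2, 4, 5, 6
        if not p.is_file() or p.stat().st_size == 0:
            failures.append(f"STOP: {label} missing or empty")

    nonempty(problem_dir / "acceptance_criteria.md", "acceptance_criteria.md")
    nonempty(problem_dir / "restrictions.md", "restrictions.md")
    input_data = problem_dir / "input_data"
    if not input_data.is_dir() or not any(input_data.iterdir()):
        failures.append("STOP: input_data/ missing or empty")
    nonempty(input_data / "expected_results" / "results_report.md",
             "input_data/expected_results/results_report.md")
    nonempty(problem_dir / "problem.md", "problem.md")
    nonempty(solution_dir / "solution.md", "solution.md")
    return failures
```

Collecting all failures at once, rather than stopping at the first, lets the user fix everything in one pass before re-running the gate.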

## Artifact Management

### Directory Structure

```
TESTS_OUTPUT_DIR/
├── environment.md
├── test-data.md
├── blackbox-tests.md
├── performance-tests.md
├── resilience-tests.md
├── security-tests.md
├── resource-limit-tests.md
└── traceability-matrix.md
```

### Save Timing

| Phase | Save immediately after | Filename |
|-------|------------------------|----------|
| Phase 1 | Input data analysis (no file — findings feed Phase 2) | — |
| Phase 2 | Environment spec | `environment.md` |
| Phase 2 | Test data spec | `test-data.md` |
| Phase 2 | Blackbox tests | `blackbox-tests.md` |
| Phase 2 | Performance tests | `performance-tests.md` |
| Phase 2 | Resilience tests | `resilience-tests.md` |
| Phase 2 | Security tests | `security-tests.md` |
| Phase 2 | Resource limit tests | `resource-limit-tests.md` |
| Phase 2 | Traceability matrix | `traceability-matrix.md` |
| Phase 3 | Updated test data spec (if data added) | `test-data.md` |
| Phase 3 | Updated test files (if tests removed) | respective test file |
| Phase 3 | Updated traceability matrix (if tests removed) | `traceability-matrix.md` |
| Hardware Assessment | Test Execution section | `environment.md` (updated) |
| Phase 4 | Test runner script | `scripts/run-tests.sh` |
| Phase 4 | Performance test runner script | `scripts/run-performance-tests.sh` |

### Resumability

If TESTS_OUTPUT_DIR already contains files:

1. List existing files and match them to the save timing table above
2. Identify which phase/artifacts are complete
3. Resume from the next incomplete artifact
4. Inform the user which artifacts are being skipped
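Steps 1 through 3 amount to a scan over the Phase 2 artifact list in save order. A sketch (the function name is hypothetical; the artifact order comes from the save timing table, and the scan resumes from the first missing artifact):

```python
from pathlib import Path

# Phase 2 artifacts in their save order, per the save timing table
ARTIFACT_ORDER = [
    "environment.md", "test-data.md", "blackbox-tests.md",
    "performance-tests.md", "resilience-tests.md", "security-tests.md",
    "resource-limit-tests.md", "traceability-matrix.md",
]


def resume_point(tests_dir: Path):
    """Return (artifacts to skip, first missing artifact or None if all exist)."""
    done = []
    for name in ARTIFACT_ORDER:
        if (tests_dir / name).is_file():
            done.append(name)
        else:
            return done, name  # resume here; later files are regenerated
    return done, None
```

The returned `done` list is exactly what step 4 reports to the user as skipped.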

## Progress Tracking

At the start of execution, create a TodoWrite with all four phases (plus the hardware assessment between Phase 3 and Phase 4). Update status as each phase completes.

## Workflow

### Phase 1: Input Data & Expected Results Completeness Analysis

Read and follow `phases/01-input-data-analysis.md`.

---

### Phase 2: Test Scenario Specification

Read and follow `phases/02-test-scenarios.md`.

---

### Phase 3: Test Data Validation Gate (HARD GATE)

Read and follow `phases/03-data-validation-gate.md`.

---

### Hardware-Dependency & Execution Environment Assessment (BLOCKING — runs between Phase 3 and Phase 4)

Read and follow `phases/hardware-assessment.md`.

---

### Phase 4: Test Runner Script Generation

Read and follow `phases/04-runner-scripts.md`.

---

### cycle-update mode

If invoked in `cycle-update` mode (see "Invocation Modes" above), read and follow `modes/cycle-update.md` instead of the full 4-phase workflow.

## Escalation Rules

| Situation | Action |
|-----------|--------|
| Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — specification cannot proceed |
| Missing input_data/expected_results/results_report.md | **STOP** — ask user to provide expected results mapping using the template |
| Ambiguous requirements | ASK user |
| Input data coverage below 75% (Phase 1) | Search internet for supplementary data, ASK user to validate |
| Expected results missing or not quantifiable (Phase 1) | ASK user to provide quantifiable expected results before proceeding |
| Test scenario conflicts with restrictions | ASK user to clarify intent |
| System interfaces unclear (no architecture.md) | ASK user or derive from solution.md |
| Test data or expected result not provided for a test scenario (Phase 3) | WARN user and REMOVE the test |
| Final coverage below 75% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |
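The two 75% rows above are the same ratio check applied at different moments. A sketch (the function name is hypothetical, and what counts as "covered" is defined by Phases 1 and 3, not by this snippet):

```python
def coverage_gate(covered: int, total: int, threshold: float = 0.75) -> bool:
    """True when coverage meets the gate; False means BLOCK/escalate."""
    if total == 0:
        return False  # assumption: an empty scenario set cannot pass the gate
    return covered / total >= threshold
```

Note the boundary is inclusive: exactly 75% coverage passes.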

## Common Mistakes

- **Referencing internals**: tests must be black-box — no internal module names, no direct DB queries against the system under test
- **Vague expected outcomes**: "works correctly" is not a test outcome; use specific measurable values
- **Missing pass/fail criterion**: input/output tests without an expected result, or behavioral tests without a measurable observable — both are unverifiable and must be removed
- **Non-quantifiable criteria**: "should return good results", "works correctly", "behaves properly" — none of these are verifiable. Use exact values, tolerances, thresholds, pattern matches, or timing bounds that code can evaluate.
- **Forcing the wrong shape**: do not invent fake input data for a behavioral test (e.g., "input: SIGTERM signal") just to fit the input/output shape. Classify the test correctly and use the matching checklist.
- **Missing negative scenarios**: every positive scenario category should have corresponding negative/edge-case tests
- **Untraceable tests**: every test should trace to at least one AC or restriction
- **Writing test code**: this skill produces specifications, never implementation code

## Trigger Conditions

When the user wants to:

- Specify blackbox tests before implementation or refactoring
- Analyze input data completeness for test coverage
- Produce test scenarios from acceptance criteria

**Keywords**: "test spec", "test specification", "blackbox test spec", "black box tests", "blackbox tests", "test scenarios"

## Methodology Quick Reference

```
┌──────────────────────────────────────────────────────────────────────┐
│ Test Scenario Specification (4-Phase)                                │
├──────────────────────────────────────────────────────────────────────┤
│ PREREQ: Data Gate (BLOCKING)                                         │
│   → verify AC, restrictions, input_data (incl. expected_results/)    │
│                                                                      │
│ Phase 1: Input Data & Expected Results Completeness Analysis         │
│   → phases/01-input-data-analysis.md                                 │
│   [BLOCKING: user confirms input data + expected results coverage]   │
│                                                                      │
│ Phase 2: Test Scenario Specification                                 │
│   → phases/02-test-scenarios.md                                      │
│   → environment.md · test-data.md · blackbox-tests.md                │
│   → performance-tests.md · resilience-tests.md · security-tests.md   │
│   → resource-limit-tests.md · traceability-matrix.md                 │
│   [BLOCKING: user confirms test coverage]                            │
│                                                                      │
│ Phase 3: Test Data & Expected Results Validation Gate (HARD GATE)    │
│   → phases/03-data-validation-gate.md                                │
│   [BLOCKING: coverage ≥ 75% required to pass]                        │
│                                                                      │
│ Hardware-Dependency Assessment (BLOCKING, pre-Phase-4)               │
│   → phases/hardware-assessment.md                                    │
│                                                                      │
│ Phase 4: Test Runner Script Generation                               │
│   → phases/04-runner-scripts.md                                      │
│   → scripts/run-tests.sh (unit + blackbox)                           │
│   → scripts/run-performance-tests.sh (load/perf scenarios)           │
│                                                                      │
│ cycle-update mode (scoped refresh)                                   │
│   → modes/cycle-update.md                                            │
├──────────────────────────────────────────────────────────────────────┤
│ Principles: Black-box only · Traceability · Save immediately         │
│             Ask don't assume · Spec don't code                       │
│             No test without data · No test without expected result   │
└──────────────────────────────────────────────────────────────────────┘
```