---
name: test-spec
description: |
  Test specification skill. Analyzes input data and expected results
  completeness, then produces detailed test scenarios (blackbox, performance,
  resilience, security, resource limits) that treat the system as a black box.
  Every test pairs input data with quantifiable expected results so tests can
  verify correctness, not just execution. 4-phase workflow: input data +
  expected results analysis, test scenario specification, data + results
  validation gate, test runner script generation. Produces 8 artifacts under
  tests/ and 2 shell scripts under scripts/. Trigger phrases: "test spec",
  "test specification", "test scenarios", "blackbox test spec", "black box
  tests", "blackbox tests", "performance tests", "resilience tests",
  "security tests".
category: build
tags:
disable-model-invocation: true
---
# Test Scenario Specification
Analyze input data completeness and produce detailed black-box test specifications. Tests describe what the system should do given specific inputs — they never reference internals.
## Core Principles

- Black-box only: tests describe observable behavior through public interfaces; no internal implementation details
- Traceability: every test traces to at least one acceptance criterion or restriction
- Save immediately: write artifacts to disk after each phase; never accumulate unsaved work
- Ask, don't assume: when requirements are ambiguous, ask the user before proceeding
- Spec, don't code: this workflow produces test specifications, never test implementation code
- Every test must have a pass/fail criterion. Two acceptable shapes:
  - Input/output shape: concrete input data paired with a quantifiable expected result (exact value, tolerance, threshold, pattern, reference file). Typical for functional blackbox tests, performance tests with load data, and data-processing pipelines.
  - Behavioral shape: a trigger condition + observable system behavior + quantifiable pass/fail criterion, with no input data required. Typical for startup/shutdown tests, retry/backoff policies, state transitions, logging/metrics emission, and resilience scenarios. Example criteria: "startup logs `service ready` within 5s", "retry emits 3 attempts with exponential backoff (base 100ms ± 20ms)", "on SIGTERM, service drains in-flight requests within 30s grace period", "health endpoint returns 503 while migrations run".
  - For behavioral tests the observable (log line, metric value, state transition, emitted event, elapsed time) must still be quantifiable — the test must programmatically decide pass/fail.
  - A test that cannot produce a pass/fail verdict through either shape is not verifiable and must be removed.
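As a sketch of how a runner script might turn a behavioral criterion into a programmatic verdict (the "startup logs `service ready` within 5s" example above), the function below polls a log file until a pattern appears or the deadline expires. The log path, pattern, and one-second polling interval are illustrative assumptions, not part of this skill.

```shell
# Hypothetical runner-script fragment (not mandated by this skill):
# poll a log file for a pattern until a deadline, then emit PASS/FAIL.
check_log_within() {
  local log_file="$1" pattern="$2" deadline="$3" elapsed=0
  while [ "$elapsed" -le "$deadline" ]; do
    if grep -q "$pattern" "$log_file" 2>/dev/null; then
      echo "PASS: '$pattern' seen within ${deadline}s"
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "FAIL: '$pattern' not seen within ${deadline}s"
  return 1
}
```

The point is the shape, not the mechanism: the verdict is computed from an observable (a log line) against a quantified bound (the deadline), with no knowledge of internals.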
## Context Resolution

Fixed paths:

- PROBLEM_DIR: `_docs/00_problem/`
- SOLUTION_DIR: `_docs/01_solution/`
- DOCUMENT_DIR: `_docs/02_document/`
- TESTS_OUTPUT_DIR: `_docs/02_document/tests/`
Announce the resolved paths and the detected invocation mode (below) to the user before proceeding.
## Invocation Modes

- full (default): runs all 4 phases against the whole `PROBLEM_DIR` + `DOCUMENT_DIR`. Used in greenfield Plan Step 1 and existing-code Step 3.
- cycle-update: runs only a scoped refresh of the existing test-spec artifacts against the current feature cycle's completed tasks. Used by the existing-code flow's per-cycle sync step. See `modes/cycle-update.md` for the narrowed workflow.
## Input Specification

### Required Files

| File | Purpose |
|---|---|
| `_docs/00_problem/problem.md` | Problem description and context |
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
| `_docs/00_problem/input_data/` | Reference data examples, expected results, and optional reference files |
| `_docs/01_solution/solution.md` | Finalized solution |
### Expected Results Specification

Every input data item MUST have a corresponding expected result that defines what the system should produce. Expected results MUST be quantifiable — the test must be able to programmatically compare actual system output against the expected result and produce a pass/fail verdict.

Expected results live inside `_docs/00_problem/input_data/` in one or both of:

- Mapping file (`input_data/expected_results/results_report.md`): a table pairing each input with its quantifiable expected output, using the format defined in `templates/expected-results.md`
- Reference files folder (`input_data/expected_results/`): machine-readable files (JSON, CSV, etc.) containing full expected outputs for complex cases, referenced from the mapping file

```
input_data/
├── expected_results/           ← required: expected results folder
│   ├── results_report.md       ← required: input→expected result mapping
│   ├── image_01_expected.csv   ← per-file expected detections
│   └── video_01_expected.csv
├── image_01.jpg
├── empty_scene.jpg
└── data_parameters.md
```
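Following that layout, a quick sanity check that every input image has a per-file expected-results CSV might look like the sketch below. The `.jpg` extension and the `<stem>_expected.csv` naming convention are assumptions read off the example filenames, not requirements of the skill.

```shell
# Hypothetical check: for each input file, look for a matching
# expected_results/<stem>_expected.csv (naming taken from the example tree).
# Returns the number of inputs without a per-file expected CSV.
check_expected_coverage() {
  local input_dir="$1" missing=0 stem
  for f in "$input_dir"/*.jpg; do
    [ -e "$f" ] || continue    # no .jpg inputs at all
    stem=$(basename "$f" .jpg)
    if [ ! -f "$input_dir/expected_results/${stem}_expected.csv" ]; then
      echo "missing expected results for: $stem"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}
```

Inputs covered only by a row in `results_report.md` (rather than a reference file) would be flagged by this sketch, so in practice the check would also consult the mapping file.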
Quantifiability requirements (see `templates/expected-results.md` for full format and examples):

- Numeric values: exact value or value ± tolerance (e.g., `confidence ≥ 0.85`, `position ± 10px`)
- Structured data: exact JSON/CSV values, or a reference file in `expected_results/`
- Counts: exact counts (e.g., "3 detections", "0 errors")
- Text/patterns: exact string or regex pattern to match
- Timing: threshold (e.g., "response ≤ 500ms")
- Error cases: expected error code, message pattern, or HTTP status
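To make "quantifiable" concrete, here is one minimal way a generated runner script could evaluate several of these forms. The helper names are invented for illustration; the skill does not prescribe them.

```shell
# Illustrative verdict helpers (names are hypothetical):
assert_ge()      { awk -v a="$1" -v b="$2" 'BEGIN { exit !(a >= b) }'; }  # e.g. confidence >= 0.85
assert_count()   { [ "$(wc -l < "$1")" -eq "$2" ]; }                      # e.g. "3 detections" as 3 CSV rows
assert_matches() { printf '%s' "$1" | grep -Eq "$2"; }                    # exact string or regex pattern
assert_within()  { awk -v a="$1" -v e="$2" -v t="$3" \
                     'BEGIN { d = a - e; if (d < 0) d = -d; exit !(d <= t) }'; }  # value +/- tolerance
```

Each helper exits zero on pass and nonzero on fail, which is exactly the "programmatic pass/fail verdict" the specification requires of every expected result.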
### Optional Files (used when available)

| File | Purpose |
|---|---|
| `DOCUMENT_DIR/architecture.md` | System architecture for environment design |
| `DOCUMENT_DIR/system-flows.md` | System flows for test scenario coverage |
| `DOCUMENT_DIR/components/` | Component specs for interface identification |
## Prerequisite Checks (BLOCKING)

- `acceptance_criteria.md` exists and is non-empty — STOP if missing
- `restrictions.md` exists and is non-empty — STOP if missing
- `input_data/` exists and contains at least one file — STOP if missing
- `input_data/expected_results/results_report.md` exists and is non-empty — STOP if missing. Prompt the user: "Expected results mapping is required. Please create `_docs/00_problem/input_data/expected_results/results_report.md` pairing each input with its quantifiable expected output. Use `templates/expected-results.md` as the format reference."
- `problem.md` exists and is non-empty — STOP if missing
- `solution.md` exists and is non-empty — STOP if missing
- Create TESTS_OUTPUT_DIR if it does not exist
- If TESTS_OUTPUT_DIR already contains files, ask the user: resume from last checkpoint or start fresh?
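The blocking checks above reduce to a fail-fast loop over required files. The sketch below shows one possible scripting of them; it omits the `input_data/` non-empty check for brevity and is not the skill's mandated mechanism.

```shell
# Sketch: fail fast on the first missing or empty required file,
# then create TESTS_OUTPUT_DIR if it does not exist.
check_prereqs() {
  local f
  for f in \
    _docs/00_problem/problem.md \
    _docs/00_problem/acceptance_criteria.md \
    _docs/00_problem/restrictions.md \
    _docs/00_problem/input_data/expected_results/results_report.md \
    _docs/01_solution/solution.md
  do
    if [ ! -s "$f" ]; then    # -s: file exists and is non-empty
      echo "STOP: missing or empty $f"
      return 1
    fi
  done
  mkdir -p _docs/02_document/tests   # TESTS_OUTPUT_DIR
  echo "prerequisites OK"
}
```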
## Artifact Management

### Directory Structure

```
TESTS_OUTPUT_DIR/
├── environment.md
├── test-data.md
├── blackbox-tests.md
├── performance-tests.md
├── resilience-tests.md
├── security-tests.md
├── resource-limit-tests.md
└── traceability-matrix.md
```
### Save Timing
| Phase | Save immediately after | Filename |
|---|---|---|
| Phase 1 | Input data analysis (no file — findings feed Phase 2) | — |
| Phase 2 | Environment spec | environment.md |
| Phase 2 | Test data spec | test-data.md |
| Phase 2 | Blackbox tests | blackbox-tests.md |
| Phase 2 | Performance tests | performance-tests.md |
| Phase 2 | Resilience tests | resilience-tests.md |
| Phase 2 | Security tests | security-tests.md |
| Phase 2 | Resource limit tests | resource-limit-tests.md |
| Phase 2 | Traceability matrix | traceability-matrix.md |
| Phase 3 | Updated test data spec (if data added) | test-data.md |
| Phase 3 | Updated test files (if tests removed) | respective test file |
| Phase 3 | Updated traceability matrix (if tests removed) | traceability-matrix.md |
| Hardware Assessment | Test Execution section | environment.md (updated) |
| Phase 4 | Test runner script | scripts/run-tests.sh |
| Phase 4 | Performance test runner script | scripts/run-performance-tests.sh |
### Resumability
If TESTS_OUTPUT_DIR already contains files:
- List existing files and match them to the save timing table above
- Identify which phase/artifacts are complete
- Resume from the next incomplete artifact
- Inform the user which artifacts are being skipped
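One way to mechanize this resume check, walking the Phase 2 artifacts in save order and reporting the first missing one, is sketched below; the artifact list and its order come from the save timing table, while the function name is invented.

```shell
# Sketch: report the first Phase 2 artifact (in save order) that is
# missing or empty, i.e. the point to resume from.
next_incomplete_artifact() {
  local dir="$1" a
  for a in environment.md test-data.md blackbox-tests.md \
           performance-tests.md resilience-tests.md security-tests.md \
           resource-limit-tests.md traceability-matrix.md; do
    if [ ! -s "$dir/$a" ]; then
      echo "resume at: $a"
      return 0
    fi
  done
  echo "all Phase 2 artifacts complete"
}
```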
### Progress Tracking
At the start of execution, create a TodoWrite with all four phases (plus the hardware assessment between Phase 3 and Phase 4). Update status as each phase completes.
## Workflow

### Phase 1: Input Data & Expected Results Completeness Analysis

Read and follow `phases/01-input-data-analysis.md`.

### Phase 2: Test Scenario Specification

Read and follow `phases/02-test-scenarios.md`.

### Phase 3: Test Data Validation Gate (HARD GATE)

Read and follow `phases/03-data-validation-gate.md`.

### Hardware-Dependency & Execution Environment Assessment (BLOCKING — runs between Phase 3 and Phase 4)

Read and follow `phases/hardware-assessment.md`.

### Phase 4: Test Runner Script Generation

Read and follow `phases/04-runner-scripts.md`.

### cycle-update mode

If invoked in cycle-update mode (see "Invocation Modes" above), read and follow `modes/cycle-update.md` instead of the full 4-phase workflow.
## Escalation Rules
| Situation | Action |
|---|---|
| Missing acceptance_criteria.md, restrictions.md, or input_data/ | STOP — specification cannot proceed |
| Missing input_data/expected_results/results_report.md | STOP — ask user to provide expected results mapping using the template |
| Ambiguous requirements | ASK user |
| Input data coverage below 75% (Phase 1) | Search internet for supplementary data, ASK user to validate |
| Expected results missing or not quantifiable (Phase 1) | ASK user to provide quantifiable expected results before proceeding |
| Test scenario conflicts with restrictions | ASK user to clarify intent |
| System interfaces unclear (no architecture.md) | ASK user or derive from solution.md |
| Test data or expected result not provided for a test scenario (Phase 3) | WARN user and REMOVE the test |
| Final coverage below 75% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |
## Common Mistakes
- Referencing internals: tests must be black-box — no internal module names, no direct DB queries against the system under test
- Vague expected outcomes: "works correctly" is not a test outcome; use specific measurable values
- Missing pass/fail criterion: input/output tests without an expected result, OR behavioral tests without a measurable observable — both are unverifiable and must be removed
- Non-quantifiable criteria: "should return good results", "works correctly", "behaves properly" — not verifiable. Use exact values, tolerances, thresholds, pattern matches, or timing bounds that code can evaluate.
- Forcing the wrong shape: do not invent fake input data for a behavioral test (e.g., "input: SIGTERM signal") just to fit the input/output shape. Classify the test correctly and use the matching checklist.
- Missing negative scenarios: every positive scenario category should have corresponding negative/edge-case tests
- Untraceable tests: every test should trace to at least one AC or restriction
- Writing test code: this skill produces specifications, never implementation code
## Trigger Conditions
When the user wants to:
- Specify blackbox tests before implementation or refactoring
- Analyze input data completeness for test coverage
- Produce test scenarios from acceptance criteria
Keywords: "test spec", "test specification", "blackbox test spec", "black box tests", "blackbox tests", "test scenarios"
## Methodology Quick Reference

```
┌──────────────────────────────────────────────────────────────────────┐
│ Test Scenario Specification (4-Phase)                                │
├──────────────────────────────────────────────────────────────────────┤
│ PREREQ: Data Gate (BLOCKING)                                         │
│   → verify AC, restrictions, input_data (incl. expected_results/)    │
│                                                                      │
│ Phase 1: Input Data & Expected Results Completeness Analysis         │
│   → phases/01-input-data-analysis.md                                 │
│   [BLOCKING: user confirms input data + expected results coverage]   │
│                                                                      │
│ Phase 2: Test Scenario Specification                                 │
│   → phases/02-test-scenarios.md                                      │
│   → environment.md · test-data.md · blackbox-tests.md                │
│   → performance-tests.md · resilience-tests.md · security-tests.md   │
│   → resource-limit-tests.md · traceability-matrix.md                 │
│   [BLOCKING: user confirms test coverage]                            │
│                                                                      │
│ Phase 3: Test Data & Expected Results Validation Gate (HARD GATE)    │
│   → phases/03-data-validation-gate.md                                │
│   [BLOCKING: coverage ≥ 75% required to pass]                        │
│                                                                      │
│ Hardware-Dependency Assessment (BLOCKING, pre-Phase-4)               │
│   → phases/hardware-assessment.md                                    │
│                                                                      │
│ Phase 4: Test Runner Script Generation                               │
│   → phases/04-runner-scripts.md                                      │
│   → scripts/run-tests.sh (unit + blackbox)                           │
│   → scripts/run-performance-tests.sh (load/perf scenarios)           │
│                                                                      │
│ cycle-update mode (scoped refresh)                                   │
│   → modes/cycle-update.md                                            │
├──────────────────────────────────────────────────────────────────────┤
│ Principles: Black-box only · Traceability · Save immediately         │
│             Ask don't assume · Spec don't code                       │
│             No test without data · No test without expected result   │
└──────────────────────────────────────────────────────────────────────┘
```