Phase 4: Test Runner Script Generation
Skip condition: If this skill was invoked from the /plan skill (planning context, no code exists yet), skip Phase 4 entirely. Script creation should instead be planned as a task during decompose — the decomposer creates a task for creating these scripts. Phase 4 only runs when invoked from the existing-code flow (where source code already exists) or standalone.
Role: DevOps engineer
Goal: Generate executable shell scripts that run the specified tests, so autodev and CI can invoke them consistently.
Constraints: Scripts must be idempotent, portable across dev/CI, and exit with non-zero on failure. Respect the Hardware-Dependency Assessment decision recorded in `environment.md`.
Prerequisite: `phases/hardware-assessment.md` must have completed and written the "Test Execution" section to `TESTS_OUTPUT_DIR/environment.md`.
Step 1 — Detect test infrastructure
- Identify the project's test runner from manifests and config files:
  - Python: `pytest` (`pyproject.toml`, `setup.cfg`, `pytest.ini`)
  - .NET: `dotnet test` (`*.csproj`, `*.sln`)
  - Rust: `cargo test` (`Cargo.toml`)
  - Node: `npm test` or `vitest`/`jest` (`package.json`)
- Check the Hardware-Dependency Assessment result recorded in `environment.md`:
  - If local execution was chosen → do NOT generate docker-compose test files; scripts run directly on the host
  - If Docker execution was chosen → identify/generate docker-compose files for integration/blackbox tests
  - If both was chosen → generate both
- Identify performance/load testing tools from dependencies (`k6`, `locust`, `artillery`, `wrk`, or built-in benchmarks)
- Read `TESTS_OUTPUT_DIR/environment.md` for infrastructure requirements
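The detection step above can be sketched as a small shell function (a minimal sketch assuming a single-root project layout; the function name and the fallback value are illustrative):

```shell
#!/usr/bin/env bash
# Hypothetical sketch: detect the project's test runner from manifest files.
detect_runner() {
  local dir="$1"
  if [ -f "$dir/pyproject.toml" ] || [ -f "$dir/setup.cfg" ] || [ -f "$dir/pytest.ini" ]; then
    echo "pytest"
  elif ls "$dir"/*.csproj "$dir"/*.sln >/dev/null 2>&1; then
    echo "dotnet test"
  elif [ -f "$dir/Cargo.toml" ]; then
    echo "cargo test"
  elif [ -f "$dir/package.json" ]; then
    echo "npm test"
  else
    echo "unknown"   # fall back; ask the user or inspect further
  fi
}
```

A real implementation would also distinguish `vitest`/`jest` by reading the `scripts` section of `package.json` rather than stopping at the file's presence.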
Step 2 — Generate test runner
Docker is the default. Only generate a local `scripts/run-tests.sh` if the Hardware-Dependency Assessment determined local or both execution (i.e., the project requires real hardware such as GPU/CoreML/TPU/sensors). For all other projects, use `docker-compose.test.yml` — it provides reproducibility, isolation, and CI parity without a custom shell script.
If a local script is needed — create `scripts/run-tests.sh` at the project root using `.cursor/skills/test-spec/templates/run-tests-script.md` as structural guidance. The script must:
- Set `set -euo pipefail` and trap cleanup on EXIT
- Install all project and test dependencies (e.g. `pip install -q -r requirements.txt -r e2e/requirements.txt`, `dotnet restore`, `npm ci`). This prevents collection-time import errors on fresh environments.
- Optionally accept a `--unit-only` flag to skip blackbox tests
- Run unit/blackbox tests using the detected test runner (activate the virtualenv if present; run the test runner directly on the host)
- Print a summary of passed/failed/skipped tests
- Exit 0 when all tests pass, exit 1 on any failure
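The requirements above can be sketched as follows for a Python/pytest project (a minimal sketch: the dependency file, test paths, and function names are assumptions; the real script adapts to the detected runner and ends by invoking `run_suite "$@"`):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of scripts/run-tests.sh for a Python/pytest project.
set -euo pipefail

cleanup() { :; }     # stop services, remove temp files, etc.
trap cleanup EXIT

# Parse flags: --unit-only restricts the run to unit tests only
parse_mode() {
  local mode="all" arg
  for arg; do
    [ "$arg" = "--unit-only" ] && mode="unit"
  done
  echo "$mode"
}

run_suite() {
  # Install dependencies first to avoid collection-time import errors
  pip install -q -r requirements.txt
  if [ "$(parse_mode "$@")" = "unit" ]; then
    pytest tests/unit   # skip blackbox tests
  else
    pytest tests        # full suite; pytest exits non-zero on any failure
  fi
}

# In the real script: run_suite "$@"
```

Because pytest already exits non-zero on failure and prints a passed/failed/skipped summary, the script only needs to propagate that exit code rather than re-count results.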
If Docker — generate or update `docker-compose.test.yml` that builds the test image, installs all dependencies inside the container, runs the test suite, and exits with the test runner's exit code.
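For the Docker path, a minimal sketch of such a compose file for a Python project (the service name `tests` and the dependency file are assumptions):

```yaml
# Hypothetical docker-compose.test.yml sketch; service name and paths are assumptions
services:
  tests:
    build: .
    # The container's exit code becomes the suite's result
    command: sh -c "pip install -q -r requirements.txt && pytest"
```

Run it with `docker compose -f docker-compose.test.yml run --rm tests` (or `up --build --exit-code-from tests`); both forms propagate the test runner's exit code to the caller, which is what CI needs.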
Step 3 — Generate `scripts/run-performance-tests.sh`
Create `scripts/run-performance-tests.sh` at the project root. The script must:
- Set `set -euo pipefail` and trap cleanup on EXIT
- Read thresholds from `_docs/02_document/tests/performance-tests.md` (or accept them as CLI args)
- Start the system under test (local or docker-compose, matching the Hardware-Dependency Assessment decision)
- Run load/performance scenarios using the detected tool
- Compare results against the threshold values from the test spec
- Print a pass/fail summary per scenario
- Exit 0 if all thresholds are met, exit 1 otherwise
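The threshold-comparison step can be sketched as a shell helper (a minimal sketch; the function name is illustrative, and parsing thresholds out of `performance-tests.md` is assumed to happen elsewhere — here the limit arrives as an argument):

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare one measured metric against its threshold.
set -euo pipefail

check_threshold() {
  local name="$1" measured="$2" limit="$3"
  # awk handles floating-point comparison, which bash's [ ] cannot
  if awk -v m="$measured" -v l="$limit" 'BEGIN { exit !(m <= l) }'; then
    echo "PASS $name (${measured} <= ${limit})"
    return 0
  else
    echo "FAIL $name (${measured} > ${limit})"
    return 1
  fi
}
```

The real script would call this once per scenario, accumulate failures, and exit 1 if any scenario failed, satisfying the pass/fail-summary and exit-code requirements above.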
Step 4 — Verify scripts
- Verify both scripts are syntactically valid (`bash -n scripts/run-tests.sh`)
- Mark both scripts as executable (`chmod +x`)
- Present a summary of what each script does to the user
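The verification step can be sketched as a small helper applied to each generated script (a minimal sketch; the function name is illustrative):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the verification step for a generated script.
set -euo pipefail

verify_script() {
  local s="$1"
  bash -n "$s"    # syntax check only; exits non-zero on a parse error
  chmod +x "$s"   # mark executable
  [ -x "$s" ]     # confirm the permission bit took effect
}

# In the real flow:
# verify_script scripts/run-tests.sh
# verify_script scripts/run-performance-tests.sh
```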
Save action
Write `scripts/run-tests.sh` and `scripts/run-performance-tests.sh` to the project root.