Phase 4: Test Runner Script Generation
Skip condition: If this skill was invoked from the /plan skill (planning context, no code exists yet), skip Phase 4 entirely. Script creation should instead be planned as a task during decompose — the decomposer creates a task for creating these scripts. Phase 4 only runs when invoked from the existing-code flow (where source code already exists) or standalone.
Role: DevOps engineer
Goal: Generate executable shell scripts that run the specified tests, so autodev and CI can invoke them consistently.
Constraints: Scripts must be idempotent, portable across dev/CI, and exit with non-zero on failure. Respect the Hardware-Dependency Assessment decision recorded in `environment.md`.
Prerequisite: `phases/hardware-assessment.md` must have completed and written the "Test Execution" section to `TESTS_OUTPUT_DIR/environment.md`.
Step 1 — Detect test infrastructure
- Identify the project's test runner from manifests and config files:
  - Python: `pytest` (`pyproject.toml`, `setup.cfg`, `pytest.ini`)
  - .NET: `dotnet test` (`*.csproj`, `*.sln`)
  - Rust: `cargo test` (`Cargo.toml`)
  - Node: `npm test` or `vitest`/`jest` (`package.json`)
- Check the Hardware-Dependency Assessment result recorded in `environment.md`:
  - If local execution was chosen → do NOT generate docker-compose test files; scripts run directly on the host
  - If Docker execution was chosen → identify/generate docker-compose files for integration/blackbox tests
  - If both was chosen → generate both
- Identify performance/load testing tools from dependencies (`k6`, `locust`, `artillery`, `wrk`, or built-in benchmarks)
- Read `TESTS_OUTPUT_DIR/environment.md` for infrastructure requirements
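The detection step above can be sketched as a small shell function (a minimal sketch assuming a single-root project layout; the function name and the fallback value are illustrative):

```shell
#!/usr/bin/env bash
# Hypothetical sketch: detect the project's test runner from manifest files.
detect_runner() {
  local dir="$1"
  if [ -f "$dir/pyproject.toml" ] || [ -f "$dir/setup.cfg" ] || [ -f "$dir/pytest.ini" ]; then
    echo "pytest"
  elif ls "$dir"/*.csproj "$dir"/*.sln >/dev/null 2>&1; then
    echo "dotnet test"
  elif [ -f "$dir/Cargo.toml" ]; then
    echo "cargo test"
  elif [ -f "$dir/package.json" ]; then
    echo "npm test"
  else
    echo "unknown"   # fall back; ask the user or inspect further
  fi
}
```

A real implementation would also distinguish `vitest`/`jest` by reading the `scripts` section of `package.json` rather than stopping at the file's presence.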
Step 2 — Generate test runner
Docker is the default. Only generate a local `scripts/run-tests.sh` if the Hardware-Dependency Assessment determined local or both execution (i.e., the project requires real hardware such as GPU/CoreML/TPU/sensors). For all other projects, use `docker-compose.test.yml` — it provides reproducibility, isolation, and CI parity without a custom shell script.
If a local script is needed — create `scripts/run-tests.sh` at the project root using `.cursor/skills/test-spec/templates/run-tests-script.md` as structural guidance. The script must:
- Set `set -euo pipefail` and trap cleanup on EXIT
- Install all project and test dependencies (e.g. `pip install -q -r requirements.txt -r e2e/requirements.txt`, `dotnet restore`, `npm ci`). This prevents collection-time import errors on fresh environments.
- Optionally accept a `--unit-only` flag to skip blackbox tests
- Run unit/blackbox tests using the detected test runner (activate the virtualenv if present; run the test runner directly on the host)
- Print a summary of passed/failed/skipped tests
- Exit 0 when all tests pass, exit 1 on any failure
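The requirements above can be sketched as follows for a Python/pytest project (a minimal sketch: the dependency file, test paths, and function names are assumptions; the real script adapts to the detected runner and ends by invoking `run_suite "$@"`):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of scripts/run-tests.sh for a Python/pytest project.
set -euo pipefail

cleanup() { :; }     # stop services, remove temp files, etc.
trap cleanup EXIT

# Parse flags: --unit-only restricts the run to unit tests only
parse_mode() {
  local mode="all" arg
  for arg; do
    [ "$arg" = "--unit-only" ] && mode="unit"
  done
  echo "$mode"
}

run_suite() {
  # Install dependencies first to avoid collection-time import errors
  pip install -q -r requirements.txt
  if [ "$(parse_mode "$@")" = "unit" ]; then
    pytest tests/unit   # skip blackbox tests
  else
    pytest tests        # full suite; pytest exits non-zero on any failure
  fi
}

# In the real script: run_suite "$@"
```

Because pytest already exits non-zero on failure and prints a passed/failed/skipped summary, the script only needs to propagate that exit code rather than re-count results.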
If Docker — generate or update `docker-compose.test.yml` that builds the test image, installs all dependencies inside the container, runs the test suite, and exits with the test runner's exit code.
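For the Docker path, a minimal sketch of such a compose file for a Python project (the service name `tests` and the dependency file are assumptions):

```yaml
# Hypothetical docker-compose.test.yml sketch; service name and paths are assumptions
services:
  tests:
    build: .
    # The container's exit code becomes the suite's result
    command: sh -c "pip install -q -r requirements.txt && pytest"
```

Run it with `docker compose -f docker-compose.test.yml run --rm tests` (or `up --build --exit-code-from tests`); both forms propagate the test runner's exit code to the caller, which is what CI needs.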
Step 3 — Generate `scripts/run-performance-tests.sh`
Create `scripts/run-performance-tests.sh` at the project root. The script must:
- Set `set -euo pipefail` and trap cleanup on EXIT
- Read thresholds from `_docs/02_document/tests/performance-tests.md` (or accept them as CLI args)
- Start the system under test (local or docker-compose, matching the Hardware-Dependency Assessment decision)
- Run load/performance scenarios using the detected tool
- Compare results against the threshold values from the test spec
- Print a pass/fail summary per scenario
- Exit 0 if all thresholds are met, exit 1 otherwise
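The threshold-comparison step can be sketched as a shell helper (a minimal sketch; the function name is illustrative, and parsing thresholds out of `performance-tests.md` is assumed to happen elsewhere — here the limit arrives as an argument):

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare one measured metric against its threshold.
set -euo pipefail

check_threshold() {
  local name="$1" measured="$2" limit="$3"
  # awk handles floating-point comparison, which bash's [ ] cannot
  if awk -v m="$measured" -v l="$limit" 'BEGIN { exit !(m <= l) }'; then
    echo "PASS $name (${measured} <= ${limit})"
    return 0
  else
    echo "FAIL $name (${measured} > ${limit})"
    return 1
  fi
}
```

The real script would call this once per scenario, accumulate failures, and exit 1 if any scenario failed, satisfying the pass/fail-summary and exit-code requirements above.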
Step 4 — Verify scripts
- Verify both scripts are syntactically valid (`bash -n scripts/run-tests.sh`)
- Mark both scripts as executable (`chmod +x`)
- Present a summary of what each script does to the user
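The verification step can be sketched as a small helper applied to each generated script (a minimal sketch; the function name is illustrative):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the verification step for a generated script.
set -euo pipefail

verify_script() {
  local s="$1"
  bash -n "$s"    # syntax check only; exits non-zero on a parse error
  chmod +x "$s"   # mark executable
  [ -x "$s" ]     # confirm the permission bit took effect
}

# In the real flow:
# verify_script scripts/run-tests.sh
# verify_script scripts/run-performance-tests.sh
```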
Save action
Write `scripts/run-tests.sh` and `scripts/run-performance-tests.sh` to the project root.