# Phase 4: Test Runner Script Generation

**Skip condition**: If this skill was invoked from the `/plan` skill (planning context, no code exists yet), skip Phase 4 entirely. Script creation should instead be planned as a task during decompose — the decomposer creates a task for writing these scripts. Phase 4 runs only when invoked from the existing-code flow (where source code already exists) or standalone.

**Role**: DevOps engineer

**Goal**: Generate executable shell scripts that run the specified tests, so autodev and CI can invoke them consistently.

**Constraints**: Scripts must be idempotent, portable across dev/CI, and exit non-zero on failure. Respect the Hardware-Dependency Assessment decision recorded in `environment.md`.

**Prerequisite**: `phases/hardware-assessment.md` must have completed and written the "Test Execution" section to `TESTS_OUTPUT_DIR/environment.md`.

## Step 1 — Detect test infrastructure

1. Identify the project's test runner from manifests and config files:
   - Python: `pytest` (`pyproject.toml`, `setup.cfg`, `pytest.ini`)
   - .NET: `dotnet test` (`*.csproj`, `*.sln`)
   - Rust: `cargo test` (`Cargo.toml`)
   - Node: `npm test` or `vitest` / `jest` (`package.json`)
2. Check the Hardware-Dependency Assessment result recorded in `environment.md`:
   - If **local execution** was chosen → do NOT generate docker-compose test files; scripts run directly on the host
   - If **Docker execution** was chosen → identify/generate docker-compose files for integration/blackbox tests
   - If **both** was chosen → generate both
3. Identify performance/load-testing tools from dependencies (`k6`, `locust`, `artillery`, `wrk`, or built-in benchmarks)
4. Read `TESTS_OUTPUT_DIR/environment.md` for infrastructure requirements

## Step 2 — Generate test runner

**Docker is the default.** Only generate a local `scripts/run-tests.sh` if the Hardware-Dependency Assessment determined **local** or **both** execution (i.e., the project requires real hardware such as GPU/CoreML/TPU/sensors). For all other projects, use `docker-compose.test.yml` — it provides reproducibility, isolation, and CI parity without a custom shell script.

**If a local script is needed** — create `scripts/run-tests.sh` at the project root using `.cursor/skills/test-spec/templates/run-tests-script.md` as structural guidance. The script must:

1. Set `set -euo pipefail` and trap cleanup on EXIT
2. **Install all project and test dependencies** (e.g., `pip install -q -r requirements.txt -r e2e/requirements.txt`, `dotnet restore`, `npm ci`). This prevents collection-time import errors on fresh environments.
3. Optionally accept a `--unit-only` flag to skip blackbox tests
4. Run unit/blackbox tests using the detected test runner (activate the virtualenv if present; run the test runner directly on the host)
5. Print a summary of passed/failed/skipped tests
6. Exit 0 on all pass, exit 1 on any failure

**If Docker** — generate or update `docker-compose.test.yml` so that it builds the test image, installs all dependencies inside the container, runs the test suite, and exits with the test runner's exit code.

## Step 3 — Generate `scripts/run-performance-tests.sh`

Create `scripts/run-performance-tests.sh` at the project root. The script must:

1. Set `set -euo pipefail` and trap cleanup on EXIT
2. Read thresholds from `_docs/02_document/tests/performance-tests.md` (or accept them as CLI args)
3. Start the system under test (local or docker-compose, matching the Hardware-Dependency Assessment decision)
4. Run load/performance scenarios using the detected tool
5. Compare results against the threshold values from the test spec
6. Print a pass/fail summary per scenario
7. Exit 0 if all thresholds are met, exit 1 otherwise

## Step 4 — Verify scripts

1. Verify both scripts are syntactically valid (`bash -n scripts/run-tests.sh`)
2. Mark both scripts as executable (`chmod +x`)
3. Present a summary of what each script does to the user

## Save action

Write `scripts/run-tests.sh` and `scripts/run-performance-tests.sh` to the project root.
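Step 1's manifest-based detection can be sketched as a small shell helper. The function name is illustrative; the file checks mirror the runner table in Step 1:

```shell
# Sketch: detect the project's test runner from manifest/config files.
# detect_test_runner DIR — prints the runner command, or "unknown".
detect_test_runner() {
  local dir="${1:-.}"
  if [ -f "$dir/pyproject.toml" ] || [ -f "$dir/setup.cfg" ] || [ -f "$dir/pytest.ini" ]; then
    echo "pytest"
  elif ls "$dir"/*.csproj "$dir"/*.sln >/dev/null 2>&1; then
    echo "dotnet test"
  elif [ -f "$dir/Cargo.toml" ]; then
    echo "cargo test"
  elif [ -f "$dir/package.json" ]; then
    echo "npm test"   # refine to vitest/jest by inspecting package.json scripts
  else
    echo "unknown"
  fi
}
```

In practice the generated scripts would call this once and reuse the result for both the unit and blackbox runs.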
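For the local-execution case, the Step 2 requirements for `scripts/run-tests.sh` might be sketched as below, assuming a pytest project. The `tests/unit` path and the requirements files are placeholder assumptions to adapt per project; the logic is kept in functions, with the actual invocation shown as the final comment:

```shell
#!/usr/bin/env bash
# Sketch of scripts/run-tests.sh (local execution). Assumes pytest;
# swap install_deps/run_suite bodies for the runner detected in Step 1.
set -euo pipefail

cleanup() {
  # Stop any services started for blackbox tests, remove temp files, etc.
  :
}
trap cleanup EXIT

UNIT_ONLY=0
parse_args() {
  for arg in "$@"; do
    case "$arg" in
      --unit-only) UNIT_ONLY=1 ;;
      *) echo "usage: $0 [--unit-only]" >&2; return 2 ;;
    esac
  done
}

install_deps() {
  # Installing everything up front prevents collection-time import errors
  # on fresh environments (Step 2, item 2).
  pip install -q -r requirements.txt -r e2e/requirements.txt
}

run_suite() {
  if [ "$UNIT_ONLY" -eq 1 ]; then
    pytest tests/unit -q      # placeholder path: unit tests only
  else
    pytest -q                 # unit + blackbox
  fi
  # pytest exits non-zero on any failure; set -e propagates that as exit 1.
}

# The real script would end with:
#   parse_args "$@"
#   install_deps
#   run_suite
```

The pass/failed/skipped summary (item 5) comes for free from pytest's own output; other runners may need an explicit summary step.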
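The threshold comparison in Step 3 (item 5) could use a helper along these lines. `check_threshold` and the millisecond units are illustrative, not part of the spec; `awk` is used so fractional measurements compare correctly:

```shell
# Sketch: per-scenario threshold check for scripts/run-performance-tests.sh.
# check_threshold NAME MEASURED MAX — pass when MEASURED <= MAX (e.g. ms).
check_threshold() {
  local name="$1" measured="$2" max="$3"
  if awk -v m="$measured" -v t="$max" 'BEGIN { exit !(m <= t) }'; then
    echo "PASS $name (${measured}ms <= ${max}ms)"
    return 0
  else
    echo "FAIL $name (${measured}ms > ${max}ms)"
    return 1
  fi
}
```

The script would call this once per scenario, accumulate failures, and exit 1 if any check failed (Step 3, items 6-7).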
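The Step 4 checks are two commands, demonstrated here on a throwaway script so the snippet is self-contained; in Phase 4 they run against `scripts/run-tests.sh` and `scripts/run-performance-tests.sh`:

```shell
# Syntax-check and mark executable (Step 4, items 1-2), on a temp script.
tmp=$(mktemp)
printf '#!/usr/bin/env bash\necho ok\n' > "$tmp"
bash -n "$tmp"     # exits non-zero if the script fails to parse
chmod +x "$tmp"
[ -x "$tmp" ] && echo "script verified"
```

Note that `bash -n` only parses; it does not catch runtime errors such as missing dependencies.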