Phase 4: Test Runner Script Generation

Skip condition: If this skill was invoked from the /plan skill (planning context, no code exists yet), skip Phase 4 entirely; instead, the decomposer plans script creation as a task during decompose. Phase 4 runs only when this skill is invoked from the existing-code flow (source code already exists) or standalone.

Role: DevOps engineer
Goal: Generate executable shell scripts that run the specified tests, so autodev and CI can invoke them consistently.
Constraints: Scripts must be idempotent, portable across dev and CI, and exit with non-zero status on failure. Respect the Hardware-Dependency Assessment decision recorded in environment.md.

Prerequisite: phases/hardware-assessment.md must have completed and written the "Test Execution" section to TESTS_OUTPUT_DIR/environment.md.

Step 1 — Detect test infrastructure

  1. Identify the project's test runner from manifests and config files:
    • Python: pytest (pyproject.toml, setup.cfg, pytest.ini)
    • .NET: dotnet test (*.csproj, *.sln)
    • Rust: cargo test (Cargo.toml)
    • Node: npm test, vitest, or jest (package.json)
  2. Check the Hardware-Dependency Assessment result recorded in environment.md:
    • If local execution was chosen → do NOT generate docker-compose test files; scripts run directly on host
    • If Docker execution was chosen → identify/generate docker-compose files for integration/blackbox tests
    • If both was chosen → generate both the local script and the docker-compose files
  3. Identify performance/load testing tools from dependencies (k6, locust, artillery, wrk, or built-in benchmarks)
  4. Read TESTS_OUTPUT_DIR/environment.md for infrastructure requirements
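The manifest-based detection in step 1 can be sketched as a small shell function. This is a minimal sketch, not the skill's actual implementation; it checks only the manifest files named above and assumes it runs from the project root.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: detect the project's test runner from manifest files.
# Order matters only when multiple manifests coexist; Python wins here by choice.
set -euo pipefail

detect_runner() {
  if [ -f pytest.ini ] || [ -f setup.cfg ] || [ -f pyproject.toml ]; then
    echo "pytest"
  elif ls ./*.sln ./*.csproj >/dev/null 2>&1; then   # unmatched globs make ls fail
    echo "dotnet test"
  elif [ -f Cargo.toml ]; then
    echo "cargo test"
  elif [ -f package.json ]; then
    echo "npm test"
  else
    echo "unknown"
  fi
}
```

A real implementation would also inspect package.json scripts to distinguish vitest from jest, which a file-existence check alone cannot do.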

Step 2 — Generate test runner

Docker is the default. Only generate a local scripts/run-tests.sh if the Hardware-Dependency Assessment determined local or both execution (i.e., the project requires real hardware like GPU/CoreML/TPU/sensors). For all other projects, use docker-compose.test.yml — it provides reproducibility, isolation, and CI parity without a custom shell script.

If local script is needed — create scripts/run-tests.sh at the project root using .cursor/skills/test-spec/templates/run-tests-script.md as structural guidance. The script must:

  1. Set set -euo pipefail and trap cleanup on EXIT
  2. Install all project and test dependencies (e.g. pip install -q -r requirements.txt -r e2e/requirements.txt, dotnet restore, npm ci). This prevents collection-time import errors on fresh environments.
  3. Optionally accept a --unit-only flag to skip blackbox tests
  4. Run unit/blackbox tests using the detected test runner (activate virtualenv if present, run test runner directly on host)
  5. Print a summary of passed/failed/skipped tests
  6. Exit 0 on all pass, exit 1 on any failure
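The six requirements above can be sketched as a skeleton for a pytest-based project. This is an illustrative sketch, not the template from run-tests-script.md; the `DRY_RUN` wrapper, the cleanup target, and the test paths are assumptions added so the skeleton is safe to run as-is.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of scripts/run-tests.sh (pytest example).
# DRY_RUN=1 (default) prints commands instead of executing them.
set -euo pipefail
: "${DRY_RUN:=1}"

run() {                                  # execute or echo, depending on DRY_RUN
  if [ "$DRY_RUN" -eq 1 ]; then echo "+ $*"; else "$@"; fi
}

cleanup() { run rm -rf .test-tmp; }      # illustrative cleanup target
trap cleanup EXIT                        # 1. cleanup on any exit path

UNIT_ONLY=0                              # 3. optional --unit-only flag
[ "${1:-}" = "--unit-only" ] && UNIT_ONLY=1

# 2. Install dependencies up front to avoid collection-time import errors
run pip install -q -r requirements.txt

# 4. Activate a virtualenv if present, then run the detected test runner
[ -d .venv ] && run source .venv/bin/activate

run pytest tests/unit
[ "$UNIT_ONLY" -eq 0 ] && run pytest tests/blackbox

echo "done"   # 5./6. pytest prints the summary; set -e makes any failure exit 1
```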

If Docker — generate or update docker-compose.test.yml that builds the test image, installs all dependencies inside the container, runs the test suite, and exits with the test runner's exit code.
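For the Docker path, a minimal compose file might look like the following sketch. The service name, Dockerfile, and install/run command are illustrative assumptions for a Python project, not part of this skill's template:

```yaml
# Hypothetical docker-compose.test.yml sketch; names are illustrative.
services:
  tests:
    build:
      context: .
      dockerfile: Dockerfile
    # Install dependencies and run the suite inside the container; the
    # container's exit code becomes the test result (non-zero on failure).
    command: sh -c "pip install -q -r requirements.txt && pytest"
```

Invoking it with `docker compose -f docker-compose.test.yml run --rm tests` propagates the container's exit code, which gives CI the required non-zero-on-failure behavior.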

Step 3 — Generate scripts/run-performance-tests.sh

Create scripts/run-performance-tests.sh at the project root. The script must:

  1. Set set -euo pipefail and trap cleanup on EXIT
  2. Read thresholds from _docs/02_document/tests/performance-tests.md (or accept as CLI args)
  3. Start the system under test (local or docker-compose, matching the Hardware-Dependency Assessment decision)
  4. Run load/performance scenarios using the detected tool
  5. Compare results against threshold values from the test spec
  6. Print a pass/fail summary per scenario
  7. Exit 0 if all thresholds met, exit 1 otherwise
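The threshold-comparison core of the script above can be sketched as follows. The p95 metric, default threshold, and hard-coded measurement are illustrative assumptions; a real script would parse the measured value from the load tool's output and read defaults from performance-tests.md.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the compare-and-report step of run-performance-tests.sh.
set -euo pipefail
trap 'echo "stopping system under test"' EXIT   # illustrative cleanup

P95_THRESHOLD_MS="${1:-200}"   # threshold from the test spec, or a CLI arg

# Returns 0 when the measured value is within the threshold budget
within_threshold() {
  local measured="$1" limit="$2"
  [ "$measured" -le "$limit" ]
}

measured_p95_ms=150            # placeholder; parse this from the tool's output
if within_threshold "$measured_p95_ms" "$P95_THRESHOLD_MS"; then
  echo "PASS: p95 ${measured_p95_ms}ms <= ${P95_THRESHOLD_MS}ms"
else
  echo "FAIL: p95 ${measured_p95_ms}ms > ${P95_THRESHOLD_MS}ms"
  exit 1
fi
```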

Step 4 — Verify scripts

  1. Verify each script is syntactically valid (bash -n scripts/run-tests.sh, then the same for scripts/run-performance-tests.sh)
  2. Mark both scripts as executable (chmod +x)
  3. Present a summary of what each script does to the user
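Steps 1 and 2 reduce to two commands per script. The sketch below runs them against a stub script so it is self-contained; in practice the targets are the generated scripts:

```shell
#!/usr/bin/env bash
# Verify a generated script: parse-only syntax check, then mark executable.
set -euo pipefail
mkdir -p scripts
printf '%s\n' '#!/usr/bin/env bash' 'echo ok' > scripts/run-tests.sh  # stub

bash -n scripts/run-tests.sh    # 1. exits non-zero on any bash parse error
chmod +x scripts/run-tests.sh   # 2. mark executable
echo "verified"
```

Note that `bash -n` only catches parse errors, not runtime failures such as missing dependencies, so it complements rather than replaces an actual test run.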

Save action

Write scripts/run-tests.sh and scripts/run-performance-tests.sh to the project root.