| name | description | category | tags | disable-model-invocation |
|---|---|---|---|---|
| test-run | Run the project's test suite, report results, and handle failures. Detects test runners automatically (pytest, dotnet test, cargo test, npm test) or uses `scripts/run-tests.sh` if available. Trigger phrases: "run tests", "test suite", "verify tests" | build | | true |
# Test Run
Run the project's test suite and report results. This skill is invoked by the autopilot at verification checkpoints — after implementing tests, after implementing features, or at any point where the test suite must pass before proceeding.
## Workflow
### 1. Detect Test Runner

Check in order — first match wins:

1. `scripts/run-tests.sh` exists → use it (the script already encodes the correct execution strategy)
2. `docker-compose.test.yml` exists → run the Docker Suitability Check (see below). Docker is preferred; use it unless hardware constraints prevent it.
3. Auto-detect from project files:
   - `pytest.ini`, `pyproject.toml` with `[tool.pytest]`, or `conftest.py` → `pytest`
   - `*.csproj` or `*.sln` → `dotnet test`
   - `Cargo.toml` → `cargo test`
   - `package.json` with a `test` script → `npm test`
   - `Makefile` with a `test` target → `make test`

If no runner is detected → report failure and ask the user to specify one.
#### Docker Suitability Check

Docker is the preferred test environment. Before using it, verify that no constraints prevent easy Docker execution:

- Check `_docs/02_document/tests/environment.md` for a "Test Execution" decision (if the test-spec skill already assessed this, follow that decision)
- If no prior decision exists, check for disqualifying factors:
  - Hardware bindings: GPU, MPS, CUDA, TPU, FPGA, sensors, cameras, serial devices, host-level drivers
  - Host dependencies: licensed software, OS-specific services, kernel modules, proprietary SDKs
  - Data/volume constraints: large files (> 100 MB) impractical to copy into a container
  - Network/environment: host networking, VPN, specific DNS/firewall rules
  - Performance: Docker overhead would invalidate benchmarks or latency measurements
- If any disqualifying factor is found → fall back to the local test runner. Present to the user using the Choose format:

  ```
  ══════════════════════════════════════
  DECISION REQUIRED: Docker is preferred but factors
  preventing easy Docker execution detected
  ══════════════════════════════════════
  Factors detected:
  - [list factors]
  ══════════════════════════════════════
  A) Run tests locally (recommended)
  B) Run tests in Docker anyway
  ══════════════════════════════════════
  Recommendation: A — detected constraints prevent
  easy Docker execution
  ══════════════════════════════════════
  ```

- If no disqualifying factors → use Docker (the preferred default)
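A keyword scan over the project's environment notes is one way to surface these factors. This is a hedged sketch: the factor lists mirror the bullets above, but the helper name and keyword heuristics are assumptions, not part of the skill:

```python
# Keyword heuristics per disqualifying-factor category (illustrative).
DISQUALIFYING_FACTORS = {
    "hardware": ["gpu", "mps", "cuda", "tpu", "fpga", "sensor", "camera", "serial"],
    "host": ["licensed software", "kernel module", "proprietary sdk"],
    "network": ["host networking", "vpn", "firewall"],
    "performance": ["benchmark", "latency"],
}

def docker_suitable(project_notes: str) -> tuple[bool, list[str]]:
    """Return (use_docker, detected_factors). Docker is the preferred default."""
    notes = project_notes.lower()
    found = [
        f"{category}: {keyword}"
        for category, keywords in DISQUALIFYING_FACTORS.items()
        for keyword in keywords
        if keyword in notes
    ]
    # Any detected factor -> fall back to the local test runner.
    return (not found, found)
```

The `found` list doubles as the "Factors detected" section of the decision prompt shown above.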
### 2. Run Tests
- Execute the detected test runner
- Capture output: passed, failed, skipped, errors
- If a test environment was spun up, tear it down after tests complete
### 3. Report Results

Present a summary:

```
══════════════════════════════════════
TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
```

**Important:** Collection errors (import failures, missing dependencies, syntax errors) count as failures — they are not "skipped" or ignorable.
### 4. Diagnose Failures

Before presenting choices, list every failing/erroring test with a one-line root cause:

```
Failures:
1. test_foo.py::test_bar — missing dependency 'netron' (not installed)
2. test_baz.py::test_qux — AssertionError: expected 5, got 3 (logic error)
3. test_old.py::test_legacy — ImportError: no module 'removed_module' (possibly obsolete)
```

Categorize each as: missing dependency, broken import, logic/assertion error, possibly obsolete, or environment-specific.
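A first-pass classifier for these categories could look like the following. The keyword heuristics are assumptions for illustration; real diagnosis still requires reading the traceback:

```python
def categorize_failure(message: str) -> str:
    """Map an error message to one of the step-4 categories (heuristic)."""
    msg = message.lower()
    if "modulenotfounderror" in msg or "not installed" in msg:
        return "missing dependency"
    if "importerror" in msg:
        return "broken import"
    if "assertionerror" in msg:
        return "logic/assertion error"
    if "deprecated" in msg or "removed" in msg:
        return "possibly obsolete"
    return "environment-specific"
```

Categories are checked from most to least specific, so a missing package is reported as a dependency problem rather than a generic import error.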
### 5. Handle Outcome

**All tests pass** → return success to the autopilot for auto-chain.

**Any test fails or errors** → this is a blocking gate. Never silently ignore or skip failures. Present using the Choose format:

```
══════════════════════════════════════
TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
A) Investigate and fix failing tests/code, then re-run
B) Remove obsolete tests (if diagnosis shows they are no longer relevant)
C) Abort — fix manually
══════════════════════════════════════
Recommendation: A — fix failures before proceeding
══════════════════════════════════════
```
- If user picks A → investigate root causes, attempt fixes, then re-run (loop back to step 2)
- If user picks B → confirm which tests to remove, delete them, then re-run (loop back to step 2)
- If user picks C → return failure to the autopilot
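The fix/re-run loop in step 5 can be sketched as below. `run_tests`, `ask_choice`, and `remove_tests` are hypothetical stand-ins for the skill's actual mechanisms, and the round cap is an added safety assumption:

```python
def handle_outcome(run_tests, ask_choice, remove_tests, max_rounds: int = 5):
    """Re-run until green, or until the user aborts (blocking gate)."""
    for _ in range(max_rounds):
        counts = run_tests()
        if counts["failed"] == 0 and counts["errors"] == 0:
            return "success"      # auto-chain back to the autopilot
        choice = ask_choice()     # blocking gate: A, B, or C
        if choice == "C":
            return "failure"      # abort, user fixes manually
        if choice == "B":
            remove_tests()        # confirm which tests, then delete
        # Choice A (or after B): loop back and re-run.
    return "failure"
```

Because success requires both `failed` and `errors` to be zero, collection errors keep the gate closed just like ordinary test failures.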
## Trigger Conditions
This skill is invoked by the autopilot at test verification checkpoints. It is not typically invoked directly by the user.