Files
ai-training/.cursor/skills/test-run/SKILL.md
T
Oleksandr Bezdieniezhnykh 243b69656b Update test results directory structure and enhance Docker configurations
- Modified `.gitignore` to reflect the new path for test results.
- Updated `docker-compose.test.yml` to mount the correct test results directory.
- Adjusted `Dockerfile.test` to set the `PYTHONPATH` and ensure test results are saved in the updated location.
- Added `boto3` and `netron` to `requirements-test.txt` to support new functionalities.
- Updated `pytest.ini` to include the new `pythonpath` for test discovery.

These changes streamline the testing process and ensure compatibility with the updated directory structure.
2026-03-28 00:13:08 +02:00

93 lines
3.9 KiB
Markdown

---
name: test-run
description: |
Run the project's test suite, report results, and handle failures.
Detects test runners automatically (pytest, dotnet test, cargo test, npm test)
or uses scripts/run-tests.sh if available.
Trigger phrases:
- "run tests", "test suite", "verify tests"
category: build
tags: [testing, verification, test-suite]
disable-model-invocation: true
---
# Test Run
Run the project's test suite and report results. This skill is invoked by the autopilot at verification checkpoints — after implementing tests, after implementing features, or at any point where the test suite must pass before proceeding.
## Workflow
### 1. Detect Test Runner
Check in order — first match wins:
1. `scripts/run-tests.sh` exists → use it
2. `docker-compose.test.yml` or equivalent test environment exists → spin it up first, then detect runner below
3. Auto-detect from project files:
- `pytest.ini`, `pyproject.toml` with `[tool.pytest]`, or `conftest.py``pytest`
- `*.csproj` or `*.sln``dotnet test`
- `Cargo.toml``cargo test`
- `package.json` with test script → `npm test`
- `Makefile` with `test` target → `make test`
If no runner detected → report failure and ask user to specify.
### 2. Run Tests
1. Execute the detected test runner
2. Capture output: passed, failed, skipped, errors
3. If a test environment was spun up, tear it down after tests complete
### 3. Report Results
Present a summary:
```
══════════════════════════════════════
TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
```
**Important**: Collection errors (import failures, missing dependencies, syntax errors) count as failures — they are not "skipped" or ignorable.
### 4. Diagnose Failures
Before presenting choices, list every failing/erroring test with a one-line root cause:
```
Failures:
1. test_foo.py::test_bar — missing dependency 'netron' (not installed)
2. test_baz.py::test_qux — AssertionError: expected 5, got 3 (logic error)
3. test_old.py::test_legacy — ImportError: no module 'removed_module' (possibly obsolete)
```
Categorize each as: **missing dependency**, **broken import**, **logic/assertion error**, **possibly obsolete**, or **environment-specific**.
### 5. Handle Outcome
**All tests pass** → return success to the autopilot for auto-chain.
**Any test fails or errors** → this is a **blocking gate**. Never silently ignore or skip failures. Present using Choose format:
```
══════════════════════════════════════
TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
A) Investigate and fix failing tests/code, then re-run
B) Remove obsolete tests (if diagnosis shows they are no longer relevant)
C) Leave as-is — acknowledged tech debt (not recommended)
D) Abort — fix manually
══════════════════════════════════════
Recommendation: A — fix failures before proceeding
══════════════════════════════════════
```
- If user picks A → investigate root causes, attempt fixes, then re-run (loop back to step 2)
- If user picks B → confirm which tests to remove, delete them, then re-run (loop back to step 2)
- If user picks C → require explicit user confirmation; log as acknowledged tech debt in the report, then return success with warning to the autopilot
- If user picks D → return failure to the autopilot
## Trigger Conditions
This skill is invoked by the autopilot at test verification checkpoints. It is not typically invoked directly by the user.