mirror of https://github.com/azaion/gps-denied-desktop.git
synced 2026-04-22 07:16:37 +00:00
---
name: test-run
description: |
  Run the project's test suite, report results, and handle failures.
  Detects test runners automatically (pytest, dotnet test, cargo test, npm test)
  or uses scripts/run-tests.sh if available.
  Trigger phrases:
  - "run tests", "test suite", "verify tests"
category: build
tags: [testing, verification, test-suite]
disable-model-invocation: true
---

# Test Run

Run the project's test suite and report results. This skill is invoked by the autopilot at verification checkpoints — after implementing tests, after implementing features, or at any point where the test suite must pass before proceeding.

## Workflow

### 1. Detect Test Runner

Check in order — first match wins:

1. `scripts/run-tests.sh` exists → use it (the script already encodes the correct execution strategy)
2. `docker-compose.test.yml` exists → run the Execution Environment Check (see below). Docker is preferred; use it unless hardware constraints prevent it.
3. Auto-detect from project files:
   - `pytest.ini`, `pyproject.toml` with `[tool.pytest]`, or `conftest.py` → `pytest`
   - `*.csproj` or `*.sln` → `dotnet test`
   - `Cargo.toml` → `cargo test`
   - `package.json` with a `test` script → `npm test`
   - `Makefile` with a `test` target → `make test`

If no runner is detected → report failure and ask the user to specify one.

#### Execution Environment Check

1. Check `_docs/02_document/tests/environment.md` for a "Test Execution" section. If the test-spec skill already assessed hardware dependencies and recorded a decision (local / docker / both), **follow that decision**.
2. If the decision is **local** → run tests directly on the host (no Docker).
3. If the decision is **docker** → use Docker (docker-compose).
4. If the decision is **both** → run locally first, then in Docker (or vice versa), and merge the results.
5. If no prior decision exists → fall back to the hardware-dependency detection logic from the test-spec skill's "Hardware-Dependency & Execution Environment Assessment" section. Ask the user if hardware indicators are found.

### 2. Run Tests

1. Execute the detected test runner
2. Capture output: passed, failed, skipped, errors
3. If a test environment was spun up, tear it down after tests complete
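
Capturing the counts can be sketched as below, assuming a pytest-style summary line; other runners print different formats and would need their own parsers.

```python
import re
import subprocess

def parse_summary(output: str) -> dict:
    """Extract counts from a pytest-style summary line such as
    '3 passed, 1 failed, 2 skipped, 1 error in 0.5s'."""
    counts = {"passed": 0, "failed": 0, "skipped": 0, "errors": 0}
    for n, kind in re.findall(r"(\d+) (passed|failed|skipped|error)s?\b", output):
        counts["errors" if kind == "error" else kind] += int(n)
    return counts

def run_tests(cmd: list[str]) -> tuple[int, dict]:
    """Run the test command and capture its exit code and summary counts."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, parse_summary(proc.stdout + proc.stderr)
```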

### 3. Report Results

Present a summary:

```
══════════════════════════════════════
TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
```

**Important**: Collection errors (import failures, missing dependencies, syntax errors) count as failures — they are not "skipped" or ignorable. If a collection error is caused by a missing dependency, install it (add it to the project's dependency file and install) before re-running. The test runner script (`run-tests.sh`) should install all dependencies automatically — if it doesn't, fix the script to do so.

### 4. Diagnose Failures and Skips

Before presenting choices, list every failing/erroring/skipped test with a one-line root cause:

```
Failures:
1. test_foo.py::test_bar — missing dependency 'netron' (not installed)
2. test_baz.py::test_qux — AssertionError: expected 5, got 3 (logic error)
3. test_old.py::test_legacy — ImportError: no module 'removed_module' (possibly obsolete)

Skips:
1. test_x.py::test_pre_init — runtime skip: engine already initialized (unreachable in current test order)
2. test_y.py::test_docker_only — explicit @skip: requires Docker (dead code in local runs)
```

Categorize failures as: **missing dependency**, **broken import**, **logic/assertion error**, **possibly obsolete**, or **environment-specific**.

Categorize skips as: **explicit skip (dead code)**, **runtime skip (unreachable)**, **environment mismatch**, or **missing fixture/data**.
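
As an illustration, a first pass at failure categorization can pattern-match the error text. The patterns below are assumptions, not an exhaustive list, and anything unmatched still needs manual review:

```python
import re

# Illustrative rules mapping error text to the failure categories above.
# Order matters: ModuleNotFoundError is a subclass of ImportError.
FAILURE_RULES = [
    ("missing dependency", r"ModuleNotFoundError|No module named"),
    ("broken import", r"ImportError|cannot import name"),
    ("logic/assertion error", r"AssertionError|assert "),
    ("environment-specific", r"ConnectionRefusedError|Permission denied"),
]

def categorize_failure(message: str) -> str:
    for category, pattern in FAILURE_RULES:
        if re.search(pattern, message):
            return category
    # Nothing matched: flag for manual review (the test may be obsolete).
    return "possibly obsolete"
```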

### 5. Handle Outcome

**All tests pass, zero skipped** → return success to the autopilot for auto-chain.

**Any test fails or errors** → this is a **blocking gate**. Never silently ignore failures. **Always investigate the root cause before deciding on an action.** Read the failing test code, read the error output, check service logs if applicable, and determine whether the bug is in the test or in the production code.

After investigating, present:

```
══════════════════════════════════════
TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
Failures:
1. test_X — root cause: [detailed reason] → action: [fix test / fix code / remove + justification]
══════════════════════════════════════
A) Apply recommended fixes, then re-run
B) Abort — fix manually
══════════════════════════════════════
Recommendation: A — fix root causes before proceeding
══════════════════════════════════════
```

- If the user picks A → apply the fixes, then re-run (loop back to step 2)
- If the user picks B → return failure to the autopilot

**Any skipped test** → classify it as legitimate or illegitimate before deciding whether to block.

#### Legitimate skips (accept and proceed)

The code path genuinely cannot execute on this runner. Acceptable reasons:

- Hardware not physically present (GPU, Apple Neural Engine, sensor, serial device)
- Operating system mismatch (Darwin-only test on Linux CI, Windows-only test on macOS)
- Feature-flag-gated test whose feature is intentionally disabled in this environment
- External service the project deliberately does not control (e.g., a third-party API with no sandbox, where the project has a documented contract test instead)

For legitimate skips: verify the skip condition is accurate (the test would run if the hardware/OS were present), verify it has a clear reason string, and proceed.

#### Illegitimate skips (BLOCKING — must resolve)

The skip is a workaround for something we can and should fix. NOT acceptable reasons:

- Required service not running (database, message broker, downstream API we control) → fix: bring the service up, add a docker-compose dependency, or add a mock
- Missing test fixture, seed data, or sample file → fix: provide the data, generate it, or ASK the user for it
- Missing environment variable or credential → fix: add it to `.env.example`, document it, ASK the user for the value
- Flaky-test quarantine with no tracking ticket → fix: create the ticket (or replay via leftovers if the tracker is down)
- Inherited skip from a prior refactor that was never cleaned up → fix: clean it up now
- Test ordering mutates shared state → fix: isolate the state

**Rule of thumb**: if the reason for skipping is "we didn't set something up," that's not a valid skip — set it up. If the reason is "this hardware/OS isn't here," that's valid.

#### Resolution steps for illegitimate skips

1. Classify the skip (read the skip reason and the test body)
2. If the fix is **mechanical** — start a container, install a dependency, add a mock, reorder fixtures — attempt it automatically and re-run
3. If the fix requires **user input** — credentials, sample data, a business decision — BLOCK and ASK
4. Never silently mark a skip as "accepted" — every illegitimate skip must either be fixed or escalated
5. Removal is a last resort and requires explicit user approval with documented reasoning

#### Categorization cheatsheet

- **explicit skip (e.g. `@pytest.mark.skip`)**: check whether the reason in the decorator is still valid
- **conditional skip (e.g. `@pytest.mark.skipif`)**: check whether the condition is accurate and whether we can change the environment to make it false
- **runtime skip (e.g. `pytest.skip()` in the test body)**: check why the condition fires — often an ordering or environment bug
- **missing fixture/data**: treat as illegitimate unless the user confirms the data is unavailable

After investigating, present the findings:

```
══════════════════════════════════════
SKIPPED TESTS: K tests skipped
══════════════════════════════════════
1. test_X — root cause: [detailed reason] → action: [fix / restructure / remove + justification]
2. test_Y — root cause: [detailed reason] → action: [fix / restructure / remove + justification]
══════════════════════════════════════
A) Apply recommended fixes, then re-run
B) Accept skips and proceed (requires user justification per skip)
══════════════════════════════════════
```

Only option B allows proceeding with skips, and it requires explicit user approval with documented justification for each skip.

## Trigger Conditions

This skill is invoked by the autopilot at test verification checkpoints. It is not typically invoked directly by the user.