mirror of https://github.com/azaion/ai-training.git synced 2026-04-23 08:56:37 +00:00


Oleksandr Bezdieniezhnykh cdcd1f6ea7 Fix .cursor skills consistency: flow resolution, tracker-agnostic refs, report naming, error recovery

- Rewrite autopilot flow resolution to 4 deterministic rules based on source code + docs + state file presence
- Replace all hard-coded Jira references with tracker-agnostic terminology across 30+ files
- Move project-management.mdc to _project.md (project-specific, not portable with .cursor)
- Rename FINAL_implementation_report.md to context-dependent names (implementation_report_tests/features/refactor)
- Remove "acknowledged tech debt" option from test-run — failing tests must be fixed or removed
- Add debug/error recovery protocol to protocols.md
- Align directory paths: metrics -> 06_metrics/, add 05_security/, reviews/, 02_task_plans/ to README
- Add missing skills (test-spec, test-run, new-task, ui-design) to README
- Use language-appropriate comment syntax for Arrange/Act/Assert in coderule + testing rules
- Copy updated coderule.mdc to parent suite/.cursor/rules/
- Raise max task complexity from 5 to 8 points in decompose
- Skip test-spec Phase 4 (script generation) during planning context
- Document per-batch vs post-implement test run as intentional
- Add skill-internal state cross-check rule to state.md

2026-03-28 02:34:00 +02:00

5.6 KiB



Test Run

Run the project's test suite and report results. This skill is invoked by the autopilot at verification checkpoints — after implementing tests, after implementing features, or at any point where the test suite must pass before proceeding.

Workflow

1. Detect Test Runner

Check in order; the first match wins:

- scripts/run-tests.sh exists → use it (the script already encodes the correct execution strategy)
- docker-compose.test.yml exists → run the Docker Suitability Check (see below). Docker is preferred; use it unless that check finds a disqualifying factor.
- Otherwise, auto-detect from project files:
  - pytest.ini, pyproject.toml with [tool.pytest], or conftest.py → pytest
  - *.csproj or *.sln → dotnet test
  - Cargo.toml → cargo test
  - package.json with test script → npm test
  - Makefile with test target → make test

If no runner is detected → report failure and ask the user to specify one.
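
The detection order above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the function name `detect_test_runner` and its return labels are hypothetical, and it assumes that only the project root is inspected and that a prefix match on `[tool.pytest` is an acceptable way to recognise the pytest table in pyproject.toml.

```python
import json
import re
from pathlib import Path
from typing import Optional


def detect_test_runner(root: Path) -> Optional[str]:
    """Return a label for the first matching runner, or None if nothing matches."""
    # Rule 1: a project-provided script always wins.
    if (root / "scripts" / "run-tests.sh").is_file():
        return "scripts/run-tests.sh"
    # Rule 2: a compose file selects Docker (still subject to the suitability check).
    if (root / "docker-compose.test.yml").is_file():
        return "docker"
    # Rule 3: auto-detect from project files, in the documented order.
    pyproject = root / "pyproject.toml"
    has_pytest_table = (
        pyproject.is_file()
        and "[tool.pytest" in pyproject.read_text(encoding="utf-8")
    )
    if (root / "pytest.ini").is_file() or has_pytest_table or (root / "conftest.py").is_file():
        return "pytest"
    if any(root.glob("*.csproj")) or any(root.glob("*.sln")):
        return "dotnet test"
    if (root / "Cargo.toml").is_file():
        return "cargo test"
    package_json = root / "package.json"
    if package_json.is_file():
        try:
            data = json.loads(package_json.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            data = {}
        scripts = data.get("scripts", {}) if isinstance(data, dict) else {}
        if isinstance(scripts, dict) and "test" in scripts:
            return "npm test"
    makefile = root / "Makefile"
    if makefile.is_file() and re.search(
        r"^test\s*:", makefile.read_text(encoding="utf-8"), re.MULTILINE
    ):
        return "make test"
    # Nothing matched: the caller should report failure and ask the user.
    return None
```

A caller would treat a `None` result as the "no runner detected" case and stop to ask the user rather than guess.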

Docker Suitability Check

Docker is the preferred test environment. Before using it, verify that no constraints prevent straightforward Docker execution:

- Check _docs/02_document/tests/environment.md for a "Test Execution" decision. If the test-spec skill already assessed this, follow that decision.
- If no prior decision exists, check for disqualifying factors:
  - Hardware bindings: GPU, MPS, CUDA, TPU, FPGA, sensors, cameras, serial devices, host-level drivers
  - Host dependencies: licensed software, OS-specific services, kernel modules, proprietary SDKs
  - Data/volume constraints: large files (> 100 MB) that are impractical to copy into a container
  - Network/environment: host networking, VPN, specific DNS/firewall rules
  - Performance: Docker overhead would invalidate benchmarks or latency measurements
- If any disqualifying factor is found → recommend falling back to the local test runner and present the decision to the user in Choose format:

══════════════════════════════════════
 DECISION REQUIRED: Docker is preferred but factors
 preventing easy Docker execution detected
══════════════════════════════════════
 Factors detected:
 - [list factors]
══════════════════════════════════════
 A) Run tests locally (recommended)
 B) Run tests in Docker anyway
══════════════════════════════════════
 Recommendation: A — detected constraints prevent
 easy Docker execution
══════════════════════════════════════

If no disqualifying factors are found → use Docker (the preferred default).
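
The decision logic of this check can be summarised as a small function. This is a sketch under stated assumptions: `EnvironmentDecision` and `docker_suitability` are hypothetical names, the prior decision is assumed to have already been read from environment.md as either "docker" or "local", and detecting the factors themselves is left to the caller.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class EnvironmentDecision:
    environment: str          # "docker" or "local" (the recommended choice)
    ask_user: bool            # True when the Choose prompt must be shown
    factors: Tuple[str, ...]  # factors to list in that prompt


def docker_suitability(
    prior_decision: Optional[str], factors: List[str]
) -> EnvironmentDecision:
    """Apply the three rules of the check in order."""
    # A decision already recorded by the test-spec skill is followed as-is.
    if prior_decision is not None:
        return EnvironmentDecision(prior_decision, False, ())
    # No disqualifying factors: Docker is the preferred default.
    if not factors:
        return EnvironmentDecision("docker", False, ())
    # Otherwise recommend the local runner, but let the user decide.
    return EnvironmentDecision("local", True, tuple(factors))
```

When `ask_user` is true, `environment` is only the recommendation (option A); the user may still choose to run in Docker anyway.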

2. Run Tests

Execute the detected test runner
Capture the output and record the counts: passed, failed, skipped, errors
If a test environment was spun up, tear it down after tests complete
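
One way to make the teardown reliable is to wrap the run in `try`/`finally`. The sketch below is illustrative only: `run_tests` and its `teardown` callback are hypothetical names, and running teardown even when the runner crashes is an assumption of this example rather than something the skill states. Parsing the counts out of the captured output is runner-specific and is left out.

```python
import subprocess
from typing import Callable, List, Optional, Tuple


def run_tests(
    command: List[str], teardown: Optional[Callable[[], None]] = None
) -> Tuple[int, str]:
    """Run the test command and return (exit code, combined stdout/stderr)."""
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so nothing is lost
            text=True,
            check=False,  # a non-zero exit is a result, not an exception
        )
        return completed.returncode, completed.stdout
    finally:
        # Tear down any environment that was spun up, pass or fail.
        if teardown is not None:
            teardown()
```

For a Docker-based run, the callback would be the place to stop and remove the test containers.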

3. Report Results

Present a summary:

══════════════════════════════════════
 TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════

Important: Collection errors (import failures, missing dependencies, syntax errors) count as failures — they are not "skipped" or ignorable.
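
As a rough illustration of the summary and the rule about collection errors, the following sketch renders the banner and treats any error as blocking. `RunCounts`, `format_summary` and the banner width are hypothetical choices made for this example.

```python
from dataclasses import dataclass

RULE = "═" * 38  # banner width is an arbitrary choice for this sketch


@dataclass(frozen=True)
class RunCounts:
    passed: int
    failed: int
    skipped: int
    errors: int  # includes collection errors (imports, dependencies, syntax)

    @property
    def blocking(self) -> bool:
        """Errors count as failures: either one blocks the gate."""
        return self.failed > 0 or self.errors > 0


def format_summary(counts: RunCounts) -> str:
    """Render the three-line TEST RESULTS banner."""
    line = (
        f" TEST RESULTS: [{counts.passed} passed, {counts.failed} failed, "
        f"{counts.skipped} skipped, {counts.errors} errors]"
    )
    return "\n".join([RULE, line, RULE])
```

Skipped tests do not affect `blocking`; only failures and errors do.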

4. Diagnose Failures

Before presenting choices, list every failing/erroring test with a one-line root cause:

Failures:
 1. test_foo.py::test_bar — missing dependency 'netron' (not installed)
 2. test_baz.py::test_qux — AssertionError: expected 5, got 3 (logic error)
 3. test_old.py::test_legacy — ImportError: no module 'removed_module' (possibly obsolete)

Categorize each as: missing dependency, broken import, logic/assertion error, possibly obsolete, or environment-specific.
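
A first-pass categorization could be automated with simple message matching, as in the sketch below. The heuristics are assumptions made for illustration and apply to Python tracebacks only: `categorize` and `declared_dependencies` are hypothetical names, and the "possibly obsolete" category is deliberately not assigned automatically because it needs a judgment about whether the test is still relevant.

```python
import re
from typing import Set

# Exception names treated here as signs of an environment problem (an assumption).
ENVIRONMENT_HINTS = (
    "PermissionError",
    "ConnectionError",
    "ConnectionRefusedError",
    "TimeoutError",
    "FileNotFoundError",
)


def categorize(message: str, declared_dependencies: Set[str]) -> str:
    """Map one failure message to a category using rough heuristics."""
    match = re.search(r"No module named '([\w.]+)'", message)
    if match:
        top_level = match.group(1).split(".")[0]
        # A declared but unimportable package is most likely not installed.
        if top_level in declared_dependencies:
            return "missing dependency"
        return "broken import"
    if "ImportError" in message:
        return "broken import"
    if "AssertionError" in message:
        return "logic/assertion error"
    if any(hint in message for hint in ENVIRONMENT_HINTS):
        return "environment-specific"
    # Anything else, including "possibly obsolete", needs a human or agent review.
    return "needs manual review"
```

A "broken import" result for a module that no longer exists in the repository is a natural candidate for the "possibly obsolete" label after review.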

5. Handle Outcome

All tests pass → return success to the autopilot for auto-chain.

Any test fails or errors → this is a blocking gate. Never silently ignore or skip failures. Present using Choose format:

══════════════════════════════════════
 TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
 A) Investigate and fix failing tests/code, then re-run
 B) Remove obsolete tests (if diagnosis shows they are no longer relevant)
 C) Abort — fix manually
══════════════════════════════════════
 Recommendation: A — fix failures before proceeding
══════════════════════════════════════

If user picks A → investigate root causes, attempt fixes, then re-run (loop back to step 2)
If user picks B → confirm which tests to remove, delete them, then re-run (loop back to step 2)
If user picks C → return failure to the autopilot
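
The blocking gate and its loop back to step 2 can be sketched as follows. All names are hypothetical, the callbacks stand in for the real run, prompt, fix and removal steps, and the `max_rounds` cap is an assumption added so the example cannot loop forever; the skill itself does not specify a limit.

```python
from typing import Callable


def handle_outcome(
    run: Callable[[], bool],
    choose: Callable[[], str],
    fix: Callable[[], None],
    remove_obsolete: Callable[[], None],
    max_rounds: int = 10,
) -> bool:
    """Return True once the suite passes, False if the user aborts."""
    for _ in range(max_rounds):
        if run():  # step 2: True means every test passed
            return True  # success goes back to the autopilot
        choice = choose()  # Choose prompt: "A", "B" or "C"
        if choice == "A":
            fix()  # investigate and fix, then re-run
        elif choice == "B":
            remove_obsolete()  # delete confirmed obsolete tests, then re-run
        else:
            return False  # "C": abort and report failure
    return False  # cap reached without a green run
```

Failures are never skipped silently: every path out of the loop is either a passing run or an explicit failure returned to the autopilot.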

Trigger Conditions

This skill is invoked by the autopilot at test verification checkpoints. It is not typically invoked directly by the user.
