| name | description | category | tags | disable-model-invocation |
|---|---|---|---|---|
| test-run | Run the project's test suite, report results, and handle failures. Detects test runners automatically (pytest, dotnet test, cargo test, npm test) or uses scripts/run-tests.sh if available. Trigger phrases: "run tests", "test suite", "verify tests" | build | | true |
# Test Run
Run the project's test suite and report results. This skill is invoked by the autopilot at verification checkpoints — after implementing tests, after implementing features, or at any point where the test suite must pass before proceeding.
## Workflow
### 1. Detect Test Runner
Check in order — first match wins:
- `scripts/run-tests.sh` exists → use it
- `docker-compose.test.yml` or equivalent test environment exists → spin it up first, then detect runner below
- Auto-detect from project files:
  - `pytest.ini`, `pyproject.toml` with `[tool.pytest]`, or `conftest.py` → `pytest`
  - `*.csproj` or `*.sln` → `dotnet test`
  - `Cargo.toml` → `cargo test`
  - `package.json` with test script → `npm test`
  - `Makefile` with `test` target → `make test`
If no runner detected → report failure and ask user to specify.
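A minimal Python sketch of this detection order, assuming a helper that returns the runner command as a string (the function name, the returned command strings, and the substring checks for `[tool.pytest]` and the `test` target are illustrative, not a fixed interface):

```python
from pathlib import Path

def detect_test_runner(root: str = ".") -> str | None:
    """Return the command for the project's test suite, or None if unknown."""
    p = Path(root)

    # Project-provided script always wins
    if (p / "scripts" / "run-tests.sh").exists():
        return "bash scripts/run-tests.sh"

    # (docker-compose.test.yml only affects environment setup; see step 2)

    # Auto-detect from project files, first match wins
    if (p / "pytest.ini").exists() or (p / "conftest.py").exists():
        return "pytest"
    pyproject = p / "pyproject.toml"
    if pyproject.exists() and "[tool.pytest" in pyproject.read_text():
        return "pytest"
    if any(p.glob("*.csproj")) or any(p.glob("*.sln")):
        return "dotnet test"
    if (p / "Cargo.toml").exists():
        return "cargo test"
    pkg = p / "package.json"
    if pkg.exists() and '"test"' in pkg.read_text():
        return "npm test"
    makefile = p / "Makefile"
    if makefile.exists() and "test:" in makefile.read_text():
        return "make test"

    return None  # no runner detected: report failure and ask the user
```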
### 2. Run Tests
- Execute the detected test runner
- Capture output: passed, failed, skipped, errors
- If a test environment was spun up, tear it down after tests complete
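A hedged sketch of this step, assuming the command string from step 1 and using `docker compose` only when `docker-compose.test.yml` is present (output parsing is deferred to step 3; error handling is simplified):

```python
import subprocess
from pathlib import Path

def run_suite(runner_cmd: str, root: str = ".") -> subprocess.CompletedProcess:
    """Run the detected test runner, spinning the test environment up and down if needed."""
    compose = Path(root) / "docker-compose.test.yml"
    try:
        if compose.exists():
            subprocess.run(["docker", "compose", "-f", str(compose), "up", "-d"],
                           check=True, cwd=root)
        # Capture all output so passed/failed/skipped/error counts can be parsed later
        return subprocess.run(runner_cmd, shell=True, cwd=root,
                              capture_output=True, text=True)
    finally:
        if compose.exists():
            # Tear the environment down even if the test run itself crashed
            subprocess.run(["docker", "compose", "-f", str(compose), "down"], cwd=root)
```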
### 3. Report Results
Present a summary:
══════════════════════════════════════
TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
Important: Collection errors (import failures, missing dependencies, syntax errors) count as failures — they are not "skipped" or ignorable.
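For pytest specifically, the counts can be pulled out of the final summary line; the regular expression below is an assumption about pytest's output format, and other runners would need their own parsers:

```python
import re

def summarize(output: str) -> str:
    """Build the TEST RESULTS banner from a pytest run's combined output."""
    counts = {"passed": 0, "failed": 0, "skipped": 0, "errors": 0}
    # Matches fragments like "3 passed", "1 failed", "2 skipped", "1 error" / "2 errors"
    for n, label in re.findall(r"(\d+) (passed|failed|skipped|error)s?\b", output):
        counts["errors" if label == "error" else label] += int(n)
    bar = "═" * 38
    return (f"{bar}\nTEST RESULTS: [{counts['passed']} passed, {counts['failed']} failed, "
            f"{counts['skipped']} skipped, {counts['errors']} errors]\n{bar}")
```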
### 4. Diagnose Failures
Before presenting choices, list every failing/erroring test with a one-line root cause:
Failures:
1. test_foo.py::test_bar — missing dependency 'netron' (not installed)
2. test_baz.py::test_qux — AssertionError: expected 5, got 3 (logic error)
3. test_old.py::test_legacy — ImportError: no module 'removed_module' (possibly obsolete)
Categorize each as: missing dependency, broken import, logic/assertion error, possibly obsolete, or environment-specific.
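A rough sketch of that categorization, keyed on common Python error text; the patterns are illustrative heuristics rather than an exhaustive classifier:

```python
def categorize(error_text: str) -> str:
    """Map a failing test's error output onto one of the diagnosis categories."""
    text = error_text.lower()
    if "modulenotfounderror" in text or "no module named" in text:
        # Either a dependency missing from the test requirements or a module
        # that was removed from the codebase (possibly an obsolete test)
        return "missing dependency / broken import"
    if "importerror" in text:
        return "broken import (possibly obsolete)"
    if "assertionerror" in text:
        return "logic/assertion error"
    if any(word in text for word in ("connection refused", "permission denied", "timeout")):
        return "environment-specific"
    return "unclassified: inspect manually"
```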
### 5. Handle Outcome
All tests pass → return success to the autopilot for auto-chain.
Any test fails or errors → this is a blocking gate. Never silently ignore or skip failures. Present using Choose format:
══════════════════════════════════════
TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
A) Investigate and fix failing tests/code, then re-run
B) Remove obsolete tests (if diagnosis shows they are no longer relevant)
C) Leave as-is — acknowledged tech debt (not recommended)
D) Abort — fix manually
══════════════════════════════════════
Recommendation: A — fix failures before proceeding
══════════════════════════════════════
- If user picks A → investigate root causes, attempt fixes, then re-run (loop back to step 2)
- If user picks B → confirm which tests to remove, delete them, then re-run (loop back to step 2)
- If user picks C → require explicit user confirmation; log as acknowledged tech debt in the report, then return success with warning to the autopilot
- If user picks D → return failure to the autopilot
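Steps 2 through 5 combine into a simple loop; the callback names and the return values ("success", "success-with-warning", "failure") are assumptions about how results are handed back to the autopilot, not part of this spec:

```python
def handle_outcome(run_and_count, summarize, present_choices) -> str:
    """Drive the blocking gate: loop until the suite passes or the user resolves it."""
    while True:
        counts = run_and_count()              # step 2: run the suite, return count dict
        print(summarize(counts))              # step 3: print the TEST RESULTS banner
        if counts["failed"] == 0 and counts["errors"] == 0:
            return "success"                  # all green: autopilot auto-chains onward

        choice = present_choices(counts)      # steps 4-5: diagnose, then Choose prompt
        if choice in ("A", "B"):
            # A: investigate and fix failing tests/code.
            # B: confirm with the user, then delete obsolete tests.
            # Either way, loop back to step 2 and re-run.
            continue
        if choice == "C":
            return "success-with-warning"     # explicit sign-off: acknowledged tech debt
        if choice == "D":
            return "failure"                  # abort: user fixes manually
```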
## Trigger Conditions
This skill is invoked by the autopilot at test verification checkpoints. It is not typically invoked directly by the user.