mirror of
https://github.com/azaion/ai-training.git
synced 2026-04-23 01:36:34 +00:00
Refine coding standards and testing guidelines
- Updated the coding rule descriptions to emphasize readability, meaningful comments, and test verification.
- Revised guidelines to clarify the importance of avoiding boilerplate while maintaining readability.
- Enhanced the testing rules to set a minimum coverage threshold of 75% for business logic and specified criteria for test scenarios.
- Introduced a mechanism for handling skipped tests, categorizing them as legitimate or illegitimate, and outlined resolution steps.

These changes aim to improve code quality, maintainability, and testing effectiveness.
@@ -27,8 +27,11 @@ Analyze input data completeness and produce detailed black-box test specificatio
 - **Save immediately**: write artifacts to disk after each phase; never accumulate unsaved work
 - **Ask, don't assume**: when requirements are ambiguous, ask the user before proceeding
 - **Spec, don't code**: this workflow produces test specifications, never test implementation code
 - **No test without data**: every test scenario MUST have concrete test data; tests without data are removed
-- **No test without expected result**: every test scenario MUST pair input data with a quantifiable expected result; a test that cannot compare actual output against a known-correct answer is not verifiable and must be removed
+- **Every test must have a pass/fail criterion**. Two acceptable shapes:
+- **Input/output shape**: concrete input data paired with a quantifiable expected result (exact value, tolerance, threshold, pattern, reference file). Typical for functional blackbox tests, performance tests with load data, data-processing pipelines.
+- **Behavioral shape**: a trigger condition + observable system behavior + quantifiable pass/fail criterion, with no input data required. Typical for startup/shutdown tests, retry/backoff policies, state transitions, logging/metrics emission, resilience scenarios. Example criteria: "startup logs `service ready` within 5s", "retry emits 3 attempts with exponential backoff (base 100ms ± 20ms)", "on SIGTERM, service drains in-flight requests within 30s grace period", "health endpoint returns 503 while migrations run".
+- For behavioral tests the observable (log line, metric value, state transition, emitted event, elapsed time) must still be quantifiable — the test must programmatically decide pass/fail.
+- A test that cannot produce a pass/fail verdict through either shape is not verifiable and must be removed.

 ## Context Resolution
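The comparison methods the new rules enumerate (exact value, tolerance, threshold, pattern) each reduce to a programmatic check. A minimal Python sketch; the helper names are hypothetical and not defined by the workflow:

```python
import re

# Hypothetical helpers illustrating the four quantifiable comparison
# methods named above; not part of the workflow itself.

def exact(actual, expected):
    """Exact value: actual must equal the known-correct answer."""
    return actual == expected

def tolerance(actual, expected, tol):
    """Tolerance: actual must fall within expected plus/minus tol."""
    return abs(actual - expected) <= tol

def threshold(actual, limit):
    """Threshold: actual must not exceed the stated limit."""
    return actual <= limit

def pattern(actual, regex):
    """Pattern: actual output must match a regular expression."""
    return re.search(regex, actual) is not None

# Mirroring the example criteria quoted above:
print(tolerance(112, 100, 20))                    # backoff base 100ms +/- 20ms
print(threshold(28.5, 30))                        # drain completes within 30s
print(pattern("INFO service ready", r"service ready"))
```

Whichever method a test uses, the point is the same: code, not a human judgment call, produces the verdict.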
@@ -177,7 +180,7 @@ At the start of execution, create a TodoWrite with all four phases. Update statu
 |------------|--------------------------|---------------|----------------|
 | [file/data] | Yes/No | Yes/No | [missing, vague, no tolerance, etc.] |

-9. Threshold: at least 70% coverage of scenarios AND every covered scenario has a quantifiable expected result (see `.cursor/rules/cursor-meta.mdc` Quality Thresholds table)
+9. Threshold: at least 75% coverage of scenarios AND every covered scenario has a quantifiable expected result (see `.cursor/rules/cursor-meta.mdc` Quality Thresholds table)
 10. If coverage is low, search the internet for supplementary data, assess quality with user, and if user agrees, add to `input_data/` and update `input_data/expected_results/results_report.md`
 11. If expected results are missing or not quantifiable, ask user to provide them before proceeding
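The threshold in item 9 is a plain ratio check. An illustrative Python sketch; the function name is hypothetical, not part of the spec:

```python
def coverage_ok(covered_scenarios, total_scenarios, minimum=0.75):
    """Return (ratio, passed): the fraction of acceptance-criteria
    scenarios that have input data and a quantifiable expected result,
    checked against the 75% minimum."""
    if total_scenarios == 0:
        return 0.0, False
    ratio = covered_scenarios / total_scenarios
    return ratio, ratio >= minimum

print(coverage_ok(9, 12))   # (0.75, True): exactly at the threshold, passes
print(coverage_ok(8, 12))   # below the threshold, gate fails
```

Note the comparison is inclusive: exactly 75% coverage passes the gate.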
@@ -232,18 +235,26 @@ Capture any new questions, findings, or insights that arise during test specific
 ### Phase 3: Test Data Validation Gate (HARD GATE)

 **Role**: Professional Quality Assurance Engineer
-**Goal**: Ensure every test scenario produced in Phase 2 has concrete, sufficient test data. Remove tests that lack data. Verify final coverage stays above 70%.
+**Goal**: Ensure every test scenario produced in Phase 2 has concrete, sufficient test data. Remove tests that lack data. Verify final coverage stays above 75%.
 **Constraints**: This phase is MANDATORY and cannot be skipped.

-#### Step 1 — Build the test-data and expected-result requirements checklist
+#### Step 1 — Build the requirements checklist

-Scan `blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, and `resource-limit-tests.md`. For every test scenario, extract:
+Scan `blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, and `resource-limit-tests.md`. For every test scenario, classify its shape (input/output or behavioral) and extract:

+**Input/output tests:**
+
 | # | Test Scenario ID | Test Name | Required Input Data | Required Expected Result | Result Quantifiable? | Comparison Method | Input Provided? | Expected Result Provided? |
 |---|-----------------|-----------|---------------------|-------------------------|---------------------|-------------------|----------------|--------------------------|
 | 1 | [ID] | [name] | [data description] | [what system should output] | [Yes/No] | [exact/tolerance/pattern/threshold] | [Yes/No] | [Yes/No] |

-Present this table to the user.
+**Behavioral tests:**
+
+| # | Test Scenario ID | Test Name | Trigger Condition | Observable Behavior | Pass/Fail Criterion | Quantifiable? |
+|---|-----------------|-----------|-------------------|--------------------|--------------------|---------------|
+| 1 | [ID] | [name] | [e.g., service receives SIGTERM] | [e.g., drain logs emitted, port closed] | [e.g., drain completes ≤30s] | [Yes/No] |
+
+Present both tables to the user.

 #### Step 2 — Ask user to provide missing test data AND expected results
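A behavioral row like those tabulated above still resolves to a programmatic verdict. A hypothetical Python sketch for the retry/backoff criterion quoted earlier; the function name and timestamps are illustrative assumptions:

```python
def check_retry_backoff(attempt_times_ms, base_ms=100, tol_ms=20,
                        expected_attempts=3):
    """Verdict for the criterion 'retry emits 3 attempts with
    exponential backoff (base 100ms +/- 20ms)': the gap between
    consecutive attempts must double from the base, each gap
    landing within the stated tolerance."""
    if len(attempt_times_ms) != expected_attempts:
        return False
    gaps = [b - a for a, b in zip(attempt_times_ms, attempt_times_ms[1:])]
    return all(abs(gap - base_ms * 2**i) <= tol_ms
               for i, gap in enumerate(gaps))

# Observed attempt timestamps in ms: gaps of 110 and 195, both in range.
print(check_retry_backoff([0, 110, 305]))   # True
# Only two attempts observed: fails the criterion.
print(check_retry_backoff([0, 100]))        # False
```

The trigger (a transient failure), the observable (attempt timestamps), and the criterion (count plus timing bounds) map directly onto the behavioral table's columns, with no input data required.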
@@ -315,20 +326,20 @@ After all removals, recalculate coverage:
 **Decision**:

-- **Coverage ≥ 70%** → Phase 3 **PASSED**. Present final summary to user.
-- **Coverage < 70%** → Phase 3 **FAILED**. Report:
-> ❌ Test coverage dropped to **X%** (minimum 70% required). The removed test scenarios left gaps in the following acceptance criteria / restrictions:
+- **Coverage ≥ 75%** → Phase 3 **PASSED**. Present final summary to user.
+- **Coverage < 75%** → Phase 3 **FAILED**. Report:
+> ❌ Test coverage dropped to **X%** (minimum 75% required). The removed test scenarios left gaps in the following acceptance criteria / restrictions:
 >
 > | Uncovered Item | Type (AC/Restriction) | Missing Test Data Needed |
 > |---|---|---|
 >
 > **Action required**: Provide the missing test data for the items above, or add alternative test scenarios that cover these items with data you can supply.

-**BLOCKING**: Loop back to Step 2 with the uncovered items. Do NOT finalize until coverage ≥ 70%.
+**BLOCKING**: Loop back to Step 2 with the uncovered items. Do NOT finalize until coverage ≥ 75%.

 #### Phase 3 Completion

-When coverage ≥ 70% and all remaining tests have validated data AND quantifiable expected results:
+When coverage ≥ 75% and all remaining tests have validated data AND quantifiable expected results:

 1. Present the final coverage report
 2. List all removed tests (if any) with reasons
@@ -479,23 +490,23 @@ Create `scripts/run-performance-tests.sh` at the project root. The script must:
 | Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — specification cannot proceed |
 | Missing input_data/expected_results/results_report.md | **STOP** — ask user to provide expected results mapping using the template |
 | Ambiguous requirements | ASK user |
-| Input data coverage below 70% (Phase 1) | Search internet for supplementary data, ASK user to validate |
+| Input data coverage below 75% (Phase 1) | Search internet for supplementary data, ASK user to validate |
 | Expected results missing or not quantifiable (Phase 1) | ASK user to provide quantifiable expected results before proceeding |
 | Test scenario conflicts with restrictions | ASK user to clarify intent |
 | System interfaces unclear (no architecture.md) | ASK user or derive from solution.md |
 | Test data or expected result not provided for a test scenario (Phase 3) | WARN user and REMOVE the test |
-| Final coverage below 70% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |
+| Final coverage below 75% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |

 ## Common Mistakes

 - **Referencing internals**: tests must be black-box — no internal module names, no direct DB queries against the system under test
-- **Vague expected outcomes**: "works correctly" is not a test outcome; use specific measurable values
-- **Missing expected results**: input data without a paired expected result is useless — the test cannot determine pass/fail without knowing what "correct" looks like
-- **Non-quantifiable expected results**: "should return good results" is not verifiable; expected results must have exact values, tolerances, thresholds, or pattern matches that code can evaluate
+- **Missing pass/fail criterion**: input/output tests without an expected result, OR behavioral tests without a measurable observable — both are unverifiable and must be removed
+- **Non-quantifiable criteria**: "should return good results", "works correctly", "behaves properly" — not verifiable. Use exact values, tolerances, thresholds, pattern matches, or timing bounds that code can evaluate.
+- **Forcing the wrong shape**: do not invent fake input data for a behavioral test (e.g., "input: SIGTERM signal") just to fit the input/output shape. Classify the test correctly and use the matching checklist.
 - **Missing negative scenarios**: every positive scenario category should have corresponding negative/edge-case tests
 - **Untraceable tests**: every test should trace to at least one AC or restriction
 - **Writing test code**: this skill produces specifications, never implementation code
 - **Tests without data**: every test scenario MUST have concrete test data AND a quantifiable expected result; a test spec without either is not executable and must be removed

 ## Trigger Conditions
@@ -516,7 +527,7 @@ When the user wants to:
 │ → verify AC, restrictions, input_data (incl. expected_results.md) │
 │                                                                   │
 │ Phase 1: Input Data & Expected Results Completeness Analysis │
-│ → assess input_data/ coverage vs AC scenarios (≥70%) │
+│ → assess input_data/ coverage vs AC scenarios (≥75%) │
 │ → verify every input has a quantifiable expected result │
 │ → present input→expected-result pairing assessment │
 │ [BLOCKING: user confirms input data + expected results coverage] │
@@ -538,8 +549,8 @@ When the user wants to:
 │ → validate input data (quality + quantity) │
 │ → validate expected results (quantifiable + comparison method) │
 │ → remove tests without data or expected result, warn user │
-│ → final coverage check (≥70% or FAIL + loop back) │
-│ [BLOCKING: coverage ≥ 70% required to pass] │
+│ → final coverage check (≥75% or FAIL + loop back) │
+│ [BLOCKING: coverage ≥ 75% required to pass] │
 │ │
 │ Phase 4: Test Runner Script Generation │
 │ → detect test runner + docker-compose + load tool │