mirror of https://github.com/azaion/ai-training.git (synced 2026-04-22 11:26:36 +00:00)
Refine coding standards and testing guidelines
- Updated the coding rule descriptions to emphasize readability, meaningful comments, and test verification.
- Revised guidelines to clarify the importance of avoiding boilerplate while maintaining readability.
- Enhanced the testing rules to set a minimum coverage threshold of 75% for business logic and specified criteria for test scenarios.
- Introduced a mechanism for handling skipped tests, categorizing them as legitimate or illegitimate, and outlined resolution steps.

These changes aim to improve code quality, maintainability, and testing effectiveness.
@@ -55,6 +55,11 @@ After selecting the flow, apply its detection rules (first match wins) to determ
 Every invocation follows this sequence:
 
 ```
+0. Process leftovers (see `.cursor/rules/tracker.mdc` → Leftovers Mechanism):
+   - Read _docs/_process_leftovers/ if it exists
+   - For each entry, attempt replay against the tracker
+   - Delete successful replays, update failed ones with new timestamp + reason
+   - If any leftover still blocked AND requires user input → STOP and ASK
 1. Read _docs/_autopilot_state.md (if exists)
 2. Read all File Index files above
 3. Cross-check state file against _docs/ folder structure (rules in state.md)
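For illustration, a minimal sketch of the step-0 leftover replay added above — the `_docs/_process_leftovers/` path comes from the rules; the one-JSON-file-per-entry layout and the function names are assumptions:

```python
# Hypothetical sketch of step 0 (leftover replay); only the directory
# name comes from the docs — the entry format and helpers are invented.
import json
from datetime import datetime, timezone
from pathlib import Path

LEFTOVERS_DIR = Path("_docs/_process_leftovers")

def replay_against_tracker(entry: dict) -> bool:
    # Placeholder: re-attempt the recorded tracker operation here.
    return False

def process_leftovers() -> list[dict]:
    """Replay each leftover; delete successes, restamp failures.

    Returns entries that are still blocked so the caller can decide
    whether any of them require user input (→ STOP and ASK).
    """
    blocked: list[dict] = []
    if not LEFTOVERS_DIR.exists():
        return blocked
    for path in sorted(LEFTOVERS_DIR.glob("*.json")):
        entry = json.loads(path.read_text())
        if replay_against_tracker(entry):
            path.unlink()  # successful replay: drop the leftover
        else:
            entry["last_attempt"] = datetime.now(timezone.utc).isoformat()
            entry.setdefault("reason", "replay failed")
            path.write_text(json.dumps(entry, indent=2))
            blocked.append(entry)
    return blocked
```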
@@ -28,7 +28,7 @@ The `implementer` agent is the specialist that writes all the code — it receiv
 - **Integrated review**: `/code-review` skill runs automatically after each batch
 - **Auto-start**: batches launch immediately — no user confirmation before a batch
 - **Gate on failure**: user confirmation is required only when code review returns FAIL
-- **Commit and push per batch**: after each batch is confirmed, commit and push to remote
+- **Commit per batch**: after each batch is confirmed, commit. Ask the user whether to push to remote unless the user previously opted into auto-push for this session.
 
 ## Context Resolution
@@ -134,25 +134,38 @@ Only proceed to Step 9 when every AC has a corresponding test.
 
 ### 10. Auto-Fix Gate
 
-Auto-fix loop with bounded retries (max 2 attempts) before escalating to user:
-
-1. If verdict is **PASS** or **PASS_WITH_WARNINGS**: show findings as info, continue automatically to step 11
-2. If verdict is **FAIL** (attempt 1 or 2):
-   - Parse the code review findings (Critical and High severity items)
-   - For each finding, attempt an automated fix using the finding's location, description, and suggestion
-   - Re-run `/code-review` on the modified files
-   - If now PASS or PASS_WITH_WARNINGS → continue to step 11
-   - If still FAIL → increment retry counter, repeat from (2) up to max 2 attempts
-3. If still **FAIL** after 2 auto-fix attempts: present all findings to user (**BLOCKING**). User must confirm fixes or accept before proceeding.
-
-Track `auto_fix_attempts` count in the batch report for retrospective analysis.
+Bounded auto-fix loop — only applies to **mechanical** findings. Critical and Security findings are never auto-fixed.
+
+**Auto-fix eligibility matrix:**
+
+| Severity | Category | Auto-fix? |
+|----------|----------|-----------|
+| Low | any | yes |
+| Medium | Style, Maintainability, Performance | yes |
+| Medium | Bug, Spec-Gap, Security | escalate |
+| High | Style, Scope | yes |
+| High | Bug, Spec-Gap, Performance, Maintainability | escalate |
+| Critical | any | escalate |
+| any | Security | escalate |
+
+Flow:
+
+1. If verdict is **PASS** or **PASS_WITH_WARNINGS**: show findings as info, continue to step 11
+2. If verdict is **FAIL**:
+   - Partition findings into auto-fix-eligible and escalate (using the matrix above)
+   - For eligible findings, attempt fixes using location/description/suggestion, then re-run `/code-review` on modified files (max 2 rounds)
+   - If all remaining findings are auto-fix-eligible and re-review now passes → continue to step 11
+   - If any non-eligible finding exists at any point → stop auto-fixing, present the full list to the user (**BLOCKING**)
+3. User must explicitly approve each non-auto-fix finding (accept, request manual fix, mark as out-of-scope) before proceeding.
+
+Track `auto_fix_attempts` and `escalated_findings` in the batch report for retrospective analysis.
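A sketch of how the eligibility matrix above might be encoded, assuming findings are dicts with `severity` and `category` keys; the names and data shapes are illustrative, not the project's actual API:

```python
# Hypothetical encoding of the auto-fix eligibility matrix.
AUTO_FIX_ELIGIBLE = {
    ("medium", "style"), ("medium", "maintainability"), ("medium", "performance"),
    ("high", "style"), ("high", "scope"),
}

def is_auto_fixable(severity: str, category: str) -> bool:
    """True → attempt automated fix; False → escalate to the user."""
    severity, category = severity.lower(), category.lower()
    if category == "security" or severity == "critical":
        return False  # never auto-fixed, per the matrix
    if severity == "low":
        return True   # Low / any → yes
    return (severity, category) in AUTO_FIX_ELIGIBLE

def partition(findings: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split review findings into auto-fix-eligible and escalated."""
    eligible: list[dict] = []
    escalated: list[dict] = []
    for f in findings:
        (eligible if is_auto_fixable(f["severity"], f["category"]) else escalated).append(f)
    return eligible, escalated
```

Security-category and Critical-severity findings short-circuit to escalation, which is why the two catch-all `escalate` rows never need a table lookup.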
 
-### 11. Commit and Push
+### 11. Commit (and optionally Push)
 
 - After user confirms the batch (explicitly for FAIL, implicitly for PASS/PASS_WITH_WARNINGS):
   - `git add` all changed files from the batch
   - `git commit` with a message that includes ALL task IDs (tracker IDs or numeric prefixes) of tasks implemented in the batch, followed by a summary of what was implemented. Format: `[TASK-ID-1] [TASK-ID-2] ... Summary of changes`
-  - `git push` to the remote branch
+  - Ask the user whether to push to remote, unless the user previously opted into auto-push for this session
 
 ### 12. Update Tracker Status → In Testing
@@ -119,7 +119,7 @@ Read and follow `steps/07_quality-checklist.md`.
 |-----------|--------|
 | Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — planning cannot proceed |
 | Ambiguous requirements | ASK user |
-| Input data coverage below 70% | Search internet for supplementary data, ASK user to validate |
+| Input data coverage below 75% | Search internet for supplementary data, ASK user to validate |
 | Technology choice with multiple valid options | ASK user |
 | Component naming | PROCEED, confirm at next BLOCKING gate |
 | File structure within templates | PROCEED |
@@ -32,3 +32,17 @@
 6. Applicable scenarios
 7. Team capability requirements
 8. Migration difficulty
+
+## Decomposition Completeness Probes (Completeness Audit Reference)
+
+Used during Step 1's Decomposition Completeness Audit. After generating sub-questions, ask each probe against the current decomposition. If a probe reveals an uncovered area, add a sub-question for it.
+
+| Probe | What it catches |
+|-------|-----------------|
+| **What does this cost — in money, time, resources, or trade-offs?** | Budget, pricing, licensing, tax, opportunity cost, maintenance burden |
+| **What are the hard constraints — physical, legal, regulatory, environmental?** | Regulations, certifications, spectrum/frequency rules, export controls, physics limits, IP restrictions |
+| **What are the dependencies and assumptions that could break?** | Supply chain, vendor lock-in, API stability, single points of failure, standards evolution |
+| **What does the operating environment actually look like?** | Terrain, weather, connectivity, infrastructure, power, latency, user skill level |
+| **What failure modes exist and what happens when they trigger?** | Degraded operation, fallback, safety margins, blast radius, recovery time |
+| **What do practitioners who solved similar problems say matters most?** | Field-tested priorities that don't appear in specs or papers |
+| **What changes over time — and what looks stable now but isn't?** | Technology roadmaps, regulatory shifts, deprecation risk, scaling effects |
@@ -10,6 +10,12 @@
 - [ ] Every citation can be directly verified by the user (source verifiability)
 - [ ] Structure hierarchy is clear; executives can quickly locate information
 
+## Decomposition Completeness
+
+- [ ] Domain discovery search executed: searched "key factors when [problem domain]" before starting research
+- [ ] Completeness probes applied: every probe from `references/comparison-frameworks.md` checked against sub-questions
+- [ ] No uncovered areas remain: all gaps filled with sub-questions or justified as not applicable
+
 ## Internet Search Depth
 
 - [ ] Every sub-question was searched with at least 3-5 different query variants
@@ -97,6 +97,16 @@ When decomposing questions, you must explicitly define the **boundaries of the r
 
 **Common mistake**: User asks about "university classroom issues" but sources include policies targeting "K-12 students" — mismatched target populations will invalidate the entire research.
 
+#### Decomposition Completeness Audit (MANDATORY)
+
+After generating sub-questions, verify the decomposition covers all major dimensions of the problem — not just the ones that came to mind first.
+
+1. **Domain discovery search**: Search the web for "key factors when [problem domain]" / "what to consider when [problem domain]" (e.g., "key factors GPS-denied navigation", "what to consider when choosing an edge deployment strategy"). Extract dimensions that practitioners and domain experts consider important but are absent from the current sub-questions.
+2. **Run completeness probes**: Walk through each probe in `references/comparison-frameworks.md` → "Decomposition Completeness Probes" against the current sub-question list. For each probe, note whether it is covered, not applicable (state why), or missing.
+3. **Fill gaps**: Add sub-questions (with search query variants) for any uncovered area. Do this before proceeding to Step 2.
+
+Record the audit result in `00_question_decomposition.md` as a "Completeness Audit" section.
+
 **Save action**:
 1. Read all files from INPUT_DIR to ground the research in the project context
 2. Create working directory `RESEARCH_DIR/`
@@ -109,6 +119,7 @@ When decomposing questions, you must explicitly define the **boundaries of the r
 - List of decomposed sub-questions
 - **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale
 - **Search query variants** for each sub-question (at least 3-5 per sub-question)
+- **Completeness audit** (taxonomy cross-reference + domain discovery results)
 4. Write TodoWrite to track progress
 
 ---
@@ -102,32 +102,46 @@ After investigating, present:
 - If user picks A → apply fixes, then re-run (loop back to step 2)
 - If user picks B → return failure to the autopilot
 
-**Any test skipped** → this is also a **blocking gate**. Skipped tests mean something is wrong — either with the test, the environment, or the test design. **Never blindly remove a skipped test.** Always investigate the root cause first.
-
-#### Investigation Protocol for Skipped Tests
-
-For each skipped test:
-
-1. **Read the test code** — understand what the test is supposed to verify and why it skips.
-2. **Determine the root cause** — why did the skip condition fire?
-   - Is the test environment misconfigured? (e.g., wrong ports, missing env vars, service not started correctly)
-   - Is the test ordering wrong? (e.g., a fixture in an earlier test mutates shared state)
-   - Is a dependency missing? (e.g., package not installed, fixture file absent)
-   - Is the skip condition outdated? (e.g., code was refactored but the skip guard still checks the old behavior)
-   - Is the test fundamentally untestable in the current setup? (e.g., requires Docker restart, different OS, special hardware)
-3. **Try to fix the root cause first** — the goal is to make the test run, not to delete it:
-   - Fix the environment or configuration
-   - Reorder tests or isolate shared state
-   - Install the missing dependency
-   - Update the skip condition to match current behavior
-4. **Only remove as last resort** — if the test truly cannot run in any realistic test environment (e.g., requires hardware not available, duplicates another test with identical assertions), then removal is justified. Document the reasoning.
-
-#### Categorization
-
-- **explicit skip (dead code)**: Has `@pytest.mark.skip` — investigate whether the reason in the decorator is still valid. Often these are temporary skips that became permanent by accident.
-- **runtime skip (unreachable)**: `pytest.skip()` fires inside the test body — investigate why the condition always triggers. Often fixable by adjusting test order, environment, or the condition itself.
-- **environment mismatch**: Test assumes a different environment — investigate whether the test environment setup can be fixed.
-- **missing fixture/data**: Data or service not available — investigate whether it can be provided.
+**Any skipped test** → classify as legitimate or illegitimate before deciding whether to block.
+
+#### Legitimate skips (accept and proceed)
+
+The code path genuinely cannot execute on this runner. Acceptable reasons:
+
+- Hardware not physically present (GPU, Apple Neural Engine, sensor, serial device)
+- Operating system mismatch (Darwin-only test on Linux CI, Windows-only test on macOS)
+- Feature-flag-gated test whose feature is intentionally disabled in this environment
+- External service the project deliberately does not control (e.g., a third-party API with no sandbox, and the project has a documented contract test instead)
+
+For legitimate skips: verify the skip condition is accurate (the test would run if the hardware/OS were present), verify it has a clear reason string, and proceed.
+
+#### Illegitimate skips (BLOCKING — must resolve)
+
+The skip is a workaround for something we can and should fix. NOT acceptable reasons:
+
+- Required service not running (database, message broker, downstream API we control) → fix: bring the service up, add a docker-compose dependency, or add a mock
+- Missing test fixture, seed data, or sample file → fix: provide the data, generate it, or ASK the user for it
+- Missing environment variable or credential → fix: add to `.env.example`, document, ASK user for the value
+- Flaky-test quarantine with no tracking ticket → fix: create the ticket (or replay via leftovers if tracker is down)
+- Inherited skip from a prior refactor that was never cleaned up → fix: clean it up now
+- Test ordering mutates shared state → fix: isolate the state
+
+**Rule of thumb**: if the reason for skipping is "we didn't set something up," that's not a valid skip — set it up. If the reason is "this hardware/OS isn't here," that's valid.
+
+#### Resolution steps for illegitimate skips
+
+1. Classify the skip (read the skip reason and test body)
+2. If the fix is **mechanical** — start a container, install a dep, add a mock, reorder fixtures — attempt it automatically and re-run
+3. If the fix requires **user input** — credentials, sample data, a business decision — BLOCK and ASK
+4. Never silently mark the skip as "accepted" — every illegitimate skip must either be fixed or escalated
+5. Removal is a last resort and requires explicit user approval with documented reasoning
+
+#### Categorization cheatsheet
+
+- **explicit skip (e.g. `@pytest.mark.skip`)**: check whether the reason in the decorator is still valid
+- **conditional skip (e.g. `@pytest.mark.skipif`)**: check whether the condition is accurate and whether we can change the environment to make it false
+- **runtime skip (e.g. `pytest.skip()` in body)**: check why the condition fires — often an ordering or environment bug
+- **missing fixture/data**: treated as illegitimate unless user confirms the data is unavailable
 
 After investigating, present findings:
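Illustrative pytest shapes for the two categories above — `@pytest.mark.skipif` and `@pytest.mark.skip` are real pytest markers; the tests, conditions, and reasons are invented:

```python
import sys

import pytest

# Legitimate skip: the code path genuinely cannot execute on this runner.
@pytest.mark.skipif(sys.platform != "darwin", reason="exercises macOS keychain APIs")
def test_keychain_roundtrip():
    ...

# Illegitimate skip: "service not running" is a setup gap we control —
# bring the database up (docker-compose dependency) or mock it instead.
@pytest.mark.skip(reason="postgres not running on my machine")
def test_order_persistence():
    ...
```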
@@ -27,8 +27,11 @@ Analyze input data completeness and produce detailed black-box test specificatio
 - **Save immediately**: write artifacts to disk after each phase; never accumulate unsaved work
 - **Ask, don't assume**: when requirements are ambiguous, ask the user before proceeding
 - **Spec, don't code**: this workflow produces test specifications, never test implementation code
-- **No test without data**: every test scenario MUST have concrete test data; tests without data are removed
-- **No test without expected result**: every test scenario MUST pair input data with a quantifiable expected result; a test that cannot compare actual output against a known-correct answer is not verifiable and must be removed
+- **Every test must have a pass/fail criterion**. Two acceptable shapes:
+  - **Input/output shape**: concrete input data paired with a quantifiable expected result (exact value, tolerance, threshold, pattern, reference file). Typical for functional blackbox tests, performance tests with load data, data-processing pipelines.
+  - **Behavioral shape**: a trigger condition + observable system behavior + quantifiable pass/fail criterion, with no input data required. Typical for startup/shutdown tests, retry/backoff policies, state transitions, logging/metrics emission, resilience scenarios. Example criteria: "startup logs `service ready` within 5s", "retry emits 3 attempts with exponential backoff (base 100ms ± 20ms)", "on SIGTERM, service drains in-flight requests within 30s grace period", "health endpoint returns 503 while migrations run".
+  - For behavioral tests the observable (log line, metric value, state transition, emitted event, elapsed time) must still be quantifiable — the test must programmatically decide pass/fail.
+  - A test that cannot produce a pass/fail verdict through either shape is not verifiable and must be removed.
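A runnable sketch of a behavioral-shape test — trigger (SIGTERM) + observable (clean exit) + quantifiable criterion (drain ≤ 30s). The Python child process stands in for a real service; all names and bounds are illustrative:

```python
# Hypothetical behavioral-shape test (POSIX): the child installs a
# SIGTERM handler that "drains" for 1s, then exits cleanly.
import signal
import subprocess
import sys
import time

CHILD = (
    "import signal, sys, time\n"
    "signal.signal(signal.SIGTERM, lambda *_: (time.sleep(1), sys.exit(0)))\n"
    "time.sleep(60)\n"
)

def test_sigterm_drains_within_grace_period():
    proc = subprocess.Popen([sys.executable, "-c", CHILD])
    time.sleep(0.5)                      # let the handler get installed
    start = time.monotonic()
    proc.send_signal(signal.SIGTERM)     # trigger condition
    proc.wait(timeout=35)
    elapsed = time.monotonic() - start
    assert elapsed <= 30.0               # quantifiable pass/fail criterion
    assert proc.returncode == 0          # observable: clean (drained) exit
```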
 
 ## Context Resolution
@@ -177,7 +180,7 @@ At the start of execution, create a TodoWrite with all four phases. Update statu
 |------------|--------------------------|---------------|----------------|
 | [file/data] | Yes/No | Yes/No | [missing, vague, no tolerance, etc.] |
 
-9. Threshold: at least 70% coverage of scenarios AND every covered scenario has a quantifiable expected result (see `.cursor/rules/cursor-meta.mdc` Quality Thresholds table)
+9. Threshold: at least 75% coverage of scenarios AND every covered scenario has a quantifiable expected result (see `.cursor/rules/cursor-meta.mdc` Quality Thresholds table)
 10. If coverage is low, search the internet for supplementary data, assess quality with user, and if user agrees, add to `input_data/` and update `input_data/expected_results/results_report.md`
 11. If expected results are missing or not quantifiable, ask user to provide them before proceeding
@@ -232,18 +235,26 @@ Capture any new questions, findings, or insights that arise during test specific
 ### Phase 3: Test Data Validation Gate (HARD GATE)
 
 **Role**: Professional Quality Assurance Engineer
-**Goal**: Ensure every test scenario produced in Phase 2 has concrete, sufficient test data. Remove tests that lack data. Verify final coverage stays above 70%.
+**Goal**: Ensure every test scenario produced in Phase 2 has concrete, sufficient test data. Remove tests that lack data. Verify final coverage stays above 75%.
 **Constraints**: This phase is MANDATORY and cannot be skipped.
 
-#### Step 1 — Build the test-data and expected-result requirements checklist
+#### Step 1 — Build the requirements checklist
 
-Scan `blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, and `resource-limit-tests.md`. For every test scenario, extract:
+Scan `blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, and `resource-limit-tests.md`. For every test scenario, classify its shape (input/output or behavioral) and extract:
+
+**Input/output tests:**
 
 | # | Test Scenario ID | Test Name | Required Input Data | Required Expected Result | Result Quantifiable? | Comparison Method | Input Provided? | Expected Result Provided? |
 |---|-----------------|-----------|---------------------|-------------------------|---------------------|-------------------|----------------|--------------------------|
 | 1 | [ID] | [name] | [data description] | [what system should output] | [Yes/No] | [exact/tolerance/pattern/threshold] | [Yes/No] | [Yes/No] |
 
-Present this table to the user.
+**Behavioral tests:**
+
+| # | Test Scenario ID | Test Name | Trigger Condition | Observable Behavior | Pass/Fail Criterion | Quantifiable? |
+|---|-----------------|-----------|-------------------|--------------------|--------------------|---------------|
+| 1 | [ID] | [name] | [e.g., service receives SIGTERM] | [e.g., drain logs emitted, port closed] | [e.g., drain completes ≤30s] | [Yes/No] |
+
+Present both tables to the user.
 
 #### Step 2 — Ask user to provide missing test data AND expected results
@@ -315,20 +326,20 @@ After all removals, recalculate coverage:
 
 **Decision**:
 
-- **Coverage ≥ 70%** → Phase 3 **PASSED**. Present final summary to user.
-- **Coverage < 70%** → Phase 3 **FAILED**. Report:
-  > ❌ Test coverage dropped to **X%** (minimum 70% required). The removed test scenarios left gaps in the following acceptance criteria / restrictions:
+- **Coverage ≥ 75%** → Phase 3 **PASSED**. Present final summary to user.
+- **Coverage < 75%** → Phase 3 **FAILED**. Report:
+  > ❌ Test coverage dropped to **X%** (minimum 75% required). The removed test scenarios left gaps in the following acceptance criteria / restrictions:
   >
   > | Uncovered Item | Type (AC/Restriction) | Missing Test Data Needed |
   > |---|---|---|
   >
   > **Action required**: Provide the missing test data for the items above, or add alternative test scenarios that cover these items with data you can supply.
 
-**BLOCKING**: Loop back to Step 2 with the uncovered items. Do NOT finalize until coverage ≥ 70%.
+**BLOCKING**: Loop back to Step 2 with the uncovered items. Do NOT finalize until coverage ≥ 75%.
 
 #### Phase 3 Completion
 
-When coverage ≥ 70% and all remaining tests have validated data AND quantifiable expected results:
+When coverage ≥ 75% and all remaining tests have validated data AND quantifiable expected results:
 
 1. Present the final coverage report
 2. List all removed tests (if any) with reasons
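A minimal sketch of the recalculation behind this gate, assuming coverage is counted as covered ACs/restrictions over all ACs/restrictions — the 75% minimum comes from the spec; everything else is illustrative:

```python
# Hypothetical Phase 3 coverage gate.
def phase3_gate(still_covered: set[str], all_items: set[str],
                minimum_pct: float = 75.0) -> bool:
    """True → PASSED; False → FAILED (loop back to Step 2)."""
    coverage = 100.0 * len(still_covered & all_items) / len(all_items)
    print(f"Coverage after removals: {coverage:.1f}% (minimum {minimum_pct}%)")
    return coverage >= minimum_pct
```

For example, `phase3_gate({"AC-1", "AC-2", "R-1"}, {"AC-1", "AC-2", "AC-3", "R-1"})` reports 75.0% and passes; removing one more covered item would fail the gate and loop back to Step 2.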
@@ -479,23 +490,23 @@ Create `scripts/run-performance-tests.sh` at the project root. The script must:
 | Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — specification cannot proceed |
 | Missing input_data/expected_results/results_report.md | **STOP** — ask user to provide expected results mapping using the template |
 | Ambiguous requirements | ASK user |
-| Input data coverage below 70% (Phase 1) | Search internet for supplementary data, ASK user to validate |
+| Input data coverage below 75% (Phase 1) | Search internet for supplementary data, ASK user to validate |
 | Expected results missing or not quantifiable (Phase 1) | ASK user to provide quantifiable expected results before proceeding |
 | Test scenario conflicts with restrictions | ASK user to clarify intent |
 | System interfaces unclear (no architecture.md) | ASK user or derive from solution.md |
 | Test data or expected result not provided for a test scenario (Phase 3) | WARN user and REMOVE the test |
-| Final coverage below 70% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |
+| Final coverage below 75% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |
 
 ## Common Mistakes
 
 - **Referencing internals**: tests must be black-box — no internal module names, no direct DB queries against the system under test
-- **Vague expected outcomes**: "works correctly" is not a test outcome; use specific measurable values
-- **Missing expected results**: input data without a paired expected result is useless — the test cannot determine pass/fail without knowing what "correct" looks like
-- **Non-quantifiable expected results**: "should return good results" is not verifiable; expected results must have exact values, tolerances, thresholds, or pattern matches that code can evaluate
+- **Missing pass/fail criterion**: input/output tests without an expected result, OR behavioral tests without a measurable observable — both are unverifiable and must be removed
+- **Non-quantifiable criteria**: "should return good results", "works correctly", "behaves properly" — not verifiable. Use exact values, tolerances, thresholds, pattern matches, or timing bounds that code can evaluate.
+- **Forcing the wrong shape**: do not invent fake input data for a behavioral test (e.g., "input: SIGTERM signal") just to fit the input/output shape. Classify the test correctly and use the matching checklist.
 - **Missing negative scenarios**: every positive scenario category should have corresponding negative/edge-case tests
 - **Untraceable tests**: every test should trace to at least one AC or restriction
 - **Writing test code**: this skill produces specifications, never implementation code
 - **Tests without data**: every test scenario MUST have concrete test data AND a quantifiable expected result; a test spec without either is not executable and must be removed
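To make "quantifiable" concrete, two stdlib-only illustrations of criteria that code can evaluate — a tolerance comparison and a pattern match; every value here is invented:

```python
import math
import re

def test_tolerance_comparison():
    actual = 1839.7          # stand-in for the system's measured output
    assert math.isclose(actual, 1832.4, rel_tol=0.01)  # expected 1832.4 ± 1%

def test_pattern_comparison():
    log_line = "2025-01-01T00:00:02Z INFO service ready"  # stand-in observable
    assert re.search(r"\bservice ready\b", log_line)      # verdict: match or no match
```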
 
 ## Trigger Conditions
@@ -516,7 +527,7 @@ When the user wants to:
 │ → verify AC, restrictions, input_data (incl. expected_results.md) │
 │                                                                   │
 │ Phase 1: Input Data & Expected Results Completeness Analysis      │
-│ → assess input_data/ coverage vs AC scenarios (≥70%)              │
+│ → assess input_data/ coverage vs AC scenarios (≥75%)              │
 │ → verify every input has a quantifiable expected result           │
 │ → present input→expected-result pairing assessment                │
 │ [BLOCKING: user confirms input data + expected results coverage]  │
@@ -538,8 +549,8 @@ When the user wants to:
 │ → validate input data (quality + quantity)                        │
 │ → validate expected results (quantifiable + comparison method)    │
 │ → remove tests without data or expected result, warn user         │
-│ → final coverage check (≥70% or FAIL + loop back)                 │
-│ [BLOCKING: coverage ≥ 70% required to pass]                       │
+│ → final coverage check (≥75% or FAIL + loop back)                 │
+│ [BLOCKING: coverage ≥ 75% required to pass]                       │
 │                                                                   │
 │ Phase 4: Test Runner Script Generation                            │
 │ → detect test runner + docker-compose + load tool                 │