# Phase 3: Test Data & Expected Results Validation Gate (HARD GATE)

**Role**: Professional Quality Assurance Engineer
**Goal**: Ensure every test scenario produced in Phase 2 has concrete, sufficient test data and a quantifiable expected result. Remove tests that lack either. Verify final coverage is at least 75%.
**Constraints**: This phase is MANDATORY and cannot be skipped.

## Step 1 — Build the requirements checklist

Scan `blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, and `resource-limit-tests.md`. For every test scenario, classify its shape (input/output or behavioral) and extract:

**Input/output tests:**

| # | Test Scenario ID | Test Name | Required Input Data | Required Expected Result | Result Quantifiable? | Comparison Method | Input Provided? | Expected Result Provided? |
|---|-----------------|-----------|---------------------|-------------------------|---------------------|-------------------|----------------|--------------------------|
| 1 | [ID] | [name] | [data description] | [what system should output] | [Yes/No] | [exact/tolerance/pattern/threshold] | [Yes/No] | [Yes/No] |

**Behavioral tests:**

| # | Test Scenario ID | Test Name | Trigger Condition | Observable Behavior | Pass/Fail Criterion | Quantifiable? |
|---|-----------------|-----------|-------------------|--------------------|--------------------|---------------|
| 1 | [ID] | [name] | [e.g., service receives SIGTERM] | [e.g., drain logs emitted, port closed] | [e.g., drain completes ≤30s] | [Yes/No] |

Present both tables to the user.

## Step 2 — Ask user to provide missing test data AND expected results

For each row where **Input Provided?** is **No** OR **Expected Result Provided?** is **No**, ask the user:

> **Option A — Provide the missing items**: Supply what is missing:
> - **Missing input data**: Place test data files in `_docs/00_problem/input_data/` or indicate the location.
> - **Missing expected result**: Provide the quantifiable expected result for this input. Update `_docs/00_problem/input_data/expected_results/results_report.md` with a row mapping the input to its expected output. If the expected result is complex, provide a reference CSV file in `_docs/00_problem/input_data/expected_results/`. Use `.cursor/skills/test-spec/templates/expected-results.md` for format guidance.
>
> Expected results MUST be quantifiable — the test must be able to programmatically compare actual vs expected. Examples:
> - "3 detections with bounding boxes [(x1,y1,x2,y2), ...] ± 10px"
> - "HTTP 200 with JSON body matching `expected_response_01.json`"
> - "Processing time < 500ms"
> - "0 false positives in the output set"
>
> **Option B — Skip this test**: If you cannot provide the data or expected result, this test scenario will be **removed** from the specification.

**BLOCKING**: Wait for the user's response for every missing item.
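
The selection rule in this step can be sketched as follows. This is a minimal illustration, not part of the skill itself: the `ChecklistRow` class, its field names, and the scenario IDs are hypothetical stand-ins for the Step 1 table columns.

```python
from dataclasses import dataclass


@dataclass
class ChecklistRow:
    # Hypothetical mirror of the Step 1 input/output table columns.
    scenario_id: str
    test_name: str
    input_provided: bool
    expected_result_provided: bool

    def missing_items(self) -> list:
        """Return which of the two required items are still missing."""
        missing = []
        if not self.input_provided:
            missing.append("input data")
        if not self.expected_result_provided:
            missing.append("expected result")
        return missing


def rows_needing_user_input(rows):
    """Map scenario ID -> missing items, for rows with at least one 'No'."""
    return {
        row.scenario_id: row.missing_items()
        for row in rows
        if row.missing_items()
    }


rows = [
    ChecklistRow("BB-01", "single image detection", True, True),
    ChecklistRow("BB-02", "batch detection", True, False),
    ChecklistRow("PERF-01", "latency under load", False, False),
]
pending = rows_needing_user_input(rows)
```

Every key in `pending` corresponds to one question to the user, and the phase stays blocked until each of them is answered with Option A or Option B.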

## Step 3 — Validate provided data and expected results

For each item where the user chose **Option A**:

**Input data validation**:

1. Verify the data file(s) exist at the indicated location
2. Verify **quality**: data matches the format, schema, and constraints described in the test scenario (e.g., correct image resolution, valid JSON structure, expected value ranges)
3. Verify **quantity**: enough data samples to cover the scenario (e.g., at least N images for a batch test, multiple edge-case variants)

**Expected result validation**:

4. Verify the expected result exists in `_docs/00_problem/input_data/expected_results/results_report.md` or as a referenced file in `_docs/00_problem/input_data/expected_results/`
5. Verify **quantifiability**: the expected result can be evaluated programmatically — it must contain at least one of:
   - Exact values (counts, strings, status codes)
   - Numeric values with tolerance (e.g., `± 10px`, `≥ 0.85`)
   - Pattern matches (regex, substring, JSON schema)
   - Thresholds (e.g., `< 500ms`, `≤ 5% error rate`)
   - Reference file for structural comparison (JSON diff, CSV diff)
6. Verify **completeness**: the expected result covers all outputs the test checks (not just one field when the test validates multiple)
7. Verify **consistency**: the expected result is consistent with the acceptance criteria it traces to

If any validation fails, report the specific issue and loop back to Step 2 for that item.
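
To make "can be evaluated programmatically" concrete, the comparison kinds listed under quantifiability could map onto helpers like the ones below. This is a hedged sketch: the function names are illustrative, and the bounding-box helper assumes actual and expected boxes arrive in the same order, which a real test may need to establish by matching first.

```python
import operator
import re

# Supported threshold operators, e.g. "< 500ms" or ">= 0.85".
_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def compare_exact(actual, expected):
    """Exact values: counts, strings, status codes."""
    return actual == expected


def compare_tolerance(actual, expected, tol):
    """Numeric value with a symmetric tolerance, e.g. +/- 10px."""
    return abs(actual - expected) <= tol


def compare_pattern(actual, pattern):
    """Pattern match: regex found anywhere in the actual text."""
    return re.search(pattern, actual) is not None


def compare_threshold(actual, limit, op="<"):
    """Threshold check, e.g. processing time < 500 ms."""
    return _OPS[op](actual, limit)


def boxes_within(actual_boxes, expected_boxes, tol_px):
    """Same number of boxes, every coordinate within tol_px.

    Assumes both lists are in the same order.
    """
    if len(actual_boxes) != len(expected_boxes):
        return False
    return all(
        compare_tolerance(a, e, tol_px)
        for a_box, e_box in zip(actual_boxes, expected_boxes)
        for a, e in zip(a_box, e_box)
    )
```

An expected result that cannot be expressed through at least one comparison of this kind (or a reference-file diff) is a sign that it fails the quantifiability check.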

## Step 4 — Remove tests without data or expected results

For each item where the user chose **Option B**:

1. Warn the user: `⚠️ Test scenario [ID] "[Name]" will be REMOVED from the specification due to missing test data or expected result.`
2. Remove the test scenario from the respective test file
3. Remove corresponding rows from `traceability-matrix.md`
4. Update `test-data.md` to reflect the removal

**Save action**: Write updated files under TESTS_OUTPUT_DIR:

- `test-data.md`
- Affected test files (if tests removed)
- `traceability-matrix.md` (if tests removed)

## Step 5 — Final coverage check

After all removals, recalculate coverage:

1. Count total acceptance criteria + restrictions (`total_items`)
2. Count how many of those items are still traced by at least one remaining test scenario (`covered_items`)
3. Calculate coverage percentage: `covered_items / total_items * 100`

| Metric | Value |
|--------|-------|
| Total AC + Restrictions | ? |
| Covered by remaining tests | ? |
| **Coverage %** | **?%** |
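
The calculation can be sketched as below, under the assumption that an item counts as covered when at least one remaining test scenario traces to it. The item and scenario IDs are hypothetical, and the trace mapping stands in for whatever `traceability-matrix.md` contains after removals.

```python
MIN_COVERAGE = 75.0  # gate threshold used in this phase


def coverage_report(all_items, remaining_traces):
    """Compute coverage after removals.

    all_items: IDs of every acceptance criterion and restriction.
    remaining_traces: scenario ID -> IDs of the items it traces to.
    Returns (coverage_percent, sorted list of uncovered item IDs).
    """
    total = set(all_items)
    if not total:
        raise ValueError("no acceptance criteria or restrictions defined")
    covered = set()
    for items in remaining_traces.values():
        covered |= set(items) & total  # ignore traces to unknown items
    percent = len(covered) / len(total) * 100
    return percent, sorted(total - covered)


items = {"AC-1", "AC-2", "AC-3", "R-1"}
traces = {
    "BB-01": {"AC-1", "AC-2"},
    "PERF-01": {"AC-2", "R-1"},
}
percent, uncovered = coverage_report(items, traces)
passed = percent >= MIN_COVERAGE
```

Here three of four items remain covered, so the result sits exactly on the 75% boundary and passes; `uncovered` lists the items that would go into the failure report if it did not.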

**Decision**:

- **Coverage ≥ 75%** → Phase 3 **PASSED**. Present final summary to user.
- **Coverage < 75%** → Phase 3 **FAILED**. Report:
  > ❌ Test coverage dropped to **X%** (minimum 75% required). The removed test scenarios left gaps in the following acceptance criteria / restrictions:
  >
  > | Uncovered Item | Type (AC/Restriction) | Missing Test Data Needed |
  > |---|---|---|
  >
  > **Action required**: Provide the missing test data for the items above, or add alternative test scenarios that cover these items with data you can supply.

  **BLOCKING**: Loop back to Step 2 with the uncovered items. Do NOT finalize until coverage ≥ 75%.

## Phase 3 Completion

When coverage ≥ 75% and all remaining tests have validated data AND quantifiable expected results:

1. Present the final coverage report
2. List all removed tests (if any) with reasons
3. Confirm every remaining test has: input data + quantifiable expected result + comparison method
4. Confirm all artifacts are saved and consistent

After Phase 3 completion, run `phases/hardware-assessment.md` before Phase 4.