# Validation Harness

## 1. High-Level Overview

**Purpose**: Drive black-box validation (replay, public-dataset, SITL, Jetson, and representative runs) exclusively through the runtime's public interfaces.

**Architectural Pattern**: Test harness / scenario runner.

**Upstream dependencies**: Test data fixtures, public datasets, SITL, Jetson environment.

**Downstream consumers**: CI/CD pipeline, release evidence review.

## 2. Internal Interfaces

### Interface: `ScenarioRunner`

| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `run_scenario` | `ScenarioRequest` | `ScenarioReport` | Yes | `FixtureInvalid`, `RuntimeFailed`, `ThresholdFailed` |
| `validate_fixture` | `FixtureRequest` | `FixtureValidationReport` | No | `FixtureInvalid` |

**Input DTOs**:

```yaml
ScenarioRequest:
  scenario_id: string
  execution_environment: enum(replay, sitl, jetson, representative)
  fixture_paths: list[string]
```

**Output DTOs**:

```yaml
ScenarioReport:
  scenario_id: string
  result: enum(pass, fail, blocked)
  metrics: object
  artifacts: list[string]  # file paths, same type as fixture_paths
  failure_reason: string optional
```
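
The DTOs and the async `run_scenario` method above could be mirrored in Python roughly as follows. This is a minimal sketch, not the harness's actual code: the enum class names and `StubRunner` are hypothetical, and `validate_fixture` is left out because `FixtureRequest` and `FixtureValidationReport` are not specified in this section.

```python
import asyncio
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional, Protocol


class ExecutionEnvironment(str, Enum):
    REPLAY = "replay"
    SITL = "sitl"
    JETSON = "jetson"
    REPRESENTATIVE = "representative"


class ScenarioResult(str, Enum):
    PASS = "pass"
    FAIL = "fail"
    BLOCKED = "blocked"


@dataclass(frozen=True)
class ScenarioRequest:
    scenario_id: str
    execution_environment: ExecutionEnvironment
    fixture_paths: list[str]


@dataclass
class ScenarioReport:
    scenario_id: str
    result: ScenarioResult
    metrics: dict[str, Any] = field(default_factory=dict)
    artifacts: list[str] = field(default_factory=list)
    failure_reason: Optional[str] = None  # set only for fail/blocked


class ScenarioRunner(Protocol):
    """Structural type for anything that can run a scenario."""

    async def run_scenario(self, request: ScenarioRequest) -> ScenarioReport: ...


class StubRunner:
    """Hypothetical runner that always passes; stands in for a real environment."""

    async def run_scenario(self, request: ScenarioRequest) -> ScenarioReport:
        return ScenarioReport(request.scenario_id, ScenarioResult.PASS)
```

A caller would await the runner, for example `asyncio.run(StubRunner().run_scenario(request))`, and branch on `report.result`.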

## 3. Data Access Patterns

The harness reads versioned fixtures and writes reports. It does not import runtime internals; it exercises the runtime only through its public interfaces.

## 4. Implementation Details

**State Management**: Per-run temporary directories and report aggregation.
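
One way to realize per-run temporary directories and report aggregation is sketched below. The helper names `run_workspace` and `aggregate` are hypothetical, and the precedence rule (any `fail` outranks `blocked`, and an empty run counts as `blocked`) is an assumption, not something this document specifies.

```python
import tempfile
from collections import Counter
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def run_workspace(scenario_id: str):
    """Yield a fresh temporary directory that is deleted when the run ends."""
    with tempfile.TemporaryDirectory(prefix=f"{scenario_id}-") as tmp:
        yield Path(tmp)


def aggregate(reports: list[dict]) -> dict:
    """Combine per-scenario reports into one summary.

    Assumed precedence: fail > blocked > pass; no reports at all is 'blocked'.
    """
    counts = Counter(r["result"] for r in reports)
    if counts.get("fail"):
        overall = "fail"
    elif counts.get("blocked") or not reports:
        overall = "blocked"
    else:
        overall = "pass"
    return {"overall": overall, "counts": dict(counts), "scenarios": reports}
```

Artifacts that must outlive the run would need to be copied out of the workspace before the context manager exits, since the directory is removed at that point.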

**Key Dependencies**:

| Library | Purpose |
|---------|---------|
| pytest or equivalent | Test orchestration |
| pymavlink/log parser | SITL and output validation |
| Docker/compose runner | Replay/SITL environment |

**Error Handling Strategy**:
- Fixture gaps are reported as blocked, not passed.
- Threshold failures include metrics and artifacts.
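
The two rules above could be enforced at a single point that converts exceptions into report fields. The sketch below is illustrative only: the exception names come from the interface table, but their constructors, the `report_for` helper, and the choice to map `FixtureInvalid` to `blocked` are assumptions.

```python
from typing import Any, Callable


class FixtureInvalid(Exception):
    """A required fixture is missing or malformed."""


class RuntimeFailed(Exception):
    """The runtime or test process exited abnormally."""


class ThresholdFailed(Exception):
    """A metric fell outside its threshold; carries the evidence with it."""

    def __init__(self, message: str, metrics: dict[str, Any], artifacts: list[str]):
        super().__init__(message)
        self.metrics = metrics
        self.artifacts = artifacts


def report_for(
    scenario_id: str,
    execute: Callable[[], tuple[dict[str, Any], list[str]]],
) -> dict[str, Any]:
    """Run `execute` and translate its outcome into a ScenarioReport-shaped dict."""
    report: dict[str, Any] = {
        "scenario_id": scenario_id,
        "result": "pass",
        "metrics": {},
        "artifacts": [],
        "failure_reason": None,
    }
    try:
        report["metrics"], report["artifacts"] = execute()
    except FixtureInvalid as exc:
        # A fixture gap must never be counted as a pass.
        report.update(result="blocked", failure_reason=str(exc))
    except ThresholdFailed as exc:
        # Keep the evidence so reviewers can see why the threshold failed.
        report.update(result="fail", metrics=exc.metrics,
                      artifacts=exc.artifacts, failure_reason=str(exc))
    except RuntimeFailed as exc:
        report.update(result="fail", failure_reason=str(exc))
    return report
```

Centralizing the mapping keeps the `blocked` versus `fail` distinction consistent across execution environments.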

## 5. Caveats & Edge Cases

**Known limitations**:
- Results from public datasets do not count as final acceptance evidence unless the dataset is representative and license-compatible.
- Missing synchronized target data remains a final acceptance blocker.

## 6. Dependency Graph

**Must be implemented after**: the runtime's public interfaces are defined.

**Can be implemented in parallel with**: runtime components, using mocks and fixtures, but only once the interfaces are stable.

**Blocks**: CI/release gates.

## 7. Logging Strategy

| Log Level | When | Example |
|-----------|------|---------|
| ERROR | Runtime/test process fails | `scenario_failed id=... reason=...` |
| WARN | Fixture blocked | `fixture_blocked missing=...` |
| INFO | Scenario complete | `scenario_complete id=... result=pass` |

**Log format**: Test reports as CSV/Markdown, plus structured runner logs.

**Log storage**: `test-results/`.
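
The example messages in the table could be emitted with Python's standard `logging` module, as in the sketch below. The logger name and the three helper functions are hypothetical; only the message shapes and levels are taken from the table.

```python
import logging

# Hypothetical logger name; handlers writing under test-results/ would be
# attached by the harness entry point, not here.
logger = logging.getLogger("validation_harness")


def log_scenario_failed(scenario_id: str, reason: str) -> None:
    """ERROR: the runtime or test process failed."""
    logger.error("scenario_failed id=%s reason=%s", scenario_id, reason)


def log_fixture_blocked(missing: list[str]) -> None:
    """WARN: a scenario is blocked because fixtures are missing."""
    logger.warning("fixture_blocked missing=%s", ",".join(missing))


def log_scenario_complete(scenario_id: str, result: str) -> None:
    """INFO: a scenario ran to completion."""
    logger.info("scenario_complete id=%s result=%s", scenario_id, result)
```

Keeping each event as a `key=value` line makes the runner logs easy to grep and to parse into the CSV/Markdown reports.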