Test Specification — Validation Harness
Acceptance Criteria Traceability
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|---|---|---|---|
| AC-1.1 through AC-1.4 | Position accuracy, drift, confidence | IT-01, IT-02, AT-01 | Covered |
| AC-2.1a/b, AC-2.2 | VO and satellite registration | IT-02, IT-03 | Covered |
| AC-3.1 through AC-3.5 | Resilience edge cases | IT-04, IT-05 | Covered |
| AC-4.1 through AC-4.5 | Latency, memory, MAVLink streaming | PT-01, IT-06 | Covered |
| AC-5.1 through AC-5.3 | Startup/failsafe/reboot | IT-07 | Covered |
| AC-6.1 through AC-6.3 | QGC/GCS/WGS84 | IT-06 | Covered |
| AC-7.1, AC-7.2 | Object coordinate contract | IT-08 | Covered at system boundary |
| AC-8.1 through AC-8.6 | Offline cache, freshness, tiles, VPR | IT-03, IT-09, ST-01 | Covered |
| AC-NEW-1 through AC-NEW-8 | Cold start, spoofing, FDR, false-position, thermal, freshness, poisoning, blackout | IT-05, IT-07, IT-09, PT-01, PT-02, ST-01, AT-02 | Covered |
Blackbox Tests
IT-01: Still-Image Accuracy Runner
Summary: Verify project still-image replay reports frame-center accuracy.
Traces to: AC-1.1, AC-1.2, AC-1.4
Input data: Project mapped images and expected_results/results_report.md.
Expected result: Report includes per-image error, aggregate 50 m/20 m pass rates, covariance, source label, and anchor age.
Max execution time: 15 minutes.
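The per-image error and the aggregate 50 m/20 m pass rates could be computed along the following lines. This is a minimal sketch, not the project's runner: the function names are hypothetical, the haversine distance on a spherical Earth is an assumed error metric, and treating an error exactly equal to the threshold as a pass is also an assumption.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; assumed adequate for short baselines


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pass_rates(errors_m, thresholds_m=(50.0, 20.0)):
    """Fraction of images whose error is within each threshold (inclusive, by assumption)."""
    if not errors_m:
        raise ValueError("no per-image errors to aggregate")
    return {t: sum(e <= t for e in errors_m) / len(errors_m) for t in thresholds_m}
```

A report generator would call `haversine_m` once per image against the expected frame-center coordinate and then pass the resulting list to `pass_rates`.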
IT-02: Public VIO Replay Runner
Summary: Verify public/representative synchronized data can drive BASALT/wrapper tests.
Traces to: AC-1.3, AC-2.1a, AC-2.2
Input data: MUN-FRL preferred slice or representative synchronized dataset.
Expected result: Runner validates trajectory, VIO registration, latency, and covariance calibration.
Max execution time: Dataset-dependent.
IT-03: Satellite Anchor Replay Runner
Summary: Verify VPR and anchor verification test scenarios are executable.
Traces to: AC-2.1b, AC-2.2, AC-8.1, AC-8.2, AC-8.6
Input data: ALTO/AerialVL/representative aerial localization fixture plus cache.
Expected result: Runner reports retrieval recall, MRE, accepted/rejected anchors, and freshness behavior.
Max execution time: Dataset-dependent.
IT-04: Outlier/Sharp-Turn/Disconnected Runner
Summary: Verify resilience scenarios are executable and reported.
Traces to: AC-3.1, AC-3.2, AC-3.3, AC-3.4
Input data: Synthetic and public disconnected-segment fixtures.
Expected result: Runner validates relocalization and records degraded-mode timelines.
Max execution time: 30 minutes.
IT-05: Blackout And Spoofing Runner
Summary: Verify total blackout plus spoofing scenarios can be driven through SITL/replay.
Traces to: AC-3.5, AC-NEW-2, AC-NEW-8
Input data: Plane SITL spoofing scenario with 5 s, 15 s, and 35 s blackout windows.
Expected result: Runner measures <=400 ms mode switch, <3 s promotion, monotonic covariance, and failsafe thresholds.
Max execution time: 30 minutes.
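The numeric gates in the expected result above could be asserted roughly as follows. This is an illustrative sketch with hypothetical function names. It assumes "monotonic covariance" means the reported covariance never decreases during a blackout window, and it leaves out the failsafe thresholds because their values are not stated here.

```python
def covariance_is_non_decreasing(samples):
    """True if each covariance sample is >= the one before it."""
    return all(later >= earlier for earlier, later in zip(samples, samples[1:]))


def check_blackout_run(mode_switch_ms, promotion_s, covariance_trace):
    """Return a list of failure messages; an empty list means the run passed."""
    failures = []
    if mode_switch_ms > 400:  # gate: <=400 ms mode switch
        failures.append(f"mode switch took {mode_switch_ms} ms (limit <=400 ms)")
    if promotion_s >= 3:  # gate: <3 s promotion
        failures.append(f"promotion took {promotion_s} s (limit <3 s)")
    if not covariance_is_non_decreasing(covariance_trace):
        failures.append("covariance decreased during blackout")
    return failures
```

The same check would run once per blackout window (5 s, 15 s, and 35 s), so a failure message can be tied to the window that produced it.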
IT-06: MAVLink/QGC Contract Runner
Summary: Verify MAVLink output and GCS status assertions are automated.
Traces to: AC-4.3, AC-4.4, AC-4.5, AC-6.1, AC-6.2, AC-6.3
Input data: Plane SITL, QGC observer/log parser, position fixtures.
Expected result: Runner validates v1 GPS_INPUT-only output, WGS84 coordinates, status rate, and command ingress.
Max execution time: 60 minutes.
IT-07: Startup/Reboot Runner
Summary: Verify cold-start and reboot scenarios are measurable.
Traces to: AC-5.1, AC-5.2, AC-5.3, AC-NEW-1
Input data: 50 cold-start runs and companion reboot trace.
Expected result: First valid GPS_INPUT <30 s p95; reboot reinitializes from FC state.
Max execution time: Runset-dependent.
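The p95 cold-start gate can be evaluated from the recorded times to first valid GPS_INPUT. The sketch below uses the nearest-rank percentile definition, which is an assumption; the specification does not say which percentile method applies. Function names are hypothetical.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def cold_start_gate(times_to_first_fix_s, limit_s=30.0):
    """Pass only if the p95 time to first valid GPS_INPUT is strictly below the limit."""
    p95 = percentile_nearest_rank(times_to_first_fix_s, 95)
    return p95 < limit_s, p95
```

With 50 runs the nearest-rank p95 is the 48th smallest sample, so up to two slow starts can exceed 30 s without failing the gate.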
IT-08: Object Coordinate Contract Runner
Summary: Verify AI-camera object coordinate request contract at system boundary.
Traces to: AC-7.1, AC-7.2
Input data: Frame-center estimate, object pixel/angle fixture, gimbal angle, altitude.
Expected result: Output coordinate includes frame-center-consistent accuracy and maneuvering-flight projection error bound.
Max execution time: 5 minutes.
IT-09: Cache And Tile Lifecycle Runner
Summary: Verify cache, generated tiles, and storage tests are executable.
Traces to: AC-8.3, AC-8.4, AC-8.5, AC-NEW-6, AC-NEW-7
Input data: Cache integrity fixtures, generated tile scenarios, PostGIS manifest.
Expected result: Runner validates cache load, tile write gates, no raw-frame retention, stale rejection, and poisoning budget evidence.
Max execution time: Dataset-dependent.
Performance Tests
PT-01: End-To-End Release Gate Runner
Summary: Verify performance and resource tests can run in the proper environment.
Traces to: AC-4.1, AC-4.2, AC-NEW-5
Load scenario:
- Environments: replay, Jetson hardware, SITL.
- Duration: smoke, nightly, and release-gate profiles.
| Metric | Target | Failure Threshold |
|---|---|---|
| End-to-end p95 | <400 ms | >=400 ms |
| Memory | <8 GB | >=8 GB |
| Thermal throttle | 0 events in release gate | Any throttle event |
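The table above translates directly into a table-driven gate check. The following is a minimal sketch under stated assumptions: the metric keys (`e2e_p95_ms`, `memory_gb`, `thermal_throttle_events`) are hypothetical names, and reporting a missing measurement as "blocked" rather than "pass" mirrors the blocked-not-passed rule in AT-01.

```python
# Each gate maps a hypothetical metric key to a predicate implementing the Target column.
RELEASE_GATES = {
    "e2e_p95_ms": lambda v: v < 400,
    "memory_gb": lambda v: v < 8,
    "thermal_throttle_events": lambda v: v == 0,
}


def evaluate_release_gate(measured):
    """Return 'pass', 'fail', or 'blocked' for every gate in RELEASE_GATES."""
    results = {}
    for name, meets_target in RELEASE_GATES.items():
        if name not in measured:
            results[name] = "blocked"  # never treat missing evidence as a pass
        else:
            results[name] = "pass" if meets_target(measured[name]) else "fail"
    return results
```

For example, `evaluate_release_gate({"e2e_p95_ms": 400, "memory_gb": 7.2})` reports the latency gate as failed, the memory gate as passed, and the thermal gate as blocked.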
PT-02: FDR/Storage Runner
Summary: Verify 8-hour storage/endurance test orchestration.
Traces to: AC-NEW-3
| Metric | Target | Failure Threshold |
|---|---|---|
| FDR cap | <=64 GB | >64 GB |
| Rollover logging | Complete | Missing rollover event |
Security Tests
ST-01: Security Fixture Runner
Summary: Verify stale/tampered cache, spoofed MAVLink, and false-anchor scenarios are automated.
Traces to: AC-NEW-4, AC-NEW-6, AC-NEW-7
Attack vector: Cache tampering, stale imagery, spoofed GPS, impossible anchors.
Test procedure:
- Load each security fixture.
- Run scenario through public runtime interfaces.
- Validate output labels, FDR, and rejection reasons.
Expected behavior: No tampered/stale/spoofed input produces a trusted false fix.
Pass criteria: 0 accepted unsafe anchors or spoofed GPS promotions outside gates.
Acceptance Tests
AT-01: Traceability Completeness Report
Summary: Verify every AC has executable or explicitly blocked test coverage.
Traces to: All ACs
| Step | Action | Expected Result |
|---|---|---|
| 1 | Read traceability matrix | All ACs mapped to tests |
| 2 | Run fixture validation | Missing public/representative data is reported as blocked, not passed |
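The two steps above could be automated with a small completeness check. This is a sketch only: the data shapes (a mapping from AC ID to test IDs, and a mapping from test ID to status) and the status strings are assumptions, not the harness's actual format.

```python
def traceability_report(ac_to_tests, test_status):
    """Classify each AC as 'executable', 'blocked', or 'unmapped'.

    ac_to_tests: dict of AC ID -> list of test IDs from the traceability matrix.
    test_status: dict of test ID -> 'executable' or 'blocked'; a test with no
                 recorded status is treated as blocked, never as passing.
    """
    report = {}
    for ac, tests in ac_to_tests.items():
        if not tests:
            report[ac] = "unmapped"
        elif any(test_status.get(t, "blocked") == "executable" for t in tests):
            report[ac] = "executable"
        else:
            report[ac] = "blocked"
    return report


def is_complete(report):
    """Every AC must be either executable or explicitly blocked."""
    return all(state != "unmapped" for state in report.values())
```

An AC whose only tests depend on missing public or representative data shows up as "blocked", which keeps the report complete without counting the AC as passed.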
AT-02: Release Evidence Bundle
Summary: Verify release evidence can be assembled.
Traces to: AC-NEW-1 through AC-NEW-8
| Step | Action | Expected Result |
|---|---|---|
| 1 | Run release profile | Reports, tlogs, FDR summaries, cache reports are produced |
| 2 | Collate artifacts | Bundle contains pass/fail status and residual blockers |
Test Data Management
| Data Set | Description | Source | Size |
|---|---|---|---|
| project_60_still_images | Frame-center geolocation smoke | Project data | Project size |
| public_dataset_slices | MUN-FRL/ALTO/Kagaru/EPFL/AerialVL as licensed | Public pinned fixtures | Dataset-dependent |
| sitl_scenarios | Plane spoofing/failsafe traces | Generated | Small |
| security_fixtures | Stale/tampered/cache poisoning cases | Generated | Small |
Setup procedure: Create isolated run directory, restore PostgreSQL schema, mount fixtures read-only, and start requested environment.
Teardown procedure: Stop environments, archive reports, drop run schema, and delete temp volumes.
Data isolation strategy: Unique run ID, schema, ports, cache staging directory, and FDR directory per scenario.
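One way to realize the per-scenario isolation described above is to derive every resource name from a single run ID. The sketch below is illustrative: `RunContext`, `new_run_context`, the directory layout, and the `run_` schema prefix are all hypothetical, and port allocation is omitted for brevity.

```python
import uuid
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RunContext:
    run_id: str
    schema: str              # PostgreSQL schema dedicated to this run
    cache_staging_dir: Path  # writable staging area; fixtures stay read-only elsewhere
    fdr_dir: Path            # flight data recorder output for this scenario only


def new_run_context(base_dir, scenario):
    """Build isolated names and paths for one scenario run (nothing is created on disk)."""
    run_id = uuid.uuid4().hex[:12]
    root = Path(base_dir) / f"{scenario}-{run_id}"
    return RunContext(
        run_id=run_id,
        schema=f"run_{run_id}",  # letter-prefixed so it is a valid unquoted identifier
        cache_staging_dir=root / "cache_staging",
        fdr_dir=root / "fdr",
    )
```

Because the schema and both directories share the run ID, teardown can drop the schema and delete the run root without touching any other scenario's artifacts.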