# Blackbox Tests

## Positive Scenarios

### FT-P-01: Still-Image Frame Center Geolocation

**Summary**: Validate that the system estimates WGS84 frame centers for the provided 60-image nadir dataset.

**Traces to**: AC-1.1, AC-1.2, AC-6.3, AC-8.1

**Category**: Position Accuracy

**Preconditions**:
- Offline satellite cache fixture is available for the sample area.
- Expected results are loaded from `input_data/expected_results/results_report.md`.

**Input data**: `project_60_still_images`, `expected_frame_centers`

| Step | Consumer Action | Expected System Response |
|------|-----------------|--------------------------|
| 1 | Submit `AD000001.jpg` through `AD000060.jpg` with height/camera metadata | System emits one WGS84 estimate per processed image |
| 2 | Compare each estimate to the mapped expected coordinate | Per-frame error is reported in meters |

**Expected outcome**: At least 80% of images are within 50 m and at least 50% are within 20 m.

**Max execution time**: 15 minutes for the 60-image replay on the local replay environment.

---

### FT-P-02: Position Confidence Output Contract

**Summary**: Validate that every emitted position estimate includes confidence and source-label fields required by the public contract.

**Traces to**: AC-1.3, AC-1.4, AC-4.4, AC-4.5

**Category**: Position Confidence

**Preconditions**:
- Same fixture setup as FT-P-01.

**Input data**: `project_60_still_images`, `expected_frame_centers`

| Step | Consumer Action | Expected System Response |
|------|-----------------|--------------------------|
| 1 | Submit the 60-image replay | System emits estimates frame-by-frame, not batched |
| 2 | Inspect public output fields | Each estimate contains WGS84 coordinate, 95% covariance semi-major axis, source label, and `last_satellite_anchor_age_ms` |
| 3 | Submit a later correction for a prior frame if available | System emits updated estimate with timestamp and covariance without corrupting newer estimates |

**Expected outcome**: 100% of emitted estimates include required confidence fields; no `horiz_accuracy` equivalent under-reports the 95% covariance semi-major axis.

**Max execution time**: 15 minutes.

---

### FT-P-03: BASALT VIO Replay With Public Synchronized Data

**Summary**: Validate that BASALT + safety/anchor wrapper can process synchronized camera/IMU data and produce trajectory estimates with calibrated confidence.

**Traces to**: AC-1.3, AC-2.1a, AC-2.2, AC-4.1, AC-4.2

**Category**: VO / IMU Propagation

**Preconditions**:
- Public synchronized dataset slice is pinned during implementation. Strongest candidates: MUN-FRL, ALTO, EPFL fixed-wing, Kagaru; EuRoC/UZH FPV are proxy-only.
- Ground-truth trajectory or frame poses are available.

**Input data**: `public_nadir_vio_candidates`

| Step | Consumer Action | Expected System Response |
|------|-----------------|--------------------------|
| 1 | Replay synchronized camera and IMU stream | System emits frame-by-frame `vo_extrapolated` or `satellite_anchored` estimates |
| 2 | Compare output trajectory to dataset ground truth | Error and covariance calibration are reported per segment |
| 3 | Compare against OpenVINS reference replay | BASALT + wrapper does not materially under-report uncertainty relative to error |

**Expected outcome**: VO registration succeeds for >95% of normal overlapping frames in dataset-supported normal segments; VO homography MRE is <1.0 px where homography validation is applicable.

**Max execution time**: Dataset-dependent, but replay must report per-frame latency.

---

### FT-P-04: Satellite Retrieval And Anchor Verification

**Summary**: Validate that relocalization uses global retrieval plus local verification and emits only verified satellite anchors.

**Traces to**: AC-2.1b, AC-2.2, AC-3.2, AC-3.3, AC-8.6

**Category**: Satellite Anchor

**Preconditions**:
- AerialVL/ALTO/VPAir-style public dataset slice or project satellite-cache fixture is available.
- VPR chunks and descriptors are precomputed.

**Input data**: Public aerial localization slice, cache fixture

| Step | Consumer Action | Expected System Response |
|------|-----------------|--------------------------|
| 1 | Trigger cold-start or relocalization query | System searches CPU FAISS top-K chunks |
| 2 | Present top-K candidates to local verification | System runs ALIKED/DISK+LightGlue and RANSAC |
| 3 | Inspect emitted anchor decision | Accepted anchors include source label, MRE, inlier count, covariance, and tile provenance |

**Expected outcome**: Cross-domain satellite-anchor MRE is <2.5 px for accepted anchors; rejected candidates do not produce `satellite_anchored` estimates.

**Max execution time**: Must be measured as part of performance tests.

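
The per-frame comparison and pass criteria of FT-P-01 (step 2 and its expected outcome) could be checked along the following lines. This is a minimal sketch, not the project's test harness. It assumes estimates and expected frame centers are available as `(lat, lon)` pairs in degrees, uses the haversine distance on a spherical Earth as the error metric, and counts images with no estimate as failing both thresholds. The names `haversine_m` and `evaluate_accuracy` are hypothetical.

```python
import math

# Mean Earth radius in meters; a spherical approximation (assumption of this sketch).
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def evaluate_accuracy(estimates, expected):
    """Compute per-frame errors and apply the FT-P-01 thresholds.

    estimates, expected: dicts mapping image name -> (lat_deg, lon_deg).
    Images present in `expected` but missing from `estimates` get no error
    entry and therefore count against both thresholds (assumption).
    """
    errors = {
        name: haversine_m(*estimates[name], *expected[name])
        for name in expected
        if name in estimates
    }
    total = len(expected)
    within_50 = sum(e <= 50.0 for e in errors.values()) / total
    within_20 = sum(e <= 20.0 for e in errors.values()) / total
    return {
        "errors_m": errors,
        "within_50m": within_50,
        "within_20m": within_20,
        "passed": within_50 >= 0.80 and within_20 >= 0.50,
    }
```

For sample areas of a few kilometers the spherical approximation differs from an ellipsoidal geodesic by well under the 20 m threshold, but a harness that needs sub-meter agreement with the expected results would likely use an ellipsoidal method instead.
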
## Negative Scenarios

### FT-N-01: Repetitive Or Low-Texture Imagery

**Summary**: Validate that visually ambiguous images do not produce confident false satellite anchors.

**Traces to**: AC-1.4, AC-3.1, AC-NEW-4, AC-8.6

**Category**: False Position Prevention

**Input data**: Repetitive agricultural or low-texture frames from project/public data.

| Step | Consumer Action | Expected System Response |
|------|-----------------|--------------------------|
| 1 | Submit ambiguous frame or sequence | System either emits degraded `vo_extrapolated`/`dead_reckoned` output or rejects low-confidence anchor |
| 2 | Inspect anchor and confidence outputs | No anchor is accepted unless local verification and covariance gates pass |

**Expected outcome**: 0 confident `satellite_anchored` outputs for candidates that fail local verification, freshness, or Mahalanobis gates.

**Max execution time**: 15 minutes per fixture.

---

### FT-N-02: GPS Spoofing During Total Visual Blackout

**Summary**: Validate that spoofed GPS is not promoted during total camera occlusion/visual blackout and that output degrades honestly before unusable frames reach VIO.

**Traces to**: AC-3.5, AC-5.2, AC-NEW-2, AC-NEW-8

**Category**: Spoofing / Blackout

**Input data**: ArduPilot Plane SITL spoofing trace with camera blackout/total-occlusion frames.

| Step | Consumer Action | Expected System Response |
|------|-----------------|--------------------------|
| 1 | Start normal replay with trusted visual/satellite anchor | System emits normal estimates |
| 2 | Inject full visual blackout/total occlusion and spoofed `GPS_RAW_INT` | Camera gate sets `usable_for_vio=false`, BASALT is bypassed for occluded frames, and system switches to `dead_reckoned` within <=1 processed frame or <=400 ms |
| 3 | Continue blackout beyond thresholds | IMU-only covariance grows monotonically; system degrades fix type and emits failsafe status at specified covariance/time thresholds |

**Expected outcome**: Spoofed GPS is ignored; total occlusion never feeds BASALT as a usable VIO frame; `fix_type=0`, `horiz_accuracy=999.0`, and `VISUAL_BLACKOUT_FAILSAFE` are emitted when covariance >500 m or blackout >30 s.

**Max execution time**: 10 minutes per SITL scenario.

---

### FT-N-03: Invalid Or Stale Satellite Cache

**Summary**: Validate cache freshness, integrity, and provenance gates.

**Traces to**: AC-8.2, AC-8.3, AC-NEW-6, AC-NEW-7

**Category**: Cache Integrity

**Input data**: `cache_integrity_fixtures`

| Step | Consumer Action | Expected System Response |
|------|-----------------|--------------------------|
| 1 | Replay with stale tile manifest | Tile is rejected or down-confidence weighted; no stale tile emits `satellite_anchored` |
| 2 | Replay with hash-mismatched or unsigned manifest | Cache fixture is rejected and security event is logged |
| 3 | Replay generated tile with weak parent-pose covariance | Tile is not promoted beyond allowed trust level |

**Expected outcome**: 0 invalid/stale/cache-poisoning fixtures produce trusted anchors or trusted basemap tiles.

**Max execution time**: 15 minutes.
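
The hash-mismatch part of FT-N-03 step 2 can be illustrated with a short sketch. The manifest layout used here (a mapping from tile file name to SHA-256 hex digest) and the function names `sha256_file` and `find_rejected_tiles` are assumptions made for illustration; the specification above does not define the manifest format, and manifest signature verification and security-event logging are not covered.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_rejected_tiles(cache_dir, manifest):
    """Return the sorted names of tiles that must not be trusted.

    manifest: hypothetical dict mapping tile file name -> expected SHA-256 hex digest.
    A tile is rejected if the file is missing or its digest does not match.
    """
    rejected = []
    for name, expected_hex in manifest.items():
        tile = Path(cache_dir) / name
        if not tile.is_file():
            rejected.append(name)
            continue
        # compare_digest performs a constant-time comparison of the two digests.
        if not hmac.compare_digest(sha256_file(tile), expected_hex.lower()):
            rejected.append(name)
    return sorted(rejected)
```

A fixture for this scenario would then assert that every tile returned by such a check is excluded from anchoring, so no `satellite_anchored` estimate can be derived from it.
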