Test Specification — C3 Cross-Domain Matcher
Component-scoped. Suite-level coverage in _docs/02_document/tests/*.md.
Acceptance Criteria Traceability
| AC ID | Acceptance Criterion (one-line) | Test IDs | Coverage |
|---|---|---|---|
| AC-1.1 | Frame-center GPS within 50 m for ≥80% of normal-flight photos | FT-P-01, C3-IT-01 (inlier-budget partition) | Covered |
| AC-2.1b | Satellite-anchor registration | FT-P-05, C3-IT-02 | Covered |
| AC-2.2 (cross-domain portion) | MRE <2.5 px cross-domain | FT-P-06, C3-IT-03 | Covered |
| AC-3.1 | Tolerate up to 350 m outliers, tilt ±20° | FT-N-01, C3-IT-04 | Covered |
| AC-4.1 | E2E latency <400 ms p95 | NFT-PERF-01, C3-PT-01 | Covered |
Component-Internal Tests
C3-IT-01: per-candidate inlier-count floor on Derkachi
Summary: on the Derkachi normal segment, the best candidate's RANSAC inlier count is at least the configured floor (default ≥ 80 inliers per frame).
Traces to: AC-1.1 (component-level partition feeding AC-1.1's accuracy budget)
Description: for the Derkachi normal-segment fixture, run C3 with the production-default DISK+LightGlue backbone; record MatchResult.per_candidate[best_candidate_idx].inlier_count; assert p5 ≥ 80 (i.e., ≥95% of frames clear the floor).
Input data: flight_derkachi/normal_segment_60_stills/ + C10-built tile descriptors (read-only).
Expected result: p5 inlier count ≥ 80 for DISK+LightGlue.
Max execution time: 4 min on Tier-1 (CPU fallback) / 90 s on Tier-2.
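The p5 assertion in C3-IT-01 can be sketched as follows. This is a minimal illustration, not the suite's actual code: the helper names `percentile_nearest_rank` and `clears_inlier_floor` are hypothetical, and the nearest-rank percentile definition is an assumption, since the spec does not say which percentile method the harness uses.

```python
import math
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile (assumed method; the spec does not name one)."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the smallest sample with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def clears_inlier_floor(
    inlier_counts: Sequence[int], floor: int = 80, pct: float = 5.0
) -> bool:
    """True when the pct-th percentile of best-candidate inlier counts meets the floor."""
    return percentile_nearest_rank(inlier_counts, pct) >= floor
```

With 60 frames and nearest-rank p5, the check compares the third-smallest inlier count against the floor, so at most two frames may fall below 80.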
C3-IT-02: best-candidate selection determinism
Summary: best_candidate_idx == argmax(inlier_count) for every emitted MatchResult.
Traces to: AC-2.1b
Description: 100 frames through match; assert (a) best_candidate_idx equals the index of the largest inlier_count in per_candidate, (b) ties are broken deterministically (lowest tile_id wins; check against a known-tie synthetic fixture).
Input data: synthetic_matcher/known_tie_10f/ (synthetic frames where two candidates have identical inlier counts).
Expected result: deterministic tie-breaking; no best_candidate_idx mismatch in 100/100 frames.
Max execution time: 60 s.
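The selection rule checked by C3-IT-02 (largest inlier count, ties broken by lowest tile_id) can be written as a single ordering key. The sketch below is illustrative only: the `Candidate` dataclass and `select_best_candidate_idx` are hypothetical stand-ins for the real MatchResult fields, and an integer tile_id is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    tile_id: int       # assumed to be an orderable integer
    inlier_count: int


def select_best_candidate_idx(per_candidate: Sequence[Candidate]) -> int:
    """Index of the candidate with the most inliers; lowest tile_id wins ties."""
    if not per_candidate:
        raise ValueError("per_candidate must be non-empty")
    # Sort key: more inliers first (negated count), then smaller tile_id.
    return min(
        range(len(per_candidate)),
        key=lambda i: (-per_candidate[i].inlier_count, per_candidate[i].tile_id),
    )
```

Because the key is a total order over (inlier_count, tile_id), the result does not depend on the order in which candidates arrive, which is the property the known-tie fixture is meant to exercise.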
C3-IT-03: cross-domain MRE bound
Summary: the p95 per-frame reprojection residual stays under 2.5 px for the production-default matcher on the Derkachi normal segment.
Traces to: AC-2.2
Description: same Derkachi fixture; record MatchResult.reprojection_residual_px; assert p95 < 2.5 px.
Input data: as C3-IT-01.
Expected result: p95 < 2.5 px for DISK+LightGlue. ALIKED+LightGlue (secondary) and XFeat (alternate) tested on a smoke subset only — comparative-study verdict belongs to IT-12.
Max execution time: 4 min.
C3-IT-04: tilt + outlier robustness
Summary: under ±20° tilt and synthetic 350 m outliers, the matcher still produces a usable inlier set (≥40 inliers).
Traces to: AC-3.1
Description: synthetically tilt the Derkachi frames by {−20°, −10°, 0°, +10°, +20°}; inject 350 m position outliers into the candidate tile metadata for 5% of frames; assert C3 emits a MatchResult with best_candidate.inlier_count ≥ 40 for ≥90% of frames in each tilt bucket.
Input data: flight_derkachi/tilted_±20°/ (deterministic synthetic tilt).
Expected result: per-bucket inlier-count p10 ≥ 40 for DISK+LightGlue.
Max execution time: 6 min.
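The per-bucket criterion in C3-IT-04 (inlier_count ≥ 40 for ≥90% of frames in each tilt bucket) could be evaluated along these lines. This is a sketch under assumptions: the `(tilt_deg, inlier_count)` sample shape and the function names `bucket_pass_rates` and `failing_buckets` are hypothetical, not part of the C3 interface.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bucket_pass_rates(
    samples: Iterable[Tuple[int, int]], min_inliers: int = 40
) -> Dict[int, float]:
    """Fraction of frames per tilt bucket whose best candidate has >= min_inliers."""
    totals: Dict[int, int] = defaultdict(int)
    passed: Dict[int, int] = defaultdict(int)
    for tilt_deg, inlier_count in samples:
        totals[tilt_deg] += 1
        if inlier_count >= min_inliers:
            passed[tilt_deg] += 1
    return {tilt: passed[tilt] / totals[tilt] for tilt in totals}


def failing_buckets(
    samples: Iterable[Tuple[int, int]],
    min_inliers: int = 40,
    min_rate: float = 0.90,
) -> List[int]:
    """Tilt buckets (sorted) whose pass rate falls below min_rate."""
    rates = bucket_pass_rates(samples, min_inliers)
    return sorted(tilt for tilt, rate in rates.items() if rate < min_rate)
```

Evaluating each bucket separately keeps a weak ±20° bucket from being masked by strong results at 0°.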
C3-IT-05: InsufficientInliersError propagation
Summary: when all N=3 candidates fall below the inlier floor, C3 raises InsufficientInliersError and emits no MatchResult.
Traces to: AC-3.5 (defensive — keeps the spoof-fallback path clean)
Description: synthetic frames with deliberately mismatched candidate tiles (cross-region pulls); assert match raises InsufficientInliersError for every frame and the downstream consumer (C3.5 / C4) receives no MatchResult.
Input data: synthetic_matcher/cross_region_mismatch_20f/.
Expected result: 20/20 frames raise the error.
Max execution time: 60 s.
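The propagation behaviour in C3-IT-05 can be illustrated with a small stand-in. Everything here is hypothetical: the locally defined `InsufficientInliersError` only mirrors the name used in the spec, `best_inlier_count_or_raise` and `run_frames` are invented helpers, and the floor is passed explicitly because the spec does not state the value used for this check.

```python
from typing import List, Sequence


class InsufficientInliersError(Exception):
    """Local stand-in for the error named in the spec."""


def best_inlier_count_or_raise(inlier_counts: Sequence[int], floor: int) -> int:
    """Return the best inlier count, or raise when every candidate is below the floor."""
    best = max(inlier_counts, default=0)
    if best < floor:
        raise InsufficientInliersError(
            f"best candidate has {best} inliers, floor is {floor}"
        )
    return best


def run_frames(frames: Sequence[Sequence[int]], floor: int, sink: List[int]) -> int:
    """Process frames; results reach `sink` only on success. Returns the raise count."""
    raised = 0
    for counts in frames:
        try:
            sink.append(best_inlier_count_or_raise(counts, floor))
        except InsufficientInliersError:
            raised += 1  # nothing was appended for this frame
    return raised
```

The two assertions of interest are that the raise count equals the frame count and that the sink, standing in for the downstream consumer, stays empty.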
Performance Tests
C3-PT-01: per-frame match latency on Tier-2 (dominant cost)
Traces to: AC-4.1 (C3 owns the largest single partition of the budget)
Load scenario: 3 Hz, N=3 candidates, 10 min replay; concurrent C2 backbone + C5 iSAM2 update on the same Jetson.
Expected results:
| Metric | Target | Failure Threshold |
|---|---|---|
| match p95 | ≤ 180 ms (DISK+LightGlue) | 280 ms |
| Per-candidate p95 | ≤ 60 ms | 95 ms |
| Throughput | ≥ 3 Hz sustained | < 2.5 Hz |
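One way to read the Target and Failure Threshold columns is as a three-way verdict. The sketch below assumes that a value between the two is a warning rather than a failure and that a latency exactly at the threshold does not yet fail; neither rule is stated in the spec, and the names `Verdict`, `classify_latency_ms`, and `classify_throughput_hz` are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"


def classify_latency_ms(measured: float, target: float, failure_threshold: float) -> Verdict:
    """Lower is better: pass at or under target, fail above the failure threshold."""
    if measured <= target:
        return Verdict.PASS
    if measured <= failure_threshold:
        return Verdict.WARN
    return Verdict.FAIL


def classify_throughput_hz(measured: float, target: float, failure_floor: float) -> Verdict:
    """Higher is better: pass at or over target, fail below the failure floor."""
    if measured >= target:
        return Verdict.PASS
    if measured >= failure_floor:
        return Verdict.WARN
    return Verdict.FAIL
```

For example, a match p95 of 200 ms against the 180 ms target and 280 ms threshold would be a warning under this reading, while a sustained throughput of 2.4 Hz would fail.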
Resource limits:
- GPU memory: ≤ 800 MB for backbone + matcher engines combined.
Security Tests
C3 has no externally reachable surface; defensive coverage is provided at the cache-poisoning level (NFT-SEC-01) and via shared-runtime invariants (C2.5-IT-03).
Acceptance Tests
Covered transitively via FT-P-01, FT-P-05, FT-P-06.
Test Data Management
| Data Set | Source | Size |
|---|---|---|
| flight_derkachi/normal_segment_60_stills/ | shared | shared |
| flight_derkachi/tilted_±20°/ | generated | ~150 MB |
| synthetic_matcher/known_tie_10f/ | generated | ~5 MB |
| synthetic_matcher/cross_region_mismatch_20f/ | generated | ~10 MB |
| C10-built tile descriptors | C10 artifact | shared |
Setup: C10 must have built the tile descriptors, and the matchers' TRT engines must be compiled (about 5 min on the first Tier-2 run; cached afterwards).
Teardown: read-only.
Data isolation: per-test temp dirs.