Test Specification — C5 State Estimator
Component-scoped. Suite-level coverage in _docs/02_document/tests/*.md.
Acceptance Criteria Traceability
| AC ID | Acceptance Criterion (one-line) | Test IDs | Coverage |
|---|---|---|---|
| AC-1.3 | Cumulative drift between satellite-anchored fixes | FT-P-02, C5-IT-01 | Covered |
| AC-1.4 | 95% covariance + source label | FT-P-03, C5-IT-02 | Covered |
| AC-3.5 | Visual blackout + spoofed-GPS failsafe | FT-N-04, C5-IT-03 | Covered |
| AC-4.5 (revised) | Internal smoothing of past keyframes (NOT FC retroactive correction) | FT-P-10, C5-IT-04 | Covered |
| AC-5.2 | On >3 s without estimate, FC IMU-only fallback | NFT-RES-01, C5-IT-05 | Covered |
| AC-NEW-2 | Spoofing-promotion latency <3 s p95 | NFT-PERF-04, C5-IT-06 | Covered |
| AC-NEW-8 | Visual blackout + spoof degraded-mode escalation | FT-N-04, NFT-RES-04, C5-IT-07 | Covered |
Component-Internal Tests
C5-IT-01: source_label state machine produces correct last_satellite_anchor_age_ms
Summary: after a satellite-anchored frame, last_satellite_anchor_age_ms resets to the frame's age; under visual-propagated frames it monotonically increases.
Traces to: AC-1.3 (binning input)
Description: scripted sequence — 3 satellite-anchored frames, then 30 s of visual-propagated, then another satellite-anchored. Assert last_satellite_anchor_age_ms resets at the anchored events and rises monotonically between them with ms-level resolution.
Input data: scripted EstimatorOutput sequence.
Expected result: monotonic between resets; resets within 100 ms of the anchored frame's emitted_at.
Max execution time: 60 s.
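The assertion in C5-IT-01 can be sketched as below. This is a minimal illustration, not the component's code: `Frame`, `AnchorAgeTracker`, `scripted_sequence`, and the 333 ms frame spacing are hypothetical names and values chosen for the example, and the tracker assumes the age is measured from the anchored frame's `emitted_at`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    emitted_at_ms: int
    source_label: str  # "satellite_anchored" or "visual_propagated"


class AnchorAgeTracker:
    """Hypothetical stand-in for the logic behind last_satellite_anchor_age_ms."""

    def __init__(self) -> None:
        self._last_anchor_ms: Optional[int] = None

    def update(self, frame: Frame) -> Optional[int]:
        # An anchored frame resets the reference time; any other frame ages it.
        if frame.source_label == "satellite_anchored":
            self._last_anchor_ms = frame.emitted_at_ms
        if self._last_anchor_ms is None:
            return None
        return frame.emitted_at_ms - self._last_anchor_ms


def scripted_sequence() -> List[Frame]:
    # 3 anchored frames, about 30 s of visual-propagated frames, 1 anchored frame.
    frames = [Frame(t, "satellite_anchored") for t in (0, 333, 666)]
    frames += [Frame(t, "visual_propagated") for t in range(1000, 30667, 333)]
    frames.append(Frame(31000, "satellite_anchored"))
    return frames


def check_anchor_age(frames: List[Frame]) -> List[Optional[int]]:
    tracker = AnchorAgeTracker()
    ages: List[Optional[int]] = []
    prev_age: Optional[int] = None
    for frame in frames:
        age = tracker.update(frame)
        if frame.source_label == "satellite_anchored":
            # Reset must land within 100 ms of the anchored frame's emitted_at.
            assert age is not None and age <= 100
        elif prev_age is not None:
            # Between resets the age must rise strictly.
            assert age is not None and age > prev_age
        prev_age = age
        ages.append(age)
    return ages
```

Calling `check_anchor_age(scripted_sequence())` raises `AssertionError` on the first violation and otherwise returns the observed ages for further inspection.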
C5-IT-02: smoothed-current estimate honest covariance
Summary: current_estimate() produces an SPD 6×6 covariance whose norm reflects the iSAM2 graph's actual posterior — not a fake-confidence value.
Traces to: AC-1.4
Description: build a synthetic graph where the keyframe-11 absolute factor is intentionally noisy (3× covariance); assert C5's emitted covariance norm is at least 2× the steady-state norm of a clean graph. Repeat with a clean factor; assert ≤1.2× steady-state.
Input data: synthetic factor-graph fixtures.
Expected result: noisy → ≥2× norm; clean → ≤1.2× norm.
Max execution time: 30 s.
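The two checks in C5-IT-02 (SPD and norm ratio) could look roughly like this. The sketch is dependency-free and makes two assumptions the specification does not state: the "norm" is taken to be the Frobenius norm, and SPD is tested by attempting a Cholesky factorisation. The helper names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def is_spd(m: Matrix, tol: float = 1e-12) -> bool:
    """Return True if m is symmetric and a Cholesky factorisation succeeds."""
    n = len(m)
    if any(len(row) != n for row in m):
        return False
    for i in range(n):
        for j in range(i):
            if abs(m[i][j] - m[j][i]) > tol:
                return False
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:  # non-positive pivot: not positive definite
                    return False
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True


def frobenius_norm(m: Matrix) -> float:
    return math.sqrt(sum(x * x for row in m for x in row))


def covariance_is_honest(cov: Matrix, steady_state_norm: float, noisy: bool) -> bool:
    """Noisy graph: norm >= 2x steady state. Clean graph: norm <= 1.2x."""
    if len(cov) != 6 or not is_spd(cov):
        return False
    ratio = frobenius_norm(cov) / steady_state_norm
    return ratio >= 2.0 if noisy else ratio <= 1.2


def diag6(value: float) -> Matrix:
    # Illustrative fixture: isotropic 6x6 covariance.
    return [[value if i == j else 0.0 for j in range(6)] for i in range(6)]
```

For example, with a steady-state fixture of `diag6(0.01)`, a tripled covariance `diag6(0.03)` satisfies the noisy branch, while `diag6(0.011)` satisfies the clean branch.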
C5-IT-03: VIO-only fallback under cross-domain matcher failure
Summary: when C4 stops emitting PoseEstimate (matcher failure), C5 continues with VIO-only and labels source_label = visual_propagated.
Traces to: AC-3.5
Description: feed add_vio for 60 s while withholding add_pose_anchor calls; assert (a) current_estimate keeps emitting, (b) source_label == visual_propagated, (c) cov_norm_growing_for_s rises monotonically.
Input data: scripted VIO-only fixture.
Expected result: 60 s of visual_propagated estimates; cov norm monotonically rising.
Max execution time: 90 s.
C5-IT-04: smoothed past-keyframe history is NOT forwarded to FC
Summary: smoothed_history(n) reflects past-keyframe smoothing per AC-4.5 (revised), but the FC emission path uses current_estimate only — the smoothing must NOT alter what C8 emits as GPS_INPUT / MSP2_SENSOR_GPS.
Traces to: AC-4.5 (revised)
Description: trigger an iSAM2 relinearisation that materially shifts a 5-keyframe-old pose; assert (a) smoothed_history(10) shows the shift, (b) the next 10 calls to current_estimate are unaffected, (c) the FDR record stream contains the smoothed history (per AC-4.5) AND the unshifted FC emissions in the same flight log.
Input data: synthetic graph with a deliberately-late loop closure.
Expected result: history shows shift; current_estimate unaffected; FDR has both streams.
Max execution time: 60 s.
C5-IT-05: 3 s no-estimate threshold triggers AC-5.2 fallback
Summary: when no add_vio and no add_pose_anchor arrive for >3 s, current_estimate returns a dead_reckoned-labeled output (or refuses to emit) so C8 can fall to FC IMU-only.
Traces to: AC-5.2
Description: prime C5 with a normal warm-start; cease all input for 4 s; observe current_estimate over the gap; assert (a) at < 3 s gap, label is visual_propagated, (b) at ≥ 3 s gap, label transitions to dead_reckoned or no emission, (c) the transition timestamp is logged at ERROR level.
Input data: scripted gap fixture.
Expected result: transition at gap ≥ 3 s; ERROR logged.
Max execution time: 30 s.
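One way to picture the behaviour under test in C5-IT-05 is a small watchdog that tracks the time since the last input. This is a sketch under stated assumptions: `GapWatchdog` and the logger name are hypothetical, and the sketch takes the `dead_reckoned` branch rather than the "refuses to emit" alternative the summary also allows.

```python
import logging

log = logging.getLogger("c5.gap_watchdog")  # hypothetical logger name


class GapWatchdog:
    """Labels the current estimate by how long C5 has gone without input."""

    def __init__(self, threshold_ms: int = 3000) -> None:
        self.threshold_ms = threshold_ms
        self.last_input_ms = None
        self.fallback_active = False

    def on_input(self, t_ms: int) -> None:
        # Called for every add_vio / add_pose_anchor arrival.
        self.last_input_ms = t_ms
        self.fallback_active = False

    def label_at(self, now_ms: int) -> str:
        if self.last_input_ms is None:
            raise RuntimeError("watchdog queried before any input arrived")
        gap_ms = now_ms - self.last_input_ms
        if gap_ms < self.threshold_ms:
            return "visual_propagated"
        if not self.fallback_active:
            # Log the transition once, at ERROR level, with its timestamp.
            self.fallback_active = True
            log.error("AC-5.2 fallback at t=%d ms (gap %d ms)", now_ms, gap_ms)
        return "dead_reckoned"
```

A test would prime the watchdog with `on_input`, query `label_at` across a 4 s gap, and attach a logging handler to confirm that exactly one ERROR record marks the transition.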
C5-IT-06: spoof-promotion gate enforces ≥10 s + visual consistency
Summary: a previously-spoofed FC GPS source can only be re-promoted to trusted after BOTH (i) gps_health == STABLE_NON_SPOOFED for ≥10 s AND (ii) the next satellite-anchored frame agrees with FC GPS within tolerance.
Traces to: AC-NEW-2, AC-NEW-8
Description: a scripted scenario runs initial trust → spoof event (gps_health = SPOOFED) → recovery to STABLE_NON_SPOOFED at t=0, with satellite-anchored agreement frames at t=5 s and t=11 s. Assert that promotion stays blocked until the t=11 s agreement frame, that every promotion attempt before then is rejected, and that every reject is logged in FDR + STATUSTEXT.
Input data: scripted gps_health + EstimatorOutput sequence.
Expected result: promotion at t=11 s; rejects logged before then.
Max execution time: 60 s.
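The two-condition gate described in C5-IT-06 might be modelled as follows. `PromotionGate` and its method names are hypothetical, and the `rejects` list merely stands in for the FDR + STATUSTEXT logging that the real component is expected to perform.

```python
from dataclasses import dataclass, field
from typing import List, Optional

STABLE = "STABLE_NON_SPOOFED"
SPOOFED = "SPOOFED"


@dataclass
class PromotionGate:
    min_stable_ms: int = 10_000
    trusted: bool = True
    stable_since_ms: Optional[int] = None
    rejects: List[int] = field(default_factory=list)  # stand-in for FDR entries

    def on_gps_health(self, t_ms: int, health: str) -> None:
        if health == STABLE:
            # Start the stability clock only on the first stable report.
            if self.stable_since_ms is None:
                self.stable_since_ms = t_ms
        else:
            # Any non-stable report revokes trust and restarts the clock.
            self.stable_since_ms = None
            self.trusted = False

    def on_anchor_frame(self, t_ms: int, agrees_with_fc_gps: bool) -> bool:
        """Attempt promotion on a satellite-anchored frame; return trust state."""
        if self.trusted:
            return True
        stable_long_enough = (
            self.stable_since_ms is not None
            and t_ms - self.stable_since_ms >= self.min_stable_ms
        )
        if stable_long_enough and agrees_with_fc_gps:
            self.trusted = True
        else:
            self.rejects.append(t_ms)
        return self.trusted
```

Replaying the scripted scenario (spoof, recovery at t=0, agreement frames at 5000 ms and 11000 ms) should leave exactly one reject, at 5000 ms, and promotion at 11000 ms.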
C5-IT-07: visual blackout + spoof escalation
Summary: simultaneous visual blackout (no C4 anchors) and spoofed FC GPS escalates to dead_reckoned source label and AC-NEW-8 STATUSTEXT.
Traces to: AC-NEW-8
Description: combine the visual-blackout fixture (no add_pose_anchor for 5 s) with a gps_health == SPOOFED event; assert the next emission is dead_reckoned and an AC-NEW-8 STATUSTEXT is published via the C8 path (mocked C8 in the test harness records the outgoing message).
Input data: combined scripted fixture.
Expected result: dead_reckoned label; STATUSTEXT recorded in mock C8 harness.
Max execution time: 30 s.
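The mocked-C8 harness in C5-IT-07 can be illustrated with a short sketch. Everything named here is hypothetical: `MockC8`, `next_source_label`, and the STATUSTEXT wording are invented for the example, and treating the fixture's 5 s without anchors as the blackout threshold is an assumption rather than something the specification states.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MockC8:
    """Records outgoing STATUSTEXT messages instead of sending them."""

    statustexts: List[str] = field(default_factory=list)

    def send_statustext(self, text: str) -> None:
        self.statustexts.append(text)


def next_source_label(
    ms_since_anchor: int,
    gps_health: str,
    c8: MockC8,
    blackout_ms: int = 5000,  # assumed threshold, taken from the fixture
) -> str:
    blackout = ms_since_anchor >= blackout_ms
    if blackout and gps_health == "SPOOFED":
        # Both independent position sources are unusable: escalate.
        c8.send_statustext("AC-NEW-8: visual blackout + spoofed GPS")
        return "dead_reckoned"
    return "visual_propagated"
```

The test then asserts on both the returned label and the contents of `MockC8.statustexts`.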
Performance Tests
C5-PT-01: iSAM2 + Marginals throughput on Tier-2
Traces to: AC-4.1
Load scenario: 3 Hz add_pose_anchor + 200 Hz add_fc_imu + 3 Hz add_vio; 10 min replay.
Expected results:
| Metric | Target | Failure Threshold |
|---|---|---|
| add_pose_anchor + current_estimate p95 | ≤ 60 ms | 100 ms |
| add_fc_imu p95 | ≤ 1 ms (preintegration buffer-add only) | 5 ms |
| add_vio p95 | ≤ 5 ms | 15 ms |
Resource limits:
- Memory: bounded by IncrementalFixedLagSmoother K=10–20; ≤ 100 MB resident.
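The latency targets above could be evaluated from recorded per-call timings along these lines. This is a sketch with assumptions: p95 is computed with the nearest-rank method, values between the target and the failure threshold are reported as "degraded", and only values above the failure threshold count as "fail". The specification fixes none of these choices, and the metric keys are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# (target_ms, failure_threshold_ms) per metric, taken from the table above.
THRESHOLDS_MS: Dict[str, Tuple[float, float]] = {
    "add_pose_anchor+current_estimate": (60.0, 100.0),
    "add_fc_imu": (1.0, 5.0),
    "add_vio": (5.0, 15.0),
}


def p95(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile."""
    if not samples_ms:
        raise ValueError("no samples recorded")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def classify(metric: str, samples_ms: Sequence[float]) -> str:
    target_ms, fail_ms = THRESHOLDS_MS[metric]
    value = p95(samples_ms)
    if value <= target_ms:
        return "pass"
    if value <= fail_ms:
        return "degraded"
    return "fail"
```

Nearest-rank keeps the result equal to an actually observed latency, which makes a failing sample easy to locate in the replay log.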
Security Tests
C5-ST-01: spoof-rejection logging cannot be silenced
Summary: every spoof-promotion-block event lands in FDR + GCS STATUSTEXT — the system has no config knob that disables this.
Traces to: AC-NEW-2 / AC-NEW-8 (defensive)
Test procedure:
- Configure C5 with the production-default config.
- Inject a spoof-promotion-block event.
- Assert the FDR record stream contains an entry with kind = "spoof_promotion_block" AND a STATUSTEXT was issued.
- Search the codebase for any config flag that could disable either path; assert no such flag exists.
Pass criteria: FDR + STATUSTEXT both recorded; no disabling config flag found.
Fail criteria: either path missing or a disabling flag exists.
Acceptance Tests
Covered transitively via FT-P-02 / FT-P-03 / FT-N-04 / NFT-RES-01.
Test Data Management
| Data Set | Source | Size |
|---|---|---|
| Synthetic factor-graph fixtures | scripted | <5 MB |
| flight_derkachi/normal_segment_60_stills/ (replayed VIO + pose feeds) | shared | shared |
| gps_health event fixtures | scripted | <1 MB |
Setup: in-process; no external services.
Teardown: per-test temp dirs.
Data isolation: each test instantiates a fresh StateEstimator.