Mirror of https://github.com/azaion/gps-denied-onboard.git, synced 2026-04-27 13:36:36 +00:00.
# Performance Tests

> Deployment-binding numbers require Tier T4 (real Jetson Orin Nano Super @ 25 W). T1 runs are functional plausibility checks only — same caveat as `test-data.md` D2.

---
### NFT-PERF-01: End-to-end latency p95 ≤400 ms (AC-4.1)

**Summary**: From camera-frame capture to GPS_INPUT emission, p95 latency ≤ 400 ms on Orin Nano Super @ 25 W.

**Traces to**: AC-4.1. Tier: T4 (`deferred-hil`) for the binding result; T1 functional smoke.

**Metric**: end-to-end latency in ms, sampled per frame, aggregated to p50 / p95 / p99.

**Preconditions**:

- Tier T4: real Jetson Orin Nano Super, 25 W power mode (`nvpmodel -m 0` + 25 W profile), thermals stabilized at +25 °C ambient.
- TRT engines warmed (≥1 min steady-state replay before measurement).
- 30-min sustained replay of the `synthetic_8h_load` slice (or AerialVL S03 mid-segment).
- Frame timestamping uses the camera-shim `time_usec` and is matched against the GPS_INPUT `time_usec`.

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Stream nav-cam frames at 3 fps for 30 min after warm-up | per-frame `(t_emit_gps_input - t_capture)` |
| 2 | Drop the first 60 s as warm-up | aggregate the rest |
| 3 | Compute p50, p95, p99, max | report |
| 4 | Verify drop rate | `dropped_frames / total_frames ≤ 10%` |

**Pass criteria**: p95 ≤ 400 ms; drop rate ≤ 10 % (per AC-4.1's "skip-allowed" clause).

**Duration**: 30 min + 60 s warm-up.
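Steps 2–4 reduce to a small aggregation pass over the per-frame timestamps. A minimal sketch, assuming a capture-ordered log of `(t_capture_us, t_emit_us)` pairs with `None` for frames the pipeline skipped (the field names and helper are illustrative, not project tooling):

```python
# Hypothetical analysis sketch for NFT-PERF-01: aggregate per-frame
# end-to-end latency and drop rate from (t_capture_us, t_emit_us) pairs.
# Frames the pipeline skipped carry t_emit_us = None.

def aggregate_latency(samples, warmup_s=60.0):
    """samples: capture-ordered list of (t_capture_us, t_emit_us or None)."""
    if not samples:
        raise ValueError("no samples")
    t0 = samples[0][0]
    # Step 2: drop the first warmup_s seconds of frames.
    measured = [(c, e) for c, e in samples if (c - t0) / 1e6 >= warmup_s]
    latencies_ms = sorted((e - c) / 1e3 for c, e in measured if e is not None)
    dropped = sum(1 for _, e in measured if e is None)

    def pct(p):  # nearest-rank percentile, clamped to the last sample
        idx = min(len(latencies_ms) - 1, int(round(p / 100 * len(latencies_ms))))
        return latencies_ms[idx]

    # Step 3: p50 / p95 / p99 / max; step 4: drop rate.
    return {
        "p50": pct(50), "p95": pct(95), "p99": pct(99),
        "max": latencies_ms[-1],
        "drop_rate": dropped / len(measured),
    }

def passes_ac_4_1(stats):
    # AC-4.1 gate: p95 <= 400 ms and drop rate <= 10 %.
    return stats["p95"] <= 400.0 and stats["drop_rate"] <= 0.10
```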
---
### NFT-PERF-02: cuVSLAM single-frame latency ≤20 ms

**Summary**: cuVSLAM inference completes within 20 ms per frame.

**Traces to**: results_report row 37, F-T1b. Tier: T4 binding; T1 functional.

**Metric**: cuVSLAM per-frame inference duration, p95.

**Preconditions**: cuVSLAM warmed; mono+IMU mode.

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Replay 5 min of nav-cam frames at 3 fps | per-frame `cuvslam_inference_ms` (publicly exposed metric) |
| 2 | p95 over the run | report |

**Pass criteria**: p95 ≤ 20 ms.

**Duration**: 5 min.

---
### NFT-PERF-03: Cross-view matcher latency

**Summary**: Inline matcher (SP+LG TRT FP16/INT8) ≤ 200 ms / pair; LiteSAM re-loc fallback ≤ 2000 ms / pair.

**Traces to**: AC-4.1 (sub-budget), results_report row 38. Tier: T4 binding.

**Metric**: per-pair matcher inference time, p95.

**Preconditions**: matcher warmed; representative resolution (1024×768 SP+LG / GIM-LG).

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Replay 1000 cross-view pairs through inline path | `inline_matcher_ms` per pair |
| 2 | Replay 100 cross-view pairs through re-loc path | `reloc_matcher_ms` per pair |

**Pass criteria**: inline p95 ≤ 200 ms; re-loc p95 ≤ 2000 ms.

**Duration**: ≤30 min.

---
### NFT-PERF-04: Orthority per-frame latency ≤50 ms

**Summary**: Orthority's per-frame ortho call on Orin Nano Super stays within budget.

**Traces to**: F-T14, M-27. Tier: T4 binding. If exceeded, fall back to `cv2.warpPerspective + bilinear DEM` per the documented Component 1b fall-back.

**Metric**: ortho per-frame duration, p95.

**Preconditions**: Orthority loaded; SRTM-30 m DEM mmap warm; sector classified `flat` or `moderate`.

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Replay 1000 frames | per-frame `ortho_ms` |

**Pass criteria**: p95 ≤ 50 ms. If FAIL: open a task to switch to the fall-back path (a flow trigger, not a blocking gate at this test).

**Duration**: ≤10 min.

---
### NFT-PERF-05: Spoofing-promotion latency ≤3 s p95 (AC-NEW-2)

**Summary**: Time from spoof onset to SUT promotion as primary GPS source.

**Traces to**: AC-NEW-2. Tier: T3 (`deferred-sitl`).

**Metric**: t_promote = `t_promotion_event - t_spoof_onset`, p95 over 50 trials.

**Preconditions**: SITL + `gps-spoof-injector`; FC EKF3 lane-switch event observable via `EKF_STATUS_REPORT`.

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | At t=0 inject spoof signal | observe SUT GPS_INPUT promotion (`fix_type` raised to 3D-fix-with-priority + STATUSTEXT `PROMOTE`) |
| 2 | Repeat 50 trials with randomised spoof magnitudes | distribution |

**Pass criteria**: p95 ≤ 3 s.

**Duration**: ≤30 min.
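The p95-over-trials aggregation can be sketched as follows. This is an illustrative helper, not the project's harness; it assumes each trial yields a `(t_spoof_onset_s, t_promotion_event_s)` pair recovered from the injector and STATUSTEXT logs:

```python
# Hypothetical sketch for NFT-PERF-05: per-trial promotion latency
# and its p95 over the 50-trial campaign.

def promotion_latency_p95(trials):
    """trials: list of (t_spoof_onset_s, t_promotion_event_s)."""
    deltas = sorted(t_promote - t_onset for t_onset, t_promote in trials)
    # Nearest-rank p95, clamped to the last sample.
    idx = min(len(deltas) - 1, int(round(0.95 * len(deltas))))
    return deltas[idx]

def passes_ac_new_2(trials):
    # AC-NEW-2 gate: promotion within 3 s at p95.
    return promotion_latency_p95(trials) <= 3.0
```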
---
### NFT-PERF-06: Frame-by-frame output cadence (AC-4.4)

**Summary**: GPS_INPUT is streamed per-frame, not batched.

**Traces to**: AC-4.4. Tier: T1 + T4.

**Metric**: inter-frame interval distribution.

**Preconditions**: 30 min steady-state replay.

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Replay at 3 fps | sniff GPS_INPUT timestamps |
| 2 | Compute inter-arrival deltas | distribution |
| 3 | Verify no frame is delayed >1 inter-frame interval | — |

**Pass criteria**: |Δt - 1/3 s| ≤ 50 ms for ≥99 % of frames; no batches (no clusters of frames within the same 50 ms window).

**Duration**: 30 min.
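Steps 2–3 plus the batching check can be sketched as one pass over the sniffed timestamps. A minimal illustration under the stated 3 fps / 50 ms / 99 % numbers (`timestamps_s` and the function name are assumptions):

```python
# Hypothetical sketch for NFT-PERF-06: per-frame cadence at 3 fps.
# timestamps_s: GPS_INPUT arrival times in seconds, sniffed off the link.

def cadence_ok(timestamps_s, fps=3.0, jitter_s=0.050, frac=0.99):
    nominal = 1.0 / fps
    # Step 2: inter-arrival deltas.
    deltas = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    # Pass criterion 1: |dt - 1/fps| <= 50 ms for >= 99 % of frames.
    within = sum(1 for d in deltas if abs(d - nominal) <= jitter_s)
    # Pass criterion 2: no batching, i.e. no two emissions inside one
    # 50 ms window.
    no_batches = all(d > jitter_s for d in deltas)
    return within / len(deltas) >= frac and no_batches
```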
---
### NFT-PERF-07: GPS_INPUT message rate (results_report row 9)

**Summary**: GPS_INPUT emitted at 5–10 Hz continuous (matches per-frame at 3 fps + duplicates for FC stability when configured).

**Traces to**: AC-4.3, results_report row 9. Tier: T1.

**Metric**: rate over 60 s windows.

**Preconditions**: steady-state tracking.

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Sniff GPS_INPUT for 5 min | per-second rate |

**Pass criteria**: rate ∈ [5, 10] Hz throughout.

**Duration**: 5 min.

---
### NFT-PERF-08: VPR latency under conditional invocation

**Summary**: VPR's DINOv2 forward only fires on re-loc triggers; in cruise it stays near zero CPU/GPU.

**Traces to**: AC-8.6, restrictions §Satellite (VPR retrieval unit). Tier: T4.

**Metric**: VPR invocations / second; cruise idle vs re-loc burst.

**Preconditions**: 60-min replay with scripted re-loc triggers (cold start, sharp turn, σ_xy > 50 m, VO failure ≥2 frames).

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Run replay | per-second `vpr_invocations` counter |
| 2 | Compute average across cruise window vs re-loc window | — |

**Pass criteria**:

- Cruise window (no triggers): VPR invocations / 100 frames ≤ 1 (i.e., not invoked per-frame).
- Re-loc window: VPR invokes within 1 frame of trigger; latency ≤ 200 ms p95 for the DINOv2 forward.

**Duration**: 60 min.
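The windowed comparison in step 2 and the trigger-promptness check can be sketched as below. This is illustrative only: `events` stands in for the frame indices at which the DINOv2 forward actually ran, and both helper names are assumptions:

```python
# Hypothetical sketch for NFT-PERF-08: VPR invocation density per window
# and promptness of invocation after a re-loc trigger.

def invocations_per_100_frames(events, window):
    """events: frame indices where VPR ran; window: (start_frame, end_frame)."""
    start, end = window
    n = sum(1 for f in events if start <= f < end)
    return 100.0 * n / (end - start)

def reloc_prompt(events, trigger_frame):
    # Re-loc gate: first invocation at or after the trigger must land
    # within 1 frame of it.
    after = [f for f in events if f >= trigger_frame]
    return bool(after) and after[0] - trigger_frame <= 1
```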
---
### NFT-PERF-09: Top-K dynamic sizing matches sector / σ_xy

**Summary**: VPR top-K honours AC-8.6 dynamic-K rules.

**Traces to**: AC-8.6. Tier: T1 + T4.

**Metric**: K value selected per VPR call vs sector class + σ_xy.

**Preconditions**: scripted scenarios with (sector ∈ {stable, active}) × (σ_xy ∈ {10, 30, 60}).

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Trigger VPR in each combination | observe `vpr_top_k` metric |

**Pass criteria**:

- stable + σ_xy ≤ 20 m → K=5.
- active-conflict → K=20.
- expanding-window fallback (σ_xy > 50 m or fail-N) → K=50.

**Duration**: 5 min.
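The expected-K oracle this test compares `vpr_top_k` against can be sketched directly from the pass criteria. Sector-class strings, the `fail_n` default, and the middle-ground fallback are assumptions where AC-8.6 is not quoted here:

```python
# Hypothetical oracle for NFT-PERF-09: expected top-K per AC-8.6 as
# stated in the pass criteria above.

def select_top_k(sector: str, sigma_xy_m: float,
                 vo_fail_streak: int = 0, fail_n: int = 2) -> int:
    # Expanding-window fallback dominates: large uncertainty or fail-N.
    if sigma_xy_m > 50.0 or vo_fail_streak >= fail_n:
        return 50
    if sector == "active-conflict":
        return 20
    if sector == "stable" and sigma_xy_m <= 20.0:
        return 5
    # Middle ground not pinned down by the pass criteria; assume the
    # conservative active-conflict size.
    return 20
```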
---
### NFT-PERF-10: Failsafe latency ≤3 s no-fix → FC fallback (AC-5.2)

**Summary**: When the SUT cannot produce any estimate for >3 s, the FC observably falls back to IMU-only DR.

**Traces to**: AC-5.2. Tier: T3.

**Metric**: time from last-fix emission to FC fallback signal in `EKF_STATUS_REPORT`.

**Preconditions**: scripted blackout in SITL.

**Steps**: black out the pipeline; observe the FC.

**Pass criteria**: FC fallback observable within 4 s of blackout (3 s budget + 1 s observation latency).

**Duration**: 5 min.

---
### NFT-PERF-11: Bench-off candidates — accuracy-vs-latency frontier

**Summary**: Score inline matcher candidates on the documented bench-off corpora.

**Traces to**: AC-1.1 / AC-1.2 / AC-2.2 / R2 / R3, F-T15. Tier: T2.

**Metric**: per-candidate (recall@30 m, p95 latency, peak GPU mem, sustained 30-min thermal stability, seasonal-robustness score).

**Preconditions**: AerialVL, UAV-VisLoc, AerialExtreMatch, 2chADCNN, TartanAir V2, internal Mavic.

**Steps**: run each candidate (SP+LG, GIM-LG, XFeat sparse, XFeat semi-dense) and each ceiling reference (RoMa v2, MASt3R-SLAM, MapGlue, MATCHA — offline only) over the corpora.

**Pass criteria**:

- Inline candidates must fit in 200 ms / pair on Orin Nano Super @ 25 W.
- Re-loc candidates (LiteSAM) must fit in 2 s / pair.
- Selected inline matcher's recall@30 m on AerialVL S03 must support AC-1.1 / AC-1.2.

**Duration**: 4 h Monte Carlo.
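Picking the frontier from the scored candidates is a standard Pareto filter over (recall@30 m, p95 latency) after the latency-budget cut. A generic sketch, not the project's scoring tool; the tuple layout is illustrative:

```python
# Hypothetical sketch for NFT-PERF-11: keep only inline candidates on the
# accuracy-vs-latency Pareto frontier (higher recall@30 m, lower p95
# latency), after dropping anything over the 200 ms inline budget.

def frontier(candidates, latency_budget_ms=200.0):
    """candidates: list of (name, recall_at_30m, p95_latency_ms)."""
    feasible = [c for c in candidates if c[2] <= latency_budget_ms]
    keep = []
    for name, recall, lat in feasible:
        # Dominated = some other feasible candidate is at least as good on
        # both axes and strictly better on one.
        dominated = any(
            r2 >= recall and l2 <= lat and (r2 > recall or l2 < lat)
            for _, r2, l2 in feasible
        )
        if not dominated:
            keep.append((name, recall, lat))
    return keep
```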
---
### NFT-PERF-12: Latency under adversarial input — no infinite stall

**Summary**: Pathological inputs (uniform-grey frame, all-black frame, very low contrast) do not cause unbounded latency.

**Traces to**: AC-3.x (resilience), AC-4.1 (negative). Tier: T1.

**Metric**: per-frame latency capped.

**Preconditions**: replay with 5 % of frames replaced by uniform-grey or all-black.

**Steps**: replay 30 min; observe latency CDF.

**Pass criteria**: each frame's latency ≤ 600 ms (1.5× p95 budget); pipeline never stalls beyond a single frame interval.

**Duration**: 30 min.

---
## Test execution caveats

- **T1 runs**: produced numbers are NOT deployment-binding. AC-4.1 / NFT-PERF-01 specifically requires Orin Nano Super 25 W (T4) for a binding pass.
- **T4 runs**: the bench scheduler enforces single-tenant access; thermal warm-up ≥1 min before the measurement window starts.
- **Frame-rate floor**: AC-4.1 allows ~10 % drop under sustained load. Drop rate IS measured and reported in NFT-PERF-01.