gps-denied-onboard/_docs/02_document/tests/performance-tests.md

# Performance Tests

### NFT-PERF-01: Per-Frame Latency On Project Still Images

**Summary**: Validate end-to-end latency for processing project nadir frames through geolocation output.

**Traces to**: AC-4.1, AC-4.4

**Metric**: Capture-to-output latency p50/p95/p99 and dropped-frame rate.

**Preconditions**:
- Jetson Orin Nano Super or equivalent production target is running in the intended power mode.
- `project_60_still_images` fixture is available.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Replay images at target 3 fps or faster stress rate | Measure latency from input timestamp to emitted estimate |
| 2 | Record all frame drops | Measure dropped-frame percentage |

**Pass criteria**: p95 latency <400 ms; dropped frames <=10% under sustained load; no batching delay.

**Duration**: Minimum 20 minutes or full fixture loop repeated enough times to reach stable measurements.

---

### NFT-PERF-02: BASALT + Wrapper Replay Latency

**Summary**: Validate relative VIO hot-path latency using synchronized Derkachi video/telemetry and public or representative camera/IMU data.

**Traces to**: AC-2.1a, AC-4.1, AC-4.2

**Metric**: Per-frame VIO latency, completion rate, and memory usage.

**Preconditions**:
- Derkachi `flight_derkachi.mp4` and `data_imu.csv` are mounted and pass fixture validation.
- MUN-FRL/ALTO/EPFL/Kagaru or another representative synchronized dataset slice is pinned for calibrated final comparison.
- OpenVINS reference replay is available for comparison when the dataset supports it.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Replay Derkachi video at target 3 fps and stress rates from the 30 fps source | Measure per-frame processing time, dropped frames, and telemetry alignment |
| 2 | Replay synchronized camera/IMU stream through BASALT + wrapper | Measure VIO processing time and completion rate |
| 3 | Compare emitted trajectory against Derkachi `GLOBAL_POSITION_INT` and calibrated dataset ground truth where available | Measure completion rate and error distribution |
| 4 | Monitor memory | Track CPU/GPU shared memory peak |

**Pass criteria**: Normal-frame VO registration >95% on calibration-supported segments; p95 processing latency <400 ms for the hot path; memory <8 GB shared; Derkachi replay maintains stable 3-video-frames-per-telemetry-row alignment with <=10% dropped frames under sustained target-rate replay.

**Duration**: Dataset-dependent; at least one normal segment and one challenging segment.

---

### NFT-PERF-03: Relocalization Trigger Path Latency

**Summary**: Validate the heavy DINOv2-VLAD + FAISS + ALIKED/LightGlue path under bounded top-K settings.

**Traces to**: AC-3.2, AC-3.3, AC-4.1, AC-8.6

**Metric**: Trigger-to-anchor latency, top-K query time, local verification time, accepted/rejected anchor counts.

**Preconditions**:
- Precomputed descriptor index is loaded.
- Dynamic K settings are configured: K=5 stable, K=20 active-conflict, K=50 fallback.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Trigger relocalization from cold start or sharp turn | Measure DINOv2 descriptor time and FAISS query time |
| 2 | Verify top-K candidates | Measure ALIKED/LightGlue + RANSAC latency |
| 3 | Emit accepted/rejected decision | Measure total trigger-to-decision latency |

**Pass criteria**: Heavy path is conditional, never blocks steady-state frame output; accepted anchor carries MRE <2.5 px and valid covariance.

**Duration**: 100 relocalization trials across stable and active-conflict sector fixtures.

---

### NFT-PERF-04: Cold Boot Time To First Fix

**Summary**: Validate companion boot to first valid `GPS_INPUT`.

**Traces to**: AC-NEW-1

**Metric**: Time from service start/boot marker to first valid `GPS_INPUT`.

**Preconditions**:
- Engines/indexes are built before the run.
- Cache/index is available locally.
- FC state handoff is simulated or provided.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Start service from cold boot condition | Measure initialization stages |
| 2 | Wait for first valid output | Measure first valid `GPS_INPUT` timestamp |

**Pass criteria**: 95th percentile <30 s over 50 runs.

**Duration**: 50 cold-start trials.

---

### NFT-PERF-INFRA: Replay Evidence Smoke

**Summary**: Validate that the Docker replay harness records timing evidence for the runnable local replay subset.

**Traces to**: AZ-234 AC-3, AZ-233 AC-3, AZ-233 AC-4

**Metric**: Scenario execution time and report generation status.

**Preconditions**:
- Docker replay environment is available.
- Project input fixtures are mounted read-only into the replay consumer.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Run the replay consumer in Docker mode | Confirm the performance smoke scenario executes |
| 2 | Inspect the generated CSV and FDR summary | Confirm execution time and artifact paths are recorded |

**Pass criteria**: `NFT-PERF-INFRA` reports `pass` and writes run-scoped CSV/Markdown evidence; Jetson-only performance evidence remains in release-gate resource tests.