gps-denied-onboard/_docs/02_document/tests/performance-tests.md
Oleksandr Bezdieniezhnykh cab7b5d020 [AZ-233] Update Docker Compose and enhance test documentation
- Modified the Docker Compose configuration to include an input root for replay tests and added an environment variable for enabling SITL.
- Enhanced documentation for various testing processes, including the addition of a Runtime Completeness Decomposition Gate and clarifications on internal module testing requirements.
- Updated the implementation completeness report to reflect the current state and added new test cases for performance and resilience scenarios.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-06 05:03:48 +03:00


Performance Tests

NFT-PERF-01: Per-Frame Latency On Project Still Images

Summary: Validate end-to-end latency for processing project nadir frames through geolocation output.

Traces to: AC-4.1, AC-4.4

Metric: Capture-to-output latency p50/p95/p99 and dropped-frame rate.

Preconditions:

  • Jetson Orin Nano Super or equivalent production target is running in the intended power mode.
  • project_60_still_images fixture is available.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Replay images at target 3 fps or faster stress rate | Measure latency from input timestamp to emitted estimate |
| 2 | Record all frame drops | Measure dropped-frame percentage |

Pass criteria: p95 latency <400 ms; dropped frames <=10% under sustained load; no batching delay.

Duration: Minimum 20 minutes or full fixture loop repeated enough times to reach stable measurements.
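
The pass criteria above can be checked with a small summary routine. The following is a minimal sketch, not the project's harness: it assumes each replayed frame is logged as a pair of integer millisecond timestamps, with `None` as the output timestamp for a dropped frame, and it uses the nearest-rank percentile definition. The function names and record layout are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_latency(frames, p95_limit_ms=400, max_drop_rate=0.10):
    """Summarize capture-to-output latency and dropped-frame rate.

    frames: list of (input_ts_ms, output_ts_ms or None); None marks a dropped frame.
    The record layout is an assumption made for this sketch.
    """
    latencies_ms = [out - inp for inp, out in frames if out is not None]
    dropped = sum(1 for _, out in frames if out is None)
    drop_rate = dropped / len(frames)
    p50, p95, p99 = (percentile(latencies_ms, p) for p in (50, 95, 99))
    return {
        "p50_ms": p50,
        "p95_ms": p95,
        "p99_ms": p99,
        "drop_rate": drop_rate,
        # Mirrors the stated criteria: p95 < 400 ms and dropped frames <= 10%.
        "passed": p95 < p95_limit_ms and drop_rate <= max_drop_rate,
    }
```

Nearest-rank is used here because it always returns an observed sample; an interpolating percentile would give slightly different values on short runs.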


NFT-PERF-02: BASALT + Wrapper Replay Latency

Summary: Validate relative VIO hot-path latency using synchronized Derkachi video/telemetry and public or representative camera/IMU data.

Traces to: AC-2.1a, AC-4.1, AC-4.2

Metric: Per-frame VIO latency, completion rate, and memory usage.

Preconditions:

  • Derkachi flight_derkachi.mp4 and data_imu.csv are mounted and pass fixture validation.
  • MUN-FRL/ALTO/EPFL/Kagaru or another representative synchronized dataset slice is pinned for calibrated final comparison.
  • OpenVINS reference replay is available for comparison when the dataset supports it.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Replay Derkachi video at target 3 fps and stress rates from the 30 fps source | Measure per-frame processing time, dropped frames, and telemetry alignment |
| 2 | Replay synchronized camera/IMU stream through BASALT + wrapper | Measure VIO processing time and completion rate |
| 3 | Compare emitted trajectory against Derkachi GLOBAL_POSITION_INT and calibrated dataset ground truth where available | Measure completion rate and error distribution |
| 4 | Monitor memory | Track CPU/GPU shared memory peak |

Pass criteria: Normal-frame VO registration >95% on calibration-supported segments; p95 processing latency <400 ms for the hot path; memory <8 GB shared; Derkachi replay maintains stable 3-video-frames-per-telemetry-row alignment with <=10% dropped frames under sustained target-rate replay.

Duration: Dataset-dependent; at least one normal segment and one challenging segment.
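
The frame-to-telemetry relationship in this test can be illustrated with index arithmetic. This is a sketch under stated assumptions: the 30 fps source and the 3-video-frames-per-telemetry-row ratio come from the text above, while zero-based indexing, a telemetry log that starts at frame 0, and the helper names are hypothetical.

```python
SOURCE_FPS = 30               # source video rate stated in the test steps
FRAMES_PER_TELEMETRY_ROW = 3  # alignment ratio stated in the pass criteria


def replay_frame_indices(total_frames, target_fps, source_fps=SOURCE_FPS):
    """Source frame indices to replay when subsampling to target_fps.

    Assumes the target rate divides the source rate evenly.
    """
    if target_fps <= 0 or source_fps % target_fps != 0:
        raise ValueError("target_fps must evenly divide source_fps")
    stride = source_fps // target_fps
    return list(range(0, total_frames, stride))


def telemetry_row_for_frame(frame_index):
    """Zero-based telemetry row expected to cover a given source frame."""
    return frame_index // FRAMES_PER_TELEMETRY_ROW


def misaligned_frames(frame_to_row):
    """Return the frame indices whose recorded row differs from the expected row.

    frame_to_row: dict mapping source frame index to the telemetry row the
    replay actually paired it with.
    """
    return sorted(
        frame
        for frame, row in frame_to_row.items()
        if row != telemetry_row_for_frame(frame)
    )
```

At the 3 fps target this replays every tenth source frame, so consecutive replayed frames land on telemetry rows 0, 3, 6, and so on under the assumptions above.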


NFT-PERF-03: Relocalization Trigger Path Latency

Summary: Validate the heavy DINOv2-VLAD + FAISS + ALIKED/LightGlue path under bounded top-K settings.

Traces to: AC-3.2, AC-3.3, AC-4.1, AC-8.6

Metric: Trigger-to-anchor latency, top-K query time, local verification time, accepted/rejected anchor counts.

Preconditions:

  • Precomputed descriptor index is loaded.
  • Dynamic K settings are configured: K=5 stable, K=20 active-conflict, K=50 fallback.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Trigger relocalization from cold start or sharp turn | Measure DINOv2 descriptor time and FAISS query time |
| 2 | Verify top-K candidates | Measure ALIKED/LightGlue + RANSAC latency |
| 3 | Emit accepted/rejected decision | Measure total trigger-to-decision latency |

Pass criteria: The heavy path is conditional and never blocks steady-state frame output; every accepted anchor carries MRE <2.5 px and a valid covariance.

Duration: 100 relocalization trials across stable and active-conflict sector fixtures.
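
The dynamic K settings and the anchor acceptance rule can be sketched as follows. The K values and the 2.5 px MRE bound come from this test; the state names, the `AnchorCandidate` type, and the definition of a "valid" covariance (a symmetric, positive-definite 2x2 matrix) are assumptions made for illustration, since the document does not define them.

```python
from dataclasses import dataclass

# K per sector state, as listed in the preconditions.
TOP_K_BY_STATE = {"stable": 5, "active_conflict": 20, "fallback": 50}


def select_top_k(state):
    """Return the retrieval top-K for a sector state (state names are hypothetical)."""
    try:
        return TOP_K_BY_STATE[state]
    except KeyError:
        raise ValueError(f"unknown sector state: {state!r}") from None


@dataclass
class AnchorCandidate:
    mre_px: float     # mean reprojection error after local verification
    covariance: list  # 2x2 position covariance, row-major nested lists


def covariance_is_valid(cov, tol=1e-9):
    """Assumed validity check: symmetric and positive definite (2x2 case)."""
    (a, b), (c, d) = cov
    symmetric = abs(b - c) <= tol
    positive_definite = a > 0 and (a * d - b * c) > 0
    return symmetric and positive_definite


def accept_anchor(candidate, max_mre_px=2.5):
    """Accept only anchors with MRE strictly below the bound and a valid covariance."""
    return candidate.mre_px < max_mre_px and covariance_is_valid(candidate.covariance)
```

Keeping the K table as data rather than branching logic makes it straightforward to record which K was active for each of the 100 trials.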


NFT-PERF-04: Cold Boot Time To First Fix

Summary: Validate companion boot to first valid GPS_INPUT.

Traces to: AC-NEW-1

Metric: Time from service start/boot marker to first valid GPS_INPUT.

Preconditions:

  • Engines/indexes are built before the run.
  • Cache/index is available locally.
  • FC state handoff is simulated or provided.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Start service from cold boot condition | Measure initialization stages |
| 2 | Wait for first valid output | Measure first valid GPS_INPUT timestamp |

Pass criteria: 95th percentile <30 s over 50 runs.

Duration: 50 cold-start trials.
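
With 50 trials and a nearest-rank percentile, the 95th percentile is the 48th smallest time, so the criterion tolerates at most two runs at or above 30 s. The sketch below shows one way to derive time to first fix from an event log and apply that rule. The event names and log layout are hypothetical, and treating a run with no fix as a failure is an assumption.

```python
import math


def time_to_first_fix(events, start_marker="service_start", fix_marker="gps_input_valid"):
    """Seconds from the first start marker to the first valid fix after it.

    events: chronological iterable of (timestamp_s, event_name).
    Returns None when no fix follows a start marker.
    """
    start = None
    for ts, name in events:
        if name == start_marker and start is None:
            start = ts
        elif name == fix_marker and start is not None:
            return ts - start
    return None


def cold_boot_passes(ttff_s, limit_s=30.0, pct=95):
    """Nearest-rank percentile check over all trials."""
    if not ttff_s or any(t is None for t in ttff_s):
        return False  # assumption: a trial without a fix fails the test
    rank = math.ceil(pct * len(ttff_s) / 100)
    return sorted(ttff_s)[rank - 1] < limit_s
```

Logging intermediate events between the two markers gives the per-stage initialization breakdown that step 1 asks for without changing the pass check.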


NFT-PERF-INFRA: Replay Evidence Smoke

Summary: Validate that the Docker replay harness records timing evidence for the runnable local replay subset.

Traces to: AZ-234 AC-3, AZ-233 AC-3, AZ-233 AC-4

Metric: Scenario execution time and report generation status.

Preconditions:

  • Docker replay environment is available.
  • Project input fixtures are mounted read-only into the replay consumer.

| Step | Consumer Action | Measurement |
|------|-----------------|-------------|
| 1 | Run the replay consumer in Docker mode | Confirm the performance smoke scenario executes |
| 2 | Inspect the generated CSV and FDR summary | Confirm execution time and artifact paths are recorded |

Pass criteria: NFT-PERF-INFRA reports pass and writes run-scoped CSV/Markdown evidence; Jetson-only performance evidence remains in release-gate resource tests.
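
Step 2 can be automated with a small check over the generated CSV. This is a sketch only: the harness's real report schema is not shown in this document, so the column names `scenario`, `status`, `execution_time_s`, and `artifact_path` are hypothetical placeholders.

```python
import csv
import io

# Hypothetical column names; replace with the columns the harness actually writes.
REQUIRED_COLUMNS = ("scenario", "status", "execution_time_s", "artifact_path")


def evidence_is_recorded(csv_text, scenario="NFT-PERF-INFRA"):
    """Return True when the scenario row reports pass with timing and artifact path filled in."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row.get("scenario") != scenario:
            continue
        # Every required field must be present and non-empty.
        if any(not (row.get(col) or "").strip() for col in REQUIRED_COLUMNS):
            return False
        if row["status"].strip().lower() != "pass":
            return False
        try:
            return float(row["execution_time_s"]) >= 0.0
        except ValueError:
            return False
    return False  # scenario row not found
```

Reading from a string keeps the sketch independent of where the run-scoped report is written; a caller would pass the contents of the generated file.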