missions/_docs/02_document/tests/performance-tests.md

# Performance Tests

> **Status**: produced by autodev `/test-spec` Phase 2 (2026-05-14).
> **Naming**: post-rename target. The thresholds below are the documented expectations from `acceptance_criteria.md` and `architecture.md` § 6; they reflect what the implemented design *aims* for on the edge devices (Jetson Orin / OrangePI / operator-PC). The test suite asserts these thresholds against the test environment defined in `environment.md` (Docker on developer laptop), with the caveat that production hardware may produce different numbers; CI tracks the test-environment numbers as the regression baseline.
> **Test execution mode**: every NFT-PERF-* test runs `N` sequential repetitions (`N` is given in each test's Steps) and computes the documented percentile (P50/P95). Cold-start passes are excluded: 5 warm-up calls precede every measured run.
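
A minimal Python sketch of the execution mode above, assuming the measurement harness is written in Python (this document does not name the harness language). `measure_latencies_ms` and `percentile` are hypothetical helper names. The inclusive-interpolation percentile from `statistics.quantiles` is also an assumption, since the spec only writes `percentile(latencies, 95)` without naming a method.

```python
import statistics
import time
from typing import Callable, Iterable, List


def measure_latencies_ms(calls: Iterable[Callable[[], object]],
                         warmups: Iterable[Callable[[], object]] = ()) -> List[float]:
    """Run warm-up calls unmeasured, then time each measured call sequentially."""
    for warm in warmups:
        warm()  # excluded from the sample (cold-start / pool / JIT effects)
    latencies = []
    for call in calls:
        start = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(latencies: List[float], pct: int) -> float:
    """Percentile for integer pct in 1..99 (assumed method: inclusive interpolation)."""
    if len(latencies) < 2:
        raise ValueError("need at least two measurements")
    if not 1 <= pct <= 99:
        raise ValueError("pct must be an integer between 1 and 99")
    return statistics.quantiles(latencies, n=100, method="inclusive")[pct - 1]
```

With this method `percentile(latencies, 50)` agrees with `statistics.median(latencies)`, so P50 and P95 come from the same definition.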

---

### NFT-PERF-01: Mission cascade-delete latency target

**Summary**: Verifies the documented latency target for the F3 cascade walk against local PostgreSQL on the same device.
**Traces to**: AC-3.6
**Metric**: P50 wall-clock latency for `DELETE /missions/{id}` against a 1-waypoint, no-map_objects, no-media mission.

**Preconditions**:
- `missions` and `postgres-test` colocated on the same Docker network with no inter-host link
- `seed_one_default_vehicle` + 100 minimal missions (each with 1 waypoint, no media/annotations/detection/map_objects rows)
- 5 warm-up `DELETE` calls on missions outside the measured set (to warm Npgsql connection pool + JIT)

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Issue 100 sequential `DELETE /missions/{id_i}` calls (one per seeded mission, 1 ≤ i ≤ 100) | Record per-call wall-clock latency in ms |
| 2 | Compute P50 across the 100 measurements | `median(latencies)` |

**Pass criteria**: `P50 ≤ 50ms`.
**Duration**: ~10–30s of test wall-clock (each call <50ms on healthy local PG).
**Note**: P95 is *also* recorded for trend tracking but does NOT block — only P50 ≤ 50ms is the gate.
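
The gate semantics above (P50 blocks, P95 is recorded only) can be sketched as follows. `PerfResult` and `evaluate_perf_01` are hypothetical names used for illustration, and the percentile method is an assumption; only the `50ms` threshold comes from this document.

```python
import statistics
from dataclasses import dataclass
from typing import List

P50_GATE_MS = 50.0  # threshold from the pass criteria (AC-3.6)


@dataclass
class PerfResult:
    p50_ms: float
    p95_ms: float  # trend tracking only, never part of the verdict
    passed: bool


def evaluate_perf_01(latencies_ms: List[float]) -> PerfResult:
    """Gate on P50 only; P95 is computed and reported but does not block."""
    if len(latencies_ms) < 2:
        raise ValueError("need at least two measurements")
    p50 = statistics.median(latencies_ms)
    # Assumed percentile method: inclusive interpolation, 95th cut point.
    p95 = statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]
    return PerfResult(p50_ms=p50, p95_ms=p95, passed=p50 <= P50_GATE_MS)
```

A run with a slow tail (for example 90 calls at 10 ms and 10 calls at 500 ms) still passes under this rule, which is why P95 is worth tracking as a trend even though it is not a gate.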

---

### NFT-PERF-02: Mission cascade-delete latency under full chain

**Summary**: Same as NFT-PERF-01 but with the full F3 chain (map_objects + media + annotations + detection rows present). No documented threshold; this test establishes a baseline that subsequent runs must not regress against by more than 50%.
**Traces to**: AC-3.1, AC-3.6 (related)
**Metric**: P50 wall-clock latency for `DELETE /missions/{id}` against a `fixture_cascade_F3`-shaped mission.

**Preconditions**:
- Same as NFT-PERF-01 but seed 50 missions each with the `fixture_cascade_F3` chain (3 map_objects, 2 waypoints, 2 media, 2 annotations, 2 detection)
- 5 warm-up calls on additional fixtures outside the measured set

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Issue 50 sequential `DELETE /missions/{id_i}` calls | Record per-call wall-clock latency in ms |
| 2 | Compute P50, P95 | `median(latencies)`, `percentile(latencies, 95)` |

**Pass criteria**: `P50 ≤ 200ms` (provisional baseline; 4× the minimal-chain target accounts for 3 extra DELETE statements + index updates). On the first green run, lock the achieved P50 + 50% (1.5× the achieved value) as the regression gate for subsequent runs.
**Duration**: ~30–60s of test wall-clock.
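
One way to implement the lock-on-first-green rule is sketched below. It reads the 50% tolerance as an upper bound of 1.5× the locked P50, consistent with the Summary ("must not regress ... by more than 50%"). The JSON baseline file, its `p50_ms` key, and the function names are assumptions made for illustration; this document does not say where the baseline is stored.

```python
import json
from pathlib import Path
from typing import Optional

PROVISIONAL_P50_MS = 200.0  # provisional gate from the pass criteria
REGRESSION_FACTOR = 1.5     # locked P50 + 50%


def load_baseline(path: Path) -> Optional[float]:
    """Return the locked P50 in ms, or None if no run has locked one yet."""
    if not path.exists():
        return None
    return float(json.loads(path.read_text())["p50_ms"])


def check_and_lock(p50_ms: float, path: Path) -> bool:
    """First green run locks the baseline; later runs are gated at 1.5x of it."""
    baseline = load_baseline(path)
    if baseline is None:
        passed = p50_ms <= PROVISIONAL_P50_MS
        if passed:
            # Lock only on a green run so a failing run never becomes the baseline.
            path.write_text(json.dumps({"p50_ms": p50_ms}))
        return passed
    return p50_ms <= baseline * REGRESSION_FACTOR
```

The baseline is never overwritten by later runs in this sketch, so a slow drift cannot ratchet the gate upward; re-locking would be a deliberate manual step.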

---

### NFT-PERF-03: Health endpoint latency

**Summary**: Verifies `GET /health` is the lightweight process-liveness probe.
**Traces to**: AC-7.3
**Metric**: P50 wall-clock latency.

**Preconditions**:
- `missions` running, no special seed required
- 5 warm-up `GET /health` calls

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Issue 100 sequential `GET /health` calls (no `Authorization`) | Record per-call wall-clock latency in ms |
| 2 | Compute P50 | `median(latencies)` |

**Pass criteria**: `P50 ≤ 10ms`.
**Duration**: ~1s of test wall-clock.

---

### NFT-PERF-04: Mission list pagination latency

**Summary**: No documented threshold for `GET /missions` list path — this test establishes a regression baseline so a future change cannot silently 10× the P95 latency.
**Traces to**: AC-2.3 (latency-related, no AC threshold)
**Metric**: P95 wall-clock latency for `GET /missions?page=1&pageSize=20` against a 1000-mission seed.

**Preconditions**:
- Seed 1000 missions referencing `seed_one_default_vehicle`
- 5 warm-up calls

**Steps**:

| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Issue 100 sequential `GET /missions?page=1&pageSize=20` calls | Record per-call wall-clock latency in ms |
| 2 | Compute P95 | `percentile(latencies, 95)` |

**Pass criteria**: initial provisional gate `P95 ≤ 100ms`. On the first green run, lock the achieved P95 + 50% (1.5× the achieved value) as the regression gate. If the first run exceeds the provisional gate, raise the gate AND open a follow-up ticket; do NOT silently accept.
**Duration**: ~10s.

---

## Notes

- All NFT-PERF tests run sequentially (no concurrent client) to remove HTTP/1.1 connection-reuse variance from the measurement. Concurrency is exercised separately under NFT-RES (resilience) when needed for race scenarios.
- Per `restrictions.md` H6, container-level resource limits are NOT enforced inside the container today. Tests assume the test host has ≥ 2 CPU cores and ≥ 2 GB free RAM — hardware-assessment will lock this requirement.
- Latencies measured in the test environment (developer laptop / CI runner) WILL diverge from production edge hardware. The CI gate is the regression baseline; the AC-3.6 / AC-7.3 numerical thresholds are documented production targets, which the test environment is also expected to meet because it is faster than the edge devices, not slower (no PG-on-Jetson penalty).
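
Because resource limits are not enforced by the container, a suite could check the host assumption from the notes above (≥ 2 CPU cores, ≥ 2 GB free RAM) before measuring. The sketch below is a hypothetical pre-flight check, not part of the documented suite: it assumes a Linux host whose `/proc/meminfo` exposes `MemAvailable`, and it treats "2 GB" as 2 GiB.

```python
from typing import Optional

MIN_CPU_CORES = 2
MIN_FREE_RAM_KIB = 2 * 1024 * 1024  # 2 GiB expressed in KiB (assumed reading of "2 GB")


def available_kib(meminfo_text: str) -> Optional[int]:
    """Extract MemAvailable (reported in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None  # field missing, e.g. non-Linux host


def host_meets_assumption(cpu_cores: Optional[int], free_kib: Optional[int]) -> bool:
    """True only when both values are known and meet the documented minimums."""
    return (
        cpu_cores is not None
        and cpu_cores >= MIN_CPU_CORES
        and free_kib is not None
        and free_kib >= MIN_FREE_RAM_KIB
    )
```

On a Linux runner this could be called as `host_meets_assumption(os.cpu_count(), available_kib(open("/proc/meminfo").read()))`, skipping the NFT-PERF tests (rather than failing them) when it returns `False`.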