mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-06-21 14:51:14 +00:00

Files

T

Oleksandr Bezdieniezhnykh 73cd632e95 [AZ-428] [AZ-429] [AZ-430] [AZ-431] Add NFT-PERF-01..04 perf scenarios

Batch 85 — 4 Performance NFT scenarios + pure-logic evaluators.

- NFT-PERF-01 (AZ-428, Tier-2): two-config e2e latency p95 ≤ 400 ms
  (K=3@25°C, K=2 hybrid@50°C) + frame-drop ≤10% + informational per-stage
  partition recording (D-CROSS-LATENCY-1).
- NFT-PERF-02 (AZ-429): inter-emit p95 ≤ 350 ms + no ≥3 missed-emit
  windows. fc-adapter-aware SITL timestamp extraction (tlog vs MSP).
- NFT-PERF-03 (AZ-430, Tier-2): cold-start TTFF p95 ≤ 30 s AND max ≤ 45 s
  over N≥10 iterations.
- NFT-PERF-04 (AZ-431): spoof-promotion latency p95 ≤ 600 ms over N≥20
  randomized-start blackout+spoof events.

All scenarios consume external fixtures (AZ-595 dependency surfaced) and
fail loudly when fixtures are missing or empty. Public-boundary
discipline preserved — evaluators do NOT import src/gps_denied_onboard.

Tests: 60 new unit tests pass; 24 scenarios collect (4 tests × 2 fc × 3
vio). Code review: PASS_WITH_WARNINGS — 1 Medium (fixed in batch),
3 Low (production-dependency surfacings + future hygiene).

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-17 16:46:49 +03:00

7.3 KiB

Raw Permalink Blame History

Code Review Report — Batch 85

Batch: 85 (AZ-428 + AZ-429 + AZ-430 + AZ-431 — Performance NFTs) Date: 2026-05-17 Verdict: PASS_WITH_WARNINGS

Findings

#	Severity	Category	File:Line	Title
1	Medium	Maintainability	`e2e/tests/performance/test_nft_perf_04_spoof_promotion.py:127-145`	Duplicate `sitl_observer` import across branches — fixed in batch
2	Low	Spec-Gap (surfacing)	`e2e/tests/performance/test_nft_perf_04_spoof_promotion.py`	Production dependency: injector cannot emit N=20 randomized-start events
3	Low	Spec-Gap (surfacing)	`e2e/tests/performance/test_nft_perf_03_ttff.py`	AC-2 (clean-state per iteration) delegated to Tier-2 harness (AZ-444)
4	Low	Maintainability	`e2e/runner/helpers/{ttff,spoof_promotion,e2e_latency,streaming}_evaluator.py`	CSV-emit boilerplate duplicated across 4 evaluators

Finding Details

F1: Duplicate sitl_observer import across branches (Medium / Maintainability — fixed in batch)

Location: e2e/tests/performance/test_nft_perf_04_spoof_promotion.py:132,140
Description: _resolve_events_fixture_path imported sitl_observer inside two separate branches. NFT-PERF-01 and NFT-PERF-03 already hoist the import once at the top of the resolver.
Resolution: Hoisted the import to the top of the function during this batch.
Task: AZ-431

F2: Production dependency — injector cannot emit N=20 randomized-start events (Low / Spec-Gap — surfacing)

Location: e2e/tests/performance/test_nft_perf_04_spoof_promotion.py
Description: AZ-431 AC-1 says "N≥20 events via blackout_spoof.py with randomized window starts". Current blackout_spoof.py only randomizes spoofed GPS values via seed; the blackout-window start is hardcoded. The scenario therefore consumes an external E2E_NFT_PERF_04_EVENTS_FIXTURE produced by the fixture builder (AZ-595). Scenario fails loudly when the fixture is missing or empty.
Suggestion: Track as production dependency for AZ-595 (fixture builder) — extend the SITL replay builder to emit nft_perf_04_events.json with N≥20 randomized-start records.
Task: AZ-431

F3: AC-2 (clean-state per iteration) delegated to Tier-2 harness (Low / Spec-Gap — surfacing)

Location: e2e/tests/performance/test_nft_perf_03_ttff.py
Description: AZ-430 AC-2 requires per-iteration fdr-output volume wipe + cold SUT restart. Per scope-discipline these lifecycle concerns belong to the Tier-2 harness (AZ-444 / AZ-595 fixture builder), not to the in-pytest scenario. The scenario only consumes a pre-captured nft_perf_03_ttff.json with N≥10 iteration records.
Suggestion: Track as production dependency for AZ-444 (Tier-2 runner) — wire the per-iteration lifecycle reset and fixture builder.
Task: AZ-430

F4: CSV-emit boilerplate duplicated across 4 evaluators (Low / Maintainability)

Location: e2e/runner/helpers/streaming_evaluator.py, spoof_promotion_evaluator.py, ttff_evaluator.py, e2e_latency_evaluator.py
Description: Each evaluator implements write_csv_evidence + write_per_* with the same shape (open file, write header, write rows, return path). Aggregate CSV row formatting is also boilerplate-heavy.
Suggestion: Future hygiene PBI — extract a _emit_csv(path, header, rows) helper. Not blocking; current code is readable and isolated per scenario.
Task: AZ-428 / AZ-429 / AZ-430 / AZ-431

Phase Notes

Phase 1 — Context

All 4 task specs read; ACs walked through against helpers + scenarios.

Phase 2 — Spec Compliance

Task	AC	Evidence
AZ-429	AC-1 p95 ≤ 350 ms	`streaming_evaluator.evaluate_inter_emit.passes_p95` + scenario assertion
AZ-429	AC-2 no ≥3-emit gap	`evaluate_missed_emits.longest_run < MISSED_EMIT_WINDOW_LIMIT`
AZ-429	AC-3 parameterization	6 collected variants (ardupilot/inav × {okvis2, klt_ransac, vins_mono})
AZ-431	AC-1 N≥20 events	`evaluate.passes_event_count` + fixture validation
AZ-431	AC-2 p95 ≤ 600 ms	`evaluate.passes_p95` + scenario assertion
AZ-431	AC-3 parameterization	6 collected variants
AZ-430	AC-1 tier guard	`@pytest.mark.tier2_only`
AZ-430	AC-2 clean state	delegated to Tier-2 harness (AZ-444) — F3 surfaced
AZ-430	AC-3 p95 ≤ 30 s	`te.evaluate.passes_p95`
AZ-430	AC-4 max ≤ 45 s	`te.evaluate.passes_max`
AZ-430	AC-5 parameterization	6 collected variants
AZ-428	AC-1 tier guard	`@pytest.mark.tier2_only`
AZ-428	AC-2 K=3@25 °C p95 ≤ 400 ms	per-config assertion (`config_id == "k3-25c"`)
AZ-428	AC-3 K=2@50 °C p95 ≤ 400 ms	per-config assertion (`config_id == "k2-hybrid-50c"`)
AZ-428	AC-4 frame drop ≤ 10 %	`LatencyReport.passes_frame_drop` per config
AZ-428	AC-5 partition recorded	`write_partition_csv` (informational, no threshold)
AZ-428	AC-6 parameterization	6 collected variants per config; both configs run per param

Phase 3 — Code Quality

SOLID: each evaluator owns one responsibility; fc-adapter-specific timestamp extraction lives in the AZ-429 scenario (_read_emit_times_ms) rather than leaking into the evaluator.
Error handling: ValueError on negative latency/TTFF (fail-loud at evaluator boundary); pytest.fail on malformed fixture (fail-loud at scenario boundary). No bare except.
DRY: streaming_evaluator._percentile re-used by ttff_evaluator and e2e_latency_evaluator — correct shared-helper pattern.
Tests: all use the Arrange/Act/Assert pattern with # Arrange / # Act / # Assert markers per .cursor/rules/coderule.mdc.
Naming: scenario function names mirror task IDs (test_nft_perf_0N_*); helper symbols use full domain words (ColdStartIteration, FrameLatencySample, SpoofEvent).

Phase 4 — Security

No subprocess / shell=True / eval / exec usage in new code.
No hardcoded secrets.
Input from fixtures parsed via json.loads (safe); shape validated with explicit pytest.fail on malformed records — no insecure deserialisation.

Phase 5 — Performance

One sort per percentile call (sorted(values)); fixtures are ≤ N=900 per config — negligible.
No N+1 patterns; no blocking I/O in async contexts.

Phase 6 — Cross-Task Consistency

All 4 evaluators share the _percentile helper from streaming_evaluator.
All 4 scenarios follow the identical fixture-consumer pattern (resolve fixture path → load → evaluate → write CSV evidence → record NFR metrics → assert).
All 4 scenarios use @pytest.mark.scenario_id + @pytest.mark.traces_to consistently.

Phase 7 — Architecture Compliance

All new files under e2e/ (Blackbox Tests component per _docs/02_document/module-layout.md).
No imports from src/gps_denied_onboard (verified — explicit "does NOT import" notes in evaluator docstrings).
No new cyclic dependencies.
No duplicate symbols across components.

Verdict Logic

0 Critical, 0 High → not FAIL.
1 Medium, 3 Low → PASS_WITH_WARNINGS.

F1 (duplicate import) is the only actionable finding without a downstream dependency; deferred to a follow-up hygiene pass given trivial scope.

Cumulative Trigger

Batch 85 advances the K-counter to 1 of K=3 from cumulative baseline (batches 82-84). Cumulative review trigger reached at batch 87.

7.3 KiB Raw Permalink Blame History Unescape Escape