# Code Review Report **Batch**: 71 — AZ-414, AZ-415, AZ-418 **Date**: 2026-05-16 **Verdict**: PASS ## Findings (none) ## Findings Sweep ### Phase 1 — Context Loading Loaded specs `AZ-414_ft_p_07_ftn_02_sharp_turn.md`, `AZ-415_ft_p_08_multi_segment_reloc.md`, `AZ-418_ft_p_10_smoothing_lookback.md`. Reused the existing `geo.py`, `sharp_turn_detector.py` (new this batch), `multi_segment_evaluator.py` (new this batch), `smoothing_evaluator.py` (new this batch), and the `fdr_reader` / `frame_source_replay` / `imu_replay` stub gates used by batches 69 and 70. Re-read `_docs/00_problem/input_data/flight_derkachi/data_imu.csv` header layout to confirm `SCALED_IMU2.zgyro` column and `GLOBAL_POSITION_INT.lat/lon` units (decimal degrees, not 1e-7 int32). ### Phase 2 — Spec Compliance **AZ-414 (FT-P-07 + FT-N-02)** | AC | Coverage | Status | |----|----------|--------| | AC-1 (turn segment ID via `\|gyro_z\| ≥ threshold` for ≥3 rows, with synthetic-overlay fallback marked in evidence CSV) | `test_detect_simple_turn`, `test_detect_short_run_pruned`, `test_detect_negative_yaw_uses_abs_value`, `test_detect_or_synthesize_falls_back_to_overlay`, scenario `synthetic_overlay` column in `write_csv_evidence` | Covered | | AC-2 (during-turn label ∈ {visual_propagated, dead_reckoned}) | `test_ft_n_02_passes_with_only_propagated_labels`, `test_ft_n_02_fails_on_satellite_anchored_during_turn`, FT-N-02 scenario assertion | Covered | | AC-3 (cov non-decreasing during turn) | `test_ft_n_02_fails_on_decreasing_covariance`, FT-N-02 scenario assertion | Covered | | AC-4 (recovery ≤ 3 frames after turn end) | `test_ft_p_07_passes_recovery_within_budget`, `test_ft_p_07_fails_when_recovery_takes_too_long`, FT-P-07 scenario assertion via `MAX_RECOVERY_FRAMES_SAFETY_MS` | Covered | | AC-5 (drift ≤ 200 m) | `test_ft_p_07_fails_when_drift_exceeds_budget`, FT-P-07 scenario assertion | Covered | | AC-6 (heading delta ≤ 70°) | `test_ft_p_07_heading_envelope_with_pre_anchor`, `test_ft_p_07_heading_outside_envelope_fails`, FT-P-07 scenario assertion | Covered | | AC-7 (parameterized per `(fc_adapter, vio_strategy)`) | Both scenarios use `fc_adapter` + `vio_strategy` fixtures from `runner/conftest.py` — `pytest --collect-only` shows 6 variants each | Covered | Note on AC-3.2 threshold: helper reads `AC32_SHARP_TURN_GYRO_Z_MDPS` env var (default 30,000 millidegree/s) per spec note ("reads from test-spec environment, not from a hardcoded constant"). Default + env override + validation covered by `test_default_threshold_when_env_unset`, `test_threshold_env_override_applies`, `test_threshold_env_rejects_non_int`, `test_threshold_env_rejects_non_positive`. **AZ-415 (FT-P-08)** | AC | Coverage | Status | |----|----------|--------| | AC-1 (multi-segment-derkachi fixture with three 5-15 s blackouts) | Fixture parameter, evidence CSV `gap_s` column | Covered (gated on `frame_source_replay`/`fdr_reader`) | | AC-2 (source_label = dead_reckoned inside each blackout) | `test_evaluate_window_label_violation`, scenario assertion | Covered | | AC-3 (recovery to satellite_anchored ≤ 10 s after window end) | `test_evaluate_window_recovery_within_budget`, `test_evaluate_window_recovery_misses_budget`, scenario assertion | Covered | | AC-4 (no centre jump > 200 m at recovery) | `test_evaluate_window_jump_within_budget`, `test_evaluate_window_jump_exceeds_budget`, scenario assertion | Covered | **AZ-418 (FT-P-10)** | AC | Coverage | Status | |----|----------|--------| | AC-1 (raw + smoothed per past keyframe in FDR) | `test_pair_records_groups_by_keyframe`, `test_pair_records_keeps_orphans_partial`, `test_pair_records_rejects_duplicate_pose_kind`, `test_evaluate_excludes_unpaired_keyframes`, scenario `record_type == "keyframe_pose"` filter | Covered | | AC-2 (improvement rate ≥ 0.80) | `test_evaluate_all_smoothed_wins_passes`, `test_evaluate_at_80_pct_improvement_rate_passes`, `test_evaluate_below_80_pct_fails_overall`, scenario assertion | Covered | | AC-3 (mean improvement ≥ 5 m) | `test_evaluate_at_80_pct_improvement_rate_passes`, `test_evaluate_mean_improvement_below_5m_fails`, scenario assertion | Covered | Mode B Fact #107 ("INTERNAL improvement metric; NOT FC-side retroactive correction") explicitly preserved in module docstring. ### Phase 3 — Code Quality * **Single responsibility**: each helper is one concern. `sharp_turn_detector` owns detection + per-segment evaluation for both FT-P-07 (recovery) and FT-N-02 (during turn) because both halves consume the same segment fixture and the same outbound-estimate stream — splitting them would duplicate `TurnFrameSample` collection in two scenarios. The split is at the *assertion* layer (positive vs negative scenario file), not at the detector. `smoothing_evaluator` owns pose-pair logic, GT resolution, and budget evaluation. `multi_segment_evaluator` already reviewed in batch 71 partial. * **No suppressed errors**: every helper raises on invalid input (`get_threshold_mdps` env validation, `pair_records` dupe/unknown pose_kind, `load_zgyro_samples` missing column, `synthesize_overlay_segment` argument validation, `evaluate` empty GT track). * **AAA comment discipline**: every test uses `# Arrange / # Act / # Assert`; sections omitted when not needed (single-line `Assert` for constant tests). * **No narration comments**: docstrings explain non-obvious intent (AC mapping, why orphans are excluded, why nearest-neighbour GT is acceptable, why the env override exists). ### Phase 4 — Security * **No SUT imports**: confirmed by passing `test_no_sut_imports.py` in the full suite. None of the new modules import from `src.gps_denied_onboard`. * **No PII/credentials**: helpers handle synthetic + Derkachi-public GT only. * **No SQL/shell injection surface**: helpers consume CSV via `csv.DictReader` and `pathlib`. No subprocess calls. ### Phase 5 — Performance * All helpers are O(N) over samples. `pair_records` is one dict pass; `evaluate` is O(P) over paired keyframes (P typically ≤200 for the Derkachi window); `detect_turn_segments` is one scan; `evaluate_ft_p_07` uses a linear scan for pre-anchor + recovery (acceptable for ≤1000 samples; if scenario data grows we can switch to bisect). * No nested CSV reads or repeated geodesic recomputations. ### Phase 6 — Cross-Task Consistency * **Pattern parity with batches 69 + 70**: - Skip gate (`_harness_helpers_implemented` fixture) for missing upstream replay helpers — same pattern as `test_ft_p_02_*`, `test_ft_p_04_*`, `test_ft_p_05_*`. - `_NullSink` / `_NullImuEmitter` probes — same pattern as `test_ft_p_04_*`. - Evidence CSV via `write_csv_evidence(out, …)` returning the path — same pattern as `accuracy_evaluator`, `mre_evaluator`, `multi_segment_evaluator`. - NFR metrics via `nfr_recorder.record_metric(name, value, ac_id=…)` — same pattern as `test_ft_p_01_*`, `test_ft_p_04_*`. - Helper modules importable from `runner.helpers.*` with module-level constants in `UPPER_SNAKE` — matches `multi_segment_evaluator` and `mre_evaluator`. * **No drift**: FT-P-07 scenario reuses the `MAX_RECOVERY_FRAMES_SAFETY_MS` budget constant from the helper (no magic numbers); FT-N-02 reuses `ALLOWED_DURING_TURN_LABELS` set so the scenario assertion message prints the spec-accurate alphabetised set. ### Phase 7 — Architecture Compliance * **Public-boundary discipline**: confirmed by `test_no_sut_imports.py` (passing). Helpers consume only the FDR record schema (record_type + payload dict) defined in `runner.helpers.fdr_reader`. * **Directory layout**: new files added to `test_directory_layout.py` parametrize list (`runner/helpers/{smoothing_evaluator,sharp_turn_detector,multi_segment_evaluator}.py`, `tests/positive/test_ft_p_{07,08,10}_*.py`, `tests/negative/test_ft_n_02_*.py`). All 75 parametrized variants pass. * **Determinism**: all helpers are deterministic — no `time.time()`, no random number generation; `random` not imported in any new module. ### Phase 8 — Test Suite Health * Total: **393 passed in 126.64s** (was 325 at end of batch 70). * New tests this batch: **+68** (sharp_turn_detector: 30; smoothing_evaluator: 15; multi_segment_evaluator: 16; directory_layout new entries: 7). * Pre-existing macOS-only `/e2e-results` plugin issue still present — affects scenario test invocation outside Docker only; unit suite unaffected. Tracked under runner reporting (out of batch scope). ## Cross-Task Consistency Verdict PASS — no cross-task drift, no duplicated logic across the three new helpers, no shared mutable state, evidence CSV schemas distinct per scenario but follow the same write pattern. ## Architecture Compliance Verdict PASS — public-boundary blackbox stance preserved; no SUT imports; FDR schema honoured. ## Final Verdict **PASS** — Batch 71 (AZ-414 + AZ-415 + AZ-418) ready for commit.