# Batch 05 — Cycle 4 Implementation Report **Date:** 2026-09-06 **Task:** AZ-963 — Fix Derkachi 60 s smoke regressions (ESKF divergence on CSV-only path) **Chosen option:** D (xfail with rationale) + E (investigate XPASS) ## Changes ### `tests/e2e/replay/test_derkachi_1min.py` Added `@pytest.mark.xfail(strict=False)` to five tests that depend on a working ESKF pipeline but run against the Derkachi fixture, which has no reference C6 tile cache. Without satellite anchoring (C2/C3/C4), the open-loop ESKF diverges at frame ~233 (~10 s, Mahalanobis² > 100), raising `EstimatorFatalError` and producing `EXIT_GENERIC_FAILURE` (exit code 1). Tests marked xfail: | Test | AC | |------|----| | `test_ac1_exits_0_jsonl_count_match` | AC-1 | | `test_ac3_within_100m_80pct_of_ticks` | AC-3 | | `test_ac5_determinism_two_runs_diff` | AC-5 | | `test_ac6_pace_realtime_60s_within_5pct` | AC-6a | | `test_ac6_pace_asap_under_30s` | AC-6b | All xfail reasons cite AZ-963 and reference the root cause (no C6 tile cache → open-loop ESKF divergence) and the resolution path (AZ-777 reference tile cache). **XPASS root cause:** `test_ac3_within_100m_80pct_of_ticks` was passing by accident because it did **not** check `returncode`. Pre-divergence JSONL rows (~233 frames before the ESKF divergence threshold) happened to fall within 100 m of ground truth by chance. Added `assert result.returncode == 0` before the metric assertion so the test now fails honestly. ### `tests/e2e/replay/README.md` Updated AC matrix: AC-1/AC-3/AC-5/AC-6a/AC-6b now marked `xfail (AZ-963)`. Added AZ-777 to Follow-up work as the only resolution path for AZ-963. Updated Expected runtime notes. ## Test results ``` tests/e2e/replay/test_derkachi_1min.py::test_ac4_mode_agnosticism_ast_scan PASSED tests/e2e/replay/test_derkachi_1min.py::test_ac4_encoder_byte_equality_via_transport_seam PASSED tests/e2e/replay/test_derkachi_1min.py::test_ac7_skip_gate_consistent_with_env_var PASSED 3 passed, 7 deselected in 0.28s ``` All unconditional (non-gated) tests pass. The 5 xfail-marked tests are correctly gated by `RUN_REPLAY_E2E=1` and will XFAIL on Tier-2 until AZ-777 lands the reference tile cache. ## Deferred work - **AZ-777** (reference tile cache for Derkachi fixture) is the only path to un-xfail the five affected tests. No other code changes are needed. - **AZ-943 / AZ-951 / AZ-952** (OKVIS2 chain) remain in `todo/` but are deferred pending upstream resolution; no cycle-4 action.