Option A (minimum-deprecation, 2 SP) per user complexity-budget decision. Auto-sync stays importable as a raising stub for one cycle so external callers see a clean ReplayInputAdapterError instead of an ImportError. Full physical removal is filed as AZ-908 (cycle-5+ backlog).

Production:
- auto_sync.py: 700+ LOC -> 56-line no-op stub raising "auto-sync removed; supply --imu CSV instead"
- tlog_video_adapter.py: 700+ LOC -> 105-line deprecated stub; ReplayInputAdapter.open() raises immediately, close() is a no-op
- _replay_branch.py: dropped legacy auto-sync branch + _build_auto_sync_config; _validate_replay_paths now requires imu_csv_path; replay_input_adapter_factory parameter removed
- cli/replay.py: --time-offset-ms / --skip-auto-sync / --auto-trim emit DeprecationWarning + stderr line; values ignored
- tlog_replay_adapter.py + tlog_ground_truth.py docstrings: AUDIT-ONLY

Tests:
- DELETED test_az405_auto_sync, test_az405_replay_input_adapter, test_az698_window_alignment (covered code no longer runs)
- ADDED test_az895_auto_sync_deprecated_stub (5 parametrised, pins AC-1)
- test_az402_replay_cli: deprecation warnings + ignored-value asserts
- test_az401_compose_root_replay: new imu_csv_path-required gate; deleted the calibration-loading test that relied on the removed replay_input_adapter_factory injection point
- test_derkachi_real_tlog: xfail reason refreshed to AZ-848 + AZ-883 (AC-4 "AZ-848-scoped reason")

Docs:
- module-layout.md: replay_input file list flags deprecated modules, adds csv_ground_truth.py
- _dependencies_table.md: +AZ-908 row, preamble + totals updated (179 -> 180 tasks, 567 -> 570 SP)
- AZ-908 backlog spec added; AZ-895 spec moved todo -> done
- batch_03_cycle4_report.md written

Touched-module tests green (111 passed, 1 skipped). Full unit suite green: 2287 passed, 85 skipped, 1 deselected (pre-existing flaky perf test, unrelated).

Co-authored-by: Cursor <cursoragent@cursor.com>

E2E replay tests (AZ-404)
End-to-end regression suite that runs the gps-denied-replay
console-script (AZ-402) against the Derkachi 60 s clip and asserts
the AZ-265 epic acceptance criteria.
How to run
# In a fresh venv with the package installed:
RUN_REPLAY_E2E=1 pytest tests/e2e/replay/ -v
Without RUN_REPLAY_E2E=1 the heavy tests skip cleanly. The two
unconditional tests (AC-4a mode-agnosticism scan and AC-7 skip-gate
self-check) and the helper tests in test_helpers.py still run.
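The opt-in gate described above can be sketched as a small predicate. This is a minimal illustration, not the suite's actual code: the function names are hypothetical, and treating only the exact value "1" as enabled is an assumption.

```python
import os

ENV_VAR = "RUN_REPLAY_E2E"  # variable name taken from this README


def heavy_tests_enabled(environ=None):
    """Return True only when the opt-in variable is exactly "1" (assumed rule)."""
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR) == "1"


def skip_reason(environ=None):
    """Return None when heavy tests should run, else a readable skip reason."""
    if heavy_tests_enabled(environ):
        return None
    return f"set {ENV_VAR}=1 to run the heavy replay tests"
```

With pytest, a predicate like this could feed `pytest.mark.skipif(not heavy_tests_enabled(), reason=...)`, so a gated test is reported as skipped rather than failed when the variable is unset.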
Fixture state
| Artifact | Status | Source |
|---|---|---|
| flight_derkachi.mp4 | available | _docs/00_problem/input_data/flight_derkachi/ |
| data_imu.csv | available | same dir; 4900 rows at 10 Hz over 489.9 s |
| Synthetic tlog | generated at fixture time | _tlog_synth.py reproduces a pymavlink .tlog from the CSV (the original tlog is not in-repo; the CSV was its export) |
| Camera calibration | placeholder (tests/fixtures/calibration/adti26.json) | The real Topotek KHP20S30 intrinsics are unknown per camera_info.md. AC-3 is xfailed until a real calibration ships. |
| Operator pre-flight rehearsal | blocked | tests/fixtures/mock-suite-sat-service/ is a bootstrap stub (only GET /healthz); AC-8 skips until the full D-PROJ-2 contract lands. |
Clip range
The first 60 s of the Derkachi flight (Time=0.0 → Time=60.0). The
take-off region exercises the AZ-405 IMU-take-off auto-sync detector;
the cruise region that follows stresses the satellite-anchor + VIO
drift-correction path. To change the trim, edit _CLIP_START_S and
_CLIP_END_S in conftest.py.
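To make the trim concrete, here is a hedged sketch of selecting the clip window from the IMU CSV. It assumes the CSV has a column named `Time` in seconds and that both endpoints are inclusive; the real conftest.py may handle either differently, and `trim_rows` is a hypothetical helper.

```python
import csv
import io

CLIP_START_S = 0.0   # mirrors _CLIP_START_S (value taken from this README)
CLIP_END_S = 60.0    # mirrors _CLIP_END_S (value taken from this README)


def trim_rows(csv_text, start_s=CLIP_START_S, end_s=CLIP_END_S, time_column="Time"):
    """Return the CSV rows whose timestamp lies in [start_s, end_s]."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if start_s <= float(row[time_column]) <= end_s]
```

For example, rows stamped 0.0, 30.0 and 60.0 are kept, while a row stamped 60.1 is dropped.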
Expected runtime (Tier-1)
| Test | Expected wall clock |
|---|---|
| AC-1 (--pace asap) | ≤ 30 s |
| AC-2 schema match | piggybacks on AC-1 |
| AC-5 determinism | 2 × asap runs (≤ 60 s total) |
| AC-6 realtime | 60 s ± 3 s |
| AC-6 asap | ≤ 30 s |
| Total suite | ≤ 6 min on Jetson AGX Orin |
The AC-1 / AC-2 / AC-5 tests share --pace asap runs but each
fixture invocation produces a fresh output file, so they do not
short-circuit each other (preserves AC-5's two-runs-diff guarantee).
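The two-runs-diff idea behind AC-5 can be illustrated with a line-level comparison of two output files. This is only a sketch: `first_difference` is a hypothetical helper, and the real test may compare bytes or parsed records instead.

```python
from itertools import zip_longest


def first_difference(lines_a, lines_b):
    """Return the 0-based index of the first differing line, or None if identical.

    A missing line (one run produced fewer lines) counts as a difference.
    """
    for index, (a, b) in enumerate(zip_longest(lines_a, lines_b)):
        if a != b:
            return index
    return None
```

Feeding it `Path(run_a).read_text().splitlines()` and the same for the second run gives the first tick at which the outputs diverge, which is a convenient starting point for a bisect.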
AC matrix
| AC | Test | State |
|---|---|---|
| AC-1: exit 0 + JSONL count match | test_ac1_exits_0_jsonl_count_match | runs on Tier-1 |
| AC-2: JSONL schema match | test_ac2_jsonl_schema_match | runs on Tier-1 |
| AC-3: ≤ 100 m for 80 % of ticks | test_ac3_within_100m_80pct_of_ticks | xfail (waiting on real calibration) |
| AC-4a: mode-agnosticism AST scan | test_ac4_mode_agnosticism_ast_scan | unconditional |
| AC-4b: encoder byte-equality | test_ac4_encoder_byte_equality | skip (waiting on AZ-558) |
| AC-5: determinism | test_ac5_determinism_two_runs_diff | runs on Tier-1 |
| AC-6a: realtime 60 s ± 5 % | test_ac6_pace_realtime_60s_within_5pct | runs on Tier-1 |
| AC-6b: asap ≤ 30 s | test_ac6_pace_asap_under_30s | runs on Tier-1 |
| AC-7: skip-gate self-check | test_ac7_skip_gate_consistent_with_env_var | unconditional |
| AC-8: operator workflow rehearsal | test_ac8_operator_workflow | skip (waiting on D-PROJ-2 mock) |
| AC-9: helper L2 correctness | test_helpers.py::test_ac9_l2_* | unconditional |
| AC-10: README accuracy | this file | live |
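The AC-3 criterion (≤ 100 m for 80 % of ticks) can be sketched as follows. The names `l2_horizontal_m` and `match_percentage` come from the file list in this README, but the signatures and the equirectangular distance approximation used here are assumptions, not the actual _helpers.py implementation; `ac3_passes` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; adequate for short distances


def l2_horizontal_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Horizontal distance in metres via an equirectangular approximation."""
    mean_lat = math.radians((lat1_deg + lat2_deg) / 2.0)
    d_north = math.radians(lat2_deg - lat1_deg) * EARTH_RADIUS_M
    d_east = math.radians(lon2_deg - lon1_deg) * math.cos(mean_lat) * EARTH_RADIUS_M
    return math.hypot(d_north, d_east)


def match_percentage(errors_m, threshold_m=100.0):
    """Percentage of ticks whose error is at or below the threshold."""
    if not errors_m:
        return 0.0
    within = sum(1 for e in errors_m if e <= threshold_m)
    return 100.0 * within / len(errors_m)


def ac3_passes(errors_m):
    """AC-3 as stated in the matrix: at least 80 % of ticks within 100 m."""
    return match_percentage(errors_m) >= 80.0
```

The flat-earth approximation is reasonable at the 100 m scale; a haversine formula would be the drop-in choice if longer baselines mattered.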
Failure-mode cookbook
| Symptom | Likely cause | Fix |
|---|---|---|
| gps-denied-replay console-script not on PATH | package not installed in the test venv | pip install -e . |
| AC-1 line count off by > 5 % | tlog synthesizer drifted from the CSV | regenerate by re-running the test (synthesizer is deterministic; non-determinism would be a real bug) |
| AC-3 fails at ~ 0 % even with calibration | wrong intrinsics OR wrong WGS84 ground truth source; verify the GLOBAL_POSITION_INT columns are still the AC-3 reference (per flight_derkachi/README.md) | re-derive ground truth |
| AC-5 determinism violated | non-deterministic float ordering in C5 estimator OR a clock leaked into the runtime | bisect via git log against the C5 / clock modules |
| AC-6 realtime drifts on shared CI | shared-runner contention; the spec allows widening to ± 5 s | adjust _HEAVY_SKIP boundary if it persists |
| tlog missing required messages | _tlog_synth.py lost a message group | check _REQUIRED_MESSAGE_GROUPS in tlog_replay_adapter.py against the synth output |
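The last cookbook row amounts to a set comparison between the required message groups and the message types the synthesizer actually emitted. The sketch below assumes each group is a set of alternative message types and is satisfied when at least one of them appears; the real shape of _REQUIRED_MESSAGE_GROUPS may differ, and `missing_groups` and the example group names are hypothetical.

```python
def missing_groups(required_groups, seen_message_types):
    """Return names of groups for which none of the alternative types was seen."""
    seen = set(seen_message_types)
    return [
        name
        for name, alternatives in required_groups.items()
        if not seen.intersection(alternatives)
    ]


# Illustrative groups only; not the adapter's actual table.
EXAMPLE_GROUPS = {
    "attitude": {"ATTITUDE"},
    "position": {"GLOBAL_POSITION_INT", "GPS_RAW_INT"},
}
```

Calling `missing_groups(EXAMPLE_GROUPS, ["ATTITUDE"])` reports the position group as missing, which would point at the part of _tlog_synth.py to inspect.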
Files
tests/e2e/replay/
├── README.md ← this file
├── __init__.py ← package marker + module-level docstring
├── _helpers.py ← parse_jsonl, l2_horizontal_m, match_percentage,
│ CapturingMavlinkTransport, GroundTruthRow
├── _tlog_synth.py ← CSV → tlog generator
├── conftest.py ← derkachi_replay_inputs, replay_runner,
│ operator_pre_flight_setup fixtures
├── test_helpers.py ← unit tests for _helpers (unconditional)
└── test_derkachi_1min.py ← AC-1..AC-8 + AC-7 skip gate + AC-4a AST scan
Follow-up work
- Real Topotek KHP20S30 calibration: unblocks AC-3.
- AZ-558: closes AC-4b (route C8 encoders through MavlinkTransport).
- D-PROJ-2 mock-suite-sat-service: unblocks AC-8 (operator workflow rehearsal).