Cycle-3 refactor run 02-az507 (RouteSpec relocation + module-layout
refresh + AZ-270 lint widening). Single batch of 3 tasks; epic AZ-844.
AZ-845 — Relocate RouteSpec DTO to _types/route.py (rule-9 fix):
* New canonical home: src/gps_denied_onboard/_types/route.py
(frozen+slots dataclass; full docstring carried over verbatim).
* c11_tile_manager/route_client.py imports from _types.route.
* replay_input/tlog_route.py and replay_input/__init__.py keep
re-exports for backward-compat (RouteSpec in __all__).
* 5 test files updated to import from _types.route for symmetry.
* Identity-preserving re-export verified by new test
test_az845_routespec_canonical_home_and_reexport_identity.
AZ-846 — Refresh module-layout.md cycle-3 entries:
* c11_tile_manager Internal list rewritten with all 8 internals
(alphabetised) — corrects a stale entry that referenced files
(satellite_provider_*.py) that no longer exist.
* shared/replay_input file list adds errors.py (cycle-2 carry),
tlog_ground_truth.py (cycle-2 carry), tlog_route.py (cycle-3 NEW).
* shared/_types section registers route.py with provenance line.
* Out-of-scope cycle-2 carry-overs (replay_api/, cli/render_map.py,
helpers/gps_compare.py, etc.) intentionally untouched.
AZ-847 — Widen test_az270 lint to enforce full rule-9 allow-list:
* test_ac6_only_compose_root_imports_concrete_strategies now walks
every components/<X>/*.py ImportFrom/Import and rejects anything
not in the rule-9 allow-list (own subpackage + _types + helpers
+ config/logging/fdr_client/clock + frame_source interface-only).
* Strict superset of the original AC-6 narrow check.
* Reports zero violations on the codebase post-AZ-845.
* Two principled carve-outs documented in the test docstring:
- components/<X>/bench/** path skip (measurement code legitimately
constructs production strategies via runtime_root factories).
- register_* lazy self-registration imports from
runtime_root.<X>_factory (central-registry plugin pattern).
* Both carve-outs surfaced to user via Choose A/B/C/D Risk-1
protocol; user skipped both — agent proceeded with documented
defaults. Doc-only follow-up tracked in
_docs/_process_leftovers/2026-05-24_az847_rule9_wording_followup.md
for rule-9 wording update in module-layout.md.
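A rough sketch of the kind of AST walk the widened lint performs, under stated assumptions: the `SHARED_ALLOWED` set paraphrases the allow-list above, `find_violations` is a hypothetical name, and the frame_source interface-only rule and both carve-outs are deliberately left out. It is not the code in test_az270.

```python
import ast

# Assumed shared roots, paraphrased from the allow-list in this message.
SHARED_ALLOWED = {"_types", "helpers", "config", "logging", "fdr_client", "clock"}


def find_violations(source: str, own_subpackage: str,
                    package: str = "gps_denied_onboard"):
    """Return (lineno, module) for absolute in-package imports outside the allow-list."""
    allowed = SHARED_ALLOWED | {own_subpackage}
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            modules = [node.module]
        elif isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        else:
            continue  # relative imports and non-import nodes are ignored in this sketch
        for module in modules:
            parts = module.split(".")
            # Only imports rooted in the project package are judged; stdlib passes.
            if parts[0] == package and len(parts) > 1 and parts[1] not in allowed:
                violations.append((node.lineno, module))
    return sorted(violations)
```

Handling both `ast.Import` and `ast.ImportFrom` is what makes the check a superset of a narrow "no concrete strategies" scan: any in-package import outside the allow-list is reported, not just a known list of offenders.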
Test results: 2287 passed, 90 skipped (environmental — Docker / CUDA
/ TensorRT / Jetson hardware / fixtures), 0 failed. Focused subset
(replay_input/ + c11_tile_manager/ + test_az270_compose_root.py)
also clean: 169 passed, 1 skipped.
Tracker: AZ-845/846/847 transitioned In Progress -> In Testing.
Co-authored-by: Cursor <cursoragent@cursor.com>
E2E replay tests (AZ-404)
End-to-end regression suite that runs the gps-denied-replay
console-script (AZ-402) against the Derkachi 60 s clip and asserts
the AZ-265 epic acceptance criteria.
How to run
# In a fresh venv with the package installed:
RUN_REPLAY_E2E=1 pytest tests/e2e/replay/ -v
Without RUN_REPLAY_E2E=1 the heavy tests skip cleanly. The
unconditional tests (the AC-4a mode-agnosticism scan, the AC-7
skip-gate self-check, and the helper unit tests in test_helpers.py)
still run.
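The gating logic can be pictured with the following sketch. Both function names are hypothetical, and treating only the exact value "1" as "enabled" is an assumption; the suite's real gate may accept other values.

```python
import os


def replay_e2e_enabled(environ=None) -> bool:
    """Heavy tests run only when RUN_REPLAY_E2E is exactly "1" (assumed convention)."""
    env = os.environ if environ is None else environ
    return env.get("RUN_REPLAY_E2E") == "1"


def skip_gate_consistent(environ, heavy_skipped: bool) -> bool:
    """AC-7 style self-check: heavy tests are skipped if and only if the gate is off."""
    return heavy_skipped == (not replay_e2e_enabled(environ))


# With pytest installed, a marker could be derived from the gate like this:
#   heavy = pytest.mark.skipif(not replay_e2e_enabled(), reason="RUN_REPLAY_E2E != 1")
```

Keeping the gate in one pure function lets an unconditional test assert that the observed skip state matches the environment, which is the idea behind the AC-7 self-check.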
Fixture state
| Artifact | Status | Source |
|---|---|---|
| flight_derkachi.mp4 | available | _docs/00_problem/input_data/flight_derkachi/ |
| data_imu.csv | available | same dir; 4900 rows at 10 Hz over 489.9 s |
| Synthetic tlog | generated at fixture time | _tlog_synth.py reproduces a pymavlink .tlog from the CSV (the original tlog is not in-repo; the CSV was its export) |
| Camera calibration | placeholder (tests/fixtures/calibration/adti26.json) | The real Topotek KHP20S30 intrinsics are unknown per camera_info.md. AC-3 is xfailed until a real calibration ships. |
| Operator pre-flight rehearsal | blocked | tests/fixtures/mock-suite-sat-service/ is a bootstrap stub (only GET /healthz); AC-8 skips until the full D-PROJ-2 contract lands. |
Clip range
The first 60 s of the Derkachi flight (Time=0.0 → Time=60.0). The
take-off region exercises the AZ-405 IMU-take-off auto-sync detector;
the cruise region that follows stresses the satellite-anchor + VIO
drift-correction path. To change the trim, edit _CLIP_START_S and
_CLIP_END_S in conftest.py.
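As an illustration of what the two constants control, here is a minimal sketch of trimming CSV rows to the clip window. The `trim_rows` helper, the `Time` column name, and the inclusive end bound are assumptions for illustration; conftest.py may trim differently.

```python
import csv
import io

_CLIP_START_S = 0.0
_CLIP_END_S = 60.0


def trim_rows(csv_text: str, start_s: float = _CLIP_START_S,
              end_s: float = _CLIP_END_S, time_column: str = "Time"):
    """Keep rows whose timestamp lies in [start_s, end_s] (inclusive, by assumption)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if start_s <= float(row[time_column]) <= end_s]


# Synthetic stand-in for data_imu.csv: 4900 rows at 10 Hz, timestamps only.
synthetic_csv = "Time\n" + "\n".join(f"{i / 10:.1f}" for i in range(4900)) + "\n"
clip = trim_rows(synthetic_csv)
```

With 10 Hz data and inclusive bounds, the 60 s window holds 601 rows (Time=0.0 through Time=60.0).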
Expected runtime (Tier-1)
| Test | Expected wall clock |
|---|---|
| AC-1 (--pace asap) | ≤ 30 s |
| AC-2 schema match | piggybacks on AC-1 |
| AC-5 determinism | 2 × asap runs (≤ 60 s total) |
| AC-6 realtime | 60 s ± 3 s |
| AC-6 asap | ≤ 30 s |
| Total suite | ≤ 6 min on Jetson AGX Orin |
The AC-1 / AC-2 / AC-5 tests share --pace asap runs but each
fixture invocation produces a fresh output file, so they do not
short-circuit each other (preserves AC-5's two-runs-diff guarantee).
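A two-runs-diff check of this kind can be sketched as a digest comparison between two independently produced output files. The helper names are hypothetical, and comparing raw bytes is an assumption; the real AC-5 test may diff at the record level instead.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path) -> str:
    """SHA-256 of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def runs_are_identical(path_a, path_b) -> bool:
    """Two runs count as deterministic when their outputs match byte for byte."""
    return file_digest(path_a) == file_digest(path_b)


# Demo with stand-in files; in the suite each file would come from a separate asap run.
with tempfile.TemporaryDirectory() as tmp:
    run_a = Path(tmp) / "run_a.jsonl"
    run_b = Path(tmp) / "run_b.jsonl"
    run_a.write_bytes(b'{"tick": 0}\n')
    run_b.write_bytes(b'{"tick": 0}\n')
    same = runs_are_identical(run_a, run_b)
    run_b.write_bytes(b'{"tick": 1}\n')
    differs = not runs_are_identical(run_a, run_b)
```

This is why each fixture invocation needs its own output file: comparing a file against itself would pass trivially and prove nothing about determinism.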
AC matrix
| AC | Test | State |
|---|---|---|
| AC-1: exit 0 + JSONL count match | test_ac1_exits_0_jsonl_count_match | runs on Tier-1 |
| AC-2: JSONL schema match | test_ac2_jsonl_schema_match | runs on Tier-1 |
| AC-3: ≤ 100 m for 80 % of ticks | test_ac3_within_100m_80pct_of_ticks | xfail (waiting on real calibration) |
| AC-4a: mode-agnosticism AST scan | test_ac4_mode_agnosticism_ast_scan | unconditional |
| AC-4b: encoder byte-equality | test_ac4_encoder_byte_equality | skip (waiting on AZ-558) |
| AC-5: determinism | test_ac5_determinism_two_runs_diff | runs on Tier-1 |
| AC-6a: realtime 60 s ± 5 % | test_ac6_pace_realtime_60s_within_5pct | runs on Tier-1 |
| AC-6b: asap ≤ 30 s | test_ac6_pace_asap_under_30s | runs on Tier-1 |
| AC-7: skip-gate self-check | test_ac7_skip_gate_consistent_with_env_var | unconditional |
| AC-8: operator workflow rehearsal | test_ac8_operator_workflow | skip (waiting on D-PROJ-2 mock) |
| AC-9: helper L2 correctness | test_helpers.py::test_ac9_l2_* | unconditional |
| AC-10: README accuracy | this file | live |
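To make the AC-3 criterion concrete, the sketch below computes a horizontal error in metres and the share of ticks within a threshold. The names mirror helpers listed for _helpers.py, but the signatures and bodies here are guesses for illustration, and the equirectangular approximation is a swapped-in technique, not necessarily what the suite uses.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; adequate for errors of a few hundred metres


def l2_horizontal_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Horizontal distance in metres via an equirectangular approximation."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat) * EARTH_RADIUS_M
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_M
    return math.hypot(dx, dy)


def match_percentage(errors_m, threshold_m: float = 100.0) -> float:
    """Percentage of ticks whose error is at or below threshold_m."""
    if not errors_m:
        return 0.0
    within = sum(1 for error in errors_m if error <= threshold_m)
    return 100.0 * within / len(errors_m)
```

Under this reading, AC-3 passes when `match_percentage(errors_m)` is at least 80 for the per-tick errors between the estimate and the ground truth.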
Failure-mode cookbook
| Symptom | Likely cause | Fix |
|---|---|---|
| gps-denied-replay console-script not on PATH | package not installed in the test venv | pip install -e . |
| AC-1 line count off by > 5 % | tlog synthesizer drifted from the CSV | regenerate by re-running the test (synthesizer is deterministic; non-determinism would be a real bug) |
| AC-3 fails at ~ 0 % even with calibration | wrong intrinsics OR wrong WGS84 ground truth source; verify the GLOBAL_POSITION_INT columns are still the AC-3 reference (per flight_derkachi/README.md) | re-derive ground truth |
| AC-5 determinism violated | non-deterministic float ordering in C5 estimator OR a clock leaked into the runtime | bisect via git log against the C5 / clock modules |
| AC-6 realtime drifts on shared CI | shared-runner contention; the spec allows widening to ± 5 s | adjust _HEAVY_SKIP boundary if it persists |
| tlog missing required messages | _tlog_synth.py lost a message group | check _REQUIRED_MESSAGE_GROUPS in tlog_replay_adapter.py against the synth output |
Files
tests/e2e/replay/
├── README.md ← this file
├── __init__.py ← package marker + module-level docstring
├── _helpers.py ← parse_jsonl, l2_horizontal_m, match_percentage,
│ CapturingMavlinkTransport, GroundTruthRow
├── _tlog_synth.py ← CSV → tlog generator
├── conftest.py ← derkachi_replay_inputs, replay_runner,
│ operator_pre_flight_setup fixtures
├── test_helpers.py ← unit tests for _helpers (unconditional)
└── test_derkachi_1min.py ← AC-1..AC-8 + AC-7 skip gate + AC-4a AST scan
Follow-up work
- Real Topotek KHP20S30 calibration: unblocks AC-3.
- AZ-558: closes AC-4b (route C8 encoders through MavlinkTransport).
- D-PROJ-2 mock-suite-sat-service: unblocks AC-8 (operator workflow rehearsal).