[AZ-599] Batch 79: FT-P-02 Derkachi builder + _common.py extraction

- Add build_p02_fixtures.py: IMU CSV → tlog conversion (RAW_IMU +
  ATTITUDE pairs, centidegrees→radians yaw) and orchestrator that
  runs gps-denied replay against Derkachi MP4 + generated tlog,
  verifying ≥1 record_type="estimate" in the FDR archive.
- Extract run_gps_denied_replay + FDR-parent-dir helpers into
  sitl_replay_builder/_common.py; refactor build_p01_fixtures.py
  to import from _common (b78 tests preserved).
- Add 20 unit tests under e2e/_unit_tests/fixtures/test_sitl_
  replay_builder_p02.py covering AC-1..AC-5; total unit suite
  686/686 passing (regression gate AC-6).
- README updated to document FT-P-01 + FT-P-02 builders.
- Advance autodev state: last_completed_batch=79, current_batch=80;
  prune verbose detail blob.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-17 13:40:07 +03:00
parent 2f1fb4d0d0
commit 4e0717e543
10 changed files with 1111 additions and 76 deletions
@@ -0,0 +1,77 @@
# FT-P-02 Derkachi fixture builder (IMU CSV → tlog, real-motion replay)
**Task**: AZ-599_ft_p_02_derkachi_builder
**Complexity**: 3 points
**Dependencies**: AZ-598 (shared `run_gps_denied_replay` helper)
**Component**: Blackbox Tests / Test Infrastructure (epic AZ-262)
**Tracker**: AZ-599
## Problem
FT-P-02 (Derkachi continuous-flight drift) needs an offline-replay
FDR archive so the scenario can compute drift between satellite
anchors. The b78 FT-P-01 builder doesn't apply directly — FT-P-02
uses real video + real IMU CSV instead of still images + synthetic
stationary tlog.
## Strategy
Same overall shape as b78 (drive `gps-denied-replay` against a video
+ tlog pair), with two differences:
1. **Video is already MP4** — skip the OpenCV encoding step.
2. **IMU is real recorded telemetry**`data_imu.csv` has 10 Hz
`SCALED_IMU2` + `GLOBAL_POSITION_INT` columns. Need a CSV → tlog
conversion that packs each row as a `RAW_IMU` + `ATTITUDE`
MAVLink pair, with yaw synthesised from `GLOBAL_POSITION_INT.hdg`
and roll/pitch=0 (acceptable for fixed-wing cruise).
Output: the SUT's natural FDR archive directory (the b78 schema-
projection step is dropped because FT-P-02 reads the FDR directly).
## Files Touched
* `e2e/fixtures/sitl_replay_builder/_common.py` (new) — extracts
`run_gps_denied_replay` + `write_observer_fixture` from
`build_p01_fixtures.py` so b78 + b79 share one implementation.
* `e2e/fixtures/sitl_replay_builder/build_p01_fixtures.py` (edit)
— re-export the moved helpers from `_common.py` for backwards
compatibility; b78 tests must still pass unchanged.
* `e2e/fixtures/sitl_replay_builder/build_p02_fixtures.py` (new) —
the FT-P-02 builder.
* `e2e/fixtures/sitl_replay_builder/README.md` (edit) — document
both flows.
* `e2e/_unit_tests/fixtures/test_sitl_replay_builder_p02.py` (new)
— b79 builder tests.
* `e2e/_unit_tests/test_directory_layout.py` (edit) — register
the three new paths.
## Acceptance Criteria
**AC-1**: `convert_imu_csv_to_tlog` parses every row of
`data_imu.csv` and writes one RAW_IMU + one ATTITUDE pair per row.
**AC-2**: `convert_imu_csv_to_tlog` raises `ValueError` on missing
required columns OR malformed numeric rows OR empty CSV.
**AC-3**: Heading conversion uses centidegrees → radians
(`hdg_cdeg * pi / 18000`).
**AC-4**: `build_p02_fixtures` invokes `run_gps_denied_replay` with
the Derkachi MP4 + the generated tlog; the resulting FDR archive
contains ≥1 `record_type="estimate"` record (`verify_fdr_has_estimates`
helper).
**AC-5**: `run_gps_denied_replay` is in `_common.py`; b78 +
b79 both import it from there; b78 tests still pass.
**AC-6**: Full e2e unit-test suite passes.
## Out of Scope
* Live capture EXECUTION (manual operator step, same as b78).
* FT-P-04 / FT-P-05 (also use Derkachi but each needs its own
builder; deferred).
* iNav adapter.
* True SCALED_IMU2 → RAW_IMU unit conversion (pass-through; will
re-visit if the SUT's tlog parser rejects).
@@ -0,0 +1,88 @@
# Batch 79 Report — FT-P-02 Derkachi fixture builder
**Batch**: 79
**Date**: 2026-05-17
**Context**: Test implementation (greenfield Step 10 — Implement Tests)
**Tasks**: AZ-599 (3 cp) — 1 task (FT-P-02 Derkachi builder + b78 helper extraction)
**Cycle**: 1
**Verdict**: COMPLETE — PASS (self-reviewed; see `reviews/batch_79_review.md`)
## Summary
Replicates the b78 vertical-slice pattern for FT-P-02 (Derkachi
continuous-flight drift). Different from b78 in three ways:
1. **Video is already MP4** — no still-image encoding step.
2. **IMU is real recorded telemetry** — needs CSV → tlog conversion
with real motion data (vs. b78's synthetic stationary tlog).
3. **Output is the SUT's natural FDR archive directory** — FT-P-02
reads it via `fdr_reader.iter_records`, not via the per-call
`outbound_messages_*.json` schema b78 ships.
Also extracted `run_gps_denied_replay` + `write_observer_fixture`
into a shared `_common.py` so both b78 and b79 reference one
canonical implementation. Future per-scenario builders (FT-P-04
also uses Derkachi inputs, FT-P-05 needs its own, etc.) will reuse
the same helpers.
### AZ-599 — FT-P-02 Derkachi builder (3 cp)
* **`e2e/fixtures/sitl_replay_builder/_common.py`** (new): shared
`run_gps_denied_replay`, `write_observer_fixture`, `DEFAULT_CLI_BIN`.
* **`e2e/fixtures/sitl_replay_builder/build_p01_fixtures.py`**
(edited): re-imports the moved helpers from `_common.py`; b78
tests still pass unchanged.
* **`e2e/fixtures/sitl_replay_builder/build_p02_fixtures.py`** (new):
* `P02BuilderConfig` frozen dataclass.
* `convert_imu_csv_to_tlog(csv_path, output_tlog)` — parses
`data_imu.csv` rows, packs RAW_IMU + ATTITUDE pairs. Yaw from
`GLOBAL_POSITION_INT.hdg` (centidegrees → radians);
roll/pitch = 0 (fixed-wing cruise approximation).
* `verify_fdr_has_estimates(fdr_path)` — counts
`record_type="estimate"` records; raises if zero.
* `build_p02_fixtures(cfg, ...)` — orchestrator.
* `_main(argv=None)` — argparse CLI.
* **`e2e/fixtures/sitl_replay_builder/README.md`** (edited): two-
builder structure with scenario coverage table; FT-P-01 + FT-P-02
per-builder sections with strategy + usage + limitations.
* **`e2e/_unit_tests/fixtures/test_sitl_replay_builder_p02.py`**
(new): +20 tests.
* **`e2e/_unit_tests/test_directory_layout.py`** (edited): registers
the two new modules (`_common.py` + `build_p02_fixtures.py`).
## Out of scope (deferred)
* **Live capture EXECUTION** — same as b78, a manual operator step.
* **FT-P-04** (Derkachi F2F registration) — also uses Derkachi
inputs and could reuse `build_p02_fixtures`. Deferred to a
follow-up so this batch ships at 3 cp.
* **Other scenarios** (FT-P-03 / 05 / 07 / 08 / 10 / 11, FT-N-01..04).
* **iNav adapter** for FT-P-02.
* **True SCALED_IMU2 → RAW_IMU unit conversion** — pass-through;
re-visit if the SUT's tlog parser rejects.
* **Aggressive-manoeuvre ATTITUDE synthesis** (roll/pitch ≠ 0) —
re-visit if FT-P-04 or other scenarios show fusion drift.
## Test Results
* New unit tests: **20**.
* Full `e2e/_unit_tests` suite: **686 passed in 123 s** (previous:
664 → +22 net = +20 new tests + +2 directory-layout entries).
* No new linter errors.
* b78 builder tests (24) still pass after `_common.py` extraction.
## State
* Spec moved: `_docs/02_tasks/todo/AZ-599_ft_p_02_derkachi_builder.md`
`_docs/02_tasks/done/`.
* `_docs/_autodev_state.md` advanced to `last_completed_batch: 79`.
* No K=3 cumulative review this batch (next due after b81; the
b76-b78 cumulative just shipped one batch ago).
## Incidents
* **Disk-full during regression gate**: the first regression-gate
attempt failed with 212 conftest setup errors (all
`FileNotFoundError: /var/folders/.../pytest-of-...`). Surfaced
to the user with options; user freed space (37 Gi avail after);
re-ran the gate and it passed cleanly. Not a code issue.
@@ -0,0 +1,156 @@
# Code Review Report
**Batch**: 79 — AZ-599 (FT-P-02 Derkachi fixture builder + b78 helper extraction)
**Date**: 2026-05-17
**Verdict**: PASS
## Findings
(none blocking)
### Non-blocking notes
* **Roll/pitch=0 simplification in synthesised ATTITUDE**: acceptable
for the Derkachi fixed-wing cruise data but unrealistic for
aggressive manoeuvres. Documented as a Limitation in the README;
re-visit if FT-P-04 (Derkachi F2F registration) or other
manoeuvre-sensitive scenarios show fusion drift from this gap.
* **`SCALED_IMU2``RAW_IMU` pass-through (no unit conversion)**:
the builder passes scaled accelerometer (mg) and gyro (mrad/s)
values straight into RAW_IMU fields. Documented as a Limitation;
if the SUT's tlog parser rejects, a units conversion pass needs to
be added. Surfaced as a follow-up rather than fixed speculatively.
* **Disk-full incident during regression gate**: not a code issue;
surfaced to the user, who freed space, after which the full
suite (686 tests) passed cleanly.
## Findings Sweep
### Phase 1 — Context Loading
Read the FT-P-02 scenario to confirm: it (a) skips when
`sitl_replay_ready` is false; (b) calls
`fdr_reader.iter_records(fdr_root)` to walk the FDR archive; (c)
filters for `record_type == "estimate"` and projects into
`apd.FdrEstimate`. Confirmed FT-P-02 does NOT call
`wait_for_outbound` (so the b78 schema doesn't apply). Read
`data_imu.csv` to confirm column names + units (10 Hz; ms
timestamps; SCALED_IMU2 in mg / mrad/s; GLOBAL_POSITION_INT.hdg
in centidegrees). Inspected the production
`replay_input/auto_sync.py` AC-13 contract for required tlog
message types (RAW_IMU + ATTITUDE) — this drove the choice of
which messages to pack per CSV row. Verified `pymavlink>=2.4`
already in `pyproject.toml`. Read the b78
`build_p01_fixtures.py` to identify the helper extraction
boundary (`run_gps_denied_replay` + `write_observer_fixture` had
no FT-P-01-specific code, ideal candidates).
### Phase 2 — Spec Compliance
| AC | Coverage | Status |
|----|----------|--------|
| AC-1 (parse every row, one pair per row) | `test_convert_imu_csv_writes_pair_per_row`, `test_convert_imu_csv_real_pymavlink_round_trip` | Covered |
| AC-2 (ValueError on missing cols / malformed numerics / empty CSV) | `test_convert_imu_csv_empty_raises`, `test_convert_imu_csv_missing_required_column_raises`, `test_convert_imu_csv_malformed_numeric_raises`, `test_convert_imu_csv_missing_file_raises` | Covered (4 distinct error paths) |
| AC-3 (heading centideg → rad) | `test_hdg_centideg_to_rad` (5 parametric cases: 0, π/2, π, 3π/2, near-2π) | Covered |
| AC-4 (build_p02 invokes replay + verifies ≥1 estimate) | `test_build_p02_end_to_end_with_mocks`, `test_build_p02_propagates_verify_failure`, `test_verify_fdr_no_estimates_raises`, `test_verify_fdr_counts_estimates`, `test_verify_fdr_tolerates_malformed_lines` | Covered |
| AC-5 (`_common.py` shared between b78 + b79) | `test_common_module_exports_used_by_b01` + b78 suite (24/24) still passes | Covered |
| AC-6 (full unit-test suite passes) | 686/686 in 123 s (previous: 664) | Covered |
### Phase 3 — Code Quality
* **Single responsibility**: `convert_imu_csv_to_tlog` parses CSV +
packs MAVLink (one cohesive concern: CSV row → tlog pair).
`verify_fdr_has_estimates` does one thing. `build_p02_fixtures`
is the only orchestrator. Helper extraction into `_common.py`
removed duplicate definitions cleanly — no copy-pasted code
remains.
* **No suppressed errors**: every JSON-decode in
`verify_fdr_has_estimates` is wrapped in a `try / except
json.JSONDecodeError` that explicitly `continue`s with a comment
explaining the tolerance (real FDR archives can contain partial
writes at the tail). Every CSV parse error wraps in `ValueError`
with row context. `subprocess.run` uses `check=True`.
* **AAA test discipline**: all 20 new tests use
`# Arrange / # Act / # Assert`; omitted when redundant.
* **Comments**: every public function has a docstring documenting
contract + raises; no narrating comments in function bodies.
The yaw conversion uses a named helper (`_hdg_centideg_to_rad`)
rather than an inline magic-formula comment.
* **Public boundary**: b79 module does NOT import any
`src/gps_denied_onboard` symbol (verified by grep). pymavlink
imports are deferred to function bodies so the test factory
injections don't need the real library available.
### Phase 4 — Security
* No new credentials / secrets / network surfaces.
* `convert_imu_csv_to_tlog` uses `csv.DictReader` — no `eval`, no
shell injection. Numeric fields go through `int(float(s))` so
bad input raises ValueError rather than executing arbitrary code.
* `verify_fdr_has_estimates` uses `json.loads` (safe) and silently
ignores malformed lines (documented as tolerance for partial
writes; not a security risk because we never act on the data).
* `subprocess` invocation uses a list-argument `cmd` (no shell
injection surface; same pattern as b78).
### Phase 5 — Performance
* `convert_imu_csv_to_tlog` is O(N) over CSV rows; for the Derkachi
4,900-row file → ~5 s wall-clock estimate (pymavlink packing is
the bottleneck). Bounded by input size; no hidden quadratic
behaviour.
* `verify_fdr_has_estimates` is single-pass O(N) over JSONL lines.
* `_iter_imu_rows` validates the header once and then streams rows;
doesn't materialize the entire CSV in memory until the orchestrator
asks for `list(...)` (could be optimized later if Derkachi data
ever grows orders of magnitude larger; currently 4,900 rows is
trivial).
### Phase 6 — Cross-Task Consistency
* **Builder shape parity with b78**: `BuilderConfig` /
`P02BuilderConfig` dataclass; `build_pXX_fixtures(cfg, *, _runner=...,
_mavlink_writer_factory=..., _verify_fdr=...)` signature; same
underscore-prefixed dependency-injection convention. A future
contributor reading both side-by-side will see the same pattern.
* **`_common.py` extraction validated by both**: b78's existing
`test_run_gps_denied_replay_builds_correct_cmd` /
`test_run_gps_denied_replay_creates_fdr_parent_dir` /
`test_run_gps_denied_replay_passes_extra_args` /
`test_write_observer_fixture_schema` tests pass unchanged against
the moved implementation. b79 adds
`test_common_module_exports_used_by_b01` to lock in the import
relationship.
* **No drift in `observer_<fc_kind>_<host>.json` schema**: same
payload structure as b75/b78.
### Phase 7 — Architecture Compliance
* **Module placement**: new files under
`e2e/fixtures/sitl_replay_builder/` (existing package from b78);
new tests under `e2e/_unit_tests/fixtures/`. All registered in
`test_directory_layout.py`.
* **No `src/gps_denied_onboard` imports**.
* **No new top-level dependencies** — pymavlink already there from
b78/b76; `csv` is stdlib; `math` is stdlib.
* **`__init__.py` not modified** — the b78 docstring already
documents why we don't re-export symbols on the package namespace,
and that's still the right choice for b79.
* **Refactor preserves backwards compatibility**: `build_p01_fixtures`
re-imports `run_gps_denied_replay` + `write_observer_fixture`
from `_common.py`. Any caller (test or production) doing
`bp01.run_gps_denied_replay(...)` still works because the symbol
is present in the module namespace via the `from … import …`
re-export.
## Test Results
* New unit tests: **20** (5 `convert_imu_csv_to_tlog` + 5
`_hdg_centideg_to_rad` parametric + 4 `verify_fdr_has_estimates`
+ 4 `build_p02_fixtures` end-to-end + 1 export-relationship
+ 1 missing-file negative).
* Full `e2e/_unit_tests` suite: **686 passed in 123 s** (previous:
664 → +22 net = +20 new + +2 directory-layout entries).
* No new linter errors (`ReadLints` clean on all touched files).
* No regression in the b78 builder tests (24/24 still pass after
the helper extraction).
+6 -6
View File
@@ -6,16 +6,16 @@ step: 10
name: Implement Tests
status: in_progress
sub_step:
phase: 6
name: implement-tasks-sequentially
phase: 0
name: awaiting-invocation
detail: ""
retry_count: 0
cycle: 1
tracker: jira
last_completed_batch: 78
last_completed_batch: 79
last_cumulative_review: batches_76-78
current_batch: 79
current_batch_tasks: ""
current_batch: 80
last_step_outcomes:
step_8: "Code is testable — no changes needed (testability_assessment.md committed; no list-of-changes, no source edits)"
step_9: "Already complete — 41 blackbox test tasks (AZ-406..AZ-446) under epic AZ-262 with specs in _docs/02_tasks/todo/ were produced in a prior cycle; AZ-406 test-infrastructure bootstrap also pre-existing. Folder fallback satisfied (todo/ has test tasks, _dependencies_table.md reflects 114 product + 41 test = 155 total). No Step-9 work executed in cycle 1."
step_9: "41 blackbox test tasks (AZ-406..AZ-446) under epic AZ-262 in _docs/02_tasks/todo/ pre-existing; AZ-406 test-infra bootstrap pre-existing. Folder fallback satisfied. No Step-9 work executed in cycle 1."