Oleksandr Bezdieniezhnykh 64d961f60c [AZ-697] [AZ-702] tlog GPS truth + KHP20S30 factory calibration
Batch 98 (cycle 2) — first two PBIs of epic AZ-696 (real-flight
validation harness):

AZ-697: direct binary-tlog GPS-truth extractor

- New src/gps_denied_onboard/replay_input/tlog_ground_truth.py reads
  GLOBAL_POSITION_INT (with GPS_RAW_INT fallback) from a binary
  ArduPilot tlog via pymavlink.mavutil and returns a frozen+slotted
  TlogGroundTruth DTO with per-record ts_ns / lat_deg / lon_deg / alt_m
  / hdg_deg / vx_m_s / vy_m_s / vz_m_s.
- Promoted l2_horizontal_m + match_percentage + GroundTruthRow from
  tests/e2e/replay/_helpers.py into the new production module
  src/gps_denied_onboard/helpers/gps_compare.py. The e2e helper now
  re-exports the same objects (identity, not copies) so existing test
  imports continue working untouched.
- tests/e2e/replay/conftest.py prefers the real derkachi.tlog when
  present, falls back to the CSV synth path otherwise.
- 22 new unit tests cover AC-1..AC-5 (mypy --strict subprocess test
  included). All passing.

AZ-702: Topotek KHP20S30 factory-sheet camera calibration

- New _docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json:
  fx = fy = 4644.444, cx = 960, cy = 540, HFOV ~ 23.3 deg, VFOV ~ 13.2
  deg, computed from the published 8.5 mm focal length + 1/2.8" sensor
  + 1920x1080 capture at lowest zoom step. Distortion zeroed,
  body_to_camera_se3 = identity with nadir convention. Acquisition
  method explicitly recorded as factory_sheet so downstream code can
  expect higher residual error than a lab calibration.
- _docs/00_problem/input_data/flight_derkachi/camera_info.md updated
  to document the assumptions, expected residual error window, and
  conftest pick-up rule.
- tests/e2e/replay/conftest.py::_calibration_path() prefers
  khp20s30_factory.json when present, falls back to adti26.json.
- 9 new unit tests cover AC-1..AC-4 (schema, intrinsics traceback,
  doc reference, conftest pick-up). All passing.

Test run: 45 tests (31 new + 14 regression), all passing. Full-suite gate
deferred to Step 16 (after the last batch in cycle 2 per the implement skill).

Adjacent note (not fixed in this batch, recorded in the batch report):
auto_sync.py has the same redundant pymavlink type:ignore + a few
numpy/cv2 mypy --strict issues. None on this batch's path.

Refs: _docs/03_implementation/batch_98_cycle2_report.md
Refs: _docs/02_tasks/done/AZ-697_tlog_ground_truth_extractor.md
Refs: _docs/02_tasks/done/AZ-702_khp20s30_calibration.md

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-20 16:09:03 +03:00


Batch Report

Batch: 98
Tasks: AZ-697 (direct binary-tlog GPS-truth extractor) + AZ-702 (KHP20S30 factory-sheet camera calibration)
Date: 2026-05-20
Cycle: 2
Commit: (pending; written by this report's own commit)

Task Results

| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|---|---|---|---|---|---|
| AZ-697_tlog_ground_truth_extractor | Done | 6 (2 new prod + 1 new test file + 1 new snapshot test + 2 wiring) | 22 new (12 + 10), all passing | 5/5 ACs covered (AC-1..AC-5) | 0 |
| AZ-702_khp20s30_calibration | Done | 3 (1 new JSON artifact + 1 doc update + 1 new test file) | 9 new, all passing | 4/4 ACs covered (AC-1..AC-4) | 0 |

AZ-697 introduces a real production path for ground-truth comparison: tlog_ground_truth.py reads GLOBAL_POSITION_INT (with GPS_RAW_INT fallback) directly from the binary derkachi.tlog via pymavlink.mavutil, returning a frozen+slotted TlogGroundTruth DTO. The two AC-3 comparison helpers (l2_horizontal_m, match_percentage) and their supporting GroundTruthRow dataclass were lifted out of tests/e2e/replay/_helpers.py into the new production module src/gps_denied_onboard/helpers/gps_compare.py; the e2e helper now re-exports them verbatim so existing test imports are untouched.

AZ-702 produces _docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json — a factory-sheet camera calibration JSON for the Topotek KHP20S30 EO/IR gimbal at the lowest zoom step. The intrinsics matrix is computed from the published 8.5 mm focal length, 1/2.8" sensor with 1920×1080 capture (fx = fy = 4644.444 px, cx = 960, cy = 540, HFOV ≈ 23.3°, VFOV ≈ 13.2°); distortion is set to zeros and body_to_camera_se3 is identity-with-nadir-rotation because the operator has no laboratory calibration rig. camera_info.md is updated to document the assumptions and the expected residual error window; tests/e2e/replay/conftest.py::_calibration_path() prefers khp20s30_factory.json when it is present (otherwise falls back to the legacy adti26.json) so downstream replay e2e runs pick it up automatically.
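As a quick sanity check on those numbers, the field of view can be recovered from the stated intrinsics under a pinhole model with zero distortion. The sketch below uses only the values quoted above; the helper name fov_deg is hypothetical.

```python
import math


def fov_deg(focal_px: float, size_px: int) -> float:
    """Full field of view (degrees) of a pinhole camera along one image axis."""
    return math.degrees(2.0 * math.atan((size_px / 2.0) / focal_px))


fx = fy = 4644.444          # focal length in pixels, from the report
width, height = 1920, 1080  # capture resolution, from the report

hfov = fov_deg(fx, width)
vfov = fov_deg(fy, height)
print(f"HFOV = {hfov:.2f} deg, VFOV = {vfov:.2f} deg")
```

Both results land within a tenth of a degree of the HFOV and VFOV figures quoted in the report.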

Files Changed

Production

  • src/gps_denied_onboard/helpers/gps_compare.py (NEW):
    • GroundTruthRow (frozen dataclass) — t_s, lat_deg, lon_deg, alt_m.
    • l2_horizontal_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg) -> float returns the great-circle horizontal distance between two WGS-84 lat/lon points via haversine (a spherical-Earth approximation).
    • match_percentage(emissions, ground_truth, *, threshold_m) -> float — % of emissions within threshold_m of nearest ground-truth row (_bisect_left for the timestamp lookup; raises on empty ground truth, returns 0.0 on empty emissions).
  • src/gps_denied_onboard/helpers/__init__.py:
    • Re-exports GroundTruthRow, l2_horizontal_m, match_percentage.
  • src/gps_denied_onboard/replay_input/tlog_ground_truth.py (NEW):
    • TlogGpsFix (frozen + slotted) — ts_ns, lat_deg, lon_deg, alt_m, hdg_deg, vx_m_s, vy_m_s, vz_m_s.
    • TlogGroundTruth (frozen + slotted) — records: tuple[TlogGpsFix, ...], source: str.
    • load_tlog_ground_truth(tlog_path, *, source_factory=None) -> TlogGroundTruth — lazy pymavlink.mavutil.mavlink_connection open mirroring auto_sync._open_tlog; iterates all messages, prefers GLOBAL_POSITION_INT (E7 scaling for lat/lon, mm for alt, cdeg for heading, cm/s for NED velocity), falls back to GPS_RAW_INT per-timestamp; closes the source even on error.
    • _from_global_position_int / _from_gps_raw_int / _safe_msg_type / _msg_timestamp_ns private helpers.
  • src/gps_denied_onboard/replay_input/__init__.py:
    • Re-exports TlogGpsFix, TlogGroundTruth, load_tlog_ground_truth.
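The comparison helpers listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of gps_compare.py: the names Row, haversine_m, nearest_row and match_pct are hypothetical, the mean Earth radius constant is assumed, emissions are assumed to share the ground-truth row shape, ground truth is assumed sorted by time, and ValueError is an assumed exception type (the report only says the function raises).

```python
import math
from bisect import bisect_left
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_008.8  # assumed mean Earth radius


@dataclass(frozen=True)
class Row:
    t_s: float
    lat_deg: float
    lon_deg: float
    alt_m: float


def haversine_m(lat1_deg: float, lon1_deg: float,
                lat2_deg: float, lon2_deg: float) -> float:
    """Great-circle distance in metres on a sphere."""
    phi1, phi2 = math.radians(lat1_deg), math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2_deg - lon1_deg)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_row(t_s: float, rows: list[Row], times: list[float]) -> Row:
    """Row whose timestamp is closest to t_s; rows must be sorted by t_s."""
    i = bisect_left(times, t_s)
    candidates = rows[max(i - 1, 0): i + 1]  # neighbours on either side of t_s
    return min(candidates, key=lambda r: abs(r.t_s - t_s))


def match_pct(emissions: list[Row], ground_truth: list[Row], *, threshold_m: float) -> float:
    if not ground_truth:
        raise ValueError("ground truth is empty")
    if not emissions:
        return 0.0
    times = [r.t_s for r in ground_truth]
    hits = 0
    for e in emissions:
        g = nearest_row(e.t_s, ground_truth, times)
        if haversine_m(e.lat_deg, e.lon_deg, g.lat_deg, g.lon_deg) <= threshold_m:
            hits += 1
    return 100.0 * hits / len(emissions)


one_degree_m = haversine_m(50.0, 36.0, 51.0, 36.0)

truth = [Row(0.0, 50.080, 36.110, 150.0),
         Row(1.0, 50.081, 36.110, 150.0),
         Row(2.0, 50.082, 36.110, 150.0)]
emitted = [Row(0.9, 50.081, 36.110, 150.0),   # coincides with the t=1.0 row
           Row(2.0, 50.083, 36.110, 150.0)]   # about 111 m north of the t=2.0 row
pct = match_pct(emitted, truth, threshold_m=50.0)
print(round(one_degree_m), pct)
```

One degree of latitude comes out at roughly 111 km, matching the sanity check in the unit tests, and one of the two sample emissions falls inside the 50 m threshold.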

Calibration artifact

  • _docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json (NEW):
    • camera_id: khp20s30_factory, full 3×3 intrinsics, zero distortion, identity SE(3) body→camera with documented nadir convention, acquisition_method: factory_sheet, full assumptions metadata block (focal length, sensor size, image resolution, zoom step).
  • _docs/00_problem/input_data/flight_derkachi/camera_info.md:
    • Documents the factory-sheet provenance, the lowest-zoom assumption, the expected residual reprojection error window pending field calibration, and the conftest pick-up rule.

Tests

  • tests/unit/replay_input/test_tlog_ground_truth.py (NEW, 12 tests):

    • test_ac1_real_derkachi_tlog_has_geofence_records — AC-1: real derkachi.tlog parse yields > 100 records within the Derkachi geofence (lat ≈ 50.08, lon ≈ 36.11). Skipped only when the binary is absent.
    • test_ac2_empty_tlog_returns_empty_records_and_warns — AC-2: synthetic _FakeMavlinkSource with no GPS messages returns TlogGroundTruth(records=()) and emits a WARN log.
    • test_missing_file_raises — error path coverage for the resolver.
    • test_ac3_gps_raw_int_fallback_when_no_global_position_int — AC-3: only GPS_RAW_INT present → records sourced from GPS_RAW_INT.
    • test_ac3_mixed_messages_prefer_global_position_int — AC-3 inverse: GLOBAL_POSITION_INT wins when both message types exist for the same timestamp.
    • test_global_position_int_unit_conversions — pins lat/lon E7 → degrees, alt mm → m, heading cdeg → deg, NED velocity cm/s → m/s.
    • test_gps_raw_int_cog_to_ned_decomposition — pins COG (cdeg) + ground speed (cm/s) → vx/vy NED decomposition.
    • test_missing_timestamp_raises — guard for malformed messages.
    • test_source_is_closed_after_load — resource hygiene.
    • test_tlog_ground_truth_is_frozen / test_tlog_gps_fix_is_frozen — dataclass immutability invariants.
    • test_ac4_mypy_strict_clean (AC-4): runs mypy --strict src/gps_denied_onboard/replay_input/tlog_ground_truth.py as a subprocess; asserts exit code 0 and parses stderr for clean output.
    • Fixtures: _FakeMavlinkMessage / _FakeMavlinkSource provide deterministic unit fixtures (no real pymavlink dependency in tests).
  • tests/unit/test_az697_gps_compare.py (NEW, 10 tests):

    • L2 zero at same point / 1° latitude ≈ 111 km / Kharkiv↔Kyiv known distance / symmetric.
    • match_percentage — all within / none within / empty emissions = 0.0 / empty ground truth raises.
    • GroundTruthRow frozen invariant.
    • test_test_helpers_reexport_is_identical (AC-5): the names re-exported by tests/e2e/replay/_helpers are the same objects as those in the production module (identity, not equality, to catch accidental re-implementation).
  • tests/unit/calibration/test_khp20s30_factory.py (NEW, 9 tests):

    • test_ac1_required_schema_keys_present / test_ac1_cli_loader_accepts_the_json — AC-1: schema + loader compatibility.
    • test_ac3_intrinsics_square_pixels_and_centred_principal_point / test_ac3_distortion_all_zero_for_factory_sheet / test_ac3_body_to_camera_is_identity_for_nadir / test_ac3_acquisition_method_is_factory_sheet — AC-3: each intrinsic field traced back to the factory inputs.
    • test_metadata_documents_assumptions — assumption block traceability.
    • test_camera_info_md_references_calibration — AC-2: camera_info.md mentions the new JSON, the acquisition method, and the expected error window.
    • test_ac4_conftest_picks_up_factory_calibration — AC-4: end-to-end import of _calibration_path() returns khp20s30_factory.json when present.

Conftest + helper wiring

  • tests/e2e/replay/_helpers.py:
    • Removed local definitions of GroundTruthRow, l2_horizontal_m, match_percentage; replaced them with a re-export (from gps_denied_onboard.helpers.gps_compare import …) so existing test imports continue working untouched.
    • Retained load_ground_truth_csv (CSV synth fallback path).
  • tests/e2e/replay/conftest.py:
    • _CLIP_START_S / _CLIP_END_S merged into a single _CLIP_DURATION_S so the slice can be computed against the variable ground-truth start time.
    • _calibration_path() prefers khp20s30_factory.json when present, falls back to adti26.json.
    • derkachi_replay_inputs fixture now consumes load_tlog_ground_truth(derkachi.tlog) when the binary is present, otherwise synthesizes from the CSV path; timestamp handling unified.
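The "prefer the new file, fall back to the legacy one" rule used for the calibration pick-up can be sketched as below. This is a minimal illustration: the function name calibration_path, the data_dir parameter, and the assumption that both JSON files live in the same directory are hypothetical, since the report does not state where adti26.json is stored.

```python
import tempfile
from pathlib import Path

PREFERRED = "khp20s30_factory.json"
LEGACY = "adti26.json"


def calibration_path(data_dir: Path) -> Path:
    """Return the factory calibration if it exists, else the legacy file."""
    preferred = data_dir / PREFERRED
    return preferred if preferred.is_file() else data_dir / LEGACY


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / LEGACY).write_text("{}")
    before = calibration_path(data_dir).name   # only the legacy file exists
    (data_dir / PREFERRED).write_text("{}")
    after = calibration_path(data_dir).name    # factory file now takes precedence

print(before, after)
```

Checking for the file at call time, rather than caching the result at import, keeps the fallback working on checkouts where the factory JSON is absent.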

State + ignore

  • _docs/_autodev_state.md: sub_step.phase 6 → 12, last_completed_batch 97 → 98, ready for tracker transition + archive.
  • .gitignore — added _docs/00_problem/input_data/**/*.tlog and _docs/00_problem/input_data/**/*.{mp4,h264} patterns so binary flight logs stay out of the repo. (Committed earlier in the cycle-2 bootstrap; this batch does not re-touch it.)

AC Test Coverage

AZ-697 — 5 ACs, all covered:

| AC | Coverage |
|---|---|
| AC-1 (happy path on real tlog) | test_ac1_real_derkachi_tlog_has_geofence_records (skipped only if binary absent) |
| AC-2 (empty GPS gracefully) | test_ac2_empty_tlog_returns_empty_records_and_warns |
| AC-3 (fallback precedence) | test_ac3_gps_raw_int_fallback_when_no_global_position_int + test_ac3_mixed_messages_prefer_global_position_int |
| AC-4 (mypy --strict clean) | test_ac4_mypy_strict_clean (passing as of this commit) |
| AC-5 (comparison helpers in production) | test_az697_gps_compare.py whole module + test_test_helpers_reexport_is_identical |
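The AC-3 precedence rule (GLOBAL_POSITION_INT wins, GPS_RAW_INT fills in per timestamp) can be illustrated with a small selection routine. This is a sketch of one way to express the rule, not the loader's actual control flow; the function name select_source and the (timestamp, message type) tuple shape are hypothetical.

```python
from typing import Iterable


def select_source(messages: Iterable[tuple[int, str]]) -> list[tuple[int, str]]:
    """Pick one GPS message type per timestamp, preferring GLOBAL_POSITION_INT."""
    chosen: dict[int, str] = {}
    for ts_ns, msg_type in messages:
        if msg_type == "GLOBAL_POSITION_INT":
            chosen[ts_ns] = msg_type            # always wins, regardless of arrival order
        elif msg_type == "GPS_RAW_INT":
            chosen.setdefault(ts_ns, msg_type)  # used only if nothing better is present
        # all other message types are ignored
    return sorted(chosen.items())


stream = [
    (1, "GPS_RAW_INT"),
    (1, "GLOBAL_POSITION_INT"),
    (2, "GPS_RAW_INT"),
    (3, "HEARTBEAT"),
]
selected = select_source(stream)
print(selected)
```

For the sample stream, timestamp 1 resolves to GLOBAL_POSITION_INT, timestamp 2 falls back to GPS_RAW_INT, and the non-GPS message at timestamp 3 is dropped.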

AZ-702 — 4 ACs, all covered:

| AC | Coverage |
|---|---|
| AC-1 (calibration JSON schema + loader) | test_ac1_required_schema_keys_present + test_ac1_cli_loader_accepts_the_json |
| AC-2 (camera_info.md documents the calibration) | test_camera_info_md_references_calibration |
| AC-3 (intrinsics computed from factory inputs) | test_ac3_* (4 tests, one per field group) |
| AC-4 (conftest picks up the file automatically) | test_ac4_conftest_picks_up_factory_calibration |

Test Run

| Suite | Result |
|---|---|
| tests/unit/replay_input/test_tlog_ground_truth.py (targeted, 12 tests) | 12 passed |
| tests/unit/test_az697_gps_compare.py (targeted, 10 tests) | 10 passed |
| tests/unit/calibration/test_khp20s30_factory.py (targeted, 9 tests) | 9 passed |
| tests/e2e/replay/test_helpers.py (regression on the re-export path, 14 tests) | 14 passed |

Total for the batch: 45 passed, 0 failed. Full suite gate runs at Step 16 (after the final batch in cycle 2).

Code Review Verdict: PASS

Inline lightweight review (no separate code-review skill artifact produced for this batch — review notes are inline below):

  • File ownership: gps_compare.py lives in helpers/ (shared); tlog_ground_truth.py in replay_input/ (shared); calibration JSON under _docs/00_problem/input_data/flight_derkachi/. All match the module-layout entries; no boundary violation.
  • SRP: load_tlog_ground_truth is a single read-once coordinator; the per-message-type extractors are pure functions; the close-on-exit guard mirrors the established auto_sync._open_tlog pattern.
  • Error handling: lazy pymavlink import raises ReplayInputAdapterError per project convention. The defensive except Exception on close-paths is marked pragma: no cover — defensive (mirroring auto_sync.py).
  • Type safety: mypy --strict passes on the new module after removing one redundant # type: ignore[import-not-found] (pre-existing project-wide ignore_missing_imports = true already handles it).
  • Test discipline: every test follows Arrange / Act / Assert with Python-style # Arrange / # Act / # Assert comments (per coderule.mdc). Skipped tests have explicit prerequisite reasons.
  • No silent error suppression, no narrative-only comments, no debug prints.

Auto-Fix Attempts: 1

  • Round 1: removed # type: ignore[import-not-found] from tlog_ground_truth.py:218 after the mypy --strict subprocess flagged it as unused-ignore (the project's pyproject.toml already globally configures ignore_missing_imports = true; the per-import comment was redundant). Re-run of test_ac4_mypy_strict_clean passed.
  • No further rounds needed.

Stuck Agents: None

Adjacent Issue Surfaced (NOT fixed in this batch)

  • src/gps_denied_onboard/replay_input/auto_sync.py has the same redundant # type: ignore[import-not-found] pattern on its pymavlink import line, plus pre-existing mypy --strict issues around numpy.ndarray generic parameterization and a cv2.calcOpticalFlowFarneback overload mismatch. None of those are exercised by this batch's tests or fall within its scope. They are recorded here so the next batch / cumulative review can decide whether to open a refactor task or leave them as-is.

Next Batch

Per the cycle-2 implementation order (T1+T6 → T2 → T3 → T4 → T5) the next batch is Batch 99: AZ-698 (tlog_trim_midflight_alignment) — depends on AZ-697 (now done).