[AZ-697] [AZ-702] tlog GPS truth + KHP20S30 factory calibration

Batch 98 (cycle 2) — first two PBIs of epic AZ-696 (real-flight
validation harness):

AZ-697: direct binary-tlog GPS-truth extractor

- New src/gps_denied_onboard/replay_input/tlog_ground_truth.py reads
  GLOBAL_POSITION_INT (with GPS_RAW_INT fallback) from a binary
  ArduPilot tlog via pymavlink.mavutil and returns a frozen+slotted
  TlogGroundTruth DTO with per-record ts_ns / lat_deg / lon_deg / alt_m
  / hdg_deg / vx_m_s / vy_m_s / vz_m_s.
- Promoted l2_horizontal_m + match_percentage + GroundTruthRow from
  tests/e2e/replay/_helpers.py into the new production module
  src/gps_denied_onboard/helpers/gps_compare.py. The e2e helper now
  re-exports the same objects (identity, not copies) so existing test
  imports continue working untouched.
- tests/e2e/replay/conftest.py prefers the real derkachi.tlog when
  present and falls back to the CSV synth path otherwise.
- 22 new unit tests cover AC-1..AC-5 (mypy --strict subprocess test
  included). All passing.

AZ-702: Topotek KHP20S30 factory-sheet camera calibration

- New _docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json:
  fx = fy = 4644.444, cx = 960, cy = 540, HFOV ~ 23.3 deg, VFOV ~ 13.2
  deg, computed from the published 8.5 mm focal length + 1/2.8" sensor
  + 1920x1080 capture at lowest zoom step. Distortion zeroed,
  body_to_camera_se3 = identity with nadir convention. Acquisition
  method explicitly recorded as factory_sheet so downstream code can
  expect higher residual error than a lab calibration.
- _docs/00_problem/input_data/flight_derkachi/camera_info.md updated
  to document the assumptions, expected residual error window, and
  conftest pick-up rule.
- tests/e2e/replay/conftest.py::_calibration_path() prefers
  khp20s30_factory.json when present and falls back to adti26.json
  otherwise.
- 9 new unit tests cover AC-1..AC-4 (schema, intrinsics traceback,
  doc reference, conftest pick-up). All passing.
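
As a cross-check of the intrinsics listed above, the quoted fields of
view can be recomputed from fx and the image size with the standard
pinhole relation FOV = 2 * atan(extent / (2 * f)). This is a minimal
sketch; the helper name fov_deg is hypothetical and is not taken from
the repository. It assumes square pixels (fx = fy) and a principal
point at the image centre, as stated in the JSON values.

```python
import math


def fov_deg(focal_px: float, extent_px: int) -> float:
    """Full field of view in degrees for a pinhole camera along one axis."""
    return math.degrees(2.0 * math.atan(extent_px / (2.0 * focal_px)))


WIDTH_PX, HEIGHT_PX = 1920, 1080
FX = FY = 4644.444  # value from khp20s30_factory.json per the commit message

# Principal point at the image centre.
cx, cy = WIDTH_PX / 2, HEIGHT_PX / 2

hfov = fov_deg(FX, WIDTH_PX)
vfov = fov_deg(FY, HEIGHT_PX)
print(f"cx={cx:.0f} cy={cy:.0f} HFOV={hfov:.2f} deg VFOV={vfov:.2f} deg")
```

The results agree with the HFOV and VFOV figures quoted above to within
about 0.1 deg.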

Test run: 45 new tests, all passing. Full-suite gate deferred to
Step 16 (after the last batch in cycle 2 per the implement skill).

Adjacent note (not fixed in this batch, recorded in the batch report):
auto_sync.py has the same redundant pymavlink type:ignore + a few
numpy/cv2 mypy --strict issues. None on this batch's path.

Refs: _docs/03_implementation/batch_98_cycle2_report.md
Refs: _docs/02_tasks/done/AZ-697_tlog_ground_truth_extractor.md
Refs: _docs/02_tasks/done/AZ-702_khp20s30_calibration.md

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-20 16:09:03 +03:00
parent a12638dd92
commit 64d961f60c
16 changed files with 1503 additions and 134 deletions
+15 -105
@@ -1,18 +1,22 @@
"""Helpers shared by the AZ-404 E2E replay tests.
The numerical kernels (``l2_horizontal_m``, ``match_percentage``,
``GroundTruthRow``) moved into production code at
:mod:`gps_denied_onboard.helpers.gps_compare` in AZ-697; they're
re-exported here so existing import sites stay stable.
* :func:`parse_jsonl` — read the ``JsonlReplaySink`` output into a list
  of dicts with one entry per emit.
* :func:`l2_horizontal_m` — WGS84-aware L2 horizontal distance between
  two ``(lat, lon)`` pairs in metres.
* :func:`match_percentage` — share of estimator emissions whose
  L2 distance to the closest ground-truth row is within a threshold.
* :class:`CapturingMavlinkTransport` — test-only ``MavlinkTransport``
  impl that records every ``write`` so AC-4b can compare the byte
  streams produced by ``compose_root(config_live)`` vs.
  ``compose_root(config_replay)``.
* :func:`load_ground_truth_csv` — the IMU CSV's ``GLOBAL_POSITION_INT``
  columns ARE the AC-3 reference (the original tlog's GPS rows
  exported to CSV); this helper materialises them. Retained for the
  CSV-only fallback path; the real-tlog branch uses
  :func:`gps_denied_onboard.replay_input.load_tlog_ground_truth`
  instead.
All functions are pure / deterministic and stay safely importable on
dev macOS without ``RUN_REPLAY_E2E``; the regular regression suite
@@ -24,11 +28,15 @@ from __future__ import annotations
import csv
import json
import math
from dataclasses import dataclass
from pathlib import Path
from typing import Any
from gps_denied_onboard.helpers.gps_compare import (
    GroundTruthRow,
    l2_horizontal_m,
    match_percentage,
)
__all__ = [
    "CapturingMavlinkTransport",
    "GroundTruthRow",
@@ -39,22 +47,6 @@ __all__ = [
]
# WGS84 mean Earth radius. Matches the value used by
# `helpers/wgs_converter.py` (AZ-279) so the e2e check is consistent
# with the production converter.
_EARTH_RADIUS_M: float = 6_371_008.8
@dataclass(frozen=True)
class GroundTruthRow:
    """One row from the Derkachi data_imu.csv ground-truth slice."""

    t_s: float
    lat_deg: float
    lon_deg: float
    alt_m: float
def parse_jsonl(path: Path) -> list[dict[str, Any]]:
    """Return one dict per line of a JsonlReplaySink output file.
@@ -77,29 +69,6 @@ def parse_jsonl(path: Path) -> list[dict[str, Any]]:
    return records
def l2_horizontal_m(
    lat1_deg: float, lon1_deg: float, lat2_deg: float, lon2_deg: float
) -> float:
    """WGS84-spherical great-circle distance in metres.

    Uses the haversine formula with the C5/AZ-279 mean Earth radius.
    Sufficient for the AC-3 ≤ 100 m threshold (sub-metre accuracy at
    the Derkachi latitude band; the spherical approximation diverges
    from the WGS84 ellipsoid by < 0.5 % at these latitudes — well
    within the AC-3 budget).
    """
    phi1 = math.radians(lat1_deg)
    phi2 = math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlam = math.radians(lon2_deg - lon1_deg)
    a = (
        math.sin(dphi / 2.0) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2
    )
    c = 2.0 * math.asin(min(1.0, math.sqrt(a)))
    return _EARTH_RADIUS_M * c
def load_ground_truth_csv(csv_path: Path) -> list[GroundTruthRow]:
    """Load the Derkachi IMU CSV's GPS rows as ground truth.
@@ -123,65 +92,6 @@ def load_ground_truth_csv(csv_path: Path) -> list[GroundTruthRow]:
    return rows
def match_percentage(
    emissions: list[dict[str, Any]],
    ground_truth: list[GroundTruthRow],
    *,
    threshold_m: float,
) -> float:
    """Share of emissions within ``threshold_m`` of the closest GT row.

    For each emitted ``EstimatorOutput`` JSONL record, find the
    nearest-in-time ground-truth row, compute the horizontal L2
    distance, and count it as a hit when ≤ ``threshold_m``. Returns
    the hit ratio in [0.0, 1.0].

    Nearest-in-time is sufficient because the IMU CSV's 10 Hz cadence
    (matching the C5 emit rate) means the candidate row is typically
    < 50 ms off the emit timestamp — well below the AC-3 100 m budget.
    """
    if not emissions:
        return 0.0
    if not ground_truth:
        raise AssertionError("ground_truth must be non-empty")
    gt_sorted = sorted(ground_truth, key=lambda r: r.t_s)
    gt_times = [r.t_s for r in gt_sorted]
    hits = 0
    for emit in emissions:
        emit_ts_ns = int(emit["emitted_at"])
        emit_t_s = emit_ts_ns / 1e9
        idx = _bisect_left(gt_times, emit_t_s)
        candidates = []
        if idx > 0:
            candidates.append(gt_sorted[idx - 1])
        if idx < len(gt_sorted):
            candidates.append(gt_sorted[idx])
        # Nearest-in-time row.
        nearest = min(candidates, key=lambda r: abs(r.t_s - emit_t_s))
        emit_pos = emit["position_wgs84"]
        d = l2_horizontal_m(
            emit_pos["lat_deg"],
            emit_pos["lon_deg"],
            nearest.lat_deg,
            nearest.lon_deg,
        )
        if d <= threshold_m:
            hits += 1
    return hits / len(emissions)
def _bisect_left(seq: list[float], target: float) -> int:
    """Stdlib bisect_left, inlined to keep import surface narrow."""
    lo, hi = 0, len(seq)
    while lo < hi:
        mid = (lo + hi) // 2
        if seq[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo
class CapturingMavlinkTransport:
    """Test-only :class:`MavlinkTransport` that records every write.