[AZ-697] [AZ-702] tlog GPS truth + KHP20S30 factory calibration

Batch 98 (cycle 2) — first two PBIs of epic AZ-696 (real-flight
validation harness):

AZ-697: direct binary-tlog GPS-truth extractor

- New src/gps_denied_onboard/replay_input/tlog_ground_truth.py reads
  GLOBAL_POSITION_INT (with GPS_RAW_INT fallback) from a binary
  ArduPilot tlog via pymavlink.mavutil and returns a frozen+slotted
  TlogGroundTruth DTO with per-record ts_ns / lat_deg / lon_deg / alt_m
  / hdg_deg / vx_m_s / vy_m_s / vz_m_s.
- Promoted l2_horizontal_m + match_percentage + GroundTruthRow from
  tests/e2e/replay/_helpers.py into the new production module
  src/gps_denied_onboard/helpers/gps_compare.py. The e2e helper now
  re-exports the same objects (identity, not copies) so existing test
  imports continue working untouched.
- tests/e2e/replay/conftest.py prefers the real derkachi.tlog when
  present, falls back to the CSV synth path otherwise.
- 22 new unit tests cover AC-1..AC-5 (mypy --strict subprocess test
  included). All passing.

AZ-702: Topotek KHP20S30 factory-sheet camera calibration

- New _docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json:
  fx = fy = 4644.444, cx = 960, cy = 540, HFOV ~ 23.3 deg, VFOV ~ 13.2
  deg, computed from the published 8.5 mm focal length + 1/2.8" sensor
  + 1920x1080 capture at lowest zoom step. Distortion zeroed,
  body_to_camera_se3 = identity with nadir convention. Acquisition
  method explicitly recorded as factory_sheet so downstream code can
  expect higher residual error than a lab calibration.
- _docs/00_problem/input_data/flight_derkachi/camera_info.md updated
  to document the assumptions, expected residual error window, and
  conftest pick-up rule.
- tests/e2e/replay/conftest.py::_calibration_path() prefers
  khp20s30_factory.json when present, falls back to adti26.json.
- 9 new unit tests cover AC-1..AC-4 (schema, intrinsics traceback,
  doc reference, conftest pick-up). All passing.

Test run: 45 new tests, all passing. Full-suite gate deferred to
Step 16 (after the last batch in cycle 2 per the implement skill).

Adjacent note (not fixed in this batch, recorded in the batch report):
auto_sync.py has the same redundant pymavlink type:ignore + a few
numpy/cv2 mypy --strict issues. None on this batch's path.

Refs: _docs/03_implementation/batch_98_cycle2_report.md
Refs: _docs/02_tasks/done/AZ-697_tlog_ground_truth_extractor.md
Refs: _docs/02_tasks/done/AZ-702_khp20s30_calibration.md

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-20 16:09:03 +03:00
parent a12638dd92
commit 64d961f60c
16 changed files with 1503 additions and 134 deletions
@@ -0,0 +1,120 @@
# Direct binary-tlog GPS-truth extractor
**Task**: AZ-697_tlog_ground_truth_extractor
**Name**: Direct binary-tlog GPS-truth extractor (replaces data_imu.csv middle-man)
**Description**: New `tlog_ground_truth.py` module that streams `GLOBAL_POSITION_INT` (or falls back to `GPS_RAW_INT`) from a binary ArduPilot tlog into a typed `TlogGroundTruth` DTO. Production helper (not test-only).
**Complexity**: 3 points
**Dependencies**: None
**Component**: replay_input (cross-cutting validation helper)
**Tracker**: AZ-697
**Epic**: AZ-696
## Problem
Cycle-1 AC-3 (≤ 100 m horizontal error for 80 % of ticks) was permanently
`@xfail`, partly because the test fed the SUT a tlog synthesized from
`_docs/00_problem/input_data/flight_derkachi/data_imu.csv` and then read
ground truth from the same CSV, so the comparison was circular.
A real binary `derkachi.tlog` (5.8 MB ArduPilot tlog, MAVLink v2) was
committed on 2026-05-20. The remaining gap is a direct extractor that
reads `GLOBAL_POSITION_INT` (or `GPS_RAW_INT`) from the binary and
returns a typed DTO suitable for the AC-3 comparison helper.
## Outcome
- A new production module `src/gps_denied_onboard/replay_input/tlog_ground_truth.py`
exposes `load_tlog_ground_truth(path: Path) -> TlogGroundTruth`.
- The existing AC-3 comparison helpers (`l2_horizontal_m`,
`match_percentage`) move from `tests/e2e/replay/_helpers.py` into
`src/gps_denied_onboard/helpers/` so they are production code, not
test-only.
- The replay-test conftest uses the new extractor when the real tlog is
present; CSV path remains as a synth-tlog fallback.
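
The AC-3 comparison (≤ 100 m horizontal error for 80 % of ticks) can be illustrated with a minimal sketch. The function names, signatures, and the haversine formula below are assumptions made for illustration; the real `l2_horizontal_m` and `match_percentage` may use different signatures or a different distance model.

```python
import math
from typing import Sequence

# Mean Earth radius in metres; an assumption of this sketch.
EARTH_RADIUS_M = 6_371_000.0


def horizontal_distance_m(
    lat1_deg: float, lon1_deg: float, lat2_deg: float, lon2_deg: float
) -> float:
    """Great-circle (haversine) distance in metres between two lat/lon points."""
    phi1 = math.radians(lat1_deg)
    phi2 = math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlam = math.radians(lon2_deg - lon1_deg)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def percent_within(errors_m: Sequence[float], threshold_m: float = 100.0) -> float:
    """Percentage of ticks whose horizontal error is at or below the threshold."""
    if not errors_m:
        return 0.0  # no ticks to compare; treated as 0 % in this sketch
    hits = sum(1 for err in errors_m if err <= threshold_m)
    return 100.0 * hits / len(errors_m)
```

Under these assumptions, AC-3 would pass when `percent_within(errors_m, 100.0) >= 80.0`, where `errors_m` holds one distance per tick between the estimator output and the tlog ground truth.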
## Scope
### Included
- New `TlogGroundTruth` dataclass (frozen + slotted) with per-record
`ts_ns`, `lat_deg`, `lon_deg`, `alt_m`, `hdg_deg`, `vx_m_s`, `vy_m_s`,
`vz_m_s` fields.
- `load_tlog_ground_truth(path)` — lazy `pymavlink.mavutil` open
mirroring `replay_input/auto_sync.py::_open_tlog`.
- Move `l2_horizontal_m` + `match_percentage` from test helpers to
`src/gps_denied_onboard/helpers/gps_compare.py`.
- Wire `tests/e2e/replay/conftest.py` to consume the new path when
`derkachi.tlog` exists.
- Unit tests under `tests/unit/replay_input/test_tlog_ground_truth.py`
using a synthetic tlog (extend `tests/e2e/replay/_tlog_synth.py`).
### Excluded
- Tlog trimming for mid-flight slices — AZ-698 (T2).
- Accuracy report writing — AZ-699 (T3).
- Map visualization — AZ-700 (T4).
## Acceptance Criteria
**AC-1: Happy path on real tlog**
Given the committed `derkachi.tlog`
When `load_tlog_ground_truth(derkachi.tlog)` runs
Then it returns `TlogGroundTruth` with `len(records) > 100` and lat ≈ 50.08, lon ≈ 36.11
**AC-2: Empty GPS gracefully**
Given a tlog with no `GLOBAL_POSITION_INT` / `GPS_RAW_INT` messages
When the extractor runs
Then it returns `TlogGroundTruth(records=())` and logs WARN (does NOT raise)
**AC-3: Fallback precedence**
Given a tlog containing only `GPS_RAW_INT` (no `GLOBAL_POSITION_INT`)
When the extractor runs
Then it returns records sourced from `GPS_RAW_INT`
When both message types are present, `GLOBAL_POSITION_INT` takes precedence
**AC-4: Type safety**
When `mypy --strict src/gps_denied_onboard/replay_input/tlog_ground_truth.py` runs
Then it reports zero errors
**AC-5: Comparison helpers in production**
Given the moved `l2_horizontal_m` + `match_percentage`
When imported from `gps_denied_onboard.helpers.gps_compare`
Then they behave identically to the prior test-helpers location (snapshot test)
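
The source-selection behaviour in AC-2 and AC-3 can be sketched as follows. This is an illustration only: it assumes messages have already been decoded into `(type, payload)` pairs and that precedence is decided once for the whole file. The function name and return shape are hypothetical, and the real extractor may resolve precedence differently.

```python
import logging
from typing import Dict, Iterable, List, Optional, Tuple

log = logging.getLogger("tlog_ground_truth_sketch")  # hypothetical logger name

PRIMARY = "GLOBAL_POSITION_INT"
FALLBACK = "GPS_RAW_INT"


def select_gps_messages(
    messages: Iterable[Tuple[str, Dict[str, int]]]
) -> Tuple[Optional[str], List[Dict[str, int]]]:
    """Return (source_type, payloads), preferring PRIMARY over FALLBACK."""
    primary: List[Dict[str, int]] = []
    fallback: List[Dict[str, int]] = []
    for msg_type, payload in messages:
        if msg_type == PRIMARY:
            primary.append(payload)
        elif msg_type == FALLBACK:
            fallback.append(payload)
    if primary:
        return PRIMARY, primary    # primary wins whenever it is present
    if fallback:
        return FALLBACK, fallback  # AC-3: fall back only when primary is absent
    # AC-2: no GPS at all is reported with a warning, not an exception.
    log.warning("tlog contains no %s or %s messages", PRIMARY, FALLBACK)
    return None, []
```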
## Non-Functional Requirements
**Performance**
- `load_tlog_ground_truth(derkachi.tlog)` (5.8 MB, ~60 s of GPS at 5 Hz) returns in < 2 s on Tier-1 hardware.
**Reliability**
- Lazy pymavlink import; missing dep raises `ReplayInputAdapterError` per project convention.
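
A minimal sketch of the lazy-import pattern described above, assuming the project's `ReplayInputAdapterError` is an ordinary exception class. The stand-in class and the `lazy_import` helper are hypothetical; the actual pattern lives in `auto_sync.py` and may differ.

```python
import importlib
from types import ModuleType


class ReplayInputAdapterError(RuntimeError):
    """Stand-in for the project's exception of the same name (assumption)."""


def lazy_import(module_name: str = "pymavlink.mavutil") -> ModuleType:
    """Import a module at call time and wrap a missing dependency."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so both are caught.
        raise ReplayInputAdapterError(
            f"required dependency {module_name!r} could not be imported"
        ) from exc
```

Deferring the import to the call site keeps module import cheap and lets callers that never touch a tlog run without pymavlink installed.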
## Unit Tests
| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | Real derkachi.tlog parse | Non-empty TlogGroundTruth with Derkachi geofence lat/lon |
| AC-2 | Tlog with no GPS messages | Empty records tuple + WARN log |
| AC-3 | GPS_RAW_INT fallback | Records sourced from GPS_RAW_INT when GLOBAL_POSITION_INT absent |
| AC-3 | Mixed GLOBAL_POSITION_INT + GPS_RAW_INT | GLOBAL_POSITION_INT wins per AC-3 |
| AC-4 | mypy --strict | Zero errors |
| AC-5 | Helper move snapshot | Same numeric output as prior test-helpers location |
## Blackbox Tests
| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | derkachi.tlog (real) | Load full tlog | > 100 records, Derkachi geofence | Perf < 2 s |
## Constraints
- pymavlink is already a project dep (used by C8); MUST be lazy-imported (auto_sync.py pattern).
- New module MUST follow the project's frozen + slotted dataclass convention.
- File ownership goes in `_docs/02_document/module-layout.md` per AZ-696 epic layout (no contract — internal helper).
## Risks & Mitigation
**Risk 1: MAVLink unit-conversion bugs**
- *Risk*: MAVLink encodes lat/lon as integer degrees × 1e7 (degE7). Forgetting to divide by 1e7 ships records off by seven orders of magnitude.
- *Mitigation*: AC-1 asserts Derkachi geofence values; unit test snapshots a known fixture.
**Risk 2: pymavlink import flakiness on Jetson**
- *Risk*: pymavlink occasionally fails to import on aarch64.
- *Mitigation*: Lazy import + raise `ReplayInputAdapterError` (existing pattern).
@@ -0,0 +1,106 @@
# Topotek KHP20S30 camera calibration (factory-sheet approximation)
**Task**: AZ-702_khp20s30_calibration
**Name**: Provide a calibration JSON for the Topotek KHP20S30 nadir camera (factory-sheet approximation)
**Description**: Compute and commit a `CameraCalibrationArtifact` JSON for the Derkachi camera (Topotek KHP20S30) from manufacturer factory data. Replaces the `adti26.json` placeholder that AC-3 currently uses. Documents the residual error vs a per-unit checkerboard refinement.
**Complexity**: 1 point
**Dependencies**: None
**Component**: input_data / shared_helpers
**Tracker**: AZ-702
**Epic**: AZ-696
## Problem
`_docs/00_problem/input_data/flight_derkachi/camera_info.md` states the
Topotek KHP20S30 intrinsics are unknown. `tests/e2e/replay/conftest.py`
(lines 50-56) substitutes `tests/fixtures/calibration/adti26.json` as a
placeholder. AC-3 (≤ 100 m horizontal error for 80 % of ticks) is
`@xfail` until a real calibration ships.
The cheapest reasonable starting point is a factory-sheet approximation
— compute `K` from the manufacturer's published focal length + sensor
geometry, accept the 13 % focal-length residual as a documented
budget, and let AC-3 either PASS or honestly FAIL with the residual
attributed.
## Outcome
- A calibration JSON `khp20s30_factory.json` exists in the Derkachi
input directory, parses against the project's
`CameraCalibrationArtifact` schema, and documents the acquisition
method as `factory_sheet`.
- `camera_info.md` is updated to reference the new calibration + the
residual budget + the deferral handle (`AZ-XXX_checkerboard_refinement`).
- AZ-699 (T3) uses this calibration as its `--camera-calibration` input.
## Scope
### Included
- Source manufacturer factory data for the Topotek KHP20S30 (sensor: 1/2.8" CMOS, 2.13 MP, 1920×1080; lens focal length, FOV, pixel pitch).
- Compute `K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]` from `fx = fy = focal_length_mm × (image_width_px / sensor_width_mm)`.
- Set distortion to `[0, 0, 0, 0, 0]` (factory-sheet approximation).
- Set `body_to_camera_se3` to identity-down (nadir; camera-z = aircraft-down).
- Set `acquisition_method = "factory_sheet"`.
- Write `_docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json`.
- Update `_docs/00_problem/input_data/flight_derkachi/camera_info.md`.
- New unit test under `tests/unit/calibration/` asserting the JSON parses and matches the documented inputs.
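
The pinhole arithmetic in the list above can be sketched as below. The helper names are hypothetical, the principal point is assumed to sit at the image centre, and no particular sensor width is asserted here; only the formula `fx = fy = focal_length_mm × (image_width_px / sensor_width_mm)` comes from this task.

```python
import math
from typing import List


def intrinsics_from_factory_sheet(
    focal_length_mm: float,
    sensor_width_mm: float,
    image_width_px: int,
    image_height_px: int,
) -> List[List[float]]:
    """Build K assuming square pixels and a centred principal point."""
    fx = focal_length_mm * (image_width_px / sensor_width_mm)
    fy = fx  # square pixels
    cx = image_width_px / 2.0
    cy = image_height_px / 2.0
    return [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]


def fov_deg(focal_px: float, size_px: int) -> float:
    """Full field of view in degrees along an axis of size_px pixels."""
    return math.degrees(2.0 * math.atan((size_px / 2.0) / focal_px))
```

As a cross-check, feeding the committed `fx = 4644.444` into `fov_deg` with 1920 and 1080 pixels gives roughly 23.4 and 13.3 degrees, close to the approximate HFOV and VFOV quoted in the commit message.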
### Excluded
- Physical checkerboard calibration (needs hardware).
- PnP-from-tlog back-computation (deferred follow-up).
- Updating `adti26.json` or other test fixtures.
## Acceptance Criteria
**AC-1: Calibration JSON parses**
Given the new `khp20s30_factory.json`
When loaded by the project's calibration parser (same schema as `adti26.json`)
Then it parses without error and all fields are populated
**AC-2: Doc updated**
Given `camera_info.md` before
When the calibration is committed
Then `camera_info.md` says "factory-sheet approximation; per-unit checkerboard refinement deferred — see <future-task>" and lists the residual budget
**AC-3: Unit test snapshot**
Given the new JSON
When the unit test runs
Then it asserts `fx == fy` (square pixels), `cx ≈ width/2`, `cy ≈ height/2`, distortion all zero
**AC-4: T3 consumes this calibration**
Given AZ-699's `test_derkachi_real_tlog.py`
When it runs
Then it loads `khp20s30_factory.json` as `--camera-calibration` (no longer the `adti26.json` placeholder)
## Non-Functional Requirements
**Compatibility**
- JSON schema MUST be identical to existing calibration fixtures (`adti26.json`) — no schema changes in this task.
## Unit Tests
| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | JSON loads via existing parser | Object populated |
| AC-3 | Field values match factory inputs | fx == fy, cx/cy at centre, zero distortion |
## Blackbox Tests
| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-4 | T3 test pointed at new JSON | T3 launches without calibration parse error | Test starts cleanly | Compat |
## Constraints
- MUST follow the calibration contract in `_docs/02_document/contracts/shared_helpers/descriptor_normaliser.md` (or wherever the camera-calibration schema lives).
- MUST be a single committed JSON — no generator script with side effects.
## Risks & Mitigation
**Risk 1: Factory data unavailable at required precision**
- *Risk*: Topotek does not publish the exact focal length / sensor width to the precision needed.
- *Mitigation*: Document the gap; ship with the best-available estimate; flag in `camera_info.md` so T3 surfaces the uncertainty in its failure message.
**Risk 2: Residual error exceeds AC-3 budget**
- *Risk*: 13 % focal-length error may push horizontal error past 100 m at 1 km AGL.
- *Mitigation*: That's the honest finding. T3 reports it. A follow-up task can pursue checkerboard refinement if needed.