[AZ-699] Real-flight validation runner + Markdown accuracy report

New e2e test runs gps-denied-replay --auto-trim against the real
derkachi.tlog + flight video + AZ-702 calibration, computes the
horizontal-error distribution (mean/p50/p95/p99 + 10/25/50/100 m
threshold-hit share), writes
_docs/06_metrics/real_flight_validation_{date}.md, and asserts an
honest PASS/FAIL with no @xfail mask. AZ-404's 1-min test is
untouched (sibling, not replacement).
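
For illustration only, a minimal sketch of the mean and threshold-hit
share named above. The helper name threshold_hit_share, the inclusive
"<=" comparison, and the sample errors are assumptions, not the code or
data in this commit.

```python
from statistics import fmean


def threshold_hit_share(errors_m, thresholds_m=(10.0, 25.0, 50.0, 100.0)):
    """Return {threshold: fraction of samples with error <= threshold}.

    Hypothetical helper; whether the project counts a sample exactly on
    the threshold as a hit is an assumption made here.
    """
    if not errors_m:
        raise ValueError("errors_m must not be empty")
    n = len(errors_m)
    return {t: sum(1 for e in errors_m if e <= t) / n for t in thresholds_m}


# Made-up horizontal errors in metres, purely illustrative.
errors = [4.0, 12.0, 30.0, 80.0]
shares = threshold_hit_share(errors)
mean_error = fmean(errors)
```

A verdict gate could then compare one of these shares, or a percentile,
against a target, but the actual gate criteria are not stated in this
message.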

Extends gps_compare.py with HorizontalErrorDistribution +
percentile_sorted (numpy-equivalent linear interpolation). New
test helper _report_writer.py renders the canonical Markdown
schema documented as FT-P-20 in blackbox-tests.md.
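
As a sketch of what "numpy-equivalent linear interpolation" usually
means (the rule behind numpy.percentile's default "linear" method), the
percentile sits at fractional rank (n - 1) * q / 100 in the sorted
sample. The function name percentile_linear and its signature are
hypothetical and may differ from the real percentile_sorted.

```python
import math


def percentile_linear(sorted_values, q):
    """Percentile q (0..100) of an ascending, non-empty sequence.

    Interpolates linearly between the two nearest ranks, which is the
    rule used by numpy.percentile's default "linear" method.
    """
    if not sorted_values:
        raise ValueError("sorted_values must not be empty")
    if not 0.0 <= q <= 100.0:
        raise ValueError("q must be within [0, 100]")
    rank = (len(sorted_values) - 1) * q / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    frac = rank - lo
    # When rank is an integer, lo == hi and the sample value is returned.
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac
```

Requiring pre-sorted input lets the caller sort once and read several
percentiles (p50/p95/p99) from the same list.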

16 new unit tests pin distribution arithmetic, verdict gate,
failure-message templating (references calibration acquisition
method per AC-3), and report layout. 129 passed in focused
regression, 3 skipped (real video / Tier-2 prerequisites).
Zero new mypy --strict errors.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-20 16:53:48 +03:00
parent f5366bbca1
commit dcde602f61
9 changed files with 1261 additions and 2 deletions
@@ -18,8 +18,11 @@ from gps_denied_onboard.helpers.engine_filename_schema import (
)
from gps_denied_onboard.helpers.gps_compare import (
GroundTruthRow,
HorizontalErrorDistribution,
horizontal_error_distribution,
l2_horizontal_m,
match_percentage,
percentile_sorted,
)
from gps_denied_onboard.helpers.imu_preintegrator import (
CombinedImuFactor,
@@ -77,6 +80,7 @@ __all__ = [
"EngineFilenameSchema",
"EngineFilenameSchemaError",
"GroundTruthRow",
"HorizontalErrorDistribution",
"ImuPreintegrationError",
"ImuPreintegrator",
"LightGlueConcurrentAccessError",
@@ -92,6 +96,7 @@ __all__ = [
"WgsConverter",
"adjoint",
"exp_map",
"horizontal_error_distribution",
"is_valid_rotation",
"iso_ts_from_clock",
"iso_ts_now",
@@ -100,5 +105,6 @@ __all__ = [
"match_percentage",
"make_imu_preintegrator",
"matrix_to_se3",
"percentile_sorted",
"se3_to_matrix",
]