ReportContext.tlog_path was widened in-place by AZ-959 to mean
"ground-truth source path" without renaming, leaving the rendered
report's "- Tlog: <csv_path>" line cosmetically wrong for CSV
runs. This rename + label fix completes the cleanup.
- helpers/accuracy_report.py: field rename + docstring update +
rendered line now reads "- Ground truth: <path>" for both
inputs.
- replay_api/app.py: kwarg updated, AZ-959 inline comment about
the overload removed (field name now carries the intent).
- tests/unit/test_az699_report_writer.py: fixture updated, two
new symmetric tests assert the canonical label for tlog AND
csv inputs (AC-2).
- tests/e2e/replay/_e2e_orchestrator.py +
test_derkachi_real_tlog.py: kwarg updated.
Tests: 62/62 green across test_az699_report_writer.py,
test_az700_render_map.py, test_az701_replay_api.py.
CSV-replay-input chain (AZ-959 + AZ-960 + AZ-961) is now coherent:
- API accepts (video, csv) with XOR validation
- /static/example-csv serves the AZ-896 reference doc
- Runner dispatches --imu vs --tlog argv
- Report renders with source-agnostic "Ground truth:" label
- Map renders from CSV truth via gps-denied-render-map dispatch
Bookkeeping: AZ-961 spec moved todo/ → done/, dep-table preamble
eighth bump documents the rename + summarises the cycle-4 CSV
chain, state.md records batch 7 complete.
Co-authored-by: Cursor <cursoragent@cursor.com>
New replay_api component: FastAPI service wrapping the offline
gps-denied-replay pipeline. POST tlog+video (multipart) → either
sync 200 with result/map/report URLs, or async 202 + job id with
/jobs/{id} polling. Magic-byte validation, bearer auth, in-memory
JobRegistry with concurrency + queue caps (429 on overflow).
Helper accuracy_report.py promoted from tests/ to src/ because the
API needs the Markdown report writer at runtime; all AZ-699 imports
re-pointed. OpenAPI spec exported to docs.
18/18 unit tests pass (AC-1 sync, AC-2 async, AC-3 state machine,
AC-5 auth, AC-6 health, AC-8 concurrency, AC-9 magic-byte). Full
unit suite: 2251 pass, 86 skip, 1 pre-existing C12 cold-start flake
(unchanged). mypy --strict clean on the new surface.
Co-authored-by: Cursor <cursoragent@cursor.com>
New e2e test runs gps-denied-replay --auto-trim against the real
derkachi.tlog + flight video + AZ-702 calibration, computes the
horizontal-error distribution (mean/p50/p95/p99 + 10/25/50/100 m
threshold-hit share), writes _docs/06_metrics/real_flight_
validation_{date}.md, and asserts honest PASS/FAIL with no @xfail
mask. AZ-404's 1-min test is untouched (sibling, not replacement).
Extends gps_compare.py with HorizontalErrorDistribution +
percentile_sorted (numpy-equivalent linear interpolation). New
test helper _report_writer.py renders the canonical Markdown
schema documented as FT-P-20 in blackbox-tests.md.
16 new unit tests pin distribution arithmetic, verdict gate,
failure-message templating (references calibration acquisition
method per AC-3), and report layout. 129 passed in focused
regression, 3 skipped (real video / Tier-2 prerequisites).
Zero new mypy --strict errors.
Co-authored-by: Cursor <cursoragent@cursor.com>