FT-P-12: parse mavproxy-listener tlog over a 60 s Derkachi replay and assert SUT->GCS GLOBAL_POSITION_INT cadence lands in [1, 2] Hz (AC-6.1). FT-P-13: inject `RELOC:<lat>,<lon>,<radius_m>` STATUSTEXT while the SUT is in dead_reckoned; verify FDR `c8.gcs.operator_command` ack <=2s, `anchor_search_region` centre shifts toward the hint, and no BAD_SIGNATURE / UNAUTHORIZED / REJECTED STATUSTEXT lands in the post-inject window (AC-6.2). Adds runner.helpers.gcs_telemetry_evaluator (rate, hint-ack correlation, haversine search-region shift, rejection scan) and sitl_observer.capture_gcs_tlog (parity surface to capture_ap_tlog). Pure-logic coverage: 39 new unit tests; full e2e/_unit_tests/ suite 746 passing (was 700). Scenarios skip locally on missing SITL replay fixture; production hooks (inbound STATUSTEXT parser, anchor_search_region FDR emitter) tracked outside this task. See _docs/03_implementation/batch_81_report.md + reviews/batch_81_review.md. Co-authored-by: Cursor <cursoragent@cursor.com>
Batch 81 Report — FT-P-12 + FT-P-13 GCS downsample + command path
Batch: 81
Date: 2026-05-17
Context: Test implementation (greenfield Step 10 — Implement Tests)
Tasks: AZ-420 (3 cp) — single scenario task covering FT-P-12 + FT-P-13
Cycle: 1
Verdict: COMPLETE — PASS_WITH_WARNINGS (self-reviewed; see reviews/batch_81_review.md)
Summary
Implements the GCS-leg blackbox scenarios under epic AZ-262:
- FT-P-12: SUT→GCS summary stream cadence (AC-6.1). The C8 QgcTelemetryAdapter pairs GLOBAL_POSITION_INT + NAMED_VALUE_FLOAT at the configured summary_rate_hz; the test parses the mavproxy-listener-captured tlog over a 60 s Derkachi replay and asserts the observed GLOBAL_POSITION_INT rate lands in [1, 2] Hz.
- FT-P-13: GCS-originated operator re-loc hint (AC-6.2). A STATUSTEXT carrying RELOC:<lat>,<lon>,<radius_m> is injected while the SUT is in dead_reckoned; the SUT must (a) acknowledge via an FDR c8.gcs.operator_command record within ≤ 2 s, (b) bias its next anchor_search_region toward the hint, (c) not reject the well-formed hint with a security/auth STATUSTEXT.
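As an illustration of the hint format described above, here is a minimal sketch of parsing a `RELOC:<lat>,<lon>,<radius_m>` payload. It is not the code in gcs_telemetry_evaluator.py: the function name `parse_reloc_payload_sketch` is hypothetical, and the latitude/longitude range checks and the positive-radius check are assumptions added for the example. The report itself only states that a malformed payload raises ValueError.

```python
import math
from typing import Tuple

HINT_PREFIX = "RELOC:"


def parse_reloc_payload_sketch(hint_text: str) -> Tuple[float, float, float]:
    """Parse 'RELOC:<lat>,<lon>,<radius_m>' into (lat_deg, lon_deg, radius_m)."""
    if not hint_text.startswith(HINT_PREFIX):
        raise ValueError(f"missing {HINT_PREFIX!r} prefix: {hint_text!r}")
    fields = hint_text[len(HINT_PREFIX):].split(",")
    if len(fields) != 3:
        raise ValueError(f"expected 3 comma-separated fields, got {len(fields)}")
    # float() itself raises ValueError on non-numeric text.
    lat_deg, lon_deg, radius_m = (float(field) for field in fields)
    # Assumed sanity checks (not confirmed by the report); NaN fails both.
    if not (-90.0 <= lat_deg <= 90.0) or not (-180.0 <= lon_deg <= 180.0):
        raise ValueError(f"lat/lon out of range: {lat_deg}, {lon_deg}")
    if not (math.isfinite(radius_m) and radius_m > 0.0):
        raise ValueError(f"radius must be finite and positive: {radius_m}")
    return lat_deg, lon_deg, radius_m
```

A well-formed hint such as `"RELOC:50.1,36.2,250"` yields a three-float tuple, while a missing prefix, a wrong field count, or non-numeric text raises ValueError.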
AZ-420 — FT-P-12 + FT-P-13 (3 cp)
- e2e/runner/helpers/gcs_telemetry_evaluator.py (new, 430 lines): pure-logic evaluators sourced from the GCS tlog + FDR archive.
  - compute_gcs_summary_rate(messages, *, position_msg_type, ...) → GcsSummaryRateReport(observed_rate_hz, passes, ...) (AC-6.1).
  - extract_inbound_hints(messages, *, hint_prefix='RELOC:') → list[InboundHint]: tlog→DTO adapter.
  - parse_reloc_payload(hint_text) → (lat_deg, lon_deg, radius_m).
  - correlate_hint_acks(hints, acks) → HintAckReport (AC-2). Greedy injection-order pairing; each ack matches at most one hint.
  - evaluate_search_region_shift(regions, hint_inject_us, lat, lon) → SearchRegionShiftReport (AC-3). Compares last pre-hint region centre to first post-hint region centre via haversine distance.
  - haversine_distance_m(lat_a, lon_a, lat_b, lon_b): great-circle distance, mean Earth radius. Sub-100 km accuracy ≪ 1 m.
  - detect_hint_rejection(messages, inject_us, *, window_us=2e6) → HintRejectionReport (AC-4). Scans STATUSTEXT in the post-inject window for BAD_SIGNATURE / UNAUTHORIZED / REJECTED tokens.
  - collect_messages_to_list(messages): convenience for the "parse once, run N analyzers" pattern (mirrors ap_contract_evaluator).
- e2e/runner/helpers/sitl_observer.py (edited, +25 lines): adds capture_gcs_tlog(host, duration_s) -> Path mirroring capture_ap_tlog. Loads the FDR-replay fixture at ${E2E_SITL_REPLAY_DIR}/gcs_tlog_<host>.tlog. Raises RuntimeError on missing env / missing fixture / non-positive duration.
- e2e/tests/positive/test_ft_p_12_gcs_downsample.py (new, 110 lines): full FT-P-12 scenario. Skips when sitl_replay_ready is False (no SITL fixture). Parametric across (fc_adapter, vio_strategy) via conftest. traces_to(AC-6.1, AC-1, AC-5).
- e2e/tests/positive/test_ft_p_13_gcs_command.py (new, 211 lines): full FT-P-13 scenario. Walks the FDR archive for c8.gcs.operator_command ack records + anchor_search_region per-frame records. Skips on missing fixture; fails loudly on empty hint list / empty FDR archive so the test cannot silently green-light an unimplemented production path. traces_to(AC-6.2, AC-2, AC-3, AC-4, AC-5).
- e2e/_unit_tests/helpers/test_gcs_telemetry_evaluator.py (new, 39 tests): pure-logic coverage for every evaluator + adapter. Boundary cases include 1.0 / 2.0 Hz inclusive, ack-before-hint ignored, latency exactly at 2 000 ms, no pre-hint region, equal distance non-strict, BAD_SIGNATURE / UNAUTHORIZED / REJECTED token detection, malformed RELOC: payload raises ValueError.
- e2e/_unit_tests/helpers/test_sitl_observer.py (edited, +4 tests): capture_gcs_tlog happy path + missing env + missing fixture + zero/negative duration. Mirrors the existing capture_ap_tlog test block.
- e2e/_unit_tests/test_directory_layout.py (edited): registers runner/helpers/gcs_telemetry_evaluator.py, tests/positive/test_ft_p_12_gcs_downsample.py, tests/positive/test_ft_p_13_gcs_command.py.
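To make the AC-6.1 rate check concrete, the following is a rough sketch of how a summary-rate evaluator could work. It is not the real compute_gcs_summary_rate: the names `RateReport` and `summary_rate_sketch` are hypothetical, the `(timestamp_us, msg_type)` tuple shape is an assumption about the parsed tlog, and the estimator `(n - 1) / span` is one plausible choice among several (a count-over-window estimator would also be defensible). The bounds are inclusive, matching the 1.0 / 2.0 Hz boundary cases listed above.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class RateReport:
    message_count: int
    observed_rate_hz: float
    passes: bool


def summary_rate_sketch(
    messages: Iterable[Tuple[int, str]],
    *,
    position_msg_type: str = "GLOBAL_POSITION_INT",
    min_hz: float = 1.0,
    max_hz: float = 2.0,
) -> RateReport:
    """Estimate the cadence of one message type from (timestamp_us, type) pairs."""
    # Only the position message counts; the paired companion message is ignored.
    stamps = sorted(t for t, msg_type in messages if msg_type == position_msg_type)
    if len(stamps) < 2 or stamps[-1] == stamps[0]:
        # Degenerate input: no measurable interval, so the check cannot pass.
        return RateReport(len(stamps), 0.0, False)
    span_s = (stamps[-1] - stamps[0]) / 1e6
    rate_hz = (len(stamps) - 1) / span_s  # intervals per second
    return RateReport(len(stamps), rate_hz, min_hz <= rate_hz <= max_hz)
```

Feeding it 61 position messages spaced exactly one second apart gives 60 intervals over 60 s, i.e. 1.0 Hz, which sits on the inclusive lower bound and passes.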
Tests
Full e2e/_unit_tests/ suite: 746 passed in 147.57 s (baseline
700 → +46 net). Run via python -m pytest e2e/_unit_tests/ from
the workspace root. No flakes, no skips outside the pre-existing
intentional skips.
Collection check on the two new scenario tests (pytest --collect-only e2e/tests/positive/test_ft_p_12_gcs_downsample.py e2e/tests/positive/test_ft_p_13_gcs_command.py): 12 items collected
(2 tests × 6 (fc_adapter, vio_strategy) combinations each).
The scenarios skip locally because E2E_SITL_REPLAY_DIR is unset —
which is the intended docker-vs-host boundary; they run inside the
docker-compose SITL replay harness.
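The fixture lookup behind this boundary can be pictured with a short sketch. This is not the code in sitl_observer.py: the name `capture_gcs_tlog_sketch` is hypothetical and the order of the checks is an assumption. Only the path pattern `${E2E_SITL_REPLAY_DIR}/gcs_tlog_<host>.tlog` and the three RuntimeError conditions come from this report.

```python
import os
from pathlib import Path


def capture_gcs_tlog_sketch(host: str, duration_s: float) -> Path:
    """Resolve the replayed GCS tlog fixture for `host`, or raise RuntimeError."""
    if duration_s <= 0:
        raise RuntimeError(f"duration_s must be positive, got {duration_s}")
    replay_dir = os.environ.get("E2E_SITL_REPLAY_DIR")
    if not replay_dir:
        # On a bare host the variable is unset; scenarios skip before reaching here.
        raise RuntimeError("E2E_SITL_REPLAY_DIR is not set")
    fixture = Path(replay_dir) / f"gcs_tlog_{host}.tlog"
    if not fixture.is_file():
        raise RuntimeError(f"missing GCS tlog fixture: {fixture}")
    return fixture
```

Raising instead of returning an empty path keeps a misconfigured harness from looking like an empty but valid capture.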
Per-area test counts (this batch):
| File | Tests added |
|---|---|
| test_gcs_telemetry_evaluator.py (new) | 39 |
| test_sitl_observer.py (edited) | 4 |
| test_directory_layout.py (edited) | 3 (path entries) |
| test_no_sut_imports.py (no edit; broader walk) | implicit +1 module covered |
| Total | +46 |
Acceptance Criteria Verification
| AC | Status | Evidence |
|---|---|---|
| AC-1: GCS rate ∈ [1, 2] Hz over 60 s window | ✓ | test_ft_p_12_gcs_downsample + 10 compute_gcs_summary_rate unit tests (boundary, degeneracy, custom bounds) |
| AC-2: FDR ack ≤ 2 s after inject | ✓ | test_ft_p_13_gcs_command + 6 correlate_hint_acks unit tests |
| AC-3: anchor_search_region shifts toward hint | ✓ | test_ft_p_13_gcs_command + 5 evaluate_search_region_shift + 3 haversine_distance_m unit tests |
| AC-4: No security/auth rejection in window | ✓ | test_ft_p_13_gcs_command + 7 detect_hint_rejection unit tests |
| AC-5: Parameterised per (fc_adapter, vio_strategy) | ✓ | pytest --collect-only shows 6 param IDs per scenario |
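The AC-2 check rests on pairing each injected hint with an acknowledgement. Below is a minimal sketch of greedy injection-order pairing, written under assumptions: hints and acks are reduced to bare microsecond timestamps, and the names `HintAckPair` and `correlate_sketch` are hypothetical rather than the evaluator's actual types. It mirrors the behaviours named in this report: acks before a hint are ignored, each ack is used at most once, the 2 000 ms limit is inclusive, and an empty hint list does not pass.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class HintAckPair:
    hint_us: int
    ack_us: Optional[int]  # None when no ack followed the hint

    @property
    def latency_ms(self) -> Optional[float]:
        if self.ack_us is None:
            return None
        return (self.ack_us - self.hint_us) / 1000.0


def correlate_sketch(
    hint_times_us: Sequence[int],
    ack_times_us: Sequence[int],
    *,
    max_latency_ms: float = 2000.0,
) -> Tuple[List[HintAckPair], bool]:
    """Pair hints with acks greedily, in injection order."""
    remaining = sorted(ack_times_us)
    pairs: List[HintAckPair] = []
    for hint_us in sorted(hint_times_us):
        # Earliest unused ack at or after the hint; earlier acks never match.
        match = next((ack for ack in remaining if ack >= hint_us), None)
        if match is not None:
            remaining.remove(match)  # each ack matches at most one hint
        pairs.append(HintAckPair(hint_us, match))
    passes = bool(pairs) and all(
        p.latency_ms is not None and p.latency_ms <= max_latency_ms for p in pairs
    )
    return pairs, passes
```

With one hint at 1 000 000 µs and acks at 500 000 µs and 3 000 000 µs, the early ack is skipped and the later one gives a latency of exactly 2 000 ms, which passes because the bound is inclusive.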
Code Review Verdict
PASS_WITH_WARNINGS (no Critical, no High; 2 Low notes — see
reviews/batch_81_review.md).
Auto-Fix Attempts
0 (no auto-fix-eligible findings).
Stuck Agents
None.
Notable Decisions
- HintAckReport.passes returns False for empty hints. The scenario test pre-checks if not hints: pytest.fail(...) before calling correlate_hint_acks, so the evaluator never observes an empty list in practice. The conservative semantic stays in place ("no hints" is a misuse of the correlator, not a trivial pass), and the explicit failure is pushed upstream, where the contextual error message ("the fixture builder must inject at least one operator re-loc hint") is more useful.
- AC-3's passes rejects a non-strict shift. A region exactly equidistant before/after the inject is treated as "not biased" (distance_after_m < distance_before_m is strict). This matches the spec wording "shifts toward the hinted location": zero movement is not a shift. Documented in SearchRegionShiftReport.passes.
- Counted GLOBAL_POSITION_INT only for AC-6.1, not the NAMED_VALUE_FLOAT companion. The QGC adapter pairs them, so counting both would double-count. The position message is the contract-relevant half; the NAMED_VALUE_FLOAT carries the decorative horizontal-uncertainty annotation.
- Tests are shaped to fail loudly when the upstream production hooks are missing. AC-2 requires the C8 adapter to translate an inbound STATUSTEXT into an FDR c8.gcs.operator_command record; AC-3 requires the C2 backbone to emit anchor_search_region FDR records. Both are deferred work outside AZ-420's scope. The scenario tests skip cleanly when no fixture is present (sitl_replay_ready=False) and fail with a specific error when the fixture exists but lacks the expected hint or ack records. This is the "tests as gates" pattern called out in the implement skill.
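The strict comparison behind AC-3 can be sketched as follows. This is an illustration under assumptions, not the evaluator's code: `shifted_toward_hint_sketch` is a hypothetical name, region centres are reduced to `(lat_deg, lon_deg)` tuples, and the mean Earth radius constant is the commonly used 6 371 008.8 m value, which the report does not specify.

```python
import math
from typing import Tuple

EARTH_MEAN_RADIUS_M = 6_371_008.8  # assumed constant; not stated in the report

LatLon = Tuple[float, float]


def haversine_distance_m(lat_a: float, lon_a: float, lat_b: float, lon_b: float) -> float:
    """Great-circle distance in metres on a sphere of mean Earth radius."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    d_phi = phi_b - phi_a
    d_lambda = math.radians(lon_b - lon_a)
    h = math.sin(d_phi / 2) ** 2 + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lambda / 2) ** 2
    # Clamp guards against rounding pushing h marginally above 1.
    return 2 * EARTH_MEAN_RADIUS_M * math.asin(math.sqrt(min(1.0, h)))


def shifted_toward_hint_sketch(before: LatLon, after: LatLon, hint: LatLon) -> bool:
    """True only if the post-hint centre is strictly closer to the hint."""
    distance_before_m = haversine_distance_m(*before, *hint)
    distance_after_m = haversine_distance_m(*after, *hint)
    return distance_after_m < distance_before_m  # equal distance is not a shift
```

An unchanged centre returns False, so a producer that ignores the hint entirely cannot satisfy the check by accident.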
Production Dependencies (forward-look)
FT-P-13 transitively depends on:
- Inbound STATUSTEXT command parser in c8_fc_adapter/mavlink_gcs_adapter.py. Currently the adapter emits but does not consume STATUSTEXT. The C12 MavlinkOperatorCommandTransport concrete impl is a Protocol-only stub.
- anchor_search_region FDR record emitted by the C2 backbone per nav-camera frame. The FDR schema (AC-NEW-3 family) reserves the slot but no producer wires it.
These gaps are surfaced (not silently absorbed) by the scenario tests when the fixture builder produces a tlog without the corresponding FDR records. They will be picked up by future production implementation tasks; AZ-420 owns the test surface only.
Out of Scope (deferred)
- Spoofed-GPS escalation STATUSTEXT path — owned by FT-N-04 (AZ-426).
- Operator-reloc-request emission negative-path — owned by FT-N-03 (AZ-425).
- The fixture builder's actual gcs_tlog_<host>.tlog synthesis (with RELOC: injection + corresponding FDR c8.gcs.operator_command ack + anchor_search_region records), owned by AZ-595.
Next Batch
Batch 82 candidates from _docs/02_tasks/todo/ (21 tasks remaining):
AZ-421 (FT-P-14), AZ-422 (FT-P-15), AZ-423 (FT-N-01), AZ-424
(FT-N-02). Topo-order leader is AZ-421. Pick at next /autodev
invocation per implement-skill rules (≤ 4 tasks, ≤ 20 cp).