Oleksandr Bezdieniezhnykh bb744d9078 [AZ-420] Batch 81: FT-P-12 + FT-P-13 GCS scenarios
FT-P-12: parse mavproxy-listener tlog over a 60 s Derkachi replay and
assert SUT->GCS GLOBAL_POSITION_INT cadence lands in [1, 2] Hz (AC-6.1).

FT-P-13: inject `RELOC:<lat>,<lon>,<radius_m>` STATUSTEXT while the SUT
is in dead_reckoned; verify FDR `c8.gcs.operator_command` ack <=2s,
`anchor_search_region` centre shifts toward the hint, and no
BAD_SIGNATURE / UNAUTHORIZED / REJECTED STATUSTEXT lands in the
post-inject window (AC-6.2).

Adds runner.helpers.gcs_telemetry_evaluator (rate, hint-ack correlation,
haversine search-region shift, rejection scan) and
sitl_observer.capture_gcs_tlog (parity surface to capture_ap_tlog).
Pure-logic coverage: 39 new unit tests; full e2e/_unit_tests/ suite
746 passing (was 700). Scenarios skip locally on missing SITL replay
fixture; production hooks (inbound STATUSTEXT parser, anchor_search_region
FDR emitter) tracked outside this task.

See _docs/03_implementation/batch_81_report.md +
reviews/batch_81_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-17 14:46:08 +03:00


Batch 81 Report — FT-P-12 + FT-P-13 GCS downsample + command path

Batch: 81
Date: 2026-05-17
Context: Test implementation (greenfield Step 10: Implement Tests)
Tasks: AZ-420 (3 cp), a single scenario task covering FT-P-12 + FT-P-13
Cycle: 1
Verdict: COMPLETE (PASS_WITH_WARNINGS; self-reviewed, see reviews/batch_81_review.md)

Summary

Implements the GCS-leg blackbox scenarios under epic AZ-262:

  • FT-P-12: SUT→GCS summary stream cadence (AC-6.1). The C8 QgcTelemetryAdapter pairs GLOBAL_POSITION_INT + NAMED_VALUE_FLOAT at the configured summary_rate_hz; the test parses the tlog captured by the mavproxy listener over a 60 s Derkachi replay and asserts that the observed GLOBAL_POSITION_INT rate lands in [1, 2] Hz.
  • FT-P-13: GCS-originated operator re-loc hint (AC-6.2). A STATUSTEXT carrying RELOC:<lat>,<lon>,<radius_m> is injected while the SUT is in dead_reckoned; the SUT must (a) acknowledge via an FDR c8.gcs.operator_command record within 2 s, (b) bias its next anchor_search_region toward the hint, and (c) not reject the well-formed hint with a security/auth STATUSTEXT.
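The cadence check in FT-P-12 can be illustrated with a minimal sketch. It is not the code in gcs_telemetry_evaluator.py: the names `RateReport` and `summary_rate`, the `(msg_type, timestamp_us)` input shape, and the interval-based rate formula are all assumptions made for illustration; only the message type and the inclusive [1, 2] Hz bounds come from this report.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class RateReport:
    """Hypothetical stand-in for GcsSummaryRateReport."""
    observed_rate_hz: float
    sample_count: int
    passes: bool


def summary_rate(
    messages: Iterable[Tuple[str, int]],
    *,
    position_msg_type: str = "GLOBAL_POSITION_INT",
    min_hz: float = 1.0,
    max_hz: float = 2.0,
) -> RateReport:
    """Estimate cadence from (msg_type, timestamp_us) pairs.

    Assumed formula: (n - 1) intervals divided by the span between the
    first and last matching message. Companion messages are ignored.
    """
    stamps = sorted(t for kind, t in messages if kind == position_msg_type)
    if len(stamps) < 2 or stamps[-1] == stamps[0]:
        # Degenerate capture: no measurable rate, so the check cannot pass.
        return RateReport(0.0, len(stamps), False)
    span_s = (stamps[-1] - stamps[0]) / 1_000_000
    rate = (len(stamps) - 1) / span_s
    # Bounds are inclusive, matching the 1.0 / 2.0 Hz boundary cases.
    return RateReport(rate, len(stamps), min_hz <= rate <= max_hz)
```

Counting intervals rather than messages keeps a capture of exactly 61 samples over 60 s at 1.0 Hz instead of slightly above it, which matters when the bounds are inclusive.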

AZ-420 — FT-P-12 + FT-P-13 (3 cp)

  • e2e/runner/helpers/gcs_telemetry_evaluator.py (new, 430 lines): pure-logic evaluators sourced from the GCS tlog + FDR archive.

    • compute_gcs_summary_rate(messages, *, position_msg_type, ...) → GcsSummaryRateReport(observed_rate_hz, passes, ...) (AC-6.1).
    • extract_inbound_hints(messages, *, hint_prefix='RELOC:') → list[InboundHint] (tlog→DTO adapter).
    • parse_reloc_payload(hint_text) → (lat_deg, lon_deg, radius_m).
    • correlate_hint_acks(hints, acks) → HintAckReport (AC-2). Greedy injection-order pairing; each ack matches at most one hint.
    • evaluate_search_region_shift(regions, hint_inject_us, lat, lon) → SearchRegionShiftReport (AC-3). Compares the last pre-hint region centre to the first post-hint region centre via haversine distance.
    • haversine_distance_m(lat_a, lon_a, lat_b, lon_b) — great-circle distance, mean Earth radius. Sub-100 km accuracy ≪ 1 m.
    • detect_hint_rejection(messages, inject_us, *, window_us=2e6) → HintRejectionReport (AC-4). Scans STATUSTEXT in the post-inject window for BAD_SIGNATURE / UNAUTHORIZED / REJECTED tokens.
    • collect_messages_to_list(messages) — convenience for the "parse once, run N analyzers" pattern (mirrors ap_contract_evaluator).
  • e2e/runner/helpers/sitl_observer.py (edited, +25 lines): adds capture_gcs_tlog(host, duration_s) -> Path mirroring capture_ap_tlog. Loads the FDR-replay fixture at ${E2E_SITL_REPLAY_DIR}/gcs_tlog_<host>.tlog. Raises RuntimeError on missing env / missing fixture / non-positive duration.

  • e2e/tests/positive/test_ft_p_12_gcs_downsample.py (new, 110 lines): full FT-P-12 scenario. Skips when sitl_replay_ready is False (no SITL fixture). Parametric across (fc_adapter, vio_strategy) via conftest. traces_to(AC-6.1,AC-1,AC-5).

  • e2e/tests/positive/test_ft_p_13_gcs_command.py (new, 211 lines): full FT-P-13 scenario. Walks the FDR archive for c8.gcs.operator_command ack records + anchor_search_region per-frame records. Skips on missing fixture; fails loudly on empty hint list / empty FDR archive so the test cannot silently green-light an unimplemented production path. traces_to(AC-6.2,AC-2,AC-3,AC-4,AC-5).

  • e2e/_unit_tests/helpers/test_gcs_telemetry_evaluator.py (new, 39 tests): pure-logic coverage for every evaluator + adapter. Boundary cases include 1.0 / 2.0 Hz inclusive, ack-before-hint ignored, latency exactly at 2 000 ms, no pre-hint region, equal distance (not a shift), BAD_SIGNATURE / UNAUTHORIZED / REJECTED token detection, malformed RELOC: payload raises ValueError.

  • e2e/_unit_tests/helpers/test_sitl_observer.py (edited, +4 tests): capture_gcs_tlog happy path + missing env + missing fixture + zero/negative duration. Mirrors the existing capture_ap_tlog test block.

  • e2e/_unit_tests/test_directory_layout.py (edited): registers runner/helpers/gcs_telemetry_evaluator.py, tests/positive/test_ft_p_12_gcs_downsample.py, tests/positive/test_ft_p_13_gcs_command.py.
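As an illustration of the hint parsing and rejection scan listed above, here is a small sketch. The function names `parse_reloc` and `find_rejections`, the `(timestamp_us, text)` input shape, the range checks, and the inclusive window bounds are assumptions; the report only states the `RELOC:<lat>,<lon>,<radius_m>` format, the ValueError on malformed payloads, the 2 s default window, and the three rejection tokens.

```python
from typing import Iterable, List, Tuple

REJECTION_TOKENS = ("BAD_SIGNATURE", "UNAUTHORIZED", "REJECTED")


def parse_reloc(hint_text: str, prefix: str = "RELOC:") -> Tuple[float, float, float]:
    """Parse 'RELOC:<lat>,<lon>,<radius_m>' into (lat_deg, lon_deg, radius_m)."""
    if not hint_text.startswith(prefix):
        raise ValueError(f"missing {prefix!r} prefix: {hint_text!r}")
    parts = hint_text[len(prefix):].split(",")
    if len(parts) != 3:
        raise ValueError(f"expected 3 comma-separated fields: {hint_text!r}")
    # float() itself raises ValueError on non-numeric fields.
    lat, lon, radius = (float(p) for p in parts)
    # Assumed sanity ranges; NaN fails every comparison and is rejected too.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0 and radius > 0.0):
        raise ValueError(f"out-of-range RELOC payload: {hint_text!r}")
    return lat, lon, radius


def find_rejections(
    statustexts: Iterable[Tuple[int, str]],
    inject_us: int,
    window_us: int = 2_000_000,
) -> List[str]:
    """Return STATUSTEXT bodies in [inject_us, inject_us + window_us]
    that contain any rejection token. An empty list means no rejection."""
    end_us = inject_us + window_us
    return [
        text
        for stamp_us, text in statustexts
        if inject_us <= stamp_us <= end_us
        and any(token in text for token in REJECTION_TOKENS)
    ]
```

Messages before the injection are deliberately ignored, so an unrelated earlier rejection does not fail the hint under test.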

Tests

Full e2e/_unit_tests/ suite: 746 passed in 147.57 s (baseline 700 → +46 net). Run via python -m pytest e2e/_unit_tests/ from the workspace root. No flakes, no skips outside the pre-existing intentional skips.

Collection check on the two new scenario tests (pytest --collect-only e2e/tests/positive/test_ft_p_12_gcs_downsample.py e2e/tests/positive/test_ft_p_13_gcs_command.py): 12 items collected (2 tests × 6 (fc_adapter, vio_strategy) combinations each). The scenarios skip locally because E2E_SITL_REPLAY_DIR is unset — which is the intended docker-vs-host boundary; they run inside the docker-compose SITL replay harness.
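The fixture lookup behind that skip boundary might look roughly like the sketch below. `resolve_gcs_tlog` is a hypothetical name, not the real capture_gcs_tlog; the environment variable, the gcs_tlog_<host>.tlog file name, and the RuntimeError cases are taken from this report, while the order of the checks and the error messages are assumptions.

```python
import os
from pathlib import Path


def resolve_gcs_tlog(host: str, duration_s: float) -> Path:
    """Locate the replay fixture for `host`, failing loudly when unusable."""
    if duration_s <= 0:
        raise RuntimeError(f"duration_s must be positive, got {duration_s}")
    replay_dir = os.environ.get("E2E_SITL_REPLAY_DIR")
    if not replay_dir:
        raise RuntimeError("E2E_SITL_REPLAY_DIR is not set")
    fixture = Path(replay_dir) / f"gcs_tlog_{host}.tlog"
    if not fixture.is_file():
        raise RuntimeError(f"missing GCS tlog fixture: {fixture}")
    return fixture
```

A scenario can check readiness first and skip when the variable is unset, so a RuntimeError from this helper always indicates a misconfigured harness rather than a host-only run.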

Per-area test counts (this batch):

| File | Tests added |
|---|---|
| test_gcs_telemetry_evaluator.py (new) | 39 |
| test_sitl_observer.py (edited) | 4 |
| test_directory_layout.py (edited) | 3 (path entries) |
| test_no_sut_imports.py (no edit; broader walk) | implicit +1 module covered |
| Total | +46 |

Acceptance Criteria Verification

| AC | Status | Evidence |
|---|---|---|
| AC-1: GCS rate ∈ [1, 2] Hz over 60 s window | | test_ft_p_12_gcs_downsample + 10 compute_gcs_summary_rate unit tests (boundary, degeneracy, custom bounds) |
| AC-2: FDR ack ≤ 2 s after inject | | test_ft_p_13_gcs_command + 6 correlate_hint_acks unit tests |
| AC-3: anchor_search_region shifts toward hint | | test_ft_p_13_gcs_command + 5 evaluate_search_region_shift + 3 haversine_distance_m unit tests |
| AC-4: No security/auth rejection in window | | test_ft_p_13_gcs_command + 7 detect_hint_rejection unit tests |
| AC-5: Parameterised per (fc_adapter, vio_strategy) | | pytest --collect-only shows 6 param IDs per scenario |
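The AC-2 check relies on pairing each hint with an ack. A minimal sketch of greedy injection-order pairing follows, assuming hints and acks are reduced to microsecond timestamps; `Pairing` and `correlate` are hypothetical names and the real HintAckReport may carry more fields. The rules it encodes (acks before a hint are ignored, each ack is used at most once, latency exactly at 2 000 ms passes, an empty hint list does not pass) are the ones stated in this report.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Pairing:
    hint_us: int
    ack_us: Optional[int]  # None when no ack was found for this hint

    @property
    def latency_ms(self) -> Optional[float]:
        if self.ack_us is None:
            return None
        return (self.ack_us - self.hint_us) / 1000.0


def correlate(
    hint_times_us: Sequence[int],
    ack_times_us: Sequence[int],
    max_latency_ms: float = 2000.0,
) -> Tuple[List[Pairing], bool]:
    """Pair each hint, in injection order, with the earliest unused ack
    that is not older than the hint."""
    acks = sorted(ack_times_us)
    next_free = 0
    pairings: List[Pairing] = []
    for hint_us in sorted(hint_times_us):
        # Acks that precede this hint cannot acknowledge it (or any later hint).
        while next_free < len(acks) and acks[next_free] < hint_us:
            next_free += 1
        if next_free < len(acks):
            pairings.append(Pairing(hint_us, acks[next_free]))
            next_free += 1  # each ack matches at most one hint
        else:
            pairings.append(Pairing(hint_us, None))
    passes = bool(pairings) and all(
        p.latency_ms is not None and p.latency_ms <= max_latency_ms
        for p in pairings
    )
    return pairings, passes
```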

Code Review Verdict

PASS_WITH_WARNINGS (no Critical, no High; 2 Low notes — see reviews/batch_81_review.md).

Auto-Fix Attempts

0 (no auto-fix-eligible findings).

Stuck Agents

None.

Notable Decisions

  • HintAckReport.passes returns False for empty hints. The scenario test pre-checks if not hints: pytest.fail(...) before calling correlate_hint_acks, so the evaluator never observes an empty list in practice. The conservative semantics stay in place: "no hints" is a misuse of the correlator, not a trivial pass. The explicit failure is pushed upstream, where the contextual error message ("the fixture builder must inject at least one operator re-loc hint") is more useful.
  • AC-3's passes requires a strict shift. A region exactly equidistant before and after the inject is treated as "not biased" (the comparison distance_after_m < distance_before_m is strict). This matches the spec wording "shifts toward the hinted location": zero movement is not a shift. Documented in SearchRegionShiftReport.passes.
  • Counted GLOBAL_POSITION_INT only for AC-6.1, not the NAMED_VALUE_FLOAT companion. The QGC adapter pairs them so counting both would double-count. The position message is the contract-relevant half; the NAMED_VALUE_FLOAT carries the decorative horizontal-uncertainty annotation.
  • Tests are shaped to fail loudly when the upstream production hooks are missing. AC-2 requires the C8 adapter to translate an inbound STATUSTEXT into an FDR c8.gcs.operator_command record; AC-3 requires the C2 backbone to emit anchor_search_region FDR records. Both are deferred work outside AZ-420's scope. The scenario tests skip cleanly when no fixture is present (sitl_replay_ready=False) and fail with a specific error when the fixture exists but lacks the expected hint or ack records. This is the "tests as gates" pattern called out in the implement skill.
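The strict-shift decision for AC-3 can be sketched as follows. This is an illustration under assumptions, not the evaluator itself: `haversine_m` and `region_shifts_toward_hint` are hypothetical names, regions are assumed to be `(timestamp_us, centre_lat, centre_lon)` tuples, a region stamped exactly at the inject time is assumed to count as post-hint, and the radius constant is the commonly cited mean Earth radius (the report does not state which value the code uses).

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_M = 6_371_008.8  # assumed mean Earth radius, metres


def haversine_m(lat_a: float, lon_a: float, lat_b: float, lon_b: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    d_phi = phi_b - phi_a
    d_lambda = math.radians(lon_b - lon_a)
    h = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lambda / 2) ** 2
    )
    # Clamp to 1.0 so rounding error cannot push asin out of its domain.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(h)))


def region_shifts_toward_hint(
    regions: Iterable[Tuple[int, float, float]],
    hint_inject_us: int,
    hint_lat: float,
    hint_lon: float,
) -> bool:
    """True only if the first post-hint centre is strictly closer to the
    hint than the last pre-hint centre."""
    ordered = sorted(regions)
    before = [r for r in ordered if r[0] < hint_inject_us]
    after = [r for r in ordered if r[0] >= hint_inject_us]
    if not before or not after:
        return False  # nothing to compare against
    _, lat_b, lon_b = before[-1]
    _, lat_a, lon_a = after[0]
    distance_before_m = haversine_m(lat_b, lon_b, hint_lat, hint_lon)
    distance_after_m = haversine_m(lat_a, lon_a, hint_lat, hint_lon)
    return distance_after_m < distance_before_m  # strict: equal is not a shift
```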

Production Dependencies (forward-look)

FT-P-13 transitively depends on:

  • Inbound STATUSTEXT command parser in c8_fc_adapter/mavlink_gcs_adapter.py. Currently the adapter emits but does not consume STATUSTEXT. The C12 MavlinkOperatorCommandTransport concrete impl is a Protocol-only stub.
  • anchor_search_region FDR record emitted by the C2 backbone per nav-camera frame. The FDR schema (AC-NEW-3 family) reserves the slot but no producer wires it.

These gaps are surfaced (not silently absorbed) by the scenario tests when the fixture builder produces a tlog or FDR archive without the corresponding hint, ack, or region records. They will be picked up by future production implementation tasks; AZ-420 owns the test surface only.

Out of Scope (deferred)

  • Spoofed-GPS escalation STATUSTEXT path — owned by FT-N-04 (AZ-426).
  • Operator-reloc-request emission negative-path — owned by FT-N-03 (AZ-425).
  • The fixture builder's actual gcs_tlog_<host>.tlog synthesis (with RELOC: injection + corresponding FDR c8.gcs.operator_command ack + anchor_search_region records) — owned by AZ-595.

Next Batch

Batch 82 candidates from _docs/02_tasks/todo/ (21 tasks remaining): AZ-421 (FT-P-14), AZ-422 (FT-P-15), AZ-423 (FT-N-01), AZ-424 (FT-N-02). Topo-order leader is AZ-421. Pick at next /autodev invocation per implement-skill rules (≤ 4 tasks, ≤ 20 cp).