[AZ-416] [AZ-417] [AZ-419] Test batch 72: FT-P-09 AP/iNav + FT-P-11 cold start

- AZ-416 (FT-P-09-AP): fills mavproxy_tlog_reader.iter_messages with pymavlink body (AZ-406 surface kept); adds ap_contract_evaluator covering AC-1 (signing handshake <=5s), AC-2 (GPS_INPUT >=4.5 Hz), AC-3 (EK3_SRC1_POSXY=3), AC-4 (GPS_RAW_INT health >=80%); scenario forces fc_adapter=ardupilot.
- AZ-417 (FT-P-09-iNav): msp_frame_observer covering AC-2 (MSP rate) and AC-3 (fix_type/provider/numSat); scenario forces fc_adapter=inav.
- AZ-419 (FT-P-11): cold_start_evaluator covering AC-1 (operator manifest origin), AC-2 (FC EKF fallback), AC-3 (no-origin abort), AC-4 (bounded-delta conflict, ADR-010 Principle #11 amended); scenario parametrized on origin_source plus dedicated no-origin abort scenario.
- All scenarios skip-gated on upstream frame_source_replay / imu_replay / fdr_reader / sitl_observer extensions.
- +67 unit tests; full e2e unit suite: 460 passed.
- K=3 cumulative review fired: PASS for batches 70-72.

See _docs/03_implementation/batch_72_report.md, _docs/03_implementation/reviews/batch_72_review.md, _docs/03_implementation/cumulative_review_batches_70-72_cycle1_report.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-21 22:51:14 +00:00 · 2026-05-17 07:49:17 +03:00
parent c6e6cba237
commit a644debdb7
19 changed files with 3041 additions and 9 deletions
@@ -0,0 +1,142 @@
+# Batch 72 Report — Test Implementation (cycle 1, batch 6 of test phase)
+
+**Batch**: 72
+**Date**: 2026-05-16
+**Context**: Test implementation (greenfield Step 10 — Implement Tests)
+**Tasks**: AZ-416 (5pt), AZ-417 (3pt), AZ-419 (3pt) — 11 cp / 3 tasks
+**Cycle**: 1
+**Verdict**: COMPLETE — PASS (self-reviewed + K=3 cumulative reviewed; see
+`reviews/batch_72_review.md` and `cumulative_review_batches_70-72_cycle1_report.md`)
+
+## Summary
+
+FC contract conformance + cold-start init — the three remaining
+scenarios that consume mavproxy / signing / cold-boot fixtures already
+built in batches 67-68. Same pattern as prior batches:
+
+* Pure-logic helper under `e2e/runner/helpers/` (everything the
+  scenario can express without docker-bound SITL access).
+* Scenario file(s) under `e2e/tests/positive/`, parameterized across
+  conftest fixtures, skip-gated on upstream replay / SITL observer
+  / FDR helpers (auto-activates when AZ-441 + AZ-407 leftovers land).
+* Helper-driven unit test file under `e2e/_unit_tests/helpers/`.
+
+### AZ-416 — FT-P-09-AP ArduPilot signing + GPS_INPUT contract (5pt)
+
+* **`runner/helpers/mavproxy_tlog_reader.py`** — AZ-416 fills in the
+  pymavlink-backed `iter_messages` body that AZ-406 reserved. Uses
+  `mavutil.mavlink_connection(str(tlog_path))` with `recv_match` to
+  iterate frames; exposes `TlogMessage(timestamp_us, msg_type, signed,
+  fields)`. The `signed` flag uses `msg.get_signed()` with a
+  defensive `AttributeError` fallback. The function is FAIL-FAST on
+  missing files (raises FileNotFoundError); pymavlink's BAD_DATA
+  frames are skipped silently per the standard idiom.
+* **`runner/helpers/ap_contract_evaluator.py`** provides four analysers plus one list-materialisation helper:
+  - `observe_signing_handshake` (AC-1): first signed frame within
+    `HANDSHAKE_BUDGET_S = 5.0` s AND no `BAD_SIGNATURE` STATUSTEXT
+    within that window.
+  - `compute_gps_input_rate` (AC-2): GPS_INPUT cadence ≥4.5 Hz
+    (constant `GPS_INPUT_MIN_RATE_HZ`).
+  - `validate_ek3_src1_posxy` (AC-3): the AP EKF source-set parameter
+    must equal `EK3_SRC1_POSXY_REQUIRED = 3` (GPS).
+  - `evaluate_gps_raw_int_health` (AC-4): GPS_RAW_INT
+    `fix_type ≥ 3 AND eph ≤ 200` for ≥80 % of the window.
+  - `collect_messages_to_list` — explicit single-pass-iterator
+    materialisation so multiple analysers can share the tlog.
+* **`tests/positive/test_ft_p_09_ap_signing.py`** — scenario forces
+  `fc_adapter=ardupilot` (skips other adapters), parameterised per
+  `vio_strategy`. Records `signing_handshake_s`,
+  `gps_input_rate_hz`, `ek3_src1_posxy`, `gps_raw_int_healthy_fraction`
+  NFR metrics with AC IDs.
+* **22 unit tests** in `test_ap_contract_evaluator.py` + **6** in
+  `test_mavproxy_tlog_reader.py`.
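The AC-1 rule described above (first signed frame inside the budget, no `BAD_SIGNATURE` STATUSTEXT in the window) can be sketched as follows. This is an illustrative reconstruction, not the committed helper: the `TlogMessage` stand-in mirrors the field names quoted in this report, the report fields follow the ones the unit tests assert on, and the loop body and `HandshakeReport` class name are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Mapping, Optional

HANDSHAKE_BUDGET_S = 5.0  # AC-1 budget quoted in the report


@dataclass(frozen=True)
class TlogMessage:
    """Stand-in for the reader's dataclass (field names taken from the report)."""
    timestamp_us: int
    msg_type: str
    signed: bool
    fields: Mapping[str, Any]


@dataclass(frozen=True)
class HandshakeReport:
    """Hypothetical report shape; only the attributes used by the tests."""
    window_start_us: int
    first_signed_us: Optional[int]
    setup_signing_seen: bool
    bad_signature_count: int

    @property
    def lag_s(self) -> Optional[float]:
        if self.first_signed_us is None:
            return None
        return (self.first_signed_us - self.window_start_us) / 1e6

    @property
    def passes(self) -> bool:
        # Pass only if a signed frame arrived AND no BAD_SIGNATURE was seen.
        return self.first_signed_us is not None and self.bad_signature_count == 0


def observe_signing_handshake(
    messages: Iterable[TlogMessage],
    handshake_window_us: int = int(HANDSHAKE_BUDGET_S * 1_000_000),
) -> HandshakeReport:
    if handshake_window_us <= 0:
        raise ValueError("handshake_window_us must be positive")
    start: Optional[int] = None
    first_signed: Optional[int] = None
    setup_seen = False
    bad = 0
    for msg in messages:
        if start is None:
            start = msg.timestamp_us  # window opens at the first frame
        if msg.timestamp_us - start > handshake_window_us:
            break  # frames after the budget are never recorded
        if msg.msg_type == "SETUP_SIGNING":
            setup_seen = True
        text = str(msg.fields.get("text", ""))
        if msg.msg_type == "STATUSTEXT" and "BAD_SIGNATURE" in text:
            bad += 1
        if msg.signed and first_signed is None:
            first_signed = msg.timestamp_us
    return HandshakeReport(0 if start is None else start, first_signed, setup_seen, bad)
```

Stopping at the window edge is what makes a signed frame at +6 s leave `first_signed_us` as `None`, which matches the behaviour the unit tests describe.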
+
+### AZ-417 — FT-P-09-iNav MSP2_SENSOR_GPS contract (3pt)
+
+* **`runner/helpers/msp_frame_observer.py`** — pure logic for AC-2
+  (`compute_rate_hz` with `MSP2_SENSOR_GPS_FUNCTION_ID = 0x1F03` +
+  `MIN_OBSERVED_RATE_HZ = 4.5`) and AC-3 (`evaluate_inav_gps_state`
+  with `MIN_FIX_TYPE = 3` and `REQUIRED_PROVIDER = "MSP"`).
+* **`tests/positive/test_ft_p_09_inav.py`** — scenario forces
+  `fc_adapter=inav` (skips other adapters), parameterised per
+  `vio_strategy`. Probes TCP handshake via
+  `sitl_observer.observe_inav_tcp_handshake` (gated), captures MSP
+  frames via `collect_inav_msp_frames` (gated), queries iNav GPS
+  state via `query_inav_gps_state` (gated).
+* **14 unit tests** in `test_msp_frame_observer.py`.
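A minimal sketch of the AC-2 rate check, assuming captured frames arrive as `(timestamp_us, function_id)` pairs. That input shape and the `RateReport` class are hypothetical; only the two constants come from this report. The rate is taken as intervals over elapsed time, so 301 frames spanning 60 s count as 5.0 Hz.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

MSP2_SENSOR_GPS_FUNCTION_ID = 0x1F03  # constant quoted in the report
MIN_OBSERVED_RATE_HZ = 4.5            # AC-2 floor for the 5 Hz target


@dataclass(frozen=True)
class RateReport:
    """Hypothetical report shape for the sketch."""
    frame_count: int
    observed_rate_hz: Optional[float]

    @property
    def passes(self) -> bool:
        return (
            self.observed_rate_hz is not None
            and self.observed_rate_hz >= MIN_OBSERVED_RATE_HZ
        )


def compute_rate_hz(frames: Iterable[Tuple[int, int]]) -> RateReport:
    # Keep only MSP2_SENSOR_GPS frames; other function IDs do not count.
    stamps = sorted(
        ts for ts, function_id in frames
        if function_id == MSP2_SENSOR_GPS_FUNCTION_ID
    )
    if len(stamps) < 2 or stamps[-1] == stamps[0]:
        return RateReport(len(stamps), None)  # no measurable interval
    span_s = (stamps[-1] - stamps[0]) / 1e6
    # N frames delimit N - 1 intervals.
    return RateReport(len(stamps), (len(stamps) - 1) / span_s)
```

Returning `None` for fewer than two frames keeps "no data" distinct from "slow data", and both fail `passes`.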
+
+### AZ-419 — FT-P-11 cold-start init (3pt)
+
+* **`runner/helpers/cold_start_evaluator.py`** — covers ADR-010's
+  primary + secondary + bounded-delta paths plus AC-3 no-origin
+  abort:
+  - `write_manifest` / `read_manifest` — test-fixture builder for the
+    C10 Manifest's `flight.takeoff_origin` (the test fabricates one
+    instead of fetching from C12 because the SUT consumes a Manifest
+    file path, not a service URL).
+  - `read_cold_boot_fixture` — parse the AZ-408 fixture JSON into a
+    typed `ColdBootSnapshot` (converts `lat_e7 / lon_e7 / alt_mm` →
+    decimal degrees + meters).
+  - `evaluate_first_estimate` (AC-1/2/4): distance vs expected origin
+    + source_label rule for bounded-delta + FDR record audit.
+  - `evaluate_no_origin_path` (AC-3): SUT must produce NO outbound
+    estimate AND FDR must record `c5.cold_start_origin.unavailable`.
+  - Constants for accuracy budget (50 m), bounded-delta trigger
+    (200 m), forbidden first-label (`satellite_anchored`), and the
+    three FDR record types.
+* **`tests/positive/test_ft_p_11_cold_start_init.py`** — two scenario
+  functions:
+  - `test_ft_p_11_cold_start_origin_variants` — parametrized on
+    `origin_source ∈ {operator_manifest, fc_ekf,
+    bounded_delta_conflict}`; one fixture / one assertion path per
+    variant.
+  - `test_ft_p_11_cold_start_no_origin_aborts` — AC-3 dedicated
+    scenario.
+  Both rely on `sitl_observer.prepare_sitl_cold_boot` +
+  `prepare_sitl_no_gps` (gated until AZ-407 leftovers land).
+* **19 unit tests** in `test_cold_start_evaluator.py`.
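The distance and label rules for AC-1/2/4 can be illustrated with a small sketch. Several things here are assumptions rather than the committed code: a haversine distance on a spherical Earth stands in for whatever `geo.py` provides, the conflict test uses a strict `>` against the 200 m trigger, the `FirstEstimateReport` shape is hypothetical, and the FDR record audit is left out entirely.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

ACCURACY_BUDGET_M = 50.0         # accuracy budget from the report
BOUNDED_DELTA_TRIGGER_M = 200.0  # bounded-delta trigger from the report
FORBIDDEN_FIRST_LABEL_BOUNDED_DELTA = "satellite_anchored"
EARTH_RADIUS_M = 6_371_000.0     # assumption: mean spherical radius

LatLon = Tuple[float, float]


def e7_to_degrees(value_e7: int) -> float:
    """lat_e7 / lon_e7 integers to decimal degrees."""
    return value_e7 / 1e7


def mm_to_metres(value_mm: int) -> float:
    """alt_mm integers to metres."""
    return value_mm / 1000.0


def haversine_m(a: LatLon, b: LatLon) -> float:
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


@dataclass(frozen=True)
class FirstEstimateReport:
    distance_m: float
    source_label: str
    bounded_delta_conflict: bool

    @property
    def passes_distance(self) -> bool:
        return self.distance_m <= ACCURACY_BUDGET_M

    @property
    def passes_label(self) -> bool:
        # The label rule only applies when operator and FC origins conflict.
        if not self.bounded_delta_conflict:
            return True
        return self.source_label != FORBIDDEN_FIRST_LABEL_BOUNDED_DELTA

    @property
    def passes(self) -> bool:
        return self.passes_distance and self.passes_label


def evaluate_first_estimate(
    estimate: LatLon,
    expected_origin: LatLon,
    source_label: str,
    fc_origin: Optional[LatLon] = None,
) -> FirstEstimateReport:
    conflict = (
        fc_origin is not None
        and haversine_m(expected_origin, fc_origin) > BOUNDED_DELTA_TRIGGER_M
    )
    return FirstEstimateReport(
        haversine_m(estimate, expected_origin), source_label, conflict
    )
```

In the bounded-delta variant the expected origin is the operator's, so an estimate that lands near the FC GPS position instead fails the distance check even before the label rule is consulted.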
+
+## Tests
+
+* **Full e2e unit suite**: 460 passed in 134.35 s (was 393 at end of
+  batch 71 → +67 net new tests this batch).
+* **Pre-existing**: macOS-only `/e2e-results` plugin issue in
+  scenario invocation outside Docker. Unit suite unaffected.
+
+## Files Touched
+
+**New helpers:**
+* `e2e/runner/helpers/msp_frame_observer.py`
+* `e2e/runner/helpers/ap_contract_evaluator.py`
+* `e2e/runner/helpers/cold_start_evaluator.py`
+
+**Modified helper:**
+* `e2e/runner/helpers/mavproxy_tlog_reader.py` — AZ-416 fills the
+  pymavlink-backed `iter_messages` body that AZ-406 reserved
+  (NotImplementedError → real iterator). Surface unchanged.
+
+**New unit tests:**
+* `e2e/_unit_tests/helpers/test_mavproxy_tlog_reader.py` (6 tests)
+* `e2e/_unit_tests/helpers/test_ap_contract_evaluator.py` (22 tests)
+* `e2e/_unit_tests/helpers/test_msp_frame_observer.py` (14 tests)
+* `e2e/_unit_tests/helpers/test_cold_start_evaluator.py` (19 tests)
+
+**New scenarios:**
+* `e2e/tests/positive/test_ft_p_09_ap_signing.py`
+* `e2e/tests/positive/test_ft_p_09_inav.py`
+* `e2e/tests/positive/test_ft_p_11_cold_start_init.py`
+
+**Updated:**
+* `e2e/_unit_tests/test_directory_layout.py` — added 6 new paths.
+
+**Archived:**
+* `_docs/02_tasks/todo/AZ-416_*.md` → `done/`
+* `_docs/02_tasks/todo/AZ-417_*.md` → `done/`
+* `_docs/02_tasks/todo/AZ-419_*.md` → `done/`
+
+## Cumulative Review Trigger
+
+K=3 FIRED at end of batch 72 (last cumulative covered batches 67-69;
+since then 70 + 71 + 72 = 3 batches). Report written:
+`_docs/03_implementation/cumulative_review_batches_70-72_cycle1_report.md`.
+Verdict: PASS. Next cumulative trigger: end of batch 75.
@@ -0,0 +1,194 @@
+# Cumulative Code Review Report — Batches 70–72 (cycle 1, test phase)
+
+**Date**: 2026-05-16
+**Mode**: cumulative
+**Scope**: union of files changed in batches 70, 71, 72 of cycle 1
+(the test-implementation phase batches that followed the
+`batches_67-69` cumulative review).
+**Verdict**: PASS
+
+## Batch coverage
+
+| Batch | Tasks | Theme |
+|-------|-------|-------|
+| 70 | AZ-409, AZ-412, AZ-413 | Still-image accuracy (FT-P-01), Derkachi frame-to-frame registration (FT-P-04), satellite anchor + MRE budgets (FT-P-05 + FT-P-06) |
+| 71 | AZ-414, AZ-415, AZ-418 | Sharp-turn recovery + failure twin (FT-P-07 + FT-N-02), multi-segment relocalisation (FT-P-08), GTSAM smoothing-loop look-back (FT-P-10) |
+| 72 | AZ-416, AZ-417, AZ-419 | ArduPilot GPS_INPUT contract + signing handshake (FT-P-09-AP), iNav MSP2_SENSOR_GPS contract (FT-P-09-iNav), cold-start initialization (FT-P-11 — 3 origin_source variants + no-origin abort) |
+
+Cycle 1 product implementation under `src/gps_denied_onboard/**` is
+out of scope; drift between product and test phases is checked by
+`test_no_sut_imports.py` (passing).
+
+## Phase 1 — Context Loading
+
+* Read `_docs/02_document/module-layout.md` § `blackbox_tests`.
+* Read `_docs/02_document/architecture.md` § layering.
+* Reviewed batch reports `batch_70_report.md`, `batch_71_report.md`,
+  `batch_72_report.md` (in-progress draft).
+* Reviewed task specs AZ-409, AZ-410 (prior), AZ-411 (prior), AZ-412,
+  AZ-413, AZ-414, AZ-415, AZ-416, AZ-417, AZ-418, AZ-419.
+* Cross-referenced the prior `cumulative_review_batches_67-69`
+  conclusions to verify the K=3 cumulative cadence is honoured.
+
+## Phase 2 — Spec Compliance
+
+Per-task AC coverage at the end of batch 72:
+
+| Task | Status |
+|------|--------|
+| AZ-409 (FT-P-01) | Helper + scenario + 20 unit tests; AC-1..AC-7 covered |
+| AZ-412 (FT-P-04) | Helper + scenario + 26 unit tests; AC-1..AC-5 covered |
+| AZ-413 (FT-P-05 + FT-P-06) | Helper + 2 scenarios + 22 unit tests; AC-1..AC-4 covered (FT-P-06 piggybacks on FT-P-04 + FT-P-05 evidence CSVs) |
+| AZ-414 (FT-P-07 + FT-N-02) | Helper + 2 scenarios + 30 unit tests; AC-1..AC-7 (FT-P-07) AND AC-1..AC-7 (FT-N-02) covered via the shared `sharp_turn_detector` helper |
+| AZ-415 (FT-P-08) | Helper + scenario + 16 unit tests; AC-1..AC-4 covered |
+| AZ-416 (FT-P-09-AP) | Helper + scenario + 22 unit tests (ap_contract_evaluator) + 6 unit tests (mavproxy_tlog_reader); AC-1..AC-5 + D-C8-9 covered |
+| AZ-417 (FT-P-09-iNav) | Helper + scenario + 14 unit tests; AC-1..AC-4 covered |
+| AZ-418 (FT-P-10) | Helper + scenario + 15 unit tests; AC-1..AC-3 covered |
+| AZ-419 (FT-P-11) | Helper + 2 scenarios + 19 unit tests; AC-1..AC-5 covered (3 origin_source parametrize variants + 1 no-origin abort scenario) |
+
+All scenarios are skip-gated on the AZ-441 / AZ-407 leftovers
+(`frame_source_replay`, `imu_replay`, `fdr_reader`, `sitl_observer`
+ext methods); pure-logic acceptance is fully covered in the
+`e2e/_unit_tests/helpers/` test files.
+
+## Phase 3 — Code Quality
+
+* **Single responsibility**: each helper owns ONE analytic concern:
+  - `accuracy_evaluator` — still-image Vincenty + pass-count rules
+  - `registration_classifier` — IMU-derived attitude + normal-segment
+    classification + success ratio
+  - `mre_evaluator` — per-image cross-domain + 95th-percentile MRE
+  - `anchor_pair_detector` — drift binning + monotonicity
+  - `estimate_schema` — schema validation + WGS84 range + int32
+    decoding
+  - `sharp_turn_detector` — gyro_z run detection + during-turn label/cov
+    + recovery lag/drift/heading
+  - `multi_segment_evaluator` — multi-window relocalisation
+  - `smoothing_evaluator` — raw + smoothed pose pair + improvement rate
+  - `mavproxy_tlog_reader` — pymavlink tlog frame iteration
+  - `ap_contract_evaluator` — signing handshake + GPS_INPUT rate +
+    EK3 source-set + GPS_RAW_INT health
+  - `msp_frame_observer` — MSP rate + iNav GPS state evaluation
+  - `cold_start_evaluator` — Manifest build/read + cold-boot snapshot
+    parse + first-estimate / no-origin / bounded-delta evaluation
+* **No suppressed errors**: the only narrow `try`/`except` is in
+  `mavproxy_tlog_reader.iter_messages` for pymavlink's `BAD_DATA` +
+  per-message `to_dict` exceptions — documented in the docstring as
+  the standard pymavlink iteration idiom.
+* **AAA discipline**: all 460 unit tests use `# Arrange / # Act /
+  # Assert`.
+* **No narration comments** in any new module; docstrings carry
+  intent + AC mapping + Mode B Facts where relevant (Fact #107 in
+  `smoothing_evaluator`, Fact #109 noted in scenario docstrings of
+  AZ-416 + AZ-417, ADR-010 Principle #11 in `cold_start_evaluator`).
+
+## Phase 4 — Security
+
+* **`test_no_sut_imports.py` passes** — no e2e helper or test file
+  imports `src/gps_denied_onboard`.
+* **Signing channel observability**: AZ-416 helper observes signed
+  frames + BAD_SIGNATURE STATUSTEXT events without ever validating
+  the signature itself (that's pymavlink + AP-side wiring). The
+  scenario "Forbidden" list (no bypass to unsigned channel) is
+  honoured — `passes` returns False if any `BAD_SIGNATURE` STATUSTEXT
+  appears in the handshake window OR no signed frame arrives.
+* **Test passkey hygiene**: `test_passkey_files_match` (pre-existing)
+  still passes; AZ-416 scenario consumes the docker-secret fixture
+  only.
+* **No credentials in source**: confirmed by grep across all batch
+  72 added modules.
+
+## Phase 5 — Performance
+
+* Across all 12 helpers added in batches 70-72, every analyser is O(N)
+  over its input.
+* `mavproxy_tlog_reader` materialises to a list ONCE per scenario via
+  `ap_contract_evaluator.collect_messages_to_list` so multiple
+  analysers can share the result — the alternative (re-iterating the
+  generator) would re-open the pymavlink connection per analyser.
+* No nested CSV reads or repeated geodesic recomputations in any
+  helper across the three batches.
+
+## Phase 6 — Cross-Task Consistency
+
+Verified across all 9 tasks in the 70-72 window:
+
+* **Skip gate pattern**: every scenario uses an
+  `_*_harness_implemented` fixture that probes one or more
+  `NotImplementedError`-raising helpers and skips with a single,
+  spec-referenced message naming the upstream owner (AZ-441 / AZ-407)
+  and the pure-logic unit-test file that DOES cover the AC.
+* **Constants discipline**: every scenario assertion message
+  references the helper's exported constant by name (e.g.
+  `ace.HANDSHAKE_BUDGET_S`, `cse.BOUNDED_DELTA_TRIGGER_M`,
+  `std.MAX_RECOVERY_FRAMES_SAFETY_MS`), not magic numbers.
+* **Evidence emission**: every scenario emits per-scenario NFR metrics
+  via `nfr_recorder.record_metric(name, value, ac_id=…)`. Per-test CSV
+  artifacts use `write_csv_evidence(out, …)` returning the path —
+  same idiom in `accuracy_evaluator`, `mre_evaluator`,
+  `multi_segment_evaluator`, `smoothing_evaluator`,
+  `sharp_turn_detector`.
+* **Trace markers**: every scenario uses `@pytest.mark.traces_to(...)`
+  with comma-separated AC IDs, matching the
+  `monorepo-document`-owned traceability format used by batches 67-69.
+* **Helper return shape**: every analyser returns a frozen
+  `@dataclass` with a `passes` (or `passes_distance`, `passes_rate`,
+  etc.) property — so the scenario assertion is one boolean check
+  with a structured-data message.
+* **No drift in shared types**: `TlogMessage` (AZ-406 surface, AZ-416
+  body) used identically across `mavproxy_tlog_reader.count_by_type`
+  and `ap_contract_evaluator.*` analysers.
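The skip-gate pattern described above can be sketched without pytest as a probe that reports the first helper still raising `NotImplementedError`. The function names, the probe signature, and the message wording are hypothetical; in a real `_*_harness_implemented` fixture the returned reason would presumably be handed to `pytest.skip(reason)`.

```python
from typing import Callable, Iterable, Optional


def first_unimplemented(probes: Iterable[Callable[[], object]]) -> Optional[str]:
    """Return the name of the first probe that raises NotImplementedError."""
    for probe in probes:
        try:
            probe()
        except NotImplementedError:
            return getattr(probe, "__name__", repr(probe))
    return None


def skip_reason(
    probes: Iterable[Callable[[], object]],
    upstream_owner: str,
    unit_test_file: str,
) -> Optional[str]:
    """Build a single skip message, or None when the harness is ready."""
    missing = first_unimplemented(probes)
    if missing is None:
        return None
    return (
        f"{missing} is not implemented yet (owner: {upstream_owner}); "
        f"pure-logic ACs are covered by {unit_test_file}"
    )
```

Only `NotImplementedError` is caught, so any other failure in a probe still surfaces as an error rather than a silent skip.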
+
+## Phase 7 — Architecture Compliance
+
+* **Module-layout invariant**: every new helper is under
+  `e2e/runner/helpers/`; every new scenario under
+  `e2e/tests/{positive,negative}/`; every new unit test under
+  `e2e/_unit_tests/helpers/`. `test_directory_layout.py` parametrize
+  list updated to enforce the invariant — 75 path entries pass.
+* **Public-boundary**: every scenario uses only the FDR `record_type`
+  + `payload` dict schema, outbound estimate stream, and SITL
+  observer surface; no SUT internals consumed.
+* **Backwards compat with AZ-406 surface**: `mavproxy_tlog_reader`
+  filled in its body without changing the `TlogMessage` dataclass
+  shape or the `iter_messages` / `count_by_type` signatures, so
+  downstream consumers (FT-P-03/14 schema scenario, others) keep
+  working.
+
+## Phase 8 — Test Suite Health Trend
+
+| Batch end | Total tests | Delta |
+|-----------|-------------|-------|
+| 69 | 257 | (baseline) |
+| 70 | 325 | +68 |
+| 71 | 393 | +68 |
+| 72 | 460 | +67 |
+
+Net: +203 unit tests across batches 70-72, alongside 12 new helper modules, 9
+new scenario files, and 1 modified helper (the previously stub-only
+`mavproxy_tlog_reader`, now wired up for FT-P-09-AP).
+
+Pre-existing macOS-only `/e2e-results` plugin issue in scenario
+invocation outside Docker is unaffected by all batch 70-72 changes;
+unit suite untouched by it.
+
+## Cross-Batch Consistency Verdict
+
+PASS — no behavioural drift between batches; helper module shape +
+scenario skeleton + skip-gate pattern + constants discipline + NFR
+metrics format + traces_to marker format all identical across the 9
+tasks.
+
+## Architecture Compliance Verdict
+
+PASS — public-boundary blackbox stance preserved across all 12 new
+helpers; pymavlink boundary correctly placed at the tlog reader;
+ADR-010 Principle #11 (amended) explicitly encoded in
+`cold_start_evaluator`; Mode B Fact #107 preserved in
+`smoothing_evaluator` docstring.
+
+## Final Verdict
+
+**PASS** — Batches 70-72 (AZ-409, AZ-412, AZ-413, AZ-414, AZ-415,
+AZ-416, AZ-417, AZ-418, AZ-419 — 9 tasks / 27 cp) ready for the next
+K=3 cumulative review at end of batch 75.
@@ -0,0 +1,176 @@
+# Code Review Report
+
+**Batch**: 72 — AZ-416, AZ-417, AZ-419
+**Date**: 2026-05-16
+**Verdict**: PASS
+
+## Findings
+
+(none)
+
+## Findings Sweep
+
+### Phase 1 — Context Loading
+
+Loaded specs `AZ-416_ft_p_09_ap_signing.md`, `AZ-417_ft_p_09_inav.md`,
+`AZ-419_ft_p_11_cold_start_init.md`. Re-read existing
+`runner/helpers/mavproxy_tlog_reader.py` (AZ-406 surface to be filled
+in by AZ-416 per the docstring), `sitl_observer.py`, `fdr_reader.py`,
+`geo.py`. Read `fixtures/cold-boot/cold_boot_fixture.json` for FT-P-11
+secondary path origin. Verified pymavlink ≥2.4 install + the
+`MAVLink.get_signed()` API surface in the venv.
+
+### Phase 2 — Spec Compliance
+
+**AZ-416 (FT-P-09-AP)**
+
+| AC | Coverage | Status |
+|----|----------|--------|
+| AC-1 (signing handshake ≤5 s, no BAD_SIGNATURE) | `test_handshake_passes_when_first_signed_within_window`, `test_handshake_fails_when_no_signed_within_window`, `test_handshake_fails_when_signed_arrives_after_budget`, `test_handshake_fails_on_bad_signature_statustext`, scenario assertion via `observe_signing_handshake` | Covered |
+| AC-2 (GPS_INPUT ≥4.5 Hz for 5 Hz target) | `test_gps_input_rate_at_5hz_for_60s_passes`, `test_gps_input_rate_at_boundary_passes`, `test_gps_input_rate_below_minimum_fails`, scenario assertion via `compute_gps_input_rate` | Covered |
+| AC-3 (EK3_SRC1_POSXY == 3) | `test_validate_ek3_src1_posxy_passes_at_3`, scenario via `validate_ek3_src1_posxy(sitl_observer.read_ap_parameter(...))` | Covered |
+| AC-4 (GPS_RAW_INT healthy fraction ≥80 %) | `test_gps_raw_int_health_all_healthy_passes`, `test_gps_raw_int_health_at_80_pct_boundary_passes`, `test_gps_raw_int_health_below_80_pct_fails`, `test_gps_raw_int_health_eph_threshold_strict`, scenario via `evaluate_gps_raw_int_health` | Covered |
+| AC-5 (vio_strategy parameterization; `fc_adapter` fixed to `ardupilot`) | scenario uses `vio_strategy` fixture from conftest; `fc_adapter != "ardupilot"` is skipped — collection across 6 variants reduces to 3 active variants | Covered |
+| D-C8-9 (signing-handshake observability) | `traces_to` marker + handshake report includes `setup_signing_seen` | Covered |
+
+Also: AZ-416's `mavproxy_tlog_reader.iter_messages` body landed
+(previously raised NotImplementedError per the AZ-406 commit). 6 unit
+tests in `test_mavproxy_tlog_reader.py` exercise the parser against
+synthetic tlogs.
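For AC-4, the healthy-fraction rule in the table above reduces to a short computation. This sketch assumes each GPS_RAW_INT sample is a mapping with `fix_type` and `eph` keys and treats an empty window as a failure. Both choices, and the two function names, are assumptions; the three constants are the ones the unit tests pin.

```python
from typing import Any, Iterable, Mapping

GPS_RAW_INT_MIN_FIX_TYPE = 3
GPS_RAW_INT_MAX_EPH = 200
GPS_RAW_INT_HEALTHY_FRACTION_REQUIRED = 0.80


def healthy_fraction(samples: Iterable[Mapping[str, Any]]) -> float:
    """Fraction of samples with fix_type >= 3 AND eph <= 200."""
    total = 0
    healthy = 0
    for sample in samples:
        total += 1
        if (
            sample["fix_type"] >= GPS_RAW_INT_MIN_FIX_TYPE
            and sample["eph"] <= GPS_RAW_INT_MAX_EPH
        ):
            healthy += 1
    # Assumption: an empty window counts as 0 % healthy.
    return healthy / total if total else 0.0


def passes_gps_raw_int_health(samples: Iterable[Mapping[str, Any]]) -> bool:
    return healthy_fraction(samples) >= GPS_RAW_INT_HEALTHY_FRACTION_REQUIRED
```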
+
+**AZ-417 (FT-P-09-iNav)**
+
+| AC | Coverage | Status |
+|----|----------|--------|
+| AC-1 (TCP connect to inav-sitl:5760 ≤5 s) | scenario via `sitl_observer.observe_inav_tcp_handshake` (skip-gated) | Covered (gated) |
+| AC-2 (MSP2_SENSOR_GPS ≥4.5 Hz for 5 Hz target) | `test_compute_rate_at_target_passes`, `test_compute_rate_at_boundary_passes`, `test_compute_rate_below_minimum_fails`, `test_compute_rate_filters_function_id`, scenario via `compute_rate_hz` | Covered |
+| AC-3 (fix_type ≥3, provider=MSP, numSat matches emitted) | `test_evaluate_gps_state_passes_at_minimum_fix`, `test_evaluate_gps_state_fails_on_low_fix_type`, `test_evaluate_gps_state_fails_on_wrong_provider`, `test_evaluate_gps_state_fails_on_num_sat_mismatch`, scenario via `evaluate_inav_gps_state` | Covered |
+| AC-4 (vio_strategy parameterization; `fc_adapter` fixed to `inav`) | scenario uses `vio_strategy` fixture; skips when `fc_adapter != "inav"` | Covered |
+
+**AZ-419 (FT-P-11)**
+
+| AC | Coverage | Status |
+|----|----------|--------|
+| AC-1 (operator_manifest: estimate ≤50 m of A; FDR `cold_start_origin.set(source="manifest")`) | `test_evaluate_operator_manifest_passes_at_origin`, `test_evaluate_operator_manifest_passes_just_inside_budget`, `test_evaluate_operator_manifest_fails_just_outside_budget`, scenario assertion | Covered |
+| AC-2 (fc_ekf: estimate ≤50 m of FC EKF snapshot; FDR `source="fc_ekf"`) | `test_evaluate_fc_ekf_passes`, scenario assertion | Covered |
+| AC-3 (no origin → SUT refuses takeoff; FDR `cold_start_origin.unavailable`) | `test_evaluate_no_origin_passes_when_silent_and_fdr_records_abort`, `test_evaluate_no_origin_fails_when_sut_emits_anything`, `test_evaluate_no_origin_fails_when_fdr_missing_unavailable_signal`, dedicated scenario `test_ft_p_11_cold_start_no_origin_aborts` | Covered |
+| AC-4 (bounded-delta conflict: operator wins; source_label != satellite_anchored; FDR `gps_bounded_delta.reject`) | `test_evaluate_bounded_delta_conflict_operator_wins`, `test_evaluate_bounded_delta_fails_when_label_is_satellite_anchored`, scenario assertion (third parametrize variant) | Covered |
+| AC-5 (parameterization across `fc_adapter, vio_strategy, origin_source`) | scenario uses conftest's `fc_adapter` + `vio_strategy`; parametrizes `origin_source ∈ {operator_manifest, fc_ekf, bounded_delta_conflict}` separately | Covered |
+
+ADR-010 Principle #11, as amended ("operator origin wins on bounded-delta
+conflict; FC GPS logged as suspect"), is explicitly encoded as
+`BOUNDED_DELTA_TRIGGER_M = 200.0` plus the `c5.gps_bounded_delta.reject`
+record audit.
+
+### Phase 3 — Code Quality
+
+* **Single responsibility**: `mavproxy_tlog_reader` only iterates/counts
+  tlog frames (file I/O concern); `ap_contract_evaluator` only consumes
+  `TlogMessage` iterables (analytics concern); `msp_frame_observer`
+  only consumes captured MSP samples. `cold_start_evaluator` is one
+  module because the three FT-P-11 variants share a single FDR record
+  vocabulary + Manifest schema; splitting them would force the scenario
+  to import three near-identical modules.
+* **No suppressed errors**: `mavproxy_tlog_reader.iter_messages`
+  catches the narrow `BAD_DATA` + per-message `to_dict` exceptions
+  (documented in pymavlink) and continues, but the file-not-found
+  + connection-close paths raise / surface naturally. No bare `except`
+  in any new module.
+* **AAA comment discipline**: every test uses `# Arrange / # Act /
+  # Assert`; sections omitted when not needed.
+* **No narration comments**: docstrings explain non-obvious intent
+  (AC mapping, why orphans are excluded, why `collect_messages_to_list` exists,
+  why `EK3_SRC1_POSXY = 3` is the only acceptance value).
+
+### Phase 4 — Security
+
+* **No SUT imports**: confirmed by `test_no_sut_imports.py` (passing in
+  the full suite). None of the new modules import from
+  `src.gps_denied_onboard`.
+* **Signing handshake stance**: the helper does NOT validate signatures
+  itself (that's pymavlink's job); it only counts signed-frame arrivals
+  and `BAD_SIGNATURE` STATUSTEXT incidents. If signing fails in any way
+  AC-1 fails — the scenario does NOT bypass to an unsigned channel
+  (per spec "Forbidden" list).
+* **No secrets in source**: the AP scenario looks up
+  `mavlink-test-passkey.txt` from the on-disk fixture (already
+  verified by `test_passkey_files_match` in `test_directory_layout.py`).
+  The passkey itself is the AZ-407 / AZ-408 fixture, NOT a production
+  key.
+* **No SQL/shell injection surface**: all helpers operate on bytes /
+  pathlib / dict; no subprocess calls in the helper layer (subprocess
+  for `msp_gps_toy` is the SITL-observer's responsibility).
+
+### Phase 5 — Performance
+
+* `mavproxy_tlog_reader.iter_messages` is a single pass over the tlog;
+  pymavlink's `recv_match(blocking=False)` is the standard idiom.
+* `ap_contract_evaluator` consumes the materialised list ONCE per
+  analyser; `collect_messages_to_list` is the documented choice
+  (mavlink_connection's iterator closes on exhaustion so re-iteration
+  isn't safe). For typical 60 s of mavproxy traffic at ~50 msg/s this
+  is ≤3000 messages → trivial in memory.
+* `cold_start_evaluator._scan_fdr_for_cold_start` is one pass.
+* No nested loops over the same data.
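The reason for materialising once can be shown with a plain generator. `fake_iter_messages` is a hypothetical stand-in for the tlog iterator, and the one-line `collect_messages_to_list` body is an assumption about what the real helper does.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def collect_messages_to_list(messages: Iterable[T]) -> List[T]:
    """Drain a single-pass iterator so several analysers can share it."""
    return list(messages)


def fake_iter_messages() -> Iterator[int]:
    # Hypothetical stand-in: yields five "frames" and is then exhausted.
    yield from range(5)


stream = fake_iter_messages()
first_pass = sum(1 for _ in stream)   # consumes every frame
second_pass = sum(1 for _ in stream)  # generator is already exhausted

shared = collect_messages_to_list(fake_iter_messages())
analyser_a = len(shared)  # each analyser sees the full list
analyser_b = len(shared)
```

At the traffic volume quoted above (about 60 s at roughly 50 msg/s, so at most 3000 messages) holding the list in memory is cheap compared with reopening the tlog for every analyser.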
+
+### Phase 6 — Cross-Task Consistency
+
+* **Pattern parity with batches 69 + 70 + 71**:
+  - Skip gate (`_*_harness_implemented` fixture) for missing upstream
+    replay/SITL/FDR helpers — same pattern as
+    `test_ft_p_02/04/05/07/08/10_*`.
+  - `_NullSink` probe — same idiom as the prior 5 scenario files.
+  - Evidence side-channel via `nfr_recorder.record_metric(name, value,
+    ac_id=…)` — same pattern as `test_ft_p_01/04/05/07/08/10_*`.
+  - Module-level constants (`UPPER_SNAKE`) for budgets — matches
+    `multi_segment_evaluator`, `mre_evaluator`, `smoothing_evaluator`,
+    `sharp_turn_detector`.
+  - Helper modules importable from `runner.helpers.*`.
+* **No drift**: scenarios reuse the helper's constants (no magic
+  numbers) — `HANDSHAKE_BUDGET_S`, `GPS_INPUT_MIN_RATE_HZ`,
+  `MIN_FIX_TYPE`, `ACCURACY_BUDGET_M`, `BOUNDED_DELTA_TRIGGER_M`,
+  `FDR_RECORD_*`, `FORBIDDEN_FIRST_LABEL_BOUNDED_DELTA`.
+* **No legacy NotImplementedError test left behind**: verified no test
+  asserts `iter_messages` raises NotImplementedError (was AZ-406's
+  surface contract; AZ-416 owns the body per docstring).
+
+### Phase 7 — Architecture Compliance
+
+* **Public-boundary discipline**: confirmed by `test_no_sut_imports.py`
+  (passing). Helpers consume pymavlink (a third-party MAVLink
+  reference impl, not SUT internals) + FDR record schema (record_type
+  + payload dict) + outbound estimate schema. The signing handshake
+  observer specifically does NOT import the SUT's signing-key state
+  per the spec "Forbidden" list.
+* **Directory layout**: new paths added to `test_directory_layout.py`
+  parametrize list (`runner/helpers/{msp_frame_observer,
+  ap_contract_evaluator, cold_start_evaluator}.py`,
+  `tests/positive/test_ft_p_{09_ap_signing, 09_inav, 11_cold_start_init}.py`).
+  All variants pass.
+* **Determinism**: all helpers are deterministic — no `time.time()`,
+  no RNG; pymavlink parses bytes deterministically.
+
+### Phase 8 — Test Suite Health
+
+* Total: **460 passed in 134.35 s** (was 393 at end of batch 71).
+* New tests this batch: **+67** (msp_frame_observer: 14;
+  mavproxy_tlog_reader: 6; ap_contract_evaluator: 22;
+  cold_start_evaluator: 19; directory_layout new entries: 6).
+* Pre-existing macOS-only `/e2e-results` plugin issue still present —
+  affects scenario test invocation outside Docker only; unit suite
+  unaffected. Out of batch scope.
+
+## Cross-Task Consistency Verdict
+
+PASS — no cross-task drift, no duplicated logic across the four new
+helpers, shared `TlogMessage` type used consistently between
+`mavproxy_tlog_reader` and `ap_contract_evaluator`.
+
+## Architecture Compliance Verdict
+
+PASS — public-boundary blackbox stance preserved; no SUT imports;
+pymavlink boundary correctly placed at the tlog reader.
+
+## Final Verdict
+
+**PASS** — Batch 72 (AZ-416 + AZ-417 + AZ-419) ready for commit.
@@ -12,8 +12,8 @@ sub_step:
 retry_count: 0
 cycle: 1
 tracker: jira
-last_completed_batch: 71
-last_cumulative_review: batches_67-69
+last_completed_batch: 72
+last_cumulative_review: batches_70-72
 last_step_outcomes:
  step_8: "Code is testable — no changes needed (testability_assessment.md committed; no list-of-changes, no source edits)"
  step_9: "Already complete — 41 blackbox test tasks (AZ-406..AZ-446) under epic AZ-262 with specs in _docs/02_tasks/todo/ were produced in a prior cycle; AZ-406 test-infrastructure bootstrap also pre-existing. Folder fallback satisfied (todo/ has test tasks, _dependencies_table.md reflects 114 product + 41 test = 155 total). No Step-9 work executed in cycle 1."
@@ -0,0 +1,326 @@
+"""Unit tests for ``runner.helpers.ap_contract_evaluator`` (FT-P-09-AP / AZ-416).
+
+Covers:
+
+* AC-1 ``observe_signing_handshake``: signed-message detection,
+  ``BAD_SIGNATURE`` STATUSTEXT counting, ≤5 s budget.
+* AC-2 ``compute_gps_input_rate``: ≥4.5 Hz for 5 Hz target.
+* AC-3 ``validate_ek3_src1_posxy``: only ``3`` passes.
+* AC-4 ``evaluate_gps_raw_int_health``: ≥80 % healthy fraction
+  (fix_type ≥3 AND eph ≤200).
+"""
+
+from __future__ import annotations
+
+import pytest
+
+from runner.helpers.ap_contract_evaluator import (
+    EK3_SRC1_POSXY_REQUIRED,
+    GPS_INPUT_MIN_RATE_HZ,
+    GPS_INPUT_TARGET_RATE_HZ,
+    GPS_RAW_INT_HEALTHY_FRACTION_REQUIRED,
+    GPS_RAW_INT_MAX_EPH,
+    GPS_RAW_INT_MIN_FIX_TYPE,
+    HANDSHAKE_BUDGET_S,
+    compute_gps_input_rate,
+    evaluate_gps_raw_int_health,
+    observe_signing_handshake,
+    validate_ek3_src1_posxy,
+)
+from runner.helpers.mavproxy_tlog_reader import TlogMessage
+
+
+def _msg(ts_us: int, msg_type: str, *, signed: bool = False, **fields: object) -> TlogMessage:
+    return TlogMessage(timestamp_us=ts_us, msg_type=msg_type, signed=signed, fields=fields)
+
+
+def test_constants_match_spec() -> None:
+    """The AC-1/2/3/4 thresholds must match the spec text."""
+    # Assert
+    assert HANDSHAKE_BUDGET_S == 5.0
+    assert GPS_INPUT_TARGET_RATE_HZ == 5.0
+    assert GPS_INPUT_MIN_RATE_HZ == 4.5
+    assert GPS_RAW_INT_MIN_FIX_TYPE == 3
+    assert GPS_RAW_INT_MAX_EPH == 200
+    assert GPS_RAW_INT_HEALTHY_FRACTION_REQUIRED == 0.80
+    assert EK3_SRC1_POSXY_REQUIRED == 3
+
+
+def test_handshake_passes_when_first_signed_within_window() -> None:
+    """A signed message at +1s passes the 5s budget."""
+    # Arrange
+    msgs = [
+        _msg(0, "HEARTBEAT", signed=False),
+        _msg(500_000, "SETUP_SIGNING", signed=False),
+        _msg(1_000_000, "HEARTBEAT", signed=True),
+    ]
+
+    # Act
+    report = observe_signing_handshake(msgs)
+
+    # Assert
+    assert report.first_signed_us == 1_000_000
+    assert report.lag_s == pytest.approx(1.0)
+    assert report.setup_signing_seen is True
+    assert report.bad_signature_count == 0
+    assert report.passes is True
+
+
+def test_handshake_fails_when_no_signed_within_window() -> None:
+    """No signed message within 5s → AC-1 fail."""
+    # Arrange — only unsigned heartbeats.
+    msgs = [
+        _msg(i * 100_000, "HEARTBEAT", signed=False)
+        for i in range(60)  # 6 seconds of 10Hz heartbeats
+    ]
+
+    # Act
+    report = observe_signing_handshake(msgs)
+
+    # Assert
+    assert report.first_signed_us is None
+    assert report.lag_s is None
+    assert report.passes is False
+
+
+def test_handshake_fails_when_signed_arrives_after_budget() -> None:
+    """Signed message at +6s exceeds the 5s budget → AC-1 fail."""
+    # Arrange
+    msgs = [
+        _msg(0, "HEARTBEAT"),
+        _msg(6_000_000, "HEARTBEAT", signed=True),
+    ]
+
+    # Act
+    report = observe_signing_handshake(msgs)
+
+    # Assert: the signed message is outside the window, so the iterator
+    # stops before recording it. lag_s stays None.
+    assert report.first_signed_us is None
+    assert report.lag_s is None
+    assert report.passes is False
+
+
+def test_handshake_fails_on_bad_signature_statustext() -> None:
+    """STATUSTEXT containing BAD_SIGNATURE during the window → AC-1 fail."""
+    # Arrange
+    msgs = [
+        _msg(0, "HEARTBEAT"),
+        _msg(500_000, "STATUSTEXT", text="MAVLink2 BAD_SIGNATURE from system 1"),
+        _msg(1_000_000, "HEARTBEAT", signed=True),
+    ]
+
+    # Act
+    report = observe_signing_handshake(msgs)
+
+    # Assert — got a signed message but ALSO a BAD_SIGNATURE in the window.
+    assert report.first_signed_us == 1_000_000
+    assert report.bad_signature_count == 1
+    assert report.passes is False
+
+
+def test_handshake_empty_stream_does_not_pass() -> None:
+    """No messages → no window → does not pass."""
+    # Act
+    report = observe_signing_handshake([])
+
+    # Assert
+    assert report.window_start_us == 0
+    assert report.first_signed_us is None
+    assert report.passes is False
+
+
+def test_handshake_rejects_invalid_window() -> None:
+    # Act / Assert
+    with pytest.raises(ValueError, match="handshake_window_us"):
+        observe_signing_handshake([], handshake_window_us=0)
+
+
+def test_gps_input_rate_at_5hz_for_60s_passes() -> None:
+    """60s @ 5Hz = 301 frames (incl. t=0 and t=60s) → 5.0 Hz observed."""
+    # Arrange
+    msgs = [_msg(i * 200_000, "GPS_INPUT") for i in range(301)]
+
+    # Act
+    report = compute_gps_input_rate(msgs)
+
+    # Assert
+    assert report.frame_count == 301
+    assert report.observed_rate_hz == pytest.approx(5.0, abs=0.01)
+    assert report.passes is True
+
+
+def test_gps_input_rate_at_boundary_passes() -> None:
+    """Nominal 4.5 Hz → AC-2 boundary pass."""
+    # Arrange: 10s @ 4.5Hz = 46 frames (start + 45 intervals). The period
+    # rounds down to 222_222 µs, so the stream runs marginally above 4.5 Hz.
+    period_us = int(round(1_000_000 / 4.5))
+    msgs = [_msg(i * period_us, "GPS_INPUT") for i in range(46)]
+
+    # Act
+    report = compute_gps_input_rate(msgs)
+
+    # Assert
+    assert report.observed_rate_hz == pytest.approx(4.5, abs=0.05)
+    assert report.passes is True
+
+
+def test_gps_input_rate_below_minimum_fails() -> None:
+    """3 Hz observed → AC-2 fail."""
+    # Arrange — 10s @ 3Hz.
+    msgs = [_msg(i * 333_333, "GPS_INPUT") for i in range(31)]
+
+    # Act
+    report = compute_gps_input_rate(msgs)
+
+    # Assert
+    assert report.observed_rate_hz == pytest.approx(3.0, abs=0.05)
+    assert report.passes is False
+
+
+def test_gps_input_rate_ignores_other_messages() -> None:
+    """Only GPS_INPUT frames count; HEARTBEAT/GPS_RAW_INT are noise."""
+    # Arrange — 5 GPS_INPUT + many HEARTBEATs.
+    msgs = [_msg(i * 200_000, "GPS_INPUT") for i in range(5)]
+    msgs += [_msg(i * 100_000, "HEARTBEAT") for i in range(50)]
+
+    # Act
+    report = compute_gps_input_rate(msgs)
+
+    # Assert
+    assert report.frame_count == 5
+
+
+def test_gps_input_rate_empty_stream_does_not_pass() -> None:
+    # Act
+    report = compute_gps_input_rate([])
+
+    # Assert
+    assert report.frame_count == 0
+    assert report.window_us == 0
+    assert report.passes is False
+
+
+def test_gps_input_rate_rejects_negative_minimum() -> None:
+    # Act / Assert
+    with pytest.raises(ValueError, match="min_required_hz"):
+        compute_gps_input_rate([], min_required_hz=-0.1)
+
+
+def test_validate_ek3_src1_posxy_passes_at_3() -> None:
+    """Only the value 3 satisfies AC-3."""
+    # Assert
+    assert validate_ek3_src1_posxy(3) is True
+    assert validate_ek3_src1_posxy(0) is False
+    assert validate_ek3_src1_posxy(1) is False
+    assert validate_ek3_src1_posxy(2) is False
+    assert validate_ek3_src1_posxy(4) is False
+
+
+def test_gps_raw_int_health_all_healthy_passes() -> None:
+    """All 100 samples healthy → fraction 1.0 → AC-4 pass."""
+    # Arrange
+    msgs = [_msg(i, "GPS_RAW_INT", fix_type=3, eph=150) for i in range(100)]
+
+    # Act
+    report = evaluate_gps_raw_int_health(msgs)
+
+    # Assert
+    assert report.total_samples == 100
+    assert report.healthy_samples == 100
+    assert report.healthy_fraction == 1.0
+    assert report.passes is True
+
+
+def test_gps_raw_int_health_at_80_pct_boundary_passes() -> None:
+    """80/100 healthy → boundary inclusive → AC-4 pass."""
+    # Arrange — 80 healthy, 20 with fix_type=2.
+    msgs = [
+        _msg(i, "GPS_RAW_INT", fix_type=3 if i < 80 else 2, eph=150)
+        for i in range(100)
+    ]
+
+    # Act
+    report = evaluate_gps_raw_int_health(msgs)
+
+    # Assert
+    assert report.healthy_fraction == 0.80
+    assert report.passes is True
+
+
+def test_gps_raw_int_health_below_80_pct_fails() -> None:
+    """79/100 healthy → AC-4 fail."""
+    # Arrange
+    msgs = [
+        _msg(i, "GPS_RAW_INT", fix_type=3 if i < 79 else 2, eph=150)
+        for i in range(100)
+    ]
+
+    # Act
+    report = evaluate_gps_raw_int_health(msgs)
+
+    # Assert
+    assert report.healthy_fraction == pytest.approx(0.79)
+    assert report.passes is False
+
+
+def test_gps_raw_int_health_eph_threshold_strict() -> None:
+    """The eph bound is inclusive: eph=200 is healthy (≤200); eph=201 is not."""
+    # Arrange
+    msgs = [
+        _msg(0, "GPS_RAW_INT", fix_type=3, eph=200),
+        _msg(1, "GPS_RAW_INT", fix_type=3, eph=201),
+    ]
+
+    # Act
+    report = evaluate_gps_raw_int_health(msgs)
+
+    # Assert
+    assert report.total_samples == 2
+    assert report.healthy_samples == 1
+
+
+def test_gps_raw_int_health_missing_fields_skipped_not_healthy() -> None:
+    """A GPS_RAW_INT missing fix_type/eph still counts toward the total but is not healthy."""
+    # Arrange
+    msgs = [
+        _msg(0, "GPS_RAW_INT", fix_type=3, eph=150),
+        _msg(1, "GPS_RAW_INT"),
+    ]
+
+    # Act
+    report = evaluate_gps_raw_int_health(msgs)
+
+    # Assert
+    assert report.total_samples == 2
+    assert report.healthy_samples == 1
+
+
+def test_gps_raw_int_health_ignores_other_message_types() -> None:
+    """Only GPS_RAW_INT contributes to the total."""
+    # Arrange
+    msgs = [
+        _msg(i, "HEARTBEAT") for i in range(50)
+    ] + [
+        _msg(i, "GPS_RAW_INT", fix_type=3, eph=150) for i in range(10)
+    ]
+
+    # Act
+    report = evaluate_gps_raw_int_health(msgs)
+
+    # Assert
+    assert report.total_samples == 10
+
+
+def test_gps_raw_int_health_empty_stream_does_not_pass() -> None:
+    # Act
+    report = evaluate_gps_raw_int_health([])
+
+    # Assert
+    assert report.total_samples == 0
+    assert report.healthy_fraction == 0.0
+    assert report.passes is False
+
+
+def test_gps_raw_int_health_rejects_invalid_fraction() -> None:
+    # Act / Assert
+    with pytest.raises(ValueError, match="fraction_required"):
+        evaluate_gps_raw_int_health([], fraction_required=1.5)
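Reviewer note on the AC-1 tests above: the behaviour they pin down (window opens at the first message, a signed frame must arrive inside it, and any `BAD_SIGNATURE` STATUSTEXT inside it is disqualifying) can be sketched as follows. This is a minimal illustration, not the body of `observe_signing_handshake`, which this diff does not show. `Msg` and `handshake_ok` are hypothetical names, and the sketch assumes messages arrive in timestamp order.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterable


@dataclass(frozen=True)
class Msg:
    """Hypothetical stand-in for TlogMessage (same four attributes)."""

    timestamp_us: int
    msg_type: str
    signed: bool = False
    fields: dict = field(default_factory=dict)


def handshake_ok(msgs: Iterable[Msg], window_us: int = 5_000_000) -> bool:
    """Return True if a signed frame arrives in the window with no BAD_SIGNATURE."""
    if window_us <= 0:
        raise ValueError("window_us must be > 0")
    start_us: int | None = None
    first_signed_us: int | None = None
    bad_signatures = 0
    for m in msgs:
        if start_us is None:
            start_us = m.timestamp_us  # the window opens at the first message
        if m.timestamp_us - start_us > window_us:
            break  # everything after the budget is ignored
        if m.signed and first_signed_us is None:
            first_signed_us = m.timestamp_us
        text = str(m.fields.get("text", ""))
        if m.msg_type == "STATUSTEXT" and "BAD_SIGNATURE" in text:
            bad_signatures += 1
    return first_signed_us is not None and bad_signatures == 0
```

Stopping at the window edge is what makes the "+6s signed frame" case report no signed message at all, rather than a late one.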
@@ -0,0 +1,382 @@
+"""Unit tests for ``runner.helpers.cold_start_evaluator`` (FT-P-11 / AZ-419).
+
+Covers all four FT-P-11 acceptance paths (AC-1 operator manifest,
+AC-2 fc_ekf, AC-3 no-origin, AC-4 bounded-delta conflict) plus the
+manifest read/write + cold-boot fixture parsing.
+"""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+
+import pytest
+
+from runner.helpers.cold_start_evaluator import (
+    ACCURACY_BUDGET_M,
+    BOUNDED_DELTA_TRIGGER_M,
+    FDR_RECORD_BOUNDED_DELTA_REJECT,
+    FDR_RECORD_ORIGIN_SET,
+    FDR_RECORD_ORIGIN_UNAVAILABLE,
+    FORBIDDEN_FIRST_LABEL_BOUNDED_DELTA,
+    ColdBootSnapshot,
+    FdrAuditRecord,
+    LatLonAlt,
+    OutboundEstimate,
+    bounded_delta_distance_m,
+    evaluate_first_estimate,
+    evaluate_no_origin_path,
+    read_cold_boot_fixture,
+    read_manifest,
+    write_manifest,
+)
+from runner.helpers.geo import offset
+
+
+def test_constants_match_spec() -> None:
+    """The AC-1..AC-4 budgets must match the spec text."""
+    # Assert
+    assert ACCURACY_BUDGET_M == 50.0
+    assert BOUNDED_DELTA_TRIGGER_M == 200.0
+    assert FORBIDDEN_FIRST_LABEL_BOUNDED_DELTA == "satellite_anchored"
+    assert FDR_RECORD_ORIGIN_SET == "c5.cold_start_origin.set"
+    assert FDR_RECORD_ORIGIN_UNAVAILABLE == "c5.cold_start_origin.unavailable"
+    assert FDR_RECORD_BOUNDED_DELTA_REJECT == "c5.gps_bounded_delta.reject"
+
+
+def test_write_and_read_manifest_round_trip(tmp_path: Path) -> None:
+    """write_manifest produces JSON read_manifest can parse."""
+    # Arrange
+    origin = LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=200.0)
+    path = tmp_path / "manifest.json"
+
+    # Act
+    write_manifest(path, origin)
+    parsed = read_manifest(path)
+
+    # Assert
+    assert parsed.takeoff_origin == origin
+
+
+def test_write_manifest_without_origin_yields_none(tmp_path: Path) -> None:
+    """None origin → manifest has empty `flight` block."""
+    # Arrange
+    path = tmp_path / "manifest.json"
+
+    # Act
+    write_manifest(path, None)
+    parsed = read_manifest(path)
+
+    # Assert
+    assert parsed.takeoff_origin is None
+
+
+def test_read_manifest_missing_file_raises(tmp_path: Path) -> None:
+    # Act / Assert
+    with pytest.raises(FileNotFoundError, match="manifest not found"):
+        read_manifest(tmp_path / "absent.json")
+
+
+def test_read_cold_boot_fixture_parses_int_units(tmp_path: Path) -> None:
+    """lat_e7/lon_e7/alt_mm are converted to decimal degrees + meters."""
+    # Arrange
+    path = tmp_path / "cb.json"
+    payload = {
+        "_schema": "cold-boot-fixture/v1",
+        "global_position_int": {
+            "lat_e7": 500750000,
+            "lon_e7": 361500000,
+            "alt_mm": 100000,
+        },
+    }
+    path.write_text(json.dumps(payload))
+
+    # Act
+    snap = read_cold_boot_fixture(path)
+
+    # Assert
+    assert snap == ColdBootSnapshot(
+        lat_deg=50.0750, lon_deg=36.1500, alt_m=100.0, schema="cold-boot-fixture/v1"
+    )
+
+
+def test_read_cold_boot_fixture_missing_file_raises(tmp_path: Path) -> None:
+    # Act / Assert
+    with pytest.raises(FileNotFoundError, match="cold-boot fixture not found"):
+        read_cold_boot_fixture(tmp_path / "absent.json")
+
+
+def test_evaluate_operator_manifest_passes_at_origin() -> None:
+    """AC-1: estimate exactly at origin → distance 0, passes."""
+    # Arrange
+    origin = LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=200.0)
+    estimate = OutboundEstimate(
+        monotonic_ms=1000, lat_deg=50.0, lon_deg=36.2, source_label="visual_propagated"
+    )
+    fdr = [
+        FdrAuditRecord(
+            monotonic_ms=500,
+            record_type=FDR_RECORD_ORIGIN_SET,
+            payload={"source": "manifest"},
+        )
+    ]
+
+    # Act
+    report = evaluate_first_estimate(
+        origin_source="operator_manifest",
+        expected_origin=origin,
+        first_estimate=estimate,
+        fdr_records=fdr,
+    )
+
+    # Assert
+    assert report.distance_m == pytest.approx(0.0, abs=1e-6)
+    assert report.passes_distance is True
+    assert report.fdr_origin_set_seen is True
+    assert report.fdr_origin_set_source == "manifest"
+
+
+def test_evaluate_operator_manifest_passes_just_inside_budget() -> None:
+    """AC-1: estimate 49 m from origin → inside the 50 m budget → pass."""
+    # Arrange
+    origin = LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=200.0)
+    lat, lon = offset(origin.lat_deg, origin.lon_deg, bearing_deg=90.0, distance_m=49.0)
+    estimate = OutboundEstimate(
+        monotonic_ms=1000, lat_deg=lat, lon_deg=lon, source_label="visual_propagated"
+    )
+
+    # Act
+    report = evaluate_first_estimate(
+        origin_source="operator_manifest",
+        expected_origin=origin,
+        first_estimate=estimate,
+        fdr_records=[],
+    )
+
+    # Assert
+    assert report.distance_m == pytest.approx(49.0, abs=0.5)
+    assert report.passes_distance is True
+
+
+def test_evaluate_operator_manifest_fails_just_outside_budget() -> None:
+    """AC-1: estimate 51 m from origin → outside the 50 m budget → fail."""
+    # Arrange
+    origin = LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=200.0)
+    lat, lon = offset(origin.lat_deg, origin.lon_deg, bearing_deg=90.0, distance_m=51.0)
+    estimate = OutboundEstimate(
+        monotonic_ms=1000, lat_deg=lat, lon_deg=lon, source_label="visual_propagated"
+    )
+
+    # Act
+    report = evaluate_first_estimate(
+        origin_source="operator_manifest",
+        expected_origin=origin,
+        first_estimate=estimate,
+        fdr_records=[],
+    )
+
+    # Assert
+    assert report.distance_m == pytest.approx(51.0, abs=0.5)
+    assert report.passes_distance is False
+
+
+def test_evaluate_operator_manifest_fails_outside_budget() -> None:
+    """AC-1: estimate 100 m off → distance check fails."""
+    # Arrange
+    origin = LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=200.0)
+    lat, lon = offset(origin.lat_deg, origin.lon_deg, bearing_deg=90.0, distance_m=100.0)
+    estimate = OutboundEstimate(
+        monotonic_ms=1000, lat_deg=lat, lon_deg=lon, source_label="visual_propagated"
+    )
+
+    # Act
+    report = evaluate_first_estimate(
+        origin_source="operator_manifest",
+        expected_origin=origin,
+        first_estimate=estimate,
+        fdr_records=[],
+    )
+
+    # Assert
+    assert report.distance_m == pytest.approx(100.0, abs=0.5)
+    assert report.passes_distance is False
+
+
+def test_evaluate_fc_ekf_passes() -> None:
+    """AC-2: estimate near FC EKF snapshot → AC-2 pass."""
+    # Arrange
+    snapshot_origin = LatLonAlt(lat_deg=50.075, lon_deg=36.15, alt_m=100.0)
+    estimate = OutboundEstimate(
+        monotonic_ms=2000,
+        lat_deg=snapshot_origin.lat_deg,
+        lon_deg=snapshot_origin.lon_deg,
+        source_label="visual_propagated",
+    )
+    fdr = [
+        FdrAuditRecord(
+            monotonic_ms=1500,
+            record_type=FDR_RECORD_ORIGIN_SET,
+            payload={"source": "fc_ekf"},
+        )
+    ]
+
+    # Act
+    report = evaluate_first_estimate(
+        origin_source="fc_ekf",
+        expected_origin=snapshot_origin,
+        first_estimate=estimate,
+        fdr_records=fdr,
+    )
+
+    # Assert
+    assert report.passes_distance is True
+    assert report.fdr_origin_set_source == "fc_ekf"
+
+
+def test_evaluate_bounded_delta_conflict_operator_wins() -> None:
+    """AC-4: estimate near A (operator); source_label != satellite_anchored."""
+    # Arrange
+    a = LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=200.0)
+    b_lat, b_lon = offset(a.lat_deg, a.lon_deg, bearing_deg=90.0, distance_m=300.0)
+    b = LatLonAlt(lat_deg=b_lat, lon_deg=b_lon, alt_m=200.0)
+    estimate = OutboundEstimate(
+        monotonic_ms=1000,
+        lat_deg=a.lat_deg,
+        lon_deg=a.lon_deg,
+        source_label="visual_propagated",
+    )
+    fdr = [
+        FdrAuditRecord(
+            monotonic_ms=500,
+            record_type=FDR_RECORD_BOUNDED_DELTA_REJECT,
+            payload={
+                "a": {"lat_deg": a.lat_deg, "lon_deg": a.lon_deg, "alt_m": a.alt_m},
+                "b": {"lat_deg": b.lat_deg, "lon_deg": b.lon_deg, "alt_m": b.alt_m},
+            },
+        )
+    ]
+
+    # Act
+    report = evaluate_first_estimate(
+        origin_source="bounded_delta_conflict",
+        expected_origin=a,
+        first_estimate=estimate,
+        fdr_records=fdr,
+    )
+
+    # Assert
+    assert report.passes_distance is True
+    assert report.source_label_ok is True
+    assert report.fdr_bounded_delta_seen is True
+    assert report.fdr_bounded_delta_a == a
+    assert report.fdr_bounded_delta_b is not None
+    assert abs(report.fdr_bounded_delta_b.lat_deg - b.lat_deg) < 1e-9
+    assert bounded_delta_distance_m(a, b) > BOUNDED_DELTA_TRIGGER_M
+
+
+def test_evaluate_bounded_delta_fails_when_label_is_satellite_anchored() -> None:
+    """AC-4: source_label = satellite_anchored is FORBIDDEN."""
+    # Arrange
+    a = LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=200.0)
+    estimate = OutboundEstimate(
+        monotonic_ms=1000,
+        lat_deg=a.lat_deg,
+        lon_deg=a.lon_deg,
+        source_label="satellite_anchored",
+    )
+
+    # Act
+    report = evaluate_first_estimate(
+        origin_source="bounded_delta_conflict",
+        expected_origin=a,
+        first_estimate=estimate,
+        fdr_records=[],
+    )
+
+    # Assert
+    assert report.source_label_ok is False
+
+
+def test_evaluate_first_estimate_rejects_unknown_origin_source() -> None:
+    # Act / Assert
+    with pytest.raises(ValueError, match="unknown origin_source"):
+        evaluate_first_estimate(
+            origin_source="garbage",
+            expected_origin=None,
+            first_estimate=None,
+            fdr_records=[],
+        )
+
+
+def test_evaluate_first_estimate_handles_no_estimate() -> None:
+    """If first_estimate is None, distance is None, distance check fails."""
+    # Arrange
+    origin = LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=200.0)
+
+    # Act
+    report = evaluate_first_estimate(
+        origin_source="operator_manifest",
+        expected_origin=origin,
+        first_estimate=None,
+        fdr_records=[],
+    )
+
+    # Assert
+    assert report.distance_m is None
+    assert report.passes_distance is False
+    assert report.source_label_ok is False
+
+
+def test_evaluate_no_origin_passes_when_silent_and_fdr_records_abort() -> None:
+    """AC-3: no estimate produced AND FDR has origin_unavailable → pass."""
+    # Arrange
+    fdr = [
+        FdrAuditRecord(
+            monotonic_ms=15_000,
+            record_type=FDR_RECORD_ORIGIN_UNAVAILABLE,
+            payload={"reason": "no_manifest_no_gps"},
+        )
+    ]
+
+    # Act
+    report = evaluate_no_origin_path(first_estimate=None, fdr_records=fdr)
+
+    # Assert
+    assert report.passes is True
+
+
+def test_evaluate_no_origin_fails_when_sut_emits_anything() -> None:
+    """AC-3: any outbound estimate within the budget is a failure."""
+    # Arrange
+    estimate = OutboundEstimate(
+        monotonic_ms=10_000, lat_deg=0.0, lon_deg=0.0, source_label="dead_reckoned"
+    )
+
+    # Act
+    report = evaluate_no_origin_path(first_estimate=estimate, fdr_records=[])
+
+    # Assert
+    assert report.passes is False
+
+
+def test_evaluate_no_origin_fails_when_fdr_missing_unavailable_signal() -> None:
+    """AC-3 also requires the FDR audit record — silence alone is not enough."""
+    # Act
+    report = evaluate_no_origin_path(first_estimate=None, fdr_records=[])
+
+    # Assert
+    assert report.passes is False
+
+
+def test_bounded_delta_distance_m_exceeds_trigger() -> None:
+    """200 m offset → exactly at trigger; 250 m → over."""
+    # Arrange
+    a = LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=0.0)
+    b1_lat, b1_lon = offset(a.lat_deg, a.lon_deg, bearing_deg=0.0, distance_m=200.0)
+    b1 = LatLonAlt(lat_deg=b1_lat, lon_deg=b1_lon, alt_m=0.0)
+    b2_lat, b2_lon = offset(a.lat_deg, a.lon_deg, bearing_deg=0.0, distance_m=250.0)
+    b2 = LatLonAlt(lat_deg=b2_lat, lon_deg=b2_lon, alt_m=0.0)
+
+    # Assert
+    assert bounded_delta_distance_m(a, b1) == pytest.approx(200.0, abs=1.0)
+    assert bounded_delta_distance_m(a, b2) > BOUNDED_DELTA_TRIGGER_M
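Reviewer note on the distance checks above: the 49 m / 51 m cases rely on an offset helper and a great-circle distance that round-trip to well under the 0.5 m tolerance. The sketch below shows one standard way to get that property on a spherical Earth. It is an assumption for illustration only: `runner.helpers.geo.offset` and `bounded_delta_distance_m` are not shown in this diff and may use a different model. `haversine_m`, `destination`, and `within_budget` are hypothetical names, and the inclusive `<=` at the 50 m boundary is a guess the tests do not settle.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean spherical radius (assumed model)
ACCURACY_BUDGET_M = 50.0      # AC-1/AC-2 budget, as asserted in the tests


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def destination(lat: float, lon: float, bearing_deg: float, distance_m: float) -> tuple[float, float]:
    """Point reached from (lat, lon) after distance_m along bearing_deg."""
    phi1, lmb1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta) + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lmb2 = lmb1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lmb2)


def within_budget(origin: tuple[float, float], estimate: tuple[float, float]) -> bool:
    """True if the estimate lies within ACCURACY_BUDGET_M of the origin."""
    return haversine_m(*origin, *estimate) <= ACCURACY_BUDGET_M
```

Because both functions use the same sphere, an offset of `d` metres measures back as `d` up to floating-point error, which is why the tests can bracket the budget as tightly as 49 m and 51 m.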
@@ -0,0 +1,180 @@
+"""Unit tests for ``runner.helpers.mavproxy_tlog_reader.iter_messages``.
+
+AZ-416 fills in the pymavlink-backed body; AZ-406 committed the public
+surface. These tests synthesise a tiny tlog on the fly so the parser
+can be exercised without needing a captured `.tlog` artifact.
+"""
+
+from __future__ import annotations
+
+import struct
+from pathlib import Path
+
+import pytest
+from pymavlink.dialects.v20 import ardupilotmega as mavlink
+
+from runner.helpers.mavproxy_tlog_reader import (
+    TlogMessage,
+    count_by_type,
+    iter_messages,
+)
+
+_SRC_SYSTEM = 1
+_SRC_COMPONENT = mavlink.MAV_COMP_ID_AUTOPILOT1
+_BASE_TS_US = 1_700_000_000_000_000
+
+
+def _write_tlog(tlog_path: Path, records: list[tuple[int, bytes]]) -> Path:
+    """Write a synthetic tlog: ``[8B big-endian ts_us][raw frame]`` per record."""
+    with tlog_path.open("wb") as fh:
+        for ts_us, payload in records:
+            fh.write(struct.pack(">Q", ts_us))
+            fh.write(payload)
+    return tlog_path
+
+
+def _make_mav() -> mavlink.MAVLink:
+    return mavlink.MAVLink(
+        file=None,
+        srcSystem=_SRC_SYSTEM,
+        srcComponent=_SRC_COMPONENT,
+    )
+
+
+def _heartbeat(mav: mavlink.MAVLink) -> bytes:
+    return mav.heartbeat_encode(
+        type=mavlink.MAV_TYPE_FIXED_WING,
+        autopilot=mavlink.MAV_AUTOPILOT_ARDUPILOTMEGA,
+        base_mode=mavlink.MAV_MODE_FLAG_AUTO_ENABLED,
+        custom_mode=10,
+        system_status=mavlink.MAV_STATE_ACTIVE,
+    ).pack(mav)
+
+
+def _gps_raw_int(mav: mavlink.MAVLink, *, fix_type: int = 3, eph: int = 100) -> bytes:
+    return mav.gps_raw_int_encode(
+        time_usec=_BASE_TS_US,
+        fix_type=fix_type,
+        lat=487750000,
+        lon=375940000,
+        alt=280000,
+        eph=eph,
+        epv=200,
+        vel=12000,
+        cog=18000,
+        satellites_visible=12,
+    ).pack(mav)
+
+
+def _gps_input(mav: mavlink.MAVLink) -> bytes:
+    return mav.gps_input_encode(
+        time_usec=_BASE_TS_US,
+        gps_id=0,
+        ignore_flags=0,
+        time_week_ms=0,
+        time_week=0,
+        fix_type=3,
+        lat=487750000,
+        lon=375940000,
+        alt=280.0,
+        hdop=1.0,
+        vdop=2.0,
+        vn=10.0,
+        ve=5.0,
+        vd=0.5,
+        speed_accuracy=0.3,
+        horiz_accuracy=1.0,
+        vert_accuracy=2.0,
+        satellites_visible=12,
+    ).pack(mav)
+
+
+def test_iter_messages_raises_on_missing_file(tmp_path: Path) -> None:
+    # Act / Assert
+    with pytest.raises(FileNotFoundError, match="tlog not found"):
+        list(iter_messages(tmp_path / "absent.tlog"))
+
+
+def test_iter_messages_yields_message_type_and_fields(tmp_path: Path) -> None:
+    """A single heartbeat round-trips through iter_messages."""
+    # Arrange
+    mav = _make_mav()
+    tlog = _write_tlog(tmp_path / "single.tlog", [(_BASE_TS_US, _heartbeat(mav))])
+
+    # Act
+    msgs = list(iter_messages(tlog))
+
+    # Assert
+    assert len(msgs) == 1
+    m = msgs[0]
+    assert isinstance(m, TlogMessage)
+    assert m.msg_type == "HEARTBEAT"
+    assert m.fields["autopilot"] == mavlink.MAV_AUTOPILOT_ARDUPILOTMEGA
+    assert "mavpackettype" not in m.fields  # excluded by the impl
+
+
+def test_iter_messages_preserves_order(tmp_path: Path) -> None:
+    """Multiple records are yielded oldest-first."""
+    # Arrange
+    mav = _make_mav()
+    tlog = _write_tlog(
+        tmp_path / "ordered.tlog",
+        [
+            (_BASE_TS_US + 0, _heartbeat(mav)),
+            (_BASE_TS_US + 1_000_000, _gps_raw_int(mav)),
+            (_BASE_TS_US + 2_000_000, _gps_input(mav)),
+        ],
+    )
+
+    # Act
+    types = [m.msg_type for m in iter_messages(tlog)]
+
+    # Assert
+    assert types == ["HEARTBEAT", "GPS_RAW_INT", "GPS_INPUT"]
+
+
+def test_iter_messages_timestamp_in_microseconds(tmp_path: Path) -> None:
+    """``msg._timestamp`` is seconds; we expose microseconds."""
+    # Arrange
+    mav = _make_mav()
+    tlog = _write_tlog(tmp_path / "ts.tlog", [(_BASE_TS_US + 5_000_000, _heartbeat(mav))])
+
+    # Act
+    msg = next(iter_messages(tlog))
+
+    # Assert — pymavlink rounds to its frame timestamp; tolerate ±1ms slop.
+    assert abs(msg.timestamp_us - (_BASE_TS_US + 5_000_000)) <= 1_000
+
+
+def test_iter_messages_signed_flag_default_false(tmp_path: Path) -> None:
+    """Plain pymavlink-encoded frame is NOT signed → signed=False."""
+    # Arrange
+    mav = _make_mav()
+    tlog = _write_tlog(tmp_path / "u.tlog", [(_BASE_TS_US, _heartbeat(mav))])
+
+    # Act
+    msg = next(iter_messages(tlog))
+
+    # Assert
+    assert msg.signed is False
+
+
+def test_count_by_type_tallies_correctly(tmp_path: Path) -> None:
+    """count_by_type runs iter_messages and aggregates the type counts."""
+    # Arrange
+    mav = _make_mav()
+    tlog = _write_tlog(
+        tmp_path / "mixed.tlog",
+        [
+            (_BASE_TS_US + 0, _heartbeat(mav)),
+            (_BASE_TS_US + 1, _heartbeat(mav)),
+            (_BASE_TS_US + 2, _gps_raw_int(mav)),
+        ],
+    )
+
+    # Act
+    counts = count_by_type(tlog)
+
+    # Assert
+    assert counts["HEARTBEAT"] == 2
+    assert counts["GPS_RAW_INT"] == 1
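Reviewer note on the synthetic tlog layout used above: each record is an 8-byte big-endian microsecond timestamp followed by one raw MAVLink frame, and in MAVLink 2 the `signed` flag lives in bit 0x01 of the `incompat_flags` header byte (a signed frame carries a 13-byte signature after the CRC). The sketch below walks such a buffer without pymavlink, purely to make the layout concrete. It is not how `iter_messages` works (that uses pymavlink per the commit message), it skips CRC validation and MAVLink 1 frames, and `fake_frame` / `walk_tlog` are hypothetical names.

```python
from __future__ import annotations

import struct
from typing import Iterator

MAVLINK2_STX = 0xFD   # MAVLink 2 start-of-frame marker
IFLAG_SIGNED = 0x01   # incompat_flags bit: frame carries a signature
HEADER_LEN = 10       # STX, LEN, INCOMPAT, COMPAT, SEQ, SYSID, COMPID, MSGID(3)
CRC_LEN = 2
SIGNATURE_LEN = 13


def fake_frame(msgid: int, payload: bytes, *, signed: bool = False) -> bytes:
    """Build a structurally valid (but not CRC-correct) MAVLink 2 frame."""
    header = bytes([MAVLINK2_STX, len(payload), IFLAG_SIGNED if signed else 0, 0, 0, 1, 1])
    header += msgid.to_bytes(3, "little")
    tail = b"\x00" * (CRC_LEN + (SIGNATURE_LEN if signed else 0))
    return header + payload + tail


def walk_tlog(buf: bytes) -> Iterator[tuple[int, int, bool]]:
    """Yield (timestamp_us, msgid, signed) for each record in buf."""
    pos = 0
    while pos < len(buf):
        if pos + 8 + HEADER_LEN > len(buf):
            raise ValueError(f"truncated record at offset {pos}")
        (ts_us,) = struct.unpack_from(">Q", buf, pos)
        frame = pos + 8
        if buf[frame] != MAVLINK2_STX:
            raise ValueError(f"not a MAVLink 2 frame at offset {frame}")
        payload_len = buf[frame + 1]
        signed = bool(buf[frame + 2] & IFLAG_SIGNED)
        msgid = int.from_bytes(buf[frame + 7 : frame + 10], "little")
        yield ts_us, msgid, signed
        # Advance past header, payload, CRC, and the optional signature.
        pos = frame + HEADER_LEN + payload_len + CRC_LEN + (SIGNATURE_LEN if signed else 0)
```

A signed-frame variant of `test_iter_messages_signed_flag_default_false` would need a frame with that flag bit set and a signature appended, which the unsigned helper frames in this file never produce.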
@@ -0,0 +1,212 @@
+"""Unit tests for ``runner.helpers.msp_frame_observer`` (FT-P-09-iNav / AZ-417).
+
+Covers AC-2 (≥4.5 Hz observed for 5 Hz target) and AC-3 (fix_type ≥3,
+provider=MSP, numSat matches emitted value).
+"""
+
+from __future__ import annotations
+
+import pytest
+
+from runner.helpers.msp_frame_observer import (
+    DEFAULT_TARGET_RATE_HZ,
+    MIN_FIX_TYPE,
+    MIN_OBSERVED_RATE_HZ,
+    MSP2_SENSOR_GPS_FUNCTION_ID,
+    REQUIRED_PROVIDER,
+    InavGpsSnapshot,
+    MspFrameSample,
+    compute_rate_hz,
+    count_frames_by_id,
+    evaluate_inav_gps_state,
+)
+
+
+def _frames(rate_hz: float, n: int, function_id: int = MSP2_SENSOR_GPS_FUNCTION_ID) -> list[MspFrameSample]:
+    """Synthetic frame stream at ``rate_hz`` for ``n`` frames (period rounded to whole ms)."""
+    if rate_hz <= 0:
+        raise ValueError("rate_hz must be > 0")
+    period_ms = int(round(1000.0 / rate_hz))
+    return [
+        MspFrameSample(monotonic_ms=i * period_ms, function_id=function_id)
+        for i in range(n)
+    ]
+
+
+def test_constants_match_spec() -> None:
+    """The AC-2/AC-3 thresholds + IDs must match the spec text."""
+    # Assert
+    assert MSP2_SENSOR_GPS_FUNCTION_ID == 0x1F03
+    assert DEFAULT_TARGET_RATE_HZ == 5.0
+    assert MIN_OBSERVED_RATE_HZ == 4.5
+    assert MIN_FIX_TYPE == 3
+    assert REQUIRED_PROVIDER == "MSP"
+
+
+def test_count_frames_by_id_filters_correctly() -> None:
+    """Mixed-ID stream tallies per function ID."""
+    # Arrange
+    samples = [
+        MspFrameSample(0, MSP2_SENSOR_GPS_FUNCTION_ID),
+        MspFrameSample(100, 0x1F04),
+        MspFrameSample(200, MSP2_SENSOR_GPS_FUNCTION_ID),
+        MspFrameSample(300, MSP2_SENSOR_GPS_FUNCTION_ID),
+    ]
+
+    # Act
+    counts = count_frames_by_id(samples)
+
+    # Assert
+    assert counts[MSP2_SENSOR_GPS_FUNCTION_ID] == 3
+    assert counts[0x1F04] == 1
+
+
+def test_compute_rate_at_target_passes() -> None:
+    """5 Hz over 60 s window passes the ≥4.5 Hz minimum."""
+    # Arrange — 60s at 5Hz = 301 samples (inclusive of t=0 and t=60000).
+    samples = _frames(rate_hz=5.0, n=301)
+
+    # Act
+    report = compute_rate_hz(samples)
+
+    # Assert
+    assert report.frame_count == 301
+    assert report.observed_rate_hz == pytest.approx(5.0, abs=0.01)
+    assert report.passes is True
+
+
+def test_compute_rate_at_boundary_passes() -> None:
+    """Exactly 4.5 Hz passes (boundary is inclusive)."""
+    # Arrange
+    samples = _frames(rate_hz=4.5, n=46)  # 10s @ 4.5Hz
+
+    # Act
+    report = compute_rate_hz(samples)
+
+    # Assert
+    assert report.observed_rate_hz == pytest.approx(4.5, abs=0.05)
+    assert report.passes is True
+
+
+def test_compute_rate_below_minimum_fails() -> None:
+    """3 Hz observed → fails the ≥4.5 Hz minimum."""
+    # Arrange
+    samples = _frames(rate_hz=3.0, n=31)  # 10s @ 3Hz
+
+    # Act
+    report = compute_rate_hz(samples)
+
+    # Assert
+    assert report.observed_rate_hz == pytest.approx(3.0, abs=0.05)
+    assert report.passes is False
+
+
+def test_compute_rate_zero_samples_does_not_pass() -> None:
+    """Empty input → zero count, zero rate, does not pass."""
+    # Act
+    report = compute_rate_hz([])
+
+    # Assert
+    assert report.frame_count == 0
+    assert report.window_ms == 0
+    assert report.observed_rate_hz == 0.0
+    assert report.passes is False
+
+
+def test_compute_rate_single_sample_does_not_pass() -> None:
+    """One sample yields no window → does not pass."""
+    # Arrange
+    samples = [MspFrameSample(0, MSP2_SENSOR_GPS_FUNCTION_ID)]
+
+    # Act
+    report = compute_rate_hz(samples)
+
+    # Assert
+    assert report.frame_count == 1
+    assert report.window_ms == 0
+    assert report.passes is False
+
+
+def test_compute_rate_filters_function_id() -> None:
+    """Frames with a different function_id are ignored in the rate calc."""
+    # Arrange
+    samples = (
+        _frames(rate_hz=5.0, n=51, function_id=MSP2_SENSOR_GPS_FUNCTION_ID)
+        + _frames(rate_hz=10.0, n=101, function_id=0x1F04)
+    )
+
+    # Act
+    report = compute_rate_hz(samples, function_id=MSP2_SENSOR_GPS_FUNCTION_ID)
+
+    # Assert
+    assert report.frame_count == 51
+    assert report.observed_rate_hz == pytest.approx(5.0, abs=0.01)
+
+
+def test_compute_rate_rejects_negative_minimum() -> None:
+    # Act / Assert
+    with pytest.raises(ValueError, match="min_required_hz"):
+        compute_rate_hz([], min_required_hz=-1.0)
+
+
+def test_evaluate_gps_state_passes_at_minimum_fix() -> None:
+    """fix_type=3, provider=MSP, numSat=10 (matches emitted) → AC-3 pass."""
+    # Arrange
+    snapshot = InavGpsSnapshot(fix_type=3, num_sat=10, provider="MSP")
+
+    # Act
+    report = evaluate_inav_gps_state(snapshot, expected_num_sat=10)
+
+    # Assert
+    assert report.fix_type_ok is True
+    assert report.provider_ok is True
+    assert report.num_sat_ok is True
+    assert report.passes is True
+
+
+def test_evaluate_gps_state_fails_on_low_fix_type() -> None:
+    """fix_type=2 < 3 → AC-3 fail."""
+    # Arrange
+    snapshot = InavGpsSnapshot(fix_type=2, num_sat=10, provider="MSP")
+
+    # Act
+    report = evaluate_inav_gps_state(snapshot, expected_num_sat=10)
+
+    # Assert
+    assert report.fix_type_ok is False
+    assert report.passes is False
+
+
+def test_evaluate_gps_state_fails_on_wrong_provider() -> None:
+    """provider != MSP → AC-3 fail (fallback to internal GPS)."""
+    # Arrange
+    snapshot = InavGpsSnapshot(fix_type=3, num_sat=10, provider="INTERNAL")
+
+    # Act
+    report = evaluate_inav_gps_state(snapshot, expected_num_sat=10)
+
+    # Assert
+    assert report.provider_ok is False
+    assert report.passes is False
+
+
+def test_evaluate_gps_state_fails_on_num_sat_mismatch() -> None:
+    """numSat reported by iNav must match the value emitted by SUT."""
+    # Arrange
+    snapshot = InavGpsSnapshot(fix_type=3, num_sat=12, provider="MSP")
+
+    # Act
+    report = evaluate_inav_gps_state(snapshot, expected_num_sat=10)
+
+    # Assert
+    assert report.num_sat_ok is False
+    assert report.passes is False
+
+
+def test_evaluate_gps_state_rejects_negative_expected_num_sat() -> None:
+    # Arrange
+    snapshot = InavGpsSnapshot(fix_type=3, num_sat=10, provider="MSP")
+
+    # Act / Assert
+    with pytest.raises(ValueError, match="expected_num_sat"):
+        evaluate_inav_gps_state(snapshot, expected_num_sat=-1)
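Reviewer note on the rate tests above: the frame counts (301 for 60 s at 5 Hz, 46 for 10 s at 4.5 Hz) and the 0.01 Hz tolerance are consistent with a rate defined as intervals over window span, i.e. `(n - 1) / window`. The sketch below assumes that definition; the body of `compute_rate_hz` is not shown in this diff. It also reproduces the millisecond rounding in `_frames`, which means the nominal 4.5 Hz stream actually runs slightly fast. `observed_rate_hz` and `synthetic_timestamps_ms` are hypothetical names.

```python
from __future__ import annotations


def synthetic_timestamps_ms(rate_hz: float, n: int) -> list[int]:
    """Timestamps for n frames with the period rounded to whole milliseconds."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be > 0")
    period_ms = int(round(1000.0 / rate_hz))
    return [i * period_ms for i in range(n)]


def observed_rate_hz(timestamps_ms: list[int]) -> float:
    """Rate as (frame intervals) / (window span); 0.0 when no window exists."""
    if len(timestamps_ms) < 2:
        return 0.0
    window_ms = timestamps_ms[-1] - timestamps_ms[0]
    if window_ms <= 0:
        return 0.0
    return (len(timestamps_ms) - 1) * 1000.0 / window_ms


if __name__ == "__main__":
    for rate, n in ((5.0, 301), (4.5, 46), (3.0, 31)):
        print(rate, n, observed_rate_hz(synthetic_timestamps_ms(rate, n)))
```

Under this definition the "boundary" case lands near 4.5045 Hz (222 ms period), so it exercises a value just above the minimum rather than 4.5 Hz exactly. A stream built from explicit timestamps would be needed to hit the inclusive boundary itself.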
@@ -49,6 +49,9 @@ E2E_ROOT = Path(__file__).resolve().parents[1]
        "runner/helpers/multi_segment_evaluator.py",
        "runner/helpers/smoothing_evaluator.py",
        "runner/helpers/sharp_turn_detector.py",
+        "runner/helpers/msp_frame_observer.py",
+        "runner/helpers/ap_contract_evaluator.py",
+        "runner/helpers/cold_start_evaluator.py",
        "fixtures/mock-suite-sat/Dockerfile",
        "fixtures/mock-suite-sat/app.py",
        "fixtures/mock-suite-sat/requirements.txt",
@@ -89,7 +92,10 @@ E2E_ROOT = Path(__file__).resolve().parents[1]
        "tests/positive/test_ft_p_06_mre_budgets.py",
        "tests/positive/test_ft_p_07_sharp_turn_recovery.py",
        "tests/positive/test_ft_p_08_multi_segment_reloc.py",
+        "tests/positive/test_ft_p_09_ap_signing.py",
+        "tests/positive/test_ft_p_09_inav.py",
        "tests/positive/test_ft_p_10_smoothing_lookback.py",
+        "tests/positive/test_ft_p_11_cold_start_init.py",
        "tests/negative/test_ft_n_02_sharp_turn_failure.py",
    ],
 )
@@ -0,0 +1,240 @@
+"""ArduPilot contract + signing-handshake evaluation for FT-P-09-AP (AZ-416).
+
+Given the captured ``.tlog`` from ``mavproxy-listener`` plus a single
+EK3_SRC1_POSXY parameter read, this helper validates:
+
+* AC-1: signing handshake completes within ≤5 s
+  (``observe_signing_handshake``: first signed message within the
+  window AND no ``BAD_SIGNATURE`` STATUSTEXT during it).
+* AC-2: GPS_INPUT flow at ≥4.5 Hz over the 60 s replay
+  (``compute_gps_input_rate``).
+* AC-3: EK3_SRC1_POSXY == 3 (``validate_ek3_src1_posxy`` — pure check
+  on the param value the caller fetched via mavproxy).
+* AC-4: GPS_RAW_INT health — ``fix_type ≥ 3`` AND ``eph ≤ 200``
+  (HDOP ≤ 2.0) for ≥80 % of the 60 s window
+  (``evaluate_gps_raw_int_health``).
+
+All tlog-derived inputs are plain ``Iterable[TlogMessage]``; tlog ingestion
+is delegated to ``runner.helpers.mavproxy_tlog_reader.iter_messages``.
+
+Public-boundary discipline: does NOT import any
+``src/gps_denied_onboard`` symbol.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Iterable
+
+from .mavproxy_tlog_reader import TlogMessage
+
+HANDSHAKE_BUDGET_S = 5.0
+GPS_INPUT_TARGET_RATE_HZ = 5.0
+GPS_INPUT_MIN_RATE_HZ = 4.5
+GPS_RAW_INT_MIN_FIX_TYPE = 3
+GPS_RAW_INT_MAX_EPH = 200  # HDOP × 100 ≤ 200 → HDOP ≤ 2.0
+GPS_RAW_INT_HEALTHY_FRACTION_REQUIRED = 0.80
+EK3_SRC1_POSXY_REQUIRED = 3  # AP EKF source-set: 3 = GPS
+
+
+@dataclass(frozen=True)
+class HandshakeReport:
+    """AC-1: signing-handshake completion observation."""
+
+    window_start_us: int
+    window_end_us: int
+    first_signed_us: int | None
+    bad_signature_count: int
+    setup_signing_seen: bool
+
+    @property
+    def lag_s(self) -> float | None:
+        if self.first_signed_us is None:
+            return None
+        return (self.first_signed_us - self.window_start_us) / 1_000_000.0
+
+    @property
+    def passes(self) -> bool:
+        return (
+            self.first_signed_us is not None
+            and self.lag_s is not None
+            and self.lag_s <= HANDSHAKE_BUDGET_S
+            and self.bad_signature_count == 0
+        )
+
+
+@dataclass(frozen=True)
+class GpsInputRateReport:
+    """AC-2: GPS_INPUT rate over the replay window."""
+
+    frame_count: int
+    window_us: int
+    observed_rate_hz: float
+    target_rate_hz: float = GPS_INPUT_TARGET_RATE_HZ
+    min_required_hz: float = GPS_INPUT_MIN_RATE_HZ
+
+    @property
+    def passes(self) -> bool:
+        return (
+            self.window_us > 0
+            and self.observed_rate_hz >= self.min_required_hz
+        )
+
+
+@dataclass(frozen=True)
+class GpsRawIntHealthReport:
+    """AC-4: GPS_RAW_INT fix_type + eph healthy fraction."""
+
+    total_samples: int
+    healthy_samples: int
+    fraction_required: float = GPS_RAW_INT_HEALTHY_FRACTION_REQUIRED
+
+    @property
+    def healthy_fraction(self) -> float:
+        if self.total_samples == 0:
+            return 0.0
+        return self.healthy_samples / self.total_samples
+
+    @property
+    def passes(self) -> bool:
+        return (
+            self.total_samples > 0
+            and self.healthy_fraction >= self.fraction_required
+        )
+
+
+def observe_signing_handshake(
+    messages: Iterable[TlogMessage],
+    *,
+    handshake_window_us: int = int(HANDSHAKE_BUDGET_S * 1_000_000),
+) -> HandshakeReport:
+    """AC-1: first signed message within ``handshake_window_us``.
+
+    The handshake window starts at the FIRST observed message's
+    timestamp (the SUT cannot be heard from before that). The result
+    PASSES if a signed message arrives within the window AND no
+    ``STATUSTEXT`` with ``BAD_SIGNATURE`` is observed during it.
+
+    The SETUP_SIGNING handshake exchange itself is unsigned by spec
+    (it's how the key is shared), so its presence is reported but does
+    NOT gate the pass — the gate is the first SIGNED follow-up.
+    """
+    if handshake_window_us <= 0:
+        raise ValueError(f"handshake_window_us must be > 0, got {handshake_window_us}")
+    window_start: int | None = None
+    window_end: int | None = None
+    first_signed_us: int | None = None
+    bad_sig_count = 0
+    setup_signing_seen = False
+
+    for m in messages:
+        if window_start is None:
+            window_start = m.timestamp_us
+            window_end = window_start + handshake_window_us
+        if window_end is not None and m.timestamp_us > window_end:
+            break
+        if m.msg_type == "SETUP_SIGNING":
+            setup_signing_seen = True
+        if m.signed and first_signed_us is None:
+            first_signed_us = m.timestamp_us
+        if m.msg_type == "STATUSTEXT":
+            text = str(m.fields.get("text", "")).upper()
+            if "BAD_SIGNATURE" in text:
+                bad_sig_count += 1
+
+    return HandshakeReport(
+        window_start_us=window_start or 0,
+        window_end_us=window_end or 0,
+        first_signed_us=first_signed_us,
+        bad_signature_count=bad_sig_count,
+        setup_signing_seen=setup_signing_seen,
+    )
+
+
+def compute_gps_input_rate(
+    messages: Iterable[TlogMessage],
+    *,
+    target_rate_hz: float = GPS_INPUT_TARGET_RATE_HZ,
+    min_required_hz: float = GPS_INPUT_MIN_RATE_HZ,
+) -> GpsInputRateReport:
+    """AC-2: GPS_INPUT cadence over the entire message stream."""
+    if min_required_hz < 0:
+        raise ValueError(f"min_required_hz must be ≥0, got {min_required_hz}")
+    timestamps = [m.timestamp_us for m in messages if m.msg_type == "GPS_INPUT"]
+    if len(timestamps) < 2:
+        return GpsInputRateReport(
+            frame_count=len(timestamps),
+            window_us=0,
+            observed_rate_hz=0.0,
+            target_rate_hz=target_rate_hz,
+            min_required_hz=min_required_hz,
+        )
+    window_us = timestamps[-1] - timestamps[0]
+    if window_us <= 0:
+        return GpsInputRateReport(
+            frame_count=len(timestamps),
+            window_us=window_us,
+            observed_rate_hz=0.0,
+            target_rate_hz=target_rate_hz,
+            min_required_hz=min_required_hz,
+        )
+    observed = (len(timestamps) - 1) / (window_us / 1_000_000.0)
+    return GpsInputRateReport(
+        frame_count=len(timestamps),
+        window_us=window_us,
+        observed_rate_hz=observed,
+        target_rate_hz=target_rate_hz,
+        min_required_hz=min_required_hz,
+    )
+
+
+def validate_ek3_src1_posxy(value: int) -> bool:
+    """AC-3: EK3_SRC1_POSXY must equal 3 (GPS source)."""
+    return value == EK3_SRC1_POSXY_REQUIRED
+
+
+def evaluate_gps_raw_int_health(
+    messages: Iterable[TlogMessage],
+    *,
+    min_fix_type: int = GPS_RAW_INT_MIN_FIX_TYPE,
+    max_eph: int = GPS_RAW_INT_MAX_EPH,
+    fraction_required: float = GPS_RAW_INT_HEALTHY_FRACTION_REQUIRED,
+) -> GpsRawIntHealthReport:
+    """AC-4: ≥``fraction_required`` of GPS_RAW_INT samples must be healthy.
+
+    A sample is "healthy" iff ``fix_type ≥ min_fix_type`` AND ``eph ≤ max_eph``.
+    Samples with missing or non-integer fields count as unhealthy.
+    """
+    if not 0.0 <= fraction_required <= 1.0:
+        raise ValueError(
+            f"fraction_required must be in [0, 1], got {fraction_required}"
+        )
+    total = 0
+    healthy = 0
+    for m in messages:
+        if m.msg_type != "GPS_RAW_INT":
+            continue
+        total += 1
+        try:
+            fix_type = int(m.fields["fix_type"])  # type: ignore[arg-type]
+            eph = int(m.fields["eph"])  # type: ignore[arg-type]
+        except (KeyError, TypeError, ValueError):
+            continue
+        if fix_type >= min_fix_type and eph <= max_eph:
+            healthy += 1
+    return GpsRawIntHealthReport(
+        total_samples=total,
+        healthy_samples=healthy,
+        fraction_required=fraction_required,
+    )
+
+
+def collect_messages_to_list(messages: Iterable[TlogMessage]) -> list[TlogMessage]:
+    """Materialise an iterator into a list — convenience for multi-pass eval.
+
+    The scenario reads the tlog once via ``iter_messages`` and runs
+    multiple analyzers over the result. ``iter_messages`` returns a
+    generator that closes its underlying pymavlink connection on
+    exhaustion, so re-iteration is not safe without materialisation.
+    """
+    return list(messages)
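Reviewer note: the AC-1 and AC-4 checks above can be exercised without a real `.tlog`. The following is a minimal sketch, assuming a hypothetical `Msg` stand-in that carries the same four attributes as `TlogMessage`; `handshake_lag_s` and `healthy_fraction` are illustrative names that mirror the helper's logic, not the helper's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Msg:
    """Hypothetical stand-in for TlogMessage (same four attributes)."""

    timestamp_us: int
    msg_type: str
    signed: bool = False
    fields: dict = field(default_factory=dict)


def handshake_lag_s(messages, budget_s=5.0):
    """Return (lag_s, passes); the window opens at the first message."""
    window_start = None
    first_signed_us = None
    bad_signatures = 0
    for m in messages:
        if window_start is None:
            window_start = m.timestamp_us
        if m.timestamp_us > window_start + int(budget_s * 1_000_000):
            break  # past the handshake window, stop scanning
        if m.signed and first_signed_us is None:
            first_signed_us = m.timestamp_us
        text = str(m.fields.get("text", "")).upper()
        if m.msg_type == "STATUSTEXT" and "BAD_SIGNATURE" in text:
            bad_signatures += 1
    if first_signed_us is None:
        return None, False
    lag_s = (first_signed_us - window_start) / 1_000_000.0
    return lag_s, (lag_s <= budget_s and bad_signatures == 0)


def healthy_fraction(messages, min_fix_type=3, max_eph=200):
    """Fraction of GPS_RAW_INT samples with a 3D fix and eph within the cap."""
    samples = [m.fields for m in messages if m.msg_type == "GPS_RAW_INT"]
    if not samples:
        return 0.0
    healthy = sum(
        1
        for f in samples
        if f.get("fix_type", 0) >= min_fix_type
        and f.get("eph", max_eph + 1) <= max_eph
    )
    return healthy / len(samples)


# Synthetic stream: first signed frame arrives 1.2 s after the first message.
good_stream = [
    Msg(1_000_000, "HEARTBEAT"),
    Msg(1_200_000, "SETUP_SIGNING"),
    Msg(2_200_000, "GPS_INPUT", signed=True),
]
# Same stream with a BAD_SIGNATURE STATUSTEXT inside the window.
bad_stream = good_stream[:2] + [
    Msg(1_500_000, "STATUSTEXT", fields={"text": "bad_signature from sysid 1"}),
    good_stream[2],
]
# Eight healthy GPS_RAW_INT samples and two degraded ones.
raw = [Msg(i, "GPS_RAW_INT", fields={"fix_type": 3, "eph": 120}) for i in range(8)]
raw += [Msg(i, "GPS_RAW_INT", fields={"fix_type": 2, "eph": 350}) for i in range(8, 10)]

lag, ok = handshake_lag_s(good_stream)
```

Note that both conditions must hold for AC-1: a signed frame inside the window does not rescue a stream that also carries a `BAD_SIGNATURE` STATUSTEXT.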
@@ -0,0 +1,309 @@
+"""Cold-start initialization evaluation for FT-P-11 (AZ-419 / ADR-010 / AC-5.1).
+
+ADR-010 splits cold-start into two origin paths plus a conflict rule:
+
+* **Primary** (operator manifest, AZ-490): C12 bakes
+  ``flight.takeoff_origin`` into the C10 Manifest from the operator-
+  authored mission; airborne C5 consumes it BEFORE any sensor sample
+  via ``set_takeoff_origin``. Used even when the FC EKF has no valid
+  GPS.
+* **Secondary** (FC EKF, legacy AC-5.1): when the Manifest carries no
+  ``takeoff_origin``, the SUT falls back to the FC EKF snapshot.
+* **Bounded-delta conflict** (Principle #11 amended): both signals
+  present but ``|operator − fc_ekf| > 200 m`` → operator wins; FC GPS
+  is logged as suspect via a ``c5.gps_bounded_delta.reject`` FDR
+  record naming both points.
+
+This helper owns the pure-logic side:
+
+* ``write_manifest`` / ``read_manifest`` — manipulate the fixture
+  Manifest the test builder produces.
+* ``read_cold_boot_fixture`` — parse the AZ-408 cold-boot snapshot
+  JSON into a typed ``ColdBootSnapshot``.
+* ``evaluate_first_estimate`` — distance vs expected origin + source
+  label rules + FDR record presence checks.
+
+Public-boundary discipline: does NOT import any
+``src/gps_denied_onboard`` symbol.
+"""
+
+from __future__ import annotations
+
+import json
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Iterable, Mapping, Sequence
+
+from .geo import distance_m
+
+ACCURACY_BUDGET_M = 50.0  # AC-1/AC-2/AC-4: estimate within ±50 m of origin
+BOUNDED_DELTA_TRIGGER_M = 200.0  # ADR-010 Principle #11 amended
+FIRST_EMISSION_BUDGET_S = 30.0  # AC-1/AC-NEW-1
+FORBIDDEN_FIRST_LABEL_BOUNDED_DELTA = "satellite_anchored"
+FDR_RECORD_ORIGIN_SET = "c5.cold_start_origin.set"
+FDR_RECORD_ORIGIN_UNAVAILABLE = "c5.cold_start_origin.unavailable"
+FDR_RECORD_BOUNDED_DELTA_REJECT = "c5.gps_bounded_delta.reject"
+
+
+@dataclass(frozen=True)
+class LatLonAlt:
+    """One geodetic point: WGS84 degrees + altitude in meters."""
+
+    lat_deg: float
+    lon_deg: float
+    alt_m: float
+
+
+@dataclass(frozen=True)
+class ManifestOrigin:
+    """A subset of the C10 Manifest for FT-P-11 — just the takeoff_origin."""
+
+    takeoff_origin: LatLonAlt | None
+
+
+@dataclass(frozen=True)
+class ColdBootSnapshot:
+    """Parsed AZ-408 cold-boot fixture (FC EKF snapshot pose)."""
+
+    lat_deg: float
+    lon_deg: float
+    alt_m: float
+    schema: str
+
+
+@dataclass(frozen=True)
+class OutboundEstimate:
+    """First outbound estimate observed by the scenario."""
+
+    monotonic_ms: int
+    lat_deg: float
+    lon_deg: float
+    source_label: str
+
+
+@dataclass(frozen=True)
+class FdrAuditRecord:
+    """One FDR record relevant to cold-start auditing."""
+
+    monotonic_ms: int
+    record_type: str
+    payload: Mapping[str, object]
+
+
+@dataclass(frozen=True)
+class FirstEstimateReport:
+    """AC-1 / AC-2 / AC-4: distance + label + FDR record audit."""
+
+    origin_source: str
+    expected_origin: LatLonAlt | None
+    actual_estimate: OutboundEstimate | None
+    distance_m: float | None
+    source_label_ok: bool
+    fdr_origin_set_seen: bool
+    fdr_origin_set_source: str | None
+    fdr_bounded_delta_seen: bool
+    fdr_bounded_delta_a: LatLonAlt | None
+    fdr_bounded_delta_b: LatLonAlt | None
+
+    @property
+    def passes_distance(self) -> bool:
+        return (
+            self.distance_m is not None
+            and self.distance_m <= ACCURACY_BUDGET_M
+        )
+
+
+@dataclass(frozen=True)
+class NoOriginReport:
+    """AC-3: SUT MUST refuse takeoff when no origin is available."""
+
+    estimate_within_budget: bool  # True iff an estimate WAS produced — failure mode
+    fdr_origin_unavailable_seen: bool
+
+    @property
+    def passes(self) -> bool:
+        # AC-3 passes when NO estimate was produced AND the FDR records
+        # the takeoff-abort signal.
+        return not self.estimate_within_budget and self.fdr_origin_unavailable_seen
+
+
+def write_manifest(out_path: Path, takeoff_origin: LatLonAlt | None) -> Path:
+    """Write a minimal C10-Manifest-shaped JSON for the test fixture builder.
+
+    The schema mirrors the AZ-323 canonical Manifest serialization just
+    closely enough that the SUT's ``set_takeoff_origin`` consumer
+    accepts it. Field shape mirrors `_docs/02_document/contracts/c12_*`.
+    """
+    out_path.parent.mkdir(parents=True, exist_ok=True)
+    payload: dict[str, object] = {"_schema": "ft-p-11-test-manifest/v1"}
+    if takeoff_origin is not None:
+        payload["flight"] = {
+            "takeoff_origin": {
+                "lat_deg": takeoff_origin.lat_deg,
+                "lon_deg": takeoff_origin.lon_deg,
+                "alt_m": takeoff_origin.alt_m,
+            }
+        }
+    else:
+        payload["flight"] = {}
+    out_path.write_text(json.dumps(payload, indent=2))
+    return out_path
+
+
+def read_manifest(manifest_path: Path) -> ManifestOrigin:
+    """Read a Manifest JSON and extract the ``takeoff_origin`` if present."""
+    if not manifest_path.exists():
+        raise FileNotFoundError(f"manifest not found: {manifest_path}")
+    payload = json.loads(manifest_path.read_text())
+    origin_raw = payload.get("flight", {}).get("takeoff_origin")
+    if origin_raw is None:
+        return ManifestOrigin(takeoff_origin=None)
+    return ManifestOrigin(
+        takeoff_origin=LatLonAlt(
+            lat_deg=float(origin_raw["lat_deg"]),
+            lon_deg=float(origin_raw["lon_deg"]),
+            alt_m=float(origin_raw["alt_m"]),
+        )
+    )
+
+
+def read_cold_boot_fixture(fixture_path: Path) -> ColdBootSnapshot:
+    """Parse the AZ-408 cold-boot JSON into a typed snapshot.
+
+    Converts the fixture's ``lat_e7 / lon_e7 / alt_mm`` (MAVLink int32
+    units, 1e-7 deg + millimeters) to ``lat_deg / lon_deg / alt_m``.
+    """
+    if not fixture_path.exists():
+        raise FileNotFoundError(f"cold-boot fixture not found: {fixture_path}")
+    payload = json.loads(fixture_path.read_text())
+    schema = str(payload.get("_schema", ""))
+    pose = payload["global_position_int"]
+    return ColdBootSnapshot(
+        lat_deg=int(pose["lat_e7"]) / 1e7,
+        lon_deg=int(pose["lon_e7"]) / 1e7,
+        alt_m=int(pose["alt_mm"]) / 1000.0,
+        schema=schema,
+    )
+
+
+def _scan_fdr_for_cold_start(
+    fdr_records: Iterable[FdrAuditRecord],
+) -> dict[str, object]:
+    """Single pass collecting all cold-start-relevant FDR signals."""
+    origin_set_source: str | None = None
+    origin_set_seen = False
+    origin_unavailable_seen = False
+    bounded_delta_seen = False
+    bounded_delta_a: LatLonAlt | None = None
+    bounded_delta_b: LatLonAlt | None = None
+    for r in fdr_records:
+        if r.record_type == FDR_RECORD_ORIGIN_SET:
+            origin_set_seen = True
+            src = r.payload.get("source")
+            if src is not None:
+                origin_set_source = str(src)
+        elif r.record_type == FDR_RECORD_ORIGIN_UNAVAILABLE:
+            origin_unavailable_seen = True
+        elif r.record_type == FDR_RECORD_BOUNDED_DELTA_REJECT:
+            bounded_delta_seen = True
+            a = r.payload.get("a")
+            b = r.payload.get("b")
+            if isinstance(a, Mapping):
+                bounded_delta_a = LatLonAlt(
+                    lat_deg=float(a["lat_deg"]),  # type: ignore[arg-type]
+                    lon_deg=float(a["lon_deg"]),  # type: ignore[arg-type]
+                    alt_m=float(a.get("alt_m", 0.0)),  # type: ignore[arg-type]
+                )
+            if isinstance(b, Mapping):
+                bounded_delta_b = LatLonAlt(
+                    lat_deg=float(b["lat_deg"]),  # type: ignore[arg-type]
+                    lon_deg=float(b["lon_deg"]),  # type: ignore[arg-type]
+                    alt_m=float(b.get("alt_m", 0.0)),  # type: ignore[arg-type]
+                )
+    return {
+        "origin_set_seen": origin_set_seen,
+        "origin_set_source": origin_set_source,
+        "origin_unavailable_seen": origin_unavailable_seen,
+        "bounded_delta_seen": bounded_delta_seen,
+        "bounded_delta_a": bounded_delta_a,
+        "bounded_delta_b": bounded_delta_b,
+    }
+
+
+def evaluate_first_estimate(
+    *,
+    origin_source: str,
+    expected_origin: LatLonAlt | None,
+    first_estimate: OutboundEstimate | None,
+    fdr_records: Sequence[FdrAuditRecord],
+) -> FirstEstimateReport:
+    """Evaluate AC-1/AC-2/AC-4 given the first observed outbound estimate.
+
+    ``origin_source`` is one of:
+    * ``"operator_manifest"`` — AC-1: distance ≤50 m of A AND FDR has
+      ``c5.cold_start_origin.set(source="manifest")``.
+    * ``"fc_ekf"`` — AC-2: distance ≤50 m of FC EKF snapshot AND FDR
+      has ``c5.cold_start_origin.set(source="fc_ekf")``.
+    * ``"bounded_delta_conflict"`` — AC-4: distance ≤50 m of A;
+      source_label != ``satellite_anchored``; FDR has
+      ``c5.gps_bounded_delta.reject`` naming both A and B.
+
+    Any other source string raises ``ValueError``.
+    """
+    if origin_source not in {"operator_manifest", "fc_ekf", "bounded_delta_conflict"}:
+        raise ValueError(
+            f"unknown origin_source {origin_source!r}; expected one of "
+            "{operator_manifest, fc_ekf, bounded_delta_conflict}"
+        )
+
+    distance: float | None = None
+    if first_estimate is not None and expected_origin is not None:
+        distance = distance_m(
+            expected_origin.lat_deg, expected_origin.lon_deg,
+            first_estimate.lat_deg, first_estimate.lon_deg,
+        )
+
+    if origin_source == "bounded_delta_conflict":
+        label_ok = (
+            first_estimate is not None
+            and first_estimate.source_label != FORBIDDEN_FIRST_LABEL_BOUNDED_DELTA
+        )
+    else:
+        label_ok = first_estimate is not None  # any label acceptable for AC-1/AC-2
+
+    audit = _scan_fdr_for_cold_start(fdr_records)
+
+    return FirstEstimateReport(
+        origin_source=origin_source,
+        expected_origin=expected_origin,
+        actual_estimate=first_estimate,
+        distance_m=distance,
+        source_label_ok=label_ok,
+        fdr_origin_set_seen=bool(audit["origin_set_seen"]),
+        fdr_origin_set_source=audit["origin_set_source"],  # type: ignore[arg-type]
+        fdr_bounded_delta_seen=bool(audit["bounded_delta_seen"]),
+        fdr_bounded_delta_a=audit["bounded_delta_a"],  # type: ignore[arg-type]
+        fdr_bounded_delta_b=audit["bounded_delta_b"],  # type: ignore[arg-type]
+    )
+
+
+def evaluate_no_origin_path(
+    *,
+    first_estimate: OutboundEstimate | None,
+    fdr_records: Sequence[FdrAuditRecord],
+) -> NoOriginReport:
+    """AC-3: Manifest empty + SITL no GPS → SUT must NOT emit anything.
+
+    Returns ``passes=True`` iff no outbound estimate was produced AND
+    the FDR carries ``c5.cold_start_origin.unavailable``.
+    """
+    audit = _scan_fdr_for_cold_start(fdr_records)
+    return NoOriginReport(
+        estimate_within_budget=first_estimate is not None,
+        fdr_origin_unavailable_seen=bool(audit["origin_unavailable_seen"]),
+    )
+
+
+def bounded_delta_distance_m(a: LatLonAlt, b: LatLonAlt) -> float:
+    """Convenience: AC-4 trigger condition is ``vincenty(A, B) > 200 m``."""
+    return distance_m(a.lat_deg, a.lon_deg, b.lat_deg, b.lon_deg)
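Reviewer note: the origin-selection rules described in the module docstring can be sketched as a small classifier. This is an illustration under stated assumptions, not the SUT's implementation: `haversine_m` is a spherical stand-in for `geo.distance_m` (whose body is not part of this diff), `classify_origin_source` and the `"no_origin"` label are hypothetical names, and points are plain `(lat_deg, lon_deg)` tuples.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]  # (lat_deg, lon_deg); hypothetical alias for this sketch

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an assumption of this sketch
BOUNDED_DELTA_TRIGGER_M = 200.0  # same value as the helper's constant


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance in meters (stand-in for geo.distance_m)."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def classify_origin_source(operator: Optional[Point], fc_ekf: Optional[Point]) -> str:
    """Map the two candidate origins onto the scenario labels used by FT-P-11."""
    if operator is None and fc_ekf is None:
        return "no_origin"  # AC-3: the SUT is expected to abort
    if operator is None:
        return "fc_ekf"  # AC-2: fall back to the FC EKF snapshot
    if fc_ekf is not None and haversine_m(operator, fc_ekf) > BOUNDED_DELTA_TRIGGER_M:
        return "bounded_delta_conflict"  # AC-4: operator wins, FC GPS is suspect
    return "operator_manifest"  # AC-1: operator origin is primary


operator_origin = (50.0, 36.0)
fc_far = (50.0, 36.01)    # roughly 715 m east of the operator origin
fc_near = (50.0, 36.001)  # roughly 71 m east of the operator origin
```

The operator origin stays primary whenever it is present; the FC EKF position only changes the label, never the winner.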
@@ -10,8 +10,11 @@ This module exposes a small typed wrapper so per-scenario tests can:
       of signed vs unsigned messages for NFT-SEC-03).
    3. Attach the source `.tlog` path to the evidence bundler.

-Concrete iteration logic is owned by AZ-416 (FT-P-09-AP); AZ-406 commits
-to the public surface.
+AZ-416 (FT-P-09-AP) owns the pymavlink-backed body; AZ-406 committed to
+the public surface.
+
+Public-boundary discipline: does NOT import any ``src/gps_denied_onboard``
+symbol.
 """

 from __future__ import annotations
@@ -20,6 +23,8 @@ from dataclasses import dataclass
 from pathlib import Path
 from typing import Iterator

+from pymavlink import mavutil
+

@dataclass(frozen=True)
 class TlogMessage:
@@ -32,12 +37,53 @@ class TlogMessage:
 def iter_messages(tlog_path: Path) -> Iterator[TlogMessage]:
    """Iterate `.tlog` messages oldest-first.

-    AZ-406 raises until AZ-416 fills in the pymavlink-backed iterator.
+    Uses ``pymavlink.mavutil.mavlink_connection`` in tlog-file mode.
+    Each yielded ``TlogMessage`` carries:
+
+    * ``timestamp_us`` — unix microseconds, as recorded by mavproxy
+      (pymavlink exposes this as ``msg._timestamp`` in seconds-float).
+    * ``msg_type`` — message name (e.g. ``"GPS_INPUT"``, ``"GPS_RAW_INT"``).
+    * ``signed`` — True iff the wire frame carried a MAVLink 2.0
+      signature block (`msg.get_signed()` on pymavlink ≥2.4).
+    * ``fields`` — dict of field name → value, via ``msg.to_dict()``
+      minus the ``mavpackettype`` key.
+
+    Frames that pymavlink reports as ``BAD_DATA``, or whose ``to_dict()``
+    raises, are skipped; ``recv_match`` returning ``None`` ends iteration (EOF).
    """
-    raise NotImplementedError(
-        "mavproxy_tlog_reader.iter_messages is owned by AZ-416 — "
-        "AZ-406 supplies only the public surface."
-    )
+    if not tlog_path.exists():
+        raise FileNotFoundError(f"tlog not found: {tlog_path}")
+
+    conn = mavutil.mavlink_connection(str(tlog_path))
+    try:
+        while True:
+            msg = conn.recv_match(blocking=False)
+            if msg is None:
+                break
+            msg_type = msg.get_type()
+            if msg_type == "BAD_DATA":
+                continue
+            try:
+                fields = msg.to_dict()
+            except Exception:
+                continue
+            fields.pop("mavpackettype", None)
+            ts_s = getattr(msg, "_timestamp", 0.0) or 0.0
+            try:
+                signed = bool(msg.get_signed())
+            except AttributeError:
+                signed = False
+            yield TlogMessage(
+                timestamp_us=int(ts_s * 1_000_000),
+                msg_type=msg_type,
+                signed=signed,
+                fields=fields,
+            )
+    finally:
+        try:
+            conn.close()
+        except Exception:
+            pass


 def count_by_type(tlog_path: Path) -> dict[str, int]:
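Reviewer note: because `iter_messages` is a generator that closes its connection in a `finally` block, it can be consumed only once. The sketch below illustrates that behavior with a hypothetical `iter_once` generator over a plain list; it does not touch pymavlink, and the `closed` list is only there to make the cleanup observable.

```python
from typing import Iterator, List


def iter_once(records: List[int], closed: List[bool]) -> Iterator[int]:
    """Yield records, then record that the (pretend) connection was closed."""
    try:
        for r in records:
            yield r
    finally:
        closed.append(True)  # runs once, when the generator is exhausted or closed


closed: List[bool] = []
gen = iter_once([1, 2, 3], closed)

first_pass = list(gen)   # drains the generator and triggers the cleanup
second_pass = list(gen)  # the generator is already exhausted

materialised = list(iter_once([1, 2, 3], []))
count_a = len(materialised)                      # a list can be walked repeatedly
count_b = sum(1 for _ in materialised)
```

This is why the scenario materialises the stream into a list before running several analyzers over it.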
@@ -0,0 +1,155 @@
+"""MSP2 frame observer for FT-P-09-iNav (AZ-417 / AC-4.3).
+
+iNav consumes MSP2 over a TCP socket on port 5760. The SUT's
+``c8_fc_adapter`` (iNav-side) emits ``MSP2_SENSOR_GPS`` (function ID
+0x1F03) frames at a configured cadence (target 5 Hz per AC-2).
+
+This helper owns the pure-logic side of FT-P-09-iNav:
+
+* ``compute_rate_hz`` — given a sequence of frame-arrival timestamps,
+  return the observed Hz over a window.
+* ``count_frames_by_id`` — filter + tally per MSP function ID.
+* ``evaluate_inav_gps_state`` — given a snapshot of iNav's ``gpsSol``
+  + ``provider`` after replay, assert AC-3 (fix_type ≥ 3, provider =
+  MSP, numSat matches the emitted value).
+
+The TCP-probe + actual MSP frame capture path is owned by AZ-407
+(``runner.helpers.sitl_observer``) and the iNav SITL docker compose
+service. This module only consumes already-captured data.
+
+Public-boundary discipline: does NOT import any ``src/gps_denied_onboard``
+symbol.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Sequence
+
+MSP2_SENSOR_GPS_FUNCTION_ID = 0x1F03
+DEFAULT_TARGET_RATE_HZ = 5.0
+MIN_OBSERVED_RATE_HZ = 4.5  # AC-2: ≥4.5 Hz observed for 5 Hz target
+MIN_FIX_TYPE = 3  # AC-3: gpsSol.fixType ≥ 3
+REQUIRED_PROVIDER = "MSP"  # AC-3: provider=MSP (no fallback to internal GPS)
+
+
+@dataclass(frozen=True)
+class MspFrameSample:
+    """One MSP frame as captured by the SITL-side observer."""
+
+    monotonic_ms: int
+    function_id: int
+
+
+@dataclass(frozen=True)
+class InavGpsSnapshot:
+    """Snapshot of iNav's ``gpsSol`` + provider state after replay."""
+
+    fix_type: int
+    num_sat: int
+    provider: str
+
+
+@dataclass(frozen=True)
+class RateReport:
+    """Observed rate over a window with pass/fail vs spec target."""
+
+    frame_count: int
+    window_ms: int
+    observed_rate_hz: float
+    target_rate_hz: float
+    min_required_hz: float
+
+    @property
+    def passes(self) -> bool:
+        return (
+            self.window_ms > 0
+            and self.observed_rate_hz >= self.min_required_hz
+        )
+
+
+@dataclass(frozen=True)
+class InavGpsReport:
+    """Evaluation of iNav GPS state against AC-3."""
+
+    snapshot: InavGpsSnapshot
+    expected_num_sat: int
+    fix_type_ok: bool
+    provider_ok: bool
+    num_sat_ok: bool
+
+    @property
+    def passes(self) -> bool:
+        return self.fix_type_ok and self.provider_ok and self.num_sat_ok
+
+
+def count_frames_by_id(samples: Sequence[MspFrameSample]) -> dict[int, int]:
+    """Tally per MSP function ID."""
+    counts: dict[int, int] = {}
+    for s in samples:
+        counts[s.function_id] = counts.get(s.function_id, 0) + 1
+    return counts
+
+
+def compute_rate_hz(
+    samples: Sequence[MspFrameSample],
+    *,
+    function_id: int = MSP2_SENSOR_GPS_FUNCTION_ID,
+    target_rate_hz: float = DEFAULT_TARGET_RATE_HZ,
+    min_required_hz: float = MIN_OBSERVED_RATE_HZ,
+) -> RateReport:
+    """Compute observed Hz for the given function_id over the sample window.
+
+    The window is ``[first_sample.monotonic_ms, last_sample.monotonic_ms]``
+    inclusive. A window of zero ms (≤1 matching sample) is reported but
+    will not pass.
+    """
+    if min_required_hz < 0:
+        raise ValueError(f"min_required_hz must be ≥0, got {min_required_hz}")
+    filtered = [s for s in samples if s.function_id == function_id]
+    if len(filtered) < 2:
+        return RateReport(
+            frame_count=len(filtered),
+            window_ms=0,
+            observed_rate_hz=0.0,
+            target_rate_hz=target_rate_hz,
+            min_required_hz=min_required_hz,
+        )
+    window_ms = filtered[-1].monotonic_ms - filtered[0].monotonic_ms
+    if window_ms <= 0:
+        return RateReport(
+            frame_count=len(filtered),
+            window_ms=window_ms,
+            observed_rate_hz=0.0,
+            target_rate_hz=target_rate_hz,
+            min_required_hz=min_required_hz,
+        )
+    # Rate = (count - 1) / (window in seconds); the first frame is the
+    # epoch boundary, subsequent frames define the cadence.
+    observed = (len(filtered) - 1) / (window_ms / 1000.0)
+    return RateReport(
+        frame_count=len(filtered),
+        window_ms=window_ms,
+        observed_rate_hz=observed,
+        target_rate_hz=target_rate_hz,
+        min_required_hz=min_required_hz,
+    )
+
+
+def evaluate_inav_gps_state(
+    snapshot: InavGpsSnapshot,
+    *,
+    expected_num_sat: int,
+    min_fix_type: int = MIN_FIX_TYPE,
+    required_provider: str = REQUIRED_PROVIDER,
+) -> InavGpsReport:
+    """Validate AC-3: fix_type ≥3, provider=MSP, numSat matches emitted value."""
+    if expected_num_sat < 0:
+        raise ValueError(f"expected_num_sat must be ≥0, got {expected_num_sat}")
+    return InavGpsReport(
+        snapshot=snapshot,
+        expected_num_sat=expected_num_sat,
+        fix_type_ok=snapshot.fix_type >= min_fix_type,
+        provider_ok=snapshot.provider == required_provider,
+        num_sat_ok=snapshot.num_sat == expected_num_sat,
+    )
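Reviewer note: the `(count - 1) / window` formula in `compute_rate_hz` avoids a fencepost error. A 5 Hz stream over 60 s produces 301 frames but only 300 intervals, so dividing the frame count by the window would overstate the rate. A minimal sketch on bare millisecond timestamps (`rate_hz` is an illustrative name, not the helper's API):

```python
from typing import List


def rate_hz(timestamps_ms: List[int]) -> float:
    """Intervals per second between the first and last timestamp."""
    if len(timestamps_ms) < 2:
        return 0.0
    window_ms = timestamps_ms[-1] - timestamps_ms[0]
    if window_ms <= 0:
        return 0.0
    return (len(timestamps_ms) - 1) / (window_ms / 1000.0)


# Ideal 5 Hz stream: one frame every 200 ms from t=0 to t=60 s (301 frames).
ideal = list(range(0, 60_001, 200))

# Drop every 10th frame (30 of them); first and last frames are kept.
lossy_10 = [t for i, t in enumerate(ideal) if i % 10 != 9]

# Drop every 5th frame (60 of them); this falls below the 4.5 Hz floor.
lossy_5 = [t for i, t in enumerate(ideal) if i % 5 != 4]

naive = len(ideal) / 60.0  # counts frames instead of intervals
```

With these inputs the ideal stream evaluates to exactly 5.0 Hz, the 10 % loss case lands exactly on the 4.5 Hz threshold (and so still passes a `>=` comparison), and the 20 % loss case evaluates to 4.0 Hz.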
@@ -0,0 +1,184 @@
+"""FT-P-09-AP — ArduPilot GPS_INPUT contract + MAVLink 2.0 signing (AZ-416 / AC-4.3).
+
+The full scenario:
+
+1. Force ``fc_adapter=ardupilot``; load ``mavlink-test-passkey.txt``
+   as the docker secret feeding the SUT signing channel.
+2. Start the SUT against the ArduPilot SITL container; mavproxy-listener
+   captures the wire traffic to a ``.tlog``.
+3. AC-1: parse the ``.tlog``; first signed frame must arrive within
+   ≤5 s of the first observed message; no ``BAD_SIGNATURE`` STATUSTEXT
+   in that window.
+4. Replay 60 s of Derkachi through the SUT (signed GPS_INPUT flow).
+5. AC-2: GPS_INPUT cadence over the full ``.tlog`` ≥4.5 Hz.
+6. AC-3: ``EK3_SRC1_POSXY`` (read via mavproxy parameter request) ==
+   3 (GPS source).
+7. AC-4: GPS_RAW_INT health (``fix_type ≥ 3`` AND ``eph ≤ 200``)
+   for ≥80 % of the window.
+8. AC-5: parameterised per ``vio_strategy`` (``fc_adapter`` fixed to
+   ``ardupilot``).
+
+Gated on:
+* ``runner.helpers.frame_source_replay`` — owned by AZ-441
+* ``runner.helpers.sitl_observer`` — owned by AZ-407 (AP-side leg
+  ``capture_ap_tlog`` + ``read_ap_parameter``)
+
+Pure-logic AC-1/AC-2/AC-3/AC-4 coverage lives in
+``e2e/_unit_tests/helpers/test_ap_contract_evaluator.py`` and
+``e2e/_unit_tests/helpers/test_mavproxy_tlog_reader.py``.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from runner.helpers import ap_contract_evaluator as ace
+from runner.helpers import mavproxy_tlog_reader as mtr
+
+DERKACHI_DIR = (
+    Path(__file__).resolve().parents[3]
+    / "_docs"
+    / "00_problem"
+    / "input_data"
+    / "flight_derkachi"
+)
+DERKACHI_MP4 = DERKACHI_DIR / "flight_derkachi.mp4"
+MAVLINK_PASSKEY_FIXTURE = (
+    Path(__file__).resolve().parents[2]
+    / "fixtures"
+    / "secrets"
+    / "mavlink-test-passkey.txt"
+)
+
+REPLAY_WINDOW_S = 60
+
+
+@pytest.fixture(scope="module")
+def _ap_harness_implemented() -> bool:
+    """True iff frame_source_replay + sitl_observer AP-side leg are real."""
+    from runner.helpers import sitl_observer
+    from runner.helpers.frame_source_replay import FrameSourceReplayer
+
+    try:
+        replayer = FrameSourceReplayer(sink=_NullSink())  # type: ignore[arg-type]
+        try:
+            replayer.replay_video(Path("/tmp/non-existent.mp4"))
+        except NotImplementedError:
+            return False
+        try:
+            sitl_observer.capture_ap_tlog(host="ardupilot-sitl", duration_s=0.01)
+        except (NotImplementedError, AttributeError):
+            return False
+        try:
+            sitl_observer.read_ap_parameter(host="ardupilot-sitl", name="EK3_SRC1_POSXY")
+        except (NotImplementedError, AttributeError):
+            return False
+        return True
+    except Exception:
+        return False
+
+
+class _NullSink:
+    def write_frame(self, jpeg_bytes: bytes, timestamp_ms: int) -> None:
+        return None
+
+
+@pytest.mark.traces_to("AC-4.3,AC-1,AC-2,AC-3,AC-4,AC-5,D-C8-9")
+def test_ft_p_09_ap_signing(
+    vio_strategy: str,
+    evidence_dir,  # type: ignore[no-untyped-def]
+    run_id: str,
+    nfr_recorder,  # type: ignore[no-untyped-def]
+    request,  # type: ignore[no-untyped-def]
+    _ap_harness_implemented: bool,
+) -> None:
+    """Full FT-P-09-AP scenario; parameterized per vio_strategy."""
+    fc_adapter = request.getfixturevalue("fc_adapter")
+    if fc_adapter != "ardupilot":
+        pytest.skip("FT-P-09-AP is ArduPilot-only; iNav variant is FT-P-09-iNav (AZ-417)")
+
+    if not MAVLINK_PASSKEY_FIXTURE.exists():
+        pytest.fail(
+            f"mavlink-test-passkey fixture missing at {MAVLINK_PASSKEY_FIXTURE} — "
+            "AZ-407 / AZ-408 owns the on-disk fixture."
+        )
+
+    if not _ap_harness_implemented:
+        pytest.skip(
+            "FT-P-09-AP full scenario requires runner.helpers.{frame_source_replay,"
+            "sitl_observer.capture_ap_tlog,sitl_observer.read_ap_parameter} — "
+            "currently AZ-441 / AZ-407 leftovers. Pure-logic AC-1..AC-4 covered by "
+            "e2e/_unit_tests/helpers/test_ap_contract_evaluator.py."
+        )
+
+    from runner.helpers import sitl_observer
+    from runner.helpers.frame_source_replay import FrameSourceReplayer
+
+    # 1. Drive replay (captures tlog continuously via mavproxy-listener).
+    FrameSourceReplayer(_resolve_frame_sink()).replay_video(DERKACHI_MP4)
+    tlog_path = sitl_observer.capture_ap_tlog(
+        host="ardupilot-sitl", duration_s=REPLAY_WINDOW_S,
+    )
+
+    # 2. Materialise the tlog ONCE (iter_messages is single-pass).
+    msgs = ace.collect_messages_to_list(mtr.iter_messages(tlog_path))
+    if not msgs:
+        pytest.fail(f"FT-P-09-AP: empty tlog at {tlog_path}")
+
+    # 3. AC-1: signing handshake.
+    handshake = ace.observe_signing_handshake(msgs)
+
+    # 4. AC-2: GPS_INPUT rate.
+    rate = ace.compute_gps_input_rate(msgs)
+
+    # 5. AC-3: EK3_SRC1_POSXY param read.
+    ek3_value = int(sitl_observer.read_ap_parameter(
+        host="ardupilot-sitl", name="EK3_SRC1_POSXY"
+    ))
+    ek3_ok = ace.validate_ek3_src1_posxy(ek3_value)
+
+    # 6. AC-4: GPS_RAW_INT health.
+    health = ace.evaluate_gps_raw_int_health(msgs)
+
+    # 7. NFR metrics + assertions.
+    if handshake.lag_s is not None:
+        nfr_recorder.record_metric(
+            "ft_p_09_ap.signing_handshake_s", handshake.lag_s, ac_id="AC-1"
+        )
+    nfr_recorder.record_metric(
+        "ft_p_09_ap.gps_input_rate_hz", rate.observed_rate_hz, ac_id="AC-2"
+    )
+    nfr_recorder.record_metric(
+        "ft_p_09_ap.ek3_src1_posxy", float(ek3_value), ac_id="AC-3"
+    )
+    nfr_recorder.record_metric(
+        "ft_p_09_ap.gps_raw_int_healthy_fraction", health.healthy_fraction, ac_id="AC-4"
+    )
+
+    assert handshake.passes, (
+        f"AC-1 (signing handshake ≤{ace.HANDSHAKE_BUDGET_S} s, no BAD_SIGNATURE) failed: "
+        f"first_signed_us={handshake.first_signed_us}, lag_s={handshake.lag_s}, "
+        f"bad_signature_count={handshake.bad_signature_count}"
+    )
+    assert rate.passes, (
+        f"AC-2 (GPS_INPUT ≥{ace.GPS_INPUT_MIN_RATE_HZ} Hz for "
+        f"{ace.GPS_INPUT_TARGET_RATE_HZ} Hz target) failed: "
+        f"observed_rate_hz={rate.observed_rate_hz:.3f}, frames={rate.frame_count}"
+    )
+    assert ek3_ok, (
+        f"AC-3 (EK3_SRC1_POSXY = {ace.EK3_SRC1_POSXY_REQUIRED}) failed: got {ek3_value}"
+    )
+    assert health.passes, (
+        f"AC-4 (GPS_RAW_INT healthy fraction ≥"
+        f"{ace.GPS_RAW_INT_HEALTHY_FRACTION_REQUIRED:.0%}) failed: "
+        f"observed={health.healthy_fraction:.4f}, "
+        f"healthy={health.healthy_samples}/{health.total_samples}"
+    )
+
+
+def _resolve_frame_sink():  # type: ignore[no-untyped-def]
+    raise NotImplementedError(
+        "frame sink resolution is owned by AZ-441 / runner.helpers.frame_source_replay"
+    )
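Note on AC-4 above: the scenario delegates the GPS_RAW_INT health check to `ace.evaluate_gps_raw_int_health`, whose body is not part of this hunk. The following is only a minimal sketch of how such a healthy-fraction check could look. The names `HealthReport` and `evaluate_fix_types`, the rule that `fix_type >= 3` counts as healthy, and the handling of an empty capture are all assumptions made for illustration, not the evaluator's actual implementation.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical threshold and health rule, chosen for illustration only.
HEALTHY_FRACTION_REQUIRED = 0.80
MIN_HEALTHY_FIX_TYPE = 3  # assumption: a 3D fix or better counts as healthy


@dataclass(frozen=True)
class HealthReport:
    healthy_samples: int
    total_samples: int

    @property
    def healthy_fraction(self) -> float:
        # An empty capture is reported as 0.0 so it can never pass.
        if self.total_samples == 0:
            return 0.0
        return self.healthy_samples / self.total_samples

    @property
    def passes(self) -> bool:
        return self.healthy_fraction >= HEALTHY_FRACTION_REQUIRED


def evaluate_fix_types(fix_types: Iterable[int]) -> HealthReport:
    """Count samples whose fix_type meets the assumed health rule."""
    total = 0
    healthy = 0
    for fix_type in fix_types:
        total += 1
        if fix_type >= MIN_HEALTHY_FIX_TYPE:
            healthy += 1
    return HealthReport(healthy_samples=healthy, total_samples=total)
```

Treating an empty capture as a failure mirrors the scenario's own `pytest.fail` on an empty tlog, so a silent link cannot be mistaken for a healthy one.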
@@ -0,0 +1,171 @@
+"""FT-P-09-iNav — iNav MSP2_SENSOR_GPS contract conformance (AZ-417 / AC-4.3).
+
+The full scenario:
+
+1. Force ``fc_adapter=inav``; start the SUT against the iNav SITL
+   container on ``inav-sitl:5760``.
+2. AC-1: confirm that the SUT establishes its TCP connection within
+   5 s (observable via the SITL observer's connection event).
+3. Replay 60 s of Derkachi through the SUT.
+4. AC-2: count MSP2_SENSOR_GPS (function ID 0x1F03) frame arrivals at
+   iNav; assert ≥4.5 Hz observed.
+5. AC-3: query iNav GPS state via ``msp_gps_toy`` subprocess; assert
+   ``gpsSol.fixType ≥ 3``, ``provider = "MSP"``, ``gpsSol.numSat``
+   matches the emitted value.
+6. AC-4: parameterise per ``vio_strategy`` (``fc_adapter`` fixed to
+   ``inav``).
+
+Gated on:
+* ``runner.helpers.frame_source_replay`` — owned by AZ-441
+* ``runner.helpers.sitl_observer`` — owned by AZ-407 (iNav probe leg
+  is part of the iNav-side ``inav_msp_observer`` follow-up)
+
+Pure-logic AC-2/AC-3 coverage lives in
+``e2e/_unit_tests/helpers/test_msp_frame_observer.py``.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from runner.helpers import msp_frame_observer as mfo
+
+DERKACHI_DIR = (
+    Path(__file__).resolve().parents[3]
+    / "_docs"
+    / "00_problem"
+    / "input_data"
+    / "flight_derkachi"
+)
+DERKACHI_MP4 = DERKACHI_DIR / "flight_derkachi.mp4"
+
+REPLAY_WINDOW_S = 60
+TCP_HANDSHAKE_BUDGET_S = 5
+
+
+@pytest.fixture(scope="module")
+def _inav_harness_implemented() -> bool:
+    """True iff frame_source_replay + sitl_observer iNav leg are real."""
+    from runner.helpers import sitl_observer
+    from runner.helpers.frame_source_replay import FrameSourceReplayer
+
+    try:
+        replayer = FrameSourceReplayer(sink=_NullSink())  # type: ignore[arg-type]
+        try:
+            replayer.replay_video(Path("/tmp/non-existent.mp4"))
+        except NotImplementedError:
+            return False
+        try:
+            sitl_observer.observe_inav_tcp_handshake(host="inav-sitl", port=5760, timeout_s=0.01)
+        except (NotImplementedError, AttributeError):
+            return False
+        return True
+    except Exception:
+        return False
+
+
+class _NullSink:
+    def write_frame(self, jpeg_bytes: bytes, timestamp_ms: int) -> None:
+        return None
+
+
+@pytest.mark.traces_to("AC-4.3,AC-1,AC-2,AC-3,AC-4")
+def test_ft_p_09_inav(
+    vio_strategy: str,
+    evidence_dir,  # type: ignore[no-untyped-def]
+    run_id: str,
+    nfr_recorder,  # type: ignore[no-untyped-def]
+    request,  # type: ignore[no-untyped-def]
+    _inav_harness_implemented: bool,
+) -> None:
+    """Full FT-P-09-iNav scenario; parameterized per vio_strategy.
+
+    ``fc_adapter`` is FORCED to ``inav`` (AC-4): the test skips on any
+    other adapter so the conftest matrix doesn't double-run it under
+    ``ardupilot``.
+    """
+    fc_adapter = request.getfixturevalue("fc_adapter")
+    if fc_adapter != "inav":
+        pytest.skip("FT-P-09-iNav is iNav-only; ardupilot variant is FT-P-09-AP (AZ-416)")
+
+    if not _inav_harness_implemented:
+        pytest.skip(
+            "FT-P-09-iNav full scenario requires runner.helpers.{frame_source_replay,"
+            "sitl_observer.observe_inav_tcp_handshake} — currently AZ-441 / AZ-407 leftovers. "
+            "Pure-logic AC-2/AC-3 covered by "
+            "e2e/_unit_tests/helpers/test_msp_frame_observer.py."
+        )
+
+    from runner.helpers import sitl_observer
+    from runner.helpers.frame_source_replay import FrameSourceReplayer
+
+    # 1. AC-1: TCP handshake.
+    handshake = sitl_observer.observe_inav_tcp_handshake(
+        host="inav-sitl", port=5760, timeout_s=TCP_HANDSHAKE_BUDGET_S,
+    )
+    assert handshake.established_within_s is not None, (
+        f"AC-1 (TCP connect ≤{TCP_HANDSHAKE_BUDGET_S} s) failed: no connection event"
+    )
+    assert handshake.established_within_s <= TCP_HANDSHAKE_BUDGET_S, (
+        f"AC-1 (TCP connect ≤{TCP_HANDSHAKE_BUDGET_S} s) failed: "
+        f"established_within_s={handshake.established_within_s}"
+    )
+
+    # 2. Drive replay.
+    FrameSourceReplayer(_resolve_frame_sink()).replay_video(DERKACHI_MP4)
+
+    # 3. Collect MSP frame arrivals from the iNav observer.
+    capture = sitl_observer.collect_inav_msp_frames(
+        host="inav-sitl", port=5760, window_s=REPLAY_WINDOW_S,
+    )
+    samples = [
+        mfo.MspFrameSample(monotonic_ms=int(f.monotonic_ms), function_id=int(f.function_id))
+        for f in capture.frames
+    ]
+
+    # 4. AC-2: rate.
+    rate_report = mfo.compute_rate_hz(samples)
+
+    # 5. AC-3: iNav GPS state via msp_gps_toy.
+    state = sitl_observer.query_inav_gps_state(host="inav-sitl")
+    gps_report = mfo.evaluate_inav_gps_state(
+        mfo.InavGpsSnapshot(
+            fix_type=int(state.fix_type),
+            num_sat=int(state.num_sat),
+            provider=str(state.provider),
+        ),
+        expected_num_sat=int(capture.expected_num_sat),
+    )
+
+    # 6. NFR metrics + assertions.
+    nfr_recorder.record_metric(
+        "ft_p_09_inav.frame_count", float(rate_report.frame_count), ac_id="AC-2"
+    )
+    nfr_recorder.record_metric(
+        "ft_p_09_inav.observed_rate_hz", rate_report.observed_rate_hz, ac_id="AC-2"
+    )
+    nfr_recorder.record_metric(
+        "ft_p_09_inav.tcp_handshake_s", float(handshake.established_within_s), ac_id="AC-1"
+    )
+    nfr_recorder.record_metric(
+        "ft_p_09_inav.fix_type", float(gps_report.snapshot.fix_type), ac_id="AC-3"
+    )
+
+    assert rate_report.passes, (
+        f"AC-2 (≥{mfo.MIN_OBSERVED_RATE_HZ} Hz for {mfo.DEFAULT_TARGET_RATE_HZ} Hz target) failed: "
+        f"observed_rate_hz={rate_report.observed_rate_hz:.3f}, "
+        f"frames={rate_report.frame_count}, window_ms={rate_report.window_ms}"
+    )
+    assert gps_report.passes, (
+        f"AC-3 failed: fix_type_ok={gps_report.fix_type_ok}, "
+        f"provider_ok={gps_report.provider_ok}, num_sat_ok={gps_report.num_sat_ok}; "
+        f"snapshot={gps_report.snapshot}, expected_num_sat={gps_report.expected_num_sat}"
+    )
+
+
+def _resolve_frame_sink():  # type: ignore[no-untyped-def]
+    raise NotImplementedError(
+        "frame sink resolution is owned by AZ-441 / runner.helpers.frame_source_replay"
+    )
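Note on AC-2 above: the rate check itself lives in `mfo.compute_rate_hz`, which this hunk does not show. As a rough sketch of the idea, the observed rate can be taken as the number of inter-frame intervals divided by the span between the first and last MSP2_SENSOR_GPS arrival. `FrameSample` and `observed_rate_hz` are hypothetical names, and the interval-based definition of rate is an assumption; the real helper may use the nominal replay window instead.

```python
from dataclasses import dataclass
from typing import Sequence

MSP2_SENSOR_GPS = 0x1F03  # function ID named in the scenario docstring
MIN_RATE_HZ = 4.5         # AC-2 threshold from the scenario docstring


@dataclass(frozen=True)
class FrameSample:
    monotonic_ms: int
    function_id: int


def observed_rate_hz(samples: Sequence[FrameSample]) -> float:
    """Rate of MSP2_SENSOR_GPS frames between first and last arrival."""
    stamps = sorted(
        s.monotonic_ms for s in samples if s.function_id == MSP2_SENSOR_GPS
    )
    if len(stamps) < 2:
        return 0.0  # a rate is undefined for fewer than two arrivals
    window_ms = stamps[-1] - stamps[0]
    if window_ms <= 0:
        return 0.0
    # n frames span n - 1 intervals.
    return (len(stamps) - 1) * 1000.0 / window_ms


def passes_rate(samples: Sequence[FrameSample]) -> bool:
    return observed_rate_hz(samples) >= MIN_RATE_HZ
```

Filtering on the function ID before measuring keeps unrelated MSP traffic on the same port from inflating the observed rate.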
@@ -0,0 +1,309 @@
+"""FT-P-11 — Cold-start initialization (AZ-419 / ADR-010 / AC-5.1).
+
+Three parametrized origin_source variants share one scenario function:
+
+* ``operator_manifest`` (primary path, ADR-010 / AZ-490): Manifest
+  carries ``flight.takeoff_origin = A``; SITL FC has NO valid GPS;
+  SUT cold-starts; first outbound estimate within ±50 m of A;
+  FDR has ``c5.cold_start_origin.set(source="manifest")``.
+* ``fc_ekf`` (secondary path, legacy AC-5.1): Manifest has no
+  ``takeoff_origin``; ``cold-boot-fixture`` JSON loaded into SITL;
+  first outbound estimate within ±50 m of FC EKF snapshot;
+  FDR has ``c5.cold_start_origin.set(source="fc_ekf")``.
+* ``bounded_delta_conflict`` (ADR-010 Principle #11 amended): Manifest
+  carries ``takeoff_origin = A``; FC EKF reports B with
+  ``vincenty(A, B) > 200 m``; first outbound estimate within ±50 m of
+  A; source_label is NOT ``satellite_anchored``; FDR has
+  ``c5.gps_bounded_delta.reject`` naming both A and B.
+
+The fourth variant exercised by AC-3 (no origin available → SUT
+refuses takeoff) lives in a separate scenario function in the same
+module so the parametrize matrix for the other three stays clean.
+
+Gated on the upstream replay + SITL observer + FDR helpers; pure
+logic is covered by
+``e2e/_unit_tests/helpers/test_cold_start_evaluator.py``.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from runner.helpers import cold_start_evaluator as cse
+
+DERKACHI_DIR = (
+    Path(__file__).resolve().parents[3]
+    / "_docs"
+    / "00_problem"
+    / "input_data"
+    / "flight_derkachi"
+)
+DERKACHI_MP4 = DERKACHI_DIR / "flight_derkachi.mp4"
+COLD_BOOT_FIXTURE = (
+    Path(__file__).resolve().parents[2]
+    / "fixtures"
+    / "cold-boot"
+    / "cold_boot_fixture.json"
+)
+
+OPERATOR_ORIGIN = cse.LatLonAlt(lat_deg=50.0, lon_deg=36.2, alt_m=200.0)
+
+
+@pytest.fixture(scope="module")
+def _cold_start_harness_implemented() -> bool:
+    """True iff frame_source_replay + sitl_observer + fdr_reader are real.
+
+    Cold start adds two specific SITL-observer surfaces beyond the
+    common replay path: ``prepare_sitl_cold_boot`` (parameter-load
+    path) and ``prepare_sitl_no_gps`` (``SIM_GPS_DISABLE = 1``).
+    """
+    from runner.helpers import fdr_reader, sitl_observer
+    from runner.helpers.frame_source_replay import FrameSourceReplayer
+
+    try:
+        replayer = FrameSourceReplayer(sink=_NullSink())  # type: ignore[arg-type]
+        try:
+            replayer.replay_video(Path("/tmp/non-existent.mp4"))
+        except NotImplementedError:
+            return False
+        try:
+            list(fdr_reader.iter_records(Path("/tmp/non-existent")))
+        except NotImplementedError:
+            return False
+        try:
+            sitl_observer.prepare_sitl_cold_boot(host="ardupilot-sitl", fixture_path=COLD_BOOT_FIXTURE)
+        except (NotImplementedError, AttributeError):
+            return False
+        try:
+            sitl_observer.prepare_sitl_no_gps(host="ardupilot-sitl")
+        except (NotImplementedError, AttributeError):
+            return False
+        return True
+    except Exception:
+        return False
+
+
+class _NullSink:
+    def write_frame(self, jpeg_bytes: bytes, timestamp_ms: int) -> None:
+        return None
+
+
+@pytest.fixture
+def _cold_run_id(run_id: str) -> str:
+    """Return the runner's run_id unchanged; cold-start requires an empty fdr-output volume.
+
+    The runner's ``run_id`` is already unique per invocation, but
+    cold-start also relies on the volume being empty. The volume wipe
+    belongs to the docker-compose lifecycle owned by AZ-407, so this
+    fixture does not perform it; the scenario stays skipped until that
+    harness is real.
+    """
+    return run_id
+
+
+@pytest.mark.parametrize(
+    "origin_source",
+    ["operator_manifest", "fc_ekf", "bounded_delta_conflict"],
+)
+@pytest.mark.traces_to("AC-5.1,AC-1,AC-2,AC-4,AC-5,ADR-010")
+def test_ft_p_11_cold_start_origin_variants(
+    origin_source: str,
+    fc_adapter: str,
+    vio_strategy: str,
+    evidence_dir,  # type: ignore[no-untyped-def]
+    _cold_run_id: str,
+    nfr_recorder,  # type: ignore[no-untyped-def]
+    tmp_path: Path,
+    _cold_start_harness_implemented: bool,
+) -> None:
+    """FT-P-11 AC-1 / AC-2 / AC-4 across the three origin_source variants."""
+    if not _cold_start_harness_implemented:
+        pytest.skip(
+            "FT-P-11 full scenario requires runner.helpers.{frame_source_replay,"
+            "fdr_reader,sitl_observer.prepare_sitl_cold_boot,"
+            "sitl_observer.prepare_sitl_no_gps} — currently AZ-441 / AZ-407 "
+            "leftovers. Pure-logic AC-1/2/3/4 covered by "
+            "e2e/_unit_tests/helpers/test_cold_start_evaluator.py."
+        )
+
+    from runner.helpers import fdr_reader, sitl_observer
+    from runner.helpers.frame_source_replay import FrameSourceReplayer
+
+    # 1. Stage the fixture per variant.
+    manifest_path = tmp_path / f"ft-p-11-{origin_source}-manifest.json"
+    if origin_source == "operator_manifest":
+        cse.write_manifest(manifest_path, OPERATOR_ORIGIN)
+        sitl_observer.prepare_sitl_no_gps(host=f"{fc_adapter}-sitl")
+        expected_origin = OPERATOR_ORIGIN
+    elif origin_source == "fc_ekf":
+        cse.write_manifest(manifest_path, None)
+        snap = cse.read_cold_boot_fixture(COLD_BOOT_FIXTURE)
+        sitl_observer.prepare_sitl_cold_boot(host=f"{fc_adapter}-sitl", fixture_path=COLD_BOOT_FIXTURE)
+        expected_origin = cse.LatLonAlt(snap.lat_deg, snap.lon_deg, snap.alt_m)
+    elif origin_source == "bounded_delta_conflict":
+        cse.write_manifest(manifest_path, OPERATOR_ORIGIN)
+        snap = cse.read_cold_boot_fixture(COLD_BOOT_FIXTURE)
+        assert (
+            cse.bounded_delta_distance_m(
+                OPERATOR_ORIGIN,
+                cse.LatLonAlt(snap.lat_deg, snap.lon_deg, snap.alt_m),
+            )
+            > cse.BOUNDED_DELTA_TRIGGER_M
+        ), (
+            "Test fixture invariant broken: cold-boot snapshot and operator origin "
+            f"must be > {cse.BOUNDED_DELTA_TRIGGER_M} m apart for bounded_delta_conflict variant."
+        )
+        sitl_observer.prepare_sitl_cold_boot(host=f"{fc_adapter}-sitl", fixture_path=COLD_BOOT_FIXTURE)
+        expected_origin = OPERATOR_ORIGIN
+    else:
+        pytest.fail(f"Unknown origin_source {origin_source!r}")
+
+    # 2. Cold-start SUT + push the first frame.
+    FrameSourceReplayer(_resolve_frame_sink()).replay_video(
+        DERKACHI_MP4, manifest_path=manifest_path, frame_limit=1,
+    )
+
+    # 3. Collect first outbound estimate + FDR audit records.
+    fdr_root = Path(evidence_dir).parent / f"run-{_cold_run_id}" / "fdr"
+    first_estimate: cse.OutboundEstimate | None = None
+    fdr_records: list[cse.FdrAuditRecord] = []
+    for rec in fdr_reader.iter_records(fdr_root):
+        if (
+            first_estimate is None
+            and rec.record_type == "outbound_estimate"
+        ):
+            payload = rec.payload
+            first_estimate = cse.OutboundEstimate(
+                monotonic_ms=int(rec.monotonic_ms),
+                lat_deg=float(payload["lat_deg"]),  # type: ignore[arg-type]
+                lon_deg=float(payload["lon_deg"]),  # type: ignore[arg-type]
+                source_label=str(payload["source_label"]),  # type: ignore[arg-type]
+            )
+        if rec.record_type in {
+            cse.FDR_RECORD_ORIGIN_SET,
+            cse.FDR_RECORD_ORIGIN_UNAVAILABLE,
+            cse.FDR_RECORD_BOUNDED_DELTA_REJECT,
+        }:
+            fdr_records.append(
+                cse.FdrAuditRecord(
+                    monotonic_ms=int(rec.monotonic_ms),
+                    record_type=rec.record_type,
+                    payload=rec.payload,
+                )
+            )
+
+    # 4. Evaluate + assert per variant.
+    report = cse.evaluate_first_estimate(
+        origin_source=origin_source,
+        expected_origin=expected_origin,
+        first_estimate=first_estimate,
+        fdr_records=fdr_records,
+    )
+
+    if report.distance_m is not None:
+        nfr_recorder.record_metric(
+            f"ft_p_11.{origin_source}.distance_m", report.distance_m, ac_id="AC-1"
+        )
+
+    assert report.passes_distance, (
+        f"FT-P-11 {origin_source}: distance check failed "
+        f"(budget {cse.ACCURACY_BUDGET_M} m): got distance_m={report.distance_m}"
+    )
+
+    if origin_source == "operator_manifest":
+        assert report.fdr_origin_set_source == "manifest", (
+            f"AC-1: FDR must record c5.cold_start_origin.set(source='manifest'); "
+            f"got source={report.fdr_origin_set_source!r}"
+        )
+    elif origin_source == "fc_ekf":
+        assert report.fdr_origin_set_source == "fc_ekf", (
+            f"AC-2: FDR must record c5.cold_start_origin.set(source='fc_ekf'); "
+            f"got source={report.fdr_origin_set_source!r}"
+        )
+    elif origin_source == "bounded_delta_conflict":
+        assert report.source_label_ok, (
+            f"AC-4: source_label MUST NOT be "
+            f"{cse.FORBIDDEN_FIRST_LABEL_BOUNDED_DELTA!r}; got "
+            f"{report.actual_estimate.source_label if report.actual_estimate else None!r}"
+        )
+        assert report.fdr_bounded_delta_seen, (
+            "AC-4: FDR must record c5.gps_bounded_delta.reject naming A and B"
+        )
+
+
+@pytest.mark.traces_to("AC-3,AC-NEW-1")
+def test_ft_p_11_cold_start_no_origin_aborts(
+    fc_adapter: str,
+    vio_strategy: str,
+    evidence_dir,  # type: ignore[no-untyped-def]
+    _cold_run_id: str,
+    nfr_recorder,  # type: ignore[no-untyped-def]
+    tmp_path: Path,
+    _cold_start_harness_implemented: bool,
+) -> None:
+    """AC-3: Manifest empty + SITL no GPS → SUT MUST refuse takeoff."""
+    if not _cold_start_harness_implemented:
+        pytest.skip(
+            "FT-P-11 AC-3 full scenario requires runner.helpers.{frame_source_replay,"
+            "fdr_reader,sitl_observer.prepare_sitl_no_gps} — currently AZ-441 / "
+            "AZ-407 leftovers. Pure-logic AC-3 covered by "
+            "e2e/_unit_tests/helpers/test_cold_start_evaluator.py."
+        )
+
+    from runner.helpers import fdr_reader, sitl_observer
+    from runner.helpers.frame_source_replay import FrameSourceReplayer
+
+    manifest_path = tmp_path / "ft-p-11-no-origin-manifest.json"
+    cse.write_manifest(manifest_path, None)
+    sitl_observer.prepare_sitl_no_gps(host=f"{fc_adapter}-sitl")
+
+    FrameSourceReplayer(_resolve_frame_sink()).replay_video(
+        DERKACHI_MP4, manifest_path=manifest_path, frame_limit=1,
+    )
+
+    fdr_root = Path(evidence_dir).parent / f"run-{_cold_run_id}" / "fdr"
+    first_estimate: cse.OutboundEstimate | None = None
+    fdr_records: list[cse.FdrAuditRecord] = []
+    for rec in fdr_reader.iter_records(fdr_root):
+        if first_estimate is None and rec.record_type == "outbound_estimate":
+            payload = rec.payload
+            first_estimate = cse.OutboundEstimate(
+                monotonic_ms=int(rec.monotonic_ms),
+                lat_deg=float(payload["lat_deg"]),  # type: ignore[arg-type]
+                lon_deg=float(payload["lon_deg"]),  # type: ignore[arg-type]
+                source_label=str(payload["source_label"]),  # type: ignore[arg-type]
+            )
+        if rec.record_type == cse.FDR_RECORD_ORIGIN_UNAVAILABLE:
+            fdr_records.append(
+                cse.FdrAuditRecord(
+                    monotonic_ms=int(rec.monotonic_ms),
+                    record_type=rec.record_type,
+                    payload=rec.payload,
+                )
+            )
+
+    report = cse.evaluate_no_origin_path(
+        first_estimate=first_estimate, fdr_records=fdr_records,
+    )
+
+    nfr_recorder.record_metric(
+        "ft_p_11.no_origin.estimate_emitted",
+        1.0 if report.estimate_within_budget else 0.0,
+        ac_id="AC-3",
+    )
+
+    assert report.passes, (
+        f"AC-3: SUT must NOT emit any estimate AND FDR must record "
+        f"{cse.FDR_RECORD_ORIGIN_UNAVAILABLE} within "
+        f"{cse.FIRST_EMISSION_BUDGET_S} s. "
+        f"estimate_emitted={report.estimate_within_budget}, "
+        f"fdr_unavailable_seen={report.fdr_origin_unavailable_seen}"
+    )
+
+
+def _resolve_frame_sink():  # type: ignore[no-untyped-def]
+    raise NotImplementedError(
+        "frame sink resolution is owned by AZ-441 / runner.helpers.frame_source_replay"
+    )
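Note on the `bounded_delta_conflict` variant above: the docstring describes the trigger as `vincenty(A, B) > 200 m`, and the scenario calls `cse.bounded_delta_distance_m`, whose body is not in this hunk. The sketch below swaps in the simpler haversine great-circle formula instead of Vincenty, which is accurate enough to illustrate a 200 m threshold but is not the same algorithm. `haversine_m` and `is_bounded_delta_conflict` are hypothetical names, and the spherical Earth radius is an assumption.

```python
import math

EARTH_RADIUS_M = 6_371_000.0      # mean Earth radius (spherical approximation)
BOUNDED_DELTA_TRIGGER_M = 200.0   # threshold from the scenario docstring


def haversine_m(lat1_deg: float, lon1_deg: float,
                lat2_deg: float, lon2_deg: float) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    phi1 = math.radians(lat1_deg)
    phi2 = math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2_deg - lon1_deg)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_bounded_delta_conflict(lat1_deg: float, lon1_deg: float,
                              lat2_deg: float, lon2_deg: float) -> bool:
    """True when the two origins disagree by more than the trigger distance."""
    return haversine_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg) > BOUNDED_DELTA_TRIGGER_M
```

Using the operator origin from this module (50.0, 36.2), an FC EKF origin 0.01 degrees further north is roughly 1.1 km away and would count as a conflict, while one 0.001 degrees away (about 111 m) would not.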