# Batch 16: Cycle 1 Implementation Report

**Batch**: 16 of N
**Tasks landed**: AZ-388 (`GtsamIsam2StateEstimator`: AC-5.2 no-estimate fallback detector + downstream signal)
**Cycle**: 1
**Date**: 2026-05-11

## Scope

| Task | Component | Purpose |
|------|-----------|---------|
| AZ-388 | C5 state estimator | Implements Invariant 9 / AC-5.2: a sustained no-successful-`current_estimate` window of ≥ `state.no_estimate_fallback_s` (default 3.0 s) emits ONE engagement signal (FDR `kind="c5.state.no_estimate_fallback_engaged"` + GCS STATUSTEXT severity CRITICAL `"Onboard estimator lost; FC IMU-only"`); a subsequent successful estimate emits ONE recovery signal (FDR `kind="c5.state.no_estimate_fallback_recovered"` + GCS STATUSTEXT severity NOTICE). One signal per state transition (rate-limited). Adds a public watchdog method `check_fallback_state(now_ns) -> bool` for C8 outbound's 5 Hz tick. Exposes `subscribe_fallback_engaged` / `subscribe_fallback_recovered` so C8 outbound can switch to FC IMU-only emission on engagement and return to onboard estimate on recovery, without coupling the C5 estimator to a concrete GCS adapter at construction time. |

## Files added / modified

### Added (prod)

- `src/gps_denied_onboard/components/c5_state/_fallback_watcher.py`: new `FallbackWatcher` class that owns the `_last_successful_estimate_ns` timestamp, the `_in_fallback` latch, and the engagement/recovery callback registries. Public surface: `mark_successful_estimate(now_ns)`, `check_and_engage(now_ns)`, `check_fallback_state(now_ns)`, `subscribe_engaged(cb)`, `subscribe_recovered(cb)` (each `subscribe_*` call returns a `FallbackSubscription` with `.cancel()`). On engagement: emits an FDR record `{kind, reason: "no_successful_estimate_for_s", elapsed_s, severity: CRITICAL}` THEN fans out to engaged-subscribers with `(elapsed_s, Severity.CRITICAL)`. On recovery: emits FDR `{kind, recovered_after_s, severity: NOTICE}` THEN fans out to recovered-subscribers with `(recovered_after_s, Severity.NOTICE)`.
  Subscriber exceptions are caught and logged but never break the watcher state machine.

### Modified (prod)

- `src/gps_denied_onboard/_types/fc.py`: extended `Severity` enum with `CRITICAL = 2` and `NOTICE = 5` to align with MAVLink `MAV_SEVERITY`. These values match AZ-388's engagement (CRITICAL) / recovery (NOTICE) severity contract and let `QgcTelemetryAdapter` map directly to the wire value. Existing `ERROR = 3`, `WARNING = 4`, `INFO = 6` unchanged.
- `src/gps_denied_onboard/components/c5_state/gtsam_isam2_estimator.py`: wired `FallbackWatcher` into the estimator:
  - the constructor instantiates `self._fallback = FallbackWatcher(threshold_s=config.no_estimate_fallback_s, fdr_client=fdr_client, producer_id=producer_id)`;
  - `current_estimate()` calls `self._fallback.check_and_engage(time.monotonic_ns())` on entry (BEFORE any compute) and `self._fallback.mark_successful_estimate(emitted_at_ns)` on the successful return path;
  - three public delegating methods were added (`check_fallback_state`, `subscribe_fallback_engaged`, `subscribe_fallback_recovered`).

  The hook order is correct for AC-5.2: a `current_estimate` call that itself triggers engagement still raises `EstimatorFatalError` (or returns no output), because the engagement signal has already been emitted on entry; the recovery signal fires only when a LATER call returns successfully.

### Added (tests)

- `tests/unit/c5_state/test_az388_fallback_watcher.py`: 18 tests, of which 14 cover all 8 ACs and 4 cover supplementary behaviour (subscription cancellation, subscriber exception isolation, `mark_successful_estimate` without prior engagement, multi-subscriber fan-out). Uses a deterministic synthetic `_Clock` (no `time.sleep`, no real wall-clock dependence). Mocks `FdrClient.enqueue` and asserts FDR record shape per AC-8. Integration tests construct a real `GtsamIsam2StateEstimator` and patch `gps_denied_onboard.components.c5_state.gtsam_isam2_estimator.time.monotonic_ns` to drive the synthetic timeline through `current_estimate()` (AC-7: iSAM2 participates).
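
The engagement/recovery behaviour described in this section can be sketched as a small latch-based state machine. This is a minimal illustration, not the shipped `FallbackWatcher`: the class name `SketchFallbackWatcher`, the `emit_fdr` callable (a stand-in for the FDR client, whose real signature is not shown in this report), the `now_ns` constructor argument, and the choice to measure `recovered_after_s` from the last successful estimate are all assumptions. Subscriber exception isolation and `.cancel()` handles are omitted for brevity.

```python
from enum import IntEnum
from typing import Callable, Dict, List

ENGAGED_KIND = "c5.state.no_estimate_fallback_engaged"
RECOVERED_KIND = "c5.state.no_estimate_fallback_recovered"


class Severity(IntEnum):
    # Values as listed in this report (aligned with MAVLink MAV_SEVERITY).
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFO = 6


Callback = Callable[[float, Severity], None]


class SketchFallbackWatcher:
    """Illustrative latch-based watcher (hypothetical, not the real class)."""

    def __init__(self, threshold_s: float,
                 emit_fdr: Callable[[Dict], None], now_ns: int) -> None:
        self._threshold_ns = int(threshold_s * 1e9)
        self._emit_fdr = emit_fdr  # hypothetical stand-in for the FDR client
        self._last_successful_estimate_ns = now_ns  # starts at "now"
        self._in_fallback = False  # the single rate-limit latch
        self._engaged: List[Callback] = []
        self._recovered: List[Callback] = []

    def subscribe_engaged(self, cb: Callback) -> None:
        self._engaged.append(cb)

    def subscribe_recovered(self, cb: Callback) -> None:
        self._recovered.append(cb)

    def check_and_engage(self, now_ns: int) -> bool:
        if self._in_fallback:
            return True  # latch already set: no second engagement signal
        elapsed_ns = now_ns - self._last_successful_estimate_ns
        if elapsed_ns < self._threshold_ns:
            return False
        self._in_fallback = True
        elapsed_s = elapsed_ns / 1e9
        # FDR record first, then subscriber fan-out.
        self._emit_fdr({"kind": ENGAGED_KIND,
                        "reason": "no_successful_estimate_for_s",
                        "elapsed_s": elapsed_s,
                        "severity": Severity.CRITICAL.name})
        for cb in self._engaged:
            cb(elapsed_s, Severity.CRITICAL)
        return True

    def mark_successful_estimate(self, now_ns: int) -> None:
        previous_ns = self._last_successful_estimate_ns
        self._last_successful_estimate_ns = now_ns
        if not self._in_fallback:
            return  # no prior engagement: nothing to recover from
        self._in_fallback = False
        # Assumption: recovery time is measured from the last success.
        recovered_after_s = (now_ns - previous_ns) / 1e9
        self._emit_fdr({"kind": RECOVERED_KIND,
                        "recovered_after_s": recovered_after_s,
                        "severity": Severity.NOTICE.name})
        for cb in self._recovered:
            cb(recovered_after_s, Severity.NOTICE)
```

Driving the sketch with integer nanosecond timestamps (rather than a real clock) mirrors the synthetic-clock approach the tests use: repeated `check_and_engage` calls past the threshold produce one FDR record, and the next `mark_successful_estimate` produces one recovery record.
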
## Architectural notes

- **Single state machine in one place**: putting the engagement/recovery state into a dedicated `FallbackWatcher` (instead of inlining flags onto `GtsamIsam2StateEstimator`) keeps the estimator focused on factor-graph mechanics and lets the same class drop unchanged into the ESKF baseline (AZ-386) once it lands. The watcher has no GTSAM dependency.
- **Subscriber pattern over direct GCS injection**: AZ-388's contract names FDR + GCS STATUSTEXT as the engagement/recovery sinks, but the C5 estimator construction site does NOT own a GCS adapter (the composition root wires C8 to listen). `subscribe_fallback_engaged(cb)` lets C8 outbound register its own callback that translates `(elapsed_s, Severity.CRITICAL)` into a `QgcTelemetryAdapter.send_statustext(...)` call without C5 needing a hard dependency on the GCS adapter Protocol. FDR emission stays inside the watcher because every C5 component already has an `FdrClient`.
- **Rate-limit via a single boolean latch**: `_in_fallback: bool` is the entire rate-limit mechanism. `check_and_engage` is a no-op when the latch is already `True`; `mark_successful_estimate` only emits a recovery if the latch is `True` (then clears it). Sustained 30 s of no-estimate calls (`AC-2`) produces exactly one engagement signal because the second through Nth calls hit the latch and return early.
- **Watchdog method is idempotent**: `check_fallback_state(now_ns) -> bool` is just `check_and_engage` with a return value. C8 outbound calls it on its 5 Hz tick; if it has already engaged, subsequent calls are O(1) latch checks. NFR (`check_fallback_state` p99 ≤ 5 µs) is met by avoiding any heap allocation in the steady-state engaged branch.
- **`emitted_at_ns` plumbing on success path**: `current_estimate` reads `time.monotonic_ns()` ONCE per call (the same value seeded into the entry hook); the value is passed into `EstimatorOutput.emitted_at_ns` AND into `mark_successful_estimate`.
  This guarantees `_last_successful_estimate_ns` equals the `emitted_at_ns` recorded on the output, which is useful when correlating FDR records during forensic replay.
- **Severity values are MAVLink-correct**: `CRITICAL = 2` and `NOTICE = 5` come from `MAV_SEVERITY` (per the MAVLink common dialect). `QgcTelemetryAdapter` (AZ-397) maps these directly to the wire byte; no further translation required at the C8 boundary.
- **Threshold from config, not hardcoded**: `FallbackWatcher.__init__` accepts `threshold_s` and the estimator passes `C5StateConfig.no_estimate_fallback_s`. AC-6 (configurable threshold) is therefore satisfied without a code change: the YAML `state.no_estimate_fallback_s` value drives the engagement time.

## Test counts

| Suite | Before (B15) | After (B16) | Delta |
|-------|--------------|-------------|-------|
| Total passing | 589 | 607 | +18 |
| Skipped | 2 | 2 | 0 |
| AZ-388 (new) | 0 | 18 | +18 |

Run command: `PYTHONPATH=src pytest tests/ -q` → `607 passed, 2 skipped in ~57s`.

## Lint / type

- `ruff check src/gps_denied_onboard/components/c5_state/ src/gps_denied_onboard/_types/fc.py tests/unit/c5_state/`: clean.
- `ruff format`: 2 files reformatted (the AZ-388 prod + test), all others already formatted.
- `ReadLints` on touched files: 0 errors.

## Acceptance evidence

| AC | Test(s) | Status |
|----|---------|--------|
| AC-1 Engagement after 3 s | `test_ac1_engagement_after_threshold_elapses`, `test_ac1_estimator_entry_hook_engages_when_stale` | PASS |
| AC-2 Engagement is one-shot | `test_ac2_engagement_is_one_shot_under_sustained_no_estimate`, `test_ac2_rate_limit_holds_across_30s` | PASS |
| AC-3 Recovery signal | `test_ac3_recovery_signal_after_successful_estimate`, `test_ac3_estimator_success_path_marks_estimate_and_recovers` | PASS |
| AC-4 `check_fallback_state` watchdog | `test_ac4_watchdog_reports_true_after_threshold_without_current_estimate`, `test_ac4_watchdog_emits_engagement_only_once` | PASS |
| AC-5 STATUSTEXT severity | `test_ac5_engagement_severity_is_critical`, `test_ac5_recovery_severity_is_notice` | PASS |
| AC-6 Configurable threshold | `test_ac6_configurable_threshold_5s` | PASS |
| AC-7 Both estimators participate (iSAM2 leg) | `test_ac7_isam2_estimator_emits_engagement_on_entry` | PASS (ESKF leg blocked on AZ-386) |
| AC-8 FDR record shapes | `test_ac8_engagement_fdr_record_shape`, `test_ac8_recovery_fdr_record_shape` | PASS |
| Subscription cancellation | `test_subscription_cancel_stops_callbacks` | PASS |
| Subscriber exception isolation | `test_subscriber_exception_does_not_break_watcher` | PASS |
| `mark_successful_estimate` without prior engagement | `test_mark_successful_estimate_without_engagement_is_noop` | PASS |
| Multiple subscribers fan-out | `test_multiple_subscribers_all_notified` | PASS |

## Known gaps / followups

- **AC-7 ESKF leg deferred**: `test_ac7_isam2_estimator_emits_engagement_on_entry` covers the iSAM2 path only. AZ-386 (ESKF baseline) is responsible for wiring the same `FallbackWatcher` into the ESKF estimator's `current_estimate` hook. When AZ-386 lands, the AC-7 row above becomes "PASS (both)".
- **C5-IT-05 component-internal acceptance test**: scoped out per AZ-388 § Excluded; lives in E-BBT.
- **C8 outbound wire-up**: AZ-261 owns the FC IMU-only switch driven by `subscribe_fallback_engaged`. AZ-388 only exposes the subscription point.

## Risks accepted

- **Watcher logs subscriber exceptions but doesn't surface them**: by design (a flaky GCS subscriber should not take down C5). Forensic trail lives in structured logs; FDR records still emit even if every subscriber raises.
- **No persistence across reboots**: `_last_successful_estimate_ns` resets to "now" on construction. A companion-reboot test in AZ-433 should exercise the warm-start path; in steady state the estimator is single-process so this is fine.
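
The first accepted risk depends on the fan-out isolating subscriber failures from one another and from the watcher. A minimal sketch of that isolation follows, assuming a plain list-backed registry; `SketchRegistry`, `SketchSubscription`, and the logger name are hypothetical and only illustrate the pattern, not the real `FallbackSubscription` internals.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("fallback_sketch")  # hypothetical logger name

Callback = Callable[[float, int], None]


class SketchSubscription:
    """Handle returned by subscribe(); cancel() stops further callbacks."""

    def __init__(self, registry: List[Callback], cb: Callback) -> None:
        self._registry = registry
        self._cb = cb

    def cancel(self) -> None:
        # Idempotent: cancelling twice is harmless.
        if self._cb in self._registry:
            self._registry.remove(self._cb)


class SketchRegistry:
    """Callback registry whose fan-out never propagates subscriber errors."""

    def __init__(self) -> None:
        self._callbacks: List[Callback] = []

    def subscribe(self, cb: Callback) -> SketchSubscription:
        self._callbacks.append(cb)
        return SketchSubscription(self._callbacks, cb)

    def fan_out(self, seconds: float, severity: int) -> int:
        """Call every subscriber; return how many completed without raising."""
        delivered = 0
        # Iterate over a copy so a callback may cancel itself mid-loop.
        for cb in list(self._callbacks):
            try:
                cb(seconds, severity)
                delivered += 1
            except Exception:
                # Log with traceback and continue with the next subscriber.
                logger.exception("fallback subscriber raised; continuing")
        return delivered
```

With this shape, a subscriber that raises is logged and skipped while the remaining subscribers still receive the signal, and a cancelled subscription simply drops out of the next fan-out.
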