Oleksandr Bezdieniezhnykh 39a7267a23 [autodev] Step 13 partial: c3_5/c4/c5 cycle-1 doc sync
Batch 2 of the cycle-1 component-doc sync. For each of C3.5
(AdHoP), C4 (Pose), C5 (State):

- Append "Cycle-1 operational reality" paragraph to § 1
  documenting the _STRATEGY_REGISTRY wiring, the
  AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS slot, and the
  composition-time errors raised on missing seeds.
- Relax the OpenCV pin in § 5 to >=4.11.0.86,<4.12 with a
  pointer to the D-CROSS-CVE-1 leftover (C5 adds a new row
  for the AZ-389 orthorectifier subsystem's cv2 import).
- Add "Cycle-1 Tier-2 follow-up dependencies" subsection
  in § 7 where applicable: C3.5 calls out the airborne
  registry's omission of PassthroughRefiner; C5 calls out
  the AZ-389 orthorectifier wiring (default OFF) and the
  AZ-624 operator-supplied flight metadata that must land
  before flipping orthorectifier.enabled=True. C4 has no
  parked Tier-2 (only opencv_gtsam is defined).

Also refresh the D-CROSS-CVE-1 leftover replay timestamp
(condition still upstream-gated: gtsam wheels remain
numpy<2) and bump the autodev state's sub_step.detail to
record "batch 2/~5 done (c3_5/c4/c5); 7 components + 8
helpers + tests/ remain".

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-19 17:06:44 +03:00


C5 — State Estimator

1. High-Level Overview

Purpose: own the GTSAM iSAM2 + IncrementalFixedLagSmoother (K=10–20 keyframes per D-C5-3) state. Fuse VioOutput (C1), PoseEstimate (C4), and FC IMU/attitude windows (C8 inbound) into the posterior pose with native 6×6 covariance via Marginals (D-C5-5 = (c)). Emit the smoothed corrected current frame to C8 for FC delivery; emit smoothed past-keyframes to C13 (FDR only: AC-4.5 internal smoothing, NOT FC retroactive correction).

Architectural Pattern: Strategy with two concrete implementations: GtsamIsam2StateEstimator (production-default) and EskfStateEstimator (mandatory simple-baseline). Selection at startup (ADR-001), BUILD_* gating (ADR-002), composition-root wired (ADR-009).

Cycle-1 operational reality: the airborne binary wires C5 through _STRATEGY_REGISTRY + register_airborne_strategies() (AZ-591) on top of the BUILD_STATE_* build-flag matrix (runtime_root/airborne_bootstrap.py::C5_STATE_BUILD_FLAGS = {"gtsam_isam2": "BUILD_STATE_GTSAM_ISAM2", "eskf": "BUILD_STATE_ESKF"}).

  • Build flags: both strategies appear in _C5_STATE_STRATEGIES; the gtsam_isam2 flag defaults ON-when-unset and eskf defaults OFF-when-unset (mirrors state_factory._STATE_BUILD_FLAGS).
  • Lazy registration: _ensure_state_strategy_registered imports the concrete module (gtsam_isam2_estimator or eskf_baseline) only when the configured strategy's BUILD_STATE_* flag is ON, so a binary configured for eskf never imports gtsam.
  • Constructor injection: dependencies flow through the pre_constructed dict passed to compose_root(config, pre_constructed=...) (AZ-618 umbrella → AZ-623 c5 helpers phase + AZ-625 eager (estimator, handle) pair phase).
  • Required and optional keys: the c5_state slot lists ("c5_imu_preintegrator", "c5_se3_utils", "c5_wgs_converter", "c13_fdr") in AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS; c6_tile_store, camera_calibration, flight_id, and companion_id are optional (consumed only when c5_state.orthorectifier.enabled is True; see AZ-389 / § 7 below). Missing required keys raise AirborneBootstrapError at composition time, naming the consumer and missing key.
  • Preintegrator caching: the c5_imu_preintegrator is per-process-cached, keyed by config.runtime.camera_calibration_path (AC-623.2), so two build_pre_constructed invocations return the SAME instance. This protects its bias / sample accumulator from a silent reset on re-invocation.
  • AZ-625 short-circuit: build_pre_constructed eagerly invokes build_state_estimator and stashes the StateEstimator under the private _c5_prebuilt_estimator key; _c5_state_wrapper returns the prebuilt instance so c4_pose._isam2_handle and c5_state._isam2_handle reference ONE object across the C4 / C5 seam (AC-625.3).
  • AZ-687 replay-mode guard: when config.mode == "replay" and the minimal replay Config omits the c5_state block, the bootstrap skips the eager (estimator, handle) build to avoid forcing the gtsam import on the replay binary (the C5 wrapper itself never runs without the block).
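The flag defaults and the composition-time key check described above can be sketched as follows. This is a minimal illustration, not the contents of airborne_bootstrap.py: the constant and exception names are taken from this document, while build_flag_enabled, check_required_keys, the accepted flag spellings, and the error message format are assumptions made for the example.

```python
import os
from typing import Any, Mapping

# Names mirror this document; the function bodies are illustrative assumptions.
C5_STATE_BUILD_FLAGS = {
    "gtsam_isam2": "BUILD_STATE_GTSAM_ISAM2",
    "eskf": "BUILD_STATE_ESKF",
}
# Defaults when the flag is unset: gtsam_isam2 ON, eskf OFF.
_DEFAULT_WHEN_UNSET = {"gtsam_isam2": True, "eskf": False}

AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS = {
    "c5_state": (
        "c5_imu_preintegrator",
        "c5_se3_utils",
        "c5_wgs_converter",
        "c13_fdr",
    ),
}


class AirborneBootstrapError(RuntimeError):
    """Raised at composition time; names the consumer and the missing key."""


def build_flag_enabled(strategy: str, env: Mapping[str, str] = os.environ) -> bool:
    """Resolve a BUILD_STATE_* flag, falling back to the per-strategy default."""
    raw = env.get(C5_STATE_BUILD_FLAGS[strategy])
    if raw is None:
        return _DEFAULT_WHEN_UNSET[strategy]
    # Assumption: the spellings the real parser accepts are not stated here.
    return raw.strip().lower() in {"1", "on", "true", "yes"}


def check_required_keys(consumer: str, pre_constructed: Mapping[str, Any]) -> None:
    """Fail fast if a required seed is absent from the pre_constructed dict."""
    for key in AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS[consumer]:
        if key not in pre_constructed:
            raise AirborneBootstrapError(
                f"{consumer}: missing pre_constructed key {key!r}"
            )
```

Calling check_required_keys("c5_state", {}) would raise on the first absent key, which matches the "fail at composition time" behaviour rather than deferring the failure to the first frame.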

Upstream dependencies:

  • C1 → VioOutput (relative pose + IMU bias).
  • C4 → PoseEstimate (absolute satellite-anchored pose); C4 adds factors directly to C5's iSAM2 graph (shared substrate).
  • C8 inbound side → FC ImuWindow + AttitudeWindow + GpsHealth (for warm-start AC-5.1, blackout AC-NEW-8, spoofing-promotion AC-NEW-2 / F7).

Downstream consumers:

  • C8 outbound side (per-FC encoder) → EmittedExternalPosition (5 Hz periodic to FC).
  • C6 (mid-flight tile gen via orthorectifier; C5 supplies the PoseEstimate + quality_metadata for tile emission).
  • C13 FDR (smoothed past-keyframe estimates, source-set switch events, spoofing-rejection events).

2. Internal Interfaces

Interface: StateEstimator

| Method | Input | Output | Async | Error Types |
|---|---|---|---|---|
| set_takeoff_origin (AZ-490, ADR-010) | origin: LatLonAlt, sigma_horiz_m: float, sigma_vert_m: float | None | No | StateEstimatorConfigError, EstimatorAlreadyStartedError |
| add_vio | VioOutput | None | No | EstimatorDegradedError, EstimatorFatalError |
| add_pose_anchor | PoseEstimate | None | No | EstimatorDegradedError, EstimatorFatalError |
| add_fc_imu | ImuWindow | None | No | EstimatorDegradedError |
| current_estimate | () | EstimatorOutput (smoothed current keyframe) | No | |
| smoothed_history | n_keyframes: int | list[EstimatorOutput] | No | |
| health_snapshot | () | EstimatorHealth | No | |

Input DTOs: see C1, C4, C8.

Output DTOs:

EstimatorOutput:
  frame_id:                       uuid
  position_wgs84:                 LatLonAlt
  orientation_world_T_body:       Quat (w, x, y, z)
  velocity_world:                 Vector3 (m/s)
  covariance_6x6:                 Matrix6
  source_label:                   enum {satellite_anchored, visual_propagated, dead_reckoned}
  last_satellite_anchor_age_ms:   int
  smoothed:                       bool — true for entries from `smoothed_history`
  emitted_at:                     monotonic_ns

EstimatorHealth:
  isam2_state:                   enum {INIT, TRACKING, DEGRADED, LOST}
  keyframe_count:                int
  cov_norm_growing_for_s:        float — AC-NEW-8 monotonicity check
  spoof_promotion_blocked:       bool — AC-NEW-2 / AC-NEW-8 gate state
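
As a rough illustration of how the EstimatorOutput DTO and the read side of the StateEstimator interface could look in Python, here is a hedged sketch. The field names follow the DTO listing above; LatLonAlt's field names, the tuple-based matrix representation, the shape check, and the StateEstimatorReader protocol name are assumptions introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Protocol
from uuid import UUID


class SourceLabel(Enum):
    SATELLITE_ANCHORED = "satellite_anchored"
    VISUAL_PROPAGATED = "visual_propagated"
    DEAD_RECKONED = "dead_reckoned"


@dataclass(frozen=True)
class LatLonAlt:
    """Hypothetical stand-in for the shared WGS-84 type."""
    lat_deg: float
    lon_deg: float
    alt_m: float


@dataclass(frozen=True)
class EstimatorOutput:
    frame_id: UUID
    position_wgs84: LatLonAlt
    orientation_world_T_body: tuple[float, float, float, float]  # (w, x, y, z)
    velocity_world: tuple[float, float, float]  # m/s
    covariance_6x6: tuple[tuple[float, ...], ...]  # 6 rows of 6 floats
    source_label: SourceLabel
    last_satellite_anchor_age_ms: int
    smoothed: bool  # True for entries from smoothed_history
    emitted_at: int  # monotonic nanoseconds

    def __post_init__(self) -> None:
        # Assumption: rejecting a malformed covariance at construction time.
        rows = self.covariance_6x6
        if len(rows) != 6 or any(len(row) != 6 for row in rows):
            raise ValueError("covariance_6x6 must have shape 6x6")


class StateEstimatorReader(Protocol):
    """Read-side subset of the StateEstimator interface table."""

    def current_estimate(self) -> EstimatorOutput: ...

    def smoothed_history(self, n_keyframes: int) -> list[EstimatorOutput]: ...
```

Frozen dataclasses keep emitted outputs immutable once they leave C5, which suits values that are handed to both C8 and C13.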

3. External API Specification

Not applicable.

4. Data Access Patterns

C5 holds the GTSAM iSAM2 state in memory; persistent storage is only via FDR writes (C13 owns the file). No DB queries.

Storage Estimates

| Table/Collection | Est. Row Count (1yr) | Row Size | Total Size | Growth Rate |
|---|---|---|---|---|
| In-memory keyframe window | up to 20 keyframes resident | ~2 KB / keyframe (factors + values) | ~40 KB | bounded by IncrementalFixedLagSmoother K=10–20 |

C5 is bounded by design — no unbounded growth.

5. Implementation Details

Algorithmic Complexity:

  • iSAM2 update on factor add: amortised O(K) in keyframe count for the typical case; O(K^2) worst-case on relinearisation.
  • Marginals.marginalCovariance(pose_key): O(K^3) in keyframe-window size; the dominant per-frame cost (~30–90 ms steady-state).
  • IncrementalFixedLagSmoother keeps the active window bounded — older keyframes are marginalised out.
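
The bounded-window behaviour in the last bullet can be illustrated with a toy bookkeeping class. This is not the GTSAM API and performs no marginalisation; it only mimics count-based eviction of the oldest keyframe keys under an assumed limit K, and the FixedLagWindow name is hypothetical.

```python
from collections import deque


class FixedLagWindow:
    """Toy stand-in for a fixed-lag smoother's key bookkeeping (not GTSAM)."""

    def __init__(self, max_keyframes: int) -> None:
        if max_keyframes < 1:
            raise ValueError("max_keyframes must be >= 1")
        self._max = max_keyframes
        self._keys = deque()  # oldest key on the left

    def add(self, key: int) -> list:
        """Insert a keyframe key; return the keys that left the window."""
        self._keys.append(key)
        evicted = []
        while len(self._keys) > self._max:
            # A real smoother would marginalise these keys out of the graph.
            evicted.append(self._keys.popleft())
        return evicted

    def __len__(self) -> int:
        return len(self._keys)
```

Because the window never exceeds K entries, the O(K^3) marginal-covariance cost above stays bounded for the whole flight.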

State Management:

  • iSAM2 graph + Values + Marginals lifecycle for the flight.
  • Cold-start ladder (ADR-010, AZ-490): set_takeoff_origin(origin, sigma_horiz_m, sigma_vert_m) MUST be invoked before any add_vio / add_fc_imu / add_pose_anchor call. The cold-start window closes on the first add_* call. iSAM2 attaches a PriorFactorPose3 at Pose3.Identity() (operator origin BECOMES local-ENU (0,0,0)) with diagonal sigmas [5°, 5°, 5°, sigma_horiz_m, sigma_horiz_m, sigma_vert_m]; ESKF seeds the nominal position to (0,0,0) and writes the position-block of the error covariance to diag(sigma_horiz_m², sigma_horiz_m², sigma_vert_m²). The method is strictly idempotent on identical args — re-invocation with byte-equal (origin, sigma_horiz_m, sigma_vert_m) is a no-op; re-invocation with different args raises StateEstimatorConfigError. Once the cold-start window closes, further calls raise EstimatorAlreadyStartedError (subclass of StateEstimatorConfigError). Defaults default_takeoff_origin_sigma_horiz_m = 5.0, default_takeoff_origin_sigma_vert_m = 10.0 live in C5StateConfig.
  • Source-label state machine: tracks the AC-NEW-2 / AC-NEW-8 spoofing-promotion gate (≥10 s + visual consistency check + ≤ 200 m bounded-delta before re-promoting a previously-spoofed FC GPS source).
  • Last-anchor-age timer for AC-1.3 binning.
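
The cold-start ladder semantics (idempotent on identical arguments, an error on conflicting arguments, sealed after the first add_* call) can be sketched independently of GTSAM. The two exception names come from this document; ColdStartGate, mark_first_add, prior_sigmas, and the LatLonAlt field names are hypothetical, and raising StateEstimatorConfigError when add_* arrives before any origin is an assumption, since the text does not name that error.

```python
import math
from dataclasses import dataclass


class StateEstimatorConfigError(ValueError):
    pass


class EstimatorAlreadyStartedError(StateEstimatorConfigError):
    pass


@dataclass(frozen=True)
class LatLonAlt:
    """Hypothetical stand-in for the shared WGS-84 type."""
    lat_deg: float
    lon_deg: float
    alt_m: float


class ColdStartGate:
    def __init__(self) -> None:
        self._seed = None
        self._started = False

    def set_takeoff_origin(self, origin, sigma_horiz_m, sigma_vert_m) -> None:
        if self._started:
            raise EstimatorAlreadyStartedError("cold-start window is closed")
        values = (origin.lat_deg, origin.lon_deg, origin.alt_m,
                  sigma_horiz_m, sigma_vert_m)
        if not all(math.isfinite(v) for v in values):
            raise StateEstimatorConfigError("non-finite origin or sigma")
        if not (-90.0 <= origin.lat_deg <= 90.0
                and -180.0 <= origin.lon_deg <= 180.0):
            raise StateEstimatorConfigError("origin outside WGS-84 bounds")
        if sigma_horiz_m <= 0.0 or sigma_vert_m <= 0.0:
            raise StateEstimatorConfigError("sigmas must be positive")
        seed = (origin, sigma_horiz_m, sigma_vert_m)
        if self._seed is None:
            self._seed = seed
        elif self._seed != seed:
            raise StateEstimatorConfigError("conflicting takeoff origin")
        # Identical arguments fall through: re-invocation is a no-op.

    def mark_first_add(self) -> None:
        """Called by the first add_* call; closes the cold-start window."""
        if self._seed is None:
            # Assumption: the document does not name the error for this case.
            raise StateEstimatorConfigError("set_takeoff_origin not called")
        self._started = True

    def prior_sigmas(self) -> list:
        """[rot, rot, rot (rad), horiz, horiz, vert (m)] for the pose prior."""
        _, sigma_h, sigma_v = self._seed
        rot = math.radians(5.0)
        return [rot, rot, rot, sigma_h, sigma_h, sigma_v]
```

With the documented defaults (5.0 m horizontal, 10.0 m vertical), prior_sigmas() yields three rotation sigmas of about 0.087 rad followed by 5.0, 5.0, 10.0.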

Key Dependencies:

| Library | Version | Purpose |
|---|---|---|
| GTSAM (Python + C++) | per Plan-phase pin | iSAM2 + CombinedImuFactor + BetweenFactorPose3 + GenericProjectionFactorCal3DS2 + Marginals |
| gtsam_unstable.IncrementalFixedLagSmoother | per Plan-phase pin | Bounded keyframe window (D-C5-3 K=10–20) |
| Eigen | matches GTSAM | Lie-algebra math |
| OpenCV (cv2, AZ-389 orthorectifier subsystem only) | >=4.11.0.86,<4.12 (cycle-1 relaxed pin; D-CROSS-CVE-1 deferred; see _docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md) | Imported by _orthorectifier.py for warp / JPEG encode when c5_state.orthorectifier.enabled = True; default OFF means the import is loaded but the cv2 code path is unreached in cycle-1 |

Error Handling Strategy:

  • StateEstimatorConfigError: set_takeoff_origin called with a malformed LatLonAlt (out of WGS-84 bounds / non-finite) OR with non-positive / non-finite sigmas, OR re-called inside the cold-start window with conflicting args. EstimatorAlreadyStartedError (a StateEstimatorConfigError subclass): set_takeoff_origin called after the first add_* call sealed the cold-start window. Caller must surface to operator; takeoff blocked.
  • EstimatorDegradedError: factor add yielded poor convergence; covariance inflated; emit EstimatorOutput with degraded label.
  • EstimatorFatalError: iSAM2 numerical failure, KEYFRAME_LIMIT exceeded, etc.; emit no EstimatorOutput for this tick. AC-5.2 fallback (3 s no estimate → FC IMU-only) applies.
  • Spoof-promotion gate (Principle #11 amended, AZ-385 + AZ-490 follow-up): never re-introduce a previously-spoofed FC GPS source until ALL THREE hold — (i) FC gps_health == STABLE_NON_SPOOFED for ≥ 10 s, (ii) the next satellite-anchored frame agrees with the FC GPS within a configurable tolerance, AND (iii) the FC's reported position is within ≤ 200 m of the companion's last emitted PoseEstimate. The same gate is applied at takeoff when a Manifest takeoff_origin is present: an FC GPS reading that disagrees with the operator origin by > 200 m is logged as suspect and the operator origin wins. Document every reject in FDR + GCS STATUSTEXT.
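
The three-condition spoof-promotion gate can be expressed as a pure predicate, sketched below. The 10 s and 200 m thresholds come from the bullet above; SpoofGateInputs, may_repromote_fc_gps, and every field name are hypothetical, and anchor_tolerance_m has no default because the document only calls that tolerance "configurable".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpoofGateInputs:
    """Hypothetical container for the three quantities the gate inspects."""
    stable_non_spoofed_for_s: float  # time gps_health has been STABLE_NON_SPOOFED
    anchor_vs_fc_gps_delta_m: float  # next satellite-anchored frame vs FC GPS
    fc_vs_last_pose_delta_m: float   # FC position vs last emitted PoseEstimate


def may_repromote_fc_gps(
    inputs: SpoofGateInputs,
    *,
    anchor_tolerance_m: float,
    min_stable_s: float = 10.0,
    max_delta_m: float = 200.0,
) -> bool:
    """Return True only when ALL THREE conditions hold."""
    stable_long_enough = inputs.stable_non_spoofed_for_s >= min_stable_s
    visually_consistent = inputs.anchor_vs_fc_gps_delta_m <= anchor_tolerance_m
    bounded_delta = inputs.fc_vs_last_pose_delta_m <= max_delta_m
    return stable_long_enough and visually_consistent and bounded_delta
```

Keeping the gate as a side-effect-free function makes each rejection easy to log to FDR with the exact inputs that failed.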

6. Extensions and Helpers

| Helper | Purpose | Used By |
|---|---|---|
| ImuPreintegrator | shared with C1 | C1, C5 |
| SE3Utils | shared with C1, C4 | C1, C4, C5 |
| WgsConverter | shared with C4, C8 | C4, C5, C8 |
| SourceLabelStateMachine | spoofing-promotion gate logic | C5 only; keep inside the component |

7. Caveats & Edge Cases

Known limitations:

  • AC-4.5 internal smoothing is onboard only; the FC log is forward-time only. The smoothed past-keyframe estimates go to FDR, not back to the FC.
  • iSAM2 + IncrementalFixedLagSmoother requires careful key management; missing keys cause silent factor-add failures — the implementation MUST log every add_* call's success/failure status.

Potential race conditions:

  • Single writer thread for the iSAM2 graph by design. C1 + C4 + C8-inbound deliver to a timestamp-ordered merge queue ahead of C5's writer thread.
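
One way to picture the timestamp-ordered merge queue is a thread-safe priority queue keyed on measurement time, with a sequence number as tie-breaker. This is a sketch under assumptions: MergeQueue and Measurement are hypothetical names, and a priority queue only orders what is currently buffered, so a real implementation would presumably also need a reorder delay or watermark to cope with late producers.

```python
import itertools
import queue
from typing import Any, NamedTuple


class Measurement(NamedTuple):
    timestamp_ns: int
    seq: int       # tie-breaker so payloads are never compared
    source: str    # e.g. "C1", "C4", "C8"
    payload: Any


class MergeQueue:
    """Many producers, one consumer (the estimator's writer thread)."""

    def __init__(self) -> None:
        self._q = queue.PriorityQueue()
        self._seq = itertools.count()

    def put(self, timestamp_ns: int, source: str, payload: Any) -> None:
        self._q.put(Measurement(timestamp_ns, next(self._seq), source, payload))

    def drain(self) -> list:
        """Pop everything currently buffered, oldest timestamp first."""
        out = []
        while True:
            try:
                out.append(self._q.get_nowait())
            except queue.Empty:
                return out
```

The single writer thread would call drain() on each tick and apply the returned measurements to the graph in order, so no other thread ever touches the estimator state.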

Performance bottlenecks:

  • Marginals.marginalCovariance(pose_key) is the per-frame hot spot. D-CROSS-LATENCY-1 hybrid degrades C4's covariance recovery (not C5's) under thermal throttle.

Cycle-1 Tier-2 follow-up dependencies:

  • AZ-389 orthorectifier wiring: _orthorectifier.py + OrthorectifierConfig (enabled, cov_norm_threshold, inlier_floor, tile_size_meters, tile_size_pixels, zoom_level, jpeg_quality) are wired into C5StateConfig and build_state_estimator. The default is enabled: bool = False, which preserves the existing smoke-test wiring that does not provide a TileStore; when False the runtime root skips orthorectifier construction entirely. Production enablement is parked pending AZ-624: the airborne pre_constructed dict must populate camera_calibration, flight_id, companion_id, and c6_tile_store from the operator-supplied manifest / takeoff orchestrator before flipping orthorectifier.enabled=True; until AZ-624 lands those four pre-constructed slots, only the test fixture path (tests/unit/c5_state/test_az389_*.py) exercises the orthorectifier subsystem.
  • AZ-624 operator-supplied flight metadata — the _c5_state_wrapper and _build_c5_state_estimator_pair already accept flight_id and companion_id kwargs and forward them to build_state_estimator. In cycle-1 the airborne build_pre_constructed only seeds the tile_store slot from _build_c6_tile_store(config); the other three (camera_calibration, flight_id, companion_id) are passed as None. Tier-2 follow-up: AZ-624 production main() wiring populates these from the manifest + takeoff orchestrator handshake (currently their None value means orthorectifier disablement is the only safe runtime state — see prior bullet).
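
The enablement precondition in these two bullets amounts to "all four optional slots are populated before orthorectifier.enabled is flipped". A hedged sketch of such a guard follows; the slot names come from this document, while missing_orthorectifier_slots and orthorectifier_may_enable are hypothetical helpers and not part of the described bootstrap.

```python
from typing import Any, Mapping

# The four optional pre_constructed slots named in this section.
ORTHORECTIFIER_SLOTS = (
    "c6_tile_store",
    "camera_calibration",
    "flight_id",
    "companion_id",
)


def missing_orthorectifier_slots(pre_constructed: Mapping[str, Any]) -> list:
    """Return the slots that are absent or still None."""
    return [k for k in ORTHORECTIFIER_SLOTS if pre_constructed.get(k) is None]


def orthorectifier_may_enable(enabled: bool, pre_constructed: Mapping[str, Any]) -> bool:
    """Disabled is always safe; enabled requires every slot to be populated."""
    if not enabled:
        return True
    return not missing_orthorectifier_slots(pre_constructed)
```

Under the cycle-1 seeding described above, where three slots are None, the guard would accept only enabled=False.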

8. Dependency Graph

Must be implemented after: C1 (input), C4 (input + shared graph), C8 inbound (FC IMU prior).

Can be implemented in parallel with: C6, C13 — independent paths.

Blocks: C8 outbound (no per-frame estimate), F3 / F4 / F5 / F7 / F9 / F10.

9. Logging Strategy

| Log Level | When | Example |
|---|---|---|
| ERROR | EstimatorFatalError; iSAM2 numerical failure; AC-5.2 path imminent | C5 fatal iSAM2 failure; frame=12345; AC-5.2 fallback |
| WARN | EstimatorDegradedError; spoofing-promotion blocked; cov norm growing >2× steady | C5 degraded: cov_inflation=3.1, spoof_block=true |
| INFO | Strategy ready; warm-start applied; spoof-promotion gate state changes | C5 ready: estimator=gtsam_isam2, K=15 |
| DEBUG | per-frame factor adds + smoothed history depth | C5 frame=12345 vio_added=true pose_added=true imu_added=true smoothed_n=15 |

Log format: structured JSON. Log storage: stdout / journald / FDR via C13 (ERROR + WARN always; smoothed past-keyframe entries always go to FDR per AC-4.5; spoofing-promotion-block events go to FDR + GCS STATUSTEXT).
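
A structured-JSON log line of the kind described above could be produced with the standard logging module, as in this sketch. The JSON key names, the "fields" extra attribute, the logger name, and the WARNING to WARN renaming are assumptions for illustration; the document does not specify the schema.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    _LEVEL_NAMES = {"WARNING": "WARN"}  # match the level names in the table

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": self._LEVEL_NAMES.get(record.levelname, record.levelname),
            "component": "C5",
            "message": record.getMessage(),
        }
        # Assumption: structured fields arrive via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_c5_logger(stream=sys.stdout) -> logging.Logger:
    logger = logging.getLogger("c5_state.sketch")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

A call such as make_c5_logger().warning("C5 degraded", extra={"fields": {"cov_inflation": 3.1, "spoof_block": True}}) would emit a single JSON line carrying the same fields as the WARN example.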