Mirror of https://github.com/azaion/gps-denied-onboard.git, synced 2026-06-22 20:01:12 +00:00
[AZ-776] Open-loop ESKF composition profile via c4_pose.enabled
ADR-012: add c4_pose.enabled (default True) and enforce the (c4_pose.enabled, c5_state.strategy) 2x2 pairing matrix at compose time. When enabled=false, compose_root removes c4_pose from the selection map and build_pre_constructed omits c5_isam2_graph_handle. Replay protocol Invariant 13 owns the gate. Tier-2 conftest YAML writes the open-loop profile; un-xfails AC-1/2/5 and both AC-6 variants in Derkachi (AC-3 stays xfailed for AZ-777). 319/319 runtime_root + c4_pose + c5_state tests green. Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -713,4 +713,41 @@ Two facts surfaced during the Step 7 (Implement) batch loop that contradicted th
- AZ-405 grows slightly: it now also owns the `replay_input/` coordinator (the natural home for the auto-sync logic + the time-offset application).
- AZ-404 (E2E replay test) is unchanged in scope but reworded: it asserts mode-agnosticism (Invariant 1) and runs against the unified airborne image — no fourth-image entrypoint to verify.
- C8 gains a thin `MavlinkTransport` Protocol seam introduced by AZ-400: `SerialMavlinkTransport` (live) and `NoopMavlinkTransport` (replay) implement it. This is a no-op restructure of the existing C8 transport code; the encoders are unchanged. The Protocol seam is the architectural mechanism for Invariant 5 (encoders are byte-identical).
- Demo↔field fidelity is now structurally guaranteed: the same binary runs in both contexts; any drift between them is a behavioural-test failure, not an SBOM-diff failure.
### ADR-012 — Open-loop ESKF composition profile via `c4_pose.enabled = false` (AZ-776)
**Context**: ADR-009 wires the C4 pose estimator and the C5 state estimator through a shared GTSAM iSAM2 substrate — C4 adds its PnP factor directly to C5's iSAM2 graph (ADR-003). The `c4_pose` slot in `runtime_root/airborne_bootstrap.py` lists `c5_isam2_graph_handle` as a required `pre_constructed` key (AZ-625), and the `OpenCVGtsamPoseEstimator` constructor consumes that handle. This wiring was sound for the steady-state GTSAM-iSAM2 build of C5.
When C5 ships a second strategy, `eskf` (ESKF baseline, AZ-588), the substrate is **not** an iSAM2 graph: ESKF propagates its IMU-driven covariance forward in closed form, with no factor graph behind it. Its `create()` factory returns `(estimator, None)`, leaving the second tuple element (the iSAM2 handle slot) empty. Two facts follow from this:
1. **`c4_pose` cannot be the gate.** C4 owns satellite-anchored pose estimation. ESKF runs satellite-free open-loop. Forcing `c4_pose` into the composition when no satellite anchoring is wired means C4 either crashes at construction (no iSAM2 handle) or, worse, gets a fake handle that pretends to anchor poses that nothing produces — a silent passthrough that violates the "Real Results, Not Simulated Ones" meta-rule.
2. **The replay Tier-2 smoke profile needs an honest minimum.** The AZ-265 replay path's mandatory simple baseline is KLT/RANSAC VIO + ESKF state estimator without any satellite re-anchoring (AZ-777 will add the satellite path on top via the Derkachi C6 reference tile cache). Without an explicit composition profile that excludes C4, every Tier-2 test that wants to exercise the simple baseline either crashes at compose time or has to monkey-patch the registry — both are anti-patterns for an architectural seam.
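
The construction failure described in point 1 can be sketched as follows. This is a minimal illustration, not the repository's code: `Isam2GraphHandleSketch`, `EskfEstimatorSketch`, `create_eskf_sketch`, `PoseEstimatorSketch`, and `try_build_c4` are hypothetical stand-ins, and raising `ValueError` is an assumed failure mode. Only the `(estimator, None)` return shape and the `c5_isam2_graph_handle` name come from the text above.

```python
from typing import Optional, Tuple


class Isam2GraphHandleSketch:
    """Hypothetical stand-in for the shared iSAM2 graph handle."""


class EskfEstimatorSketch:
    """Hypothetical stand-in for the ESKF strategy (no factor graph)."""


def create_eskf_sketch() -> Tuple[EskfEstimatorSketch, Optional[Isam2GraphHandleSketch]]:
    # The second tuple element is the iSAM2 handle slot; ESKF leaves it empty.
    return EskfEstimatorSketch(), None


class PoseEstimatorSketch:
    """Hypothetical C4 consumer that needs a real graph to add PnP factors to."""

    def __init__(self, c5_isam2_graph_handle: Optional[Isam2GraphHandleSketch]) -> None:
        if c5_isam2_graph_handle is None:
            # Assumed failure mode: refuse to build rather than accept a fake handle.
            raise ValueError("c4_pose requires c5_isam2_graph_handle")
        self._graph = c5_isam2_graph_handle


def try_build_c4() -> str:
    """Attempt to wire C4 on top of ESKF and report what happens."""
    _estimator, handle = create_eskf_sketch()
    try:
        PoseEstimatorSketch(handle)
    except ValueError as exc:
        return f"compose failed: {exc}"
    return "composed"
```

With ESKF selected, `try_build_c4()` reports a failure because the handle slot is `None`; the only way to make it succeed without a real graph would be to pass a fake handle, which is the silent passthrough the ADR rules out.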
**Decision**:
1. **`C4PoseConfig.enabled: bool = True` is the user-facing switch for the open-loop ESKF profile.** Default ON preserves the ADR-003 steady-state airborne path. Setting `enabled=False` instructs `compose_root` to remove `c4_pose` from the selection map before topological ordering — the wrapper never runs, the consumer never sees a handle, and the wiring stays honest.
2. **`compose_root` enforces the C4↔C5 pairing matrix at compose time.** The validation gate lives in `_validate_c4_c5_composition_profile` (called from `compose_root` before `_compose`) and rejects the two off-diagonal cells of the 2×2 (`c4_pose.enabled`, `c5_state.strategy`) matrix with a `CompositionError` naming both blocks. The two valid combinations are:
- `c4_pose.enabled=True` + `c5_state.strategy="gtsam_isam2"` — the ADR-003 / ADR-009 steady-state airborne path.
- `c4_pose.enabled=False` + `c5_state.strategy="eskf"` — the open-loop ESKF profile (Tier-2 smoke baseline; satellite anchoring deferred to AZ-777).
The two **invalid** combinations are rejected with explicit error text:
- `enabled=False` + `gtsam_isam2` (an iSAM2 graph with no PnP anchors degrades to drift-prone visual-only odometry; the production deployment intent is that `gtsam_isam2` always coexists with C4).
- `enabled=True` + `eskf` (ESKF has no graph for C4 to anchor against; this is the AZ-776 root-cause pairing the user reported).
3. **`build_pre_constructed` honours `c4_pose.enabled`.** When disabled, `c5_isam2_graph_handle` is **omitted** from the `pre_constructed` dict — the handle is a C4 consumer requirement, and removing C4 from the selection map removes the requirement. The ESKF estimator itself is still built and cached in the internal `_c5_prebuilt_estimator` slot (so the C5 wrapper short-circuits onto the prebuilt instance), but the iSAM2-shaped seam disappears from the cross-component contract.
4. **Component selection is the only thing that changes.** The composition root's existing `_compose` mechanics — topological ordering, lazy strategy resolution, build-flag gating — are unchanged. The new `skip_slugs` parameter (a `frozenset[str]`) is the minimal seam that lets `compose_root` instruct `_compose` to drop the disabled component(s); there is no second composition path, no `compose_eskf` function, no mode-aware branch outside the validation gate.
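
The pairing gate and the `skip_slugs` seam from the decision above can be sketched together. This is an illustration under stated assumptions rather than the actual `compose_root`: the function names ending in `_sketch`, the `skip_slugs_for` helper, the error wording, and the shape of the selection map (slug to strategy name) are all hypothetical. Any strategy outside the 2×2 matrix is simply rejected here, which the source does not specify.

```python
from typing import Dict, FrozenSet


class CompositionError(Exception):
    """Stand-in for the composition root's error type."""


# The two valid cells of the (c4_pose.enabled, c5_state.strategy) matrix.
_VALID_PAIRINGS = frozenset({(True, "gtsam_isam2"), (False, "eskf")})


def validate_pairing_sketch(c4_enabled: bool, c5_strategy: str) -> None:
    """Reject the off-diagonal cells, naming both config blocks in the error."""
    if (c4_enabled, c5_strategy) in _VALID_PAIRINGS:
        return
    raise CompositionError(
        f"c4_pose.enabled={c4_enabled} cannot be combined with "
        f"c5_state.strategy={c5_strategy!r}"
    )


def skip_slugs_for(c4_enabled: bool) -> FrozenSet[str]:
    """Slugs to drop from the selection map before topological ordering."""
    return frozenset() if c4_enabled else frozenset({"c4_pose"})


def compose_root_sketch(selection: Dict[str, str], c4_enabled: bool) -> Dict[str, str]:
    """Validate first, then filter; there is no second composition path."""
    validate_pairing_sketch(c4_enabled, selection["c5_state"])
    skip = skip_slugs_for(c4_enabled)
    return {slug: strategy for slug, strategy in selection.items() if slug not in skip}
```

Keeping the validation separate from the filtering mirrors the ADR's point that selection is the only thing that changes: the gate either raises or returns, and everything downstream sees an ordinary selection map.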
**Alternatives considered**:
1. **Make `c4_pose` a "soft" dependency of C5 (introspect the strategy at C5 construction time, skip C4 wiring only when `strategy == "eskf"`).** Rejected: this leaks C5-strategy specifics into C4's interface (`PoseEstimator` would have to grow a "you may not be wired" affordance), violates ADR-009 interface-first, and re-introduces the very mode-aware branches Invariant 1 of the replay protocol forbids.
2. **Make `compose_root` derive `c4_pose.enabled` automatically from `c5_state.strategy` (no user-facing flag).** Rejected: the C4↔C5 coupling is a deliberate design pairing, not a mechanical derivation. Future research strategies (e.g. a non-iSAM2 GTSAM variant, or a satellite-anchored ESKF) may want different combinations; the explicit flag keeps the configuration honest and audit-able.
3. **Keep the wiring as-is and rely on the registry mechanism to skip C4.** Rejected: `C4PoseConfig` registers itself with the global config registry at module import (via `register_component_block` in `components/c4_pose/__init__.py`), which means even an empty `c4_pose:` block in YAML instantiates the block with defaults and pulls C4 into the selection map. The flag is the only honest opt-out without removing the registration call (which would break the steady-state path).
4. **Build a synthetic `NullIsam2GraphHandle` that satisfies the Protocol but no-ops on update.** Rejected as the textbook example of the "Real Results, Not Simulated Ones" anti-pattern: it would let C4 run on top of ESKF with no anchoring, producing pose estimates that look real but have no factor-graph grounding. The composition-time gate is the honest answer.
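
The reasoning behind rejecting alternative 3 can be made concrete with a toy registry. This is a sketch only: `register_block_sketch` and `load_blocks_sketch` are hypothetical stand-ins for `register_component_block` and the config loader, whose real signatures are not shown in this document. It assumes that every registered block is instantiated with defaults when its YAML section is missing or empty.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Type

# Hypothetical global registry: slug -> config block class.
_REGISTRY: Dict[str, Type[Any]] = {}


def register_block_sketch(slug: str, block_cls: Type[Any]) -> None:
    _REGISTRY[slug] = block_cls


@dataclass(frozen=True)
class C4PoseConfig:
    enabled: bool = True  # default ON preserves the steady-state path


# In the real package this call runs at module import time.
register_block_sketch("c4_pose", C4PoseConfig)


def load_blocks_sketch(raw: Dict[str, Optional[Dict[str, Any]]]) -> Dict[str, Any]:
    """Instantiate every registered block; missing or empty sections get defaults."""
    return {slug: cls(**(raw.get(slug) or {})) for slug, cls in _REGISTRY.items()}
```

Because registration happens regardless of what the YAML contains, omitting or emptying the `c4_pose:` section still yields a block with `enabled=True`. Only an explicit `enabled: false` changes the outcome, which is why the flag is the opt-out.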
**Consequences**:
- `tests/e2e/replay/conftest.py` writes `c4_pose: { enabled: false }` into the Tier-2 replay `config.yaml`, alongside the existing `c1_vio: klt_ransac` + `c5_state: eskf` block. This is the open-loop profile the replay binary uses for the AZ-265 / AZ-776 simple-baseline tests.
- `tests/e2e/replay/test_derkachi_1min.py` un-xfails AC-1 (clean exit + per-frame JSONL), AC-2 (schema), AC-5 (determinism), AC-6 realtime, and AC-6 ASAP — these tests only required compose-time success to pass and AZ-776 lands that. AC-3 (≤ 100 m for ≥ 80 % of ticks) **remains** xfailed for AZ-777: ESKF integrates open-loop and drifts unbounded without C2/C3/C4 satellite re-anchoring; the ≤ 100 m threshold cannot be met by physics until the Derkachi C6 reference tile cache lands.
- `_docs/02_document/contracts/replay/replay_protocol.md` gains a new "Open-loop ESKF composition profile" sub-section in **Composition root extension** plus a new **Invariant 13** ("C4↔C5 pairing matrix is enforced at compose time") that the AZ-776 unit tests own.
- `_docs/02_document/components/06_c4_pose/description.md` gains an "Enabled flag" sub-section that points at this ADR; the rest of the component contract is unchanged.
- The unit-test surface at `tests/unit/runtime_root/test_az776_open_loop_eskf_composition.py` owns the seven invariants AZ-776 introduces: `C4PoseConfig.enabled` default-true, AC-1 (open-loop ESKF composes without C4), AC-2 (default GTSAM profile still includes C4), AC-3a + AC-3b (the two forbidden pairings raise `CompositionError`), and the two `pre_constructed` behaviours (`c5_isam2_graph_handle` omitted when C4 disabled, present when C4 enabled). The full suite passes in ~4 s.
- The composition root's contract surface in `runtime_root/__init__.py` gains one public helper (`CompositionError` was already public; the new `skip_slugs` parameter to `_compose` is module-private). No public CLI flag is added — operators set `c4_pose.enabled = false` in YAML.
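
The Tier-2 profile and the two `pre_constructed` behaviours listed in the consequences above can be sketched as follows. The nested key layout of `OPEN_LOOP_PROFILE` is an assumed in-memory shape of the parsed `config.yaml`, `build_pre_constructed_sketch` is a hypothetical stand-in for `build_pre_constructed`, and treating `_c5_prebuilt_estimator` as a key of the same dict is an assumption made for brevity.

```python
from typing import Any, Dict, Optional

# Assumed parsed form of the Tier-2 replay config.yaml (open-loop profile).
OPEN_LOOP_PROFILE: Dict[str, Dict[str, Any]] = {
    "c1_vio": {"strategy": "klt_ransac"},
    "c5_state": {"strategy": "eskf"},
    "c4_pose": {"enabled": False},
}


def build_pre_constructed_sketch(
    config: Dict[str, Dict[str, Any]],
    estimator: Any,
    graph_handle: Optional[Any],
) -> Dict[str, Any]:
    """Cache the prebuilt C5 estimator; expose the graph handle only if C4 is on."""
    pre: Dict[str, Any] = {"_c5_prebuilt_estimator": estimator}
    # A missing c4_pose block or a missing flag falls back to the default (True).
    if config.get("c4_pose", {}).get("enabled", True):
        pre["c5_isam2_graph_handle"] = graph_handle
    return pre
```

With `OPEN_LOOP_PROFILE` the returned dict has no `c5_isam2_graph_handle` key at all, rather than a key holding `None`, so no consumer can mistake an absent graph for a present one.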