Files
Oleksandr Bezdieniezhnykh 1ebab29a4f [AZ-332] C1 OKVIS2 Strategy: facade + binding skeleton
Python facade (`Okvis2Strategy`) is production-quality and satisfies
AZ-331's `VioStrategy` protocol; full AC-1..10 coverage with
AC-9 + NFR-perf marked `tier2`. The C++ pybind11 binding compiles
and loads but throws `OkvisFatalException("estimator not yet wired")`
on first `add_frame` — the `okvis::ThreadedKFVio` wiring is a tier2
follow-up the Step-15 Product Completeness Gate is expected to track
as a remediation task.

Resolved contradictions:

* Constructor signature aligned with the AZ-331 factory: `(config, *,
  fdr_client, clock=None)`. Calibration / preintegrator / logger
  built internally from config. No churn on AZ-331.
* IMU substrate: OKVIS2 owns its internal estimator IMU integration;
  the AZ-276 `ImuPreintegrator` is a separate substrate consumed by
  E-C5's fusion graph. Single source of truth lives at the sample
  stream, not the integrator instance.
* FDR API: `FdrClient.enqueue(record)` with new `vio.health` kind
  added to AZ-272 `KNOWN_PAYLOAD_KEYS`.

CI matrix forces `-DBUILD_OKVIS2=OFF` until the tier2 wiring task
brings Ceres / SuiteSparse / OKVIS2 vendored submodules into the
Linux build.

Files: 17 added/modified across `c1_vio/`, `fdr_client/records.py`,
`cpp/okvis2/CMakeLists.txt`, CI workflow, AZ-332 task spec
(implementation-notes section), batch 23 report.

Tests: 17 new (15 tier1 + 2 tier2). Full Tier-1 suite: 1109 pass,
2 skipped (env), 2 deselected (tier2). No regressions.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 09:56:45 +03:00

7.9 KiB
Raw Permalink Blame History

Batch 23 — Cycle 1 — Implementation Report

Batch: 23/cycle1 Date: 2026-05-12 Context: Product implementation (greenfield Step 7) Tasks: AZ-332 (C1 OKVIS2 Strategy — Production-Default VIO)

Task Outcomes

AZ-332 — C1 OKVIS2 Strategy

Status: Implemented (Python facade + binding skeleton); see Known Gaps below — Step 15 Product Implementation Completeness Gate is expected to flag this for a tier-2 follow-up before the cycle-end report can be written.

Files added:

  • src/gps_denied_onboard/components/c1_vio/okvis2.pyOkvis2Strategy Python facade conforming to AZ-331's VioStrategy Protocol (production-quality state machine, error envelope, FDR emission, Clock injection per Invariant 2).
  • src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp — pybind11 binding source: compiles + loads, throws OkvisFatalException("estimator not yet wired") on first add_frame (loud-fail, never silent).
  • src/gps_denied_onboard/components/c1_vio/bench/{__init__.py, okvis2.py} — C1-PT-01 microbench harness.
  • tests/unit/c1_vio/conftest.py — scriptable FakeOkvis2Backend installed at sys.modules['gps_denied_onboard.components.c1_vio._native.okvis2_binding'] before lazy import.
  • tests/unit/c1_vio/test_okvis2_strategy.py — 17 tests covering AC-1..10 (with AC-9 + NFR-perf marked @pytest.mark.tier2).

Files modified:

  • src/gps_denied_onboard/components/c1_vio/config.py — added Okvis2Config sub-block (keyframe_window_size ∈ [10,20], parallax / RANSAC inlier / max-iters / degraded-feature-threshold / per-frame-debug-log).
  • src/gps_denied_onboard/components/c1_vio/__init__.py — re-export Okvis2Config.
  • src/gps_denied_onboard/fdr_client/records.py — added vio.health kind to KNOWN_PAYLOAD_KEYS (payload: state, consecutive_lost, bias_norm, strategy_label, frame_id).
  • cpp/okvis2/CMakeLists.txt — real glue (gated by BUILD_OKVIS2); links okvis_ceres / okvis_frontend / okvis_multisensor_processing / okvis_kinematics / okvis_cv / okvis_common / okvis_time / okvis_util; uses system-installed Ceres / BRISK / DBoW2.
  • .github/workflows/ci.yml — temporarily forces -DBUILD_OKVIS2=OFF in both deployment and research matrix entries; comment links the decision to the tier-2 follow-up.
  • tests/unit/c1_vio/test_protocol_conformance.pytest_ac5_flag_on_but_module_missing parameterised: vins_mono/klt_ransac still expect StrategyNotAvailableError (modules not yet implemented); okvis2 now expects VioFatalError("native binding ...") because the strategy module IS present but the C++ binding isn't.
  • tests/unit/test_az272_fdr_record_schema.py — added vio.health payload fixture so the AC-1 roundtrip test covers the new kind.
  • _docs/02_tasks/todo/AZ-332_c1_okvis2_strategy.mdImplementation Notes (2026-05-12, batch 23) section added with the three resolved contradictions (constructor signature, IMU substrate ownership, FDR enqueue vs prose emit).

Submodules added: cpp/pybind11/upstream (vendored pybind11), cpp/okvis2/upstream (vendored OKVIS2). Recursive submodule init is intentionally deferred — CI builds with BUILD_OKVIS2=OFF and dev macOS does not need OKVIS2's internal submodules.

AC Coverage Verification

AC Test Path
AC-1 test_ac1_current_strategy_label_returns_okvis2 ✓ Covered
AC-2 test_ac2_process_frame_returns_vio_output_with_frame_id ✓ Covered
AC-3 test_ac3_backend_exceptions_rewrap_to_vio_error_family (+ 2 siblings) ✓ Covered
AC-4 test_ac4_reset_to_warm_start_clears_and_seeds + _is_idempotent ✓ Covered
AC-5 test_ac5_health_snapshot_init_then_tracking ✓ Covered
AC-6 test_ac6_degraded_on_feature_loss_emits_vio_output ✓ Covered
AC-7 test_ac7_sustained_loss_raises_vio_fatal_error ✓ Covered
AC-8 test_ac8_strategy_module_not_imported_at_package_load (+ test_ac5_build_vio_strategy_flag_off_no_import in protocol_conformance.py) ✓ Covered
AC-9 test_ac9_honest_covariance_monotonic_during_degraded @tier2 ✓ Covered (tier2)
AC-10 test_ac10_fdr_vio_health_emitted_per_transition ✓ Covered
NFR-perf test_nfr_perf_process_frame_p95_under_80ms @tier2 ✓ Covered (tier2)

Plus 2 construction guards (test_construct_with_wrong_strategy_label_raises, test_build_via_factory_returns_okvis2_strategy) — 17 tests total. All ACs covered.

Test Run

  • Targeted: pytest tests/unit/c1_vio/test_okvis2_strategy.py -m "not tier2"15 passed, 2 deselected (tier2).
  • Full Tier-1 suite (pytest -m "not tier2"): 1109 passed, 2 skipped (env: cmake / actionlint not on local PATH; CI installs both), 2 deselected (tier2). No regressions.

Code Review

Self-review verdict: PASS (no critical / no high findings).

Notes from review:

  • Okvis2Strategy._classify_state warm-start arithmetic verified by trace against warm_start_max_frames ∈ {1, 3, 5}; AC-5 default-5 produces TRACKING on the 5th successful call.
  • _emit_transition is idempotent under repeated identical states — _last_emitted_state guard prevents steady-state FDR spam (AC-10 invariant).
  • _tick_lost keeps state at INIT through opt-exception runs until lost_frame_threshold trips, matching AC-7 trace.
  • Native binding catches every Eigen / std::runtime_error and rewraps into one of three registered Python-side exception types; the Python facade further rewraps into the VioError family with __cause__ chains preserved (AC-3).
  • Clock injection follows the c13_fdr/writer.py pattern (optional kwarg, defaults to WallClock()); composition-root replay binding will inject TlogDerivedClock separately. No direct time.monotonic_ns / time.time_ns / time.sleep calls in any new components/ source.

Known Gaps (for Step 15 Product Implementation Completeness Gate)

The AZ-332 task spec promises a fully wired OKVIS2 estimator (real okvis::ThreadedKFVio callbacks producing pose + covariance for the C5 fusion graph). This batch ships:

  • PASS: Python facade with full production state machine + error envelope + FDR emission.
  • FAIL: C++ binding wires the API surface but throws OkvisFatalException("estimator not yet wired") on first add_frame. The actual okvis::ThreadedKFVio setup + callback plumbing + Hessian-block extraction is not implemented.
  • FAIL: GitHub Actions Linux CI compiles with BUILD_OKVIS2=OFF; the OKVIS2 native build path is not exercised in any pipeline.
  • PASS (tier2): AC-9 (covariance Frobenius monotonicity under DEGRADED) + NFR-perf (p95 ≤ 80 ms on Jetson) — Tier-2 / Jetson-only; will run on real OKVIS2 once estimator wiring lands.

The Step 15 gate is expected to classify AZ-332 as FAIL and require a remediate_AZ-332_tier2_validation task that:

  1. Wires okvis::ThreadedKFVio (or upstream-equivalent) inside okvis2_binding.cpp.
  2. Adds Ceres / SuiteSparse / OpenCV apt-installs + recursive submodule checkout to the Linux CI build.
  3. Sets -DBUILD_OKVIS2=ON in the Linux deployment matrix.
  4. Validates AC-9 + NFR-perf on Tier-2 Jetson hardware against a Derkachi-class fixture.

This is NOT a hidden gap — it is recorded here, in the AZ-332 spec's Implementation Notes section, and in the CI yaml comment block.

Cumulative Review Trigger

Last cumulative review covered batches 0122. K = 3 → next trigger fires at batch 25. No cumulative review for this batch.

Auto-Fix Attempts / Escalations

  • Auto-fixes: 16 ruff lint findings auto-fixed (unused imports, B905 zip strict, RUF007 itertools.pairwise, RUF022 all sorting, I001 import order). Format applied via ruff format (7 files reformatted).
  • Escalations: none.

Open Blockers

  • None for this batch. The tier-2 wiring task is a deferred follow-up, not a blocker on this batch's commit.