gps-denied-onboard/_docs/02_tasks/todo/AZ-943_okvis2_threadedslam_binding.md
Oleksandr Bezdieniezhnykh e367b07e3b [AZ-943] Import OKVIS2 binding spec + dep table; autodev state trim
Imports AZ-943 (OKVIS2 binding: real ThreadedSlam wiring; AZ-592 split
1/3, 5pt) from Jira into a local task spec at
_docs/02_tasks/todo/AZ-943_okvis2_threadedslam_binding.md so the
implement skill batch loop has the input it needs.

Dependency table: +AZ-943 row, +preamble entry, totals 180→181 tasks /
570→575 SP. AZ-944 + AZ-945 stay Jira-only this session per the
AZ-943→AZ-944→AZ-945 Blocks chain (their local specs land when their
Implement turns come up).

State file trimmed from 52 lines to schema-compliant 13 lines per
.cursor/skills/autodev/state.md (sub_step.detail must be a one-line
pointer, not a logbook). Resume context lives in the new task spec +
git log of 94d2358 (AZ-918..AZ-922 baseline fixes).

Per AZ-942 + AZ-923 are parked (state file's "Open Items At Pause" is
recorded in git log via this commit's body; not retained in state file
going forward).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-29 10:50:18 +03:00


C1 OKVIS2 Binding — Real ThreadedSlam Wiring (AZ-592 split 1/3)

Task: AZ-943_okvis2_threadedslam_binding
Name: OKVIS2 binding: replace AZ-332 skeleton with real okvis::ThreadedSlam wiring
Description: Sub-ticket 1 of 3 from the AZ-592 placeholder split (per state file 2026-05-27 split rationale). Replaces the AZ-332 skeleton in src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp (_build_estimator() is a no-op, _drive_estimator() raises OkvisFatalException) with the real okvis::ThreadedSlam v2 pipeline: ViParametersReader(yaml).getParameters(...) → ThreadedSlam(parameters, dBowDir) → setOptimisedGraphCallback(...). Without this wiring, Okvis2Strategy (AZ-332) is the production default per the architecture but throws on the first add_frame, so the production VIO is unusable. CI build env + Jetson validation are tracked in sibling tickets AZ-944 (3pt, Linux CI + DBoW2 vocab + Tier-1 smoke) and AZ-945 (3pt, Jetson L4T + Tier-2 Derkachi e2e); the Blocks chain in Jira is AZ-943 → AZ-944 → AZ-945. This ticket touches ONLY the C++ binding; the Python facade's fake-binding fixture stays unchanged (see AC-8). It does NOT flip BUILD_OKVIS2=ON in CI (that's AZ-944's deliverable).
Complexity: 5 points
Dependencies: AZ-332 (the skeleton this ticket replaces; in done/), AZ-592 (parent umbrella placeholder; in backlog/)
Component: c1_vio (epic AZ-254 / E-C1)
Tracker: AZ-943 (https://denyspopov.atlassian.net/browse/AZ-943)
Epic: AZ-254 (E-C1)

Document Dependencies

  • _docs/02_document/contracts/c1_vio/vio_strategy_protocol.md — the Protocol the strategy implements (AZ-331).
  • _docs/02_document/components/01_c1_vio/description.md: § 5 implementation details (sliding-window K=10-20, per-frame cost), § 7 caveats (thermal throttle latency spikes).
  • ADR-002 (KLT/RANSAC mandatory baseline) — explains why this OKVIS2 wiring does NOT replace KLT/RANSAC; both ship.
  • cpp/okvis2/upstream/ — fully-populated v2 source tree the binding links against.

Problem

src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp is the AZ-332 skeleton:

  • _build_estimator() (line ~251) sets estimator_built_ = false and does nothing else.
  • _drive_estimator() (line ~261) throws OkvisFatalException("OKVIS2 estimator not yet wired — this binding is the AZ-332 skeleton; tier2 follow-up wires okvis::ThreadedKFVio") on first frame.
  • Real OKVIS2 includes (#include <okvis/ThreadedKFVio.hpp> etc.) are commented out at lines ~48-50.

Without this wiring, Okvis2Strategy cannot produce any output — the Python facade is complete, the binding compiles and loads, but the first add_frame immediately raises. The production-default VIO is unusable.

API correction since AZ-332: OKVIS2 v2 upstream uses okvis::ThreadedSlam (NOT okvis::ThreadedKFVio as the AZ-332 spec referenced; that's the OKVIS v1 API). The wiring must follow v2 conventions:

okvis::ViParametersReader(yaml_path).getParameters(parameters);
auto estimator = std::make_unique<okvis::ThreadedSlam>(parameters, dBowDir);
estimator->setOptimisedGraphCallback([this](auto&& g, auto&& l, auto&& s) { ... });

Outcome

  • Okvis2Strategy.add_frame(...) produces a real VioOutput (pose + 6×6 covariance + biases + tracking-quality counts) on every keyframe the OKVIS2 backend optimises — no exceptions on the first frame.
  • Okvis2Strategy.reset(...) tears down the C++ estimator and rebuilds it with the supplied seed pose/velocity/bias.
  • Existing Python unit tests (tests/unit/c1_vio/test_okvis2_strategy.py) remain green against the unchanged fake-binding fixture (tests/unit/c1_vio/conftest.py).
  • This ticket alone does NOT light up the Tier-1 or Tier-2 e2e path against real OKVIS2 — that's AZ-944 / AZ-945. Tier-1 unit suite stays the only green-bar evidence here.

Scope

Included

  • Rewrite _build_estimator() to construct a real okvis::ThreadedSlam from yaml_config_ via okvis::ViParametersReader. The DBoW2 vocabulary directory comes from a CMake-defined preprocessor constant (vocab artifact provisioning is AZ-944's scope; this ticket only consumes the path).
  • Rewrite _drive_estimator() to convert the py::array_t<uint8_t> frame to a cv::Mat (zero-copy preferred) and call estimator_->addImages(stamp, {0: cv_mat}). It returns true iff the optimised-graph callback fired for this frame's keyframe.
  • Wire add_imu(ts_ns, accel, gyro) through estimator_->addImuMeasurement(stamp, alpha, omega). Keep the existing strict-monotonic guard on the binding side (line ~161).
  • Implement the setOptimisedGraphCallback(...) lambda: fill latest_output_ under output_mtx_ with pose_T_world_body (Eigen::Matrix4d), pose_covariance_6x6 (extracted from ViSlamBackend marginalised block — see Implementation Notes), accel_bias / gyro_bias, tracked / new / lost feature counts, mean_parallax, mre_px, emitted_at_ns.
  • Map okvis::TrackingQuality → HealthState: Good → Tracking, Marginal → Degraded, Lost → Lost. Update state_ inside the callback before latest_output_ is filled.
  • Rewrite reset() to release the existing estimator and reconstruct via _build_estimator(); apply the seed pose/velocity/bias to the new instance.
  • Catch all OKVIS2 / Eigen / std::runtime_error inside the binding and rethrow as OkvisInitException (during construction), OkvisOptimizationException (during operation), or OkvisFatalException (irrecoverable). No raw exceptions cross into Python.
  • Uncomment the OKVIS2 #include block (lines ~48-50) and verify the _build_estimator / _drive_estimator paths compile cleanly under BUILD_OKVIS2=ON on a developer machine that has the apt deps. CI green-bar is AZ-944, not this ticket.

Excluded

  • CI apt deps and BUILD_OKVIS2=ON flip in Dockerfile.test.jetson / Linux runners — that's AZ-944's deliverable. This ticket leaves the CI build off; the C++ change rides as compile-clean only on hosts that already provision the deps (or after AZ-944 lands).
  • Jetson L4T image build + Tier-2 Derkachi e2e (--vio-strategy okvis2) — that's AZ-945's deliverable.
  • DBoW2 small_voc artifact provisioning — sibling decision in AZ-944 (vendor in-tree vs. download-on-build vs. build-from-source). This ticket consumes whatever path the CMake constant resolves to.
  • AZ-332 skeleton's surface decisions — exception types, latest_output_ struct fields, py::dict shape — settled by AZ-332. This ticket does not change them.
  • Multi-camera support — single nav-camera per RESTRICT-UAV-3 / AZ-332.
  • OKVIS2 upstream source modifications — pin is fixed per AZ-332 Plan-phase; deviations require an ADR. The covariance side-channel approach (Implementation Notes) is intentionally chosen to avoid upstream patching.

Acceptance Criteria

AC-1: Real estimator construction Given yaml_config_ is a valid OKVIS2 v2 YAML config and the DBoW2 vocab path resolves When _build_estimator() runs Then it constructs an okvis::ThreadedSlam instance via okvis::ViParametersReader and stores it in estimator_ (no longer nullptr); estimator_built_ is true; no exception thrown.

AC-2: Frame ingestion drives the estimator Given _drive_estimator() receives a py::array_t<uint8_t> of shape (H, W) (mono camera per RESTRICT-UAV-3) with a valid stamp_ns When the function runs Then it converts the array to cv::Mat (zero-copy preferred) and calls estimator_->addImages(stamp, {0: cv_mat}). Returns true iff the optimised-graph callback fired for this frame's keyframe within the configured timeout.
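The "returns true iff the callback fired within the configured timeout" clause of AC-2 can be sketched with a condition variable. This is a minimal stdlib-only illustration, not the binding's actual code: KeyframeSignal and its method names are hypothetical, and it assumes only one frame is in flight at a time, so the newest callback timestamp can be compared for equality.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical helper: the optimised-graph callback records the keyframe
// timestamp it was invoked for; the frame-driving thread waits on it.
class KeyframeSignal {
 public:
  // Called from the estimator's callback thread.
  void notify(std::int64_t stamp_ns) {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      latest_stamp_ns_ = stamp_ns;
    }
    cv_.notify_all();
  }

  // Called after addImages(). Returns true iff a callback for exactly
  // `stamp_ns` has been recorded before `timeout` elapses.
  // Assumption: one frame in flight at a time.
  bool wait_for_keyframe(std::int64_t stamp_ns,
                         std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mtx_);
    return cv_.wait_for(lock, timeout,
                        [&] { return latest_stamp_ns_ == stamp_ns; });
  }

 private:
  std::mutex mtx_;
  std::condition_variable cv_;
  std::int64_t latest_stamp_ns_ = -1;  // -1 means "no callback yet"
};
```

In the real binding the lambda passed to setOptimisedGraphCallback(...) would call notify(...) after filling latest_output_, and _drive_estimator() would return the result of wait_for_keyframe(...). A frame that OKVIS2 does not promote to a keyframe simply times out and yields false.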

AC-3: IMU forwarding Given add_imu(ts_ns, accel, gyro) is called with strictly-monotonic timestamps When the function runs Then it forwards (stamp, alpha, omega) to estimator_->addImuMeasurement(...). The existing strict-monotonic guard (binding-side, line ~161) is preserved.
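As an illustration of the strict-monotonic guard AC-3 asks to preserve, the sketch below rejects any IMU timestamp that is not strictly greater than the last accepted one. The class name, the bool return, and the split_ns helper are assumptions made for this example; the existing guard at line ~161 may signal violations differently. split_ns only matters if the estimator's time type takes separate seconds and nanoseconds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical seconds/nanoseconds pair for time types that are
// constructed from two unsigned 32-bit fields.
struct SecNsec {
  std::uint32_t sec;
  std::uint32_t nsec;
};

// Splits a non-negative nanosecond timestamp into whole seconds and the
// nanosecond remainder.
inline SecNsec split_ns(std::int64_t ts_ns) {
  return SecNsec{static_cast<std::uint32_t>(ts_ns / 1000000000LL),
                 static_cast<std::uint32_t>(ts_ns % 1000000000LL)};
}

// Accepts a timestamp only if it is strictly greater than the previously
// accepted one; equal or older timestamps are rejected and leave the
// guard's state untouched.
class ImuTimestampGuard {
 public:
  bool accept(std::int64_t ts_ns) {
    if (has_last_ && ts_ns <= last_ts_ns_) return false;
    last_ts_ns_ = ts_ns;
    has_last_ = true;
    return true;
  }

  // Forget the history, e.g. when the estimator is rebuilt.
  void clear() { has_last_ = false; }

 private:
  bool has_last_ = false;
  std::int64_t last_ts_ns_ = 0;
};
```

With this shape, add_imu would forward to addImuMeasurement(...) only when accept(ts_ns) returns true, and clear() would be called from reset().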

AC-4: Optimised-graph callback fills latest_output_ Given estimator_->setOptimisedGraphCallback(...) is wired with the binding's lambda When the OKVIS2 backend optimises a keyframe Then latest_output_ is filled under output_mtx_ with: pose_T_world_body (Eigen::Matrix4d), pose_covariance_6x6, accel_bias, gyro_bias, tracked_count / new_count / lost_count, mean_parallax, mre_px, emitted_at_ns. The 6×6 covariance is extracted from the ViSlamBackend marginalised block (see Implementation Notes for approach).

AC-5: Health-state mapping Given okvis::TrackingQuality is one of {Good, Marginal, Lost} When the callback fires Then state_ updates to {Tracking, Degraded, Lost} respectively, BEFORE latest_output_ is filled, so a concurrent reader sees consistent state+output.
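The mapping and ordering in AC-5 can be sketched as follows. The enums here are stand-ins for okvis::TrackingQuality and the binding's HealthState, Output is reduced to two of the fields listed in AC-4, and the class name and initial state are assumptions. The point shown is that state and output are updated inside one critical section, state first, so a reader holding the same mutex never sees a new output paired with a stale state.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Stand-ins for okvis::TrackingQuality and the binding's HealthState.
enum class TrackingQuality { Good, Marginal, Lost };
enum class HealthState { Tracking, Degraded, Lost };

// Good -> Tracking, Marginal -> Degraded, Lost -> Lost.
constexpr HealthState to_health_state(TrackingQuality q) {
  return q == TrackingQuality::Good       ? HealthState::Tracking
         : q == TrackingQuality::Marginal ? HealthState::Degraded
                                          : HealthState::Lost;
}

// Reduced stand-in for latest_output_ (the real struct has more fields).
struct Output {
  std::int64_t emitted_at_ns = 0;
  int tracked_count = 0;
};

struct Snapshot {
  HealthState state;
  Output output;
};

class SharedVioState {
 public:
  // Called from the optimised-graph callback.
  void publish(TrackingQuality quality, const Output& out) {
    std::lock_guard<std::mutex> lock(output_mtx_);
    state_ = to_health_state(quality);  // state first, per AC-5
    latest_output_ = out;
  }

  // Called from the Python-facing thread; copies both under one lock.
  Snapshot snapshot() const {
    std::lock_guard<std::mutex> lock(output_mtx_);
    return Snapshot{state_, latest_output_};
  }

 private:
  mutable std::mutex output_mtx_;
  HealthState state_ = HealthState::Lost;  // assumed initial state
  Output latest_output_;
};
```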

AC-6: Reset rebuilds with seed Given an active Okvis2Strategy with a built estimator When reset(seed_pose, seed_velocity, seed_bias) is called Then the existing estimator is released (C++ resources freed), _build_estimator() reconstructs a fresh instance, and the seed is applied via OKVIS2's setSeedFromPriors(...) (or equivalent) before the next add_frame.
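The release-then-rebuild order in AC-6 might look like the sketch below. FakeEstimator stands in for okvis::ThreadedSlam, apply_seed stands in for whatever seeding call OKVIS2 actually offers, and the live-instance counter exists only to make the "old estimator is freed before the new one is built" property observable. All of these names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <memory>

struct Seed {
  std::array<double, 3> position;
  std::array<double, 3> velocity;
  std::array<double, 3> accel_bias;
  std::array<double, 3> gyro_bias;
};

// Stand-in for the real estimator; counts live instances via `live`.
struct FakeEstimator {
  explicit FakeEstimator(int* live) : live_(live) { ++*live_; }
  ~FakeEstimator() { --*live_; }
  FakeEstimator(const FakeEstimator&) = delete;
  FakeEstimator& operator=(const FakeEstimator&) = delete;

  void apply_seed(const Seed& s) {
    seed = s;
    seeded = true;
  }

  Seed seed{};
  bool seeded = false;

 private:
  int* live_;
};

class Binding {
 public:
  explicit Binding(int* live_counter) : live_(live_counter) {
    build_estimator();
  }

  // Release the old estimator first so two instances never coexist,
  // then rebuild and seed before any further frames are accepted.
  void reset(const Seed& seed) {
    estimator_.reset();
    estimator_built_ = false;
    build_estimator();
    estimator_->apply_seed(seed);
  }

  const FakeEstimator& estimator() const { return *estimator_; }
  bool built() const { return estimator_built_; }

 private:
  void build_estimator() {
    estimator_ = std::make_unique<FakeEstimator>(live_);
    estimator_built_ = true;
  }

  int* live_;
  std::unique_ptr<FakeEstimator> estimator_;
  bool estimator_built_ = false;
};
```

Releasing before rebuilding matters for a threaded estimator because its destructor is where worker threads are expected to be joined; constructing the replacement first would briefly double the resource footprint.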

AC-7: Exception translation Given an OKVIS2-internal exception, an Eigen exception, or a std::runtime_error is raised inside the binding When the binding catches it Then it is rethrown as one of: OkvisInitException (if raised from _build_estimator), OkvisOptimizationException (if raised from _drive_estimator / add_imu), OkvisFatalException (if the backend signals irrecoverable failure). No raw C++ exception crosses the pybind11 boundary.
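One way to get the behaviour AC-7 describes is a small wrapper that each entry point runs its body through. The exception class names come from this spec, but deriving them from std::runtime_error and the translate_exceptions helper are assumptions made for this sketch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Class names are from the spec; the base class is an assumption.
struct OkvisInitException : std::runtime_error {
  using std::runtime_error::runtime_error;
};
struct OkvisOptimizationException : std::runtime_error {
  using std::runtime_error::runtime_error;
};
struct OkvisFatalException : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Runs `fn` and rethrows anything it throws as `Translated`, except
// OkvisFatalException, which passes through unchanged.
template <typename Translated, typename Fn>
auto translate_exceptions(Fn&& fn) -> decltype(fn()) {
  try {
    return std::forward<Fn>(fn)();
  } catch (const OkvisFatalException&) {
    throw;  // irrecoverable failures keep their type
  } catch (const std::exception& e) {
    throw Translated(e.what());
  } catch (...) {
    throw Translated("unknown non-std exception");
  }
}
```

Under these assumptions _build_estimator() would wrap its body in translate_exceptions<OkvisInitException>(...), while _drive_estimator() and add_imu would use translate_exceptions<OkvisOptimizationException>(...). The catch-all branch is what keeps exceptions not derived from std::exception from reaching the pybind11 boundary.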

AC-8: Python unit tests stay green against the fake binding Given the fake-binding fixture at tests/unit/c1_vio/conftest.py is unchanged When pytest tests/unit/c1_vio/test_okvis2_strategy.py -v --tb=short runs (Tier-1) Then all pre-existing unit tests pass with no behavioural change. The fake-binding contract is unchanged — only the real C++ side gets wired.

Implementation Notes

Headers needed

  • okvis/ThreadedSlam.hpp — v2 SLAM front-end + back-end coordinator (replaces v1's ThreadedKFVio).
  • okvis/ViParametersReader.hpp — YAML config loader.
  • okvis/Estimator.hpp — back-end (needed for the covariance side-channel access).
  • okvis/cameras/PinholeCamera.hpp — K-matrix → OKVIS camera-object conversion if the binding constructs cameras directly (otherwise the YAML carries them).

6×6 covariance extraction — the known unknown

The setOptimisedGraphCallback payload (ViGraph snapshot) does NOT carry the latest-pose covariance directly; covariance lives inside the Estimator's back-end. Two approaches:

  • (a) Side-channel accessor (preferred for first cut): inside the callback, take a non-const handle to estimator_->backend() (or equivalent) and read the marginalised 6×6 block for the latest pose state. Keep the read protected by output_mtx_. If OKVIS2 v2 does not expose the back-end accessor publicly, fall back to subclassing ThreadedSlam and exposing a thin getter. This works only when the member is protected (a subclass cannot reach private members), and it stays in our binding with no upstream change.
  • (b) Tiny upstream patch: add a public latestPoseCovariance6x6() method to okvis::ViSlamBackend and submit upstream. Quicker to write, but it requires a pin bump + ADR per AZ-332 Plan-phase. Fall back to (b) only if (a) hits a hard private-field block.

Pick (a) for the first cut. If (a) requires a subclass-exposed getter, document the subclass in a code comment referencing AC-4 and AZ-943.
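The subclass fallback in approach (a) relies on a standard C++ technique: a derived class can re-export a protected base member as public with a using-declaration. The sketch below shows the mechanism on stand-in types only; UpstreamSlam, optimise(), and latestPoseCovariance() are hypothetical and say nothing about what okvis::ThreadedSlam actually declares.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Cov6 = std::array<double, 36>;  // row-major 6x6

// Stand-in for an upstream class that keeps its covariance accessor
// protected rather than public.
class UpstreamSlam {
 public:
  virtual ~UpstreamSlam() = default;

  // Pretend optimisation step: writes 0.01 * identity into the block.
  void optimise() {
    cov_.fill(0.0);
    for (std::size_t i = 0; i < 6; ++i) cov_[i * 6 + i] = 0.01;
  }

 protected:
  const Cov6& latestPoseCovariance() const { return cov_; }

 private:
  Cov6 cov_{};
};

// Binding-local subclass: re-exports the protected getter as public.
// No upstream source change is needed. See AC-4 / AZ-943.
class CovarianceExposingSlam : public UpstreamSlam {
 public:
  using UpstreamSlam::latestPoseCovariance;
};
```

Note that cov_ itself is private in this stand-in and stays unreachable from the subclass; only the protected accessor is re-exported. If upstream offers no protected path to the covariance at all, this technique does not apply.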

cpp/okvis2/CMakeLists.txt already declares the link targets at lines ~64-73: okvis_ceres, okvis_frontend, okvis_multisensor_processing, okvis_kinematics, okvis_cv, okvis_common, okvis_time, okvis_util. The _drive_estimator function needs okvis_cv for the cv::Mat integration. No new targets need to be added; verify that the linker pulls them in cleanly under BUILD_OKVIS2=ON.

pybind11 surface — DO NOT change

The pybind11 module shape (lines ~296-318) is correct and the Python facade unit tests confirm it. Do NOT alter the surface: add_frame, add_imu, reset, and the result struct fields stay byte-compatible with the fake binding. Only the C++ implementations behind those symbols change.

DBoW2 vocab path

Define a CMake preprocessor constant (e.g. OKVIS2_DBOW2_VOCAB_DIR) that points to a path the runtime can resolve. AZ-944 will populate this path with the small-vocabulary artifact (decision: vendor in-tree vs. download-on-build vs. build-from-source). For this ticket: declare the constant, consume it, and document the expected file layout (e.g. ${OKVIS2_DBOW2_VOCAB_DIR}/small_voc.yml.gz or similar) in a code comment referencing AZ-944.
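A possible shape for consuming the constant is sketched below. OKVIS2_DBOW2_VOCAB_DIR is the name suggested above; the fallback path, the helper name dbow2_vocab_file, and the small_voc.yml.gz file name are illustrative assumptions, and the real layout is AZ-944's decision.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Normally injected by CMake as a compile definition. The fallback below
// is a hypothetical placeholder so this sketch compiles on its own.
#ifndef OKVIS2_DBOW2_VOCAB_DIR
#define OKVIS2_DBOW2_VOCAB_DIR "/opt/okvis2/vocab"
#endif

// Expected layout (assumption, to be confirmed by AZ-944):
//   ${OKVIS2_DBOW2_VOCAB_DIR}/small_voc.yml.gz
// Returns the full path of the vocabulary file inside `dir`.
inline std::string dbow2_vocab_file(
    std::string dir = OKVIS2_DBOW2_VOCAB_DIR,
    const std::string& file = "small_voc.yml.gz") {
  if (dir.empty()) {
    throw std::invalid_argument("OKVIS2_DBOW2_VOCAB_DIR is empty");
  }
  if (dir.back() != '/') dir.push_back('/');
  return dir + file;
}
```

Since the ThreadedSlam constructor shown earlier takes the directory (dBowDir), a helper like this would mainly serve an early existence check, so a missing artifact surfaces as OkvisInitException with a clear message rather than as a failure deep inside OKVIS2.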

Build verification

Produce compile-clean evidence on a host that has the dependencies installed (a developer Mac with brew install ... equivalents, or a Linux dev VM with the apt deps):

BUILD_OKVIS2=ON cmake -S . -B build && cmake --build build --target c1_vio_okvis2_native

This should produce the .so. Capture the build log in the batch report. The _native/__init__.py Python-side import test then confirms the symbol is loadable (it only loads the shared object; it does not run OKVIS2).

Constraints

  • Pin: OKVIS2 v2 upstream pin from AZ-332 Plan-phase is fixed. Any deviation requires an ADR.
  • No upstream patches unless approach (a) for covariance fails and the failure is documented in a comment + retro entry.
  • Single nav-camera per RESTRICT-UAV-3 — multi-camera ingestion is out of scope.
  • No CI flip: this ticket leaves BUILD_OKVIS2=OFF in Dockerfile.test.jetson / Linux CI runners. AZ-944 owns the flip.
  • Backward compatibility: Python facade fake-binding tests stay green with no fixture changes.

Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|--------------|------------------|
| AC-1 | C++ unit (gtest): construct Okvis2Binding with a known-good YAML, assert estimator_built_ is true and no exception thrown | Pass on a host with apt deps installed |
| AC-2 | C++ unit: feed a synthetic cv::Mat via the C++ side, assert addImages is called once and the optimised-graph callback fires | Pass |
| AC-4 | C++ unit: drive a short EuRoC-like image+IMU sequence, assert latest_output_.pose_covariance_6x6 is non-zero finite SPD | Pass; eigvals all > 0 |
| AC-7 | C++ unit: feed a known-bad YAML; assert OkvisInitException propagates with non-empty what() | Pass |
| AC-8 | Python: pytest tests/unit/c1_vio/test_okvis2_strategy.py -v --tb=short | All pre-existing tests still pass (uses fake binding, no real OKVIS2) |
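The "non-zero finite SPD" assertion for AC-4 could be backed by a check like the one below. It is a dependency-free sketch: a symmetric matrix is positive definite exactly when its Cholesky factorisation succeeds with strictly positive pivots, which is equivalent to all eigenvalues being > 0. The helper name, the Mat6 alias, and the symmetry tolerance are assumptions; in the real gtest, Eigen's LLT or SelfAdjointEigenSolver would likely be the more natural tool.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Mat6 = std::array<std::array<double, 6>, 6>;

// Returns true iff `m` is finite, symmetric within `sym_tol`, and
// positive definite (Cholesky factorisation has positive pivots).
inline bool is_finite_spd(const Mat6& m, double sym_tol = 1e-9) {
  for (std::size_t i = 0; i < 6; ++i) {
    for (std::size_t j = 0; j < 6; ++j) {
      if (!std::isfinite(m[i][j])) return false;
      if (std::abs(m[i][j] - m[j][i]) > sym_tol) return false;
    }
  }
  Mat6 l{};  // lower-triangular factor, zero-initialised
  for (std::size_t j = 0; j < 6; ++j) {
    double d = m[j][j];
    for (std::size_t k = 0; k < j; ++k) d -= l[j][k] * l[j][k];
    if (!(d > 0.0)) return false;  // non-positive pivot: not SPD
    l[j][j] = std::sqrt(d);
    for (std::size_t i = j + 1; i < 6; ++i) {
      double s = m[i][j];
      for (std::size_t k = 0; k < j; ++k) s -= l[i][k] * l[j][k];
      l[i][j] = s / l[j][j];
    }
  }
  return true;
}
```

An all-zero matrix fails at the first pivot, so the "non-zero" part of the requirement is covered by the same check.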

C++ unit tests live under cpp/okvis2/tests/ (or wherever the existing OKVIS2 test layout sits — confirm during implementation; if no harness exists, add a minimal one and document in the batch report).

References

  • Jira ticket: AZ-943 (parent split AZ-592)
  • Sibling Jira tickets (Blocks chain AZ-943 → AZ-944 → AZ-945):
    • AZ-944 (3pt, Linux CI build env + DBoW2 vocab artifact + Tier-1 EuRoC mini smoke)
    • AZ-945 (3pt, Jetson L4T build + Tier-2 Derkachi --vio-strategy okvis2 e2e + perf baseline)
  • AZ-332 spec (the skeleton this replaces): _docs/02_tasks/done/AZ-332_c1_okvis2_strategy.md
  • ADR-002 (KLT/RANSAC mandatory baseline; OKVIS2 is the production-default architectural target)
  • cpp/okvis2/upstream/ (v2 source tree)
  • _docs/_autodev_state.md (resume context: Out-of-band bugfix cycle 94d2358 already committed; AZ-942 / AZ-923 parked; AZ-943→AZ-944→AZ-945 split rationale)