AZ-943 implementation attempt confirmed the C++ binding cannot satisfy AC-4 without upstream OKVIS2 patches. The spec's "approach (a) in-binding subclass workaround" is structurally impossible:
- ThreadedSlam::estimator_ is `private` (not `protected`)
- ViSlamBackend has no public covariance / counts / parallax / MRE accessor in the v2 upstream headers
- TrackingState carries only id / isKeyframe / TrackingQuality enum / recognisedPlace / isFullGraphOptimising / currentKeyframeId, none of the five tracking-stats fields the binding needs
Filed the spec-documented "approach (b)" fallback as two sibling tickets, both linked Jira-side as `is blocked by` against AZ-943:
- AZ-951 (3 SP): upstream patch, expose 6x6 pose covariance accessor (+ ADR-XXX for the AZ-332 Plan-phase pin deviation)
- AZ-952 (3 SP): upstream patch, expose tracking-stats accessor (feature counts + parallax + MRE)
AZ-943 transitioned In Progress -> To Do in Jira, full audit comment attached. Local AZ-943 spec moved todo/ -> backlog/ with PAUSED preamble; original AC list preserved for the post-unblock turn.
Per user 2026-05-29 confirmation: cycle-4 Derkachi demo target stays KLT/RANSAC (tests/e2e/replay/conftest.py line 159 c1_vio: strategy: klt_ransac), so AZ-951 + AZ-952 + AZ-943 chain is correctly deferred. Pivoting next batch to AZ-897 (replay UI form).
Touches: _docs/02_tasks/_dependencies_table.md (preamble + table rows for AZ-943 paused / AZ-951 / AZ-952 added; totals bumped to 142 product + 41 blackbox-test = 183, 448 product + 133 blackbox = 581), _docs/_autodev_state.md (sub_step pivot to AZ-897).
Co-authored-by: Cursor <cursoragent@cursor.com>
C1 OKVIS2 Binding — Real ThreadedSlam Wiring (AZ-592 split 1/3)
STATUS (2026-05-29): PAUSED — BLOCKED on AZ-951 + AZ-952.
Implementation attempt on 2026-05-29 confirmed AC-4 is structurally unreachable without upstream OKVIS2 patches:
- ThreadedSlam::estimator_ is `private` (not `protected`) → the in-binding subclass workaround proposed in Implementation Notes "approach (a)" is impossible.
- ViSlamBackend has no public accessor for 6×6 pose covariance, feature counts, mean parallax, or MRE.
- TrackingState (callback arg) only carries id / isKeyframe / TrackingQuality enum / recognisedPlace / isFullGraphOptimising / currentKeyframeId: none of the AC-4 telemetry fields.
The "approach (b) upstream patch" fallback documented in this file + AZ-592 has been filed as two sibling tickets and linked as `is blocked by` against AZ-943:
- AZ-951 (3 SP): upstream patch — expose 6×6 pose covariance accessor (+ ADR for pin deviation).
- AZ-952 (3 SP): upstream patch — expose tracking-stats accessor (feature counts + parallax + MRE).
Jira AZ-943 reverted to To Do. This local file moved from todo/ → backlog/. The AC list + Implementation Notes below are PRESERVED unchanged for audit; once AZ-951 + AZ-952 land, AC-4 implementation will call backend().computeCovariance6x6(state.id) + backend().getLatestTrackingStats(state.id, ...) and the file moves back to todo/.
Audit reference: AZ-943 Jira comment "Implementation paused: spec gap discovered (2026-05-29)", which carries the full root-cause + decision rationale.
Task: AZ-943_okvis2_threadedslam_binding
Name: OKVIS2 binding: replace AZ-332 skeleton with real okvis::ThreadedSlam wiring
Description: Sub-ticket 1 of 3 from the AZ-592 placeholder split (per state file 2026-05-27 split rationale). Replaces the AZ-332 skeleton in src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp (_build_estimator() no-op, _drive_estimator() raises OkvisFatalException) with the real okvis::ThreadedSlam v2 pipeline: ViParametersReader(yaml).getParameters(...) → ThreadedSlam(parameters, dBowDir) → setOptimisedGraphCallback(...). Without this wiring, Okvis2Strategy (AZ-332) is the production-default per architecture but throws on first add_frame — the production VIO is unusable. CI build env + Jetson validation are tracked in sibling tickets AZ-944 (3pt, Linux CI + DBoW2 vocab + Tier-1 smoke) and AZ-945 (3pt, Jetson L4T + Tier-2 Derkachi e2e); the Blocks chain in Jira is AZ-943 → AZ-944 → AZ-945. This ticket touches ONLY the C++ binding and the Python facade fake-binding fixture; it does NOT flip BUILD_OKVIS2=ON in CI (that's AZ-944's deliverable).
Complexity: 5 points
Dependencies: AZ-332 (the AZ-332 skeleton this replaces; in done/), AZ-592 (parent umbrella placeholder; in backlog/)
Component: c1_vio (epic AZ-254 / E-C1)
Tracker: AZ-943 (https://denyspopov.atlassian.net/browse/AZ-943)
Epic: AZ-254 (E-C1)
Document Dependencies
- _docs/02_document/contracts/c1_vio/vio_strategy_protocol.md: the Protocol the strategy implements (AZ-331).
- _docs/02_document/components/01_c1_vio/description.md: § 5 implementation details (sliding-window K=10–20, per-frame cost), § 7 caveats (thermal throttle latency spikes).
- ADR-002 (KLT/RANSAC mandatory baseline): explains why this OKVIS2 wiring does NOT replace KLT/RANSAC; both ship.
- cpp/okvis2/upstream/: fully-populated v2 source tree the binding links against.
Problem
src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp is the AZ-332 skeleton:
- _build_estimator() (line ~251) sets estimator_built_ = false and does nothing else.
- _drive_estimator() (line ~261) throws OkvisFatalException("OKVIS2 estimator not yet wired — this binding is the AZ-332 skeleton; tier2 follow-up wires okvis::ThreadedKFVio") on first frame.
- Real OKVIS2 includes (#include <okvis/ThreadedKFVio.hpp> etc.) are commented out at lines ~48–50.
Without this wiring, Okvis2Strategy cannot produce any output — the Python facade is complete, the binding compiles and loads, but the first add_frame immediately raises. The production-default VIO is unusable.
API correction since AZ-332: OKVIS2 v2 upstream uses okvis::ThreadedSlam (NOT okvis::ThreadedKFVio as the AZ-332 spec referenced; that's the OKVIS v1 API). The wiring must follow v2 conventions:
okvis::ViParametersReader(yaml_path).getParameters(parameters);
auto estimator = std::make_unique<okvis::ThreadedSlam>(parameters, dBowDir);
estimator->setOptimisedGraphCallback([this](auto&& g, auto&& l, auto&& s) { ... });
Outcome
- Okvis2Strategy.add_frame(...) produces a real VioOutput (pose + 6×6 covariance + biases + tracking-quality counts) on every keyframe the OKVIS2 backend optimises, with no exceptions on the first frame.
- Okvis2Strategy.reset(...) tears down the C++ estimator and rebuilds it with the supplied seed pose/velocity/bias.
- Existing Python unit tests (tests/unit/c1_vio/test_okvis2_strategy.py) remain green against the unchanged fake-binding fixture (tests/unit/c1_vio/conftest.py).
- This ticket alone does NOT light up the Tier-1 or Tier-2 e2e path against real OKVIS2; that's AZ-944 / AZ-945. The Tier-1 unit suite stays the only green-bar evidence here.
Scope
Included
- Rewrite _build_estimator() to construct a real okvis::ThreadedSlam from yaml_config_ via okvis::ViParametersReader. The DBoW2 vocabulary directory comes from a CMake-defined preprocessor constant (vocab artifact provisioning is AZ-944's scope; this ticket only consumes the path).
- Rewrite _drive_estimator() to convert py::array_t<uint8_t> → cv::Mat (zero-copy preferred) and call estimator_->addImages(stamp, {0: cv_mat}). Returns true iff the optimised-graph callback fired for this frame's keyframe.
- Wire add_imu(ts_ns, accel, gyro) through estimator_->addImuMeasurement(stamp, alpha, omega). Keep the existing strict-monotonic guard on the binding side (line ~161).
- Implement the setOptimisedGraphCallback(...) lambda: fill latest_output_ under output_mtx_ with pose_T_world_body (Eigen::Matrix4d), pose_covariance_6x6 (extracted from the ViSlamBackend marginalised block; see Implementation Notes), accel_bias / gyro_bias, tracked / new / lost feature counts, mean_parallax, mre_px, emitted_at_ns.
- Map okvis::TrackingQuality → HealthState: Good → Tracking, Marginal → Degraded, Lost → Lost. Update state_ inside the callback before latest_output_ is filled.
- Rewrite reset() to release the existing estimator and reconstruct via _build_estimator(); apply the seed pose/velocity/bias to the new instance.
- Catch all OKVIS2 / Eigen / std::runtime_error inside the binding and rethrow as OkvisInitException (during construction), OkvisOptimizationException (during operation), or OkvisFatalException (irrecoverable). No raw exceptions cross into Python.
- Uncomment the OKVIS2 #include block (lines ~48–50) and verify the _build_estimator / _drive_estimator paths compile cleanly under BUILD_OKVIS2=ON on a developer machine that has the apt deps. CI green-bar is AZ-944, not this ticket.
Excluded
- CI apt deps and BUILD_OKVIS2=ON flip in Dockerfile.test.jetson / Linux runners: that's AZ-944's deliverable. This ticket leaves the CI build off; the C++ change rides as compile-clean only on hosts that already provision the deps (or after AZ-944 lands).
- Jetson L4T image build + Tier-2 Derkachi e2e (--vio-strategy okvis2): that's AZ-945's deliverable.
- DBoW2 small_voc artifact provisioning: sibling decision in AZ-944 (vendor in-tree vs. download-on-build vs. build-from-source). This ticket consumes whatever path the CMake constant resolves to.
- AZ-332 skeleton's surface decisions (exception types, latest_output_ struct fields, py::dict shape): settled by AZ-332. This ticket does not change them.
- Multi-camera support: single nav-camera per RESTRICT-UAV-3 / AZ-332.
- OKVIS2 upstream source modifications: pin is fixed per AZ-332 Plan-phase; deviations require an ADR. The covariance side-channel approach (Implementation Notes) is intentionally chosen to avoid upstream patching.
Acceptance Criteria
AC-1: Real estimator construction
Given yaml_config_ is a valid OKVIS2 v2 YAML config and the DBoW2 vocab path resolves
When _build_estimator() runs
Then it constructs an okvis::ThreadedSlam instance via okvis::ViParametersReader and stores it in estimator_ (no longer nullptr); estimator_built_ is true; no exception thrown.
AC-2: Frame ingestion drives the estimator
Given _drive_estimator() receives a py::array_t<uint8_t> of shape (H, W) (mono camera per RESTRICT-UAV-3) with a valid stamp_ns
When the function runs
Then it converts the array to cv::Mat (zero-copy preferred) and calls estimator_->addImages(stamp, {0: cv_mat}). Returns true iff the optimised-graph callback fired for this frame's keyframe within the configured timeout.
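The "returns true iff the callback fired within the configured timeout" part of AC-2 could be sketched with a condition variable, as below. CallbackLatch and its method names are hypothetical, and matching on the exact frame stamp is an assumption about how the binding would correlate a callback with the submitted frame; the sketch uses only the standard library and does not touch OKVIS2 or OpenCV.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical helper: lets the frame-ingestion path block until the
// optimised-graph callback reports the frame it just submitted, or until
// the timeout expires. Not part of the AZ-332 skeleton.
class CallbackLatch {
public:
    // Called from the callback thread with the optimised frame's stamp.
    void notify(std::int64_t stamp_ns) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            latest_stamp_ns_ = stamp_ns;
        }
        cv_.notify_all();
    }

    // Called from the ingestion thread. Returns true iff a callback carrying
    // exactly stamp_ns arrived before the timeout elapsed.
    bool wait_for(std::int64_t stamp_ns, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mtx_);
        return cv_.wait_for(lock, timeout, [this, stamp_ns] {
            return latest_stamp_ns_ == stamp_ns;
        });
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::int64_t latest_stamp_ns_ = -1;  // -1 means no callback seen yet
};
```

With this shape, _drive_estimator() would submit the frame, then return latch.wait_for(stamp_ns, timeout); the callback lambda would call latch.notify(...) after filling latest_output_.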
AC-3: IMU forwarding
Given add_imu(ts_ns, accel, gyro) is called with strictly-monotonic timestamps
When the function runs
Then it forwards (stamp, alpha, omega) to estimator_->addImuMeasurement(...). The existing strict-monotonic guard (binding-side, line ~161) is preserved.
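For reference, a strict-monotonic guard of the kind AC-3 preserves might look like the sketch below. ImuMonotonicGuard is a hypothetical name, and whether the real guard near line ~161 returns a flag or throws is not stated in this spec; the sketch assumes a boolean result.

```cpp
#include <cstdint>

// Hypothetical stand-in for the binding-side guard; not the AZ-332 code.
class ImuMonotonicGuard {
public:
    // Accepts ts_ns only when it is strictly newer than the last accepted
    // sample. A rejected sample must not be forwarded to the estimator.
    bool accept(std::int64_t ts_ns) {
        if (has_last_ && ts_ns <= last_ts_ns_) {
            return false;
        }
        last_ts_ns_ = ts_ns;
        has_last_ = true;
        return true;
    }

    // Forget history, e.g. when reset() rebuilds the estimator.
    void clear() { has_last_ = false; }

private:
    bool has_last_ = false;
    std::int64_t last_ts_ns_ = 0;
};
```

Equal timestamps are rejected as well as older ones, which is what "strictly monotonic" implies.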
AC-4: Optimised-graph callback fills latest_output_
Given estimator_->setOptimisedGraphCallback(...) is wired with the binding's lambda
When the OKVIS2 backend optimises a keyframe
Then latest_output_ is filled under output_mtx_ with: pose_T_world_body (Eigen::Matrix4d), pose_covariance_6x6, accel_bias, gyro_bias, tracked_count / new_count / lost_count, mean_parallax, mre_px, emitted_at_ns. The 6×6 covariance is extracted from the ViSlamBackend marginalised block (see Implementation Notes for approach).
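The "filled under output_mtx_" requirement amounts to a single-slot hand-off between the callback thread and the Python-facing reader. A minimal sketch follows; the field names come from AC-4, while the std::array types are stand-ins for the Eigen types and OutputSlot is a hypothetical wrapper, not the AZ-332 struct.

```cpp
#include <array>
#include <cstdint>
#include <mutex>

// Field names follow AC-4; plain arrays replace Eigen so the sketch compiles
// without third-party headers.
struct VioOutputSketch {
    std::array<double, 16> pose_T_world_body{};    // row-major 4x4
    std::array<double, 36> pose_covariance_6x6{};  // row-major 6x6
    std::array<double, 3> accel_bias{};
    std::array<double, 3> gyro_bias{};
    int tracked_count = 0;
    int new_count = 0;
    int lost_count = 0;
    double mean_parallax = 0.0;
    double mre_px = 0.0;
    std::int64_t emitted_at_ns = 0;
};

// Hypothetical wrapper around latest_output_ + output_mtx_.
class OutputSlot {
public:
    // Callback thread: replace the stored output while holding the mutex.
    void publish(const VioOutputSketch& out) {
        std::lock_guard<std::mutex> lock(output_mtx_);
        latest_output_ = out;
        has_output_ = true;
    }

    // Reader thread: copy the output while holding the mutex.
    // Returns false until the first publish().
    bool snapshot(VioOutputSketch& out) const {
        std::lock_guard<std::mutex> lock(output_mtx_);
        if (!has_output_) {
            return false;
        }
        out = latest_output_;
        return true;
    }

private:
    mutable std::mutex output_mtx_;
    VioOutputSketch latest_output_{};
    bool has_output_ = false;
};
```

Copying the whole struct under the lock keeps readers from ever seeing a half-written output.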
AC-5: Health-state mapping
Given okvis::TrackingQuality is one of {Good, Marginal, Lost}
When the callback fires
Then state_ updates to {Tracking, Degraded, Lost} respectively, BEFORE latest_output_ is filled, so a concurrent reader sees consistent state+output.
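The AC-5 mapping can be written as a total function over the enum. Both enums below are local stand-ins whose enumerator spellings come from this spec rather than from the OKVIS2 headers, and treating an out-of-range value as Lost is an assumption chosen to fail safe.

```cpp
// Stand-ins so the sketch compiles without OKVIS2; the real
// okvis::TrackingQuality may spell its enumerators differently.
enum class TrackingQuality { Good, Marginal, Lost };
enum class HealthState { Tracking, Degraded, Lost };

// Maps tracking quality to health state exactly as AC-5 lists it.
constexpr HealthState to_health_state(TrackingQuality q) {
    switch (q) {
        case TrackingQuality::Good:     return HealthState::Tracking;
        case TrackingQuality::Marginal: return HealthState::Degraded;
        case TrackingQuality::Lost:     return HealthState::Lost;
    }
    // Assumption: an unexpected value is reported as Lost, not Tracking.
    return HealthState::Lost;
}
```

Inside the callback, state_ would be assigned from to_health_state(...) before latest_output_ is written, matching the ordering AC-5 requires.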
AC-6: Reset rebuilds with seed
Given an active Okvis2Strategy with a built estimator
When reset(seed_pose, seed_velocity, seed_bias) is called
Then the existing estimator is released (C++ resources freed), _build_estimator() reconstructs a fresh instance, and the seed is applied via OKVIS2's setSeedFromPriors(...) (or equivalent) before the next add_frame.
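The release-then-rebuild ordering in AC-6 can be illustrated with a stub estimator that counts live instances. Everything named here (Seed, StubEstimator, BindingSketch, g_live_estimators) is hypothetical; the real seed call is whatever OKVIS2 exposes, which this spec leaves open as "setSeedFromPriors(...) (or equivalent)".

```cpp
#include <array>
#include <memory>

// Hypothetical seed bundle; the real reset() also carries a bias.
struct Seed {
    std::array<double, 3> position{};
    std::array<double, 3> velocity{};
};

int g_live_estimators = 0;  // instrumentation for this sketch only

// Stand-in for okvis::ThreadedSlam that only tracks its own lifetime.
struct StubEstimator {
    Seed seed{};
    bool seeded = false;
    StubEstimator() { ++g_live_estimators; }
    ~StubEstimator() { --g_live_estimators; }
    StubEstimator(const StubEstimator&) = delete;
    StubEstimator& operator=(const StubEstimator&) = delete;
};

class BindingSketch {
public:
    BindingSketch() { build(); }

    void reset(const Seed& seed) {
        estimator_.reset();  // release first so two estimators never coexist
        build();
        estimator_->seed = seed;  // seed is applied before the next frame
        estimator_->seeded = true;
    }

    const StubEstimator& estimator() const { return *estimator_; }

private:
    void build() { estimator_ = std::make_unique<StubEstimator>(); }
    std::unique_ptr<StubEstimator> estimator_;
};
```

Releasing before rebuilding matters if the real estimator owns threads or large buffers; assigning a new unique_ptr directly would construct the replacement while the old instance is still alive.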
AC-7: Exception translation
Given an OKVIS2-internal exception, an Eigen exception, or a std::runtime_error is raised inside the binding
When the binding catches it
Then it is rethrown as one of: OkvisInitException (if raised from _build_estimator), OkvisOptimizationException (if raised from _drive_estimator / add_imu), OkvisFatalException (if the backend signals irrecoverable failure). No raw C++ exception crosses the pybind11 boundary.
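One way to centralise the AC-7 translation is a small wrapper that every entry point calls. The exception types below are stand-ins for the ones AZ-332 settled (the real ones may carry more state), Phase and translate_exceptions are hypothetical names, and mapping a non-std exception to OkvisFatalException is an assumption about what counts as irrecoverable.

```cpp
#include <exception>
#include <stdexcept>
#include <utility>

// Stand-ins for the binding exception types settled by AZ-332.
struct OkvisInitException : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct OkvisOptimizationException : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct OkvisFatalException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

enum class Phase { Construction, Operation };

// Runs fn and converts anything it throws into a binding exception.
// Already-translated exceptions pass through untouched.
template <typename Fn>
auto translate_exceptions(Phase phase, Fn&& fn) -> decltype(fn()) {
    try {
        return std::forward<Fn>(fn)();
    } catch (const OkvisInitException&) {
        throw;
    } catch (const OkvisOptimizationException&) {
        throw;
    } catch (const OkvisFatalException&) {
        throw;
    } catch (const std::exception& e) {
        if (phase == Phase::Construction) {
            throw OkvisInitException(e.what());
        }
        throw OkvisOptimizationException(e.what());
    } catch (...) {
        // Assumption: anything not derived from std::exception is fatal.
        throw OkvisFatalException("non-std exception inside OKVIS2 binding");
    }
}
```

_build_estimator() would wrap its body with Phase::Construction, while _drive_estimator() and add_imu() would use Phase::Operation, so pybind11 only ever sees the three registered types.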
AC-8: Python unit tests stay green against the fake binding
Given the fake-binding fixture at tests/unit/c1_vio/conftest.py is unchanged
When pytest tests/unit/c1_vio/test_okvis2_strategy.py -v --tb=short runs (Tier-1)
Then all pre-existing unit tests pass with no behavioural change. The fake-binding contract is unchanged — only the real C++ side gets wired.
Implementation Notes
Headers needed
- okvis/ThreadedSlam.hpp: v2 SLAM front-end + back-end coordinator (replaces v1's ThreadedKFVio).
- okvis/ViParametersReader.hpp: YAML config loader.
- okvis/Estimator.hpp: back-end (needed for the covariance side-channel access).
- okvis/cameras/PinholeCamera.hpp: K-matrix → OKVIS camera-object conversion if the binding constructs cameras directly (otherwise the YAML carries them).
6×6 covariance extraction — the known unknown
The setOptimisedGraphCallback payload (ViGraph snapshot) does NOT carry the latest-pose covariance directly; covariance lives inside the Estimator's back-end. Two approaches:
- (a) Side-channel accessor (preferred for first cut): inside the callback, take a non-const handle to estimator_->backend() (or equivalent) and read the marginalised 6×6 block for the latest pose state. Keep the read protected by output_mtx_. If OKVIS2 v2 marks the back-end accessor private, fall back to subclassing ThreadedSlam and exposing a thin protected getter, still in our binding with no upstream change.
- (b) Tiny upstream patch: add a public latestPoseCovariance6x6() method to okvis::ViSlamBackend and submit upstream. Faster diff but requires a pin bump + ADR per AZ-332 Plan-phase. Defer to (b) only if (a) hits a hard private-field block.
Pick (a) for the first cut. If (a) requires a subclass-exposed getter, document the subclass in a code comment referencing this AC and AZ-943.
CMake link targets
cpp/okvis2/CMakeLists.txt already declares the link targets at lines ~64–73: okvis_ceres, okvis_frontend, okvis_multisensor_processing, okvis_kinematics, okvis_cv, okvis_common, okvis_time, okvis_util. The _drive_estimator function needs okvis_cv for the cv::Mat integration. No new targets to add — verify the linker pulls them in cleanly under BUILD_OKVIS2=ON.
pybind11 surface — DO NOT change
The pybind11 module shape (lines ~296–318) is correct and the Python facade unit tests confirm it. Do NOT alter the surface — add_frame, add_imu, reset, and the result struct fields stay byte-compatible with the fake binding. Only the C++ implementations behind those symbols change.
DBoW2 vocab path
Define a CMake preprocessor constant (e.g. OKVIS2_DBOW2_VOCAB_DIR) that points to a path the runtime can resolve. AZ-944 will populate this path with the small-vocabulary artifact (decision: vendor in-tree vs. download-on-build vs. build-from-source). For this ticket: declare the constant, consume it, and document the expected file layout (e.g. ${OKVIS2_DBOW2_VOCAB_DIR}/small_voc.yml.gz or similar) in a code comment referencing AZ-944.
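Consuming the constant on the C++ side might look like the sketch below. The fallback directory is a hypothetical placeholder that only keeps the sketch compilable when CMake has not defined the macro, the helper names are illustrative, and small_voc.yml.gz is the example file name from this note, not a confirmed layout.

```cpp
#include <string>

// OKVIS2_DBOW2_VOCAB_DIR is expected to be injected by CMake as a string
// literal. The fallback below is a hypothetical placeholder, not a real path
// decision; AZ-944 owns where the vocabulary actually lives.
#ifndef OKVIS2_DBOW2_VOCAB_DIR
#define OKVIS2_DBOW2_VOCAB_DIR "/opt/okvis2/vocab"
#endif

// Directory handed to the estimator constructor as dBowDir.
inline std::string dbow2_vocab_dir() {
    return OKVIS2_DBOW2_VOCAB_DIR;
}

// Expected vocabulary file inside that directory. The file name is
// illustrative ("small_voc.yml.gz or similar" per this note).
inline std::string dbow2_vocab_file() {
    return dbow2_vocab_dir() + "/small_voc.yml.gz";
}
```

Keeping the path behind one function gives _build_estimator() a single place to check that the directory resolves before constructing the estimator.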
Build verification
Compile-clean evidence on a host with apt deps installed (developer Mac with brew install ... equivalents OR a Linux dev VM with apt deps):
BUILD_OKVIS2=ON cmake -S . -B build && cmake --build build --target c1_vio_okvis2_native
Should produce the .so. Capture the build log in the batch report. The _native/__init__.py Python-side import test then confirms the symbol is loadable (without running OKVIS2 — just loading the shared object).
Constraints
- Pin: OKVIS2 v2 upstream pin from AZ-332 Plan-phase is fixed. Any deviation requires an ADR.
- No upstream patches unless approach (a) for covariance fails and is documented in a comment + retro entry.
- Single nav-camera per RESTRICT-UAV-3 — multi-camera ingestion is out of scope.
- No CI flip: this ticket leaves BUILD_OKVIS2=OFF in Dockerfile.test.jetson / Linux CI runners. AZ-944 owns the flip.
- Backward compatibility: Python facade fake-binding tests stay green with no fixture changes.
Unit Tests
| AC Ref | What to Test | Required Outcome |
|---|---|---|
| AC-1 | C++ unit (gtest): construct Okvis2Binding with a known-good YAML, assert estimator_built_ is true and no exception thrown | Pass on a host with apt deps installed |
| AC-2 | C++ unit: feed a synthetic cv::Mat via the C++ side, assert addImages is called once and the optimised-graph callback fires | Pass |
| AC-4 | C++ unit: drive a short EuRoC-like image+IMU sequence, assert latest_output_.pose_covariance_6x6 is non-zero finite SPD | Pass; eigvals all > 0 |
| AC-7 | C++ unit: feed a known-bad YAML; assert OkvisInitException propagates with non-empty what() | Pass |
| AC-8 | Python: pytest tests/unit/c1_vio/test_okvis2_strategy.py -v --tb=short | All pre-existing tests still pass (uses fake binding, no real OKVIS2) |
C++ unit tests live under cpp/okvis2/tests/ (or wherever the existing OKVIS2 test layout sits — confirm during implementation; if no harness exists, add a minimal one and document in the batch report).
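The AC-4 row's "non-zero finite SPD" assertion can be checked without an eigen-decomposition: a symmetric matrix is positive definite exactly when its Cholesky factorisation succeeds with strictly positive pivots. The sketch below does this on a plain 6×6 array; Matrix6 and is_finite_spd are hypothetical names and the symmetry tolerance is an assumed value. A real test in this repo would more likely run the same check through Eigen.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Matrix6 = std::array<std::array<double, 6>, 6>;

// True iff m is finite, symmetric within tol, and has a Cholesky
// factorisation with strictly positive pivots (all eigenvalues > 0).
bool is_finite_spd(const Matrix6& m, double tol = 1e-9) {
    for (std::size_t i = 0; i < 6; ++i) {
        for (std::size_t j = 0; j < 6; ++j) {
            if (!std::isfinite(m[i][j])) return false;
            if (std::abs(m[i][j] - m[j][i]) > tol) return false;
        }
    }
    Matrix6 l{};  // lower-triangular factor, zero-initialised
    for (std::size_t i = 0; i < 6; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            double sum = m[i][j];
            for (std::size_t k = 0; k < j; ++k) {
                sum -= l[i][k] * l[j][k];
            }
            if (i == j) {
                if (sum <= 0.0) return false;  // non-positive pivot
                l[i][i] = std::sqrt(sum);
            } else {
                l[i][j] = sum / l[j][j];
            }
        }
    }
    return true;
}
```

An all-zero covariance fails on the first pivot, so the same check also covers the "non-zero" part of the assertion.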
References
- Jira ticket: AZ-943 (parent split AZ-592)
- Sibling Jira tickets (Blocks chain AZ-943 → AZ-944 → AZ-945):
- AZ-944 (3pt, Linux CI build env + DBoW2 vocab artifact + Tier-1 EuRoC mini smoke)
- AZ-945 (3pt, Jetson L4T build + Tier-2 Derkachi --vio-strategy okvis2 e2e + perf baseline)
- AZ-332 spec (the skeleton this replaces): _docs/02_tasks/done/AZ-332_c1_okvis2_strategy.md
- ADR-002 (KLT/RANSAC mandatory baseline; OKVIS2 is the production-default architectural target)
- cpp/okvis2/upstream/ (v2 source tree)
- _docs/_autodev_state.md (resume context: Out-of-band bugfix cycle 94d2358 already committed; AZ-942 / AZ-923 parked; AZ-943→AZ-944→AZ-945 split rationale)