Per user 2026-05-29 directive: "OKVIS2-related tasks needed to be implemented after full e2e derkachi flight test would be finished successfully. So maybe put it back to todo?" Reasoning accepted. OKVIS2 chain is the planned NEXT phase after the cycle-4 Derkachi demo lands, not a cycle-5+ deferral. The 2026-05-27 production-default pivot directive remains in force; today's earlier "deferred to cycle-5+" framing was over-correction after the AZ-943 spec-reality gap.
- AZ-943 stays HARD-BLOCKED on AZ-951 + AZ-952 (PAUSED preamble preserved). Cannot be worked on until both blockers land. Moving to todo/ signals "queued, next-after-blockers", not "actionable now".
- AZ-951 + AZ-952 are themselves NOT blocked. They ship the upstream patches that unblock AZ-943.
Implementation sequence (unchanged): finish cycle-4 demo (AZ-959 + remaining CSV-replay path) → AZ-951 → AZ-952 → AZ-943 → AZ-944 → AZ-945. Current implement-batch target stays AZ-959; this commit is bookkeeping only, does not change what's next on deck.
Touches: 3 file moves (backlog/ → todo/), dep-table preamble fourth bump narrative documenting the placement reversal.
Co-authored-by: Cursor <cursoragent@cursor.com>
C1 OKVIS2 Binding — Real ThreadedSlam Wiring (AZ-592 split 1/3)
STATUS (2026-05-29): PAUSED — BLOCKED on AZ-951 + AZ-952.
Implementation attempt on 2026-05-29 confirmed AC-4 is structurally unreachable without upstream OKVIS2 patches:
- ThreadedSlam::estimator_ is private (not protected) → in-binding subclass workaround proposed in Implementation Notes "approach (a)" is impossible.
- ViSlamBackend has no public accessor for 6×6 pose covariance, feature counts, mean parallax, or MRE.
- TrackingState (callback arg) only carries id / isKeyframe / TrackingQuality enum / recognisedPlace / isFullGraphOptimising / currentKeyframeId: none of the AC-4 telemetry fields.
The "approach (b) upstream patch" fallback documented in this file + AZ-592 has been filed as two sibling tickets and linked as "is blocked by" against AZ-943:
- AZ-951 (3 SP): upstream patch — expose 6×6 pose covariance accessor (+ ADR for pin deviation).
- AZ-952 (3 SP): upstream patch — expose tracking-stats accessor (feature counts + parallax + MRE).
Jira AZ-943 reverted to To Do. This local file moved from todo/ → backlog/. The AC list + Implementation Notes below are PRESERVED unchanged for audit; once AZ-951 + AZ-952 land, AC-4 implementation will call backend().computeCovariance6x6(state.id) + backend().getLatestTrackingStats(state.id, ...) and the file moves back to todo/.
Audit reference: AZ-943 Jira comment "Implementation paused: spec gap discovered (2026-05-29)", with full root-cause + decision rationale.
Task: AZ-943_okvis2_threadedslam_binding
Name: OKVIS2 binding: replace AZ-332 skeleton with real okvis::ThreadedSlam wiring
Description: Sub-ticket 1 of 3 from the AZ-592 placeholder split (per state file 2026-05-27 split rationale). Replaces the AZ-332 skeleton in src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp (_build_estimator() no-op, _drive_estimator() raises OkvisFatalException) with the real okvis::ThreadedSlam v2 pipeline: ViParametersReader(yaml).getParameters(...) → ThreadedSlam(parameters, dBowDir) → setOptimisedGraphCallback(...). Without this wiring, Okvis2Strategy (AZ-332) is the production-default per architecture but throws on first add_frame — the production VIO is unusable. CI build env + Jetson validation are tracked in sibling tickets AZ-944 (3pt, Linux CI + DBoW2 vocab + Tier-1 smoke) and AZ-945 (3pt, Jetson L4T + Tier-2 Derkachi e2e); the Blocks chain in Jira is AZ-943 → AZ-944 → AZ-945. This ticket touches ONLY the C++ binding and the Python facade fake-binding fixture; it does NOT flip BUILD_OKVIS2=ON in CI (that's AZ-944's deliverable).
Complexity: 5 points
Dependencies: AZ-332 (the AZ-332 skeleton this replaces; in done/), AZ-592 (parent umbrella placeholder; in backlog/)
Component: c1_vio (epic AZ-254 / E-C1)
Tracker: AZ-943 (https://denyspopov.atlassian.net/browse/AZ-943)
Epic: AZ-254 (E-C1)
Document Dependencies
- _docs/02_document/contracts/c1_vio/vio_strategy_protocol.md: the Protocol the strategy implements (AZ-331).
- _docs/02_document/components/01_c1_vio/description.md: § 5 implementation details (sliding-window K=10–20, per-frame cost), § 7 caveats (thermal throttle latency spikes).
- ADR-002 (KLT/RANSAC mandatory baseline): explains why this OKVIS2 wiring does NOT replace KLT/RANSAC; both ship.
- cpp/okvis2/upstream/: fully-populated v2 source tree the binding links against.
Problem
src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp is the AZ-332 skeleton:
- _build_estimator() (line ~251) sets estimator_built_ = false and does nothing else.
- _drive_estimator() (line ~261) throws OkvisFatalException("OKVIS2 estimator not yet wired — this binding is the AZ-332 skeleton; tier2 follow-up wires okvis::ThreadedKFVio") on first frame.
- Real OKVIS2 includes (#include <okvis/ThreadedKFVio.hpp> etc.) are commented out at lines ~48–50.
Without this wiring, Okvis2Strategy cannot produce any output — the Python facade is complete, the binding compiles and loads, but the first add_frame immediately raises. The production-default VIO is unusable.
API correction since AZ-332: OKVIS2 v2 upstream uses okvis::ThreadedSlam (NOT okvis::ThreadedKFVio as the AZ-332 spec referenced; that's the OKVIS v1 API). The wiring must follow v2 conventions:
okvis::ViParametersReader(yaml_path).getParameters(parameters);
auto estimator = std::make_unique<okvis::ThreadedSlam>(parameters, dBowDir);
estimator->setOptimisedGraphCallback([this](auto&& g, auto&& l, auto&& s) { ... });
Outcome
- Okvis2Strategy.add_frame(...) produces a real VioOutput (pose + 6×6 covariance + biases + tracking-quality counts) on every keyframe the OKVIS2 backend optimises, with no exceptions on the first frame.
- Okvis2Strategy.reset(...) tears down the C++ estimator and rebuilds it with the supplied seed pose/velocity/bias.
- Existing Python unit tests (tests/unit/c1_vio/test_okvis2_strategy.py) remain green against the unchanged fake-binding fixture (tests/unit/c1_vio/conftest.py).
- This ticket alone does NOT light up the Tier-1 or Tier-2 e2e path against real OKVIS2; that's AZ-944 / AZ-945. Tier-1 unit suite stays the only green-bar evidence here.
Scope
Included
- Rewrite _build_estimator() to construct a real okvis::ThreadedSlam from yaml_config_ via okvis::ViParametersReader. The DBoW2 vocabulary directory comes from a CMake-defined preprocessor constant (vocab artifact provisioning is AZ-944's scope; this ticket only consumes the path).
- Rewrite _drive_estimator() to convert py::array_t<uint8_t> → cv::Mat (zero-copy preferred) and call estimator_->addImages(stamp, {0: cv_mat}). Returns true iff the optimised-graph callback fired for this frame's keyframe.
- Wire add_imu(ts_ns, accel, gyro) through estimator_->addImuMeasurement(stamp, alpha, omega). Keep the existing strict-monotonic guard on the binding side (line ~161).
- Implement the setOptimisedGraphCallback(...) lambda: fill latest_output_ under output_mtx_ with pose_T_world_body (Eigen::Matrix4d), pose_covariance_6x6 (extracted from ViSlamBackend marginalised block; see Implementation Notes), accel_bias / gyro_bias, tracked / new / lost feature counts, mean_parallax, mre_px, emitted_at_ns.
- Map okvis::TrackingQuality → HealthState: Good → Tracking, Marginal → Degraded, Lost → Lost. Update state_ inside the callback before latest_output_ is filled.
- Rewrite reset() to release the existing estimator and reconstruct via _build_estimator(); apply the seed pose/velocity/bias to the new instance.
- Catch all OKVIS2 / Eigen / std::runtime_error inside the binding and rethrow as OkvisInitException (during construction), OkvisOptimizationException (during operation), or OkvisFatalException (irrecoverable). No raw exceptions cross into Python.
- Uncomment the OKVIS2 #include block (lines ~48–50) and verify the _build_estimator / _drive_estimator paths compile cleanly under BUILD_OKVIS2=ON on a developer machine that has the apt deps. CI green-bar is AZ-944, not this ticket.
Excluded
- CI apt deps and BUILD_OKVIS2=ON flip in Dockerfile.test.jetson / Linux runners: that's AZ-944's deliverable. This ticket leaves the CI build off; the C++ change rides as compile-clean only on hosts that already provision the deps (or after AZ-944 lands).
- Jetson L4T image build + Tier-2 Derkachi e2e (--vio-strategy okvis2): that's AZ-945's deliverable.
- DBoW2 small_voc artifact provisioning: sibling decision in AZ-944 (vendor in-tree vs. download-on-build vs. build-from-source). This ticket consumes whatever path the CMake constant resolves to.
- AZ-332 skeleton's surface decisions (exception types, latest_output_ struct fields, py::dict shape): settled by AZ-332. This ticket does not change them.
- Multi-camera support: single nav-camera per RESTRICT-UAV-3 / AZ-332.
- OKVIS2 upstream source modifications: pin is fixed per AZ-332 Plan-phase; deviations require an ADR. The covariance side-channel approach (Implementation Notes) is intentionally chosen to avoid upstream patching.
Acceptance Criteria
AC-1: Real estimator construction
Given yaml_config_ is a valid OKVIS2 v2 YAML config and the DBoW2 vocab path resolves
When _build_estimator() runs
Then it constructs an okvis::ThreadedSlam instance via okvis::ViParametersReader and stores it in estimator_ (no longer nullptr); estimator_built_ is true; no exception thrown.
AC-2: Frame ingestion drives the estimator
Given _drive_estimator() receives a py::array_t<uint8_t> of shape (H, W) (mono camera per RESTRICT-UAV-3) with a valid stamp_ns
When the function runs
Then it converts the array to cv::Mat (zero-copy preferred) and calls estimator_->addImages(stamp, {0: cv_mat}). Returns true iff the optimised-graph callback fired for this frame's keyframe within the configured timeout.
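The "returns true iff the callback fired within the configured timeout" part of AC-2 could be realised with a condition variable shared between the frame-ingestion path and the optimised-graph callback. The following is a minimal sketch under that assumption; `KeyframeLatch`, `notify`, and `wait_for_stamp` are hypothetical names that exist in neither OKVIS2 nor the current binding.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical helper: remembers the newest timestamp reported by the
// optimised-graph callback and lets the ingestion path wait for it.
class KeyframeLatch {
public:
    // Intended to be called from the optimised-graph callback thread.
    void notify(std::int64_t stamp_ns) {
        assert(stamp_ns >= 0);  // timestamps are non-negative nanoseconds
        {
            std::lock_guard<std::mutex> lock(mtx_);
            if (stamp_ns > latest_stamp_ns_) latest_stamp_ns_ = stamp_ns;
        }
        cv_.notify_all();
    }

    // Intended to be called after the frame has been queued.
    // Returns true iff a callback for stamp_ns (or a newer stamp) arrived
    // before the timeout expired.
    bool wait_for_stamp(std::int64_t stamp_ns, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mtx_);
        return cv_.wait_for(lock, timeout,
                            [&] { return latest_stamp_ns_ >= stamp_ns; });
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::int64_t latest_stamp_ns_ = -1;  // -1 means "no callback seen yet"
};
```

The predicate overload of `wait_for` guards against spurious wake-ups, and a callback that lands before the wait starts is still observed because the latest stamp is stored rather than signalled once.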
AC-3: IMU forwarding
Given add_imu(ts_ns, accel, gyro) is called with strictly-monotonic timestamps
When the function runs
Then it forwards (stamp, alpha, omega) to estimator_->addImuMeasurement(...). The existing strict-monotonic guard (binding-side, line ~161) is preserved.
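For illustration, a strict-monotonic guard of the kind AC-3 asks to preserve might look like the sketch below. It is a stand-in written for this note, not the code at line ~161; the class name `MonotonicGuard` and its methods are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the binding-side guard. The real guard lives in
// okvis2_binding.cpp and may differ in shape and error handling.
class MonotonicGuard {
public:
    // Returns true and records ts_ns if it is strictly newer than the last
    // accepted timestamp; returns false and leaves the state unchanged if the
    // timestamp is equal to or older than the last accepted one.
    bool accept(std::int64_t ts_ns) {
        assert(ts_ns >= 0);  // timestamps are non-negative nanoseconds
        if (has_last_ && ts_ns <= last_ts_ns_) return false;
        last_ts_ns_ = ts_ns;
        has_last_ = true;
        return true;
    }

    // Forget the history, for example after the estimator is rebuilt.
    void clear() { has_last_ = false; }

private:
    std::int64_t last_ts_ns_ = 0;
    bool has_last_ = false;
};
```

In this sketch only samples for which `accept` returns true would be forwarded to the estimator; what the binding does with a rejected sample (drop, raise) is left open here.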
AC-4: Optimised-graph callback fills latest_output_
Given estimator_->setOptimisedGraphCallback(...) is wired with the binding's lambda
When the OKVIS2 backend optimises a keyframe
Then latest_output_ is filled under output_mtx_ with: pose_T_world_body (Eigen::Matrix4d), pose_covariance_6x6, accel_bias, gyro_bias, tracked_count / new_count / lost_count, mean_parallax, mre_px, emitted_at_ns. The 6×6 covariance is extracted from the ViSlamBackend marginalised block (see Implementation Notes for approach).
AC-5: Health-state mapping
Given okvis::TrackingQuality is one of {Good, Marginal, Lost}
When the callback fires
Then state_ updates to {Tracking, Degraded, Lost} respectively, BEFORE latest_output_ is filled, so a concurrent reader sees consistent state+output.
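The mapping and ordering in AC-5 can be sketched as follows. The enums below are stand-ins that mirror the names in this ticket (the real okvis::TrackingQuality and HealthState are defined elsewhere), and `OutputSlot` is a hypothetical holder. Writing state and output under a single lock is one way to give a concurrent reader a consistent pair; it is an assumption of this sketch, not a statement about the existing binding.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Stand-in enums; enumerator names follow the mapping listed in AC-5.
enum class TrackingQuality { Good, Marginal, Lost };
enum class HealthState { Tracking, Degraded, Lost };

inline HealthState to_health_state(TrackingQuality q) {
    switch (q) {
        case TrackingQuality::Good:     return HealthState::Tracking;
        case TrackingQuality::Marginal: return HealthState::Degraded;
        case TrackingQuality::Lost:     return HealthState::Lost;
    }
    return HealthState::Lost;  // unknown value: fail safe (assumption)
}

// Hypothetical snapshot holding only the fields needed for the illustration.
struct Published {
    HealthState state = HealthState::Lost;  // initial value is an assumption
    std::int64_t emitted_at_ns = 0;
};

class OutputSlot {
public:
    // Called from the callback: state is written first, then the output
    // fields, all while holding the same mutex.
    void publish(TrackingQuality q, std::int64_t emitted_at_ns) {
        assert(emitted_at_ns >= 0);
        std::lock_guard<std::mutex> lock(mtx_);
        value_.state = to_health_state(q);
        value_.emitted_at_ns = emitted_at_ns;
    }

    // Readers copy the pair under the lock, so they never see a new output
    // paired with a stale state.
    Published read() const {
        std::lock_guard<std::mutex> lock(mtx_);
        return value_;
    }

private:
    mutable std::mutex mtx_;
    Published value_;
};
```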
AC-6: Reset rebuilds with seed
Given an active Okvis2Strategy with a built estimator
When reset(seed_pose, seed_velocity, seed_bias) is called
Then the existing estimator is released (C++ resources freed), _build_estimator() reconstructs a fresh instance, and the seed is applied via OKVIS2's setSeedFromPriors(...) (or equivalent) before the next add_frame.
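The release-then-rebuild order in AC-6 can be illustrated with a stand-in estimator that counts live instances. Everything here is hypothetical scaffolding: `FakeEstimator` replaces okvis::ThreadedSlam, `Seed` bundles the three seed arguments, and `apply_seed` stands in for whatever seeding call OKVIS2 actually offers (the ticket names setSeedFromPriors(...) "or equivalent").

```cpp
#include <cassert>
#include <memory>

// Hypothetical seed bundle; the real reset() takes pose, velocity, and bias.
struct Seed {
    double position[3];
    double velocity[3];
    double accel_bias[3];
    double gyro_bias[3];
};

// Stand-in estimator that tracks how many instances are alive at once.
struct FakeEstimator {
    static int live;
    static int max_live;
    bool seeded = false;
    FakeEstimator() { if (++live > max_live) max_live = live; }
    ~FakeEstimator() { --live; }
    void apply_seed(const Seed&) { seeded = true; }  // hypothetical seeding hook
};
int FakeEstimator::live = 0;
int FakeEstimator::max_live = 0;

class BindingSketch {
public:
    void reset(const Seed& seed) {
        estimator_.reset();            // 1. release the old instance first
        estimator_built_ = false;
        assert(!estimator_);           // the old estimator must be gone before rebuilding
        estimator_ = std::make_unique<FakeEstimator>();  // 2. rebuild
        estimator_->apply_seed(seed);  // 3. seed before the next frame
        estimator_built_ = true;
    }

    bool ready() const {
        return estimator_built_ && estimator_ && estimator_->seeded;
    }

private:
    std::unique_ptr<FakeEstimator> estimator_;
    bool estimator_built_ = false;
};
```

Destroying the old instance before constructing the new one means two estimators never coexist in this sketch, which is why `max_live` stays at 1 across repeated resets.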
AC-7: Exception translation
Given an OKVIS2-internal exception, an Eigen exception, or a std::runtime_error is raised inside the binding
When the binding catches it
Then it is rethrown as one of: OkvisInitException (if raised from _build_estimator), OkvisOptimizationException (if raised from _drive_estimator / add_imu), OkvisFatalException (if the backend signals irrecoverable failure). No raw C++ exception crosses the pybind11 boundary.
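One way to keep the translation in AC-7 uniform is a small wrapper that each entry point calls with its own target exception type. The sketch below assumes the three exception types derive from std::runtime_error, which this ticket does not confirm; `translate_exceptions` is a hypothetical helper name.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in exception types named after the ones in AC-7. The real definitions
// come from the AZ-332 skeleton and may have a different base class.
struct OkvisInitException : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct OkvisOptimizationException : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct OkvisFatalException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical wrapper: runs fn and rethrows whatever it throws as Translated,
// prefixing the message with the name of the calling entry point.
template <typename Translated, typename Fn>
auto translate_exceptions(const char* where, Fn&& fn) -> decltype(fn()) {
    assert(where != nullptr);
    try {
        return fn();
    } catch (const OkvisFatalException&) {
        throw;  // already classified as irrecoverable, keep it as-is
    } catch (const std::exception& e) {
        throw Translated(std::string(where) + ": " + e.what());
    } catch (...) {
        throw Translated(std::string(where) + ": unknown C++ exception");
    }
}
```

Under this sketch, the body of _build_estimator() would run inside `translate_exceptions<OkvisInitException>(...)` and the bodies of _drive_estimator() and add_imu() inside `translate_exceptions<OkvisOptimizationException>(...)`. The final `catch (...)` covers anything that does not derive from std::exception.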
AC-8: Python unit tests stay green against the fake binding
Given the fake-binding fixture at tests/unit/c1_vio/conftest.py is unchanged
When pytest tests/unit/c1_vio/test_okvis2_strategy.py -v --tb=short runs (Tier-1)
Then all pre-existing unit tests pass with no behavioural change. The fake-binding contract is unchanged — only the real C++ side gets wired.
Implementation Notes
Headers needed
- okvis/ThreadedSlam.hpp: v2 SLAM front-end + back-end coordinator (replaces v1's ThreadedKFVio).
- okvis/ViParametersReader.hpp: YAML config loader.
- okvis/Estimator.hpp: back-end (needed for the covariance side-channel access).
- okvis/cameras/PinholeCamera.hpp: K-matrix → OKVIS camera-object conversion if the binding constructs cameras directly (otherwise the YAML carries them).
6×6 covariance extraction — the known unknown
The setOptimisedGraphCallback payload (ViGraph snapshot) does NOT carry the latest-pose covariance directly; covariance lives inside the Estimator's back-end. Two approaches:
- (a) Side-channel accessor (preferred for first cut): inside the callback, take a non-const handle to estimator_->backend() (or equivalent) and read the marginalised 6×6 block for the latest pose state. Keep the read protected by output_mtx_. If OKVIS2 v2 marks the back-end accessor private, fall back to subclassing ThreadedSlam and exposing a thin protected getter; this stays in our binding, with no upstream change.
- (b) Tiny upstream patch: add a public latestPoseCovariance6x6() method to okvis::ViSlamBackend and submit upstream. Faster diff but requires a pin bump + ADR per AZ-332 Plan-phase. Fall back to (b) only if (a) hits a hard private-field block.
Pick (a) for the first cut. If (a) requires a subclass-exposed getter, document the subclass in a code comment referencing this AC and AZ-943.
CMake link targets
cpp/okvis2/CMakeLists.txt already declares the link targets at lines ~64–73: okvis_ceres, okvis_frontend, okvis_multisensor_processing, okvis_kinematics, okvis_cv, okvis_common, okvis_time, okvis_util. The _drive_estimator function needs okvis_cv for the cv::Mat integration. No new targets to add — verify the linker pulls them in cleanly under BUILD_OKVIS2=ON.
pybind11 surface — DO NOT change
The pybind11 module shape (lines ~296–318) is correct and the Python facade unit tests confirm it. Do NOT alter the surface — add_frame, add_imu, reset, and the result struct fields stay byte-compatible with the fake binding. Only the C++ implementations behind those symbols change.
DBoW2 vocab path
Define a CMake preprocessor constant (e.g. OKVIS2_DBOW2_VOCAB_DIR) that points to a path the runtime can resolve. AZ-944 will populate this path with the small-vocabulary artifact (decision: vendor in-tree vs. download-on-build vs. build-from-source). For this ticket: declare the constant, consume it, and document the expected file layout (e.g. ${OKVIS2_DBOW2_VOCAB_DIR}/small_voc.yml.gz or similar) in a code comment referencing AZ-944.
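Consuming the constant on the C++ side could look like the sketch below. The fallback value and the helper `vocab_file_path` are hypothetical; the macro is expected to be injected by the build (for example through a CMake compile definition), and the file name is the example layout mentioned above, not a confirmed artifact name.

```cpp
#include <cassert>
#include <string>

// Normally injected by CMake. The fallback exists only so this sketch
// compiles on its own; the path is a made-up placeholder.
#ifndef OKVIS2_DBOW2_VOCAB_DIR
#define OKVIS2_DBOW2_VOCAB_DIR "/opt/okvis2/vocab"
#endif

// Expected layout (to be confirmed by AZ-944): <dir>/small_voc.yml.gz
// Joins directory and file name with exactly one separator.
inline std::string vocab_file_path(
    const std::string& dir = OKVIS2_DBOW2_VOCAB_DIR,
    const std::string& file = "small_voc.yml.gz") {
    assert(!dir.empty() && !file.empty());
    if (dir.back() == '/') return dir + file;
    return dir + "/" + file;
}
```

The directory itself is what gets passed as dBowDir to the ThreadedSlam constructor; the joined file path is only useful for a pre-flight existence check that turns a missing artifact into a clear OkvisInitException message.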
Build verification
Compile-clean evidence on a host with apt deps installed (developer Mac with brew install ... equivalents OR a Linux dev VM with apt deps):
BUILD_OKVIS2=ON cmake -S . -B build && cmake --build build --target c1_vio_okvis2_native
Should produce the .so. Capture the build log in the batch report. The _native/__init__.py Python-side import test then confirms the symbol is loadable (without running OKVIS2 — just loading the shared object).
Constraints
- Pin: OKVIS2 v2 upstream pin from AZ-332 Plan-phase is fixed. Any deviation requires an ADR.
- No upstream patches unless approach (a) for covariance fails and is documented in a comment + retro entry.
- Single nav-camera per RESTRICT-UAV-3 — multi-camera ingestion is out of scope.
- No CI flip: this ticket leaves BUILD_OKVIS2=OFF in Dockerfile.test.jetson / Linux CI runners. AZ-944 owns the flip.
- Backward compatibility: Python facade fake-binding tests stay green with no fixture changes.
Unit Tests
| AC Ref | What to Test | Required Outcome |
|---|---|---|
| AC-1 | C++ unit (gtest): construct Okvis2Binding with a known-good YAML, assert estimator_built_ is true and no exception thrown | Pass on a host with apt deps installed |
| AC-2 | C++ unit: feed a synthetic cv::Mat via the C++ side, assert addImages is called once and the optimised-graph callback fires | Pass |
| AC-4 | C++ unit: drive a short EuRoC-like image+IMU sequence, assert latest_output_.pose_covariance_6x6 is non-zero finite SPD | Pass; eigvals all > 0 |
| AC-7 | C++ unit: feed a known-bad YAML; assert OkvisInitException propagates with non-empty what() | Pass |
| AC-8 | Python: pytest tests/unit/c1_vio/test_okvis2_strategy.py -v --tb=short | All pre-existing tests still pass (uses fake binding, no real OKVIS2) |
C++ unit tests live under cpp/okvis2/tests/ (or wherever the existing OKVIS2 test layout sits — confirm during implementation; if no harness exists, add a minimal one and document in the batch report).
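For the AC-4 row, the "non-zero finite SPD" assertion can be checked without an eigen-decomposition: a symmetric matrix has all eigenvalues > 0 exactly when its Cholesky factorisation succeeds. The sketch below uses plain std::array so it stands alone; a real test would more likely operate on the Eigen type the binding already uses. `Mat6`, `is_finite_spd`, and the symmetry tolerance are assumptions of this sketch.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Mat6 = std::array<std::array<double, 6>, 6>;

// Returns true iff m is finite, symmetric within tol, and positive definite.
// Positive definiteness is tested by attempting m = L * L^T (Cholesky), which
// succeeds exactly when every pivot is strictly positive.
inline bool is_finite_spd(const Mat6& m, double tol = 1e-9) {
    assert(tol >= 0.0);
    for (std::size_t i = 0; i < 6; ++i) {
        for (std::size_t j = 0; j < 6; ++j) {
            if (!std::isfinite(m[i][j])) return false;
            if (std::fabs(m[i][j] - m[j][i]) > tol) return false;
        }
    }
    Mat6 l{};  // lower-triangular factor, zero-initialised
    for (std::size_t j = 0; j < 6; ++j) {
        double pivot = m[j][j];
        for (std::size_t k = 0; k < j; ++k) pivot -= l[j][k] * l[j][k];
        if (pivot <= 0.0) return false;  // not positive definite
        l[j][j] = std::sqrt(pivot);
        for (std::size_t i = j + 1; i < 6; ++i) {
            double s = m[i][j];
            for (std::size_t k = 0; k < j; ++k) s -= l[i][k] * l[j][k];
            l[i][j] = s / l[j][j];
        }
    }
    return true;
}
```

An all-zero matrix fails at the first pivot, so the "non-zero" part of the assertion is covered by the same check.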
References
- Jira ticket: AZ-943 (parent split AZ-592)
- Sibling Jira tickets (Blocks chain AZ-943 → AZ-944 → AZ-945):
  - AZ-944 (3pt, Linux CI build env + DBoW2 vocab artifact + Tier-1 EuRoC mini smoke)
  - AZ-945 (3pt, Jetson L4T build + Tier-2 Derkachi --vio-strategy okvis2 e2e + perf baseline)
- AZ-332 spec (the skeleton this replaces): _docs/02_tasks/done/AZ-332_c1_okvis2_strategy.md
- ADR-002 (KLT/RANSAC mandatory baseline; OKVIS2 is the production-default architectural target)
- cpp/okvis2/upstream/ (v2 source tree)
- _docs/_autodev_state.md (resume context: Out-of-band bugfix cycle 94d2358 already committed; AZ-942 / AZ-923 parked; AZ-943→AZ-944→AZ-945 split rationale)