[AZ-943] [AZ-951] [AZ-952] Pause AZ-943 on OKVIS2 telemetry gap

AZ-943 implementation attempt confirmed the C++ binding cannot satisfy
AC-4 without upstream OKVIS2 patches. The spec's "approach (a)
in-binding subclass workaround" is structurally impossible:
- ThreadedSlam::estimator_ is `private` (not `protected`)
- ViSlamBackend has no public covariance / counts / parallax / MRE
  accessor in the v2 upstream headers
- TrackingState carries only id / isKeyframe / TrackingQuality enum /
  recognisedPlace / isFullGraphOptimising / currentKeyframeId — none
  of the five tracking-stats fields the binding needs

Filed the spec-documented "approach (b)" fallback as two sibling
tickets, both linked Jira-side as `is blocked by` against AZ-943:

- AZ-951 (3 SP): upstream patch — expose 6x6 pose covariance accessor
  (+ ADR-XXX for the AZ-332 Plan-phase pin deviation)
- AZ-952 (3 SP): upstream patch — expose tracking-stats accessor
  (feature counts + parallax + MRE)

AZ-943 transitioned In Progress -> To Do in Jira, full audit comment
attached. Local AZ-943 spec moved todo/ -> backlog/ with PAUSED
preamble; original AC list preserved for the post-unblock turn.

Per user 2026-05-29 confirmation: cycle-4 Derkachi demo target stays
KLT/RANSAC (tests/e2e/replay/conftest.py line 159
c1_vio: strategy: klt_ransac), so AZ-951 + AZ-952 + AZ-943 chain is
correctly deferred. Pivoting next batch to AZ-897 (replay UI form).

Touches: _docs/02_tasks/_dependencies_table.md (preamble + table
rows for AZ-943 paused / AZ-951 / AZ-952 added; totals bumped to
142 product + 41 blackbox-test = 183, 448 product + 133 blackbox
= 581), _docs/_autodev_state.md (sub_step pivot to AZ-897).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-29 11:48:09 +03:00
parent 42b1db6ace
commit e8caa29da6
5 changed files with 161 additions and 7 deletions
@@ -0,0 +1,194 @@
# C1 OKVIS2 Binding — Real ThreadedSlam Wiring (AZ-592 split 1/3)
> **STATUS (2026-05-29): PAUSED — BLOCKED on AZ-951 + AZ-952.**
>
> Implementation attempt on 2026-05-29 confirmed AC-4 is structurally unreachable without upstream OKVIS2 patches:
>
> - `ThreadedSlam::estimator_` is `private` (not `protected`) → in-binding subclass workaround proposed in Implementation Notes "approach (a)" is impossible.
> - `ViSlamBackend` has no public accessor for 6×6 pose covariance, feature counts, mean parallax, or MRE.
> - `TrackingState` (callback arg) only carries id / isKeyframe / TrackingQuality enum / recognisedPlace / isFullGraphOptimising / currentKeyframeId — none of the AC-4 telemetry fields.
>
> The "approach (b) upstream patch" fallback documented in this file + AZ-592 has been filed as two sibling tickets and linked as `is blocked by` against AZ-943:
>
> - **AZ-951** (3 SP): upstream patch — expose 6×6 pose covariance accessor (+ ADR for pin deviation).
> - **AZ-952** (3 SP): upstream patch — expose tracking-stats accessor (feature counts + parallax + MRE).
>
> Jira AZ-943 reverted to To Do. This local file moved from `todo/` → `backlog/`. The AC list + Implementation Notes below are PRESERVED unchanged for audit; once AZ-951 + AZ-952 land, AC-4 implementation will call `backend().computeCovariance6x6(state.id)` + `backend().getLatestTrackingStats(state.id, ...)` and the file moves back to `todo/`.
>
> Audit reference: AZ-943 Jira comment "Implementation paused: spec gap discovered (2026-05-29)" — full root-cause + decision rationale.
**Task**: AZ-943_okvis2_threadedslam_binding
**Name**: OKVIS2 binding: replace AZ-332 skeleton with real `okvis::ThreadedSlam` wiring
**Description**: Sub-ticket 1 of 3 from the AZ-592 placeholder split (per state file 2026-05-27 split rationale). Replaces the AZ-332 skeleton in `src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp` (`_build_estimator()` no-op, `_drive_estimator()` raises `OkvisFatalException`) with the real `okvis::ThreadedSlam` v2 pipeline: `ViParametersReader(yaml).getParameters(...)` → `ThreadedSlam(parameters, dBowDir)` → `setOptimisedGraphCallback(...)`. Without this wiring, `Okvis2Strategy` (AZ-332) is the production-default per architecture but throws on first `add_frame` — the production VIO is unusable. CI build env + Jetson validation are tracked in sibling tickets AZ-944 (3pt, Linux CI + DBoW2 vocab + Tier-1 smoke) and AZ-945 (3pt, Jetson L4T + Tier-2 Derkachi e2e); the Blocks chain in Jira is AZ-943 → AZ-944 → AZ-945. This ticket touches ONLY the C++ binding and the Python facade fake-binding fixture; it does NOT flip `BUILD_OKVIS2=ON` in CI (that's AZ-944's deliverable).
**Complexity**: 5 points
**Dependencies**: AZ-332 (the AZ-332 skeleton this replaces; in `done/`), AZ-592 (parent umbrella placeholder; in `backlog/`)
**Component**: c1_vio (epic AZ-254 / E-C1)
**Tracker**: AZ-943 (https://denyspopov.atlassian.net/browse/AZ-943)
**Epic**: AZ-254 (E-C1)
### Document Dependencies
- `_docs/02_document/contracts/c1_vio/vio_strategy_protocol.md` — the Protocol the strategy implements (AZ-331).
- `_docs/02_document/components/01_c1_vio/description.md` — § 5 implementation details (sliding-window K=10–20, per-frame cost), § 7 caveats (thermal throttle latency spikes).
- ADR-002 (KLT/RANSAC mandatory baseline) — explains why this OKVIS2 wiring does NOT replace KLT/RANSAC; both ship.
- `cpp/okvis2/upstream/` — fully-populated v2 source tree the binding links against.
## Problem
`src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp` is the AZ-332 skeleton:
- `_build_estimator()` (line ~251) sets `estimator_built_ = false` and does nothing else.
- `_drive_estimator()` (line ~261) throws `OkvisFatalException("OKVIS2 estimator not yet wired — this binding is the AZ-332 skeleton; tier2 follow-up wires okvis::ThreadedKFVio")` on first frame.
- Real OKVIS2 includes (`#include <okvis/ThreadedKFVio.hpp>` etc.) are commented out at lines ~48–50.
Without this wiring, `Okvis2Strategy` cannot produce any output — the Python facade is complete, the binding compiles and loads, but the first `add_frame` immediately raises. The production-default VIO is unusable.
**API correction since AZ-332**: OKVIS2 v2 upstream uses `okvis::ThreadedSlam` (NOT `okvis::ThreadedKFVio` as the AZ-332 spec referenced; that's the OKVIS v1 API). The wiring must follow v2 conventions:
```
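okvis::ViParameters parameters;  // populated by the reader on the next line
okvis::ViParametersReader(yaml_path).getParameters(parameters);
auto estimator = std::make_unique<okvis::ThreadedSlam>(parameters, dBowDir);
// Callback body elided in this spec; it fills latest_output_ (see AC-4).
estimator->setOptimisedGraphCallback([this](auto&& g, auto&& l, auto&& s) { ... });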
```
## Outcome
- `Okvis2Strategy.add_frame(...)` produces a real `VioOutput` (pose + 6×6 covariance + biases + tracking-quality counts) on every keyframe the OKVIS2 backend optimises — no exceptions on the first frame.
- `Okvis2Strategy.reset(...)` tears down the C++ estimator and rebuilds it with the supplied seed pose/velocity/bias.
- Existing Python unit tests (`tests/unit/c1_vio/test_okvis2_strategy.py`) remain green against the unchanged fake-binding fixture (`tests/unit/c1_vio/conftest.py`).
- This ticket alone does NOT light up the Tier-1 or Tier-2 e2e path against real OKVIS2 — that's AZ-944 / AZ-945. Tier-1 unit suite stays the only green-bar evidence here.
## Scope
### Included
- Rewrite `_build_estimator()` to construct a real `okvis::ThreadedSlam` from `yaml_config_` via `okvis::ViParametersReader`. The DBoW2 vocabulary directory comes from a CMake-defined preprocessor constant (vocab artifact provisioning is AZ-944's scope; this ticket only consumes the path).
- Rewrite `_drive_estimator()` to convert `py::array_t<uint8_t>` → `cv::Mat` (zero-copy preferred) and call `estimator_->addImages(stamp, {0: cv_mat})`. Returns `true` iff the optimised-graph callback fired for this frame's keyframe.
- Wire `add_imu(ts_ns, accel, gyro)` through `estimator_->addImuMeasurement(stamp, alpha, omega)`. Keep the existing strict-monotonic guard on the binding side (line ~161).
- Implement the `setOptimisedGraphCallback(...)` lambda: fill `latest_output_` under `output_mtx_` with pose_T_world_body (Eigen::Matrix4d), pose_covariance_6x6 (extracted from `ViSlamBackend` marginalised block — see Implementation Notes), accel_bias / gyro_bias, tracked / new / lost feature counts, mean_parallax, mre_px, emitted_at_ns.
- Map `okvis::TrackingQuality` → `HealthState`: `Good` → `Tracking`, `Marginal` → `Degraded`, `Lost` → `Lost`. Update `state_` inside the callback before `latest_output_` is filled.
- Rewrite `reset()` to release the existing estimator and reconstruct via `_build_estimator()`; apply the seed pose/velocity/bias to the new instance.
- Catch all OKVIS2 / Eigen / `std::runtime_error` inside the binding and rethrow as `OkvisInitException` (during construction), `OkvisOptimizationException` (during operation), or `OkvisFatalException` (irrecoverable). No raw exceptions cross into Python.
- Uncomment the OKVIS2 `#include` block (lines ~48–50) and verify the `_build_estimator` / `_drive_estimator` paths compile cleanly under `BUILD_OKVIS2=ON` on a developer machine that has the apt deps. CI green-bar is AZ-944, not this ticket.
### Excluded
- **CI apt deps and `BUILD_OKVIS2=ON` flip in `Dockerfile.test.jetson` / Linux runners** — that's AZ-944's deliverable. This ticket leaves the CI build off; the C++ change rides as compile-clean only on hosts that already provision the deps (or after AZ-944 lands).
- **Jetson L4T image build + Tier-2 Derkachi e2e (`--vio-strategy okvis2`)** — that's AZ-945's deliverable.
- **DBoW2 small_voc artifact provisioning** — sibling decision in AZ-944 (vendor in-tree vs. download-on-build vs. build-from-source). This ticket consumes whatever path the CMake constant resolves to.
- **AZ-332 skeleton's surface decisions** — exception types, `latest_output_` struct fields, py::dict shape — settled by AZ-332. This ticket does not change them.
- **Multi-camera support** — single nav-camera per RESTRICT-UAV-3 / AZ-332.
- **OKVIS2 upstream source modifications** — pin is fixed per AZ-332 Plan-phase; deviations require an ADR. The covariance side-channel approach (Implementation Notes) is intentionally chosen to avoid upstream patching.
## Acceptance Criteria
**AC-1: Real estimator construction**
Given `yaml_config_` is a valid OKVIS2 v2 YAML config and the DBoW2 vocab path resolves
When `_build_estimator()` runs
Then it constructs an `okvis::ThreadedSlam` instance via `okvis::ViParametersReader` and stores it in `estimator_` (no longer `nullptr`); `estimator_built_` is `true`; no exception thrown.
**AC-2: Frame ingestion drives the estimator**
Given `_drive_estimator()` receives a `py::array_t<uint8_t>` of shape `(H, W)` (mono camera per RESTRICT-UAV-3) with a valid `stamp_ns`
When the function runs
Then it converts the array to `cv::Mat` (zero-copy preferred) and calls `estimator_->addImages(stamp, {0: cv_mat})`. Returns `true` iff the optimised-graph callback fired for this frame's keyframe within the configured timeout.
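One way to realise the "callback fired within the configured timeout" return value is a small latch shared between the callback thread and `_drive_estimator()`. The sketch below is illustrative only: `CallbackLatch`, `notify`, and `wait_for` are hypothetical names, not part of OKVIS2 or the AZ-332 skeleton, and it assumes a callback for a later stamp also counts as "fired" for an earlier frame.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical helper: remembers the newest stamp reported by the
// optimised-graph callback and lets the ingestion path wait for it.
class CallbackLatch {
 public:
  // Called from the optimised-graph callback thread.
  void notify(std::int64_t stamp_ns) {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      if (stamp_ns > latest_stamp_ns_) latest_stamp_ns_ = stamp_ns;
    }
    cv_.notify_all();
  }

  // Called from the frame-ingestion path. Returns true iff a callback for
  // `stamp_ns` (or a later stamp) arrived before `timeout` elapsed.
  bool wait_for(std::int64_t stamp_ns, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mtx_);
    return cv_.wait_for(lock, timeout,
                        [&] { return latest_stamp_ns_ >= stamp_ns; });
  }

 private:
  std::mutex mtx_;
  std::condition_variable cv_;
  std::int64_t latest_stamp_ns_ = -1;  // -1 means no callback seen yet
};
```

The predicate form of `wait_for` guards against spurious wake-ups and returns immediately when the callback has already fired before the wait starts.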
**AC-3: IMU forwarding**
Given `add_imu(ts_ns, accel, gyro)` is called with strictly-monotonic timestamps
When the function runs
Then it forwards `(stamp, alpha, omega)` to `estimator_->addImuMeasurement(...)`. The existing strict-monotonic guard (binding-side, line ~161) is preserved.
**AC-4: Optimised-graph callback fills `latest_output_`**
Given `estimator_->setOptimisedGraphCallback(...)` is wired with the binding's lambda
When the OKVIS2 backend optimises a keyframe
Then `latest_output_` is filled under `output_mtx_` with: `pose_T_world_body` (Eigen::Matrix4d), `pose_covariance_6x6`, `accel_bias`, `gyro_bias`, `tracked_count` / `new_count` / `lost_count`, `mean_parallax`, `mre_px`, `emitted_at_ns`. The 6×6 covariance is extracted from the `ViSlamBackend` marginalised block (see Implementation Notes for approach).
**AC-5: Health-state mapping**
Given `okvis::TrackingQuality` is one of `{Good, Marginal, Lost}`
When the callback fires
Then `state_` updates to `{Tracking, Degraded, Lost}` respectively, BEFORE `latest_output_` is filled, so a concurrent reader sees consistent state+output.
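A minimal sketch of the mapping and of one way to give readers a consistent state+output pair: update both under the same lock and copy both out under that lock. The enums and `VioSnapshot` are local stand-ins (the real `okvis::TrackingQuality`, `HealthState`, and `latest_output_` definitions are not reproduced here), `OutputSlot` is a hypothetical wrapper, and the initial `Lost` state is an assumption.

```cpp
#include <array>
#include <cstdint>
#include <mutex>
#include <utility>

// Local stand-ins for illustration; real spellings are not confirmed here.
enum class TrackingQuality { Good, Marginal, Lost };
enum class HealthState { Tracking, Degraded, Lost };

constexpr HealthState to_health_state(TrackingQuality q) {
  switch (q) {
    case TrackingQuality::Good:     return HealthState::Tracking;
    case TrackingQuality::Marginal: return HealthState::Degraded;
    case TrackingQuality::Lost:     return HealthState::Lost;
  }
  return HealthState::Lost;  // out-of-range values are treated as Lost
}

// Reduced stand-in for latest_output_ (the real struct has more fields).
struct VioSnapshot {
  std::int64_t emitted_at_ns = 0;
  std::array<double, 36> pose_covariance_6x6{};  // row-major 6x6
};

class OutputSlot {
 public:
  // Callback side: state first, then output, both under one lock.
  void publish(TrackingQuality q, const VioSnapshot& out) {
    std::lock_guard<std::mutex> lock(output_mtx_);
    state_ = to_health_state(q);
    latest_output_ = out;
  }

  // Reader side: the returned pair always comes from a single publish().
  std::pair<HealthState, VioSnapshot> read() const {
    std::lock_guard<std::mutex> lock(output_mtx_);
    return {state_, latest_output_};
  }

 private:
  mutable std::mutex output_mtx_;
  HealthState state_ = HealthState::Lost;  // assumed initial state
  VioSnapshot latest_output_;
};
```

If `state_` is instead read without `output_mtx_`, the ordering in this AC only narrows the inconsistency window; it does not close it.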
**AC-6: Reset rebuilds with seed**
Given an active `Okvis2Strategy` with a built estimator
When `reset(seed_pose, seed_velocity, seed_bias)` is called
Then the existing estimator is released (C++ resources freed), `_build_estimator()` reconstructs a fresh instance, and the seed is applied via OKVIS2's `setSeedFromPriors(...)` (or equivalent) before the next `add_frame`.
**AC-7: Exception translation**
Given an OKVIS2-internal exception, an Eigen exception, or a `std::runtime_error` is raised inside the binding
When the binding catches it
Then it is rethrown as one of: `OkvisInitException` (if raised from `_build_estimator`), `OkvisOptimizationException` (if raised from `_drive_estimator` / `add_imu`), `OkvisFatalException` (if the backend signals irrecoverable failure). No raw C++ exception crosses the pybind11 boundary.
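The translation rule can be sketched as a wrapper that each binding entry point runs its body through. This is an illustration under assumptions: the three exception types are redefined locally as `std::runtime_error` subclasses (the real definitions were settled by AZ-332 and may differ), and `translate_exceptions` is a hypothetical helper name.

```cpp
#include <exception>
#include <stdexcept>
#include <utility>

// Local stand-ins for the binding's exception types (names from this spec).
struct OkvisInitException : std::runtime_error {
  using std::runtime_error::runtime_error;
};
struct OkvisOptimizationException : std::runtime_error {
  using std::runtime_error::runtime_error;
};
struct OkvisFatalException : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Runs `fn` and rethrows anything that escapes as `Translated`, except an
// OkvisFatalException, which is already classified and passes through.
template <typename Translated, typename Fn>
auto translate_exceptions(Fn&& fn) -> decltype(fn()) {
  try {
    return std::forward<Fn>(fn)();
  } catch (const OkvisFatalException&) {
    throw;
  } catch (const std::exception& e) {
    throw Translated(e.what());
  } catch (...) {
    throw Translated("unknown non-std exception inside OKVIS2 binding");
  }
}
```

Under this sketch `_build_estimator` would wrap its body with `translate_exceptions<OkvisInitException>(...)`, and `_drive_estimator` / `add_imu` with `translate_exceptions<OkvisOptimizationException>(...)`.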
**AC-8: Python unit tests stay green against the fake binding**
Given the fake-binding fixture at `tests/unit/c1_vio/conftest.py` is unchanged
When `pytest tests/unit/c1_vio/test_okvis2_strategy.py -v --tb=short` runs (Tier-1)
Then all pre-existing unit tests pass with no behavioural change. The fake-binding contract is unchanged — only the real C++ side gets wired.
## Implementation Notes
### Headers needed
- `okvis/ThreadedSlam.hpp` — v2 SLAM front-end + back-end coordinator (replaces v1's `ThreadedKFVio`).
- `okvis/ViParametersReader.hpp` — YAML config loader.
- `okvis/Estimator.hpp` — back-end (needed for the covariance side-channel access).
- `okvis/cameras/PinholeCamera.hpp` — K-matrix → OKVIS camera-object conversion if the binding constructs cameras directly (otherwise the YAML carries them).
### 6×6 covariance extraction — the known unknown
The `setOptimisedGraphCallback` payload (`ViGraph` snapshot) does NOT carry the latest-pose covariance directly; covariance lives inside the `Estimator`'s back-end. Two approaches:
- **(a) Side-channel accessor** (preferred for first cut): inside the callback, take a non-const handle to `estimator_->backend()` (or equivalent) and read the marginalised 6×6 block for the latest pose state. Keep the read protected by `output_mtx_`. If OKVIS2 v2 marks the back-end accessor private, fall back to subclassing `ThreadedSlam` and exposing a thin protected getter — still in our binding, no upstream change.
- **(b) Tiny upstream patch**: add a public `latestPoseCovariance6x6()` method to `okvis::ViSlamBackend` and submit upstream. Faster diff but requires a pin bump + ADR per AZ-332 Plan-phase. Defer to (b) only if (a) hits a hard private-field block.
Pick (a) for the first cut. If (a) requires a subclass-exposed getter, document the subclass in a code comment referencing this AC and AZ-943.
### CMake link targets
`cpp/okvis2/CMakeLists.txt` already declares the link targets at lines ~64–73: `okvis_ceres`, `okvis_frontend`, `okvis_multisensor_processing`, `okvis_kinematics`, `okvis_cv`, `okvis_common`, `okvis_time`, `okvis_util`. The `_drive_estimator` function needs `okvis_cv` for the `cv::Mat` integration. No new targets to add — verify the linker pulls them in cleanly under `BUILD_OKVIS2=ON`.
### pybind11 surface — DO NOT change
The pybind11 module shape (lines ~296–318) is correct and the Python facade unit tests confirm it. Do NOT alter the surface — `add_frame`, `add_imu`, `reset`, and the result struct fields stay byte-compatible with the fake binding. Only the C++ implementations behind those symbols change.
### DBoW2 vocab path
Define a CMake preprocessor constant (e.g. `OKVIS2_DBOW2_VOCAB_DIR`) that points to a path the runtime can resolve. AZ-944 will populate this path with the small-vocabulary artifact (decision: vendor in-tree vs. download-on-build vs. build-from-source). For this ticket: declare the constant, consume it, and document the expected file layout (e.g. `${OKVIS2_DBOW2_VOCAB_DIR}/small_voc.yml.gz` or similar) in a code comment referencing AZ-944.
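A possible shape for the consuming side, as a sketch only. It assumes the constant expands to a string literal and that the vocabulary file is named `small_voc.yml.gz` (this spec says "or similar", so the real name is AZ-944's decision); `dbow2_vocab_file` and `dbow2_vocab_available` are hypothetical helpers.

```cpp
#include <filesystem>

// Normally injected by CMake; the empty fallback only exists so that this
// sketch compiles on its own.
#ifndef OKVIS2_DBOW2_VOCAB_DIR
#define OKVIS2_DBOW2_VOCAB_DIR ""
#endif

// Expected layout (assumed, see AZ-944): <vocab_dir>/small_voc.yml.gz
inline std::filesystem::path dbow2_vocab_file(
    const std::filesystem::path& vocab_dir = OKVIS2_DBOW2_VOCAB_DIR) {
  return vocab_dir / "small_voc.yml.gz";
}

// Lets _build_estimator() fail early with a clear message instead of
// handing a missing path to the estimator constructor.
inline bool dbow2_vocab_available(
    const std::filesystem::path& vocab_dir = OKVIS2_DBOW2_VOCAB_DIR) {
  return std::filesystem::is_regular_file(dbow2_vocab_file(vocab_dir));
}
```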
### Build verification
Compile-clean evidence on a host with apt deps installed (developer Mac with `brew install ...` equivalents OR a Linux dev VM with apt deps):
```
BUILD_OKVIS2=ON cmake -S . -B build && cmake --build build --target c1_vio_okvis2_native
```
Should produce the `.so`. Capture the build log in the batch report. The `_native/__init__.py` Python-side import test then confirms the symbol is loadable (without running OKVIS2 — just loading the shared object).
## Constraints
- **Pin**: OKVIS2 v2 upstream pin from AZ-332 Plan-phase is fixed. Any deviation requires an ADR.
- **No upstream patches** unless approach (a) for covariance fails and is documented in a comment + retro entry.
- **Single nav-camera** per RESTRICT-UAV-3 — multi-camera ingestion is out of scope.
- **No CI flip**: this ticket leaves `BUILD_OKVIS2=OFF` in `Dockerfile.test.jetson` / Linux CI runners. AZ-944 owns the flip.
- **Backward compatibility**: Python facade fake-binding tests stay green with no fixture changes.
## Unit Tests
| AC Ref | What to Test | Required Outcome |
|--------|--------------|------------------|
| AC-1 | C++ unit (gtest) — construct `Okvis2Binding` with a known-good YAML, assert `estimator_built_` is `true` and no exception thrown | Pass on a host with apt deps installed |
| AC-2 | C++ unit — feed a synthetic `cv::Mat` via the C++ side, assert `addImages` is called once and the optimised-graph callback fires | Pass |
| AC-4 | C++ unit — drive a short EuRoC-like image+IMU sequence, assert `latest_output_.pose_covariance_6x6` is non-zero finite SPD | Pass; eigvals all > 0 |
| AC-7 | C++ unit — feed a known-bad YAML; assert `OkvisInitException` propagates with non-empty `what()` | Pass |
| AC-8 | Python — `pytest tests/unit/c1_vio/test_okvis2_strategy.py -v --tb=short` | All pre-existing tests still pass (uses fake binding, no real OKVIS2) |
C++ unit tests live under `cpp/okvis2/tests/` (or wherever the existing OKVIS2 test layout sits — confirm during implementation; if no harness exists, add a minimal one and document in the batch report).
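The AC-4 outcome "non-zero finite SPD" can be checked without an eigen-decomposition: a symmetric matrix is positive definite exactly when its Cholesky factorisation succeeds with strictly positive pivots. The sketch below uses plain arrays so it stays dependency-free; the real test would more likely use Eigen (for example `Eigen::LLT`). `Mat6` and `is_finite_spd` are hypothetical names, and the symmetry tolerance is an assumed value.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Mat6 = std::array<std::array<double, 6>, 6>;

// True iff `m` is finite, symmetric within `tol`, and positive definite.
bool is_finite_spd(const Mat6& m, double tol = 1e-9) {
  for (std::size_t i = 0; i < 6; ++i) {
    for (std::size_t j = 0; j < 6; ++j) {
      if (!std::isfinite(m[i][j])) return false;
      if (std::fabs(m[i][j] - m[j][i]) > tol) return false;
    }
  }
  Mat6 L{};  // lower-triangular Cholesky factor, zero-initialised
  for (std::size_t j = 0; j < 6; ++j) {
    double d = m[j][j];
    for (std::size_t k = 0; k < j; ++k) d -= L[j][k] * L[j][k];
    if (!(d > 0.0)) return false;  // non-positive pivot: not positive definite
    L[j][j] = std::sqrt(d);
    for (std::size_t i = j + 1; i < 6; ++i) {
      double s = m[i][j];
      for (std::size_t k = 0; k < j; ++k) s -= L[i][k] * L[j][k];
      L[i][j] = s / L[j][j];
    }
  }
  return true;
}
```

An all-zero matrix fails at the first pivot, so the "non-zero" requirement is covered by the same check.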
## References
- Jira ticket: AZ-943 (parent split AZ-592)
- Sibling Jira tickets (Blocks chain AZ-943 → AZ-944 → AZ-945):
  - AZ-944 (3pt, Linux CI build env + DBoW2 vocab artifact + Tier-1 EuRoC mini smoke)
  - AZ-945 (3pt, Jetson L4T build + Tier-2 Derkachi `--vio-strategy okvis2` e2e + perf baseline)
- AZ-332 spec (the skeleton this replaces): `_docs/02_tasks/done/AZ-332_c1_okvis2_strategy.md`
- ADR-002 (KLT/RANSAC mandatory baseline; OKVIS2 is the production-default architectural target)
- `cpp/okvis2/upstream/` (v2 source tree)
- `_docs/_autodev_state.md` (resume context: Out-of-band bugfix cycle 94d2358 already committed; AZ-942 / AZ-923 parked; AZ-943→AZ-944→AZ-945 split rationale)
@@ -0,0 +1,65 @@
# OKVIS2 v2 upstream patch: expose 6×6 pose covariance accessor (+ ADR for pin deviation)
**Task**: AZ-951_okvis2_upstream_covariance_patch
**Name**: OKVIS2 v2 upstream patch — expose 6×6 pose covariance accessor (+ ADR for pin deviation)
**Description**: Land the documented "approach (b) upstream patch" escape hatch from AZ-592 (line 30) and AZ-943 (Implementation Notes "Tiny upstream patch"). AZ-943's implementation attempt on 2026-05-29 confirmed that the proposed "approach (a) in-binding subclass workaround" is structurally impossible: `ThreadedSlam::estimator_` is declared `private` (not `protected`), and `ViSlamBackend` has no public covariance accessor anywhere in the v2 upstream headers.
**Complexity**: 3 SP
**Dependencies**: AZ-332 (the AZ-332 Plan-phase pin this work deviates from), AZ-592 (parent placeholder that originally offered this approach as option (b) in its line 30-31)
**Component**: c1_vio (epic AZ-254 / E-C1); also touches `cpp/okvis2/` (upstream wrapper) and `_docs/03_implementation/architecture/decisions/` (ADR)
**Tracker**: AZ-951 (https://denyspopov.atlassian.net/browse/AZ-951)
**Parent Epic**: AZ-254 (E-C1)
**Blocks**: AZ-943 AC-4 (pose_covariance_6x6 field)
Jira AZ-951 is the authoritative spec; this file is the in-workspace mirror.
## Goal
Land the documented "approach (b) upstream patch" escape hatch. **Blocks AZ-943 AC-4** (`pose_covariance_6x6` field). The Python facade (`okvis2.py` `_build_vio_output`) shape-checks the 6×6 covariance; downstream EKF in C2 treats it as Kalman gain weight, so an identity placeholder would lie about VIO uncertainty (contradicts AZ-848 ESKF out-of-order analysis).
## Scope
1. **ADR-XXX (pin deviation rationale)** under `_docs/03_implementation/architecture/decisions/`:
- Title: "OKVIS2 v2 upstream patch — expose ViSlamBackend pose-covariance accessor for Okvis2Backend C++ binding"
- Decision: deviate from AZ-332 Plan-phase fixed pin to land a small, surgical, documented patch.
- Alternatives considered: (1) keep pin + ship placeholder covariance (violates meta-rule "Real Results, Not Simulated Ones"); (2) hard fork OKVIS2 (rejected — too much surface); (3) upstream the patch as a follow-up contribution to the OKVIS2 maintainers (recommended).
- Consequences: future upstream rebases must reapply; patch is small and self-contained to minimise that cost.
2. **Patch file** under `cpp/okvis2/patches/expose_covariance.patch`:
- Make `ThreadedSlam::estimator_` reachable from the binding: either add `public okvis::ViSlamBackend& backend()` to `ThreadedSlam` OR change `estimator_` from `private` to `protected`. Recommend the public accessor — cleaner API surface, less invasive.
- Add `Eigen::Matrix<double, 6, 6> ViSlamBackend::computeCovariance6x6(StateId id) const` — wraps `ceres::Covariance::Compute` over the `realtimeGraph_`'s ceres::Problem for the pose parameter block at `id`. Returns a documented failure-sentinel (identity * large scale + warning log) when the covariance computation is rank-deficient; binding then flags the output as low-confidence.
3. **CMake glue** in `cpp/okvis2/CMakeLists.txt`: apply the patch via the chosen mechanism (decided at scheduling — see Open Decisions).
4. **Verification path**: compile-clean evidence comes via AZ-944 (Linux CI BUILD_OKVIS2=ON flip). Local macOS gets no compile evidence (project policy).
## Acceptance Criteria
- **AC-1**: ADR-XXX exists under `_docs/03_implementation/architecture/decisions/`, follows the project's existing ADR template, and cites AZ-332 (Plan-phase pin), AZ-592 (parent placeholder), AZ-943 (the blocked binding ticket), and this ticket's Jira key.
- **AC-2**: `cpp/okvis2/patches/expose_covariance.patch` exists, applies cleanly to the vendored upstream at `cpp/okvis2/upstream/`, and the patch surface is ≤ 100 lines of diff (keeps future rebase cost low).
- **AC-3**: After patch application, `okvis::ThreadedSlam` has a public method that returns a reference / pointer to its `okvis::ViSlamBackend` member.
- **AC-4**: After patch application, `okvis::ViSlamBackend` has a public `Eigen::Matrix<double, 6, 6> computeCovariance6x6(StateId id) const` method backed by `ceres::Covariance::Compute`. Behaviour:
- Success: returns the 6×6 marginalised pose covariance.
- Failure (rank-deficient / non-converged / wrong state ID): returns a documented failure sentinel and emits a single warning log per occurrence — NO exception thrown into the binding (the binding layer decides whether to surface this as an OkvisOptimizationException).
- **AC-5**: Patch mechanism (in-place `git apply` vs vendored header overrides vs forked submodule) is chosen at scheduling and documented in the patch's commit message + ADR-XXX.
- **AC-6**: Local task spec for AZ-943 is updated to call `backend().computeCovariance6x6(state.id)` inside the `setOptimisedGraphCallback` lambda (the AZ-943 implementation, post-unblock).
## Open Decisions (resolve at scheduling)
1. **Patch application mechanism**: in-place `git apply` (simplest, but touches vendored source) vs vendored header overrides under `cpp/okvis2/include_overrides/` (most transparent in code review) vs forked submodule (heaviest, only if patch grows large). Default proposal: vendored header overrides.
2. **Covariance failure semantics**: silent identity sentinel + log (proposed default; binding flags output as low-confidence) vs raise an OKVIS2-side exception (then binding rethrows as `OkvisOptimizationException`).
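To make the sentinel option concrete, here is a sketch of what "identity * large scale" could look like and how the binding might recognise it. The scale `1e6` and the names `make_failure_sentinel` / `is_failure_sentinel` are assumptions for illustration; the actual value and API are part of the decision above.

```cpp
#include <array>
#include <cstddef>

using Mat6 = std::array<std::array<double, 6>, 6>;

constexpr double kSentinelScale = 1e6;  // hypothetical "very uncertain" scale

// Patch side: what the accessor would return when the covariance
// computation fails.
Mat6 make_failure_sentinel(double scale = kSentinelScale) {
  Mat6 m{};  // zero-initialised
  for (std::size_t i = 0; i < 6; ++i) m[i][i] = scale;
  return m;
}

// Binding side: exact comparison is safe because the sentinel is
// constructed, not computed.
bool is_failure_sentinel(const Mat6& cov, double scale = kSentinelScale) {
  for (std::size_t i = 0; i < 6; ++i) {
    for (std::size_t j = 0; j < 6; ++j) {
      if (cov[i][j] != (i == j ? scale : 0.0)) return false;
    }
  }
  return true;
}
```

One design consequence: a scaled identity is itself finite and SPD, so an SPD assertion alone cannot tell a sentinel from a real covariance. The binding would need an explicit check like the one above to flag the output as low-confidence.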
## Out of scope
- Tracking-stats telemetry (tracked/new/lost feature counts, mean_parallax, mre_px) — separate sibling ticket (AZ-952); this one is covariance-only because the two pieces have different upstream surface area and risk profiles.
- Submitting the patch to upstream OKVIS2 maintainers (tracked as a follow-up issue on the upstream GitHub mirror after this lands locally).
- The downstream AZ-943 binding update — owned by AZ-943, which is currently blocked-by this ticket.
## References
- AZ-943 implementation attempt (2026-05-29): proved "approach (a) subclass workaround" infeasible — `ThreadedSlam::estimator_` is `private`, `ViSlamBackend` has no public covariance accessor.
- AZ-592 line 30-31: offered this exact approach as fallback when (a) fails.
- AZ-943 Implementation Notes "Tiny upstream patch": defers to this approach explicitly.
- `cpp/okvis2/upstream/okvis_multisensor_processing/include/okvis/ThreadedSlam.hpp` line 254: `private: okvis::ViSlamBackend estimator_;`
- `cpp/okvis2/upstream/okvis_ceres/include/okvis/ViSlamBackend.hpp`: no public covariance accessor anywhere.
- meta-rule.mdc "Real Results, Not Simulated Ones" — the constraint that forces this path over a placeholder.
- Sibling ticket: AZ-952 (tracking-stats accessor).
@@ -0,0 +1,70 @@
# OKVIS2 v2 upstream patch: expose tracking-stats accessor (feature counts + parallax + MRE)
**Task**: AZ-952_okvis2_upstream_tracking_stats_patch
**Name**: OKVIS2 v2 upstream patch — expose tracking-stats accessor (feature counts + parallax + MRE)
**Description**: Sibling to AZ-951 (covariance + ADR). AZ-943's implementation attempt on 2026-05-29 confirmed that the five tracking-stats fields the `Okvis2Backend` C++ binding must fill have no source in OKVIS2 v2's public `setOptimisedGraphCallback` arg list. `TrackingState` (`okvis/ViInterface.hpp` lines 167-174) carries only `id`, `isKeyframe`, `trackingQuality` (enum: Good/Marginal/Lost), `recognisedPlace`, `isFullGraphOptimising`, `currentKeyframeId` — NONE of the five tracking-stats fields the binding needs.
**Complexity**: 3 SP
**Dependencies**: AZ-951 (SOFT — same ADR, same patch-mechanism decision; can land in either order, but combining patches is easier if scheduled together), AZ-332 (the AZ-332 Plan-phase pin this work deviates from alongside AZ-951), AZ-592 (parent placeholder)
**Component**: c1_vio (epic AZ-254 / E-C1); also touches `cpp/okvis2/` (upstream wrapper)
**Tracker**: AZ-952 (https://denyspopov.atlassian.net/browse/AZ-952)
**Parent Epic**: AZ-254 (E-C1)
**Blocks**: AZ-943 AC-4 (tracked/new/lost feature counts + mean_parallax + mre_px fields)
Jira AZ-952 is the authoritative spec; this file is the in-workspace mirror.
## Goal
**Blocks AZ-943 AC-4** (the five tracking-stats fields). The Python facade (`okvis2.py` `_build_vio_output`) consumes all five (line 393-399: `FeatureQuality(tracked=..., new=..., lost=..., mean_parallax=..., mre_px=...)`). The `tracked` field also feeds the `_classify_state(vio_output.feature_quality)` DEGRADED-state classifier (line 241) — placeholders would systematically misclassify health.
Five fields with no source in the public callback surface:
- `tracked_features` (int) — not in callback args; computed inside `okvis::Frontend` during matching.
- `new_features` (int) — same.
- `lost_features` (int) — same.
- `mean_parallax` (double, px) — not in callback args; computed inside `okvis::Frontend` keyframe selection.
- `mre_px` (double, mean reprojection error) — not in callback args; ceres optimisation byproduct on the realtimeGraph.
## Scope
1. **Patch file** under `cpp/okvis2/patches/expose_tracking_stats.patch` (or merge into a single combined patch with AZ-951's covariance patch — scheduler decides):
- Add `void ViSlamBackend::getLatestTrackingStats(StateId id, int& tracked, int& newCount, int& lost, double& meanParallaxPx, double& mreReprojectionPx) const` — reads from the relevant private members (`frontend_` / `realtimeGraph_` / `multiFrames_`) via a single batched accessor.
- `tracked` = count of landmark observations for state `id` that were also observed in the most recent prior keyframe.
- `newCount` = count of landmark observations for state `id` that were NOT observed in any prior frame.
- `lost` = count of landmarks observed in the prior keyframe but absent from `id`.
- `meanParallaxPx` = mean keypoint pixel displacement between `id` and the most recent prior keyframe, over the `tracked` matched set.
- `mreReprojectionPx` = mean per-observation reprojection residual from the realtimeGraph optimisation, over all observations attached to `id`.
2. **CMake glue**: same mechanism as AZ-951's covariance patch (vendored header overrides vs in-place git apply vs forked submodule — decided at AZ-951 scheduling). If AZ-951 lands first, this ticket reuses that mechanism; if scheduled in parallel, mechanism is decided together.
3. **Verification path**: compile-clean evidence comes via AZ-944 (Linux CI BUILD_OKVIS2=ON flip). Local macOS gets no compile evidence (project policy).
## Acceptance Criteria
- **AC-1**: `cpp/okvis2/patches/expose_tracking_stats.patch` (or the combined patch with AZ-951's content) exists, applies cleanly to the vendored upstream at `cpp/okvis2/upstream/`, and the total patch surface across this + AZ-951 stays ≤ 200 lines of diff (keeps future rebase cost low).
- **AC-2**: After patch application, `okvis::ViSlamBackend::getLatestTrackingStats(...)` is a public method that fills the five out-params with finite values for any valid `StateId` of a state that has been through realtimeGraph optimisation. For state IDs that have not yet been optimised, all five are set to documented sentinel values (zeros + warning log).
- **AC-3**: All five values are computed from the realtimeGraph's actual matched-observation set; no placeholders, no defaults. Code comment in the patch explains the derivation for each field.
- **AC-4**: ADR-XXX from AZ-951 is updated to cite this ticket alongside the covariance accessor work, so the deviation-from-pin rationale documents the FULL telemetry exposure surface.
- **AC-5**: Local task spec for AZ-943 is updated to call `backend().getLatestTrackingStats(state.id, ...)` inside the `setOptimisedGraphCallback` lambda (the AZ-943 implementation, post-unblock).
## Open Decisions (resolve at scheduling)
1. **Combined vs separate patch file**: ship as `expose_telemetry.patch` (one file covering both AZ-951's covariance + this ticket's tracking-stats) or as two separate `.patch` files. Default proposal: one combined patch with two logical commits inside it.
2. **Sentinel semantics for unoptimised states**: zeros + warning log (proposed default) vs raise OKVIS2-side exception (binding rethrows as `OkvisOptimizationException`).
3. **Parallax denominator edge case**: when `tracked == 0` (no matches at all), `meanParallaxPx` is undefined. Proposed default: emit NaN + warning log; binding then short-circuits to DEGRADED health state. Scheduler may choose 0.0 instead.
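The field definitions in Scope, together with the proposed NaN default for `tracked == 0`, can be sketched over plain containers. This is not the OKVIS2 data model: `Keypoints`, `TrackingStats`, and `compute_tracking_stats` are hypothetical, landmark observations are reduced to an id plus a pixel position, and `mreReprojectionPx` is left out because it depends on optimiser residuals.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <unordered_set>

struct Pixel { double u; double v; };
using Keypoints = std::unordered_map<std::uint64_t, Pixel>;  // landmark id -> pixel

struct TrackingStats {
  int tracked = 0;
  int new_count = 0;
  int lost = 0;
  // NaN until at least one match exists (the proposed default above).
  double mean_parallax_px = std::numeric_limits<double>::quiet_NaN();
};

// `current`        : observations attached to state `id`
// `prior_keyframe` : observations of the most recent prior keyframe
// `seen_before`    : ids observed in any prior frame (includes prior_keyframe)
TrackingStats compute_tracking_stats(
    const Keypoints& current, const Keypoints& prior_keyframe,
    const std::unordered_set<std::uint64_t>& seen_before) {
  TrackingStats s;
  double parallax_sum = 0.0;
  for (const auto& kv : current) {
    const auto it = prior_keyframe.find(kv.first);
    if (it != prior_keyframe.end()) {
      ++s.tracked;
      parallax_sum += std::hypot(kv.second.u - it->second.u,
                                 kv.second.v - it->second.v);
    } else if (seen_before.count(kv.first) == 0) {
      ++s.new_count;
    }
  }
  for (const auto& kv : prior_keyframe) {
    if (current.count(kv.first) == 0) ++s.lost;
  }
  if (s.tracked > 0) s.mean_parallax_px = parallax_sum / s.tracked;
  return s;
}
```

Note that a landmark re-observed after skipping the prior keyframe counts as neither `tracked` nor `new` under these definitions, so the counts need not sum to the size of `current`.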
## Out of scope
- 6×6 pose covariance accessor — covered by AZ-951 (sibling ticket; same ADR, same patch mechanism).
- The ADR creation itself — owned by AZ-951; this ticket extends the ADR's scope rather than creating a separate one.
- Submitting the patch to upstream OKVIS2 maintainers (tracked as a follow-up issue after both tickets land locally).
- The downstream AZ-943 binding update — owned by AZ-943, which is currently blocked-by this ticket.
## References
- AZ-943 implementation attempt (2026-05-29): proved the five tracking-stats fields have no source in OKVIS2 v2's public callback surface.
- AZ-592 line 24 (incorrect spec assumption): "derive tracked_features + mean_parallax from TrackingState" — superseded by this ticket; TrackingState does NOT carry these.
- `cpp/okvis2/upstream/okvis_common/include/okvis/ViInterface.hpp` lines 167-174: TrackingState definition (only 6 fields, none of them tracking-stats).
- `cpp/okvis2/upstream/okvis_ceres/include/okvis/ViSlamBackend.hpp`: no public tracking-stats accessor anywhere.
- AZ-848 (jetson_eskf_out_of_order_regression): downstream EKF assumes finite, computed VIO telemetry; placeholders would mislead its diagnostic surface.
- meta-rule.mdc "Real Results, Not Simulated Ones" — the constraint that forces this path over a placeholder.
- Sibling ticket: AZ-951 (covariance + ADR).