gps-denied-onboard/_docs/02_tasks/backlog/AZ-592_AZ-332_tier2_validation.md
Oleksandr Bezdieniezhnykh 6d51e06886 [AZ-589] [AZ-590] [AZ-591] [AZ-592] [AZ-593] Re-classify cycle1 gate findings
Cycle 1 Product Implementation Completeness Gate post-mortem.
AZ-589 + AZ-590 were the wrong abstraction:

- AZ-589 targeted `okvis::ThreadedKFVio` (OKVIS v1 API) which does
  not exist in the vendored OKVIS2 upstream; smartroboticslab/okvis2
  exposes `okvis::ThreadedSlam` instead.
- AZ-590 assumed a "de-ROSified VINS-Mono pin" submodule exists;
  `cpp/vins_mono/upstream/` has no `.gitmodules` entry.
- The actual production gap is the empty central
  `_STRATEGY_REGISTRY`: `register_strategy(...)` is never called
  outside test fixtures, so `compose_root()` raises
  `StrategyNotLinkedError` for every component slug with a
  strategy-selecting config field. Affects c1_vio + c2_vpr +
  c2_5_rerank + c3_matcher + c3_5_adhop + c4_pose + c5_state.

Re-classification:

- AZ-589 + AZ-590 closed Won't Fix (Jira); spec files removed
  from todo/ but rows retained in the dependencies table as
  audit-trail.
- AZ-591 created (todo/, 5pt) — cross-cutting compose_root
  per-binary bootstrap that populates `_STRATEGY_REGISTRY` for
  the airborne binary. Scheduled as Batch 66 sole task.
- AZ-592 created (backlog/, 5pt placeholder) — AZ-332 Tier-2
  validation bundle (real `okvis::ThreadedSlam` wiring + Linux CI
  apt-install + DBoW2 vocab + Jetson). BLOCKED on Tier-2
  prerequisites; honors AZ-332's `AZ-332_tier2_validation`
  self-deferral handle.
- AZ-593 created (backlog/, 5pt placeholder) — AZ-333 Tier-2
  validation bundle (de-ROSified VINS-Mono upstream + binding +
  CI + Jetson). BLOCKED on upstream vendoring decision plus
  Tier-2 prerequisites; honors AZ-333's parallel deferral pattern.
- AZ-332 + AZ-333 re-classified in cycle1 gate report from FAIL
  to BLOCKED-on-Tier-2.

Step 7 stays in_progress until AZ-591 lands; after that it can
advance to Step 8 with AZ-592 + AZ-593 parked in backlog/.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-16 12:45:58 +03:00


AZ-592 — AZ-332 Tier-2 validation: OKVIS2 ThreadedSlam wiring + CI build env + Jetson

Task: AZ-592_AZ-332_tier2_validation
Name: AZ-332 Tier-2 validation bundle (OKVIS2)
Description: Replace the AZ-332 _native/okvis2_binding.cpp skeleton with real okvis::ThreadedSlam wiring; add the Linux CI apt-install block and flip BUILD_OKVIS2=OFF to ON; package the DBoW2 vocabulary artifact; validate honest 6×6 covariance on real Jetson hardware against Derkachi-class fixtures.
Complexity: 5 points (placeholder; likely 8+ once Tier-2 work actually starts; re-size when scheduled)
Dependencies: AZ-332, AZ-276 (ImuPreintegrator), AZ-277 (SE3Utils), AZ-591 (compose_root per-binary bootstrap; must land first so the registered c1_vio:okvis2 slot is reachable)
Component: c1_vio (epic AZ-254 / E-C1)
Tracker: AZ-592
Epic: AZ-254 (E-C1)
Status: parked in backlog/; BLOCKED on Tier-2 prerequisites (see below)

Problem

AZ-332 shipped the Okvis2Strategy Python facade + Okvis2Backend skeleton C++ binding (which throws OkvisFatalException on first frame) and explicitly deferred the real estimator wiring to a Tier-2 follow-up. AZ-332's Implementation Notes line 82 named this follow-up AZ-332_tier2_validation and stated the gate would create it at cycle end.

The cycle-1 gate initially mis-classified AZ-332 as FAIL and created AZ-589_remediate_okvis2_threadedkfvio_wiring against the wrong OKVIS v1 API (ThreadedKFVio doesn't exist in OKVIS2). That ticket has been closed Won't Fix. This task replaces it with the correct scope and API.

Outcome

  1. API-correct C++ binding rewrite: rewrite _native/okvis2_binding.cpp against the actual OKVIS2 upstream API:

    • Headers: okvis/ThreadedSlam.hpp, okvis/ViParametersReader.hpp, okvis/Parameters.hpp, okvis/ViInterface.hpp.
    • Construct okvis::ThreadedSlam(parameters, dBowDir) after reading yaml_config_ via okvis::ViParametersReader(yaml).getParameters(parameters).
    • Subscribe to setOptimisedGraphCallback(...) with a lambda whose signature is void(const State&, const TrackingState&, std::shared_ptr<const AlignedMap<StateId, State>>, std::shared_ptr<const okvis::MapPointVector>). Fill latest_output_ under output_mtx_ from State::T_WS, v_W, b_g, b_a, omega_S, timestamp, isKeyframe; derive tracked_features + mean_parallax from TrackingState.
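    • Convert numpy uint8 frames to cv::Mat by re-using the existing py::array_t<uint8_t, c_style|forcecast> buffer view (no copy is made when the input array is already C-contiguous uint8; forcecast converts otherwise), then call addImages(okvis_time, images), where images maps camera index 0 to cv_mat. The {0: cv_mat} shorthand stands for that single-entry map; confirm the exact container type against okvis/ViInterface.hpp.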
    • Forward IMU via addImuMeasurement(okvis_time, Eigen::Vector3d(alpha), Eigen::Vector3d(omega)).
    • Map okvis::TrackingQuality (Good/Marginal/Lost) onto the binding's HealthState enum.
    • Reset: re-construct ThreadedSlam from the same parameters and re-subscribe the callback (OKVIS2 has no in-place reset).
  2. 6×6 covariance extraction: ViInterface does not expose the marginalisation block directly. Two options:

    • (a) Add a tiny upstream patch to ThreadedSlam exposing ViSlamBackend::computeCovariance(StateId); document the patch under cpp/okvis2/patches/.
    • (b) Best-effort proxy: emit a fixed-rank diagonal scaled by feature count / tracking-quality until the upstream patch lands. Mark the AC-1.4 covariance honesty test as xfail(strict=True) until option (a) is in.
  3. CMake glue: extend cpp/okvis2/CMakeLists.txt to link OpenCV (cv::Mat is used in the binding). Verify Eigen pin alignment with GTSAM + VINS-Mono (AZ-593).

  4. CI workflow: in .github/workflows/ci.yml, add apt install -y libceres-dev libbrisk-dev libdbow2-dev libsuitesparse-dev libgflags-dev libgoogle-glog-dev libopencv-dev libboost-filesystem-dev libatlas-base-dev libeigen3-dev to the Linux runner setup step. Flip -DBUILD_OKVIS2=OFF to -DBUILD_OKVIS2=ON for the airborne + research matrix kinds.

  5. DBoW2 vocab artifact: package small_voc.yml.gz next to the .so install location. Two options:

    • (a) Vendor inside the repo (small file, ~3MB — but ROS users typically download separately).
    • (b) Fetch at CI time via a pinned URL from an OKVIS2 release artifact mirror. The user decides between (a) and (b) at scheduling time.
  6. Tier-1 integration test: tests/integration/c1_vio/test_az332_okvis2_real_binding.py with @pytest.mark.skipif(not _okvis2_binding_present()). Sanity-check that the binding loads and processes a 60-frame EuRoC-class fixture without throwing; does NOT validate accuracy (Tier-2).

  7. Tier-2 Jetson validation (AC-9 of original AZ-332): run honest 6×6 covariance validation against Derkachi-class fixtures on real Jetson Orin. p95 ≤ 80 ms; p50 ≤ 25 ms per the original NFR-perf budget. Owned by AZ-444 (Tier-2 Jetson harness).
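
The latency budget in item 7 can be checked mechanically once timings are collected. The following is a minimal sketch, assuming latencies are recorded per frame in milliseconds as a plain list. The names `percentile` and `check_latency_budget` are hypothetical and not part of the AZ-444 harness, and the nearest-rank percentile definition is an assumption, since the NFR does not say which definition applies.

```python
P50_BUDGET_MS = 25.0  # p50 budget quoted in item 7
P95_BUDGET_MS = 80.0  # p95 budget quoted in item 7


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in [1, 100]."""
    if not samples:
        raise ValueError("no latency samples collected")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rank errors.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def check_latency_budget(latencies_ms):
    """Return (p50, p95, within_budget) for per-frame latencies in ms."""
    p50 = percentile(latencies_ms, 50)
    p95 = percentile(latencies_ms, 95)
    within = p50 <= P50_BUDGET_MS and p95 <= P95_BUDGET_MS
    return p50, p95, within
```

A harness could call `check_latency_budget(latencies)` at the end of a fixture run and fail the run when the third element is `False`.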

Prerequisites (BLOCKED on)

  • AZ-591 landed first: compose_root per-binary bootstrap so c1_vio:okvis2 is registered + reachable.
  • Linux CI runner image with apt deps: GitHub Actions ubuntu-latest has most deps but not libbrisk-dev / libdbow2-dev, so CI may need either a custom runner image or an apt install of the available dependencies plus source builds of BRISK and DBoW2.
  • Jetson hardware: for AC-9 honest-covariance validation against Derkachi-class fixtures.
  • DBoW2 vocab decision: vendor in-repo (option 5a) vs. fetch at CI time (option 5b).
  • Eigen pin alignment: confirm GTSAM + OKVIS2 use compatible Eigen versions; vendor Eigen under cpp/_third_party/eigen/ if not.
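
The Eigen pin alignment check could be scripted rather than done by eye. This is a rough sketch, assuming each vendored dependency ships Eigen's `Macros.h` (in Eigen 3.x releases it sits under `Eigen/src/Core/util/`, which should be verified for the pinned versions) and that the header defines `EIGEN_WORLD_VERSION`, `EIGEN_MAJOR_VERSION` and `EIGEN_MINOR_VERSION`. The helper names and the compatibility policy (world and major must match) are assumptions for illustration, not a project rule.

```python
import re

_MACROS = ("EIGEN_WORLD_VERSION", "EIGEN_MAJOR_VERSION", "EIGEN_MINOR_VERSION")


def eigen_version(macros_header_text):
    """Extract (world, major, minor) from the text of Eigen's Macros.h."""
    parts = []
    for name in _MACROS:
        pattern = r"^\s*#\s*define\s+" + name + r"\s+(\d+)"
        match = re.search(pattern, macros_header_text, re.MULTILINE)
        if match is None:
            raise ValueError(f"{name} not found in header text")
        parts.append(int(match.group(1)))
    return tuple(parts)


def pins_compatible(version_a, version_b):
    """Assumed policy: pins are compatible when world and major versions match."""
    return version_a[:2] == version_b[:2]
```

Reading the header text from each dependency's include path and passing both results to `pins_compatible` gives a yes/no answer that can gate the decision to vendor Eigen under cpp/_third_party/eigen/.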

Scope notes

  • This task as written exceeds the user's 5pt PBI complexity rule. It is filed as a placeholder. When Tier-2 work actually starts, split into:
    • AZ-592a — C++ binding rewrite + CMake (3pt; assumes CI dep install handled externally)
    • AZ-592b — Linux CI dep install + DBoW2 vocab artifact (2pt)
    • AZ-592c — Jetson hardware validation against Derkachi-class fixtures (5pt; runs IT-12 fixtures with covariance honesty assertions)
    • Plus the upstream-patch decision (cpp/okvis2/patches/expose_covariance.patch) as its own ADR addendum if needed.
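
Until the upstream-patch decision is made, Outcome option 2(b) describes a proxy covariance. The sketch below illustrates one possible shape of that proxy in Python for readability; the real implementation would live in the C++ binding. Every constant, the function name `proxy_covariance`, and the position-then-rotation ordering of the diagonal are hypothetical. Only the Good/Marginal/Lost labels come from this task. The result is not an honest covariance, which is why the AC-1.4 test stays xfail under this option.

```python
# Illustrative placeholders only; none of these values are calibrated.
BASE_POS_VAR = 0.05      # m^2, hypothetical baseline position variance
BASE_ROT_VAR = 0.01      # rad^2, hypothetical baseline rotation variance
NOMINAL_FEATURES = 200   # hypothetical feature count at which no inflation applies
QUALITY_SCALE = {"Good": 1.0, "Marginal": 4.0, "Lost": 100.0}


def proxy_covariance(tracked_features, tracking_quality):
    """Build a 6x6 diagonal proxy covariance as nested lists.

    Variance grows as the feature count drops below NOMINAL_FEATURES and
    as tracking quality degrades. Off-diagonal terms are always zero.
    """
    if tracking_quality not in QUALITY_SCALE:
        raise ValueError(f"unknown tracking quality: {tracking_quality!r}")
    feature_scale = max(NOMINAL_FEATURES / max(tracked_features, 1), 1.0)
    scale = feature_scale * QUALITY_SCALE[tracking_quality]
    diag = [BASE_POS_VAR * scale] * 3 + [BASE_ROT_VAR * scale] * 3
    return [[diag[i] if i == j else 0.0 for j in range(6)] for i in range(6)]
```

Keeping the proxy a pure function of feature count and tracking quality makes it easy to swap for the patched upstream covariance later without touching callers.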

Notes

  • Coordinate with AZ-593 (VINS-Mono Tier-2 sibling) on shared Eigen / Ceres pin work.
  • Upstream OKVIS2 README documents the apt deps explicitly; copy that list verbatim into the CI workflow comment.
  • The skeleton binding's OkvisFatalException("OKVIS2 estimator not yet wired — this binding is the AZ-332 skeleton") is the deliberate fail-loud surface. Replace it with the real ThreadedSlam calls; do NOT keep a fallback "estimator_built_ = false" branch.
  • The Implementation Notes (2026-05-12, batch 23) block in _docs/02_tasks/done/AZ-332_c1_okvis2_strategy.md documents the original deferral rationale. Keep it intact for audit; this task discharges that contract.