mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 16:11:13 +00:00
[AZ-589] [AZ-590] [AZ-591] [AZ-592] [AZ-593] Re-classify cycle1 gate findings
Cycle 1 Product Implementation Completeness Gate post-mortem. AZ-589 + AZ-590 were the wrong abstraction: - AZ-589 targeted `okvis::ThreadedKFVio` (OKVIS v1 API) which does not exist in the vendored OKVIS2 upstream; smartroboticslab/okvis2 exposes `okvis::ThreadedSlam` instead. - AZ-590 assumed a "de-ROSified VINS-Mono pin" submodule exists; `cpp/vins_mono/upstream/` has no `.gitmodules` entry. - The actual production gap is the empty central `_STRATEGY_REGISTRY`: `register_strategy(...)` is never called outside test fixtures, so `compose_root()` raises `StrategyNotLinkedError` for every component slug with a strategy-selecting config field. Affects c1_vio + c2_vpr + c2_5_rerank + c3_matcher + c3_5_adhop + c4_pose + c5_state. Re-classification: - AZ-589 + AZ-590 closed Won't Fix (Jira); spec files removed from todo/ but rows retained in the dependencies table as audit-trail. - AZ-591 created (todo/, 5pt) — cross-cutting compose_root per-binary bootstrap that populates `_STRATEGY_REGISTRY` for the airborne binary. Scheduled as Batch 66 sole task. - AZ-592 created (backlog/, 5pt placeholder) — AZ-332 Tier-2 validation bundle (real `okvis::ThreadedSlam` wiring + Linux CI apt-install + DBoW2 vocab + Jetson). BLOCKED on Tier-2 prerequisites; honors AZ-332's `AZ-332_tier2_validation` self-deferral handle. - AZ-593 created (backlog/, 5pt placeholder) — AZ-333 Tier-2 validation bundle (de-ROSified VINS-Mono upstream + binding + CI + Jetson). BLOCKED on upstream vendoring decision plus Tier-2 prerequisites; honors AZ-333's parallel deferral pattern. - AZ-332 + AZ-333 re-classified in cycle1 gate report from FAIL to BLOCKED-on-Tier-2. Step 7 stays in_progress until AZ-591 lands; after that it can advance to Step 8 with AZ-592 + AZ-593 parked in backlog/. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -1,106 +0,0 @@
|
||||
# Remediate AZ-332 — wire `okvis::ThreadedKFVio` inside the OKVIS2 pybind11 binding
|
||||
|
||||
**Task**: AZ-589_remediate_okvis2_threadedkfvio_wiring
|
||||
**Name**: AZ-332 ThreadedKFVio wiring (production-default VIO)
|
||||
**Description**: Replace the AZ-332 skeleton `_native/okvis2_binding.cpp` `_build_estimator()` / `_drive_estimator()` paths with the real `okvis::ThreadedKFVio` estimator wiring. Without this, the airborne deployment binary cannot process a single nav-camera frame (`Okvis2Backend.add_frame` throws `OkvisFatalException("OKVIS2 estimator not yet wired — this binding is the AZ-332 skeleton")` on the first call).
|
||||
**Complexity**: 5 points
|
||||
**Dependencies**: AZ-332, AZ-276 (ImuPreintegrator), AZ-277 (SE3Utils)
|
||||
**Component**: c1_vio (epic AZ-254 / E-C1)
|
||||
**Tracker**: AZ-589
|
||||
**Epic**: AZ-254 (E-C1)
|
||||
|
||||
## Problem
|
||||
|
||||
The Product Implementation Completeness Gate (cycle 1, 2026-05-16) classified AZ-332 as **FAIL**:
|
||||
|
||||
- `src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp` lines 251–272: `_build_estimator()` always sets `estimator_built_ = false`; `_drive_estimator()` throws `OkvisFatalException("Okvis2Backend: OKVIS2 estimator not yet wired ...")`.
|
||||
- The OKVIS2 upstream headers are still commented out (line 48 `// #include <okvis/ThreadedKFVio.hpp>`).
|
||||
- AZ-332's `Runtime Completeness` section forbids "a pre-built deterministic-fallback `VioOutput` while OKVIS2 is 'compiled out'"; the current binding violates that contract.
|
||||
|
||||
AZ-332's own `Implementation Notes (2026-05-12, batch 23)` flagged this for the gate and named the follow-up `AZ-332_tier2_validation`. This task discharges that contract.
|
||||
|
||||
Production-default VIO must operate before F3 (Steady-state per-frame estimation) can run end-to-end on the airborne binary.
|
||||
|
||||
## Outcome
|
||||
|
||||
- `_native/okvis2_binding.cpp::_build_estimator()` instantiates `okvis::ThreadedKFVio` from `yaml_config_`. The real OKVIS2 upstream headers (`okvis/ThreadedKFVio.hpp`, `okvis/Estimator.hpp`, `okvis/VioParametersReader.hpp`) are uncommented and linked.
|
||||
- Output callback attached to `ThreadedKFVio` fills `latest_output_` (under `output_mtx_`) with the real `EstimatorOutput`: SE(3) pose, marginalised 6×6 covariance, accel/gyro biases, feature counts, mean parallax, per-frame MRE.
|
||||
- `_drive_estimator(image)` forwards the frame to the real estimator, returns `true` on a new keyframe output and `false` otherwise. NO `throw` after first frame on the happy path.
|
||||
- `add_imu(ts_ns, accel, gyro)` pushes IMU into `ThreadedKFVio` via the real upstream API (no longer just caching `last_accel_` / `last_gyro_`).
|
||||
- `reset(...)` reinitialises the estimator from the seed pose / velocity / biases via the upstream reset surface; bias propagation goes through the AZ-276 substrate where the Python facade owns it.
|
||||
- CMake glue at `cpp/okvis2/CMakeLists.txt` is upgraded so `BUILD_OKVIS2=ON` actually links the upstream library (Ceres, Brisk, OpenGV, OKVIS2 itself); the Linux Tier-1 CI workflow comment `-DBUILD_OKVIS2=OFF` can be flipped to `ON` once the build succeeds in the GitHub Actions runner image.
|
||||
- The existing AC-1..AC-8 + AC-10 unit-test suite (`tests/unit/c1_vio/test_okvis2_strategy.py` + `conftest.py`'s `FakeOkvis2Backend`) continues to pass — the Python facade contract is unchanged.
|
||||
- A new Tier-1-runnable integration test exercises the real binding against a small fixture (`tests/integration/c1_vio/test_okvis2_real_binding.py`, marked `@pytest.mark.tier1_real_okvis2 + @pytest.mark.skipif(not _okvis2_binding_present())` so Tier-1 CI without OKVIS2 still passes).
|
||||
- The `runtime_root` per-binary bootstrap module registers the `okvis2` strategy via `register_strategy("c1_vio", "okvis2", ...)` so the deployment binary can actually resolve it (currently the registry is empty — see § Notes below).
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- `_native/okvis2_binding.cpp` wiring of `okvis::ThreadedKFVio` (`_build_estimator`, `_drive_estimator`, `add_imu`, `reset`, output callback).
|
||||
- `cpp/okvis2/CMakeLists.txt` upstream link + transitive Ceres/Brisk/OpenGV dependency declaration.
|
||||
- One Tier-1-runnable integration test against a tiny fixture.
|
||||
- A short `_docs/04_refactoring/runtime_bootstrap_register_okvis2/` note describing the per-binary bootstrap registration call site (the registry seam already exists in `runtime_root/__init__.py`; this task adds the actual `register_strategy(...)` call inside the deployment binary's bootstrap).
|
||||
|
||||
### Excluded
|
||||
|
||||
- AC-9 honest-covariance Tier-2 validation against Derkachi-class fixtures on real Jetson hardware — separate Tier-2 perf task (`tier2_AZ-332_honest_covariance_validation`).
|
||||
- Comparative-study (IT-12) test suite changes — owned by `AZ-444_tier2_jetson_harness` and downstream test tasks.
|
||||
- OKVIS2 upstream-source modifications — pinned per Plan-phase; deviations require a separate ADR.
|
||||
- VINS-Mono wiring — separate remediation task (`remediate_vins_mono_estimator_wiring`).
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: `_drive_estimator` returns without raising on a valid nav-camera frame**
|
||||
Given a real `Okvis2Backend` constructed with the test fixture's YAML + intrinsics
|
||||
When `add_frame("uuid-abc", ts_ns=…, image=…)` is called on a well-formed image
|
||||
Then the call returns a `bool` (true on keyframe output, false otherwise); no `OkvisFatalException` is raised on the "estimator not yet wired" path
|
||||
|
||||
**AC-2: Output callback populates `latest_output_` after the first keyframe**
|
||||
Given a sequence of N frames feeding a normal-segment fixture
|
||||
When `get_latest_output()` is called after a keyframe is emitted
|
||||
Then the returned dict contains `pose_T_world_body` (4×4 finite float64), `pose_covariance_6x6` (6×6 SPD float64), `accel_bias` + `gyro_bias` (length-3 finite float64), and integer feature counts — all values reflect the real estimator state, not seed zeros
|
||||
|
||||
**AC-3: Python facade unit tests stay green**
|
||||
Given the existing `Okvis2Strategy` AC-1..AC-8 + AC-10 unit tests (which use `FakeOkvis2Backend`)
|
||||
When `pytest tests/unit/c1_vio/test_okvis2_strategy.py` runs
|
||||
Then 100% pass (the facade contract is preserved)
|
||||
|
||||
**AC-4: `BUILD_OKVIS2=OFF` still produces an importable package**
|
||||
Given a build with `BUILD_OKVIS2=OFF`
|
||||
When `import gps_denied_onboard.components.c1_vio` runs
|
||||
Then the package imports without ever touching `_native.okvis2_binding`; `gps_denied_onboard.components.c1_vio.okvis2` is not auto-imported; the AZ-331 factory raises `StrategyNotAvailableError("okvis2", missing_flag="BUILD_OKVIS2")` if `okvis2` is requested
|
||||
|
||||
**AC-5: Composition-root strategy registration is live**
|
||||
Given the deployment binary's per-binary bootstrap module
|
||||
When `compose_root(config)` runs with `config.components["c1_vio"].strategy == "okvis2"`
|
||||
Then `register_strategy("c1_vio", "okvis2", build_okvis2_strategy, tier="airborne", depends_on=(…))` has been called at module import time; `_resolve_strategy` returns the registration without raising `StrategyNotLinkedError`
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Performance**
|
||||
- `_drive_estimator` first-call construction cost ≤ 5 s (one-time, swallowed by AC-NEW-1 boot budget).
|
||||
- Steady-state `_drive_estimator` per-frame cost ≤ 80 ms p95 on Tier-2 (AC-9 / NFR-perf validation is the separate Tier-2 task; this task only verifies the wiring is not introducing an obviously broken path).
|
||||
|
||||
**Reliability**
|
||||
- No raw OKVIS2 / Ceres / Eigen exceptions cross the pybind11 boundary; everything is caught and rewrapped into `OkvisInitException` / `OkvisOptimizationException` / `OkvisFatalException`.
|
||||
|
||||
## Constraints
|
||||
|
||||
- Per `ADR-002`, the `BUILD_OKVIS2=OFF` deployment binary must still build and pass all non-OKVIS2 tests after this change.
|
||||
- OKVIS2 upstream is pinned per Plan-phase; the build must use the pinned commit, not HEAD.
|
||||
- AZ-276 `ImuPreintegrator` remains a *separate* substrate for E-C5's fusion graph; OKVIS2's internal IMU integration is owned by `ThreadedKFVio`. The single-IMU-truth invariant operates at the IMU sample-stream level, NOT at the integrator-instance level (resolved in AZ-332 § Implementation Notes).
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1: OKVIS2 upstream build fails on the GitHub Actions Linux runner image**
|
||||
- *Risk*: Ceres / Brisk / OpenGV require specific system packages (Eigen ≥ 3.4, gflags, glog, suitesparse, …). The default `ubuntu-latest` image may be missing dependencies.
|
||||
- *Mitigation*: lift the `apt install` block from `cpp/okvis2/README` (or upstream's `INSTALL.md`) into `.github/workflows/ci.yml`'s "Setup C++ deps" step before invoking CMake. Document the dependency list in the task report.
|
||||
|
||||
**Risk 2: ABI conflict between OKVIS2's Ceres pin and the project's GTSAM pin**
|
||||
- *Risk*: GTSAM uses its own internal Ceres-less optimisation; OKVIS2 needs Ceres. If both libraries pull in conflicting Eigen versions, the airborne binary will segfault at link time.
|
||||
- *Mitigation*: vendor OKVIS2's third-party dependencies under `cpp/_third_party/okvis2/` with strict version pinning; verify `objdump -T` or equivalent that the Eigen symbol set matches between GTSAM-compiled and OKVIS2-compiled objects.
|
||||
|
||||
## Notes
|
||||
|
||||
This remediation also incidentally exposes a cross-cutting gap noted in the cycle-1 completeness report: `runtime_root/__init__.py` defines a strategy registry but no module currently calls `register_strategy(...)` — meaning even with this wiring complete, `compose_root` would still raise `StrategyNotLinkedError` on the deployment binary until a per-binary bootstrap module is added. AC-5 above ensures this task lands at least the `c1_vio` slot's registration; the broader cross-component registration sweep is tracked as a separate cross-cutting task (proposed `cross_cutting_per_binary_bootstrap`).
|
||||
@@ -1,104 +0,0 @@
|
||||
# Remediate AZ-333 — wire VINS-Mono estimator inside the pybind11 binding
|
||||
|
||||
**Task**: AZ-590_remediate_vins_mono_estimator_wiring
|
||||
**Name**: AZ-333 VINS-Mono estimator wiring (research / comparative VIO)
|
||||
**Description**: Replace the AZ-333 skeleton `_native/vins_mono_binding.cpp` `_build_estimator()` / `_drive_estimator()` paths with the real VINS-Mono estimator wiring. Without this, the `VinsMonoStrategy` cannot produce a single pose update at runtime — the binding throws `VinsMonoFatalException("VINS-Mono estimator not yet wired ... AZ-333 skeleton")` on the first frame.
|
||||
**Complexity**: 5 points
|
||||
**Dependencies**: AZ-333, AZ-276 (ImuPreintegrator), AZ-277 (SE3Utils)
|
||||
**Component**: c1_vio (epic AZ-254 / E-C1)
|
||||
**Tracker**: AZ-590
|
||||
**Epic**: AZ-254 (E-C1)
|
||||
|
||||
## Problem
|
||||
|
||||
The Product Implementation Completeness Gate (cycle 1, 2026-05-16) classified AZ-333 as **FAIL**:
|
||||
|
||||
- `src/gps_denied_onboard/components/c1_vio/_native/vins_mono_binding.cpp` exhibits the identical skeleton pattern as `okvis2_binding.cpp`: `_build_estimator()` sets `estimator_built_ = false`; `_drive_estimator()` throws `VinsMonoFatalException("VinsMonoBackend: VINS-Mono estimator not yet wired — this binding is the AZ-333 skeleton")`.
|
||||
- VINS-Mono upstream headers (`estimator/estimator.h`, `feature_tracker.h`) are commented out.
|
||||
- AZ-333's `Runtime Completeness` section explicitly requires "a real pybind11 binding around upstream VINS-Mono" and "real per-frame estimator update" — both currently unmet.
|
||||
|
||||
VINS-Mono is the **research / comparative VIO strategy** (`tier=research`), used by IT-12 comparative-study runs and the Tier-2 Jetson harness. Until wired, all OKVIS2-vs-VINS-Mono comparison runs would crash on the VINS-Mono branch.
|
||||
|
||||
## Outcome
|
||||
|
||||
- `_native/vins_mono_binding.cpp::_build_estimator()` instantiates the upstream `Estimator` and `FeatureTracker`, wired with the AZ-333 YAML config and intrinsics.
|
||||
- `_drive_estimator(image)` calls `feature_tracker_.readImage(image, ts)` followed by `estimator_.processMeasurements(...)`, returning `true` on a new keyframe output and `false` otherwise.
|
||||
- `add_imu(ts_ns, accel, gyro)` pushes IMU into the estimator's IMU queue (no longer just caching).
|
||||
- Output extraction reads the estimator's most recent `Vector<double, 7>` pose + bias state and the marginalisation block's covariance.
|
||||
- `reset(...)` reinitialises the estimator via VINS-Mono's `clearState() + setParameter()` surface from a seed pose / velocity / biases.
|
||||
- CMake glue at `cpp/vins_mono/CMakeLists.txt` is upgraded so `BUILD_VINS_MONO=ON` actually links upstream VINS-Mono (Ceres, OpenCV ≥ 4.2, Eigen ≥ 3.4); the Tier-2 perf workflow can flip `-DBUILD_VINS_MONO=OFF` to `ON`.
|
||||
- The existing AC-1..AC-9 unit-test suite (`tests/unit/c1_vio/test_vins_mono_strategy.py` + `conftest.py`'s `FakeVinsMonoBackend`) continues to pass — the Python facade contract is unchanged.
|
||||
- A new Tier-1-runnable integration test exercises the real binding against a small fixture, gated behind `@pytest.mark.tier1_real_vins_mono + @pytest.mark.skipif(not _vins_mono_binding_present())`.
|
||||
- The per-binary bootstrap module's strategy registry now includes a `register_strategy("c1_vio", "vins_mono", ...)` call for the research / comparative-study binary.
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- `_native/vins_mono_binding.cpp` wiring of the real `Estimator` + `FeatureTracker` (`_build_estimator`, `_drive_estimator`, `add_imu`, `reset`, pose / covariance extraction).
|
||||
- `cpp/vins_mono/CMakeLists.txt` upstream link + transitive Ceres / OpenCV / Eigen dependency declaration.
|
||||
- One Tier-1-runnable integration test against a tiny fixture.
|
||||
- `runtime_root` registration call site for the `vins_mono` strategy (added inside the research / comparative-study binary's bootstrap, NOT the airborne binary).
|
||||
|
||||
### Excluded
|
||||
|
||||
- AZ-444 Tier-2 Jetson comparative-study harness changes — that task owns the IT-12 run orchestration.
|
||||
- VINS-Mono upstream-source modifications — pinned per Plan-phase; deviations require a separate ADR.
|
||||
- OKVIS2 wiring — sibling remediation task (`remediate_okvis2_threadedkfvio_wiring`).
|
||||
- Replacing VINS-Mono with VINS-Fusion or VINS-RGBD — pinned upstream variant is monocular VINS-Mono.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: `_drive_estimator` returns without raising on a valid nav-camera frame**
|
||||
Given a real `VinsMonoBackend` constructed with the test fixture's YAML + intrinsics
|
||||
When `add_frame("uuid-abc", ts_ns=…, image=…)` is called on a well-formed image
|
||||
Then the call returns a `bool` (true on keyframe output, false otherwise); no `VinsMonoFatalException` is raised on the "estimator not yet wired" path
|
||||
|
||||
**AC-2: Output callback populates `latest_output_` after the first keyframe**
|
||||
Given a sequence of N frames feeding a normal-segment fixture
|
||||
When `get_latest_output()` is called after a keyframe is emitted
|
||||
Then the returned dict contains `pose_T_world_body` (4×4 finite float64), `pose_covariance_6x6` (6×6 SPD float64), `accel_bias` + `gyro_bias` (length-3 finite float64), and integer feature counts — all values reflect the real estimator state, not seed zeros
|
||||
|
||||
**AC-3: Python facade unit tests stay green**
|
||||
Given the existing `VinsMonoStrategy` AC-1..AC-9 unit tests (which use `FakeVinsMonoBackend`)
|
||||
When `pytest tests/unit/c1_vio/test_vins_mono_strategy.py` runs
|
||||
Then 100% pass (the facade contract is preserved)
|
||||
|
||||
**AC-4: `BUILD_VINS_MONO=OFF` still produces an importable package**
|
||||
Given a build with `BUILD_VINS_MONO=OFF`
|
||||
When `import gps_denied_onboard.components.c1_vio` runs
|
||||
Then the package imports without ever touching `_native.vins_mono_binding`; `gps_denied_onboard.components.c1_vio.vins_mono` is not auto-imported; the AZ-331 factory raises `StrategyNotAvailableError("vins_mono", missing_flag="BUILD_VINS_MONO")` if `vins_mono` is requested
|
||||
|
||||
**AC-5: Research binary's strategy registration is live**
|
||||
Given the comparative-study / research binary's per-binary bootstrap module
|
||||
When `compose_root(config)` runs with `config.components["c1_vio"].strategy == "vins_mono"`
|
||||
Then `register_strategy("c1_vio", "vins_mono", build_vins_mono_strategy, tier="research", depends_on=(…))` has been called at module import time; `_resolve_strategy` returns the registration without raising `StrategyNotLinkedError`
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Performance**
|
||||
- `_drive_estimator` first-call construction cost ≤ 5 s (one-time, swallowed by the boot budget).
|
||||
- Steady-state `_drive_estimator` per-frame cost ≤ 100 ms p95 on Tier-2 (NFR-perf validation lives in the Tier-2 Jetson harness; this task only verifies the wiring is not obviously broken).
|
||||
|
||||
**Reliability**
|
||||
- No raw VINS-Mono / Ceres / Eigen exceptions cross the pybind11 boundary; everything is caught and rewrapped into `VinsMonoInitException` / `VinsMonoOptimizationException` / `VinsMonoFatalException`.
|
||||
|
||||
## Constraints
|
||||
|
||||
- Per `ADR-002`, the `BUILD_VINS_MONO=OFF` deployment binary must still build and pass all non-VINS-Mono tests after this change.
|
||||
- VINS-Mono upstream is pinned per Plan-phase; build against the pinned commit, not HEAD.
|
||||
- The research binary's bootstrap must not auto-load on the airborne `BUILD_OKVIS2=ON, BUILD_VINS_MONO=OFF` deployment binary — that would defeat the build-flag isolation invariant.
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1: VINS-Mono's ROS-isms in its upstream source**
|
||||
- *Risk*: Upstream VINS-Mono carries ROS/Catkin assumptions (rosbag input, ROS-flavoured visualisation). These need to be either stripped or stubbed; otherwise the link step pulls in ROS, which the project intentionally does not depend on.
|
||||
- *Mitigation*: vendor a minimal-ROS-stub layer under `cpp/_third_party/vins_mono_ros_stub/` (already a documented pattern in the AZ-444 Plan notes); pre-process the upstream sources to compile against this stub.
|
||||
|
||||
**Risk 2: Eigen ABI conflict with GTSAM / OKVIS2**
|
||||
- *Risk*: VINS-Mono, OKVIS2, and GTSAM all transitively depend on Eigen; if their Eigen pins disagree, ABI breakage at link time.
|
||||
- *Mitigation*: enforce a single Eigen pin across `cpp/_third_party/` via the top-level `CMakeLists.txt`; verify the shared Eigen symbol set matches.
|
||||
|
||||
## Notes
|
||||
|
||||
Identical structural defect to AZ-332. Mitigations and gotchas overlap heavily, so the two remediation tasks should ideally be scheduled in the same batch (the CMake / dependency-management work shares a lot of surface area). The composition-root registry gap (`runtime_root/__init__.py::register_strategy(...)` is never called in `src/`) is also touched by both tasks; coordinate the per-binary bootstrap module so both remediations land their `register_strategy` call in the right binary (airborne for OKVIS2, research/comparative for VINS-Mono).
|
||||
@@ -0,0 +1,122 @@
|
||||
# AZ-591 — compose_root per-binary bootstrap: populate `_STRATEGY_REGISTRY`
|
||||
|
||||
**Task**: AZ-591_compose_root_per_binary_bootstrap
|
||||
**Name**: compose_root per-binary bootstrap (cross-cutting Tier-1)
|
||||
**Description**: Land `airborne_bootstrap.py` + `operator_bootstrap.py` modules under `runtime_root/` that call `register_strategy(...)` for every (component, strategy) pair their respective binary needs. Wire the airborne entrypoint `main()` to call `register_airborne_strategies()` before `compose_root(config)`. Without this, `compose_root()` raises `StrategyNotLinkedError` on the first component lookup and the binary cannot reach takeoff.
|
||||
**Complexity**: 5 points (cross-cutting; touches 7 component slots but each slot is a small factory wrapper)
|
||||
**Dependencies**: AZ-270 (compose_root surface), AZ-331 (c1_vio factory), AZ-339 (c2_vpr factory), AZ-352 (c2.5 factory), AZ-355 (c4_pose factory), AZ-380 (c5_state factory), AZ-345 (c3_matcher factory), AZ-368 (c3.5_adhop factory) — all already in `done/`.
|
||||
**Component**: runtime_root (cross-cutting)
|
||||
**Tracker**: AZ-591
|
||||
**Epic**: AZ-246 (E-CC-CONF — Cross-Cutting / Composition Root)
|
||||
|
||||
## Problem
|
||||
|
||||
The Product Implementation Completeness Gate cycle 1 (2026-05-16) initially classified AZ-332 (OKVIS2 skeleton binding) as `FAIL` and created the now-closed AZ-589 + AZ-590 remediation tasks. Investigation of those remediation tasks surfaced the actual production gap: it has nothing to do with OKVIS2 or VINS-Mono specifically.
|
||||
|
||||
**The central `_STRATEGY_REGISTRY` is dormant**:
|
||||
|
||||
- `src/gps_denied_onboard/runtime_root/__init__.py` defines `_STRATEGY_REGISTRY: dict[tuple[str, str], _Registration]` and the public `register_strategy(component_slug, strategy_name, factory, *, tier, depends_on)` API.
|
||||
- A workspace-wide `grep -nE 'register_strategy\s*\(' src/` returns **only the definition site** — no module under `src/` ever calls `register_strategy()`. The only call sites are inside `tests/unit/test_az270_compose_root.py` (test fixtures that mutate the registry per-test).
|
||||
- `compose_root(config)` calls `_compose()` which walks `config.components` and invokes `_resolve_strategy(slug, strategy_name, allowed_tiers)`. For any component slug whose config block declares a `strategy` field, `_resolve_strategy` looks up `(slug, strategy_name)` in `_STRATEGY_REGISTRY`. Since the registry is empty, it raises `StrategyNotLinkedError`.
|
||||
|
||||
**Affected component slots** (every component config block with a `strategy: str` field — confirmed via `rg 'strategy:\s*str' src/.../components/*/config.py`):
|
||||
|
||||
| Component | Default strategy | Available strategies | Tier(s) |
|
||||
|-----------|------------------|----------------------|---------|
|
||||
| `c1_vio` | `klt_ransac` | `okvis2`, `vins_mono`, `klt_ransac` | airborne |
|
||||
| `c2_vpr` | `net_vlad` | `net_vlad`, `ultra_vpr`, `mega_loc`, `mix_vpr`, `sela_vpr`, `eigen_places`, `salad` | airborne |
|
||||
| `c2_5_rerank` | `inlier_count` | `inlier_count` (single) | airborne |
|
||||
| `c3_matcher` | `disk_lightglue` | `disk_lightglue`, `aliked_lightglue` | airborne |
|
||||
| `c3_5_adhop` | `adhop` | `adhop` (single) | airborne |
|
||||
| `c4_pose` | `opencv_gtsam` | `opencv_gtsam` (single) | airborne |
|
||||
| `c5_state` | `gtsam_isam2` | `gtsam_isam2`, `eskf_baseline` | airborne |
|
||||
|
||||
(Components without a `strategy` field — `c6_tile_cache`, `c7_inference`, `c8_fc_adapter`, `c11_tile_manager`, `c12_operator_orchestrator`, `c13_fdr` — use direct factories that `compose_root` consumes from `pre_constructed`, NOT the registry path. They are NOT in scope for this task.)
|
||||
|
||||
## Outcome
|
||||
|
||||
- `src/gps_denied_onboard/runtime_root/airborne_bootstrap.py` exists and exposes `register_airborne_strategies() -> None`. The function calls `register_strategy(...)` for every (component, strategy) pair in the 7-row table above, with `tier="airborne"`. Each registered factory is a small wrapper that adapts the existing per-component factory (`vio_factory.build_vio_strategy`, `vpr_factory.build_vpr_strategy`, etc.) to the `(config, constructed)` registry-factory signature.
|
||||
- `src/gps_denied_onboard/runtime_root/operator_bootstrap.py` exists and exposes `register_operator_strategies() -> None`. Registers the operator-binary slots (`c10_provisioning`, `c11_tile_manager`, `c12_operator_orchestrator` — these DON'T have a `strategy: str` field today so the operator binary's `compose_operator` flow is already OK; this module is a placeholder for symmetry + future-proofing).
|
||||
- The airborne entrypoint `runtime_root/__init__.py::main()` calls `register_airborne_strategies()` immediately BEFORE the first `compose_root(config)` call. Wired idempotently: re-invoking `main()` (e.g. in tests) does not raise on the second `register_strategy(...)` call because the registration is equal to the existing entry.
|
||||
- The wrapper factories declare `depends_on=(...)` such that `_topo_order()` produces a sensible construction order: dependencies that already exist in the per-component factory signatures (e.g. `c1_vio` needs `fdr_client` from `c13_fdr`) are surfaced as `depends_on` edges OR pulled from the `constructed` dict if `c13_fdr` is in `pre_constructed`. Whichever path matches the production assembly.
|
||||
- New unit tests `tests/unit/runtime_root/test_az591_airborne_bootstrap.py` verify:
|
||||
- AC-1: `register_airborne_strategies()` populates the registry with the 7 component slots (one per non-test strategy registered).
|
||||
- AC-2: `compose_root(config)` against a config that selects `c1_vio.strategy="klt_ransac"` + every other component's default strategy completes without raising `StrategyNotLinkedError`.
|
||||
- AC-3: `register_airborne_strategies()` is idempotent — calling it twice in the same process does not raise.
|
||||
- AC-4: A config that selects a strategy not registered (e.g. `c2_vpr.strategy="not_a_strategy"`) raises `StrategyNotLinkedError` with the available-strategies list populated.
|
||||
- AC-5: The `tier="airborne"` filter excludes operator-only registrations from airborne lookups (verified by calling `compose_operator(config)` on the airborne registrations and confirming `StrategyNotLinkedError`).
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- `runtime_root/airborne_bootstrap.py` (new) — `register_airborne_strategies()` + per-component wrapper factories.
|
||||
- `runtime_root/operator_bootstrap.py` (new, minimal) — placeholder for the operator entrypoint's future registry needs; today only `clear_pose_registry` / `clear_state_registry` style cleanup is needed.
|
||||
- `runtime_root/__init__.py::main()` modification: insert `register_airborne_strategies()` call before `compose_root(config)`.
|
||||
- `tests/unit/runtime_root/test_az591_airborne_bootstrap.py` (new) — AC-1..AC-5 suite.
|
||||
|
||||
### Excluded
|
||||
|
||||
- C++ binding work for OKVIS2 (`AZ-592`) and VINS-Mono (`AZ-593`) — these Tier-2 tasks are parked in `backlog/` until their hardware + CI prerequisites are provisioned. The bootstrap registers the c1_vio:okvis2 + c1_vio:vins_mono slots so the registry seam is correct, but the strategy factory still raises `StrategyNotAvailableError` at construction time when `BUILD_OKVIS2=OFF` (existing behaviour from `vio_factory.py`, unchanged).
|
||||
- Refactoring the per-component factory signatures from `(config, fdr_client=...)` to `(config, constructed)` — instead, the bootstrap's wrapper factories adapt one signature to the other. The per-component factories are stable surfaces and should not change shape inside this task.
|
||||
- Operator binary strategy registrations beyond the placeholder — the operator binary's actual strategy use is handled by direct factories today (`build_flights_api_client`, etc.) which compose_operator already consumes correctly.
|
||||
- Replay-branch additions — `compose_root`'s replay path uses `pre_constructed`, which is orthogonal to the registry-driven path this task fixes.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: Bootstrap populates the airborne registry with 7 component slots**
|
||||
Given a fresh process where `_STRATEGY_REGISTRY` is empty
|
||||
When `register_airborne_strategies()` is called
|
||||
Then `list_registered_strategies("c1_vio")` returns `["klt_ransac", "okvis2", "vins_mono"]` (sorted); same exhaustive list for c2_vpr / c2_5_rerank / c3_matcher / c3_5_adhop / c4_pose / c5_state; every registered factory carries `tier="airborne"`.
|
||||
|
||||
**AC-2: compose_root reaches takeoff with default strategies + klt_ransac**
|
||||
Given `register_airborne_strategies()` has been called
|
||||
And a config that selects `c1_vio.strategy="klt_ransac"`, `c2_vpr.strategy="net_vlad"`, `c3_matcher.strategy="disk_lightglue"`, `c4_pose.strategy="opencv_gtsam"`, `c5_state.strategy="gtsam_isam2"` (i.e. defaults)
|
||||
When `compose_root(config)` runs (with required env populated)
|
||||
Then it returns a `RuntimeRoot` whose `components` dict contains all 7 registered slots; no `StrategyNotLinkedError` is raised.
|
||||
|
||||
**AC-3: Idempotent registration**
|
||||
Given `register_airborne_strategies()` has been called once
|
||||
When it is called a second time in the same process
|
||||
Then no exception is raised; the registry retains the same 14+ entries (call-2 is a no-op due to equal `_Registration` records).
|
||||
|
||||
**AC-4: Unknown strategy in config still raises with useful message**
|
||||
Given `register_airborne_strategies()` has been called
|
||||
And a config selects `c2_vpr.strategy="not_a_real_strategy"`
|
||||
When `compose_root(config)` runs
|
||||
Then `StrategyNotLinkedError` is raised with `strategy_name="not_a_real_strategy"`, `component_slug="c2_vpr"`, `available_strategies` including `"net_vlad"` etc., and `reason="not linked"`.
|
||||
|
||||
**AC-5: Tier isolation prevents airborne registrations from leaking into compose_operator**
|
||||
Given `register_airborne_strategies()` has been called (no operator registrations)
|
||||
When `compose_operator(config)` runs against the same config
|
||||
Then it raises `StrategyNotLinkedError` for each airborne-tier registration with `reason` mentioning the tier mismatch; no airborne strategy is constructed by the operator binary path.
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Performance**
|
||||
- `register_airborne_strategies()` cost ≤ 50 ms on cold import (it's effectively 14 dict inserts + their dependency-resolution).
|
||||
|
||||
**Reliability**
|
||||
- No raw `RuntimeError` from the registry path should reach the operator — every failure mode passes through `StrategyNotLinkedError` with the contextual fields populated (already true of the existing surface).
|
||||
|
||||
## Constraints
|
||||
|
||||
- The wrapper factories MUST use the existing per-component factories. NEVER duplicate the BUILD_* flag gating logic inside the bootstrap — `vio_factory.build_vio_strategy` already does that for c1_vio, and similarly for each component.
|
||||
- AZ-507 cross-component import rule: `runtime_root/airborne_bootstrap.py` is the composition root, so it MAY import from any component's Public API. NEVER reach into a component's internal modules; always go through the per-component factory.
|
||||
- The `depends_on` declarations MUST be consistent with the per-component factory signatures. Document any inferred ordering in the wrapper factory's docstring.
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1: Per-component factory signatures don't match `(config, constructed)`**
|
||||
- *Risk*: `build_vio_strategy(config, *, fdr_client)` takes `fdr_client` as a kwarg, not from a `constructed` dict. Adapting requires the wrapper to read `constructed["c13_fdr"]` and pass it as `fdr_client=...`. But `c13_fdr` is constructed by the takeoff path (`take_off()`), NOT by `compose_root`'s registry path. So the wrapper's `constructed` may not contain `c13_fdr` at call time.
|
||||
- *Mitigation*: For c1_vio specifically, the existing `take_off()` flow passes `fdr_client` separately via `other_components_factory(config, writer, fc_adapter)`. The bootstrap's wrapper for c1_vio should match this — it expects `constructed` to contain `c13_fdr`, raises a clear error if not, and the airborne entrypoint orchestrates `take_off()` to populate `constructed["c13_fdr"]` before calling `compose_root`. Document the call-order invariant in `airborne_bootstrap.py`.
|
||||
|
||||
**Risk 2: Compose-root construction order doesn't match the live takeoff path**
|
||||
- *Risk*: `_topo_order` runs Kahn's algorithm over the `depends_on` graph; the production `take_off()` runs a specific ordered sequence (writer → flight header → fc_adapter → other components). Disagreement between these two orderings can produce subtle bugs.
|
||||
- *Mitigation*: For now, the airborne bootstrap registers ONLY the 7 strategy-selecting component slots. The `take_off()` / `_replay_branch` flows continue to own c13_fdr / c8_fc_adapter / c6_tile_cache / c7_inference / replay components via their existing direct factories. The `pre_constructed` mechanism lets the registry-driven `_compose` see them already-built. Document this explicitly in the bootstrap module docstring.
|
||||
|
||||
## Notes
|
||||
|
||||
- This task does NOT validate end-to-end on the airborne binary because that requires a real Jetson + nav-camera + FC. It validates that `compose_root()` returns a `RuntimeRoot` without raising — the unit-test gate. End-to-end binary validation lives in the Tier-2 Jetson harness (AZ-444).
|
||||
- After this task lands, the cycle-1 completeness gate report at `_docs/03_implementation/implementation_completeness_cycle1_report.md` should be re-read: the `FAIL` classification for AZ-332 + AZ-333 is re-classified to `BLOCKED on Tier-2 prerequisites` per AZ-592 / AZ-593. The actual production blocker (this task) is being remediated here.
|
||||
- The user's PBI complexity rule caps PBIs at 5pt. This task is at the 5pt boundary because all 7 slots use the same wrapper pattern (so the slot count doesn't multiply complexity). If any slot's wrapper needs more than a few-line factory adapter, that slot's wrapper should split into its own PBI (`AZ-591_<slug>_bootstrap`).
|
||||
Reference in New Issue
Block a user